Celebrating 25 years of Open Source
Past, Present, and Future |
Hi, everyone. We are getting started right now. This upcoming keynote is called Celebrating 25 Years of Open Source. Our speaker is going to be Nick Vidal. Thank you.
Good morning. Bonjour. Joyeux anniversaire. Happy birthday, Open Source. 25 years. Yeah. So, I had the pleasure of working with the OSI five years ago, together with the board of directors, on the 20th anniversary, and it was a huge success. We organized 100 activities across 40 events worldwide, all the major Open Source conferences, and we plan to do this again this year. 25 years is a huge milestone for the Open Source community, and we need you to help celebrate this. So, I'm going to tell you a bit about the history of Open Source. That's 25 years, but before then we also have the history of free software, and I'm going to attempt to give a talk about all this in about 30 minutes. Let's see how that goes.
I have a couple of milestones, a couple of timelines, and let's see. So, I've created a timeline here. I'm going to give three talks, three timelines, actually three acts. I'm going to first talk about the past of Open Source. Later, I'm going to talk about the future of Open Source, and finally, I'm going to talk about the present of Open Source. For the past of Open Source, I'm going to be covering 12 timelines, but mostly I'll be focusing on the Open Source timeline and on the free software timeline. So, let's get started.
Before talking about Open Source, let's go to the free software timeline. You can actually access this. It's aniv.co, and let's go here, and let's go back to 1955. So, this is a long time ago, and in 1955, we had the SHARE user group. This was an IBM user group way back then, and people were starting to share source code, and this is where the mindset of sharing code comes from. There was no Open Source back then. There was no free software back then, but this mindset was already there. This is akin to human nature.
People have always shared knowledge, and so they started sharing code with the SHARE user group. Later, we can see here, and this is a funny one, if anyone remembers the Open Letter to Hobbyists: Bill Gates was not very keen on people sharing his code, and he was very angry about that. So, everyone shared their code before, and with Bill Gates and other companies coming along, they started making the source code proprietary. Looking at that, Richard Stallman didn't like it. A lot of developers didn't like it, and they created a movement to go back to how it was before, about sharing knowledge, about sharing code, about just making progress. So he started GNU in 1983. In 1985, he founded the Free Software Foundation, and the next year came the definition of free software. I'm going to go over this, and this is really important. Probably most of you know it, but it's always good to repeat it.
So, the definition of free software is the freedom to run the program as you wish. That's freedom zero. One, the freedom to study how the program works and change it. Number two, the freedom to redistribute copies to your neighbors, and the third one, the freedom to distribute copies of modified versions to others. So, these are the basics. This is the basic definition of free software.
In '89, the GPL was created. A lot of people still use the GPL today. It's one of the main licenses. We have Software in the Public Interest. This was very important for the Debian project, and as you will see in the open source timeline, the Debian project shares a lot of similarities with the OSI, because Bruce Perens was there, and he was fundamental to that. Then we have the Software Freedom Conservancy, which was created in 2006, and the GPL version three, released in 2007, and that's the timeline for free software. Of course, this is an overview. If I missed something, this is available on GitHub, and you can propose a milestone to be added.
Now, let's go back and go to the open source timeline. So, open source is not actually something that came out of nowhere. It has always been inspired by the free software movement, and those two are not different. Open source is a continuation of the free software movement. One of the key books that was very relevant at the time was The Cathedral and the Bazaar, written by Eric Raymond. This book compared how software was developed on systems like Linux, which was very chaotic sometimes, but it worked, with how it was developed in proprietary software, which was a waterfall model. It compared those two models and explained how building together in this distributed manner still works, even for complex systems like Linux or Sendmail. This book inspired Netscape to release its source code. They wanted to attract developers globally to help with their software.
And finally, Christine Peterson created the open source label. So, this is really interesting to highlight. How many of you knew that a woman was responsible for coining the term open source? Please raise your hands. You see, I only saw a few hands raised, and this is really important. The reason why we're here today talking about open source is because of a woman, so that's really special. The reason why Christine created the open source label is that she wanted to make it really easy for businesses to realize the importance of free software. The free software label was not very well accepted by them. It didn't work out, but with the open source label, it was really magical, and everything just went really well.
So, the Open Source Initiative was founded in February by Eric Raymond and Bruce Perens from the Debian Project. The definition was created based on the Debian Free Software Guidelines. So, as you can see, the Debian Project and the OSI have had a very strong relationship since the very beginning.
We had the Open Source Summit happening that year. As you can see, Tim O'Reilly started promoting this as well, with very good marketing right from the beginning and with a lot of people, like Linus from Linux, Larry Wall from Perl, Brian from Apache, Eric from Sendmail, Guido from Python. So, a lot of very well-known developers got together to promote open source and this new label.
Of course, not everyone was happy about this popularity of open source, and one example is Microsoft. They started spreading a lot of FUD, and we have the famous Halloween documents as well, which were leaked. They had a strategy to kill open source and free software, which was the idea of extending the software and creating proprietary things that they wouldn't share back, to break open source. And they really did try to kill open source at the beginning.
In 1999, we had the OSI license list, and the OSI logo, the keyhole, was created as well in 1999, and it's this one here. At the time, there were a lot of licenses coming out, and this was challenging from a legal perspective, because it created a lot of confusion. So, one of the very important roles of the OSI was to attempt to bring all those licenses together, to see which ones were compatible or not, and to highlight only the main ones, so we wouldn't have a proliferation of licenses, which would be very confusing. The main ones, the most popular ones, are the Apache, the BSD, the GPL, and the MIT licenses, and they're still very popular today. And five years ago, while we were celebrating the 20 years of open source, I'm not sure if you remember the Commons Clause and other licenses that came out that year. They were really trying to say, hey, we're open source, but you can't share this, you can't distribute this.
So, that is a case of calling something open source when it's not really open source, and the OSI board and everyone involved, while we were celebrating the anniversary, were really fighting to defend the Open Source Definition, and I think we have succeeded.
So, I attempted to give you an overview of the history of free software and of open source, and it's not just that. This was inspired by, and has inspired, a lot of movements. We have the free culture movement as well, highlighted by Wikipedia. Wikipedia is about sharing knowledge, about sharing content, about creating an encyclopedia that everyone can help with and everyone can edit.
We can also discuss the role of open source in business, and how this has helped. So, this is a huge timeline. SUSE, for example, is a very important company here in Germany, and it was founded in 1992. Red Hat is celebrating 30 years this year, and it was an example that you can actually create a billion-dollar business using only free and open source software. So, this was really important, and if you want to check it out, there are so many stories here. We have SUSE, Red Hat, Ubuntu, and Android. I think it's important to highlight also Automattic, the company behind WordPress. Today we have many people who use WordPress, develop WordPress, and provide services using free and open source software, and there's a whole bunch of examples here on this timeline. If you want to check them out, that's great.
We also have open government: how open source has influenced how governments are run, trying to make the government as transparent as possible, to publish its documents, and to allow citizens to help with the city, with the state, with the country, by making the government more transparent and more engaging. Open knowledge as well. The Internet Archive is trying to create a collection of everything digital, books, digital games, music, and there are a lot of movements around open knowledge as well.
We have open hardware as well, which is essential. Then there is how open source and free software have influenced open education. We have OERs, which allow people to share content around education. The open web is, of course, one of the main ones. Without an open web, it's very challenging. So, the web has to be open, right? Linux, of course, is one of the main applications. Free software, we already saw this timeline. Then the importance of open access and open science and having reproducible science, right? And lastly, open data. Especially right now, with AI, we have to have open data sets to support this and for people to collaborate. So, this is about the past.
Now, we're going to jump to the future. I'm just going to highlight a few projects which I'm really interested in, and I think this is the future of open source. Of course, there are many other areas of interest, but we have the software bill of materials. This came up especially because of the vulnerabilities that have been happening recently. The White House in the US brought everyone together to discuss the role of open source in business and in government and everywhere, and how important it is for us to have mechanisms that can protect that software. So, one of them is the software bill of materials, and this is being championed by the OpenSSF and other organizations. This is really important. Basically, SBOMs allow developers to track the provenance and vulnerabilities of software packages to make sure that they are reliable.
We also have Sigstore. Sigstore is really important to make sure that a package is authentic. It allows developers to sign those packages, so nobody can tamper with or compromise them.
Finally, we have WebAssembly as well, which is coming along. There's WebAssembly on the browser side, but there's also WebAssembly on the server side, which is gaining a lot of importance.
And one of the reasons why WebAssembly is important is because of portability, security, and also performance. WebAssembly has a huge potential to allow open source to grow.
Something that I'm really keen on working with as well is Confidential Computing. We have a dev room that's happening here at FOSDEM around Confidential Computing. Confidential Computing is about encrypting data in use, in memory, in the CPU. This will impact how we use the cloud today. When we use the cloud, there are a lot of security issues, and by using Confidential Computing, we allow this to be more secure.
Finally, regarding AI, we have Bloom. People here have probably heard of ChatGPT, but that's not open source. One of the issues with it not being open source is that researchers, or any individual, are not allowed to look into the algorithm or look into the models and the data sets. Bloom is an attempt to allow that. It's a European project, mostly, but it has researchers from all over the world as well, attempting to create open source AI. This is really worth checking out. And finally, we also have Stable Diffusion. This is a deep learning project where you can enter a text description and it will generate an image for you. Again, the reason why it's important to have something that's open is that you want to see what the models are, what the data sets are, and do research around that. And we want to have control over this. So, this, for me, is the future of open source, some really interesting projects.
And now I'm going to talk about the present of open source. So, this is the moment for you to celebrate open source, and I really want you to celebrate here at FOSDEM. So, who here came alone? You don't have a colleague with you. Please raise your hands. Did you come with a friend or are you alone? Okay, we have quite a few hands. So, go to someone by your side and ask, what's your open source journey?
How did you get started with open source? Try to break the ice and share those stories. What are you interested in? What's the future of open source for you? What are the projects that you're working on? So, celebrate here at FOSDEM together with your colleagues.
I would also like to invite you to go to State of Open Con. This is a conference that's happening right after FOSDEM in London, and there are 200 free tickets. If you want to attend this conference, usually the ticket is 199 pounds, but if you go to the OpenUK table here at FOSDEM, you can get a free ticket. The conference is right after FOSDEM and it's nearby. You can take the train, be there, and celebrate the 25 years of open source.
Also, celebrate open source by creating your own personal timeline. You can fork this timeline and share your story. You can also share the story of your projects. If you don't have an open source project, you can look at other projects and create a story about those projects. Organize a party at your organization, at your company, bring some cupcakes, and share those photos with the OSI. We'll be sharing them with others as well. You can tag the OSI on Twitter, but preferably on Mastodon as well. This one here is a nice picture of a celebration in Japan, where they were celebrating the 20 years of open source. They had a Tux cake, they shared the photos, and the OSI shared them again.
And finally, I want to invite the OSI board of directors here for a special announcement to celebrate the 25 years of open source. And so, thank you, everyone, and joyeux anniversaire.
Well, thank you. Thank you so much, Nick, for sharing this opportunity. Hello, everyone. I'm Stefano Maffulli. I'm the executive director of the Open Source Initiative, and it's a privilege to be here to celebrate 25 years since February 1998.
With the Open Source Definition, the Open Source Initiative has been shaping the conversations around technology for the past 20 years. The definition itself, and the community that we have created, have influenced every technology debate that we have seen since the start of the internet: cryptography, content rights management, the software patents debate. We've been at the center of everything that happened in technology in our society.
But this is not only about talking about the past. We are the stewards of the Open Source Definition. As an organization, we have maintained it for the community. The community itself changes all the time. The technology also changes, and we have to pay attention to these changes in order to adapt. In the past year, the organization has been looking at AI and how this new technology is different from what we've seen before. This week and in the next weeks, we're going to have a few announcements. We're going to publish a report on the findings of the deep dive on AI that we did last year, where we interviewed experts in many different fields: academia, practitioners of AI, developers, hackers, corporations doing artificial intelligence, civil society, politicians, policymakers. We have collected all the thoughts, all the findings, into one document that you can look at. Today I can share with you just the most important things that I've learned from this experience.
One is that artificial intelligence and AI systems, machine learning, reinforcement learning, all these new modern systems, require a huge amount of data. That is interesting to me because it clashes a lot with the thoughts that we already had about privacy, for example. Collecting large amounts of data requires, to some extent, making compromises about privacy and the rights of authors. Take the concept of copyright as we have always thought of it: something that I've created, I have rights on.
There is a new right in town that is also being regulated. We have the right to data mining, which is fairly new. We have to try to strike a balance between these new rights, personal rights, authors' rights, and the availability of data, because the other side of the spectrum is that large amounts of data can be amassed by those with large amounts of money and resources. So hackers may not be able to create suitable AI systems if we put too many restrictions on data mining. That's one of the aspects that is going to be highlighted in the report.
The second one is that the legal systems, or actually policymakers, are extremely interested in regulating this new field because it's extremely powerful. The systems we've seen running are very powerful and they can create damage. Therefore, policymakers are paying attention and they want to regulate it. We need to find a balance between having too much regulation and allowing innovation, and the community needs to pay attention to this as it's happening.
The third and last piece, and this is just the tip of the iceberg, is that the practitioners of AI, developers of AI systems, researchers, professors, people in academia, are very aware of the ethical aspects of the use of AI systems. They are aware of the fact that these systems can be misused and can do really bad damage to people, and therefore they are thinking very hard about regulating the use of the systems. This is something that the open source community has never really paid attention to. It has always allowed for very free use of software. It's part of the Open Source Definition, part of the free software definition, that you cannot stop, you cannot regulate, the use. But AI practitioners seem to be quite interested in doing just the opposite.
So we have to help them and talk to them a lot to understand their needs and to find with them social norms, social rules, that limit the amount of damage that AI systems can do and at the same time allow collaboration to happen in a frictionless and permissionless way. So with that said, I really recommend that you follow deepdive.opensource.org and listen to the podcasts if you have not heard them already. We also have the report coming up, so you will be notified when it comes out.
We have two more announcements to make. One, we are reviewing the process that reviews the licenses, and we're going to have a conversation in the legal dev room later today at one o'clock. And we also have an announcement to make which Deb, our director of policy, will talk more about. Thank you.
Thank you very much. Thanks again, Nick. I'm going to do a three-minute announcement. So I'm Deb Bryant. I'm the U.S. policy director, and I have also been involved with the United Nations over the last couple of years, helping them as they were developing a registry for digital public goods. Well, I'm very excited. You're the first to know, so you can socialize this: OSI is joining as an official member of the Digital Public Goods Alliance. The DPGA is the initiative that has created a strategy to support the U.N. Sustainable Development Goals through the use of open source software and open infrastructure. If you're around tomorrow, there's a dev room that the Foundation for Public Code and the Digital Public Goods Alliance are co-hosting. They also have a stand here, so you can hear more about it. As a public charity, we think it's important for OSI to support those goals. We're very excited that there's a recognition of open source as a strong ingredient in creating a more equitable world. So thanks for letting us share that here. And I'm going to hand it back to Nick to wrap up the session. Thanks, everybody. All right.
And to conclude, every celebration, every birthday needs sweets and desserts. So we have a box here of Belgian chocolates. Of course. And stickers. There are some special stickers from the OSI. So please pass it around and enjoy it. Thank you. |
Open Source Software at NASA |
Thank you very much for having me here at FOSDEM. This is my first FOSDEM, and part of the feedback from the organizers for my talk was that it sounds like something that could be a great way to end the meeting, with lots of pretty pictures. I've done my best to fill this with lots of pretty pictures, so hopefully I'll meet the expectations. I'm Steve Crawford. I'm the Science Data Officer in the Chief Science Data Office of the Science Mission Directorate of NASA, and I'm really excited to be here today to talk to you about NASA and open source software. Just to give a little preview of what I'll be talking about: I'll start off with software and its importance to NASA, then recent open source software success stories, challenges with open software, open source science, and then opportunities.
I think probably most everyone is familiar with NASA, but I want to share with you that NASA's vision is to reach new heights and reveal the unknown for the benefit of all humankind. This image is, I think, a great encapsulation of this. Going to the moon was obviously reaching new heights. This image shows Buzz Aldrin on the Apollo 11 mission on the moon, but it captures him in the act of a science experiment. They're capturing information about the solar wind on the surface of the moon, trying to better understand how our sun works and interacts with the different bodies, and exploring heliophysics, which is one of the main areas we do research on. So that is a key bit of NASA: not only to explore the universe, our solar system, and our atmosphere, but also to understand it. Trying to understand it for the benefit of everyone is a key part of NASA. And at the heart of that, an integral part of that, and what makes NASA possible, is software. In every mission, in every bit of research that NASA does, there's software involved.
A great example is the Apollo software, and here are two of the people who were leaders involved in it: Mary Jackson, who started work as one of the computers, the very early computers at NASA, and then became one of the lead engineers, and Margaret Hamilton, who helped lead the team at MIT that was designing the control and command software for the Apollo 11 mission. This software has been released and made openly available. You can see an example of a scan of some of the early software. It's available online. It's also been put on GitHub, so if you want to take a look at the software and the code from the Apollo mission, it's there and available.
Software has been a key part of all of our different missions, but NASA also has a long history of making software openly available. That sharing is part of NASA's DNA, as part of the law that created NASA, the Space Act. In section 203, one of the requirements was to provide for the widest practicable and appropriate dissemination of information concerning its activities and the results thereof. Part of our congressionally mandated law is to share what we do and disseminate it appropriately. We do always have to be concerned about security and keeping things safe, but we also want to share as much as possible of what we do.
Today we're doing that through our different directorates. We have our exploration systems directorate with Artemis, launching and aiming to go back to the moon and then on to Mars beyond it. We have our space operations mission directorate, which is focused on the International Space Station and on operations and living in space. We have our science mission directorate, which I'm part of, which explores our universe and our Earth. And we also have our space technology and aeronautics research mission directorates, which design the technology that enables the exploration and discoveries made by the other directorates.
I'm going to focus most of the rest of my talk on the Science Mission Directorate, but a lot of what I say for SMD also applies to these other activities. So what does the Science Mission Directorate do? Well, it has three key science themes. These are to protect and improve life on Earth and in space; to search for life elsewhere, which can be either within the solar system or on planets circling other stars; and to discover the secrets of the universe. These are three, I think it's fair to say, pretty big questions that affect all of humanity, and answering them is what we're working on in the Science Mission Directorate.
How do we do that? Well, we do it through our fleet of missions. These are the different missions that the Science Mission Directorate has in orbit around the Earth, in orbit around other planets or the sun, or exploring the universe. We have a wide range of Earth-observing satellites. We have a number of experiments going on on the International Space Station. We have a wide range of different explorers that have visited Mars. We also have observations of the sun. And we have our space telescopes and our space observatories observing the deepest reaches of the universe.
I just want to pause here, though, because a very important aspect of this chart is that a lot of these missions, actually all of these missions, are done with our partners. Even though everyone knows the NASA name and the NASA brand, so much of what NASA does is done with others. It's done with the community. We do it with our partners at the European Space Agency, at the Canadian Space Agency, at JAXA. We do it with the contractors and other agencies around the world that we partner with and work with. And we also do it with the wider community. Some of these missions are NASA-led, some are led by our partners.
But really, I think this is one aspect which is very similar to open source: the missions and projects at NASA take a very wide community in order to be achieved. So I want to look at some of the success stories with these projects.
The first one I want to take a look at is Ingenuity. Ingenuity is part of the Mars Perseverance mission. Mars Perseverance is on the surface of Mars. It landed there in 2021 to explore Jezero Crater, an ancient delta on Mars. This was an area that was once rich with water, and the rover was sent there to explore whether there was the potential for life in that area. As part of that mission, there's a technology demonstrator on it, which is the Ingenuity drone copter. You can see its first flight here. This is the first flight of a human-made object anywhere else in the solar system. So we're flying a copter on another planet. Since the first flight, it's been repeated, and it has flown over 40 separate flights during the last two years. It's still flying, it's still exploring. This was something that was expected to make about five flights, and I'll just play that again.
But it's driven by open source software. F Prime is the open source flight control software framework that was used for this project. It was released by JPL in 2017, and so this drone copter is flown by open source software. One of the great things that happened to celebrate this accomplishment is that NASA and JPL partnered with GitHub to recognize all those who contributed to this mission. They didn't just recognize those who contributed to the F Prime repository. They also recognized all the others through a badge, the Mars 2020 Helicopter badge, which covered all the dependencies and all the other packages and software that were part of this project.
That came to over 12,000 people who contributed to these dependencies and these different packages which made this project possible. These contributors are from all around the world and have been contributing to a wide range of different open source projects which made flight on Mars possible.
Another project, and this one is personally important to me because it's one that I've contributed directly to, is the James Webb Space Telescope. It was launched just over a year ago, and it is a partnership between NASA, the European Space Agency (ESA), and the Canadian Space Agency. I really want to emphasize once again that this massive project, which took almost 30 years to build, was an international collaboration with a wide range of agencies, across different NASA centers and 14 different countries, including Belgium right here. Part of the MIRI spectrograph was built here in Belgium, and one of the leads for the MIRI Low Resolution Spectrograph, Sarah Kendrew, was born and raised in Belgium. So these projects take a large community to support and produce them.
The wonderful and beautiful thing is that after its launch and its commissioning, the James Webb Space Telescope did start to produce beautiful, beautiful images. From JWST, here is one of the first images, of the Carina Nebula. This is a small part of the overall nebula, showing the dust clouds which are home to new stars forming there. And it's not only producing images, it's also producing spectra, and this is one of the first spectra showing carbon dioxide in the atmosphere of a planet orbiting another star.
So it's in the atmosphere of a planet which is orbiting another star, and it's detected by the spectrograph. As that planet passes in front of the star, you can observe it, you subtract off the effects of the star, and you can see what the spectrum of that planet's atmosphere is. This is the first detection of CO2 there. It came out roughly a month after we started releasing public data from the telescope. And here's a great quote from Natasha Batalha, who is one of the leads on it: NASA's open science guiding principles are centered in our Early Release Science work, supporting an inclusive, transparent, and collaborative scientific process.
They could produce their science so quickly, this discovery one month after data started being released, because of open science and open source software. They were able to test their software and their processes before the launch of the telescope because all of the calibration software was made publicly available. The data for our early release projects were made publicly available as soon as they were observed. And the team is a great example of open science as well. They've made all of their data, all of their software, and all of their results openly accessible. So if you want to reproduce this spectrum for yourselves, you can go and download their software, which is uploaded to Zenodo, and take a look at it.
As I said, this is all made possible by having the software made openly available, and that is the case for all of the JWST calibration software. The software which is used to produce calibrated, science-ready images is all openly developed on GitHub.
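To make the transit measurement described above a little more concrete, here is a minimal sketch in Python. It is not the team's pipeline or the JWST calibration software. The function names and all the numbers are made up for illustration. It assumes you already have the star's flux in a few wavelength bins, once with the planet out of the way and once with the planet in front of the star, and it divides out the star so that only the fractional dimming caused by the planet and its atmosphere remains.

```python
def transit_depths(flux_out, flux_in):
    """Return the fractional dimming in each wavelength bin.

    flux_out: star-only flux per bin (planet not in front of the star)
    flux_in:  flux per bin while the planet crosses the star
    Dividing by flux_out removes the star's own spectrum.
    """
    if len(flux_out) != len(flux_in):
        raise ValueError("flux lists must have the same length")
    return [1.0 - f_in / f_out for f_out, f_in in zip(flux_out, flux_in)]


def strongest_absorption(wavelengths_um, depths):
    """Return the wavelength (in microns) where the transit is deepest."""
    deepest = max(range(len(depths)), key=lambda i: depths[i])
    return wavelengths_um[deepest]


# Illustrative, made-up numbers (hypothetical, not JWST data).
wavelengths_um = [4.1, 4.3, 4.5]
flux_out = [1000.0, 800.0, 500.0]
flux_in = [979.0, 782.4, 489.5]

depths = transit_depths(flux_out, flux_in)
print(strongest_absorption(wavelengths_um, depths))
```

With these invented values, the middle bin dims slightly more than its neighbours, which is the kind of extra absorption at one wavelength that points to a gas in the planet's atmosphere. A real analysis also has to handle noise, instrument effects, and the shape of the transit over time, which this sketch leaves out.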
It enabled scientists to test their projects before the data became available, but it also allows them to feed back to the project: when they find bugs or when they find a better algorithm for calibrating the data, they can contribute it back to the program so that it can then be shared with a wider audience. And one thing that makes this possible is that it builds on the wider scientific Python environment and then contributes back to that community as well. And the way it did this was that the JWST team contributes back to the Astropy project, which is a common Python library for astronomy. It builds on NumPy and it started in 2011, basically from an astronomy and Python mailing list. People had been emailing the list saying "I've just released a new generalized package for astronomy", and after about the third email someone said, why don't we all work together? And basically everyone agreed to that, and we ended up getting together and working together to produce a generalized astronomy package. And not only was it graduate students and software engineers and astronomers from around the world, but it also included those software engineers and astronomers working on some of our biggest projects, like the Hubble Telescope and the Chandra X-ray Observatory. Since it was released it's been used in over 10,000 publications, making it really widely used. And this is one of the images from their recent paper, showing where it's being used, and you can see it's all over the world. Along with that, the dots on the graphic represent where the maintainers of the project are, the people who have been contributing to the project, and they're spread out through both North America and Europe, with some South American and Global South representation as well.
And so I'm just going to pause again here and I'm going to put up another JWST early release image, which is the galaxy cluster SMACS 0723, and this image also has a lot of importance to me because when I was doing my PhD in astronomy I studied objects which were very similar to this. Galaxy clusters are the largest gravitationally bound objects in the universe. They're the most massive objects, billions and billions of times more massive than our own sun, and are the largest collections of objects. And what you're seeing in this image is not only the galaxies which are part of this cluster, but you're also seeing the effects of general relativity. They're bending the light from the galaxies which are behind them, galaxies from only a few hundred million or a billion years after the Big Bang. The mass of that galaxy cluster is warping the space and time around it to focus the light from those background galaxies on us, and we're seeing that in those galaxies which are streaked and bent around it. These are the background galaxies. And as I said, during my PhD I studied objects like this, galaxy clusters. When I was writing my PhD I wrote code in C and in Perl, and I would put it into a tarball and put it up on my own personal website. And after I had gotten my PhD I moved down to South Africa to work on a telescope there called the Southern African Large Telescope, and I had to build a data management system for it, except we had no resources. I was the only resource. And so I turned to open source software, I turned to free software, to use a wide variety of resources to build a very, very cost-effective data management system for it. And it helped make it one of the most productive 10-meter-class telescopes.
But it was while I was working down there that I received that email about Astropy, and even though I'd been releasing software, this was really my start in open source software, getting involved in a community of like-minded people who had solved a similar problem. And it was really when I saw the potential of working together and sharing our software and how we did things that I saw the power of open source software. And if you had told me about 10 years ago that I'd be here talking to you now, having worked in South Africa for a while, then moved on to manage the team that was developing the JWST calibration software, and now working at NASA, sharing open science across all of our different divisions and helping to further open up the science, I wouldn't have believed you that this is where getting involved in open source would eventually lead me. But I'm here now, and it's been really, really fascinating to hear about all of the immense and fantastic open source work which has been going on at this conference. But I just want to show, going back to what NASA does, this is less than 1% of the open source used by NASA. NASA and the wide range of NASA projects touch on just about every bit of open source that you can imagine. Likewise, NASA and the people working at NASA release a huge amount of open source. This is less than 1% of the open source that's being released by NASA, and it's a wide range of different projects touching on planetary science, astronomy, heliophysics and earth science, along with technology development and other aspects. NASA loves to spin off the things that we're doing.
NASTRAN was a program for engineering and finite element analysis developed in the 60s and released as public domain in the 70s. It still underlies a number of engineering codes and other engineering software which are out there, but NASA loves to spin out our projects. Another two examples of our spin-offs: one is OpenStack, which underlies on-premise cloud computing. We still use this in our ADAPT supercomputing center, and for our on-premise cloud computing it's still often used, but with Rackspace we've handed this over to the wider community to further develop. OODT is another program, developed for data management and for handling files, which was likewise handed over to the Apache Foundation to be further developed by the community. NASA is still a participant in it, NASA still uses some of this software, but we are now just a member of a much larger community which is using these different projects. But sometimes it's still important for us to take the lead and develop our software ourselves, and the JPL SPICE Toolkit is an example of this. This is for spacecraft ephemeris, planet, satellite, comet or asteroid ephemeris, instrument information, orientation information and events information. This is the software that you use if you want to know where a comet or an asteroid that you want to land on will be in 10 years. This is how we determine the positions and velocities of spacecraft, of planets, of other objects which are all moving throughout our solar system. This software needs to be incredibly precise. If you get something wrong now by a very small amount and you try to land on your comet in 10 years, you will be nowhere close to it and you will have no hope of recovering. And so there is a very, very rigorous process to contribute and to make changes to this code.
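To put a rough number on "a very small amount", here is a back-of-the-envelope sketch, not anything taken from the SPICE Toolkit itself. It assumes a constant velocity error and ignores gravity entirely, so it only gives an order of magnitude; the function name and the sample error values are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def drift_km(velocity_error_m_per_s: float, years: float) -> float:
    """Position error in km from a constant velocity error (gravity ignored)."""
    return velocity_error_m_per_s * years * SECONDS_PER_YEAR / 1000.0


if __name__ == "__main__":
    # Hypothetical velocity errors: a micrometre, a millimetre, a metre per second.
    for error in (1e-6, 1e-3, 1.0):
        print(f"{error:g} m/s off -> {drift_km(error, 10):,.3f} km after 10 years")
```

Under these assumptions a velocity error of one millimetre per second grows to roughly 316 km over ten years, which is far larger than a small comet.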
It's made very slowly, very carefully, because once again, if you even get the floating point a little bit off, or depending on how the specific hardware works, you may get different results. It was originally developed primarily in C, and they did release some other interfaces for it, and it is released as open source, but it's not openly developed currently. It's something funded by the Planetary Data System, and so there is regular support for it, and it's a great team that is currently further developing it. But there are also wrappers that have been developed for it. Andrew Annex developed SpiceyPy, which is a Python wrapper for this toolkit, and it's been used in a wide range of different missions. It's now used by 80% of all SPICE users. It's been used on Cassini and on Mars orbiters. It's very widely used across our range of different missions. But Andrew developed this when he was an undergraduate. He was working on the Cassini mission, and probably like a lot of people he just wanted to use SPICE in Python, and at the time there wasn't a way to do it, and so he solved the problem. He developed and released it as open source software, and everyone else started using it as well. And Andrew continued development for two years in his spare time, after his undergraduate, while he was not even employed working on any of the missions. Fortunately enough he did go back for his PhD and he's now a postdoc at Caltech, so he's doing fine. But even though it's so widely used, it wasn't clear how it was being funded, and it had limited recognition, for example it was not cited very much in the community, even though it's become so critical to many projects that are actively using it. And so that gets us to some of the challenges of NASA and open source.
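The floating-point caveat raised in this part of the talk can be illustrated with plain Python floats. This is a generic sketch of rounding behaviour, not code from SPICE or SpiceyPy; the helper name and the input values are chosen purely for illustration.

```python
import math


def naive_sum(values):
    """Add the values left to right in ordinary double precision."""
    total = 0.0
    for value in values:
        total += value
    return total


if __name__ == "__main__":
    values = [1e16, 1.0, -1e16, 1.0]
    reordered = [1e16, -1e16, 1.0, 1.0]  # same numbers, different order

    print(naive_sum(values))     # the first 1.0 is lost to rounding next to 1e16
    print(naive_sum(reordered))  # the large terms cancel first, so nothing is lost
    print(math.fsum(values))     # correctly rounded sum, independent of order
```

The same four numbers give different totals depending only on the order of the additions, which is one reason numerical code of this kind is changed slowly and checked against known results.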
And so as I mentioned, one challenge there was how we go from something which is just being developed by someone trying to solve their own problem, but then ends up being widely used in the community. How do we develop that model? People working at NASA also contribute to open source. Here's Leo Singer just last week contributing to an open source project, because he needed to fix a bug in it to make it work for his use case. But oftentimes it's not very clear how that works for the NASA employee or the person working at NASA. And it is also not always very clear how it works when we need to do things like sign contributor license agreements. And so often people working at NASA will do it so they get it done, but it's not really clear how they should be doing it. There's also licensing at NASA. About 20 years ago NASA created the NASA Open Source Agreement, and they were trying to solve a very specific problem. By US law, civil servants can't produce work that is copyrightable. Open source licenses work based on copyright. And so if you're not allowed to produce something that you can copyright, you can't use an existing open source license to protect it. There really wasn't much guidance in the US government at that time about releasing open source software. And so they came up with a relatively innovative solution at the time, which is this NOSA license, to enable civil servants to release their software as open source following this framework. Unfortunately, it wasn't widely recognized in the community, it is not recognized by the Free Software Foundation, and it does complicate the reuse of NASA software. We also have bureaucracy at NASA. NASA is a very large government agency.
And sometimes, even though NASA does have a process for releasing software and has officially released over 500 open source packages, the processes can be long. We are also not always the best at engaging with the open source community, and here's a tweet from Daniel Stenberg about getting emails from NASA asking him about the status of curl and whether or not it can be used in different ways. And that's an additional burden for people who may be working on this in their volunteer time or not being paid. If we have to figure out if the software is secure or what the risks associated with that software are, that should be our responsibility, but how that works with the open source framework and open source communities can be a different matter. But curl is something which is critical, especially to our data management systems. It's being used very widely across NASA. And how that comes into play in different places ends up being something that we have to figure out how to resolve, to make sure that the software that we're using is being used appropriately. And I think there was a great talk earlier today on the security of software by Brian Behlendorf, and that's going to be an ongoing question to answer. And I always put up this XKCD comic on the sustainability of open source software, because I could write 10 white papers and they wouldn't have the same effect as this in indicating the problems and issues around sustainability: that all of our open source systems, or maybe not all of them, but a lot of them, depend on a small number of developers who are maintaining them and keeping the code active and working.
And there was a great talk yesterday on the sustainability of open source, and NASA does need reliable, secure software, especially for our space operations. And so what are we doing for the next steps? Within the Science Mission Directorate, we've started up a couple of new things to help address some of these challenges. We've recently set up the Chief Science Data Office, which I'm part of. And as part of that, we've set up the Open Source Science Initiative. One part of it is to help support open source software at NASA. The other part of it is helping to extend open science across the entire mission directorate. And so what is open science? Open science is a principle and practice of making research products and processes available to all while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity. And we really are focused on increasing accessibility, reproducibility, and inclusion when we're working on open science across NASA. But within SMD, we want to take that a step further. And this is why we started open source science in practice. We want not only to make our products openly available, but to take the processes of science and open those up, much in the way of open source software development. We want to open the entirety of the scientific process from start to finish. We want to broaden community involvement in the scientific process, increase accessibility of our data, software, and publications, and facilitate inclusion, transparency, and reproducibility of science. And so NASA's Open Source Science Initiative is supporting scientists to integrate open science principles into the entirety of their research workflow.
NASA's OSSI is $20 million per year, and we're hoping to increase that further in the effort to increase open science across NASA. We have four areas that we're mainly focusing on, which are infrastructure, policy, funding, and community. I'm going to touch on some of the work that we're doing in at least three of these areas. And so one thing is new policy. Last month, we released SMD's new policy on scientific information. We want to make things as open as possible, as restricted as necessary, and always secure. And this new policy says that our publications, the research which is done and funded by NASA, are made openly available with no period of embargo. That our research data and software are shared at the time of publication, that they are made open so that you can reproduce the science which is being done in our work, and that you're then also able to reuse that science to build on it. That our mission data are released as soon as possible and freely available. And that free is both free as in beer, there's no charge for using our data, we make it free to download and free to access, and free as in speech, as there's no restriction on what you can do with that data. And so you can go and do whatever you want with the data we release, and that data will be released under a Creative Commons Zero license, so it's clearly in the public domain for everyone. That our unrestricted mission software is developed openly: we've learned these lessons from JWST, that having that software available for everyone to both have access to it and also reuse it improves how science is done. That we recognize software as a scientific product: our software is just as critical to our research.
And that we'll be using these common open source licenses like Creative Commons zero for our data and permissive commonly used software licenses for our software like Apache, BSD, or MIT licenses. And we encourage using and contributing to open source software as part of our missions and our research. We're actually going to be also releasing further policies and updating further processes to make it easier for NASA employees and NASA people working with NASA to contribute release and use open source software, so there's more coming to make it easier. The other thing is we're directly funding open source software, especially in the scientific environment. In over the last two years we've selected over 16 proposals supporting 22 different projects which we're providing direct support to. This call was specifically for sustainability. We weren't asking for anything new, we weren't asking for new features or new products, what we were asking was please make sure this software works as best as you can and that's sustainable, that we can actually continue to use this and we've supported this with over three million dollars so far. We're also, and this is the next thing that we're kicking off this year, is NASA's Transform to Open Science. There's some stickers that already I think have disappeared very quickly, but the NASA TOPS effort is a 40 million five-year mission to accelerate adoption of open science. We're aiming to train 20,000 researchers to earn a NASA open science badge and certification which includes how to open source your software. We want to double the participation of historically excluded groups across NASA science and we want to enable five major scientific discoveries through open science principles. And we're going to do this through engagement, capacity sharing, incentives and coordination across the community. 
We're also following our own advice here, everything that we're doing with this project is going to be open sourced on GitHub so that community members can contribute it and we're kicking this all off with a year of open science of 2023. And this is the start of the project. As I said, it's all going to be up on GitHub, definitely please check it out and please engage with the community. But we're also not the only U.S. government agency contributing to the year of open science. Last month the White House did actually announce that 2023 would be a year of open science and there's going to be more stuff being announced by other U.S. agencies as well and we hope actually as many of you who are interested in science also take step forward to actually make your science and your results more open as well. And so we wanted to also talk about the opportunities for NASA and for using open science. And one of the most important things that we do and one of the most important aspects of or most immediate aspects of the science mission directorate is our contributions to the earth, of studying the huge impacts of climate change, about how that will affect our future, issues around environmental justice and other issues. And the way we are supporting it is that we do make all of our NASA data open. We have over 70 petabytes of data which is available in the cloud and even more that's available in different systems that are open for you to use in any way that you like to help address these questions. We have open APIs to these data sets which is all developed on NASA GitHub and provide as open access to the wider community and making these available to users around the world to those who actually need the data and most impacted by it. 
And there's actually, I know there's a big session here on the environmental impacts and there's a great talk this morning on, and I wanted to share one of the examples from NASA which is the power project that provides solar and meteorological data sets for NASA research for supporting renewable energy and building energy efficient and agriculture needs. We have made all the data freely available. We've also partnered with AWS to make the data freely available on there so that you can have access to this almost one, and this data set to actually develop the apps to build the open source tools on top of this, to actually use it in different ways that you can. To actually answer some of the toughest questions that we have around climate change and our environment, but we're also putting actually another five petabytes of data openly available on AWS and we're also looking to provide it on other cloud providers as well so that the data is there where you need it and where you can use it, and so we're actually looking to actually put our data in places where it's most useful and to partner with as many different groups that we can to actually put that data in useful places. And this is only actually the start. What's coming up next is our Earth System Observatories, which we're doing in partnerships with the European Space Agency and JAXA that are actually going to give even a closer view of the climate effects on Earth, and actually a better study of how these different effects both on changing our surface, changing our world, changing our climate, and our different particles. These are going to produce over 600 petabytes of data, and we're going to make this all freely available and accessible to the world to be able to actually access it and look at it and build things on it to answer these questions. And so we also have a lot, as I mentioned, NASA has released already a lot of open source software. 
We have a number of different ways to discover what might be out there. Code.nasa.gov is our site for our open source software at NASA. It has over 500 projects which have been officially released. We have also just released the Science Discovery Engine. This is a system to explore all of our different data sets, software, and technical documents across all of the Science Mission Directorate. It has a listing, I think, currently of about 44,000 different pieces of software that have been released by different NASA researchers and missions. We have the NASA software catalog. We also have the NASA GitHub, which has over 500 repositories on it, but we also have all the different partners which are on GitHub. JPL has their software on GitHub; Space Telescope has the JWST and the Hubble software on GitHub. There are also a lot of other people who have posted their own NASA research software around there. We also have the Astrophysics Data System, and you can also search the Software Heritage Archive, which is archiving a lot of software across a wide range of different domains. And so we're really looking ahead at the really big challenges and trying to answer the big questions. Here's the DART mission, which earlier this year impacted an asteroid, Dimorphos, which is circling around another asteroid, to see if we could change its trajectory. If we can change the trajectory of an asteroid early enough, we can hopefully make sure that it deflects and avoids the Earth, by using, for example, the SPICE Toolkit to study where it's going to be moving and predict it. And hopefully, ideally, protect the planet and avoid the same fate as the dinosaurs. But obviously, this is all building off a wide range of open source software. And so we do need more people.
We need more hands and more eyes and more brains with diverse experiences to participate, so we can ask the best questions and find the best solutions. And so someone had asked me on Mastodon what would be the best way to contribute to NASA open source, and I wanted to share some of the different aspects. You can contribute directly to NASA open source code. It's available on GitHub. You can contribute to the open source software that's maintained by our partners at JPL or at Space Telescope or at the NASA IMPACT project or the NASA Development Project or a wide range of different sources, or at ESA, or at other groups that are contributing and answering scientific questions, like CERN. So you can contribute directly to the open source code. But you can also keep contributing to, building, and sustaining your own code. There's so much open source that NASA is using. So even just by participating in your own open source projects, you're contributing to the NASA mission. And you can also come work with us. We're always hiring, and our partners and our contractors are always hiring, and a lot of these are great places to work on open source. We also have great internship programs, especially at NASA and at JPL, and we are always excited to have people come work with us or come work with our partners in Europe, like the European Space Agency. We're always looking for great technical talent to help make contributions and help build our missions. And to help address these major questions that impact humanity, and to help us reach new heights and reveal the unknown for the benefit of humanity. Whether we're asking how to protect and improve life on Earth and in space, and trying to address questions around climate change, questions around planetary protection, questions around environmental justice, these are important questions that affect all of us.
Whether or not we're searching for life elsewhere, either searching for life in our own solar system on Mars or Europa or Venus, or searching for life around planets orbiting around other stars. And we're trying to discover the secrets of the universe, trying to understand general relativity, trying to understand how our universe, our earliest galaxies have formed, trying to understand how black holes work. These are all the types of questions that we're trying to answer here at NASA. And so I just want to stop there, and I'm almost at my end of my talk, but just actually want to end it with a little bit of audience participation. And so I do, if you did actually see during my talk, software that you contributed to, can you raise your hand? So I mean, during my talk, I could only show about 1% of the software out there. So if you actually know you actually contributed to some NASA software out there that's used by ESA or someone else, can you raise your hand? And you can see how many people are contributing to it. But as I said, there is even that wider range of software which is out there. And doing open source also means actually like being part of the community, which is actually developing guidance, developing licenses, growing the community, mentoring the community. And so if you're taking part in actually growing the open source community and governing the open source community, can you raise your hand? So if you've been working on licenses and mentoring and inclusion, and for everyone else, for those who actually want to contribute, who want to actually help us explore the universe and help reveal the unknown. And so for all of you who are building, oh, I had one last one which was for those who helped volunteer and organize this conference so I could take my moment to share NASA's message with everyone. Can you raise your hands for all those who volunteered? Because you're also helping with NASA's mission. 
And so for everyone who raised your hand, I just really do want to say thank you for all of your contributions. Thank you for helping us out and for those who will contribute in the future to open source and other projects, thank you for your future contributions. Thank you very much, everyone. |
Closing FOSDEM 2023 |
So, that was a blast, and we are finally back in person, and also I'm going to take another picture because we're finally in person. Once again, if you leave, please be quiet, because it is extremely loud up here, and no, you're not special enough that you should keep talking. Thank you very much. So, it's a wrap. Random stats, because stats are fun and easy to make, though sometimes it was a little bit of a rush. Our peak this time was 1.5 gigabits in concurrent outgoing streams. Once again, please be quiet if you decide to leave. Last year we had a 1.65 gigabit peak, and that was fully online, so that gives you a rough idea of how many people still chose to consume off-site, even though we have a relatively full FOSDEM, but not as packed as 2020, we estimate. 35 rooms, 800 concurrent viewers, and roughly 20K unique viewers on the streams. We have a total of almost 400 hours of scheduled video, for the ones tracking the numbers over the years. That is less than the last two years, because physical ULB has less space and fewer devrooms than fully virtual, but still we managed to fit quite some stuff in. We already have 53 released videos online, and 21 are currently being reviewed and transcoded, for a total of almost one full day of video. Speaking of video, if you are a speaker or if you are a devroom manager, please, please, please, please, with sugar on top: you get those emails, you do what is in those emails, and you help us review the videos, because if you don't do it as a speaker, it falls on the devroom managers to do it, and if the devroom managers don't do it, it falls on staff to do it, and it's a lot of work and a lot of video, because 400 hours of video is a lot. So please, if you are a speaker, if you are a devroom manager, those emails are important, do not ignore them. If you have any issues, email video at fosdem.org; the emails come from no reply at fosdem.org.
Matrix also dropped, of course; obviously a lot of people are in person, but still we had quite some online participation slash online synchronization while people were rocking the hallway track and attending talks and such. We sold out of swag. We weren't fully certain; we didn't know, would we have a fully packed ULB, would we have almost nothing, or anything in between, but we managed to sell out, so that's good, and we also had 130 advance orders. For those who don't know, we will continue to give you the ability to order swag beforehand, which means you just get a voucher and a password, you go to the front desk, and you get it immediately; we don't have to do the payment dance and everything. And also, depending on how the timeline works out, maybe we can even start ordering specifically for you as you place advance orders. There was a lot of food consumed; there are actually two slides. There's a lot, and it's not even a full number, because this is as of 15:30 local. Once again, thank you, it's really loud. Network: we stopped checking MAC addresses for various reasons. The network was a little bit of a mess this year, because a lot of things broke which we thought remotely were fine, but the max concurrent we saw for IPv6 addresses being assigned was over 30K, and the total amount which we gave out was 53K. People switched from one network to the other, blah, blah, blah, blah, so it's not a fully scientific method, and there are also privacy extensions, which basically keep messing up all our stats, but it's at least not a little. Also, ULB has some traffic analysis on their controllers; we don't, for obvious privacy reasons. So you can guess what, for the first time since they installed that stuff, was the top site accessed by total volume of data going to and back. Any guesses? Obviously.
We also had some stats, initially we kept them at 2.5, but we switched from one gig to 10 gig over the weekend, because that's how we rock, and we actually managed to consume more than the one gigabit which we had before. Going forward, hopefully next year we will maybe even have two times 10 gig, we'll see. I keep telling people this every year, if you're still using 2.4 gigahertz, please don't. With hardware which does 5 gigahertz, of course you will have less contention, you won't have zero contention, but you will have less contention if you have hardware with 5 gigahertz. It's generally nicer, so if you buy your next laptop, your next whatever, consider spending a little bit more money on 5 gigahertz. We come to thank yous. First of all, the Matrix and Element people, who asked to not be named, just like so listed. Also Libera.Chat staff, of course, and also sponsors, without whom this wouldn't be possible. As per usual, as I call out the next groups, if you wear a t-shirt of the appropriate color, please come up front and take your applause. I might start crying again, I'll see. Thank you. I also see at least one green and one orange shirt which is not up front, which is a big mistake. If you don't have the shirt on, but you did volunteering, devroom, video, whatever, please come up front. We have a few devs who are also supposed to come up front now. And thank you. As per always, please send feedback. We really rely on this feedback and we actually take it to heart, so please send any and all feedback which you have. We cannot always accommodate it, but we read everything and we really do take it to heart. That should be helping, not feedback, whatever. If you see trash here or anywhere else, congratulations, this is your trash now. And I'm being serious, it is an immense amount of work to build up in basically half a day on Friday and also to tear down within two to four hours on Sunday. 
So anything you see, please just take it with you, throw it in the trash, whatever, but please help us clean. Also speaking of cleaning, if you have some time, we would highly appreciate it if you just stayed around a little bit, helped us clean up. If you gather here, we will tell you what to do and what it will take a little bit until we have all you segmented. But we can use every single hand for the network cabling which you tear out under guidance. You can keep the network cable. And also for the ones who stay until the very end and help until the very end, we will feed you in cable at the end. So please consider just coming up front to help. Yes, after the thing, obviously. And that's it. So see you next year. And as is tradition. And as is tradition. So A, I forgot all the speakers who are in here, they also are supposed to be on stage if they so choose. So now is the time to come on stage if you want to. And as is tradition, we are ending FOSDEM with the FOSDEM dance. My knee is kind of broken, so I'm not going to lead the dance. Wouter is going to lead the dance. And with that, thank you. See you next year. I don't know who among you has done the FOSDEM dance before. It's actually quite difficult. But not really. We start like this. And it's difficult because you have to balance on one leg. But I'm sure you can all do that, right? So bear with me. And then we lift the one leg. And then we start moving. And we go faster. And there we go. We're dancing. Thank you. That was the official FOSDEM dance. Thank you. |
How regulating software for the European market could impact FOSS |
First of all, thank you. Thank you for coming here to this session. I think it's fantastic to have an audience who is potentially interested in regulation and compliance. This is something that impacts all of us as a community, as users, consumers, and also from a work perspective. So I think it's really important to be able to understand what are the impacts of regulations, compliance, certifications, when we are developing software, when we are sharing our work from, I would say, a community perspective, but also when that work goes into production on systems, critical systems. So thank you for being here for that. Today we have the unique opportunity to have some of the folks who wrote, or at least reviewed, some of the legislation and directives, like the Cyber Resilience Act. And we will go through a couple of lightning talks, because I think it's really important to set the scene. So we could start discussing by having a panel, but if we don't understand the overall construct of those regulations, that's going to be hard. So I would like to have a warm welcome for Benjamin here, if you can give him a bit of love. Thank you. Thank you. So we are really honored to have the European Commission here to help us with this. So thank you. Benjamin, up to you. Thank you very much. Thank you so much for having me and also for showing interest in our work that will maybe affect some of you, but most of you probably won't be affected by it. I mean, as a Commission, we're always trying very hard to reach out to all the people that could potentially be affected by our regulation, but it's sometimes a bit difficult. I mean, it's often mostly large companies that have the resources to actually engage with us. So I'm really happy to be here and I'm so glad that so many of you decided also to stay for this panel. So the Cyber Resilience Act is a new proposal of the European Commission. It intends to improve the cybersecurity of hardware and software products. 
It's intended to complete a puzzle. So there's already other legislation at Union level on cybersecurity. Most notably, there is the NIS directive, which has recently been reformed. The NIS directive focuses more on the users of hardware and software. So it ensures that large companies, critical infrastructure providers and so forth, take security seriously, do their patches regularly and all that. But the Cyber Resilience Act now focuses on the other side, on the manufacturers of these products. And I mean, I know that many of you are probably not dealing with policy every day, and in particular not with European Union policy. So I've just added one slide at the very beginning to quickly explain to you what the Commission is and where we are in this process and who the other players are. So yeah, this is highly simplified. It's much more complicated than that, unfortunately. So I work for the European Commission, for the Directorate-General CONNECT. The Directorates-General, they are like departments within the Commission. So we mostly deal with telecommunications, internet, emerging technologies and so forth. And it is the European Commission that makes legislative proposals in the European Union. So we write the draft and then we propose this draft to the so-called co-legislators. That's the European Parliament, where you have the members of the European Parliament, so parliamentarians, and the Council that represents the 27 member states. And these two institutions, they analyze the Commission's proposal. They come up with their own positions and once they've done that, they negotiate with one another. And at the very end, once they agree on a common line, a common position, they adopt the text and it enters into force. It becomes a law. So this is basically how it works. It's very simplified. So our proposal came out on 15 September 2022, so in autumn. And it's now with Parliament and Council. Parliament has not yet started discussing it. 
They're still sorting out which committees will be in charge of it. But the Council is already at full speed. There are regular meetings every second Wednesday. The Council is meeting to discuss the text proposed by the Commission. So that's just for your background. And now we'll talk about the substance of the proposal. So what are we trying to achieve? So we want to reduce the number of vulnerabilities in hardware and software products. Of course, we know that it's an unrealistic goal to aim for zero vulnerabilities in products. And this is what the slide represents. So the cheese represents a hardware software product filled with holes. I mean, I know not all are filled with holes, but many are. And I mean, we attempt to reduce the number of such security holes in hardware and software products through legislation. And, yeah, I will just get right into the main elements of the proposal to give you an overview and then I will, you know, provide some more details. So it's all about providing cybersecurity rules for hardware and software products when they're placed on the market. So placing on the market entails that we're talking about commercial activities here. So companies that make money with hardware and software products, they will have to follow these rules. We're using a well-established framework. It's called the new legislative framework. It's been used in the past by the European Union to regulate other products such as toys or machinery or electrical equipment. And you may not know it, but you're all actually very familiar with it. So whenever you see a product that has the CE marking on it, CE, I will show it later, like you have it, for example, on laptops or iPhone chargers and such things. So this is all a new legislative framework regulation then. What we do in the proposal is we establish obligations for three types of economic operators for the manufacturers. I mean, that's the core element, of course, of the proposal. 
But then there are also the distributors. These are brick and mortar stores, online shops, and also importers that place products on the European market that have been developed in third countries. And the importers and the distributors, they don't have to, I mean, they don't do secure coding, of course. They are just selling products, but they have to ensure that the products that they are selling are in line, I mean, that they comply with the Cyber Resilience Act. One of the core novelties of our proposal is that it does not just cover rules for the placing on the market, so what the product should look like, but it also covers the period after that, the entire life cycle. So I mean, you all know that for cybersecurity, it's extremely important that you keep tracking your products, that you ensure that when you discover new vulnerabilities, that you fix them, you mitigate them, you provide patches and so forth. And this is part of our proposal, that for a period of five years after the placing on the market of a product, you do that, you provide security updates. One essential element of any new legislative framework regulation is harmonized standards. So the rules that we propose in the Cyber Resilience Act, they are very high level, technology neutral, objective oriented. I mean, if you go through the proposal, you will see they're really high level. It's more like on the level of provide a secure by default configuration, take care of access management, make sure data is handled confidentially and so forth. So it's really high level. But then we have the standards that will be developed by the European standardization organizations. You may have heard of them. They're called CEN, CENELEC and ETSI. And they will, I mean, they are voluntary. You don't need to use these standards, but they will help you write secure products. And then there is also the conformity assessment. 
So manufacturers, they will have to demonstrate that their products are actually compliant, but I will explain that in more detail later. Yeah. And finally, there's the market surveillance. So in all the member states, you will have authorities that can check out the products and approach manufacturers, ask them to remediate vulnerabilities, eliminate risks, and if they don't comply, even hand out fines. So the scope of the proposal, it's hardware and software products, products, not services. So when I say hardware products and software products, I mean final products such as laptops or operating systems, but also components such as CPUs, hard drives, software libraries and so forth. We don't regulate services. So software as a service is outside the scope of this proposal, that's already regulated by the NIS directive, which I mentioned earlier, but we do regulate something that we call remote data processing solutions. So often today, products, they actually come with some cloud or other elements, such as many mobile applications. Usually they access data in the cloud and these are essential functionalities of the program. And then you would also have to ensure the security of these remote data processing solutions. As I already explained, we only regulate commercial products. So anything that is non-commercial is outside the scope, including open source when it's non-commercial. But on the other hand, open source that is commercial, I mean, there are a lot of large commercial projects, I'm sure you're aware of them, they would be covered. And then we exclude certain products that are already regulated by other regulations, such as cars or medical devices. Yeah, I mean, why did we choose to exclude non-commercial open source software from the scope? There are a few reasons. And first of all, this is standard practice for any new legislative framework product regulation at Union level. We only focus on commercial activities. 
But also in the case of open source, there are some more special considerations that we took. So first of all, would it be fair to regulate someone that does not make a profit with a product, right? I mean, we think it would not. And wouldn't we be setting the wrong incentives to regulate someone who doesn't make money with it, right? I mean, would you keep developing something if you're asked to follow regulation and you're not even earning money with it? So we would be setting the wrong incentives. That's why we decided to exclude them. And then of course, we also know that open source, I mean, it's critical often from a cybersecurity perspective, but it's just as critical from an innovation and growth perspective. I mean, there are so many small companies, medium sized companies that use your components, your products free of charge to develop other products and services on top. And for them, it's crucial that they have access to this resource. And that's, I mean, we're fully aware of that. And my colleague Rice, he will later on the panel also explain to you in how many other ways we support open source as the European Commission. So here, in a nutshell, are the obligations that manufacturers of commercial products would have to comply with. I mean, first you do a risk assessment of your product during the design phase, then you comply with the essential requirements that are in the regulation. I explained earlier what they look like, these high level objective oriented requirements. Then you establish vulnerability handling processes, such as a coordinated vulnerability disclosure policy. And yeah, and of course, I mean, with any legislation, you will have to document in which way you comply. This would be called a technical file. And once you've done all that, once you've done the conformity assessment, you can affix the CE marking to your product. 
And then for a period of five years, you will be required to handle the vulnerabilities that arise in your products. On top of that, there's also an obligation to report critical vulnerabilities to our cybersecurity agency. We are talking about vulnerabilities that are already exploited in the wild by criminal actors, but that have not necessarily been fixed. And this is for the purpose of helping national authorities mitigate risks in critical infrastructure and so forth. Here's one example of how we are also trying to help the community. So in Article 11, which is about the reporting requirements for manufacturers, we require that whenever they integrate a component and they discover a vulnerability in it, and this could be an open source component, they notify the entity or the person that maintains this component. I mean, so people like you often, for example. So for example, when you go to GitHub, when you source a component and then you discover a vulnerability, at least you should reach out to the maintainer and inform them of the vulnerability. Ideally do more, maybe even provide code to fix it. So here is how it works as regards components. This is an example of a smartphone manufacturer that builds a smartphone. So there will be some components here. You see them on the left-hand side in blue. These are components that the manufacturer develops themselves. So they would not be placed on the market separately, but as part of the final product. And then for these components, you would not need to do a separate conformity assessment, but it would be part of the final product. But then there are other components. You see them on the right-hand side, such as the ones that are developed by upstream manufacturers. So here you see the RAM or the CPU in that example would be manufactured by someone else. 
And so these are placed on the market separately, and it would be the upstream manufacturer who would need to ensure the security of these components and to affix the CE marking. And of course, the manufacturer of the smartphone is responsible for the entire product, right? Just because you integrate a component of someone else doesn't mean you don't take any responsibility for it, but you would need to do due diligence when you integrate those components. Yeah, so these are the categories for the conformity assessment that we propose. For the vast majority of products, the default category, you would do the conformity assessment yourself as a manufacturer. You do not need to interact with any third party. But then there is a smaller list of critical products, which you can find in the annex of our proposal. It's roughly 10% of the scope, and there we require more stringent conformity assessment and sometimes even independent third party assessment. Examples of these more critical products are firewalls, CPUs, or operating systems. And then there is also a possibility for highly critical products. We are not making use of that now, but we are proposing that in the future we can add highly critical products to a list for which it would be necessary to undergo a mandatory certification. We do not have any products in mind. This is really just to future proof the legislation, because we all know that digital products permeate society more and more, and we become more and more dependent on them. And there may be a point at which we will rely on such crucial products that we need an even higher level of assurance. I don't know, human brain interfaces or something like that. I mean, yeah. Okay, this is the CE marking that you're all very familiar with. We are also trying to facilitate compliance for the manufacturers as much as possible. 
So we would be providing, after the entry into force of the proposal, guidance on compliance in general, as well as examples of how to do things in practice, for example, on how to delineate what is a commercial product and what is a non-commercial product. We will also, I mean, as I said, develop these harmonized standards. We are looking into providing funding through the Digital Europe program, that's a European funding initiative, for national market surveillance authorities, but also for manufacturers to help them comply. But then, frankly, we are also counting on you, on the community, on software developers, on the market also, to come up with tools that can help facilitate compliance. I mean, many of these tools already exist, of course, such as static and dynamic code analysis tools, SBOM generators, but, I mean, you name it, there can be any number of tools that could help facilitate compliance, maybe to generate some of the documentation automatically, and, I mean, we're also extremely open to ideas on how to facilitate this. So, feel free to approach me after the panel, and we can discuss it. Here, this is the timeline just for you to better understand, and this is also my last slide. So, as I said, we made the proposal on 15 September. We are now undergoing this legislative process. We are hoping to conclude it before the European elections in May 2024. And then, once it's done and enters into force, we will launch a request for standardization with the European standardization bodies. And then, 24 months later, it would enter into application, so that's when manufacturers would need to comply with the rules, and also when the harmonized standards would become available. Yeah, so there's still plenty of time for you to help in the process, to reach out to members of parliament if you have concerns, or to the member state authorities, and get involved. We are, of course, also extremely interested in your views, so you can also reach out to us. 
I mean, I also brought some business cards, so later, if you're interested, you can have some. And, yeah, I mean, I'm extremely looking forward to the discussion now. Thanks. Sorry for being so fast, but I only have 15 minutes. Thank you, Benjamin. So, I don't know if you saw me moving around during the talk, but we were discussing both cyber resilience and also potential software defects. I just escorted a bug outside, how convenient it was, so saving a bug. So, now we're transitioning to Maarten from NLnet Labs, and he's going to give us his point of view of the CRA from an NGO, so a non-government agency or office, organization, sorry. So, this is a different approach, and, yeah, they are dealing with DNS, so that might be an important infrastructure element to consider in this topic. A warm welcome for Maarten, please. Good morning, everyone. So, I'm really grateful that Benjamin just did the hard work of explaining the legislation, and for me, the, I suppose, easier part of reacting to it. So, my name is Maarten Aertsen. I work for NLnet Labs, which is a non-profit R&D organization. We make open source software for the internet. I stole this terminology from a great report from quite a while ago. And, what we do is we develop internet standards, or we contribute to the development of internet standards, and we make implementations. And, the specific area of focus that we have. The mic, closer to you. Ah, I'm, I will. So, the specific work that we do is we develop, we contribute to the development of internet standards, and we make implementations in software that are open source. And, we've been doing this for two decades. We make software in the area of DNS, and we make software in the area of routing, specifically, routing safety. And, what we also do is we try to bring that knowledge to policymakers, and that's what I'm doing, hopefully, today. So, Benjamin just talked about the scoping, and he mentioned that most of us would be out of scope. 
And, when I read the proposal in September, October, I wasn't so sure, and I think that's something we should discuss. So, there is a recital in the text that basically reads, free and open source software should not be covered by this regulation, which is, I think, I think we should be very happy that the Commission is aware of open source, and has taken care to actually address this in new legislation that they come up with. So, I think that is something we should celebrate, much like the fact that open source has been here for 25 years. So, maybe a little warm round of applause just to acknowledge that we've been here. And, what I want to talk about today is the fact that there's a clause here, and it says, outside the course of a commercial activity. And, what is a commercial activity? There are some examples. It's not a defined term. And, one of the examples that is relevant to my organization, to NLnet Labs, is specifically that what is called out is that if you charge a price for technical support services, then you are considered commercial. So, the organization I work for is a registered charity. We make software, we develop standards, we give it away for free, we give away updates for free. And yet, we employ people with mortgages. So, the question is, how are we supposed to make money? And, we've found a way, namely, the network operators that use our software want it to be available long term. And, they want there to be different implementations available. So, they've chosen to pay us to provide support. And, so this makes our work a commercial activity, at least in the way we read this text. I would be glad to be wrong about most of the slides I'm showing you today. And, please tell me if I am. But, that's why we got interested in this legislation. And, there are some risks if you look at this exception. Because, if you have an expansive interpretation of the word commercial, then the exception is narrow. It's this balance, right? 
Depending on how you interpret commercial. And, if it's a narrow scope of exemption, then what you create is a disincentive for open source developers who start working on something in their free time to professionalize. Because, at some point, when you hit this commercial qualifier, then you come into scope of regulation. And, there's nothing bad about regulation per se, but we have to acknowledge that the step from not being regulated to being regulated is a change, right? And, what we don't want to happen is for charities like NLnet Labs and others who make this kind of software to realize that the way they've been able to sustain their development is now disappearing. So, what we don't want is for there to be an incentive to move to closed-source software. And, finally, especially in the area of DNS, what we want is product diversity. So, it's not a good outcome if we go to a place where there's fewer organizations working on these projects. Because, what we want is different implementations that do not share code, for stability. So, it's not just me. The OSI reacted to the proposal. And, they said, and I like this quote a lot, the problem is not the lack of taxonomy of commercial, so the solution is not explaining what commercial is supposed to be. It is the very act of making commercial the qualification rather than deployment for trade. And, what is relevant about this quote is that the act of maintaining and creating software, and the cost of that, is very distinct from the act of profiting from its deployment. And, if you confuse these two steps, and commercial may help you confuse those steps, then we get into this place. So, that's about general scoping issues. I'd like now to move to specific concerns we have from NLnet Labs. And, why do we have concerns? It's because we have products. We make Unbound, the resolver, we make some other products. 
At least, we believe it's not an unreasonable interpretation to consider the stuff we release, including binaries, to be products. And, maybe we're wrong. Love to hear that. So, if what we make are products and if what we do is considered commercial, then there are some unintended consequences of being covered by this legislation. Because, currently, we are an outfit with 15 people, 14 of whom are either software engineers or developers. Every effort we put into proving that we're doing the right thing, the compliance bit, which is a logical part of any legislation, every effort that we spend on compliance will not be spent on securing our software. And, the only reason why a charity like ours exists is making secure software, resilient software for internet infrastructure. And, if you look at the Annex III list of critical products, it doesn't take too much imagination to frame most of what we do as critical products in terms of this legislation. And, I think Benjamin explained about the way to avoid having to call in third-party auditors. But, it depends on the availability of European standards that are applicable to our work. And, in general, I think most of us are not engaged in the European policy process. And, even fewer of us are involved at the European standards institutions. Because, basically, we have no need to be there at this point in time. And, we have not had a need in the past. So, at NLnet Labs, we work quite a lot on standardization, but it's at multi-stakeholder forums, accessible to any party, without payment, etc. And, that's not how these European standardization institutions work, unfortunately. So, the escape hatch, if you are a critical product, to not have to call in third-party auditors may not actually be available in practice. And, that means we will have a costly burden checking our processes, which does not actually contribute to the quality of code. And, that's why I think this is a concern for us, or at least an unintended consequence. 
There's also text in there which requires you to fix all known vulnerabilities. And, this does not have a regard for severity. And, what this does is, if you have to fix all known vulnerabilities, then this diverts your engineering effort from practices that protect you against unknown vulnerabilities. So, I think this is not what was actually intended. I'm just flagging it, because I think we want there to be good laws, and we want to help in a democracy to achieve those good laws. I mean, if you live in a democracy and you like that, why not contribute, I suppose? Otherwise, there's not a lot of point. So, there's also more detailed concerns. So, for example, we currently have the opportunity to provide security updates to providers we know are extremely critical to the functioning of the internet. So, if we are about to release a patch and give full information about what it fixes, we can choose to provide an update just ahead of time to, for example, a root server operator or some other entity which is not equal in terms of importance to all of society. And, the current text takes away our ability to do so, because it requires us to do everything at the same time, which may not actually benefit security, which is the goal. There's also obligations to report incidents that actually relate to third party use of our software. So, we don't do any operations, we just develop software, but if we are obliged to report incidents that relate to the use of our software, this gets into a conflict with our role of remediating vulnerabilities. And, this is the case, because these third parties may have some reason not to go public with an incident at first. And, if you create a conflict of interest in talking to us, then we may need longer to fix the vulnerability if there is such a thing. What we don't know, and this is more of an uncertainty, we don't know what constitutes substantial modification. 
So, the whole idea that you trigger reassessment of your product for the CE marking, it hinges on the fact that, when a substantial modification is made, you need to do it again. And, we currently don't know if that's a major release, minor release, patch release. We do quite a lot of releases. And, depending on how this pans out, we may spend a lot of work doing compliance and less work doing release engineering. Finally, there is some clause about time-limited availability of software for testing purposes. And, I think that's fundamentally incompatible with how open source is done. Many of the points I've made today, we worked on together with ISC, the authors of BIND, among others. And, on their website, you can download about 30 years of binaries, of all versions of BIND ever made, which is really helpful if you're trying to find issues, vulnerabilities, whatever you try to. So, what we don't want is for them to have to take this stuff offline, because I don't think it actually helps us reduce vulnerabilities. So, if you are interested in this stuff, I recommend you to read further, because this was just my perspective. And, NLnet Labs, as an organization, a charity employing people that are getting paid full time to develop open source software, is not very common. And, I realize that. But, you don't have to take it from me. You can also read about other responses. So, there's the work of OSI. There's work by OpenForum Europe, a Brussels-based think tank, which collects voices from other organizations relating to open source, including commercial ones. And Simon, who's in the front here, maybe he can raise his hand, actually compiled a list of responses by others. So, you can take a look yourself and give it some thought. I wanted also to quickly give a shout out to a blog written by Thomas Depierre. I'm hoping I'm not butchering your name. And, it relates to supply chain thinking, how that's currently quite dominant in thinking about security for open source. 
And his point is basically that supply chain thinking doesn't work if someone in this supply chain does not actually consider themselves to be a supplier. So, that's it with respect to what I had to say. In summary, we believe that the goals the commission is trying to achieve are worthy goals. And, as a consumer, I would like them to succeed. However, there are also some unintended consequences in their current proposal. And I think we should aim as a society to fix those. I'm very interested to hear your perspective. And, for now, I'd like to thank you for your attention. I'm looking forward to the panel.
Thank you, Martin. We hit 30 minutes with two different opinions and views. Are there some questions in the audience? Not yet? Oh, one, two. Two questions.
I'm from the UK. So, how does that affect other European countries that may be? Okay. How does it affect the UK and the UK working with Europe? We're still part of Europe after Brexit. And, secondly, how does it affect what's happening in the U.S. in terms of all the stuff that NIST is doing, particularly, you know, from the executive order that happened in 2021?
I couldn't hear the second question.
The second question was, how does this affect the U.S.?
So, yeah, thanks a lot for that question. For the Cyber Resilience Act, we are applying the same principle as for any other Union legislation that is about services or products. Whenever you provide a service or a product in the European Union, you have to comply with the rules of the European Union. So, for the Cyber Resilience Act, that means that if you provide hardware or software manufactured outside the European Union to consumers or other users within the Union, you would have to comply with the Cyber Resilience Act. So, there's absolutely no difference. It doesn't matter where you're located. If you want to sell your products here, then you have to comply.
So, there's also no risk that there would be an advantage for, say, manufacturers from America and so forth, that they would not have to comply and would therefore have lower costs. No, it's a level playing field. Everyone who wants to sell their products here would have to comply with the same rules. I hope that answers your question.
Hi. Why is policy the right place to put these measures into effect, rather than a standardization body like ISO, which already does a lot of the same things and allows people who want the guarantees associated with this reliability testing to select products that comply with the ISO requirements?
Okay. If I understood the question, the question is why we think that a mandatory approach is necessary, right?
Right. Why is the mandatory approach necessary instead of a voluntary approach that could be selected by consumers who want it?
Yeah. I mean, the voluntary approach is the approach that we took basically for the last 20 years, right? And it hasn't worked. I mean, we are not eager to regulate new sectors, especially when it comes to sectors that foster innovation and growth as much as hardware and software do. That's one of the reasons why the Commission of the European Union for a long time decided not to regulate. But we see that, you know, despite new standards coming out, the development of lots of secure development lifecycle guidance, and new programming languages that are memory safe, I mean, all this is coming, we see that, nonetheless the vulnerabilities are not going down. And we actually see more cybersecurity incidents than ever before. And this is why we chose to regulate. In our view, there is simply not sufficient incentive for the market to take cybersecurity as seriously as it should. I mean, you all know this. You have so many specific features in this market, such as, yeah, time to market.
I mean, first mover advantage: whoever is the first one to build a platform basically wins the whole market, yeah? So, under all these constraints, manufacturers do not have an incentive to prioritize cybersecurity. And we feel that this can only be fixed through regulation.
Thank you. So, there are more questions to come. But we have a panel after, so please keep your question and ask it afterwards, so that we can proceed with the last presentation. I was also thinking about the same thing as you, sir, about the ISO part. Working with customers and partners at Red Hat, I can tell you that they need a directive or a compliance requirement based on the law. Otherwise, they will never apply these things. So, the next speaker, Omar, is going to introduce us to the other part, which is more about the defective product and what it means in terms of liability. So, please welcome Omar with a round of applause.
Thank you. You're okay with that? Yeah, we can take the mic. Did Martin show you how to use that? Yeah, I think it's, yeah. Thank you.
So, thank you for the invitation. Thank you for being here. I'm going to try not to use too many legal terms, because I think now we're going to enter a zone where, if you're not a lawyer, you will not really understand what we're talking about. So, I will try to use simpler situations. I think the name of the proposal speaks for itself: Product Liability Directive. The PLD, as we call it, is about liability. So, I'm going to give you some examples. Boeing 737 MAX, two of them crashed some years ago. An autonomous vehicle that crashed and killed a person. In South Korea, a lawn mower that went into a person. Your mobile phone that overheats. All of these situations are what we call liability. And all of them involve software. So, what Benjamin explained to you are the obligations that apply when you want to place your product on the market in the Union. But when something goes wrong, someone has to pay the bill.
This is how every system works in all countries. And what the product liability regime does is tell you who the liable person is and what the victim has to show to claim compensation. So, the regime of the PLD has existed since '85. But as you know, the products of '85 were not the ones that we have today, including software, AI systems, IoT, etc. So, what we have done is to redraft and clarify the regime that existed by including all software in it. Because as you know, and with the examples that I gave you, software can in some situations cause harm and cause damage. So, what the PLD says is that the manufacturer is liable for damage caused by a defect in their product, and the injured person, in a tribunal or in a court, has to prove the damage, the existence of the damage, the defect in the product, and the link, what we in legal terms call the causal relationship, between the defect and the damage itself. If this is proved, the manufacturer has to compensate the victim, or the family of the victim in case of death. So, the manufacturer is defined; this is a definition that appears in all product legislation in the Union. The damage: what damages are we talking about? Death, personal injury, damage to property. Think about an IoT boiler system in your house: it overheats, the house explodes, someone has to pay for that. That's a simple situation, but at least you see exactly what we're talking about. Data loss, data corruption: this is a new addition. So, obviously the liability that you had until now covered any physical things, but as you know, you own stuff even in the immaterial world, when you talk about NFTs. Why would you be covered if your painting burns because your house exploded, let's say, or whatever, but not be able to get compensation if you actually lose your NFT? So, all of this has been re-adapted and included in the regime.
So, these are the damages for which the manufacturer has to pay compensation in case there was a defect that caused one of them. And so, the entire notion of defect, and I'm going to try to explain a bit what was part of the revision, the entire idea of defect is that the product, let's say, lacked the safety that a person in general is entitled to expect. So, your fire alarm does not detect the smoke; you are expecting that it would actually do it. If you have a computer and the software has so many vulnerabilities that you lose all of your data, you might expect that this should not actually happen. So, all of these are normally assessed in courts, and if you establish that there was a defect and that there was damage, then you get compensation. Defect does not mean that we're talking about a total absence of bugs in software, for instance. This is not what it means. Zero is impossible. You cannot have it. But you can have vulnerabilities that, as a developer, you should have expected, or you should have foreseen. So, all of these are elements that are taken into account to establish the liability of the manufacturer, or manufacturer-developer, depending on which type of product we are talking about. So, as I said, the liability regime has been revised. There are different adaptations of the regime: for the digital age; for the circular economy, this is about the substantial modification that we've talked about; for global value chains, this is a concept that we talk of mostly in trade, where a product comes from outside of the Union, the manufacturer is established outside of the Union and brings the product into the Union, and so you need to know who the liable one is; and then better protection for victims. So, basically, what the new Product Liability Directive proposes is to say that any software is a product, no matter how it's supplied or provided. So, this obviously also includes any AI systems.
No matter if it's software as a service, or anything that would be considered a service, all of it is software, all of it is a product, and all of it will be covered by the liability. But at the same time, you also know that software does not work independently. You have what we call the digital service, the flow of data that feeds the AI system for training. All of this will be covered, and all of the providers of that data, for instance, will also be covered by the liability in case something goes wrong. As we talked about already, there is the exception of open source software outside of a commercial activity. To be clear, being outside of a commercial activity is the exception for any product. If a product is outside of a commercial activity, it's out. But what is a commercial activity? Well, if your software is sold, this is a commercial activity. If you provide services to the company that takes your open source software, this is a commercial activity. Those are examples of commercial activity. This is how it has been interpreted by the courts. This is part of the court judgments and case law. Obviously, you will have different situations, but these are some examples that I give to you. Then the damages for data loss and corruption. The software updates and upgrades. Obviously, your software, or any type of product that is digital, evolves during its lifetime, and this also has to be taken into account. If you do not provide the software update or upgrade that was necessary to avoid the damage, you can be held liable. These are the different elements, but obviously, I will not go into absolute detail. Cybersecurity vulnerabilities might also open a claim for damages. The evolution of an AI system through machine learning could also open a claim if the machine causes any damage. That's a bit of the adaptation. The circular economy is the general concept of substantial modification.
Obviously, if someone substantially modifies a product, in this case software, they should be the one held liable if something happens, as the manufacturer or the developer of this software after they have made the substantial modification. The value chain is more related to the traders that import the product into the Union. As a victim, you should not be blocked just because you do not know who the manufacturer of the product is. Anyone who is part of the value chain that brought the product into the Union can be held liable. If you buy through an importer, or through an online marketplace, those entities could be held liable instead of the manufacturer. Obviously, if they have a contract with the manufacturer itself, they can have recourse against them. The better protection is more about technical points, but it's how it's going to work in courts. You have situations that are so complicated. Let's take an AI system: showing where the defect is in an AI system might be complex for a victim. We created what we call in legal terms legal presumptions. If the victim can prove one of those elements, you have what we call the presumption. We presume that the product is defective and that it has caused the damage. If that's the case, then it is up to the manufacturer to explain that actually it is not the case. That is a legal tool that exists in every system, but we have adapted it to ensure that if something goes wrong, someone at least has the possibility to bring a claim. It does not mean that you are liable. It just means that the victim can bring a claim to a court, make their case, and explain why the product was defective, why they have suffered damage, and why the manufacturer should be held liable. This is not an automatic situation. Just because we have included software, for instance, in the scope of the legislation, it does not mean that you will be held liable. You need to have damage.
You need to have a plane that crashes because the software forced the pilot to go down instead of up, or the autonomous vehicle that goes into a wall while there was nothing on the road. Those kinds of situations are the ones that at least allow a victim to make the claim, go to court, and try to get compensation. If the product was not defective, well, then there is no problem. Again, when we talk about software, it is not zero bugs that we are talking about. We are talking about the expectations that you have about the software: the fact that your software should not overheat your boiler, the fact that the software pilot in your plane should not push down while you're trying to go up. Those kinds of things, these are the situations that we're talking about. This is a bit of a technical point, but we have what we call disclosure of information. Only the manufacturer of the product itself knows the product. As a victim, you need to have access to some sort of information, like the data used, any kind of evidence that could be of help in building the claim. Obviously, that information is protected as trade secrets. This is not a way of giving away your secrets. They are protected through the courts. There are systems. We have years and years of practice in national systems, in tribunals, to know exactly what can be shown, how it can be shown, and what to protect. But there should be access to the necessary information for that. Finally, these are some of the small details that have been changed. There was a ceiling before for claims: before, you could not get more than 70 million. I mean, we're not saying every claim will be 70 million. Let's put it like this. But there was a ceiling. This has been removed. Nothing justified limiting it to a certain amount of money. There was a threshold. If your damage was less than 500 euros, you could not have any claim for liability.
And then the liability has a time period. You're not liable for 300 years. You're liable for 10 years after putting your product on the market. After that, you cannot be held liable under these rules. So there is a time limitation. This allows you to predict and to insure the risk of your business, if it's a business, or of your product. And there has been a small adaptation for personal injuries. This is typical for some specific products, like pharmaceutical products, where the damage appears later. I mean, I didn't go into all the details, but product liability does not apply only to software. It's one of the products. We're talking about chairs, cars, pharmaceutical products, all of them. So, for any damage that any of those products cause, you would have a claim for compensation. So that's it from me. Thank you very much. I hope that was not too technical.
Thank you, Omar. We're going to go now to the panel, and I will introduce James Lovegrove from Red Hat, our policy director, who will be the moderator. So I'm giving the mic to James.
Very good. Welcome, everyone. I appreciate that we're halfway through the session. So if you want to stand up and then sit down and stretch your arms, that's fine. So in terms of the FOSDEM weekend, I mean, yesterday we had the policy summit, which brought together a couple of hundred people in person and hundreds more online, I understand. But what's interesting is that only five, six years ago when we started it in earnest, and OFE is in the crowd, there were literally, I think, sort of 14, 15 people in the room. So that massive scaling is interesting in the sense of a response to the growth of EU policy impacting digital ecosystems, open source included, but also the intent and realization that, as a community of open source practitioners, further collaboration and coordination is absolutely essential. I think today is a good example where the Commission sees that as well.
We're joined today by three members of the European Commission, from DIGIT, from GROW, from CONNECT, and frankly, I think they're worth a shout-out and a round of applause. It's a big deal having them here. Thank you. So, I mean, music to our ears: Benjamin made it really clear up there. He mentioned the rationale to exclude non-commercial open source. So, you know, the door is open, the ears are open, but this is a complex process. Yesterday, we were talking about the analogy of trains leaving stations and so forth, and yeah, it's now en route. It's going through quite a complex co-regulatory process, and as a community, we don't have the resources in many cases to actually keep up with that. So this kind of collaboration is absolutely critical if we're going to reflect that rationale of best practices when it comes to open source, but also best practices in terms of the tools that we're talking about, into the law and into the practice. Otherwise, there will be unintended consequences, and I think nobody wants to see that. I think the final point I want to make, just before I hand over to our first panelist, is that even if we get the open source exclusion and some of the clarity into the text and the articles around open source, it's still kind of half of the journey, and I think that's important just to underscore. There are still elements which need further discussion on how they practically impact this extraordinary ecosystem. And without further ado, I'm going to hand over to Zoe, who is a senior policy manager from Digital Europe. I'll let her introduce what the organization does and who it represents, but it's a big deal when it comes to engaging with EU policymakers, both here in Brussels and across the Union, and it's great to have her here to let us know what Digital Europe thinks about this. Thanks.
Thank you, James. Thanks a lot, everyone, and indeed many thanks to the organizers of the panel, but also to the commission.
We also really do appreciate that you always take the time, and I know how much effort you're putting into this legislation. So Digital Europe is an industry trade association. For those of you who don't know, we represent the tech industry that is active in Europe, but we also obviously represent, well, global actors from across all regions that are doing business in the tech industry in Europe. And we have two chambers of members. We have the corporate ones, and we also have the national trade associations, and this gives us an outlook on the global scale of things, but also on the national level. So through our national trade associations, we also represent all sizes, including SMEs, for example. So why are we here? That's because we think that the Cyber Resilience Act is one of the most pivotal pieces of legislation that we are going to have, well, now with this mandate, but in general, it's a trendsetter. It's a very, very important piece of legislation. At Digital Europe, we do a sort of internal survey to see what is important for our members, to make sure that our resources are allocated accordingly, and AI, data, and the Cyber Resilience Act have been ranked as the three highest priorities for the digital industry. So I just want to emphasize again how important this legislation is, and thank the organizers again for really taking the time to highlight this. Now, I think it's probably one of the first times that the industry and NGOs really agree. So Martin's presentation was very interesting, and I noted down many of the points that we also highlight in our position. I also want to take a moment to say that you probably wonder, especially the open source community, why the industry, the big tech companies or multinationals, care about the open source community. So actually, I just wanted to clarify that for us, it's very, very important to make sure that the open source community is not overburdened by this legislation.
For many of our members, even up to 100% of their commercial products rely on open source components. And just like James said, there are many priorities, many things that we can improve in this area, but not overburdening the open source community is one of our top priorities as well. So it is also important to note that it's a bit of a stereotype to think that the big tech companies, for example, only rely on closed source and don't care about the open source community. But I want to highlight that for us, it is very, very important to make this a key priority for the CRA, to have clarity on what this means for the open source community. So I'm not going to repeat myself too much, because you already said it: Recital 10 excludes open source. And we very much welcome this. And in general, we welcome the legislation. It's a very good draft. It's something we can really work with. And we need it. Let's start from that. We all agree, also the private sector, that we absolutely need these types of requirements. So we are very willing to find a way together that makes it work. But we really need to get it right; including too much, too soon can also be very damaging. So for open source, there is a conundrum: it's excluded, but not really. So the text says that it's not in, but then when you look at the commercial element of it, and if you look at the definition of what commercial is under the Blue Guide of the New Legislative Framework, it says that commercial is understood as providing goods in a business-related context. So what does this mean for the open source community? What does the business context mean? Like I said before, many of the commercial products rely on open source components. So we need clarity. Right now, if you look at the document, there are 85 pages. Commercial is mentioned only twice, and there is no definition. But this brings me to the bigger element, the issue of the proposal that we really see.
One, we really welcome the fact that the NLF is the approach that was chosen. Benjamin did a great job explaining the NLF and the CE marking and everything related to tangible products. The big question right now is how to move a regime that has been working very well, we think, for tangible products. The biggest challenge now is that the CRA wants to transform this regime into something that is workable for software, which is again very welcome, because the NLF has proven to be the right way to do things. But we really need to work more on the definitions. So commercial is just one example. Then there are other examples, like components. In Article 11, when it comes to reporting, it says in paragraph 7 that manufacturers also have to report on vulnerabilities in components, including open source components. So the question there is, what does a component mean in the context of software as well? So this is a general comment on how we really need to work together to transform the NLF into something that makes sense for software. This is why at Digital Europe we have been really advocating for a clear provision asking the commission to develop guidelines that adapt to the specificities of software, and specifically standalone software, with the inclusion of the industry. Because what we see here is that the people who are actually going to have to abide by the new compliance rules, the industry players, are not really included in the draft, in the sense that we really want to be part of an expert group that is going to help the commission, which we know has the appetite to do this right. But we bring the experience, and we bring the questions also. What does this mean? What does placement on the market mean for software, for example? What does substantial modification mean? So all these things, I think the important thing is that we all see that we have the appetite, but it still needs to be clarified in the text.
And maybe just as a final point, all these things need time. And I really welcome the question before on standardization. We appreciate that 90% of the products can rely on self-assessment when existing harmonized standards are there. But we also need to look at what this 10% means for critical products. Have we actually made an assessment of how much of a burden there is for third-party conformity assessment, for example? Are notified bodies ready? So all this means that we need time to get it right. The current proposal says 24 months for implementation. For the development of harmonized standards, sorry, and for the development process, that is just too little. And I think this is probably something we are more or less aligned on, that we need more time. For us, the proposal is 48 months, but I know this is something we will discuss further with the commission and with the colleagues later. Okay, so I don't want to go into more; there is way more there. Maybe just as a final comment: even if, as James said, we get the open source exclusion right, there are several things that need to work in the whole context of software, including reporting obligations. On the timelines, right now it says 24 hours for initial notification, but we want it to be aligned fully with the NIS Directive, because we think introducing too many processes is just too much. And yeah, I mean, I don't want to go into too many more details, because James is giving me a smile that says that I've talked too much. Thank you very much.
Thank you, Zoe. Well done. So I'm going to now turn to Rom, a colleague of mine. As you know, Red Hat is an upstream-first, pure-play open source software company. And therefore, Rom spends a lot of his time, maybe most of his time, upstream as a volunteer, as a developer, as a contributor. So I'd be interested in hearing your perspective.
As Martin was mentioning, you know, he has a particular perspective on it from DNS, and now the wider, larger ecosystems are involved in the kinds of products and projects that Red Hat's involved in. And to Omar's point earlier about just getting real on zero bugs being impossible in the concepts of the PLD, I think that's an important point to underscore, along with, you know, the no-known-vulnerabilities implications for open source or even for proprietary, not that I care much about proprietary. But nonetheless, it is important to have that recognized, that the commission understands that, but it needs reflecting, as I say, in the text to make sure that it is clear. So, Rom, I'm going to come over to you and give you the microphone.
Thank you. Yeah, so we are in a unique position at Red Hat of being a large contributor to open source. And if you look at the portfolio of products that we have, I think we're depending on and contributing to more than a million different open source projects. That's the level of contribution. One particular example that James was mentioning is to look at critical systems, but also critical products, like credentials, and the zero-bug question. We all know that a zero-bug policy is impossible to achieve. And I'm really glad that Omar underlined this. But the reality is that if we don't have a clear definition about this, it's going to slow down so much of the contribution, so much of the innovation that we have. And we already see that through the lens of our internal folks, and myself, for some projects that I'm working on from Kubernetes, maybe you know Kubernetes, with the special interest group for authentication and key management systems: there are a lot of people scared there. And sometimes we're looking into putting, I would say, new features into the product that we deem interesting. And we truly believe that it will be a game changer for the community and for putting a product in production.
But in the meantime, we are really scared because we are not able to make a full interop compatibility assessment with all the different projects. As I said, one million projects. Can you imagine if you start having the need to go through the full cycle of integration and testing for this? We try to do it at Red Hat. And that's sometimes the reason why, when you look at the difference between what you find in the community with Fedora, I'm doing a bit of marketing now, with Fedora, or CentOS, or I would say OKD, you will see that there's a major gap between what we have there, which is like a top-notch, cutting-edge solution, versus what we have in the product. Because we're scared about it. Not scared in the sense that we don't want to deliver innovation, but we are scared about what the end result would be if we also become responsible for some breaches. With credentials, and the work I'm contributing to, actually, because we are about 112 people working on this, we look at it from this perspective, this lens of compliance and regulation. And just yesterday, I was at one of our company events where we had 1200 people, techie people who are interested in technology, and a group of 60 of us there were discussing this. And that's the moment some people were like, oh, I'm not going to touch this, I don't want to be responsible for this. So we need more clarity, we need more understanding. So when we see the law, what I would really love to have is more of a framework that we can share with the community, to make sure that we can put those regulation and compliance perspectives straight upstream, so that you guys are not afraid of contributing to any top-notch, cutting-edge solutions. Credentials is one of them. If you know Kubernetes a bit, you know that it has nothing to do with password managers, but as soon as you're using secrets with Kubernetes, then it becomes a password manager.
And it's a critical class one product then. And so it's those kinds of things that nobody thinks about, but we need to get that into perspective so we understand what then needs to happen in the product. So there are specific elements to take care of. And there's a cycle of work that is really, I was going to say a burden, but we all know that when we do software, we need to go through a secure supply chain. This is the buzzword of 2022 and 2023. But it's a reality. And I like the comment about the ISO part. There are so many different ways to standardize this. There are the best practices from an industry standpoint. There are guides. I think the EU has the Blue Guide also, where you can find some guidelines. So there are different ways to approach this. But it's an ethical responsibility to apply it. And sometimes, well, we are so carried away with our innovation that we forget about it. We're like, oh, we should do it like this. And so that's the other side. I'm really worried about the regulation. But in the meantime, I'm really happy that there's something there to force us, straight from the beginning, to have this secure supply chain in perspective. That's it from me, actually.
Great. Thanks, Rom. Yeah. So I think one of the things I took from that, and I think this penny is dropping, not just in this room but elsewhere, is that software is fungible. And I think some of the terms in the draft regulation around intended purpose, around... sorry, it's a call from my son. Probably not a good time. Yeah. So in terms of intended purpose, reasonably foreseeable, any known or foreseeable circumstances, I think that's something which gives open source pause for concern. And I think one of the reasons, coming from an open source licensing perspective, is that bringing that community together is not just a question of passion to solve problems and build our open innovation in Europe, but also of being shielded.
So developers do not incur liability nor offer warranties. And I think that's something which is clearly important. And actually, it's a good segue to Gijs, because in his capacity at DIGIT, in the open source program office, the EUPL is one such license, which again, is important to open source contributors and collaborators. So I'd be interested to hear more from your perspective, and understand your thoughts on this particular topic. Thanks. Thank you. And it's very good to be back at FOSDEM, people. I think we all agree. This legislation, of course, also impacts the Commission's open source projects. So we're looking forward to figuring out how to work on this. To give you a bit of perspective, I talk on behalf of the European Commission's open source program office, and to quote the words from Deborah Bryant yesterday, we're basically, for you, an easy-to-find, easy-to-access way to find out who to talk to amongst our colleagues. We will do the legwork, we'll find out who we should put you in touch with, we'll find out who to talk to. And we will liaise when necessary between the Commission policymakers and the open source community. So that's one thing I can promise you, we'll try and make this happen. A little bit about the OSPO, James, is that okay, as context? So we started in 2020, thanks to a communication from the Commission to itself, about an open source strategy that recognized that if the Commission wants to reach its political goals, it needs to take into account that in everything there is software, and in all the major political goals, open source is a critical component. To be an industry, to be a European market, to be active in innovation, to be active in competition, open source is there and it can't be taken out. One of the first things we did as OSPO is make it super easy for the Commission to disseminate its open source process.
Previously it was a paperwork process that would take six months; now it is basically up to the project to decide, "we want to go open", and we have a code repository, code.europa.eu, which now has some 200 projects and 700 users, and this is from the Commission, from ISMA, from DG CONNECT, from JRC, from the European Central Bank, from the EDPS, so you should check that out and maybe start working on some of these solutions. And this also shows that we will be impacted, because we're pulling in an enormous amount of open source to build these components. Another thing that I think is good for you to realize is that we're, as the Commission OSPO, in touch with OSPOs in the member states. Some of them are here: I have my colleagues from Germany on the right, for you on the left, and the colleagues from the Czech Republic on the left for me, so on the right for you, and we're in touch with these specifically for this kind of reason. We need to network as OSPOs, we need to avoid making mistakes that others have made before us, and work as a group. I think I'll stop here, it's a good introduction. Very good, thanks Gijs. So I am conscious we have, if I'm right, another half an hour, 25 minutes, is that right? So I would like to give Benjamin the floor just to capture some of the thoughts I see you scribbling down. I hope I put you on the spot, but I'm sure some of them are familiar and it would be really good to hear from you on this, thanks. Great, thank you so much, James. Actually, all the points are familiar to me. We are of course engaging a lot with stakeholders, and as the Commission, we are now in a tricky position because we made the proposal, and it's of course the Commission's view that our own proposal is perfect the way it is.
That being said, it's the beginning of the journey, and it goes without saying that the Parliament and Council will closely look at the text, and they're of course also consulting us a lot, and many of the things that have been mentioned here are being discussed by the co-legislators. So for example, this question of providing products with no known vulnerabilities, I mean, I can say that this is being discussed already by the co-legislators. And also, I mean, we've heard quite often already this concern that maybe the transition period is too short. I mean, we believe that 24 months is sufficient, but we hear this a lot from other stakeholders. So I mean, you can be sure that this is all being taken very seriously by us. Wonderful. Now folks, it's really important that this community here in this room take this opportunity. It's not a speak-now-or-forever-hold-your-peace, but it is a good opportunity. So if there are any who want to chime in with a question or comment, that would be great. And then we're going to move back to the panel. Who's first? I think you're first. I already stole your mic, sorry. Okay, go for it. Sorry. So, well, I'll try to be brief, but there's a lot to say. So first I have to apologize, because I feel strongly about this, so I will get carried away. Sorry. Nothing I'm going to say is meant to be offensive to the Commission. Actually, I think that many of the recent acts are very good for the internet industry, for the internet, like the Digital Markets Act. So the criticism is not meant as a rejection of regulation. So my perspective: I'm a software developer and I'm the head of policy for Open-Xchange, which is a German open source software maker of around 270 people. So I'd say that we are small compared to the global big tech companies. But we are still one of the biggest. So we are also one of the companies that, I mean, we make stuff that is in every Linux distribution, like Dovecot, and so we care about security.
We are ISO certified, ISO 27001. So we are among those that could afford to follow everything that's in the CRA, and would do that. But we are part of an ecosystem. And so we feel the need to, I mean, also bring the perspective of others, including some smaller projects that we use in our own projects. And the problem with this regulation is, well, there are multiple problems. Let's start from the non-commercial exemption. So there is really no way that you can tell commercial projects from non-commercial projects, because each and every project, except maybe my hobby project, which I only, I mean, do for myself and publish somewhere, has some economic result, creates value. And there will be people using it to create further value. And so it is basically very hard to write an unambiguous way to tell commercial from non-commercial. And all projects, including, I mean, research centers, NLnet, universities, need some money; we need to get at least donations. And so there is no way you can tell this. And on the other hand, I would also be against saying that all open source software is exempt from this regulation, because I'm sure that the proprietary software makers would come and say, you know, as they have been saying for 20 years, open source is just a toy, it's just a hobbyist project, so you should use proprietary software, which is secure. So don't do that. The problem is really that this law doesn't seem to understand how software works. I mean, software, pure software, immaterial products, have a nature which is significantly different from any physical product with software on it. So this regulation seems to be designed with cars in mind, with toys, with IoT devices, but not for pure software. Pure software is more similar to literature in a way. It's something that is the result of collective development.
And if you have a look at open source projects, I mean, yes, we are the manufacturers of Dovecot, but if you look at the code, it's written by people everywhere on the planet. So yes, but the real authors are maybe a thousand people who are in Europe, outside of Europe, everywhere. And so your law seems to put the burden of putting a CE mark on every line of code which is imported into Europe. I don't even know what that means. I mean, we have repositories, maybe they are in Europe, but they are mirrored in the US. There are people from Australia, from China, putting code into it. And this code is traveling all over the world, across the borders of the European Union, several times per second. So every time I check in some code, I mean, someone should stop the code at the border and put the CE mark on it? This doesn't really make any sense. It's simply impossible. And that is to say, I mean, all the people here, I mean, your law is designed for the traditional 20th century way of making things, in which there's a manufacturer and there's a distributor and someone that sells them in a shop, and then there's the user, the consumer. I mean, as if we were still buying, sorry, I'm trying to be short, and I know you want to stop me, but I have to say this. Yeah. And so this seems designed for the time when we went to the shop, to MediaMarkt, and bought Windows 95 in a box on a CD. And this is not how it works. Everyone here is at the same time a manufacturer, a distributor, an importer, and a user of code. And there's no way you can distinguish these roles. So you have to find something, you have to find somewhere. So now, I can cut it short. So I don't understand the problem you're trying to solve. Show me one case in which a European piece of open source software had a problem that created issues for the entire world, and how this could be solved by this regulation.
Because if you just maybe funded some security audits for projects that cannot afford them, this would have a much bigger effect. So let's start the discussion again from software, from something specific to software, because this really is not working. Thank you. Thank you. So I submitted the OpenForum Europe document that Martin referenced. I probably have a dozen questions, but I picked one. So to publish software in the EU or to import software into the EU, the publisher must perform CRA certification and accept CRA obligations. So this means that software from outside the EU will no longer be available in the EU if those developers do not perform CRA certification or accept CRA obligations. How is that not a massive problem? Yeah, I think my understanding is that one of the goals of the CRA is to avoid Log4j or open source type of incidents, but they are non-commercial open source software. And when we exclude them from the CRA, we are technically not addressing these cases. So what do you think about this? And as a solution, what do you think about expanding the proposal to set up the necessary operational and financial assistance dedicated to the critical open source initiatives? Maybe especially the EU ones, maybe a European OSPO. Is that feasible? Thank you, Alex. Thanks. Thanks for the insights and thanks to the panel. I mean, I have two questions. We have a lot of problems, but I'm interested in solutions. So first, what's your proposal to fix this commercial term? So do we have one? Could you let us know? And the second one is to the Commission. So do you see the issue with NLnet? So would you consider charities to be affected by this definition or not? So what's the Commission's take on the NLnet position? Thank you. Thank you very much. And finally, we've known for a long time, since David Wheeler wrote his paper stating that all open source software is commercial, that the word commercial is a bad word to use if you want to exclude open source.
And so I recognize that it arises from the courts in Europe and from the Blue Guide, but have you considered using different terminology to solve the conflict that exists here because of the nature of the community you're addressing, rather than because of the context of the existing legislation? And secondly, which community organizations did you consult in working on both the PLD and the CRA? You see the panel is comprised purely of commercial organizations or Commission members. And all of the questions you've received are from community charity representatives who are in the audience and on the panel. I wonder what you're going to do to fix this problem so that you don't get it with every subsequent bill. Thank you. Yeah, thanks a lot for all these questions. I mean, maybe first this one on who we consulted. So in the run-up to any Commission proposal, there is a public consultation which runs for three months. It's a period of three months where we ask a set of questions to the public and everyone can respond to these questions, including you. I mean, maybe you didn't know it's possible. I understand that. Of course, we know very well that with these public consultations, the big players, commercial players, are paying more attention to this because they have the resources to do so. And that's clear to us. But that's also, of course, one of the reasons why I'm here, right? Because I know that for this community, maybe the voice is not heard as much as the voice of other communities. And as I said, we can, of course, keep engaging. You can come later. I can give you a card and we can discuss. We can even do maybe another small event and discuss further, right? I mean, it's all possible. On this concept of commercial, so first, I mean, I think it would be very difficult for us to replace it with another concept, because it's well grounded in this New Legislative Framework.
And all the industry players that we've been speaking to so far, they all say they want us to use the New Legislative Framework because they are familiar with it from other product types, right? And we want to ensure that all stakeholders can use the same framework. If we now invent a different framework, it will lead to a lot of fragmentation. It will make compliance more difficult for companies. So that would be difficult. Of course, I mean, there's this crucial question of where to draw the line of commercial. I think we are open to all your points. We understand them. I think what would, from our perspective, be very difficult, I mean, this is how I understood your intervention, would be to say we just exclude all open source because it's all somewhat commercial and it all needs to be excluded. I think this would not work, simply because it would create an incentive for proprietary manufacturers to swiftly switch over to open source, right? Okay. Not everyone can do that. Ah, you want that? Okay. Okay, I see. No, but it would create an incentive to get out of the scope and to not have to comply, right? I don't think this is what you want. Then there was also a point from you on all the many small players and how it would be maybe difficult for them to comply. Now, so the Cyber Resilience Act is, I mean, the requirements in there are risk-based. That means that, I mean, when you're a very small player, there is also a chance that maybe your project faces smaller risks and is maybe less critical. And you can take that into account when you comply with the requirements of the regulation. On top of that, for the vast majority of products, we only require self-attestation by the manufacturers, which means you will not have to reach out to any third parties or interact with any authorities. You can do it all by yourself. And this should be absolutely feasible. What other questions did we have?
Yeah, there was this question of how the CRA would improve the ecosystem for components that are vulnerable, such as Log4j, where you said, I mean, this is a non-commercial component. So, I mean, we have this due diligence obligation in our proposal. That means that in the future, when a commercial manufacturer integrates a product, for instance, a logging utility that is available free of charge, they will have to make a bigger effort than today in looking at these components. I mean, that, again, of course, depends on the risk. I mean, if your project is low risk, you will maybe only check the changelog to see if there are regular security updates or not, and you're done with this assessment. But of course, if you integrate this component into something much more critical, you will have to take a much closer look at these components. So what we're trying to do is we're trying to make manufacturers of commercial products, proprietary or open source or whatever, assume more responsibility for the open source components that they're integrating. And we believe that this will help raise the level of security of these components as well. Yeah, there was this question on whether charities would be considered as a commercial activity. So I cannot just give you a straight answer on that. I think it always depends on the individual case, right? I mean, what we've learned is that there are so many different types of open source projects, so many different types of funding models. So you really need to look at each and every case to see if it's a commercial activity or not. As I already said, I mean, we, as the Commission, are absolutely willing to provide more clarity. This could be done, for instance, already in the legislative process, in that we single out important examples and we provide clarity directly in the legal text.
But for all these other niche cases that will come up, this will not be possible in the text because, you know, a legal text needs to be short and concise and we cannot take into account every single example. But we're absolutely prepared to provide examples once it has come into force, so that you have the certainty that you need to know whether you are in the scope or not with your project. I think I took them all. Yeah, I think that's because, so, I mean, I'm going to speak for the PLD, but for my proposal, we had expert group consultations in 2018-2021, and consultation also through a contractor, where different kinds of stakeholders were questioned. We had the consultation when the proposal was adopted by the Commission in September 2022, with the Have Your Say portal. This is where you find all the proposals and where you can give your comments on them. And it was for more than three months for ours. So, we always try, obviously, to reach as many people as possible. The thing is, obviously, there are sometimes channels. It's not that we avoid having talks with the people that can criticize the legislation. We need critics to also make the legislation better. This is how it works. So, obviously, as you said, yeah, there might be some players that are more vocal. This is the reality. I think it's also part of how the field or the ecosystem organizes itself. You need to have one speaker, someone that can also represent the community, because obviously, if we have someone that comes from the open source community and says that he is speaking for the community, then he would bind everyone. So, you know, those companies that have those trade associations, et cetera, this also gives a framework for us to really say, okay, this is the voice of this sector, this sector, this sector.
When you do not have this, it's a bit complicated for us to take it as being the voice of the entire community, because you can have an opinion and someone else can have another one, but then which one do I have to consider as being that of this community? That's a bit of the complexity also for us, just to be honest. On the charity one, there are definitions. Commercial activity is whatever is supplied, whether against money or for free. The economic value does not matter in that sense. It's a concept that is much broader than this. So, I understand that there is this need for clarification. This is where the definitions are, but they are also the basics of any legislation that we have in the Union. It's not that simple to change, in one single piece of legislation, just one definition that is common to all of them. This is also the reality of how the proposal works. You need coherence. You can have some exclusions, but we talk about why; I mean, it's not just the need to exclude open source software. It's also the need to ensure that it does not have an impact on innovation, but you also need to take into account the reality of it. So, that's a bit of the... Thanks, Omar. Benjamin. Yeah, sorry. I just want to be thorough. I forgot to answer one of the questions, I just realized. So, there was a question on whether there could be a risk that products developed in third countries would simply not be made available anymore on the European Union market. So, no, we don't see this risk, to be honest. I mean, the European Union, depending on the figures that you look at, is probably the biggest economy in the world. I think everyone has a very strong interest in bringing their products to our market. It has also not been the case with other regulation. If you think of the GDPR, I mean, the list of companies that chose not to provide services in the European Union because of GDPR is very short. So, we don't see this risk. Thanks, Benjamin.
And Martin, you're dying to say something. So, how about I go to you? Thank you, James. So, I will answer Alexander's question of how to fix this, but first, I would like to respond to some of the comments made. So, I think you hit the nail on the head by saying that the consultation is complex within the existing way Brussels operates; at least that's my perspective now as a newcomer here. And I think if we are not addressing that bit, like how to talk to communities like this, then we will be in this situation again and again and again. And I want to point out specifically that, in the discussion about commercial just now, I've read both your proposals, and you are not using the same examples of what constitutes a commercial activity. So, if you're saying that they originated from the legal jurisprudence, then one of you seems to have a different view on the jurisprudence than the other. And I was wondering, have you considered talking to the open source program office when considering the definition? And I'm sorry if this is a bit confrontational, but I was just actually curious. So, before you answer, I am going to try to answer Alexander's question. I think he was asking how to fix this. I'm not jealous of your jobs, because I think this is extremely complicated, and I don't have the full answers. I have one sub-answer, and it relates to critical products. I believe that there should be an escape hatch for class one and class two critical products: if they are completely open source, then there should be the ability to do self-assessment, regardless of any other factors. Because I think the value of having critical products that are open source is tremendous in terms of security, too. So, we should be considering that aspect in addition to whether or not there is an EU standard that it can conform to, for example.
I hope I touched somewhat on the ask. Ask again in maybe a couple of weeks, because I think they are in a hurry, so we should be, too. And then maybe back to the question I just asked about talking to Gijs. The power of the moderator. What's it worth? Cold beer? Maybe just to add one question to make your job a bit harder. Well, first, I think that we also need to recognize that you guys are also here on a Saturday and everything, so I think the appetite to interact with many groups is also there, and I think that's important to acknowledge as well. But I'm mostly wondering about the part on technical support, because we are focusing a lot on the definition of commercial, but the text itself also says that, especially when technical support is provided, the open source software is not excluded anymore. And I guess this is more of a general question, because I'm not myself an open source developer, but I'm just wondering to what extent this is demotivating for developers, because I understand that a big part of the maintenance afterwards, for example, is a motivating factor for people to develop, because then they are also involved later on in the maintenance. And I'm wondering also to what extent this is counterproductive for the objectives that the CRA sets: having the people that are involved in the development also involved in the later stages of making sure that the product remains secure, I would assume that actually helps the security of the product. So it's more of a question on whether you have discussed this part. I'm sure you have if you put it there, but just if you could elaborate a bit further, and an open question also to the audience, just because I think it's important for the audience to sort of educate us as well. Thank you. Yes, okay, we've got five minutes left. So Rom and Omar, and then we close. Yeah, okay. So I'm just going to continue what you just said.
I think one of the tremendous beauties of the open source community is giving support to each other. And I think technical support as defined is very dangerous. And I'm going to give an example. If you maintain a project and you use one of these services like GitHub and you start helping someone to make the solution work, that's technical support, right? So you open an issue, you start helping the person, and so on. If the person who receives the help decides to sponsor the developers somehow, giving contributions in terms of money, then, from the definition, it becomes a commercial activity. This is not the incentive. It's a donation. So I think it's really important to get that into perspective, so we are not reducing the level of contribution, both in terms of software development and documentation, but also money to run some testing infrastructure that is required to do security. So that's my point. That's it. Wonderful. There's applause. There we are. Good. So Omar, I'm going to hand it over to you. I know you wanted to answer a point from Martin. So on the internal workings of how the Commission works, proposals don't come out of one single service. So we are three different directorates-general: DG CONNECT for telecommunications, DG GROW for industry, and DIGIT. Each proposal, before going out, is assessed by all the dedicated services of the Commission. So everyone is consulted and everyone gets the legislation adapted where needed, because obviously the expertise of each one of the directorates-general helps to make the proposal. So it's an internal mandatory step. And on the definition, it's not that we have different concepts. It's how it has been interpreted until now, and we're talking about almost 40 years of case law. It's elements of what the commercial activity can be and cannot be. It's never a clear-cut situation. It never draws a specific line.
I know it does not answer your question, but we have elements of what is or is not a commercial activity, and all of them are taken into account in the end to establish the commercial activity, but it's not just that if you are in room A, it's a commercial activity, and in room B, it's not a commercial activity. It's different elements that can constitute the commercial activity or not. That's a bit of the situation. If you are developing your open source software from your couch on a Sunday, this is not covered. If a big company takes it from a website, whatever, you are not carrying out the commercial activity, but the company is carrying out the commercial activity. So you are not, you know, these are a bit of the differences. These are the elements, but if you make money out of it yourself, this is a commercial activity in one way or another. If you make money yourself, you are the commercial activity. If a big company takes the software and puts it in its product, it is doing the commercial activity, but not you. So it will be the one covered, not you. This is a bit of how the commercial activity works. Yeah, I know it's complex, but yeah. Yeah. We have two minutes left. So 60 seconds. Yeah. I mean, on this technical support, I mean, it's of course a tricky one, right? The idea behind the Commission proposal for the Cyber Resilience Act is to say that when you're making money with a product, it's a commercial activity. And then you should comply with the rules, because there are so many different ways in which you can monetize a project. It's not always just charging a price. Yeah. It could also be placing ads in your mobile app or, I mean, charging a lot of money for technical support. I mean, technical support does not necessarily mean that it's a non-commercial activity. So of course it's different if you just recover your costs. Yeah. That's maybe something else.
But if you're truly making money with a project, we are convinced that it's only fair that you are covered by the Cyber Resilience Act. So that end device compatibility check, compliance check, I think is important. And I think we've run out of time. We could have gone into the whole notion of components, which you mentioned on 11.7, and where we see that posing some workability issues. Unfortunately, we've run out of time. But I would like to ask you all to show your |
The ELISA Project - Enabling Linux in Safety Applications
Projects insights and overview |
Okay. Here's the next talk. Please be silent. Okay. This will be an interesting thing. Normally I'm used to moving my arms a lot while I'm talking, so I'll try to keep the microphone close to my body now. I will give you some information about the ELISA project. ELISA stands for Enabling Linux in Safety Applications. And maybe a quick question up front. Who is aware of safety critical software? So briefly raise your hand. Hi. That's good. Maybe 25, 30%. I hope you will also learn something new then. So maybe before we start fully, I'll just give you a short view of the project context in which I'm working. So as you can see, my project is mainly focusing on embedded IoT Linux at Bosch. And what we try to do is utilize a lot of open source projects, see how they fit into a landscape, and how they can be of value for very, very different device classes, because, you may not believe it, but in all of these kinds of products you will find Linux in there, or also an embedded real-time OS and so on. So that's all about this part; now briefly about myself. Who am I? I'm a technical business development manager. I'm focusing on embedded open source, mainly doing this for Bosch. And in parallel, that's also why I'm speaking here, I'm the technical steering committee chair and working group lead for the Linux Foundation. I bring a past history of 15 years plus. I guess I started with Ubuntu 6.10, more or less, setting it up on old PCs and sharing it with exchange students, so like a distributed hub of PCs. And for 10 years now, I've been more or less in the automotive space with Linux. We had our first product out with a 2.6 kernel. And I guess now we can start on the real things. So if we talk about Linux in safety critical systems, we first need to get an understanding of what the system really means. And for a critical system, maybe first you say: assessing whether the system is safe requires understanding the system sufficiently.
And you can see there's nothing about Linux in there, because the system always goes beyond the scope of a pure operating system, beyond maybe a single component. And in this one, you have a system context in which Linux plays a role. And you need to understand the system context and how Linux is used in it. Because if you don't get an understanding of how Linux operates, you cannot see which components you're interested in, and which features you may need or not. And then you can evaluate which of these features are really relevant for safety. And while you're doing so, you'll most likely identify gaps that exist, and you will definitely need more and more work to get this done. So if you look into the Linux ecosystem which we have already, there's a good reason to take Linux, because there is a large variety of devices. The ecosystem is strong. You have good tools around this. An incredible amount of hardware support. It runs on many, many devices. And also very important, you have a broad set of experts in there. If you see what's sometimes taken as the benefit of a certified safety critical OS, often it comes with hard real-time requirements and capabilities. We know that the PREEMPT_RT patches are in good shape in the kernel. But, well, hard real-time maybe goes even further down the road. And then there is a development process. And if you see these two sides, if you come and want to address very complex products, like in the automotive field, or maybe you can even call your robot vacuum cleaner a more complex product, then you come from two perspectives. On the one side, you could go with a traditional, small, component-driven RTOS, and you have to handle all the complexity. So you need to have more hardware involved. You have more multi-core support. Suddenly, not everything works out there.
Or you go the other way around and you come with Linux, where you have all these kinds of things, but you need to improve: what do we do about the development process? What do we do about the real-time capabilities, and so on? So either way, when you build a more complex product, you need to find a way to tackle these challenges and bring the two sides closer to each other. Now, looking at Linux, I'll take this part at the beginning; it's a little bit like a disclaimer, with a little more text. In the ELISA collaboration, we said we cannot engineer your system to be safe. We're talking about functional safety, not about cybersecurity. But take security as an example: there is always a strong risk that you have security breaches in your system. It's similar with safety. If you build a system, it is still your responsibility. Just because we provide certain guidelines or engineering principles and so on, it remains your responsibility, as the one producing a product, to make things safe, and to make sure you really have the described processes in use and use the methodologies. One of the core questions which typically comes is: oh, so you're from ELISA, you make a safe Linux, will you certify a kernel version? That's not what will work, because we all know you have to move forward. There's continuous improvement, there are vulnerabilities being fixed, so you need to go on. And this gives an additional challenge of continuous certification. So we will definitely not have a certified version, and we will not certify Linux in this project. We just give the tools and other elements. And the last part of it: there are still responsibility, legal implications, liability and so on, which are also in there. Nevertheless, we have already found a good set of partners who are willing to support this mission. They subscribe to it and say, we would like to bring the whole thing forward.
And seeing this, there is the mission statement which we have drawn up. It's lengthy, basically. You can read that there is a set of elements, processes, and tools. It should be amenable to safety certification. We look into software and documentation development, and in the end we aid the development, deployment, operation, or adoption of a project into another project. Okay. If you look at this mission, you see basically four key parts, which we will also talk about later. You have elements and software, which is the concrete implementation of what we're doing. You also have the processes: a development process always plays into safety-critical systems, into security systems, wherever you look. If you start to automate things, if you would like to analyze, there's always a strong involvement of tools. And the last thing is, when you do all this kind of work, you need to document it. Actually, there's a lot of documentation work needed everywhere. So how will we do all of this? We do it in our ELISA working groups. We split them by topic and context. They grow on demand; once a certain size is reached, we extend them. If we take a first look, we have a Safety Architecture work group. This is a group which actively looks inside the kernel and takes, for example, the watchdog subsystem, because the watchdog is one of the crucial elements we have in use. It looks at what potentially safety-related functionality is. Is there something in the kernel which is non-safety-related? How would these things interfere? By this, the Safety Architecture work group does a lot of analysis, tries to improve documentation in the kernel, and provides new tools. So that's a strong set, basically driven by use cases and product demands. A somewhat broader approach is brought in by the Linux Features group. Actually, the full name is Linux Features for Safety-Critical Systems.
So it's not about generic features. It's about the safety-criticality part. You can imagine this a little bit like security measures such as namespaces, if you're familiar with those: we're looking for elements here which could improve safety. That means you take a special kernel configuration or a feature, turn it on or off, whatever you do, and say, okay, this will come up as a blueprint: this is how you better work with memory, how you should not work with memory. All these kinds of things are tackled in Linux Features. And it's a nice group, because with its results, if you're already in the process of enhancing Linux and don't want to wait for all the results of the use case work groups and so on, you can take incremental steps: just take some part of it and make your system more robust, more dependable. You can also judge how it compares to the security things you're doing. So that's the big value of this group. It's more for direct use while serving a long-term safety argumentation, but it's not something which develops for years. It basically assesses what's there. As the improvement of code quality is also very important, we have the Tool Investigation and Code Improvement work group. The code improvements could, for example, be done by doing fuzz testing on the kernel and using tools like CodeChecker or syzkaller, and then bringing them into a setup where we have a server, a kind of CI, which runs on linux-next or whatever kernel configuration, to identify issues, to get the kernel more robust, more dependable, more reliable, and to serve the argumentation about the quality of the kernel. What was also on the right side, in the challenges part, was the engineering process. As you know, there are rigorous methods within kernel development. So there are a lot of reviews.
Patches are rejected, and you see that there are strong demands from traditional project management when it comes to safety products, and not every process complies with them directly. So we need to find an argumentation: how is the open source development process equivalent to what, for example, ISO 26262 requests for automotive products? On top, what is very interesting to understand here is that in open source you basically cannot easily buy a maintainer or developer. You cannot buy features directly, so you get a more unbiased view, or maybe a personal view, but from a maintainer who is really committed to the component, to this power subsystem of the kernel, and so on. With this strong commitment you already fulfill a little bit of the independent view, because in safety systems, later on, the developer needs to commit to what has been done. Of course, it's not written down. But the maintainer fully commits to whatever he does. So this is one part, for example, where you can start arguing. And as the different elements need to get somewhere and need to be visible, and we figured this out because we were running quite in parallel with different streams but never brought them together, we came up with the Systems work group. The Systems work group should take all these different elements, bring them together, work cross-functionally and maybe even cross-project, and combine the elements. In order to tailor the system properly, we have vertical use cases. One is newly created, so there's not much information in this presentation about the Aerospace work group yet. The overall idea is that it should address everything which flies, and you know that in aerospace there are many safety standards, safety integrity standards, with various levels in there.
What you may not know, and that's at least what we have heard so far, is that Linux is already in use there, also in certified products, but only on a very low safety level, not on the upper levels of safety certification. An obvious thing, if you look at the members, is that 50 to 60 percent are from the automotive field, and therefore we have an automotive use case. If you drive a car, or have a scooter or whatever, you sometimes see an oil pressure sign, an oil temperature sign, check engine, whatever; basically, when you turn on the ignition, you can see all these little LEDs. This is the use case we are using in the Automotive work group. Basically, the instrument cluster, the speedometer, everything becomes digital. Everyone has a display in their car, and that gives a good chance, because there is a more complex system in there with a lot of graphics rendering involved, and this is actually a safety-critical function. Whether you are in drive or in reverse gear, this has to be properly displayed, and it has a safety criticality assigned. Showing the check engine sign is safety-critical as well. The third group we have is Medical Devices, and this comes from a completely different perspective. While automotive has the commercial element in mind, maybe wanting cost savings, here the topic is driven forward with OpenAPS; APS stands for artificial pancreas system. It's driven by open source: there were open standards, there were chances to interact with your insulin pump, and you see that this can become very uncomfortable. There's a nice TED talk from Dana M. Lewis; I recommend it. I also put the link in the slide deck, so you can download the slides and check it.
You can see that you basically need to track your glucose level and give a certain dose of insulin depending on it, with warnings and so on. It is basically event-triggered: you see the blood sugar level go up, so you set the dose, and there is a certain delay until it reacts. What came in here was to add a Raspberry Pi in the middle, write some scripting around it, get it stabilized, and create a product out of it. Why I want to stress this: there was no IEC or ISO certification done. It was an open source engineer who started this project, and if you download it and use it, you use it at your own risk. Therefore this was also the first use case that we in ELISA put directly at the beginning of the workshops, to say: let's take a deeper look, let's analyze what's in there. It's running for thousands of people, it has never been certified, they are very happy and see that it increases their quality of life, but it's a safety-critical product that is not certified. We are not targeting the direct certification of it in the first place; we are looking into the different levels of the analysis: what is involved, what workloads are in there, is there something which could make this fail, is there a risk, what could potentially go wrong? And this basically completes the use cases. I've drawn this together: as you can see, there is an inner part which is common to almost all the different projects and which gets fed by the use cases. They say, this is how you need to configure, how you need to specialize, because you cannot create a full safety-critical item completely out of context. You cannot have a generic safety argumentation; you always need to judge it against an assumed context. And this then turns into the deliverables.
Another view here: you can see an exemplary system architecture, mainly as we drew it up in the Systems work group. It's not only Linux that is involved in these latest products. Of course, the OpenAPS system in medical devices is a pure Raspberry Pi; there is no RTOS directly involved, unless you treat the sensor or the insulin pump as the RTOS next to it. But if you come to more complex products, you always need to face that there are RTOSes involved, there are microcontrollers and microprocessors, container technology comes into the picture, everybody talks about containers in embedded these days, and also virtualization technologies, be it Xen or KVM. So this gets in there easily. On the working group side, Linux Features, Architecture, and Code Improvement go directly into the Linux work, so their main outcome is for the Linux ecosystem, the Linux kernel, and a lot of this work is not directly related to the hypervisor or the RTOS. But there are things which go a little further, like the tools and the engineering process; what comes out there may also have good value for other projects you build on. If you have Yocto involved, you can also build Xen and Zephyr with a meta layer, and then it may be good to have this tooling part in there. Code improvements can come into the picture as well: certain tools which we make use of in our CI for testing, the openQA system or others, are elements to be considered here. And lastly, for completeness, the use cases basically tailor this system down to whatever you need. For example, in the Automotive work group we currently tailor the system down to get a better understanding of the Linux kernel, and we leave out the container, the virtualization, and the others, but we know that once we have solved some parts of our work, we need to
get the system context, and the system context involves all these kinds of things, right? Having said this, we also do a certain outreach to other projects. So I put in the Zephyr community; we have Automotive Grade Linux, which is already in there; there could be other Linux distributions; and there is also strong involvement of the Yocto Project. I didn't know where to put SPDX on this picture, so we'll see it later on. How we interact so far: we are already in discussions with Zephyr and Xen. We have weekly meetings where Xen members show up and where Zephyr is present with a representative, and we saw that these are safety-critical open source projects too, so they basically share the same burden. They need to show how the development process is done, how certain quality levels are guaranteed, where the testing is done, where the requirements management is, and the traceability to everything. So this fits in quite well. If we take this architecture, and as I'm coming from the automotive side, there are different projects which share this sort of architecture: there is a large group in the Eclipse SDV project; there is the SOAFEE initiative from Arm, which basically has similar members to SDV; and then we have Automotive Grade Linux, which is also so nice to provide us with the reference implementation for the automotive use case. So they share very similar architectures.
Lastly, not directly related to safety, but with safety considerations and being part of the system, there is the Yocto Project for the build tooling part, to get this reproducible into a CI. Here, for example, SBOM generation suddenly comes into play, which you can do with the Yocto Project. While we were discussing, we figured out that there is also data needed for a system SBOM, and for this we reached out to SPDX. There is actually an SPDX special interest group on FuSa meeting weekly to extend this scope; I guess there is also a talk later on where parts of it get presented. Why do we do all this? I like this statement from George Bernard Shaw. He said: if I have an apple and you have an apple, and we exchange the apples, we each still have one apple. But if I have an idea and you have an idea, and we exchange these ideas, we each have two ideas. That's basically what it is about: we need to get a good understanding, and we need to bring things together. For this, of course, we need to look into certain activities, so now we come to what the different work groups do. If we check, for example, the elements, process, tools, and documentation, not every work group acts in the same amount as the others. I just put some bubbles in here to show where our work mainly goes. We have a lot of things, of course, on the software part; people are interested in the Linux kernel. The process part is maybe not so strong, because it needs to be centralized, and the usage of this process goes into the other work groups. So OSEP, the medical part, and Architecture a little bit as well work on these processes and bring them into the other work groups. Tools seem to pop up in multiple work groups, because tools are handy: a tool pops up, we bring it into a repo, you tell people about it, it gets used. And if we want to go into continuous certification at some point in time, there will be a need for a lot of tool
support in there. And basically every work group does documentation. I want to give you some examples of this. From the process perspective, there is System-Theoretic Process Analysis; that's the first topic I will tell you a little more about. It's the dry stuff about the systems architecture, not the code level. But we figured out that when you do this kind of STPA analysis, at some point you reach a level where you need to understand more about the kernel, so I'll tell you a little bit about the workload tracing we have done. Also, with support from another work group, we have a call tree tool that is self-written: not only utilizing and improving existing tools, but also writing something from scratch. All of this later fits into meta-elisa, which is basically the Yocto layer for the automotive use case, enhancing the Automotive Grade Linux demo. We also did things without modification, like the CodeChecker and syzkaller setups; I will not say much about them, they are just further examples of the work. And all our information is public. It is quite spread out: there's a GitHub part, some parts are on Google Drive, we do regular blog posts, and we have some white papers published. It always depends on whom you want to have as audience or readers. There's also a YouTube channel, but I don't count this as documentation. Okay. At first, we're looking at STPA. STPA stands for System-Theoretic Process Analysis. What's interesting to see: if you're coming from safety criticality, maybe automotive, you know hazard analysis, risk assessment, FMEAs; you may have grown up with large spreadsheets, drawing cases, checking your API interfaces, and all these kinds of things. The nice thing about STPA is that you take a more graphical approach, like on the left part of the picture.
Some basics here: it's still relatively new. I say this because the older analysis techniques come from the microcontroller world, going back to the 60s, 70s; I guess the 70s, more or less. So there was a long time in which a lot of these analysis techniques came in, and they haven't been improved much, but the systems being analyzed have increased in complexity, and this needs to be considered. System-Theoretic Process Analysis, STPA, is able to handle very complex systems. The reason is that you can start from a quite broad view. Maybe you don't know all the elements: for one thing you just have a name and don't know what it really looks like, and for another block you have more details. You can connect all these different blocks, and the analysis will still hold even if you don't yet know the whole of some specific part. Then you go in a very iterative approach, step by step: you figure something out, you go one level down, deeper into the system, and you find that your assumption didn't hold true. What's also good: certain other analyses basically look at the API level, at definitions or so, but this one explicitly looks at the system context, and it includes human interaction, human operation, which is not there in the other approaches. In parallel, while you do the analysis, you already improve your documentation and get a good understanding of the system, and, if you are in a QA department, you can even integrate it properly with existing model-based systems approaches. The principles, at a very, very high level, are quite easy. There are four key elements: there's the controller on top, which sends a control action to a controlled process, and the controlled process typically provides feedback.
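As a rough illustration of the four elements just named (controller, control action, controlled process, feedback), a control loop can be written down as a small data structure. This is a minimal Python sketch, not ELISA tooling: `ControlLoop`, `candidate_unsafe_control_actions`, and the OpenAPS-flavoured example values are hypothetical names chosen for illustration, and the four guide phrases are the ones commonly used in STPA practice, not something shown on the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ControlLoop:
    """One STPA control loop: a controller acting on a controlled process."""
    controller: str
    controlled_process: str
    control_actions: list[str] = field(default_factory=list)
    feedback: list[str] = field(default_factory=list)


# Guide phrases commonly used in STPA to prompt for unsafe control actions.
# They come from general STPA practice, not from this talk.
UCA_GUIDE_PHRASES = (
    "is not provided when needed",
    "is provided when it causes a hazard",
    "is provided too early, too late, or out of order",
    "is stopped too soon or applied too long",
)


def candidate_unsafe_control_actions(loop: ControlLoop) -> list[str]:
    """Combine every control action with every guide phrase.

    The result is only a checklist for analysts to review; deciding which
    candidates are actually hazardous needs the system context.
    """
    return [
        f"{loop.controller} -> {loop.controlled_process}: '{action}' {phrase}"
        for action in loop.control_actions
        for phrase in UCA_GUIDE_PHRASES
    ]


# Hypothetical example values, loosely modelled on the OpenAPS use case.
example = ControlLoop(
    controller="dosing algorithm",
    controlled_process="insulin pump",
    control_actions=["set dose"],
    feedback=["glucose reading"],
)
for candidate in candidate_unsafe_control_actions(example):
    print(candidate)
```

The point of keeping the model this small is that blocks can stay opaque: a loop can be written down with nothing more than a name for the controlled process and refined later, which matches the iterative, level-by-level approach described here.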
Well, that's not all: it's also important to know that the controlled process may itself control something else, so that's how things grow. And the question in the end is: what could go wrong, what are unsafe control actions? You can use this methodology for understanding how water flows through the pipes in a building, or how people walk through certain spaces; you can attach it to whatever use case you like, it's always the same approach. But the main idea of it was for safety criticality, for risk assessments, and that's why we say, let's look at unsafe control actions. A little warning: the next slide is one you will not be able to read. It's the level-one analysis of this OpenAPS use case, and, well, yeah, that's how it looks. In the middle there's the OpenAPS system. You have a view from the top level; it's a developer view, not the full user view. You have infrastructure people, you have the algorithm developer, you release the software, then it comes to the human operator who uses the software and installs it, and this goes into the system. We don't know yet what the system is; this is what I meant with the very first level. You don't care whether it's a Linux system or whatever is underneath; this is my OpenAPS system. When you have understood what your critical part is and what the system context looks like, you may go to the next level. Now we zoom into this OpenAPS system and go to the next level. Here you see there is actually a Raspberry Pi involved, which we know from the hardware part, and the OS on it is Raspbian. You have the OpenAPS toolkit involved, the actual algorithm, which may control the insulin pump. The Nightscout part is an external component. You see all these kinds of things. The work group was on this level for some time, and then tried to write down the next level, going deeper, and then actually
needed support. That's where workload tracing came into the picture. We used the mentorship program here and had support, so someone fully concentrated on the workload tracing activity. That's another little table, which you can at least read. The main things to know: we use strace and cscope as the main tools for the analysis. There are stressors in there, like stress-ng, paxtest, and others; this may depend on the workload you use and on how you want to challenge the system. The information coming out of this is: what are the system calls, how frequently do they occur, and which subsystem do they belong to, so that you know where your critical parts are and where the system call entry point is. With this you can dive deeper into the different subsystems, and this causes a lot of refinement in the upper layers again, because now you iterate and see that maybe you had a wrong assumption. Before, everything seemed correct as you understood it; now you just improve it. Related to these calls is the call tree tool. That's something basically newly written, our own part. The idea was: here is a system call, what else is called from there, what are the ways to interact, how can we visualize things? Because if you just go through the code, you cannot really grasp the complexity. This was just the first shot, so here too it's not worth reading, but you can see there's a file system part. The very interesting part is that this is quite a static thing, so you see all the potential options. In the previous view, with the workload tracing, you basically see where the path has gone, but you don't directly uncover the untraced paths. Here you see all the paths, but there's the chance that you look at something completely irrelevant, because your workload never goes there. So this is a complementing element, and well you
get good insights into the kernel construction, and it can help you to analyze more workloads. Right. We bring all these things together in the meta-elisa instrument cluster. It looks like the AGL instrument cluster; we saw this picture before. I highlighted the change we made: we wrote "danger" in there, and this made the whole thing safe. Which, well, is of course not the full story. The full story is that we just needed a use case with safety relevance that we could analyze, and it was a good Qt-based demo, so we could make use of it. It was running on QEMU. QEMU has a few drawbacks here; I'll come to this very soon. But with this you can start analysis, trace workloads, and also add a watchdog mechanism. The watchdog is the next part. Basically, what we use in a lot of concepts is an external watchdog. Even if you don't see it directly in the OpenAPS system, for example, there's still external monitoring involved which gives emergency data if the Raspberry Pi does something wrong in one direction or the other. Not that it happens, but there is a monitor there which checks, gives a beep or so, and informs the user. You do something similar in the automotive case, where you have this telltale environment and want to have something which is tracked in your workload. So, yeah, this challenge-response watchdog: challenge-response basically means it's not simply looking for a sign of life, but it gives a little challenge to the workload while the workload processes other parts, and it gets a response, so that you know, okay, that's really alive and not just replying. The demand here comes from the fact that, for a lot of use cases, we cannot fully guarantee that the workload is done in the proper time and that the process doesn't hang, and checking this with an external watchdog takes a lot of responsibility off you. So it's mainly looking at the safety-critical workload. I know there are ideas to say well
let's put this watchdog thing in and let's watch everything. This typically doesn't work out. You really concentrate on the things where you say, this is safety-critical, and all the other parts are related to user experience. So if your rendering engine gets laggy and you see a lot of delay on the touch screen or whatever, that's nothing you want to experience from a user perspective, but as long as the warning signs come in time and properly, from a safety perspective this is all fine. So it's good to split here: what is the intended functionality, what is the safety criticality of it, what do I need to monitor and what not. And this is just the safety net. I said this is used widely in automotive; other industries, too, basically always have a safety net somewhere around which monitors things. What we try to do is to give more responsibility to Linux, and with this you can start with a lot of elements in the safety-critical part. So that's the main thing here. And the last message is very important to me: you don't design around the watchdog being there. You create your system such that you never need to trigger the watchdog, because you don't want that. This is your system functionality, and it has to work, and in the best case it never gets triggered into a safe state. For the telltale use case, for example, the safe state could mean that the screen is turned off or that you do a restart. Basically you would maybe show a black screen or so, so that the driver directly recognizes, oh, something is not right here. It could also be a warning message or something else. But depending on your safety process, you need to make sure that this is really triggered, so there safety criticality comes into the picture again.
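To make the challenge-response idea concrete, here is a minimal Python sketch under stated assumptions. It is not the meta-elisa watchdog: the class and function names are hypothetical, the use of an HMAC as the "little challenge" is my own choice (the talk does not say how the response is computed), and in a real design the watchdog would be an external monitor rather than an object in the same process.

```python
import hashlib
import hmac
import secrets
import time


def expected_response(key: bytes, challenge: bytes) -> bytes:
    """Answer the workload has to compute; a canned reply cannot match a fresh challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


class ChallengeResponseWatchdog:
    """Hypothetical monitor: issues challenges and enters a safe state on failure."""

    def __init__(self, key: bytes, timeout_s: float, clock=time.monotonic):
        self._key = key
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the timing logic can be tested
        self._challenge = None
        self._deadline = 0.0
        self.in_safe_state = False

    def issue_challenge(self) -> bytes:
        """Send a fresh random challenge to the safety-critical workload."""
        self._challenge = secrets.token_bytes(16)
        self._deadline = self._clock() + self._timeout_s
        return self._challenge

    def submit_response(self, response: bytes) -> bool:
        """Accept a response only if it is correct and arrives before the deadline."""
        challenge, self._challenge = self._challenge, None
        ok = (
            challenge is not None
            and self._clock() <= self._deadline
            and hmac.compare_digest(response, expected_response(self._key, challenge))
        )
        if not ok:
            self.enter_safe_state()
        return ok

    def poll(self) -> None:
        """Call periodically: an unanswered challenge past its deadline is a failure."""
        if self._challenge is not None and self._clock() > self._deadline:
            self._challenge = None
            self.enter_safe_state()

    def enter_safe_state(self) -> None:
        # Placeholder for the product-specific reaction, for example blanking
        # the screen or restarting, as described for the telltale use case.
        self.in_safe_state = True
```

In this sketch the monitored workload would call `expected_response` on each challenge as part of its normal processing loop, so a hung or late process fails the check even if some other component is still able to send a fixed "alive" message.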
I prepared a one-minute video, but I never know how these things work out in a demonstration, so I just put the YouTube link in the material. And if you are brave enough, or even if not, I guess it's a straightforward thing: we have good documentation on how to experience this demo. When we started with the ELISA work, we saw that we basically build our topics from scratch. We documented everything to the best of our understanding, and then someone came and said, well, but I'm not using Ubuntu, I'm using openSUSE Tumbleweed. We figured we need a bit more, maybe more environments set up so that people can reproduce things. So we came up with a Docker container, which basically installs the packages you need in the right versions, to make it easier for people. The next thing we observed was: okay, people do a Yocto build, and it consumes a lot of space and a lot of compilation time; maybe cached binaries would be a good option. So we also enabled the sstate cache, so that you can now build the parts which still need to be built in roughly 40 minutes on a poor laptop. It also depends on your download speed, right; it's quite an amount of download that you typically have with a Yocto build. In the long run we will also see if we can extend it to other systems, maybe a Debian version or so, but for now it's the Yocto build. The last thing we figured out: there are also use cases where you maybe want to deep dive into the system, and this is the complementing part to the demo. If you don't want to see the video and just want to try it out directly, and you have QEMU installed on your system, just download the binaries directly. They get built nightly, really nightly, so every night you get a new one. It always goes to the latest version of AGL. There were a few problems last week, but it's up and running again. It does a boot check so
that you can really experience it, and it basically uses the instructions written down in the GitHub README.md file. Right, so that's this part. Some next steps: the STPA is being continued, so we're getting into deeper levels of it. We need to get the workload tracing properly reflected in the different diagrams. This was heavily driven by Medical Devices, whereas Automotive has not used the workload tracing that much, but we are bringing it in there. The call tree tool also got extended with another tool called ks-nav, a kernel static navigation tool, to get a better analysis and a better view there. For meta-elisa, as I was talking about QEMU, and everybody wants to see real hardware: we are also on the path to bringing this onto Arm-based hardware. For now we have the x86 QEMU simulation, and the Arm work underneath is mainly driven by the Systems work group. What is also very important is the display checking. Normally we would check the rendering of a telltale, but there are so many different kinds of implementations that we mock a lot of things there, and we want to improve this so that we have proper display checks and also a lot of monitoring. That is basically it on the four topics we have seen. Additionally, we work on a system SBOM; we enabled the SBOM part for generating material in the demo. We want to improve the kernel configuration, trim down the size of the image, have the RT documentation updated, have a more complex cluster demo, and whatever else may be needed. So, summarizing what you have seen: we talked about the challenges in the beginning, basically the difference between the traditional safety-critical RTOS and the new approach, and what this collaboration can and cannot achieve. You heard about the goals and the strategy, which tools we analyze and which elements we look into, and then you could see how the different workgroups interacted how
they put into a system how we all reach to wider community parts I talk about the contributions of the different workgroups which shared with the community also in form of usable use case downloadable then you could see methodologies of our STPA workload tracing and lastly we got a little review on what's coming next and I guess we're good from the time from the questioning part it does anyone have a question says one above coming down you have a question okay thanks for the interesting talk you mentioned certification as one big problem so where can we improve things so that certification processes become more open source friendly and open source software becomes more certification friendly so what has to be done or can be done there yeah I guess some part from the city asking how can open source and certification come closer to each other from both sides right and one thing could be for example done in the documentation and improving tracing down having tools supporting how do certain features get from the mailing list into the system if there's a test around it so this gives a lot of confidence and trust in what it's doing from another perspective there's not much in the safety integrity standards which allow the usage of pre-existent software elements so for this there's also an isoparse currently which allows more usage I mean depends on the safety standard which you're in if you're some relaxed medical standards less requirements on this but for automotive it's very strong and prohibitive on this so I would say doing careful work and explaining design decisions and so on making this visible and more structured having maybe centralized bug tracking and so on this this can help a lot from this perspective it will be good for the certification authorities and we do a lot of clearance also yeah I guess if I heard you correctly said from supporting the assessments and the sororities in there we also have company support where we really are in the working groups 
and get input from certification authorities in the continuous work which we are doing, so they are directly working within the work groups as well. Yeah, a question as well? Thank you very much for your talk. I had just a quick question. I want to get a feel for what your opinion on this is. Do you think there's space, as certification for something like Linux improves... Can you move the mic a little closer? Because for me, I hear the people leaving louder, so just a little. Sorry, yeah. Oh, wow. Yes. As the processes for certification and for validation of Linux kind of improve and change over time, do you think there's ever going to be space for Linux to be used in a kind of critical component on vehicles, or do you think that space is completely reserved for something that's actually using real time? The main part which I heard was... I got the real-time part in the end. Yeah, like, do you think there is... It's already there. Fair, thank you. Okay, it wasn't there. Does anyone else have a question? You have a question? Yeah, so what is the place for Linux itself in, let's say, what's the safety integrity level of Linux itself in this model? Because if we take, let's say, ISO 26262, there's a V-model with requirements for development. But Linux already has source code; there is no coverage, no tests with all this MC/DC coverage, et cetera. So what's the place of Linux, and how do you maintain it without forking? Yeah, so you ask where the place of Linux is if you look at the V-model of, for example, ISO 26262, where things fit in there, with a lot of demands like code coverage, tracing and so on. So what you can see is that, first of all, speaking about a level, you will not directly go to an ASIL D level, which puts many more requirements on the tools, that's for sure. So you should start on the lower ASIL A or B level. That's also what we did. We relaxed some parts also for automotive cases. Let's not start with too complex
parts; maybe keep real-time criticality out of there, because then you have to review many more parts. And so the space which I see is that you should argue equivalence for certain things, that you are in close collaboration with assessors and explain how things are done, because when the ISO standard was originally prepared, it was not considering a system as complex as Linux being in use, or the large amount of pre-existing software. So from this, if you are in an assessment, if you can show credibility by requirements work and by good concepts, you may in the end come up with a system which is arguably safe but not directly certifiable to your ISO 26262 part. But this already gives you the perfect room for discussion, right? Because then you say, well, you cannot tell me this is not working, but you still say it's not certifiable. And then you also see the gap in the standard. And if you reach this point, you have a lot of good support when you go to certification authorities early, if you have internal assessments and you can judge it. And in the end it's also your responsibility, where you say, I argue for an equivalence, because it's not saying in this spec that you have to; it says recommended or highly recommended, leaving you also room for showing equivalence: for this model I'm using this, and on top I'm adding this. And by this you can get an argument, and of course by getting feedback from your developers that the work which you're doing also goes into kernel mainline and so on. So maybe it's also possible to somehow affect how ISO 26262 is developed, because it's a bit outdated in some way. Some of the members in ELISA have people in these ISO committees that are basically taking it back into that direction for the future revs of the standards.
We don't have visibility, at least I don't, because I'm not in those committees, but we do know that some of those member companies you saw up there are there, and they are advocating for things to work a little bit better in future revs. Is there anyone else who has a question? Okay, thank you for your talk. Thank you very much. |
Linux Inlaws
A how-to on world domination by accident |
Good afternoon. Are you ready? Martin, how many times did I tell you not to sedate the audience when we give a presentation? I think this was a marketing ploy, wasn't it? Let's try this again. Are you ready? Much better. Welcome to our presentation on Linux Inlaws. This is us. More like a talk than a presentation because there's only one slide. This is us. Yeah, so if you like knowledge or madness, either is good. Welcome. So just out of curiosity, how many of you listen to podcasts? Very good. Pretty much everybody, right? Wow. How many people have heard about Linux Inlaws prior to this gig? Okay. Our two listeners are here. Wait. Martin, Martin, Martin. This is all planned. Ah, yes, yes. Did you win the bet, by any chance? Because Martin now just officially won the bet. So how many people actually have listened to one of our episodes prior to the presentation? This one gets the bottom up. After the presentation, please collect your prize. Indeed, indeed. Well done, that man. Okay. On with the show. Yes. So thank you all for coming. So today is really about getting an idea of what our podcast is like, which is generally very free-flowing and well-prepared. Yes, but maybe before we go into the details, maybe we should kind of introduce ourselves more formally. Yes, yes. I'm Chris. I'm Martin. Actually, it's the other way around. This is Martin Visser. My name is Chris Zimmermann. You want to say something about you? Sure, sure. So, yes, as some of you who listen to our podcast may know, we've been doing some IT stuff for a bit. Chris is retired now, so he just does it as a hobby. But yeah, this is us. Martin has about 20 years of international arms trade, drug trafficking, and other professional activities, but no, jokes aside. My name is Chris Zimmermann. I have been using open source for the last 30 plus years. And I also run a LUG in Germany.
The LUG Frankfurt has been doing community events for 10 years, and about three years ago, but more about that in a minute, we decided to start this podcast. So, Martin, what did you do prior to the podcast? Prior to the podcast. Well, so yeah, like yourself, been in IT since the year dot, messing around with PDP-11s and all that kind of stuff. If anybody knows what that is, but yeah. So yeah, in short, we've been around for quite a while and been playing with open source software, which you will hear all about today. Exactly. And we should probably say where we actually met the first time. We can do, we can do. Probably in a bar somewhere, no? Well, we did. Just prior to that, basically, I was working at a company called Redis. Redis, if you're listening, if you're watching this on the stream, the email is sponsor at linuxinlaws.eu, just in case. And no, this is where we met. And Martin decided to join that NoSQL database company. And this is basically where we met about three years, four years ago. You're forgetting COVID, aren't you? So we were doing technical sales at the company. And now on with the history of the podcast. It is November 2019. Yes, a rainy, cold Prague. What even was the event? The event was actually sales kickoff. Ah, yes, of course, of course. Or business review. I can't remember. Who knows, you know, some corporate nonsense. Exactly. Yes. But yes, out of that, some craft beer places were visited, where we discussed all things open source between us. Indeed. Martin was choking on hot chicken wings. And so was I. And after a few beers, basically, we both discovered that yes, we were heavily into open source. And we always had this intention of doing a podcast. And then we said, why not do it together? And actually the first episode that was published in February of 2020, or rather the second episode, is aptly named FOSDEM, because there I mostly talk about FOSDEM 2020. You went there at the time. I did.
Yes. I gave a presentation in the Rust devroom on... What is it? Okay, cool. Done. Mic closer. Mic closer. Yes. Okay. No, there was, yeah, there was a devroom presentation. And then basically that was our second episode that we did. So yes, we kind of stumbled upon a format that we both like, as in we discuss topics like operating systems, Rust, programming languages, stuff like that. And in between that, we do various interviews with people that are notable in the open source community. The format is loosely modeled on something called FLOSS Weekly. I don't know if that rings a bell. TWiT.tv is the publisher. But they do weekly shows with just interviews. And the idea that we had at the time, when we originally devised the podcast, was to split the episodes between knowledge transfer, for want of a better expression, and also livening up the whole format with actual interview guests. It took us some time to come up with the exact format. It was also loosely based on, well, as the name says, some of you may have recognized Linux Outlaws. That was a podcast a while ago that had a similar kind of format, right? Almost. Who of you knows Linux Outlaws? Ah, okay. Yeah, about 15 years ago, Fab Scherschel and Dan Lynch had the idea of actually doing, I think, a weekly format talking just about current events. Something that we started with too. Yeah, we did that, didn't we, for a while. But yeah, with the delay between recording and publishing, talking about current events wasn't that relevant, because the podcast came out like three months later, right? Exactly. So, yeah, go ahead, man. Yeah, well, this kind of brings us to where we started as well. Our podcast was first on Hacker Public Radio, which you may be familiar with as well. They have a booth. They do. Well, Ken has a booth here. Yes. Ken has a booth called Free Culture Podcast. And this is what we used originally to bootstrap ourselves.
That was actually, I spoke to Ken at FOSDEM in 2020, and he said, we're more than happy to bootstrap you, in terms of hosting you, to get you initially off the ground. And that's exactly what we did before we changed this, but more on this in a minute. Yeah, so for anybody thinking of doing a podcast, it's very easy to get started. We have an episode on this as well. We do, yes, we do. So, yeah, personally, I like the interviews. I don't know about yourself, but we've had some very good guests in the past, which you may know as well, like a recent one we had, Ran Levi from... Malicious Life. Malicious Life, yes. So we had various people supporting open source, like the FSFE. Which is actually in the audience. In the audience, indeed. And we had various people on programming languages, operating systems. We, yes. For those of you who are now becoming interested, we do have a back catalog. The website is linuxinlaws.eu. For example, we had a kernel maintainer panel. We did the same with BSD, of course. Programming language panel, yes. Yeah, programming language panel and all the rest of it. So the idea is basically to get people on the podcast, and not just one, ideally, but more than one, and talk about projects, political issues, that sort of thing. Ah, political issues. Is there a particular episode you're thinking of, the one from Georgia? Yes, we had the Electronic Frontier Foundation on the podcast, but also, of course, the Free Software Foundation Europe. Yeah, it's well worth listening to that one. As is the one with Paul Ramsey, would you remember that one? Vaguely, remind me. I mean, we've done almost 80 episodes by now. Yeah, so Paul Ramsey is a PostGIS maintainer. Those of you familiar with Postgres will know PostGIS, but he's also very much a speaker on things open source.
And he kind of raises a lot of points around, you know, open source is free and available, but, you know, you get just a few people maintaining it. So there should be more funding for open source contributors in a way, but specifically in government, right? Because government uses a lot of open source software, but they're not actually contributing, and the code should be free as well, things like that. So we had a lot of discussions about this. True, yes. I think it's fair to say that we keep a, or we try to keep a well balanced middle ground between technical stuff and also political stuff. Yeah, that's a good point. Probably fair to say. But it's about raising that profile as well, right? For open source and the people that make that software is part of what we want to do. Absolutely. And this is basically why we came up with the podcast in the first place. Yep. All those years ago. Well, all three of them. So having said that, we do bi-weekly episodes, right? So it's got a busy schedule. And yeah, we alternate between a topic and an interview. Now, we should probably mention something called the dark side, Martin. The dark side. Do tell. Martin and myself, when we set out doing the podcast, we had this idea of providing a little of color to the podcast. So you've been sort of a bit of a dark color. Exactly. So who knows something called the bastard operator from hell, BUFH? Okay. Matthias, this is nothing okay for enough. No, the BUFH was, I think about a format about what, 20 years ago, maybe less, maybe more. That's where actually a system administrator talks about his trials and tribulations has put it this way. And so we took this original idea and the dark side is loosely modeled on the bastard operator from hell, but turned this into a much more comedy-oriented format that's put it this way. Yeah, for sure. It's a bit of dark humor. Let's put it that way. And yeah, these are also the only parts that are scripted in the podcast. Yes, good point. 
Good point. Everything is normally very ad hoc apart from that. Yeah, it's free flow. Normally, we do these dark sides, bits and pieces as part of a Halloween special, for example. Halloween, Christmas, Easter. Yes. Any other religious holiday you may want. And do you want to talk about Dutch's fladesa, maybe? Some teaser? Well, okay. So we have this, you're particularly talking about the Halloween one, right? Yes. Those of you who've heard this one, we have a yearly Halloween episode where various, we have a friend of ours who participates and does a very good, let's say, female vampire impression. Yes. If that's possible in the podcast. Yeah, actually, she's from Macedonia, right? And there, essentially, she's our voice model. She's our voice actress. And mostly, these Halloween episodes, of course, have a Halloween topic. So essentially, the episode that I was referring to, and again, we normally do these kind of around Halloween, that sort of thing, refers to the tribulations that a certain Transylvanian Duchess encounters when her workforce decides to unionize. It's that sort of black humor, and I won't give away too much. You have to listen to the episode. It's that sort of black humor that drives the dark side. In addition to kind of, and the people basically who... Slightly more serious, a subject in other episodes. And apart from the humor that is woven into the interviews, into the other sessions. Definitely, definitely. On that subject. Yes. Clearly, we make this podcast partly for ourselves because we have a bit of fun making it. True. But it's clearly aimed at listeners. So we have many ways for people to provide feedback. Yes. Carry on. The email address is, of course, FeedbackLinux.eu, but since we should probably talk about the hosting aspect, too. Yes. We publish normally on archive.org. So if you have an archive.org account, you can also leave a commonly feedback there. We do not do it Twitter for obvious reasons. Yes. 
No, we haven't done Twitter any time to be making of the podcast because never much of recent. We started on the on HBR for a long time, right? And only recently kind of, on other podcast media, you'd be able to find us on most of those. Yes. Of course, when we were still on Hacker Public Radio, you could leave comments on Hacker Public Radio, too. Yes. And I mean, these comments are very important to us, really. I know it's something that, yeah, more feedback we get, the more they... Exactly. I mean, yeah, the podcast is done for you listeners, not for us. Certainly not for our sponsors, because we don't have any. No. What we also do is actually, we also invite people from other podcasts on the podcast. For example, the Linux Lads, if Shane or anybody else or Conan is here, but I don't see them. Also, but we should also probably talk about the Grumpy Old Coders, right? Grumpy Old Coders, yes. Another set of podcasters that have a... Well, not a similar format, but they have... Yeah, they're more on the technical side, right? They publish less frequently. But it's again, it's some friend of ours that we occasionally have some banter with. Exactly. Yes. What else is there to talk about Linux in-laws? Well, you should listen to it. That's the main thing. Yes. Yeah. So why should you listen to the Linux in-laws, right? That's your cue. You put your spotty on. Well, why would you listen to the Linux in-laws? If you wouldn't be making it? Personally, yeah. It's mainly because we do intermix the detect stuff with a bit of fun, right? That's the main thing. And personally, I like the interview pieces generally very much, because you get so many people from different parts of the world, from different projects, and they all have clearly an open-source link, but they all have different stories to tell. And about open-source and why it's good, and they're part in it, really. Very true, yes. What about you? Well, I've been doing open-source for the last 30 plus years, so... 
This is why you run an Apple Mac. Indeed. But as we all know, underneath is a well-hidden BSD system, right? So no worries. Do we not have an upcoming episode on Tuxedo computers? We do, but I don't think they're listening, so it's not worth mentioning that as a sponsor. No, no worries. But no, because we touch on, I reckon, subjects that move people in terms of open-source is more and more used across industries, countries, planets. I mean, whoever has a mobile phone running Android, but also running iOS, is essentially running open-source software. Android is powered by Linux, as probably most of you know, and beneath the iOS is something called BSD, both widely used open-source operating systems. And I reckon with doing Linux in-laws, we are providing a platform for projects to voice themselves, to make themselves heard. These are not just Linux subjects, right? We also, as mentioned, touch other parts of open-source, whether it's software and hardware occasionally. And politics. Politics, yes. Society topics, and all the rest of it, yes. Yeah, not to mention, yeah. The showering situation in Germany is a common topic. Yeah, I mean, we normally reckon, we normally open with a little bit of humor. I normally slag off the United Kingdom, and Martin returns these jokes with dire descriptions of the situation in Germany. That's probably fair to say. Yeah, yeah, yeah. So, yes. So, yeah. If you want to, no. In short, Linux in-laws, you want to learn a little bit, you want to listen to people that have things to say about open-source, and you like a bit of fun, just come and listen to us, really. That's the... Absolutely, yeah. Well, what's our slogan, by the way? You come for the knowledge, but stay for the madness. This is something that marketing came up with before you fired them, right? Marketing? What is marketing? The department that you... Is that the people that make the t-shirts? No. So, who makes the t-shirts? 
The people with graphically designed capabilities I suppose. Fair enough, fair enough. No, we should talk about the logo, actually, no? You can talk about the logo, yes, yes. Oh, how many stickers have you brought, actually? Yes, you can find stickers, actually, at the Free Culture Podcast booth. Oh, you didn't have any photo listeners here? I do, yeah. Only about four or five. You find the rest, actually, at the booth. It's a prototype. Yes. So, yeah, very, very, very, very, very... Like the t-shirts. The first batch, yes. So, the logo, Martin, why did we need a new logo? Well, mainly because the previous logo was based on some pictures of us of 20 years ago. True, true, true. It's not that representative anymore. But, yeah, certain podcasts, let's say, publishers also didn't like the format, so... So, we had to change this. And after about 20,000 focus group sessions, quite a few... This is without marketing. Debates, exactly. Quite a few debates. Debates, due to various sessions, we came up with a new logo, which I hope you will find on most podcasting platforms also. By the way, yeah, although Ken is not in the audience, I think we should probably talk about metrics. Okay, go ahead. Metrics in terms of basically how many people... Well, metrics, metrics. We had done some metrics in the room today, so we could... About what? 150, maybe 108 people in the room. We host our episodes, our MP3s, on now, Ark of the Rock. And the download stats for an average one episode, just on that platform alone, clock in between 2,500 downloads per episode. But if you type Linux in-laws into your search engine of choice, you come up with a syndication left, right, and center. So it's probably... Or even your podcast app of choice, right? Yes. So it's probably fair to say that, on average, we are listened to by more than 8,000 people per episode. Give or take a few. Yes, unfortunately, they don't all feedback today. It's just a shame. Shall we interview someone in the room? 
Where is that listener? Where'd he go? You want to... Okay, he doesn't. Okay, fair enough. He's too shy for that. That's a shame. We should get him framed. By the way, when did you start to listen to Linux in-laws, if I may ask? Okay. Did you get that? Something about feedback. Sorry. Okay, let's talk later on this, sorry. Okay, no, we were just curious because... We get feedback at times, but the more we get, the better, obviously. Exactly, because essentially, we do the podcast for you in terms of, okay, this is our show, but we live on feedback because if you think, and that goes for anybody listening to us, if there's a subject that is of interest for you, feel free to give feedback so we can get that project on the show, talk to these politicians, whatever. And we're not kind of fixed on one particular topic, right? If you take the Ran Levy example with Manish's life, they're very much focused on security. Very true, yes. We'll do an episode on security, but we won't do a whole bit podcast on that. We'll do anything around open source. Exactly, needless to say, an open source angle would help. Yes, that would do. But at the end of the day, the Electronic Frontier Foundation is more like about civil rights. And when we did the episode with the Georgian chapter, I think one of the focus points of that interview was actually gun laws. Yes. For example. So the point that I'm making here is, yes, open source is important for us, but so are the politics and the civil rights aspects behind this whole notion, behind this whole revolution. Well, this is the part about free, isn't it, really, that we have in free and open source software that we also have talked about. Absolutely. And that's exactly the reason why we try to maintain a balance between technical and non-technical subjects. Ah, yes. And also, if you do have trouble sleeping, there is an episode on licenses, if I'm not mistaken. Yes, an open source license. But we didn't have guests on this. 
So yeah, it has many uses. We should probably talk about the road ahead, right? I mean, we've been doing this for the last three years now. Yeah. Where do you want to take this one? Apart from the usual world domination, making lots of money, how is yet to be determined? The objective is already kind of met, right? We're spreading the word and supporting the projects and the people that make open source. So we continue doing that, in my opinion. True, true, true. But maybe there's something more. Is there? Probably. Do tell. Some people have suggested, when we still have the focus groups, that more humor might be in order, for example, as in more dark side stuff. Okay. Well, we'll have to get some feedback on that, won't we? Indeed, indeed. Yes. What happened? I need to say, if our listeners have more ideas, the modern area. Indeed. Well, we have, I mean, the whole year is planned out pretty much, but there's always room for, of course, stuff for sure. Like today. And I mean, if you are maintaining a project that you, that you, as in you in the audience, that you would like to see feature on Linux in-laws, free to talk to us after this talk goes without saying. Yes. Now, this is exactly the point, right? If you have things that you want to be vocal about or talk about open source related, whether it's a project or your community efforts, come talk to us and we'll put you on the podcast. And with us, it's probably time for our interview guest. Oh, interview guest. Yes. We do have an interview guest. Right. Welcome. Welcome. Matias, why don't you join us on the stage? It's a little bit ahead of time, but more than Maria anyway. You'll get a bit of an appeal for how we do that episode. For those of you who do not know Matias Kirschner, why don't you introduce yourself? Yes. My name is Matias Kirschner. I'm the president of the Free Software Foundation Europe. That's my day job. 
And yeah, beside that, I did some volunteer activities, which is probably the topic of this interview. Absolutely. So hands up. Who does not know what the Free Software Foundation Europe does and is? So everybody knows the FSFE. You can skip that question. Fair enough. That was what she was getting at. Okay. Of course, that's also an episode from something called links and loss on the FSFE. But yes, why don't you tell us a little bit about how you came into the open source world? Put you on the spot right now. Oh, I first, the start was that I asked my father to subscribe to more newspapers so I can inform myself from more sources to see who is trying to accomplish world domination. And my father said, oh, that's too expensive. But I read something about this internet. And I will make sure that you get this, which was at that time probably more expensive and long run with the modem there. But yeah, I started with that. And then I had a second computer at home. And I tried to connect them. And I wanted to send emails to them without connecting to the internet. And it didn't work. Also, both of those computers had an email program on them. I complained at school. Someone said, I have something for you. Brought some floppies, brought some CDs. And then I installed my first GNU Linux distribution. And a few months later, I had a mail server. And from there on, then I read more about technology. I organized meetups to install parties for in the local groups, went to events. And then I read more and more about that it's also about political, social beliefs. And that's how I get interested in FSFE. Yeah, I understand the technical background, but there's clearly more to it for yourself than just the tech side with open source. I mean, at that time, I wasn't able to say, to formulate it this way. But in the end, for me, it's free software is a way to distribute power in our society. I think, I mean, democracies depend on distributing power. 
And we also need a distribution of technological power. And free software is an important tool for that, to make sure that no one entity has too much power in our society. And I think that's the main part which is driving me for the work there. Very good. Very good. Did you want to take the next one? And why did you decide to join, eventually, the Free Software Foundation Europe? So at that time, then, I mean, at this point, I had this political interest, I had this technical interest. I thought there are so many people on the internet you can ask when you have questions about technical issues. And there are so many bright people who will help you to figure it out. But about the political stuff, I was not able to figure it out myself. So I studied politics and management then. And we had to do a seven-month internship. And I asked some people, some Debian developers I knew, where could I do that? And one of the suggestions was the FSFE. And then I read about them and applied there. And after a long back and forth, because they didn't have interns before and didn't have an office at that time and didn't have money to pay, in the end I ended up there in the one-room apartment of the president at that time, on the sofa. He had a desk. And from there, I then worked and attended Free Software events, including my first FOSDEM, then in 2005. Very nice. Very nice. So there's clearly, yeah. What's next after that? Is there some other world domination plan for yourself? I think the one that is running is the topic we... I don't know what you want to say already about your next episode. Tease it away. Now is probably a good time to feature an upcoming presentation today, or reading rather? Yeah. So I will do a reading of this new book, which I wrote. It's called Ada & Zangemann: A Tale of Software, Skateboards, and Raspberry Ice Cream. It's about encouraging people to tinker with computers.
It's about software freedom in a way that is hopefully encouraging them to think on their own how they want to use technology. It will be at six o'clock. I have to check the room again, in UB2.147. And there I will do a reading of the book in 30 minutes. And afterwards, answer some questions about how to write a book under a free license and get that published by O'Reilly for the German version and No Starch Press for the English version, plus other language versions. Clearly you have an objective with the book, right? It's not just, oh, let's write a book for a bit of fun, but there is more to it than that, if I'm not mistaken. I mean, at the beginning, my main idea was how can I explain this topic to my own children? And I asked if others have some ideas about that, and if there are some existing books, and they were either for older children, or they were a bit complicated. And then I started to tell some bedtime stories myself, always a bit improvised, and then depending on the feedback, I was then just changing the stories at that time, and then in the end, this came out of it. Is this your personal kind of initiative, or is this something that the FSFE is trying to promote in terms of, you know, raising awareness amongst younger people? I mean, at the beginning, it was a personal project of mine, and I was also writing this on my own time. Then at one point, I talked to someone and told him about this plan, and he said, well, if you write a book about free software, then I will buy 1000 copies and give that to customers. And then I talked with the FSFE and then I got the confirmation that I can spend a few hundred euros to get an editor on board, so that I can get someone with experience in how to actually write a children's book.
And yeah, when we had this, then we could also hire an illustrator to do the illustrations for the book, and get the illustrator to understand Creative Commons licensing, which is very unusual for this area, and then we were also quite lucky to get two publishers on board who also understand free licensing, and who agreed to publish this as commercial publishers under CC BY-SA, ShareAlike. Excellent, excellent. So where can people find the book? So the German one, you brought some with you today. So this is the commercial part: for the German version, you can already buy that everywhere. The US version, the English version, you can at the moment just get from the US publisher, because the worldwide distribution just starts in May. So at the FSFE, we got some of the books and brought them through customs, which was a pain, and we have them at the booth, so we have a small number of those here. And yeah, otherwise, from now on already, you can order it at every bookstore, and pre-orders will be shipped in May. And tonight you're doing the first, let's say, reading of the English version, right? Yeah, that was one of my wishes, that FOSDEM will be the place where I do the first reading of the English book, and yeah, I will do that there. And for everyone else who would like to do that, I mean, you can first try it out, and one of my recommendations is: it's a really great tool to talk to children about our topics. I did several readings with the German version at schools with 30 people, 40 children, or even once in a cinema with 150 children, talking with them about that, answering the questions they have, or having some discussions about what they want to invent, how computers work, and so on. And so we have all the slides and the text and where to change slides and all kinds of other information on how you can do that in our git repository. It's all freely licensed, you can make changes to it, let's see what's coming out, you can make a sequel.
Yeah, it's, I think that's something we will also discuss in the show, probably a tool for world domination. And for those of you who cannot make it at 6 p.m., there will be a reading on something called Linux Inlaws in an upcoming episode, more than likely in April. So if you miss the reading tonight, feel free to tune in or download that particular episode. Okay. Anything else we should discuss with the president while we have him on the stage, Martin? I think we still have some minutes left. No, I think we have some minutes left. So I mean, I think the subject has been beaten to death, but anyway, I would still like to have your opinion on this. Richard M. Stallman, just two sentences on this, just to wake up the audience once again. I mean, it's history, fair enough, and it's been a while, but I think he has made a rebound, in that he's back on the board now, right, of the Free Software Foundation. I know that the FSFE issued a statement after the whole debacle, but I reckon he's back on track now, at least in the U.S., no? I mean, any views on this that you would like to share? Oh, that's very complicated. Okay. And we still have, I think, 10 minutes left. The abridged version. Yeah. Now I am cornered here on this topic. No, I think that's a really complicated discussion, and it's also a discussion where, at the moment, I prefer to have some personal conversations with other people involved in this before this is going public. So I'm sorry, but this is a bit too difficult to discuss. Very diplomatic answer. Back to much safer ground. Where do you see the Free Software Foundation Europe going?
I mean, what I see is that the topics are going more and more mainstream, and so one of the big challenges there is how can we make sure that this topic doesn't stay at FOSDEM and at some free software conferences, but that we are going to other conferences and connect with other topics which are ongoing and which are very much influenced by the topics that we are usually discussing here. So there are discussions in the sustainability area, where there's a huge dependency on whether you can really make changes to hardware and software, and on how technology is developed. There are questions about who should have control over the computers. Like, I mean, we have net neutrality. On the other hand, it's very important that we also have device neutrality, so that every one of us has the right to install or remove software on our devices, that we can change web browsers, that we can change app stores, that we can change the services which are connected to our devices. And, I mean, at the time when the movement started, it was small and we had some desktop machines and some laptops, and now we have computers everywhere, and there are so many people who have questions about this topic, so it's more and more important to translate this to a more general audience, people who don't have such knowledge about technology. The FSFE was always doing this, but that's something where we have more and more demand. It's no longer that you want to talk with people about free software and most of them say, oh, I'm not interested. It's now more the problem that there are so many people who have questions about that, and we just have a limited amount of resources. So the important thing is how can we enable more people, more volunteers, to also work on these topics. That's also why, with this book, I mean, I'm invited to do a lot of readings at schools, but I will not be able
to fulfill the demand there, so we need to enable others to also do this. And that's also true for all the other topics: device neutrality, sustainability, Public Money, Public Code, one of the campaigns we are running, where there are so many governments and public administrations that are interested and say, yes, I'm totally convinced, but how do I do that? And so that's one of the big challenges there, how do we enable that. And I mean, during the last three years, it was very difficult to motivate volunteers and to get in contact with new volunteers, that's why something like FOSDEM now is very important there. I mean, Public Money, Public Code was the first campaign where I came across the FSFE all those years back. At the moment, you're running a sustainability campaign in the area of mobile devices, if I recall correctly? That was Upcycling Android, where we help people to install other ROMs, other operating systems on their mobile devices, and also explain to people that it's important that you can install or remove software from devices and repair devices, so that's one other area. Okay, and would you go as far as associating these campaigns with the right to repair that you almost touched upon a minute ago? Yeah, I mean, the right to repair is definitely one part of this whole device neutrality area. Any future campaigns that you want to tease now? I mean, there are several things that will come up, but one thing that for me is very important at this time of the year, so it's February, and on the 14th of February, we are always running I Love Free Software Day. The idea at that time was, I thought that the flower industry is benefiting so much from this day. Why can't we have something where the free software community is benefiting from this?
Yeah, so, and I think in our community, we give each other very open feedback, very direct feedback, but sometimes we forget to say thank you to all those amazing people here at the conference who help others with their knowledge, who share their software with others. And so on the 14th of February, we always encourage everyone out there to thank other free software contributors: buy them a beverage, send them a letter or a postcard, or, I mean, just in general, say thank you to them, because in the end, I mean, all of those people, they are doing such a great job, such an important contribution to our society, which we often forget about. And so do this on the 14th of February, add it to your calendar, and the next day you can continue to write feature requests and bug reports, complain why this programming language is used and not the other, and why this approach is bad and yours is better. But yeah, don't forget to say thank you once a year, at least. Also, at least in Germany, there are a couple of events taking place. For example, one of the Frankfurt user groups, I think it's even the CCC, as in the Chaos Computer Club, has an event at the headquarters celebrating this free software day, so if you want to take part in this, check out, is it on the website? So yeah, on the website fsfe.org, in the news section, you find an entry there about I Love Free Software Day, and several local groups are also organizing events, where we are inviting free software contributors and making sure that there are some pizzas, some cake, and that there are people who can thank the others.
We have, in Berlin, also, at the c-base hackerspace, we have a DJ there, we have an event in Crees, we have an event in Frankfurt, and there are others, and if you want to organize something like that on your own, let us know, so we can also list your I Love Free Software Day event. It's an event where we hope that many of you here, many of the people visiting FOSDEM, will receive a thank you for your work, and well, if you want to organize something there, let us know. We can also help you a bit when you register there, so that we could cover, like, the pizza for this event or so. Just write us an email, and we will see that we can also financially support you when you want to organize that, because we think that every euro that is invested in such a nice event, and some nice cake there, will benefit free software a lot in the long run, because it's so motivating to receive, yeah, some appreciation for your work. Sounds a bit like the dark side, really: we have cookies, don't we? Cake in this case. So people, this has been almost a live session of something called Linux Inlaws. This is the format that we normally do every month, as in bringing a guest on stage, putting him or her on the spot, and then having a little bit of fun. If you like this, the feed address is on the screen. I think we have time for one more question from the audience. Is there any question that you would like to get answered right now, preferably from a FOSS, from a free and open-source software background? If there isn't, thank you very much for listening, I'm tempted to say. There will be a session tonight, as in you can ask us anything; you also find the address on the website, it's just a little bit of fun. At that pub, we're going to start at about 8:30. The place isn't too big, so just make sure that you are there on time in order to avoid disappointment if you can't get in.
So 8:30 at that address, you also find this on Linux Inlaws, you just click on the FOSDEM tab. Thank you very much. Thank you very much. Thank you. |
Similarity Detection in Online Integrity
Fighting abusive content with algorithms |
So, it's nice to see a nice crowd after two years of pandemic. You're beautiful. So today we're going to talk about similarity detection and how we use it in integrity, as a way to ensure that the website is a safe place, that it maintains its integrity. The outline of the presentation is as follows. We're going to outline the problem, then how we use automation and similarity detection in order to achieve what we want. Then the current technology that we use for images, which is vector search. Then we are going to discuss in depth what the actual technology is, the vector embedding that makes it possible to transform a picture into an element of search. Then the current platform offering that Meta puts at disposal to allow other people to crowdsource all of their findings into a centralized place. And last but not least, what we have that is open and free, that you can deploy on your own site to benefit from all these technological findings. So the problem is that any big platform bears the responsibility to ensure it's a safe place to surf. No matter what, and the law says so too, whatever the user posts, you are ultimately responsible for making sure that nobody is exposed to things that violate your community guidelines. Meta has almost three billion users, a large part of the world population. And although the vast majority of our users follow the rules, some fringe bad actors will always be present. And at that scale, fringe means tens of millions of bad people creating a lot of problems. And when I say problems, I mean child exploitation imagery, non-consensual intimate imagery, which is a way to say revenge porn, adult sexual exploitation, people forced to perform sexual acts in front of a camera against their will, terrorism, violence, whatever.
And just to give you a couple of numbers, Meta publishes a quarterly transparency report about what we do to ensure the platform stays safe. And in the second quarter of 2022, we took down 38 million pieces of adult sexual exploitation content. And that's just for this category. Child exploitation is not so huge, thank God, but there are also others, like violence, terrorism and so on. That accounted for 0.04% of viewed content worldwide. And in case you were asking, 97% of this content was proactively taken down, even before people could see it. The remaining 2.8% is user reports, like: I found this. And we take that down too, and we also add it to the data banks, just to make sure that we are not forgetting about it. Sometimes there are false positives, because it's just unavoidable. And half a million pieces were restored upon user appeal. We restore mostly accounts, and the pictures that were banned. It goes without saying that the sheer volume of content, the huge scale of the problem we are facing, requires both automation and human review to ensure both accuracy and consistency. There would be a problem if we had 1 million people clicking and making decisions, and what is violating for one is not for the other, and vice versa. And we also cannot just employ automation, because otherwise we would have this very powerful scythe decapitating everybody, innocent users included. So, the role of automation and similarity detection. The thing is that a lot of things that happen online are things that are being repeated, things that have already occurred in the past. Like people posting a picture of some mass shooting, for example Buffalo or Christchurch: it gets taken down, and 10 more accounts spawn and post the same things. So it's very efficient to reason in terms of: let's just redo the things that we already found out worked.
We employ automation to handle the scale of the problem, of course, and to consistently repeat a decision that a human reviewer has already vetted in the past. So we tie a piece of violating content to a decision, and we tie the decision to the actions. Let's just repeat this action every time we meet a piece of content that triggered the same decision. We do that for videos, for pictures, and also for text. Today we'll be mostly talking about images, because the techniques for videos and pictures are very similar; text has a completely different array of techniques that we will not be presenting today. So, if you want to achieve similarity detection, you have to come up with a way to measure similarity first. So how do we compare two pictures? Of course, we are not going to do pixel-by-pixel comparison. We want to be much faster. One way to do that is just, okay, let's MD5-hash all the pictures, or SHA-1 all the pictures, and then we store the hashes somewhere in an indexing system. And whenever a new picture comes in, we just recompute the hash, and if it matches, we ban, right? Well, that doesn't work very well, because cryptographic hashes are not resistant to resizing, rotation, one-pixel alterations: the hash changes altogether. Instead, we can really benefit from locality-sensitive hashing, because it allows for similarity measurement. You slightly change one portion of the image, and the hash changes a little, but not completely. Then you can reason in terms of distance between two hashes. So you have to find a way to turn an image into a vector, and then you perform a vector search. Whenever two vectors are very, very close, within a certain threshold, then it's probably a match. And just in case you're asking, this is the basic architecture.
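The contrast between cryptographic hashes and similarity hashes can be sketched in a few lines. This is a minimal illustration, not anyone's production code: the "image" is stand-in bytes, the 256-bit similarity hashes are made-up integers rather than the output of a real algorithm, and the threshold of 31 bits is an assumption chosen only for the example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Cryptographic digest: any change to the input scrambles the output."""
    return hashlib.md5(data).hexdigest()


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions in which two integer-encoded hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_probable_match(hash_a: int, hash_b: int, threshold: int) -> bool:
    """Treat two similarity hashes as a match when they are close enough."""
    return hamming_distance(hash_a, hash_b) <= threshold


# Stand-in "image" bytes; a single bit of the first byte is flipped in the copy.
original = bytes(range(256)) * 4
altered = bytes([original[0] ^ 1]) + original[1:]
digests_differ = md5_hex(original) != md5_hex(altered)

# Made-up 256-bit similarity hashes (hypothetical values for illustration).
reference_hash = int("0f" * 32, 16)
slightly_edited = reference_hash ^ 0b111        # three bits flipped
unrelated = reference_hash ^ ((1 << 256) - 1)   # every bit flipped

THRESHOLD = 31  # illustrative cut-off, an assumption for this sketch
```

With a cryptographic digest the only possible question is "equal or not"; with a similarity hash the distance itself carries information, which is what makes a threshold meaningful.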
More or less all the architectures share these four stages. Observation: an image has been generated, usually by a push event, like a user uploaded something. Then you have the representation phase, in which you hash the image to a compact representation. If you're indexing, you store that into your index; if instead you are at inference time, like an event where someone uploaded something, you search the index you have built with that representation. In case you have a match, you action upon it: you decide what to do with the match you got. Usually the idea is that this is very close to an image that I already saw in the past, that was banned, and the account was also taken down. Do the same to this user. So, first free piece of content: Facebook has released a library, which is FAISS, the Facebook AI Similarity Search library. It is a library to do similarity search over a set of dense vectors, vectors of floats or integers, for example. You can think about it like a C++ version of Lucene: you index stuff, it puts that in a very big space, and you can search in this space very fast. It supports CUDA, so you can use your GPUs to search. It's basically an index on steroids. It's C++, but it has Python bindings available, and it scales almost linearly. You can really index 100 million pieces on a single machine, and it handles them without needing to saturate all the memory, so it has very good optimization properties that make it a very good tool, and you can go and download it on GitHub. Today we are mostly referring to perceptual hashing. This means that we are reasoning in terms of colors and shapes in images. We are not reasoning about what's happening inside the image. That's semantic hashing, which we are not going to talk about today. Perceptual hashing just captures visual similarities, and it's very nice for our use case because it does exactly its job.
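The index, search and action stages can be sketched with a brute-force index in plain Python. This is a toy stand-in for what a library such as FAISS does at scale, not its API: the class name `FlatIndex`, the function `decide`, the decision labels and the threshold are all hypothetical names and values chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


class FlatIndex:
    """Toy exhaustive-search index over fixed-dimension float vectors."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.vectors: List[Tuple[float, ...]] = []
        self.decisions: List[str] = []

    def add(self, vector: Sequence[float], decision: str) -> None:
        """Indexing time: store a representation with the decision tied to it."""
        if len(vector) != self.dim:
            raise ValueError("vector has the wrong dimension")
        self.vectors.append(tuple(vector))
        self.decisions.append(decision)

    def search(self, query: Sequence[float], k: int = 1) -> List[Tuple[float, str]]:
        """Inference time: return the k nearest entries as (distance, decision)."""
        scored = sorted(
            (math.dist(query, vector), decision)
            for vector, decision in zip(self.vectors, self.decisions)
        )
        return scored[:k]


def decide(index: FlatIndex, query: Sequence[float], threshold: float) -> str:
    """Repeat the stored decision if the nearest neighbour is close enough."""
    hits = index.search(query, k=1)
    if hits and hits[0][0] <= threshold:
        return hits[0][1]
    return "allow"


index = FlatIndex(dim=4)
index.add([0.0, 0.0, 0.0, 0.0], "ban_account")
index.add([10.0, 10.0, 10.0, 10.0], "delete_content")
```

A linear scan like this is only workable for small collections; the point of a dedicated similarity-search library is to answer the same nearest-neighbour question over very large sets of vectors.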
So you might think that we are talking about machine learning systems that come up with very clever representations of our pictures, and you might ask: do we really need a convnet for that? Do we really need to employ GPUs? I already said that it runs on CUDA, so perhaps that's a hint, but absolutely not. Most of this technology is hashing technology, so they just compute a mathematical transformation over the image, and it's really fast and really cheap, and it can be executed almost everywhere. So, a little bit of history. The first very notable example comes from a source that nobody would have thought about: it's Microsoft, in 2009. Microsoft invents PhotoDNA. PhotoDNA is the first algorithm employed in the fight against exploitative images of children. It transforms a picture into a hash of 144 unsigned 8-bit integers. It's proprietary. Microsoft licenses it to any non-profit or any organization that wants to fight exploitative images of children. It gives you a license that you can use for that and nothing else. But I cannot disclose the details of how it works. It can be used only for that, but Microsoft donated PhotoDNA to the National Center for Missing and Exploited Children, NCMEC. It's an American non-profit that basically acts as a coordination center in the global fight against this phenomenon, and it shares this library with anyone that wants to integrate it. I cannot talk about how this works; this is the only moment in which I will say something like that. But we can talk about an open source counterpart: almost 10 years later, Facebook releases PDQ. PDQ stands for a Perceptual algorithm using a Discrete cosine transform that gives a Quality metric. It's a very, very bad acronym, but we needed a three-letter acronym, so it's that. It creates a 256-bit hash and uses Hamming distance to compute the distance. It's really fast. The compute overhead is negligible compared to disk reads.
It can tolerate some level of adversariality. This means that if you change the image because you want to fool the system into thinking that this image is not something well known, PDQ can resist some of these manipulations, but not all of them. It's used in stopncii.org. It's a website where, in case you have a fight with your ex-fiancé and he's threatening to publish your intimate imagery, you go to stopncii.org, you upload your intimate images, fingerprints get taken, the original images get deleted right away, of course, and these fingerprints are shared with partners that say, okay, if I see these fingerprints on my website, my platform, I'm going to take the content down. So it's a crowdsourced effort, and it uses PDQ for images. How does that work? So PDQ hashing is: optionally scale down to a square image, okay. Then you compute the luminance. Luminance is the idea that you take the pixel that contributes most in the RGB channel. Instead of using black and white, you use the luminance. It's just another procedure, and the idea is that the luminance gives you better information about which channel was contributing most to the color, to the light, in that place. Then you downsample to 64 times 64 using a blur filter, and the idea of the blur filter, or tent filter, is that it gets the most significant value in that region, because if you keep convolving a pixel with its neighborhood, what you will have in the end will be the highest value. So you obtain a representation which is compact and retains the most significant information. Then you divide the image into 16 times 16 boxes, each one 4 by 4 pixels, and you calculate a discrete cosine transform of each box. The discrete cosine transform, so the box is at the 4 bar color there. You see that grid with a lot of wobbly images: that is a discrete cosine transform. The idea is that any image, any signal, can be represented as a sum of cosine signals.
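Two of the steps just described, luminance and the discrete cosine transform, can be sketched in plain Python. This is a hedged illustration and not the PDQ reference implementation: the luminance here is the common Rec. 601 weighted sum of the three channels (an assumption about the exact weights), the DCT is an unnormalized DCT-II, and the 4 by 4 block size simply follows the description in the talk.

```python
import math
from typing import List, Sequence


def luminance(red: float, green: float, blue: float) -> float:
    """Weighted sum of the RGB channels (Rec. 601 weights, assumed here)."""
    return 0.299 * red + 0.587 * green + 0.114 * blue


def dct_1d(signal: Sequence[float]) -> List[float]:
    """Unnormalized DCT-II: project the signal onto cosines of rising frequency."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_2d(block: Sequence[Sequence[float]]) -> List[List[float]]:
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    columns = [dct_1d(column) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# A flat 4x4 block: all of its energy should land in the first coefficient.
flat_block = [[10.0] * 4 for _ in range(4)]
flat_coefficients = dct_2d(flat_block)
```

For the flat block only the top-left coefficient is non-zero, which matches the intuition from the talk: a region without variation needs a single cosine term, while texture and edges show up in the higher-frequency coefficients.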
You only take the most significant signal, so it's a form of compression actually, and you take the most significant coefficient, for the biggest cosine you have. And then you check whether it is above the median value: if so, it's one, otherwise it's zero. So you get this array of 256 bits, 010101, depending on whether each spot had a high luminance or a low luminance. The DCT provides a spectral hashing property: it identifies which points in the image contribute more or less. You have a hashing space which is 2 to the power of 128, because half the bits are always 0 and half are always 1. To search, you just do a vector search again against what you've just created. In case we want, we can use partially the same technology to do video hashing, and this comes in almost the same paper. TMK, the Temporal Match Kernel, is a way to use the PDQ creation to build a video similarity detection algorithm. It produces fixed-length video hashes, so your hash stays the same length, which is something like 256 kilobytes, if I'm not wrong, whether your video lasts 3 hours or 30 seconds. It just produces a fixed length, so it's really nice. What you do is that you resample the video to 15 frames per second, then you compute the PDQ without the 0/1 quantization, so you keep the float numbers. That's why it's called PDQF, PDQ float. And then you compute the average of all the descriptors that you have, within various periods of cosine and sine. Why do we add the cosine curves? Because a cosine or a sine adds this wobbly modulation that tells you whether a frame comes before or after in the near neighborhood of the frames. In case you have 10 pictures, you add this cosine signal, and you know this picture is before that one because you see the cosine curve going up and going down. It's a nice unique fingerprinting, a time-signature trick, to add a cosine.
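The idea of a fixed-length video descriptor built from plain and cosine/sine-weighted averages can be sketched as follows. This is a simplified illustration of the principle, not the TMK reference implementation: the function name `temporal_summary`, the choice of periods and the tiny one-dimensional "frame descriptors" are assumptions made for the example.

```python
import math
from typing import List, Sequence


def temporal_summary(
    frames: Sequence[Sequence[float]], periods: Sequence[float] = (2.0, 4.0)
) -> List[List[float]]:
    """Summarize per-frame float descriptors into 1 + 2 * len(periods) vectors.

    The first vector is the plain average (no temporal information). The others
    are averages weighted by a cosine or sine of the frame position, so they
    change when the order of the frames changes.
    """
    count = len(frames)
    dim = len(frames[0])

    def weighted_average(weights: Sequence[float]) -> List[float]:
        return [
            sum(w * frame[d] for w, frame in zip(weights, frames)) / count
            for d in range(dim)
        ]

    summary = [weighted_average([1.0] * count)]
    for period in periods:
        angles = [2.0 * math.pi * t / period for t in range(count)]
        summary.append(weighted_average([math.cos(a) for a in angles]))
        summary.append(weighted_average([math.sin(a) for a in angles]))
    return summary


short_video = [[float(t), 1.0] for t in range(3)]
long_video = [[float(t % 7), 1.0] for t in range(300)]
forward = temporal_summary([[1.0], [3.0]], periods=(4.0,))
backward = temporal_summary([[3.0], [1.0]], periods=(4.0,))
```

Both the 3-frame and the 300-frame input produce the same number of vectors, and reversing the frame order leaves the plain average untouched while changing the weighted ones, which is the "time signature" role the cosine and sine play.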
You compute the average of the PDQF for all the frames, with various periods of sine and cosine, and then you pack them all together, and you have these five or six averages, and that's your PDQF embedding. Matching is just: you first compare vector zero, which is the average of all the frames and doesn't retain a temporal signature. Then, if there is a match, you also compare all the other vectors at different periods, which act as the time signature, and so you can really be sure that the videos are the same, because if you find the same averages with the same periods, it must be the same video. It's nice that it's resistant to resampling, because you always resample. So if you vary the frame rate, the video file will change, and an MD5 hash will change, but this one is not fooled. Hashing is really slow, because you have to transcode all the videos first, and then you have to read all the frames and compute the PDQ for every frame. But search is actually very fast. Another nice hashing technique that we have is video MD5. I said that we would not be using crypto hashes. I lied. We use crypto hashes, but just for videos. This is because if you take the MD5 of a video to find exact copies, it's really cheap. A lot of actors just repost unmodified content. They are not really going through the hassle of re-encoding just to try to fool the systems. They just try to repost again. So MD5 actually works, and it can be done with vector search if we use the bytes of the MD5 digest. And it's used widely in stopncii.org also. In 2022, Facebook released video PDQ, which is a different algorithm from the former one. Hashing is: we hash every frame to a PDQ hash, and we just pack the list. It's much bigger. It's not slower than the other one, and it has a nice property: we just have to search for individual frames.
So we treat the problem with a bag-of-words approach. We just put all these frames inside an index library. Then we search, we take all the candidates, and we do a pairwise comparison. If the pairwise comparison is successful beyond a certain threshold, then it's a match. And this you also get for free, and it's released along with PDQ and along with TMK+PDQF. All this is available inside the Facebook Research GitHub repository. What do you do once you have all these hashes? Your platform is computing the hashes, and it's the first time that you see this content, but perhaps other actors have already seen this content, too. Well, you upload them to the ThreatExchange platform. NCMEC shares the PhotoDNA hashes, as I told you, with all companies that ask for them. So: can you please tell me whether this picture that someone uploaded is a match in NCMEC? Then I already know that this is something for which I should call law enforcement. Meta does the equivalent, but for PDQ, because there is much less friction in adopting PDQ compared to PhotoDNA. There's a team, Internet Safety Engineering, that builds and operates all these services where anyone can upload fingerprints, and so you can crowdsource a big graph of matches. There's a REST API to access and post new data, it has multi-language clients, it uses PDQ, and users can also download the data. You are not forced to stay online, stay connected. You can just request a dump of the database and search it. And you find all the data and all the APIs at the GitHub page. In 2020, Facebook also released its most advanced algorithm to spot similar images, SimSearchNet++. This is a neural network, and it is capable of withstanding adversarial manipulations that the other embeddings are just not able to. Unfortunately, SimSearchNet is proprietary, so I cannot really talk about it, but we have a cousin product, SSCD, the SimSearch Copy Detection, which is open source and free.
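The frame-level, bag-of-words style of video matching described at the start of this passage can be sketched like this. It is an illustrative approximation rather than the released implementation: frame hashes are small made-up integers, and the function names, the distance of 2 bits and the 80% fraction are assumptions chosen for the example.

```python
from typing import Sequence


def hamming(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two integer-encoded frame hashes."""
    return bin(hash_a ^ hash_b).count("1")


def matched_fraction(
    query: Sequence[int], reference: Sequence[int], max_distance: int
) -> float:
    """Share of query frames that have at least one close reference frame."""
    if not query:
        return 0.0
    matched = sum(
        1
        for q in query
        if any(hamming(q, r) <= max_distance for r in reference)
    )
    return matched / len(query)


def videos_match(
    query: Sequence[int],
    reference: Sequence[int],
    max_distance: int = 2,
    min_fraction: float = 0.8,
) -> bool:
    """Pairwise check in both directions against an illustrative threshold."""
    return (
        matched_fraction(query, reference, max_distance) >= min_fraction
        and matched_fraction(reference, query, max_distance) >= min_fraction
    )


# Made-up frame hashes: the repost differs from the reference by one bit per frame.
reference_frames = [0x0000, 0x00F0, 0xFF00]
reposted_frames = [0x0001, 0x00F1, 0xFF00]
unrelated_frames = [0xFFFFFFFF, 0xAAAAAAAA, 0x55555555]
```

In a real deployment the inner `any(...)` scan would be replaced by a lookup in an index of frame hashes, with the pairwise comparison run only on the candidates the index returns.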
So I can really talk about that. They are somewhat related in some technological principles. So this is a PyTorch-based model. The problem that this state-of-the-art product is trying to solve is: what happens if I take a picture and I put a caption on it, altering so many pixels everywhere? A PDQ or a PhotoDNA hash would be altered dramatically, but is there anything we can do to teach a computer to just ignore all the captions, all the rotations, all the jitters, all the cropping of the image? Yes, there is. A person is able to do that, so we can teach a computer to do that, too. So models and code are available. What is not available is the training data that we used to create the model, of course. For those who are into deep learning, it's a ResNet-50 convolutional neural network. And the novelty of the approach is that it's based on R-MAC vocabularies. R-MAC is regional MAC. How many of you know how a convolutional network works? Raise your hand. Okay, fine. Very good. So, for the others, it's a network that looks at portions of the image. Each neuron looks at a different portion, and then they pass what they have understood to a higher-level series of neurons, higher and higher and higher, until the last layer of neurons has a very wide overview of the whole picture. In this case, we are using the maximum activation of all the channels that we have. So we take note of which regions of our kernel maps have the maximum activation across all channels. If you have 10 channels, and that region has a maximum activation across all the different channels, that means that area is an area of interest. So we use these areas of interest as words in a vocabulary.
So, exactly as when you do cosine similarity search for documents: you take all the words, you index all the words, you say this document has these words, so it's like a vector of words, and then you try to see which vectors have the most words in common and put them in the same place. We do the same thing, but for portions of the image. So we use R-MAC. The idea is that it's also a self-supervised system. It means that it's trained to recognize augmented input, to match an input to its augmented version. So what we do is that we take the training set, we apply a lot of augmentations: we add captions, randomness, we rotate, we flip, we alter the colors. For example, if you do one degree of whitening, you make the image brighter, which means you add plus one to all the pixels in the image, you are altering all the pixels. But in this case, a PDQ hash is capable of coping with the difference. That's a very weak form of adversarial attack, because PDQ just computes the difference between regions, so it's not going to be fooled. But you can be much more violent and put a spot of color somewhere, and PDQ is going to be fooled by that. Then you go through the CNN, and you do a thing called GeM pooling, which is generalized mean pooling, a generalization of average pooling, in case you were wondering. And at the end you use an entropy-oriented loss function. This means that we want to encourage the network to spread the representations of the training data across all different places, because we want to maximize the distance between all the training examples in the training set. So you get a nice uniform search space. At inference time, you do the same with the CNN, and then you obtain a vector, which is a representation of the image. And the idea is that there is a distance that you can compute against the data set of reference images.
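Generalized mean (GeM) pooling is easy to show in isolation. The sketch below is a plain-Python illustration of the formula, not code from the SSCD repository: the exponent `p = 3.0`, the clamping constant `eps` and the helper name `describe` are assumptions for the example, and the "feature maps" are tiny hand-written lists instead of real CNN activations.

```python
import math
from typing import List, Sequence


def gem_pool(activations: Sequence[float], p: float = 3.0, eps: float = 1e-6) -> float:
    """Generalized mean: p = 1 gives the average, large p approaches the maximum."""
    clamped = [max(a, eps) for a in activations]  # keep values positive
    return (sum(a ** p for a in clamped) / len(clamped)) ** (1.0 / p)


def describe(feature_maps: Sequence[Sequence[float]], p: float = 3.0) -> List[float]:
    """Pool each channel to one number, then L2-normalize the resulting vector."""
    pooled = [gem_pool(channel, p) for channel in feature_maps]
    norm = math.sqrt(sum(value * value for value in pooled))
    return [value / norm for value in pooled]


channel = [1.0, 2.0, 3.0]
as_average = gem_pool(channel, p=1.0)
in_between = gem_pool(channel, p=3.0)
near_maximum = gem_pool(channel, p=100.0)
descriptor = describe([[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [4.0, 0.0, 0.0]])
```

The exponent is what lets the pooling sit between "every location counts equally" and "only the strongest activation counts", which is why it is described as a generalization of average pooling.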
Of course, you can subtract a background data set that was used generally to augment the images, but in this case, what you obtain in the end is that the score of the augmented image is almost the same as that of the non-augmented version, because the network just learns to ignore the places which are not organic in the image. And SSCD is freely available. You can download it and start playing. You find both code and models, as I already said, but not the training data. And by the way, Facebook has also announced an image similarity challenge. You have to determine whether a query image is a modified copy of any image in a reference corpus of one million. This is very similar to the Netflix recommendation challenge, when you had to recommend movies and beat Netflix's algorithm. So that is the image similarity challenge, and there is also the Meta AI video similarity challenge, which has two tracks: generate a useful vector representation for a video, and also try to find a reference video in this very big corpus. And you don't only have to find a video. You have to find a clip, so a sub-portion of a video, in a very big corpus. And last but not least, since the last part of a dinner is the tastiest one, we have a turnkey open-source solution that you can install on your own premises: the Hasher-Matcher-Actioner. HMA is an open-source turnkey safety solution. So you just download it, install it, and it starts working right away. What it does is scan the images that you push towards it. It has an index that is updated with all the hashes coming from ThreatExchange, but also with yours, and it lets you bind banks to verticals of violations.
You might have a non-severe violation or a very severe violation, and you might decide that for a non-severe violation you just delete the content and send a warning, while for a high-severity violation you immediately delete the content, shut down the account of the poster, and also report it to law enforcement. You can do that. And you can configure actions in a backend that are tied to the content that you want to bank in your HMA platform. You can pull violating seeds from the Facebook ThreatExchange API. It works on AWS only, because we wanted to make a very easy-to-use thing, and also something that doesn't really make your bill higher. So we built it on AWS Lambda. It doesn't cost anything until it runs; then it runs, spawns a Lambda instance, and goes down again, and you only pay for the seconds that it actually runs. But it's very fast. And there's a Terraform module available thanks to the lovely folks of the Internet Safety Engineering. This is how you deploy it. In your infra, you co-locate HMA with your platform. For example, you might own a platform where people have a chat or people post pictures. Whenever new content comes in, the web server asks the hasher, have you seen this? And the hasher goes to the matcher. The matcher goes to the index and says, do I know this? And in case there's a match, the actioner module will call you back: you have to define a callback API in your own platform, so that whenever the actioner calls, you kill this content in your own backend. And, of course, you can fetch new content from an external API, from the ThreatExchange platform. So wrapping up, automation is necessary to be effective. But you will lose precision, of course, because automation doesn't really think. It just does whatever you have configured, blindly. Human support is always needed for appeals and also to establish the ground truth. So what is actually violating, and what is not? Do expect false positives, because they will happen.
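The hasher, matcher, actioner flow just described can be sketched in a few lines of plain Python. This is not HMA's code or API: the class names, the callbacks, the severity labels, and the distance threshold are all hypothetical, and the toy difference hash below only stands in for a real perceptual hash such as PDQ. The platform callback soft-deletes content instead of destroying it, on the assumption that an appeal should be able to restore it.

```python
def toy_hash(gray_rows):
    """Toy perceptual hash: one bit per neighbouring pixel pair in each row.

    It compares pixels with each other, so adding a constant brightness
    to the whole image leaves the hash unchanged.
    """
    bits = []
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return tuple(bits)


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


class Matcher:
    """Index of banked hashes, each tied to a severity label."""

    def __init__(self, max_distance):
        self.max_distance = max_distance
        self.index = []  # list of (hash, severity)

    def bank(self, image_hash, severity):
        self.index.append((image_hash, severity))

    def match(self, image_hash):
        best = None
        for known, severity in self.index:
            distance = hamming(known, image_hash)
            if distance <= self.max_distance and (best is None or distance < best[0]):
                best = (distance, severity)
        return best[1] if best else None


class Platform:
    """Stand-in for your own backend, exposing the callbacks."""

    def __init__(self):
        self.visible, self.soft_deleted, self.reports = set(), set(), []

    def publish(self, content_id):
        self.visible.add(content_id)

    def soft_delete(self, content_id):
        self.visible.discard(content_id)
        self.soft_deleted.add(content_id)

    def restore(self, content_id):  # used when an appeal succeeds
        self.soft_deleted.discard(content_id)
        self.visible.add(content_id)

    def report(self, content_id):
        self.reports.append(content_id)


def handle_upload(content_id, gray_rows, matcher, actions):
    """Hasher -> matcher -> actioner for one new piece of content."""
    severity = matcher.match(toy_hash(gray_rows))
    for action in actions.get(severity, []):
        action(content_id)
    return severity


platform = Platform()
matcher = Matcher(max_distance=1)
matcher.bank(toy_hash([[10, 50, 20], [30, 30, 90]]), "high")
actions = {"high": [platform.soft_delete, platform.report]}

platform.publish("a")  # brightened copy of the banked image
result_a = handle_upload("a", [[11, 51, 21], [31, 31, 91]], matcher, actions)
platform.publish("b")  # unrelated image
result_b = handle_upload("b", [[90, 50, 20], [90, 60, 10]], matcher, actions)
```

If the match on "a" later turns out to be a false positive, `platform.restore("a")` makes the content visible again without having to re-upload it.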
You should put in place an appeal process to allow your users to restore the content. PDQ, video PDQ, MD5 and SSCD will provide you with a way to obtain compact representations of high-dimensionality content like pictures and videos. HMA provides you with a turnkey solution that you can install on your premises to search and enforce your integrity policies on your platform. And ThreatExchange provides you with a platform for exchanging representations with other big actors like Meta itself, for example. That was all from me. Thank you very much for listening. Any questions? You mentioned it for the challenge, I think? Oh, louder. So you mentioned for the challenge, finding a clip of a video. Can PDQ do that, actually? You can't hear me. So can PDQ find clips of videos? That's my question, actually. So you're saying, perhaps you heard about YouTube, which is something that already does this. The question is whether it can find clips of videos. Yeah, in general, it's possible, of course, and the video PDQ algorithm will hash every frame. So in case you send a very small sub-portion of a video, you will have, like, 100 frames, for example, and these 100 frames will be treated as a bag of words. You search the index, and you find the video that contains all of these words. So you have a match of all your query frames inside the index with the very long video that contains them. And so it's a match. That's how we do it. Of course, there are more clever ways to do that. Thanks. Hello. Not a technical question, but let's see. I was thinking that if you're using such a system to try to prevent digital crimes and things like that, from an ethical perspective, I was just wondering, I suppose you have such images to compare them. And how do you process those, how do you make the decisions? So I repeat the question. From the ethical perspective, the idea is that, of course, we have to see the images in order to be able to know what's happening, right?
Yeah, see them and, of course, you have to save them and, I don't know, process them, and how do you handle this? So this is not the kind of question that I can really answer, because it is related to internal procedures. Now, if we have to compute the fingerprint of an image, there must be one second in which the image is on our servers. But agencies like NCMEC share hashes. So you might have a hash for which you don't have a picture. And you have to trust that this hash is coming from a trusted source that has already vetted whether this hash is nasty stuff or not. That's how we actually avoid heavily sanctioning innocent people. So there is a collaboration with trusted entities for this. When you receive those from an external agent, if those images are on your platform, you already know what you've seen. Thank you. Can you hear me despite the mask? Can you hear me? Thank you. So I have a question, but first I have a thanks, because I have worked on this kind of thing, and NCMEC doesn't share any useful data, IWF doesn't share any useful data, Pharos doesn't share any useful data. So I will definitely take a look at the ThreatExchange platform and hope that it's much more useful. And thanks for that. Now, I have a question anyway. If I were an attacker, I could download data from the ThreatExchange platform and try and run as many filters as I can automatically until I find something that is not matched by PDQ, video PDQ, et cetera. What's the way to counter that? Oh, you're asking whether adversarial attacks are possible on PDQ? Yeah, of course. PDQ is a very naive algorithm that just detects patches of colors. It is actually possible to create adversarial attacks. Just think that you can alter many pixels in the image so that perceptually, for us, nothing changes, but you might end up changing the pixels most relevant for the DCT algorithm. That will create a completely different hash in the end.
Also, someone has demonstrated an attack, a reverse engineering attack on PhotoDNA. The project is called Ribosome. It's a neural network that reconstructs a very blurry picture from a hash. So it is actually possible to do that, but PDQ is a very simple and fast algorithm. If you really want to seriously combat adversarial engineering, you need neural networks like SSCD, because they capture so many relations between different parts of the image and are much harder to fool. I'm not saying it's impossible to fool them because, of course, it's possible. In general, sooner or later someone will find a way, but it's the usual arms race between attackers and defenders. And this is no exception. Thank you for your question. Hello. Hi. First, thank you for the presentation. I think it's a very interesting topic. I wanted to link it to something that's been a bit of a buzz the past few weeks, generative AI, especially ChatGPT. I was wondering, when you use that kind of algorithm and you scan an image and detect something, is there a level of confidence attached to the result, and can you detect when an image is potentially a fake or? There's a lot of echo, so I cannot really hear. Can you say it louder please? It's hard to understand from here. Hello. Okay. Is it better? Okay. Yeah, so I said thank you, but I wanted to link to generative AI, and I was asking: when you run that kind of algorithm to detect violence or child abuse or anything else, can you also attach a level of confidence to the response to define whether it's a potentially fake picture, or is there an extension to the algorithm where you can link with generative AI? I'm not sure about the answer. Sorry, we can go for a beer and I can explain in more detail and let's see. Yeah, you have a question. Hi. Thank you for the talk. It was very interesting. One more question also. Do you run SSCD in production as well? The deep learning network?
Whether we're using SSCD in production, I can't reply to this question. We use SimSearchNet++. I can mention this other one because we have written a blog post about it, so I can confirm that we use SimSearchNet++. I can neither confirm nor deny anything about SSCD, but those are related technologies, so I could talk about that. What does the production stack for SimSearchNet++ look like? How do you serve it? It must be pretty hard to deal with the GPUs. This is not a question that I can answer, I'm sorry. I cannot talk about the production setups. I'm sorry. Okay, any question nearby? Thank you. Of course, you can imagine that we do not operate in a vacuum, so if you think about how we serve results from a neural network, it is perhaps something similar to what you would do if you had to put a model behind an API. So I kind of have two questions. The first question is, to what extent do... I think there are potentially two problems. Intentional mismatches and unintentional mismatches. So situations where perhaps an image has been recompressed or has been cropped, or is perhaps another image of the same situation, versus situations where people have deliberately deformed the image to try and get around these kinds of systems. Do you have any idea of how performant it is in the two scenarios, accidental or unintentional mismatches versus intentionally trying to avoid detection? So it is, of course, possible to have unintentional mismatches, and I've seen images that were adversarially engineered to give the same embedding. Those are absolutely possible, again, in PDQ, PDNA, and all the perceptual hashing, which is just a mathematical transformation. You just have to find a way in which the input seems the same to the algorithm. For the neural network things, it depends.
You can study the code, you can study how it's done, and if you can do that, it is absolutely possible sooner or later, because adversarial attacks on convnets are a reality. So it's absolutely possible. I've seen some mismatches, but usually with perceptual hashes. Usually the more refined the technique, the harder it is to attack, of course, otherwise we would just stay with MD5, because it would be enough. Crops: PDQ is resistant to crops, and SSCD is very resistant to crops. If you have rotations, I believe PDQ is also resistant to rotations, like flips, but you cannot ask much more than that. Other questions? Yeah. Do you have any information about the speed difference between SSCD and PDQ? So the question is whether I have some speed benchmarks for the difference in performance between PDQ and SSCD at inference time. PDQ is faster than the time it takes you to read the image from disk. So it's negligible. It will just compute. It's a mathematical transformation on the pixels. The neural network requires dedicated hardware. If you run it on CPU it will take seconds, also because the model, I think, is big enough. It's not as big as GPT, but it's a 50-layer CNN. So it's of course slower and requires dedicated hardware, but it's more precise. SSCD finds anything that PDQ is able to find, and much more. So if you are very conscious about it, as in, I have to scan this stuff just to make sure it doesn't come from an ill source, you might want to set up an async process that will take longer, but will just batch process all your stuff. If you need a super fast thing, PDQ will not really weigh on your server. Thank you. Any other question? Hi. First of all, great question from my former colleague David, I think, down there. Not even looking this way. But what happens if you get a false positive match? How do you disregard that in the future without potentially disregarding a real match?
So if we get a false positive match, what do we do to restore? Yeah. How do you restore? You mean in Meta or anywhere? Just anywhere. Like as a concept. So in Meta, I cannot really say. With the Hasher-Matcher-Actioner, you should provide a capability in your own platform for soft deleting the image, because you have to provide an API in your platform that HMA will call, where you say, soft delete this picture. So make it unavailable, but do not really delete it, in case the user wants to appeal. So you need to provide something like soft delete and undo soft delete. This is the simplest and most effective way to deal with false positives, in case, whoops, I made a mistake, I want to restore the content. Sure. But if you have an image that someone wants to upload, say it's a popular image that a lot of people are going to upload, but it matches the pattern of another bad image, is there a good way to make a more precise hash and exclude that and say this one is a false positive? It doesn't match what you think it does. So you don't have to keep undoing the. Okay. So partly, if the image is popular, so we have many examples of an image which is not bad, and then comes a bad image, whether we can use the fact that it's very widespread to augment our precision. Is this the question? Okay. Well, really, there's nothing in this presentation that does this, because once you train the network, it is trained, and you start serving, and the network will give you the same answer to the same query. PDQ, or any other mathematical, perceptual algorithm, is just a mathematical function, so it will not change. There's nothing to train. So to fix a deficiency of your model, you have to retrain. You can do a better retraining, and sometimes models are retrained, like anything which is still under maintenance.
For example, we get new data, and we might want to retrain, as with any other model. For spam filters it is the same. Do we have more room for questions? I think it's done. Thank you so much. You've been a wonderful audience. Thank you. |
Teaching machines to handle bugs and test Firefox more efficiently.
Using machine learning to automate bug management, test selection, and more. |
Hello, everyone. I'm Marco. Thank you for being here to listen to my talk. I'm an engineering manager at Mozilla. I've been at Mozilla for almost 10 years now. I started as a contributor, then an intern, then I was hired, and I've been here for almost 10 years. I started working on some funny projects, like writing a Java VM in JavaScript, and then more recently I started focusing on using machine learning and data mining techniques to improve developer efficiency, which has also been the subject of my PhD. During this talk, I will show you how we will all be out of a job in a few years. Joking. I will just take you through our journey of how we incrementally built features based on machine learning for improving software engineering, one on top of the other. I'm the father of two, Luna and Nika. Before we start with the presentation, I wanted to explain a little why we need to do all these complex machine learning things on top of bugs, CI, patches, et cetera, et cetera. Firefox is a very complex piece of software. It's a browser. We have hundreds of bug reports and feature requests opened per day. We have 1.8 million bug reports at this time, which is almost the price of a one-bedroom apartment in London. We release every four weeks with thousands of changes, and during 2022 we had 13 major releases and 45 minor releases. As you can see, we even sometimes party when we reach a certain number of bugs. As I said, Firefox is one of the biggest pieces of software in the world. We have a lot of legacy. Netscape was open sourced 25 years ago. A few days ago we celebrated the 25th birthday. Over time we had 800,000 commits made by 9,000 unique contributors, representing 25 million lines of code. We had 37,000 commits last year alone, by 1,000 unique contributors. Not all of them are paid. Many of them are volunteers. And this is a list of the languages that we use. As you can see, we use many of them. We have C++ and Rust for low-level things.
Rust is gaining ground and is probably going to overtake C soon. We use JavaScript for front-end and for tests. And we use Python for CI and the build system. But we have many more. So if anybody is interested in contributing, you have many options to choose from. But let's see. So as I said, the complexity is really large. We have thousands and thousands of bugs. And we need some way to control the quality, to increase the visibility into the quality of the software. And we cannot do that if the bugs are left uncontrolled. One of the first problems that we had was that there was no way to differentiate between defects and feature requests. We call them bugs on Bugzilla. But they are actually just reports. Many of them are defects. Many of them are actually just feature requests. And so at the time, we had no way to measure quality. We had no way to say: in this release we have 100 bugs, in this other release we had 50, so this release is better than the previous one. And so we needed a way to make this differentiation in order to measure the quality. And it was also hard to improve workflows if we had no way to differentiate between them. So we thought of introducing a new type field. This might seem simple. It's just a choice between defect, enhancement and task. But in practice, when you have 9,000 unique contributors, some of them not paid, it's not easy to enforce a change like this. And we also had another problem. We have 1.8 million bugs. If we just introduce this type, it's not going to help us at all until we reach a critical mass of bugs that have it set. So if we just introduced it at that time, it would only start to be useful six months later. So we thought, how do we set the field for existing bugs so that this actually becomes useful from day one? And we thought of using machine learning. So we collected a dataset. I'm not sure it can be considered large nowadays. It had 2,000 manually labelled bugs, which a few of us labelled independently.
And then we shared the labelling so that we were consistent. And we had 9,000 labelled with some heuristics based on fields that were already present in Bugzilla. Then, using the fields from Bugzilla, and the title and comments passed through an NLP pipeline, we trained an XGBoost model. And we achieved an accuracy that we deemed good enough to be used in production. And this is how the bugbug project started. It was just a way to differentiate between defects and non-defects on Bugzilla. We saw it worked and then we thought, what if we extend this to something else? What is the next big problem that we have on Bugzilla? And it was assigning components. Again, we have lots of bugs, hundreds of thousands of bugs. We need a way to split them into groups so that the right team sees them, so that the right people see them. And the faster we do it, the faster we can fix them. At the time, it was done manually by volunteers and developers. So you can see a screenshot here: product and component, PDF Viewer. In this case, we didn't need to manually create a data set, because all of the 1 million bugs had already been manually split into groups by volunteers and developers in the past. So we had, in this case, a very large data set, two decades' worth of bugs. The problem here was that we had to roll back the bug to its initial state, because otherwise, by training the model on the final state of the bug, we would have used future data to predict the past. And that was not acceptable, of course. So we rolled back the history of the bug to the beginning. We also reduced the number of components because, again, at the Firefox scale, we have hundreds or thousands of components. Many of them are no longer actually maintained and no longer relevant. So we reduced them to a smaller subset. And again, we had the same kind of architecture to train the model, with a small tweak: we didn't have perfect accuracy, and so we needed a way to trade off precision and recall.
So either pay the price of lower quality but catch more bugs, or catch fewer bugs but be right more of the time. We can control this easily with a confidence level that is output by the model, which allows us to sometimes be more aggressive, sometimes less aggressive. But at least we can have a minimum level of quality that we enforce. The average time to assign a bug then went from one week to a few seconds. Over time, we auto-classified 20,000 bugs. And since it worked, we also extended it to webcompat.com, which is yet another bug reporting system that we have at Mozilla. If you find web compatibility bugs, please go there and file them, because it's pretty important. And you can see here an action of the bot moving the bug to, again, the Firefox PDF Viewer component. Maybe I should have used another example just for fun. Now we had something working. And it was starting to become promising. But we needed to make it better. We needed to have a better architecture for the machine learning side of things. We needed to retrain the models. We needed to collect new data. We needed to make sure that whenever a new component comes in, we retrain the model with the new components. If a component stops being used, we need to remove it from the dataset, and things like that. So we built, over time, a very complex architecture. I won't go into too many details because it would take too long. But maybe if somebody has questions later, we can go into that. And then with the architecture in place, it was easier to build new models. So we even had contributors building models all by themselves. In particular, there was a contributor, Ayush, who helped us build a model to root out spam from Bugzilla. It seems weird, but yes, we do have spam on Bugzilla as well. People are trying to get links to their websites into Bugzilla because they think search engines will index them. It's not actually the case. We tell them all the time, but they keep doing it anyway.
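The confidence threshold mentioned earlier in this part of the talk, where the bot only acts when the model is sure enough, can be illustrated with a small sketch. This is not bugbug's code: the sample predictions, the component names other than PDF Viewer, and the function name are made up for illustration. It just shows how raising the threshold trades coverage (how many bugs get auto-assigned) for precision (how often the assignment is right).

```python
def evaluate(samples, threshold):
    """Return (precision, coverage) for a given confidence threshold.

    samples: list of (predicted_component, actual_component, confidence).
    Only predictions at or above the threshold are acted upon.
    """
    accepted = [s for s in samples if s[2] >= threshold]
    if not accepted:
        return (None, 0.0)  # the bot never acts, so precision is undefined
    correct = sum(1 for predicted, actual, _ in accepted if predicted == actual)
    return (correct / len(accepted), len(accepted) / len(samples))


# Hypothetical model outputs on bugs whose final component is already known.
samples = [
    ("PDF Viewer", "PDF Viewer", 0.95),
    ("Networking", "Networking", 0.90),
    ("Graphics", "Layout", 0.55),
    ("Layout", "Layout", 0.70),
    ("Networking", "Graphics", 0.40),
    ("Graphics", "Graphics", 0.65),
]

for threshold in (0.0, 0.6, 0.92):
    precision, coverage = evaluate(samples, threshold)
    print(f"threshold={threshold}: precision={precision}, coverage={coverage:.2f}")
```

With a low threshold the bot touches every bug but is sometimes wrong; with a high one it leaves more bugs for manual triage but rarely moves a bug to the wrong place.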
We have university students. Bugzilla is probably the most studied bug tracking system in research. And we have many university students from many countries that use bugzilla as a playing field. Many times we even contact the universities and professors asking them if we can help them give more relevant topics to students, et cetera, but they keep filing bugs. And this contributor maybe was from one of these schools, was tired of it and helped us build a model. And the results were pretty good. I'll show you a few examples of bugs that were caught by the model. So this one was, if you look just at the first comment of the bug, it looks like a legit bug. But then the person created a second comment with a link to the website. And it was pretty clear that it was spam. This one is another example. This is actually a legit bug. It's not spam. Maybe it's not so usable as a bug report, but it was not spam. And then somebody else, a spammer, took exactly the same contents, created a new bug injecting the link to their website in the bug report. And somehow, I don't know how the model was able to detect that it was spam. It's funny because you can see that, so when you file a bug on bugzilla, bugzilla will automatically insert a user agent so that we have more information as possible to fix bugs. But in this case, he was filing the bug, copying the contents of the other bug, so we have two user agents. And they're even on different platforms, one on Mac and one on Chrome, actually. Okay. So we were done with bugs. We are not done with the bugs. We will have plenty of things to do in the future forever. But we were happy enough with bugs and we thought, what can we improve next? One of the topics that we were focusing on at the time was testing and cost associated to testing. We were experimenting with code coverage, trying to collect coverage to select relevant tests to run on a given patch. But it was pretty complex for various reasons. 
So we thought maybe we can apply machine learning here as well. But before we go into that, let me explain a bit about our CI, because it's a little complex. So we have three branches, three repositories, which all kind of share the same code, Firefox. We have Try, which is on-demand CI. We have Autoland, which is the repository where patches land after they've been reviewed and approved. And we have mozilla-central, which is the actual repository where the Firefox source code lives and from which we build Firefox Nightly. On Try, we run whatever the user wants. On Autoland, we run a subset of tests. At the time, it was kind of random what we decided to run. And on mozilla-central, we run everything. To give you an idea, on Try we have hundreds of pushes per day. On Autoland, the same. And on mozilla-central, we have only three or four. And it's restricted to certain people who have the necessary permissions, since Firefox Nightly is built from there and is going to be shipped to everyone. The scale here is similar to the bug case. We have 100,000 unique test files. We have around 150 unique test configurations, so combinations of operating systems and high-level Firefox configurations: old style engine versus new style engine, a certain graphics engine versus another graphics engine, et cetera, et cetera. We have debug builds versus optimized builds. We have ASan, code coverage, et cetera, et cetera. Of course, the matrix is huge and you get to 150 configurations. We have more than 300 pushes per day by developers. And the average push takes 1,500 hours if you were to run it all one after the other. It takes 300 machine years per month and we run around 100 million machines per month to run these tests. If you were to run all of the tests in all of the configurations, you would need to run around 2.3 billion test files per day. Which is, of course, unfeasible.
And this is a view of our Treeherder, which is the user interface for Mozilla test results. You can see that it is almost unreadable. The green stuff is good. The orange stuff is probably not good. You can see that we have lots of tests and we spend a lot of money to run these tests. So what we wanted to do was reduce the machine time spent running the tests. We wanted to reduce the end-to-end time so that developers, when they push, get a result quickly: yes or no, your patch is good or not. And we also wanted to reduce the cognitive overload for developers. Looking at a page like this, what is it? It's impossible to understand. Also, to give you an obvious example, if you're changing the Linux version of Firefox, I don't know, you're touching GTK, you don't need to run Windows tests. At the time, we were doing that. At the time, if you touched GTK code, we were running Android, Windows, Mac, which was totally useless. And the traditional way of running tests doesn't really work on browsers. You cannot run everything on all of the pushes. Otherwise, you will have a huge bill from the cloud provider. So we couldn't use coverage because of some technical reasons. We thought, what if we use machine learning? What if we extend bugbug to also learn from patches and tests? So the first part was to use machines to try to parse this information and try to understand what exactly failed. It might seem like an easy task if you have 100 tests or 10 tests, but when you have two billion tests, you have lots of intermittently failing tests. These tests fail randomly. They are not always the same. Every week, we see 150 new intermittent tests coming in. It's not easy to automatically say if a failure is actually a failure or if it is an intermittent. Not even developers are able to do that sometimes. Also, not all of the tests are run on all of the pushes.
So if I push my patch and a test doesn't run, but runs later on another patch and fails, I don't know if it was my fault or somebody else's fault. And so we have sheriffs, people whose main focus is watching the CI, and they are pretty experienced at doing that, probably better than most developers. But human errors still exist. Even if we have their annotations, it's pretty hard to be sure about the results. You can see a meme that some sheriff created. Flaky tests are the infamous intermittently failing tests. So the second step, after we implemented some heuristics to try to understand the failures due to a given patch, was to analyze patches. We didn't have readily available tools, at least not fast enough for the amount of data that we are talking about. We just used Mercurial for authorship info. So who's the author of the push? Who's the reviewer? When was it pushed? Et cetera, et cetera. And we created a couple of projects written in Rust to parse patches efficiently and to analyze source code. The second one was actually a research partnership with the Politecnico di Torino. And the machine learning model itself is not a multi-label model as one might think, where each test is a label. It would be too large with the number of tests that we have. The model is simplified. The input is the tuple of test and patch. And the label is just fail or not fail. So the features actually come from the test, the patch, and the link between the test and the patch. So, for example: the past failures when the same files were touched; the distance from the source files to the test files in the tree; how often source files were modified together with test files. Of course, if they're modified together, probably they are somehow linked. Maybe you need to fix the test. And so when you push your patch, you also fix the test. This is a clear link. But even then, we have lots of test redundancies.
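The pair-wise formulation described above, one row per (test, patch) with a fail or not-fail label, can be sketched as follows. This is not the bugbug feature extractor: the history format, file paths, and function names are assumptions made for illustration. It computes the three example features mentioned in the talk: past failures when the same files were touched, tree distance between source and test files, and how often they were modified together.

```python
import posixpath


def path_distance(source_path, test_path):
    """Number of directory steps between two files in the source tree."""
    src_dirs = [d for d in posixpath.dirname(source_path).split("/") if d]
    test_dirs = [d for d in posixpath.dirname(test_path).split("/") if d]
    common = 0
    for a, b in zip(src_dirs, test_dirs):
        if a != b:
            break
        common += 1
    return (len(src_dirs) - common) + (len(test_dirs) - common)


def pair_features(test_path, patch_files, history):
    """Features for one (test, patch) pair; the label would be fail / not fail.

    history: list of past pushes, each {"files": set, "failed_tests": set}.
    """
    touched = set(patch_files)
    related = [push for push in history if push["files"] & touched]
    return {
        "past_failures": sum(1 for p in related if test_path in p["failed_tests"]),
        "co_modified": sum(1 for p in related if test_path in p["files"]),
        "min_distance": min(path_distance(f, test_path) for f in patch_files),
    }


# Hypothetical push history.
history = [
    {"files": {"dom/media/Decoder.cpp", "dom/media/test/test_decoder.html"},
     "failed_tests": set()},
    {"files": {"dom/media/Decoder.cpp"},
     "failed_tests": {"dom/media/test/test_decoder.html"}},
    {"files": {"widget/gtk/Window.cpp"}, "failed_tests": set()},
]

patch = ["dom/media/Decoder.cpp"]
near = pair_features("dom/media/test/test_decoder.html", patch, history)
far = pair_features("widget/gtk/test/test_window.html", patch, history)
print(near)
print(far)
```

A binary classifier trained on rows like these can then score every candidate test for a new patch, and only the tests above some score get scheduled.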
So we used frequent itemset mining to try to understand which tests are redundant and remove them from the set of tests that are selected to run. And this was pretty successful as well. So now we had an architecture to train models on bugs, and to train models on patches and tests. The next step was to reuse what we built for patches to also try to predict defects. This is actually still in an experimental phase. It's kind of a research project. So if anybody is interested in collaborating with us on this topic, we will be happy to do so. I will just show you a few things that we have done in this space for now. So the goals are to reduce regressions by detecting the patches that reviewers should focus on more than others, to reduce the time spent by reviewers on less risky patches, and, when we detect that a patch is risky, to trigger some risk control operations. For example, I don't know, running fuzzing tests more comprehensively on these patches, and things like this. Of course, the model is just an evaluation of the risk. It's not actually going to tell us if there is a bug or not. And it will never replace a real reviewer, who can actually review the patch more precisely. The first step was, again, to build a data set. It is not easy to know which patches cause regressions. It's actually impossible at this time. There are some algorithms that are used in research. The most famous one is SZZ. But we had some hunches that it was not so good. So we started here, again, by introducing a change in the process that we have. We introduced a new field, which is called regressed by, so that developers, QA and users can specify what caused a given regression. So when they file a bug, if they know what caused it, they can specify it here. If they don't know what caused it, we have a few tools that we built over time to automatically download builds from our CI that we showed earlier.
They automatically download builds from the past and run a bisection to try to find the cause of the given bug. With this, we managed to build a pretty large data set, 5,000 links between bug-introducing and bug-fixing commits. Actually, commit sets. And this amounts to 24,000 commits. And then we were able, with this data set, to evaluate the current algorithms that are presented in the literature. And as we thought, they are not working well at all. So this is one of the areas of improvement for research. One of the improvements that we tried to apply to SZZ was to improve the blame algorithm, or the annotate algorithm if you're more familiar with Mercurial: instead of looking at lines, we split changes by words and tokens, so you can see past changes by token instead of by line. This is a visualization from the Linux kernel. This is going to give you a much more precise view of what changed in the past. For example, it will skip over tab-only changes, whitespace-only changes and things like that. If you add an if, your code will be indented more, but you're not actually changing everything inside. You're changing only the if. This actually improved the results, but it was not enough to get to an acceptable level of accuracy. But it's nice and we can actually use it in the IDE. We're not doing it yet, but we will, to give more information to users, because developers use annotate and git blame a lot. And this is a UI that is a work in progress for analyzing the risk of a patch. This is a screenshot from our code review tool. So we are showing the result of the algorithm with the confidence. So in this case, it was a risky patch with 79% confidence. And we give a few explanations to the developers. This is one of the most important things. Developers, like any other user, do not always trust results from machine learning. And so you need to give them an explanation.
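The token-level view of changes described above can be illustrated with a small Python sketch. It is not the tool used in the talk: it simply diffs token sequences with the standard library's `difflib`, and the tokenizer and function names are assumptions made for the example. It shows why wrapping code in an `if` only marks the new tokens as changed, while a line-based diff would also flag the re-indented line.

```python
import difflib
import re

# Words or single punctuation characters; whitespace is dropped entirely.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(source: str) -> list[str]:
    """Split source text into tokens, ignoring all whitespace."""
    return TOKEN_RE.findall(source)


def changed_tokens(old: str, new: str) -> list[str]:
    """Return the tokens of `new` that were inserted or replaced."""
    old_tokens, new_tokens = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            changed.extend(new_tokens[j1:j2])
    return changed


old = "total = total + price\n"
new = "if price > 0:\n    total = total + price\n"

# Only the tokens of the new condition are reported as changed.
print(changed_tokens(old, new))
```

A blame or annotate pass built on this kind of diff would attribute the body of the `if` to its original author rather than to whoever added the condition.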
And this is another part of the output of our tool. This is again on our code review tool. For the functions that are being changed by the patch, we're showing whether the function is risky or not, and which bugs in the past involved this function. So developers can try to see if the patch is reintroducing a previously fixed bug. And they can also know what kind of side effects there are when you make changes to a given area of the code. Now, we did a lot of stuff for developers. We trained models for bugs. We trained models for patches. We trained models for tests. We trained models to predict defects. Now I'm going to go to a slightly different topic, even though it's connected. Privacy-friendly translations. So we're working on introducing translations in Firefox. The subtitle was actually translated automatically using Firefox Translations, which you can use nowadays. The idea is that translation models have improved a lot in recent times, but current cloud-based services do not offer the privacy guarantees that we like to offer in Firefox. They are closed source. They are not privacy-preserving. So we started a project, funded by the European Union, to investigate client-side private translation capabilities in Firefox itself. It is currently available as an add-on that you can install in Firefox. We support many European languages and we're working on supporting even more. We're going to also work on supporting non-European languages like Chinese, Korean, Japanese, etc. And in this case, we use machine learning on the client side to perform the translation. So your data never leaves your Firefox. The models are downloaded from our servers, but they run locally on your machine. So the contents of the web page that you're looking at will never go to Google, Bing, or whatever. They will be translated locally on your machine. We use a few open data sets. Luckily, we have lots of them from past research. Not all of them have good quality, but many of them have.
But we are looking for more. So if you have suggestions for data sets that we can use, please let us know. On the data sets, we perform some basic data cleaning. And we use machine learning-based techniques to clean the data, to remove sentence pairs that we believe are bad. Of course, the data sets that I showed before are open, but sometimes they are just crawled from the web, so they contain all sorts of bad sentences. Also, HTML tags and stuff like that, we need to clean them up. Otherwise, the translation models will learn to translate HTML tags. And we use some techniques to increase the size of the data set automatically, like back translation: translating sentences from one language to the other, and translating them back, in order to increase the size of the data sets. So we trained a large model on cloud machines, which is pretty large. You can see it's around 800 megabytes, so for every language pair you would need to download 800 megabytes, and it is super slow, so we can only use that in the cloud. So we use some techniques in order to reduce the size of these models and to make them faster. We use knowledge distillation, basically using the large model that we trained as a teacher for a student model, which is much smaller, so you can see that from 800 megabytes we got to 16, and I think now we're around 5 or 6, something like that, so it's much smaller and you can actually download it on demand from our servers. And we use quantization for further compression and performance improvements, so moving the data of the model from float32 to int8. Then we compiled the machine translation engine to WebAssembly in order to be able to use it inside Firefox. We introduced some SIMD extensions into WebAssembly and into Firefox in order to be even faster when translating, even though we translate a bit at a time, so it's pretty fast. And the engines are downloaded and updated on demand. Let me show you a demo.
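As a rough illustration of the float32 to int8 quantization step mentioned above, here is a minimal Python sketch of symmetric per-tensor quantization. It is a generic textbook scheme chosen for the example, not necessarily the one used in the translation engine, and the function names are assumptions.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights to integers in [-127, 127] plus one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; avoid dividing by zero for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most half the scale per weight, which is where the extra compression on top of distillation comes from.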
So, you can see my Firefox is in Italian, but you can see that it automatically detected that the page is in French and it is suggesting me to translate it to Italian. I will change it to English. Oh, fuck. So it is downloading the model. Now it's translating. So while it was translating, you already saw the contents of the first part of the page was already translated, so it's super quick in the end. And the translation seems to be pretty good. I don't speak French, but I think it makes sense. You can also use it from the toolbar, so you can choose a language and translate it to another. Let's do Italian to French. It works. All right. So if you know any data set that we can use, in addition to the ones that we already use, or if you're interested in building a new great feature in Firefox, or if you want to add support for your language or improving support for your language, come and talk to us at our booth. We would be really happy if you could help us. And before we come to an end, let me show you how far we've come. The dogs have grown, and we have learned that it is possible to tame the complexity of a large-scale software. It is possible to use the past history of development to support the future development, and it is possible to use machine learning in a privacy-friendly way and in the open. What else could we do with the data and the tools that we have at our disposal? I don't know. I'm looking forward to know. I'm looking forward to see what other wild ideas you and us at Mozilla can come up with. Thank you. Thank you very much, Marco, for the amazing talk. Now we're open for questions. If anyone would like to make a question, please raise your hand so I can take the microphone. Questions, questions, hands up. There. Okay, okay. I'm sorry, I'm learning. I'm new to this. I'm coming up. Hello. I have actually two questions. 
First question is, have you actually think about the idea to use this mechanism to automatically translate interface of Mozilla products? Sorry? This thing? Yes. Yeah. So the question is, have you think about mechanism of automatically translating the interface of Mozilla Firefox products, or maybe documentation you already have like MDN, because it's still a demand to translate this stuff? I'm sorry. I'm not hearing well. Can you maybe come closer? From here? Okay. Now it's better? Yes. Okay. So my question is, have you trying to use this mechanism of automatic translation to use this translation for existing interface you have in the products, and especially also documentation part? Because it's kind of vital part when you need to translate new functionality, or you have to translate something new in the interface, you need the help of translator. But if you already know how to translate in doing this stuff, so that means like you already have a data set, you can actually automatically translate new parts of interface without translator? Yes. So it is definitely something that could be used to help translators do their job. We could translate parts of the interface automatically. And of course, there will always be some review from actual translator to make sure that the translation makes sense in the context, especially because Firefox UI sometimes you have very short text and it needs to make sense. But yeah, it's definitely something that we have considered. And actually one of the data sets that we use from the list, it's not possible to see from the slide, but it's called Mozilla L10N and they are sentence pairs from our browser UI. People are actually using it in research for automating translations. Does anyone have any other question? Please raise your hands. If you have any other questions, Marco? Okay. If not, thank you very much again, Marco. Thank you. |
Sustaining Free and Open Source Software
Exploring Community, Financial, and Engineering Practices |
So, yeah, today, I'm going to be talking to you about sustaining free and open-source software, exploring community, financial, and technical practices. I was trying to go for the longest title possible. Do you think it's long enough? Probably a little bit longer. Okay, so my name's Abby. I run open-source maintainer programs at GitHub. I'm also an organizer for Sustain OSS, and I care lots about maintainers and about the open-source ecosystem, so I'm really excited to talk to you about that today. So before I dive in, I do want to say thank you to all of these names, all these handles. About a week ago, I tweeted about this talk, also just asking for examples of sustainable open source, and all of you really came through. There were a lot of responses, a lot of things to think about, and, yeah, it really changed how I was thinking of this talk, so it actually gave me more work, but I think it was worth it. You'll get to hear some insights from all of these people today. So thanks, especially if your name's up here. You're the best. So here's what we're talking about today. So it is my first time at FOSDEM, so, yeah, thank you so much to the organizers for having me here. Yeah, it's been really fun. I've really been enjoying this conference. Everyone told me it was going to be really cold, but I think they forgot that I'm Canadian, and this is really nice weather, so it's great here, so I don't know what they're talking about. But yeah, so I do want to share a little bit about myself, my background, and why I care so much about FOSS, and then we'll go into sustaining FOSS, and spoiler alert, I do think maintainers are a huge part of this, and then from there we'll look at those three lenses around how we can be sustainable through community, through financial, and through technical practices. So first some background. It's a picture of me and my grandma, my Lola, also my little brother's there.
There are pictures of me and just my grandma, but I really liked this one, even though my brother is stealing the show, I don't know, it's fun, and I like how big my eyes look with my glasses, it's cute, but I started my career writing open source software at the Ontario Institute for Cancer Research, so this was really meaningful work for me, because my grandmother, my Lola, she had passed away from cancer, so this idea that I was helping someone else's grandma, someone else's Lola, was really powerful, and it was really, yeah, really meaningful work, but the longer I was in academia, the more I realized how often scientists maybe fudge their data a little bit, or not share parts of their data, just that they can get published, or they can get the best results, and get tenure or get that funding, and this was all happening at the expense of real innovation, so I was really outraged at that, that's when I got really into open science, especially using open source and open data to help us find the best innovations, because I do think if we're sharing and we're building upon each other's work, that's how we'll do the best in the world. So since then, my career has gone from maintaining open science tools to teaching open source best practices, and now at GitHub, I'm supporting the maintainers in this new world. I started about six months ago, so I'm still a bit new, actually seven months, I counted the other day, yeah. So I'm going to be sharing some of these experiences as we go through the talk. 
All right, so now, sustaining FOSS, again, supporting maintainers. So I did frame this talk around sustainability. I think there's a lot of conversation happening in that space, and I do think it's important, but after that tweet and the responses I was getting there, I realized I care a lot about project resilience, and how projects are able to survive over time. So a slight shift, but I think it helped me really think about how projects are resilient and strong and able to withstand troubles that come their way. So to me, the biggest factor that affects software's ability to continue is maintenance. In this changing tech landscape, if something isn't maintained for a while, it will really quickly become outdated. So I really like this quote from Tobie Langel. He's an open ecosystem strategist, and this quote says, in open source, the maintainers working on the source code are the scarce resource that needs to be protected and nurtured. And we're missing a bit of context where he's talking about the commons, and I'm a little bit too jet-lagged to have a commons and public good conversation right now, but I do think this is an important framing, and I think it's important to be thinking about maintainers. I know Nadia Eghbal, in her book Working in Public, also frames this problem as the resource being the maintainer's attention and trying to get their time, but I really like this framing around the person of the maintainer itself, and really what we can be doing to support maintainers, so that we are best supporting resilient open source. So this aligns a lot with what I've seen in movement building, where there's a lot of investment in raising up new leaders so that the movements can continue, and we also see this in corporations, where there's a lot of leadership training. Leaders are really important in communities, and it's the same for open source. Maintainers are really important for these communities.
So if we look at supporting maintainers, we're going to look at these three lenses, and you don't have to address your project with all three of those lenses at once, and that's why I made a Venn diagram, but we're going to look at these three. So the first is the community lens, with succession planning. This one might not be as intuitive, so I'm going to dive into that a little bit more soon. The next one, in pink, is the financial side, so paying maintainers, and this is what we've heard a lot about lately, getting money to them. And the third one, in blue, is the technical side, making it just easy to use and get started. So really supporting maintainers by taking away some of the maintainer burden, but also, if a project is suddenly unmaintained, making it easy for someone to just take it and run with it. And then just looking at the overlap pieces, say you have the pink and the blue, you have the financial piece and the technical piece, but you don't have that community piece. That's often like corporate open source, especially if they're not really investing in the community, and I think that's still valid and still fine, and this way it's okay because they can usually just pay new maintainers to come on board. So you don't really need all three of them to be completely sustainable. And then if you have just the green and the blue, so the community side and the technical side, but not that financial piece, that's like maybe a hobby project. I also really like the open source archetypes, this is a report that was put out by Open Tech Strategies and Mozilla, and they have an archetype called single-maintainer houseplant. So if you're just a maintainer by yourself, just watering your little plant, you don't really need funding, funding might actually just hurt the project, and if it doesn't need that much maintenance, that's also fine.
And then the last piece there, where you have the pink and the green, so you have the financial piece, you have the community piece, but really it's technically like it's just hard to get started, hard to use. That's often specialty libraries where you need like a really deep expertise in something to get involved. So those communities tend to have like very few maintainers that are very specialized, and they're still able to sustain over time, but they have other pieces to help with that. And you can probably survive with just one of these three, but I didn't map out all of them, it's a lot of things to map. All right, so the first one we're going to dive into is that succession planning, it's the community practices. So I did take this term from the corporate world, so companies often like look at succession planning to make sure that their companies will survive over long periods of time. And also you don't need to have this in open source, like BDFL is a model that we've seen many successful open source projects with BDFLs, but I do think that like tech is relatively new and open source is even newer, and that we're only starting to see people retire or we're starting to see people like get tired of maintaining packages and just delete them all. So I think we'll be thinking more about succession planning as we're seeing this natural churn of people go through the ecosystem. And I did have a conversation after that, I am really glad I tweeted about this talk. I should have said I set a goal earlier at the end of last year, it's a tweet every day, I just wanted to push myself to do something, and I would not have tweeted that if I didn't have that goal. I like to just sort of work on things on my own, but I'm glad I did that, so I'm working a little bit more open. 
But I did have a conversation with Deb Goodkin, she's the executive director of the FreeBSD Foundation, and she pointed me towards this quote from Kirk McKusick, he's one of the original core maintainers for FreeBSD, and he says, a successful project has to be able to change the leadership, otherwise leadership becomes dead wood, which leaves the project rudderless. And that was Kirk writing this, it's really small on my screen, in a piece on building and running an open source community, the FreeBSD project. But I do think this view of succession planning is helpful when thinking about the community, and it's really helpful for large communities where it can be really overwhelming working with everyone, but if you're prioritizing succession planning, then you know where you can have the most impact for your community long term. So an example I'll talk about, I think the easiest way to bring in succession planning is when you're selecting who to mentor in your project. So I like to suggest these three criteria, so that you're selecting people who are most likely to become leaders within your project. So first is mission aligned, they actually care about this work and want it to succeed. So I know you've probably seen many people come to your project that are more extractive, they don't care as much about the mission, they just want to get something out of the project. They're still good to have in the community, you can still welcome them, I just wouldn't invest a ton of time in them if you're mentoring people. So first make sure they're mission aligned, they're there because they actually want the project to be there long term. Second, available: do they actually have the time and resources to contribute? And it's great that we have these mentorship programs where they can get paid for their time, so this does open the door for more people.
But if someone really wants to be part of this project, but then they don't have the time, I wouldn't also invest that time into mentorship unless they get that time, or unless you get the funding to help support their time or whatever is needed there. And then the third one there is willing to learn from whoever is mentoring them. So you might notice I didn't put skills needed, I know some projects will add skills as a different set of criteria, but I think it's much easier to find someone who's just willing to learn those skills that they need. And there will be a lot of project knowledge that you'll need to transfer on to them as a new maintainer. Yeah, but I would add anything that really can't be taught. You can put there. Yes, I think that's what I had for this side. All right, so I'm going to look at this case study. So this is work that I did at Mozilla, and I was running Mozilla Open Leaders. And each of these dots is a project that went through our training process. And yeah, so it was a mentorship program. So when they entered, they had to apply, and we would give them questions based on that criteria I shared before. So the mission for this program was really about strengthening open projects and communities around the world. So you'd ask some questions around what does openness mean to you, or why do you want to work open to see if they were mission aligned, or did they just come to get some training? We asked questions around what do you hope to learn from a mentor, how do you learn best? And then also made sure that they understood, hey, we want 10 hours of your week each time. And really setting those expectations. So by setting those expectations, we set quite a high bar for people coming through. But I think if we hadn't set that, and we were just getting people who maybe wanted to be mentored, they may not have been willing to give that 10 hours. But setting them up front lifted that bar. 
So we got more people that were really excited about the mission and wanted to help. Yeah. Oh, yeah, so then doing that really helped with some incredibly high retention rates. So especially at the beginning, you'll see the orange dots are the people that came back and mentored others. So they actually did become leaders within the community. So our retention rate was as high as 85% in the early days. It gets lower as you scale, because it is harder to keep that quality up over time. And then even after the program finished, so we did spin down this program, we spent one more round decentralizing it, and just working with past graduates to run 10 community-led projects. And many of those are still running today, and they continue to mentor hundreds of students, not students, projects a year. And when I last checked, they've collectively raised over a million dollars from funders like NASA and the Chan Zuckerberg Initiative. And I haven't checked in about a year, so it might be more now. But it's exciting just to see, even though I wasn't running this anymore, this mission was still going on. There's people still learning about open source in their fields. And I was going to make one more slide with all the people doing this, but I was very tired today when I was making slides, so I didn't make one for that. Okay. All right, so succession planning. So just a couple of notes. Adding a selection step and criteria will really help you prioritize who you're going to invest time in. And you just can't mentor everyone. There's a lot of people that will want your attention in an open source project, but this is a nice way to sort of filter, so that you're prioritizing for the long-term success of your project. Another place where succession planning can be helpful is when you're thinking about FOSS incubated at a company, or like corporate open source.
So, especially if there's open source happening at a company that's not part of the core business, it might be time to think about where that project will go next in terms of succession planning. So a couple of examples, maybe that will be spinning out into a foundation. So an example is the Rust project, so that was at Mozilla for a very long time, but they spun out and now they're at the Rust Foundation. Or partnering with other companies for shared governance. So the example I put there was Kubernetes, which was at Google, but now it's part of the shared infrastructure, so a lot of the big tech companies have this shared governance over this shared infrastructure that they all use. Okay. The next circle was the pink one, pay maintainers, yay. So some financial practices, oh yeah, I'm a huge fan of GitHub, I work at GitHub, but a lot of times when a project joins a foundation, you won't necessarily get money directly to the maintainers. You'll get a lot of support, like ecosystem support, but I am so glad that there's companies like this and products like this, like GitHub Sponsors, like thanks.dev, they're here, yay. StackAid, Open Source Collective, Tidelift, there's so many different groups just helping money get directly to maintainers, which is amazing, and I'm glad we're seeing so many of them. I actually saw this tweet from Justin Dorfman this morning when I was still making my slides. He wrote, if you told me 10 years ago we would have multiple businesses building services to help sustain open source projects, I would have been skeptical, can't imagine the next 10. So I completely agree with Justin, it's great seeing so many of these. So just to share a little bit more about GitHub Sponsors, there's this explore page. So if you go there, you'll see the projects that your account depends on, and which ones are eligible for sponsorship, to make it a little bit easier for you to sponsor projects.
And one of the first projects that I worked on when I joined GitHub was this Maintainer Month sponsorship where we had half a million dollars, and we distributed that across all of our dependencies, just evenly across everyone who had Sponsors turned on. So it was really fun seeing everyone get their 500 bucks as part of Maintainer Month. And that's going to be part of the actual product soon. So bulk sponsorships are coming out, so it'll be much easier for you to just upload a list of maintainers that you want to give money to, and then it will spread it out. And then just a little bit more about paying maintainers, the funding landscape has changed dramatically in this past year or so. There's far more money available in open source. So the Open Source Collective saw a 30% increase in payments to maintainers. And there's a lot more emphasis on securing the supply chain, especially after the Biden administration's announcement with that executive order on improving the nation's cybersecurity. And then this is an NSF grant, the Pathways to Enable Open-Source Ecosystems. There's just so much more money going into open source. I only highlighted like two of them there. There's a lot. So it's great that money's there, but how do we actually set projects up in a way that will give ongoing support to maintainers? And this is a really complicated question, especially if there's multiple maintainers on a project, like how do we do this equitably, which maintainers get paid? I'm not going to answer that here. Again, I'm far too jet-lagged for that, but we do have some models that I think we can follow and learn from that seem to work for supporting certain types of open source projects. I am really excited for the GitHub Accelerator. I think they're gearing up for their first cohort right now. Applications closed at the end of last year. But talking to Naytri, she's really interested in just seeing these new models arise.
But I'm going to talk about just three models that we have today. I grouped a few things. I don't know if you can tell, but I'm a really big fan of the rule of three, so I will tell you things in threes. But the first is tips and donations. So this can be going to an individual, or the donations can be going to a foundation. The second is getting hired at a company to do open source, and this is when the project is not governed by that company, it's governed outside, but they're paying you to do open source. And then the third one is that the company actually does open source and they're governing it, and then you're part of helping that. So I'm going to go through these three and talk about some examples, and when it works really well and when maybe it doesn't. So the first one, with the tips and donations, again, it can be to a foundation or an individual. This works really well for visible projects. Vue is an example that's gotten a lot of sponsorship and donations, Evan You is really good at that marketing, and Vue is a project that many developers use. It's very top of the stack. It's not really hidden. So people make a conscious decision to use Vue, which helps them make a decision to sponsor Vue. The second is content creators. Caleb Porzio has written a lot about how he's able to sustain a career just on GitHub Sponsors donations, and a lot of that is he's creating content around the open source that he's building. So it's more of a Twitch stream model where people are excited to see it, and he's a really good marketer, I think. So it's great that people are able to work like that. I also don't think that an open source maintainer has to also be a marketer. There has to be a way that we can get money to those people that don't naturally have Caleb's gift for marketing. And then the third is maintainers in lower cost of living areas. So I actually interviewed Kovid Goyal, I'll go to the next slide. So here's a little clip.
So I interviewed Kovid Goyal and Carlos Becker. Kovid maintains calibre and Carlos maintains GoReleaser. Yusuf Victor was also in this panel, but he didn't talk about this topic, so he's there. But if you watch it, you can see him. But both of them agreed that the strength of the US dollar really helped. So Carlos is in Brazil, Kovid's in India. And Kovid was saying that by moving from the US to India, he was actually able to sustain himself full time on the open source donations. He actually has a really cool story. He talked about how when he was in grad school, his parents would give him some pizza money. And then something happened with a credit card, and it stopped working, so he was like, well, what if I just turn on sponsorships, and this is before Sponsors existed. So he just turned on a little PayPal donate button on calibre, and people started donating. And then he was able to have that pizza money, and it grew to be a little bit more than pizza money. And then when it came time, when he and his wife both finished grad school and they were thinking about what to do next, Kovid was like, oh, maybe I can just do this full time, because they were going to move back to India. And he was able to. So I think that's a really cool story, just how certain projects are able to survive off of donations. But I'll say it's a little tricky. All right, the second one is getting hired at a company to do free and open source software. So this is a great way for a company to have a direct line to the project, where they actually are paying for a maintainer to work on it. And it's great for the project to just have maintainers stably employed. So the example I put here is Keeley Hammond, she works at Slack, and she's an Electron maintainer. And I like this ReadME Project piece on her, so I encourage you to go read it.
But she just talks about being an Electron maintainer and working at Slack and just how she's bringing in more of these non-code contributions. And we see this model happen a lot. I think Electron is the one that I'm highlighting here. But you do have to depend on that company to continually be sponsoring these folks. And the third one is that the company does free and open source software on their own. So there's a few ways, this is a very sparse slide I'm just realizing. So if it's part of their core business model, I think Next.js and Vercel is a great example of this, where Vercel sells the hosting services so that it's just really easy for you to deploy your Next.js apps. And then I also grouped consulting services as part of this "company does FOSS" model. I know some people consult individually, and a lot of people set up a company to do the consulting. So companies will hire you for your expertise on that project. And then you can sustain yourself that way. So again, from the Twitter thread, the Janus or Janus, I didn't learn how to pronounce that, the WebRTC server, they're able to sustain themselves through this consulting business. And that works when you're, again, high enough on the stack that people will realize they're using you. All right, so yeah, just to give an overview of what we talked about, these are the three models that I think work today. But I still think we need more models. I know talking to Nathan with that GitHub accelerator, she likes to ask that question, where is the Etsy model for open source? Where are the people who just create and then are able to get paid for doing that? It's hard with open source, because software isn't a thing that you can just take, but I'm hoping that this accelerator will help us explore that question and maybe try to figure out a nice model for that. All right, and that third lens was the technical practices. Just make it easy to use and get started with your project, technically.
Again, this helps with the maintainer burden. Maintainers won't have to walk you through getting your environment set up and everything. But also, yeah, if the project is suddenly unmaintained, ideally someone can just pick it up and go. So a few things, the best practices. The first three I actually took from when I was consulting with the UNICEF Venture Fund, and this is actually a requirement for people to graduate from their incubator. They need to have documentation for everything. They need to have greater than 80% code coverage for tests and have CI/CD involved. And the cloud developer environment, just making it easy for people to get started without having to figure out their local machine, I think, makes a big difference. Okay, so we're going to go through each of these again. With documentation, I know I already quoted Kirk McKusick, but when I asked Deb, like, what did FreeBSD do really well early on that led to where it is today? Kirk's response, he gave me a huge list, but one of them was to emphasize documentation from the start. So making sure you're documenting things is really important. The next piece is testing and CI/CD. Again, it's just so much easier for people to come in and contribute to your code if there are tests. They can realize when something is broken. It's easier for you as a maintainer to know when you can merge something in or when you shouldn't. And then this is just a little GIF showing how nice it is to have this all integrated using GitHub Actions, where everything is just all there together with your code. And then the last one, that cloud developer environment, I think Codespaces has been really nice for this. And you can just use the green button, we missed it in the animation, but then it launches this cloud IDE and then you're just right in the code right away. And now that Codespaces is offering 60 hours free every month for each individual, this is a great way for people to start coming into your project.
And I was going to bring some of these business cards so you can talk to Craig. So Craig, he loves open source. Again, I am jet lagged and I forgot these business cards at this event venue for later today. But you can take a picture of this if you want to talk to Craig. That's the link. And he's more than willing to help you get Codespaces set up so that individuals can just come to your project and get started quickly. All right, so just an overview: supporting maintainers through these three lenses. From the community side, there's succession planning. From the financial side, paying them. From the technical side, making it really easy to use and get started. So if you are interested in any of these conversations, at Sustain, I am one of the Sustain OSS organizers, we do have a conversation every other Friday where we talk about topics like this. So a lot of the information I got was from those conversations, so you can go to Sustain's Discourse over there. But I also run GitHub's open source maintainer program, so if you want to talk about this with other maintainers, we do have a private maintainer community where we are just relaunching some new programming. So if you go there, you can request an invitation to the community and get involved. So huge thanks to everyone, I really appreciate being here. You can find me, I'm abbycabs on GitHub and Twitter, I'm also on Mastodon. And yeah, thank you so much for having me. I'll now take questions. Thank you, Abigail. We have questions for Abigail. Just raise your hands so I can get the mic over to you. Thank you. As I was preparing, I was like, oh, this will be easier if I have a long question period. But now I'm realizing I'm really tired, so please be nice to me with your questions. Hi. Thanks for the talk. So you are talking about ways to become a maintainer.
There is another way that you don't talk about, which is working at a company and then convincing your CEO to open source, for example, your core business. And so if I was CEO of a company, how would you convince me to switch to open source? Oh, all right, yeah. So you're saying that the other way to survive as an open source project is to convince your company to run an open source business. Indeed. Yeah. So there's a lot of advantages to working open. I've talked about them before. I don't have them all written here, but the big ones are, like, better products when you have a diverse community contributing, especially if your company isn't representing the diversity of the audience that you're working for or serving. Having that, that's good. Reduced production costs if you're sharing it across a big group. And also reduced support costs. If your community is answering each other's support questions, that's another way of reducing costs. And what's the other one? Also, if you want to be a de facto standard or if you want to be a platform, by giving something away for free, more people will adopt it and then you can become that platform. So we see that with Android. So those are a few of the reasons I'd give off the top of my head, but it would depend a lot on what company I'm talking to and what their priorities are. And it's tricky to find a good business model that works with open source, but when you do, it's beautiful. Thanks a lot. What would you answer to someone saying, I'm not going to open source my code because I fear that people will be stealing it? Sorry, can you repeat? You don't want to open source your code because... People would be stealing it and copying basically your product. Oh, because you don't want to be copied? Yeah, exactly. You don't want to be stolen? Yeah, that is a tricky one. You could license it less permissively, so that if someone copies it, they'll have to license it the same way. You're answering really...
You're asking tough questions. I asked for easy ones, but no, this is important. I say, yeah, you have to structure your business in a way where it doesn't matter if someone takes your code. So like the example I gave with Next.js and Vercel, like Next.js, anyone can take that, anyone can run it and do it themselves, but they're selling the server time and that computing time. So it doesn't matter to them if it gets copied. But I don't know, copying stuff, that's more innovation. It's better for the world often, so thank you. Thanks a lot. Any easier questions? There's a couple. You're next. Hi. Do you think GitHub itself will be open source in the future? Do I think GitHub itself will be open source in the future? I know that... Is this recorded? Yes it is. Yeah, I know... I'm sure it's fine if I say this. I know with Thomas Dohmke, he's the CEO of GitHub, he said in the past, like, what's stopping us is that if we open sourced GitHub right now, we just wouldn't be able to handle all the contributions and all the community coming in. I think that's one of the big reasons why they're not doing it. I wouldn't rule it out, maybe eventually, but yeah, our business model doesn't really... Yeah, the code itself, it's not there. It's more the hosting and the CI/CD, Codespaces, things like that, where they're generating the revenue. Oh, and Copilot now, yeah. Okay. Thank you. Thanks. Hard questions again. I hope it's okay I said that. I'm sure no one will report me. Oh, this person was next. Okay. After that. Okay. Hi. Thanks for the talk. What would be your approach to sustaining a conference app, like the FOSDEM one, which is only used for a weekend and then people do other things over the year, and yeah, like, how do you go about this? How would I go about sustaining a conference like FOSDEM? I think FOSDEM has done a really good job sustaining itself. It's grown so much, 8,000 people.
It's just amazing to see the community come together and people get really excited about it. I don't know if I'd do anything that much differently. I think, from what you said about how it happens only once a year, if you're thinking about keeping people engaged through the year, then I might think about doing more online things in between as touch points, or having more local things so people can have their FOSDEM in their city, the local meetups, just to keep people engaged over time. But I don't know, FOSDEM seems to be going really strong unless there's something I don't know. Thank you so much for a great talk. I was a little bit surprised when you mentioned, in the discussion about how to sustain a project financially, the example of a person that moved to India and was able to sustain himself by working on FOSS, which is fantastic. But you used the word surviving, which seemed a little, yeah, it just surprised me, because I think everybody can agree that people that maintain full-time FOSS projects, which are the infrastructure of a lot of the world, of IT really, should be, well, we all hope that one day we'll be able to pay these people a reasonable salary for their efforts, and not have them just surviving. Now with that said, like you say, it's tricky to find business models that can financially sustain these projects. And GitHub is a company that has done so much for open source and plays such an important role, and it also has enabled a lot of funding to start to arrive to maintainers. Now do you think that GitHub is done finding new models to sustain developers, or is the conversation with GitHub still open, where we could provide further feedback and novel ideas to tackle the business model problem? Is GitHub open to new ideas? Yes, GitHub is open to new ideas. Actually, join that private maintainers group.
I think that would be a really interesting avenue to have those kinds of conversations. And I think one of the big reasons why the GitHub Accelerator was set up was to help us try to find these models and better ways to pay people. Because like you said, I probably shouldn't have used the word surviving, but it's nice that Kovid was able to do that, but that shouldn't have to be the norm. There's no way we could make all the open source maintainers live in India. I'm sure it's nice there. But yeah, there has to be a better way that we can actually sustain these people. So I think there's still a missing link. So I think you've identified there's still this missing piece and I don't really know the answer. There are people having these conversations and we're definitely open to hearing your thoughts. Thank you very much. I have a question. Oh, yeah? Yeah, my question would be a follow up, similar to what he just said about the financial maintenance of open source practitioners. So like, do you guys have a mechanism? How do you determine how you reward somebody that is maintaining an open source platform? Like, how do you determine how much he should be rewarded? Sorry, what was the question? Like, how do you determine how much someone should be rewarded? Oh, how to determine how much someone is worth? Yeah, how he's rewarded, like for his involvement in an open source project. Yeah, yeah. Yeah, because like, for example, you might have two people from different locations both contributing the same amount of hours, but then ending up receiving different paychecks at the end of the month. Yeah. Yeah, no. And I think, just repeating the question for everyone, if you have multiple maintainers, how do you know how much money should go to each person? And I think that's a really big question today. And I wasn't prepared to answer that in this talk.
But I think, no, I've definitely heard of some projects where like one person is maybe in India, the other person's in North America. And then if they pay them evenly, the cost of living is so different. It just causes issues. Or like if one person started the project and they're still getting paid, but they're not an active maintainer anymore, like what happens there? There's a lot of open questions like that. And I think that really depends on the project. It really depends on the project governance and dynamic. It's really, that's a human problem. Yeah. But I agree. That is a big problem today. And it's a big problem for open source. How much time do we have? Okay. Okay. So this might be the last question, I think, yeah. I'll come here. Yeah. Hi. I'm a member of several organizations. And in all of them, we are dealing with the same problem. We have new members coming in every year. And there is nobody to give them orders, basically. They would be very happy to help, but nobody tells them how. How can we get people who would basically act as middle managers for these organizations? Oh, man. Yeah. If you're in an organization, you have new people coming in, but then no one is telling them what to do or how to help. I think that's where establishing mentors is really important. And just finding people already in your community who are willing to onboard others. So that's why I really like programs like GSoC, Google Summer of Code, because it's good for bringing in the newcomers, but it's also good at finding mentors in the community. And this might be the first time they're stepping up in leadership. So that's a nice incentive, because they get a little bit of a kickback from Google also for mentoring. And they get the Mentor Summit and things like that. So yeah, I would think about it that way. What can you do to encourage mentors? It might be making a role for it.
It might be giving them special titles and giving them some fancy thing. I don't know. A special dinner. It depends on what your community likes. Yep. Thank you. Cool. I think that's all for time. I'm around, so you can come chat with me. I did bring stickers and buttons, so you're welcome to come up here and get some. I have slightly fewer than I had for my talk this morning, where I accidentally gave far too many away. But this is how much is here. So thanks, everyone. It was great chatting with you. Thank you. |
Perspectives from the Open Source Developer
A Window into the Developer Experience from Linux Foundation Research |
Hello, FOSDEM, hi. Wow, it's so good to be here. You are the heroes of FOSDEM because I am the closing act and you have come to see me. This is so good. Thank you very much for being here. Six o'clock. Let's get the show on the road. I am from the Linux Foundation and I have the privilege of doing research on the open source ecosystem. I was brought in a couple of years ago to use data as a means to describe what is taking place across open source communities and to provide greater insight for the purposes of supporting all kinds of programs in the ecosystem. If you were here for Abigail's talk previously, a lot of what she had to say was to do with sustainability and funding projects, and I am hopeful that the resource that we are creating at the Linux Foundation is the utility that will help support grants and funding for otherwise non-commercial open source projects, especially those that are contributing to sustainability. So that's the perspective that I come from. We want to investigate the impact of open source projects through research. We conduct our research through quantitative and qualitative methodologies, surveys, interviews and empirical research, and we create this resource, co-create it with the community. We create it by the community and we give it back to the community. The reports are all published under Creative Commons. You don't have to give your email and we're not going to spam you for reading them. So they're a resource for everybody to leverage. So check us out. This is our home page if you want to scan and discover the publications that we have produced to date. We have 26 publications in just shy of two years on a whole variety of topics that I'll get into. And the Linux Foundation has a history of doing some important research in open source.
Not that long ago they published the Linux kernel report, the Linux kernel history report, as well as the 2020 FOSS contributor survey in collaboration with the Laboratory for Innovation Science at Harvard, which was a deep dive into understanding contributor motivations, opportunities and challenges. So if you haven't seen either of those reports, I'd encourage you to check them out. The 26 reports we've done to date are across a number of different frameworks. It's really tough to package up open source conceptually and do a bunch of research to say how we are going to organize this. So we set about, actually before I get down that path, a quick example of some recent reports that you'll find at the top of our website. They include mentorship in open source, which also plays into Abigail's speech about sustainability and the role of mentorship programs, and enabling collaboration across open source communities all around the world. How do we enhance collaboration? How do we break down silos and how do we reduce fragmentation? Industry reports like the 2022 state of open source in financial services take a deep dive into an industry sector. And for our friends at the SODA Foundation, hello Steven Tan, the 2022 data and storage trends report. So just a quick highlight of four recent publications that have come out. We've got great stuff coming down the pike as well. Looking at the idea of open source for digital wallets. Why do we need an open source digital wallet infrastructure? Well, essentially we have centralization in the wallet community right now through Google Pay and Apple Pay. We're creating infrastructure through an open source foundation that's going to create opportunities for any organization to build a secure digital asset infrastructure.
Web3 and sustainability, looking at some of the challenges around blockchain and how we can reduce the environmental and carbon impact of blockchain, and how blockchains can reduce our own impact through providing secure data sets and perhaps incentives, real monetary tokens for us to change our behavior and start making sustainable choices and be rewarded for them. Be rewarded for buying the carbon neutral product and use the token to ride public transit. But you can't use the token to put gas in your car. That's an example of the economies that are coming down the road. And our deliverables, everything we do is available under Creative Commons. We publish our data sets, our survey data that is free of personally identifiable information. And we do so on data.world, a data repository essentially, not unlike GitHub, but a little bit more accessible for the non-coder to be able to find our data sets. And if you want to play around with our data and create your own analysis, go to data.world, check out the Linux Foundation data sets and feel free to explore some of our survey data. And see what you come up with. Let us know if you find something different. We'd love to hear from you. We produce reports, infographics and so on. So check out these resources, and if there's a way that you can use them to support your project communities, go for it and let us know how you make out. So the first way we conduct research is on geographic lines. We think that there are some unique trends taking place in different regions around the world. What's happening in Europe that might be different from North America? Is there a different culture? Are there different opportunities? And what's going on? The first report we published was last year, our Europe Spotlight. So if you're interested in how European open source communities are taking advantage of the open source opportunity or not, give this report a read. It's got some pretty interesting insights.
The second way we go about conducting research is along an industry analysis. So if your project is specific to a certain industry, whether it's film and entertainment, maybe it's financial services, maybe it's energy, our research is very focused sometimes on industry specific open source projects. The energy example is really exciting. And the more that regulators know about open source projects in an energy vertical, the better, because we need to transition off of older infrastructure and open source is the path to doing that. One example of an industry project is in the motion pictures industry with the Academy Software Foundation. Many of the projects that were used to create the nine-time Academy-nominated film Dune were built with open source software. And we host nine projects at the Academy Software Foundation. And this research report tells the story of how it was formed, what needs to happen in that community to keep it sustainable, and where the challenges are. The technology horizontal is another way we approach our research. A recent example is resiliency in cloud. How do we achieve multi-cloud resiliency? We did this with the Cloud Native Computing Foundation and it was a result of a round table we hosted at KubeCon in Valencia. And finally, and this is where the developer community comes into play, you'll see at the top of this graphic the developer contributor framework to view open source opportunities. What are the issues specific to developers, to maintainers, to sustainability, to diversity, equity and inclusion, leadership, racial and social justice, and so on? So we've got a lot of reports that don't fall neatly into a tech horizontal or an industry vertical or a geographic framework, but they apply to the whole ecosystem. A recent example, DEI.
I happened to co-author this one with a woman named Jessica Groopman, and it really talks about the good, the bad, and the ugly about diversity issues across open source project communities and how we can collectively work together to address some of the challenges. But importantly, we've got more and more work coming that is focused on the developer and maintainer experience. And it is very, very important that we continue to explore the issues so that we can provide better programming and resourcing to support our developer community. How do we ensure sustainable projects? How do we ensure diverse projects? How do we ensure funding? How do we incentivize developers to implement security best practices? And we find the answers to these questions through research. And today I'm going to share some of the insights that have come across the 26 projects that we've done to date, which in all different ways have some story to tell about the developer experience. So this is what this collection of next slides is all about. Devs are having a really good time doing what they're doing. Many projects that we've done have said that open source development is fun. And time and time and time again, this is almost a universal truth, that people are here because they find it a stimulating and rewarding place to be. And that's kind of exciting. It's great and it's probably why open source projects are so prolific. People enjoy contributing code. Out of our Europe study, fun really is a leading motivation here. Much more so than, say, career development. We concluded, also through interviews, that there's sort of a romantic thing going on in Europe that is less prevalent in other geographic regions. And it really is around doing good, creating solutions that have a purpose. And there's a strong motivation for learning and fun in this community.
Open source is also an incredibly empowering space for our respondents, the majority of whom are developers: the opportunity to create infrastructure, to create projects that have intrinsic value for a region, that allow for sovereignty, that allow us to not be reliant on certain vendors. This is a really motivating factor. And being in this space is an incredibly empowering and rewarding activity. We have an opportunity to change the world if we don't like it. What's fascinating is the extent to which developers are taking a personal interest in their work. They show up because they care. And really, without that personal attachment and intrinsic motivation, we just wouldn't be where we are, especially when we consider how many projects do not have funding. And they are not part of a commercial ecosystem. So this motivation keeps these projects going. The other side of this coin is that if there's a loss of motivation of some kind, if there's the feeling of sudden apathy, then a project can just be shelved and somebody can sign off and say, I'm out. So that is a risk that we face. And we've seen examples of that recently, though I couldn't name the specific project. You hear about people having a change of heart. Training is also something that's incredibly important as new technologies are emerging, as security is becoming increasingly important. Organizations are looking for highly skilled, highly trained open source software developers. And the skills that they're looking for now have eclipsed Linux as a focus point. Skills are sought after in cloud, in DevOps, in security, and yes, very much so in Linux. But we're also seeing new projects emerging in Metaverse, in AI, PyTorch being an example. So training is very much in demand. And fortunately, there are good resources available that are often free. We host many of them at the Linux Foundation, so check them out.
The other bit of good news is that, if you're fortunate enough to be in a situation where you're paid to contribute, companies are paying for training. And 90% of hiring managers will say, we'll hire you, come here, and we will pay to skill you up. And the reason being, not only for the hiring managers but within organizations, we can't necessarily hire our way out of digital challenges. We have to sometimes work with the talent pools that we have and train our team members. Because it's incredibly costly to go out and pay a developer who you think has the right skills only for them to leave and go to a competitor. So training a resource in-house seems to be a trend that we're seeing specific to DevOps. And fortunately, everyone in this room is in a really good position because the demand for your skills exceeds the supply of skilled people available to provide the service. You're in a very, very good space right now. As industries are becoming increasingly software defined, you are calling all the shots. So leverage it to your advantage. Sometimes there are barriers that prevent you from doing the things that you really want to do. I'll give you an example. You work in industry full-time, you have a paid job, it's in a highly regulated industry like financial services, and the company policy says that, sorry, not only can you not contribute on behalf of the organization to open source projects that might be beneficial, but you also can't contribute to projects in your personal time. And that's a barrier that industry needs to figure out how to solve because it is ultimately to their detriment. And the regulatory community, particularly in certain verticals, has to come to terms with the realities of the way open source is developed and not put up so many barriers against things that are non-threatening.
I'm thinking of community chat rooms on a repo within financial services and the SEC freaking out in case regulatory policies were being breached in any kind of activity. Other barriers for certain types of industries are around access to Slack or even access to SurveyMonkey. We do surveys, and it's really challenging to try to get data through some corporate firewalls. So there are, like, non-tariff barriers. Understanding dependencies is really very, very important, and there are lots of tools out there that can help developers better understand them. It's the number one benefit of producing SBOMs: better understanding the dependencies in your projects. Here is a truth, and I don't know if anyone here would disagree, but devs are pretty amazing project ambassadors. Carol Payne from Netflix speaks publicly and often about the projects she contributes to and inspires other people to get involved. She helps recruit new developers to projects and she feels that she has a responsibility to do that, and we see so many devs advocating because it's the right thing to do and it's incredible for building a community. For all of those of you who are ambassadors for your project, keep up the good work. It makes such a difference and it's so nice to see people being the champion of one project or another. Open source development in particular has its benefits. I recently gave a talk to an organization where a lot of people were just working in a closed source environment, and they're not necessarily aware of what the benefit is of working in open source. Of course everyone here knows that you have the opportunity to learn from others and build on the work of others. You don't actually have to go it alone. There's so much less pressure and less stress and tremendous upside, but these truths are not necessarily widely known. So that's what I have to do, is say, look, these are great places, great spaces, there's a lot of benefit, open source is a good thing.
Take advantage of the opportunities. I am a different kind of ambassador, telling this story about why open source actually works and what the value proposition is. And devs mitigate risk by this very openness, by accepting external viewpoints. One of the things that I had to come to terms with when I started working for the Linux Foundation is the open sourcing of the research development process. Having so many eyes on a research paper before it gets published, that was like a shock to me. And because of the culture I am in at the Linux Foundation, I work with incredibly smart people in a whole bunch of different disciplines. They are all weighing in on my report and I'll tell you something, I've never had such high quality research as a consequence, because people genuinely care. It's not necessarily in their job description, but they are weighing in on the document because they can, and so they do, and they genuinely want to make something better. So my experience is that this methodology really, really works to create high quality outcomes. We're definitely onto something here. And yes, devs can further mitigate risk by taking personal responsibility for implementing security best practices, including the increasing use of multi-factor authentication tokens to achieve greater account security. So we're spreading the word about how we can improve the security of software supply chains, what tools are available, what resources you need, and we're really just getting started here. This particular factoid came from the report we published this year with the Laboratory for Innovation Science at Harvard, which was Census II. Essentially, three software composition analysis vendors, Snyk, FOSSA and Synopsys, ran scans to identify the most commonly used application libraries so that we could have a measure, it's not a perfect measure but it is a measure, of the most popular software.
So if we know what the most widely used applications are, we can prioritize our efforts to secure them. That's the rationale behind some of the research we're doing, and that's where this particular point came from. And yes, you shoulder a lot of responsibility. The FOSS Census II validated again that the most widely used free and open source software is developed by only a handful of contributors. In one dataset, 136 developers were responsible for more than 80% of the lines of code added to the top 50 packages. So how do we share that responsibility? I don't know, but we're working on it. If you have ideas, that's what we're here for. I want everybody here to take away that research through the LF is a mechanism for you to be heard. If a stat seems wrong, or if you have an opinion, let's hear your opinion. We would love nothing more than to hear from you, so as to change programming and resourcing in ways that will better support you. So let this be your mechanism to share your perspectives and knowledge. Toxicity in open source is real. It was real before we did research on it, but one in five open source devs, or professionals I should say, have been discriminated against or felt unwelcome. And it is minority groups that are most acutely affected. The minority populations of our community really struggle with not feeling welcome in open source. And why is this such an issue? Well, it's really important because diverse technologies are better technologies. Diversity creates better tech. I'm from Canada, by the way, and we have a thing called postal codes, which differ from zip codes, and they differ from European postal codes as well. Because we're so often filling out forms specific to the United States, I'd like those forms to have an extra space where I can put my postal code in, because that's just a technology frustration. But sometimes the designer of a form realizes that they need more than zip codes for inclusion: let's include the Canadians.
That's great. That's how you get better technology: you think outside your lane and you increase opportunities for others to participate. And here is Rachel Rose from ILM. ILM is the Lucasfilm production unit, Industrial Light and Magic, making all the Star Wars stuff and making some pretty good films with open source software. And nobody says it better: when we have people with varied backgrounds and opinions, we get better software. Thank you, Rachel. We've got upcoming research that's very specific to maintainers, because of the responsibility that is shouldered by maintainers. Before I jump into this particular point, here is what we did. After we published the second census report, Census II, which was scanning those application libraries to identify the most popular software, we tried to reach as many of those maintainers as we could to have a conversation with them and say: what is going on? What do you need? What are your challenges? What are your constraints? How's it going? Do you have a successor? What was your pathway to here? Are you paid? How much time is on your hands? And it is this methodology that is producing the next couple of slides. This report is still in development. We've got a really solid draft, and we will be publishing these insights at some point in the quarter. So I'm really excited to get a window into the world of the maintainers of some of the most critical software in the world. And here is a great quote: maintenance can fill an infinite amount of time if you let it. And that's true. Sometimes I feel that way about the Linux Foundation. The more I give to the Linux Foundation, the more it takes from me. But it's a gift: you give, and you get so much back in return. I think this is true for maintainers, that they're just giving, giving, giving. And that repo is always going to take it. Maintainers need tools. They need hacks. They need productivity tools that are going to improve their workflows.
Forty-four percent of developers want their employers to provide these artifacts, sandboxes and resources, things that help them do their jobs better. And money, though, is incredibly important. Abigail's talk emphasized the need for funding: people have to eat. But it is not the primary motivator for people to work on open-source software projects. So maintainer constraints and challenges are not something that money necessarily solves. Money can't buy me more hours in my day, and it can't buy me a sustainable legacy for my work. We need to approach things differently, outside of financial terms alone. Money is part of the equation, but it's not the motivating factor. And yeah, maintainership is very often a job. It is not a hobby, and the pressure on maintainers is to build features and not fix bugs. That's an equation that we need to change, and we need to figure out how to do it when we have a community that is so constrained in terms of time. How do we motivate intrinsically? What tools do we need to provide? We're hoping to answer more of these types of questions as we continue to do more research. So yes, please get involved. Get involved. Research is just a different kind of open-source contribution. We issue badges if you contribute to a research project, either by way of writing a foreword or doing a peer review of a survey before it's published, or maybe you're going to localize the survey to Japanese or Chinese or German or French. We've got an opportunity for you to get involved in this output. And there's nothing I love more than when Phil Hollerin from GitHub shares his Credly badge on LinkedIn and says it was great to be a contributor to research, it was really, really rewarding. And it was as rewarding for me to work with Phil. I have enjoyed this experience immensely. I've learned so much in my two years of doing research for the LF, working with people all across the ecosystem, from industry to community organizations.
I come from the blockchain space, I did blockchain research previously, but I really do love the open-source community. It's a great place. I would love nothing more than for you all to sign up and receive our newsletter, to learn about opportunities to get more involved, and to take surveys. This year we're going to launch a panel, and we hope to get as many people on that panel as possible so as to create better insight, and for you to be given the opportunity to answer surveys and tell us what's going on and how to fix it. So please stay in touch with us. We're going to have more news on the development of our survey panel this year. It's my mandate to create it, and I'd love nothing more than for you to be a part of it. And with that, thank you very much. I'll be very happy to try to answer your questions. Thank you. Thank you, Larry. Do we have questions? Questions? Sure, yes, I will move to this slide. Thank you. You showed us some statistics there which I found really interesting. What's the most surprising thing that's come out of research that you've done or worked on with the Linux Foundation? Wow, that is a great question. What is the most surprising thing that I've learned through research? I am surprised every single day. Honestly, I learn so much every day, and I'm humbled by the amount that I learn. I think what's fascinating is the community aspect of open source: this is a community built by people, and without community we would not have this technology infrastructure. I think that's the most fascinating and inspiring thing that I have learned. It's really wonderful. So, yeah. Thanks. Thank you for your question. More questions? There's one over here. Over here. Sorry. There you go. Well, thanks for the presentation. I'm curious whether you have any data on how much of the research that we have in the software industry goes to open source software versus closed source, proprietary software. Do you have any data on that?
No, we have not done any research of that nature, I'm afraid. It's an excellent question. There are so many questions to answer. Another question that we get a lot, and still have not found a good way to answer, is what the consumption or usage of open source software is. We know that there are a lot of downloads, but we don't necessarily have a good way of measuring usage. It's very challenging to track. It's a challenging space to research. But good question. Thank you. You had a slide on diversity, a quote. My personal observation on developers is that there are not so many women amongst them. Is it also a topic for the Linux Foundation to look into that, or has some research been done to find the reasons why that is, and how to overcome them at some point? Why there are so few women in open source is part of... I'm going to go back to that slide so that you can see... You can download the report, actually, here it is. In this report, we do offer some insight as to how we can help overcome the fact that we have diversity challenges, and why we struggle to recruit women. Through my exploration of COBOL as a programming language, I've come to learn a little bit of the history of programming languages, and of computer science and engineering becoming higher value work. Societally, programming actually migrated from women programmers to men in the 60s. Think of the first programmers, the film Hidden Figures, women using FORTRAN: it was not necessarily regarded as a high skilled activity. In fact, COBOL is an incredibly teachable skill, and there's an amazing opportunity to skill up: anybody who wants to learn how to code generally can. Over time, computer engineering became deemed to be high value work and became the work of men exclusively. And in the technology industry broadly, this is a function of tech in general, the maths, the sciences, the humanities.
I don't think that we've done our daughters a great service in socializing them into the sciences. I was not socialized into tech, but I see that LEGO, one of the best toys I ever played with as a child, is not marketed as a gender neutral toy anymore. There's LEGO Friends in the pink boxes, and there's the helicopters and the other stuff, and that's marketed in the boys' section. I think that's unfortunate, because the type of LEGO products that are marketed for girls are a house, or a bakery, or a beauty shop. So if I play with that LEGO, I'm playing with the beauty shop and the shopping mall, but if I buy the other color of LEGO, I'm building the Death Star, the Star Wars station, or the helicopter. That's really a problem. So I think we have generational change to make in the way we socialize girls, and the way we socialize boys to do nontraditional activities as well. And we're in a real non-binary kind of world, so we have to think about how we socialize for the next generation. Any more questions? It's drink time! Thank you everybody. Were there any more questions? I didn't think so. This is awesome. Thank you so much for coming at the very end of the day. You're great. Thank you so much. |
Open Source in Environmental Sustainability
Preserving climate and natural resources with openness |
Hello, you all, morning creatures. Good morning. We're about to start the talks of the day, and I'll leave you with Tobias Augspurger, who will be talking about how open source can bring more sustainability to technology. So welcome, welcome. Yeah, welcome also from my side. I'm actually working in atmospheric science. By education, I'm an aerospace engineer, and in my day job I'm calibrating satellite instruments for atmospheric science. But in my private time I have this, let's say, professional hobby: together with some other people I'm investigating how open source can have a significant impact on environmental sustainability. We created this project within a small community called Protontypes, which is investigating multiple ways to increase the sustainability of our environment through open source projects. This project was started by a friend of mine and me about two and a half years ago, during the pandemic. So we had a lot of time, and we realized, okay, there's actually no overview of what open source really means in environmental sustainability. And so we started to create a huge list, an awesome list, I think you know what this is, that lists all the projects we can find in this domain. Quite soon, Irene joined this project, because she did a similar development four years ago. And we were thinking: okay, we have now created this huge list, let's make a study about this. How can we find quantitative ways to investigate what open source actually means in environmental sustainability? Irene was funded by Subak, which is an accelerator program to open up data in climate. And after that, Josh Hopkins from Open Corridor joined us. He is interested in integrating multiple of the projects we investigated into a larger software project, also for environmental sustainability. So what does environmental sustainability mean in general?
It is actually the goal to preserve the natural resources that we all depend on for future generations. But what does it actually require? It requires the intelligent intention to understand, predict and manage the stability of our natural resources. And that means that you need to understand that we depend on nature, and on the natural resources that have been created over millions of years by nature itself and the natural earth systems that we see here. For example, you see here on the right hand side an atmospheric simulation by the open source project called SCREAM. They are investigating how the content of water vapor changes, and how water vapor changes the climate in the future when our earth is heating up, because when the earth is heating up, more water vapor will be in the atmosphere. And this is a huge problem, because it accelerates climate change itself: water vapor is itself a greenhouse gas. So we see there a self-accelerating process, and software is very important there to predict and to understand how this will be in the future. So the core question of the study and analysis that we did is: what does open source software enable in environmental sustainability? I think you know that open source is now dominating our software industries worldwide, and there are studies from Synopsys showing that in 79% of the code bases worldwide you can find open source code. They found this because they do security auditing: they look into the code and they see, okay, we find it everywhere, and it impacts all the software that we depend on worldwide. So what we did in our study is we started with the state. What is actually the state of this ecosystem? We derived multiple insights, but we also wanted to derive some principles based on the multiple projects we investigated and the interviews and people we talked to. And we also wanted to create some visions. So what are actually opportunities?
What are strategies? What is the potential that we are missing, that open source could deliver us in environmental sustainability? So let's have a look at what the applications of open source in environmental sustainability are. One very important aspect there is environmental intelligence. You get satellite data all around the world from multiple satellites, from NASA satellites, from ESA satellites, and most of this satellite data, especially the data that is really looking at the environment, is open to the public, and you can process it. There are a lot of open source tools. With this multispectral data, with images, and with today's spatial and temporal resolution of those satellites, you can derive multiple aspects of our environment, like the air quality, but also how our forests change, and also how noise is changing in our cities. You can also measure, and this is a very important point, point source emissions. That means who is actually creating emissions, and you can directly point out: this is the company that creates this amount of emissions. This is becoming a more and more important topic, because we now have multiple things coming, like CO2 taxes and carbon trading, where we need traceable data about the amount of emissions somebody created. On the other side, there are also things like biodiversity that you can measure. Satellites create images with not just RGB: they measure a huge number of different colors, and if you combine those colors you can derive the change of biodiversity of forests or of any living being on the planet. Another important topic is urban vegetation: how do we lose our urban vegetation, in combination with the heat islands? Because where we lose our urban vegetation, you have the heat islands at the same time, and that means with climate change it will get very hot in these areas.
But there's also the application directly in the real world technology that we're seeing. We found very amazing projects, like a complete 50 megawatt wind turbine that was released as open source. That means there are the blueprints for how to build this wind turbine, and there's the simulation: all the software that is needed to design these wind turbines has been open sourced, in a combination of multiple American and European organizations. You will also find interesting software libraries that help you to predict where to place your photovoltaics, and what the perfect orientation of your photovoltaics is in relation to the sun, so that you get the most efficiency out of those solar panels. There was also a talk about PyPSA yesterday: with open source we now have a complete simulation of the world energy grid and all those energy systems that are within this huge network of different power plants, but also of the renewables. You see here, for example, the power grid of India as it was simulated with PyPSA. So what we did is a methodology that, from my analysis, nobody has done before in open source in general. The first thing we did is we compiled this list. It took me two years and, I think, more than 500 hours to investigate all those projects worldwide, and we used multiple ways to do it. You find this in the study, in the report that we released some weeks ago. It was a combination of researching on GitHub and GitLab, searching through different papers, and doing data mining, but we also just looked at the stars of different people on GitHub: what are the people who work in environmental sustainability actually starring? So this was a huge effort, but the community also helped us a lot. We had more than 30 volunteers who contributed to the list and helped us to curate it, and we were really focusing on projects that were aiming at environmental sustainability at their core.
The core idea was to keep the list as practical as possible, so we really tried to list Git projects, on GitHub and also on GitLab, every kind of repository that people can participate in. So the first step was creating the list, and after that we created some scripts to automatically gather the metadata of all those projects, and we conducted some targeted interviews in all the domains we investigated. The domains were actually derived within the study, so this is, let's say, our perspective on this ecosystem and the naming that we use there. You see here in the middle the core project, Open Sustainable Technology, around it are the fields, and within the fields are different topics that we put the projects into. After that we did the cooperative report and cross-validated the qualitative and quantitative insights that we derived from this investigation. So here's the overview of the data set. You can see we have in total 1,339 projects, which was quite some work. Most of them were GitHub projects, some were GitLab projects, and I must say the numbers here all deal with the GitHub projects. We could not integrate the GitLab API into the study so far. I hope to do this in further studies, and also to integrate other platforms, but to keep it simple the first study was really focusing on GitHub, and GitHub is also where we found most projects.
You see here also that we listed the active projects. Within the study we tried to list only projects that had one closed issue or one commit in the last year, that are documented, and that have a certain kind of quality. So we did a small quality analysis of each project: is there some documentation, can people actually build the software on their own and use it in their own projects? This was very important in the investigation. You can see here that we derived a total number of stars and a total number of contributors, but also other quantities, just to give you an impression of how the ecosystem is in itself and in which direction it goes. What you can also see is that we listed some inactive projects. This was actually the hard part of the study. As you maybe know, most open source projects die very soon, so during the project 192 of the listed projects became inactive. And this was also my impression when we did this analysis: most projects were really inactive. I have no clear numbers, but I think more than 90% of the projects we investigated were inactive. They were just academic projects where somebody released a paper and then it's just over, that's it. It's like throwing an open source project at the wall and hoping somebody maybe can use it. But this is not what we wanted to focus on: we really wanted to focus on projects that you can use and that you can reintegrate into newer projects.
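The activity criterion described here (one closed issue or one commit in the last year, plus documentation) can be sketched as a small filter. This is only an illustration under stated assumptions: the `Project` record, its field names, and the 365-day window are hypothetical, and the study's actual scripts may work differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Project:
    """Hypothetical record for one listed repository (field names are assumptions)."""
    name: str
    last_commit: Optional[datetime]
    last_closed_issue: Optional[datetime]
    has_documentation: bool


def is_active(project: Project, now: datetime,
              window: timedelta = timedelta(days=365)) -> bool:
    """Active = documented AND (a commit OR a closed issue within the window)."""
    cutoff = now - window
    recent_commit = (project.last_commit is not None
                     and project.last_commit >= cutoff)
    recent_issue = (project.last_closed_issue is not None
                    and project.last_closed_issue >= cutoff)
    return project.has_documentation and (recent_commit or recent_issue)


# Made-up example data, not taken from the study.
now = datetime(2023, 2, 1)
maintained = Project("maintained-model", datetime(2022, 11, 5), None, True)
abandoned = Project("paper-code-drop", datetime(2019, 3, 1), None, True)
undocumented = Project("no-readme", datetime(2023, 1, 15), None, False)
```

With these made-up records, only `maintained` passes the filter; `abandoned` fails on recency and `undocumented` fails the documentation check.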
So the first thing that I realized when doing this investigation is that there's quite a low popularity in environmental sustainability. Even if it is a hot topic, and you see it multiple times in the media, and everybody is talking about it, and yeah, it sounds very trendy, we could find just three projects that have more than 1,000 stars, and in total on GitHub you find 38,000 projects that have more than 1,000 stars. That's also my impression: most people are really not appreciating those climate models or, for example, those biodiversity models that have been created and that are being maintained by different organizations worldwide. On the left hand side, the color bar shows a community health index that we created. It's called the development distribution score. We realized that the most important aspect of an open source project is how much the knowledge and the development are distributed within the project itself. So we created just a very small number from the commits: you take the number of commits of the strongest contributor and divide it by the total commits, and you get an impression of how much this project depends on one developer, or whether there is really a community behind it. And you see that A/B Street, which is the most popular project, is really a one-man show. That's a project where somebody is trying to do a gamification of cities to improve our transport system and make it more sustainable.
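The ratio described here, commits by the strongest contributor divided by total commits, can be sketched in a few lines. The function names below are made up for illustration, and reporting the complement of that ratio as the "score" (so that a higher value means more distributed development) is an assumption, not something stated in the talk.

```python
from collections import Counter
from typing import Iterable


def top_contributor_share(commit_authors: Iterable[str]) -> float:
    """Share of all commits made by the single most active author."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("project has no commits")
    return max(counts.values()) / total


def development_distribution_score(commit_authors: Iterable[str]) -> float:
    """Assumed convention: 1 - top share, so 0.0 means a one-person project."""
    return 1.0 - top_contributor_share(list(commit_authors))


# One author per commit; the names are placeholders.
one_person_show = ["alice"] * 10
small_team = ["alice", "alice", "alice", "bob"]
```

For `small_team` the strongest contributor made three of four commits, so the share is 0.75 and the assumed score is 0.25; for `one_person_show` the share is 1.0 and the score is 0.0.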
After that comes a very famous project, which is also a for-profit open source project, called Electricity Maps. And after that we still see projects like OpenFarm, or Scaphandre, which is actually a green software project. So they are all dealing with different aspects of how we investigate and create knowledge about our environment, and how we can improve our society so that we create less impact on our ecosystem, let's say negative impact. But you also see among the popular ones some climate models, or let's say earth science models and frameworks, like Pangeo, or various models that are also very important, like the WRF. On the right hand side you see the contributors, so the total number of contributors that we could find within a project, and Electricity Maps is again, let's say, the dominating project. Electricity Maps combines the so-called carbon intensity of all the different electric grids worldwide to give you a number for: when I am consuming this amount of electricity at this point on the earth, how much carbon do I actually release to the atmosphere by this power consumption? This was really one of the very few larger communities that we could find, where a lot of academics and for-profits and different organizations were coming together to create this. Another project is the Open Food Network, where people try to create trustable knowledge and information about the food itself, so that you can have trustworthy information. This is also a huge community project, and quite old. You can also see here some green software projects and some larger atmospheric projects. But in general you see that the total numbers of contributors, if you compare them with other open source projects, are quite low. It's not that big. And for me personally, this project itself has one thousand three hundred stars, so we are actually one of the most
popular open source projects in environmental sustainability, and this is not a good sign, because we are just some random dudes. We have no foundation, we are just a community, and people were coming together, you know. That's what I wanted to show with this slide: within the open source community there's a lack of popularity for environmental sustainability. If you look at the licenses and languages of the projects, we see there's a lot of MIT license, so it's actually easy to integrate those projects into for-profit products, not that hard. But there's also, let's say, a good portion of free open source licenses like GPL 3.0. What we also found is that with our data analysis we could not always automatically determine what the license of a project is. That's the custom license category, and this is always problematic. So if you release open source code, use a standard license that people know about, so they don't need a lawyer to see if they can reintegrate the software into, maybe, a commercial project. Because we need to go with such approaches into commercial projects: if we don't commercialize environmental sustainability, we will never achieve environmental sustainability. On the right hand side you see the proportions of the programming languages, and here we can see that data analysis and data driven languages are really on the top. I can imagine that you expected Python here, but R is also very prominent, because in environmental sustainability it's a lot of statistics and processing a lot of data. But we can also find Fortran and C++ here, and those languages are used in the large scale simulation of our earth. If you want to be really efficient in your calculations, those are the programming languages that you use. And the high portion of Fortran shows us that many of those software projects are quite old. This is really sometimes code that
is 20 or 30 years old, in which our earth and our atmosphere, biosphere and hydrosphere have been simulated, and this code has been reintegrated into new software projects. On the other hand, JavaScript is not that important here. What we also did is derive from the projects all the organizations behind them, so what are the namespaces of the projects? From the namespaces of the projects we had a somewhat smaller list, and this was hand labeled: we went through all those organizations and recorded the location and also the continent of those organizations. We see that Europe has quite a good standing here, and also North America, and we found a lot of organizations that consider themselves global. But unfortunately I have to say that from Asia, and I did a lot of investigations, I could not find a lot of open source projects that originate from the Asian continent where there is really an organization behind them. It doesn't mean that there are no people involved in this, I just wanted to say that we could not find them, and I did a lot of research on this. What we also did is label the forms of organizations behind the projects, and it was surprising for me that the community based projects, like we are, are the strongest. We labeled a project as community when there was no legal structure behind it, just some people coming together doing stuff. After that, academia has a very important role, because they did a lot of this research in environmental sustainability, and government agencies are also very important in this domain. Unfortunately I also have to say that for-profits, and that's a general trend, are not that prominent. I did a lot of investigations into whether we actually find commercialized or for-profit applications of this kind of software, and I told you about Electricity Maps, that's maybe, let's say, the
one really successful project that we can find. Besides that, Microsoft is actually doing some good open source projects in environmental sustainability, like the Planetary Computer, and there are some other good developments, but compared to the problem that we are dealing with, this is a little bit disappointing. It cannot be that this is just an academia and community thing with maybe some government agencies, so there need to be more for-profit organizations. I think we also see now that commercial open source software is becoming more and more important, and there are really now billions of dollars going into this kind of development, but we're totally missing that in this domain. Here you can see all the projects that we mapped, scattered over the project age. You see here the topics, and they are ordered: we also created a size score so that we can order them. You see the strongest was biosphere, but this was also because we did not separate biosphere so strongly. If you combined climate data processing and the climate models, climate science in general would, I would say, be the strongest area. What was also quite interesting is that hydrosphere, so the people dealing with our water resources, was also very strong: we could find a lot of strong projects there. It's the same for soil, for air quality, for agriculture and nutrition. So on everything that is really describing the state of our environment, we know a lot: there's a lot of information about what the state is and where we are going. But if you go more into, let's say, the technology area, if you're talking about batteries, hydrogen, bioenergy, carbon capture and removal, there it was super hard to find good projects that we could list. And, even more surprisingly for me, the same holds in the area of sustainable investment, because sustainable investment is really something that is data
driven; there are a lot of huge companies behind it that gather information, and they have a lot of models and ways to assess what is actually sustainable in our investment. There were two or three really strong projects, but they were also just at the beginning, and compared to the problem we are facing this was also a little bit disappointing. Other areas, like carbon capture and carbon offsets, were also not that prominent. If you're more interested in the quantities, we created much more, about the growth and about the different kinds of communities; we created a lot of quantities and plots for you, and you will find them in the report. So I will stop talking about the quantitative analysis here and go more into the recommendations that we derived from our analysis, from the interviews, and from the people that we talked to. I must also say that the discussions we had within the group led to a lot of interesting aspects and recommendations. So sustainable rating and investment is really becoming a huge and important topic, and it influences all of you, because this is where your future is going. There are the United Nations Principles for Responsible Investment. They were created, you see here, in 2006, and they set a real baseline for how we do sustainable investment. You see here that in 2021 we are now talking about around 140 trillion dollars that are invested based on those principles. So there's a huge amount of money going in this direction, and that needs to be, because we want to transform our economy into a, let's say, sustainable system. I was very interested in this domain, so I did a lot of investigations and talked to people about this, and we analyzed how this is done: how is sustainable rating and assessment actually being done? And the problem is that actually it is a big black box
to summarize it. You have some sustainability reports that are actually self-reported by the companies, and then there's a lot of news about the sustainability of companies, and a lot of marketing, as you know, and this is all combined in a so-called sustainability assessment. There are multiple companies that are doing this, and there's really not much known about how this goes. And you see here also how greenwashing actually works. It's very easy: you create very beautiful reports that look very good, and nobody is really assessing whether they are based on science. You really do self-reporting about your environmental sustainability, and then these sustainability assessments also take into account news about those companies. So is there good news about the environmental impact of those companies, or bad news? And they also take financial news about those companies, and this is all combined in a sustainability assessment, and then out comes a magic number. And based on those numbers, 140 trillion dollars are invested somewhere. Okay, that's crazy from my perspective. I dug a little bit into this, and I realized, oh, there are a lot of scientists who realize that what these rating companies are doing is actually like throwing dice. You see here on the bottom the value of the benchmark rating from one company called Sustainalytics, and then you see here the divergence of these ratings. In an ideal world this would be one line, okay? They would all agree on environmental sustainability. But actually they do not agree. So you just have to go to the right rating agency, and you can make every company sustainable; today that is very easily done. And you also have to see that the rating itself has some... so you normally don't get the very bad rating or the very good rating, right? So in general, you just have to throw the dice. And also the
authors are saying, on ESG rating divergence (ESG means environmental, social, and governance, because this is actually a combination of different kinds of ratings: they do an environmental but also a social sustainability rating and a governance rating): addressing ESG rating divergence requires one to understand how the data that underpins ESG ratings are generated. So nobody knows how this data is generated. And I think you already know the solution: when nobody knows how data is generated and we have a big black box, then you normally take open source. But this is not happening. And people are doing the same thing in so-called carbon offsets. So if your company is creating a lot of emissions and you want to be carbon neutral (and you see this label all over the place now, everybody's carbon neutral, and it's great that they all want to be carbon neutral), there's a study done by the Guardian. They investigated how much of the carbon that is released into the atmosphere is actually being stored back into our soil and into our ground. And you see on the right-hand side the carbon cycle that is driving the CO2 into the atmosphere. There's a natural cycle going on, and it took millions of years to stabilize our atmosphere through this carbon cycle, so that we have a low portion of carbon in our atmosphere and, in this way, stable weather, a stable climate, and predictable Earth systems that we can all depend on. But the Guardian found out that the real emissions that are actually stored in the ground are much lower than what is being claimed. They say that over 90% of those carbon offsets people are claiming are actually just claims; that's what the Guardian is saying here. But you see another problem: the Guardian is just showing you a plot, and that's actually the core
problem of all of this. At the end they just show you a plot, and then: believe my plot, I did great science behind it, just trust my science. And that's, let's say, the core problem I want to investigate with you a little bit more, because we're talking here about safety-critical decision making. We know very well, and I showed you this also with the topics and with the numbers of people that are investigating, with open science that is traceable for all of you. You can go into these climate models; there are very good examples in simple Python, but you will also find larger open climate science projects that you can just run on your machine, and they all come to the same result: that we are on a very bad path, on a collision course with the Earth, actually. This is very clear to us, so there's no doubt. So if you don't trust climate science, just go into the repositories, create an issue, or deal with the software that you find there. So the problem that we are facing now is that in the decision making, the assessment of sustainability and sustainable investment, we have no clue what we are actually doing, and there's not much open source software that we could really find. But there's help, and especially in the energy sector we found a lot of good projects that really understand how to get rid of this problem of traceability. Because when somebody is just showing you a plot, "I think hydrogen will be the future of the energy sector", and then somebody else comes and shows you another plot, you cannot really create a discussion about this. And everybody needs to be at least able to discuss this right now. And what they did in energy modeling, so the PyPSA project I showed you at the beginning: they release those models, and in this way you can create proof and show, hey, I calculated this; if you don't agree
with me, show me your model, and if you don't show me your model, I don't trust you. That's how it goes. And this is, let's say, a very important idea that I found here: you need to understand that you can actually prove how far a conclusion is open and traceable. You just have to see: how much open data is there, and are there uncertainties that come with the data? Are there open source models? Is an open execution of those models possible? Do you find open results of those executions, so are the artifacts created by the simulation also open? Is there a way to participate? Can you go into the GitHub repository, can you go to a conference and ask the question, how did he do this? And does the conclusion come with uncertainties at the end? So is there a plot that shows what my unknown, my uncertainty, actually is, or is it just a single number? If it's just a single number, it's a problem. We can propagate the uncertainty through such a thing, and that's very important for traceability. Then what we found out, and that's, let's say, one of the core findings of the whole study, is that openness itself is the indicator for sustainability. It's very important to understand: we sometimes don't know today what is actually sustainable. But if somebody comes and says, "I personally don't know if I'm really sustainable, but I show you how I calculated why I think I am sustainable", this shows sustainable intention. And this is what we need today, because then you can go into a discussion, you can go with people into the maintenance of code, you can improve your own code. And that's why openness itself is really the key indicator for sustainability. You see also this picture, which shows standing on the shoulders of giants, but which from my perspective describes open source very well: open
source, or let's say the technology that we build, is this blind giant, and we are standing on its shoulders. We can see farther thanks to the giant; the giant itself is blind, but we can see much, much farther. And the giants get bigger and bigger, and we can see and predict the future in a much better way. That's why open source, and the math and the models that we create there, are so important to predict the future. One interesting aspect that I found to get rid of this problem is spatial finance. It's the idea that we can use all the Earth observation data that we have to track companies on their environmental sustainability. So all that software in Earth observation, and there is a huge amount of open source software in Earth observation, can be used by rating agencies and all of us to actually track companies on their environmental impact. That's not that hard. And there's an Oxford group that started spatial finance, and what they started with is a map of the whole fossil fuel industry, where you know the whole supply chain: where are they located, where is the fuel supply chain? That helps you on the one side to simulate it, but also to check satellite data and track down: okay, this company claims that they're sustainable, let's have a look. And suddenly you have a sensor that is measuring the sustainability, and that is needed. So here is how I would improve the rating of environmental sustainability and how we can get rid of this problem: integrate Earth observation into this processing chain, and, very importantly, where the assessment happens, we need open source code. That's just one project from the Linux Foundation that is starting to do this, but this is all at a very early stage. And then, maybe in the future, we can suddenly get sustainability ratings about companies
that have some uncertainties. But it would also be great if companies themselves used open source code to do environmental sustainability rating within the company. For me personally, I would like it if ExxonMobil said, "We go open science now." Every company that goes open science about their own environmental sustainability would have such a massive impact on the industry, because finally you have the information and data to really measure sustainability within the company. So far this is all a black box for us. So, yeah, I would like to say that open source is, from my perspective, the most underestimated climate strategy or climate action. There's so much opportunity there that we created a whole list of recommendations on how open source can have a significant impact. We were thinking about what could be follow-up projects, and we had so many ideas that we just started to list them all as recommendations. I think there are more than 50 now, and I just showed you two of them, in combination actually. And my final words would be: I think we need to understand that we need to build an operating system for environmental sustainability. For things like carbon offsets or the rating of the environmental impact of companies, you need to connect all the spheres, all those open source projects, together to get a better conclusion about what the state of our environment is. And that's what we hope to start and to build in the future: really such a kind of operating system. Yeah, thank you so much. Well, thank you, Tobias, for your brain-exploding presentation. You can type your questions in Matrix, or you can just raise your hand and I will bring you the microphone. Oh gosh. Okay, I lost you in the crowd. Again, again, please. Tobias, thank you so much for the very inspiring presentation. Can you hear me? I absolutely agree that open
source is very much underestimated as a source of sustainability. And a question: can you suggest any ways developers can collaborate better? For example, I know that there are some gaps in bridging between climate modeling and energy modeling, and it would be very helpful if energy modelers knew more about climate modelers and vice versa. Do you have any ideas? Can you suggest something to improve this situation? Can you hear me? Yeah. I think what is really missing is the community behind it that crosses over the topics: a conference, somewhere these people meet and talk to each other. That's what is missing. And we realized that the topic of environmental sustainability is always a side topic somewhere in open source conferences, but we really need a conference of its own. That's the point where the connection between the software projects starts, at the beer in the evening maybe, where these people connect. They are not really connected with each other; that's why this list is so successful: they could not find each other. Tobias, you mentioned that there are very few for-profit organizations currently. Yes. But if they increase, we also have the lobbyism problem again, because they are much more trained to get money for their already existing projects. Do you see some chance that new open source projects in this group will also get paid contributors? This is also an opportunity, yes. I'm not the biggest fan of commercial open source software. I see there are some opportunities; there are good ways and bad ways. It depends on the topic. A climate model should maybe not be for-profit, but if it is about the rating, maybe there should be a company behind it. It really depends on the kind of project we are talking about. So for example, if you want to create a wind turbine, right, a wind turbine that is completely open source, you need a company that maybe gives
you 40 years of maintenance, and it needs to be ensured that this is maintained, and maybe this is better with a for-profit, but depending on the project, maybe with a non-profit. It's not an easy question what the right organization for an open source project is. It's very complicated. |
Making the world a better place through Open Source
Focusing the unique power of Open Source Communities as force of social good in today's complex geopolitical landscape |
I am going to talk today about, when I wrote the talk, I said open source collaboration, but I think open collaboration more in general is maybe more appropriate. And I think we have, as developers, don't allow me to touch code anymore, but you have an amazing potential to make the world a better place. So today I'd like to talk about, first of all, how, why open source is eating the world. And I want to start with a bit of my personal journey that hopefully makes you understand sort of the background that I'm coming from. I started about 15 years ago in open source, and I had a lot more hair. And I guess at the cost of aging myself, how many people know something called Apache Cocoon? Exactly. I figured. Exactly. So I started in open source really because it was fun, because it was, I gotta say, also a way for me to accelerate my career. The opportunity to collaborate on the left side, you got my best friend, who still is my best friend. I made some of the, you know, my best friendships in open source, but the day that I got my Apache email address, I still remember it. I was so proud about, you know, being in the, you know, in between so many smart guys, effectively much smarter than me, because I've been pretty much mainly release manager at that time. I never really wrote super hardcore code, but that's where I started. Really as a personal motivator. Then I moved into, I had the luck to move to Holland, let's say in Italy, didn't really have such a lively technology ecosystem in 2006. And I had the luck of starting to get paid to do my open source work, whether it was community open source, building SDKs, whether it is, you know, actually as part of my day job, being able to be first a developer, then a product manager for the developer platform. And not sure what I'm doing up there, but I guess clearly I was having a lot of fun in Munich, or thereabouts. Then I moved to the US. 
I never really grew up with the American dream, let's just say that, but both for work and for personal reasons, I moved to the United States and I entered this magic world of foundations. A foundation in a pretty conservative and regulated industry. This was 2015 when I got called to run, well, at the time it was called the Symphony Software Foundation and now it's FINOS, the Fintech Open Source Foundation. I tried to take, you know, the Apache way, my sort of open source upbringing, and throw it at banks. It didn't really stick. I started saying, hey, if it doesn't happen on the list, it doesn't happen. You guys should, you know, collaborate openly. It didn't work. I really had to switch gears and start talking to them about what the value was for them to collaborate in the open. Not undermining the principles and the transparency and the governance of open source, but I found much more success in bringing all the different constituents together, whether it is financial institutions, whether it is technology vendors, whether it is individual contributors, who of course still make up the largest part of our community, and, I think very interestingly, regulators. And that's, I think, a theme that you're going to hear recurring quite a bit during the presentation. There is something there in having active public sector collaboration in industries, in open source projects. You know, there's still a long way to go, I wouldn't say that banks are first-class citizens of the open source community, but we were able to make some of the largest firms in the US and in Europe and in the UK, I guess, extended Europe, collaborate, and that's, I think, the beginning of something quite interesting, because they have some pretty powerful technology that they're now, you know, making available, and they are trying to be good citizens that maintain, not just, you know, dump and run. And then it got personal.
You've probably seen on my desktop the picture of my kid. This is Leonardo. He's a special needs kid, pretty heavily fragile as a kid, and I started seeing the... yeah, he was born like a lion, no, no, this disability, but I started seeing how open source, which again is really close to my heart, there are plenty of efforts out there that are trying to use either the open data or the open source, truly open collaboration approach to solve things for which, you know, there's no commercial viability. Big pharma is, you know, much more encouraged to work on, you know, cancer, you know, vaccines, things that have a massive applicability and therefore massive commercial potential, rather than solving very rare genetic diseases where you have, you know, 10 people in the world having that specific gene that is, you know, a mutation. And so, you know, OpenTreatments is under the Linux Foundation, there is openSNP (open single nucleotide polymorphism), I'm not really a biologist or doctor, but you see the power of bringing together data from all over the world to make something where there's no real commercial drive, there's no drive for innovation. And so, all of this to say that I find open source itself to be the positive-sum game in which, whether you are a corporate, whether you are a government, whether you are an individual, we can find ways to engage these people. And I know that, you know, there's a lot of polarization nowadays, whether it is, again, individuals versus big tech, whether it's Europe versus the US or versus China, but I do think, without being naive, that open source actually can be a way for everyone to collaborate and deliver not just technology innovation but social value. So, before we go into sort of what the challenges are, I think it's important to talk about and sort of understand how some of the most successful projects, at least in my experience, came about.
I mean, I'm sure you're all pretty familiar with a guy called Linus Torvalds, so I'm not going to go into Linux as a pretty obvious example of a project that was individually started or individually led at the beginning. But more recently, again, more in the world of finance, one example that has been pretty interesting to me of a project that was started, you know, by an individual really just coding, maybe not so much in the garage, but, you know, during the pandemic in lockdown, is a software called OpenBB. It used to be called Gamestonk Terminal. If you're familiar with, you know, the GameStop frenzy two years ago, where, you know, the community started betting against Wall Street and, you know, created a pretty interesting, you know, basically growth of a stock, GameStop, that didn't really have a lot more value, but it was an example of a movement that was really trying to democratize access to, you know, a very elitist and very wealthy, you know, part of the industry. And so one guy overnight built sort of a replacement Bloomberg terminal. For those of you who are not familiar, the Bloomberg terminal is what, you know, people use on Wall Street. It is a very closed, very heavily proprietary terminal, you know, literally a thing, a box, that is used by traders to research and trade. And so someone went out there and built an actual terminal. Turns out that when you pair open source with a cause, you really get potentially amazing exponential growth, but of course timing has to be right. This was done at the very moment that, you know, the whole world of retail investors was betting against Wall Street. And of course it's not just about the code. He was lucky. Didier, who happens to be a friend, was able to, you know, put it on Hacker News, it went viral on Hacker News, went number one, and so it was off to the races.
It started getting contributions, it's now gotten funding, and it rebranded from, again, Gamestonk Terminal to OpenBB. So there's a lot of power when you are able to pair open source with an actual social cause or, you know, an individual cause. Now we're very familiar, I think, with corporate-led open source projects. They're getting more and more common. There are two, you know, examples here. As I said, I never really grew up with the American dream, but it has to be said that some contributions from big tech, like Kubernetes or PyTorch, have certainly created opportunities out there. And of course this is not because of charity. Let's be clear, while those of us coming from the open source community have an aspect of conscience, you know, these large corporations largely do open source because they see a potential down the line, whether it is commoditizing a competitor... If you've never seen the Kubernetes documentary, they actually have a documentary, I wish one day I'll be so cool that I have a documentary about my project, but, you know, Google said it out loud: incrementalism, step by step, wasn't going to get them to catch up with Amazon on the cloud front. And so they open sourced Kubernetes and, you know, we now have about 21 trillion investment globally, about one trillion in Europe. And Kubernetes is one of the most sought-after skills out there. And again, this doesn't mean that the motives originally were about the spirit of open source, but certainly it has created opportunity. Same for PyTorch, you know, some of the main advancements out there, oh, we can talk about TensorFlow, but really AI has certainly benefited from the open sourcing and, more recently, the introduction of open governance around some of these projects. But of course, as I said, with open governance, you must play by the rules. And I know that we all have, you know, slightly different ideas of what the rules should be, but certainly a good starting point is to use an OSI-approved license.
And ideally, you know, be in a foundation where you can bring all the different parties together and be a neutral environment. And then, you know, we have plenty of examples of non-profit-led open source, whether it is projects from the Linux Foundation or more from, you know, a 501(c)(3)-type charity like, for example, Mojaloop, which works around financial inclusion. We have seen, especially in the last few years, how open source, not always super successfully, because, for example, LF Public Health was created, you know, in response to the COVID-19 pandemic to have, you know, open source COVID tracking apps, and, you know, it's been used by some governments, but as a U.S. resident, I still have my paper vaccine card, like when I was in the 80s. So it hasn't really been sort of as ubiquitous as we wanted it to be, but certainly the attempt was correct, it was the idea that we are all in a major global crisis, and open source can help. I think a great example is OpenSSF, you know, actually, while I tend to forget that OpenSSF was created two months before Log4Shell, not after, although the Linux Foundation was pretty well positioned to help and corral a response, at least from the U.S. government. We had several meetings at the White House, acting really as a sort of convening power, also for the large technology companies, and I hope, you know, we can work together on a global standard here, because, you know, we're all at risk based on open source sustainability, really. And then finally, as I said, non-profit: when you, you know, talk about financial inclusion or other causes, you can also attract, you know, private funding, charity funding. And Mojaloop is largely funded also by the Bill and Melinda Gates Foundation. And so, what is preventing us from making an even bigger impact? I think there are a few reasons. One I just hinted at: it's open source sustainability.
I mean, most of you have probably seen that comic about, you know, how often, and in some cases how little we know about how, the modern infrastructure is sometimes relying on, again, the amazing work of some random guy in Nebraska, but, you know, I know many of you can relate to how, you know, unthankful the internet can sometimes be, or the technology community can be, when you are doing amazing work in maintaining a library that then, it turns out, you know, large corporations, governments, and technology companies rely on every day. And so, whilst, of course, Log4Shell, SolarWinds, and all the, you know, attacks that we've seen in the last few years have raised the profile and created organizations like OpenSSF to really start addressing not only security, but sustainability, this remains a big problem. Also because of sort of a lack of funding, or uneven funding, across the world. As I said, I had the luck quite early in my career to start being paid to do open source, and it's a beautiful thing. I never had to really struggle to do what I love doing so much. And a little caveat here: I don't come from a rich family, but they were able to let me study, and, you know, I think in Europe it is, to an extent, easier to have sort of those basic rights, but I never, never forget that it's a privilege to be able to work in open source. You know, growing up, I used to say, look, I wouldn't be here if I hadn't started contributing to open source when, you know, when I was in that picture with long hair, but it's because I had time to do it. There are simply people and, you know, large slices of society that, you know, don't have time or don't have the opportunity to be able to start contributing, because they have to do three jobs to get to the end of the month. And so open source sustainability is a major issue, and I think we need to continue addressing it.
And of course, a little touch on open washing: it is very much still a thing, especially in much less mature industries. It's very easy and very cool to jump on the open bandwagon without truly understanding it, either maliciously, meaning trying to exploit the marketing value of open source, or inadvertently, not understanding that, yeah, you have to contribute back if you want to, you know, ensure a lively ecosystem. And if you do software in the corporate world, you can fund open source. You can pay your developers to work in the open. But I do think that the main issue is fragmentation, actually. This is a report that Linux Foundation Research led together with other foundations like Eclipse, LF Networking, and, you know, Linux Foundation Europe. And it's about really trying to understand what the main challenges are that prevent us from, you know, getting together even further and addressing some of the most pressing issues out there. This is one that is near and dear to my heart, especially as we are here in Brussels, and we recently started Linux Foundation Europe in the understanding that there is, you know, a lot of potential to, you know, help Europe, European technology, European values, not just in Europe, but really go global through the Linux Foundation platform, which of course is all over the world. There's clearly, you know, a desire and a set of statements to use open source for advancing digital sovereignty, creating the digital commons. But it seems very much to me that there hasn't yet been a sort of true set of actions that are very concrete in, you know, driving towards that vision. It's very much promoted in principle, but, you know, sometimes even well-intentioned regulations.
I think if you were here yesterday, there were a lot of conversations around the Cyber Resilience Act and, you know, what the potential unintended consequences on open source can be, and the burden it might create on maintainers and foundations. While again, in principle, the goal was to, you know, try to address sustainability and try to address security and put some of the burden on the commercial companies. I think it's fairly correct. And so you always end up with great statements, but not so much in the way of aligned actions. And I have to say, OSPOs, you know, the European Commission has an OSPO now, two days ago the Dutch government announced the creation of an OSPO. They are sort of really starting to spring up out there and I think it's a construct that is going to help. But I think, again, from what I've learned from finance, you need a bottom-up and a top-down approach here. So where do we go from here? If we want to make an ever bigger impact, actually I'm missing a slide, sorry guys. I guess I lost it. I think there are a couple of things that we need to look forward to. One, understanding that open source is not the battlefield. We can, you know, compete on the commercial side of the house. We can compete on, you know, what is truly differentiating, but it's not the open source community where this battle should happen. Digital sovereignty doesn't mean building borders around open source. And indeed that is also what's coming out of the fragmentation report. Transparent open source developments are the best antidote for techno-nationalism. This is something that, again, even though we know there are tensions all over, open source should be a bridge. It shouldn't be another way to divide us.
And there are plenty of areas where we can collaborate, whether it is, you know, open source infrastructure, whether it is, you know, open source security, whether it is intellectual property, which remains a pretty regional sort of concern, whether it is even broader goals. And that's where I'm going to go in the next slide. And I want to say we are asked as foundations to work better together. There are plenty of educational initiatives out there where we have helped the government. You know, we have the TODO Group at the Linux Foundation, but there's OSPO++, which works a lot with the public sector, there's the OSPO.Zone and OSPO Alliance from the Eclipse Foundation. If there's one objective that I have in my new tenure here at Linux Foundation Europe, it is really to bring together not only governments, but really start from foundations. This is something that we are being asked very loudly by the ecosystem. And I realize this is in the U.S. up there, but we do have travel funding, and we are also starting to organize conferences and moments where we're focused on really educating the government on how they can help. Not just in words, not just in sort of grants that stop at research level, but really with active participation in open source projects. And as I said, part of the issue here is fragmentation. I know for a fact, even within the Linux Foundation, we create five or six projects every day, sorry, not every day, every quarter, and there are thousands of projects created on GitHub every month, probably thousands is an underestimation. And so it behooves us, and that's again what comes out of the fragmentation report, to really do more to align projects. And I want to applaud, if you guys were here yesterday morning, the Open Source Initiative, which announced their partnership with the Digital Public Goods Alliance, which is very much aligned to the UN Sustainable Development Goals. And at the Linux Foundation, we're also trying to align our initiatives, our projects.
Again, we have, I don't know what the current count of projects is, over a thousand projects in the Linux Foundation, and I have no idea how many can address, you know, climate, can address, you know, the water crisis, can address energy. What I know is that many of these projects are based in Europe: OS-Climate has a large presence in Europe, and there is AgStack for agriculture and AgTech. And of course, LF Energy. Those of you who know LF Energy know that the leader recently passed away, Shuli Goodman. She was a great activist and very much saw Europe as a springboard for global climate action. And so you can certainly expect from us a much stronger alignment towards sustainability. In fact, again, unfortunately, the CFP closes today, so if you're quick, you can probably still sneak it in, but we are adding another track to our Open Source Summit, which is SustainabilityCon. Again, in the spirit of trying to at least have our house in order, and then hopefully start collaborating more broadly on the 17 Sustainable Development Goals. And then, you know, I think another way to go beyond standards, beyond sort of this fragmentation, and I'm sure, again, this is not a new comic strip, I've taken part in standards, I've taken part in open source projects over time, and hopefully I don't offend anyone here, but I think open source is better. I mean, it's just more apt to incremental progress, and you know, you can always fork later if you don't agree, but ultimately it allows you to build on an existing concrete piece of code rather than coming up with yet another sort of top-down, you know, 15th standard when 14 already exist. And I think this is something that also impacts how we collaborate as individuals.
On one hand, look, one of the things that came out of the fragmentation report is, you know, that fragmentation is a double-edged sword. Sure, we want consolidation, but on the other hand, you know, there is value in the distributed mode of open source development whereby, you know, we want to allow multiple options to solve the same problem, and then, you know, open source Darwinism is going to run its course. But I do also think that it behooves us as developers, I guess, I don't know if I can call myself a developer anymore, to try harder, to try harder not to reinvent the wheel. Again, I keep aging myself, but I used to be the Maven guy before going into the foundation world, and I got so much crap from the Ant guys, like, dude, what are you doing? This is not, you know, it's very polarizing, and I'm like, yeah, sure, we know it's not perfect, but how about we try to consolidate and improve rather than creating yet another framework or yet another library? And again, sometimes there are perfectly valid reasons to do so. And so, how do we bring it all together? This is a very similar slide to the one that we've seen before on finance, and the lesson that I've learned is that it's important to start with why.
And by why, I mean, not only, you know, the conscious aspect of open source that we all, I think, if we're here, are familiar with and are driven by. If we want, I think, to truly make an even bigger impact and a more focused impact, whether it is on those major social goals and sustainable development goals, whether it is, you know, starting to change an industry, finance, healthcare is one that, of course, is near and dear to my heart, and for many reasons it's still very closed, and, you know, sometimes they hide behind the regulated nature to sort of avoid change, but the reality is that, as we've seen for banks, this is actually possible, or whether it is to, you know, make the government understand what they should be practically doing in an open source project to advance a certain policy. You know, there's so much talk, at least in the EU, about open source and how open source can advance, you know, European small and medium businesses and innovation, and that really does need to be translated into action. And so I think independent organizations like foundations, not just the Linux Foundation, any foundation out there, and hopefully better collaborating with each other, are a way to bring all these different actors together, still with the principles of open source. But, you know, I think if we lead with that, we're going to lose some people on the way. We're going to lose the people that really are there for the money, or for a cause, or, you know, unfortunately not everyone has this natural driver to collaborate, and so you always need to try and sell it, you know, it's a dirty word, sell the reasons and the value why people would want to participate in open source. And to me, yeah, bringing everyone together under a common governance is really the way forward. And so, about to wrap, what can we do, what can you do, practically to help?
I think from our side, there is much more work in being organized, collaborating, and aligning to common goals on the foundations' side. But I think if anyone here is from the public sector, or you are listening online, open source can be the vehicle to drive policy goals. I mean, there's so much that we can bridge if we start with open governance, and I think we're not going to be there probably this year or even next year, but I see a moment in which, like I've seen with banks, we can involve, or why not, the government should be able to start open source projects that, you know, contextually drive these big policy goals, whether it is the DMA, the DSA, the Cyber Resilience Act, the Interoperability Act. There's so much out there that Europe, in fact, is leading. I mean, I think about GDPR. I live in California, and, you know, thanks to GDPR, we have CCPA, which is the most similar sort of data privacy protection in an otherwise pretty wild landscape. And, you know, at the level of consumer protection, that's really been standard setting from the EU. On the other hand, it makes it very complex for companies to implement, and so open source can really deliver, as a positive sum game, an easier way to implement for the regulated industries, and a more effective way for the government to enact those policies. And of course, I will continue reiterating that digital sovereignty doesn't mean building borders. We don't want to fragment the open source community. We're going to, as Europeans, not only lose the existing technology that is out there, but potentially lose sort of the innovation and the growth potential for the local small and medium businesses and hopefully growing businesses. I would like to see larger technology companies in Europe being born out of open source.
For companies, invest in open source talent. Again, I have a lot of grievances with the United States, but they do invest in talent, and that does allow maintainers to work in open source without having to struggle. And that also has, as I said before, a diversity angle here, like there's just folks that, in many parts of the world, don't have the time to just invest in open source, to just, you know, do it in their own time and become an open source leader. I do think that, like we've seen for the examples of Google, I was actually talking to a company here yesterday, a local company, that said we wouldn't even be in the market if it was not for open source, that's what got us on the global stage. And I think, you know, if anyone here has the decision making power, if you're trying to sell your CEO on doing open source, lead with why, lead with how they can commoditize a competitor, how they can, you know, become a de facto standard. And then, if you're not able as a company to contribute time or code, well, contribute money to open source through maintainer funding, you know, with Patreon, GitHub Sponsors, you know, the foundation has a mentorship platform. There's so many ways to fund open source. It doesn't have to be foundations, it doesn't have to be directly employing someone. And then, last but not least, as an individual, I think you have an amazing potential to be an activist through technology. And to be an activist, I think some of the most successful activists out there are not just about, you know, the quality of the code. You gotta market what you're doing, and we're here to help. I can't promise we will be able to market or promote every single open source project out there. I've been working on this deck until 3 a.m. yesterday, and got here 10 minutes late this morning, but we are there to give visibility to a lot of these projects. And culture still remains a major fragmentation problem in the industry.
The slide that unfortunately I wasn't able to present showed how there's still a lot of prejudice, and so whether it's national, whether it's racial, whether it's political, check it at the door, because, you know, diverse perspectives make better projects. And then as a guy with a big ego, this is more of a lesson for myself, put your ego aside. You know, I know as an engineer at heart, I have very strong opinions on how code should be written or how, you know, certain libraries should be done. There's plenty of reasons why you want to start a new framework, but there's a world of open source out there, and we as foundations can do a much better job at showing what projects are healthy, what projects are aligned with certain goals. We have a lot to do there, but if we want to really make that bigger impact, we will need commitment from you guys to sort of put your ego on the side. And with that, thank you so much. I appreciate you being here so early and coping with me being a little late. Thank you. I don't know if there are questions. |
Building Strong Foundations for a More Secure Future
Addressing The Systemic Issues in the Software Supply Chain that Led to Log4Shell |
Hi, I'm Brian Behlendorf. I'm the general manager for the Open Source Security Foundation, which is a project hosted at the Linux Foundation, but has its own rather large membership and set of activities and the like. And I thought I'd take the time to talk to you this morning about some of the things that we learned coming out of the Log4Shell incident. And in general, what we're doing at the OpenSSF to try to improve the state of security across all of open source software. And I do apologize for using the term software supply chain. I know folks are sometimes very sensitive to thinking of themselves as suppliers. You're all developers, you're all building components, you're all handing things off to the next person, right? And I, you know, I just want to be sensitive to that and recognize, you know, a lot of us pursue this, not just to write code for our companies or for other people to use, but because we love it, because it's like a form of literature. And so I come to this very much from an expansive view of what software is. But if you'll indulge me with the term software supply chain, you know, lots has been made about the fact that today, in 2023, open source software is incredibly pervasive, because the further upstream you go in most software supply chains, even if the end result is proprietary software, the much more likely it is that your dependencies, your components are open source code. Something like 78%, according to a study by Synopsys last year, 78% of code in a typical product code base, that could be a container image, that could be software in a phone or a car, 78% on average is pre-existing open source code. That last 22% is the part that, you know, the company put its name on and whatever. And 97% of code bases somewhere contain open source software. And 85% of code bases contain open source that is more than four years out of date.
The comparable for this in Log4Shell, by the way, was the number of companies who claimed, we're not vulnerable to the Log4j problem because we're still on version 1.x rather than 2.x, don't worry. That version had been out of support, without any updates, for five years. So this is kind of a disaster. But fixing this requires thinking about, systematically, what does the software supply chain look like? And this is highly simplified. And in a way, this is only what happens at one node of a chain, right? But within a given software lifecycle, you've got the developer writing code from their head, or in partnership with Copilot now, I guess, into an IDE, which then goes into a build system and pulls in dependencies and then creates packages and pushes them out to a consumer of sorts, right? Who could be another developer who then repeats that process and just uses that input as dependencies. And there's at least eight, and there's probably a lot more, but at least eight kind of major opportunities to take advantage of some default biases and assumptions. And frankly, just things we forgot to close up in the course of this development process. Everything from bypassing code review, to compromising the source control system, to modifying code after it's come through source control and into build, to compromising the build platform, to using a bad dependency, to bypassing CI/CD entirely, which we all know happens, to compromising the package repo, to using a bad package as a consumer. So each of those has had examples of compromise in the last few years that have caused major breaches and data loss out there. And of course, I don't know if any of you were on the front lines of fighting this fire over the winter holiday of 2021 into 2022, but it ruined a lot of people's holidays when Log4Shell hit.
And by the way, I want to refer to the vulnerability and the breach and the remediation as the Log4Shell problem, not the Log4j problem, because the Log4j developers don't deserve to have the brand of their project turned into exhibit A in what's broken about open source. It's actually great software. They're all professional developers. Let's give them some credit. There's a bunch of contributing factors we'll walk through, but it was really the Log4Shell breach. And what happened in the course of about six weeks is you went from a researcher for Alibaba in China finding a vulnerability out of the ordinary course of due diligence work that he was doing, reporting it appropriately through the Apache Software Foundation processes, and that leading to the very first CVE, starting from November 24th all the way to January 4th and January 10th. So about six weeks, where you have governments like the UK government warning people of this major systemic issue. And three more CVEs being discovered of various degrees of intensity, each of them leading to a subsequent patch, to a subsequent remediation by exhausted IT teams. If you talk to any of the Log4j developers, I don't believe any of them are here, but they would talk about things like getting these demand letters from corporate legal departments, asking them to fax back a signed attestation that they had fixed the holes in Log4j and that company's use. When there had been no relationship, that company was a free rider on top of their code. So I'm not going to read through each of these steps. I apologize for this, but there was this incredibly compressed timeline where people were intensely stressed, where really the goodwill that we show as open source developers by putting our code out there, and the fair warning that we give people to use it at your own risk, was substantially attacked.
Substantially all these misconceptions that companies have about underlying open source code, and the degree to which they take advantage of it, kind of came to bear. And so it raised a bunch of questions amongst folks who perhaps hadn't thought about this before. Is open source software's generally good reputation for security well-deserved? Does this demonstrate deep and pervasive technical issues across how we consume and develop open source code? Do these issues extend to the sustainability model for open source itself? Can we really depend upon so many, quote, volunteers, right? I mean, think about it. We don't depend upon volunteers to build bridges and highways, to maintain our electrical grid, right? How do we depend upon volunteers to maintain our critical infrastructure that everything runs on, right? And of course, it wasn't just like us as technologists asking these questions of each other. It was like compliance and risk officers. It was the cybersecurity insurance industry. It was the European Union and the White House and the UK's NCSC and other government agencies kind of all challenging us. Do we know what we're doing? And one interesting report, and it is worth your time to read, it's about 49 pages, came out about six months after the fact. What happened was the US government convened a group of experts from across industry and had them go and talk to the Log4j developers, talk to other open source experts, talk to lots and lots of open source foundations, and try to ask, what went on? What contributed to this? And it was modeled after, you know, when a plane crashes and the government will convene, for many of them, like a study group to answer, well, why did this plane crash, other than, of course, sudden loss of altitude? You know, what were the underlying root causes of this plane crash? And so this was modeled after the very same thing.
And it's a great report to read, I think, because it also comes up with some recommendations for how potentially to prevent the next one. And I'll walk through a couple of the conclusions. They said, you know, a focused review of the Log4j code could have identified the unintended functionality that led to the problem. Understand, the original big bug was in a portion of code that had been contributed to Log4j years and years earlier by a company that wanted to support LDAP lookups in real time during the logging process, which seems like an extraordinarily bad idea to me. But okay, I'm not in enterprise IT. So they had added this functionality and then kind of left. They kind of didn't stick around to maintain it. They didn't really do due diligence into the security of their own code. And the other Log4j developers just kind of kept it around because they didn't get many bug reports on it. It wasn't really a problem. So it was kind of this forgotten portion of the code. So part of it was, if they had security resources to look at every line of code that they were shipping out, rather than just the core stuff, which they were pretty diligent about, they might have discovered this bug. It might also have been discovered if the developers themselves had adopted certain secure coding practices consistent with how certain other organizations kind of define how those processes should work. If they'd had design reviews that focused on security and reducing kind of the surface area, the attack surface, sorry, for problems. If they'd used threat models to understand, hey, we really should try to make sure we're better protected against people attacking through the user generated input, right? When you're a logging engine, you're dealing with a ton of user generated input. If you're parsing that for things like format strings, which is what they were doing in this, that's potentially very dangerous, right?
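The danger of parsing user-generated input for format directives can be shown with a small analogy. The sketch below is not Log4j's lookup mechanism; it uses Python's `str.format`, and the `Config`, `Event`, `unsafe_log`, and `safe_log` names are all hypothetical, invented for illustration. It only shows the general pattern: when the text a user supplied is itself interpreted as a template, the user can reach data they were never meant to see.

```python
class Config:
    """Hypothetical application state that should never reach a log line."""

    def __init__(self):
        self.secret = "s3cr3t-token"


class Event:
    """Hypothetical log event carrying user input and a reference to config."""

    def __init__(self, user_input, config):
        self.user_input = user_input
        self.config = config


def unsafe_log(template, event):
    # DANGEROUS: the template comes from the user and is interpreted,
    # so a directive like "{event.config.secret}" walks object attributes.
    return template.format(event=event)


def safe_log(user_input):
    # The template is a constant; user input is substituted as plain data
    # and is never re-parsed for format directives.
    return "user said: {}".format(user_input)


if __name__ == "__main__":
    attack = "{event.config.secret}"
    event = Event(attack, Config())
    print(unsafe_log(attack, event))  # leaks the secret
    print(safe_log(attack))           # prints the attack string verbatim
```

The design point is the same one the talk makes: user input should be treated as data to be recorded, never as instructions to be evaluated by the logger.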
That's something we've known kind of since some of the earliest CVEs out there. And then finally, if they'd had proper security audits. And so in answering that and trying to generalize from it, and it's always dangerous to generalize from a single example, they found that the only way to reduce the likelihood of risk to the entire ecosystem caused by these kinds of vulnerabilities in other widely used open source code was to ensure that as much code as possible is developed pursuant to standardized secure coding practices. Now, that kind of sounds like, well, if only the pilots had had better training, or if only the planes had been maintained better, then we wouldn't have had these problems, right? It seems a little bit like hindsight is 20/20, right? But they did acknowledge that the volunteer based model of open source would need many more resources than it has to be able to make this possible, on average. I mean, you've all heard the aphorism, which is called Linus's law, but it was coined by Eric Raymond, and Linus has disavowed the law actually, that says, with enough eyeballs, all bugs are shallow, right? What was missing from that quote was eyeballs per line of code. And I would argue that even in some of the best-resourced open source projects, we don't have enough eyeballs per line of code. And anything we do that divides that list, such as forks, for example, only takes us further away from having enough eyeballs to review code. There's lots that I can go into about how it's not really just a supply chain security story. It was kind of a supply chain story because of just how pervasive Log4j was. It was a bug that affected everything from the iTunes store, to people badging in with security badges, to all sorts of embedded systems and the like. Log4j was kind of everywhere. And because it was everywhere, it was in a whole lot of places that people didn't have the tools to even go and discover.
Often it was compiled into jar files, so you couldn't even just do a directory listing and grep through it to find log4j.jar. You actually had to interrogate the development process. And without things like SBOMs, which appropriately the US government has focused on, kind of saying this is an important thing to have, without a tool like that, a lot of enterprises were left scrambling to figure out if they were vulnerable and which versions they were running. And then, as I mentioned, making ludicrous claims like, we're not vulnerable because we're way back on an old version of Log4j, or asking a completely disinterested third party like the Log4j developers themselves to attest that they're not vulnerable. It was kind of crazy. Moving on, part of this as well is trying to understand the motivations of developers on an open source project, which again, they took a look at, but I think they could have pursued this even a little bit further. When you work on an open source project, your primary motivation, well, okay, first off, you probably all start as a user of the software. Your first step into an open source project was probably not to write it from scratch. Maybe it is, but in either case, you have some utility, something you want to use it for. So your first interest is to get it running and get it running correctly. And so you're going to be fixing bugs. You're going to be adding features here and there, right? But adding things that help make the software more secure very rarely turns into immediate benefit for you. It's often a thing that's hard to convince your manager is worth doing, right? Because it doesn't necessarily affect what the manager sees in terms of like the feature set and the code, or hey, it's now fit for purpose, or whatever.
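The earlier point about Log4j being bundled inside other jar files, out of reach of a directory listing, can be made concrete. Jar, war, and ear files are zip archives, so one approach is to open each archive and recurse into any archives nested inside it, looking for a telltale class file. The following is a minimal sketch under stated assumptions: the function name `find_marker` and the path formatting are hypothetical, and it looks only for the `JndiLookup.class` entry at its standard package path, so it would miss builds where the package was relocated ("shaded") or the class renamed.

```python
import io
import zipfile

# Archive types that may contain further archives.
NESTED_SUFFIXES = (".jar", ".war", ".ear", ".zip")

# Standard location of the JNDI lookup class inside log4j-core.
MARKER = "org/apache/logging/log4j/core/lookup/JndiLookup.class"


def find_marker(data, path="<archive>", depth=0, max_depth=5):
    """Return the nested paths at which MARKER appears inside a zip/jar."""
    hits = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return hits  # not an archive, nothing to inspect
    with archive:
        for name in archive.namelist():
            if name == MARKER or name.endswith("/" + MARKER):
                hits.append(f"{path}!{name}")
            elif name.lower().endswith(NESTED_SUFFIXES) and depth < max_depth:
                # Recurse into the nested archive, read fully into memory.
                nested = archive.read(name)
                hits.extend(
                    find_marker(nested, f"{path}!{name}", depth + 1, max_depth)
                )
    return hits
```

A caller would read a file's bytes from disk and pass them in, for example `find_marker(open("app.war", "rb").read(), "app.war")`. An SBOM answers the same question without having to open every artifact after the fact.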
So there's a lot of sympathy that we can have for positions taken by folks like Ben, who wrote the Cyber Resilience Act, who was up here yesterday kind of defending what was written in the act, to say, well, maybe there are other forces that need to come into play to help support those kinds of outcomes, right? To act as a forcing function for it if it wouldn't otherwise be there. But it's really hard to measure the return on that benefit independently. And if you can't measure the ROI, it tends to get disincentivized. So as a way to illustrate this, particularly with Log4j, you know, I've had conversations with Amir Montazery, who some of you might know, who's with OSTIF, who's here. We've kind of asked, just, hey, what would it have taken to do a proper third party code review for security of the Log4j code base, right? Just as an independent thing, looking at the number of lines of code in there. And the estimate we came back with was $50,000 to $100,000, depending on how deep you wanted to get. Let's say one of those would have found all four of those CVEs, possibly more, and with a little bit more money, generously, let's say another $50,000 to $100,000, you could have funded the fixes for those bugs and coordinated a disclosure process such that everybody, or a lot of people, would get updated. And then you publish the release, and it wouldn't have been this mad scramble taking place between Christmas and New Year's for a lot of folks. So $200,000, which is beyond what I think any of the Log4j developers had in their back pocket. It's beyond what I think even, you know, the eight or ten of them together would have individually been able to put together or convince their employers to put in as a chunk of cash. But it was far less than the negative impact that that breach had on society, right?
I mean, no one's actually sat and tried to calculate how much, and when they've started, they've come back with billions of dollars in lost productivity, in breaches and other things. So like trying to play that back and do hindsight, you know, 20/20, that kind of thing, could we have discovered this and fixed it? You know, could we find the next one and spend $200,000 and keep that from being likely to happen? I don't think I could give you the one that that's likely to be, but what if I could give you a list of 200 projects, each of which probably had a greater than 1% chance, based on their criticality, as you can measure from how often these code bases are depended upon by other packages? There's lots of data sources for that, and we at the OpenSSF have developed something called the criticality score that'll find that. What if we could find this list of 200 projects based on criticality, based on how well they score by some objective measure of risk? And I'll get into this in a little bit. Could I give you that list of 200, and could that likely, I mean, with more than 50% chance, prevent the next Log4j? I would wager yes. And that $40 million, again, is more than any open source foundation has to be able to spend on this kind of work. Even all the foundations together collectively probably couldn't spend that. And this is the kind of thing you probably have to do each year in order to have that kind of impact. But $40 million is, I don't mean to sound blasé about it, but frankly, pocket change for a lot of governments, especially if we got governments to work together on this, or the insurance industry to work together, or say many of the sectors who use this software without lifting a finger to contribute back, right? If we pooled these kinds of funds, we could have an impact like that. Some people just want to watch the world burn.
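The $40 million figure follows from the numbers given in the talk: an audit estimated at $50,000 to $100,000 per project, a similar amount for fixes and coordinated disclosure, and a list of 200 critical projects. The sketch below just spells out that arithmetic; the constant and function names are hypothetical, and the ranges are the speaker's rough estimates, not measured costs.

```python
# Rough per-project estimates quoted in the talk, in US dollars.
AUDIT_COST_RANGE = (50_000, 100_000)   # third-party security review
FIX_COST_RANGE = (50_000, 100_000)     # fixes plus coordinated disclosure
PROJECT_COUNT = 200                    # size of the critical-project list


def per_project_range():
    """Low and high total cost of auditing and fixing one project."""
    low = AUDIT_COST_RANGE[0] + FIX_COST_RANGE[0]
    high = AUDIT_COST_RANGE[1] + FIX_COST_RANGE[1]
    return low, high


def program_range(project_count=PROJECT_COUNT):
    """Low and high cost of covering every project on the list once."""
    low, high = per_project_range()
    return low * project_count, high * project_count


if __name__ == "__main__":
    print(per_project_range())  # (100000, 200000)
    print(program_range())      # (20000000, 40000000)
```

So $40 million is the upper end of the range, using the full $200,000 for each of the 200 projects; at the low end of both estimates the same program would cost $20 million.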
I had to throw in the obligatory slide, of course, but I want to kind of push forward this theory of change then around security. Because it's not just about spending money on specific interventions like that. I'll come back to that in a little bit, how we might rally those kinds of funds and focus on that kind of work, and what we're doing as the OpenSSF to help it. But I want to also put forward, it's actually not just a matter of spending money. It's not just a matter of a mandate from a government to get open source software to be more secure, to get our processes and the supply chain to be more secure. There's a culture change that has to happen as well. People are often very resistant to change. When your CI system is running and you're able to put out a new release and turn the crank, and a few hours after accepting a pull request you've got a binary, you kind of don't want to mess it up. You don't want to change it. And especially the older we get, the more resistant we are to having to learn a new system or change, especially if there seems to be no benefit. But this is not unlike other times over the last 30 years that we've taken an insecure paradigm and made it more secure. And my general theory is something called carrots, defaults, and sticks. And the best example I can come up with for this is how we went from a completely unencrypted web, where browsers and servers talked clear text HTTP that could be sniffed by anybody between browser and server, and got to the point where today, I mean, somebody might know the number, it was like 95% of web traffic is HTTPS. It's actually probably 99% now based on what's happening recently with browsers. But it didn't start with the browser makers saying, right, on April 1st, we're going to cut off or send these warning signs about unencrypted access, in the beginning of the TLS era. It started with incentives. It started with carrots.
It started by having that little green key that would show up in the location bar on a browser. It might start by certain folks saying, well, this is something that should be used for banking websites or for e-commerce websites or, you know, hosted email sites or that kind of thing. And that got about 15% of the web traffic getting encrypted out there, but it started to flatline. And then a bunch of people got together and realized, you know, it doesn't have to be this hard to get a TLS certificate and install it in the right place. We can automate this. We can automate demonstrating that you have control over this domain name. And if you do, then give you a short lived TLS certificate that can automatically be installed in the right place in the web server. And that service was called Let's Encrypt. And it is now, I mean, for the last 10 years, it's been at the point where, when you install Apache, when you install a web server and tell it to install, you know, the TLS kind of version of that or a TLS profile, it will automatically set up a fetch to Let's Encrypt for a domain name you give it. And it's like automatable out of the box, right? And that is what got us from 15% of the web being encrypted to about 75%. And at that point, about five, six years ago, is when the web browser makers said, right, it's time for us to bring up the tail. And that's where I talk about sticks. And to finally get the laggards, the legacy sites, the folks who probably don't care about it, who probably haven't even updated their web server in five years or 10 or whatever, to finally get off their duff and use Let's Encrypt or some other technique. And they did that by making it progressively harder and harder for you to access a non-encrypted website through Firefox, through Chrome, through MSIE, through other browsers. And they kind of talked amongst themselves about how to do that.
They tried not to piss people off, but you kind of have to piss some people off to do that. And as long as you just kind of progressively roll through, you can kind of bring people along. And there's some who will just forever be pissed off. But that's like the tail end of an adoption curve, right, this kind of concept of sticks. We need to think about the same thing when it comes to things like SBOMs or signing artifacts in the supply chain or software attestation levels, which I'll get into in a bit. But when we think about how to get adoption of some of these security paradigms, it's got to be through this three-step kind of process. We can't just jump directly to sticks, which is kind of what the European Union Cyber Resilience Act attempts to do. And I will say, I think the CRA is a backlash to the Silicon Valley move fast and break things kind of paradigm, this concept that open source software is some sort of reflection of that or connected to that and that we're just as reckless. But none of you all are Mark Zuckerberg, thankfully. None of you all, I think, take that degree of recklessness as a badge of honor. I think we're all just completely strapped for the amount of time that we'd really like to spend on making the software as secure as possible. And we need help. We need defaults. And we need, by the way, what's always worked in open source software as a do-ocracy, which is people showing up and doing the work, if that's what their primary interest is about. If somebody can sit on stage here and say, it's absolutely essential that this French nuclear power plant only run open source software that has been certified against a whole bunch of cybersecurity requirements, it's on them to do that work, not on the Log4j developers or others. So this is where OpenSSF comes in.
We were started in 2020, kind of as a result of a small gathering that had been hosted on the West Coast, of people working on software projects that had to do with enhancing the software development processes in the open source community to be a bit more secure. It was a mix of a bunch of different pieces of software and suggestions of protocols, building on some of the SBOM work that had actually been championed first by the software licensing community. This is in particular a standard called SPDX for SBOMs. But they kind of realized that collectively what they were doing was building tools that would help try to measure risk in open source. And what does that mean? It means measuring the likelihood that there will be a new undiscovered vulnerability in this component and the impact that that would have downstream. Measurement is essential. If we can't measure whether we're improving the overall risk in that chain and the collective risk in our use of that software, we're not going to know whether the interventions that we're trying are actually meaningful. So you've got to measure it. You've got to then think about this sequence of carrots and defaults and sticks to eventually get this stuff adopted if it's any good. And then finally, as part of this culture change, are there things that we should be learning as open source developers, things that we should be thinking about as a professional type of operation, like a duty of care, as engineers? And that term used to mean you had to go and take a certification exam to call yourself an engineer, right? And that instilled a sense of professionalism in that industry that led to bridges that didn't fall down when you hired an engineer to design them. We need a little bit of the same professionalism in software development across the board, not just open source. And here are a set of resources that might help us, from a security point of view, be better developers.
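As a minimal sketch of the risk measurement described above, assuming that likelihood and downstream impact are combined by simple multiplication (an assumption, the talk does not give a formula), and with hypothetical component names and numbers:

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    likelihood: float  # assumed probability (0..1) of a new undiscovered vulnerability
    impact: float      # assumed downstream impact on an arbitrary 0..10 scale


def risk(component: Component) -> float:
    """Estimate risk as likelihood times impact (an illustrative assumption)."""
    return component.likelihood * component.impact


def rank_by_risk(components: list[Component]) -> list[Component]:
    """Return components ordered from highest to lowest estimated risk."""
    return sorted(components, key=risk, reverse=True)


# Hypothetical inventory: a widely used library and a small helper tool.
components = [
    Component("logging-lib", likelihood=0.2, impact=9.0),
    Component("cli-helper", likelihood=0.5, impact=1.0),
]
ranked = rank_by_risk(components)
```

Re-running the same ranking before and after an intervention is one way to check whether the intervention moved the numbers at all, which is the point being made about measurement.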
So collectively, we want to put these pieces together. And we've got all sorts of projects organized by thematically related working groups: a working group on best practices and documenting those and advocating for those, a working group on identifying security threats, understanding, relatively speaking, what are the areas to really worry about and the areas that might represent low threat. How do we think about supply chain integrity, like that chart I showed? How do you get those pieces, those opportunities for bugs to be inserted, to just be locked down and hardened? How do we think about the CVE system? It is not great. Frankly, it's nowhere near perfect. It's not great for trying to automate and understand, given this collection of software I use, where are the vulnerabilities? Where are the known vulnerabilities? And how easy is it to remediate them? Are there known vulnerabilities that just don't matter because I'm not using them? And so there's an entire working group focused on vulnerability disclosures and on the vulnerability system that has a bunch of new ideas for this, but has also developed content to try to help developers simply be better at coordinating vulnerability disclosures and updates. We've got another working group focused on identifying and securing critical projects, and that's where we've defined the criticality score. We've done some work with Harvard Business School to understand quantitatively how things are being used by enterprises and where the next Log4j might be lurking, so to speak.
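A sketch in the spirit of the criticality score mentioned above: each signal is log-scaled, capped at a threshold, and combined as a weighted average so the result falls between 0 and 1. The signal names, weights, and thresholds here are illustrative assumptions, not the values the working group uses.

```python
import math


def criticality_score(signals: list[tuple[float, float, float]]) -> float:
    """Combine (value, weight, threshold) signals into a score between 0 and 1.

    Each signal contributes weight * log(1 + value) / log(1 + max(value, threshold)),
    and the sum is divided by the total weight.
    """
    if not signals:
        raise ValueError("at least one signal is required")
    for value, weight, threshold in signals:
        if value < 0 or weight <= 0 or threshold <= 0:
            raise ValueError("values must be >= 0, weights and thresholds > 0")
    total_weight = sum(weight for _, weight, _ in signals)
    weighted_sum = sum(
        weight * math.log(1 + value) / math.log(1 + max(value, threshold))
        for value, weight, threshold in signals
    )
    return weighted_sum / total_weight


# Hypothetical project: (value, weight, threshold) per signal.
example_signals = [
    (120.0, 2.0, 5000.0),   # contributor count
    (40.0, 1.0, 1000.0),    # commits per year
    (3000.0, 2.0, 500000.0) # dependent projects
]
example_score = criticality_score(example_signals)
```

The log scaling means the difference between 10 and 100 dependents counts for more than the difference between 10,000 and 10,090, which suits the goal of spotting widely used projects without letting one huge number dominate.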
And then one of the most important things we've got here is we've pulled together the architects and the people responsible for product at many of the major software repositories, NPM, PyPI, Maven Central, because if we're going to get anything improved throughout the chain, you need to involve the last couple of hops in that chain, which are the distribution points. And there are things you can do there to encourage more secure alternatives and eventually have the stick to say, well, no, we're not going to accept this. You know, you might need to enforce two-factor auth for the more popular packages, right, which has been controversial to say the least, but is one of those things where somewhere in that adoption curve we need to start nudging people in a more secure direction. But all of these pieces work together, and if you are an open source software maintainer, what I'm going to walk through now is a set of specific things coming out of the OpenSSF that I'd love you to adopt. I'm not going to be able to talk about all the features of each, just given time, but we've come up with the very best starting points you can start to consume. The very first thing that you can get from the OpenSSF are two concise guides, one that we've developed for evaluating open source code. When you're out there looking at packages and you're trying to figure out, is this community likely to have processes, and is there likely to be an undiscovered vulnerability lurking in this code that I'm about to use? Is this a well-engineered, well-maintained, active community that has adopted the right practices and the like, or is this a one-off that was developed by one person in a hurry, thrown up on a repo and not well maintained? I mean, we've got some ad hoc cues that most of us use, but how many of you use GitHub stars as your basis for deciding whether something is probably secure enough or not?
I'm going to guess probably too many of you. So there's a bunch of kind of subjective criteria that you can use. The flip side of that is, if you are a maintainer and you are pushing code out, here are the signals you can send to the consumers of that software that show that you're taking this stuff seriously. And it's a bunch of best practices: adopting multi-factor auth, taking the courses on secure software development, using a specific combination of tools in the CI pipeline, thinking about how you get to the point of doing rapid updates without throwing curveballs to your users because you change APIs all the time. That's the number one reason people don't update: they assume that even in minor point releases something is going to break, because somebody took an API and not only marked it deprecated but removed it, or changed a field in a way that just sends things sideways, or changed behavior in a way that they thought was compatible but was not. How do you get to the point where you can have more rapid updates and make it easier for your end users to pick those up? Now some of these ideas were elaborated upon in a course that we built within OpenSSF and have offered for free through the Linux Foundation's training department, called Secure Software Development Fundamentals. This has been translated to Japanese. It's being translated to a bunch of other languages: Chinese, Arabic, and Hebrew. And this is 14 to 18 hours' worth of content that primarily talks about anti-patterns. What does it mean to not trust user-contributed input? What are some of the other common gotchas that have led to security vulnerabilities and breaches? Most software developers are self-taught. Some people take courses, but even most university undergraduate-level courses on computer science don't really teach about vulnerabilities and about common mistakes as well as they could.
So this is something we think anybody who is writing code for a living, or even for a hobby, and giving it to somebody else to run should probably take. And the flip side of this is you might want to look and see whether the developers who are working on a thing you really care about have taken this course. You can get a badge that certifies you've taken the course and you've answered a basic quiz. It's not onerous, and it's free. But we hope it's something that helps substantiate that somebody is a bit more knowledgeable about this than they otherwise would be. Another part of this is something called the best practices badge. This is a fairly extensive checklist of the things that open source projects can do to show that they take steps: they have a security team, they distribute things over HTTPS. I mean, some of these things seem pretty basic, and each individual one is no guarantee that it's security-hole-free code, but collectively they can show that this is a project that takes security more seriously. And studies have shown that the projects that have better scores tend to have fewer CVEs over the subsequent months and years. Now, the best practices badge is something that requires the maintainers to fill out kind of a questionnaire, a checklist. There's some automation to that, but it's really just used to check the answers that the maintainers give. There's a different approach, which is much more of a scanning kind of approach, called the OpenSSF Security Scorecards, which automatically goes and scans repositories. It's gone and done a first-wave scan of a million different repos, and you can trigger it to do an updated scan if you've made changes to your repo, and it uses dozens of different heuristics. Things like, do you have binary artifacts you've checked into your GitHub repo? That's probably not a great thing.
Storing binaries in repos, I don't know how many of you might disagree, like for reproducibility maybe, but use your package manager to get your binary packages. Checking them into source code control opens the door to things that are not scrutinizable inside your source code system. Branch protection, CI tests, do you have the best practices badge? Have you done code reviews before code is merged, or does everybody just have commit privs, right? Do you have contributors from more than one organization? So some of this adopts the CHAOSS metrics, which look at community health, but some of this as well are things like, do your automated tests call fuzzing routines, fuzzing libraries? Now, you can game a lot of these tests, right? Again, none of this is proof that your code is defect free, but collectively what this can do, along with these other kinds of measures of risk, is essentially develop a credit score for a project. And the scores in Scorecard actually do correlate to lower CVEs. There was a study done by Sonatype, who looked at the projects that had been scored. There are a couple of different categories within the Security Scorecards, so they were really interested in which of those correlate most strongly to a lower number of CVEs. So that was an interesting outcome. And this is going to be used to refine the Security Scorecards continuously over time, to have them reflect the changing landscape of some of the better-run projects. But that's a big deal. And it's something that some projects have picked up as a leaderboard tool. The Cloud Native Computing Foundation recently ran kind of a competition called the CLOMonitor, on the sidelines of their main KubeCon event for a month, where they got the maintainers of the different projects to commit to have a floor on the score.
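The idea of rolling many heuristics into one score and holding projects to a floor can be sketched as a weighted average. This is only an illustration: the check names echo the heuristics mentioned above, while the per-check scores, the weights, and the floor are hypothetical and do not reproduce how the real scanner computes its result.

```python
def aggregate_score(checks: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-check scores, each on a 0..10 scale."""
    total_weight = sum(weights[name] for name in checks)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(score * weights[name] for name, score in checks.items())
    return weighted_sum / total_weight


def meets_floor(score: float, floor: float) -> bool:
    """True when the aggregate score is at or above the agreed floor."""
    return score >= floor


# Hypothetical results for one repository (10 is best, 0 is worst).
checks = {
    "binary_artifacts": 10.0,   # no binaries checked into the repo
    "branch_protection": 4.0,
    "code_review": 7.0,
    "fuzzing": 0.0,             # tests do not call any fuzzing library
}
# Hypothetical weights: fuzzing counts half as much as the others.
weights = {
    "binary_artifacts": 2.0,
    "branch_protection": 2.0,
    "code_review": 2.0,
    "fuzzing": 1.0,
}
score = aggregate_score(checks, weights)
passes = meets_floor(score, floor=6.0)
```

Because any single check can be gamed, a maintainer reading such a score would still want to look at the individual checks rather than only the total.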
I think it was six or seven out of 10 for all of their projects, right? And they had kind of a competition between them, with rewards for the maintainers who got their projects highest on the scorecard. So really cool to see. So all of that was about measurement. The next set of things are about tools that help actually harden the software supply chain, so to speak. And one that you've no doubt heard talked about before, so I won't dwell on it too much, is something called Sigstore. And Sigstore is a software signing service. It's a set of software that are clients into that service. It's a protocol, because it's a certain way of signing that's an alternative to GPG signing of code. And it's a recognition that we haven't really signed artifacts through the supply chain pervasively, except for the very end. You know, when you do an apt-get install, it checks the GPG signatures on each package. Okay, that is helpful even above and beyond the fact that you're sending stuff over TLS, right? But for the rest of upstream, so often people are just pulling off of NPM, pulling off of package hosting, or other packages just stored on bare websites, where even validating the hash of that and fetching it over HTTPS doesn't prove the connection between the developers behind that code and the binary you have. There have been examples of people registering NPM packages that are named the same thing as a GitHub repo, kind of a typosquatting kind of attack, as a way to try to cause you to inadvertently pick up the wrong piece of code. Obviously, even sites like GitHub can be hacked, could be compromised, and you don't want your enterprise to be compromised if GitHub is compromised, frankly.
So this is a tool to try to prevent that kind of thing, and it logs all of this to essentially a distributed ledger, a public database, using short-lived keys, so you don't have to worry, like PGP requires you to, about having private keys that you guard and keep private for a long time. Just like Let's Encrypt, this is based on short-lived keys and an easy way to reprovision those keys. I won't go into much more depth on that. There's a few other things I'll point you to. Something called SLSA, pronounced "salsa", which is for supply chain level attestations, basically a way to distinguish between those things in your chain that are built to a higher degree of rigor and other things that are not. We've got something else, back in the best practices working group, which is a guide to coordinated vulnerability disclosure. So the next time there's a team, say they're not associated with Apache, say they're not associated with a major foundation, who discover that they've got a pretty nasty bug and their code is used pretty widely, well, who do they turn to to understand how to manage a coordinated disclosure process, how to not get in trouble for keeping something a little bit quiet while they come up with the right fix, and then how to evaluate who potentially to notify ahead of time so that you're upgrading enough of the internet before something becomes more widely known? And by the way, even if that sounds controversial, like there's some people who say, including the CRA, if you know about a bug, you should tell everybody immediately, well, does anyone remember the hack that almost brought down the internet, the DNS cache poisoning bug in BIND? If that had become widely known before the root name servers had been updated, it would have set us back so far that we would have had to revert to /etc/hosts files to be able to connect on the internet again and get started.
So the need for coordinated vulnerability disclosure might be somewhat controversial, but it has become much more widely accepted today than ever before. And one thing we're going to do in the OpenSSF is pull this all together into kind of a single dashboard to understand that risk, to understand how open source projects compare apples to apples, and really as a tool to help the maintainers on those projects get better at what they do. But also, frankly, if we can help enterprises understand where the risk lies in their use of code, if they can start making more choices based on the security of that code rather than necessarily on what has the most features or the most users, then we can start to, I think, bend industry in a better direction, somewhere between the carrots and the defaults kind of step of getting folks to adopt stuff. I do also want to mention one of the biggest efforts that we have under the OpenSSF, this thing called the Alpha-Omega project, which is independently funded and staffed, and it's got two pieces to it. The Alpha side is going and helping the largest open source foundations out there with the most critical needs basically develop better security practices, develop security teams, go and do some proactive third-party audits, but develop this muscle, this capability that hopefully persists. You know, we'll go and we'll fund some of these projects for a couple of years, and our hope is that at some point the stakeholders in that community take on that funding themselves, right? The companies who are depending upon that see the value of this kind of proactive investment and continue it forward so we can move on to the next set of projects. The Omega side of that is trying to set up scanning infrastructure for the most important 10,000 open source code bases to look for new kinds of vulnerabilities, to ask, you know, in theory, this JNDI LDAP bug in Log4j that led to this thing, is it novel?
And if it's novel, can we systematically scan for other projects that might be vulnerable to the same thing? And in some cases, could we even submit proactive pull requests to go and close those? And an example of another organization that has done this recently, I don't know if folks saw this announcement by Trellix last week, where they went and discovered 60,000, 61,000 open source projects that used the Python tarfile module in an insecure way. This is actually an old CVE from 2007 that the CPython devs have refused to fix because of a claim that the only way to fix it would be to break POSIX. So we can have that debate some other time. They went and found 61,000 projects that have a vulnerability because they used it unsafely. They didn't sanitize inputs to it. They went and proactively issued 61,000 pull requests on those projects to fix this code. Doing this at scale is tremendously hard, but they did it. So far, after a month and a half of having those pull requests up, do you want to guess how many projects have actually accepted that pull request? Literally pushed the button to make their project more secure? 1,000. We still have a culture problem here. We still have an incentives problem. Even when you've given them this gift, it might not happen. Towards the end of my time, I just want to say, we've recognized, as in the point I made earlier, that you've got to show up. If you're an organization that cares about increasing the security of code, you've got to be prepared to invest time and ultimately money to make that happen. You cannot demand that open source developers simply be better. You've got to go and help them do that and put that in.
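The unsafe tarfile usage described above comes down to extracting an archive without checking where each member would land. A minimal sketch of the kind of sanitizing that was missing, assuming a hypothetical helper named `safe_extract` (not the actual patch from those pull requests), and covering only path traversal through member names, not symlink or hard-link tricks:

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """True when target resolves to a path inside directory."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(archive_path: str, destination: str) -> None:
    """Extract an archive only if every member stays inside destination."""
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            member_path = os.path.join(destination, member.name)
            if not is_within_directory(destination, member_path):
                # A name such as "../evil.txt" would escape the destination.
                raise ValueError(f"blocked path traversal attempt: {member.name}")
        archive.extractall(destination)
```

Recent Python releases also provide extraction filters in the tarfile module, which are likely a better choice than a hand-rolled check where they are available.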
So one of the things we've done at the OpenSSF is, over the last year, we developed an overarching plan to go and address 10 systemic weaknesses in open source and put together essentially business plans for each of them that would call for some funding to pay for the core of a project, on the presumption that we could leverage volunteers around the periphery to try and have some of this impact. I won't go into too much detail of what that is. We call it the security mobilization plan. It includes things like $40 million to go and close security holes and some other things. It ranges from setting up an emergency response team for development teams that are understaffed who find a vulnerability, to going and doing new scanning, to driving adoption of Sigstore and other standards. So it's pretty comprehensive. And $150 million is the number we came up with after asking what we could do that would be lean, that would be kind of low-hanging fruit, but actually have a big impact. And we came up with this two-year number of $150 million. It might sound like a lot of money. It's certainly more than the Linux Foundation has. It's more than any of the other major open source foundations or even, frankly, Google and Microsoft have to spend on this, arguably. But there's a larger number out there that I want to focus on, which is $700 million, which is the fine that the U.S. Federal Trade Commission levied on Equifax for the 2017 data breach, caused in part by their use of unpatched open source software, Apache Struts. So to the industry, making the case that we collectively could pool our funds to go do this should be easy, right? And we've been out there trying to do it and having these conversations at a time when the economic headwinds have not been in our favor. But still, the kinds of conversations we're having are very positive.
And things like seeing the Sovereign Tech Fund in Germany pop up and be able to fund not just some improvements in open source code, but security enhancement and the like, has been really positive to see. And it should be a model for other countries to go and do this. But frankly, also for insurance companies as well as banks, as well as all these other industries that have benefited from open source software and haven't really put things in. And again, we're kind of out of time, so I won't go too much into depth. I do want to emphasize that tooling around the SBOM space is an important part of this as well, and being able to paint this overarching picture about how SBOMs, signed artifacts, software level attestations, and all these other things could have a positive impact. But we've got to show up and not just tell projects to adopt this, but weave it into the development tools as defaults so we can bring industry along. We launched this with the U.S. government back in May. We had a similar meeting in Japan in July. We've had conversations in Singapore and with people in other countries, and hopefully we'll see something here in Europe along the same lines. But let me just wrap up by saying there are attacks on the integrity of open source software that are increasingly disruptive and require us to work together to do things a little bit differently than we have before. It's not that we were reckless or careless or didn't care about security, most of us at least. There's some open source projects that definitely have been problems. But there's simply more that we can do, and more that people who care about this and are starting to use open source software in French nuclear power plants can do, to help this space be better. There's some specific steps, as I've talked about, and that's why we're here at the OpenSSF, and we're here to help, but we also need your help as developers.
It'd be very easy to see what we do as being the enterprises trying to make open source boring and make it all about checklists, and I'll concede that. But we're also here to try to say, if we're going to shift the landscape, how do we work with the open source community to do that? Because we are part of the open source community. How do we collectively take some action to make it less likely that our next winter holidays will be ruined? And with that, thank you. I think I left about 22 seconds for questions. Okay, five minutes. Okay, so we've got about five minutes for questions. So let me start by reading one question online. There was a question whether OpenSSF allows for anonymous reporting, which could be without disclosing the reporter, because that would be useful for companies fearing backlash or high expectations regarding what they have to do. So why bother reporting if you're just going to get into trouble? Why not just keep it, not release anything? So making it possible to report anonymously would possibly improve that. Well, so the question is about anonymous reporting of bugs, and would that be helpful? Well, I will submit that open source software has benefited tremendously from the fact that you aren't badging in to GitHub with your national ID, right? You are able to be pseudonymous. And you can use Satoshi Nakamoto as a famous example of that, whatever, right? But most of us, or many of us, have probably created login IDs that have nothing to do with our real name, right? And open source software is one of the last remaining places where you can actually productively collaborate with people who aren't, you know, fully identifying who they are, right? You're basing it on the quality of their contribution. And that's a really essential thing to try to preserve.
And whether it's in reporting bugs or even collaborating on code, we should fight to preserve the right to be pseudonymous or even anonymous, if you want to call it that, in the development of code and in the reporting and fixing of bugs. So I'm committed to trying to make sure we don't get to "know your developer" kinds of rules like some countries have started to call for. Thanks for a really interesting talk. I have a question about the economics of your proposed solution. It's not that you can pay developers out of the blue to do these tasks. You have to pull them away from other work. So you actually have to pay them more to let all the other work rest and focus on security. So this requires a major shift. Have you factored this into your proposal? No, I haven't thought about having to pay not just for the work that we're doing, but also paying for work that wouldn't be done because we're paying people to do this work. I think most of the time developers aren't able to work on open source code because they have to work on proprietary code to pay the bills. And so I think there's a lot of capacity out there for us, if we do have funds, to be able to pay for the kind of work that needs to be done. I also don't think we're talking about taking away from other software development work that's about adding features or fixing bugs. This is about bringing in new types of organizations, like the software auditor community, to look for and find new bugs. So it's not a big deal, and frankly $150 million, even if that were all spent on software developers, would be a drop in the bucket compared to the total amount that is spent on developer salaries out there. So that doesn't worry me. But it's a good question. Thank you for that. Do you have more questions from the public? Okay, well, thank you so much. There was one question in the chat about how the OpenSSF is collaborating with OWASP. Is there a collaboration there or not?
So there's a question in chat on how the OpenSSF is collaborating with OWASP. So Andrew van der Stock, who's the executive director of OWASP, is on our board. There's a lot that OWASP does in terms of education and certification and community building that absolutely is essential, and so we look for ways to work together and we'd like to avoid overlap with what they do. But yeah, that's about it. We're very complementary to their efforts and I think they think the same. Okay, anybody else? Go. Hello, we are in Europe here. You have spoken about different conferences all over the world, in Singapore, but do you have something in Europe? We've had lots of conversations in Europe. There are many OSPOs starting in Europe who are interested in this, and I do think OSPOs are an interesting leverage point in being able to get standards adopted around security, being able to present measures of risk, to be thinking about all these kinds of things. So we have had interesting conversations that way, but have not had the kind of full-throated engagement around this that we've seen with the United States and with some other countries. So I would like to see more, but frankly even those other countries haven't yet put money into this. We're kind of waiting for certain political cycles to make their way through, but I know we've inspired some action. Again, I do want to cite the Sovereign Tech Fund out of Germany as not specifically a security fund but the right kind of thing for these countries to be doing. So anyways, thank you for the question. Anything else? That's exactly how much time we've had. So thank you again, Brian. Thanks all. |
Rosegarden: A Slumbering Giant
How a 20-year old OSS project is still going strong |
Great pleasure to be here. Can you all hear me at the back? Can you? Yeah, thank you. Yeah, wow. What was them? Thank you to the organizers and to the staff for inviting me here. It's a great pleasure to be here. And yeah, if you'd asked me a year ago, would I be here? I'd be like, no, because I hadn't actually heard of FOSDEM. Apologies at that point. If I'd been told that I'd be talking about Rosegarden, which is something I've not worked on myself for a very long time, not contributed to, I'd be doubly surprised. However, I think it's important to look back sometimes. And I've called this talk Rosegarden, a slumbering giant. And I didn't really think when I was putting this submission in as well. So I thought a slumbering giant, that made sense to me at the time. Who knows what slumbering actually means? Okay, I should explain that, because I was thinking about maybe changing it. So slumbering actually means not quite sleeping, but also not quite awake either. And it's kind of in relation to the fact that Rosegarden has been around for a very long time. Maybe just as a show of hands as well. Okay, let's start off simply. How many people have used computers to make music? Okay, so we've got about 20 maybe. That's pretty good. Keep your hand up if you use Linux to make music. Okay, that's actually more, I think. That's weird. Okay. About the same maybe. So how many people have heard of Rosegarden? Okay, it's not bad. Not impressed. Okay, good, good stuff. So Rosegarden started actually longer than 20 years ago. Let me show you. So this is the kind of current Rosegarden. This is how it looks now. And if you're familiar with music software... By the way, I've got a pointer. Can you see it? Yeah, it's cool. I usually just use it to torture cats, but this is the first time I've actually used it in a presentation.
So yeah, if you're unfamiliar with digital audio workstations, I'll give you a little tour. So we have down here some instruments, and you can map those to tracks, and they're either MIDI tracks, which is Musical Instrument Digital Interface tracks, or you can have audio tracks too. And this is how Rosegarden essentially works. It allows you to compose. It allows you to record into it from a digital instrument, or from an analog instrument, and then you can edit it on the screen. Just to go back to the beginning, the talk today is going to be about the history of Rosegarden. It's going to be about how I believe that good engineering practice can help. It was an interesting talk, the last talk from Brian, about using the latest tools and technology to understand how open source software can be better. So as part of this talk, I'm going to look at how Rosegarden was put together in the first place, and why I believe it's still around because of the way it was put together, and because of some of the design decisions that we made during the creation of it. Oh yeah, if I didn't mention, there's a QR code here. If you want to scan that, you can go to a page where there'll be some links. I'll also share this at the end. Rosegarden was really kind of a labour of love. It was almost an accidental success, but it was kind of done deliberately in a way. That doesn't really make any sense, but Rosegarden came about originally from a university project. So it's actually a lot older than 20 years. It's almost 30 years old. In fact, it's over 20 years old. And then it became something of its own thing. So just diving back into how it looks these days again. So we've got a piano roll editor on one side and a notation editor on the other. And this was one of the great powers of the original Rosegarden, as well. It had both notation editing and MIDI editing too. And this is the current Rosegarden homepage where you can download it and you can see how often it's released.
So up here, you can see there was a release in December. The current maintainer, Ted, works very hard on these. And when I'm talking about the engineering processes that go behind it, you'll see towards the end of the presentation that there's a lot of manual work involved still. So some of the best practices that we get through our day jobs can actually help us understand how we can make software like this more maintainable and easier to release. It's not atypical as well for large and old software products like this to have quite slow release cycles. And typically, once it's released, it goes into the usual repositories where it can be built and packaged for download on various different Linux distributions. It's been translated into around 20 different languages, including Chinese, Japanese, Italian, Spanish, Finnish, German, loads. And over the last 20 years, it's been downloaded all over the world, as you can imagine. And these are some stats. So interestingly enough, we're still on SourceForge. That's down to the developers having a certain way of looking at things. So SourceForge is the preferred way of doing things for a lot of the developers because they've been involved in it for a long time. And perhaps they're a little bit behind the times. So they like to dig their heels in sometimes when it comes to what is acceptable in terms of the technology that is being used. So the good part about it is that we have all the stats from the beginning of time, basically, all the way to today on how many times it gets downloaded. And this is source packages only, essentially, as well. So it's not bad. I mean, at its height, it was 5,000 source package downloads per month. And even with the last release, it was still around 2,000. So if you add that into the mix with all the pre-packaged stuff that happens for the distributions, then it's a pretty tidy amount. So who am I and why am I here? So I worked on Rosegarden between '95 and 2004-2005.
I'm now an independent consultant. And I kind of specialize mainly in software delivery and customer focus. And that means that over the years, as a professional software engineer, I've learned various techniques which have helped me understand, OK, well, maybe we could do things slightly better. And I like to talk and write about this now. And this is how this talk, I suppose, kind of came about, because I was thinking, what is the project that gave me the most pleasure over these years? And when it comes down to it, undoubtedly Rosegarden was the thing. And some of the takeaways I believe that we can get from this are to understand how open source software becomes important and how it stays important as a social enterprise. So this is really a story about motivation and why we do things the way we do. And how open source software becomes something that can take over our lives as well. So just to expand on that a little bit, I've kind of broken it down into three areas, which I believe are the core components of how open source kind of gets good, how open source projects become a thing in the first place and how they then sustain themselves. And it's mainly, spoilers, mainly around the social exercise. We do it because we kind of want to. We're driven to do it. Of course, we do it together as well because we can't do it by ourselves. So that's number one for me, the effort that it takes to build something. So when you look at something like Rosegarden or any other huge open source project, you end up with hundreds of thousands of lines of code and lots of contributors from all over the world doing various different things, either writing code themselves, doing translations, or helping out with the website and documentation. And this is a snapshot from the Rosegarden website of the contributors that have been involved over the years. Secondly, just as you can't do it without effort, you can't do it without support.
This is a picture of the time that's elapsed between when I stopped working on Rosegarden and pretty much today. This is my son, Sam. He was one when I took a photo of him and put him on this poster to be used at one of the sound expos that we did in 2004. And this is him last year as he graduated from high school. And this was probably the reason why I stopped contributing to Rosegarden at that time, because I went off to have a family. And of course the family is important for support, so they have to keep you going, maintaining your energy to be able to do stuff. In addition to that, we also see how the working world comes into play, one second. So after university, I went into, sorry. It's how the working world shapes our experience and our perception of what goes on in open source land. So after university, basically after Rosegarden first came out, it was going into engineering and becoming a person who would actually grow as an engineer and then be able to feed back into that project. So this is the start, basically, of Rosegarden. So I went to the University of Bath between 1990 and 1994. Rosegarden started as a university project by a couple of friends of mine, Chris Cannam and Melissa Wilson. And they did it for a final-year project at Bath, and it was written for John Fitch, who was one of the computer science professors there. It was his idea to come up with a UNIX sequencer and notation editor. I think it was the combined thing that he came up with. So Rosegarden itself was born of a university project in around 1992. And the name is a bit strange. So Rosegarden, where does that come from? It's actually thanks to these guys. These were a band called Bauhaus, who came out of Northampton in the UK in 1978. And Chris Cannam, one of the lead developers, the originators of Rosegarden, was a big fan of Bauhaus, and they had one of the songs called Rosegarden Funeral of Sores, which is actually a song by John Cale.
And he, being a bit of a wag, thought, it's probably going to be crap, this notation editor. So we'll call it Rosegarden, after Funeral of Scores, haha, scores, notation, get it? So yeah, that was the origination of the name. So 1992, it was written on UNIX, it was written on SGI originally, under IRIX, with the Athena widget set. It was a modified Athena widget set, written in C. And it looked pretty good. I mean, I think Chris did a pretty good job, actually, despite the fact he thought it would be terrible. This is actually Rosegarden 1.0, the source code. However, I believe this screenshot comes from Rosegarden 2.1, which is the first Linux port of that source code. And as you can see, it's got a piano roll editor, an event editor. It's also got the notation, oops, and they also work together on the same screen like that. So it was pretty advanced. And it worked via the IRIX device, so it would actually send out to a MIDI device from the SGI, the Silicon Graphics workstation. That was it, number one. That was done for a final-year project. So what happened next? Well, what happened alongside that was that we ended up spending a lot of time mucking around. So in 1992, 1993, I was still at university. Chris and Melissa were peers of mine, friends of mine too, and we used to hang around in the Spark Labs, as they were called at Bath University, and we played too much XTank. Anyone know XTank? Yes, very good. Too much fun, and too many late nights. And that was the thing about coding at the time as well. It was a social enterprise. It's exactly the same these days as well. I look at my son. So now he's going to university. He's starting to learn computer science at university. And when I was growing up, I had a computer, probably like some of you here. I started coding because I was interested in it. He had absolutely zero interest in coding until he got to university. It's all been gaming.
I've been like, well, why don't you build a website? He was like, no interest, but yet he goes on to study computer science. Annoyingly, he's really good at it, and he's picked it up in no time at all. So now I'm getting WhatsApps saying, Dad, what about this Node problem? How do I fix that? Okay, fine. But to be honest, looking back, it's exactly the same now as it was then, because we just mucked around too. We spent a lot of time in the Spark Labs playing XTank. We spent a lot of time learning the command line or learning about Unix processes, how they work. We learned stuff off each other. So at university, I learned the Unix command line solely from mucking around. I learned C and X11 from doing some university courses, some of my minors. But the majority of it was from, yeah, just hanging out with people, exchanging information. So that was Rosegarden 1. And then we left university. We went off into the working world as a peer group as well. So I ended up in London, eventually, as did Chris and Melissa. They were working for the same company, actually. And we ended up hanging out. And somehow, I don't know quite how this happened, but we got to a point where Linux was a thing. You know, I think I've got a picture next. Where is it? Okay. So I've just skipped ahead. I'll skip back. I'd been working for a little bit. I managed to save a bit of money eventually, and I bought my first PC, and it was one of these Gateway 2000s. Came in a cow box. I can't forget it. And it was so exciting because I'd just never had my own PC before. And the first thing I did with it was buy Slackware 3.0 and put that on it. And that came out of, yeah, working. I'll go back to this picture. The working world. And that was really where things started to kind of fall into place. Because when you're working in a proper company, you have responsibilities. And that was the one thing that work really taught me.
You're not just mucking around in the Spark Lab anymore. You don't just stay up late to do your coding and have a lot of fun and then sleep late. Well, you end up doing that anyway when you're a young engineer sometimes. But I was working for a company like this, a telecoms company, and we had responsibilities. We created network management software and also software for multiplexers. And my first boss, she was great. She sent me out on a customer visit straight away. She said, okay, get in the car, go up to Sheffield in Yorkshire in the UK, and go and be on site with Yorkshire Telecom whilst they're upgrading their multiplexers and upgrading the software with our new network management software. And for something like that, as a young, just-out-of-university graduate engineer, it was incredible, because it finally gave you that connection between all the stuff that you do, all the hours that you spend on a piece of code, and the customers themselves. Because if one of these cabinets fell out of the network or the upgrade didn't work, then someone would have to be sent out in a van to the street cabinet to reboot it, and then you'd have to watch it in the control room to make sure it came back up again. And that was, yeah, the real kind of visceral connection. So this is the first, you might call it a learning, that I had. I didn't even realize it at the time, of course, you don't. It was the fact that you need to serve somebody with everything you do. Well, I felt that anyway. Because sometimes you might just want to write some code for the hell of it, but unless someone is using it, where's the point? I felt that really, really keenly. So yeah, cow box. Actually, just to get back to the last one as well. So the cow box arrived. I got a P75, I think it was '95. So a Pentium P75. It wasn't particularly flash. It wasn't top of the range, but I installed Slackware 3 on it.
And then there was a tarball of the Rosegarden source code knocking around. And I think Chris and myself, it might have been Melissa as well, knocked it together into the first kind of release of Rosegarden on Linux. And then just after that, one second. So it was kind of the heyday of music software, really. It all got very exciting very quickly. So I'd bought my first PC. I was working. I was working too hard, but having fun. But after a while, hard work became too much like hard work. I was on pager duty, and back in those days, it was actually a pager as well. And I remember the first night I was on pager duty. I was lying there in my bed and I put the pager right next to my head and I was just staring at it for the whole night. I couldn't go to sleep. I was just so scared in case it went off. What would I do? I had my own hardware at home. I had my PC. But it was, I'll come to this, on a 14.4 modem, which I had to ask someone at work to buy for me. Otherwise I'd have to drive into the office. I'd have to have a car. So basically, you were trying to offer support to a global product from a very basic setup whilst being on pager duty. And the pager duty lasted for a whole week as well. So you'd have a week on and then maybe two weeks off. And that was really stressful. So between that and working and everything else, and probably partying a little bit too much, I just felt burnt out after a year and a half of this. So I moved jobs a couple of times. I bumped up my pay a little bit and I got to a point a couple of years later where I could actually take some time off. So I thought, oh, God, let's relax. The first thing I did was buy some music software because it was the thing I wanted to do. So I'd been building software and I'd been getting into being an engineer. But what I really wanted to do was make some music and be a rock star. Seriously. So I bought a load, well, I say bought.
I probably ripped. I definitely ripped a load of stuff and put it onto my Windows PC, like Logic Audio, Cubase, ReBirth and Reason, all these great tools that were available at the time. Digital audio workstations, which meant they would record audio. You could put plugins into them. They would make great sounds. And also things like ReBirth, which was like a TB-303 and TR-808 simulator, emulator, whatever. Lots of fun. Lots of squelchy fun. So I spent a couple of years understanding these tools and having fun with them and recording. And what I didn't realize I was doing was learning the domain. And if you look back from where we are now, and you've studied OO, object-oriented design, and you've maybe read Eric Evans' book or listened to Vaughn Vernon speak about DDD, domain-driven design, you understand that that's what you're doing: you're learning the language, the ubiquitous language of music software, and then you're learning to talk and feel the way that musicians do when they use this software. So around the same time, we had Rosegarden 2.1 already, but we hadn't moved on since 1992. This was about 1998 at this point. We did kick around a few ideas. So mainly Chris and myself and also Guillaume, a French developer who came on board, I think off the back of the 2.1 work. Again, it's a bit hazy. We started kicking around some ideas. We want to rewrite this thing. How are we going to do it? So I'd got this domain knowledge. Chris was an amazing software engineer and still is to this day and still works in audio software. And me piggybacking on his genius and also Guillaume's genius was a way to understand how we could build a whole new version of Rosegarden, because it was an itch we wanted to scratch. And we discussed a few things, and technology was quite fluid at the time. So we were like, could we do it in Java? Not really. There was no sound system, or there was a very basic API.
To be fair, the Linux sound system was not particularly advanced at that time. ALSA was just about coming out, but it was mainly OSS before that. So Rosegarden 2.1 just used the Open Sound System, which was just a device, essentially. So there was no internal routing of MIDI events or audio. It was all straight out onto the device. Java had the same problem. It was kind of slow in catching up. It wasn't slow. It was just the fact that it hadn't been written yet. A lot of these interfaces, there was, I think, a MIDI API for Java, but Java was never my forte. Didn't like it. Didn't like the stack traces. We also discussed IPC quite a lot, inter-process communication. We discussed CORBA. Anyone remember CORBA? Yeah. Anyone forgotten CORBA? Yeah. Terrible. Well, at the time it was very exciting. There was that alien book. You read it and you thought, what the hell is this? It was really complicated. So it was too heavyweight. But I think actually Rosegarden 3, the missing number between 1, 2, and where we are now with Rosegarden 4, was kind of an IPC, CORBA-based thing. We might have even written it in Java or something. It was crazy. Anyway, that didn't last very long. It was back-of-a-fag-packet type stuff, as they say in the UK. But we had a few other ideas. And we thought, well, we want to build it on Linux. It has to be on Linux. We were both pretty proficient in C++. Let's do C++ maybe. Let's build a nice data model. And also, we saw that ALSA was emergent. JACK, which is the JACK Audio Connection Kit, I don't think it was invented until a couple of years later, until maybe 2000 or so. But that was also a flexible audio routing system, which is very cool, and which we eventually implemented support for. But we knew another thing, that the UI was going to be vital. Having worked on X11 with a very basic X11 widget set, there were two very exciting Linux alternatives coming up: GTK and all the various widgets, all the various parts of it.
And KDE, based on Qt. And the debate probably didn't take that long, but it felt like it went on forever. But eventually, we decided, I think, as the three of us, on taking the most pragmatic attitude, and that was, we're going to go with KDE. And initially, we decided to go with KDE because it seemed to be slightly more mature. And I think we all liked Qt as well, the toolkit that KDE is built on. We didn't know Qt particularly well at that point. So we come to the KDE years. So this was the real start of the real build of Rosegarden 4, the thing I showed you at the top. So we finally got here. From what I say here, KDE 3 was released on April 3, 2002. And I think what we'd done before then is we'd started probably in 2000, and we'd started writing some stuff. And eventually, we got to a point where we had to kind of go with the major upgrades. And KDE 3 felt, for the first time, I think, like we were finally getting there. It felt like it was good. We were using a little bit of, have you ever seen that picture? Yeah, I'll just stick it on here. So we were using a little bit of DCOP, I think it was, which was the communication interface for Rosegarden. We were running two separate processes, the Rosegarden sequencer and Rosegarden itself, the Rosegarden GUI, basically. And they communicated via, well, we were going to use the KDE mechanism. We didn't in the end. I think eventually, we ended up on a memory-mapped file, which was suitably good for both processes to access at that time. So yeah, as you've seen, we ended up with quite a flexible architecture. So this kind of maps the core. I'm not going to go into the core of the core of Rosegarden, but I can point out a few things. So there were the two processes which underlay it. The GUI communicated with the shared memory-mapped file. The sequencer actually did all the running of the stuff, the in and the out and the whatever. It was all controlled via the transport down here.
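The two-process arrangement described above, a GUI and a sequencer sharing state through a memory-mapped file, can be sketched roughly as follows. This is a minimal illustration in Python, not Rosegarden's actual code or file layout: the record format (a playing flag plus a position in ticks), the file name and all function names are assumptions made up for the example. For simplicity both mappings are opened in one process here, and there is no locking, which a real pair of processes would need to think about.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout (an assumption for this sketch):
# 1-byte playing flag, 3 padding bytes, 4-byte unsigned position in ticks.
TRANSPORT = struct.Struct("<?xxxI")


def create_shared_file(path):
    """Create a zero-filled file just big enough for one transport record."""
    with open(path, "wb") as f:
        f.write(b"\x00" * TRANSPORT.size)


def open_mapping(path):
    """Map the file read/write. Each process would call this on the same path."""
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), TRANSPORT.size)


def write_transport(mapping, playing, position):
    """GUI side: publish the transport state into the shared region."""
    mapping[:TRANSPORT.size] = TRANSPORT.pack(playing, position)


def read_transport(mapping):
    """Sequencer side: read the current transport state back out."""
    return TRANSPORT.unpack(mapping[:TRANSPORT.size])


def demo():
    """Open two independent mappings of one file and pass state between them."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "transport.map")
    create_shared_file(path)
    gui_file, gui_map = open_mapping(path)
    seq_file, seq_map = open_mapping(path)
    try:
        write_transport(gui_map, True, 960)
        return read_transport(seq_map)
    finally:
        gui_map.close()
        seq_map.close()
        gui_file.close()
        seq_file.close()
        os.remove(path)
        os.rmdir(directory)


if __name__ == "__main__":
    print(demo())
```

The appeal of this kind of design is that neither side depends on a desktop-specific IPC mechanism: both only need to agree on a file path and a record layout.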
And it had quite a flexible mapping, so that any instrument here could be mapped to any of these instruments there. And they could be a MIDI device or an audio device. And we kind of created an abstraction layer there, which would allow us to map those onto MIDI devices described by what we called device files, which allowed our users to actually get involved with creating specific configurations for their own setups. So if you had a particular type of keyboard or a MIDI device somewhere else, then you'd be able to create a device file for it and be able to map those sounds to your Rosegarden setup. Similarly, the segments themselves. So these blobs are all segments, and you could slice those up, and they have parameters associated with them too. So there was design obviously going on underneath the covers. And that was from quite an early time as well. I believe that we didn't actively sit down and plan it all, but we spent a lot of time refactoring. And that was one of the biggest pain points I think we felt whilst we were working in the early 2000s. There were all these great tools becoming available. Coming out of working in industry, before then, the mid to end of the 90s, typically you'd have to buy a compiler. So you had to buy a license for something. You'd have to buy your library. So the standard template library for C++, it was still kind of getting there. You'd have to buy Boost, I think it was, or whatever it was called. There were the various things you had to pay for. GCC and G++ were still getting there as well. G++ 3, I think, came out around the same time. So the tool set was shifting. There were also the shifting media layers in Linux. So we were talking to people on various mailing lists, basically, about, like, when is the new version of ALSA coming out?
Where's the stuff that we can use to actually get some real-time stuff working on ALSA? So all of these layers underneath, we were waiting for. Plus the tooling was kind of getting there too. Plus all of the UI toolkits were also growing. So we were kind of growing alongside all of this stuff and it made it quite difficult. We didn't have any CI. We had like a couple of machines. We had some basic source control in SourceForge, which was probably CVS at the time, not even Subversion at that point, which worked. But typically, even with three of us working on it, there would be breaking changes the whole time. We didn't have any tests per se. We'd just be downloading the latest bit of source code and building it, and something would break. So when you're refactoring your core area, even with three people, and you've got all of these dependencies moving, it makes it very difficult to get anywhere. So we struggled, to be fair. I was working on it for a couple of years. Well, it probably felt like a couple of years. It probably wasn't. It was probably more like six months. I did take some time off and I worked on it full-time in a little basement with multiple computers, with distcc running on various computers to try and speed up my builds and stuff. And it was a real struggle, mainly because we were refactoring and breaking stuff, but also because we had a kind of an unstable base. Still, we got there, through much effort and much time and much dedication from not just a core team, a team growing the whole time, but also some really passionate users who came along quite early and have stuck with us the whole way through. So we've still got people who have been using it for 20 years who just love it because they can't think of any other way of working, basically. It's brilliant.
They'll compile it and they'll run it on their own kernels, or they'll download it as part of a package, or they'll have a machine which they haven't touched for years purely because they want to run Rosegarden. There are other alternatives. Of course, there are plenty of other alternatives, but what we've done is we kind of found our people, I suppose, through the way that we'd built it, through the way that we'd gone about building it, maybe, as well. With the thought we put into it, and the love, maybe, we've built something which has stuck around. And as you can see here, this is it doing some audio, and it also plugs into this thing I mentioned called the JACK Audio Connection Kit, which is a way of routing audio around internally on Linux and various other platforms, too. So it's actually like an audio bus. Here's a virtual synth as well, which you can route to it. So it works with virtual synths, external synths, et cetera. So what we saw was an emergent architecture. As I mentioned, various things. I need to speed up a bit, so I'm going to go through a little bit faster. I've also mentioned the Rosegarden Studio concept. So this was the thing about the device files. And I think this is another powerful feature of Rosegarden, one that's still used to this day. So we've got over 100 device files, so people have gone off and built them for, I don't know, Yamaha synths, Roland synths, various different types. And you can just download those and install them into Rosegarden. And you can map all the voices, basically, straight into it. Additionally, we've got, as I mentioned, the JACK integration, the support for LADSPA and also DSSI plug-ins, which are native plug-ins for Linux and other platforms, as well as the VST plug-ins, which are the more standardized music plug-ins, audio plug-ins, for Windows and other platforms. Additionally, of course, notation markup and printing using LilyPond as well.
So a very high quality score export from Rosegarden, which has been very much appreciated by some of our users, too. So with all that rambling, this is one of the learnings that I found: just the architecture itself is difficult for various reasons, communication, mainly, but also, what do we want to build? Organization around that is difficult, too. And we saw that with the refactoring, that we had to suffer, essentially. Maybe with better tooling or maybe with more stable platforms that we were building on, it would have been simpler. And in hindsight, it seems almost, yeah, trivial, but at the time, it felt very difficult. Secondly, this is something which really stuck in the craw, in the throat, which is that everyone has an opinion on UI, everyone. Users, developers, even developers who don't profess to be UI or UX experts have an opinion, because we all use the software, right? And if you change anything, then you break a workflow. And that's a difficult lesson to learn. You may be a great coder, but again, it becomes the social angle between what are we trying to build and how do we keep people happy? And it's even happened on the Rosegarden developer list in the last couple of days. Ted released a new version before Christmas, but he's released a patch version in the last week, and he changed one of the theme setups, so it defaults to something that no one likes. So there's a little mini uproar going on. On the flip side, when you do get it right, when you do spend the time getting your architecture right and you're thinking about your code and your core, then sometimes big features that you might have thought impossible even a few months ago are almost trivial to implement. At those points, you go, oh, thank God, we got it right, and you actually feel that that is something that was worth spending the time on. It doesn't happen all the time, but sometimes it does, and it feels great.
And yeah, fourthly, in this little block, literally all you need is a small number of users, and they need to be loyal users to keep you going. So even though you're not getting paid for it, if you've got passionate users, it can mean everything. Okay. Talking of passionate users, we'd spent some time in basements and online, of course, building stuff, and Chris and I thought it would be a great idea to get out there and start showing it. And around the same time, 2002 or so, there were Linux expos starting to happen in the UK. So through contacts of ours, we were involved with Linux User magazine. I wrote for them at some point, and together we managed to blag a stand at one of the first Linux expos, and this is us looking all ruddy and young and lovely and wearing branded t-shirts. Look at that. Made that myself. Fantastic. This is my friend Mike, unfortunately not with us anymore, but he was a great inspiration and helped us a lot and encouraged us to get set up. And this is Chris. And this is our fantastic setup, isn't it? Just stellar. But today, and yesterday, walking around, looking at people sitting behind desks exactly like that, well, better desks, to be honest, with much more impressive setups, it really takes me back to those days when we were doing the same thing. We had our software, we were excited about it, and we went out there and showed it to people. So, yeah, we were doing lots of things: writing, going to trade shows, going to conferences. Well, yeah, not hundreds of them, a few, a handful. We started approaching vendors, so there were people who were building music software PCs, and we talked to them. Maybe they'd be interested. We talked to Roland. We talked to Yamaha. We talked to loads of people. We talked to AMD. There's a book. It's in my bag. One second. So, Michael McIntyre wrote this whole guide, which was published, The Rosegarden Companion. Yeah, and then we started talking to partners as well.
So, there were people at the same time who were betting their careers and their livelihoods on Linux being the next big platform. Obviously, everyone thinks this year is the year of the Linux desktop. But there were people out there then who wanted to build Linux-based hardware as well. We talked to a creator of keyboards in Italy. He flew us over, and we talked to him. It was a fantastic couple of days. We met his family, his mother made us lovely pasta for lunch, and we saw his workshop where he was building these incredible multimedia platforms and keyboards in his garage, basically, his garage business. Fantastic. So, yeah, we got out on the road. Also, we launched a product. This was called Studio to Go. So, Chris and I, we formed a company and we built a bootable CD, which enabled you to basically have an entire studio on your CD. You could put it into a Windows machine, and you could boot it into Linux, and you'd be ready to go, basically. You could also install it on your Windows machine, too. This is Chris, the year later. So, from the previous photo, this is kind of a level of professionalism higher. We look more relaxed, more kind of in the moment. And then if we look another year further, this is us at the kind of high point of our career with Fervent Software and Studio to Go. This was at the Sounds Expo in 2004, I believe, yeah. So, we weren't in this by ourselves. There was an organisation called Linux Audio as well, which is still around, and in fact, I looked on the website yesterday, and apparently I'm on the board of directors or something. Anyway, it's not a company, and I'm not on the board of anything there, but it was like a kind of open source collaboration group with a guy called Daniel James, who I also worked with in journalism. He was a big proponent of another Linux audio platform, which is still being sold, I believe.
So, working together, we managed to get to a Sounds Expo where it was not just Linux stuff, it was also Windows and everything else there, too. However, at that point, it was getting clear that we weren't going to make a living out of it, Chris and I. So, with a family coming along, and that's why I mentioned it at the beginning with a photo of my son, it was like a decision to say, well, what can we do now? Because we'd tried a kind of business, it didn't make enough money for us, and we didn't really have the insight into what we should do next. So, we just said, okay, let's call it quits for the minute. I dropped off from Rosegarden at that point, but loads of people stepped up. One of the biggest things that they did in the next section, which is the Qt years, was to port it from KDE to Qt, which meant basically ripping out the KDE stuff. Sounds simple enough, but it took about a year and a half, which led everyone to understand that major dependency upgrades are very hard. One of the good things to look at from the source code of Rosegarden now is that of the source files that we have, about 1,400 C++ files, most of them are pretty small. So, under 500 lines, and that's including comments. So, that's a good sign of a healthy code base. So, I'm skipping now from 2008, 2010, and I'm kind of looking towards what's happened in the last 10 years, but also the future. So, Ted Felix came on board in 2010, and by his words, I spoke to him in the last week or two, he said it suffered from a lot of stability problems. So, we have a GitHub as well as SourceForge, and GitHub is downstream, so it allows us to do some nice visualizations. So this was around the first bit of work that we did to stabilize the initial Rosegarden. This was around the Studio to Go time, lots of work there, and then this was around the KDE to Qt port, so lots of work went in there, too.
After this, when Ted came on board, he said basically he spent a lot of time stabilizing, basically getting rid of bugs and making sure that there was adequate performance, and doing that, I don't know, it was not unaided for sure, but focusing on that a lot. And you can see this also represented in the way that Chris and Guillaume were very involved here, and Ted was involved later, and this is my contribution. And also, we see, again from GitHub, the churn that was going on around the same time, so the Studio to Go product, the Qt port, and then stabilization after that. But if you then look at, I'm just comparing us to MusE, which is another Linux audio workstation, this goes up to 40 on this axis and up to a bit higher on this one. So in terms of contributions, we're not far off what MusE is doing, which is really encouraging to see. So I would say the product is pretty healthy. It's been stabilized, it's still being used by lots of people. You don't hear about it a lot, but it's still there. And I'm going a bit faster now because I need to wrap up, but the number one thing has got to be people, and that's what I mentioned at the front of this talk. People to do the work, of course, people to support you whilst you're doing the work, and the expertise that you build together when you're designing something. So that's why I see Rosegarden is still surviving now, and still something that we can contribute to. So, top five things. For open source, basically, I would say this as well. So good architecture and good toolkit choices are super important, and they help with defining your quality and defining how far you can go with any one thing. And if you're not getting anywhere down a certain path, then just rip it up and start again, or refactor, or think really hard about your design. Also, Ted's jumped in with some stuff here.
He's mentioned stringent requirements on merging, so making sure that you don't allow just anything to merge. In fact, recently there's been a fork of Rosegarden because one of the developers got a bit annoyed that he's not getting his stuff reviewed, and it's fair enough, everyone's doing it in their spare time. It's still a labor of love. Also, I mentioned earlier, refactoring and unit testing, and I did actually run the unit test code coverage the other day on Rosegarden, and it's 8%, which is not good. Okay, there are some challenges with a UI-based application, with a GUI, but it should be higher than that. So I imagine one of the things that I might be looking at as I dip my toe back in a little bit is to maybe write some more tests. So number one, quality. Number two, coding is social. I mean, that's one of the things that we don't realise: we think that going into OSS, we can just hide online or do some coding by ourselves, but increasingly it becomes social. It's a different type of sociability, maybe, to working in an office. But open source, I believe, is much more social than working in an engineering job or software job in an office, because there you can just focus in on the one thing that you're doing. You might be told that you need to be more interactive in a certain way of working, but in OSS, you have to be. There is no question about it. You've got to be able to defend your ideas and maybe persuade other people, too. Also, related to this, there's the social side of customer engagement. It's the fact that you do have customers or end users, however you define them, and they will have opinions, whether you like it or not. So you're putting yourself out there. Thirdly, and this is something I've come across in the last year or so, this guy called Seth Godin. 
He's an American marketing guru and I really like his approach. He's got a great podcast called Akimbo, A-K-I-M-B-O, you should check it out. He writes and he speaks really well, really pithily, on lots of subjects about marketing, and he doesn't have that slimy kind of marketing dude feel. It speaks to me as an open source person. One of his key phrases is this: people like us do things like this. And I think that's what it means to connect with your audience, to find your audience and to keep connecting with your audience at the right level. So yeah, we might spend some time arguing about how a UI should be or how we should architect this or if we should refactor that. However, it's worth it if we connect with each other and we connect with our customers well at the same time. And also, he talks about audience, importantly, because you don't need to solve all the problems in the world; all you need to do is solve a few problems for a few people. Fourthly, use all the tools. We are just so lucky these days, we've got all these amazing tools: we've got GitHub, we've got Codecov, we've got cloud, whatever it is, lots of stuff, you know, SonarCloud, Sonatype, etc., all these great things that we can use. So use them. Use CI, use unit testing, use TDD, use all of the great engineering advances that we've come up with over the last ten years to make our code better. And finally, don't take it too seriously. One of the reasons that I got involved in OSS was that we could do it ourselves, you know, we could put our little imprint on the world or do something fun. And that's the point: it needs to be fun, it needs to be enjoyable. So when it stops being enjoyable, don't do it anymore. And that is it. Thank you very much. Any questions? Are there any questions? Oh, yes, there are. Please raise your hand. Okay. Hi, thank you. I can't wait to give the software a try when I get back home. 
I might install it later. But was there any possibility of you like playing a bit of music, maybe? No demo today, I should have said that at the top. Thank you. Any other questions? Okay, thank you very much. A big round of applause. |
Podcasting 2.0: it's all about Interoperability
How Podcasting 2.0 will save the Open Internet |
Hi, everyone. The next speaker is here. So our next talk is Podcasting 2.0: it's all about interoperability. And the next speaker is Benjamin Bellamy. Give him a round of applause. Hello, everyone. Thank you for being here. Today we're going to talk about podcasting and more specifically, Podcasting 2.0. And more specifically about... can you all see the screen? Yeah, good. So you can read it, because I cannot pronounce that word. Interoperability. Love the concept. Hate the word. So why talk about podcasting here at FOSDEM? Because podcasting will save the open internet. So let me introduce myself. I'm Benjamin Bellamy. I love podcasts, which is why I'm going to talk about podcasting. I love open source, which is why I'm here today. I'm founder and CEO of a company called Ad Aures, which is Latin. So don't ask me how to pronounce it. It's Latin. No one knows. And it means going to the ears. And I'm the father of Castopod. We'll talk about that later. And my father... I mean, Castopod is my baby. I was here when it was conceived. But now it's all grown up, and I cannot see it as much as I would like to. But enough talking about me. Let's talk about you. So I'm going to make a quick poll. I'm going to ask each one of you four questions. So that's going to take half an hour, and there will be half time for Q&A. Oh, no. Just raise your hand. Who here is a podcaster? So, like 30%. Who's a developer? Okay. Just one arm, because that's 200%. Have you heard of Podcasting 2.0 before? Okay. So 25%. And have you heard of Castopod? 20%. Okay. So let's talk about users. Users are always the problem. No users, no bug reports. So the problem that we are facing now, all of us, is that we have to choose. We have to choose between the open web and the closed web. Between where we have control and where we don't have control. But of course, we don't have to be religious about this. And we know that closed silos are comfortable. 
When we are on a website watching videos, silos are very useful because you get everything you need. You need a feature. It's there. You need to upload a file. You need to index it. You need to recommend it. You need to monetize it. Anything as a user or as a content creator. It's all there. It's very easy, very convenient. So at the end, yeah, that's probably the best solution. Let's all use closed silos because they're so much easier for everything. So, yeah, actually, maybe I was wrong. So let's go to closed silos and forget about this open internet problem. Or maybe there's something else. If you're a user, yes, maybe that can make sense. But as soon as you are a content creator, and I see that many of you are podcasters, we have to think about what can influence the content that you create. If you are on the open internet, so that's a federated network or a self-hosted platform, whichever, well, you have to follow the regulations. The law applies to each and every one of you. If you don't, you'll probably get into some trouble. And that's it. You are the only one in control of your content. Whereas if you're on the closed web, so let's say YouTube, well, you still have to follow the laws and the regulations. And you'll have issues as well on this platform. But at the same time, you have to follow the terms and conditions of the platform. The economic, political, ideological interests specific to the platform. Government pressure on the platform, talk about China or wherever. Mass reporting of your content by opponents or competitors; this happens all the time, every day. A bot: how many times have we uploaded content and got a strike, and then you end up talking to a bot and say, I got a strike, I don't know, it makes no sense. Well, that's what happens. A moderator in a bad mood. And a platform shutting down. 
Like you create content, you put all your value into it, you trust a platform where you put the value that you created, and then it's gone. Let's, I don't know, like Google+, goodbye. Goodbye Google Wave, goodbye Google everything. So now let's talk about audio blogging. How does everything that I just said apply to audio blogging? So obviously blogging, audio content, audio blogging. Yeah, that makes sense. But what to call it? We don't talk about audio blogging anymore. We haven't for many years, 15 actually, because now we're talking about podcasting. Podcasting is a word that was invented by a journalist, Ben Hammersley, in the Guardian in 2004. And we've been talking about podcasting ever since. So who created podcasting? Who are the founders of podcasting? Well, that may sound stupid, but there isn't any podcasting without podcasts. Podcasting is about podcasts. It's about content. You can have the best technology ever. If you don't have any content, at the end of the day it's going to be useless. So these are some of the oldest and most famous podcasts. These are my choice. If you ask someone else, you'll probably get others. So the first one is the Illusion of Independent Radio from 1989. Of course, at that time, it wasn't on Twitch. It wasn't on Apple Podcasts. It was distributed manually on cassette tapes and copied from tape to tape. But still, it's considered by many as the first podcast ever. And as you can see from the original title, it was from Russia. Then in 2004, the Daily Source Code was a podcast that was published every day by Adam Curry, a kind of technical podcast. Then the year after, Podcasting: The Do-It-Yourself Guide by Todd Cochrane. 2006, the Lance Anderson Podcast Experiment by, obviously, Lance Anderson. There are many others. I won't spend more time naming all of them. Just 2014, Serial by Sarah Koenig. This one is important because it's the 350-million-download podcast. 
So that made a difference because we went from a model where, yeah, you have some nerds talking to other nerds, and there were like a bunch of them here, a bunch of them there, and yeah. But now we're talking big numbers. So that's for the content. Now, the architects who designed it: who is at the origin of podcasting and the technology? So pop quiz. There is no technology here, just the names and the dates. So 1993, who knows what invention is behind that? MP3, yeah, correct. MP3. 1995, anyone knows? RSS, yes, correct. 2000. Yeah, enclosures. Enclosures in RSS. 2003. Yeah, that was before the Daily Source Code; actually, the Daily Source Code was a way to apply this. It's pushing your podcast to your iPod. And at the same time, we got iPodder.org, which was a repository and index for podcasts. And 2005. Speak louder. No, not the iPod. The iPod was earlier. This is podcasting in iTunes. Okay, so that's the technology. But what's a podcast? Well, there are several definitions for a podcast. There is no official definition for a podcast. Some people say it's content that you can download and listen to whenever you want to, wherever you want to. Some say it's audio content. And my definition, the one I like the best, and I think it probably should be the only one, if you ask me, is Adam Curry's one. A podcast is an RSS feed with an enclosure. So, an RSS feed and an enclosure. That's a podcast. So, example: this is not a podcast. This is a podcast. I don't know if you could see the difference. I'm going to show it to you again. So, not a podcast. A podcast. So, for those who did not spot it, this is not a podcast because there is no RSS feed. You can listen to it. You can download it. You can listen to it wherever you want to. But you cannot get an RSS feed. So, you cannot select and choose the application on which you want to listen to it. And you are in a closed ecosystem here. This is Spotify. Obviously, you recognized it. So, this is not the open Internet. 
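The "RSS feed with an enclosure" definition above can be turned into a small check. This is only a sketch of that definition using Python's standard library; the function name `is_podcast` and the sample feeds are made up for illustration, and a real feed validator would check a lot more.

```python
import xml.etree.ElementTree as ET


def is_podcast(document: str) -> bool:
    """Return True if the text is an RSS feed with at least one enclosure."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return False  # not even well-formed XML
    if root.tag != "rss":
        return False  # some other kind of document, e.g. a web player page
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            return True
    return False


# Hypothetical sample documents, for illustration only.
FEED_WITH_AUDIO = """<rss version="2.0"><channel><title>Example show</title>
<item><title>Episode 1</title>
<enclosure url="https://example.com/ep1.mp3" length="123456" type="audio/mpeg"/>
</item></channel></rss>"""

FEED_WITHOUT_AUDIO = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Post 1</title></item></channel></rss>"""

WEB_PLAYER_PAGE = "<html><body>Listen in our app</body></html>"

print(is_podcast(FEED_WITH_AUDIO))     # feed plus enclosure
print(is_podcast(FEED_WITHOUT_AUDIO))  # feed, but nothing to download
print(is_podcast(WEB_PLAYER_PAGE))     # no feed at all
```

Under this definition, audio that can only be played inside one platform's page or app fails the check simply because there is no feed to hand to another application.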
So, this is why I do not consider this a podcast. Whereas this is a podcast. And, you can tell, it's better looking. So, if you remember the architects, in 2005, Apple Podcasts, which was iTunes podcasts at that time, was launched. And it was launched by Apple, by Steve Jobs, right after Adam Curry, who was involved technically at the beginning and was also still a podcaster, gave to Steve Jobs himself the index that he and some other guys had developed at that time, which was called iPodder.org, which was an index with 4,000 podcasts. Remember that number, 4,000 podcasts. This is the index that we had at that time. And it was all the podcasts that we could listen to and at least search for within the iPodder index. And this is a changeover, because at that specific time Apple took care of podcasting. They did a pretty decent job, because they got no money from it, just marketing image, but they did like pretty much everything in the podcast industry since that moment. So, let's recap. Here you can see 1993 MP3, RSS, RSS plus MP3, to the iPods, iTunes, and nothing, nothing. So, that's not a bug. That's not PowerPoint breaking my slide, because this is Impress. This is not PowerPoint. And nothing happened. So, this is what I call the status quo of podcasting. Meaning, for 10 to 15 years, nothing happened, and it stayed the same. You can see here, so we still have hosting and indexes and apps and lines going all the way across, which means that podcasting is still decentralized. And that's a very important point, because over the years, let me go back a little bit. Here, you see that too much. We had YouTube, Facebook, and many new technologies that disrupted the markets, but they're not decentralized. When you're on Facebook, you're not on the Internet. You're in Facebook. When you are on YouTube, you're not on the Internet. You're in YouTube. Whereas for podcasting, well, it's decentralized, and it's still decentralized. 
So that's good news. But at the same time, not a single feature has been added for 15 years. Try to think about this. In which other industry have we seen no innovation for 15 years? And we're talking about the Internet. Like, not a single feature for 15 years. Nothing happened. And at the same time, we saw the core of it, RSS, we saw it dying many times, mostly killed by Google, many times. So at the end of the day, yeah, what's happening to podcasting, is it dead? But no, it's still alive. But between these 10 years, so there's the 10 years, another word I cannot pronounce. There was no development because of a chicken and egg problem. If you are a podcast hosting company or developing a hosting solution, if you want to add a new feature, well, you need an application to implement it. Otherwise, it's useless. No one will see it, so you won't do it. And if you are developing a podcast listening app, well, it's useless to implement a new feature, because if there is no hosting company implementing it, it's really useless. So everyone is waiting. So a solution would be to ask the one in the middle who's taking care of the index, Apple, to make the first step. Well, try to ask something of Apple and see how it goes. So yeah, nothing happened for all these years. And then, in 2020, Adam Curry and Dave Jones decided to reboot it, to make a Podcasting 2.0. So Podcasting 2.0 is an open specification. And by open, I mean it's open source. And it's open to anyone. So each one of you can come and play with us and say, I need this. I want this. Podcasting is missing this important feature, or this feature that is not important to anyone but me; it doesn't matter. It's an open specification open to everyone. Really, I mean it. And technically, it's a namespace that extends the podcasting DTD, the one that was created by Apple and that everyone has been following for years. Even Google is following the Apple DTD now. They had one of their own, but they eventually dropped it. 
So if you go there, you will see all the new features, which are usually new tags that have been added to podcasts. So Podcasting 2.0 is about decentralizing and returning to open source. So everyone can be a part of it and everyone can play together. That's really important. But this is not how we are going to convince users and listeners that this is the right solution. It's almost impossible to tell people you shouldn't be on Facebook because it's not open source, and you don't know what they're doing with your data, you have to be on something that's open and decentralized and federated. People just don't care. It's useless. If you do that, you're going to be exhausted before you have convinced two people. The only way to convince people is to be better than closed solutions and closed ecosystems. So you have to focus on user experience and features. And this is what Podcasting 2.0 is about. So there are many, many, many new features that are in Podcasting 2.0. These are only the ones that have already been implemented. So, alternate enclosures. Yes, because now we're using MP3, but MP3 is 30 years old. Maybe we like M4A better, or Ogg, or FLAC, or Dolby Atmos, or low-bandwidth encoded files. Well, how do you do that with the usual DTD? You cannot. With Podcasting 2.0, you can do that. You can have multiple enclosures for one episode. Boostagrams. Who knows what a Boostagram is? Okay. Like 5%. A Boostagram is a way to send a message, money, and love to a podcaster. So you're listening to a podcast and suddenly you find something really interesting. Oh, I love that. You click, you send money. So it's Bitcoin over the Lightning network. You send a message, and you send a timestamp. So the podcaster knows at what time, within the episode, this Boostagram was sent, and they receive money and a message and love. Chapters. 
If you want to have chapters, of course you can have chapters within an MP3 file, but let's say you have alternate enclosures, and you want to modify the chapters afterwards, and you want the chapters to be indexed, to be seen without having to download the whole file; well, having chapters in a JSON file can be pretty, pretty useful. Episode and season. Well, what if you want to give a title to a season and not just a number, to make it easier to see in an index? Funding: if you have Patreon or PayPal, you can put that within the RSS feed so that it shows up directly within the podcast app, the podcast player. GUID. I cannot understand why we didn't have that earlier. If you have a podcast, very often you move it from one domain name to another, from one hosting company to another. How can you tell that these two RSS feeds are actually the same podcast? Well, before Podcasting 2.0, you couldn't. Now you have an ID that doesn't change. Live, because podcasting is very often about live events. You're talking live, you're talking to everyone, and also making a podcast, and the live will be the podcast, but you want the users to be able to get a notification within the podcast app that says, oh, your favorite podcast is now live. Click here and listen to it. Location, because you want your content to be easily found, and sometimes it's about a specific place. How about geolocation? I think there's something called Google Maps, but obviously nothing related to podcasting. With Podcasting 2.0, if your podcast is about tourism, well, you can pinpoint an episode or a podcast. Then you can put it on a map. You can put your podcast on a map. Locked: we've seen podcasts that were copied and monetized, specifically on Anchor, on the Anchor platform. Some people found out that their podcast had been copied on Anchor, and then monetized. They didn't know about it. 
So here, it won't prevent the copy, but at least it will say, please do not copy that podcast, so that the hosting company knows that this podcast shouldn't be copied. Medium, because podcasting is not only about talking. It can be music. It can be many, many other media. OP3 is an open source, open data analytics platform that allows you to share your analytics with third parties. Person is about saying, this content was made by this host, with this guest, and this sound engineer. Podping is a way to say, hey, my RSS feed just got updated, and you send that information into the Hive blockchain so that nobody has to query an RSS feed every minute to know when it was updated. Imagine how many requests Apple has to make to get notified when all the podcasts are updated. And actually, they are not notified. They fetch the answer. Sat streaming: almost the same as a Boostagram, but it's while you are listening to content, like every second. Social interact is about talking with your audience from any app. So you say to the podcast app, this is where the discussion is taking place. This is where you should interact with the podcaster. And it can be, of course, on the podcaster's platform. It can be on the Fediverse. It doesn't have to be on a closed platform where you don't have control over your content and over the content that your audience creates. Soundbites is a way to make a small bite of sound, obviously. Transcripts, which do not exist in podcasting except for Podcasting 2.0, which is a real shame because accessibility is mandatory in Europe. It is mandatory in the US. And still, all the big platforms seem not to care about this. Well, Podcasting 2.0 brings a way to add transcriptions to the podcast, and a transcription that the podcaster is responsible for. So as a podcaster, you can correct or do whatever you want to in the transcript, and the transcript can be in the original language or it can be a translation. 
So you can make your podcast available to people who are hard of hearing or people who just don't speak your language very well. And TXT, which works the same way the DNS TXT record works. Right now, if you want to go on a podcast platform as a podcaster and you want to prove your ownership, the only way, the Apple way, is to put your email address within the RSS feed, and this will allow you to get a lot of spam. Well, with this, the platform that wants to verify your ownership will say, please put that token into your RSS feed; you put it there and then you're good to go, and you don't have to share your email address. So all these features, this is not a wish list: this exists, this works, this has been implemented and it's working now. It's working now in many applications, platforms and services. And of course, if your podcast is a Podcasting 2.0 podcast, it will also work on all the non-Podcasting 2.0 platforms; it is backward compatible. So where are we now? There are over 4,000 podcasts that have declared the Podcasting 2.0 namespace in their feed. And as I said, interoperability is the key. Meaning, this works only because we are working together. It's not a closed silo, and this is why it works. Here, there are only a few of the platforms that are implementing it. So on the left, you see the hosting services, on the right, the players, and in the middle, the indexes. There are many others and they are all working together. So that's pretty nice. But let's not forget that the battle is not over. The GAFAM; for podcasting we talk about SAGA: Spotify, Apple, Google and Amazon. They won't quit that easily. Last example: Spotify decided to implement chapters, but they didn't use that. They used their own way. So we need to spread the word and to use it. But that won't be enough. Podcasting 2.0 has a powerful ally, which is Podcast Index. Podcast Index is a podcast index. That's pretty convenient. That has all the podcasts. 
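To make the backward compatibility point above concrete, here is a minimal sketch, not taken from the talk. It assumes the namespace URI `https://podcastindex.org/namespace/1.0` and the tag names `podcast:guid`, `podcast:locked`, `podcast:transcript` and `podcast:chapters` as I understand them from the public specification; check the specification before relying on them. The sample feed, its URLs and the two helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the Podcasting 2.0 tags (verify against the spec).
PODCAST_NS = "https://podcastindex.org/namespace/1.0"
NS = {"podcast": PODCAST_NS}

# Hypothetical feed: plain RSS 2.0 plus a few namespaced extension tags.
FEED = f"""<rss version="2.0" xmlns:podcast="{PODCAST_NS}">
<channel>
  <title>Example show</title>
  <podcast:guid>00000000-0000-0000-0000-000000000000</podcast:guid>
  <podcast:locked>yes</podcast:locked>
  <item>
    <title>Episode 1</title>
    <enclosure url="https://example.com/ep1.mp3" length="123456" type="audio/mpeg"/>
    <podcast:transcript url="https://example.com/ep1.vtt" type="text/vtt"/>
    <podcast:chapters url="https://example.com/ep1.json" type="application/json+chapters"/>
  </item>
</channel>
</rss>"""


def legacy_view(feed_xml: str) -> list:
    """What an older app reads: titles and enclosures, nothing else."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        (item.findtext("title"), item.find("enclosure").get("url"))
        for item in channel.iter("item")
    ]


def podcasting20_view(feed_xml: str) -> dict:
    """What a newer app can additionally read from the same feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    item = channel.find("item")
    transcript = item.find("podcast:transcript", NS)
    chapters = item.find("podcast:chapters", NS)
    return {
        "guid": channel.findtext("podcast:guid", namespaces=NS),
        "locked": channel.findtext("podcast:locked", namespaces=NS),
        "transcript_url": transcript.get("url") if transcript is not None else None,
        "chapters_url": chapters.get("url") if chapters is not None else None,
    }


print(legacy_view(FEED))
print(podcasting20_view(FEED))
```

Because the new tags live in their own XML namespace, a parser that only looks for `title` and `enclosure` never sees them, which is why the same feed keeps working in applications that know nothing about Podcasting 2.0.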
And it's open source, open data, meaning that anyone can have access to the index. Imagine before the year 2020: as a podcaster, if you want to exist, you need to ask Apple to grant you permission to add your podcast to their index. You don't have write access. You need to ask permission. And if you say bad words, you're not sure that they'll accept you. And as a podcast app, if you want your users to be able to find podcasts, so to look for them, you need to ask Apple to grant you permission to read the index, so that your users can send queries that will end up in the Apple Podcasts index, so that you can say, yeah, you're looking for a podcast talking about whatever, this is what we found. So if Apple doesn't want you to continue working with them, from one minute to the next they can ban you, they can revoke the access, and your app becomes useless to all your users. So with the Podcast Index, you can get an API key, you can get access, you can query as much as you want to. And if you're not comfortable with that, because there's still a centralized database and a centralized API server, you can download the database. The Podcast Index is a database; you go to podcastindex.org, you click download and you download the whole database, all of it. So everything is on the Podcast Index except stuff that is illegal, obviously, but there's no preference. They're not making choices about what should be allowed and what should not be allowed. And right now, it is twice as big as the Apple Podcasts index. So that's four million podcasts. So remember, in 2005, we had four thousand; now, four million. So if we recap again, the void is not blank anymore. We have Podcasting 2.0, and we can work on this. So let's see how this works. We'll see how it works on Castopod. So what is Castopod? It's a podcast hosting solution. It's open source, it's AGPLv3. It relies on PHP, MySQL, CodeIgniter 4. 
And of course, it is Podcasting 2.0 compatible. It implements chapters, funding, GUID, location, locked, OP3, person, soundbite, transcript, and social interact, and for social interact it uses the ActivityPub protocol, meaning it's connected to the Fediverse and to Mastodon, obviously. So when you're on Castopod and you publish an episode, it's going to show up on your Mastodon feed as a user, and you'll be able to like it, share it, comment on it. And the stars and the comments will come back to your Castopod server. So when you interact with your audience, the reactions from your audience are on your server. There is no middleman between you as a content creator and your audience. No technical person involved, no legal entity involved. So no one can pull the plug, and no one can send you their lawyer. And why are we developing that? Because we love you, and we think you deserve it. You deserve free speech. All of you. And your love that you send back to us pays our rent. Or it doesn't. So in order to be sustainable, Castopod has been funded by Ad Aures, a company, and by NLnet, the NLnet Foundation. We are on Open Collective, so if you want to be a part of it, because we value everything that you can bring, time, talent, treasure, anything, go to Open Collective. We got a lot of value from you. For instance, we translated Castopod into French, because as some of you may have noticed, I have a slightly French accent. And Castopod has been translated into 20 languages, 21, I think. Not 100%, but almost all of the public part. Chinese, Spanish, Portuguese, Brazilian Portuguese, many of them. And we provide paid hosting on Castopod.com. So if you want to host your podcast on Castopod, but you don't want to take care of this, you don't want to go through the trouble of self-hosting, you can talk to me. Or go to Castopod.com. So if you want to know more about Castopod, just go to Castopod.org and you'll find everything that you need to know there. 
If you want to install it, go to docs.castopod.org and you will find many ways to install it. It's PHP and MySQL, so with a basic Apache, mod_php, MySQL, MariaDB, Percona, whatever, you can install it. We have Docker images, it works on YunoHost, I saw some guys installing it on a Synology NAS; very easy to install. So what does it look like? So here, this is the login page, and it has nothing interesting. It's just a login page, but it works. And if you forget your password, there's a login link. On the dashboard, you'll get all the information that you need for your podcast: the storage, the bandwidth, how much you've been using. But the thing is, all the information and the analytics, they are yours. The number of listeners, they are on your server. This is not Google Analytics. You have the raw data; of course it's GDPR compliant, but everything is yours. Now, if you want to create a podcast, it's going to take like 30 seconds, and you're up and running. If you want to create a new episode, it's going to take 30 seconds, and you're up and running. Of course, I'm not counting the time that you need to record the podcast, but that's up to you. Then you'll be able to listen to the podcast, of course, on the website. Castopod has a pretty decent website. I haven't said it, but it's podcasting 1.0 compatible, so you will be able to listen to your podcast on any platform, including Spotify, Apple Podcasts, Deezer, Amazon, everywhere. If your content is geolocated, you'll be able to see it on the beautiful map, so OpenStreetMap, obviously. You can have private content, like premium content if you want to make a paid subscription, or maybe just for your friends. When you are connected to the Fediverse, people will be able to interact with your content, and you will see the answers back on your Castopod server. Here, this comment is a Mastodon comment, I think. It comes back to your server. It's yours. Here, this is the OP3 website. 
This was developed by John Spurlock. This is an analytics website that allows you to have statistics and analytics on a third-party platform, which is really convenient if you want to be able to share that with third parties. You have a third party involved, counting how many listeners you have, but it's open source and open data. Of course, everything, again, works everywhere. This is Podcast Addict, so you can see the transcription. It's showing on your phone. Still Podcast Addict. Here, you can see the persons who were involved in the creation of the podcast. I think we're done. As a conclusion, how can you help? How can you be a part of it? Well, very easy. Just go to newpodcastapps.com and make your choice. Find an app, find a service provider, a hosting company, hosting whatever. It doesn't have to be Castopod. Of course, it's the best, but you don't have to believe me. You can make your own decision. And, as a podcaster, stop telling your audience, go to Apple Podcasts and give me some stars. No. These are your stars, not Apple's stars. So if you want to know more about this subject, the pod further 2.0, which will be in the podcast magazine number 2. I think it's going to be out in 10 days or so. It's in French. If you don't understand French, it will also be on Castopod's blog. You have the URL here. There's an RSS feed, because RSS is not for podcasting only. You can also read all of the articles. The slides are on the FOSDEM website. And since we're talking about podcasting, further listening: there's a podcast which is conveniently called Podcasting 2.0, which talks about Podcasting 2.0. And it's hosted by Adam Curry and Dave Jones. It's every Friday. And you can listen to it live. So if you're using a Podcasting 2.0 app, you'll get the notification. You'll click on it and you'll be able to listen to it live. Or after, if you don't want to listen to it live. Podnews Weekly Review by Sam Sethi and James Cridland. 
They talk about Podcasting 2.0 a lot. And James Cridland also has a daily podcast which is called Podnews. I highly recommend these three. And thank you. We have four minutes for questions. Do you have any questions? Okay. Hi. Is it working? Okay. So, amazing talk. I have just one question, as there's very little time. What about advertising in podcasts? Is there some way to handle advertising and advertising revenue for a podcast? There are many, many ways to advertise in podcasts. At Ad Aures, my company, we work on recommendation and paid recommendation. Historically, there were two kinds of ads in podcasts: either from the host, which can be host-read or automatically inserted, or in the app. We still have the two of them. I hope that eventually we can find a way to make the host and the app work together so that they build a bigger audience together. Thank you. Thank you for your talk. I have two questions. How do I move my existing feed if I want to host it on Castopod? That's my first question. And which features are in the Podcasting 2.0 spec? You have this whole list of all these features. Some of them you have implemented. What are you working on next to implement in Castopod? First one: Castopod has an import feature. As soon as you have your Castopod instance ready, you copy-paste the RSS feed URL, you click import and it fetches everything. About the features, we don't have all of them implemented yet. We're working on them one step at a time. Yes, hi. So you said the Podcast Index doesn't host anything illegal. So what jurisdiction is the Podcast Index in? Sorry, you said the Podcast Index doesn't host any illegal content. So I wonder what jurisdiction the Podcast Index is hosted in. I'm sorry, I didn't hear your question. So you said the Podcast Index doesn't host anything illegal, but it allows everything legal. As long as they know it. What jurisdiction is it hosted in? What country's laws? Is it United States or Germany? So it's a U.S. corporation. It's a U.S. 
company. So I assume it's U.S. laws. But the only example that I saw are mostly people who copied podcasts. Something that you have to know is that the podcast industry is still very young. It's a very young 20 years old industry. Young because it hasn't been monetized as other industries. So when there is not that much money involved, you get less problems. Obviously, when it's bigger and when the money is flowing, we'll have some other issues. Yeah, that's for sure. A very few questions because the time is up. My question is about the transcription. Does Castopod help us to transcript or do we have to upload our own transcripts, our own tools? Right now, Castopod allows you to embed your transcripts. But we are working on the solution that will make it for you. We've been working on that for years and it's going to be out very soon. We have just a few questions on the chat. One is does podcasting 2.0 address stuff like integrating BitTorrent with podcast RSS? It does not address BitTorrent yet. It's pretty hard because it's pretty hard to get something that works fluently. That's something that's on the table, but it's not going to be here very, very soon. The next question was about hosting providers. What do you think about podcast hosting providers that conform to the standard but host short version on the actual podcast forcing you to download their app to finish listening to it while they're doing whatever they want to? I just think you shouldn't use them. You should go someplace where you have control, not the hosting company. Thank you very much. Thank you. If you have other questions, I'll be around. |
Decentralized Social Media with Hachyderm
Growing into medium scale, incident report, and forming Nivenly |
How's everybody feeling? Good. So if you haven't figured it out yet, I'm Kris Nóva. Some people call me Kris. Some people call me Nova. Just don't call me Shirley. So we're going to get started with a few quick questions. There's a lot of people here, so I just want to get a feel for who's in the audience. So who here knows what Mastodon is? Show of hands. Okay. For folks at home, literally everybody just put their hand up. Who here knows what Hachyderm is? Oh, God. Sorry. So pretty much the same number of people. Who here knows how to denial-of-service, to DDoS a service? Okay. And how about just general abuse? How to, like, just abuse a service? Okay. So for those of you on the camera, literally the entire stadium of people just put all of their hands up. I can't believe I'm about to do this. You have 45 minutes starting now. You have my full permission to DoS my shit. Take down the service. You can do whatever you want. There are three known things I know of today that should make this pretty easy. I think if you knew exactly what you were doing, you could probably do it in about five minutes. So anyway, that's how we're going to start the talk off today. So the goal of this is to wake my partner up. She's asleep right now. She's at home in Seattle. And if you're successful in disrupting the service, she will get some Discord notifications, and we have a team of volunteers. Their phones will go off. My phone will start going off. And hopefully, hopefully my puppy greets her with a smile and wakes her up as there's inevitably a crisis. Okay. So the reason I wanted to start this off is because we have had to do a tremendous amount of work to bring Hachyderm to where it is today. So just to kind of start the slides off with some basic numbers here. I don't know if y'all can see this, but this is just a public glimpse into the service that's online today, just serving Mastodon. And there's 44,000 users. It looks like we had 200 people sign up today.
I don't know how many of those people were here at FOSDEM. I don't really know very much about them at all. We've had 20,000 toots. Sorry, we've had 789,000 toots. And we have 20,000 monthly active users. So there have been 20,000 people who signed into the service in the past 30 days alone. And we are currently federating with another 20,000 instances, which in my opinion is yet another attack vector for the internet at large that we should probably spend more time discussing. So if you are successful in flooding the service, hopefully by the end of my talk, we should see some spikes in these two middle graphs here. The HTTP response time is probably the most sensitive part of our entire system today. Cool. So let's get back into it. So about me, I work at GitHub. I'm a principal engineer at GitHub. I'm also an author. I've written some mediocre quality books. And as of four days ago, I'm also the president and a board member of a foundation I'll tell you about here shortly. And if you want to follow me on decentralized social media, there are my links there. Okay. So we're going to start off and we're going to do some basic context setting. And then we're going to go into a little bit of an incident report of a situation we found ourselves in last November. And then we'll talk a little bit about what this means to me, what this means to the United States economy, the legal situation in the United States, and how we're kind of navigating all of this that we really kind of just stumbled upon last year. So the short story here is my little Mastodon server that was used by me and about 100 of my friends, if maybe not even 100, maybe 50 of my friends, had very quickly turned into what I consider medium-size scale. And when we reached medium-size scale, a lot of the problems weren't necessarily related to the technology.
Although, as you're about to find out, operating a Ruby monolith at scale does come with a substantial number of concerns, which we'll dig more into in a moment. Okay. So just for folks at home who are watching the video after the fact, I want to give a little bit of context on Mastodon in general. And I want to be clear. I am not a Mastodon contributor. Well, I guess it depends on what you define as a contributor, but I don't work on Mastodon that much. I've written a few issues. I've helped talk to some folks who do contribute to the project. But for the most part, this is probably the most detached I am from any of the open source projects I work on. I literally am a consumer of Mastodon. The most involved I have gotten with this particular project has been going to GitHub and going to the release tab and downloading the latest versions for me to go and install on my server. So it's kind of nice, not going to lie, to just be on the consumer side of open source for a change. But Mastodon is ultimately social networking that's not for sale. It's built on the ActivityPub W3C protocol. And it's an alternative to familiar social media sites like Twitter. And it gives you much more ownership and control of your data from both an operator and a user perspective. Okay. So this is probably the number one question I get asked, which is how did we come up with the name Hachyderm? And if you talk to my friends in Italy, it's "hashaderm" or "hashadermio", or I've heard a lot of different variations of it. My partner came up with the name. It's ultimately a play on words with hacky and pachyderm. So hacky is a clumsy, temporary, or inelegant solution to a technical problem. And pachyderm is a large thick-skinned mammal such as an elephant, rhinoceros, or hippopotamus, obviously mastodon. You see where we're going with this. And so we like to say that Hachyderm is a clumsy, temporary, or inelegant thick-skinned social media server.
And depending on how successful some of these people with their laptops are, we're going to see how thick the skin really is. So again, right now we have roughly 45,000 Hachydermians, which is what we call the people in the community, and that is a lot of people. I wasn't prepared for the sheer number of people and the sheer number of, like, bizarre things that we would be getting into as we approached this scale. And we have 20,000 people who are active. And so there's a lot of implications of that specific ratio. But for the most part we see a lot of traffic go through our network every day. And I think at least once a day there's some sort of crisis. So we have all of the major problems of Jurassic Park, of a major theme park, of a normal technical shop, which has been fascinating to kind of watch this whole thing grow. The Hachyderm community is pretty interesting. And to be completely honest, I'm still not really sure how it ended up the way it did. But it's mostly composed of technical and open source professionals, such as people here. It's similar to Fosstodon. Who here has heard of the Fosstodon Mastodon server? That one's also great. Also, I have some colleagues who work on the InfoSec one. That's also another good one. But I see a lot of, like, SRE-style people, InfoSec-style people, a lot of senior engineers, directors. We even have some executives. And then we also have, like, honestly just some very beautiful anonymous hackers who keep everybody in check. And so it's a good blend of people. And we see a lot of interesting things come through our various servers. So our about page reads: here we are trying to build a curated network of respectful professionals in the tech industry around the globe. And the around the globe part is the interesting part, especially when we start looking at the legal implications of this, which again, we weren't necessarily prepared for.
And we welcome anyone who follows the rules and needs a safe home or a fresh start. I think this was personally a big one for me. And I think this is also very relevant to a lot of the folks that I know who have joined in the last few months. I do think that there's like some pretty, in my opinion anyway, some pretty toxic mental health situations that folks find themselves in using Twitter. And I think that this is kind of an opportunity to just like rip the Band-Aid off and start fresh and kind of establish some new habits for people and some new self image for people. And so I do see a lot of people kind of reimagining themselves and reinventing themselves when they come to Hackaderm. But yeah, ultimately, it's hackers, professionals, enthusiasts, and we're passionate about life, respect, and digital freedom, and we believe in peace and balance. And I wrote this very casually like on a Twitch stream, and those words are actually pretty important now that we're continuing to dive a little deeper into what they actually mean. I think the thing that kind of comes to mind right now, the word professionals and enthusiasts right next to each other, when you come to a certain scale, having a lot of enthusiasts sit alongside professionals comes with some consequences and balancing these two things is actually pretty challenging from an operation standpoint. But ultimately, we want to be a safe space for the tech industry, for people who want to talk about the economy, open source intelligence, news. We talk a lot about Rust, we talk about Linux. Who here was at my talk on Aura yesterday? Awesome, thank you. So a few folks here. So that's a new project that I'm trying to get more people to talk about. We talk about Kubernetes, Go, et cetera, et cetera. So anyway, we're going to spend a little bit of time talking about this blog post that I wrote called Leaving the Basement. And to set the context a little bit, Hackaderm literally started running in my basement. 
And this is the story of how we kind of ended up moving out of the basement and dealing with some pretty substantial scale problems. I think it was in the middle of November, we started to, the service started to degrade. And there was a lot of consequences of just shutting the service down. And so people were getting very aggressive on the internet. As it turns out, the internet is full of grown men with opinions. I don't know if any of y'all have noticed this or not. But yeah, sometimes these grown men with opinions have very toxic opinions, and they like to say a lot of things about people's services. And so we tried to do our best to keep a positive attitude and just continue to move forward. So this is the story of what actually happened behind the scenes and how we ended up there. And I think there's some really good takeaways in this from a technical perspective. Okay, so we'll begin our story on November 27th of last year, 2022. And also keep in mind that, you know, this is one month before the holidays. So this is about the most burnt out I ever get every year. So usually around the end of November, I'm honestly, I'm about the most I can say to anyone is like fuck off, like I just really need some space. And I need a break and I want to go relax and I want to sleep in. And this is when our new service decided just to completely go down. And this was a really good growing opportunity for people. So we have some really interesting numbers here. And I tried to do my best to build a graph. And this is like, it looks like a very stereotypical stock graph that's like pointing up into the right. So I feel like I should just, you know, do like a good, like, hi, guys, we're here to talk about business. And look, our business is going up into the right. And like business numbers are important because growth and strategy and impact and business. But honestly, this is just the amount of people who were leaving Twitter. 
And really, I think they were just kind of looking for a new home. And we just happened to be one that met their needs for the time being. So up until November 1st, we had less than 700 people. The prior six months, the service was online. That's how we gained those 700 people. So it was roughly 100 people a month for the first six months. And then this happened. And this was very unexpected for both myself and everybody in my immediate circle. So one of the things I talk about as a professional SRE. So let me back up. When I'm not keeping the Mastodon Ruby monolith online, my other job is keeping the GitHub Ruby monolith online. So some of you use GitHub. Some of you use Hachyderm. I work on both of them. And I have two different YubiKeys here in my backpack, one for each service. And so anyway, one of the things I often say is when I enter a conversation with someone, this is the most important thing. And I honestly want to get this as, like, my next tattoo because I say this at least once a week, which is: what is the current state of the systems? And if you can't answer this question very confidently at any given moment, especially in a crisis, we should be having other conversations at that point, because this is the starting point of every conversation in my opinion. So we'll start our service discussion off here. So we had a rack of hardware in my basement. And these are the specs that we had running in the basement. So it was a hobby rack that I've collected over the past 10 years or so, you know, pieces of hardware that have been donated to me or that I found used for a cheap price. In fact, the star of the show, Alice, over here on the far left, I've literally carried her across Market Street in San Francisco and dropped her in a pile of, like, pee on the side of the road. The hardware has been through a lot, to say the least, but it's what we had, and this is what I was using for kind of a home lab at the time.
I think the important thing here to notice though is that these are not trivial computers. These are proper rack-mounted servers with proper specs and, for the most part, these worked fine. It got the name the water tower because we had Alice, who was our main compute node, and then we had three identical Dell PowerEdge R620s named Yakko, Wakko, and Dot, respectively, and all three of them seemed to just be up to shenanigans at any given point in time, from memory failure to broken boot loaders to just bizarre networking behavior and having to go swap out NICs to try to get a better network connection. There were just a lot of obscure things that were happening at the hardware level. So, meet Alice. She's a very infamous server, especially if you've read any of our posts or if you've ever watched my Twitch stream, but there she is there, and that's in my basement, and that's the Dell R630, and you can see she's got eight SSDs in the front of the carriage there, and she was sitting behind a firewall of my own design, and that was our main endpoint for pretty much everything I ran in my home lab, and that just so happened to be the main endpoint for our Mastodon service up until the month of November. So, yes, it was a home lab, and I think the whole point of this is that we used it for a lot of things, and so the Mastodon service was running on the home lab, and I do a lot of really bizarre things on Twitch, so if you follow me on Twitch, you probably have seen me work on kernel modules and experimental eBPF probes, and I've experimented with adding some features to the ZFS file system and compiling my own version of ZFS from scratch, and I've been doing a lot, and I also installed Mastodon on that same server, and that's the key part of this. So, here's a list of things from my home lab that have not blown up.
There has not been a billionaire who decided to buy a company that decided to insult the broader technical community and encourage them to move off to a decentralized service, and so you've probably never heard of any of these, and that all of these also run in that same home lab, so I think it's important to realize that this was a very unexpected event, and that these servers were in a pretty high state of entropy, and we didn't really have a good idea of the state of the systems. This was a home lab. So, as it turns out, 50,000 people trust me and really dislike a certain billionaire, and this was the one thing I kept hearing. We kept having large to medium size and smaller size name people with substantial Twitter following, like shoot us an e-mail and be like, yo, Nova, I'm done with Twitter, we're going to come to your mastodon server, and I'm like, okay, sounds cool, and then they have, you know, 350,000 Twitter followers, and it's not about the followers, but from an operator's perspective, I'm like, holy shit, that's a lot of traffic. That's a lot of people that we're going to have to open up web sockets against, and there's a lot to deal with there if you're going to be sending all of these messages out to all of these people who are going to be following you, and they all continue to say the one thing, which is like, well, we trust you to not screw this up, and like, you know, you can probably do better than he can, so we're just going to move over to your server anyway. Okay, so what had ended up happening is, it's a long story, and how we ended up finding it, I think ultimately took about three weeks, but we don't say root cause anymore, we say core cause, and the core cause of the incident is ultimately we had a bad disk on Alice. 
I don't know why the disk was bad, this really bothers me, so I'm just going to chalk it up to, like, it was just a bad one in the batch, but we were able to actually isolate it after the fact, and determine that a basic read or write to this SSD was in fact the problem. I also think an interesting takeaway here is that these were not consumer SSDs, these were actual proper enterprise SSDs, one of which either just decided to get slow I/O, or I don't know what happened to it, but even in an isolation zone writing directly to an ext4 filesystem, we were still able to prove that this disk was substantially slower than another one of the same make and model. So it wasn't always bad, and it started to go bad, and this ultimately led to a cascading failure across our CDN and our geographic edge nodes, and so the interesting thing, and this is just one of those things, this is the aforementioned bad disk, and it also for some reason has a broken chassis in the front, so part of me kind of has to wonder, did the movers drop the server, or did something happen? I'm not really totally sure, but these are the woes of operating your own hardware in your basement. So here's a model of the cascading failure. Who here has dealt with cascading failures in production before? Okay, so 15 or 20, 30 hands or so. These are fascinating, how you get into these situations, and usually when you're dealing with one of these cascading failures, you're not really starting at the database, or at least you glance at the database and you think maybe something's wrong, and you usually blame DNS, but in our case, we were working back from our CDN. So imagine you are operating a Mastodon server in your basement, and 50,000 people on the internet decide to join, and all of a sudden you can't even join a Zoom call the next morning, because your internet pipeline is so throttled by your ISP, who's like, bro, why are you bringing this much traffic to your house?
I don't understand what's going on, this is very bizarre. So the very first thing we did to offset the problem was we set up these CDN nodes around the world, and these basically served as NGINX reverse proxies that had a media cache on them, and we would then route the traffic through a dedicated connection from one of these CDN nodes back to Yakko in my rack, and then Yakko would proxy the data over to Alice, and Alice was our primary database running in the rack. So when things started to fail, it was, like, very intermittent failures in Frankfurt, and then we would get some very intermittent failures in Fremont, and it all looked like NGINX was the problem. We were getting timeouts and slow requests, and this whole incident is what later inspired us to build that dashboard that you see today, and it's the reason I said we should be looking at those HTTP request times when I very politely asked you all to please DDoS my server. And so that transferred all the way back to Alice, and we learned entirely too much about Mastodon at scale, retracing everything back through the rack, and we had to go and trace Redis logs, and Sidekiq queues, and Mastodon Ruby servers with the Puma server, and ultimately we found out that it was simply just Postgres unable to read from and write to the database as fast as we would like. So these are what the graphs looked like the day of the outage. We grabbed some screenshots, and I'm really glad we did because these make for some interesting takeaways here. On the left side you can see our HTTP response time, and so these are our GET 200s. In some cases they were returning a 200, but we were having, like, 40-second responses. Was anybody here on Hachyderm when it was in this weird, like, hangy stage where you kind of could upload media, but you kind of couldn't, and you're like, what the heck is Nova doing, she doesn't know how to operate a service?
So this is what we were working on, we were working backwards from these graphs, and it was interesting to see the behavior of Mastodon under these conditions because you very quickly realized that different parts of the user interface were coupled with different parts of the back end, and so, and they all assumed that the entire user interface would work. So if the database started to go slow, maybe you could upload the image, but we couldn't actually write the image key to MySQL, and the UI would just kind of just exist in this in-between stage for like five minutes at a time. It was very interesting behavior. But ultimately, we isolated out the IO on disk, and we were able to determine it was old SDG and old SDH down here in the bottom right. You can see these numbers are closer to 100% for IO on our disks, and this was what was causing those cascading failures. So ultimately, this was a very exciting time. People were joining Mastodon around the clock, and our little group of people that hung out on Discord very quickly turned into a more serious group of people who hung out on Discord, and it was really fascinating to watch friends of the Twitch stream and my partner Quintessence, and there's even people here in the room. Malte and DMA, are you right here in the front? We are now best friends, and we wouldn't necessarily be friends if it wouldn't have been for this whole incident in the first place. So we were definitely working around the clock. I think Malte and DMA would kind of hand the service off to us when we woke up in the morning, and we would work until they woke up the following morning, and it was just this constant game of providing quick summaries of our work and then just like crashing and going to sleep for a few hours and trying to hold down a day job while we dealt with the service. And this is for the most part what it felt like behind the scenes. 
We had a dedicated channel where we were trying hard to work through things, and I think this is Malte just sent to the image. This is the moment where we finally realized what was going on, and we were starting to isolate the problems on the disks, and I think Malte was just like, okay, we finally found the problem. It's exactly what we thought it was, and everything is fine. This is going to be fine. And meanwhile, we have, you know, main names and technology joining the service, and things are kind of burning down all around us. From the human perspective, I wanted to share two interesting failure modes that we got into as people that I think are just an interesting takeaway for anybody who operates a production service. So the first failure mode was in a state of panic, I tried to just throw more computers at the problem, and so my response was like, we're going to go put more computers in the rack, and I turned on dot for the first time, and gave dot a public IP address, and I think the other big takeaway here was we got very good at doing the wrong things, and I think this is a very, very familiar trap for a lot of the organizations that I work with every day, is there will be some crisis, and they will respond to the crisis by doing something. 
In our case, it was creating a spreadsheet, and the spreadsheet helped us do some quick math, and that math helped inform how we needed to provision our different systemd services, and then when we changed a systemd service, the rule was you needed to go update the spreadsheet, and this was a reaction to a crisis that allowed us to move forward, and then it was very difficult to get out of this situation. So I do think that there's a very interesting takeaway: you get in the habit of doing the wrong thing or doing a bad behavior during a crisis, and that can actually persist and last longer than the actual incident itself. So we had all the major problems of a normal SRE team, and this was a volunteer open source project to begin with. Okay, so I have a friend in Boulder, his name's Gabe, he and I have known each other for a long time, he's grown very quickly in his career, he's now the Chief Product Officer of DigitalOcean, and Gabe texted me one day and says, hey Nova, so I bought this farm, and I'm trying to upload rooster pictures on your website, and I can't upload my rooster pictures on Hachyderm today. What's going on, and is there anything DigitalOcean can do to help? And so we were in a situation where we were trying to come up with a plan. We had just identified that the disks were the bottleneck and the single cause of our infrastructure problems, and I think this was the first time I kind of realized, like, oh, we have 50,000 really smart, well-connected people who can more than obviously help us with our problems, and really the problem is how do we reach out to them, give them access to production, form a plan, and execute on that plan. And it became very obvious that our main problem wasn't necessarily fixing the disks in the basement, it was managing people, and it was organizing people to work on the service and making sure that we were in a good position to accept help from a corporation such as DigitalOcean in the first place.
So Malte here, he's going to get embarrassed, but can we just give him a round of applause for this plan? He's smiling, but honestly, if there was a Malte-saved-the-day kind of moment, like, straight up Malte saved the day. He came up with this very interesting NGINX pattern that allowed us to effectively move our data off of the bad disks in the basement to the DigitalOcean service without taking the service offline, which you're like, okay, that's pretty cool, you can keep the service up, and you can start to fix the problem at the same time. Additionally, this actually gave us a means of getting the data out, and everybody who used the service contributed to the data migration. And so what we did is we set up this, who here is familiar with the try_files directive in NGINX config? A few people. You should, if you get time, go read about try_files. This is a fascinating thing that NGINX does, and what we were able to do was point media.hachyderm.io at Alice. We were able to point all of the CDN nodes towards Alice, and Alice would first try to source the file from S3 running in DigitalOcean. If it could find it, it would then return that directly as basically a reverse proxy from S3 to the client, and otherwise it would source it from the disks locally in the rack. So every time somebody read something coming from the rack, whether it was an image or a post, it would then persist into S3 on the back end, and we would never have to serve that image ever again from Alice. So this was a clever solution, and it gave us a means to slowly start transferring the data, and every minute we transferred the data was another minute that it was likely going to be served from a cloud provider and not from my really crappy hardware running in my basement. So the disks were so slow, I mean, in my mind, these disks could be personified.
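The read path described here can be sketched in Python. This is only an illustration of the fallback order, not the NGINX configuration the team used, which the talk does not show. The names `fetch_media`, `object_store`, and `local_disk` are hypothetical, the two stores are plain dictionaries standing in for the S3-compatible bucket and Alice's disks, and copying on read is an assumption about how files ended up in the bucket.

```python
def fetch_media(key, object_store, local_disk):
    """Return (data, source) for key, preferring the object store.

    object_store and local_disk are hypothetical dict stand-ins for
    the S3-compatible bucket and the slow disks in the rack.
    """
    if key in object_store:
        # Already migrated: serve it without touching the slow disk.
        return object_store[key], "object_store"
    # Fall back to the local disk (raises KeyError if the file is nowhere).
    data = local_disk[key]
    # Assumed copy-on-read: persist so the next request skips the disk.
    object_store[key] = data
    return data, "local_disk"


local_disk = {"rooster.jpg": b"rooster-bytes"}
object_store = {}

first = fetch_media("rooster.jpg", object_store, local_disk)
second = fetch_media("rooster.jpg", object_store, local_disk)
print(first[1], second[1])
```

In this toy model every read of a file that has not been migrated yet moves one more file off the slow disk, which is the sense in which everybody who used the service contributed to the migration.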
They were, like, beaten, they were tired, they had been through hell and back again, and it took eight days for us to rclone all of the data, which was about two terabytes of data, of rooster videos and cat pictures and Caturday hashtags and all kinds of Mastodon things, over to DigitalOcean S3, and this was all courtesy of Gabe, who was like, bro, I just want to upload my rooster pictures. So as we moved the files out of the basement, it became obvious that running this service in my basement was no longer going to work for us and that enough people had joined that we had reached critical mass. So our next decision was, okay, where do we actually want to move the compute to? And I think we all kind of have been a little bit traumatized by the vendor lock-in in the tech industry as it exists today. And so I think looking at Hachyderm, there were a lot of people who were very critical, myself included, of a dependency on various corporations. So we definitely didn't want to just go throw money at Amazon, right? Amazon has enough money. We're good taking our little community and putting it there. And we didn't want to go do the same thing at another cloud provider. So ultimately, we made the decision to go to Hetzner in Germany. Whoo, Hetzner. Another good caveat here is that from a legal perspective, Germany has some of the most restrictive privacy laws. And so this is going to be about the most isolated zone we're going to get in today. And a quick glance and a quick consultation with a lawyer told us that Germany was going to be the safest place to start the service from. So again, our biggest concerns had almost nothing to do with the crappy disks in my basement and almost everything to do with, like, international privacy law and user data. And we very quickly found ourselves having discussions about the complications and implications of operating a global service. So here is our most recent diagram of how we kind of set things up.
You can see that we had to balance things in my basement with things in Germany. And you can see that we have a set of CDN or point-of-presence nodes around the world. So it was very exciting for me when I flew across the ocean from Seattle to come here to Brussels, because for the first time since the outage our service was actually fast and responsive again, because I am now being proxied through another server now that I am here on a different continent. So now what? Okay, so we've reached the point of stability. Our servers are stable. People are able to send their rooster videos again. And we're still very much not out of the weeds. We still have a lot of concerns we need to deal with. So in general, the top Ruby monolith problems that we have solved to date are, first, Sidekiq scaling. Who here has operated Sidekiq before? It's a Ruby thing, show of hands. It's a Ruby daemon where you have to specify the number of threads and concurrent workers at runtime. And Mastodon is built on this. So every time we federate with a server, there's a whole queue that runs in the background that does the federation for us. We've also had to tackle network scaling, and we have a global CDN of nginx reverse proxies that have a cache on the edge, so that the more people who look at an image, the more it's served from the cache. And all of those have legal implications. And it's just been a lot of work that we've had to get into just to operate a basic service so that we can all sit here in this room and I can make the joke, please go DDoS my web server on the back end. So here's a graph of our egress data. The top of the graph here is roughly one terabyte of data per day. So you can see that over on January 26th, it looks like we peaked at over a terabyte of egress data. So honestly, from an enterprise and scale perspective, this is no trivial amount of data, right?
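To put roughly one terabyte of egress per day into network terms, here is a small back-of-the-envelope calculation. It assumes a decimal terabyte (10^12 bytes) and traffic spread evenly across the day, which real traffic is not, so peaks would be noticeably higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbit_per_s(bytes_per_day: float) -> float:
    """Convert a daily transfer volume into an average rate in megabits per second."""
    bits_per_day = bytes_per_day * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


one_terabyte = 1e12  # assumption: decimal terabyte
print(f"{average_mbit_per_s(one_terabyte):.1f} Mbit/s")  # prints "92.6 Mbit/s"
```

An average of about 93 Mbit/s sustained all day is a meaningful share of a 1 Gbit/s link before any peak-hour bursts are considered.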
We're moving a lot of data across the wire, and the fact that Hetzner can support us is very nice and seems to be working well for our needs today. Another interesting thing about federation in general that we've had to learn as a community is that there are actually a lot of moderation consequences. And there's a pretty big user data and user privacy risk with operating Mastodon. And so I put this diagram together just to illustrate some of the consequences that we've had to deal with. In this case, we have three instances: one friendly, one neutral, and one evil. And even if the friendly instance decided to block the evil instance, for whatever reason they deemed to be a cause for that blocking, it's still possible for content to get out and to end up federating with another instance. I think what's important about this is that it means we can end up with content that is potentially illegal in the United States, or illegal to have without an 18-and-up warning, which puts myself, my family, and everybody who works on Hachyderm at risk. And so we've been trying hard to figure out how we actually manage content and get to a point where we can manage this in an effective way. And let me just say, I cannot thank the content warning feature on Mastodon enough, because that actually gives us a lot of insight into the types of things that could potentially be harmful. So ultimately, we had a lot of top non-Ruby-monolith problems. So obviously, there was a legal concern. We have a team of moderators working around the clock who just deal with trolls and people who are causing problems and bad actors, and they're having to make judgment calls. And we have to establish rules, and these rules need to be enforced, and we have to respond to people, and people have really good reasons. There are videos out there that are very disruptive, and we have to go respond to them. And it takes a lot of work just to balance that on the back end.
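The three-instance diagram mentioned above (friendly, neutral, evil) can be modelled as a tiny graph. This is a deliberately simplified toy, not how Mastodon or ActivityPub actually decides what to deliver: it assumes a block only cuts the direct link between two instances, and that any instance may pass content on to its peers. The instance names and the `reachable` helper are made up for the illustration.

```python
from collections import deque


def reachable(origin: str, relays: dict, blocks: set) -> set:
    """Return every instance that content starting at `origin` can end up on.

    relays: instance -> set of instances it passes content on to.
    blocks: set of frozenset pairs; a block only removes the direct link.
    """
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for peer in relays.get(current, set()):
            if frozenset((current, peer)) in blocks or peer in seen:
                continue
            seen.add(peer)
            queue.append(peer)
    return seen - {origin}


relays = {
    "friendly": {"neutral", "evil"},
    "neutral": {"friendly", "evil"},
    "evil": {"friendly", "neutral"},
}
blocks = {frozenset(("friendly", "evil"))}

# The direct friendly/evil link is cut, but content still travels via "neutral".
print(sorted(reachable("friendly", relays, blocks)))  # prints ['evil', 'neutral']
```

In this toy model, blocking one instance is not enough on its own, which is the moderation problem the diagram is pointing at.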
And the whole thing is run by volunteers. And ultimately, where we are right now is we're spending roughly 1000 euro a month in hosting costs alone, between the DigitalOcean bill and the Hetzner bill. We have an email API. So every time you go and sign up for the service, you have to get an email so we can validate who you are. And all of this is coming from donations as they exist today. Okay. So if you want to learn more about Hachyderm, the community, and how we run things, we have a dedicated community resource. If you want to go check it out, that's where we do things like announce our rules and our policies, and we document how we make moderation decisions in general. So the consequence of all of this is we've decided to found a new foundation called the Nivenly Foundation, which is very exciting. The name is just the name of my blog that we turned into a 501(c)(3). And like most things in my life, I kind of want this foundation to be relatively boring, but this will be the legal entity that will be used to protect Hachyderm and to hopefully fund the process moving forward. So right now the Nivenly Foundation has two projects, one of which I talked about yesterday, called Aurae, which is a distributed runtime written in Rust, and we also have Hachyderm. This is exciting because this feels like the 90s. We have an open source service. This isn't an open source project that you can go download. We legit have an open source service, with graphs and people with pagers, that we have to go and operate. And so that's an exciting thing that the Nivenly Foundation gets to do. So I want to introduce my wonderful partner who's not here, who is the executive director of the Nivenly Foundation, and also the person that we hopefully didn't just wake up by DDoSing the server. Anyway, she does the majority of the work and she couldn't be here today, but can we just give her a round of applause?
Because she is actually the one who gets everything done. So she manages the infrastructure team right now. She's managing the moderator team right now. She even created these teams in the first place, because people were freaking out and didn't know what to do. And so she wakes up every morning and deals with everything that Hachyderm throws at her, and I honestly can't thank her enough for the hard work that she's done. So one of the problems we've had to solve is a governance model for this whole thing. So we now have an open source service. There are legal risks, and how are we going to make decisions as a nonprofit? And so we started to look at some of the consequences of modern-day social media and some of the consequences of how corporations are navigating different open source spaces. And some of the things I noticed were that, for the most part, on Twitter especially, communities are very isolated from decisions. Users are detached from the technology and how things are done. And people are usually unable to impact change. So I had gotten into some trouble with Twitter. They banned my account. I wasn't able to talk to anyone. I had no avenue by which I could go and actually communicate with this corporation. And that became very problematic for me, because I used Twitter for a lot of things professionally. So what I started to realize was that corporations usually have more influence and a better standing in the fabric of the economy than a regular person does. And so as soon as I was able to interface with a corporation, I realized that I was no longer isolated from decisions. And I found that corporations often are not detached from the technology, and corporations are in fact able to impact change. And I became obsessed with this idea. And I wrote a whole book about it. And everywhere I looked, I saw this idea that ultimately corporations seem to have more rights than people. And that was very difficult for me to reconcile.
I also think that this general observation explains why we see a lot of this on the Fediverse today. I think that there is this culture of cyberbullying and assuming that the people operating servers are inherently evil. And I see a lot of criticism instead of a lot of contribution. And as somebody who comes from open source, and I've worked on Linux and FreeBSD and Kubernetes, the Go programming language, the Rust programming language, it's very difficult for me not to intuitively walk up to a project and want to contribute. And so I guess this is just my way of saying that Mastodon and the Fediverse give us an opportunity to no longer isolate people from the folks who are operating the services they use every day. And that's very exciting for me. So in our governing model, we want to figure out a way to balance communities and corporations. And this is the hybrid model that I'm hoping will actually be able to create a sustainable governing model for what we're doing. So right now, while we think corporate sponsorships are important, we're actually going to have two forms of non-corporate membership: project members, a status you can achieve simply by rolling up your sleeves and either contributing a project or becoming a contributor to one of our existing projects, and general members, which is a small opt-in monthly fee that we have a few hundred people paying right now. And the beauty of this is all general members are going to have a vote in how we do things. So if Hachyderm, the Mastodon server, wanted to, let's say, let a tech company have an account, and that became controversial, anybody who makes a monthly donation to the service is now going to be able to have a vote in how we do things. And we're actually going to introduce a concept of open-source democracy. And we're going to be leveraging open W3C protocols to make this happen.
And we still have some math to do to figure out exactly how much this is going to cost. However, this model is all built around the idea of a cooperative, which you see a lot of successful global companies do to balance the different laws and trade-offs of different economies around the world. So my hope is that this will be slightly more sustainable and break down the barrier between corporations and people, because people now have a vote, influence, and authority in how we do things. So we're still in the very early stages of this. If you want to talk more, I'll be here at FOSDEM, if you want to talk about Mastodon. And very specifically, if anybody here has any opinions on open-source democracy or how to build an open-source democratic model such that users can vote, I would love to talk to you. I want to learn as much as I can, and I want to help get Nivenly to a point where we actually have a sustainable model, and maybe we can learn some things from the various policy and legal efforts going on here in Belgium and in the EU. So now what? Now, really, it's just keeping Hachyderm online, which we're about to see if it is. Hopefully it is, because I really feel like y'all would have been able to do a lot of damage if I had been giving this presentation last November. And we just want to work towards a democratic model, so that people who use the social media service have a vote and have influence in how that social media service is run, so that it becomes everybody's social media service and not my social media service or somebody else's. So thank you to everyone who's been working on the service so far, and thank you to DMA and Malte, who are here in the front, and specifically to the infrastructure team, who helped us get out of the basement and keep the service online so that we can all have cat pictures and all the wonderful things that come with Mastodon. So thanks, everyone. Cool. And I grabbed a photo.
So the test here is going to be this: I'm going to try to upload the photo during questions, and we'll see how it goes. So here's a public resource. If you want to go check out the graph and see if there was a spike, you can go to grafana.hachyderm.io. And if you want to find links to my slides and a recording of the video in the future, please go to github.com. And thanks again. And I guess we can do questions if anybody has questions. There's one over here. Right here. He's got his hand up. Okay, okay. I'm sorry. Sorry to interrupt, everybody. If you could please leave quietly, we are going to do Q&A right now. So we're just going to have a bit of Q&A. Please leave quietly. Thank you. I'm sorry, Kris. Can you show us the Grafana panel? I can't hear you. I'm sorry. Okay, okay, okay. My questions. Thank you. Okay, right now. So the question was, can we see the Grafana table again? Sure. Awesome. Grab a photo of this. Whoever did this, round of applause. That's awesome. Hi. I was wondering, you were saying you could contribute skills or money to help. What are some ways that we as developers, engineers, SREs can help in the near future with keeping the video? It's a really good question. So the question was, how could we potentially volunteer or help out other than just throwing money at the problem? So the person to talk to is Quintessence. And we have a whole mod team right now that's working on onboarding docs. And I think we have 12 people right now. And these are folks from various tech companies around the world. And we have a Discord. So there's a link in the public resources I put up. And there's a section on volunteering. And you can just interface with the team and get plugged in that way. Yeah, of course. Okay. More questions? Yes. You mentioned a thousand euros a month for the hosting. But I was wondering if you had an idea of what your total cost of ownership is now. And if the increase is linear with the increase of users and traffic.
Sorry. The total cost of what? You mentioned a thousand euros per month for the hosting. But I guess your cost is much, much higher than that. So I was wondering if you know what your total monthly cost is now. And if it's been a linear increase with the number of users or? So the question is, does the cost of operating the Mastodon server grow linearly with users? And the answer is no. It does increase with users. But I definitely think there's a threshold where you move from a small size to a medium size. And I think the traffic was really the deciding factor for us. So earlier it was just a few servers that we could operate on a small pipe. And now that we have a much larger footprint, we have to pay for a more enterprise-grade network, and potentially a CDN and DDoS protection here in the future. And so that's grown quite a bit. And that's probably our biggest cost right now, just the network. Cool. Any other questions? Hey, great talk. Did you or any of your friends evaluate any other Mastodon-compatible solution, like Pleroma, Akkoma, or any of that? Sorry, say again. When setting up Hachyderm, did you or any of your friends evaluate any of the Mastodon-compatible servers like Pleroma, Akkoma, or any of that? So this is a really good question. So the question was, when we were setting up Hachyderm, did we look at any of the other Mastodon services like Pleroma or anything else? So the answer is no. And again, it's not like there was one day where I woke up and said, I'm going to go build a Mastodon server and I'm going to try to get all of the tech industry to come join us, right? I set it up for me and my friends to just try out, and Mastodon was the easiest one to get running on Arch Linux. And that was about the most thought that went into setting up Mastodon originally. And I think that it had just continued to grow organically.
And so in hindsight, I think there's opportunity to rewrite parts of Mastodon. I think there's a lot of opportunity to have alternative dashboards as well. And so I'm not opposed to operating different services for Hachyderm. I like to think of Hachyderm as a social media service where we just are mostly on Mastodon right now, for today. So I don't have any personal experience operating the others, but I suspect that, you know, as we move forward, the community might decide to switch over or run a different version or who knows, right? That's going to be up to the community now. All right. I have a further question. How fast was your internet speed at home to serve the Mastodon server? Sorry, say again? How fast was your internet speed at home to serve the Mastodon server? So you showed the stats of your server setup with something like 40 gigabits of possible network bandwidth, but how fast was the bandwidth actually provided by your ISP? Yeah. So this is a good question, which is how much bandwidth were we going through at my house? So there's an official write-up of the situation where we have some screenshots of the firewall at the house. And ultimately, I think one terabyte was what we pushed over the ISP on our busiest day, in the middle of November. So I had two connections, one of which was symmetrical 1G up and down that we were able to use, and we maxed out our pipe. We were being throttled by the ISP at one time. Yeah. Yeah. Thank you. |
Running a Hybrid Event with Open Source
The Plumbers Experience |
So, it's been a while since I've been at FOSDEM, and I usually talk about technical topics, but this is really me talking about using open source instead of producing it, which is quite an unusual position for me to be in. I work most often on things like the Linux kernel. I also work in open source politics, so I did a lot of kernel interaction with the Linux Foundation, and one of those pieces of interaction was actually being a founding member of the Linux Plumbers Conference, which is the conference we'll be talking about in this presentation. Plumbers was founded in 2008 as an in-person conference, and it had always been an in-person conference up to the pandemic years. It's actually organized somewhat similarly to FOSDEM, in that it has tracks and what are called microconferences. The microconferences are similar to dev rooms except they're a bit more interactive, so they're really expected to be participant conversations, and usually you get a cluster of people at the front talking to each other. The tracks themselves, when we have them, mostly tend to be presentations, so very like this, but the microconferences tend to be very discussion based. If you think about trying to do that hybrid and online, you realize you need some sort of highly interactive video system, which is what we went through. One of the good things we had for Plumbers is that the conference actually produces quite a lot of money. We have quite a lot of sponsorship, so it's quite easy to pay people to do things in the conference. One of the things we paid for in 2018 was to actually live stream the conference for all of those who couldn't attend, and also to try and put the videos of every presentation and every discussion we had up on the website. Obviously, to do that, we needed to pay somebody to run AV in each of our rooms, and Plumbers tends to run somewhere between four and eight simultaneous tracks.
We needed about 16 cameras to do this, and between four and eight people to stay in the rooms to do it. Obviously this then included provision of cameras in the rooms. The way we set up cameras at Plumbers is slightly different from what you have here, because in your setup, the camera is mostly looking down at me, because you think I should be the focus of attention, which is true because I'm giving the talk. When you're actually trying to record an ongoing live discussion that mostly happens between people in the audience, you also need the people who are watching the live stream to be able to see it. One of the other fortuitous things is that Plumbers is always set up with two cameras, one at the back looking down on the speaker and one at the front looking out at the audience. The AV guys are responsible for switching between the front view and the back view, so that when we start to get conversations within the audience, the camera can actually catch them and you can watch it. This actually turned out to be really fortuitous as well. Obviously, when you try and record your conference, the first thing you realize is that in a highly interactive conference, most people don't use the microphone, and it's really annoying for everybody on the stream when you get a load of silence in the room. Getting people to use microphones is one of the key things that you have to do when you start recording your conference. This is still nothing to do with trying to get it live or hybrid. This is just recording the conference. You have to persuade all participants in the conversation to actually use a microphone. However much money your conference makes, you will only have a few microphones. In this room, we have two microphones at the front. I have one lavalier microphone, for a room that could possibly seat over a thousand people.
Trying to run around with microphones when you're doing an interactive conversation is really difficult, as I assume the guys with orange t-shirts will be able to tell you over a beer after the event. The solution we took was borrowed from another conference in Paris called Kernel Recipes: these things called catch boxes. Effectively they're square plastic polystyrene blobs that have a microphone inside, which you can throw from one end of a conference room to the other. This gives you a way of getting the microphone between the participants very fast, and they're definitely fun to use, and they do encourage participation. So far we've had no injuries. I should actually qualify that. We've had no serious injuries so far with people throwing microphones at each other. I assume there's always a first time. The idea was really to try and make the conference that we were running available in real time to people who can't participate in person, because it's a kernel development conference. We get people who want to come from all over the globe. We always have corners of the globe, especially when we run the conference in America, where, even if they have the best will in the world and they can actually scrape up the money to come, people are often denied visas by the US authorities. So it can be a really difficult problem. And for our first attempts at doing this, the communication was really only outbound. So the remote participants could watch what was going on in the room, but they couldn't really participate. And obviously this all changed in 2020 when we got to the pandemic years. Like all other conferences, Plumbers moved to being fully online. In 2020, we had already booked a venue up in Halifax, Nova Scotia. We had a huge amount of fun trying to get out of that venue contract when it emerged that we couldn't actually hold the conference up there. And then we all had to scramble to try and actually do this fully online.
So this is where the story becomes quite interesting, because there are 10 people on the Plumbers committee. Mike Rapoport over here is one of them. Because of the conference's origin, we are primarily kernel developers, and because we were moving fully online, effectively, kernel developers were becoming responsible for web programming. Because what we were really doing was setting up a distributed set of web services. And if you know kernel developers, we mostly tend to look down on web developers, theoretically, because obviously kernel development is at the hard core of everything. So this is somewhat a story of how a bunch of kernel developers tried to do web and failed a bit, screwed up quite a lot, and eventually got it working by the seat of their pants. We evaluated a load of options for actually going fully online, as every conference did. Everyone scrambled, and we discussed it a lot with other conferences. People had different ideas. We were actually under the umbrella of the Linux Foundation, and they're a much more commercial entity than we are. We're still volunteer-run, even though we pull in a lot of money. Their idea was just to pay somebody to do the solutions. And frankly, in the early days, that looked like a good idea to us. If you can just pay money to get rid of the problem, why wouldn't you? And we have enough money. We've got almost a million dollars in the bank now for Plumbers. We could afford to spend a lot of that money to pay for somebody to sort out these problems. And the Linux Foundation tried to do it for several conferences, and we looked at their solution and went, you know, this is terrible. I mean, why is it so difficult and so bad trying to run a conference? And so the only way we could foresee getting this working was if we just abandoned all of the commercial solutions and did it ourselves, which was much more difficult to actually do than we imagined.
But the goal of this was that we definitely needed, in our online conference, to have two-way video interaction in real time between the participants. That was the goal. And there was no real conference system that did that. If you've tried holding conferences over Zoom, you can just about do it. But things like BBB actually do an awful lot better than this. And the other thing we discovered is that when you're running a collection of web services, it really is all about the integrations. Open source gives you a lot of excellent pieces. So we use BigBlueButton, we use Indico for the conference scheduling, and eventually we used Matrix for the chat; we didn't use it in the first year. But there's also a downside to that. Whenever you try and integrate web services, you always write a set of glue around them that needs to be maintained. And if there are a load of disparate services that aren't used to talking to each other, the glue does the impedance matching between the services. And as the years go by, and remember this is a conference, so you put it on once a year, every year you upgrade to the new versions and suddenly your glue doesn't work again. So it always becomes a maintenance nightmare to get the glue working. So if you ever came to the Plumbers Conference, the web front end that you first see, if you go to meet.lpc.events, is a piece of code that Jon Corbet wrote in Python and Django called the Linux Plumbers Front End, LPCFE. Oh, and I should add that these things here are clickable URLs. So if you go to the presentation and you want to know where it is, you can just hover over it and it will tell you. But obviously Greenlight is actually the way you log people onto BBB. So by replacing Greenlight on BigBlueButton, we effectively took on the job ourselves of doing integration, authentication, user management, and pretty much everything else. This is actually a major replacement of services on BigBlueButton.
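To give a feel for the kind of glue a custom front end has to provide once Greenlight is out of the picture, here is a minimal sketch of building a signed BigBlueButton join link. It follows the checksum scheme documented for the BigBlueButton API (a SHA-1 hash over the call name, the query string, and the server's shared secret). The server URL, secret, and meeting ID are placeholders, the `role` parameter assumes a recent BBB release (older releases expected a password instead), and none of this is taken from the actual LPCFE code.

```python
import hashlib
from urllib.parse import urlencode


def bbb_join_url(base_url: str, shared_secret: str, meeting_id: str,
                 full_name: str, role: str = "VIEWER") -> str:
    """Build a signed join URL for a BigBlueButton meeting."""
    query = urlencode({"fullName": full_name, "meetingID": meeting_id, "role": role})
    # BBB signs each API call: checksum = SHA-1(call name + query string + shared secret).
    checksum = hashlib.sha1(f"join{query}{shared_secret}".encode()).hexdigest()
    return f"{base_url}/api/join?{query}&checksum={checksum}"


# Placeholder values for illustration only.
print(bbb_join_url("https://bbb.example.org/bigbluebutton", "not-a-real-secret",
                   "room-1", "Jane Doe"))
```

A timetable page can then render one such link per session, which is roughly the "click here to go to this session" button described in the talk.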
And the reason we did this is that we needed an integration with our conference timetable. We just wanted the conference timetable to display on the first screen with a little button that said, click here to go to this session. It's very like what you have at FOSDEM with the Matrix chat. If you just click on the integration in the Matrix chat, it brings up the live stream and you can see what's going on in the conference. We wanted it to work pretty much as it does here. And Greenlight just wouldn't do that for us, because Greenlight was designed to be a teaching front end for a single classroom. It wasn't really thinking about multiple tracks, multiple track rooms, and timetable integration. So by also getting rid of all of the attendee credential handling, we had to integrate with some sort of authentication server that could be distributed. So being kernel developers, we chose LDAP, because it's the oldest one and it's the one everybody has the most trouble with. So we should be able to use it, right? The credentials were the email address that you used to register with the conference, which is easy. And the password was just the confirmation number, because fortunately the Linux Foundation's Cvent system spits out these long confirmation numbers and they're all guaranteed to be unique. So that's obviously really easy. The problem, again, was integration. The Cvent system will not speak to anything outside Cvent. So if you register for the conference, you don't automatically show up in the LDAP database. What has to happen is I have to go to the Cvent website and export a spreadsheet of all of the users, figure out which are new ones, and then feed the details into the LDAP server, which, I mean, as an offline activity, sounds really simple.
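The manual step described here (export a spreadsheet, work out who is new, feed them to the directory) is essentially a diff between the export and what is already known. A minimal sketch, assuming a CSV export with hypothetical `email` and `confirmation` columns; the real Cvent export layout is not given in the talk, and the actual write to the LDAP server is left out on purpose.

```python
import csv
import io


def new_registrants(export_csv: str, known_emails: set) -> list:
    """Return credential records for registrants not yet in the directory."""
    fresh = []
    for row in csv.DictReader(io.StringIO(export_csv)):
        email = row["email"].strip().lower()  # login name: the registration email
        if email not in known_emails:
            # Password: the unique confirmation number, as described in the talk.
            fresh.append({"uid": email, "password": row["confirmation"].strip()})
    return fresh


export = "email,confirmation\nalice@example.org,ABC123\nbob@example.org,XYZ789\n"
print(new_registrants(export, {"alice@example.org"}))
# prints [{'uid': 'bob@example.org', 'password': 'XYZ789'}]
```

Each record returned would then have to be added to the directory by whatever tooling the LDAP server provides, which is the part that somebody has to remember to run during the conference.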
It becomes really complicated when you keep getting people who don't register until the last minute of the conference, turn up halfway through a session, want to register, and then want to immediately be able to see what the hell's going on. And they have to wait, I don't know how long, for somebody to actually put their credentials in the database. And when you're running a conference, pretty much all of your organizers are running around full-time. Nobody has time to tend the back end of the database. And that became a bottleneck for certain people. So my request of you is, if you ever get into this situation, register early, please; don't turn up at the last minute. It's much easier. Then the next problem is that nobody is actually really used to interacting online. I mean, you think it's easy, you've seen Zoom calls, you will just turn up and you've got, you know, pictures of each other on it. If you try and do that with 800 people, Zoom doesn't really do autofocus very well. You can't see who's talking. And if you're just observing the conference, it just becomes a mass of faces and it's completely useless. So we obviously had standard issues like good webcam, good microphone. There are probably a number of people among you sitting in this room who think that the standard speakers and microphone on your Dell system, when you just open the laptop, will do when you're chatting to each other. Believe me, they won't. The things that audiences hate the most in online conferences are static, background noise, indistinctness, being unable to get the point across. If the attendees who are going to interact don't have good AV equipment, this is the thing which makes the conference experience go down the fastest.
This is about the only thing we got right, because we gave speaker gifts, designed to be delivered anywhere in the world a month before the conference: a complete gift pack of a headset, lights so you could actually see the face, and a webcam you could put on the top of your monitor, a 1080p webcam, so we were sure that we'd actually have a good view of the speakers. We made sure that they got to all of our nearly 100 speakers in the conference, and most of them didn't use them, which was a bit of a problem. We also found, as I've explained with Zoom, that raised hands don't work, and having a load of people on video when you have no idea what's going on doesn't work either. The thing we pioneered for Plumbers is called video muting, which is where we instruct the attendees: if you're just watching, don't turn your camera on. Pretty much it's a blank screen. The only people who turn their camera on are people who are talking or people who are waiting to talk. It actually gives a good visual clue that somebody wants to interject at this point. We found it was a reasonably good way of facilitating interaction in an online conference, and it certainly solves the Zoom confusion where you just can't figure out who's talking to whom on your screen, and for the online conference we did have about 900 attendees. You could imagine that on a Zoom screen everybody would probably get about five pixels. In 2020, BigBlueButton came with an internal chat system, which means anybody logged into BigBlueButton can pop out a chat tab at the side and they'll be able to chat with anybody who's logged into the conference. This is why, if you were watching the stream of the conference, you couldn't interact, because that is a completely enclosed chat system. We realized that people wanted to have conversations outside the discussion rooms as well, things like hallway tracks. We adopted a sort of open source project called Rocket.Chat.
The core of the project is fully open sourced, but Rocket.Chat themselves run a more open core model where there are certain enterprise upsells and everything, but this was just the one we came across. The good thing for us is it doesn't require any integration, because we'd already started to realize the more integrations you do, the more difficulty you have, the more string and baling wire stuff there is to go wrong at the crunch time when you put the conference on. So no integration looked like a good thing until we tried to run the conference. And then the problems were that chat was hard to read on the live stream because we're streaming it out pretty much through YouTube. Guy Lunardi wrote a streaming module for BBB and we just plugged it in as an additional attendee, and what it saw is what we streamed to YouTube. And the problem is that YouTube does dynamic compression of most of the stuff while it's streaming. And if you have a low bandwidth connection, it will quite happily greek all of the chat that's going on and you just can't see what is going on in the chat. If it's having a bandwidth problem with faces and you're just looking at lips moving, you can stand losing quite a few pixels, but if you're trying to follow chat, it's just impossible. And obviously streamers didn't have any credentials to the Rocket.Chat anyway, because we based it on our LDAP: if you're not registered to the conference, you have no chat credentials. So they couldn't join the hallway track either, which they didn't like. This was the second problem. And the third problem was we had to buy an enterprise license from Rocket.Chat actually to get some of the notifications that we used for integration to work. And the two chat systems were completely and fully disconnected. So you couldn't use your Rocket.Chat login to look at the BBB chat and vice versa, which also was a bit balkanizing and a bit annoying for people. 
And so you can see that the thing I initially portrayed as a huge advantage, because we didn't have to worry about the integration, was the thing that everybody found the biggest disadvantage about the conference, because it wasn't integrated. So you always get this pressure, and you know this as conference organizers: the more integration you do, the more chance there is that something will go wrong. But if you don't do enough integration, your audience doesn't like it. So you have to get the amount of integration right. And for several conferences, we actually struggled to do this. Every registered attendee got a Matrix ID, and since we used email and confirmation number to log in, we figured we'd just use the email address for Matrix as well. But there's a slight problem here. Emails contain an at sign. Matrix does not like additional at signs in your ID. So we had to replace the at sign with a full stop, which I thought would be easy for our attendees. They're all computer geeks. They'll get this. They'll know: email address for one server, email address but with the at sign substituted with a dot for another server. Complete disaster. You know, they just couldn't figure out which ID to use for which. Eventually we got the BBB server to accept either form of the address. And we actually got the Matrix server to translate using a JavaScript widget. If they typed in the email address, it would just fill in the dot and everything would just work. We tried it without doing that for a couple of days and got a bucketful of complaints. I should also add that when we chose Rocket.Chat in 2020, I wasn't very happy about the decision because I was a fledgling Matrix user, and I was sure that Matrix could handle the conference, and I thought it would actually be a good idea to do this. So I advocated even in 2020 that we should be using Matrix. And finally, after all the difficulty with Rocket.Chat, in 2021 I got my wish. 
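The translation described above, typing an email address and having the at sign turned into a full stop, can be sketched in a few lines. This is only an illustration of the idea, not the widget that was actually deployed: the function names, the lowercasing step, and the server name `chat.example.org` are all assumptions made for the example.

```python
def email_to_matrix_localpart(email: str) -> str:
    """Turn an email address into a Matrix-style localpart by
    replacing the single at sign with a full stop."""
    local, sep, domain = email.partition("@")
    if not sep or not local or not domain or "@" in domain:
        raise ValueError(f"not a simple email address: {email!r}")
    # Lowercasing is an assumption of this sketch (Matrix localparts
    # are lowercase); the talk only mentions the at-sign substitution.
    return f"{local}.{domain}".lower()


def normalize_login(typed: str) -> str:
    """Accept either form of the ID, as the talk says the servers
    eventually did, and return the dotted form."""
    typed = typed.strip()
    if "@" in typed:
        return email_to_matrix_localpart(typed)
    return typed.lower()


def matrix_id(email: str, server: str) -> str:
    """Build a full Matrix ID; `server` is a hypothetical homeserver name."""
    return f"@{email_to_matrix_localpart(email)}:{server}"


if __name__ == "__main__":
    print(normalize_login("alice@example.org"))
    print(matrix_id("alice@example.org", "chat.example.org"))
```

One thing a mapping like this gives up is uniqueness: `a.b@c.org` and `a@b.c.org` both become `a.b.c.org`, so a real deployment would want to check for collisions at registration time.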
So they told me, we're using Matrix and you're in charge, which was fun until the first day came along and the Matrix server with 900 people on it just fell over. And we had actually done scaling testing. I mean, I'm not naive. I know things can go wrong at scale that you don't see in your own lab. So we'd invited everybody to come along for a town hall, but only 100 people showed up. And it turned out that most of the scaling problems we had were somewhere between 100 and 900. So we hadn't hit any of them in our little test. And I'd also tried to scale up the Matrix server. So our massive 16 CPU system that we were paying about $100 a month for, when you ran top on it, was only using one CPU, because I hadn't realized that Synapse, which was the Matrix server we chose, was single-threaded. I mean, who in 2021 writes a single-threaded web application? The Matrix people do thread Matrix, by putting a sequence of proxies on the web server in front of the back end, but that is not the default configuration. And so on the Monday of the conference, I was left there in my own little basement sweating away going, how the hell do I scale this thing? But fortunately, several people who attended FOSDEM, which did use Matrix before we did, also ran into this problem. And they were able to send me email saying, yeah, we've got this guy you can talk to. He'll teach you how to do it. So fortunately, within about four hours of discovering the problem and realizing it didn't scale, we had a roughly scaling Matrix server working. And I managed to actually complete the scaling by the end of the first day. So for the second day, Matrix was actually fully functional, fortunately, since it was my little baby. And I actually did a little blog post about how we got this to work, just so if anybody else ever runs into the problems, there are online resources to go to to get yourselves out of it. 
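The scaling approach mentioned above, spreading a single-threaded server across CPUs by putting a proxy in front of several processes, comes down to routing each request path to a pool of workers. The sketch below only illustrates that routing idea. The path patterns and pool names are assumptions invented for the example, not a copy of Synapse's documented worker configuration, which should be taken from the upstream documentation.

```python
import re

# Hypothetical routing table: (path pattern, worker pool).
# Order matters: the first matching pattern wins, so the narrow
# /sync rule is listed before the general client rule.
ROUTES = [
    (re.compile(r"^/_matrix/client/[^/]+/sync$"), "sync_workers"),
    (re.compile(r"^/_matrix/federation/"), "federation_workers"),
    (re.compile(r"^/_matrix/client/"), "client_workers"),
]

# Anything not matched falls through to the single main process.
DEFAULT_POOL = "main_process"


def pick_pool(path: str) -> str:
    """Return the name of the worker pool that should handle `path`."""
    for pattern, pool in ROUTES:
        if pattern.search(path):
            return pool
    return DEFAULT_POOL


if __name__ == "__main__":
    for path in (
        "/_matrix/client/v3/sync",
        "/_matrix/client/v3/rooms/abc/send",
        "/_matrix/federation/v1/send/123",
        "/health",
    ):
        print(f"{path} -> {pick_pool(path)}")
```

With the default configuration every request lands on the equivalent of `main_process`, which is why `top` showed one busy CPU out of sixteen.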
One of the good things about using Matrix was that even the free tier attendees, the streaming tier attendees, could now actually interact by chat. So anybody with their own Matrix ID could log on to any of our Matrix rooms because they're all public, in the same way that all the FOSDEM rooms are public. Free attendees still got their own separate Matrix ID, but you only got logged on to that Matrix ID if you actually popped up the side chat panel. You could also use your own separate Matrix client and your own Matrix ID as well if you wanted to. And it just depended how you wanted to interact with the conference. Guy Lunardi was the one who actually modified the BBB server so that, instead of opening its own chat tab, we just stole the iframe concept it uses for the blackboard and put Riot Embedded in it, which is done by someone else; the URL to the GitHub account is there. The integration, Riot Embedded, is not as good as running the Electron type application, but trying to embed an Electron application into a web form is just a nightmare and doesn't work. So it's a very cut down version; all it really does is text, pictures and reactions. It doesn't do any of the complicated threading or replies or anything, but that's pretty much good enough for conference attendees. But you can see the primrose path we're treading down, because this is yet another integration between BBB and Matrix that is not done by either upstream. But it was really useful, so the streamers could see the chat. If it was greeked on their streaming screen, they could actually use a Matrix client and still view the chat in real time without having to worry about video compression and reduction. And so we thought, after we'd done this successfully for two years now, specifically one year with Rocket.Chat and one year with Matrix, it wouldn't be too difficult to move over to hybrid, which is probably really the point of this talk. 
We already had all the remote infrastructure. By the way, to run this style of remote infrastructure for a conference the size of Plumbers, so we're talking about 900 people, and we assumed that for the in-person one we'd probably be about 400 in-person, 500 online, it will cost you somewhere between $3,000 and $5,000 to actually set up and run. That's not cheap to do the hosting of all of this stuff, especially because we had to host at scale. We needed one BBB server per track room, we had about six track rooms, we had a few spare servers left over for hack rooms and we had one for the hallway track. But the way we looked at this, we already had all of the AV necessary. This is the naive thought, which is where kernel developers go: we've got AV in the room, it's got two cameras and it's got a load of microphones, the room is just another single person attending the BBB chat, it'll be easy to do it like that. The naive way of thinking. But it really, really is not that simple trying to get a hybrid conference running. The significant problem is that you have to integrate with in-room AV, and in-room AV, well, okay, so at ULB, you control it, but we were doing it in hotels. Hotels have their own local AV teams, and if you touch a piece of their equipment, they tend to go screaming to management. In order to get the integration done, you can't do it yourself as the conference organizers. The AV people, the audio people, have to do it, and it's surprising the number of audio people who sit behind those switcher boards at the back of conferences who've never even seen an online hybrid conference and have no idea how to set this up. So just plonking this sort of video display down in front of them and saying, hey, now you're in charge of audio and video and we just want the conference to work, wasn't actually a winning strategy. 
So we discovered rapidly, and unfortunately it was pretty much on the day of the conference, that we had to train the AV people in how to use all of this stuff. And we were actually quite fortunate because we'd ordered the video-based AV, so we actually had two sets of AV people. The hotel usually supplies audio people to plug your projectors and your microphones into the hotel's audio. And then we had a separate AV company whose job was to manage the cameras and sort of cut the whole thing together and do the live streaming. So we actually had two AV teams effectively, and we were assuming that we could just co-opt the AV team who was doing the live streaming, because running live streaming is not much different than running an online conference, right? Wrong. So we held meetings beforehand with them and we talked to them about it. We couldn't hold meetings with the hotel audio guys but we assumed it would just be a plug-in for them, which it wasn't. And nobody really anticipated the issues that we actually had. The AV setup of remote attendees is not obvious to somebody who's used to dealing with throwing microphones around a room, because there is one extra wire that's coming off the internet over which a lot of attendee voices can come. And the first time they set up their AV patch panels, this wire just wasn't plugged in. The second time it was bypassing all of the audio routing, which meant that the sound quality was poor and the volume was clipping at high strength because it wasn't being attenuated through the usual filters. I mean, this is the 101, you know, how many screw-ups can you have? We basically did them all on the first day. So let's go through a list of the problems. Oh, I even forgot to mention that between 2021 and 2022, we decided to upgrade the whole of the infrastructure. We had specific things in BBB we wanted. 
As I told you, we have two cameras, one at the back facing the speaker, one at the front facing the audience, and the new version of BBB could actually handle multiple cameras with different views. So we figured, wouldn't it be really great if, you know, you want to see the audience so you just select the audience camera as a remote participant because you want to talk to them, or let's say you're a remote speaker, you want to see the audience, but the rest of the people just want to see you, so we'll do this multiple camera selection. It was going to be the cornerstone of our implementation of this. Problem was, nobody told the AV guys. So the AV guys had come set up with a switcher for the two cameras, which is how they ran the streaming, because obviously you can select one camera view or the other, because the live streaming pretty much has one camera view out on it. And by the time we all sat together in the room, there was no way of unplugging this setup and redoing it such that we could plug in two video cameras, because it all has to be done from the AV control deck and they physically didn't have another input because they were expecting to use a camera switch. So this is yet another screw up in planning. Third screw up was that because we'd upgraded to the new BBB, and this uses Node React as the framework backbone, they'd redone the way it was working. Riot Embedded no longer worked as a pop out panel. We discovered this fortunately four days before the conference. So this was Guy Lunardi doing a very quick back end retrofit, trying to get the Matrix chat that everybody was expecting to work to work. And fortunately by the time we did the conference, it was actually working, but this was yet another squeaky bum issue where we tried to do everything at the last minute. The first day we had a lot of rooms where the room couldn't hear the remote attendees. 
This is the wire problem from the remote, or the remote attendees couldn't hear the room, because it's a two wire problem. It's not just sound from the remote coming into the room, it's sound from the room going out to the remotes as well. So we had to have the sound correctly plugged into BBB. The streaming guys thought they were just plugging the sound into the streaming. So if you watched the live stream, it had the correct sound, but if you were on the BBB thing, it was completely silent because they hadn't plugged the sound in. It's just yet another screw up that we didn't actually think to test out. So this is how it looks in the sort of conference hall as we were going around it. So this is the panel you can see. This is what BBB looks like pretty much in a classroom situation. This is the presentation, which you can actually minimize if you click this button, and obviously we thought that the AV people might be able to click the minimize button when the conversation is going on and then click it back again. No, it didn't really work. They didn't really want to do this. So we just kept everything up. This is the attendees panel, which is all scrolling, which is nice, but if you think about it, that's not really what you want to see in the conference. The chat panel, if you click this, will pop out and come here, and the rest of this will really shrink, which is also not really what the attendees wanted. So what we should have done is have the chat panel down here and get rid of the attendee list or make it pop out only. So we put this on our to-do list for next time because it's sort of a nice to have, and it wasn't essential to getting the conference working. This is actually an excerpt from the RISC-V microconference. So Atish Patra is actually asking a remote question. 
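The two wire problem described above can be written down as a checklist of audio paths that must all exist before a session starts: room to live stream, room to BBB, and BBB back into the room. The sketch below models the patch panel as a directed graph and reports which required paths are missing. The node names and both example patch layouts are assumptions made for illustration, not the actual equipment layout used at the conference.

```python
from collections import deque

# Paths that must exist for a hybrid room to work, per the talk:
# the stream hears the room, the remotes hear the room, and the
# room hears the remotes.
REQUIRED_PATHS = [
    ("room_mics", "live_stream"),
    ("room_mics", "bbb_session"),
    ("bbb_session", "room_speakers"),
]


def reachable(graph: dict, start: str) -> set:
    """Breadth-first search: every node audio from `start` can reach."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def missing_paths(graph: dict) -> list:
    """Return the required (source, destination) pairs that are not wired."""
    return [(src, dst) for src, dst in REQUIRED_PATHS
            if dst not in reachable(graph, src)]


# Hypothetical first-day patching: the room feeds the stream only.
first_day = {
    "room_mics": ["mixer"],
    "mixer": ["room_speakers", "live_stream"],
}

# Hypothetical corrected patching: the mixer also feeds BBB, and BBB
# comes back through its own fader rather than bypassing the desk.
fixed = {
    "room_mics": ["mixer"],
    "mixer": ["room_speakers", "live_stream", "bbb_session"],
    "bbb_session": ["remote_fader"],
    "remote_fader": ["room_speakers", "live_stream"],
}

if __name__ == "__main__":
    print("first day missing:", missing_paths(first_day))
    print("fixed missing:", missing_paths(fixed))
```

A check like this catches an unplugged wire but not a badly routed one; the second-day fault in the talk, audio bypassing the attenuation, would still need someone listening in the room.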
You can see that he's asking it of a member of the audience, so we've switched to the audience camera view, because ideally there would have been two cameras here, one of the audience, one of the speaker, and he could have switched between them, but we screwed up and didn't get that working in time. The tab also includes shared notes, which go to an Etherpad back end, so the remote audience can also see the shared notes, and they can also click on the pop-up, and these buttons here actually allow you to mute people who are not supposed to be speaking when they are, because in a remote conference you get lots of people who forget to mute their microphone and then have a conversation with their wife or their significant other. This happens quite often. The way BBB is set up is that anybody can mute anybody, but we thought that might be a bad idea, because it's cantankerous kernel developers. You never quite know who they'll mute. Once you've muted somebody, you can't unmute them. They can only unmute themselves, because you can't be unmuting people while they're going to the bathroom or something. It wouldn't be good for the conference. We figured this might be a problem, so we restricted this functionality to administrators only, and then nobody could figure out who the administrators were, so it became a bit of a problem. The video overlays work quite well. This is pretty much a blow-up of a 1080p thing, and you can see the greeking happening in the room. It's very difficult to make out individual attendees, but the remote participants were reasonably visible to most people. With BBB especially, you can actually see the way the video-muting idea works. Really all you see are the people participating. 
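The muting rules described above are small enough to state as code: muting someone else can be open to everyone or restricted to administrators, while unmuting is always reserved for the person who was muted. The class below is a toy model of those rules only; it is not BBB's implementation or API, and the names `Room`, `admins_only` and the rule that anyone may always mute themselves are assumptions of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Toy model of the mute rules described in the talk."""
    admins: set
    admins_only: bool = True          # the restriction the organizers chose
    muted: set = field(default_factory=set)

    def mute(self, actor: str, target: str) -> bool:
        """Mute `target`; returns False if `actor` is not allowed to."""
        muting_someone_else = actor != target
        if self.admins_only and muting_someone_else and actor not in self.admins:
            return False
        self.muted.add(target)
        return True

    def unmute(self, actor: str, target: str) -> bool:
        """Only the muted person may unmute themselves, admins included."""
        if actor != target:
            return False
        self.muted.discard(target)
        return True


if __name__ == "__main__":
    room = Room(admins={"shepherd"})
    print(room.mute("attendee", "noisy"))    # not an admin, so refused
    print(room.mute("shepherd", "noisy"))    # admin may mute
    print(room.unmute("shepherd", "noisy"))  # nobody can unmute others
    print(room.unmute("noisy", "noisy"))     # self-unmute works
```

The model also shows where the practical problem came from: with `admins_only` set, the room depends on everyone knowing who is in `admins`.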
One of the other things that we'd actually tried to do in the room is the camera at the front facing the back actually had a pan and zoom facility, which the AV operators assured us would allow them to zoom in on whoever was speaking, but in the event, what we discovered is that only two of the AV operators out of our six rooms knew how to use this. We did get panning and zooming in some of the rooms, but not in all of them. Again, this goes back to preparation and training. We will do better next time. It actually took us quite a while to get the audio routing right. We think that plugging two wires into an AV system wouldn't be that difficult, but in fact, it really was. We finally figured out that what you had to say to the audio guy is, here's a wire from what's effectively a microphone on the computer. The video guy will give it to you. You just plug it into your fader panel and remember you have it. You give him a wire for the AV stream that also has to go into the BBB thing. Obviously we got that to work, but obviously we had other problems. One of the other things that we'd really thought about was, when we get into the room, we'd actually like to have the chat and all the remote attendees on one monitor, and then we'd like to have the presentations on another. We thought we'll have a two projector setup, which sounds fine until you realize that we have rooms about as big as this, and the two projectors were right over there and right over there. Somebody sitting over there can't see what's on that projector. Somebody sitting over there can't see what's on that projector. The only way we could make it work was to display the same image on both projectors, which meant that two projectors was a complete waste of money, and we should have gone for a single central one as well. Next year we will either try with one or three, just to see if we can actually get this right. 
Obviously, as I said, the two-camera handling we thought would be the centerpiece of the technology. The reason why we'd upgraded BBB and caused all of the React JavaScript failures on the Matrix back end was to get this facility that we couldn't actually use. That was not only sleepless nights for one person for a week to try and get it all to work, but a complete screw-up on the day when everything we thought we'd bought with that upgrade didn't function. The other problem we had was, because the AV people are actually doing the live streaming and because of the way BBB works, there is no choice other than to actually run the master BBB session that is showing up on both projectors on their AV mux console. I had thought this might be a problem as well, because BBB is very heavily React JavaScript based and certain browsers handle it better than other browsers. It works with Firefox quite well now, but it is mostly optimized for Chromium. So I had actually already asked them, God, if we have to do this, what is the built-in browser of this AV mux thing? They said, oh, it's Chrome, and I thought, oh, good, it's going to work. But the problem was that the mux platform might have used Chromium as an internal thing, but Chromium is open source. So the guys who built the mux platform had just basically taken WebKit, gutted some of the JavaScript and hoped that that would be OK. What it really meant was the threading functions of JavaScript didn't work very well. The result was that it kept dropping out of the conference, and obviously, since this is running the AV conference, when it drops out, the in-room chat and the remote participants fully separate. The problem is there's no indication that this has happened from the console unless you happen to be watching very carefully, because all you see is that a microphone disappears from that little icon in the top left that says, hey, you're the room, and suddenly your microphone's gone. 
That's the only indication the AV people had that they'd dropped out. Trying to train them to notice this and reconnect was a bit of a problem. So we do have automated reconnections on the back end that we will try and integrate into BBB, but when we raised it on the list, they looked at us and said, well, how come you have this problem, because nobody else has this problem? If BBB kept on dropping people out of the conversation as a matter of course, people would have complained. This was a specific interaction due to the cut-down browser in the AV mux interface that nobody had anticipated. Again, another screw up from not actually trying it beforehand. So as I said, rejoin was manual, assuming the AV guy noticed. One of the good things we did have was in-room shepherds who were committee members. So they could actually watch the on-screen panel. Every time the icon disappeared, I or one of the other shepherds could run to the back of the room and pat the AV guy on the shoulder and say, you need to rejoin, please do this. So we could make it work with a lot of manual fussing behind the scenes, but it wasn't easy. So one of the other things that we realized, and hindsight is 20/20, is this. They might want to run it from the AV console, but they don't have to run the whole thing. What we could have done is have a separate laptop that actually ran all of the AV connections under the charge of, I won't say somebody competent because it would have been one of me, the kernel developers or somebody, but somebody who would pay more attention. And then what we could have done for the AV mux guys was to actually give them a VNC session or something like that into the panels that they were projecting over the streaming and everything else. And that way we would have had much more control in the room and we'd have been running a full-fledged browser that wouldn't have had most of the connection problems that we had. 
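The manual procedure described above, watch for the room's microphone icon to vanish and then tell someone to rejoin, is the kind of thing a small watchdog could take over. The sketch below shows one possible shape for it: poll a connection check, attempt a rejoin when the room client has dropped, and back off if rejoining keeps failing. The `is_connected`, `rejoin` and `notify` callables are hypothetical hooks supplied by the caller; nothing here claims to be a real BBB interface.

```python
import time


def keep_room_connected(is_connected, rejoin, notify, *, checks,
                        interval=5.0, max_backoff=60.0, sleep=time.sleep):
    """Poll the room client `checks` times and rejoin when it drops.

    is_connected() -> bool   hypothetical check that the room is in the session
    rejoin() -> bool         hypothetical rejoin attempt, True on success
    notify(message)          tells a human (the shepherd) what happened
    Returns the number of successful rejoins.
    """
    delay = interval
    rejoins = 0
    for _ in range(checks):
        if is_connected():
            delay = interval                  # healthy: normal polling rate
        else:
            notify("room client dropped out of the session; rejoining")
            if rejoin():
                rejoins += 1
                delay = interval
            else:
                # Rejoin failed: wait longer before the next attempt.
                delay = min(delay * 2, max_backoff)
        sleep(delay)
    return rejoins


if __name__ == "__main__":
    # Simulated session: connected, dropped, connected, dropped.
    states = iter([True, False, True, False])
    count = keep_room_connected(lambda: next(states), lambda: True, print,
                                checks=4, sleep=lambda seconds: None)
    print("successful rejoins:", count)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting, and the `notify` hook keeps a human in the loop, which matches the shepherd role described in the talk.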
But obviously trying to sort that out on the day was pretty much impossible. Particularly unfortunate is that this AV mux platform is sort of Windows-based, and Windows does not do VNC very well, and trying to get it to work with RDP, which is the Windows native thing, didn't work very well. And we were in Dublin, we didn't have any equipment, we had to organize the parties and deal with online conferences, and we just couldn't get it to work. So we do know that for the future we are going to try and do a shareable console. And I need to leave time for questions. I have 10 minutes left. We had a load of other problems as well. As you can see, in order to try and tell you all the difficulty about doing this, I'm talking up the problems. From the audience's point of view, this conference was actually very successful. We got plaudits from most of the attendees that the interaction worked as they expected. They could have their one-on-one conversations, including between remote and local people, and remote and remote people. So everything appeared to just work for them, modulo, you know, the problems of not being able to hear the room and having to rejoin and having to tell people they lost AV and they needed to rejoin. They forgave us for all of that. So it was actually highly rated as an incredibly successful conference, but it was a bit like a swan, with that, you know, serene sort of thing swimming over the surface and a load of committee members frantically paddling underneath to try and get all of this to stay up and running. Our other big screw up was that we had a committee of 10 people, but because this was the pandemic, not all of the committee could turn up in Dublin, and by the time we got to Dublin, we only had two local people who actually understood the system and could deal with it. Surprisingly enough, the two people who are actually standing up on the platform now taking the blame for all of this. 
But believe me, in a conference with six tracks and AV guys who don't know what they're doing, if you want a lot of exercise, please do it with two people. But if you want to actually make it work seamlessly and vaguely competently, you need one person who understands the setup in each room. So next time we will try and do it with at least as many people as we have track rooms. Again, it's not rocket science. It's perhaps something we should have seen ahead of time, but we thought it would be easy and it wasn't. So to leave a bit of time for questions, if you have them: running conferences is actually hard. So just remember that when you go home from here. And volunteering for a conference is actually an act of public self-sacrifice that more of you should consider, because honestly, you will find out that it is really hard. Running online conferences is harder, especially if you try and do it yourselves, because there is actually a lot more to do. It is actually more difficult to run an online conference than it is to run an in-person conference, paradoxically. And this is mainly to do with the integration issues. Everything that doesn't integrate well is either your fault or you're responsible for integrating it. And integrating web stuff, even in this magic day and age of, you know, web scripting and operators and everything else, really doesn't work that well. And hybrid is really only for masochists. Only do a hybrid conference if you want to give yourself an untold amount of pain that you can then come and complain to an audience like you about later. But if you really want to get it right, the key is training your AV people. Do not wait like we did to turn up on the day and say it will be easy. You need a couple of training sessions a few months beforehand to try and get all of the kinks sorted out. And having meetings and discussing it doesn't cut it. 
You need the AV people to set it all up and preferably have a few people fly out to see the setup, and work with a few remote people so you actually check it out in all of its situations. And obviously, you need to do this in advance. You don't do it on the day of the conference, which is what we tried to do. Oh, well, lesson learned. So with that, I think I have about five minutes for questions. This was actually all done in open source using HTML and CSS. That does make me a web developer, proud of it now, finally. Obviously you can see that kernel developers don't quite do web as well as web developers. So if any web developers would like to join our program committee and become jointly responsible for next year's disaster, please come forward. The presentation is all online, and I've also put up a link to it in Pentabarf. So you should just be able to click on it. All of the links in the presentation that are highlighted are clickable if you want to see the URLs. And with that, I'll just say thank you. And Mike and I can now answer questions. Oops. You have the mic. Okay, we'll share this mic. Any questions? Please speak loudly into the microphone. Yes, thank you. From what you said about the AV setups in the hotels, was there any discussion to just ask the hotels to, you know, move their stuff and have your own stuff brought in? Because to me, that seems after your talk to have been an easier solution, to just say, you know, your AV guys and your AV is probably good, but we have really specialized needs. Could we just bring our own equipment in? Okay. So the question is, could we bring our own equipment in? And the answer is, for pretty much every hotel, no. Because remember they do the AV routing as a very specialized sort of adaptation for the conference rooms. They have a lot of really expensive speakers, sometimes in the ceilings, sometimes in the back of the room, and they don't want any non-experts touching it. 
So what hotels usually have as a clause in their contract is a requirement that you use their own sound guys. And this clause is non-negotiable. We did try to negotiate with a couple of hotels, because we had a few screw-ups even in the live streaming days where the hotel didn't do very well and we thought we could do better. But it's always come back to us as an absolute no. It's actually very lucky that the ULB guys are used to having students handle their equipment and don't seem to have as many hang-ups about it as hotels. So there are people here who can be trained on this equipment and who can use it. But pretty much when you're running a conference in a hotel or in a professional conference hall, which we've also been to, so a community convention center or something, you actually have to use their AV staff. You don't get a choice. And that means that you always have this extra component of people that you have to deal with. And they're usually people who don't really understand fully what your requirements are, because it's very difficult to communicate ahead of time. And these people are usually the people that you don't get to talk to until you turn up at the conference. So when you have your intermediate AV guys like we had and you have your meeting with them, you always have to make sure that they have enough knowledge and insight that they can do the knowledge transfer to the sound guys who are going to be coming in on the day from the hotel. So yeah, it's a huge problem and there's almost nothing you can do to get out of it besides having a very friendly university who will let you play with their equipment, because pretty much nobody in the professional circuit will let you play with their equipment. I have one question from the chat. Was it worth it to record the audience? So the question was, was it worth it to record the audience? 
And the answer to that is actually very much yes, because remember in a microconference we're stimulating discussions between arbitrary groups of people. So these would be people who are sitting in the audience physically versus people who are remote. And in order to have a correct two-way conversation, you actually have to be able to see the person on the other end. And so the whole point of the audience camera, at least when it worked with the panning and zooming, was that if you were having a remote to local conversation with the audience, the remote people could actually see the local participant. And if we only had a camera facing the speaker, we'd either have to call every participant up to stand on the stage, which wastes time, or we'd just have to not have them be able to see the person they're talking to. And being able to see the person you're talking to is very important for interactions. There are lots of visual cues that people otherwise get wrong. So I think having an audience camera is actually one of the good things that Plumbers did and one of the good insights we had to try and get an interactive conference running. So I wouldn't go back and do no audience camera, no. Just wanted to get back to what you said about the ULB being willing to allow us to use their audio equipment. I just want to point out that this has a lot to do with the fact that we've been doing this for 20 years and they know us. It wasn't always the case, and if you're not FOSDEM, or you're not working with the ULB, this probably isn't the case. So could you repeat the, I got half the question, no. Yeah, it wasn't a question really so much as to point out that us being allowed to touch the audio equipment here by the ULB has a lot to do with the fact that we've been doing this for 20 years and they know us. 
Okay, so the point being made is that ULB only allows the staff here to touch the equipment because they've been doing it a long time and they've built up trust. The way we work, we tend to go to a different hotel every time, because we hold Plumbers in a different location each time. So pretty much even if we could build up trust with a hotel chain, we wouldn't be in the hotel often enough to do it. But the important point to emphasize is that pretty much 95% of the time you will be forced to work with AV people who are supplied to you by whoever owns the equipment. You won't get a choice of who does it. Okay. Is that it? Okay. Well, thank you very much indeed. |
Matrix 2.0
How we’re making Matrix go voom |
Okay, hello everybody. Can you hear me? Yes, I see thumbs up at the back. Please come in, come in, roll up, roll up for the Matrix show, or introducing Matrix 2.0, or how we are going to make, or how we have made, Matrix go voom, very technical term. Please take your seats, ladies and gentlemen. Right, so hi everybody. I'm Matthew. I'm the project lead and co-founder of Matrix and I'm here to tell you all about the work we've been doing to fix Matrix's performance problems and a few other things as well. So I'm guessing that a bunch of people know what Matrix is, given that FOSDEM has been running on Matrix during the pandemic and we're doing a hybrid Matrix version of FOSDEM right now as we speak. So hello to everybody following along on the HLS stream via Matrix. You probably know that Matrix is an open network for secure decentralized real-time communication. We have to say this because people watching the video later on might not know. You can use it for interoperable chat, so following along on FOSDEM, bridged through to IRC or XMPP, etc. You can use it for interoperable VoIP, but Matrix at its core is a real-time communication fabric for any kind of real-time comms. So you could use it for communication within VR or AR. You could use it to synchronize world data within VR and AR. You could use it for IoT. It is basically meant to be the missing communication layer of the open web. So no single party in Matrix owns your conversations. The second you talk to somebody on a different server, your conversations get replicated equally between the different servers, so there cannot be a gatekeeper. There cannot be some big tech company going and holding your conversations hostage. Instead, the conversations are shared between all the participants.
Topologically, the network is a mesh of servers like fosdem.org, say, on the right, which might be talking to mozilla.org, which might be talking to matrix.org, or gnome.org, kde.org, or whatever it is, and you have native Matrix clients like Element, or FluffyChat, or Nheko, or Quaternion, or hundreds of others these days, which connect through to the Matrix server. Then you have application services, which glue additional things onto the Matrix server, like bridges, or bots. You have identity servers, which we don't talk about because they're a mess, and we need to get rid of them. And then you've got application services that bridge through to things like Slack, or Teams, or Telegram, or IRC, or XMPP, et cetera. And that is a schematic view of the public Matrix network. Now, the Matrix ecosystem, as it sits today, looks something like this. You've got the Matrix spec still being the one true commonality across the whole ecosystem, spec.matrix.org, a whole bunch of Markdown, and Swagger, or OpenAPI, I should say, that defines how the servers on the bottom talk to the clients on the top. So you've got Synapse, as our first-gen Python server, which has been proving to be a bit more than first-generation, as we've invested a lot of time making it fast and generally performant. So Synapse is not going away anytime soon. If anything, it is corroding into Rust as the Python is augmented cybernetically with a bunch of Rust. Then we have Dendrite in Go, which is also doing very well and is ending up focusing more on embedded use cases. So you use Synapse for massive servers and use Dendrite for ones you embed into an app. Then we have application services and bridges for things like IRC bridging, et cetera. And then many other servers and bots and bridges from the wider community. The green stuff is stuff that we publish as the Matrix.org Foundation, all Apache licensed for people to build on top of. We have, then, the clients.
On the far left, we have our original SDKs: matrix-js-sdk, with matrix-react-sdk on top of it, and then the iOS SDK and Android SDK too. These are relatively venerable SDKs now. The JS SDK is eight years old, which is, you know, enough to probably get a degree or something. The iOS SDK is about the same age. The Android SDK is a little bit younger, but this is what the current generation of Element uses today. Then we have Hydrogen, which is a relative newcomer. This is a progressive web app SDK, super small. It's about 70, 80 kilobytes in total for the whole thing, including all end-to-end encryption. And it's designed for embedded web Matrix instances. So we have Hydrogen itself, hydrogen.element.io, as a very lightweight PWA for playing around with Matrix. We have Chatterbox, which as an Intercom-style embed puts a Matrix chatbox into your website. We have Third Room, which is our crazy science fiction spatial collaboration platform on top of Matrix, where you want to have the Matrix bit as lightweight as possible, which is why we use Hydrogen, the lightest element, to make it happen. And then the thing we'll be talking about a lot today is Element X. So this is a total rewrite of Element mobile on iOS and Android. And for maximum cliche, we have rewritten it in Rust using the Matrix Rust SDK. I hear we have Rust fans in the audience. So we'll be talking a lot about that. Meanwhile, on the community side, there are just more and more clients all over the place, like Thunderbird, which released its native Matrix implementation. We've got Watch The Matrix, which is an excellent Apple Watch Matrix client, which doesn't tether to your phone. It's actually running on the watch itself. NeoChat, in KDE, with Qt and C++. I wish I could name them all. I can't. I don't have time. So, health of the network. Here is the total number of users we've ever seen on Matrix, which looked as if it was going to hit 90 million, and then somebody turned off the server.
Kids, this is what happens if you turn off the server, because it means the graph goes down. So please never, ever turn off your Matrix servers ever again. This is literally the phone-home stats that Synapse reports. So it doesn't include Dendrite. It doesn't include Conduit or other servers. It also doesn't include all of the paranoid people, which is quite a lot of people running Matrix, who obviously aren't going to phone home their usage stats to us. It does include guest users and bridged users. So it's a little bit of an overestimate. But the important thing is the shape of the graph, which, as you can see, is continuing to go well, apart from that guy who turned off his server. In terms of the spec, we have fallen into this cadence of quarterly releases of Matrix since the big 1.0 back in 2020, and particularly in 2022, we've managed to crank out a point release every quarter. I think the day before last FOSDEM, we had spaces, restricted joins, and actually the matrix: URI scheme, which is now being implemented all over the place. Then a few months later, we added aggregations and restricted knocking. Now, these features have often been around for years, but this is actually formalizing them into the proper long-term supported spec. 1.4 added threads, a massive feature that has been an epic to get done, but it landed. Edits, private read receipts: a long-standing complaint about Matrix was that you couldn't turn off read receipts. Turns out it's surprisingly hard to do, but we now have it specced and implemented. Then in 1.5 in November, we formalized references, we fixed up some things in the AS API, and I think we have basically a maintenance release in 1.6 any day now. So nothing too exciting in the next release, mainly because we're building up to the big 2.0.
A nice stat is that there were 120 MSCs in 2022, of which 39 came from the community, 27 of those from new contributors, 30 from the gray beards of the spec core team, and then 51 from the folks who are paid to work on matrix.org, mainly by Element. So it's a reasonable mix of community and core project work. In terms of uptake, other than that, we obviously helped the world's best open source conference dodge COVID. Apologies if Matrix was painful over the last couple of years, but it's probably better than no FOSDEM at all. Lots of government uptake. New news is Germany: we have BundesMessenger rolling out Matrix across the whole German government in November, and also gematik, the German healthcare agency, proposing it as a neutral standard for secure messaging in healthcare. Lots and lots of associated deployments in healthcare, education, utilities, manufacturing. Basically, if you are an organization who cannot put their data unencrypted into some proprietary silo, like Teams or Slack, I think Matrix hopefully provides a good alternative. Moodle is busy integrating Matrix into the learning management system. Automattic have got a project called Chatrix, which embeds Matrix into WordPress, so you can literally dump your little chat console based on Hydrogen into your WordPress blog. Reddit is rumored to be building chat capability powered by Matrix, mainly because I think they had public sign-up enabled and somebody logged in and discovered a Matrix server there. And there's a lot of interest in Matrix potentially being the communication layer for the Digital Markets Act. So, last year, we put out this slide to basically say the plan for 2022. In the early days of Matrix, we were just trying to make it work. Then we tried to make it work right and managed to exit beta and launch the 1.0. The last year particularly has been about trying to make it work fast, and I hope we have now made it work fast. I will attempt to prove this to you with a demo.
We haven't seen anything yet. So, there is one of my cats helping me here as a kid. Now, along the bottom here, you might see three, four icons in fact: Element X, Element R, Element X itself, so that's Element X nightly, and then normal Element. We're going to talk about Element X here. So, I've got the nightly here, and I'm going on the ship. This is what you see as a little splash screen. I'm going to hit continue on that, and hopefully I've got enough connectivity to connect to the server. That's a good start. If it takes that long to discover that there's a server out there, then this demo might not go so smoothly. I'm going to log in as my actual main real Matrix account. I'm not going to type in my password in front of you, but I'm going to pull it out of my password manager. Hit continue. If the server is there; there's too many people in the room. It hasn't even started talking to the server yet, okay? And that's it. I'm in. So, my account... So, if I was going to try to log in on my normal Element account, it would take 20 minutes, because I'm in 4,000 rooms that date back to literally day one, or actually day minus two weeks or something, of Matrix, and I can go and scrub through all of these gazillion rooms there, and they are all actually there. I can go and find somewhere, I don't know, trying not to expose anything too sensitive, but you can see it's actually already pulled in room previews on all of these things. Going to, I don't know, This Week in Matrix, and there is the chat. There is just no spinner here other than the slow network at the beginning, which really was the slow network. And you can see we've got reactions in here. We've got some nice bubbles. We've got replies. We've got joins and parts and day markers, read markers. We've got Markdown. This is the SwiftUI incarnation of Element X, but all of the heavy lifting here is done by Rust, and it is transformative.
The whole point here is to be faster than Telegram, and I think that we might have got it, although if anybody has a Telegram account with 4,000 rooms, please tell me how long it takes to log into it afresh and how long it takes to launch. So talking about how long it takes to launch, if I go and quit the app like that and relaunch it, we're in. That was it. And I'm going to risk doing one other thing, which is to launch my non-nightly, which I haven't actually used for a couple of hours. And again, it's synced almost instantly. And this is on a custom build, which is hooked up to Jaeger. And if I go over to Jaeger, and if I have enough internet connection to even load the Jaeger UI, this is going to be really fun for demoing later if this is how bad the connectivity is. And I search for the well-known Element X Matthew app in the last, actually, let's do the last 15 minutes or five minutes even. We've actually hooked up the Rust SDK so that all of the logging is structured and all of the logging gets uploaded via OTLP to Jaeger. And if I had internet access, I should start tethering, I would be able to show you a blow-by-blow account of what happened when I launched the app a minute ago. So what we're going to see when it finally loads, hopefully, is first of all, it has to pull up a local cache of my messages from disk if you're already logged in. For this, we currently use sled, which is a key-value database native to Rust. It hasn't been going amazingly well for us, and let's hope that's the right one. And as you can see here, at the top, we've got the build operation. And the 410 milliseconds there, frankly, should be more like 40, because all it's doing is loading up 20 rooms out of sled. We're going to move to SQLite because, if nothing else, sled spends its entire life rebuilding itself and defragmenting itself when you launch it, which is a bit unfortunate when you're trying to launch an app quickly.
Then it restores your session and gets a whole bunch of events out of it, which is the first couple of events on the page. And if you scroll past those, the really interesting one here is doing the sync. So this on the server is 90 milliseconds to calculate your sync response. It's ended up being 900 over the wire because of all you people with your electromagnetic interference and your mobile phones. But still, you saw that it was very usable. It's like a second to get to the point that you're viewing stuff. And in fact, we already are interactive before the sync response returns, thanks to the local store having been resumed. So we have gone deep down the rabbit hole, as the saying goes, to try to optimize the performance of Element X, so it is as snappy as iMessage or WhatsApp or Telegram, rather than the slightly clunky beast that we've had historically. So before, it looked like this: you've got Synapse on the right, with all sorts of fun workers to do the various bits and bobs. And then we had Element iOS with the iOS SDK, mainly written in Objective-C, and MatrixKit in the UI layer with more Swift in it. You had MXCrypto, again written, I think, in Objective-C, and libolm as the encryption library in C++ and C sitting below it. And then the database layer was horrific, with a mix of flat files, Realm, Core Data, carrier pigeon. Element iOS has some issues. In our brave new sliding sync world, everything has changed. On the left-hand side, we now have on iOS SwiftUI for the funky app I just showed you. On Android, we have Jetpack Compose. Then we have UniFFI bindings to the Rust SDK, which has been a lot of fun. Our Rust team has been hacking away, contributing async bindings for Swift to UniFFI, so that we could expose the Rust SDK, complete with nice futures and async, through to Swift and Kotlin. And then you've got the Rust SDK itself, which is doing all the heavy lifting. It's got the crypto crate within it.
And then within that is vodozemac, which is our native Rust encryption implementation for Matrix. Below that, you've got sled and, in future, SQLite. This then talks through to a sliding sync proxy. And this is written in Go, and it implements MSC3575, which is sliding sync. And this is the magic for how this works so quickly. It's basically going and talking normal sync to normal Synapse. So this could be Synapse or Dendrite or Conduit or anything on the right-hand side. The Golang thing is an intermediary that is going and sucking up the state of your account, storing it in a local Postgres and then talking the very, very responsive API in order to pull that data into Element X itself. It does it by looking like this. Sliding sync lets the clients request the bare minimum of data that they need to render the UI. So here's an almost real request where we say, I want to see the currently visible rooms. I want to see the first page to preload it. So I want 20 rooms, 0 to 20. I want it sorted by recency and then name. I only want the avatar and whether it's encrypted. I'm going to get the calculated name, whatever. I don't want any messages, because we don't want to waste time actually downloading scrollback. And we're going to filter it to not have invites, not have old rooms, and not have those pesky space rooms. And whilst we're at it, we want to have end-to-end encryption. We want to have to-device messages and we want account data. And the server, or the sliding sync proxy, will literally just return about 10k of data, which is those 20 rooms with the bare essentials that you requested. The key design criterion for sliding sync is that the performance is constant with the number of rooms. And this was the horrible mistake with the old API, and frankly the whole design of Matrix historically, that as you join more rooms, it gets linearly slower for basically everything.
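As a rough illustration of the request just described, here is a minimal sketch in Python of what such a body could look like. The field names follow one reading of the MSC3575 draft and may not match the current proposal; the list name `visible_rooms` is made up for the example, and the range is written as 0 to 19 on the assumption that ranges are inclusive.

```python
import json


def build_room_list_request(first: int = 0, last: int = 19) -> dict:
    """Build a sliding-sync-style request body for one window of the room list.

    Field names are an assumption based on the MSC3575 draft, not a verified API.
    """
    return {
        "lists": {
            "visible_rooms": {  # hypothetical list name
                "ranges": [[first, last]],          # 20 rooms, assuming inclusive ranges
                "sort": ["by_recency", "by_name"],  # recency first, then name
                "required_state": [
                    ["m.room.avatar", ""],          # only the avatar...
                    ["m.room.encryption", ""],      # ...and whether the room is encrypted
                ],
                "timeline_limit": 0,                # no messages, so no scrollback download
                "filters": {
                    "is_invite": False,             # no invites
                    "is_tombstoned": False,         # no old (upgraded) rooms
                    "not_room_types": ["m.space"],  # no space rooms
                },
            }
        },
        "extensions": {
            "e2ee": {"enabled": True},
            "to_device": {"enabled": True},
            "account_data": {"enabled": True},
        },
    }


request = build_room_list_request()
print(json.dumps(request, indent=2))
```

The point of the shape is that nothing in the body depends on how many rooms the account has joined: the client names a window and the fields it needs, and the response is sized accordingly.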
And that was fine for the first few years when people were in a couple of hundred rooms, but obviously we don't want to predicate the success of the protocol on, yeah, it's fine as long as you're not a power user, or it's fine as long as you don't actually use it. And if you think of Matrix being a bit like a file system, imagine how awful, and I'm looking at you, ext2, a file system would be if it just slowed down linearly with the number of files that you put in a directory or some other characteristic. And as more and more rooms pop up in Matrix, it's not just chat rooms, it could be spaces. Now imagine that you're working in the EU government and the EU has got a massive space hierarchy over all of the countries and all of their public sector bodies. Even before you've talked to somebody, you might need to have visibility over this big hierarchy of like a thousand rooms. You do not want your Matrix client to take a thousand times longer to log in or sync. So that's basically the entire idea here, that you can have an infinite number of rooms, a bit like IMAP, where you can have massive mail folders and yet you're only going to care about the subset that your client wants to actually manipulate. Having requested this range of 20 rooms, you then get updates from the server, and this is why it's called sliding sync: you have basically requested a window over these rooms, these 20 rooms or whatever it might be. And then as the state changes on the server, you get updates: insert a room here, delete one here, invalidate it here. It doesn't have to be rooms; in future it could be members or other sorts of characteristics. This has been shamelessly stolen from Discord, so apologies to Discord folks if you're listening, but thank you also for coming up with this nice approach for how to maintain performance and scrolling on apps. And the end result is just many more.
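The insert, delete and invalidate updates mentioned above can be pictured with a small model of the client's copy of the window. This is a simplified sketch, not the exact semantics of MSC3575: the operation names are borrowed from the draft, but the dictionary layout and the room IDs are invented for the example.

```python
from typing import List, Optional


def apply_op(window: List[Optional[str]], op: dict) -> None:
    """Apply one list update to the client's copy of the room window, in place."""
    kind = op["op"]
    if kind == "SYNC":
        # Fill (or overwrite) an inclusive range with the room IDs the server sent.
        start, end = op["range"]
        window[start:end + 1] = op["room_ids"]
    elif kind == "INSERT":
        window.insert(op["index"], op["room_id"])
    elif kind == "DELETE":
        del window[op["index"]]
    elif kind == "INVALIDATE":
        # The client should stop trusting these slots until they are re-synced.
        start, end = op["range"]
        for i in range(start, min(end, len(window) - 1) + 1):
            window[i] = None
    else:
        raise ValueError(f"unknown op: {kind}")


window: List[Optional[str]] = []
updates = [
    {"op": "SYNC", "range": [0, 2], "room_ids": ["!a", "!b", "!c"]},
    # A new message arrives in !c, so it moves to the top of the recency-sorted list.
    {"op": "DELETE", "index": 2},
    {"op": "INSERT", "index": 0, "room_id": "!c"},
    # The server tells the client that positions 1 and 2 are no longer reliable.
    {"op": "INVALIDATE", "range": [1, 2]},
]
for update in updates:
    apply_op(window, update)
print(window)
```

Each update touches only the slots inside the requested window, which is why the cost of keeping the list current does not grow with the total number of rooms on the account.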
It makes login instant, sync instant, and also time to view rooms instant. Element X is only going to talk sliding sync. There is no point in us wasting time implementing both approaches, as they're really quite different, and we want everybody using Element X to have a snappy, snappy, snappy experience. We've done a lot of iterations. I've been driving the poor Rust team and sliding sync team and Element X team mad by constantly demanding that we try to get the launch time down to 100 milliseconds or something, and we've gone through probably 10 iterations to see how we actually drive the API. And it's been really interesting. The end conclusion is, first of all, when you launch the app, you sync that first screen's worth of rooms, but without any timeline. Literally the request I just showed you. Next, you immediately increase the timeline limit on that window to one. So you'll fill in the room previews, and it was happening so fast earlier that we didn't even have time to spot it happening. And then you pre-cache a page's worth of history of the visible rooms. So when I jumped into TWIM and the history was already all loaded there, it's because the background pre-cache had already happened: I'd stopped scrolling the room list, and it immediately jumped in to pre-populate the history for those rooms. Then you also incrementally build the big list of all your rooms in the background, which I guess technically is O(n) with the number of rooms, but because it's happening in the background, it's not on the critical path. It means you can scroll through all your rooms or search for a room by name instantly and be able to find them. And finally, you cache it in sled or SQLite. The Rust SDK is doing all the heavy lifting here. The code base is maturing really well. We got it audited at the vodozemac layer last year, thanks to Least Authority, funded by gematik. And we have, I think, three other audits planned this year.
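The launch sequence described above can be written down as a short plan of requests. This is only a sketch of the ordering: the function name, the step labels, the batch size for the background list and the idea of expressing the history pre-cache as a larger `timeline_limit` are all assumptions made for illustration, not how Element X or the Rust SDK actually drive the API.

```python
from typing import List, Tuple


def launch_plan(total_rooms: int, page_size: int = 20,
                background_batch: int = 100) -> List[Tuple[str, dict]]:
    """Return the ordered (label, list parameters) steps for an app launch."""
    if total_rooms < 1:
        raise ValueError("total_rooms must be at least 1")
    first_page = [0, min(page_size, total_rooms) - 1]
    steps = [
        # 1. First screen's worth of rooms, with no timeline at all.
        ("first screen", {"ranges": [first_page], "timeline_limit": 0}),
        # 2. Raise the timeline limit to one event to fill in room previews.
        ("room previews", {"ranges": [first_page], "timeline_limit": 1}),
        # 3. Pre-cache a page of history for the visible rooms (assumed mechanism).
        ("pre-cache history", {"ranges": [first_page], "timeline_limit": page_size}),
    ]
    # 4. Grow the full room list in the background; O(n), but off the critical path.
    upper = page_size
    while upper < total_rooms:
        upper = min(upper + background_batch, total_rooms)
        steps.append(("background list", {"ranges": [[0, upper - 1]], "timeline_limit": 0}))
    return steps


for label, params in launch_plan(total_rooms=4000, background_batch=1000):
    print(label, params)
```

Only the first three steps sit between launching the app and seeing a usable room list, and they are identical whether the account has 50 rooms or 4,000; the number of background steps is the only part that grows.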
They were meant to happen last year, but we had disruption along the way. Then, high-quality bindings are critical for this. I mentioned that we've added async futures to UniFFI. I think this stack could be the ultimate stack for building cross-platform mobile apps going forwards. I mean, you can use Rust for the heavy lifting, and then you hook up at the top a very thin but very native, performant layer based on whatever the OS gives you. And at the beginning, UniFFI didn't have everything we needed, and so we invested to go and particularly hook up the futures support. And the end result, I think, is quite transformational. So that's the Rust SDK. Meanwhile, whilst Element X is maturing, we need to keep the existing clients secure too, because it's going to take us a while to get to parity between Element and Element X. And the project for this for crypto is called Element R, confusingly. So this replaces the old cryptography implementations in the JS, iOS, and Android SDKs with the same Rust crate that powers the Rust SDK. So it's just the crypto crate that is providing a consistent encryption implementation across all the platforms. So this means that if we have, hypothetically, horrible CVEs popping up, we only have to fix them in one place in the Rust SDK, rather than having to do it four times over between web, iOS, Android and Rust. And you can use this today. It is still beta, so it may kill your cat and flood your house. I've been using it; it occasionally logs me out, which is a bit frustrating because initial sync on v2 takes 20 minutes. But I recommend having a play if you're interested: enable Rust end-to-end encryption in labs on Element iOS. Android will be coming fairly soon and web started working on Friday. We sent and received the first encrypted messages by Rust crypto in a Wasm blob inside Element Web then. Rather anticlimactically, it looks and feels identical to the current crypto, except it's written in Rust.
Then another big thing we've done to speed things up is faster remote room joins. So this is a huge internal change to Synapse. So again, you only receive the subset of state you need to participate in a room. It breaks all the assumptions that Synapse had that rooms are atomic. Instead, you basically trickle in the membership of the room in the background after having got the minimum subset to join the room. So for instance, in Matrix HQ right now, there are 92,948 state events, for every user who has ever joined or changed their name or left and a whole bunch of other things. If you actually look at the subset you need to participate in the room, it's 152. So this speeds up the room join time from 15 minutes to 14 seconds. So finally, we will hopefully have fixed the problem where somebody goes and installs Synapse, immediately tries to join Matrix HQ, sits there for 15 minutes looking at errors as their computer explodes and wonders why everybody thinks Matrix is as amazing as it is. So I mean, the computer will still explode because they're still having to talk to all of the servers in the room, but at least it will be responsive in 14 seconds. And we hope that the scintillating conversation in Matrix HQ will be such that it distracts them from the smoke coming out of their server. There's still a lot of room for improvement here. We shouldn't be hammering dead servers, which is where a lot of that time is going. And also, we should be caching the partial state. So, you know, if I want to join Matrix HQ, the server can just go and hand me the events I need to do that. Another big thing is Matrix RTC. So this is the name we're using to refer to MSC3401 and MSC3898, mainly because those were awful names, whereas Matrix RTC is a bit more snappy. And this is multi-party native VoIP. We've always had one-to-one VoIP. Here, we are standardizing the multi-party Zoom, Teams, Jitsi-style experience, but in a heterogeneous way. So you can use different clients.
This is in labs in Element Web. Video rooms look like this on the right-hand side, powered by Element Call. And it works as what we call a matryoshka widget, where you embed Element Call here as a widget. So this is an iframe on the left-hand side. But even though Element Call itself is a Matrix client, it is piggybacking on the host's Matrix client. So it shares the encryption and the identity. You don't have two logged-in sessions. And really excitingly, we have multiple independent implementations of this in Element Call, Hydrogen, Third Room, and also, I believe, Famedly has an independent implementation in their healthcare product for Germany. And I'll very quickly try to show you a demo of this. And I'm going to show you, where am I going? Oh, cool. That's good. FOSDEM 2023. If anybody wants to heckle along on this, then feel free. Oh, I don't have any internet access, do I? A video conferencing demo when you don't have any Wi-Fi is always a good idea, right? Let's see how it does. So I'm going to pop into that room there. And here is Element Call. And then I'm also going to spin up a local Hydrogen. Here's one I made earlier. Oh, hello, Amandine. Found some meeting here. This is Amandine, everybody. She co-founded Matrix. And I'm going to wish that this thing was telling me that a video call was happening in the room. It's still showing one that's ended, but in practice, there's one happening right now. I wonder if this is... I wonder why? Okay. Perhaps we'll use a different room. Let's go to FOSDEM 2024. Back to the future, everybody. And go over to Hydrogen here. And I think I should be able to just join FOSDEM 2024 on call.ams.host. Yeah, okay, here I can actually join it. And hey presto, you've got me staring at myself like a muppet because nobody else is responding to my comedy joke of moving to 2024. So everybody, oh, hello. Perfect. Somebody at the back. Thank you. Oh, and there's Amandine.
So this thing is really cool because it is completely standardized multi-party VoIP signaling. We have two entirely different code bases. There is not a line of code in common between Hydrogen here on the left and Element Call here on the right. And just like back in the day on the phone network, we could call each other on different things or have different handsets or whatever it might happen to be. Oh, we've got somebody remote. That's awesome. Then here we actually have a proper heterogeneous thing. So unlike Jitsi or some other conferencing system where everybody has to end up using the same system for it to work, this is providing an interoperable thing. And I'm going to drop out of this one because I've got something better. I've got something better. So one of the other projects we have worked on, which we're just releasing now, is called Waterfall. Now what we just did then was full mesh. All the clients were talking to one another. I'm amazed that it worked as well as it did. What you want to do, though, is to have what's called an SFU, a selective forwarding unit. So these are the things which go and mix together the local video calls. And Sean DuBois, who's the project lead of Pion, the Golang WebRTC stack, wrote one of these called sfu-to-sfu based on reading MSC3401. And we renamed it Waterfall. We've fleshed it out and we've hooked it up to Element Call. And I will endeavour to show you what that looks like. And it's going to be quite similar in some ways. Let me actually try a demo, the demo room in here. Again, if someone is already in there, I'm going to try a fresh one. Let's call it fresh demo. Again, if anybody wants to try following along on this URL, if you can see it, please do so. Now this looks a little bit different because it's connecting to the SFU instead. Oh, hello and hello. And hopefully we will get some video off and on. So this has been bounced off the Go SFU. But perhaps I'll distract everybody by zooming. So this has got a completely different layout on it. And it thinks it's connected.
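To put the full mesh versus SFU distinction in numbers, here is a back-of-the-envelope sketch in Python. It only counts outgoing media streams under simple assumptions (every participant sends video to every other participant, and an SFU client uploads a fixed small number of streams); it does not model Waterfall or Element Call specifically, and the function names are made up for the example.

```python
from typing import Tuple


def full_mesh_uploads(participants: int) -> Tuple[int, int]:
    """Return (uploads per client, total uploads) when every client sends to every other."""
    per_client = participants - 1
    return per_client, participants * per_client


def sfu_uploads(participants: int, streams_per_client: int = 1) -> Tuple[int, int]:
    """Return (uploads per client, total uploads) when clients only send to the SFU.

    streams_per_client > 1 models a client uploading several resolutions at once
    (an assumption for illustration).
    """
    return streams_per_client, participants * streams_per_client


for n in (3, 10, 50):
    mesh_each, mesh_total = full_mesh_uploads(n)
    sfu_each, sfu_total = sfu_uploads(n, streams_per_client=2)
    print(f"{n:>3} participants: mesh {mesh_each} uploads each ({mesh_total} total), "
          f"SFU {sfu_each} uploads each ({sfu_total} total)")
```

In a mesh, each client's upload load grows with the size of the call, while with an SFU it stays fixed and the fan-out moves to the server, which is why mesh calls are only practical for small groups.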
Oh, there's Simon. I'm glad that Simon of all people is able to connect because he has written the vast majority of Waterfall. So thanks for demoing, Simon. And I suspect it is not liking the packet loss here on my side. People are trying to connect in there. But it's working for some folks. It's very, very new, very alpha, but it's exciting to actually see this thing working. And if I quickly turn on debug here in developer mode, you'll see that it actually supports simulcast. So here you can see that Simon is 640 by 360 pixels. Whereas this guy here, dandelion, hello dandelion, is 320 by 240. And if I go and zoom embarrassingly, then it should catch up. There we go. 640 by 480. And it's going to renegotiate the higher resolution stream through. And all the people are actually uploading multiple streams, low resolution and high resolution ones. So this is very early, but it's the shape of the future for doing proper massive scalable SFU-based conferences. And that's actually looking good. I'm going to take a quick selfie. There we go. Right. Next demo or next stuff. I'm running out of time. I've got a lot of demos. OpenID Connect. Matrix is, subject to MSC approval, moving over to OpenID Connect. It rocks and gives us 2FA, multi-factor auth, passkeys, QR code login. You don't have to implement the weird Matrix auth APIs on the server or the clients anymore. Element X has native OIDC on iOS already and will be OIDC first. Third Room is the first OIDC-only Matrix client. Go to areweoidcyet.com for the gory details. So sliding sync, faster joins, native VoIP and OIDC together are a big change. Like this is fundamentally changing how federation works, how VoIP works, and critically how servers sync data to clients, and how you log in. Couldn't be a bigger change. So we are taking the liberty of declaring it Matrix 2.0 when we finally release it. So this is not a breaking change.
This is pure enthusiasm basically on my behalf because I think it's worth saying, hey guys, come back to Matrix, give it another go because we fixed all the crap stuff and we're calling it Matrix 2.0. Okay, I'm not doing well on time. We're halfway through in theory. We're going to go now to the future. Flying cars and jet packs and all that good stuff. First of all, the Digital Markets Act. Where we're going, we do not need gatekeepers. If you haven't heard about the DMA, it is a fascinating piece of legislation that mandates that the big tech companies must interoperate together, particularly for their communication services. The whole idea is that it empowers users to pick the services they want to use and trust without sacrificing the ability to talk to other people. And, frankly, it forces the gatekeepers to differentiate based on quality and features, rather than relying on a monopolistic situation where all of their users happen to be trapped in the silo. This is happening right now. The rules came into force in November. The rules start to apply in May and then gatekeepers get designated and it will start getting enforced, with chunky GDPR-style fines, in March 2024 if the gatekeepers have not gone and interoperated things together. Turns out that Matrix already provides a secure interoperable communication protocol. Who knew? The DMA requires the gatekeepers to maintain end-to-end encryption, which is good news. There's been a lot of paranoia that the DMA is a secret attack on end-to-end encryption. It really isn't. Having spoken to the folks responsible, they really want to make sure that if WhatsApp does E2E today, then an interoperable WhatsApp also needs to do E2E. They don't want to be responsible for destroying privacy.
So to re-encrypt, you need to either do it on the client side, so you would install something like a WhatsApp-to-Matrix app if you want to link your WhatsApp account into Matrix, or you could do a multi-head approach, effectively open APIs, and have your app talk to WhatsApp as well as Matrix or whatever. Or everybody ends up talking the same protocol, which means on the server side, the gatekeepers would have to add support for Matrix or XMPP or RCS or whatever it might be alongside their somewhat legacy proprietary protocol. The IETF has established a working group called MIMI, More Instant Messaging Interoperability, targeting the everybody-talks-the-same-protocol approach. We've proposed Matrix as an IETF draft for content and transport layers, and we're trying to work with them to make the most of the mix. Decentralized MLS, this is another big thing. Where we're going, we don't need salamanders, because if you haven't noticed, all of the encryption protocols historically have been called Axolotl or Olm or Proteus, all of which are species of salamander. MLS is another IETF working group, in fact the one from which MIMI has emerged, for next-generation end-to-end encryption. We have figured out how to make MLS work in a decentralized model. We have implemented it in a toy TypeScript stack called MLS-TS. It is currently being re-implemented on top of OpenMLS in Rust. At which point, when it works, we will integrate it into the Rust SDK and build it into real clients and write an MSC. Go to arewemlsyet.com for all the gory details. Here is some early performance testing, which is pretty interesting. Let me get the key right for you here. If you look at Megolm creating new sessions, this is obviously a log-log scale for all of you mathematicians. You can see this red dashed line here, showing how Megolm sessions scale with the number of users, and it's up at 100 seconds if you've got 100,000 users in the room, which is pretty slow.
However, if you look at an MLS update or adding new members, then they're down at 1,000 milliseconds, so a couple of seconds. This is a major algorithmic improvement for certain operations over even vodozemac. So vodozemac, which feels relatively new, may get displaced by OpenMLS and DMLS when we get there, hopefully later this year. Then peer-to-peer Matrix: where we're going, we don't need servers. Matrix is way too dependent on servers, and the admins, and the risks of internet shutdowns and censorship. This is because homeservers have to store their users' chat history and metadata. Peer-to-peer Matrix exists to fix this. This is a long-running blue-sky project, so to speak, where we go and embed your homeserver inside the app in order to not have a server running in the cloud. Dendrite is the server we use. Big news on Dendrite is, as of a few weeks ago, it passes 100% server-server API compliance, so it has parity with good old Synapse, and 93% client-server API, and the 7% is boring stuff we don't really care about for this. Pinecone, the peer-to-peer overlay network, is going really well as well. We just switched to soft-state routing for reliability, and as of about 4 a.m. this morning (not me), initial store-and-forward relaying is here. I will very rapidly try to demo that. That's still my phone there. If I go and launch peer-to-peer Matrix, Dendrite is not running. Now it is running. This has literally got a Dendrite running inside my phone here, and if I go over to Vysor here, I should hopefully also be able to run it on my Android thing. Here it is, and did it already crash? So it is a bit crashy. It is still beta. I've got FOSDEM demo here, FOSDEM demo here, and, okay, when I say hello there, then you can see, yay, messages flowing back and forth peer-to-peer. Now if I take my iPhone and put it into airplane mode like so, and send some messages through from Android, they still go.
So this is peer-to-peer Matrix running over Bluetooth Low Energy. It silently failed over from running over IP into Bluetooth mode. Now the exciting thing is that if I also run... Well, this is going to be an interesting demo. So here is Element peer-to-peer running in iOS on macOS, because you can do that on an M1 Mac, and apparently there are five connected peers, which is three more than I was expecting, so hello everybody out there who's about to screw up my demo, and you can see the same history coming on here, and I can send a message through, and you can see both on iOS and on Android it came through too. Now the really interesting thing is if I go and... Now you are going to screw up my demo. If I go and kill the peer-to-peer app on Android, and I send a message here saying hello relaying, and actually not in that room, I'm going to do it in my DM through to Android. I'm going to say testing relays, or very badly typed, and critically my... Where has it gone? Has it crashed? No, there it is. So I'm not actually in that room. I'm only in one room on my Mac here. However, this hopefully has gone and relayed through to my Android phone. So if I now go and kill the app here on iOS, and relaunch it here on Android, if I knew how to use Android, come on. This is going to work, or is it going to search Google? There it is. Perfect. Hopefully the first thing this thing will do is to... I'm in the wrong room. If Dendrite launches, the red bar is telling me that it can't talk to its own Dendrite. You can do it. Yes! Testing relaying. This is huge because historically peer-to-peer Matrix has had a massive problem that the other guy has to be online running the app at the same time. And, a long story, but I ended up on an aircraft carrier a couple of months ago going and trying to explain to a bunch of people how they could use Matrix in that environment. And there were a bunch of us on board this thing.
And it turns out that an aircraft carrier is a massive floating Faraday cage. And we went and fired up peer-to-peer Matrix on it. And we were very smug that we could talk to one another, but you had to physically wave at the other guy to get them to launch the app so they could receive the message. Whereas with a relay, you can obviously talk to them even if the app isn't running. So that's big. Right. We're almost there. Matrix is not just for chat and VoIP. This is the final thing. Third Room is a decentralized comms layer for, well, Matrix is a comms layer for any kind of real-time data. Third Room is this tiny project done by Robert, Nate and AJ to provide an open decentralized platform for spatial collaboration of any kind built on Matrix. It uses Hydrogen, Three.js, bitECS, and Rapier for a new engine called Manifold. And the idea is you take a Matrix room. You upload some glTF 3D assets to it. You upload some Wasm or JavaScript scripts to it. It uses MatrixRTC to spin up spatial VoIP and physics synchronization over WebRTC. And you get web-first, open, decentralized virtual worlds and spatial apps without any of the cryptocurrency or token or Facebook stuff, apart from possibly the hardware. And it looks like this. So you can literally go to it right now, thirdroom.io, and I'm going to go to a world called Third Room presentation here. And it's going to hopefully pull this up out of my IndexedDB, because I don't want to wait to download 50 megabytes over the network. And if the demo gods are smiling, then you find yourself in this rather snazzy demo world. Now, this is running in the browser. So no plugins or anything. And it's going at 60 frames a second. It does this with proper multi-threading using SharedArrayBuffers and Atomics to synchronize together the WebGL thread and also the game thread and the main thread in the UI. You can see I'm indeed wandering around there wearing a placeholder avatar. I can flip into third-person view here.
And you can see I'm also wearing the same beautiful thing. We haven't got customizable avatars yet. Some of the things I can show you here is that you can go and click on buttons. And this is actually a script showing the layout of the different threads. I don't have time to show you that right now, but the game thread has got, like, Rapier physics in WebAssembly. But a really fun thing is that you can just do freeform scripting of any kind. So one example could be this silly, silly, silly demo, which hopefully will load up rapidly. Oh, that's what happens if you... this is a bug where you have your worlds overlaying on one another. I'm going to keep it like this, but this is pretty cool. So we've got the Construct room from The Matrix superimposed on top of Mars down here. And if I go over to the TV and I click on it, predictably enough, the entire world goes Matrix. So the script for that is literally just sitting there as a bunch of JavaScript uploaded into the media repository. And it's compiled down to Wasm in real time by the engine by QuickJS, thanks to Fabrice Bellard. And it's like four lines of code. You use WebSG, which is our new API called WebSceneGraph that we're going to propose to the W3C as a 3D API for manipulating glTF scene graphs. You make it interactable, and then every tick, every frame, you see if it's being pressed and then you toggle the state and you enable the Matrix material on the room. It slightly cheats by hard-coding it in the engine for now like this. Something that we haven't hard-coded though is this guy, which is really exciting. I'm going to refresh this time. And here you can see a big scary black blob with Perlin noise, which is pulsating according to my voice as I bellow at it. And this thing is actually a huge chunk of C, which is compiled down to Wasm and is going and programmatically changing the glTF scene in real time. So this is like the first proper, more advanced capabilities on top of Third Room.
The whole idea is you can build any old app on top of this. You could build Figma on this. You could do multiplayer Blender. In fact, we have an in-world editor in here where I can go and select this guy at the bottom and it will have a big white bagel around it. And we don't yet have the property editor, but you should be able to go in and directly manipulate it, change the opacity, the transformations, et cetera, and all that sort of thing. And it really ends up feeling a lot like the web. Rather than a DOM, you've got glTF; rather than the DOM API, you've got the WebSG API; rather than JavaScript, you've got sandboxed Wasm blobs, with Rust and Zig and C and JavaScript within them. And there is one final thing I'm going to try to show you, which is probably going to go horribly wrong, which is that we've just added WebXR into Third Room. So if I go and put on my Facebook device, and there we go. And I back away a bit. Probably unplug it. You'll see, hopefully, in fact, I need to go full screen on that. I guess I do. There we go. Is that coming through? You can see that here I am wandering around Third Room. Probably going to fall off the stage and break my neck. And I can go and, like, spawn objects. So you can have a big crate. I can throw the crate away. I can spawn some other big crate. Let's run away from that one. Go and pick this guy up and throw it away. Go and pick that one up. It's over here. And throw it away, et cetera. And I mean, this is running at 90 frames a second. 90 frames a second. So that's a bit weird. It is at least as good as the native Meta Horizon stuff that Facebook ships, except it's running within WebXR in a browser in a completely open environment. So we're kind of hoping this provides a really viable platform to build a genuine open spatial collaboration plane for the web. I've already spoken about that. Coming up next is persistence.
So we don't yet persist the changes into the Matrix room, but we will by uploading little bits of glTF files so you can even have bots which go into that. Another thing I should have shown you, but forgot, was this guy somewhere. I've lost it already. Never mind. Well, I was going to show you it's an echo. This guy here. But if in this room I go in, and I think this is going to be a comedy, yeah, that's what happens if London and Mars get mixed together for everybody. If I go and say hi, because, look, it's a Matrix room, then if I do slash echo something, I get an echo back. That echo has come from a widget in Wasm running inside the world. So you can program Matrix now from within Wasm blobs sitting within the room. Nothing to do with Third Room. You could use this in clients, et cetera, to start doing client-side widgets. So what's next? Loads of stuff. One big PSA is that Gitter is going native Matrix roughly next week. We basically can't afford to run both the Gitter infrastructure and a bunch of Matrix infrastructure. So Gitter will become a branded Element instance. The API will go away. Please use Matrix instead. And finally, we need help. Friends don't let friends use proprietary chat services. Please use Matrix. And critically, and this is new and it's really important, if you're benefiting commercially from Matrix, please financially support the foundation because it's stuck in this horrible feedback loop at the moment where the better we make Matrix, the less inclined it seems that people are to pay for support or pay for things if they can just grab it on GitHub. This can end up being a disaster where we run out of cash. So please, please, please contribute back, particularly if you're a government. You've got loads of money. Also, run a server, run bridges and bots, build on Matrix, follow us on Mastodon, and thank you very much. Thanks for listening. Sorry for running over time, as always. |
Graphics: A Frame's Journey |
Hi, how are you doing? Welcome to FOSDEM. Congratulations on managing to get inside a room. This is the largest one I've ever seen. Usually it's just looking at the doors of ones that are full. So yeah, my name's Daniel Stone. I'm here to just give a relatively high-level overview of the graphics stack. My hope with this, like I said, it's fairly high level, is to give you a decent understanding of all the different components that go into the modern graphics stack and how they fit together. So if you're trying to work with it anyway, you won't be trying to debug it because it's already perfect, but just being able to give you a good understanding of how everything does fit together. And now we have graphics output working, so that's a good start for this talk because that wasn't looking likely five minutes ago. Right, so the graphics stack looks like this. Any questions? That's the simplified version as well. More sensibly, if we try to build it up incrementally, we just try and work through all of the different pieces and different components in essentially the order of near to far, which is, you know, in networking you think of upstream and downstream, and in graphics a lot of what we think of is what's close to your eye and what's far from your eye. So in our case, the display is closest to your eyes, and this one's incredibly bright. In between, just underneath the display, controlling the display and determining what should be shown, we have the window system layer, so that's your Wayland. It can be X11, but we don't talk about that. And then at the very back end, at the upstream side, you've got the clients, which are actually presenting the thing that you want to show. But then it turns out that your window system also uses the GPU to render, so it's not just OpenGL games that use accelerated graphics. It's the window system too, so the nice diagram already gets a bit muddied because we're breaking the layers.
And then maybe the window system uses some media output because you want to stream stuff onto it or, you know, to stream a conference talk, hello. And maybe one of your clients is also a window system because it turns out that even Chrome is a Wayland server these days, so our lovely little "we have three main components in our graphics stack" illusion has already disappeared. But, you know, let's pretend that everything is fine and let's just try to build it up. So first, DRM and KMS are the acronyms you mostly see. DRM, the Direct Rendering Manager, is anything to do with graphics or display inside the kernel. It's a weird legacy name. And those are all of the GPU and display drivers. And KMS is very specifically the part of DRM that actually controls the display. So when you're talking about HDMI output or something like that, then it's going to be KMS. And KMS is that very last step in the pipeline, the one that's closest to your eye. Its job is to turn pixels into light. Some people will tell you that there's a thing called fbdev as well, but that's not right. fbdev doesn't exist. And, yeah, in the division of responsibility, as we go one step further back from your eye, the window system's job is fundamentally to take a bunch of images from clients, combine them into a single image or multiple images if you have multiple displays, get them out to the eye and bring input events back. So, you know, Wayland is a protocol and nothing else. There's a very small C layer in Wayland, which is really just IPC. And apart from that, it's just protocols and conventions. So, you know, Mutter, which GNOME uses, is a Wayland server. Other popular ones would be KWin, Weston, wlroots. That's where all the implementation actually lies. And, yeah, like I say, they just combine window images together and get them out to the output device, and in the reverse direction they're bringing input back. X11 doesn't exist either. So, we'll move on. Yeah.
So, OpenGL and Vulkan, and where they fit in. They're APIs, as we know, for accelerated 3D, so you provide them a mesh and some textures and some shaders. Run this thing, make it fast. Great. But they only handle rendering. So, GL and Vulkan themselves have no concept of "I want to be able to display to Wayland". That comes in with EGL and what we call the Vulkan WSI, for window system integration layer. Their job is to bridge the two worlds. So, with OpenGL, you have EGL on the side, and that's the bridge between GL and, say, Wayland. With Vulkan, you have core Vulkan, and then the WSI on the side is that bridge bringing all the content across to the window system. And then there's GBM as well, which is maybe the most ill-fitting part of what we have. GBM is kind of a side channel to bridge EGL to KMS. So, right now, I mean, this is all happening through GNOME Shell and Mutter. It's using GL to render my image, with the next slide as a bonus preview and this one that you can see. Mutter, yeah, it uses GL to render and it uses EGL plus GBM to be able to pull images out to kernel mode setting. And GBM is a really, really strange and idiosyncratic bridge. Some people will tell you that GBM stands for the generic buffer manager. That's definitely not true. Yeah, we had an idea that GBM would be the thing that let people kind of peek under the hood of what EGL does as an implementation and be able to generically allocate buffers. We got as far as making it work for kernel mode setting and then realized how terrible the whole problem space was. So, we just pretended that it was never an acronym, that it's not generic, and moved on with our lives. So, at the end of all that, before we get into something more meaty, we've got clients rendering the content, maybe with the GPU, maybe just on the CPU, maybe it's just doing memcpy. It will pass a handle to that content over to the Wayland compositor with some metadata, some context.
The compositor is going to pull it all together, choose how it's going to display it, apply any kind of policy or what have you. And then it's going to just push that final image out to KMS, which is going to turn it into electrons. So, we've got the diagram that's back to making sense. So, if we're looking at how KMS is actually put together, every single discrete device in your system is its own DRM device. I just have an Intel laptop here. I have one DRM device, which is the entire Intel GPU and display complex. If you're on ARM systems, usually you're going to have two devices. The display and GPU are separate IP blocks from separate vendors who aren't really on speaking terms. So, you'll have one DRM device for your display controller and another DRM device for your GPU and they're completely separate. So, yeah, for KMS devices, we've got connectors representing real displays. So, we've got an embedded DisplayPort connector here and various DisplayPort and HDMI connectors for my external outputs. CRTCs, that does stand for CRT controller because that's how long ago it was when we designed all this. CRTCs are the thing immediately upstream from connectors. They generate a pixel stream for the displays. So, any kind of scaling, cropping, compositing is done in the CRTC space. And CRTCs are just a combination of planes. So, planes, they take framebuffers. They can scale. They can be positioned within the CRTC. They can be stacked. And then the CRTC is the one that combines them. So, in quite a poor diagram, because for a graphics person, I can't actually draw very well, I'm more of a text person, to be honest. Yeah, the framebuffer is just the client content. The plane is the one that's going to do any format conversion or scaling or what have you. Then the CRTC combines them all together and pushes them out to the connector. Then I think the important thing to bear in mind if you're trying to reason about graphics pipelines is that timing flows backwards.
Timing never flows forwards. Because when you've got a physical display, it's going to refresh at a certain point in time. Unless it's VRR; no one asked about VRR. We don't quite know how that works yet. But timing flows backwards because this HDMI output is ticking at 60 hertz. That's happening at a very, very fixed point in time. And so that's the beginning of our reference. When we know that we want to present stuff to HDMI, we know exactly when the next refresh cycle is going to start, the next one after that, so on and so forth. So timing is always flowing backwards. This goes the whole way from the connector back to the CRTC, back to the window system, and then back to the clients. It's always starting from that fixed hardware source. So yeah, you want to use DRM and KMS. Good for you. I'd recommend it. It's just a set of objects, like everything, it turns out, in computer science. It's objects with properties, and that's it. So you open your KMS device, you enumerate a list of objects, your CRTCs, your connectors, your planes, and you look into their properties. So this connector type is DisplayPort, this one's HDMI, whatever. And then any time you want to actually affect something, so display new content, change resolution, whatever, that's all done through what we call atomic modesetting, which is about 10 years old now, and it's a very low-level property-based interface. I wouldn't really recommend trying to drive it yourself, but it is possible. So atomic is just a list of properties. So you've got all of your different objects and their different types. You know how you want to put them together. You know that I want this plane to go to this CRTC, to this connector, and so you take all of those objects, you do a massive property set, and then you do an atomic check before you commit just to see if the configuration is going to be accepted. One of the things about display hardware is that it's weird. It's really, really weird.
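As a rough illustration of timing flowing backwards from the display, here is a minimal Python sketch. It is a toy model, not the DRM API: the function names, the nanosecond timestamps and the "compositor budget" are all assumptions made for the example. Given the time of the last refresh and the refresh period, it works out which refresh a new frame can target and by when the client would need to be done.

```python
def next_refresh_ns(now_ns: int, last_refresh_ns: int, period_ns: int) -> int:
    """Time of the first refresh strictly after now_ns (hypothetical helper)."""
    elapsed = now_ns - last_refresh_ns
    return last_refresh_ns + (elapsed // period_ns + 1) * period_ns


def client_deadline_ns(now_ns: int, last_refresh_ns: int, period_ns: int,
                       compositor_budget_ns: int) -> tuple[int, int]:
    """Work backwards from the display: pick a target refresh, then subtract
    the time the compositor needs (an assumed fixed budget) to get the
    client's deadline. Returns (target_refresh_ns, deadline_ns)."""
    target = next_refresh_ns(now_ns, last_refresh_ns, period_ns)
    deadline = target - compositor_budget_ns
    if deadline <= now_ns:
        # Too late for this refresh, so aim for the one after it.
        target += period_ns
        deadline += period_ns
    return target, deadline


PERIOD_60HZ_NS = 16_666_667  # roughly one refresh at 60 Hz

# 20 ms after a refresh, with an assumed 2 ms compositor budget:
target, deadline = client_deadline_ns(20_000_000, 0, PERIOD_60HZ_NS, 2_000_000)
print(target, deadline)
```

Everything is derived from the fixed refresh tick; nothing upstream gets to decide when the frame is shown.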
There are infinite constraints on what you can actually do with the display hardware. So you might have three or four planes that you can use to composite content without using the GPU, but you can only use a couple of them at a time, or only one of them can have compressed content, or only two of them can be scaled. So because we don't have a good generic way of expressing these constraints and of constraint solving within the kernel, we do the dumbest possible thing. It's brute force. We just try every possible configuration that will get us to where we want to be and see which one's going to stick. Then yeah, once you've gone through all that, you've done your atomic commit, you've got a frame on screen, and it lives there until you change it. Because DRM is a frame-by-frame API. It's not a producer-consumer where you connect a camera to an output and magic things occur and you get a video stream. You know, that's the domain of high-level frameworks; say, PipeWire and GStreamer have that pipeline concept. DRM is quite dumb. It just does what you tell it to, and it doesn't do anything else until you tell it to do something else. So yeah, essentially summing up, you know, we've enumerated all of our devices, we've used DRM to do that, and all of the objects. And again, as with timing, we're working backwards from the starting point of a connector. So we know that HDMI1 is the thing that we want to light up, so you always work backwards from that when you're building up your object tree. And then, you know, you are going to need a way to allocate some memory to display. It's not just a malloc pointer. So we have GEM, the Graphics Execution Manager. It doesn't manage execution of any graphics jobs, it's just a memory allocator. This was about the point where we stopped actually naming acronyms because we'd got almost all of them wrong. So GEM, you see a lot of, because that's the base of our kernel allocator for all graphics and display memory.
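The brute-force approach described above can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions: the plane names, the capability flags and the "at most two planes" rule are made up, and `atomic_check` here only stands in for asking the kernel whether a configuration would be accepted. It is not the libdrm API.

```python
from itertools import permutations

# Hypothetical hardware: three planes with made-up capability flags.
PLANES = {
    "primary": {"can_scale": False},
    "overlay0": {"can_scale": True},
    "overlay1": {"can_scale": False},
}
MAX_ACTIVE_PLANES = 2  # made-up constraint: only two planes usable at once


def atomic_check(assignment: dict) -> bool:
    """Toy stand-in for a test-only commit: accept or reject a configuration."""
    if len(assignment) > MAX_ACTIVE_PLANES:
        return False
    return all(PLANES[plane]["can_scale"] or not surface["needs_scaling"]
               for plane, surface in assignment.items())


def find_configuration(surfaces: list):
    """Try every mapping of surfaces onto planes; return the first accepted."""
    for planes in permutations(PLANES, len(surfaces)):
        candidate = dict(zip(planes, surfaces))
        if atomic_check(candidate):
            return candidate
    return None  # nothing stuck: the caller would composite on the GPU instead


video = {"name": "video", "needs_scaling": True}
ui = {"name": "ui", "needs_scaling": False}
print(find_configuration([video, ui]))
```

The caller never needs to know why a configuration was rejected, only whether it was, which is what makes the dumb search workable when the constraints cannot be expressed generically.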
And BO is something you see a lot of as well. So really, I told you about acronyms. So a GEM BO is just like a malloc pointer: it's untyped, it's a raw bucket of bytes. It can be pixel buffers, it can be shaders, it can be geometry meshes, whatever you want it to be. It doesn't have any properties or metadata, just a length and some content. But you can't allocate them generically because hardware is really that weird. We gave up on that a long time ago. So you're going to need some kind of hardware-specific API to come up with a GEM BO. And you might be quite disappointed about that, which is reasonable. So we came up with dumb buffers as a specific class of GEM BOs designed specifically for CPU rendering when you're displaying through KMS. So if you have something like Plymouth for your early-start splash screen, that's not going to be using the GPU. It's just going to be doing CPU rendering, no device-dependent code. And dumb buffers are the path to that: I just want to get something up on the screen. I don't care if it's amazingly fast or efficient, I just need it to work and work everywhere. So this is actually a generic API inside KMS, dumb buffers. It gives you a GEM BO, you can map it, you can fill it up with some nice pixels. And then you wrap that in a KMS framebuffer, which is what annotates the BO with stuff like format and width and height and stuff that people think might be important. So yeah, like I said, you can use it for splash screens. Please don't try to use it for other stuff. It's not a generic memory allocation API either. It's just the thing that works. So yeah, with all that being said, that's a reasonable end-to-end picture of how to use KMS. You've allocated all the buffers you need or imported them from other clients. You've attached those framebuffers to planes. You've stuck them on a CRTC to get them in a kind of logical space and stacked against each other. You've set your CRTC and connector up for the output path. Commit everything.
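To make the "raw bucket of bytes plus a framebuffer that annotates it" idea concrete, here is a small Python sketch. It is only a model: a `bytearray` stands in for a mapped dumb buffer, the 64-byte row alignment is an assumption (a real driver reports its own pitch), and the function names are hypothetical. The one real-world detail it relies on is that XRGB8888 is a 32-bit little-endian format, so the bytes in memory are blue, green, red, then the unused byte.

```python
import struct


def create_dumb_buffer(width: int, height: int, bpp: int = 32, align: int = 64):
    """Toy stand-in for a dumb-buffer allocation: returns (pitch, storage).
    The alignment is a made-up value for illustration."""
    row_bytes = width * bpp // 8
    pitch = -(-row_bytes // align) * align  # round each row up to the alignment
    return pitch, bytearray(pitch * height)


def fill_xrgb8888(buf: bytearray, pitch: int, width: int, height: int,
                  r: int, g: int, b: int) -> None:
    """CPU-render a solid colour, honouring the pitch rather than the width."""
    pixel = struct.pack("<I", (r << 16) | (g << 8) | b)  # B, G, R, X in memory
    row = pixel * width
    for y in range(height):
        start = y * pitch
        buf[start:start + len(row)] = row


width, height = 10, 2
pitch, bo = create_dumb_buffer(width, height)
fill_xrgb8888(bo, pitch, width, height, r=255, g=128, b=0)

# The buffer itself is untyped; this metadata is what the framebuffer adds.
framebuffer = {"width": width, "height": height,
               "format": "XRGB8888", "pitch": pitch}
print(framebuffer, len(bo))
```

Without the framebuffer metadata there is no way to tell from the bytes alone where one row ends and the next begins.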
Hopefully that works. Then the kernel tells you that it's complete. You know when the next frame is going to be and you just keep on going. You can't click these links if you're sitting in this room, but they are clickable on the PDF. There's a bunch of pretty decent documentation, examples and formats, because I'm not trying to show you the entire thing. Just give you a good idea and some pointers. If you're bored of KMS or you just don't find display that exciting, you might want to move on to the window system world. Here's a super quick run through Wayland. Again, it's the same thing. It's clients giving you images and you're giving clients pointer and keyboard and touchscreen events in return. I think the main thing about Wayland that people take a while to grasp is that it's descriptive rather than prescriptive. What I mean by that is in X11, when you have a pop-up, you tell X as a client, put this window exactly here on the screen. Give me all of the input events until I tell you otherwise, because you're dictating specific outcomes. Wayland is exactly the other direction from that. The client tells the compositor, this is a pop-up. The compositor does the right thing for pop-ups, including capturing input and making it always be on top, but still letting your screensaver work, which is nice. It's just about the client annotating everything it has with a bunch of descriptive information and properties and then relying on the server to actually implement the right semantics. There's a fair bit of trust, but it gives us much, much more flexibility because by the end, after however many years of X11, we were kind of painted into a corner really because clients were just dictating so much. We tried to make sure that there were no parts of Wayland that required the compositor to do a huge amount of work because it's such a critical part of the stack that you can't have it burning loads and loads of time.
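The descriptive-rather-than-prescriptive idea can be sketched as a toy model in Python. None of this is the Wayland protocol or any real compositor: the `Surface` fields, the `ToyCompositor` class and its placement policy are all invented for illustration. The point is only that the client states what a surface is, and the compositor alone decides where it ends up.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    role: str                      # "toplevel" or "popup": a description only
    width: int
    height: int
    parent: "Surface | None" = None
    anchor: tuple = (0, 0)         # popup offset from its parent, just a hint


class ToyCompositor:
    def __init__(self, screen_w: int, screen_h: int):
        self.screen = (screen_w, screen_h)
        self.positions = {}        # id(surface) -> (x, y), chosen here only

    def map(self, surface: Surface) -> tuple:
        if surface.role == "popup":
            px, py = self.positions[id(surface.parent)]
            x, y = px + surface.anchor[0], py + surface.anchor[1]
            # Policy lives in the compositor: keep the popup fully on screen.
            x = max(0, min(x, self.screen[0] - surface.width))
            y = max(0, min(y, self.screen[1] - surface.height))
        else:
            x, y = 100, 100        # arbitrary placement policy for windows
        self.positions[id(surface)] = (x, y)
        return x, y


compositor = ToyCompositor(800, 600)
window = Surface("toplevel", 400, 300)
menu = Surface("popup", 200, 100, parent=window, anchor=(350, 450))
print(compositor.map(window), compositor.map(menu))
```

The client never asked for absolute coordinates, so the compositor was free to nudge the popup back on screen without breaking anything the client depended on.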
Like I said at the start, your compositor could be GNOME, KWin, could be Weston, Sway or something like that. They're all designed for different things and different use cases, like window managers in X11 were. I think Weston is the best one because I work on it. It's basically designed for everything that isn't a desktop: literally planes, trains and automobiles, digital signage, that kind of thing. It's really, really efficient and predictable and reliable, but I do use a desktop so I have GNOME on this one. There are absolutely a pile of them to choose from, but they all use the same protocol so they all look alike to the client. It's just a large collection of essentially all extension interfaces. wl_buffer is much like a framebuffer: a handle to some pixels somewhere, no other information, just width and height. A wl_surface is a window; it can be a pop-up, can be an application window, can be a subsurface. It takes the buffer, it just crops it and optionally it takes input back. xdg_surface is the main one you'd interact with really because that's what adds all the desktop-like things of being able to resize and move windows and all that kind of thing. wl_seat is where the input comes from because we're still bad at naming, it turns out. That one was my fault actually. We did design Wayland fundamentally to be really, really easy to extend, so there is quite a pile of extensions that you need to sort through and deal with. The nice thing is, with it having been designed with KMS in mind, it's pretty similar. You've got your compositor doing the final output at the end and that's composed of a bunch of windows and surfaces which have got buffers attached to them. The compositor is the ultimate source of the timing and it flows that timing back to the clients as feedback. If you take that, it looks exactly the same as the KMS diagram we had earlier, which is not really any coincidence, and using that is exactly the same flow as KMS. This slide was almost copy and paste.
Again, I'm not trying to give you a complete guide to how to write every Wayland client in the world. Please do use a toolkit. They will make your lives much easier, so GTK, Qt, SDL, ImGui, whatever. Use a compositor toolkit as well if you like. libweston in particular and wlroots are toolkits you can use to build compositors on top of good code bases. There are some links in here as well: wayland-info is a good tool to inspect, wlhax is a debugging tool, weston-debug is another debugging tool. There are some sample clients as well. simple-shm and simple-egl are our kind of references for how do I actually start using this and start approaching it. Now we've got all that out of the way. I'm not going to try and explain GL to you because we'd be here forever. Like I said, GL as a model for accelerated 3D is clients providing the vertex data, so your kind of wireframe geometry, your input textures, material images, and your shader programs as well to run to generate the final output. Now shaders can deform the geometry so you can do cool stuff. You can also do things like lighting per pixel and do that in a nice reflective way that's all computational. I guess the main thing to recognize about GPUs is that they're enormously parallel, so thousands of threads, really. There's not much in the way of synchronization or shared memory. GPUs really can't do branching like CPUs. They want to have everything set up for them a long time in advance and just do straight-line things from there. It's a long, deep pipeline essentially and you want to make that roughly as static as you can. The cost of being enormously fast and really, really powerful, it turns out, is that they're really power hungry. That's why we have composition in the display hardware as well, because it turns out that just spinning up your GPU once per frame to produce the final display output is expensive.
I worked on a device where the video runtime went from five hours if we didn't use the GPU to four hours if we did. It's a really measurable cost to get a GPU involved. You only want to do it if you've got the right reasons for it or if you actually need it. Like I said, GL and GLES are a pure 3D-only API, because they came out of SGI, where you told it to draw and it was drawing, because there's only one screen and obviously it's going to come out at the right place on the screen. It was a simpler time. Then SGI realized that they needed some more nuance. They brought in GLX, which was the first go at integrating OpenGL with the window system. Originally it had the X server processing all the commands. That was terrible. We came up with DRI, for Direct Rendering Infrastructure, to let the clients directly access the GPU. It relied on central memory allocation. Then we came up with DRI2, where the main innovation was that clients would manage their own memory in cooperation with the kernel and also execute all of their own commands. That was so good that any time you see DRI it just means accelerated rendering, roughly describing the last 20 years. Any time you see DRI2 it doesn't mean actual DRI2 in X11. It just means "this kind of looks like a modern window system", by which I mean about the last 15 years. That can be confusing because those two terms are massively ambiguous, but if you ever see DRI2 it probably means that you're somewhere good. Then yeah, EGL is an abstraction of GLX. Rather than just plugging GL into X11, it lets you do Wayland, Android, whatever. All it really does is give you windows that you can share with the window system and some vague notion of timing, but it doesn't have any kind of events, so the only way you can get consistent frame timing is if you block a lot in EGL.
EGL just tries to hide everything and make it implicit, which again is where GBM comes in, because that's what lets us steal buffers away from EGL, push them into KMS for display, handle our own timing and do it properly this time. EGL has that shape, and then, not coincidentally, Vulkan has a fairly similar shape. Vulkan is the rendering API and that's it. Vulkan WSI is the EGL equivalent, which provides the window system integration of creating windows, posting content to them and so on. The main difference with Vulkan is that it's really explicit and clear about what it's doing. The downside is that because it's so explicit and clear you end up typing a hell of a lot of code. So it's more effort to use, but there's no magic hidden under Vulkan. You know exactly what's going on, for better or worse. It's really good on the desktop, but on mobile SoCs the hardware isn't necessarily entirely there yet. If you're doing high-performance things or you just like seeing what's going on under the hood, I'd recommend Vulkan. And yeah, I think about the last bit that we'll have time for is that I keep on just saying that EGL will get things from GL to Wayland. The way we do that is dma-buf. It's a kernel concept about sharing memory regions between different subsystems, different processes, different contexts, whatever. On the graphics side of things we've already got the GEM buffer objects, but they're local to one particular device and to one particular user context. So when you want to export a buffer to your Wayland server, or share it between V4L for your video capture and your GPU to do some analysis on it, that's dma-buf, which just gives you a file descriptor you can use as a handle to that memory area and import it into different contexts or subsystems or places.
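The "file descriptor as a handle to memory" idea can be sketched with plain files in Python. This is only an analogy, not the dma-buf API: a temporary file stands in for the buffer, `os.dup` stands in for handing the descriptor to another process (which in practice happens over a Unix socket), and the function name `share_buffer_by_fd` is made up for the illustration.

```python
import mmap
import os
import tempfile


def share_buffer_by_fd(size: int = 4096) -> bytes:
    """Write pixels through one fd and read them back through another."""
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)  # allocate the "buffer"
        producer_fd = backing.fileno()
        # Stand-in for exporting the handle to another context.
        consumer_fd = os.dup(producer_fd)
        try:
            with mmap.mmap(producer_fd, size) as producer_view, \
                    mmap.mmap(consumer_fd, size) as consumer_view:
                # The producer renders one opaque red RGBA pixel.
                producer_view[:4] = b"\xff\x00\x00\xff"
                # The consumer sees it without any copy being made.
                return bytes(consumer_view[:4])
        finally:
            os.close(consumer_fd)
```

Both mappings refer to the same memory, so the importer reads exactly what the exporter wrote. That is the property that makes a descriptor usable as a lowest common denominator between otherwise unrelated subsystems.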
And dma-buf is completely consistent throughout the stack: all of Wayland, EGL, KMS, Vulkan, everything I've discussed has dma-buf integration, because that's our lowest common denominator. So, yeah, we put it all together. Because they're all built on the same building blocks, it's largely how you'd think it is, hopefully, if I've done a decent job of this talk. The client is connecting to the compositor. It's creating a window and declaring some very simple annotations about that. It wants to use the GPU, so it creates an EGL context pointing to the Wayland server, saying "I'd like to render over here". The Wayland server has some dma-buf protocols, which tell the client what it can and can't accept. The client uses GLES to render into that. That's wrapped in a dma-buf and passed over to the compositor. The compositor is deciding how to place and configure everything. It's importing that dma-buf that it's got from the client to generate one final image. It's then waiting until the next deadline, that sort of 60 hertz cadence that we have, to present that out into KMS. That might be KMS doing its own composition directly in the display hardware, or it might go through the GPU itself. It's tough, because the display hardware can do that final image composition of taking your four or five images, mashing them all together and coming up with one. It is, like I said, a really measurable win on things like power, memory bandwidth and memory usage as well, but it's kind of complicated in that it's hard to be predictable about when you can and can't use it. It's a bit fiddly. It's one of the reasons I recommend using compositor frameworks like libweston, which do all of this heavy lifting for you. I've spent 10 years of my life trying to solve this problem and wouldn't recommend that anyone else does it. It's not even really that interesting.
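A back-of-the-envelope calculation shows why composing in the display hardware is a measurable win. This is a rough sketch under stated assumptions: every layer is full-screen 1920x1080 at 4 bytes per pixel, the GPU path reads each layer, writes one output and the display then reads that output, and the plane path only reads each layer once. Real hardware adds caching and compression, so treat the numbers as orders of magnitude.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to produce one frame at a given refresh rate."""
    return 1000.0 / refresh_hz


def buffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed buffer."""
    return width * height * bytes_per_pixel


def gpu_composition_bytes_per_s(width, height, layers, refresh_hz):
    # Assumed model: read every layer, write the result, scan the result out.
    return (layers + 2) * buffer_bytes(width, height) * refresh_hz


def plane_composition_bytes_per_s(width, height, layers, refresh_hz):
    # Assumed model: the display hardware reads each layer directly.
    return layers * buffer_bytes(width, height) * refresh_hz


if __name__ == "__main__":
    gpu = gpu_composition_bytes_per_s(1920, 1080, 4, 60)
    planes = plane_composition_bytes_per_s(1920, 1080, 4, 60)
    print(f"frame budget at 60 Hz: {frame_budget_ms(60):.2f} ms")
    print(f"GPU composition:   {gpu / 1e9:.2f} GB/s")
    print(f"plane composition: {planes / 1e9:.2f} GB/s")
```

Under these assumptions, four layers at 60 Hz cost about 3.0 GB/s of memory traffic through the GPU and about 2.0 GB/s through planes, before counting the power needed to wake the GPU at all.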
Internally, Weston has, like I said, that kind of brute-force loop of just trying every possible configuration that could work, seeing what happens and throwing it at KMS to check if that will work. Currently, that's the most advanced one, but yeah, others are catching up. I think really to sum up what I was trying to say about GPUs and efficiency: one of the things that gets Collabora a lot of work is that no one realizes that every problem on mobile comes down to memory bandwidth. So people assume you can solve every problem by just copying buffers around more. But when you've got 4K buffers and you've got a low-end device, it turns out that this is always where your performance problem is. It's down in things like copies and naive memory usage. So yeah, one thing to really be aware of is to try and go for a zero-copy pipeline, because when you have 4K and 144 Hz, you really don't have much time and you don't want to spend it all just waiting for slow memory. Yeah, with that, I think we're pretty much coming up on time. So there's the quick whirlwind tour of how all that fits together. If anyone has any questions or wants to talk about how Wayland's amazing, please feel free. If you have any questions, please raise your hand. When we launch a game in full screen, for example, does it go straight from GPU to screen or does it go all the way through KMS on that? It will go through the window system. So yeah, the question being: if you have a full-screen game, will it go straight from the GPU to the display, or will the window system still be involved? It will still be there, but ideally doing nothing. It will just take the client buffer, give it directly to KMS and ask KMS to display it, in the happy case. But it's always involved as the mediator, so when a notification pops up, it already has control, so it can show it. Hello. I can't. Is it working? Yeah. Okay. So forgive the super newbie question.
When you say the framebuffer is tied to a plane, a plane is not a desktop, a plane is just a window. When you tie a framebuffer to a plane, the plane goes into the compositor. So the plane is a window, it's not the entire desktop. Yeah, exactly. So the CRTC is your final output as one flat image, and planes are windows within that CRTC. Thank you. More questions? All right. Hello. Is it working? Hello. You mentioned that kernel mode setting is used for turning the pixels into... Sorry, could you please... Sorry. Yeah. You mentioned that KMS, kernel mode setting, is used to turn the data into pixels on the screen. Is this where graphics card drivers and other vendor-specific software are involved, or is that earlier or later in the pipeline? Sorry, which parameters? So basically, where do graphics card drivers come in? Because I know there's vendor-specific hardware that requires its own drivers somewhere in kernel space, I believe, so where does this fit in the pipeline? So all of the properties and parameters are defined in kernel space, and we try to standardize them as much as possible. In the generic world, we do stick pretty religiously to a standard set of parameters that have common behavior across everyone. If you go to things like Android, where you have Hardware Composer and vendor-specific HALs, it's completely different. That's more of a negotiation between kernel and user space, which are both vendor-specific. Does that answer your question? Do you know if there are any toolkit libraries for writing compositors that are not desktop-specific? Any compositor libraries that are... Libraries for writing compositors that are not desktop-specific. So libweston is good for writing desktop-type things, but for highly embedded use cases, have you found anything that makes it easy to write a compositor like that? Yeah, so libweston's the one for those kinds of embedded or single-purpose use cases.
Mutter, which is the basis of GNOME Shell, can be used by anyone else, but it's really GPU-reliant. And wlroots is, I guess, kind of in the middle. It's not as friendly and desktop-y as GNOME, but it's not as insanely efficient as Weston, and that's the halfway house, I guess. Is there any tool you would recommend for profiling? Sorry, could you speak up? Is there any tool that you would recommend for profiling the graphics stack? Are there any tools for profiling the graphics stack? Kind of. Mesa has integration with a tool called Perfetto, which is the basis of Android GPU Inspector. There's some support in there for Weston, specifically, to interpose its timeline on top of Perfetto, but it's pretty patchy, to be honest. We've been working on that basically to try and make it easier, so we can stop getting paid for debugging and profiling stuff, to be honest. But yeah, it's a slow process. Perfetto is the best one there. I have a question. Why can't we do screen recording or screen sharing in Wayland? You can. Screen sharing in Wayland is done through the XDG ScreenCast portal, and we did that because if you try to put it in Wayland itself, as a core protocol for clients to use, it really goes against the grain, because everything was designed with this idea of the timing coming from the display and flowing back to the clients. Once you turn that around so that the client is receiving content, it really is a terrible fit with pretty much every interface we had. So it's easier for us, and it also works for things like sandboxing and containers, to go with the XDG portal solution. And yeah, it works everywhere, basically. Okay. I think, yeah. Okay. Thank you, Daniel. Thanks very much. Thank you. |
Can we do an open source chip design in 45 minutes?
The state of free and open source silicon |
Welcome, everybody. It's amazing to be back. It's amazing to be back in Brussels in person and see that many people here looking for an answer to the question: can we do an open source chip design in 45 minutes? And to save you all the hassle of actually listening through this talk: yes, we can. Thanks. Let's go out and enjoy Brussels. Or, if you want to know more about how that actually can work, stay here. So we're going to look at technology. Obviously, this is FOSDEM. We're going to look at the tools and the process for building an open source chip. We'll be looking at community, because this is FOSDEM. We want to talk about the people and communities behind all of these efforts. And we will predict the future. Let's see how that goes, and we'll see some examples of where that potentially went wrong in the past. So standing in front of you today is Philipp. I'm a hardware developer. I've been doing a fair amount of software. I've been involved in open source for a fair amount of time. I gave my first FOSDEM talk, I just looked, 11 years ago. You'll find some contact information there. And I'm currently working at this company here. I've been doing chip verification for high-end mainframe CPUs. So that brings old and new together, actually. And I'm also a maintainer of the open source cocotb project, and we'll get into that in a second. Previously, I was working at lowRISC and on OpenTitan. Again, two buzzwords just for you to Google if you want to. OpenTitan is a fully open source security chip that you can actually get involved in just like any other software community. I'm here today representing the FOSSi Foundation. This is a foundation that is there to steward and take care of the open source chip design movement. It's based on individuals, and it's a not-for-profit registered in the UK. It also has been around for quite a while.
It was initially born out of the OpenRISC movement, which did an open source processor CPU core back, probably, I don't know how many years ago by now, 10, 15, something like that, before RISC-V was a thing. So that's it for the introduction. Let's go into the technology. How do we build a chip? There are two main things that we need. First, we need to think. We have an idea; obviously, that's where we start. But then at some point we need to implement the functionality. The way we do that looks similar to a regular programming task. This is the logic design part, and that one is pretty well understood these days, also in open source. We'll see why that is the case and why it has been thriving for many, many years now. The second part is the back end, or the physics, or the real world: the thing that actually gets us from a description of what our hardware could do to a real physical chip that we can then potentially hold in our hands. This was the part that was tricky so far. And again, we'll get into more of the details of that. But these two things are the ones we need. We need the logic, the functionality, and we need the implementation to make this a real thing. As in so many cases, the terms are overloaded and fuzzy and ambiguous. So for my purpose here, I'm going to use the term front end for the logic design part, and then there will be a back end as well. If you're coming from the web world or the JavaScript world, that probably means something very different to you. If you have a front-end designer at IBM, they're potentially doing JavaScript, because that's what they do as well. Or they're potentially doing chips. You don't know. That has led to interesting conversations in the past, I must say. So the front end, I briefly alluded to that. It's just like programming. There are different levels of programming languages and hierarchies.
There is, for example, high-level synthesis, starting at the top. If you buy such a high-level synthesis tool, the vendor will probably tell you that you start with an idea and then maybe an algorithm, something fancy, and then you just run the tool and you get your chip out of it immediately, kind of done. Some of them will sell you tools that allow you to start with, well, in marketing it sounds like an arbitrarily complex C or C++ code base, and you just run that high-level synthesis tool and you get a chip design out of it. No human involved, works perfectly. Now, anybody who has seen C or C++ code bases will attest that this is probably not how reality works. And it doesn't. Beyond that, and again the boundary is slightly fuzzy, there are high-level languages that ideally make it more convenient to write a chip. And then almost at the lowest level, there are always SystemVerilog, or Verilog, and VHDL. Again, SystemVerilog is slightly misleading in its terminology. SystemVerilog is the current standard for the hardware description language that was previously called Verilog. Some people use it as if SystemVerilog were a verification language while Verilog is the design language. Again, let's make it as confusing as possible. The language is called SystemVerilog these days. The alternative is VHDL. These are more or less the C or C++ of the hardware world. And below that is something that's not a programming language. It's typically Verilog as well, just a lower-level representation of the same thing. That is the netlist, which is effectively a sea of ANDs and ORs, connected by wires. That's representing your logic. So these are the programming languages that you would use, and we'll get into a couple of examples in a second. And then, obviously, there's all the other stuff. Test frameworks, build tools. You need to have a proper conversation about a build tool.
Otherwise, it's not a real programming environment. I mean, who are we if we can't argue endlessly about build tools? There are developer productivity tools. That's actually one of the areas where open source really shines compared to some of the commercial offerings. And there are simulators. Again, we'll get into those in a second. So let's have a look at a couple of examples. We want to do open source, after all. So, the hardware description languages that we can choose from. The list is quite long, even though it narrows down quickly once you look deeper. There are SystemVerilog and Verilog. This is an IEEE standard. You can download the standard just like the C standard. And this is effectively the most commonly used language for hardware design. It's quite old, which is not in itself much of a problem, and it has been continuously updated. The other problem with SystemVerilog and Verilog is that years come and go, and programming concepts and ideas come and go as well. If your language stays around that long, you just add everything. So if you look at the SystemVerilog standard, it's huge. It's about as huge as the C++ standard. And it has some interesting corner cases if you look into it. That has interesting side effects, because we have a fair number of tools, and we'll get to how many tools you actually need to involve to get a hardware design working in a second. All of those need to understand, ideally, the same subset of SystemVerilog, and also interpret it the same way. That's not always the case. That's why hardware designers, who are per se already quite conservative in the tools they choose, will end up with a very old-school subset of Verilog that you, in the end, are allowed to use. VHDL is another option; same story there, just slightly different. The Verilog community is much larger.
And especially if you're looking at producing real ASIC chip designs, you're going to look at Verilog mostly. But then programming languages, as I said, come and go. So there are some interesting new ones out there as well. There's Bluespec. There are a fair number of Python-based HDLs, and some of them are more low-level than others. There is Migen, Amaranth, and MyHDL, and a couple of others. This list, by the way, doesn't need to be complete, and it isn't complete. And there are a number of hardware description languages that are based on functional programming languages: SpinalHDL, Chisel, and Clash, for example. Chisel, for example, is quite often used these days because, I don't know, who has heard about RISC-V? Maybe raise your hand. So RISC-V is an open source instruction set architecture, like the x86 instruction set architecture or the Arm v7 and v8 instruction set architectures. It was developed in Berkeley originally, and they also developed a hardware language called Chisel. The initial RISC-V core implementation, the Rocket core, is implemented in Chisel. And since that's open source and widely used, Chisel also spread more widely. So you'll see that. And then there is, I'm not quite sure how new or old it is, the CIRCT LLVM effort, which tries to build an LLVM-based compiler infrastructure to then place hardware languages or other pieces of functionality on top of. I've actually not seen that many useful tools coming out of that LLVM-based middle layer yet, so we'll see what the future brings in that regard. So we choose a programming language. And then the next thing you want to do, if you actually want to build a chip, is ideally not write everything from scratch. You reuse, you integrate existing things, and they're typically called IP cores in that world. There are a couple of options that make it easier to integrate.
Unfortunately, there is no central package manager, nothing like Cargo or npm that you would have in the newer software world. What you have to do instead is more or less Google for whatever you might need, and you might find an abandoned Git repository or a zip drop somewhere that contains what you're looking for. And often it comes with the note that this has been taped out, so a chip has been produced out of it, and it's stable, so we're not going to touch it again. Many of these IP cores are considered stable because as soon as you've used them once, you don't want to touch them again. So you rarely have a fancy community around those cores. It's just there, you use it, and then you're on your own. There was, and still is, this website called OpenCores, which was a directory for open source hardware blocks. It's been unmaintained for quite a while, but it's still there. So if you're looking for cores, it might be a good option, but again, don't expect much of an active community around these offerings. And yes, I guess the best option, and that's why the slide is quite empty, is to just Google, and you'll find something. So now we have put together a logic design. We have written some stuff on our own, we've chosen a programming language nobody else has been using before, and we've added some IP cores that nobody was maintaining. Now we want to see if it actually works. I mean, it feels unlikely if I phrase it like that, but chances are it actually does. So we need to verify it somehow, well or not so well. We potentially document it. Nah, let's not do that. We want to make it look pretty. I mean, that's what we spend most of our time on, don't we? It's tabs or spaces, or indent here or there. It needs to look pretty, even if it doesn't work. So that's something we need to do. And that's something the open source community was always fantastic about.
And that's clearly something we're bringing to the world of commercial chip design. We need to simulate it, because at this point we're just doing logic design. We don't have a chip yet, so we somehow need to see what the design actually does. And we potentially can run it on an FPGA; I'll get to that in a second. So let's have a look at a couple of buzzwords for what is possible these days in the open source chip design world. Simulators, we start with that one. There are three main simulators. And you see that they're the main simulators because they have a logo. There is a fourth simulator that's rarely used, NVC, and it doesn't have a logo. So I think that indicates what's going on. You can also see from the style of the logos which are the older projects, although that's slightly misleading, I must say. On the top right, we have Icarus Verilog, which is an event-based Verilog simulator. It has been around for quite a while. It's stable, and it works really well. It's widely used. But it doesn't support many of the more modern SystemVerilog features, and it's not that fast. It's sufficiently fast for smaller designs, and it's the standard choice if you just want to get started. So that's Icarus Verilog. If you're looking for a VHDL option, there is GHDL, I think the most actively maintained VHDL simulator these days. And there is Verilator in the middle, which is slightly different, although it simulates Verilog as well. The name gives it away, I would say. But it's a cycle-accurate simulator. Without going into too much detail here: if you have a real chip design, you clock it, and then you get one clock tick per clock. Well, that sounds ridiculous, but those are the only times when Verilator re-evaluates what is going on. In an event-based simulator, you get slightly different behavior. Either way, Verilator behaves pretty much like a real chip would do afterwards.
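As a minimal sketch of the cycle-based idea, here is a small counter with an enable input simulated in Python. This is not how Verilator is implemented (it compiles Verilog to C++); the function name `simulate_cycles` and the 4-bit width are assumptions for the illustration. The point is that the state is only re-evaluated once per clock, never in between.

```python
from typing import Iterable, List


def simulate_cycles(enable_per_cycle: Iterable[int], width: int = 4) -> List[int]:
    """Cycle-based simulation of a counter with an enable input."""
    mask = (1 << width) - 1
    count = 0  # register state, reset to zero
    trace = []
    for enable in enable_per_cycle:
        # Combinational logic: work out the next value from the current state.
        next_count = (count + 1) & mask if enable else count
        # Clock edge: the register takes the new value, once per cycle.
        count = next_count
        trace.append(count)
    return trace


if __name__ == "__main__":
    print(simulate_cycles([1, 1, 0, 1]))  # counter holds while enable is 0
```

An event-based simulator would instead keep a queue of signal changes and re-evaluate logic whenever any input changes, which models more detail but costs more time per simulated clock.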
Verilator also has a very, very active community, and it's typically based around the synthesizable subset of Verilog, because that's what it targets. It targets only the logic that you then want to bring onto a chip, and not so much the SystemVerilog that you can use to write a test bench or verification framework around it. The verification frameworks, I just mentioned that. Who has heard of UVM, the one up here? A couple of you? Okay. So UVM is an interesting one. It's a SystemVerilog verification methodology. And already with "methodology" you breathe this air of being designed by a committee, and that's how it looks. It's effectively a class library, a framework of classes and instructions on how to use them and when to use them to verify a chip. So, to write a test bench, more or less. That's your Google Test of Verilog, if you wish. And there are a couple of other options. I mentioned that I'm involved there, so I'm going to mention it prominently: there's cocotb, which gives you a way to write a test bench in Python that then tests your hardware logic. There is OSVVM for VHDL designs. And there is, at the top, just one of the further options: SymbiYosys, if you don't want to do simulation-based verification but formal verification, where you want to prove that certain things are happening or not happening. This effectively uses a SAT solver behind the scenes to prove that some things can be done or can't be done. That's a very different verification approach, and it works very well for some problems, not for all of them. So typically, if you're trying to gain confidence in your design, you definitely go for some simulation-based verification, so simulation-based testing. And then you maybe throw in some formal around some specific areas.
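The difference between the two approaches can be sketched on a tiny netlist of ANDs, ORs and XORs. This is a toy under loud assumptions: the full adder, the deliberately broken variant and all function names are made up for the example, and exhaustive enumeration stands in for the SAT solver a real formal tool would use, which only works here because there are just eight input combinations.

```python
import itertools
import random


def full_adder_reference(a: int, b: int, cin: int):
    """Specification: what a one-bit full adder should compute."""
    total = a + b + cin
    return total & 1, total >> 1


def full_adder_gates(a: int, b: int, cin: int):
    """Gate-level implementation, as it might appear in a netlist."""
    axb = a ^ b
    return axb ^ cin, (a & b) | (axb & cin)


def full_adder_buggy(a: int, b: int, cin: int):
    """Same netlist with one term of the carry logic missing."""
    return a ^ b ^ cin, (a & b) | (a & cin)


def random_simulation(design, trials: int = 4, seed: int = 0) -> bool:
    """Simulation-based testing: samples inputs, so it can miss bugs."""
    rng = random.Random(seed)
    for _ in range(trials):
        inputs = tuple(rng.randint(0, 1) for _ in range(3))
        if design(*inputs) != full_adder_reference(*inputs):
            return False
    return True


def find_counterexample(design):
    """Formal-style check: covers every input, so None means proven."""
    for inputs in itertools.product((0, 1), repeat=3):
        if design(*inputs) != full_adder_reference(*inputs):
            return inputs
    return None
```

`find_counterexample(full_adder_gates)` returns `None`, which amounts to a proof for this tiny circuit, while `find_counterexample(full_adder_buggy)` returns the exact inputs that break it. Random simulation can only ever raise confidence, which is why the two approaches tend to be combined.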
And while I'm putting these up here: this way of doing verification is not new, but what we see is that we have a fair number of options in open source that are very high quality and ready to be used. So that's great. Build systems, and I need to mention the first one first. Who has heard of FuseSoC? Oh, a couple of you. So there's this guy called Olof Kindgren, quite active on Twitter as well, and he's always writing award-winning software. He started that when he saw a tool that apparently had won an award, but he couldn't find the award that it had been given. So from then on, every tool he writes is award-winning, even though it has never won any awards. Actually, I think FuseSoC has won one or two awards by now. So I think that's something we all should copy. Just make sure that we always say our software is award-winning, and from then on it's just much better. So FuseSoC is a build system for hardware designs, and Edalize is a build system back end that puts the hardware design together and feeds it to the variety of different tools that are involved. So it's just like driving your compiler. There is VUnit, a regression manager. Well, okay, maybe another interesting term, since I'm saying regression manager here. A regression in the software world would be that something was working, and then you broke it somehow, so functionality degraded in quality. A regression in the hardware world is somebody just running a fair number of tests. So effectively, your nightly CI would be a regression, just to add confusing terminology. And there are a fair number of other options for build systems. Again, just Google them and then fight about it, obviously. We talked about white space before, and when you're done fighting about your build system, you obviously need to fight about the right style guide to use for your programming languages.
We have become accustomed in the software world to there being clang-format and gofmt and other auto-formatters, enforcing, or not so much enforcing, a style. If you look at Verilog code and VHDL code, you see there is no formatter. And often you see there is no taste either. It just looks awful. If you look through code, everything looks different, which isn't that much of a problem in the commercial chip development world, because you own something. It's John's core. This is the piece of code that John is working on. Only he needs to be able to touch it, so it's up to John how to format that code. And they've done that for 20 or 30 or 40 years, so they just have their in-house style, which you also see changing over time. That's not something that we find normal in these days of software development. So there was an effort, and there still is an effort, to bring formatting and linting and language server integration. That gives you a Visual Studio Code integration, or other kinds of integrations. Verible is the tool to look for there. There is Verilator lint, part of the Verilator simulator I've shown before, and it actually does more static analysis jobs as well. And there are a couple of other options to choose from. I'm mostly mentioning the Verilog part here because that's what I'm most familiar with, but I'm very sure there is an equivalent VHDL option for many of these things as well. And if you know them, by the way, just raise your hand and we'll add them right away. So let's have a look at just one or two examples. You don't need to read this in full. This is a piece of Verilog at the top, assigning something based on a ternary statement. And think about Verilog like this: it's sort of statically typed, but it's still easy to assign, for example, 32 bits to seven bits or the other way around.
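To make the width mismatch concrete, here is a small Python model of what happens to the bits, plus the kind of check a lint tool can add. It is a sketch under assumptions: values are treated as unsigned, the function names `assign` and `lint_width` are hypothetical, and the warning texts are made up rather than copied from any real linter.

```python
from typing import Optional


def assign(value: int, source_width: int, target_width: int) -> int:
    """Model an unsigned vector assignment between different widths."""
    value &= (1 << source_width) - 1         # the value as the source holds it
    return value & ((1 << target_width) - 1)  # upper bits are silently dropped


def lint_width(source_width: int, target_width: int) -> Optional[str]:
    """Return a warning when the widths differ, the way a linter would."""
    if source_width > target_width:
        lost = source_width - target_width
        return f"truncation: {lost} upper bits are discarded"
    if source_width < target_width:
        added = target_width - source_width
        return f"extension: {added} upper bits are padded"
    return None


if __name__ == "__main__":
    print(hex(assign(0xDEADBEEF, 32, 7)))
    print(lint_width(32, 7))
```

Assigning `0xDEADBEEF` from 32 bits to seven bits keeps only the lowest seven bits, and nothing in the assignment itself signals that 25 bits went missing. That is the gap a separate lint pass closes.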
Most Verilog tools will be very lenient about a width mismatch like that and just strip off some bits that are not used, or pad in some way. What you would expect from many languages is that the compiler just tells you, well, this is just bullshit that you've written here. In Verilog, you actually need some further tools to do that, and Verilator lint is one of the options here. These are just random examples where you actually need insight into the variables and the constants that are being used. With Verible, we can now also do something that we know quite well from the software world: just have style lints and things like that running in CI and actually give you feedback right away. For example, here I have a screenshot from the OpenTitan repository, a pull request, where one of these bots immediately gives you feedback on, well, that's trailing whitespace. That's the most boring example. But it also covers things like rules about naming certain constructs. We don't need to go into the details of what's happening here. And I think I've seen some people from Antmicro here before. So if you have questions about Verible, I think the people up there are the ones to blame and to ask. But they've done a great job. So let's leave it at that for the moment. Front end: we're doing great in terms of chip design. We have all the tools. We've seen that the slides only ever had a subset of the tools that we can use. So the front end is doing great and has been for ages. One thing I would love to see is that we just stop reinventing Verilog parsing. It's a huge language, and we have so many tools that all think they can parse Verilog, and they just can't. They really can't. Everybody thinks they can just skip that part of the standard, and then somebody uses it, and you get these weird pipelines that just don't work anymore. So maybe a unified front end at some point in time. And I know there have been a couple of attempts to actually do that.
And it just leads to the situation that you now have one more frontend to care about, as always. Okay. We have our logic design. We can run it on an FPGA. Sure. In the FPGA world, there are two main manufacturers of FPGAs, and then there are some others. So who does, well, who does not know what an FPGA is? Okay, a very small number. It's a field-programmable gate array, so it's effectively a chip that you can reprogram, to put it in very simple terms. Normally you would use the tool that the vendor provides, that would be, how do I name this thing? It would be Vivado for Xilinx, now some other company, or Quartus for Altera, now Intel. Whatever. They just get bought. Come on. Stop it. Or, if you don't want to use those closed tools, you can use SymbiFlow, or formerly SymbiFlow, now F4PGA. It gives you a reverse-engineered way to target Xilinx 7-series FPGAs, which are the most common FPGAs, and to do full designs using only open source tools and get them running on an FPGA. The problem there always was that the programming instructions, the way to structure a program to put it into an FPGA, weren't known. That's what they reverse engineered, and then they put together various open source tools that allow you to actually target FPGAs. But FPGAs are pretty much like this: you're trying to build a phone, but the only thing you can do is build that. It's nice, the functionality is kind of similar, but it's pixelated, you know. It's just not the real thing. And I actually outsourced that work to my niece and nephew. So, well, that's what we want, don't we? We want open source silicon. We want a real chip. We want this thingy, not that. So how can we do that? We need the back end, we need the physical implementation. That adds a whole lot of new complexity, of course. And that's where things become interesting.
What we want is to go from RTL, so that's your register transfer level (layer, level, I don't know), so that's your Verilog, to GDSII. That's a very old data format that you then send to the company that produces your chips, so that's the fab. So we want an RTL-to-GDSII flow, and I have an example of one here. We start with our Verilog at the top. We do synthesis. I'm not going to explain all the steps in detail, because otherwise we're going to be here for a long time, but just to demystify some of the acronyms in here: we do our synthesis, and that gives us a netlist. We do a static timing analysis to see if we have potentially put too much logic in a row, so that you can't clock it fast anymore, so that you can only clock it at two kilohertz or something like that. That's what the static timing analysis does in the first place. Then there's DFT, design for test. Once you have that chip, you're going to have a hard time looking inside it. So you want some test structures inside your chip that allow you to observe what's going on. These are typically scan chains, something that JTAG was and still is used for, apart from the many other things that JTAG is being used for. That's what DFT does, and these scan chains are normally inserted automatically for all or most registers. After that, you still have a Verilog description. Then you do floor planning, placement, clock tree synthesis, optimization, and global routing. In the end you get a number of transistors, you connect them somehow, and you optimize them in a way that minimizes or maximizes power or performance or area, depending on what you're looking for. And then there are all the things that we've skipped over so far that are now relevant in the physical world. For example, if you have a clock that goes to your chip, you need to make sure that it actually gets distributed across the chip.
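As a rough illustration of the static timing analysis step mentioned above, the following Python sketch finds the longest combinational path between registers in a tiny, invented netlist and derives the highest clock frequency that path would allow. The cell names and delays are hypothetical, and a real timing tool also accounts for setup and hold times, clock skew and wire delays, none of which are modeled here.

```python
from typing import Dict, List

# Hypothetical netlist between two register stages. Each cell has its own delay
# in nanoseconds and a list of cells driving its inputs. All names and numbers
# are invented for illustration.
CELL_DELAY_NS: Dict[str, float] = {
    "reg_a": 0.0,
    "reg_b": 0.0,
    "and1": 0.2,
    "xor1": 0.3,
    "add1": 0.9,
    "reg_out": 0.0,
}
DRIVERS: Dict[str, List[str]] = {
    "reg_a": [],
    "reg_b": [],
    "and1": ["reg_a", "reg_b"],
    "xor1": ["reg_a", "and1"],
    "add1": ["xor1", "reg_b"],
    "reg_out": ["add1"],
}


def arrival_times(delays: Dict[str, float], drivers: Dict[str, List[str]]) -> Dict[str, float]:
    """Latest time at which each cell's output settles (longest path to that cell)."""
    memo: Dict[str, float] = {}

    def visit(cell: str) -> float:
        if cell not in memo:
            latest_input = max((visit(d) for d in drivers[cell]), default=0.0)
            memo[cell] = latest_input + delays[cell]
        return memo[cell]

    for cell in delays:
        visit(cell)
    return memo


def max_clock_mhz(critical_path_ns: float) -> float:
    """A period of N nanoseconds corresponds to 1000 / N megahertz."""
    return 1000.0 / critical_path_ns


if __name__ == "__main__":
    times = arrival_times(CELL_DELAY_NS, DRIVERS)
    critical = max(times.values())
    print(f"critical path: {critical:.2f} ns")
    print(f"max clock:     {max_clock_mhz(critical):.0f} MHz")
```

Putting more logic in a row lengthens the critical path, and the achievable clock frequency drops accordingly, which is the effect the analysis is there to catch.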
So you potentially have just a single pin where it comes in, but it needs to be everywhere. So you have that clock tree that you need to insert somehow. That is something we don't need to worry about, or not that much, if we're doing an FPGA design or a simulation, because the FPGA vendor has already done it for us. So we were able to ignore many of these things while we were just doing FPGAs. And since we're back in the physical world, you need to manufacture the chips, so we have antenna diodes that we need to insert. After we've done that much processing, we're not quite sure what we've got. So at this point we want to make sure that what we have is still the original design. So we do a logic equivalence check, that's the LEC here. Then we do detailed routing, RC extraction, and another timing analysis. And finally, we're at GDSII streaming, which is when we write the file. And there we've got our GDSII. Easy, isn't it? You just need to hook those tools up together. Done. Beautiful. Why didn't we do that ages ago? I guess we were too lazy to just do that. Well, the reason is a different one. I left out one part of the picture, and that's one input up here. That's the PDK, the process design kit. This is something you get from a fab. And if you don't get it, you don't get it, and you're not going to do any of that. I was standing here in 2016, a couple of years ago, and these were some of the slides; the style has changed as well. So we have the process design kit. In there we get standard cell libraries. We get a fair amount of design rules. If you have ever done PCB manufacturing, printed circuit boards, you also get a number of rules that you need to obey so that they can actually manufacture your board. That list is typically not that long. If you want to fab a chip, the list is significantly longer. And there are electrical parameters. You get all of that from the foundry.
And you need to sign an NDA for that. That's where we stopped in 2016. That's where we stopped in 2017. And that's why we said back then that you can't do it as a hobbyist; there's just no way of doing that. There were some companies who did it, and lowRISC was one of them. If you are big enough, they will allow you to sign that NDA. But that's where we stopped. And then there was this talk. This was two years ago now. Tim Ansell, what, it's 2023, so 2020, I think. So that's even more. Yeah, let's say two years, plus or minus. Tim Ansell, working at Google, was presenting the SkyWater PDK. This was the first time ever, well, at least in the last 20 years, let's put it this way, that a process design kit, so the rules from the foundry, was open source. And it is available on GitHub, right there. So let me switch over here. Can I? No, I can't. Let's keep it up there. So, yes, you can download it from GitHub. Now that one problem is solved. You can actually manufacture a fully open source chip now, because you do have these rules. The downside is that this is a 130 nanometer PDK, while you are looking at 3 or 5 nanometer chips in a current mobile phone, potentially. So when was 130? 130 was this time. The Pentium 4s were roughly 130 nanometer, so that's the NetBurst architecture, and AMD was there at roughly the same time. That's around 2002, when that technology was first introduced. Maybe you still have some of those PCs around. Nonetheless, this process is still being manufactured. Otherwise, the foundry wouldn't be able to produce anything with it. So they still have processing lines that actually use that flow. The other thing that happened is that Google said: not only did we talk to the foundry and they gave this PDK away, we're also funding a shuttle. That is effectively a cost-effective way to produce chips together with others on a single wafer.
And they said they're doing a couple of those every year. If you submit an open source design, you get it manufactured for free, really free as in no cost at all, using only open source tools. So free as in really free as well. That for the first time made it possible for everybody who has internet access, without needing any cash at all, to get a chip manufactured. These are obviously not volume chips. These are experimental chips for you to give it a try. But for the first time ever, you can. So this was an old article I found. Starting from these 130 nanometer chips, they said, and that was in 2000, that within a couple of years we should be able to reach 8 to 10 gigahertz on slightly better nodes. I think that was around 70 nanometers. So if they said that would be possible, I mean, we are only about 18 years behind that prediction now. Let's see what we in open source can actually do. We're probably not going to reach those eight gigahertz, but there is a fair amount of room in those old process nodes still to be optimized and still to be explored. So let's do some exploring. We've seen that picture of the flow before. You can get that flow very easily. You do a git clone of the OpenLane repository. You cd into it. You run make, make test and make mount, and that gives you a Docker container. And that Docker container is something I have open right here. So this is the OpenLane Docker container. Since we are really short on time, I'm not going to run it in full; I ran the flow beforehand. This is a trivial design. It's an SPM, a serial-parallel multiplier. This is a very small piece of Verilog code that we will then turn into a chip design. It's part of the standard demo, so you can give that a try as well. So let's have a look. The flow runs through in a minute or two for that trivial design.
For a larger design it takes longer: an hour for an AES core, and a full SoC takes a couple of tens of hours, 10 or 20, depending on what you're doing. It's still not the end of the world. So that was a boring summary, wasn't it? Let's have a look at the output right away. It comes with a GDS viewer. GDS is the output format, as we said. Starting from just that Verilog and using only open source components, you get the full GDS. And that is confusing, because physics is confusing. So let's see if we can get rid of some things here, or let's just have a look at the left side first. We see the different cells that are being used. The SkyWater foundry provides us with a number of ANDs and ORs and NOTs and other things, optimized for different process corners. They get chosen and then inserted into the design. So let's get rid of a bit of the stuff that we actually need for physics. Decaps, who needs those? Fillers, nobody needs those either. That already gives us a slightly better picture. Maybe we can get rid of some of those metal layers, the connectivity. I mean, who needs that? And we see we can get closer to the actually interesting parts. So that is effectively the file that you send to the foundry. You can do that, as I said, free of cost using the shuttle program. Or, if you don't want to use that, you can also pay a company called Efabless to fab it for you. As I said, using only open source tools. Since we are really short on time, here are the things that you can have a look at and play with. If you don't want to install anything locally, you can have a look at that link, which gives you a Jupyter notebook, an online way to just play around. In your browser, you write some VHDL, or Verilog actually in that case, synthesize it, and have a look at the GDSII in the end. Only in your browser, nothing else needed. There is a project called Tiny Tapeout, where you can get a very, very small chip manufactured and learn how to do it.
And they have a very, very nice GitHub Action set up. I need to show that because it's just fantastic. Apart from having your Verilog and, in the end, your GDS, you also get a built-in 3D GDS viewer that allows you to zoom through your design and see the metal layers, see everything in there. And let's get rid of some of the, oops, that was the wrong button. Oh, how can I help you? Some of the layers: so that was removing the fill cells and the decaps, and then removing the cell geometry as well. So those are your standard cells now. We can actually look at them and see what they are, and also look at the different layers. That's really cool. And it shows that if you have all the open source tools, you can do all that. You can do that in a GitHub Action. Before, when you had licenses and license servers and all that crap that comes with software that you need to buy for a lot of money, which is not just expensive but also a pain to use, stuff like that just wouldn't be possible. So open source is here with that stuff, and it's staying, I'm telling you. So we got that one. And again, the link is there. So let's have a look at who's actually doing that. I had a look at some GitHub stats. Yosys is the synthesis tool, and actually quite an old tool. That's 10 people. Again, it's a random month, roughly last month. So 10 people working there. That's a healthy community of people adding things. It's not a huge community, and this is a thing we'll see repeating. Verilator, the simulation tool: 20 people in the last month pushing and doing stuff. cocotb: not that many people, but still. The SkyWater PDK, that's effectively the manual of your PDK, and that doesn't see that many changes. I think that's one of the areas where more people could get involved.
It's also where we as a community have more to figure out, because that PDK description is not complete and not error-free. As in any document that you write, there are probably going to be errors. So we're still looking for more ways to get people involved and to fix even those small errors. OpenLane, we've seen that flow; it's reasonably active as well. Then comes OpenROAD, which is within OpenLane; they always make these sound similar. OpenROAD is the majority of those physical design tools, bundled together. OpenROAD was funded by DARPA, and the idea is to be able to go from RTL to GDSII in 24 hours with no human in the loop. That is something very different from the normal hardware world, where you always have tools that are... well, you can put it in two ways. You can say the tools are too crappy, so you need to go in manually and fix some stuff along the way. Or you can say the tools are good enough and you just need that human to give it that special touch, because hardware is just so difficult. It feels like saying that at times my compiler isn't quite good enough to really compile my software, so at times I go in and do a bit of hand assembly editing, because there are corner cases around memory access that just need a bit of manual work. That's how hardware design works. And that's what they want to change: just make it completely automated. Just for comparison, this is how things look for the Linux kernel. So the hardware community is significantly smaller than the software community. And that's why Google, for example, is funding some of these things: finding hardware engineers is quite tricky. There are just tons of software engineers and tons of good ideas that have been explored and tried in the software world. So we want to bring that to hardware as well and make it accessible there. We've looked at Verilator.
And we also see more and more chip designs being submitted to OpenMPW. That's the free-as-in-cost shuttle program where you can just submit your designs. So whenever somebody learns, typically by themselves, how to do a chip design and then submits it, it shows up in that graph. There have been eight of those free manufacturing runs now, and there are more to come. The first chips are back, the MPW1 chips. You see there is a fair amount of lead time. It typically takes at least two and a half months or so to get chips back from the fab. But then you want to test them, and you potentially want to fix errors. And with these OpenMPW chips, since all the tooling is new, there are still a fair amount of problems in them. So bringing them up is not quite trivial, and they don't fully work. But that's how we're learning. And that's why we do this repeatedly, again and again. We don't need to be fully correct the first time; we just get a couple more tries. So, looking at the future, which is always hard. We're going to see innovation. We're going to see change that nobody predicted. And that's the good thing about making this open source: somebody who has an idea can give it a try and just see if they can do it. Before, if you wanted to do chip design, you pretty much had to be within a large company that did it the way they had always done it. You had to build their product, end of story. So there was not that much unpredictable innovation. Tools are getting more accessible; we can actually get access to them. And finally, we can revolutionize learning about hardware. Before, when I was taking a university course, it was very boring and theoretical. You had course materials from 10 years ago, from somebody who had never seen a chip being made in their life, because they just used the course material that they themselves got 20 years ago. And that's just how teaching evolves.
And that's the teaching we see in very many universities about hardware design. They might have a VHDL or a Verilog course, but it's so far from reality that it's not even funny. So now we can actually learn how to do it the real way. There is some news; I'll leave the links up there. And finally, if you want to take a paid learning course with all those open source tools, there is the Zero to ASIC course from Matt Venn. He actually helped me prepare some of the slides, so have a look at that. If you're looking for in-person events, there is Latch-Up coming up in the US, an open source hardware conference that the FOSSi Foundation has been organizing for a number of years. And there is ORConf coming back after COVID, the main open source hardware conference, in Munich this year. I think we haven't announced that very widely: it's September 15 and 16. And just stay for the Oktoberfest. Just keep your hotel room a couple more days, and this is going to be a great experience. We end with a quote that I found: an atmosphere of excitement and anticipation pervades this field. Workers from many backgrounds, computer scientists, electrical engineers, and physicists, are collaborating on a common problem area which has not yet become classical. This territory is vast and largely unexplored. The rewards for those who simply press forward are great. This was written in 1978. And we are there again with the open source world. Let's make open source chip design a reality. We have all the necessary ingredients. We have all the necessary tools. Let's make it happen. Thank you. |
Fedora Asahi
Fedora for Apple Silicon |
I'm talking about Fedora Asahi, which is Fedora for Apple Silicon. It's funny. I was in the bar the other night talking with Davide and Neil about doing this presentation, and I said, yeah, on that Mac Mini, on about one in ten displays, it just doesn't work. So there is a small chance that I arrive on the day and the HDMI output won't work. So yeah, we hit that issue. I was hoping to do the whole demo directly on the Mac Mini, but we had to go to plan B, on my Chromebook here. But we'll still get through it, and we'll still do all the Q&A. I was hoping to show you a couple of things live, but yeah, we'll have to shelve that. So yeah, I'm Eric Curtin. I work for Red Hat, in the automotive org, so that's what I work on. I had a competition planned, but I have to shelve that because it required the hardware. We'll see. If we have time at the end, I might try and plug in the HDMI one more time. So why do we care about Fedora on Apple Silicon? Apple released new ARM-based Apple Silicon devices, I think it was late 2020. And there's actually a shortage of well-upstreamed ARM devices. What's cool about this one is that the firmware is unlocked out of the box, so it's actually a feature of the Mac devices to run alternative operating systems. Virtualization is also unlocked in the firmware. That's a feature I find quite handy; I run KVM a lot on this Mac Mini. And it's pleasantly fast. I suppose Apple is known for marketing their hardware as premium, which it is, but it's also great value, great bang for the buck in terms of performance. So why do I care? Why did I get involved? To repeat what I said earlier, I work at Red Hat in the automotive org. The automotive boards are ARM-based, so I end up doing quite a bit of work that requires some kind of ARM environment, and working on the Mac Mini allows me to iterate quickly.
And then, as a bonus, I learned more about ARM hardware and software implementations, and things like kernel-space Rust. So these were benchmarks; I did these well over a year ago. I know the chart might be quite small and difficult to see. I was working with libcamera at the time, and I just used it as a program to profile different pieces of hardware I had around the house. So it's basically the minutes it takes to build libcamera, or seconds even. The bar on the very left is a Raspberry Pi 4. The green one is interesting: that was my phone, a mid-range Motorola. I had a Fedora container running on my phone using a piece of software called PRoot. So that's the green one. The yellow one is also a Fedora container running on my phone, but yellow is when my phone is rooted, so I can get an extra little bit of performance out of it. That was using chroot rather than PRoot. The orange one is my company-issued laptop, which is an Intel i7, and the small blue bar at the end is how long it was taking me to build libcamera on my Mac Mini. And finally, about that orange bar: my company-issued laptop is twice the cost of the Mac Mini, so that goes back to the great bang for the buck with the Apple device. So what makes the project great? We have really great upstream folk, and we collaborate with them quite frequently: Hector, Alyssa, Asahi Lina, Dougall, Sven, Mark, and there are many more as well. We have great downstream folk as well. Neil and Davide are actually here; they're big contributors in the whole Fedora ecosystem. Michel helps out too, Leif Liddy is another guy who helps out quite a bit, and there are many more. So this is another thing that I really love about the Asahi community in general. They have this upstream-everything attitude: absolutely everything gets sent to the various upstream projects, if at all possible.
And that's one thing: there are various certifications around ARM devices. I listed just three there that I could find. One is Works With Chromebook, one is the Red Hat Enterprise Linux certification, and the third is SystemReady. One thing that's in the spirit of all those certifications is upstream everything. And yeah, that's one of the core values we have. That's actually not as common as you would think with ARM devices; it's almost an exception when absolutely everything gets upstreamed. So in Fedora Asahi, this is our workflow. Most of the time, things will hit upstream first, then Fedora will package that up, and then the Fedora Asahi Remix will use those packages. That's the common case. We also use this other workflow, because submitting things upstream can take a bit of time, and that's a lot of our work in the Fedora Asahi community. Sometimes work will be submitted upstream, but it might not be accepted yet. So in Fedora Asahi, we'll take those patches and fork whatever packages we need to make sure you have the best experience possible while things are still being upstreamed. Eventually that will make its way into Fedora when it gets upstreamed. I'm going to explain that further in the next few slides. So yeah, obsolescence is success when it comes to Fedora Asahi. Ultimately we plan on getting as much as possible into the main Fedora repository, so every time a forked package becomes obsolete, I regard that as a success. So what do we fork? We fork U-Boot. That's one I'm expecting could be forked almost forever, because we have some Apple Silicon specific stuff in there. I knocked off my mic, two seconds.
So we have U-Boot. We have a package we call Kernel, which is our own kernel, separate from the normal Fedora kernel. We have another kernel we call Kernel Edge. We fork Mesa. We have this firmware package that Hector and the team made, called m1n1, and there's a handful of others. What were the reasons U-Boot is going to stay forked? I think we have some build-time flags, and we don't have a way of changing how it behaves at runtime yet. I could be wrong; it may not remain forked. The other thing about Fedora is that we generally try to avoid maintaining firmware and focus on the operating system side. If you look at our Fedora ARM images, they generally don't get packaged with firmware. We try to avoid getting into maintaining firmware, because it can scale quite badly with the lower-level pieces of software. So, the Fedora kernel. It has Apple Silicon support, it boots, and we continually test and enable more kernel configs as support gets propagated upstream. It's built with 4K page size. That's something interesting in Fedora: we try to build just one kernel per CPU architecture. So, at least the way things currently are, we don't build separate kernels for 4K, 16K and 64K, because, again, scale: it's easier to maintain one kernel per CPU architecture. Something interesting about this kernel is that, at the moment, not everything upstream supports 4K page size on this hardware. That's continually in progress and the upstream folks are working on it, but the hardware is designed to work with 16K page size, so we'll get there. Getting everything working with 16K page size upstream is definitely the first priority. The other thing about 4K page size is that you take a performance hit, because the hardware is tuned for 16K page size, so that's something to bear in mind.
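For readers who want to see what the page size means in practice, here is a small Python sketch, written under the assumption that only the arithmetic matters. It prints the page size of the system it runs on and compares how memory is carved up with 4K and 16K pages. The helper names are invented for this example, and the sketch says nothing about the actual performance difference on Apple Silicon.

```python
import mmap


def pages_needed(nbytes: int, page_size: int) -> int:
    """Number of whole pages required to hold `nbytes` (ceiling division)."""
    return -(-nbytes // page_size)


def round_up_to_page(nbytes: int, page_size: int) -> int:
    """Bytes actually occupied when memory is handed out in whole pages."""
    return pages_needed(nbytes, page_size) * page_size


if __name__ == "__main__":
    # mmap.PAGESIZE reports the page size of the running system.
    print(f"page size of this system: {mmap.PAGESIZE} bytes")

    one_gib = 1 << 30
    for page_size in (4096, 16384):
        print(
            f"{page_size // 1024}K pages: "
            f"a 5000-byte buffer occupies {round_up_to_page(5000, page_size)} bytes, "
            f"1 GiB is split into {pages_needed(one_gib, page_size)} pages"
        )
```

With larger pages the same amount of memory is covered by a quarter of the page table entries, at the cost of more rounding waste for small allocations.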
So yeah, the Fedora Asahi SIG then maintains two kernels, and this is the first one. I called it the Fedora Asahi kernel here. It uses the Fedora kernel as a base, and we then add extra, yet-to-be-upstreamed patches from the Asahi Linux repos. We enable even more kernel configs, we build with 16K page size, and it uses simpledrm, which is software-rendered graphics. That's actually surprisingly fast; I'm always amazed at how fast simpledrm is on hardware like this. So if you're interested in Fedora Asahi from a user perspective, I would recommend this kernel or the next kernel I'm going to talk about, because the user experience is just a bit better; more things work, basically. So this is another kernel we maintain. We created this one not so long ago, just before Christmas. It uses the last kernel I talked about as a base, and we add even more patches and enable even more kernel configs. It uses accelerated graphics. What I had intended to do for this talk was to have a little competition of two people playing SuperTuxKart, just to show off the accelerated graphics, but yeah, HDMI issues. I found this kernel interesting to work with because it's built with the Rust for Linux kernel-space support, and the Asahi GPU driver is one of the first fully fledged Rust for Linux drivers. It's pretty neat, and it works well. Another difference is that we build this kernel with Clang/LLVM, because GCC Rust support is a little bit behind, so at a minimum you have to build the Rust code with Clang/LLVM. I remember playing around with that package at the time, and at that point it was just easier to build the whole code base with Clang/LLVM, including all the C code.
But I think it is possible, if you want hybrid builds, to build the C code with GCC and the Rust code with Clang/LLVM. We switched everything to Clang/LLVM, at least temporarily, because it's easier. And we also use a forked Mesa package that works with the Rust GPU driver. So what's our official release date? Davide, he's here somewhere, I was talking to him before the talk, he's presenting this stuff at SCaLE in March, in about a month's time. I hope he has better luck than me with his HDMI output and that kind of thing. So yeah, Davide is handling our release, and we think we should have an official release out before that. Most of the people working on this are doing it part-time, except for the upstream folk; for some of them it's more or less a full-time job. So we always welcome new contributors. If you're interested, reach out to us on Matrix. Apple, it's actually pretty impressive, seem to be releasing new hardware pretty frequently, and every time they release new hardware there are new things to do, because every piece of hardware has its own nuances. Here is something we were talking about in the last month or two. I don't actually have an M2 device, so I can't test it personally, but WebKit is basically broken there because of a new feature of Arm, in Arm version 8.5, called branch target identification. Basically, someone has to write the code in WebKit to say if BTI, do this, but nobody has done it. Which is interesting: I remember that yesterday, at an Ampere talk, somebody asked whether new Arm versions can break user space. I didn't want to answer because I wasn't speaking, but yes, sometimes they can, and this is one of those cases. And that's kind of it. I have a couple of links there to our Matrix, our wiki, our project tracker, the upstream asahilinux.org page, and our Git and Copr; you'll find some of our RPMs in there if you're interested.
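As a side note on the BTI point above, here is a minimal Python sketch for checking whether a machine reports branch target identification. It assumes a Linux system on 64-bit Arm, where the kernel lists CPU features on a "Features" line in /proc/cpuinfo and reports BTI as the flag "bti". It only shows whether the CPU and kernel advertise the feature; it does not tell you whether a given program such as WebKit handles it. The function names and the sample text are made up for illustration.

```python
from typing import Set


def cpu_features(cpuinfo_text: str) -> Set[str]:
    """Collect the feature flags listed on 'Features' (arm64) or 'flags' (x86) lines."""
    features: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features


def reports_bti(cpuinfo_text: str) -> bool:
    """True if the feature list contains the 'bti' flag."""
    return "bti" in cpu_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            text = handle.read()
    except OSError:
        # Not on Linux: fall back to an invented sample so the sketch still runs.
        text = "processor : 0\nFeatures : fp asimd aes sha2 bti\n"
    print("BTI reported:", reports_bti(text))
```

On hardware older than Arm v8.5 the flag is simply absent, which is why code that works fine on one generation can start failing on the next.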
Yes, so that's it. I'll take Q&A now if anyone has questions, and if not and we have a little bit of time, I might plug in the HDMI cable one more time to see if we get very lucky. Does anybody have questions? At the beginning you mentioned that Apple Silicon is a well-upstreamed ARM device. What do you mean by being well upstreamed? As in, does it have work put back into it? Could you repeat the question? On the first slide you said that Apple Silicon is a well-upstreamed ARM device, which is why you work on it. What does that mean exactly? So that's all thanks to the upstream folk, the Asahi Linux folk. Hector Martin is the leader of that, and he really believes in upstreaming. He tries his best; with certain pieces of code he does run into difficulty, and that's normal, because it spans so many subsystems and so many different projects. But the goal is to upstream absolutely everything. As I pointed out, there are plenty of ARM SoCs that publish their code: they put a GitHub repo out and publish the code there, but often they don't go over that final hurdle and get it into the Linux tree, so that it runs out of the box with Fedora, Debian, Ubuntu, and all the various distributions. And that's one of the things I love about the Asahi guys: they go that extra mile to try to upstream everything. So that's what I meant by that. Hi, you mentioned that you think U-Boot might not be upstreamed. Would it make sense to have a separate project to create a UEFI layer on top of that, to harmonize Fedora in that way? Yeah, I'll be honest, I did not work on the U-Boot stuff myself. I think Hector did most of that. He calls it EFI-like and not exactly EFI, so that might be the reason why it might remain a fork. To be honest, I work more with the downstream folks, so you'd probably have to talk to the upstream Asahi guys about that.
I think it was Hector working on the U-Boot stuff, but I could be wrong. I'll try and plug in the HDMI to my... Hi, thanks for the talk. I would like to know how it is to use a new programming language, Rust, in the kernel and in Mesa. Is it going to be supported upstream? I think it's amazing. I'm not going to lie to you and say I've written thousands and thousands of lines of Rust, because I haven't, but building it is easy as long as you're using LLVM at the moment. I believe GCC started to release Rust support recently, so I expect GCC to get there as well eventually. So building it isn't too bad now, and once built it works solidly; I've never had any crashes or anything. Hector and all those upstream guys swear by it. They reckon they got that GPU driver written twice as quickly just by using Rust, by not always having to handle memory management manually and all these things, and by having the language deal with thread races and so on. I think another reason they chose Rust is that, when they were reverse engineering the Apple GPU driver, I think it was written in C++, so it made things a little bit easier for them because Rust has some of the features of C++. But they swear by it, and they're the guys that actually write the Rust kernel patches, not me. Do you see a need or a demand for Apple Silicon to run Linux servers in real scenarios, in companies and this kind of stuff? Like in enterprise? Yes. Enterprise is tricky because we're very much a community-supported effort. Apple don't really have an issue with us doing this, but we also don't have an official relationship with Apple, so it would be hard to deploy Linux on Apple Silicon in a data center when you don't have official support from the hardware manufacturer. So at the moment I don't see that happening, unless Apple all of a sudden say, yeah, we'll support that configuration in an enterprise environment. Could happen. Where did the HDMI cable go? 
I'm just going to plug it in there. Sorry guys. Hi. Sorry, I was just wondering: you mentioned that for U-Boot you weren't going to push things upstream from your fork. Is that a licensing issue, like a GPL conflict or something like that, or what's the problem? Could you repeat the question a little bit louder? I think earlier you said that for U-Boot you would be maintaining a fork. Is that a licensing issue, like is there something you can't commit back to U-Boot that they wouldn't accept? Again, I didn't write the U-Boot patches; I'm pretty sure that was Hector. I think it's because it's written in a non-standard way. He often calls it UEFI-like, and I think there are certain hacks he had to do to get it working on Apple Silicon that the maintainers may not accept. But I will repeat: the upstream Asahi community really cares about upstreaming into the real projects, be it the Linux kernel, Mesa, etc. So if at all possible, I'm sure they will. Sometimes there are just hacks required, because the hardware is built around macOS at the end of the day. More questions? I'll be around for the day. If any of you see a monitor around the campus and you want to see it in action, we can hook it up and try; I'm all for that. I've seen this happen before: about 90% of monitors work, so if you find a random monitor, we can do it. Okay, so thank you, Eric. Thanks guys, thanks very much. Thank you. |
DNF5: the new era in RPM software management
How we rewrote the codebase and started loving the community |
Okay. So, hello. Let me introduce us. As you may have heard, we are from Red Hat, from Brno, actually, and we are going to talk about DNF5, a new generation of DNF, as we have in the title. We have the talk split into three sections. The first one is covered by me; it will be a technical overview. Then there is community history and action items by Nicola. And finally, a live demo from Jan. First, I would like to explain what we are actually talking about. I imagine most of you already know this, but DNF is a package manager. Probably the easiest way to explain this to someone is to compare it to an app store, or a command-line app store. It installs, upgrades and removes packages and their dependencies, and stuff like that. There are many examples of package managers you might know, and we are working on DNF and microdnf. To put this into some more context, we have this diagram that actually describes Fedora, but I think many distributions have a similar setup. You can see that it's possible to interact with package management in many ways, or on many levels. Here we are talking about the high-level package manager, and that is DNF or LibDNF. If we focus on that section, the diagram of components for the current version looks something like this. There are some problems with this. At first glance, I think it already looks more complicated than it needs to be. But mainly, you can see that LibDNF, the library, is split into two sections. DNF is actually using just the hawkey section, but not the libhif one, while microdnf is using both. This would be fine if there was some extra functionality in microdnf, but it is actually the other way around. microdnf does less than DNF, and this is because we have duplicate code there. That's, of course, not good, so we should fix that. The other big issue is with the extensions, or plugins. 
Most of our plugins are for DNF, and they are in Python, but as you can see from the diagram, it's not possible to use the same extensions in microdnf. There's simply no way to get to them. And the other way around: if you have some extension for libhif, it doesn't get used in DNF. This is, again, not good, because we already have some extensions that we want in both, and we have to duplicate them. To resolve that, we are creating DNF5. I should mention that when I say DNF5 here, I mean this whole part of the stack: the library, the plugins, and the actual command-line DNF tool. The new diagram looks like this. It is much simpler. We have merged the insides of the library into one piece, and we have also merged DNF and microdnf into just one tool, which is here called DNF5. There's also a new DNF daemon. We still have two places to plug in plugins or extensions, but they are much more clearly separated. For the library, we have more passive extensions that get used automatically: every time you use the library, they get loaded and run. And for DNF5, there are more active plugins. Typically this is a new command that you can add, and the user actually has to type it and run it. Okay. Another two big features are that we are breaking the API, and that it's not backwards-compatible. So, okay, these are not exactly features, and you might ask why we are doing this. Let me try to explain or justify it. First, for the library, we completely restructured the API and tried to make it better and, most importantly, unified and safer, because before, like I mentioned, it was a bunch of things merged together, and it wasn't that great. So hopefully, this time, we have learned from our mistakes and made it better. Another change we are making is inside the library. 
For example, we are now loading and downloading the repositories at the same time. This is because downloading the repositories is typically network intensive, while loading them, on the other hand, requires CPU, so the two match nicely and can be done at the same time. Another change is that we don't download the filelists metadata by default. If you are not familiar with it, the filelists metadata contains a list of every file in all of the RPMs, so it is quite a big file. We think it's possible to get by without it, as other distributions do, so we don't download it by default. Of course, the user can still download it on demand, or even configure it so that it's downloaded every time. Then we are, of course, trying to make everything faster in general, and I think we will see this in the demo later. And we have bindings for the new library. We mainly focus on Python, because that's what most of our users use, but thanks to SWIG it should be possible to generate bindings for other languages like Ruby too. It will need some work, but hopefully it wouldn't be too bad. Now, I'm moving to the actual command-line package manager, DNF5. Probably one of the biggest changes is that we dropped Python as a dependency, and this is actually what allowed us to merge microdnf and DNF together. Before, we needed microdnf for containers and minimal environments where you need a smaller footprint, but since we no longer need Python and everything is in C++, we have the best of both worlds. Not everything has changed. Most of the commands are actually the same, except for a couple of differences that needed tweaking or fixing. On the other hand, the outputs of the commands did change. In fact, we still run the same CI tests for DNF5 that we run for DNF4, and usually you only have to change the checking of the outputs to make them work. 
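The overlap of downloading and loading described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DNF5's implementation: `download`, `load` and the repository names are hypothetical stand-ins, with a short sleep playing the role of the network and a string transformation playing the role of CPU-bound parsing.

```python
import queue
import threading
import time


def download(repo):
    """Hypothetical stand-in for a network-bound metadata download."""
    time.sleep(0.05)  # pretend we are waiting on the network
    return f"{repo}-metadata"


def load(metadata):
    """Hypothetical stand-in for CPU-bound parsing of downloaded metadata."""
    return metadata.upper()


def download_and_load(repos):
    """Load each repo as soon as its download finishes, while later downloads continue."""
    ready = queue.Queue()

    def downloader():
        for repo in repos:
            ready.put((repo, download(repo)))
        ready.put(None)  # sentinel: nothing more to download

    thread = threading.Thread(target=downloader)
    thread.start()

    loaded = {}
    # The main thread parses finished downloads while the worker keeps fetching.
    while (item := ready.get()) is not None:
        repo, metadata = item
        loaded[repo] = load(metadata)
    thread.join()
    return loaded


if __name__ == "__main__":
    print(download_and_load(["fedora", "updates"]))
```

The queue is what lets the two kinds of work overlap: the downloader thread spends its time waiting on I/O, so the main thread is free to use the CPU for loading whatever has already arrived.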
Not all of the tests work yet, but we are still developing it and working on that. Then there is the daemon. Just really quickly: it is already accessible over D-Bus, and since it uses the same library, it will have the same functions, so you can use it to work with groups or modules and stuff like that. Okay, last thing from me: I want to mention a couple of additional improvements. I'm not going to go into too many details, but we have configurable aliases, fully integrated modularity, and a single configuration for all the users of the library. We managed to separate the system state from the history database and the module state, and, for example, we have built-in autocomplete. There is the address, and if you have questions about this or anything else, you can of course ask us after the presentation. Okay, so that was everything from me, and let me now hand it over to Nicola and the community part. Thank you. So after the technical overview, let's focus more on who is actually contributing to DNF5, and on how we are changing not just our idea of the code base but also our approach to the community. Of course, DNF5 is an upstream project, so it all starts from upstream. It was, I would say, a bit chaotic in the past, because the components were separated. Issues were not enabled for DNF, for example, for quite a while. Now they are, but the path we were taking with the community wasn't super clear: many issues stayed open for a while, and pull requests weren't always reviewed. We didn't do great, but we plan to change that. So let me give you an overview of the past of DNF contributions, then we will look at what we expect in the future, and then at the action items the team has to make everything more transparent and the whole workflow much clearer. So first of all, the past of the projects. 
So I'm saying projects because here I graphed the weekly contributions by author for DNF, LibDNF and DNF5. What I want to pull out from this graph is, first of all, that the effort of the team is now all focused on DNF5: as you see, all the recent contributions are in the green part. One other interesting thing is that, yes, the DNF5 effort is higher compared to DNF or LibDNF alone, but if you sum the highest points of DNF and LibDNF, the DNF5 effort is actually lower. That's because you only have to maintain one unified code base, and doing just one thing is much, much easier. I also want to show you how many people actually contributed over the history of the project. On the Y axis is the number of people who made the number of contributions shown on the X axis. So there is one exceptional guy at around 200, there are some others, one each, at 150 and 125, and almost everyone else made a very small number. That's a fact, and that's how upstream works: you have exceptional core team members who do most of the job and pull the effort along. But it would actually be good to grow the community a bit more. To give you some more perspective, here every bar is a different contributor, and yet again there are just a few people who made a huge number of contributions, and many, many people who made just a few, like one, which of course are important. But what does it mean for us to have someone from outside contributing? I had this idea of splitting the contributors into groups, and here is my analysis. There are of course the authors of the project, each responsible for, let's say, 10% of the total number of commits or contributions, and there are one, two, maybe five at most in a project. Those are the key people; they are there from the beginning. I'm talking
generally, but this applies to our components. Then there are paid programmers, co-authors and exceptional contributors who make 100 or so commits, which is quite a number. Then of course there are the one-liners, the people who come and say, hey, there's a typo here. Super important as well, but they come and go and you will never see them again. They might not even be interested in the project; sometimes they just come, make one small fix and go away. And then there are who I call the people in the middle: the people who can effectively contribute by following the project and by coming in more than once. They point out, hey, I have this request, or they open an issue, and then they get interested in working on the project and make active commits, let's say. So they bring in some code, or maybe delete some code; deleting code is very important. So let's look at how we perceive contributions from inside the team. For DNF, still talking about weekly contributors, we had a total of 225 contributors: 4 authors, so 4 people who each did more than 10% of the whole work, and 19 very good people who did around 100 commits. Those people are regulars; they have been there forever, and you know that they are there helping you out. But let's focus on the others. How many commits are there in total, and how many were made without those people? The number is roughly 10,000 total commits. Oh sorry, I'm talking about contributions, so it's a little binned and the numbers are smaller, but the statistic still applies. Roughly 4,000 commits come from people other than the authors, and roughly 1,000 commits are done just by those people who stay in the middle, more or less. So what is your effort to be, let's say, an active member of the community? I said, okay, I want to do 1% of this. That's pretty good. You do roughly 1, sorry, 11 commits, 11 issues, 11 whatever, 11 comments, 
and you are actually 1%. If you know 100 people, you definitely notice whether one person is missing or not, or, I don't know, that someone changed their hair to pink, because 1% is quite a lot. So it's very easy. And I'm talking about 11 contributions over the whole history of DNF, and DNF has been active for about 10 years, so it's about one contribution per year. It's quite easy, right? Now it's time to jump to DNF5, because if you look at DNF5, the same applies. It's basically just a team effort: roughly 5 authors, a number that will decrease with time, and 6 very good contributors, and then those are the numbers of contributions. Again, if you make even one or two contributions, we value that kind of contribution, whatever it is, quite a lot. It's significant; this is what I'm trying to bring up. So what kind of contributions are we talking about? We have a transparent workflow. We have open issues now, and we triage all the issues weekly, which, again, we were not doing for DNF for some time. We have public milestones on GitHub, so you can understand what our action plans are. We also have discussions open, which we use for announcements and for gathering opinions. The DNF5 naming, for example, is discussed on GitHub, so you can go there and give your opinion. And questions, of course: if you have a question now, you can ask it; if you have a question later, there will be a link to GitHub where you can just go and ask. Of course, we have Bugzilla for Fedora tracking, I mean downstream requests or Fedora-specific things, but upstream is where we start, and upstream is where we are trying to open up to the community. A bit more on that: we now have better documentation. The documentation is actually generated from the code base and it is tested, so every snippet that's there gets compiled if it's C++, or interpreted if it's Python, and so it's much more, 
let's say, reliable documentation. I'm not saying what we expect to get from that, because you never know, but we now have tutorials, we have code templates, we have a lot more material that a newcomer could use to start contributing to DNF. We are also planning to add contributing guides, which are missing for now, to improve the contribution workflow on GitHub yet again, and to open first-time issues, which are also missing, but we are targeting that. So just to conclude my part: who is the community? For us, the community is the people who create issues. I mean, not create problems; create GitHub issues, right? It's the people who add or remove code, and the people who take part in the discussions, even now, even just by coming to us and saying, hey, DNF5, you know what, I tried it and it worked, or it broke my system. It can happen; it's not perfect. It's the people who raise questions, like now, or who reach back to us on the mailing list. The more questions we get, the more reliably we can provide code that actually works for most use cases. And just by using DNF5, you are part of the community. Most importantly, and I really feel this quite a lot, because when I started working on DNF I actually needed help from the team, I believe that many people didn't quite dive into the contribution system because they didn't have help. So we really need people who help and who ask for help, and that will lead to all the other points. Well, I hope it's now much clearer how we intend to work in the future. Now let's see how DNF5 works in practice with the demo. Here you go, Jan will walk you through the demo. Okay, so hopefully it will work. Let's try some hands-on usage. What do we have here? 
So we have two separate containers, both running Fedora Linux, and they have the same configuration. On the left side, I will show you the examples using the original DNF command, and on the right side, I will show you the same examples using the DNF5 command. Let's start with something simple and just install a package. So we will try to install this package, hopefully the Wi-Fi will work, and I will try the same on the right side. Okay, and now, what's going on? We need to download the metadata about the packages, so basically what the packages are, and what their dependencies and requirements are. The repositories are defined in the system configuration, and now we just need to pull all the metadata from the repositories. Here we can see the first difference between DNF and DNF5, which Aleš was already talking about, and it's the size of the metadata. In DNF, everything was hard-coded and all metadata was always downloaded, so there is about 64 megabytes for the Fedora 37 repository. If we look at DNF5, there is just 21 megabytes, and that's because the filelists are not downloaded by default, because usually we don't need them. Okay, we can see some differences in the outputs. For DNF5, we have some more information there, like which packages are being replaced. It basically looks very similar, but the output should be a little bit more convenient for the user. So let's confirm the installation of the packages, and, yeah, that's it, basically. 
Another example: Aleš was talking about auto-completion, so this was also improved. DNF already had some auto-completion. If we try, for example, the mark command, I will type m and press Tab twice, and we can see that DNF offers the commands for us. Okay, that's fine, but the mark command also has sub-commands. If I press Tab twice again, oh, it shows me the installed packages on the system, as if these were the arguments I should use. But if we look at the help, we need to use one of those three sub-commands, and these are not suggested by DNF. This was improved for DNF5. If we type m and press Tab twice, we can already see a difference: there is also a brief description of all the commands that are offered to us. If we now press Tab twice again, we can also see the sub-commands there. And of course, there is also auto-completion of the options, so we can press Tab twice again and see all the parameters that are relevant to this command, which could be quite convenient. 
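The completion behaviour shown in the demo (commands offered with a short description, then sub-commands once a command has been typed) can be modelled with a small lookup table. The sketch below is a toy model in Python and not how DNF5 implements completion; the `COMMANDS` table, its help texts and its sub-command names are placeholders chosen for illustration.

```python
# Placeholder command table: names, help texts and sub-commands are
# illustrative assumptions, not DNF5's actual definitions.
COMMANDS = {
    "makecache": {"help": "generate the metadata cache", "sub": []},
    "mark": {"help": "change the reason a package is installed",
             "sub": ["install", "remove", "group"]},
}


def complete(words):
    """Return completion candidates for a partially typed command line.

    `words` holds the tokens typed after the program name and must contain
    at least one entry; the last one is the prefix being completed
    (use "" when Tab is pressed after a space).
    """
    *done, prefix = words
    if not done:
        # Completing the command itself: offer each name with its description.
        return [f"{name}  ({info['help']})"
                for name, info in sorted(COMMANDS.items())
                if name.startswith(prefix)]
    info = COMMANDS.get(done[0])
    if info is None:
        return []
    # A command is already typed: offer its sub-commands, not package names.
    return [sub for sub in info["sub"] if sub.startswith(prefix)]


if __name__ == "__main__":
    print(complete(["m"]))
    print(complete(["mark", ""]))
```

Keeping the sub-commands next to each command in one table is what lets the second Tab press suggest them instead of falling back to a list of installed packages.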
So now let's try a performance comparison. The repoquery command, for example, was improved quite a lot. It's basically the command for querying packages based on some keywords, filtering them, and so on. For this example, we will use just the empty repoquery, which basically lists all the available packages in the repositories. So let's try that, and run it for dnf5 as well. We can see that dnf5 was a little bit faster, but let's measure it exactly, so let's use the time command, and the same for dnf5. We can see that for dnf it was almost three seconds, and for dnf5 it was one second, so it's a little bit faster there. We can also try a more advanced repoquery: we can list all the packages that depend on the curl library, and there should be quite a big difference. So, whatdepends, right. I will try to run them in parallel, and we will see. Okay, so here it is about six seconds for dnf5, which is quite fast, and for dnf it will be three times longer, I think. Okay, 20 seconds. So that's it for the CLI demo, and we can move on to the API. You could ask, why should I use the API? Of course, it depends on the use case, but usually, if your project is written in one of the scripting languages we have bindings for, like Python, it could be much more convenient to use the API directly, and it's also more powerful than the CLI. Some common use cases are simpler to write with the command line, though, so in the end a client sometimes uses both approaches. Okay, so let's see a simple API example. Let's say we have our favorite package, and we want to know the latest available version of this package in different Fedora releases. Could be useful. So let's see step by step what we need to do. If we have installed the devel packages and the API bindings following our tutorials on the GitHub pages, importing the Python module is super 
simple. Then we need to set up the session for dnf and run an initialization method. Then we need to override the release version variable, which is basically used in the repository URL, so we always look only at the packages of the respective Fedora release. Then we need to load the repositories from the configuration and prepare the corresponding objects for Python. Then we need to prepare the query, what we will actually be querying. This is also quite simple: there is a package query object, and we just filter all the packages containing the kernel name, and we always want the latest version. That's simple. If we find any packages matching kernel, we just print the name of the package, otherwise "not available". Okay, let's try it, running this simple script. For Fedora 35 there is this kernel version; for Fedora 36 there will probably be a newer one, it's taking long; and for Fedora 37 it should also output something a little bit newer. Okay, now we can try to edit our script and show you whether some dnf5 package is already available. But we need to add here a Rawhide version, the development version of Fedora. Doesn't work. I don't know how to... does anyone know how to exit Vim? 
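The logic of the script walked through above (for each Fedora release, filter packages by name, keep the latest version, print it or "not available") can be modelled without the real bindings. The sketch below deliberately does not use the libdnf5 API: `Package`, `REPOS`, `latest` and `report` are hypothetical names, the repository contents are made up, and versions are compared as plain tuples, which is a simplification of real RPM version comparison.

```python
from collections import namedtuple

# Hypothetical package record; real RPM packages carry far more fields.
Package = namedtuple("Package", ["name", "version", "release"])

# Made-up repository contents keyed by release version; not real Fedora data.
REPOS = {
    "35": [Package("kernel", (5, 14, 10), 300), Package("kernel", (5, 14, 9), 300)],
    "36": [Package("kernel", (5, 17, 5), 300), Package("bash", (5, 1, 16), 2)],
    "37": [Package("kernel", (6, 0, 7), 301)],
}


def latest(packages, name):
    """Return the newest package called `name`, or None if there is none."""
    matches = [p for p in packages if p.name == name]
    # Simplified ordering: compare (version, release) tuples directly.
    return max(matches, key=lambda p: (p.version, p.release), default=None)


def report(repos, name):
    """Build one line per release: the latest version found, or 'not available'."""
    lines = []
    for releasever in sorted(repos):
        pkg = latest(repos[releasever], name)
        if pkg is None:
            lines.append(f"Fedora {releasever}: {name} not available")
        else:
            version = ".".join(map(str, pkg.version))
            lines.append(f"Fedora {releasever}: {pkg.name}-{version}-{pkg.release}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(REPOS, "kernel")))
    print("\n".join(report(REPOS, "dnf5")))
```

In the real script, the per-release dictionary lookup is replaced by overriding the release version variable and loading the repositories again, and the filtering is done by the package query object instead of a list comprehension.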
I couldn't resist. I need Escape. It's a different laptop, and the thing is that it's a different laptop because the HDMI wasn't working. So, I don't know, Escape is not working. Thank you. If anyone has any questions... we will try to move on, and maybe we can move on. Okay, so you will never know if dnf5 is actually in some Fedora. Well, guess what, we will say it anyway. But it's always exiting it. Oh, maybe, maybe I got it. So in Fedora 35 there is no dnf5, in Fedora 36 no dnf5, it's taking some time, I don't know, maybe just... oh, it's there. Okay, we can move to a more challenging example. What if we want to add our awesome new command to dnf5, so that the user installs it, just types dnf5 and our command, and it does the magic? It's quite simple. We can show how to write a simple dnf5 plugin in about 5 minutes. So what do we need? We need the devel and binding packages. That's no problem; we can install them following the tutorials. We need some template sources. Okay, you can steal our dnf5 plugin sources and use them. And you need some build tools and a build script, and you can also steal the CMake script we have for the dnf5 plugins. No problem; everything is in our dnf5 repository. Now, how to write it? You can follow the structure of our dnf5 plugins. There is basically always a definition file, the header; the actual implementation, the .cpp file; and then the adapter for the plugin API, which is boilerplate code where you just need to change a few lines and it will work. So let's look at this C++ example. There is the implementation of a really simple example command, and it follows the command interface from dnf5, so there are some methods for setting up the command that we need to implement. In this step we need to tell dnf5 that we have a new command, so we basically register our new command with dnf5. This is super simple and can be copied from other plugins. Then there is setting the 
argument parser, basically defining what the arguments for our new command are, listing all the positional arguments and so on. In this simple example we don't have any, so we just set a simple description. In the configure section we can modify the dnf5 session, change some configuration, or request some optional metadata to be downloaded; here we are just going to be quiet. And there is the actual implementation of the command in the run method, where we will just output something to the terminal. Here I will briefly show the adapter code. This is the boilerplate; you just copy it and change a couple of things, like the plugin name, the version, the author name and the name of the plugin. Then there is this long block of boilerplate code that is the same for all the plugins. Each plugin consists of one or more commands, so we just put our new command into the list, and finally we need to return an instance of our new plugin class. And that's it. So let's try to build it and run it. First I will show you that there is no example command for dnf5 right now, so I'm not cheating, right? And here are the sources. There is the .cpp file I was showing, then there is the header file, which is basically just the definition, the adapter code in the example plugin .cpp, and the build script for CMake, which is stolen from the dnf5 sources. Okay, so we just need to create a build folder, and then we will do everything to build it. There is again some weird keyboard. Wow. This is not an English keyboard. I need the ampersand. English, this one, how to... where is it? There we are, nice. Oh, how did that happen? What happened? I don't know. Show me again. Shift and... This is the Italian keyboard, so don't look at this, just look at the number. Oh, sorry, it is printed in Italian, sorry. So make, make install. So basically we just build the sources, and the installation 
is just copying the library into the correct place where dnf5 looks for it. So building is done, and we can try running the dnf5 example, and yes, we did it. And that's it. Okay, so when will dnf5 be released? As you discovered already, dnf5 is actually already in Fedora Rawhide, so 38, and there is a plan to... well, in a moment. First of all, if you're running stable Fedora 36 or 37, you can use a Copr repository to try it, and it will be a parallel installation alongside dnf. Also, here are all the sources that we have. You can find everything in the GitHub repo; that's the only link you need, everything is there, and we have all the links copied into the talk information. If you run Fedora Rawhide, or if you're in a Rawhide container and you want to try it, it's there, so just dnf install dnf5 and you're good. And something more about when it will actually land in Fedora: there is a plan to do a major upgrade of microdnf, which will actually become a symlink to dnf5, so it will be replaced in Fedora 38. In Fedora 39, the plan is to replace dnf with dnf5. So it's coming, and we hope to be ready for that. That's it, thank you, and if you have questions, we have time. Any questions? Thank you so much for this presentation. My question is related to community. You demonstrated that there are quite a few people who are not making more than 1, 2, 3 or 4 commits. Can you tell me what your plan is to help people contribute more? How are you going to improve the contributor experience? 
Thank you for the question. The idea is, once again, to make it simpler and to be more proactive in helping people. In the past we weren't very responsive, and people got pissed because their code was just ignored and their questions were not treated as being as important as maintaining dnf, or at least that was the perception from the outside. Now, every week, we go through all the discussions, all the new issues and all the new PRs from the community, in all our components, and we set an action plan for them. We set milestones on every one of them, like we want this to land in Fedora 38, or we want this to land in Fedora 39, so you know what our priorities are. And we take PRs into serious consideration, so if you just drop some code, it's very easy for us to review it and be proactive about it. Once again, it's done weekly and it's a priority for us. Our workflow has changed. I hope that answers it. Thank you. Just technical curiosity: is the new dnf5 daemon intended to replace the PackageKit daemon? So it can. We did start with that in mind, because PackageKit was officially deprecated before, so that was one of the goals. But lately I've seen that PackageKit is maintained, and there is even an issue on our page from someone who wants to create a PackageKit back end for dnf5, for the new API, so I'm not sure how it will turn out, but we will see. It was the plan originally. It could... sorry, if you want to, yeah. 
The thing is, PackageKit works, so it's not that easy to... okay, well, I would say that for now it gets the job done, but still, there's an alternative. Actually, for the first enablement of the dnf daemon on the system, it will probably be other, less critical components that adapt to the new dnf, or actually the new API. We're looking at Cockpit, for example, which could use the daemon, or Anaconda maybe could use it. So it's not just PackageKit for which the daemon would be relevant, let's say that. Just finally, we would like to replace it because it could provide, for example, like I mentioned, module support or group support even to GNOME Software, which is currently not possible, and that would be nice, yeah. So any other questions? Hello, I also have a community participation question. Are you talking to, or have you considered reaching out to, other RPM-based distributions beyond Fedora and CentOS, like OpenMandriva or SUSE perhaps? Thank you. I'm not in contact, because no one here is a project manager or team lead, let's say that. I would let another member of our team answer that, but yes, of course, the RPM distributions are taken into account, and all the relatives of Fedora are of course taken into account. The thing is, for now, let's say it's quite a young project, dnf5; it's just two years since development started. So I would say that this is the reason why we're just talking about Fedora, Fedora, Fedora, Fedora, because this is our upstream where we do the testing, and our CI runs on Fedora. And, well, I think that we would definitely be very happy to see dnf5 land in other RPM distributions. I think that's it, no more questions? I think that's it. Okay, thank you again, thank you everyone. |
Maker Tools in the Browser
CAM to 3D Printing: Zero Install, Always Up to Date |
I've been involved with, wait, I'm Stewart Allen, here, you can reach me at that email address. I've been involved with open source since the early 90s, dabbled in Linux kernel development, but mostly my projects have been standalone applications and libraries. I've been the technical founder of several companies over the last 30 years, and I've managed to thread open source work through those companies and got them to release pretty significant packages as open source, internal works like data processing libraries and things like that. And over the last 10 years, I've become increasingly enamored with the idea of not having to install software anymore, at least not on the desktop. So I'm going to walk you through a little bit of backstory for that. I don't like slides, so most of this will actually be a live demo of different applications, walking you through the functionality, but I will walk through just a little bit of the technology and the backstory first before hopping into this. So it's going to be a bit of a high-wire act. Let's see here, bullet points, excellent. So this is just the rough order of things that I'm going to talk about. I'm not good at following directions, even when I've written them, maybe especially when I've written them. So I might wander a little bit; keep me on topic if I do ramble, or if I talk too quickly. Definitely let me know. I'm prone to do that, which is why I didn't have any espresso today, at least not until this is over. Okay, so on to a little bit of backstory. All right, since we are talking about maker tools, I wanted to just have a slide about making, but not dwell on it too long. When you're organizing slides, it always turns into something other than what you started out with. I had an idea when I went into this, and then what came out of it was kind of different, with the exception of the demo parts.
It occurred to me that the proliferation of really powerful and accessible tools has turned really everybody into makers. And this is historically anomalous: we've had tools in the past, but the kind of tools we have now are really incredibly powerful. My starting point as a maker was in the 70s. My older brother purchased a TRS-80 Model 3 and abandoned it, so it became mine. And from that point, I basically learned BASIC by typing in programs out of Byte magazine, and so I was a touch typist by the time I was 10, because I was like this, trying to get programs in faster. And that terminal window was, for me, a portal into an entirely new universe, and I can't really overstate how profound that was for me at the time as I learned to code, exploring this whole new world that you could just sort of create on the fly in front of you. And you had to use your imagination a little more than you do today, you know, with 3D and all the great stuff we have, and VR. So imagination can be a powerful thing. I've been giving away software since about the time that I learned to code. Back then we called it shareware, and we used bulletin boards to get our creations out of the nest at an incredibly slow rate, but they would make it around the world, based on the letters and even checks that I got from around the world at the time. And unfortunately, I was like 12 when I did this, and I made up a company, and I put it on the software title screen, and people sent me checks written out to that company. It didn't exist, and I couldn't cash them. So lesson learned. Let's see, back to the point. The difference at this time in history is that we are all makers to some extent, from, I would say, the ridiculous to the profound, as I have here in a bullet point. And I would have said that before ChatGPT, Midjourney, and all the other kind of stuff that's out there today; that just serves to underscore the point, right?
We can all be creators in a way that has really not been possible before. I guess that's sort of a debate. Let's see. Did I miss a slide? No, I didn't. This is the first time I'm using this, and the first time I'm presenting in like five years, so bear with me as I sort of like find my footing. Okay. I don't actually read the bullet points. The things that I say are, these are just inspirational things. The things that I say are just related to that. Why are browsers called browsers? We haven't called PCs word processors since the 80s, and it has to do with where they came from. I'm sure that there are a lot of people here who could better tell the story and the history of browsers, how they came to be, and the motivations of the people behind them, but I think it's safe to say that the early creators were focused on document hyperlinking, sharing of knowledge, not progressive web apps, and the browser sandbox, which is where we are today, many decades later. The reason it took us so long to get here is because we did start off from a document-centric view of the web, and that outside of mostly closed self-referential sites like Wikipedia, I don't think that hyperlinking really is what it was envisioned to be. Link rot is sort of the order of the day. What has been far more successful is the evolution of the browser as an application platform. Okay, so this brings us to sort of the inherent and natural tensions of today. I've grown skeptical of the way that we distribute software, the software itself, pretty weary of uninstalling, reinstalling, cleaning up and organizing. Do you really know what's in the software that you're installing on your machine, what it contains, and to what extent should you have to worry at all about this, right? Containerization on the server side has not really delivered us from complexity, which is totally a topic for another day, but since we are talking about containerization, it's related. 
Decades on, there's still really no cross-platform application development environment that really works across platforms. I spent way too much time in the Java world, mostly enterprise, but even their take on AWT and all the other things that came after that were pretty horrendous. The toolkits that you can use now to try to get cross-platform really don't work. And mobile, tablets, touch, and new interfaces further complicate the ability to create applications that run in multiple different formats, and the cloud has created more tensions around data privacy and questions of individual agency. So hopefully, let's see if I actually address this in the next slide, this brings us to progressive web apps. I don't know if you know, or if you've heard of this term. You know, HTML5 and CSS: progressive web apps are basically web pages that can be self-standing at the end of the day. If you add a manifest and some service workers, they'll cache their own data, and they can be installed and run offline. And so this is why, though my background has been in high-performance computing and embedded systems, I've ended up on the other end of the spectrum building web-based applications. There are a lot of people who will question the performance of JavaScript, and JavaScript in general, in the browser. I hope to address all of that. You know, my coding arc, as I said, started with BASIC, and it went through C, Tcl/Tk, Perl, other scripting languages, embedded systems, Java for a couple of decades, and a detour through Verilog, doing FPGAs and some of these real-time systems for quantum computers, actually, before landing on JavaScript, oddly, while doing a prototype around 2010. So over the years, I've spent way too much time struggling with cross-platform and UI issues, and I think that open-source progressive web apps are really a solution for a lot of these problems.
So, let's see, they are deemed trusted because they're running in the browser sandbox, and I lost my place on that because I wrote way too many notes for this slide, so give me a second while I find where I was. I'm going to skip this, and I'm going to spend more time on the demo. You don't want to hear me talk about this much more. I will say that Android and Chromium are friendlier platforms than iOS. Apple is not really keen on progressive web apps because they compete with the App Store. If you think about it, if you have really good progressive web apps that you can install off a web page and run offline, with access to all the modern web standards like WebUSB and Web Serial and all these other things, you don't really need the App Store and app frameworks as much. And so they've done a pretty good job of trying to hobble WebKit and enforce its use on iOS. You used to be able to run Chromium on iOS without WebKit and you had all these things, and they said, no, for security reasons we can't do that. Well, that's not really the case, but it does serve the purpose of enforcing the App Store hegemony. All right, so all the apps I'm going to be presenting today are single-page progressive web apps, installable, runnable offline, fully open source, MIT licensed, and they've been around for quite a while in some cases. I'm not sure if I'm going to spend a ton of time on this; these are just the pros and cons as I see them. For me, faster development time is the thing that I care the most about. I'm basically a page load away from testing new code, which is a sub-second process for the most part, as you'll see. I spent a year, one day, trying to add a small feature to a popular open source Python package, and the dev-test-restart process took three minutes, and I was like, this is insane. Here, I basically modify code, I hit reload, and I'm testing it immediately.
Developer tools, especially Chromium's, are shockingly good. I'm just going to leave it at that for right now. All right, so this is setting up, or teeing up, the first application that I'm going to show you. I promise I'm almost to the demo. A lot of my early making was in CNC, solid modeling, electronics, robotics, and things like that; autonomous vehicles, sorry. And then around 2012, I started getting heavily involved in 3D printing, and you'll see a couple of my early projects here. Slicers of the day were painfully slow, and they produced output that was suboptimal for the things that I was trying to make and the strength that I wanted. I was working on engineered materials, scaled-up, sort of Lego-like things like this, and I was like, I've been coding for a long time, how hard could it be to write a better slicer, right? That's sort of the genesis of a lot of things like this: this is annoying me, how hard could it be to make a better one? And so, you know, I started to build a prototype, and in a couple of weeks, I had basically a slicer that did everything that I needed it to do, which was good for me, and that was 10 years ago, so you know you're never done with these things. But just to give you an example, with the slicers of the time it would take 20 minutes to make a single cube or block like one of these, and I was doing scaled-up, large-scale models. My slicer would produce output that printed the same part in 11 minutes, and the part used less material and was mechanically stronger, would bear more load, and did a lot of things that I wanted. So I was like, already a success, two weeks in. Again, this was 10 years ago.
Slicers have come a long way since then, and I'm not sure how much I want to get into that, but again, let's see, I'm going to skip the next slide. Wait, this, then this. I have this dual screen setup, and it's hard to know what's going on here because this is not totally in sync with that. This is a timeline of some of the tools I've been building since then. Kiri:Moto is a multimodal application that runs in the browser. It does FDM slicing for 3D printing, does CAM, does laser, does MSLA, which is for resin printing, and does drag knives, which is part of laser. So basically it's an all-in-one maker tool for most of the things that you would commonly use in your maker space. Because I was doing a lot of large-scale printing, I ended up designing and building my own printer for my printer farm, which is also open source, open design. I ended up writing my own G-code sender for that, and a whole bunch of other things. And then around 2015, Onshape came along. My CAD program of preference was SOLIDWORKS for 15 years, and those guys left and started Onshape. Just a little bit of backstory there: SOLIDWORKS promised since the very first day they came out that they would eventually have a Mac version of their software, and they didn't deliver on that after 15 years, so the guys left, created Onshape, and now it runs on a Mac, in a browser, and is better. So they opened an app store, and I was like, this is perfect, I could integrate Kiri:Moto into Onshape, access their APIs, and have a connection between CAD and CAM in one thing. And I did that: I got access to their APIs and integrated it in 24 hours, and the next day they called me and they were like, hey, that was awesome, we've never seen anybody integrate in less than 30 days, do you think you want to do a CAM package? And I was like, huh, 3D printing, CAM, additive, subtractive. It's interesting, I did that for a long time, how hard could that be?
So I ended up adding CAM in 2016. Again, I had a prototype working in about two weeks, and here we are six years later, and we're really just getting into the interesting stuff on that package. So this is the interface for the different modes of Kiri:Moto. This is the GitHub repository, and I'm going to jump into a demo of this. The application is called Kiri:Moto. Most people call it Kiri, or KM when you're typing it out. The backstory on that is that when I was building this, I was building two applications at the same time. One was the slicer, and the other one was actually a modeling program. I wanted to do CAD modeling in the browser, and again, none of the other tools really existed at that time. This was 2012-ish, and I started to build a block modeler that was sort of a combination of Minecraft and blocks that you could skin and connect, and you could use blocks to create blocks. So I called it Meta:Moto, but Kiri ended up taking all of my time, and I never got a chance to really finish that. I abandoned it. But now I have the tools to actually maybe go back and finish that if I have time, which is doubtful, it seems. Okay, the app flow. This is a really short and quick slide. All the different modes have the same workflow. Whether you're in FDM, CAM, laser, or MSLA, you basically have your parts, and you're arranging them on a workspace. You're doing a slicing pass, and the slicing pass is the crude pass that basically just says, I'm deconstructing geometry. I'm creating the rough paths that I'm going to be following, but I'm not doing things like path routing and other complex stuff. That's the prepare step, which basically takes the raw slices and then does path routing. And path routing for 3D printing, CNC, and laser are all different, which is why it's abstracted out. Each of them has its own views, and you can do different things in different views based on that. And then the last pass is basically exporting to your device target.
I'm going to export G-code or an SVG or, in the case of MSLA printers, massively large binaries for no reason. Yeah, all right. Let's see if there's something I missed here. No, I didn't. All right, the code structure. Again, this is all running in the browser, 100% JavaScript unless I say otherwise, and I will denote when that happens. There is a little bit of WebAssembly in there, and I plan to add more WebAssembly in the future to accelerate things. The major libraries that I'm using are three.js for the 3D, Clipper for polygon offsetting, and very recently I added Manifold as a 3D Boolean engine. It's used in one small place, but in the future I probably want to use it in other places. It's actually an incredible library. On the right, these different blue boxes are different workers, different thread contexts, so you've got the UI context, and then all the work is done in workers to keep the UI responsive while all the heavy lifting is happening. There's no lateral communication between the workers. It's all hierarchical, top down. There's a reason for that, and I don't know if I have enough time to really talk about it today. Denoted in each of the boxes are the languages and/or libraries that are being used. All right, and I just hope to remember to show you all that stuff in the demo. So I'm going to hop into the demo now, start with Kiri:Moto, and show you different parts of that integration with Onshape. And now I have to switch up the display, this thing here. I have to go to mirror display so you guys can see what's happening. Great. Now I have to go find my browser; let's just do it here. All right, this is not good. That's better. I am running this live on the network here, which I've found to be spotty sometimes during the day; sometimes it's great, sometimes it has a little bit of trouble. So right off the grid.space site you can launch these applications directly.
Because it's running in the browser, actually hold on a second, I need to get my phone. Because I no longer have speaker notes, I'm going to use my phone. That way hopefully I won't forget major important things to discuss. All right, you can launch these things directly from here. This allows you to launch any of the major modes or just hop right into the application by clicking on it there. And there you go. The whole thing is loaded. The footprint of this is under four megabytes compressed, and it usually loads in under a second; with a cold cache, maybe about two and a half seconds to load the entire application with all the different modes and everything else that's in it. So it's quite small in terms of footprint for the code. Most slicers, if you go to download and install them, are going to be in the 200 to 500 megabyte range for similar functionality. And I'm going to bring up the developer tools on the right-hand side really quickly, just to show you what's going on under the hood. On the right-hand side, I don't know if you've ever brought these up, you can look at the different threads or thread contexts that are down here. And in the interface, you can turn threading on and off, which will start and stop the workers on the right-hand side, which sort of alludes to the previous slide that I showed you. So these are the distributed workers that will be doing all the heavy lifting behind the scenes. And you can turn it on or off because some browsers, like Safari, don't allow you to do nested workers; they have to work in a flat space only. It's just one of the many web standards they refuse to support properly. And so it'll still work that way, it's just going to be slower. All right, typical slicer interfaces. The original interface for Kiri:Moto was very much like the slicers that you see today, Cura and Prusa and all that kind of stuff, with essentially a tree of a thousand parameters.
This interface is slightly more opinionated and cut down. And so you'll see that you've got a lot fewer parameters for your model. But that's not really a problem. I would say that most parameters in most slicers are designed to help you deal with crappy 3D printers. And it makes sense, because most 3D printers are crap, to be honest. They don't have good software, or they don't have good hardware, or they have neither. So there's a lot of stuff that you can fix in software with a crappy 3D printer, even if your firmware is junk. And a lot of printers actually are capable of running better firmware than they are delivered to you with. It's just beyond most users to change out the firmware for a custom one and tune it the way it needs to be tuned. And so a lot of that stuff gets pushed into the slicer. The slicers will give you a thousand parameters, and 900 of those work against each other. And so slicer profiles become a religion, where it's all about, oh, mine works perfectly; tweak a few things, and something breaks, and you can't correlate why this thing is breaking this other thing. Because this whole project is just me, I'm forced to be basically really brutal about what is in and what is out. And I'm doing heuristics for 99% of the stuff that's out there. Like, it's going to work for most cases. Under the hood, in the engine, are all the parameters that would be in other slicers; I'm just not exposing them to you, because I don't really believe that's helpful. And again, if you run into a corner case where you just can't do the thing you want to do with your favorite parameter, just go use the other slicer or commit some code. Because this is open source, it's all on GitHub and has been for a long time. Let's see, before I actually get into really demoing this, I wanted to show another thing.
Because this is running in the browser, it is used pretty widely in a lot of classrooms, not just for 3D printing but for CAM and everything else. Because it is one of the only pieces of software that runs on Chromebooks, it doesn't have to be installed. You can basically run it in Google Classroom and other places. You can configure everything on the URL. So you can create profiles and put a bookmark somewhere, and somebody can click on it and it'll open up the profile, pre-loaded with everything that the student needs for that class: the device profiles, even loading up objects into your workspace. So it's pretty easy to do that kind of stuff. And then there's the fact that you can basically export your workspace, open a private browsing tab, and import it. You can use this as a way of getting your data around from browser to browser, or again, for testing and debugging. People send me workspaces all the time. I open a private browsing tab, I import it, I replicate whatever their problem is immediately, and I'm able to fix it right away. So from this interface, usually it comes up like this. You can click on docs on grid.space, and it'll take you to this page here, where it shows you everything about the different ways you can do shared profiles, which I just alluded to, G-code macros, and different ways to embed or otherwise use the API for this. So to show you something quick, let's just open up something which is sort of your basic 3D printing thing here, the Benchy, if anybody's done any 3D printing here. In the slice mode, you go in here and you have the ability to look at different slices, or ranges of slices, like this. And there's a reason that there are two different modes. Most slicers will basically take you right to the path-routed version of a sliced object. The reason I put you in here first is because you might want to modify parameters for ranges.
So in this case, I could just say I'm going to turn off the infill for that bottom part, and then I have a range here that shows me that I've done that and that I've overridden the fill amount. And if I reslice that and look at this from the bottom up, you'll see that there's no infill down here, and then all of a sudden it switches to the normal fill there. And then you can go and look at the preview, and this will show you the path routing. Path routing also involves determining the speed at which those layers, or portions of layers, are going to be printed, and you can turn things on and off, like moves. So you can look at the moves. I realize the aspect ratio is maybe a little funky here. Typical stuff that you would see in a slicer, nothing really amazing there. What is a little bit more interesting about Kiri:Moto, and this isn't really specific to 3D printing, and I'll get to this more later, is that you can go in and change the version on the fly, because I'm hosting the last 10 versions of the software. I'm working on the beta channel, pushing out software two to five times a week depending on how active development is, with major releases about every four to six weeks. And so if there's a bug in the current release, you can just switch to the previous one. Your workspace will be preserved, and it will bring you back into the same workspace that you were in before, with all of your data and all of your everything. All of the stuff is persisted live in the browser, so again, everything is just reload the page, and you're back where you were before. Screwed something up? Reload the page. So there are three ways to import and export files. Really quickly, you can just drag and drop stuff into the browser. There are different formats it supports, so in this case I can just drag in this file here, which is something that some users really love to use as a benchmark because it's got about a million polygons, and it slices pretty fast.
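The per-range override described above (turning infill off for the bottom layers while everything else keeps the normal settings) can be sketched as a lookup that layers range-specific settings over the defaults. This is only an illustration in Python, not Kiri:Moto's actual JavaScript; the names `RangeOverride`, `SliceProfile`, and `settings_for`, and the `infill` and `shells` keys, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RangeOverride:
    """Hypothetical: settings that replace the defaults for layers first..last (inclusive)."""
    first: int
    last: int
    settings: dict


@dataclass
class SliceProfile:
    """Hypothetical profile: default settings plus a list of per-range overrides."""
    defaults: dict
    overrides: list = field(default_factory=list)

    def settings_for(self, layer: int) -> dict:
        """Return the effective settings for one layer index."""
        effective = dict(self.defaults)
        # Overrides are applied in order, so a later one wins where ranges overlap.
        for rng in self.overrides:
            if rng.first <= layer <= rng.last:
                effective.update(rng.settings)
        return effective


profile = SliceProfile(defaults={"infill": 0.2, "shells": 3})
# Turn infill off for the bottom ten layers only.
profile.overrides.append(RangeOverride(first=0, last=9, settings={"infill": 0.0}))

print(profile.settings_for(5))   # a layer inside the overridden range
print(profile.settings_for(10))  # a layer above it, back on the defaults
```

Resolving the settings per layer at slice time is one way to get the behavior seen in the demo, where the fill is absent at the bottom and then switches to the normal fill.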
There are some users of the program who really helped me with performance because they care a lot about it, and they have benchmarked Kiri against Cura and Prusa and a bunch of other stuff like that. And slicing this million-polygon thing, at the end of the day, is about as fast as Cura was, at least last year, which was the last time we looked at this. It's fairly well optimized. And again, this is pure JavaScript; there's no WebAssembly or anything else happening here under the covers. What else was I going to show you here? I'm not sure how important that is, but anyway, the version switching is nice. And then there's pluggable language support. If anybody wants to contribute languages, it's pretty easy to plug a new language into the interface, and again you just reload the page, and you've got another language with the same interface that you had before. And here we can go through, look through this, look through anything else. So, typical stuff there. An interesting thing to note about this is that once you've sliced this thing, we're now looking at about 25 or 30 million polygons in WebGL, and it's still fairly responsive. I mean, it's not as fast as I'd like it to be. So future work is actually writing custom shaders, which I think would help with this. This is not a case where you can basically de-res or hide stuff on the fly, because if somebody goes in to look at the interior of this object, you can't just hide and then show that thing. There's always a possibility that interior features are going to be shown through the shell, so you can't hide them. Every path that's in this thing has to be shown at all times. And so this is an area where, if there are graphics experts here, since this is not my field, I would love to have some feedback at some point. And then you can just export the code.
A nice thing about this is that Kiri:Moto also has a G-code renderer, and again, I'm not sticking to my script here, there are things I want to talk about, but you can export the G-code and then, once you've done that, download the G-code. You can also send directly to printers. I have a spooling agent that allows you to send directly from Kiri:Moto in the browser to OctoPrint. So if you use OctoPrint as a back end, there's a way to send directly from the browser to your printer. And that's fairly well documented, and I have other ways of doing that for some of my own printers. Download the G-code. Kiri:Moto also has a G-code renderer built in, so if you get some random G-code and you want to bring that in, you can actually bring the G-code back in. Oops, let's find the downloads and bring that back in. And it will re-render the G-code. This works in all modes, so Kiri:Moto will render G-code from CNC, laser, and FDM. I'll get into it in the CNC section and show you that as well. And then there's exporting and sharing of workspaces. So for example, I go into a private browsing session. And the internet's great. Let's try it again. Great, private browsing. I can import something like a CNC job. And it will switch modes, bring it in, give me a screenshot from when the workspace was created, and then bring everything in with it. So this is an example of the fourth-axis lathe work that's happening right now in the CAM mode. This isn't fully baked yet, but it's the next thing that's coming out with the next release. Fourth-axis CNC was the thing I added in the last release, and this just gives you the ability to do some fun stuff like this. But that's just showing you importing, exporting, and sharing of workspaces. CAM mode: let's just hop into CAM mode real fast, CNC mode. So you'll notice when you change modes, your device list changes, and the platform changes. It's adapting on the fly to whatever the new mode is that we're working in.
I got this new CNC mill, a Makera Carvera, and I got rid of all my other mills because it does everything that I need. And that actually leads to another project that I've been working on. I'm going to bring in a workspace for that. This is a workspace that somebody sent me. They were like, hey, I have a problem with some anomalous cuts happening here. So in CNC mode, you get a new bar at the bottom. These are reorderable operations, because CNC is a set of operations. Each operation may or may not use a different end mill, and you can hover over each one and change the settings for it, and you can enable and disable them like that. And I'm just going to go ahead and slice this and give you an idea of what the slice view looks like. I can turn the different operations on and off to look at what they're doing, or I can go through them like this. But this doesn't involve the routing. This is just, again, the pure slices. We go to preview this. It's going to show us the path routing; that's a little aggressive on the hover. But once you've done this, we can go to an animation mode. And now this is the first time we're doing something actually pretty interesting, where on the back end there is a SharedArrayBuffer, which is a relatively new technology in the web standards. This shared array is like shared memory, and having shared memory in JavaScript is cool. This is a new thing. The workers have access to one shared pool of memory, and they're all rendering into it simultaneously. And the front-end three.js engine is rendering the same buffer that's being rendered into by the workers on the back end. So I can go through and I can run this animation, and I can speed this animation up. And all of this is actually happening in real time, done by the workers on the back end. And I got this workspace that says, I have this anomalous cut right here. What's that all about?
Well, I'm not going to walk you through the details, but they had one thing incorrectly set in one of the settings. I figured that out and sent it back to them with the correct settings, and they're off. You know, it wasn't a bug this time. Oops. That's unfortunate. My hand touched my phone and all the notes disappeared. It looked like I had deleted them magically, but I hadn't. It just created a new note that was empty. Okay. So here, this is the fixed workspace. Let's see. Can we also do other sorts of import and export? It can import SVGs and extrude them. It can import images and turn them into height maps. One of the common things people do with CNC is reliefs. So I can bring in a relief, like an image of a flower, and add some parameters to that. Actually, I want my stock to be offset, and then I want to resize this thing because it's kind of too large. So this is a common use case. People want to do reliefs. This came from this image right here, which I grabbed off the internet this morning, just to show you that it will import things like that. It will also import SVGs. This is more common with lettering. If you want to do lasering or CNC of lettering or things like that, you want to bring in an SVG that you've created in another 2D program, and I can do that. It'll bring it in. That's too large. I can delete pieces out of it and scale it because that's, again, too large. So then just delete and add some new operations there. This is actually a good segue into the laser mode. This is really too large now, again. Let me scale that guy down. And in laser mode, you're creating, let's see, I don't want to use this. I'm going to use a different cube. I'll just show you this.
So by default, in laser mode, when you slice, it looks at all the different Z changes. If you enter nothing, or zero, for the height, it will find all the transition points and create a slice at each of them. The assumption there is that you're going to slice a thing into stackable layers, and you're going to do stacked, registered slices with a laser cutter. You can do something a little more complex. You can choose the slice height. But for things like lettering, the things I showed you before, we're usually going to bring in a 2D thing, slice that, and set an offset. The offset is the kerf for the laser, and it tells you how far the line is going to be set off from the part, so that when you take into account the width of the laser, you get the size of part that you want. It has some other interesting features, like a drag knife. You can turn on drag knives. Drag knives will do the radial cuts. If you're cutting vinyl, for example, it's very similar to laser cutting, but you're dragging a rolling surface, and so... 15 minutes, thank you, I'm rambling. So, drag knife support. I'm going to quickly go through the last mode or two here, and then show you a couple of other things. Resin printing is usually known as MSLA, which is masked SLA, as opposed to actual resin printing using a real laser that traces something. ChiTuBox owns that market. I bought a couple of resin printers, and I used them before I figured out that all of the file formats for resin printers are pretty much proprietary. They all contain the same information, which is basically a stack of images, but because ChiTuBox sells the hardware, they force everybody to sign NDAs and make everybody have a different file format. What's out there today is all reverse engineered. I haven't put a lot of effort into MSLA since then, but it's really, really brutally simple.
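As a rough illustration of the laser-mode behaviour described above, the following Python sketch finds the Z heights where a model's geometry changes, pairs them into stackable layers, and applies a kerf offset to a simple rectangular outline. It is not Kiri:Moto's code. The function names, the vertex format, and the choice to grow an outer cut by half the kerf on each side are all assumptions made for the example.

```python
def z_transitions(vertices, tol=1e-6):
    """Return the sorted, distinct Z heights found in a list of (x, y, z)
    vertices, merging heights that differ by no more than tol."""
    levels = []
    for z in sorted(v[2] for v in vertices):
        if not levels or z - levels[-1] > tol:
            levels.append(z)
    return levels


def stack_layers(levels):
    """Pair consecutive transition heights into (bottom, top) layers."""
    return list(zip(levels, levels[1:]))


def kerf_compensated_rect(width, height, kerf):
    """Grow an outer rectangular cut path by half the kerf on every side,
    so the finished part keeps its nominal size (assumed convention)."""
    return (width + kerf, height + kerf)


if __name__ == "__main__":
    # A hypothetical two-step block: geometry changes at Z = 0, 5 and 10.
    verts = [(0, 0, 0), (1, 0, 0), (0, 0, 5), (1, 1, 5), (0, 0, 10)]
    levels = z_transitions(verts)
    print(levels)
    print(stack_layers(levels))
    print(kerf_compensated_rect(20.0, 10.0, 0.2))
```

Each (bottom, top) pair would become one sheet to cut, and the compensated outline is the path the beam centre would follow for an outside cut.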
It doesn't even have the problem of path routing, because you're really just doing layers. You're slicing into layers, you're generating a bunch of bitmaps, you're encoding them into a file, and you're throwing that at your resin printer. The one thing that really doesn't make sense about this is that resin printers take so long to cure each layer that a microcontroller could render and rasterize each next layer in the meantime, so the file really should just be a stack of SVGs. This would be something like a 100-megabyte file going to a resin printer. It should be 5K. It makes no sense, and I'm done ranting on that for today. The next way that you can access Kiri:Moto is through the engine. There is an engine API and an engine example. This is the pure API, not associated with any UI or anything like that. You can run it standalone, or you can run it in a browser. Here I'm just going to slice an object and generate G-code in the browser. This is just using the engine directly. There's one more API. The engine API and the frame API are used by a couple of different projects, even a commercial product. This is an API driving a frame. You can embed Kiri:Moto in a frame in your site and drive it through an API. This is automating the API interaction. You can skin it, you can remove any UI elements you're not happy with, or you can use just the 3D part of it, so you can basically create your own wrapper for this and put it anywhere you want. That's just another interesting way to access the interface. Next, I'm going to show you the integration into Onshape. If you've never used it before, it's browser-based CAD. I love it, I can't say enough nice things about it. It's not open source, though, obviously. Go in, open a model. In this case I actually did a YouTube video demoing this, creating a CNC job to mill a guitar neck. So you go into Onshape, and this is the model in Onshape.
Kiri:Moto is a tab over here. You basically just say add Kiri:Moto and it shows up here. It's in the app store. And then inside of Onshape, because you can see I'm iframed right here, I bring in the guitar neck, it imports into my workspace, and I can go into CAM mode and start adding operations to mill this thing and export the G-code, all in the same interface. If I update the model over here, I can come over here and refresh the model in place. So if I have things that are referenced in the model, and I'm doing some milling stuff, and I've changed the model, I can just re-update it in this tab. That's really the beauty of having this integration done in the browser through their APIs. It's the only case where this isn't a fully standalone application. I have to have APIs on the server that talk to their APIs using OAuth and a bunch of other stuff. So I'll leave it at that, since we're running out of time. The last two things that I want to show you are these. If I go back to Grid.Space, there's Mesh:Tool. In the slicer world you get a lot of crap models from Thingiverse and other places like that, non-manifold objects. I kept adding more and more features into Kiri:Moto to manage those problems and try to heal those objects and clean them up. And eventually I'm like, I don't want to crowd the interface in Kiri:Moto with a bunch of stuff that I really used to use Blender for. And so I created my own tool for that. This was the beginning of last year, I think, or the end of last year. You can come in here and you can bring in any... oh, I didn't even show you the belt printing. So I will say that one of the things that Kiri:Moto did before most other slicers was add belt printing. If you're not familiar with belt printers, one of the reasons I'm going to show you this is that you would think belt printing is very similar to bottom-up 3D printing. It's not. It's completely different.
They're similar in some ways, but basically, if you think about it, at 45 degrees every layer has a first-layer problem. And you have to deal with the fact that your bed is not rigid. It doesn't like a lot of materials, and you have a really hard time enclosing that thing. Having said that, if you can get your print job to work on a belt printer, it's phenomenal for lots of things. People like to print swords. The problem with swords is that they don't lay flat on the belt. So one of the things you can do in this tool is go in and split a sword. And there I have half a sword. Now the problem with that is that it is not manifold. It is open. So in here I can go to repair, patch. I now have a sword which I can flip over, put on a belt printer, and print, which is something that people do pretty commonly. I'm not going to spend a lot more time in this tool, other than to say it's something that I would love to improve and add a lot more features to over time. I think it's a tool that we should have running in the browser. There are some other similar things out there, but again, there's no cloud involvement for any of these programs. They're all completely standalone, running in the browser. And the very last thing I will show, because we're totally out of time now, is Carve Control, which is a sender. I can't actually show you this because I have not connected via serial to a CNC mill, but this runs in the browser and uses Web Serial to talk to your device directly and locally. It's abstracted so that you can talk over USB serial or over the network to the device. But what's really cool about this is that if you get a mill and you get to your first-time setup and you plug it into your laptop, you can click the serial button right here, it will find your mill, and you will connect to it and start controlling it immediately from the browser without having to install any software.
What led me here was that when I got the mill and tried to install their software, it didn't work on any of my computers. It didn't work on Windows, it didn't work on Android, it didn't work on Mac. Now, they've fixed most of these problems, but again, I was like, the software's not complex. I've written senders before. How hard could it be? The cool thing about this software is that it uses Three.js to render and show live in 3D what the mill is doing, so it's tracking the G-code in real time against what's happening on the mill. All right, go to Grid.Space. You'll find links to the forums, docs, Discord, all that kind of crap, and now I can take questions. If anyone has questions, we'll send over the microphone. Yes, on that side, yeah. Hey guys, the room is full of makers. Oh yes, yes, I found one. Hey, thanks for the talk. What are the techniques you're using for the workpiece simulation that you demoed in the CNC workspace? It looked like maybe you're using just a height map? Oh, for modeling this, yes, correct. In the three-axis CNC mode, it's rendering a height map, because you can't do undercuts or side cuts or anything like that. In four-axis mode, it doesn't use a height map. That's way harder to deal with. That's actually the only place I'm using Manifold currently, and it does 3D subtraction in real time of the tool geometry against the workpiece geometry. It's far more computationally intensive, but yeah, I didn't have time to show that. Is Manifold, like, robust mesh booleans? It is. It's a relatively new project, it's really exciting, written in C++, compiled into Wasm, and then loaded as a library to run in each of the workers. Again, it's WebAssembly, so it's pretty fast, and it is a proper manifold library for doing booleans. And I did contribute to that project because their initial API was too slow for what I was doing.
They weren't producing arrays properly. I rewrote their JavaScript API for import and export of those and made it something like 200 times faster, and that got accepted recently. Another question? Hiya. Thanks for the talk. I found it really interesting. I might have missed it, but I don't think you mentioned where you used WebAssembly in the product. I was curious because, looking at the slicer algorithms for FDM, they must be pretty computationally intensive, so how do you get that to work fast enough in the browser to be usable? So none of the slicing that you saw, none of the CAM stuff, none of that used WebAssembly. Nothing that you saw today used WebAssembly. That's all pure JavaScript, which I know is probably upsetting for people. Yeah, I'm surprised it's fast enough. It's impressive. So part of it is that I did spend decades doing high-performance computing, and I'm a bit of a performance nut, you know? So why would I use JavaScript? I mean, I did my prototype in JavaScript, and I'm like, this is just too easy, right? I started, like I said, more than 10 years ago, and it worked. My intention was to rewrite it in a real language, because I spent most of my life writing in real languages, and now I spend all of my time writing in JavaScript with a little bit of WebAssembly and some other things like that, and I'm totally hooked on it. You can do some pretty amazing things in JavaScript if you decide you want to. Cool. Thank you. All right. |
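The height-map approach mentioned in the Q&A above for three-axis simulation can be sketched in a few lines of Python. This is only an illustration of the general technique, not the project's JavaScript implementation: the grid layout, the flat-bottomed tool, and the names `make_stock` and `cut` are assumptions for the example. It also shows why undercuts cannot be represented, since each cell stores a single height.

```python
def make_stock(nx, ny, top):
    """Create an ny-by-nx grid of heights, all at the stock's top surface."""
    return [[top for _ in range(nx)] for _ in range(ny)]


def cut(heightmap, cx, cy, tool_radius, tool_z, cell=1.0):
    """Plunge a flat end mill centred at (cx, cy) down to tool_z.

    Every cell whose centre lies within the tool radius is lowered to
    tool_z if it is currently higher. Material is only ever removed.
    Returns the number of cells that changed.
    """
    changed = 0
    r2 = tool_radius * tool_radius
    for j, row in enumerate(heightmap):
        for i, h in enumerate(row):
            x, y = i * cell, j * cell
            if (x - cx) ** 2 + (y - cy) ** 2 <= r2 and h > tool_z:
                row[i] = tool_z
                changed += 1
    return changed


if __name__ == "__main__":
    stock = make_stock(5, 5, top=10.0)
    print(cut(stock, cx=2.0, cy=2.0, tool_radius=1.0, tool_z=7.0))
    for row in stock:
        print(row)
```

A simulation would call `cut` for each sampled tool position along the G-code path and redraw the grid as a surface. Four-axis work needs true 3D subtraction instead, as described in the answer.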
Passwordless Linux -- where are we? |
Welcome, everyone. It's probably the first time in the last three years that I actually see this big auditorium full of people willing to hear something I'm going to ramble about. I actually wrote a bit of a script for this, but I'm bad at memorizing scripts, so I will be remembering as I go rather than reciting something that I wrote. For the past 10 years or so, the team I'm a member of has been working on a bunch of stuff related to something that many people think is uninspiring, because it's all about the enterprise use of Linux, which is just a done deal, right? Every enterprise uses Linux nowadays. We look into some of the areas and choose what to work on, because there's always stuff that is unfinished and needs work. Then, 10 or 15 years ago, some companies from the West Coast started to look into, hey, this shiny thing called the Internet, and this shiny way of having your account somewhere in a social network drive your login into your machine, and so on. And here we are, 10 or 15 years later, talking about doing a similar thing for Linux, in the enterprise environment and at home. So I will start with a bit of introduction. I've been working at Red Hat for almost 12 years now. I'm focusing on identity management, so basically running accounts in Linux systems. Centralized identity management is what we do. These days I'm mostly working on FreeIPA. I started with Samba 20 years ago, and there is SSSD on the client side, MIT Kerberos for the authentication, and so on. So there are a lot of things blended together, and it's really not a single person or a single project. It's an amalgamation of a lot of activities by a lot of people around the world. This talk is kind of a split between the past, contemporary things, and what we expect to see in the future.
Hopefully that means Fedora 39, if we get this through in the upstreams first and then manage to collect it all together in the distributions, which is a point I'll talk about a bit later. In the past, we've always approached all this stuff with a bunch of assumptions. One of the really unspoken assumptions is that when you log in to your system, a Linux system or a Windows system or so on, you don't only want to use that particular system. Typically, you want to go beyond that system. These days, you typically go onto the web and use all these web applications, right? But before that, in an office environment, you were reaching for your nice non-working printer, or you were reaching for something to fetch files from, or getting your home drive and all the other stuff, or playing games together with your colleagues on the local network, and so on. And in many of those cases, you needed to pass on the authentication information that you provided or obtained at login time. So you needed to get this state of authentication translated, or carried over, onto the network services. A typical approach that we see these days is that you unlock your, let's say, GNOME session. Then there is a secrets manager, which stores your secrets and is unlocked using the password that you provided as part of login. Then the passwords from this secret store are used transparently by the other applications through some common APIs. Browsers have their own stuff. Of course, they have their own password managers, and those might even be shareable across multiple devices and so on. But it still doesn't solve the real problem, which is that when you log in to the system and you go to specific network-facing applications, you need to do this transition, to transfer the authentication state between maybe incompatible or not well connected protocols.
It is this bridging that we don't really have in place, and in many applications we kind of derive it based on certain things. Okay. In reality, it's much easier in the enterprise world, because for almost 40 years we've been dealing with something called Kerberos, in its different flavors, which is effectively a decoupling of the initial authentication from requesting authentication tokens on your behalf after you get this initial authentication done. That's a nice thing. Over the years, Kerberos as a mechanism has grown these authentication mechanisms, or, as they're awkwardly called in the Kerberos world, pre-authentication mechanisms, that allow you to use something more than just passwords. So you can use, for example, smart cards, or you can use one-time codes, two-factor authentication mechanisms, and so on to log in, or rather to provide the initial credentials to obtain the so-called ticket-granting ticket in Kerberos. The good part is that this is exactly a way to get a transferable authentication state. Over the years, we added to Kerberos the ability to record how this initial state was acquired. So an application can use an API to inquire whether this initial token was obtained with a password, or with a smart card, or with some other, stronger type of authentication. So it can differentiate how to behave. If, for example, an application-specific policy says that you cannot use this application unless you used a smart card or a FIDO2 token and so on, then the application can look up this information and say, yeah, you're not allowed to use this. You cannot use sudo on these machines because you merely used your password, that kind of thing. This is the thing that I was working on a lot, and the one thing connected with it is that the complexity of a typical Kerberos environment is huge. But separating out the authentication method is a nice part, because it allows us to decouple the other part of the complexity.
All these tokens or separate devices that you can use to prove who you are, and that you possess these credentials, are decoupled from the actual applications. Splitting things up into these two parts is what is important here, because we can write applications once and forever, at least the verification of the authentication and authorization there, and deal with the rest separately. Now this is, for example, how it looks in a typical Kerberos environment with FreeIPA, with Samba AD, or with Microsoft's Active Directory. This is detailed in the description from the Red Hat Enterprise Linux identity management guide. Links are in the PDF. This is just an example of what we're dealing with. And the fun part is that seven years ago I was talking in the other room, in Janson, about similar things: how we started with this, how we started with enterprise graphical workstations on Linux. That was, how many, 15 Fedora releases in the past. 15. When I wrote that, I did not realize that we are at 37 now and it's like seven and a half years back. And almost all this stuff already worked by that time. It was possible to use smart cards in this environment. Of course, there were some issues with the visual representation, how you choose these smart cards and so on, but it's kind of funny that what I will be showing you today is similar to what we did seven years ago. We fixed this problem for smart cards eventually, together with the GNOME people, by introducing some interfaces between SSSD and the GNOME components, GDM and the other parts of the window manager, so that they can pick up this information and show a nicer UI to choose who you are based on your smart card and so on. But smart cards were the first thing we did with this passwordless authentication, because with them you don't need passwords on the system anymore.
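The idea described above, that a ticket records how the initial authentication was performed and that an application can apply a policy to it, can be sketched as a small Python check. This is a toy model rather than a Kerberos API: the function name, the policy dictionary, and the indicator strings are assumptions chosen for illustration. A real deployment would read the indicators from the ticket through its Kerberos or GSSAPI library.

```python
def is_allowed(service, ticket_indicators, policy):
    """Decide whether a ticket may be used for a service.

    ticket_indicators: labels recording how the initial ticket was obtained
                       (an empty list here stands for a plain password login).
    policy:            maps a service name to the set of acceptable labels.
                       Services without an entry accept any ticket.
    """
    required = policy.get(service)
    if not required:
        return True
    return bool(set(ticket_indicators) & set(required))


if __name__ == "__main__":
    # Hypothetical policy: sudo needs a smart card or an external IdP login.
    policy = {"sudo": {"pkinit", "idp"}}
    print(is_allowed("sudo", ["pkinit"], policy))  # smart card login
    print(is_allowed("sudo", [], policy))          # password-only login
    print(is_allowed("sshd", [], policy))          # no requirement set
```

The useful property is that the policy lives with the service while the authentication method lives with the ticket, so new methods can be added without touching the applications.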
So what happened in these seven years? That's a nice period, because every seven years something happens with people. They change themselves, and the environment around them changes a lot. So what happened? Actually, there was a change of winds, and it's a fundamental one. If you remember the typical office environment of 2015 to 2016, first, there were offices that people went into to work every day, and they actually had a network where those computers were connected and enrolled into whatever domain controllers were in those offices. It really was infrastructure for people, right? You come in, you switch on, your laptop is connected, and it knows what to do. It was, so to say, infrastructure for people. Nowadays all the infrastructure is in the cloud, and it's not for people, it's for applications. You deploy applications somewhere in the cloud, and all the infrastructure that is there is not for you as an employee, as someone who consumes it. You don't consume that infrastructure anymore. In most cases; of course there are exceptions. In most cases, you are consuming the applications, and the applications consume that infrastructure. So if your laptop is enrolled in something, that's most likely not the same environment your applications are enrolled in. What are the applications actually consuming? They're consuming stuff like OAuth. They get authorized to obtain tokens on your behalf from the identity provider service, which is some vague stuff whose insides you in most cases don't really know and don't care about. All you care about is that you can prove your identity to this identity provider, and for that you can use a bunch of stuff. In most cases, on the social networks or on a myriad of websites and so on, you don't even prove to them that you possess certain credentials. You prove this to some identity provider that they are connected to, and that is something that I call bring your own authentication.
So you connect your account on some site to your social network account, you prove to the social network that you are who you are, and then that social network's IdP issues a token to the site where you want to be. That's essentially bringing your own authentication to that site. The site doesn't really care how you authenticated to the social network, how you proved yourself. And in fact, what comes out of it is that all the infrastructure around OAuth, and OIDC as a part of it, is built around the idea that I have a browser and this browser is everything I need. There are redirects between different components and websites. They talk to each other on your behalf. So this is kind of a new mainframe where everything runs. You still need network access for this to work. And here we get back to 2016. In 2016, one of the biggest issues we identified for enterprise workers was that everyone goes on a business trip, goes to a hotel, and every single hotel has a captive portal on its Wi-Fi. So in order to log in to your machine using your credentials, you have to get through the captive portal before you log in to the machine. The fun part is that we never solved this problem at all. Instead of solving this problem, we basically said to ourselves, let's remove our machines from the corporate environment, right? Decouple the machine from the environment rather than solve the problem. Maybe this is a good solution. Maybe not. Because right now, to get this authentication working for us again, we need to run a browser before we log in. And running a browser before you log in entails running insecure, unknown executable code, which is JavaScript, with access to your hardware, which is at least your screen and GPU and so on, because that's what browsers require to do this stuff. So that's still a problem to solve, and of course some of these problems might be solved, but some of them might just be moved out of the way.
So almost everyone has a phone. That is somewhere I can run a browser instead of my laptop. So maybe I can run the authentication and authorization process on this separate device instead of running it on my laptop when I try to log in. It's kind of getting around the problem, but maybe all the problems are like that: get around them rather than solve them directly. And in some cases this is not even a problem. For example, if you want to log in over SSH and you're already logged in to your workstation, then launching a browser on your already logged-in workstation is easy, right? You just need to have a URL to go to. So let's try, I hope I will be able to show this. Let's try a demo. Here I'm trying to SSH into some machine, and on the left side from you, I have an IdP based on Keycloak, the open source project. This user doesn't have a security key associated with it. Now I'm associating one. So basically I'm taking this kind of token and connecting it with this user. Okay? I have this token associated somewhere there. And now I can log out of my IdP, just to make sure that I don't reuse the same session that I obtained with the password, right? So now I'm trying to SSH, and I'm presented with a message that says, okay, authenticate against this URL in a browser somewhere. So I have a browser. It's handy. I'm using the security key to actually authenticate instead of the password. I'm authenticated to the system. Now the system says, do you authorize this access? Yes, I authorize this access, and I'm logged in to the system. But a part of this authorization is that I'm getting a Kerberos ticket, issued to me by my infrastructure. And now this Kerberos ticket can be used for, well, anything. Anything that's allowed within this Kerberos environment. So I can SSH further with this Kerberos ticket. I can mount NFS or Samba shares and get access to files. I can use this Kerberos ticket to sudo on the system itself.
Since last year, or actually 2021, there is a PAM module as a part of SSSD, pam_sss_gss. This module checks that you have an active Kerberos ticket and that its properties are good enough to access PAM services. So you can configure, for example, sudo to say that if you have an active Kerberos ticket, obtained with this IdP authentication or with a smart card, then you can use sudo rules on this machine. Which rules apply is separate from this, but you can get access to sudo with it. You don't have a password. You never had a password on your system. The password may have existed somewhere in the IdP, in this Keycloak thing, or in GitHub, or GitLab, or Google for your company, or Azure for your company, or your own Keycloak. But the rest will work the same, regardless of how you get there. So it's a bit complex in terms of how this is implemented and how it goes. But at the core of it, we split the process in two parts. There's an initiating part, which in this case is running on the server I tried to log in to. There is a part that actually verifies that your IdP authorized access, which runs on the Kerberos key distribution center, so on the Kerberos server side. And then there is a client, in our case SSH, that tries to connect to the SSH server, which initiates the process. And of course, there is a browser that you run somewhere. You saw that URI there. You'd probably use a phone to access that URI, so you probably want to have a QR code instead of the URI, or in addition to it. This is one easy thing that we need to implement to make it easier. It will probably also become a mandatory part if you want to integrate this with the graphical login to the system, because you cannot run a browser yet at that point. But sure, you can show a nice picture that you can scan with your phone or another device.
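The split described above (one side starts the login and shows a URI, the user approves in a browser on another device, and the KDC side verifies the approval before issuing a ticket) can be illustrated with a small in-memory simulation in Python. This is a toy model loosely shaped like an OAuth 2.0 device-code exchange, which is an assumption here, and nothing in it is the SSSD or FreeIPA implementation. The class `ToyIdP`, the function `kdc_issue_ticket`, the example URL, and the `idp` label on the ticket are all hypothetical.

```python
import secrets


class ToyIdP:
    """In-memory stand-in for an identity provider (illustration only)."""

    def __init__(self):
        self._pending = {}

    def start(self, user):
        """Initiating side: begin a login and get a URI to show the user."""
        device_code = secrets.token_hex(8)        # kept by the verifier
        user_code = secrets.token_hex(3).upper()  # shown to the user
        self._pending[device_code] = {
            "user": user, "user_code": user_code, "approved": False,
        }
        uri = f"https://idp.example.test/device?user_code={user_code}"
        return device_code, user_code, uri

    def approve(self, user_code):
        """Browser side: the user authenticates and authorizes the access."""
        for entry in self._pending.values():
            if entry["user_code"] == user_code:
                entry["approved"] = True
                return True
        return False

    def poll(self, device_code):
        """Verifier side: return the user name once access is approved."""
        entry = self._pending.get(device_code)
        if entry and entry["approved"]:
            del self._pending[device_code]  # one-time use
            return entry["user"]
        return None


def kdc_issue_ticket(idp, device_code, expected_user):
    """KDC side: issue a toy ticket only if the IdP confirms the approval."""
    user = idp.poll(device_code)
    if user is None or user != expected_user:
        return None
    return {"principal": user, "indicators": ["idp"]}


if __name__ == "__main__":
    idp = ToyIdP()
    device_code, user_code, uri = idp.start("alice")
    print("Authenticate at:", uri)                       # shown by sshd
    print(kdc_issue_ticket(idp, device_code, "alice"))   # not approved yet
    idp.approve(user_code)                               # done on a phone
    print(kdc_issue_ticket(idp, device_code, "alice"))   # ticket issued
```

Because the URI is the only thing the initiating side has to present, it could equally be rendered as a QR code, which is the improvement mentioned in the talk.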
And you will need something like that for the passkeys that exist only on your phone, the soft FIDO2 tokens that both Apple and Google are eager to push to us these days. Effectively, the same thing works with FIDO2. In fact, the demo that I showed used a FIDO2 token, because the IdP side was configured to use a FIDO2 or WebAuthn token instead of a password or smart card or OTP, whatever is there. So we can already do this. In fact, all the tests that I ran were on Fedora 37, the one that was already released last November. So this passwordless login to the Linux console works on Fedora, works on RHEL 8.7 and 9.1, and the client side of it works on Ubuntu 22.10, because they have the new SSSD version that supports this stuff. Okay. So what if I want to do this with other services? Let's hope I can get another demo. How can I switch to the next one, huh? Just a second. Okay. So this is basically exactly the same demo, just using vlock instead of SSH. vlock is just a normal application that uses PAM, the Pluggable Authentication Modules, to lock your system. Typically what happens is that you lock your console, and then you have to provide a password to unlock it. vlock doesn't know anything about all this stuff, but it works because we use the standard APIs there. In this case, I don't need to provision the token, obviously, because it's already provisioned, but I can use it to log in to the vlock application. This just goes transparently, maybe awkwardly because of the URLs and the not-so-nice UIs and so on, but still, it's there. The most important part here is that in order to get all this to work, we have to have a network and a service, this IdP thingy. How can we do without it? Well, FIDO2 has the ability to work locally and not only through the web stuff. So we can run this authentication locally. All these tokens from Yubico and others are usable in an isolated environment as well.
So we can actually run it. I've got one small demo of this. Let's hope I get it started. This is recorded with my phone camera, because I needed to actually show you that I pressed something. So this screen is locked, right? And now I'm trying to log in on that screen. It asks me to insert the passkey device. So I'm inserting it, and you can see that it doesn't actually show you the full message. The rest of the message says to press Enter before it goes further. So when I press Enter, it tries to communicate with the device. The device blinks at me, and then I have to touch it to attest that I'm physically present at the device in order to log in. And after that, I'm unlocking the same way as with vlock or SSH, through PAM. I'm unlocking this. This was in my presentation this morning in the security devroom about the same thing. This is actually working, by the way. I cannot show it on this laptop, because as soon as I lock my screen, the video stream over HDMI will get disconnected and we will have to wait another five minutes to get my screen working again. So you wouldn't see this on the screen. So believe my camera. Now, if we go logically further, how can we combine this? Logging in locally with the FIDO2 token effectively gives me a login to the system. But the very first thing, the same problem that I discussed in the beginning, applies here. How do I transfer the authentication state, the fact that I used this passwordless device to log in, to other applications? Well, in an enterprise environment, whatever we call an enterprise environment. For me, for the last 10 years or so, the enterprise environment is my home environment, because I run what I develop, FreeIPA, as my home authentication service. I simply need to transition from this FIDO2 token authentication to a Kerberos ticket, right? Sounds simple. Yes, it's also a bit of arrows and lines and so on, but it's pretty much in line with the previous one.
It's just the question of who is communicating with whom to prove that the possession of credentials is real. Okay. Unlike the previous one, I cannot show a demo of this one because we are at the point where maybe in two weeks we will complete this implementation. And of course, the amount of bugs to find and fix will probably delay this until Fedora 39 or so. But I really hope to get this working by that time, which is by the way just how many? Seven, nine, nine months away. Sounds like a nice number. So that's not everything, of course, because, well, it's only the infrastructure part of it. We need to get it usable, and not to the point that users will trash their laptops because they cannot read the message. Imagine, this is just a simple "insert your passkey device". What if this was "authenticate against this URL, https something", that can't even fit into this box? That's the situation right now. So we need to fix this in, in this case, GDM, of course, this is the GNOME Display Manager. For the login, we need to fix this. And we started talking with the GDM developers, and there is some common understanding of what to do, similar to what we did with smart cards. The nice part of it is that it's mostly about a protocol of communication between the PAM service, in this case SSSD, and the display manager, through environment variables within the process. So SSSD can say: I'm now using this type of passwordless authentication, please switch to this visualization of it. So we can kind of make it in such a way that it works for GNOME but is also usable for others. And here I come to the point of effort across multiple projects. So you can do this with GNOME, but then what to do with KDE, what to do with Xfce, and others? There's a need to agree on something. We can agree on APIs, we can agree on the protocol and the semantics behind it, but somebody needs to implement all these pieces in the end. This is a common effort as a community.
It's not that just one company or one developer crazy enough can drive and write all this code and get it done. We as a community need to work together to get this done. And of course, this is just the tip of the iceberg. A bunch of work needs to be done on the distribution level to make it usable for clean installations, for example. If I want to use this type of key, if I go to keys, for example, for the login into the machine, I surely want to use it also to provision encryption of my disk. And this is already possible, with systemd supporting configuration of LUKS devices with FIDO2. In all contemporary distributions, systemd is already compiled with support for FIDO2. It's a matter of configuring them, right? It's a matter of configuring other services to reuse all of this. But as much as I like wiki pages that describe how to do this in multiple articles, I don't think this is what we should be doing. It's the distribution-level job to add this kind of polish. Again, as a community of distribution developers, in Fedora, in Debian, in others, in Arch, we need to focus on this kind of polishing. When all these pieces are ready, sure, someone will be faster with the releases and the polishing. But in the end, we as a community of developers and users, we are users ourselves. Even if we know how this stuff works, it doesn't mean that we tolerate how badly it's integrated. So that's my message here. We need to work as a community to bring it all together, to make it pleasant to work with. And with this one, I guess I'm even on time. Thank you. My question is about two-factor authentication when there is a password and a security key. Do you know why in all the workflows the password is first and then the security key? Whereas it looks to me that the security key is the strongest part, and it would be better to first press the button of the key and then enter the password.
Is there some movement to change the current way of doing things? The question is whether using a password and a second factor and so on can be one of the ways of doing it, and is there an activity to make it better, let's say it this way. Yes and no. It depends on what we define as the first factor and the second factor. One of the driving factors behind the passwordless work in the last 10 years or so across the industry is the tightening of requirements from governments, specifically from the U.S. government. The U.S. government issued a so-called zero trust memorandum last year urging governmental organizations to move to passwordless authentication by the end of fiscal year 2024. This means that they have smart cards already, so that's solved. But they specifically name WebAuthn and FIDO2 as one of the alternatives to smart cards. And in fact, all of these things have a PIN associated with them, which can be looked at as a kind of first factor. But it's not the first factor as we understand it. It's a first factor to unlock the hardware rather than to unlock some factor in software. So this is the thing which drives some of this work, not necessarily for us but for many others already. And I think there is value in stacking factors. The question is in which environments you want to use them and so on, and configure it this way. Certainly there is already a mechanism to enable it. For example, PAM already does stacking. So technically you can stack multiple PAM authentication modules in the same configuration for vlock or GDM or something and say: always enter three passwords from different sources. That already would be multi-factor, multi-password authentication. Or one of them is this, the other one is the password, and the third one is the phase of the moon and so on. So you can get that stuff already. It's the question of adapting it to a specific environment where this makes sense. I have not seen multi-password in the traditional kind of environments yet.
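The PAM stacking described in this answer can be illustrated with a small model. What follows is a minimal Python sketch, not PAM itself: it assumes a simplified reading of the `required`, `requisite`, `sufficient` and `optional` control flags, and the module labels and the always-true checks are placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Module:
    name: str                  # label only; illustrative, not a real module name
    control: str               # "required", "requisite", "sufficient" or "optional"
    check: Callable[[], bool]  # stand-in for the module's real authentication step


def run_stack(stack: List[Module]) -> bool:
    """Evaluate a stack top to bottom under simplified control-flag rules."""
    failed = False     # some "required" module has failed
    succeeded = False  # some "required" or "requisite" module has succeeded
    for module in stack:
        ok = module.check()
        if module.control == "requisite":
            if not ok:
                return False   # fail immediately, skip the rest of the stack
            succeeded = True
        elif module.control == "required":
            if ok:
                succeeded = True
            else:
                failed = True  # remember the failure but keep going
        elif module.control == "sufficient":
            if ok and not failed:
                return True    # good enough, skip the rest of the stack
        elif module.control == "optional":
            pass               # simplification: the result is ignored
        else:
            raise ValueError(f"unknown control flag: {module.control}")
    return succeeded and not failed


# Three stacked factors that must all pass, in the spirit of
# "always enter three passwords from different sources".
three_factors = [
    Module("password", "required", lambda: True),
    Module("security-key", "required", lambda: True),
    Module("otp", "required", lambda: True),
]
result = run_stack(three_factors)
```

In this model the order of the modules is simply the order of the list, so putting the security key before the password is a matter of reordering the stack; whether that is a good idea in a given environment is the policy question discussed in the answer.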
But if we do something like integration with identity providers, like I showed in the demo, that gives you the possibility to move this check away into, let's say, Keycloak. Keycloak actually has the ability to ask for a password, an OTP and WebAuthn at the same time. So you can configure that already. Again, the question is in which direction you want to focus this. Maybe not adding it into SSSD or some specific PAM module but instead adding it into a more flexible and more controllable source. Any other questions? Yes, here. So you were talking about how distributions and desktop managers can contribute to get, like, a unified interface, and I recently came across a project called xdg-credentials-portal, which tries to specify a standardized D-Bus interface for interacting with WebAuthn credentials, such that KDE and GNOME and so on can hook into this. I was wondering whether you were aware of this project and if that would be a good place to maybe start trying to collaborate on a unified interface. So the question is am I aware of projects like xdg-credentials-portal that try to specify or unify access to credentials for things like, I think it's for containerized environments, desktop applications running in Flatpak and so on. Yes, I am aware, and that's partially the thing that I mentioned about the parallel effort there, but it's not for login, it's for the reuse of the credentials in the browser or a relying party outside of the container, so that you can isolate direct access. That's very important, but it's a bit of a different layer, and it's the layer that we will eventually need to reuse if we want to do this transferable state of authentication between different places without Kerberos. Yes, we will have to work with them, or they with us, or everyone. Yes. Hello.
So I use ED25519 security keys, which SSH supports, and it's really simple: I just generate a key and I can use it, my SSH daemon supports it, and I don't really need to run any services or use Kerberos. How do we get to a world where everything is that simple? So the question is why do I need all these things if my SSH keys work fine with hardware-based tokens, X.509-based certificates and so on. The answer is simple. It's not an answer for all use cases. You cannot authenticate with these tokens and these keys to, for example, file services. You cannot use them to authenticate to NFS. You cannot use them to authenticate to SMB or other resources. Needs are different, use cases are different. All of these things do not replace SSH keys, and SSH keys do not replace this. They exist in parallel to each other, because if you start using FIDO2 tokens, you wouldn't be using them in exactly the same way on the SSH level either: SSH will need to generate its own keys based on the same hardware but in a protocol-specific and application-specific way. And this is where the problem really comes in. You can find use cases where existing functionality is perfectly usable and secure and attainable, but it's not transferable to other systems and other protocols. That's the problem we are trying to solve. So it's more amending rather than displacement. Hello. Hello. And thanks for the talk. My question is: if I'm a very, very bad person and I want to break this system and steal the Kerberos ticket, for example, what's the easiest way I can do it? So the question is how to steal a Kerberos ticket in the Linux environment in the easiest way despite all the protection mechanisms we want to add. I'd prefer to answer with what mechanisms we have, so that you may reconsider your activities. So in comparison, Microsoft some time ago added a virtualized slice where they store all the credentials, so you do not have access to that slice.
It's running literally not as a container, it's like a VM separated from the actual system, where they handle all the security credentials and so on. And you cannot easily steal that from the memory of the applications that try to load this stuff. On the Linux side, at that point we had a project called GSS-Proxy. GSS-Proxy is a daemon that runs separately, typically as root, and it interposes on GSS-API, hence the name GSS-Proxy. It interposes on GSS-API operations and on access to all Kerberos crypto material. So when you are a service like httpd or NFS, or your IMAP or sendmail or something, you want to have access to the Kerberos keytab so that you can authenticate clients coming to you. And GSS-Proxy basically removes access to the keytabs, removes access to the credential cache that you use. They are in a separate process. So now if you've broken into an NFS server like Ganesha, or even the kernel, well, the kernel one is probably an easier target, right, if you happen to find a bug that allows you to exploit it. But you're probably not Linus Torvalds, and it's not 1994 anymore, when he found NFS problems to hack into and get the data out of NFS. So typically this privilege separation between processes works fairly well. If you're root, of course, you get what you get. In that case, there are a number of ways of preventing processes running as root from accessing the capabilities on the kernel level. And typically this is still something that should probably be multi-layered. Like you have to use kernel capabilities, then you have to use the file-system-level capabilities, then SELinux or AppArmor to prevent other kinds of access. But basically with GSS-Proxy we get isolation on this level, and we also get the content of the credential caches encrypted. So you can steal a credential cache if you get access to it. But it will be encrypted with a temporary key that exists only as long as GSS-Proxy runs.
So even if you steal it, you cannot really use it. The other part is that there are credential cache collection types that do not allow direct access to the credential cache. So for example the keyring, the kernel keyring, where you can store it and lock it down fairly tightly so that the application doesn't get these keys out. The other one is KCM, which is also a separate process accessible over a Unix domain socket. You don't get direct access to the files either. So there are some mechanisms. They are not as dramatic as in the Windows case, where they run it in a completely different VM isolated on the hardware level. But we get fairly far. I've seen people trying to decode the KCM databases, but that presumes that they get access to the physical root partition so they can read the actual files from the KCM backing store. But of course, if you're root, if you have access to the root partition, the game is lost in most cases. Alexander, many thanks for your interesting talk to the community. |
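As a small illustration of the credential cache types mentioned in the last answer: Kerberos clients are usually pointed at a cache through a name of the form TYPE:residual, for example in the KRB5CCNAME environment variable. The sketch below is a minimal Python example and not part of any Kerberos library. It splits such a name and reports whether the cache lives in ordinary files or is held elsewhere, such as the kernel keyring or the KCM daemon. The helper names and the example values are assumptions for illustration.

```python
from typing import NamedTuple


class CCacheName(NamedTuple):
    cc_type: str   # e.g. "FILE", "DIR", "KEYRING", "KCM"
    residual: str  # type-specific remainder of the name


# Cache types whose contents live in ordinary files on disk.
FILE_BACKED_TYPES = {"FILE", "DIR"}


def parse_ccache_name(name: str) -> CCacheName:
    """Split a cache name of the form TYPE:residual."""
    cc_type, sep, residual = name.partition(":")
    if not sep:
        # Assumption in this sketch: a bare path without a type prefix
        # is treated as a FILE cache.
        return CCacheName("FILE", name)
    return CCacheName(cc_type, residual)


def is_file_backed(name: str) -> bool:
    """True if a process with file access could read the cache directly."""
    return parse_ccache_name(name).cc_type in FILE_BACKED_TYPES


# Hypothetical example values, one per cache type discussed in the talk.
for example in ("/tmp/krb5cc_1000", "KEYRING:persistent:1000", "KCM:1000"):
    parsed = parse_ccache_name(example)
    print(parsed.cc_type, parsed.residual, is_file_backed(example))
```

The point of the distinction is the one made in the answer: with KEYRING or KCM the application never holds a path it can simply read, so stealing the cache takes more than file access.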
Winners and Losers in FOSS
Open Source Has "Won" - Have We? |
Well, thank you, everyone, for attending. It's lovely to see all of you. My name's Mike. I go online under the handle, Nolski, so you'll find me as Nolski in the Matrix chat as well as in IRC. I have a bunch of titles because I choose to have titles instead of a social life. But probably the most important in the context of this talk is the bottom one. I am a scholar of political economy, of science and technology. I'm a graduate student, and this is kind of the field that I study. So political economy is kind of a funny word, but essentially what it means is it's the study of how economic systems and political systems are intrinsically linked. And so this framing is really what drives the analysis of my talk today. And so political economy has been around for a very long time. It's been the primary subject of many very, very famous and loved thinkers throughout history that I'm not going to name. But what are we talking about here today? Well, first, I want to do a quick history of free and open source software, like a very abridged one to kind of drive home a couple of questions I've been asking myself for the last few years. Second, I want to kind of posit a question about, are we really building the same software now that we were when the initial foundations of the ideology around free software began? And if we aren't, how is this software we're building today different? Then I want to talk about kind of where we are today as a community, as a society and ideologically. And then lastly, I kind of want to, you know, posit some questions that will hopefully generate a very lively but hopefully respectful discussion that adheres to the code of conduct with this event. So let's get started, or actually let's not get started because I talked to my boss and I have to give a couple disclaimers. So first, you know, I want to, I'm going to be talking about politics here and economics of free software. 
And I want to be clear, you know, this talk doesn't seek to analyze or take any stance on, like, the morality of who you work for or whether your employer makes you a good person or a bad person. It doesn't even seek to analyze whether certain organizations are good or bad. This talk is to analyze the structures that exist in our technological production and see whether, you know, we might be interested in changing those structures, not, like, who is good and bad. Furthermore, I've got to read this verbatim according to my boss. The views and opinions expressed in this program are those of the speakers and do not necessarily reflect the views or positions of any entities that I represent. So now that that's out of the way. Thank you. Yes. So hopefully I will have a job when I get back home. So let's do a quick timeline of free and open source software. This is very abridged, so I'm leaving out parts for brevity so we can have a discussion and focus on other things. But in general for this talk, I want to bisect the history of FOSS into two periods. And these dates are very generalized, but, you know, the first period I would argue is between the 70s and the early aughts. Particularly during the beginning of this period, computing hardware was just starting to become standardized, right? Previously, if you bought a computing machine, it would come with software provided by the manufacturer, and it was custom made for that model, right? And you couldn't easily transfer software from one model by one manufacturer to another. But, you know, soon after, hardware architectures became standardized, right? And so people could write software that runs on many different kinds of machines. They could share it with each other, they could trade it, and software became what's called a commodity, right? And so this generated all sorts of new industries that, you know, I'm sure you're all aware of, within, like, information technology and so on.
And you know, as part of this, businesses formed to develop and sell software and services around that software. And at the same time, so too did communities of practice, of production, that sort of formed the early, you know, ideas of what we imagine as open source or free software communities. And so from this situation arose the beginnings of, you know, what I would call one of the foundational ideological pillars of free and open source software. In particular, Richard Stallman proposed these four essential rights or freedoms for users as, like, this core ideological pillar of what free software should allow and kind of be about, right? These aren't, like, licensing terms, but rather kind of a stance, right, ideologically. Specifically, right, for any software a user obtains, they should be able to use it for whatever purpose they want, they should be able to understand how it works, they should be able to share it onward to whoever they think might want it, and they should be able to modify it to suit their own needs, right, and improve it potentially. And so, you know, one thing I'm really interested in considering during this period is the underlying assumptions and technological context in which these four freedoms were initially posited, right? At this time, most software was this very discrete thing that oftentimes came in a box or on a floppy disk or whatever, right, and when you ran it, it was on your own machine, it was limited in its size and complexity, it was portable and largely understandable, right, and it was, you know, discrete. So, you know, many businesses at the time would license their software in a way that explicitly forbade these rights, right, like, you know, you can download my copy of Windows but you have to pay for it and you can't modify it and you can't, like, you know, do whatever.
And so, this causes tension between these communities of developers creating free and open source software and businesses who are specifically seeking to profit off of their ability to generate revenue through these proprietary licensing strategies. You know, Bill Gates, you know, had this infamous quote where he referred to free software activists as, at the time, as communists on a television interview, and you know, I would argue if you talk to, talk to folks from this period, you know, many free software organizations actually struggled with like being labeled as like communists or anarchists at the time, and so, you know, this was kind of almost like a meme, and you know, just as a disclaimer here before I get sued, this poster wasn't made by Microsoft, so this, this poster was a meme created by someone else, so, you know, this is not representing Microsoft's views at the time or now, so Microsoft please do not sue me for false light. So, anyways, you know, this struggle led to there being many projects and products both open source and proprietary, right, oftentimes like competing products, right, and many businesses offered their own solutions that actively marketed against open source software alternatives that were trying to accomplish the same goals, fulfill the same features as their proprietary counterparts, right, and you know, these businesses, a lot of their ideas at the time is they're relying on their greater cohesion to try to build like a better product and software ecosystem to the, you know, when compared to their sometimes fragmented open source alternative, and this is how they seek to try to capture the market for any given like use case, right, so in short we had this situation where we see directly competing products whose common differentiator is really their licensing strategy, this is the thing that made them stand out apart from each other, right, you got Windows and Linux, you got Firefox and Internet Explorer, you got OpenGL and 
DirectX, right, all these were competing products to see which was better, and in all these cases it was a competition over, you know, which organization, the community or the firm, could build those capacities more effectively, right, and many of our debates and discussions at the time revolved around, you know, which method of production was more efficient at building these capacities, and you know, I'm sure even if you go around today or you talk to your friends, we still hear these discussions, right, when we're talking about trying to encourage people to develop open source: it's more effective, you know, it's just easier, we can share costs. And so, you know, during this time licensing was this huge part, if not the most important part, of how we thought about the battle between free and open source software and proprietary software ecosystems. You know, IP rights at the time were really coming to the fore with the advent of, like, a lot of digital intellectual property, and when considering this initial ideology focused on user autonomy and freedom, this idea of IP ownership, facilitated through licensing, was like the sort of core method by which many people thought to achieve this more ideological goal, right? So now I want to kind of jump ahead to the sort of second phase, where we are today, right? So this relationship between open source and proprietary software, it didn't really stay this way, right? Now as we all know, you know, if you come to any one of these conferences, you've probably seen this statistic already today: free and open source software has become this foundational infrastructure upon which pretty much all of our digital services are built, right?
And before I go ahead, I just want to drop, like, a thank you and a note that a lot of the graphs I'm going to be showing in the next couple of slides come from this amazing 2021 report written by the Digital Commons Policy Council. So I've got a link to that policy council in the slides, which you can download off of the FOSDEM website. I encourage you to check it out, but they did a lot of really great work getting the data and the empirical sort of findings here. So anyways, you know, today we see some of the biggest antagonists towards free and open source software actually directing the most significant amounts of their workforce and budget towards its support, especially when compared to other firms, right? Microsoft was like the big bad in open source, right? And now they're actually one of the largest contributors to it, right? They're very much bought in to open source software production. And not only have major firms changed their tune in relationship to each other, but now when we look at open source communities and the largest and most important projects on the internet, employees of these firms now generally outnumber independent contributors. And you know, I'm careful to use the term independent here rather than volunteer, because independent contributors are oftentimes contributing in the context of, like, an organization or institution, but it's not really one that is focused specifically on maximizing profit for themselves. So these types of contributors, I would argue, face very different incentives in terms of what they're trying to build and for what purpose. Furthermore, you know, as time has gone on, the ratio of firm employees to independent contributors has increased steadily, right? So year over year, in pretty much all major projects, the number of firm employees when compared to non-firm employees has increased, right? So this is not just, like, a finding, but it's a year-over-year trend that's increasing steadily.
So you know, I would argue one thing to note in this case then is that we can sort of find that independent contributors oftentimes have less power in these communities simply because they're being outnumbered by a much more cohesive group of other contributors, and they just have less power to develop, right? And so, you know, while these projects are majorly developed at the explicit discretion of firms, what I think is most important about the shift between these two periods is actually how the form of software itself has shifted when thinking about open source software. So if you look at this list of projects, so this was I think the top 20 or 25 projects in terms of contributors and activity found by the Digital Commons Policy Council, and if you look at this list, nearly all of these projects, they aren't really end-user software anymore, even of the technical sort, right? Many are for use solely in the development of these sort of large-scale products and platforms that can really only be applicable to users who have significant resources to run them, right? So like, let's have a look, right? We've got, you know, a number of server orchestration software projects such as, like, Kubernetes, Moby or Ansible. We also have machine learning and distributed systems projects such as, like, Spark, PyTorch and TensorFlow. And I'm sure as you attend talks today, even at FOSDEM, a conference that's, you know, quite community focused, you'll likely find the majority of software programs being discussed are of this type. So you know, why has the production of free and open source software shifted so incredibly to this one very kind of specific niche within software applications?
So I'd like to argue that most of the software we interact with on a day-to-day basis, and you know, maybe I'm leaning out of the, you know, free software developer archetype here when I'm saying this, but most of the software we interact with isn't really discrete applications anymore, but rather what I would call platforms, right? And because of this, the relationship between much of our open source software and our proprietary software has, like, fundamentally changed, right? So why are these projects that are, you know, important now so different from the projects that were important before? Now I'd like to argue that many of the most powerful technology firms have largely pivoted away from licensing discrete software products as their core sort of revenue model to the development and, you know, managing of platforms to generate revenue for them. So you know, before we go forward, I'm sure you've all heard the term platform before, but I'd like to try to formally define it in economic terms, right? And so, I'm paraphrasing here, but this definition was first posited by Lerner and Tirole, the latter a Nobel Prize-winning economist. So when we think about a platform, you oftentimes think about it like a software system or something like that, right? But these software systems aren't really commodities in themselves, but rather they're made as a system to manage access between, like, networks on two sides of a wall, or commodities on two sides of a wall. So to give an example of this, you know, Facebook is a platform which manages access to communities and media for end users on one side, and then, you know, willing eyes, data and ad positioning for advertisers on the other side. And Facebook has kind of this monopolized access to both of these networks, right? Or at least, like, a very preferential access to both.
You know, Amazon is a platform which facilitates access to distribution networks and a global marketplace for producers on one side, and then a network of goods and services for consumers on the other side. Google is a platform which facilitates information and digital services for end users on one side, and again, you know, willing eyes and user networks for those digital service providers and advertisers on the other. So the nature of the platform has positioned their creators to exploit what are called multi-sided markets, with these sort of two networks, by acting as a mediator and a gatekeeper of access across this sort of divide. So, you know, I'd like to posit this question to everyone that we hear all the time. It's honestly probably getting a little long in the tooth, but, you know, has open source won? Like, you know, have we done it? We are foundational to the operation of, like, the entire world. You know, open source is really important. More people than ever are creating it, right? And so, you know, no, I don't think so. I mean, there's clearly more open source, right? There's probably a similar or larger proportion of open source software to proprietary software that we use on a daily basis, at least within the dependency chain. However, we still have little to no control over the vast majority of services that impact most people's lives, particularly digital services, right? From e-commerce to social media to health care provision, all of these, you know, might heavily rely on free and open source infrastructure to function, but we still have no actual control. So when we think about the ideological pillars of our movement in the first place, I don't see any significant gains in this area. So, you know, to me, when looking at this history, it really seems that many free and open source advocacy programs have been outmaneuvered in this way due to the shape of our software changing.
So, you know, I know that's like a little depressing and everything, and I'm sure many people have a lot of opinions about, you know, maybe the validity of that, but, you know, I kind of want to talk about, well, if this is the case, right, and if this is a problem that we as a community are facing, then what do we do about it, right? So first, I want to kind of review a couple of key points that I wanted to make, and then I want to, I know this is like a stage shock, but I want to have a lot of Q&A afterwards to have a discussion with the audience. So first and foremost, you know, I want to claim, you know, free and open source software isn't competing with proprietary software anymore in any significant way. I know, I know there's, you know, probably examples, but when we look at the largest, most active, most important free and open source software projects, they are not competing with proprietary alternatives. They're helping host them, right? The second is our labor as free and open source software contributors in general is being increasingly dictated by firms. We see this year on year because we're being continually outnumbered by firm contributors, and we're having less of a say in the communities that we act in. You know, furthermore, more of us are working directly for firms when contributing and are therefore, you know, required to represent their interests. And even when we aren't, certainly the most important projects, our ability to dictate the direction of these is, you know, we're outnumbered. On the flip side, we are finding when these firms suffer any loss in their rate of profit. Usually it is us who are among the first to be cut, right? This precarity and increasing control that firms have over our collective labor powers means that oftentimes we're losing autonomy as creators over our own labor. 
You know, so this original framing of user control over, you know, our software that we download and run on our machines, you know, I would argue is no longer relevant because we live in a time where, you know, we majority interact with these really complex, massive platforms. Most critical software we interact with is no longer these discrete packages offering specific functionality to users, but rather, you know, these platforms are gating and mediating access between markets, commodities, and other networks, right? Therefore, our original conception of simply increasing the amount of openly licensed software products is no longer as relevant as the projects we're building are largely infrastructure to maintain these platforms, right? So we can no longer avoid this very political question about how we attempt to reorganize free and open source software labor on behalf of these original ideals of, you know, user rights. You know, labor organization has frequently been a topic avoided previously in free and open source software, and I would argue is due to this major focus on licensing instead. You know, we didn't talk about, like, who we're working for or how we're organizing our work because we wanted to focus on where the ownership was, and if we can create more, you know, open licensed things, then the ownership goes to everyone and the power is distributed. And I would argue we've seen that we've been kind of outmaneuvered in that sense because our interaction with owned software has largely changed. So you know, this all sounds not great, but if this is the case, then how can we adopt a more sort of labor-oriented approach to the topic of free and open source values and ideals, right? How can you, as a foster, or me, you know, I'm not off the hook here, help bring about this change? 
So, I can't claim to have all the answers here. I'm just a guy interested in this topic, studying this stuff. Especially now, I understand the topic of political change is a very touchy one. However, there are some ideas I've researched that I think would be at least worthwhile for us as a community to investigate further. The first is the idea of understanding that not all free and open source software contributions are equal. So when considering the value of free and open source software contributions, maybe we should prioritize developing alternatives to existing platforms that are controlled by firms, right? And when I say alternative here, I don't just mean an openly licensed clone, but rather a solution designed specifically with these ideals of user autonomy in mind. It may take an entirely different form to create something that would be comparable. The second thing I want to talk about, which I think is very important, is organizing within the workplace. I've met many, many, many burnt-out free and open source software maintainers in my life, and I understand the pain there. Our ability to produce what we want requires that we have greater control over our own labor. So, while technology roles are oftentimes compensated well, at least within certain geographic areas, that doesn't eliminate our precarity as workers, right? And I think, in particular, many of us are finding that out now. So in this case, what do we do about it? Well, historically, the formation of labor unions has allowed us as workers to better bargain with our employers and regain control over our labor power.
And I think, you know, when thinking about how we can, like, manage to have the time and capacity to be able to build what we want, thinking about regaining that control within your own workplace, particularly through labor unions, is, you know, a great first step. The third thing I want to talk about, which is a little touchy, is, you know, explicit political organizing within the context of our free and open source software community. You know, I don't think we can just build and license the solutions that we want, but rather we need to organize ourselves along explicit political lines that go beyond that of licensing. When you think about free and open source software, I was in Bradley Kuhn's talk today in the legal dev room, where he was kind of talking about his personal history in this case. And he talked about how there were a lot of, I forget the word he used, but, like, collaborations between groups that had different interests, right? And he was questioning, like, you know, I don't know if we should have worked with them or worked with someone else. And I think it's very hard to determine these sort of partnerships that you enter unless you have a very clear political line that, you know, you as a community are agreeing to work together on. And I think defining this is really important, probably just as important as something like the open source definition. Lastly, you know, I think finding and developing new workplaces is going to be a really important fact when thinking about, you know, kind of figuring out how we can organize ourselves to develop the software that we want, right? I get that this isn't an option for everyone immediately, but, you know, the role of institutions such as, you know, NGOs, nonprofits and universities have historically been, you know, key creators of foundational free and open source software infrastructures. 
And furthermore, I think we as a community see our ability to represent ourselves lessening in general, right? You look at organizations like the Linux Foundation, and they pretty much exclusively represent the interests of their corporate members. That is literally how they're incorporated, right? And they have no public good mandate. So I would argue that we should organize, agitate and push for these institutions in our society to support this production, to draw power away from firms. So now, this is a bit tongue-in-cheek, I'm not claiming you should all be communists, but this is my poorly made rendition of the earlier poster. I want to wrap up with this: free and open source software, I would argue, was largely outmaneuvered due to this lack of cohesion and organization within a specified political framework. And so if we do seek to reorganize ourselves for a new era in free and open source software, then we should consider how we got here and see if we can correct some of those mistakes. So, this is my second to last slide, and I'd like to open it up for Q&A if that's okay, but I thought I would leave this here: if you found this stuff interesting, I'd like to leave a kind of further reading list. These were some of the primary citations that drove this research for me, but I also think they're just interesting, relevant pieces of written work. There's a couple of books in here. I link to them on LibGen; if you can't afford them, you should be able to download the e-book at least. And yeah, thank you, I'm Mike, and I'd like to hopefully try and answer any and all questions you might have. Thank you very much for the talk. You presented a very interesting table regarding the leading firms for each of the repositories, etc. So I'm a nixpkgs contributor, and it was mentioned that the leading firm was LogicBlox. I think it doesn't exist anymore.
I'm very interested in how you generate such tables, because it's super interesting. Also, how do you deal with corporations or firms which are not really firms? For example, on the table there was Home Assistant, and the leading firm was Affolter Engineering, which to the best of my knowledge is two people, and I don't know if you can qualify this as a firm or not. And what about those repositories or projects where you have a lot of people who come from very different backgrounds? Yes, that's pretty much it. So as I understand it, there's a little bit of rattling. You're kind of asking, what is the methodology for generating the data that you're showing? How did you get it? Am I able to click on this? So these tables were generated by the Digital Commons Policy Council as part of the Ford Foundation Digital Critical Infrastructure Research Grant. I don't know if I... Oh, wow, that was small. So this is the research grant. I don't have Wi-Fi. That was silly of me. But essentially, they detail in this report their exact methodology for doing it. It was a lot of... we had to deduplicate identities. We compared emails for frequency of commits and mangled it all together into a single report. But they can detail it pretty well. I did look over it, because I was asking myself the same questions, and I was pretty satisfied with it. It was a while ago, so I would probably do a poor job of repeating it. I'm over here. Hi. I found that really interesting and very provoking. An outcome, I see a conclusion of this. The idea that free software has become platformized and has become this sort of substrate is a sort of two-tier society, where we in this room are a sort of intelligentsia, and we have access to these free tools and we get access to the four freedoms.
We can use free software licensed products and we can study them, share them and improve them, and then we build services, service as a software substitute, and everybody else is a kind of lumpenproletariat who doesn't get those freedoms and doesn't get to benefit from the intellectual fruits of free software. Is that necessarily connected to the kind of corporate capture of free and open source software that you were talking about, and what are the organizational routes or the agitations that we can take to actually extend those freedoms out beyond the experts and to the people who use computers? Yeah. So a very big question you asked, for sure. If we imagine the free software community, oftentimes we're very technical people. I use a Linux desktop and I'm a social scientist, so I would consider myself probably fairly non-technical here, but it allows us to have these sorts of freedoms by having this technical knowledge, and many people, I'm sure many that you have interacted with, maybe don't have that technical knowledge or that capacity or whatever to interact with free software in the same way and use it. Maybe their job requires that they don't, or whatever. And so does this play a part in figuring out how we can advocate for these ideas to be more equitable across people? Is that correct? Cool. Yeah, I would argue that for sure as technologists we face unique labor pressures, right? In particular, I was at the EU policy forum thing yesterday, and there were a lot of policy people that go up and they're like, technology, the coders are going to save the planet because they're geniuses, and we get a lot of lip service towards our technical ability.
But on the flip side, I would argue that we still face many of the same precarities that we've seen workers have throughout history and for sure it's a struggle and it's a discussion we need to have as we're organizing ourselves but I think first and foremost, if we want to have better free and open source software, we got to have more control over our own labor power to create it. We've done a lot of really cool stuff just like as hobbyists and things like that. Imagine if there was an army of us all working together and dictating like, oh, this is what we want. I don't want Facebook. Who likes Facebook anyways, right? I want to build the thing that's like actually allowing me to connect with people in the ways that I dictate is best for me. It takes a lot of organization and agitation to get there and I think even as intellectuals or a person in the professional managerial class, we still have a lot of incentive to work at that. I hope that answers. It's kind of a vague answer. Yeah, so when open source is being developed in discrete ways for discrete products, there was no cost to the developers of the software when someone used that software in whatever setting. So you could produce a thousand or a million and the open source project almost didn't know who was there and who was using it. And I understand your concept that there's more of a consumption at the platform level now than discrete software. But the question is how does, how do projects evolve in an environment where you have to provide now, there is now a cost to that platform. So you've got to have infrastructure, you've got to have networking, you've got to have admins and so forth. And there's now a per user cost in a lot of ways to how your software is consumed in a way that before it was zero cost. And it feels like that change is pretty significant and does almost open the door or close the door frankly for pure open source projects to almost operate at scale in that environment. 
Yeah, great question and very, very relevant and not one I think I have the answer to but like I want to maybe propose a couple of potential answers that I think, I can't choose, right? We have to choose. And so when thinking about platforms, there's a lot of different ways, currently many platforms are kind of exploitative, we don't really have a choice in terms of whether or not we use them if we want to have access to a certain thing like social networks or whatever. So part of it's like, oh, maybe we just need to collectively own it, maybe just nationalize it is like one very political way of thinking about solving this problem. If we nationalize it, then it's like democratically controlled theoretically and like this is one way of approaching this problem of user autonomy, then if someone wants to change it, then we have democracy to help facilitate that somehow. Another way of approaching the problem though is like, do we need platforms at all, right? Is this approach of one specific technological solution to solving one specific need within our society, is this the right way of going about it or was this just like the most effective way of running it? Can we return to, can we imagine futures of which like our technology is progressing, but it's still discreet enough that we can maintain control over it? And then what does that look like and how do we get there? I think these are the kind of different sort of arguments and kind of maybe factions that might arise within the open source community. I think if we start, begin really rethinking like what we're doing on a more fundamental level. Does that sort of, yeah, thanks. Hi. Hi. 
I was wondering what would we see or hear when there's good organization within the workplace and that question is sort of in context of major strikes in the UK at the moment around a lot of public services and also in context of a blog post by Carl Mitchell called It's a Trap, which is basically warning that the union sphere in the US is very heavily legislated and there's lots of sort of bad political context when it comes to saying we should have a union for this, basically being something that you shouldn't ask for unless you really know what you're getting into. Yeah. So, I guess maybe before you hand it off, is your question more on like what do we do about unions kind of being portrayed in a certain way or is it like as a union applicable? That's organized well. Right? What does that look like? Yeah. Yeah. That's a great question. I have a lot of friends and as part of my research, I do lots of research with how unions currently organize and how they use technology and so, you know, there's not like, there's not one, you know, everyone I talk to, whenever I ask them this question, they're like, well, I can't give you a blueprint, right? But I can like tell you some things that work for us, right? Within, you know, I work with a couple union organizers at Amazon, right? And they have huge turnover. So, you know, focusing on like new employees as a point of organizing and getting people in educating them is like very important for them. However, I have another friend that works as a technician in a hospital who has a very small department. And when they start, when they started focusing on unionizing or at least like collectively bargaining amongst themselves, they started in a very small but critical part of their organization saying, you know, we can't do the whole hospital at once, we'll just get fired, right? Or we'll have a bad time and it'll be very hard and we don't have the capacity. 
But maybe within our department, if we, you know, begin forming certain lines about what we want and like what we care about, we can begin starting our organization from there and then moving outward if we gain momentum. You know, I think particularly within the technological context and people within, you know, like sort of tech companies, this might be an interesting thing to try out. But you know, that said, we haven't seen, we don't have a lot to learn from in this sense, right? There aren't a lot of existing tech company unions. But like, you know, maybe that's a place to start, maybe looking at some of the smaller ones or, you know, if you work at Amazon, check out Amazon Workers United. If you're at Google, check out Alphabet Workers United. There are existing unions within these companies, even if they aren't contract unions. Thanks for your talk. And this is also related to like organization, you mentioned software foundations. I mean, unfortunately, a lot of them have similar structures and like business models. So I mean, because I've been thinking about like alternatives to software foundation, is it that we need to work on like a different membership or a different business model or governance? Do you have any thoughts on this? Yeah. It's a big question. And certainly when I hear a lot of folks talking about, you know, I think it's, you know, I would argue, and I know I called out many companies here, but I would argue the Linux Foundation is quite different from most open source software foundations. You know, Linux Foundation is a membership organization that is formed specifically to represent, you know, its corporate members. You pay for a membership to get into the Linux Foundation. They have a summit and stuff like that. And they're supposed to advocate on your right, on behalf of your interests as a tech company that utilizes open source. And they, you know, they've grown quite a lot due to their funding. 
But I would argue that's like very different from something like the Python Software Foundation or the Rust Foundation or even like Apache or something like that. You know, I would argue, I would like to see more collaboration of, you know, I don't think software foundations for every project that exists in their own little ecosystem. I think that's good so you can receive funding and stuff like that. But we should start figuring out ways that like, you know, what if there was a Linux Foundation that wasn't, you know, driven principally by like members who paid for ownership, right? Maybe we could think about things like, oh, you know, the top contributors might get membership if you contribute a certain amount or, you know, some other form. There's a lot of different ways that you could go about it. But I think like figuring out how to bring communities together with similar interests even if they aren't the same project or in the same tech stack is important because, you know, then you can begin organizing along ideological lines rather than just like, how do we make Wayland work better in GNOME or something? Hi. I feel like there are a lot of pieces to kind of dismantling these platforms and I feel like a few of them are being discussed around here like the cooperative company structures and building new like open platforms but when it comes to unionizing, because there aren't really many tech unions, I've been thinking about a lot about how these would work and I wondered what your thoughts were on the idea of kind of using tech unions to try and dismantle some of the existing platforms like, say, we're not going to work on AWS because we don't agree with Amazon and what it's doing with its other employees, for example. What are your thoughts on that way of approaching this kind of thing? 
Yeah, you know, there's, I would say like, you know, one step at a time, you know, before giving the talk, obviously I talked about the subject to a lot of people and there's a sense of like, ah, you know, things are pretty bad now, like how are we supposed to like deal with all this stuff? It feels like everything's collapsing, we're kind of in survival mode. I would argue at least right now, right, at this moment, the real benefit of unionizing within technology workplaces is it gives a little bit more time and autonomy and stability back to technology workers and so, you know, when I think about the crisis and open source around like volunteers who are just overburdened and stuff and I think about like everyone I know that works in tech that's like on call all the time and, you know, like software services are crashing and now like, you know, their coworkers are being laid off or maybe they are, you know, this is the problem that I think like, you know, if we can claw back some of our own time, if we can get a little more autonomy on our own, a little more reasonable, you know, working conditions, like even, you know, I get that the pay is good in some conditions, then we can begin, you know, just having the headspace to think about these new solutions and have the time to, you know, organize with each other where like our boss isn't watching us either, right. Thanks for your talk, you talked about software, but do you think we can observe the same shift from individual and independent contributors to software companies and big firms paid contributors in the international standardization organization like IETF, for example. I think about how the first version of HTTP protocol was defined and the third version which was mostly the result of a bargain between Google, Facebook and Microsoft. Is there any data or work on it? I'm sure there is. 
I don't have any, I have anecdotal data, like I served on a couple IEEE standards committees and, you know, noticed this sort of, you know, major differentiating where you get people who are there on developing a standard specifically for the interest of their company, right. They're like, oh, my boss is like paying me and I'm supposed to help manage this and like make it so, you know, our organization can accomplish the goals that we want to do and like, I'm not saying it's evil to want to do that or anything like that, but, you know, I see a lot of that and I don't see a lot of people where it's like, I think this standard is pretty cool, like I did a thing with something related one time and I want to, you know, it's like showing up at your town hall meeting, who has time for that anymore, right. So, you know, I would argue the same principles are at play here, right. More and more organizations are interested in forming standards with each other because it helps share the cost of development of a lot of their infrastructures, right. No one wants to reinvent the entire stack for whatever thing they're building, so by utilizing standards, you know, you can more effectively do that the same way that you could by utilizing open source software to do that. And so I would say, like, principally, there are a lot of similarities, but I don't have like a, I certainly, at least at this time, I couldn't empirically prove it. But a great point for further research on the road from here. Right, is that better, fascinating talk. I just wondered whether we've made life easy for the big corporations. I think we all enjoy writing software, we enjoy solving a problem, and it's a bit like an academic researching something, and then you get an entrepreneur making his fortune. How can we fight back as developers? 
Yeah, so I guess your question is, if I understand correctly: we kind of made it easy for us to get into the situation that we are in now, so how do we fight back? Is that it? Yeah, well, I'm not a huge fan of the kind of self-flagellation where it's like, we got corrupted or whatever. The last 50 years of history around the development of digital technology had a lot of very major incentives, it was very complex, and it was not necessarily super obvious that we would be where we are today. I think the most important thing is that the future is yours if you fight for it, right? It's not easy. The things I suggested, like if you want to try to unionize your workplace, that's not an easy thing to do. I know many people that have tried and failed, and I know many people that have succeeded as well. But if this is something we want, right, if you think about the four freedoms or these ideals of open source as user autonomy, and you're like, this is the world I want to live in, and as a technologist I can create this stuff, then, like anything, it will take work to try to do that, and it's more about recognizing that and then figuring out what to do next. I think we're at time too. One more question. Thanks for this. It's a really kind of great thing. I want to ask not only you, but everyone else here: do you think that we as humans can step away from that idea that you always need to have someone in charge? So we have that pyramid kind of thing, and the question is how to go more flat. What I'm saying is, we are so used to having someone in charge, someone who, as you said, and it's a word that I don't like a lot, can dictate to someone what they should do, instead of actually teaming up.
I'm not sure that we as humans really can do that on some bigger scale, and it's a left. I'm very left, but I don't go far left, so I am on the real left side. So I'm saying that it kind of feels like if you play in a small band and you make a nicer songs and, you know, with music, and as soon as you get to my mind there, you sell your soul, and you're gone. So how we can go more flat without losing our soul? Yeah, I mean, it's tricky, you know, the human nature and needing someone in charge. This is a little bit of a joke here, but like, you know, I would love to introduce you to my four-year-old nephew. He might have you changing your thoughts on that very quickly with respecting authority and wanting that. You know, I think there's no easy question about like, but you have to think about our organization scientifically, right? I don't think it's like a matter of preference, but we have to think about our relationships to each other. We have to test and evaluate various modes of organization, and we have to approach it like any other difficult scientific problem in our lifetime, and, you know, through that hopefully we can find a better solution if not the right one. Thank you. We have many thanks for the amazing talk. |
Fair threaded task scheduler verified in TLA+ |
Hello, my name is Vlad. I am a C and C++ developer with roughly eight years of experience. Right now I work at a game dev company called Ubisoft, which some of you have probably heard about, mostly with low-level networking and some bits of system programming. And at Ubisoft, while doing a complete rewrite of our networking stack in one of the game engines, I managed to design, I hope, a cool algorithm for doing so-called task scheduling. Today I share this algorithm in the form of this talk and an open source C++ implementation of it. Although, if you don't like C++, it doesn't have to be C++: today we focus just on the algorithms, and they can be implemented in some other language too, like Rust. I'm sure it can be done in Java, maybe in C#. There will be a lot of content, so we will be going relatively quickly, and I ask you to concentrate. The talk will follow this plan. Firstly, what is task scheduling exactly, because many things can be understood by it. Then I explain how it's normally done and what the typical problems with it are, at least those known to me, how the old task scheduler in my game engine worked, and how the new one works. And then I will briefly show you the benchmarks, how it's verified, and what the future plans for it are. So what is task scheduling exactly? In this talk, by task scheduling I mean any kind of execution of code, such as callbacks, functions, or something similar, and we just call it a task, a generic term. To give you a few examples: tasks can be callbacks in a thread pool; or tasks can be watchers in an event loop, like in libev in C, where we can have watchers as timers or wrapping sockets, and they also have callbacks; or we can also call tasks the coroutines in some coroutine engine, like C++ coroutines or fibers in the Tarantool database. So here is a very, very simple, trivial scheduler implemented in C++-like pseudocode for simplicity, which demonstrates this example. So it's just a single thread.
It has a mutex-locked queue for callbacks, and what it does is just execute those callbacks one by one. This is a very simple scheduler. Now let's see if such a basic scheduler can solve a real task, which I had at Ubisoft. Here, tasks execute requests coming to a server, and they handle save games, save game blobs. There might be tens of thousands of those requests per second, and they are also CPU intensive. So every task might take milliseconds of pure CPU time, and they also consist of multiple steps: in each task I have to lock the user profile in some database, then download the save game blob from another database, then do some pre-processing of this blob, some manipulations, then upload it back and unlock the user profile. As you can guess, most of the time the task is not doing anything at all. It just sleeps, waiting for network input from the databases. So literally more than 90% of the time there is nothing happening in this task. The trivial scheduler which we just saw, with a single thread, will not work here. It simply will not scale. Firstly, a single thread is not enough to handle so many requests that are so CPU intensive; it will just choke on CPU. Secondly, we can postpone blocked tasks and do other tasks while the first ones are waiting for some events, like network input. This means we need coroutines, so that tasks can yield, so they can give up their time to some other tasks in the same thread. The game engine I'm working with had a scheduler which could do the job well enough for some years. It was just a thread pool where each thread had a list of its own tasks, and when new tasks appeared, they were distributed to those threads in a round-robin manner, one by one, and they were pinned to those threads forever. They could not migrate between threads.
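The trivial single-thread scheduler described above could look roughly like the following. This is only a minimal sketch, not the code shown on the slide: the class name `TrivialScheduler`, the `post` method, and the use of a condition variable to put the worker to sleep are assumptions made for illustration.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical name. One worker thread drains a mutex-protected queue of
// callbacks and executes them one by one, in posting order.
class TrivialScheduler {
public:
    using Task = std::function<void()>;

    TrivialScheduler() : worker_([this] { run(); }) {}

    // Executes everything still queued, then stops the worker.
    ~TrivialScheduler() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cond_.notify_one();
        worker_.join();
    }

    // May be called from any thread.
    void post(Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(task));
        }
        cond_.notify_one();
    }

private:
    void run() {
        for (;;) {
            Task task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cond_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty())
                    return; // Stop was requested and nothing is left to do.
                task = std::move(queue_.front());
                queue_.pop_front();
            }
            task(); // Run outside the lock so producers are not blocked.
        }
    }

    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<Task> queue_;
    bool stopping_ = false;
    std::thread worker_; // Declared last: starts after the other members exist.
};
```

A task that blocks inside its callback stalls every task behind it here, which is the scaling problem the talk goes on to describe.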
And then what each worker thread does is update all its tasks with some fixed, hard-coded period, like once per 100 milliseconds or once per second. Each task, after being updated, can eventually return false, and then it is considered done and is deleted. So this is basically polling, and we will call this scheduler the updater, because there isn't much scheduling, really. It just updates those tasks without looking at their state or anything, so we call it the updater. This updater had some typical problems, like schedulers of this type sometimes do. Firstly, it was unfair, meaning that tasks were pinned to threads and could not migrate. This leads to the problem that your CPU usage will be unbalanced, because some worker threads will get heavy tasks, a lot of which will stay in the queues, and other threads will get light tasks and will be idling most of the time. This will happen even if you do perfect round-robin and all your tasks are the same, because actually tasks are never the same. In my case, all the tasks do the same several steps, but save game blobs can vary in size from kilobytes to megabytes, so their processing and their downloading obviously do not take the same time. So some threads will be unfairly loaded, and they will perform worse, at least in terms of latency, because they will have bigger queues and tasks will wait longer in them. And that is not the only problem. The other problem is polling, which in this case means that each task is updated unconditionally with some fixed period regardless of the task's state. So every task is updated even if it doesn't have work to do yet, because it's still waiting for network input. What this means is that if you select too big a polling interval, then you will have too high a latency, more than you could have.
For instance, imagine we have a task with just three steps, each taking five milliseconds, and in between it waits for some network input, and we update it once per hundred milliseconds. Then your task will always take at least 215 milliseconds. The thing is, you don't always need so much time. Most of the time, almost always, the network response will arrive before the 100 milliseconds expire, and you have events to process, but you are not processing them because it's not yet time to update the task. So we have higher latency than we could have. If you try to fight it by setting a lower polling interval, then you will burn CPU without need, because you will have spurious wakeups, that is, unnecessary wakeups. You will sometimes wake up tasks before they have stuff to do. That doesn't sound too bad; how expensive can it be, really? Imagine a spurious update of a task costs us just 10 microseconds: to check a deadline or check an atomic flag, lock and unlock a mutex, and that's it. It's not much, but if you have 10,000 tasks per second doing those unnecessary wakeups, you have just burned 10% of one CPU core, which already sounds worse. And this problem aggravates if you have more threads than CPU cores, because then you didn't just burn the CPU time. You stole it from other threads, which could have spent it on something useful. This problem was actually real, and during load tests some servers were spending more CPU time on spurious wakeups than on doing actual work, because they were just burning so much for the green data clusters. To summarize what we need from a really performant scheduler: firstly, it should be fair. We are not allowed to pin tasks to threads. It doesn't scale. Secondly, we need coroutines, so that tasks can give up their time, so they can yield and let the worker thread do some other work, other tasks. And also, we have zero tolerance for polling. No polling at all; everything should be event based.
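The two back-of-the-envelope numbers quoted above, 215 milliseconds and 10% of a core, can be checked with a few lines of C++. This is a sketch under stated assumptions: the function names are made up for illustration, and the latency formula assumes the first step runs immediately and each following step starts one full polling interval after the previous step finished.

```cpp
// Minimum end-to-end latency of a polled task, in milliseconds.
// Assumption: `steps` work phases of `step_ms` each, separated by
// (steps - 1) waits of one full polling interval `poll_ms`.
constexpr long polled_latency_ms(long steps, long step_ms, long poll_ms) {
    return steps * step_ms + (steps - 1) * poll_ms;
}

// Fraction of one CPU core spent on spurious wakeups, given how many
// of them happen per second and what each one costs in microseconds.
constexpr double spurious_core_fraction(double wakeups_per_sec, double cost_us) {
    return wakeups_per_sec * cost_us / 1e6;
}

// Three 5 ms steps polled every 100 ms: 3 * 5 + 2 * 100 = 215 ms.
static_assert(polled_latency_ms(3, 5, 100) == 215,
              "matches the latency example from the talk");
```

With 10,000 spurious wakeups per second at 10 microseconds each, `spurious_core_fraction(10000, 10)` gives about 0.1, which is the 10% of one core mentioned in the talk.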
These goals are achieved in a new scheduler, which, due to lack of fantasy, I just called the task scheduler. The name is pretty self-explanatory about what it does. The plan is that firstly we will look at the big picture of the entire scheduler, and then we will look at the individual parts of the scheduler, when you already know what they do. Imagine we have a process, this server, running. It has multiple threads, and they produce tasks. And we have a global object of type task scheduler in the process. It can be a C++ task scheduler, a Java task scheduler, whatever language you implemented it in; it is a global object in the process. Those threads produce tasks: they receive some requests from clients, wrap them into tasks, and post them into the task scheduler in the form of callbacks of some sort. In the scheduler, they gather in a so-called front queue. Then the scheduler will periodically take all the tasks from the front queue and inspect them one by one. It will see that some tasks are ready to be executed right now, the red ones on the slide. They want to be executed ASAP. Some other tasks do not want to be executed right now; they have just yielded. We have coroutines, so there will be tasks which don't want to be executed now. They are waiting for something: for a deadline, or for an explicit wakeup on some event. They are moved into the wait queue, where they will wait for their events. From the wait queue, we extract some older tasks which have already been sleeping there for some time, and now their deadline is up or they were woken up explicitly. Either way, we extract them from the queue, and all those red tasks go to the ready queue. From there, they are extracted by the worker threads, which take them from the ready queue one by one and execute them.
And this is basically the entire pipeline. It's not too complicated, although some of you could already have a question: who does this part? We have external threads posting tasks, we have worker threads doing tasks, but who does the scheduling itself, the management of the queues? The thing is, there is no dedicated thread doing just the scheduling. Instead, the worker threads compete for doing the scheduling, depending on which of them is idle. Imagine this big rectangle with queues is a room with a single key to it. The worker threads, again depending on which of them is idle, will sometimes try to pick up the key. Whoever does it first enters the room, does the scheduling, this queue management stuff, leaves the room, and starts doing tasks. So all the threads are the same. There are no threads having roles of some sort, like a naive implementation could have. It's a bit like the dining philosophers problem, except that we have just one fork here. And this big rectangle, in this case, is just a plate of spaghetti. But not spaghetti code; the code is clean. To improve understanding, there is a short visual example I prepared. Imagine we have five tasks. They are posted into the front queue, and one of the worker threads has the scheduling key right now. It will extract the tasks from the queue and see that a couple of them yielded. They go to the wait queue, waiting for something, and the others are moved into the ready queue. From there, they are picked up by the worker threads. Nobody is doing the scheduling right now; all the threads are doing some actual work. A couple of tasks are done, and suddenly those waiting tasks are woken up, or their deadline is up. They want to be executed now. So one of the worker threads will eventually pick up the scheduling key, notice this, and move the tasks into the ready queue, and in parallel, some other thread finishes an older task.
Now those two tasks are picked up by a couple of random threads. They are done, some other thread will pick up the scheduling key, and the process repeats all the time as new tasks arrive. This is it. To implement all this cool stuff, the language and the libraries which we will be using need to provide us with at least the following: mutexes, containers like arrays and lists, condition variables, and, not important for the algorithm but extremely important for the implementation and its performance, lock-free atomic operations. Just in case not all of you know what they are, here is some short pseudocode explaining what these lock-free atomics are. We will need three of them. Firstly, atomic load, which is basically reading a variable, but with respect to the so-called memory model, which, again, you can Google afterwards; it's too complicated a topic to dive into right now. We will also need atomic compare-exchange, which is a conditional assignment: we set a new value to some variable if it was equal to something else which we wanted to check. And we will also need atomic exchange, which sets a new value and returns the old value. The cool thing about those lock-free atomics is that they are not only atomic, but they are lock-free. There are no mutexes. On the contrary, mutexes use this stuff inside. As for how they are implemented, they are special instructions right on the CPU, so doing those operations doesn't even involve the kernel. It's quite cheap if you use it efficiently. And those three operations are the basis for some very cool and extremely performant algorithms, as we will see. Not just here: there are a lot of algorithms based on those lock-free atomic operations. They are also called lock-free algorithms. We will follow the task pipeline, looking at the scheduler parts, and we will start from the front queue, just like the tasks.
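In C++, the three operations map onto `std::atomic` member functions. The sketch below runs each of them once on a single counter. The `SchedulingKey` class at the end is only a guess at how the "single key to the room" could be built from an atomic exchange; the talk does not show how the real scheduler does it, and all names here are made up.

```cpp
#include <atomic>
#include <cassert>

struct AtomicDemoResult {
    int loaded;           // value seen by the atomic load
    bool first_cas;       // did the first compare-exchange succeed?
    bool second_cas;      // did the second one succeed?
    int seen_on_failure;  // value reported back by the failed compare-exchange
    int exchanged_out;    // old value returned by the exchange
    int final_value;
};

inline AtomicDemoResult RunAtomicDemo() {
    std::atomic<int> value{1};
    AtomicDemoResult r{};

    // Atomic load: read the variable.
    r.loaded = value.load();

    // Atomic compare-exchange: store 2 only if the variable still holds 1.
    int expected = 1;
    r.first_cas = value.compare_exchange_strong(expected, 2);

    // Same call again: the variable holds 2 now, so nothing is stored and
    // 'expected' is overwritten with the current value.
    expected = 1;
    r.second_cas = value.compare_exchange_strong(expected, 3);
    r.seen_on_failure = expected;

    // Atomic exchange: store 10 unconditionally and get the old value back.
    r.exchanged_out = value.exchange(10);
    r.final_value = value.load();
    return r;
}

// Hypothetical scheduling key: whoever flips the flag from false to true
// owns the key until it calls Release().
class SchedulingKey {
public:
    bool TryTake() { return !taken_.exchange(true); }
    void Release() { taken_.store(false); }
private:
    std::atomic<bool> taken_{false};
};
```

All calls use the default, strongest memory ordering to keep the example short; weaker orderings are an optimization topic of their own.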
We know that this queue has multiple producers and a single consumer, so this queue is multi-producer, single-consumer. This is a common notation for naming queues: we say multi- or single-producer, multi- or single-consumer, and we get four combinations of queue types. This one is multi-producer, single-consumer. We also know that this queue will experience high contention, because I want my scheduler to be functional when I have tens of threads and millions of tasks per second. This means extreme contention on the front queue, which in turn means I should avoid mutexes, because mutexes will most likely choke here. In a normal queue, not thread-safe or anything, you would have to remember two positions, head and tail, and maintain them. If you try to turn this algorithm into a lock-free thread-safe implementation, it would be a nightmare, because the more variables you have to update in a lock-free way, the more complicated the algorithm gets, and a queue with two variables is extremely hard to implement in a lock-free way. We should try to avoid this if possible, and the idea is: let's make it a stack instead of a queue. We will maintain just one position, the top item, and that's it. We don't need two variables here, and then the algorithm becomes quite simple. Pushing: we link the new item with the old top and try to set it as the new top. The atomic compare-exchange will fail if some other thread does it faster than we do, and then we retry. It's pretty simple. The more complicated part, although it looks shorter, is popping the items. The way pop is implemented, we just replace the top item with null, effectively taking all the items at once. We cannot pop them one by one.
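A minimal sketch of such a stack-based front queue follows. It is an illustration, not the talk's source code: `Item`, `FrontQueue`, `Push` and `PopAll` are hypothetical names, and the items are assumed to be intrusive, meaning each one carries its own `next` pointer. `PopAll` also reverses the detached list so that the caller gets the items back in the order in which they were pushed.

```cpp
#include <atomic>
#include <cassert>

// Intrusive list node; a real task would carry a callback and more.
struct Item {
    int value = 0;
    Item* next = nullptr;
};

class FrontQueue {
public:
    // May be called from any number of producer threads.
    void Push(Item* item) {
        Item* old_top = top_.load();
        do {
            // Link the new item in front of the top we last saw.
            item->next = old_top;
            // If another producer changed the top meanwhile, the call fails,
            // refreshes old_top with the current top, and we retry.
        } while (!top_.compare_exchange_weak(old_top, item));
    }

    // Single consumer: take all items at once, then restore push order.
    Item* PopAll() {
        Item* lifo = top_.exchange(nullptr);
        Item* fifo = nullptr;
        while (lifo != nullptr) {
            Item* next = lifo->next;
            lifo->next = fifo;
            fifo = lifo;
            lifo = next;
        }
        return fifo;
    }

private:
    std::atomic<Item*> top_{nullptr};
};
```

The queue never allocates or frees anything itself; ownership of the items stays with the caller, which sidesteps memory reclamation questions in this sketch.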
And also, we have to reverse the list before we return it, because when we push items in FIFO order onto the stack, they come back in LIFO order, and we want to maintain the order, so we have to reverse it again. These are the two downsides: we can't pop one by one, and we have to reverse the result before returning it. On the other hand, what we get is completely lock-free, thread-safe, and wait-free. Are those drawbacks worth being lock-free and wait-free? We will see at the end, in the benchmark section.
The next part of the task journey is the wait queue. As we know, it stores yielded tasks, so they are waiting for something, and in 99 percent of cases they are waiting for a deadline. We are using tasks for requests, and requests usually have a timeout, meaning that they have a deadline. So what we need is to be able to quickly pop all tasks with an expired deadline, simply because we have to do everything here quickly; there are a lot of tasks. We also know that this queue is always accessed by one thread at a time, the current scheduler worker who owns the scheduling key, so there is no concurrency on this queue. That gives us quite a lot of freedom about what data structures we can use. That basically means we can use a binary heap. It's ideal for such a job, when you have to quickly pop something sorted by a thing like a deadline. What happens is that we basically sort the tasks by deadline here. The task with the closest deadline will be on top, and we will be able to tell in constant time if any task has expired by just looking at the top. In case not all of you know what a binary heap is, here is a brief explanation. It's a perfectly balanced binary tree where each node's value is less than the values of its child nodes. We call this a min-heap. If we reverse the order, it will be called a max-heap. And this heap has quite good complexity.
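A deadline-ordered wait queue can be sketched with the standard library's heap adapter. This is an assumption-laden illustration: `WaitingTask`, `WaitQueue` and `PopExpired` are invented names, deadlines are plain millisecond counters, and `std::priority_queue` is used only as a stand-in for whatever heap the real scheduler has. It also cannot remove an item from the middle, so explicit wakeups are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct WaitingTask {
    int id;
    std::uint64_t deadline_ms;
};

// Orders the heap so that the task with the closest deadline is on top.
struct LaterDeadline {
    bool operator()(const WaitingTask& a, const WaitingTask& b) const {
        return a.deadline_ms > b.deadline_ms;
    }
};

// Accessed only by the thread that holds the scheduling key, so no locking.
class WaitQueue {
public:
    void Push(const WaitingTask& task) { heap_.push(task); }

    // Moves every task whose deadline has passed into 'ready'. Checking the
    // top is constant time; each actual pop is logarithmic.
    void PopExpired(std::uint64_t now_ms, std::vector<WaitingTask>& ready) {
        while (!heap_.empty() && heap_.top().deadline_ms <= now_ms) {
            ready.push_back(heap_.top());
            heap_.pop();
        }
    }

    std::size_t Size() const { return heap_.size(); }

private:
    std::priority_queue<WaitingTask, std::vector<WaitingTask>, LaterDeadline> heap_;
};
```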
In logarithmic time, we can pop any item, even from the middle. We can also push new items in logarithmic time, which is very nice, very fast.
From the wait queue, the tasks migrate to the ready queue, which, as we know, is populated by one thread, the current scheduler worker, and is consumed by multiple threads, the worker threads. So this is a single-producer, multi-consumer queue. We also know that it will experience high contention as well, because the task scheduler should be perfectly functional with like 10 worker threads, why not, and with millions of tasks per second. We have to deal with it somehow. Unfortunately, I don't know a nice simple algorithm for an unbounded lock-free queue of this type. For the reason why, you can Google the ABA problem after the talk; it's also quite complicated. We will not have time to dive into this, but just know that it is much more complicated than the multi-producer, single-consumer version. I do know a couple of other queues, though: unbounded lock-based and bounded lock-free. I want my final queue to be exactly unbounded, meaning not limited in size. I don't want any limits inside the scheduler. You can add them on top, but I don't want the scheduler to be limited by anything like queue size. So let's see what we can do with those two queues. The bounded lock-free queue is simple. It's just a cyclic array, except that the read and write indexes are atomic variables. In pushing, the only thing that changes compared to a normal cyclic buffer is that you increment the write index with an atomic increment, and that's it. Popping is just a little bit more complicated. We have to retry it sometimes, because there will be multiple consumers and they will sometimes compete for the same element. So the atomic compare-exchange will eventually fail and we will have to retry, but it's still quite simple.
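The bounded lock-free queue described here can be sketched as follows, under a few assumptions that are not from the talk: the class and method names are hypothetical, items are raw pointers stored in atomic slots, the indexes are 64-bit counters that only ever grow (the slot is the index modulo the capacity), and all operations use the default memory ordering for simplicity.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

template <typename T, std::size_t Capacity>
class BoundedQueue {
public:
    // Single producer only. Returns false when the queue is full.
    bool TryPush(T* item) {
        const std::uint64_t w = write_.load();
        if (w - read_.load() == Capacity) return false;
        slots_[w % Capacity].store(item);
        write_.fetch_add(1);  // atomic increment publishes the slot
        return true;
    }

    // Any number of consumers. Returns nullptr when the queue is empty.
    T* TryPop() {
        std::uint64_t r = read_.load();
        while (r != write_.load()) {
            T* item = slots_[r % Capacity].load();
            // Claim index r. If another consumer was faster, the call fails,
            // refreshes r with the current read index, and we retry.
            if (read_.compare_exchange_weak(r, r + 1)) return item;
        }
        return nullptr;
    }

private:
    std::array<std::atomic<T*>, Capacity> slots_{};
    std::atomic<std::uint64_t> read_{0};
    std::atomic<std::uint64_t> write_{0};
};
```

Because the read index never decreases, a consumer whose compare-exchange succeeds knows that nobody else consumed that index, and the producer could not have overwritten the slot before the index moved on.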
And the unbounded lock-based queue is just trivial. It's a mutex and a list, and we take the mutex on every operation. Then it becomes lock-based, but it's unbounded. So what can we do with the ready queue? What I had was the craziest idea. The thing is, our enemy is not the mutex itself, it's the contention on the mutex. So we could try to reduce the contention instead of deleting the mutex. We could keep it, but somehow not lock it so often. Let's combine those two approaches. Let's take the bounded lock-free queue and make an unbounded lock-based queue of those lock-free sub-queues. It will be like std::deque, but the blocks are lock-free and the big queue of those blocks is lock-based. How the producer works: it pushes new items to the latest block in a lock-free way. When the block becomes full, it takes the mutex, appends a new block, fills it in a lock-free way, and so on. The consumers do the same thing vice versa: they consume the first block in a lock-free way. When it becomes empty, they take the mutex, switch to the next block, consume it in a lock-free way, and so on. To see the benefit, imagine the sub-queue size, this block size, is not four like on the slide, but 10,000. What happens then is that we take the mutex lock not on every operation, but once per 10,000 operations. The mutex is still here, but it's locked so rarely that its cost is negligible. You will not see this mutex lock in any flame graphs anymore. The only problem is that the consumers will need explicit state. The queue of blocks is protected by a mutex, so if consumers don't have state, if they don't reference the first block that still has items, then on every pop they would have to find it while holding the mutex, and that would mean a mutex lock on every pop, which is exactly what we wanted to avoid.
Consumers need to register themselves, and then they can do the popping. This is not a problem for the task scheduler itself, because it has a fixed set of worker threads which you specify at creation. They can register themselves as consumers at start and live just fine with it. So it's only a problem for generic usage of this type of queue, but for the task scheduler it is just fine. Let's look at a visual example again. Imagine we have this queue. It's empty, just a single block, with one producer and two consumers. The producer adds three items in a lock-free way. Everything is lock-free so far. One consumer consumes one item, lock-free. Now the producer adds another item, lock-free. It sees that the block became full, so we need to append a new block: we take the mutex, append a new block, switch to it, release the mutex, and add three more items in a lock-free way. Our consumers keep working and finish consumption of the first block. Consumer A sees that the block is empty, so it takes the mutex, switches to the next block, releases the mutex, and continues consumption in a lock-free way. So we take the mutex only when we switch from block to block. And the other consumer, when it tries to consume something, will see immediately that its current block is empty. Those blocks are, as you remember, lock-free bounded queues with a read and a write index, so if we see that the read index of this block equals its size, it means it's empty; we don't even need to fully scan it. Anyway, consumer B will then lock the mutex, switch to the next block, release the mutex, and continue the consumption. And the old blocks, the completely consumed ones, can be discarded. You can free them, or you can reuse them: have a pool of those blocks and append them to the beginning again if you want to. It's not cheap to allocate blocks with 10,000 items in them, so you might want to pool them.
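The combined queue can be sketched roughly like this. It is a simplified illustration, not the talk's implementation: all names are invented, items are plain `int`, each slot of a block is written once and read once, and consumed blocks are never freed or pooled before the queue itself is destroyed, which avoids any block lifetime questions. The mutex is taken only when the producer appends a block or a consumer moves on to the next one.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

constexpr std::size_t kBlockSize = 4;  // the talk suggests something like 10,000

// One lock-free sub-queue.
struct Block {
    std::array<std::atomic<int>, kBlockSize> items{};
    std::atomic<std::size_t> read{0};
    std::atomic<std::size_t> write{0};
    Block* next = nullptr;  // guarded by BlockQueue::mutex_
};

class BlockQueue {
public:
    // Explicit per-consumer state: the block this consumer is draining.
    struct Consumer { Block* current; };

    BlockQueue() {
        blocks_.push_back(std::make_unique<Block>());
        last_ = blocks_.back().get();
    }

    Consumer Register() {
        std::lock_guard<std::mutex> lock(mutex_);
        return Consumer{blocks_.front().get()};
    }

    // Single producer only.
    void Push(int value) {
        std::size_t w = last_->write.load();
        if (w == kBlockSize) {
            // Block is full: the only place where the producer locks.
            std::lock_guard<std::mutex> lock(mutex_);
            blocks_.push_back(std::make_unique<Block>());
            last_->next = blocks_.back().get();
            last_ = last_->next;
            w = 0;
        }
        last_->items[w].store(value);
        last_->write.store(w + 1);
    }

    // Any registered consumer. Returns false when nothing is available.
    bool Pop(Consumer& c, int& out) {
        for (;;) {
            std::size_t r = c.current->read.load();
            while (r < c.current->write.load()) {
                int value = c.current->items[r].load();
                if (c.current->read.compare_exchange_weak(r, r + 1)) {
                    out = value;
                    return true;
                }
            }
            if (r < kBlockSize) return false;  // block not used up, just empty
            // Block fully consumed: the only place where a consumer locks.
            std::lock_guard<std::mutex> lock(mutex_);
            if (c.current->next == nullptr) return false;
            c.current = c.current->next;
        }
    }

private:
    std::mutex mutex_;
    std::vector<std::unique_ptr<Block>> blocks_;  // owns every block
    Block* last_ = nullptr;                       // touched by the producer only
};
```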
About the benchmarks, we will see them at the end, as I said, with everything combined. Our progress so far is that we saw the task scheduler parts, those queues, and now we can have a glimpse of the coroutines. Unfortunately, we don't have enough time to dive deep into the implementation, but I can show you some usage examples and the features they have, and for the implementation, you can look at the source code and ask me questions after the talk. To see again why we need coroutines and what features we need from them, let's inspect a simplified version of this save games example. This time, we have just two steps: start the download of a save game blob, and then handle the result. What we know is that while the task is waiting for a response from the network, it shouldn't block other tasks, so it should be able to yield, to step away. But we also know that this task can't sleep infinitely. There should be some sort of timeout. Requests can't be executing infinitely, so we need some deadline after which the task would be woken up and would cancel the request. And if the response arrives in time, before the deadline, we have to wake up the task before the deadline. So there should be some sort of explicit wakeup for a task which is right now sleeping in the wait queue, not executing. How exactly do we wake it up from there? We need an explicit wakeup. So we need yield, deadlines, and wakeups: the three main features of those coroutines. Let's see an example. It's almost exactly how it looks in real code. It will be a simplified version of the C++, but it's very similar to how it looks in reality. We have a global task scheduler in the process and some HTTP client. Imagine a request from the client arrives, so we wrap it into a task and give it a callback called download. And we post it into the scheduler, so it will go into this front queue.
The scheduler will execute our callback in one of the worker threads. In this download callback, what we do firstly is set what the next step will be. The next step will be handling the result. Then we do the asynchronous HTTP request. Assume that our HTTP client is able to do this. So we start an asynchronous HTTP request and give it a future callback to execute when the request is done, which will call wakeup on our task. This will be the explicit wakeup when the request is done. And then we start waiting: we post ourselves back into the scheduler with a five-second deadline. Either the task will wake up in five seconds, or it will be woken up explicitly when the request is complete. This goes into the scheduler, sleeps in the wait queue for some time, and eventually our callback handle result is executed. We have to check what happened. There could be two reasons. It could be that the task has expired: five seconds passed and we were woken up by the deadline. We just literally check if it's expired. If it is so, we cancel the request and start waiting for the cancellation to complete, to properly free all the resources. So we go back into the scheduler, but now we wait for the cancellation. Otherwise, if it's not expired, it means the request is finally complete with some result. It could be success, it could be an HTTP error code, or it could be our own cancellation done on a previous wakeup a few lines above. Either way, we just handle it and delete the task. This is almost exactly how it looks in the source code. There are no spurious wakeups at all, no sleeps, and the request has a clear state machine with every state expressed as a callback. And what is better, this entire coroutine stuff, all those wakeups, posts, deadlines, is completely lock-free. There is not a single mutex used to implement all this. So this is to get you further interested in looking at the source code and asking questions afterwards.
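The state machine described here can be mimicked in a few lines without any scheduler at all. Everything in this sketch is hypothetical and single-threaded: `Task`, `Download`, `HandleResult` and the two `Wakeup...` helpers are made-up stand-ins, not the API of the real task scheduler, and the HTTP request is only simulated by passing a result string.

```cpp
#include <cassert>
#include <string>

struct Task;
using Callback = void (*)(Task&);

// Hypothetical stand-in for a scheduler task.
struct Task {
    Callback callback = nullptr;  // the next step to execute
    bool is_expired = false;      // set when the wakeup came from the deadline
    bool is_cancelling = false;   // set after we asked to cancel the request
    bool is_done = false;
    std::string result;           // filled in by the simulated HTTP client
};

// Second step: runs on every wakeup, by deadline or explicit.
inline void HandleResult(Task& task) {
    if (task.is_expired) {
        // The deadline fired first: request cancellation and keep waiting
        // until the client confirms it with an explicit wakeup.
        task.is_expired = false;
        task.is_cancelling = true;
        return;
    }
    // Explicit wakeup: the request ended (success, error, or cancelled).
    task.is_done = true;
}

// First step: choose the next step, start the request, start waiting.
inline void Download(Task& task) {
    task.callback = HandleResult;
    // Real code would start the asynchronous request here and post the task
    // back into the scheduler with a deadline.
}

// Simulated wakeups, normally issued by the scheduler and the HTTP client.
inline void WakeupByDeadline(Task& task) {
    task.is_expired = true;
    task.callback(task);
}

inline void WakeupExplicit(Task& task, const std::string& result) {
    task.result = result;
    task.callback(task);
}
```

A task that gets its response in time goes `Download`, then one explicit wakeup. A task that times out goes `Download`, a deadline wakeup that starts the cancellation, and then an explicit wakeup that delivers the cancelled result.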
This is quite complicated stuff, as you can imagine, especially the implementation. How do we verify such stuff? Of course, there are unit tests, like literally hundreds and thousands of lines of unit tests. But with multi-threaded algorithms, the thing is, even if you have 100% code coverage, it doesn't tell you anything. It's better, I think, than not having 100% coverage, but there still might be bugs which can stay hidden for literally years, and you will not find them except when this thing explodes in production. There should be a way to at least verify the algorithm. Maybe we can't verify the source code, but we can improve our confidence in the algorithm itself. And the solution is TLA+. TLA+ stands for temporal logic of actions, and it is a combination of mathematics and temporal logic. It's also a language and a runtime for this language. This language allows you to verify literally any algorithm or system: you can verify an algorithm of a queue, like I did, or how your microservices interact, or you can verify how you go to a grocery store. If you can algorithmize it, then you can verify it in TLA+. It's suitable for anything like this. So in this TLA+ language, you write a specification and run the verification, and it will split your system, your algorithm, into the set of all the possible states in which this algorithm can be, and verify your own invariants in every reachable state of your system. So firstly you define the algorithm, then you define what validity means for your algorithm, the invariants, and TLA+ will verify all of them. Let's see an example. Assume you have implemented a queue in any language, and we want to verify the algorithm of this queue. First you have to define what objects exist in your system, what agents you have there. In my queue, I have just a pipe for items, so some sort of storage, and a couple of counters and limits. Then you have to define actions.
In TLA+, every action is a set of mathematical conditions combined with some mathematical operators. You have operators like and, or, list and set operators, or complex operators like 'all items in the set comply with the condition', and many other operators which you can use in your actions. The first action is always initialization, where you give initial values to your variables. Here I say that the storage pipe is an empty list, and my counters of sent and received items are zero. Then I start defining some actual actions doing stuff, like send or push or insert or whatever. There are ways to do it. I'm not aware of any standard way to do it properly, so I invented my own. I split each action into two groups of conditions. In the first group, I define when the action is possible. Here I say send is possible when the queue is not full and when I still have items to send, so not everything is pushed yet. In the second group, I tell what changes should be done when the first group is true. Here I say that if the first part is true, then I add a new item to the pipe storage (as items I use numbers), and I increment the number of sent items, of pushed items. The problem is, as you could probably see, there is no real distinction between those two groups. It's imaginary, and the only distinction is this small single quote sign. It means the next value of the variable. The problem is that TLA+ is basically math, and in math there is no assignment. There is no action like 'assign a new value to some variable'. The closest thing we have in math is the equals operator. But you can emulate assignment by saying that the next value of your variable equals the old value with something done to it. So here I say literally: last sent next equals last sent old plus one, which is effectively an assignment in programming languages, but here you have to simulate it like this.
In TLA+ there is no separation into groups, it's just several mathematical conditions, but I do separate them to make the specification easier to read. I do the same for the receive action. It's possible when I have items to receive, and the change is receiving an item. And then you have to define what validity means for your system, so the invariants, which will be validated in every reachable state. Here I say that my queue is valid when all the items in the queue are ordered like I pushed them, the queue never overflows, and the items are received in the same order as sent. Then, with some simple technical steps, I run the validation, and it will give me a result like 'N states are found', like hundreds of thousands or millions of states, and they are all valid. Or it will say that it found an invalid state, and here is how you get into that state from the initial one, following this sequence of actions. And then you can turn those failure traces into unit tests in your actual code. This actually works: I caught a bug in the scheduler thanks to this thing. By the links here you can find the specifications for the task scheduler as a whole and for the multi-consumer queue, which was nontrivial enough that I decided to validate it as well. The two specifications are quite big, but most of the lines are comments explaining the algorithm, so the code part of them is not too complicated. You can read it easily. There are also instructions in the source code repository on how to install TLA+ for the command line and how to run validation on those models. It's not trivial, surprisingly. And there is also a great course of lectures from the author of TLA+, Leslie Lamport. The lectures are quite fun; even if you are not planning to use TLA+, they are still worth watching, very entertaining. All of this can be found in the source code repository. And now about the benchmarks.
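To give a feeling for what such a verification run does, here is a hand-rolled C++ analogue of the queue specification. It is not TLA+ and not one of the talk's specifications, only a sketch under assumed names and limits: a state holds the pipe and the two counters, `Next` lists the states reachable through the send and receive actions (guard first, then the "next value" changes), and `CheckModel` walks every reachable state and checks the invariants in each one.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <tuple>
#include <vector>

constexpr int kCapacity = 3;    // queue size limit (made-up value)
constexpr int kTotalItems = 5;  // how many items get sent in total

struct State {
    std::vector<int> pipe;  // items currently stored in the queue
    int last_sent = 0;      // last item pushed; items are 1, 2, 3, ...
    int last_received = 0;  // last item popped

    bool operator<(const State& o) const {
        return std::tie(pipe, last_sent, last_received) <
               std::tie(o.pipe, o.last_sent, o.last_received);
    }
};

// Invariants: no overflow, and the pipe holds exactly the items that were
// sent but not yet received, in the order they were sent.
inline bool IsValid(const State& s) {
    const int size = static_cast<int>(s.pipe.size());
    if (size > kCapacity) return false;
    if (size != s.last_sent - s.last_received) return false;
    for (int i = 0; i < size; ++i) {
        if (s.pipe[static_cast<std::size_t>(i)] != s.last_received + 1 + i) return false;
    }
    return true;
}

inline std::vector<State> Next(const State& s) {
    std::vector<State> out;
    // Send: possible when the queue is not full and items are left to send.
    if (static_cast<int>(s.pipe.size()) < kCapacity && s.last_sent < kTotalItems) {
        State n = s;
        n.last_sent = s.last_sent + 1;  // next last_sent = old last_sent + 1
        n.pipe.push_back(n.last_sent);
        out.push_back(n);
    }
    // Receive: possible when there is something in the pipe.
    if (!s.pipe.empty()) {
        State n = s;
        n.last_received = s.pipe.front();
        n.pipe.erase(n.pipe.begin());
        out.push_back(n);
    }
    return out;
}

// Returns the number of reachable states, or -1 if one breaks an invariant.
inline int CheckModel() {
    std::set<State> seen;
    std::deque<State> todo;
    seen.insert(State{});
    todo.push_back(State{});
    while (!todo.empty()) {
        State s = todo.front();
        todo.pop_front();
        if (!IsValid(s)) return -1;
        for (const State& n : Next(s)) {
            if (seen.insert(n).second) todo.push_back(n);
        }
    }
    return static_cast<int>(seen.size());
}
```

A real model checker additionally reports the sequence of actions that leads to a broken state, which is the failure trace the talk turns into unit tests.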
How I did them: the benchmarks are comparative. I'm not just running the benchmarks against themselves in a vacuum or against some random stuff. I compare the improved versions, my algorithms, against trivial, naive versions using mutexes, to see if stuff actually improved. All the same benchmarks run on my algorithms and on the trivial implementations. For example, the smart queues I benchmark against their mutex-locked versions, and the task scheduler I benchmark against a thread pool without coroutines: a single queue, a single mutex, and nothing else. I run this on five different configurations of software and hardware with tens of scenarios, and all the reports with the performance results are available on GitHub in human-readable Markdown format. You can also run them on your system with just a single line, a Python script. It will generate the report for your case, and you can read it and see what's up. There are quite a lot of results, so I can show just a few of them on the slides. I will use Debian Linux with eight cores running in Google Cloud, and I will show just some average results, nothing shocking like 100 times faster. There are extreme cases when the algorithms are almost the same or when mine is extremely faster, but I will show just some average results which you can actually get in real production. We start from the front queue again. The benchmark uses five producer threads and one consumer thread, doing busy-loop pushes and pops all the time to get the worst-case contention. It's one and a half times faster, and this is considering the two drawbacks which you remember: we store items as a stack in the front queue, so we have to reverse them and we can't pop them one by one. And still it is one and a half times faster. If we make it 10 producer threads, so we have more threads than CPU cores, the worst case for mutex contention, it becomes 2.6 times faster.
The ready queue, another benchmark: five consumer threads, one producer thread, again busy-loop pushes and pops. It becomes 2.6 times faster already, and you can see that the lock contention is multiple orders of magnitude lower than in the trivial implementation. This is thanks to us taking the mutex lock not on every operation, but once per multiple thousand operations. The mutex is still here, but it almost doesn't affect the results at all. When we make it 10 consumer threads, it becomes four and a half times faster already. The naive queue degrades quite quickly in this case. And now everything combined, the task scheduler as a whole. In this benchmark, tasks are empty, so they are just empty C++ functions not doing anything, the worst case for contention again. We start from a single worker thread, which will do both the scheduling and the tasks. It becomes right away 2.2 times faster than a trivial thread pool without coroutine support, and with zero lock contention, so the lock wasn't contended even once between the worker and the producer. When we make it five worker threads, it becomes three times faster, so it scales better than the naive implementation. When we make it 10 worker threads, it becomes seven and a half times faster. And these are not the best results; it can be even better. I'm just not showing extreme cases here, but I saw like a 15 times speedup as well. It's just not something you will most likely get in production if you start using this, but those benchmarks are also available. And now about the real usage, not some random benchmarks in a vacuum: how it affects the actual code. Applied to this save games case from the beginning, we ported one of the microservices from the updater scheduler to the task scheduler, and immediately, without any further optimizations, got a 10 times speedup.
We went from hundreds of RPS to more than 10,000 RPS, and latency dropped five times, right out of the box, before we started doing any other optimizations. And the algorithm is extendable. As you remember, there is this big rectangle where only one thread at a time can work. It means this is a thread-safe space. We can replace the binary heap of waiting tasks with something more complicated. For instance, we can put libev inside, or epoll, or IO completion ports from Windows, and we get a multi-threaded event loop, like a multi-threaded libev. We can store sockets in tasks, and we get sockets with deadlines and with yields. This is, in fact, what we have at Ubisoft: a fork of the task scheduler where we just replaced the wait queue with epoll on Linux and IO completion ports on Windows. And we get more than a million messages per second with just several threads on sockets. With this thing, it's not too complicated to extend the scheduler. Maybe next time I will talk about this multi-threaded event loop. What are the future plans for it? Firstly, we could try to run it on ARM. Maybe it already runs, I just haven't tried. Maybe it works, maybe not, I have no idea. That's why it's open source: you can try it and send a pull request if something's not working. Also, it's currently implemented only in C++, and it's not even STL, although some people might consider that good, like me. I don't like STL, but it could use a port to STL as well, or to some other language. And there could be optimizations done, like in the front queue. Maybe there is a way not to store it as a stack, not to reverse the list of items before returning it. I just haven't found a simple way to do it which would be worth trying. And this is the end. Thanks for your attention. Here are the links to the source code and to this talk. The animated version with all the slides and my notes will be available online by this link, along with my other talks.
Also, there are bonus sections which some of you might ask about as questions, and we will quickly go through them, or you can click on them yourself after the talk if you're interested. Okay, so time for questions. Show of hands, and we'll give you a mic. So, for the front queue, you can use one of Dmitry Vyukov's MPSC queues, very fast, much faster than the Treiber stack, which is the thing you're using. The other thing is for the wait queue; well, you answered this with IOCP, kqueue, epoll, which would be much more in line with something that uses timers or networking. And you said that we cannot have a single-producer, multi-consumer queue. Yes, you can, actually: you can use one of the Chase-Lev deques. There's a paper from 2013 with formally verified primitives, including for ARM, and those will work for your use case. I know that implementations of such queues exist, I just couldn't find a simple enough one. The thing is, at Ubisoft, internally, we value code simplicity above all, sometimes even to the harm of performance, so it's not an option to use something extremely complicated, like hazard pointers or stuff like this. Or, for example, I saw implementations of such queues which are not wait-free, so they can be lock-free, let's say, but not wait-free. That also wasn't an option, because that's basically a spin-lock. There's 100 levels. |
Tools for linking Wikidata and OpenStreetMap
Software for adding links between open data projects |
So, hello. I'm Edward, and I'm going to be talking about some tools that I've been building for adding links between OpenStreetMap and Wikidata. I've been working on these for a few years. This is all a hobbyist project. I'm not being paid to work on this, but I thought I'd come here and share with you some of the work that I've been doing. So, as an example to talk about the software that I'm building, I'm going to use this building which is in Brussels, the Royal Palace of Brussels. It's in the city centre. So, you can see here, this is it in two different systems. You've got OpenStreetMap and you've got Wikidata, both showing the same building. So, I'll describe OpenStreetMap just for anyone who's not familiar with it. It's a collaborative map. It's been going since 2004, covers the whole world, and anyone can come in and edit the map. It's got revision history. You know, it works a lot like Wikipedia, but for maps. So, within OpenStreetMap, you've got three types of objects: nodes, ways and relations, in increasing complexity. And each of those objects can have tags. Tags are pairs of keys and values. I've got some examples here for my example in Brussels. And the tags are not controlled by the software. You can put anything you want in, but it won't get rendered on the map unless it's one of the standard tags that gets used on OpenStreetMap. So, there's a community process for discussing, you know, how things should be tagged in OpenStreetMap, and then it gets documented on the OpenStreetMap wiki. So, everything in OpenStreetMap can be uniquely identified by the type and the ID. The ID on its own isn't enough. There are nodes and ways that have got the same ID. You have to have the type as well. So, in this example, the Royal Palace, you can see it's a relation. It's a complex polygon. You can see there are holes in the middle of the building, so you can't represent it as a way. And you can see there it's got an ID as well. 
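To make the point about type plus ID concrete, here is a minimal Python sketch. The class name `OsmRef` and the ID `123` are made up for illustration; only the three object types and the shape of the openstreetmap.org browse URL are taken as given.

```python
from dataclasses import dataclass

OSM_TYPES = ("node", "way", "relation")


@dataclass(frozen=True)
class OsmRef:
    """Hypothetical helper: an OpenStreetMap object is identified by type AND id."""
    osm_type: str
    osm_id: int

    def __post_init__(self):
        if self.osm_type not in OSM_TYPES:
            raise ValueError(f"unknown OSM type: {self.osm_type!r}")

    def url(self) -> str:
        # Browse URL pattern used by the openstreetmap.org website.
        return f"https://www.openstreetmap.org/{self.osm_type}/{self.osm_id}"


# The same numeric id (123 is an invented example) can exist as a node and as a way,
# so the pair has to be used as the key, never the bare id.
tags_by_ref = {
    OsmRef("node", 123): {"amenity": "bench"},
    OsmRef("way", 123): {"building": "yes"},
}
```

Using the frozen dataclass as a dictionary key keeps the node and the way apart even though they share a number.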
So, what about the other system I'm talking about? Wikidata. So, Wikidata is part of the Wikimedia Foundation, the same people that run Wikipedia. And it's a wiki for structured data. It's newer than OpenStreetMap: it launched in 2012, and it's big. It's got 102 million items now. And for comparison, English Wikipedia has 6.6 million articles. English Wikipedia is the biggest Wikipedia. And most of those articles have a Wikidata item as well. But then there's a lot more data, a lot more items in Wikidata than there are articles in English Wikipedia. So, if I take my example of the Royal Palace of Brussels, and you look it up on English Wikipedia, you can see there's a link in the sidebar that will take you to the Wikidata item. And you click that link, and you end up on this, the Wikidata item for the Royal Palace. I'll talk you through some of the pieces on this page. So, you've got, down the side, the site links. These are links to Wikipedia articles in different languages. Part of the reason for the existence of Wikidata is to store these; they call them inter-language links. They used to be stored in Wikipedia and had to be maintained across all the different languages. So, if there was a new article written in a new language, every existing article in one of the existing languages had to be updated with these links. So, much better to centralize them and store them in Wikidata instead. Other bits and pieces you get on this page: you get a label, description and aliases. So, by default, when I look at Wikidata, I just see them in English because that's the language I speak. But I can click the link to show me more languages and you can see that there are names of the thing available in lots of languages, and descriptions and so on. The other main part of this page you see is the list of statements. So, statements are a bit like tags in OpenStreetMap, but they're more controlled by the software. You can't just make up a property. 
You have to use ones that are already in the system. And again, there's a community process in Wikidata for determining new properties. And the other big difference is that there are different data types. In OpenStreetMap, everything is a string, but Wikidata has different data types for values. Here you can see there's an image and there's also a link to another item used as values in the statements. So the interesting thing in terms of maps is that Wikidata has got coordinate locations. There are almost 10 million items with coordinates. So those are the kinds of things that we're interested in and that will probably be on OpenStreetMap as well. And there is a property for storing geo-shapes in Wikidata, but it's quite new and it's not used so much. There are only 29,000-odd items with a geo-shape. So, you know, it's mostly about the coordinates. So the thing that I'm interested in is adding links between the systems. So if we have another look at OpenStreetMap, I've got highlighted here one of the tags for the palace, and it's the wikidata tag and it's got a Wikidata QID. This is the unique identifier for the Wikidata item. So now the two systems are linked. If you visit this object on OpenStreetMap, then the user interface has a hyperlink that will take you to the same thing on Wikidata. So why do I want to add links between Wikidata and OpenStreetMap? Well, it makes the data in OpenStreetMap a lot more useful. Wikidata tends to have labels in more languages. If you want the name of a thing in a different language, you can get it from Wikidata. You can link to the Wikipedia articles. You get images from Commons and identifiers from other data catalogs. So Wikimedia Commons is the Wikimedia location for storing photos of things. So we get loads of photos of our building and we also get lots of identifiers in Wikidata. So you can think of Wikidata as a bit like the Rosetta Stone of linking different data catalogs. 
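As a small illustration of how the `wikidata` tag links the two systems, the sketch below turns an OpenStreetMap tag dictionary into a Wikidata URL. The function name and the QID `Q12345` are placeholders, not taken from the talk, and the sketch assumes the tag holds a single QID.

```python
import re
from typing import Dict, Optional

# A QID is the letter Q followed by a positive integer.
QID_RE = re.compile(r"Q[1-9][0-9]*")


def wikidata_url(tags: Dict[str, str]) -> Optional[str]:
    """Return the Wikidata item URL for an OSM object, or None if it has no valid link."""
    qid = tags.get("wikidata")
    if qid is None or not QID_RE.fullmatch(qid):
        return None
    return f"https://www.wikidata.org/wiki/{qid}"


# Q12345 is a placeholder QID used only for this example.
example_tags = {"building": "yes", "wikidata": "Q12345"}
print(wikidata_url(example_tags))
```

Rejecting anything that is not a well-formed QID is a design choice for the sketch: free-text tags can contain typos, and a failed lookup is easier to review than a broken link.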
It makes sense to store all this information in one place. So why not use Wikidata as that place for storing this kind of info? So this is a good thing. We want to add links. The other thing that you get is that Wikidata gets access to the shapes of things, the polygon outline of the building, which otherwise it wouldn't have without a link. So adding these links by hand is kind of laborious and time consuming. So better to write some software to do it instead. So the software I've written, I'm calling it OWL Places, and the web address is osm.wikidata.link. So this is what the software looks like when you visit it. It asks you for a place name where you want to search for some matches. So you can put in the name of your town, somewhere you're familiar with, where you can check that the matches are valid. So I've done a search and I've found the place where the Royal Palace is located. And this is the page you see. You've got a map with some blue pins. And these blue pins represent Wikidata items for which the software has found something that matches on OpenStreetMap. So if I scroll down this page, you can see some example matches. So I show you various bits of data that come from Wikidata and Wikipedia to help you try and identify if these matches are valid. Sometimes the software doesn't get it right and will give you an invalid match. So it's important to look through this list and check that all the matches are correct. And to help you with that, I show you the first paragraph from the Wikipedia article and I show you any images that come from Wikidata. You've got the Wikidata description there. I'll talk later about how it decides which languages to use for showing those paragraphs. But it supports various languages. And then it shows you some of the details from OpenStreetMap just so you can compare and make sure that they match. So if I click on one of those, then it will zoom in on the map and it shows you the polygon outline of the thing. 
You can see the red pin there is the selected thing. So that looks like a pretty good match. It's probably the same thing. So we can go ahead and save that. So we're interested in saving these matches to OpenStreetMap. So the software has a button that lets us log in via OpenStreetMap by OAuth. Just put in username and password and log in. And then you come back to the confirmation page where you just see the same list again but kind of abbreviated. These are things that I've checked and I've said yes, these are valid matches and I want to save them. You can put a change comment. So everything gets saved together as one change set like it goes in as a single edit. And the change comment on the change set is generated automatically based on the location but you can change it if you want to. So I'll carry on with describing the software and I'll show you some more features that I've built. So I've added a type filter like at the top here you can see it's a type filter and there's a list of different types of things that it's found that are possible matches. So it's got statues and buildings. I can tick a sculpture to say I just want sculptures. And then when I scroll down it will just show me things that it thinks are sculptures. So I can focus on one particular type of thing. Sometimes when you put in the name of a town you might get 200 matches and it's a bit overwhelming to do them all in one go. So it's useful to do them bit by bit just specific types that you're interested in. And then when I go to the save page it generates a change comment that's based on the type filter that you've selected. So here it just says add wiki data tags to sculptures in this area. So I'm just going to talk about how it determines what is a match. So if we have a look at one of these examples this is a sculpture. So if we have a look at the same thing on wiki data you can see that there's a statement in wiki data which is the instance of statement. 
So this is saying that this thing is a sculpture. So we can click through and have a look at the sculpture page. This is the Wikidata item for the concept of a sculpture. And then if we scroll down this page we get a Wikidata property which is for OpenStreetMap tag or key. So there are actually two values here and the second value is uninteresting. It's shown in red because it's a deprecated value, a kind of old value that used to be used in OpenStreetMap and is being documented in Wikidata. So the interesting one is the top one, which is Tag:artwork_type=sculpture. So the information is stored in Wikidata about what tags are used in OpenStreetMap to describe things. So using this information we can say that these two things are the same type of entity. So when it comes to matching things I'm looking for the coordinates to match: the two things in the two systems have to be close to each other. They're not necessarily a perfect match but within 50 meters or something. And the entity type has to be the same, like I just described. And then I'm also looking for a matching name or street address or identifier. So I pull names from all over the place in both systems. In OpenStreetMap there's a bunch of different fields where names can be stored and I look at all of those, and then in Wikidata there are different places to get the names. I look at the labels in Wikidata, the aliases, the names of any Wikipedia articles. I look at the file name of any images that are in Wikidata, just to get as many possible names as I can to use for matching. And then I normalize the names a lot, so I lowercase them and I remove stop words and process them a lot to try and get as many name matches as I can. And then similarly with street addresses. 
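The matching rules described here (coordinates within roughly 50 meters, same entity type, a name that matches after normalization) can be sketched as follows. This is only an illustration under stated assumptions, not the tool's actual code: the stop-word list, the dictionary layout and the coordinates are invented, and address and identifier matching are left out.

```python
import math
import re

# Illustrative stop words only; the real list is not given in the talk.
STOP_WORDS = {"the", "of", "de", "la", "le"}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def normalize(name):
    """Lowercase, strip punctuation and drop stop words."""
    words = re.findall(r"\w+", name.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def is_candidate(osm, wd, max_dist_m=50.0):
    """True if the two records are close, share a type and share a normalized name."""
    close = haversine_m(osm["lat"], osm["lon"], wd["lat"], wd["lon"]) <= max_dist_m
    same_type = bool(set(osm["types"]) & set(wd["types"]))
    shared = {normalize(n) for n in osm["names"]} & {normalize(n) for n in wd["names"]}
    shared.discard("")  # names made only of stop words must not count
    return close and same_type and bool(shared)


# Made-up coordinates and names, roughly 13 m apart.
osm_obj = {"lat": 50.8417, "lon": 4.3622, "types": {"artwork_type=sculpture"},
           "names": ["Royal Palace of Brussels"]}
wd_item = {"lat": 50.8418, "lon": 4.3623, "types": {"artwork_type=sculpture"},
           "names": ["The royal palace, Brussels"]}
print(is_candidate(osm_obj, wd_item))
```

Requiring all three conditions keeps false positives down at the cost of missing some matches, which fits a workflow where a person still reviews every candidate before saving.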
So there are street addresses in OpenStreetMap and Wikidata which I compare, and the software also looks for street addresses in the first paragraph of Wikipedia articles. And then in terms of matching identifiers, there are lots of standardized OpenStreetMap tags for different identifiers and then there are also properties in Wikidata for those same identifiers. So if, you know, I've got a railway station that's got the same station code in OpenStreetMap and Wikidata, I can be pretty sure that it's the same thing that I'm matching, so I can be confident about that match. So one of the things I'm not using at the moment is the wikipedia tags which appear in OpenStreetMap. Before Wikidata came along there were lots of wikipedia tags added to OpenStreetMap, and they're not completely consistent in their formatting for how they link to Wikipedia, and sometimes they're wrong. So, you know, for now I've left the work of trying to match up using wikipedia tags for somebody else to have a look at. But I've been waiting for a few years now and no one has, so I might have to have a go at this. So just in case anyone's interested in the technology behind this, the software is written in Python with Flask. I'm using Postgres as my database and then on the front end, you know, various bits of JavaScript. I'm not really a front end developer but, you know, I'm muddling my way through and it seems to be working quite well. I'm using a bunch of APIs to get this data. So in terms of searching for places to look for matches, I use the OpenStreetMap Nominatim API, and then to grab more data I use the Overpass API, and then on the Wikidata side, I do a lot of SPARQL queries against the Wikidata Query Service and I use the Wikidata MediaWiki API to get the details of the Wikidata items. So there's a bunch of things that don't work in my system at the moment. One of them is tunnels. 
I designed the software with the assumption that there would be a kind of one-to-one mapping between a thing in OpenStreetMap and a thing in Wikidata, and that doesn't work for tunnels, because tunnels tend to get represented as two ways in OpenStreetMap whereas in Wikidata there'll be a single item. And so, you know, my assumption was wrong and I need to change my software to say that you can add the Wikidata identifier to both ways in OpenStreetMap, but I haven't done that yet. Incidentally, we don't have the same problem with bridges. The way that bridges get represented in OpenStreetMap is that they are often two ways, but then there's a relation across the whole bridge that represents the bridge itself. For tunnels, there isn't a relation for representing the whole concept of the tunnel. So that's another possible approach. Maybe OpenStreetMap should change and start mapping tunnels with a relation that contains the two ways, you know, for storing Wikidata tags and any other information about the tunnel that is the same across both ways. So, another thing that I don't support is rivers, because they are linear relations, and the software that I'm using to import data from OpenStreetMap, osm2pgsql, can't handle linear relations. It just, you know, expects relations to be polygons. So at the moment rivers don't work in the system. And then similarly for tram stops. Tram stops are kind of complex objects in OpenStreetMap. You've got, you know, stop positions where the tram stops on either side of the road, which are nodes, single points, and they're collected together into a relation, and that isn't supported properly by osm2pgsql. So I can't handle tram stops properly. I'm going to talk about a few more features that are in the software. So again, this is the center of Brussels and I've got the language selector. 
So the software has figured out all the languages that get used for the labels of things and the OpenStreetMap objects that are in this area. You know, unsurprisingly for Brussels the most popular languages are French and then Dutch and English is the third most popular. Interestingly we've got Latin at the bottom there. There's 22 items that have got labels in Latin in wiki data. But so by default this page is opened in French and you can see the type filter is appearing in French but I can't read French very well so if I want to change it to Dutch I can reorder these languages by drag and drop or I can click on move to top and you can see the type filter is now switched into being in Dutch or if I want it in English then I can move English to the top of the list and it will show me the type filter in English, English labels and descriptions. And if I scroll down the page you can see that this is the page appearing in French. You've got titles in French and the extracts from wikipedia in French or again I can change it into Dutch if I want or I can have it in English. And this works without reloading the page. You just change the order that you prefer the languages to appear in and it does it all on the client and switches it over. So some statistics for you. People are using this tool. Well first of all there's more and more wiki data tags appearing in OpenStreetMap so not all of them are coming from my software. You know there's other people figuring out how to add wiki data tags to OpenStreetMap. So here's some more stats. 26% of the wiki data tags in OpenStreetMap were added using this tool and we're up to 400 people and there's been 23,000 change sets and we're getting close to 700,000 wiki data tags added. So I'm going to talk about the licensing. Wiki data is CC0 or public domain. You can do anything you want with wiki data and OpenStreetMap uses the open database license which is a license that was pretty much written for OpenStreetMap. 
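The language selector behaviour (an ordered preference list, with the first available language winning) can be sketched in a few lines. The real tool does this in JavaScript on the client; this Python version and its function names are illustrative only, and the labels are example data.

```python
from typing import Dict, List, Optional


def pick_label(labels: Dict[str, str], preferred: List[str]) -> Optional[str]:
    """Return the label in the first preferred language that is available."""
    for lang in preferred:
        if lang in labels:
            return labels[lang]
    # Fall back to any label at all rather than showing nothing.
    return next(iter(labels.values()), None)


def move_to_top(order: List[str], lang: str) -> List[str]:
    """Return a new ordering with `lang` first, as a 'move to top' click would."""
    return [lang] + [code for code in order if code != lang]


labels = {
    "fr": "Palais royal de Bruxelles",
    "nl": "Koninklijk Paleis van Brussel",
    "en": "Royal Palace of Brussels",
}
order = ["fr", "nl", "en"]           # default order for this area
print(pick_label(labels, order))     # French label
order = move_to_top(order, "en")
print(pick_label(labels, order))     # English label
```

Because only the ordering changes, every label and extract can be re-picked from data already on the page, which is why no reload is needed.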
So you can't copy any data from OpenStreetMap into Wikidata because you'd be re-licensing it as CC0, which is not allowed. But even more than just the licenses being different, the intellectual property jurisdictions are different. So OpenStreetMap asserts database rights. The argument is that it's a lot of effort to go around collecting all this information and putting it in OpenStreetMap and they want to protect that, whereas Wikidata is part of the Wikimedia Foundation, which uses US intellectual property rules, and under US law facts are not copyrighted, or rather not protected, in law. So the two things don't mesh that well, but it's fine because I'm not copying any data between the systems. I'm just adding links between them. In some cases it might be nice if we could tidy up the data in one system based on the other, but I'm not doing that, and you've got to think carefully about the intellectual property rules before you try and do that. And so also, just while we're talking about licenses, my software is GPL and the code is on GitHub; it's all open source. Anyone can have a look at the software behind it. So an important aspect of being able to add these links between the systems is to have stable identifiers, and for a long time OpenStreetMap has talked about the identifiers not being stable. Sometimes, say, a railway station might get mapped as a single point and then later on somebody comes along and traces the outline of the building, and so it changes from being a node into a way or a relation and the identifier will have changed. So there aren't stable identifiers for concepts in OpenStreetMap. So the thinking is that this makes it difficult to link into OpenStreetMap because the identifiers might change, and there have been discussions within the OpenStreetMap community about having a permanent ID. The discussions have been going on since 2017 and they haven't come to a conclusion. 
There's been an argument that maybe the right thing to use in terms of stable identifiers would be Wikidata IDs: just say anything that's important enough to need a stable identifier is probably on Wikidata, and so you could use the Wikidata ID as a permanent ID. But another way to look at it is that in reality most of the world is mapped now on OpenStreetMap and the IDs aren't changing that much. Things tend to be mapped as polygons, like outlines of buildings, and people aren't coming along and making changes that are destructive in destroying the IDs. So maybe the IDs that are in OpenStreetMap already, the IDs that I talked about earlier, maybe they're stable enough and maybe it's okay to just link to those and not worry about them changing. Whereas we've got Wikidata on the other hand, and Wikidata was always designed to have stable identifiers. That was a big part, I think, of the initial approach to Wikidata. Wikipedia identifies things by article title, and over time the article titles can change and then things get moved around, and so they don't have long-term stable IDs, and so the Wikidata QIDs were an approach that gave you stable IDs. But it turns out that they're not completely stable. There are also redirects appearing in Wikidata. With some of the work I've been doing, I find a lot of duplicates in Wikidata. Things have been imported from different sources and, say for example, I found a lot of duplicate churches in Wikidata. So when I go and I merge the churches, then the ID that represents one of those churches will change. So I've got on the slide here that there are 10,000 OpenStreetMap objects that point to a redirect in Wikidata, and somebody needs to go through and resolve those redirects and fix OpenStreetMap. I will probably do that at some point if no one else does. So a recent change to Wikidata is that there's a new property called OpenStreetMapElement and that is for storing OpenStreetMap IDs. 
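The clean-up of OpenStreetMap tags that point at merged (redirected) Wikidata items could look roughly like this. It is a sketch under assumptions: the redirect table is taken as already known (in practice it would have to come from Wikidata itself), and all QIDs and object IDs are invented.

```python
from typing import Dict, Tuple

OsmKey = Tuple[str, int]  # (type, id), for example ("way", 2)


def resolve_qid(qid: str, redirects: Dict[str, str]) -> str:
    """Follow redirects until reaching an item that is not itself a redirect."""
    seen = set()
    while qid in redirects:
        if qid in seen:
            raise ValueError(f"redirect loop at {qid}")
        seen.add(qid)
        qid = redirects[qid]
    return qid


def stale_tags(tags: Dict[OsmKey, str], redirects: Dict[str, str]) -> Dict[OsmKey, str]:
    """Map each OSM object whose wikidata tag is a redirect to the QID it should use."""
    return {ref: resolve_qid(qid, redirects)
            for ref, qid in tags.items() if qid in redirects}


# Invented data: Q300 was merged into Q200, which was later merged into Q100.
redirects = {"Q300": "Q200", "Q200": "Q100"}
osm_tags = {("node", 1): "Q300", ("way", 2): "Q100"}
print(stale_tags(osm_tags, redirects))
```

Only objects that actually point at a redirect are reported, so the result doubles as a to-do list of edits for OpenStreetMap.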
So now it is possible to add the links in both directions. We can have links from wiki data to OpenStreetMap which we never used to be able to have. So I need to change my software to start adding these links in. When you save things at the moment it just uploads into OpenStreetMap it should be uploading them to wiki data as well. But to do that I need to make the user login to both systems which is possible but it will break the flow of it. So I am going to try and do a demo. Let's see. So this is the software I'm describing and I can say I want it in English. And you can see the type filter there and if I scroll down it shows matches that weren't very good at the start. So it's got some difficulty with this match and it can't handle it so we scroll past those. And here's the first match that the system can handle and if I click on it then it shows you the match. I can click toggle OSM tags. This is showing all of the tags from OpenStreetMap. The green ones are ones where it's found a match that's using those to figure out what the match is. I'll show you some more. Here's another one. You can see it appearing on the map. If I think this is not a correct match I can click here and it's deselected it. So I've got a whole pile of matches here. I've checked these ahead of time. They're all good. So I scroll to the bottom and I can say add tags to OpenStreetMap and this is the confirmation page that I was talking about. So I can hit save and the software goes through and it's saving my matches. So it has done it and I can say view my change set and you get to see my change set on OpenStreetMap. I can scroll down and you can see these are all the things I've edited. So nice and quick to go through and edit OpenStreetMap. I've just got another example. Another bit of Brussels. I can change to English. Say I want squares and then if I scroll down it will just show me some matches that haven't worked. So scroll past those. 
Here are some squares that the software has managed to match up. And these all look like good matches. I've checked these before so I can scroll to the bottom. There's another one, and I can say save to OpenStreetMap, and in the change comment it's put the word squares. So I can hit save and that is working to edit OpenStreetMap. I'll go back to the presentation. So that was my existing software. That's been running for a few years. People have been using that, and I've been working on a new version of the software that I'm calling OWL Map. This is what OWL Map looks like. So when you open this you go straight to a map. It tries to guess where you are, locating you based on your IP address, and then it shows you this interface, much more map-based rather than a list of things. You see the red pins are where there isn't a match already. Green pins are where there is a match, and the yellow pins are OpenStreetMap things. So you can see some of them have a line between the green pin and the yellow pin. That's showing you that, you know, the green pin is a Wikidata item that matches a thing on OpenStreetMap, which is the yellow pin, and there's a line between them. And you've got a filter at the side where you can filter on different item types. This is an example where I've selected one of the pins. I've clicked on a pin and it changes the color slightly and it shows you some details. You get to see the photo and bits and pieces from Wikidata. And then underneath it shows you a list of possible matches. It just says, you know, this is a building. Here are some other buildings nearby. And I can see the street addresses on here and, you know, for the nearest building, the street address matches. But in actual fact, there are two street addresses on there. And if I scroll down this list, I can see that there are two buildings next to each other that both match this warehouse. 
So for some reason Wikidata is representing it as a single item whereas OpenStreetMap has got two separate objects. But this version of the software supports it. So I tick the boxes next to them and then I can hit save and it'll add the Wikidata tag to them. So this bit of software I'm still working on. It's live but it keeps breaking, so I'm not really advertising for people to use it. I need to do some more work on it. And in fact, I think I need some help. You know, I'm just a hobbyist and I'm running out of time to work on this stuff. So I don't know if anyone knows how I can get some help with this, whether, you know, there's someone out there who wants to pay for this work or whether I can find volunteers to help me. I don't know. It's all a bit tricky, trying to work out managing people to work on this. So yeah, that's the software I've built. And I guess, has anyone got any questions? If you have a question, please raise your hand so I can see you all there. I'm coming. Thank you, Edward, for that. Hi, I'm Siebrandt. I'm a volunteer at Wikimedia. Wikimedia has a service called Wikimedia Cloud Services where you can get free compute resources. I would highly recommend that you look into that. So the machine I'm running some of this stuff on has 60 gigabytes of RAM and two terabytes of disk. Would I be able to get that much from Cloud Services? I would highly recommend that you talk to someone there, as you may have a project that's quite valuable to the Wikimedia movement. I'm sure that someone will try to help you. Thank you for your contributions and for the talk. Have you considered interfacing or linking with Osmose? It's a quality assurance project, where you see alerts on the map, dangling ways, et cetera. I think it's somewhat extendable and it has an existing user base. Maybe you could benefit from that. I haven't looked at this. I will write to you later. 
Thank you. Hello. I have two remarks. First of all, I'm the maker of MapComplete, which also has an etymology theme to link Wikidata to streets, so we can work together on that. And then second, a small remark on adding an OpenStreetMap ID to Wikidata. That's a bit of a flawed approach because IDs aren't very stable in OpenStreetMap. Say that a new park is opened: I place a point where the park is, and then a few days later someone else passes by and says, oh, we have aerial imagery now, draws the outline as a polygon and then removes the old point. That means that the link would be broken in Wikidata. I mean, I guess we just have to deal with that. We can have software that looks for these broken links. Maybe it would be nice if OpenStreetMap could add redirects like Wikidata has. Yeah, except that it's way more difficult than that because, for example, sometimes you have a big street and then you have properties which are different for parts of the street, and then the street gets split into three parts. So then suddenly you'd have to redirect to three different parts. Do you think that it's a mistake to add OpenStreetMap IDs to Wikidata then? Yes, basically. It does make sense at first glance, but technically it will break down over time. So it's better to add a link from OpenStreetMap to Wikidata and then look it up in reverse, because the editing tools will keep track of the Wikidata link. So if the road gets split into multiple pieces, every single piece of the road will get a backlink to the Wikidata item. Yeah, you might have a good point. But let's have a discussion after the questions. Hi Ed, thanks for sharing the new software. It looks great. So I was fascinated by the example where you showed more than one potential match, and I just wondered, does your software have a role to play in improving the quality of the data by cross-referencing between the two sides? I think it can improve the quality. 
Like I say, when I run this I find duplicates in Wikidata that are difficult to identify from just Wikidata itself. I feel like the coordinates that are in Wikidata don't get much use. For a long time you didn't even see the map appearing on the Wikidata pages, and so a lot of the coordinates were wrong. People transpose digits. Since the map is visible, people are more likely to check their data. The fact that the two systems exist means you can cross-reference them and find errors. Yes. I'm wondering how relevant it is now, based upon the question just a moment ago. But I was wondering, can you search Wikidata for a lat/long window and find all objects within it when you're adding data to OpenStreetMap? So underneath I'm doing SPARQL queries to Wikidata, and Wikidata SPARQL queries do support coordinate bounding boxes. You can write your own query in SPARQL that will give you all the churches within a given bounding box. I demoed two separate systems that should really be combined into one, and the old system doesn't support bounding boxes. It's all based on place polygons. You have to say, show me things that are in Brussels. You can't say, show me things within this rectangle. And the new system is more bounding-box based, in that you see the map and it just shows you all the matches that are in the rectangle that's visible on the screen. I'm not sure if that answers your question. It does, I think. It's very valuable what you've done. Thanks. Any other questions? Raise your hand. Hi. Thank you for your talk. I had a question about the OpenStreetMap tags that are in Wikidata. I think you showed this in one of your slides. How often are these tags uploaded from OpenStreetMap, and does it pose any problem with the license compatibility issues that you talked about? I think you mean the property for OpenStreetMap tag or key. Things like I showed, the artwork type. Is that right? Is that the one you're thinking of? There are a few properties in Wikidata. 
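A bounding-box query of the kind mentioned in the answer can be built as a string like this. This is a sketch, not the tool's own query: it relies on the `wikibase:box` service and the prefixes predefined by the Wikidata Query Service, uses P625 (coordinate location), P31 (instance of) and P279 (subclass of), and the function name is made up. The class QID in the usage line is believed to be "church building" but should be checked on Wikidata before use.

```python
import re


def bbox_query(south, west, north, east, class_qid, limit=100):
    """Build a SPARQL query for items of a given class inside a lat/long rectangle."""
    if not (south < north and west < east):
        raise ValueError("expected south < north and west < east")
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a QID: {class_qid!r}")
    # WKT points are written as "Point(longitude latitude)".
    return f"""SELECT ?item ?location WHERE {{
  SERVICE wikibase:box {{
    ?item wdt:P625 ?location .
    bd:serviceParam wikibase:cornerSouthWest "Point({west} {south})"^^geo:wktLiteral .
    bd:serviceParam wikibase:cornerNorthEast "Point({east} {north})"^^geo:wktLiteral .
  }}
  ?item wdt:P31/wdt:P279* wd:{class_qid} .
}}
LIMIT {int(limit)}"""


# Rough rectangle around central Brussels; Q16970 is an assumed class QID.
print(bbox_query(50.83, 4.34, 50.86, 4.38, "Q16970"))
```

The sketch only builds the query text; sending it to the query service, and handling rectangles that cross the antimeridian, are left out on purpose.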
Yes, the OSM tag, like the sculpture one. I don't think there's any problem in terms of the intellectual property. It's kept pretty up to date. People invent a new tag to use on OpenStreetMap, and then they go and find the matching Wikidata item and add the tag to it. And even for some unofficial tags that are used on OpenStreetMap, the information is in Wikidata. So it's pretty current, I think. So, a similar question from my side. Nice presentation. You explained the licenses nicely when you said that you cannot copy data from OpenStreetMap to Wikidata, but what about the other way around? So that's an interesting question. And the OpenStreetMap community is a bit suspicious of the information that's in Wikidata. There's a feeling of, you know, where did the coordinates come from? Were they just copied from Google Maps? Do people look up a thing on Google Maps, find the coordinates, and put the coordinates into Wikidata? And then does that make Wikidata a derived work of Google Maps? And so, you know, it's probably fine to copy any data from Wikidata into OpenStreetMap. You know, if you want to copy a name in a different language, you know, that's probably fine. But my software doesn't do that. I just add the links. And, you know, once the links are there, it's easier for somebody else to come along and find these things and copy the data over if they want. So my question is, does the software do the API requests on the back end, on your hosted service, or is it the client, the user's browser, that will do the API requests? I showed two versions. The old, you know, the more established version is using the Nominatim API to find things. And then it's using the Overpass API to grab lots of map data. And then it uses the OpenStreetMap API to push the changes you make, to upload the Wikidata tags back into OpenStreetMap. And the new system I built maintains a full mirror of the OpenStreetMap data just to make things faster. 
So I'm not using APIs for downloading data with that one. I just use the API for saving the changes. Does that answer your question? Yeah, partly. But the request to fetch data from Wikidata, does that go from your servers? Do your servers fetch the data? It is all going from my server, yeah. It's not from the client browser. I do a lot of pre-processing before I show you the list of matches, and then I store them all in the database. So when you load the list of matches for a place, it's not doing any queries against the APIs, either on the server or on the client. It's all stored in the database. I mean, that's a problem: the matches get stale. There's a refresh button that you can hit, and it will go off, rerun the matcher and get fresh data from OpenStreetMap and Wikidata. Yeah, okay, thanks. There was a question here? No? Okay, so I'll be back on the other side. Hi, I'm Valerio from Milano, and thank you so much for this tool. Again, thank you to the person who mentioned the possibility of hosting this tool on the Wikimedia Foundation infrastructure, because it would be really, really nice to propose this on the Wikimedia Phabricator, and I would be interested in discovering how the discussion goes. Second thing, you asked how to fund your development. I think you can just contact your local Wikimedia chapter; maybe they provide microgrants or something like that. In my local community, some volunteers can often obtain microgrants within one week to develop small tools or to boost some activities. Maybe this can be interesting if they are useful for the university to produce open source software and libre content. One piece of feedback on the user interface: it's not clear to me how to contribute on just one element. If I have one minute, I want to visit the tool and connect just one item, because I'm 100% sure about that item, so I just want to save on that contribution and be kidnapped, I don't know.
So maybe this can be useful, if it's not already possible. There are two approaches for that. If you click on the title of an item, it takes you to a page where you can edit just a single item. Okay, wonderful. And at the top of the page there's an uncheck-all tick box, so you can tick the box next to one thing, scroll to the bottom and hit save. Both of those will work for adding a single Wikidata tag. Okay, thank you. And thanks for your comment about contacting my local Wikimedia chapter, that's a good idea. Last thing, sorry, can you repeat why you need two terabytes of data to have this working? Thank you so much. The OpenStreetMap database is big. The Earth is big, and I keep a whole copy of it to make things fast. So it's probably 1.6 terabytes to store all of the OpenStreetMap data. I think that's time up. So thank you. Thank you. |
Reimplementing the Coreutils in a modern language (Rust)
Doing old things with modern tools |
Hi. Hello. I hope you had good beers yesterday. Thank you for coming this morning. I'm going to talk to you about some work that a few of us started a while back, which is reimplementing one of the pieces of software that we all have on our computers: the coreutils. So I will go through the history of that project, explain what we are trying to do and why, and maybe do a demo; let's see what happens. So, who I am. I'm doing a lot of things, way too many things according to my partner. I have been a Debian developer for like 15 years, and an LLVM developer for 10 years. My actual job is director at Mozilla. I'm doing 2,000 things every day. But this work is clearly unrelated to what we are doing at Mozilla. Don't tweet saying Mozilla is working on that stuff; I will get in trouble, and I don't want to get in trouble. It's not a Mozilla project. But I have been working with Rust developers for a long time. I manage some key people in the Rust project, and in Paris we had the chance to have a bunch of people who have worked on Rust for 10 years. So I have been in touch with those developers for a long time. I also uploaded the initial version of rustc to Debian a long time ago. And if you don't know about packaging, you can package a piece of software when you are not an expert in the language it has been written in. I know that sounds crazy, but I'm not a C++ compiler developer and I have been maintaining Clang for like 10 years or so. I have also been maintaining some of the most common Rust packages in the Debian archive, and therefore Ubuntu, for a long time. But yeah, let's talk about what happened. If you remember, something weird happened three years ago now. Most of the planet went into lockdown; in our country, sorry, in France, and I think it was the same in Belgium and Italy and some other countries, they decided to close everything. So I don't know what you did on your side, but myself, I asked, what can I do with the free time that I have? So some people made bread.
So I stole a picture from Julien Danjou. He made some fancy bread. Who made bread here? A few. Cool. Some others did some woodworking. So I stole a picture from someone who used to work at Red Hat, but now he's working at Mozilla. He did some woodworking. The picture is ugly, it's not my fault, but some people did that stuff. Some people did gardening. And myself, what I did: that's my son on the top right. He loves Lego, but he destroys everything. Everything was put in a single bucket, and we decided to rebuild the Lego. So it kept us busy for like three or four days. And then we still had to work, but my partner and I are 40-something, and she decided to rewatch Buffy the Vampire Slayer. I have a good memory for TV shows, so I was like, yeah, I don't want to watch it again. And then she also did that one. So you can do the math. It's like 204 hours. And basically, I used that time to work on Rust, because she was watching that stuff and I had already seen those episodes when I was younger. So yeah, I wanted to learn Rust. Before that, 10 years ago, I worked on a fancy project, at the very beginning of Clang, when it was just starting to support some basic C++. I packaged it into Debian and then I rebuilt the Debian archive with Clang instead of GCC. It got me a job at Mozilla in the end, and a lot of fun. I'm still doing that stuff, even if I should stop at some point. So the idea of that project was: how can you replace a compiler with another one? And I was like, yeah, I'd like to do the same with Rust. I'm not a student anymore, so I don't want to do projects that are useless. I want to work on something interesting. So I started thinking about what I could do. The first idea: do I want to rewrite glibc in Rust? Maybe not. It's probably too hard. Clang, LLVM: it's crazy, there is no way I can do that. And nobody is going to be interested in those projects. So I was like, what about the coreutils?
So one of the things about the coreutils is that initially, I didn't know anything about that stuff. I was like, oh, it's probably full of assembly. I don't want to learn assembly. I don't want to read assembly. But in the end, there is no assembly in coreutils. There is just one file, and I don't care about that one anyway. I'm good. But yeah, some people are going to say, why are you doing that? It's pointless. And yeah, it makes sense; you can think that. But the first answer is: why not? We have all had crazy ideas in our careers, and this one is one of them. Rust is amazing, so you can bring some value. And you will hear me repeating this a few times during the talk: the GNU implementation is fantastic. The people doing that work are amazing. They are giants. We always hear that Rust is amazing at security; at Mozilla, we keep repeating that. But against the GNU implementation, it's not an argument. There have been only 17 CVEs in the last 20 years. So the reimplementation is not about security. And it's not about the license. I know that some companies care a lot about licenses. Myself, as long as I can upload it into Debian, I'm fine. But some people love debating how the GPL or MIT are amazing, or how they suck, depending on who you talk to. I am not interested in having that debate. I leave that debate to Reddit or the other one. And last but not least, it is super interesting. I hope that during this presentation I will be able to convince you to contribute and write patches. It's not that hard. I will even do a demo of fixing a bug live, hopefully. So I keep talking about that stuff, but I think you all have at least a basic understanding of what the coreutils are. So we'll start with a quiz. Who in this room was born before 2000? I see a lot of gray hair. Oh, yeah, a bunch of people. After 90, who was born after 90? After 80? After 71? Yeah.
So congrats, you are younger than the initial implementation of the coreutils. The first version was published by Ken Thompson in 1970. Thanks to Software Heritage, the archive run by Inria and a lot of other actors, you can see the sources of the initial implementation. As you can see, the .s means assembly. I won't share any assembly code today, don't worry. But you can look at the source, and it's pretty amazing to see that this code was written 53 years ago. So Ken Thompson and Dennis Ritchie worked on that stuff a long time ago. And what we are doing right now, and you can generalize this to most things in tech, is building on the shoulders of those giants. Those two folks invented things that we are still using on a daily basis; we all use cp, mv and all that stuff. Even if you don't know about it, your system is probably using it behind the scenes. I will also mention a very good podcast from Adam Gordon Bell, who is interviewing Brian Kernighan, talking about the history of the Unix operating system. So those folks wrote a new implementation of those commands in 1972. You can see that, for example, cp and the if command were written in C. The code is surprisingly easy to read. This one is the source of chmod from, again, 50 years ago. I'm sure that if you know a bit of C, you can read it. And I find it fascinating that the code those folks wrote 50 years ago, which I'm showing you in 2023, is still relevant, and people can relate to that code. You can ask yourself: is the code you are writing today still going to be valuable in 50 years? Probably not. But those folks, they made it. And it's probably going to stay for a long time. And this one is the actual implementation of the chmod function, written in 72. So it's not crazy code. It's full of bugs, probably. But it worked, and it is what started Unix a long time ago.
And what I found surprising, listening to that podcast, is also what an amazing programming language the coreutils and those commands are. We take that for granted. But when you think about it, when you use sort, uniq, cat and all that stuff, it allows you to do some crazy things very quickly. So let's take a few seconds and think: I give you a text file, and I want you to tell me the five most common words longer than six characters in Shakespeare's books. We can all do that. It's probably six or seven lines of Python, the same in Rust, and the same in other languages. But if you do it with the coreutils and bash, it is this command. And when you think about it, it's very impressive. The pipe and the redirection are the key things here, and they are what makes it so easy to program on a daily basis. Your whole system is running that kind of stuff on a daily basis, and it makes it super easy. I'm not saying that it is great. It's not fault-tolerant. If you have a single error in that command, everything is going to break, and you are not going to get what you want. But still, you can do that kind of thing very quickly and very easily. By the way, the results are these. So in Shakespeare, with more than five letters: should, further, exeunt. I don't know how to pronounce that one, so I had to Google what it means. It means that you leave the scene, or at least that is what it means in English for Shakespeare. It's pretty funny to do that stuff. I did it up to eight characters, and it's quite interesting to see which words Shakespeare used. So now, let's talk about today. My brother is a history teacher. I'm not. So I will talk about what we have now. There are 105 commands in the GNU implementation, and we are trying to reach that level. You are very familiar with many of those. Some of them you have probably never heard of. And I'd also like to remind you that what is in the coreutils can be weird.
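The "six or seven lines of Python" mentioned for the Shakespeare exercise could look roughly like the sketch below. It is an illustration, not the command shown on the slide: the function name, the tokenisation rule (runs of ASCII letters, lowercased) and the sample sentence are assumptions, and the length threshold is a parameter because the talk is loose about whether it is five or six characters.

```python
import re
from collections import Counter


def most_common_long_words(text: str, min_length: int = 6, top: int = 5):
    """Return the `top` most frequent words that have at least `min_length` letters."""
    # Assumption: a "word" is a run of ASCII letters, compared case-insensitively.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if len(word) >= min_length)
    return counts.most_common(top)


# Tiny stand-in for the real corpus; with the actual text you would read a file instead.
sample = "Exeunt all. The father should further speak; exeunt. Should he? Exeunt."
print(most_common_long_words(sample, min_length=6, top=2))
```

The shell pipeline does the same work by chaining small tools, which is the point being made: each stage (split into words, filter by length, sort, count, keep the top few) maps to one command.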
So in the coreutils you don't have find, less, top and all those commands, but you have some other things. And most of these commands come with arguments which sometimes conflict with each other, and which sometimes completely change the behavior of the command depending on what you enter. So, second quiz. Who knows about these commands: ls, cp, mv? Everybody, sure. And then this one, probably too. Now we are starting with the hard stuff. numfmt. So it's a command that, yeah, there is one of the maintainers of the project here with me, so of course he knows. But he's pretty much the only one. It's the kind of stuff that we have to deal with, because we want to be a drop-in replacement for the GNU project, and it has those kinds of things. And who knows about pr? Yeah, one guy. Someone else? It converts text files for printing. So it is one of those commands, and it has a huge number of arguments which probably conflict with each other. Then csplit: who knows about csplit? Daniel knows. Yeah, just a few people in this room. It splits a file into sections determined by context lines. Yeah, it's for scripting, right? Yeah, it's weird. And we have plenty of others. We have factor to do math. We have pinky; I don't remember what it does. tsort does some kind of sort. shred is to really remove the data of a file from the drive. I think that one is more common, but still I rarely see it in scripts. And we have a bunch of implementations of the coreutils available on the market. The most common one, which everybody knows, is the GNU implementation. There is the BSD one, which is used on Mac, for example. BusyBox is the one you use on embedded devices or when you want to recover a system. Toybox: one of the core developers of BusyBox decided to write Toybox because he was sick of license discussions. I learned recently that there is a V implementation.
Don't ask me what V is. I don't know. And if you are aware of other implementations, please let me know. I will tell you why I want to know. So let's talk about our implementation. It was started by Jordi, I will butcher his name, Boggiano, in 2013, before Rust 1.0. I sent an email to Jordi, because he has a .be email address, saying I'm going to present the work that you started 10 years ago, and he said, cool, glad to hear that this project is still alive. Myself, I found it in early 2020, before COVID, and then I started contributing in April. Remember that COVID started in Europe in March. It's not a coincidence. Now, here is the size of the project. We have 13,000 stars and 350 contributors. The second contributor to the project, the one whose picture has a white background, is over there. Well done, Terts, for your amazing work. He's the one reviewing our PRs; I'm doing the easy ones. I'm not a very good developer in general. Now it's packaged in most distros. Obviously, and it's not a coincidence, in Debian and Ubuntu, but also Fedora, Gentoo and most of the others. It is shipping in Apertis, which is a Linux distro for cars, and between you and me, I find it scary that they are using our work in production. But I always have impostor syndrome when it comes to development. And it's used by a social network through the Yocto Project. It's not Facebook, the famous social network; it's another one, and they are making glasses and so on. So I think you can guess who they are. They are using it to take pictures with the glasses. And this one is for license reasons. So now I'm doing some product placement for one of the Mozilla achievements, Rust. Why do we want to do it in Rust? You don't have to worry about security issues. At Mozilla, on Firefox in particular, we see security issues caused by C and C++, not on a daily basis, but almost. And you should not do C and C++ anymore if you care about security. It's very portable.
One of the things that I learned with this project is that we are supporting a lot of configurations, a lot of operating systems, and Rust is really amazing for that. It was one of my big discoveries. It's old news, probably, for some of the Rust developers in the room, but for me it was a surprise. And I really don't like to reinvent the wheel. We can leverage a lot of crates which have been developed by very talented people over the years. lscolors, for example, is used by lsd or exa, probably, to provide the same colors as ls, and we are using it in the Rust coreutils. walkdir does recursive operations on a directory; we are using that crate, so we don't have to worry about that part. tempfile, we use it for mktemp, for example. If you look at the sources of the GNU coreutils, they have to implement everything by themselves. Sometimes they use some libraries, but they have to rewrite a lot of things, while we can reuse what others have been doing. And last but not least, we have amazing performance. I will do a demo later of some of the performance that we have. Surprisingly, in some cases we are significantly faster than the GNU implementation. And it's a very popular language. No need to explain why, but we have a lot of contributors. Sometimes we are struggling to keep up with the number of requests, just because everybody wants to learn Rust. And you should, if you haven't. So what is the goal of this specific implementation? When I took over the project a few years ago, I had exactly the same idea in mind as Chris Lattner and Apple had back then with Clang, which is to be a drop-in replacement. If you are not aware of that story, when Apple decided to work on Clang, one of their goals was to be a pure drop-in replacement for GCC for most of the options. And it has been one of the reasons for its success: you just had to override the CC or CXX variable and you could use Clang directly.
And if it was not working, it was a bug in most cases. So it works surprisingly well. Now Clang is a de facto standard for compiling most of the very complex applications like Chrome or Firefox. What can we do to replicate that? So we focused on that one: being a drop-in replacement. We want it to be cross-platform. That was decided before my time as the leader of the project, but I love the idea. So we support these operating systems. We are struggling with the FreeBSD CI, because it sucks on GitHub, but besides that, it's working pretty well. Also, except for Fuchsia, we have CI for every one of them, so for every PR we run a lot of tests on those. It's very easy to test: on my laptop, running the full test suite takes less than a minute, and it covers a large part of the code. I will share some of the code coverage information. The license: I don't care about it. Some people do. For some people it's a strength, for some people it's a weakness. But it's an MIT license, so that social network can reuse our code to save money. Anyway, I'm not interested in having that debate. And in my opinion, and I haven't seen anyone in the community using that argument, it's not a fight against the GNU project or the FSF. The GNU project has been doing a lot of good things for us in the open source world for 20 or 30 years. We are standing on their shoulders. It's not a fight. I know that some GNU coreutils developers have been monitoring this project for a long time, and they have been very friendly. And I met Jim Meyering 10 or 20 years ago, and I was very impressed by that person. So, yeah, it's not about fighting. It's about collaboration. When I started on the project two or three years ago, my initial goal was to be able to boot a Debian on it. My laptop here is running the Rust coreutils now, for example, so I'm not lying: it's working well now. Then it was to install the top 1,000 packages in Debian.
The reason I wanted to do that is that Debian has a lot of scripts to configure packages, the postinst scripts, and they are usually written in bash and use sort, cp, mv and install. That exercises a lot of features of the GNU coreutils. One of the goals was also to build three of the big projects I care about: Linux, LLVM and Firefox, obviously. They don't use that much scripting, so bash or coreutils, but I still found some bugs; building Linux hit some corner cases. And of course, the goal was to package it into Debian and Ubuntu. I published some blog posts about it. They have been shared in a bunch of places. I got some interesting comments, and some not very interesting. Anyway, to achieve those goals, we had to deploy a CI, add code coverage support and improve the code coverage. It's one way to get familiar with a project; I wrote a lot of unit tests. So now the code coverage of our implementation is 80%. If you don't know much about code coverage, 80% is usually what we try to achieve in a project. It means that the code coverage is very good. It's very hard to reach 100%, and sometimes it's a waste of time, but 80 is usually considered very good code coverage. I think on Firefox we are at 65 or 70, something like that. And we plugged in a bunch of tools. I love static analysis and linting, so of course I had to do that for this project. We also documented a lot of those processes. Everybody loves docs, so we wrote plenty of docs. It took about a year to reach that state. So now, the current state. What we have here is CI that is running for every PR, and we run our implementation against the GNU test suite. This is the latest graph, so you can see that we have been working a lot on improving compatibility. We are not there yet. I will fix one of them with you later during this presentation. But there is no silver bullet. For many of those, you have to spend a few hours to fix one.
And you improve one by one. Before you ask the question, the skipped tests are mostly about SELinux. cp, mv and some other commands, or chcon, use SELinux, and GitHub Actions uses Ubuntu, and Ubuntu doesn't have it by default, so it's tricky to test that stuff in the CI. If you know how to do it, please reach out to us; we'd like to fix that. So how do we work? As I said many times, we want this to be a drop-in replacement. So we wrote a mini wrapper to make it super easy. The command that I'm showing is going to run the GNU not-owner test for the touch command. So it's going to take the GNU test and run it against our implementation. It's super easy to test. And we wrote some scripts to make our life easier. We have a Python script which gives us the list. If you run it, it's several pages, because we still have a lot of tests to fix. And we also wrote a fancy page; I can show you what it looks like. So here it is. We have the list of all the tests for each command. And of course, the big one is misc, where we have a mix. Sometimes it's just a one-line change in the code. Sometimes it's a big refactor. It depends. It's part of the fun. You never know what you are going to get. So let's use an example, for mkdir. This is the GNU output. So you do dash p. Who knows what dash p is in that one? Okay, cool. So it creates recursively, and v is verbose. Everybody knows that. In the GNU implementation, they decided to do it directory by directory. You can argue that maybe it's not a good choice, that it's not smart. Maybe it is. Who cares? It's legacy. We have to deal with legacy. Our implementation was that one. You can argue that maybe ours was better or worse. Who cares? But we updated the code, so we match exactly what GNU is doing. I will share the slides afterwards; you can look at the change. It's pretty easy to understand. This one is one of my favorites.
So in Debian, when you install LibreOffice, it uses AppArmor. And someone decided that instead of doing a touch, you do an install of /dev/null, which creates an empty file. It's legacy also. Do we want to fix legacy everywhere? Probably not. So it was one of my favorite bugs. I started investigating. Here is some Rust code which reproduces the issue. If you do a copy of /dev/null into a text file, it fails with: the source path is not an existing regular file. So I opened a bug upstream. The thread is quite interesting; people in the Rust community are very passionate about that kind of thing. Too long, didn't read: it hasn't been fixed. So here is a workaround. It's ugly, but unless you know a better way to do it, besides fixing Rust and dealing with the fallout of the fix, this is the best that we have. We check whether the input is /dev/null. If it is /dev/null, we create an empty file. This is what it takes to deal with legacy. So let's do a demo together. Please bear with me. It's going to be fun with the mic. Hello. So, ah, uppercase. Sure. Ah, crap. I just switched to Alacritty and I never remember the shortcut to increase the font. Ha ha. This one. Better. Cool. So this is not the GNU version that I'm running. You can see I'm running our implementation. We released version 0.0.17 one week ago, something like that, so we are using that release. When I took the train to come here, I looked at the test suite and tried to find a cool bug for the demo. I found one, and I want to show you what it is. Because I don't always remember the commands when I'm in front of a crowd, and I don't want to look stupid, I wrote a Post-it, in French. So here it is, building some stuff. What I'm testing right now is the sort command. The sort command has some fancy flags. This is a command from the GNU project, so I will show you what the test looks like on the GNU side. So here is the test.
So of course we have the GPLv3 on top, and then there is a list of files with version names. At the top we have the input, and below we have the expected output. The command they are testing is a stable sort with sort by version. It's for sort, and the same exists for ls: you can specify what kind of sort you want. And of course, sorting versions is super complex and we could debate it for hours. I love it. So here is what they are testing. As you can see, it fails on our side. And why did it fail? It seems that it is complaining that 5.4.0 is not sorted the same way as 5.0.4. Super interesting, isn't it? Basically it means that we are sorting those versions differently. Of course, it's obvious that it is the same version and that they should be equal, but when you are sorting, equal doesn't mean anything in those cases, so you need to make a decision. And we decided otherwise, because I think the person who did the implementation didn't realize that GNU was doing something different. So let's try to fix that together. What I like to do, and I don't know how other contributors work, is to have test cases. So I'm creating some basic commands to be able to reproduce the problem easily, so that I don't have to run the whole test. I'll spare you the details. So here is the test case. Let me do that right now. I've got my test file. What I'm doing is forcing the full path to use the GNU implementation. I use the two arguments that we mentioned earlier, and now I have the input and the output. In theory, if I do a diff, it should be empty. And it is. I love it when the demo works well. It's just the beginning, so I will probably fail at some point. So now I want to test our implementation. Here it is. So now I have a simple test case that I can work on. I don't have to run the test suite. I don't have to do anything else.
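The tie-break problem described above (two version strings that are numerically equal still need a fixed order when sorting) can be sketched as follows. This is a simplified illustration in Python, not the project's Rust code and not GNU's actual version-sort algorithm: the chunking rule, the `zero_first` flag and the sample strings `5.04.0` and `5.4.0` are all assumptions made for the example, and the sketch does not claim which order sort or ls expects.

```python
import re
from functools import cmp_to_key

_CHUNKS = re.compile(r"[0-9]+|[^0-9]+")


def _is_number(chunk: str) -> bool:
    return chunk[0] in "0123456789"


def version_cmp(a: str, b: str, zero_first: bool = True) -> int:
    """Compare two version strings; return -1, 0 or 1.

    Digit runs are compared by numeric value, everything else as plain text.
    If the strings differ only by leading zeros, `zero_first` (hypothetical flag)
    decides whether the zero-padded one sorts first or last.
    """
    chunks_a, chunks_b = _CHUNKS.findall(a), _CHUNKS.findall(b)
    for x, y in zip(chunks_a, chunks_b):
        if _is_number(x) and _is_number(y):
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x != y:
            return -1 if x < y else 1
    if len(chunks_a) != len(chunks_b):
        return -1 if len(chunks_a) < len(chunks_b) else 1
    # Numerically equal: the only possible difference left is leading zeros.
    for x, y in zip(chunks_a, chunks_b):
        if x != y:
            a_has_more_zeros = len(x) > len(y)
            return -1 if a_has_more_zeros == zero_first else 1
    return 0


versions = ["5.4.0", "5.04.0", "5.3.9"]
padded_first = sorted(versions, key=cmp_to_key(version_cmp))
padded_last = sorted(
    versions, key=cmp_to_key(lambda a, b: version_cmp(a, b, zero_first=False))
)
print(padded_first)
print(padded_last)
```

Making the tie-break an explicit parameter is one way for two callers with different expectations to share a single comparison function.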
It's one of the things that I love about this project, and it's why I'm not writing Rust code at Mozilla: I'm not a very good developer, and Firefox is super complex, while this stuff is pretty easy to do. You can do it on the train on the way here. For example, this one is probably going to take 20 minutes to fix. Well, there is a weird thing at the end, but I'm not going to spoil the surprise yet. Anyway, this is the code that I have. So let's dive into the actual code now. Let's use GDB. GDB and Rust work very well together. If I do a run, yeah, it works. If I set a breakpoint on main, yeah, it works. Next, next, next. Here it is. You can see the Rust code. You can evaluate variables; it works super well. I won't open the code and look for the function; I already did that for you. So I already know the function where to put the breakpoint. It's version_cmp. Our code is pretty well written, so I guess you understand what version_cmp is doing: it's comparing versions. Now I continue the execution, and I'm in the function. If I look at a, I see that it is that string, the first one. If I look at b, here it is. So I have the two strings. I will move on quickly through the execution. So we have the version compare, and here is a function that I care about. I will scroll. That function is probably the one I care about, because, if you don't know Rust, it is going to trim the zeros at the beginning. So obviously, in this case, it is what I'm looking for. So I want to remove that first. So let's fix the code now. I learned from one of my colleagues to use nano, so I'm following the example. Yeah, exactly. It's the best, right? So here it is. I removed the trim. Of course, it's going to break something else afterwards. I'm confident that it works. So let's rebuild that stuff. So here it is. It's rebuilding. The file that I touched is part of uucore.
It's one of our base libraries, for file management and so on, so it's normal that it is rebuilding the various dependencies. Not GDB. I want it without GDB. Please. That one. And did it work? Yeah, it worked. Cool. I fixed the bug, so I'm very proud of myself. I love how geeks love version comparison. Now the funny part of the story, of course, because otherwise it would be too easy. I'm not going to run the full test suite, even if it takes less than a minute; I'm just going to run one test, because I know that it fails. ls uses the same function, but ls has a different expectation in terms of version sorting. So, of course, that test is going to fail, because ls likes the zero before instead of after. So, of course, that test fails. One of the things that we could do in the version compare is to have a boolean: do we want zero to be first, or do we want zero to be after? And I say this because I don't like having the version comparison function in our tree. I think it's something someone else has already done. Of course, there is a crate doing it. So I reached out to the developer, saying, can you add an option to change the sort order when there is a zero? Upstream made fun of me, but they have a PR ready, so they will probably land it. Then we are going to remove a hundred-odd lines of code and use that crate, because that is what we want to do. We don't want to maintain a version comparison, because who cares, right? So let's come back to the presentation. Performance. Benchmarking performance is hard. I know that in the room we have some experts in performance, and they are not going to contradict me when I say that benchmarking is hard. We are using hyperfine. We can see that, for example, sort is almost five times faster than the GNU implementation, because some people spent a lot of time improving it, in our code but also in the crates. That benchmark is basically taking all of Shakespeare's books as a text file, shuffling it and sorting it.
So we are significantly faster than the GNU implementation. Similarly, if you do a recursive ls or a recursive cp, we are 1.5 times faster than GNU, probably because the code generated by rustc is much better than the one GNU has had for a long time. But I don't want to pretend that we are always doing better. For example, with the factor command, we are significantly slower than the GNU implementation, five times. I don't know who uses the command factor. I learned about that command when I started contributing, but still, we want to be a good replacement and that one is very slow. So I'm going to do another demo, and I'm going to do some product placement for a project called samply. He is the author of that project. He is in the room, so if you need anything, don't blame me, blame him. I will do a quick demo about how you can do proper benchmarking of performance. I will try to find how to do the perf kernel setting change. That one. I love the SH. So I just did a recursive ls with a command called samply. So basically, it is going to instrument it and it's going to upload the analysis of that program into the Firefox Profiler. So I will zoom in a second. So here it is. I think Mozilla did four presentations of the profiler today, so if you haven't seen any, I'm going to do a quick demo. Anyway, one of the magic things of Rust is that we can easily do that stuff. You saw how long it took. I will show you again the command that I ran. This is that one. So: samply record, the binary, and the arguments. So it's very easy. If I tried to do that with C or another language, it would take forever, but Rust and samply make that so easy. So here I have the flame graph. So I can see that for a recursive ls, most of the time that we are spending is in display_grid. So most of the time that ls is going to spend is not reading the file or reading the metadata. It's just doing computation about how you want to show the result in the terminal.
And one of the amazing things is that I can look at the source directly and see the counters, and I can easily benchmark that stuff. So if you are into performance in Rust, you can use samply and the Firefox Profiler to do it. And again, I'm not here as Mozilla, but it's very valuable for any project. It's not Mozilla related and it makes that stuff super easy. So we also have some fancy documentation. Terts, in the room, did most of the work. So one of the things that he did that I love: one of the things that I really struggled with at first when I became a UNIX developer like 20 years ago is that I never knew how to find an example. So Terts linked to tldr.sh, which provides examples for every command. So for example, here it's base64. You have examples, but I'm going to use a command that I didn't know about, shred. You have examples for shred. So you don't have to Google on Stack Overflow how to do that stuff. You have that out of the box in our documentation. So we have development documentation. I won't go through it for a matter of time. But one of the things is, we are also taking the liberty to extend the GNU coreutils. So we don't want to break compatibility, but we are doing some fancy things like progress bars, because who doesn't love a progress bar? So for example, of course, it is that one. So we have a fancy progress bar now. You don't have that upstream. So you can thank the guy over there. I haven't done anything. But before the talk, he came to me and said, I looked at the implementation, at the patch that wasn't merged in GNU, and that patch was very hard to understand. If you look at the diff on the Rust side, even if you don't know Rust, you will understand it, because we are relying on a crate. I think the upstream developer of the crate is at FOSDEM also. So thank you for your hard work. So here is what kind of stuff we can do. We can do it for mv.
And mv is interesting because if you are moving from one file system to another, it's not always super fast. We are also implementing some tools that GNU doesn't have. So b3sum, for example, because it's like b2sum, so it's a hash algorithm. And cut -w, someone contributed that patch recently. It's one of the options from BSD that GNU doesn't have. So what is next? We want to implement the missing options for the various binaries. It's not hard. It's fun. We try to be nice. I'm not always nice, but I try to be. We want to have full compatibility with GNU. So the list that I showed you, I'd like that to be fully green at some point. It's going to take years, but it's fun. We want to improve the performance of some key programs, like, for example, factor. I don't know if we want to spend too much time on it, but there are other things where we could improve the performance. And if you're interested, that link is the most interesting one, to know where to start. It's well documented. If you are a student, we are probably going to apply for the Google Summer of Code. And if someone is interested in sponsoring that project, if your company or your foundation is interested, I'd like to buy some credits on GitHub Actions to build faster, because running the GNU test suite takes about an hour. And I think we are using a lot of resources, so it would be nice to have a faster CI. And now I'd like to predict the future. So the Linux kernel, they landed the first support of Rust inside the tree. So if you do some stats on the Linux Git repo, you see that it is only a first step. There are only 37 files upstream. It's mostly glue: how do you manage memory, compatibility with the system, and support for the build system? We hope that they are going to land something in mainline soon, a feature, but is it going to happen this year or not? We don't know. And some people are still challenging Rust in the Linux kernel. The main argument that I heard is mainly about legacy systems.
But my prediction is that next year, we will start to see more and more distro vendors or cloud companies ship more and more Rust code. Probably, my prediction is that some Linux distros are going to ship our implementation in the cloud in the next few years. So thanks for your time. I'd like to remind you that it is not a Mozilla project, so please don't say Mozilla works on that. It will make my life easier. And I think we have a few minutes for questions. Here, I think we need to, in the back. It's an issue for every Rust project currently. The fact that Cargo is amazing at updating dependencies means many Rust projects have been subject to supply chain attacks lately. And we are no exception. So I can do product placement if you want: Mozilla has developed a tool called cargo vet, which mitigates that stuff, but it's too complex to deploy for a hobby project like this one. So yeah, you can do a supply chain attack on our project. But we don't merge the Dependabot pull requests immediately. We are trying to mitigate that by taking our time. Any other question? Yeah, two over there. Yeah, please. The code size compared to BusyBox, I don't know. With GNU, it's similar. So you have different ways to compile the project. So the trick that I'm using in Debian for the package is that I'm building a single binary and I'm doing symlinks. So if you do a symlink from coreutils to cp or to ls, the binary is going to understand that you are calling cp. So the size is comparable to the GNU implementation in that case. But because of the nature of Rust, the binaries are bigger. Another question? Yeah, it's a question that comes back so often. Do you compute the code coverage only on the passing tests or all of them? We run code coverage on everything, including the GNU test suite. So it takes an hour and a half. So the tests that are failing? Yeah, yeah. We test everything.
Did you find any bugs in the original GNU implementation when doing the re-implementation? Yeah. We have a contributor who found two or three bugs in the upstream implementation and they have been fixed upstream. Reported and fixed upstream. Have you had any need to use unsafe and, if yes, can you give a few examples of the kind of areas? We have some unsafe when we are calling libc; some functions of libc we are calling directly. So we have some unsafe. Yeah. I'm going to repeat what he said. He said that it's for libc. It's mostly native calls. Have you looked at code complexity metrics like cyclomatic complexity? Because I guess with the error handling, the memory management and cleanup in C is often really messy, and that could be a huge boon in maintainability. Yeah, yeah. We have some code that clearly needs refactoring. So for example, in ls we have to work around a limitation of clap, and the code here is getting more and more complex. Yeah. Some pieces of our implementation need refactoring to decrease it. But the code is usually pretty easy to understand. Thanks. Okay. I think we're out of time. Thank you all. Don't hesitate to contribute. You can find my email if you need it. |
Zero Knowledge Cryptography and Anonymous Engineering
The development of zk-snarks in recent years and explosion in algos has opened up an entire new design space of anonymous engineering. |
I'm going to talk about the new breakthroughs that are happening in cryptography, which are opening doors to unexplored spaces. I will try to speak louder. The free software movement and Linux at one time had power, it had vitality, but then somewhere along the way we started to play catch-up, we started to try and follow the competition. Linux on the desktop never happened. The once great browser Firefox, its market share dwindled to zero. Even this conference, which has the best minds in the free software community, is funded by surveillance capitalism: Google, Microsoft. In this talk I want to talk about how we can escape the death trap and create the new paradigm of computing. This talk is dedicated to Pieter Hintjens. Pieter Hintjens was from Brussels and he was a programmer who, he wasn't born in Brussels, but he lived and he died in Brussels, and he really embodied the idea of what elegant abstraction means. Abstraction is something which, done poorly, creates complexity, and, done well, empowers the software developer. But he was not just a good developer who made, for example, ZeroMQ, which is a really interesting conceptualization of how we could build distributed systems; he was also a theorist on creating free software, the social layer and creating free software communities. He taught us that free software is more than just having the code accessible; it's an entire philosophy, and when we create good, elegant abstractions, it enables us to create software that's composable while not contributing complexity. This is like the basis of how the Linux architecture is created: there are components, and rather than, like in a Windows system, where there's a System32 filled with hundreds of DLLs, there is a component which people can modify. And in our projects we try to embody his ideals, we try to carry his philosophy. So everything that we use today, the concepts of sockets, processes, the file system, was invented in the 70s with Unix.
Since then, nothing has fundamentally changed about computing. When they created Unix, their vision was something that would enable deep collaboration between communities, and the infrastructure that they created, the software, ended up becoming the basis of the web. But at the time, they couldn't take their vision to its full conclusion. They didn't have the algorithms that we have now, like around peer-to-peer and consensus and cryptography and so on. There wasn't huge network bandwidth, and the resources in the hardware weren't as big. And since its invention, not much has changed. What is a zero knowledge proof? If I have a simple function, and I call the function on a set of parameters or arguments, and I produce a result, the return value of the function, and I want to show you that this value that I've computed from the function is computed from some parameters that went into the function, then normally the way that you do that, logically, is I would have to give you those input parameters, and you have the function, and you would compute the function yourself and get that result at the end, and then be able to verify that the result is what it claims it is. For example, in the Bitcoin blockchain, you're doing transactions and everybody verifies the transactions, that the computation to advance the state machine to the next state is correctly done. Then there are two very interesting properties of a ZK proof. So first, the ZK proof is succinct. What that means is the actual proof object that proves the computation that has been run on the values is very small. It's smaller than the input parameters that go into the function. You would expect that it would be some big proof, but actually, we save computation. When you want to verify that, you save computation compared to if you would compute the evaluation on the input values, what we call the witnesses, yourself.
The proof size is small, and the time to verify the proof is very small compared to actually computing it, and it can be anonymous. So there are some values that you put into a function to get a result. You don't know what s, x, and y are. You know z, you know foo, but you don't know s, x, and y, and that gives us a very powerful technique in our arsenal, in our toolkit of anonymous engineering. So this is the general schema of ZK proofs. So you have a prove function. That's how we generate a ZK proof. So you have some private values, the input values to your function foo, and you have the output of the function, which are your public values, and you create a proof. And then I give you the proof, and you want to verify the proof, and you have the public values from the evaluation of that function, and you get true or false back. So how does it work? This is obviously greatly simplified, but just observe this property. If I have polynomials, and I add two polynomials, and then I evaluate the polynomial, that is the same as evaluating the polynomials and adding them together. This is due to what is called mathematically the homomorphic property of the function that maps from the ring of the polynomials to the ring of the closure. And it works as well for multiplication. So just remember that homomorphic property. And then what we do with that function foo is this step called arithmetization. So any code that you write, we turn that code into a set of polynomials. So how do we do that? Well, here imagine a and b are Boolean values, either 1 or 0. So how can we turn those into arithmetic expressions? If you look at those formulas in the top left and these tables on the right, and you do the calculation, you will get the same thing as long as a and b are 0 or 1. When you evaluate those formulas, they are equivalent to those Boolean expressions.
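The slide's formulas are not captured in the transcript, so the following is a minimal sketch using the standard arithmetic encodings of Boolean gates, which are an assumption about what the slide showed. It checks each formula against its truth table for inputs restricted to 0 and 1.

```python
from itertools import product

# Standard arithmetic encodings of Boolean gates (assumed, not taken from
# the slide). They only agree with the Boolean operators for inputs in {0, 1}.
def and_gate(a, b):
    return a * b

def or_gate(a, b):
    return a + b - a * b

def xor_gate(a, b):
    return a + b - 2 * a * b

def not_gate(a):
    return 1 - a

def gates_match_truth_tables():
    """Compare every arithmetic formula with Python's Boolean operators."""
    for a, b in product((0, 1), repeat=2):
        if and_gate(a, b) != int(bool(a) and bool(b)):
            return False
        if or_gate(a, b) != int(bool(a) or bool(b)):
            return False
        if xor_gate(a, b) != int(bool(a) != bool(b)):
            return False
        if not_gate(a) != int(not a):
            return False
    return True
```

Because each gate is now a polynomial in a and b, a circuit built from such gates can be expressed as polynomial constraints.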
And if you want to enforce, just as a side note, that a value s is in a certain range of values, for example, 0 or 1, well, it's just the same as saying (s - 0) multiplied by (s - 1) is equal to 0, which gives the roots of the polynomial, where it evaluates to 0. That will be a degree-two polynomial, and there will be no other roots there. And so we have that function foo, which, if you remember, where was it? It was here, this one. If s, return x times y, else return x plus y. And how do we arithmetize that? Well, you can see below that we have z = s*x*y + (1 - s)*(x + y). Both of those are equivalent. So we've taken a piece of code, and we've arithmetized it as a mathematical expression. So then we can use this Schwartz-Zippel lemma, which means that rather than having to give you all of these huge-degree polynomials, depending on the number of equations that you're checking, there is something that we can do where we can just evaluate a polynomial at one point. That relies on the Schwartz-Zippel lemma. So let's pretend that we have two polynomials that we're trying to check a multiplication of. If you remember in the first slide, we had fg evaluated at alpha is equal to f of alpha multiplied by g of alpha. So these polynomials, if you notice, they're constructed so that they pass through a certain number of points here. So at x equals 0 the red one goes through 1, at x equals 1 the red one goes through 2, at x equals 2 it goes through minus 1, at x equals 3 it goes through 1, etc., 2, 3, 2. So that's a Lagrange interpolation of those points. And the yellow one, likewise, does the same, but for 0, 2, 2, 0, 2, 1, 3. So you can imagine those are the lines of our kind of proof or program that we're trying to enforce. So the first one might be that we want to check that 0 times 1 is equal to 0, and 2 times 2 is equal to 4, and 2 times minus 1 is equal to minus 2. So how do we construct that proof?
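Stepping back to the arithmetization of foo for a moment, the equivalence between the branching code and the expression z = s*x*y + (1 - s)*(x + y) can be checked directly, along with the range constraint on s. This is a small sketch; the function names other than foo are hypothetical.

```python
def foo(s, x, y):
    """The branching function from the talk."""
    if s:
        return x * y
    return x + y

def foo_arithmetized(s, x, y):
    """The same function as a single polynomial; valid when s is 0 or 1."""
    return s * x * y + (1 - s) * (x + y)

def is_boolean(s):
    """Range constraint from the talk: (s - 0) * (s - 1) == 0."""
    return (s - 0) * (s - 1) == 0

def forms_agree(limit=10):
    """Compare both forms for s in {0, 1} and small integer x, y."""
    return all(
        foo(s, x, y) == foo_arithmetized(s, x, y)
        for s in (0, 1)
        for x in range(-limit, limit)
        for y in range(-limit, limit)
    )
```

The constraint matters: for an s outside {0, 1} the polynomial still produces a value, but it no longer corresponds to either branch, which is why the circuit has to enforce is_boolean(s) as well.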
Well, if we multiply the points together, like so, we get a new set of points. And then, because these polynomials are degree 6, to create the polynomial that comes from multiplying these two polynomials, we need 12 points, which are multiplied from both of these, but I've only done six here. So then we have these points, and we interpolate, we draw a polynomial interpolating those. So this is the new polynomial we get, the pink one. And if you remember this relation from earlier, we now have this polynomial fg, and therefore, if there is a protocol where a random point, a random x value, is selected, then it's sufficient to prove that the evaluation of this combined polynomial fg at alpha is equal to the evaluations of the other two polynomials multiplied together. And therefore, you can be sure that the pink one is the multiplication of all the individual points, because the point is random and the probability of you being able to preempt it is basically nearly zero. And so we can actually see here, if we look at any two points, the top two is the red and the yellow one, and the white one is actually the multiplication of the two points, and the purple one is the purple one. So we've actually created the polynomial which has this property at all points along it. And because it has this property, it's sufficient just to pick a random point and check that that's true. And there is another puzzle piece, which is the polynomial commitment proof. So essentially, you can create a commitment to a polynomial, which is like hashing a polynomial, and you don't know what the polynomial is, so this is where the zero knowledge property comes from. And then there's an object representing a polynomial in your system, and at any time you can create a proof using the polynomial, which has this statement on the right, which says that F is a commitment or hash of this polynomial f, and z is equal to an evaluation of f at s.
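The random-point check described above can be sketched as follows. The point values are reconstructed from the transcript (red through 1, 2, -1, 1, 2, 3, 2 and yellow through 0, 2, 2, 0, 2, 1, 3 at x = 0..6), and the prime field P is an assumption chosen for illustration. The sketch multiplies the coefficient vectors directly rather than interpolating the product points, and it leaves out the polynomial commitment.

```python
import random

P = 2**61 - 1  # a Mersenne prime; the field choice is an assumption

def poly_mul(f, g):
    """Multiply two coefficient lists (lowest degree first) modulo P."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out

def poly_eval(f, x):
    """Evaluate a coefficient list at x with Horner's rule."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % P
    return acc

def interpolate(ys):
    """Lagrange-interpolate through (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    result = [0] * n
    for i, yi in enumerate(ys):
        basis, denom = [1], 1
        for j in range(n):
            if j != i:
                basis = poly_mul(basis, [(-j) % P, 1])  # factor (x - j)
                denom = denom * (i - j) % P
        scale = yi * pow(denom, -1, P) % P
        for k, c in enumerate(basis):
            result[k] = (result[k] + scale * c) % P
    return result

def product_check(fg, f, g, alpha):
    """Schwartz-Zippel style check at a single point alpha."""
    return poly_eval(fg, alpha) == poly_eval(f, alpha) * poly_eval(g, alpha) % P

red = [1, 2, -1, 1, 2, 3, 2]      # reconstructed from the transcript
yellow = [0, 2, 2, 0, 2, 1, 3]
f, g = interpolate(red), interpolate(yellow)
fg = poly_mul(f, g)
alpha = random.randrange(P)        # the verifier's random challenge
```

A prover who submits a different polynomial in place of fg only passes the check if alpha happens to be a root of the difference, and a nonzero polynomial of degree at most 12 has at most 12 roots among roughly 2^61 field elements.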
And so what that open does is it creates this proof on the right, and then I can give you this proof, and I can give you the commitment to the polynomial, which is just a hash of the polynomial essentially, and you can verify that, whatever polynomial is inside of F, z is equal to f evaluated at s. And the same principle is true for addition. So we have multiplication and we have addition, which means we can construct any NP-complete program inside of a ZK proof. Also, another technique is multi-party computation. So with a zero knowledge proof, I can compute a value, I can prove a statement about some values that I hold, but maybe sometimes we need to compute, or other people need to be able to know, certain facts about other actors in the system, and maybe they don't have the incentive to create a ZK proof or to prove statements about values that they hold. So that's where we use another technique that's called MPC, and I will show you how we can do addition of values with MPC. So Alice has some number, some secret number, 110, and Bob has some other number, 1177, and Alice now splits her number randomly, such that those numbers add up to 110. So if you add them up, it would be 110, and then she sends them to each of the servers. So none of the servers know what Alice's number is, but they know parts of it. They can reconstruct it if they collude, but we're assuming they don't collude. And then Bob does the same thing, he sends his numbers, and now when we want to compute an addition of the values, each of the servers will add the values together, and now they get these new values. And if you look, those values added together, when they reconstruct it, give the answer of adding the two private values together. And multiplication is similar, but slightly more involved, but it's also possible. So MPC is another powerful technique.
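The addition protocol just described can be sketched with additive secret sharing. The inputs 110 and 1177 are the ones in the transcript; the number of servers and the modulus are assumptions made for this illustration, and there is no networking, only the arithmetic each party would perform.

```python
import secrets

MOD = 2**64  # shares live modulo MOD so a single share reveals nothing (assumption)

def split(value, n_servers=3):
    """Split value into n_servers random shares that sum to value mod MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recombine shares; only possible if all servers pool their shares."""
    return sum(shares) % MOD

alice_shares = split(110)    # Alice sends one share to each server
bob_shares = split(1177)     # Bob does the same

# Each server adds the two shares it holds, without seeing either input.
server_sums = [(a + b) % MOD for a, b in zip(alice_shares, bob_shares)]

total = reconstruct(server_sums)
```

Here total equals 110 + 1177 even though no server ever saw either number. Multiplication needs extra machinery and is not covered by this sketch.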
Also we have homomorphic encryption. A very simple partial homomorphic encryption is simply this function, which is elliptic curve multiplication. So if I have two values, and I add them together, and I multiply them by the generator of an elliptic curve, or just some point on the elliptic curve, that is the same as taking one value, multiplying it by G, and then adding it to the other value multiplied by G. So with homomorphic encryption, the original idea in the 80s was there's a cloud, and anybody can put values into this cloud, but they're encrypted, and then other people can compute answers encrypted for a certain public key. So you can use this to make computations on secret values. From an abstract level, there is this new emerging field of anonymous engineering, and we can compare it to other forms of engineering. So for example, when we have software, we write these instructions that run on a CPU and execute, and when we do cryptography, we try to use deep mathematical laws to try and create primitives or schemas. But anonymous engineering is actually using those different techniques, like the ones I just showed, or other ones like VDFs or hash functions, public key, asymmetric crypto, et cetera, to try and come up with schemas that enable certain forms of applications with invariants that hold.
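The elliptic curve property described above, (a + b)G = aG + bG, can be checked on a toy curve. The curve y^2 = x^3 + 2x + 2 over the integers modulo 17 with G = (5, 1) is a small textbook example chosen here as an assumption; it is far too small to be secure, and the sketch omits the blinding factor a real commitment would need to hide the value.

```python
P = 17        # toy field prime (illustrative only, not secure)
A = 2         # curve: y^2 = x^3 + 2x + 2 over F_17
G = (5, 1)    # a point on the curve
INF = None    # point at infinity, the group identity

def add(p1, p2):
    """Add two curve points using the standard chord-and-tangent formulas."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF  # p2 is the negation of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result, addend = INF, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result

def encode(value):
    """Map a value to the curve point value * G."""
    return mul(value, G)

# Adding the encodings of 2 and 4 gives the encoding of 6, and so does 3 + 3.
inputs_side = add(encode(2), encode(4))
outputs_side = add(encode(3), encode(3))
```

Comparing inputs_side with outputs_side checks that two sets of values have the same sum without handling the sums as plain integers.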
So let's give the first practical example. I have a set of values, and this set is just represented by a single hash, and I want to prove that my value is in this set of objects. To do that we have to construct something called a Merkle tree. So let's say we have eight values, and what we do is we take two values at a time. So we take the first two values and we hash them together, so we get the hash of them, and let's represent that by A. And now let's hash the next two values, and we get another node B, and then we hash them together and we get another node. So we get this kind of tree in which the root R represents the entire set of values. And this is a simplified diagram; normally these might be 32 layers, so two to the power of 32 values will be in the tree. So for example we had V1 and V2, and we hash them together and we get A, and likewise we have V3 and V4, we hash them together and we get B, and then we hash those together and we get AB, and then we do the same on the right-hand side and eventually we get to R. Now if I have some value, any value, let's say, I don't know, V5, and let's say we also have R, how can I prove to you that V5 is in R? Well, what I need is a pathway to be able to get to R. So what does that pathway mean? For example, if I give you V6, then we can hash those together and we get C, and then if I give you D and we hash those together, then we get CD, and then if I give you AB and we hash that with CD, then we get R, and then I've proved to you that V5 is in R. Instead of needing to give you all of the items, I just give you a logarithmic number of items, I give you a smaller number of items, so it's faster. It's used as a technique, but it can also be used to create an anonymous inclusion proof, so we can anonymously prove that there is some value in R, and we can even encrypt that value, or we can prove other statements on that value, so I'll show you some code, how that
looks like. Maybe I can put this mic somehow, like this, yeah, that would be great. I need to speed up, but here is the proof. You see the Merkle root at the top, and there's a pathway, we're proving some value is in the root, and then we're re-encrypting the value, and we're exporting it, and, yeah, hold it, yeah. And to compile it, I compile the proof, like this, so it's compiled. And then I have, sorry, it's here, I have the code which actually computes the Merkle tree with the value, but then also, you see, it includes the ZK proof code, and then creates the witnesses, and, where is it, and then loads the circuit, and then constructs the proof here. So now we get a proof, and then for the verifier we do here, we verify the proof, so we can just run that, like so. Ah, okay, no internet, but anyway, let's not worry about that. Okay, so then we can use that to create anonymous voting. So how do we do that? Well, when we construct the set of people who are going to vote, we create something like a coin: you generate a random serial number, that's private, and you just create this commitment to it. And then when you want to use up your vote, you burn the coin, and you make that secret value S public, which means you can't ever generate the same thing again, because that value is deterministic. And then you just prove that there is a C that's the hash of S, and that C is in the tree, using the previous inclusion proof. And so how do we change that to do anonymous payments? Well, it's very similar, except now this coin, instead of just being a hash of S, is also a hash of the value of the coin. So it's two and four, which are owned by Alice, and then when we want to spend the coins that Alice has, we reveal those serial numbers, and we can compute the partial homomorphic encryption of the two and the four, and we create this transaction with two outputs, and we create the two new
coins, like we showed before in the previous slide. But we also want to prove that the value that goes into a transaction is the same as the value that goes out, and we do that using homomorphic encryption, like we showed earlier, and you see here, we've got the two plus the four is equal to the three plus the three. So there we go. Then we can do atomic swaps with different types of assets. So Alice constructs her input and one output sending to Bob, Bob takes the transaction, adds his input and one output sending to Alice, Bob also signs the transaction, Bob signs and sends the finalised transaction. We can also do something where you have a network with anonymous spam protection. So you have a secret sharing scheme, and normally, so basically with this, you have this evaluation, I'm going to go fast now, and when you want to send a message you compute the message M, you compute this X and Y, and if in one epoch you again create another message, so you're spamming the network, then you get these values from which, using the equation on the first line, anyone can compute what A0 is, and A0 is actually your secret key. And so that means that whenever you try to send another message to the network in any other epoch, now you've lost your account, you can never send, but it also means that messages are unlinkable, so you have unlinkability. We can do anonymous auctions using MPC, so Alice bids $4, Bob bids $6, they do a computation, they compute who's the winner. We can do anonymous WikiLeaks, so I have this thing.jpeg, and then there's a protocol where, you know, I've made some claim about what this file is, and it selects a random chunk from the file, and then we verify, yep, that file is what it claims it is, and then there's an auction on the remaining chunks, and the winners of those auctions decrypt the remaining parts, and then the file is decrypted. So if you go to the dark.fi website, and you go to the docs section, we have, where is it?
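The Merkle inclusion proof walked through earlier in the talk (hash V5 with V6 to get C, then with D to get CD, then with AB to reach R) underlies both the voting and the payment schemes. It can be sketched in the clear as follows. SHA-256 and the byte-string leaves are assumptions for illustration, and the sketch does not include the zero knowledge part, which would hide the leaf and the path inside a circuit.

```python
import hashlib

def h(left, right):
    """Hash two child nodes into their parent (SHA-256 is an assumption)."""
    return hashlib.sha256(left + right).digest()

def build_levels(leaves):
    """Return all tree levels, leaves first, root last (power-of-two leaf count)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def merkle_path(levels, index):
    """Collect the sibling at each level, with a flag saying if it sits on the left."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(leaf, path, root):
    """Recompute the root from a leaf and its path of siblings."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return node == root

leaves = [f"V{i}".encode() for i in range(1, 9)]   # V1 .. V8
levels = build_levels(leaves)
root = levels[-1][0]                               # R
path = merkle_path(levels, 4)                      # path for V5: V6, D, AB
```

For eight leaves the path holds three siblings, which is the logarithmic number of items mentioned in the talk; a 32-layer tree would need 32.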
So go to the website, and there's also a blog called Insights. We have our own peer-to-peer anonymous chat, there's no concept of identities, so if you go to the docs, there's a section called ircd, and we have a weekly meeting every Monday. But also there's a crypto section, a zkas section, a testnet guide. You know, we're looking for good devs as well. So, conclusion. We missed the mobile and the desktop; will we also miss the crypto renaissance? This is like our best chance to capture value for development. Like, this is the biggest problem with creative people: they create value, but they don't necessarily have a way to capture some of that value back. We now have techniques to do that. We were promised this future of computing in the 90s, you know, the interfaces, whatever happened to that? We never got it. And now our phones, they're filled with all these dog shit Electron apps. Like, that's a failed paradigm; we're literally trying to copy Silicon Valley. I'm optimistic that now people are actually going, actually, no, Linux is different, we're distinct, we have our own energy, but we need to rediscover that. We need to create something that's new, because their model is about capturing users under surveillance capitalism to extract value from them. Our model is we create infrastructure, we create economic value for our networks to become strong, and as a movement grow powerful. It's a different way of thinking. Open source was a mistake, you know, like, we discarded the beating heart of what gave us energy, so we need to reconceptualize the computing paradigm. So, you know, let's build something new, like, actually inventive. So if I have a couple of minutes, I'm actually just going to show our website, so I can show where to find the docs. Okay, I guess there's no, no, I'm not there, come on to it, go into it. All right, let's give a tour of the docs, so here, there's the book. I talked about the peer-to-peer distributed ircd, you see there, instructions.
There's also a crypto section. You see here. And also implementations. There's a ZK explainer and also implementations of most of the major ZK algos. And also, probably more interesting for you guys, the zkas stuff, like how anonymous voting works and also anonymous payments. All right, I'll just show the distributed chat. You just run a daemon like that. Open my WeeChat, bam, here we are. There are encrypted channels as well. You just set an encrypted channel in your config file and then we have a chat. See, I can chat with other people. So, yep, that's it. That's my talk. Thanks very much. Thank you. |
Building a Plant Monitoring App with InfluxDB, Python, and Flask with Edge to Cloud Replication
Plant monitoring with open source tools |
So, for this talk, we're going to be learning how to build a plant monitoring app with InfluxDB, Python, and Flask, with Edge to Cloud Replication being an optional add-on to this project. So first things first, my name is Zoe Steinkamp. I'm a Developer Advocate for InfluxData, which means I have a large empathy for developers myself. I was actually a front-end software engineer for over eight years before I decided that I wanted to actually be able to listen to people's issues and fix them instead of just hearing them come down from the product team. So if you guys have any questions, I will be allowing some time for Q&A at the end of this presentation, but if you want to reach out at any point or you just like to be friends with people on LinkedIn, this is my QR code. My name is relatively unique, so I'm easy to find. The overview. So in this presentation, we're going to be walking through a few different pieces for this project. The first thing we're going to be walking through is the IoT hardware setup. So if you guys are not super familiar with IoT devices and stuff, not to worry, I'll break it down and then we can kind of figure it out. Also, all of this is available on GitHub: all the code examples, lots of instructions. This is a very well fleshed out project. So at the end, I'm going to be linking that as well so you can do it yourself at home very easily. We're going to go over the tools that we're going to be using for this project. We're going to go over a short overview of InfluxDB, just so that people who don't understand how it works will understand how it works in this project. Then the data ingestion setup, Flux and SQL, and setting up edge data replication and data requests, which are kind of comboed together somewhat. And then finally at the end, the GitHub code base, links to other community resources and such, and then Q&A as well. So, setting up your IoT devices.
So this is a handy little diagram to show roughly how this is going to work in real life. But basically you have a plant and you're going to be monitoring it, and you're going to need some kind of microcontroller to receive this information. I'll show a haphazard photo in a second of how that's going to look. But basically from that plant, we're going to get data roughly about, I like to say, how the plant is feeling. If it's thirsty or hot or just doesn't like you in particular, it'll let us know. From there, we put that data into our open source, our OSS instance. So InfluxDB is available open source, so you can just easily download it off GitHub and get it running locally. So in that, we're going to go ahead and store our data. We're going to use Telegraf, that's what that little tiger is, we're going to use a Telegraf agent to get the data inside. From there, if we want, we can go ahead and use our edge data replication feature to go ahead and push it to cloud. And then the idea here is that you can also host this locally, like you can host a little website with graphs and such. I'll be showing this as we go and the code is available. But basically the idea here is that you store your data locally, you use edge data replication to push it up into the cloud for longer term storage or just to have less data loss. And then from there, you can pull that data back out to actually start graphing and visualizing it. As promised, haphazard photo. So for this project, you need, in no particular order, a plant, preferably alive, those are the best to monitor, a Particle Boron microcontroller or another compatible one. We have the schematics and the details for an Arduino, if that would be your preference. At least one IoT sensor for your plant and a breadboard with jump wires and terminal strips. As promised, this is what the schematics look like. So basically you can just kind of follow these schematics to a T and that helps you just get everything set up.
We especially had certain issues with some of our sensors, what's the word, interfering with other ones. From that, I have four sensors for my project, those are the four that I just happened to buy off Amazon, which we do list, so you can, depending on your country, it will change, but these sensors are like 25 cents a pop, so they're really cheap and easy to get. I have temperature and humidity, I have light, I have soil moisture, and I have temperature. So with all four of these, I can go ahead and hook them up to my breadboard and my microcontroller and I can start getting some of that data. So the tools we're going to be using today. So we are going to be using Flask, which for those of you guys who are not aware is a micro web framework written in Python. It's going to be doing some of the heavy lifting for the project, specifically it's going to be running the local application and allowing us to have some built-in routing. We're going to be using InfluxDB for actually storing the data that we get from our IoT sensors from our plant. It comes with an API and tool set that's going to be easy for ingesting and querying that data back out. It's highly performant, so we don't have to worry about that, and since it's open source, it doesn't cost us anything outside of the server we're running on locally, but in general we want our data to be stored efficiently. And then it also has obviously our community and ecosystem. People like me are there to help answer questions and come up with these awesome little projects like monitoring your plant at home. Telegraf is a completely open source ingestion agent. It has over 300 different plugins depending on what you need and desire. For this project we use the execd processor plugin to get the data into our open source instance. I'm also going to be showing code for, well, actually I'm going to explain that later. But basically this is super nice to use.
It has a very wide range of open source plugins supported by sometimes companies, sometimes community members. You'll find serious ones like Azure monitoring or AWS monitoring to the more fun ones like Minecraft or CS:GO. If for some reason you do not want to use Telegraf, maybe it just doesn't have a configuration that works for your device or your project, a lot of people are just going to go to the client libraries, which I'll be showing a code example on how to use as well. And this does live inside the project, so you don't have to worry about going and finding it. We just left it there just in case people want to use it. So obviously it's got a few different options here. We're going to be using the Python one because that's the one I've worked in and that's what the project is written in. Another thing that I used when I built up this project is the Flux extension for VS Code. It's really nice in that it allows me to write my Flux queries and it kind of tells me if I'm misspelling or writing things wrong. It's just like any other extension that you're going to get in VS Code. It highlights things and helps you realize when you're making mistakes. Finally, we're going to be using Plotly for graphing. It is a completely free and open source graphing library, which is always our favorite. And it's really nice and easy to work with and very colorful, which I appreciate. So a really quick overview. So for those of you guys who are not quite familiar with it, time series data is a very specific type of data. So it's what we're going to be getting from our plant, because IoT sensors tend to give you time series data, in that it is metrics at regular intervals of time. So what that means is that you want to know at what point the plant got thirsty or you want to know how many hours a day it got sunlight. That's all time series data. That's data that you want to know about on a time scale. We normally see these as metrics at regular time intervals.
Occasionally we see things like events. You can think of things also like the stock exchange or weather conditions as other great examples of this type of data. We tend to find these in multiple different applications. Software infrastructure is probably the most common and most people here would understand where that comes from. Obviously for this one we're going to be using IoT data. So one thing to note is if you had multiple plants at home, you might want to store that data, like you might want to know that you have six orchids and seven aloe veras. You store that kind of data in a relational database. You'd name them. You'd say this is the one that lives in the window on the north side of the house. This is the one that lives in the window on the south. And by the way, one of my coworkers totally did this. He has like a hundred plants in his house. So he organized it in his SQL DB, his relational one, because this was a lot of plant data. But then when he was actually monitoring all of these plants, which I really don't know how he set this up, his house is just full of cords. It's just cords everywhere. When he set this up to actually start monitoring all of these, that would be time series data. So that's going to be all of those timestamped metrics coming in. So this is kind of how the entire platform looks when it's all put together. So as you can see, you have your data sources, then you have Telegraf and the client libraries as well as things like native ecosystems, which we're not going to go into today. And those are the ways of getting the data in. And from there, you can use InfluxDB to set up things like triggers and alerts. I have it set up to send me a text, via Twilio, if my plant needs some water. I use it quite often at my job and then promptly ignore the text. It doesn't work out very well for the plant or me. But if I actually paid attention, this is very useful to use.
And finally, obviously with this kind of data, once the data is stored and once we have it being used, maybe down-sampling it, we can actually start seeing some results. Obviously, infrastructure insights isn't quite what this is, it's more like plant insights. So when it comes to data ingestion setup. So I'm not going to go super in depth on how to set up your microcontroller, because depending on the one you're using, it's going to be different. They're all going to be very varied. You're just going to have to follow the instructions on that one. If you happen to have an Arduino or a Boron microcontroller, you could probably follow, I mean, you can follow our directions anyways, but those are probably going to be pretty easy to set up because we talk about it. But this is just an example of how the data tends to come in. So as you can see, I've got my port set up and then I start to get these data results. So for example, if I remember correctly, this one is the humidity one, this one is the temperature. As you can see, this is like the first, I'm going to call it the first flush. So sometimes the data comes in as zeros at first and then it starts to actually give you values. One thing to note, and I'm not going to go over it in this presentation, but you can see it in the GitHub repository, in the code. We do tend to do a little bit of cleanup on these values. The data sensors are not exactly friendly in how they send you data, is how I'm going to put it. So we did have to do a little bit of our own cleanup in Python, which luckily we supply to you. So if you're using roughly the same ones, you can go ahead and just use what we have. But like, for example, our temperature kind of came in a little bit weird and we had to change it so it actually read in a more human readable way, and we haven't yet fixed the light one. So it just looks really strange. Interesting. I expected my video to show up. Well, oh wait, it is up there. Aha.
Well, let's see, can I get this to work? Not quite. Sorry, guys. Little difficulties. Well, go figure. This was working on my own machine, you know, five minutes ago, but that means nothing. I'm going to try and press, is there like a play button or something on here? All right, I'm just going to give up. So basically what this shows is how to set up your bucket and token, which I can actually probably just pull up. I'll do it at the end of this presentation. We're going to do this on the fly. I'll show it at the end, but basically it just shows you in the UI how you set up your bucket, which is just your database. You can pick how long you want the retention policy to be. That's how long you want to store the data. Maybe you only want to store it for a day. Maybe you want to store it for 30 days. You pick that at the beginning. And then it also gives you the option to do an explicit or implicit schema. And what that means is implicit just basically builds the schema off what you send us. So if you start streaming in data, we'll build it for you. Explicit is you tell us exactly how you want your data to be formatted, and we will reject any data that doesn't meet that schema. Obviously in a project like this, which I like to call pretty low risk, like it's not a big deal if the data is not quite perfect, just do the implicit, make life easy for yourself. But we offer explicit for more professional projects, I suppose you could say, where it really does matter that you reject that bad schema data. The other thing I showed is just how to make a quick token, because obviously you're going to need a token to actually get your data in and back out, you need that authentication. One thing to note, we do offer an all-access token. We kind of warn against it, it even has a big warning on the screen saying, please don't do this, because it allows you full access to all of your buckets, all your databases, and it allows you to delete them.
So if that token ever falls into the wrong hands, or maybe you make a mistake, or your coworker makes a mistake, you know, somebody else, that can obviously cause a lot of problems. We like to call it basically creating your own big red button, you don't need to do that. So we also give you the option to pick write and read tokens where you specify which buckets you want them to have access to. Again, I'll just show this a little bit later. And you can do it in the CLI as well, but normally when the video loads, the UI is a little bit more fun to visually see. So let's see, there we go. So for this code example, it's pretty straightforward as to how to actually set this up. As you can see, we have the influxdbclient.point. The influxdbclient is already set up in this example. Basically all you give it is your bucket and your token. You just basically say, this is where I want my data to go, and I have the authorization to actually do it. It's very straightforward and easy to set up. It takes like a second. But basically once you have all your authentication going, you can actually start sending those points up to your database. So with this one, we're calling the point sensor data. We're setting the user. It's not visible here, but it says Zoe, just my name. It's not very special. Then we have the tag, which is the device ID. And then finally the sensor name with the value. So that's going to be something like humidity, value 30. And basically this is running in a Python script that is pretty much running as long as we're getting data. But basically this is a straightforward way to get it in. And this is using the Python client library. This is part of the Telegraf config file. This file is computer generated, so you don't need to write 200 lines of code, but the actual config file is like 200 lines of code.
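As a rough illustration of the point described above (a "sensor data" measurement, a user tag, a device ID tag, and a field named after the sensor), here is a minimal sketch that builds the equivalent InfluxDB line protocol string using only the Python standard library. It is not the project's code: the function names, the tag key `device_id`, the device name `boron-01`, and the timestamp are all assumptions made for the example, and the real project uses the Python client library's point builder rather than hand-written strings.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each of the given special characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render a field value using line protocol's type conventions."""
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"        # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'         # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns=None) -> str:
    """Build one line: measurement,tag=value field=value [timestamp]."""
    name = _escape(measurement, ", ")
    tag_part = "".join(
        f",{_escape(key, ',= ')}={_escape(str(val), ',= ')}"
        for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(key, ',= ')}={_format_field(val)}"
        for key, val in fields.items()
    )
    line = f"{name}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


# Hypothetical reading: humidity of 30.0 from one device (names are made up).
print(to_line_protocol(
    "sensor_data",
    {"user": "zoe", "device_id": "boron-01"},
    {"humidity": 30.0},
    1675500000000000000,
))
# sensor_data,device_id=boron-01,user=zoe humidity=30.0 1675500000000000000
```

Tags are sorted by key here only to make the output deterministic; a client library would handle serialization and escaping for you.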
This is just a small snippet at the end of it that basically says that we're using the execd processor plugin. And from here, we're just telling it what measurements and what tag keys to accept. Again, inside of the GitHub project, we kind of go a little bit more in depth. But the big thing is that every Telegraf config file and its instructions are slightly different. So there's no real reason for me to show you the execd one when you could be using a different one for your own project. But basically, just follow the documentation for this. It's super simple and it's very well documented. Well, I guess I shouldn't say that since it's open source. So some of them are less well documented, but most of them are great. And this is a table example of the resulting data points. So as you can see, we have our sensor data with a field, and in this one we have light and soil moisture. We have our value. And as I told you before, the values kind of come in a little bit weird. I don't know how soil moisture can be 1,372 point, many zeros and fives, but it can be. And then finally, the actual timestamp value, which says that obviously this value was from last year in like, I can't even think, September, August, sometime in the early fall. So Flux and SQL. So I've said this word before and I haven't really explained it. But basically what Flux is, is it is the querying language of InfluxDB. So basically what it allows you to do is query for your time series data. It can do a lot of really awesome things. It can do things like the alerts, the management, but for right now we're just going to focus on the querying because that's the most straightforward thing and that's the main thing that you're going to end up doing. So in this version right here, basically what it's saying is from bucket, which again is just from database. Go ahead and give me smart city. Give me the range. This is a range of one day. It's got a start and a stop.
You do not have to give it a range. You could literally just do from bucket, give me everything. We normally suggest you try to use a range because obviously, I mean if your bucket only has like one day of data, it's probably not a big deal, but if it has the past three years of data, that's going to take a while to come in and that's going to probably crash a lot. And then you have your filters. So with this one, what they're saying in more human terms is give me all the bicycles that have come through with the neighborhood ID of three. And what they're doing down here at this aggregate window is they're saying give me the mean for every one hour. So because this is a one day range, this will return 24 data points. It will give you the mean amount of bikes that came through every hour in this neighborhood with the ID of three. And the one below it is doing the exact same, but it's doing it for the neighborhood ID of four. And then finally at the end, it's comparing them and it's getting a difference value. It's saying how many more bikes go through neighborhood three versus neighborhood four or vice versa. And so that's just one of the quick queries that you can do. The aggregate window is super great, especially for a project like this where, although your IoT sensors may send you data every single nanosecond, let's get real here. Your plant, you don't need to know exactly what was happening to it. It's better to just get an average of how thirsty it is or the average amount of light. You could bring it down even to five minutes. Like it does not need to be quite as in depth. And even for this one, they just wanted to know the mean amount of bikes that were coming through the city in these neighborhoods. This is how it actually looks in our project. So the reason that you're seeing all these empty brackets is this is a reusable query.
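To make the aggregate-window idea above concrete, here is a small sketch in plain Python of what "give me the mean for every one hour" amounts to: readings are grouped into fixed windows and each window is averaged. This is only an illustration under stated assumptions, not how Flux is implemented; `window_mean` and the sample readings are made up for the example. One difference worth noting: this sketch labels each window by its start time, whereas Flux's `aggregateWindow` stamps results with the window's stop time by default.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_mean(points, every):
    """Average (timestamp, value) pairs over fixed windows of length `every`.

    Returns a list of (window_start, mean) tuples ordered by time.
    Windows with no readings are simply absent from the result.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        index = (timestamp - EPOCH) // every   # which window this reading falls in
        buckets[index].append(value)
    return [
        (EPOCH + index * every, sum(values) / len(values))
        for index, values in sorted(buckets.items())
    ]


# Hypothetical soil moisture readings across two hours.
readings = [
    (datetime(2023, 2, 4, 10, 5, tzinfo=timezone.utc), 20.0),
    (datetime(2023, 2, 4, 10, 35, tzinfo=timezone.utc), 22.0),
    (datetime(2023, 2, 4, 11, 10, tzinfo=timezone.utc), 30.0),
]
for start, mean in window_mean(readings, timedelta(hours=1)):
    print(start.isoformat(), mean)
```

With a one-day range and one-hour windows, a sensor that reports continuously would yield the 24 points mentioned in the talk.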
So we can say from different types of plant buddy buckets, or we can say different device IDs or different fields. So again, the field is going to be things like the humidity, the temperature, the moisture. And device ID, for my project at least, is always the same because I only have one setup. But if I had multiple plants with multiple values, I would have the device ID basically being probably really the plant names, but I could say like Arduino one or Arduino two. But for this project, it's relatively small, so it's just easier. So, changes are here. So this doesn't really matter if you decide to do this project all in the open source. It won't matter really for you for a while. But one thing to note is if you do choose to do edge data replication, InfluxDB Cloud is now going to be allowing SQL. So you're going to be able to query your data back out using SQL instead of Flux. And we're also going to be supporting Flight SQL plugins, which will allow you to connect to things like Apache Superset and Grafana. I'm obviously going to be showing Plotly for this one, but these are going to be options for you in the future. So it's just something to keep in mind. So let's get into edge data replication. I'm going to leave this up for just one sec. So normally when I say edge data replication, people kind of think of varying things depending on their job or depending on where they've heard it said before. Some people think of a solar panel in the middle of nowhere in the woods. That's the edge device because it's, I don't know, at the edge of civilization basically. But an edge device can be something as simple as a cell phone. It can be an ATM sitting at a bank. It can be a factory that just happens to have intermittent Wi-Fi because today or this week it got an ice storm and the internet went out. So an edge device can really be almost, it's more broad than what we normally think of.
It can be almost any device where it's important that it always stays connected, but that doesn't mean that it will. Or in the case of some people, it's your work server that happens to be sitting in your office that goes out because the power went out at the office, and now somebody's getting the phone call at 2 a.m. to go to that office and fix the server. That's why cloud computing is great. So basically what edge data replication allows is it allows you to run your InfluxDB OSS instance, your edge, and basically it has a dispatch queue which holds that data. So as you can see here, you have your bucket, you have your queue. There are limits to how much data you can hold. You can check out the documentation to find out all the nitty-gritty. But basically from there, if you ever have internet blackouts, if you ever have power loss, you will have that data backed up. And then when it reconnects, it goes ahead and sends it to the cloud. Now obviously I would hope that nobody has plants that are so important that they necessarily need to back up their data. But I also like doing this because I monitor these plants at conferences, like they come with me when I'm doing basically what the people outside of this room are doing. Sometimes I have a plant at our booth where I monitor it, and although this conference has been really great for Wi-Fi, not all of them are so wonderful. And so it's actually not uncommon for me and my plant to lose Wi-Fi, and then I can use the edge data replication to still push that data up to the cloud once I reconnect. Or I close my laptop when I go to lunch and then it stops running, also not super great. But basically this is pretty easy to set up and get going on. So these are part of the setup instructions that are in this project's README. So as you can see, we're running our InfluxDB OSS edge on Docker, so it's a Docker hosted OSS.
And basically what the command in the second portion does is it just sets it up to be an edge device. It's just saying like, hey, do the config create, plant buddy edge, this is going to be where it's coming from, it's the open source version. And then the rest of these instructions are basically just for the USB ports and such. Like I said before, we have some pretty in-depth documentation on how to get this project going. And then these are the two big commands that you run. And they're pretty straightforward. Basically all you need to do is just have all of your information for your OSS, so that's going to be that bucket that we named before. You're going to need to create that remote connection. And then finally you need to do the replication command where you're saying replicate between the local bucket ID and the remote bucket ID. So as I said before, I'll show how you actually create the buckets, and for the cloud as well as the open source it's the exact same, you just basically create the bucket, you need to get the ID for it, and then you're basically just saying, this is my local bucket, this is my cloud bucket, please make sure the data goes up in that direction. So data requests and visualizations. So when we are querying data back out, this is using again the Python client library, because although Telegraf does have a few output plugins, they're not relevant for this specific project. You could check them out if you wanted to send your data somewhere different, a different website or such. But basically all we're doing here is we are using one of those Flux queries, the same one that I showed on an earlier slide where it's basically just saying, give me the data for the past roughly day for this bucket with this value.
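The reusable query described here (a bucket, a device ID, and a field filled in at request time) could be sketched as a small Python helper that assembles the Flux text. This is an assumption-laden illustration, not the project's query: the function name, the measurement name `sensor_data`, the tag key `device_id`, and the example bucket and device names are placeholders.

```python
def build_flux_query(bucket: str, device_id: str, field: str,
                     start: str = "-1d", every: str = "1h") -> str:
    """Assemble a Flux query for one sensor field on one device.

    Values are interpolated naively for readability; real code should
    prefer the client library's query parameters over string formatting.
    """
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f'  |> range(start: {start})',                    # roughly the past day by default
        '  |> filter(fn: (r) => r._measurement == "sensor_data")',
        f'  |> filter(fn: (r) => r.device_id == "{device_id}")',
        f'  |> filter(fn: (r) => r._field == "{field}")',
        f'  |> aggregateWindow(every: {every}, fn: mean, createEmpty: false)',
    ])


# Hypothetical bucket, device, and field names.
print(build_flux_query("plantbuddy", "boron-01", "humidity"))
```

The same helper can be reused for each graph by changing only the field, for example `soil_moisture` or `air_temperature`, which matches the drop-down selection mentioned in the talk.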
And from there, you have your params, you have your bucket, your sensor name and your device ID, which can be submitted, like I said before, it's like a drop-down that you can pick from, and basically once you do the query and you do the open.read, you're going to receive that data back. And you can receive this data back in different ways, but we're doing it in a data frame because that's the easiest for graphing in Plotly. This is currently in, what's the word, we're working on it. So we're currently working on getting this project to be integrated with SQL. That's going to be my task when I get home tomorrow, on Monday or Tuesday, whenever my flight lands. But basically from here, this is how it's going to be executed instead. You're basically just going to be using a SQL command and getting a very similar readback. With this one, we're just getting a, what's the word, like a straight read, we're not doing it into a data frame, but that is going to be something that we're going to set up and be an option. So if you do want to use this in the future, just wait like until the end of the week and we'll have that project up as a part of the plant buddy repo. And finally actually graphing the data. So it's pretty easy to actually graph the data inside of Plotly. So as you can see, we have a few different line graphs, which are set for soil moisture, air temperature. And as you can see, we're setting a few, like these are the values that we're setting here, like the graph default device ID, we're sending in that air temperature, and we're getting it back in a graph format. And this is going to be another case where we're going to see if we can get this to work because I really want this one to work, darn it. I actually wonder, we're going to try something a little bit weird, see if we can get this out of the presenter view. Oh no, escape. There we go. Okay, this is not really ideal, but we're just going to have to go with it, I think, maybe.
Man, it's really just not liking it, huh? I don't know why, what is this? Oh well, that's not helpful at all, darn. One second, I'm going to drag this onto my screen and just see if I can do it. I guess it just doesn't like the HDMI today. Check the other Wi-Fi, the dual stack one. I'm on the FOSDEM one. Yeah, there's a FOSDEM dual stack, which is IPv4. Try that one. This one? Yes. You think it's internet? Not every Google thing likes IPv6. I'll also refresh this really quick, see if that helps at all. Yeah, just really, it's so funny that, yeah, it was working before, but now it's just not liking me. All right. Oh, you've got to be kidding me. All right. I've got it working. I think I just actually need to change my share settings. All right. We're going to go ahead and change the way this is shared. Do you know how to change the settings by any chance? I thought it would just change it, but it didn't, like, just change it to just look at this. Just look at this screen. Oh, I don't think. No. Okay. Hmm. Fair enough. Yeah, it's just, like, it's not, all right. Here we go. Mirror display. It's all these new updates. I never know where anything is anymore. Okay, so it really is just the display thing, I think. I think it just doesn't want to work. There we go. Okay, cool. So, I'm so sorry, guys. I didn't realize it didn't like my share. Okay, so I'm going to go ahead and full screen this, and we'll just go back to the other video because why not? So this is how it actually looks in the end. So as you can see, it starts to actually make a little bit more sense. But basically, you can pick your fields. So this is, like, a graph where you can kind of change it as you desire. And you could also pick your bucket as well, which I might show in a second here on this video. There we go, yeah. So you could pick one of these many buckets. Most of these are not relevant to my project. They're just the buckets I have in my cloud account, or rather, my open source.
And so as you can see, these are the two, I'm going to go back to this part of the video. These are the two hard-coded graphs. So as I said before, the original values sometimes come in really weird. Like, I don't know why the heck humidity went, like, all the way up to 90 and then dropped all the way back down. We normally do a first flush of a lot of this data when it first hits because it just kind of comes in funny. Or maybe I breathed on it. Who knows? They're relatively sensitive. It really does happen. But also, we had to do a little bit of exponential smoothing as well. So like, we smoothed out the soil moisture because it used to look like the air temperature does. It used to just, like, kind of jump around like a crazy thing. The plant did not move between, like, the frigid air to back inside. It's just these sensors can be a little bit temperamental. We bought the cheapest ones off Amazon. We can only expect so much. If you spend a little more money, you're going to get a nicer setup. So let me get it full screen, please. And I can just not win today. All right. Nope. Now, you just want to play. Okay. So these are some of the new visualization options for Flight SQL. We're also going to be adding these into the project, so you can check it out. We already have pretty good integration with Grafana as well. So if you would prefer to use them for your visualizations instead of Plotly, you're more than welcome to. And then these are those further resources I mentioned before. So this is the try it yourself. So this is where the actual project lives. This is the QR code as well as the GitHub. If you look up Plant Buddy on the Internet, you'll find this. And then we have a few different versions depending on what you want to do, including the edge data replication version, which I've mentioned here. Oh, I almost forgot about the other video. Let me go back up to it really quick.
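The exponential smoothing mentioned above, used to calm down the jumpy soil moisture readings, can be sketched in a few lines of Python. This is a generic simple exponential smoothing, assumed here for illustration; the talk does not say which variant or smoothing factor the project uses, and the function name, `alpha` value, and sample readings are made up.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing.

    Each output is alpha * current reading + (1 - alpha) * previous output,
    so a smaller alpha gives a smoother (but slower-reacting) curve.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the range (0, 1]")
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(float(value))   # seed with the first reading
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical noisy readings from a cheap sensor.
print(exponential_smoothing([10, 20, 30], alpha=0.5))
# [10.0, 15.0, 22.5]
```

A single spike, like the humidity jump to 90 described in the talk, is damped rather than removed, which is why the project also discards the first flush of readings.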
I like the videos because it means I don't normally have to jump around super crazily and go in and out of the Cloud UI. Too bad it sometimes comes in as like, it's funny, it's set for the high quality, but it never really is. I'd go back to Slideshow if you would be so kind. There we go. So as I was saying before, the create bucket is pretty straightforward. You just name it. And then as you can see, the delete data is set for never or older than a certain amount of days or time. And then that advanced configuration is the schema that you can pick. And then finally, the API tokens, also pretty straightforward. You can do the read-write, which is what I do suggest. This all-access one is the big red button that I mentioned earlier. As you can see, it's got the warning not to do this. I do it because I don't care. I like to live life on the edge, haha. Horrible jokes. It's a great specialty of mine. But if you decide to do this the right way, this is how you would normally do it. You can pick your buckets for read and write. And you do need to have read and write if you want to use it in this context. Because if you just have read, it won't do you any good. If you don't have, I mean, I guess you could do one, but then your data is stuck inside and you can't do anything with it. So you need both. So that's that video. So I'm going to go back to the end of this. It's great. This thing never escapes. There we go. Awesome. So this is our community Slack. I'm also going to have a slide next that will have all of the, like it's the one to take a photo of. You don't need to take any photos of this one. But basically you can come join us in our Slack community. I'm there. My coworkers are there. We love to hang out and talk to people and take feedback as well as questions. It's pretty active. We get like 100 messages a day. So we're always busy in there. And then for getting started yourself, you can obviously head to the Influx community.
It has a lot of projects as well as the Influx code base. So you can go ahead and download that open source version. And if you want to get started, that's our website, this is also where you're going to find things like our documentation. And this is that slide that I promised that kind of has like everything. It makes it really easy. So for getting started on cloud, if you would like, the community is both the forums and Slack. Slack is our more active community. Our forums exist because we can only pay for so much Slack history storage. So we put all of our old questions in the forum, so they are a resource that you can kind of search through. And if you don't search through it, that's where I search when I answer questions. And then we also do have the Influx community as well. It's basically the one on GitHub where you can find projects that people have worked on, including ourselves. Our book, which basically just goes into things like why you want to use it, the documentation, which I've mentioned multiple times because it really goes in depth on how to get this project set up and going. That's where you see things. They have some of our new stuff as well as just, in general, we like to highlight some of the projects that people are working on. And finally, just our university where you can learn more. It's completely free and go at your own pace. So now that we've gotten through everything, if anybody has any questions... Yes? Yeah, so I'll go ahead... Oh, that's not what I wanted. No. It's just taking me back to that stupid drive video. There we go. So yeah, so this is that Influx community plant buddy project. So the master branch. And then we also have, so for example, down here we talk about the control boards. So we've got the Arduino or the Boron. And then we have an entire sensor list. So for example, if I click on this one, it harasses me for cookies. It goes into the temperature sensor.
So you can go ahead and learn about all the different sensors that we use for this project. You can also obviously look them up on the internet and buy them if you desire. And you can use many different types of sensors, but these just happen to be the four that we ended up using. And like I said before, in this project, we have, yes, the master branch, and then we also have things like EDR, which is edge data replication, Kafka, and then a few others. I normally end up in the master branch. It's kind of like the main version of the project. And yeah, and then in the future, the SQL one that I was telling you about, that's going to be EDR IOx. It's still currently being worked on as I speak, actually. So that one is not to be touched yet, until it's all done. Yes? Yeah. So the question was, how is InfluxDB different from OpenTSDB? Sorry, TSDB, there we go. So from what I understand, OpenTSDB is also an open source time series database, just like we are. I think the biggest difference is going to be how much functionality it comes with out of the box. I would obviously have to go to their actual code and check it out a little bit further. But normally the big thing that's our differentiator is the fact that we actually have our own visualizations. We have our own ability with Flux to do things like alerting, like that moisture alerting that I was talking about before. And then with the new SQL integration, that will also be very nice for people who want to query in a language most people are already familiar with when it comes to working with databases. But to be honest, a lot of time series DBs can be pretty comparable when it actually comes to the storage. So it's going to depend somewhat on your project and which one you want to, I suppose, work with. I normally do get told that we have pretty good documentation and a good community where we're very easy to work with and work through problems.
And that's not always the case with every open source community. If anybody else has any other questions. If not, that's totally fine too, because that all gives you guys time to run off to the next talks or maybe go grab some lunch from the food trucks. Thank you. |
Practical Computerized Home Automation |
My name is Bruce Momjian. I live in Philadelphia. Glad to be here. I've been attending FOSDEM since 2003. This is my first time in a huge room, so I guess I'm going up in the world. This presentation is actually part of a 20-year journey for me using home automation. Not a whole lot of 20-year home automation folks out there, but I'm one of them. What this talk is going to be is walking you through that journey and giving you a lot of practical experience of what I've learned. Hopefully, leaving here, you'll have a better idea of not only what home automation things you can do, but also what home automation things are worth doing. That's a distinction. What you can do and what you should do are not always the same. I will give you some examples of that. If you're interested in my slides, they are right here on this website, in addition to this QR code right here. I am a Postgres core team member, so that's where you actually will also find 58 other Postgres talks and a lot of other Postgres things, but this has nothing to do with Postgres. Feel relaxed. I expect this to be an interesting talk. I will take questions as I go about specific topics, and then I'll leave time at the end for general topics, if that's okay. I'll just walk through. I'll ask for specific questions on a specific thing, and then at the end, we'll have time. What are we going to talk about? First, we'll talk about what is computerized automation. I know that might sound trivial to you, but I'm going to explain what automation is, and then how computerized automation is slightly different. We're then going to evaluate some technologies. Because I've been using automation for so long, the technology I'm actually using goes back to the 1980s. Yes, amazing. They had automation back then. Why are we not all flying around in cars, right, and have robot helpers? I don't know, but again, this technology has been around for a long time. I will talk a little bit about some of the evolving technologies.
We even had some new technology in the past six months. It's a very hot field now; when I started doing it, not so much. We'll talk about a sample deployment, basically how I went at the problem. Again, I'm going to be using old technology, but it's going to be the same process that you're going to go through. You have to sometimes avoid being overwhelmed by having too many options, and I'll try and explain how to hone in on that. I'll also explain what happens if you overdo automation, because nobody really talks about that. Everything sounds wonderful, but that's not always true. I'm going to talk about some of the programming issues I have, and then toward the end, we're going to talk about what is success? What does home automation success look like? Again, just because you can do something and it works doesn't mean it's actually benefiting your life, and that's what I'm going to try and cover. Finally, I'll talk about a couple applications that I've been challenged with, and then obviously some solutions to that. Let's talk about automation. What is non-programmatic automation? This is the concept of you've automated something, but it doesn't necessarily have computer control. It's not something that you can programmatically adjust and basically control. Things like, I don't know how many of you are old enough to remember that years ago you used to have timers, that you'd plug into lamps, and you'd have a dial. Yeah, okay. Yeah, you still have them. That's good. Although it's the older people who are raising their hands, oh, there's just one younger person there. But they'd have a dial, and you'd turn it, and you'd set the time, and then every time you lost power, you had to go back and set the dial to the right time. That is not programmatic automation. How many people have heard of something called the Clapper? Oh my goodness. Yeah, if you want a howl, take a look at an old commercial video.
It basically was a lamp that you would clap your hands, and it would turn on, and then you'd clap your hands, and it would turn off, hence the name Clapper. Dawn/dusk sensors, you see those, of course, everywhere, where when it gets dark, the lights go on. When it gets light, the lights go off. It's not programmatic, but it works really well. And of course, motion sensors, and those have gotten much more sophisticated over time, but again, years ago, they had even the idea of motion sensors: somebody would get close to the garage or walk up to an entryway, and the light would go on, very convenient. Not programmatic, but very convenient. But then we're going to talk about programmatic automation, and this has really sort of exploded in the past, I don't know, five years, I'd say. For example, the concept of voice automation, voice control, is something that obviously didn't really exist effectively 20 years ago, and now it's sort of hugely popular. But the interesting thing about programmatic automation is this: you see, with the non-programmatic automation, the automation was kind of built into the device, right? You had the Clapper, or you had a motion sensor, or you had a dawn/dusk sensor, and you couldn't really change it. It was wired into how the device behaved, and you couldn't combine that with some other device, or combine it with some other inputs and make something different happen. Not that it wasn't useful, but programmatic allows you to combine behavior. It also allows you to do distance things. So again, one of the examples would be somebody comes into the garage, the garage doors open, and some other action happens somewhere else in the house, right? Not usually something that would happen in a non-programmatic way, but it's easy to do if you're on a network of some type, right? Activity detection, being able to write scripts and basically programs that control the automation, incredibly powerful.
And then finally, and I'll talk about this a little later, the idea of even bringing in external data and having the external data affect your home automation. For example, how cold is it outside, right? You may not have a thermometer or a home automation thermometer outside of your house, but you may be able to go to the weather service and find out how cold it is, and change your automation based on that. Again, this is where I think it gets really interesting, and you're going to see a lot of examples of that. Okay. Any questions so far? Okay. All right. Now, evaluating your technology. Now, I'm going way, way back in history here, but basically what is a network? What is the ability to communicate distances? Well, one of the early networks I didn't even list here is the telegraph. Now, I was a history major, so I studied the impact of the telegraph on society, and it was pretty dramatic, right? Something could happen in New York City, and within several minutes, people in Europe or people on the West Coast would know about it right away, right? I mean, that's pretty impressive. The telephone, of course, is a network. I guess you still do. You used to dial a number, right? It used to make sparks in the phone. That doesn't happen anymore, but you dial a number, and somewhere some distant person answers, right? Cordless telephones. I was around long enough to remember when those came out. All of a sudden, you didn't have to have the receiver connected with a wire anymore. It's a sort of network, right? Then we had ethernet, which obviously has been around since I remember the late 90s. Before that, it was like token ring and a bunch of weird stuff on coax. Well, you can still do ethernet on coax, but you get the idea. Wireless networks. I remember when those came out, so all of a sudden now, I didn't have to be tethered with a wire ethernet. I could do it wirelessly. This fourth one here, electrical, the fifth one, electrical network. 
I'm going to show you how that's actually a network. You might not have thought of that, but it is, and I'll show you why. And then finally, we're going to talk about some new wireless networks that are available, all right? So, looking historically, kind of a very 10,000 foot view of what's going on, the technology you're going to see me using from the 80s is something called X10. And I'll show you how it works. I'll show you some wiring diagrams. I was kind of impressed over here. I got a nice little, you know, like electrical little sticker here. I was like, oh, it's an electrical panel. Like very exciting. We're going to show you a little bit of that. There's something called UPB, Universal Powerline Bus, which I don't think is even around anymore, but it was an early way of doing the same thing. And then we have things like Z-Wave and ZigBee. They are somewhat confusing because they both begin with Z. But ZigBee is the one you normally see. It is an international standard. Z-Wave is a proprietary interface that you don't see around a whole lot anymore. There used to be something called Insteon, which I think is still around, but again, you pretty much don't see it anymore. I wanted to talk a little bit about a brand new standard that just came out in the fall of last year, so maybe four or six months ago, called Matter. How many people have heard of Matter? Wow. Okay. That's great. Okay. So Matter is basically not, I want to qualify what that is. It's not really a network. It's really a protocol for devices using different networks to communicate, right? So up until now, for example, if you had a ZigBee Philips Hue light bulb, let's say, you had to communicate with it using a Philips app or a Philips hub. And the idea of Matter is that it's going to allow for home automation using different protocols. So could be ZigBee, I believe it's for its X10.
It could be something like Thread, which is kind of a lousy name, frankly, because it's confusing. So many things with threads. But it's a new protocol for low-power devices, and you have a lot of those things now. So if you want to kind of get a lowdown on that, I've got a great article here. I also have some websites down here. So anything in blue is basically something you can click on. If you download the PDF, you just click on the blue and it pulls up the site for you. So again, a lot of opportunities, you know, it's sort of thinking back. You know, when X10 came out in the 80s, there was a sense that, oh, home automation is now here. We're going to be able to do all this stuff. And, you know, 35 years went by and really nothing happened. I was like the only person doing it. And eventually now you hear a lot more about it and there's a lot more excitement about it, partially because the technology has advanced, partially because there's more options, partially because more people are doing things electronically or they're used to that interface and people are getting used to it. But there's still some limitations and I'll kind of talk about that. You know, not only do I do home automation, but I think I've inspired my family. My son does home automation. He's a big HomeKit person, HomeKit being the Apple home automation thingy. I have a brother who does it. Most of his stuff is Zigbee. And I have a nephew who does it who also uses a lot of Zigbee stuff. So I have some interesting stories about that. Some good, some bad or problems that they had. But I'll talk about that when I kind of get a little farther ahead. Okay. Choosing a network. So let's suppose you're going to use some network technology or you're going to use some kind of home automation. One of the first things you have to figure out is what do I want to measure? And what do I want to control?
Because effectively, all of the home automation is really like an input output kind of a case, right? When this happens, this other thing's supposed to happen. Or when I see this activity, I want some other activity to happen. Okay. So you have to identify, if you're going to choose a particular technology: is it a multi-protocol technology? Is it something like Matter that can operate with multiple things? Or is it more like a Philips hub that only really controls Philips devices, right? So you kind of see the difference. There's hubs that do only one protocol. There's some hubs that do multiple protocols. And if you pick a technology and that technology doesn't have a sensor you need, or doesn't have a controller you need, what are you going to do? Like for example, I needed 220 volt control of my pool pump. Now, some of these protocols just don't have 220 volt options. Now, I'm sorry, I'm living in the United States. So remember, everything's 110 there; you have 220 for everything here. So everything's fantastic in Europe, right? But we got both. So in my case, that was somewhat of a challenge, right? What about a thermostat? If you need to control a thermostat or HVAC system, does that have an interface? What if you want to open the garage, right? I'm going to show you later how I figured out whether the garage opens or shuts. But what about opening it yourself? Like what's the control? Does your protocol support controlling garage doors, right? Chimes, that's one that's actually pretty limited. A lot of the interfaces don't have chimes. You may not think that's important. I use it a lot. Wireless remotes, like the ability to have a wireless control for your device. What about a phone? Does it have a phone app or a phone interface, right?
And of course, most of them have electrical plug controls. That's pretty clear. And then there's different open source products. I'm actually using this thing called heyu. MisterHouse is very popular for home automation. There's a whole lot of open source home automation stuff. It's pretty cool. But my point is that you can't just jump in and say, I like the name Zigbee, or I like the name Matter, or I like the Hue light bulbs from Philips. You got to make sure that you're going to be able to bridge your sensors and your activity together under a unified system. Does that make sense to people? Is that, that kind of makes sense? I guess I've noticed it because I would say, oh, I really want to do X. Oh, I don't have a way of doing that, or I really want to know X, or I don't have a way of finding that out. And that was kind of cool. Now, signal reliability, you see that a lot with different protocols. Zigbee, I know, has a limitation of distance, because it's a wireless protocol. So again, it might use Wi-Fi. Do you have Wi-Fi in all parts of your home, right? Are they all reachable? Is it reliable? If you're using something like X10, do you have a way of doing that? I'll show you how that actually works. What about this one, which seems kind of unusual, simplicity of device replacement. Why did I bring that one up? I knew a friend who, when he bought a new house, put automated outlets in; all of his outlets were automated. So instead of the normal plug, you know, like the normal outlet, every outlet was home automation enabled. And you're thinking, wow, this guy's a genius. This is great. He's going to be able to control everything. This is probably 10 years ago. And he said, you know, it was kind of neat, but the reliability of the devices wasn't very good. And it feels like every month, let's say one Saturday a month, I am replacing an outlet in my house.
And I'm like, that does not sound like fun because again, outlets, normal outlets will last for 30 or 40 years, right? Is a home automation outlet going to last 30 or 40 years? And even if it does, are you still going to be using that protocol 30 or 40 years from now? Right? I mean, think about how much it's changed even in the past five years. It's hard to imagine, right, what opportunities or what interface you're going to need 10 years from now. Do you really want to replace all the outlets in your house, either as they break or all at once? My brother had the ZigBee thing and the electricity like bounced on his house and all of a sudden all his devices went offline. And we're at a dinner. I think it was me and my cousins and my brother were having a dinner and he's talking about this thing. And it had just happened. And he hadn't been home yet because he went to work and it happened while he was at work. So his wife calls: none of the lights work. I can't turn on anything. He says, well, unplug this. Unplug that, plug that back in. Doesn't fix it. And he's asking me, could a power spike, when they take the electricity off and on, could it have wiped out like a third of his devices? Because a third of his devices were just, like, gone. And I said, it's possible, but it's more likely it's like a hub or something. And it turned out, he gave me the report later. He said, I went home and I played with it for like four or five hours and I got it working again. So he had to be home to do it. His wife was not able to do it because she didn't understand the network and how it was set up. But even then I think what he ended up having to do was to hard reset one of the hubs and that brought all the devices back on. Power cycling didn't work. It seems like some part of the CMOS got, whatever, the firmware got weird. But again, is that something you want to deal with in your house?
And is that what you're, is that something your family wants to deal with in your house, right? So I'm just, I'm just saying, I'm not saying it's right or wrong, but it's, it's a reality is what I'm saying. Cost can be an issue. I know the Insteon stuff was fairly expensive. I have heard complaints about the cost of some of the Phillips products, although the quality is very high. People have talked about that. So again, a lot of things go into that choice, longevity, ability to upgrade, that kind of thing. Very, very interesting. I don't have any answers, but very interesting. Any questions so far? Okay. All right. Let me just dive in because this is a software conference. I want to give you, you know, some geeky stuff here. And this, this kind of freaked me out that this actually works. But it works every day without fail to the point where my family has no idea it doesn't, it could never not work. So about 20 years ago, I had, I just moved into a big house. And, you know, we're kind of living in the house for six months. Electrician was there. He was working on something. I remember was there. I remember why I was there. He was a really nice guy. And I said, you know, I said, can you, can you look at something? My wife would, there's a, there's a light at the bottom of the stairs. And my wife would like a switch at the top of the stairs. You can turn the light on, right? You're going down the stairs at night. You want to be able to turn the light on, but as you're walking down the stairs, make perfect sense. So I asked the guy, I said, can we, can you put a switch here? He's like, he's like, well, he said, I can, but the problem is the lights downstairs, the switch is on the wall downstairs. So I'd have to like open the wall and then the ceiling to get the wire up to the wall on the second floor, right? To put the switch in. I'm like, well, I don't, I don't think we want it that much. I don't think it's that important. 
He said, have you thought about X10? I said, I've heard of it. I said, does it really work? And he said, actually it does. He said, I put it in my house and it actually works really well. But he said also, I've talked to my customers about it, particularly when I'm doing new construction, and I can basically install it while I'm working on their stuff. And he said, every person I've talked to, I give them the quote of the price, it's like $600 to add some home automation, maybe the lights outside or whatever. But every customer I've talked to has said no. So he said, I use it. But everyone I try and recommend it to, they're like, oh, I don't want the headache. I don't want to pay the $600. I don't see the value. This is the kind of feedback that he as an electrician got. All right. So let me show you how it works. And we can kind of like walk through it and sort of see. Okay, and I apologize if there's any US specific electrical things in here. I'm going to do my best. Okay, so this is basically the sine wave for electricity. I believe this is a US cycle here, because I think it's 60 Hertz. And in Europe, it would be 50 Hertz, right? So you just pretend it's not that. So basically the electricity goes up and down like this. This is alternating current. Okay, pretty straightforward. This is positive volts. This is negative volts. Okay. And then the target is 115 roughly in the US and it just goes back and forth. Okay. What they figured out in the 80s is that when the electricity crosses the zero voltage point, they could inject a small signal like a blip at a certain frequency and have it travel along the entire wiring of your house. Okay, so I'm not sure how many of you are aware that for example, you can run ethernet over coax now, it's called MoCA. And it's basically if you have coax in your house, you can run ethernet.
So I have a TV in the family room, I don't have ethernet out there. But I have coax. So in the basement, I have a MoCA adapter, ethernet plugs in, out comes the coax, the coax goes through the whole house, gets picked up in the family room, I have another MoCA adapter, in comes coax, out comes ethernet, goes right into the TV. Okay, so the same way that you can run ethernet over coax, you can run home automation over an electrical system. Okay, crazy, but true. So here we have a signal blip right here. Okay, here's a zero. So this would be a one. Yeah, we're getting back to binary here, very exciting. People are salivating. So here's a zero, here's a one, right, here's a zero, here's a zero, here's a one, here's a zero. Okay. And now we're starting to see where we're going here, you can see this is 110, 011, 01001. Okay, so what they effectively did, these people in the 80s weren't idiots, they basically had a binary code for different house codes, it goes from A to P. So this is the binary code for A. I know it's not ASCII, it's just whatever it is. They did have ASCII back then, but you know, stick with me here. This is A, this is B, this is C, this is D. And these are the patterns for the different house codes. So actually here's a sine wave on, what device is this, anyone? Oh, right, people know what that's called, that's good, an oscilloscope, yes. I'm not that old, I guess, people remember, yes, this is an oscilloscope right here. And you can see that's the way it used to look, and Tektronix was the big manufacturer of these, as I remember. And here is an oscilloscope showing those little blips in the crossing of the zero voltage. Okay, so that's literally it, and I believe I have a website that lists where I got this from, but that's kind of cool that you can just kind of link like that. Okay, so that's two ones right there.
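The zero-crossing signaling described above can be sketched in a few lines of Python. This is only an illustration, assuming a receiver has already worked out whether a blip was present at each zero crossing; the function names and the sample input are hypothetical and not taken from the speaker's setup or from any X10 library.

```python
def blips_to_bits(blips):
    """Map each zero crossing to a bit: blip present -> 1, no blip -> 0."""
    return "".join("1" if present else "0" for present in blips)


def zero_crossings_per_second(mains_hz):
    """An AC sine wave crosses zero volts twice per cycle."""
    return 2 * mains_hz


# Hypothetical observation at four consecutive zero crossings:
# blip, blip, no blip, blip.
sample = [True, True, False, True]
print(blips_to_bits(sample))

# Number of signaling opportunities per second on US and European mains.
print(zero_crossings_per_second(60))
print(zero_crossings_per_second(50))
```

Because each zero crossing carries at most one bit, the mains frequency puts a hard ceiling on how fast this kind of signaling can go.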
Okay, so the X10 standard basically, I actually forget, it wasn't the 1980s: the standard was made in 1975, in Scotland. It didn't really become popular, I think, till the 80s. It uses a 120 kHz carrier. And again, that's on 60 Hz cycles in the US. I believe they also have them in Europe for European things. So one bit is transmitted per zero crossing, 120 bits a second, there you go. You thought modems were bad, this one's really bad. 256 maximum devices on your network, 16 house codes, each house code has 16 devices. And again, there's the standard right there. Okay, so this is what it would look like, the protocol for that. This would be the start code. This is the house code. And this is the device number. And then you would say on or off or dim or whatever you want that device to do. This is how you control it. This is the full protocol. So here are our house codes. Here are device numbers up to 16. And here are commands. So it looks like there's about 16 commands. There's, let's see, brightness, dimness, on or off. These are the big ones right here. These four. These four are the ones you see the most. Crazy but really cool that they got this to work. Obviously one of the problems is you need like three quarters of a second to send a signal. And those of you who are really smart will realize that's kind of slow. It's because you always send the signal twice, because electrical systems can be noisy. And therefore, if the same signal appears twice, then you know that that's the valid signal. Okay, so it's sent twice, poor propagation. The United States uses split phase systems. I'll talk about that in a minute. I don't think Europe has that problem, but there's an issue there. Line noise can be a problem. And other buildings can also probably be a problem. So this is what a house typically looks like. You've got 220 coming in, but basically it's positive 110 and negative 110. And then you have two phases.
You have a positive phase and then a negative phase over here. And that's typically the way that houses in the United States at least are built. One of the problems, of course, is that because you have split phase, it's hard to get signal from one side of the phase to the other. So they have something called a signal coupler, which basically transmits that blip between the two. Okay. Phase couplers just do that. You can filter stuff, so you can filter out line noise if you want. So let's look at the actual devices. This is a switch right here. This is a normal switch. This is an X10 switch. Notice there's no up or down because, you know, it's just a button because you may have turned it on remotely, right? So we don't have a status to it. There's no physical status. You push it, it's on; you push it, it's off. Okay. You might be thinking, oh, gee, why would I want a button like that when I have home automation? And the funniest story I can tell you is that if home automation had started first and we never had physical buttons and physical buttons showed up, people would think they were genius because physical buttons are really convenient. Okay. Let me tell you how convenient they are. My nephew lives in New York City, rehabbed the whole house, decided he was going to go full with home automation, put in physical light switches, but he decided I'm not going to use them. I'm just going to use the Philips or the Zigbee bulbs. So if you're going to use the Zigbee bulbs and you control them remotely, you effectively have to always give them power, right? So the switch, you got to leave it on all the time. If you turn it off, the bulb can't turn on anymore and flipping a switch doesn't work. It doesn't do anything.
So what he decided he was going to do is every time he was going to control it with voice controls, so when he walks in the room, he says Echo, lights on, and he leaves, he says Echo, lights off, and Echo, lights on, and Echo, lights off. And I was there in the fall and I said, how's it going? He said, you know, we're getting tired of talking all the time. And he said, you know, I thought it was really cool, but after a while it gets kind of annoying. So I think he's going to have to rethink that whole home automation lack of switch control thing. So that's the physical switch. What I'm saying is don't dismiss the light switches as completely useless. There actually is some use to that. Here's a different type of switch. It's a flat switch, you know, but more modern basically. Three-way switches, you can do those with X10 as well. So that's actually a three-way switch right there. No, actually, these are all three-way switches, these three. You can do this, which is kind of cool. It's basically, there's nothing behind this. This is just stuck on the wall, right? So it's basically a wireless remote. You just hit the button, you stick it. This is how I fixed the problem coming down the stairs. I just stuck one of these on the wall and then, you know, every two years I replace the little battery that's in it, like this big. So I go around the whole house, just replace all the batteries, every two years, it works really well. This is the adapter that basically receives the signal from the wireless, if you have that, okay? This is another wireless device that we have, little buttons, whatever you want, you control it. This is a four-device model. There's also a 16-device model right here. So this one right here, and this is a lamp. And you can see, here's a good setup. Here's the lamp. Here's the X10 module plugged into the outlet, okay? And then, basically, this goes up and goes to this, right?
So this is an X10 controller, and it does up to 16 devices, right? So again, you need one, two devices, four devices, 16 devices. You can pick a different remote that does that for you, okay? So, you know, I think this is kind of a good example. You'll notice, I did not get outlets that know X10. I just got plug-in things. Why did I do that? I don't want to spend one Saturday a month, like, pulling out my outlets and fixing them. They've been very reliable, but at the same time, I may not stay with this technology. I don't necessarily want to rip everything out. I don't mind seeing a little white box there. It's easier for me. And if something doesn't work, I can just plug another one in and debug it, like, real quick, right? Because it may not be the bug. It may be the box. It just makes it real easy. Okay. Okay. Pool pump, I talked about this. Basically, I have a pool, and I want to turn the pump on. Normally, this is done with the big timer dial. Remember, we talked about the timers with the little dial, and, you know, it goes around every 24 hours. The problem with that is that I want the pump to run more when it's hot and less when it's not hot, because the hotter it is, the longer you have to run the pump. So, by controlling it, and this is right here, this little X10 switch right here, right, in this little box, you can basically figure out, how warm is it outside? You ask the weather service, how warm is it outside? And then, depending on how warm it is, that's how long the pump runs. And my pool people will come, and they'll say, you should sell this. And I'm like, well, I could, but then I have to have a controller, and I have to have some way of programming it to get the weather. And what I found is it's very hard to productize a lot of the home automation stuff in general. Because everyone's need is different, and to productize it is kind of difficult.
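The pool pump idea above, running longer when it is hotter, boils down to a mapping from outside temperature to run time. The sketch below is a hedged illustration only: the thresholds, the hours, and the use of Fahrenheit are assumptions made up for the example, not the speaker's actual values, and the weather lookup is left out by passing the temperature in as a parameter.

```python
def pump_hours(outside_temp_f):
    """Return how many hours to run the pool pump today.

    All thresholds and durations are illustrative assumptions.
    """
    if outside_temp_f >= 90:
        return 10
    if outside_temp_f >= 75:
        return 8
    if outside_temp_f >= 60:
        return 5
    return 3


# In a real setup the temperature would come from a weather service;
# here it is a hard-coded, hypothetical reading.
todays_temp_f = 82
print(f"Run the pump for {pump_hours(todays_temp_f)} hours")
```

Keeping the rule in a small function like this is what a fixed timer dial cannot do: the schedule changes with an external input instead of repeating every 24 hours.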
This is a command-line open source program called heyu, which actually controls X10. I've used this, I guess, for as long as I've been using X10. There basically is an interface between the electric plug right there and a serial cable, yes, serial, that comes out here and goes over to here, where my server is. Okay, so this goes out, and then I think currently it goes into a little USB adapter, which converts serial to USB, and then the USB goes in, and Linux has no trouble talking serial over that. So you can see the sort of chain here, right? You've gone from Linux to USB to serial, and then to the power line to X10. Kind of interesting. Okay, I can monitor what's happening, so here is a report of all of the activity that's happened at the house. I can basically say, okay, this is happening, this is happening, this is happening, and I can look and get a log of that. This is a bash shell script that I wrote, which effectively monitors what happens in X10 and then takes different actions depending on what happens. So if somebody does one thing, I say go do this other thing. Is it too late? Then don't do it, because if it's too late in the day, people might be asleep. Are we ready to eat? There's all these things that happen. For example, when somebody hits a button in the kitchen, it sends a message to all of the terminals in the house, all the laptops in the house, saying it's time to eat. There's a slideshow in the kitchen, and a big banner comes up, time to eat, for 20 seconds. And there's also an X10 chime, so there's little X10 modules that make a little ding-dong sound, like a doorbell, and I put them in different parts of the house. So when they press it, there's this little ding-dong you hear in different parts of the house, and people know it's time to eat.
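The event-to-action logic described here can be sketched roughly as follows. The speaker's real version is a bash script driven by heyu's monitor output, which is not shown in the talk, so this Python version is only an illustration: the event name `kitchen_button`, the action strings, and the quiet-hours window are all hypothetical.

```python
from typing import List

# Assumed quiet hours: between 22:00 and 07:00 people might be asleep.
QUIET_START = 22
QUIET_END = 7


def is_quiet(hour: int) -> bool:
    """Return True if the hour (0-23) falls inside the assumed quiet window."""
    return hour >= QUIET_START or hour < QUIET_END


def actions_for(event: str, hour: int) -> List[str]:
    """Map one incoming event to the list of outputs it should trigger."""
    if event == "kitchen_button":          # hypothetical event name
        if is_quiet(hour):
            return []                      # too late in the day: do nothing
        return [
            "broadcast: Time to eat",      # message to every terminal and laptop
            "banner: Time to eat (20s)",   # banner on the kitchen slideshow
            "chime: all",                  # ring the X10 chime modules
        ]
    return []                              # unknown events are ignored


if __name__ == "__main__":
    print(actions_for("kitchen_button", 18))
    print(actions_for("kitchen_button", 23))
```

Keeping the decision separate from the code that actually sends commands makes it easy to check the rules without touching any hardware.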
I tried it first just making one really loud sound. That didn't work well, because when you were in the room that made the loud sound, it was very annoying, and when you were far away, you didn't hear it. So I had to kind of spread it out. So again, you're looking at some input, and you're looking at some output, and that's kind of the way you have to think of it. So what inputs do I use? I use the time of day, I have dawn/dusk sensors to know how late it is outside, I use wireless remotes, and I use caller ID, I'll talk about that in a minute. I even have telephone dialing. I don't know if you're familiar with modems, but you can still buy modems, and you can make them dial the telephone. So instead of dialing it yourself, you can tell the computer, take this address book entry and dial it, and I'll just pick up the phone and talk to them. So it kind of makes it convenient for people. Even things like outside temperature you can get from websites. And then outputs: we can do lights, we can turn on motors like the pool pump, we can do appliances like the coffee maker, I'll tell you about that when we get there. I can make sounds, I can broadcast messages to the laptops, and I can even have a slideshow on a big screen say, time to eat, or somebody's arrived in the garage. This is maybe overwhelming, but it's basically a layout of the first floor of my house. Here's the server right here. Remember I showed you that little adapter, the serial adapter with the X10? That's actually right here, that's number four. The yellows are lights that are controlled. Here is the sunset sensor. And this is where you have a remote, so you can tell people to do stuff. One of the interesting things, you'll notice that the lights at the front of the house, you see all these lights right here, they turn on right before dusk, right, when it's getting dark.
But what I realized is that why would I turn the lights on always at the same intensity every day? What I'll do is I'll randomly have different intensity lights every time I turn them on. So if you look at my house one day, maybe the garage will be brighter, and the front door will be dimmer, and then the next day, the porch will be brighter, and the front door will be brighter, and the garage will be dimmer, right. So it just adds a little bit of variety to how people view your house, and kind of, it's like a neat little thing. I think it's kind of cool. Let's see here. Yeah, so this is, basically this is walking you through how, you know, how the system works. I think it's kind of interesting. Any questions? Yes, sir. So the question is, does the X10 link to your neighbors as well? So technically, X10 will broadcast up until your transformer. If you're familiar with the transformer, it turns out that my run goes right from my house to the transformer, so I don't have other people between me and the transformer, so I don't think anyone else sees it. However, if you're sharing a transformer with other people on the same wire, they will absolutely see it, and that's why they have the house codes. So if you find that somebody's using your house code, you may switch to a different house code, and therefore you will not see interference from them. Great question. Okay, yes. Can you have a filter in your mains? Absolutely, they sell filters that will filter your mains, and therefore it doesn't leak out to the rest of the house. Yeah, great question. I'm going to keep going here. So I promised I was going to talk about what success looks like. Basically, adding home automation changes your family's home environment. Good or bad, right? I recommend you start slow, because my brother, he had seen me do it, and he kind of went all in. But the reaction I got from, I just talked to my wife last night, and she was at a funeral. 
My uncle actually had a funeral on Friday that I unfortunately missed, but she was talking to my sister-in-law, and my sister-in-law was complaining about the fact that she does not know how to control the lights in the house. There are different voice commands, and she can't remember what my brother called each device. I said maybe we put labels on things so you can remember what we called them for the voice commands, but it's not a good situation to be in when your family is feeling like they don't have control, because it's their house too, and they certainly should feel comfortable controlling their environment. If you come in too strong, you can bypass them. So in fact, my trick was that I put in home automation just for the outside lights, and my wife, a month later, she said, can you do this one lamp for me? I said, sure, and I waited four weeks. I was really excited to do it, but I told myself I was going to wait a couple of weeks, and then I did it, because I didn't want her to feel like I was trying to move too fast. It's been very incremental, and that has allowed people to look at it as a positive, versus, I think, sometimes people can look at it as a negative. Accept that some automation tasks are impossible, and again, you've succeeded when a family member asks for a home automation addition. That is your gold standard. Keep an eye on that one, because if you don't get there, I would say you've kind of failed. And this is actually a great article: smart home gadgets, still a hard sell. Wall Street Journal article right there. So, challenges. Change is hard. Reliability. Sometimes it's hard for family members to understand how it's working. Device longevity, I talked about that. Maintenance, I talked about what happens if something breaks. The cost, I talked about that, and then, of course, security and privacy are also important, and again, we have a nice talk about the Nest thermostat bug. It leaves users in the cold.
Not a good experience there. So be aware. It's scary sometimes. So let's just run through this, and then I'm going to take some questions. This is mostly just a Q&A part. I mean, show and tell. So here are the modems I use, again with serial cables going to USB. I have two phone lines, and it's just modems. Not only can I dial the number, but I can get caller ID, and show everyone in the house who's calling on the phone. You don't have to look at the phone, you see it on your computer. Really kind of nice. I can log all the telephone calls coming in. I can dial the phone, as I said before. Again, I use caller ID to broadcast messages and to make it chime if it's somebody important, like a family member. And I end up taking the caller ID and looking it up in my telephone directory, and then I know who it is, and I can actually give a text message. Instead of just a number, I can customize what it looks like. I can make outgoing calls, as I said before. This is the second floor, or rather the first floor. Again, talking about how all the pieces work together. This coffee maker is kind of fun. Second floor, again, some lights, some chimes, and some wireless remotes. So you can see they're kind of spread out in different parts of the house. And here's the pool pump out here. This is what the heyu config file looks like. You basically define your codes, and then the names that go with each of those codes. Here's what X10 commands look like. Here's the standard timer. Here's a crontab. Of course, we're running on Linux, so we can schedule a whole bunch of stuff with crontab, right? It works very seamlessly, very easy to control, for me, anyway. Dawn/dusk sensor, I talked about that. Here's a remote that you can use to control things. This is the coffee maker, kind of hilarious. The way we do it is my aunt lives with us, and she usually gets up early.
So my wife has a remote next to the bed, and when she wakes up, she presses the remote, and then it makes a chime in my aunt's area, and that means my aunt should... well, it turns on the coffee maker for her, okay? So I saw somebody look at their spouse. So yeah, you have to have an aunt to set it up for you, or some kind of automated coffee system if you don't have an aunt. But the cool thing is you hit the button when you're about to get out of bed, and by the time you get downstairs, your coffee's ready, okay? Here's the coffee maker timer. It only turns on for like 15 minutes, so it doesn't stay on for very long. And she also has a little button on her phone, so if she's driving home, as she's heading home, she can hit the button, and then everyone, yeah, yeah, okay. I'll just leave it there. So this is how it actually works on the phone. It actually opens an SSH session, calls an X10 command, and then logs out, basically, okay? Pool pump, job scheduler, weather from the website, wireless remote. This is kind of an interesting case, garage entry detection. I'm not going to get into this, but effectively what I ended up doing was to put a switch up at the top, and when the door comes up, it just pushes a little lever. I have an X10 controller with wires, and it sends a signal to say somebody closed the contact, and I know somebody came in the garage, okay? And again, that's what it looks like right there, right? There's the lever, and there's the X10, so the wires go up here, and then it plugs in, and then it goes right to the computer, and the computer does everything, okay? Okay, all right. Let's take some questions. Yes, sir. Right, so how would you implement this with modern technology that didn't have as many moving parts? So my son, I would say, has done that.
He's doing ZigBee, so everything's controlled with his phone, and he's using HomeKit. So basically, instead of using switches, he's using light bulbs that are ZigBee enabled, but he's still using some X10 stuff because he wants a physical switch on the wall. So I think the question is, are you happy with doing everything on your phone? That's the question, right? If I walk into a room, do I want to pull my phone out to turn the lights on? I think he's okay with that, and there's a whole bunch of cool stuff you can do, like proximity: when his car is getting near the house, cool stuff can happen. So there's a lot of potential there, but because I'm trying to bridge the old and new technology together, that's why it looks so kind of hokey. One more question, let's see if I can get it. Right here. Right. So you've basically gone largely for X10, which is wired, and you are more of a power user slash thinker. But when dealing with the kind of average user who's got no idea what exactly a smart home does, or what their smart home assistant sends to big tech, how would you talk to such people about home automation? Because we've got the whole, oh, we want your lightbulb connected to the Wi-Fi, oh, I've got this cat flap that is connected to the Wi-Fi, and everything. And then they end up in a situation where they can't switch the lights on at home because the power went off. So, how would I start if I had nothing? I would just say get a lightbulb that has phone control, and just kind of get started with it. But you're going to be on a 10-year journey, potentially, because mine didn't start this way. It started just with the out-front lights, and it grew over time as I found need. So again, my goal was not really to talk about technology, but to figure out how to think about that technology in relation to you, your family, and your home. So thank you. |
The Open Source Business Guidebook
Building a Scalable OSS Based Business |
Hello, everyone. Hello, hello, hello. It looks like it is time for me to begin. So if you are here for this topic, you're in the right place. That's awesome. Who am I? I am Matt Yonkovit. I am the head of open source strategy at Scarf, or the HOSS for short. You may have heard me or seen talks from me before. There are multiple ways to connect with me: podcasts, different things that I do. If you don't know what Scarf does, what we do is we help projects and maintainers grow their open source adoption, help track it, help analyze it, and help you understand who's downloading and using your open source software. We also do a Hacking Open Source Business podcast, and that is up to 14 episodes, I think, recorded now. So if you're interested in understanding from others what is working in the open source business space (okay, it seems like I need to hold it a little closer), then that would be a good place to check out. We're also launching a new website that has all of the information from these slides, as well as a whole bunch of other details on how to measure your open source business. We have it under opensourcemetrics.org. If you're interested, there are links to the podcast and various things on running an open source business and measuring it. So how many people here know open source is different, right? Unlike other businesses, obviously you are competing with free, which changes the mindset quite a bit. Let me give you an example. It's a very easy example and people will pick it apart, but let me just get through it real quick. If you're a car manufacturer nowadays, you have the ability to set your price for the car, and people can walk away from the car. If they don't buy a car or a vehicle, then they have to choose some alternative mode of transportation. If you make a car that's in the open source space, you can still set the price for the car, but people don't actually have to pay for it.
So someone can come get the car for free, drive it around, and do whatever they want with it. So consumers can choose to buy your car or someone else's. And that boggles the mind of many traditional business folks, because it's giving something away for free. And one of the key things that people misunderstand about open source is that it's great if it's free and people can use it, but in order for people not to choose the free version and instead pay for something in the commercial space, it has to give them more value than what they're going to get in that free space. You have to figure out some way to connect with those folks, and you have to make sure that they are finding value in what you have. Now, I'm interested to know, how many people in here are more on the tech side? Can you raise your right hands? Okay, how many people on the business side? Raise your left foot. Okay, so a few. Well, many on the tech side consider this type of talk, when we talk about revenue and money, evil. So I'm going to go ahead and put on my other hat, which is my super evil villain hat, for this talk. And that doesn't necessarily mean that this is wrong, or that it is super evil or villainous, but it's hard for tech folks, especially founders, to talk about revenue and money. This is outside of their wheelhouse, right? From a technical standpoint, growing your business is not something that comes naturally. So people seek out different advisors, they look for who's done it before and what sort of things have worked, and then they try to repeat that process. So I want to start by giving some baselines. There are a lot of common open source business models out there, but I'm going to talk about the three primary ones. And that's open core; a SaaS model, or rather an as-a-service model (it could be infrastructure or platform as a service, just consider all those the same); and services themselves.
Now, from an open core perspective, okay, and excuse all the graphics in this, I'm not a graphics designer, so they're a little hokey. But open core, how many people have heard of open core before? Oh, the vast majority, but not everyone. Okay, so open core is when you have an open source version of your project that's freely available for everyone, but then you have an enterprise version, and the enterprise version reserves some set of features. If someone wants those features, typically more enterprise features like higher security or integration with partner products, things like that, then they have the ability to purchase that from you, and they'll pay a regular ongoing fee. A lot of companies have dual licenses in this case. So their open source is, let's say, GPL, and then there's some enterprise license for their enterprise version. Okay, now this is dwindling a little bit in popularity, as the cloud has taken over much of the tech industry. And that's where a lot of new businesses, especially in the open source space, are looking: this SaaS or cloud model. Okay, you can download the software, but if you really want to run it, then you should use our SaaS thing, we'll make it easier. This provides a higher level of quote-unquote stickiness. Now, if you are a manager, director, or executive at a company that has venture funding, or that has executives who have been in the industry for a while, I guarantee that they have had at least 10 conversations about stickiness in open source in a calendar year. And what stickiness is, is the ability for someone to start to use your open source software and then continue to use it. So how do we make it sticky? How do we retain them? For many companies, this is a model that provides a higher level of stickiness.
Because once you get into a cloud service, it's often really, really difficult to migrate away. You're relying on that cloud service to be your sysadmins, your DBAs, your DevOps folks. They're handling all those functions. And so to replace that, you either have to go to another cloud service, or you need to find someone who can do those functions for you. And that's often a really difficult prospect. But there are some issues with this model. This model is disruptible, because other people can run that exact same software as a service. What we've been seeing in the SaaS space lately is that a lot of open source companies have changed their licenses to prevent that. And they've gone from open to not open. So you might have heard of the SSPL, you might have heard of the BSL, and other things. These licenses are kind of sort of open. But that means that, as you get into the cloud space, you're restricted from competing with them. So it's more of a preservation of their revenue stream, if you will. And then there's the services model, which is really the original, kind of the OG model for open source software. And that model is definitely something that we have seen be very popular and very successful in a lot of different companies. This is a viable pathway for a lot of companies to start. It's an easier pathway to revenue. But it is hard to grow beyond a certain limit. So you might be able to get a handful of customers working with you from a support perspective, from a consulting perspective. But you're really bound by the number of people, because you have to have people to deliver those services. And if you don't have people to deliver those services, that becomes quite an issue. And if you compare the three models, you've got open core, which is a proven model, but it is getting disrupted and eroded by the cloud a little bit.
And there's still some success in that space, because a lot of companies, especially when you listen to some of the sovereign cloud and cloud native folks, are looking at enterprise versions or open core versions. They want to run their operators. They want to build their own private clouds. So there is still some play in that space. X as a service, or software as a service, is still a very popular model. But the question is, if you are releasing features that are only available in a SaaS version, with only a minimal open source version, is the SaaS model still open source? Or is it not? And that's a question that we really haven't answered yet, especially as companies start to reserve features and software specifically for their SaaS customers. And then there's the services model, which is definitely the easiest to get started with if you're a small company. It has the most likelihood that you'll get some revenue in the door really quickly without a huge buy-in cost. Now, whichever model you choose, part of it is: how do you get people to move from being open source users to actually paying customers? Okay. And this is probably the most stressful thing an open source maintainer, a founder, or anyone in the open source space has to deal with. Because ultimately, you're looking at this open source adoption, and let's say you're tracking 100 million downloads, but nobody wants to pay for anything. That's actually a case that happens quite a bit. How do you figure that out? How do you turn those 100 million downloads, how do you turn those 5,000 people who are tweeting about you, into something that is actually commercially viable? And when we go look at the classic sort of marketing funnel, and is anybody here in marketing? I'm just curious. No one will, one person, two people want to admit it. Hey, three people. Okay.
Oh, look at that. All right. So you might be familiar with this if you are in marketing, right? So this is just this very simple marketing funnel where you've got, you know, people who are hitting your website, and they turn into a contact or a lead, if you will, where they become a known user, and then they turn into a marketing qualified lead after they do a certain number of things. So let's say you're there on your website, they sign up for a webinar, maybe they downloaded your software, maybe they did a few other things, they get a certain number of points, and when they reach the point value, then all of a sudden it's like, whoo-hoo, you are now qualified, so then sales can talk to you. And then you have a, you know, sales accepted lead. So sales goes, ooh, that looks pretty good. And then you have a sales qualified lead when they start to say, hmm, I think I could sell something. All right. This is the classic marketing funnel, but it doesn't work in an open source space. There is a similar way to think about this. Okay. Because the open source model kind of flips this on its head. There is this model where we've got the interest phase, which is the people who are discovering what your product can do, how it can behave, what it can do differently. And you want them to actually try your software, okay? You want them to install your software or use your containers or use your library in, you know, their product. And only a subset of those who are actually interested, a subset of those who will visit your project's web page, visit your web pages, visit your assets, there, only that small subset is going to actually try. And of those who try, only a subset will continue to use it in production. And then only a subset will become a customer. So that model is very important because as we look at this phase, you've got to realize that if you have a million people on your website and you have 100,000 downloads, that's great. 
But there might only be 10,000 who use it in production and that might only have like 1,000 people who would be willing to pay for it. Now some of you who are familiar with the product space might say, well, we don't follow the funnel, we follow the, you know, product flywheel, the product growth flywheel. So anybody familiar with the product growth flywheel? All right, one, okay. So the idea is, it's funny because they actually stole a lot of this from the open source movement. When you look at the product flywheel, and this is all the hotness in all the product companies, right? So when we're talking about a product led company, a product, you know, product led growth, what they're talking about is people start off in this center, this core, and it's like, oh, how do we track and engage with people? How do we delight them? How do we make them really excited about things? Okay. And then we can take those strangers, turn them into prospects, which then turns them into customers, which then turns them into fans, and then they will start to become the people who will preach to the other people, to tell other people how awesome our software is and get them into the community. There's an open source version of this, and again, excuse my crappy little, you know, diagrams here, if someone wants to contribute these, better versions, that's awesome. But the idea here is, it's very similar, similar phases, but if you look at the differences, the differences are we have a download and try, we have that build and deploy into production, and then we have, oh, they're running it into prod, and then it kind of goes into, hey, do I need something more? That's when I'd be willing to pay. And then you'll evaluate a paid option and become that paying customer. It's a very similar cycle, both are the same, but there's a few differences. 
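To put numbers on the open source funnel from a moment ago, here is a small Python sketch that computes the drop-off between stages. The four counts are the round figures used in the talk (a million visitors, 100,000 downloads, 10,000 in production, 1,000 willing to pay); the stage labels and function names are made up for illustration.

```python
from typing import List, Tuple

# Round figures from the talk; the stage labels are illustrative.
FUNNEL: List[Tuple[str, int]] = [
    ("website visitors", 1_000_000),
    ("downloads", 100_000),
    ("production users", 10_000),
    ("willing to pay", 1_000),
]


def stage_conversions(stages: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Return the conversion rate from each stage to the next one."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        rates.append((f"{prev_name} -> {name}", count / prev_count))
    return rates


def overall_conversion(stages: List[Tuple[str, int]]) -> float:
    """Return the fraction of the first stage that reaches the last stage."""
    return stages[-1][1] / stages[0][1]


if __name__ == "__main__":
    for label, rate in stage_conversions(FUNNEL):
        print(f"{label}: {rate:.1%}")
    print(f"overall: {overall_conversion(FUNNEL):.2%}")
```

With these example figures every step keeps one in ten, so only one visitor in a thousand ends up as a potential customer, which is why knowing the counts at each stage matters.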
I like to clarify this a little bit more in the open source space because it makes sense, because really what you're talking about here is someone who's trying your open source software, they're getting used to it, they're like, wow, that's pretty cool. But if it only did this next thing, and then you're like, oh, well, I have that, it's in the enterprise version, right? If it only did, you know, you know, integration with this other product, and so that's where, oh, I need something more, and then you get them to evaluate it. And that becomes a way to build external evangelists. So there's also this idea of evangelism or evangelists, a way to foster people who are in your community, who become fans of your product, who then actually go out and contribute to that flywheel, right? So the goal here is you want someone who can start off telling people, wow, I was really successful with this application, and then they'll tell their followers and people who like them and people who are like-minded, and then those people will watch that content, and then those people will go ahead and try it. And it's similar when you add DevRel, evangelism, this is all part of that product flywheel, where you want more people to understand it and then eventually try it, and then eventually you want those people to create or help foster new people into the community where there are new evangelists who can go out there and talk about your product. So let's talk specifically about driving open source adoption, okay? And there's only three things you need to do. Yeah, I'm going to sound like one of those shill salesmen on the late night, you know, TV commercials where it's like, you know, these three things, if you buy into my system, you'll be awesome. You know, this will be the best thing ever for you. 
And so there are actually three requirements that we'll dig into, as much as we can in the time remaining, for driving and growing your open source adoption. And the first secret to success is a kick-ass product. Okay, it all starts with the product. If you don't have a product that people want, it doesn't matter what else you do, they're not going to listen to you, they're not going to try it out. It needs to be rock solid. Second, we need people to know who you are. Okay, a lot of open source projects that are out there have awesome technology behind them, but nobody knows who they are. Has anybody run into a product and said, how did I not know about that three years ago? Right, that happens. And why is that? Because whoever's running these projects, they're not really sure how to bring awareness to everyone else. And then you need to make it easier than all the alternatives out there in the space. Okay, obvious, right? All these are obvious. It's not like it's anything that's just crazy. All these should be obvious and easy as well. Now, I can't go into a deep dive on all of these, but briefly, I want to touch on each one, and we'll go through each one separately. So how do you build a kick-ass open source project? Now, I could tell you all the technical jargon, but I'm going to stick with what this looks like from a business perspective. First, I want to talk about the biggest mistake we all make when we're building a project. At a certain point, we have to realize that when we are building a project, we're often starting it and building it for what we want. And that works for maybe a little bit of time, and you might even make a successful company, but you're going to limit your growth if you are building for people who you think are exactly the same as you, right? It's that, oh, I assume everyone is like me.
I know people who follow this, and they're like, well, I assume that everyone's going to install on the command line. I assume everyone will install from source. And so that's just the way that they think. That's not necessarily how everyone else in the community thinks. So make sure that you avoid that. And that's why the first key, in my opinion, is understanding, right? Understand the product and its value. What can your product do better than anyone else? Okay, are people even willing to pay for that? You should ask yourself that question. Can you answer that question? What do you do over the alternative? Is there an alternative to this? Is there some other piece of software, something else that other people are doing? What do you consider success? Now, this one is really interesting. I've talked to a lot of founders. I've talked to a lot of CEOs and executives. And they'll talk to me about, oh, we want to accelerate our growth, our adoption. And I'll say, well, what's your goal? And they'll be like, well, we want to sell the company, or we want a 10x valuation, or we want, you know, fill in the blank. It's like, no, no, no, what do you really want? What do you want in the next year? What do you really, really want? And a lot of times it's like, oh, well, really what we want is people to sign up for our new beta. I've actually had that conversation where it's like, we just want like 1,000 people to try it and then give us feedback. Well, that's a very different sort of product that you're building, and very different marketing or community activities that you're doing. And so that's something to consider. Who's going to use it? Are you building this for devs? Are you building this for business folks? Who's the end user? Who's going to benefit from this the most? And who would pay for that product if they're in that space? And where are those users most likely to be?
Are these users going to be at FOSDEM? Are they going to be at Open Source Summit? Are they going to be at KubeCon? Where will you see these people pop up? And how do you ensure that you know where they are, what sort of things they like, and what sort of things they need? You want to understand that really early on. Even if you're not trying to monetize or commercialize the open source yet, you want to keep that in the back of your mind and start to funnel things in the right direction. Now, the next key is to listen and ask questions. Remember, I said a big mistake people make is that the founder builds the software specifically for themselves and doesn't necessarily take into account other people in the ecosystem. Product success is really determined by your users and your potential users, and by listening to their needs. If you're not listening to their needs, you're going to be missing out on a lot of opportunity for growth. That's where that mistake comes in. So avoid that if you can. The other one is you have to set a goal and focus on it. Be really good at a few things, not just okay at a lot of things, because there's a lot of other software you might be competing with. There are a lot of other people who are potentially solving the same issues. And it's really easy to get distracted by shiny object syndrome. Anybody get distracted by shiny object syndrome? I'm like that all the time. Yeah, I see hands waving, right? It is so easy to do. In fact, I was just talking with Avi, our CEO over here, about tracking web page stuff. And I'm like, oh, we could do that in our product. But then we're like, wait a minute, that's not our core product. Why would we go down that hole, right? But people do that a lot. And when you do that, it might be cool.
And you might learn that there's a new subset, and there might be a time for expanding your product. But until your core product has that established base, you don't want to do that. So don't deviate too often. Stick to that original purpose. If you do that, you will eventually find that you become better known in the market. People are willing to talk about you, use you, and then share additional contributions to maybe make your project a little better. Okay, this one's a big one. It has to meet or exceed expectations. Now, this totally seems obvious, right? But let's assume that I build a cell phone. My cell phone has the awesomest speakers, the awesomest music system, and it is the best gaming platform ever, but it can't take calls. Is it still a cell phone? If I market it as a cell phone, if I sell it as a cell phone, if I say, oh, this is my cell phone, it doesn't matter. It's not. So if I'm trying to build a cell phone, and it is great at music and gaming but can't take a call, I have not met the expectations for it. That's where you need to figure out what the table stakes are for your project. What are the key ingredients? Whatever those are, that's where you have to focus. And that means it has to do the job that it was originally intended to do and meet some minimum standards. There are minimum standards that we all have from an open source perspective. What are some of those minimum standards? It has to be secure, right? Who wants an insecure piece of software nowadays? We have too many of them. Let's not add any more. It also has to protect against data loss. I don't want leaky data. You know how many times my personal data has been leaked? My daughter's going to inherit free credit monitoring for the rest of her life, because I've had so many data leaks. I mean, I think there are several Matt Yonkovits running around right now, maybe even at this conference.
It also has to be as bug-free as possible. Now, while we all strive for bug-free software, we know it's not a viable thing. But shipping with known critical bugs is bad. I know some companies who do it just to meet deadlines. It's not good. Next up is UI/UX. Are there any UI/UX people in here? Okay, awesome. Yes, many. All right. Well, it's like this. If your product is designed like this elevator, that's really confusing, isn't it? UI/UX is probably one of the most important things to get right. The user experience can ruin an awesome technical project. You can have the best technology out there, and if the user interface sucks, people won't use it. Or they'll use it once, which is even worse. So a bad user experience can ruin any benefits that you provide. GitHub is littered with projects like this, by the way. Of the millions and millions of projects that are out there, many fall victim to this, even if they are really solid. And if you want to grow your community and grow adoption, make it easy. Remember, easy was my point number three. You need users to be able to be advanced if they want to and get into the weeds. But most users won't. And again, that gets back to the assumption that we build things for ourselves. A lot of CTOs, a lot of co-founders, a lot of people in this space think, I'm super technical. And they are; they're building a company. Those of you out there who are founders, at that executive level, or maybe maintainers of projects, you are all brilliant. And that means that your brilliance probably exceeds your average user's brilliance. That's okay. But keep that in mind. So make it easy for folks. And my product key number six is community. Open source projects need community, and we need to bake that in from the beginning. That means including plans for fostering that strong community up front.
Now, when I brought this up while previewing my slides, someone said, well, wait a minute, what about curl? Does curl really have a strong community of contributors? And what I say is we have to look beyond code. This isn't just about code. You want people who will use your product. If you search for curl in your favorite search engine, you're going to find thousands of examples of how to use curl, right? You can do it right now if you want: tutorials, examples, cool things, hacks, whatever. That's community, even if it's not code contributions. And that's what you need to continue that product growth. All right. Moving on. Now we're talking about how we supercharge the awareness. So let's talk specifically about community. Is anybody here in the community space? Oh, a few people. Okay. Great. So this is the great hope for adoption in open source companies. In fact, a lot of people will prioritize a DevRel hire over any of the business hires because they want to start to connect directly with that community. This is a balance, but the community can really help accelerate and drive that awareness if done right. But here's the thing. Community is starting to turn into the new marketing. For the last five or ten years, marketing has often been in control of community. Community has been a subset of marketing, sometimes a subset of product. But there is a shift in mindset: the community is so important, and it's driving so many conversations, that it's supplanting marketing. There are reasons that you still need marketing, but in the future, I think what we'll see is the community team having marketing as a function, not community as a function of marketing, which is a different way to think of it. And that's a real mind shift for a lot of people in the business space.
But if we take a community-first approach, we can accelerate and grow the open source space even faster. And that gets back to this expectation, right? I mentioned earlier that funnel, the flywheel. Can we have people in the community who talk about us, then turn into followers who continue to watch our things, and then continue to try and use our software in different ways? That's the ultimate outcome that we want. And when we talk about this community team, there are multiple functions in it. This is a chart that I've used several times before. We're talking about DevRel or evangelism, we're talking about community management, and we're talking about these OSPO roles with contributors. All of these things are part of a strong community team, and all of them need to be satisfied. Now, a lot of us will focus only on evangelism and not do the other functions, or focus only on the contributors and not do the other functions. All three of these are needed for a healthy ecosystem. Now, what are some popular ways to grow open source adoption? These are probably obvious for anybody who's been in the open source space or in the community for a while, but they're classics. Content, number one. Everybody loves content. The more relevant content you have, the more code examples you can share, the more conference talks that are relevant to people, the more blogs you write, tutorials, sample applications, and whatnot, the more of a difference it will make. All of these things are important to those in the community. There are other activities you can do, especially when your product solves something that's newsworthy. Has anybody heard of the term newsjacking? No? Okay, a couple of people.
It's the idea that if there's something in the news cycle, like the latest hack or whatever, and your product or your company does something differently that solves these issues, there's an opportunity for you to talk to people about how to build better products. This is a way for you to use the stories that are already out there to reach additional people. If somebody's hacked or some cloud provider goes down: well, this is how we would prevent that, or here's how our software did this differently. Those types of things can actually have a really wide reach because they are at the top of people's minds. Now, you also need to generate friends telling friends, right? That peer-to-peer connection, if you will, of people who are willing to share their experiences and share the goodness of your software. And that goes hand in hand with some other things, like good messaging and products and making it easier to start. Now, how do we make this easier? That's key. The first thing is that it requires a lot of collaboration. It's a choreographed dance, okay? To make this easier, you need not only easier products to use and a great user experience; you also need to make sure that you have enough information, enough tutorials, enough example code, and enough community activity out there, and you need to merge those together, because you have to have a good product as well as a strong community, strong content, and good documentation to make this work. So easy is important, but it goes beyond just one team's function. Some of the things that I see people stumble on or fall down on are things like installation: how easy it is to install and get started. Sane defaults. How many people have installed software and it's like, how did you not set that up?
Like, why isn't that on by default? You know, like encryption. Oh, encryption is just left off by default. Or my favorite, and maybe there's somebody here who's run Mongo recently, but they used to not set a default password because they were like, oh, we don't want to limit people from getting started. Well, how can you not set a default password? By the way, that's the number one reason MongoDB has data leaks. You read these articles and it's like, oh, these old installations didn't set a password. I don't consider that a security issue, that's more of a user issue, but that's a whole other story. How much automation do you put in place to help users? That's another critical one. Again, UI/UX, and how easy it is to debug problems. If the software crashes or has an issue or doesn't work as expected, do you have the right debugging in place for people to actually find what's wrong and fix it? Because if not, people are going to have a really, really hard time operationalizing it. So we talked about the kick-ass product, we talked about the awareness, we talked about how to make it easy. Now I've got to put on my other hat. I do have another hat. It's my metrics hat. I've got to wear my metrics hat for the metrics section of this. So now we're going to talk about how we track open source usage and adoption. There are really four or five major functions that I want to talk through: awareness, usage, conversion, customers, and retention. This follows that funnel and that flywheel that we talked about originally. Each one of these requires a certain set of things to potentially look at. You don't have to look at all of these, but these are metrics that you should all consider, and then pick and choose which ones make the most sense for your business.
But when we talk about awareness, a lot of people start with the first one on this list. So how many people track GitHub stars? Okay. Hey. GitHub stars are a popularity contest, by the way. Did you know that? They really are, because they don't show much other than that people like your project, or maybe they like the people on your project. And did you know that for only $5.99, I can get you 1,000 stars on GitHub? There are services that do that. So I don't know how many of the stars are actually real in some cases. It's hard to tell. But you should be looking at that, because if it is going up and to the right and you're not gaming the system, that is a good indication that people are finding your project. But keep in mind that a project has more users than just those who will go to your GitHub page and get into your repository. That's something to consider. Unless you're forcing people to install from source or read your documentation on GitHub, you might have more people downloading through some other mechanism. Maybe they're getting it from a third-party repo. Maybe they're just going to your website and getting it. So they might never star or even see your GitHub pages. So look not only at GitHub pages, but at the traffic to your docs, your website, and your different pieces of content, and understand who's interacting with them. That's another critical piece of this. And that's a classic marketing thing. A lot of people look at the website traffic. But you also want to look at the unique views of each individual page. All of those are critical to look at. Now, what's interesting about page views is that we've had conversations with folks who say there are certain pages that don't matter and certain pages that do.
So being able to segment your traffic and look at who's looking at what is very important to understanding that growth and adoption. For instance, if someone looks at your pricing page, that's probably a good indicator that they might be interested in something commercial. If someone's looking at your documentation, that's a pretty good indicator that they're trying to install or debug an issue. If they're looking at tutorials on how to get started, that's a very good indication that, again, they're trying to use your software. So you can track adoption by looking at those types of things. Now, when we talk about external evangelists and trying to get additional people in the ecosystem to talk about you, figure out what you're doing, or maybe get excited about your product, looking at the referrals to the website is another really good example. So I can see that your blog pointed back to my website or your tweet mentioned me. These referrals are often more critical than the website traffic itself. Now, there are also two concepts that are really squishy. I won't get into a terrible amount of detail; if you're really interested, check out opensourcemetrics.com or .org. The first is share of voice. Have you ever heard of share of voice before? No? Okay. Think of share of voice as how often the project or the thing comes into people's minds, or how often it shows up in search. How much of the market do you have? So, for instance, when I say Linux, what distribution pops into your mind? I'm going to guess most of you are probably going to say one of three or four. But there are lots of other ones. So those three or four have a higher share of voice than everyone else. And if you can measure how much share of voice you have, then you know how much of the market you can potentially reach.
And then when you have something to say, people are potentially going to listen. And then there's this idea of social reach. Now, probably everyone here has different social accounts on different platforms. That's awesome. The question is how much of a reach you have. This is measured by the number of followers and the amount of engagement. The reason you want a larger social reach from an awareness standpoint is that when you do have an announcement, when you launch a new product, when you talk about a new release, when you have something important to say, it helps amplify that voice. Now, usage is where the adoption really comes into play. Usage is how many people are actually using your software. I know a lot of companies have struggled with usage on the open source side. A lot of times people come back and ask the question, well, who's using our open source software? And the answer a lot of times is, I don't know; a certain number of people hit the download web page, but there are so many different places to get the software that I can't really get that information. But some people can get raw download numbers. Anybody look at download numbers for their software? A few? Okay. It's hard to get. It really is, especially when some people deploy via Docker, some people build from source, and some people have packages. So that is a challenge. But it's important to understand: if there are more people downloading, that's good. But you have to be careful. Because, oh, I'm going to tell you a secret. There are a lot of bots on the internet. I don't know if you knew that. Don't tell anybody. And those bots download stuff. In fact, one of the things I mentioned that Scarf does is track things like downloads and download metrics.
But what we found was stuff like, oh, this project has 50,000 downloads a month from two people. It's like, well, wait a minute. There are only two people actually downloading. So when you see 50,000 downloads but only two unique people, you're like, well, what is that? And then you dig into it and it's some bot that's downloading it and distributing it some other way. So it's important to realize that can happen. You want to look at raw downloads, but you also want to look at scrubbed unique downloads: how many people are actually downloading versus how many downloads could be bots or duplications. A lot of companies download the same software over and over again. A lot of people will download the latest version of software in their pipelines, et cetera. New users and new companies coming into the ecosystem: that's really important for usage. Sign-ups. If you are a SaaS provider, you're offering a free trial. Maybe you have a webinar. Maybe you have a mailing list. Maybe you have Slack. Sign-ups are another thing that you want to track for usage. That's not necessarily going to correlate to actual usage of the open source. But again, that's more of a directionally correct, up-and-to-the-right indicator. So the one at the bottom here, this one's sticky. Okay. This one's tough for me. Call-home metrics. Most of the people in the open source space hate the concept of anything that's going to call home, right? It's not necessarily always welcome. But there's a certain amount that might be needed for projects. And the question is where that line is, right? When I talk about call-home metrics: is something running? Is it installed? Is it available? How many things are running at any one time? These are critical questions that a lot of projects struggle with.
Because even if you have a download, that doesn't necessarily mean the download translated into an actual user. So you want to make sure that you can tell the difference: yeah, they downloaded, but they stopped using it. If they downloaded and stopped using it, or never actually used it on a regular basis, that's very different from someone who has downloaded and who you simply assume is using it. Now, before we get into conversions, I want to say that just because you have adoption of your open source software doesn't necessarily mean that it's going to lead to commercial success. Okay? There might be some people in here, and maybe you're willing to raise your hand, but I know that several people I've talked to struggle with this, where they're like, I have a million downloads a month, but I can't get anybody to buy anything from me. Oh, yeah, there we go. We have one brave soul. That happens quite a bit. It gets back to some of those product things, and there are a lot of things that you can do to try and change that. But it gets back to these expectations around awareness versus what the business expects. From a metrics standpoint, you've got the two sides, right? On the community side, the adoption of open source side, a lot of times our efforts are around fostering that really strong community. Then there's this big question mark in the middle. And then on the business side, it's, oh, now we want ARR, now we want to look at expansion numbers and churn and all this other stuff. So how do you bridge the gap between these two? This is where conversion comes into play. And conversion is a bad word for people, right? You know, yeah, you will be assimilated. We want that from the open source side, right?
It might have a negative connotation, but I think it's funny. I do have a super evil villain hat. But you do want those conversion metrics. You want to ask, okay, what does it take to go from free to paid? And to measure that, there are different things to look at, right? Conversion is impacted by a lot of different things, and each one of these is something you can measure. From a product perspective, people want better features, better security, support, things like that. From a policy perspective, companies will often require support, or they might have compliance requirements. How do you meet them? And then people, right? All of these things affect your conversion ratios. But there are conversion ratios you can actually control. You want to look at things like page, docs, or source to download conversion ratios: doc views to those who have downloaded to those who eventually signed up. If you can track that, that's great. I know that in the past I've had to go through and do stuff like that. An actual example: company A, a really large company, started looking at our blogs four years ago. Then they attended conferences and did all this other stuff. Then after two years, they called us up and had a conversation. And then a year later, they did something else. So there are breadcrumbs, when you look, that let you see what people are doing and what sort of activities help lead them to become a customer. Now, when we talk about customer metrics, we're talking about standard business metrics. These are the critical ones that most people who are running an open source company are familiar with, things like the number of new customers or new logos, ARR, MRR, stuff like that. I'm not going to go terribly deep into this, because there are MBA courses that cover a lot of these.
But the ones that I typically look at are the big three: number of customers, number of new logos, and yeah, ARR and MRR. I also look at the user-to-customer ratio. If you know you have a million users and only five paying customers, how do you adjust that? How do you tweak it? How do you make that work better? And I also look at the number of customers who are advocating for us. From a retention and churn perspective, churn is the idea that people start using and then go away. So you want to look at those. You want to look at the number of instances that have gone away. You also want to do some competitive analysis, right? If you have users and customers who are starting to use other open source projects or take a look at other things, you might want to be aware of that. So, recapping: not all open source businesses are the same. Realistically, build an awesome project, focus on the funnel, and of course, measure it. All of these things will help lead you to open source success. Again, if you are interested in getting more details about any of the metrics or learning a little bit about what's been successful for other folks, check out opensourcemetrics.org. You can contribute to it if you think there is something missing or something you think we should do better, definitions, et cetera. We would love to have your feedback and help there. If you want to reach out to me, I'd be happy to have conversations with anybody offline. And that is it for me. Questions? Oh, we've got one question. We're running mics to people in the crowd, it looks like. Oh, it's not working. Okay. Well, maybe you should shout. Maybe now. You mentioned collecting metrics from users, but because of the GDPR, you need informed consent to be allowed to collect metrics. In your experience, what's the percentage of users that actually allow you to do that? Is it like 1% or 10%? It's generally more than you would think.
I mean, it's at least a third, maybe more. And beyond that, there are metrics that aren't tied to PII that can show those indicators going up and to the right. So if you have web page traffic or downloads, even if you don't know specifically who those people are, you can still see those numbers going up and to the right, and those are just as important in a lot of cases. That's why you have to pick and choose which ones make the most sense. In a lot of cases, it's about building that community and that relationship, so they're willing to share details. And how you do that is you don't use those old, slimy sales tactics. You don't want to spam people, you don't want to call them. Look, let people raise their hand when they're ready. That helps a lot. And when people go to bat for you and tell other people, hey, you should sign up for the newsletter, you should sign up for this, and then those people give you that consent, that helps a lot. Thanks. Yeah. Oh, the other way, the other way. Yeah. We had a couple more questions. We've got one in the middle, one at the end, one down here. Oh, hello. Yes. Okay. So I was one of the folks that raised their hands when you asked whether we know what open core is. I didn't, but then you started explaining and I thought, okay, that's like freemium. So I wanted to ask, could you explain the difference between open core and freemium? Ah, so, yeah. Freemium is where your first taste is free, and then you can get more later on. Open core has been around a lot longer than, quote unquote, freemium. Typically with a freemium strategy, you're going to pay more as you go. So having a cloud service where the first, let's say, 100 gigs of data that you store is free, and then you pay from there on, that's more of a freemium strategy.
But from an open source perspective, how I view open core is that you've got a complete product that's running. You don't have to pay for anything. And if you want to augment it or change it on your own, you can. In my mind, freemium typically comes more into play on the cloud side, where you've got more of an ongoing service where you're paying for something that you can't necessarily replace or pull out. But it's a fine line, and it's really difficult to draw. A lot of people might go back and forth. Oh, it looks like our time is up. I'm happy to answer your question if you want to come down, and we can chat afterwards. That's cool. I appreciate your time. I do. Thank you very much. |
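Two of the metrics from the talk above, scrubbed unique downloads and the user-to-customer ratio, can be illustrated with a short Python sketch. This is only a rough illustration under stated assumptions, not how Scarf or any other tool actually computes these numbers: the log fields (`client_id`, `user_agent`), the bot markers, and the sample events are all hypothetical. The only figures taken from the talk are the million users and five paying customers.

```python
from collections import Counter

# Hypothetical download log: (client_id, user_agent) pairs. The field
# names and values are invented for illustration only.
DOWNLOADS = [
    ("client-a", "curl/8.0"),
    ("client-a", "curl/8.0"),
    ("client-b", "docker/24.0"),
    ("mirror-bot", "example-mirror-bot/1.0"),
    ("mirror-bot", "example-mirror-bot/1.0"),
    ("mirror-bot", "example-mirror-bot/1.0"),
]

# Assumed heuristic: any user agent containing one of these markers is
# treated as automated traffic. Real bot detection is harder than this.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_bot(user_agent: str) -> bool:
    """Return True when the user agent matches one of the assumed bot markers."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def download_summary(events):
    """Count raw downloads, downloads left after bot scrubbing, and unique clients."""
    scrubbed = [(cid, ua) for cid, ua in events if not is_bot(ua)]
    per_client = Counter(cid for cid, _ in scrubbed)
    return {
        "raw_downloads": len(events),
        "scrubbed_downloads": len(scrubbed),
        "unique_downloaders": len(per_client),
    }


def ratio(numerator: int, denominator: int) -> float:
    """Ratio between two funnel stages; 0.0 when the earlier stage is empty."""
    return numerator / denominator if denominator else 0.0


summary = download_summary(DOWNLOADS)
print(summary)

# The talk's example: a million users and only five paying customers.
print(f"user-to-customer ratio: {ratio(5, 1_000_000):.6f}")
```

The same `ratio` helper could be applied to any pair of adjacent funnel stages, for example doc views to downloads or downloads to sign-ups, provided those counts can be collected at all.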
Starting an Open Source Startup
What you need to know before starting your own open source startup |
Hi, everyone. My name is Tom, and today I'll be talking to you about starting an open-source startup. This talk is directed at early-stage startups, people thinking about starting one with an open-source product, or people thinking about open-sourcing an existing product. If you were here for the previous talk, it was more about going from step one to step two. Here I'm going to talk about going from zero to one. You don't have a product yet; you're just starting out. So what is an early-stage startup? For the purpose of this talk, it's an early-stage company that's designed to grow fast, often backed by venture capitalists, so investors, and aiming for very high revenue. You have to remember, and maybe it's important to frame the rest of the talk around this, that this is a capitalistic endeavor. You're taking money from people, and you're promising to return more money to them after a certain amount of time. You have to keep that in mind through everything. It's so easy to forget, but it's the basis for everything else. And it's especially easy for us to forget, because we're so used to giving so much free value to the world with our open-source contributions, and also taking a lot of free value back by using Linux, by using KDE, by using all of these tools. But if you're out there building a business and taking money from investors, it's important to remember. So who am I? I'm the founder and CEO of Svix. It's a company that started as a business and then open-sourced the product, although that was the intention from the beginning. I also created EteSync, which started as an open-source project and then turned into a business. I'm obsessed with APIs and developer tools. I've been around for a while and worked on Enlightenment, OpenMoko, and a few other projects as well. So what is Svix? We do webhook sending as a service. It's an API-first developer tool. We help companies send webhooks, essentially.
It's both an open-core and a hosted product, as we do both. We raised over $10 million from top investors, including Y Combinator, Andreessen Horowitz, Aleph, and founders and CTOs of companies such as GitHub, PagerDuty, Segment, Lookout, and Fly.io. So why am I even giving this talk? There are a lot of resources online for how to start a startup. A lot of great resources. But there aren't a lot of resources for open-source startups, especially not ones that come from the community, from people that actually understand open source. It's more like an opportunistic approach that a lot of people take. And also, a lot of people have been asking me for advice over the years, even this weekend. I figured, why not? Let's make a deck out of it. So before we begin, why even start a startup? I saw a talk by one of the co-founders of Facebook many years ago, and it really resonated with me, especially now that I've seen other people start companies and I've started companies myself. A lot of people think it's cool. Your vision is that you're going to hit launch, you just finished building a product, you get press attention, go wakeboarding with Obama, or I don't know what you want to do, rock star stuff. But the reality is that no one is actually going to care at all. You're going to beg them for press attention, you're going to beg customers to use your shitty product, and really, honestly, you don't have time for press and all of the rock star stuff anyway, you're building a business. And also, a lot of people think that they want to be the boss, you know, big office, private steam room, making billion-dollar deals, private jets, all of that stuff. In reality, it's a noisy co-working space, if you're lucky, most likely a kitchen counter, and begging people to try a product again and fighting for survival. So it's really not like what you think. Another thing people think, well, I'm my own boss, I'm going to have flexibility.
I love this quote: as an entrepreneur, you get some flexibility, you get to choose which 24 hours of the day you work. It's so true, and for a variety of reasons. First of all, your customers are demanding, and you want to make sure that they're served right. You don't want to lose that deal just because you're unavailable, even though it's your wife's birthday or some other occasion. And the other reason is, you're really obsessed with what you're building, so you're actually thinking about it all the time anyway, and you have ideas in the shower, and you have ideas during dinner, and you're just dying to go back and kind of put that into reality. Another bad reason is that you want to have more impact. Honestly, just join an early stage startup. You're going to be working on a product that's already well funded, doing well, and you're going to grow with the team, become the CTO maybe. It's a great path. Or join a late stage company. You know, I worked at Samsung, and my code has been used by millions and millions of people, or at least devices. That's much more impact than any of my companies have reached so far. So that's just something to keep in mind. Another thing, you want to get rich. That's not a good reason, because the likelihood of success is very low. You're most likely going to have a really low salary for many years, and then you're going to go bust. Not a good strategy if you want to make money. Again, join an early stage startup. Employee number 50 at Facebook, for example, is a multi-multi-multi-millionaire. Or just work at Facebook, Google, all of those companies. Again, I don't recommend it, but if this is what you want, if you're just chasing money, you can do that. And you have to remember, it's going to be absolutely brutal. Things are going to go wrong, deals are going to fall through, you're going to get a lot of noes from candidates, from investors, from literally everyone, all the time.
You're going to want to quit, but because you're the founder, that means that if you quit, the company is going to close, and that means your employees are going to lose their jobs, and your investors are going to lose the money they invested. So it's really hard to quit as a founder. Okay, so we covered all the bad stuff. So why actually start a startup? There are a few good reasons. One, you really care about the problem. You want to cure cancer. No one else has cracked it yet. You think you have a shot? Go for it. Or you're really passionate about your idea. You have, I don't know, Uber for cats. Terrible idea, but this is something that you really care about, and because everyone thinks it's terrible, I guess for the laughs, no one else is going to build it. You decide to build it yourself. Or you just want to start your own company. For whatever reason, this is what you want to do. And just make sure that your motivation is long lasting, because you're going to need it for the next 10 years. Another good reason, a bit more altruistic, is that open source startups are good for the community. They bring much more money and talent into open source, and great open source products. You know, we're all going to use them every day. And also, open source startups tend to support the rest of the community. We support other projects, both financially and by upstreaming patches. We also employ the maintainer of, like, Redis Rust. And so, again, great for the ecosystem. But you have to remember, there are other alternatives. You can join a promising startup. You can start a non-VC backed startup, with much less of the stress that I mentioned. You can start a side business. I've done that. It's great, much less stressful. Still stressful, but much less. Or you can just start an open source project. Again, I've done that. Many people in this room, I'm sure, have done it as well.
It's a great way to just, you know, put something out there without the pressure of actually having returns. And so, really, just be true to yourself and what you want, because you need to know what you're signing up for. So, let's talk about getting started. The first step in the journey is validating your idea. So, is this a problem that people even care about? Are you the right team to build it? How will you make money? How will you get customers? How big is the opportunity? You need to constantly validate your riskiest assumptions in order to not head in the wrong direction. I highly recommend filling in the Y Combinator application. Y Combinator is an accelerator. Even if you don't submit it, it just forces you to think very thoroughly about the business from the eyes of investors and from a business perspective. And it's been very helpful for me. And you know how people say there are no bad ideas? There are definitely bad ideas. Some threads of common bad ideas are the ones that are very obvious. So, for example, oh, I'm going to start a startup for teleportation. Great idea. I'm sure all of you guys are going to pay for it instead of taking the flight back home. But the problem is it's actually technically difficult or impossible. That's why no one has started it. So, you know, it's the first step of validation. The other type of bad idea is when you have a solution, you built a project, you're really excited by it. That's what you want to work on. The problem is it's not applicable to anything. I don't have a good example off the top of my head, but there are many of them. And I'm sure the crypto haters will use crypto as an example for that. Also, another one: I get pitched a lot of ideas about spaces that the would-be founders know nothing about. You know, I know nothing about cancer research.
I'm not going to start a cancer research company. And one last thing, which is actually not a bad signal: people telling you that it's bad. And that's, you know, going back to, like, why doesn't it exist? Because everyone thinks it's bad. And that's where your opportunity lies. So, just ignore the naysayers. I asked Paul, the co-founder and CEO of Supabase, what's his best advice? And he said, everything should be open source, but not every open source project should be a business. And that's so true. Again, validate, verify that what you're building is actually a good business. In the now immortal words of Y Combinator, you need to make something that people want. That's it. So, when you start, all you need to be focusing on is writing code, showing it to customers, getting feedback, talking to them, all of that. You know, in the open source world, we know: release early, release often. That's how you get things into the hands of customers or users and you get feedback. It's much better to find a few that really love what you're doing than many people that are, eh, because the few that really love it are going to tell their friends about it. And the many that are okay about it are going to stop using it after a certain period of time. Also, the 90-10 solution. So, find how you make the most impact in the least amount of time. Perfection is really the enemy of good here. And the last thing, you want to do things that don't scale. So, what do I mean by that? There are certain activities that you can do at this stage of the company that are definitely not going to work once you have a billion users. But for now, you can do them. So, one example from the beginning of Svix is that I used to send the API keys encrypted to our customers. We didn't have a sign-up flow. There was no need, right? I didn't know if anyone wanted the company or wanted the product. So, like, why build a sign-up flow? Why build a landing page?
None of these things matter in the beginning. So, just focus on the product. Another great quote that I love is from Steve Jobs: everything around you that you call life was made up by people that were no smarter than you. And you can change it, you can influence it, and you can build your own things that other people can use. So, really, don't be scared because it seems difficult. Just go for it. Really just go for it. So, before you start, ignore the naysayers, but be honest with yourself. You know, a lot of people are going to tell you that what you're doing is stupid, silly, a bad thing. Maybe they're right. You have to make that call. And effort counts for nothing. This is not school. If you don't get results, it doesn't matter how hard you work. It doesn't matter how much you tried. It doesn't matter how cool your code is. It just doesn't matter. Again, de-risk the business. You have to listen to advice, but decide for yourself. First of all, because it's practice for later stages. You're the one running this company. But also, you're the only one with a lot of context about what your customers want, because you've been talking to them all the time, and about what doesn't work, because you've been writing the code. So, really, just decide for yourself. And remember that most startups fail, like the vast majority. So, if you're putting in average input, you're going to get average output, and that means death for a startup. So, definitely watch out for that. Okay. I think that was, like, the beginning. Now we can talk a bit about the product. So, the first and most important thing is understanding the problem. You need to know your customers, what they want, what they care about, what they hate. Really have the curiosity of a child and really get to have as much empathy as you can. A good hack in the beginning is being your own customer. That really works.
But you also have to make sure that there's a path to other customers that are different than you are. Because if you're just building, I don't know, name tags that just say the name Tom on them, that's great, but there are only so many Toms in the world. I don't think it's a good business. Another quote, I love quotes, is from Henry Ford: if I had asked people what they wanted, they would have asked for a faster horse. And that means, you know, listen to your customers, learn about their needs and wants, but don't follow blindly. Say no. It's absolutely fine to say no. You also have to remember it's not about what you do. It's about what you enable. People don't actually care about drills. I mean, drills are fun. Don't get me wrong. But what they want is holes. And you have to figure out how you give them a hole. And again, don't obsess over the fact that you already have a drill. And another part about this is, you know, we talked about product, and that kind of implies what people actually use. But really, when I asked James about his experience, he said something that really resonated, which is that your whole company is the product. And actually, Zeno also concurred. And from my own experience, I've lost count of the number of products that I really wanted to try. I went to their website, and I couldn't really understand what they were doing, but I thought maybe it's relevant. And I went to their docs, couldn't understand how to get started, and just dropped off, even though I'm sure it's actually the perfect product for me. And it's the same with every other aspect. So, you know, marketing, customer success, all of those kinds of things. You have to obsess over the customer experience. So obviously, have great customer service. As I mentioned, you know, it's your wife's birthday, but the customer is calling. You may need to, like, drop off for a few minutes.
But much better than having great customer service is actually fixing the docs. Like, why should they even have a problem in the first place? But even better than fixing the docs is just making the product easy to use. So, you know, sane defaults, great onboarding in the product itself. All of those things make for a much better experience. Now, I know this doesn't need repeating at FOSDEM, you know, just looking in this room or looking even outside at all the stands of the amazing communities that are powering this whole event, but maintaining a healthy community really matters for an open source project, because that's how you get contributors. That's how you get evangelism. That's how you get all of those things that later on turn into business goals. As I said, capitalistic talk. I'm not as cynical in real life, I'm just showing you the connections to business value. And so be quick to respond to PRs. The most annoying thing: I opened a PR like two years ago. Got feedback a few months ago. Obviously, I never addressed the feedback. I mean, that's ridiculous. There are tools to help make sure that you maintain a good community, like orbit.love, and also the community dev room here at FOSDEM. Okay, so we talked about the product, which maybe some of you know, maybe some of you have already built a successful open source project, but now what we need to remember is that we're actually building a business. We're no longer just engineers, and we should focus our time on non-engineering tasks, which is so hard, and I'm going to talk about that in a bit. But the first step is understanding your business model. And what is a business model? It's just a fancy way of saying how are you actually going to make money? There are many different models for open source and many different models for non-open source.
Red Hat and Canonical, they're a great example of selling additional services, so training, support, consulting, packaging, so like packages literally. Another example is open core. So you release some of the source code as open source and then proprietary stuff on top. GitLab and Redis both do it. Again, very common. Dual licensing, so Qt and free buy, they do it. So you have licensed your code as copyleft or something that is not compatible with some companies, and then they pay you an extra fee to kind of unlock it and get an enterprise contract. And then you have managed cloud, SaaS. It oftentimes goes hand in hand with open core, when you have some features only in the SaaS product and again for the enterprise. GitLab and MongoDB both do it. You can either run a multi-tenant SaaS or you can just run the product for them. And there was a time that open core was the way to go, and then SaaS became the way to go because of the advent of the cloud. And actually open core is getting some traction again because a lot of people want to self-host with GDPR, all of those kinds of things. People are looking to self-host again. And then another one that's not often talked about is supporting products and services. So that's kind of similar to open core, but you're not actually extending the product, you're providing services on top of it. So good examples are GitHub. GitHub has nothing to do with open source, but they're riding on Git's success. And Android: Google is selling services over an operating system that is open and can be customized by all the other phone manufacturers. You can also combine multiple models, but you should know what your main driver is. You're an early stage startup. You probably have one person, or more likely 0.1 of a person, working on this stuff. So you really need to focus and see what's the main driver for you.
And now, because you're building a billion-dollar business, you're going to have a lot of competition. Other people are going to see your success. They're going to see your opportunity. We have numerous copycats that have launched since we started. And you have to figure out how you protect yourself. So there are a few common ways for open source projects. One way is the license. Obviously, I'm sure those need no introduction here. Copyleft, so GPL. The problem with those is that oftentimes they're not liked by enterprise lawyers. So it's really hard to sell to an enterprise if GPL is just banned from that enterprise. So that's something to watch out for. Then you have the permissive ones, which is what we chose, but they don't provide any protection. And then you have the non-free or faux-open ones, obviously not open source. I understand where they're coming from. Seeing AWS just take your business, host it, and then take all of your revenue stream probably is not fun. I don't know what I would have done in that situation, but I can definitely see how there's going to be more innovation in the licensing space just following the mega-cloud companies. And the last one is CLAs, usually tied with copyright assignment. When you do dual licensing, you need to actually own the copyright. So contributors assign the copyright to you, and then you can re-license. I asked multiple lawyers and founders, and almost all of them said: own the trademark, your license is probably too liberal to protect you. And the trademark is the reason why Mozilla can prevent a competitor from starting Firefox Plus Plus or Firefox Remix, all of those kinds of things, because that is actually confusing to customers. And that's a good line of defense.
And I think a few years ago, the founder of ownCloud actually gave a talk here at FOSDEM, and he mentioned that they never had the trademark for ownCloud, and that prevented them from actually offering a good hosted service, because there were so many clones, and there were so many people that claimed to be the official one, and it was really hard for them to actually maintain control. Another one is network effects and ecosystem. So if you look at GitLab, or even GitHub, even though it's not open source, just because everyone is there, that means the new projects are going to be hosted there as well. That's it. It's really tough to break. But even without a SaaS component, just think about Postgres. There are so many libraries for Postgres, and so many utilities for Postgres across many languages. The ecosystem is vast, so it's really hard to upend Postgres from where they are. So that's, again, another good line of defense. Also remember, you know, plans are worthless. Everything is going to change. You're going to change your assumptions. Everything is really in flux. It's hard to explain how many things change in a single day. But actually, the process of planning is extremely valuable. You know, I'm going to say a word that probably sounds scary, which is financial modeling, but really, you just go and download a spreadsheet, fill in a few values, change it to whatever you need, and that will provide you a lot of insight into the levers that actually move your business. So if, for example, you want to charge people $5 a month and you think you're going to be able to reach 2,000 customers, that means that you're going to make $10,000 a month. Maybe enough for you, maybe not enough to sustain a business, but now you know. Maybe you need to increase the price, or maybe you need to have a different business model.
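The spreadsheet exercise described here can also be sketched in a few lines of Python. This is only a minimal illustration, not a real financial model: the function names and the $50,000 monthly target are assumptions made up for the example, while the $5 price and 2,000 customers come from the talk.

```python
import math


def monthly_revenue(price_per_customer: int, customers: int) -> int:
    """Revenue per month for a flat per-customer price (hypothetical helper)."""
    return price_per_customer * customers


def customers_needed(target_monthly_revenue: int, price_per_customer: int) -> int:
    """How many customers a given price needs to reach a revenue target."""
    return math.ceil(target_monthly_revenue / price_per_customer)


# Numbers from the talk: $5 per customer per month, 2,000 customers.
baseline = monthly_revenue(5, 2_000)  # 10,000 dollars a month

# Assumed target for illustration only: say the business needs $50,000 a month.
TARGET = 50_000
at_5_dollars = customers_needed(TARGET, 5)    # lever 1: find more customers
at_25_dollars = customers_needed(TARGET, 25)  # lever 2: raise the price

print(f"baseline revenue: ${baseline:,}/month")
print(f"customers needed at $5:  {at_5_dollars:,}")
print(f"customers needed at $25: {at_25_dollars:,}")
```

Changing one input at a time like this is the point of the exercise: it shows which lever, price or customer count, moves the result the most.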
So really, just the process of planning is extremely valuable. Now that you know the plan, you kind of know how you're going to make money, it's very important to assess the business opportunity. One common wrong way to do it is to say, well, you know, the software industry is vast. It's $1 trillion. Microsoft has 5% of it. If we do really well, we can get to 1%, and that's a lot of money. Realistically, that just doesn't make sense, and there are so many variables that are hidden here that it just doesn't add any insights. The much better way to do it is bottom-up analysis. So it's essentially you saying, well, you know, we make $5 out of each customer. I think we can get to a million customers next year, and then 5 million 5 years after, and then 100 million 10 years after. And you're probably not going to hit 100 million, or probably you're going to change some stuff in the business, that's fine. But just understanding that where you're heading is a $500 million a year business, that already gives you enough context, and gives your investors enough context to know what kind of business you have. It's also important to ask, are you really selling $2 for $1? Like, is your business model a losing model? So those are a bit more advanced questions. It's better to just do the market sizing in the beginning, but that's just something to keep in mind. And when I asked Peer from Cal.com, it's the open source calendar, I don't know if you know this one, for his top advice, he said that as an open source startup, you actually need a 10x TAM. TAM is total addressable market. So, like what I said, in 10 years, 100 million. And what he meant by that is, you know, you're cannibalizing your own business. If you have a successful open source project, people are going to self-host it. At least I hope. And that means that you're competing with, you know, the most formidable competitor.
They're always going to be at least as good as you are. And you're going to lose, right? So you have to understand that some people are not going to buy the, you know, the extra paid product. And that's important to keep in mind. I asked Nick from QuestDB for his perspective, and for them, they're delaying monetization, actually not charging money. And the reason why I wanted to highlight this one is that that's often a point of ridicule about startups, like, oh, you know, this startup is losing $10 million a year. But what you have to understand is that there are two kinds of losses. There are the losses that don't make sense, where you're just burning money for no good reason. And then there are the losses that are part of a strategy, like Nick's, which is saying, you know, we're only valuable if we actually have an ecosystem around us. We're only valuable if people actually use the product. And after that, we can figure out ways to make money. Or we already know how we're going to make money, but we can defer that. Okay. Now we can talk about go-to-market. Again, a fancy way of saying getting customers. You know, you have to remember that at the end of the day, you're a business, and you need to make money. And paying customers are your business. Again, you can start with non-paying ones in the beginning, but you need customers. One common way of thinking about it is: if you build it, they will come. It worked in the past, but it's probably not going to work for you. You know, it worked for Linux, because that was the first open-source operating system. It worked for KDE. It worked for a lot of Linux distributions. But realistically for you, you're going to have a lot of competition, and the market is most likely very saturated. So people are not just going to show up. So what you can do, though, is do things that don't scale. Again, as we said earlier, what does that mean? You know people.
Ask, you know, email 200 people from your contact list to try your product. Or you're here at FOSDEM. Just go to a relevant area or relevant dev room and just chat to people and ask them to try your product. There are a lot of easy ways that are definitely not going to scale. That's not how you're going to get to 100 million customers, but that's how you're going to get to your first ten. And you want to focus on the easy ones. Trying to sell to Amazon sounds like, you know, oh my god, they're going to pay me $10 million a year. That's amazing. But realistically, it's going to take six months until you get anywhere, and then it's going to fail. But even if it doesn't fail and you succeed, it took you a year, and you're probably out of business by the time you even managed to scratch the surface on that deal. And again, a few that love you are much better than many who like you. Go-to-market is a very common topic with both technical and open-source developers. The reason why is because we're technical. We don't think about marketing. We don't think about sales. We don't think about those things. And so a lot of people had a lot of feedback about that. One, ship as early as possible. Another one, learn marketing. Another one, you know, if you're trying to sell to companies, maybe start by getting the individuals at home first, because that's much easier than getting the approval and the contract and all of that going. Now that you're starting to get customers, you know, there's some traction, you actually have to have some metrics. And by metrics, I don't mean Google Analytics and tracking your customers. I really just mean knowing whether things are even going the right way. And also, if you're tracking the important things, you can also affect them. You can say, oh, revenue is not growing as much as we wanted. I guess we can, you know, do something about it.
And so you really want to track the most important parts of the business. Often revenue, but you know, some precursor to revenue can be a good one as well. And it's really easy to go for vanity metrics. One of the founders, I didn't put a quote from him, but he said that they were tracking PyPI downloads. So like the Python package manager. That sounds great. The problem is, again, CI, all of those kinds of bots, all of those things actually download packages from there as well. And they thought they had a lot of traction but actually had nothing. So that's something to keep in mind. So there are some good metrics that you can use. You know, revenue is always king. If you're making money, great. But also the number of customers using you in production. So a few big companies that everyone knows are already using you to serve millions of customers. That's something. Even if you're not making money now, you can probably make money later. Some insufficient ones are GitHub stars, signups, downloads, Twitter followers. You know, a GitHub star is really just someone clicking a button. Maybe that someone is a target customer, maybe not. Signups, maybe they signed up and then they dropped off because they realized it's not even the product that they want. There are a lot of really false signals here. And also, GitHub stars don't mean that you have a great business. So this one has like 240,000 stars, one of the top starred repos. But it's a set of tutorials. And maybe it's a great business. I don't know. They can spin it into something. But it's definitely not a business at face value. But a lot of stars can definitely indicate that you're onto something. Again, this is from Supabase. They're at 40,000 stars and they're just growing like crazy every year, consistently for two years, I guess. That's something. And that's just an indication of the market share that they're gaining with developers.
So I guess the main question that we need to ask is whether we should even open source. I talked about all of that. And before you boo me off stage, I know this is not the right place to ask this question. But you have to understand it's actually a business decision. At the end of the day, you took money from investors. You promised to return them more money. You got people on board, offered them a job and told them that you're going to be around for five years. So you have to make the best decision for your business. And there are reasons for both doing it and not doing it. And so one thing is, do your customers even care? Maybe they don't even know what open source is. I don't want to go with some cliche, but they just have no idea what open source is and they couldn't care less. Is there a business advantage to being open source? So for example, when you're a developer tool, that means that people can try your product more easily. That's great. Maybe go open source. And does it even make business sense? Are you just going to undercut yourself? Are you going to just kill your own business? So ideology is fine. I've done it for ideology as well with EteSync. It started as an open source project. But you just have to remember that at the end of the day, you're building a business. So there are definitely upsides. An engaged community really helps. Again, so many volunteers outside handing out flyers about projects, really promoting them. And you can also hire from the community. No need to find great developers if you already have developers that know your code base and have been contributing code. Increased trust. Healthy communication that comes with open source. Easier adoption, as I said, with dev tools. And also the product is just more flexible. People can just fork it and adjust it to whatever they need. There are also downsides though. Balancing the community and business needs.
Does this one go to the open source product or does this stay as a proprietary extension? Maintainer burnout is a big topic. It's a very front of mind topic for a lot of people nowadays and it's absolutely true. And the same with support, right? I mean, people are asking questions about your product on GitHub, and you want to make sure you're going to give them the best support possible. But they're not paying you in the end. So you're doing a lot of free support, which is fine. It's part of being a project, but it can also be overwhelming at times. It's really hard to say no to things because everything is in the open. And also startups win on speed. You really have to compete and move fast. And enabling self-hosting and doing database migrations, seamless migrations, all of those kinds of things just makes things so much harder. And maintaining two codebases, which you have to do because you're going to have customizations for specific customers or you're going to have hacks for specific customers. You're going to have a ticketing system where you're going to say, well, you know, customer X asked for this. We have to finish it now. You're not going to be able to do that in the open just because your customers wouldn't want that. And also your secret sauce may be public. And you're really just undercutting yourself on price when people are looking to self-host. So there are downsides. Another quote that I like is by Joseph Jacks. He said, don't open source something you don't want to commoditize, because you're open sourcing your secret sauce. You're open sourcing all of the internal structure. It's actually really easy to mimic or even fork, but definitely to mimic, what you're building. So just assume it's a commodity the moment you release it and make sure that the stuff that you don't want to be a commodity stays proprietary. Okay. So now we can talk about fundraising.
I know this is something that people are often mystified by, even just by the amount of people that asked me about it in the last 48 hours alone. So happy to cover a bit of it. So the first thing is that you're dealing with investors. And when you're dealing with people, you have to understand what they actually care about and understand how they think. So the first thing to understand is the power law of venture capital investments. What does that mean? Let's assume for simplicity that a fund invests in 100 companies. Most startups fail. So that means 90 are going to fail. And they invest a million in each. So 90 are going to fail. Okay. They already lost $90 million. And now they have 10 live companies. Let's say five of them are going to return, sorry, eight of them are going to return 5X. Okay. That's a good outcome. They invested for a certain amount of money. It's now worth 5X that amount. Great return. So that's $40 million that they made. And then the last two, so they lost 90. They just made 40. So they're down 50. So the last two, the last two companies really need to carry the whole weight of the whole investment fund to actually make a return. So that means the last two probably need to return 100X each. So if they invested at a valuation of 10 million, it needs to be worth a billion now. And really the whole venture model is based on the fact they're trying to find these one or two companies in their whole fund that are going to be 100X. Facebook was a 2000X or something like that. Uber as well. They're companies that are definitely outliers. Apple as well if you look at the whole 30 years. So the moment you understand that they don't care about your 5X, because that 5X was $40 million out of the 240 that they made, you really have to position yourself as that 100X. And you also have to understand that they have investors.
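The fund arithmetic above can be checked with a short Python sketch. It uses only the simplified numbers from the talk (100 companies, $1 million each, 90 failures, eight 5X returns, two 100X returns); the function name and data layout are made up for illustration, and real funds are of course messier than this.

```python
def fund_outcome(check_size, outcomes):
    """Return (invested, returned) for a fund.

    outcomes is a list of (number_of_companies, return_multiple) pairs.
    """
    invested = sum(count for count, _ in outcomes) * check_size
    returned = sum(count * multiple * check_size for count, multiple in outcomes)
    return invested, returned


# Simplified numbers from the talk, in millions of dollars.
outcomes = [
    (90, 0),    # 90 companies fail and return nothing
    (8, 5),     # 8 companies return 5X
    (2, 100),   # 2 outliers return 100X
]

invested, returned = fund_outcome(1, outcomes)
from_5x = 8 * 5 * 1  # what the eight 5X companies bring back

print(f"invested: ${invested}M, returned: ${returned}M")
print(f"share of returns from the 5X companies: {from_5x / returned:.0%}")
```

The eight "good outcome" companies account for 40 of the 240 million returned, which is why a 5X pitch does not move an investor much in this model.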
So even though you're talking to that person and they love what you're doing and they're open source developers themselves, they really want the product, whatever, they have investors to answer for. They told them, hey, I'm going to invest your money and I'm going to give you that amount of return. So really, you have to make sure to understand that dynamic. It's also, it's their job. I mean, here I wrote, you know, stay in touch, but really it's their job to waste your time. And they want to like meet you. They want to like know more about the space because they're going to invest in a competitor and they really want to understand the space better. They want to know about all the cool companies so they can tell their friends at the country club that, I mean, that was maybe a bit of an extreme joke, but they want to tell their friends that they've talked to all of these cool companies that are just doing very well now. This is not your job. Your job is to build a company, not to waste your time talking to investors. And another thing to remember is that not a yes is a no. So investors want to invest in the really good companies. As I said, they want to invest in the hundred extra. It's really hard to know if this company is that. So what they're going to say, they're not going to say, well, no, this is stupid. I'm not going to invest in you. What they're going to say is like, oh my God, this sounds amazing. I need to build my conviction for a few weeks and I'm going to like do some research and I'm going to like talk to you then and then they disappear. But if they see that some big name investor is talking to you or there's some traction there, then they're going to come back and complain to you and say, well, I told you I was interested. But really what they said in the beginning was no and you're never going to hear back from them again. So it's really important to count that as a no and not to count that as a yes. 
And another thing is that you want to believe the no, but not the why. So again, they want to invest in you, maybe not even in this company, but maybe in your next company, but they want to invest in the best companies. And if they're going to tell you, you suck, you're not going to talk to them the next time. So they're really going to say something nice like, oh, you know, our fund is slowing down its investment so really this is not the right time for us. But let's chat again in a couple of months. Really what they mean by that is leave me alone. I don't want to invest. And so don't count it as a yes in a couple of months. Just count it as a no, essentially. Also, like everything, it's a business. You want to have a plan. How much are you going to raise? What are you going to actually do with the money? Like, you know, it sounds good. I'm going to raise $2 million, but are you going to hire people? What are those people going to achieve? You know, figure those things out. How much of the company are you willing to sell? Are you willing to sell 90% of the company for $1 million? Are you willing to sell 10? Those are all questions that you have to come in prepared for because you're dealing with professional negotiators that are going to rip you to shreds if you don't have a strong stance on any of those. Another thing is to speak to investors all at once. So investors talk, as I mentioned, and they will talk about the cool companies that they are chatting with. And if they're going to see that you've been talking to investors for the last year and you haven't raised a round yet, that probably means you're tainted goods and they're not going to want to invest in you as well. So the right thing to do is probably to try to raise capital for two, three months, and after that, if that doesn't work out, just stop, go back to building your business.
If you're a developer, you probably don't need that money anyway. You can just continue building the product and then come back stronger in a few months. I'm going to rush this one a bit, but if you want to find investors, start with angels. Don't chase Sequoia. They're not going to invest in a small company, or they very rarely do. Don't waste your time. Build traction with angels, like small checks, and find people that are excited by the space, invested in similar companies, ask your friends. But really, the most important thing is don't take intros from investors who haven't invested. Investors want to look good, and again, they want to look like they know all the companies. So they're going to tell you, hey, I'm going to introduce you. I know a guy at Sequoia. I'm going to introduce you to him. I know a girl at that fund. But really what happens is that if they haven't invested in you themselves, that just signals to Sequoia that you're tainted goods, because they don't want to invest in you, so why should Sequoia? So that's a really bad strategy to accept those intros. Just say that you already have enough traction and interest. But once they invest, by all means, that's the strongest signal. Someone put their money on you, have them introduce, have them go on a building and shout to the world that they've invested, and get introductions from them. This is actually where I see founders err the most, maybe not because that's the most common place for mistakes, it's just the place that I'm exposed to the most: their decks and their pitch. You want to really keep it short. You want them asking for more, not going through 80 pages of boring stuff about whatever. You want to keep it short, 10, maybe 15 slides, really on point. You want to highlight what's most impressive, and in your case, it's probably team, maybe idea. Maybe if you have a really impressive set of customers, that's great. Highlight that.
But most likely, you don't need to analyze your $2,000 a month revenue, because again, they don't care about that. It's not something that they see impressive. So focus on the stuff that are impressive. Another thing that people often get wrong is they confuse their customers with investors. So they go to the investor and assume the investor has the problem. The investor actually really cares about, I don't even know, I don't have an example at the top of my head, but they really care about this topic, and they understand the problem, and they face it themselves, and they really feel it. And if you told them about all the research that you've done, about this product, and how you solve it, they're going to really be interested about that. The most likely scenario is that they don't even know what you're talking about. It's not their job to be subject matter experts in whatever it is esoteric thing that you're building. They just need you to demonstrate that there is a problem. People really care about that. People are going to pay you money. This is how much money you're going to make. This is how you're solving it, not necessarily in this order. This is who you are and why you're the best person or best team to solve it. And obviously, you want to know your business inside out. You want to be concise, so let them ask questions, always iterate. Every after every call, I would go back and make sure that I get better after every call. On the call, be calm, relax, don't look desperate. You can sense desperation a mile away. When we were fundraising, I knew the worst case, I'm a developer, I can build a product myself. I don't need their money. I would love to have it, but I don't need it. That made it possible for me to relax and show that confidence that they're looking for. You're also assessing them. Ask questions, figure out if you want to take their money, because you're going to be with them for the long run if they're going to invest, so that's important. 
Also, only the CEO should go on calls. Don't do party conversations. First of all, it's just a distraction, but also, you're going to get so many noes, and it's much better that fewer people get pounded with all of these noes, and the other people can actually focus on the business and continue. So the deal, this is mostly a slide for you to read afterwards, but there are a lot of topics like, is it simple, is it fast, or are we going to spend 200,000 on lawyers in this deal? What are the terms? Pro rata, liquidation preference? All of those kinds of things you need to understand before you jump on that call, and make sure that what they're offering is what you're willing to accept. One thing that really grinds my gears, by the way, is investors that tell you, we invest 500,000. 50% of that is in money, and 50% of that is in services. So we're going to let you use our office and use our designers. I don't want your services. I'm going to hire my own designers, and that is such a rip-off if you ask me, and avoid those if you can. So final notes about fundraising. You only need one yes. When preparing this talk, I was thinking in my head, how many noes did I get when fundraising? I thought probably one, two. Then I looked at my notes, it was probably close to 50, but I really couldn't care less, because that doesn't matter. I only need one yes. You probably need more than one yes, you're probably going to get a few investors, but once you get a yes, the flywheel is rolling. Ignore the naysayers, ignore the assholes, you're going to meet some really shitty people along the way. They're humans as well, so just ignore those. Fundraising is definitely not success. Fundraising is definitely not company building. You need to actually build your company. This is not the end, this is just the beginning. You may not be able to raise, so set some expectations, be ready for that, have a plan B, that's absolutely fine as well.
As I said, it's very hard to fire investors, so watch out who you put in your cap table and who you marry. Another small point is that many investors only invest in certain locales, so a lot of investors only invest in the US. That's something to keep in mind. YC, big fan. They fund a lot of open-source startups. They've been transformational for us. I would really consider applying. We got rejected many times. Don't let it hold you back. There are a lot of great open-source startups, GitLab, Docker, Supabase, PostHog, really the list just goes on and on. You're going to be in good company. One thing is that you need to be aware of the siren song. The tempting places where things usually go wrong for people. One place, as I said, is fundraising or valuation or money raised, or thinking that talking to an investor is actually success. But again, no, you need to build your company. Another area where people go wrong is thinking that hiring more people is actually success. Managing 20 people. How big is your team? People ask me that all the time. We're small. The reason why we're small is because that's what we need to build a kick-ass product. This is something that you need to keep in mind. Burning money does not equate to success. Playing startup does not build a company. Going to pitch competitions, startup conferences, startup meetups, investor chats, all of those kinds of things. Fun. Chasing press as well. Maybe fun. Honestly, probably fun. But it really just doesn't mean anything for the bottom line of your business. Just don't do it. Also remember, you really have not made it. Really don't lose focus. The only things that matter are customers, product, team and metrics. That's it. Everything else is just noise or maybe an enabler to get to everything else. So some lessons I personally learned the hard way. One is that you're no longer an engineer. I really like coding. That's it. I wake up, I love coding.
And it's really hard to remember that this is not why you raise capital. Like this is not what my job is at the business anymore. I'm the CEO. I need to do just business stuff. I can help with the code. I can help with the architecture. But my main time should be spent on go-to-market. Opportunities are not going to last forever. Just grab them when you see them. It's very easy to say, I'm just going to perfect this landing page for a couple more weeks and then I'm going to take this opportunity. But then the opportunity is going to be gone. We missed a few in the beginning. People are going to be trying to take advantage of you. So just watch out. You're running a large budget. So just watch out for those people. Be mindful and protective of your time. So a lot of people are going to want to chat, catch up, whatever. Great. But as I said, you're going to be working 24 hours. So maybe spare the 25th hour for some fun stuff or maybe more work. And the last thing is that you really want to surround yourself with optimists. The truth is that your startup sucks. And it's going to suck for the longest time. So the last thing you want is people that are negative around you. A few more things. Team, super important. Early team sets the tone. Bad hires can really derail the company. Great co-founders can make the difference. I've seen it multiple times. But actually better to start on your own than having a terrible co-founder, which I've also had. You're really in control of your destiny. If something isn't working, you can change it. The only thing you cannot change is the market. But really just iterate, iterate, iterate. You can get lucky. We got lucky many times. Just put yourself out there. Talk to customers and you're going to notice something that no one else has and you're going to get lucky. And it's really a game of persistence. You can't lose unless you stop. Obviously I'm not saying go and bang your hand against the wall. 
You have to have some, you know, some acumen there. But as long as you don't give up, you're going to win. So just a few closing pieces of advice. It's very easy to think that hiring X will fix everything. Or customer X, once we get them, that's it. The business is going to go crazy. Or maybe this investor is going to change the destiny of the company. And maybe. Maybe some of them will. But it's really just down to you to actually fix the company and make it work. And you have to remember everything is sales. So you have to really get good at it. You want to get customers. You're going to get candidates. You're going to get investors. You have to sell them on the vision of the company and the company itself. Some more sources of great advice. YC YouTube channel, library. I mean, check it out afterwards in the slides. And yeah. And that's it. I just want to give you a few closing words before I open the floor for questions. So make sure that it's really what you want. I mean, I kind of tried to dissuade you a bit in the beginning because really you need to want to start this company even with my dissuasion there. And the only important things are to talk to your customers and build a product. Nothing else matters at this stage. You want to build something that people love. You have to remember it's a business. It's not, you know, an open source project. It's something that actually has to make money. And this is the primer, but there's so much more to it. And this is just to get you from zero to one. Every stage of the company is different and there's so much more, and maybe conflicting advice, once you get to the one and once you get to the two. And one more thing. It's your job to be the number one evangelist of the company. And so always be selling. And in that spirit, if you need webhooks or you want to work at a fast growing startup, yeah, come talk to me about Svix. And that's it. Any questions? Yeah. Sorry, a bit over.
Amazing talk. Thanks a lot. |
Clear skies, no clouds in sight. Running a 14 person company on only free software.
They say it can't be done, they say it's too much work. But is it really? After 5 years of running Prehensile Tales on entirely free software I think I can answer this. |
So, welcome to my talk, Clear Skies, no clouds in sight, running a business on free and open source software. A little bit about myself, my name is Hein-Pieter van Braam, but if you talk to me, please call me HP, my mom does, and it feels much more at home. I'm a co-founder of two Godot-related companies, Prehensile Tales BV, which is a Godot consulting company, and recently Ramatak Inc., which is a company that plans to improve the Godot mobile experience, always be selling, I just heard. So, you know, I'm also the treasurer of the Godot Foundation, a gigantic nerd, and I started a free and open source code hosting site called NotABug.org around the time when, what was it, Gitorious shut down, I think, and my personal blog is at blog.tmm.cx. So what is this presentation about? First of all, can you run a business entirely on free and open source software? Should you run a business entirely on free and open source software? Are you mad? And by that I mean me, maybe some of you too, I don't know. What does it mean to do this, and how can you actually do this? But this presentation isn't a total lesson on managing a large fleet of servers, there are other talks about that, there are other people specializing in running data centers with thousands of servers, this is not this, this is for a reasonable amount of people on a reasonable amount of servers. It's also not me starting a flame war about configuration management systems, so please don't @ me if I say something you don't like. I am going to say some things you don't like if you are a sysadmin for a much larger organization. So first of all, can you do this? Well, I've polled all the speakers on this stage and 100% of them say yes, so that's a good start, I think. So why would you do this?
That's I think a question that a lot of people might ask themselves, I mean we're all here at FOSDEM, you're here, so I don't have to talk to you about the benefits of open source software, why you should use free software, et cetera, et cetera, et cetera. That's not what I'm here for. So first of all, you can use the free tier of cloud services for almost everything, you can get most of the features. But once you start to get to a certain size, even if you're only a small company, you end up with five, six, seven accounts for individual users, and it is very difficult for people to figure out what user to use, what passwords to use, so you either end up with people using very insecure passwords, or you end up having to force a password policy on them, and then they're just going to write it down or whatever. And someone still has to fix these issues if they happen in your company, now you are suddenly the help desk for people forgetting their passwords, at least that's been my experience. So access control is actually the real big one. Once you start to mix a bunch of different cloud services on basic tiers or free tiers, figuring out who has access to what, if everyone has the access that they need, and most importantly whether people have more access than they need, is hard. It is very difficult to keep track of which document was shared with whom, why was it shared with that person, is it an individual that shared it with another individual, is it someone that maybe left the company now, but they shared it with their public or private Google accounts, etc., it's very difficult to keep track of these things. This could be a problem if you sign non-disclosure agreements with customers, other people that you work with, and depending on the business that you actually run, it could be a compliance issue, particularly now with GDPR, etc., there's a lot of things you have to keep track of.
Of course, I see some of you think, but you can already do this, in fact, my company already does this, yes, it is certainly possible, but usually, if you want these level of features and these level of integrations, you end up having to pay actually quite a bit of money for individuals, and only to get this fairly basic level of control over your data. It also does work if you say, okay, I'm going to pick one SaaS provider, if you say, okay, I'm going to use only Google products, I'm not going to use anything outside of Google products, if Google doesn't have it, I'm not going to use it, that, of course, also works. So, is this all about money? Well, not really, I will get to that in a minute, but the first thing I usually hear from anyone who talks about doing something like this is either a knee-jerk reaction as in, but it's actually more expensive to do it yourself, I don't think that is true at least, it's not true for me, so at least this means it's not a truthism that is definitely applies to all cases, or of course the other one, it's only free if your time is free, that is also fair, but having tried this in a different way, supporting individual people in a bunch of different time zones in the case of my companies, with, oh, I don't have access to this document, can you send it to me, or I have people working in North America who share documents or fail to do it correctly, and a person working in Poland needs it the next day, due to the time zone difference, they won't get their document until 5 p.m. 
because nobody but the person who created it had access to it, this is also a time sink, and of course if the worst came to worst, failing in a compliance case is probably not kind of a free either, so what is this about then, if it's not just about money, first of all I just like being in control of my data, I like to know where my data is, I like to know where it's backed up, and I like to know that if I need to, I can just stick it on an external USB drive and take it with me somewhere, sometimes you just want to hack your data, I like having unsurprising costs, I know exactly how much my infrastructure is going to cost this month, I know how much it's going to cost next month, if I need to for a new project, whatever, add five external people to give access to some shared resources, I know I'm not going to end up paying 150-200 euros more next month, that I don't need to forget to not cancel again because otherwise I end up keeping paying it. It gives me a big peace of mind when it comes to access control, what I just talked about, I know exactly which one of my users have access to what data, I know why they have access to it because it's based on group memberships, and I know, luckily this has not happened, but I also know if something went where to go really wrong, I can in fact make sure that people stop having access to the data that I don't want them to have access to anymore. By giving people access to just one user ID and one password and one way of logging in, I feel like I'm making their day just a little bit better and that makes me happy. Just using free and open source software is nice, I mean most of the people here are here for a reason as well, so I'm assuming that most of you agree with that. If I run a laptop with just free software because it's important to me, why would I run my business on anything else? It feels very good to be part of a community rather than just a customer. 
I do pay for certain software, let's get into that later, sorry, but even there you end up not really being treated so much as a small customer, you tend to be treated more like a community member to a certain extent. Also, they appreciate good bug reports more, which is always nice. But isn't this like super-duper hard? I don't think so. It is true that you need to know some things that you might not need to know otherwise. Particularly if, say, you go all in on Google Workspace or you go all in on Office 365, you don't need to know a whole lot about infrastructure at all. For this you do, I'm not going to lie about that. The most important thing is actually DNS, tying all of that together, making sure that your emails don't end up in people's spam folders, et cetera. Most of this actually boils down to a basic knowledge of DNS. You're going to need to know a little bit about how to run a web server, how to administer a web server and what web servers do, but not as much as you might think. And of course a basic understanding of the Linux CLI, or I guess the FreeBSD CLI, if that's what you decide to do. But it's a similar thing. So most people when I talk about this have a perception that building something like this ends up with a cognitive overhead that looks something like this. It's an extremely complicated set of tools where every individual thing that fails could break everything. But in reality it's really not that bad. You end up with a bunch of containers running on one or more hosts with a simple set of configuration files that you stuff into Git and that's basically it. You don't manage 15 things, you manage six things or seven things. All right. So how do you do this kind of thing? Well, the most important thing really is to keep things simple. As soon as you start looking online about managing Linux servers, et cetera, you end up finding a whole pile of wonderful tools.
I'm not trying to tell you that you should not use these tools, by the way. I'm just telling you that in this case, if you're just managing two or three servers, some of these tools might not be what you need. And trust upstream. When you use a cloud provider, you're just trusting that your cloud provider does the right thing at any given time. If you use a piece of open source software, if you use an open source operating system, if you use an open source nginx or other web server, trust that upstream tests the stuff they do. You don't need a five-stage test environment just to install a security update on your operating system. So how do you keep things simple? As I said, there's a lot of really great tools like Puppet, Chef, Ansible, Juju, CFEngine, Bcfg2, SaltStack, and probably one that I forgot. I think there might even be some on this floor that I forgot to include in this. My apologies if that's one of your projects. These tools truly are great. I have been a sysadmin at a large company and without tools like this, I would have absolutely lost my mind. However, setting up services using these tools is a lot of translating between documentation you read about the tool into a configuration file format that isn't the one that is documented on the site of the thing you're trying to configure. And there's a lot of testing involved, which is great if you need to repeat this on 100 servers, but if it's one box you need to do and that you maybe need to redo in a year, it's a lot of cognitive overhead. And if your business grows to the point where this becomes necessary, then the overhead of converting what you did now into a Puppet manifest, Ansible or whatever is probably not the thing that's going to prevent you from growing. In that same vein, as soon as you start to look at, okay, how do I run containers? How do I do that myself? Again, lots of great resources, lots of super great projects.
But if you search on Google, okay, how do I run a server with containers? One of the first links you're going to find is, okay, you're going to set up Kubernetes, you're going to create an undercloud and an overcloud, and you're going to set up redundant, I don't know, load balancers, that was the word I was looking for. And you're going to set all of this great stuff up, and at some point when you have a thousand services, it's going to be just as easy as if you had a hundred. Yeah, and I can understand why, if you start reading that, you get incredibly overwhelmed immediately, because that is a lot of stuff to learn. And again, you're not going to need it for what we're talking about here. Again, I think all of these projects are great. I have used many of them. This is just not about that. Please don't @ me. So what does it mean to use simple software? So in this case, use what is supplied by the OS that you use. Every piece of software you install that is not in the repositories of the operating system you use is something that you might regret later, because that is something that is not tested by your upstream. This is again where we come to the trusting upstream part of things. So in this case, we're going to get there, I'm using Fedora for most of this stuff. Use Podman, not Docker. Use systemd, not whatever else nice thing you might come up with. Use WireGuard. Don't try to do anything too creative, and use Git for your configuration management. And this is probably the most controversial point of this presentation. Use a recent OS. You do not need CentOS. You do not need AlmaLinux. You do not need Red Hat Enterprise Linux. You need something where, if you read the documentation, it is the last thing that people actually touched. People are not writing documentation, generally speaking, on how to set up a modern thing on an ancient enterprise distribution. Again, these things have their places.
I'm not trying to tell anyone that these things are bad. Again, this is about you managing your time and making sure you don't spend a lot of time doing this. And upgrading your server every 12 to 18 months really isn't a big deal. Again, trust upstream: dnf system-upgrade, release for the next one, and 20 minutes later you have your operating system upgraded. It's not a big deal. Another thing, use pre-existing containers wherever possible. You can of course write your own and sometimes you will have to. For the demo I'm about to give, I did have to write one. And don't forget that upstream tends to prefer using their containers over a different way of deploying their software. So generally speaking, if you have an issue and you say, okay, I'm running your latest container version, you're actually more likely to get help than if you installed it any other way. Of course, what do we do about backup and disaster recovery? Well, if you follow the recipe that I'm going to talk about, then basically you have two directories to back up on each server. And if something goes wrong, you have two directories to place back on the server. And basically your stuff is back. I will hear the people managing infrastructures for large corporations saying, but this will take hours. Yes. It also almost never happens. And this way you can be sure that your data is not gone if something goes wrong, without breaking the bank. Generally speaking, I've been doing this for quite a while now and I've had one server fail due to a hard disk failure. Of course, the server was just using a RAID 1 mirror. So I just called the people who run the data center. They replaced the disk and there was actually not really a problem there. I would like to now introduce megacorp.icu. This is a very big company that has graciously let me demo their environment. Completely non-evil.
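As an aside on the backup recipe just described (two directories to back up per server, two directories to put back), here is a minimal Python sketch of that idea using only the standard library. The talk does not say which two directories are meant or which backup tool is used, so the function names and any paths you pass in are assumptions for illustration, not the speaker's actual setup.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_directories(directories, destination):
    """Write one compressed tar archive per directory into destination."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    archives = []
    for directory in map(Path, directories):
        archive = destination / f"{directory.name}-{date.today().isoformat()}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            # arcname stores paths relative to the directory's own name,
            # so the archive can be unpacked under any parent directory.
            tar.add(directory, arcname=directory.name)
        archives.append(archive)
    return archives


def restore_directory(archive, target_parent):
    """Unpack an archive created by backup_directories under target_parent."""
    Path(target_parent).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_parent)
```

A call might look like `backup_directories(["/srv/config", "/srv/data"], "/mnt/backup")`, where both source paths and the destination are hypothetical placeholders. In practice you would also copy the archives off the machine, since a backup on the same disk does not help with a disk failure.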
They let me show you their environment because Prehensile Tales, my own company, the CEO, thought it was irresponsible to have the internal systems recorded at the Free and Open Source Developers' European Meeting thing. I don't know what they're thinking. Megacorp has nine totally real employees. They were absolutely not AI generated. The thing I'm about to show you is a setup that roughly looks like this. There are three hosts running on this laptop, hopefully. Configured as such. So we have three hosts with an overlay network on WireGuard to make sure that these hosts can talk to each other over local IP addresses. The reason for that is, when you do something like this, you're probably not going to be able to afford, or want to pay for, dedicated rack space with your servers nicely lined up above each other, connected to each other over local connections. So you need some way for your servers to feel like a LAN. By doing a simple WireGuard mesh, any one server can go down and the other two can still talk to each other, which is a nice feature to have. And now, a hopefully non-colossal mistake: I'm going to try to actually show you what this looks like by using a system like this. So first of all, this is the basic admin interface you might see if you're using something like this. So we have our nine users here in LDAP. And we have a couple of groups here that you can manage. So I've made a human resources team of which Chadwick and Maxwell are members. I made some shared mailboxes that the human resources team can also look at. So what this looks like here is, for instance, let's see, open a new container tab. This one I think. All right. So for a little explanation, I'm using a feature of Firefox here called container tabs. Basically in the top right here you will see the name of one of the fake users. I mean, totally real users. 
And basically Firefox will keep all the cookies, etc., separated so that I can actually show you what it means to log in from multiple systems. Just using a private tab would only give me two sessions, whereas Firefox container tabs basically give you an unlimited number of private tabs, essentially. So that's what we're doing here. So the first thing we can do is, oh, this is the wrong one. This is the real one of my real company. I was smart enough to make bookmarks and I didn't use them. So the user name of this person is Cadence. So first of all, this is basically the login screen that every user will see, hopefully only once a day. The way this works is you log into the system, there is a cookie that gets set for Keycloak in this case, and Keycloak then does all the authentication with the further back ends. So generally speaking, if everything works properly, your users will only have to log in once a day, also meaning that your password policies and things like that can be a little bit stricter, because they really only have to remember one password and they really only have to type it in once a day. So, oh yes, this user actually doesn't have access to this. That was part of the demo, but I forgot that in this case. We will get there. Let me show you something that the user does have access to, their email box. So you see that I just switched from Nextcloud to the email server, and you also noticed, hopefully, that there was not another login prompt. That is, again, because these single sign-on systems, if you deploy them properly, are in fact single sign-on. You sign on once and then it should just keep working. So this is the email box for our user Cadence, and we can see we have an address book with all of the people, all of her colleagues, in it. First in here is Chadwick Parsons. These names are also all AI generated, by the way. So what we can do is we can send an email to Chadwick here. Am I doing the right thing? Send email. You can send a little email to Chadwick here. 
And now, hopefully, when we switch to Chadwick, there we go, Chadwick there has received an email. You see also the small little things, like the fact that all of the users have their own little profile pictures, et cetera, and these are all consistent across all of the applications as well. So if you look at Chadwick, Chadwick does have access to Nextcloud. There we go. And you see that this actually shows the same... If you click the same button twice, the same thing happens. This is very surprising. You see here that the user has this... Oh, sorry. You see here that the user actually has the same profile picture as well. These profile pictures all came from Active Directory... thank God, no. They came from OpenLDAP as well. So what can you actually do with this? What is the big thing that I kept talking about? For instance, here we have a human resources folder that Chadwick has access to because Chadwick is part of the human resources department. And so there is... Maxwell is also an HR person. So Maxwell and Chadwick... So far, everything's still working. Very good. You see that Maxwell here has access to the same document. And there's not been any sharing here. It's just a matter of being part of the correct group. I will show that in a second. But if you look at the access that... This is all about Kelly. Thank you for getting that I made these nice handy dandy. It's not exactly single sign-on if you're doing a demo with five different users. It is only once per user, not once per demo, I'm afraid. You see here that the user, Alba here, does not have access to the shared directory and cannot see it. And trying to just send a link to this other person will also not work. As you see, you just get a file not found there. Now, what if Alba does become part of the human resources team? Well, at this point, it's just a matter of adding the member here. And then, in a couple of minutes, hopefully, maybe immediately, we'll get back to that. 
What's the next thing on my list of things to share? I also created this user, Maria J. What you've seen so far is basically what it looks like for people who have been working for a while, who already know the system. What does this look like for a completely new user? Because that's another problem, right? Getting people signed up and on board in your organization. Okay, forgetting I made these. So, let's say you tell your new user, okay, the first thing you need to do is just go to the chat system to say hello. So, we have given the user their initial temporary password. I sign in and you see that you get just this one little prompt for people to change the password. And after that, they're just signed up like everyone else. So, this is basically the only real onboarding that your users have to do. So, they don't have to create accounts for multiple things. You don't have to create accounts for multiple things. I don't end up having to tie a personal Gmail account to some kind of Google Drive, nothing like that. And there we are. The same thing goes for things like access to, for instance, channels on your instant messaging server. You see here that Chadwick, being part of the human resources group, has access to this channel HR that is not visible to this user, Maria Johnson. So, Maria Johnson is not part of HR, does not see the group. This is not something that you have to manage yourself either. We also see here that since Alba was recently promoted to be part of the HR group, she now has access to the HR channel as well. Again, this speaks to the access control that I was talking about previously. You don't have to think about, okay, what things does an HR person have access to? Did I give them access to the right folders? Did I give them access to the right chat, email address, et cetera? You only take care of this one time, when you set this up, basically. 
So, there's a couple of other things that people like to use, like editing documents together. We can do that. If I, probably should have kept more tabs open. It turns out that navigating a computer is much easier when you're writing a talk and you're sitting down instead of facing a room of 700 of you people. Well, there's not 700 of you here, but you know. So we have two people logged in here, and you'll be able to edit things collaboratively much like Google Docs, et cetera, as well. So you don't really need to lose any features over any of this either. Let's see. We have the chance. Yes. So these are the services that I added to the demo for this demo. If anyone has any questions about what other things you could easily do, we have a question section in a minute. Let me go back here. So I'm assuming that it went super well when I wrote this. I'm going to give myself three and a half out of five stars on that one. So can you do this? I think this little demo shows that for most purposes, what people are used to on a daily basis, using their systems, collaborating with each other, et cetera, it is, in fact, possible to do it. I think the user interface is far more consistent than if you compare this to a hodgepodge of different cloud services with different login systems and different access control systems. Again, predictable and controllable cost: you know how much it's going to cost when you start the month compared to how much it's going to cost at the end. I think you end up spending less time managing access control. You just have this one list of groups and members. You look at who is a member of the group, you know what access they have, and you also know that if you take them out of the group, they will cease having that access. That way, you also lose less time on people not being able to access things that they need for their jobs. 
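The "one list of groups and members" idea can be sketched in a few lines of Python. This is only an illustrative model, not how Keycloak, OpenLDAP, or any of the demoed services are actually implemented: the `Directory` class, the `SERVICE_RESOURCES` table, the group name `human-resources`, and the user `new_hire` are all hypothetical names made up for the example. What it tries to show is that each service only asks the shared directory "which groups is this user in?", so adding or removing one membership changes access everywhere at once.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Hypothetical stand-in for the shared directory: group name -> members."""
    groups: dict = field(default_factory=dict)

    def add_member(self, group: str, user: str) -> None:
        self.groups.setdefault(group, set()).add(user)

    def remove_member(self, group: str, user: str) -> None:
        self.groups.get(group, set()).discard(user)

    def groups_of(self, user: str) -> set:
        return {g for g, members in self.groups.items() if user in members}


# Assumed example data: every service ties a resource to a single group.
SERVICE_RESOURCES = {
    "files": {"Human Resources folder": "human-resources"},
    "chat": {"#hr": "human-resources"},
    "mail": {"hr shared mailbox": "human-resources"},
}


def visible_resources(directory: Directory, user: str) -> dict:
    """Return, per service, the resources the user can see via group membership."""
    memberships = directory.groups_of(user)
    return {
        service: sorted(
            name for name, group in resources.items() if group in memberships
        )
        for service, resources in SERVICE_RESOURCES.items()
    }


if __name__ == "__main__":
    directory = Directory()
    directory.add_member("human-resources", "maxwell")
    print(visible_resources(directory, "new_hire"))  # nothing visible yet
    directory.add_member("human-resources", "new_hire")
    print(visible_resources(directory, "new_hire"))  # folder, channel, mailbox
```

Nothing in the sketch is configured per user: the only administrative action is the membership change, which mirrors the point made in the demo about the HR folder and the HR channel.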
So as part of writing this talk, I was setting up megacorp.icu, various containers, et cetera, and taking some of the containers that I was already running in production, kind of filing off the serial numbers of the specific things that I used them for. And I realized that maybe these could be useful for other people as well. So I created a GitHub project called megacorp.icu that has the containers that were used in this talk in it, as well as some of the tools that I used to manage this. Perhaps if other people are interested in running their own infrastructure in at least somewhat the same way that I'm proposing, perhaps we can do some kind of collaboration with each other and do a sort of sysadmin support group, I don't know. So, questions. I think I'm a little ahead of where I was supposed to be, but maybe there will be a lot of questions. A little plug, because I was just told that you should always be selling. My company, Ramatak, is actively hiring people. So if any of this sounds like it could be you, please contact me. Okay, so we've got one roving microphone for questions, but we're going to go online first, because there are some online questions. Yeah, the chat's blowing up with this question right here. What do you think about so-called totally automatic solutions like YunoHost or Sandstorm? Would they also be sufficient for a small business? So I actually started off using Sandstorm, but found that the access control that Sandstorm offers is actually not quite... So with Sandstorm, what usually ends up happening is that you deploy different instances of the same application that different people will have access to. What I found is that moving people between groups didn't work properly for that. Specifically, if somebody gets kicked out of a container, sharing things between things tended to get a little weird. I would say that if it works for you, use it. It didn't work for me personally. So I have another happy story. I started with a small startup. 
I was the system administrator. I set up. Okay, close up the question, please. So I started to install all the FOSS infrastructure. It was beautiful. Nextcloud, we had Aldo and many other things. And then the company got bigger, and then the president of the company said, I don't like how Nextcloud links calendars to email. And so the next month, we switched to Google Docs completely and entirely, and there went all my work of two years, and I cried. So how do you deal with the users? I mean, you are the boss. I'd love to have you as my boss. But how do you convince users to actually comply? Well again, do you need me to repeat the question? So in my case, none of that really applies because I am the boss of these companies. I'm the CEO of both. So to a certain extent, if you don't like it... But in practice, people tend to like it. Of course, if you have someone that says, okay, I only know how to use this one thing and I refuse to use anything else, and that person is the CEO of the company, that is a problem. I don't really have a good answer for you other than, yeah, try to convince them that there are some benefits to this. I mean, in the end, it's their money, right? So if they're willing to pay $30, $40 a month per user because they don't like how these things are linked, there is only so much you can do at some point. Okay, we're going to take an online question and then I'm going to go over there for the next question. Keep your hands up if you want to ask a question. I've got a chance to keep an eye on you. Okay, so what do you think of Cloudron? So can you repeat the question, please? What do you think of Cloudron? I'm not familiar with that. Have you heard of Cloudron? Are you aware of Co-op Cloud? That's another question on the... I am not, no. All right, some good ideas here from the people. What do you use for documentation? Can you repeat the question, please? What do you recommend for documentation, like an alternative to Confluence or Notion? 
So is the question, what do you use for video conferencing? No, no, no, for documentation, internal documentation for the company. Oh, what do you use for internal documentation for the company? Yes. Yes. So in the case of Prehensile Tales, I use XWiki, which also integrates beautifully with this whole thing. People get access to subpages based on group memberships, etc. Then there is user documentation that's not intended to be changed. So I wrote a little manual on how to do certain simple things on the mail server. Those are just files that are in a shared folder on Nextcloud that everyone has access to. Thanks for the talk. Very interesting. One question: I guess all your services are publicly available. So does Keycloak support something like two-factor authentication? Yes. Keycloak does support two-factor authentication. You can simply add, actually, let's not do a demo for something I haven't looked at. I will not make that mistake. But yeah, you can add an authenticator app to your account on Keycloak. And I do that for myself, because my account tends to have admin privileges, etc., on a lot of stuff. It is also possible to configure Keycloak to make it mandatory for every user that logs in to set up two-factor with an authenticator app. But yeah, there's one thing to keep track of though, because this is something that can, in fact, lose you a lot of time. If you do set it up, make sure you have an NTP client running on all your servers, because that's going to take you a while to figure out. Yes. I have a question. How do you deal with security, since you're running Nextcloud and Keycloak on premise? You said in some slides before that you update your systems, like, every 12 to 18 months. That is not a problem. But then again, security issues, like, rapidly rise up. And especially if you're on the public internet or facing the public internet, you run into, well, huge risks when you are not on top of the problem at hand. 
I mean, huge risks, sure, in the sense that if a worm were to be developed that goes through Keycloak directly and it doesn't get patched until it is in the wild all over the internet, then that might be a problem, yeah. But in actuality, so far, security updates for things like Keycloak, Nextcloud, et cetera, come out at a cadence of once every month, month and a half. You basically just tell your Podman container stack to pull the new images and you restart them. It's two, three minutes of work, usually. So again, it is a concern, but it's not going to be your job to deal with this. It's two minutes, three minutes to install security updates most of the time. Okay. So the question is, how do you know that you need to update something? I just run these pulls and updates like every two, three weeks because, like you pointed out, I don't want it to be my job to figure out all of the security updates that are available for everything. So I just install everything every couple of weeks. Next question is, can you use purely FOSS for the non-engineering aspects of the company? For example, accounting software that is compliant with state requirements for regular digital submission. So the question is, can you use it for things like administrative software and stuff? I frankly don't know, because I cheat in that regard: I have an accountant, and every month I send them an email with a pile of PDFs in it, and then I get a PDF back that says how much tax I have to pay or get returned. So I don't know. I don't have the answer to that question. But hiring an accountant is a good idea. Even if you're at a size of one or two people, just get an accountant. Thanks for the presentation. So since the key word here is to keep things simple, I believe in Keycloak you can have all the users created locally and keep them in Keycloak. What's the benefit of having LDAP behind that to keep track of the users and hold all the information? 
If you want to keep things simple, why not just create local users? Because then you end up with the same problem that you started with, with different cloud tools. Now you have to manage individual access to different tools yourself again. The fact that there is a human resources group that easily works across all the systems means that the mail server knows who's a member of human resources, Nextcloud knows who's a member of human resources, and what else? And Rocket.Chat knows who's a member of human resources. It just lowers the burden to take on that one little bit of extra complexity. In return you get back this one single thing you have to administer instead of figuring out how to do that across multiple things. Also, from a purely practical point of view, most programs will support LDAP. Fewer and fewer programs now support things like PAM or anything like that. LDAP kind of is what PAM used to be in that respect, for good or bad. You've been talking about using this in small or smallish companies. Do you foresee a point at which this would not work anymore? In terms of size of the company? I think there's going to be an inflection point, specifically when vertical scaling stops working. If I need more than one piece of hardware to run all of the chat users on, I think at that point it's going to be a problem. Or if I get so many files that I cannot reasonably store them any longer on a server that you can reasonably rent. I mean, right now you can get a RAID 1 NVMe server with two times four terabytes of disk space for like a hundred bucks a month. Once things like that stop scaling, I think that is when you might start to have a problem. The way I think about that though is, by the time I have that problem, I will also probably have enough people to get a sysadmin to figure that stuff out for me. How often have you had to report or fix bugs in the stack that you're using? How often do you have to fix and report bugs? 
So exactly how often, I don't know. I'm not going to lie, it happens with some regularity. I currently have a bug open with Nextcloud, for instance, that sometimes the avatar icons disappear. That is annoying. Things like that are annoying. But then again, I also have had bugs open for pieces of software I've paid for that are never fixed. It's not a panacea, I believe is how you pronounce that. But it's not like things are constantly breaking and I'm constantly fixing things, absolutely not. A minute ago there was a question up here. Has it gone away? Just in this front block here. No, right, okay, we'll go over here. We'll see if it works. Somebody's getting some steps. Hi, thanks. That was a really good talk. Liked everything that you did say. On the point about accounting, that was one of the things I was hoping you might cover. So I was just going to ask if maybe there's anyone else in the room who does do their own bookkeeping and tax submission and knows free software that can do that in European or particularly UK contexts. So I think the question is, is there anyone in the room who does know how to do open source accounting? Because it's certainly not me. That has a second vote from me as well. Anyone? Hold up, hand up. ERPNext, you say? ERPNext. Not every regulation is in it. You have to build that yourself, but ERPNext is an open source solution to manage your accounting stuff. Cool, thank you. Did we have more questions? How much time is needed on average per month to set up the infrastructure and maintain it? So setting up the infrastructure, well, I felt like it wasn't too much work to set up an entirely new infrastructure just for this talk. So I mean, that should give you some indication that it's not like a weeks-long process. As for how much time it takes me per month to run this, it is a little bit variable. 
So just installing steady security updates and just keeping those things running maybe. So it's never consecutive, right? I think it's fair to say that I spend about an hour a month maintaining the infrastructure. I don't think I spend a whole lot more than that. Of course, this number can ramp up rather dramatically if you get a hardware failure. Like, a broken hard drive in a server can mean that you have to spend two, three hours talking with your data center people, planning to get the disk replaced, waiting for the disk to be replaced, putting it back in your RAID set. So at that point, you might lose a couple more hours. But yeah, it's in the order of hours per month, not days. Hi. Have you tried to do it in high availability? So do you have any problems with, like, a next layer of complication? If you have one Keycloak on machine A and the second Keycloak in data center B, what's your idea here? Because if you use, I don't know, Google, it's just one gmail.com and you are fine. Yeah, so the question is what do you do about redundancy? So that's actually one of those things where it's very easy to fall into a trap, which is kind of what this talk was hoped to be about: you don't really need that most of the time. Make sure that your data is safe. But if you have a company of 10, 15, 20 people, if people can't log in in the middle of the night, it means that you're also sleeping and all your employees are also sleeping. So this is not something that worries me at all. That's not to say that you should never worry about it, and I'm not saying that anyone who does worry about it is wrong. But just in the type of deployment that I'm talking about, a couple of hours of downtime is not the end of the world. And we've seen it's not the end of the world, because Azure just went down for like seven hours and, you know, we're all still here. Okay. Thank you all very much. Thank you, HP, for your talk. 
That's a great answer to your question. Thank you for coming. |
The End of Free Software
How the Cloud threatens FOSS and what we can do about it |
Thank you all for coming and staying so long. I'm positively surprised and impressed. A bit nervous. I hope I entertain you and don't make a mess out of this. Who of you thought that this title was clickbait? That's fair. It's clickbait a little bit. I'm also serious about it. I think it's a serious issue. A quick introduction. I will not read this to you, and you already heard it. I think the key point is, I work at Red Hat. Yes. My motivation for this talk is informed by my experience at Red Hat, but really I've been in open source and free software for a long time and I'm really committed to that. That's the reason why I work at Red Hat, not vice versa. Also, I have to give this disclaimer. Nothing I say here is an official position of any of my former, current, or future employers, although it may be the same things I tell my employer. Take that as you wish. Let's get on to the topic. Why do I say that the cloud is a threat to free and open source software? Because while free software has democratized the access to code at rest, so code in a repository, taking that code and actually operationalizing it, running it, has become the new proprietary differentiator, even on top of free software. In my experience, that is undermining both the usefulness and the sustainability of free software. Let me explain that a little bit. Why do we do free software? I trust everyone here in this room knows the definition of free software, so I'm not going to belabor that. I'll use open source, free software, and FOSS synonymously. We can discuss that over beer later if you want. The point is that you have the freedom to use software, and it matters. It's important because before free software, code was a major proprietary differentiator. Proprietary code was also the major entry hurdle for accessing technology. It incentivized centralization in walled gardens and thus was an inhibitor to innovation. It created and maintained dependencies and lock-ins. 
You were dependent on the decisions of someone else who had authority over the code that you used. Back then, that was annoying. Maybe it wasn't as big of an issue as it is today, but the world has changed and I'll get to that. Free software, in contrast to proprietary code, levels the playing field and democratizes the access to technology. It incentivizes open collaboration and offers a base for technical sovereignty, sovereignty being the state where you have authority over your technology. That's my own path to open source. I got to open source because I was in high school and I needed a compiler. I was using OS/2 at the time. Guess my age based on that. I was using OS/2 because DOS and Windows were not my thing. But the Turbo C compiler I had used on DOS didn't work on OS/2, and IBM wanted a lot of money for a compiler. Luckily, there was this guy, DJ Delorie, who had ported GCC to OS/2, and that is what saved me. It let me use a compiler on OS/2, and that is what got me into free software. The key point here was that I was excluded from access to technology. I was excluded from being able to use something to write my own code on the platform that I happened to be running. I could have switched platforms but would have run into similar problems. There were other compilers that were maybe cheaper, but the problem became obvious to me because of the pricing that made it inaccessible for me. So that removed the hurdle. And as sovereignty is a big topic right now, for me it starts with individual sovereignty, but that becomes an issue for any organization. If as a business you are dependent on other people's decisions, or, and in Europe that's a big topic nowadays, even as a state or union of states you become dependent on other people's technology decisions, that's creating major risk. Free software is the antidote for that. It provides you an environment that lets you control your own technology and it creates an environment where access to technology is fully democratized. 
It provides incentives for collaboration and creates a better innovation model that's based on collaboration. And it, well, now I'm repeating myself, apologies for that. It creates this level of sovereignty and control over your code. That's why it matters, and that is something that in the cloud is becoming harder to do. What happened is that software eats the world. Everything is software defined now. An example I often use is the connected mousetrap I have in my basement. A mousetrap is a pretty old concept doing something very physical. And now I have a mousetrap that has a microcontroller, talks to my network, and tells me through a cloud backend with a front end on my mobile phone when it did its business, so I don't find it weeks later. So that's software defined now. And I bought this one because it has software features. I mean, it's a silly example, but it shows how integrated our lives have become with software. How things are defined by software. I always go back to Lawrence Lessig's book from the late 1990s or early 2000s called Code, where he argues that code becomes law because it increasingly defines our ability to interact with the world. I came across it in the context of the fight against software patents in Europe in the early 2000s. And I think that's become more and more reality. Everything we do is more defined by code. And that is not defined by code at rest anymore. It's not defined by code in a repository. It's defined by code that's running somewhere. And open source is a base for that, right? The cloud is built on open source. The complexity of modern software, the integration of the world, and the interdependencies that everything now has have led to a model where users, commercial users, private users, increasingly just build on top of services provided by someone else, right? And it's just a function of the complexity against the utility of something. 
If I need a database, I can click a button, I get a database. I don't have to become an expert in deploying databases myself. I don't need to figure out how to find the right infrastructure for it. I just press a button. It works. And I can focus on the top 10% of my solution stack where I implement my, as a business, I implement my proprietary differentiation. I will use that. But the price for that is a dependency on a service that in itself may be based on open source code, but the step from taking the code that's in a repository to putting together multiple pieces of code into a running service, and running that service, keeping it running, keeping it secure, and integrating these other things increasingly becomes proprietary. And it becomes the proprietary differentiation to the degree that less and less people can actually run services on their own. And why is that? Because the knowledge of running things in this complex interconnected world gets harder. It's hard to keep things secure, even configuring them secure, which you can see in the news, basically daily base, is hard. People fail. And all kinds of data shows up in places it shouldn't be. And that becoming a new, that knowledge, right? So it's not just, it's not the question whether in this respect is a definition of free software. My complaint is not that I may have to pay for a service, right? My complaint is that the knowledge on how to go from code in a repository to a running service, that knowledge in itself has become a proprietary differentiator that's creating walled gardens and that removes access to technology. And I'm guilty myself in actually falling into that trap. I mean, I don't run my own mail server anymore, which has been a, I don't know how many of you are in that camp, who runs their own mail server? Yeah, that's sad really. 
I mean, in this audience, I was wondering whether I should dare ask this question because I would have felt really silly if all of you would have raised their hand and say, I run my own mail server, Daniel, you idiot. But in reality, why don't I run my own mail service because I can't, I don't have the time to do it and keep it secure and keep my family on it and deal with the support requests of that and provide the same access from every device that I get if they are just on Google mail, right? But that's a compromise I make for the ease of use, the utility of it, and in the end, it creates a dependency. Now I have the situation, I'm going a bit off script here, but I just like now they are scanning my mail for certain legal violations, right? And they will forward certain types of violations automatically to the police and lock you out of your account. I think that was, that wasn't Google. There was a case recently where a family got into trouble for a photo they had of their kid, like traditional family photo that was flagged as inappropriate and they lost access to their mail account. I don't say which vendor that was, but you can Google that in the US, right? And they're locked out forever because algorithm decided that that was not acceptable. And that might, it's all well intended, right? But it shows what it means to not be, not have technology sovereignty, right? You're now dependent on an algorithm that might be very, very badly implemented that then decides whether you lose access to your own data because it recognized a violation with the best intentions, but sometimes dangerous consequences. And we'll see more of that, right? Because there's a whole different thing I'm not talking about today that's happening with artificial intelligence and the further integration of these systems into cloud services. So my point here at the end comes down to that, like the question, it's great that we have free software, right? 
It's great that we have revolutionized the access to technology, access to code. We have revolutionized the business where free software has won. It's a base for all modern software development to some degree, right? There's always something like proprietary differentiation in code. Right now, I would say is limited kind of to the top of the code pyramid, right? When it becomes like the things that one company writes in-house, right? Their own business differentiations, that's always proprietary, right? That's where your trade secrets are. And that's okay, right? It's always shaking the head. It's a separate divide, but you will like at a point where something becomes so unique, right? You get into the question with a... We can debate that in a question and answer. I see it, but I think so ultimately I respect people doing that, right? But then you get down to something is domain specific. Maybe there's some proprietary stuff in there, but latest you're like outside of like one specific industry, you get into free software frameworks that everyone is using because all the stuff on the business side that's common, their free software is a huge advantage because it gives you the ability to collaborate among competitors, to not reinvent things all the time, to get to common standards in a very practical way, right? Because it's code is better than a standard. So that works. But how much use is it really if you cannot run it anymore on your own with a dependency, without a dependency on someone who is a big centralized provider? The issue here that goes into that is true for individuals, that's true for organizations, that's true for civil society, that is true for companies that are not as big as some other companies. You always create these dependencies because of the talent gaps and just the knowledge gaps and the risk included in it. 
And this has allowed the leading cloud providers to create centralized economies of scale based around their proprietary operational knowledge, right? And it's really useful. You click a button, you get anything you need in infrastructure. You know, this sometimes includes unfair strip mining of free software projects, right? We've all seen that. They can do that just because of the scale of operationalization. There's nothing to do with the code. It has led to a situation where sometimes code itself is actually commoditized, right? Your challenge is not writing the code. You can get enough people together to write the code. Really the problem comes running it and creating the economy of scale and adoption that makes it successful, right? And that is a dynamic that basically goes back to the main frame, right? We went full circle and now we are running black box services on someone else's hardware. It's a convenient model, but it's really creating lock-ins. And it's the same kind of lock-in and access hurdle that code used to be when I was in high school, right? It was impossible for many people to get to that level of technology, even if they knew how to code in a sustainable way. And now we are getting back there. And there are plenty examples how this goes even into how we create free software, right? This is not just a problem in using or deploying it. It affects our own creation, right? If you write modern cloud-native software, you aggregate existing services and you focus on the last 10% of what you really care about. You don't have to worry about anything else because you can use it from the cloud provider. If you are doing free software, then you have to run everything yourself because we haven't expanded free software to solve this problem, right? And we also, I mean, we can take prominent examples of who uses GitHub. Does it include GitHub issues and things like that? Who thinks that GitHub is open-source software, free software, right? And it's great. 
It's a great service. It's a great tool. Git is free software. GitHub and everything around the code management itself. You go into issues, actions, integrations. It's owned by Microsoft, yes? Yeah. Now, I'm not going to go there. I have a sitting comment who I work for, so anyhow. I think I don't want to have prejudice against, I don't want to single out Microsoft there, right? Because that wasn't my point. Slack is not on, I think, I don't know who owns Slack, but same thing, right? So much user. We can go around and around, and we will find that even the development of free software depends more and more on proprietary services, software as a service. And it's, this is not necessarily criticism of these companies, right? I believe that everyone has a choice to offer their software and their service under the license and terms of service they see appropriate. But it's a problem for me where I think, well, there should be an alternative, right? We need to think about how sustainable this is in the long term. And, you know, can we get to a model where free and open-source software development can be integrated with other free and open-source software development in a consistent model end-to-end, where it regains its utility all the way to running code, right? That is the underlying problem, that when you, you know, having code and then not being able to actually offer the service or run your own mail server or, I mean, a big topic that everyone here is probably aware of is like the switch towards the Fediverse right now, which is great, right? Which brings more attention to this issue. People are trying to run their own instances of a decentralized social network now, and they're finding out that's actually really hard, right? And, and it's, we haven't, like, that's going to get ugly before it gets good if people don't run back to the centralized walled garden because it's too hard, right? But that makes a point. 
Ultimately, right now, while the Fediverse, I think, is far ahead, right? It's one of the areas where I see hope, because, you know, it's fairly easy to run. A lot of people are running and it's a, it's this decentralized, centralized where you have hubs and people are running them for others as non-profits or, you know, I think where you have, you know, direct contribution, but it's not a walled garden. That's hope. I see hope with projects like in, in the home automation space. I personally use Home Assistant, but they're, you know, other great projects, but a lot of them go deep into actually giving you something that just runs, right? So operationalizing it isn't hard. It's actually often out of the box. You can download an image. It just works. Sometimes there are even services and the integration into services works out of the box, right? You can actually aggregate things. Sometimes it's a bit, you know, part of the problem is that there's no differentiation between integrating into other open source components and services and some weird proprietary or very weird proprietary cloud-based things that maybe upload your security camera images to places where you don't want them, right? So there's that. But on the other hand, it's, it's progress in the sense that part of the open source project is to actually solve this problem. And, and that is, I mean, this is a problem. Again, I think this is a problem for everyone, right? Even if you're, if you're building on top of the cloud, even if you're a company building on top of the cloud and you have like very little code that is your own code, where you put your business differentiation, like your UI to control the microcontroller code that controls the mousetrap, you still need to figure out how to operationalize it. That means you need to have that knowledge from somewhere, right? And that, that's becoming increasingly a problem because it's hard to do that securely. 
So, you know, "open source has won" is only half right. That's my, my takeaway. You know, open source is the preeminent software development model at this point, but it's not the operations model. And this limits how useful free and open source software is for users, whether it's individual users, private users, nonprofits, or civil society organizations, companies, or even governments that try to have sovereignty in their technology and data use. You know, GitOps is a nice example here, right? It's great to have a GitOps model, but if you GitOps your free software, how free is it when the GitOps part is actually proprietary, right? So my, my thesis is that we need to expand the concept of free and open source software from code at rest, from code in a repository, to really include the operationalization of the code. And we need to create a collaborative, decentralized infrastructure model for that, where we can use the same approach of aggregating existing services that are run maybe by other organizations, by other projects, without having to rerun everything on our own, right? We need to create an exchange on operationalization knowledge and we need to create a practical capability to build applications in that way. And I think it's, it's important to really take this also in the, in the, you know, something I hadn't, hadn't thought about before as much. I spent some time at an open forum event on Friday where sovereign cloud, cloud sovereignty in the EU was a big topic, right? I think as a, as a free and open source software community, we need to also make sure that this point gets pushed into the cloud sovereignty discussion, right? Because if you end up with cloud sovereignty, many people talk about how, oh, we're going to use open source cloud code, but at the end it's going to be in itself just a black box service. 
Maybe it's just a data center that happens to be in the EU, but otherwise it's following the same existing proprietary, centralized cloud model, and that wouldn't be sovereignty, at least not in my, in my definition of sovereignty, which means you have actual control over your, your destiny. So what I'd like to see is what, what we need to do here, you know, expand the model, create a collaborative model around, around actually running things, focus on the sovereignty aspects, and I think we need to consider technologies without bias. What I'm really concerned about is for example, when I see discussions around web three, there are a lot of people who have now a mindset where blockchain, web three, that's tainted by either scam affiliation of similar technologies, perceived or real, or by political affiliations. I don't know how big that is in Europe, in the US, that's like a big topic where you can talk to someone and they will say, no, web three, that's something that the crypto bros use. I'm, I think we need to be agnostic there. We need to look at technology for its utility. I'm, I'm skeptical on each individual technology, but ultimately it comes down to how can we solve the problems. And there are areas where a distributed ledger is going to be the right solution. There's going to be areas in my view where, for example, something like Filecoin is the right solution because you need to find a way how to monetize offering services on a peer-to-peer basis. And we need to be careful not to put everything into the same bucket as maybe other things like, you know, some, some scam coin and snowball model, because just because it's the same technology, they're not the same thing. So I want to give a concrete example of what can be done. This is something we are trying at Red Hat, but it's an open source initiative. Well, it would be silly to talk about it in this context if it wasn't. So we call it Operate First. And I should explain the term. 
So Red Hat has internally in, in, in our Linux group and I think for most other groups, but that's where it comes from, we have a core principle that, that's called upstream first, which says that we will never try to differentiate from upstream code on a technology base. So if we implement a feature in our product, we try to get it into the relevant upstream projects before we ship it as a feature to customers. It's not always true. We ended up maintaining the Xen kernel patch for a long time because it was never accepted upstream, but we also regretted that quickly and switched to KVM because it's painful, right? And this is not, this is not altruism per se, right? The, the reason why Red Hat doesn't try to differentiate on the technology level from our upstream open source code is because it would make our whole development model really painful and almost useless because you lose the main point why a company should use free software. It's the development model that allows you to collaborate and achieve better, better solutions than anything you could come up with in-house, right? You lose the corrective and I mean most, most modern issues that you have in complex systems in the, in the lower layers of your, of your software stack, you're never going to solve on your own, right? Like any big kernel issue, any big security issue, it's always a collaborative effort and that's why they get solved. I mean that's even true for most proprietary software at this point because someone else finds it, right? So, so Red Hat has this principle called upstream first, which says features go upstream before they go into the product, you know, exceptions are always there, but that's, that's a goal and it's motivated by a core business requirement, not by altruism. Now we need the same for creating services, right? Red Hat is now as much a service, software-as-a-service company as it's a we-ship-code company. All our services are based on free software. 
Not all the glue code gets developed in a free software open source development model. Not all the glue code is publicly available yet. The point there is we have the same problem, right? And so this is not only, this is not speaking about our use of services, right? We can't force and we don't have the capacity to build everything ourselves, right? We have to offer services on proprietary cloud, for example, right? There is no way around that. But for the things we create, we were also drawn into just like everyone in the culture of DevOps and cloud native development, we have created things that are proprietary in the productization step and we have lost the ability to collaborate on that which hampers ourselves, it hampers our customers that, that then try to do things on-prem or try to create a hybrid environment. And at the end, we believe strongly in a hybrid cloud model for whole different discussion would be its own talk. But so we created this initiative to change our own dynamic, but invite others to join it. And it comes down to approaching the creation of our own services from a model where we start with making the upstream code operational, right? So we make sure that what we put into the software repo is not just theoretically runnable code, it's actually everything you need to run it. So open source service, first of all, means something that the average person can instantiate as a running service. Second, drive the knowledge on how to operationalize it into the same community or overlay community because things aggregate into bigger services and create communities to have an exchange on how to operationalize the software and make that knowledge available under free software licenses in a free software exchange model and development model. And then also try to create actual running instances that our projects can reuse without having to run component services on their own and invite others to participate in that. 
We collaborated with some universities, and this is not new, right? To be fair, Matt Miller from Fedora Project is sitting in the first row and reminding me, which is present, Fedora has done things like that all the time, right? Like Linux distributions, of course, have run what nowadays would be called a cloud service, like build systems and so on in a collaborative fashion, right? So this is not net new. The problem is we have to expand that into everything, into databases that we need to run, we need to make reusable, to any kind of software that we ship, we have to include the operationalization. So that's the pitch. And we have 10 minutes for questions, discussions and answers. 15, I did some quick talking there. Do we have questions online? Maybe we can start with these then. Yes, no. Nothing online, so questions. Yeah, I will take the first one then and then. I'm not going to heckle you. Closer, okay. There we go. So a lot of the public discussion I've seen kind of about cloud services and open source has come from a different direction. It's like companies who have open source software and their business model is support around it and then some other much larger company suddenly runs their software better than they do and they say, hey, that's not fair. That's not why we open sourced it. We wanted something else out of that. And there's been a kind of a fight over licensing and some new licenses basically meant to restrict that kind of running things. And there's been a debate about whether those fit under open source. I think you're talking about something entirely different. And is there a licensing approach we can take to drive software towards this Operate First model? So I'm not convinced that code licensing itself will solve the issue of strip mining code. Like the problem when a larger company takes your code and runs a service and monetizes it, or it doesn't have to actually be a service, right? 
They could just ship the code into it better, which is always, there's always like a risk like that. It's a dynamic that always existed in a Linux distribution, right? We ship something that undermines the ability to sell the same thing unless you find a way how to make what you're selling either better or an add-on, right? So in the traditional sense, I always like, I was a product manager for RHEL a long time where we shipped components that other people try to monetize because it's expected from a Linux distribution to bring certain things out of the box. Now what we always said is, look, we have a very generalized solution here. We will help you upsell your specialized support, right? So that solved it in that sense. But if you go into the strip mining by, for example, cloud providers, who take the same piece of software, run it, compete with you, and they can operationalize it much better than you, and then also have more reach and have a better customer model, they have a marketplace where they can promote their solution over yours. I don't know if a code license can solve that because part of the problem there is that they could also rewrite and when that was done, they rewrote the code, right? Or forked an earlier version when you change the license, right? So this didn't work. What worked, however, I think is creating the awareness, creating specialization and creating a dynamic around the integrations, right? Embracing that you're not selling actually the code, right? Now, I still believe that it's probably better to license the code under a copyleft license and license your operationalization under a copyleft license to minimize the abuse, right? Versus a license that's too liberal and allows people to take it proprietary. But the successful countermeasures I've seen were not based on licensing primarily. They were based on a business model and engagement with the customers. How do you copyleft the opera... I can't say that word. Well, you get into it. 
We just found out, luckily, that game rules cannot be copyrighted. So D&D will live, despite Hasbro. But I think the code you implement it in can, right? And at the end, it comes down to how much of a hurdle is writing new automation code and actually that becomes a fairly big hurdle to reinvent automation code. And then copyleft helps. It doesn't help you against the reach and economy of scale. There you need awareness of your benefits, a good business model, and awareness of the problems of walled gardens, right? I mean, and the Twitter versus Fediverse situation is a great experiment of that. You know, I think we should all try to help make that successful, right? How do you feel about decay of self-hosted infrastructure? For example, servers or domains disappearing and in general, being less reliable than, for example, GitHub? I couldn't. Okay. How do you feel about decay of self-hosted infrastructure? For example, servers or domains disappearing? That wouldn't be the case with GitHub. How do you feel about the decay of self-hosted infrastructure like servers and domains disappearing and in general, less reliable than, for example, GitHub? So that's a huge issue, right? Self-hosted infrastructure always decays, right? And I mean, that's true for anyone who does something. You're interested at the beginning, and even if you're still around, you move on to the next interesting topic. That is at least true for me. I'm very happy with Home Assistant. I wouldn't want to run a security scanner on my Home Assistant device. Now, that's fairly okay because it's so isolated and doesn't control, well, it controls the heating. So there's a risk there. So yeah, that's a problem. I mean, and I think, my impression is, for example, in the Home Assistant community, and again, that's just the one I'm using. I don't want to like single them out and I haven't had time to try others. 
There might be better ones, but what I'm seeing there is that there is out of the community, actually, people start offering the services commercially, right? And nothing I said today is an argument against commercial offerings, right? It is an argument against proprietary offerings. It's totally fair to charge for something, right? Developers want to eat. So do admins, they need to eat, right? We can sell services. And so I think professional free software infrastructure is the answer to the decay of self-hosted systems and, you know, a reasonable approach, and some of us, we have to figure out, a reasonable approach to decentralization where multiple people can offer things without making you choose proprietary lock-in is the answer. And that's why I think we need to look at web three without the bias because that is one way that could work, right? If I look at Filecoin, IPFS as an example, that actually is working right now, not economically for the people who are offering it because the hardware is too expensive for the money you can get with it. But if you're using Filecoin-based storage, that is actually pretty cheap and very reliable, right? And the cost, like the problem is that the compute capacity, or the specialized compute capacity needed to offer it is too expensive right now. But that's just, you know, that's going to be solved with progress in technology. And I'm sure that we'll see some RISC-V-based specialized Filecoin systems in the near future that make this affordable. And then we have a way how you can have, you know, at least for storage, you can have distributed infrastructure with a good funding model. Other ways are foundations or other peer payment models or small companies, right? Decentralized small companies. We have three more questions. There was someone over here, there, and then there, and then over there. 
So what do you think is the way or the sort of gold standard or endpoint for packaging up operational capability with the code and with the application when shipping to end users who are not software developers and who don't have servers? I agree with everything you said before. My thesis so far has been that it's straight peer-to-peer, you know, stuff like BitTorrent or Tor or old-school file-sharing apps would be an example where when you got LimeWire, you got the app with the data. But, you know, it could also maybe be Web3 something. But what do you think that looks like for the 2 billion people who don't have servers and aren't software developers? So I have an example of right now a piece of software that I think is doing this really well and that's Syncthing. It's a peer-to-peer file synchronization system that I use between my mobile phone with Android, Linux machine, bunch of servers, and I now use it for everything where you need the full copy everywhere. At least last time, I haven't checked for new features. It was a debate. They don't have like on-demand replication. It didn't have that last time I looked for it. So I use other things when I want to have things like the Google Drive type thing, right, where you have something in the cloud and you only download it when you look at it. For that, it wasn't the right thing last time, but I use it for my, you know, KeePass replication, things like that, for backup, for all documents I want everywhere. And it's super easy. It just works. And so my 11-year-old daughter uses it. Now the poor kid also has to run a Linux laptop because I refuse to do anything else. I can tell you of the pressure to go into a proprietary ecosystem of a specific company that sells laptops and phones primarily, I think, to what they do and watches. It's a fashion business I hear, but like the pressure on kids to be on that because all the other kids have that is enormous. 
So, you know, poor kid is in trouble, but she has something and some other cool things. And Roblox works in Wine, just in case you're struggling with this. Hi. Thank you for your talk. I'm wondering whether this might also come as a problem because of sometimes a need for centralization. I myself moved to Mastodon, but I still push all of my toots to Twitter. And I would move to GitLab, I mean to a local GitLab, but if I want to have users on my code, I still need to push all of my code to GitHub as well. So, what's the solution with that? I mean, I'm in the same boat, right? Like, of course, I have a Mastodon account. I'm not super active on Twitter. I tried to do all my code projects on GitLab, which is slightly better, right? As a more hybrid model, although it's not perfect, but I like them. But yeah, I think it's a cultural change in awareness, right? People need to understand that you need decentralization and we need to create code and infrastructure and culture that works in the federation to give you the reach, right? The problem is reach, right? And that's, it's really a cultural question at the end. I mean, the perfect example is the switch from IRC to Slack as a dominant, like, interaction for developers in many places. I mean, there are, there are really limitations with IRC, right? That become problematic. Technology-wise, Slack isn't fundamentally different. It's nicer UI, some additional features and the ability, like the persistence that you had to do extra steps to get with IRC, right? But it's primarily kind of like minor improvements like threading and the web UI and persistence. And then the critical mass effect and the reach it created. Really, the answer is a cultural change where people appreciate the, the, and do the extra work. And it, and it goes back to the earlier question, right, about, like, what's a gold standard for creating such an app? 
It's us here who have to drive that and then make it usable for all the others, right? Like, and that's, that's not different than what happened with free software in general. We, we've been through this cycle and I think it's truly equivalent. I mean, using free software 20 years ago was hard, right? I mean, when using software was hard, but when I installed my first Slackware, that was painful. And then nothing was available and then you had to recompile it and it behaved randomly whenever you compiled it. So we have to do that. This has to be the last question. Okay. So I agree with what you just said about cultural change and to give an example of that, um, for me, Microsoft's, uh, acquisition of GitHub was the last straw and I closed my account, but that means that if I want to contribute to free software projects, it's increasingly hard because I'm expected to have a GitHub account and it seems to me that anyone here who agrees with what you've been saying and yet who still uses GitHub or other proprietary software and particularly proprietary software that they allow to be at the heart of the development of the free software is being very hypocritical. And so it's up to us surely to, as you say, take on that pain, but that includes a lot of people in the room who put their hands up when you asked who has a GitHub account. So I put it to you. Should we not change? We have to go through that pain. Thank you. Yeah, I agree. We have to change. And I'll be honest, my presentation, part of the reason why I was struggling, I couldn't read my speaker notes because I created it in what's the default at Red Hat, which is Google presentation, Google Docs, and then downloaded it because I didn't want to look super silly in front of you. So yes, it's hard, right? Like, and what I honestly believe in, I don't know who recognizes this t-shirt. All right, people, you have, you're not watching enough TV or the wrong. Watch The Expanse. Best show ever. 
But that's from the TV show. No, that's the TV show. They changed it. Next time I'll do a tattoo or something. Anyhow, so I believe in the power of subversion, and not the, not the Git predecessor. I do not. I do not believe that was great at the time and stick with it. But my point is, so for example, I kept my GitHub account. I do contribute to projects where I need to because I don't want to put too much of a burden on people I'm trying to collaborate with that I depend on, right? Like it's hard. But I do try to problematize it, bring it up, and I do everything I can on alternative platforms, right? And I think, and you know, I think that's really like adding that extra step of advocacy and pulling people, trying to pull people over is the best we can do. I just created my, and I had that on the first slide, I'm sure I've put it right. I just created a Nostr key, and now I'll do Nostr instead of Twitter because that's a fully decentralized thing. But I'll cross post, but I'll point out that you find me on Nostr, and if you want me to reply, you have to talk to me on Nostr, right? Things like that. I don't know if we all do it. I think if everyone who comes to FOSDEM starts behaving like that, I think we can change things. |
Introducing Helios
A small, practical microkernel |
Hi everybody, my name is Drew DeVault and today I'm going to give a quick talk giving you a sneak peek at a microkernel I've been working on called Helios. So why should we write new kernels? These are the reasons that I came up with for writing new kernels. You know, is Linux good enough? I don't know, but kernel hacking is really, really fun. So I'm enjoying myself working on it, which is reason enough. I also want to prove if the programming language Hare is useful for writing kernels. Hare is a programming language I developed along with many of the people in this room and many people outside of this room, of which one of the stated goals is to be useful for systems programming and to be able to write kernels with. So in order to prove that that's possible, we have to write a kernel. Another goal is I want to know if we can do better than the seL4 microkernel, which is a microkernel that inspired a lot of the system design for Helios. And if I were to be particularly ambitious and bold, I would wonder if it's possible to do better than Linux. And so we've had Linux now for, what, 30 years, maybe a little bit less than 30 years. And I think it's time for some innovation in kernels. seL4 is cool, but maybe we can do better. And you know, at the end of the day, we're having fun doing it and that's enough. So enter Helios. Helios is a microkernel, which again is largely inspired by seL4 and is written in Hare to prove that Hare can be used to write kernels and also because it's fun to write kernels and maybe we can make it good. It runs right now on x86_64 and ARM64 and support for RISC-V is coming, which is all of the targets that Hare presently supports. The kernel itself is quite small. The portable code base is about 8,500 lines of code. And then on top of that, for each architecture, we have about 3,000 lines of code. And that's it. That's the whole microkernel. 
The kernel is licensed under the GNU General Public License, and I suppose here I should mention that these small line counts don't include the bootloaders, which themselves maybe add 2,000 lines of code per target. And it's written in Hare, which is again a systems programming language that I designed with the help of about 80 contributors. This is the pitch from the website, but the short version is that Hare is a systems programming language which is designed to be very simple. We have a specification which is less than 100 pages. We have a small compiler. We have a minimal runtime and manual memory management. And the goal is to use it for systems programming. So that includes compilers, daemons, system tools, and also things like kernels. Further about Hare: again, it's a general purpose systems programming language. So in addition to kernels, we also use it in user space on Linux and FreeBSD, and we're working on OpenBSD and NetBSD user space support as well. We've been working on it now for about three years. And the footprint of the programming language is also small. We have an 18,000 line compiler. Our back end is QBE, not LLVM, and QBE is about 12,000 lines of C99. And together, this is enough to bootstrap the compiler, plus, you know, binutils is required. And it runs again on these three targets. If you've never seen any Hare code before, I just have a small sample here. I'm not going to go into too much detail about exactly what any of this code does, but this is just what it looks like. If you're familiar with C, a lot of things here probably look fairly recognizable to you. Some things maybe don't. Namespaces are nice, you know. But this is just a peek at what Hare looks like. And this particular code sample is the entry point for the Helios microkernel. So this is the first line of portable code. 
There's also some architecture-specific setup code, and the bootloader runs before any of this, but this is the first line of code that runs on all architectures. So what is Helios? What is the goal of the design? It's a microkernel, so it's designed to be as small as possible and to move any tasks which can be performed in user space into user space, in contrast with something like Linux, which is a monolithic design. The kernel is very, very small and simple. It only has 14 syscalls, of which 12 are related to capabilities. It uses capability-based security, which is essentially a means of controlling access to resources on the system, like memory, hardware I/O, memory-mapped I/O, processes, threads, and address spaces. All of these things are represented by capabilities, and the syscall API is used for working with those capabilities. And then each process on the system has access to some subset of capabilities, which entitles it to use those resources, which is a really good approach for sandboxing and security. It's especially good when compared to a monolithic design like Linux. The example I usually reach for to explain why Helios's design is more secure than Linux is the case of a floppy disk driver. So if you have a floppy disk driver on Linux, it's compiled into your kernel and runs in ring zero, and if there's a bug in it, the worst thing that bug can do is completely compromise your system. Whereas on Helios, the worst thing a bug in your floppy disk driver could do is erase your floppy disk. All drivers, in addition to user space processes, are sandboxed with the MMU, just like user space processes generally are on systems like Linux. Of course, for a microkernel, inter-process communication is critical. We have two approaches to IPC, which again are largely inspired by seL4. 
We have synchronous IPC via endpoints and asynchronous IPC via notifications, as well as the ability to set up shared memory so that you can communicate more efficiently than using syscalls for IPC. Where is the project at now? It's fairly mature. We're about nine months in. The capabilities are working. Inter-process communication is also working. We also have preemptive scheduling, so we do actually have processes which get scheduled, but the scheduler is very simple. We don't have support for SMP yet, so it's all running on one core, and it's just a simple round-robin scheduler, but we will make improvements in this domain. We also have support for hardware I/O and IRQs, both in user space, so it is now possible to write user space drivers for hardware in Helios, or on top of Helios. In terms of booting, we currently have support for EFI on ARM and Multiboot on x86_64. We're going to also bring EFI to x86_64 as soon as somebody can be bothered to implement position-independent code for our backend. And does it work? The answer is, self-evidently, yes, because this slide deck you're viewing right now is running on this Raspberry Pi, which is running Helios. I promised that I would not do any talks about Helios until I could actually present that talk from Helios, and I initially was going to try and write an Intel HD graphics driver for an x86 laptop, and then I started looking at the IHD manuals, of which there are about 18 volumes per Intel hardware revision. Among those are about 100,000 pages of PDF. And after about two days of reading those PDFs, I forgot about that and instead ported the entire kernel to ARM, so I could write a GPU driver for the Raspberry Pi instead. That ARM port took about 42 days to complete from start to finish. The Raspberry Pi here is running its GPU driver and a serial driver in user space. The GPU driver is driving the projector, and I'm switching between slides by typing letters into the serial port. 
The slide deck itself is encoded as QOI, Quite OK Image, files in a tarball, which essentially acts like an initramfs. And there are basically no hacks here. You know, there's no smoke and mirrors. I actually ported the entire Helios kernel to ARM. There's no SoC-specific build, so this same configuration should work on any other ARM device which implements EFI boot. I am using EDK2 to boot through EFI. I'm using device trees to enumerate the hardware instead of hard-coding it in drivers, so there's very little in the way of hacks. 42 days for a complete port to ARM, no hacks. It just works. Thank you. So, where's the project going from here? The kernel itself is done, in big air quotes, in the sense that it's almost functionally complete. It needs to be polished, and, you know, if you do a git grep on TODO, you find about a hundred things that still need to be fixed, just little stuff. We need to add multiprocessing support. I want to port it to RISC-V as well, which maybe will take more than 40 days, because I'm not going to, you know, kill myself over this one without a deadline like FOSDEM. I mentioned earlier that I want to expand the bootloader options, so I want to add EFI support for x86_64, and we also intend to boot RISC-V over EFI. And I want to improve the documentation, of course. The docs are actually already kind of okay. They're at ares-os.org. The kernel docs are maybe about 60% complete, and if you're curious to play with Helios, you can definitely use those as a starting point, and ask on IRC if you encounter a stub where there should be docs. In the big picture, these are our plans. So like I said, the kernel is almost functionally complete, but it's a microkernel, so that doesn't mean that it can necessarily do very much that's useful. But we're going to go to user space and build more stuff. We have this idea of kind of layers of support. 
So at the core is the microkernel, Helios, but then we're going to build additional projects on top of it, which will expand it into a complete operating system. We have now Mercury, which is a driver framework. This already exists, and is fairly mature, and has become even more so in the past week or so. And then we've just started last week working on Venus, which is going to be our collection of drivers. For just any kind of hardware that we want to support, the driver for it will probably end up in Venus and be built on top of the Mercury framework. And together, these will present an interface to Gaia, which will take these abstractions for accessing hardware and build them into an actual programming environment, which will resemble Unix or Plan 9. We're also going to build Luna, which will be a POSIX compatibility layer. Gaia itself will not be POSIX. I think that there's room to go beyond POSIX, I hope. But we do want POSIX software to work, so we'll have this compatibility layer. And then we'll tie it all together with Ares, which will be a more complete operating system that provides a package manager and a service manager and other things that you might be used to from a complete experience. I want to give some quick acknowledgments as well to the people who made this possible. I want to thank Ember Sawady and Alexey Yerin, in particular, for their early experiments with kernel programming in Hare. These early kernels for x86 and RISC-V never made it to user space. They weren't very sophisticated, but they did answer a lot of the questions that needed to be answered for us to know how to do Hare development in ring zero. And so this was very valuable for establishing things like booting up, dealing with the MMU, and other questions about how to get a kernel going in Hare. And the Hare community itself deserves a big shout-out, because none of this would be possible without the immense amount of work which people have put into it. 
Many of the people who contributed to Hare are in this room, and I offer them my thanks. But many people are not here. There are about 80 people who went into making Hare possible. I also want to thank the OSDev community on Libera Chat IRC. These guys are incredibly smart and incredibly friendly and incredibly helpful. And if you want to get involved in kernel hacking and do any kind of work in ring zero yourself, this is an indispensable resource. Definitely check these guys out. And also we owe some acknowledgments to seL4, because a lot of our design is inspired by or stolen from seL4. I should have updated this slide. We have a birds of a feather session in a couple of hours for Hare, the programming language. So it's not about Helios. It's about the language that Helios is implemented in. So if you want to learn more about the language, please come there. There's also a talk tomorrow in the microkernel room where I'm going to have a full hour to talk about Helios. And I'll go into a lot more depth. So you're welcome to come check that out. If you want to see any resources online about the system, the links are here. This is a link to the website, which contains mostly documentation, a link to the source code, to the website for the programming language, and to the IRC channel where we hang out and we'll answer your questions. And that's it. That's Helios. Thank you very much. |
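As a rough illustration of the capability-based design described in this talk, here is a minimal sketch written in Python rather than Hare. Every name in it (`Capability`, `Process`, `invoke`, and the resource strings) is hypothetical and is not Helios's actual syscall API. It only shows the idea that a process can touch a resource solely through a capability it holds, so a buggy floppy driver can reach the floppy disk and nothing else.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """A token naming one resource and the rights granted on it (hypothetical model)."""
    resource: str        # invented resource name, e.g. "floppy0"
    rights: frozenset    # e.g. frozenset({"read", "write"})


@dataclass
class Process:
    """A sandboxed task that holds some subset of the system's capabilities."""
    name: str
    caps: list = field(default_factory=list)

    def invoke(self, resource: str, right: str) -> str:
        # Access succeeds only if a held capability covers this resource and right.
        for cap in self.caps:
            if cap.resource == resource and right in cap.rights:
                return f"{self.name}: {right} {resource} ok"
        raise PermissionError(f"{self.name} holds no capability for {right} on {resource}")


# A user-space floppy driver is handed a capability for the floppy device only.
floppy_driver = Process(
    "floppy-driver",
    [Capability("floppy0", frozenset({"read", "write"}))],
)

print(floppy_driver.invoke("floppy0", "write"))      # allowed: the driver holds this capability
try:
    floppy_driver.invoke("kernel-memory", "write")   # denied: no such capability was granted
except PermissionError as err:
    print("denied:", err)
```

In a real microkernel the check is enforced by the kernel and the MMU rather than by a method call, but the shape of the decision is the same: no capability, no access.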
Creating Pathways That Invest in New Maintainers |
All right, oh wow, it started, hi everyone. First off, huge thanks to the organizers for having me here. This is my first FOSDEM. It's as chaotic as everyone said, but it's been a lot of fun going through everything. So I'm going to be talking about creating pathways that invest in new maintainers. This is based on a blog post I wrote before. I get it, I don't like to read either, so thanks for coming to my lightning talk. You can hear the blog post in spoken form. So my name's Abby. I run the open source maintainer programs at GitHub. So I care a lot about maintainers, about open source. And I do think that thinking about pathways for new maintainers is one of the most effective ways of thinking about your community and just where to invest and how to really build that up. So first, who here is a maintainer? It's a lot of you. Good job. Thank you so much for your work of maintaining. You're the backbone of open source. A huge round of applause for the maintainers. Thank you. And are any of you solo maintainers? It's a couple. Okay. So I'm going to talk about a framework that I worked on when I was at Mozilla. This was put together by a bunch of my colleagues there. But I do think a sustainable community needs a couple of things. First, it needs a way for new people to come into the community and a way to level up within that community. I learned this the hard way when I was in university. When you're running student clubs, there's this four-year cycle where kids are going, not kids, they're adults, where people are going through. And if there's ever a year where you're not recruiting first years or freshmen, then four years later, your club just has no one to lead it and it sort of dies out. So that's a bit of an exaggerated cycle, but it still happens in the real world. Like, people are going to stop working on this eventually, so you have to keep thinking about new people coming in and ways to replace yourself. 
So we're going to look at this mountain of engagement. And I'm going to talk through all these steps, and we'll talk about best practices for moving people up these steps in a way that really helps them become new maintainers. So it starts with discovery. How do they first hear about your project? Maybe they're at FOSDEM or on a mailing list or something. Next is first contact. Do they join your Discord server? Do they star your repo? And then participation. Maybe they start talking in the chat. Start making issues. Sustained participation. They keep coming back. They keep making issues. They start making pull requests. Start really doing things in the community. Networked participation, this is the one I'm really interested in, is when they start bringing others into the community. And then finally leadership. And this can be as little as hosting the next meetup or as big as maintaining the whole project, and you can really identify what you're interested in and write that out. So starting with discovery. Some best practices. Have a public repository. I work at GitHub. Have a public repository somewhere. Some marketing is usually really helpful. Go to conferences like this. Talk about your project. Write on mailing lists. Have an open source or free software license. And I'll say these first few steps are really like the basics of setting up your community. You'll know a lot of these things. I think after the first three is when it gets to the really interesting stuff that I think a lot of people could benefit from. All right, so now that they've heard about your project, why do they reach out and actually start participating? And I find that most people actually join a mailing list or sign up for something because they're really excited about the vision of the project or the mission. Like, they're excited about what you're doing and they want to join in. 
So making sure that you've written that down clearly, and having as concise a vision as possible, is helpful, because then they can remember what they heard in your talk, go home, look it up, and join. Also, having communication channels makes it really easy for people to first contact you. I know it's hard if someone just writes, oh, if you're interested in contributing, send me an email. Sending an email is such a high bar compared to, oh, you can lurk in our, I'm a little jet lagged so my words are missing, lurk in our IRC channel. And then finally, have this stuff in the README. Have a nice welcome mat for your project. Okay, so now that they've gotten in contact, why do they actually participate? And this one's interesting, from volunteerism research. They found that people are four times more likely to volunteer if they've had a personal invitation. So if you meet someone who you think would be a good contributor, personally inviting them makes them so much more likely to actually come and do things on your project. The next thing is be responsive. We found this at Mozilla, especially depending on cultures. A lot of people like to request permission to join before actually joining. And we were sort of ignoring a lot of those requests. So just being responsive to things, even if it's something that you don't think is that important, really gives them a nice signal that people want them to participate in this. And then the next is around setting expectations. So having things like your contributing guidelines, a code of conduct, and public issues and pull requests, so people can understand how they're supposed to contribute and the different ways that they can contribute. With the code of conduct, make sure you also have enforcement, which usually means good governance that works. 
And with the public issues and pull requests, they can see how the work is actually happening and how it's being built. If people can see that, they'll be much more likely to actually participate than if it's behind a black box. All right, so those first three, again, those are like setting up the project, and I think a lot of people have gotten really good at those three. But then when it gets to sustained participation, how do you keep people coming back? A big one here is recognition. So just saying thank you. This could be as big as an award ceremony or giving out, I know Google gives out grants to different, yeah, I'm really jet lagged, different contributors for their open source projects. But really, don't forget a simple thank you. Just saying thank you to people makes a big difference and makes them much likelier to come back and participate again. So if you get a good pull request, say thank you. And then, yeah, hopefully they come back. And then I like this one, connecting to the mission. So Marshall Ganz, he's a political theorist, and he talks about this in terms of political campaigns. Say you volunteer to canvass your neighborhood, and you go and do all of the houses in your street, you go to maybe 20 homes, and you go back to headquarters, you give them your clipboard, and they say, oh, thanks. And they put it in the pile with the rest of the clipboards. That's not as motivating as if you go back to headquarters and they say, oh, well, you got 20 houses. Let's add those 20 to the board. They do little tick marks. They're like, oh, look at this. We reached 1,000 people today. You helped us reach our goal. Connecting that small piece of work that the volunteer did to the broader mission makes a big difference in people actually coming back and being excited about what they're doing and seeing that they're actually making a difference here. 
So whenever you can connect to the mission, that's really good. And then matching skills with interests. Try to understand why people are coming to your project and what they want to be doing, and help them actually get that kind of experience. That goes a long way with sustained participation. All right. The one I was excited about, networked participation. I think formalized mentorship is one of the best ways to get networked participation. I know with mentorship programs, we're often really excited about the mentees, these new people coming into open source, but I think it's just as important to have more mentors in your open source project. Because these mentors are people who are onboarding others, and they're starting to get into these leadership positions, so actually giving them a role where they're bringing others in and helping onboard them, yeah, it goes a long way. And then also professional development opportunities. If you can give them certificates or LinkedIn recommendations, that's really helpful for that networked participation. People get excited about this and tell others, who then want to come. And then hackathons and socials are just a low-bar way for people to bring their friends. So if you have a fun social, people can invite their friends too, and just encouraging people to invite their friends helps them start being that networked participant in the community. And then finally, it's leadership, getting up to the top there. And again, I think personal invitation goes a long way here too. A lot of people are in a community and they're just waiting to be asked to be a leader. So depending on how your project governance works, you can be asking someone to run for leadership, or asking them to become a leader. But yeah, really take the initiative to reach out, because a lot of people don't want to step up themselves, especially if they're from underrepresented groups. 
So make sure you're getting invitations out. And then have that clear governance, so people understand who is making decisions, how you become a decision maker, how you stop being a decision maker, and how decisions are made. If those are written out, it makes it a lot easier for people to become a leader. And I'll say leader doesn't have to mean you're in that decision-making group, it can be something else in the project, but governance is still really helpful for people who want to step into leadership. And with this, I think just having that good value exchange, especially if you have volunteers as leaders, making sure that it's worth their time. And I'm a huge fan of time boxing volunteer asks. I've definitely been asked to be a maintainer for something, and I say yes. And then three months later, I'm like, oh, I don't want to do this anymore, but then it's hard for me to back out. But if it was time boxed, I just wouldn't be renewed. And I've just been the worst maintainer for years because it just was never-ending. So I like that time box piece, and just making sure the value exchange is there, that people are getting what they want out of this experience, whether it's that professional development like we talked about, the connections, the experience, yeah, to get that there. So here's an overview of the whole mountain. And I find a lot of people, a lot of maintainers especially, when they're starting a project, focus a lot on those early steps, but then they often don't get to that networked participation step. And I find you can get really burnt out just saying thank you to all the people or doing those personal asks to everyone. But if you have more mentors on board or more people in leadership in your project, they can take on a lot of that early step work too, so that you don't get as burnt out as your community scales and as it grows. So this is the mountain, so four minutes left. 
All right, so here's a link to that blog post. You can read this in written form, and it links to a lot of the references I talked about. So I had the research on the volunteerism stuff and Marshall Ganz's work. And then there's a link to some of the materials that we put together at Mozilla that we used this in. So, oh yes. We just launched maintainers.github.com, so we're reigniting the maintainer community at GitHub. So if you're interested in this kind of stuff, if you're a maintainer and you want to be talking about this with other maintainers, there is a private community, so you can go there and join. It does a little check to see if you're a maintainer, and if you are, then you'll get an auto invite, or you can fill in a few questions and then I'll review it and probably let you in. So huge thanks. Are there questions in lightning talks? No. I don't know, okay. But hi, I'm Abby. You can find me on Twitter or on Mastodon. Thanks for your time. And I put stickers on the edge of the stage if you want to grab some. They're in little individual packets because I didn't feel like taking them out. So you're welcome to take a whole packet, or you can search through and take a sticker that you like. But thank you so much. |
Should there be a standard in libre localization?
Ideas on how to make it easy for translators to contribute to any FOSS project they like |
Hello, everyone. So we have here Benjamin Jamie, who will be sharing "Should there be a standard in libre localization? Ideas on how to make it easy for translators to contribute to any FOSS project they like." Thank you, Benjamin. Thank you for the applause. I didn't deserve that yet, definitely. Thank you for being here in such large numbers. So I'm Benjamin. I work at Weblate. I do the talking there, and you can also meet Michal in this lovely hoodie here, and he's the mind behind the code and everything, and the founder. I won't be talking much about Weblate here. That's the talk tomorrow in the translations devroom. Here I would like to share a story and kick off something that could probably become a standard in libre localization, because at the moment there is nothing like that. First I would like to talk briefly about how the Weblate story started. It didn't start as a masterminded start-up plan that would get a lot of money and conquer the world of localization. No, that's other companies. We have been libre software from the start, and Michal actually started it when he was working at SUSE, and now SUSE people are very proud that he started it during the SUSE Hack Week 12 years ago. Sorry, 11 years ago. And for a long time this was the office. It was actually an office at the top of one of those large concrete buildings you have around the Czech Republic, built under socialism. When you needed an elevator there, there were those small blocks on top of the building where you put all the machinery for your elevator, but that's not needed now, so all the machinery is downstairs in the shaft, and Michal had his office on the top of the building, and it's like three by four meters or something like that. That's how he started it, and it was not aimed to be something masterminded and well prepared for all the use cases. 
At the start, the use case was just that Michal wanted something better than Pootle, which was a very popular tool but no longer maintained, and something that's very tightly integrated with the version control system, so you don't have to do much of the manual work. You are a developer, so you connect Weblate to your version control system and it just works. You don't have to care about anything, and then everything is in the editor, and you know localization editors from Weblate and from other similar tools. And then more and more big open source projects and some commercial companies came to ask Michal, hey, we would like to use it. Can we pay you for supporting it? And yeah, that's how Weblate started getting bigger, and these days there are large and small libre projects that are using Weblate, and oh, sorry, I forgot to put a Penpot logo there. We'll do that tomorrow, but there is Calidas, and more and more libre projects started using it, and it became some sort of a go-to tool. And then Fedora came to us, I think three years back, and migrated all their stuff from Zanata, the original platform they had created, to Weblate, because with Zanata it was like, we don't have time for that, that's not our core business, so we want to use Weblate. And that's how we realized that this is maybe the thing that will be happening more, and it became our responsibility. We took it very seriously to give the communities a tool that they can use happily and that they can contribute to, but it's still just one platform. 
It's Weblate, and last year we created a page called Discover Weblate. So if you have your projects on Hosted Weblate, which is the largest instance of Weblate on the internet, it's easy to search there, and it will be even easier throughout this year because of the development. And if you host your own instance, we wanted to give the users a place where they can come and search through all the Weblate instances that are open to contributions. So at this point, if the project is using Weblate, you have this place, and you can search for your project name, let's say Penpot, and you will be guided to the right place and you can contribute to the localizations there. But what about the case where the project is not there? Because there are so many projects in the libre world that are not using Weblate yet, or they want to stick with a different tool, because it's always your choice what you use in your community. Sorry. So there came an idea, I think two years back, and it's really important to mention the time frame, when I was speaking with Dwayne. 
Dwayne is a wonderful person, the author of Translate Toolkit, which is one of the core components that Weblate uses to work with translation files, and Dwayne is very talkative, like I am, so we were talking a lot. And then I told him, yeah, maybe there should be a way for someone who wants to contribute to libre software, like one place, possibly federated, so it's not in one central place. I, as a translator of libre software who's not technically savvy, want to contribute my time to the project of my choice, but I don't want to search through READMEs and wikis, because every project has its own different way, or there is no information and you have to dig really hard to find it. If you are at FOSDEM, you can walk around the booths and see a little advertisement or small stand that says this project is localized on Weblate, but if you are in the online world, you will usually spend a lot of time finding the right place to localize your favorite software. I told that to Dwayne, and Dwayne said, oh yeah, I had that idea a while ago and I have it very roughly specified. We talked more, and then he sent a nice, nine-page-long specification of how it could look and how it could work, and there are so many other features that we would like to have there. But that was two years ago, and nothing happened, because there is no time to do that in the Weblate team, and it should be a community thing. It shouldn't be something where Weblate says, this is how you should do it and everybody should use it. It can be like that, but we are at FOSDEM. There are so many wonderfully skilled people who may be interested not only in helping with the actual development but also in sharing their ideas, and because it's such an obvious thing that's missing, maybe there are many people who already have similar ideas or some kind of specification. I would like to use this talk for that. If you have an idea, I just want to kick
it off. We are not working on it yet. We just want to connect, we want to discuss, and maybe in a year, at the next FOSDEM, we will have something like that, and a lot of projects could use it, or every project at FOSDEM. And it shouldn't be just about Weblate. It should maybe be a file, like you have a license file or a README file in your repository, that can be indexed by some tool. So let's say I'm Localization Lab or some other lovely group of people that wants to localize open source software. They can take some kind of tool, put it on their website, on their own server, and say, if you have time, select the language you want to contribute to and find your project. And this tool should index those READMEs or translation md files from not only GitHub and GitLab but probably more and more places. Can it be done? Probably yes. It will have some limits. It probably won't scan all the repositories in the world, that's not possible, but maybe we can make something bigger than Discover Weblate that's of universal use for everyone. Now, what it should do is guide the translator to the right platform, to the right place. Because if I'm a translator, I don't know all the technical side of localization, which can be made easier by localization management software like Weblate. I just want to translate my hello to hello, and that's it. And it should be unified, so everybody can use it. It should be possible to index it somehow and to search through it. And yeah, that's a start. There is probably much more that it should do, but as a start this would be nice. And yeah, let's discuss. Not right now, because I have to add a link once I finish the talk, but this is the place where you can go and look through the Weblate schedule. I just double checked, yeah, it's right, there is no misclick. So if you go there, you can see the whole schedule, all the plans we as Weblate have for this FOSDEM. I will add there a link to the
discussion, and you can join, and you can share it with your friends who are interested in such things, and maybe we will come up with something interesting for every possible translator in the libre community. Or, ideally, this tool or this localization file or translation file, whatever we will call it, can welcome new translators, people who don't yet feel like "I'm a translator" and don't know how to get there, and maybe make it easy for everyone. So that's it. That's what I wanted to share with you, and yeah, I'm interested in what you think. We still have maybe three minutes, so we can talk right now. If you go to that link, you can also find our bio, stickers, t-shirts, and yeah, other things, and you can share your ideas there about Weblate and what we should include. If you don't like something, just come there and tell us, or you can join the online discussion that will be ready in a few minutes. And yeah, thank you for your attention, and if anybody has a question or anything, just raise your hand or come here and tell us. Thank you for your time, and enjoy FOSDEM. |
Do more awkward user interviews
Do you feel awkward interviewing users about how they use your project? That's ok — awkward interviews are often good interviews. |
All right, everyone. Thank you. We have Emily Omier for the next talk, and she'll be doing her talk, "Do more awkward user interviews". Do you feel awkward interviewing users about how they use your project? That's okay. Awkward interviews are often good interviews. I'll hand it over to Emily. Okay. Excellent. Can everyone hear me? Cool. Okay. So, how many of you feel really awkward when you do a user interview? Yes? Okay. How many of you are really excited to learn about how you can make your user interviews even more awkward so you can feel... okay, cool. That's what we're all here for. So, I'm talking about two things, basically: that you should do more user interviews regardless of how awkward you feel about them, and also that you should proactively employ techniques to make yourself feel more awkward, and possibly to make your interviewees feel a little bit more awkward too. So, I hope this sounds really awesome, and thank you all for joining me. The first point that I want to make about doing user interviews and feeling awkward is that a lot of people really hesitate to ask users in the first place, because they feel like this is a huge ask. I'm going to ask for like 30 minutes of somebody's time, so they can feel really awkward just about that part. Remember, though, that this isn't actually how your user thinks about it. Doing a user interview, from the user's perspective, is actually a way for them to contribute to your project. It's a way for them to tell you what works and what doesn't, and to help you make the project better. And also, people like to talk about themselves. You're giving the user an opportunity to talk about themselves and, from their perspective, hopefully get some of their needs met. They can give you their wish list. That doesn't mean you have to do everything on the wish list, but there is something in it for the user.
So, don't get too hung up on that first ask and feel like this is like a huge imposition on your user. Okay, so then let's talk about a little bit why you need these user interviews. You need to be able to understand who your users are. You need to understand why they're using your product or your project, why they care about it. You need to understand that in order to figure out how you're going to talk about your project, and also figure out what you're going to do with your project next, what components of your project, what features are really useful to people, what things people really don't care about, what things would actually be helpful if they were added, what things potentially could actually detract from the value of your product if you were to build them. So user interviews are really important. They're also a really important part of building community. Again, something to keep in mind when you're feeling like really awkward about asking for that user interview. It's a way for your users to feel more connected to you as a maintainer and more connected to the community. So user interviews, super important to do in the first place. And this is some of the information that I hope that everyone thinks of when they're doing their user interviews. It's just an example of things that you're going to try to be getting, extracting out of your users as you're doing these interviews. You want to know what triggered them to find your project, so did they hear about it at a conference like FOSDEM? If so, and you want more people to find out about your project, maybe you should go to more conferences. But you also want to know what triggered them to even look for something like your project in the first place. What were they trying to accomplish in that moment? Then you also want to know, what do they compare your project to in their head? And sometimes you'll find answers like some other project, and it will be obvious. 
But oftentimes people don't actually compare a project to another project. They compare a project to spending four hours a month doing something manually. They might compare a project to paying a fine. They might compare a project to losing sleep at night because they are worried about whatever it is that your project solves. So understanding what that mental comparison is will help you talk about your project more effectively, understand where your project needs to go in the future, and make more of a connection with your community. Another thing that you need to understand is how they interact with your project. Is this something that they actually touch every day or every week, or is it something that runs in the background and is really critical, but is very easy for everybody to forget? Especially if you have a project that's used in a team situation, you want to know who on the team is interacting with the project and in what way. You'd also want to know who the champion was. So maybe you have five people on the team who are interacting with your project. Who was the champion? How did they then evangelize it within that team? And then you also want to know whether there was a magic moment, hopefully there was one, and what that moment was. So when they first started using your project, what was the aha moment when they were like, oh, now I understand how this is working, this is so awesome? Understanding what that moment is is really important, so that you know what to optimize and how to make the process of getting a new user to that magic moment as smooth as possible. All right, so let's talk some more about goals, right? Okay, so what's your goal when you're doing a user interview? I've just talked about some of these things. This is a pop quiz. Is it to get some really amazing information about your project, or is it to make yourself look awesome and smart? Which? Okay, so there is a little bit of a caveat here.
You don't want to make yourself look like a total moron, especially, you know, people are trusting you, so you don't want them, like, if they're using your project, you don't want them to think you're a total moron, especially if you're trying to monetize this project in any way, in any way, depending on it financially. However, you also don't want to get hung up on the idea of making yourself look good during this interview. That is not the point of a user interview, the point is to get some really good information. So some of these techniques, these interview techniques are from journalism, I used to be a journalist, when you're a journalist you don't really care what your sources think of you, if they think you're an idiot, that's okay. So just to acknowledge, there are some limits, but then again, you know, your users are coming into this, they already respect you, they know your work as a maintainer, they depend and they like your project, so you don't need to show them how awesome you are, they already know that. Okay, so here are the two, like, awkward interview techniques that I want people to take away from this. So one is to strategically deploy awkward silences. The second one is to ask so many follow-up questions you risk looking dumb. But let's talk about the first one, the awkward silences. Okay, so in Western cultures, most cultures, it is a little bit culture-bound, but after a certain amount of time in a conversation, people start to feel like this silence is really awkward. And if you're in a conversation, that's really easy. You just ask the other person another question, boom, problem solved. But if this is an interview and one person is asking questions and the other person is answering them, and the interviewee has finished answering their question and the interviewer doesn't do anything and just sits there and is silent, it's going to make the interviewee start talking again. 
And what happens is that first answer that you get to the question is like the pre-planned canned answer that they've already sort of mentally vetted to make themselves look good, possibly to make their organization look good, and not reveal anything that's too embarrassing. That's good. I mean, you're going to get some good information there. But if you want to get beyond that, then after that person has finished answering the question, let it sit and let it feel a little bit awkward. And if you can sit with that awkward silence long enough, the interviewee will start talking again. And when he or she starts talking again, that's where you get the information that they haven't really pre-vetted with themselves. They'll start rambling. They'll get the sort of like more emotional, more off-the-cuff answers to questions like why did this matter or why did you want to do that? Even if you were to interview somebody about what they had for breakfast, I can guarantee you that if you just let an awkward silence sit there, they would start rambling on about why they chose whatever they had for breakfast because they feel like they need to fill that awkward silence. So the next thing, the next technique is follow-up questions. Let's use the breakfast example again. If so, you ask somebody, what did you have for breakfast? And they said, I had a croissant. And then you ask them why? Then you're going to start getting some more information. And then maybe they say, that's all that was available. And then you ask them why? And you keep asking why again. This can start to feel a little bit uncomfortable because you're really pressing people to talk about their core motivation. But you're also getting information that is much deeper, much richer. And one of the tricks here and the thing that can honestly make you risk looking like an idiot is that you want to try not to assume anything. 
And this can be particularly hard in a technical situation, because there are times when you want to show that you're not an idiot, that you know why this was important. Somebody says, we felt like we had to improve our supply chain security, and you ask why. It's like, well, isn't it obvious why we had to do that? But actually, it's not, because why at that moment was that suddenly a priority? Why hadn't you done it before? So keep asking those why questions to try to dive into really understanding what the motivation is, without assuming anything. Even if it seems like it should be self-evident what the answer is, don't make that assumption. All right, so I am going to give you some homework. This homework is really easy. Choose a friend or a colleague and ask them to do an interview with you. It can be about anything. It can be what they did yesterday evening, or what they had for breakfast. And I want you to try and see what happens when you don't jump in with a new question and just let that awkward silence hang. The key to doing this is that you have to set the expectation that this is an interview. So the other person is not allowed to ask you a question in return. They're only allowed to answer the questions that you're asking them. And the reason that I actually recommend practicing this is because it is way harder than you expect. It should be really easy, right? You're just sitting there quietly, but it goes against your whole being to just let that silence sit there for as long as you can possibly stand. And that is why it works, quite honestly. Okay, so that's it. I'm Emily Omier. I have a podcast. It's called The Business of Open Source. You can connect with me on my website. You can write me an email. I'm also on LinkedIn. You can send me a message on LinkedIn. You can just search for my name and you will find me. I don't think there's anyone else with my name. Before we wrap up, though, is there anybody who has questions? Yes?
Do I need a microphone for this or is it more effective to do this with a microphone or...? Just ask the question and I'll repeat it. Is it more effective to do this with a microphone or...? Oh, is that the question? You mean if you're doing a user interview? I mean, honestly, when I do interviews, I use Zoom, so I just use whatever's on my computer. But no, it doesn't matter. This is a situation where technology just doesn't matter. However you do an interview, it doesn't matter. On the other hand, I will say I understand where the question's coming from because people do behave differently when you put a microphone in front of their face. If anything, no, do not put a microphone in front of people's face because that triggers the self-censor, and what you want is to get rid of the self-censor. Yes? So you said that people shouldn't ask a question of you during this interview, so one way information is... Yep. If they do ask a question, do you stay silent to increase your... So the question is if the interviewee asks you a question, should you just stay silent to increase the awkwardness? That's a really good question. I am not sure. That's actually never happened to me, so I don't really have a good... I think I would go for just answering the question and then asking a follow-up, right? So you're... Yeah, you could do that too. So I think I would just... You want to use the two techniques together anyway. All right, we probably have time for one more question. Yes? Does being awkward ever break the interview or the trust and make it harder to go on? The question is, does being awkward ever break the trust and make it even harder to go on? In my experience, no. I have never had a situation where being awkward broke the interviewee's trust. I don't think so. I mean, there's probably... There is a limit, right? But as long as you're being polite, then I wouldn't worry about that. 
And if you have any sense of social grace whatsoever, you will probably totally max out on your awkwardness well before you would break someone's trust. All right. Yeah. Okay, we have seven seconds. So thank you so much. Thank you. Thank you. You're welcome. Thank you. |
Beyond Wikipedia: Discovering Wikimedia's Open-Source Ecosystem |
Hi and welcome everyone. I am here today to speak to you a little bit about Wikimedia's open source ecosystem. So I assume all of you know what Wikipedia is, and maybe some of you know that it runs on software called MediaWiki. All the wikis run on this software, but there are also tens of thousands of websites around the world that use it. A very cool example is NASA, which uses it for some of its projects. This is sort of the core of Wikipedia and the other projects, and it's of course open source, something anyone can contribute to, but it's not what I'm going to be talking about today. Because surrounding Wikipedia and all the other projects there is a huge ecosystem of software tools. You can think of these as third-party integrations. People build bots that do edits on Wikipedia and that fight vandalism. There are machine learning algorithms. There are data pipelines that feed research. There's a lot going on. So I'm going to be talking about this part. Just a quick word about me. I'm a software engineer with the technical engagement team. You can see we're part of the technology department, and our team is split into two parts. We have the cloud services team, SREs and engineers who build services and platforms for all these tool developers. Then we have developer advocacy. They do a lot of things. They write documentation, run outreach programs, and do everything so that our technical contributors can build cool stuff on top of our platforms and content. So just to give you an idea of the scale of this, we have over 300,000 editors who contribute to Wikimedia projects every month. Wikimedia Commons is the project for free media files: videos, images, and so on. There are now over 90 million media files on there. And we have 1.7 billion unique devices that access Wikimedia projects every month. So some of you may recognize at least some of these projects.
Of course, Wikipedia is the flagship. There are other ones like Wikidata, Commons, which we just mentioned, Wiktionary, and many more. So yeah, we're going to take a look at the tools ecosystem that I mentioned in the beginning, and we're going to start from Wikipedia. This is the thing that most people know about us. And from there, we're going to explore the tools, which are community-created software that interacts with and contributes to Wikimedia projects in some way. An example here is Pywikibot. It's a framework for building bots. So if you have a wiki and you want to run some kind of bot that does some type of edits, you would very likely use Pywikibot as a framework to develop it. From the tools themselves, we're going to go and have a look at the services and the platforms that support them. And we're going to start with an example of a couple of tools and how they actually integrate with one of the projects. So this is a WikiProject called Women in Red. A WikiProject is a group of users on Wikipedia who decide that they want to work on something specific. They come together to work as a group. And in this case, it's to fight the content gender gap. They observed that only around 15 percent of English Wikipedia biographies were about women. And as of the 23rd of January this year, they have managed to take this number up to around 19.45 percent in about seven or eight years. So where does this very precise statistic come from? You can see mentioned here in red, it's something called Humaniki. This is what we would call a tool, and in this case it is a dashboard. It provides statistics about the gender gap on all the Wikipedia projects. You can see here that female content is the orange part and the rest is male. And if you go to this website, you can see it in a more granular way: by country, by project, by date of birth, and so on.
So if you want to contribute to this project, an easy way to do it is to go to this WikiProject page, where you can see different lists that have been curated. For instance, here we can see female activists. So you can get a list of all the female activists. And there are many, many categories like this. Some of these lists are curated by humans, but most of them actually come from a bot called ListeriaBot, which makes queries on Wikidata, which is another one of the projects. You can think of it as a huge knowledge graph that you can query using a language similar to SQL. It's called SPARQL. So you can use Wikidata to get lists with very high granularity. You can have activists from Germany, or activists from Germany who were born after a certain date. And this is what ListeriaBot does. So we have seen two different tools. One was a dashboard, one was a bot. And there are thousands of these tools. There are thousands of maintainers. And we're going to take a look at how we sustain these people. So I mentioned that my team is the cloud services team. What we do is provide hosting. We provide compute, virtual machines. We provide data services for all these tools to function. And again, to give you an idea of the scale of this, 30% of all the edits on Wikipedia as of 2020 were made by bots hosted on our services. For Wikidata, that number is a little bit higher. It's around 50%. So just to make you aware that this is quite an important part of the ecosystem. I mentioned there are thousands of tools. As of a couple of years ago, we have a catalog where you can browse and search and find the tool you need for your project, or if you are a tool maintainer, you can add your tool here so that people know it exists. What you see here are lists that have been curated. We have something called the Coolest Tool Award. If you look down, you can see that Humaniki was one of the award-winning tools in 2021.
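As a rough illustration of the kind of Wikidata query described above (activists from Germany born after a certain date), here is a minimal Python sketch that only builds the SPARQL text. It is not ListeriaBot's actual code. The function name `build_people_query` is hypothetical, and the QID used for "activist" is an assumption that should be checked on Wikidata before use. The resulting string could be pasted into the Wikidata Query Service.

```python
# Sketch only: builds a SPARQL query string; it does not contact Wikidata.
# Property IDs used: P31 (instance of), P21 (sex or gender),
# P106 (occupation), P27 (country of citizenship), P569 (date of birth).

ASSUMED_ACTIVIST_QID = "Q15253558"  # assumption: verify this is "activist"
GERMANY_QID = "Q183"                # Germany


def build_people_query(occupation_qid, country_qid, born_after, limit=50):
    """Return SPARQL listing women with an occupation and citizenship,
    born on or after `born_after` (an ISO date such as "1950-01-01")."""
    return f"""
SELECT ?person ?personLabel ?dob WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P21 wd:Q6581072 ;
          wdt:P106 wd:{occupation_qid} ;
          wdt:P27 wd:{country_qid} ;
          wdt:P569 ?dob .
  FILTER(?dob >= "{born_after}T00:00:00Z"^^xsd:dateTime)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


query = build_people_query(ASSUMED_ACTIVIST_QID, GERMANY_QID, "1950-01-01")
print(query)
```

Dropping the `FILTER` line gives the coarser list (all matching people regardless of birth date), which is the granularity trade-off mentioned in the talk.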
Some of you may recognize this as a Jupyter Notebook. This is a JupyterHub deployment that we have that is directly integrated with all of our data services, so that people can access dumps, they can access the Wiki Replicas, they can access a lot of things that would otherwise be gigabytes and gigabytes of data they would have to download onto their own computers. Another tool is called Quarry. It's a public query interface for the Wiki Replicas. The Wiki Replicas, I didn't mention it, are replicas of our production databases. And the cool thing about this is that all the queries are public, so people can actually search and see other people's queries and be inspired, or if you're not very good with SQL, you can adapt someone's query to your needs. Here you can see a specific query. So these services are still tools that serve this ecosystem, but we also need somewhere to host them. So we have a platform-as-a-service offering. It's called Toolforge. It's not quite as fancy as Heroku or DigitalOcean or anything of this sort. If you look closely, you see that you actually have to SSH into it. But it's still very powerful and very convenient for our users. It integrates, again, with data sources, and it has managed databases and an Elasticsearch cluster that everyone can use without having to maintain all these systems themselves. The back end here is Kubernetes. Then there are more complicated projects. Some projects need more compute, for instance. For those we also have a Cloud VPS offering, so that people can spin up their own virtual machines and basically do what they want on them. This runs on top of OpenStack. And how could one get involved with this? It's possible to get involved in any of these layers, either as a tool maintainer or as a maintainer of any of these platforms. And that's the thing I wanted to highlight a little bit today: this is kind of a unique opportunity to actually contribute to platform and to infrastructure.
We have people who work with our team. They are on our IRC channels and they push patches just like everyone else, and if you didn't know, you would think they were just another software engineer on the team. And I asked some of them, why, what brings you here? Of course many of them identify with the free knowledge movement and open source and all of that. But many also said that this is a unique opportunity to actually play with things like OpenStack or Terraform or Kubernetes in a situation where you have real traffic and real users, which is something that is very difficult to do at home, and there are not many other projects where you would have this possibility. So, some ways to get involved. We have several outreach programs. We have Outreachy, which is an internship that runs twice a year. It's targeted more towards underrepresented demographics. There is Google Summer of Code, which is once a year. Both programs are open to anyone, so you don't have to be a student. You could be someone who is changing careers or doing some kind of lateral move. Google Summer of Code has also become more flexible. It's not just summer anymore. There are shorter projects. There are longer projects. So that could be a way to get involved and get some kind of hands-on mentorship. Another way would be to come to the Wikimedia hackathons. We have one in Athens in May this year, and then one that is part of Wikimania, which takes place in Singapore in August. And of course, if you are brave, you can just dive right in, because everything we do is open and it's out there on the internet. Documentation, of course, but even our project boards on Phabricator, which is collaborative software for task management and such. If you go there, you will see that there is a huge variety of tasks. You can see the work boards of different teams at the foundation. You can see volunteer-led projects and projects where people work alongside each other.
So a way could be simply to find something that interests you and look at the documentation and then come on our IRC channels and contact us and that's it. So yeah I have added some links which can be helpful to get started. And you are of course free to just reach out to me. I had my Twitter handle on the first slide and my slides are published on the website. We have 45 seconds for questions. Thank you. Thank you. |
data mountains - turn your data into mountains!
convert geospatial points into triangles scaled by data |
Hello, everyone. It's good to be back. It's been a while. This is my first time giving a talk here. I'm really pleased to be here. My name's Joe. I am a coder. I work in London for local government. I work a lot with geospatial data, and I am a Python programmer. Have we got any Python coders in today? Anyone using Jupyter? Cool. Right. So let's go. So in lockdown in 2021, we had a census in England and Wales, and the data is coming out now. Most of the data, all of the data, sorry, is spatial data, so we want to look at it on a map. Why? In local government, everything that we do generally happens somewhere, whether it's collecting a bin, looking after young people, looking after old people, or cleaning the streets. We always have to think about where this is happening. Apparently, 60% of all data is geospatial data. So I spend a lot of my time making maps and data viz of this. Now, I'm going to be focusing on one part of the census data set today, and that's the East End of London, in an area called Tower Hamlets. This may be familiar to some people if you've ever seen places like Columbia Road, Bethnal Green, or Canary Wharf. These are all parts of the East End of London, and this is the main area I'm going to be talking about. So where is Tower Hamlets in London? What you can see here is a very small area. It's 20 square kilometers, but this is quite a special area, because in the whole of England and Wales it has the highest population density. It has the most people packed into a small area. It also has the fastest growing population, so it's becoming more and more dense. So in terms of providing services for residents, we need to have a big think about where all the people are and how they fit in. Now, when we make maps, the first thing we usually do is make a choropleth map. However, the data set for population density in our area, and I do apologize, I couldn't fit it all on screen, doesn't appear very well as a choropleth.
The reason is that the data set is not very evenly distributed. There are, as we will see, some areas with extremely high population density. Over here you've got Whitechapel. We have very high population density in Whitechapel. Over here we have a new development on what used to be industrial land. Again, very, very high density developments, big, big towers full of people. And then we also have, just to the south of the financial sector, some areas of very high population density, with a lot of people packed into a small place. But in terms of data viz, this map doesn't really help very much. So the choropleth data viz didn't work for us, and we began to think, what else can we try? We checked the data distribution, and sure enough, we've got some serious outliers. This is why the choropleth map didn't work very well for us. So what did we do next? We tried to log-transform the data. And yeah, you can see this area here. You can begin to see the density there. There are quite a few large developments with a lot of people squeezed in. In Whitechapel, you don't see so much happening. But you do see, just to the south of the financial sector, a high density of population. The areas with low density, this is where all the banks are, so obviously there are no people living there. This is an old dock near the Tower of London. There are no people living there. There are some very nice pubs, though. If you ever find yourself in that area, the Dickens Inn is excellent. I can recommend that to everybody. And then up here in the north, we have Victoria Park, which is where the East End borders Hackney. And obviously, there are no people there, at least none with their address registered there. Log-transformed data looks better on a choropleth map. However, you can see the legend. You lose the data, because the legend now shows log values. You can try to fix the legend, but we want to write as little code as we possibly can. We don't want to keep fixing legends and things like that.
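A minimal sketch of the log transform described above, using made-up density values (not the census data) with one extreme outlier. It shows both why the transform helps, since the outlier no longer dominates the scale, and why the legend has to be fixed afterwards, since the transformed numbers are no longer people per square kilometer. Using `math.log1p` is a choice made here so that zero-density areas stay at zero; the talk does not say which log function was used.

```python
import math

# Illustrative values only: people per square kilometer, one big outlier.
densities = [0.0, 5_000.0, 12_000.0, 30_000.0, 2_000_000.0]

# log1p(x) = log(1 + x), so an area with zero residents maps to 0.0.
logged = [math.log1p(d) for d in densities]

# How much the largest value dominates the second largest, before and after.
raw_ratio = densities[-1] / densities[-2]
log_ratio = logged[-1] / logged[-2]
print(f"raw ratio: {raw_ratio:.1f}, log ratio: {log_ratio:.2f}")

# The map legend now shows log values; expm1 inverts log1p so the
# legend ticks can be relabelled in the original units.
legend_labels = [round(math.expm1(x)) for x in logged]
print(legend_labels)
```

The extra `expm1` step is exactly the "fixing the legend" work the speaker wanted to avoid.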
So we began to think about other ways to visualize our data set. So what did we do? I am a Python coder, but there's a really nice package in R called cartogram. It uses a technique called a density equalization algorithm that basically turns your data set into a Voronoi diagram first, and then rescales the polygons from the Voronoi relative to an attribute of the data. This technique is quite popular. There's a wonderful geographer called Danny Dorling, who has an amazing website called Worldmapper, which I strongly recommend you have a look at. They do things like showing poverty, inequality, and food pressure all around the world, and they size the geographies relative to the attributes of the geospatial data. So this is a great technique. There is one issue here, though, which is that if you want to overlay different layers, it becomes difficult. And also, the map does look a little bit unfamiliar. But it does show clustering particularly well: where you have a number of census areas together, and I'm going to say a little bit more about census areas, that have a high data attribute value, they all get bigger together. So what we can see here is, just to the south of the financial sector, there are a lot of worker bees all crammed into this place, and that increases the area on the map. So it's a nice data viz, but we still have a small challenge if we want to add more data over the top. And also, it's a bit unfamiliar for people who don't use cartograms. So this is a map made using Datawrapper. It's a very nice website, and they have something called a symbol plot. What this does is basically show little mountains, little peaks, that show the value of the data attribute that you're interested in at the place where that data is happening. And so again, we can see over here, you've got Whitechapel, lots of people packed in there.
Just to the south of the financial sector, lots of people packed in there. The new developments here by the river in Blackwall, and here by the river in the old industrial zone. So this is quite interesting. It gives us some context, and it gives us the data. I really like this data viz, but it's Datawrapper, so it's not FOSS, and it's not Python, and I like to use Python. So it was great, and it helped, but it didn't do everything that we needed it to do. The other thing that you will notice, and I'll try to explain this briefly, is that we have one really high value here. It's this value here. It's an outlier, and the reason why it's an outlier is that the actual census area is really, really small. The thing about the people who produce the census data is that they have to create census areas using roughly 100 to 600 people. Generally speaking, it's about 300 people, but they have to make it all fit together like a big jigsaw puzzle. So sometimes it's hard for them to make it work really well. In this case, this census area with really high density is actually just one building. It's not a particularly big building, but everyone is squeezed in there. So yes, the data is quite hard to work with, but it is interesting. When I was working with Datawrapper, I really liked it, and it reminded me of when I was young and I was reading the Lord of the Rings books. I used to really like the map at the front with all these mountains, showing the Misty Mountains in those books. And so I was thinking, I could probably make a mountain with Python. How hard can it be? It turns out it's really easy. This is the essence of the library. It's just one function. You take a point on a map, and you turn that point into a line. The line has a start point, which is just a tiny little bit of longitude to the west of your point.
Then you convert your point to a latitude, which is kind of like a proxy for the height of the mountain, using some kind of algorithm that you choose. In my case, I'm just like using a range. So I take the minimum and maximum value of the input range, which is a separate function here. And range one is essentially the minimum population density and the maximum. And then I convert that to latitude values. And then the third point on the line is just a little bit of longitude to the east of my point. And then you use that to create a small triangle, really easy, really easy, and a lot of fun as well. So this is what I made with Python. And it's very similar to the data wrapper map, but I was going for like a kind of hand drawn kind of a look to make it look like something from Lord of the Rings. And, you know, it's the same thing. You've got Whitechapel here. You've got the financial sector here, and so on and so on. So that was fun. But, you know, population density, we were just talking about the reasons why it's a messy data set. There's one place in Chelsea, which has a population density of two million people per square kilometer. So this is a very difficult data set to represent using any tools available. So, you know, it's interesting. The other thing about Kensington and Chelsea is this is where Grenfell Tower is, if anybody knows about that story. This is where it happened. So let's try some other data sets to see if they're really messy. This is people that live in one bedroom homes. So this is tiny little flats, you know, filled with people. And so you can see all the worker bees for the financial sector. A lot of those are living in one bedroom flats. And actually, the new builds. This is a very new development here. And this is a very new development here. So it looks like people who are building homes now are building a lot of one bedroom homes. Two bedroom homes. Generally, everything is kind of the same. Nothing really jumps out here. 
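Coming back to the mountain function described above, the idea can be sketched in a few lines of Python. This is only a minimal illustration of the steps as described in the talk, not the library's actual code: the function names `scale` and `mountain`, the parameters `max_height` and `half_width`, and the sample values are all assumptions made for the example. The data value is rescaled linearly from the input range to a latitude offset, and the three points (west, peak, east) are closed into a small triangle.

```python
def scale(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max]."""
    if in_max == in_min:
        return out_min  # degenerate range: avoid division by zero
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)


def mountain(lon, lat, value, value_range, max_height=0.01, half_width=0.001):
    """Return a closed triangle as a list of (lon, lat) pairs for one data point.

    max_height and half_width are in degrees and are illustrative defaults.
    """
    lo, hi = value_range
    height = scale(value, lo, hi, 0.0, max_height)  # value -> latitude offset
    west = (lon - half_width, lat)   # a little bit of longitude to the west
    peak = (lon, lat + height)       # height expressed as extra latitude
    east = (lon + half_width, lat)   # a little bit of longitude to the east
    return [west, peak, east, west]  # repeat the first point to close the ring


# Hypothetical population densities (people per square kilometre).
densities = [5_000, 12_000, 30_000]
value_range = (min(densities), max(densities))
triangle = mountain(-0.07, 51.52, 30_000, value_range)
```

The resulting coordinate ring could be handed to any geometry library as a polygon. A linear rescale is only one choice; with an outlier like the single-building census area mentioned earlier, a capped or logarithmic scale might give more readable peaks.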
Three bedroom homes. What you can start to see with three bedroom homes is that, yeah, it's generally even. But actually, in this area here, which is Bow, near the Bow Bells church, which is used to decide if someone's a traditional East End cockney or not. That's kind of this area, really. So the cockneys seem to have three bedroom homes, generally. And then four or more. What you see here is that in the areas where the financial workers live, there are still quite a lot of four bedroom homes. But in some of these new build areas, there are very, very few relative to the rest of the area. So let's look at another slightly more famous area. This is Westminster in central London. You can see this is where Hyde Park is. There's no one living there. Again, this is the population density dataset. And then you've got an OpenStreetMap based map just to help with orientation. In a future version of the module, I think I might do some more stuff with OpenStreetMap. And then if you look at some of the outer London areas, and this is where I live, you can see areas of urban density, but you can also see some very suburban areas where the population density is lower. This is where most people are living in houses, basically. And you can also see green space. So we're nearly finished. I just want to give a massive shout-out to nbdev. It's really good if you use Jupyter. Just check it out. Number one, if you're trying to do version control on Jupyter notebooks, it helps you with any clashes, any merge conflicts, because it removes the metadata in the JSON that sometimes causes conflicts. If you have a team of people working on the same notebook, this is a real lifesaver. And also it just bakes in good practice. It means that your code gets shared on GitHub really easily. It helps you, or encourages you at least, to write good documentation for your team and the community. It also encourages you to write good tests.
And it enables you to publish modules. So big shout-out to them. I'd also like to thank Jarek, who has produced a wonderful PWA for FOSDEM called Sojourner. Do check it out. It's a really good way of looking at the schedule for FOSDEM, and you can watch the videos with Sojourner. And also Ed, who's going to be giving a really cool talk on OSM and Wikidata. And finally, I'd like to thank all the council coders everywhere. Thanks for having me. |
CoffeOSM: improve OpenStreetMap a receipt at a time
checking and add shop on the map with a receipt |
Okay, the next speaker is Michele Tameni, who has a talk about CoffeOSM: improve OpenStreetMap a receipt at a time. So, check in and add a shop on the map with a receipt. The stage is yours. Hello everyone and thank you for being here. It's really nice to be here after attending like 10 editions of FOSDEM. It's my first talk, so it's really great to be here. I want to talk to you today about a little project I started like five months ago, I think. It basically wants to be a new, different and hopefully easier way to add places, in particular business places, to OpenStreetMap. To give you a little bit of perspective: the usual process to insert a new place or a business into OpenStreetMap usually involves checking if the place is already on OpenStreetMap, so you open the map and look for the location where you want to add the place. You gather all the information needed: the name, the position, the address, maybe the phone number, the website. Then you open your preferred editor, the website or, you know, OsmAnd or something like that. You insert all the information, check if it's correct and then save it. It has become easier over time, but it's still a time-consuming task. And especially, I sometimes found myself having problems finding updated information about a business, like the phone number or the website. Sometimes you find incorrect information online. So I think it's sometimes quite hard to insert a new place. And so I got this idea, and I'll tell you later how it came about. But to validate the idea, before coming to FOSDEM we took a slightly longer way. I went to Zagreb with one of my friends to drink some beer, for a serious purpose obviously. When you are in a new city that you don't know, you usually open the map and have a look at where the majority of pubs, restaurants or something like that are.
And when you see something like that, you know that maybe that street is a good place to be, or that on the other side of the city there is something else. But what you usually do is look for restaurants, pubs, bars, et cetera. So it's quite important for me to have this kind of information on the map. OpenStreetMap, I think, has improved a lot over the last years, especially in Europe maybe. But sometimes it's lacking this information, I think. So we went to Zagreb, and we tried to find as many bars as possible for this research. What we found out is that there are many more pubs and drinking places than what we spotted on the map at first. So our question is how we can improve this. Obviously we can do more trips like this one and then insert all the information that we gather while traveling. Fortunately there are so many volunteers around the world that do this kind of stuff and insert all the kinds of places that we can find on OpenStreetMap nowadays. But as I told you before, it's a time-consuming task and you always have to find the correct information about the business you want to add. So what can you do? You can do the things that I already said, or do something like Bob does, who is seated over there. I think he does quite a smart thing. He collects all the receipts that we get over our drinking nights. Afterwards he checks if the place where we went is already on the map or not. And if it's not, it's inserted in OpenStreetMap. With all the information already there, you don't need to look everywhere, because usually the receipt has the business name, the address, the location, the phone number, sometimes the website. So I think it's quite a smart thing to do. Maybe it's not that smart that Bob usually does this after too many beers. That's another problem. And to avoid his mistakes, like typos or something like that, I think it could be interesting to try to automate the process and extract the information from the receipt.
And basically the idea is to use a picture of your receipt, so that you get all the information that you need to insert the place, and eventually you insert the place directly if it is not on OpenStreetMap. So CoffeOSM basically does these things: extract the text from the receipt, try to tokenize and label the data that it can find, and check if the place is already on OpenStreetMap. And if not, maybe (because it's not actually possible yet) you can insert it directly into OpenStreetMap, or at least copy and paste all the information you need. The project is actually quite small, it's just a proof of concept. I thought a little bit about the architecture of this project, and at first I started to try to do a standalone app. But I think that maybe it's better to have something that can be easily integrated into other applications. There are projects like StreetComplete and others that do a great job of making it easy for people to contribute to OpenStreetMap. So I think it could be really, really interesting to maybe integrate a function like this into those apps. So I just mocked up a small Python API, based on FastAPI, that exposes an endpoint where you can just upload an image. Then the software tries to understand what is on the receipt, labels all the data and just returns it. The front end is actually just a really, really small application that shows a small form and visualizes the information that the backend extracts. As I said, future integration with the editors is, I think, the way to use this kind of function, if it proves to be interesting, useful and viable. Or maybe it can be just a standalone app or a PWA. I actually don't know. It's something open, and I'm here to discuss it actually. So how does it actually work? The receipt is uploaded to a server.
I remove the EXIF data just to have a little bit of privacy, because it can contain things like the location and the time when you went to a place, and I think that's not something that you want to share. The image is preprocessed a little bit before the OCR. Actually, it's something really basic that I think can be improved a lot. Then there is OCR with Tesseract OCR, which works quite well, but I think it can be a little bit better, maybe by processing the image a little bit more beforehand. Then I tried several ways to parse the data: with libpostal, which is actually what I found the most reliable for this task, and with spaCy. Maybe it could be interesting to train a custom model, because I found some open source models that can understand what a receipt or an invoice says, but they are usually trained on the products that you buy, to find the price and the total, and not the business name. So that doesn't work really well actually. Then we can just look with Nominatim whether the place is already on OpenStreetMap or not. If we can't find the exact name, I try a location search with the Overpass API: I look at the address, have a look at all the businesses of that type that are around, and show a list to the user just to make sure that the place is not there with a slightly different name. So what can be done differently or better? I have this in my to-do list. Improve the data extraction. As I said before, the OCR actually works well. I tried the Google Vision API, which works much better, but I don't want to use it actually. So I think that maybe with a little bit more preprocessing of the image, all the data extracted could be a little bit more accurate. As I said, the front end is actually a really, really small application that just shows some information. It's maybe better to do all the stuff on the client side for privacy reasons.
So I can just read the text, keep it on my device and just upload or save the new place to OpenStreetMap, choosing what information I want to share. Integration. As I said, inserting the place directly from the app would be great, or integrating it with some other editor. More receipts to test and improve with, because what we spotted in Zagreb, so our trip was not useless actually, is that there is obviously no clear standard for a receipt, and it can change a little bit from place to place. In Italy they are almost all the same. We didn't find this to be true in Zagreb. So maybe collecting more information about how receipts look all around the world could be great. What can be done differently? This is the approach that I thought was the easiest, but maybe, like I said, having a custom model that can label the information a little bit better could be interesting, at least because sometimes you find not the name of the place on the receipt but the name of the business, which is sometimes different, or sometimes you find both. And libpostal, which I use, actually gets confused, so you don't get a reasonable result every time. So what I want is to find more pubs, more beer, more fun, to help Bob drink some more good beer, and absolutely to improve OpenStreetMap together, just having an easier way to add places so we can collect more information. So thank you. This is the website where you can find the source code. I also have a temporary playground where you can upload your receipt to test. And that's it. I just want to know if you find this idea interesting, if it could be good to go on with this project, if you have some suggestions. So if you have questions, please ask. So thank you.
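The duplicate check described in the talk (search the surroundings for businesses of the same type, then show the user any place with a slightly different name) can be sketched as follows. This is a hedged illustration and not CoffeOSM's actual code: the function names, the 100 metre radius, the similarity threshold and the sample place names are assumptions for the example. The sketch only builds the Overpass QL query string and ranks candidate names with the standard library; sending the query to an Overpass endpoint is left out.

```python
from difflib import SequenceMatcher


def build_overpass_query(lat, lon, amenities, radius_m=100):
    """Build an Overpass QL query for amenities of the given types near a point."""
    pattern = "|".join(sorted(amenities))
    return (
        "[out:json];"
        f'nwr["amenity"~"^({pattern})$"](around:{radius_m},{lat},{lon});'
        "out center;"
    )


def similar_places(receipt_name, nearby_names, threshold=0.6):
    """Return (name, score) pairs that resemble the receipt name, best first."""
    def score(name):
        return SequenceMatcher(None, receipt_name.lower(), name.lower()).ratio()

    scored = [(name, score(name)) for name in nearby_names]
    matches = [item for item in scored if item[1] >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)


# Hypothetical coordinates and names, for illustration only.
query = build_overpass_query(45.813, 15.977, {"pub", "bar"})
candidates = similar_places("Craft Room", ["The Craft Room", "Pivnica Medvedgrad"])
```

Anything returned by `similar_places` would be shown to the user for confirmation rather than treated as a definite match, since the name printed on a receipt can be the legal business name instead of the name on the sign.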
Because maybe you haven't snapped the photo right when you went out of the place, that's the first reason, and the second reason is privacy, as I said before. But I think it would be great to offer users the possibility to choose if they want to share that information. It's much easier, because if you have the EXIF coordinates, you can just have a look in a really small area around them and easily find out if the place is already on OpenStreetMap or not, and even be more accurate when you insert it. So it's obviously a nice idea. I haven't done it just because sometimes, like Bob, you don't do the insert right away, but the day after. Okay, I repeat the question. That's a nice idea. Yes. |
Announcing pg_statviz |
Hello. So pg_statviz: from the name, I think you can understand that it's probably something to do with Postgres. And it is. It is a new Postgres extension and utility pair, so it comes with its own tool that you use outside of Postgres. It's minimalist. We'll get into that in a moment. It only does the thing it's supposed to do, and it doesn't touch anything else in the system that it's not supposed to. The purpose of pg_statviz is time series analysis and visualization, that's the viz part, of Postgres internal statistics, that's the stat part. Postgres internally keeps its own statistics. They are cumulative and dynamic statistics, right? So, for example, number of buffers written is a cumulative statistic that keeps going up. You also have dynamic statistics like pg_stat_activity that tell you what's happening inside your Postgres server at that moment in time. So if you take snapshots of these statistics from within Postgres and you perform time series analysis on them, you can gain insights into how your server is behaving. The utility that comes with the pg_statviz extension can produce visualizations for selected time ranges on the stored snapshots that are inside the database. So you can, for example, take snapshots of your server every 15 minutes during the course of the day and then analyze them over 24 hours to see what your peak times were and what was happening inside the server at that time. I wouldn't recommend taking snapshots more frequently than once a minute. And it's easy to see why. You have too many snapshots, and it's harder to see the bigger picture, maybe. So the reason for all of this is that you want to track your performance over time, and potentially you can troubleshoot why your server is not behaving the way you expect it to and do additional tuning. So, minimalist: this is a tiny package that is based on the KISS and UNIX philosophies. So keep it simple and sweet, right?
And the UNIX philosophy is that it comes with a tool that you can run as a normal Postgres command line tool like psql, with the same parameters and everything else. It allows you very simply to create snapshots of the statistics and visualize them. So it's modular. We'll get into the modules in a minute. It's minimal. It's the least amount of code I could write to make this thing work. And it's unobtrusive, so you can take snapshots without affecting any other activity running on your system. I think that's very important for being able to monitor and analyze in production. So the components are the Postgres extension, as we said, and the Python utility that retrieves the stored snapshots from the database and creates simple visualizations with them using matplotlib. The extension is written in plain SQL and PL/pgSQL, so there's nothing to put in shared_preload_libraries. This means that you can just type CREATE EXTENSION and start using it without even restarting your server. So CREATE EXTENSION pg_statviz is all you need to do. We're working on the packaging now to get it distributed through the PGDG repos, the PostgreSQL Global Development Group repositories. And by extension, it will hopefully find its way into distributions soon. The way you install the utility is very simple. You just type pip install pg_statviz. If you tried that this morning, it wouldn't work, but I just uploaded the file so you can try it out. As I said, this is a last-minute talk. It's very new. The code is pre-production quality. I would call it alpha code, but you can give it a try for yourself and offer any suggestions or fixes, or tell me what I'm doing wrong. Now, the extension can be used by superusers, but you don't have to be one. The only thing that the extension needs is pg_monitor role privileges in order to be able to select from the internal Postgres statistics tables. And the usage is dead simple.
And to take a snapshot, you just type, from within a client, SELECT pgstatviz.snapshot(). Now, why is there no underscore there? It's because Postgres doesn't like us naming schemas pg underscore something. That's reserved only for core Postgres, so extensions are not allowed to do it. So, what does the command line look like? You just pip install pg_statviz and you have the utility, and the utility, when you ask for help, looks like a normal Postgres utility. You get your database selection, user name, host name, port, et cetera, the same way you would connect with any Postgres client. And you've got modules like buf, which shows you statistics on the background writer and buffers written to disk, and modules for cache hit ratio, checkpoint rate, connections, number of tuples, wait events that it found in the server during the snapshot, WAL generation and so on. You can either run analyze, which runs all of the modules at once and generates visualizations, or you can run just one module. If you're only interested in buffers, you can just run buf. Most importantly, there's a capital D option that you can use to specify the date range in order to visualize only the time range you're interested in. So, like the last 24 hours only. And these are specified, of course, in ISO 8601 format, so there's no ambiguity in how to type in dates. It works something like this. You connect to database fof as user postgres. You give it a date range, and it just reads the snapshots and writes the visualizations as PNG to disk. And yes, it has a logo, so it's complete. The visualizations look something like this. I apologize if the points are a bit too small for you to see. So, as we said, buffers written to disk is a line that keeps going up until the stats are reset. When the stats get reset, it starts from zero again. So, perhaps this is more useful. This is the buffer write rate in megabytes per second. So, you can see how much your Postgres server was writing to disk at any moment in time.
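To make the step from a cumulative counter to a write rate concrete, here is a minimal sketch of the arithmetic. It is not pg_statviz's own code: the function name `write_rates`, the sample snapshot values and the assumption of the default 8 kB Postgres block size are all illustrative. Each pair of consecutive snapshots gives a difference in buffers written, which is divided by the elapsed seconds; an interval where the counter goes down is treated as a stats reset and skipped.

```python
from datetime import datetime

BLOCK_SIZE_BYTES = 8192  # assumption: default Postgres block size of 8 kB


def write_rates(snapshots):
    """Turn cumulative buffer counts into write rates in MB/s.

    snapshots: list of (datetime, cumulative_buffers_written), sorted by time.
    Returns a list of (datetime, megabytes_per_second), one per usable interval.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(snapshots, snapshots[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds <= 0 or b1 < b0:
            continue  # bad timestamps, or the counter dropped after a stats reset
        megabytes = (b1 - b0) * BLOCK_SIZE_BYTES / (1024 * 1024)
        rates.append((t1, megabytes / seconds))
    return rates


# Hypothetical snapshots taken every 15 minutes.
snapshots = [
    (datetime(2023, 2, 5, 10, 0), 0),
    (datetime(2023, 2, 5, 10, 15), 115_200),
    (datetime(2023, 2, 5, 10, 30), 345_600),
    (datetime(2023, 2, 5, 10, 45), 1_000),  # lower than before: stats were reset
]
rates = write_rates(snapshots)
```

With these sample numbers the first interval works out to 1 MB/s and the second to 2 MB/s, and the interval spanning the reset produces no data point, which is why a rate plot can be easier to read than the raw cumulative line.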
And also, you can analyze what was happening because of the background writer. No? Yes. Thank you. So, you can see here, because this was a test I ran on my laptop with a script that was just inserting rows into the same table, that the checkpoint line, which is the orange line, didn't do much, because this wasn't scheduled activity taken care of by checkpoints. But you can see that backends were doing most of the work. And also, you can see that the background writer, which is the green line, didn't get the chance to participate in all this buffer writing because, from what we can see from the very low line, its limits were set too low for production. So, you can gain insights into the behavior of your Postgres server like this. Or you can look at connection and status counts. You can see how many connections you had coming into your server from clients, how many of them were active, how many of them were idle, how many were idle in transaction, and so on. But you can also see which users were taking up those connections. And I think that's really interesting when you have an environment that's used by multiple applications, so you can know which developers to blame when it all goes south. Wait events. As I was testing this on my laptop and was overloading it with IO, because I was inserting millions of rows into a table, I generated an IO DataFileRead wait event, sorry for the small letters. And that was captured by the snapshot that was being taken every 10 seconds or so for this example. So, thank you for listening. The project is going to be live at github.com/vyruss/pg_statviz in a few moments. You can find me on Mastodon. And what the hell? I'll do it right now. So, as we said, this is alpha quality code. Oh, I forgot to say that it doesn't do any scheduling or any maintenance or any partitioning of those internal tables where it keeps the snapshots. So you can delete them by hand.
You can schedule the snapshots very easily with any tool you like, like cron or pg_cron. But I didn't want to make this a dependency of the extension, so you can just configure it yourself. And I can just go to settings and make it public right now. Cool. Thank you. Any questions? No, okay. Thanks anyway. Thanks. Thanks for having me. |
Breaking the Code of Inclusion: Designing Micro Materials Based on PRIMM Principles for Accessible Programming Education. |
Now it is time for Josie Malaze, who speaks about breaking the code of inclusion: designing micro materials based on PRIMM principles for accessible programming education. A very difficult title. So, okay, the stage is yours. Thank you. Thank you. So, my name is Josie. I am a PhD student at the University of Bristol. And today I'll be giving a talk about how we can design more open. Sorry for that. Okay, okay, okay. Test. All right. So, I'm Josie. And I'll be giving a talk about how we can design more open and inclusive pedagogical material to teach programming. I'd first like to start with a small introduction to tell you why we think this talk is important. I've always been passionate about programming, but also about education. I joined CoderDojo from a very young age, helping kids to learn how to code. I also went for an internship abroad in Kenya to build software for schools in less fortunate areas, where I could experience first-hand what technology can do to help education. Upon graduating, I joined HackYourFuture Belgium. For those of you that don't know it, it's a small nonprofit organization in Brussels that organizes coding boot camps for refugees who are trying to find a job in the tech sector here. So, before we go on to the meat of the presentation, let's take a look at some of the background information we need. I'm pretty sure most of you remember your first programming classes, where you had to install a weird-looking editor with hundreds of buttons. You had to type in some text. You tried to press the run button or a compile button. And you would see a weird error message that you didn't really understand. This can be very demotivating.
And it's actually even worse when it happens to people from already underrepresented groups, because if they encounter these types of errors, they internalize it, and they feel like it's a confirmation that, yes, maybe they do not belong in such a classroom, which is, of course, not the message you want to send. Now, in the past years, there has been a new methodology of teaching programming that tries to limit this issue, and it's based on the PRIMM principles: predict, run, investigate, modify, make. The idea is that instead of starting to write your program from scratch from day one, you start with a simple exercise, predict. In this stage, you look at some existing code, and you try to predict what the result will be if you run this piece of code. Then, in the next stage, you actually download the code and execute it, and you verify whether your predictions were correct or not. Then, in the third stage, we ask you to actually do exercises on the program. This can be "label all the variables", or "can you tell me which variables are updated at some point in the code?" Now, up until this point, the learner hasn't changed anything about the code itself, and as such, it also doesn't feel like a personal failure if something goes wrong in the process. We try to change this in the fourth step, modify, where we're still not asking the student to write code from scratch, but we give them a functioning program and ask them to make some small modifications, like: instead of running this loop three times, run it six times. And it's only in the last stage that we ask them to actually write a new program from scratch using the same principles they learned earlier. Now, this is a new methodology in programming education, but it's based on a very well-known pedagogical concept called the zone of proximal development, where the idea is quite simple.
If you're only doing tasks you can do by yourself, you're not actually learning anything, you're just repeating stuff. But if I give you a task that you cannot do yourself, even with some help, you're actually just going to get demotivated and lose all motivation to continue learning, so that's also not good. So we should always strive to give exercises in this yellow zone, where the student cannot easily do it on their own, but with some proper guidance they can get there. Which brings us to micromaterials. Now, a micromaterial is an open education resource, so anybody can easily include it in their curriculum, but it should also provide some sort of automated feedback, so that even though the teacher is not there directly to guide the student, they can still get something away from it. Ideally, you also want some kind of automatically generated content, because managing all the content for exercises can be a very time-consuming exercise for teachers. So now we'll discuss some small examples that we, within our lab, have built to experiment with these ideas. So the first sample we'd like to discuss is an online environment to practice the use of HTML. It's designed for engineering students, where we do not really expect to be able to develop HTML websites on their own, but they should be able to grasp the core concepts of what the elements are about. So we have a stepwise progress that they can follow level by level, and we always start with a small presentation of the core HTML concepts that they need to know. Now, once it's time to actually practice their HTML, we do not just give them an editor, but instead we make use of Google's Blockly library to show this kind of HTML blocks that already contain the syntax, so they can focus on what the text represents and not on the syntax itself. Important here, we analyze sample HTML code to generate the blocks, so adding new levels is as simple as providing a new website we want them to recreate. 
We also dynamically create hints for every element that they will need to use in the page. We have a similar environment for practicing databases, for the same engineering students. We used to have a lot of problems where they had to install databases locally, database files could get corrupted, and that could turn into issues. So we developed a fully online environment using SQL.js, which is an open source project based on SQLite, compiled to WebAssembly using Emscripten. In this simple application, they would get to see a description of a database, and on this database they would get to run queries. The queries would be typed into an online code editor based on Adam, so they get syntax highlighting and code completion, and whenever they executed the queries, they could see the results directly in their browsers, and they would use this to answer questions about the data. If the exercise was more about inserting or updating data, a simple fill-in-the-answer type of question was no longer enough. So instead, we run checks to see at which step they failed, so they can go back stepwise and modify the code until they get it right for all the constraints. Similarly, if they want to create new tickets, we would also generate test cases for those. Another question that's very often looked at is whether the students can interpret code. A common tool for this is the trace table. Now, for those of you that don't know, a trace table is just a way, as in this example, to go through your program line by line and note down what happens at each line. For example, this trace table is designed to look at which variables are declared, initialized or updated at certain moments in time. A different type of trace table would be the operators trace table, which gives the student a view of which operators are used at which points in the program.
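As a rough illustration of the trace-table idea, the sketch below runs a tiny program one line at a time and records the value of every variable after each line. It is not the tool shown in the talk, only a minimal Python sketch: the function name `trace_table` and the sample program are assumptions for the example, and it only handles simple single-line statements.

```python
def trace_table(lines):
    """Execute single-line statements in order and record variable values.

    Returns a list of (line_number, source_line, variables_after_line).
    Only suitable for trusted, simple statements such as assignments.
    """
    state = {}  # variables of the traced program live here
    rows = []
    for number, line in enumerate(lines, start=1):
        exec(line, {}, state)  # run one statement against the shared state
        rows.append((number, line, dict(state)))  # copy: a snapshot per line
    return rows


# A small sample program to trace.
program = [
    "total = 0",
    "step = 2",
    "total = total + step",
    "total = total * 3",
]
rows = trace_table(program)
for number, line, values in rows:
    print(number, line, values)
```

A learner would fill in the same table by hand and compare it with the recorded rows, which gives a simple form of automated feedback on whether they are reasoning about the program state correctly.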
A third one would be this variable values, which is basically a way to see which variables are updated at which point in time, which is a way that we can know is the student actually reasoning about his code correctly. Lastly, we have a very basic trace table for advanced users where they can just keep track of the latest state of the program. Now, to this trace table application, we added some export functionalities so teachers could import the JSON from the students so we could run some diagnostics to see if students were understanding the code correctly. Now we add more of a social game, a coding card conundrum. It's basically a kind of card game where there are three types of cards. We have goal cards that say in a certain condition that needs to be true. We have environment cards that initialize five variables and we have code cards that would execute some codes to update the global state of the program. So it's a multiplayer game. It can be played from one to four people. And we get this field where each player gets a handful of cards and there will be three heaps in the center. Now, during their turn, students could select a card from their hand and place it on one of the environments. When doing so, they would have to update this state table to reflect the new latest state after executing the code on their card. And if they managed to achieve the goal on their goal card, they would get awarded some points for it. So it's a social game where they can still practice their trace tables. Then the final one we're going to discuss is Kingscrawl. In here, we tried to make it a bit more silly. We added a fantasy theme where the idea was that we live in the kingdom, but the kingdom is going to be attacked by an evil dragon. And it's up to us to find out which of the heroes will be able to stop the dragon. 
Now, there are 16 heroes randomly generated based on four essential variables for the equipment they are wearing and two extra variables to increase the inclusiveness of our application. Students would be shown a scroll that would be randomly generated based on essential JavaScript features that would update the states of those four variables. And it's up to the students to predict, after the program has run, which of the heroes matches the final description. Students were able to automatically select which elements of JavaScript they knew or didn't know, so they could always participate even though they hadn't seen everything. And we gave them a state table to, of course, keep track of the state changes as they went through the scroll. So those were some of the examples that we developed. But now we'd like to take a step back and look at what we learned from it and what kind of guidelines we can give you if you were to try to implement something on your own. So the first piece of advice we would like to give you is embrace the themes. I know it might look silly at first, but just having a silly theme takes a lot of the stress away from students, and they no longer feel like they can fail at an exercise because they're just playing a game in this kind of theme world. But when we want to design those themes, we also shouldn't forget the principle of skill transfer, which basically states that the context in which you learn something is an important part of how you will be able to apply those skills in different contexts. So the closer your environment is to a realistic use case, the more useful their skills will directly be when they need to apply them to a realistic environment. So it's still good that in your silly theme world there is a real-life kind of application. Also, definitely invite the social aspects. One of the main reasons people stop learning how to code is because they don't really feel like they belong in the world or they feel discouraged. 
And just having this kind of social aspect, it can be slightly competitive but also collaborative, really takes away this barrier and pushes people to continue learning in a welcoming environment. Also, you should strive to keep the setup as minimal as possible, and I know this sounds really straightforward, but when you're designing an application that needs to be installed on a desktop and even 2 out of 20 students are facing issues, that takes away at least 5 minutes of the teacher's time that they cannot spend helping students. So if at all possible, try to make it work just within the browser. Now, we should also make sure that our micro materials only focus on one specific learning goal. A lot of content is created, like for example the Hedy programming language, about which there is a great talk tomorrow, and these are all great tools, but they kind of expect you to focus on them for your whole semester. A lot of teachers already have a curriculum they need to follow, so they do not have the flexibility to really go with something else. Your tools should really be something they can introduce and use in one or two of their lessons without needing to change their overall curriculum. Otherwise it will just not be adopted. We also have the expertise reversal principle. Now, the idea here is, if you're new to something, you really like to be guided and get all the hints on what can I do, what shouldn't I do, but as you gain more experience, if you keep having to go through the same guidance, it actually becomes a hindrance. So ideally your application should be designed in such a way that it contains multiple layers, where first, at the outer layer, there is a lot of help available for your students, but as they move on and become more experienced and become more expert, they can take away those layers and start operating the tool without getting all that extra help that's getting in the way of learning. 
Also, of course, try to do something with automatic content generation. Whenever you try to develop a tool on your own, you will quickly discover that having to update the content yourself, or even asking teachers to create the content themselves, takes a very long time, and they will not do it. We write these tools to make it easier for teachers to use them, not to give them more work. So the final principle is, if at all possible, try to make things mobile compatible. Especially with the adults I've been teaching, they have very busy work lives, they are trying to learn programming while still taking care of their families, and if they have a simple tool they can use, for example, while on public transport, that's a huge quality of life improvement for them. So let me finish this presentation by telling you all: I know you're all a community of builders. If you think, hey, these kinds of tools you discussed, I can build those, feel free to reach out to any learning environment close to you and ask how you can contribute. Thank you all for listening. |
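As a rough illustration of the variable-values trace table described in this talk, here is a minimal Python sketch. It is not the speakers' tool: the function name `trace_table` is hypothetical, and it assumes a straight-line program with one simple statement per line (no loops or branches), which it runs line by line while recording the variable state after each one.

```python
def trace_table(source):
    """Build a variable-values trace table for a straight-line program.

    Assumption for this sketch: every line of `source` is one complete,
    simple statement (no loops, branches, or function definitions).
    """
    state = {}   # latest value of every variable seen so far
    rows = []    # one row per executed line
    for number, line in enumerate(source.strip().splitlines(), start=1):
        exec(line, {}, state)                        # run this single line
        rows.append((number, line.strip(), dict(state)))  # snapshot the state
    return rows


program = """
x = 1
y = x + 2
x = x * y
"""

for number, code, variables in trace_table(program):
    print(number, code, variables)
```

A student's filled-in table could then be compared row by row against these snapshots to see at which line their reasoning diverged.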
Open Source Good Governance – GGI Framework presentation & deployment
A quick introduction to the OSPO Alliance handbook and resources |
So, my name is Boris Baldassari, I work for the Eclipse Foundation as an open source consultant and I'd like to introduce the OSPO Alliance and the work we are doing on the Good Governance Initiative. So, to start with, what is an OSPO? So, OSPO stands for Open Source Program Office and it's a team within an organization that takes care of everything open source within the organization. So, it's supposed to be a single point of contact for any question people may have about open source licensing and so on. The OSPO is supposed to foster the awareness of open source within the organization, define and implement a strategy regarding open source for the organization, inbound, outbound and much more than that. So, the idea is to make sure that the organization makes a good use of the open source, they consume, produce and so on. It also includes things about the ecosystem, the open source ecosystem and the participation into the ecosystem, the open source ecosystem by the organization. It should be noted that it has many impacts on many domains, so it's from the legal, to the technical, to the security dependencies, development practices and so on. And it's a strong recent trend, so many organizations nowadays are in the process of setting up an OSPO. When I talk about organizations, I talk about any type of organization, so these could be corporations, SMEs, administrations, universities, NGOs and so on. The point with these organizations is that quite often they don't know anything about open source, it's not in their culture, so they just don't know where to start. That's where the OSPO Alliance comes into the picture. So, the OSPO Alliance was launched in 2021 by a group of four European non-profit organizations and the idea is to promote an approach to excellence in open source software management. We wanted it to be really easy to access, low barrier, there is no fee, there is no commitment. 
The only thing we ask is a letter of support, a statement of support from the organization for the principles of open source and the OSPO Alliance. All our activity is open, public and fully accessible, no need to register to get access to our outcomes. We are entirely governed by open source principles, so you can just come, join us, collaborate, and if you are an active contributor, then you will become a maintainer and so on. And once again, it's for all types and sizes of organizations. Our mission is two-fold, so firstly, help the organizations make sure that they better use the open source software and ecosystem and properly take care of open source, and also make them good citizens of the open source ecosystem, so not only consuming and producing but also participating and financing and being there and openly asserting their use of open source, so things like that. To achieve these goals, we have set up different sub-task forces. The first and the most visible one is the Good Governance Initiative Handbook, which is a blueprint to help organizations define and build their OSPO. We also provide a safe place to exchange information, so people can ask any type of question. When they are trying to replace some proprietary software with an open source alternative, they can ask what others did. We have mailing lists, meetings, regular meetings. We participate in events, and we also have a monthly session, which is called the OSPO OnRamp sessions, where some of our members present how they did with their OSPO and share their good practices, their mistakes too. We also have a task force for evangelism and dissemination of our initiative. The Good Governance Handbook, as I said, is a blueprint to help organizations define, so build a roadmap and actually build their OSPO and make it a success. 
After that, we propose 25 activities, which are good practices that you want to implement when you build your OSPO and when you want to grow the open source awareness within your organization. These 25 activities are organized into five goals, which are levels of maturity. The first one will be identify the open source you use and identify the skills that you have. Start changing your contracts with your employees so they can actually contribute, set up or educate the legal team, educate the executives, contribute back, contribute upstream, assert publicly your use of open source, and so on. We also provide the method to implement these 25 activities, so we recommend an agile-like process where you will pick a few activities, complete them, select a few other activities, and so on until you have completed the scope. We had quite some success, so the first edition of our Handbook was in 2021. Last year we did an update, the 1.1 version, and we also have translations, so German and French are already available, and a few others are on their way, so Portuguese, Spanish, Italian, and whatever your language is, please join us. We use Weblate, and we, of course, welcome contributors. Talking about the activities, they have really a wide range and scope, so it's from, as I said, identifying the open source you use, doing software composition analysis, dependency management, vulnerability management. It's also about implementing the good practices that we use in open source, so setting up peer reviews and things like that, training the people, HR, so the contract, allowing people to contribute, funding the ecosystem, the projects that you use, and the foundations and the organizations that make up the open source ecosystem, up to executive education and potentially making open source a strategic asset for your organization. 
Each activity has the same structure, so we have a description that states what this activity is about, opportunity assessment, why would you implement this activity, what it will bring to you, progress assessment, so you know where you are and that enables to track your progress and know when you can, you can say that it's a complete activity and you can switch to the next. Recommendations, so feedback from the field, experienced people saying, well, you might have a look at this that will help, and also some tools and recommendations and resources that you might find useful. As I said, the good governance initiative is for any type of organizations, so there are activities that will not fit within your organization. The first thing that you will do actually when you will implement the handbook is look at all the activities and select the one that you want or reject the one that do not apply to your context, that's okay, and from there the activities are kind of generic, so what you want to do from there is for each activity create a scorecard which will be the local adaptation of the activity to your own organization, so that will imply stating the teams, the internal teams that you may reach out to, the competencies, skills that you may need, the processes, tools, and resources that you can use and that you will have to deal with, and also identify some specific tasks that are relevant to your context, and that this task will help you track your progress on the scorecard. We provide the scorecards in two versions, one is the PDF that you can simply print and fill in with a paper and pen, good old style, and also a digital one. So last year when we did the 1.1 version of the handbook, we introduced a new feature which is a deployable GGI program. 
So basically it's a GitLab project that you clone in your own GitLab or in any GitLab, and from there there will be a bunch of scripts that will be executed, that create the 25 activities as GitLab issues and create a board so you can visualize them. You can click on an issue and see the description, opportunity assessment, and so on. From there you can edit the description to fill in your own scorecard, and once you have that, so you have all your issues, you can visualize them in the board, you have defined your scorecard, so you can just start working on them. So you select a few of them, you work on them, you complete them, you select a few other ones and so on. And each and every time, so it's a nightly script actually, My GGI will create a GitLab page, a static website that will reflect where you are, what are your current activities, your past activities, and will even provide a simple dashboard. So the list of activities when you have implemented My GGI looks like that, and the simple dashboard that you get and the static website look like that, so you have the full description of the activities that you are working on and also the scorecards. So it really tracks your own adaptation and implementation of the GGI program. So that's all. Once again, we are an open source initiative, so you can access all our resources for free, no need to register whatsoever. You go to our website, ospo-alliance.org, and from there you can download the Good Governance Handbook, there is an HTML version. You get access to all the translations that are available, and there is also a section about contributing. You can just simply join our calls, we have weekly calls, and just connect, everything is there. So we are European based because our contributors are European, so it's really based on the European values and way of doing things and so on, but we are open to absolutely everyone. 
So just come, join us, you can help on the translations, on the updates of the handbook, on the dissemination, and if you want to follow us, it's here. And that's it. Thank you. Thank you very much. |
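The scorecard idea from this talk (adapt each activity to local tasks, track progress, then pick the next few activities in an agile-like loop) can be modelled very simply. The sketch below is only an illustration under stated assumptions: the class `Scorecard`, the function `next_iteration`, and the task names are hypothetical and are not part of the GGI handbook or the My GGI scripts.

```python
from dataclasses import dataclass, field


@dataclass
class Scorecard:
    """Local adaptation of one activity: a set of tasks, each done or not."""
    activity: str
    tasks: dict = field(default_factory=dict)   # task name -> completed?

    def progress(self) -> float:
        """Fraction of local tasks completed (0.0 when no task is defined)."""
        if not self.tasks:
            return 0.0
        return sum(self.tasks.values()) / len(self.tasks)

    def is_complete(self) -> bool:
        return bool(self.tasks) and all(self.tasks.values())


def next_iteration(scorecards, batch_size=3):
    """Pick the next few activities that are not complete yet."""
    return [s.activity for s in scorecards if not s.is_complete()][:batch_size]


# Hypothetical example data, loosely named after activities from the talk.
cards = [
    Scorecard("Identify the open source you use",
              {"run composition analysis": True, "list dependencies": True}),
    Scorecard("Educate the executives",
              {"prepare briefing": True, "hold session": False}),
    Scorecard("Contribute upstream", {"update contracts": False}),
]
print(next_iteration(cards, batch_size=2))
```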
FPGA-based music synthesis with open-source tools |
So, welcome Sebastian Holzapfel, the stage is yours. Okay, thank you. Hi everybody, my name is Sebastian, I'm an embedded systems engineer, usually in kind of the automotive, functional safety, autonomous vehicle space, but today I want to talk about one of my side projects that's completely unrelated to what I do at work, and I hope you guys enjoy it. So just kind of, because I'm curious, who actually knows what Eurorack is? Oh, that's amazing, okay, I wasn't really sure what to expect, okay, that makes me really happy. Who actually has a Eurorack system at home, and plays with it? Oh, okay, four or five, okay. Actually after this talk, if you have a Eurorack system and you like playing with FPGAs, I have a couple of these boards, so please talk to me, I'm willing to give them away if you have cool ideas, but we can get back to that at the end. So, basically, Eurorack is a modular system for building your own kind of hardware electronic music synthesizers, and it was kind of created in the 90s and standardised by Doepfer. It's a little bit of a rough standard because it's a little bit rough around the edges and imprecise, but basically what the Eurorack standard is, is it's a description of what the dimensions of these modules should look like, and kind of roughly what the signals look like that flow between these modules, and so there are a whole bunch of different manufacturers, thousands of manufacturers, tens of thousands of different modules that you can buy from all sorts of boutique, small and big manufacturers, and you can kind of buy these things, put them together into a system like this, and then use this to create music. That's kind of what a Eurorack system is. 
And just to give you a small idea of what a typical module looks like, here are kind of the input and output jacks on a Mutable Instruments module. It's basically an oscillator module, and you can see there is an output jack where basically audio frequency signals come out of the oscillator, and then you can kind of change the properties of these signals by plugging in different kinds of input signals, for example the V/Oct input jack, which actually changes the pitch of the output signal, so you can choose which note the oscillator is playing. If you want to change the volume of the oscillator, then you would send signals into the level input, for example, but that's kind of what the input and output jacks on one of these individual modules actually look like. But where did this project actually come from? So I had a crazy idea for a weird high performance granular sampling device that I wanted to make using FPGAs, and I didn't really know where to start in terms of, okay, you can buy an FPGA development platform, you can buy development boards for audio codec ICs, you can put it together, but then you need to do some conditioning on the signal to make it work with Eurorack, and potentially you also need to calibrate it as well, and I couldn't find examples for this kind of thing. So that's why I started playing in this world, because I couldn't find anything that really combined the hardware synthesizer world of Eurorack and this PMOD interface. And PMOD is basically, I think it was something that was originally standardized by Digilent, and there's a whole bunch of FPGA development boards that you can buy that have this interface, so basically if you have the Eurorack PMOD hardware that you can see above there, you can plug it into an FPGA development board, you can synthesize one of the example designs, you can use that in your Eurorack system and start making weird music. 
So this project is a hardware design, so this board designed in KiCad that you can find in the GitHub repository, it's a bunch of SystemVerilog that implements kind of drivers for the codec and the calibration, online calibration, some example DSP cores, and a bunch of simulation infrastructure, that's not just test benches for the FPGA gateware, but also allows you to simulate an entire modular system on your machine with like this little bit of FPGA code that you wrote for your module, and I'll show you what that looks like in a bit. So just to give you a look at what this looks like, so the iCEBreaker is a fantastic FPGA development board that I highly recommend, supports all the open source tool chains, you can purchase it from 1BitSquared, and this is what it looks like plugged into the hardware here. I wouldn't necessarily recommend that you plug it in this way because most Eurorack cases won't actually fit something this deep, so I would actually recommend that you use a ribbon cable to connect the FPGA board to this kind of expansion module here, but it doesn't have to be this development board, it could be a different one. Just to give you an idea, the iCEBreaker I think right now is 70 or 80 U.S. dollars, I think, or so, and what I actually wanted to show was a demo video, but I don't have any audio, so if you're curious to see this thing in action, then I encourage you to go to the GitHub repository for these slides, and in the GitHub repository are two movie files, maybe go and look at these after the talk, that include like a video example of a couple of example cores running and what the audio sounds like. So yeah, just leave that there for a second. So why might you want to play with Eurorack PMOD? Well, maybe you want to start playing with DSP, maybe you want to try doing things that are difficult on an MCU-based platform, and what do I mean by that? 
So, if you want to do super low latency operations on a microcontroller-based platform, and there are a few microcontroller-based development platforms for Eurorack, very quickly if you want very low latency, you have to start dealing with DMA, you have to start dealing with some pretty sophisticated real-time operating system concepts, for example, and actually there are quite a few things that are easier to do in the FPGA world, especially when it comes to DSP, than in the MCU-based world. But even if that wasn't the case, it's still, in my opinion, a cool learning platform to play with FPGAs. You have kind of this world where you can just play with sound, plug arbitrary things in, make a module that implements a tiny little piece of functionality and see how it is affected by all the different effect modules and different oscillators you have in your system, and it makes it very easy to discover things. So in the GitHub repository for this project, you can find a whole bunch of examples for different DSP cores, and you can load these onto your FPGA development platform, you can try them out and see how they sound, and that's just to give you a picture, there's going to be more coming. But just to give you a very simple example of what one of these cores looks like, and to be clear, this is just the DSP core, there's a whole bunch of driver cores that you don't see, that you don't have to worry about if you're making a design like this. This is Verilog, and this is implementing something called a voltage-controlled amplifier. What does that mean? We're just multiplying one channel by another channel and sending that to an output channel. That's basically what a voltage-controlled amplifier is, and in Verilog that looks like this. 
We take basically, the output is input 0 multiplied by input 1, and we have to shift it, because basically in this case it's a 16-bit sample, then we have a 32-bit result, and this is to convert it back into a 16-bit sample so that we can send it to the output. But why is this cool? This is very simple, but what's cool is with this very, very simple piece of code in Verilog, you achieve something that is ridiculously low latency, so here we're achieving just with that example, like basically 120 microseconds of latency. If you're sampling at 192 kHz, that's 24 samples of latency from the input to the output, just with this small bit of code. If you are going to try and do that on a microcontroller, not easy. You can do it, but it's not easy. In this case, I actually think that the latency on an FPGA-based system should have been a bit lower, but what's actually going on here, why are we getting 24 samples of latency from the input to the output with a simple implementation like this? It's because the audio codec on this piece of hardware itself has a bunch of internal filters that kind of pipeline the input. It uses this for, like, I think some kind of pre-emphasis function or something that you can partially disable but not entirely disable, but that's kind of where this is coming from. In this plot up here, you can see, kind of in red, a signal coming into both the Eurorack PMOD and another MCU-based module, the Disting, and then the yellow trace is what's coming out of the Eurorack PMOD, and then the blue trace is what's coming out of the microcontroller-based module, so you can kind of see the difference in latency there, and both of them are acting as the same thing in this case. 
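To make the arithmetic in this part of the talk concrete, here is a small Python model of the multiply-and-shift step and of the latency figure. It is a sketch, not the Verilog from the repository: the function names are hypothetical, and the shift amount of 16 bits is an assumption (the talk only says the 32-bit product is shifted back to a 16-bit sample).

```python
def vca(in0: int, in1: int, shift: int = 16) -> int:
    """Model a voltage-controlled amplifier on signed 16-bit samples.

    The product of two 16-bit values needs up to 32 bits; an arithmetic
    right shift brings it back into 16-bit range. The shift amount is an
    assumption made for this sketch.
    """
    product = in0 * in1       # full-width product
    return product >> shift   # Python's >> on ints is an arithmetic shift


def latency_us(samples: int, sample_rate_hz: int) -> float:
    """Convert a latency expressed in samples into microseconds."""
    return samples / sample_rate_hz * 1e6


print(vca(16384, 16384))       # half scale times half scale
print(latency_us(24, 192_000)) # 24 samples at 192 kHz
```

24 samples at 192 kHz works out to about 125 microseconds, in line with the roughly 120 microseconds quoted in the talk.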
So the hardware, I went through a few different revisions. I made some mistakes in revision one and two, as you can see here by the bodge wires, but as of right now, if you were to go ahead and download the KiCad design files for revision 2.2, which is this one on the right here, which I have in front of me here, it works without any modification, so you can just build it. But there is a revision 3 coming and I'll talk about that in a second. So what does this hardware design actually look like? Basically there are eight channels, we have four input channels, four output channels, a bunch of LED indicators so you can see what's going on, but then the heart of this whole system is the audio codec. And what's interesting about this audio codec is it's a chip that's kind of intended for the automotive industry, and some other Eurorack modules for this kind of purpose will not use kind of a normal audio codec, they'll use instrumentation amplifiers or some other type of ADC or DAC, and the reason for that is because in Eurorack we're not just dealing with audio frequency AC coupled signals, in Eurorack we're also dealing with sometimes extremely fast DC signals that need to be accurate. For example, if you are controlling an oscillator and you need to control the pitch very precisely, the absolute DC accuracy is important, because if you shift by a few millivolts then suddenly your song is not in tune anymore, and that's why DC accuracy is important. And basically an audio codec IC is often not designed for DC accuracy, but the benefit of an audio codec IC is that it's cheap, so you might pay three or four euros per unit instead of, I don't know, for an instrumentation quality converter you're paying tens of euros perhaps. But the cool thing is, the main difference here, and at least in the case of this codec that I was playing with in this hardware, there is kind of a fixed DC bias in the codec input and output when you get it from the factory and you can calibrate this out 
just using a simple, like you can basically calibrate it out using a simple linear regression. Kind of feed five volts into the codec, measure what it gives you, send minus five volts in, measure what it gives you, do that for the inputs as well as the outputs, and then on the FPGA itself account for this DC bias that's present in the codec after it's manufactured. And after going through this calibration process, and there are scripts to do this calibration in the GitHub repository, you can actually achieve kind of sub five millivolt level absolute DC accuracy between minus ten and ten volts using an audio codec chip, which is awesome because it brings manufacturing cost down if you decide to kind of create your own FPGA based instrument, and it means that this thing doesn't cost so much, and it also means you can use I2S, which is kind of a standard-ish audio interface protocol, embedded audio interface protocol, instead of some other interface protocol. So basically the datasheet strongly, it does not suggest that you do this, but if you do it, it works really well, so like explicitly ignoring what it says to do in the recommended external circuits, and I've tested the crap out of this, even over temperature and so on, it seems to work, and I mean fortunately we're not dealing with automotive and functional safety, so no one's going to complain if we don't do things the way the datasheet suggests that we do them. So this is kind of just a simplified overview of what the FPGA gateware looks like for the example project here, and basically the part that you write yourself is this user-defined DSP core, and then between that and the rest of the world is a driver for the codec IC, a driver for the I2C interface that initializes the codec IC, a calibration routine that basically does this online calibration of the codec that I was mentioning before, and yeah, I mean that's what it looks like. 
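The calibration procedure described here (apply a known positive and negative voltage, record what the codec reports, then correct for gain and DC bias) can be sketched as a two-point linear fit. This is not the calibration script from the repository: the function names and the raw readings below are made up for illustration, and with only two reference points the "regression" reduces to an exact straight line through both measurements.

```python
def fit_two_point(raw_lo, raw_hi, v_lo=-5.0, v_hi=5.0):
    """Fit volts = gain * raw + offset through two reference measurements.

    raw_lo / raw_hi are the raw codec readings observed while v_lo / v_hi
    volts were applied. The -5 V / +5 V references follow the talk; the
    raw readings used below are hypothetical.
    """
    gain = (v_hi - v_lo) / (raw_hi - raw_lo)
    offset = v_lo - gain * raw_lo
    return gain, offset


def to_volts(raw, gain, offset):
    """Convert a raw codec reading into calibrated volts."""
    return gain * raw + offset


# Hypothetical readings from one input channel with a small DC bias.
gain, offset = fit_two_point(raw_lo=-15800, raw_hi=16200)
print(gain, offset)
print(to_volts(200, gain, offset))   # the raw value that corresponds to 0 V
```

In the hardware described in the talk, the equivalent correction is applied on the FPGA itself, for the outputs as well as the inputs.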
So that's kind of the hardware, and I also described a bit about how the gateware works, but if you have no hardware at all, you can still play with this stuff, because I actually wrote a plug-in for VCV Rack, which is an open source simulator for modular synthesis systems, basically, Eurorack, and what this plug-in does is it actually simulates with Verilator, like here I can show you, oh, got one minute, okay, so here's a Verilog implementation of a clock divider, and this is sitting inside a plug-in for VCV Rack, and it's compiled with Verilator to C++ so that you can run it inside an entire modular synthesis system simulation. And so now we have basically this simulation of the hardware with the Verilog that would be running on your FPGA, actually running through Verilator in translated C++ linked to the VCV Rack binary so that you can actually run your FPGA code inside a modular synthesis system, which is kind of crazy, and it makes me happy that you clapped then, but that was actually the easiest thing to do out of everything that I described before, like literally that was about four hours of work, I was surprised that it was so easy. But I encourage you to play with this stuff if you're interested in it, so, oh, okay, that's time for me, I was almost done, thank you for listening. So two things I would like to note. I am thinking of going through with a manufacturing run for revision three of these boards, so if you are interested in the hardware here, if this is your niche, star the GitHub repository so I know how many people will be interested, that's my only request, because if I manufacture a hundred of these things and they just hang around in my apartment, then I won't know what to do with them, so thank you. Thank you. |
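For readers who have not met a clock divider before, here is a behavioural model in Python of the kind of core mentioned at the end of this talk. It is an illustrative sketch only, not the Verilog from the repository and not what Verilator generates: the class name and the `half_period` parameter are hypothetical.

```python
class ClockDivider:
    """Behavioural model of a clock divider.

    The output toggles once every `half_period` rising edges of the input
    clock, so the output frequency is the input frequency divided by
    2 * half_period.
    """

    def __init__(self, half_period: int):
        self.half_period = half_period
        self.count = 0   # rising edges seen since the last toggle
        self.out = 0     # current output level
        self.prev = 0    # previous input level, for edge detection

    def step(self, clk: int) -> int:
        """Advance by one input sample and return the output level."""
        if clk and not self.prev:            # rising edge on the input
            self.count += 1
            if self.count == self.half_period:
                self.count = 0
                self.out ^= 1                # toggle the output
        self.prev = clk
        return self.out


divider = ClockDivider(half_period=1)        # divide the frequency by two
square_wave = [0, 1, 0, 1, 0, 1, 0, 1]
print([divider.step(level) for level in square_wave])
```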
FabAccess
a machine access system for fablabs and makerspaces |
the stage is yours. Thank you. I'm here for quite a short time. My head may not have switched completely to English, so if that happens, maybe someone in the audience can help me out. So, my talk is about FabAccess, and the last time I held a talk, after the talk, people said to me, you need to tell more about FabAccess, so that's what I'll be doing, and first things first, wake up, some color. What is FabAccess? For FabAccess, I'm a manager of a workshop in Berlin, in a university, and some people call me the product owner or the father of FabAccess. It's not really the case, I'll tell later why. Switching machines in a workshop sounds easy. You switch one bit on, the machine is on, switch one bit off, reset the bit, and the machine is off, but it's not that easy. First thing, why do we need to switch off machines? In a workshop, you have two kinds of machines, the ones that tear off parts of the users' bodies if they do not know how to use the machine, and the not so interesting ones that are not hurting people, and there are two kinds of people in your workshop. There are the ones who know how to clean up their place before they leave the workshop, and there are the ones who leave the mess to you, and if you need to get along with this combination of people and machines, you need to get some structure in the workshop, and I'm the only one working aesthetically in that workshop, so I'm not enough people to get structure, and 30 people are using the workshop, so I need support, and I need the wonder of digitalization. So, I do not like good luck, have fun, that's why we began to work on FabAccess in 2019. I won't read all of that. There are requirements which are coming into software if you want to switch machines on and off. 
That's why it wasn't that easy, and the next thing is we saw a lot of projects that began working in that area, but when we realized there are a lot of these projects, I think most of you know that XKCD, we tried to avoid that by becoming modular and getting to more than works for me, technology readiness level nine, maybe some of you know what that means, and that's one more thing we wanted to reach. We wanted to reach the technical ground for federation in the future, not right now, we'll get there in September this year maybe. That's what FabAccess is. In the last talk they said, talk about what FabAccess is, that's it. We have machines, we have users, and the machines can be switched on by the user, not by the manager of the FabLab, but the manager of the FabLab has the user database and the machine database and can say who is allowed to switch on which machine, and in the next step here they can see which one of the users switched on which machine last, and that's it. We do not follow our users or surveil what they are doing, we want to know who left that mess, and that's it. 
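The model just described (users, machines, a manager who decides who may switch on which machine, and a record of who used a machine last) can be sketched in a few lines of Python. This is purely an illustration: none of these class or method names come from FabAccess, whose server is written in Rust, and the role-based permission scheme is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    name: str
    required_role: str               # role a user needs to switch it on
    powered: bool = False
    last_user: Optional[str] = None  # who switched it on most recently


class Workshop:
    """Hypothetical in-memory model of users, roles and machines."""

    def __init__(self):
        self.roles = {}      # user name -> set of role names
        self.machines = {}   # machine name -> Machine

    def add_machine(self, name, required_role):
        self.machines[name] = Machine(name, required_role)

    def grant(self, user, role):
        """Manager action: allow `user` to use machines requiring `role`."""
        self.roles.setdefault(user, set()).add(role)

    def switch_on(self, user, machine_name) -> bool:
        machine = self.machines[machine_name]
        if machine.powered or machine.required_role not in self.roles.get(user, set()):
            return False             # already in use, or user not allowed
        machine.powered = True
        machine.last_user = user     # only the last user is remembered
        return True

    def switch_off(self, machine_name):
        self.machines[machine_name].powered = False


shop = Workshop()
shop.add_machine("laser cutter", required_role="laser")
shop.grant("alice", "laser")
print(shop.switch_on("bob", "laser cutter"))    # bob has no permission
print(shop.switch_on("alice", "laser cutter"))
print(shop.machines["laser cutter"].last_user)
```

Keeping only the most recent user per machine mirrors the stated goal of knowing who left the mess without tracking everything users do.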
There are some perspectives that need to be reflected in such software. First is the workspace manager; I won't read it all because time is running. For the workspace manager, the software we are implementing should fit into his workspace, and not press his workspace into our software. That's one of the most important things for us. And for the user it should be easy to use, fast and clear; the English words are missing. So that's how we approached taking into account, well, you know what I mean, sorry, how to consider the wishes of the FabLab manager, yes, of me. I think it's too much text for nine minutes. Are there any questions about that slide? I think the most important point is the last one: attaching new machines is easy with Python and Bash, because as a project we are not able to do the implementation in each and every workshop, so we need to enable the workshop users and managers to work as a community project. In the middle it says a GUI would be great. If there is anyone in the audience who is willing and able to build a GUI which takes machines and throws out all files, feel free. These are some things in the backend that are important. It's written in Rust, so the language and the compiler guarantee some sort of stability, and in front of the Rust backend there is the Cap'n Proto API, which gives the people who are not working in the backend and in the core of the server a stable API that they can program against with Python or, I think, easier languages, languages that more people can write code in. We have two to three core developers who keep an eye on a very stable API, and this is also important because the FabLab manager doesn't want to take care of software updates; they should happen without any effort. 
Accessibility is for the users. The app which we showed in the slides before should be really accessible for the users, because if you have a workshop, it doesn't help you if a third of your users say, sorry, I can't access the app, I can't use your workshop machines. The last line there is one more step towards accessibility, because not everyone has a mobile phone, and not everyone wants to take out his mobile phone and open the app when he just wants to use a machine, so NFC is a thing in that case. Well, supported hardware devices out of the box: this is what the core developers have been working on right now. And the extensibility: it's not only the core developers developing modules for FabAccess, that's where the community comes in. I think the list at the top will grow with time, but the developers keep an eye on stability, so it's not growing as fast as the community wishes, and that causes some unrest in the community; I don't know the English word, sorry. That's where we are right now. We are in the beta, and I think at the beginning of March there is a deadline for 1.0, which will be the first release that can be used by a workspace without having software engineers in the background. These are targets which will be addressed in 1.0: OpenID Connect, which is wished for by a lot of people in the community, as most of the German workspaces in the Verbund Offener Werkstätten have switched to Keycloak in the background. I'm good on time. The federation: I already talked to the people of Matrix, and I think there will be some interfaces between FabAccess and Matrix to realize federation. Coming to the end of the presentation: we are a team, and the most important are the community members. We have 64, and the number is growing. We have a community manager, thank you. 
We have four people in Berlin, first the core developers, then me, sorry, and one guy in Bocholt who is doing a really good job in alpha testing and documentation, because when it comes to documentation, it's better when there is someone outside of the developer circle than when developers write it themselves. Nobody in the team does anything with blockchain, sorry. These are the organizations who supported us and are carrying the project, and yes, the lower ones even contributed some money, thank you. These are the URLs you'll need to get involved and give it a try. I think we have three minutes. Are there questions? Yeah. Did you guys think about interoperating with already existing authorization schemes, like FreeIPA for example? Yes, of course. Right now they are implementing one, and there will be many, many more. It's part of the modular design of FabAccess, because we do not want to press you into what we are using. If there are authentication systems which are on the market, we will implement them one after the other, but if you are interested, feel free to talk to us. Very good question. Thank you. You are not able to solve social problems with technology. Unplugging the plugs which switch the machines on and off is, for me, part of a social problem, but there is a technical aspect. Of course the Shellies say, I am here, I am here, I am here. They have a heartbeat, and if that heartbeat goes away, the core of FabAccess realizes, oh, that Shelly is unplugged. It's a technical and a social problem. Inside the kernel, yes, inside the backend, yes. In the normal configuration it throws it out into memory, so it's gone the moment the log is written out. If you need to follow what your users are doing, then you may configure FabAccess to give you the log file and push that log file to ERP systems or whatever you want, but that's not in the scope of FabAccess. Any more questions? One and a half minutes or so? Yeah. 
Do you have the possibility to also get a signal from the machine to say whether it's working or not? Next good question. Thank you. It depends. Sorry. Sometimes, if the information from the machine is important for FabAccess to switch machines on and off, then we take the data into FabAccess. Normally there is data from the machine which is not needed to switch machines on and off, and the core point of FabAccess is switching machines on and off, and that's what we do well. If there is any other data, in our package there is an MQTT server, and it may be forwarded by the MQTT server; FabAccess does not need that data. 37 seconds. This is something you can do with the log file which is coming out of FabAccess and your own software, something you write yourself, or maybe Odoo or any other ERP system which needs to be filled with structured data from that log file. No. We are not an ERP system. We are not making money from people using our machines. That's in the scope of different software packages. |
OpenStreetMap: Sharpen your "Emergency Eyes"
Disaster prep mapping in the EU |
So the idea behind this talk is to get you looking at your surroundings a little differently. That said, this may be some of the most boring mapping you'll ever do. And it's equal parts diplomacy and translation, but also activism. So who are my OpenStreetMappers? Who knows OpenStreetMap? Awesome! Preaching to the converted. All right. Contributors as well? Awesome. Okay. So I won't say too much about this, except that the idea, for those of you who don't know it, is that it's a free, open, editable map of the entire world and a geographical database. It was founded in 2004 by a guy named Steve Coast, who's British. One of the most powerful uses for this is making maps where people don't have them. And it's incredibly powerful when there's a disaster some place around the world. So who are my humanitarian OpenStreetMappers? Just a few. Okay. Awesome. Any emergency responders, emergency volunteers? Anybody like that? No. Okay. So I started with this because I participated in a mapathon for the Nepal earthquake. And it just was incredible to me that I could be sitting at home helping people out in an earthquake all the way around the world. And I'm an emergency volunteer in my hometown of San Francisco, so I started thinking that I could do a lot more with OpenStreetMap near me, basically. And I started Resiliency, with the idea to use only open source tools, about five years ago. Our big emergency classic is earthquakes. This is what San Francisco looked like 30 years ago, when the last big one hit. And it's predicted that we'll have another one, a 99.7% chance that we'll have another one like this, in 30 years. However, in the half a decade that I've been doing this, we haven't even had, like, a good shake. Nothing's happening. No earthquakes, right? And that's what we're prepared for; those are the maps we're making. But other stuff has been happening. So we've had a lot of other emergencies that we're not prepared for. 
And this is the Blade Runner sky that we had in 2018 and 2020. We've had record high temperatures three out of five years. And just the way our houses are built, we're not prepared for this. We don't have sophisticated HVAC systems; it's really pretty bad. And then this last one is the bomb cyclone flooding. And you really don't want to be out on your paddle board in San Francisco, because the sewer water, all of that stuff, is all mixed. So we weren't prepared for that. And it just happened at the end of January. So here we are at FOSDEM. And I want to talk about Florence, Italy, which is a place that I've lived in and know well. Florence has long had to contend with the Arno running through it, right? It's flooded Florence something like 56 times since the 1100s. And they are prepared for this emergency. You could basically do a tour of Florence. I don't know if anybody's doing this, but as an emergency person, I totally would. Just seeing the water scars. So that's San Nicolò. You could go all over the city and see just how high the water was. So they're pretty prepared for this emergency. It's a wealthy city. There are a lot of smart people. The problem is, in a lot of places in Europe, there's money, and you do things that you don't necessarily need. So I really don't think there should always be an app for that. This is one that the government did. And I couldn't remember what it was called when I was researching this talk. Is it informed citizens or something else? And it's only available in Italian and French, which doesn't reflect the needs of Florence's varied population, right? So what we're looking at here: the blue areas are flooding areas, and the orange parts are landslides. So pretty much the entire city of Florence is covered. The interesting bits are these extra bad flooding danger zones. But all of this information is trapped in this app, right? 
So for all of you, since everybody's basically an OpenStreetMap contributor, the great thing about Europe is that you also have a lot of street-level imagery available. So you open up the iD editor. This is that same point, one of these points, in the flooding zone. And this is actually an underpass. So this is in the iD editor with Mapillary. And you could check these points if you don't know what they are from the other map. And this is a flooded underpass. So this would be a really good thing to have on your OSM map, right? The other thing that you might find in a city like yours in Europe is that there's a lot of open data, OK? But a lot of this open data doesn't quite make it to OSM. So here we have another classic emergency in Florence, which also has earthquakes. In San Francisco, our volunteer emergency project goes back 30 years, to after the last big one. But in Florence, they have the Misericordia, which goes back to medieval times. So these people are organized. They have the people. And in this case, this is an emergency shelter. And it goes down to the level of the two organizations that are already assigned to this emergency shelter. It tells you the number of showers, how many beds you can get, whether they've had a ham radio and tested it. Any ham people? Yes, I do it. OK, so we don't have anywhere near this kind of public information. But again, guess what? This is on the open data site and not in OSM. So this is what I mean about your translation job, because you've got to figure out the tags and you've got to put it into OSM. There's a very lively OpenStreetMap Italia channel on Telegram. And a lot of the conversation is like, what do we call this thing? What should it be called? And since my first language is English, but I'm American and it was founded by a British guy, a lot of times I'm like, I have no idea, because that's not what we would call it. 
That said, most of the time you can mediate and find what you're looking for. It's already there. But the cool thing is that if it's not there, you can pitch the community a tag. And we've had to do this for a couple of emergency features in San Francisco, and we got both of them through. So it's not that rigid. If there are architectural features or other things that you want to put on your map that are specific to your country, you can definitely do that. So let's talk about emergency 2.0. Florence has had a lot of heat waves that they never had before. And again, we're talking about a wealthy, famous place. They're not unprepared. They're just maybe not prepared in the way that is the most effective. So they have a very elaborate alert system. And again, we're looking at the open data. These are cooling places that you can go to in Florence. In the States they now typically call them cooling centers. And the idea is that it's a public place that has air conditioning that they can't kick you out of. So it's usually a public library. Now, the hottest month of the year is August. Do we have any Italians in the room? What happens in Italy in August? You don't stay in Italy in August. So everybody goes to the beach. Everybody's at the beach. Nobody's working. So the problem with this is that there isn't a single one of these that's open for the whole month of August. We need a plan B. This is a big problem, especially because you don't really know how many people are going to have to play musical chairs, finding a library that they can sit in. So that's another one of the things. And this, unfortunately, for my Italians: these are all the places listed in Florence right now with air conditioning. And even if you know that Italians don't love air conditioning, you know that this can't be right. For comparison, this is what the bikeways look like in Florence. And if anybody knows Florence, it's not a particularly bike-friendly city. 
So just the difference between these two things is pretty obvious. Now if you take a quick look back, and again, this is not your exciting mapping, folks, the libraries are not marked on this map, right? So that's a pretty easy thing to do. And the interesting thing about the air conditioning tag is that it just says, yes, the building has air conditioning. It doesn't say it's going to be on. It doesn't say you can go hang out in there. It doesn't say anything. So that's the basic idea behind that. Now Italy has a lot of, my Italians, churches, right? And churches are also cool places; you know, they tend to be dark and whatever. But let's say they're not ideal for the cooling center thing, because how could you tag that on a map, right? But also because maybe you have to have custodians, or, you know, there's something there. But Florence, for example, also has probably a dozen cloisters. And these courtyards are, just by definition, built to be shady and cool and protected. Here we're looking at the large cloister in Santa Maria Novella. It was opened to the public in 2012 and it is, in fact, big. The other thing is, obviously, once you go map all of these things, the diplomacy comes in, because I would never ask someone to go to the mayor or the emergency services and say, we're going to open up all the cloisters and let people hang out in them. You know, the other, smaller cloister here has these very famous Paolo Uccello frescoes, and that's never going to fly. You're never going to have a bunch of people hanging out playing cards in the shade in that one. But there are plenty of these. And the other thing is that if you start looking around your town, you're going to be like, everything's mapped. But I will tell you what, this courtyard is mapped down to the single tree, but it's not tagged as a courtyard. So again, I think if you went and got all of these things on the map, then you can start pitching people, right? 
Then you can start saying, what if we open up these courtyards? You know, the tourists aren't using them, nobody's using them, whatever. The other thing that I would definitely do, which is also extremely exciting, would be to start tagging businesses that you know have air conditioning. So these, as I'm sure you can guess, are hotels, right, and some restaurants. What other kinds of buildings might have air conditioning that are semi-public? Any guesses? Pardon? Museums. OK. That's an interesting idea. That's what I'm looking for. So the museums in Florence, there are a ton of them, and they do have air conditioning. It's going to be a tough sell, but you could market it, you know. But shopping malls, department stores, all of these places. And again, the tag just says there is air conditioning. There are no other strings attached about the air conditioning. But definitely, if you had time, you could go around and pitch and say, look, all of these businesses, all of these hotels are air conditioned, and who's using the business center in August? No one, right? So your group could potentially organize to bring a group of people to these places and do that. So again, it's part advocacy, part diplomacy. This is a very, very basic rundown of our toolkit, which all of you, since you're all contributors, know. The one thing I would like to point out about this is that we've done a lot of work on paper, and the two projects that we use are both from the EU, because the whole thing is that a lot of this information ends up trapped somewhere where people can't use it when they need it. So probably what you need when it's really hot is not a list of all the cooling centers. You need to know where the closest one to you is and what the hours are, right? The other thing about apps, for flooding and earthquakes: you want to think about connectivity, you want to think about electricity. We've done a lot with paper, and the Americans totally don't get the idea that you need paper. 
So I think that's part of the reason why both of these projects are European. So it's interesting to start thinking about that, since you're all already OSM folks. Anyway, this is what you can do with paper, the kind of things that could help people get to places, because the point isn't really the database; it's getting information to people when they need it, for the reason they need it. I think that's it. Questions? Yeah? Where is OSM organized, where do people meet? It's very hard to get involved at the beginning. Oh, right. So that's a really good question. OSM, let's see, which is the best place to go? All of my other OSM peeps. What country are you based in? Where would you send them? Yeah, mailing list, IRC. Yeah, so the wiki, definitely. Yeah. Yeah, the important thing is to get plugged into the people near you, so you can start asking, why is this tagged like this? Why? Anybody else? Yeah. Yeah? Thank you for drawing attention to the topic. Did you reach out to the developers of the StreetComplete app to include those topics? Oh, that's really great. So I have not. I have not, but I definitely should. So StreetComplete, if you haven't already used it, is a really great way to just tag a lot of stuff, like really banal stuff. Again, this is your super exciting, changing-the-world mapping, but fire hydrants, benches. It has a lot of emergency things. It doesn't have these features, but you're right. We should definitely do that. Thank you. All right. Thank you. |
Bare-metal servers as a container runtime |
So, hello everyone, my name is Florian. I work at Scaleway, where I'm an engineering manager and software developer, and today we'll talk about how we use servers to handle our production workload just as if they were containers. So, a quick slide of context. Scaleway is a European cloud provider. We do a lot of things: data centers, colocation, shared hosting, instances, databases, whatever. We have physical locations in France, Amsterdam and Poland. I work in the storage team; we are a team of ten people and we handle pretty much everything storage-related at Scaleway, so the block and object storage products, and also some older systems like the RPN SAN and the data backup in the Online ecosystem. We have around a thousand servers in production and more than a hundred petabytes of storage. When I joined the team five years ago, the infra was what it was. It had grown organically over the years, the versions were all over the place, and everything was a bit custom for the team's needs at the time. Some servers were locally installed, some were PXE, but nothing was homogeneous, and we had an old Perl-based automation system that suffered a lot because not many people were fluent in Perl, so it was pretty much a huge append-only script that did stuff. So we wanted to start something fresh, something new, something we could work on for a few years, and we started considering options. Everybody at the time was starting to use containers, but we were not fans, because at that time they were not as mature, the tooling and ecosystem were not that great, and also we wanted to try something different. We still use containers for development purposes, CI and whatever, but we decided to do PXE live booting. It comes with some great advantages. First of all, you just plug the server into the network, you put your discovery image, and you are pretty much set up, nothing else to do. 
A reboot is just like taking down a container to update it, and when you have thousands of servers in production, a reboot is not that big of a deal, it's just life. The only downside is that you need to have a solid network and a working DHCP. As a cloud provider, if you don't have network, you can't do anything, so it's not really a downside. So let's talk about automation. After using the Perl stuff we started on Salt. It worked okay, but it was really hard to test modifications before putting them in production, and at the time it was pretty much nobody's job, like no one had clear in-house responsibility for it, so it was not that well maintained. So one day we decided to move to Ansible, mostly because everyone else in the company was moving to Ansible at the time, so we wanted to share our efforts and small libraries so that everybody could do stuff more cleanly and not have something where everyone does things on the side. It's easier to write, easier to test, the learning curve is not as steep as with Salt, and you don't have a central controller that has to be maintained by a central team; you can just keep that in-house. But ultimately we wrote something along the lines of a controller, though it's really dumb stuff: around 250 lines of Go for the server and just shy of 100 for the client, which is written in Python. It's pretty dumb, just an API that a server calls when it boots to identify itself: okay, here you are, here's your configuration. 
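The boot-time exchange described above (a freshly booted server identifies itself and gets its configuration back) can be sketched in a few lines. This is a minimal in-process Python illustration, not Scaleway's controller: the inventory contents, role names, playbook names, and the `config_for` function are all assumptions made for the example, and a real client would make this call over HTTP rather than as a function call.

```python
import json
from typing import Dict

# Hypothetical inventory: server identifier -> role and playbooks to apply.
# Using the IP address as identifier is an assumption for this sketch.
INVENTORY: Dict[str, dict] = {
    "10.0.0.11": {"role": "object-storage", "playbooks": ["network", "raid", "services"]},
    "10.0.0.12": {"role": "block-storage", "playbooks": ["network", "services"]},
}


def config_for(server_id: str, inventory: Dict[str, dict] = INVENTORY) -> str:
    """Return the JSON reply a booting server would receive for its identifier."""
    entry = inventory.get(server_id)
    if entry is None:
        # Unknown machines get an explicit answer instead of a silent default.
        return json.dumps({"status": "unknown", "server": server_id})
    return json.dumps({"status": "ok", "server": server_id, **entry})


# What the client side would do right after boot.
reply = json.loads(config_for("10.0.0.11"))
print(reply["role"], reply["playbooks"])
```

Keeping the controller this small means the interesting logic stays in the playbooks, which is consistent with the speaker calling it "pretty dumb".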
Just to have something a bit cleaner that works well with PXE images, we have split the automation into two parts. One part is our deployment playbooks, whose only job in life is to install and update all the software stacks, and that's pretty much it. Then we have the runtime playbooks, whose job is to set up networking, because basically when you boot in the live boot you just have the basic one interface with a DHCP address, and you need some network configuration to make your services work. They also assemble the RAIDs, mount file systems, tune the OS, like sysctl, CPU governor or whatever, and after that just configure and start your services. One quick note: the install playbooks are run on production servers and during image creation, and some are there just to remove default packages that we do not want to have in the built image. Now just a little bit about how we handle PXE image creation. Basically, a PXE image is a squashfs that you download over the network just after having booted your kernel; it's literally the rootfs of your server, held in RAM. The first way that was used, before I arrived, was just a snapshot of the rootfs of a VM. Afterwards there was a Docker-to-PXE .pl script, basically a huge Perl script that extracted the file system of a container. It came with some limitations, because basically the way that container base images are made means they are not a fully functional OS. Afterwards we used HashiCorp Packer, and now we use a small trick with sshd and Ansible's SSH port that basically traps the Ansible run inside the chroot on the build server. 
The advantage of this is that the same playbooks for the deployment of software are used both for the creation of the image and for updating servers in production. The build system is based on the Ubuntu server image, nothing fancy, and we are using the default live-boot initramfs package that comes with Debian and Ubuntu, with a few tweaks to allow for retries and to avoid boot storms, so that when you have 30 or 40 servers that reboot at the same time, they do not all fetch the squashfs from the same machines, to avoid any networking issues. Here's the fun part. There is a small playbook that's called pxemagic.yml that handles a lot of stuff. That's cool, because the Ubuntu default packages are not all meant to be run inside a live boot environment. First, we avoid triggering the package triggers during kernel install, because it saves a lot of time during the creation of an image, but afterwards you have to rebuild all the DKMS and custom kernel modules to specifically target the kernel version that is inside the chroot, because it's not necessarily the same as the host's. All the AppArmor profiles have to be patched, because they limit which software can access which files, but as you are running from RAM, you have some kind of overlay fs above it. So for NTP, for example, it's not targeting the NTP drift file in /var but somewhere like /var/lib/live/medium/something/overlayfs, so you have to patch all those. Here's the flag; it took like two weeks to find, in an old kernel mailing list, from like 2.4 or something. You need that to support the overlay fs, so you don't break the default initrd configuration. And as the system is amnesiac after each reboot, you have to take into account that we do not use any network configuration utility, so no networkd, no ifupdown, nothing, and you have to take that into account because some symlinks are broken by default. 
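The retry and boot-storm tweak mentioned above (many servers rebooting at once should not all pull the squashfs from the same machine) can be illustrated with a short Python sketch. The talk does not say how the tweak is implemented inside the initramfs; picking a random mirror per attempt and sleeping a random, growing delay between attempts is an assumption made here, and the function and parameter names are hypothetical.

```python
import random
import time
from typing import Callable, Optional, Sequence


def fetch_image(
    mirrors: Sequence[str],
    fetch: Callable[[str], bytes],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Fetch the image from a randomly chosen mirror, retrying with jitter."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        # A random choice spreads simultaneous boots across the image servers.
        mirror = random.choice(mirrors)
        try:
            return fetch(mirror)
        except OSError as exc:
            last_error = exc
            # Random delay with a growing upper bound, so retries do not
            # arrive in lockstep from every rebooting server.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError(f"image fetch failed after {attempts} attempts") from last_error


# Usage with a fake transport that fails twice before succeeding.
calls = []

def flaky_fetch(mirror: str) -> bytes:
    calls.append(mirror)
    if len(calls) < 3:
        raise OSError("mirror busy")
    return b"squashfs-bytes"

image = fetch_image(["img-1", "img-2", "img-3"], flaky_fetch, sleep=lambda s: None)
print(len(calls), image)
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular download tool and makes it testable without a network.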
So we've been doing that for five years and, well, pretty much nothing to say, it just works. We have the convenience of containers, so all the systems in production are pretty much homogeneous, with pretty much the same version everywhere, and we have the comfort of using just plain bare-metal servers without having any issues; like a long time ago, when you updated Docker, it had the bad habit of restarting every container on the host without asking you anything. We could scale from 100 servers to a thousand with pretty much only three to four people handling deployment and installation of servers, and new servers are deployed quite fast; it takes like one hour maximum to deploy 10 new servers, because you just have to collect the MACs, update the DHCP, and boot the machines, and it just works. And if you want to update anything, you just have to reboot, it's pretty simple. I think that was supposed to be fast. Does anyone have questions? Yes? Does that mean that the operating system is basically never written to disk on the client? Excuse me? Does that essentially mean that you have diskless servers there? Yeah. Okay, cool. So the only data that's written on physical disk is... The question. So I was saying that we pretty much install the servers such that there is no OS installed on them, and yes, we do have a few with one, like the ones that handle the images, the installation, the DHCP, and so on, but every other server inside our infrastructure is diskless for the OS part. We do have data from clients, and we tend not to store that in memory, so pretty much every disk is used for clients. Yes? Did you consider using live-build for building your images instead of Packer? So the question was, did we consider using live-build instead of Packer? 
We did use that, but at the time we were running on public virtual machines, and the building of images was pretty slow, so we moved to this solution using chroot and a big-ass build server, and it's just really, really fast, so that's why we use that now. Yes? By swapping out the Salt master, that means that now Ansible needs to be run ad hoc. What kind of automation do you have on top of that? Because having to run Ansible every time from your laptop? Yeah, so as I said, the playbooks are run at boot time on the servers. Basically, on every server you have a small client written in Python that's like 80 lines, and pretty much half of it is comments, and you just call an API saying, hey, my IP is that, can you deploy me? And we have multiple services everywhere that just have a copy of all the deployment files and the configuration files, and it's a reverse Ansible deployment from a server inside there, so you don't need any human interaction to redeploy a server. Any other questions? So thanks for your time. |
Passbolt
Open source password manager for teams |
I just wanted to say thank you to FOSDEM for inviting us again this year, and maybe we can acknowledge the fact that it's been two years without FOSDEM in real life, and it's really nice to see you. And thanks a lot to the volunteers, the organizers and the founders of FOSDEM. Maybe we can give them a quick round of applause, because they are doing an amazing job. Thank you guys. So this is the original Passbolt team in 2017, just after the first launch of Passbolt. The team has grown quite a bit since, and it's nice to see you all. So who's using a password manager in the room? Wow, amazing. Who's using KeePass? Quite a few. And Vaultwarden, Bitwarden? Ah, nobody's perfect. Passbolt? I'm glad to see the Passbolt developers raising their hands. So you're like, okay, wait, another password manager, or is Passbolt different? Well, to assess the difference we will start first with the security. I will tell you a little bit about what the differences are in terms of security between Passbolt and other, more classic password managers. One of the main aspects of Passbolt is that it is based on OpenPGP, so it's based on public key cryptography. Who knows a little bit about OpenPGP? Okay, quite a few. So I don't need to explain so much, but traditional password managers use a master password: the master key is generated from the user password through a derivation. They use a key derivation function, so Argon2 or something less strong. For example, KeePass uses Argon2 and LastPass uses PBKDF2, and I think Bitwarden and Vaultwarden are going to support Argon2 very soon. But historically these algorithms, especially PBKDF2, depend on the number of rounds that you do on the user password, and if the user password is weak, the encryption strength is also affected. So when you use a private key that is truly random, like in Passbolt and some other password managers, like 1Password is doing that as well. 
They pad the user password with a random private key. You get some interesting security properties on top. It's a little bit stronger because it does not depend on the user password strength, and, thanks to OpenPGP being an interoperable standard, you also have the ability to choose which algorithm you want to use. For example, you could choose the size of the RSA key that you are using, or you could opt for elliptic curve cryptography, newer algorithms that are part of, or almost part of, the OpenPGP standard and that reduce the size of the messages, so you can play a little bit with the algorithm depending on your requirements. The way it works in Passbolt is that we encrypt every secret, which is, at its base, a JSON component. We encrypt it once per user, which means that, for example, when you want to revoke the access of somebody, say this person leaves the organization and you want to make sure that their access is completely revoked, we just have to delete the entry for that particular user. How it works with other password managers depends, but what some of them do is create what they call a vault or a collection, and they encrypt, a bit like in OpenPGP, a session key with the public key of the users. So when the user leaves, they are not able to actually revoke the access: if the user, for example, managed to get a copy of the session key, they can still access the archive later, even though they don't have the logical rights. Now, having a private key is not that great when it comes to usability, because you need to transfer that key to other devices, so it makes the interaction with the system a little bit more complicated. For example, when you use a mobile phone, to transfer the key from your browser to the mobile phone we have a succession of QR codes, to make sure that we are not sending the key server side and all that. So it makes the interactions a little bit more complicated than just the user typing their password. 
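The per-user encryption model described above (one encrypted copy of each secret per user, with revocation done by deleting that user's copy) can be sketched as a small data structure. This is an illustration of the bookkeeping only, not Passbolt's schema or code: the `SecretStore` class and its methods are hypothetical, and `placeholder_encrypt` is deliberately NOT encryption; it stands in for OpenPGP encryption to the user's public key.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional

# Stand-in type for "encrypt plaintext to this user's public key".
Encrypt = Callable[[str, bytes], bytes]


def placeholder_encrypt(user_id: str, plaintext: bytes) -> bytes:
    """NOT real encryption: marks which user a copy is meant for (assumption)."""
    return b"for:" + user_id.encode() + b":" + plaintext[::-1]


@dataclass
class SecretStore:
    encrypt: Encrypt
    # resource id -> {user id -> ciphertext encrypted for that user only}
    entries: Dict[str, Dict[str, bytes]] = field(default_factory=dict)

    def share(self, resource_id: str, plaintext: bytes, users: Iterable[str]) -> None:
        """Store one separately encrypted copy of the secret per user."""
        per_user = self.entries.setdefault(resource_id, {})
        for user_id in users:
            per_user[user_id] = self.encrypt(user_id, plaintext)

    def revoke(self, resource_id: str, user_id: str) -> None:
        """Revoking access is just deleting that user's copy."""
        self.entries.get(resource_id, {}).pop(user_id, None)

    def copy_for(self, resource_id: str, user_id: str) -> Optional[bytes]:
        return self.entries.get(resource_id, {}).get(user_id)


store = SecretStore(encrypt=placeholder_encrypt)
store.share("db-root", b"hunter2", ["alice", "bob"])
store.revoke("db-root", "bob")
print(store.copy_for("db-root", "bob"))                # None
print(store.copy_for("db-root", "alice") is not None)  # True
```

Because there is no shared session key in this layout, removing one user's entry does not require re-encrypting anything for the remaining users. As with any revocation scheme, a secret the user already decrypted should still be rotated.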
The advantage of having public key cryptography available is that we can also change the authentication system. We have a challenge-based authentication system, where the user needs to encrypt a randomly generated token for the server; the server will verify the signature and send back that token, and at the same time encrypt with the user's public key another random token that will be used by the user to authenticate later. So it's in practice much stronger than just sending, for example, the password or a hashed version of the password, because each authentication attempt is unique, and you also have the advantage of checking the authenticity of the server at the same time. So it's not prone to credential stuffing: you cannot, for example, try multiple passwords and try to authenticate with that, you need to prove that you have possession of the private key, twice. Another big difference with the other password managers, especially the ones that are online, is that we force the usage of a browser extension. This has the advantage that if the server is compromised, an attacker cannot modify the JavaScript that is running the application. They cannot, for example, write a customization that takes the passphrase and sends it somewhere else. So if the server is compromised, they cannot change the code of the application that is run on the client. One of the advantages of this is that you can also roll out updates automatically. So for example, if you're using Passbolt in your organization and there is a flaw in the client, you will get the updates automatically; you don't need to update your server to get a fix on the client. This has the disadvantage that you need to trust us with the update, or at least you need to trust the web store, or you need to basically set up the web store yourself. And also, it's not specific to Passbolt, but when you run a browser extension, typically the website can find out if you have this extension installed or not, or at least find out if you have an 
extension installed. So one of the advantages of having a browser extension is that you can do form interaction, so for example you can suggest things in a form, or that sort of thing. When you see the Passbolt application when you visit a website, it's actually not the website serving this application. Everything is in one iframe: the website is basically just serving you a white page, and the browser extension is injecting an iframe, and the website cannot enter inside that iframe, thanks to how browsers sandbox the iframes of browser extensions, because they consider them as being on a different domain. We also have an anti-phishing mechanism available by default. You've seen, maybe with 1Password or Bitwarden, there are campaigns going on at the moment where they try to trick the users into entering their passphrase. In the case of Passbolt we have a mechanism built in by default. So as you can see, we are very transparent about the risks, the residual risks and the strengths of Passbolt. We are 100% open source. We are audited, at least I think it was 10 times in 18 months, and we have one audit going on right now and another audit at the end of February. We work mostly with Cure53; they are based in Germany and they do a lot of auditing for password managers. So every time we release a big feature, they audit the changes. Of course, it doesn't mean that it's perfect. We are human, so it's possible that there are some mistakes in the libraries that we use or in what we are doing, but at least we are trying to be transparent about the efforts that we make to report these vulnerabilities, if any, and fix them in a timely manner. OpenPGP is not perfect: you have, like, old algorithms that you don't want to run, so we also need to make sure that we are not letting you use bad algorithms. It's not quantum resistant, we still have a lot of metadata that is not encrypted, and we don't offer user key 
rotations. So all these risks are explained to the end user. Of course, not everybody can understand this, but if you're an administrator running this, then you have access to this information. One thing I didn't mention is that we are made in Luxembourg, so if you're into digital sovereignty, that might be interesting for you. So okay, security, that was two-thirds of the talk, sorry. But what does it look like? It's mostly a web application. You can have it on most browsers except Safari. We have a desktop app coming soon, and Android and iOS native applications. One of the strengths of Passbolt is that you can assign permissions in a granular fashion. Since the secret is encrypted once per person per entry, we are able to do interesting user experience when it comes to sharing. So for example, we can share with groups, we can assign rights to folders, and instead of having rights at the collection level, where everybody that has access to the collection has the same rights for all the entries in the collection, we are able to do things a little bit more fine-grained. Since you are all developers, it might interest you as well that if you have curl and GPG on the system, you can pretty much interact with Passbolt, because it doesn't require any fancy technology to be able to retrieve the secret, decrypt it, or even push an update. So you can do some interesting things: for example, if you want to inject secrets in your pipelines, or even build something with Ansible, you can integrate with Passbolt quite easily. As I mentioned before, we also have the quick access, which is an interaction in the page that allows you and your users, especially the non-advanced users, to be prompted to use a password manager. We have iOS and Android apps. They are native apps, and you can use biometrics to release the passphrase, so you don't have to type your passphrase all the time. You can host it yourself. There is no phoning home. Basically, it works offline if you 
want, and some of the organizations that are using Passbolt are working in an air-gapped environment, and it works fine. We have packages for basically all distributions, but we are trying to keep up with all the versions and it's kind of complicated, so we might not have precisely the version that you want, but there is a good chance that you will find something that interests you. And we have a one-click install with an AWS AMI and DigitalOcean, if you are into that kind of thing. What's cooking for 2023? We are doing mobile-to-mobile key transfer. We have desktop to mobile; we want to do mobile to mobile and then mobile to desktop, so basically people can start their journey on Passbolt from any device and transfer their key easily, but it's not completely there yet. We want to allow administrators to enforce MFA. Even though the authentication in Passbolt is quite strong, people still want to tick that MFA box, and we want to give them the tools to do that. We will support passkeys and WebAuthn for 2FA as well. There is a new help site, some more great configuration stuff coming, user self-registration, the desktop app, and then later on we are going to work on password expiry. Manifest V3: it's the new format that is pushed by Google for browser extensions. It brings zero value for the end user, but Google says we have to do it. And then custom fields and more content types, and the ability to choose what is encrypted or not, so that maybe your organization wants to search on certain fields, and some other organization wants to have them encrypted. So we will give you the flexibility to create your own custom types and define what is searchable and what is not. So I had a lot of slides on how it's made, but obviously not a lot of time. If you are interested and you want to have a chat with us on how it's made, we will be at the bar behind at 6 o'clock and we will be giving out some swag. We have a little fortune wheel that you can spin, and you can even win a car. Okay, 
that's all for me, thanks a lot. Any questions for Remy? We have like 42 seconds. The question is asking if it would be possible to have one key per device instead of having one key to rule them all. We've talked about this, and it's an interesting idea, but that would mean a breaking change. But yeah, that's an interesting idea. And as I mentioned, there is no key revocation at the moment, but this is also something that we want to do, to allow people to rotate their keys and that sort of thing. Thank you, Remy, thank you very much. |
Is YAML the Answer?
… and if so, what has ever been the question? |
This entire talk is inspired by a single remark by a former co-worker of mine, who just casually dropped the line that YAML was so simple that nobody could ever attain mastership in it. So, a question towards the audience, also slight audience participation, sorry: who would tend to agree? That is almost nobody. That's a shame, because you would be in good company. This is in the goals section of every spec of YAML, there it is. So let's get a bit into detail. At the very core, YAML exists to provide a printable text presentation of structured data. In that regard, it is a rival to things like JSON, XML and other formats. It's been around for quite a while; we're almost looking at 20 years of YAML now. It is somewhat interwoven with JSON: since version 1.2, all of JSON is also valid YAML. That introduces an interesting relation, because since, I think, 2018, JSON is a strict subset of JavaScript, thus there's an intersection of YAML and JavaScript now that is precisely JSON. Let's not get into the argument of whether that's good or not. If you write a lot of YAML, most of the examples, most of the real-life specimens of YAML, will let you believe that there are no real types in YAML. Actually, the opposite is true: YAML is heavily typed, almost everything in a YAML document has a type. Here's a selection. There's nothing surprising; all the types you see here are also present in JSON, and that also inspires an interesting question. Let's say you have a YAML document: could you just change the syntax to JSON and have a valid presentation? Would that work? Oh, you're too good. I attracted the wrong audience, because, yeah, no, that doesn't work. It falls apart with the map type. The map type in YAML is really, really wide. It does allow for such things as composite keys for its entries. If you're really interested, they're introduced by, what's the token? Question mark, space. So that's a thing. 
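The point about composite keys can be illustrated without a YAML parser. The sketch below uses a Python dictionary with a tuple key as a stand-in for a YAML mapping entry whose key is itself a sequence; the key and value are made up for illustration. The standard `json` module refuses to serialize it, and one possible workaround (an assumption, not something the talk proposes) is to flatten the mapping into a list of key/value pairs.

```python
import json

# Stand-in for a YAML mapping with a composite (sequence) key,
# which YAML would introduce with "? " in block style.
composite = {("host", 8080): "primary"}

try:
    json.dumps(composite)
    json_accepts_composite_key = True
except TypeError:
    # JSON object keys must be strings, so serialization fails.
    json_accepts_composite_key = False

# Possible workaround: represent the mapping as a list of [key, value] pairs.
as_pairs = json.dumps([[list(key), value] for key, value in composite.items()])
round_tripped = json.loads(as_pairs)
```

So a plain change of syntax is not enough; the data model itself has to be reshaped before such a document fits into JSON.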
There are not-so-basic types: omap is an ordered map (the regular mapping is not ordered), set is somewhat, yeah, special. This is not a complete list, by the way. There's a way to have YAML inside YAML, that's nice. And there's a type specifically for binary data. This one is really, really useful, provided it actually works. Try it. The problem really is that in JSON, XML, and also in YAML, you can't have certain byte values. So the first 16, I believe, code points are off-limits; they are control characters, they can't be part of the stream. With this type, you can just Base64 everything, and once your YAML is being parsed, that is expanded into the raw binary. Pretty neat, eh? First example: this is not minimal YAML, but it does help to illustrate a few points. I suppose a lot of you have a lot to do with, say, OpenShift, Kubernetes and the like. You've seen three hyphens a lot, haven't you? Okay. Who knows what that does? Not quite, no, no, it's not the beginning of the document. What this is is a document separator. So what you see here is not actually a YAML document, it's a YAML stream. YAML is meant to be streamed, possibly. Don't do that unless you have solid message framing, because truncated YAML tends to still be valid YAML, so think twice before you do that. The thing is, most of the tools that you have for YAML will assume there's only ever one document. You need to do some convincing to get all the documents out of the stream. By the way, do you know what happens if you omit those three hyphens? You do. Okay, pretty much everything you see here, if it's omitted, it's implied. Also great, eh? Save for the version number. There's also a bit of homework for you: there's a chance this is going to break your tools, because they do not understand version 1.2. The majority of the tools are still stuck at 1.1. Yeah, let's get into the types. This is something you see quite commonly. What do we have here? 
It's a mapping with a single entry, which has the key variables. Inside is another mapping; for the key app version, we've got something in it. Now there's no indication as to what data type that is. It's an integer, right? Agree? It depends on your schema. It depends on your schema. We're not quite there yet, but this is mostly pure YAML, so this is an integer, we agree, for the time being. This is a float, right? This is a string, and this is still a string. You may have noticed I omitted a few things. What's "3."? It depends on your schema. No? Yes, it does. The regular expression for float says "3." is a float. What is ".1" then? It's also a float. So if you want to make sure this is really always a string, you may be tempted to do something like this: quoting is a thing in YAML as well, big surprise, so you quote it. The professional may do something like this. This is a tag; the two exclamation marks mean it's a global tag. So there's global meaning... oh, I'm running out of time. It's a string, take my word for it. The true professional who lost the plot may do something like this. These tags are identified by URNs, and there's also a namespace mechanism in YAML. This is the part where you go, yay, namespaces. So, advanced features. This is something you do not commonly see. Most users of YAML are probably unaware this exists, but you have some tools to reduce duplication and redundancy within your structures. One is anchors, they're basically markers, and the other is aliases, which invoke those anchors. Pretty nifty. These also give way to an attack known as a billion laughs. Basically, you can set an anchor to an array or list of aliases which themselves refer to a lot of anchors. So this allows for a very compact presentation of a very complex data structure that quickly expands to plenty of bytes. So if you happen to consume YAML from untrusted sources, this is something you should know. 
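To get a feeling for how quickly nested aliases blow up, here is a small counting sketch. It does not parse YAML; it only models anchors as named lists in which an item starting with `*` stands for an alias. The anchor names, the fan-out of ten and the nine levels are assumptions chosen to match the classic "billion laughs" shape.

```python
from functools import lru_cache

# Anchor name -> list of items. An item starting with "*" is an alias
# to another anchor; anything else is a plain leaf value.
anchors = {"a": ["lol"] * 10}
previous = "a"
for name in "bcdefghi":
    # Each level is a list of ten aliases to the level below it.
    anchors[name] = ["*" + previous] * 10
    previous = name


@lru_cache(maxsize=None)
def expanded_leaves(name: str) -> int:
    """Count the leaf values a naive parser would materialize for an anchor."""
    total = 0
    for item in anchors[name]:
        if item.startswith("*"):
            total += expanded_leaves(item[1:])
        else:
            total += 1
    return total


written_items = sum(len(items) for items in anchors.values())
expanded_items = expanded_leaves("i")
```

Ninety items written in the document correspond to one billion leaves once every alias is expanded, which is why consumers of untrusted YAML usually want a limit on alias expansion or a parser that keeps aliases as shared references.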
Merge operator: this is another really nifty tool. It's only valid in 1.1 of YAML, it got immediately deprecated in 1.2, and it's also a data type. It's there to basically merge mappings into other mappings, great stuff. So, tales from the trenches. These are examples that really happened. Do you see... I should explain this: this is part of a GitLab setup. The script is expected to be a sequence of strings to be executed on the shell. That's not what it is. What is the answer? No? Oh, very good, yes, that's not a string, it's a mapping. Because of the single-pass design of YAML, the algorithm is very, very greedy. So it sees that colon there and says, oh great, this is a mapping, and completely ignores the quotations. So how do you fix this? There are four ways, yes. I think the third one is my favorite; the fourth is really unsafe because, once again, raw binary data, perhaps don't do that. This is another favorite, again GitLab CI. We have a mapping. We try to set some variables for GitLab to expand. What is the content of bar? I must remind you, mappings don't have order. Oh, who knows? I thought the answer was for us. Oh, no, it's going to be empty, it's going to be empty. The thing is, the mapping doesn't have an order, but the YAML implementation in GitLab has other ideas, so it takes the mapping and applies an order to it, so bar goes to the top because of lexicographic order. And then there's a single round of interpolation, and foo at that point is empty as a variable. So how do you fix that? One way out is you rename your variables, or this. Thank you, thank you. This never happened to me, but it's been too good to pass up. What do we have here? What does languages contain? It's a sequence of... we will need to drink. It's one string and one Boolean, because no, which was supposed to represent Norwegian here, is accepted as a Boolean, so you need to tag that or quote it, however you wish. 
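The implicit typing that causes the Norway problem, and the earlier surprise that "3." and ".1" are floats, can be sketched as a tiny scalar classifier. This is a rough illustration, not a YAML implementation: the Boolean word list follows the YAML 1.1 bool type as far as I know it, the integer and float patterns are deliberately simplified assumptions rather than the specification's regular expressions, and real parsers differ in which forms they accept. The function name `guess_type` is hypothetical.

```python
import re

# Words a YAML 1.1 style resolver treats as Booleans when left unquoted.
YAML11_BOOL_WORDS = {
    "y", "Y", "yes", "Yes", "YES", "n", "N", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}

# Simplified patterns (assumption): decimal integers and decimal floats only.
SIMPLE_INT = re.compile(r"[-+]?\d+")
SIMPLE_FLOAT = re.compile(r"[-+]?(\d+\.\d*|\.\d+)([eE][-+]?\d+)?")


def guess_type(plain_scalar: str) -> str:
    """Return the type an implicit resolver might pick for an unquoted scalar."""
    if plain_scalar in YAML11_BOOL_WORDS:
        return "bool"
    if SIMPLE_INT.fullmatch(plain_scalar):
        return "int"
    if SIMPLE_FLOAT.fullmatch(plain_scalar):
        return "float"
    return "str"


# Hypothetical list of unquoted country or language codes.
languages = ["se", "no"]
resolved = [guess_type(code) for code in languages]
```

With these rules `resolved` mixes a string and a Boolean, and `guess_type("3.")` and `guess_type(".1")` both come out as floats, while a version-like value such as `1.0.0` stays a string. Quoting the scalar or tagging it as a string takes the guesswork away.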
So, my observations. Because of the edge cases and the hidden complexity, there's a huge disparity in the features that various tools actually support, and they also show different behavior. It's one hot mess, I can't put it in other words. Also, if you're writing YAML, it is admittedly really a pleasure, it's easy to type, but you can never let your guard down. YAML will try to do a lot of legwork for you, being very accommodating, sometimes in the worst way possible. Some proposals on that: because the versions of YAML really do show different behavior, you should start to tag your streams accordingly. You should see that the tools you use for consuming YAML are properly configured; things like language-specific extensions, so part of the YAML stream could be evaluated in the process, read, executed, and that's a bit scary. As most of, yes, right, as most YAML is relatively simple, the complexity is mostly because it's deeply nested and you can't properly edit it. But StrictYAML may be a solution: it's a strict subset of YAML with a lot of the ambiguity removed, way safer, way easier; tooling support is so-so. So I teased a question with this talk. The question could be: YAML is exactly then the answer if all you wanted was JSON with comments. Some other niceties as well, but that's pretty much it. So this concludes my talk, thank you very much, you've been terrific. Are there any questions? Please repeat the questions. We've got four seconds. Is JSON any better? There you have it. It depends on what your use case is, really. Repeat the question, please. Oh yes, the question was: is JSON any better? So yes, thank you very much again. |
CNI 2.0: Vive la révolution |
Awesome, thanks everybody for joining me to chat about CNI 2.0. My name's Doug. I work in a variety of areas, primarily with Kubernetes networking, but to scale the conversation, I wanted to see a show of hands of who knows what CNI is. All right, that's pretty good. And for everyone else, it is the Container Network Interface. And really what I think most people know CNI for is that it's a plug-in that you use to get your network plumbed in Kubernetes. So it's going to define how your pods are going to talk to one another. But there's kind of an underlying part of CNI that I'm not sure everybody realizes, which is that it is COE agnostic. Who knows what a COE is? Not as many people. Well, I have a junior engineer on my team, and he was asking me some questions about it. And I said, I can't do that, because it's COE agnostic, and he goes, what's a COE? And I said, well, it's a container orchestration engine. And he's like, well, what's a container orchestration engine? Well, I'm wondering if anyone can name one container orchestration engine. Kubernetes, yeah, totally. And really, we just kind of talk about Kubernetes, right? Because who can name two of these? That's Kubernetes. Well, it's an opinionated Kubernetes, for sure. But it's still Kubernetes under the hood. So yeah, there you go. So now we know who's been around this area for a long time, because we used to have to talk about these generically, like we were going to have a bunch of them. But Kubernetes won. And CNI is container orchestration agnostic. Well, does it need to be anymore? I'm really not sure about that. So I wanted to bring up that we love CNI. It's great. It's really modular. It allows us to have an interface that we use to be able to do the detailed kind of work that we need to do as people who care about networking in Kubernetes. We want to get in there, and we want to have this interface to say, this is how I want to set up networking in Kubernetes. 
And really, something that's kind of happening because it's container orchestration agnostic is that people might not actually use it anymore. They're thinking about Kubernetes, and they don't care that it's container orchestration agnostic. They really want to have a deeper integration, and they're ignoring this. They want to do everything with Kubernetes, and they don't want to just say, oh, well, since it's container orchestration agnostic, CNI doesn't do anything with Kubernetes. And so people are starting to ignore CNI. This is really bad for our space. What it's doing is that people are basically saying, I'm going to totally bypass this API. I'm just going to do everything in Kubernetes. I'm just going to set up my networking however I want. If you're somebody that has to go and administer these systems, or you're somebody who has to develop on these systems, you're going to have to track down all of these things that somebody else did that didn't follow this standard. I think it's really bad for our ecosystem. So I'm kind of on a warpath here to let people know this is happening. I'm not going to call out any names, but there are plenty of providers that have Kubernetes as a service or as something that you buy and install out of the box. And they're like, oh, well, you don't need to worry about this. We've taken care of it for you. And I'd love to believe that you don't have to worry about this anymore. I've been working in production systems, in operations, as a developer of these kinds of things. And I realized that the world is kind of a dirtier place than that. There's always going to be stuff that you want to get in there for. And when you are using pure upstream Kubernetes, you want the detail, to get in and to figure out how this stuff works. I don't want to have to take apart somebody else's science experiment. And the reason people are doing this is because they want a tighter integration with Kubernetes. So CNI plugins are binaries. 
They run on disk. They're called by your container runtime, and it executes these binaries on disk. So for example, the junior engineer on my team: we deal with CNI plugins all day long, day in and day out. And he's looking at these other teams, and they're working on Kubernetes operators, all this stuff that integrates tightly with Kubernetes. And he's like, look at all these awesome tools that they have to interactively debug their programs. And if I want to do that with a CNI plugin that's not really running in Kubernetes, because it's container orchestration agnostic... well, I can see why some Kubernetes distributors are saying, I'll just ignore it and do this another way, because I can get a tighter Kubernetes integration if I just bypass CNI. I am a maintainer of something called Multus CNI. What it's used for is having multiple interfaces in a pod in Kubernetes. So if you're doing something that's more of a high-powered networking kind of thing, where you want to have isolation of networks (a classic thing would be control and media on separate networks, where you want to divide traffic so that it's isolated), that's the kind of thing you would use Multus for. And Multus is Kubernetes-aware, and it's CNI-aware, and it's designed to give you these multiple network interfaces. But because it's Kubernetes-aware and CNI-aware, people are trying to use it as kind of a CNI runtime, and that's not what it is. So today we have CNI 1.0, and CNI 2.0 is on the horizon. And as this conversation came up, I'm seeing more people in my community coming to me like, hey, I want Multus CNI to do this thing that's in the CNI spec. And I mean, I want to make it work as well as I can to fit everyone's use cases, but I'm really starting to realize, hey, we need to get this kind of functionality into CNI itself and to use CNI in a way that really has this Kubernetes-awareness. 
So I'm really trying to invite everyone to get involved and to make sure that CNI and Kubernetes are kind of like a happy family together. And I think that we've got a strong opportunity here. I'm not sure if anyone saw the previous lightning talk, which was about YAML, but kind of a weird thing between Kubernetes and CNI is that if you are specifying workloads and resources, et cetera, your pod specs in Kubernetes, you're using YAML, which has its problems, as you saw in the last talk, but CNI itself uses JSON. So when you're trying to marry these two worlds together, you have this kind of problem, especially in my space with Multus, where you're kind of multiplexing CNI plugins to get these multiple interfaces. So you're taking YAML that's in Kubernetes and then you're packing JSON into those specs, and it's kind of dirty in its own way. So that's one of those things I'd like to see happen better. And I think it's really awkward when you're trying to programmatically interact with this stuff. So sure, as a user, you specify your YAML, you pack some JSON in it, no big deal. But if you're writing an application that parses that YAML, it also has to parse the JSON inside it, which is worse. So I want everyone to have the CNI 2.0 revolution live long and strong, and so I'm trying to get everyone to get involved. And this is a space that I can invite you to, a working group that I know and love, the Kubernetes Network Plumbing Working Group. We meet every other Thursday at a time that's supposed to be the most friendly for Asia, Europe and the US. And we will be discussing this until it's solved. And we're going to take these considerations up with the CNI community as well. I'd love to see any faces join, and we can keep rocking and rolling on this. So thank you. And yeah, the floor is open for any questions. So the question is about the talk given by Casey Callendrello from Red Hat, no longer at Red Hat, a couple of KubeCons ago, that was about CNI 2.0. 
And yes, this is a related effort. And I think that what Casey was talking about at the time was this: one of the problems that I mentioned was that you've got these CNI plugins, they're binaries on disk; wouldn't it be better if, say, they were containerized? And that was something that Casey was talking about, that we'd like to see CNI plugins be containerized instead of binaries on disk. They're going to be more familiar to folks that work on Kubernetes applications. He was also talking about getting a gRPC interface. But I think that this is kind of a newer thought process. And I don't want to put words in Casey's mouth, so I won't. But I have a feeling that he is also on board. And for those of you who don't know, Casey Callendrello is the originator of CNI and is also an awesome guy. So yeah, related. Go for it. OK, so in that talk, he was talking mostly about how CNI would be more like a more complete life cycle management for networks and all of that. But at the same time, there is now an activity to create a proper API for networks. So how do you make this a happy marriage? So the question is, and it's a great one: one thing that was talked about in that presentation was the idea of a more complete life cycle management, and there is also concurrently an effort happening now to define a networks object for Kubernetes, so a data representation of networks. So this is a complex two-part question and I have 90 seconds for it, but I love it. So yes, second part first: there is an effort that I call Kubernetes-native multi-networking. If you join the SIG Network call, you can find out all the connection information about it. Very interesting effort. And as I mentioned, Multus CNI does multi-networking stuff. That's awesome. And that particular conversation is, to me, bringing up lots of questions about what CNI 2.0 is going to look like. 
And for the first part of the question, which is richer life cycle management of networking in containers: something about CNI is that it primarily functions on container creation and container deletion. There are some exceptions to this, but primarily, CNI ADD is a command that happens when your container is created: your CNI plugin kicks off, does its work, goes to sleep, dies. And then DEL, when your container is deleted: it tears it down and cleans it up. However, networking can still happen between those two points, right? So things happen, things change. IPv6: you could have SLAAC happening, and auto-assigned routing and IPs. And that's it. So we want to fix that too. So thank you very much. Appreciate it. Thanks for the question. |
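The ADD and DEL life cycle described in the talk can be sketched from the point of view of a container runtime. The environment variable names (`CNI_COMMAND`, `CNI_CONTAINERID`, `CNI_NETNS`, `CNI_IFNAME`, `CNI_PATH`) and the JSON configuration on standard input follow the CNI specification as I understand it; the helper `build_invocation`, the container ID, the namespace path and the network name are hypothetical, and the sketch only builds the invocation rather than executing a plugin.

```python
import json


def build_invocation(command, container_id, netns, ifname, net_conf,
                     plugin_dir="/opt/cni/bin"):
    """Describe how a runtime would call a CNI plugin binary (not executed here)."""
    env = {
        "CNI_COMMAND": command,        # e.g. "ADD" on creation, "DEL" on deletion
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns,            # path to the container's network namespace
        "CNI_IFNAME": ifname,          # interface name to create inside the container
        "CNI_PATH": plugin_dir,        # where plugin binaries are looked up
    }
    argv = [f"{plugin_dir}/{net_conf['type']}"]   # the binary on disk
    stdin_payload = json.dumps(net_conf)          # network config is passed as JSON
    return argv, env, stdin_payload


# Hypothetical network configuration for illustration.
net_conf = {"cniVersion": "1.0.0", "name": "example-net", "type": "bridge"}

add_call = build_invocation("ADD", "abc123", "/var/run/netns/abc123", "eth0", net_conf)
del_call = build_invocation("DEL", "abc123", "/var/run/netns/abc123", "eth0", net_conf)
```

Nothing in this shape runs between the two calls, which is the gap the speaker points at: changes such as IPv6 address autoconfiguration happen while the plugin process no longer exists.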
Staging of Artifacts in a Build System |
So good evening, everyone, and welcome to the last talk in this session. I hope you still have some energy left. My name is Sascha Roloff. I'm working at the Huawei Intelligent Cloud Technology Lab of the Huawei Munich Research Center, and today we are going to take a look under the hood of build systems: what common practices are currently used in basically all build systems, why many of them are suboptimal in certain regards, and how they can be improved by a concept called staging. In order to explain the issues with current build systems, I'll jump directly into an example, and I guess many of you have used make once or twice in your open source development. So let's start with this classic build system. We want to create a build description for a very simple hello world application composed of a hello binary and a greet library. The greeting phrase is hard-coded inside the hello binary, and a greetee can be injected at compile time into the greet library. So this is a makefile. We have our rules, which describe which artifacts are generated by actions based on a set of input artifacts, and we have different actions involved to generate the final binary. For example, we have compile actions to generate the object files, like hello.o or greet.o, we have an archive action for the greet library, and the final linking action to actually generate the binary. At the end we also want to create some sample output, so we just take the output of hello world and store it in a text file. So nothing spectacular right now: each artifact is associated with a file on the file system, so the actions can directly operate on it. If we execute the build, we see all the actions are executed, everything is fine, and the output is generated. And now the boss comes into our office and he's unhappy with our result. He wants to put it basically on a poster, and it should be more readable. 
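The rule structure just described (compile, archive, link, then capture the program output) can be modelled as a small dependency graph. The sketch below is an assumption-laden illustration: the file names `hello.c`, `greet.c`, `libgreet.a` and `out.txt` are hypothetical stand-ins for what the slide's makefile might contain, and the ordering uses Python's standard `graphlib` module rather than make's own algorithm.

```python
from graphlib import TopologicalSorter

# Each artifact maps to the input artifacts its action needs.
# Every name is also a file name, as in the makefile from the talk.
rules = {
    "hello.o": ["hello.c"],              # compile action
    "greet.o": ["greet.c"],              # compile action
    "libgreet.a": ["greet.o"],           # archive action
    "hello": ["hello.o", "libgreet.a"],  # link action
    "out.txt": ["hello"],                # run the binary, store its output
}

# A valid build order lists every input before the artifact that uses it.
build_order = list(TopologicalSorter(rules).static_order())
```

Because every artifact is identified by its file name, a second variant of `hello` would need a second, distinct name, which is the constraint the rest of the talk sets out to remove.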
Okay, so then let's add some post-processing to the task. We just take the output of the hello binary, store it in an intermediate file, put this intermediate file into the post-processing, which basically capitalizes all letters, and store the result as post-processed text. And then finally we put this text into the target sample output. We execute this, we see new actions are executed, and the result is fine. It looks much better now, hello world in capital letters, great. But the boss is still unhappy. He wants to have some localization. He doesn't want to greet the whole world, he just wants to greet Munich and Brussels, and he wants to have both in a single makefile. So what do we have to do now? Basically, we have two program variants now, and what should we do in order to reuse most of the rules that we already have? We can use a for loop over the location-dependent targets and interpolate the city name into the artifact names, as you can see here. So we now have not only a single hello binary, but two hello binaries with a dot and the name of the city, and these are our two program variants. As you can see, there is a lot of string interpolation coming into our makefile, and it doesn't make it really readable, but we have to do it, because each artifact is associated with a file on the file system and this needs to have a unique name. And when we execute it, okay, now we get a few more actions, but it's working, and we see that the output is as required and we greet Munich and Brussels. But the boss is still, I mean, he's happy now with our output, but now he's unhappy with our implementation. He says that's not maintainable. Why do we use a build system from the 70s? Use a modern one, they are supposed to do much better now. Well, okay, then let's use Bazel, and this is what it looks like in Bazel, so the same application, and as it turns out they are better, but not in all 
regards. They introduce high-level concepts like cc_binary and cc_library, so we don't have to manually write object file creation and linking; everything is wrapped inside these high-level concept calls, and also our bash calls are wrapped in these genrule targets. It looks a bit more readable now, but we still have this string interpolation here and the for loops over the city names. Why is it actually like that? Why do we need this in a modern build system? The reason is that Bazel also associates each artifact with a file on the file system. This brings us to an important observation: even modern build systems require unique names for your artifacts, because they basically follow a design decision implemented by make in the 70s, namely that each artifact needs to have a fixed location in the file system. For make this was perfectly fine at that time, because there was not much else to do in order to determine which part of a program needs to be recomputed than to compare timestamps, and for this you need files. But there is actually no reason anymore to do this in modern build systems, because they isolate their actions anyway in order to avoid getting unwanted dependencies into their builds: their actions are executed either in a separate directory or in a container in order to better control the dependencies. So when they execute their actions in isolation anyway, why don't we allow the targets to specify where to put the artifacts? This is exactly the idea of staging. So basically there is no technical reason for modern build systems to keep the restriction of associating each artifact with a file. Instead we propose that we should stop following this common practice and apply staging, and the idea of staging is that an action can freely
select the location of input and output artifacts within its working directory. This basically introduces a separation between physical and logical paths: inside an action you only work on the logical paths, and the action can freely decide where to put a generated artifact or where it wants to read an input artifact. So this is our proposal, to apply staging. How could it look if it's implemented in a build system? This is our project, it's called justbuild, and this is the build description that we propose. We also have the definitions of our targets here, and we also use high-level concepts like binaries and libraries. In this JSON syntax, the type just selects which kind of target it is. What we can see inside the target definitions is that our artifacts are named without string interpolation, so we don't need to artificially invent unique names for our artifacts; they are just what they are. For example, in this target here we just access the hello binary; even though we will have two of these binaries, we just write hello, and we don't care, because it's staged: what is created by the dependency is just staged at the location where we need it. We still have the for loop, which of course we still need, and which basically creates two configurations. The variable that is created here is propagated to all the targets it depends on, and it propagates until the greet library, which then reads this configuration variable and injects it into the compile command. So this is how a description could look with staging. From this description we can also generate a so-called target graph, which shows the dependencies of the targets: main depends on all, and all depends on two post-process targets because we have two configured targets, so the greet library basically is duplicated and this
propagates until the post-process target. These targets are basically high-level concepts; if you want to take a look at which actions are actually executed, you can also generate an action graph, which shows the data flow, which is why the arrows are inverted. It's a bipartite graph: the ellipses are the artifacts and the rectangles are the actions. You can really see the artifact names are the same, so post process dot txt and post process dot txt are the same names in both branches, and since they are staged at logical paths, there's no problem, there's no conflict. This would not work in make; you would have to use unique names. Okay, so what happens when we execute it? We just select the target that we want to build, and there is some output coming here, and it says we have 12 actions and zero cache hits, of course, because we build for the first time. You can count, it's 12 actions, and it's just built. The artifact is somewhere; it could be in a remote execution, and then it just exists in a remote CAS. If you want to have the artifact in your local folder, then you have to install it. When we execute the installation, we now see again 12 actions, and also 12 cache hits, because everything is known already, and then the file is actually in your local directory, and we see the content is fine. We don't even need to store it in our local directory: we can just print the content of an artifact with the minus p option. If we take staging seriously, we also get very nice implications. For example, assume that you have external source code that you want to use in your project, and you want to apply some patches to it. How do you do it? Normally you would copy it and apply the patch to the copy, because you don't want to modify the original source code, and this results in a lot of maintenance problems. But with staging this can be done much easier
with logical in-place patching: you just apply the patch on the logical path. Let's take a look at how this could look. We have now put our example files in a third-party directory outside of our project, next to a directory with the patches, and the patch just replaces the hello greeting phrase with a bonjour greeting phrase. We just have to add a single block to our build description, which points to our patch and to the file that needs to be patched, and that's it. The resulting target graph shows that we now have one more target here, the hello.cpp source target, and the binaries now depend on this extra target. Also in the action graph you can see that there is just a single new action added: where earlier there was hello.cpp, there is now the patched version of hello.cpp, and it's just another input. If something is changed in the patch, all dependent targets are executed. Okay, if we execute it we see bonjour Munich, bonjour Brussels, works well. So, quickly, to summarize my talk: as we have seen, there are some inconvenient habits in modern build systems, and we propose to apply staging instead to make build systems better. You will have a couple of advantages if you apply staging, which are written here. And this is not only a concept, it's already implemented, so if you want to take a look at our project, please come by. And now the stage is yours, thanks for your attention. Are there any questions? There is a question. Yeah, exactly, we actually use content-addressable storage. So, to repeat the question: the question was how we identify which source code we actually need for the staging, and we apply content-addressable storage. We determine a hash from all of our source code, and then we also know what has been changed or not. Any other
questions? Yeah, so the question was whether we use the JSON syntax. Yeah, no, we decided on JSON, and JSON is used as our build description syntax. Okay, so how many developers are working on it, and is it widely used? A very good question. Actually, we were open sourced only recently, and in total five developers are currently working on it. But we really try to implement the new concepts in this build system and make it a really sound build system compared to other modern build systems. So please just take a look at our project, and there is also a nice tutorial, which explains everything nicely, and I hope to see you there. Okay, thank you for the talk. Thank you, goodbye, thank you for coming. |
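As a rough illustration of the staging idea from the build-systems talk above, here is a minimal Python sketch. It is not the build tool's implementation or API: `ContentStore`, `run_action`, `greet`, `post_process` and the file names `out.txt`, `in.txt` and `postprocessed.txt` are all hypothetical names chosen for this example. It assumes that each action runs in a fresh working directory, that inputs are staged at whatever logical path the action asks for, and that outputs are kept only by content hash, so two variants can use identical artifact names without a conflict.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


def run_action(store, inputs, outputs, action):
    """Run `action` in a fresh directory and return {logical output path: digest}.

    inputs:  mapping of logical path -> digest of the artifact to stage there
    outputs: logical paths the action is expected to create
    """
    with tempfile.TemporaryDirectory() as workdir:
        root = Path(workdir)
        # Stage every input at the logical path the action asked for.
        for logical, digest in inputs.items():
            target = root / logical
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(store.get(digest))
        action(root)
        # Keep outputs by content only; the physical directory is discarded.
        return {p: store.put((root / p).read_bytes()) for p in outputs}


def greet(city):
    """Return an action that writes a greeting for `city` to out.txt."""
    def action(root):
        (root / "out.txt").write_text(f"Hello {city}\n")
    return action


def post_process(root):
    """Capitalize in.txt into postprocessed.txt."""
    text = (root / "in.txt").read_text()
    (root / "postprocessed.txt").write_text(text.upper())


def build(store, city):
    raw = run_action(store, {}, ["out.txt"], greet(city))
    # The producer called its output out.txt; the consumer stages it as in.txt.
    return run_action(store, {"in.txt": raw["out.txt"]},
                      ["postprocessed.txt"], post_process)


store = ContentStore()
results = {city: build(store, city) for city in ("Munich", "Brussels")}
for city, artifacts in results.items():
    print(city, store.get(artifacts["postprocessed.txt"]).decode().strip())
```

Both variants produce an artifact named `postprocessed.txt`, yet neither overwrites the other, because the name is only meaningful inside one action's working directory while identity is carried by the hash.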
Combining EASY!Appointments with Jitsi for online appointment management |
Hello, I'm Konstantinos Papadimitriou, I'm a web developer and analyst at GFOSS Open Technologies Alliance. I will show you how we combined Easy!Appointments with Jitsi for online appointment management. During Covid, a lot of private and public services switched to online appointments, and most of them used closed source software or services for the online part. At GFOSS, we adapted Easy!Appointments, which is an online software platform for appointment management, and Jitsi, an open source online meeting tool with audio and video, and combined them into a seamless integration for booking an appointment that will take place online rather than physically. About our organization: we were established in 2008, we are a non-profit organization, and we have as shareholders universities, research centers and public benefit organizations. Our objectives are the development and promotion of open standards, open software, open content, open data, open government, open educational resources, Creative Commons licensing of open material and design technologies, and also open hardware and design. We contribute to openness in education, in the public sector and also in the private sector. And we have thematic working groups on open software and open standards, open technologies and education, open governance and open data, open design, open hardware, wireless networks, information system security and personal data protection, and also innovation and entrepreneurship. Well, the Easy!Appointments part: it's a web application for appointment management. It is very easy to install on a web server, and configuration is simple too. Users can use it directly without having to install any applications. It is operated through a web browser and it works on a computer, on a tablet or on mobile phones, since the design is responsive. Operation is really simple and fast, optimized without unnecessary options. It also has an easy-to-use environment for managing the appointments from the service side.
It supports over 25 languages, and since it's open source anyone can contribute more languages. It's written by the Greek developer Alex Tselegidis, who actively continues the development. It's open software, open source, and it already has hundreds of installations around the world. It's written in PHP and requires PHP 7.4 or above. It uses MySQL or MariaDB for the database part, runs on an Nginx or Apache web server, and uses the PHP extensions JSON, mbstring, OpenSSL and PDO. It's developer friendly, with clean source code. And now the Jitsi part. Jitsi is a set of open source projects that allow easy teleconferencing. It's written in Java and JavaScript. It's compatible with WebRTC. Jitsi Videobridge transmits video and audio to all participants separately rather than combining them first. And of course it's open software. The platform now. On the first page, if the text seems Greek to you, actually it is Greek. In the first menu, the user can select the service he wants. After that, he selects a specific agent, or any available agent of the service. And finally, some information appears about the selected service and the duration of the appointment. At the bottom, you can also select the interface language. We added the part where you can also arrange an online appointment. So it may be the same service, but rather than physically, you want to do it online. In the second step, you choose among the available dates and times of the service you chose. If you selected a specific agent, you see his availability, or otherwise that of any available agent. You will see the calendar here. In the third step, the user enters his information: first name, last name and email. At this email address he will receive the details of the appointment, and also the option to change or to cancel the appointment. In the fourth step, the user views all the details gathered for the appointment.
And finally, when he submits the appointment, he sees confirmation messages in the browser. Also, the system generates two emails, one for the user with the details and one for the selected agent of the selected service. The email he receives contains, again, the confirmation and the details, and in case it's an online appointment, the location field will show the unique video link to the Jitsi service. The user will also receive an ICS file as an attachment, which he can import into his calendar. From the administration side, the agent sees all his meetings with their information in an easy management environment. You also have roles in the system, so you can have directors of the service, who will see all the appointments. Some technical details about our prototype. We made minimal changes to the Easy!Appointments source code, so it will also be easy to use. The administrator can set a physical location for a service, so you say the appointment is at this address, or this building, this office. If this is empty, then the changes we made generate a unique Jitsi link using the appointment's hash. Each appointment has a unique hash in the database, so we use this also as the link for the conference. We used the native location field to store the Jitsi link. We added the location field to the email the user receives. And administrators on settings. We also wanted to show some text with information on the home page, which wasn't an option in Easy!Appointments, so we added this too, and we also wanted to change the appearance. So we inserted a CSS file, from which you can change colors and some of the design of the pages. The source code is available on the GitHub you see here. And that was our project. Thank you for coming. Does anyone have any questions? Want further information? Yes. I understand that this is a fork of the official Easy!Appointments. Yes. So how do you manage with the new versions?
Do you provide something like an automatic combination to add some touches to the main project? This is a prototype. We spoke with a developer of easy appointments. He has a system for plugins, so it could be written as a plugin to easy appointments. At the moment, no, you change the actual code. That's why also we wanted to make minimal changes. Thank you, Konstantinos. That was fantastic. Thank you. |
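The location rule described in the talk above (use the physical location if one is set, otherwise derive a unique meeting link from the appointment's hash) can be sketched as follows. This is an illustration only, written in Python rather than the PHP of the actual prototype; `Appointment`, `resolve_location` and the base URL `https://meet.example.org` are hypothetical names and values, not taken from the Easy!Appointments source code.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical Jitsi instance; a real deployment would configure its own URL.
JITSI_BASE_URL = "https://meet.example.org"


@dataclass
class Appointment:
    appointment_hash: str   # unique per appointment, as stored in the database
    location: str = ""      # physical location of the service; empty means online


def resolve_location(appointment: Appointment, base_url: str = JITSI_BASE_URL) -> str:
    """Return the text for the location field of the confirmation email."""
    if appointment.location.strip():
        # A physical location is set, so the appointment takes place on site.
        return appointment.location
    # No physical location: reuse the unique hash as the meeting room name.
    room = quote(appointment.appointment_hash, safe="")
    return f"{base_url.rstrip('/')}/{room}"


online = Appointment(appointment_hash="a1b2c3d4e5")
onsite = Appointment(appointment_hash="f6g7h8i9j0", location="Office 12, City Hall")
print(resolve_location(online))
print(resolve_location(onsite))
```

Reusing the existing hash means no extra database column is needed, which fits the stated goal of keeping changes to the upstream code minimal.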
Breaking away from Big Tech
Using open source infrastructure in a convenient way |
Everyone, we have Boris and Redon here, and they will be giving a talk on breaking away from Big Tech using open source infrastructure in a convenient way. Thank you. Thank you. So, as mentioned, today we'll be talking about how to break away from Big Tech, and we will be focusing mostly on small and medium teams, but what we're talking about applies more broadly as well. So a little bit about us. My name is Boris. And I'm Redon. Yep. We've been open source activists for a couple of years, involved in different projects. And yeah. Okay. So to start off, let's talk about Big Tech. So any fans of Big Tech here? Show of hands. Okay. No. This was not expected. So, okay, there are many issues with Big Tech. We're going to go over some of them really fast, and we will probably not annoy you with things that you already know or hate. But one thing that stands out is the amount of money these people have. In the beginning, some decades ago, it was millions, then billions, and now it's trillions. So the market capitalization or wealth accumulation is probably more than the entire GDP of France, if you add all of this together. And that's just in the first quarter of 2021? 2021, one year ago. So probably more money has been printed since, and none of it is coming anywhere close to here, right? And money is not the issue here, but money also brings power, which is a major problem. In the beginning, in the 70s, as you know, the big oil companies were the ones who had all this market capital and growth. Of course, also Disney, with its nice copyright lobbying issues, and Warner Brothers, just like Disney but less efficient in terms of copyright things, as you know. But again, we're talking about trillions.
And one thing that concerned us a lot is that we've seen many open source organizations, or companies that are in open source, that don't do the one thing that we call dogfooding. They don't use other open source tools for their own infrastructure. And they do this out of convenience, you know, because when services like Gmail started, they were giving away the service for free, free but not as in freedom, which brought a lot of people onto these platforms. Gmail is only one of them, right? And this was the main problem: because of all the market capital that they had, they could afford to give all these services away for free. And of course, they also killed a lot of innovative stuff. I'm not saying that Instagram is innovative, but as you know, they bought it, and it is one of these good examples. They're buying everything that is around and which they consider a threat, which might become a very innovative product at some point. And one of the scholars, Tim Wu, mentioned that this kind of formula, purchasing other companies that can be considered competitors, is anti-innovative, right? And it creates oligopoly kill zones. Okay, you know all of this. But the question is, why do a lot of these organizations, small teams or medium teams, continue to use these kinds of big tech platforms? And even more specifically, how come that in the open source community we are so reliant upon big tech infrastructure? And there are a couple of different reasons for that. For sure, one of them is the fact that they are free as in freemium and not free as in freedom. And this makes it very convenient to sign up with them. Also the fact that they integrate their services with one another very tightly. At first, it makes us believe that it makes us more efficient in what we want to do. But in reality, this just leads to vendor lock-in, where after a few years it will be very hard for you to migrate to something else.
So in our daily job, we work on providing digital infrastructure using open source tools. And we've seen this. These organizations are, again, in the open source, open science or open knowledge world. They started by using the freemium model of these platforms. And because they couldn't move away easily, now they are locked in, right? And that's why they keep using these platforms. Okay. Yeah, so the solution, we think, and we're going to talk about it in two parts. The first part is the ideological side. And the second part is the more hands-on and practical one. So there are proposals for this. Because we don't have much time, we are just mentioning one of them. Usually we say big tech, but why should everybody go big, right? So we should use small tech, small companies which use open source tools. There is a small tech foundation, founded some years ago. And they say that small tech should be easy to use, private by default, peer-to-peer, zero knowledge, non-colonial, personal, share-alike, interoperable, non-commercial, inclusive, a lot of stuff. Again, I'm pretty sure there are other approaches, but this is only one of them that we want to mention. So part two of the solution is the hands-on approach. Okay, this is good, we want to move away from big tech, but what to do, how to do it technically? Right. And the good, but also maybe a little bad thing is that you have a lot of choices as to how you do this migration. So we listed the five most important ones that we've seen. And we want to go over each and every one of them to talk a bit more about them. So the first option is what we believe would be an ideal world. And as a sysadmin, I definitely want this to be real: it would be one where everybody is able to host their own stuff on their computers.
And this means that everybody does their own deployments, their own maintenance, their own security optimizations, and so on. This can work quite good, especially for personal projects. But when we're speaking about infrastructure for organizations, it can be a bit harder to maintain as compared to your own personal data, case in point, an instance shall not be down for an organization in the same way it can be down for you. So with more users, there's more complexity as well. And also it depends on who is using it, like do it yourself. For example, I don't have the time to do it for my parents, for example, right? But I can do it for myself. And the idea is sometimes it's a very good scenario, sometimes it doesn't work. For the punk movement in the late 70s, 80s, it worked doing a DIY. We need to see if it also going to work for us as well. But when we talk about DIY, you need to be a bit careful about what we call the Dropbox problem. And what the Dropbox problem is, is that when Dropbox was initially launched, one of the first comments on Hacker News was... This is quite famous, by the way. Or for some people. One of the first comments was saying, I don't understand why Dropbox needs to exist, because you can host your own FTP server and use this and that library to do it. And yeah, from a technical perspective, you can. But I think time has shown that most users will not want to host their own FTP server. Is the person who commented this in here? So if not, please don't have this approach, don't have this approach. Because this is one of the reasons we have so much big tech right now. Like we say, yes, we can do it. And if you do all this and you have your own server at your own place and you do this and you update and you have SSL certificates and FTP and all that stuff, that works. But for the wider audience, that's a major issue. So we need to have a different approach when we propose this. 
So the second approach that one can take is what we call no-code platforms. These are platforms like joingardens.com or YunoHost. And essentially what they do is lower the barrier to start self-hosting, because they automate a lot of the processes. And this is a really great way to not only set up, but also maintain your infrastructure. However, depending on your specific needs, if you want some custom features, it might be a bit trickier to get them to work exactly the way you want. But as long as you don't want something very custom, they're a great way to get started. And it saves you a lot of time compared to the first solution, which is self-hosting, because you get a lot of the tools that you need in order to automate stuff, right? And the third option is having an internal team. So again, if you are an organization promoting open knowledge, you might either, again, do it yourself and acquire all the know-how, or you might have one system administrator or a team inside. This is good, because you can deploy things as you want and customize them. Usually there are costs for the hardware, or for the cloud if you go there, and also for the team that you're going to have. If you have the budget, that's good; if you don't, that's a tricky one. You should then go either to option one or two, or to the other two, option four or five. The fourth option is hosting collectives. If you know CHATONS, it is a lot of small hosting collectives from France. They are mainly in France. And they are mainly focused on the collective side of it, which is great, because it's also a very good approach to providing solidarity. Some of them are not-for-profit, some of them are collectives, some of them are small companies. But the idea is to provide a good step, a very easy step, for other collectives or other small companies to have open source infrastructure of their own.
And some of them that we know also provide some training, to ease the migration from all these evil platforms to the new platforms that they're using. A lot of us who have the technical know-how tend to underestimate how hard it is to change one's routine from one platform to another. And these collectives are great at doing that. Solutions number four and five are kind of very close together. Solution number five is that recently there has been an increasing number of providers that focus on open source infrastructure. For example, there's a GitHub repository on Nextcloud called Providers. These providers are not officially endorsed by Nextcloud, but they take care of setting up Nextcloud and maintaining it for you. And this is great because, for example, the Nextcloud ecosystem also has official partners of Nextcloud. So if you are a big organization, you just go there. If you are a small one, you just go to the list and do your own research. Or you can do it yourself, as we mentioned before. Having all these options laid out in a clear way makes it easier for people to just migrate, I don't know, for my parents to start using Nextcloud for their photos on their Android phone, which, by the way, I should update at some point, because they asked me to help them. But it's important to keep in mind that it's not necessarily a one-size-fits-all. And even if you find something that fits your needs, your needs might change in the future. So it's important to think of these solutions as different alternatives and different steps that you can take on the journey to having open source infrastructure. So yeah, about managed service providers, as we mentioned, we think it is very important to focus on platforms that are open source, and therefore open platforms. And with such service providers, usually you don't need to have technical knowledge to get things up and running.
You do need some knowledge, though, both legal and about the provider that you are choosing: where your data is, whether it's compliant, and all these things. So you need some basic knowledge to understand what the other side is doing with all the infrastructure that they are managing for you, right? And these providers should also provide you not just with technical support when something is down, but, we believe, also with user support. Because if you have an infrastructure but nobody uses it, what's the point? And managed service providers should not be the end goal; the end goal is for everybody to self-host, right? But until then, we should read and be able to understand the terms of service, because somebody else is doing the maintenance, and they can do something. They might have higher expectations, but there are also mistakes that people usually make in these cases, right? So in this case terms of service are very important, and not only in this case, always. And also understand the legal coverage, as I mentioned before: where the servers are, where they are deployed, et cetera, et cetera. And of course, very important, service continuity. There are many service providers offering open source platforms as a managed service that, you know, popped up, especially some years ago. They pop up, they are there, and after like two years they say, oh, it didn't work out for us, the pricing that we calculated was bad, and they shut down. So you also need to review whether your service provider seems to have a business continuity and sustainability plan. So yeah. So one example is Mastodon, right? There's a lot of talk about hosting Mastodon these days, et cetera. So you need to know where to deploy it. And again, you can do it yourself, but you can also choose someone else to do it for you. You need to have technical know-how with all the platforms. Platform-specific know-how. What does federation mean, for example?
Or what's the toot? Legal implications and who does the moderation. All these things are very, very, very important for you to know. And five more seconds. So that's why, for example, for us, it took us months to understand and read a lot of legal paperwork and also research the platform before deciding to offer it to other people, which we are planning to do this week. And this is something that we are announcing just here today. And that's, yeah. Yeah. Something to keep in mind is that regardless of the option that you choose. You can go very quick. Yeah. Yeah. Very quickly. Be careful to not be vendor locked in, because that's a very important aspect. And if you want the sticker, we have it around with us so you can get it later. So thank you. Thank you so much. Thank you very much. |
Grottocenter
An open source database for cavers |
Hello, good morning everyone. I'm Christopher Peters and I'm going to present Grottocenter to you. So first a bit about me. I'm from Belgium. I'm mostly working as a trainer and consultant in open source and DevOps. And in my free time I'm spending quite a lot of time doing outdoor adventure stuff; for example, I go into caves. By the way, for the people who like to enjoy the pictures on my slides, this is the only picture that I didn't take. It's a bit difficult to take it when you're posing for it. So first a bit about caving or speleology. What does it actually mean? It's mostly a scientific study where we are going into caves or outdoors and explore karst phenomena, or caves in particular. If we are going into caves to study what is happening over there, there are different sciences involved. There is geology, there is biology, and there is also a lot of physics and chemistry involved. But in general, to get to the place that we want to study, we need to pass certain obstacles. We have to crawl through some small passages. We have to descend into pits, and for that we need a certain physical ability to get there. So there are some people who do this for science, to actually research things that are happening down there. There are other people who just like the physical activity, and so it's also a bit of a sport. For the people who are a bit familiar with other terminology in English, it's also known as caving, potholing or spelunking. So there are different words for it, but in general the idea is that we go into caves. Now if we go into caves, of course there are a few things that we have to keep in mind, and what I would like to present to you today is a problem that we often have to deal with. On this picture you can see a person who is descending into a pit on a rope, and this is something you don't do just like, okay, it's 10 o'clock, let's go and see what's in this hole. We actually do some preparations.
For that we consult a lot of information, to make sure that we have the right equipment with us, that we are using the right tools, and that we are not going at the wrong moment, because sometimes weather conditions can have a very bad influence on the conditions in a cave. Here you see an example of a survey. We have quite a lot of surveys that provide us with information on what the cave looks like, if it has already been explored. And we can use these surveys to make sure that we have all the rigging material with us, and that we know where we are going, what we are going to do and what kind of dangers could be involved in this trip. For this we have the surveys. They provide us with this information. Sometimes there is also a description added to the survey, which provides us with more information on, for example, how to get to the entrance, maybe some particularities, special things we would like to see or special things we would like to avoid. And sometimes we also have separate rigging information. That's not always the case, but depending on how well a cave has been researched and explored, these documents are available. On this survey, over here, you can see that there is an indication which tells us that this is a pit of 20 meters, and so sometimes this information is only available on the survey. Apart from that, we also have other information which is not really necessary to facilitate the trip itself, but which can give us some more background. For example, scientific research that has been performed, or some information about the region, which could be geological or biological. That could also be, for example, information about how rivers are flowing above and underground.
There is also other kind of research that can have been performed that isn't necessarily limited to this specific cave we are going to visit but that could be more broad and give more information about also the region or maybe a particular kind of cave. And other documentation like for example when they made this survey they also took measurements. This survey has been made probably with compass and just measuring tape and thus they might have taken notes of that. They might have also saved them somewhere. These days we are also using more modern technologies such as radar and 3D scanners and also those data points that can be saved and they can also be archived. All this kind of information is available at a certain location. Most of the time it is the person who has generated this information who owns this information. But if we want to share this we are very much scattered around the world and there is not such a thing as a central database. That is the actual problem we have to deal with. So if you look at how we are going to share this information these days a lot of this happens by email or by websites or blogs. Sometimes also some file transfer services but in general all the information is at one specific organization, one specific person or group of persons and you have to know where you have to go to if you want to actually access the information. There are some countries who do have their central databases but this is not something which is also standardized or generalized. If we then have the information about the cave like for example the survey how also are we going to link this with scientific research that has been performed around this cave. That is also one of the problems we have to deal with. If we are contacting someone because we want to visit the cave very often we get a survey or we get a description or we get a combination of both but we don't get all the information that is available. 
Also because very often this information is not centralized in one group of people. Some people have one part of the information, other people have another part. And then, when I contact someone to obtain this information, and I have been using it for a while, and afterwards an update has been made to this information, how am I going to make sure that I also get this update? Because in this world too, science is still moving on. We are still exploring new places underground. We are still finding new places, new caves, and so all this information needs to be shared as well, because the survey that I was showing before is a very old one. There are already several newer versions of it available right now, and so if I get this survey one day and a newer version is made the next day, how am I going to make sure that I also get it if the person who sent me the first one forgets to send me updates? For that there is a solution, and this solution is GrottoCenter. So what is GrottoCenter? It is a sort of wiki database which is made for cavers and also mostly made by cavers. A few people in France started developing a database which they use to store all the information, and it is made publicly available so that everyone can use it. It is supported by the European Caving Federation and it is also supported by the International Caving Federation, who also officially host some parts of it. The idea is to have a global source of all kinds of information about caves and karst phenomena, and to have it with open access so that everyone can access this information, also with the idea of using standards so that all the information is saved in the same way and it is very easy for everyone to add information and to query it. And because we are using standards, we are also making it very easy to make it accessible to machines as well, so that you can query it using APIs.
How are we then going to use these kinds of standards to make sure that everyone can access this information? For this we made a standard which is called KarstLink and which enables us to link all the different documents to caves. For this we did not really try to reinvent the wheel; we just extended tools from the W3C, linked data and the semantic web, and this enables us to store the information in such a way that it is very easy to access, also in machine-readable ways such as APIs. And then you have GrottoCenter itself. Here you see the front page of the website, where you have a side menu bar which you can easily use to browse the information. You can see there is a quick search and an advanced search; I am going to skip those for a moment and go to the map. Here you see a screenshot of a part of Belgium, where you can see all the hexagons which indicate where there are caves or clusters of caves. So these are just caves bundled together. If you zoom in further, you can see over here that there are also indicators which point to where you can find certain caves, and over there you have the pop-up which appears when you click on one of those points. It provides us with some information, for example which cave is found at that location, and also a link to the data sheet of this cave. So this is the data sheet of that exact cave. You can see where it is on the map, you can get some basic information such as the exact GPS coordinates, some information about how deep and how long the cave is and when it was discovered, and then if you browse down you can also find some information about, for example, the location. If you go down a bit more, in the description you can also find documents such as the cave survey.
Not all this information is very extensive, of course; it depends very much on the people who add it. This is one that has been created automatically using an import which was done during the migration of GrottoCenter from version 2 to version 3, but most of this information is provided by users. It's just like Wikipedia: if I don't write a Wikipedia page about GrottoCenter, then there is no Wikipedia page available about GrottoCenter. So this is information that can be shared by everyone. At the top there is also a login which enables you to add your information yourself. How are we making this possible? We are using different technologies. The complete infrastructure is running on AWS and Microsoft Azure, and we are using Docker to host our application. For the rest, we are using quite a lot of JavaScript, so there is Node involved, there is Sails.js and React. We are also using GitHub to host our code, so you can browse the code freely, because this is of course an open source project. The GitHub project is freely available, and you can check out the code over there. You can also get in touch using our Slack channel. There is a wiki which explains how you can use this tool if you want to add information, and then a few other links. For example, here you have the Wikicaves foundation. The Wikicaves foundation is the foundation which is responsible for the project, so GrottoCenter is officially organized by Wikicaves. They also make sure that all the funding and all the partners are taken care of. Then there is a link to what is called the documentation center of the International Caving Federation, where you can find all the scientific documents. They have been imported into GrottoCenter, so these days, if you are browsing information about specific caves, you can also find all the scientific bulletins linked to those caves.
And the last link is one to the KarstLink project page, where you can find more information as well as the current standard. So we are building this mostly with people who do speleology and who happen to also be IT people, but we are also very grateful that from time to time we get some interns from different schools who want to build specific parts of our project to make it better and to add more functionality. For that we can rely on interns from the Polytech of Montpellier as well as the Epitech of Montpellier and the University of Grenoble. So many thanks to them as well. And now some of our partners. I didn't list all of them, because then my slide would be very full, but there are quite a lot of caving clubs, caving associations and federations who are sponsoring the project and who make it possible for us to build this and to host it on an infrastructure that everyone can access. With this I would like to conclude my presentation. If there are any questions, I am very eager to answer them, and if you have any questions later you can also reach out to us on our Slack, GitHub or project pages. Thank you very much. |
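To make the linking idea from the talk more concrete, here is a minimal sketch, not the project's actual data model. It stores subject/predicate/object triples in the style of linked data, looks up every document attached to one cave, and drops surveys that a newer version replaces. All identifiers and predicate names (`cave:example-cave`, `ex:describes`, `ex:replaces`, and so on) are hypothetical and chosen for illustration; only `rdf:type` is a standard RDF term. The real vocabulary is the one published on the standard's project page.

```python
from collections import defaultdict

# Hypothetical identifiers and predicate names, for illustration only.
# The real vocabulary is defined by the standard itself.
TRIPLES = [
    ("cave:example-cave", "rdf:type", "ex:Cave"),
    ("doc:survey-v1", "rdf:type", "ex:Survey"),
    ("doc:survey-v1", "ex:describes", "cave:example-cave"),
    ("doc:survey-v2", "rdf:type", "ex:Survey"),
    ("doc:survey-v2", "ex:describes", "cave:example-cave"),
    ("doc:survey-v2", "ex:replaces", "doc:survey-v1"),
    ("doc:bulletin-7", "rdf:type", "ex:ScientificBulletin"),
    ("doc:bulletin-7", "ex:describes", "cave:example-cave"),
]


def documents_for(cave, triples):
    """Return every document linked to `cave`, grouped by document type."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    grouped = defaultdict(list)
    for s, p, o in triples:
        if p == "ex:describes" and o == cave:
            grouped[types.get(s, "unknown")].append(s)
    return dict(grouped)


def current_versions(docs, triples):
    """Keep only documents that no other document declares it replaces."""
    superseded = {o for s, p, o in triples if p == "ex:replaces"}
    return [d for d in docs if d not in superseded]


if __name__ == "__main__":
    docs = documents_for("cave:example-cave", TRIPLES)
    print(docs)
    print(current_versions(docs["ex:Survey"], TRIPLES))
```

Because every document points at the cave it describes, a client that asks for one cave gets the surveys and the scientific bulletins together, and the "replaces" link is one possible way to address the update problem raised earlier in the talk.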
Consulting for digital humanists
the cultural shock developing tools and pedagogy |
So, my name is Marie Dubromé, and my talk is Consulting for Digital Humanists: the cultural shock, developing tools and pedagogy. This talk is given on behalf of the Center for Digital Humanities in Uppsala, which is where I work. Here you have my contact details. You can contact me via email, Mastodon, as well as my website. I'm an engineer in digital humanities. We'll come back to that later. What you need to know is that I'm located in two countries, in the city of Lille in France and in Uppsala, where my work and university are. And in my spare time, I'm an activist in Uppsala Women's Coding, as well as sometimes in Ulug and Raoul in Lille and Uppsala. So, I'm not the first digital humanist you are encountering. I did a little bit of a survey, and there are at least four people I could identify as digital humanists in the past, by the names of Jan Jonsson, Vanilly, Guido, Oly de Ober, Antoine Fosche. Thank you to them for opening the path, for being pioneers and for indirectly making me feel welcome in coming to your community. So, without further introduction, what is digital humanities? It's a pretty new field, and it's a blurry term. I'm not going to give a talk of three days and three nights without sleeping about how to define it. What you need to know is that it's the field at the intersection between computer science and the humanities, all sorts of humanities. And in this context, an engineer in digital humanities is someone technical who has expertise in one or several domains of computer science and whose mission is to help humanists in their projects. By humanists we can mean a philosopher, a literature analyst, a historian, a sociologist, an archaeologist, etc. To give you a bit of context, I thought examples would say more than a long discourse. So, if you're an engineer in digital humanities, you could be helping a theologian map all the colors of the Bible to show the color ambience of the New Testament.
You could also be building an online database to help an art historian who wants to share their encyclopedic knowledge about some engravings in books, etc. You do not need to read this complicated table, but I will draw your attention to a couple of things. First, observe the subjects. The subjects are pretty academic, things that computer scientists will normally not be very familiar with, like literature, Scandinavian languages, linguistics; here the ALM means the archives, etc. We have an audience of very academic people, people who have had no contact with the corporate world, who have been in academia and stayed in academia afterwards. Look at this: we have a pretty gender-balanced group, actually slightly female-dominated. I counted 16 women here in this room, including me. So, that's an interesting contrast. Finally, the titles. They probably don't speak to you too much because these are pretty specialized topics. Could we even say niche topics? I had fun making a table of the differences between the engineering world and the humanistic world. By the way, I have a background in classics before going to natural language processing, so I'm accustomed to this switch of culture. In addition to the remarks I have made, I will add that among engineers and computer scientists, programming is the norm, and as an operating system you will often see Linux or Mac, or at least when I was in the Department of Natural Language Processing, that was what I was seeing. In an academic context, you would exchange terminal apps, et cetera. Whereas if you land in a Department of Humanities, you will have point and click as the norm and often proprietary tools as the norm. Windows, Mac, PowerPoint, Word. I have been asked at conferences to provide PowerPoint PPTX files and Word documents, and no one gets offended by that.
So, industry influence versus academic influence, and I think finally the most important point is that in computer science we are very metrics oriented, optimizing maybe speed or other quantified performance, whereas the humanists are very question oriented. So between those two different audiences that we need to bring together in digital humanities, how do we help each other? And to be able to end my talk on something positive, I will start with the obstacles, or negative points, of attempting this marriage between the two fields. I think if you want to have computer science and free software in the digital humanities, you need to overcome a lack of visibility of the free software world among the humanists, and I think vice versa. And I thought of taking a short example from the technical tools we are working with. In digital humanities, I'm specialized in web scraping and NLP. We as engineers tend to use open source tools like Python, Beautiful Soup and spaCy, whereas people who are less computer savvy in digital humanities will use more proprietary software such as webscraper.io or AntConc, which have the advantage of being point-and-click tools, simpler and more straightforward, with fewer functionalities as well. And yes, another remark is that I noticed recently, while giving courses to digital humanists, that people are asking me if I can teach them GitHub. And I say, no, I can teach you Git. So it's very revealing that they know GitHub before understanding the notion of Git; they don't even know the difference or what it means. I think this stresses my point about the lack of visibility of the open source world, or maybe the over-visibility of the freeware world. The second obstacle to bringing free software to digital humanists is maybe a lack of cohesion.
I mean, digital humanities is a nascent topic where we have a lot of groups already spread out, with different languages: digital historians, digital literature analysts, et cetera. And so they already have a hard time gathering together. In this context, it's hard as well to build a community of open source enthusiasts. Yes, so this community is not yet built in this context. The next obstacle I will point out is, I would say, the lack of common references, though not of common values. So it's hard to find a language to understand each other. I will come back to that in the positive points. And yeah, that's it for the obstacles I can see. Then, on the positive points, on what could make this marriage a success: first, I think humanists will be a very receptive audience to the philosophy of free software. I mean, Enlightenment philosophy existed before, and it was studied in the humanities before Richard Stallman was born. And so they have those references to philosophy and to the value of spreading knowledge, of sharing and so on. They have a lot of activism in them. I mean, half of my department is vegan, which reveals something about them. They are a nascent but varied community, and I think that's a positive point. It's a crowd that is diverse, which is very unusual for our dev crowd. Something not negligible: they have financial means. The humanities in general do not have much in the way of financial means, not as much as science, but digital humanists are attracting grants and subsidies for projects, and that's an indirect way to finance open source and free software. I'm an engineer developing tools for digital humanists, and my salary is paid by the government because digital humanities is such a buzzword. Finally, one very interesting thing about this crowd is that they know about domains that engineers and computer scientists may have an interest in, but do not have formal training in.
A lot of them have training, or easy access to training, in communication and education science and ethics, and they are trained to think critically all the time. I will end my presentation not with certainties, but with doubts and questions. The questions we could have are about ideas to better reach this community and how to make them feel welcome. That was it for my talk. I just kept, for illustration purposes, some material I happen to have prepared in case I get questions. The references are here, all in the slides. Thank you. Are you aware of the research software engineering movement that's been going on for about 10 years now? There's quite an active group in the Nordic countries and in the UK that has a lot of issues in common with what you've just been talking about. That would be a very good community for you to get involved with. I don't know. Is it in any way linked with the dev room of yesterday about research tools? I don't know. I wasn't in that dev room. Thank you for your remark. Please come to me afterwards and I will write that down. We have one minute left. Yes. Thank you. How would you suggest approaching someone in the humanities sector who maybe needs a hand with a project but doesn't really know what IT can do for it? Do they really know what we do? I have a set of questions I use when I do consulting. I can read them to you. First, context. What is your domain? What does your domain focus on? Because they can be very varied. What is the problem you are trying to solve? Then, what did you already try in order to solve your problem? What tools did you use? That will give you an idea, because they will be very blurry about what they want, and it will help with anchoring. What data do you have and hope to collect? They rarely think about it. And finally, and most important, can you show me your data? Screenshots. Open your file; they probably have an Excel file or something, whatever. Size.
Format. How has it been collected? These are things you think about as a computer scientist: quantification. They don't think like that. They just think in questions. So I think that could help. Thank you. You're welcome. Thanks, everyone. A big round of applause. |
A GitLab forge for all teachers and students in France?
A project of the French Ministry of Education |
Bonjour. It's a great honor to be present among you. I'm sorry, I speak English like an average French person. And obviously I asked ChatGPT and Google Translate to write my speech for me, so if it doesn't work for you, it's the algorithm's fault, obviously. My name is Alexis Kaufman and I'm going to present to you a forge project for all teachers and students in France. So, a forge project at FOSDEM, so what? Nothing less original. But there are three points which, in my opinion, distinguish this forge from other classical forges. The users are teachers and K-12 students, so up to 18 years old, for school, not for university. The shared content is educational, pedagogical content, and it is more text in a markup language than code. And last but not least, the project is supported by the French state. A few words about me. Basically, I've been a high school math teacher for 25 years, but also a kind of free software activist, having founded Framasoft 20 years ago, a non-profit organization mainly focused on promoting free software. And for years, we at Framasoft hoped for a real free software policy in French education, and elsewhere and worldwide, obviously. But that did not really come. I remember that 10 years ago, as president of Framasoft, I wrote an article titled My Ministry Drives Me to Despair, where I said this: either my ministry doesn't know that free software exists, and then it is irresponsible and incompetent, or it knows it and does nothing, and then it is guilty. Having made such remarks publicly, you can well imagine that I would never have thought of finding myself one day in the ministry. But today times have changed. I was invited to join the ministry a year ago and I accepted the challenge. This is the first time that there has been such a free software and open educational resources project manager in the institution, and quite high in the organization chart. And I think it's a good sign.
Another good sign: we don't just use free software, we also contribute. For example, we fund development for the open source video conferencing tool BigBlueButton, and it's unusual to read this in a project repository. And we contribute to other free software projects like Nextcloud, PeerTube, or Moodle. Another positive point: this project responds to two needs expressed by the teachers. The first is a classic one. We are nearly a million teachers in France, and among them there are those who develop free software applications for education, some of which have great potential. But all these projects are scattered, so it is difficult to spot them and then difficult to support them. And for ethical and GDPR reasons, we also cannot tell our teachers to go to Microsoft GitHub. So let's bring them together in a sovereign and secure space. And as we now have computer science courses from high school in France, this also concerns students. That was the first need. The second need corresponds to new uses. Here we share a lot more formatted text, Markdown or LaTeX, than code, to create educational content together. And here we create content in the Markdown language, which will then be pushed into web pages, printable PDFs, or Jupyter notebooks. By using the Pages function of the forge, you can work collaboratively on the forge to produce web pages and websites with quality educational content, as in this project with computer science teachers. And you can do this without leaving the web environment, without knowing git push, et cetera. Another example with CI, continuous integration: we can also push Markdown text into a pretty mind map, as a philosophy teacher suggests here, just to emphasize that it's not just science teachers who work this way. So, two needs expressed by teachers for a forge of code and text. So let's do it. And the forge already exists as a POC, a proof of concept, since last September, with 200 projects and 300 accounts.
It's a GitLab instance powered by a French PaaS, platform as a service, Clever Cloud, who have a stand here at FOSDEM. And it is administered by a teachers' association, the Association of Computer Science Teachers of France, the AEIF. The big news, from just 10 days ago, is that the project has been validated by the ministry, which has integrated it into its big strategy, the digital strategy for education 2023-2027. So the prototype forge will join the IT services of the ministry. I cannot give you more details because the project is just beginning, so there is no roadmap, et cetera. But we have at least five years to move forward, and to convince more and more teachers and students of its interest. This is the big challenge: to succeed in enlarging the circle of users. This will require support, training, community animation, and promotion of the projects submitted. Here you can see a quick and dirty SWOT analysis. The most important thing for me is that the project is supported by the institution and that it has values close to an event like FOSDEM. So I have just one minute. Okay. So I won't detail the strengths, opportunities, or weaknesses. So thank you. I came to FOSDEM to present this project, but also, despite my terrible level of English, to discuss and see if other countries, other ministries of education in Europe, for instance, are engaged in similar projects, in order to pool and share our experiences. Okay. Don't hesitate to talk about it around you if you think the project is interesting. And thank you very much for your attention. Thanks for your attention. Is there anyone who has a question, in French, in English, or also in ChatGPT language? No? Then, since we have time, I can read with you a first list of positive points and some risks. So obviously the main thing is that it's an institutional project that carries values: openness, sharing, ethics, sovereignty. Sovereignty is important here, and sustainability.
It encourages teachers and students to work together and so increases digital and computer skills, I think. And by gathering our forces, by joining forces in the same place, it's easier to build a community and obtain means. It's easier to ask for money. And in doing so, I hope we will have a large bank of open educational content under free licenses, available to the entire school community. There is also a point about what in France we call sobriété numérique, digital sobriety. I think it's not a good translation, but because we use light and open formats, we can work offline. It improves what exists rather than starting from scratch, et cetera. And an important point of the project for us is the commons: its governance and policy are shared with the users. That's very important, because we have done a lot of top-down projects in the French Ministry of Education. This one wants to be more horizontal. It is not very expensive at the beginning, but if there are more and more users with CI/CD, that can change. And the threats: maybe it's too complex a tool for teachers and students if you want more and more users. So there is a great need for documentation, training and community animation. And we need to convince great teachers who are already on other forges to migrate to our forge. That's a challenge, because great teachers are already on GitHub, for instance. A couple of other things. It is not clear whether the content belongs to the agents, to the teacher or to the state; there is a thing to... And there is the evolution of the GitLab offer and the competition with Microsoft GitHub, with Copilot, et cetera, with more and more new functions. Voilà. Thank you. |
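The Markdown-to-mind-map idea mentioned in the talk can be illustrated with a small sketch. This is not the tool used on the ministry's forge, which the talk does not name; it is only an assumed, simplified version of the first step such a pipeline needs, namely turning the heading structure of a Markdown lesson into a tree that a mind-map renderer could draw. The function names `markdown_to_tree` and `render` are hypothetical.

```python
import re


def markdown_to_tree(text):
    """Turn Markdown '#' headings into a nested {title, children} tree.

    Simplification: lines inside fenced code blocks are not skipped,
    so a '# comment' in a code sample would be read as a heading.
    """
    root = {"title": "root", "children": []}
    stack = [(0, root)]  # (heading level, node); level 0 is the root
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if not match:
            continue  # body text is ignored in this sketch
        level = len(match.group(1))
        node = {"title": match.group(2), "children": []}
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root


def render(node, depth=0):
    """Render the tree as an indented outline, one branch per line."""
    lines = []
    for child in node["children"]:
        lines.append("  " * depth + "- " + child["title"])
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    lesson = "# Philosophy\n\nIntro text.\n\n## Ethics\n### Kant\n## Logic\n"
    print("\n".join(render(markdown_to_tree(lesson))))
```

In a continuous integration job, a script of this kind would run on every push, so a teacher only edits the Markdown file in the web interface and the derived outline or mind map is regenerated automatically.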
FOSSbot: An open source and open design educational robot |
This talk is going to be presented by two amazing researchers, Iroclis and Christos. Good morning. Thank you for the introduction. Thank you for being here and attending our presentation. So I'm Iroclis Varlamis and here is Christos Cronis. We come from Harokopio University, but here we are also representing the Open Technology Alliance, the Greek Open Technology Alliance for open source software. Today we will present FOSSbot, which is a project that started a couple of years ago in the context of Google Summer of Code as an open design, open source software robot, and gradually it evolved into what you see here, which is a more concrete robot, a more concrete design that is specifically targeted at educators at all educational levels. So together with me I have Christos, and I will give the floor to Christos Cronis, who is the lead developer of the FOSSbot team and also the designer of this robot. So Christos, the floor is yours. Thank you. Hello, I'm Christos. I would like first to introduce the team behind FOSSbot. Let's start with Professor Iroclis Varlamis, who is the supervisor and coordinator of the project; me, as the designer of the robot, including the hardware and the software; Eletherian Vanai, who are front-end and back-end developers and past Google Summer of Code contributors to the project; Dimitris and Yoros, who are back-end developers and assemblers of the physical robot; and finally Manusos, who is a back-end developer and also the developer of the FOSSbot simulator. Now, this is the FOSSbot; we also have a physical version of the robot on the table. Let's start with an overview of the hardware and design aspects of the FOSSbot. On the front side of the robot, we have a multicolor RGB LED on the top left, a photoresistor on the top right and an ultrasonic sensor in the middle. The entire body of the robot is 3D printable on any FDM printer with a bed size of 220 by 220 millimeters. The design is customizable and easy to assemble.
At the bottom of the robot, there are three IR optical sensors suitable for line following or detection applications. The robot moves using two DC motors and a free roller on the back side. The motors have odometers to measure the movement of the robot, and there is also a three-axis accelerometer and gyroscope sensor inside. At the back of the robot, there is a speaker, an on/off switch and a charging port, and the robot is powered by three lithium-ion rechargeable batteries. On the top of the robot, there are some unique features, such as a pulling component at the back of the robot. That component gives the opportunity to attach a small object, let's say using a rope, in order to measure how the extra weight affects the movement of the robot. We also have a large detachable cover; it's the white circular piece on the top. That component is Lego brick compatible and also has a hole in it. This hole leads to the bottom of the robot, allowing a marker to be attached for programmatic drawing. Now let's summarize the key features from the hardware and design aspects. The robot is 3D printable, offers repeatability and customization, and is designed to be used with common electronics. In the current version, the brain of the robot is based on the Raspberry Pi, but we have already seen some variations of the robot with an Arduino or micro:bit inside. It's open source and open design, of course, and it comes at a low cost, around 90 to 120 euros. In the slide you can see our lab at Harokopio University and some pictures from, let's say, the assembly line of the physical robot. Next we have some pictures from different assembly phases of the robot. Now we'll speak about the software. We have created a custom platform built into the robot with three modes: the kindergarten mode, the elementary school mode and the high school or advanced mode. The kindergarten mode features a friendly UI with card blocks based on Google's Blockly.
Additionally, it's expandable, with the option to add new cards that execute Python or Blockly scripts. For the elementary school, we have a custom user interface, once again based on Google's Blockly, in a more complicated version that uses custom code blocks for all the sensors of the robot. The experience is similar to Scratch, and that makes the robot easy to interact with. Finally, we have the high school or advanced mode. This mode is based on Jupyter notebooks and the native Python language. This mode is under construction but will be available soon. Now let's take a look at the platform UI. This is the platform's home screen. That home screen allows the creation of multiple projects. We also have the ability to import, export or share a project between users. Also, using the little cog in the top right, we can modify the behavior of some blocks, such as the default distance for, let's say, a one-step movement. The icon with the three ABC cubes in the lower right offers access to the kindergarten mode. Now we see the kindergarten mode. The kindergarten mode utilizes a simplified version of Blockly using card-based blocks for basic actions. In the bottom right corner, we can see an example of how this mode can be used in a classroom setting. In this example, we are trying to teach students the numbers using a gridded carpet with numbers on it. In the elementary school mode, we have, as already said, a fully custom version of Blockly. On the left side, we have different categories of blocks, including mathematics, programming, movement and sensing. On the right, there are some control buttons and a terminal window for printing real-time measurements and messages. Now, for the advanced mode. The advanced mode of the robot is based on Jupyter. The user can directly program the robot using the native Python language, combined with our custom robot library, and the code can be combined with text and images.
Then the whole page can be exported, including the result of the program execution, as an experiment report for the class. Now for the action. This is the line-following program, written using Blockly. It's a very common task for students when we teach them robotics. We have some videos to show you. On the left you see a video of the robot following a line and stopping when it detects an obstacle in front of it. On the right we have a video of the robot running the same code, but this time following the line in a loop. When a colleague puts her hand in front of the robot, it stops and waits until no obstacle is detected. In addition to the physical robot, we also have a simulated environment. We have developed a library and a custom simulation environment for our robot using CoppeliaSim. This was a crucial step for us because it eliminates the need for a physical robot, which means that experimenting with the robot comes at no cost. Our platform works seamlessly in both the physical and the simulated environment, allowing a project to run identically in either setting. On the left you can see a video of a simple example of our platform combined with the simulator. Here are more examples of our virtual environment. Those examples were created for different teaching scenarios, and we also have a video in the top right that demonstrates a line-following project inside the simulator. We are constantly trying to improve the robot, and we have received strong interest from our university students. We already have students creating educational material and developing new features such as real-time graphs, and we hope to integrate the new features into the main platform soon. Now let's dive deeper into the workings of our platform. Firstly, we have created a custom library to simplify the process of controlling the electronic parts of the robot, and that library is based on my 2019 Google Summer of Code contribution.
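The line-following behaviour shown in the videos (follow the line, stop while an obstacle is in front, resume when it is gone) can be sketched in plain Python. This is only a minimal sketch under assumptions: it does not use the robot's actual library, the function name and the action strings are hypothetical, and the three floor sensors and the obstacle detector are represented as simple booleans passed in by the caller.

```python
def line_follow_step(left_on_line, center_on_line, right_on_line, obstacle_ahead):
    """Decide the next action for one pass of the control loop.

    The three booleans stand for the three IR floor sensors; obstacle_ahead
    stands for whatever sensor reports an object in front of the robot.
    All names and action strings here are hypothetical.
    """
    if obstacle_ahead:
        return "stop"          # wait until the obstacle is removed
    if center_on_line:
        return "forward"       # line is under the middle sensor
    if left_on_line:
        return "turn_left"     # line drifted to the left, steer back
    if right_on_line:
        return "turn_right"    # line drifted to the right, steer back
    return "search"            # no sensor sees the line


# A short simulated run: the robot drives, drifts, meets a hand, then resumes.
readings = [
    (False, True, False, False),
    (True, False, False, False),
    (False, True, False, True),
    (False, True, False, False),
]
actions = [line_follow_step(*reading) for reading in readings]
print(actions)
```

Because the decision is a pure function of the sensor readings, the same logic could in principle be driven by either the physical sensors or a simulator, which is the property the talk highlights.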
The platform was built using Flask, Socket.IO, Python and Blockly, and can be deployed to the robot via Docker containers for easy distribution and maintenance. After powering up, the robot tries to connect to a known Wi-Fi network, and if that is not possible, it creates its own wireless access point. The platform can be accessed through the user's preferred browser, such as Chromium, and finally, all the necessary tools are already pre-installed inside the robot. This allows a hassle-free experience, as the user never needs to install anything on their computer. In this slide, we present a brief overview of how users can access the multiple levels of the robot. The top level is designed for less experienced users, and is where the platform resides, inside multiple Docker containers. This layer can be accessed through a web browser. The second and third levels are designed for advanced users with knowledge of Python and Bash, and can be accessed through SSH. Before concluding, I would like to present the future prospects for the robot and its potential use in higher education. At Harokopio University, we have already started to examine the potential of using the FOSSBot as a machine learning robotics platform by combining the custom high-level library, the simulated environment, and the physical robot with advanced algorithms. With this combination, we believe the FOSSBot has the potential to be used in various ways, for example, for reinforcement learning. Additionally, if some advanced sensors, such as cameras or LiDAR, are attached to the robot, it can be used as a self-driving platform or a computer vision platform. So that brings us to the end of our presentation. I hope you enjoyed it. Before closing, I want to add a couple of things to this excellent presentation by Christos. First of all, I have to say that technology wouldn't be successful without content, first of all, and without people.
So with the help of the Open Technologies Alliance, we also managed to bring together a great group of educators from primary and secondary education who are currently developing educational activities and educational content for teachers in Greece. They are currently running seminars on FOSSBot, training almost 1,000 teachers around Greece in programming and in using FOSSBot in their teaching activities. The benefit is that we have the virtual simulation environment for FOSSBot, so they can start working in the simulation environment, and then everything they have created there can be applied directly to FOSSBot: they can print FOSSBot, assemble it and use it in the actual teaching process. Another thing I would like to add to the higher education part of this presentation is that we are currently working with some colleagues at the university to develop a short-term curriculum, let's say a one-year curriculum, with basic IT courses such as data management, IoT programming, basic Python programming, and machine learning and AI, as Christos said, in order to develop content that will use FOSSBot as its main demonstration platform in most of the activities. So this is another effort that we are working on at the moment, and we hope that it will soon bring some results. And I would like to thank you once again for your attendance. |
Tableaunoir: an online blackboard for teaching |
Okay, so hello everyone, I am François Schwarzentruber, I'm really glad to be here and I will present to you Tableaunoir, which is an online blackboard. So a long time ago, in a galaxy far, far away, a mysterious virus attacked the population. It attacked almost everybody, and that's why we decided to organize a lockdown, right? But we still had to teach, and actually we need blackboards to teach, and we were not able to use blackboards anymore. There exist solutions online that enable you to draw and share ideas, but they don't completely fill the need. Some of them really look like, I would say, graphical drawing software, with many buttons, and usually you need to get rid of all these buttons that are on the screen. Also, usually when you have to explain things, you need to make animations and to make students interact with you. So for instance, maybe you know sorting algorithms? Sorting algorithms, like insertion sort. And you can use magnets for that, it's very easy on blackboards. You have magnets like that and you can teach, okay, you have to insert five in this, so you will move this and put five here and nine will be here, okay? And it's very difficult to do that with the existing tools, the solutions that are online. Also, with many of them, when you go on the website, you have to sign up. And maybe you don't want to sign up and give data to the services that offer white or blackboards online. And also when you are using them, their server maybe keeps all your data, they store your information, and you don't want that. So that's why we decided to create Tableaunoir, which is available on this webpage. It's online and it's also offline: you can use Tableaunoir offline with Electron, which is a kind of software that lets websites be used like offline applications. It's collaborative and that's it. So now the rest of the presentation is divided into several parts.
In the first part, I will discuss the philosophy of the tool. In the second part, I will explain a bit more the use of these fridge magnets here. Then we will discuss the collaborative aspects of Tableaunoir. Then I will discuss how to make a presentation with Tableaunoir, and finally we'll conclude. So what is the philosophy of Tableaunoir? The philosophy is KISS. Maybe you know this philosophy: it means keep it simple, stupid. This is more or less the philosophy behind Tableaunoir. If you have a lot of tools like that in your drawing software, actually you don't need them, because if you want to select a part of your board when you teach, you don't have time to pick the right selection tool and then blah blah blah. You do it directly: suppose you have this little guy and you want to move him on the board, you just draw around him with the chalk, then Control-X, and boom, you can move the little guy or little girl. So we don't need this button. Now, you have a button for chalk and one for eraser. You just need one button for toggling, it's sufficient. Then, usually in drawing software you can choose thicker or thinner lines, but with a chalk you don't need that, you just do it. So let's get rid of these buttons too. The button for the type of line is the same, it's very easy to do that by hand when you teach. So we don't need that. Zoom, we don't need either. I will explain later why we don't need zoom. So the tool Tableaunoir is teaching oriented. You can hide the toolbar. There is a toolbar, but usually you hide it. We don't care about the toolbar. You have shortcut keys like C: if you press C you can change the color of your chalk, and so on. Dynamic size of the eraser: usually when you erase in drawing software you have to choose the size of your eraser. When you teach you don't have time to wonder whether this size is good. Here, when you erase, if you do it slowly you erase only a little, and if you start to move fast, then the eraser becomes big. Okay, cool.
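The dynamic eraser described above, small when the pointer moves slowly and large when it moves fast, can be illustrated with a few lines of Python. This is a sketch of the idea only, not Tableaunoir's actual code: the constants, the linear mapping from speed to radius and the function name are all assumptions made for illustration.

```python
import math

# All three constants are assumed values chosen for illustration.
MIN_RADIUS = 10.0   # radius in pixels when the pointer barely moves
MAX_RADIUS = 80.0   # upper bound so the eraser never covers the whole board
GAIN = 0.05         # extra radius per unit of speed (pixels per second)


def eraser_radius(previous_point, current_point, elapsed_seconds):
    """Return an eraser radius that grows with pointer speed.

    previous_point and current_point are (x, y) tuples from two consecutive
    pointer events; elapsed_seconds is the time between them.
    """
    if elapsed_seconds <= 0:
        return MIN_RADIUS
    speed = math.dist(previous_point, current_point) / elapsed_seconds
    return min(MAX_RADIUS, MIN_RADIUS + GAIN * speed)


slow = eraser_radius((0, 0), (3, 4), 0.5)      # 5 pixels in half a second
fast = eraser_radius((0, 0), (300, 400), 0.5)  # 500 pixels in half a second
print(slow, fast)
```

The design point is that the user never picks a size: the gesture itself carries the intent, which matches the talk's argument for removing buttons.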
Accessibility: some people are left-handed and others are right-handed. On the project I had a request about that, and I think it's important that everybody can use Tableaunoir, so you can also flip the cursor. You can add text and also LaTeX formulas in your lecture. Okay, so now let's talk about magnets. Magnets are things you can move on the board. So you can, for instance, explain sorting algorithms directly in Tableaunoir by moving the magnets. Actually, you also have magnets from OpenMoji. This is a very nice project that I came across by accident. It's all icons. And Tableaunoir provides these icons in the interface, so you can use these little pictures for whatever you want. You can make interactive diagrams. For instance, you can draw graphs where the magnets are the nodes and the edges are drawn by hand. And then, interestingly, you can move the magnets and the graph will be updated accordingly. Okay, you can also put a magnet in the background. If you have a grid and you want to draw on top of the grid, you can transform it into a magnet, put the magnet in the background by pressing B, and then draw on top. And in particular, when you erase, you will not erase the grid. Otherwise, you would have to redraw it, and that's boring. Let's talk now about the collaborative aspect of Tableaunoir. Interestingly, if you use Tableaunoir alone, there is no communication with the server. Nothing. Apart from loading the HTML file and the JavaScript, there is no communication with the server during the interaction. But as soon as you click on this button, which is here, you share the board, right? And then it will contact the server. With tableaunoir.github.io, the kind of official address, it will contact the server that is located in my lab in Rennes, IRISA. And it will create a shared board, and you can share the URL with others.
And then when you share the URL, they are all connected, and when you draw something, messages are passed around carrying the information about what has been done on the board. That's more or less how it's implemented. The server is written in JavaScript. It's very small, in Node.js. And the client is a bundle file, and the entire project is written in TypeScript, using some libraries to... Yeah, I don't have the list here, but yeah. So the features: when you share the board, you need no account, and you can protect a board with a password. By protect I mean some users may not have the right to draw on the blackboard. They only have the right to see the blackboard. And no data are stored on the server. So it means if everybody stops the session, sorry for that, the data are lost. So you have to save the blackboard manually when you want to. Ah, there is no zoom. Why? Because suppose we have zoom on a blackboard. If somebody is writing something, for instance, okay, everything is okay. And another person is writing hello. But the first person does not see it because their zoom is not the same. So if you introduce zoom, you will get into trouble. That's why we decided not to have zoom. If you don't have zoom, everybody sees the same thing. We are happy and that's it. Okay. So with this tool, and this is a kind of new feature that was added afterwards, we can now make presentations with Tableaunoir. And actually the presentation that you are seeing is made in Tableaunoir. So it's not completely polished, but it's like we have another way to do presentations. I will divide the ways to make presentations in three. You have the LibreOffice style, the LaTeX Beamer style and Tableaunoir. So how do you make animations? In LibreOffice, you have to navigate through a lot of menus to say, ah, I want to move this to the left, coming from the right, and so on. Okay.
In LaTeX, sorry, I have written this type of code many times, you have to write some obscure code to encode animations. You can say, ah, this is on this slide, not on the next slide, and so on. And in Tableaunoir, you just do it. So maybe we can make a demo. We have one minute for the demo. So let's suppose we are at FOSDEM. And we want to explain how we arrived at FOSDEM. So there is a little cat. This is not a cat. This is a cat, no? Okay. Okay. This is my first slide. Okay. So now I create a new slide from here. Boom. And we have this cat. And this cat is going to FOSDEM. Okay. So now we have two slides. This one and this one. Okay. Now maybe we create a new slide where we erase everything. And you have, I don't know, this, I am improvising. Sorry. Maybe if you have ideas, don't hesitate. A t-shirt. Ah, a t-shirt. Okay. So we have a t-shirt. The cat. And then, so this is a slide, right? And then it takes the t-shirt and goes home. So let's see the presentation. So hello. I'm a cat. I want to go to FOSDEM. I entered. And I want to have a t-shirt. So you have a t-shirt here. The cat is quite happy. And he goes home with the t-shirt. Okay. So, conclusion. Tableaunoir can be summed up like this. It fits teachers' and students' needs. Plus the KISS principle, keep it simple, stupid. Plus a bit of nostalgia. You have the real blackboard aspect, which is not completely clean, you have a bit of chalk dust. Yes. A video game aspect with a circular menu. Also, if you draw something very far away, like this, you will have an overview map of the board. So kind of like Zelda games or whatever. So now I am tempted to add new features. But this is the opposite of KISS. So I would be very glad to discuss with you some features that I would like to add. And that's it. Thank you for your attention. Really cool project. I think we all agree. So thank you.
Thanks a lot for the presentation. And if you guys have any questions, feel free to grab him outside and ask your questions. Thank you. Thank you. |
Lua for the lazy C developer |
Hi, everyone. So, this is Frank van Bever with Lua for the lazy C developer. Hi, I'm Frank van Bever, and I'm here to talk to you today about the Lua programming language, specifically for lazy C developers. So, first of all, who am I? I'm Frank, I work for a company called Mind, a consultancy specialized in free and open source software for embedded systems. If you're the kind of person who enjoys dereferencing pointers and you're looking for a job, check out our website or come talk to me in the hallway after the presentation. So, with that out of the way, why am I here? I am here to talk to you about being a virtuous software developer. The man in the photo is Larry Wall. He's mostly known for being the creator of the Perl programming language, but he also coined the three virtues of a great software developer. These virtues are laziness, impatience and hubris. I want to focus on the laziness virtue specifically today. He defines laziness as the quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so that you don't have to answer so many questions about it. Now, in my day job, I'm mostly a C developer and, well, if you have any experience with that, you know that there can be quite some energy expenditure involved. So, introducing the Lua programming language, a quick introduction. It's a programming language. It's multi-paradigm. You can program in an object-oriented style, among others. It uses prototype inheritance, so for people used to Java or C++ it might seem a bit strange. It has the same inheritance model as JavaScript. It actually bears quite a bit of resemblance to JavaScript in some ways. They were both created around the same time. A big change from C is that Lua is dynamically typed, compared to C's static, weak typing. So, you can just declare a variable.
You don't have to specify the type. You don't need to care about it. Lua is small. That can be interpreted in many ways. The Lua installation that I have on my machine is about 250 kilobytes. So, it's really perfect for embedded systems, where you're often constrained. But it's also small in the sense of the language itself: it has a small set of meta-features that you can use to build whatever it is that you need. One example of that is that there's only one data structure, the table, which is like a map or a Python dictionary. And by constraining the behavior of this table, you can build all kinds of other data structures. So, yeah, with it being a small language, C developers should feel right at home. Another big difference is that Lua is garbage collected, compared to having to do manual memory management in C. Lua will basically take care of that for you. And most importantly for a C developer, Lua is actually also a C library. What does that mean? Well, on one side of the slide you have Hello World in Lua syntax. And this program can just as well be expressed using the Lua C API, which you can see on the other side. These two are equivalent. So, how does this work? Well, Lua has as a stated design goal that it should be both an extension language, meaning that you can call Lua from C, so extend your C application by using Lua, as well as an extensible language, meaning that you can call into your C code from your Lua application. And the way it does this is by using a stack. Everything the C API does is manipulate this specific stack. And this fixes two important impedance mismatches. So, like I mentioned before, the first thing that it fixes for you is the static typing versus the dynamic typing.
If we had to map the internal state of all the internal Lua types to C types, well, a knee-jerk reaction would be unions and structs. But that's a road that quickly leads to insanity. By using this stack as a clear boundary line between the two, it's easy to translate your Lua variables into C variables and vice versa. The second thing that it fixes for you, the second impedance mismatch, is the manual memory management that you need to do in C, while Lua is garbage collected. By pushing to and popping from the stack, it is clear when the handover of memory happens from the C side to Lua and vice versa. So, where might it make sense to use Lua? Well, as you can imagine, using a dynamically typed scripting language that is garbage collected comes with a performance hit. So, you need to keep that in mind. But some cases where it might be useful are taking care of tedious stuff that runs sporadically. There's no better way to get me to run to the coffee machine in a grumpy mood than having to do a lot of string manipulation in C. So for stuff like that, especially config files, for example, it can make sense to say, okay, you know what, we're going to call out to Lua, we get our config, we put it in a C struct, and then from there, we can go on with our application. Prototyping is another place where I found that Lua really shines. Sometimes you have some software that you need to build, and you only get some vague requirements communicated to you. It actually helps to have the agility of this dynamically typed, garbage-collected language, but still have the flexibility of calling into C dependencies that you may need later on. And then, as you go, you just switch out C for Lua and Lua for C.
And the third thing, and in my opinion this is where Lua shines the most, is plugins and extensibility. If you want to make your C application extensible and you just do it with C, you would say, okay, you need to build a shared object using this specific API, and then we'll do a dlopen. Pretty annoying. Sometimes, if you have to explain to some people what a compiler is, it already goes beep in their head. With Lua, you just define a simple API and it makes your application almost immediately extensible, so that a third party can inject their logic into your application. It means that you don't have to implement all the features that people need. They can do it for themselves very easily. So, how hard is it to do this? Well, it's a bit of a risky move to have only code on the slide, but let's do this. Imagine you have a Lua file that contains a trivial function that returns the sum of two integers, A and B. You want to call this from your C code. You initialize your Lua state, you load this file. The file gets put on the stack, and you need to execute it through a Lua call. This will register everything that is in the file into the global scope, and at this point, this add function becomes available in your Lua state. You retrieve it from the global scope, it gets put on the stack, you push both arguments onto the stack, and you do another call, this time specifying that there are two arguments and one return value. The final argument is for error handling, which is beyond the scope of this presentation. Your Lua function will be executed, and it puts the return value onto the stack, and then using this lua_tointeger call, you can retrieve it from the stack. You're back in C land, your Lua has finished running, and so with eight function calls, you unlock a whole new world of possibilities in your C application.
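The stack discipline walked through above (load a chunk, run it, fetch the function, push arguments, call, read the result) can be made concrete with a toy model. To be clear about the assumptions: this is Python, not C, and it is not the Lua C API. The `ToyState` class, its method names and `load_chunk` are invented for illustration and only imitate the order of operations the talk describes.

```python
class ToyState:
    """Toy stand-in for an interpreter state: a value stack plus a global table."""

    def __init__(self):
        self.stack = []
        self.globals = {}

    def push(self, value):
        self.stack.append(value)

    def pop(self):
        return self.stack.pop()

    def get_global(self, name):
        # Look a value up in the global table and push it onto the stack.
        self.push(self.globals[name])

    def call(self, nargs, nresults):
        # Expected layout before the call: [..., function, arg1, ..., argN].
        first_arg = len(self.stack) - nargs
        args = self.stack[first_arg:]
        del self.stack[first_arg:]
        func = self.pop()
        returned = func(*args)
        values = () if returned is None else (returned,)
        # Leave exactly nresults values on the stack, padding with None.
        values = (values + (None,) * nresults)[:nresults]
        self.stack.extend(values)

    def to_integer(self, index):
        # Read a stack slot as an int without popping it.
        return int(self.stack[index])


def load_chunk(state):
    """Hypothetical stand-in for loading a script file that defines `add`."""
    def chunk():
        state.globals["add"] = lambda a, b: a + b
    state.push(chunk)


state = ToyState()              # create the state
load_chunk(state)               # the loaded chunk lands on the stack
state.call(0, 0)                # running it registers `add` as a global
state.get_global("add")         # fetch the function onto the stack
state.push(2)                   # first argument
state.push(3)                   # second argument
state.call(2, 1)                # two arguments in, one result out
result = state.to_integer(-1)   # read the result from the top of the stack
state.pop()                     # leave the stack as we found it
print(result)
```

The point of the model is the boundary: the host never touches interpreter values directly, it only pushes, calls and reads stack slots, which is what makes the typing and memory handover explicit.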
Of course, check your return values, that's omitted on the slide, and check for errors, that's important. So we just covered calling Lua from C. The other option is, of course, that you call your C code from Lua. The way to do that is by creating these Lua C functions. They always have the same signature: they take a single argument, the Lua context, the Lua state, and they return an integer, which is the number of values that they put on the stack as return values. These functions always look the same: pop the arguments from the stack, do some useful work, push the results back onto the stack. You create an array of these luaL_Reg entries, where you have a name plus a pointer to the function, and a sentinel value to mark the end of the array. And then you have a function called luaopen_ followed by the name of your module, which calls luaL_newlib. This will put the module that you just created onto the stack, and it returns one for a single return value. By doing this you can then load the module in your Lua code, call into the C code and get the result back.
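The calling convention for host functions described above (take the state, pop the arguments, push the results, return how many results were pushed) can also be imitated in a few lines. As before, this is a Python toy model under stated assumptions and not the Lua C API: `ToyState`, `host_mul`, `open_toymath` and `call_host` are hypothetical names, and a plain dictionary stands in for the module table.

```python
class ToyState:
    """Toy stand-in for an interpreter state; only the value stack is modelled."""

    def __init__(self):
        self.stack = []


def host_mul(state):
    # Pop the arguments (the last one pushed comes off first) ...
    b = state.stack.pop()
    a = state.stack.pop()
    # ... do the useful work and push the result ...
    state.stack.append(a * b)
    # ... and report how many results were left on the stack.
    return 1


# Name/function pairs, mirroring the registration array from the talk.
# A real C array would end with a sentinel entry; a Python list needs none.
TOYMATH_FUNCTIONS = [("mul", host_mul)]


def open_toymath(state):
    # Build the module table, push it, and report one return value.
    state.stack.append(dict(TOYMATH_FUNCTIONS))
    return 1


def call_host(state, func, *args):
    """Play the interpreter's role: push arguments, call, collect results."""
    for arg in args:
        state.stack.append(arg)
    nresults = func(state)
    if nresults == 0:
        return []
    results = state.stack[-nresults:]
    del state.stack[-nresults:]
    return results


state = ToyState()
open_toymath(state)            # the module table is now on top of the stack
module = state.stack.pop()     # the "script side" picks it up
product = call_host(state, module["mul"], 6, 7)
print(product)
```

Returning a count rather than a value is what lets one fixed signature cover functions with zero, one or several results.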
Of course, having to build a shared object might be a bit annoying. You have to convince your build system to create a shared object for you, and there's no way to share things between the C application and your Lua code. There's a fix for that, actually: you can publish internal functions, so functions that exist within your C application, and make them available to a Lua context that is created in that application, by basically combining the previous two approaches. So, same thing here: a subtract function defined as a Lua C function, which returns the result of A minus B. You register it, and then in your code you can just call luaL_newlib, so the module gets created and pushed. It's actually a table with function pointers, because everything is a table in Lua. And then, instead of it being a shared object that gets loaded, you can just say, okay, the thing that I just put on top of the stack, make it global under the following name, in this case C arithmetic. In that case, any other Lua script that you use doesn't even have to load the module; it's already in the context. So wherever you're doing this, you can just call this C arithmetic module and the functions that exist within it. So in short, you could say Lua can help you get more done quicker, but keeping being a virtuous programmer in mind, I think that Lua can definitely help you embody this virtue of laziness. And so there's even some time left. For all the code that was in the presentation, I have some executable examples up on GitLab, if you want to check it out, and that's it, so thank you for your attention.
If you have any questions, or you want to tell me I'm wrong, or you want to talk to me about something, I'll be in the hallway after the presentation, so thank you. Thank you. We have 20 seconds, if we have one question maybe, a very quick one. |
I2P: Major Changes of the Peer-to-Peer Network
Cryptography of I2P Received a Major Update - an Overview of the Changes and its Impacts |
Hi, everyone. This is I2P, major changes of the peer-to-peer network, by Konrad Bächler. Close enough. Enjoy. Hello, everyone. Thank you for being here. This is my third I2P talk at FOSDEM. The first one was about DNS. That was the most difficult for me personally. And now, the major changes of the peer-to-peer network, made by the bunch of programmers who are creating I2P, which is an overlay network. I'm an independent researcher, software architect and developer, I'm developing libraries, and I'm working full-time for Diva.Exchange, which is a small non-profit association in Switzerland. And there is nothing like a centralized model involved, because I'm totally living in the peer-to-peer I2P world, because I really love what the I2P developers and the open source space have been doing for the last 20 years. I suggest we now look at the agenda. I'll give you a short introduction to I2P, and then we'll look at what the developers did, the latest changes to the code. And since I'm an application developer building on the I2P network, we'll focus on what the impact is. Then we'll look at the summary. And I understood that there is no time for questions, so if you have questions, I'll be outside in the hall for a few minutes, and I'm happy to get in touch with you if you have critical and skeptical questions. A short thing: what is the role of Diva.Exchange within I2P? I2P is a developer team. Diva.Exchange is a developer team. The I2P developers are spread all over the world. The Diva developers are spread all over the world. We cooperate in a friendly way, but we're not the same. Diva.Exchange is primarily a library developer and supports the official, so-called official, open-source Docker images. That's what we're doing. Myself, since I'm a researcher, I'm cooperating with Swiss universities, and I'm open to all cooperations with any university to take a very skeptical perspective on our own work. Now, what is the I2P network? The I2P network is an overlay network.
And now please jump to the very last line of this slide. I2P gives you privacy by design, serious privacy by design. It means you get confidentiality and total anonymity. Just as an example, this statement that the network gives you total anonymity is something we do research on in Switzerland, whether we can break this anonymity. That's one of our research approaches. But at the current state of knowledge, and since I2P has existed for 20 years, we can say it's confidential and anonymous. Now a question. In past presentations, around 5% of the audience had ever gotten in touch with I2P and used it in their life. Who in this room has? Please, hands up. Wow, that's close to 20%. Cool, thank you. So for the others who do not know it: anonymity has a cost attached. And this cost is called latency. When I'm talking about an anonymous network which gives you full confidentiality, then the messages are encrypted multiple times and hop from peer to peer to their destination. And through another tunnel, the reply hops back. This gives anonymity on a very, very, very high level. A word to be clear here: I2P has no storage capability by itself. It's a transport layer. Now, how to get I2P? Those who got in touch with it know that. There are the Linux repos, then there are PPAs, and then there are the containers maintained by Diva.Exchange, which you find on Docker. Now why is there I2P, and why is there i2pd? Because there are two versions available from two development teams. In short, I2P is written in Java. It has a user interface. And there are a lot of fans out there who really like that. And then there is a smaller version, i2pd, which is written in C++ and which is the daemon only, so it's rather a bit more for admins. Diva uses i2pd, and is also the image maintainer on Docker for this version. From my point of view, both are equally valid.
A word of warning: please don't trust any binaries floating around here and there which you can't reproduce locally by yourself. Please build it yourself if you can, and sure you can, because this gives power to the open-source developers. So if you just consume some binaries, please don't. It's also dangerous because you have a router running on your local machine. The whole point here is that it's a peer-to-peer network without any central authority, without any trust involved. Now, the latest changes. I2P has existed and been developed for 20 years, and there are two transport protocols, TCP and UDP. TCP, as also used within other overlay networks, was upgraded a long while ago, around four years ago already, to so-called NTCP2, which is much modernized, so the TCP communication was already in pretty good shape over the last four years. But UDP, which is something where you just blow out messages to the network, and I'll talk about that in a minute, and which is fast in an I2P context, had not been modernized for 15 years. The developers of I2P worked one full year on modernizing UDP, and they call it SSU2. So when you hear SSU2 in the context of I2P, it's UDP. If you want to dig deeper, then please take a look at noiseprotocol.org, that's very important also for other overlay networks you might have heard of, and the developer team of I2P has borrowed quite some things from WireGuard, the VPN, and also from QUIC, and there are some RFCs around, so please dig deeper by using these hints. Then the cryptography had to be updated too. There will be more work coming in the upcoming years, I know for sure. But it's already in quite a good state now. All right, seven minutes to go. What were the goals? UDP performance is simply much better in an overlay network than TCP. For me, as an application developer who is creating a fully distributed storage layer on top of the I2P network, the performance of communication between the nodes in the I2P network is an issue.
So for me, it was always the choice to take UDP. I implemented a gossiping protocol to realize this fully anonymous blockchain, this distributed storage layer, and I needed UDP so that the peers could communicate with each other. So for me, it was a really important goal that they could improve the performance a lot. Additionally, as you know, there are countries in this world where communication is not as easy as it is for us, who have the luck to be here in Europe, so obfuscation is a topic. All those people, all the whistleblowers and the journalists who are using I2P in critical countries, must not be detected, and I2P is one of the premier solutions for people who need communication on a very confidential level. So obfuscation was a big topic. And additionally, again, UDP is easily attackable, so the developers had to look into the security problems they had, and they did, and they did a good job. Right. There are a few other challenges with UDP, like the fragmentation of messages. You know, you have a long message, you want to send it through the network, but it gets chopped up into pieces, and UDP is not reliable, so the developers really had some challenges to solve in the last 12 months. But the fact is, they have implemented strong solutions, and today, if you look at the release notes of I2P, I2P is in a much better state. And this is what I want you to take away from this presentation as the core message: I2P has given you real privacy by design for 20 years, and the developers did a good job over those 20 years. But now, with this performance improvement, and that's my personal view on their work, they really made a major step forward, because UDP is really important for application developers within the I2P network. That's my personal view on it. Please feel free to criticize me afterwards in the Q&A session outside if you see this differently, or if you have a different view on it. I'm happy to hear that.
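The fragmentation problem described above can be sketched in a few lines of Python. This is only a toy illustration and not SSU2's actual wire format: the fragment tuple layout, the `MAX_PAYLOAD` value and the `Reassembler` class are assumptions made for this example. It shows why a receiver on an unreliable datagram transport has to tolerate fragments arriving out of order, duplicated, or not at all.

```python
MAX_PAYLOAD = 8  # hypothetical, deliberately tiny fragment size for illustration


def fragment(msg_id: int, data: bytes, size: int = MAX_PAYLOAD):
    """Split data into (msg_id, index, total, chunk) tuples."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)] or [b""]
    total = len(chunks)
    return [(msg_id, idx, total, chunk) for idx, chunk in enumerate(chunks)]


class Reassembler:
    """Collects fragments per message id until every index has been seen."""

    def __init__(self):
        self._partial = {}  # msg_id -> {index: chunk}

    def receive(self, frag):
        """Return the full message once complete, otherwise None."""
        msg_id, idx, total, chunk = frag
        parts = self._partial.setdefault(msg_id, {})
        parts[idx] = chunk  # a duplicate fragment simply overwrites itself
        if len(parts) < total:
            return None  # still waiting; a real stack would also time out here
        del self._partial[msg_id]
        return b"".join(parts[i] for i in range(total))
```

A real transport would add retransmission, acknowledgements and timeouts on top of this; the sketch only covers the bookkeeping needed to put the pieces back together.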
But these reductions, they are relevant. For us as application developers, we directly felt this performance impact. In November, when the release came out, so four months ago, or three and a half months ago, the performance in our blockchain test network improved very much. And although I believe, from a research point of view, that we have quite some work ahead of us, looking skeptically and very critically at the latest changes to see whether some weird bugs have been introduced, the first impression of the two months is: that's a new generation of I2P. So for me, the performance impact is really, really relevant. We're on target. We're seeing the finish line. Takeaways. After 20 years of development, I2P is the leading peer-to-peer privacy-by-design network which is a transport layer only, and which gives you today very modern cryptography and very modern possibilities to use as an application developer. The performance has increased significantly since last November. And if you feel like taking a closer look at I2P, as a researcher or as an application developer, now is a good time. Either you can invest as a researcher, trying to break the anonymity and present it at an upcoming conference, or you try to create some applications on top of it. I believe now is a good point to take a look again at I2P, because now it's really quite fast to use and it's also fun to use. On GitHub, that's in the lower right corner, you'll find Docker images which help you get started with I2P really quickly. Creating a peer-to-peer network as a test network is quite complex, but we created simple images for you which you can start with Docker Compose. You have all the containers running, you're a member of the I2P network, and you can write your application on top of this network. Here is a list of sources.
Please do your own research too, because privacy by design also means that we hope to motivate us developers to take a look, a very close look, at the documentation and a very close look at the source code, and to be skeptical, because privacy is such an important topic for us as developers. So we're always very, very pleased if we get critical feedback on the development of the stuff we're doing. Thank you very much for your time. Have a great FOSDEM. Thank you. Thanks, Konrad. Really nice talk. Again, if you have questions, please grab him outside and he'll be happy to answer, I hope. Thank you. |
The Nym Mixnet
Intro to a new anonymous communication network |
So, the next presentation is the Nym mixnet from Jon Häggblad. So, welcome. Yeah, thank you very much. Right. It's great to be here. And the presentation all seems to be working. Right, so I'll talk about Nym. The title is Intro to a New Anonymous Communication Network. There's quite a lot of overlap with the previous presentation in the concepts involved. And who am I? My name is Jon Häggblad, or sometimes I go by John for simplicity. I'm a Swedish developer. I spend my days writing Rust, back-end type of things. I did C++ and scientific computing in a previous life. Living in Stockholm. Yeah, that's me. Right, so the Nym mixnet. What's the Nym mixnet? So, basics. This is obviously free software. The source code is available on GitHub over there. It's Apache licensed. It's mostly written in Rust. All the back-end stuff is written in Rust. Some of the front-end things are TypeScript. In the past, this was funded by some EU projects, and currently there is a Switzerland-based startup that does the majority of the development. But it's an open project, and of course we welcome public contributions. It's also quite deeply rooted in university research. We have some world-class researchers associated with the project, so the concepts aren't things that we came up with ourselves. This is state-of-the-art research. Right, so what is the problem that we're trying to solve? We have the usual suspects: government surveillance and surveillance capitalism. And which of these two is a problem very much depends on where in the world you are. In some parts of the world, these things aren't that big of a concern. For other people, this is a serious matter. This is of grave concern to some people, depending on who you are and where you live. And what is the aspect that we try to tackle here?
Because there are a lot of privacy platforms that try to attack this challenge from different perspectives. The Nym mixnet is a network layer, or transport layer, thing. And the main challenge we focus on is that it has become clear in the last ten years that there's now so much surveillance going on, and there are some entities that collect so much data on a global scale, that they almost get some sort of god's-eye view of the network. They can monitor the network on a planetary scale, and they can correlate using leaked metadata: your transmission patterns, your packet sizes, timings. They can do end-to-end correlations even though your data is sent entirely encrypted the whole way, or obfuscated. Still, if you can monitor all endpoints, you can draw conclusions, you can identify who talks to whom. And as we know, who talks to whom is sometimes more important than what they say, from a surveillance perspective. So that's the angle, the challenge that we try to tackle. And now I'm taking a step back here. I'm referring to the Nym platform, and I use this quote here: a decentralized, incentivized mixnet plus private credentials. My talk here will be to try to unpack what all of this means, and we're going to start with what I think is the core part, the mixnet, the word in the middle there. I think if you use something like Tor as a starting point, that's a very good first step to understand what it is. Just like Tor and just like the previous talk, it's an overlay network. In the same way as I2P, it uses onion routing, where all packets are wrapped in layers of encryption to hide the end destination of each packet.
It's based on the Loopix design. If you know a little bit about mixnets, you've probably heard about Loopix. I put in a few citations here at the bottom, if you want to read a bit more about these things. It uses Sphinx packets, and the idea is that all data is wrapped into these identical-looking and identically behaving packets, to hide sizes and timings. And something I forgot to mention: a mixnet is very much what it sounds like. You send data through multiple hops, and you mix data as much as you can through a cloud of nodes. I'm going to have some pictures on the next slide to illustrate it better. But on each hop in the network, you add things like random delays, which in effect reorder traffic, and you add cover traffic. Cover traffic can appear in many ways, either between nodes, but also, for example, if you use a client to connect to the network to transmit data, you emit Sphinx packets at a steady average rate. So it's not a steady rate, it's probabilistic how you send the packets. But you send a steady stream of packets, either fake or real ones, so when you have real data to send, you just fill up the packet stream with real data. So from the outside, you can't tell when you're actually sending, when you're bursting data. You attach SURBs, single-use reply blocks, in your packets. If you make a request across the network to get something back, you attach these headers, this metadata, so that the response can be layer encrypted and sent back, and on the other side, the server doesn't know where the destination is. So you hide your identity, but you still allow the other side to reply back to you.
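The client-side cover traffic described above can be sketched roughly as follows. This is a toy model and not Nym's implementation: `PACKET_SIZE`, the two-byte header and the function names are assumptions made for this example. The points it illustrates are that send times follow a Poisson process (exponentially distributed gaps, so the average rate is steady but each gap is random), that every packet has the same size, and that real data only fills slots that would otherwise carry a dummy.

```python
import random
from collections import deque

PACKET_SIZE = 32  # hypothetical fixed packet size in bytes


def pad(payload: bytes, real: bool) -> bytes:
    """Build a fixed-size packet: flag byte, length byte, payload, zero padding.

    In a real mixnet the flag would sit inside the encryption, so an
    observer could not tell real packets from cover packets.
    """
    if len(payload) > PACKET_SIZE - 2:
        raise ValueError("payload too large for one packet")
    header = bytes([1 if real else 0, len(payload)])
    return (header + payload).ljust(PACKET_SIZE, b"\x00")


def send_schedule(queue: deque, rate: float, count: int, rng: random.Random):
    """Yield (send_time, packet) pairs at an average of `rate` packets per second."""
    now = 0.0
    for _ in range(count):
        now += rng.expovariate(rate)  # exponential gaps give a Poisson stream
        if queue:
            yield now, pad(queue.popleft(), real=True)   # real data fills the slot
        else:
            yield now, pad(b"", real=False)              # otherwise send cover traffic
```

Because the schedule does not depend on whether the queue is empty, the timing and size of the outgoing stream look the same whether the client is idle or sending a burst.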
Here's a picture. The first one there is an ordinary VPN, and a VPN doesn't give you any anonymity, it just moves trust, so the guy in the middle there can still see where data is coming from and where it is going. In the second one, you have things like Tor, where you have these nodes in the middle, and you open up a circuit through the swarm of nodes and pump data through. And here you have the mixnet setup, where each packet is mixed individually. You don't open up a circuit like in Tor, for example; each packet is sent through individually. And the crucial thing is that on the other side, you see these packets there, they're colored white now instead of red, and they're the same size, and you can't correlate the data on the other side with the data on the sender side, which you can in many other systems, because you can't correlate transmission patterns, timings, sizes. So even if you can monitor all the exit data from this mixnet cloud, you still can't correlate who talks to whom. That's the key thing here. So if we go back to this quote, a decentralized, incentivized mixnet plus private credentials: what we mean by incentivized is that the network directory keeps track of all the mix nodes and gateways, where gateways are a bit like exit nodes in Tor, and they are constantly being monitored. The network directory is effectively a set of validators running a consensus protocol, and they keep track of all the mix nodes, how well they mix traffic, how well they contribute capacity to the network, giving them NYM tokens for it, which in turn can be turned around and used to acquire bandwidth credentials. Coconut credentials is the academic term.
And the idea is this, because it's always a problem when you have something like this: with volunteers you only get so far. Anonymity, or privacy, loves company. You want to disappear in the crowd, so you want to encourage people to provide capacity to the network at the same time as they're using it. That's the idea. Otherwise it becomes difficult to scale up above a sort of base level, and if you want to make this available to the broader public, you need more capacity. This is a way that we hope we can achieve this. And these private credentials: the idea is that you break the linkability between your identity and your right to use the service. That's a very deep topic on its own. There's a citation, and there are some cryptographic buzzwords here. One is that they are re-randomizable, which means that if you use the same bandwidth credential multiple times, from the point of view of whoever is redeeming them it's indistinguishable from multiple people using different credentials. But the idea is that you want to break the link between your identity and your right to use something. And okay, the first word there, decentralized. There's not too much to add there. We have a running network, it's 500 mix nodes currently, and the vision is that this becomes self-running. It should have an antifragile funding model. We don't want it to be reliant on a specific company or some funding body or donations or anything. We want this to be robustly running on its own, run by the community entirely, long-term. That's the vision here. Even though currently we have a startup that does most of the development, in the long term we should be able to hand this off. That's the idea.
Here's a picture. This is all running currently. The thing that is still in deployment, being rolled out, is these credentials; currently the main network is free to use. We have all these clients: there's a SOCKS5 client, there's a Wasm client, there's a native client exposing a WebSocket. The mix nodes are up there. As a user, you connect to the gateway, which is like the entry and exit nodes for Tor, you mix the traffic, you exit on the gateway, and you can have service providers. There's the set of validators keeping track of all the nodes in the system. Yeah, there's a lot to take in here, probably a lot of details there, I'm not sure it's all visible towards the end, but that's pretty much it. Thank you for your time. Yeah, thank you a lot for a nice talk. Yeah, thank you for listening. I think that we have some time, so theoretically we could spend it on at least one question for two minutes here, and after that we can discuss outside. Hi, can you imagine the Nym framework also being integrated into another proof-of-stake-based cryptocurrency as a back end in the future, maybe? What did he say first? Can you imagine that the main part of the Nym framework, like the mix nodes and everything around it, can also be attached to an existing proof-of-stake-based cryptocurrency that is not currently part of your ecosystem? Well, this is on the network layer, and that means there is a big use case. You have all these other systems in this crypto space which have privacy-preserving services, but they still leak metadata at the bottom layer. They still leak metadata when you broadcast transactions and things like this. So I think if this is integrated into other systems in this space, it will be in that layer, the transport layer. So yes, there's a lot of potential for integrating with other privacy platforms, I think.
In general, there are a lot of privacy platforms, and I think what we need is a robust ecosystem. There is no single solution that solves all our problems. We need a robust ecosystem of different solutions for different types of problems or different categories of problems. I don't see this as a competitor to other systems. They're more of a complement to each other. For example, when you add random delays, that of course means you compromise, you give up a bit of latency, which works very well for asynchronous communication but might not work so well for other categories of applications. So I think something like this is also a complement to Tor. It doesn't replace Tor. It complements it. Yeah. Okay. Thank you again, Jon. If there are any more questions, just grab me afterwards. Just go there and ask questions. |
Keyoxide: verifying online identity with cryptography
A novel approach to secure decentralized online identity |
Okay, Yarmo Mackenbach is next. His talk is Keyoxide: verifying online identity with cryptography. Welcome, Yarmo. All right. Is this working? Yeah. All right, awesome. Thank you all for coming. So, yeah, I'm Yarmo Mackenbach, and I would like to give you a little non-technical introduction to my project called Keyoxide, which is all about verifying online identity with cryptography. To start us off, I would actually like to review very quickly how traditional passports work. You walk around with your little passport, and you give it to the person who needs to verify it. They will do their little verification with a computer, and then, of course, after that, you can do whatever you came to do with your passport. What I would like to briefly touch on is this third step, where a passport needs to be verified by contacting the government that issued it. In and of itself, the passport doesn't do that much. It's just a piece of paper with maybe a little picture to identify the person, but not that much more. And in fact, even if you do verify with the government, it checks the validity of the document: is it still valid? But it doesn't actually link it to the person, again, except for maybe the picture. The online world, of course, is becoming a bigger part of our lives. So how would identity work on the internet? For me, of course, I would like to have it secure. And with that, I actually mean verifiable, so that I can present my online passport and prove to people that it's me, and that no one else is trying to claim that they're me or anything like that. I would like it to be anonymous. We don't actually need this online identity to be the same as our real, in-real-life identity. In fact, I would go even further than that. We can be multiple persons. We can be a certain person that works in a certain domain. We can be a person with a certain hobby.
And all these personas don't need to match a real identity. They should have their own identities that need to be maintained and cared for and presented separately as identities. I would like it to be self-sovereign. People should be able to make their own passport. They should be able to create it, update it, delete it, distribute it, or keep it private, as much as they would like to, as they see fit. And of course, I would like to see it decentralized. There shouldn't be any gatekeepers. It shouldn't run through a single big instance. If you verify someone's identity, you need to be able to do it yourself, on the spot, without having to trust another entity in the middle, as much as possible. And if you combine all this, I think it should represent what you do on the internet, online, and not, like the traditional passport, mostly where you were born. So let's envision a little scenario. Again, I said that this would be non-technical. We start off with Alice, and if there's Alice, then, of course, there's Bob. They've spoken to each other for years now with, let's say, Matrix, I don't know. So they know who they are. And again, these don't need to be in-real-life identities. They could be pseudonyms, but they know who they are. And on a certain day, Alice gets a message from a certain Bob91 through a separate network. Of course, Alice wants to know: is this Bob? And if it is, why are they using a different username? That could have many reasons. But again, is this actually the same person? Luckily for Alice, in this case, she could ask Keyoxide. And Keyoxide did a little check on both Bob on the Matrix network and Bob91 on the XYZ network, and it could actually verify that they were the same person. So that's a really rough explanation.
So let's actually look at what happens to get to this point. What Bob did is he created a cryptographic key. Cryptographic keys are represented by a fingerprint, a large string. In this case, let's simplify it, we'll just call it key 42. And he treats the cryptographic key as a sort of vault, so he can put arbitrary data in there. In this case, he puts in two links. Here I use different accounts: maybe he refers to a Fediverse account, and he refers to a Liberapay account. Again, these could be any number of services that he chooses to enter in the vault. And then he uploads a public key version of this to a so-called key server. A public key in this context we can view as a glass vault. Everybody can look inside, but no one can actually modify the contents. Only Bob can, and Bob secured it through cryptography. But everyone can look at this vault and look at the links that are inside. And on that fateful day that we just discussed, Alice went to this Keyoxide page and entered the number 42, because she knows that that's Bob's cryptographic key identifier. Again, they must have talked about this before or anything like that, but beforehand she knew that this was Bob's key. Keyoxide then goes and fetches this key from the key server and displays the data. But at this point it's just displaying a list. It doesn't know yet whether it's actually verified, because just claiming that you are a certain account on the internet is not sufficient. So what Keyoxide then continues to do is that it will fetch some basic profile data from this Fediverse account. And inside this Fediverse account, it will find a link back to the key, in this case the number 42. With this confirmation, it knows that the identity is verified. This really is the basis of how Keyoxide determines identity: through bidirectional verification.
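The bidirectional check described above can be sketched roughly as follows. This is a simplified model and not Keyoxide's actual code: the `PublicKey` class, the `PROFILES` dictionary standing in for real HTTP or API requests, the example URLs and the `openpgp4fpr:` proof string format are all assumptions made for this illustration. A claim counts as verified only if the key points to the account and the account's public profile points back to the key's fingerprint.

```python
from dataclasses import dataclass, field


@dataclass
class PublicKey:
    fingerprint: str                                   # "42" in the stick-figure example
    claims: list[str] = field(default_factory=list)    # account URLs stored in the key


# Stand-in for fetching public profile text from each service (made-up data).
PROFILES = {
    "https://fedi.example/@bob": "Hi, I'm Bob. proof: openpgp4fpr:42",
    "https://pay.example/bob": "Donations welcome. proof: openpgp4fpr:42",
    "https://social.example/mallory": "Nothing to see here.",
}


def verify_claims(key: PublicKey, fetch=PROFILES.get) -> dict[str, bool]:
    """For each account the key claims, check that the account links back to the key."""
    proof = f"openpgp4fpr:{key.fingerprint}"
    results = {}
    for url in key.claims:
        profile_text = fetch(url) or ""      # unreachable profile: treat as no proof
        results[url] = proof in profile_text
    return results
```

Listing an account in the key is not enough on its own: in this sketch a claim on `https://social.example/mallory` stays unverified because that profile never links back to fingerprint 42.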
So the key links to a certain account on the internet, and the account on the internet links back to the key. It does the same verification for the Liberapay account. It finds a link back to the key. So it has to be the same person who did both actions: put the link to the account in the key, and put the link to the key back in the account. That's how it looks in stick-figure mode. This is an example of my own profile. This big string is actually what the 42 represented in the stick figures. And below are all my identities, all my accounts that I linked in my profile and that have actually been verified by this website. So what this website, keyoxide.org, does in the end is basically just a little automation layer. You could verify it yourself. You could go to each of these profiles, you could go to each of these links, and just look for the reference back to my key, which is written up there. But that is kind of a tedious task, so that's what Keyoxide is designed to automate for you. But Keyoxide will always give you the option to verify for yourself. Currently, we do have a Keyoxide mobile app. It's available on F-Droid only for now, and it only does profile verification. It doesn't do profile creation yet. For now, that requires you to use the command line, the GPG command line tool, so that's not for everyone. But we're working on making sure that the app can do it too, just like Keybase used to do back in the day. And of course, we would like to release it on different app stores so that it's more widely available. But you can already play around with it right now. I would like to briefly touch on what is next for Keyoxide. We're trying to create an easier way to create profiles, not only using OpenPGP. Right now, everything is OpenPGP within Keyoxide. But soon, you'll be able to use, like it says, minisign keys or SSH keys or whatever keys we can find.
As I said before, I would also like to have more apps that can actually help you and guide you in creating profiles, so that it's not just for the very technical people among us, but so that people more generally can enjoy online identity verification. Another thing is that each website that we support needs to be programmed manually. We need to know the right APIs for every service, so there's a lot of work that goes into finding out how to do that. And of course, if a certain bird site decides to close their API, well, that is over then for those verifications, but that is their problem. So we need to add support for more services and websites. And we should also play around with creating new websites, creating more clients. Clients can do a lot more things than just display a list of profiles, and these are things that should be explored: different ways of using identity online. Actually, I did so myself. I created my own competitor, in this case named Keyoxide Blue. And like it says, I built this in an afternoon. It's just that easy to get started with this idea, because in the end it's just an automation layer. It was just a quick little test for myself to see how easy it is to create a new client. So if anyone feels like playing around with this, go ahead. It's a little fun, I guess. The other thing that I quickly wanted to mention, going further than just displaying a nice list of profiles and online identities, is: what else can we do with online identity? Actually, a few hours ago, there was this presentation by Pablo, where he discussed creating CVs, for developers for now and in the end for more people. How can developers share what they do online? How can they share their free and open source contributions on different platforms, on different websites?
So that is something that maybe we could explore, using Keyoxide as a back end to know who has done what and make a nice little CV of anyone's contributions to the open source world. And with that, I would like to thank Victor and Berker and all the other people who have contributed to the project so far, and NLnet for their funding. And if you like the idea behind this, do reach out. It's fun. We're always looking for more people to move this field further. Thank you. Yeah, thank you a lot for that presentation. So we have now almost three minutes for questions, and after that, you can ask them outside. Anyone? Questions? So is this system dependent on a key server? So the question was, is it dependent on key servers? In the end, no. Key servers are just one way of promoting, distributing keys. But there's also WKD, Web Key Directory, where you can put keys on your own server. You could also share keys directly between people, of course. Brilliant. OK. So have you also included key signing so that you can create a web of trust? No. There have been talks about that, you know, because... It would be good. Yeah. Yeah, of course, the web of trust discussion, you know, people like it, people don't like it. If you can make that socially easy for people, it's a big deal. Yeah. That is also something to explore, the social aspect of having identities, because in the end, that's just another verification. Do people know each other? And then that generates trust. Thanks. There were a few questions there. Yeah, there's a last question. Also in that direction: do you have support for blockchain names or something like that, where the public key is on a chain? It should probably be feasible. We haven't looked that much into it yet. For now, we're just focusing on the basic platforms where people are and where you can just create an account and then get some data back from an API.
It should probably be feasible. As long as you have an online entity where you can put arbitrary data and you have a way of querying that data, everything's possible. It's just a matter of: send us a PR or send us an issue, let's discuss, and we'll get it sorted. Okay. Thank you, Yarmo. Thank you for listening. Thank you. |
gallia: An Extendable Pentesting Framework |
So, hello everybody. Today I'm going to talk about something I've been hacking on for quite some time now. It is called gallia, and it is an extendable pentesting framework, mainly for the automotive domain. My talk is divided into four parts: I will start with some metadata about me and the project, then I will give you an overview of the status quo, I will conclude with an outlook, and I hope we have some time for a short demo. So that's me. On the left-hand side you see my avatar, and you might have spotted me on GitHub. I'm Stefan, I'm a security researcher, and I'm the maintainer of gallia. So what is gallia? Gallia stems from the SecForCARs project. It was a research project and we received some funding. On the YouTube link you can see a little demo we prepared last year. Gallia is implemented in the Python programming language, and we support the latest version minus one, which is currently the 3.10 release. It is free software and it is available on GitHub. It is licensed under Apache 2, and we have two maintainers. The second maintainer hides in the audience; if you have questions, we will be around here for some time. It aims to be a modular tool for automotive penetration tests. So what is this? According to Wikipedia, a penetration test is an authorized simulated cyber attack on a computer system, performed to evaluate the security of the system. That basically means we connect our computer to a car or an automotive ECU, we send some data, and we keep on sending data until it breaks, and hopefully it breaks. Then we try to figure out what we did in order to break it. And after such tests, the lab usually looks like this. What are the challenges to actually achieve this?
The reason why gallia exists is that we were doing some penetration testing and we needed a protocol stack for this. In the automotive domain there is usually the UDS protocol, which stands for Unified Diagnostic Services, and you can think of it as the HTTP of automotive, with the difference that UDS is stateful, in contrast to a stateless protocol. Of course we need post-processing, which means machine-readable logs in order to analyze data. Everything needs to be reproducible; we solve this with a defined directory structure for artifacts. And of course the automotive guys are very, very creative in implementing network protocols. That means if you do expect an answer, the ECU doesn't answer, and if you expect no answer, the ECU does answer. That's quite a challenge, and that's why we decided to write our own protocol stack. Since the automotive industry loves proprietary software and we do want to release the core of gallia, we created a plugin interface, where we maintain our own proprietary plugins in our own infrastructure, which plug nicely into the GitHub code. And of course we need the whole software stack to achieve these goals. We did write a whole implementation of the UDS stack, and the status quo is like this. Here you plug it into the OBD port of your vehicle. OBD stands for on-board diagnostics, and several ECUs might be exposed on that port. You can use gallia to discover this whole tree. For example, there might be three ECUs available, and each ECU has different modes of operation. These are called UDS sessions. Gallia can also discover these, and each session might also provide different UDS services. What a UDS service does is up to the manufacturer of the ECU. It can be getting parameters, it might be setting parameters, it might even be software updating. The UDS standard defines just some basic facts about what this actually is, and it could be basically anything. Gallia comes as a CLI tool, and you can think of it as the nmap for cars.
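The discovery of the session and service tree described above can be sketched against a toy ECU. This is not gallia's API or scanning logic: `ToyECU` and `scan` are hypothetical names invented for this example, and the positive response to a session change is simplified (a real one carries timing parameters). What the sketch does take from the UDS standard is that a positive response echoes the service ID plus 0x40, a negative response starts with 0x7F followed by the service ID and a response code, and the codes 0x11 (serviceNotSupported) and 0x7F (serviceNotSupportedInActiveSession) mean the service is not available.

```python
NEGATIVE_RESPONSE = 0x7F
SERVICE_NOT_SUPPORTED = 0x11
SUBFUNCTION_NOT_SUPPORTED = 0x12
SERVICE_NOT_SUPPORTED_IN_SESSION = 0x7F
DIAGNOSTIC_SESSION_CONTROL = 0x10


class ToyECU:
    """Hypothetical in-memory ECU: maps session id -> set of supported service ids."""

    def __init__(self, sessions: dict[int, set[int]]):
        self.sessions = sessions
        self.active = 0x01  # UDS default session

    def request(self, data: bytes) -> bytes:
        sid = data[0]
        if sid == DIAGNOSTIC_SESSION_CONTROL:
            if len(data) >= 2 and data[1] in self.sessions:
                self.active = data[1]            # stateful: the session persists
                return bytes([sid + 0x40, data[1]])
            return bytes([NEGATIVE_RESPONSE, sid, SUBFUNCTION_NOT_SUPPORTED])
        if sid in self.sessions[self.active]:
            return bytes([sid + 0x40])           # positive response
        known = any(sid in s for s in self.sessions.values())
        nrc = SERVICE_NOT_SUPPORTED_IN_SESSION if known else SERVICE_NOT_SUPPORTED
        return bytes([NEGATIVE_RESPONSE, sid, nrc])


def scan(ecu, session_ids=range(0x01, 0x05), service_ids=range(0x10, 0x3F)):
    """Return {session: [service ids that did not answer 'not supported']}."""
    tree = {}
    for session in session_ids:
        resp = ecu.request(bytes([DIAGNOSTIC_SESSION_CONTROL, session]))
        if resp[0] == NEGATIVE_RESPONSE:
            continue                             # session does not exist
        found = []
        for sid in service_ids:
            resp = ecu.request(bytes([sid]))
            rejected = resp[0] == NEGATIVE_RESPONSE and resp[2] in (
                SERVICE_NOT_SUPPORTED, SERVICE_NOT_SUPPORTED_IN_SESSION)
            if not rejected:                     # any other answer implies the service exists
                found.append(sid)
        tree[session] = found
    return tree
```

Note the design choice in `scan`: a negative response with any other code (for example a complaint about message length or a missing sub-function) is still treated as evidence that the service exists, which is why 0x10 shows up in every discovered session.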
We provide some ready-to-use scanning modules, for example the discovery I already showed, and there are also modules to investigate these further. We have a UDS stack including DoIP and ISO-TP. These are transport protocols beneath UDS, since UDS is on the application layer: DoIP, for example, is on top of TCP, ISO-TP is on top of the CAN bus, and you can use all these setups. The next one is that we are able to do some automation. If you are testing an ECU on its own, you can power cycle it during a scan; we power cycle the whole setup before each scan, and so on. And of course we have machine-readable logging, which comes as JSON logging and SQL logging. The SQL module is quite interesting, since it can be used to query logging information across multiple ECUs. For development we offer a virtual ECU module. The core concept of Gallia is a test run. It is basically the invocation on the command line until it finishes, and it always creates the same directory structure, which contains some artifacts that can be used for scripting or similar. The artifacts always contain log files, always contain pcap files, and might contain something else. The software is basically structured like this: there is a core module which can be extended via plugins, and you can build standalone modules or you can integrate into the CLI system. Basically, the architecture is like this: the main entry point is the scanner on top, which contains a module for optionally controlling power supplies, and it contains an abstraction module for an ECU, which uses the whole UDS protocol stack; the protocol stack can also be extended via plugins. These plugins might look like this. This is a hello world module: basically you create a class in Python, you need to implement the main method, which could be basically anything, and then you plug it into an entry point, and that's basically it.
For random facts: we use Poetry for dependency management, and in order to maintain a modern Python code base, asyncio with async/await is used everywhere. It is fully typed, it passes mypy strict, it is extendable, as mentioned, via the Python entry point API, and for configuring the protocol stack we use URL strings which are verified by the pydantic module. Yesterday there was a great talk about the pydantic module, if you are interested in this. So let's give a little outlook. Of course, we need more power supplies, we need more transport modules, we also need more scanner modules, and we need to extend the scope: more plugins, more scanning techniques and so on, and of course more breakage, more memes and more testing. And we need more packages. Currently there is a package for the Arch Linux distribution, an AUR package, and it is included in the NixOS distribution in the unstable branch. If you are a package maintainer and you are interested in this, just create a package and file a ticket on GitHub; we would like to include this in the readme file. I will conclude with a short demo. It can be downloaded at this link, but I brought it here and I will play it. What we can see here is a tmux session with two tabs. On the first tab, we will start a virtual ECU, which is a testing device we will run Gallia against. On the second tab, we have the command line invocation. Here we have the configuration of the network stack: we have a transport module called tcp-lines, which basically sends ASCII strings on top of TCP; we use this for debugging and testing. We have an artifacts directory where all the logging is placed, and we try to discover what actual UDS services this ECU exposes.
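The idea of configuring the network stack through a URL string can be illustrated as follows. The talk says the real tool validates such strings with pydantic; this sketch swaps in the standard library's `urllib.parse` so that it runs without extra packages. The scheme list, the `TargetConfig` class, the `parse_target` helper and the example URL are assumptions made for illustration, not the tool's documented syntax.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Illustrative set of transport schemes; the real tool defines its own.
KNOWN_SCHEMES = {"tcp-lines", "isotp", "doip"}


@dataclass(frozen=True)
class TargetConfig:
    scheme: str           # which transport module to use
    host: str             # host name, IP address or interface name
    port: Optional[int]   # None when the transport has no port
    options: dict         # remaining settings from the query string


def parse_target(url: str) -> TargetConfig:
    """Split a target URL into transport, endpoint and options, with checks."""
    parts = urlsplit(url)
    if parts.scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported transport scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing host or interface name")
    # parse_qs returns a list per key; keep the last value given.
    options = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return TargetConfig(parts.scheme, parts.hostname, parts.port, options)
```

For example, `parse_target("tcp-lines://127.0.0.1:20162?timeout=2")` yields the scheme, host, port and an options dictionary. Keeping the whole transport setup in one string makes a command line invocation easy to log and to reproduce.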
When you start it, it starts dumpcap, which records all the network traffic. Then Gallia synchronizes with the ECU, which means it sends a tester present packet, which is something like a ping message, and when the pong arrives, the scanner starts and iterates over the ECU. In this example, we have found a mode of operation, this session 0x52, and several services are exposed in this session which can be scanned further with other scanning modules. Now I have finished my talk. Thank you very much for your attention, and if there are any questions, I will be happy to answer them. Yes, thank you, Stefan. So we have some time for questions here. Yeah. Hi, thank you for the talk. Can you say anything about fuzzing or memory scan, what you did there? We have a very basic fuzzer; it is included in the GitHub code. It is PDU fuzzing, but it just generates random data and you don't have any feedback loop, since this is quite complicated to implement for a car in a generic manner. Internally we have some more sophisticated fuzzing modules, but we are not allowed to publish them because of the NDA stuff, unfortunately. For the memory scan, there are a few services, blah, blah, blah, by memory address; we have discovery modules for this published on GitHub, but they actually just discover endpoints, and that is basically it. So I have done a lot of testing and reverse engineering, and you need a matrix of tests of what you are testing against. So I was just wondering, for fits and giggles, is this virtual ECU sufficient to actually plug it into the bus and actually run the car? Because then it would be accurate enough to be tested as a real-world MCU. I hope I understood the question correctly. Our virtual ECU module offers a possibility to clone ECUs, so we can just record traffic and store it in the database, and it just answers what it has seen recently, but it does not cover any state.
The idea was that the virtual ECU is actually sufficient to replace the ECU in the car and run, drive the car, and that would give enough accuracy for the ECU to be tested. Maybe it's a bit too much, but that's not in the scope of Gallia, I feel. Okay, thank you, Stefan. Thank you for listening, and feel free to ask the speaker further questions outside. Thank you. You are welcome. |
Jubako, a new generic container format
A new file format to store contents all together |
Thank you. We start with a small introduction to have a bit of context about Jubako. I'm Matthieu Gautier, I'm a freelance developer, and my main client is the Kiwix project, where I'm the lead developer of libzim. What is Kiwix? Kiwix is a project to provide content where the internet is not there, and the question we try to answer, and have answered, is how to distribute static websites. For example, if you don't know, all of Wikipedia in English is 95 gigabytes and 6.5 million articles and media. To do that we use the Zim format. It's an archive format for web content. Content is partially compressed, so you can compress textual content and not compress images or videos, and you can do random access without initial decompression, so you can access the content inside the archive directly. It works well and it is pretty efficient, but there are a few flaws in the design: the archive is really tied to web content and to Kiwix, and you cannot add other metadata. So the question I tried to answer is: could we reuse the good ideas of the Zim format and do something better and more generic? So here is Jubako. Jubako is the Japanese name for bento boxes; they are boxes you can compose the way you want depending on your needs. Jubako is a new format, independent of the Kiwix project; it takes the good ideas of the Zim format but is generic. Jubako is a meta container. It tells you how to store things, but it's up to you to decide what you want to store and how you want to organize it, and there is a reference library written in Rust. The features of Jubako: archives are mainly read-only. There is selective compression, so you can compress the content or not. No initial decompression is needed, and you can do random access on the archive. It's configurable, so you can decide which properties you want on the entries.
There is an extension system, so your users can download an archive and then download extra content to add to the archive you provide. It's embeddable in another file, and it's composable, so you can compose different kinds of entries together in the same container. It has checksums, and there are a few features still to do: signature and encryption, direct access to uncompressed content, content deduplication, modification, diff and patch between archives, and overlay. Let's have a quick tour of the internal structure. Jubako containers are organized around packs. There are three kinds of packs: the manifest pack, the content pack and the directory pack. Each pack can be stored individually in a file in the file system, or they can be put together in one file, and then you distribute this file to your users. The manifest pack is the main pack; this is the pack you will try to open when you want to open a Jubako container, and it's mainly a list of all the other packs of the container. The content pack is a pack which contains the raw content, compressed or not, and without any metadata. The directory pack is where you store the entries, and the entries can point to contents in the content pack. This is the configurable part of Jubako. Inside the directory pack there are entries with a specific schema, so you have to define the schema, and the schema is a series of properties and their types. The content is just a property; it's a link to the content in the content pack, so you can have entries that point to several contents or to no content at all. Each entry schema can contain variants; it's kind of a union or enum in programming languages like C or Rust, and you can have different kinds of entries inside one directory pack. Some use cases: why would you like to use Jubako?
The first use case is a file archive. There is the tool arx, which is an equivalent of tar, and here we have one kind of entry with three variants: file, symlink and directory. All three variants share two common properties. For example, the file variant adds a pointer to the content, while the directory variant just stores a pointer to its first entry and the number of entries in the directory. So it's kind of organized as a tree structure, like a file system. There is no index property for now, mainly because arx is pretty young and I don't want to bother with that while testing arx and Jubako, but it's hard. It's a file archive, so we can compare arx a bit with tar to see how Jubako and arx perform. If we take the Linux source code, the full Linux source code is more than one gigabyte, and with both tar and arx the compressed source code is about 130 or 140 megabytes. Compression time: arx is a bit faster than tar. Extraction time: arx is a bit slower, but both tools are pretty close. What is interesting is when we try to list the contents of the archive. Tar took almost the same time as for extraction, because to list the contents of a tar archive you need to uncompress all the contents, and arx is much faster because the list of entries is separated from the contents themselves. If you want to extract only one part of the archive, which is what I call dumping, and you try to dump a third of all the entries, you can see that arx is really, really, really faster. In the same way, extracting one entry from the tar takes pretty close to the time of listing the contents, as you need to uncompress all the contents of the tar archive, whereas with arx you can locate the content and do a direct access to it without uncompressing other contents.
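Why listing and single-entry extraction are fast when the entry list is kept apart from the content can be shown with a toy container. This is a sketch of the general idea only, not Jubako's or arx's on-disk format: the `Entry` fields, the per-entry zlib compression and the suffix-based compression choice are assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    path: str
    offset: int       # where the stored blob starts in the content area
    size: int         # stored size in bytes (after optional compression)
    compressed: bool  # selective compression: text yes, images no


def build_archive(files: dict, compress_suffixes=(".txt", ".c")):
    """Return (directory, content). The directory holds only metadata."""
    directory = []
    content = bytearray()
    for path, data in sorted(files.items()):
        compressed = path.endswith(compress_suffixes)
        blob = zlib.compress(data) if compressed else data
        directory.append(Entry(path, len(content), len(blob), compressed))
        content += blob
    return directory, bytes(content)


def list_entries(directory):
    """Listing touches the directory only; nothing is decompressed."""
    return [entry.path for entry in directory]


def extract(directory, content, path):
    """Random access: read and decompress just the one blob asked for."""
    for entry in directory:
        if entry.path == path:
            blob = content[entry.offset:entry.offset + entry.size]
            return zlib.decompress(blob) if entry.compressed else blob
    raise KeyError(path)
```

A compressed tar stream has no such separate index, so both listing and extracting one file mean decompressing the stream from the start. That matches the timing pattern reported above.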
One thing we can do is mount the archive, directly mount the archive on the file system. If you mount the archive and do a diff of the content between the original source and what is mounted: a diff between two plain directories takes a bit less than a second, with arx it's four and a half seconds, and for tar, by estimation, it would take something like ten hours to do the comparison. You can do something even more interesting with a mounted file system, with the mounted Linux source, which is compiling the kernel. If you compile the kernel on the plain file system it's a bit more than half an hour, and if you compile the kernel using the mounted arx archive it's a bit less than an hour. What is interesting here is that the compilation is made with -j8, so there are eight processes, and the arx FUSE file system is mono-threaded, so there is a huge bottleneck for now, but if we move to a multi-threaded FUSE file system it should be even better. Another use case is jim. It's kind of an equivalent of the Zim format. There are only two variants, and here we are storing the entries as a plain list; there is no tree structure, and the jim binary just integrates a small HTTP server looking up the entries. What we could also do is replace, for example, RPM and DEB with arx or things based on Jubako, so you could download your package and not extract it to the file system, just open it directly. Even the devel or debugging info could be put in a specific content pack of the same archive, so it could simplify the management, and you would not need different packages for different subtypes of content in your packages. OCI containers are based on tar, and you need to extract them to the file system before running a container, so you could just use arx and mount them, or you could even put the different layers directly in different content packs, so the whole image would be one Jubako container.
File formats: almost all file formats are in fact containers for other content, so you could use Jubako to just organize the content you want to store for your own project and your own file format. Websites: Jubako is written in Rust, so you could run it in Wasm; you could load your Jubako archive in the browser once and just open it directly in the browser. Backups: Jubako is almost incremental by design. If you reuse the content pack of the previous backup, it is incremental, and you can decide which properties you want to have, so for example you can add a checksum on each entry to do a comparison between the content stored in the backup and what you have on the file system. Embedding resources: Jubako can be embedded in executable programs, or even more. Take this presentation: you can download this presentation at this address and you will have a file, and this file is an arx archive, so you can just use the arx tool to list the content, extract or mount the archive, and you will have access to all the files of this presentation. It is reveal.js, yes, and it's HTML content. But the same file is also a jim archive, so you can just use the jim tool to serve the content and open a browser on localhost. And the same file is also a program, so if you make it executable you can run the program itself to mount, extract or serve the content. What is interesting is that between arx and jim the content is shared. It is an arx and a jim archive, but it's just two views of the same content. There is no duplication; it's not two archives put together, it's really one archive with two kinds of view of the same content. And the last line is the exact command used to serve this actual presentation. Conclusion: this is a new way of thinking. You could use the archive directly instead of extracting it, so we can reinvent the whole workflow by thinking about using the archive directly. It's a new way of thinking, it's generic, it's a command-based tool
that can adapt to different usages. But it's pretty new: maybe some crashes, and you can expect maybe some changes in the specification. Thank you. Are there any questions? Can you repeat the question? Okay. I don't know... I know about squashfs, but the thing is that Jubako is not a file system. arx is an archive to store files, but Jubako is not, so Jubako is probably more generic than cramfs or squashfs. And arx compared to squashfs: on size arx is better, but on performance it is slower. Could we re-implement this in other languages? Yeah, you could. The specification is language-agnostic. I just implemented the reference library in Rust, but the specification is public. Sorry? Zip is pretty small, but zip is slower than arx in almost any kind of operation, and it is bigger than arx also. Thank you. |
Self-hosting for non-coders?
The open-source approach |
Yeah, sure. Check, check, check. All right, all right, all right. Hello everyone. It's actually the first time I'm talking about this project at a conference, so FOSDEM here is the first. The images are generated with Midjourney. So I'm going to talk about self-hosting for non-coders. It's somewhat of a clickbait title, but we'll get there. I have a question for you all: raise your hand if you tried hosting Mastodon on your own, a Mastodon server. All right, all right, we have a couple of hands. Now keep your hand up; only lower your hand if you think the process was easy or simple or enjoyable. Okay, all right. So this is the Mastodon docs on how to host it. This is just one page, and this is actually not even the hosting part; this is just preparing the machine. So out of this room here, I guess three or four people tried that, and I can understand them: it's not easy. And this doesn't just apply to Mastodon. It applies to many apps that you would want to self-host, many open source apps, because apps are getting complex. Does this not work? Oh, really? Does this work better? Okay, all right. So yeah, you saw the huge instruction. What we want is for everyone to be able to host Mastodon, because that's kind of the point of Mastodon, right? It's decentralized. There's Docker out there, which many, many projects use to simplify this process of hosting something. Maybe we could host Mastodon with Docker. So this would be the Docker setup, I guess. It doesn't look that much easier. I assume that most of us here are technical people, but imagine if you don't know how to use a command line: that would just be impossible. You can forget about it; you would use someone else's Mastodon server. So what if I tell you there is a way for people who don't even know how to use a command line to spin up a Mastodon instance? This is how it would look. This GIF is a bit sped up, but it actually is close to the normal speed. What's
happening here is that I'm entering a couple of config values, and Mastodon is deployed on my machine, anywhere I want, wherever I have this Gardens. Mastodon is deployed; the streaming service is deployed and configured; the Sidekiq service is deployed and configured; the PostgreSQL database is deployed and configured; the Redis service is deployed and configured. As you see, Mastodon has a lot of services. And we're done: we have a Mastodon instance. Now, it's actually open on the internet. I host a couple of those myself with this method, and this would be the URL that I end up having. Of course you can have it on your own domain if you own one. So the process, without me speeding up the GIF, took one minute forty-four seconds; I counted. You can do this locally on your own machine if you want, or you can do this in a cloud; if you do this in a cloud, that would be even a bit easier, but we'll talk about it. So this is called Gardens. It's an app, a platform that I'm making. It's open source under a GPL version 3 license, and it's not the only self-hosting platform out there that really simplifies deployments. Actually, this project is based on another one called CapRover. I don't know if anyone in the room knows it? Okay. It has a hundred million plus Docker pulls, so it's a very established project; it has been out there for many years already. It was called CaptainDuckDuck previously. But CapRover was meant for hobbyists and kind of for testing. Our aim with Gardens is to bring it more to organizations, to get it to a level where the stuff that you deploy, like Mastodon or anything else, can be used instead of a SaaS service, for example in an organizational setting. So the technical part of this is roughly this. It's a web app, and you deploy the web app on your own server, so even the web app is on your own server. We have a website for non-technical people where they can connect their cloud account to deploy to their server, but
after they do that, everything else happens on their machine. This web app talks to the Docker API. It uses Docker Compose, which you actually cannot use through the API, so there is some processing there to deploy the containers for these apps. So, for example, the various services that Mastodon spins up, like I was showing, are all in separate containers. And of course you can deploy not just Mastodon like this; you can deploy Jitsi Meet, you can deploy WordPress, and I'm going to show a couple more examples. Then we use nginx to expose these apps on the web and allow the actual end user to interact with them. Now, what I mentioned for non-technical users: right now we support DigitalOcean. Let's say you don't know how to use a command line, but you want to have Gardens on your VPS or your machine. There's an OAuth flow: you connect to DigitalOcean, and then there's a no-code process where you spin up Gardens, and from then on you can use it. But there's also a local process where you just run one line in the CLI: you basically pull the Gardens container, and then Gardens does everything from there. I mean, it does talk back to us to get the list of available apps, but our end goal is for it not even to talk back, to really be isolated in your environment. So these are just some examples of the apps we have in Gardens. There are really a lot of good open source products coming out recently. Of course there are established players like WordPress, and there is stuff to replace social media, like PeerTube. Here are just some examples of what those replace. Of course, with Mastodon you can replace something like Twitter. Penpot, here at the bottom left, can replace Figma; they actually had a talk previously at the conference. Baserow or NocoDB can replace Airtable, Jitsi Meet can replace Zoom, Outline can replace Confluence, and there are other alternatives too, so you can choose whatever open source product you want to host. We have
about 130 apps, a bit more, available right now that you can self-host with one click, but there is also an option to deploy your own apps, or even to connect to GitLab, let's say, and have a bit of CI/CD going if you're more technical. As I mentioned, the process of spinning up those apps is mostly based on Docker Compose. There are options to even spin up from a tar archive and so on, but this is the main method. We do some processing of the Docker Compose files that are taken from the official repositories of these app developers, and then we just spin this up for you. So Gardens is just one example. I'm talking about it because I'm making it. CapRover is what we forked, because it's a very reliable, very nice service. Coolify is a newer service similar to that. The main difference, I'd say, is that for the proxy we use nginx and they use Traefik. And YunoHost covers the same use case, to self-host stuff, but it's a bit different because it's a Debian distribution, so you kind of replace your operating system with it, whereas Gardens, CapRover and Coolify are all apps that you can put on Ubuntu, Debian, and I think on CentOS as well. I actually got it working on macOS, but that's more experimental. So why do we even want to self-host? Let's go back to that. I'd say that if we want software freedom, which is, I guess, the point of open source in a lot of ways, you have to be able to self-host. But as we can see from here, just several people hosted Mastodon, while I imagine many more here use Mastodon. If you're not using this freedom, or if you're not able to because the technical side is too difficult, that kind of defeats the purpose, right? So if you cannot host, you don't really have freedom, even if you have it technically or formally. So I'd argue that not having this freedom is bad, and bad not just in an ideological way but also in a competitive way, if open source wants to be able
to compete with all these SaaS apps out there, because this is something that open source has that proprietary apps just don't have, and it's an unfair advantage, I think. I believe that right now the way organizations work is that they use proprietary SaaS by default, and they self-host when they are really worried about privacy, or when there are some kind of sovereignty concerns or something; they don't care about open source. How it should be is that you self-host instead of SaaS by default, and you choose a SaaS where it's too complex, or you want the scale, or there are some specific requirements, or if you want to support the developers, because that's one way they earn money and can financially support the development. To bring some numbers in there: I've been hanging out on Reddit, on the self-hosted subreddit, and I found the statistics. Over the past seven years or so, the number of people there skyrocketed to more than 200,000 people on the subreddit, so it looks like people are actually interested in hosting their stuff. But I also ran a poll, with about 1,250 participants, where I found out that people on that subreddit don't really self-host anything business critical, so to say, or anything for their organization, anything for productive use, or at least not a lot of them. So only about a hundred people self-host, let's say, Nextcloud or something, but most of them self-host for media, so for movies or something, or for personal use, so maybe something for their smart home. That, I think, is a pity, but I can understand why people are not self-hosting: because it's complex, like we discussed. For organizations, it's because it's complex but also because of these issues: you don't just want to spin up an app. There's also a part that comes after you deploy, which is about maintenance, about making sure that your instance is secure, and if you get a lot of users you also want to have an option to scale. So how we deal with that is: for maintenance,
for each app, for each service that I was showing you, let's say for Mastodon, you can view logs in the graphical interface, so you don't have to SSH to your instance to view logs. We keep a version history. We track analytics for your instance with Netdata, so you can see, let's say, how much CPU is utilized and how much RAM is utilized, so you can kind of check the health. Netdata actually provides nice notifications, so you can even get notifications where you want, or have a webhook to get updated when, let's say, you're running out of CPU or RAM. For security, we cover that with automatic SSL for all apps. You can force HTTPS, and we have basic auth. The SSL we do with Let's Encrypt. For scalability, and that's actually the part I love the most: if you have a lot of users, you can scale either by adding more instances of the same app on the same machine, and Gardens will redistribute the load automatically, or you can scale by adding more machines into a Docker swarm. So we support Docker Swarm; this is the same for CapRover. So I'd say people want to self-host, but it's hard and we need to help them, otherwise they just use SaaS like they do currently, and this is one way to do it. You can help this effort: if you're a developer, by maintaining Docker Compose files or documentation, just a way to self-host your application. If you're a sysadmin or DevOps, you can think about providing a platform like this to your organization, so that people in your organization can spin something up without having to bother you. And everyone here, of course, should use open source instead of SaaS, and you should try self-hosting. If you find self-hosting hard, you can check out some of those self-hosting platforms. And of course, host not on AWS or something; you can use some of those hosters. Cloud68 have been giving a talk here a bit earlier, CHATONS is a French network of good hosters, Librehosters is another one, so look at other hosting solutions. Thank you, and yeah, have a good rest of
the conference. |
Libre-SOC: From architecture and simulation to test silicon, and beyond
A design for a fully documented and transparent hybrid CPU-GPU-VPU core, for a family of System-on-Chip products |
Okay. The Libre-SOC project is creating a free and open source chip design for a family of system-on-chip products, for powering routers, cell phones, laptops, maybe your laptop in the future. It uses the Power instruction set, augmented with 3D opcodes accelerating 3D graphics, and also video acceleration opcodes and audio decoders and encoders. We need it to avoid proprietary binary blobs and drivers, with no reverse engineering needed to support our hardware. And well, it is hard to do, so we do it with grants and donations. We use a pool of community experts on Usenet, IRC and academia, and also commercial partners which will produce our chip, as you will see later. So the architecture is a traditional fetch, decode, issue and execute pipeline, but to increase performance we use parallel decoders to decode instructions in advance, and vectored issue, so one instruction can generate many results at one time, utilizing the function units of the chip with the parallel execution units, and managing all of this with a scoreboard dependency-tracking design. Well, we started from zero, from the Power ISA instruction set specification as published by the OpenPOWER Foundation. It is an open standard and you can submit extensions to it, and that's one of the reasons we chose the Power architecture. So Power has this big manual with all the assembly instructions and all these formats. We take these formats, these tables, and auto-generate a Python decoder from these specifications. The specification of the Power architecture also has pseudocode, which is meant for humans, but we use it for auto-translation to a Python simulator. So we start from the beginning, just simulating instructions in software. Then we use each simulator to test the next one: the hardware simulator is verified against the software simulator, and finally the FPGA and even an ASIC.
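The idea of driving a decoder from the specification's format tables can be sketched as follows. This is a hand-written miniature, not the project's generated decoder: the two tables below cover only the D-form `addi` and the XO-form `add`, with field positions and opcode numbers taken from the Power ISA as I understand them, and the names `FORMS`, `OPCODES`, `extract` and `decode` are made up for the illustration.

```python
# Field layouts as (name, first_bit, last_bit) using the Power ISA bit
# numbering for a 32-bit instruction word, where bit 0 is the most
# significant bit.
FORMS = {
    "D": [("PO", 0, 5), ("RT", 6, 10), ("RA", 11, 15), ("SI", 16, 31)],
    "XO": [("PO", 0, 5), ("RT", 6, 10), ("RA", 11, 15), ("RB", 16, 20),
           ("OE", 21, 21), ("XO", 22, 30), ("Rc", 31, 31)],
}

# (primary opcode, extended opcode or None) -> (mnemonic, form name)
OPCODES = {
    (14, None): ("addi", "D"),
    (31, 266): ("add", "XO"),
}


def extract(word: int, first_bit: int, last_bit: int) -> int:
    """Pull one field out of a 32-bit word given MSB-first bit positions."""
    width = last_bit - first_bit + 1
    shift = 31 - last_bit
    return (word >> shift) & ((1 << width) - 1)


def decode(word: int):
    """Look up the instruction and return (mnemonic, {field: value}).

    Fields are returned as unsigned integers; sign extension of
    immediates such as SI is left out of this sketch.
    """
    primary = extract(word, 0, 5)
    if (primary, None) in OPCODES:
        mnemonic, form = OPCODES[(primary, None)]
    else:
        mnemonic, form = OPCODES[(primary, extract(word, 22, 30))]
    fields = {name: extract(word, first, last) for name, first, last in FORMS[form]}
    return mnemonic, fields
```

Because the decoding logic is nothing but table lookups, adding an instruction means adding a table row. That is what makes generating the tables from the specification practical, and the same tables can then feed both the software simulator and the hardware description.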
Let's jump here. So imagine you have an add instruction coming in, and the ALU has to process it. But before processing it, it has to receive operands: add what? A and B. But A and B can be the result of a previous instruction which is still being processed, so it has to wait. Wait where? In here, in here and here. When the read transaction has completed, it will fill the buffers, then the add instruction can proceed, and it will generate a result and condition codes. But maybe you cannot write them right now, because you would overwrite another instruction, so we wait here and here also. This has to be managed, so one of my tasks is to simulate this to see if it works well, and to do what? Formal verification, which is so, so good, so, so interesting. With normal simulation you just throw in random inputs, maybe, and some test cases, but how do you know that you didn't miss a corner case? Well, formal verification is like trying everything at once. Actually, it starts from a bad result that you don't want to have, and it shows you the input which reaches that bad result. Yes, so that's a bit of the thing. We have a simple core. At first we do not run these function units all in parallel; there was just one, to test that all of this is working. We put it here, then we read an instruction, decode the instruction and then run the instruction, terribly slowly, but we validate the function units and the decoder.
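The contrast between random testing and "trying everything" can be made concrete on a very small circuit. The sketch below is not formal verification as done with a real tool, which reasons symbolically with a solver and works backwards from the bad state. It simply enumerates every input of a 4-bit gate-level adder and reports the first input pair that produces a wrong sum, which is only feasible because the input space is tiny. All function names are made up for the illustration.

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int):
    """One-bit full adder built from XOR, AND and OR gates."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_add(a: int, b: int, width: int = 4):
    """Chain full adders bit by bit; returns (sum modulo 2**width, carry)."""
    carry = 0
    result = 0
    for bit in range(width):
        total, carry = full_adder((a >> bit) & 1, (b >> bit) & 1, carry)
        result |= total << bit
    return result, carry


def find_counterexample(width: int = 4):
    """Check the property 'carry and sum together equal a + b' for every
    possible input pair. Returns a failing (a, b) or None if none exists."""
    for a, b in product(range(1 << width), repeat=2):
        result, carry = ripple_add(a, b, width)
        if (carry << width) | result != a + b:
            return a, b
    return None
```

If `find_counterexample()` returns `None`, the property holds for all 256 input pairs, with no corner case left untested. A formal tool gives the same kind of guarantee for designs whose input space is far too large to enumerate, and when the property fails it hands back the offending input, as described above.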
The next step, which we did already, is that we vectorized it. So we put in SVP64: because these are 32-bit instructions, you add a vector prefix to them, and this vector prefix will tell you to read a predicate. So from the vector you say, no, I don't want all of them, I want the even ones, or the odd ones, or the ones which pass a test, like if-then-else but vectorized. And then you run the vector loop, so one instruction again may take the place of many, many instructions. And, well, now we go to the next steps. We have this working; now we have to do it in parallel. We want to have performance: it is working, now performance. To be performant, while you are executing one instruction you need to be decoding the next one and fetching the one after that, and if there is a jump instruction by chance and it doesn't match what you are fetching, you have to reset the pipeline. Yes. Well, where are we right now? We have a development environment that any of you can download and test on your computer. It runs in a chroot, and then you can do make test to run the tests, and if you have an FPGA board you can compile the bitstream and put it onto a supported board. And we even did an ASIC with it. Well, for the ASIC we need the PDK, which is a process development kit that the factories don't give to you freely, so that part is done by a third party; we don't touch proprietary stuff. But it was done, yes, and we hope in the future to have a free PDK for it. So the FPGA is booted. We have bare metal, Arduino-like, running on the FPGA; Zephyr OS was ported with network, so networking was proved, and Linux with a serial console. Yes, we have the test silicon with that little simple core, and it is being carefully tested, because you have few chips produced and you want not to burn them, so they are tested in a lab. Yes, and work is underway with the new instructions, the vector instructions already, and the new instructions have been submitted to the OpenPOWER Foundation for standardization, and we are porting algorithms, cryptographic algorithms and multimedia, etc.
So what do we aim for? We aim to port and boot a Linux distro in the future. Eventually we want to have a full toolchain, GCC with vectorization, and to define the extensions to include the texture opcodes for 3D acceleration. So, you notice, there will not be a GPU: the CPU will be the GPU. And, well, we need hardware and software developers and testers, and also, well, documentation. Optional? No. Okay, so who will build the chips? Well, you just have research money, right? Well, who will produce thousands of chips for the market? We are partnered with RED Semiconductor, which has the mission of producing these chips, producing a powerful and power-efficient chip with our core. So if you see some of them with this logo on their shirts, you can talk to them. You're here, hello David, hello people. So that's it, thank you very much. Thank you very much for the presentation. There's a few minutes left for questions. In the back of the room I see someone waving, just a moment. Thanks for your presentation. You said that you had some test chips going on. What's the status of the bring-up, like how far did you get in the bring-up process? Okay, we know the clock is working. Well, the ASIC is not only for Libre-SOC but also for the academic institution to test their designs, so they are testing the clock generation, and, well, we know the clock generation works, just that. So maybe anyone else with a question? Okay, thank you for your attention. |
Get Started with Open Source Formal Verification |
The next presentation will be by Fabien Chouteau. He will give an introduction to formal verification and will teach us how to mathematically prove the absence of bugs in your software. Thank you. I need the timer because I have quite a lot of things to say. Hi everyone, I'm Fabien, and today I want to talk about formal verification, open source formal verification. First, a disclaimer: I'm not an expert in formal verification, but I'm a user of this technology. What I'm an expert at is embedded software, but I use formal verification and, as I will explain later, I work in a company that's developing some formal verification solutions. If we look at what Wikipedia says, formal verification is the act of proving or disproving the correctness of intended algorithms using formal methods of mathematics. So in practice, what does that mean? Let's take a very trivial example. We look at this line of code and we want to prove that it's correct, that it never fails. First we will have to look at what can go wrong. So for instance here we can say, well, there's potentially a division by zero, right? That's bad. So if we want to prove that the line is correct, we have to prove that x minus 10 is different from zero, which in turn means we have to prove that x is different from 10. This is known as a verification condition, something that has to be true for our program to be correct. Now if we look at this line of code in a context, just a trivial example, here we see that there is an if statement that's guarding the expression. So we know that x is always different from 10 here, and therefore we know that there's no possible division by zero in this piece of code. So that was easy, right? But now let's look at another very trivial piece of code. Are you able to spot all the verification conditions? Are you able to check whether they are respected or not? This is actually very, very difficult. Most programmers will know some of the things that can go wrong.
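To make the division example above concrete, here is a minimal Python sketch. The slide's exact line of code is not in the transcript, so the expression `100 // (x - 10)` and the names `unguarded`, `guarded` and `find_counterexample` are assumptions made for illustration. The sketch searches a bounded range of inputs for one that violates the verification condition `x != 10`.

```python
def unguarded(x):
    # Verification condition: x - 10 must not be zero.
    return 100 // (x - 10)


def guarded(x):
    # The if statement establishes x != 10 before the division.
    if x != 10:
        return 100 // (x - 10)
    return 0


def find_counterexample(func, domain):
    """Return the first input in `domain` that triggers a division by zero."""
    for x in domain:
        try:
            func(x)
        except ZeroDivisionError:
            return x
    return None


DOMAIN = range(-1000, 1001)
bad_input = find_counterexample(unguarded, DOMAIN)
no_bad_input = find_counterexample(guarded, DOMAIN)
```

This is exhaustive testing over a small range, not a proof: a prover reasons about every possible value symbolically, which is what makes the approach scale beyond ranges that can be enumerated.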
Most of the time we will forget what they are. I'm looking at you, integer overflow. And so that's why programming correctly is very difficult for human beings, because we are not able to keep in mind all the verification conditions and play with them and check them all the time. Again, this is a very simple piece of code. And so what formal verification is, and in particular automatic formal verification, well, the goal is to have tools that will extract the verification conditions from the code and then run a mathematical proof to check that they are respected. And so today I want to talk about one framework for automatic formal verification, which is called SPARK. So SPARK is both a set of tools and a language to perform automatic formal verification. On the tool side, we have different tools that are working together to achieve this goal. The first one is GNATprove. It's developed by the company I work for, AdaCore. It's going to take SPARK code as the input and translate it to another language called WhyML. Then we have the tool itself, Why3, developed by Inria in France, a research institute, which again translates the code, extracts the verification conditions and calls the different solvers, which are on the right here, to ask them to prove the verification conditions. So we have different solvers that are good at different kinds of problems: Alt-Ergo, CVC5, Z3. And so this full toolchain is open source and developed by different entities, as I mentioned. The solvers, for instance, are not only used for SPARK, they are used for other formal verification frameworks, but all of them work together in this framework. And so on the other side, SPARK is also a language. Actually, SPARK is a subset of the Ada programming language. And so the question you may ask is, why would you use Ada? Why would you use a subset of Ada for formal verification? So I'm going to take just two simple examples.
Why is Ada great when you want to do formal verification? Well, this language provides a lot of specification power. The developers can express very precisely what they want from the program, from the code, which the formal verification framework will then be able to check. Just a simple example: if you program in any other language and you want to have a percentage value in your application, for instance for the completion of a process or whatever, usually you'll say, OK, my percentage is a float, and I'm going to use values from 0 to 1. And if you are an extremely good programmer, you're going to write that in a comment. Like, you know, my lowest value is 0, my highest value is 1. And that's just a comment. And then if, five years down the line, you want to say, oh, actually, it's better if it's from 0 to 100, what happens to the comment? Maybe you update it. Maybe you don't. How do you make sure that all the code is updated to follow this new rule? That's very difficult to maintain. So, one example with Ada: you just define your own type and you say it's a float. It's a new float, a different kind of float, that only has valid values between 0 and 1. And so what's great here is that the solvers that will try to prove the code get a lot of information from this. First, they will add verification conditions when you try to cast from a regular float to this percentage value, because if the value is not within the range, that's a bug in your program, and you want to know it. Also, you can extract information. Like, if you take two percentage values and you multiply them, you also know that the result is between 0 and 1. And so in turn, that means a lot of information for the provers to do their work, and so they will be able to reason more automatically and prove that the program is correct. Another example is the contract-based programming of Ada. So you can have preconditions and postconditions on your subprograms.
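The percentage type described above can be approximated in Python to show the idea of putting the range in the type instead of in a comment. This is only a sketch: the `Percentage` class is hypothetical, and unlike SPARK, which proves the range statically, this version can only reject a bad value at run time.

```python
class Percentage:
    """A float constrained to the closed range 0.0 .. 1.0 (illustrative)."""

    LOW = 0.0
    HIGH = 1.0

    def __init__(self, value):
        # Converting a plain float is where the range check belongs:
        # a value outside the range is a bug in the caller.
        if not (self.LOW <= value <= self.HIGH):
            raise ValueError(f"{value} is outside {self.LOW} .. {self.HIGH}")
        self.value = float(value)

    def __mul__(self, other):
        # The product of two values in 0 .. 1 is again in 0 .. 1,
        # so this construction never fails.
        return Percentage(self.value * other.value)


half = Percentage(0.5)
quarter = half * half

try:
    Percentage(1.5)
    rejected = False
except ValueError:
    rejected = True
```

Changing the range later means editing `LOW` and `HIGH` in one place, and every conversion is then checked against the new rule, which is the maintenance benefit the speaker describes.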
So a precondition is something that has to be true when you call the function or the procedure. A postcondition is something that has to be true when you return from the subprogram. So, a very simple example here with a stack. We implement the stack. We have functions to know if the stack is full or if it's empty. And we implement the push. And obviously, well, not obviously, but in this API, we say it doesn't make sense to push something on the stack if the stack is already full. So never do that. That's the contract that the procedure is asking from the caller. And then the implementation says, if you push something on the stack, well, it's not empty anymore at the return. And so with SPARK, what's great here is that you have formal verification on both sides. On the caller side, SPARK will prove that you never call the push function when the stack is full. And on the implementation side, it's going to prove that when you return from the push function, the stack is never empty. And so that's going into functional correctness verification, which means your software is doing what it's supposed to do and only what it's supposed to do. And so with the integrated pre- and postconditions in Ada, and other features that I don't have time to mention, well, this makes Ada a very great language for formal verification. So why should I care about SPARK and, I would say, formal verification in general? With SPARK, you can have a mathematical proof that there are no possible vulnerabilities in your application for any possible inputs. That means no buffer overflow, no division by zero, no integer overflow and so on. If you go beyond that and you use contracts, you can also prove, as I mentioned, the functional correctness. So the program does what it's supposed to do and only what it's supposed to do. And in turn, that means you can avoid some of the testing effort because, for instance, unit testing is more or less trying to achieve the same goal.
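The stack contract described above can be sketched in Python with explicit run-time checks. The `Stack` and `ContractError` names are hypothetical, and the important difference from SPARK is that these checks only fire when the code runs, whereas the prover shows at analysis time that they can never fail.

```python
class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


class Stack:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def is_full(self):
        return len(self.items) == self.capacity

    def is_empty(self):
        return len(self.items) == 0

    def push(self, item):
        # Precondition: the caller must not push onto a full stack.
        if self.is_full():
            raise ContractError("precondition failed: stack is full")
        self.items.append(item)
        # Postcondition: after a push the stack is not empty.
        if self.is_empty():
            raise ContractError("postcondition failed: stack is empty")


stack = Stack(capacity=1)
stack.push(42)

try:
    stack.push(43)  # violates the precondition
    violation_caught = False
except ContractError:
    violation_caught = True
```

The precondition is the caller's obligation and the postcondition is the implementation's obligation, which matches the two sides of the proof the speaker mentions.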
So if you already have a mathematical proof of the functional correctness of your code, you don't need to do unit testing anymore. And so you're going to save time on that. Recently we published a case study from NVIDIA. A few years ago, the NVIDIA security team was questioning their methodology for security and how to achieve their goals in terms of security. And so they said testing security is pretty much impossible. You cannot test all possible combinations of all possible values for your application. And so they decided to try provability, and they selected SPARK as an experiment. And now they are deploying more and more SPARK in the GPU. So if you get the latest, greatest NVIDIA GPU, there should be some SPARK-proven code embedded in the firmware, which lets them actually focus on other parts of the security, on more interesting verifications and more interesting security properties of the application. They don't have to deal with buffer overflows and integer overflows. All the low-level stuff is already taken care of, and they can focus on more interesting points. So now let's do some proof. For the Ada and SPARK programming languages, we have this package manager called Alire. So here are a few instructions to write and prove your first piece of SPARK code. You download, then install the package manager. Then from the command line, we start by creating a project, or a crate, with alr init. We enter the directory. We add the GNATprove tool suite as a dependency, so it's going to download everything and set it up for you. Then we write some piece of code. You can recognize our very nice equation here. As for the declaration of the X constant, it doesn't matter what it is; I'm just taking an integer value that SPARK doesn't know, just to make sure I'm not cheating or anything. We go to the console again. We run GNATprove. And so GNATprove will tell us, well, there might be a division by zero error at this point.
So as you can see, the message is pretty clear. Actually, it can be even better than that, because the tool can provide counterexamples. So if we add the switch --counterexamples=on, GNATprove will say the division by zero check might fail, for instance, when X equals 10. And so that's pretty easy to fix. We just add this if statement, and we run the tool again. And that's it. We proved our first piece of code. So as you can see, it was easy. If you want to try and learn a little bit of SPARK, we have an online website, learn.adacore.com. It's an online interactive website, so you don't even have to install what I showed before just to learn and try the tool suite. There are different chapters and one specific to SPARK. So that's one way to get started. And for those who wondered about the piece of code from before, there are seven potential bugs or errors in that one. So I'll leave it to you as an exercise to fix it. Thank you very much. Thank you for the presentation. Let me unwrap you with a shot. Perhaps someone might have a question about that. |
NGI Search and OpenWebSearch.EU projects
Two sister initiatives for a paradigm change in open search and discovery on the internet |
So, thank you. I'm Aurora from the University of Murcia. And I'm Michael from the University of Passau. And we will be presenting the NGI Search and OpenWebSearch.EU projects, two sister initiatives for a paradigm change in open search and discovery on the internet. We will have a common introduction and then each of us will delve into the project that we are involved in. A short disclaimer before we start: in the last two days at FOSDEM, we have heard a lot about personal lifetime projects. This is quite different, because it's not personal. There are European institutions involved. And it's not lifetime; these projects are just starting up, have nice ideas and require your attention and contribution. So, NGI Search is a European project that will welcome entrepreneurs, tech geeks, developers and socially engaged people that have challenging ideas about the way that we will search and discover information and data on the internet. So basically, we will fund projects that are focusing on several topics that I will explain in a minute and that are compliant with the European values of openness, transparency, privacy and trust. The applicants can be natural persons and organizations, and they can apply individually or as a consortium of up to three team members. OpenWebSearch.EU is a project funded by the EU and has the goal of developing an open European infrastructure for web search. There are several research and computing centers involved; the project started in September last year and has a time frame of three years. It's quite interesting that it's the first project which is funded by the EU and has the goal of developing Europe's own web index. And for those who don't know, a web index is a data structure which allows fast access to web data. It is the fundamental core of all current search engines, and it enables the development of all kinds of web search and web data retrieval services. So for us, let me present to you the project partners.
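As a rough illustration of the web index definition given above, the sketch below builds a tiny inverted index, the classic data structure behind keyword search. It says nothing about how OpenWebSearch.EU actually builds its index; the function names, the page texts and the example.org URLs are all made up for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages.
pages = {
    "https://example.org/a": "open web search for Europe",
    "https://example.org/b": "open source formal verification",
    "https://example.org/c": "web index for search engines",
}
index = build_index(pages)
hits = search(index, "web search")
```

The point of the structure is that a query only touches the entries for its own words instead of scanning every page, which is what "fast access to web data" amounts to at this scale.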
We are Linknovate, Aarhus University, the University of Murcia, OW2 and FundingBox. These are some of our faces. We will organize five different open calls and we will welcome these innovative projects. We will provide them financial support of up to 150,000 euros, and we will provide not only financial but also technical, business and innovation support, and the projects are expected to take 12 months. The first open call is already closed, and the evaluation is ongoing. It was a technology-based call, meaning that we were looking for products, finalized products. We will select nine to 10 projects in the topics that you see on the screen: voice assistants, NLP, semantic analysis, social computing and data visualization. But let me delve a little bit more into the second open call topics, which may have caught your attention. I hope some of them resonate with you. The second open call will open in April, on the first of April, and it's a little bit more research oriented, but please have a look, because we have plenty of space for everything. And let me explain each topic. The first one is cognitive search powered by reinforcement learning. Here we look for mechanisms of self-learning and all kinds of algorithms that can contribute to a reinforcement learning system that is then able to learn, from the interactions, how to choose the right data and algorithms in order to make any search more relevant to its objective. Then the second topic is machine-based data, Internet of Things data. Here we look for algorithms for search and pattern discovery that are adapted, or that can adapt, to IoT characteristics, which means we will look for edge computing, and for algorithms that can deal with time series, with events, with different locations and so on. We also have AI-based taxonomies. We look for the automatic creation and expansion of existing taxonomies that are machine based, with machine-interpretable semantics, by using AI techniques.
Basically we want to model the interdependency among new concepts, so we want to adapt to the dynamicity of the data. The next topic is network analysis. Basically we want to find other ways to create knowledge graphs, which are an interlinked network of distributed resources that can be searched. The objective can be to make them more scalable, to test the quality of the created models, to create different semantics, to make them compliant with the diversity and dynamicity of data and so on. Then we have AI-based search tools and content generators. As you know, nowadays AI-based content generators are quite popular. Here I have some names that have been coming up at FOSDEM many times: ChatGPT, GitHub Copilot, co-PAI. So we welcome projects that can help evaluate the privacy of these kinds of tools. We also look for research on the logical gaps in them, as well as on the completeness of their answers. Then we have the topic of ethics in search and discovery. We want to have projects that specifically focus on testing that the developments in search and discovery are compliant with human rights, that they are not biased against minorities or by gender, so that they promote equality, but also that they are private and take care of data confidentiality. And finally we have new ways of discovering and accessing information, which is a broad topic, so that we don't leave anything behind. How are we going to do that? We are going to provide technology mentoring and advice on technologies to access, store and manage data, and also on algorithms and platforms that are tailored to the projects. We are going to leverage the ReachOut beta testing platform, a platform for beta testing campaigns. Then we are going to link the projects to different standards and promote conversations with the different foundations that are related to them. We will have a program for market readiness.
We will create workshops for pitch training, so that they can try to pitch their solution or their project to potential investors and users. There will be business modeling advice and coaching, and as for innovation, we will give information about which kinds of open source licenses they can use, how to manage them, and how to make their research and their solutions reproducible, following open science principles. So, to conclude my part, in the next three years there will be five open calls that will challenge the way that we search and discover information on the Internet. We have synergies with the NGI mission of looking for a more human-centric Internet, and these are our core values: open source, contributions to the wider community, collaboration and open science principles, as well as transversal challenges that include sustainability and equality. I give the floor to my colleague. Thanks. So now, switching to the second sister project, OpenWebSearch.EU. As I said before, there are several computing and research centers involved in the project, and it's good this way because all of these have different competences, like universities or businesses, and this underlines one specific point about the project. In contrast to the web index which is operated by Google, for example, this project does not follow a centralized approach, but tries to build up a web index collaboratively and distributed among several European institutions; otherwise it would just not be possible to build up a web index from scratch. I will keep the motivation short, because it should be clear why it is a good idea to have your own web index in Europe. The first thing is the imbalance in the search engine market. There are four big global web indexes out there from big tech companies, and the dominance of these companies has a lot of negative effects on the critical infrastructure of web search, so to say, and the solution is to strive for more plurality.
Another thing is that web data is a driver for innovation, and an example, just one out of many, that has drawn a lot of attention lately is the training of large language models such as ChatGPT, which is fundamentally based on web data. The project goals are developing the core of an open web index. Two remarks on this point: this will be done with open source and open configuration, and it is not expected that at the end of the three years of the project there will be a production-ready index. The goal is to have an index which contains at least 50% of all text web pages, which can then be worked on and which shows, as a prototype, that it is possible to create it collaboratively and distributed among several institutions. Another goal is to build an ecosystem around the index and make it publicly available in this way. As I said before, it serves as a feasibility study in some way, to show that it is possible to do it collaboratively, and that is why a network among European infrastructure partners should be established along the way. The overall vision of the project is to give innovators, businesses and researchers open access to web data, to enable them to build new business ideas, for example, or to work on their ideas on web search and web analysis. As for the NGI Search project, there are also ways to contribute to OpenWebSearch.EU through third-party calls. There will be three public calls with a fixed amount of funding, and the first of them focuses on the legal aspects of the project. There are two tracks: the first one focuses on contributions on legal, business and social aspects of web search in general, and the second track focuses on the legal compliance of the crawling, so the acquisition of the data from the internet which is then stored in the index. This call will open on the first of March this year.
There are also other opportunities to contribute, which are covered by the upcoming calls, and this coarse overview of the architecture should just give you a hint of where these contributions can be located. So if you have an idea on how to develop a search and discovery application, this can be done, for example, as a vertical search engine on top of the web index infrastructure, or it can be a content analysis method which enriches the data which is then stored in the index. So, as a conclusion, the OpenWebSearch.EU project wants to open up web search and strives for more plurality, to give new business ideas a chance and new alternative search engines a chance. Therefore the project partners collaborate and build up a European open web index which is then publicly available. For researchers, innovators and businesses there is the possibility to contribute, either by developing your own business model which builds upon the open web index or by getting involved in one of the three public calls. The first one of them, as I said, opens on the first of March. So thanks for your attention, and we are open for questions. |
FOSDEM infrastructure review |
Okay. Hello, everyone, and welcome to the last lightning talk of the conference. I hope you've enjoyed yourselves. This will be the FOSDEM infrastructure review, same as every year, presented by Richard and Basti. So normally I do this thing, but Basti has been helping a ton with the ball of spaghetti and spit and duct tape which I left him. He turned it into something usable. So I'm just going to sit here on the side, and I'm here for the Q&A. But for the rest, it's Basti, and it's his first public talk for real, so give him a big round of applause. Well, thank you. I hope I will not screw this one up. Okay. So we'll have about 15 minutes, 10 minutes of talk and five minutes for Q&A, and I hope it's somewhat interesting to you. First, the facts. The core infrastructure hasn't changed that much since the last one, FOSDEM 2020. We're still running on a Cisco ASR 10K for routing, ACLs, NAT64 and DHCP. We have reused several switches that were already here from the last FOSDEM. They're owned by FOSDEM. These are Cisco C3750 switches. We had our old servers, which are now turning 10 this year. They were still here, and they will be replaced next year. We have done, like all the years before, everything with Prometheus, Loki and Grafana for monitoring our infrastructure, because that's what helps us run the whole conference here. And we've built some public dashboards, and we just put them out on a VM outside of ULB, because we were running out of bandwidth, like the years before, and I'll come back to that later. We have a quite beefy video infrastructure. You might have seen this one here. It's a video capturing device. It's called a video box here at FOSDEM. It's all public. It's all open source, except one piece that's in there. You can find it on GitHub if you want to try to build it yourself. Go ahead, just grab the GitHub repo and clone it. There are two of these devices, one at the camera, one here for the presenter's laptop.
They send their streams to a big render farm that we have over in the K building, where every year our render farm is running on some laptops. So the laptops send the streams off to the cloud at Hetzner, and from there we just distribute them to the world, so everyone at home can see the talks. We have some sort of semi-automated review and cutting process. Those of you who have been talking here may have known SReview for years. This is the first time it's running on Kubernetes, so we are trying to go cloud native as well with our infrastructure. Just to show how it all has been held together: these are our video boxes. I don't know if you can see it. We've got those Blackmagic encoders here that are turning the signals that we get, like SDI and HDMI, into a useful signal that we can process with the Banana Pi that we have in there. Everything is wired up to a dumb switch here, and then we go out like here and have our own switching infrastructure inside those boxes. There's an SSD below here where, in case of network failure, we just dump everything to the SSD as well. So hopefully everything that has been talked about at the conference is still captured and available in case of a network breakdown. Those boxes also have a nice display for the speaker, so we can see if everything is running or not, which makes it easy for people to operate these boxes. You don't have to be a video pro. You just have to wire yourself up to the box. You see a nice FOSDEM logo and see, okay, everything is working, and you're done, and everything gets sent out. This is how the video system is actually working. All this can be found on GitHub. You don't have to take screenshots of that. If you would like to see it, we will tear down this room afterwards, so everyone can have a look at the infrastructure we're using, because it's not being used after this talk. You see there are quite some interesting things to do.
These are the instructions that all our volunteers get when they wire up all the buildings here in one day, on Friday. They are not here, but they should be given a round of applause, because they are volunteers that are really doing the hard work and building up the complete FOSDEM in one day. So maybe it's time for a round of applause for them. So here we have the other thing. This is also on GitHub, where you can see where everything is coming from. We have the room sound system. This is what you're hearing me through. And we have a camera with audio gear and speaker laptops, and it's all getting pushed down until it reaches your device down here. There's a ton of services processing it in between, and this is almost all done with open source software, except for the encoder that's running in there, which is from Blackmagic Design. So how is it processed? We have a rendering farm. These are the laptops. It's 27 this year. For those of you who don't know, those laptops are being sold after FOSDEM. So if you want one, you can grab one. This year they are already gone, but for next year, maybe you want to have a cheap device. You can have them with everything that's on them, because we literally don't care about that. You can have them once things have been processed after FOSDEM. You can see it here. We have some racks where we just put them four-wise, and we have 27 of them. We have some switch infrastructure that is used for processing all that stuff, and this one's not running out of bandwidth, but we're coming back to what is running out of bandwidth. You might see this mess over here. This is our internet, and it looks like every common internet on the planet, and this is like our safety net. We have a big box here where all the streams go, and this will be sent out to Bulgaria, to the video team, right after FOSDEM, so we have a really off-site copy of everything. So, the challenges for this year. DNS64.
In all the years before, we'd been running on BIND 9, since ages, and we switched to CoreDNS, just testing it, on the Sunday of FOSDEM 2020. We really saw a significant reduction in CPU usage, and that's why we have stuck with CoreDNS since then, and this year we also replaced the remaining BIND installations that handled all the internal DNS and all the other recursive stuff that's been used here to provide you internet access. Richie always used to give you some timelines, and that's what I'm trying to do as well. There were times when it was mentally challenging for people building up FOSDEM. We got better year by year by doing some sort of automation and getting people used to knowing what to do and having everything set up beforehand. We installed the routers. You see that on the slide. It's getting better year over year. This year we thought it would be okay, from what we know. We just set it up in January and everything worked. We came here on the 5th of January, I think, put everything up, and it just worked, which is great, and which gives you some things not to care about, because there were other things to care about. Getting the network up and running here took us a bit longer this year than in the last years, because we were playing around with the second uplink that we got. We used to have a one gigabit uplink. Last week we got a 10 gigabit uplink, and we thought, okay, just enable that and play with it, and it turned out to be not that easy to get both of the BGP sessions up and running and doing it properly. That's why it took us a bit longer this year. The monitoring was also one thing which really helps us to understand if FOSDEM is ready to go or if someone has to stay very, very late here. In the last years we've been very, very good at that. Basically everything was done in January, like the last day of January, but it's January.
This time everything was set up and running in the first half of January, and it worked, and it was really great, because some people actually got some sleep at FOSDEM and didn't need to stay here very long, because everything was pre-made: you just go and look at the dashboard, okay, this is missing, this is missing, and just have them all checked. The video build-up took a bit longer this year because we are getting old and rusty at that. Also, there were very many new faces who had never built up such a great conference. This is why it took us a bit longer, and the video team, I think, got the least amount of sleep of all the staff running the conference. This was the story so far. We closed FOSDEM 2020; I was also there in 2020. 2020 was really one of the best ones we ever had from a technical perspective. We had everything running via Ansible: just one command, then wait an hour till everything is deployed, and you're done. Cool, have some beer, some Mate in between, and everything was cool. Then we had this pandemic; for me, everything went down just a week after FOSDEM. For FOSDEM 2021 and 2022 there was no conference here at ULB, so we had no infrastructure to manage, and that was quite okay. We had to do some other things: as most of you have learned, we have a big Matrix installation to run the FOSDEM conference and help you communicate during the conference. Then there was this bad thing that the maintainer of the infrastructure left FOSDEM in between these years, and so Richie searched for someone who was dumb enough to do that. Yeah, that's me. So this year we're back again in person. Sorry, yeah, thanks.
So after almost two years we came looking for the two machines. No one had touched them; they had rebooted one or two times due to power outages in the server cabinet, but we had a working SSH key. We had tons of updates to install after literally three years. I wonder why nobody broke into the machines, because they were publicly exposed on the internet, but only with SSH and, I think, a three or three and a half year old Prometheus installation which was full of bugs. We noticed that the battery packs of the RAID controllers had been depleted, so this was the only thing that actually happened in the three years: the batteries went to zero and didn't set themselves on fire, so everything was okay. The machines worked, with just a bit of performance degradation, but everything seemed to be okay. Then we tried to run this Ansible thing from the last years, and three years later Ansible has changed a lot of things in the meantime. If you want to use a current version of Ansible with that old stuff, you end up like this. This is me: start from scratch, or fix all the Ansible roles. You can have a look at them, they're also on GitHub. So we asked, okay, how do we do this, and said, okay, Ansible be gone, we'll just fix it after FOSDEM, because we will have to renew the servers anyway and everything will change. So, the server timeline: we had the servers live on the eighth of January, and the services, DNS64 and all, around mid-January. We had centralized all our logs. This was something Richie had been looking for for ages, that we have easily accessible log files for everything that's running here at FOSDEM. It's good that we had them, because we could see things like, oh, the internet line that was supposed to be there actually came up. Nobody told us, but it came up, you see that. Thanks to the centralized logging we were aware of things like that, and then we could go and fire up our BGP sessions.
Then, two days later, we noticed that firing up the BGP sessions wasn't such a good idea, because we lost almost all connectivity. ("Stop", it says, but I don't care, I'll just keep talking.) We lost all our connectivity and said, okay, damn it, and we were in some sort of panic mode, because the reason for looking at the servers was this BIND security issue: I read the mail on the morning of January 28th and said, okay, we have to fix the BIND installations, and then you suddenly can't reach the servers anymore, and you think, okay, are they already hacked, or what's going on? We did some back and forth with our centralized logging, you see that, this is Grafana Loki that we leveraged for that, and it's been really nice for debugging things like that. We also noticed that there was an interface to our backbone constantly flapping, which we could also fix within that session. After that we said, okay, there are some MTU problems, we have to restart our BGP, and so on, back and forth, and then we finally agreed to just throw away the BGP sessions and go with the 1 gigabit line. Yesterday evening we switched to the 10 gigabit line, because we had had a congested uplink since about 11 in the morning, with so many people using too much bandwidth, and since yesterday evening everything is okay, it's better, and we are on the 10 gigabit link. Because there are not so many people here today (yesterday there were quite a few more), the link was not fully saturated, but you can tell this is the place where we could use some more bandwidth. This dip is usually time for something to eat, but at 3:30 we could actually use some of the new bandwidth that we had available. So if you want to look at all of these things, we have a dashboard put out there publicly, if you want to have a look at the infrastructure, and the Ansible repo, which will be fixed to work with current Ansible versions within the next few days. Just clone our
infrastructure, clone everything, and if you have any questions I'll be glad to take them. Yeah, fire away. As I don't see any questions then: we're about to tear down this room after this, so please don't leave anything in here, because it will be cleaned and everything will be torn out. If anyone else has a question... there's one. The question is why we use laptops for rendering. Because they have a built-in UPS called a battery, so in case of a power outage we can easily keep running on them. Also, they're very cheap for us: we can use the computing power and then sell them to the people here at the same price that we bought them for. You get a cheap laptop, we get some computing time on them beforehand, and that's the main reason for running it on laptops. Well, actually the question was why we are using the Banana Pi. That's a good question. The thing is that the capabilities of the Banana Pi were a bit better than those of the Raspberry Pi at the time the decision was made. You see there's a big LCD screen on the front of the boxes; I think it had to do with driving those LCD panels and also with the computing power available on the Banana Pi, but actually we would have to look that up in the repo, where everything is documented. Okay, yeah, there's another one in the front. So the question was whether there are any public dashboards out there. Yeah, we've put some public dashboards on dashboard.grafana.org, oh, dashboard.fosdem.org, sorry, where you can have a look at the infrastructure. We used to have some more dashboards, like the t-shirts that have been sold, but because we changed the shop, converting from something that we bought to an open-source solution, we totally forgot to monitor that. But there are some dashboards out there, and if you want to see something more, just come to me after the talk and I'll show you something more here on the laptop. Okay, yeah, another one.
The biggest one is standing here. No, actually, the biggest issue we had was that running all that stuff after three years, without having set up everything properly, was quite challenging. On Saturday morning we had to run and redo the whole video installation in the K building, because those transmitters you see here were not plugged in properly, and so we had no audio on the stream. That was one thing. Another very challenging thing was when we played around with the BGP sessions, and as I say, played, we did not engineer anything properly. It was not clear how long it would take until things were distributed to the whole net, and we were literally just trying to get information: is it working, is it not working? Until this BGP information propagates from here to the rest of the planet, like Brazil, it takes quite some time, so you can't be sure, when you set up the BGP session, that everything works, because shit will hit the fan after ten, twenty, thirty minutes and not instantly. So it's quite a problem to get instant feedback on whether things are going well or not. So the question was whether the problems with the Wi-Fi that we had here on site were due to our playing with BGP or due to something else, solar flares and so on. The thing is that we had some issues. We've been given access to the WLC, the wireless controllers; you see these boxes over there, they're centrally controlled, and we have to dig into that. We have some visibility into the infrastructure that's owned by the ULB; they've given us access so we can engineer that, but we're not quite sure what caused it. Most of the time the FOSDEM network, which is IPv6-only, was working quite well, except for some Apple devices that tend to just set up an IPv4 address even if there's no proper IPv4, and then things get complicated. FOSDEM dual stack, which is dual stack, usually worked for most of the Apple devices, but we're not very certain of it. Yeah, yeah, you will see that.
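To make the DNS64 part of the IPv6-only network a little more concrete, here is a minimal Python sketch of how a DNS64 resolver can synthesize an IPv6 address for a host that only has an IPv4 address. It assumes the RFC 6052 well-known prefix 64:ff9b::/96; the talk does not say which prefix the FOSDEM network actually uses, and the function names are made up for this illustration.

```python
import ipaddress

# Assumption: the RFC 6052 well-known NAT64 prefix. The talk does not state
# which prefix FOSDEM's DNS64/NAT64 setup is configured with.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract_ipv4(ipv6):
    """Recover the embedded IPv4 address from a synthesized /96 address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)


if __name__ == "__main__":
    synthesized = synthesize_aaaa("192.0.2.33")
    print(synthesized)                     # 64:ff9b::c000:221
    print(extract_ipv4(str(synthesized)))  # 192.0.2.33
```

A client on the IPv6-only network connects to the synthesized address, and a NAT64 gateway translates the traffic back to IPv4. A device that insists on configuring its own IPv4 address bypasses this path, which may be part of why such devices misbehave.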
There's another one. So the question is whether the live streams will be made rewindable or not. I honestly can't tell you that, I don't know. I can ask the video guys if they're planning that for next year, but there's no plan for that as far as I know. The biggest challenge was to redo things with HDMI instead of the VGA we had in previous years. But there's another one, yeah. So the question is whether we're planning to renew the servers, and do we know what is planned for next year? We'll have a talk about that next week, I think, when we go through the post-mortem, which is usually a week after FOSDEM, and then we decide on things to be bought for next year, because the switches are old and the router is also old, I think. We have one more year to go on the router, which should be fine for next year, but what after that? We have to make some decisions and some investments for next year to run this stuff, and this will be done next week, when we've all cooled down a bit and are refreshed after this FOSDEM. Anyone else? Yeah, come. So the question was which parts of the infrastructure are being reused and what we bring for the event. Well, in numbers, I think it was three truckloads of stuff, or not quite three, because the video gear arrived in a second one. We bring mainly cameras and those boxes here. Switches stay at the ULB; most of them stay here, but the ones that don't won't be here next year, because ULB is planning to do some tidying up and to give us some ports here for our VLANs. They're very, very good at working with us. We get access to most of the infrastructure: we just tell them what we'd like to use, and they just throw it on their controllers and bridge it to our servers, and we can use it and have fun with it. They will be replacing part of the network infrastructure next year, and then we will have to bring even less gear here. Which one first? Yeah. What about the rest of the year?
So the question was what about all the other stuff that FOSDEM is doing through the year: is it hosted on our own hardware, is it in the cloud, or somewhere else? Yeah, we use another company called Tegron here, it's a Belgian provider, and most of the stuff is running at Tegron during the year. During FOSDEM we also spin up some VMs at Hetzner in Germany, and they are only for the event and a short time after the event, for things like cutting videos in the cloud. They will be turned off after two or three weeks, and then everything is running at Tegron, on our own hardware there as well. So there was another question. So the question was what is being used for communication between volunteers. We have that Matrix setup; I don't know who's aware of Matrix, it's a real-time communication tool like Slack or something like that. We have used Matrix internally since 2020 for our video team's communication, then we expanded that in 2020, and then with the pandemic we opened it up to everyone, and now the volunteers are coordinated through that. We also have our own trunked radio system that is set up here especially for this event, and volunteers can also be reached via those radios. Am I correct, volunteers? Yes, yes, okay. We have two volunteers here. So yeah, we'll get you. Is there anything else you want to know? Where's the money? The question is "where's the money, Lebowski", that's the real phrase from the film. I don't actually know; I'm not yet a member of FOSDEM staff, so you have to ask someone in a yellow shirt, and there happens to be one next to me, just by the microphone. We have a money box and a bank account. Anyone else? Three, two, one. Thank you very much. |
fq - jq for binary formats |
Yes, so I'm Mattias Wadman. I'm the author of this fq tool. I'm not going to be able to show all the slides, I think. I want to do a lot of demo, because I think it's a good tool to demo. Sorry, can you talk a bit louder? I can try. What is fq? It's a tool, a language, and a set of decoders to work with data. It used to be more or less binary data, but now it's also text data, though I guess that's binary too, in a sense. It's heavily inspired by jq, both in that it is jq, the language, and in the CLI tool: it's inspired by how the arguments work and everything. And it's a tool for querying and displaying data, exactly like jq. But it also has an interactive REPL, so you can work with it in a more interactive way, and it has auto-completion and a lot of other bells and whistles to make it very nice to work with. And it's available for a lot of operating systems. I like to say it's like a debugger for files. That's how I use it. So why jq? It's a very CLI-friendly language. You don't need any newlines. The syntax is very, very terse. You can do a lot with little syntax. And it's very composable. You have this pipe, more or less like shell pipes. But it has these generators: you can even do loops and iterate and recurse using these pipes. You see these square brackets, which iterate, and dot-dot, which recurses through a tree of values, you could say. And it's also kind of a DSL for selecting and transforming JSON. You can call it JSON, but it just happens to have JSON as input and output. Internally you can call them jq values. So it has arrays and objects and numbers and strings and booleans. So it's kind of a superset of JSON.
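As a rough illustration of the two jq forms just mentioned, the square brackets that iterate and the dot-dot that recurses, here is a small Python sketch that models both as generators over JSON-like values. This is only an analogy built on Python lists and dicts; the function names iterate and recurse are invented for the example and are not fq or jq APIs.

```python
from typing import Any, Iterator


def iterate(value: Any) -> Iterator[Any]:
    """Rough analogue of jq's `.[]`: emit each array element or object value."""
    if isinstance(value, list):
        yield from value
    elif isinstance(value, dict):
        yield from value.values()
    else:
        raise TypeError(f"cannot iterate over {type(value).__name__}")


def recurse(value: Any) -> Iterator[Any]:
    """Rough analogue of jq's `..`: emit the value, then all nested values."""
    yield value
    if isinstance(value, (list, dict)):
        for child in iterate(value):
            yield from recurse(child)


if __name__ == "__main__":
    doc = {"a": [1, {"b": 2}], "c": "x"}
    # Comparable in spirit to the jq program `.. | numbers`.
    numbers = [v for v in recurse(doc) if isinstance(v, int) and not isinstance(v, bool)]
    print(numbers)  # [1, 2]
```

Because both helpers are generators, they compose the way jq pipes do: each stage consumes a stream of values and produces another stream, possibly with zero or many outputs per input.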
And I have an example here where you build a new object that has a key a, which is an array of one, two plus three, and empty, and you see that it becomes the object with just a and the array one, five. And empty is just a function that does nothing: it outputs no value at all. You can also select values from the input. So here we have the object with a and b, and then you create a new object that has the key sum, and that's the sum of a plus b, so it's three. It's a purely functional language based on generators and backtracking. But it has conditionals, it has functions, it has bindings, it has all the things you need. The default is that you have one jq filter, or program, that you run individually on each input file. But it has functions to tell it not to behave like that, so you can run one filter on a group of files, for example. fq currently has support for 113 formats. I guess half of them are media-related, because I work with media, so MP3, MP4, or FLAC. It also has support for demuxing some of these formats: some of these containers have support for segmenting and things, so you can kind of recombine it and then decode the demuxed samples and things. But other people have added other things: executable formats, archiving, networking. So you can do pcap nowadays, you can even do TCP reassembly, and even decode the TCP stream. So there's support for RTMP, for example. There's also support for some serialization formats like MessagePack, ASN.1 BER, CBOR, and those. And there's also support for some text formats. Some of them you can even decode and encode, so you can decode it into a jq or JSON value, transform it with jq, and then encode it back to some other text format. You can't do that with the binary formats. I will see if I get to explaining why that is not easy to do.
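The two slide examples can be mimicked with Python generators to show what "based on generators" means in practice. In this sketch every filter is a function from one input to a stream of outputs; the helper names (const, empty, comma, collect, field, add, obj) are invented for the illustration and are not how jq or fq are implemented. The order in which add combines multiple outputs is also not guaranteed to match jq.

```python
from itertools import chain
from typing import Any, Callable, Iterator

# A "filter" takes one input value and yields zero or more output values.
Filter = Callable[[Any], Iterator[Any]]


def const(value: Any) -> Filter:
    """A literal such as `1`: ignores the input, emits one value."""
    return lambda _inp: iter([value])


def empty(_inp: Any) -> Iterator[Any]:
    """jq's `empty`: emits nothing at all."""
    return iter(())


def comma(*filters: Filter) -> Filter:
    """jq's `,`: emit the outputs of each filter, one after the other."""
    return lambda inp: chain.from_iterable(f(inp) for f in filters)


def collect(f: Filter) -> Filter:
    """jq's `[f]`: gather every output of f into a single array."""
    return lambda inp: iter([list(f(inp))])


def field(name: str) -> Filter:
    """jq's `.name`: look up one key of the input object."""
    return lambda inp: iter([inp[name]])


def add(f: Filter, g: Filter) -> Filter:
    """jq's `f + g`: one sum for every combination of outputs."""
    return lambda inp: (a + b for a in f(inp) for b in g(inp))


def obj(key: str, f: Filter) -> Filter:
    """jq's `{key: f}`: one object per output of f."""
    return lambda inp: ({key: v} for v in f(inp))


if __name__ == "__main__":
    # {a: [1, 2+3, empty]} run with a null input
    first = obj("a", collect(comma(const(1), add(const(2), const(3)), empty)))
    print(list(first(None)))  # [{'a': [1, 5]}]

    # {sum: (.a + .b)} run on {"a": 1, "b": 2}
    second = obj("sum", add(field("a"), field("b")))
    print(list(second({"a": 1, "b": 2})))  # [{'sum': 3}]
```

Since empty contributes no outputs, it simply disappears from the collected array, which is why the result has two elements and not three.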
So what does it mean when you decode? What does it mean that fq supports a format? It means that there is some code written in Go. There's kind of a DSL for writing decoders, and that produces a structure that is JSON-compatible. But each value in this structure also knows which bit range it comes from, and it can optionally have symbolic mappings: you can map a number to a string, or a string to a string, or a boolean, because in a binary format you usually encode some number that means something, if you get that. And for media this usually means that you decode everything except the pixels or the audio, because then you can use ffmpeg or whatever you want. There are some formats that actually decode down to the samples. For FLAC, for example, we have support for actually decoding, so there is a full FLAC decoder in fq, but you can't listen to the sound. And some formats can use other formats. That's how they are: there's a hierarchy of what they use. So, for example, pcap uses the Ethernet decoder, which uses the IPv4 decoder. And you can even end up with loops here, like zip files inside a zip file. There's also support for formats to pass values: you can return values from one decoder into another decoder. You don't see this from the outside. But, for example, mp4 has some boxes that have information about how the samples should be decoded, like how long some fields are. So then the mp4 decoder can pass that information down to the sample decoder. So how do I use it? I use it because I work with media, and more or less the whole mp4 file is metadata about how to play this: how to seek, how to sync, everything. And I guess Derek had a good talk about why multimedia is basically endless pain.
And I guess fq can't really fix the pain, but it can locate the pain, I guess. So this is what I use it for: I debug and query when I work with media files. Usually at work there's a media file that is broken, it doesn't transcode or it's not in sync, and someone asks what is wrong with this file. And we have to figure out what it is. Is it a decoder problem? A muxer problem? Is it the encoder? Is it corrupt? Is it whatever? So fq is very useful for quickly triaging a lot of broken media files. And, just shortly (what is the time?), what can it not do, or what is it not good at? It's not good for encoding, for changing things. You can change things, but it's more about slicing binaries and then concatenating them together and writing them out to a new file. So you can't just change a value in some JSON structure and serialize it back. For example, I give an example here with the mp4 muxer. What would it mean to add or remove a sample in an mp4 file? Then you would have to change all the boxes that describe how big the samples are, and it just cascades away. You can see that ffmpeg's mp4 implementation is 17,000 lines of C code, very dense C code, so use ffmpeg if you want to do those kinds of things. And I think Peter will talk more about encoding, I guess, and why this is complicated. And you can repair media files with fq, but you probably have to be more or less an expert in the format that you're fixing.
Usually you can fix things by testing decoder configurations, or you encode something and borrow from another media file and somehow stitch it together to see what it is. I have some fq code to build ADTS headers and whatever, if you find an AAC frame somewhere and you don't even know what it is, because an AAC frame has to have metadata about how many channels it has, what profile it is, and things like that. And it doesn't do any decoders at runtime at the moment: you can't write a decoder in fq itself, you have to write it in Go and then compile it. So we'll see, maybe in the future it's going to have Kaitai support. I have a prototype for Kaitai, but it is complicated; it also has an expression language that I have to write a parser for. Maybe I will talk to you about this and see. There are more slides here, but you can read them. I want to do a demo instead. Is it big enough? So fq is just a CLI tool. It works like this, and you can list all the formats that it supports. So if you run fq, let's see, it has pcap. The first argument is the filter that you want to run, and dot in jq is just the identity function: it gives you what you put in, so you get the root, kind of. So here we see that it's a pcap: it has a header, it has packets, and some TCP connections and things. So you can do -i. Actually, I want to do a crash course in jq. I don't know how many people know how jq works, so I can do a short version, just to show the particularities of jq. So now I started fq with -i, which gives you just a null input, one input that is just null. It's just a way to at least execute jq values, because it always needs to have an input somehow.
So now you can just write strings or whatever you want, and if you do dot there you just get null. I guess the most special thing about jq is the comma operator, which outputs values: you can do 1, 2, 3 and it gives you 1, 2, 3. But then in jq there are some special forms like this collect, which looks very familiar to us as an array, and which collects those values into an array. But in jq you can write any expressions in here, whatever you want. Let's say we define a function, for example, and then you can collect that function, or you can call it two times, or you can map those values. Yeah, that was too fast, but you see how you can define functions and you can have bindings, so it's a full functional language. I like a lot how it works. So let's go back to that pcap file. For example, if you want to look at the first packet, you can do this, and you can pipe it to d, which shows everything more recursively. When you just give it this, you actually run d also, but it only shows one level, so if you do d it will show all of it. So here you can write a jq expression like "show me the first and the last packet" and it will do this, and then you can say both of them. So you can do things like that. So let's see, we can go into the TCP connections and take the first one, and we can do d, and we see that this seems to be an HTTP request: someone has downloaded a file. So let's go to the server stream. One thing about jq is that jq doesn't have binary support, it only has strings, so to support binary there are some special functions in fq to tell it that this string is actually binary, or that I want it as binary if possible. So for this stream, for example, if I just do type it's just a string, but if I do tobytes you actually get the raw bytes, and then I can say I want maybe the first 400, and dd is something that shows the whole thing, it doesn't truncate the output. So here we see that this is some kind of HTTP request. Let's say we want to get the bytes for this. So here I think I have the body of the HTTP request. All the decoders in fq are jq functions, so you can do this now and it will just decode it as an mp4 file. And now you can start a sub-REPL, for example, so now you're inside the mp4 inside the pcap, and here you have the whole box tree for the mp4 file. You can, for example, go into the samples. I think this is some kind of subtitles in an mp4 file that I found somewhere. Here you have the tracks, which have samples, so these are the raw bytes for that sample, and you see it's some kind of weird XML subtitle. fq has support for XML, so you can do this and then you get a JSON version of the XML. This is probably not how you write a TTML subtitle parser, but we can do a quick version of one. There are some functions like grep_by that recursively look for some condition, so here, for example, you can look for... that did not work, why? Aha. So now it recursively found all the objects that had a text field and then just took that text field. So now you can take this expression, you have the REPL here, and go out to the prompt again, remove the interactive flag, and do this instead. Or we can do this if you only want the text, and we can even say we want all samples. So here is all of it. That's the thing you can do with all these decoders: decode this and that and then iteratively do all this. It's nice. So let's see, I also want to show that actually you
don't have to write all this by hand. After a while your expression maybe starts to get very long, so you want to have more structure. I can show you that I have some helpers for mp4 files, for example. I spend a lot of time in mp4 files, because that's what is used everywhere nowadays. So here are some helpers that can be written in a more structured way, with indentation and things, so you don't go crazy. This is a bit long, but here, for example, is an expression that loads this mp4.jq, finds the video track, and then uses some jq code to produce gnuplot output, and it uses gnuplot, and then I have my weird tool for producing images in the terminal. So here's the video bit rate, in bits per second, for this video, and you see that these peaks here are probably the I-frames in the mp4 file. Maybe we can take questions then. Yes? Yeah, you can use it as a hex editor also. Maybe it's not very convenient as a hex editor, but you can, for example, do -d bytes or -d bits, and then you get what is just a dummy decoder, kind of. So you can do bits on the pcap, and now you just get that. And then you can concatenate: you can build these binary arrays, which are like iolists in Erlang, if you have used that. It's just an array with things that can become binary, and you can pipe this to tobytes and then you get bytes back. That's how you can repair things with fq: you take the samples, add on some header, and then you concatenate the bytes, and you can even decode the result with fq again. I mean, you could try to decode this as an mp3 file, for example, but it won't work. Yes, I would like to have some more runtime-y version.
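The "binary array" idea, an array of things that can become binary, much like an Erlang iolist, can be sketched in Python as a recursive flattening step. This is only an approximation written for illustration: the function name to_bytes and the exact conversion rules (integers as single bytes, strings as UTF-8) are assumptions here, and fq's own tobytes may treat some cases differently.

```python
from typing import Any


def to_bytes(part: Any) -> bytes:
    """Flatten a nested list of bytes, strings, and small ints into bytes."""
    if isinstance(part, (bytes, bytearray)):
        return bytes(part)               # already binary: use as is
    if isinstance(part, str):
        return part.encode("utf-8")      # assumption: strings become UTF-8
    if isinstance(part, int):
        if not 0 <= part <= 255:
            raise ValueError(f"integer {part} does not fit in one byte")
        return bytes([part])             # assumption: an int is one byte
    # Anything else is treated as a nested sequence of parts.
    return b"".join(to_bytes(p) for p in part)


if __name__ == "__main__":
    header = [0x66, b"q"]                # hypothetical two-byte "header"
    payload = ["!", [b" sample ", [1, 2, 3]]]
    print(to_bytes([header, payload]))   # b'fq! sample \x01\x02\x03'
```

This mirrors the repair workflow described in the talk: slices taken from an existing file and a newly built header are placed in one nested array, flattened once, and written out.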
I can show how. There is one of the slides that actually shows how the DSL, the Go DSL, looks. Yeah, here is roughly how the Go DSL works: there's this d, which is kind of the context that keeps track of where you are, and then you create new structs and fields, and you have these mappers that say what a value means. This is a made-up binary format for open source licenses. So that's how most of the code looks. You can write any Go code, but I prefer to make the decoders look very declarative, so don't use too much weird code in them; keep them simple. Any other questions? Yes. I would say that it's of course hard to write the code for that. I think it creates a lot of options about what the user means when they change something. Do you want the checksums to be recalculated or not? What happens if you have demuxed something and then you change the size, so now the segmenting becomes different, and then it cascades to change the whole file? Do you want that to happen? There are also encodings where there are many ways to encode the same integer; for example, varints can be encoded in many, many ways. So should you encode and normalize all that, or should it try to remember how the original thing was encoded? So it's complicated. Any other questions? |
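The decoder style described above, a context that tracks the current position, fields that remember their bit range, and mappers that attach a symbolic meaning to a number, can be sketched in Python. fq's real decoders are written in Go and the slide's format is not reproduced in the transcript, so everything here is hypothetical: the Decoder and Field classes, the field_u method, and the made-up two-byte "license" format are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Field:
    name: str
    value: int
    bit_start: int                 # offset of the first bit in the input
    bit_length: int
    symbol: Optional[str] = None   # symbolic meaning of the value, if mapped


class Decoder:
    """Hypothetical decode context that keeps track of the bit position."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0               # current position, in bits
        self.fields: List[Field] = []

    def field_u(self, name: str, nbits: int,
                symbols: Optional[Dict[int, str]] = None) -> int:
        """Read an unsigned big-endian integer of nbits and record it."""
        total_bits = len(self.data) * 8
        if self.pos + nbits > total_bits:
            raise EOFError(f"not enough data for field {name!r}")
        whole = int.from_bytes(self.data, "big")
        shift = total_bits - self.pos - nbits
        value = (whole >> shift) & ((1 << nbits) - 1)
        symbol = symbols.get(value) if symbols else None
        self.fields.append(Field(name, value, self.pos, nbits, symbol))
        self.pos += nbits
        return value


# Made-up mapping for a made-up format; not taken from the talk's slide.
LICENSES = {0: "MIT", 1: "GPL-3.0", 2: "Apache-2.0"}


def decode_license_record(data: bytes) -> List[Field]:
    d = Decoder(data)
    d.field_u("version", 4)
    d.field_u("license", 4, symbols=LICENSES)
    d.field_u("year_offset", 8)
    return d.fields


if __name__ == "__main__":
    for f in decode_license_record(bytes([0x12, 0x07])):
        print(f)
```

Keeping the decoder body as a flat list of field calls is what makes it read declaratively, and recording the bit range with every value is what lets a tool point back at the exact bytes a value came from.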
Parsing binary formats with Kaitai Struct |
Hi everybody, welcome to my presentation about Kaitai Struct. I am Petr Pučil, and I have a question. How many of you have any experience with Kaitai? Okay, there are a few of you. What is Kaitai Struct? It's a tool for dealing with binary formats, especially parsing. It is based on a declarative language, Kaitai Struct YAML, that can be used to specify arbitrary binary formats. It works as a parser generator, and it currently supports 11 target programming languages. Parsing means converting the binary data you see above into structured data, an object tree, so that you can work with it later. Today I will also introduce a new functionality, which is serialization. I've been working on it for the last six months, and it currently works in Java. Serialization is basically the inverse process: you want to create a binary file from an object tree. Something about the history. The author of Kaitai Struct is Mikhail Yakshin, and the project started in 2014. In 2016, Mikhail decided to release the project as open source, and at that time the only supported languages were Java and Ruby. In 2017, Mikhail presented Kaitai Struct at FOSDEM, and by then it already supported eight languages and had over 400 stars on GitHub. Mikhail also wanted to come today, but unfortunately he couldn't. But if there is some chat or something, I think he should be there, so you can ask him some questions or whatever. And how is it today? We have 11 target languages and over 3,000 stars on GitHub, and Kaitai is used in more than 500 GitHub projects. So let me share how I discovered Kaitai Struct. This was in 2019. I was playing electronic keyboard with a band, and I wanted to create a MIDI editor so that I could record the songs on the keyboard and edit them on the computer. And I wanted the user to be able to upload a sound bank in the SoundFont 2 binary format so that they could control how the song would sound.
And I wanted a web-based MIDI editor, so I searched for a JavaScript parsing library for the .sf2 format, but I couldn't find one that would work for me. So I started writing my own parser, but it was really hard, a lot of debugging had to be done, and it was just not fun. And when I finished, I came across Kaitai Struct and I found that the two months of work I spent on this could have been done in just one day with Kaitai. So Kaitai impressed me with its concept, simplicity and versatility, and I started contributing a lot. Kaitai also helped my personal development, because until then I had only programmed in JavaScript, PHP and a little bit of Python, and within a few months I was able to work in 14 programming languages that were used in Kaitai. And in 2020 I accepted an offer from Mikhail to become an administrator of the project. In my story I showed what options there are to get a parser. The most convenient way, which you are probably familiar with, is to use a dedicated format library in the given language. It will probably have a user-friendly API and can be optimized for the format. But sometimes it may be of poor quality and incomplete, and it may be difficult to debug and fix. Also, for the most common formats like JPEG, ELF or ZIP, you can find even several libraries and you can choose, but for less common formats, some obscure ones, there will simply be no library in your language. So we need to look into other options, and another option is to simply write your own parser. But in my experience this is the worst option, because it takes a lot of time and you need to do a lot of debugging using debug prints and dumps, and it's just not fun. But it's what most people do, often because they just don't know any better. So that's why I'm here today. 
And, well, the problem is that if you have already written a parser for your format in Python, for example, and then after some time you are asked to create a Java parser for the same format, you basically need to start again. So a slightly better way is to use a parser combinator, which means that you are essentially still writing your own parser, but you are using some building blocks from a library. A parser combinator typically allows you to declaratively define some substructures, but still in the code. For example, in C you can define structs for the fixed-size pieces of the format and then directly interpret a block of bytes with that struct. There are many parser combinators, perhaps dozens in popular languages, but as with the two previous options, you still have the disadvantage that the parser you get this way is bound to the particular language. And it may even be bound to an application. For example, if it was developed for a graphical editor, it may be difficult to separate just the parser from that application to use it somewhere else. The fourth option is to use a parser generator, which means that you are not writing the parsing code directly in the programming language; instead you describe the format structure in a domain-specific language, and this description can then be automatically translated into a parser. Kaitai Struct falls into this category, and the Kaitai Struct language is designed so that it's independent of both the application and the programming language. Here I'll show you how to work with Kaitai. The first stage is compilation. You take this Kaitai Struct specification of the format, and in this case this is a format that has one byte, because this u1 type means an unsigned integer of one byte. You take this Kaitai Struct specification and you compile it using the Kaitai Struct compiler, which is a command-line tool. 
And as output, you get the source code of the parser, in this case in Python. The main stage is parsing. You give the input binary file to the generated parser you got in the first step, and you get parsed data as output, so an object tree. In the case of Kaitai Struct, the generated parser works with a runtime library, so you need to include that in your application as well. Why use Kaitai? What are the advantages? As I already mentioned, the advantage is that you write the KSY specification once and you can use it everywhere. It standardizes the way we describe binary formats, there are already many formats described in the Kaitai format gallery, any described format can be visualized automatically in a Graphviz diagram, and the Kaitai Struct language is simple, as you will see. There are also several visualization and dumping tools available in Kaitai Struct. The write once, use everywhere feature means that you get parsers in 11 programming languages for free from a single KSY specification. In this case, I've had the compiler generate Java, Python and Ruby parsers from the simple KSY specification you see on the left. When you look for specifications of binary formats, you will find that each one looks different and there is no single standard for how to document formats. Kaitai is intended primarily for creating parsers, but some people write a KSY specification just to document a format in an easy-to-understand way, because you don't even have to be a programmer to understand a KSY specification, and it's often easier than reading these long PDF documents. The Kaitai project includes an extensive gallery of described formats. At the moment, there are 181 formats described by 76 contributors, and there are also several hundred more format specifications in various Kaitai projects. 
The Kaitai format gallery contains formats of various kinds: as you see, archive files, executables, file systems, game data files, multimedia files and network protocols. You can go to this page; I took it from there. This suggests the wide applicability of Kaitai. And it offers an idea to create an international database of formats where various obscure and historical formats would be documented in a uniform way for future preservation. This would guarantee that we could read the binary files we write now in 100 or 200 years from now. The fact that the Kaitai Struct language is declarative makes it possible to automatically visualize the described format in a Graphviz diagram. The Kaitai Struct language is simple but powerful. You can describe pretty much any binary format with it. A KSY specification starts with the meta section, and this one sets little-endian byte order as the default. The seq section is a sequence of attributes. The attribute name is in the id key. The type u4 means that in this case num_files will be an unsigned four-byte integer. You can define your own types in the types section. A field can also be repeated, so in this case the files attribute will be a list or an array of the base type file. In the instances section you can define attributes that start at an arbitrary byte position. You can also use a powerful expression language in many places. Another built-in type is a character string in a certain encoding. And if you omit the type and only specify the size, the result is a byte array. There are several visualization and dumping tools available for inspecting files. This can be useful, for example, for finding errors, forensic analysis, or debugging. The visualizers allow us to view the structured data parsed from the input file based on a Kaitai Struct specification, so something like this. 
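As a rough illustration of the kind of format described in this passage (a little-endian u4 file count followed by repeated file entries, with a body read from an arbitrary byte position), here is a hand-written Python sketch of what a parser for it has to do. This is not code generated by Kaitai Struct, and the entry layout (an 8-byte name, then ofs_body and len_body) is an assumption made up for the example.

```python
import struct
from dataclasses import dataclass

ENTRY_FORMAT = "<8sII"  # hypothetical entry: 8-byte name, u4 offset, u4 length (little-endian)


@dataclass
class FileEntry:
    name: str
    ofs_body: int
    len_body: int
    body: bytes


def parse_archive(data: bytes) -> list[FileEntry]:
    """Parse the hypothetical archive: u4 num_files, then num_files entries."""
    (num_files,) = struct.unpack_from("<I", data, 0)
    pos = 4
    entries = []
    for _ in range(num_files):
        raw_name, ofs_body, len_body = struct.unpack_from(ENTRY_FORMAT, data, pos)
        pos += struct.calcsize(ENTRY_FORMAT)
        # Plays the role of an "instance": data read from an arbitrary position.
        body = data[ofs_body:ofs_body + len_body]
        name = raw_name.rstrip(b"\x00").decode("ascii")
        entries.append(FileEntry(name, ofs_body, len_body, body))
    return entries


def build_sample() -> bytes:
    """Build a small sample file in the hypothetical format."""
    names = [b"a.txt", b"b.txt"]
    bodies = [b"hello", b"FOSDEM"]
    ofs = 4 + len(bodies) * struct.calcsize(ENTRY_FORMAT)
    out = struct.pack("<I", len(bodies))
    for name, body in zip(names, bodies):
        out += struct.pack(ENTRY_FORMAT, name, ofs, len(body))
        ofs += len(body)
    return out + b"".join(bodies)


if __name__ == "__main__":
    for entry in parse_archive(build_sample()):
        print(entry.name, entry.ofs_body, entry.len_body, entry.body)
```

Every offset and field width here is spelled out by hand, which is exactly the bookkeeping a declarative description is meant to take over.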
You can use the console visualizer, or the command-line tool ksdump is also available, which can give you the same structured data in JSON format, as you can see. This can be useful for automation. But the most popular visualization tool is the Web IDE. You can check it out at this URL. At the top right is a hex dump of the input binary file; in this case I selected this .png file in the file tree on the left. At the top left is the Kaitai Struct specification editor, so a KSY spec editor. According to the Kaitai Struct specification, the input file is parsed, and the result is the structured data that you see in the object tree at the bottom left. When you edit the Kaitai Struct specification, the input file is automatically parsed again and the object tree is updated. Serialization is a new feature in Kaitai Struct and it's being developed thanks to the financial support of the NLnet Foundation. While parsing allows you to read binary data into an object, serialization is all about the inverse process: we want to write an object to binary data. Currently in Kaitai Struct, the serialization support for Java is fully working, and C# and Python are in development. There are basically two use cases of serialization. You can edit an existing file, or you can create a new file from scratch. The support for serialization greatly extends the use of all written format specifications, because now you can use them not only for parsing but also for serialization. This has many uses: for example, you can convert one format into another, or it can be used for fuzzing, or video game modding, and so on. The serialization process in Kaitai Struct can be divided into four phases. First you need to create a KS object, and then you fill it with data, so you set its individual fields or attributes using setters. Then you should call the _check method to check the consistency of the data with the format constraints. 
Finally, we can call _write and pass the stream to write to. You can check out more details on how to use serialization in Java on this page. Currently, the serialization support in Kaitai Struct is designed for the general case, so that it works for every conceivable format specification. While a simpler solution would work for perhaps most specifications, the solution that works for all of them was chosen, even at the cost of delegating some tasks to the user. In the future, I would like to automate these tasks that need to be done manually at the moment, so that it's more convenient for the user. The basic idea is that the user sets everything, including lengths, offsets and magic signatures, and Kaitai Struct checks for consistency. Also, only fixed-length streams are considered, so once you create a stream, you cannot resize it. Finally, I would like to talk about the plans for the future. Support for C# and Python is in development and should be ready in two months. There is also interest in adding Rust, C and Julia as target languages. I would also like to see Wireshark dissectors as a target, because the concept of Kaitai is not limited to programming languages. A target can be anything; for example, we already have a target for Construct, which is a Python library for parsing and serialization of binary data. Thanks for listening. Now it's time for your questions. Yes, there is a doc key which you can use on attributes and types in many places, and you can write some documentation of the specific element. It doesn't work 100% of the time in all languages, but the idea is that this documentation should translate to the generated parser as docblocks, and then the IDEs and development tools should usually show this documentation in autocompletion. Do you support endianness when generating source code, depending on the target machine? 
Yes, there is a feature for that; it is called calculated endianness, and you can switch the default endianness based on the value of an arbitrary expression, basically, so this can... But do you support host endianness and target endianness? Well, not really, but it's not that much of a limitation, because you can, for example, use parameters to pass it from your application. I don't know if it's a good idea, but another feature of Kaitai Struct is that types can have parameters, even the top-level one... Yeah, I should probably at least... Never mind. You can define parameters, and you can easily pass a parameter from your application that will somehow change the behavior of the specification, so it's possible. With KSY, you seem to aim to define a specification for certain languages or formats, but for languages and formats that already have a specification, how can you ensure that these two specs are actually the same and that you're not parsing differently than other parsers? I don't... Well, you mean that there is already an implementation of some... For example, someone's parsing ZIP files out there; how do you guarantee that Kaitai Struct will parse ZIP files the same way? You don't, basically, but from this point of view it's just another implementation. If you compare it to other parsers, there is, for example, a ZIP parser library in every language, and this Kaitai Struct specification is just another implementation. So it needs to be developed carefully so that it works well. Yeah. 
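The calculated endianness mentioned in this answer, where the byte order is chosen from a value in the data itself, can be pictured with a small hand-written Python sketch. It is not Kaitai-generated code. The two-byte "II"/"MM" marker follows the convention TIFF uses, while the version and count fields after it are an assumed layout for the example.

```python
import struct


def parse_header(data: bytes) -> dict:
    """Read a header whose byte order is selected by its first two bytes."""
    marker = data[:2]
    if marker == b"II":      # little-endian marker, as in TIFF
        prefix = "<"
    elif marker == b"MM":    # big-endian marker, as in TIFF
        prefix = ">"
    else:
        raise ValueError(f"unknown byte-order marker: {marker!r}")
    # Hypothetical layout: u2 version, then u4 count, in the selected byte order.
    version, count = struct.unpack_from(prefix + "HI", data, 2)
    return {"byte_order": prefix, "version": version, "count": count}


if __name__ == "__main__":
    little = b"II" + struct.pack("<HI", 42, 7)
    big = b"MM" + struct.pack(">HI", 42, 7)
    print(parse_header(little))
    print(parse_header(big))
```

Both inputs decode to the same version and count even though the bytes on disk differ, which is the point of switching the byte order on an expression rather than fixing it in the specification.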
I guess you would need a way to translate from a written specification to the Kaitai Struct one, or the other way around, to validate that what you wrote as the description actually corresponds to the actual specification. For example, if a specification is already written in a machine-readable form, we should have a tool to convert from one to the other, so that would ensure that the parsing is correct. But it doesn't help, because the implementation is done by humans. It's impossible. It's just an introduction. You have to run all those things. Why? I'm wondering if it would be possible to add some functionality to that, not only parsing but some very common functionality. Do you think you can add that in the same format? Common functionality, so... Like, for example, there's a binary format and there's very common functionality everybody uses on that, let's say cutting a part of it, or calculating some value, a magic value or a hash value. Could you add some extra functionality other than parsing in there? 
Well, so the question was whether we can add some common functionality in addition to the format specification, and the answer is that you can do this to a certain extent. I didn't talk about them, but there are value instances. This is like a calculated attribute, so you can write an arbitrary expression for it. For example, I extended the BMP specification, and in the BMP format there are color masks in different places depending on the header version, and I used a value instance to get them from the right place depending on the version, so either from here or from there, or from a fixed color palette, or whatever it is called. So yeah, we can do this to a certain extent. But some common functionality that would require something like a programming language would be infeasible, basically, because then we would need some language that translates to all targets, which is basically impossible, I think. There is a different kind of use, like testing: could you use this tool set to write a comprehensive diff tool that explains the differences between two binaries and that can leverage the existing descriptions to explain the differences it finds? Yes, I think you can compute some diff. Basically, I showed the ksdump tool here, so I think you could generate the JSON dumps of the two files and compare them. But when I did this, the result was usually very, very massive. You can probably improve that somehow, I don't know. But yeah, okay, so thanks and... |
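The idea from this last answer, dumping two files to JSON and comparing the dumps, could be sketched roughly as follows in Python. This is only an illustration: the sample trees are made up rather than real ksdump output, and the path syntax and the MISSING marker are choices made for the example.

```python
import json

MISSING = "<missing>"  # marker for a field present in only one of the two trees


def diff_trees(a, b, path="$"):
    """Return (path, left, right) tuples for every difference between two JSON trees."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            sub = f"{path}.{key}"
            if key not in a:
                diffs.append((sub, MISSING, b[key]))
            elif key not in b:
                diffs.append((sub, a[key], MISSING))
            else:
                diffs.extend(diff_trees(a[key], b[key], sub))
        return diffs
    if isinstance(a, list) and isinstance(b, list):
        diffs = []
        for i in range(max(len(a), len(b))):
            sub = f"{path}[{i}]"
            if i >= len(a):
                diffs.append((sub, MISSING, b[i]))
            elif i >= len(b):
                diffs.append((sub, a[i], MISSING))
            else:
                diffs.extend(diff_trees(a[i], b[i], sub))
        return diffs
    # Leaves (or mismatched container types) are compared directly.
    return [] if a == b else [(path, a, b)]


if __name__ == "__main__":
    left = json.loads('{"magic": "PNG", "chunks": [{"type": "IHDR", "len": 13}, {"type": "IDAT", "len": 100}]}')
    right = json.loads('{"magic": "PNG", "chunks": [{"type": "IHDR", "len": 13}, {"type": "IDAT", "len": 120}, {"type": "IEND", "len": 0}]}')
    for path, old, new in diff_trees(left, right):
        print(path, old, "->", new)
```

Reporting only the paths that differ, instead of the two full dumps, is one way to keep the output from becoming as massive as the speaker describes.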
GNU poke
The extensible editor for structured binary data |
All right. So, it's two minutes early, but I have a tendency to speak and not stop, so we'd better get started. I am so tired of doing introductory talks about poke that I have decided this is the last one I'm going to do. But let's see. Is anyone here familiar with this program somehow? Yes? Okay. So, yeah, we need to do an introductory talk. GNU poke is an extensible editor for structured binary data. It is a program that you can actually extend. We will see how, and you can extend it to a very high degree. And we will see that it's a program that is used to poke, to mess with, to edit data which is encoded in binary. We will see what this thing about binary is. Everything is binary at the end of the day, right? So, first, I'm going to do a very small introduction, an abstract one: why would you use such a program? Then, we will see how we are integrating poke in other programs via a shared object, a library. Then, I will talk a little bit about the project itself, the current status. We just made a major release that we are very happy about, and we are very busy because we have a lot of things to do. It's getting very fun, actually. And then, finally, an invitation to you to join us in hacking this program. So, let's start with a very small introduction. When it comes to editing binary data, what do you use? You use a binary editor, right? If you look around on the internet for binary editors, you have the simplest kind, what I call simple binary editors, which are your garden-variety programs that show you a byte dump. A lot of them also show you the ASCII representation of the same bytes. That is what you see on the screen. How many of those programs are around? Quite a lot of them. And they are nice. 
I mean, they are nice, they are useful, they are small, they are very easy to use. There is nothing mysterious about them. They are interactive, usually, the ones that allow you to actually change the value of those bytes in the files. You just go to the byte you want to change, you put in the new value for the byte, ta-da. If you want to change some string, you go to the ASCII column there, you go to the position you want to edit, you change it, and the file gets updated. Very nice. They are interactive. They support immediate editing: you change something here, and it immediately gets reflected in the file that you are editing, for example. What do they let you edit? In terms of what? Well, it depends. But those simple binary editors let you operate in terms of bytes, in terms of strings, like we just mentioned, and sometimes, not often, in terms of bits as well. Some of those basic binary editors go down to the bit level too. And that's it. That's the thing, right? You can edit in terms of those entities, those concepts: bytes, bits, strings sometimes, sometimes numbers. Then the editor is not so simple anymore, but you see, it's always a fixed list of abstractions that you can edit. You manipulate your data using those abstractions. They give you byte dumps, the ASCII views, and they have fixed capabilities, depending on how nice the particular editor is. Some of them allow you to search for patterns, others to search and replace, to make byte diffs, that kind of thing. To measure entropy, to search for frame buffers, that kind of thing. Simple binary editors are useful. But sometimes you want to edit your data in terms not just of bytes or bits, but in terms of more abstract entities, like MP3 headers, or ELF relocations, or lists of numbers, or whatever. 
And for that, you traditionally have specialized binary editors, which are editors, like the two that you can see here, that know about some particular format. The one on the left knows about the ELF format, so you can see a tree view, which is quite nice, with the different sections in the ELF file. And the one on the right, which is small but nice, is a very small MP3 editor to edit MP3 files, actually the metadata associated with the MP3, like the title of the song, the name of the singer, or whatever. Those specialized binary editors are nice too. They are useful. But they are not extensible, usually. In the ELF program here, you know that ELF sections can contain arbitrary data, right? So you say, well, I want to use the same editor to edit the contents of one of the ELF sections. Usually, they don't let you do that. So they are not extensible. They are also not very good at dealing with incorrect data. If you get a corrupted ELF file, or some MP3 file that has problems in it, those editors are probably going to refuse to open those files. And if they do, they are probably going to show you garbage. So they are not good at that. And it's the typical situation: exactly what you need is what they don't implement. It's always like that. So those are the specialized binary editors. And then we have Kaitai Struct and friends, which implement this paradigm; as you know, we got a nice presentation before. First, you decode the data, with one of the parsers of Kaitai Struct, for example. Then you do your computation with the data. And then maybe you use an encoder to write back the modified data. This is what I call decode, compute, encode. Programs like these are also useful. They are extensible, like we have seen with Kaitai Struct. You can define your own structures. 
They usually use some sort of declarative way of describing the layout of those data structures. They generate code in several target languages. I don't know why Kaitai Struct does not generate C; I'm so puzzled about that, but OK. Usually, those are non-interactive. You generate a parser in some programming language that you then incorporate in your program, and then you run it. Usually, they are not that good at dealing with incorrect data either, because the parser that they generate expects correct data. And they are either bit-oriented, which is not common (I don't know if Kaitai Struct can deal with the bit level. Good. Also unaligned stuff?), or byte-oriented. And often, there are no encoders. I know that Kaitai Struct is now starting to add support for actually writing data back to the file. This is nice, too. And then you have the poke approach, which is circular. What we wanted with this was the following. We wanted the immediate aspect of the simple binary editors: OK, I go to this byte, and then I change it, now, immediately. But we also wanted the extensibility and the ability to work with higher abstractions that you have with a parser generator like Kaitai Struct, for example. We wanted everything together. And that is what poke is, in a few words. Basically, you describe your data structures, like a struct type, for example, and you can immediately poke at it. You can immediately update it, edit it, write to it. And if you are not satisfied with that, you can, on the fly, using the same program at the prompt, redefine your data structure and do it again. This is also good for discovering the format of what you are editing, like in reverse engineering and whatnot, and when you are developing a new format, that kind of use case. 
So it is interactive with the poke application. It allows immediate editing. It allows data integrity. You can define your own structures, quite complex ones. And it supports a very powerful and to-the-point domain-specific language. I'm a big fan of domain-specific languages, because we have the ability and the brains to talk in several languages and write in several programming languages, and it is so great when a tool actually gives you the way of expressing things that is most suitable for the task at hand. The DSL is called Poke as well, like the program, but with a big P to distinguish the programming language from the program itself. It is interactive and statically typed, on purpose. It is garbage collected. This programming language has some very interesting features, because it's designed for the task at hand. For example, it's not bit-oriented and it's not byte-oriented. It is unit-oriented. In Poke, when you start talking about offsets, sizes in memory and so on, you don't talk in bytes or in bits. You talk in terms of arbitrary units that you can define. I'm sorry I cannot go into detail, because this is a fast pitch, but you have all the information on the internet. Also, it works on bit-addressable IO spaces. It can work with incorrect data, because you can do non-strict mapping, saying, OK, I want to disable the integrity constraints, so you can actually discover what you have in front of you and adapt your own definitions to it. You can very easily define several versions of the same structure, to be more strict or less strict. And it is extensible. We will see that poke is not just a binary editor. It is a full infrastructure to write binary utilities as well. 
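To make the unit-oriented idea a bit more concrete, here is a small Python sketch of offsets that carry their own unit. It is only an analogy written for this transcript, not how Poke is implemented; the Offset class, the to method and the SECTOR unit are all hypothetical names.

```python
from dataclasses import dataclass

# Unit sizes are expressed in bits.
BIT = 1
BYTE = 8
SECTOR = 512 * BYTE  # hypothetical user-defined unit


@dataclass(frozen=True)
class Offset:
    magnitude: int
    unit: int = BIT

    def bits(self) -> int:
        """Total size in bits."""
        return self.magnitude * self.unit

    def __add__(self, other: "Offset") -> "Offset":
        # Mixed-unit sums are expressed in bits, the finest unit available.
        return Offset(self.bits() + other.bits(), BIT)

    def to(self, unit: int) -> "Offset":
        """Re-express the offset in another unit, if it divides evenly."""
        total = self.bits()
        if total % unit != 0:
            raise ValueError(f"{total} bits is not a whole number of {unit}-bit units")
        return Offset(total // unit, unit)


if __name__ == "__main__":
    print((Offset(2, BYTE) + Offset(4, BIT)).bits())
    print(Offset(1, SECTOR).to(BYTE))
```

The design point is that a size such as "one sector" or "twelve bits" stays a first-class value with its unit attached, instead of being flattened to a byte count up front.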
And then, similarly to what Kaitai Struct is aspiring to do when it comes to documenting formats, you can also use poke to document formats and protocols in a functional way, because you can use that same documentation to actually operate on the data. To do prototyping, to write binary utilities, to implement filters, and so on. And then, to integrate in other programs, which is very cool. I will show you in five minutes one example with the debugger, with GDB. Now, poke can operate on files, memory buffers, the memory of running processes. There is a collection of IO devices, which are what you are editing. But what a Poke program, or you from the command line, has access to is a bit-addressable IO space. We call them IO spaces, and in them you can map or manipulate different kinds of entities from the Poke language, such as integers and strings. OK. We all know what bytes are, right? You would be surprised how many people don't. They are just little numbers in a certain range, from 0 to 255. So this is the way you refer to bytes in poke. But what I wanted to show you, because I think it's the interesting part, is the way poke goes from bytes to actually encoded integers. You see here, I don't know if you can see it, the IO device bytes there, which is the underlying device that you are editing, like the file. It has bytes, which are little numbers from 0 to 255. And then the IO space, which sits on top of the bytes, is the bit-addressable IO space that your Poke programs actually see. We all know that bits are actually a very interesting thing. Bits usually exist at the hardware level, then they disappear, until you recreate them virtually on top of those byte numbers. It's very interesting. 
So from poke, what you see is the bits that are conceptually on top of the bytes. And that's a Poke type, for example: this is an unsigned 16-bit integer mapped in the IO device at the first byte. But this is a boring example; it gets even more interesting, with what we call weird integers. For example, in poke you can operate with 12-bit unsigned integers as naturally as you would with a 32-bit unsigned integer. This is quite cool, actually, and useful, believe it or not. Then we have some conventions to refer to the bits and everything. This integer actually occupies one full byte and then half of the next one. But you can also go to less than one byte, like in this case. You can operate with an unsigned integer of five bits, which doesn't fill a complete byte. Obviously, since everything in a computer is actually bytes, at the driver level, the hardware level, everywhere, this is an artifact. But it's a useful one. And poke also has full support for unaligned stuff. So you can work with a 16-bit unsigned integer shifted two bits in the IO space. Can Kaitai Struct do that? Yes? We will have to see. So you could also obfuscate your file just by shifting it three bits to the right, for example. Why not? It may be fun. I included this here not to impress the cat, but to give you the impression that poke is a serious program. It's not just a stupid program for poking at bytes here and there. We actually take it very seriously. And you can do this kind of stuff. And believe it or not, people need this kind of stuff. The other day on IRC, we met the Multicians, which is a community of people who are dealing with Multics. And you will not believe what they need. Really. Like nine-bit bytes. It's unbelievable. 
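The weird and unaligned integers described here can be pictured with a short Python sketch that reads an unsigned integer of any width at any bit offset. It is an illustration only, not poke's implementation. It assumes big-endian encoding with bits numbered from the most significant bit of each byte, whereas poke has its own conventions and lets you change the endianness.

```python
def read_uint(data: bytes, bit_offset: int, width: int) -> int:
    """Read a `width`-bit unsigned integer starting at `bit_offset` (MSB-first, big-endian)."""
    if bit_offset < 0 or width <= 0:
        raise ValueError("bit_offset must be >= 0 and width must be > 0")
    end = bit_offset + width
    if end > len(data) * 8:
        raise ValueError("read past the end of the data")
    first_byte = bit_offset // 8
    last_byte = (end + 7) // 8            # one past the last byte touched
    chunk = int.from_bytes(data[first_byte:last_byte], "big")
    trailing = last_byte * 8 - end        # unused bits after the value
    return (chunk >> trailing) & ((1 << width) - 1)


if __name__ == "__main__":
    data = bytes([0xAB, 0xCD, 0xEF])
    print(hex(read_uint(data, 0, 16)))   # aligned 16-bit integer
    print(hex(read_uint(data, 0, 12)))   # one full byte plus half of the next
    print(read_uint(data, 0, 5))         # less than one byte
    print(hex(read_uint(data, 3, 16)))   # 16-bit integer shifted by three bits
```

In Poke itself such types are written along the lines of uint<12>; the sketch only shows that, underneath, they come down to shifting and masking over ordinary bytes.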
And we are struggling to actually give those people what they need, because they have rights too. The Multics people have rights too. Anyway, the poke-sphere is growing and growing. This started very simple. It was always a little bit special, but a little program. But it's getting out of hand at the moment, in the sense that we have libpoke, which is a shared object. At first I had everything in the program, but then I was told to put it in a library. So I did. So libpoke is a shared object that has the Poke incremental compiler, the IO space support, and everything. We will see now with GDB an example of how you can make use of it in your own programs. Then poke is the command-line application, which uses libpoke. But it's just a program, a very small program, with a prompt. You open a file like /bin/ls, load elf, and you can dump bytes. Oh, sorry, here. And var elf = Elf64_File @ 0#B, stuff like this. That's the command-line application. But all of the logic is actually in the shared object. Then you have other applications like GDB, which is not upstream yet. It's in a branch upstream, but not in master. They can use libpoke to give you poke capabilities. I will show you now in a two-minute little demo. There is a poke daemon too, which Mohammad will talk about; they have 10 minutes. Utilities. The pickles, which are the Poke programs that give you support for some particular format or for a particular domain. Then there is an Emacs interface, of course. Then there is an Emacs mode for Poke programs, a Vim mode, for unholy people, for editing Poke code in Vim, in vi, and something called pokelets, and so on, and so on. So it's getting fun. And then the integration. I very much like this. There is one problem when people write a program: they want it to do everything very well. And that does not work. So for example, in poke, we have some support for editing the memory of running processes. 
We do. We do. Well, I could show you, but I don't have time for that. We have an IO device where you can specify the process ID of a process, and then you can edit the memory of the running process. But poke is not a debugger. It's not. There is a DWARF pickle, but not to the same extent as a debugger. You cannot set breakpoints. It cannot use ptrace to control a process. But GDB is a debugger. So GDB is good at debugging, at running processes, and so on. And poke is good at poking at data. So let's put them together. Then the combination is good at both. And this can be achieved by using libpoke. So I can show you very fast. I have this C file. You see the C file? It has a C struct in it. Can you see it properly there? Yes? Which one? Which one? No, I'm not going to address this. Trust me. OK. So there is this struct type, and then there is this buffer, there is this db global variable here, and so on. And an int main. Then I compile that into an a.out, and then I run this, which is the poke-capable GDB: a.out, break main, run. Now I am in main, right here. And I can use GDB to look at the database here, this db global variable, and so on. Now this GDB is extended with poke. So you have a poke command where you can execute any Poke code, like this: 2 plus 2 equals 4. Brilliant. But here you can do anything that you can do with poke. And poke, in this case, has access to the memory of the inferior that you are debugging with GDB. So even if it's multi-process or multi-threaded, you switch in GDB, and then poke has access to the memory of the inferior. So what kind of things can you do? Well, anything. What is the address in GDB of this global variable here? Sorry. That is the address. So with poke, you could access that address. But you have a command, which is poke-add-types, right? 
Which is you are telling GDB to make poke aware of all the types known to GDB at this point in time, right? So the type char, the type int, the struct prop that we saw before, and these are translated into poke type definitions for the equivalent GDB types. This means that you can go from BTF, from CTF, from DWARF, from any debugging format that GDB understands to poke types, when you are using GDB in this way. Now, poke has access to the memory of the inferior. So we could do, for example, these, and those are poke expressions, right? Struct prop at where? And you can use something we call alien tokens, which here is the address of the GDB symbol db, which is the variable, right? And this is a poke struct. This is not a GDB value. This is a poke struct, which is poking at inferior memory. And you, of course, can write. You can do whatever you want. You can load pickles. What is this useful for? Well, imagine you are debugging a program that is some sort of router for TCP. So in the program itself, you know, you have TCP packets and some buffers. But the program itself doesn't need to understand the payload of what it is transporting. But imagine that you want to take a look at what is going on there, right? Then you don't have the DWARF definitions of the structures that are in the buffer, in the payload. But with poke, you just load your pickle or you write your own, and you can poke at it from GDB in the buffer of the running process, for example, right? So this is one example of application. This is 400 lines of C, the integration, using libpoke. It's very nice, and, you know, it's easy. Another example, which is work in progress, and I can't wait to get this finished, is basically to add to the assembler a .poke directive. Because you know those .word, .byte, and so on directives in the assembler? They are not portable. And this is a very, very, very, very, very, very, very big pain in the ass.
Because when you have to, for example, write tests for DWARF, you know, that kind of thing, then they are not portable. I know it's amazing, but they are not portable. So see, for example, that. This is a real example of something I found. I will not tell where, of some people who are actually embedding some sort of executables inside other sections. This is Sony computers, this is video games, you know, kids. They do these kinds of things. And that is a struct, in theory, of some header, right, for the PlayStation or whatever. Well, compare. You see? So once we have the integration in the assembler, you will be able to load the PSXX pickle. It will be a small one where you define the structure of the PSXX executable, and then you can poke at assembly time, you know, like that, too. And access gas assembler symbols, you know, using alien tokens, like we saw you could do with GDB as well. So this is another example of integration. Yeah. So we are on these kinds of things now. We are looking into parasitizing other programs, you know, and incorporating libpoke in them. So they do all the boring work, and we do the fun part. OK. So what's the current status of the program? We just released poke 3.0 a few days ago. Up to now, we were doing one major release every year, and we had a maintenance branch. But people were not happy. Why? Because it was too long, and the difference between poke 2 and poke 3 was too big, it was too much. And actually, we released poke 2.0, for example, and in two weeks, we had forgotten about it already, because we are so happy, you know, and so excited with the main branch. So now we are committing to two big, major releases every year. Let's see if we can actually do that. So the development: we are old GNU people, right? So we don't use GitHub or anything like that.
So we use a mailing list, and you send your patch to the mailing list in unified diff format, and so on, and so on, right? And, well, that is the website of the project. We have a Git repository, obviously. We have a mailing list, a development mailing list. We have a very nice buildbot at Sourceware. Thank you very much to the Sourceware overseers. They have been doing a great job maintaining the infrastructure of many GNU programs, including poke and also the toolchain, GCC and glibc, and so on, for many years. And also, we have a pipeline hosted at GitLab that Bruno Haible maintains, and I have no idea how it works. It's not clear to me what a pipeline even is in that context, but it's green. So I guess it's good. We have a community website. It's called Pokology, and we try to put practical information there, like how you can write your pickles, and so on. And also, I have a blog on my website where I sometimes publish small articles with practical stuff. So how can you do, for example, implementing sparse tables using poke, or accessing some stuff? We want to be friendly to users. And now, starting now: in the poke source distribution, GNU poke, we had a pickles directory with a lot of pickles. Like for PE, for ELF, for DWARF, for BTF, for this, for that. Some instruction sets, RISC-V, BPF. But this is getting crowded. And actually, some pickles are big and complex enough to need their own releases, so you can have several versions that work with the same version of poke. So we are basically putting some of the pickles in separate packages. And that's the case for the ELF pickle and the DWARF pickle. So for example, the ELF pickles are now distributed separately. We have not made the first release yet, but they are in the Git repository. They have their own manual, and so on. And the DWARF pickles, as well. Those are the first ones. We want to take the PE/COFF support, because it's a huge fat monster, that one.
We want to put it also in its own package. And also, with nice manuals. I have to show you this, because it is such a pain to do it that you have to brag about it. This is the poke manual, the GNU poke manual. And then, when you install the pickles packages, like poke-dwarf and poke-elf, here you have them nicely documented, you know, the pickles and everything. And the source distribution, sorry? Use man pages, not the manual. Well, well. Oh, well. We are generating man pages from the info. Then, actually, the idea is to use the ELF pickle, because we are new at writing pickles, because poke is sort of new. So we are trying to discover our way forward, but the poke-elf pickle is sort of the canonical example. We are using it that way. We are writing it very carefully. So if you want to write a complex pickle, you can look at it, but, well. And this is it. I am so sorry that I could not give you, you know, an actual taste of what this program is like. But there is no time for that. And there are other videos on the internet where I have done that already. But, and then, if you want to join the development, please read the HACKING file. Because we took the effort of writing it. It's huge. It has a lot of good information. And absolutely no one reads it. So thank you very much. Thanks. |
Stack walking/unwinding without frame pointers |
Welcome everyone. Today we are going to talk about walking native stacks in BPF without frame pointers. My name is Vaishali, I work at Polar Signals as a software engineer, I mostly work on profilers and eBPF-related stuff, and before that I worked on different kernel subsystems as part of my job. My name is Javier and I have been working at Polar Signals for a year, mostly working on native stack unwinding, so the work that we are going to introduce today, and before that I was working on web reliability, tooling for developers and performance at Facebook. Yeah, so before we dive into the topic, I wanted to go through the agenda. First we want to talk about why there is a need for a DWARF-based stack walker in eBPF, because that is the most asked question; then we will go into the design of our stack walker; then we will talk about how we went from the prototype to making it production ready; and then a bunch of learnings so far and some future plans. So as we said, we work on production profilers. Generally, sampling profilers collect stack traces at particular intervals and attach values to them. Note that profilers generally need the user and application-level stacks as well as kernel stacks, and this involves iterating over all the stack frames and collecting the return addresses. Historically, there has been a dedicated register for that, called the frame pointer, on both x86 and ARM64, but in recent times, because of compiler optimizations, it has been disabled in most of the runtimes as well as in distros. Without frame pointers, walking the stack becomes really hard: instead of a couple of memory accesses per frame, which is quite fast, we need to do a lot more work in the stack walker. Note that stack walking is also a common practice in debuggers, as you all know. So what's the current state of the world?
Well, it's not a problem for the hyperscalers, because hyperscalers already build all their applications with frame pointers enabled in production. This is important because when things break and you want to inspect them, it's really helpful to have the whole stack when it's needed. There's also a recent discussion on the Fedora mailing list, so feel free to go through it. The TL;DR of that discussion is that frame pointers are going to be enabled starting with, I think, Fedora 38, which is about to be released in late April. macOS software always has binaries with frame pointers enabled. There's also amazing work going on by Oracle engineers to have a simple format instead of DWARF, and we hope that this work also goes through and helps the ecosystem. So that's sort of the current status, but we want this right now, and we want support for all the distros as well as all the runtimes, where support is scattered here and there. For example, the Go runtime only enables frame pointers since Go 1.7 on x86 and since 1.12 on ARM64. So now some of you might be wondering: if not frame pointers, what do we have? For example, in Rust, frame pointers are disabled by default, but when you have a panic, you still get all the addresses, so how does that happen? Well, compilers always have this information. We generally need to know the value of the stack pointer in the previous frame, and without a frame pointer it can be at any offset. That way we can always find the return addresses and continue unwinding the stack. This information is generally encoded in the .eh_frame section or the .debug_frame section in DWARF. There is also another way: unwind tables can be synthesized from the object file, which is what the ORC format, used in the kernel for a while now, does.
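To make the idea above concrete, here is a minimal sketch in Python of unwinding without frame pointers. Everything in it is illustrative and assumed: the `Row` layout, the toy `TABLE` and `MEMORY` contents, and the use of a zero return address as an end-of-stack marker are not taken from the talk. It only models the simplest rule, where the previous frame's stack pointer (the CFA) is the current stack pointer plus a per-program-counter offset, and the return address sits one word below the CFA, as on x86-64.

```python
from bisect import bisect_right
from dataclasses import dataclass

WORD = 8  # size of a return address on x86-64, in bytes


@dataclass(frozen=True)
class Row:
    pc: int          # first program counter covered by this rule
    cfa_offset: int  # CFA (previous frame's stack pointer) = rsp + cfa_offset


def find_row(table, pc):
    """Return the last row whose pc is <= the given pc, or None."""
    i = bisect_right([row.pc for row in table], pc) - 1
    return table[i] if i >= 0 else None


def unwind(table, memory, rip, rsp, max_depth=127):
    """Collect program counters from the innermost frame outwards."""
    frames = []
    while len(frames) < max_depth:
        row = find_row(table, rip)
        if row is None:          # no unwind information: stop
            break
        frames.append(rip)
        cfa = rsp + row.cfa_offset
        return_address = memory.get(cfa - WORD, 0)
        if return_address == 0:  # toy end-of-stack marker (an assumption)
            break
        rip, rsp = return_address, cfa
    return frames


# Hypothetical unwind table for three functions, sorted by program counter.
TABLE = [Row(0x1000, 8), Row(0x2000, 8), Row(0x2004, 16),
         Row(0x3000, 8), Row(0x3008, 24)]
# Hypothetical stack memory: address -> stored return address.
MEMORY = {0x7010: 0x2020, 0x7020: 0x1030, 0x7028: 0}
FRAMES = unwind(TABLE, MEMORY, rip=0x3010, rsp=0x7000)
```

A real unwinder also has to restore other registers such as RBP and handle rules based on registers other than RSP; the sketch leaves those out on purpose.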
We will talk in detail about .eh_frame in a minute, but first of all, let's see if anybody else is already using .eh_frame. Of course. The profiler we have developed is not the first one to use .eh_frame. Perf, the popular profiler from the Linux kernel, added DWARF support when the perf_event_open system call was added, which was in 3.4, and it can collect the registers of the profiled processes as well as a copy of the stack for every sample. While this approach has been proven to work, there are a bunch of drawbacks to it. For example, one of the things perf does is collect all the stacks and copy them into user space. The second thing is that the implications of one process having another process's data in user space can be quite complicated, and it's also a lot of data, especially for CPU-intensive applications. So why BPF? Stack walking in BPF makes a lot of sense for our profiler because in that case we don't have to copy the whole stack. A lot of the information still stays in the kernel, which provides higher safety guarantees, especially for a stack walking mechanism. Once it's implemented, we can leverage the perf subsystem to sample on CPU cycles, instructions, L3 cache misses, etc. And it can also help us develop other tools like allocation tracers and runtime-specific profilers, for example for the JVM or Ruby. Some of you who are familiar with BPF may know that there is already bpf_get_stackid, so why is there a need for implementing something different? Well, as it turns out, the problem with that helper is that it also requires frame pointers: it uses frame pointers to walk the stack. And for historical reasons, a fully featured DWARF unwinder is unlikely to become part of the Linux kernel.
So before we dive into how we are using .eh_frame with BPF, let's look at what .eh_frame actually has to offer. The .eh_frame section contains one or more call frame information (CFI) records. The main goal of the call frame information is to answer how to restore every register of the previous frame at any point of the code's execution. Directly storing a table that contains, for each program counter, all the registers and their locations, such as whether they have been pushed to the stack or not, would generate huge unwind tables. For that reason, DWARF's encoding is quite compact and very space efficient. The unwind tables encoded in the CFI format are in the form of opcodes, and those opcodes need to be evaluated. In the case of stack walking, once they have been evaluated, we generate a table that contains, for each instruction boundary, how to restore the value of each register. There are two main layers to it. The first one helps with repetitive patterns that compress well and allows for a more compact representation of some data. In some cases, there is also a specialized opcode that consumes, say, 1, 2 or 4 bytes rather than 4 bytes every time. The second layer is the special opcodes: another set of opcodes that contain arbitrary expressions, which need to be evaluated, and that's a lot of work. The main difference between the two is that in the first one we just need these two values, but in the second we need to evaluate arbitrary Turing-complete expressions. For that reason, we would need a full-blown VM implemented in BPF itself, which is not quite practical. So, for those who don't know how BPF applications generally work, this is how it looks from a very high-level point of view.
In user space, you have the driver program, written in Go. That's our stack, and we are using libbpfgo there. It creates the maps and attaches the BPF program to a CPU cycles perf event. It reads, parses and evaluates the .eh_frame section of the process and of all its dynamic libraries. In the BPF program, using the PID, we fetch the table, and then we have an unwind algorithm, which processes that DWARF information. We will go in depth on each component, but let's see what the algorithm looks like. It's a really simple one. Basically, we first read three important registers. The first one is the instruction pointer, RIP. The next one is the stack pointer, RSP. And the third one is, of course, the frame pointer, RBP. Then, while the frame count is less than or equal to the maximum stack depth we have defined, we find the unwind table row for the program counter, add the instruction pointer to the stack trace, calculate the previous frame's stack pointer, update the registers with the calculated values for the previous frame, and continue with the next frame. So in a nutshell, that's what the algorithm is in BPF. But now the important part is how we store that unwind information and what we have done to make it scalable. So now Javier will talk about that. Cool. So now we have some unwind information, whose format we're going to describe later, but we need to put it somewhere, right? One possibility would be to store this unwind info in the memory space of the applications that we are trying to profile. We could do this, for example, using a combination of ptrace, mmap, and mlock. We could use ptrace to hijack the process execution and allocate a new memory segment. And then we would have to lock it, because in BPF we need to make sure that the pages we are accessing are never going to be swapped out.
But this has a big problem: we would be altering the execution flow of the program, and that's something we never want to do. One of the reasons is that this memory will live in the process, which means that accounting for it will be odd, and developers will find a new memory segment there that appeared out of the blue. So in their metrics, there will be something that changes behind their backs, which is not great. But also, managing the lifecycle of this memory chunk is very complicated. For example, what do you do if our profiler dies before the processes that we are introspecting? How do we clean this up? That's a lot of problems, and solving them would require crazy engineering, which we were not planning to tackle because it would overcomplicate the project a lot. The other problem is that sharing memory is way harder, and accounting for our overhead is also very hard. If you think about it, libc, for example, is probably linked into most of the applications on your machine. So why have the same information for every single process, right? So how do we actually store this information? We use BPF maps, which are data structures that, as Vaishali said, can be written and read both from kernel and user space. We use hash maps in particular, which in the case of BPF are basically a mapping of bytes to bytes where you can put arbitrary information. This is always locked in memory. BPF has a flag that lets you not lock the memory, but for the type of BPF program we use, locking is mandatory. Otherwise, as we said before, these pages could be swapped out, and in the type of BPF programs that we have, page faults are forbidden. And yeah, in the end, we can reuse these mappings because they are in this kind of global BPF map that we have control over.
So we can store, for example, libc in one particular area, and then we just have to add metadata about where it is for every single process that has this mapping. So yeah, this is a logical representation of our information for different executable segments. Imagine this is libc, MySQL, zlib, systemd, and then some chunk that isn't used. This logical view assumes that we have a chunk of memory that is contiguous. But in reality, we cannot allocate an arbitrary chunk of memory in BPF. We cannot say we want 200 megabytes and it needs to be contiguous. We cannot do a malloc, right? So we've been doing some experiments, and on the systems we have tested and the kernels we want to support, we can fit up to 250,000 unwind entries in each value of a BPF map. Because we want to be able to fit larger unwind tables, we use the same solution that you would use in any other data-intensive application, which is partitioning or sharding. We split the unwind tables across different shards, and depending on the memory that your machine has, we allocate more or fewer shards ahead of time. That results in a CPU-to-memory trade-off, because when they get full, we regenerate them. But we'll talk about this later. So for example, let's take systemd's unwind table and suppose that it's a bit bigger than 250,000 elements, so it doesn't fit in a single shard. Because it doesn't, we have to chunk it up. We store the first chunk in the first shard, and then there's a little bit that is stored in the second shard. Since we have added all these layers of indirection, we have some bookkeeping to do, and this metadata is also stored in BPF maps. So a process can have many mappings. Each mapping can have one or more chunks, and each chunk maps to a particular shard, because you might have one unwind entry or up to 250,000 in a shard. We have some other metadata to tell you exactly where that information lives.
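The chunking just described can be sketched as follows. This is an illustrative Python model, not the project's code: the `ShardAllocator` and `Chunk` names are hypothetical, and only the 250,000-entry shard capacity comes from the talk. Tables are appended one after another, and a table that does not fit in the space left in the current shard is split into several chunks.

```python
from dataclasses import dataclass

SHARD_CAPACITY = 250_000  # unwind entries per shard, the figure given in the talk


@dataclass(frozen=True)
class Chunk:
    shard: int  # index of the shard holding this chunk
    start: int  # index of the chunk's first row inside that shard
    count: int  # number of rows in the chunk


class ShardAllocator:
    """Appends tables to fixed-size shards, splitting them when needed."""

    def __init__(self, num_shards, capacity=SHARD_CAPACITY):
        self.num_shards = num_shards
        self.capacity = capacity
        self.shard = 0   # shard currently being filled
        self.offset = 0  # next free row in that shard

    def place(self, num_rows):
        """Reserve room for num_rows rows and return the resulting chunks."""
        free = (self.num_shards - self.shard) * self.capacity - self.offset
        if num_rows > free:
            # A caller would wipe the shards and regenerate the tables here.
            raise MemoryError("shards are full")
        chunks = []
        while num_rows > 0:
            if self.offset == self.capacity:  # current shard is full
                self.shard += 1
                self.offset = 0
            take = min(num_rows, self.capacity - self.offset)
            chunks.append(Chunk(self.shard, self.offset, take))
            self.offset += take
            num_rows -= take
        return chunks


allocator = ShardAllocator(num_shards=2)
libc_chunks = allocator.place(200_000)     # fits in the first shard
systemd_chunks = allocator.place(100_000)  # spills into the second shard
```

In this model the per-process metadata would only need to reference the returned chunks, which is what allows two processes that map the same library to share one copy of its table.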
So yeah, thanks to this way of organizing data, we're able, as we said before, to share these executable mappings, and thanks to that, we skip table generation for most of the executables on your box. Thanks to this, we only use 0.9% of the CPU cycles doing the process that Vaishali was talking about before, which is not the most complex process in the universe, but it consumes some CPU cycles because we need to read some information from your ELF file, find the section, read the section, and parse and interpret the DWARF information. Now we need to do that way fewer times: on your machine, we're only going to do it once per unique executable section. So what happens if we run out of space? What we have implemented is basically a bump allocator. We keep on appending information to the shards, and logically you can see it as one big chunk of memory. Once it's full, at some point, we'll decide to wipe the whole thing and start again, but we do it in such a way that we give every single process a fair chance of being profiled. So yeah, let's take a look at how we are doing this. We start with the PID of any process on your machine that at that point happens to be running on CPU, and the first thing we do is check if it has unwind information. If it does, we need to find the mapping that contains the current instruction pointer we have for that frame. Then we need to find the chunk. We can find it with a linear search, because we have the minimum and maximum program counter covered by every chunk. Once we have the chunk, we have the shard information, and once we have the shard information, we have the unwind information. But the problem is that the unwind information, as we said before, is basically an array. We need to find our entry in this array, but it can have up to 250,000 items.
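The lookup chain above (chunk by linear search, then the entry inside the shard) might be modelled like this. It is a hedged Python sketch with hypothetical names (`ChunkInfo`, `find_chunk`, `find_row_index`, `lookup`) and toy data; a binary search is used for the last step because the rows of a chunk are sorted by program counter. In BPF the same loop would additionally need a fixed upper bound to satisfy the verifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkInfo:
    low_pc: int      # lowest program counter covered by the chunk
    high_pc: int     # highest program counter covered by the chunk
    shard: int       # shard that stores the chunk's rows
    low_index: int   # index of the chunk's first row in the shard
    high_index: int  # one past the index of the chunk's last row


def find_chunk(chunks, pc):
    """Linear search over the chunks of the mappings of one process."""
    for chunk in chunks:
        if chunk.low_pc <= pc <= chunk.high_pc:
            return chunk
    return None


def find_row_index(row_pcs, lo, hi, pc):
    """Binary search in [lo, hi) for the last row with row pc <= pc."""
    found = -1
    while lo < hi:
        mid = (lo + hi) // 2
        if row_pcs[mid] <= pc:
            found = mid
            lo = mid + 1
        else:
            hi = mid
    return found


def lookup(chunks, shards, pc):
    """Return (shard, row index) for pc, or None if nothing covers it."""
    chunk = find_chunk(chunks, pc)
    if chunk is None:
        return None
    index = find_row_index(shards[chunk.shard], chunk.low_index,
                           chunk.high_index, pc)
    return None if index < 0 else (chunk.shard, index)


# Toy data: one shard holding the row program counters of two tables.
SHARDS = {0: [0x1000, 0x1004, 0x1010, 0x2000, 0x2008]}
CHUNKS = [ChunkInfo(0x1000, 0x10FF, 0, 0, 3),
          ChunkInfo(0x2000, 0x20FF, 0, 3, 5)]
```

For instance, `lookup(CHUNKS, SHARDS, 0x2009)` selects the second chunk and then the row that starts at `0x2008`.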
If you have done anything in BPF, you know that you have to basically build whatever you need yourself. You don't have, for example, some sort of binary search that is executed in kernel space for you, so you need to implement it yourself, which is not a big deal in general. Implementing binary search is not rocket science; most of the time the difficulty is getting all the off-by-ones right. But the problem here is that in kernel space we have a lot of limitations, which we're going to talk about later along with how we overcame them, because this produces quite a bit of code that has to be split logically. Not only are the data structures sharded, but the code is sharded, too. So anyway, we binary search this with up to seven iterations, or let's say eight, if you're feeling pessimistic, and then we get the unwind action: what is the operation we need to apply to the current registers to recover the previous registers? We add that frame to the stack trace and continue with the next frame, as Vaishali explained before. We then check if the stack is correct, and we have the luxury of knowing that: when we have no unwind information and RBP is zero, that is specified by the x86-64 ABI to be the end of the stack, the bottom of the stack. If it is correct, we hash the addresses, add the hash to a map, and bump a counter. So it is reasonably cheap, and I will show you some data on this later. And then every couple of seconds, I think it's every 10 seconds or so, we collect all this information from user space and generate the actual profiles that we send to some server. As I said before, BPF has some interesting challenges for us. I think it's the closest that I've been to coding in the 90s or 80s, because we have very little stack space. We have 512 bytes, if I am not mistaken. To overcome that, we use BPF maps as some sort of heap. Then there's the problem that I mentioned before about memory locking.
That memory can never be swapped out, and it is in kernel space. So we want to make sure that we allocate the minimal space we need, and we need to do it properly, because every environment has a different cgroup configuration, and as some talks explained yesterday, it's quite tricky to know the actual memory that your machine has available. For the program size, there are two main issues. One of them is that there's a limit on the number of instructions that you can load into the kernel. But also the BPF verifier, which is the machinery that makes sure that your program is safe (for example, that your program is going to finish, that you're not dereferencing any null pointers, and that in general you're not going to crash the kernel) has a limit on the number of iterations that it does internally. This is a problem for us, because doing a full binary search already hits these limits. So we need to use some techniques like BPF tail calls, which are similar to Lisp tail calls. And, if you're lucky (we are not), you can use... well, we use bounded loops, but there is this new helper called bpf_loop, which is basically a function that calls a callback multiple times, creating some sort of loop in BPF. We cannot use that because we want to support older kernels, and that's a pretty new feature. So let's switch to something else. We have written our user space application in Go, and we are a profiler, so we want to make sure that the overhead we have on your machine is as little as possible. But unfortunately, many of the Go APIs aren't designed with performance in mind. I am new to Go. I didn't know it was like this, and every single time I profiled our profiler and found these things, I was like, how is this possible? Go has DWARF and ELF parsing libraries in the standard library, which is great, but they are not designed for performance-sensitive environments, let's say.
And also, there are two functions, binary.Read and binary.Write, that we use all the time because we need to deal with bytes back and forth, and they allocate in the fast path. But anyway, we profile our profiler all the time. We have found lots of opportunities that we keep on fixing, but of course, there's more work to do. One of the areas where we try to be pretty comprehensive is testing. We have unit tests for most of the core functions to ensure that we don't regress, but what I think has helped us the most is snapshot testing. If you're not familiar with this technique, it's very simple. You generate some textual representation of your data structures, write it to disk or somewhere in memory, it doesn't matter, then you generate it again after you make some changes to your code, and then you compare them. This is how it looks in our case. We have a Git sub-repository called test data, and in it we have this textual representation of the unwind tables. You don't have to understand it all, but the idea is that this covers a full function, whose program counters start at the one over there and end at the one over there. Then we have the information for every single program counter, and it tells you, for example, what to do here. The first one says CFA type two, which I know is for RBP. So you need to take the current RBP plus eight, and that gives you the previous frame's stack pointer. But anyway, the interesting thing is that this is very easy to implement, as you can see from our very advanced setup in our Makefile. We just build our binary, dump these tables to disk, and then ask Git to give us the changes. If anything has changed, we fail. Thanks to this, we have found a lot of bugs, and it has allowed us to iterate with confidence. One of the important things in this project has been de-risking it. It's been quite complex.
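The snapshot-testing workflow described above can be sketched in a few lines. This Python version is an assumption-laden illustration: the talk's setup dumps tables from a Go binary and compares them with Git from a Makefile, whereas here a hypothetical `render` function produces the text and a hypothetical `check_snapshot` helper compares it against a file. The row format is made up for the example.

```python
import tempfile
from pathlib import Path


def render(table):
    """Produce a stable textual representation of an unwind table."""
    return "".join(
        f"pc={pc:#x} cfa_reg={reg} cfa_offset={offset}\n"
        for pc, reg, offset in table
    )


def check_snapshot(path, text, update=False):
    """Return True if text matches the stored snapshot.

    A missing snapshot (or update=True) writes the file and passes.
    """
    if update or not path.exists():
        path.write_text(text)
        return True
    return path.read_text() == text


# Hypothetical table rows: (program counter, CFA register, CFA offset).
table = [(0x1000, "rsp", 8), (0x1004, "rsp", 16), (0x1010, "rbp", 16)]
# Same table after a pretend code change that alters one offset.
modified = [(0x1000, "rsp", 8), (0x1004, "rsp", 24), (0x1010, "rbp", 16)]

with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "table.txt"
    created = check_snapshot(snapshot, render(table))      # first run records
    unchanged = check_snapshot(snapshot, render(table))    # same output passes
    regressed = check_snapshot(snapshot, render(modified)) # a diff fails
```

The value of the technique comes from reviewing every diff: an intended change is accepted by updating the snapshot, and an unintended one is caught before it ships.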
When I started working on this, I had no idea about DWARF unwinding. I had no idea about unwinding without frame pointers at all. So we had to make sure that all these avenues were properly covered: that we had, for example, the DWARF parser properly implemented, that we had all the interactions with BPF covered, and that the BPF unwinder worked well too. For this, we always tried to have a plan B at every stage of the project, and we tried to go in depth as well as in breadth. But anyway, I have five minutes left, apparently. So we have a lot of automated testing, and one of the things that we did was add kernel tests, which is super important, especially for BPF programs, because the BPF subsystem changes a lot over time. There are a lot of features that we want to make sure we don't use, because otherwise our program wouldn't work on other kernels. So we have a kernel testing system that runs our application on multiple kernels and reports the state. One of the things that I want to talk about is that production, as usual, brings a lot of interesting challenges. By deploying our profiler to production, we found a lot of things that we didn't know about. And we were able to find some of these things thanks to continuous profiling, using our own profiler on our profiler. As you know, different hardware and different configurations are the biggest sources of performance differences as well as incidents in production. So I want to show you two things that we have found recently. One of them is that we were spending almost 30% of CPU time opening files in our production environments, which never showed up on my NVMe. The reason is that, it turns out, cloud disks are very slow. So we have fixed this. Another very interesting thing that we fixed the other day is something that happened when we rolled our profiler out to production and it started crashing.
If you are interested, we will upload the slides, so feel free to check the pull request, because everything is open source. But basically what happened here was that, for reasons, Go has a signal-based profiler, and we have it enabled for even more reasons. And it was only enabled in production. So SIGPROF was interrupting our program's execution while we were trying to load the BPF program. The BPF program takes a little while to load, because the verifier has to run a bunch of algorithms to ensure that everything is safe, and it was getting interrupted all the time. libbpf, which is the library we use to load the BPF program, was retrying this up to five times until it basically said, I tried, this didn't work, sorry, and obviously we need the BPF program to be loaded to work. There are many other considerations in this project, like short-lived processes, which we haven't optimized for, but we are still pretty decent at them. If your program runs for one second, we're probably going to catch it, but if this is something that you care about, feel free to message us. It will be something that we optimize. And then, yeah, this is our current format. I probably have one minute left or something like that. You don't have to understand it all, but the point is that we represent every single row with two 64-bit words, though soon we are making it a bit smaller, and this is how our size compares to DWARF. We are bigger because DWARF is optimized for disk space while we are optimized for raw speed. For example, our whole table for one shard pretty much fits in L2 cache. I guess, do I have any more time, or probably not, right? Two minutes, oh, okay, sorry, maybe I sped up too much. So we need to support parsing every single DWARF CFI opcode, because otherwise we won't be able to progress, but we cannot unwind from every single program counter, which sucks.
But this is not a problem in practice. The reason is that the most typical way to recover the previous frame's stack pointer, which is called the CFA in DWARF, but that doesn't matter, is that you are told which register you need to apply some offset to, and that gives you the previous frame's stack pointer. We support that, but the problem is that it could be any arbitrary register, and right now we only support offsets from either RBP or RSP, which is what happens 99% of the time. So this is something that we're going to work on soon. The other problem, as Vaishali said before, is that DWARF has a VM that you need to implement, which has to be Turing-complete and can implement any expression. It's not Turing-complete. The second level, yeah. The DWARF one? No. That's why in the Infinity project they had to add this new opcode. Okay. It's not exactly Turing-complete, it's almost there, yeah. Okay, well. But you need to implement a VM that basically has a set of virtual registers. Yeah. But the second, well, we're going to talk about those later, because the first level, yeah, it's a stack machine, 100%, but the second level, I can show you our code, it's messed up. It's messed up. But anyway, the thing is that we are very lucky here, and you can read more about this in this PR. There are two DWARF expressions that account for 50% of all the expressions we have seen in most distributions. They are the expressions that the dynamic linker needs, basically: the expressions for procedure linkage tables, or PLTs. The other good news, as I said before, is that offsets from registers other than RBP and RSP rarely occur, and all the other possibilities that I haven't talked about almost never occur. We've seen them very, very few times. The indexes are useful. Oh, good question. So. We're playing with AArch64, because the GCC AArch64 backend generates CFA expressions. That's what I was talking about, yeah, yeah, yeah.
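The "register plus offset" rule for the CFA described above can be sketched as follows. This is an illustrative Python model, not the project's BPF code: `UnwindRow`, `previous_frame` and the `read_u64` callback are hypothetical names. It assumes x86_64, where the return address sits 8 bytes below the CFA.

```python
from dataclasses import dataclass

# The talk says only offsets from these two registers are handled today.
SUPPORTED_CFA_REGISTERS = {"rsp", "rbp"}


@dataclass
class UnwindRow:
    cfa_register: str  # register the offset is applied to
    cfa_offset: int    # offset added to that register's value


def previous_frame(row, registers, read_u64):
    """Return (cfa, return_address), or None if the rule is unsupported."""
    if row.cfa_register not in SUPPORTED_CFA_REGISTERS:
        return None  # arbitrary registers are not supported in this sketch
    # CFA = value of the chosen register plus the offset from the table.
    cfa = registers[row.cfa_register] + row.cfa_offset
    # On x86_64 the return address is stored just below the CFA.
    return_address = read_u64(cfa - 8)
    return cfa, return_address
```

A real unwinder would repeat this step once per frame, looking up the row for each new program counter, until it reaches the bottom of the stack.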
But right now we only support x86_64, and I'm also going to talk about this later, sorry. But anyway. We don't know. Oh. Okay. Done. Okay, well. But we have some minutes of buffer before the next one, right? Five minutes. Five minutes. Okay. I have two more slides. Well, anyway, our BPF program: we tried to make it as fast as possible. This was running on my machine with a bunch of applications that have 90 frames or more. Even the maximum time it takes is 0.5 milliseconds, which is not terrible on my CPU, which is from late 2017. And this is in large part because we have optimized everything for memory. Everything is aligned properly, and we try to fit as many things as possible in the CPU cache. What about high-level languages? There's a project that I happen to work on, which is called rbperf. This is something that we're going to be adding in the future, basically for dynamic languages. You need knowledge of the ABI of every interpreter version, and then the stack walker is also implemented in BPF. But instead of getting the return addresses, because there are no return addresses there that are meaningful to you, you have to directly extract the function names and other information from the process heap. Our project is called Parca. There are a couple of things that we're going to be doing, like mixed unwind mode, which as far as we know no one else does in profiling. In debuggers, for sure, but not in profiling. The idea is that different sections of the stack will be unwound using different techniques. For example, if you have a JIT, like Node.js, that has frame pointers, you will unwind it with frame pointers. But once you reach the actual code of your interpreter, which is compiled and has unwind information, we will use the DWARF unwind information. ARM64 support is coming late this year. And this feature is currently disabled by default, but it is stable enough that we're going to be enabling it in a month.
And then we're going to add other runtimes, such as the JVM or Ruby. And just to say that we are open source: user space under Apache 2, BPF under the GPL. And yeah, all the links are here. Thanks, yeah, thank you so much. |
Libabigail, State Of The Onion
Current status and perspectives of the Libabigail project |
Okay. Hello, everybody. My name is Dodji. I work in the tools group at Red Hat. And so we're here today to, okay, first of all, thank you for staying. So, yeah, I wanted to talk about application binary interface analysis today. And, okay, first of all, who doesn't know about Libabigail and ABI stuff? Okay, so I think we'll have something for you guys. So, what are we going to talk about? First of all, I'll introduce what Abigail is, and we'll look at how it works, what the project brought recently, and what we're looking at as far as the future goes. So, Abigail is about doing analysis of application binary interfaces. It's a set of tools that can do things like compare the ABIs of two binaries, or store the ABI of a binary in an on-disk format. It can compare binaries that are in packages, like Debian packages, RPMs, tar files, etc. And it is also a shared library that you can use to write more tools if you want. That's all well and nice as far as marketing goes, but let's look at what we mean by ABI. Suppose you have a simple program, well, a simple binary, okay, not simple, very complicated, let's say, which has three functions that are here. The types of the functions are defined here in a simple hierarchy. Here you have the first type, S0, which inherits from base type, and another type here that inherits from S0. Okay. So, that's the first version of it. Let's see, I don't know if it compiles. Yes, it does. Then I have a second version of it which looks much the same, but what's the difference between the two? Very simple. I just inserted a data member in the base class, and we want to know what the impact of this is on the ABI, as far as the binary goes. So, where am I here? I'm in the source code of the project, and I've built a version of it.
And so, here we have one of the tools, whose name is abidiff, which does what you think it does. If I run it, what does it say? Basically, there are two changes as far as the ABI goes in that binary, due to the change we've seen. The first change is about the first function, which is here. It is telling us that that function has a parameter type that changed. And the change is about this structure, remember? Something is interesting: the size hasn't changed, even though I've added a data member in there. So, yeah, you know the drill, right? If you don't, I can explain it more. But the size hasn't changed. The base class has changed, and the change here is a data member insertion at a certain offset, blah, blah, blah. So, this is the impact of the change of the first type on the first interface, right? And there is another interface that got impacted. The parameter of that function, which was struct S1, changed as well. Its base class changed. The base class was struct S0, right? And the details of the S0 change were reported earlier, so we don't have to repeat them. So, here you see that we compute the changes, and we also analyze those changes so that we can detect whether things have been reported earlier or not. And we mess with more stuff, because here we say, for instance, that there were two changes, but one got filtered out. What does that mean? Let's see, for instance, if I recall the... Okay, I'll add a special... So, I've asked abidiff to show me redundant changes, because by default it removes redundant changes. And we see that the third function was impacted as well by the change we made. Well, all the changes that impact function three were already reported. So, this is why it was suppressed.
That change was suppressed by default because it was redundant. So, we're not just diffing things. We're analyzing the diffs, and we're trying to massage those diffs so that they can be consumed by human beings. This is what we mean by analyzing ABIs, basically. So, how does it work? The library used to implement the tools has a front-end, which is kind of backward. The front-end reads the binary. Usually, it is back-ends that write binaries, but, okay, here it's backward. So, we read the binary, which has to be in the ELF format right now, and we build an internal representation of it. We look at the publicly defined and exported symbols of declarations, basically functions and variables. We build a representation of them and their types. Then we construct the graph of those types and their subtypes, we pull all that together, and we call that an ABI corpus. A corpus, for us, is an artifact that represents the ABI of the binary we were looking at. Then there is a middle-end that acts on that internal representation. Said otherwise, it acts on ABI corpora, corpora being the plural of corpus in Latin, right? Let's be pedantic. So, as you've seen, we can compare two instances of ABI corpus. Then we build an internal representation of the result of the comparison. We call that a diff IR, so it's a different IR. And then we perform transformations on that diff IR, like categorization. We walk the graph and say, okay, this diff node, we've seen it before, so we'll mark it as being redundant with this other one. And there can be transformations that are suppressions as well. Well, suppression: we mark the nodes as being suppressed, for instance because the user wrote something that we call a suppression specification file, requiring that changes to some types not be reported.
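The redundancy categorization pass described above can be sketched roughly like this. It is a toy Python model, not Libabigail's actual diff IR: `DiffNode`, `categorize_redundant` and `report` are hypothetical names, and the function and type names in the usage note are made up to mirror the demo.

```python
from dataclasses import dataclass


@dataclass
class DiffNode:
    function: str           # the interface whose ABI changed
    changed_types: list     # type changes reachable from that interface
    redundant: bool = False


def categorize_redundant(nodes):
    """Mark a node redundant when all of its type changes were seen before."""
    seen = set()
    for node in nodes:
        new_changes = [t for t in node.changed_types if t not in seen]
        node.redundant = not new_changes
        seen.update(node.changed_types)
    return nodes


def report(nodes, show_redundant=False):
    """Back-end step: list functions, hiding redundant ones by default."""
    return [n.function for n in nodes if show_redundant or not n.redundant]
```

For example, with nodes for `first(S0)`, `second(S1, S0)` and `third(S1, S0)` in that order, `report` lists only the first two by default and all three when `show_redundant` is true, which is the behaviour shown in the abidiff demo.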
So, once we have that well-massaged diff IR, we have back-ends that walk that diff IR, obviously, or the initial IR, and do useful stuff, like emitting reports, for instance, or emitting the representation of the ABI corpus in an on-disk format that we call abixml. So, what have we done recently? I'm going a bit fast to leave time for questions and stuff, and we can go on to a, let's say, not very structured discussion afterwards, if you like. So, in recent times, what we've done is, well, you know DWARF, you know that it changes all the time with new versions of DWARF producers. With GCC 11 and LLVM 14, the default DWARF version was bumped to version 5, which is quite ancient, actually. I think it was released in 2017 or something. So, yeah, we support most of that right now. Another major thing that happened recently, thanks to folks in this room, don't worry, I won't give your names, is that a new debug info format was added, because we started with DWARF only. Support for the CTF debug info format was added to Libabigail. So, basically, if you have a binary that has CTF and/or DWARF, you can choose whichever you want to use as the source of type information. Things being how they are, the code got changed a bit to be turned into a multi-front-end architecture. We also have a multi-back-end architecture, basically, because we have different types of reports. The one I've shown you is the default one, which is quite verbose, and some people like it more terse. And who knows what weird requests users might come up with in the future. So, yeah, different report back-ends. And it doesn't stop there. We are still working on new stuff coming from user requests. So, yeah, apparently the new kid on the block, well, the new cool stuff in town now, is BPF, right?
And with BPF comes BTF, which is the type description format of BPF. There were some requests to support that. So, it is now in mainline, in the Libabigail mainline, but it's not released yet. It should be released in the next version. So, what do we do with that thing? Basically, because BTF describes C types, we are using it to compare the interface exposed by the kernel to its modules. We were doing that with CTF already, with BTF now, and also with DWARF. With DWARF, it is much less fast, shall we say, than with CTF and BTF. So, people are using that feature to analyze the KABI, the kernel ABI, that thing that doesn't exist. And then we've had weird project-specific requests over the years. The last one, which came in last month, yeah, in January, was to have what I call library set ABI analysis. Basically, there's a project that has a huge library, and they're planning to split it into different libraries. But they're supposed to keep ABI compatibility. So, they would like to ensure that the set of broken-down libraries has an ABI that is equivalent to, or compatible with, that of the initial one. This is what I call library set ABI analysis. We're going to add support for that; I don't know if it's going to be in the next version or not. So, yeah, these are the kinds of things we're working on. And now, I'll let you ask questions if you have any. Does the library have any support for language-specific ABIs? So, languages that are built on top of C, for example, but that have their own schemes? Yeah, exactly. So, yes. Of course, DWARF is multi-language. So, if the compiler for that language emits DWARF, then we're good to go.
There is a small layer of language-specific stuff we add for reporting, so that we can report things in the native language of the programmer who wrote the code. To give you a concrete example, right now we support C++, C and Fortran. Someone asked me for Rust support, so we added that, basically. We have some crashes on OCaml. I thought we were supporting it too, but I need to do some work. So, yeah, it needs work, but... For a new language, I just have to define a small layer for the mangling logic? For the mangling logic. So, okay, let me show you an example. So, yeah. I was writing. Let's see. So, you see, for instance, in C++, here you see this function, function 3. I'll change it in the second version here and add an integer parameter, right? Yes. Whoops. We compile that, whoops. And, whoops. Weird stuff happened. Look at what it is saying here. Because we're in C++, I changed function 3 in the source code. Let me just, yeah. See? I changed function 3 here, and I added a parameter. That's what the programmer would say. But from the binary standpoint, what we're seeing is that the first function was removed, and then another one got added. This is because in C++ the symbol names of the two versions of the function are different. They have different mangling, okay? So, we go from the name of the symbol to the name of the declaration, right? But if I do the same in C, then, like, yeah, I knew you would ask that question. I don't know you, but... So, I have a second version here. Boom, boom, boom. And here, oh, sorry, I changed the parameter of the function there, but this is in C, okay?
And so, if I go to the shell and I look at, boom, the two, so, the first one was hello, and this one is bye, of course, because I think this is going to be the last C here. Because in C the names of the two symbols are the same, now we say that the function has changed. So, these are the kinds of things that we'll have to adapt, basically, but there is not much to do. In some cases you have mangling, and in other cases you don't, so you don't have anything to do for the mangling. Does that answer your question? Roughly. Roughly, yeah. You have this part of the code which decodes the mangled name to a readable name? No, no, because the matching is done by DWARF. We know that this symbol is for this declaration, so we don't have to do the mangling or demangling. We look at the addresses and we know that this symbol is for that one. So, yeah, we don't really care about that. Another, yeah, please go ahead. Oh, there is none. No, no, it's, so, yeah, just to repeat the question for the, yeah: what are the performance issues when we analyze big libraries like, he said, LLVM, but there is also WebKit, Gecko, etc.? So, when we're looking at DWARF, we have a fundamental problem, which is the duplication of types. Here we are in the business of comparing things, right? And when we compare types, we are in the land of quadratic algorithms. So, things are inherently slow if we do them naively. And the thing is, in DWARF, every single translation unit is represented.
But then, when you have the final binary, the final shared library, for instance, and you have, I don't know, 1,000 translation units, and every single translation unit uses the string type, for instance, then you will have the string type represented 1,000 times, at least, in the DWARF. And we must be sure that those 1,000 occurrences of string are one and the same. We can't just look at the name and say they're the same, because they could be different, right? So, we have to compare them and make sure they're the same, and then we'll say, okay, I'll just keep one and throw away the others. This is called de-duplication of types. And this process takes a huge amount of time. For huge libraries, it can take forever. We have heuristics to make this faster, but it still takes time. Some of the heuristics we're working on now are in the land of partitioning: we do things piecewise, so that we can do them in parallel. It is not mainline yet, but this is the future we're thinking about. Another approach is to have the types de-duplicated before we intervene. This is what, for instance, the CTF guys do with C. They do the de-duplication at debug info production time, and in that case we're golden. Another case where that happens is when we're building distribution packages, like, I don't know, RPM or Debian packages or whatever. There is a tool called DWZ, which does the de-duplication to some extent. Well, when it works, it works.
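The naive de-duplication described above can be sketched in a few lines. This is an illustrative Python model, not Libabigail's algorithm or its heuristics: `deduplicate` is a hypothetical name, and each type record is simplified to a `(name, members)` tuple.

```python
def deduplicate(type_records):
    """Keep one representative per group of structurally equal types.

    Each incoming record is compared against the representatives kept so
    far, so the worst case is quadratic in the number of distinct types.
    """
    canonical = []
    for candidate in type_records:
        # Compare the full structure, not just the name: two types called
        # "string" in different translation units may still differ.
        if not any(candidate == kept for kept in canonical):
            canonical.append(candidate)
    return canonical
```

For instance, 1,000 identical `string` records from 1,000 translation units collapse to one, while a record that shares the name but has a different member list is kept as a separate type.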
It does the de-duplication, but the problem is that DWZ has the same issue as us, and sometimes, when the binary is too big, DWZ will just give up. In that case, well, we have to use our little hands and do the de-duplication inline, and then we'll spend time. But shouldn't someone take DWZ, turn it into a library and put it in the linker? Yes, and, yes. Do it at link time? Yeah, that's one of the things that we need to do to improve the entire ecosystem of these things, definitely. So, yeah. Do we have other questions? Yes, are there any other formats that are on your roadmap? Right now, no, but three months ago BTF was not on my roadmap either, so, you know, the future is not what it used to be. I don't know. Anyway, we are hosted on Sourceware, we still use mailing lists, you send us patches, and you can find us on IRC, on the OFTC network. And, well, thank you very much. |
GNU poke beyond the CLI (Command Line Interface)
poked + pokelets = Better UI |
So, what is GNU poke? The project has, you can say, many components, depending on how you count them. But in a simple way, you can say, okay, there is libpoke, which has the incremental compiler for the Poke programming language. Incremental means you can add stuff and redefine things, types and variables, over time. The other part is the PVM, the Poke virtual machine, which is powered by the GNU Jitter project by Luca Saiu. Jitter is a virtual machine generator, which is fun, but there's no time for talking about that. Sorry, Luca. And the other part is the IO space, which is the abstraction that lets you address bits on the IO devices. These are the main components of libpoke. The other component we have in the GNU poke project is poke, the CLI application, which is based on libpoke and uses readline: you write to its stdin, and it gives you errors and output on stdout. The third thing we have is poked, the poke daemon. You know, daemon in the sense of a server, a program that exists, that we don't see, but that is doing something. So it's a daemon for me. Some people believe that a daemon is only a system thing, but I don't think so, so I like to call it poked. So, a very brief look at the components inside libpoke. If you go to the libpoke.h file, you will see three opaque types: pk_compiler, which represents the interaction with the compiler; pk_val, which is the values from the Poke virtual machine; and pk_ios, which is the abstraction over different devices. We have the network block device, the zero device, the sub device, file, stream, like stdout there, and process, where you can poke other processes by attaching using the PID, and memory, which is an in-process memory for doing temporary things.
We also have a foreign device, with which you can introduce a new device. As an example, I can show that, reading from packets and writing to memory. Then in poke you can poke the packets, network packets coming from a network. For example, at my job I have to work with Bluetooth things, so we verify the data formats: we read them, and then we verify with poke that we get what we expect. And the language. The language is procedural, of course; statically typed, of course; interpreted, because of the PVM; and it's an interactive DSL for binary data. It is designed specifically to be powerful enough to express binary formats. So we can have variables: var a equals something. We can have weird integers, like here I have an int 32 literal which I'm casting to uint 13. We have strings, which are null-terminated. There is no non-null-terminated string in Poke; if you want one, you can use arrays. This is an offset. Sizes and offsets in Poke are a magnitude, which is an integer, and a unit, which you can define. The basic unit is the bit, but you can define your own unit, like, let me, no, let me not. Yeah, here we have a REPL, the poke REPL. So I can say, okay, unit foo equals three, and then I can write one foo, and here is the output of the result, which is one, three: an offset of magnitude one of this foo thing, which is three bits wide. The 13 here is the number of bits of this integer. It's an integer of size 13, and I'm casting to that from 32. And this is the syntax for arrays. For literals that are unsigned bytes, you have to specify the UB suffix. So this is an array; I can actually copy it, var e equals zero UB, one UB, and then e, and you can see here zero, one, so it's an array. As for units, if you go to the, I think it's the std.pk in our repo, they are there.
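The two ideas above, offsets as a magnitude plus a unit and casting to an odd-width integer, can be modelled in a few lines. This is a Python sketch of the concepts, not Poke code: `Offset` and `cast_unsigned` are hypothetical names, and the sketch assumes that narrowing to an unsigned 13-bit integer keeps the low 13 bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offset:
    magnitude: int  # how many units
    unit: int       # size of one unit, in bits

    def to_bits(self):
        return self.magnitude * self.unit


def cast_unsigned(value, width):
    """Narrow value to an unsigned integer of the given bit width."""
    return value & ((1 << width) - 1)


FOO = 3   # the user-defined unit from the demo: three bits wide
BYTE = 8  # one byte is eight bits

one_foo = Offset(1, FOO)     # one foo, i.e. three bits
two_bytes = Offset(2, BYTE)  # sixteen bits
```

Keeping the unit next to the magnitude is what lets offsets in different units, bits, bytes or a custom `foo`, be compared or converted without losing track of what the number means.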
Uppercase B is eight, then Kb and KB, kilobyte, that kind of stuff. You can also define types. Here you have a structure which consists of a three-bit integer, followed by an offset whose magnitude is a signed integer, a 50-bit number, and which represents offsets in units of bytes, and we call it j, and then we have three strings, you know, null-terminated strings. And functions: you use the syntax fun, my function, equals, then this is the input signature, the output type, a colon, and then the body. This thing here is for attributes of a value, like the length, whether it is mapped or not, and stuff like that. So, the application, poke: it's a REPL, you can declare stuff, like what you see here, though this is not the poke application, it is something else. And it provides a bunch of utilities. Here I'm opening a new terminal, and I open poke, the program, the CLI. We have here what we call dot-commands, which start with a dot. So, .help, and you see here that we have a bunch of things. Or look, for example, at .set. With .set you can configure pretty-printing, or you can change settings, like setting the output base to 10. This kind of thing is also possible here. And here you can define variables, you can do anything, and you can redefine a again to be hello, and it will be hello. So the next thing is, okay, I can run these things, which I already ran, so I won't go over them. If we want to know how poke, the CLI, works, a very oversimplified view is that it uses these two libpoke functions, pk_compile_buffer and pk_compile_statement. The trick in the poke application is this: if the input starts with one of these keywords, like var, unit, type, fun or immutable, it compiles and runs the thing using this function, which expects arbitrary Poke code. Otherwise it uses pk_compile_statement, which assumes that you are passing in a statement at the REPL.
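The keyword-based dispatch described above can be sketched as follows. This is a Python illustration of the decision only, not the CLI's real C code: `choose_entry_point` is a hypothetical name, and it returns the name of the libpoke function the talk mentions instead of calling it.

```python
# Keywords listed in the talk as marking "arbitrary Poke code".
DECLARATION_KEYWORDS = ("var", "unit", "type", "fun", "immutable")


def choose_entry_point(line):
    """Pick which libpoke function a line of REPL input would go to."""
    words = line.split()
    first_word = words[0] if words else ""
    if first_word in DECLARATION_KEYWORDS:
        # Declarations go through the buffer compiler.
        return "pk_compile_buffer"
    # Everything else is treated as a single statement.
    return "pk_compile_statement"
```

Checking only the first word is what makes the scheme simple, and also what produces the limitation mentioned in the talk: input that does not start with one of the keywords is always treated as a single statement.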
And despite the names, both of them compile and run the Poke program. And this is an example: because this thing starts with var, you can add more than one statement to it. Here is a statement, because there is no var or immutable in front, and this one is a syntax error at the poke REPL. I can show you, yeah, there is an error, because it expects a statement, not arbitrary code. So you see that there are some limitations. So what is this thing called, let me, yeah. So, we read from the terminal, and that gives us a structure called input, somehow. We compare the first word with these keywords, and if it matches, we call compile buffer. This check macro, this is pseudocode, it's not real code, first checks the result of the compile buffer function, which is the compile-time error, and also the exception raised during the execution of the program. Otherwise, okay, it checks things the same way, and then with compile statement you also get the value back, besides the exception. So, what is poked? Here in the poke application we have this layer of stdin, stdout and stderr over libpoke. So what about generalizing this abstraction to Unix sockets? Instead of getting information from the standard input, we can get it from a Unix socket. poked is a daemon that acts like a broker: it listens on this socket, and it has a concept of channels. You have input channels and output channels, and they are completely independent. Then a client, we call them pokelets, connects to poked, and it should tell poked what its role is. The role is an 8-bit thing, which is an integral struct, you know, the syntax is different from normal structs, but, should I explain that? No, I don't know, no. So the most significant bit is the direction, followed by seven bits of channel number, which is limited; we reserve some for future upgrades.
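The role byte described above, one direction bit followed by a seven-bit channel number, can be packed and unpacked as in this Python sketch. `encode_role` and `decode_role` are hypothetical names; the talk does not say which bit value means input and which means output, so the sketch leaves `direction` as a raw 0 or 1. The pdap pickle mentioned in the talk is the authoritative description.

```python
def encode_role(direction, channel):
    """Pack a direction bit and a 7-bit channel number into one byte."""
    if direction not in (0, 1):
        raise ValueError("direction must be 0 or 1")
    if not 0 <= channel <= 0x7F:
        raise ValueError("channel must fit in seven bits")
    # The most significant bit carries the direction.
    return bytes([(direction << 7) | channel])


def decode_role(data):
    """Split the first byte of data into (direction, channel)."""
    byte = data[0]
    return byte >> 7, byte & 0x7F
```

A pokelet would send this single byte right after connecting to the socket, so the daemon knows which channel the connection belongs to before any other data flows.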
So when you connect to this socket, you have to write this byte to poked. Then it knows either that you are an input, so it expects something from you, or that you are an output channel, and then when some user code writes to this channel, you will get the data. It's distributed to all; there is no addressing, it's a broadcast. If you want to know how these things work, there is a pickle. I can show it, why not. If we go to the GNU poke source, okay, here, oh, sorry, I have to enable the syntax highlighter, so, yeah, then pickles, pdap. All of this communication, the poked application protocol, is described here: the roles and all the rest, all of them are there. So if you are curious, you can go there and study it. But now we'll go into more detail. Here is the oversimplified view of poked. We read from the input channels, and we have a concept of iteration. When you start sending something, you can have different print statements, so the data is sent partially, in chunks, and we have to somehow notify the user: okay, here you're starting a new iteration of compiling or something. If the input is from the input channel, which is channel number one, it's simple, I think: it is given to pk_compile_buffer. If it's from the command input, which is input channel number two, it is given to pk_compile_statement. And here we have this check-publish instead of print: it publishes to all the subscribers, all the clients that said, okay, we are interested in getting data from this output channel. So, any questions regarding these things, poke, poked, the poke CLI, pickles? Nothing? Great, nobody is following, so that's great. So, let's talk about pacme. So, question? Where? No, if you, I'll open for questions. What? Okay. I'm trying my best to answer questions, so.
Okay, so now let's talk about pacme, which is the title of this talk. It's an Acme-inspired poke interface. It's not Acme, let me explain. The reason is that text rocks, and anything else, like graphical interfaces, sucks, really, seriously. Whether this is in line with the code of conduct or not, I don't know. You talk bad about graphics. Anything else sucks, yeah, good, thank you for the support. So, examples of good interfaces, in my opinion, are Lisp machines, Smalltalk environments, Oberon, which is a desktop computer system written from scratch by Niklaus Wirth, the creator of Pascal, for students, but with a very interesting text-oriented interface, and Acme, which is the editor of a great operating system called Plan 9 from Bell Labs, written by Rob Pike, the co-inventor of the Go programming language, and which is also inspired by Oberon. The good thing is that text is the main thing there. You can select text and execute text, so it is very interactive, but you also have the text, so you can compose things and then, okay, put them in a function, and this kind of thing is much more powerful than a dumb button that you have to click on and that you cannot even change if it's not useful for you. So, the reasons are: it's easy to compose, and it's automation-friendly. You can explore, which is very useful for binary data: you explore different things, and over time you have a script, so you copy it, paste it into a function, and call it later, why not?
And it's extensible. If you need something more, something we don't know about, you develop your own program, which is a very simple thing to do. And for your information, I'm not against graphics. Graphics is awesome. It's the way we currently use graphical interfaces that is bad: I have a button, and I cannot change the function attached to it, I cannot remove this button and put in two buttons that are more useful for my case. So, yeah, I love graphics, like formulas and visualization, because the best pattern matcher we have on the planet is our brain, so why not? Visualization is cool. But this, yeah, never mind. It's getting out of the context of this talk. So, okay, pacme is pokelets plus tmux. The idea is, okay, this terminal thing is not the best thing we could have, it's from the 60s, but, yes, it's by far the best thing we have today, so let's live there. One option is to implement everything from scratch, which I'm too lazy to do. We already have programs like tmux and screen, with which you can do interesting stuff. If you don't know about it, let me show you. Okay, this is a normal terminal, and let me use one specific configuration for this. So here I'm opening this thing. There is a prefix key, with which you say, okay, please do this thing, like resizing things or running commands. You can do many cool things here.
I'm not, this is good, we have this kind of thing, so we can use this infrastructure, and with the power of little programs, the pokelets, which talk with the poke over the socket, we can create a user interface, which is dynamic, not the best, but yeah, we can improve it, it's more interactive, and also, because of this, there is no limit, here pack me, I'm talking about terminals, but you can run on Emacs, which is we have, these things can coexist together, you have pack me here, Emacs here, and even I have an implementation in WebSocket, but it's still in Python, so I'm not allowed to publish it, unless I have some C implementation for it, yeah, I'm not looking at you, yeah, and in future, we will have support for screen, I chose T-Max because I use it on a daily basis, so why not, and okay, so I showed the T-Max for you, so let me show you the pack me, if you like pack me, it's here, so home user bin pack me, da-da, this is it, so it's called the T-Max on a specific unique socket, which is a T-Max thing, not important for us, with this specifically, so pack me, this T-Max configuration file, so let's look at what is it, so here, share, pack me, yeah, here, so here, like we are, you know, the default, because when you want to do something to instruct the T-Max, you have to send the key combination, okay, now I'm talking to you T-Max, so by default, it's control B here, you're unbinding it, and here, I chose control G, you can choose whatever you want, and my favorite one is control O, because it doesn't conflict with anything, anything else, B, A, G are conflicting with the GNU read line, which I hate, but this is for you people, so you can change it, and so we are, like we can have control O, so if I press control G and uppercase O, I automatically split the window, send these literal things, like peel it out, enter, this is, you know, but you see here, and then select, you know, go back to my cursor to my current pane, which is this one, the upper one, so you 
can, now if I go to the, okay, I can new one, peel it, echo, print, it's better to be quoted, hello, fuss them, and I guess with this one, I have a very little program which reads from, is it in, and writes to the poke D, so you see here, this 73 is the iteration number, and that's the result of the execution, it's there, so if I rerun it again, a new thing, so here in this patch, peel it out, I decided to use this, but you can use anything you are interested in, and just for the record, this slide thing, it's a poke led too, and I can show the code to you, like user bin, peel it, slides, it's like peel it out, channel number 60, don't put the length, so it's not important, and this is the terminal code, escape code for clearing this thing, so it just, instead of writing this slash slash the number, it just clears this, so I have these slides here, so the other thing I can show you, so let me finish this thing, so now key bindings, normal key bindings, like I have this power here to like, let me, this is also cool, let's out, okay, here in my thing, I think let's define a variable, echo var a equals 100, and send it to the poke, okay, so we know that it has, but with the power of Tmax, I can go and select this a, and I'm telling the, this is the key binding, okay, execute it, the e, upper e, this key binding, so pipe and cancel, put a semicolon at the end, and give it to the pokelet in channel two, which is for the command, so if I press shift e, 100 you have c, so it executes this a, it's like that you've written the a on the command line, in the ripple, so very simple thing, so I have also, because I love users, so I provided predefined layouts for you, so you can hear, control GF1, like it's opened your editor here, which is hell, yeah, this is using this editor thing, so you can just use your favorite editor, you have here the pickle, which I write a, you see, the logic is in the poke t, so when you can connect, disconnect, everything is there, and another 
layout is f2, yeah, it worked, so let's show, let me show you something, maybe a little bit more interesting, yeah, yeah, yeah, var, let's open the file, which I have somewhere, now this is error, I, okay, home, caldron, I don't remember that, caldron, yeah, test 0, 1, elf, okay, it's an elf file, you see here the number 0, this is the handle to this IO thing, this file, so I can load elf module, and okay, and I know that it's a risk file elf, so let's load this risk v module, we call module pickles, if you wonder why, ask him, and so var file equals elf, it's an elf 6, I know that it's an elf 3rd, oh, I can like do the view, here you get a dump, and you can, the fun thing is you can add as many as viewer you want, so it's the same thing, you can have a viewer in web, so here you instruct your thing, and in the web you get your things, highlighting stuff, anything you want, yeah, and so let's do elf 64 file, add this IO space at offset 0, so I'm telling that, okay, go to this IO space, number 0, go to that offset, and give me the elf file there, there's no elf file there, and it's, okay, oh, and if I hear, oh, oh, oh, oh, yeah, in the, I can zoom in, so here you see that, you get all the elf stuff here, like e-edit, no type, machine, section header, and others, so let's get the text section of this thing, elf 32, no, we have the file, get sections by name, that text, I know that this elf file only has one, so it gives all the sections, so I have one, so sub zero, if I write text, so at the end you see, you get this section header with the offset and size, right, so this is the address where the code, the actual execution code resides in the binary, so, but you know, you can say, okay, give me the bytes, give me how much bytes, text SH size bytes is an, here, this is an offset, I'm telling that, okay, I need bytes of this size, at this ispace, and at this offset, you know, go to this offset, SH offset, so you get all the bytes here, I can VM set output base to 16 
and do it again, so in more interesting things, so, but it is bytes, what else can we do, you can say, okay, I know that these are risks, risks 532 I instructions, you know, this subset, you know, this basic base component of this risk machine, so give me that amount of elements of instructions at this offset, so here, if we go to the end, so you get all these instructions, risk 5 instructions, which be written in poke, the description of this instruction, that's the extensible thing, so for example, here we have an instruction of format I, which immediate of this value, RS1, functionality 0, or the this one, and opcode stuff, I can show you the, I think I'm done, you know, I don't have any time, so let me just show this one and then to Radare, so poke, what was, yeah, pickles, what, yeah, yeah, yeah, hey, I didn't pay for them, you can, it's free from my point of view, you have to handle him, so this is risk 5, you see, we define the opcodes, different instructions, like this is the instruction type B, it's an integral structure, so you can read about this, so we can have very complex things, we have pretty printers, nice thing is we have, also I want to show this, this is from, okay, I can put it into incense variable, incense 0, which is, let me close this view, dumb view, yeah, yeah, here we have the thing, so you can give the, this is the actual number, which is there, in the memory it is stored in the little indian, so 1380, something like that, and one, what, what, yeah, as, as, so we have a method, you call this method, you don't need to print, if there is no argument, so it gives you a real valid instruction for the assembler, so this is part of the whole story, so you can add more pokelets, next year I will come back with more interesting things, no GUI, but the right way, and yeah, thank you, thank you, if there is any more, thank you, thank you, thank you, thank you. |
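The I-format instruction shown at the end of the demo (immediate, rs1, a function field, rd, opcode) can also be sketched outside poke. The Python snippet below is only an illustration and not the poke pickle from the talk: it decodes the standard RV32I I-type bit layout from four little-endian bytes. The function name and the sample instruction are assumptions chosen for the example.

```python
import struct

def decode_itype(raw: bytes) -> dict:
    """Decode one RV32I I-type instruction from 4 little-endian bytes."""
    (word,) = struct.unpack("<I", raw)   # instructions are stored little-endian
    imm = (word >> 20) & 0xFFF           # bits 31..20: 12-bit immediate
    if imm & 0x800:                      # sign-extend the immediate
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,           # bits 6..0
        "rd": (word >> 7) & 0x1F,        # bits 11..7
        "funct3": (word >> 12) & 0x7,    # bits 14..12
        "rs1": (word >> 15) & 0x1F,      # bits 19..15
        "imm": imm,
    }

# Hypothetical sample: the four bytes of "addi a0, a0, 1" (word 0x00150513).
fields = decode_itype(bytes.fromhex("13051500"))
print(fields)
```

In poke the same layout would be described declaratively as a struct type; the shifts and masks here just spell out what such a description encodes.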
The state of r2land
Presenting radare2, last updates and development plans |
So, in this presentation I will try to show you the project: what's the current state, the most used plugins, and which are the main features. Okay, so first of all, a quick presentation. I'm Sergi Alvarez, everybody knows me by my nickname, which is pancake. I'm the author of the tool, I live in Barcelona, and I work at NowSecure, which is a company from the US. We basically use different static and dynamic instrumentation tools for analyzing applications and finding privacy issues, like identifying if the application is leaking data, stuff like that, and then we generate reports for the customers and developers to improve the quality of the applications. So I work as a senior mobile security analyst. I like command-line tools, I like text interfaces, and I wrote a lot of open source software, so my first goal is basically publishing all the stuff that I'm doing. I like free software, and I'm maintaining the whole r2 ecosystem nowadays. I'm basically focusing on r2, but I also maintain r2ghidra, r2frida, and many other plugins that work with r2. We'll have to reduce the font size here, this slide is a little bit... So it's a 17-year-old project. I started this tool basically as a forensic tool: I wanted to recover some files that were lost on a hard drive. The thing is that I was working as a forensic analyst, but I was not going to use the proprietary software that was in the company, so I wrote a simple hexadecimal editor that was able to find some patterns on the disk and then dump, like, one megabyte from there. After this I was interested in participating in CTFs and different reverse engineering competitions, and I found out that there were many tools that didn't really solve my problems. Starting with, for example, GDB: it was not possible to script it at the time, typing commands all the time was kind of tedious, and I just wanted to automate many things.
Also, there were hexadecimal editors, but it was not possible to extend them with plugins or anything like this. There were disassemblers: objdump is cool, but it's not interactive, and the only interactive tool was proprietary, which is IDA. Anyway, there was no real open source ecosystem that solved all of these problems at once. There were many little tools that each solved one problem but could not really integrate with the rest of the ecosystem or with other tools. So I decided to start picking ideas, picking tools, and developing everything from scratch. That's how I did r2, because r2 does not depend on anything: you only depend on POSIX, like libc, and all the rest of the dependencies are written from scratch, like console handling, the readline interface, all the socket interface, parsing libraries, disassembling things, etc. It's licensed under LGPLv3, and the focus of the project is basically to integrate with other tools and be useful for hackers. It's not going to be a general-purpose solution for all problems. For example, I don't plan to write all the disassemblers from scratch; I think there are better projects for this, so I'm integrating them into r2. The same goes for libpoke, which is one of the tools that works with r2. It's fully written in C. I mainly focus on portability, because I like the things I write to run everywhere, so the only option nowadays is C. I mean, there are some Rust haters around, but anyway. The thing is that r2 can be compiled to WebAssembly, so you can run r2 inside your browser. You can also build it as a statically linked single binary, so you can drop it on a router. This year I plan to port it to UEFI, so you can run r2 inside your bootloader, independently of any operating system, mount file systems, and things like that.
So there are some really strict constraints on all the code that ships in r2. There is a CI that is basically verifying everything, and there is a 24/7 fuzzer that is running and finding bugs, and I fix them; my policy is that I don't let bugs stay for more than one day. The code cannot contain setjmp, or abort, or asserts, or anything like this, because if you are doing something live, you don't want things to crash or break. The idea is that all the code that runs in r2 must be usable as a library. So I don't want to use global variables, I don't want to crash if something is not parsing properly, and if malloc fails, I want the program to still run, things like this. This is the main concern I have when I write code for r2. It's developed in a single repo, but it's separated into different modules. So it's a big project in one repository, but there is a bunch of libraries, each library has a bunch of plugins, and many of these plugins are integrated with, or expose, interfaces for extending them with scripting languages. So in the end it's basically different layers of capabilities that can be extended pretty easily at each layer. There is a command-line interface, so you have a prompt where you can type things. There is a visual mode, which is basically a list of commands that are executed every time you press a key. Then you have the panels mode, where you can make splits and have different tabs, different frames, and so on. And then there are web interfaces, and there are some people writing graphical interfaces for it, like iaito, which is the Qt interface for it.
For scripting, the easiest way to script r2 is r2pipe, which is basically the simplest interface for interfacing with anything: you run something, you pass a string with a command, and you get the output of the command as a string. But there are also bindings for the C API, there are automated bindings for Python, for Rust, etc., and there is also support for using these bindings from different scripting languages. So what are the libraries implementing or exposing? You have the IO library, which basically abstracts access to the IO. This defines how you access a file, and everything is abstract, which means that a file doesn't need to be anything physical. I mean, you don't map a full file in memory and then work on it, because this is abstracted by the IO. So you can map a remote file that lives in a remote instance of r2, because you can run r2 as a server instance. Or you can map, for example, a ptrace backend, which basically reads and writes memory from another process, and this is just another IO interface. All that stuff is just a file descriptor.
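The idea that a "file" is just something that answers reads at an offset can be sketched in a few lines. This is a toy illustration in Python, not r2's IO plugin API: the class and function names are invented for the example, and an `io.BytesIO` object stands in for a real file on disk.

```python
import io
from abc import ABC, abstractmethod

class IOBackend(ABC):
    """Hypothetical minimal IO interface: everything is read by offset."""
    @abstractmethod
    def read_at(self, offset: int, size: int) -> bytes: ...

class BytesBackend(IOBackend):
    """Data that lives only in memory (nothing physical behind it)."""
    def __init__(self, data: bytes):
        self.data = data
    def read_at(self, offset: int, size: int) -> bytes:
        return self.data[offset:offset + size]

class FileBackend(IOBackend):
    """Data behind a seekable file object."""
    def __init__(self, fobj):
        self.fobj = fobj
    def read_at(self, offset: int, size: int) -> bytes:
        self.fobj.seek(offset)
        return self.fobj.read(size)

def magic(backend: IOBackend) -> bytes:
    # The caller does not know or care which backend it is talking to.
    return backend.read_at(0, 4)

elf_header = b"\x7fELF\x02\x01\x01\x00"
print(magic(BytesBackend(elf_header)), magic(FileBackend(io.BytesIO(elf_header))))
```

A remote r2 instance or a ptrace-based reader would be further implementations of the same interface, which is why the rest of the tool can treat them all alike.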
So when you have one IO open, you can map this file descriptor into separate maps. A map is basically a portion of memory taken from the file descriptor: you say that from this offset to this offset of this file descriptor will be mapped at this virtual address in the IO space of r2. And then there are IO banks, and an IO bank is basically a bunch of maps, so you can have separate memory spaces. For example, if you want to isolate thread-local storage, you can have one IO bank that only contains the contents of the thread-local storage, and another bank that contains the whole memory layout of the process. You can do that just by typing commands, and you can also do it using the API. It's also possible to create cyclic memory layouts. For example, V850 is an architecture that is used in automobiles and things like that, and this architecture basically relies on having some solid infrastructure: there are some models that have two CPUs executing the same code at the same time, with a verification that the two CPUs are doing exactly the same thing at runtime. The memory of this CPU is basically cyclic, so there are some references that go backward: you have an instruction at address zero that references something above it, and this something is basically in the negative part of the address space, and this address space is not 32 bits in size, it's 26, for example. You can configure this kind of thing inside r2, and then emulation, the flags, which are basically names for offsets, and everything else will be shaped properly. You can also define memory spaces with odd byte sizes, like 7-bit bytes and things like this. It's also able to parse binary formats. This works on any kind of memory, like I said before, so you can parse from memory or from disk. There is support for the
most well-known file formats, from console binaries and ROM headers like Game Boy, etc., but also ELF, PE, Mach-O, COFF, etc. It's also parsing DWARF, PDB, and other debug information. For now this is only used for mapping memory addresses to file names and so on; it's not really exporting all the structures, but it will be possible to do that in the future, or by using external libraries. It's also parsing class information from Swift, Objective-C, or C++ binaries, and all that stuff is integrated inside r2. It's abstracting the information from all these file formats into a single naming. For example, an import in a PE is not the same as an import in an ELF, but for r2 it is the same: when you want to list what a binary is importing from other libraries, you just ask for the imports. You don't have to use different APIs or different commands depending on the file format. It's also supporting assembling and disassembling through one API. There is a library that exposes an interface for doing this, and this library is extended by plugins, so you can implement new architectures by writing plugins for this library. But it's not only used for assembling and disassembling, which is basically text to bytes and back. There is more detailed low-level information: for some architectures you can get structured metadata, like which is the first operand, what's the size of the second argument, things like this. It's also exposing ESIL. ESIL is a very simple text representation for explaining what an instruction is doing at a low level. It's very similar to Forth, like a stack-based machine, and it's basically a list of tokens separated by commas. So if you have 0, comma, ax, comma, equals, it means that it is pushing the number, then pushing the register name, and then
pushing the operation, which is the equals sign, and then popping the operands from the stack to execute the statement. The reason for that is that there are so many ways to extend or define an architecture, and there are some really fucked up things that can be done in some architectures. I was not going to define extensible structures or do some really complex things, because in the end I was always finding something that was not compatible with another architecture. So I ended up saying: okay, I can just define a comma-separated string that is located in a single memory chunk, split it by commas, and then emulate that. So there is a bunch of tools and libraries that use ESIL to extract this information, and then use it for emulation, for extracting information about a specific instruction, or even for decompiling. It's very portable, and it supports debugging too, so you can do local and remote debugging. This means that you can run r2 as a local debugger on your Linux, Mac, or iOS device, but you can also attach to a remote GDB or WinDbg, etc. There is functionality for searching for different patterns: you can search for strings, hexadecimal values, and so on. You can also ask it to find something that is repeated multiple times, and it will find whether there is any pattern that is repeated many times and give you the offsets. It's also able to generate function signatures. By running the whole analysis of the program, it will identify all the functions, basic blocks, etc., and then you can generate metadata for each of these functions, and this metadata can be imported again to search for this information in another binary.
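The "split by commas and emulate" idea is small enough to sketch. The Python function below is a toy evaluator written for this transcript, not r2's ESIL implementation: it only understands numbers, register names, `=` and `+`, and the function name and register dictionary are assumptions made for the example.

```python
def esil_eval(expr: str, regs: dict) -> dict:
    """Evaluate a tiny ESIL-like string, e.g. "0,ax,=" sets ax to 0."""
    stack = []

    def value(tok):
        # A token is an already computed int, a register name, or a literal.
        if isinstance(tok, int):
            return tok
        if tok in regs:
            return regs[tok]
        return int(tok, 0)

    for tok in expr.split(","):
        if tok == "=":
            dst = stack.pop()            # destination register name
            src = stack.pop()            # value (or register) to store
            regs[dst] = value(src)
        elif tok == "+":
            a, b = stack.pop(), stack.pop()
            stack.append(value(a) + value(b))
        else:
            stack.append(tok)            # numbers and register names are pushed
    return regs

regs = esil_eval("0,ax,=", {"ax": 5, "bx": 0})
regs = esil_eval("2,ax,+,bx,=", regs)    # bx = ax + 2
print(regs)
```

Because the whole instruction description is one flat string, adding an operation means adding one more branch to the loop, which is the portability argument made in the talk.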
So you have one binary with DWARF information, or debug symbols, and then you can import this into a binary that is stripped, and you can basically identify these data structures or functions in another binary that does not contain this data. It's also possible to diff at code level, but you can also diff data using delta diffing: you have two binaries that contain the same data but at different offsets, and it will identify which offset is shifted in the binary and what the main difference is, so it's not a byte-per-byte check. You can also diff basic blocks: you get the two control flow graphs and identify which basic blocks were added or removed, or whether there is a percentage of difference, so you can use that for bindiffing. And you can also find differences like abidiff does, to see whether there are new or removed symbols and things like that. I also took code from GRUB, which is basically the bootloader, and I used that for parsing file systems. The thing is that GRUB does a lot of things assuming that the file system is correct, and r2 will never assume that anything is correct, so if there is a corrupted file system I want to be able to mount it. So r2 is using this code for mounting file systems: you have a virtual file system interface inside r2, and you can use that for mounting local or remote file systems. You can use r2frida; I wanted to show that later, but I don't have time for it. Anyway, you can use Frida, which is a tool for injecting code into remote applications, and you can use TCP or USB for communicating with it. Frida can then expose an interface for accessing files remotely, so you can mount zips in a remote file system, extract the zip contents, parse a binary locally, mount that memory layout locally, and then do whatever you would like with these things.
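To make the delta diffing idea concrete, here is a minimal Python sketch that finds the same data at different offsets in two buffers. It uses the standard library's `difflib.SequenceMatcher` as a stand-in and makes no claim about the algorithm r2 actually uses; the function name and the sample buffers are made up for the example.

```python
from difflib import SequenceMatcher

def find_shift(old: bytes, new: bytes):
    """Return (offset in old, offset in new, size) of the longest shared run."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    match = matcher.find_longest_match(0, len(old), 0, len(new))
    return match.a, match.b, match.size

# Hypothetical buffers: identical payload behind headers of different length.
payload = b"same-data-in-both-files"
old = b"AAAA" + payload
new = b"BB" + payload

old_off, new_off, size = find_shift(old, new)
delta = new_off - old_off    # how far the shared data moved between the files
print(old_off, new_off, size, delta)
```

A byte-per-byte comparison of these two buffers would report almost everything as changed, while the offset-aware view reports one shared block and a shift.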
Okay, obviously there is a huge amount of things that can be done with that, so there is a need for a package manager. So I wrote r2pm. It was a 200-line shell script, so it's not really a big thing, but one year ago I decided to rewrite it in C, so this way you can run the package manager anywhere, even on Windows, and it doesn't depend on anything other than r2 itself. This package manager is basically pulling a git repository, and this repository contains scripts. Each script defines, in a very simple way, the instructions for compiling, installing, and uninstalling a plugin. So there is basically a bunch of tools that are installed in your home directory, and also plugins that are loaded by r2. The most commonly used plugins for r2 are, for example, the decompilers: you have r2dec, r2ghidra, and r2retdec. If you know Ghidra, for example, they provide a decompiler, and the decompiler part is written in C++, not in Java, so this code can be reused for writing a decompiler plugin without depending on the whole Java thing. Then there is RetDec, which is a decompiler based on LLVM plus a bunch of scripts around it, and they basically use the compiler toolchain to do the backward steps, going from the binary disassembly to C-like code. Then you can also use Diaphora, there is support for IDA signatures, there are native signatures, and there are some repositories of people writing these. And then there is support for Frida. Who knows Frida? Raise your hand. Okay, half of the room. Is that a decompiler, right?
No, Frida is a tool that basically injects code into a remote process, and then there is an agent running in a separate thread inside the process, so you can instrument the process at runtime. You can inject JavaScript, C, or assembly code into the remote process and instrument it. You can use that for profiling, for modifying behavior, and for tracing APIs: identifying when a specific function is called with some arguments, and then executing some code inside the remote process. Most people use JavaScript for this, but you can also use C with frida-gum or whatever, and we basically use that for dynamic instrumentation of iOS and Android applications. So you can use r2frida to have an r2 interface for Frida, which means that you don't need to type long JavaScript one-liners: you can use r2 commands, which are pretty mnemonic and easy to type if you know them. There is also support for external assemblers like GNU as or the Unicorn library, which is kind of code stripped out of LLVM. And there are also program solvers like radius, ESILSolve, or angr. With these plugins you basically define some constraints: you have a function and you say, okay, I want to know which arguments I need to pass to this function to reach this specific address. For example, you want to know if it's possible to trigger a buffer overflow in a specific variable on the local stack, or you want to know which password matches a specific crypto algorithm. So you can say, okay, I want to know which block of bytes of one specific length generates this hash, things like that. And this is not brute forcing; it's using solvers like Z3 and so on. So there are different plugins integrated in r2 that you can use from r2 commands, and then you can define the
preconditions, postconditions, and the boundaries of the function to emulate. Then there is also support for parsing data structures: there is support for Kaitai, and there is support for poke. This was integrated last week, because I didn't know it was possible to use poke as a library; I was seeing it as a program, not a library. So it's integrated, but not fully, and I plan to continue integrating it to use it for disassembling, for parsing headers, etc. So what can you do with r2frida? r2frida is basically, as I said before, a front end for Frida. You can run scripts on the host side or on the agent side, so you can write a JavaScript program that runs in r2, but it can also be loaded on the remote side. And you can load and unload plugins, so you can extend the r2frida command set with JavaScript plugins that are loaded and unloaded at runtime. It's also scriptable with r2pipe, so you can write a program in Python, JavaScript, or the language you like that automates commands and actions on the host side or on the remote side. You can spawn applications, you can attach to local or remote processes, and you can use different communication channels like USB, TCP, etc. You can mount remote file systems, you can use it for tracing APIs and profiling, and it also supports extracting metadata from Java, Dalvik, and Objective-C. Swift support will be ready by the end of this year; right now it's supported, but it's kind of unstable and the API is changing, so it will get better. So let's talk about r2pipe. Since the first release I did this year, which was 5.8.0, I plan to keep the ABI stable. Actually I use abidiff in the CI, so every commit or pull request that people send to the project is verified to be ABI stable.
This means that I'm not breaking the ABI: you can rebuild or update r2 without recompiling all the plugins or tools that use the r2 libraries. This is pretty cool. I already had some kind of contract with myself that I don't break the ABI, but this is something that you end up getting wrong when you check it by hand, and having a tool that can automate it is great. But I also wanted to have a runtime, something that you can run from r2, like an interpreter, that is not a custom language or a really big external library that needs to be integrated. I was experimenting for some time with different languages and realized that QuickJS is the only option. Even Lua is using setjmp, so you cannot compile Lua as a WebAssembly plugin, and if you want to use setjmp with threads, it's kind of a mess. So I ended up picking QuickJS, which is the same JavaScript runtime that Frida is using. I picked the code from there, and since every commit in r2 is verified with fuzzers, address sanitizers, and so on, I ended up finding a lot of issues there. I sent like 12 patches to the project, basically to the fork that is used by Frida, and all these patches are upstream now. The idea is that you can use TypeScript and JavaScript from r2, and you can write code like this. So you basically have an r2pipe interface, but it's running inside r2. You can use this from WebAssembly: you can open R2.online, and you get a terminal where you can run r2, drag and drop a binary into the browser, and use JavaScript to automate a bunch of actions for analyzing the binary. Basically, as I said before, you have a command that you run, and you get the output of the command in exchange. Since most of the commands in r2 can spit out JSON, you can use cmdj, which basically gets the output of the command and parses it as JSON.
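The "command in, string out" contract and its JSON variant can be shown without a radare2 installation. The Python class below is a stand-in written for this transcript: it mirrors the `cmd` and `cmdj` method names that r2pipe exposes, but the session object and its canned output are hypothetical, and the JSON fields are illustrative rather than the exact output of r2.

```python
import json

class FakeR2:
    """Stand-in for an r2pipe session: canned answers, no radare2 required."""
    CANNED = {
        # "aflj" is the r2 command that lists functions as JSON;
        # the reply below is invented for the example.
        "aflj": '[{"name": "main", "size": 42}, {"name": "sym.helper", "size": 7}]',
    }

    def cmd(self, command: str) -> str:
        # Pass a command string, get the output back as a string.
        return self.CANNED.get(command, "")

    def cmdj(self, command: str):
        # Same call, but the output is parsed as JSON.
        output = self.cmd(command)
        return json.loads(output) if output else None

r2 = FakeR2()
functions = r2.cmdj("aflj")
names = [function["name"] for function in functions]
print(names)
```

With the real library the session would come from opening a binary instead of `FakeR2()`, and the rest of the script would stay the same, which is what makes the interface easy to port to other languages.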
And then you get structured data that you can extract or process with the tools you like. I think that JavaScript and JSON are quite standard and useful nowadays, so you don't need to learn new things, but there is support for many other languages. As listed here, there is Python, Rust, Node.js, Ruby, Guile, newLISP, Haskell, Dlang, Swift, etc. It's very easy to implement this simple interface, and you can use different channels to communicate with r2: you can use a fork and a pipe, or a TCP socket, an HTTP interface, WebSockets, whatever you like. And the TypeScript thing is pretty cool, because I'm writing a type description that defines the APIs of r2 and also the structures that the commands return in JSON format. This means that you can take the JSON of a command and generate a schema out of it, and this schema can be used as types. So if you use the TypeScript LSP server, the language server that autocompletes all the code, you can use the types to autocomplete everything. For example, you analyze a function, you type a dot, and then you get the function name, you can get all the basic blocks, and for each basic block you can use the types to get all the fields of this basic block, like the address, the number of instructions, etc. And for each instruction you can get the mnemonic, etc. And everything runs. I mean, you can use Visual Studio Code, or Neovim, or whatever you like, or Emacs, I guess, which also supports LSP, and use that for automating and scripting in these languages. So r2 is able to visualize data and code in many different ways. I can press E, so you can see the source code of the slides. You can see here, this is the contents of the slide.
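Generating a schema from the JSON a command returns, as described above, can be sketched in a few lines. This Python function is an illustration only, not the generator used by the project: the type names, the function name, and the sample JSON are all assumptions made for the example.

```python
import json

def infer_schema(value):
    """Derive a simple type description from one JSON sample."""
    if isinstance(value, bool):          # check bool first: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        # Assume homogeneous arrays and describe the first element.
        return [infer_schema(value[0])] if value else ["unknown"]
    if isinstance(value, dict):
        return {key: infer_schema(item) for key, item in value.items()}
    raise TypeError(f"not a JSON value: {value!r}")

# Hypothetical command output describing one analyzed function.
sample = json.loads(
    '{"name": "main", "addr": 4096, "blocks": [{"addr": 4096, "ninstr": 3}]}'
)
schema = infer_schema(sample)
print(json.dumps(schema, indent=2))
```

A description like this maps directly onto TypeScript interfaces, which is what lets an editor's language server offer completion on fields such as the basic block address.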
So it's running a bunch of comments in R2, and the output is generated inside the visual mode that you saw before. So you can generate graphs, you can, from the control flow graph, you can also generate like a vertical horizontal line graph, like frame things, you can disassemble, you can render pictures in bit formats, in RGB format. You can also generate, like, different, like, comparing data and identifying which bytes are changing, by changing colors, you can define, like, a color for a specific bunch of addresses. So when you are disassembling, you can mark some regions, like, hot code, or this is, like, a portion of code that you're interested, and then you get headlights for one specific register, so you want to highlight SP, and then you get SP highlighted in the whole disassembly, things like that. But also, there is also UIs, but I'm, like, common line guys, so I prefer to use the shell. But there is, like, a Qt interface, but also I started to write, like, a new graphic interface using WX Widgets, because, I mean, Qt is great, because it's big, and it solves so many things, but I don't like the license, it's so huge, it's getting, like, really huge, and it's not handy for the kind of things that I like to do, so when I'm developing, I like to have fun, and I don't want to be suffering because of license, or companies that are developing things in the background. So I wrote so many UIs for it, too, like, in GTK, in using InBlip, also other HTML, CSS, for the web UI, because I do have, like, a web server inside, so at the end, I want to have fun, and I don't want to spend time, like, learning new APIs, and so on. So I'm writing, like, a WX Widgets API, a user interface, and the idea for this is to not follow the same flow, interaction flows that people use. So I want to have, like, a common line interface, but integrated in the UI. 
So I want to be able to have multiple windows using multiple sessions, or drag and drop things from one window to another, instead of having one big thing with panels and the common interaction patterns. So what's the future? There are short- and long-term plans. It all depends on my time, because most of the things are maintained or done by me. There are a lot of contributors, but they come and leave, so there is not really a core developer team, and there are no big plans for having a big organization and so on. But the thing is that I want to be able to keep patching stable releases. I want to have ABI stability, mainly because there are people, companies, and users that are writing tools on top of this, and I don't want them to have to rewrite things every time I make a release, or to break because they didn't update. I want to have a crate, which is a Rust API where you can specify the version of r2 that you want to ship; this crate will build that specific version of r2, and then you can use r2 statically linked inside your program. So you don't need to depend on system installations or other things like that. I want to reduce the tech debt, because as long as I'm alone doing most of the things, I don't want to depend on humans. So I want to reduce the amount of things that are done manually. For example, a friend of mine wrote the Flatpak integration: there is a bot that detects when there is a new release of r2 or of iaito, and when this happens, it pulls the code from all the projects, generates a changelog, generates a new build, and publishes it automatically. And you basically get a graphical interface for Linux and Windows for free.
I want to improve the code coverage and all the testing for fuzzing, and, yeah, I mean, there is, like, a lot of things that you can see, and if you have questions, please let me know after the talk with some beers. And I guess that everyone is hungry, so... Thank you so much guys. |
Welcome to the BSD devroom |
After two years of FOSDEM at home, this is our first in-person FOSDEM, so welcome. Just to remind you of a few rules: you're not in your own room anymore, so please put all the rubbish in the trash bins and try to leave the room clean. There are rules about no food and no drink, so it's fine if you drink, but try not to bring your food with you into the room. That's all. Then the COVID rule: masks are not mandatory at FOSDEM, but we encourage you to use one; if you want to wear one, feel free. We try to ventilate the room after each talk, so maybe people in the back could help us, because we have just five minutes to switch from one speaker to the next, so feel free to help us. Aside from that, this is the schedule for this morning. Pierre will be the first speaker this morning, about the BSD drivers, and we'll finish with Chimera Linux; the last talk finishes at 2pm. That's mostly all, so thank you very much for being here and enjoy your FOSDEM. |
BSD Driver Harmony
Improving collaboration between the major BSDs on driver development |
Hi everyone, thank you for joining, thank you for liking BSD, and hopefully I will encourage you to help develop BSD, so let's get started. The program for my talk, for which I have 15 minutes, is to go through 45 slides, so hang tight. Just kidding. I will start with a bit of background information, compare some device driver code so we can get an idea of how it looks across the different BSDs, call for help, and put forward a few ideas about how we can work together on BSD drivers. First of all, what about BSD drivers? Well, sadly the lighting doesn't really allow me to show much, but basically I gathered a list of drivers for different hardware. For instance, for sound cards, on FreeBSD you have snd_hda, on OpenBSD you have azalia, and on NetBSD you now have hdaudio. So sometimes a driver is missing in one BSD and not in another, or they have different names. Sometimes there is history where mergers go somewhere and then patches come back. Some areas are more developed in one BSD than in another, like TV capture or sometimes Wi-Fi. So it's a bit frustrating in a way: we have this awesome hardware support, and at the same time not every BSD always benefits from the progress the other BSDs make, or one bug is fixed in one and not in the other. So to summarize, and good morning, it's a bit of a mess: drivers everywhere, and it sometimes feels a bit like a rabbit hole, I would say. We have a collection of drivers with different histories, some in common, some with different names, and the evolution is not consistent across the different BSDs. But thankfully there is documentation. Let's have a look at the differences between the different systems. This is the manual page for driver on FreeBSD. As you can see, you have a few system headers, kernel headers in this case; some functions which should always be implemented: probing, attaching, detaching; some frobbing and twiddling, whatever that means; some data structures with references to the
methods for attaching, probing, and so on; then a declaration of the driver for the kernel to find it, and another macro here putting everything together. So it looks pretty nice and clear to me. But then if you look at NetBSD, it's actually a bit simpler: one less header, fewer functions which are documented here, or mentioned here at least, then just a single macro to attach all the bits together and declare it to the system. Then you have OpenBSD. Oops, there's no manual page for driver on OpenBSD. So hi Stefan, do you want to volunteer? He said "read the code". Yeah, gladly; you couldn't be more on cue. We're going to look at umb, which actually comes from OpenBSD. This is the manual page from OpenBSD for umb; the documentation is usually very good, so that was just a joke. Basically, the umb driver is for USB network cards which provide LTE/4G support over the USB bus using the MBIM protocol. It's actually a standard protocol shared by many cards across vendors, so it supports cards from Dell, Ericsson, Sierra Wireless, and quite a few more. I actually happened to have ported the code from OpenBSD to NetBSD, where I heard it works, and then I also did it for FreeBSD. This is a side-by-side comparison of the FreeBSD code on the left and the NetBSD code on the right. I didn't put OpenBSD here because it's actually very, very close to the NetBSD code, which is good news for driver harmony; just keep in mind that it's also in OpenBSD. Anyway, to prove my point, the NetBSD port is actually also quite close to the FreeBSD one, but there are some differences. Again, sorry if it's not super readable here in the room. Actually, sorry, on the stream it probably looks less good too, and I prefer to show my face rather than the code. No, I'm just kidding. Maybe we should have planned better. No, actually we spent some time figuring out the different options, and right now this is the best compromise. But to summarize: of course the license text stays
the same. The system headers are quite similar: some are in common, some are necessary in one and not the other. Then if we keep scrolling down, basically the black parts are identical and the purple parts are changes. Except for some debugging variables here, it's very similar at the top of the driver, with the variables. Then we reach the method declarations, the prototypes. They're actually very similar, except that on NetBSD, or actually in the original OpenBSD driver, there was a redefinition of static, so I followed the coding style that I found in the original driver, or in FreeBSD in this case. That's really the only major difference in most prototypes, and it goes on and on. Then on FreeBSD you have one big difference in the USB stack, where the definitions for the transfers are in a variable, so you have the bulk ones, in and out, for transmission and reception. Then we continue. The function for probing fits on one screen. Here, as we saw in the manual page, it's called probe, and on NetBSD it's called match; otherwise it does mostly the same. There are a few differences in the USB stack, so get interface descriptor, find idesc, but it's actually more or less the same. Then that's the code to attach. It's a bit more involved, but of course some variables are in common because it's the same code originally. Then there are quite a few changes, but it's actually the same thing that is being done: looking up USB device IDs on both sides, some setup, the printing to the console. So the USB stack is quite similar but also different: here you have get interface descriptor, here you have something else which does the same. Yeah, get interface descriptor again. The naming conventions are a bit different, but overall it's very similar. I can keep scrolling; I hope you get the idea. Basically we reach the detaching code, same thing again. Here we are: the driver itself, of course, is the part where there's the most similarity, because it doesn't have to be specific to each BSD, so there's a lot more black as a
consequence. We keep going; I'm not going to show the whole driver, of course, don't worry. Another important thing to have in mind is that even when the code is the same and the USB stack behaves the same way, there are minor differences. For instance, on FreeBSD I have to hold the mutex here, and on NetBSD, for a similar call, add task, I didn't have to. So you have to be wary of each specific requirement of the underlying stack that you're using on each BSD. Some functions can be called in interrupt context, some cannot; some require a mutex, some don't. So basically that's what's going on. My dream is to have one driver API for every BSD so that we can share all the code, but in reality I don't know if there is any chance to get to that. However, I think it would be great if we could move towards this and have more convergence, and take steps to get closer, both on the community level and on the programming level with the driver code. So I'm showing some ideas here today about what can be done. In most BSDs, a driver usually fits in one file; for the more complex ones there are sometimes a few more files. If we changed this convention, we could have separate files, and maybe put variables which would then be the same between the different BSDs in separate files, so we can easily merge them when there are changes in one or the other. We could separate the BSD-specific code (FreeBSD-specific, OpenBSD-specific, NetBSD-specific) from the driver-specific code. As another idea, we could move towards abstraction layers, which is not always great, but maybe some prototypes and some variable types could be extracted. In the same way, we could change names. One great thing about the BSDs is that each system is developed as one consistent whole, and there are usually fixed releases for the whole system, so this is maybe easier to do in the BSD world than in the Linux world, for instance. And so on and so forth. Outside of the driver code itself, we have the system databases which could be
unified a bit more, like the PCI and USB IDs, which are sometimes different between the different BSDs: mostly the same, but some names change, including for register values in some drivers. The driver names are also sometimes different for the same driver, at least historically, and sometimes not. So, just showing ideas. We could also share Git commits if we had a closer convergence and if we all switched to Git. There are mirrors too, or there is Got; you can stay for Stefan's talk to learn more about that. So basically I'm trying to set up a new exchange space for this initiative, which I called BSD drivers, or BSD Driver Harmony. I created a mailing list if you want to join. We could discuss whether a mailing list is the best thing to have; we are in 2023, so we could also have a Discord or something like that, whatever the cool kids do nowadays. Anyway, I set up some archives if you want to have discussions on the mailing list. Maybe we could set up an IRC channel, maybe we could set up a wiki somewhere to specifically document how to best write portable code across the different BSDs, maybe we could discuss funding, which would be very welcome. And basically, to wrap up now: of course each BSD has its own community, but maybe we could try to get closer. Even though we have the major conferences, and here the common devroom, which is great, maybe we can create a space for this outside of the conferences. As mentioned, kernel drivers can be challenging; they are quite similar but also different. So anyway, I hope this is worth the effort, that you will join and participate, and that we can write BSD code together. You can reach me at this address. I'm at NetBSD; actually I'm also on the NetBSD Foundation's board, so it's easy for me to forward ideas to the higher spheres across the different committees. We have many committees. And then thanks for listening; I will welcome your questions, also online, and I hope it resonates with you. Yes, yes,
yeah. So the question is, if I summarize it, I guess: how difficult is it to make an abstraction layer which would work on those three BSDs? Then Taylor said, then you have a fourth interface, which is kind of true. We all remember the XKCD: I will just create a new standard because there are too many standards. Well, I don't think it's necessarily so difficult, because as mentioned a lot of drivers are actually very similar and the systems are very similar. And I just learned this morning that in some cases there are already abstraction layers coming, like in FreeBSD from Juniper, if I'm correct: they are now pushing a new abstraction layer in FreeBSD for the network drivers, the network cards. So maybe this is something which could also be used across the other BSDs, converting code for that. Yeah, I think they don't really have the other BSDs in mind, but this is something which would maybe help the converted FreeBSD drivers to be converted in turn to the other BSDs. So maybe it's going to happen de facto. It's one direction we can push for; that's why I want to talk to developers across the different BSD projects. I contacted people on the OpenBSD and FreeBSD side already, and they are really receptive to the idea. The issue is to spend the time, probably, to make it happen. Yeah, so you are not really asking a question but reminding us that some drivers or some subsystems have very specific constraints on different BSDs, so the abstraction layer will have to keep this in mind really carefully. Like what I mentioned: sometimes you need a mutex, sometimes you can do something in interrupt context, sometimes not. So yeah, for sure. Time is up, also for questions? Can we squeeze in one more? Yeah, so you were just saying, for the stream, that we could push for... yeah. So in NetBSD we've been pushing for usbnet, unifying drivers in the USB network category to make them more the same; we found many bugs, and it helped get them all closer
together. Yeah, okay, I guess we'll stop here for now and let the next speaker speak in three minutes. Thank you. |
Game of Trees Daemon
A Git repository server for OpenBSD and other systems |
Hello. Can you see my slides? Yeah. I have only a white background, so the light at the top shouldn't be a big issue. And yeah, we're good to go. Okay. Hi, I'm Stefan. I work on open source stuff in general as a freelancer, and I'm here to present something I've been working on as a side project in the last few months. This is part of the Game of Trees project, which I started in November 2017 at an OpenBSD hackathon in Berlin. It's compatible with Git repositories and the Git network protocol, but apart from that, it's not trying to replicate Git specifically; the idea is just to reuse these formats because they're very widely used. And they're fairly okay and well designed, so we can just keep using them and not make up our own. And yeah, because it's written on OpenBSD, it uses a lot of OpenBSD-specific APIs. There's actually a portable version that's maintained by Thomas Adam, who also does the portable version of the tmux terminal multiplexer, and you can install this on various systems. And I think Thomas always likes to explore more options for other systems. If you're interested and yours is not listed, you can talk to him. And yeah, it's ISC licensed because it aims to be as pleasing to OpenBSD developers as possible. That's the whole idea. Now, what we currently have, and what's working really well, is the client side. This is basically feature complete at this point. You might want to have some more convenience things, but all the basics are there. Everything is working. You have several frontends, which I'll present in the following slides. You have a lot of code that's shared by these frontends, which I've labeled library here because it's in the lib directory of the source tree. One thing that this program does, which is very specific, is that it will not touch repository data outside of programs that are separate and are called libexec helpers. From the program's point of view, if you use the library, you don't see this.
You just say, open a repository and fetch me some objects, and so on. But internally, it will actually start other programs that restrict themselves a lot using pledge and unveil and so on, and those will actually parse the repository data. This is the current list of commands. And I'm quite happy with this set actually. I've been working with this set for the last five years or so. They've slowly been added over time. But I feel very productive with these, and I don't miss anything. I know that some people would like some additional things, but at this point we're mostly fine-tuning. You can read the manual page at this URL if you like. You can actually read it from start to finish to get a good idea of how the system works and how it's supposed to be used. There's also a gotadmin utility, which sort of mirrors cvs admin or svnadmin in the sense that if you're doing something with a repository specifically, you would use that command. This isn't complete. There are some things that I would still like to add here, which we'll go into later. But it has already prepared a lot of code for the server that I'll talk about, because, for example, dealing with pack files is necessary for the server as well as for this tool. We have a curses-based terminal browser thing for the command line. You can read commits with that and look at diffs and blame files and so on. It's working really well. And most recently, there's a developer, Mark Jamsek, who added a lot of convenience to this, like vertical scrolling, diffstat display, and all sorts of nice things. It doesn't work quite as well on repositories that have a lot of merge commits; I found that some repositories are hard to browse if they use a lot of merges. But for simple repositories, it's really good.
And if something is missing and you feel like you would like to use this on a repository with lots of merges, please make suggestions as to what we could improve there. You also have a web frontend, which is sort of like CVSweb or ViewVC. It's also using the Got code internally to show you files, commits, logs, and so on in a web browser. That was written by Tracey Emery. And most recently, Omar Polo has been doing a lot of refactoring there and added a templating mechanism, for example, to generate the HTML not from printf but with something more generic. And it's quite nice. It also has RSS feeds for tags, which is probably really outdated, but I think it's kind of nice. You can be notified of new releases that way. Okay, so about the server. One of the major milestones for any version control system that's ever been developed is that eventually you want to be self-hosting. So far, we've been using a Gitolite setup for this project. That's working well, but I would really like to be able to run this on an OpenBSD server using my own code. So after putting this off for a long time, because I always thought it would be a lot of work, I finally ran out of things to do on the client side and said, okay, I'm going to look into server things now. I started talking to people at hackathons in the summer and September last year, basically, and started working in September. By now, you can install it on OpenBSD -current. It's not yet in the portable version. Thomas and Omar were going to look at that; it might still take some time, but eventually it should arrive there. Now, the main use cases I want to support with this are exactly two. One is, of course, that I want to be self-hosting for my own open source projects and maybe also private repositories.
And the other is that I want to enable what OpenBSD is doing now with CVS, which is anonymous distribution of source code over SSH, where you know that the server you talk to is genuine and should have the right source code for you, but the client doesn't need to authenticate. Every time I want to get source code with Got from a platform like GitHub or GitLab or other forges that exist, I have to upload an SSH key, because otherwise they will not accept my SSH connection. And because Got only uses SSH and doesn't implement HTTP support, this is really annoying. It's not really a technical problem to do this; it's just that in their software, they didn't foresee this use case. But I think it's very nice. And you can actually go and try this now if you like. This is the code that I'm talking about, running on a server, and it's serving the Got code and got-portable. You have the host key fingerprints, which you can now take a photo of or whatever. They're also on the website. And yeah, if all of you went and triggered this at the same time, you'd probably trip my SSH rate limiter, especially if FOSDEM is behind NAT, which I hope not. But yeah, be gentle. Maybe, if you want to clone from this repo, pick a slide number in your head between 10 and 37, and when that slide comes up, you start your clone; then you'll be fine. So yeah, I'd like to explain a bit what the Git protocol is doing, because without knowing this, you will not understand what a server should be doing. And it turns out that if you leave out HTTP and all this stuff and just concentrate on the plain Git protocol, it's actually really quite simple, if you also ignore some protocol extensions which we haven't implemented yet. So this is really a bare-bones clone that we will go through. It's not very complicated. The main thing to understand is that when you're using SSH, the Git client will actually go and run the login shell of the user and then give that a command to run.
And Git basically hardcoded the names of these executables in its protocol. So you cannot speak the Git protocol without calling git-upload-pack on the server when you log in, right? There's also git-receive-pack for the other direction, when you're sending something. Anyway, if you run got clone with the -v flag, you will see a trace that is very similar to what I'm showing now. I've left out a few bits. This is only Git protocol version zero/one; Git protocol version two changed some things in a good way, but I haven't implemented that, so we're seeing a version one trace. Initially, the server just sends one message which says: one of the branches I have has this commit hash and this name. And oh, I also have some capabilities. You can see in the trace that these are hidden behind a NUL byte, because I suppose very old versions of Git clients didn't understand capabilities yet, and the NUL byte made them not read that part of the message. For version two, they did the same thing, hiding a version announcement behind two NUL bytes. You know, this is a bit hacky, but it seems to work. Don't worry about the capabilities; it's not important what they are. What's important to understand also is that each message is wrapped in a pkt-line, as they call it, and that's simply a length-plus-data framing format for these messages. So then the server keeps sending messages for every branch it has. And here's one more: its main branch happens to be the same as HEAD, because HEAD is a symlink to main, but, you know, not important. The client just keeps storing these. Eventually the server sends a flush packet, which is just a zero-length packet and says, okay, I'm done. In response to that, the client will tell the server what it wants. So the client sends similar messages, and it also includes its capabilities in the first message it's sending.
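The framing described above can be sketched in a few lines. This is a minimal illustration, not Got's or Git's actual code: the helper names `pkt_line`, `read_pkt_lines`, and `parse_first_ref` are made up for this example, the commit ID is fake, and the two capability names are only there to show where capabilities sit in the first message.

```python
FLUSH_PKT = b"0000"  # zero-length packet: "I'm done with this list"


def pkt_line(payload: bytes) -> bytes:
    """Frame one message: 4 hex digits of length (counting themselves), then data."""
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_lines(stream: bytes):
    """Yield each payload in turn; a flush packet is yielded as None."""
    pos = 0
    while pos < len(stream):
        length = int(stream[pos:pos + 4], 16)
        if length == 0:
            yield None
            pos += 4
        else:
            yield stream[pos + 4:pos + length]
            pos += length


def parse_first_ref(payload: bytes):
    """Split '<commit-id> <refname>' NUL '<capabilities>' into its three parts."""
    ref_part, _, caps = payload.rstrip(b"\n").partition(b"\0")
    commit_id, _, name = ref_part.partition(b" ")
    return commit_id.decode(), name.decode(), caps.decode().split()


# Hypothetical advertisement: made-up commit ID, illustrative capabilities.
FAKE_ID = b"a" * 40
advert = (pkt_line(FAKE_ID + b" HEAD\0side-band-64k ofs-delta\n")
          + pkt_line(FAKE_ID + b" refs/heads/main\n")
          + FLUSH_PKT)
```

Reading `advert` back with `read_pkt_lines` gives the two reference messages followed by `None` for the flush packet, and `parse_first_ref` on the first message separates the capabilities that were hidden behind the NUL byte.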
And it basically says, oh, yeah, I want this commit and this commit and this commit. Eventually it also sends a flush packet to terminate that list. Now, we're doing a clone, right? So we have nothing. But if we already had commits, we could now tell the server what we have by sending have lines, which look just the same as the want lines, with more commit IDs. The server then builds a second set of commits in its memory to say, oh, okay, the client has all of these already, so I don't need to send those and don't need to send any objects that are hanging off these commits. It's basically just an optimization to keep the pack file that will be sent next small, right? So you're not doing a full clone every time; you do a full clone initially, and then once you have something, you tell the server what you already have, so you only fetch the new stuff. And yeah, because we're doing a clone, we're just sending the server "done". And now the client's side of the protocol is already finished. So this is basically the last message the client will ever send. The server sends one more message in response, which in this case is a NAK, not acknowledged. I don't know why they chose these words, ACK and NAK. But essentially, the server keeps sending NAKs while the client is sending have lines, to say, I haven't found a common ancestor yet, please send me more. Because without a common ancestor, the server cannot determine a subset of the commit graph to use for the pack file. If the client sends totally unrelated commit hashes, the server cannot use them to optimize the pack file, so it keeps sending NAK. In the other case, where you do have a common ancestor, the server would send an ACK and the commit hash, and the client would then stop sending have lines for this branch. The exact details of this part of the protocol are a bit complicated. And they kept adding extensions to this behavior.
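As a rough sketch of the client side of this exchange, the request for a clone or fetch could be assembled like this. The function name `build_fetch_request` is hypothetical, and the sketch deliberately ignores the multi-round ACK/NAK negotiation and all protocol extensions: it sends the want list, a flush packet, any have lines in a single batch, and then "done".

```python
FLUSH_PKT = b"0000"


def pkt_line(payload: bytes) -> bytes:
    """Length-prefixed framing: 4 hex digits (counting themselves), then data."""
    return b"%04x" % (len(payload) + 4) + payload


def build_fetch_request(wants, haves=(), capabilities=()):
    """Assemble want lines, a flush packet, optional have lines, and 'done'."""
    out = b""
    for i, commit_id in enumerate(wants):
        line = "want " + commit_id
        if i == 0 and capabilities:
            # The client's capabilities ride along on the first want line only.
            line += " " + " ".join(capabilities)
        out += pkt_line((line + "\n").encode())
    out += FLUSH_PKT  # terminates the want list
    for commit_id in haves:
        # Same shape as a want line; empty for a clone, since we have nothing.
        out += pkt_line(("have " + commit_id + "\n").encode())
    out += pkt_line(b"done\n")  # the last message the client sends
    return out
```

For a clone of a single branch, `build_fetch_request(["a" * 40])` produces one want line, the flush packet, and the final done line; the server's NAK would then come back as a pkt-line of its own.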
So the actual NAK and ACK processing depends on various options that you can set in the protocol, which are all documented in the Git docs. But it's not important for us here now. Basically, the server just tells us, well, I have no common ancestors because you don't have any commits. That's fine. And then the server starts calculating the set of objects it wants to put in the pack file. What's shown here as coloring, Git calls something else. It calls this counting and enumerating; I don't know which step does what. But what we do is we have the whole graph and we keep coloring nodes in the graph. It's kind of like mine or theirs or something like this. Eventually we have a subsection, which in this case would be all of it. You get all the commits first, and then you go through these commits and traverse all the trees and collect all the trees and blobs that you need to include for the client. Then you have a lot of objects, and you sort them in a certain way. You go through and check whether you already have a delta for any of these objects, and whether the delta base will also be included in the pack you're sending, so that you can avoid creating a delta for this object. You just reuse the delta that you already have somewhere, which is an optimization for performance and very important. If you don't do that, your server is going to be super slow. Then you deltify some of the rest of the objects and you're good to go. Now you know what you need to know to start generating a pack file stream, and you start sending this out to the client. The client downloads it. Once it has everything, it indexes the pack. That is a step where you have the pack file, which is full of compressed and deltified objects, and you don't know what's in it, because the server didn't tell you anything about the objects. You just told the server, send me this. The server sends you something. Now you don't know what's in there.
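The coloring step on the server side can be illustrated with a small graph walk. This is a simplified sketch under stated assumptions, not the algorithm Got or Git actually implements: the function name `commits_to_send` and the `parents` dictionary are invented for the example, and it only selects commits, leaving out trees, blobs, sorting, and delta reuse.

```python
from collections import deque


def commits_to_send(parents, wants, haves):
    """Return the commits reachable from `wants` but not from `haves`.

    `parents` maps each commit ID to the list of its parent commit IDs.
    """
    # Color everything the client already has as "theirs".
    theirs = set()
    queue = deque(haves)
    while queue:
        commit = queue.popleft()
        if commit in theirs or commit not in parents:
            continue  # already colored, or unknown to this server
        theirs.add(commit)
        queue.extend(parents[commit])

    # Color what is reachable from the wanted commits as "mine",
    # stopping whenever the walk hits something colored "theirs".
    mine = set()
    queue = deque(wants)
    while queue:
        commit = queue.popleft()
        if commit in mine or commit in theirs:
            continue
        mine.add(commit)
        queue.extend(parents.get(commit, ()))
    return mine


# Tiny linear history for illustration: A is the root, D is the branch tip.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}
```

With this history, a clone (no haves) selects all four commits, while a fetch from a client that already has `B` selects only `C` and `D`.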
And to use the pack file, you always need to have an index for it, which tells you which object ID is at which offset in the pack file. So you just read the whole thing. And because Git uses intrinsic object identifiers, you can calculate the IDs yourself based on the contents of the blobs and the trees and the commits and so on. So you build that up. Then for any of the deltified objects, you also need to make sure that you can actually combine all the deltas to get the right content. That's the last step, and on a big pack, anyway, it takes a long time. Once you have that, you know: okay, I have this pack. The commit I wanted is in there. All the objects that are hanging off of it are there too, by the nature of the hashing structure that Git is using. So that's fine, we're going to use this. Then you just create a reference for the Git client to find its initial commit, and you can use the repository. In the push case, it works slightly differently. You still have this reference list announcement at the beginning. But instead of saying what it wants, the client proposes reference updates to say, oh, I would like to change the main branch to point to this commit, and I would like to change or add this tag, or something like this. And then it just sends a pack file. The server then has to index this and figure out whether everything is fine and whether it wants to change these references or not, and give feedback to the client to say, yes, okay, you have changed this branch or you've added this tag, and so on. So that's it for the protocol overview. You can find a lot of documentation in the Git source tree about this. They moved the files recently.
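As a brief aside on the indexing step described above: "intrinsic object identifiers" means the ID is a hash over a short header plus the object's content, so the receiver can recompute it without being told. The sketch below assumes a SHA-1 repository and objects that are already decompressed and un-deltified; `git_object_id` and `build_index` are names made up for this example, and a real index has a binary on-disk format that is not shown here.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Hash '<type> <size>' + NUL + content, as Git does for SHA-1 repositories."""
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def build_index(objects):
    """Map each computed object ID to its offset in the pack.

    `objects` is an iterable of (offset, type, content) tuples, assumed to be
    what is left after decompressing and resolving deltas.
    """
    return {git_object_id(obj_type, content): offset
            for offset, obj_type, content in objects}


# Illustrative offsets only; the contents are an empty blob and a one-line blob.
index = build_index([(12, "blob", b""), (30, "blob", b"hello world\n")])
```

Because the ID depends only on type and content, two sides that hold the same object always agree on its name, which is why the client can trust that everything hanging off the commit it asked for is really present once the index has been built.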
So if you have an older Git source checkout, it might still be in Documentation/technical, but in the current version, it's in Documentation; gitprotocol-pack.txt is the main one for this, but there are also other, similarly named files, which you can also read if you want to know more. Okay, another thing we need to talk about, because it's important for understanding why we would need to write our own server in the first place. There are already several server implementations, right? Why do we want our own? Well, when you write server software, especially on OpenBSD, there are a few design patterns that we use that are not commonly used elsewhere, I would say. I mean, I've never really seen them used widely outside this project, so it's a bit unique in the way it does things, but these things are important to us. For example, you know that SSH recently had a release where they had a double free, and the advisory, published like yesterday, I think, or two days ago, said, oh, this is not believed to be exploitable. That is because of this. It's not because the SSH code is generally great or something. It's because of the design patterns. And so we want these design patterns to be used. One of the things you do is that you split your program into several processes that have different tasks. And for each task, you decide: what kind of system calls does this task need? And how can I make sure that a process that has network access isn't also able to start new programs or open files and so on? There's unveil, which restricts the view of the file system and allows you to completely hide, for example, your .ssh directory, and other things. Basically, the program, for example the Got client, says: I need the repository, I need the work tree, I need /tmp, that's all I need to see. And I don't need to see anything else.
When you start new programs, you always fork and exec, which means that when you do the exec, the program will be restarted from scratch, and OpenBSD's memory randomization will kick in and load all the code segments and text segments and stuff in different locations again. You do this for every request, so that when somebody learns information about the address space from an info leak, they cannot use it on the next request. You have messages over pipes to communicate between these programs. And of course, you will have to have access to files and networks somehow, right, especially in isolated contexts. What you do there is you pass file descriptors over these pipes, so that one process opens resources and the other, less privileged one is using them. So these are the patterns we use. Okay, and so basically, this is what this is. It's a Git server that runs as this kind of multi-process program. It only supports SSH. Git user accounts are mapped to regular shell accounts, because I didn't want to reimplement user management. You can have a special-purpose login shell for these users to restrict them, if you want. And access permissions are set per repository. I don't want to go very complicated and make it per branch or something. It's just: if you have access to the repo, you have access, which is good enough, for example, for OpenBSD's model, where you get an account and you can commit anywhere. And when you configure this thing, this is basically what you need to do. You create your repositories and make sure they're owned by the right user, the one you run the daemon as. And you have at least one repository in your configuration file, which has a path, where the repository is, and access permissions. In this case, the example would be a group of developers, which you have in /etc/group, and an anonymous user, which can only read. Now, my initial implementation of this looked something like this.
It was functional and I could write a test suite for it, which was the main part. This could actually be used to fetch and push changes. But the design wasn't very good in terms of this multi-process aspect, because the parent started, then it started a reader process and a writer process, and that was it. And then these same processes were always used for every connection. It did allow us to at least get this up and running, though. And, I don't know, I asked for a bit of review and got shocked responses saying, no, you're doing this all wrong, fork and exec needs to be done per request, and so on. So yeah, okay. But at least functionally, it was already quite okay. And the repository code there is reusing a lot of the code that I already had for gotadmin and so on. So I mostly had to rewrite a lot of code for the parent process from scratch, which was all of this. This is what it looks like now. So the parent basically encompasses, or used to encompass, all of this functionality, and we'll go through each one by one. So right now, in the current implementation, the parent, when it starts up, must start as root in order to be able to do certain things, like start the listener process as root, for example. And it uses pledge with "stdio proc exec", which means basically: stdio you always want, that's printf and stuff like this. Then you have proc and exec, which allow you to fork and execute programs. And it can also send and receive file descriptors. And that's all it can do. It also currently does an unveil on itself, with an x permission, so it can re-execute itself with different option flags to start other versions of itself, basically, which we will start later. I'm not sure if this is really sound, because it used to be said that unveil would be inherited by child processes. And I'm not sure what happened to this. Currently, it is not.
So it is not inherited, so I can do this and not lose access to, for example, the /tmp directory in the processes I'm starting next. If that ever changes, we would have to adapt this, but it's not a big deal. You start a listen process, which opens the actual Unix socket that this daemon accepts connections on. So basically, if you're a local user on the system, you can always access it through the socket, but you would normally run this shell that we have too, which does this for you and speaks the appropriate protocol. It then drops privileges. And the listen process runs with just "stdio sendfd unix". The unix promise is needed to operate on the Unix socket. It also does an unveil, because the unix pledge allows you to bind other sockets, and bind would create other sockets for you somewhere. We wanted to prevent that. So by hiding everything with unveil, there's no way for this process to create additional Unix sockets. And as an initial kind of DoS prevention mechanism, this process also enforces a connection limit per UID, so that one user cannot just connect to the socket, spam it, and prevent access for everyone else. Now, the shell is one of the most sensitive parts, because this is where users log in and you actually confine them to this program. So you want this to be reasonably secure. It starts out with stdio, recvfd and unix, to be able to connect to the Unix socket. But once it's connected, it drops that capability, so it can no longer open new ones or do other things related to that. It only has a file descriptor it can talk on. And that's it. And then it starts translating these packet lines that we saw into messages that are internal to the program and go over the pipe to the parent. The parent will then start an authorization process, which only runs once.
And what this does is it gives itself access to the password database of the system using the getpw pledge promise, and it also hides the whole file system. And I think this shows something very nice about pledge and unveil when used in combination, because I'm actually reading the /etc/passwd and /etc/group files, right? As per unveil, I shouldn't be able to access those. But because I declared that I want to use the password database, the kernel knows that it's okay for this process to access those files. So it bypasses unveil in that specific case. Which means I don't have to worry about how the security mechanism is implemented. I don't have to go and say, oh, is my libc, when I ask for users, going to open this file? Well, maybe I should add an exception for that. Or is it going to do such and such syscall? I don't have to worry. I just say: pledge, I will do that, and unveil, I will do that. And they take care of it, which is great for a programmer. It's really nice to program against this. So what this process then does, of course, is match the user that is logged in against the access rules in the config file you saw earlier, report the result to the parent, and just exit, because that's all it needs to do. It's just a one-shot thing. Now, the parent starts two processes if authorization has succeeded. And the shell is kind of waiting, because it's like, hey, I sent a message, but you haven't responded yet. But yeah, we're busy, we're setting up. So we start two things right now, a session process and a repository read or write process. Currently, the naming of these is horribly bad. It just was the best I could come up with. And it kind of grew organically from the initial setup with those three processes you saw earlier. But for example, the repository write process is not actually writing to the repository, which you'll see later. So I'm not very happy about this.
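The per-repository authorization check described above can be sketched roughly as follows. This is only an illustration, not the server's actual code: the rule representation (a subject that is either a user name or a group name prefixed with ":", plus a mode of "ro" or "rw") and the function name `allowed_mode` are assumptions made for the example.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical rule format: (subject, mode). A subject starting with ":"
# names a group; anything else names a user. Mode is "ro" or "rw".
Rule = Tuple[str, str]


def allowed_mode(user: str, groups: Set[str], rules: Iterable[Rule]) -> Optional[str]:
    """Return "rw", "ro", or None for this user on one repository."""
    best = None
    for subject, mode in rules:
        if subject.startswith(":"):
            matches = subject[1:] in groups
        else:
            matches = subject == user
        # Read-write access wins over read-only if several rules match.
        if matches and (best is None or mode == "rw"):
            best = mode
    return best


# Example matching the configuration in the talk: a developers group
# with write access and an anonymous user that can only read.
rules = [(":developers", "rw"), ("anonymous", "ro")]
print(allowed_mode("alice", {"developers"}, rules))
print(allowed_mode("anonymous", set(), rules))
print(allowed_mode("mallory", {"users"}, rules))
```

A one-shot process can run a check like this, report the result to its parent, and exit, so the code that touches the password database never handles repository data.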
And also, the session process is basically the most powerful component of the system right now. It's the only one that can actually read and write the repository and create files in there. It can also do the same in /tmp. And for that, it needs all these pledges, like rpath, wpath, cpath. It also needs file attributes and file locking, because when it changes references for clients, it needs to make sure that they get locked, so that you don't have file system races where two clients commit at the same time and then you end up with a reference that's been overwritten. It also creates temporary files, which the repository process needs, and gives it the file descriptors. It handles installing of the pack files and so on. And it has the Git protocol state machine in it. So that's a bit much. I would like to continue work there to split this up more, but I wanted to have something functional to clone from, which is there now, which is on the internet. That's fine. But going forward, this needs to be revisited for sure. The repository read and write processes: apart from the name for repo write, I'm okay with how that's worked out. Both of them can only read from the repository. The reader is responsible for creating a pack file and streaming the result to the Got shell over a pipe that is created by the session process and handed to both the shell and the reader. And the writer is responsible for receiving a pack file and indexing it. So the indexing is done there. Okay. I have one minute left, one minute and a half. I'll quickly go through some implementation improvements that are still to do. So we should verify what the client has uploaded. Currently we trust it.
The config file is parsed every time a process starts, which works, but isn't ideal: it's bad if you're changing the file while the daemon is running. Yeah, session I already mentioned. And the state machines have some funny bugs, so these really need to be rewritten. They're basically switch statements and ifs and so on. I'd like to properly separate that out with tables and state transition functions and so on. It was just a quick way of getting things working. But we already saw thousands of flush packets flying through this process, because an end of file on a socket triggered a flush packet, and that was kind of stupid. This has been fixed, but there will still be other bugs like that. We should have some built-in checks so that commits can be verified according to project policies: things like denying merge commits if you don't want them, or binary files and so on, or preventing a forced push. I'd like to have commit notifications where you, for example, send an email, or you can send an arbitrary HTTP request, so that if you really want to have a post-commit hook script, you run it somewhere else and we'll give you the information and trigger it. Yeah, also it should really keep track of how much disk space it has when it accepts pack files, and not fill the disk and fail. We should be able to remove redundant pack files that have accumulated over time. I'd like to add SHA2 support and enable it by default once that works, because unlike Git we have zero production deployments right now, so we can just use the new format they've already defined. Server-side rebasing is another thing. I'm out of time, so I'm not going to go into that, but I think this is it. Sorry for rushing through the last part. Thank you very much. I encourage you to ask your questions in the hallway. Okay, good. |
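The idea mentioned above, replacing nested switch and if statements with a table of state transitions, can be sketched like this. The states and events below are invented for illustration and loosely follow a fetch-style negotiation; they are not the server's real state names.

```python
# Hypothetical transition table: (current state, event) -> next state.
# Any pair that is not listed is a protocol error.
TRANSITIONS = {
    ("expect_want", "want"): "expect_want",
    ("expect_want", "flush"): "expect_have",
    ("expect_have", "have"): "expect_have",
    ("expect_have", "flush"): "expect_have",
    ("expect_have", "done"): "send_pack",
}


def step(state: str, event: str) -> str:
    """Apply one event to the state machine, rejecting unexpected events."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"unexpected {event!r} in state {state!r}") from None


def run(events, state: str = "expect_want") -> str:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state


print(run(["want", "want", "flush", "have", "done"]))
```

With the whole protocol in one table, a stray event such as a flush packet arriving in the wrong state fails in exactly one place instead of slipping through a forgotten branch.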
Reggae: cool way of managing jails/VMs on FreeBSD
No docker, no cry |
Hello, and welcome to my talk about a tool called Reggae. It has something to do with bhyve and jails, and I'm going to explain what it actually does, why it does it that way, what the goals are, and what the initiative behind it is. So the relevant thing about me is that I'm a co-founder of a hackerspace in my hometown. I was lucky enough to marry this girl, and she supports me in all my crazy IT endeavors. So the hackerspace was founded by us, and it's the place where I do most of my development in the open source community. I've been a FreeBSD user since 2016; I used it briefly before 5.0 came out. I'm somewhat involved in FreeBSD, but I'm not a developer yet. I'm a CBSD contributor, and for those of you who don't know what CBSD is, it's a jail and bhyve manager. It can also manage Xen and VirtualBox, but Reggae doesn't support these two. I decided to support only FreeBSD-native technologies. And obviously I'm the author of the software I'm about to talk about. So you might have seen similar tools like Vagrant and Docker Compose if you come from the Linux world, like I do. In that regard, it's not revolutionary. It's actually pretty much the same thing: run your development in a virtual machine or a jail. Currently, jails have really good support. Bhyve is being worked on, because there you have different distributions of Linux, different operating systems, versions and so on, and it's a little bit harder to support because of those differences. And Oleg, who is the lead developer of CBSD, decided to create the CBSDfile, which somewhat resembles a Vagrantfile in ideas, not in syntax, and I decided to take a different approach. So it's kind of a silly situation currently that there are two CBSD DevOps tools, but we were trying things out, and whatever works best is going to stay.
I'm rooting for Reggae, of course. So the concept is that, well, before I start with this: you would say, the name of the project is Reggae, I have dreadlocks and I'm a musician. But I don't play reggae. It comes from a totally different idea. I had quite a few projects before this one, all open source, that somehow referenced some songs or ideas from the reggae movement and music, and I said, okay, how about I name this one Reggae and stop the streak of resembling something I actually don't want to resemble. So that's where the name comes from. It's me trying to come up with a name to stop the reggae streak. So the basic concept behind Reggae is that it uses CBSD for all the heavy lifting. All jail and bhyve management is done with it. Some of the networking is done with CBSD, some with rc.conf. I'm going to get to that a bit later. Reggae has two entities. One is a service, one is a project. I intentionally called it a service because it can be a virtual machine or a jail. And the project: you might want to have different jails for different stuff. Let's say you're working with WordPress as a hoster, and you know you need PHP with WordPress and Nginx on one side and MySQL on the other. You probably don't want all of that to be inside the same container, so you can split those up. You can influence the order of creation and booting and everything. We still don't have dependencies between jails, but we might have them in the future, because it's a really nice feature. And you probably are not going to type all the commands to create the jail and then all the services inside of it. You're going to use some of the provisioners, and I'm going to talk about that in a while. Where I come from, that's basically Serbia, we don't have IPv6, so forgive me if I didn't implement IPv6 properly, because I have really limited resources for testing that.
There are of course tunnels that I use, and I try to combine stuff, but yeah, I'm basically working with one hand behind my back. So I really try to support it. I just don't know if I've done it properly. Technologies: well, it's a mouthful. With VNET jails you can use DHCP, and I almost insist on using VNET, because for me as a system administrator it's easier than regular jails. I just fire up the DHCP server and the DNS server, and they are interconnected. To be specific, the DHCP server is ISC, the DNS server is NSD, and there's one more, for DNS caching, which is Unbound. All these technologies you probably know. rtadvd is for IPv6, as I mentioned, and then PF. You know what, I would love to add IPFW. That is for the future, so that I don't cut off, I don't know, what percentage of FreeBSD people. Yeah, we want to support everyone and everything, ideally. And these two, make and POSIX shell, are the only languages used to implement Reggae. I could get away with it because CBSD is doing all the heavy stuff and I just need to script a few things, right? As I said, provisioners are supported. You don't have to use them. You can still just create a jail and do your thing; maybe it's a lab and you're playing with it, experimenting and whatnot. But if you want, you can run quite a few automation scripts, and you can even have multiple provisioners for a single jail. That sounds crazy, but if you think about it, sorry, wrong operating system, FreeBSD doesn't come with a pre-installed Python, and if you want to use Ansible, you are going to install Python first and then Ansible later. For situations like that, it's really, really nice to have the shell provisioner doing the first stage of the rocket launch, and then Ansible does all the rest.
I use shell and Ansible extensively. The rest of the provisioners I really try to support. I have used all of them, or am using them at work or somewhere else, but I'm not using them as extensively as the first two, so if you find a bug, please report it. I will be nice and fix it, if I know how to. So Reggae brings some things to the table that CBSD alone doesn't. I mean, that's the reason it was created, because it builds on top of something I personally consider good software. What it brings is this: in order to run, let's say, VNET jails, you will probably need a bridge and then some epairs and so on. But the bridge interface has to be created, or at least different administrators should be allowed to have their own configuration of a bridge and not actually be forced into what CBSD comes with by default. And it initializes the network in such a way that PF is already configured, the bridge is configured, the IPv4 and IPv6 daemons are started, Unbound and whatnot. So it initializes quite a lot, and you might not like it. If you don't, you have quite a few variables to disable stuff. For example, you can disable IPv6, or have your projects reside in a different directory, not the default one, and so on. So there's quite a bit of initialization that it does, because if you currently read how to start a VNET jail in FreeBSD, it's not really easy. If you read it in other jail managers' documentation, it might be better; I didn't check other documentation, so I don't know. But currently the FreeBSD documentation around jails is kind of scarce, not really descriptive at all, and one of the things that Reggae helps you with is how you even start with it. Because I came from the Linux world in 2016, and I was like, how am I doing stuff, what do I do, how do I configure it? And that was actually the reason I chose CBSD. At the time it was the most logical, the most intuitive tool for me to run my jails. Since then I discovered it's good for other things, not just intuition, right? But yeah, I just love the software.
The same goes for CBSD initialization, because CBSD can do quite a lot. It can track resources, it can give you the network usage per jail, it can do stuff, and in order to do that it has to ask you things. For example: are you using IPFW or PF, and things like that. Are you using ZFS or UFS? What should I do, create clones or directories, and so on. Reggae chose a different path, or I chose it for it. You can say use ZFS, yes or no, and if it's no, luckily we only have two file systems, so either you use ZFS or the other one, UFS. So the initial configuration is quite a bit easier with Reggae than with CBSD, and maybe if you're starting with the stack, it's going to help you on your way from infancy to pro, with how to even start with CBSD and Reggae. There is one decision to, okay, let me take a step back. I try to create everything as code, but some things are just not meant to be. Like, how do you assign addresses? CBSD has, let's say, two DHCP implementations. One is called DHCP, and what it does is run through all the IP addresses, find the first free one, and assign it to the jail. It does that only on creation, so it's not going to scan every time. Just when the jail is created, it's going to scan the existing IP addresses and the configured IP addresses of jails that are not started. Then there is a second implementation of DHCP, called real DHCP. The idea is to have one master jail that has DHCP and DNS in it, authoritative DNS, and there is a local Unbound outside of the jail that says, okay, if you want the zone that your jails belong to, then look over here. From the Reggae and CBSD perspective, it's not so much about caching. Actually, it's not at all about caching, although Unbound is known for it and was created for that purpose. It's more like, how to call it, a cop at the junction: you go there, you go over here. It's redirecting the queries.
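The first, simpler allocation scheme described above (walk the address range once, at jail creation, and take the first address that is neither in use nor configured for a stopped jail) can be sketched in a few lines. This is not CBSD's implementation; the function name, the network range, and the addresses are made up for the example.

```python
import ipaddress
from typing import Iterable, Optional


def first_free_address(network: str, used: Iterable[str]) -> Optional[str]:
    """Return the first host address in `network` that is not in `used`.

    `used` should contain both addresses that are currently live and
    addresses configured for jails that are not running.
    """
    taken = {ipaddress.ip_address(addr) for addr in used}
    for host in ipaddress.ip_network(network).hosts():
        if host not in taken:
            return str(host)
    return None  # the range is exhausted


# Two addresses are already taken, so the next jail gets the third one.
print(first_free_address("172.16.0.0/29", ["172.16.0.1", "172.16.0.2"]))
```

The scan runs only once per jail, which keeps it cheap, but it also means nothing notices conflicts that appear later. The second scheme with a real DHCP server handles that through leases.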
So the idea is that, probably, while developing with Reggae you're going to have a private IP address range, and you still want to ask the DNS, hey, where is stuff? Because two things were annoying to me about Docker, at least when I used it. First, Docker containers know by name where other Docker containers are, but you don't. You have to use IP addresses or /etc/hosts hackery or something. And the second thing that really annoyed me is that I cannot have a user on this laptop with, I don't know, a UID of 1001, and have the same user in a Docker container and do development inside of that container. Just matching the IDs was really terrible. At least I hope it was, and that it's not like that anymore. But that kind of frustration led to: okay, how about I have a DNS server? I mean, that's what it's for, to tell you what the IP address of this name is, and so on. So yeah, I couldn't do it all just as code. I still need one jail to do some maintenance stuff and some networking stuff.
The nice thing about IPFW is that you can insert a rule whenever you want. If you're coming from the Linux world, that means it behaves like iptables. But PF doesn't, and PF has a pretty static set of rules unless you're using anchors. Anchors are a way to say, okay, this is a sub-rule of the main rule. Anchors can be changed, and they are per jail, called cbsd/ followed by your jail name, so it's easier to see which rules are for which jails. I don't want to trash Docker too much, but if you've ever seen iptables rules after Docker mangles them, it's a mess, a huge mess. So I try to be as nice as the technologies in FreeBSD allow. Just don't be, sorry, don't be so negative towards developers and administrators, they're going to hate you. And so, yeah, when I say late: these anchors are created after the jail is started, so it already has an IP address and can use the port redirection or whatever you need. Because by the time the jail is started, it has already registered itself, got a lease, and DHCP has registered it in the NSD zone. So these anchors can use host names instead of IP addresses, and if you're using DHCP, you probably have no idea what the IP address is. So the nice thing is that you can use names to sort out your networking and firewalling, I would say, in a proper way, in a more readable way. And there is a concept in something called DevOps, which I still don't know the meaning of. They say DevOps is IT accounting: they know what not to click on AWS, otherwise it's going to cost you. So it's like an administrator plus plus accounting, right? But yeah, in the DevOps world, you want your production to resemble development, or maybe the other way around. The fewer the differences between your development and your production, the bigger the chances that you're going to catch bugs in development, not in production. Although users are the best monitoring system in the world. Okay, so, because of VNET: just using VNET implies that you're going to have to use some custom
devfs rules, because you want access to BPF and you want access to PF, so the jail can have its own firewall. And naturally, devfs rules are supported in Reggae, unfortunately by number, not by name, but hopefully that will change. Boot order is an integer, or should I say an unsigned integer. It works like nice: the bigger the number, the later you're going to boot. So it's not a perfect solution, but it's a solution to start your WordPress after MySQL, for example. And it even works if you remove Reggae. You create everything with Reggae, remove it, and it still works, the boot order works. I really tried to make booting really, really work, even if I did something naughty to my code and it breaks. So I kind of have a failsafe. I don't know where the majority of you are from, but for some reason, in Serbia the package repository is extremely slow. If I get two megabits, that's like, wow, that's light speed. I don't know why it works like that. Maybe it's our providers, because I talk to different people and they don't have that problem. So I came up with a solution, a workaround. What if all my jails just use a proxy? The first one is going to download everything, and it's going to be painfully slow, but all the next ones are just going to use the cached files. And there is even a Reggae recipe for the proxy, so you don't have to type in a bunch of commands. You just go to the repo, download it, type make, and you're going to have a proxy. You're going to have to change the configuration to tell your jails to use it, but yeah, it's as simple as it gets. In my open source career I mostly work with audio, and that's mostly graphical. So I needed something like: don't pollute my host and let me use a jail for development, but on the other hand, let me click on the buttons, because I need to. So Reggae has a really easy way of forwarding, I think it's called forwarding in X. You open the port, and all the X applications connect to that port. So Xorg is on the host and your X client is inside the jail, so you can interact with
it. And yeah, it supports bhyve, but I'm going to have to work real hard on this one. It's the newest addition to the source code, so it has received the least amount of love, and currently it only supports FreeBSD virtual machines. Beneath it, CBSD is using cloud-init to do the bootstrapping of a virtual machine, to have the initial configuration, and then your provisioner or provisioners can kick in and do whatever you told them to. So what does development mean? Well, there is a devel mode. Because it's exposing so many things, and doing it rather insecurely for the sake of a developer being able to do stuff, I kind of think the name devel mode is pretty good. It's so close to the devil, right? So it has this development mode where you, for example, share the directory where the service is with the jail. I borrowed this idea from Vagrant, which has /vagrant as a mount point of your project, so you can interact with it either through the host or through the container. And yeah, it's mounted on /usr/src. It kind of made sense, because I kind of expect people not to do FreeBSD kernel or userspace development inside a jail. You might find that an obstacle, but you can still mount your own stuff on top of whatever, right? CBSD lets you do so, so you can add your mount points and stuff. So it's not limiting. If you're doing FreeBSD development, which I expect some of you do, it's not going to be in the way. The devel user is in the jail, and it has the same user ID and group ID as the user on the host that executed make inside of the Reggae service, so /usr/src is all owned by the devel user and you can interact with it. It's read-write. So the only difference is that the user name and group name are probably going to be different. Why do I say probably? Because I expect you not to name your user devel. And there are, well, I was pretty active for quite a few years as a web developer, which is kind of strange for a FreeBSD
guy, a FreeBSD contributor, and what I realized is: how about I have one line saying, okay, in these jails I'm going to use, like, Django or something as my framework. So Reggae makes a few alterations to how it does things, so that it lets you run your frameworks more easily. And I shouldn't actually use the plural, because the only framework that's currently supported is called Freenit, and guess who the author is. But the idea is that you have a directory in your service called bin, and inside of it you have a devel.sh. Reggae doesn't know what's inside of it. It's just going to run it inside the jail. Usually, for web development, I like to have a tmux with a split pane, back end and front end, and it knows how to tell the front end where the back end API is, if you need to integrate, and you do. But basically, you can create your own devel script, and Reggae will be happy with it. So it can do much more than just one framework, but in the future I'm going to add quite a few, to make it easier for people who are not so much involved with jails to start using it. So production mode does a totally different thing. I mean, the opposite. It's going to say, okay, this jail is going to start with the boot of the machine. It's also going to try its best to remove everything devel mode created, like the /usr/src mount, some of the scripts that are installed for devel to work easily, and so on. So the idea here is that if you want to create, like, a build server, or maybe on your machine you're using devel mode and you created the jail like that, then afterwards you say DEVEL_MODE=NO, make down, make up, and make up is going to clean everything that devel created. And then you can make export and create an image, move it to some other server, import it there, and it just works. So you can have your development container productionized. Not a good idea, don't do it if you can avoid it, but sometimes you can't, and it works on my machine, so ship it. But the more serious
use case for it is to have your build server build the image of a jail, and then you deploy it somewhere. So this is an example. Up to here, the first commands are creating the project, and the project name is going to be the same as the directory you created. A project contains multiple services, but in this instance I just wanted to keep it compact. So up to here is creating the project, and then a project expects the services directory to contain your jail or bhyve definitions, and the definitions are created with this. You can omit this part, and it's not going to create any provisioner for you. Or you can say here ansible, shell, puppet, and that's the order of provisioners that are going to be run. It has a few more expectations about where things are going to be for a certain provisioner, but it's going to create a skeleton for you, so you don't have to create them by yourself. And, well, read the code. If you have the skeleton, it's going to be more intuitive what you should do next with your own code, with your own development. It's pretty slick. It can be done differently; I just think that to me this was the most intuitive, and I'm trying not to go against intuition. So, let's say all of us need something, I don't know, like MySQL. All of us here in this room are hosting WordPress and we all need MySQL, WordPress and Nginx. But you don't have to create it on your own. There are pre-made recipes for that. They are not images. In this instance they are all Ansible, but you can combine jails and bhyves provisioned with different provisioners. In this case, it's going to download these three jails, configure them and interconnect them, because Nginx has to know where WordPress is, and WordPress needs to know where the database is. But it's not going to create a database for you. It's still like, okay, I don't know what you want. Is it a single instance, or are you hosting multiple WordPress instances and you need multiple databases, and so on. So that's left up to you, but if you go to
the MySQL one, everything is there. If you want to look at the code of this, just add jail-mysql and you're going to have the repository, and there's a README which tells you, okay, this is how you create a database in this jail. You can do it differently, of course, but if you're coming from a different environment, I try to hold your hand as much as I can. But you can ignore it. The same goes for WordPress. And Nginx has a directory with quite a few examples of how you create your own hosts inside the Nginx configuration, but it's not going to configure it for you. It's a recommendation and not much more, not "you have to do it this way". And this URL is probably the most complex project I did with Reggae, most complex in the sense that it has the biggest number of jails and services inside of it that need to talk to each other. And, well, if you've tried to manage Jabber and mail and do it responsibly, you know that you need all the help you can get. So I tried to create this service that creates the mail and Jabber communication server. It gives you the documentation on how you should configure your DNS, what's inside of it, the links to specific services and their own configurations, examples and so on. So it's my way of avoiding writing documentation. I mean, we all do that, we have better things to do. So if you have an example, it's documentation on its own. And yeah, the future. What can be better than documentation, right? What I want to say is that Oleg, who is the CBSD lead developer, is Russian, and I'm going to quote him: "My English is terrible." So our documentation currently is not good. I'm not going to say terrible, but it's not as good as it can be, and that's literally our next assignment. We already talked about it a few months ago, and the documentation is really killing us, or rather the lack of good documentation and of a way of searching it, browsing it, and finding what you need. There are quite a few documents there to describe how specific
scenarios should be executed, or which options specific commands have. It just still needs to be worded better and be more coherent and better connected, so that you can say: okay, if I read this, this and this, I know where I'm going to be, and then I know how to manage WordPress, for example. So documentation is a really big thing; maybe it's going to be the only thing we work on besides the bugs. So yeah, sorry about not being very good with documentation, but we promise we will improve. Now, these two, I hope they are not mutually exclusive. jail.conf is a much, much simpler way. It's in base, it works wonderfully, and it can even manage your bhyve, because today (I don't know with which version it started) you can have bhyve inside your jail, and if you have that, you only have to manage the jail, which is kind of wonderful. What we really need, or at least what I would like to have, is jail.conf.d working really well. Currently it doesn't. There are two obstacles. In jail.conf you can define dependencies between jails, and that determines how it's going to boot: it's going to create a tree and then go from the leaves to the root. But you can't do that with jail.conf.d. There is a patch coming from a developer I know from Yerevan called Antranig, and he promised he's going to upstream that patch this weekend, so we are watching you. But yeah, it's almost there. The second obstacle with jail.conf.d is that you cannot have common configuration like you can in jail.conf, because jail.conf is one file with multiple jails, while jail.conf.d is multiple jails... oh sorry, multiple files, one jail per file. You can probably have multiple jails per file, but that kind of kills the idea. So the idea is to have a better "service jail start" or stop or whatever, and then probably most of us jail manager developers are going to unite around it and say: okay, this is how we use it now. You can criticize us: why didn't you do that in the first place, right? But I
guess we needed to experiment and learn and see what works and what doesn't, so that we know that once we upstream it to jail.conf, it's going to be what we expect it to be. And Kubernetes. All jokes aside, I don't know if you're following, but there is more and more Kubernetes support in the ports tree, and there is, I hope I'm going to pronounce the name properly, Doug Rabson, who is working on Kubernetes support. Podman is already there, and Doug has a pretty initial version of Kubernetes working on FreeBSD with VNET and jails. Why I'm hoping for this to happen is so that we can have an easier transition from Linux to FreeBSD. And honestly, there's a friend of mine who promised me that we are going to try mixed Kubernetes data centers. He is a Linux freak, I'm a FreeBSD freak, so let's try to create one domain that is going to have mixed operating systems in Kubernetes. Who knows where it leads us; maybe we just lose our hair. But if we do succeed, it's a really easy way to migrate from one operating system to the other. Because, realistically, maybe you want to switch from FreeBSD to Linux. Please don't, but who knows what your use case is, and maybe your workplace is such that it simply has to be like that. So yeah, well, I'm hoping this will not live, because it leaves more space for this. So I'm kind of a split-brain person; I really don't know what I want with Kubernetes. It's probably here to stay, so we need to support it in order to be kinder towards other developers and system administrators, maybe DevOps. So, I would like to thank a few communities and persons. Tilda Center, for supporting my crazy ideas and testing everything I do. At least a few of the people are testing my crazy ideas and reporting back, and sometimes even doing slight development, and I'm really grateful to be physically in an environment that is supporting me in such a way. Because usually, I don't know about you, but I suspect we are the crazy ones: "Why don't you use drugs like all normal
kids, or Docker, or whatever?", right? So yeah, it's really nice to be supported. The FreeBSD community: I don't know how to say this enough, they've been supportive and helpful, and I didn't encounter a question that somebody didn't answer if I was persistent enough. And the CBSD community, Oleg in particular. He created CBSD by himself, and it's an enormous task; it even has ncurses commands to make it easier for you. So it's an enormous task that started 10 years ago, in 2013, so it's our big anniversary this year. I think it's somewhere during the summer, but I will have to check that. We recently announced that on the CBSD site, and I'm afraid I don't remember it; it was a few days before FOSDEM and I was doing a bunch of things to prepare. So, if you want to ask something, contribute, whatever is on your mind around Reggae, CBSD or whatever, music maybe, these are my contacts. And now is the right time to ask questions. Thank you. Would you, for example, start MySQL before WordPress, or is there also a mechanism that allows you to check if the database is ready to serve? Okay, so the question was, if I understood correctly, whether boot order can also check that the database has started before it starts WordPress. Did I get it right? No, it cannot do that, but it will at some point, because jail.conf can have dependencies, and at some point I'm really hoping to switch to dependencies instead of boot order. Because who likes boot order? I mean, if you have like 200 jails, who knows what starts when. But with dependencies it's much easier. Any more questions? Yes, mostly for CBSD, but you can find one for Reggae that depicts mostly how to initialize it and start a really dumb project that has almost nothing in it. Yeah, sorry, the question was... I didn't catch that second one. Yes, so the question was whether there are any demos. They're on YouTube; some of them are in Russian, and some of them are on, how is it called, asciinema. Yeah,
so from the demo you can even copy and paste the text, or the commands or whatever, and do it yourself. Okay, anyone else? Okay, then I'm going to have to say goodbye with the famous words: no Docker, no cry. Thank you. |
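The difference between a fixed boot order and declared dependencies, as discussed in the talk and the Q&A, can be sketched in a few lines of Python. This is only an illustration and not how Reggae, CBSD or jail.conf implement it: the jail names come from the WordPress example in the talk, the dependency edges are assumed, and the ordering is computed with the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each jail lists the jails it depends on.
# Names follow the WordPress example; the edges are an assumption.
DEPENDS = {
    "nginx": {"wordpress"},
    "wordpress": {"mysql"},
    "mysql": set(),
}


def start_order(depends):
    """Return jail names so that every jail comes after its dependencies."""
    return list(TopologicalSorter(depends).static_order())


def stop_order(depends):
    """Stopping is the reverse: dependents go down before what they use."""
    return list(reversed(start_order(depends)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS))
    print("stop: ", stop_order(DEPENDS))
```

With dependencies declared per jail, nobody has to maintain a global numbering for 200 jails; the order falls out of the graph, and a cycle raises an error instead of silently starting things in the wrong order.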
Happy 5th anniversary pkg-provides |
So, as a start... oh yes, I put "who am I". I'm Rodrigo Osorio, and I've been a FreeBSD ports committer since 2014. What else? So yes, what is pkg-provides? I don't know how many of you know pkg-provides. A few, okay. So, as you know, we have pkg, which is the FreeBSD command used to manage packages, and pkg was designed to be extended with plugins. You can write your own plugins to run special commands, handle hooks, replace existing commands, or access the package database just for statistics or to have numbers or whatever. pkg-provides is basically a plugin you can use with pkg. We extend pkg to do one special thing: finding the packages that match a specific file. Basically, you are looking for which package you need to install to have this file on your computer. Usually you start looking on the internet, you Google a little bit to find out exactly which package you need, because sometimes the package name doesn't tell you the names of the files or commands it contains. So basically it's a reverse way to find which package corresponds to a specific file. It could be a library, an executable, or just a header, like a .h file. So this is an example: you just run pkg provides with libmpg123.so, and it gives you the list of packages that provide it. Sorry for the contrast, it's not very good. For this lib you have two packages. We have mpg123, which is probably the main project that provides the file, but you see that the Zoom video conferencing package also provides this library, probably in another version for its own use, and basically it's a Linux compat library. As you can guess, pkg-provides is a client that you have on your computer, but it's also a database that contains the full list of files you would have if you installed all the packages existing in FreeBSD. And this is a lot: for just one release and one architecture
we have one... nine million installable files. So this is a huge database we have to build. I put here the numbers for the latest version, the 13, and if you look at the different versions of FreeBSD and the different architectures, you see how many files there are. Basically this is based on the packages you can build on FreeBSD for each architecture and version. So you can probably see which are the good architectures, the ones that work and have a lot of applications, and which architectures are not really used, or are missing a lot of tools that don't build for them. You also have the new ones, like ARMv7, which appears suddenly with a big number, and you have ARMv6, which is declining a little bit. Things still work, but as the numbers show, it's not as popular as ARMv7. So this is basically how everything works. I host the database server myself, so I have the pkg-provides server here, you have the clients that connect to download the database, and I use the FreeBSD package servers to sync and get the list of files, because this is the best and most efficient way to get the full list of files in a minimum of time. As you can guess, it's not always possible to build all the packages just to have the list; it's kind of a nightmare. So yes, I put some dates about the history here. I started it in 2017. I had multiple motivations. The first one was to stop doing grep on ports when I'm looking for a file. There's nothing more terrible than trying to compile a new project and finding, oh, I'm missing some .h file, and you start grepping around, and it takes hours just to get the full list of dependencies. This way you can find it easily and exactly, and you can also get the full list of packages providing the same file, so you can choose the one that fits you. Usually it's
not the first one, because some packages bring a lot of dependencies, so you can choose the right one. My second motivation was to write a pkg plugin. At that time pkg was a new thing and we had this plugin feature, and as far as I know this is probably the first plugin available for pkg, and probably one of the few that is still working and has real users. So yes, pkg-provides was introduced at FOSDEM one year later, in 2018, and this is where the story starts. These are the daily connections of people who access the server to download the database. It doesn't look like much, but I consider pkg-provides a developer tool, so I don't expect regular people to use it that much, because it doesn't make much sense for them. Also, people only access the server when they perform a pkg update. I don't update the database all the time, so when you are doing an update, the plugin checks whether there is a new database and downloads it. Oh yes, this is what happens when you host your own server: from time to time you have breakages, and in this case it was some kind of certificate issue. Some numbers. The question is where people come from, and as you can see, it matches the FreeBSD developer base a lot. You have people mostly in the USA, Canada, some people in South America, mostly in Europe, some people in Asia and Australia, and unfortunately we don't have many people from Africa; it looks like people don't use BSD much there. If you take the numbers, the big chunk of users is in the United States, then Germany and France have about the same numbers, and after that you have a lot of countries around the world. What's funny is that we have the Russian Federation and Ukraine, and I have the same number of people coming from both countries. So strange. And from time to time you have some special guests: for
a couple of months I had people from Saint Kitts and Nevis. If you don't know, Saint Kitts is a small island here, close to Cuba, and it's basically just a place to put money and register your companies when you have things to hide. So okay, we have developers there. A quick look at the database. As I said, at this time we have 58,000 ports. If you consider that we have three major releases and several architectures, that gives you a lot of packages. As I said, it's not something you can build on your own, because I don't have enough space or CPU. Also, to obtain the list of files, it's not possible to trust pkg-plist, because those files are not always complete; sometimes there are replacements done during the build. And I can't just download the packages as they are, because it takes time and space to download the full set of packages, and it also has an impact on mirror performance. So I found a simple way: I try to abuse the package file format. As you know, packages are just archives that contain the manifest of the package twice, and the second manifest, which is the full manifest, has the list of files provided by the package. Then come all the files. The thing is, you can just use the HTTP HEAD request to download only the first part of the package, and then try to extract the manifest. If it works and the JSON file is valid, you are good. If you get an error, you just increase the size a little bit and remember it for the next time. This way you download only the manifest part of the files. The good thing is that HTTP HEAD is a totally standard method, so we are not abusing the HTTP server and we are not producing breakages in connections. We are saving people's bandwidth, because we are not using the package mirror just for ourselves, and we also save everybody's time. So, talking about the database itself: at the beginning I designed it as just a flat text file, just because
it's easy to read and search. But I had this problem: it was too large, 36 megabytes even compressed. So in 2018, at EuroBSDCon, I was talking with Marc Espie about it, and he suggested I use the locate file format. locate is a standard UNIX tool that stores the list of all the files you have on your computer so that it's easy to find where they are, and it uses an algorithm called bigram to store the data. This format is pretty efficient: once the database is generated and compressed, the size of the file is reduced by a factor of about two. So it's pretty convenient, because instead of downloading the database for a couple of minutes, we can have it in about a minute, which is pretty similar to the time pkg itself takes. So, pkg-provides usages. As I showed you, you can perform reverse searches: you're looking for a file, and it gives you the name of the package. But you can also use it to identify unexpected files in packages, like core dumps. Things happen: sometimes people commit, and the commit comes with binary files or dumps. You can also use it to search for duplicate files, when two packages install the same file and conflict. For a long time I reported this information to the developers and tried to track exactly how many conflicts we have. Then Stefan, who is here, decided to start looking at it and to use the database I provide to search for file conflicts and fix them, to avoid conflicts during a package install. Because pkg, the tool, can handle conflicts, but it handles them the worst way: it says, okay, you have a conflict, so either you can't install the new package or you have to remove the old one. And there are a lot of funny things, because we have binaries with exactly the same name doing completely different things,
sometimes a game and a network manager, and wow. So yes. And I still have six minutes, so if you have questions, please. Yeah, the options? No, what I take is the real files available on the package servers, so it's the real thing you get if you install a package. Sorry, the question was whether I scan for options, and the answer is no. I just track what's on the package servers and the real files installed if you install from them. If you have a custom package server, it doesn't tell you which files you are installing. Yes, it's just for standard FreeBSD packages, the way we build them. Okay, the question is how long it takes to scan the packages and build the index. The first scan, I haven't done it for a long time, but I think it's like half an hour, something like that. And once I've built it, I can update it in a few seconds to a minute. The good thing is that we have a manifest file with all the packages and their versions, so I can see which packages changed. I just look at the packages whose version changed and recover only their manifests. I think the maximum is like five minutes for one architecture and version. Yes, for the database I use the locate format, because it's really convenient for this kind of usage. Yes, I store that, and I include it because of the way bigram works: it takes just a line, and a line is considered a path, and there's no extra data around it. So I just include the package name inside the path and use a special character as a separator. In this case I use the star character, because I don't expect people to put a star in a package name. Yes,
absolutely. And I use a regular expression for search, but the result is quite acceptable. Compressed, it's 17 megabytes, around that, because it depends on which version you're looking at, but basically it's around this size. Yes, could be. Sorry, the question? Aha, this was a trick just to see if I can repeat it. Yes, the question was why not use a slash instead of a star, because a slash probably makes more sense. Yes, we can probably do that. Oh, good question. No? I just see happy people. |
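The database layout described in the Q&A, one line per file with the package name embedded and a star as separator, searched with a regular expression, can be sketched as follows. This is a rough illustration in Python and not the actual pkg-provides code or file format: putting the package name before the path is an assumption, the sample packages and paths are made up, and the real database is stored in the locate format rather than as a plain list.

```python
import re

SEPARATOR = "*"  # the talk mentions a star between package name and path

# Made-up sample data; package names and paths are placeholders.
SAMPLE = {
    "mpg123": ["/usr/local/lib/libmpg123.so", "/usr/local/bin/mpg123"],
    "zoom": ["/compat/linux/opt/zoom/libmpg123.so"],
    "some-game": ["/usr/local/bin/samename"],
    "some-netmgr": ["/usr/local/bin/samename"],
}


def build_index(packages):
    """Flatten {package: [paths]} into one 'package*path' line per file."""
    return [
        f"{name}{SEPARATOR}{path}"
        for name, paths in sorted(packages.items())
        for path in paths
    ]


def provides(index, pattern):
    """Reverse search: which packages ship a path matching the regex?"""
    regex = re.compile(pattern)
    hits = {}
    for line in index:
        name, _, path = line.partition(SEPARATOR)
        if regex.search(path):
            hits.setdefault(name, []).append(path)
    return hits


def conflicts(index):
    """Paths installed by more than one package."""
    owners = {}
    for line in index:
        name, _, path = line.partition(SEPARATOR)
        owners.setdefault(path, set()).add(name)
    return {p: sorted(n) for p, n in owners.items() if len(n) > 1}


if __name__ == "__main__":
    index = build_index(SAMPLE)
    print(provides(index, r"libmpg123\.so$"))
    print(conflicts(index))
```

Because each record is a single line, the same index serves both uses mentioned in the talk: the reverse lookup and the hunt for files claimed by two packages.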
Chimera Linux
A BSD/LLVM distro from scratch |
All right. Looks like we can begin. So, welcome. I will tell you about my project, which is called Chimera Linux. But first, let me introduce myself a bit. I'm a software developer from the Czech Republic. I've been contributing to open-source software since 2007. Currently, I'm on a break from work, so I'm kind of working on the distro full-time. When I'm not on a break from work, I work in the WebKit team at Igalia. Previously, I also used to work for Samsung in the open-source group, where I worked on the Enlightenment Foundation Libraries and the window manager. Since 2009, I've been using FreeBSD, also on desktop for about 10 years. But I've not been using it on desktop since about 2018, because I've been mostly using POWER architecture computers these days, which FreeBSD doesn't have the greatest support for. So, for example, my GPU wouldn't work. I'm also a former developer of the Void Linux distribution, which has served as a huge inspiration for this project, especially in the design of the packaging system. And I do all sorts of things. Besides distribution development, I also do, for example, game development and compiler stuff. I did some kernel bits as well. But now, what's Chimera Linux? It's a new Linux distribution which I started in 2021. It's a general-purpose distribution created from scratch. It utilizes core tools from FreeBSD, which is one of the big differences from standard distributions, which use GNU tools for this, or BusyBox, for example. It uses the LLVM toolchain to compile all the packages. As a matter of fact, there's currently no GCC in the distribution other than for some variants of U-Boot for specific ARM devices. It uses the musl libc, and it's a rolling release distribution, so there are no releases. It sort of updates continuously. It's also highly portable to many architectures. Right now, we are supporting AArch64 and POWER little endian.
Soon there will be POWER big endian, x86_64, as well as complete support for 64-bit RISC-V. I started this project in early-to-mid 2021. It started with cbuild, which is sort of a meta-build system for packages. You create your packaging templates, which basically describe the package and how to build it, and cbuild builds it. I was a Void Linux developer at the time, and I started cbuild as a way to investigate whether I could fix many of the shortcomings of Void Linux's xbps-src system. So I created a quick distribution around cbuild, which consisted of GCC and the GNU userland, as well as the XBPS package manager, which Void uses. At this point, there were only about 50 packaging templates, so it was very tiny. It definitely couldn't boot, because it had no kernel, no bootloader, not even an init system or anything. It was just a little container which was capable of building itself when hosted on another distribution. As I said, I was trying to fix many of the issues, and the main focuses of cbuild have been performance as well as correctness. This was when I first managed to make Chimera boot in a VM. Shortly after those 50 packages, a switch to LLVM and the removal of GCC followed, as well as a switch to FreeBSD tools, the removal of all the GNU stuff and so on, and a gradual expansion of the package set. I've been iteratively enhancing the distribution ever since, until it got to the current state. In late 2021, it was possible to boot the system, and it was capable of bootstrapping itself. In early 2022, there was already a full GNOME desktop. This was when I got a Wayland compositor running, and of course, everything needs to be able to run DOOM, so we got DOOM working. There's a terminal and some other basic stuff, but this was, I believe, around late 2021. I had a talk about the distro at FOSDEM 2022 too, and many things happened during 2022, so last year I did the talk as a sort of chronological thing.
I'm not going to do that this year because there have been too many things, and I couldn't fit them into a 50-minute slot. A huge focus over the last year has been on security hardening and on developing new solutions for things we've been missing. I'm currently aiming for an alpha release, which will be a sort of early-adopter release where things will mostly work on desktop, in late February or early March. I plan to make it coincide with one of the betas of FreeBSD 13.2 in order to be able to rebase the tooling. Now, for some motivations: why did I create this project? I've been unhappy with existing systems. There are many great things that existing systems have, but there's always at least one thing which has been annoying me. I wanted to create a thing which would actually suit me in every single way. I wanted to make a well-rounded, practical operating system that wouldn't be just a toy, but something people could actually use. At the same time, I would like to improve software in general, mainly in terms of portability as well as security, when it comes to things like usage of sanitizers and so on. I would like to make full use of LLVM: not just replace the GCC compiler, but actually utilize the unique strengths of LLVM, which include a great sanitizer infrastructure, things like ThinLTO, which GCC still doesn't have, and so on. Of course, proving that Linux is not GNU/Linux is also a major thing. While doing all this, I wanted to have some fun, and some people on the internet said I couldn't do this. Of course, it's important to prove them wrong. I wanted to build a nice community which would be fun to hang around with and make a good system for both myself and for other people.
Now, for some general principles of the project. I strongly believe that projects which are centered around a single goal are eventually doomed to fail, because once you reach this goal, you have nothing else to do. At the same time, it creates dogmatic lines which you are not allowed to cross, and it really restricts you in development. On the other side of the problem, there's scope creep: if you have too many things to do and you keep expanding on them, eventually you get to the point where you never get anything done. So it's important to balance these things. I think opinionated development is overall a good thing because it gives you a sense of direction, which is always nice to have. Obviously, quality of the code matters, but quality of the community matters even more. I think fun is good, so I would like to try to keep it that way and not get too technical in the process. I think free and open source software projects are social spaces, and that's why, if you let toxic people into your community, it's eventually going to become a chore for everybody else. I try hard to keep them out, but at the same time, I try to make sure it does not get overly elitist, because it should be an open, inclusive project for everybody. As for technical principles, I try to make sure things are strict by default and try to avoid technical debt at all costs. There should usually be just one way to do things. That doesn't mean there can only be one way, but rather that there is a good default that people are supposed to follow, one that's intuitive and easy to follow. Things should remain as simple as possible, but not too simple. There are many people who overly focus on things like minimalist systems, and in the process they end up forgetting what's actually practical. I think security and hardening are also very important, and in many Linux distributions they're sort of overlooked. I think portability is also extremely important.
There are many kinds of hardware and people are using many different kinds of hardware. Of course, most people have their x86 computers, but there's more than you may think: things like RISC-V are taking off, there are POWER workstations, there's ARM and so on, so it's good to support all of these. Good tooling is also very important, and related to that is self-sustainability. That basically means whatever infrastructure you have should be self-contained, easy to get going and easy to replicate on any new computer. Related to that is being able to bootstrap the system from source code. I think that's sort of a double-edged sword, because some people don't care about bootstrappability at all, and things are bootstrapped from massive binaries downloaded from the internet. On the other side of the coin, there are people who insist on complete bootstrappability from source code for everything, even if it involves doing completely cursed things. Have you ever seen how to bootstrap the Haskell compiler completely from source? If you want to do it, you have to go through an ancient unmaintained Haskell compiler from around 2004 which only targets 32-bit x86 computers, compile some stuff on that, and then iterate through newer versions. Eventually you get to GHC, and then you can cross-compile a partial distribution for your architecture, go from that, and eventually you reach your goal. I think it's a means to an end, and it's important, but not that important. Another thing I've seen many times over the years: if something is written in shell and it's a complicated program, it probably shouldn't be. It should be easy to do the right thing, but tooling should also make it difficult to do the bad thing, to steer people towards doing what's right, and doing so out of the box.
Documentation is obviously also important. Many people avoid writing documentation, and I understand them because I'm also guilty of this in many cases, but we should strive for good documentation. There's also the question of systemd. I believe systemd is in many aspects not great, but it also brought a necessary change to Linux, and there are many people in distributions who just stick their head in the sand and avoid even considering that systemd might have brought some useful things, and that it might also be partly their fault that it has become so widely adopted. So we should develop good solutions to counter whatever systemd has come up with, and basically always try to improve. Now let's take a look at how a BSD system is developed. Usually you have your entire system in a single tree and a single repository, typically SVN and so on, and you have lots of different components in this repository. It's a complete system capable of booting, so if you invoke the central makefile and compile the system, you generally compile your kernel and your userland, and if you put them together you get a system which is capable of booting. Third-party software which is not required for the base system is distributed through some kind of ports system. Of course, this doesn't mean that there are no third-party components in the base of a BSD system, because I'm not aware of any BSD system which develops a complete replacement for the toolchain, for example. You have your LLVM or whatever in base, and usually it has its own build system integrated with the existing makefiles, but it's a single tree. Now let's contrast this with a Linux distribution. A Linux distribution is a collection of software from many different parties, shipped as separate packages, and you have the Linux kernel as the base layer. That's always the case; otherwise it wouldn't be a Linux distribution.
You have your userland tooling, which is often supplied by GNU, and you have the libc, which is also often supplied by GNU as glibc, and you have the toolchain to build all this, which is also often GNU, because while Clang is used by some distributions, there are not too many of them. You have the service manager and some auxiliary tooling around the service manager, which is often systemd nowadays. This is tied together with a package manager, which handles installing, removing and so on. Usually some of the components are always present, and then you can install or remove whatever else you want. Linux plus GCC plus glibc plus coreutils, findutils, diffutils and so on makes GNU/Linux, or what is called GNU/Linux. Distributions exist to make sure that all these components work together and combine well, because many different distributions combine them in different ways and have different versions of these components, and they all have to play nice. The Linux kernel has a rule of never breaking user space: if a new version of the Linux kernel results in a binary not working, it means it's a bug in the kernel, even if the behavior the binary relied on was originally unintended. So this can be kind of a pain. But let's get back to Chimera, starting with the toolchain. LLVM on Linux is pretty seamless nowadays, most of the time. You have it available on most Linux systems, but there it's a different arrangement, because the runtime is not provided by LLVM. GCC provides it, and it's called libgcc. It's mostly ABI compatible with libunwind from LLVM, but it also includes some built-ins which are provided by a separate library in LLVM. LLVM comes with its own runtime called compiler-rt, and this is used in Chimera instead of libgcc.
For the C library we use musl, because it's a proven, good implementation of a C library which is used by several distributions already, and you can make most software work on it just fine, maybe with a few patches, but better than with other libcs. When you have a GNU toolchain you usually have GNU binutils to complement GCC, as well as elfutils to provide libelf. Binutils provides things like the linker, because GCC does not come with its own linker. It also provides different tools which are used together with the compiler, things like the archiver, readelf and that kind of stuff. In Chimera, LLD from LLVM is used as the linker, and it's used everywhere. As for the other tooling which is provided by binutils, elftoolchain provides this tooling, and this is also used on FreeBSD to provide these tools. It also provides a libelf implementation which replaces the one provided by elfutils. Libelf is used in many places, but for example the kernel requires it. LLVM also provides most of the tools which are provided by binutils. They have an llvm prefix, so for example llvm-readelf. We do not use those in the core system most of the time. So now to sort out the core userland. You have many components, GNU as well as non-GNU, like coreutils, findutils, diffutils and so on. You also have util-linux, which is used by pretty much all distributions and provides sort of a mixture of tools for all sorts of stuff. In existing non-GNU distros such as Alpine you often have BusyBox, which is a single binary that can be configured to include many different tools which are otherwise provided by coreutils and so on, as well as by util-linux. The main strength of BusyBox is that it's a single binary, so you can put it in embedded environments and have things mostly work. But the other side of the coin is that it's very spartan when it comes to functionality, and the code is also not very good.
But the other alternatives are usually even worse in terms of available functionality. So FreeBSD tools are the answer here, and that's what we've done. I found this third-party port of FreeBSD's tools called bsdutils. It was a sort of incomplete, experimental thing which was not quite ready for an actual system. So I helped complete it and reach parity with coreutils. In the process I fixed many bugs which were introduced during porting. I also ported many other tools to expand coverage, and the result is chimerautils, which the distro currently maintains, and it's a single, easy-to-build package which includes all of the tooling you want. And this replaces not just GNU tooling but also, for example, a portion of util-linux, which makes things much easier for the distribution, especially in terms of bootstrapping. For example, in Void Linux, in xbps-src, which is the build system similar to cbuild, you have a stripped-down version of util-linux in the base build container, and you need this because some of these tools are necessary. But this means a bootstrap problem, because when you build a full version of util-linux you have many dependencies which you do not want during bootstrapping of your system, things like udev or that kind of stuff which you really don't want to pull in. So it has a stripped-down version of util-linux for that, and then it has a full version which is built separately, and it's kind of a mess. If we have a single package for the userland, all of this can be avoided, and then only a partial build of util-linux can be done if needed. Chimerautils is lean enough for various environments, things like initramfs or even embedded things, but at the same time it's fully featured enough to be used as interactive tooling, so it's a nice all-in-one thing. And of course it helps break up the current monoculture of tooling, and it's also easy to harden.
For example, chimerautils uses Clang Control Flow Integrity hardening, which can be enabled on chimerautils very easily, and it just works. Now to get the kernel sorted out. In these two photos, one is Chimera running on the MNT Reform laptop and the other is running on a Raspberry Pi 3. The kernel is mostly compatible with Clang these days, and some patches are needed to support BSD utilities as well as the libelf from elftoolchain. I would like to eventually upstream these things and make sure things work out of the box. Until recently there was an issue with the option to use Clang's internal assembler. It did not work on some architectures, notably 64-bit POWER, because of some legacy debugging format nonsense. So GNU binutils was used for that for some time, but nowadays it's not a problem and the Clang assembler just works for every architecture. CKMS: what is CKMS? Distros usually use DKMS, which stands for Dynamic Kernel Module Support, to build out-of-tree kernel modules. It's a massive 5k-line bash script, and it has functionality which seemed like a good idea at the time but which nobody uses and which no longer seems like a good idea. For example, DKMS can package kernel modules so you can distribute them, and of course this doesn't work, because every distro has its own kernel, which can result in slight differences in ABI and so on, so you cannot really do that. I created CKMS, which stands for Chimera Kernel Module System, and it's kind of similar to DKMS but it's much more lightweight, more robust, and it's implemented in Python. It has privilege separation, so when you have your package manager build a kernel module in a hook during installation and you run your package manager as root, it will properly drop privileges, so it does not run the whole compilation of the module as root, which happens with DKMS in most setups. Now for the package manager, that's an important thing in a distro.
I considered the FreeBSD package manager at some point, but it was not in quite the shape I would like for production. I did contribute back some patches to fix a bunch of things with musl, because that was the main thing which was really problematic, and I got it working, but there are things such as version expressions and the version string handling which are a work in progress, and it's quite obvious that it's all mainly geared towards FreeBSD's use right now. So eventually I ended up investigating apk from Alpine Linux, which proved to be a great fit. For one, it's lightweight, but it's also fairly powerful, and I really like its virtual package system. It handles things like shared libraries very seamlessly: shared libraries in packages are provided basically as virtual packages, and this makes them easily searchable, easy for the solver and so on. I eventually transitioned to apk-tools version 3, which is the next generation of apk. It is currently not used by Alpine and it does not have a stable release yet, but it works great. The main difference in apk 3 is that it no longer uses tarballs as packages. It has a new custom structured format which should help with avoiding vulnerabilities in the package manager. By summer 2021 it was fully integrated in cbuild and it just worked. Service management is another big thing you need to boot a Linux distribution. Many options were evaluated in the process, for example runit, which is used by Void Linux, s6, which is sort of the new kid on the block, and OpenRC, which is sort of classic and built on the same principles as classic rc systems. In the end I chose dinit, which is a new service manager. I chose it because it's both powerful and lean. It's implemented in modern C++, so it's also safer than most other service managers. Most importantly, it took me about one afternoon to get it fully working and get the system from not booting at all to booting completely.
It's supervising, which means most daemons are supervised by the service manager by running in the foreground and being basically child processes of the service manager, but you can have background processes as well. That's less robust, so it should be avoided most of the time. It's dependency based, so it can ensure that your services start in the correct order. It has support for things like one-shots, which help immensely during early boot, because most things you need to do during early boot are basically things you run once, and they do not have any sort of persistent process. So the early boot process is full of one-shots. For example, Void Linux with runit solves this by making these one-shots a bunch of sequential shell scripts which are run before the actual services are started, and it's not a great solution because it's not very flexible. In any case, dinit is a good base for a solid service infrastructure. We have a custom suite of core services for Linux written from scratch. It has full support for fine-grained targets. Basically, a target is a logical service which does not do anything by itself except act as some sort of sentinel. You can, for example, have a network target, and then other things can say "I want to start before this" or "I want to start after this". You can make sure that your services start only after the network is up, for example. It also has first-class support for user services, which is very important, and I'll get to that later. The eventual goal is to have all long-running processes be services, and there's also the matter of session tracking, which I'll describe in a bit. Now, this is a new project the distro came up with. It's called turnstile, and it's an answer to the logind part of systemd. Linux mostly uses systemd-logind for session tracking.
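As a rough illustration of the dependency-based ordering and the sentinel-style targets described above, here is a minimal Python sketch. It is not dinit's implementation or configuration format; the service names and the `start_order` helper are hypothetical, and the standard-library `graphlib` module stands in for whatever ordering logic a real service manager uses.

```python
from graphlib import TopologicalSorter

# Hypothetical service graph: each entry maps a service to the set of
# services that must be up before it may start.
services = {
    "early-mounts": set(),               # one-shot: runs once, no persistent process
    "udevd": {"early-mounts"},           # supervised daemon
    "dhcp-client": {"udevd"},            # supervised daemon
    "network.target": {"dhcp-client"},   # target: does nothing itself, only a sentinel
    "sshd": {"network.target"},          # "start after the network is up"
    "ntpd": {"network.target"},
}


def start_order(graph):
    """Return one valid start order in which every dependency comes first."""
    return list(TopologicalSorter(graph).static_order())


order = start_order(services)
print(order)
```

Because `sshd` and `ntpd` depend only on the target, whatever actually brings the network up can be swapped without touching them, which is the point of having a sentinel.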
What that does is basically know when a user has logged in, or when a user has logged in on another console, and it also knows when the user has logged out, and it can be used by, say, desktop environments in many different ways. There's elogind, which exists as a standalone version that is basically just ripped out of systemd, with the dependencies stubbed out, and it's sort of dirty and not great. This is done by running a daemon, which is called logind, and a module in the PAM infrastructure, which is obviously used for authentication. The PAM module basically lets the daemon know when a new session has started, and it also lets it know when a session has ended. There's also seat management, which elogind also does, but this is not widely used because usually you only have one seat. It's used by desktop environments, especially things like Wayland compositors. With systemd, most importantly, logind also spawns a user instance of systemd, which acts just like the normal service manager, but it runs as your user and it runs user services. Elogind cannot do this because it has no idea what other init system or what user service manager you might be running. So this functionality is removed and there's no way to access it. This is one of the reasons why I developed turnstile. It aims to eventually replace elogind, and it was originally created just to manage those user instances of dinit. The issue when running this in parallel with elogind was that sometimes it needs to know something which elogind knows, but sometimes elogind also needs to know something the user service manager knows. It especially affects things like lingering. For example, you can enable a specific user to linger, which means that user's services will stay up even after the user has fully logged out.
Elogind manages your runtime directory for you, which is used by many services, and upon logout it removes this runtime directory. If you still have some user services running and elogind has removed your runtime directory, then things go wrong. So it needs to be integrated, and I plan to eventually fully replace elogind. Turnstile does not manage seats, because there's already a project called libseat and seatd which can do this satisfactorily, but libseat does not do the session tracking, so they can be used together. I plan to provide a library alongside the daemon. This library will provide an agnostic API. This API will have multiple backends: it will have a backend for logind, it will have a backend for turnstiled, as well as potentially other solutions, and then things like desktops will be able to use this and be actually portable. Right now, to have GNOME on FreeBSD, for example, it needs many patches to replace this functionality, and it's just not great. Having an agnostic API which is not provided by systemd would be a much nicer solution. Of course I'll also have to convince upstreams to adopt it. One thing which we do with turnstile is managing the D-Bus session bus as a user service. This has an advantage because you have a single session bus per user, just like when you have systemd. Well, why have a single session bus? The session bus has a socket. This socket is somewhere on the file system, and it is used to identify other things on the bus. The way to locate the session bus is provided via an environment variable, so if you have the environment variable in your environment, then things can use it to read the path and actually locate the socket. Traditionally you had the session bus started by, for example, your X11 script, xinitrc, which would run something like dbus-run-session something. That means the session bus was only available within your single graphical TTY.
This is not great, because then when you switch consoles and log in there, and you want to run something which needs to access the session bus, it doesn't know about it. Systemd solves this. We also solve this by running the session bus as a user service, so when you first log in it automatically spawns the session bus. When you last log out it stops the session bus, and it's available on every single VT. This also has limitless potential for other user services. We can do things like D-Bus activation without having D-Bus spawn the services itself. It's currently also used for the sound server, for example with PipeWire. Now let's move on to cbuild. Cbuild is basically the build system for cports, as I already said. Cbuild depends only on the standard library. This is what a template might look like. This is the template to build the DOOM game. As you can see, it's mostly metadata. There's one hook in there which runs autoreconf, because there is no other way to do this. There's a build style for configure scripts, which basically strips away all the non-declarative things you would otherwise need. How cbuild works is that it builds all the software in a simple container, called a build root in our terminology. It's a minimized Chimera system. There are some packages which provide the baseline, and your build dependencies, which are specified by the template, are also installed into this container. This container is fully unprivileged, so you don't need to run anything as root, and it's fully sandboxed. This is done with Linux namespaces. The container is also read-only after the build dependencies are installed, which means no package build can actually change anything in the container other than in its own build directory. It also has no network access after all the fetch stage things are done, and it has no access to the outside system.
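Since the slide with the template is not part of the transcript, here is a rough sketch of what a mostly-metadata template with a single hook could look like. It assumes templates are plain Python files; the field names follow cbuild conventions as best recalled and may not match the current version, and every value (package name, URLs, dependencies, checksum) is a placeholder rather than the template shown in the talk.

```python
# Illustrative cbuild-style template; all values below are placeholders.
pkgname = "example-game"
pkgver = "1.0.0"
pkgrel = 0
build_style = "gnu_configure"          # the build style handles configure/make/install
hostmakedepends = ["automake", "pkgconf"]
makedepends = ["sdl2-devel"]
pkgdesc = "Example game used to illustrate a template"
maintainer = "Jane Doe <jane@example.org>"
license = "GPL-2.0-or-later"
url = "https://example.org/example-game"
source = f"{url}/releases/example-game-{pkgver}.tar.gz"
sha256 = "0" * 64                      # placeholder checksum


def pre_configure(self):
    # The single imperative hook: regenerate the configure script,
    # since the build style has no declarative way to express this.
    self.do("autoreconf", "-if")
```

Everything except the hook is data that a build tool can lint, query, and reuse for cross-compilation without executing package-specific logic.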
Templates are also declarative, as I said, ideally just metadata, and cbuild has fully transparent support for cross-compiling with most build systems, which means in most templates you don't need to do anything and it will be able to cross-compile without any additional effort. It has clean handling of common build systems, which includes configure scripts, Meson, CMake and so on, and it's strict. It has mandatory linting hooks for many things, and unit tests, where possible, will run out of the box. I strongly believe that being strict by default is good, because you can always make things looser if you need to, but if you have things loose by default and then you need to make them stricter, and you have many hundreds to thousands of packages and need to adjust every single one of them, it becomes an effort which cannot be done because it's just too much. It has support for things like bulk builds, where it can properly order things in the batch to build without having dependency ordering issues. It can check upstream projects for new versions and so on. Build flags: all the basic hardening which Linux distros typically use, like fortify, position-independent executables, stack protector and so on, is used. On top of that we use system-wide LTO for practically every package. I think there are only about 30 templates out of close to a thousand which have LTO disabled for different reasons. In some cases it could be enabled, but it's not worth it. We do utilize a system-wide subset of the undefined behavior sanitizer. It deals with things like trapping signed integer overflows in order to avoid potential problems. Also, CFI, or control flow integrity, is used for many packages. It cannot be used for all because it breaks on a lot of stuff. It's very strict when it comes to typing of functions, but it's still used on a couple hundred packages. The allocator: we now use the Scudo allocator from LLVM, which is also used, for example, on Google's Android. It replaces the allocator in musl.
This is not because of hardening, because the musl allocator is already hardened, and Scudo is also a hardened allocator. But Scudo has significantly better multi-threaded performance, because musl's mallocng uses a single global lock. This is a trade-off, but it also means that the stock allocator in musl performs poorly in many things, and it's something people commonly complain about, so we now rely on Scudo. There's also the advantage of being able to eventually deploy GWP-ASan, which is sort of a sampling runtime version of AddressSanitizer that can catch many memory errors at runtime with minimal performance overhead. This is not enabled yet, but it will be at some point. Other core things for the distro: some tooling is taken from Debian. For example, we use initramfs-tools to generate initramfs images, because other solutions were generally found to be unsatisfactory, for example requiring Bash for the hooks and so on. Initramfs-tools is very clean and simple and nice to work with. We also use console-setup from Debian to do console and keyboard configuration, as well as the scripts for handling encrypted drives. Eventually I also had to add some other things, like GRUB bootloader support for ZFS. We now support root on ZFS very easily, and so on. This is the Chimera desktop on RISC-V. You can see it runs things like Firefox, for example, which does not build out of the box, but I made it work. This is on the HiFive Unmatched board from SiFive. When I was starting to add the desktop, the first thing I added was the Weston Wayland compositor as well as GTK. This provided a baseline set of dependencies which are also used by pretty much everything else. Then I expanded with the X.Org stack, things like the Enlightenment window manager, as well as pekwm for a simple X11 window manager. I added the multimedia stack, including FFmpeg, GStreamer, media players and so on.
In spring 2022 I added the GNOME desktop, which is the default choice, but of course you can use anything else you want. I also added web browsers. This includes Epiphany, which comes with GNOME and is built on WebKit, and Firefox, which is the alternative choice, and of course some games. As I said before, I would like to release an alpha version in late February or early March. I'm not sure if this will happen, but I hope it will. Before doing this I would like to perform a complete world rebuild of all the packages, because I have introduced some things in cbuild which I would like to propagate into existing packages, and that has not happened yet. So just to be clean, I would like to build everything with LLVM 15 as it is right now and then release the alpha. I will need to launch automatic build infrastructure. I currently have a server in colo in my city, but it's not on the public network yet, so I need to set up the public network and launch things like this, as well as CI. I would like to clean up the remaining fallout from the recent hardening work, as well as update every template to its latest version. After the alpha, and the alpha cycle is expected to take about half a year to one year, I would like to add a libgcc compatibility shim so we can run existing binaries, because right now you cannot run existing binaries, since the system runtime is different. I would like to add support for D-Bus activation, so that D-Bus does not run daemons by itself through D-Bus service files but instead delegates it to the service manager. I would like to investigate additional hardening, things like LLVM SafeStack, and I would like to improve the documentation. Right now there's the beginning of the Chimera handbook, which includes some handy basic information like installation and how to set up encrypted drives and so on, but it can always use more documentation. Locale support is another thing I would like to expand.
This is a problem in pretty much all musl distros, with the locale support being sort of limited: you can have translations, but not things like formats and so on. So, in conclusion, we are currently nearing usability, and it should be suitable for early adopters by March. I would like to get all the major changes done by beta and continue packaging more software, as well as cooperate with upstreams, including the FreeBSD upstream, on sending fixes to tooling and so on. In any case, thank you for listening, and if you have any questions you can ask them. Of course we also have stickers, so come pick them up. Yeah, as I said, it's supported. I recently introduced it and I recently tested it. Oh yeah, sure. He was asking if ZFS on root is supported. Yeah, it's supported. It uses the upstream script for initramfs-tools, just patched to support our userland, because we don't use BusyBox, and it just works. We also provide ZFS packages with pre-compiled binary modules, so it's not necessarily compiled from source during installation, and CKMS can handle things in a non-conflicting way, so if you have the package installed for the stock kernel which provides the binary ZFS modules, CKMS will not try to build the modules again. Yeah. Okay, well, the question was what the target audience is, basically. I would say the primary audience is basically the same people who would use things like Gentoo and so on, basically power users who can find their way around things, because there's no insistence on providing graphical clickable stuff for everything, because it just wouldn't be possible. I just do not have the manpower to do all this, so it's for power users who can find their way around a simple system. Yeah. He was asking where the name comes from. Well, a chimera is basically a mythical monster made up of three different animals, so it should be fairly obvious where it comes from. We have the Linux kernel and FreeBSD stuff and other stuff. Yeah. Okay.
Yeah, the question was if I'm working with FreeBSD. I know the project has taken back some of the changes. I have some patches in chimerautils which I do believe would be useful for upstream, and yes, I do want to submit them upstream. For example, I have a fix in the sort tool which fixes a crash with control flow integrity hardening, so this would be nice to include. How did you solve the problem of the PAM service program? Sorry, can you repeat? Does turnstile use cgroups? The question is if turnstile uses cgroups. No, it doesn't. It does the same thing as logind. Basically there is a PAM module to report things, and it keeps a persistent socket connection to the daemon as long as the session is active. When the socket is closed, the daemon receives a notification via poll on the socket, and once it knows the connection has been closed, it closes the session inside of the daemon. So you open a socket from the PAM service module? The question is if I open the socket from the PAM module. Yes, the PAM module opens a connection to the socket which is provided by the daemon. The daemon opens the socket in the system as a Unix domain socket, and it's only accessible by root, obviously, so the PAM module can access it. So if you open a socket from the PAM module, that socket obviously ends up appearing as a file descriptor inside the program that ran PAM. Does that not interfere with anything? The question is if the PAM module with the socket actually interferes with anything. No, I found it doesn't interfere with anything, and in fact, as far as I know, logind does basically the same thing. There are also several other solutions for handling the runtime directory, which is basically /run/user/<uid>, and they did basically the same thing as far as I know. As far as I can tell, it works okay. Yeah.
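The session-tracking mechanism described in that answer, a connection held open for the lifetime of the session with the daemon noticing when it closes, can be sketched in a few lines of Python. This is only an illustration of the idea, not turnstile's code: the `SessionTracker` class is hypothetical, and a `socketpair` stands in for the PAM module connecting to the daemon's root-only Unix domain socket.

```python
import selectors
import socket


class SessionTracker:
    """Hypothetical daemon side: one open connection per active session."""

    def __init__(self):
        self.sel = selectors.DefaultSelector()
        self.sessions = {}  # connection -> user name

    def open_session(self, conn, user):
        conn.setblocking(False)
        self.sel.register(conn, selectors.EVENT_READ)
        self.sessions[conn] = user

    def poll_once(self, timeout=1.0):
        """Poll the connections and end sessions whose peer has gone away."""
        ended = []
        for key, _ in self.sel.select(timeout):
            conn = key.fileobj
            try:
                data = conn.recv(64)
            except BlockingIOError:
                continue
            if data == b"":  # EOF: the other end closed the connection
                ended.append(self.sessions.pop(conn))
                self.sel.unregister(conn)
                conn.close()
        return ended


# The socketpair stands in for "PAM module connects to the daemon's socket".
daemon_side, pam_side = socket.socketpair()
tracker = SessionTracker()
tracker.open_session(daemon_side, "alice")
active_before = list(tracker.sessions.values())

pam_side.close()             # the login program exits or closes the session
ended = tracker.poll_once()  # the daemon sees EOF and ends the session
```

The appeal of this design is that the daemon also learns about sessions whose owning process crashed, since the kernel closes the descriptor either way.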
So first of all, thanks for your work, it must be really difficult to maintain all these dependencies. Thank you. So my question is, are you planning to integrate any specific SSL library, for example LibreSSL or OpenSSL? Okay, the question was about integrating an SSL library. We do use OpenSSL version 3. This was actually a pretty big transition when it happened, because many things do not, or did not, work well with OpenSSL 3. Fortunately the amount of packages right now is not that huge. There are still some which do not work with OpenSSL 3 and which we do rely on, for example the Heimdal implementation of Kerberos, which we use instead of MIT krb5. It does not work with OpenSSL 3 yet, but it has its own built-in crypto which can be used instead, so we fall back on that for now. Anybody else? Looks like that's it, man, and thank you again. |
Collaborating with Collabora Online
How to re-use Collabora in your work or project |
All good, I don't know if this is working, but it's green. Ah, look at you, crikey, thank you for coming. Look at this, excitable people. Collaboration, well, so here we are with Collabora. I hope this is what you want to hear. One of my concerns is that I've given similar talks in the past, so maybe we have lots of time for questions at the end. So I tried to do some different things this time. One thing that we're really eager to do is to get Collabora Online reused in lots of different places. There's lots of innovation going on out there, and lots of people have great ideas of how to use documents and make them look better on the web. And we would love to integrate with you to do that. So one of the things that we're really useful for is converting documents to different formats, which seems like an easy thing to do, you know? But it's really tough. And to wrap that up nicely for you, we have this beautiful REST endpoint. And it looks so simple, you know, you just do this curl command, brilliant, and you ignore certificates, OK, so you should remove that in deployment. And out of it, you get, well, what do you get? You get your text file turned into a .docx. That's an easy one, right? But imagine you wanted to convert a PDF or, you know, a PPTX into a PDF. Or a very common request is converting PPTX into animated SVG. So we can do that very nicely. We can produce XHTML out of it that you can run in your browser. That's actually how Collabora Online does its presentation thing. So you can get animations and presentations. You can understand the structure of your arbitrarily horrible, say, binary PowerPoint file. And you can dump that into something you can parse and read and interpret and mash up and do cool stuff with. And the good news about that is that, well, say, people do this already in lots of horrible ways. So I will pick on someone. I don't know, is there any? Who do we have in the room? OK, let me think. What integration do I particularly like?
So there is an unnamed open source project. And when it tries to convert its files, to show them in its Jitsi-like, remarkably Jitsi-like, video, no, not Jitsi. I forget. One of these video conferencing systems, BigBlueButton, let's call it. It essentially has a shell script. And all of the good, beautiful software architecture stops at the point you want to convert a document. And it launches a shell script, which starts a Docker image, which then launches another shell script in the Docker image that copies a file into it with some horrible command that then converts it via another shell script. It launches an office suite that sits and talks over RPC. And then it's just absolute horror. And if any of this hangs, or dies, or crashes, or burns, or uses more CPU, or finds that one document that has a real problem, you're just doomed. You have to write all of this lifecycle management nightmare. And the good news is, with our beautiful API, you don't need to care about any of that. Deploy the Docker image, job done. If it's too big, we'll time out. If it's too horrible, we'll tell you. And it's all done for you. So that's kind of good if you want the whole document. Often, though, people have enterprise file sync and share. They want to see their document. They're fed up of seeing an icon. They want to see what's inside it. So again, we can convert your document to an easy thumbnail, very trivially. Nice big image. You can shrink it down to whatever size you like. And that's pretty good. So we're really eager that people use it everywhere. And so we've written most of the work for you already, so you can use it. I think it's an Apache license, something really liberal. I'm more of a copyleft guy, but at least I understand other people aren't. So it's in your language of choice; we probably missed Ruby. I'm going to get in trouble with Neil later. But there you go. And we've done a whole lot of features recently. So one of them I was really surprised and encouraged.
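The curl command from the slide is not in the transcript, so here is a hedged Python sketch of what such a conversion request might look like. It assumes the conversion endpoint lives at `/cool/convert-to/<format>` and takes the file as a multipart form field named `data`; the server address is a placeholder, and `build_convert_request` is a hypothetical helper that only builds the request without sending it.

```python
import uuid
import urllib.request


def build_convert_request(base_url, target_format, filename, payload):
    """Build (but do not send) a multipart POST asking for a conversion.

    Assumption: the endpoint path and the "data" field name follow the
    Collabora Online conversion API; check the SDK docs for your version.
    """
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + payload + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{base_url}/cool/convert-to/{target_format}",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder server address; sending would be urllib.request.urlopen(req).
req = build_convert_request("https://localhost:9980", "docx", "test.txt", b"hello")
print(req.get_method(), req.full_url)
```

Asking for an image format such as `png` instead of `docx` would, under the same assumptions, give the kind of thumbnail mentioned above, and certificate checks should stay enabled outside of local testing.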
I was talking to someone from a European office full of lawyers earlier. And you wouldn't believe it, but they really love citations. They're all court cases, of course. I always think academic citations. But referring to other legal cases is this massive, worldwide web of knowledge about what's fair. And anyway, so we've done lots of things with Zotero. So one of the things that you can do, if you have a Collabora Online integration now, is just to push. We added all this citation stuff in the toolbar here. And Nextcloud implemented this. And all you need to do is provide a box somewhere where you can set this API key. So Zotero have a very nice REST API. And then we plug into that, and you send us this. We had this user info, which has things like avatars and extended information about users that we send to everyone in the UI. We thought it was best not to send your private key to everyone. So we added this extra tag, user private info. And so when you connect to Collabora Online and embed it in your iframe, you need three methods. A GET, so we can get the file and show it and render it. And then a POST so we can send it back again when we've edited it. So that's the save. And then there's a CheckFileInfo. And that basically tells us about you. So who is this person? We've got a path thing, a URL for the document. And we've got an opaque identification token, a password-like thing. But what's their name? Tell me their name. Tell me what they look like, their avatar and this sort of thing. And so you just send this back there and bingo. Suddenly you have beautiful integration with all of your citation libraries. You don't need to run a Java app on the side and then talk. It just works really nicely. And updates all of your citations beautifully with a familiar interface. So that's kind of nice. That's a way of integrating two things together through a very, very simple REST API into a nice UI. Yeah, so LanguageTool is something else I love.
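To make the CheckFileInfo part of that description concrete, here is a minimal sketch of the JSON an integration might return. The basic fields are standard WOPI names; the `UserExtraInfo` and `UserPrivateInfo` keys and the `ZoteroAPIKey` entry are written as best recalled from the Collabora Online SDK and should be treated as assumptions, as should the hypothetical `check_file_info` helper and all sample values.

```python
import json


def check_file_info(filename, size, user_id, user_name, avatar_url, zotero_key=None):
    """Build a CheckFileInfo-style response body (illustrative only)."""
    info = {
        "BaseFileName": filename,
        "Size": size,
        "UserId": user_id,
        "UserFriendlyName": user_name,
        "UserCanWrite": True,
        # Shared with every participant in the editing session.
        "UserExtraInfo": {"avatar": avatar_url},
    }
    if zotero_key:
        # Kept separate so the key is not broadcast to other users.
        info["UserPrivateInfo"] = {"ZoteroAPIKey": zotero_key}
    return json.dumps(info)


response = check_file_info(
    "brief.odt", 12345, "u42", "Alice Example",
    "https://files.example.org/avatars/u42.png",
    zotero_key="placeholder-key",
)
print(response)
```

The split between the two user blocks mirrors the point made above: avatars are meant to be seen by everyone in the document, while the citation API key is only for the user it belongs to.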
I don't know if Daniel's, he's probably a rich man. Grammarly has set a price point for, does anyone get Grammarly adverts? Does anyone watch YouTube? Has been plagued by, yeah. So there are many ways to create value in the world. One of them is, of course, to create value. The other is to tell people you've created value. And I think often as engineers we forget to tell people that we've created the value. That's the problem. We do it all and then there's no marketing. I think Grammarly is probably the extreme example of marketing versus value. But anyway, so they can sell somehow for 30 bucks a month, 50 bucks a month, something like that, a subscription to their web grammar checker that sends all your information to someone else and sends it back with grammar checking, which is great. But because they've set the price point, there's a great company in Germany, I think based in Potsdam, that make, well, they already made an open source grammar checker. They've done the whole create the value bit. And so they now sell these lovely plugins to people, and you can, for much less money, get a better open source product, and they have some of those nice AI things in there. And AI is cool. Of course, for checking an ISBN is valid, probably not the best use of AI, I might argue. But on the other hand, sentence structure and human language, that can be pretty cool. So, they're taking on Grammarly, and the nice thing about that, of course, is you can get a Docker image and you can deploy that in your Kubernetes, whatever, and connect it up to Collabora Online. So, all of your grammar checking stays in-house. So, you get the benefit of all of that goodness, and from a European free software company, I love it. And they're doing well and they're growing really nicely. So, nice to see that happen. Very easy to set up. And they even document the API nicely, which is kind of cool.
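For a self-hosted grammar checker of this kind, the exchange is roughly "send text, get matches back". The sketch below assumes the LanguageTool HTTP API shape (`/v2/check` with `text` and `language` form parameters, and a JSON reply containing a `matches` list); the server address, the helper names, and the sample reply are made up for illustration, and nothing is sent over the network.

```python
import json
import urllib.parse
import urllib.request


def build_check_request(base_url, text, language="en-US"):
    """Build (but do not send) a form-encoded check request."""
    form = urllib.parse.urlencode({"text": text, "language": language}).encode()
    return urllib.request.Request(f"{base_url}/v2/check", data=form, method="POST")


def summarize(response_json):
    """Reduce a check response to (offset, length, message) tuples."""
    result = json.loads(response_json)
    return [(m["offset"], m["length"], m["message"]) for m in result.get("matches", [])]


# Placeholder address of an in-house server.
req = build_check_request("http://localhost:8010", "Thiss is a test.")

# Hand-written sample reply, shaped like a typical response.
sample = json.dumps({
    "matches": [{
        "offset": 0,
        "length": 5,
        "message": "Possible spelling mistake found.",
        "replacements": [{"value": "This"}],
    }]
})
print(summarize(sample))
```

Because the request is just form-encoded text, pointing an editor at an in-house instance rather than a hosted one is a matter of changing the base URL.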
So you can see a number of the features exposed in the JSON API they have at that URL. But again, very simple endpoints: you just send your stuff to check, you get some answers back, and we show them. And then, of course, you can configure that as you like, and they have a web service. Another example of just sending text and getting text back is our DeepL integration. So that's another easy thing that's often useful for people. And it's a bit interesting, this. Obviously, they want to try and retain formatting, which is probably one of the big advantages over pasting it into your web browser. You can buy an enterprise key for DeepL and use that, but you're not going to have it on-premise, I guess. And trying to really get styles through HTML and map them back properly is more challenging than you might imagine, and we haven't done a very good job of it. So if anyone wants to improve that, they're very welcome to come and contribute. But I think this idea of asking the computer to improve my documents, that interaction is there and working nicely, and there's lots of easy, low-hanging fruit if people want to help do cool stuff. One of the other things we try and do is integrate outside the iframe. It's interesting: you create virtualization, for example, and almost all of the interesting thing about virtualization is the bit that isn't virtualized, the bit where you can punch through the magic to not virtualize something and run a command inside it. Similarly with us, the integration around the edge is probably the most interesting bit, and it's much the best if you can do that. So we've written a huge SDK so that it's easy to do, which you can see online. Take save as. It seems easy to save as, right? I just explained to you how saving works: you do a GET and you do a POST, and that's kind of easy for us.
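As a reminder of that three-method contract, here is a hedged sketch of the host side of CheckFileInfo. It assumes the WOPI-style layout (CheckFileInfo on `GET /wopi/files/<id>`, file contents on `GET` and `POST /wopi/files/<id>/contents`) and the field names `UserExtraInfo` and `UserPrivateInfo` with a `ZoteroAPIKey` entry; check those names against the Collabora Online SDK before relying on them. `shareable_view` is a hypothetical helper that only illustrates why the private tag is kept separate.

```python
import json


def check_file_info(file_name, size, user_id, user_name, avatar_url, zotero_api_key):
    """Build a CheckFileInfo-style reply for one user opening one document."""
    return {
        "BaseFileName": file_name,
        "Size": size,
        "UserId": user_id,
        "UserFriendlyName": user_name,
        "UserCanWrite": True,
        # Extras that may be shown to everyone in the editing session.
        "UserExtraInfo": {"avatar": avatar_url},
        # Secrets meant only for this user's own session (assumed field name).
        "UserPrivateInfo": {"ZoteroAPIKey": zotero_api_key},
    }


def shareable_view(info):
    """Hypothetical helper: the part of the reply that is safe to broadcast."""
    return {key: value for key, value in info.items() if key != "UserPrivateInfo"}


info = check_file_info(
    "report.odt", 12345, "u42", "Example User",
    "https://example.org/avatars/u42.png", "not-a-real-key",
)
body = json.dumps(info)  # what the host would return as the HTTP response body
```

The design point is the split itself: avatar and display name travel to every collaborator, while the citation API key stays with the one user it belongs to.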
But if you want to save it as something else, that's more tricky. And yet people kind of want that if they're editing a document. Often people load a document and they continually save-as it through its lifetime. So the document you get actually started in 1995, and it's been saved-as ever since, with a nice template and the WMF preview, a Windows Metafile, in the top right corner and all of that good stuff. And often we see it in the macros. We're analyzing government macros, and we routinely see this: the Office 95 macro API had a compatibility layer when Office 97 arrived, and we're still seeing that in macros, you know, the WordBasic-dot-something thing. Just extraordinary. Anyway, save as is used, so we should do that. I've talked very quickly. Who has got lost? Anyone? No? I'm sorry. Okay, so we need a file picker. How are we going to get that? Well, I mentioned this CheckFileInfo thing that tells us what you can cope with. If you say, well, we can do insert remote image and we can do this write-relative thing, then we'll send you a post message when you click save as. We send a post message outside the frame saying, for example, hey, we want a graphic from somewhere. And then you can do what you like. You pop up your nice file picker, some arbitrary image creator, ASCII art, whatever. We don't care. And when you're done, send us this action, insert graphic, with just a URL to it, and we'll put it in the document. That's kind of cool. Or, for save as, we'll save as and reload and create a window for the new document. So that's really useful and easy to do. And we're using, I think, the same hooks for our new export stuff. There's a whole lot of work in accessible PDF creation and PDF/UA, and, I mean, look at all these options. I hate options, but apparently you need all of these. So we've added loads of them recently and you'll be pleased.
And of course EPUB: it's very, very useful for accessibility. It's kind of an extended HTML, a dialect. So you can integrate easily and get all of this richness suddenly. One of our problems, of course, is that interoperability is really, really key in what we're doing, and people really care about that. And one of the challenges we have at the moment is that our competition really is not great at interoperability. They're spoiled by interoperating with themselves a whole lot, which is easier. I mean, we love ODF, right? But if you save an ODF file, and hey, that's good to see, and then you download it somewhere else and give it to someone on a Windows machine, they'll load it in Word and it will completely mangle the document. They even have a big list of the things they break. There's a thousand pages; it's a very large document that explains all the things they don't do. You know, change tracking. I mean, why would you want that? What kind of feature is that? Anyway, lots of it is dropped on the floor, which would be fine, because obviously that product is awful. But it's sad to be blamed for it. The user perception is that your product doesn't interoperate, and you're like, are you sure it's me? So, tragically, if you use the DOCX file format, we can interoperate better with the other world, which is a shame. But the good news is you shouldn't need to do that. You can use it all online in the browser, and you can feel happy and relaxed knowing it's in ODF format on your server and you have a full-featured experience. But anyway, I was distracted. Remote font management. So that's all very good, but if you've ever written slides, what you'll notice is that you change the wording of this line here until it just about fits in and doesn't wrap horribly. And that's great.
But of course, it's highly dependent on the font being used. If you change the font, be it ever so slightly, the text can grow and then everything looks awful. Of course, my slides look awful anyway, just because I'm a hacker, but other people have beautiful-looking slides. So we decided it was very useful to be able to configure fonts, and that brings a whole load of interesting problems. To make this even easier, we have remote configuration. One of the things that's nice is to be able to deploy lots of images on Kubernetes, and scale on demand, and more and more of these things. But it's a bit of a pain to configure them, particularly for a large hoster. You have customers that arrive quite regularly and add things, and how are you going to deal with that? Do you restart all of that? What do you do? So we have this remote configuration endpoint now, and you can cut a whole load of your config out. Collabora Online will just go, hey, tell me my config, has it changed? It polls every minute or so and updates a whole subset of those settings to make it easier. And one of those, of course, is the font setting. So it's easy then to have a path to fonts. If you have a file sync and share thing and you can manage files, just create a folder and drop loads of fonts in it, and we'll notice. We just get this kind of JSON out of that font endpoint, this thing we can configure in. That tells us where the fonts are, and ideally some timestamps, so we don't continually fetch them. And then we can just build a whole set of fonts. In the background it's very funky, because we have a forkit. You can't fork if you have threads, and it's kind of useful to have threads. So we initialize LibreOffice in what we call the forkit, whose job is just to fork.
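The forkit idea can be pictured with a small, POSIX-only sketch: do the slow start-up work once in a single-threaded parent, then fork a child per document so the child inherits that state without redoing it. This is an illustration of the general pattern only, not Collabora Online's code; `preinitialize`, `serve_document` and `fork_child` are hypothetical names, and the "static data" is a stand-in.

```python
import os


def preinitialize() -> dict:
    """Stand-in for the slow start-up work (configuration, static data)."""
    return {"fonts": ["Example Sans"], "static": list(range(1000))}


def serve_document(shared: dict, doc_name: str) -> str:
    """What a child does with the state it inherited from the parent."""
    return f"{doc_name}: {len(shared['static'])} static items, fonts={shared['fonts']}"


def fork_child(shared: dict, doc_name: str) -> str:
    """Fork once per document; the child reports back through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # must happen while the parent is still single-threaded
    if pid == 0:
        # Child: memory pages are shared with the parent until written to.
        os.close(read_fd)
        try:
            os.write(write_fd, serve_document(shared, doc_name).encode("utf-8"))
        finally:
            os._exit(0)  # leave without running the parent's cleanup handlers
    # Parent: collect the child's answer and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        result = reader.read().decode("utf-8")
    os.waitpid(pid, 0)
    return result


shared_state = preinitialize()
```

Calling `fork_child(shared_state, "a.odt")` returns almost immediately, because the expensive part already happened before the fork.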
We pre-initialize everything, load our configuration, and then we fork and share, copy-on-write, huge amounts of our static data. So if you've started LibreOffice and thought, well, that took several seconds, what is it going to do online? The answer is that we fork within milliseconds and we're ready to load and open a document. But the problem is it really needs all the fonts. And we really don't want to hand all of those fonts to the child processes, and we control this very carefully from a security perspective. So anyway, after lots of work, we now restart this, load the files and pass them in, and we've patched lots of infrastructure. I was actually just talking to Leonard a few minutes ago, and he was telling me, oh, you've got to mount /proc. You have to mount /proc or you'll get screwed over. You have to mount devices, otherwise something will go wrong in your stack somewhere. And we're like, yeah, well, we tested that, and we fixed and patched around those things, so that actually our jails have almost nothing in them. No /proc, only two devices in /dev, no shell. They're pretty well locked down, so we like that. Oh, and then I've got a few minutes, so I'll just show you a whole load of gratuitous features we've added, just in case you like features. The users do, it turns out. So I'm a big hater of the blockchain, but DevDAO actually sponsored some of this work, so we like that. And the European Commission as well. So getting columns into our spreadsheets, lots of them, has been happening. And it's got rid of this very annoying dialog that plagued lots of users for a long time. So that's really cool. Oh, and there's even the proper credit: NGI DAPSI. So the European Union Horizon research program is actually really cool.
For anyone who doesn't know about it: the traditional way of getting funding from the European Union was that, if you had a really good idea, you needed to find 15 other people across 27 countries, try to persuade them all that their idea was the same thing, and then get someone to write the proposal and submit it. And then you don't get it because it's all inconsistent. The good thing about NGI DAPSI, the Horizon research thing, was that they said, hey, let's do something that's good for Europeans, and so they would just give money to single-vision stuff. Our vision was, well, let's fix interoperability. So we did a lot of that, and they paid, which is awesome. Look at that. It just makes life so much easier, doesn't it? And probably good for Europe as well. So anyway: form controls, creating lots of nice rich text folders; much better PDF export, creating real, editable PDFs actually built into the product rather than having to layer things over the top afterwards; starting on theme colors, so that you can select different bits of your document, change the theme and see how that impacts the whole document, and we're doing lots of work to extend that to Writer and Calc. Chart data tables: you wouldn't believe it, but these things at the bottom of charts are very popular in presentations. Some people like lines and other people like numbers, and now you can have both in the same thing. It's a key interoperability thing. And then other random interop things: precisely anchoring your images and reflowing your text very nicely in the browser, improving our formula input bar, an accessibility checker to try and find problems in your documents for the visually impaired, and prettier dialog functionality happening on the client side. Lots of this is now JavaScript on the client side, to make it more accessible and performant. We completely reworked the tile serving thing.
So instead of sending new tiles when things change, we send very small deltas of them. We find the pixels that changed, and then we Zstandard-compress them. So we switched from big PNGs, with even headers in them and crazy stuff, to much smaller Zstandard-compressed things. Thanks to Facebook. I just need to thank Facebook for helping us all get digital sovereignty back. That's important. Password options: you can put passwords on your files, and various attributes. Lots I've mentioned: the PDF things, embedded video playback, if you like that sort of thing. And the last silly idea, I think, in the few seconds I have left: we've done a bit of a concept for running Collabora Online in the browser. So when you go through your tunnel, and my hope is that tunnels get better connectivity, you can click a button and, in theory, run this thing offline. We have a prototype now of Collabora Online that allows you to edit offline; if you're interested in that, there'll be talks in the LibreOffice track, which is, I suppose, somewhere nearby. And there are a whole lot of interesting problems there. If you like wrestling with massive, multi-gigabyte linking and horrible nightmares, do get involved with that. But there's a little prototype there that will allow you to work on that and play with WASM. So yeah, come and hack with us. We have hackfests. There's a LibreOffice HackFest: if you're excited about LibreOffice, and you should be, the cool kids are all using LibreOffice technology, come to our HackFest this Monday and Tuesday. Oh, and we have a community dinner tonight, with pasta, at the Business and Technology Incubator. There's a link there; if you take a photo of it, you'll have it for later, so you can come along. And beyond that, we're running a HackFest at Clare College, Cambridge, on March 28 and 29, which is not only LibreOffice but also Collabora Online. And it would be lovely to see you there.
If you want to come and stay in a beautiful Cambridge college, and wine and dine at our expense, and have some team building and get stuck into Collabora Online, we'd love to have you with us. So thank you for your patience. Are there any questions? Yes, sir? When can we expect the ChatGPT integration with Collabora Online? So the question, I'm sorry, I have to repeat it, is: when can we expect ChatGPT integration with Collabora Online? It's a really good question. Ultimately, you can select some text, we can send that to you, and you can send it back quite easily. But I think AI brings a whole lot of interesting challenges. I don't know if you've looked at Office 365 and the AI slide improver, which I obviously should have used. It makes your slides look pretty. But the question is, what are pretty slides? The real problem in AI is, of course, the training data. And one of our problems is that we like this digitally sovereign world where we don't spy on people all the time to work out what they're doing to their documents, right? Microsoft doesn't have this problem. They have Office 365, and they're constantly watching. So they know how to make pretty slides just by watching millions of people go, color's a bit..., and also by offering you options of different ways to break or improve your slide and seeing what you choose. So how do we build the data sets to let us do this in an open-source way? I think AI is fantastically interesting. And Bradley, no doubt, will come up with the Affero AI license; I'm sure this is happening. Because the source is banal beyond belief in most AI things. It's the data, the training data, and the model you build. So it would suck to be an open-source company with 100% open-source code, where it's just this massive binary blob, that not even we understand, that you have to buy to make it useful. So I don't know. We're working on the problem.
And there are a lot of smart minds thinking about doing AI while keeping it sovereign and on-premise. But I don't have a perfect answer. It's a fantastic question, and if you want to do ChatGPT, we should talk. Come and see me. I did not mention that we're hiring people. I'm probably not supposed to. But we love C++ hackers. If you come and see me, you know, we're growing fast and doing some cool things. Other questions? Anyone at all? No? More? Yeah? Well, that's very good of you. Then come and see me afterwards if you want to chat. Please do. Thank you so much. Thank you very much. |
Migrating from proprietary to Open-Source knowledge management tools |
about migrating from proprietary to open source knowledge management tools. I'll talk a bit in general about the migration process and then demonstrate two migrations, one from Confluence and one from SharePoint, with different technologies, just to see what's possible on our side and in general. So first, why migrate from proprietary to open source? I'm sure everyone has a lot of good reasons for that in mind, but some that we have identified are, first, privacy and security concerns. With open source software, you have more flexibility in where you host your data, how you host it, and the environment and access that you set up. Two other concerns that are related, and for which we have seen some important examples in recent years, are vendor and data lock-in. When you're using proprietary software, you're a bit more vulnerable to the vendor than when you're using open source software, in the sense that if they make any significant strategy or pricing changes, you may find yourself in a situation where you need to migrate quickly or quickly adjust to that change. A recent example is Confluence. Maybe some of you have stumbled across it. They had a Cloud version, a Server version, and a Data Center version, the last of which is usually for larger teams. The Server version was used by smaller teams that wanted to host the data on their premises, and in late 2020 Confluence decided to stop the Server version. Because of that, a lot of small companies had to either move to Cloud, so host their data in the States, which wasn't possible for a lot of people, or move to Data Center, which was much more expensive and difficult to handle. That was an example of how using proprietary software can make you quite vulnerable to the vendor's changes. One other thing that is important is data lock-in. Using open source software usually allows you to migrate more easily and to integrate with other tools, thanks to open standards, protocols and formats.
So if you're using an open source tool at some point and then you want to migrate to another, it may be easier than migrating from proprietary tools to other proprietary tools or to open source ones. Then, of course, accessibility: having access to the code and all the features may allow us to further extend it, contribute to it and implement other features. And finally, from a values or ethical point of view, when we are using open source we facilitate integration and technological growth for everyone, rather than focusing on products and standards that just benefit one company. Okay, so with all these good reasons in mind, I hope, I'm going to talk about how to approach migration. In general, as a plan, when we migrate from one knowledge management tool to another, and this can be extended to other software as well, we first need to think about data format and dependencies. What type of data do we have? Do we have flat HTML pages? Do we have structured pages? Do we have any metadata or tags? All that needs to be considered. Then we also need to look at the extensions that we're using. What type of authentication do we have? Are we integrating with anything else that needs to be kept? Once we have that listed, based on that, we can identify and set up the new software. For example, let's say that we have identified that on the Confluence side we can export to XML, XWiki supports that, and we want to do this migration. We need to set up the new software in the environment that works for us. Then we need to make the migration plan and clean up. This is of course quite specific to the software that you're migrating to and from. But in general, it's also interesting to know that it can be a good moment to clean up your data.
For example, if you have a system that you have been using for 10 years, you may have a lot of obsolete data that can be cleaned up at this moment. Aside from the migration plan itself, you can also eliminate some pages or even reorganize the content if possible. Then, of course, we need to execute the migration. Again, it depends on each organization or company whether they execute it with their own team or with outside services. But this is an important step that also needs to be planned, and realistically planned in time as well, so to say. In the migration itself, we also need to map the data and its structure. As I said previously, if you have some type of structured data, you will want to map that. You will also want to recreate or install the integrations or dependencies. Then, once that is done, we of course need to do thorough testing before releasing the new system. Finally comes delivering to production, and a step that is quite commonly skipped, which is user training. In any sort of organization, you may have been in the situation where you had to change software and people were resistant to using the new software, or became a bit less efficient, or didn't really know how to handle certain aspects. A bit of user training can be very helpful if you're changing your knowledge management tools. Now we know the general plan and we can see two examples in action. For the Confluence to XWiki migration, I'm going to demonstrate the XML export way. For SharePoint, we're going to see an example of CSV export. For the migration from Confluence to XWiki, we already had a lot of tools available. However, in recent years we dedicated more effort to making them as easy to use as possible. Before, we had the Filter Streams converter, which also supported the Confluence format. Nowadays we have an integrated migrator that has a couple of steps. We'll see them right away.
For the macros: in Confluence we have macros, like we have in other wiki systems or knowledge management tools. One concern while migrating was the macros that had to be supported or migrated properly. Because of that, we also developed a package of bridge macros, as we call them, macros that are identical to what exists in Confluence and that support migration. Of course, we don't have all the macros that exist in Confluence, because both XWiki and Confluence are products that have been around for a really long time and they have their own specificities. That is a concern to keep in mind not only when migrating from Confluence to XWiki, but from any software. Again, this fits into the part about the dependencies or applications that you're using. Let's see how a migration works. Again, I'm going to use the integrated migrator, which reuses the original Filter Streams. What we did for this migrator, to make it a bit nicer to use, is create this concept of a profile, in which you can import each space separately, you get a bit of an analysis of which pages were imported and whether any pages caused issues, and you're able to follow the process a bit better. You can name your profile however you want. If you want, you can also connect to the Confluence instance. This is not mandatory, but if you connect to the Confluence instance, it also gives you a pre-analysis of the pages that were already imported into XWiki, so that can be useful. If you have Confluence Cloud, your username and token are the username and the API token that you get. If you're running Confluence Server, the username is your administrator username and the token is your password. We have here, of course, a Confluence instance for demo purposes. It's not what we use internally. Then we need to put in the URL as well, in order to map the URL schema. Let's take that as well. We don't have a custom URL path.
This is important, again, for the URL schema and for its adaptation. If you have wiki in the path, that is a standard URL. If you don't, if you have things like display or something else, that is a custom URL. In this case it doesn't apply. Then we need to give the key of the space that we're migrating. In this case, I made this dummy-text demo space called FOS. This is the space that I want to import. Now I have my profile. Let's see if I'm connected. Yes, I can start the migration. The way the migration works is that you have a couple of steps. The first one is just the prerequisites. It tells you what the requirements for the migration are. They usually apply to larger imports. In our case, we're just going to import a 7-megabyte zip. It's not that large, so we don't need to adapt everything. Of course, in general, when you're running a migration you need to have enough resources on the machine: enough memory, disk space and all that. Specifically on the XWiki side, you can also disable notifications in the xwiki.properties file, and you can disable some listeners if you know that you will be importing very large packages. The second step is the analysis I told you about previously: if you are connected to the Confluence instance, you can see whether the package that you're trying to import from Confluence contains pages that already exist on XWiki. We have some logs here. We can see that it looked at all the pages, and the pages that we're trying to import right now don't exist on XWiki. The third step just tells you to go to Confluence and export. It depends on which Server or Cloud version of Confluence you have. In this case, it's a Cloud version with XML export, full or custom. You can choose between those two. I already have it downloaded, so I'm not going to download it again. At this step, you just have to import your export file.
Let me show you the example to import. If you have XWiki running on the same server that the files are on, you can also specify the source on the server directly. All of this configuration is the Filter Streams configuration, which you can adapt. It has some fields that are prefilled, but there's also a lot of power in the other things that you can configure. For example, you can also import users, and you can do user ID mapping. For example, if you have an LDAP that has generated some random number IDs on the Confluence side, and you want to map those to the new users that you have created on XWiki, that's something you can do. You can also choose whether you want to keep the author and history metadata, and so on. The configuration is quite granular. Once the configuration is done, you import. This is the point where our documents are getting created. Because I configured it, we also have the history. For example, here you see this was created and then updated, because on the Confluence side I had multiple changes on those pages. Now we see that we have the pages imported with no errors. No errors is of course a great thing, but you can also have errors. In our experience, the most common errors are caused by unsupported characters or corrupted pages on the Confluence side. If you are trying this out and you get some errors, the logs should tell you which page is causing the issue. You can then fix it on the Confluence side and re-import, or fix it manually in XWiki, whatever suits you best. Now we have the pages imported. This is a post-import fixes check that we can also perform, in case pages were imported that don't have a parent, or pages that have corrupted parents. Both in Confluence and in XWiki we have a hierarchy system. In XWiki, we have nested pages. In Confluence, you may have situations where the parent pages are corrupted.
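The post-import check described above can be pictured with a small sketch. It is not the XWiki migrator's implementation; it assumes the imported pages are available as a plain mapping from page title to parent title (`None` for a top-level page), and the function names are hypothetical.

```python
from typing import Dict, List, Optional


def find_parent_problems(pages: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Report pages whose parent is missing or whose parent chain loops."""
    problems: Dict[str, List[str]] = {"missing_parent": [], "cycle": []}
    for title, parent in pages.items():
        if parent is not None and parent not in pages:
            problems["missing_parent"].append(title)
            continue
        seen = {title}
        current = parent
        # Walk up towards the root, stopping if a page shows up twice.
        while current is not None and current in pages:
            if current in seen:
                problems["cycle"].append(title)
                break
            seen.add(current)
            current = pages[current]
    return problems


def nested_path(pages: Dict[str, Optional[str]], title: str) -> List[str]:
    """Root-to-page path; only call this for pages with no reported problem."""
    path = [title]
    while pages[path[-1]] is not None:
        path.append(pages[path[-1]])
    return list(reversed(path))


demo_pages = {"Home": None, "Guides": "Home", "Install": "Guides", "Lost": "Deleted"}
```

A page like `Lost`, whose parent was never imported, is the kind of entry such a report would surface before the hierarchy is rebuilt.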
If you had that, you would see it in these reports. It's not the case here. Finally, we need to recreate the hierarchy that we had in Confluence. You can see now that the pages that I have imported are flat. We have just a one-level hierarchy here. Now I'm going to execute the nested pages migration tool that we also have at XWiki. The pages will be moved under their parents according to the hierarchy that they had in Confluence. As you see, it's converting all the pages and they will be moved to the right place. Okay, cool. Now we have a migration done. You can look at the pages to see all of your content. You also have, again, a lot of the macros, which are also installed and can be reused. The pro macros that I told you about, the bridge macros, are open source. They are public here, if you want to check them out or repackage them. On our side, they are packaged under a license, to be able to further support the development of the product. If you want to check them out or contribute to them, you can see them on our Git. So we have here a Confluence migration done very quickly and without much hassle. We saw the Confluence migration; now let's see the SharePoint one. The way in which we migrate from Confluence is based on the XML export. From SharePoint, it's very different. In SharePoint, you have the option to export to CSV. If you're using SharePoint as a knowledge management tool and you have your documents with a bit of metadata, like we have in this case, where department could be considered metadata, or a structured data field that you can check or uncheck and change, then the pages have a form structure. If you have this type of data, one thing that you can do is export to CSV and then create the same data structure on XWiki. On XWiki, we have an application called App Within Minutes that allows you to create structured data systems.
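The CSV route described above amounts to mapping exported columns onto fields of the target structure and dropping the columns you don't want. A minimal sketch follows, assuming a made-up export with `Title`, `Department` and `Modified By` columns; the column names, field names and `map_rows` helper are illustrative and are not the Batch Import application's own format.

```python
import csv
import io
from typing import Dict, List


def map_rows(csv_text: str, column_to_field: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn CSV rows into records keyed by target field names.

    Columns that are absent from column_to_field are simply left out.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {field: row[column] for column, field in column_to_field.items()}
        for row in reader
    ]


SAMPLE_EXPORT = (
    "Title,Department,Modified By\n"
    "Onboarding guide,HR,someone\n"
    "Release checklist,Engineering,someone else\n"
)

# Keep two columns under new field names; "Modified By" is excluded.
MAPPING = {"Title": "title", "Department": "department"}

records = map_rows(SAMPLE_EXPORT, MAPPING)
```

Previewing `records` before importing plays the same role as the mapping preview in the demo: it shows what each page's structured fields will contain.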
Here I already have an example made, but we can look at the structure. Basically, I just created the same structure that I had in the SharePoint example: title, department, reviewed, and finally the content of my documents. Then, once I have that structure done, I can use the Batch Import application. Sorry, not here. Okay. With the Batch Import application, I import the CSV that I have just got from SharePoint. I'm able to map the columns from the CSV to the fields in XWiki that I have just created. Here is the mapping that I did before. You can choose whatever you want, and even exclude some columns if need be. Then we preview the mapping, and this is what the pages would look like on the XWiki side. You can see that all the content is getting migrated. Let's just see a page. Here you can say what you want to happen if you have duplicates. Then we do the import, and the final result is something like this. All the pages were imported, and if you go to a page, you can see that you have this structured form type, and you can further edit it. Okay, that's all for the two examples. Sorry, I had to go through them very quickly. There are a lot of things that you can do to migrate, and of course we're very happy to facilitate any migration from any other proprietary tool, to get more users to open source. Thank you. If you have any questions? No questions? That clear? Yes. Yes, please. How would you deal with migration from basically just a directory with all its office documents? So, how do we deal with migration from a directory of office documents? There are two things that we can do. When you import office documents into XWiki, we do have an integration with LibreOffice that allows you to convert office documents into XWiki pages, but that's page by page.
Or, if you have any sort of directory of office files, what you can do is manually create this type of CSV, where you put the content in a row, and in this way you can also add some metadata, for example if you want to organize them by department or responsible person, and so on. You can do that and then still use the Batch Import. At the moment, we don't have an existing tool for just feeding in some files. We have something in progress, also with Batch Import, but for now the options are either to convert them one by one or to use the Batch Import, for which you would still need to organize them in a sort of list. Yeah, that answers it. Yes, please. Thank you for the question. Just to repeat it: the question is whether we facilitate in any way the addition of metadata, or the cleanup, I would assume. On the metadata, as I just mentioned for the office part, if you have office documents you can adapt that CSV file before migrating. So, for example, if you have office files, or an export from SharePoint where not all documents have metadata, you can add it manually in the CSV. On the Confluence side, not that much. The labels and everything are imported, but to be straight here, it depends more on what you have in Confluence, because basically, with the migration from Confluence, we just take everything that you have and put it into XWiki. We don't really facilitate any cleanup, but we allow you to migrate labels, and macros that do reports, and all that. But for Confluence specifically, it's a bit difficult to add metadata. Do you also migrate pages and lists? Sorry? For SharePoint, do you also migrate pages and lists, so it will be not only documents? From SharePoint, at the moment, we migrate documents, so Word documents.
There are other tools that we're working on with Office integrations and Microsoft integration, but yeah, at the moment, we only import documents. Thank you. Maybe you mentioned it already: what about the users' permissions, the right to view part of the documents? Thank you. That's a really good question. The question is about user rights or permissions. That's part of the dependencies or integrations that we need to mind. At XWiki, if you migrate from Confluence, for example, and you have native Confluence users, yes, we have the option to import them. You just need to configure that in the Filter Streams, and you can import the users, but not the permissions. The issue with the permissions is that the systems are very different. In Confluence, you have a very different system of access permissions compared to XWiki. You can do that in a custom way, for example with a script that maps the rights and tries to set them up; we can imagine that, but at the moment it's not possible out of the box. It's very difficult to do it generically. The alternative, or the best-case scenario, is if you have something like an LDAP or even an SSO system connected to your current tool; when you migrate, you connect that same user directory to the new tool, such as XWiki, and the users are simply created at first login. That's, of course, the best-case scenario. It's also actually possible to migrate users with the Batch Import, so you can do a bit of a hack there and import users as well, but for permissions, it's generally very complicated, and it's a case-by-case situation. You can import users and you can import groups from LDAP. We're also working on importing groups from Azure SSO, but permissions are not yet handled generically enough in our extensions. Yes? So thank you. Also a great question: whether the history of edits is kept. For the Confluence migration, yes, and for the XML migrations in general, yes.
We do have that, and you can also see in our example here, I'm not sure if this one has enough history, but yeah, okay, so just a quick example: the history is retained, again, if you configure the filter to do so, and if you have this history retained, you can also see the changes between the versions, so that's something very nice. For SharePoint, we don't have that at the moment because we're not taking all the metadata from the documents. With other tools that support this type of Filter Streams migration, you may also get the history. Okay, thank you very much. Thanks. |
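The answer about a plain directory of office files (build a CSV by hand, add metadata such as department, then run the Batch Import) can be sketched as a small script. This is only an illustration under stated assumptions: the columns title, department, reviewed, and content mirror the structure shown in the demo, while the `source_file` column, the rule that the top-level folder name is the department, and the function names are hypothetical. The content column is left empty because extracting text from office files needs a converter, which is out of scope here.

```python
import csv
from pathlib import Path

# Columns from the demo structure, plus a hypothetical source_file column
# so each row can be traced back to the file it came from.
COLUMNS = ["title", "department", "reviewed", "content", "source_file"]
OFFICE_SUFFIXES = {".doc", ".docx", ".odt"}


def build_rows(root):
    """Return one row per office file under root.

    Assumption: the first folder below root names the department.
    Files directly in root get an empty department.
    """
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in OFFICE_SUFFIXES:
            continue
        relative = path.relative_to(root)
        department = relative.parts[0] if len(relative.parts) > 1 else ""
        rows.append({
            "title": path.stem,
            "department": department,
            "reviewed": "",   # metadata to be filled in by hand
            "content": "",    # to be filled in by a separate conversion step
            "source_file": relative.as_posix(),
        })
    return rows


def write_csv(rows, destination):
    """Write the rows as a CSV file with a header line."""
    with open(destination, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_csv(build_rows("/path/to/share"), "import.csv")` would give a list that can then be cleaned up by hand before mapping its columns in the import tool.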
Deploy an enterprise search server with Fess
Search GitLab, Redmine, and repositories with a single query |
All right, so thanks for coming to this talk. I know there are a lot of interesting talks, so thanks for choosing this one. I'm going to talk about search, more specifically about how to deploy a search server within an organization. In today's talk, I'm going to cover these topics. It all started with engineers using multiple content management tools. I'm going to talk about how searching became a problem and how we used an OSS tool to address that problem; the name of the tool is Fess. Then I'll cover how we overcame some of the problems with Fess, touch on the contribution side of the project, and share some insights and observations. OK, so that's the list of chapters, so let's go. Just quickly, me and the team. I'm an engineer at Toshiba. I've recently been maintaining the company's cloud infrastructure and trying to increase the scope and range of automation through these activities, because without increasing the automation, we are doomed. But more importantly, here's the list of very capable, hard-working engineers that made this project possible. I have huge respect for them. So now on to the background. In large corporations like ours, we typically have lots and lots of companies. Our team's job as a software engineering center is to deploy and provide tools to other Toshiba companies, like product development departments and R&D departments. We have about 200 units of deployment, and the tools are heavily OSS-based. Now let me quickly turn on the laser pointer here. We have a few more, but these four are the core of our tools. As we diversified the tools, searching became increasingly a problem, and there were two major reasons. For one thing, there is no easy way to search laterally. That is, sometimes we want to search all of these places exhaustively to make sure that we're not missing anything.
But there's no easy way to do that. One more problem is that, as we found out, these tools are not quite cut out for searching inside certain binary files, like PDF files and Office document files, which are something we use quite often. So what all of this leads to is that, ideally, what we want is a single search box that, given a query, searches all the places, no matter where the documents are and what their formats are. That's what we are going for. However, this is a daunting task, because such a tool would not only have to solve the two problems that I talked about, but it would also have to come with all the essential features, both on the user side and on the admin side. That is, we have to be able to easily set up crawlers, run them, and maintain them. But there is a tool specifically designed for this task, and the name of the tool is Fess. So next, I'm going to quickly talk about what this tool is and what it's like to use it, as I don't think it's a particularly well-known tool. Fess is, as the README on the GitHub repo says, a powerful but easily deployable enterprise search server. Enterprise search here describes software for searching information within an enterprise, as opposed to web search, like Google and DuckDuckGo. Now, Fess uses Elasticsearch as its search engine, meaning that indexing certain binary files, like Office files and PDF files, is more or less automatic. One notable feature of this tool is that it comes with several types of crawlers. There's one for web pages, there's one for file systems, that is, directory hierarchies, and there's one for databases as well. All of this is to get data from many different kinds of sources. If you look at the screenshots, there is a search box, and there is also an admin console. And the search engine results page looks familiar to many, I think.
This tool is developed by a company named CodeLibs, which develops and open-sources tools, and they have a lot of experience engaging with the OSS community. Let's take a quick look at how this tool works by looking at one of its core features, the web crawler. It's basically the backbone of this whole system, and I think the concept is familiar to everyone. It crawls and indexes web pages, web page contents, and uploaded files. The way you create web crawlers in Fess is you go to the admin console and set these parameters for the web crawler. There are quite a few parameters here, but I'm going to focus on a few important ones. First, there is, of course, the URLs field. Then you can include and exclude certain URLs. Fess respects robots.txt, but some robots.txt files don't disallow certain not-so-relevant pages, and in that kind of case this comes in handy. There are also fields like depth and max access count, which you probably want to set to a very large value so that the crawler does not stop early. And then we come to the permissions parameter, and I think this one needs a little more explanation. This parameter is where you can implement per-user access control. That is, I hope the font is large enough, when you list users like that, and let's say the crawlers have indexed everything and search is ready, then when users search for something, only the users listed there see the results. But notice that this setting is on a per-web-crawler basis, meaning that if you have 100 projects on GitLab, you're going to need 100 web crawlers, which is a lot. So clearly, some kind of automation is necessary, and I'll get back to this point later. One more thing to mention here is that a user name here can be a user on Fess, since Fess has its own users and groups, but it can also be a user authenticated by an LDAP directory service.
There's an option to configure this in Fess. So I hope that gave you some feel for how things work in Fess. Let's move on to the customization part of the talk. No tool is perfect, and Fess is no exception, so we had to customize or patch Fess in a few ways. Just quickly, here is a list of patches. Our dev team engineers wrote more than a few patches over time. The general quality improvement patches and bug fix patches have been merged upstream. But there are also more experimental patches that are very specific to our problem, and those are kept proprietary. I'm going to talk about two of those patches. The first one is authentication for web crawlers. Most of the pages of GitLab and Redmine are behind login pages, so the web crawler has to authenticate itself as it makes its way. Fess has a mechanism for this. That is, you can create a web authentication object and attach it to each web crawler. This works out of the box in some cases, if the login form page is fairly standard. But our GitLab uses SAML, and as it turns out, Fess does not support this, so we had to do some patching. To go over how the patch works at the conceptual level: we defined extra optional parameters that the admin can write on the console. That is, if there are parameters starting with SAML underscore, the patched parser of this form picks them up and stores them. Later, the web crawler sees these parameters, recognizes that a SAML login needs to be attempted, and runs extra SAML-specific logic. So that's patch one. The second one is about repository contents. Many of the repositories we have on GitLab are several gigabytes in size. Both GitLab and Redmine have pages to view repository files. So in theory, if you wait long enough, the web crawler is going to index all these contents through those pages, in theory.
But this turned out to be a complete non-starter because it's too slow, and quite understandably so. The reason is that the web crawler makes an HTTP request, GitLab fetches just one file from the repo, and then renders it into the web page. So there are just too many steps to get the content of one file. What we did is first clone the repository contents to a local file system, and then run the file crawler, which is a crawler for directory hierarchies, and do everything locally. This more or less solved the problem of speed. But one problem is that since everything is done locally, what gets stored in the search indices are file system paths. So we had to remap these to URLs, so that later, when the user clicks a link, it takes the user to the repository file page on GitLab. What we did is, again, we defined custom optional parameters that the admin can write in the console's config parameters field, specifically these prefix URL and map URL parameters. When these parameters are present, the parser picks them up, and later the crawler sees them and performs the remapping. So this is the conceptual overview of how this patch works. Now, most of these parameters are related to the WebDriver client in Fess. There is information in the appendix about this WebDriver client and an issue with Fess 14, since the WebDriver client is discontinued in Fess 14, and Fess 14 is the latest version. Fess 14 has a Playwright-based crawler, which is still in development; that information is in the appendix too. Now I'm going to talk about another important subject, which is automation. As you might have guessed, our configuration grew more and more complicated, as it always does. For instance, we have quite a few configurations for each Fess instance, and so there are lots of manual edits of config files.
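The remapping step described here, from a path in the local clone back to the file's page on GitLab, can be sketched as follows. The talk's actual patch is proprietary and its parameters are not shown, so this is not that implementation: the function name and arguments are hypothetical, and the only outside fact relied on is GitLab's `/-/blob/<ref>/<path>` URL layout for repository files.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def remap_to_blob_url(local_path, clone_root, project_url, ref="main"):
    """Map a file inside a local clone to its GitLab file page URL.

    local_path  -- path of the indexed file on the crawler's file system
    clone_root  -- directory the repository was cloned into
    project_url -- web URL of the GitLab project
    ref         -- branch or tag that was cloned (assumed to be known)

    Raises ValueError if local_path is not inside clone_root.
    """
    relative = PurePosixPath(local_path).relative_to(PurePosixPath(clone_root))
    # Percent-encode spaces and similar characters, but keep the slashes.
    return "{}/-/blob/{}/{}".format(
        project_url.rstrip("/"), quote(ref), quote(relative.as_posix())
    )
```

With a mapping like this applied at index time, the search result link points at GitLab rather than at a path that only exists on the crawler host.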
But these are taken care of by Ansible and a Dockerfile, and I think that's standard. Perhaps a more interesting instance is that we have to create several hundred web crawlers per Fess instance. The reason is that, typically on GitLab, you have projects, and for each project you have members. What you want to make sure is that when a user searches for something, the user sees only the resources they have access to. To automate the creation of web crawlers in such a case, Fess has APIs, just like GitLab has APIs. To explain how this is handled, looking at the sample script, you can combine the GitLab APIs and the Fess APIs. First, you use the GitLab APIs to get all the projects. Then, for each project, you get the list of members. Then, using that list of members, you create a web crawler, and this is where the Fess API comes in. You can also create a web authentication object and attach it to that web crawler. The Fess APIs are mostly like the GitLab APIs, and for those who have used them, I think they'll be fairly intuitive. So that's the quick intro to the Fess APIs. Now I'm going to share some insights and observations. The first is: did Fess solve our problems? And the answer would be definitely yes. The users can now search across tools and inside binary files. This turned out to be quite powerful: for instance, even if the file is a DOCX file, or even a legacy DOC format, and even if it's in a very obscure location in a very old repository, deeply nested, users can actually find text inside the file. But it's not without problems. One problem is performance. If you want to index the contents of a GitLab instance which has, say, several hundred projects and several thousand issues, using a Fess instance running on this level of computing resources, it takes us about a couple of days to index everything.
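The per-project automation described here (list projects, list members, create one web crawler per project with those members as permissions) can be sketched for the middle step, which is turning GitLab data into a crawler definition. The input fields `web_url`, `path_with_namespace`, and `username` are fields of GitLab's REST API project and member objects. Everything on the Fess side is an assumption: the key names simply follow the admin console parameters mentioned in the talk (URLs, included URLs, depth, max access count, permissions), and the `{user}` prefix for permission entries should be checked against the Fess documentation before use. The HTTP calls are left out on purpose.

```python
import re


def build_crawler_config(project, members, depth=10000, max_access_count=1000000):
    """Build one web crawler definition for one GitLab project.

    project -- dict with 'web_url' and 'path_with_namespace'
    members -- list of dicts with 'username'
    The returned keys are assumed names, not a confirmed Fess API schema.
    """
    base = project["web_url"].rstrip("/")
    return {
        "name": "gitlab-" + project["path_with_namespace"],
        "urls": base + "/",
        # Keep the crawler inside this project only.
        "included_urls": re.escape(base) + "/.*",
        # Large values so the crawl does not stop early.
        "depth": depth,
        "max_access_count": max_access_count,
        # One entry per member; only these users should see the results.
        "permissions": "\n".join("{user}" + m["username"] for m in members),
    }
```

A driver script would fetch projects and members from GitLab, call this function for each project, and send the result to the search server's admin API, followed by the web authentication object for that crawler.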
And this is something we are trying to improve on, like how to incrementally index contents, and using some other techniques. So for now, that's it for this talk. Thank you so much for listening. And I want to open this up to the Q&A session. So you have different types of resources, right? Every resource has different properties, I guess. So do you index the same properties for every source? Do you have a set of properties to index, or are you only indexing content? So I'm trying to understand the question. When you're indexing documents, are you indexing the content of the document, the text of the document? Or are you also indexing some of the properties, like the name, the title, the author? I would say yes. So let me repeat the question. The question is whether Fess indexes only the contents, like the text inside the file, or whether it also indexes metadata, like the title field; PowerPoint files, for example, have several metadata fields. Yes, it does. It indexes that metadata as well. That is handled, most likely, by Elasticsearch. But yes, the metadata is indexed. And actually, it shows the title metadata on the search results page if it is present. So yeah, I guess that's fine. The second one is a quick one: are you accepting contributions to the project? It's an open project, and are you accepting contributions and new plugins? I'm sorry. So your question is whether we are accepting contributions? On our side of it, I don't quite think so. So you have a catalogue of connectors to all of them, to Elasticsearch, and so on. So for instance, I was thinking of CMIS, which is a standard for content management. I was thinking of trying to contribute this new connector to the Fess project. Is that an option? I'm sorry. I'm trying to understand the question. You said something about connectors, right? I'm calling it a connector, but I don't know if connector is the right word for you.
I mean, you have a specific crawler for every different system, right? You have a crawler for Elasticsearch, a crawler for Office, and so on, right? So can we add new crawlers for different systems? Yes, yes. I mean, Fess has several different crawlers for different types of resources. So the question is whether we are going to accept new types of crawlers as contributions? Is that the question? Yes. Well, we are a corporation, working on our side as a project inside a corporation. So we by ourselves are not, at the moment, accepting contributions to the project. So yeah. Perfect, thanks. Another question? Yes. Are you going to publish the slides that you presented? OK. The slides that you presented, are you going to publish them somewhere? They are on the talk's page. Yes. Yeah, they are on the FOSDEM website, so you should be able to download them. I'm sure I'll be able to. This is a short one. You said indexing takes about several days for this project you have. Yes. It's about re-indexing. If content changes, do you re-index everything each time, building the whole index anew, or is it faster, just re-indexing certain documents? So the question is, let's say after indexing everything, in the subsequent runs of web crawlers and all the other kinds of crawlers, are the indices updated incrementally and faster, or do we have to re-index everything? Fess tries to crawl efficiently. It tries to ignore the contents of web pages that haven't been updated. It checks the Last-Modified field. But the mechanism of incremental crawling is not ideal. For instance, the Last-Modified field is not well enforced; only certain types of static web pages use it. So there is a lot of unnecessary re-indexing happening. There are some mechanisms to index only things that have been updated, but I've got to say that those mechanisms do not work very well.
So there is a lot of re-indexing, and the subsequent crawling is not as efficient as we want it to be. Yeah. Yes. OK. Thank you very much. Thank you very much. |
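The incremental crawling answer (skip pages whose Last-Modified value shows no change, and fall back to re-indexing when the field is missing) can be illustrated with a small decision function. This is a generic sketch of the idea, not Fess code; the function name and the rule that a missing or unparsable header forces a re-index are assumptions chosen to show why dynamic pages without that header get re-indexed on every run.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_reindex(last_modified_header, indexed_at):
    """Decide whether a page must be fetched and indexed again.

    last_modified_header -- value of the HTTP Last-Modified header, or None
    indexed_at           -- timezone-aware datetime of the previous indexing
    """
    if not last_modified_header:
        # No header: nothing to compare, so re-index to be safe.
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # Unparsable header: treat it like a missing one.
        return True
    if modified.tzinfo is None:
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > indexed_at
```

Pages rendered on the fly by tools such as issue trackers often send no Last-Modified header at all, so under this rule they always take the slow path, which matches the behaviour described in the answer.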
Optimizing your core application for integration
Learnings from integrating OpenProject with Nextcloud |
Okay, so we are going to start the next talk. The next talk is about optimizing your core application for integration: learnings from integrating OpenProject with Nextcloud. Thank you. Hello, everyone. My name is Wieland Lindenthal. I'm one of the lucky co-founders of OpenProject, and today I'm here because I'm also the managing engineer of the OpenProject Nextcloud integration. Just to get the most important messages out first: I assume there are core developers of open source software, collaboration software, and also product managers here in the room. For you, please take home this message: prioritize integrations on your roadmap. Right now is a great moment where the public sector, and not only the public sector, but many people ask for integrations, that we integrate our open source software together to provide an alternative to Office 365, to Google Cloud, to Google applications, and so on. Right now, people ask us to join forces, so that together we are stronger and each single application becomes stronger, so that we provide an alternative to big tech. Okay, so integrations are really important. And if we integrate, every single application doesn't need to do what the other application does, so we can focus on our own stuff and get better in our own silo, on our own puzzle piece of the big picture. Now the bad news: building integrations is unnecessarily hard. The good news is, you always encounter the same problems again and again. So it's kind of predictable what is going to fail when you do an integration. So my call-out here is, please drink your own champagne and participate as core developers in the integration projects that integrate into your system, because then you see the issues that people face when integrating. Because integrations are amazing, right?
Just imagine what the web would be without links, or a browser without extensions, or a mobile phone without apps. It would be boring, right? And it's just unpredictable what integrations come out if you just let people do it. People have great ideas. One example of such a great idea came from the public sector: the City of Cologne and the University of Duisburg-Essen approached us with the idea of, hey, we have OpenProject, we like it really, really much for organizing work, organizing tasks, organizing our teams. And we also use Nextcloud for organizing our files and sharing them with our people. And we always run into the same problem again and again: where are the files for our tasks? When you do some work, a real job, you usually need some files as input to work on the task. Then at the end, you need to deliver some files and put them somewhere, and you need to tell someone where the files are. So why not track that on the same entity, which is the ticket, we call it a work package, and solve that problem and have some real integrity between the two systems, with the work package linking by real ID to the files in Nextcloud. So for the last year, roughly eight people were working on this deep integration, a lot of people, because it's pretty hard. But anyway, what we got, as you can see here, is OpenProject with a work package. You can see there's a task, like second fermentation of the dough, because you want to have a nice pizza. On the right-hand side, you have the Files tab, where you see all the files that are related. And now it's not only attachments, which were already there in OpenProject before. You can also add multiple Nextcloud instances to your OpenProject and say, OK, these files should be stored here with that contributor, and these files should be stored there with this client, with this subcontractor.
So you can have multiple NextCloud instances connected to OpenProject. And you will always see which files are relevant for finishing a task and where you should put the files when you're ready and done. And you hand over the task to the next one, the next design, let's say, and you follow a workflow process with an open project. And on the other side, or NextCloud, when you're on a file or you're on a folder, you can see where these files or folders are relevant and which tasks, which business processes they come up. So you can see here on the right hand side, we have this OpenProject tab. They can see, OK, this file is relevant in two places and two work packages. You can quickly see what status they are in so that the problem of where the files for my task are finally solved. Now I want to get a little bit deeper into the architecture of that thing because we want to talk about problems that come with integrations. So it's a very, very high-level architecture picture. As you can see, there's an OpenProject server, there's an NextCloud server, there's an OpenProject front-end that runs in the browser and the same browser that runs the NextCloud front-end, same different browser windows. And for this integration, we extended the OpenProject core so it's not a plugin, it's right in the OpenProject core, which extends the OpenProject front-end. And on the NextCloud side, we put everything into a NextCloud app, which is basically a bundle of code that you execute within NextCloud, or you could call it a plugin or extension. So that plugin extends the NextCloud server and also the NextCloud front-end. 
What we chose for the communication between the two systems is that we want that every user, because users have permissions on the one system and users have permissions on the other system, we wanted to make sure that these things match up, OK, so that an OpenProject user that is looking for a file can only see files, can only select files to link them that he or she has access to. But on the same time, we wanted to make sure that all linked files are visible to all users because someone says, like, this is the relevant file, maybe you don't have access, but you should know that these files exist. So we have these references by file ID from OpenProject to NextCloud. And the other way around, of course, as well, when you're in NextCloud and you saw these files linked, the work packages linked, you also should only link those work packages that you have access to. So work packages are organized in projects, you're a member of a project or not, so depending on this, you have access or not. So we need to have some one-on-one matching between the users. So we use OAuth for this one-on-one matching. So users are always authenticated and don't need to re-authenticate every time they log in. OK. But there's one thing on this architecture you can already spot, maybe, that it smells like trouble. That is the arrow in the middle that goes from the OpenProject front directly to NextCloud because that is a request, a cross-site request, in the browser between two domains. And here's where trouble starts. We'll come to this later. So clipping. So yeah, from this work, I just want to come up with a couple of things that lessons learned and I would just point them out for you to think and digest a little bit what it means to your project. So of course, OK, if you want to have integrations, of course, provide an API, that's obvious. NextCloud and OpenProject does that pretty well, I think. We have docs and examples. But then the tiny things start, like, OK, you have an API. 
You think it's cool and safe. But in the example of this direct access that I showed on the architecture image, this is about direct file uploads and downloads. It is kind of obvious that you don't want to send the data through the OpenProject server, let's say if you have a big movie, and then on to Nextcloud; you want to upload it directly to Nextcloud or download it directly to the OpenProject front-end. It turns out the API that Nextcloud provides for this was working-ish, but in our setting it did not work, OK? That was really unfortunate, because they had everything in place, but I bet they never had the scenario in their own company of using it in a cross-site, cross-domain environment. So they didn't come across CORS problems, you know, CORS, the browser security feature that prevents you from doing things in the browser that you shouldn't do. If you just have non-browser clients, these things don't show up. If you just have a Nextcloud client on your desktop, these things don't show up, because that is not a browser context. So my lesson learned here, also for us at OpenProject, and we are no saints in this field either: make sure that if you provide an API, you also test it in a browser setting, and don't just write it and hope it works. You have these CORS preflights. You need to check that the CORS preflight itself doesn't fail, because there can be issues. OK. Another one, a tiny one, but I need to say this: we as OpenProject want to link files. These files we don't reference by path, because the path is different for every user, or can be different for every user. If you get a file shared with you and you save it in your Nextcloud under a different file name, the path is completely different and it doesn't work for the other user, say, another user in the same OpenProject project. So we need to reference the file by ID.
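The advice to test an API in a browser setting and to check that the CORS preflight does not fail can be made concrete with a small checker that decides, from a preflight response, whether a browser would go on to send the real request. This is a simplified sketch of the browser's rules: it ignores credentials mode and the methods and headers that browsers allow without an explicit listing, and the function name is hypothetical. The header names are the standard `Access-Control-Allow-*` response headers.

```python
def _split(value):
    """Split a comma-separated header value into a lowercase set."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def preflight_allows(origin, method, request_headers, status, response_headers):
    """Return True if the preflight response permits the planned request.

    origin           -- origin of the calling page, e.g. the front-end's URL
    method           -- HTTP method of the real request
    request_headers  -- names of non-standard headers the real request sends
    status           -- HTTP status of the OPTIONS response
    response_headers -- headers of the OPTIONS response
    """
    if not 200 <= status < 300:
        return False  # a preflight that needs a login or errors out fails
    headers = {name.lower(): value for name, value in response_headers.items()}
    if headers.get("access-control-allow-origin", "") not in ("*", origin):
        return False
    allow_methods = _split(headers.get("access-control-allow-methods", ""))
    if method.lower() not in allow_methods and "*" not in allow_methods:
        return False
    allow_headers = _split(headers.get("access-control-allow-headers", ""))
    for name in request_headers:
        lowered = name.lower()
        if lowered in allow_headers:
            continue
        # The wildcard does not cover Authorization; it must be listed.
        if "*" in allow_headers and lowered != "authorization":
            continue
        return False
    return True
```

A check like this, run in CI against the OPTIONS response of each endpoint that an integration calls from the browser, catches the kind of failure that desktop and other non-browser clients never trigger.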
But there's no API for getting files by ID, especially not if you want to know the original file name that the owner gave to the file, or if you want to know whether the file was already deleted, whether it was trashed and is in the trash bin. Information like this we want to ask for frequently, in order to update our knowledge about the file in OpenProject. It wasn't there. It was probably never needed on the Nextcloud side. So that's cool, they don't need it. But, yeah, it's kind of obvious: these are the main objects of Nextcloud, and we need that information. And, yeah, this problem became visible here. If you want to allow integrations into your system, well, creating a plug-in interface is a great idea. OpenProject has a plug-in interface. You can extend the OpenProject code as you like, but you need to be a server admin in order to make it run. Nextcloud is much further ahead in that sense. They have the app store. You can just click in a catalog, not as a server administrator, but as an application administrator who doesn't like to talk to server administrators, and simply download and install extensions to Nextcloud. It's in the core of Nextcloud, and that's really, really cool. It's amazing, and I'm jealous; I would love to have this in OpenProject. Yeah. What about the source, you know, you have four of them. It's all PHP, you know. Actually, no. The app store, I think, is Ruby. Is it? Yeah. Okay. Tomorrow it's done. No. It could also be Python. I just know something. Okay. Okay. Thank you. I don't know how much you care about whether it's either of these. Okay. So you have this plug-in interface built into your application, and then there are the typical things that every person implementing against this interface comes across. For example, if you want to have cross-site fetching, cross-site calls, you need to allow that in the application: extend the content security policy for cross-site requests.
That's something, for example, that we discovered OpenProject did not allow or make easy in the first place, so we had to extend that. It's a typical one. In theory everything is working, and then you run it in your browser, and in the very last moment of implementing all the features you see it's not working because of the content security policy. Okay. This is a nice one. When you're an application administrator in OpenProject and you're also an application administrator in Nextcloud and you want to set up the connection, the integration, the first thing that you do is put in the URL of the other system, right? And you should get some feedback on whether the URL is actually valid. Let's say you're in Nextcloud: you want to know that OpenProject is actually responding, and that OpenProject is responding in the necessary version. We don't have that. We don't expose that information. You want to know if the setup is complete. And the same the other way around: let's say from OpenProject you want to connect to Nextcloud. Of course, we also want to know if our extension is installed. So it would be great to have an API endpoint that gives you that information and says, nah, it's not installed, so the validation of the URL fails. That is very, very useful for administrators, to get relevant feedback on why things are not working. Because if you just put in a URL, you don't have any idea why it's not working. You need to get feedback. And this is something I think every core developer should think of providing, as a super nice candy for developers. Another one, and that's a whole topic: we're working a lot with OAuth. Setting up OAuth, I don't know if you know it, basically involves a lot of pieces, like a secret, an ID, the redirect URL, and so on and so on. And if you are a hoster or a tester, you would like to script that stuff.
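The feedback an administrator should get after entering the other system's URL (is it responding, is it the right version, is the integration app installed) can be sketched as a function that turns a status response into a specific message. The talk says such an endpoint did not exist, so everything here is hypothetical: the payload keys `version` and `integration_installed`, the function names, and the message texts are illustrative only.

```python
def parse_version(text):
    """Turn '25.0.1' into (25, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def diagnose_connection(http_status, payload, minimum_version):
    """Return 'ok' or a message telling the administrator what is wrong.

    http_status     -- status code of the status request, None if no answer
    payload         -- decoded JSON body of the response, if any
    minimum_version -- oldest version of the other system that is supported
    """
    if http_status is None:
        return "host did not respond"
    if http_status != 200 or not isinstance(payload, dict):
        return "URL does not point at the expected application"
    if not payload.get("integration_installed", False):
        return "integration app is not installed"
    if parse_version(payload.get("version", "0")) < parse_version(minimum_version):
        return "version is too old, need at least " + minimum_version
    return "ok"
```

Each branch corresponds to one of the questions raised in the talk, so the settings page can show the reason for a failed validation instead of a generic error.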
So you would like to have an API to set this stuff up. And it's great if you provide an API for setting up OAuth. Also, if you have an API, you can do stuff like having one button in some setting that says, disconnect the integration all over the place. Delete it. I want to start again. I screwed up. I don't know what is going on. So for this, you need an API so that you can do this. And if you have actually deleted an OAuth connection, then please also make sure that the tokens are deleted. That was something that we discovered with Nextcloud, that the tokens were still around for quite some time. And the good thing is it came up, we could provide tickets, and this thing could get solved. There's also another nice one. Just imagine you do the OAuth connection, so there's one user in Nextcloud who connects as a user in OpenProject, and that user happens to be an admin in OpenProject. Suddenly the access as an admin comes to Nextcloud. So there are a lot of powers. You can create projects. You can delete users. And maybe for the integration that is not needed, and vice versa: OpenProject might not need complete administrator rights on Nextcloud. So it's like widening the surface for attacks, or the opening for problems and vulnerabilities, by a lot. And I think OpenProject and Nextcloud could both do better in having scoped OAuth access control, so that you don't allow creating or deleting projects, or you don't allow deleting users, and so on. So limiting it to just the user-scope actions. We can do better, and it will come. Also, what I really enjoyed about working with Nextcloud is that they provide a design system. They have frontend components that allowed us to have a more consistent UI that looks and feels like Nextcloud, and that also increased the development speed for us. At OpenProject, we are also working on a design system and UI components.
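The scoped OAuth access control asked for above can be illustrated with a small sketch. This is only an illustration under stated assumptions: the scope names, the action names and the `issue_integration_token` helper are hypothetical and are not taken from OpenProject or Nextcloud.

```python
from dataclasses import dataclass

# Hypothetical mapping from an action to the scope it requires.
# None of these names come from OpenProject or Nextcloud.
SCOPE_FOR_ACTION = {
    "read_file_links": "files:read",
    "link_file": "files:link",
    "create_project": "admin:projects",
    "delete_user": "admin:users",
}


@dataclass(frozen=True)
class Token:
    user: str
    scopes: frozenset


def issue_integration_token(user: str) -> Token:
    """Issue a token limited to what the file integration needs.

    Even if `user` is an administrator, the token carries no admin scopes.
    """
    return Token(user=user, scopes=frozenset({"files:read", "files:link"}))


def is_allowed(token: Token, action: str) -> bool:
    """Allow an action only if the token holds the scope it requires."""
    required = SCOPE_FOR_ACTION.get(action)
    # Unknown actions are denied rather than allowed by default.
    return required is not None and required in token.scopes


token = issue_integration_token("an-admin-user")
print(is_allowed(token, "link_file"))
print(is_allowed(token, "delete_user"))
```

The point of the design is that the token, not the user's role, decides what the other system may do, so connecting as an admin does not hand admin powers across.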
So if you want to listen to a talk about this, Parimal, where are you tomorrow? Same time? Yeah. Okay. Have fun. Yeah. Then, what is also really handy for developers who build integrations for your system is if you have containerized development environments already at hand, so it's easier to check out your code against the latest supported versions. For example, Nextcloud has multiple versions you need to support, you need to check, where you want to find out if it's still working. And of course the current main branch, to see what is actually changing there: is my code still alive, or is it already dying? So it's really nice to have, for example, Docker Compose files that help you set things up and lower the barrier to getting started with developing an integration for your application. Yeah. Also, as a bonus on this, if you can provide examples of how to set up your CI quickly, so that you don't need to fiddle around with that, if you just have good standard practices for how to have CI running with that core image and with the plug-in, that would also be very, very handy for developers to have. I think we could peek into other integrations and see how they did it in Nextcloud. So it wasn't such a big trouble, but if you have a good example, that is really, really helpful. So with all these small lessons learned that we have had in the last year, I just want to send you home, let it sink in, and just remember: integrated, we're stronger. Thank you. Thank you, we have six minutes for questions. So the question is why we need to deal with multiple Nextcloud versions. I think Nextcloud has multiple versions in production, supported ones, so I think it's 23, 24, 25, maybe 22 still, I'm not sure, and they have different code, but currently we just provide one piece of code that is our Nextcloud app, and we don't want to provide three different ones.
So we want to make sure that in that app, all versions are supported. It's different for OpenProject. OpenProject always has just one running, up-to-date version, so you always need to update to the latest version unless you want to get into trouble. So this is why you need multiple versions. More questions? Okay. Thank you. Ah, Ben. I want to add on to this that I think declarative configuration is always better than scripting, so if it's possible that the application, for example, provides the OAuth configuration statically instead of via a script, I think that's always better, because it allows you to basically deploy all your applications statelessly, including the whole OAuth context. Yeah, so the comment was that if you can provide it as, let's say, a YAML file or something like this, a declaration, it's better than having an API. I'm not so sure about OAuth in that respect, because when you set up both systems at the same time you would need to have a center that decides on the OAuth client ID and client secret and makes them configurable from that center. It is different if these IDs and secrets actually get generated by the applications themselves, because those get updated, and maybe your center is not getting updated, so maybe you just create a vulnerability there, right? So we were discussing this back and forth, but we believe that, you know, Nextcloud and OpenProject update continuously, and maybe some central distribution software might not get touched in three years, and then its algorithm is outdated, so it's not secure anymore, or whatever. Yeah. So, yeah, I'm not sure, maybe you talked about it and I probably was not listening. I think you mentioned files as an example of what is actually integrated between both systems. Was the scope broader than that, or was it primarily that, and was there any discussion, because there were multiple things that could be integrated, right?
Yeah, I mean, this integration basically focuses on files and folders, right? Because it's about integrating digital assets into work packages, right? So you have these links. So that's the example I chose. It would be interesting to also have some API standards. Why not? Let's say every system has users. Why does a user need to be represented differently in every system? Why can't we have a standard on that, or on the avatar, or, you know, there's the matter of permissions. Why can't we have a permission standard somehow for all open source vendors? I don't know. Let's think about it, but I didn't see a reason why it needs to be so different. Users have been around for so long. Yeah. More questions? If you don't mind, a non-technical question. There was a lovely picture of pizza. Did you make that yourself? Yes, that's my pizza. Oh, lovely pizza. Pizza Napolitana. Milan style. Okay. Thank you very much. Thank you. |
Nextcloud Numbers and Hubs
Our traditional yearly overview of what's new in Nextcloud |
So my name is Jos, I'm a co-founder of Nextcloud and I talk about Nextcloud, aka I do marketing. Yeah, and today I'm going to talk about what's new in Nextcloud. Now this became a little bit of a mess, because we usually do about three releases a year, and I kind of made up three releases here, but there are actually only two. I just mixed up our own product naming, so that was a good one. I hope nobody is confused, but we'll simplify it going forward. So I will simply go over the products and what's new. As I said, I can't talk about all the features, because just what's new is already, as I said, 84 slides, although we're already on the third, so we'll get there. Don't worry, I'll do my best at least. So Nextcloud Hub 2, that was somewhere in March or April. We introduced user migration, which is the first thing I'm going to show. File locking, automated file locking; lower database load, because performance is always important, I think. Improvements in Talk, desktop integration, and a couple of improvements in the Groupware apps. Now I'm going to try and show a couple of these things, but there's not a lot to show for user migration, which basically does what the name suggests. You can export your account. This depends a little bit on what each app supports, so we have support in a couple of the important apps, like for example Files. And it'll of course not just export your files but also the comments on the files, the activity history, what your favorite files are, so this metadata gets exported. Of course your account itself gets exported, so you know, your profile name and picture and data, all this stuff. And the idea is that you can export your account and then go to another Nextcloud server and import it there. Nextcloud is hosted by tons of companies, and you can also run it on your own, and if you say, look, my Raspberry Pi isn't pulling it anymore, I need something a little faster, then you can move to something bigger.
So yeah, apps need to support this, so it has some limitations, but you can just choose what you want to turn on, so you can say, okay, export only my files, or export files and my email configuration, export this, don't export that. It's fairly simple, you just download your zip file. So what else is new? In Nextcloud Files we worked a ton on performance. I'll get to that in a minute. We introduced a search API for indexing; that's basically an API for an external indexing tool like Elasticsearch that can then index the content on Nextcloud and show it to you. Somebody from OpenProject there looks very happy. Excellent. And we introduced kind of a little thing: if you share something by mail, you can give it a password and then email them the password. But of course they already have something kind of unique, which is their email address. So we now have a token option where they can simply say, hey, I have this email address, and they will receive a temporary token and then they can log in. You don't need to give them a password or something like that. So I mentioned performance, and there are tons of things, basically too much to talk about, so I have a couple of details that are just semi-randomly picked out. If you view a folder, in this release we got the number of database queries down by 75%. You can configure cron jobs, so you have these background jobs that are running, and you know, if you have a very busy server and the cron job hits just at your busiest time, that might not be helpful. So you can now configure these background jobs a little bit, so that you say, hey, there's a very heavy task, please don't do this at like 11 o'clock, or at 9 o'clock when all the employees log in, for example. So there's a bit more configurability there. And it also turned out that our server was very busy generating avatars in different sizes, like one application in Nextcloud said, can I have it at 65 by 65 pixels, and the next one wanted 66 by 66.
This is actually a ton of work, so we now just offer it in a few sizes, and then, you know, you can resize it in the browser. I mean, that works fine, and that made quite a difference, you'd be surprised. What we also did is we separated out image preview generation, so if you have, you know, 100,000 photos and you want to scroll through that folder, then the Nextcloud server gets very, very busy generating about 100,000 previews, in multiple sizes of course. If you do that in advance, that's very nice, but that just means your server is busy at another time, so what you can do now is have an external service that does this. So it runs on a separate server and doesn't bog down your primary Nextcloud server. It's little things, but it makes quite a difference. Now, I mentioned file permissions, so one thing we also did is make the permissions a little more, well, advanced, so you can pick very specifically: you don't just give edit rights, you can say I want to give the ability to upload but not edit, or not delete, etc. Other improvements were things like searching for files by the tag of the file, and improved group folders. You can now use the viewer app on public links, so you share a PDF or an image with somebody and then they can use our internal viewer app on that public link, and lots of other little things there. So the next part in Nextcloud Hub 2 was Groupware. We did just a couple of features and a lot of under-the-hood work. The features are just a few: one is accept and decline directly in the calendar app, which just makes it a little quicker; you already see the calendar item, and you can just accept or decline it there. What we did in the mail app is we introduced the ability to send an email later and to undo send, so you have 30 seconds to realize that, well, you were a little too quick with that email. In Talk we again did under-the-hood work, and we also introduced reactions, which is of course always nice to zoom in on, and then we did a media tab.
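The avatar change mentioned earlier in this passage, offering only a few sizes instead of generating one image per requested pixel size, can be sketched as follows. This is a minimal illustration, not Nextcloud's implementation: the list of offered sizes and the function name are assumptions made up for the example.

```python
import bisect

# Hypothetical set of sizes the server is willing to generate (sorted).
OFFERED_SIZES = (64, 128, 256, 512)


def snap_avatar_size(requested: int) -> int:
    """Return the smallest offered size that covers the request.

    Requests larger than every offered size get the largest one;
    the browser scales the image to the exact size it wants.
    """
    if requested <= 0:
        raise ValueError("requested size must be positive")
    index = bisect.bisect_left(OFFERED_SIZES, requested)
    return OFFERED_SIZES[min(index, len(OFFERED_SIZES) - 1)]


# Two apps asking for 65 and 66 pixels now share one generated image.
print(snap_avatar_size(65), snap_avatar_size(66))
```

With bucketing like this, the number of distinct images per user is bounded by the number of offered sizes, however many slightly different sizes the apps request.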
And we integrated it in the desktop client, so when you get, you know, a Talk notification, you can click there. In the latest version of the desktop client this is even better, because if you get this operating system notification you can just click reply there and answer right there. At least it works on macOS and, I think, on Windows. I'm not sure if Linux has that feature, maybe not yet, but it's kind of cool, and well, here's the obligatory zoom-in. With this release we introduced a new user interface in Nextcloud Office, so, well, things are fairly obvious: we tried to make it a little more familiar for people who are migrating from certain other office applications. It is optional, you can configure it as a user, but I think this just makes it nicer. Honestly, I like the UI a bit more myself, I use it now, so there are fewer menus to search through. And the nice thing is, we introduced file locking a while ago already, so if you're working with somebody else and it's not a file you can edit in Nextcloud, but it's a Photoshop file for example, you need to download it and edit it. I'll get back later to how we make that easier too. But with this release we made file locking automatic, so if you're in Nextcloud Office and you edit the file, then it gets locked, so that your colleague isn't editing it locally through the desktop client at the same time and creating conflicts. Instead, the desktop client will then say, this file is being edited online, please don't edit it locally. It depends a bit on the operating system whether the desktop client can enforce this or only give warnings, because on Windows it's impossible to edit a file which is open. On Linux this is not a problem, because it's a proper operating system. But it gives you a warning then, so at least you know that you're breaking something. And then you can of course also right-click on the file in the desktop client and lock the file. It shows you who has locked the file. If it's locked by an application, it'll not say the name of whoever it was.
But it'll say the name of the application, so you can see, ah, Nextcloud Text is locking this file, so people are editing it in the browser. Now of course in the desktop client you can just right-click and say open in the browser. Then it just opens a browser window with that file, immediately in editing mode, so you can just join the editing session. So it shouldn't block you from anything. That was already Hub 2. I'm going to get to the next one, which has a few more slides than the other one and also a lot more features. Well, you can read, I'm just going to show it to you. The first thing we did actually was to introduce a new design. So, well, it's more rounded, and it also shows your wallpaper through, so people can, you know, pick a favorite wallpaper and then you can see that wallpaper through the interface. Another benefit... let me see, we're back. Another thing is we also worked a lot on accessibility, so you have an even nicer dark mode. You can also set fonts for people who have problems with reading, with dyslexia. Yeah, dark mode can also switch automatically now, and there's a whole ton of other changes. If you use a different background, to show you some more screenshots: if you like it bubbly, or maybe you like it a little darker, you know, that's for the real hackers, dark mode with a dark background. I now realize that even though I'm not a hacker, I do run this background by default, I like it. Either way, what we also did is, once again, performance. I can once again go into all the details, of course, but let me quickly first cover security, which we also work on for every release. If you use server-side encryption, we allow you to encrypt S3-based primary storage, so object storage. Encryption now works with group folders, which are used a lot, and it takes 33% less storage. It used to be that our encryption algorithm really blew up the size of the files by about a third, and now, well, that's gone. And there are a couple of occ commands to manage the encryption.
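The talk does not say where the "about a third" size increase came from. One common cause of exactly that overhead is storing binary ciphertext as base64 text, which turns every 3 bytes into 4 characters. The sketch below only demonstrates that arithmetic; treating it as the reason for Nextcloud's overhead is an assumption, and the function name is made up for the example.

```python
import base64
import os


def base64_overhead(size_in_bytes: int) -> float:
    """Return the relative size increase of base64 text over raw bytes."""
    raw = os.urandom(size_in_bytes)   # stand-in for encrypted file content
    encoded = base64.b64encode(raw)   # 4 output characters per 3 input bytes
    return len(encoded) / len(raw) - 1


# For sizes divisible by 3 there is no padding, so the growth is exactly 1/3.
print(f"{base64_overhead(3_000_000):.1%}")
```

Storing the same bytes in a binary format avoids this growth entirely, which is the kind of saving the speaker describes.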
We also made a bunch of improvements to end-to-end encryption on the client side, in the desktop client and the mobile clients. It's faster. That was very helpful, because people complained that with a lot of files it was slow, which it was. We also made key management a little nicer, so you can now reset your keys in the browser too. That also means you lose access to all your files if you forgot your password, but you can start over if you made a mistake or forgot your password. And administrators can say which groups of users or individual users get access to end-to-end encryption or not, so it can be controlled as well. I mentioned performance; there was tons of stuff, I think in core alone we did 33 separate improvements. I'm not going to show all of them, I'm just going to pick out a few, otherwise it gets a little crazy. So sorting of files went a ton faster, because we only sort the recently changed files. It's a little thing, but if you're in a folder with tons of files, why would you sort all of them if you just need the newest on top? Only look at the newer files. It's little things, but it gets a lot faster. Search became about seven times faster. And all these things add up, because by now if you do a PROPFIND, which is the kind of operation where a client checks, hey, has anything changed on the server, that is now about 30 percent faster. So, you know, the really common operations don't get a seven-times speed-up, of course, but all these seven-times gains and little things add up to, well, 20 to 30 percent. If you upgrade to this release you should really notice it, because pages load about a third faster, and I think that's where you notice a difference yourself. So anyway, there's really good stuff there. Now let's talk about the features. Nextcloud Talk. So we introduced polls. I mean, that's really nice. Also a couple of other things. For example, you now have what we call widgets.
If you share a Deck card, you can then see the information of the Deck card. This also works with GitHub tickets. It works with YouTube videos. It works with tons of other stuff, even map locations and such. And that also shows in the media tab on the side that we introduced in the previous release. So I think that's really quite nice. We introduced the ability to basically start a call but not send a notification to everybody. So if you're in a big group and you want to have a call with only three members, you can start a call without the phones of 60 people ringing. That might be a little annoying. And you can do the same with sending messages. So if you send a message on Saturday night, you shouldn't bother your entire team. Otherwise they all get a ping on their phone, while it's the bloody weekend. Why do you do this? Maybe you shouldn't be working. But if you do, then at least you can send it without generating notifications. Of course in Nextcloud you can also configure your availability and just say, outside of these hours, please put me automatically on do not disturb. That also helps. But still, you can control this. But then of course you might actually want to send somebody a call notification, because you're in the call with three people and you want number four, and then you can actually ring their phone directly this way. So it works the other way around too. We also introduced message expiration. The widgets that I already mentioned also work with YouTube. You can now also directly create files, a ton of different ones. The poll I already mentioned. But you can also create a new document, and then it just opens up, immediately shared with everybody. You can immediately start editing your office document or whatever you created. So that's nice. And in a classroom or webinar you might sometimes want to stop people from talking, and we expanded the access rights for this as well.
So you can now say, okay, you know, you can't post messages, you cannot react. You know, no talking until, well, it's time to open that up. We also worked on scalability quite a bit. We introduced clustering to the high performance backend, so for bigger calls, but honestly we don't use clustering ourselves. We're now using Talk for webinars. We had 350 people a couple of weeks ago. So that still works on one server. But there is of course a point where you want to have a call with even more people, and then you need clustering. And that's helpful. All right. Let's talk about Groupware for a minute. This always makes for a nice screenshot. So we introduced a nice org chart in our Contacts app. You just define who is the boss of whom, and then it creates a graph that you can zoom in and out of and all this stuff. Very simple but nice. It was done by a community project that was supported by the EU. So yay for our EU overlords here in Brussels. Yeah, the Mail app also got a nice overhaul, mostly UI improvements. I'll get to a couple of features in a minute. Well, that's actually right now. So less than a minute. You can now have images in your signatures. You can configure an autoresponder directly in Nextcloud. And we improved the UI of the appointment booking. So this is kind of like Calendly, I guess; that's been a feature in Nextcloud for a while and we improved the UI a little. You can create certain dates and times at which people can book meetings with you. We also updated the create-new-account wizard. The attachment viewer is now right in Mail, so if you have a PDF, just click it and you can view it. iMIP invitation support, so in an email you get an invitation for the calendar, you just click accept and then you're in there, etc. So all good stuff. Let's get to Office, as I have less than 10 minutes left. I know, it's terrible. So what did we do? Yeah, you can upload custom fonts. That's actually super helpful, because half the documents that look bad in Collabora Online,
in Nextcloud Office, just look bad because of the fonts. It's very simple. In many cases that's the problem. And, well, you can now easily add your fonts in the UI. What you can also do, if a document still looks bad or if it uses really weird features that require local data connections between documents and this stuff: there's now a button on the top left that says open locally. You click on it. It asks, are you sure, you say yes if you are, and then it'll just lock the file and open it on your desktop, in Microsoft Office for example, or LibreOffice. You edit it. You close it. It syncs it back to the server and unlocks the file. This is awesome. Perfect compatibility, because whatever you run locally works. Obviously this also works for your Photoshop files. So you just go into Nextcloud and look up the file. You click the three-dot menu. You click open locally, and Photoshop opens. When you're done, you close it. It unlocks the file again and your colleagues can work with it again too. So fairly simple. The button is the one next to the save button, the third one at the top left. Right. We also have this app called Collectives. It's a knowledge base application. We introduced a whole bunch of improvements, like an outline that you see on the left here, and tons of other stuff. Honestly, I need to speed up a little bit. So I will skip through most of this, but it's a lot of good stuff. Believe me, you can @-mention colleagues. You can search for the content of these things in Universal Search, and the widgets work as well. So if you have a link to a GitHub issue or to a YouTube video, then it just shows you the content right there. And here are the fonts that I just mentioned, by the way, so you can upload them nice and easy. Next up, Photos. Really cool. We introduced photo albums, so you can create a photo album and put photos in there without having to actually move them around. You can invite other people to it and share the photos with them.
Then we introduced image recognition, so there's an AI neural network thingy. No, we're not sending it to Google. It's running locally on your server and it'll recognize different objects. It'll recognize different faces, different people. It tags them, and then you can find them by tags. A few more screenshots later. Super cool. We also introduced the photo editor, so you can rotate, crop, and apply some filters, all the basic stuff for your photos. So Google killed photos, so you can move over now. I was trying to... well, I have lots of zoom-ins, but you have a nice view now that shows you the faces that it automatically recognized, and then you can click one and you see all the pictures of that person. It's fairly simple. It can even recognize music genres. Obviously not in the Photos app but in the Music app, to be clear. But it's really cool, and again it's not using a database, it's using machine learning. So it really looks, well, listens in this case, and figures out the genre. I don't know how good it is. I've heard it's actually surprisingly good. So really cool. And again, all this is on your server. No data is sent anywhere else, unlike with the big clouds. It even works on a Raspberry Pi. So that's pretty cool, I think. On the clients we really did a ton of work for this release. You can of course also edit files on your tablet. Nextcloud Office will work on your tablet, on your phone, etc. But we also introduced widgets. So you have these widgets: when you open Nextcloud you have the dashboard. These widgets are now also on your tablet and on your phone if you want them. They use the native widgets, the iOS widgets. So on iOS you can have Nextcloud widgets from your dashboard on your phone, on your tablet, etc. Really cool, I think. I think I have some examples here, exactly. So your files, notifications, changes, files shared with you, etc. On Windows you now get this.
So if you have the virtual file system on Windows, you're not syncing all the files, but you view them, and when you click them they get synced. But at least you get previews in the meantime; that's new as well. Quite nice. Android, a few improvements; iOS improvements as well. I will not go into details, because we are really getting there. Those were the widgets. Two other nice things in the last two minutes I think I have left. We made it a lot easier to get Nextcloud. So there has been a Docker image for Nextcloud for a long time. And this was a kind of IKEA-inspired Docker image, I would say. You had to bring your own database and your own file system and your own, you know, so you're really setting up a bunch of Docker images. It's very nice if you're really into Kubernetes and that stuff. If you've never heard of Kubernetes, like most people, we now have an all-in-one Docker image. You just download this one Docker image and, well, it'll give you a nice overview. It runs all the other containers in there. It's super easy to use. We even made a VM version of it. It's just a VM with this Docker image running in it, on Ubuntu I think, which you can then run on Windows if you want. So now suddenly Nextcloud is available for Windows Server, if you so wish. It's still running Linux, obviously, right? We're not going crazy here. It even has backup built in, by the way. There is also a Nextcloud backup app, by the way, but that's using something completely different; that allows you to back up to another Nextcloud server. Time's up. I have to tell you, by the way, that you can deploy this all-in-one container with one click on these platforms. So if you really love the cloud but you want a little more control, you can still run Nextcloud on any of these with one or two clicks, and then you have it deployed there. So we're in their app stores, basically. Not all of them are fully finished yet, but we're working on it. We'll get there. Anyway, questions? Sorry.
Only a few seconds over time. It's not so bad. Questions? Come on. Co-authoring. Yeah. Sorry? Co-authoring. Co-authoring of a document. So you mean collaborative editing? Yes, absolutely. So in Nextcloud Office, but also in Nextcloud Text, our note-taking app, and also in the knowledge base. So if you're editing a knowledge base document you can do it with 20 people if you like. I mean, I don't know why you would. So, well, the Android and iOS apps do it, but the desktop client does not; there it would open in a browser. So you right-click, you say edit document, and it opens a browser window and then you're in there. Thank you. Yeah. Does the knowledge base support Markdown? Does the knowledge base support Markdown? Yes, it is Markdown. It dreams Markdown. It lives Markdown. Everything is Markdown. It's not Nextcloud Office, it's Nextcloud Text, but in a different way. So it has these widgets and all the other stuff, just like Text. It's basically, I don't know, it's like Text but with a sidebar that lets you search and, you know, choose and link to other documents. It's just Text on steroids. Yes. Can I talk with other servers? Can you talk with other servers? Yes, in multiple ways, but Nextcloud has a federation feature, so you can share a file to the server of your friend. Talk. Yeah, so Nextcloud Talk. No, at the moment Talk is not federated. It's something we want to do, but I don't have an ETA for you. Thank you. Next. Is the photos app an add-on? The Photos app is the default photos app, with the recognition and everything? Yes. Yeah. Yeah, for the recognition of faces, because you need to download a gigantic, you know, file there, you need to separately install the Recognize app, and that's about, I don't know, I think it's a gigabyte plus, because it needs to download this network that can recognize your pictures on your server. It has ARM and x86 builds, so it should work in most places, but this is a separate action. But the Photos app itself is there. Yes.
Can Collectives use local content? Collectives. What do you mean by local content? You mean you can insert a file in there? Yeah. Yes, you can. Yes, in this document, files from Nextcloud. Of course, if it's not on Nextcloud it will upload it and put it in the same folder. Yes. In the Forms app, is it not possible to share the administration, so several people can read the form or see the results? No, not yet, but that's something we would want to do. So the Forms app: you have Nextcloud Forms, which is like Google Forms, but then you're not giving all your data to an evil American company. Yeah, and the question was whether multiple people can see and manage the same form, and this is unfortunately not yet possible. So the person who creates the form can export the data to a spreadsheet, but they can currently not yet give management to other people. We'll get there. Yes. You said something about Talk, having many people in a call. Yeah. Can they all see each other? No. How many people can see each other? So you can control access rights, and if you have infinite network bandwidth then everybody can see everybody. In the real world, no. And it purely depends on your network bandwidth here. But what does the grid view of Talk do? I think the grid view goes to 20 people or so, and beyond that you start to scroll to the next page, and the next page. But I believe it scales with the size of your monitor, if you have a gigantic monitor and a high resolution. Again, you know, it's open source and it's self-hosted. You are always the limiting factor, believe me. Anything else? We talked about the local Talk client. A local Talk client. No, so we have Android and iOS clients for Talk at the moment. But stay tuned. That's it. Awesome. Thank you for being here, everybody. It's self-hosted, you are always the limiting factor. Thank you. |
The Relentless March of Markdown
And its arrival in Tiki 25 |
Hello, I'm Jonny Bradley. I work on the Tiki project as a developer, and have done for about 20 years, I think, but I'm not going to do the usual "this is the new stuff in Tiki 25" talk, because there's a lot of it, and I thought I would concentrate on something more collaborative. So this is about Markdown. I'm assuming everyone knows what Markdown is and uses it; if not, we can work it out afterwards. Yeah, it's a microphone, but only for online people. Yeah, sorry, I'll try and project. In the old days we had BBCode starting off, which was like HTML with square brackets, and then MediaWiki, obviously, everywhere. Then I found Tiki, which had its own syntax; it uses little quote marks instead of asterisks, that kind of thing, and each time you change to a different platform you have to look up in the cheat sheet: how do I do underline, is it underscores, is it asterisks, whatever. And more and more in the last few years we've been finding Markdown has appeared, mainly from GitHub and GitLab, and the best thing would be if there was just one, one syntax to rule them all. In the very olden days, the early noughties, there was a project called WikiCreole, which Tiki was nearly involved with. We didn't implement it, but XWiki, I believe, did. TiddlyWiki, which I always liked the name of, DokuWiki, lots of other friends. And yeah, but the project stalled; it didn't quite make it. I think, apparently, because MediaWiki just had too much stuff that they couldn't migrate to a different syntax, so it all ground to a halt. At about the same time, there were the PHP PEAR classes, with which you could do an interchange from, yeah, WikiCreole to MediaWiki to Tiki. We still use that, even though it's not supported anymore. Oh yeah, so we need another standard. Jean-Marc insisted that I should put this cartoon in; it wasn't in the original script. So let's have one more standard.
Here comes Markdown, and this was the reason for the title of this talk: I just had this idea of Markdown taking over the world, a bit like Godzilla destroying a city. Markdown was started in 2004. It was basically a Perl script, I believe, and over the years more and more people implemented different versions of it, and it became a bunch of different standards. So then the CommonMark project was initiated about 10 years ago, and that seemed to be the best of breed. It's a Creative Commons definition of how it should work, with a test suite, and basically that seems to be the standard adopted by most people. But obviously it doesn't do everything, so many people extended it, mainly GitHub, so GitHub Flavored Markdown seems to be the generic standard for most places. And that is now used by GitHub, GitLab, Stack Overflow, People Without Logos, Nextcloud, who we just heard were there early on, Discord (icon broken), Bugzilla, and now even proprietary software like Facebook and WhatsApp, Telegram, Signal and so on is using it, and so we thought Tiki should use it as well. So in Tiki 25 we've adopted Markdown. We've always had our own syntax, so that carries on working, and for now Markdown is off by default, but you can enable it. It's a little bit alpha; that's improving, and in the next release, hopefully, it will become the default syntax. And then at some point in the future Tiki syntax will become read-only and we will be on Markdown like the rest of the world, we assume. And of course we need our own flavor of it.
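To make the difference between the two syntaxes concrete, here is a minimal sketch in Python of translating a few Tiki inline forms into Markdown. It is only an illustration, not how Tiki converts pages: Tiki is a PHP application, and the function name `tiki_to_markdown` is hypothetical. The four Tiki forms handled (`__bold__`, `''italic''`, `!` headings and `[url|label]` links) are assumptions about Tiki syntax; plugins and everything else are ignored.

```python
import re


def tiki_to_markdown(text: str) -> str:
    """Translate a handful of Tiki wiki forms into Markdown (illustrative only)."""
    # Headings: a run of leading "!" becomes the same number of "#".
    text = re.sub(
        r"^(!+)\s*(.+)$",
        lambda m: "#" * len(m.group(1)) + " " + m.group(2),
        text,
        flags=re.MULTILINE,
    )
    # Bold: __text__ becomes **text**.
    text = re.sub(r"__(.+?)__", r"**\1**", text)
    # Italic: ''text'' (two single quotes) becomes *text*.
    text = re.sub(r"''(.+?)''", r"*\1*", text)
    # Links: [url|label] becomes [label](url).
    text = re.sub(r"\[([^\]|]+)\|([^\]]+)\]", r"[\2](\1)", text)
    return text


if __name__ == "__main__":
    sample = "!Release notes\n__Markdown__ is ''optional'' in [https://tiki.org|Tiki]."
    print(tiki_to_markdown(sample))
```

A regex pass like this breaks down quickly on nested or overlapping markup, which is one reason a real converter would work on a parsed representation rather than on raw text.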
One thing that was a little surprise to me is that you can't do centered text in Markdown; that is outside of the scope of it. But all our clients will want to do centered text, so we use what are, misleadingly, called plugins in Tiki. You don't have to plug them in, they're just in there; they're more like WordPress shortcodes, I guess. So yeah, you want to center something, you want to put it in a box, formatting tools and user lists and so on. I think there are about 250 plugins in Tiki, which will carry on working, but in a Markdown scope. We also use CKEditor for our existing WYSIWYG offering, and we decided to review that, partly because the licensing changes in CKEditor 5 or 6, is it, seem to be a little more challenging, and so we went for Toast UI, which is a native Markdown WYSIWYG editor and seems very promising. Again, in 25.0 it's a little alpha, but it does the job. That's about it. I'm really under time, aren't I? So yeah, you can find out more about our Markdown there. Tiki 25, which I could go on to: we have a lot of new stuff in there. You can play with it on demo.tiki.org, and there's a recap of everything on our wiki page. I just wanted to say thanks to Marc Laporte, who was a Tiki leader for a long time and is still looking after us, and he sort of spearheaded this initiative; Victor, who did a lot of the back-end coding; Mova group, who initiated the whole thing because they wanted better WYSIWYG; and some mysterious third anonymous benefactor. That's about it. Are there any questions?
So Marc Laporte said that you are most welcome, and we have, I believe, 16 minutes for questions. 16 minutes? Yeah, I should have done more slides. Yes. I was just wondering, can you translate from a Tiki-supported language into Markdown itself? Yes, yeah, we have. So the question was, can we convert from Tiki syntax into Markdown and back again? Yes, we can. The more you do it, the more the page will go weird, because there are some things that are supported in different ways in the different languages. But yeah, each wiki page has a little cog icon; you get a little dialogue saying, do you want it in WYSIWYG or the plain editor, and do you want it in Tiki or Markdown syntax. And it's surprising, we weren't expecting that for Tiki 25; we're quite pleased that's working already. Again, that will improve. Yes, sir. I looked at the CommonMark specification, and basically I lack things which I used from PHP Markdown Extra, like tables and definition lists. Is there any current Markdown which implements those? Yes, GitHub Flavored Markdown. Sorry, yeah, repeat the question. So specifically you're asking about tables and other features, whether those are supported in Markdown. They're not in the core CommonMark specification, but GitHub Flavored Markdown adds tables and a bunch of other things. There are some things that GitHub does, like references to other commits and pull requests and so on, that we don't use. And again, Tiki Flavored Markdown has a plugin system, so there are another 250, so you can do search results, you can build quite complicated applications. We're still working out what's missing, so I guess we'll get feedback over the next few months as to what people want to do and still can't. Yes. First, one remark: if people are interested, we have a converter in XWiki. XWiki Rendering supports many, many syntaxes and is able to convert from one to another, including from HTML, so if you can convert something to HTML, it could actually convert to any of the syntaxes that XWiki Rendering
supports, including Markdown, CommonMark, GitHub Flavored Markdown, and many others like MediaWiki, etc. And I have a question on the WYSIWYG: did you have problems with the allowance for inline HTML in Markdown, making the WYSIWYG go back and forth to come back to the Markdown syntax? Yes, in the alpha. Sorry, yeah: do we have a problem converting from one syntax to the other, where you get little stray bits of HTML creeping into the wiki markup, which we obviously sanitize afterwards? I think we've nailed all those. Yes, there were some challenges with that in the alpha stages, and I kept on finding that a bold Markdown tag would then suddenly appear as little HTML bold tags. Part of that, I think, is that Toast UI does some of that. Again, it's the initial release and it's still marked as experimental. So yeah, I found, I don't know, yeah, I found the Toast community to be a little bit sort of read-only. I haven't found much of a chat. If anyone knows where to talk to the Toast developers better than I do, I would like to do that, as I did the Toast implementation. But yes, it's something we're watching out for. Yes? Yeah, the question was, is there going to be a CommonMark version 2? I don't know. Anyone else got any ideas? It seems to be designed to be very complete, and it's designed to be very simple, so possibly not. So we have to wait and see. As an aside, I found out that MIDI, you know, the music interface system, is still version 1.0, and that started in the 80s, so maybe it doesn't need upgrading. Hello. Depends which language you're writing it in. Oh yeah, are there any tools? Tools and libraries, you're asking, to convert Markdown to HTML? Yes, there are a lot. We're using the PHP League's CommonMark, because we're a PHP application, which does most of the Markdown to HTML conversion, and, I believe, the HTML back to Markdown, because again we go via HTML when we're doing the conversion process. Which language are you in? There are bound to be some, surely. I don't know the Python world that well, so I'm
guessing there'd be a link; probably the CommonMark website should have a list of all of the libraries and implementations. Okay, so spend the rest of the half hour as you wish, and unless there are any more questions, I'll leave you to it. |
Privacy and Collaboration
How CryptPad lets you have both |
Okay, so hi everyone. Thank you for making it. I'm David. I'm the project lead on the CryptPad team, part of XWiki. And so today I'm going to give you a little summary of what CryptPad is and what we're looking at for the year ahead. But first, I want to give a little bit of context. What context are we working in? I think we're in a particular moment when it comes to online collaboration. I'm going to take education as an example, but I think the broad strokes of what's happening apply to many other sectors. So last May, Human Rights Watch released a report where they found that 90% of education tech software was spying on students or had the potential to spy on students. And of course, a lot of this is linked to the COVID crisis. So don't get me wrong, this didn't just spring out of nowhere when COVID happened. A lot of it was kind of waiting in the wings. But studies such as this one on the zoomification of universities really show the kind of drastic shifts in adoption of online collaborative software that happened around COVID. They also highlight real concerns, in this case about the impact on academic freedom, the concerns that are raised when universities become reliant on cloud providers, especially Big Tech cloud providers. I personally know a lot of people in academia, and basically since COVID, a lot of people just live in Microsoft Teams now. So that's an issue. I'm not going to go too deep into it, but Privacy International is watching this space, and especially in the ad tech sector, I think this is really worth watching. So at the moment, we're at this point where big entities such as governments, so in France, for example, where we're based, the education ministry has its back against the wall in a way, because, you know, they can't keep recommending that people flock to Big Tech as in the kind of early panic of COVID. And so they're not; they're advising against it. But I guess the big question is, what now?
You know, where do you go now? How do you replace these tools? Collaboration has proven to be needed; there's a need for some kind of collaborative software solution. But obviously, all the concerns linked with these solutions being provided by Big Tech giants are really becoming apparent and harder to ignore now that these situations are kind of normalized. And the issue is that collaboration often comes at the price of privacy, right? So you have, on one side, these solutions that are very good at providing collaboration. Some of them are, you know, big evil tech corps like we've seen. Some others are maybe less evil, maybe more open source, etc. But yeah, typically not very good at privacy. On the other hand, you have very good solutions that provide encrypted storage and personal note-taking, for example, but that are typically not great at collaborative work. Okay, and so at CryptPad, we think, or our project exists because we think, we should be able to have both. So collaboration and privacy shouldn't be mutually exclusive, basically. So we exist in that overlap in the Venn diagram. And so what we provide is basically an online collaborative office suite with a lot of the features that you'd come to expect from such a thing. So you have a drive where you can store documents. We have a full suite of applications, all of which are collaborative in real time. So that means different people can be editing the same document at the same time and see each other's cursors and things like this. So we have spreadsheets, text documents, forms, Kanban boards, plain text, Markdown documents. So some of these are integrations of other open source software. So we integrate the OnlyOffice frontends, for example, and others are made from scratch. So for example, our forms application we developed from the ground up. We have a lot of collaborative features.
So the drive that I showed could also be a team drive, where a few people have access to the same set of folders, and we have granular access controls as to who can see what, etc. We have calendars and sharing capabilities, so with contacts on the platform, but also with the ability to produce links that you can then share with your friends, so you can all edit the same set of notes, for example. And the real unique feature behind all of this is that everything is end-to-end encrypted. So basically everything happens in your browser, and your browser encrypts and decrypts all the messages that you send to the server that's centralizing the collaborative element. Okay, so nothing leaves your device unencrypted; that is the basic feature, but also the constraint, that we have. Okay, so I'm going to do a little tour of what's currently on our minds. I have to put in a disclaimer that a lot is in flux at the moment. We're in the process of setting up potentially big partnerships and stuff, but nothing is fully finalized. So I'm not going to give any names, and I'm not going to promise that everything I'm going to show today is going to actually be implemented this year, and maybe some other stuff that I have no idea about yet will be implemented instead. So this is just what's currently on our mind as of February 2023. So one thing that's definitely underway is a project called CryptPad Blueprints, where we're thoroughly documenting the use of cryptography as we currently use it, so that it's more transparent, our threat model is open, etc., and so that we can also pave the way for future developments. So Theo, who's sitting here in the front row, is leading this effort. He's just given a talk in the security devroom, if you want to catch up on that later on, and he's also just released a white paper as the first step in this project. There are some interesting experiments coming up in the kind of looking-ahead part of this project.
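The constraint that nothing leaves the device unencrypted can be illustrated with a toy sketch in Python. It does not reflect CryptPad's actual cryptography, which runs in the browser and is documented in the team's own white paper: the `RelayServer` class and the `client_send` and `client_receive` functions are hypothetical names, and a one-time pad stands in for a real cipher only because it fits in the standard library. The point is the division of roles: clients hold the key, the server only ever stores opaque bytes.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must match the data length")
    return bytes(d ^ k for d, k in zip(data, key))


class RelayServer:
    """Hypothetical server: stores and forwards opaque blobs, never sees a key."""

    def __init__(self):
        self.blobs = []

    def push(self, blob: bytes) -> int:
        self.blobs.append(blob)
        return len(self.blobs) - 1

    def pull(self, index: int) -> bytes:
        return self.blobs[index]


def client_send(server: RelayServer, text: str):
    """Encrypt on the client, upload only the ciphertext, keep the key locally."""
    plaintext = text.encode("utf-8")
    key = secrets.token_bytes(len(plaintext))  # fresh key, used exactly once
    index = server.push(xor_bytes(plaintext, key))
    return index, key


def client_receive(server: RelayServer, index: int, key: bytes) -> str:
    """Download the ciphertext and decrypt it on the client."""
    return xor_bytes(server.pull(index), key).decode("utf-8")


if __name__ == "__main__":
    server = RelayServer()
    index, key = client_send(server, "meeting notes, draft 1")
    print(server.pull(index))                  # unreadable bytes on the server
    print(client_receive(server, index, key))  # readable only with the key
```

In this model the key has to reach collaborators through some channel the server cannot read, which is why a forgotten password cannot simply be reset by the people running the server.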
We're going to experiment with CRDTs, so we're testing a very experimental prototype using Yjs as a way of syncing the edits between different users, and we're also looking into how we could implement perfect forward secrecy. At the moment, when you join a document you also gain access to its history, which in some cases could be problematic, so we want to have a way of limiting the access to the history to the moment that you join, for example. And we also want to have better ways of recovering your account. At the moment the typical support email that we get is "I forgot my password", and we can't do anything in that case, unfortunately, because otherwise we would have access to people's documents. But there are ways of doing this in a slightly better way when it comes to usability, so we want to look at this. This is funded by Next Generation Internet, by the way, who also have a presence here, so I encourage you to look them up. Another project we're doing with the help of their funding at the moment is CryptPad Auth, so this is also security-related. We are looking at different ways of improving authentication on CryptPad, so we're going to look at ways to integrate with existing single sign-on schemes. So, say, if a company already has a single sign-on system, then CryptPad should be able to be part of that. We're also looking at multi-factor authentication. Now, this time last year I was presenting here at FOSDEM the results of our Interoffice project. This is a project where we basically developed document conversion between different office formats, and because of the constraints I was speaking about earlier, this cannot happen on the server, because our server doesn't know anything. So it had to happen in the client, and that was slightly complicated, but we did it, and with this we were able to complete our integration of the OnlyOffice editors, so we also released the presentation and text document editors at the time.
However, we're still having quite a few issues with them. We've not been able to stabilize them as much as we'd like, so we're still seeing some bugs, and we're not quite able to reproduce them, so a big priority for us this year will be to stabilize these two apps. At the moment, on our flagship instance CryptPad.fr, they're limited to paying users, simply because we needed a way of limiting the amount of support tickets we're going to get. So one of the big goals this year is to stabilize these two apps and release them for everyone. Mind you, they are still open source, right, so if you're self-hosting an instance you already have access to these. Another thing we're looking at is to integrate a new application, so draw.io, or diagrams.net: that editor should make its way into CryptPad this year, so collaborative diagram editing, which is very exciting. We just heard about the relentless march of Markdown, and one thing that we're looking at as well is to improve Markdown use in CryptPad. I say improve, but we already have great support for Markdown. We have two applications: our code editor has great Markdown support with lots of extensions, and we also have a Markdown slides application where you can write slide decks in Markdown. I know this is a slightly niche feature, but I'm guessing this is probably the place where the niche is most active, where coders are used to writing slides in Markdown. I personally like it, but these two apps are really underused; hardly anyone knows about them in the case of CryptPad. So I think, to do justice to them, we're going to try and merge them into a single app, and we're going to call this Notes. Again, kind of going out on a limb here, but something that's not "code", or something that doesn't suggest that it's only for editing programs, because that's not the case. So yeah, we're imagining something at the moment with basically a Markdown editor with different modes, so if you want to do slides you can just switch the
mode in the single app. So beyond these different EU-funded projects and the partnerships we're setting up, we've also been thinking about how to make our project financially sustainable, because we can't go on research grants forever. And so basically we've identified three broad segments in our user base, which are enterprise, nonprofits and education, and right now we're trying to cater to these three broad use cases, not necessarily trying to get everyone on CryptPad.fr, but thinking more at the instance level, and encouraging and advertising the possibility for bigger entities, such as universities, bigger NGOs, and small, medium or large enterprises, to have their own instance of the product, on their own domain, that they could manage or that we could manage for them, as in support contracts, etc., and that would be, you know, maybe slightly customized to their branding, etc. So we're exploring this as a potential revenue source for making the project financially viable, basically, long term. So this is some pricing that we've just launched on our project site. So the self-hosting and the code remain free, obviously; everything remains open source. And then we have the existing way of subscribing to use CryptPad, which is personal and organizational accounts on CryptPad.fr, and now we're adding this managed instance possibility, which is still in very early stages, but yeah, we're looking to see where that goes, basically, in the year ahead. So one thing before closing: I want to draw your attention to our new forum that was launched last October. I want to encourage you, if you have feedback, feature requests, even bug reports, and you don't use GitHub, then please come to our forum, start a thread about something that you care about, and we'll respond, and hopefully some other community members will respond too. I want to give a big shout-out to the team, because obviously I'm not on my own. This time
last year we were three people, and now we're six in the core team, so we're growing rapidly, and a lot of things are happening, and hopefully very exciting things are on the near horizon. This is it from me. I want to encourage you to go to CryptPad.org, where you can find all the relevant links: where to contact us, the new pricing I've mentioned, and everything else to get in touch, or to read our documentation and try out CryptPad for yourself. Thank you very much, and I welcome your questions. You have a question? Yes, yes. So yes, one of the things that we're, oh yes, sorry. So I didn't talk about an API that we're developing, which is called Open with CryptPad, although we're kind of reluctant to use this name, because in some cases it could imply that you're benefiting from the encryption when you would not be. But so yes, this is an API that we're developing at the moment so that other host platforms could use a CryptPad application just as an editor. Okay, so you have a file, a Markdown file or a document, stored in Nextcloud, and you could open it in a CryptPad encrypted collaborative session, and then once it's saved, you save back to Nextcloud, or, you know, any other kind of host platform with a lightweight CryptPad instance running alongside it. Why are we reluctant to use Open with CryptPad for this, or not reluctant, but thinking about what to call it? It's that CryptPad comes with a promise of privacy, and in this case we only control the privacy while the collaborative session is open. So yeah, that's an issue, but yes, potentially some exciting development on that front that would allow more people to integrate with us. Yeah. Yeah, a question about the concept of zero trust. Because of zero trust, and encryption based on that, you can validate or verify that the client is really end-to-end encrypting your data. But as this is a web app, is there any way to verify that the server delivers the code that is really
end-to-end encrypting, or in the end do I have to trust the server that it is really delivering the code? So how can you be sure that the code that you're getting from the server is really the right code, I'm repeating your question, and that there is no kind of malicious action on the server's part; is that your question? So Theo's talk is called "Whom do you trust?"; it was in the security devroom earlier, so he will have a lot more detail on this. I think the short answer is that you can't really be sure of it, but there are ways that we are exploring in terms of verifying, for example, that the code on the server matches a certain repository, but this is not implemented at the moment. So you have to trust that, yes, that the server is really delivering the same code that we publish on GitHub, for example. Run your own server. Yeah, that's, yes. At the back, yes. Hi. So a lot of freedom fighters and activists are starting to use this. Question one is: is this advisable? And question two, if you allow me: a lot of commercial encrypted phone providers for criminals actually got busted over the last two years. These were for drug criminals and so on, people planning murders on that. Are there any contingency plans for this? So to your first question, is it advisable for activists to use CryptPad, I guess, is your question, I would say yes. Theo reminded us earlier of the case of Disha Ravi, who is an Indian climate activist who basically got doxxed by Google to the New Delhi police, and that resulted in her arrest. So I think, yes, it's beneficial if you want to avoid scenarios like this, and I think especially so when you're involved in this kind of organizing. I think it's about protecting yourself, but also protecting whoever else you're dragging into your movement, right? So we've seen a few cases where, you know, a document is shared quite widely, and you see these uses of Google Docs where it's about sharing resources or even gathering resources, etc., so this is not
just protecting yourself as an activist, but also all the people who are visiting your document, right? So I think, yeah, there is precedent here where, you know, big tech companies are more than happy to out people, and I think as we see the tightening of the definition of what's acceptable activism, especially around the climate movement, I think yes, I would advise for it. Your second question was about encrypted phones and contingency plans for criminal uses of the software, basically. So we administer CryptPad.fr, which is the flagship instance, and I think these plans are really at this level, at the instance admin level, and so I can only speak to what we do on CryptPad.fr. We're actively monitoring for, you know, criminal uses of the platform, and whenever something is reported to us, or whenever we find something, as we're actively searching, then we take down anything we find. Our experience is that at some point most of these endeavors end up having to recruit, or having to, basically, publish a link to something on the platform, so that's when these things, groups or whatever, become visible, and then we can act on them. Does that answer your question? We have one question at the back. Yes, can you please tell us something more about the distribution in France, how it's actually welcomed by schools and how it's actually functioning, as in the start of the presentation? Yes, so what's the adoption of CryptPad in response to the recent recommendation not to use Big Tech platforms in education in France? So, nothing as far as I'm aware. We're not aware of any schools using or having adopted CryptPad as a result of these recommendations. Be aware, this was two months ago or something; it's fairly recent. We see a lot of adoption in Germany, and this was even before these recent developments, so we have a lot of testimonials from schools and universities that are using CryptPad in Germany
and we seem to have a lot more traction on the German market than we do in France, so this is something that we're looking to improve on the French side. We're hoping to get, for example, there is a catalog of apps for the Éducation nationale, and we're working to get onto that kind of suite that is deployable by schools. We're not there yet, but this is something that's on our radar. Yes, they're interested; they need it. They need the whole thing, in particular, right, which is very important for that type of deployment. It's a bit complicated. It's not a deployment by school, so it's not instances; it's basically very, very big instances that they have. They have in their apps another type of pad, a simple one. There is somebody from the Éducation nationale tomorrow in another room; go see him and tell him that you want it. So, thank you. Then we have one more, last question. Okay, yes. Okay, you told us that CryptPad is end-to-end encrypted, from user to user. Yes. So you have no access to the content of the users. Yes. You said it's basically safe for activists to use it, and then you told us that you monitor the flagship instance. Yes. For lawful or unlawful uses, illegal ones. Yeah. So the question is how you monitor the encrypted data for lawful or unlawful uses. Oh no, sorry, sorry. So, the contradiction between: I said that I would recommend activists to use CryptPad, and also that we were monitoring for criminal uses. What I mean by monitoring is, you know, OSINT, like scanning the internet for CryptPad links, not that we can monitor anything as administrators of the platform. Does that make sense? We search. We don't see encrypted data, but if somebody sends us a link with the encryption key, a link to a document, we will delete it, potentially, and we would report it if we have to report it. And if we see that it's blatantly criminal, we would potentially even close the account and delete the data, because we
have some links between the data. We wouldn't be able to read the data, because we cannot see the data, but we would delete it. And I'd like to add on this: we had one police request in the year, and it was clearly illegal, what they sent us. We sent them what we have, which was close to nothing, like very little information, only what we see in the pads, basically. But so, we're not seeing any data; clearly we're not seeing the data, but some people post links. That's why, if the police seize the computer, they will find a lot of links on the computer, links to CryptPad documents. We can continue this if you want, but I think there was a misunderstanding with my use of "actively monitoring": public. So, if something is posted on the... Thank you very much. |
Transparent, asynchronous, efficient communication
How the Zulip open-source team chat application addresses the needs of open-source and research communities |
Great. Again, I hope everyone's had a great first day at FOSDEM in person again, and thank you for staying to the last presentation. My name is Lauren, and I'm a software engineer at Kandra Labs, and I work on the open source project Zulip, and I'm going to be talking about how you can collaborate via chat, hopefully transparently, efficiently, and asynchronously. So as a collaboration tool, and we're thinking especially here of open source communities, open source software projects, open research projects, what are some of the benefits of chat that we have? And really, I mean, we have so many collaboration tools and communication tools. We have email, we have our issue trackers, but chat can really be a place of generating some community and connection, and also it's kind of a low-friction, lightweight place to connect, right, and create some of those things. So it can be a really beneficial place for our projects, but it comes with some challenges. So who here works with some folks that are maybe in a different time zone than them? Yeah, me, I definitely do. And so that can be really challenging if you have a chat application going, because we think of chat as being live, right, when we're all sitting down at our computers together, but if you're working with someone in a totally different time zone, then your chat is not going to be synchronous. It's going to need to be functional in sort of a different sense. So that's a challenge of chat in our communities. Also in our communities, we have so many things going on, right? We have new features that we want to implement. We have bugs that we're fixing, issues that we're dealing with. We have releases to manage, conferences to attend. So we've got a lot going on, and that can really make chat become very overwhelming very quickly. And we have a lot of different folks in our communities, right, and they all have different needs and play different roles.
And so I'm going to go through an analysis, and I encourage you to kind of think about your open source community, your open research community maybe, and these roles and what their needs might be, in addition to the ones that I've specified here. So project leaders, the folks who are leading the charge of your project, the people who are making it happen: what are some of their needs and challenges with chat? Well, they want to be there and have those connections with the community in chat, but they also really want to make sure that they're not missing anything in chat, right, if they're not there. So there's this kind of balance between connecting and also being able to step away from it that we have as project leaders. We have core contributors. I'm a core contributor to Zulip, and when your core folks come on, you know, they're working more often, they're checking on chat more often, but what they really want when they check in with chat is for it to be relevant to the work they're doing, and for it to be helpful that they're participating in chat. And then they also kind of want to be able to go away and focus on their work and then come back to chat. So it's again this kind of coming and going that becomes a challenge. Our casual contributors: folks who are maybe invested in our projects but are not there day to day, folks who are checking in maybe on the weekends or once in a while. So what about these folks? Well, honestly, if the chat is just a big volume of messages that are coming in in this huge stream, they may not even use chat as a collaboration tool, right? Because when they come in and see that there are hundreds and hundreds of unread messages that they have to sort through, they're going to say, hey, I'm going to go somewhere else to collaborate. I'm going to look at the issue tracker. I'm going to be on the email list or whatnot. And they're not going to really know what's going on in the way that they would in chat.
So that can be really a challenge. If we want our communities to grow, we need new folks coming in, right? New contributors, new people getting invested in our projects. And when they come in, we don't want those people lurking, hiding, thinking, "I don't know what to do, what's going on," you know? We want those people to be able to feel like they have a space to step forward and start participating, right? Have a voice and not be kind of this shadowy person in the background until they figure things out, right? And we want them to get a sense of who our community is, what they're doing, what we're doing together, what we're building. And we have end users, right? The people who are using our projects. So when they come into a chat, again, it's overwhelming, lots of conversations going on. And they have a question or a doubt or maybe some feedback to give. They may not choose to do that in your chat if there's not a space where they feel like their voice is going to be heard, right? So if it's this kind of chaotic, loud, cacophonous chat, hey, what's going on, da-da-da-da, lots of things going on, they may be like, all right, no, I'm not going to be able to engage here. So we really want to create a chat space where they can have these needs met. And, of course, there are lots of other people we interact with, but this is a 20-minute talk, so I'm going to keep going. These are three characteristics of collaboration and communication with chat that I've identified as being kind of core to serving open source and open research projects. And that's, as well as being live, having the ability to chat asynchronously, having an efficient chat experience, and having a transparent chat experience. So something that we're working on together. So going back to Zulip, again, I'm a contributor to the Zulip open source project. It's a 100% open source, modern chat application.
We have many, many contributors from all over the world, lots of people making their first-time contribution to open source through either an internship program like Outreachy or Google Summer of Code, or just out of their own interest. And folks can choose to self-host, obviously, since it's open source, their own server with their own chat application, or we also host Zulip Cloud for folks who want their organizations to be in the cloud. So let's start talking about tackling those characteristics that I talked about. So Zulip has this unique topic-based threading model. You're probably familiar with this from chat applications; everyone has a chat application they're using, right, in some shape or other, on their phone or whatnot. In Zulip, we call them streams. You might be familiar with them as channels or rooms. And this is kind of the big bucket that we set out for conversations that we're having. And the thing about Zulip is we create another layer of context in our streams. So, for example, I have an image here of a stream for our annual summit that we're having. We're going to have a great time at our annual summit. And within our annual summit stream, we have topics that are coming in and binding those groups of conversations together, kind of like an email subject line would. So it's the end of my day. I've just got my CI passing on my issue. I'm super excited, but I'm going to sign off. But I sign into chat just before I go, and I look and I have 78 unread messages in my annual summit stream. And I was supposed to check in about this today. And I'm looking through it, and I open up the topics and I say, you know what? They are having a very lively discussion about a bouncy castle at our annual summit. But you know what? I don't like bouncy castles. I couldn't care less about the bouncy castle. I have no interest in the bouncy castle. I'm not going to be jumping in the bouncy castle at the summit.
So that is 48 messages that I know, right there from the topic, I don't need to read right now. I can save those for later. I can mark them as read now, whatever I'd like to do. But I can look at these topics and say, you know what? I'm really interested in the catering, because I know some people attending our summit have some nut allergies that are very severe. And I want to make sure that's part of this discussion focused on our catering. So I'm going to look at that topic and that focused conversation context there. And if no one's brought that up in those four different messages, then I'm going to put in a pertinent question there. Hey, do we know that people are coming with nut allergies? Are we making sure our catering is accommodating that need? So by reading through my topics, topic by topic, I can focus on what my interests are, where I can add value to the chat, and it makes the whole experience much more manageable. So topics really make asynchronous chat work. We now have folks all over the globe who can participate with more contextual feedback when they're online. So again, if they really care about that bouncy castle conversation that happened, they can still jump in and add their feedback there. Again, we make some space for people whose voices might get lost, new folks and users. And so chat becomes more useful for them. But of course, topics are being used by humans. Humans do not always work, or have conversations, in straight lines. We don't always make sense all the time. And we need to make sure that topics work with the humans who are using them. So at Zulip, we've made a number of tools to work with these kinds of patterns of conversation that we have. So for example, maybe we have this new feature we're implementing, and we're having this really intense design conversation in our design stream. And somebody has this really great new idea right here. What we're going to do is take that new idea message.
We're going to move it over here to its own topic with the new idea. We're going to create a link between these two topics in the same stream for the design. And now we have two parallel topic conversations going on in the same stream about design that have context. We can go back. We can connect them. Maybe we're having this really intense conversation about the new release. And we have a really excited new contributor jump in to say, hey, my name is, and I'm really excited. And what do we do? How do I get things done? And we can take that message, move it over to the stream for new people, say, introductions. Hi, welcome. We're so glad you're here. Please read our documentation. Let us know if you have questions. And this really important release conversation that's going on in our release stream continues uninterrupted, and we keep our flow organized and efficient. Maybe you have someone come in with a help question, right? They're asking for help. They're working on upgrading to the new release. They have some questions. They've had some issues. Some of our SysOps people get on, work with them on the question, and they come to a resolution. That user can then mark that topic as resolved. A big check mark will show up in front of that topic visually. And now we know that that question has been answered and resolved. And so they have the ability to step out and say, hey, you know what? My question was answered. Thank you so much. This is done. So again, creating organization within our topics makes things more efficient. People can prioritize their time. We can move conversations forward. And people have agency to say, thank you, I'm done. Or, hey, let's unresolve this topic. We thought we fixed it, but we didn't. It's still an issue. Let's unresolve it. And we're building up all of this. These conversations are happening. They're branching off here. They're branching off there. They're branching out there.
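As an editorial aside, the topic workflow described above (triaging unread messages topic by topic, moving a message into its own topic, and marking a topic resolved or unresolved) can be pictured with a small data model. This is a minimal sketch in Python, not Zulip's implementation or API: the `Stream`, `Message`, and `move_message` names are hypothetical, and representing "resolved" as a check-mark prefix on the topic name is an assumption made only for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Assumption for this sketch: a resolved topic carries a check-mark prefix.
RESOLVED_PREFIX = "✔ "


@dataclass(eq=False)  # compare messages by identity, not by content
class Message:
    sender: str
    text: str
    read: bool = False


@dataclass
class Stream:
    name: str
    # topic name -> messages in that topic, in arrival order
    topics: dict = field(default_factory=lambda: defaultdict(list))

    def post(self, topic, sender, text):
        message = Message(sender, text)
        self.topics[topic].append(message)
        return message

    def unread_by_topic(self):
        """Unread count per topic, so a reader can triage topic by topic."""
        counts = {}
        for topic, messages in self.topics.items():
            unread = sum(1 for m in messages if not m.read)
            if unread:
                counts[topic] = unread
        return counts

    def mark_topic_read(self, topic):
        for message in self.topics.get(topic, []):
            message.read = True

    def resolve(self, topic):
        """Rename the topic so it visibly carries the resolved marker."""
        resolved = RESOLVED_PREFIX + topic
        self.topics[resolved] = self.topics.pop(topic)
        return resolved

    def unresolve(self, topic):
        """Drop the marker again when the issue turns out not to be fixed."""
        original = topic.removeprefix(RESOLVED_PREFIX)
        self.topics[original] = self.topics.pop(topic)
        return original


def move_message(message, source, source_topic, target, target_topic):
    """Move one message to another topic, possibly in another stream."""
    source.topics[source_topic].remove(message)
    if not source.topics[source_topic]:
        del source.topics[source_topic]
    target.topics[target_topic].append(message)


if __name__ == "__main__":
    summit = Stream("annual summit")
    for _ in range(48):
        summit.post("bouncy castle", "someone", "...")
    for _ in range(4):
        summit.post("catering", "someone", "...")
    print(summit.unread_by_topic())
```

The point of the sketch is only that the topic, not the stream, is the unit a reader skips, reads, moves, or closes, which is what makes catching up after hours away manageable.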
And we built this big tree, this repository of knowledge. Now our chat is not something ephemeral, happening in the moment. We're really starting to create sort of a repository of knowledge that's there for everyone to share. So we've got these asynchronous conversations. We've got this repository of knowledge. What about the transparency, right? So in our most recent Zulip release in November, 6.0, our public access feature landed. And what public access basically is, is that an organization on Zulip can decide, you know what, that help stream we have, that's really important information we want to share with everyone. So we're going to make that web public, which basically means that anyone on the Internet can access those conversations without signing in and without logging into your Zulip. So that now is information that's on the Internet, available to anyone. Whatever their questions are, however they get there, they can start accessing that information, those help questions, right away. Maybe we have our design conversations and we want to put those in public, so people know, what is our design ethic? Where are we moving? What are we working on? And we can make that web public and people can engage. Maybe we've had this really great conversation about a new feature that we're implementing in our chat. And we have over here in GitLab our issue tracker for that feature. We can actually now, if that conversation happened on a web public stream, take that and make a link to the chat. And again, anyone who gets to GitLab and looks at our issue and says, oh, there's more information here, can click on the relevant chat conversation. And now all of that information is available to that person without logging in. So again, we're really taking our chat with public access and moving it beyond our community and making it relevant to anyone who's curious about our open source, our open research projects, what we're doing. This is a value of open source that we have.
So again, if we're making decisions in chat, this is available for people to see. New community members can start learning before they even sign up. And we have this repository, this tree of knowledge that we built, that's now out there in the wild, in the forest of the internet, that we're sharing with everyone. So we're really creating that transparency, and chat is becoming much more relevant beyond just an ephemeral conversation. So as I mentioned, Zulip is 100% open source, free. You can start your own server. And we also have our Zulip Cloud, which has a free level, similar to Slack before they made their change this summer, which is limited: you have a certain history of messages. For non-profits, open source projects, and academic research, we actually offer sponsorship on our Zulip Cloud Standard, which is normally a paid plan. So you get even more history available to the public. It's not limited. That public access is there. So we really are committed to being part of the open source community and making sure that all of our projects have great connection and collaboration, and are engaging all of the people who want to be involved in the organizations. Again, thank you so much. That's about it for me. I have some great links that are in the slides here. The communities directory is a directory of organizations on Zulip that have opted into public access already. So if you're curious, that's a great place to start looking. You can find me in the Zulip development community. That's where we are talking about Zulip and the features that we're implementing and what we're doing. We have some case studies, etc. So I want to open it to questions, or I can jump into one of these open Zulip instances if anyone's interested. Yeah, so thank you. Yes. So for topics to work efficiently, you need to be really strict with moving messages around. That means that moderators, I guess, would have to scan every message and move things around. Yeah, yeah.
So the question is, for topics to work, we're moving things around when people come in, so you take on a lot of moderation, with moderators who have to be very active and efficient in that. Yes, definitely. In my experience in Zulip Cloud, it depends on your organization. You can actually set that up. So for example, just moving topics within the same stream, like maybe somebody didn't name it very well, you can actually set that permission level to a generic user right now, and we're actually working on our user groups so that permissions can be designed even more uniquely for the organization. So those levels of permissions can kind of be shared out throughout your user base. We actually see this with our new users: a lot of times they're coming in to Zulip wanting to contribute, and they're sending messages, and very quickly, if they see a message jump into a stream and interrupt a conversation, they'll even just move it out, as a person who has maybe been there for two weeks. So it does require that kind of communal engagement, but you can disperse that so it's not just on your core contributors or your moderators, and hopefully with user groups, which is a feature that we're working on and planning, that can be even more fine-tuned to your organization and the community you want to create with your Zulip chat. Other questions, yes. Right, so the question, for the folks listening at home, is how do you control privacy with public streams. So definitely your organization is deciding what streams are web public, right? So when you sign on and you're posting in those streams, it's kind of understood that this information is available in general on the internet. There are private streams in Zulip, and there are streams that are public within your Zulip organization that people have to sign in to see.
So for example, on our Zulip development community, the stream for new contributors getting help with development, that's not a web public stream, because that's kind of folks being vulnerable and maybe asking questions and saying, I don't know how to do this, can someone help? I mean, that's super brave of them and we're proud, you know, we want that as public within our organization, but we're not sending that out to the internet. So we've made that choice culturally as an organization. So each organization decides that. So I believe the Rust language community, which is using the public access feature, they've decided that all of their streams are web public. So basically when you sign up to be part of that chat discussion on the Rust language community, whatever discussion you're having there is available on the internet. And so that's just part of the culture of each organization that you're setting up. You can definitely set it up so there are more privacy-focused organizations. But again, thinking about open source communities, there are definitely certain parts of our conversations and chats that we might want to be having as publicly as possible, right? So, yeah. Any other questions? Yes? Do you have any integration at the end? Yes, yes. We have lots of integrations. Right, yeah. So you can move from Slack to Zulip, for instance. Yes, yes. Well, GitHub for tracking issues and Zulip as a chat application work together, yeah. So we're on GitHub for our open source, that's where our code is. And so our issues link, and we use integrations for bots to communicate and stuff. But definitely there are lots of integrations and such that one can use, lots of different authentication methods, et cetera. It's a fully fledged modern chat app. Yeah. Other questions? Curiosity.
Again, if your curiosity has been piqued for your open source projects, I'll be around tomorrow, or come to the Zulip development community and check us out. We have lots of public streams, and I've just been really excited to see everyone here at FOSDEM, and thank you for having me. |
Conquering tribal knowledge with Grav
Four years and a pandemic later, where has our Grav setup taken us? |
Hello, everyone. My name is Andrea. I am a product specialist for documentation at Adyen, and I'm happy to be here today to share how we use Grav for our documentation both internally and externally. I was a technical writer for more than five years, and before that I was a research scientist. But enough about me. Let's look a little bit into Adyen, the company that I work for, so we can better understand the needs for documentation internally and externally. Adyen is a financial technology platform, which has seen a lot of growth since its inception back in 2007. These days we are over 2,000 employees of more than 100 nationalities across the world in 26 offices. In development, we use, write, and maintain open-source software. And to keep all of this together, we have an internal hub for knowledge sharing, and externally, to build confidence with the businesses that we work with and to help them integrate, we also have documentation for them. The team doing documentation at Adyen, the Docs at Adyen team, now has around 30 people. This picture is a bit outdated, so there are a bit less than 30 people in there. It was taken back in November when we had an off-site to work on objectives for 2023. But getting back to the point, about half of the Docs team is technical writers who work closely with the development teams to produce integration information that our clients use. And the other half is more product-focused. They look after our Grav implementation and try to make it as user-focused as possible, where the users are both the people reading the documentation and the people writing the documentation. So we've got front-enders, back-enders, we've got automation engineers, we call them DocOps engineers. We've got product managers and designers working for us. And it would be a bit incomplete if we were to not mention all the contributors.
So the external documentation is not open source, but it is inner-sourced, which means that our colleagues at Adyen can contribute suggestions if they find something that they think is not quite right, or they have an idea of how to improve things. And the internal knowledge base is open for everyone to edit. And in the last four years, there have been over 3,000 people who have contributed to the internal knowledge base. Now, some of you might have been wondering what exactly Grav is. It's an open-source flat-file content management system. It's got a really big community around it, and it has consistently won awards as a product. And for support with your Grav implementation, there is also Trilby Media, a company of the people who have written Grav. Grav itself is written in PHP. The content is in Markdown. You can also do templating using Twig. And its features are greatly extensible through its powerful API and package management. So some of you might have seen this talk from back in 2018 from Alexei, who was then talking about the migration to Grav from Confluence. He was emphasizing how we chose Grav because of the possibilities it offers in terms of collaboration, contribution, and extensibility. Other benefits were obviously it being open-source, which meant no fee per user, which used to be the model at the time with Confluence. It also allowed us to unify how we do documentation internally. So before migrating to Grav, there were several CMSs serving various needs. The knowledge was quite fragmented. And switching to Grav enabled us to create a go-to place for internal knowledge sharing. Also, moving away from Confluence allowed us to have full control over the look and feel of the website, believe it or not. And I could hardly believe it when people were telling me that in order to customize the look and feel of the Confluence website, they had to modify the DOM using JavaScript. That was a very roundabout way of doing it.
And also being able to fully customize both the end user experience and the writer experience, according to their needs, is a big plus that Grav was bringing. Now, in terms of this talk, we're going to look at the Grav implementation from a tech perspective, looking at how the code base is organized and what kind of quality assurance and customization we do with it. We're also going to look at the implementation from a people perspective, all the different ways that people collaborate and create content using Grav. We're going to talk about the collaboration with the open source project maintainers themselves. And then we're going to end by looking into the crystal ball at what we expect, or what we're looking forward to, in terms of the future. Now, for the tech part, the really neat thing about Grav is that it's possible to have a single code base that then enables several websites to be deployed. This is because we have the Grav code base in a separate repo from the various content repositories. And then each content repository can even be deployed to several domains, each according to its own domain-based configuration file, which can even turn various features on and off depending on the domain. For example, we use this for the external documentation, where we have two domains: an internal domain for the staging environment and a public domain for the documentation that our clients use. Grav also supports both static and dynamic versions of a website. For public documentation, we always use static HTML websites for security reasons, but internally, so that users can immediately see the changes that they've saved, we do use the dynamic version of the website, so talking directly to the server. Grav also allows quite fine-grained control of access, so you can restrict read and write access based on user profile, or you can even do this outside of Grav by putting a particular server instance behind a firewall.
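As an editorial aside, the "one code base, several content repositories, one config file per domain" arrangement can be sketched as a small planning function. This is a minimal illustration in Python and does not reproduce Grav's actual configuration format: every name here (`DomainConfig`, `plan_sites`, the repository names, the example domains, and the feature flags) is hypothetical, and treating the staging domain as dynamic is an assumption the talk does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfig:
    domain: str
    mode: str  # "static" or "dynamic"
    features: frozenset = frozenset()  # feature flags switched on for this domain


@dataclass(frozen=True)
class Site:
    codebase: str
    content_repo: str
    domain: str
    mode: str
    features: frozenset


def plan_sites(codebase, deployments):
    """Pair the single shared code base with every (content repo, domain config)."""
    sites = []
    for content_repo, configs in deployments.items():
        for config in configs:
            if config.mode not in ("static", "dynamic"):
                raise ValueError(f"unknown mode: {config.mode}")
            sites.append(
                Site(codebase, content_repo, config.domain, config.mode, config.features)
            )
    return sites


# Hypothetical deployments: names, domains, and flags are illustrative only.
DEPLOYMENTS = {
    "external-docs-content": [
        # Assumption: staging is served dynamically; the talk only says it is internal.
        DomainConfig("docs-staging.internal.example", "dynamic", frozenset({"draft-banner"})),
        # Public documentation is static HTML, as described in the talk.
        DomainConfig("docs.example.com", "static"),
    ],
    "internal-hub-content": [
        # Internal knowledge base is dynamic so saved changes show up immediately.
        DomainConfig("hub.internal.example", "dynamic", frozenset({"browser-editor"})),
    ],
}

if __name__ == "__main__":
    for site in plan_sites("shared-codebase", DEPLOYMENTS):
        print(site.domain, site.mode, sorted(site.features))
```

The design point is that nothing about a site lives in the code base itself: adding a website means adding a content repository or a domain config, not forking the code.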
So to look at this visually, we have the one Grav code base that, together with a content repository and domain config files, will produce one website. And then the same code base, together with a different content repo and, say, a domain config file for dynamic websites, will produce the dynamic website. And then to get a static website for the same content repo, all you need is a domain config file for the static version. And then you take the single Grav code base together with the content and produce the static website. In terms of quality assurance, our DocOps engineers are working hard to write tests. For example, we never publish broken links. That's because we have the broken link checker, which produces a report like what you see here, telling writers where the broken link is, so that they can go fix it. We also have security checks, so we have to be very careful, for compliance reasons, about what kind of card numbers we use, for example, in code samples. So this kind of check will always prevent people from publishing if the tests fail. But there are, say, less crucial things, like small style checks or small typos, that we will not stop publishing for. Writers still get the reports for this, and they will fix it, but it won't stop publishing. And one of the great strengths of Grav is customization, which can be done at various levels. So first of all, the look and feel can be customized using various themes. As you can see here, two fairly different-looking websites. We can establish various page templates where we see that there is recurring content of a certain shape. We also build custom components for the UI. For example, a component we built last year was for decision trees.
So for example, for people doing tech support internally or for people trying to choose their particular online payments integration with Adyen, they would need to either read a few pages, or they could take this, answer a few questions, and they'll get a recommended integration. So that kind of decision tree component is something that we built for Grav. And because we have a single code base that can work with several repos, that component can be enabled on various websites, whichever ones need it. And then another neat thing is the extensibility using plugins, which means that we can add new functionality in a modular way. We particularly use this for the internal knowledge base, for Hub, to create integrations with other tools we use internally. So for example, to allow colleagues to easily embed content from, say, the issue tracker or from the internal Stack Overflow, because like I said, we want Hub to be the go-to place for knowledge sharing internally, and therefore it needs to provide good connections to all the other tools that are used within the company. Now, all of this, of course, does not come without its challenges. The team has grown probably about five times, because there was one developer when Alexei gave his talk in 2018, and we now have five developers working on the various projects. Part of getting here was defining and enforcing workflows so that everyone works consistently and predictably, which also makes for easier troubleshooting. Another challenge that came with us having offices across the world was that if the internal documentation was hosted in the Netherlands, our colleagues over in Australia or Singapore were having to wait quite a long time to see the content that they needed. And this came back as a pain point from the users. So we started deploying the internal documentation to servers in APAC as well. So then we also had to deal with syncing changes across servers worldwide. So we've done that one too.
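As an editorial aside, the quality checks described earlier in the talk fall into two groups: checks that stop publishing (broken links, disallowed card numbers in code samples) and checks that only report (style, typos). A minimal Python sketch of such a gate follows. It is not Adyen's tooling: the function names, the allowlist of test card numbers, the typo list, and the use of a Luhn checksum to recognize card-like numbers are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")   # Markdown links: [text](target)
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators
TYPOS = {"teh": "the"}                             # hypothetical style rule
ALLOWED_TEST_CARDS = {"4111111111111111"}          # hypothetical allowlist


@dataclass
class Finding:
    check: str
    page: str
    detail: str
    blocking: bool


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right and sum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def run_checks(pages):
    """pages maps a site path (for example "/setup") to its Markdown source."""
    findings = []
    for path, markdown in pages.items():
        # Blocking: internal links must point at a page that exists.
        for target in LINK_RE.findall(markdown):
            if target.startswith("/") and target not in pages:
                findings.append(Finding("broken-link", path, target, True))
        # Blocking: card-like numbers must come from the allowlist.
        for match in CARD_RE.findall(markdown):
            digits = re.sub(r"\D", "", match)
            if luhn_valid(digits) and digits not in ALLOWED_TEST_CARDS:
                findings.append(Finding("card-number", path, match, True))
        # Non-blocking: style issues are reported but do not stop publishing.
        for typo, fix in TYPOS.items():
            if re.search(rf"\b{typo}\b", markdown):
                findings.append(Finding("style", path, f"{typo} -> {fix}", False))
    return findings


def can_publish(findings):
    return not any(finding.blocking for finding in findings)
```

A publishing pipeline built this way would always show writers the full list of findings, and refuse to publish only when `can_publish` is false.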
Right. Moving on to people and how they work together. Grav does enable various ways of collaborating and managing content. And for this part, I'm going to look at the internal docs and the external docs in parallel, so we can better see the various characteristics depending on how people collaborate and how they create content. So the internal documentation is run wiki-style, which means that anyone can make changes. People mostly make changes using the browser editor. We also accept pull requests, but a relatively small percentage of people will raise a pull request. The internal documentation also has page and section owners visible at the top of the page, so that people can easily find out who to contact if they have questions or if they spotted something wrong on the page. And our colleagues from development have also set up certain integrations with code repositories and internal tools to, again, facilitate this communication and knowledge sharing in one place. Now in terms of the external documentation, like I said, there is the group of technical writers who write and maintain the content. Now unlike the people writing content for Hub, people writing docs mostly use their IDE. They commit their Markdown changes and push them to the remote. They will check the state of the docs locally and all of that. So in other words, they use a docs-as-code approach to creating and maintaining content. I kind of mentioned this before, that we have a way for people to suggest changes internally. So we have a technical writer on duty every week who reviews changes coming from colleagues.
Because there is this small group focusing on curating this content, and they are quite tech-savvy and treat documentation as code, there is a lot of reuse and parameterization in the external documentation, trying to find that sweet spot, that balance between writing content in a scalable way, so reducing the maintenance burden, but also without increasing the cognitive load for maintenance too much. But also an interesting thing that has happened since 2018 is that Adyen as a company went from being a payments company to a financial technology platform. So this means that on the one hand, the payments products stayed and became the bread and butter of what the company does and the basis for a lot of other products. And those products had more and more releases, so there have been iterations, several versions to document. But on the other hand, extending into financial products meant that a series of other new products were built which also needed documentation. So there has been a lot of content growth along both of these axes. So let's have a look at how the company growth is reflected in documentation as well. So in terms of internal documentation, we all use it, right? So the fact that the company has grown about three times since 2018 means that the audience has also grown just as much. The content itself has grown two and a half times. Last I checked with the product specialist for Hub, we had more than 12,000 pages. That is a lot of content. And those 12,000 pages have been written by over 3,000 people in the last four years. Yeah, I was quite surprised when I first saw these numbers. Now for the external documentation, the audience has increased even more than the internal one. So this is another aspect of how company growth can be seen in docs analytics. The amount of content we have in the docs has also undoubtedly increased with all the new versions, all the new products.
But somehow in a more controlled, curated way. And maybe when we look at the challenges in the next slides, it might become more apparent why the growth for external documentation has been a bit more controlled. And of course, it's definitely worth mentioning that more than 350 colleagues have suggested changes in the last three and a bit years, because the suggest-changes flow is a bit more recent. So we get on average 100 people a year making suggestions, and that is very, very valuable to us, because with thousands of pages of documentation, the help that we get from people suggesting changes is invaluable. And now let's look at the challenges that come with the growth in number of people and growth in content. So again, looking at Hub, the internal documentation, what the last four years have taught us is that we need to have a good strategy for content ownership long term, because time passes and people come, people go, but the content remains. And there needs to be a plan in place for what happens to that content and who becomes responsible for it. There's also the issue of broken links. So I was saying earlier that in the external documentation, we never publish a broken link. And that is true. But with internal documentation, we don't have that kind of flow in place, because the editing flow is also different. And we are looking into possible ways that we can let people know when links that they have in sections that they own are broken. This normally happens when other parts of the internal documentation that they refer to have been restructured, or pages have been removed or reorganized. So yeah, that's definitely something we're looking into. Also, empowering people to write good content is important, and making them feel responsible for the content that they've written, because writing the content is not enough. It's great that they have written it. But this mindset that once you've written something, you're responsible for it is something that we need to work on a bit more.
Now, in terms of the external documentation, one of the pain points that we see from feedback is that it's easy to fall out of sync with other platforms that we have. So for example, documentation versus the marketing website or versus the internal documentation. And this is because the same information is maintained manually by different people. Needing to scale complex content means that we need to have some way of having the same information shown everywhere, offering a consistent experience in terms of navigation and things like that. We already have ideas for how to tackle this. I'm going to defer this to the last section about the future. Something else that we get: as I was saying, with the passing of time there are several software versions, and we find that users do ask for older versions. So versioning for documentation is something that we're also working on. And that can be a whole separate talk in itself. So maybe see you next year. And as you might have seen, I've been avoiding the last point for both, because finding relevant content quickly is a continuing challenge, especially in a climate of growth. And that is where search and information architecture have to be on point. And continuously iterating on that and seeing what works and what doesn't work for people is a core part. But then also looking at how to show people only the information that is relevant to them is something we're also exploring. Now, in terms of collaboration with the open source community, we have a direct relationship with the Grav project maintainers. We've sponsored building certain plugins, which we have then open sourced. So some plugins are for our own internal use because they are very business-case specific. But we've also collaborated on plugins that we've then open sourced. And we also contribute bug fixes to the code repository.
Moving on to look at the future, I'm only going to look at the biggest challenge, which I am very excited about tackling: scaling consistently across all the platforms that we have. And now that I'm looking at this diagram, it kind of looks a bit like a present, doesn't it? With a hat. So this is about the inconsistency that I was saying we're starting to see between the different platforms, not all of which are running on graph. We want to create a single source of truth. And the interface for inputting the information for the single source of truth is probably also going to be an instance of graph, using the same single code base that I was mentioning at the beginning. This is for the kind of information that cannot be generated automatically from code. So this is not for things like API reference. This is for things like emergent properties of various features, something that is not just a flag in the code. This is something that the product teams will be responsible for. And using graph for this means that they'll have the same familiar user experience when inputting information about their product as well. So then in the future, other systems, whether they're running graph or not, will be able to use the same information, which will be accurately rendered in all these different portals. So maybe see you in another four years to tell you how this one went. So let's move on to the Q&A session. Thank you for your attention and see you in a bit. Okay, so thank you. We just have a couple of minutes for questions, and then in three minutes we'll get into the next talk. So if you have any questions, don't hesitate to put them in the chat. Yeah, I did put a question. Me, Andrea, the speaker, have put a question in my own talk.
Because I was wondering, I was curious how many people would already be familiar with graph, because I myself only found out about it when I started my current job. So, yeah, I was just curious whether there are people within the community who use it, but it seems that, at least amongst the people here, it is not that well-known. Indeed, on my end, at least, I didn't know about it. We first started talking about it, I think, one or two years ago, but it's nice to see that it's also evolving. Yeah, there has been one. Yeah, sorry, there is a bit of an echo, I think. Yeah, I'm not sure which stream to mute. Should one of us maybe read the question that was asked in the chat? So, the question from Mungal: is there any integration with ECM CSP products to include documents and preview on the website? I can see that NextCloud is provided, but only for backup purposes. Yeah, so, like I said, we do use NextCloud, but we don't embed it in the internal knowledge base per se. So, for any need to access that, we link out. I'm pretty sure part of the reason for that is that some NextCloud files have restricted access, and having this double layer of access restriction seemed a bit unnecessary, but I don't think we've had requests for it either. Cool. So, if anyone wants to ask more questions, the room that we are discussing in is now open, so I guess that you can take questions there. On my end, I will switch to the next talk for the dev room. So, thanks a lot, Andrea, for coming and thanks for your presentation. It should be online on the website very soon, if I'm not mistaken. Yeah, thanks for having me. I'll see you around. Thank you. Bye. Bye. |
Creating a content pipeline with Antora
Using AsciiDoc content for the website and other downstream processes |
Creating a content pipeline with Antora, welcome to this talk here at FOSDEM. Watch this talk to learn how to use AsciiDoc content from multiple sources for multiple websites and other downstream processes. In this talk Fabrice and I show you what problems were solved in the Eclipse Che project to allow for a shift left in the creation of the content and to automate all the things. After describing why there was a need to change and what was used in the solution, there's a demo showing how to use the different building blocks together. Let's get started and let us introduce ourselves. I'm happy to introduce to you Fabrice Flore-Thébault. He's also working at Red Hat together with me, on a very different project though. His superpower was knowing the ways within Red Hat around documentation, so he helped me a lot to figure out how to feed downstream processes, something that I wouldn't have been able to figure out without him. He was willing to do things differently than before, so we tried out new stuff and he volunteered to use that in the Eclipse Che project that he was working on at the time. The superpower of Fabrice is that he's both a technical writer and someone who can do programming and automation, probably due to his previous life. He's been working in the automation and Ansible world quite a bit, so that's a very, very valuable skill for a technical writer. Thanks for having that and bringing that to the project. Today with me we have Alexander Schwartz. First, he is the author of the IntelliJ AsciiDoc plugin that has Antora support, and that's a game changer for all the authors who are working with AsciiDoc and Antora in the world, so that's the first reason for being a superhero. Also, Alexander has a meetup that happens, I don't know, once per month, maybe less, on topics that are relevant to documentation, so that's a very interesting meetup. What is the context, in terms of content management and collaboration, that we have here, Fabrice?
Can you introduce us to the topic? Yeah, so the big picture is, let's apply the shift left principle to content management. So I work at Red Hat, and at Red Hat we have a successful development model, which is that we develop things upstream and then we transform the upstream project into a downstream product. We call that productization or downstreaming, and moving the development effort from the product to the upstream project is what we call the shift left. Is this model also driving documentation? For the most part, the answer is no. That's not good, but it can change and probably it should change. So how do we apply this shift left model to the documentation? That's simple: you take your technical writers, who are usually bound to the product, and you let them contribute to the upstream project and collaborate with engineers. That's super simple. For upstream, there is an immediate benefit: the community has enterprise-grade documentation, so it means better docs. Of course better docs, because if you pay your writers and you don't have better docs at the end, what would be the sense of it? Good docs have commonly been identified as a key feature of a successful community. We have multiple surveys and reports saying that people look at the docs first when they come to a project, and good docs mean people go a little bit further than just looking at the marketing homepage, and maybe they stay a little bit longer. So in the end, good docs mean a happy community, and a happy community probably means users and then customers. So that's good. For upstream it's obvious, but what about downstream? You're making things maybe more complex, and you put your technical writers in a place where they are not used to going, so that can be difficult. But there are many good benefits. The first one is that you kill multiple birds with one stone: you write your content once and you can publish it in multiple places.
The second one is that your content grows faster. Your writers can focus on enhancing the content that the community is producing, and they are more productive. The third one is the time to market, because you automate everything, of course. So the documentation that is produced in the open lands immediately in the product. And that's something that is really important: once you have merged your content in the project, you have product documentation that has it directly. I mean, there are cases where the documentation is written in the community, but you don't have a technical writer there, and the role of the technical writer is very often a police role: yes, okay, that's fine for a community, but for product documentation you should not write it like this or like that. So it takes time. Something that you can write in five minutes, and that is okay in a wiki upstream, then needs to comply with more rules when you write it for downstream. So how do we solve this pain, this problem? Here we have two places where we can push left. Push left for docs means single sourcing, and single sourcing as much as you can. It means you write your content as far left as you can in the chart, and everywhere else you just use it and you don't rewrite it. Left means upstream for you, right? Left is upstream: it's in the open source project, and it happens immediately when the code is being written in the first place. We have two opportunities to shift left here. The first one is to get content from the source code of the application, exactly the content strings that are used in the user interface. So for a configuration field, the description field is micro content. You can reuse it in the docs directly. So you can build your reference guide based on the strings that are present in the code, and you can automate that.
When you do that, if a configuration field is removed, then the configuration field is removed from the docs too. If a new field appears, then the next iteration of the documentation publication will publish that new field. So that's important. I mean, that's not the main focus of this talk, but that's something where you can write ad hoc scripts in your preferred language. That's quite a common problem in docs, I believe. The other shift left opportunity is to have your content written in the upstream project and not in the downstream product. Really, you write the content upstream. You don't do any authoring of content downstream. It's not something where you copy and paste files and adapt them by hand. There is no human interaction. There is no human adaptation. It's a fully automated process, and that's important. This part is quite hard, because you have to identify all the variations and you have to find a solution for every single point where there is a variation in content. Can you give an example of such a variation? We have to draw up a map of these variations, and the first kind of variation is inline content. For example, your project has a name, your product has a name, and it's not the same. The project has a version, and the product version is not the same. In the project I have worked on for three years, for example, the project is called Eclipse Che. The product is called Dev Spaces, or Red Hat OpenShift Dev Spaces. The project is working on Kubernetes, so the orchestrator is Kubernetes. For the downstream product, the orchestrator is OpenShift. OpenShift is just a distribution of Kubernetes, but the product is not meant to be installed on just any Kubernetes. So that makes a difference. So that's good for a couple of words, a sentence at most, less than a paragraph. You can work like that.
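To make the inline variation concrete, here is a minimal Python sketch of the idea, not the actual AsciiDoc or Antora implementation. It assumes the shared source text refers to names through curly-brace attribute references, and the attribute names `prod` and `orch-name` are invented for this illustration; the values are the project and product names mentioned in the talk.

```python
import re

# Hypothetical attribute sets: the attribute names are assumptions made
# for this sketch, the values are the names mentioned in the talk.
UPSTREAM = {"prod": "Eclipse Che", "orch-name": "Kubernetes"}
DOWNSTREAM = {"prod": "Red Hat OpenShift Dev Spaces", "orch-name": "OpenShift"}

# Matches references such as {prod} or {orch-name}.
ATTRIBUTE_REF = re.compile(r"\{([A-Za-z0-9_][A-Za-z0-9_-]*)\}")


def render_inline(text, attributes):
    """Replace {name} references and leave unknown references untouched."""
    def substitute(match):
        return attributes.get(match.group(1), match.group(0))
    return ATTRIBUTE_REF.sub(substitute, text)


# One source sentence, two publications.
SOURCE = "Install {prod} on {orch-name}."
print(render_inline(SOURCE, UPSTREAM))
print(render_inline(SOURCE, DOWNSTREAM))
```

The same sentence is authored once and only the attribute set changes per publication, which is why this approach works for a few words or a sentence but not for whole blocks.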
Then you have block content, so it's more than one sentence, more than one paragraph: a code block, an image, screenshots, a table, a diagram, a full procedure, a full set of procedures. It can be a lot of things. Then you have three cases. You have content upstream that is different from the content downstream. You have content that is upstream and not downstream, and content that is downstream and not upstream. So at this point it's quite complex. But there is another problem, which is maybe specific to Red Hat, but I'm not sure. It's how you can use modern tools upstream. How can you have some freedom in the tools that you are using upstream when you have a downstream toolchain that is very specific and has very specific requirements? In Red Hat's case, the customer portal publication toolchain is quite old. That's normal, because when you have to move, you have to move the toolchain for all the products, so it's something that moves much, much slower than a project. But then the question is how you can satisfy the downstream toolchain and still have something where you have fun upstream, where you have modern tools, and where you just don't do things that the community finds stupid, because for the community downstream doesn't exist. So you also have to think about that. For the community, downstream is not a reality. Downstream is just hidden, and you cannot say, let's do this for downstream, and believe that the community will accept that. That's not the case. When it's stupid, it's stupid. So when it's something that you do only for downstream, the chance that the community accepts it is quite low, and that's normal. And I believe that's really good, because it's pushing the writers to do something better and to question some of the processes that you have downstream.
Yeah, and upstream is then the place to innovate, to try out new things that then eventually get adopted in other places. So move fast, innovate in the upstream project. In my experience, the upstream project is super innovative, because we are using Antora, and not only Antora: we are using Antora extensions that are still in alpha stage, and we are successful with that. So what we do upstream is quite exactly the opposite of what we can do downstream. We can use alpha software and we can be successful with it and be happy with it. So that is a great introduction to the solution, right? So what do we do in the solution then? What did we use to make this shift left happen? We have found everything that we needed in the Antora ecosystem. That's the thing that is completely unbelievable. Not all the tools were there in the beginning, but now they are, so now it's time to talk about them, because almost everything is off the shelf. The first thing is Antora in general. Antora is built to have multiple publications. It's built for an upstream and downstream model. So you don't need to add anything to Antora, and you can have a publication for upstream with one product name and a publication for downstream with another product name. And you can also have variation in content. It's all built in. That's the first thing. Then, to get content from source code, there is a very nice extension called Antora Collector. Antora Collector is more like a tool that lets you use your own code. It's a kind of scheduler that will execute a script at build time and make sure that it runs every time. So we are using that, for example, for the Che docs, to build the reference guide for the configuration of the application.
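The kind of script that such a build step could run can be sketched as follows. This is a hedged Python illustration, not the script used in the Che docs: the field names, defaults, and descriptions in `CONFIG_FIELDS` are invented, and in a real setup they would be extracted from the application source code rather than hard-coded.

```python
# Hypothetical configuration fields, standing in for the description
# strings that live in the application source code.
CONFIG_FIELDS = [
    {"name": "server.port", "default": "8080",
     "description": "Port the server listens on."},
    {"name": "workspace.timeout", "default": "30m",
     "description": "Idle time before a workspace stops."},
]


def build_reference(fields):
    """Render the fields as an AsciiDoc table, sorted by field name."""
    lines = ['[cols="1,1,2", options="header"]', "|===",
             "|Name |Default |Description", ""]
    for field in sorted(fields, key=lambda f: f["name"]):
        lines.append(f"|`{field['name']}`")
        lines.append(f"|`{field['default']}`")
        lines.append(f"|{field['description']}")
        lines.append("")
    lines.append("|===")
    return "\n".join(lines)


if __name__ == "__main__":
    # A build step would write this output to a generated page
    # instead of printing it.
    print(build_reference(CONFIG_FIELDS))
```

Because the table is regenerated on every build, a field that disappears from the source disappears from the reference guide, and a new field shows up in the next publication without anyone editing the docs by hand.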
And the last bit is Antora Assembler, another Antora extension, very new. We are just hijacking an extension that is there to produce a PDF: we use it for the intermediate artifact that is required to produce the PDF, which is very helpful in our case. So Antora Assembler is taking an Antora website, an Antora component, an Antora something, and it's processing it and putting everything in one monolithic AsciiDoc file where all the Antora complexity is hidden. That means that at the end you can have a publication tool that is very simple, and you don't need all the complexity of Antora to do your publication. So Antora Assembler is then the key for us to export the content in a way that all our downstream processes can successfully work with all the content from upstream. We talked about Antora and its extensions, and how they solve things for you and the Eclipse Che project. So what's the difference for you if a project picks Antora? What's different from before? What are the unique features that you benefit from when using Antora? I can speak in general, and I can also speak with a very specific thing in mind, because the project was using Jekyll before Antora and we had a problem with Jekyll. We solved the problem with Antora, but there are things that are quite special. The first thing is that Antora entirely decouples content from rendering and publication. So you can have the playbook that is used for publication in one repository, you can have the UI, so the theme, the CSS, the presentation layer, in another repository, and you can have your content in one repository, in multiple repositories, in branches, whatever you want. So it's really like HTML and CSS. It's the same principle. The content is living somewhere, the presentation layer is living somewhere, and then the thing that puts all of that together is living somewhere else. That's at the root of Antora, so that's the first thing.
And then these things enable content variations, and Antora is built to handle content variation. It's just made for that. That's the thing that is so good. So it's made for an upstream and downstream model. It's supporting AsciiDoc attributes out of the box, and it's built on the fact that you are using AsciiDoc attributes. It's using one feature of AsciiDoc, which is includes, and it's using that everywhere. AsciiDoc defines includes generally, and Antora does a classification of includes. So you have examples, you have images, you have attachments, and you have partials, which are just snippets of content that don't get their own URL. You can also have distributed components. It means that you can have one repository where you have the content and a distributed component where you have additional files. That's what we are using for downstream. So the downstream repository has specific examples, specific text content, or specific images, such as the screenshots that show the product UI. And it's as simple as putting the file in the correct directory, and there is nothing more to think about. Before Antora Assembler, we were juggling with directories where we overwrote content in this directory but not in that one. It was a mess. Even for writers, it was difficult to understand. Ah, but my file was overwritten? Yes, because this file must be edited upstream first. And with Antora Assembler, the upstream files are not in the repository, they are really in a cache. They are just downloaded at build time. They don't appear in the Git history and there is no confusion. So that's one of the big, big, big wins of Antora Assembler. And pushing further, with Antora Collector it's the same thing: the content that you generate doesn't live in the repository.
So one thing that is really, really interesting with Antora is that the content that is in your Git repository is the content that you are authorized to edit, and the content that comes from another content repository is content that you don't have to commit into your repository. Antora Assembler and Antora Collector are using that as a convention. So there is less confusion for the writers, for the authors, for the contributors, because a file that is in Git is a file that you can edit. The problem that we had before came, let's say, with the generated reference guide. The output of the script was stored in Git. And then when we were doing an edit on another file, the script would update the generated file, and it would end up in an endless discussion on Git: but why are you committing this file? It's not in your content, et cetera. That was an endless discussion because it was not clear. And with Antora Collector and Antora Assembler, everything is clear: if it's in the repository, you can edit it. Yeah. And on the other hand, everything that's generated will never end up in a repository. Everything that's generated only exists as the code where it comes from, plus a shell script or whatever else you need to generate it, and it's never going to be committed to Git. Right? It's not committed to Git. We have a build artifact, which is the monolithic AsciiDoc file, and the toolchain requires this build artifact to start doing its own build. And the nice thing is then that Antora Assembler creates an output that could have come from anywhere, and it doesn't really need Antora anymore to render. It's just plain AsciiDoc that comes out of that, and everybody can render it. Any toolchain that understands AsciiDoc can render the monolithic AsciiDoc file that's produced by Antora Assembler.
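As a rough illustration of what "one monolithic file" means, the following Python sketch flattens a few pages and the snippets they include into a single text. It is a deliberately simplified model and not how Antora Assembler is implemented: the file names and contents are invented, resources are looked up in a plain dictionary, and only include directives that stand alone on a line are handled.

```python
import re

# Matches a line such as: include::partial$intro.adoc[]
INCLUDE = re.compile(r"^include::(?P<target>[^\[]+)\[[^\]]*\]$")


def resolve_includes(text, files, seen=()):
    """Inline every include directive, recursively, guarding against cycles."""
    out = []
    for line in text.splitlines():
        match = INCLUDE.match(line.strip())
        if match is None:
            out.append(line)
            continue
        target = match.group("target")
        if target in seen:
            raise ValueError(f"circular include: {target}")
        out.append(resolve_includes(files[target], files, seen + (target,)))
    return "\n".join(out)


def assemble(page_order, files):
    """Concatenate the resolved pages into one monolithic document."""
    return "\n\n".join(
        resolve_includes(files[name], files, (name,)) for name in page_order
    )


# Hypothetical content, keyed by an opaque resource name.
FILES = {
    "index.adoc": "= Guide\n\ninclude::partial$intro.adoc[]",
    "install.adoc": "== Install\n\nRun the installer.",
    "partial$intro.adoc": "Welcome to the guide.",
}

if __name__ == "__main__":
    print(assemble(["index.adoc", "install.adoc"], FILES))
```

The result contains no include directives any more, so a downstream toolchain only needs to understand plain AsciiDoc text and nothing about where the pieces originally lived.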
If you say Antora is a great choice with Collector and with Assembler, you still have to write AsciiDoc. So what do you think about AsciiDoc and this single sourcing approach? Would you rather use another markup language, or are you happy with AsciiDoc as it is today? So Antora is so much based on features that are only in AsciiDoc that it's difficult to imagine Antora for Markdown, because it's based on AsciiDoc attributes. There is no such thing in the Markdown ecosystem, or not exactly. So that's one thing. I believe you have projects that extend Markdown where you can also use variables, so that's not impossible. But then there are the includes. I'm now working on a project using Docusaurus, and it's extended Markdown. It's different. It's more complex. I find AsciiDoc pretty simple with these include statements. They are simple, they enable a lot of things, and Antora is based on them. One of the pillars of Antora is attributes, another one is includes, and the third one is cross references. And these are three features that are very specific to AsciiDoc. And of course, AsciiDoc also has very nice features for the author. Then it's a matter of taste. I know that we have discussions with developers about the syntax, the formatting, all of that. That's something where people can be very opinionated. As an author, I really like AsciiDoc because it's very consistent. For inline stuff, the syntax is always the same. For block stuff, the syntax is always the same. For tables, there is one implementation of tables and not 20. The punctuation is very well defined, so that's good. You have extensions. You have Antora extensions, which we talked a lot about, but you also have AsciiDoc extensions, meaning you can add diagrams. So Kroki diagrams, you can add them to AsciiDoc. You have extensions to produce PDF, to do presentations with reveal.js, and to process AsciiDoc files directly in your browser. The ecosystem is very much alive.
We have created a small demo project which contains Antora, Antora Collector and Antora Assembler. In this demo, I'll walk you through this project. The link to this project is also on the slide with all the links at the end of the talk. Let's have a look at the code. Like all good projects, it has a readme which walks you through what to find here and how to use it. While a real project with Antora is usually split into multiple Git repositories, this demo combines everything in a single repository, as that makes it simpler to run local experiments. Splitting things into multiple repositories allows for advanced features and a separation of responsibilities for the different parts in a production setup. The docs folder contains the same component in two different versions. Version 1 has the standard Antora folder structure. Version 2 also uses Antora Collector to run a script and pick up the generated files as if they were checked into Git. The start page is an Antora component with the special name ROOT. In this example, it is also not versioned, so there is only a single version. Looking at the playbook folder, this contains everything needed to build the documentation site. It uses Node to install and run Antora and its extensions. All dependencies are listed in the package.json file. There are two playbooks: one to build a production site, and a second to build an author's preview from the local work tree to show changes which haven't been committed yet. The playbook lists, among other things, components, extensions, the UI bundle and AsciiDoc attributes. Antora Assembler needs another configuration file to specify how it should create a PDF. Optionally, it can keep the created monolithic AsciiDoc file so you can use it in an additional build step. As an alternative, one could use the plugin as a blueprint to create other output files in your own plugin. There's a very small demo extension which accesses the attributes of a page to further process them.
In this example, it prints the front-matter attributes of an AsciiDoc page on the console. The supplemental UI folder contains several small customizations for the default Antora theme. Depending on your production setup, a full custom theme might be better. Have a look at the Antora UI project for details on how to do this. For this demo, it was sufficient to override some of the partials of the original theme with modified partials, for example, for the header, the footer and the outdated page banner that I will show you in a second. Let's look at the site that has been generated. The start page doesn't have a navigation outline specified in the component descriptor, therefore it shows a default with all components and versions listed there. When I choose the content module version 1, I see a warning that I'm looking at an outdated version, with a hint where to navigate to the latest version. On version 2, there's a link to a PDF containing the content of all pages in this component. On the page with the page attributes, I see some of the attributes that are available when rendering a page. Those attributes come in handy. They allow me to render information in the footer that links to the revision used to publish this page. I can also add a feedback link, which opens a GitHub issue referencing this page and its version. This is a simple mechanism to let users provide feedback on my content. At the top right, there's a link to edit this page on GitHub. When I go to a generated page, this page doesn't have the edit button, as it has been generated. This is the end of the demo tour. Give it a try yourself to find out what's possible with Antora and its extensions. Thanks for watching this talk. We hope you found some things you want to try out in your current or next project. Here are some links to Antora and its extensions and a link to the demo repository. On the right, you see the links to our blogs.
When you are watching this live at FOSDEM, we're looking forward to a vivid Q&A round. If you watch this as a recording, visit our blogs and get in touch by email or our social media accounts. We'd love to hear where you're using or plan to use Antora in your projects. Hey, we're live, Fabrice. We have 10 minutes of Q&A time, don't we? We are live? Yeah, we are live. So, people can listen to us in the whole world. We talked about this earlier, but we don't have so many questions yet. You said Antora changed some things for you with the simplicity of the workflow. I think you first started with a Bash script that was then replaced. So, really, 2022 was a year of big changes, because the new version of Antora was published with support for extensions. Then in late 2022, the first extensions arrived, and they changed what we can do. It's an enormous change in simplicity. I had been doing this for three years with, I don't know, 800 lines of Bash and grep. It was breaking very often, so there was lots of time spent fixing stuff because the content was broken. Now we have something that is done in a few lines of YAML, just a configuration file, and there is still a little post-processing Bash script, but it's 50 lines, it's small. So, the simplicity of the process has dramatically improved, and there is another extension that is coming. It's Antora Atlas. We didn't talk about it. No, we didn't. Yeah. Antora Atlas is something where you can build just a subset of a documentation website. So, you can have a very large customer portal, so to say, or something like that, with a big, big, big amount of documentation. If you want to build all of that in one go, it takes a big amount of time. But with that extension, you can just build a part of it, and it goes much, much faster. So that's really something that's going in the right direction to onboard Antora for large documentation sites or in companies that have a large body of documentation.
So that's really good news. And I saw that with the Spring Security project. I think they are one of the users of the Atlas extension for Antora. So that's very interesting. But you said you did some post-processing with Bash on the output of the Assembler extension, probably. Mm-hmm. And I'm preparing this in a similar way for the Keycloak project. For that, I kind of forked part of the Assembler extension and moved all the post-processing that I needed from a Bash script that I had into the JavaScript code. That makes it a bit more maintainable: I have more syntax support and some validation support. And yeah, so all these extensions are there now. And so maybe one question in the scope of Antora Atlas that you talked about just before here in the Q&A: do you already know some projects that could be starting to use it, to try it out, in order to have a real-world use case? Or maybe an organization you're talking to about that? I would suggest going to the Zulip chat of Antora and asking there. It's getting developed right now. So the author is saying every week, hey, I have a new alpha version and now this is working. It's just really impressive because you see the progress every week. And it also makes him revisit some parts of the Asciidoctor code along the way. So it's really impressive how it's a continuous improvement process, and Asciidoctor will become better from it. So it's probably done for someone, but I don't know exactly for whom. But Spring Security would be the example, and I can post a link in the chat so people can follow up on how that's going. In the meantime, we also have one question from Benson in the chat, which is: is Antora accessible for screen readers? Well, I haven't tested this with screen readers myself.
So I need to be very cautious about what I say here. The thing is, it's plain HTML, and that helps. It's not so much JavaScript and not a single page application, and that's probably also a help. I'm not sure if they put many ARIA tags in there. I know that in the standard theme there are some, but if you want to really optimize it for screen readers, you might need to put some more effort into that, to see that the ARIA tags are right when it comes to the theme, to the, I would say, outline navigation and all that stuff. For ARIA tags in the content in the middle, I would say, maybe ask in the Zulip chat. I'm also on the Zulip chat there, and then get pull requests into the standard Antora theme. Yeah. The one thing to know is that there is a default UI, so a default presentation layer, but you can change it. Some projects have made an entirely different UI. I believe it's accessible like most of the documentation frameworks that we have now, but I'm not able to give a very prescriptive answer to that, because I have not checked it. Okay. Thank you. And Fabrice, what kind of extension did you add? I think you added an external link checker extension to Antora in your project to do some additional validation. Yeah. I used the Collector. With the Collector extension, you can call external scripts. I did that to call Vale, the style guide linter, to verify that the content is compliant with the style guide. And I am also calling htmltest to verify the external links. Antora is already checking all the xrefs, so if there is a broken xref, you will have a warning, but I added something to test the external links. The really nice thing is that Antora is in charge. Before that, we had Gulp launching all of these additional tools.
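To show the kind of validation being discussed, here is a small Python sketch that reports cross references pointing at pages that do not exist. It is only an illustration under simplifying assumptions: Antora performs its own xref validation with full resource IDs, and Vale and htmltest are separate tools with their own behavior. The page names and contents below are invented, and targets are compared as plain file names.

```python
import re

# Matches xref:target.adoc[...] and xref:target.adoc#fragment[...],
# capturing only the page part of the target.
XREF = re.compile(r"xref:([^\[\s#]+)(?:#[^\[\s]*)?\[")


def find_broken_xrefs(pages):
    """Return (page, line number, target) for every xref to an unknown page."""
    known = set(pages)
    broken = []
    for name in sorted(pages):
        for line_number, line in enumerate(pages[name].splitlines(), start=1):
            for match in XREF.finditer(line):
                target = match.group(1)
                if target not in known:
                    broken.append((name, line_number, target))
    return broken


# Hypothetical pages keyed by file name.
PAGES = {
    "install.adoc": "See xref:config.adoc[Configuration] and xref:missing.adoc#sec[gone].",
    "config.adoc": "Back to xref:install.adoc[Install].",
}

if __name__ == "__main__":
    for page, line_number, target in find_broken_xrefs(PAGES):
        print(f"{page}:{line_number}: broken xref to {target}")
```

A check like this is cheap enough to run on every build, which is the point made in the talk: once the build tool owns the validation steps, the same checks run for the upstream project and for the downstream product.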
And yeah, Antora is in charge of everything. So in terms of who owns the process, it's much clearer. And it's also reusable: the checks that you add in the project are now running in the downstream project too, so that's really, really nice. You only have to write your pipeline once, which is so cool. The difficult thing is to understand what belongs to the modules, so to antora.yml, and what belongs to the playbook, the Antora playbook YAML, and what happens in which area. It can be confusing at the beginning. Yeah. The playbook is the global configuration for the whole site. And then for each component, you can have a separate configuration in the component, in the antora.yml file. And this is always specific to a version of that component. If you have a version 19, then you have an antora.yml file for version 19 living in the 19 branch of your software; for the version 20 component, the descriptor antora.yml lives in the 20 branch of your software. And they can have different validation rules. They can run different scripts. They can run whatever is necessary for that version to create the content that you want to show on the website. So it's really divide and conquer here. One thing is the global stuff and the other thing is the version-specific stuff. Okay. So the session will end in about one minute, then we'll go to the next session, which is about Tribe, I think, a content structuring and collaborative framework. If you want to ask further questions, a room will pop up in the chat right away, and you will be able to access it and talk to the speakers, to Fabrice and Alexander, who are there. So thank you to both of you.
And I hope that we will see each other again, so maybe next year. Thank you. See you there. Bye-bye. Bye-bye. Welcome everybody. Now there are people who will maybe come here on the screen, or maybe not. Okay, now the room is public and other people could join in; let's see if that happens or not. I think there were like six or seven, well, maybe two people really watching this talk at the moment. Let's see how it works. They will probably go to the next talk. Yeah. Let's wait. So let's wait what, five minutes? Yeah, I think so. Just wait five minutes and then we put the final statement in the chat saying, welcome. Oh, sorry, goodbye everyone. That's it. I don't know if you follow on Mastodon. Hmm? Do you follow on Mastodon? Yeah, Mastodon. Where do you follow? No. Did you write something with the hashtag Antora? No. AsciiDoc, maybe. Okay, I don't know. Did you write something about FOSDEM? Yeah, but finding users is just painful, and my instance is offline or I don't know. What instance are you on? So you can find me, maybe. That's a small instance. I believe nobody's coming now. Yeah, thank you. I appreciate your documentation. So this is the real pipeline.
I don't know what happens when you move this out from a server. Okay. So nobody here. Okay. |
Tribe - a content structuring and collaborative framework
JSON compatible and opinionated content-first framework |
Hello everyone. This presentation is about Tribe. Tribe is a collaborative project management framework. It is collaborative in that it affects the ways in which people from different teams (UX, design, content and development) work together. It gives them a set of conventions that smooths the collaborative process. We call it a framework because it has a well-defined folder structure and a set of conventions that are extendable, and we are putting them out in the community so that together we can extend them further. It has a defined tech stack, which is again extendable, and a set of licenses that enable open source collaboration. Through this presentation we'll take you through all of these aspects to some extent. I'll talk a bit about the Hall of Nations, which is used throughout this presentation as a metaphor. It was a beautiful building, an exhibition space in New Delhi built in 1972. The brief given to the architects was that it must be able to exhibit objects of varying sizes, from books to satellites. So the architects came up with a design solution within the constraints of available building material, available skill sets, cost and so on. They broke convention by imagining a building without pillars. Also, the building material was not prefabricated; it was built on site. Coming back to Tribe: first, a brief introduction. Tribe has been progressively built to provide one language, or a unified set of conventions, for the following three aspects: relational database structure, variable type handling in backend scripts, and handling of the HTML input fields typically used in CMSs, to smooth the data entry, data editing and moderation processes within a project. A bit of info about our team: we're a team of four and we run a small services company called Postcode. It was through our collaborations that Tribe was progressively built. Over the last three years we've done
around 90 projects. We've worked with various external development teams, UX folks, design studios and content teams. We have used Tribe to build websites, products, applications, Android and iOS applications, platforms and chatbots. In terms of tech stack, the core backend is PHP and MySQL. We use Nginx as the web server, and Composer packages whenever a project requires them. Tribe provides API support with JSON:API version 1.1 to support any kind of frontend; we typically use Node.js applications on the frontend. Before we delve into the technicalities of how Tribe works, I'd like to talk a bit about why we created it. We needed more backend agility to support iterations of UX, content, design and frontend decision making. To work out how to build a more agile backend, we needed to know what slows backend development down. First, the constraints of database structure: a rigid table structure does not allow us to constantly change column names and table names. We wanted to accommodate the agility that JSON provides while taking advantage of the indexing, caching and searchability that MySQL provides. Then, while coding API layers, we truly realized the importance of conventions. Looking at how JSON:API has been drawn out really helped us think of more possibilities that could reduce the time spent in backend development. Variable type handling in PHP takes up a lot of time and, in our experience, was not very agile in the collaborative process. For example, from a content person's perspective it seems a very simple ask when they say they need a multi-select instead of a single select. Such a simple ask can be a relatively complex task for a backend programmer, especially on a live project with significant user data. Also, we wanted a dead simple CMS, something we could have across all our projects. And we wanted a single place to keep all our assets and uploads, arranged by some convention so that things are predictably located and can
be looked for at a later stage. Now let's go deeper into the architecture of Tribe. One of the core elements of Tribe is a config file called types.json. It is a pure JSON file that can be configured and reconfigured quickly. It defines the information architecture of the application with precision, and it looks somewhat like this. There are a few other terms that will help me explain types.json and its usage better. The first term is types. A type can be thought of as a data model, or a table in a database. It is non-hierarchical: you cannot have subtypes, and all types are equal in the hierarchy. It can accommodate metadata, which is extendable. A basic example would be a description of the type, its singular or plural form, or what it would be called in a different language. This is how a type looks within the types.json file. Then there are modules. Modules are equivalent to columns in a table or attributes in a model. They allow for complex relational references. For example, you could link a module from one type to a module from another type, or a module from one type could dynamically link to another type. Because types are non-hierarchical, modules allow types to be referenced, thereby creating a hierarchy if needed. Modules themselves are non-hierarchical, so submodules cannot be created. A module also defines its usage in CMS input fields. As with types, the metadata of modules is extendable. This is an example of how modules look. There are four important aspects to notice here. Metadata starting with input_ tells how the module must be used in an input field. Metadata starting with list_ tells how the module must be listed in a list view. Metadata starting with var_ defines how the module must be utilized by the PHP backend scripts. And input_slug can be used as the name of the variable in PHP, the name of the input field in the frontend, and the name of the JSON attribute in MySQL JSON, thereby combining all of these use cases. Then there are
objects. An object is an instance of a model; it is the equivalent of a row in a table. It has a universal ID that never repeats, which is particular to the way Tribe functions. This allows objects to be shifted from one type to another at any stage, even after the application goes live. An object is stored as a JSON object that retains the type and module structure in readable slugs rather than numbered references. This is what an object looks like. Coming to the uploads convention: the uploads folder follows a very simple convention of year, month, day. Currently on Tribe, image file versions are created automatically. Junction is our content management frontend system. It is a headless CMS, a frontend system built on Ember.js, which is a Node.js-based framework. It also connects to the uploads folder. The view you are seeing is the list view of a single type, and this one is the editing view of a single object. Towards the end, I'm just reiterating our belief that convention is key to collaboration. And a word about licenses: Tribe is under GPL version 3; Junction, the CMS, uses the MIT license; and types.json uses a Creative Commons license. Image courtesy: Raj Rewal Associates, the people who built the beautiful building. A few shots of the building, and that's it. Thank you, everyone. |
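The slides showing types.json and a stored object are not part of this transcript, so the following is only a rough sketch of the idea the Tribe talk describes: types that list modules, modules named by input_slug, and objects that keep one never-repeating ID so they can be shifted between types. Only input_slug and the input_, list_ and var_ prefixes come from the talk; every other key, value and function name here (description, singular, plural, input_type, list_field, var_type, create_object, move_object) is a hypothetical stand-in, and Tribe itself is written in PHP, not Python.

```python
import itertools
import json

# Hypothetical types.json content. Only "input_slug" and the input_/list_/var_
# prefixes are taken from the talk; all other keys and values are illustrative.
TYPES_JSON = """
{
  "article": {
    "description": "A published article",
    "singular": "article",
    "plural": "articles",
    "modules": [
      {"input_slug": "title", "input_type": "text", "list_field": true, "var_type": "string"},
      {"input_slug": "tags", "input_type": "multi_select", "list_field": false, "var_type": "array"}
    ]
  },
  "note": {
    "description": "An internal note",
    "singular": "note",
    "plural": "notes",
    "modules": [
      {"input_slug": "title", "input_type": "text", "list_field": true, "var_type": "string"}
    ]
  }
}
"""

types = json.loads(TYPES_JSON)
_next_id = itertools.count(1)  # one counter shared by all types, so ids never repeat


def module_slugs(type_slug):
    """Return the set of module slugs that a type declares."""
    return {module["input_slug"] for module in types[type_slug]["modules"]}


def create_object(type_slug, values):
    """Build an object that holds only modules declared by its type."""
    unknown = set(values) - module_slugs(type_slug)
    if unknown:
        raise ValueError(f"modules not defined for {type_slug}: {sorted(unknown)}")
    return {"id": next(_next_id), "type": type_slug, **values}


def move_object(obj, new_type):
    """Shift an object to another type, keeping its id and any shared modules."""
    kept = {key: value for key, value in obj.items() if key in module_slugs(new_type)}
    return {"id": obj["id"], "type": new_type, **kept}


article = create_object("article", {"title": "Hall of Nations", "tags": ["architecture"]})
note = move_object(article, "note")
print(json.dumps(note))
```

Because the ID is issued from one counter rather than per type, the moved object can keep its identity; modules that the target type does not declare are simply dropped in this sketch, which is one possible policy and not necessarily what Tribe does.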
Open Source Collaboration Tools for Alfresco
Enhancing Collaboration Experience with CSP |
Welcome to this session. Today we are talking about open source collaboration tools for Alfresco: how to improve the online editing experience by using Alfresco as the document repository together with some other well-known open source services for collaborative online editing. My name is Angel Borroy, I'm a developer evangelist at Hyland, the company that produces Alfresco today, and I hope you enjoy this session. Let's move on to today's agenda. In the first part, we are going to review what Alfresco does, so the main features of Alfresco as a CSP; we will go over that term later. The second part will focus on the default applications provided by Alfresco and on the integration APIs, so you can understand what the integrations we are going to explore later add to the product. After that, we are going to focus on the online editing collaboration tools that we are testing today, mainly LibreOffice, Collabora Online and OnlyOffice. We are going to see some live demos of how to integrate them with Alfresco and how to use them with the Alfresco applications. And finally, a recap and some thoughts about the future of the integrations we are going to explore. So let's move to the first topic, Alfresco as a CSP. Content services platform is a term that was coined by Gartner back in 2014. Mainly it means that there is a set of services, APIs and repositories that work with content types, that work with documents, and that help to classify and manage all these documents. In past years this was called ECM, enterprise content management, but these days content services platform is the most used term to classify platforms like Alfresco.
Remember also that Alfresco is an open source product. The main license is LGPL version 3, so you may copy, distribute and modify this software, and you also need to provide source code if you are extending Alfresco, as happens with the modules and add-ons we are going to see later. We are open to the community: all the source code is available on the Alfresco GitHub account, and we also have an open Maven repository, because Alfresco is mainly developed in Java. You can get in touch with the community in the Alfresco Hub, hub.alfresco.com; there is the official documentation, which you can also contribute to; and we have some chat channels, like the Alfresco Discord channel. So we are open to you: just reach out if you have some thoughts or if you want to improve the experience with the Alfresco open source product. Let's start reviewing the features provided by Alfresco and by most other content services platforms. This is the classical flow of document management inside a content services platform. You start with the ingestion of the document: you need to put the documents in the repository, and in this step you can apply some OCR, transformation and classification operations. Once the document is inside the repository, we need to classify it by using content models that describe the properties, the metadata associated with a document. We also provide searching by this metadata, by the property values and by the content of the documents. We can transform the documents, changing the MIME type or enriching the metadata or the content. We can provide automation, with rules and workflows, in order to transform the document or to move it between different users. And finally there is the collaboration: the review, the co-authoring, the
online editing, and the other processes that happen between users on the same document. For that we have a simple integration with Office and Google Docs, which we are going to see later. Let's move on to the ingestion process. Ingestion mainly means taking a document that may be rotated or grainy and applying processes like OCR and transformation in order to get something clearer, so that we get the text of the document at the right quality. Then we can classify the document. In this case we are extracting the metadata (the type of the contract, the name, the taker, the provider, the address) and also the content of the document, and with that we have the document inside the Alfresco repository. Now the content model. The content model is used to describe every document, to classify it with different metadata. In this case we are using a type that can be a list of strings, the name of the document, the taker, which is the name of the taker, and the provider, which can also be a list of strings. We can apply regular expressions to validate the input, like the postal code and the content. With that definition, we provide a UI form so users can change the property values or verify that they were extracted in the right way. We can also upload the document manually, and we also provide versioning for the content and for the metadata, so you can understand the different changes that happen to a document. Moving to the searching part, Alfresco provides three different search syntaxes. The first one is free text: you can use a term to get information from metadata and content. You can use FTS, which is the Alfresco syntax, to combine your terms with the values of some properties, for example the type
contract. Finally, we have the standard CMIS query language, which is related to SQL; it is like the SQL syntax and also allows you to create filters and conditions on your documents. With that we provide facets along with the results of your search, and you can connect with your documents. Permissions are also applied in this process: every user and every document is associated with user groups, with some authorities, so we apply that to your searching capabilities too. Another service provided is the transform service. With the transform service we can create renditions, for instance a thumbnail of a document, or we can extract some pages. We can also apply NLP to extract or recognize information in the document. And finally we can apply transformations to the MIME type, so you end up with different versions of the initial input document. Moving to automation, we have for instance folder rules that are able to classify the document according to different criteria. For instance, for every document of type contract you have to start the approval workflow. The approval workflow is also provided by the Alfresco repository and relies on the Activiti engine, which is a BPMN engine. So we can define our workflow for the document, with different steps that are prepared for user interaction, for automating some process, for connecting with external services and so on. These are the first steps for a document. Once the document is prepared, with all the metadata, the content and the searching capabilities, in many cases we need to start working with it. By default Alfresco provides the SharePoint protocol, which can be used with Office in order to edit Office documents. We also have support for Google Docs: we can copy the document to Google Docs, you can edit
the document in Google Docs, and it comes back again to the Alfresco repository. But there is something in the middle that is not covered: online editing and collaboration with an online open source platform. This is what we are going to explore in this session. But let's start with the integration mechanisms and the applications that Alfresco provides by default, out of the box. When we talk about Alfresco applications and integration APIs, just for you to understand how to integrate a solution with Alfresco, we can think of the main user applications, of which there are two these days. One is named Share. This is the legacy web application that provides access to the documents and all the collaboration features. It is built on top of FreeMarker, Dojo Toolkit and YUI, so these are all old technologies, and the development happens in Java, so we create Java projects in order to extend this application. On the other hand we have ADF, the Alfresco Application Development Framework, which is built on top of Angular: you have some Angular components and you can add your own Angular components to this UI application. So there are different applications, with different SDKs and different ways of extending them. Currently many users are still using Share, but new users are using ADF. ACA, the Alfresco Content Application, is one implementation of ADF. So if you are planning to create something new for the Alfresco ecosystem, you need to move to the Angular one, and if you are still supporting previous deployments of Alfresco, you probably want to create something for Share. For Alfresco Share we talk about Alfresco Share add-ons. The Alfresco documentation has all the instructions for creating this kind of extension. It can be declarative, just configuring something in the application, changing
some behaviors and default behaviors of the application. You can create something programmatic, so you can add new features, new functionality, to the application, and you can even override default behaviors of the application. So this is very rich, and it relies on Maven projects: it's a Java project, you add all your modifications, and you deploy it on the Share web application. If we move to the Alfresco UI extensions, to ADF, this is something different. You need to use an Angular project, and you can override or disable extension points; you can change rules, actions and visual elements; you can register new application routes; and you can also introduce new rules about other components. You have all the information in the official Alfresco documentation. This extensibility mechanism is able to apply your Angular components to the default application. You are working with Angular in this case, so you need a detailed knowledge of Angular in order to extend it. Now the integration APIs. Remember that Share and ADF are mainly the UIs, the user applications that we provide, but beyond that you also have extension mechanisms on the repository. You have the REST API and the Events API, which we bundle together in the Alfresco Java SDK, named out-of-process because the integration happens outside the Alfresco repository. So you can use both the REST API and the Events API, which relies on ActiveMQ, from this SDK. If you want to create an extension inside the repository, we have the in-process SDK. With this SDK you can create new REST endpoints or different behaviors inside the Alfresco repository. And finally we also support the standard CMIS clients, which are more or less out-of-process extensions too. Let's go deeper with the first one. This out-of-process SDK
relies on events and REST APIs. Again, the official documentation has everything you need to use this API. It is based on the CloudEvents specification and the OpenAPI specification, so both the events and the REST API are specified with standards that you can rely on. You have all the different actions and information relating to actions, activities, comments, nodes and so on, so you can interact with all the different entities of the repository by using events or the REST API. If we move to the in-process SDK, the Maven SDK that relies on Java, you can create a new content model, you can add new REST API endpoints, and you can create behaviors for some policies in the repository, adding new code to the standard Alfresco lifecycle for the nodes. You can create transformations and extract metadata, and you can even create scheduled jobs. So there is a wide variety of operations that you can extend on the Alfresco repository with this kind of integration. If we move to CMIS, that is a standard developed by OASIS, and all the specifications are available. Alfresco supports about 90% of the standard, so we have wide coverage. In this case we are working with REST API bindings that can be in XML or in JSON, so AtomPub and Browser, and we also have services for navigation, object, multi-filing, discovery, versioning and relationship. So we can also interact with the repository by using the CMIS standard, which has clients for many different technologies: Java obviously, Python and many others. So these are mainly all the different ways of extending Alfresco. Remember the UI applications, the user applications, but we can also move to the repository and extend the repository features by using one of the
SDKs. So let's move to the next section. Okay, let's start with the online editing collaboration tools, which are the main topic of this session, and we are going to start with LibreOffice. The first integration with LibreOffice relied on the WebDAV or CMIS protocol. We can see this sample from our partner, our community developer repillibro, which was checking out the document, editing the document and checking the document in again, so the process was not that easy, right? You have this process description for reference, but we are going to explore something simpler than that: the Collabora Online project. Collabora Online is a project that provides LibreOffice for online editing. The project is hosted by Collabora Online itself: Alfresco Collabora Online is an add-on for Alfresco that we are going to see later, and it is maintained by Jeci. Jeci is a French company very focused on open source. The nice thing is that this project comes from Magenta, a Danish company that created an initial version. Then the developer Jérémie Lesage from Jeci created something better, Arawa came later and created something better, and finally all this previous work was consolidated in Alfresco Collabora Online, which is the version we are using now. This add-on, available on the Collabora Online GitHub account, provides online editing for Alfresco using Collabora Online, so many users can open the same document at the same time and collaborate on it. The integration uses WOPI, the Web Application Open Platform Interface protocol. That means we open an iframe inside Alfresco, and it's really very handy because you have your iframe and you can work with it without leaving Alfresco: a very clean
implementation. It provides an add-on for Alfresco Share, the legacy application; it also provides a UI extension for ACA, for ADF, an Angular component; and it provides the Alfresco repository add-on. The repository add-on is required, and then you deploy the Share add-on or the UI extension depending on whether you are using Share or the ADF/ACA Angular application. The repository add-on is required because it provides services that are used by both UI applications. In order to test this project we have created a repository that includes a Docker Compose template that you can use to test the integration. This is just something for testing: if you want to deploy it in production or go further with the integration, you probably need to contact Jeci or spend some time understanding better how it works. This Docker Compose needs some modifications from you. The first one is that you need to add your public URLs for Collabora and Alfresco. In this Alfresco setting I'm using my local IP, but you need to add yours. Then we are using the Collabora Online Development Edition, so we are using CODE with no SSL, and we are exposing it on port 9980. These are the main characteristics of the Docker Compose; just remember to add your own IP. We are also extending the Alfresco Docker images in order to deploy the Collabora platform extension for the repository and the Collabora extension for the Share application. We are not providing the ADF extension, the Angular component, because we want to compare the Collabora integration with OnlyOffice. Since OnlyOffice doesn't have an ADF extension, we are only using this one, but you
can also deploy the Collabora extension if you are using ADF or ACA; there is no problem in doing that. Just follow the instructions on the main page of the GitHub project and you will understand how to apply it. And with no more explanations, let's dive into the demo. Before starting the demo, just remember that in the Collabora Online GitHub account you have this Alfresco Collabora Online project, which is the one we are going to use. It provides the add-ons: the platform extension for the repository and the extension for the Share application. We are not going to use the Collabora Angular extension, but you can also use that. You have all the instructions below, so you can understand how to build and deploy it. And we are going to use this Alfresco collaboration tools project, which includes the Docker Compose templates for both Collabora and OnlyOffice. For the Collabora one, we have this Docker Compose with some Alfresco and Share Docker image extensions. Just remember to check your IP. In my case, my public IP is this one, so this is the one I need to use. Okay, just change that in the Docker Compose, and without any additional instructions we can start it. I have the project locally, so: docker compose up. We are going to build the Docker images. The Docker images are very simple: we have the extension at this point, and we are just extending the Alfresco repository image and copying the JAR to the deployment folder. It's exactly the same thing for Share: we have an extension, and we copy it to the deployment folder. Once the deployment is ready, we need to use this URL or this URL in order to access the environment, so let's wait for it to start. It looks like it's ready now, so we can go and open Alfresco with Share. We are opening the Share web application, which has the Collabora
Online extension on it. We can go and find some document, like this one, and you can see this edit with Collabora Online action, which was added by the add-on. If we click on that, we get this iframe inside the Alfresco application. We can add something, change something, and save the document, and when we close, we have the new document with all the changes, and we also have a new version. Let me create some additional users, so we can see how this works with several users editing the same document. I'll be back in a minute. Okay, now we have two different users with access to the document: test1 in this browser and test2 in this other browser. If we start editing the document with test1 and go back to the other user, we can see that the document is being edited with Collabora Online, but we can still edit the document too. So we have collaborative editing between test1 and test2. We can make some changes: for instance, I'm going to change this with user test2 and save it. If we move to the other one, the change is happening live. We can also change this color from this other user and save again. Once we are back at the document, if we also close this editing session and refresh, we can see that it's not being edited anymore, all the changes have been applied to the document, and the version history shows the changes made by test2 and test1. So very clean, it works as expected, and it is very useful to apply this kind of collaborative editing with Collabora Online. So it worked; let's move to the next one. With the OnlyOffice integration we have another case of collaboration. The initial integration was created by cetra3, a user who was working for Parashift,
so they created the initial version, and then this was adopted by OnlyOffice itself, on the GitHub account of OnlyOffice, in order to promote and improve this integration. So you have the project in the official GitHub account of OnlyOffice. It also provides online editing, and many users can be editing at the same time, so it's more or less the same thing as Collabora. But in this case they are using a custom protocol that is not WOPI, and it updates the document after 10 seconds of inactivity in OnlyOffice. It's a custom protocol created specifically for this integration, and you can get the details on the GitHub account. The other difference is that we are not using an iframe anymore: we are opening OnlyOffice in a new browser tab, so we edit the document in this tab and then go back to Alfresco. This add-on provides an Alfresco Share add-on and an Alfresco repository add-on, so there is no option for ADF or ACA, for the Angular components, in this case. But you can still develop that, because it's more or less the same thing as we have with the Collabora app.
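The "save after 10 seconds of inactivity" behavior mentioned above can be pictured with a small sketch. This is only an illustration of the timing rule, not the OnlyOffice protocol itself: the class name `IdleSaveTracker` and its methods are hypothetical, and the only value taken from the talk is the 10-second delay.

```python
class IdleSaveTracker:
    """Decides when pending edits should be written back (hypothetical helper)."""

    def __init__(self, idle_seconds=10.0):
        # 10 seconds is the delay mentioned in the talk; it is a parameter here.
        self.idle_seconds = idle_seconds
        self.last_edit_at = None  # None means there is nothing unsaved

    def record_edit(self, now):
        # Every edit restarts the inactivity window.
        self.last_edit_at = now

    def should_save(self, now):
        # Save only if there are unsaved edits and the editor has been idle long enough.
        if self.last_edit_at is None:
            return False
        return now - self.last_edit_at >= self.idle_seconds

    def mark_saved(self):
        # Called after the document has been written back to the repository.
        self.last_edit_at = None
```

With this rule, an edit at second 0 followed by another at second 5 is not saved at second 12 but is saved at second 15, which matches the waiting time seen later in the demo.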
The deployment provided, again in the Alfresco collaboration tools project, in the OnlyOffice folder, is just using the OnlyOffice document server, but with the JWT option enabled. Because we are using the latest version of the document server, we need to enable this authentication. So we are using this secret: we are setting the secret in the OnlyOffice document server, and we are also applying it to the Alfresco repository. So we are deploying the add-on JAR in the Alfresco repository and adding this to the Alfresco global properties: the OnlyOffice URL (remember that this should also be the public URL) and the secret that we are sharing with the OnlyOffice document server. We are also deploying the OnlyOffice integration in the Share application, and no extension is available for the ACA container. So we are extending the Docker images just to add these settings, the OnlyOffice URL and the JWT secret, and setting this JWT secret in the configuration for Docker Compose. So with that in mind, let's start with the demo.
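The shared-secret idea described above, where the document server and the Alfresco repository are configured with the same JWT secret, can be sketched with the standard library only. This is a minimal illustration of HS256 signing and verification, not the code of the add-on; the function names and the payload fields are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a compact JWT signed with HMAC-SHA256 and the shared secret."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(digest)


def verify_hs256(token: str, secret: str) -> dict:
    """Return the payload if the signature matches the shared secret, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = _b64url(hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest())
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature_b64.encode("utf-8")):
        raise ValueError("signature mismatch")
    return json.loads(_b64url_decode(payload_b64))
```

Both sides must hold the same secret: a token signed on one side with a given secret verifies on the other only if that side was configured with the same value, which is why the secret has to be set both in the document server and in the repository settings.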
For OnlyOffice, again, we have the OnlyOffice project in the official OnlyOffice GitHub account with the instructions, and you also have the releases ready to be deployed, so you have both add-ons available. And we have created this OnlyOffice project that again has the Docker Compose templates for reference. Remember again to use your own public IP for that; just change this parameter, and then you can go with it. So we have the project again, and we can start it from the OnlyOffice folder. The extension in this case is more or less the same thing: we have this Dockerfile, and we are copying the add-on, the JAR file, to the deployment folder. Since this add-on does not support environment variables, we are also copying the configuration we want as Alfresco global properties, so it's a small difference from the other one, with the JWT secret and also with the public URL. In Share it is more or less the same: we are extending the Alfresco Docker image and copying the JAR to the deployment folder. So again, more or less the same approach: a simple Docker Compose with some extension of the Alfresco Docker images in order to apply that. So we can just go with that. We can open Share again; we are going to use the admin user for the first test. Let's see if that is working. In this case, something that is also interesting is that if we navigate to the same document we were modifying before, this one, we now have edit in OnlyOffice or convert using OnlyOffice. This is because OnlyOffice does not support this format for editing, so we can convert the document, and once the document is converted, we have this one that is ready to be edited in OnlyOffice. So again, this is the OnlyOffice window, we can just change that, and once we close this window, when it says, okay, this is saved, after some time we are going to get the modification in Alfresco. You can see that the system still thinks that the document is being edited,
but if we wait a bit more, then we have version 2.0 and we have the modification in it. So it's also something that we can use very easily; we just need to wait a bit to get the changes in the Alfresco repository. Let's see now how that works when different users are editing the same document. Let me create some configuration and let's move to that. Okay, so we have a similar setting: we have the user test 1 in this browser and the user test 2 in this other browser. Let's check what happens when I click on edit in OnlyOffice. So I'm editing in OnlyOffice with the first user. If I move to this one and I refresh, I can see that the document is locked, so I cannot co-edit it. So I need, yes, some coordination in order to collaborate. I can close this one and wait a bit, until it is not being edited anymore, and the same thing on the other side, and then we can click on edit in OnlyOffice, and edit in OnlyOffice again. So now we are both editing the same document at the same time, so we have two users. If I change this one from user 2, okay, and I close, that will be saved. And if I change this other one from user 1, so, yes, let's move to green again, and I close again, then I expect to find the document still locked by the other user. So let's wait a bit; remember that there is some time you need to wait. Okay, now all the changes are in the document, and I also have this change. But only the latest change was saved to Alfresco, so we don't have the full history, but we were both able to edit the document at the same time by using OnlyOffice, right? So the behavior is a bit different, but it still works and it is still able to provide collaboration between both users. Okay, that was also awesome, and let's move on with some more material for today. Okay, once we learned about Alfresco, what Alfresco is providing, and we also learned about all the different APIs and
applications that Alfresco is providing, we reviewed two different approaches for online collaborative editing, one coming from Collabora Online, which relies on LibreOffice, and the other coming from OnlyOffice. Both integrations provide add-ons for the repository and for the Share application, and Collabora Online also provides an Angular component for the new UI, for the ADF and ACA Alfresco UI applications. So with that, just remember that Alfresco provides some content services platform features that can be useful for you in order to prepare the documents for this online collaboration. However, in Alfresco we add this collaborative editing feature by using standard services. We are talking about the Community edition, the open source edition, where you have the source code and so on. So for this edition you need to add collaborative editing by using, for instance, OnlyOffice or Collabora Online, and as I said, you can use Share for both, and you can also use ADF for Collabora Online. Thanks to every contributor that made this real. As we saw before, this was like a journey in order to create the Collabora Online add-on for Alfresco, and also a shorter journey for the OnlyOffice add-on, but it was a story of collaboration, and this is what really makes that happen and makes it valuable. And if you are planning to integrate your service with Alfresco, yes, contact me, reach me, talk to me at the conference, or just reach me through any of my social media, because we really like you to use our product, and we like to create these collaboration stories. So I hope that was a useful session for you. You have my email, and you also have me on social media, so I'm always willing to help and willing to collaborate with you on some ideas or some projects you have. It was a pleasure on my side. See you there, and bye. Yeah, so I think that we have about four minutes for questions.
I don't know, Angel, if the connection is stable enough on your end, or if you can hear me. I have to mention for the audience that Angel is at FOSDEM today, so you may be able to see him in the corridors as well. I guess so. Can you hear me? Yeah, yeah, great. So let me try to fetch the questions. Yeah, so can you hear me? Oh, it's a bit delayed, but I guess that we can try. Yeah, indeed. So actually, for anyone in the chat, don't hesitate to ask questions. On my end, I had one when it comes to the OnlyOffice integration, actually two of them. So you mentioned that you are opening the OnlyOffice editor in a new tab instead of an iframe as for the Collabora editor. Is it a restriction that you found with OnlyOffice, or is it more to answer a security concern, or do you know about it? No, not really. I guess that was a restriction, because this add-on was started some years ago, and they created like a new pattern of integration, with this ten-second waiting time. Probably, if doing that nowadays, it would be like Collabora Online. But the nice thing is that you have the source code, you have the APIs, so now you have all the basic tools in order to make that real. And also, again, if you're planning to do that, just try and reach me, and we'll try to build something together. So this was like the main message of this session: there are people working with that, there are people just creating some ideas, and we can contribute to make that better. But I guess that the OnlyOffice integration can be recreated using the modern features of the OnlyOffice platform. Okay, okay, great. And so we have about one minute left, I think, for the talk. Maybe one last question. So by recreating this OnlyOffice integration using the new APIs that you provide as part of Alfresco, do you plan to have multi-user collaboration here in your demo?
For example, you showed that you had to have one user who opens the OnlyOffice document, and then the other one, and then closes it, and then the other one can open it, but they cannot really edit in real time. Is it something that you are planning to build also? Or is it on the roadmap, I would say? Yeah, yeah. So the behavior can be exactly the same as you have with Collabora Online, so both users collaborating online and making changes. But I'm presenting this from the perspective of Alfresco, so we are not maintaining the project, right? Okay, yeah, and help to make that real. Okay, yeah, that's good. Okay, thanks a lot, Angel. And so if there are other questions, then the room named open source collaboration tools for Alfresco should now be open, and if you have more questions, maybe they can be put there. So on your end, thanks a lot for joining the event yet again this year. I'm going to have to leave you because the last talk of the dev room is starting. But maybe we'll see each other today at FOSDEM. Thank you very much. Hello. Sir? Sir? |
Tackling document collaboration challenges in 2023 |
Hi, everyone. My name is Mike. I'm a communications and tech content specialist at OnlyOffice. The topic of my presentation today is tackling document collaboration challenges in 2023. So we'll speak about it, and I will present the most important updates in our solutions and features since the last FOSDEM edition. So the years of the pandemic influenced every line of work, and document collaboration is no exception. Lots of teams face certain difficulties. Some users want to stay in their home office and work remotely, even though it's now possible to get back to a real office, and therefore lots of groups became even more distributed. So a question arises: where is the balance, and how do you organize effective document collaboration in such conditions? We suppose that with the right tools, it's possible to ensure effective teamwork for any company or group in any situation. So we defined our directions for OnlyOffice to meet user needs in 2023. This is what will help with tackling collaboration challenges, and it can be applicable to any project as well. So first is security, to make work with data and all kinds of office documents as secure as possible, taking into consideration the growing number of cyber attacks. Then, of course, collaboration itself, to ensure effective coordination for teams of any size and in any environment. Usability follows as well, to provide an excellent user experience. And last but not least, flexibility and integrations, to make the solutions available and accessible to as many people as possible. So just a small reminder: OnlyOffice is an open source project with a focus on advanced and secure document processing, with the source code available on GitHub. It comprises editors for text documents, sheets and slides, along with a form creator, PDF reader and converter, with Office Open XML as the core file format.
OnlyOffice has a single engine for the online, mobile and desktop versions and can be deployed on any platform and on any device. In 2022, we released three major updates along with four intermediate hotfixes. The most recent update is version 7.3, released at the end of January. Let's see what we did last year, since the last FOSDEM, according to the trending directions we just specified. Data security is of course a crucial question for everyone who works with documents online. Starting with version 7.2, JWT, which protects documents from unauthorized access, is enabled by default in OnlyOffice Docs integrations for security reasons and for data integrity. A random secret is automatically generated and just needs to be added on the side of the host application. If required, you can also replace the default JWT secret with your own custom key. Processing sensitive data in spreadsheets became more secure with the ability to set password protection for separate sheets and workbooks. Additional security measures here are allowing editing for specific ranges, hiding formulas and locking cells. Version 7.3 provides users with another option, to password protect text documents with the ability to allow only certain actions in the file: reading, filling forms, commenting or tracking changes. And collaboration is of course of great importance for lots of teams, including distributed groups and those who work remotely. So at the beginning of 2022, we introduced OnlyOffice forms. They're also known as OFORMs and allow you to automate your paperwork routine, making it easier to create and fill out model documents.
With OnlyOffice, it's possible to build complex digital forms from any DOCX document or from scratch, and users can add the required fields, such as text area, combo box, drop-down list, checkbox, radio button, image, email, phone, complex field, and adjust the properties to get a contract, assignment, declaration and virtually any other type of document ready quickly. It's possible to collaborate on a form in real time, share the ready forms and fill them out digitally. If needed, the forms can also be exported to PDF. Our approach to digital forms allows implementing additional features. So in version 7.3 of OnlyOffice Docs, we brought more effective work with forms to simplify the document workflow. You can create and assign various recipient roles for field filling. This way users can visually identify which fields they should fill out depending on the role, by matching colors. In future updates, we're also going to extend this functionality by adding the ability to separate recipient roles with restrictions, as well as electronic signatures. In addition to that, version 7.3 brings more ready-to-use fields for quicker form creation, such as date and time with multiple display options, zip code, and credit card. We also provide a forms library with multiple templates in different languages. All forms there are absolutely free to use. Another collaboration update is the live viewer, which allows users to see changes made by other collaborators when opening a document, spreadsheet or presentation in view-only mode. So this basically lets you follow a document live without the right to edit, still seeing all changes in real time. To make collaboration in spreadsheets more comfortable, we added the display of collaborators' cursors. So when co-editing a workbook with other people in real time, you can see their selections marked with different colors.
The OnlyOffice spreadsheet editor got support for the version history feature, so that users can navigate through their previous drafts and restore them if necessary. By default, each draft is saved as a version when the last user closes the file. User interaction in Docs also became more convenient. You can quickly sort comments in a text document, spreadsheet or presentation, which makes it easier to work with comments made by different users at different times. For example, you can view all the comments in chronological order, from newest to oldest and vice versa. You can also sort the comments by author using two available options, A to Z or Z to A. Version 7.0 brought the ability to choose between two display modes when reviewing changes made by other co-authors: you can show changes in balloons when you click them, or in tooltips when you hover your mouse over them. The usability trend implies adding new useful features along with interface improvements. It's exactly what we did and what we are going to do further. Thanks to the implementation of the HarfBuzz text shaping library, the OnlyOffice font engine supports lots of new scripts. The N'Ko writing system, which is written right to left, is only one of the examples. The editors now also support ligatures, which allow merging multiple symbols into one, and hence support new languages, for example Bengali and Sinhala. It also makes for a more harmonious reading experience. You can configure ligatures and choose between standard, contextual, discretionary and historical types, and also combined options for ligatures. With version 7.0, we added dark mode for the text editor. It reduces eye strain and improves readability in low-light environments. When the mode is on, light-colored content is presented against a dark screen, which significantly reduces the light emitted by devices. And following user requests, we also added another dark mode variation, which is dark contrast.
One more bonus feature here is dark document mode, which allows changing the canvas to a blackish gray shade and adjusts the content colors for the best readability. Besides, you can make the editors automatically switch between dark and light mode depending on the theme used in your system. Throughout the year, the OnlyOffice document editor brought such useful features as mail merge, data import from local files and URLs, hyperlink color correction, inserting OLE spreadsheets, updated search and replace, and more convenient work with shapes, including the ability to edit shape geometry with the mouse cursor. Calculations in spreadsheets became more comfortable with query tables, print preview, currency options according to the ISO 4217 standard, tooltips for formulas, text qualifiers, hotkeys for paste special, a link to a data range, and the 1904 date system. New features in the OnlyOffice Docs presentation editor comprise full animation support, a new Transitions tab, hyperlink color correction, a duplicate slide option, and others. The latest version, 7.3, brings support for SmartArt, Unicode and LaTeX equation syntax, paste special for slides, and the Watch Window in spreadsheets, which allows inspecting, auditing, or confirming formula calculations and results in large files. There are multiple formats that OnlyOffice Docs works with. The core formats are DOCX, XLSX, and PPTX, and it also supports ODF formats and a variety of other popular ones such as TXT, RTF, HTML, PDF, CSV, etc. With version 7.0, we greatly improved overall support for Office Open XML features, including the introduction of new mechanics and a rework of the existing objects and attributes. For working with digital forms, we introduced our own formats, DOCXF and OFORM, since the standard formats don't allow us to bring all our ideas to life.
So DOCXF, based on DOCX, is a form template which you can edit and collaborate on, and OFORM is a finished, ready-to-fill-out form that you share with others, which they can fill out. You can fully work with both on the web and open the forms in the mobile apps to fill them in. Starting from version 7.1, you can use OnlyOffice Docs to view PDF, XPS, and DjVu files in a more convenient way. The built-in viewer opens files on the client side and offers such improvements as a page thumbnail panel, a navigation bar, support for external and internal links, as well as the cursor or hand mode. When you save a PDF file, you can convert it to DOCX and other available formats, except for PDF/A. This option, together with the improved PDF viewer, allows you to work with PDF files in OnlyOffice without any third-party applications. And last but not least in the work with formats, you can export presentations in picture formats, JPG and PNG. OnlyOffice Docs will create an archive with the separate slides as pictures in this case. Besides, PPSX presentations are now supported for viewing. Integrations, both internal and external ones, play an important role in this whole ecosystem. For now, we count more than 30 ready-to-use connectors, which allow using our editors within other cloud services, as well as more than 30 plugins, which bring extra third-party features to the editors. Over the last year, a lot has been done in integration development. We introduced four new connectors. The first one is Moodle, which allows users to add document activities to courses and enables editing and collaboration directly within the course structure for students and teachers. And we developed three connectors for open source CMS frameworks: WordPress, Strapi and Drupal. In addition to developing new connectors, we've worked on improving existing integrations. First of all, the introduced forms functionality became available in our existing integrations.
Besides, we added new features and fixed bugs in the current connectors, keeping our users' feedback in mind, including the apps for HumHub, SharePoint, Alfresco, Liferay, Nextcloud, Confluence, Moodle, Plone, Jira, and Redmine. And we also welcome our partners who add OnlyOffice to their platforms to provide users with document processing. Among the new partner integrations are ThinkISL, an intuitive quality management software; O2OA, a collaborative office space to boost productivity, developed by the Chinese company Zoneland Network; C360, a suite of secure and accessible applications by the Italian company Innovazione Digitale; and, Lan Roche once said, complete French collaborative solution built on 100% open source infrastructure; and Tuleap, which is an open source agile project management software by our French partner Enalean. Plugins can significantly extend and simplify your work in the editors. In 2022, we added several handy ones: Jitsi and Rainbow, to organize audio and video meetings right in the editors; draw.io, to create charts, mind maps, diagrams, infographics, etc.; and Doc2MD, to quickly convert text to Markdown. Moreover, we introduced a brand new plugin manager, or plugin marketplace, to make plugin installation even easier and more straightforward. It allows you to explore all available plugins and install or remove any plugin with just one click without leaving the editors. And of course, we always do our best to make OnlyOffice Docs available for every user, so we keep extending the number of distribution formats and packages. So, we officially rolled out OnlyOffice Docs as a service. You can easily launch our online editors and create your collaborative environment with the platform of your choice in the cloud. And all the editors of OnlyOffice Docs and the Document Builder can be installed on devices with the ARM architecture, which allows for higher performance, energy efficiency, and integrated security.
The ARM-compatible version of OnlyOffice Docs is available as a separate build and offers several installation options, such as Docker images and DEB and RPM packages. The OnlyOffice desktop app became pre-installed on such Linux distributions as Amarok Linux, VCOS, BROS, and was verified in the official application store of QtFish. Besides, we got a compatibility certificate with China's national operating system, Kylin OS. The editors are now available in the Vultr and Alibaba Cloud marketplaces. A one-click app on the Linode Marketplace is coming soon. And Artera, an internet service provider from Switzerland, is now offering OnlyOffice Docs as part of their hosting solutions. Moreover, last year, Minisforum and Manjaro officially launched the mini PC UM350, where the OnlyOffice desktop app is the default office suite of the pre-installed system. OnlyOffice is also part of the Murena Cloud service bundle and is included among the default apps of the system on the Murena One smartphone. We released framework components, which you can use to easily deploy OnlyOffice Docs within React, Angular, and Vue environments. And an update which will be really interesting for developers is the new API class. It allows interacting with text documents, spreadsheets, presentations, and fillable forms from the outside. This way, you can create a special connector to make changes in the documents using your own buttons or methods instead of the editor buttons. For example, you will be able to generate a feed with comments from all your documents, reply to them and close comments in one place, or automatically fill out forms with data exported from databases. With the version 7.2 update, we also fully upgraded the native OnlyOffice Document Builder API to make document generation more comfortable, using JS commands instead of text commands. Moreover, we added the .NET doc renderer library, which allows working with the Document Builder API in your .NET-based app.
Finally, in the last year, our team grew both in numbers and in geographic reach. We opened three new offices, resulting in six office locations overall: Latvia, UK, USA, and the new ones, Singapore, Yerevan in Armenia, and Tashkent in Uzbekistan. Our development hub is now located in Uzbekistan. The Armenian office became home for a customer care team, and lots of our teammates also work remotely from various parts of Europe. And let us boast a bit and once again thank our users for their support, since we made it to the top three in the CloudComputing-Insider Awards for the second year in a row, and this time OnlyOffice received gold in the category Cloud Content Management. So, thanks a lot for your attention. If you have any questions, you can always visit our official website, onlyoffice.com, and contact our team on GitHub, and we are always open to any collaboration. So, thank you, and enjoy the rest of FOSDEM. Okay, so we have a link for the OnlyOffice API for the WP connector, so that's nice. Thank you. Sorry again, I have a bit of echo. Yeah, great question actually, Alex. So, any success stories illustrating strategies to make organizations move? Hello? Yeah, I make mine. Okay. So, hello. Hello. Hello. Thanks, everyone, for listening. I was muted, so I'm sorry for wasting a bit of time. Okay, I'm jumping to the questions now. So, we have one question, which is, do you have any success stories illustrating the strategies to make organizations move away from Microsoft-only tools? Anything that you saw as part of OnlyOffice migrations, maybe? Okay, thank you. So, yeah, we have a few examples, and you can definitely access all the success stories in the customers section of the website. So, just not to give any wrong information, I suggest you jump to that section. Is there any way for me to share a link? Yeah, I can do it in the chat.
So, basically, right now, I believe that people should be able to access the room that you joined and connect to the conference. And then we can have a discussion, I think, here directly. It could be a bit easier. I'm sorry, I have a bit of a problem. Okay, so maybe, Alex, you can talk directly with Mike to get your question answered. Okay, also, I saw the question in the collaboration and content management chat about OnlyOffice supporting or planning to support the WOPI protocol. And yes, we do. Since last year, we added support for WOPI in OnlyOffice Docs. It still requires building a connector, just like a classic API-based integration, but we already have a few examples of WOPI connectors. The first one was SharePoint, which works completely. There is also an OpenKM connector. And there is also a kind of work-in-progress version, a connector for ownCloud Infinite Scale, also based on WOPI. It's not publicly available, but this one we are working on; that's a collaboration with the ownCloud team. So, yes, we do, and it's possible to create connectors that connect via the WOPI protocol. So, thanks a lot, Mike. On my end, I will need to leave. So, if you have any further questions, don't hesitate to join the tackling document collaboration challenges in 2023 room. The link is in the chat of the dev room, and there you can discuss. And globally, thanks also a lot for coming to this dev room this year. I hope that next year we'll be able to do it physically, and that we will have a bit more time for presentations. Yeah, sounds great. So, thank you everyone again. I will hang out in the chat, so in case there are any other questions coming, I'll be there. I will also drop the link to the use cases shortly. Thank you. Thanks a lot. Bye. Bye. |
Welcome to the Community Devroom |
We are now slowly counting down to 9 a.m. Please take this opportunity to rustle any plastic that may be around your muffin before we get started. You can also rustle your plastic around the muffin during the sessions, I have been told, and it will all be okay. All right. A wonderful good morning, and happy time zone for those who may be tuning in from elsewhere in the world on the live stream, and welcome to the Community devroom for FOSDEM 2023. My name is Leslie Hawthorn. I am very proud to be one of the co-organizers of the Community devroom, along with Laura Czajkowski, right there, and Shirley Bailes, who is at the back. We would also like to express our sincere thanks to the additional members of our program committee this year, Samson Goddy and Clare Dillon, who helped us to review the submissions to the devroom. Just a few housekeeping remarks before we get started for our day. We will have refreshments available throughout the day, as long as they last. For the baked goods, the vegan baked goods are marked with a dark colored ribbon and the vegetarian baked goods are marked with a light colored ribbon. And the Community devroom is thankful to the Free Software Foundation Europe and the folks running the Legal and Policy devroom for providing us with air filtration for our devroom today. They were kind enough to leave their machines behind for us to use. If you are not feeling well, obviously, please mask. We have several masks available if you need one, and they are the kind that allow you to have a little bit more ease with respiration throughout the day, if that is a challenge for you. Our Community devroom is run under the FOSDEM code of conduct. If there are any concerns that you have during your time in the devroom today, please feel free to see me in person or to reach out to conduct@fosdem.org. There is also a phone number on the FOSDEM website for reaching out with code of conduct questions.
Sadly, I don't have it available right this moment in time, but you will find it on the FOSDEM.org site. I think that's about it other than to say I'm really excited that we can all finally be back together in person for this event and I'm really glad you're all here. So thank you and thank you to those folks joining us from the live stream. We're going to have a number of talks today. I won't be introducing speakers in between each session, but we will have brief housekeeping remarks throughout the day as the room fills up. And we have three minutes before Matt is supposed to get started. So for folks who may be tuning into the live stream, we will now need to be entertaining for the next three minutes. Matt, are you entertaining? I can be very entertaining if you need me to be. Excellent. Well, let's try it. I don't have any ferns for us to sit between two ferns, but what is our topic du jour, sir? Today I'm going to be talking about external evangelists and how to build external evangelists. Okay. Well, we all like it when people want to talk about how cool our things are. So I think Matt's advice will be deeply fulfilling to us and useful in our quest. I'm so glad that the streams are live. Do we get the little wink-a-wink-a cat in here? No, you can't be, it's only the slides up, but so if you want to, I'd like to... Hi, everybody. Well, that's excellent. If you're going to do anything embarrassing, do it over here where you're off-camera and there will be no photojournalistic evidence of the situation. We could play ask the audience awkward questions. We could do that. We could do karaoke just to annoy my friend Brian. Yeah. How many people went out drinking last night? No. No. That's right, because that's the correct answer, because everyone who did isn't here yet. Yes. I was going to say, we here in the community dev room value a responsible morning start where you're not feeling ill. Yes. When we go to sit with our fellow humans in community. 
You know, it was funny though, because last night I was seeing so many people saying, I want to see you do this talk. It sounds good, sounds good. And I'm like, oh yeah, it's at nine, and they're like, I'm not going to go. That's cool. I know. It is. I, you know, we do attempt to bribe folks to arrive early with snacks. But I didn't know that. I didn't know that. So maybe we need to do more advertising on the snacks. Well, I see. Commercial? Can we do a commercial? Ladies and gentlemen, please eat snacks. Everyone is welcome here and that's why there are delicious vegan and vegetarian treats. Oh my goodness, that's fantastic. Thank you to our spokesmodel, Scotty, who organizes the FrOSCon conference. It's basically FOSDEM, but it happens in Germany right outside the town of Bonn where I live. Please come visit us. It's very nice there. It's the small town in Germany that starts with a B, so it's very chill there. We don't have the cool discotheques like Berlin, but we do have a very beautiful park, so you should come and join us. We promise you long walks along the Rhine. |
Building External Evangelists
What should be the primary goal of every community team |
Okay, and now is the part where we promise you Matt. Go, Matt. All righty, thank you everybody, thank you. And, you know, we're starting to warm up the crowd for you. You know, as was said, I was the opening act here, so I'm really, you know, glad to be able to introduce the rest of the, you know, the folks today. But I wanted to talk to you about something that is really critical: it's building external evangelists. How many people are here in DevRel? A few? Oh, yeah, good number. Community? Just, you know, in general? Okay. All right, well, yeah, DevRel, community. They kind of overlap now, don't they? So who am I? This is me. That's me over there. Yes. So I am the head of open source strategy, or the HOSS, for short. If you want to call me the HOSS, I can answer to that. You can find me everywhere on socials if you want to connect after the fact. I work for Scarf. If you're not familiar with Scarf, we do advanced metrics to track adoption and growth of open source projects. And we also have the Hacking Open Source Business podcast. So if you're interested in being a guest or checking it out, appreciate it. So one of the questions that I get asked a lot, and I talk with founders, I talk with executives quite a bit, I get this question asked all the time: How do you measure community or DevRel? Has anybody gotten that question in the, you know, the space? In the last five minutes. In the last five minutes, just yesterday, right? Executives have a hard time figuring out, you know, what this is. And my answer generally to them is, you measure the external advocates or evangelists you create. Okay. There's a lot of other metrics you can track, but this is the one that I typically go to more often than not. And you might be asking the question, why? And so I'm going to answer that question in 348 slides. I think I have till noon today, so I should be going for a while here. That's cool. 
But my first question on the why is, what is the company expecting from these activities? So when you're talking about community DevRel, what's the, what are you expecting? You know, so anybody want to shout out what you're expecting? No? Okay. I'll just go. I'll tell you what, most people are expecting, they're expecting more contributors or more users or more customers. What's funny is, every company that I talk to, they might fit into one of these, but they want all of them, right? So if they're like, oh, we want more contributors, well, really, what do you want? Well, we want more people to use our stuff, and if you think we have more contributors, we'll have better software, and then if we have more users, we'll have better, you know, customers, right? So this more is something that people are striving for. So no matter what, people are looking for more. So I have to put on my other hat, because those who are DevRel in the community space, I got to go a little sales marketing businessy, so I got to switch hats. I got to put on my sales hat, which is the supervillain hat, so I can talk a little bit about other things, other than community activities. So most companies are looking for more dollars, okay? Whether you're looking for more contributors or more users or more customers at the board level, at the executive level, ultimately, they're thinking dollars and cents. And it gets back to a hypothesis, right, which is the more users in the community, the bigger the community we have, the more potential customers we have in the future, right? Have you ever seen the product grow flywheel before? It's kind of overdone. This is my really crappy interpretation of trying to redo it, so if anybody wants to redo it in a prettier graphic, you know, you can see that. 
You know, the idea is the more people who come in, find your software, try it, start to use it, start to be part of the community, the more willing they are to either tell other people about it, or potentially purchase something for you. And this is where you get a huge disconnect, right? So you have the community DevRel team here who is trying to be very focused on the relationships that you have with your community. They're trying to track the community growth, and you're not necessarily thinking about the number of customers ARR, MRR. But when you get to the boardroom, a lot of times those are the metrics that they're looking at, whereas we're talking about these metrics. And so there's a disconnect, and we need to link those two somehow. And early on, a lot of these are talked about quite frequently in smaller companies, but as you get bigger, everything tends to shift to this side. And when you're saying, ah, how many of the number of customers of the ARR can we attribute back to this side? It becomes kind of squishy, right? And that becomes a difficult proposition for a lot of folks. And this is where we've seen the rise of the, quote, unquote, influencer marketing, or DevRel in our space, but influencer marketing has been around for a while. You've seen some of the documentaries, or you've seen some people out there on Instagram who are like, look, I have free trips, and I get to go to all these places. Hey, I get free trips to go to these places, too, and stand in front of you awesome people. So I'm kind of an influencer, right? But this is trust, right? It's all based on trust. So when you see somebody, whether it's online, whether it's part of some other community, you start to get to know them, and you start to trust what they say, and when they tell you to buy a product or do something else, hey, that's something that you listen to. 
And that has been established in not just the software space, but also in the overall kind of retail and consumer space as well. And as tech evangelists, we are all part of that, even if we don't want to consider ourselves, we are. We're influencers. Our job is to go out there, talk to people, and get them to understand what we're trying to preach to them, tell them, share our experiences, help them get better. But there's a difference between internal and external evangelists, right? So let me give you this example. We've got two people here, right? The first person works for Delta Company, let's say, and they do development JavaScript, and they say Delta Software is awesome. Person B works for a website that uses Delta Software, and they say Delta Software is awesome. Who are you more likely to believe when they say that? You're going to believe B, why? Well, this person over here is incented to tell you that Delta Software is awesome. So if I come up here and I work for SCARF and I tell you, SCARF stuff is awesome, you'll be like, oh, okay, that's cool. But if someone else comes along and says SCARF software is awesome, you're more likely to listen. And that's why building these external evangelists is critically important. And from a business perspective, this is how the business side thinks of this. Someone out in this space tells you this is great, this is awesome. Their followers, some of these blue people here, they're like, ooh, yeah, that sounds pretty cool. Let me go ahead and let me watch this thing that you're doing or attend a conference or listen in. And so those people watch, and then some subset of those will say, ooh, I really want to try that out. So then they try it out, and of those who try it out, maybe a couple will actually adopt it. They'll maybe become customers. 
And that funnel, that process is what is expected from the business side, and that's what the investment in the DevRel space is all about when it comes to the executive side of things. But the thing is, can this scale? So there is one me, so I can go to one conference at a time. If we hire another me or another me, we can do this as well, but that gets really expensive. And a lot of companies, especially in younger stages, are a little strapped. So they might have one, two, three DevRels maybe. And so how do you scale that? And that comes into thinking about this in terms of different tribes or different communities. There are lots of different programming languages. And it's not feasible for one person to understand all the different tech ecosystems, all the different tribes, all the different communities that are out there. They're all very different. And so by plugging into that and finding people who might be able to plug into those external communities, those external tribes, you can actually do this play where you're talking to people and getting them to try over and over again in multiple communities that you might not be part of or might not be able to reach. So how do you get external evangelists? There's a few different ways. This is not an exhaustive list by any means. It's going to be the 15 minute version, which is the time we've got left. But let me start with, should you hire influencers or DevRel people or external people to kind of come in and do something for you? There's a lot of people who have a really high GitHub rating. They have a lot of YouTube subscribers. These are great people. But your results are going to vary because their communities might not match what you're trying to do. And so I've done this in the past at previous jobs where you partner with someone and generally it's a paid relationship and you're like, hey, could you cover this event? Could you come talk? Could you do these things? 
And sometimes it works and sometimes it doesn't. It's a very costly way of doing it, but it can be effective if you do it right. So the classic play is obviously, grow your user base. That's easy, right? So okay, we're done. Just grow your user base, you're done. It goes a little bit beyond that, right? So when we talk about growing the user base, this is where you look at the graph here. You start with users and there's different pathways for those users, but everyone has to start as a user. It could go into the dev side where they want to be code contributors and eventually become maintainers or part of the core teams, or it could be that they become content contributors. Someone who does blogs, someone who does talks at conferences and then eventually turns into a regular evangelist who continually does that. Now what's interesting is a lot of folks, especially in the community, a lot of our metrics are based on this top, which is great, but a lot of this bottom is not necessarily looked at or followed as closely and there isn't as fleshed out a metric set for these. So how can you get more users? There's a lot of different ways, right? So content, content, content, content machines. I do a lot of blogs, podcasts, videos, all kinds of different stuff. That helps people, especially if it's geared towards making it easier for them to get into the community. So education, training, code examples, conference talks, things like that. But also there's a product space here, right? Features that matter. That's important. So it is a collaborative effort with product in a lot of cases. You have to get that. You have to make it easier for people to get started. And then you have to figure out how to get more people to tell others how awesome your product is. That's that external evangelist side, right? Because again, if I tell you we're awesome, it's okay, but if someone else who you trust in the community does, who has nothing to do with us, that's even better. 
And you have to set those expectations, right? And this is a classic problem that I see happen over and over again in the startup space, especially in DevRel and community: you start to compare people to other people, okay? Not all communities are the same, and you have to identify that. So if you're a part of, let's say, the Haskell community, and you're going to compare with JavaScript, those are two vastly different sized communities. And if you're coming from the JavaScript community and you're saying, like, look at all these people in my conference talk, I've got a full room, I've got 200 people, and you go to the Haskell one and it's got 30, you can't be like, oh, I only got 30 people at my talk. Because that's not what it's about, because the Haskell, you know, programming language just has a smaller community, and that's okay. And I'm a database guy, and when you look at databases, for instance, I've built this chart, it's really hard to read, but you can get the slides after, but the gist of this is you start off and you think, wow, how many people use databases? Everybody uses a database. So what's your total market size? How many people can we get to use our stuff? Oh, there's billions and billions of people who are going to care about what I want to say. Well, when you start talking about operationally sized things, you're like, huh, okay, well, from a developer perspective, let's be honest, front-end developers don't care about databases, so half of all developers go out the window. And then you're like, hmm, well, maybe full stack ones, I guess, care about it. Well, architects do. And then you're like, oh, well, what about operations people? Well, DBAs do. Do sysadmins? No. So you have to start to pare down and figure out what that audience is. And don't get discouraged if you're working in a space where the audience is small and you have to grow it. 
That's something that I see happen quite often with a lot of executives and companies. They're like, oh, that guy got 10,000 views on his video, why can't you do that? He's like, because I'm not them. I had an interview with, I talked with a guy from Google on a podcast that I did, and he's in the database space. He's like, the Google guys just, they don't get it, because our Google videos on G-Cloud, they're like, 500,000 views, and the database ones are like 5,000, and they're constantly harassing me: You're not doing enough because you're not getting the 500,000 views. Well, the community's smaller, it's different, and that's okay. So avoid some common mistakes, right? You know, don't assume that your audience is as large as others, and don't assume that just because you're speaking, you will automatically generate interest. It requires a lot of work, and product awesomeness trumps story. You can have the best storytellers, the best DevRel folks out there, and if they're talking to people about a crappy product, it just doesn't work. So some best practices. Be authentic, okay? This is really important. You want to connect with people, you have to be authentic in your conversations. If you are, you know, authentic, people will come to you, and you have to make connections. How many people like dogs? Okay, half of you. So you know what? This is my dog. You know what's awesome about dogs? You can be an introvert, all right? And if you're walking around, you know, no one else in the neighborhood, if you have a dog on a leash, they will stop by and talk to you. Why? Because dogs are cute, but also because they have dogs, especially if you have two dog people. I've met so many people just from the dog, you know, that it's not funny. It's a connection that you make. So how do you find those connections in the community? Because the more of these types of connections, the more people that you can get to potentially help within the project and potentially talk about your stuff. 
You also need to make this really accessible, okay? People often develop brilliant things that are too complicated, okay? We also have maintainers and devs who think that everyone thinks like themselves, okay? If you are comfortable on the command line and you're comfortable setting up something following a 142-step process, don't assume that other people are, right? You know, not everyone thinks that way. And also make it accessible for how new users get involved. Be a safe and fun environment, right? Nobody likes to come and listen to the person who talks in the monotone voice, and it's, this is a really boring talk in a boring community. You want to make things fun and engaging for people, right? You want to connect with folks as much as you can. Avoid toxic people because they suck, you know? Avoid abrasive culture, you know? What are you doing to engage with the community on a regular basis? You know, what's the personality of your community? Does your community have a personality? It should. You should define that up front. You should figure out what you want people to know your community as. You also want to be transparent, okay? Share what's happening internally. Oftentimes, especially in an open source-driven company when you have commercial open source, you have got different tiers, right? You have the tier of people who are in the company and they know everything that's going on and then you want all these contributors outside to just assume that they know what's going on internally or not know it. And setting up those silos is bad. So sharing plans, roadmaps, other things, super critical. Don't create that caste-based system, okay? Explain yourself and your decisions, right? When you say, I'm not going to take this feature, that's okay. You can do that as a maintainer, but if you don't explain why, then that really hurts people, right? And also be open to feedback. 
Now the next one is you really want to focus on getting help, but you have to ask for it. I don't know why so many people are afraid to ask for help. This is a classic blunder. If you're running a community and you want to build contributors, maintainers, you want to build external evangelists, you have to tell people where to start. Classic. I've seen this several times where people are like, I really want to get involved with this community. So they submit some patch and it's like, wow, this has nothing to do with what we want in this project. So we just reject it and move on. And then that person moves away and they never want to come back. Well, why did they choose that? Well, they didn't know where else to start. Tell them where to start, help them with that. And give that feedback, right? Be responsive. Nothing drives people more batty than not hearing from you when they ask you a question or submit something to a project. It's mind-numbingly bad. So you're building relationships here. Treat each other like you want to be in a relationship with one another, right? You want to be part of this community. Avoid that silence. Good, bad. And promote people's work, right? So if somebody submits a blog, somebody's had a conference talk, go ahead and promote it for them. Tell them that this is awesome. Give them feedback. Keep on rewarding people, right? People are rewarded by positive achievement and that doesn't just mean baked goods, right? I do appreciate the baked goods, though. But you can give away challenge coins, Tom, right? Yep, so there you go. That's the different things you can do. But one reward that's often overlooked is simple acknowledgement. I acknowledge that you contributed. Postgres does a great job of this. They give away challenge coins as well. But everyone who contributes, they're called out in the READMEs and every time they do release notes, right? So make sure when you do release notes or blogs, you write profiles on people. 
Physical rewards, of course, work too. I like hats, but that's just me. But the other thing you can do is listen, okay? Listening is critical to all of this. And really, just don't be an ass. I mean, you know, people are out here to participate in the community and they want to be part of this. And so the more that you can help them feel like they're part of it, the more they're going to want to stick around and help more. External evangelists aren't just participants, right? You know, that's what you want. You want to have the external evangelists. You don't want people just in the community. You want to make sure that they can contribute and be happy to do so. So if you're interested in more, because I'm just about out of time, you know, go ahead and follow along, get more details. We just released this new website, opensourcemetrics.org. And so I put all of the metrics there that kind of go between the community and the business, and how to map out that entire funnel. And you can also see me later on today where I give the open source business guidebook over in room K. Or if you're going to State of Open Con in the UK, I do have a talk on how to destroy a community, the supervillain's guide. So if you really want to know how Darth Vader, Emperor Palpatine, Voldemort would go about destroying an open source community, stop on by and see me. And again, that's where you can reach me. Thank you, Matt. Yes. Thank you very much to Matt for that wonderful talk. I was actually just posting that for anyone who's looking to get into a DevRel career, there's some really great information here about setting reasonable expectations with your company leadership. So very much appreciated. Matt, if you would take questions, we have a little time for questions. Now is the time to ask questions. Is it on now? Yes. Yes. No. Mic on. Mic on. Mic off. Okay. I can go, hey, questions, questions for people. Anybody have any questions? 
Anybody want to see me just stand up comedy? Yes. Yes. Yes. Yes. I'm embarrassed. I can't do it. Question. Hold on a minute. Oh, no, no, no, no. I shall slowly walk through the audience. I really appreciated the advice to listen, to hear what people are saying so you can kind of bring them into their community. But how do you get people to create content about something like do you reward them, encourage them? Is it just bragging rights? What's your approach? So I think that there's a couple of things that I've seen work effectively. I know there's a talk this afternoon on building non-code contributors, which is good, so I'd recommend that. But typically my approach has been to help people get involved, start to work with them, talk to them initially, especially a lot of people who are just getting started. They don't know what to write, especially when English isn't their first language and you're going to post in English. So working with someone, giving them ideas, I generally will start with a creative kind of like idea board and do that help-wanted, where I'm looking for a topic on how to do X in Kubernetes or how to deploy this or how to tune this. And if they'll agree to take one, I'll work with them to release it, we'll promote it. So there is some of that reward. But I also go through and do monthly rewards for the top, you know, blog posts and contributors that way. I'll look at yearly rewards for that as well. We've done things where, given plaques out at conferences and stuff, for someone who's had the most effective or interesting blog, had them come out to the conference, stand up on stage, give a talk, whatever, and that's a reward as well for them. But a lot of it is just building that relationship and working with them, especially as they're starting out. Because when you're starting out, you're really unsure and there's a lot of that imposter syndrome. You're like, I don't really know if this is good enough to publish. 
And the thing is, you just got to make it so easy for them to get started and feel like they're part of the community and feel like, no matter what, you're there for them and you can help them through it. Floor has you. Oh, sorry. Thank you. Yes. My phone started to beep and my brain immediately turned off. Thank you, Floor. Hey, Matt. Thank you. So there's a couple of companies that have these community ambassador-type programs. What in your experience or from your viewpoint are some good ones and why? So the community ambassador programs, like GitHub has the GitHub Stars program where they've got quite a few people who are on the GitHub side of things. Most of them end up being more on the code contributor side and they don't necessarily kind of embrace the bigger audiences. Even when you look at a lot of the tools that are out in the community space, whether you're talking about Orbit or Common Room, they're very focused on the contributor first and then a few extra things, but they don't necessarily do the full picture. And so I think that that's where, if we want to continue to grow the community, it has to be in embracing all kinds of contributions, not just the code contributions. And so I think that that's where we need to kind of move to that next level. When I was at Percona, we tried to build that program in, but it was still relatively young. So we were still working through the kinks. We went through one year on that program to do that, which worked pretty well, but I wouldn't say that it is the model for everything just yet. Thank you very much, Matt. Oh, it is on. I love it. Thank you. Thank you very much for your talk. Floor, if you would come up, please, and get set up. That would be lovely. While we are waiting for our next speaker to get slides ready, it is time for the standard during-the-break community dev room remarks. Hold on a minute. I will unmask for anyone who may be following along on the live stream who reads lips. Good morning again. 
You've probably heard me say good morning before. I'm going to say it again. For those of you who may be hungry, we have snacks available all day. You will find a vegan option on the table in the cake that is wrapped with the darker colored ribbon. The vegetarian baked goods are sealed with a white ribbon. We have more gummy candy than I can take home. So please avail yourself of the gummy candy. I am fortunate enough to live in the town where Haribo is from, which was actually a company that was founded in Bonn, Germany. It's really easy for me to get more from the factory store. You should eat this, just saying. As per usual, this room is run under FOSDEM's code of conduct. If anybody has anything that they need addressed during the dev room today, you're welcome to come and see me. I'm Leslie. We can have a chat. You can also reach out to the FOSDEM team at conduct at FOSDEM.org. There is also a phone number listed on the FOSDEM site if you would like to make a call. Unfortunately, I do not have the capacity yet to memorize a number that long. I need more coffee. Yes, we really do. I will not do karaoke to entertain you during the break out of great respect for my friend Brian, who does not find it pleasant. But there might have been some Never Gonna Give You Up happening on the Grand Place last night after a few drinks. Not going to rickroll you either. I have too much respect for all of you. I don't know. But I sent you, did I not send? There was a thing. You know what? I'm just going to build a nice little social calendar page next year and actually be organized as opposed to being like, oh, wow, we're going to FOSDEM again. We're going to need to people. That's going to be hard. And we're the community people. I have great compassion for anyone else who is also having struggles with people. All right. I should, you know, this is one of those times when you should have memorized the fun facts like a having bird has to turn off the top light here. 
Oh, let's do that. Did you move it? Yeah. It seemed like a hole because there's a dimmer. What are we doing in the box? Dimmer? I think it was the front. We're figuring it out. Come out of the darkness, my friends. We have found the button. Yes. We are clearly not finding the button that we need to make things light anew. Press it again. And. Oh, we have succeeded in our quest. Oh, that's my email. We don't want to see my email potentially. Yeah. Oh, yes. Okay. Excellent. This is the door through which you entered the community dev room. Should you need to exit the community dev room during the talks, please exit through this door. Eventually we will have a queue of thousands. Well, at least many people who will want to enter the dev room and it is much easier if we follow a proper queuing process to come in to my left and to exit to my right. I will continue poking this little thing because I want to make sure that we start things on time and also to. |
What I learned about leading a healthy project from speaking to 50+ maintainers |
Hi, everyone. Welcome to the community dev room and to my talk. I want to talk to you about a meetup that I ran for many, many months, and sometimes there were also many, many months in between different meetup events because we all had lives and things to do and it was during pandemic time. But it was a live streamed meetup about open source topics and about basically everything in open source, so licensing, funding, mental health in open source, all kinds of different programming languages. We invited some of the maintainers of those languages to talk to us about that language and its community. And we started it basically because we wanted to just learn more, and what better way than to just invite people to chat to. So there were no scheduled talks or anything. It was just panel discussions and trying to learn more about those types of communities and those types of initiatives. Oh, that went a little far. So a 2021 Tidelift survey that you might have seen surveyed 400 open source maintainers and found that 46% of maintainers are not paid at all, 26% earn more than $1,000 per year for maintenance work, and 59% have quit or have considered quitting maintaining the project, listing lack of financial compensation as their top reason for disliking being a maintainer. Now yesterday, too, there were a lot of talks about funding open source maintainers, and that it might not be the primary reason for open source maintainers being open source maintainers, but at the end of the day we do need to pay for stuff and it's nice if you are able to pay for that stuff. But, you know, beyond the projects that are maintained by the proverbial single individual in Nebraska, there are a lot of other reasons for open source projects going bad, and I would like to share some of those stories that I found from the Contributing Today meetup, basically straight from the horse's mouth, and hopefully it will all help us create more healthy communities. Alright, so who am I? My name is Floor. 
I'm currently staff developer advocate at Aiven. Previously I was in developer relations roles at Grafana Labs and Microsoft and a whole bunch of open source developer tooling companies. I am a member of the DevOpsDays core team, and I co-organize DevOpsDays Amsterdam, and I'm a Microsoft MVP for developer technologies, hence my question to Matt about community ambassador programs. And I organize way too many meetups; Contributing Today is just one of them. So I have very little free time, but if I have free time I spend it with my chickens and my dog and my son, not necessarily in that order, but all of those things. All right, so I love this quote from Rain Leander, and I'm just going to say that they said this on the show. I don't think they did, but they were definitely a guest on Contributing Today multiple times, and it just gives it a little bit more gravitas if we pretend that this was said at the meetup. They said that a project is only as healthy as its community. Community is at the core of any successful project. Without community, a project is only technically open source because of its license, but can't reap the benefits of innovation, a consistent release cycle, and faster response to current trends and market needs. And my own introduction to open source was a very, very positive one.
I had decided that I wanted to learn Ruby on Rails, and to that end I started following the Rails Girls guides. I found that I would get stuck on something because the guide didn't cover how I needed to do something for my specific setup, that I needed to install an extra something something, and I figured out that I could Google how I should do that. That is called troubleshooting, so I felt really important. And then I could go into the guides and change it so that someone else wouldn't get stuck on the same issue, and I thought that was fantastic: someone somewhere in the world that I didn't know could benefit from me failing that first time and then fixing something. I was entirely sold on open source. So while that was a wonderful first experience, I also had a lot of dreadful experiences being in open source, and that was also the intention of Contributing Today: I wanted to figure out how we can recognize those sorts of bad experiences that we might find in some projects, and maybe mitigate them, or at least be aware of them so that we can avoid those communities like the plague. All right, so in the meantime, fast forward a little bit: I had learned Ruby on Rails, I worked for a couple of different companies in the open source space, I had been part of a couple of open source communities, and you start recognizing red flags early on. You do that by looking at the code of conduct: figuring out if the code of conduct is at the root of the repository for the project, whether there are persons that I can contact if I experience something that is not what I was expecting or hoping for, or whether there's just a generic email address or no contact details at all, because that happens. If they have a contributing file: do I know, going to the project, what I can contribute to? Does it set me and my contribution up to be a success? Do I know what they're looking for? I would look at the makeup of the contributor pool: do I
recognize anyone, does it look kind of diverse to me? A helpful README is certainly a really good sign. I would look at responsiveness: how long are issues or pull requests typically open before they're responded to? And then also the general tone: are people thanked for their contributions, or can they expect really abrasive answers and not really any love? So I would look at those types of things. And as I got more into open source, you tend to build relationships, and you would hear from the hallway which projects are the ones that you should maybe contribute your time to and which you maybe shouldn't. And you look at other things, and Matt also sort of alluded to this in your earlier talk: do maintainers grant some sort of recognition for people making contributions? Do they offer some sort of mentoring for newer people in the open source project, do they spend some of their time mentoring newcomers to the project? If I come to a project and I file an issue, is someone picking these up and mentoring me through making sure that that is going to be a wonderful PR in the end? And do they promote active contributors to maybe maintainers with commit access? So those were also some of the good signs that you could find. All right, in 2017 there was an Open Source Survey that sadly never got repeated; maybe they thought that the State of the Octoverse is the thing that sort of replaced this. It was by GitHub and a couple of other companies, and it collected responses from 5,500 respondents from over 3,800 open source repositories on GitHub, and then 500 respondents from platforms other than GitHub. The results are an open data set about, among other things, the experiences of those who use, build, and maintain open source software. The main takeaways, and I'm just going to read them out because we're not sure the people in the live stream can actually see this live, so they're just relying on my voice, sorry:
documentation is highly valued, frequently overlooked, and a means for establishing inclusive and accessible communities; negative interactions are infrequent but highly visible, with consequences for project activity; open source is used by the whole world, but its contributors don't reflect its broad audience yet; using and contributing to open source often happens on the job; and open source is the default when choosing software. I want to focus mostly on the top two findings of that report, so I want to focus on the interactions and accessibility, but by no means am I negating all the other points from the survey. All right, from that survey, nearly 18% of respondents indicate that they have personally experienced a negative interaction with another user in open source, and 50% have witnessed it between other people. And negative experiences really have consequences for a project's health, a community's health: 21% of people who experienced or witnessed a negative behavior said that they have stopped contributing to a project because of it, and 8% of them started to work in private channels more often, which is also something that we definitely do not want. So, a little bit of a breakdown of the behaviors that were seen. By far the most frequently encountered bad behavior is rudeness (45% witnessed, 16% experienced), followed by name calling (20% witnessed, 5% experienced) and stereotyping (11% witnessed, 3% experienced). More serious events like sexual harassment, stalking, or doxxing are each encountered by less than 5% of the respondents and experienced by less than 2%, but grouped together they are witnessed by 14% and experienced by 3%. And now, if you think, well, the survey was from 2017, surely it's gotten better by now, I have bad news for you. A report from the Linux Foundation from 2021 that looks at diversity, equity, and inclusion in open source didn't bring much better results than the 2017 one. It's still a very toxic place to be, and especially for marginalized folks it is
a difficult space to navigate. All right, so in developer relations, and you asked for a raise of hands, I don't know how many people raised their hands, but in developer relations we have a couple of tools that measure community activity; you mentioned Orbit and Common Room. But they look more at, you know, where are your influencers, right, and what kind of activity is taking place. Looking at open source metrics, I looked at two things. CHAOSS is an initiative in this space to measure community health. CHAOSS is a Linux Foundation project that is focused on, and I quote from the website so I get it right, creating metrics, metrics models, and software to better understand open source community health on a global scale. CHAOSS, if you don't know, is an acronym for Community Health Analytics in Open Source Software, and of course CHAOSScon took place this last Friday. CHAOSS understands community health as the potential that an open source software community continues developing quality software. CHAOSS is helping through all kinds of things, like templates for READMEs and codes of conduct and a wonderful terminology sheet, so they're doing a good job. And then Bitergia, of course, is another company in that space; according to their website, they improve decision-making and reporting by analyzing software development community activity and performance of open source projects, and they do so across patches and forums and meetings and all kinds of stuff. And an October, no sorry, August 2022 report looks at the Kubernetes open source ecosystem. I didn't necessarily want to know more about community health in the Kubernetes space, but, sidestep, I do like a quote from Sarah Novotny from the time that she led the Kubernetes community group at Google; now of course she works for Microsoft, because they all move around. And she mentioned how "chop wood, carry water" is not only sort of a phrase that they say in the
Kubernetes community a lot, but it's actually a core tenet of the project. They want to make sure that anyone that is coming to the Kubernetes community is doing so to help the project, and not to help their company forward or to be in the Kubernetes community for their own sort of nefarious agenda. So they keep saying this phrase, and they make sure that whenever someone is coming into the project with a PR or anything that doesn't align with what the community wants, they are made very aware of that. So they spend a lot of time actually doing the unglamorous stuff, like thanking individuals for their variety of contributions, and not only when it includes code but for any other contribution too, and at its most basic they are seeing this as a way to pay that forward. And when someone does enter the project with this, I don't know, 10k-line PR, they're made aware of Google developer relations manager Aja Hammerly's quote: "we don't do that here". If you haven't read the blog where Aja explains what that means, you should definitely have a look at it, but the gist of it is, Aja says: if you run a meetup or a team, if you lead an open source project, or if you organize an event, people will be looking at you to know what is okay and what isn't. You get a responsibility; whether you want it or not, it's yours. And "we don't do that here" is a polite but firm way to educate a newcomer about your culture. Back to the report. So when Bitergia looks at community health, they look at a couple of things. One of the things is responsiveness, as the time to solve issues and pull requests or merge requests, as well as the average commits per day; a lower average response time indicates a higher performance in the ecosystem. Then complexity, as determined by the number of code repositories and the number of people involved; an increasing number of repositories indicates an
increasing complexity. Makes sense. Activity diversity, as in the number of commits, issues, and patch requests over a period of time; when and where that activity takes place sort of shows how the ecosystem is distributed. They look at how talent is managed in the open source ecosystem, and they try to analyze different aspects of code contributor growth. So they identify the contributor retention rate, the contributors that remain engaged, and sort of the bounce rate, the people that leave the project. And they look at the bus factor; I definitely feel like many people in the room should be aware of that factor, but it is the minimum number of team members that should be engaged with the project for it to have long-term sustainability. And they look at community footprint too, and that's an interesting one: it's the presence or influence of organizations within the open source ecosystem. So one of them that they look at is the gamma factor; that's the amount of contributors that use a Gmail account and are likely not contributing from a professional standpoint. The elephant factor is the minimum number of email domains whose employees perform 50% of the total contributions. And they look at contribution patterns, so for instance, are a lot of the contributions done during the week or during the weekend? If many of them are done during the weekend, then likely there are a lot of individual volunteer contributors that are still contributing to the project. Right, on to a couple of the lessons learned from talking to these 50-plus maintainers. One important thing that kept coming back in all of these meetings is how important it is to set expectations. So the side story of "we don't do that here" in the Kubernetes community is a way to model behavior, and you want to be doing that. So if problematic behaviors like the ones that are described in the Open Source Survey from 2017, for instance, take place, it is paramount to address those
swiftly, but definitely publicly, because you want to set that tone. You want to show the behavior that you're looking for; you want to model that behavior. And there are ways to automate a couple of the things that you expect from people entering your community too. So if you find yourself asking people that send you a pull request to elaborate on that pull request, even just after one or two times, then maybe you want to make sure that you have PR and issue templates available, so that they know what is expected, what you are looking for in order for you to be able to actually review that pull request. So a lot of that expectation setting is something that you can automate too. And of course licenses are a way to set expectations too, right? We actually had two different Contributing Today meetups that were talking about licenses, and the most recent one was with ClearlyDefined, Tidelift, the Open Source Initiative, and the Ethical Source movement, and I learned a couple of wonderful things. One of them, actually: if you're married to a lawyer, apparently you can also give legal advice. I don't think I was allowed to quote that one. But another thing is that combining licenses for one project is definitely possible, and I'm not talking about one license for the code and then one license for the documentation, but combining licenses is possible. It's mostly good for lawyers, though, and not necessarily for the project, because you're just introducing complexity. You will want to choose a license that is standard in the ecosystem that you're in, so if everyone's using Apache in your language ecosystem, for instance, maybe you want to use Apache too. And if you're thinking about building a business around your project, ultimately you need to choose wisely. We talked about CLAs, contributor license agreements, a lot in one of those meetups, and how, if you haven't done that, it's absolutely a pain to do relicensing, because you will need to contact all
of your contributors, and it's anyway notoriously hard to try and contact your contributors, because they often don't have an email address or anything listed on their GitHub profile, if that's your platform of choice. Maybe they have moved on from the project; some of them have maybe passed away, because that's an actual possibility, and then how do you go about that? Either way, if you do relicensing you would need to introduce a breaking change, right, because you're breaking your API, and that can be very divisive in a community. Some of the community members might prefer the latest release and fork that one and create a community around that one, all totally fine, but you need to be aware of that. Also, recently in the tech press there was this project that sort of threatened to relicense if they didn't get funding, which I find an interesting approach too. Right. Originally an academic project, Postgres: the Postgres open source stewards are very interested in its extensibility and its ability to merge with other projects. So maybe one of the lessons learned from Contributing Today is to play well with others: make sure that there are connectors and integrations; that sort of extensibility is an interesting thing, and people look for that. Some organizations that are involved in open source have great policies internally, or maybe an OSPO, and usually that means that employees can make contributions to the upstream project during working hours, which ensures activity on the project. They're professionally involved, so maybe they also wear their professional hat, I'm hoping, and hopefully it means that they're not just there with their professional sort of goggles on.
You might want to steer away from single-vendor support. Most projects accept help and contributions from service providers, well-known hyperscalers, and some of them definitely appear on paper as the champion supporters of a project, but their feature requests can never outweigh whatever the community wants in importance. A side tangent, and I promise this is probably the last one: recognize red flags whenever a company is looking to pay you for your open source contributions or pay you as a maintainer, and make very, very sure that you don't have to subject yourself to the same OKRs as they have for the rest of their engineering organization, because companies work at a very different pace, and with, you know, financial years, than open source does, and it's important to know that. Sticking with Postgres for a little bit: contributor experience is a very important thing. I remember that Gregory Stark, I don't know if he's in the room, maybe not, maybe he's still sleeping, mentioned that his first contribution to Postgres was a very positive one, and that's something that made him want to contribute more for the long term. It can make all the difference in terms of gaining long-term contributors for your project if that first experience is a good one. My first experience with Rails was a wonderful one because of Rails Girls. And remember that 2017 Open Source Survey and how it said that documentation is highly valued, frequently overlooked, and a means for establishing inclusive and accessible communities?
Well, from the Contributing Today meetup panels on docs, often the conclusions were that it's important to create documentation in a markup language that other people know, so that they can review it and maybe also contribute to it, so if they find an issue, like getting stuck on a certain something something in the Rails Girls guide, they can contribute to the project; and that at minimum the setup needs to be documented, since people can usually tinker beyond the setup, but that first step needs to be a good one; and to be clear and concise, to avoid jargon, and if you use jargon, then maybe offer a glossary. I also love this quote about non-code contributions: that triaging and labeling is a contribution too. Projects need to think about the different reasons for people to contribute to open source, and they can address those different motivations with appropriate work. Matt said this when Matt was on our show too. So open source is a marathon, it's not a sprint, and marathons need sponsors. Sprints do too; you can ask me how I know. My son runs, and I need to get sponsors for this, because I said: I do sponsors for conferences, I can definitely find sponsors for your sprints. And we've had several Contributing Today sessions around funding open source work. We invited Tobie Langel from UnlockOpen; he's very well known for saying things that are a little controversial maybe, but it's good. "Companies making money off people's work without paying them is a huge liability" is something that he said, and I can't agree more. He continued that people don't understand what it takes to run a business, or an open source project for that matter, and nobody scrutinizes companies for providing healthcare or, I don't know, bathrooms for their employees, so why are we putting maintainers under such scrutiny? From that same episode, Phoebe Quincy, who was a senior community relations manager at DigitalOcean, said that there are many projects that need your help more
than the Kubernetes project, even if you rely on the Kubernetes project a lot to get your stuff done, and also added that Hacktoberfest in more recent years is actually looking to promote some of the smaller, sort of plumbing projects that really need your help and that could use your help through Open Collective or GitHub Sponsors or any program of that sort. Estelle Weyl, who's a senior technical writer and community engineer at Open Web Docs, also said that most sponsoring programs are small pockets of money, just enough to cover server costs, but still we treat that as this big thing. Phoebe from DigitalOcean, I quoted her before, said that FOSS funds award lump sums, large amounts of money, and the next challenge is doing that consistently. We found that sometimes projects get a lot of money in one month and then nothing the next, and how do you even balance that? And we feel like this is this wonderful thing, by the way, like 10k is this huge amount of money, while what do we pay software engineers at our companies? In 2021 we had a very similar conversation with Babel maintainer Henry, who mentioned that he is scrutinized by the community for going on podcasts and doing marketing for the project to try and get more donations in, while he "should be working on the project". And so he feels like he should be spending more time coding, while that might actually not be in the best interest of the community and the Babel project. And while a normal company would like to have a runway, when you're running an open source project that sometimes feels like sending the wrong signal, because: you were doing the work before while you were not paid for it, so now that you're paid for it, you need to do x, y, z. And this ties into a whole sort of toxic notion: we have this really romantic view of the open source maintainer that is in the basement of their mom,
you know, deprived of sunlight and just suffering, and only then are you a true, you know, unspoiled hacker. But it's ridiculous. So for anyone who does want to pay a mortgage, doesn't want to work at a company, and doesn't want to give their project away to a foundation, all for valid reasons I'm sure, there is this website called oss.fund; there are a lot of different ways to fund your project more consistently. The FOSS Contributor Survey from 2020 by the Linux Foundation (I feel like I'm sponsored by the Linux Foundation a little; I loved their talk yesterday too, in the Janson room, the closing talk) summarized the results of a survey of free and open source software developers, with a goal to identify key issues in improving the security and sustainability of open source, and found that the vast majority of respondents, nearly 75 percent, are employed full-time. So that definitely breaks with that romantic notion of the person in the basement. And the survey found that over half of respondents are paid to contribute to FOSS, which, I mean, great. Usually we're a little bit concerned about corporate involvement in FOSS projects, but there's a lot that we can do there. Remember when I talked about mentorship? Some of these people that are paid to contribute to open source, maybe they can fulfill that role of mentorship for new contributors. And of course the Linux Foundation suggests to maybe move to a foundation with neutral governance to ensure diversity of organizations and control, but of course the Linux Foundation would say that, wouldn't they? All right, I'm drawing to a close, I promise. The previously mentioned report found that all types of contributors reported that they spent very little of their time responding to security issues, and when I say very little, I wonder what number you have in your mind, but it's 2.27 percent of the total time that they spend on the project that they spend on security issues. And also they indicated they have no
intention of increasing that amount of time to spend on security issues. Instead, they're looking at organizations to help them with free security audits and with contributions for security. So for anyone in this room that is working for a company that allows you to spend time on open source, that is something where maybe you can bring some of your expertise, or you can find some of your friends in security departments and see if you can get them to help you out on that one. I'd be remiss, of course, to not talk about the fragility of our supply chain. Your transitive dependencies are part of your wonderful community garden, and it's imperative that you at least observe them and observe the health of those components that are literally your foundation. All right, in conclusion: I wouldn't dare to suggest that open source is sick, but I think we can all do better at taking some of our vitamins. Collaboration between strangers is one of the most remarkable aspects of open source, and we can all be better stewards of that. So: set expectations, upgrade your onboarding, improve your contributor experience, check your dependencies, floss, and prepare to do that all over again every single day. Thank you so much. Thank you very much. |
Cultural Relativism
a Prism for Constructing Cross Cultural Communities |
And without further ado, I will now yield the floor to Claude. Thank you very much, sir. Can you hear me? Does this work? All right. It's working. Thank you. Okay. I'm here today to talk about cultural relativism as a prism for constructing cross-cultural communities. And let me start first by saying that the opinions expressed in this talk are mine. They don't necessarily reflect FOSDEM, my employer, or the projects that I work on. And in the course of this talk, there may be images or phrases that offend, but any such offense comes from lack of knowledge on my part and is unintentional. So who am I? I'm Claude Warren. I am a member of the Apache Foundation. I've been a member and a leader of cross-cultural teams. You can see my social media links there. I'm not on Wikipedia. That's my father. And if you look my father up, you will also find a link to my mother. And if you look them both up, you will find that they went to Northwestern University, where they became immersed in the practice and theory of cultural relativism. They were students of Melville Herskovits, who was himself a student of Franz Boas, who we will get to in a moment, but suffice it to say that both of my parents were cultural relativists and I grew up in a framework of cultural relativism. So let's start by defining culture. Culture is learned behavior. It's what you learn in your social framework. So it's what you learn from your family, from your friends, from your school, your city, your state, your country; they all have culture that influences you and helps you develop your culture. Activities you participate in, any place that people gather, you develop a culture. So as an example, culture is not that you eat, but what you eat. On the left here is a picture of a durian fruit. It comes from Southeast Asia. There are signs in the Singapore transit stations that say don't bring this in the station, don't bring it on the vehicles.
It's an acquired taste. I tried to acquire it; I couldn't. I'm told that it is just absolutely lovely once you've acquired that taste, but to me it smells like rotting flesh and I couldn't get past it. So the second one obviously is a cow. Most of the people in this room are going to look at that cow and say cow, milk, cheese, butter, maybe leather, beef perhaps. But if you're Hindu, that's a sacred animal and none of those things will come to mind. So let's look at another culturally determined perception. It's not that you perceive time, but how you perceive it. The Aymara people in the high Andes perceive time such that the past is in front of them. So for about seven generations, as they're talking about people, they will speak as though that person is in front of them, and then beyond that, time goes left to right. But this means that the future is behind them. So for us, Back to the Future is a great play on words and a fantastic movie title. For the Aymara, it's the way they perceive time; it's the normal way to perceive things. So to tell somebody from the Aymara culture to go forward to the future would sound just as ridiculous as somebody telling us to go backwards to the future. Okay, so now that we have some idea of the kinds of things that are involved in culture, let's look at cultural relativism. Cultural relativism is the idea that you can only understand beliefs and practices from within the construct of the culture itself. Now this is separate from moral relativism, which we're not going to talk about today. Cultural relativism has been used recently to make arguments for the continuation of human rights offenses on the grounds that they are culturally based. In anthropology, cultural relativism is a means to get outside of your culture and understand another. So it's how you step outside of your culture and get rid of those biases so that you can understand what's happening in another culture.
But for people who are managing cross-cultural teams or multicultural teams, we can use cultural relativism as a prism through which we can discover miscommunication and discuss interactions between members. Cultural relativism was first recorded as a concept by Herodotus, who was a Greek historian who lived from 484 to 425 BCE. He was writing about Darius the Great, a Persian ruler from 522 to 486 BCE. Darius was inquiring about funerary customs in his empire, and as you can see, his empire was fairly vast. Now it turns out that on the eastern edge of his empire, the funerary custom was funerary cannibalism. And on the western edge, it was cremation. And when those two populations were told about the custom of the other population, they both were dismayed and abhorred the other process. This is the first recorded case of this concept of cultural relativism. Now it was established as axiomatic in anthropology in the early 20th century by Franz Boas. Franz Boas was appointed professor of anthropology at Columbia University in the States in 1899. He is called the father of American anthropology. And he first articulated the idea of cultural relativism in 1887. But he didn't coin the term. That fell to Alain Locke. Now Alain Locke was the first African American Rhodes scholar. He was reviewing this book by Robert Lowie. Robert Lowie was a student of Franz Boas and was writing about cultural relativism. He was a cultural relativist as opposed to a cultural evolutionist, which was popular in the Victorian era. And when reviewing this book, Alain Locke was talking about Robert Lowie's "extreme cultural relativism". And that's the first time the phrase occurs in print. Okay, so why do we care about cultural relativism?
Well, we care about cultural relativism because the tools that anthropologists and ethnographers built to understand culture, we can use to find the friction points between cultures in our cross-cultural teams. And those friction points are going to lead to misunderstandings. And misunderstandings can lead to a breakdown of cohesion within the team, where people on the team think they're being undercut or that other people are working against them, sabotaging their efforts. It can also lead to a misunderstanding of what your work assignment is. What is it that we're trying to build? And if you get that wrong, you're apt to build the wrong tool altogether, and your project will have a high potential to fail. So let's look at a couple of examples of cultural relativism in action. Stoplight. Well, as it says here, with the exception of people with color vision deficiencies, everybody perceives color the same way. The light enters your eye, hits the cones, nerves fire, the brain gets the message: color. Except some languages have no word for the color green. They instead use the word for yellow or blue. So for those cultures that use those languages, a stoplight only has two colors, and the phrase "go on green" makes no sense. In Irish, the color of a person is the color of their hair. Okay? So Erik the Red, okay, he wasn't Irish, but you get the idea. If you want to speak about a person of color, so a dark-complexioned person, that person is blue. So the translations of "black lives matter" and "blue lives matter" into Irish become the same phrase, right? So a distinction that is so important and so prevalent in U.S. culture is lost in translation. So let's talk about icons. Icons, phrases, and gestures are all culturally centered. And that is rather obnoxious, so let me make him stop. All right, so the thumbs up gesture generally means okay, all good, let's proceed, something along that line, some sort of approval.
However, in some cultures that is a rude gesture. In ancient Rome, it was, contrary to popular belief, the symbol that the gladiator should be put to death. In addition, in online forums, the meaning has begun to shift and it is changing from approval to really a passive-aggressive way to shut down a conversation. So all of this indicates that the meaning of symbols drifts over time and space. Now I moved from the U.S. to Ireland a little over 10 years ago and discovered that they are two countries separated by a common language. For example, the phrase just about, as in I just about caught the bus. Now in the U.S., if you say I just about caught the bus, that means I ran after it and it left without me, I missed it. In Ireland, I just about caught the bus means I ran after it and my gosh, I managed to get on the thing and I got a lift. So the same phrase, both times in English, both times in Western culture, describes two opposite outcomes of the same event. So idioms are obviously another source of misunderstanding. So phrases like, there's more than one way to skin a cat, which means there's more than one solution and is perfectly acceptable in a lot of cultures, but there is obviously a chance of misunderstanding here. I actually had somebody ask me, why would you skin a cat? Valid question, but I don't know, it's just one of those things I learned in my culture. And then we'll get to the nodding head guy now. In general we think of the nodding head as meaning yes, but in Bulgarian culture that means no, and in fact shaking your head from side to side means yes, go figure. Now one more example here is that in some cultures it is acceptable and appropriate to interject a short word at the end of a speaker's sentences during expositions. So this indicates that you're listening, you're paying attention. Now in English you can see this when people say yes at the end of somebody's sentence all the time. 
The BBC television show Antiques Roadshow has examples of this all the time. This is a very valuable piece, yes, yes. They know that, but they're just, you know, let's keep the conversation going. However, in other cultures it's considered very rude to interrupt your speaker. My example three: there are words and phrases that are unacceptable in certain cultures. There are symbols and phrases that, were I to use them here, would probably get me reported for a violation of the code of conduct and derail this entire conversation. Yet those words and phrases are acceptable in other cultural settings. So what cultural relativism teaches us is that when in a multicultural setting, if one is confused or triggered by a symbol, gesture or phrase, one should ask what it means. So, Melville Herskovits: judgments are based on experience, and experience is interpreted by an individual in terms of their culture. As I said, Melville Herskovits was a student of Franz Boas and my parents' instructor. So let's look at a couple of examples of cultural relativism in cross-cultural teams. I worked on a project where we were transitioning the development from one culture to another, from one team to another. Really they were separate cultures. In the receiving culture, it was not permissible to show failure. This was a career-limiting move: don't fail, don't show failure. It was also not acceptable to disagree with somebody who was perceived to be of a higher rank or status. And we're trying to do open source development where everybody does everything in public so that you can see what's happening. So this was a problem for us as we were trying to move this across. And so we came up with a couple of techniques that we used in this project to get around this problem. And the first problem was actually understanding what the problem was. You know, it was like, well, I don't want to check that into the source code control system. I don't want to do this. I don't want to do that. Why? 
Because we just don't do that. And then you have to work a little bit, you find out what the problems are at the base and figure out how you can work around this, work out a solution. So what we did was we made it safe to fail, and we exhibited failures on the original team. And one of the things we did was, you know, with our source code control system, you would check stuff in on any branch, and it would build the branch. And most of the time, that's going to fail if you're working on a development branch. So we exhibited that. We showed them, look, this happens all the time. All of the developers that have been working on this team, this always happens. Don't worry about it. Nobody's going to look at your failures unless you ask them to. I mean, if you want somebody to come in and you say, I can't understand why this is failing, can you help me out? They're going to come look. But otherwise, they're going to ignore that. Don't worry about it. It's safe. We pointed out that developers were not allowed to change the main branch of code, couldn't check stuff in directly to it. There really wasn't a way to break that. There were permission controls in place that were going to prohibit that. And in fact, if you were able to do that as a new developer, come in and put something in the main branch of code, then you had just found a bug in the permission system and you should be congratulated. We showed that all developers were following the same process. There weren't senior developers doing something different, with more ability to do stuff, when developing code. They did have permission to do stuff that was different. But development of code, development of what the project was, was always the same for everybody. We also said that no code would be reviewed unless it was in the version control system. So this gave us an advantage, since we're a distributed team. 
You got to be able to look at the code and know that we're talking about the same code, the same lines, all of those great things that people we know about. And as a secondary effect of this, it meant that when desktops and laptops crashed, we had the code in the version control system, so a great benefit there. Nothing really out of the ordinary. We also made it safe to disagree. And doing this was probably an anathema to some agile purists here because we made it so that in the stand-up, we would have discussion and dissent. And so normally your agile thing is, what am I doing today, what did I do yesterday, what's blocking me, done. Now we're going to have discussion and dissent. And what this did was it meant that the new members were able to see how within the team we would discuss these issues and we would bring these problems up and talk about them and that this was okay, it was acceptable and over time they began to open up and to disagree and to present their perspectives on things and ask questions. And it really, they became very strong contributors to the project. And the other piece here is ask the questions and those indelicate questions need to be, it needs to be acceptable to ask indelicate questions. And this is because if I don't understand what you've said and I'm trying to understand that, at some point I'm going to ask a question that to you is going to be rude, not acceptable. And it needs to be okay for that to happen because we can't figure out where that point is and what that issue is if we can't talk about it. So you've got to be able to both ask the indelicate question and you've got to be ready to receive the indelicate question. Okay, so the net result here was we established a project culture. And the project culture wasn't, this is our culture and it's great, it's better, come do what we do. 
It was, this is the culture of the project, and let's make this work and let's all contribute to it, and we can change the culture and do all of the things that we do. But make the culture of the project and live by that. So the second example, this all occurred in Western culture and the conversation occurred on Slack. And basically there was a conference coming up and I said, what time do the doors open? I want to know what time we start and close at the end of the day. And I was told, oh, you go to this website and you'll see the times. So I went and I said, can't get there unless I'm registered. If you're not registered, why do you care? And so my co-workers came to me and said, well, that was rather rude. And I thought, well, all right, yeah, I can see that it could be rude. But I had worked with these guys for a number of years. They're planning a conference, they're under a lot of stress because we're getting close to the time. And they're trying to get through all their questions in a day, you know, you're coming in, you're going to get through all the questions, get those answers out the door, get to the next process. Let's see if we can get this project done and get this conference running. So these are people I've known for a long time, and to me it would just take too long to say, I don't see why anyone not attending the conference would be interested in starting times. Can you elaborate on why you want to know? All right. So yes, that would be a much politer way to say that. But I understood that they didn't have the time. So what I want to point out here is that it is often difficult to understand what's meant in the written word. You have no vocal or facial cues, and then I want to point out that those are culturally based as well. All right. So let's look at how to succeed. Define the culture of the project, not just in terms of words, but in actions. 
You want to talk about inclusivity and sensitivity, and you want to do the training for that? Brilliant. That's great. That tells you what to do. But it's the practice and the execution of the training that builds the culture. Practice makes the culture. You need to consider cultural differences when we're talking about discussions. We're going to have discussions about things. If you're talking about planning documents, you're trying to do planning for your project, you talk about use case definitions, you talk about documentation. Any of those things are ripe, fine fields for finding cultural conflicts. Make it safe to disagree and to ask questions. Make it psychologically safe to fail. And the disagreements will surface the misunderstandings before they become failures. A failure is the result of a misunderstanding that didn't rise to the level of a disagreement. So you make it safe to disagree, but then it didn't come up, and then you get a failure because there was a misunderstanding that wasn't a disagreement. So you want to explore both of those, in any case. You want to explore them both as possible friction points between cultures. So when you see either of those in your project, as you're working with your teams, think about whether there is a cultural difference that might be contributing here. So exhibit the culture of the project. Again, practice builds culture. I can't stress that enough. Management has to lead by example. As I like to say, no executive washrooms. Do everything in public. Failure must be public. By public I mean, obviously, if you're working in a corporate environment, that's going to be within the firewall of the project. Management work items, with the exception of personnel and other legally encumbered topics, must be in the public within the project. What is management doing? What is our goal? That's really going to help facilitate the communication up and down. 
Treat everyone, every objection, every suggestion, every point of view with respect. Assume that everyone working on the project is working towards success, and never attribute to malice what can be explained by ignorance or cultural difference. So in closing, when we're connected to others, we become better people, and the art of cross-cultural team management is to be a better person. So if I've offended anyone with any image or phrase during this talk, please accept my apology and understand that it was unintentional, but also let me know so that I can adjust my approach in the future. Thank you. Thank you, Claude. That was lovely. Very much appreciated. If you are open to taking questions, we certainly have some time for questions. Oh, dear. Okay. Okay. Would anyone like to ask a question of our speaker? Excellent. I'm headed your way. I should have spoken slower. Thank you for a wonderful talk. I wanted to ask you if you were in a position where you were leading a project or a team that is very multicultural, and those things also keep changing, you get new people, some people leave, and it's just a great diversity in the team, how would you approach learning more about their cultures and what is and isn't acceptable, and then trying to find a mesh that works for everyone? Okay. So the question is, if you're having a multicultural team where people are coming and going, lots of cultural change from the people that are in the team, how do you find a way to understand what everybody's culture is and what's acceptable for everybody? 
I think to answer the second part, I'm not sure that you can find something that's acceptable to everybody. There's always going to be somebody who's a little tweaked, I suppose, by things that go on. But it is being open to realizing that the problem is there, that's the first part, and then asking questions. And basically I think if you want to talk about documentation or something like that, then it would be documenting how you are open to these, how these questions can be asked, and what makes this a safe space to ask these questions. I don't know that there is a generic answer that says, this is how you do it, and it works across everything, because there are just so many differences. I have lots of great horror stories if somebody wants to come ask, but then they're all different, and they all have different solutions. You talked about the challenges that you saw in your projects, where people think in a very hierarchical way and don't want to engage or interact with someone that they perceive as of higher rank, and also people wanting to do everything in a perfect manner, and your resolution to that was to flatten the structure, enable communication, and everyone can fail, which is a very Western way of thinking. So you tried to bring them across to your culture. Have you ever tried to do it the other way, tried to engage in a project or let the project go in a more hierarchical way and make sure that people were rewarded for not making errors, and if so, what were your experiences? That's a really good question, and yes, it was. It ended up being a very Western way of looking at it. I recognize that. I will say in defense of this one project that we were moving from the Western framework where we were working, and the project was working, and everybody was sort of depending on the project working, and all of our users were sort of dependent on the way the project had worked, so there was a lot of pressure to sort of stay the same course. 
Have I worked in others? If I think about it, I can probably come up with one, but no is probably the most honest answer. I have worked in very hierarchical structures, US corporate structure works that way, and that's why I'm in open source. That's about all I can say without getting myself in serious trouble. Hey Claude. Hey Ferdinand. Fancy seeing you here. So you work for Aiven, which originated in Finland. Is there any fun story that you can share from the very Finnish culture that Aiven has? No. Well done, sir. I saw a hand up here. How do you get from that? I don't know about that. Thank you. I would say, and it's not an exact answer, but in reference to the previous thing, so in some cultures, like at the meeting, you never actually discuss the thing. All the conversations happen one-to-one around the meeting, and you show up at the meeting and the leader says, we should do this, and everyone says yes. But the meeting doesn't happen until a consensus is already gained, right? And when there's a very controversial decision, I found that to be actually a useful technique. Like I know that if we just have the meeting straight up, we're going to have a huge argument. So I go and talk to each of the people individually, I try and sort of canvass a solution so that then when we actually have the meeting, everyone has already agreed. So that's at least one situation where we can learn from other kinds of cultures and import that. Yes. Yeah, I would agree. It makes for very short meetings. I love that. All right. Folks, we're going to go ahead and begin to transition to the next speaker. Thank you. Great, Claude. Very much appreciated. Thank you. Thank you very much. |
Contributor Experience 201
Supporting social infrastructure in FOSS projects |
Hi, everyone. Thank you for coming to my presentation. This is my first time at FOSDEM. Clearly, first time in Belgium. Brussels is such a lovely city. But I think we lucked out. And I must say, I don't mind taking a break from the sun. I'm hoping to return and explore the city more outside of the ULB campus. So please send the recommendations my way. Let's start. I would like to thank the Community devroom organizers for inviting me to speak and share the work of my team. My name is Inessa Pawson. I work on NumPy and am part of the Contributor Experience Lead team. Here's the plan for the next 25 minutes. First, I'll introduce the rest of my team. I will briefly introduce the projects that they're working on. Then I will share my thoughts on why contributor experience matters. I will also talk in more detail about how we are making contributor experience better within the SciPy ecosystem. If you were wondering about the numbers in the title, just like in a university course catalog, we will dive a little bit deeper into the topic. I will briefly touch on how we evaluate our progress in our work of community building and improving contributor experience. And also talk about the project health metrics that we pay most attention to. Finally, I will introduce the Contributor Experience Project, a new community of practice, an open source project, a community-driven project. And I will invite you all to join if you're interested in this topic. Just keep in mind that building thriving open source communities takes a lot of years and it takes some time to talk about it. So 20 minutes will not be enough to cover everything, but hopefully it will be enough to start the conversation. So a few more words about me. In case you're wondering about my accent, I'm originally from Ukraine. I've been living in the United States for some time now. I'm a Pythonista, meaning that I'm advocating for adoption of the Python programming language. 
I've been organizing the South Florida PyLadies chapter and the maintainers summit at PyCon US since 2020. I'm also very active in the SciPy community. In December 2021, I stepped into the role of Contributor Experience Lead. And before that, I was a very active contributor to NumPy on several subprojects within the NumPy community. In 2021, I joined the steering council of the NumPy project, which is a governing body. And I pay a lot of attention to other projects in the SciPy community. I do consulting, mainly on the topics of contributor experience, project governance, community management, and contributor support. We're a team of three. And I'd like to introduce the rest of the Contributor Experience Lead team. Melissa Mendonça. She's in Brazil. It's probably too early for her to join us. I asked her to add a few bullet points to introduce herself. So this is Melissa in her own words. And Noa Tamir. Noa, if you're watching, hi. She's in Berlin. So hopefully, she's joining us. Noa also added her introduction. I should mention that Melissa is working as a Contributor Experience Lead for two projects, SciPy and Matplotlib. And Noa is looking after contributors at Matplotlib and pandas. I mentioned that as of December 2021, I am working on NumPy. I am compensated to work on open source. And this was made possible by the Chan Zuckerberg Initiative. We received funding for four foundational projects: SciPy, NumPy, Matplotlib and pandas. This grant allowed us to create a dedicated role to take care of the contributors in each project. I must also mention that the four Python libraries that our team is working on are quite different in terms of the contributor community, governance models, and policies, and therefore priorities when it comes to contributor support are somewhat different. And now let's talk about you. Yes, you. I'm always very curious about how others start in open source. 
And since there are quite a few of you, I will not ask you to introduce yourselves one by one, but I have a question for you. Please raise your hand if you are a contributor to open source. Very nice. Please keep your hand raised if your first-time experience contributing to open source was great. Well, I would say about 40 percent, maybe slightly less. I have an opinion that I would like to use the stage to share. And I think that starting contributing to open source shouldn't be like receiving bad customer service. We've all been in this situation when you call to ask a simple question and then you're put on hold for an hour. It also should not feel like you are climbing Mount Everest and have to face a greybeard to have your first pull request merged. So our team set out on a mission to make everyone's first-time experience contributing to open source great. And now I'm going to share how we're doing it with the projects that we're working on. Governance and policymaking are maybe not the most intuitive places to start, but they're very, very important. Given the limitation of time, I will not be going into detail as much as I could. And please feel free to come to me after this presentation to ask more about the bullet points that I put on the slides. I'll be pretty brief. Reviewing the code of conduct and the related policies. That was our priority from the start. It's very common nowadays for projects to have a code of conduct, but sometimes they are very inefficient, in that it's very hard to find information on where you go for help if you need it. Who decides on complaints? What is in the scope of the code of conduct? What interactions are in the scope of the code of conduct? And so on. Another thing that you might come across is that there is no policy on how committee members change. They just join and there is no election process, no nomination process. 
And it can be a curveball in certain situations. So making sure that your code of conduct is a working policy is very important, because we all want our contributors to feel safe in our communities. Policies on progression from a contributor to a team or project leader. Your project needs new leaders. If you're serious about sustainability, you need to be prepared to share decision-making power. Also, if you would like to retain your contributors, how do you motivate them to stay and to take on more responsibilities? I would recommend giving them recognition and visibility of their work, and having policies on how you progress to leadership positions is very important. I would say that most projects in the Python community are still lacking in these policies and this is something we are actively working on. Diversification of the roles. We don't expect a cardiac surgeon to do anesthesia. Then why do we expect a software engineer to be good at creating educational materials or organizing events? And then why do we also expect community event organizers to be seasoned software engineers? To have a thriving open source project, you need contributors with diverse skill sets, and diversifying pathways of contribution is something that is still quite new in terms of implementation. I think I covered. Oh, yes. And bringing visibility and recognition to non-code work. The Python community is on GitHub, and green squares on GitHub became sort of a currency and also a huge incentive. And you can see the dynamics, especially within smaller projects, where contributors take on the work that gives you green squares, more green squares, darker green squares. It's really hard to solve this problem. One solution that I came up with for our projects is to make all work, including non-code work, visible. And it's not a perfect solution, but it's the one that we can offer working within the system that we have. Supporting newcomers. 
I'll be very brief because I know I'm running through the time. This is what we are doing for supporting our newcomers. Everyone has heard about the importance of technical documentation. We decided to make it even better. We work with millennials. We work now with Generation Z, and Alpha will be around the corner. So docstrings, contributor guidelines, tutorials on Jupyter books, YouTube videos and even comics. We're utilizing every format that is available for NumPy and other projects. I work mostly on NumPy, and this is something that I was leading with the NumPy project, to make it more accessible and easy to start contributing. Also creating safe spaces to ask questions. Every project that is part of the grant has newcomers' hours. We do pair programming sessions. We have dedicated Slack channels and we also host regular newcomer sprints. And of course, very, very important: timely PR triage and code review. I also would like to bring your attention to Gitpod and GitHub Codespaces. There are many reasons to use these tools for our ecosystem. Building a development environment can be challenging, especially for newcomers. Having a cloud-based virtual environment for newcomers has been a huge help for us, especially when holding newcomer sprints virtually. So this is something that I highly recommend considering if your dev environment setup is pretty complicated. Also, keep in mind that not everyone has access to machines that allow them to build locally. Next I'll talk about supporting active contributors. In this part of the presentation, I'll be talking about NumPy mostly. We have teams, and this is the progress of the work that we've been doing on diversifying contribution pathways. We also hold community events. The NumPy community events calendar is very busy. We have two or three events every week. And this information is broadcast to the contributor community through several channels. 
We also recognize that the NumPy contributor community doesn't speak English only. And we have been doing a lot of work in the past two years, translating our documentation, translating our YouTube videos. Yes, so this is something that I think is bringing NumPy closer to other parts of the world, non-English speaking parts of the world. Alternating meeting times to accommodate every part of the world. This is something fairly new. And I must say I had some reservations about getting up at 6 a.m. for some of the meetings, especially when we had very low numbers, one or two people showing up. And just recently, we got some traction. And at the last newcomers' meeting, we had 10 people. Some of them joined before the start of the meeting. So that was a very exciting moment for me, that the community is learning about the alternating time slots and joining. And the last thing I would like to mention is that we are doing some work on web accessibility. We understand that there is a huge number of people who have visual disabilities or sometimes don't have access to the technology that requires the same work of web accessibility. I'm sorry. We all agree that creating a community that is welcoming to newcomers takes a lot of work. And it requires lots of bandwidth from maintainers as well. Early on, I decided to think about how we can optimize the work of our maintainers to lower the burden. And these are the solutions that we have implemented so far. We have code review guidelines. We have been using saved replies, which has worked really well for our community. We identified questions that are frequently asked. And we are hoping GitHub will allow saved replies at the organization level. So if there are GitHub people in the audience, please take a note. Yes. We are also using GitHub Projects. My focus is on first-time contributors. And I actively triage first-time contributors' PRs every day and bring them to the attention of our maintainers as needed. 
We're also planning on leadership transition. Succession planning. And just recently we had Melissa transition away from the documentation team leader role. She had a lot on her plate. And now we have two new documentation team leads. One of them is an alumna of Google Season of Docs. I'm sorry. Now, the part that Bitergia folks in the audience will probably find interesting. Looking at the list of the project health metrics developed by CHAOSS, a Linux Foundation project focused on creating metrics, can be overwhelming. And there is no one size fits all. Being a department of one, I had to decide what metrics I want to prioritize. And this is what I decided to focus on. Time to first response to review a PR by a first-time contributor. And time to first response to review a PR. And the reason for that is that there are some studies, and I think a general consensus, that reviewing the first PR from a first-time contributor is very important for retention. Besides the metrics suggested by CHAOSS, I've set a North Star for NumPy to measure our efforts in creating a culture that is welcoming to newcomers. After being actively involved with the open source community for some time, and also being a community organizer and community manager, and now as a contributor experience lead, I believe a second contribution is a great indicator of the experience newcomers had with the project. I also must mention that you should not forget about qualitative research. It's sometimes hard to gauge an understanding of the experience of your contributors from numbers alone, and just talking to them, and also having post-event surveys, is a good way to get feedback from your newcomers. Okay, my time is up, so I'll just fly through. So if you're interested in the conversation about contributor experience, we've started a project. It's called the Contributor Experience Project, and it's a community of practice. Please take a photo. This is how to connect with us. 
To close, I'd like to give a shout out to my team members and to the NumPy team. And also to the Chan Zuckerberg Initiative for supporting open science and open source. Feel free to reach out to me if you have any questions about my work. Thank you. |
Free Culture CV
an open source idea to show the community your contributions |
I am now going to get out of Pavlo's way so that he can begin his excellent presentation. Thank you for being here, Pavlo. Thank you. Will you give Pavlo a round of applause? Hello, my name is Pavlo. I am going to talk about the Free Culture CV, an open source idea to show the community your contributions. The first question you have to ask yourself is, does your CV follow one standard? The main goal of this talk is to create a community to build a project to solve the main question. The community has already started at that link, so feel free to go there. The question is, how do we show our contributions to the free culture communities? In my opinion we need a free culture CV, and we will talk about it. This talk is not to show a product, it's not a demo and it's not to show the latest updates of something. This talk is to create and to discuss the creation of an open CV standard, to get more environmental sustainability, including less paper, and to show how and where to collaborate. The thing is, if we have a profile with the communities that we are contributing to, someone who sees our profile can find out about more communities that maybe they don't know. The main goal is to create a community and improve the relationship between the candidates and the recruiters, because this relationship needs to improve. One important thing is to have more control over your personal data. This talk is based on four parts. The first part is a comparison between the 20th century CV and the 21st century CV that we want to create. Then we will see the standard that we currently have in the European Union. As you will see, it's going to be amazing. Then we will see more about my identity, like the idea of my free culture CV. Then we will see things for inspiration in the community to create, to discuss, to decide and to build that community. First, the old CVs usually do not follow a standard. Each one has a different template and this is a mess. 
We currently have a European standard, but this should be the input. LinkedIn is what we don't want to be the standard; it currently has a lot of users, but it's not the way, I think. On the other hand, we need to create a standard CV to get all the benefits of standard things in terms of the format, the fields, and the features that our CV should have. Then, the old CVs are static. We have a CV, we print it, and each time you get a new job, you have to create a new version or whatever. Sometimes it happens that a recruiter tells you, hey, I have your CV but it's not the latest one. We want our CV, the 21st century CV, to be dynamic. It should be a link: this is my identity, this is my profile, my CV. Each time it's requested, it's up to date by itself. You don't have a file that you have to update. Instead of being asked, hey, please send me your CV, my link is here and it will always be there. The current CVs are not searchable or indexable. When the recruiters have tons of CVs, it's difficult to search inside the papers and it's hard to find CVs because they are just papers. For example, it's a problem for the recruiters to find, among the old CVs that they have, the people who have Linux skills. There is no way, because they are just papers. The CV that we want must be searchable and indexable. Recruiters can say, okay, please show me the people with Linux skills. Current CVs don't have validations, no confirmations, so I can say I am a Harvard student. I put it on my CV and it's supposed that I'm a Harvard student, but I'm not. We need CVs that can be validated and confirmed with APIs, with links, with whatever, to validate that the information that we are providing is true. And then paper. Now we are using PDF, you know, but the paper has a problem, and it's how do we show multimedia information, right? We have to create a multimedia CV where we can add our portfolio, our videos, our pictures, whatever.
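To make the "show me the people with Linux skills" request concrete, here is a minimal sketch of what a searchable CV collection could look like once CVs are structured data behind a link rather than paper. The `CV` class, its fields, and the `find_by_skill` helper are hypothetical names chosen for illustration; the talk does not define a data model.

```python
from dataclasses import dataclass, field


@dataclass
class CV:
    """A hypothetical, minimal machine-readable CV record."""
    name: str
    url: str  # the stable link the speaker describes instead of a file
    skills: set[str] = field(default_factory=set)


def find_by_skill(cvs, skill):
    """Return the CVs that list the given skill, ignoring letter case."""
    wanted = skill.casefold()
    return [cv for cv in cvs if wanted in {s.casefold() for s in cv.skills}]


# Illustrative data only; names and URLs are made up.
cvs = [
    CV("Alice", "https://example.org/alice", {"Linux", "Python"}),
    CV("Bob", "https://example.org/bob", {"Translation"}),
]
matches = find_by_skill(cvs, "linux")
```

With paper or PDF CVs this query needs a human to read every document; with structured records it is a one-line filter, which is the point being made about indexability.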
So we get a better portfolio. Our CVs are not going to the trash, because sometimes due to data protection rules you have to throw the CVs in the bin, and we get a happy environment because we need to, okay? The second part is the amazing Europass journey and, as I said, currently we have an open standard CV that is named Europass, and it's a European Commission project. So it starts like this, and I don't know if you can see it. Okay, look, it's a Microsoft template. So as you can imagine, this is going to be... Without words, okay? As you can see, this is not the way, you know, everyone should fill in their own CV. But it's a nice list of standard things that a CV should have, you know? So we will see negative things but also positive things, okay? A good idea is to add additional information with your publications, your presentations, and so on. But I guess just text is not enough to show that, you know, maybe you need a link to the presentation or the event, you know? And also maybe I can say that I was at whatever conference, and who knows that it's true, you know? It's hard to find that out. So we need another way to fill in the Europass CV, okay? Europass started in 20-something, I don't know, but the thing is that they realized that they needed a portal, okay? Because the Word template was not enough and nobody was filling in the CV. So they created a Europass portal, okay? And let's see the submission journey, where you fill in your personal information. Then you add your work experience and you fill in your personal skills. For example, one important personal skill is language, okay? It should also be standard. There is a current European standard. And you add your digital skills, and there are suggestions when you are filling in the digital skills. And as you can see, oh, this is the Microsoft Office digital skill, Microsoft Word, Microsoft Excel. I don't know if it's on purpose or not, but many Microsoft things, as you can see.
And I said, okay, I'm going to add all of them because I'm a pretty cool engineer, you know? Let's add all of them. And then more suggestions appeared, okay? And as you can see, Zoom, okay? It's my digital skill. Then Outlook, instead of email. Google, Facebook, as important digital skills on your CV. PowerPoint, okay? And then it started again. I added all of them again. And then it turned to more personal skills like motivated, teamwork, written and verbal skills, reliability. And I said, I'm going to continue adding all of them. And then the amazing digital skill: I am an internet user, okay? So this is my CV, as you can see. I am a pretty amazing engineer. And suddenly I said, oh, I can create a CV based on this information. This is what I want. I want to create my CV with my profile. Okay, so you fill in some information again. Let's build a professional CV. You can add a statement at the bottom of your CV. And I started to add, we will see later on, all my free culture profiles. I don't know why, but I want that information in there, okay? We will see the reason later on. And I created it, okay, and there is another feature. It's that you can upload it to the EURES network. Okay, this is the most positive thing about Europass. EURES is like a shared network between the European Union countries, where people from other countries can ask for CVs from other countries. And if they are available, they can see those CVs. Okay, so I think it's a good idea. But I don't know if EURES is useful. You can check which countries you are willing to work in. You can fill in your occupation. And suddenly, okay, my CV is on EURES. So I don't know if that changes something in my life, but it's there. I leave it as an open question whether it's a good idea, just to discuss in the community, okay? Is it a good idea to have a European Union shared CV database? And then this screen shows up that says, test your digital skills.
So as you can imagine from before, let's see this amazing thing. I don't know if you can see the questions, but it was like a self-test of, I'm able to do this, I'm able to do this, I'm able to do this. Okay, it's a test. Then there are some questions that are, well, you know, but the first one is based on some proprietary software. And the other one, I don't know if it's important for an employer to know, but it asks how many digits the output code has. So then I finished my test. They say congratulations, and I am an advanced user, okay? So as you can see, I got the highest level. And you can generate a kind of proof that you got that level, okay? Actually, they are asking for my name. I don't know why, because I was already logged in. But it's funny, because my level six advanced report also belongs to Linus Torvalds. Okay, you can type another name, and it's... Okay, and then there is a learning roadmap, but I don't know why I got this level, but I got the... I am a master of remote work, really. I don't know how they tested that, but I'm... And this is my percentage. Okay, you get... I mean, in my opinion, this is an important thing and a cool thing. But as you can see, the other steps should improve. Okay, and then the third part is my identity, okay? My name is Paulino Cosa. Thanks for the cause mention, the cause community mention, that I could be for, Inessa. And we will see the parts that a free culture CV must have, okay? Most of them are like in a normal CV. But the most important thing is the metrics. I want a CV where you are able to see metrics about the contributions to free culture communities that you have made. Okay? For recruiters, it's important to know the profile, to know how active the person is in the free culture community. And I mean, in my opinion, this shows a lot of information about our identity and the way we work. One of the most important things in your community is your identity, okay?
It's hard to implement an identity standard... An open identity standard, okay? And we have two options. The first one is Keybase, okay? I suppose that it's well known here. As a pro, it's open source, it's privacy and security focused, and it's easy to install, okay? But as a con, its type of confirmation must be developed, okay? Because we will see that, as I said, the confirmation is important to give trust to our CV, okay? Another point is that the server is not open source, it's not widely used, and it's not focused only on online identity, okay? So the other one, the other way to create your identity, the one that I prefer, is Keyoxide, and there will be a lightning talk about it later on at 1 p.m. And as you can see, this is the YAML profile. And this is one of the most important slides, because my identity is the DNS of my website, my GitHub profile, my GitLab profile, and whatever community, you know? And the most important thing is to generate metrics about the contributions that are made using my community profile. Let's say GitHub or GitLab or whatever. Another thing is the work experience, okay? So you have to have a list with the companies you have worked at, a description, and the possibility to add validations that it is true, okay? And also recommendations. In terms of education and training, it's almost the same. A list of courses, a description of the content, and the possibility to add validations. So I work at Certified, so it provides a web page where you submit the ID of the certification. Okay, that person is certified. It's true what they are saying. Then about the skills. I suppose that it should be mostly free text and multimedia. And we also need validations. And this is a list of some of my profiles in some communities, you know? So, for example, this is my profile. So I need a way to get metrics of my contributions to the Wikimedia environment, let's say. So for example, I have uploaded 140 images.
In OpenStreetMap, I made 49 edits, as you can see in my profile. On OpenStreetMap, you already have a way to get the metrics of your contributions, okay? Here you can see some internal statistics about my OpenStreetMap contributions. Then you get graphs. The problem is that the platform that I'm using is not open source, you know? So the community should build another way to get that data, to show metrics, and to tell more about our contributions. Okay, then this is my GitHub profile, and I'm already including metrics inside my GitHub profile. And there are several ways to get those metrics, okay? The first one is the GitHub Readme Stats project, which is pretty popular. It has more than 51k GitHub stars. And you get the metrics about your contributions and also the most used languages, and this information is pretty useful for the recruiters, okay? Because it tells more about us. Okay, I go to the interview and I say, I'm an amazing PHP developer. But if we then show your contributions and your contributions are not so good, maybe the recruiters get more information about that. Another platform is metrics.lecoq.io, where you can see some statistics. And in the latest version, there are pretty cool graphs and things that are more or less true. But as I said, we have to get that information basically from the GitHub API and provide it in another way. Another is Weblate, where I have 62 translations, you know? And maybe for a recruiter of translators, it's important how active someone is in this community that is focused on translations, you know? And on GitHub, I have contributed, mostly in my work, to 60 projects. And again, we have to find a way to get this kind of metrics. In Open Food Facts, 479 products. And I'm on Codeberg, okay? I have started the Free Culture CV there, which is the repository of the community that I want to build. And finally, the last part is the things for inspiration.
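As a rough sketch of the "most used languages" metric mentioned above: the GitHub REST API's repository listing for a user returns, for each repository, fields such as `language` and `fork`. The function below works on data shaped like that response, with the fetch step left out so the example stays self-contained. The function name and the sample data are illustrative assumptions, not part of any of the projects named in the talk.

```python
from collections import Counter


def language_metrics(repos):
    """Count repositories per primary language, skipping forks and
    repositories with no detected language. Returns (language, count)
    pairs, most frequent first."""
    counts = Counter(
        repo["language"]
        for repo in repos
        if repo.get("language") and not repo.get("fork", False)
    )
    return counts.most_common()


# Made-up sample shaped like a repository listing; not real data.
sample_repos = [
    {"name": "cv-builder", "language": "Python", "fork": False},
    {"name": "site", "language": "PHP", "fork": False},
    {"name": "tools", "language": "Python", "fork": False},
    {"name": "upstream-copy", "language": "PHP", "fork": True},
    {"name": "notes", "language": None, "fork": False},
]
metrics = language_metrics(sample_repos)
```

Counting repositories is only one possible measure; a community-built extractor might instead weigh commits or bytes of code, which is exactly the kind of decision the speaker wants the community to discuss.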
And the first thing is that we already have an open CV format, but it's made by a Spanish recruiting company. But it's based on JSON, okay? And this could be the first point for inspiration. And then we have to create CV builders. Here I left four examples of open source projects that let you create your CV based on your GitHub profile, okay? You enter your information and you get your CV in several formats, a PDF, a web page, or whatever. Then one of the most important things that we have to develop are extractors and converters. We need the extractors to get the information from the APIs to then show it. And we have to create converters to make life easier for the people who have a LinkedIn profile. We need a converter from your LinkedIn CV to your Free Culture CV, or maybe a translator to Europass. I mean, we have to provide ways to avoid filling in the information again, because I already put it in another platform. And in the list of things that we should talk about there are ethical things, you know, about data protection. For example, in Spain, many people put the street, the address, on the CV. I don't know if it's useful or not, but for example, in the UK, it's not possible to add your picture of yourself. And there are other ethical things that I think we should discuss. We have to choose a backend, how we extract the information. Maybe Europass is the best way to start, or maybe we have to develop another option ourselves. Then we have to make a decision about whether we should have a public CV database, whether it's ours, or maybe the project itself is the best way or the best database. For example, Keyoxide is using just your GPG key, so maybe that's enough. And we have to provide a way to allow sharing our personal data. First, maybe we don't want to add the contact details to our CV. But if a recruiter is asking, please send me your email or your phone.
Maybe we need a way to say, okay, I want to provide this information to this recruiter. That would be important. And the last one is whether we should provide backward compatibility with the old CV. Okay, so we fill in all the information in our Free Culture CV. But maybe the old recruiters, let's say, need the old CVs, I don't know why. And maybe we need a CV that could be printable, a PDF. It's important. And obviously we will find other questions on the way. So let's start the community and let's discuss that. This is the best attempt that I have seen at a free culture CV focused on developers. The problem is that the repo is on GitHub and it's pretty popular, but the web page is not working. So maybe a fork, I don't know. But there you can show more about your developer skills rather than Microsoft Word or being an Internet user. So here we see metrics about our profile, let's say, the technologies that we use, fun facts, and our personal repositories. For example, I like this idea a lot, and it's more or less the CV that I would like to have. And finally, the take-home message. We have many reasons to improve the way that we are creating CVs, as you saw before. We need to create the new standard, and the new standard must be community-developed because it seems, I don't know, I'm not sure, but it seems that Microsoft, for example, took part in the Europass definition, and we have to show metrics about our collaboration. So if you are interested and you think this is important, please join us at that community, and let's discuss and let's build something that I think affects everyone and is important for us as a community and also as people, the open source stuff and the open culture stuff. So thank you. Thank you so much. Is this thing on? I think this thing is on. This thing is on. Thank you so much, Pablo. Appreciate your talk. |
Uncover the Missing Link
Creating Clear Linkage between Open Source and Standards |
Charles. Okay. Welcome. Thank you. Great. Thank you. Can we get some applause for Charles, please? Thanks for that. Thanks for being here around the lunch hour, too. I appreciate you're making a little extra sacrifice. And just thanks for being here as part of the community dev room, too, right? And contributing to the discussion here. First, AV question. Can you hear me okay? Is the audio all right? Okay, great. Okay. So, super excited to be here. And especially because I get to talk with all of you about two things that I'm extremely passionate about. The first one is open source. And since you're here in this room, I'm guessing there's a very good chance you share that passion with me. The second one is standards. I really enjoy working with standards. It's a big part of my job. I'm in the standards group at Cisco. Hopefully some of you consider yourselves supporters of standards. All right, we've got a few. And what I really want to talk with you about is what I think is an opportunity for the standards community and the open source communities to work a bit better together and to help each other with navigating across those communities. Because it is very common for many people to primarily associate themselves with one community or the other. So anything we can do to help the other community better understand what we're doing, how we're doing it, how to leverage the resources we have, I think is very helpful. And that's the subject of my talk. To uncover that missing link, the linkage between all the great things we're doing in open source and the great things we're doing with standards, to the benefit of both communities. So I was a little late watching this, but I imagine many of you saw Game of Thrones. Is this a familiar sort of image to you? Even if you didn't, the idea is on the left hand side, I guess to your left, yeah, you have Jon Snow. He's kind of the lead of the people of the north.
On the right, you have this big guy, Tormund Giantsbane. And he's kind of one of the leads of the wildlings that are even further north. And those two communities are separated first by a very large wall. And they really don't trust each other, they don't understand each other. However, during the course of the series, they come to understand each other a bit better, find some common goals, work together, and achieve some greater good. And I think there's some similarity there with open source and with standards, where by working more closely with each other, we can really benefit each other. So first of all, why standards? Why do I think standards are important here? I think for most industries, standards have played a key role for many, many years. For large players in those industries, whether you're talking about telecom or networking or communications, it's mandatory to support a wide variety of standards in order to sell equipment or solutions into that market. And compliance with these standards is important for a number of reasons. Two of them that I want to call out: one is interoperability, so that when you have components from many different vendors, you can actually get them to work together. And the other is to make sure that you never get stuck with a single vendor lock-in where, you know, once you start down a path, you can never use anything from anyone else without prohibitive cost or churn. And as a result of this importance of standards in those industries, fierce competitors come together and actually collaborate quite well on the standards, because they have a vested interest in establishing those standards so that they can add support for them into their products. And it's also important for them to be able to interoperate, because many times they'll have some product or some solution that they want to sell into an existing larger solution.
And, you know, their stuff needs to work with their competitors. In order to enter into this new market, they can't get the person buying their stuff to, you know, start over from scratch, right? You need to insert yourself in sort of incrementally into it. And so interoperability is super important there to interop with not only your competitors but your partners. So, you know, that actually works quite well. However, there's definitely challenges with standards. And one of the main ones is it can take a long time. Just getting a standard from the time you start defining it in a community to the time it's actually a completed published standard oftentimes can take a few years. And then if you think that now once you have that standard, a lot of companies will wait until the standard actually exists before they go off and start implementing it. So that's going to take a little bit more time now to add support for it. And then you go and you take implementations from multiple different vendors and you try to use them together and the interoperability isn't there because, you know, these standards are complicated. There's some ambiguity in how you might interpret certain parts of it or which feature sets from them you support and how well. So it takes a bit longer to really get that interoperability that was the goal from the start. And sometimes that will even require going back and, you know, changing the standard. But in any case, no matter what, it tends to take quite a long time. And, you know, that becomes a challenge because you may not have your standard already and adopted and deployed in time for it to do the thing that you really needed it to do. You might kind of miss that window of opportunity. So with open source, I think we see a very, very different dynamic. We see open source fueling the transformation of industries very, very quickly in many cases. 
And a vast community of people like what we see here at FOSDEM comes together, you know, around some common goals and really can create fantastic software very quickly. And this rapid innovation is so, I mean, it's a very desirable thing. And so I think these industries that have mandated a really needed support for standards for many, many years, they're seeing open source as this great opportunity to speed things up. And so they're saying, well, you also need to support open source. You need to understand how to work with it because, you know, in my network or in my overall solution, I'm going to have equipment from a lot of different people. And some of that, I really think, needs to be open source. So how does your stuff work with these open source components? And sometimes that the open source can become so widely deployed that it actually will become sort of a de facto standard maybe before that standards group ever defined anything in that space, maybe before they were able to complete their work. So open source is really bringing some great things, but I think we would all agree there are some challenges. Open source projects, from the point of view of someone who's trying to build a solution, it's really just like, I think of it as a tool set, right? There's a lot of different projects out there, but none of them offer a complete solution. And so then what you need to be able to do is understand each of them, know how to put them together. And, you know, these are often open source projects that weren't designed, you know, the people working on them really didn't have working with these other projects in mind, right? They didn't design for that. They may not have documented it at all how to make it work with these other things. Or, you know, that documentation may not be great. The other thing is, I mean, in the open source community, it sort of has a mind of its own, right? 
So perhaps you want to use an open source project in a certain way, but if you're not actively contributing to it, it might move in a way that, you know, you weren't expecting or you weren't wanting. And so it may not ultimately meet, you know, the need that you had for your overall solution. And, you know, some projects just fade away, the community loses interest and starts spending their time on something else that looks even more interesting. So this challenge of integrating open source together with perhaps some other open source projects and maybe even some proprietary stuff, you know, that's a real challenge. So where I think there's a lot of potential benefit is by bringing some of these characteristics of standards and open source together. By those two communities working better together, you can really bring some of that, if you could bring some of that speed and collaborative spirit that we see with open source into standards, actually think about creating open source implementations of standards as they're being developed and then be able to feed that back into the standards and improve the quality of them early on, not after they've been published, and, you know, after several years, but actually very early in those, the first iterations of the standards. Then when you do have these standards, if you can add support for them into popular open source projects, then those projects suddenly become much more consumable by the industry that's built these big solutions around all these standards, right? Now all of a sudden, your open source project can interoperate in that larger environment because it does support these key standards. 
And then also, if you have some code that's being developed that implements, you know, either the complete standard or part of it, then when the standard is actually ready, it's going to be much, much faster to deploy it if you already have some running code, right, rather than having to start from scratch, implementing just what was written down on a sheet of paper. So I think you can speed up the time for adoption by having that code available in parallel with the standard. So here is one example that I think helps illustrate the opportunity. It's a little bit dated slide from the Linux Foundation, but I think it demonstrates some of the points I'm trying to make quite well. This is looking at the network automation space. And on the left-hand side, you can see that there are a lot of open source projects just in the Linux Foundation alone that play a role in this space. So it's kind of layered like the networking stack, from top to bottom or bottom to top, you can look at it either way. So you see the different open source projects, sometimes several of them playing at the same level of the stack because, you know, we allow multiple different projects that do some of the same things to coexist, right? I mean, that's very common in open source. So you have all these different projects, and some of them you may look at and say, hey, that project doesn't even exist anymore. That's an example of what I was saying, you know, some projects actually fade away. So if you're trying to put together a solution that involves these different open source components, even if you're taking them all from the Linux Foundation, you still have a fair amount of integration that you're going to need to do. On the right-hand side, you see, similarly going up and down the networking stack, the standards groups that play a role in that space.
And where I think there's the greatest opportunity for collaboration is going east-west across, right? So with standards organizations like, for example, the IETF playing at the same level as perhaps an FD.io or an OVS or even Linux, you start to see the opportunity: something that's being done in the standards area needs to be implemented, and probably is implemented, in these open source projects. So to the extent that you can make that interaction and collaboration across these communities better, you can really help both sides. So as an example where I think we're making some progress here, I mentioned the IETF just a minute ago, and that's one standards organization that I'm involved in. It's the Internet Engineering Task Force, and it was created back in the mid-80s with the goal of making the Internet, really, all the protocols on top of which the Internet's built. And today, the goal remains to make the Internet better and over time to refine and add new protocols that are needed. So household names like TCP/IP, DNS, DHCP, these things that we all end up using every day, those all came out of the IETF. And I think the IETF is a good example here in that, if you're not familiar with it, it behaves a little bit more like an open source community: when you go there, you participate as an individual. There's no company affiliation. It's free to participate. Anyone can do that, much like you often see with open source. And this quote on the right-hand side, I think, helps pull that out. You see, you know, we reject kings, presidents, and voting. We believe in rough consensus and running code. And that idea of running code was really important to the IETF early on. I think maybe there was a drifting away from that over time. And I see there being a lot of potential in getting back to that idea of running code that's related to standards.
So one thing that the IETF did, I think as a realization that its processes and ways of working need to change, and to try to make it a little bit easier for developers and the open source community to get involved, is taking a look at the way that standards are actually written and developed over time and how consensus is built around them. Traditionally, it's been done using a pretty arcane set of tools that have been built in the IETF over time. And if you've been working in the IETF for a long time, you're probably pretty comfortable with writing your drafts in XML and then converting them into this RFC format. But if you're not, that's kind of a barrier to entry. And so there was actually a working group defined to say, hey, how can we make our processes work a bit better to meet developers, to make it more developer friendly? And so what this GitHub Integration and Tooling working group did was define a way that IETF drafts could be posted and worked on in a GitHub repo. And instead of using XML and having to translate, a draft could be written in Markdown, something that many developers are pretty familiar with just from writing their own READMEs. And so now the drafts, although they're still really just text, are stored in GitHub, and you have the version control around them and you can do pull requests and all that. So it's really meeting developers where they are, making it easier, more comfortable, and more straightforward for them to contribute by using things that the developers are already pretty familiar with. And you can see, maybe it's a little small, but I'm showing you the working group page for the draft, and there's a section called Additional Resources. And what this shows here is a newly defined additional resource, which is the GitHub repo where the draft that people are collaborating on is stored. Another thing that's been done, and this is where I've spent a lot of time, is running the IETF hackathons.
And the idea there is to do exactly that thing that I mentioned earlier of speeding up the pace and relevance of IETF standards by implementing them in parallel with defining them and taking the things that you learn from implementing them and working them back into future iterations of the draft. So I think that way you end up with a higher quality standard in a shorter amount of time. But another goal is to, again, attract developers and kind of new participants from, say, universities, younger people in general into the IETF by creating an environment that's a little bit more interesting, perhaps, to contribute. Rather than reviewing drafts and commenting on them on mailing lists or something like that, you can actually use tools and techniques and code and contribute in that way too and point out where maybe there's a problem in the standard or a better way to do it by actually implementing it and using that to help build your feedback. And these are free and open to everyone and very collaborative. Some hackathons are competitive, but this is really about learning, moving things forward and improving all standards that the IETF works on. So it's non-competitive. And on the right-hand side, I just have a chart. You can see we ran the first one, I think, back in 2015 and attracted about 45 people, and I thought that was quite successful. Then a couple of years later, after having these at every IETF meeting, it's now common to have three to 400 people at one of these. And if you think of an entire IETF meeting being about 1,000 people, that's 30 or 40% of the people are coming and actually spending the weekend before it to work on code together, which I think has had a huge impact on the IETF. So another thing we did then, as a result of these hackathons, was, well, let's create a GitHub org and we can store the projects that we work on in the hackathon and we can store them there. 
So in case the project wasn't already hosted somewhere else, oftentimes we'll put the project in this GitHub org. And that makes it a little bit easier then to find code related to standards because you can go look in this GitHub org. So at least you find some code that's related to standards. You can find it there. So that helps out a bit too. Another thing we've done is, we always have a wiki that lists the various projects for an IETF hackathon. And this is a good example where they've done an important thing where you can see the key IETF drafts that the project is working on. There are links to them. You may not be able to read them well, but you can download the slides, I've uploaded them. And you can see those. And then down below, it's talking about where the code is that they're writing. And in this case, it's a GitHub repo. They're working on VPP. It's a network processing open source project. And they're implementing these drafts in VPP, which is a great example of the type of thing we want to see. So right here in their project description, they're showing you the drafts and the code, which is a nice way of linking those things together. So we made progress, but it's not all rainbows and unicorns yet by any means. And so I think we're just scratching the surface here with what we can do. If I go back and I show you this GitHub org that we have, there's maybe 20 projects there. It's probably 1% of the code related to standards that exists in the open source world. So you can get a little bit of code that way, but it's really just a few things. And this wiki, the problem with this is if you look closely, you'll see that one of the draft names has even changed over time. So they're not maintained. People use them extensively in the lead-up to the hackathon, during the hackathon, and for maybe a week or so after. But they're not maintained. This isn't really a good way to, over time, find the linkage between code and drafts.
So I think there's definitely things that we need to do to improve. So what I did was, for those of you who work in the IETF, if you want to improve something, what do you do? You write a draft. So I wrote a draft about more systematically going and creating this linkage between IETF drafts and code that's related to them. And to set up a process that you could think of becoming a standard way of doing this. And so that everyone keeps this in mind when they're writing their drafts and points out where to find the code. So earlier I showed this additional resource that you could have point to a GitHub repo. What we've done now is we added another type of additional resource, and that's a related implementation. And so now what you can do is when you're writing your Internet draft, you have an idea, you can point to relevant code, whether it's an implementation of, say, the protocol you're defining. Perhaps it's a tool that helps you sniff that protocol when it appears on the network, right, to sniff it on the wire. So anything that's going to help with implementing, supporting, deploying, maintaining, operating the standards that you're defining, you can point to them here. And all you need to do is you can go and you can edit those resources and maybe, again, a little hard to see, but the first one is pointing to the GitHub repo where the draft is stored, and the second one is pointing to a related implementation, where, in this case, because it's sort of a process document, the related implementation is actually pointing to another RFC, which is about how to run the hackathon. So maybe not the greatest example, but I wanted to show how you use this. Now, applying it to a more real-world example, I'm going back to that hackathon project I showed you before that has the links to the Internet draft, and then it's pointing to a GitHub repo.
So you can see here, I'm looking at the draft here that's highlighted in red, this one, the IETF OPSAWG IPFIX SRv6 draft, and if I go look at it in the IETF Datatracker, I can see there's related implementations defined for it down there at the bottom, and if I click on that, it'll actually take me to that GitHub repo. So now I'm not relying on the wiki anymore and something that's not being maintained. Now, actually, I'm putting the linkage between the code and the draft right in the tooling that's used to work on the IETF draft. So in this way, it's a much more stable, easier to find reference for everyone, whether they knew about the IETF hackathons or not. And so I think that's the type of thing that's starting to gain traction in the IETF, and I'm trying to get it to be much more common, and perhaps there are even better ways to do it, so people can also comment on the draft that I showed before if they have a better idea of how to do it. So I guess my ask to all of you is I hope that you see some value in creating this linkage between open source and standards, and that you can help do your part to create that linkage. So if you're like me or others in the standards community, if you're working on internet standards and you happen to know of some open source code, create that linkage, show people where there's code related to the standards. Perhaps you're working on an open source project, though, and you know that project very well, and you know you've implemented some key standards in it. So make sure that that's well documented in your README and easy for someone who's not familiar with your project to see that, oh yes, you do support these standards, because that might hopefully make your open source project much more valuable and easier for them to consume and to use for their purposes.
So by making this linkage easier for people to find and use, I think we can make standards much more consumable by developers and the other way around, the open source community and the things that we build there become much more consumable by these industries that have relied on standards for a number of years. So with that I thank you for your time. I maybe have a couple minutes for questions still. You absolutely have a couple minutes for questions. Excellent, I'm on my way. And I'd also love feedback on the IETF draft. If any of you are like, hey, I want to provide comments on an IETF draft, I've never done that before. This is a good opportunity. Thanks for this pitch. It was really enlightening. I think there's really a link between open source and standards. So my question is, in one of your previous pictures, you were showing like standardization organizations and open source. And I've seen that there was like the image of OpenConfig on the open source side. Let's see, not this one, right? No, I think I know. One of the initial slides you were showing. This one maybe. Yeah, so I recently had to develop something related to the OpenConfig group, right? And somehow they are also defining a standard. So what do you think about this? Yeah, that's a good question. So that's an example of sort of a community that's not traditionally maybe considered a standards group. But they're defining their own standards within it. And it has been a little bit of a challenge, I'd say, historically, because you had some standards being defined in IETF and kind of alternative standards rather than being argued within the IETF and used to influence IETF standards. It's kind of standards in a similar space, but being done by another organization now, by OpenConfig. And so I guess in some ways it's similar to having multiple open source projects that are trying to do the same thing.
So I'd say the one thing is, as long as OpenConfig is open to community consensus and taking input, then ideally we'd have one set of standards, but if we have two and they have different communities working behind them, perhaps they each meet a separate need. So it's not necessarily a bad thing. The important thing is that both communities are very open to input from their members, and hopefully they collaborate a bit where they can. So you don't see anything against this? I'd say it's not ideal, similar to competing open source projects existing, but you can't always force everyone. That's kind of the beauty, I guess, of open source, is you can go off and explore different things. And that's kind of what's happened here. It's like a fork, almost. And whether they come back together at some point, I don't know, probably not at this point, but who knows? Okay, so we have time for one more question, and I see one hand. Great when it works out that way. Yes, thank you for the talk, for these implementations linked to a specification. Would you preferably distinguish between an implementation and, let's say, a reference implementation? Yeah, great point. That's actually one of the challenges we had in the IETF too. I guess one of the concerns was perhaps pointing to some code that is, you know, just a little bit of a science project, or a quick demo, maybe even at one of these hackathons, as opposed to something that is a full-fledged, hey, I can go off and use this in my mission-critical software. So I think the important thing there is in the README. I haven't really come up with a better solution than that, than to say, hey, document what this is. Are you just getting going? Is this a proof of concept? Is this something that you want people to contribute to? Then make sure you have your contributing file there. So it's kind of just letting people know through the README, I think, is the best way to go. Thank you so much, Charles.
Very much appreciated. Thank you. |
Just A Community Minute |
Before we get started, we actually have an empty chair, so anybody want to raise their hand to be a panelist? Contestant? Catch his name. Hi, I'm Shirley, and this is the first time we thought we'd try this out in the community dev room, because you know, everyone's sort of sitting around all day learning things, which is a great, great thing, but we thought we'd try to have some fun as well as engagement, so we'll see how it goes. A little bit weak in here, it is lunchtime, but maybe nobody likes to have fun, we'll see. So, Just A Community Minute. Some of you might have heard of the game show Just A Minute. It's a British game show from the 60s, a radio comedy show with a panel of folks, and essentially you have a couple of topics that you'll throw at them, and our esteemed panelists will have 60 seconds to answer or talk to the topic that they're going to be presented with. So if you have heard of Just A Minute, it was originally hosted by Nicholas Parsons, he hosted it for about 50-something years before he unfortunately passed away, and then Sue Perkins has recently taken over, you might recognize her from The Great British Bake Off. But again, rules of the game, each person goes, when it's their turn, they get one minute, they have to answer on the topic with no hesitation at all. So the three golden rules, no hesitation, no repetition, no deviation. So for example, if the topic provided is bananas, Dawn would have to talk about bananas for a full 60 seconds, never repeating the word banana, never deviating and talking about pineapples, and not hesitating. If any of those three rules is broken, her fellow contestants hit the buzzer, and unfortunately this was the cheapest thing I could find that didn't have wires, so you can test your buzzers now. By the way, those are children's learning tools, 24 bucks on Amazon.
So I should say, sorry George, I didn't have you in time, we're not quite sitting in order, but we have Spot in the middle, sort of ish, Dawn, Ray and George at the end. So let's get started. We'll start with Dawn. Once I say the topic, we'll start. Can you hear me? Yes. Okay, so Dawn, your topic is things that make people crazy. Things that make people crazy. There are lots of things that make people crazy. Spiders, I want to mine, absolutely bonkers. And I think a lot of people, crowds, like we've got here at FOSDEM, I think those often make people, not the word I'm going to repeat. And there are lots of others too, I think maybe birds. You know, there was that movie, The Birds, that I think also makes people kind of bonkers. Horror movies. Sorry, does it count as a bonkers? Yep, it does. Okay, all right, very good, I'm glad I'm done. So now, so the mic goes to George to continue the rest of the, yes, curve ball, you have the remaining time on the topic, things that make people crazy. Well, I was thinking about things that make developers crazy is when you do a lock and you don't do an unlock, that makes you crazy. It makes you crazy. You said crazy. Yes, you can't say more than once. If it's the topic, you can't say it more than once. Seven seconds remaining. I think the other thing that drives me crazy is long lines. Yeah. All right, mic goes to Spot. All right. Your topic is beer. Beer. So last night I went out and had far too much of this, this liquid goodness that is flowing all over Brussels. And unfortunately, no one bothered to tell me how strong some of them were. And I may be feeling some of the repercussions of my experiences and I apologize in advance if this has impeded the verbosity of what I've been able to deliver to you fine fellows.
And I want you all to know that you should not take my example as an action not to mirror, because I think it's important that you go out and you try all of that beer because you will not know what you're missing if you do not take this opportunity. Oh, should have done that. So I actually don't drink that tasty beverage. It's one of the only things that I will not drink. Alcohol is fine. I just, for some reason, I just can't, I can't get into it. So, so being here, I think is maybe kind of wasted on me. My partner, though, comes with me and he certainly imbibes it all. Oh, Ben, since you don't drink beer. All right, the mic goes to Ray. So just to switch things up, we're going to have the audience pick a topic. Anybody want to throw Ray a topic? Books. Books. I enjoy reading them. But my problem is I have a short attention span. So it takes me a long time to finish consuming the content. I wrote one of these ones a long time ago and one of the things that I learned from that experience was that having an incredibly good editor is key to success. And if you don't have one of those, then you find yourself in a position where you hate everything about it. You don't want to look at the computer. You don't want to look at the horrible interface that O'Reilly has built for you to put these words into play and so much pain and suffering and you wake up in the morning and you think, why in the world did I ever agree to this and why did I sneak into the O'Reilly party in the first place? What is wrong with me? But I think that that should not discourage any of you from the audience from doing that. You should take the time to write. Oh, then. All right, George, your topic is DevRel. DevRel is a thing which I just heard about today as I was looking at what a developer advocate was. But I've been in community management for a long time and I think certainly it's an important topic. Developers are very important and it can often be frustrating when you have new developers. 
I mean, it's a scam and I'm here to tell you that we need to put a stop to this before it takes out any more innocent people. The victims are crying in the streets. Why do we continue to perpetuate the idea that this is something that anyone should be trying to do in any sense? Because it's not, it's bad, it's harmful, and I'm not afraid to say it on video. I think that we need more people who are willing to speak up against this. Stop the madness right now. Nobody wants to follow on? I'm still having problems figuring out what the delineation is between DevRel and the marketing community. Easy point for Ray. All right, we're going to go back to Dawn. How to cook eggs. So I'm a vegan and so I don't actually cook eggs, but I have seen people cook these things. When our friends come over, so my partner has been friends with people he went to uni with 20-some years ago. They all come over on New Year's Eve and we did make eggs and sausage for the team and my partner cooked these things in the microwave, which I thought was maybe a little bit weird because I'd always seen people, you see people on TV and they put them in a pan and they cook them for a while and then halfway through they flip them over and then they wait a little bit longer and then they put them on a plate with some toast and some probably meat. I don't know because like I said, I don't actually eat these things, but I think that when you cook them you also put like oil or butter in the pan and I think that's part of the process. And then you can also serve them, oh good. Did she get the whole minute? Well done. I think I said eggs twice, but nobody beeped me out. I got away with it. Nobody wanted to follow on that. I was waiting for it, but after a second time. Alright, Spot. Anything about nachinis? You have to catch one another. Spot, you're up. The internet. I keep this in a box on my desk because if I didn't then what would the repercussions be? Won't anyone think of the children?
I know I'm harping on this today, but I think it's important that we all factor in the youth when we look at the applications of the internet. I mean really, what good has it brought us? It did not bring us here together. Well, maybe it did, but it did not give us ideas. Well, it could have, but these things are not important. What is important is that there is a square feature that involves cables and blinking lights. And if we don't have those blinking lights, come on, somebody beep me out. Thank you. Is it a good thing or a bad thing? I haven't figured it out. It's been around. Everybody uses them all the time. Are we addicted? I'm not sure. It's basically a series of tubes that go from one place to another and they're full of data and goodness. And also, okay. Well done, George. All right, Ray, it's back to you. A technology you love. Technology I love. Wheels. Shoes. Lights. Cars. Batteries. I've been fascinated with, I went to an energy track yesterday and they were talking about like EVs. It was pretty interesting how you can not only charge your car, but discharge it and make money. And I was amazed that this person in the Netherlands only spent 200 euros to charge his car for like over 10 months. And the Netherlands is not a place that you think of as having a lot of solar energy supply. But I was pretty impressed with something. I just bought a vehicle with a battery. So that's something I'll have to think about. But he did have to do a lot of hacking with open source software. But I thought it was pretty amazing and commendable. But hopefully that kind of platform will be available for a lot of people. So, is that a minute? I think we're already getting tired. All right. George, yours is free software. Apparently there's a song about free software, which I've never heard and I'm hoping maybe someone will sing it to me at some point.
But this way of developing software is based on the idea that everybody has a right to modify the software that they're using. Which I've always thought was kind of a strange thing to say. It's a right, that it's wrong somehow to withhold software from somebody. But I do believe that it is a better way to start making software. And we wouldn't be here in this conference if we didn't all believe that. And now I'm trying to think of something else to say. And I'm hoping someone will beep me off or something, maybe. No, but it's... The song, I think it goes something like this. I asked ChatGPT to write me a free software song. So without free software, I guess I wouldn't have a job. So obviously I'm glad. I got to meet a lot of cool people, made good friends. So, there you go. Back to you, Dawn. I will not give you any other food-related topics. How about FOSDEM? I have a love-hate relationship with FOSDEM. I come here every year. On the one hand, we have the crowds, which are low on the list of things that I like about this particular event. On the other hand, it's the place where everyone I know finally comes together. So I go to lots of different conferences. But this is the place where the whole community is in one place. And we have a community dev room. And Leslie always makes me delicious vegan cake while we're here. And it's a lot of fun. I get to see people. There are lots of really good places to eat in Brussels. So I don't drink the beer as discussed earlier. But there are two really good Ethiopian restaurants and an African tapas restaurant, all of which are high on my list of delicious places to eat in Brussels while I'm here at this event. And I spend a lot of my time at fringe parts of the conference. So CHAOSScon on Friday was a lot of fun. And I like to go out to dinner. Oh, good. Spot. Bob Dylan. No one wants me to sing. I promise you this.
I can give you 150 reasons why. The first reason is that I actually can do a really good impression and no one needs to hear it. But if I have more of that beer that I was talking about previously, it might come out. And I apologize in advance because it's not... I know a lot of people are huge fans of the artist. I mean, I don't happen to be one of them, but obviously I have a lot of respect. You know, cultural icon, epic performer from what I've seen and heard, and a lot of followers. No one speaks your mind. No one wants a part of this. So fun fact, I was raised by hippies. So there was a fair amount of this artist playing around the home because they liked sort of that folksy guitar organ. So Dawn and Ray are tied points-wise. Well, and so are Spot and George. Tied for last. Yes. Open topic, go ahead. Is there a free software song? I haven't heard it. I don't know if it's just made up or maybe there should be one. And maybe somebody should start a GitHub repo for a free open source software song, and maybe somebody can compose it, we can contribute to it, make it a medley. And what kind of license should it be under if there is a song? I don't know if there's a license even for songs, but maybe somebody knows, but maybe there shouldn't even be a legal aspect. I don't know if we have one. There was that SUSE chameleon one a while ago where they dressed up and there was a video and everything and it was based on What Does the Fox Say, I think? So there is that melody floating around, but I also don't know of a particular musical number. Was that pausing? Okay. Oops. What kind of, what kind of would this be? Would it be like those communist ditties that they... Alright, George, since you're holding the mic anyway, frites, like fries, French fries otherwise.
So frites are made from potatoes and they are fried and I'm not sure if they were invented in France, but I do remember that one time I was driving, I'm from the US, I was driving across the country from Michigan to Oregon and I stopped at a place in the middle of Wyoming and looked at the menu and on the menu they had freedom fries. They said menu twice. We'll never know what the story was. I brought up a good point about whoever invented the food. I mean, I always wondered the same thing. Whichever civilization first invented this... Now I'm from the south. That was a good win. Alright, mic goes back to Dawn. Your topic is open source. Open source. I have made a career out of this technology and have been working in it for more than 20 years. So it's been a really, really long time. I really like it. Big companies just pay me to do this. So I get to hang out with really cool, fun people. I get to travel the world and go to conferences and give talks and hang out at places like this. And I get to work on software that anybody can see. It's fun. The people are fantastic. It's an amazing community. I get to come to this dev room every year at this event. I get to go to other events. I don't understand what the big deal about open source is. I mean, I went to the doctor and my doctor was very clear that this was not something that I should keep having and that I really ought to use the cream to just clear it up because the... Since you're holding the mic anyway. Penguins. A penguin bit me once and it made me want to go and write code. And I don't know why that is, but I think it might have been sick. And I'm not an expert in animal diseases in any real sense, but that doesn't seem like it makes any real logical progression. How would that animal infect me with the desire to be at my keyboard, but I really need to spend less time at the zoo.
And if I was able to be in other places, maybe a museum, maybe a bookstore, possibly a restaurant, there would be less risk of being attacked by a feral creature. I find myself in these situations far too often and I think that my life choices are generally poor and I really regret so many things that I have done and this is becoming a confessional and I hope that you all can appreciate that I'm baring my soul to you right now all because of a black and white thing and its tendencies. Alright, we're getting closer to the end. Dawn and Ray still tied, Spot catching up, one point behind Dawn and Ray. So once we get to George, we'll do one last round. Oops, did we drop them? I thought you had the points of score. So if someone challenges somebody else and they finish the time, it's one point. If you do the full minute like you have, you get two points. Alright Ray, community building. Community building, so I've been in this since 2014. When I started, I think there was a lot of discussion about why does this job even need to exist. People didn't understand what the value was and I think that's obviously changed. I don't have to justify my existence anymore like I used to but now the challenge is like where in the organization does it belong? It's like an existential question. I think the contribution to the company and the bottom line is pretty clear but people don't know who my boss should be, what the metrics should be. But like everything else, I think it's just constantly evolving and hopefully will continue to grow. It's nice to see dev rooms like this that are very popular and people want to talk about them and discuss them and debate them. Wow. Thank you. Alright, does anyone have a topic suggestion for George? Okay. Alright, the cloud. The cloud is a very fuzzy concept that was invented.
It was first described, my understanding was, in 2006 with Amazon Web Services using Xen, which was the first open source virtual hypervisor which was available for them to use, and it spread from there and it spread from computing to data and storing data on someone else's hard drive instead of using someone else's computer and then to cloud documents like Google Docs and then everyone thought that software as a service was a great thing and a way to lock people in and prevent free software. So now you can't modify the software that you're using. Alright, last round. Dawn, your topic is travelling by plane. Oh, travelling by plane. This is one of my favorite things to do. I actually probably prefer travelling by train but sadly, so I live in the UK. We have lots of rail transportation but I can't get back to see my family. I can't get back to the US where a lot of the cool conferences are without travelling by air. So I fly Delta a lot which is nice because I have status and so I get extra things. I get to sit in the lounge and drink free champagne which is pretty cool and I spend a lot of my time in these little tubes where I still wear a mask because they are just gigantic germ factories with people breathing all over each other for 10 hours straight. The food is terrible but on those international ones, they give you... Oh, thanks. Okay, Spot. Crabs. Well, my doctor...no, I'm not going to do that. Recently I was in the position of having a lot of money at my disposal and of course the first thing that came to my mind was crabs. So I went on a quest to find a vendor who would be able to take a design that I had found on the internet that was very popular, it was very orange and pinchy and I said to them, can you make a lot of these? And then they responded, who is this? And I said, it does not matter, I have money, just do it and they did.
And a year later, these boxes started to show up in the warehouse and they were full of plushy goodness, the kind that makes you think, am I at a seafood restaurant by mistake? But you're not, you're in this place where there's shelves and inside the plastic wrap its little eyes look at you and you can't help but feel abject terror because you know that this should not exist and you have done something that is against the laws of nature. Oh my God. Alright Ray. I'm related, I have scraps. GitHub. GitHub. So a lot of things to say about this. I've been using it for the past two years and before that I worked at a competing platform, GitLab. So it took me a long time to get used to the new interface and I struggled quite a bit with it, but now I guess after two years I can't even tell the difference between the two platforms. It's kind of ironic, a lot of open source software is hosted on this platform but the platform itself is not open source so the irony continues. But you know, I'm not a, I don't get religious about a lot of things including like whether something should be freely available with source code or not because I use a lot of Apple products and they're obviously not open source. You said that at least. Alright, well that kind of concludes unless you want to keep going. People need to like start throwing some topics at our panelists. If not, our winner today is Dawn. Followed very closely by Ray and Spot who caught up. Doesn't George get one more? Oh sorry, sorry George. So yep, one more. You're right, you're right, you're right. Absolutely. On the topic of that, how about lunch? Not us going to lunch, which we could, but topic is lunch. Yesterday for lunch at FOSDEM I knew that there would be long queues and so I went to the grocery store and bought some food ahead of time and that was very lovely. When I became hungry I went outside and I opened my bag and I opened the packages and I consumed what was inside the packages and I didn't...
Well where I come from there's this thing that we do when we're hungry and it's important that you get the deep fryer out to start with because if you ain't going to burn your neighborhood down you're doing it wrong so what you've gots to do is you've got to put the oil in and then you find the food and then you put the food in and then you... Today however I was unable to go to the establishment that sold nourishment. Well done George. Well thank you all for playing along. We thought we'd try something different this year. We'll see if we do this again next year. But Dawn's still the winner. Tied by Spot and Ray. I did buy a prize for each of you, however yours isn't vegan so I'll have to get you something. Or there's more cake. Otherwise thank you for playing along. I know that we ended a bit early but maybe... |
Nurturing, Motivating and Recognizing Non-Code Contributions |
Welcome back. If you thought it was quiet before, it's not anymore. I will now remember to remove my mask so that anyone who may be following along on the livestream and lip reading will be able to hear my comments. We're going to go ahead and slide back into our sessions for the day, and I am very pleased to introduce Alex to talk to us about nurturing, motivating, and recognizing non-code contributions, a subject near and dear to my heart as an open-source community member who is not a developer. So if we could get a round of applause to welcome Alex, that would be great. Hey, nice to see you all. So let's start. My name is Alex Abramova, and I have worked at Percona for two years in the Percona community team, and I'm really inspired by the idea that open source can be open to everyone with a diversity of experience, and not necessarily a technical one. First, a couple of words about the company. Percona provides best-of-breed support, consulting, managed services, training, and free and open-source software for MySQL, MariaDB, MongoDB, Postgres. So we make applications and databases run better. Please check our website for more information. I'm sure you can find everything for you and for your applications. Also, Percona is a remote-first company with 350 employees now in almost 40 countries. I think it's great. Percona has gathered a lot of experience in working with open-source contributions, and today we will talk specifically about non-code contributions. In particular, we will look into different types of non-code contributions, and we will discuss how they can provide value to your project and team, and how to engage and empower contributors. When we talk about open-source contributions, nowadays we still often associate them with coding, with tech geeks who contribute code and understand all those strange words like fork, pull request, and so on. But the open-source world is very extensive and diverse, and I believe that everyone can find their place there.
Even if someone doesn't feel confident with coding, there are still lots of things to keep them busy. So, what are the types of non-code contributions? They include technical writing, copy-editing, translations, and everything to do with text; testing and reporting issues; designing; and advocating and social media coverage. This list is not an exhaustive one, so if you have any ideas, we can add them here too. Just use your imagination. And we will look into these types a bit deeper. First is technical writing and copy-editing. Professionals who are good at working with text can have a lot of work to do. For example, they can help with translations and localization to expand your open-source project to different markets. Maintaining documentation is also a source of worries for many open-source projects. A lot of stuff might be needed here, from simple how-tos to updates about new releases and new features. Blogging: this type is my favorite one, actually, and it includes help with spreading the word about a project you might love. For example, posting technical content on your personal blog or on different blogging platforms about the experience you gained installing software in different environments, about basic configurations for beginners, best practices, tuning and monitoring of the tool, life hacks for experienced users, how this tool helps you in your profession or in your business, possible alternatives to this tool, or digests of different tools. Next is testing. It includes providing valuable feedback from the position of the end user or of a developer who works with the tool: for example, reporting issues, describing unexpected or strange behavior, reporting bugs, filling out surveys, and also creating feature requests and sharing your ideas about future project development. Designing.
Well, here, depending on the professional focus of an individual, there is a wide range of involvement, from UI improvements to banner and promo graphics design. Not all open source projects have the resources to hire a designer and involve them in development, and for a contributor, it is also a good way to add interesting projects to their portfolio. Advocating. So, there are lots of things that can be done here. It includes sharing posts on social media, on Twitter, Facebook, LinkedIn, whatever, recording YouTube shorts and videos, and streaming about software you like. If a person has a podcast, they can invite project maintainers to discuss different aspects of the project, its value and its prospects. Active speakers can also include in their presentations mentions of a project they like, and it all helps to grow open source projects, especially small ones. And also, leaving reviews on software marketplaces can be included here. So, let's talk a bit about why you need non-code contributions. First, they give you resources to fill in the gaps. It's not a secret that developers often struggle, for example, to maintain documentation, or you may not have resources for proper testing, improving usability, or localization. Well, coding is super important, but it's not always enough. Second, they help you to become more visible. For example, this can specifically apply to reviews on software marketplaces, where people come to choose software for their development, for their pet project or student project, and not only pet projects. When you see how people share pictures of t-shirts with your project name on Twitter, how many reviews you have, when you see how people use your software and create their own tutorials and explain how they use this software in their use case, it's very, very valuable for a developer team to see that. Then, last but not least, they do make the world a better place for everyone.
They open doors for those who didn't have opportunities to obtain systematic technical knowledge, for different reasons, maybe economic, personal, whatever. For example, I myself didn't obtain a technical education. I have a humanities background, but now I'm giving a talk at the biggest open source event in Europe. So, open source did a lot for me, and I believe that everyone who is interested in technology can follow my path. Thank you. Okay, so, benefits for non-code contributors, what can they be? There are many. They can show their professional skills and obtain diversified experience with different projects to enrich their portfolio, connect with the community, which is very important, and interact with passionate people. And I think it also brings lots of fun into your life. So, how can we involve more non-code contributors? And how do we work with them at Percona? Offer a small reward for completion of particular tasks that you need. For example, at Percona, we launched several campaigns for leaving a review on software marketplaces, meaning a technical review where a user describes their experience and their use case. And we offer t-shirts with our logo or mugs, delivered worldwide. People also tend to love $5 Amazon gift cards. And you can launch campaigns with clear requirements describing what you want to see, and a deadline to motivate people to participate. And then publish it in a blog post and share it on social media. We also reward those who report security issues. At Percona, we have a forum. And we also reward active users of the forum, people who help others and answer the questions of the community. If you don't have a forum but have a Slack channel or Discord channel, you can also do the same. Be creative and find what's important for you. To send swag, we use Shopify, Printful and Printify. Printful and Printify are providers of t-shirts and all this stuff. We just add our logo. They have very simple design tools.
Even if you do not have any design skills, you can do something simple there. And then we send them worldwide. Also, participate in different challenges, like Hacktoberfest. In 2022, last year, we put a special focus on non-code contributors. Percona repositories also participated in Hacktoberfest. We had about 20 contributions, and half of them were non-code contributions. For your swag gifts, personalize them if possible. We also do this in Printful. We add these nice messages: "To Alice, from Percona with love", "To Matt, from Percona with love". And actually, it costs us an additional two and a half dollars. And people love to see this. When they get them, they take pictures, they share them on Twitter, they send these pictures to us. We collect them and also share them. And actually, it means they have not only made one contribution, like someone who fixed a typo or reported an issue. When, for example, he got his t-shirt and started sharing it on Twitter, and we didn't even ask him, it made his contribution even larger. Make their work visible and document it for history. Publish posts on your blog, on blogging platforms or on social media. For example, here we published a blog post where we recognized every contributor and described who did what, how they contributed to Percona repositories during Hacktoberfest. And we love to post such things because people love recognition. We also publish all contributions, community videos, and articles about our repositories and our software on our community website, Percona.community. We collect them in lists. We share them on Twitter. Here is an example of a tweet. Percona Bytes, this is our community Twitter account. We tag contributors if we can find them on Twitter. And people love being recognized like this. What's important, of course, is providing clear instructions for inexperienced users. How to contribute? How to publish a blog post on your website? How to contribute in different ways?
And we also try to make sure that we leave our email everywhere we can, so people who read this post but had issues posting, had questions, or had doubts could easily find our contact information, and then we could guide them through the process. Our website and blog use the Hugo engine, and I believe it's quite easy for everyone to use; even I coped with it. So give a list of ideas, a list of things you believe your project needs right now. Just write them down and post them openly on your website, on your blog, so people can find them and understand what you need right now. Also, we love to host streams and podcasts and provide a space for people to come and talk on different topics. They can include their career path, open source trends they see, everything, not necessarily just coding. And to host streams we use Restream, Riverside and Pubbin. If you have a community advocate program, make sure you include non-code contributions in that and recognize those contributors too. We have a custom-written dashboard which helps us to optimize the process of rewarding people who made non-code contributions, who posted blogs and videos or mentioned us in their talks. And we also use a little bit of love for that. And the last thing, but not the least: in your messaging, emphasize why it matters, why non-code contributions really matter to your project. Explain that it is a contribution indeed. Recognize it as a contribution, because non-coders tend to underestimate what they do. They think like, okay, I just fixed a typo. Everyone can do that. No, not everyone. Not everyone will do it. Not everyone will stop by and spend five minutes on your project. So thank you for your attention. Percona is hiring. You can check out our website. And also, this is the nice part of my presentation: you can participate in a raffle. Scan this QR code and you can win a ticket to the Percona Live conference in Denver, which will take place in May.
You need to scan the QR code, fill out a simple form, and we will choose one winner randomly after that. |
If it’s public money, make it public code!
How to effectively push for Free Software all over Europe |
If we could all have a moment of discovering our desks, and without further ado, may we have a round of applause to welcome Johannes. Thank you. Yeah, thanks everybody for stopping by at the community dev room, or for staying here. I'm glad to see so many of you. Maybe some of you saw my colleague Lina's talk this morning at the public code dev room about the PMPC campaign in Europe. So in my talk, I will focus more on the PMPC campaign and how you can get involved in it, what you can do with it at your local level, in your region, in your country. Yes, my name is Johannes Näder. I'm working for the Free Software Foundation Europe, and for those who don't know us, the Free Software Foundation Europe is a charity that empowers users to control technology. As software is deeply involved in all our lives, it's important that it empowers us rather than restricting us. And we think that free software is crucial for that. We've been around for more than 20 years, have been campaigning legally and politically for free software and open source software, and we rely on a strong community across Europe. So to give you a brief outline of my talk: first I'll talk about why it makes a difference, why free software really matters, not only for your daily coding work, for your daily computer work, but also in building a sustainable and digital Europe that's ready for innovation and freedom. A lot of the things that I'm going to tell you in the first part of my talk will be familiar to you, but I think it's good to put them together so you can use them as arguments when talking to your local administrations or to your politicians. In the second step, I will present the PMPC campaign to you. We have been running it since 2017, and during the last few years it has gained a lot of momentum and has become a key part of the push for free software all over Europe.
And the last part, which will be the main focus of my talk, is how you can get involved in promoting free software. So let me start with a brief explanation of free software. I think definitions are important, and if you talk to politicians and to administrations, always make sure that you have the same understanding of what you're talking about. So I think you're all familiar with the U.S. military chain of command: the Unified Combatant Commanders, the Secretary of Defense, and on top of the hierarchy, of course, we have the President. He's the one who has to decide to press the red button or, hopefully, not to press it. But what about the guy, the girl or woman, the person who installed the red button? Can we be sure that it does what it should, and can we be sure that it works only when it's pushed? This is of some importance for us too. And this is one problem that free software can solve, but there are a lot of others. So what is free software? Free software grants everyone four fundamental rights, or four freedoms: the freedoms to use, to study, to share, and to improve the software. What does this mean? Use: everyone can use the software for any purpose. There are no restrictions. You can do with it whatever you want. Study: you can look into the code, you can study it and analyze it, and others can also analyze it and find out what is going on behind the curtains. Share: the software can be shared without limitations. And of course, you can also earn money with it. You don't have to sell it, but you can make money with it, depending on your business model. Many developers do that, and many companies. And you can, of course, improve the software. So to make it better; you can, of course, also make it worse, but hopefully you make it better and give it back to the community, so others can work on it and use it too. So why does free software matter to public bodies?
Why should public bodies support free software? There are two fundamental maxims that are important for public bodies. One of them is digital sovereignty. In order to establish trustworthy systems, public bodies must ensure that they have full control over the software and the computer systems at the core of our state digital infrastructure. This is very important. And the other basic maxim is that tax money must be spent in the most efficient way possible. Public bodies are financed through taxes, so they must make sure that they spend the funds in the most efficient way possible. And both maxims are undermined by the use of proprietary software. So let's have a look at some of the problems with proprietary software. First, there's no interoperability. You all know situations where somebody sends you a file made with proprietary software and you have to buy the software in order to open it or work with it, or even buy a new computer and install a proprietary operating system. This, of course, also applies to administrations. And it leads to vendor lock-in. Administrations that use proprietary software have to stick to their software solution in order not to lose compatibility, to be able to work together. This is a big problem, and it leads to unpredictable costs. You never know what next year's license cost will be. You have to pay for the upgrade in any case. You have to do it. You are locked in, and you have no chance to escape that if you're using proprietary software. And yeah, these investments that you are making, they are lost. They go to the vendor for the license, and you don't know whether the software is made better with your money or not. Maybe some features are implemented that you as an administration can't benefit from. So you don't have much control over new features. It's the vendor's decision. There's a lack of transparency.
We know from research and also from experience, for example during the COVID pandemic, that there's low acceptance among citizens for proprietary digital services. So for example, with the COVID tracking apps, we know that people respond better to them and tend to use them more if they are open software, free software, open source, and if they know that their privacy, their fundamental rights and data protection are guaranteed. And this leads also to the last point, security concerns. Trust is good. Trusting proprietary software can be dangerous. You can never be sure that there aren't any bugs, security holes or backdoors. So the solution for all of this is, of course, free software. With free software, you have interoperability through open standards. Free software and open standards are crucial for that, and crucial for working across borders or across systems from other vendors. Free licenses give you independence. There is no vendor lock-in. Administrations are free to move to different solutions. And you can also, in the long run, save costs by using free software through collaboration. Free software, as we all know, is based on collaboration. Public bodies will benefit from this if they adapt existing software or decide on joint procurement. For example, they can share risks and save money in the long run. And more and more examples of this kind of cooperation exist. I will come back with some examples a bit later. If you, as an administration, as a local government, are using free software, you can empower your local partners. If you opt for free software, you can empower local IT companies instead of paying license fees to international or transnational corporations. And your local economy will benefit from this tax money. It will often be small and medium enterprises, SMEs. This makes a huge difference for your region, for your city, for your country and the citizens. Free software is transparent by default.
This helps guarantee privacy, data protection and fundamental rights. And as I said, this is key to a higher acceptance by the citizens. And finally, free software code, as we all know, can be audited. We can look into it. Of course, it's not automatically free of bugs and security problems, but publicly verifiable code is a prerequisite for finding these bugs and security problems and not leaving the security of critical infrastructure to proprietary companies. So let's have a look at governments' effect on IT companies. Which effect can a free software policy have on the IT industry? Governments are among the largest IT purchasers, accounting for up to 27% of software companies' revenues. If we convince governments to spend money on free software rather than licenses, this can really make a huge difference for the whole IT economy. And it has a huge influence on the economy, especially SMEs again, and your local economy. Given these numbers, what if public bodies decide on free software when procuring? What if free software becomes the default locally, regionally, but also in your country or throughout Europe? First, as we said, there are tax savings: similar applications don't have to be programmed from scratch every time. So tax savings in the long run. You can use software or parts from other projects, improve and adapt it, share it and give back so others can do the same. Second, there's collaboration. Major projects can share expertise and costs. Third, you serve the public by using free software, because applications paid for by the public should be available to everyone. And lastly, fostering innovation. With transparent processes, others don't have to reinvent the wheel, and you can use brains and money to invent something new, to come up with new solutions and new ideas, instead of funding the same corporations for the same products again and again. And again, SMEs can benefit from this. So let's have a look at some examples.
France, for example, implemented a free software-friendly policy, or first steps in that direction, back in 2012. And what happened? In France, we saw up to a 5.4% yearly increase in companies that use free software. We saw up to an 18% yearly increase in the number of IT-related startups, and up to a 14% yearly increase in the number of individuals employed in IT-related jobs. This is very impressive, I think. And an interesting side note: also up to 16% fewer software-related patents. So yeah, if you, as a government, opt for free software, you can really make a difference also for your economy. And some more examples, looking at the city of Barcelona, which is a very famous example of the implementation of free software. They decided some years ago to spend 70% of their IT budget on free software projects. It's not everything, it's not 100%, but it's a lot better than nothing. And they decided to collaborate with other cities. What happened? Of 3,000 companies involved in the local IT business, 60% were SMEs, many from the region, of course. So the reasons for using free software in your administration, in your region, are not only technical, it's not only sovereignty and tax efficiency, but there are also good economic reasons to do so. And this is very interesting for politicians, of course. Barcelona also took a very important decision. They decided not to switch all systems at the same time, but step by step. We know from other examples, from Munich, for example, with the LiMux project, that they decided to do a really fast change to free software, to Linux, in their administration. And it wasn't well accepted by many people working there, and it led to complications. So we think the better idea is to do it step by step. Whenever new software is procured, go for free software; whenever a license runs out, go for free software. And one other example, this time from Germany: there's a collaboration of nine municipalities from southern Germany.
They call themselves re@di, and they decided to develop solutions together. The starting point was during the COVID crisis, when they started using a Jitsi implementation, which they called Palim Palim, and they found it to be very useful for doing library lectures online. Other cities found that it also fit their needs, and they decided to collaborate, exchange knowledge and work together on this and also on many other projects. So to sum up, why should administrations support free software? Regionality, autonomy, efficiency. Regional SMEs can become strong partners. Free software helps to develop and maintain tailored software that suits your needs, not just the vendor's business model. And with transparent processes, others don't have to reinvent the wheel. Major projects can share expertise and costs. Yes, of course, I gave you some examples, but there are many hundreds, thousands of cities throughout Europe that have not yet adopted free software. This is why we as the FSFE, the Free Software Foundation Europe, decided back in 2017 to start the Public Money, Public Code campaign, and we have a long way to go. Our goal is to make free software the default in all public bodies at all levels across Europe, and maybe also outside Europe, but for now across Europe. This is why we started the campaign demanding that taxpayers' money be used only for free software. The mission statement that we have here is very simple. We want legislation requiring that publicly financed software developed for the public sector be made publicly available under a free and open source software license. If it is public money, it should be public code as well. We have an open letter. You can find it on this website, publiccode.eu. If you haven't signed it yet, you can do so now.
There are already 34,000 people who have signed this letter so far, and not only individuals: many organizations have also decided to support our campaign. Yesterday I checked, and it was 220 organizations throughout Europe backing our campaign. You will know many of the logos here, so many organizations back the Public Money, Public Code demand. And among them are also some supporting administrations. Barcelona I already mentioned. Here I think I have only six; it's seven by now. They are from Spain, from Germany, from Sweden and Luxembourg. Yes, but thousands haven't made their commitment yet. Many have a positive attitude towards free software. We saw that three weeks ago, when we had an online event on the current situation in Dortmund, in Germany, where a pro-free software policy is just being implemented. There were, I think, about 170 people joining the event, and there were administrations from all over Germany. So not only from Dortmund; the interest is there in many, many administrations, not only in Germany but all over Europe. They're interested, they need encouragement, they need good arguments, and they need a good idea of how to start, and this is why we'd like to invite you to help. You can make a difference. You know your city and your region, and if you start an initiative, coming up to your politicians with ideas and with demands, they will take you more seriously, because it's their job and you are the citizens. So what can you do? With the campaign, we developed some tools that you can use. I already talked about the open letter. By the way, have you signed it? I think everybody has now. Yes? I just tried to sign it, and it results in a 404. I will keep trying. Okay, this goes directly to our tech admins, but thank you for the effort. It is out there in 22 languages, the open letter and also the website.
Take it, once we fix the form, send it to your local administrations, send it to your city councils, send it to your members of parliament and ask them to sign on to this demand. Or, even better, as the next step, make an appointment and talk to them about PMPC. Explain to them the idea of free software, explain to them the idea of public money, public code, and tell them about the open letter. We have a brochure, it's this one, 30 pages, gathering the above-mentioned arguments, the best practices that we have and ideas of how to start, and giving them good arguments for public money, public code. With this we are targeting experts from administrations and also politicians, but also you: it can help you with finding good examples and good reasons for public money, public code to come up with. I think at the moment we have it in six languages, and there are more to come. Also here, if you want to help translating something, just come up to us. We always need help in translating our campaigns into different languages. We have many stickers and we have postcards. Order them for free at our website, fsfe.org, or come up to our booth after this talk, it's in the K building, and get some there. We also have a campaign video. I wanted to show it, but I think there's no time left now. So you can find it on our website too, on the public code website. You have 10 minutes if you'd like to show the video. I will use that time for something better than the video, or maybe show it at the end. And the video, I think, is also out in about 10 languages, so you can use it for your region and country. So if you want to contribute to the campaign, where can you start? You can of course focus on the national or international level, but you have to be aware that there's a long way to go and a lot of work. So if you have a long breath and staying power, then go for it.
But there are other levels to start with. You can, for example, start with your university, or you can talk to your library, your local job agency, or the school administration, where during the COVID pandemic everybody started to use Zoom or Google Teams or something like that, Microsoft Teams, yeah. And now maybe the next invoices for the next license are already on the table of the head of the department, or they see that they need something more sustainable, or they are not even really aware that there is something else like BigBlueButton or Jitsi. So go to them and tell them. Acting on a local or regional level will be easier, and success there will be lower-hanging fruit for you. Who can you reach out to? Who can you approach? Of course decision makers, so your city council or regional parliament. It makes sense, though, to talk to people from different institutions, also from different parties, and maybe also to an admin from your IT department. You might even find allies there who share the same view and help you. Don't be afraid of talking to people who might not share your views or to a party you're not leaning towards. Reach out, meet, talk, be friendly. Even if they don't support you at first, you don't have anything to lose. You can only win. In the long run, you will maybe even change their minds. So really try to talk to everybody, not just to the people you know are probably already convinced of public money, public code. What can you do? So how? Use the definitions, the examples, and the arguments I was talking about before. Read our brochure for more information on that. Adapt your examples. This is very important. If you're from Spain and you want to talk to a Spanish government, then maybe use the numbers from Barcelona or from Asturias rather than from France, or vice versa. Choose your arguments wisely.
So if you're talking to somebody from a left-leaning party, maybe it's not so good to emphasize the economic arguments, but rather sustainability and empowering the people. If you're talking to someone from a liberal party or leaning towards the liberal side, economics is of course good. And conservative parties might be more interested in national sovereignty and how it is threatened by vendor lock-in, and free software is, of course, the gold standard in security. And the most important thing: even if you don't have success at the moment, remain friendly. So an example of what you can achieve, again from Germany. The CDU is the biggest conservative party in Germany, and back in 2019 they decided to back our demand for public money, public code. This is also possible. And of course, don't hesitate to welcome such decisions. It's important to convince everybody of free software, because it makes us as a community independent of who is in government at the moment. And finally, when? Timing matters for your activities. Keep your eyes open for relevant developments in your region. Is there any news about plans for a new IT strategy? Is there a new head of the IT department? Are there problems with school video conferencing tools? Don't miss these opportunities. Step in and talk about the best alternatives. But also don't wait for things to happen. Don't wait for such occasions; rather, be proactive. Writing or talking is never wrong. Keep in mind, if you do that, answers will take time. Be patient, and maybe write a friendly reminder after some weeks if you didn't get an answer. And after an appointment, be sure to always follow up and keep the communication going. Always be friendly. This is very, very important, as I said before, because angry tweets or toots will close doors rather than opening them. So finally, one more example, again from Spain, so it is very Spain and Germany focused here.
Yeah, this time from Oviedo, a city in the region of Asturias. Maybe somebody from the picture is also in the room, I don't know. There was a little hack lab, the Pica Pica HackLab, and they decided back in 2015 to start demanding public money, public code from their local authorities. They didn't have success at first, but then they got into contact with our campaign and they used our materials, and they really had a long breath, reaching out to communities and using these arguments towards all parties, and they convinced many people. They wrote emails and they managed to get face-to-face meetings, which is very important if you reach out to politicians, and they had a little social hack. To get inside the parliament, somebody has to let you in. But once you're inside, you can just knock on doors, and doors will open. So find somebody who lets you into the parliament. What they managed to get is a statement from the parliament of Asturias, the bigger region in Spain, saying that the parliament of Asturias commits itself to the international Public Money, Public Code campaign. This is not legally binding, but it's a first step, and what is also really, really important is that people there, the politicians, now understand free software. What does success mean for your approaches? If you can get somebody to sign the letter, if you can get a government or a parliament to sign the letter, this is of course really, really good. But don't be frustrated if a library board decides against free software, or if the city council's PMPC statement has loopholes, or if things move slowly in administrations. Sometimes you really need to have a long breath. But you've talked and you've convinced someone, and we have time, we have a long breath, and we will finally succeed. So that was the main part of my talk.
Finally if you want, you can come up to our booth at a set in Building K, level one, just take some material there and also if you want to support the FSFE's work, you can do that with a donation. We would be very grateful for that because we're charity and we depend on donations and you have the chance also to get your copy of this really lovely children's book which explains the idea behind free software to children. So thank you for listening and yes, I'm open for your questions. We have time for one question and I have seen one hand, excellent. Maybe two questions if people speak quickly. Yeah, I think that's, it did not come out from your talk if there's public money, public codes, this means they save a lot of money if they now choose open source software, how much developer do they hire? Do they do audits or it says now just all the open source software projects have much more users which are not developer? The open source projects have now much more users and these users have ideas and maybe find bugs but the amount of developers seems not to increase or who does the audits for the software that the public domains are using now, is this also in your plan from the free software foundation just to save some money? For what? Yeah, the question is to say these companies, public companies, public money and use public codes. So they save a lot of money, they say hire any open source developer and so the idea is to say audits, who has the work? The idea is of course not to take away the money from the companies, the idea is of course that the money is used in a better way and we all know that software developers have to be paid somehow but it should be rather through maintenance fees and through development of new features or audits than through your license fees and this is what we also try with this campaign to tell them. Okay, so only one question. It is only one question, yeah. Building K for more wonderful people. Yeah. Thank you very much. Thank you. 
|
Contributor Growth Strategies for OSS Projects |
Here today at FOSDEM, I am going to be talking about contributor growth strategies: how to get more contributors for your open source project. Today I'll start by talking about the factors that can impact contributor growth and why it can be so incredibly challenging, before moving into some strategies for growing your contributor base and using contributor ladders to help people move into leadership positions. And then finally I'll talk about some metrics you can use to measure project sustainability, and then we'll look at a few resources and I'll give you some final thoughts. First I'll tell you a teeny tiny bit about me. I've been in the technology industry for well over 20 years, working mostly on open source projects from within companies like Intel, Puppet, and now VMware, where I'm responsible for our open source community strategy within the open source program office. I'm a board member of OpenUK, I'm a governing board member and maintainer for the Linux Foundation's CHAOSS metrics project, I'm co-chair of the CNCF Contributor Strategy Technical Advisory Group, and I also have a PhD from the University of Greenwich, where I researched how people collaborate within the Linux kernel. Let's start by talking more about why it can be so hard to achieve contributor growth. In this section we'll talk about some of the issues people face that can impact the sustainability of contributions, along with the vicious cycle that people often face when trying to balance the time required to maintain the project versus the time to onboard new contributors. The reality is we are not mindless automatons. We have feelings, we have bad days, we have other commitments and personal challenges in our lives that are often invisible to the other contributors within the project, and they can get in the way of our contributions to open source projects. We can be squishy, we can be unpredictable and irrational, especially when we're stressed out, when we're overworked, when we're burnt out.
But you can't have an open source project without actually having people to maintain it so you need to be able to encourage people to participate in ways that are sustainable not just for the project but also for those people as human beings. Many projects struggle to find people who will actively participate in their projects and continue to participate over the long term. If it was easy you would all have all of the people you needed to maintain your project and I'd be in an empty room and none of you would be watching this talk. But we're in a situation now where there are lots of open source projects and just frankly not enough contributors. Maintainers are burning out and are in desperate need of help. And sometimes it can be really difficult to get people to contribute to your project. And unfortunately there's no magic behind this, there's no one size fits all solution. But throughout this talk I will focus on some things that you can do to increase the chances of successfully building a community and then growing more contributors for your project. Maintaining an open source project is hard work. And it extends out over many years. And maintainer burnout is common within open source projects. Even the really big successful projects like Kubernetes struggle with maintainer burnout and growing the contributor community. It can be hard for already overworked maintainers to balance the day to day work required to keep the project running while also investing in additional activity to increase future community sustainability. So this creates a vicious cycle where maintainers don't have enough time to onboard new contributors leading to fewer contributors which leads back to no time to onboard new contributors. And while it takes a bit more time up front, if you can invest some time in activities that will help onboard new contributors like onboarding documentation for example, you can increase the chances that you can break out of this vicious cycle. 
Another way to free up some time for maintainers to break out of this cycle is by getting help for different types of contributions that take up valuable time and are required to make an open source project successful. So think about things like documentation, marketing, community management and loads of other things. And for projects with really complex code bases, it can sometimes be a lot easier to onboard people into some of these roles first to help free up some time to onboard other contributors later. Now next up, let's talk about developing and executing on a long term contributor growth strategy including motivation, governance, new contributor onboarding, mentoring and leadership. People's motivations for contributing to your project vary widely. Some people are contributing as part of their job while others might contribute to gain experience or maybe learn more about a particular technology. And you don't really have any control whatsoever over what motivated people to show up. But there are things that you can do to motivate them to stick around regardless of why they showed up in the first place. Clear communication and reducing friction are key to helping people stick around. And I'll talk more in upcoming slides about the importance of explicit and clearly communicated governance along with solid onboarding docs and fostering a welcoming and inclusive community. But there are other things you can do to help motivate people to contribute. Having good first issues or help wanted labels are excellent places to start because these help contributors find something that they can work on while they learn more about the project. So good first issues should be targeted as something very simple that a brand new contributor can just pick up and do in a very quick amount of time. It helps them learn more about the contribution process. 
And then help wanted labels are for issues that maybe are a little bit more complicated and take a little bit more time so that people who've already started to contribute can find something else to work on next. And good first issues and help wanted labels are passive requests for help, right? But I also encourage maintainers to be proactive and specific about ways that people can help. Asking someone specific to review a PR or maybe answer a question in Slack or some other forum from a user demonstrates that you recognize their expertise and that you want their help specifically. And going back to the discussion about squishy humans, knowing that we're appreciated makes us feel good, right? Which can be a strong motivator to participate in an open source project. Now I know a lot of people like to really hate on governance. It's just paperwork, it's busy work, nobody likes it, it's just gets in the way of doing the real work on the project. But this really isn't true of good governance. And good governance is really about setting expectations and getting all of the people in the community collaborating together. And ultimately the focus of open source project governance is on people. The roles we play, our responsibilities, how we make decisions, and what we should expect from each other as participating in the community. The goal should be to make the processes for participation as obvious as possible. Even for people who are brand new to the community. So having clear rules about how collaboration occurs, how decisions are made, and what types of contributions are in or out of scope really helps community members make contributions that are likely to be accepted and embraced by the project. This helps avoid wasting maintainers' time for contributions that just aren't aligned with the project at all. And a healthy project with clear governance makes contributors happy and helps set the project up for success and future growth. 
Now another aspect of governance is about making it easier for people to move into positions of increasing responsibility to help reduce the load on existing maintainers. We'll talk more about this later in the section about contributor ladders and leadership. But the good news is you don't have to start from scratch, right? We have some good templates with instructions that we've developed at the CNCF. But they apply to most projects so they can help you sort of quickly and easily build out some really basic governance for your project. Now I suspect that some of you are still thinking that you really don't need to spend time on governance. But think about this from the perspective of the new contributor. It's a lot more difficult to participate in a project or participate in a community if you don't know anything about the role that you might play, the expectations, the key players, or the roles for participating. So explicit documented governance gives new and existing contributors a clear path to guide them through your project. And spending a bit of time, you don't spend a lot but just a bit of time documenting governance up front can help save you time later with fewer questions about how things work. And it gives you a document that you can point people to if they have questions. Now when I start contributing to a new open source project, I want to know how decisions are made and who makes those decisions which helps me understand whether decisions are likely to be made fairly by people with the expertise to make those decisions. And I also want to see a clear path into leadership for me or for my colleagues if we decide to stick around in the project over the long term. 
But the bottom line is that if the process for collaboration and decision making are not clearly documented as part of the project governance, this introduces a lot of uncertainty and increases the barrier to contribution while also jeopardizing the long term health and viability of the project. I see so many open source projects with contributing guides that don't actually provide any useful information about contributing to the project. At a minimum, a new contributor needs to understand how to spin up an environment where they can do their development, the expectations for testing and how to run those tests, any processes or expectations you have around things like pull requests or issues, and instructions for other requirements like maybe they need to sign a CLA or sign their commits using the DCO process. Now if this is well documented, contributors can get started with a minimal amount of help from existing maintainers which can save you a lot of time in the long run. And when a project doesn't have good onboarding docs, maintainers can get frustrated by the amount of time they spend on new contributor questions and it can make it hard for contributors to feel welcome and take them longer to become productive. Now this does not mean that you need to spend days and days and weeks and weeks writing the perfect onboarding documentation for your project. Anything is better than nothing. And if you just start with a few things that can help people get started quickly, new contributors can actually help make the onboarding documents better by adding more details and additional instructions for things that they found confusing or that they had struggled with. Your project should also be designed to keep diversity, equity and inclusion top of mind. Building a diverse community where people feel welcome and included doesn't just happen, it does require putting at least some thought into it, but it's really time well spent. 
Building an environment where everyone, including people from marginalized populations feels safe is the first step toward building a diverse community for your project. Ideally having programs that give people opportunities for things like shadowing, mentoring, sponsoring new potential leaders can help you grow a diverse set of people into new leadership within your project. The Kubernetes contributor experience SIG is a great place to see some examples of how to implement some of these programs for things like shadowing and mentoring, and projects that make a concerted effort to bring in new people from a variety of backgrounds and have programs in place to help them grow into leadership positions are more likely to benefit from increased innovation and just have a healthier contributor community. And by having a diverse and welcoming community, you have the advantage of getting contributors that might not feel welcome in other projects. Now this, I gave it its own section, but it's really still kind of part of the strategies section, but it's important enough to call out separately since moving people into leadership positions really is a key part of growing your contributor base and scaling your project. So I'll talk about this in the context of contributor ladders, which is a good way to do this. Defining the roles and responsibilities for contributors, reviewers, maintainers can help with recruiting new people into these roles. It can help to really think of this as a ladder where contributors can climb up to become reviewers and those reviewers can become maintainers. And what's important is to document it and make sure that people understand how they can climb this ladder and move into positions with more responsibility within the project. A contributor ladder usually outlines the different roles within the project, along with responsibilities and privileges that come with them. 
Community members generally start at the first levels of the ladder and then they advance up as their involvement in the project grows. So for each one of the ladder, you can define responsibilities, which are the things that a contributor is expected to do. Requirements are the qualifications that a person needs to be put in that role. And then privileges are maybe some things that those contributors are entitled to do as a part of that position. And all of this helps set expectations for roles and encourages people to think about how they might take on additional responsibilities within the project. And as you get more people moving into maintainer roles, you can reduce the load of the existing maintainers. Now, the good news is that there is, like with many things, also a template, so you can avoid building this from scratch. Now, this template has probably more roles than most projects need, but it's intended to be simplified and customized for your project. Project leadership is one of the key elements of good governance, and this is how you scale your project. So you should have some kind of documentation about your leadership. For small projects, maybe you just have a list of maintainers that indicates which people are responsible for which things. And there are quite a few different options for selecting leaders as part of defining your governance. And the ideal is to have a process that provides a fair and level playing field that defines how contributors can become leaders. And this should be documented so that all participants can clearly understand the criteria and the process for moving into leadership positions. Now, some of the bigger projects, like Kubernetes, for example, have an election process, at least for the top levels of leadership, like a steering committee. But only the biggest projects actually need something that complicated. 
Most projects have a relatively simple process, where the existing leaders or existing maintainers get to select the new ones. So for example, new maintainers are often nominated by existing maintainers and maybe approved after a certain number of maintainers have agreed to it, or maybe there's a voting process. And there are loads of different options for selecting leaders for a project, so I won't go into all of them, but there is a, I wrote a document for the CNCF that kind of describes some different options. But the key is to spend some time thinking about this as you document your governance and as you build your contributor ladder, so that you can bring new people into leadership positions and reduce the load on the existing maintainers to help scale your project by growing your contributor base. Now, granted, mentoring takes a bit more time, but it's a good way to help existing contributors become even better, with an eye toward moving them into leadership positions. So for busy maintainers, one good approach is to focus on mentoring people who've already been around a while and are unlikely to disappear and help them learn to do maybe some more complex time consuming tasks. Like with many things, mentoring isn't something that's all or nothing, and you can time box it to whatever time you can fit in your schedule, so this doesn't have to be hours and hours every week. Even spending an hour a month or an hour a week to help someone quickly become productive in some part of your project can be time well spent if that person can then take on a few tasks that help you reduce your load as a maintainer. You can even structure this as shadowing and allow them to watch you and learn while you do some maintainer tasks that you're going to do anyways. And if you focus this on helping someone learn to do something that can free up your time later, then this will be time well spent. 
Now the strategic part of all of this comes in to thinking about where your time would be best spent. I've given a lot of suggestions so far in this presentation, and you should not try to do everything at once, right? So I recommend you think strategically about where you start. If you know you've had people interested in contributing, but they've given up when they couldn't get started, then maybe you focus on onboarding docs. If you have a lot of casual contributors who come around occasionally and contribute things, maybe you focus on the contributor ladder and governance to help some of them move up to take on more responsibility and eventually move into leadership positions. One way to figure out the best place to start is by using metrics. So I'm a big fan of metrics and data for those who know me. But this can help you find problem areas and figure out where you should be spending your time. Time is precious, right? So it's important to identify problem areas so that you can focus on the right things while avoiding wasting time on things that maybe are already going okay or going really well. However, metrics do need to be interpreted in light of how you operate as a community and the other things happening in your project. And there's no one-size-fits-all interpretation for metrics. But I will talk in this section about what some trends might indicate and how you can think about addressing them. One key area to look at for your project is responsiveness. So in this project, in this graph, you can see that there are times when they have a lot of PRs in the backlog that need to be merged or closed. Now, if these PRs are coming from several regular contributors who aren't maintainers, maybe it's a good time to look at how you can promote some of those contributors to become maybe reviewers, approvers, maintainers to help with the workload. Now, as with any metrics, you need to interpret them in light of your project. 
So there are other things that can cause an increase in the backlog, like everyone preparing for a big release or maybe there's a big conference coming up or it's vacation season in Europe that just are not resolved by moving people into leadership roles. So you have to think about maybe why you have this backlog and what might help resolve it. It can also help to look at the types of contributors that you have. So in this case, casual contributors are those drive-through contributors who make just a handful of contributions and then you never see them again. Regular contributors are the ones that make some contributions and stick around. So they continue to make some contributions but maybe not a ton of contributions. And then core contributors are usually the maintainers who make most of the contributions and stick around over the long term. Now, you can really learn a lot from this graph, actually. If you have a very small number of casual and regular contributors, this can mean that people don't have the information needed to become productive and contribute. So in some cases, onboarding docs can help solve this issue. Another thing that this graph can indicate is whether maybe you have some fundamental issues within your project that are driving people away. So if you see the total number of contributors declining or the number of regular contributors declining, this can indicate some deeper issues, maybe with toxic community members or an unwelcoming environment that probably needs to be resolved before you take any other actions to grow the community. Or it could mean people are leaving your community for some other reasons. You know, maybe lack of responsiveness is another one. Now, this metric is often called the bus factor, pony factor, lottery factor. 
Based on the idea that if one person or a small number of people disappeared, maybe after winning the lottery, the project would possibly be completely screwed because they're making all of the contributions. So I recommend measuring this because there are a couple of things that it can tell you. So first of all, it tells you how big of an issue this is for your current contributor situation. If it's like this one, this is a big issue and you should focus on getting some more contributors that can be moved into leadership roles. You might also find that there are people contributing more than you realized, which is the other reason that this is a good metric. This can help you think about who you can encourage to contribute more and maybe find someone who could move up the ladder into a leadership role. Reaching out to someone and acknowledging their work while encouraging them to do a bit more can help quite a bit with contributor growth. Sometimes people just need a bit of encouragement. And as I mentioned earlier, you can ask people for specific things that you know they're good at. There are several communities that I've gotten more involved in, or involved in in the first place, because someone asked for my specific help and kind of made me feel wanted within that community. Now before I wrap up this talk, I'm going to leave you with a few resources that you might find useful. I've mentioned the CNCF Contributor Strategy Technical Advisory Group a couple of times. We have a governance working group and a contributor growth working group, which provide templates and guidance about contributor experience, sustainability, governance, that sort of thing, to help people develop strategies for maintaining healthy projects. The resources are designed for CNCF projects, but most of what we talk about applies to just about any open source project. The Open Source Way guidebook has loads of really great details about building and maintaining open source projects.
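To make the lottery factor described above a bit more concrete, here is a minimal Python sketch. It is not taken from the talk or from any particular tool: the function name `lottery_factor`, the input format (one author name per commit), and the 50% threshold are all assumptions chosen for illustration. It counts the smallest number of top contributors who together account for at least the threshold share of commits.

```python
from collections import Counter


def lottery_factor(commit_authors, threshold=0.5):
    """Smallest number of contributors who together made at least
    `threshold` of all commits.

    `commit_authors` is a hypothetical input: one author name per commit.
    The 0.5 default is an assumed, commonly used cut-off.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0  # no commits, nothing to measure

    covered = 0
    # Walk contributors from most to least active, accumulating commits.
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= threshold:
            return rank
    return len(counts)


if __name__ == "__main__":
    # Made-up example data: one person makes 8 of 10 commits.
    authors = ["ana"] * 8 + ["ben", "chi"]
    print(lottery_factor(authors))
```

A low number suggests the project depends heavily on a few people. As the talk stresses, the result still needs interpreting in light of the project, since commit counts say nothing about the quality or importance of the work.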
The CHAOSS project has loads of metric definitions and software, which you could see in my metrics slides, that you can use to measure the health of your open source community. These are all great starting places for understanding how to grow your contributor community. I just mentioned the Contributor Strategy TAG on the resources slide, but I wanted to put in a really quick recruiting plug. Like with most open source projects, we're also looking for help. We don't have enough people contributing. Anyone's welcome to join our meetings if you want to learn about us or, even better, if you're passionate about contributor growth, governance or related topics and want to help CNCF projects improve in those areas. We'd love to have you join us and help us develop resources and provide advice to projects. This slide has several ways to find us and get involved. With that said, let me leave you with just a few final thoughts. Maintaining an open source project is so much work and there are so many maintainers who are overworked, exhausted, burning out. The best way to address this challenge is by growing your contributor base, but it's hard work. It takes time away from the day-to-day activities now, which can be really hard to justify when you feel like you're barely keeping up as it is. But in the longer term, spending at least a little time on things that can help you recruit and keep new contributors will be worth it. And hopefully some of the templates and resources that I've provided will help you get started with some of this work. And as I've mentioned before, you don't need to do everything at once. Spending just a little time on something to grow your contributor base is a great way to start. Thank you for coming to my talk and we can open it up for questions. That was very easy. Thank you, Dawn. I have a question about the lottery factor.
Do you have any advice on how to measure or how to take into account quality, not just enumerating the quantity of commits that various contributors make? So the question is how to take into account quality versus quantity? Yeah, I think that that's something you have to... So you're not going to be able to measure the quality really well in a quantitative way, right? But I think it's something, again, that gets at interpretation. So I think it's something that you need to really think about. So there might be somebody that has loads of commits for some particular reason, but maybe they're not a major contributor to the project for some other reason. Maybe they're not a maintainer and maybe it wouldn't be a big deal if they left tomorrow. Maybe they're working on something that's fairly niche that not a lot of people use, for example. So I think you really do need to interpret those graphs. And you can think about the quality of what someone does. If it's someone who makes a ton of commits and they're kind of crap, then that's not as important as someone who makes a lot of commits that are really great. So you just need to, I think, apply a reality filter on top of it. But I don't think it's something that you can really easily measure quantitatively; you need to just think about it and kind of put a qualitative filter on top of it. Yeah, other questions? Hi. First off, great talk. I had a question about the governance side, because I work on a pretty big open source project, but the problem there is that it's owned by a company that doesn't really necessarily value open source as one of their core reasons for doing it. Do you have any recommendations for how to advocate for governance in company-owned open source projects? Yeah, it's really hard to advocate for governance in company-owned projects. I've tried to do it and failed in certain projects in the past.
What I've found is usually the best way to approach it is to get them to document the way things work now. So don't worry about changing all of the governance, don't worry about coming up with some big elaborate governance structure, but ask: how do people get moved into committer roles now? How do they get moved up? And try to encourage companies, and this is a discussion I have internally at VMware all the time too: we hire people to work on certain open source projects, and so they get committer status kind of automatically, and we should be as transparent as possible about that in our governance materials, for example, that they do get special treatment by virtue of the fact that it's a VMware open source project and we're paying them to do the work, so they're going to be able to commit from day one. So I think being as transparent as possible about that can help, but I would say work with them to document what they have right now, and then maybe people will start to see the gaps and you can try to build on it. But it's hard, it's really hard to get good governance for company-owned projects. It's a tough problem. Thanks for your talk. About the lottery factor, I'm not sure if you mentioned it or anything, |
Centering DEI Within Your Open Source Project |
Love it. Love it. We are getting there. We are almost there. I keep encouraging everybody to take more Haribo, but I don't think we have enough left to encourage massive runs on the Haribo. We also have Ritter Sport, if you need it. Oh, in that case, wonderful humans. I live in the town where they make the Haribo gummy candies, so please take the gummy candies. I can get more easily. Do not need the extra's to go back on the train with me to the Germany. Due to due. While we are getting set up and having a little bit of additional technical difficulty, oh, I'm sorry, I'm getting very happy over here because we have our next generations of Fosdummers hanging out and getting candy now too. Yeah, that's right. Sadly, I couldn't bring my daughter with me, but when she was all of, I think she was like 33 days old, she went to her first Fosdum, so I'm looking forward to her next to Fosdum. No, although I did get one conference to make her a lanyard that said future speaker, so that made me feel happy. I think I'm going to try and bring her to DevConf this year, if at all possible, but the German school system is very rigorous about thou shalt not miss school time, so it's a little more difficult. If you want a la, la, la, la, la, la, you totally can. There may be karaoke coming on. I know you're going to leave. It's worth the risk to me to entertain our adoring public, Mr. Prophet, but we will try and avoid it because we care for you. Is everybody both enjoying their Fosdum and exhausted and ready to go home? Yes, okay, that's all right. Sometimes exhaustion is where we end up, but in the good way, in the we're full of happiness and good ideas way and filled with delicious candy. Welcome, welcome, welcome, lovely people who are just joining us. We are still working on convincing the, watch me go. Everybody centering DEI in your open source project. How about that? Yay. All right, hi, hi, everybody. It's great to be here and it's great to see people. 
So I'm Matt Germonprez, and I'm one of the co-founders of the CHAOSS project, and I'm also a professor by day at the University of Nebraska Omaha in the College of Information Science and Technology. And it's really great to be here. So just real briefly, I'm going to be giving this talk with Justin and Kristi. They can introduce themselves when they come up, and we also have a slide that introduces them as well. So the CHAOSS project, as Dawn was actually alluding to in her last talk, which was a great segue, is a project that's focused on open source health and sustainability. So we create metrics and software, as well as programs, because there are a lot of metrics that can't necessarily be determined through software and that require qualitative and human efforts to understand. So we develop those types of metrics and software and programs. As part of what we do in the CHAOSS project, diversity, equity and inclusion is a key component to understand with respect to the health and sustainability of projects. And for ourselves, we looked at the CHAOSS project and thought, how can we not only help others think about diversity, equity and inclusion with respect to their projects, but also reflect on our own diversity, equity and inclusion? So how can we as a project better center diversity, equity and inclusion within the CHAOSS project, and then subsequently help others do the same from our experience? So I would like to first say thanks to the Ford Foundation, who has provided support for this reflection. It's been a year and a half now, and it'll complete in about another six months. And the support from the Ford Foundation has been really great, because we've been able to have consistency in our team of people helping us do the reflection, including Kristi and Justin, as well as come to places like FOSDEM and talk about our experience and what we've learned and share that with other people.
So that got cut off a little, but it's close. So, the team; Selene got cut off. We have Kristi, we have Justin, we have Sean Goggins, we have Georgia Bullen, who's there in the center, we have Elizabeth Barron, who is the CHAOSS community manager, we have Ruth Ikegah, who couldn't come, she's on the right, and then in the upper right corner, I think you only see a very small part of Selene, but it's Selene Yang, who is also part of the team for this reflection, and then myself in the lower right-hand corner. So it's been an amazing group of people, and like I said, through the Ford Foundation support, we've been able to stay together as a team throughout the entire reflection. So here are the speakers. Again, we're... Justin, you're cut off. That's okay. Hopefully there aren't a lot of words that get cut off there when we show what we're showing. But again, I'm Matt Germonprez, and we have Kristi Progri and Justin Flory. We have four things that we're going to talk about that were part of this reflection. Justin will talk about two of them and Kristi will also talk about two of them, and then we'd really love to hear from you all and answer questions the best that we can, or hear any thoughts that you might have. So Justin, I'm going to turn it over to you. Should we do a quick round of introductions for all of us? I'm done. I'm Matt Germonprez. I'm Justin Flory. I'm currently with Red Hat as the Fedora community architect. So I'm working in the Fedora community day by day, but I've also been involved with the CHAOSS project that we're going to be talking a little bit about here for the last five years, and I've been working with our awesome group of folks on this DEI review for the last two. Hello everyone. So I'm Kristi Progri.
I'm currently the program manager for the GNOME Foundation, and I've also been a part of the CHAOSS community for around one year and a half now, also contributing in their diversity and inclusion working group. All right. So let's look at the first two points of our reflection. Just for the background context again: this was work that we started about two years ago, and coming into this reflection, we were really starting from the very beginning. We were going to take a look at the CHAOSS project to understand where we could better focus our efforts and spend our time and energy and resources to make things better. So the four points that we're going to be sharing here are basically the outcomes of this journey that we've been on for the last two years. You're getting a little bit more of the polished end result of this. So it'll be a nice bullet-point list, but you can also keep in mind that when we started with this reflection, we were kind of like, where do we begin? And it was a lot of conversations that brought us to this point, and engaging with our community to learn some of these things, that helped us define some of our efforts. So without further ado, the four highlights. Whoops. A little bit of a lag. Okay, we didn't get it all cut off. So I'll be talking a little bit about our newcomer initiatives and engagements, and also a little bit about our global efforts. So across the span of the last two years... no, I guess I kind of already covered this. So, the first part, on the newcomer experience. With the creation of new support structures, along with an increased emphasis on mentoring, the CHAOSS project has grown a lot, really tremendously, with new contributors in the last couple of years. New onboarding pathways were developed, and we prioritized the refinement of these pathways too.
I might say part of this was also being in the right place at the right time, because as we were going into COVID and everything was really shifting into a virtual format, and as we were spending more time thinking about the newcomer experience, I also think there was more of an interest in folks trying to get involved with open source projects, which opened the door for lots of folks that we ended up engaging with from countries and places all over the world. So ultimately, many of us long-time FOSS contributors remember well what it is like to get involved in a new project and to try to figure out where you begin. It's easy to get overwhelmed. At the same time, ensuring newcomers have a clear pathway to get onboarded is important for project sustainability and keeping things going. New faces should always be coming and going, as well as mingling with the older faces and people that have been in the community. So keeping those folks who have been around for a long while, while also still having new faces coming into the fold and the mix, is important for community growth and sustainability. Additionally, newcomers have that unique perspective as a first-time observer of your community. And you can learn a lot by listening actively to them and the perspectives that they're bringing to your community. So one key takeaway here is helping newcomers see and understand the impact of their work. It's important to validate the small contributions. In a way, you almost need to hype people up on their participation. It's actually kind of like what Dawn was just talking about in the last talk. It feels great when you know your efforts are appreciated and made a difference to someone else, all that soft, squishy human stuff.
So one wish-list item for developing metrics or measurements in a FOSS project is showing and measuring the impact of a newcomer contribution. So both trying to come up with a way to measure and understand that, but also the recognition piece of making sure that the people making that contribution understand the impact of their contribution. Additionally, newcomers need to feel safe and welcome in the community. They need space to try out new ideas and, more importantly, to make mistakes and mess up, because that's where you really learn. So having a safe environment is important for people to test and validate the wildest ideas, and a welcoming community gives them the opportunity to bounce back when something doesn't work out. So one key part of unlocking creativity and innovation is making sure that people bring their best, and that a newcomer has the space that they need to bring their best while they're learning more about the project and the community. In short, avoid this tendency to over-celebrate your long-term contributors. We also want to recognize those folks, but I think it was someone in our group who put it as: the emphasis is really to share the spotlight. So make sure that there's an equitable focus in how you're recognizing and validating people's contributions in your community. And also, as kind of a side point, or not only a side point, the importance of diversity in leadership, I think, is a tie-in here with that newcomer experience. It's kind of an unwritten point in the slide deck here, but in our CHAOSS DEI review team, one of the people who we really wanted to be here is Ruth Ikegah, who has done a lot with newcomer onboarding in the CHAOSS project. She was really hands-on with the mentoring and onboarding of the new contributors, especially across Western Africa, where she's originally from.
And Ruth worked directly with the newcomers, but she also brought visibility to the challenges and issues, and also the success points, of the people she was working with to our group, the DEI review team. So she was very hands-on with engaging with those people, but she was also sharing these experiences back with our review team so that we had better visibility and insight into that. So at a higher level in the project, we could think about how we could align, again, our time, resources, and energy to make sure that we're supporting and welcoming in all these new people to the project community. So the takeaway on this point is to choose folks from regions, especially those with less representation, and try to encourage them into these leadership roles, or find somewhere in the project where they can make that impact. It makes it easier to build relationships with those newcomers to the project. Next, global efforts. So when you're trying to fix things, challenges are going to come up. It's important to have persistence in the face of challenges. And when you're trying to pivot a project's focus toward DEI, these challenges are going to be inevitable. The work is organizational culture work, and that stuff is really hard. So again, there's this piece that there's a historical resistance to change. Change is hard. We don't like it. There's actually a comic I was reading a while back about how we have this part in our brain that, when something is challenging our beliefs or ideals, makes us kind of freeze up. It's almost as if someone was threatening us with something violent, but it might just be changing our mindset or ideas. And where I'm going with this is that change is really hard. And so we can help make it easier to evolve and adapt to change.
So I think one part of this as well is that we're not in an escape room or echo chamber doing this work. We're all in this together. One of the data points to mention here, coming back to the CHAOSS Africa piece, which is where we had a lot of that newcomer growth, is that we had over 200 new community members who joined our project. And that was across five different focus groups: development, design, research, technical writing, and community management. I think it also ties into one of the efforts around our new website, which is underway for the CHAOSS project, and which has been a huge contribution from all these newcomers, who have helped shape and influence how we're opening or presenting the front door of our project and making it easier for people to find the things they're looking for. And also the 900-plus folks following the new CHAOSS Africa Twitter account are another data point. And for our folks in the Asia Pacific community, there's also been an ongoing connection with the TODO Group, supported by the Linux Foundation and GitHub. Sorry, I misread that. So they're doing work with the TODO Group around trying to support some of these efforts and grow the community there as well. So I think from here, I will pass it over to Kristi to take us into the last two points of what we're here to share with you today for the reflection. Great. Thank you, Justin. One other very important part of our community, and one of the most successful initiatives, let's say, has been the badging. The badging review is meant for self-reflection and improvement on how different communities and different conferences are implementing and centering diversity, equity and inclusion. This initiative uses CHAOSS metrics, and the way that it works is that an organizer of a conference applies on our website. We have two forms.
One application is for all online events and the other application is for in-person events. The person who is applying goes through the whole application and answers different questions, and we have two reviewers who go through the application, and then at the end of the review, we give the badge to the specific conference or community that has applied for it. This badging review has four different levels, and we award them depending on how you are centering DEI. In our community, we also give support and lessons learned from many people on how you can improve and how you can work more in your community to center diversity, equity and inclusion. Another initiative in our community has been the survey. We started the community survey in October 2022. The idea of this survey was to get a general picture of how the experience of the community is going, what we are doing well, and what needs more improvement. So by collecting all this data, we were able to work more on the things that we want to see improved in the coming years. The idea of this survey was also to create a structure that could help other communities implement it. We have asked many demographic questions, and when the results come out, we would like to publish this survey online with all the structures, so everyone who would like to implement it in their respective communities can do so. It's very demanding work. It takes a lot of energy and time. So we would love to give this work to others to make their lives better. And for a quick summary... okay, thank you. For a quick summary: in our two-year journey, there were definitely surprises, and there were things that we needed to do during this whole process.
So sometimes things don't really go the way that we have planned, but the most important thing is to actually finish with the results that we wanted and to accomplish the goal that we set for ourselves. Another main idea and another big lesson that we have learned throughout our journey is that supporting contributors and making the onboarding process smooth actually helps us to have a better community, to have a healthier community, and to be able to get as many contributors as possible. And the last takeaway is that community surveys are actually very important, because they are a very good way to understand what is going on in your community, how you can help, and what other things you can appreciate or feel proud of, let's say. So this was all the presentation that we had for today. Thank you all for joining our talk, and if you have any questions, we'd be happy to answer them. I'm going to put that QR code on the live survey. If you took a picture, I'm sure it looked good. Okay. I'll do that. If I know how. Yeah, it gets cut off here. You can also upload the PDFs so that I can see if I can. Hi. What's that? Yeah, we're done. Okay, yeah. So I think we have about eight minutes left. Yeah, I see one question here from the audience. Yeah, it's an issue. Thanks, guys. It was a great talk and great insight. One of the questions that I had was: you mentioned that improving the onboarding process made it easier to retain contributors down the line. Do you have any insight as to why you think that one might have contributed to the other? Because oftentimes I've seen in other projects that we've done work to help improve the on-ramps, and we do see a large influx of people, but over time there's still a fair amount of attrition. So I was just wondering what your insights were, based on what you saw from the work you've done.
Sure. Thanks. Yeah, thank you for that. So as far as our newcomer experience, I can tell you about a couple of things that we've been doing in that regard, and then I'll speak to how I think it has helped retain people as well. For the newcomer experience, a lot of the work that we have done is open office hours, which have worked really quite well. They're unstructured hours every week. We also have one hour a week of structured talk about the project itself. It's usually led by Ruth or by Elizabeth, who's the community manager, using a slide deck, so that's a little bit more structured time. The other thing that we had to reflect on was that, in terms of the newcomer experience, for a lot of us it was not actually going out as far as how to make a commit or how to do a PR. It was usually easier things, like: here's the meeting to attend, and we encourage you to attend this meeting twice. This would be, say, our general meeting. And then once you've attended the general meeting twice, we encourage you to attend a working group meeting, a smaller-level meeting, say four times. And what we find is that doing this helps create the vocabulary for folks to understand what the project is. Sometimes the commit is kind of far out there. A pull request is kind of far out there. So in doing that, we also had a better opportunity to understand why people were joining. It was kind of on us to understand why. And then there was the earlier talk about the different types of contributions, I don't know if you were here. There was one that was about not just technical contributions, but all the different types of contributions. It then becomes our responsibility as the community folk, like the community managers, to identify what people's hopes are and how they want to contribute, and then really try to focus in on pairing those. So hopefully that helps a little bit.
So it's not just putting out documents that are for the newcomer experience; you have to be really active along that path as well. Did you want to comment too? Let's do one more call-out to something that I thought worked really well on this as well. Elizabeth Barron, and I think Ruth too, were doing individualized mentor sessions with folks who wanted some extra guidance or had questions and weren't really sure where to go. Talking to a real human being and getting some of that individualized support, I think, also made it easier for folks to feel connected to the project. And so when things got hard or things weren't clear, they had kind of a buddy in the community to help them through. And I think that was also a big part of the retention piece. I also think that an important part of making the onboarding process as easy as possible is to have good documentation on how people can at least get started. Having some very specific steps makes people more comfortable, and it makes people feel that they know where they're actually going. So having actionable items helps you with a path forward. Do you have other questions for our fine speakers? We have about three minutes. We're homing? Three minutes. Ah, I see him. Is it helpful to have an onboarding buddy for each person, a mentor for their first stage? I think that assigning someone specific to every newcomer may be a little bit too demanding, because there might be a certain number of people coming all at the same time, and maybe we don't have the manpower to assist each one with an individual mentor. But I think that having one mentor for a set group of three or four people would be easier to coordinate and to organize. That's a good question, and I don't think we've sorted it out yet. But having a one-on-one, to Kristi's point, would be a lot of work.
So how do we sort that out? I don't think we have that quite clear yet. Maybe just one corollary point to that: I think in Fedora, we also have a Join Special Interest Group of folks who buddy up with folks. I think what's helped to make that scale is that there's a process, so that anyone who wants to do that mentoring work knows how they can help people. So it makes it easier not just for newcomers to find someone to buddy up with them, but there's also a way for the people who want to be buddies to connect with the newcomers too. So I think that might be one way to help make it easier to scale, because one-to-one can only go so far before you hit your limit. Anyone else have a question for our wonderful presenters? Very good, on my way. Hi, I was wondering if, after the two years of research, you reviewed your metrics process to reach a different understanding of what health in a community means? Yeah, so it's part of this process. The CHAOSS project also publishes metrics. So part of what we're learning is going right into the development of new metrics. So that would be the publishing that we're doing, in part. I think we're also in the stage right now of how we |
Building Open Source Teams |
A round of applause for Bruce, please. Hello, everybody. My name is Bruce Momjian. I am from Philadelphia. I am one of the Postgres core team members, and I'm going to talk for the next 30 minutes about some of the nuts and bolts of building open-source teams. I know the previous talk was a little more high-level. This is definitely trench-level activity here, because that's what I've done for a long time. I'm basically a co-founder of the Postgres Global Development Group. We have a dev room downstairs. We had a big conference on Friday at a local hotel. We have a whole bunch of events coming up: Paris, Stockholm, Malta. We've got one in Chicago. We're going to be at SCALE in a couple of months. We have Ottawa and hopefully one in Singapore coming up, so a whole bunch of stuff going on. I travel a lot. I've been a core team member since the community started in 1996, and I've worked for three open-source companies. I've been around the block, and I have some insights into how we've built Postgres and some of the things that work for us. What I want to talk about first is motivation. That's something a lot of people don't talk about. We take it for granted that people are going to come to us, but why do they come to us? Why do they want to be involved in open-source? What interests them? Why are they even giving their time to us? It's kind of a trench-level question. Then I'll talk a little bit about open-source management and how we manage the open-source project, and then I'll talk a little bit about the development process. Open-source motivations. This is actually from a study done in 2002, published in The Register; the URL is here, right at the bottom. By the way, these slides are actually on my website, so feel free to get the slides and do whatever you want with them. They basically did a survey, and they found that the major motivations for people getting involved with open-source were professional advancement, learning new skills.
That's probably what got me involved: new skills. Practical need for the software. Maybe they have a business need for the software to be enhanced. There are a lot of people in the Postgres community like that. Or their business wants the software to thrive. Now, my employer, EDB, wants Postgres to thrive, so they pay me to do my crazy thing. Mental stimulation is actually a valid reason to get involved. It was actually also one of my reasons for getting involved. I was always curious how a relational database worked, and Postgres was open-source, so I could actually see it work. It was exciting to me. And finally, the belief in open-source, the belief that open-source is a good, and therefore I want to invest in that. So these are kind of the four big ones. I'll tell you the real thing that nobody talks about: if you go up to somebody who works in open-source and you say, which one of these are you, they may not even know. And the weird part of it is that they all kind of work together. Yeah, I love the mental stimulation aspect, and I actually kind of have a need for it, and I'm learning new skills and that might help me in my career. And open-source stuff is cool and I like helping people. Which one is it? Well, it's kind of all of them mushed together, and I may be able to give you percentages a little bit, but that's part of the beauty of it. Usually, if I say to somebody, go dig a ditch, they're going to say, okay, how much are you going to give me for digging the ditch? They know what's motivating them, because they just don't want to dig a ditch. But for open-source software, it's multiple motivations all working together and flowing together. And healthy open-source projects are able to have all of those aspects working at the same time. And if you focus on just one of them, which I know some projects do, you're really not taking advantage of the full spectrum of ways of attracting developers.
I've never heard anybody talk about it before, but when you see it in writing, you're like, yeah, I guess that does make sense, because that's kind of how I got involved and that's what motivates me. So there's all this stuff that doesn't seem to make sense, but actually it does. I mentioned mental stimulation. Programming is one of those unusual activities that does not require any upfront money. As long as you have a computer, you can do as much programming as you want. You're not paying per hour or paying per piece or paying for wire or paying for fabrication of some kind of equipment. It's basically a malleable and cost-free medium that you can continue changing. And that cost-free aspect actually helps people get involved with Postgres particularly. Solving some of the problems we have is kind of a puzzle, and some people like that. How many people like puzzles, right? I mean, yeah, it's kind of fun, right? Postgres has a ton of puzzles that are really hard, and a whole bunch of people would love to work together to solve them. That's one of the things that attracts people. It kind of makes sense. They wouldn't say it this way, but when you look at the people and you look at the way they talk and you squint your eyes a little bit, you're like, yeah, that's really what's going on, and the other parts as well, because they have a good-paying job and they like helping people and they like open source and it's helping their business. There's a whole bunch of things all going on at the same time. They enjoy learning. There's just a whole bunch of stuff. And actually, this comes out of a book from, I think, the 70s, The Mythical Man-Month, right? So, yeah, we're going way, way back here in terms of understanding why programming is interesting.
But I feel that we as open source people, and of course as Postgres people, are... there's a term in English: you're running on all cylinders. You have an engine, and all of the cylinders are working in a uniform fashion for the same purpose. And that's kind of what we have here. So what I'm saying is that to build open source projects, you're not going to open with these things, but be aware that these are the aspects, and that you're going to attract different people who have different focuses and different things that motivate them. Okay? Let's talk a little bit about management. Obviously, managing an open source project is incredibly complicated. Having done it for so long, and having done it when open source was much smaller, communication is obviously a key aspect. The ability to send email (we didn't really have chat back then), the ability to send communication freely across the globe and harness the capabilities of people who are very far away from me, is incredibly important. But you're going to have issues when you may only be able to communicate by email. You may not meet the people that you're working with. You have travel distance challenges. You have time zone challenges. When I was working on the port of Postgres to Windows, I had one person in Stockholm, who's actually at the conference, and another person in Sydney, Australia. So based on their availability, I would wake up in the morning and work with the person in Stockholm, because he preferred to work in his evening time. And then I would go through the day, and at night I would work with the person in Sydney, because that person liked to work in their daytime. So I had this weird thing where I'd work really hard from like 7 to 9 in the morning and from like 10 to midnight.
And the rest of the day was kind of like, yeah. The culture's going to be different. Different cultures are going to have different focuses. One of the things I've tried to do is to fit into the culture I'm visiting or the culture of the people I'm working with. Sometimes that's uncomfortable, but I think it's important. Language, obviously, is a huge challenge. I'm speaking in English. You're understanding English. But I'm not sure how successful I would have been if I had to do everything in French or Flemish or German. I don't know if I would have been capable of being the communicator I am if I had to go learn another language to be involved, right? How can we bridge that gap? I feel I'm very lucky and blessed to be able to speak in English and that you understand English, but that's not everybody. How can we work with people who have trouble learning languages or just don't have the opportunity to learn languages? And how can we show that we value their contributions and we want them to be part of our community? I could have a whole talk just on that, you can imagine. Funding, I'll talk about that in a minute. Communication: I'm pretty good at that. Early on, one of our community members said that to get new people in the community, we just need Bruce Momjian to call them, because I used to call people who were regular community members in the middle of the day and talk to them about whatever they wanted to talk about. I may be calling Germany, I may be calling the other side of the United States, but that personal contact meant a lot. I care about what you're doing, I want to know you. I don't necessarily want to know what you're doing for the community, I want to know you as you. And if you want to do something for the community, that's fine, but that's not my focus, okay.
When I go to visit and I travel quite a bit, maybe 30 some events a year, I'm not there to tell you about our software. I'm not here to tell you about my software either, right. I am here to get to know you, I am here to understand how we can help you and how we can work together in whatever you feel you want to work on. But I don't have an agenda, my employer fortunately doesn't have an agenda for me, so I'm basically, my title is evangelist and Postgres evangelist or open source evangelist, whatever you want to call it, I am not here to sell you to use anything, I'm not even here to tell you to use Postgres, right. I remember I was in, I was in, Sri Lanka once and somebody held up his hand and said, why would I use Postgres instead of MySQL? This was like 20 years ago. And I said, I don't know. Maybe, I don't know. It's not, I said, if you want to use it fine, if you don't want to use it fine. I'm not here to tell you, I'm not here, if you wonder why it has some features you might like, yeah, I can talk about that, but I'm not here to convince you. I am here to get you involved if you want to be involved. Instant messaging, I've actually found it to be really almost better sometimes in the phone now, because everyone's on, typically Telegram is huge. If you're not using Telegram, that's usually what I use for Europe, Russia. A lot, Asia's a little harder. You know, there's WeChat in China and I'm not sure I do a whole lot of chatting with Japan or Korea. I have a couple of Korean guys on Telegram, now that I think of it. Yeah, I mean Google Chat and we don't do too much with Facebook or at least I don't, I'm sure some of my people do, but being able to just chat somebody and say, hey, did you look at this or how you're doing or did this problem in your country cause you a problem or how you're feeling or, wow, that was an amazing email you sent or I loved that patch you did. These little things, it doesn't cost you anything. 
I'm not sending them $100, I'm just talking to them and I'm saying, you did a great job. I was down there at the Postgres booth just before I came up here. I said, this booth looks great. And I said, the dev room we have looks fantastic. We got a Postgres banner out front of the building. There's a banner right near the entrance to the room. We've got people in blue vests that say elephant herder on the back that are sort of helping people get into the room. And I'm like, we look like a million bucks here. I don't know what you guys are doing, this is great. Didn't cost me anything, but they did a lot of work. Somebody had to make those vests. These people are volunteering their time. Somebody had to bring the banners. They got to put them in their car and bring them here. Nothing worse than doing something and have nobody care about it. There's a lot of people who do stuff and nobody cares. Nobody, even if people appreciate it, nobody tells them they appreciate it. It does not cost you anything to tell somebody you appreciate what they do. It doesn't diminish you. It doesn't make you look foolish. They're probably going to thank you and it's going to mean a lot to them. That's a key aspect of building any type of community, particularly a community where nobody gets paid. I mean, I know this sounds like obvious, but a lot of people don't do it. And that's my soapbox. Yeah, I'm sorry about that. I travel a lot and there's a lot of people who help me get to where I am and help me get fine things and help me in the hotels and airports and stuff. I say, thank you. Thank you for helping me. Thank you for working on Sunday. I know you'd rather be home. I think it makes a big difference. It's a mind change. It's a mind change of what type of person you want to be. It's not just an open source. It's basically what type of person you want to be. 
A grateful person who thanks others for helping them, or do you just want to worry about yourself and just get your thing done, and whatever happens to those other people who helped you? Yeah, right? A lot of people are that way, but you're not going to be a leader if you're that way. That's not the type of leader. In fact, there's a really interesting thing. I go to a lot of leadership conferences and there's basically six or seven types of leaders. There's like the innovative leader, there's the managerial leader, there's the sort of organizational leader, and the one that is probably the most important for open source is the servant leader. The leader who is a servant to the people who report to them and wants the best to happen to everyone who's reporting to them. That's the type of leader you have to be in open source because you have no money. You have no control over these people. They are helping you voluntarily, and if they stop doing it, there's nothing you can do. And a lot of management focuses on rewarding people and paying them and stuff, but honestly, servant leader for open source, you can search for it. There's a lot of talks about it. It's actually really interesting. One of the things I found in terms of conferences, and this is a good example, I travel about 90 days a year when COVID isn't happening, and I found that going to somebody's country and spending time with them and staying an extra day after the conference and just hanging out with them and doing whatever they want to do is gold. The conference itself, yeah, okay, people hear me talk and I'm talking about Postgres features and blah, blah, blah, blah, blah. Okay, yeah, maybe I'm good at that, or maybe they can see my slides online and it's not the same maybe, but a lot of what I do is not just speaking to small groups but spending time with individuals in their countries, just doing whatever they want to do. Hey, let's go to the park. Let's go to the zoo. 
Show me something interesting. And it's unbelievable, some of the stuff I've seen by just asking somebody, hey, I'm here for an extra day after the conference. If you want to do anything, let me know. And you don't know what's going to happen to you. And honestly, it's a little scary. You'll notice I have shoes on, but they're actually high-top shoes, because a lot of times when I'm traveling, I don't know what climb I'm going to be going up. Particularly in Russia, I seem to be ending up on the top of a mountain or in a forest or covered in snow or just places I never would have thought I'd go. But again, I have to be flexible. I can't be this sort of like, oh, I'm scared. Anyway, it's actually kind of crazy, and that investment pays back. It really does. One of the issues I've had is, yeah, email's nice, talking to somebody on chat is better. Talking to somebody by voice is better than that. Going to them is the best, right? So I'll tell you a story, and Shirley, are you here or did you leave? Shirley, you're here? No. So I have the shoes right there. So in 2000, I want to say three, Shirley used to work for OSCON. I went to OSCON in San Diego. I didn't travel very much at that point. So I fly out to San Diego, and it was in a hotel right on the water, and I had a tutorial and a talk. So I gave my tutorial, there were probably like 40 people there, and then I gave my talk the next day and there's like 30 people there. And I'm like, okay, so I talked to like 70 people, and I'm on the plane flying home and I'm like, that was a waste of time. If I send an email, thousands of people see it. I just flew across the country, spent like four days, five days, to talk to 70 people. I'm telling my wife, we're not doing the same one. It's 2003. I think I've traveled like 1200 days since then, but anyway, the stats are on my website. 
What I found out is the next week, when I was looking at the Postgres email list, there was a lot of activity that wasn't there the week that I left. Compared to before OSCON, the week after OSCON there was a lot more activity, and what I realized was the people I had lunch with at OSCON were now the people who were actively working in Postgres. I don't know how it happened. I don't know what magic words I said. Who knows? But the point is it was unbelievable, and I'm not sure it was the people in my talk. I think it was the people I talked to at lunch. I talked to them. I got to know them. I guess I must have said something about Postgres or they asked me something about Postgres. We talked, and all of a sudden we were going. And Magnus Hagander, who is actually one of the core team members as well, he's here. I went to a conference early on, it was in 2004, it was in Denmark, in Copenhagen. Yeah, I have my clock right here. I'm good. Away to Copenhagen. And again, it was a very early trip. I remember flying. It was a crazy flight out of JFK in the snow. I remember arriving and, I don't know, I talked to the guy, it was Magnus. I don't even know what Magnus meant. I didn't know that Magnus was a name because I'm from the United States. So I'm talking to him and he's talking to me about stuff and okay, whatever. And I talked to a bunch of other people. And then I came back from Copenhagen, and all of a sudden Magnus is involved and he's working on email with us. I'm like, oh, this is great. Little did I know, he would end up being the president of Postgres Europe and a core team member. You don't know. You don't know who you're going to talk to. So I will tell you travel is time consuming. There's only a few people in any one location, but if you can do it and you have the time to invest in a long-term goal, it pays off tremendously. Time zones, obviously we all work in multiple time zones. 
I've always worked from home. So I don't have a nine to five. My days have always been very long. So I work in the morning and then I work kind of through the day, but I may take two hours off, three hours off in the middle of the day to go shopping or go visit somebody or go to church. I don't know, whatever. And I never really worried about it because the work's always there, right? And at night the people in Asia are awake, right? And in the morning, so it's actually kind of a long day. Having a cell phone is nice because you can communicate when you're not home, but it is a 24-hour cycle, right? There's something happening all the time, and that is a little hard to get used to. Culture, show interest in other cultures. Don't be the person who says, oh, I don't do it that way, or oh, that's wrong that you do it that way. We had a case where there was an inappropriate something at a conference in Russia years ago, and I had to call somebody up in the middle of the night. I called him at his 1 a.m. and he answered the phone, and I kind of talked to him about the issue, and he was able to resolve it by the time everyone got up the next day, right? So again, I already had personal contact with that individual. That individual had already sent his daughter to live with me for a summer to learn English. So I knew him and his family very well, and he answered my call at 1 a.m. I said there's concern about this event, this thing that happened at the conference. I got all the information I needed. I talked to the core committee and we handled it very cleanly. But again, I had already invested in that relationship long before I needed to call that person at 1 a.m. to get an answer so that things didn't get out of hand. It turned out they didn't understand. This was part of the package they bought, and we're like, what country has a package like that? But whatever. You have to be culturally understanding sometimes. 
I spoke at a Russia conference in June and people were saying don't do that. Well, we don't discriminate against where somebody lives, right? So how do I do that? I start the talk and I said I want to say something before I talk about my material. I said I know it's a very hard time for people in Russia but it's an even harder time for people in Ukraine and the Ukrainians. But I said I work for a project that does not have boundaries between individuals. We don't discriminate on where somebody lives and it's in that spirit I'd like to talk to you today. I said the one sad thing is that I have been in Russia many times and I have and I said I'm sad to think it's going to be a long time till I can come back and see you again in Russia. And the feedback I got from them was they really appreciated me saying that. I was able to talk about Ukraine. I was able to talk about Russia at the same time and I think in a balanced way. And they appreciated somebody who was willing to talk about that. So again, I spent a lot of time thinking about what am I going to say to them and how am I going to make sure that I say the right words. But they appreciated it and I continue to have a regular dialogue with them. And we have a lot of developers in Russia still who continue to work on our project. We don't discriminate against them basically. Language can be an issue. We do have some cases where we try and do per language. Remember I talked about the language barrier. We have a French email list. We have a Spanish email list. We have a Japanese email list. A Chinese email list. There's a whole bunch of per language lists to get people started. We have per language, I believe it's telegram channels. I know we have a big Russian telegram channel. I don't remember what other languages we have. I know we have Slack channels for particular languages. We have obviously conferences all over the world. Some of them are in native languages. Some of them are in English. 
Some of them are both or translated. So when I go to Japan, there's typically a translator there. When I go to Russia, often there's a translator, and in China as well. So again, it depends. We have different documentation in different languages too. Funding, we don't have any. So just get over that. Hopefully you'll find a company like mine who is worried about the health of the community and is willing to invest in somebody as crazy as me who just says these things and goes around and does whatever. And the reason is because they're investing so that if there's a problem, he already knows everyone who can get the problem fixed real cleanly. And that's again one of the investments. In the development process, we try and involve everybody. Find each person's motivation. Remember, different people are motivated by different things. So figure out what they're motivated by and try and put them in a position where they're going to be strong. I don't know how to put this, but none of us is perfect. We all have problems, things we do wrong. So I try and look at a person and say, what are your strengths and what are your weaknesses? How can I put you in a place where your strengths are going to shine, and kind of have somebody backstopping the weaknesses so that it doesn't become a problem? Reach out to individuals. Again, that personal contact is important. Harvest the strength of the team. There's always somebody smarter than you. I learned that early on. There was a guy in Krasnoyarsk, Russia, who was so smart that, early on, not only could I not answer his questions, I didn't understand why he'd ask the question. So I'd say, I don't know the answer, but why are you asking the question? And then it's like, oh, okay, now I at least understand why you asked the question. Produce work that people are proud of. I'm not always Mr. Sunshine. If I'm embarrassed by something, my community members will hear about it. 
In fact, somebody just told me today, I railed against how a website looked. It was kind of stuck in an ugly state. This was years ago in Dallas. He remembered where it was and what year it was, and he said, you didn't like the website and you basically yelled at everybody and said, I don't care what you do to the website, do something. Because we were stuck trying to look for something perfect. I said, just do something, and that kind of got us over the hill. Produce clean code. Remember, we're not paying people to code, so we better make it easy. And finally, manage the team. Lead by example, not from authority. This is a big thing. I didn't realize it myself, but the leadership really has an unspoken control over how the project and the team work. If you have toxic leadership at the top, you're not going to have a well run team at the bottom. If you have servant leadership at the top, the people underneath you will be servants to those below and so forth, and will continue promoting people up and up into more powerful positions. That is absolutely a key aspect. I've been surprised at how well that works. Accept failure gracefully. When I make mistakes, I'm like, I'm sorry. And when I say I'm sorry, and I say I made a mistake, it opens the door for other people when they make mistakes to also say I made a mistake. Because if Bruce can say he was wrong about this thing and basically take the lumps on it, then, oh, I guess, look, it didn't hurt. I guess I can do that too. If I'm the type of person who ignores any mistake I make, then that's how other people are going to do it. So seek consensus. The Postgres community does that a lot. Okay, let me take some questions. I'm going to have one question. We have one minute for one question. Who would like to be our one question in our one minute? I see a hand. It's Ilya. She's going to give me an easy one, I'm sure. Hi, Bruce. Hey. 
So here's my question. How do you find balance? I mean, you said you take calls at 1 a.m., you travel three months a year, blah, blah, blah. You have wife, you have children. How do you do that? Right, so the question is how do you find balance? Basically, you have to just accept your failures. You have to accept that I'm not going to be everywhere, that I'm not going to be able to fix every problem, I'm not going to be able to do everything. And you now rely on your leaders. So for example, a great example, when I'm here at the conference, I have no idea what's happening in the community. No idea, because I trust all of our people who have been working for years on this, so when I travel, I don't even bother reading the community email list. When I get back, I'll take a look at it, and you know, nothing bad happens when I'm gone. So what I've realized is not to be anxious about things, to trust the people who are part of your organization, to handle it, and by showing confidence in them, they become confident. If you show you're not confident then, they will not be confident. It's kind of natural. Thank you. Appreciate it. Thank you. |
Do we still need to have virtual events?
My learnings from organizing virtual community events |
Thank you so much, Ray. Can we get a round of applause for Ray, please? Thank you. You figured something would go wrong, but this was resolved pretty quickly. So thanks for sticking around to the last session of the Dev Room. It's really good to be back in person, and I realize the irony of talking about virtual events when we're all together, but bear with me. So my name is Ray Paik. I currently head community at Cube. If you're not familiar with Cube, we're in the semantic layer space of data analytics. If you want to find out more, check out our website at cube.dev. For now, I'm still on Twitter. I'm like a lot of people trying to figure out what to do with Twitter, but you can reach out to me on Twitter or LinkedIn or other social platforms where you can find me. And if you came to my session three years ago in 2020, back then I was working at GitLab, where I also did a lot of virtual activities. So in addition to having a lot of experience, along with all of you, with virtual events, I want to sort of share what we all collectively learned about getting together in virtual formats. So what we'll talk about is, first of all, I wanted to sort of step back and talk about why we even have events in open source. I mean, a lot of us, even before the pandemic, worked really well across different time zones and different cultures asynchronously, but we invest a lot of time and money in events, and I just want to briefly talk about why. And then what we learned, especially in the early days of the pandemic. The experience was, I think, universally not great with virtual events, but I wanted to sort of discuss what we learned over the past couple of years. And I also want to talk about events that actually work relatively well in a virtual format. And I think that's why virtual events aren't necessarily going away. They'll be in a different, you know, improved format. 
But I want to talk about some topics or formats where a virtual setting actually works pretty well, and how it can even complement or enhance in-person events like this one. So I think all these photos are from pre-pandemic. I just looked at different events. The one on your top left, that's from KubeCon. It's a huge event that gets more than 10,000 people typically, right? I don't know what the number was in Detroit last year, but you know, it's a huge event, almost a week long. And the other picture, it's sort of the other end of the spectrum. I'm in that picture somewhere. I can't remember where I am because it's been like four years. But that's KDE Akademy. They get together once a year for almost a week. And I mean, this was a good way for me to get introduced to the community. I didn't know anything about KDE, or even what the difference was between KDE and GNOME. But they really made me feel welcome. And it was a good way for me to get sort of onboarded and introduced to people in the community. So I definitely enjoyed that. And even at large conferences like KubeCon or like OpenStack, you know, I mean, I need to remember to call them Open Infrastructure Foundation. But one of the things I like about a lot of these conferences is that they have a developer hacking room, like a hacking lounge. So even if you've been contributing or working on something for a long time, you get a chance to sort of work together with somebody in person in a more intimate setting. And I mean, I find myself sort of hiding in developer lounges with a lot of other developers. So, you know, all these sessions are great. The ability to interact with other people in person and collaborate is obviously a great benefit of these large events. 
But I think, like a lot of people, what I appreciate the most is, I mean, Bruce just talked about it, the ability to sort of form a personal relationship with somebody in person. This is really difficult to do virtually. You know, you can work with somebody collaboratively for years. But you're not going to go much beyond having a great professional relationship with somebody. I made a lot of good friends in open source over the past several years. And I guarantee you, some of that happened in really informal settings. There's something intimate about sharing a beer or sharing a meal with somebody that you can't get over Zoom or whatever platform you use to collaborate with somebody. And especially if, you know, maybe you had some conflict with somebody over meetings, like the direction of the project, how you're implementing something, those tend to all go away if you share a meal with somebody. I mean, I've had so many instances of that happening. So, you know, being able to meet with somebody in a more relaxed setting is something I really look forward to. And I found this picture of a hallway because I think the great benefit of these conferences is the hallway track, right? And I don't know if you did the same thing. When I was putting together my schedule for FOSDEM, as I was trying to look at which sessions I'm going to, I also reached out to the friends that I knew were going to be here, trying to decide when am I going to meet you for lunch, when are we going to get together for, you know, a beer at our favorite bar near Grand Place. So, that hallway track is obviously probably one of the most important tracks that I look forward to in every conference. Unfortunately, this all came crashing to a halt three years ago. I mean, I don't know if you were here at FOSDEM in 2020. Shortly after that, the world sort of stopped, right? 
And then, you know, a lot of people made herculean efforts to move events online. I remember I presented at an event in April. I mean, this is something that was sort of a local large meetup kind of thing in the Bay Area. They had been scheduling it for a while for DevRel people. And, I mean, they did a wonderful job moving everything onto Hopin. I think before then, I didn't even know what Hopin was. They had some technical difficulties, even though I went through some speaker training on the platform. But despite all that, they did a wonderful job of moving things quickly online. I mean, they only had about a month to do it. Having said all that, what were our experiences with virtual events in the early days? And, I mean, I found these pictures, and this is probably how a lot of us felt when events moved online. And it's not just because we missed the hallway track or, you know, a networking session over beer. A lot of people, I think, talked about Zoom fatigue. They were just tired of having more interactions on Zoom. I don't think I struggled with that as much, because I was used to having a lot of Zoom meetings anyway. What I struggled with, what I found out, was that I'd attend a session. There'd be a 45-minute session, and then they'd give you a 10-15-minute break, because that's how the events used to run, because you need to move to a different room to get to the next session. During those 10-15 minutes, I didn't have anything else to do. So what I'd do is check Slack or email from work, right? I mean, when I'm at a conference like this, I'm pretty good at compartmentalizing. I tell myself, I'm not going to check Slack until lunchtime or morning break or whatever it is. But when I was sitting at home, sitting at the place where I work all day, I can get easily distracted and I'll check a Slack message. 
I wasn't urgent, but I'll respond to it, and then I realized, like, 20 minutes later, oh, shoot, this other session I was going to go to, I'm 10 minutes late to it. So, well, I might as well just watch the recording then, right? So I was, I think I was getting, like, really confused. Like, am I at work or am I at a conference? Because I couldn't tell, because I'm just in my same home office. That's what I struggle with, and I think that's why the experience was really bad, because they're trying to force-fit in-person events, and they're trying to keep the same format, but move that onto whatever platform that people were using. And that obviously didn't work very well. Because, you know, I think what we realize now is that you don't need a 15-minute break, right? You can just move to the next session. I mean, you don't want to do that for eight hours, but you don't want people to be, like, switching their, you know, switching to a different mode all of a sudden, like, because you have a 15-minute break. So I think that's one of the things that I really struggle with in the early days. And, but if you think about, like, because I think when we thought of events before the pandemic, we always thought about these, like, almost week-long conferences, like KubeCon, or even like KDE Academy, that's almost a week long. You're at a different location for about almost a week. But if you think about it before the pandemic, a lot of us in open source were doing a lot of collaboration work, like, on a virtual platform, and not necessarily in person. Like, Hackathon's a good example. I mean, this is something, you know, I help start, like, when I was at GitLab, we once a quarter, community members will get together for two days, and then we'll just, you know, you don't have to commit to be, you know, at the Hackathon for 48 hours, but whatever time you can commit to in those two days, like, we're going to encourage, like, contributions from the community. 
So, obviously, the metric we looked at was how many contributions we were getting. This is something that we can count and measure. But what was more gratifying, I mean, at the end of my two-plus years at GitLab, what I noticed during the Hackathon is that there was just a lot more chatter on our online chat platform, we were using Gitter at the time, and more people were helping each other, and community members were forming connections and bonds, right? So that was very gratifying, in addition to the lot of contributions that we were getting. So that's one example. The other example, on the top right, is my presentation from two years ago with my friend Sophia. So this is about documentation, how to really accelerate documentation on open source projects. And the example I gave was, let me step back, I was working at the Linux Foundation at the time, and Sophia is at Ericsson. So a lot of the networking projects in the LF had set release cycles, either a three-month or six-month release cycle, and if your sub-project wants to be part of the release, there are a couple of documentation-related milestones that you need to hit. If you don't have stuff documented, sorry, you're not going to be part of the release. That's sort of one way of us forcing documentation milestones to be met. And the first documentation milestone is relatively simple. You just basically, in your repo, create a sub-directory for the release. For project ABC, you need to have the documentation structure sort of laid out with some header information and labeling. So it's sort of a routine task. It's not something that gets you up in the morning necessarily, but it just has to be done. So what I suggested to Sophia was, let's make a fun event out of this. We're doing sort of a household chore, but let's sort of, you know, almost gamify it. 
Let's all get together virtually for a couple of hours and have people complete their documentation milestone for that release. And then the first documentation milestone for that particular release for the OPNFV project happened to be a week before FOSDEM. So I contacted Sophia. I said, I'm coming to Europe anyway. Why don't I come to Ericsson in Sweden? Because, you know, she's the documentation lead for OPNFV, but there are a couple of other Ericsson people that were leading other sub-projects in OPNFV. So we'll get together in a conference room. We'll set up a bridge so people can join if they want to. And to my surprise, when we started at 10 in the morning in Stockholm, people dialed in from Nokia, an hour ahead in Finland. That I sort of expected, because the technical steering committee chair is at Nokia, and I was pretty sure that he would be able to sort of twist some people's arms at Nokia. But what we found out on the audio bridge, it was on GoToMeeting at the time, was that people from Japan joined. They all got together at the NEC office in Tokyo. And they were just going to spend 30 or 45 minutes on the documentation work, and then they were going to go for beers. Which was perfect, right? It got people to get together. And the same thing happened with some people from China. So, I mean, we worked for an hour and a half, two hours in Stockholm, went for lunch, and had another session, so people from the rest of Europe that were late starters, and then North America, could sort of join in. And then we saw the same thing happen, so that was pretty fun. And then that became sort of a tradition for a lot of the documentation milestones. It just made it fun. You don't have to be in the same city at the same time, but it gave people a chance to bond and work together. 
The other example that I wanted to add: you see a lot of open source communities or commercial open source companies do this. They do these workshops. And we do the same thing at Cube. This is a way for us to provide educational content to people, because they'll ask questions like, we want to understand your caching feature better, or, how do we do data modeling? And one of the stars of our webinars is sitting there, Adnan. So this gave us not just a way to reach people synchronously through the webinar; it also helped us build content on our YouTube channel, and this really helped. I can't remember which speaker it was, whether it was Bruce or maybe another speaker, but content is really king, right? So you want to provide resources that community members can use. And, yeah, we've been doing this since even before the pandemic. This is something that's always been in our arsenal and something that we need to take advantage of. There are definitely a lot of benefits to virtual events, and here are a few of them. One is that something like a webinar takes almost no effort to spin up. If you have a Zoom account, I guess you'll have to upgrade to Zoom Webinars, but if you're not going to have more than 100 people on your training sessions, you can just get away with your regular Zoom account for up to 100 people. So it's pretty easy to set up and start, because it doesn't cost very much. It allows you to do a lot of experimentation, and the documentation event is a good example. The only cost was really setting up another audio bridge on GoToMeeting at that point. Compare that to what people consider a pretty lightweight in-person event, like a meetup.
Let's say you want to schedule a meetup at one of your local community members' companies for a couple of hours. You have to figure out a room. Maybe you have to order food and drinks to encourage people to come and to actually feed them while they're there. You may also have to deal with security at your company so that people can have access to your conference facilities. So that's not a trivial amount of work. And supposedly it's not a full industry event. It's a casual in-person meetup. But even that takes several hours to plan and organize, right? And it's not just for organizers. Even for community members, if you force everything to be in person, you're going to end up incurring some costs. I remember going to a local meetup in the Bay Area, where I live. Without traffic, I should have been able to get there in 15 or 20 minutes. And this was at noon on a Thursday. You wouldn't necessarily expect traffic. But for whatever reason, it took me about 40 minutes to get there. And I was just rolling my eyes as I was driving. This is not a good use of my time. I'm actually commuting to go to a two-hour event. So, yeah. Compared to a lot of the in-person stuff that we've traditionally been doing, you can spin things up at very low cost. The other thing I like about virtual events, and this is an observation, I don't know if this is scientifically true: for a lot of the virtual events, the event material seems to be available a lot quicker. Because what happens is that if you're doing a YouTube livestream, unless you have to do an edit, it's available right away. Right? Even if you weren't able to join the session synchronously.
So unless you insist on doing a professional edit of the livestream, it's available right away to people in the community, or even people outside of the community. With Adnan and other colleagues, we've done 15 to 20 workshops over the past year and a half. I only had to edit one of our livestream videos. And that's because we had some technical difficulties, like I did today early on, with one of the panelists. All I did was cut out the first three minutes of the video, because we were just struggling with buttons. So I think that's another reason why I'm a big proponent of trying to see if you can move stuff online versus doing it in person. I talked about documentation already, and there are other things that work pretty well in open source communities in a virtual setting. One is issue triaging. GitLab used to do this over the weekend. I think they stopped for a while. If you have an open source project of decent size, you're going to have thousands and thousands of issues. Labeling them and putting priority on them is a mundane task that's not necessarily fun. But it gives people an opportunity. You don't have to spend the whole weekend. Any amount of time that you can contribute to helping triage issues is welcome. And the big one is finding duplicate issues and merging them, right, which is a great house-cleaning type of thing to do. So that's another type of activity that you don't necessarily need to do in person. And the other one is testing. And I'm not just talking about unit testing or functional testing.
But even if you have a UI team or a UX team that wants to get community feedback, this doesn't have to be done in person. And this is pretty simple to set up online. So, as for the other picture, that's supposed to be a tape-cutting thing. If you're doing your inaugural event, rather than insisting on trying to do this in person, try experimenting with the content and the format online. Because it's going to cost you a lot less. At Cube, we're thinking about doing our first conference sometime in September. And we already decided this is going to be done virtually. Because if we were to secure a venue all day in San Francisco, that alone is going to be pretty expensive. And the logistical challenges are going to be a lot worse. So we're going to start online and make sure that we validate the format, the content, and the type of audience that we're going after. And then, at some point, when we're large enough to do it in person, we will have built the audience, and we will have already tested it out in a more cost-effective manner. All right. So, I mentioned briefly complementing in-person events. As we're putting together this conference later this year, I talked to several event management vendors that have experience in in-person, hybrid, and online events. I started talking to them in October, November last year. One of the questions I asked every single one of them was: a lot of the events are going back in person, so are people abandoning virtual or hybrid events? The answer from every single one of them was no. There still seems to be a market for virtual events. And one good example that a few of them provided: this isn't just broadcasting or live streaming an in-person session that's happening, like it is now.
What's happening is that a lot of companies or communities are moving things virtual, almost as a separate track. A good example is that a lot of conferences have day zero events. If you have an event that's officially starting on Tuesday, they might have something else going on on Monday. It could be a local meetup. It could be a project team meeting. Or some open source communities do this, and OpenStack used to do this: they'll have an orientation for new members. If you want to get a good introductory overview of all the projects in OpenStack and learn how to contribute, submitting your first patch, for example, they actually did this over the weekend before their conference started on Monday. So a lot of the day zero events, I think, are moving virtual, so that even if you can't travel on site to that conference, you can still participate. You're broadening the audience and allowing more people to participate in at least some portion of the event, even if they can't travel. And my suggestion, and I'm all for day zero events, is that I would even make this happen the week before. Let's say your day zero traditionally was on Monday. I would do this virtually on Thursday or Wednesday the week before. And the reason I say that is so that, rather than having to arrive on Sunday to go to a day zero event on Monday, people can just stay home, right? They can watch this virtually, and they can spend Monday traveling, so they spend less time away from home. I'm sure a lot of you have kids like I do. If you have to be gone for a whole week, that just creates a lot of logistical challenges, right? Like, who's going to take her to soccer practice, et cetera, et cetera.
So the less time you're away from home, I think it's a bonus. I think you're doing your community members a big favor, too. The other thing that I like about a lot of this virtual format, and I see this happening with FOSDEM, too, with some of the developer devrooms, is that you're basically increasing the capacity of a lot of the talk sessions, because you're creating a parallel track to what's happening in person. I've been on both sides of the CFP process. Unfortunately, you're going to have to make some hard decisions about not accepting a talk, which is always a hard thing to do. By increasing the capacity, you have to deal with less of that. Especially for first-time speakers, a lot of the conferences have gotten pretty competitive in terms of how difficult it is to get your talk accepted. So this helps potentially increase your capacity for more speakers and more discussion topics. So, a couple of quick do's and don'ts. I sort of mentioned the first one in terms of the don'ts. When you're moving things to a virtual format, don't do the same thing that you would have done in person. You don't need a 10 to 15 minute break between every session. You want to have some breaks during your conference, but you don't want people to be distracted, because they're just watching it from their home office, or even if they're in the office, they're watching it from their desk. They don't need the 10 minutes to walk to a different room for the next session. So when you have people's attention, make the most of that opportunity. And I've also seen some communities do this: let's say you have five or six hours of content. You don't need to do it in one single day.
You can even split it up over two days. I've seen some companies do this over three days. So you only have a two-hour chunk that you need to schedule over the two- or three-day period, which might make it easier to fit into your schedule. Although I'd be a little worried about that, because how do you make sure that people don't drop off after day one? So you have to manage that. But that's another thing I've seen some communities and companies do. The second one is the one that really drives me crazy. And I mentioned this, too. If you have stuff live-streamed, make it available on your YouTube channel right away, right? Don't make people register to get content. You're basically asking for people's email addresses so they can watch the video. And I hate it when that happens. I went to a conference a couple of months ago because our CTO was presenting, and they told us the video should be available within a week. I said, great. And then I went back to the event website. They said, oh, if you want to watch a recording, please register by providing your email address. And I think I was diplomatic. I sent them an email and said, this doesn't seem very open source to me, if you're asking for an email. And to their credit, they came back and said, actually, if you go to the YouTube channel, you'll find it here. So why don't you tell people that on the website? Why are you kind of hiding that fact, right? It just drove me nuts. So please, please don't do that, putting a wall around your content. It's like, please provide your email address to download this white paper, and, well, that doesn't sound like a white paper then, right? But anyhow, I guess that's my soapbox.
And then, if you're doing the event virtually, don't force people to participate synchronously, because you have time zone issues, right? This is why I like the model we had at GitLab. You have a two-day window, and you can participate any time you want. We're not asking you to spend 48 hours, unless you want to, right? Even if you can spend just 30 minutes over a two-day period, that's great, right? I'm not going to judge you because you have better things to do. So, on the other side of the slide here, Adnan was really good at this. I'm not just saying this because he bought me beer last night. When we did our workshops, Adnan was really good at making materials available a week in advance. So people can check out the slides and prep the content, and we also have a hands-on workshop on our cloud instance, called Cube Cloud. So people can actually tie it up to their data sources and start experimenting with it. The sooner we put it up, the better our scores actually were, even if the slides changed over the next several days. Having the content available ahead of time so that people can prep just provides a much better experience. And it also allows people to ask questions ahead of time. So you're almost having an asynchronous Q&A session. That just makes life easier for a lot of people, including presenters, right? Because you know what to expect. The last one is sort of similar. The Linux Foundation did this a few years ago. I made a presentation, again with Sophia, on documentation. And there was a Slack channel for Q&A. After our presentation, we were on there for 30, 45 minutes, which I thought was great. Because typically, when you're at a convention center, you kind of get shooed out after your presentation.
So people only have a five-minute window to ask you questions, if you haven't run out of time, right? This just gave us a lot more time to answer questions, and also for people to be vocal with their comments and feedback, which I thought was great. And I made sure that I let a lot of people know that I thought that was pretty awesome. So I know I'm almost out of time, and I think I've covered most of these. It's not either-or. Virtual events can complement what you're doing in person. And just remember that we've been doing virtual collaboration for a long time. This is definitely an important toolkit to help community members, especially people who can't always travel for whatever reason, financial or otherwise. And the final point I want to make is that the platforms that we're using for online events, including Hopin, and I'm not saying that's my favorite tool, but even Hopin has really evolved over time. One of the things I noticed when I talked to event management vendors is that they'll give a list of, here are the virtual event platforms that we support. One company provided 15 logos, including Hopin. So there are just more alternatives available. I think we have better platforms and tools to run events virtually. And yeah, I think that's it. I just want to make sure I have time for questions. Hopefully I'll be able to answer them. Thank you. Great. Because it's you, you can do one question even though we're over time, because we think you're awesome. All right. Thank you. Anyone want to ask a question? I see a hand. Off we go. You are the final question of the Community devroom for FOSDEM 2023. Congratulations. Well, thank you, Max. Thank you very much for the talk. Thank you. We have a global market, but obviously we don't have big resources like big companies. So I wonder how much it costs to set up the whole logistics of a virtual event with live streaming.
Because we definitely have a global market. So obviously you can't rely on everyone attending your event if you do it in person. But I'm thinking probably the best thing is to just record it and then make it available online a week later or something like that, rather than having to live stream it as a proper hybrid event. What do you think? Yeah. So the question, for people online, is that even Zoom is not free, right? So is it okay to pre-record things and make content available for people to consume and ask questions about? I think that's completely valid. Actually, one of the things I learned from these vendors is that, on a virtual platform, they don't encourage you to have a live presentation and a Q&A, because there are just a lot more technical difficulties that you can run into. They call it simulive. They prefer to have the presenter pre-record and be available for 10 minutes of questioning, as an example. But if you don't want to have a person available online for 10 minutes of questioning, you can still create a Google Doc: here's a recording, you submit your questions over this period of time, and the presenter or other people can answer them. So it's all documented, right? I think that's completely valid. And you don't have to use Zoom. I've been using Zoom for the past few years, so that's what I'm comfortable with. But you can do a live stream from Google Hangouts. I haven't tried it hands-on myself, but I know that's doable. So that's an option. But you can have something pre-recorded available on a YouTube channel, and say, here's a document, which could be a wiki page or Google Docs. Please submit your questions. We'll close the questioning in seven days, and whoever the presenter was, whoever the content expert is, will get back to you with their answers. So I think that's a completely valid thing to do.
It doesn't have to be like 100% interactive. So, thank you. |
Community Closing remarks |
Rust based Shim-Firmware for confidential container |
Hello everyone. Today I'm happy to join the Confidential Computing devroom to share information on Rust-based shim firmware for confidential containers. I'm Jiewen Yao, Principal Engineer at Intel. I have been engaged as a firmware developer for about 20 years, working on the UEFI, TCG, and DMTF industry standard working groups. I'm an architect for the TDX virtual firmware. Here is today's agenda. First, I will show you some background on the firmware and why we need the shim firmware, and then the TD-Shim internals. Today, the industry is adding hardware-based confidential computing support, for example AMD SEV or Intel TDX. This figure demonstrates the concept of confidential computing. The hypervisor, or VMM, is on the bottom. On the left-hand side, the red box shows the legacy VMs. This is a traditional VM hypervisor environment. The hypervisor has the highest privilege, so it can access or modify the VM environment. On the right-hand side, the green box is a confidential computing environment. We call it a TEE, a Trusted Execution Environment. Like a virtual machine, it includes the virtual firmware, the guest OS, and user apps. The VMM on the bottom is untrusted. With the help of the hardware SoC, such as TDX or SEV, the TEE is isolated from the VMM and from other TEEs. Inside the TEE, memory and CPU state confidentiality and integrity are provided to keep the sensitive IP or workload data secure from most hardware-based attacks. Since the VMM is still the owner of the whole system's resources, such as memory and CPU, and the VMM also manages TEE launch and teardown, denial-of-service attacks are out of scope. In a traditional VM hypervisor environment, we need a virtual firmware to provide services to the guest OS. For example, the EDK II OVMF, Open Virtual Machine Firmware, provides the UEFI services in the virtual firmware. This is also true for the TEE environment. For example, we need to modify OVMF to add the TEE support.
The TEE virtual firmware owns the first instruction of a TEE, which is the reset vector. Similar to traditional virtual firmware, the TEE virtual firmware loads the guest OS loader and jumps to the OS loader. The TEE virtual firmware enables the trusted boot capability to build a chain of trust from the hardware to the TEE OS. Here we list the existing virtual firmware solutions as examples. SeaBIOS is a legacy 16-bit BIOS solution. It is used to boot a legacy guest OS, such as Windows XP or non-UEFI Linux. Currently, the most widely used UEFI solution is OVMF, the Open Virtual Machine Firmware. Xen and KVM use OVMF to boot a UEFI guest OS, such as UEFI Linux. The Cloud Hypervisor firmware is used by Cloud Hypervisor as a lightweight solution. It does not have UEFI services. The TEE hardware solution may have special requirements for the TEE virtual firmware. Take TDX as an example: the entry point must be 32-bit. It needs a special multiprocessor wakeup structure for the guest OS. The TEE needs to explicitly accept the assigned memory before using it. DMA for virtual devices requires a shared/private memory attribute switch. The TEE virtual firmware must support measurement extension for the next component to build the chain of trust for the TEE. To meet those special requirements, the UEFI solution OVMF added TDX support and SEV support. We call it TDVF, which stands for TDX Virtual Firmware. TD-Shim is the guest firmware solution to replace the Cloud Hypervisor firmware to support the confidential container use case. TD-Shim is a lightweight virtual firmware for the confidential container environment. It's written in the Rust programming language. Currently it supports TDX. It's located in the Confidential Containers community, and all the development work is open source. We have three release tags now. The responsibility of TD-Shim is to own the first instruction, or reset vector, of a TD.
TD-Shim provides the required boot information, such as the memory map and virtual CPU information, to the next phase, which we call the payload. The payload could be the OS kernel or a bare-metal execution environment for a service TD. TD-Shim needs to build the chain of trust from the Intel TDX module to the payload. Here is the boot flow comparison between TD-Shim and TDVF. The right-hand side is the TDVF-based solution. The VMM passes the TD HOB to TDVF as an input parameter; it carries the input memory information. TDVF builds the UEFI memory map, creates the UEFI services and ACPI tables, then loads and launches the UEFI OS loader and the UEFI OS. The left-hand side is TD-Shim. The VMM passes the TD HOB to TD-Shim, same as for TDVF. TD-Shim builds the E820 memory map and creates the static ACPI tables, then loads and jumps to the Linux guest kernel directly. The OS loader in the middle can be skipped. Here is the comparison between TD-Shim and TDVF features. From a use case perspective, TDVF is for the confidential VM or the rich service TD environment. TD-Shim can be used for the confidential container and the small service TD. TDVF is written in C, while TD-Shim is written in Rust without std support. TD-Shim does not provide any UEFI services, OS runtime, or device drivers, which is different from TDVF. In order to support multiple processors, TD-Shim still provides the static ACPI tables, such as the MADT and the MP wakeup structure, which is the same as TDVF. The virtual device IRQ information is in the DSDT in the TDVF case, but the DSDT is not required in the TD-Shim use case. As such, the virtual IRQ information can be passed as part of the boot parameters in TD-Shim. For the memory map, TD-Shim uses the E820 table to provide the TEE memory map information, while TDVF uses the EFI memory map. The trusted boot support is the same between TD-Shim and TDVF. Both solutions need to extend the measurement of the next component to the RTMR and build the event log for the measurement. Secure boot is also supported in both TD-Shim and TDVF.
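The step of turning the memory information in the TD HOB into an E820 memory map can be sketched as follows. This is a minimal illustration, not TD-Shim's actual code: `HobMemoryResource` and `build_e820` are hypothetical names, the HOB is reduced to a flat list of ranges, and `Vec` from the standard library is used for brevity even though the real firmware is built without std. Only the E820 type numbers (1 for RAM, 2 for reserved, and so on) follow the conventional E820 numbering.

```rust
/// E820 memory types, using the conventional E820 type numbers.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u32)]
pub enum E820Type {
    Ram = 1,
    Reserved = 2,
    Acpi = 3,
    Nvs = 4,
    Unusable = 5,
}

/// One E820 entry: base address, length in bytes, and memory type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct E820Entry {
    pub addr: u64,
    pub size: u64,
    pub kind: E820Type,
}

/// Hypothetical, simplified view of one memory range reported in the TD HOB.
#[derive(Clone, Copy, Debug)]
pub struct HobMemoryResource {
    pub start: u64,
    pub length: u64,
    pub usable: bool,
}

/// Build an address-sorted E820 table, merging adjacent ranges of the same type.
pub fn build_e820(resources: &[HobMemoryResource]) -> Vec<E820Entry> {
    // Drop empty ranges and sort by start address so neighbours can be merged.
    let mut sorted: Vec<HobMemoryResource> =
        resources.iter().copied().filter(|r| r.length > 0).collect();
    sorted.sort_by_key(|r| r.start);

    let mut table: Vec<E820Entry> = Vec::new();
    for r in sorted {
        let kind = if r.usable { E820Type::Ram } else { E820Type::Reserved };
        if let Some(last) = table.last_mut() {
            // Merge only when the previous entry ends exactly where this one starts.
            if last.kind == kind && last.addr.checked_add(last.size) == Some(r.start) {
                last.size += r.length;
                continue;
            }
        }
        table.push(E820Entry { addr: r.start, size: r.length, kind });
    }
    table
}
```

A payload such as a Linux kernel would then receive this table as part of its boot parameters instead of querying a UEFI memory map.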
For secure boot, the difference is that TDVF uses standard UEFI secure boot, while TD-Shim uses a customized secure boot solution. We will introduce that later. The size of the image is different. By default, the TDVF OVMF image is 4 MB, and it has kept increasing recently. But TD-Shim without secure boot is only 140 KB, and even with secure boot it is only 270 KB. That's why we call it a shim firmware. Now we can introduce more TD-Shim internal information. In the TD-Shim project, we define the TD-Shim specification to standardize the interface between the VMM and TD-Shim, and the interface between TD-Shim and the payload. TD-Shim itself includes the reset vector. The reset vector is written in assembly language. The code is run by the bootstrap processor (BSP), whose virtual CPU index is always zero. The BSP will park the other application processors (APs), switch to x64 long mode, set up the stack for the Rust code, then jump to the shim main function. The shim main function is written in Rust. It parses the TD HOB input from the VMM. It measures the TD HOB, gets the memory mapping information, and builds the E820 table. Then it accepts the memory, loads the payload, and jumps to the payload. People may use different payloads in different use cases. For example, in a normal confidential container use case, TD-Shim can boot a Linux kernel directly based upon the Linux boot protocol. In the service TD use case, TD-Shim can boot the migration TD core to make it a migration TD. The migration TD is a service TD used in TDX 1.5 to support guest OS live migration. Now we will introduce two important features in TD-Shim, trusted boot and secure boot. They are both documented in the TD-Shim specification. First, let's take a look at trusted boot. In the trusted boot flow, one component must measure the next-level component before transferring control to it. Later, a remote verifier can get the measurement data, with a digital signature signed by a trusted entity, and verify that the TD environment launched as expected.
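The measure-before-transfer rule, together with a verifier that later checks the recorded measurements, can be sketched as below. This is an illustrative sketch under stated assumptions, not TD-Shim code: `Measurements`, `EventLogEntry`, and `replay` are hypothetical names, and a 64-bit FNV-1a hash stands in for the SHA-384 digests that real TDX measurement registers hold, purely to keep the example dependency-free. The extend operation is the usual "new value = hash of old value followed by the new digest" construction.

```rust
/// Stand-in hash (64-bit FNV-1a). Real TDX measurements use SHA-384;
/// this placeholder is NOT cryptographically secure.
fn digest(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Extend operation: new_value = H(old_value || measurement).
fn extend(register: u64, measurement: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&register.to_le_bytes());
    buf[8..].copy_from_slice(&measurement.to_le_bytes());
    digest(&buf)
}

/// One event log record: which register was extended, with what digest.
pub struct EventLogEntry {
    pub register_index: usize,
    pub description: &'static str,
    pub digest: u64,
}

/// Four runtime measurement registers plus the event log describing them.
pub struct Measurements {
    pub rtmr: [u64; 4],
    pub event_log: Vec<EventLogEntry>,
}

impl Measurements {
    pub fn new() -> Self {
        Measurements { rtmr: [0; 4], event_log: Vec::new() }
    }

    /// Measure a component before control is transferred to it:
    /// extend the register and record the event.
    /// Panics if `register_index` is not in 0..4.
    pub fn measure(&mut self, register_index: usize, description: &'static str, data: &[u8]) {
        let d = digest(data);
        self.rtmr[register_index] = extend(self.rtmr[register_index], d);
        self.event_log.push(EventLogEntry { register_index, description, digest: d });
    }
}

/// Verifier side: replay the event log from zeroed registers. The result
/// must equal the reported register values for the log to be trusted.
pub fn replay(log: &[EventLogEntry]) -> [u64; 4] {
    let mut regs = [0u64; 4];
    for e in log {
        regs[e.register_index] = extend(regs[e.register_index], e.digest);
    }
    regs
}
```

Because each extend folds in the previous register value, neither the order nor the content of log entries can be changed without the replayed values diverging from the registers.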
This verification flow is called remote attestation. TD-Shim supports the trusted boot flow by extending the measurement to the TD runtime measurement registers (RTMRs). The TD-measured components include the TD HOB, the payload, the boot parameters, etc. At the same time, TD-Shim provides a confidential computing event log, called CCEL, to the verifier. The event log may be used to reproduce the digest value recorded in the RTMR. As such, the verifier can check each individual component described in the event log. The final attestation can be based on the hash of the measurement register or the hash of the event log. The TDX architecture provides one MRTD and four RTMR measurement registers to map to the TPM PCR-based measurement. MRTD maps to PCR0 as the firmware boot code, which is TD-Shim itself. RTMR0 maps to PCR1 and PCR7 as the firmware configuration, such as the TD HOB from the VMM or the secure boot policy. RTMR1 maps to PCR2 through PCR6 as the OS or payload information. RTMR2 maps to PCR8 through PCR15 as the application information. Different from trusted boot, secure boot requires one component to verify the digital signature of the next-level component before transferring control to it. In order to support such verification, TD-Shim needs to provision a known-good public key and the minimum secure version number, called SVN. The payload itself should include the image, the digital signature, as well as the SVN value. Secure boot in TD-Shim includes a two-step verification. In step one, TD-Shim needs to verify that the public key matches the public key hash in the TD-Shim image, then TD-Shim needs to verify the digital signature of the payload according to the public key. The digital signature needs to cover both the payload image and the SVN value to prevent SVN modification. In step two, TD-Shim needs to verify the SVN in the payload to ensure it's equal to or bigger than the minimum SVN provisioned in the TD-Shim image. That is to prevent a payload rollback attack.
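The two-step verification just described can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Crypto` trait, `ProvisionedPolicy`, `SignedPayload`, and `verify_payload` are hypothetical names, the hash and signature algorithms are left abstract behind the trait, the public key is assumed to travel with the payload, and the SVN is assumed to be signed as 8 little-endian bytes appended to the image. The actual TD-Shim payload format is defined in its specification.

```rust
/// Abstract crypto operations; a real build would plug in the hash and
/// signature algorithms chosen by the firmware (assumption: not shown here).
pub trait Crypto {
    fn hash(&self, data: &[u8]) -> Vec<u8>;
    fn verify_signature(&self, public_key: &[u8], message: &[u8], signature: &[u8]) -> bool;
}

/// Values provisioned into the shim image at build time.
pub struct ProvisionedPolicy {
    pub public_key_hash: Vec<u8>,
    pub min_svn: u64,
}

/// What the payload carries: image, SVN, public key, and signature.
pub struct SignedPayload<'a> {
    pub image: &'a [u8],
    pub svn: u64,
    pub public_key: &'a [u8],
    pub signature: &'a [u8],
}

#[derive(Debug, PartialEq, Eq)]
pub enum VerifyError {
    UntrustedKey,
    BadSignature,
    SvnTooLow,
}

pub fn verify_payload<C: Crypto>(
    crypto: &C,
    policy: &ProvisionedPolicy,
    payload: &SignedPayload<'_>,
) -> Result<(), VerifyError> {
    // Step 1a: the public key must match the provisioned public key hash.
    if crypto.hash(payload.public_key) != policy.public_key_hash {
        return Err(VerifyError::UntrustedKey);
    }
    // Step 1b: the signature must cover both the image and the SVN,
    // so the SVN cannot be modified independently of the image.
    let mut signed = Vec::with_capacity(payload.image.len() + 8);
    signed.extend_from_slice(payload.image);
    signed.extend_from_slice(&payload.svn.to_le_bytes());
    if !crypto.verify_signature(payload.public_key, &signed, payload.signature) {
        return Err(VerifyError::BadSignature);
    }
    // Step 2: reject payloads older than the minimum SVN (anti-rollback).
    if payload.svn < policy.min_svn {
        return Err(VerifyError::SvnTooLow);
    }
    Ok(())
}
```

Checking the SVN only after the signature has been verified matters: an SVN read from an unverified payload could simply be forged.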
If secure boot with SVN is enabled, payload remote attestation can use a different verification policy. The verification can be based on the SVN of the image, not the image hash. This cannot be achieved without secure boot, because there is no other secure way to allow the payload to pass the SVN information to TD-shim. Without secure boot, the SVN value could be tampered with by an adversary without being noticed. The measurement with secure boot is almost the same as the one without secure boot. The only difference is that the SVN value of the payload is extended to RTMR1 as a specific entry. As such, the verifier can check the specific SVN entry in the event log. The policy could be: I require the TD payload SVN to be bigger than 4. Then any payload with SVN 5, SVN 6, etc. is accepted. To follow security best practices, TD-shim enables protections such as data execution protection. It marks the code pages as read-only and the data pages as non-executable. This is useful to break exploitation: even if the environment is compromised, for example through a buffer overflow or stack overflow, the attacker cannot inject code. We are also trying to enable control-flow enforcement, CET, such as shadow stack and indirect branch tracking. That is still a work in progress, and that work depends on the Rust compiler. The TD-shim project provides a set of tools. For example, the TEE info hash tool allows you to calculate the MRTD-based TEE info hash value. As such, you can predict the value in the TD report. The payload reference calculator can be used to calculate the TD payload reference value from a bzImage and the kernel parameters. The metadata checker tool accepts the TD-shim file as input, extracts the TDX metadata, verifies that the metadata is valid, then dumps the metadata. Finally, we enabled a set of tests for the TD-shim project, for example fuzzing tests with AFL fuzz and cargo-fuzz, which are two popular tools in Rust fuzzing.
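To make the event-log-based verification and the secure-version-number policy from earlier in this part concrete, here is a minimal Python sketch of what a verifier might do: replay the event log to recompute RTMR1, compare it with the reported register value, then apply the policy to the dedicated version-number entry. The extend rule used here (new value = SHA-384 of the old value concatenated with the event digest, starting from 48 zero bytes) follows the usual measurement-register convention. The dictionary-based log layout, the entry names, and the function names are assumptions for illustration and do not reflect the real CCEL encoding; in particular, real log entries carry digests rather than the raw measured data.

```python
import hashlib

REGISTER_SIZE = 48  # SHA-384 digest length in bytes


def measure(data: bytes) -> bytes:
    return hashlib.sha384(data).digest()


def extend(register: bytes, digest: bytes) -> bytes:
    """Measurement-register extend: new = SHA384(old || digest)."""
    return hashlib.sha384(register + digest).digest()


def replay(event_log: list, rtmr_index: int) -> bytes:
    """Recompute one RTMR from the log entries that target it."""
    value = bytes(REGISTER_SIZE)  # registers start as all zeros
    for entry in event_log:
        if entry["rtmr"] == rtmr_index:
            value = extend(value, measure(entry["data"]))
    return value


def appraise(event_log: list, reported_rtmr1: bytes, minimum_svn: int) -> bool:
    # The log is only trustworthy if it reproduces the reported register.
    if replay(event_log, 1) != reported_rtmr1:
        return False
    # The policy looks at the SVN entry, not at the payload image hash.
    for entry in event_log:
        if entry["rtmr"] == 1 and entry["name"] == "payload_svn":
            return int.from_bytes(entry["data"], "little") >= minimum_svn
    return False  # no SVN entry: the policy cannot be satisfied


# Hypothetical event log for one TD launch
event_log = [
    {"rtmr": 0, "name": "td_hob", "data": b"hob bytes"},
    {"rtmr": 1, "name": "payload", "data": b"kernel image"},
    {"rtmr": 1, "name": "payload_svn", "data": (5).to_bytes(8, "little")},
]
# Stand-in for the RTMR1 value taken from the signed TD report
reported_rtmr1 = replay(event_log, 1)
```

With a policy of "SVN at least 5", this log is accepted for any payload image whose SVN entry is 5 or higher, and editing the SVN entry in the log makes the replayed register diverge from the reported one.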
We enabled cargo clippy, ran the Rudra, Prusti, and MIRAI static analysis tools, and fixed the issues they reported. Unfortunately, we noticed that some tools, such as Rudra, cannot work with the latest Rust compiler. cargo-deny is integrated in CI to ensure that the crates TD-shim relies on do not have any known security vulnerabilities. Beyond that, we also run the unit tests and collect coverage to ensure the quality of the project. With that, that's all for the TD-shim introduction, and thank you for your attention. Please let me know if there are any questions. Thank you. |
Project Veraison (VERificAtIon of atteStatiON)
(Trying to) making sense of chaos |
Okay, hi. My name is Thomas. I'm an engineer at Arm. I'm also a contributor to Project Veraison. This presentation is about Veraison and aims to give you an idea of what the project is about and how it fits in the wider attestation and confidential computing picture. I hope that, you know, after listening to this, you'd be motivated to at least have a look at the project and maybe become an early adopter or even a contributor in the future. Who knows? So without further ado, let me go through what I've put together here, starting with a bunch of trivia about the project. So we have a logo, and the colorful ovals there are supposed to represent grapes. In fact, veraison is a term used in winemaking: it is the moment when grapes start ripening, at which point they can be of different colors, and in fact, the word means change of color. And this is a sort of visual metaphor for the blurry nature of trust, which can be yes, can be no, green grape, red grape, or really anything in between. There's some debate around the way the word is pronounced: the French say it one way, the English say it a couple of other ways, so you have multiple choices. In any case, please do not spell it like the American telecom company, because that would be really confusing. And it's also a backronym for verification of attestation. It started within Arm, in the Architecture and Technology Group, to which I belong together with my colleagues on the team. It was adopted by the Confidential Computing Consortium in June 2022. It's currently in the incubation stage, meaning it's in the early phases of adoption. And we're looking at growing it across a few different metrics, the most obvious being code and documentation maturity, but we are also trying to grow the community a bit. And being here is part of that effort.
But yeah, the headline here is that we've moved from being an Arm project to being under the Linux Foundation umbrella; therefore, we have completely open governance and are of course open source. Before we dive into the core of the presentation, though, let me give you a bunch of pointers that are also useful: the code base on GitHub, the chat, the mailing list hosted by the Confidential Computing Consortium, and this confusing URL, which is the one for our regular weekly calls, which I'm not expecting you to memorize. In fact, you could just follow the first link in the list, the GitHub org, and the splash page should have all of these other links. But in general, feel free to join any of these channels, pop up in our calls, ask questions, whatever. You will always be very much welcome. So now we do a quick recap on remote attestation, starting with the problem. So suppose you have these two guys, an attester and a relying party, that need to engage in some sort of distributed computation. Suppose also that initially they don't trust each other, they're mutually distrusting, and so you can't make progress in the computation until a trust relationship can be established between the two. And instances of this kind of situation abound. For example, in confidential computing, the attester is typically the guy that executes the workload, the confidential workload, and the relying party needs to know that this guy can be trusted before it shares some input with it. Say, for example, you need to ship an ML model, or some privacy-sensitive data, or both. And there's obviously the dual problem, where the relying party is the consumer of the output of the computation. Say you are an actuator of some sort that acts upon signals that are coming from the attester, and you want to trust the attester before making a mess, right? Think of critical infrastructure especially.
So to break this impasse, the natural thing to do is for one party to convince the other that they can be trusted. And here's where attestation comes into the frame. So, what is attestation? It's a technique that uses a specialized component rooted in hardware, called the root of trust, that the attester can use to do basically two things: (a) sample the current state of its TCB, its trusted computing base, and put together a report, and (b) sign this report with the secret identity key that is securely stashed inside the root of trust in hardware. All of that in a trustworthy way, in such a way that no one, not even co-located software or firmware, can subvert this sequence of actions. And this signed report is called attestation evidence, or just evidence. And the evidence, which is basically a very, very strong authentication signal, can be sent to the relying party, the RP, which, once it receives it, needs to verify it by checking that the signature on the report is correct and that the identity of the signer is known and trusted, and also by checking that the reported TCB state is acceptable, slash good, for some local policy-defined interpretation of what good means. And the process of verification of evidence is the process that covers at least these two checks that we just discussed. And in the remote attestation architecture, which is RFC 9334, freshly published, this process of verification of evidence is taken care of by the verifier role, which basically mediates between attesters and relying parties. And note that the lines in this picture are not real channels between real devices. They are logical channels between architectural roles. So they can be recomposed and put together in very different ways, depending on the protocol. Logically, these are the relationships between the various architectural roles. And in order to perform this function, the verifier needs to know a few things, a few very important things.
First, (a) how to verify the identity of the attester, and typically that is done by knowing the public key associated with the secret signing key of the attester, and knowing that in a reliable way. And (b) what the expected state of the attester's TCB is, of course. And this can become pretty messy, depending on how complex the attester is. On one end of the spectrum, you have very simple attesters that can be described with a single measurement, which does not change or changes glacially. And on the other end, there are composite attesters that are made of more than one attester, each attester sporting its own TCB, each TCB made of multiple separate, independently moving parts, you know, software components, configuration, and whatnot, with multiple different supply chain actors involved. You know, things can become really hairy. So if the complexity is very low, it makes sense to co-locate a simple verifier with the relying party, but if that's not the case, you know, it might be a reasonable choice to design a system where the verification function is effectively offloaded to a separate architectural component. These options, you know, aggregation and disaggregation, are not mutually exclusive. Take, for example, the case where the attester is composite and its evidence can be clearly split along a platform slash workload axis. In that case, one might want to call an external platform verifier and, if everything is okay, then move to a local verifier for the verification of the workload. You know, as I said, these things are logical; they can be reassembled and put together in very, very different ways. But in general, whatever system architecture you come up with, the verifier role needs a bunch of trusted links to the supply chain that is involved in the verification, because the supply chain endorses the attester, right, and so it's critical that the link between the supply chain and the verifier is a trusted one, and that the endorsements are genuine.
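The two things the verifier needs, a trusted key for each attester and the expected TCB state, map directly onto a small appraisal routine. The following Python sketch is an illustration only: the evidence layout, the names `trust_anchors` and `reference_values`, and the verdict strings are hypothetical, and an HMAC stands in for the asymmetric signature a hardware root of trust would produce, purely to keep the example within the standard library.

```python
import hashlib
import hmac
import json


def sign_claims(claims: dict, key: bytes) -> str:
    """Attester side. HMAC is a stand-in for a hardware-backed signature."""
    blob = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(key, blob, hashlib.sha256).hexdigest()


def appraise(evidence: dict, trust_anchors: dict, reference_values: dict) -> str:
    # (a) Identity: is the signer known, and is the signature valid?
    key = trust_anchors.get(evidence["attester_id"])
    if key is None:
        return "unknown attester"
    expected = sign_claims(evidence["claims"], key)
    if not hmac.compare_digest(expected, evidence["signature"]):
        return "bad signature"
    # (b) State: every expected TCB component must report an endorsed value.
    for component, good_values in reference_values.items():
        if evidence["claims"].get(component) not in good_values:
            return "unacceptable " + component
    return "ok"


# Hypothetical supply-chain inputs to the verifier
trust_anchors = {"device-1": b"device-1-key"}
reference_values = {"firmware": {"fw-1.2", "fw-1.3"}, "bootloader": {"bl-7"}}

# Hypothetical evidence produced by the attester
claims = {"firmware": "fw-1.3", "bootloader": "bl-7"}
evidence = {
    "attester_id": "device-1",
    "claims": claims,
    "signature": sign_claims(claims, b"device-1-key"),
}
```

A simple attester would need a single entry in `reference_values`; a composite attester would need one such table per sub-attester, each fed by a different supply chain actor, which is where the complexity mentioned above comes from.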
And these are the green boxes in this picture. The verifier can also have an owner, it typically has an owner, that can provide it with the verification policies for evidence, say, you know, to customize the process of appraisal or, in a special situation, to provide hot patching. And to complete the RATS architectural picture, there is the RP owner, the relying party owner, who can feed in the policy for how to act on the attestation results coming from the verifier, right, to instruct the relying party on how to make decisions based on the appraisal done by the verifier regarding the attester. So this finishes the RATS recap. And the question now is, where is Veraison in this picture? And as you may have guessed, it is the blue box at the center. But it's not just that, it's also all the lines that are attached to the blue box, that is, all the verifier interfaces, which are quite a few. Let's start from the bottom left and then move clockwise. So, evidence: we've built a bunch of libraries for manipulating various evidence formats, both from the point of view of the verifier, so decoding and verification, and from the attester's point of view, which means encoding and signing. This way, if one needs to put together an end-to-end flow, say for integration testing or for a demo, it's quite easy to build, you know, an attester emulator. You don't need to deal with the real hardware, which can be tricky, especially if you're dealing with CI environments. The current list of supported evidence includes two EAT profiles: PSA for Cortex-M, so IoT gizmos, and CCA for the new Armv9 Confidential Compute Architecture. We also have a TPM profile, and this came from an integration project we did with our friends at EnactTrust, who have a product for monitoring device health that can use Veraison as a backend. We have a bare-bones implementation of DICE, TCG DICE, but even Open DICE is okay, I think. And then we have AWS Nitro, contributed by our friends at Veracruz.
Veracruz is another CCC project, so a confidential computing project, that uses Veraison as the backend for their proxy CA. On the endorsement and reference value front, we have an implementation of CoRIM, which is a format we co-designed with Intel and Fraunhofer in the DICE working group in TCG, and which has now been adopted in the IETF RATS working group on the standards track. It's basically a specialized format for describing what the attester looks like to the verifier, which aggregates different sub-formats for specifying a bunch of orthogonal but co-existing dimensions of the platform, so software, firmware, trust anchors, and other things. For policy, we have integrated OPA, Open Policy Agent, which is an existing, successful, general-purpose policy engine. We found something that existed and was fit for purpose, so we didn't feel the need to reinvent this specific bit. And for attestation results, there's an information model being standardized in the IETF called AR4SI, which makes it possible to normalize the appraisal output so that the relying party policies can be simplified greatly, because now they're fully decoupled from the specifics of the attester. But because AR4SI is just an information model, we had to create a serialization for it, which we based on EAT. We call it EAR. EAR is a good candidate for standardization, because there's nothing else at this point in time, and so we're trying to push this through the RATS working group in the IETF as well, and it's the only output format we support for now. As far as attestation result policies go, we do nothing, because this is completely out of scope; this is entirely on the relying party. So yeah, now I want to give you a map from what we just discussed to what exists in the Veraison code base. And note that nearly the entirety of what follows is Golang packages, plus command-line tools and services, slash daemons, in good old UNIX fashion, and this was a conscious choice.
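To illustrate why a normalized appraisal output simplifies the relying party, here is a small Python sketch of a relying-party policy that only looks at trust tiers, never at attester-specific details. The tier names (affirming, warning, contraindicated, none) are taken from the attestation results information model and should be treated as an assumption here; the result layout, the `submods` key, and the action names are invented for this example and are not the actual serialization.

```python
# Tiers ranked from best to worst; "none" means the verifier made no claim.
SEVERITY = {"affirming": 0, "warning": 1, "none": 2, "contraindicated": 3}

# Relying-party policy: one action per tier, independent of the attester type.
ACTIONS = {
    "affirming": "release_secrets",
    "warning": "ask_operator",
    "none": "deny",
    "contraindicated": "deny",
}


def overall_tier(result: dict) -> str:
    """The worst tier across all appraised sub-modules wins."""
    tiers = [sub.get("status", "none") for sub in result["submods"].values()]
    if not tiers:
        return "none"
    # Unknown tier strings are ranked as bad as "contraindicated".
    return max(tiers, key=lambda tier: SEVERITY.get(tier, SEVERITY["contraindicated"]))


def decide(result: dict) -> str:
    # Anything that is not an explicitly known tier results in a denial.
    return ACTIONS.get(overall_tier(result), "deny")
```

The same `decide` function works whether the underlying evidence came from a TPM, a PSA device, or a CCA realm, because the verifier has already translated the format-specific details into tiers.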
We chose to go with Golang because one of the main goals of the project is to provide a bunch of components to assemble systems that provide verification as a service. And Golang is a very good language in terms of library support, the native library ecosystem, and also in terms of ease of use: the learning curve is not steep at all, I would say. So it's trivial to learn, and therefore we hope that more contributors will come by, since the barrier here is lower. So let's start with a couple of layer-zero packages, EAT and go-cose. EAT is the Entity Attestation Token. It's the main format for attestation messages, either evidence or attestation results, that has been put forward by the IETF. It's basically a framework that extends CWT and JWT by adding a number of claims with attestation-specific semantics, and also a way to instantiate the framework through so-called profiles, PSA and CCA being just two such examples. The EAT package is a very neat package, but it is not tracking the most recent version of the draft. We decided to wait for the draft to become an RFC before doing a final alignment pass, because the package has all the claims that we need for the dependent packages, and so for the moment it is good as is. But we will need to bring it up to date as soon as the document goes through the final stages of the standardization process.
And EAT builds on CWT, which in turn uses COSE Sign and Sign1. So early on we realized that we needed a COSE implementation in Go, and the only thing we could find at that point in time was go-cose, which was originally developed by Mozilla for their Autograph service. But it supported only COSE Sign, so we forked it and extended it to support Sign1. But then, at the point in time when we wanted to contribute it back to the main line, Mozilla discontinued the project, so we took responsibility for it alongside Microsoft. Together with Microsoft we did a fair amount of work, including improving the ergonomics of the interfaces and making the implementation generally more robust. In that respect, we went through a couple of external security reviews and addressed all the comments, so we're fairly confident that this package has production quality. We shipped 1.0.0 just a couple of weeks ago, and this is the first ever released bit of Veraison, so we're particularly excited about that. It's an interesting package among the lot, also because it's used not just by Veraison but also by other quite big projects, Notary in CNCF and Sigstore in OpenSSF, so it has come out as a nice, useful building block that has relevance in the community. Now moving on to the evidence. You know, these are the evidence bits, the red and orange things: we have the two EAT profiles for CCA and PSA, the DICE thingy, and the sort of house-like shape is the command-line interface called evcli, which is an attester emulator that also talks to the client side of the Veraison verification API, so it can be used to play with the services quite easily on the command line.
The yellow things are the attestation results packages. Here there's only one package, basically, it's the Veraison EAR, and there's an associated CLI, arc, that can be used to verify and pretty-print results, so one can pipe evcli output into arc to see what happened during appraisal. And the green things are the bunch of endorsement and reference-value-related packages, including the top-level manifest format, CoRIM, and the dependent bits for firmware, verification keys, software, and so on, so CoMID, CoTS, and SWID respectively, and also cocli, which is the command-line interface for assembling CoRIMs from a bunch of JSON templates: you sign them and then you can submit them to the Veraison services using the provisioning API, which is linked into cocli. And finally, the blue box, including the microservices, which I will talk about in the next slide, and the client-side interface of the Veraison API, which exists also in Rust with C bindings. I think a pure C implementation has been contributed as well; I don't know if it's been merged fully or is still an open PR. But anyway, so yeah, let's have a quick look at the services architecture. Basically, there are two pipelines, one for verification, the bottom one, and one for provisioning, the top one, that converge onto the core service, VTS, which connects to the data and plug-in stores. We say it 'vitis', which is another winemaking term, and it's basically the Veraison services' trusted computing base: it has all the security-related computation in it. The interesting thing is that the processing in both pipelines happens in stages, with very precise hook points where plug-in code can be supplied by the user in order to customize certain aspects of the processing.
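The staged pipeline with plug-in hook points can be pictured with the following Python sketch. It is not the project's actual plug-in API (the real services are written in Go); the registry, the media type string, and the toy evidence format are all hypothetical. The point is only that the format-specific stage is supplied by a plug-in, while the comparison stage is shared by every format.

```python
from typing import Callable, Dict

# Registry of format-specific decoders, keyed by media type.
DECODERS: Dict[str, Callable[[bytes], dict]] = {}


def evidence_plugin(media_type: str):
    """Decorator that registers a decoder at the 'decode evidence' hook point."""
    def register(func: Callable[[bytes], dict]):
        DECODERS[media_type] = func
        return func
    return register


@evidence_plugin("application/x-demo-evidence")  # hypothetical media type
def decode_demo(raw: bytes) -> dict:
    # Toy format: "name=value;name=value"
    pairs = (item.split("=", 1) for item in raw.decode().split(";") if item)
    return {name: value for name, value in pairs}


def verify(media_type: str, raw: bytes, reference_values: dict) -> bool:
    # Stage 1 (plug-in): turn format-specific evidence into common claims.
    decoder = DECODERS.get(media_type)
    if decoder is None:
        raise LookupError("no plug-in for " + media_type)
    claims = decoder(raw)
    # Stage 2 (common): compare claims against provisioned reference values.
    return all(claims.get(name) == value for name, value in reference_values.items())
```

Supporting a new evidence format then means writing one more decorated decoder; the `verify` function itself does not change.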
So basically, what you do, as a user that wants to add their own format, either for evidence or for endorsements and reference values, is build your plug-in code so that you adapt your formats to the standardized processing pipeline. Right, so next steps. We have a few exciting new things in the pipeline: adding new formats, of course, opportunistically though, depending on contributions, integrations, et cetera. We want to improve documentation, not a very exciting bit, but a necessary step in making the project useful and usable. And this is a big one: we want to work on identity and access management to support a more generic multi-tenant service, and this is the single missing bit that prevents Veraison from being used in a production context at this point in time. And finally, we want to allow a build with a static list of plug-ins in order to reduce the potential attack surface of the services. Yeah, this is what the next months will bring us. So, questions? Thank you very much, by the way, for listening. |
Nydus Image Service for Confidential Containers |
Hello everybody, it is my honor to introduce my recent work on enabling the Nydus image service for confidential containers. Let me introduce myself first. I am Gerry Liu from Alibaba Cloud. Currently I am working in the OS team to enable the Linux operating system for cloud workloads. I am a long-standing Linux kernel hacker and have contributed much to the Linux kernel. In the last few years, I have also become interested in cloud-related technologies such as microVMs, container runtimes, and container image management, and I have joined several open source projects such as the Kata Containers project, the Confidential Containers project, the Nydus image service project, and the rust-vmm project. I will go over three topics. First, I will explain the special requirements of image management for confidential containers, the currently available technologies, and the challenges we are still facing. Then I will give a brief introduction to the Nydus image service project: its design, its features, and its current status. Last, I will share my ideas for enhancing the Nydus image service to improve the image loading process for confidential containers. Project CoCo, or the Confidential Containers project, aims to protect the confidentiality and integrity of container workloads by using hardware TEEs. One way to protect a container application is to adopt the Kata Containers architecture, which runs a dedicated virtual machine for each pod. And we can enhance the confidentiality and integrity of Kata virtual machines with hardware TEEs. So to protect an application, we need to protect all the resources accessed by the application, such as CPU, memory, network, storage, and external devices such as GPUs. For a container, in addition to those resources accessed by the application, we also need to protect the container image of the workload. So how could we protect container images for confidential containers?
So why do we care about image management for confidential containers? What are the special requirements? Before talking about the special requirements, let's go through the current way of managing container images for normal containers. Take containerd as an example. To run a container, containerd first downloads the raw image blobs from the registry and stores those blobs on the local file system. Once all blobs are ready, containerd calls the snapshotter to convert those blobs into a file system and prepare the rootfs for the container. Once the rootfs is ready, containerd starts the container, and the container can access all the files and data inside the container image. Here, we can see that the raw image blobs and the mounted file system are available on the host side, which poses a special challenge for confidential containers, because we need to protect the confidentiality and integrity of the images. Let's move on to image management for confidential containers. To summarize, confidential image management faces three special requirements: confidentiality, integrity, and efficiency. To ensure confidentiality, all image content should be encrypted, both on the registry and on the local host, so the image content can be kept private. Second, image management must be moved from the host into the guest. If we store the image content and the mounted file system on the host, the content will be available to the host, which breaks confidentiality. Even worse, the host can make changes to those images and break the integrity of the container images. By encrypting and moving image management inside the guest, we can ensure the confidentiality and integrity of images. But we then face a new challenge. With traditional image management, the blobs and file systems are stored and mounted on the host, and can be reused across different container instances and restarts. But by moving image management into the guest, we need to download and prepare the images for each container instance.
In other words, container images can't be reused across different container instances, which brings higher costs, such as high pressure on the registry, slow container startup, and heavy I/O on the local device. So how could we achieve confidentiality, integrity, and efficiency all together for confidential containers? There are some existing technologies for confidential containers. The ocicrypt project provides a way to encrypt whole images, and the cosign project provides a way to ensure the integrity of container images. The Confidential Containers community has also invented some new technologies to move image management from the host into the guest. We modified containerd and Kata Containers and introduced a new component named image-rs. These three components help us manage container images inside the guest. So we have technologies to ensure container image confidentiality and integrity, but we are still facing the challenge of efficiency. How could we improve the efficiency of image management? The Nydus image service project provides an interesting way to achieve efficiency for confidential containers. What is the Nydus image service? The Nydus project provides a framework for image management services for containers. The following picture shows the core architecture of the Nydus project. It is split into build, ship, and run stages. The project has several aspects. First, it defines a read-only file system format with plenty of features, such as lazy loading, data deduplication, and compatibility with OCI v1 images. And we are also adding encryption to the image format. Second, it is a read-only file system for containers, AI models, and software packages. Accessing a Nydus image is very flexible. We provide different interfaces, such as FUSE on Linux and macOS, virtio-fs for virtual machines, and EROFS with page sharing on Linux.
And we are also developing a user-space library for applications to directly access files in a Nydus image. Third, we also developed a storage subsystem for Nydus, with P2P, caching, and data deduplication. We built a content-addressable storage subsystem to deduplicate data among different images. Last, the Nydus project has put much effort into integrating with the ecosystem. There is one more feature we should mention here. The latest Nydus release provides an OCI v1 compatible mode. We will get more information about the compatible mode later. The core of the Nydus project is the Nydus image format, so let's go through a detailed explanation of the image format and the way to convert an existing OCI v1 image into a Nydus image. As we know, an OCI v1 image contains one manifest and one or more data layers. Each data layer is a binary blob; actually, the binary blob is a tar stream. Within the tar stream, there are tar headers and file data. To convert an OCI v1 image layer into a Nydus data blob, we first group all the tar headers together and translate them into file system metadata. The file system metadata can be mounted directly by a FUSE server or by the in-kernel EROFS file system. With this file system metadata, Nydus can provide a full file system view to user space. Then, we chunk the file data into fixed-size chunks and compress the chunk data. Finally, we need some information to decompress the compressed chunk data. So a Nydus data blob includes three parts: the metadata, the chunk info array, and the chunk data. There is one Nydus data blob for every OCI v1 image layer. In addition, Nydus has an extra layer, which we call the Nydus metadata blob. The Nydus metadata blob is generated by merging the file system metadata from all the data blobs. In other words, the Nydus metadata blob is a build-time overlay file system.
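The conversion steps just described (collect the tar headers into metadata, cut file data into fixed-size chunks, compress each chunk, and record what is needed to decompress it) can be sketched in Python as follows. This is a toy model, not the real on-disk image format: the dictionary-based metadata, the `convert_layer` and `read_file` names, zlib as the compressor, and the default chunk size are all assumptions for illustration.

```python
import hashlib
import io
import tarfile
import zlib


def convert_layer(tar_bytes: bytes, chunk_size: int = 1024 * 1024):
    """Split one tar layer into (metadata, chunk info array, chunk data)."""
    metadata = {}             # path -> list of chunk indices (from tar headers)
    chunk_info = []           # per chunk: where it lives and how to check it
    chunk_data = bytearray()  # concatenated compressed chunks
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar:
            if not member.isfile():
                continue
            data = tar.extractfile(member).read()
            indices = []
            for start in range(0, len(data), chunk_size):
                chunk = data[start:start + chunk_size]
                packed = zlib.compress(chunk)
                chunk_info.append({
                    "offset": len(chunk_data),
                    "compressed_size": len(packed),
                    "digest": hashlib.sha256(chunk).hexdigest(),
                })
                indices.append(len(chunk_info) - 1)
                chunk_data += packed
            metadata[member.name] = indices
    return metadata, chunk_info, bytes(chunk_data)


def read_file(path: str, metadata: dict, chunk_info: list, chunk_data: bytes) -> bytes:
    """Rebuild a file by locating and decompressing only its own chunks."""
    out = bytearray()
    for index in metadata[path]:
        info = chunk_info[index]
        end = info["offset"] + info["compressed_size"]
        out += zlib.decompress(chunk_data[info["offset"]:end])
    return bytes(out)


def make_tar(files: dict) -> bytes:
    """Helper for the demo: build an in-memory tar layer from {path: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Because the metadata is separate from the chunk data, a reader can list and open files without touching any chunk, and then fetch only the chunks a given read actually needs. Merging the `metadata` dictionaries of several layers would play the role of the build-time overlay.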
With the metadata blob, we do not need to mount each data blob individually. Instead, we directly mount the metadata blob. Thus, we don't need to overlay the image layers at run time. You may notice that if we care about backward compatibility, we need to generate both an OCI v1 image and a Nydus image for the same container. That causes the container image data to be saved twice and wastes storage space. To solve this problem, the latest Nydus provides a new mode called the Nydus OCI v1 compatible mode. With this mode, the Nydus data blob only contains the file system metadata and the chunk information. It doesn't store the chunk data. The OCI image spec version 1.1 provides the OCI reference type. By using the reference type, we can get the data from the original OCI v1 image. That means that for existing OCI v1 images, we can build an extra Nydus image to provide lazy loading and other features. The OCI compatible mode generates very small Nydus images, typically about 3 to 5% of the size of the original OCI v1 images. The OCI v1 compatible mode is very useful for backward compatibility. So Nydus has two modes: one is the Nydus native mode and the other is the OCI v1 compatible mode. Each Nydus image contains two types of blob, data blobs and a meta blob. The meta blob contains the file system metadata and can provide a full file system view. And the data blobs contain the file chunks for each layer. The Nydus project also provides flexible interfaces to access Nydus images. They can be accessed through FUSE, EROFS, virtio-fs, and even through a user-space library. For example, the Nydus image format is EROFS compatible. Let's look at how EROFS makes use of Nydus images. EROFS directly mounts a Nydus metadata blob and provides a full file system view. The application can walk the file system tree.
When the application tries to read data from a file and the file data is not ready, EROFS notifies fscache, fscache sends a request to nydusd, and nydusd fetches the data from the remote registry. When the data is ready, nydusd notifies fscache, which in turn notifies EROFS. Eventually, the data is sent back to the application. So how can the Nydus image service help to improve the efficiency of confidential containers? There are several enhancements needed for Nydus images to support confidential containers. First, we need to add data encryption to the Nydus image format. We use a hybrid mode to protect Nydus images. First, we use ocicrypt to protect the Nydus metadata blob. The metadata blob contains the keys to decrypt the data in the data blobs, so the data blobs are protected by Nydus itself. In that way, we can support both data encryption and lazy loading at the same time. As for data integrity, traditionally the integrity of data blobs or images is verified at download time, and there is no mechanism to ensure data integrity at run time. Nydus adds a special attribute to image management, which is to verify the integrity of data chunks at run time. So, as with encryption, we combine cosign and Nydus to protect the integrity of the whole image. First, we use cosign to protect the integrity of the manifest and the Nydus meta blob. The metadata blob contains the digests of the data blobs, and there is a Merkle tree to ensure the integrity of each data chunk. So the data blobs are again protected by Nydus. With the enhancements for encryption and data integrity verification, we can support lazy loading and image caching for confidential containers. We can fetch image data from a remote registry or from a remote node through P2P, or we can fetch the image data from a data cache on the host through the virtio-fs interface or the virtio-blk interface. And we also support different modes to access encrypted images.
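Run-time integrity checking combined with lazy loading can be sketched like this in Python. The idea is that the chunk digests come from the integrity-protected metadata blob, so every chunk fetched later from an untrusted source (registry, peer, or host cache) can be checked before it reaches the application. The `LazyBlob` class and its `fetch` callback are hypothetical, and a flat list of SHA-256 digests is used here for brevity; a real implementation may organize the digests differently, for example as a hash tree.

```python
import hashlib
from typing import Callable, Dict, List


class LazyBlob:
    """Fetch chunks on first access and verify them against trusted digests."""

    def __init__(self, chunk_digests: List[str], fetch: Callable[[int], bytes]):
        self.chunk_digests = chunk_digests  # from the integrity-protected metadata
        self.fetch = fetch                  # untrusted source of chunk data
        self.cache: Dict[int, bytes] = {}

    def read_chunk(self, index: int) -> bytes:
        if index not in self.cache:
            data = self.fetch(index)
            if hashlib.sha256(data).hexdigest() != self.chunk_digests[index]:
                raise ValueError("chunk %d failed integrity check" % index)
            self.cache[index] = data  # only verified data is cached
        return self.cache[index]
```

Since verification happens per chunk at read time, the guest never has to download the whole image before starting, and a host that tampers with cached data is detected on the first read of the affected chunk.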
They can be accessed through nydusd and FUSE, or through nydusd and EROFS. And we are researching how to enable EROFS to directly access Nydus images, but that is still at an early stage. We are still working in that direction. Here is our development plan. The first stage is to integrate the Nydus image service with the image-rs crate. After the first stage, we only provide the lazy loading capability and do not include data caching. The next step is to add data caching to nydusd. With that, we can prefetch the image data and cache it inside the trusted domain, which will greatly improve performance and reliability. And as I mentioned just now, we are still investigating how to enhance EROFS to directly access Nydus images through virtio-blk. If we achieve that, it will be very flexible: there would be no user-space daemon needed to serve the image. That would be great. But how to provide image caching on the host is out of scope, so we won't discuss it here. And there are other ways to provide an image caching service, such as block-based image caching. For example, we can use the qcow2 image format to provide encrypted images, and then use dm-integrity and dm-crypt to ensure confidentiality and integrity. That design is simple, but it is not very flexible. So we will enable the Nydus image service for confidential containers first. We are targeting integrating the Nydus image service into Confidential Containers by the end of Q2. If you are interested in the technology or the project, please join us. Thank you for listening. |
THE BASE - FOSS Confidential Container SDK to ease the development |
Hi, I'm Sebastian, co-founder and CTO at enclaive, and our mission is to make confidential cloud computing as simple as possible. This is also the subject of this talk: it's about an open source project that we call the base, where we help the community to simplify the development of enclave applications. So in this talk I will take you on a journey, which is also our journey, where we first of all had to explore how to bring enclavation into existing production applications. So here comes the disclaimer: this talk is about lessons learned and in particular about a lot of pains that we discovered, and we hope that with our lessons we can help the community to find a much smoother way to develop confidential compute applications. Speaking of that, I think the main motivation why we started the whole project is that we believe the cloud is super cool and the future is cloud, so effectively everybody will develop future applications in the cloud. But as a security guy, I also know that by default the cloud sees all the data and the application code, which is in particular critical for a lot of businesses, because the business logic is leaked and so on, and that somehow motivated the whole field of confidential cloud computing. People started to think already a long time ago, like decades ago, with approaches like fully homomorphic encryption or multi-party computation, about how to compute in an encrypted way, and confidential cloud computing is like a revolution of those ideas with applications to the cloud, and in particular to solving the problem that the cloud leaks by default. So here comes another disclaimer: whatever I'm now going to talk about relates to Intel SGX based enclavation technology. There are also other approaches, most notably AMD SEV, but for ease of presentation, this talk relates to Intel SGX approaches. So what is confidential cloud computing? 
It's quite easily explained: it's the idea of turning your workloads into enclave workloads. So the nice thing is, for example, if you have a VM running your Docker apps or a Kubernetes cluster, you can with this new concept turn those applications, clusters, and containers into applications that run in a black box. By black box we really mean that, through encryption mechanisms, even at runtime your application, your data, and your workloads are black-boxed, shielded, and vaulted from the infrastructure. And this is cool because, as mentioned, it solves a lot of problems that we have right now with cloud applications. In particular, keep in mind that the cloud is designed in such a way that it shares resources, and the only way the resources are shared is through virtualization, and virtualization with a hypervisor implements a software-based isolation mechanism. With enclave technology, we can finally use strong cryptographic mechanisms, which are based on well-studied cryptographic assumptions. So let me start with a short introduction to how the basic concepts work, in order to give you a better feeling for how those black boxes are designed. It's also an appetizer for what needs to be done whenever you want to develop those black boxes and put your workloads into enclaves. I think the main concept, which I personally find very revolutionary, is runtime memory encryption. You can also talk about data-in-use encryption, or data that is always encrypted at any point in time. This is possible thanks to an extension of existing CPUs. In Intel-based CPUs this extension is called SGX, and you can think about it like a small security processor, or somehow an extension of the ideas of TPMs back in the day, that gives the CPU additional cryptographic superpowers to, among other things, encrypt user space memory. This is called an enclave in Intel SGX terms. 
And here the assumption is that the CPU is a trust anchor, so we really assume that the CPU, which holds, for example, the encryption and decryption keys, acts like a trust anchor, and keys are not extractable from the hardware. This is the base assumption. With that in place, you can think about it like this: whenever the memory management unit, for example, accesses some physical addresses, there is an encryption engine on the way, typically AES, that encrypts and decrypts those memory bits. Another thing, which is somehow related to the choice of the AES algorithm, is the fact that whatever you now write to memory is not only encrypted, but also authenticated. Meaning, for example, if someone alters the memory and changes the ciphertext, then of course this is detectable. So integrity protection comes literally for free. And if you put that together and you assume that with the CPU we now have a trust anchor, something like a trusted third party inside our compute environment, then another cool feature is remote attestation. Remote attestation is about proving to a user, who, for example, has no access to the hardware, to the data center, or to the cloud, that his workload runs in an enclave and no one has modified it. The way it is done is through a protocol called remote attestation. It's a bit like a challenge-response protocol in which the CPU acts like an auditor, like a trusted third party, that measures the enclave. On this basis, it issues and signs reports such that the user can easily verify that he is now dealing with an enclave that, for example, he has generated and he has signed, and he now has a cryptographic proof that no one has manipulated the workload. And a last feature, which is quite innovative, and which I really find cool, is key provisioning. So an enclave is, like any other application, first of all code that is somehow stored in the file system and loaded by the operating system. 
And of course, if we assume that anything is untrusted except the CPU, then we should, for example, think about the fact that a malicious party has access to the binary and can, of course, manipulate it. So a very, very bad idea is to put any secrets, any passwords or whatever, in that binary, simply because it may be reverse engineered. Key provisioning is another protocol, building on remote attestation, that allows you to provision all the secret key material, all the environment variables, maybe passwords or secret keys for SSL/TLS certificates, whatever you consider crucial. You can also think about adding additional files into the enclave, for example, any documents and crypto file systems, whatever you think is, as mentioned, worth protecting. Secret key provisioning is a protocol that, first of all, allows the user to remotely attest that he is now talking to an enclave and that it is the application he knows and can trust. And before effectively starting the application, he can provision the secrets through a secure SSL/TLS protected channel in order to parametrize, configure, or maintain the application. So this is like a live provisioning of secret information. And of course, it totally makes sense: if, for example, your application is hosted somewhere in an untrusted environment, you just want to make sure that this environment has no access to your secrets. So this is roughly the theory behind enclave technology. And now let's go on a mission, or a journey, on how one can get an application enclaved. This is also partly our journey, because we started with a lot of approaches. You can consider this walkthrough a bit like best-practice advice on what I, at least, believe is the easiest way to build enclave applications. So what kind of ingredients do you need? Of course, hardware. And as mentioned before, in this talk, it's all about Intel SGX. 
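The attestation and key provisioning flow described above can be sketched in a few lines. This is a deliberately simplified model, not the SGX protocol: real attestation uses asymmetric signatures and Intel's verification infrastructure, whereas here a single HMAC key (`HW_KEY`, an assumption) stands in for both the CPU's signing key and the verifier's trust in it. The names `measure`, `cpu_quote`, and `verify_and_provision` are hypothetical.

```python
import hashlib
import hmac
import os

# Assumption: one shared key stands in for the CPU's non-extractable signing key
# and for the infrastructure a real verifier would use to check the signature.
HW_KEY = os.urandom(32)


def measure(enclave_code: bytes) -> bytes:
    """The 'measurement' is simply a hash of the code loaded into the enclave."""
    return hashlib.sha256(enclave_code).digest()


def cpu_quote(enclave_code: bytes, nonce: bytes):
    """CPU side: measure the enclave and sign measurement plus the user's challenge."""
    measurement = measure(enclave_code)
    signature = hmac.new(HW_KEY, measurement + nonce, hashlib.sha256).digest()
    return measurement, signature


def verify_and_provision(expected_code: bytes, nonce: bytes, quote, secrets: dict) -> dict:
    """User side: release secrets only if the quote is fresh, genuine, and as expected."""
    measurement, signature = quote
    expected_sig = hmac.new(HW_KEY, measurement + nonce, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected_sig):
        raise ValueError("quote signature invalid or nonce mismatch")
    if measurement != measure(expected_code):
        raise ValueError("enclave measurement does not match the expected application")
    return secrets  # in practice these would be sent over a TLS channel into the enclave
```

The fresh nonce is what makes this a challenge-response exchange: a quote recorded earlier cannot be replayed, and a modified binary produces a different measurement, so no secrets are released in either case.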
So you definitely need a CPU, an Intel CPU from Skylake onwards, which was introduced around 2015. This microarchitecture contains the SGX security extensions for the very first time. So you might think, ah, maybe I have a laptop, there is an Intel CPU inside, maybe it's not that old, so chances are high that you are lucky and your CPU supports that. But I don't think that this is a good idea because, maybe you have read about it, the desktop line where SGX was supported is now deprecated and stopped, simply because the SGX capabilities are strictly limited. The enclave sizes that you can generate are, for example, too small for larger mainstream applications. And Intel has now shifted strategy towards server-based architectures. So a good idea is, of course, to find a server blade which supports SGX, and here, I think, Ice Lake and, most notably, Sapphire Rapids, recently introduced this year, give you a better chance. These are high-performance servers made for cloud applications with, I think, 48 cores or even more. I think Sapphire Rapids has even more cores. And the nice thing is that you can generate enclaves of, I think, up to one terabyte. The downside is that, of course, those machines are not so cheap. They cost you roughly somewhere between $30,000 and $50,000, depending on what configuration you're interested in. So this is already a small showstopper for someone who's just interested in developing a small project at home, contributing to open-source projects, or helping, for example, to bring enclavation into their stack. So later on, I'm going to tell you a bit about how I believe one can bypass these huge investment costs for open-source development. So let's come to the second ingredient, a second chapter. We definitely need drivers, drivers that tell the operating system how to talk to the SGX unit. So, of course, you can compile the drivers from scratch. 
There's a GitHub repo where the drivers are available, provided by Intel. But this is something I would not recommend from our experience, because there are a lot of configurations, and you really need to know for what environment you want to compile the drivers and so on. So there's a better idea. Simply use a Linux operating system that has kernel 5.11 onwards, because the drivers have been upstreamed to the kernel, so they are ready to use. So you literally have to do nothing. This is my advice. A good example is Ubuntu 22, which provides those drivers out of the box. So now we know the requirements about hardware, and we know that we need to install the drivers. As an open source developer, the question is: damn, how can I get this setup running? It sounds like a huge entry barrier. And I think there is a nice shortcut to get those two requirements met. One way would be, for example, to rent a bare metal machine. OVH, for example, offers the Advance-1 series, which is SGX enabled. The functionality is available through the BIOS, so all you need is to install an operating system with the matching drivers. And my advice would be to just install Ubuntu 22. Another approach is to have a look at the Azure cloud, because Azure also offers confidential-compute-ready VMs. So you can literally just book a VM, which is charged hourly. The operating system and the drivers are all in place, so you can literally start with development. And here a small disclaimer: I have no strings attached to either OVH or Azure. I'm just putting that out there simply because I know that finding the right hardware that meets the right prerequisites in order to implement enclave applications is not easy. And this is something where we figured out an easy approach, at least what we believe is one. 
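Whether a rented or local machine meets the driver prerequisite mentioned above (a Linux kernel from 5.11 onwards with the upstreamed SGX driver) can be checked with a short script. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and it relies on the in-kernel driver exposing the device node `/dev/sgx_enclave`, which only appears when SGX is also enabled in the BIOS.

```python
import os
import platform
import re


def kernel_at_least(release: str, minimum=(5, 11)) -> bool:
    """Parse a kernel release string such as '5.15.0-76-generic' and compare versions."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


def sgx_device_present(dev_dir: str = "/dev") -> bool:
    """The upstream SGX driver exposes an 'sgx_enclave' device node when SGX is usable."""
    return os.path.exists(os.path.join(dev_dir, "sgx_enclave"))


def sgx_ready() -> bool:
    """True if the running kernel is new enough and the SGX device node exists."""
    return kernel_at_least(platform.release()) and sgx_device_present()
```

Calling `sgx_ready()` on the target machine gives a quick yes or no before investing time in building enclave applications there.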
But there are, I guess, some other cloud providers, smaller and larger, that might offer you similar configurations. Yeah. Cool. So now let's move on to the next ingredient. If the hardware prerequisites are met, we are now interested in implementing the software. So we now want to enclave our code. I think a standard approach, and this is also historically motivated, is to use an SDK. There is an SDK provided by Intel, but a bunch of other open source projects, like Teaclave, maintained by the Apache Foundation, or Conclave, offer SDKs in different programming languages. This is definitely cool if you, for example, start developing your application from scratch and your application is small. Think about, for example, a small crypto wallet, which just needs a signing functionality you would like to put in an enclave. I think this is a cool approach. But when you're in the situation that you have existing applications that the open source community has developed and maintained for decades, like MariaDB or an NGINX server, I think this is not a good idea, because this would require that you go into the code and somehow rewrite it where necessary, taking the SGX functionality into account. And here, my recommendation would be to rather focus on existing LibOS approaches. There are also a bunch of open source projects, for example the Gramine project, Occlum, or Mystikos, that develop a LibOS. A LibOS is something like a user space library that emulates operating system functionality, most notably syscalls. And the nice thing is that you actually do not need to rewrite the application and recompile it. You can just load the binary into the enclave thanks to the LibOS. The binary thinks that it runs like a normal application on top of the operating system. But effectively, it is within an enclave. This is the superpower of LibOS approaches. But those LibOS approaches also have their limitations. 
They are open source projects: some of them are production-ready, some less so, some are actively maintained, some less so, the standard situation with open source projects. Of course, their functionality is in some ways limited, and they require some expertise and training. It's like any application or development stack: you really need to understand what you do. So what is the shortcut here? That was also a bit the motivation of our work. We developed and open sourced the base, where we hope to give you ready-to-use enclave applications on a silver platter. So what is the base? 18 applications ranging from standard databases to backend technologies, and also some applications such as WordPress, Umami, which is an analytics tool, and Mosquitto, an IoT broker. The whole idea was that we asked ourselves, hey, what do I typically need on a daily basis as a developer? Definitely some of those applications. So why don't we just give those applications an enclave form? In this project, for example, you find the Docker or Docker Compose files. You can easily derive manifest files out of them for a Kubernetes cluster or whatever cloud-native tools you use on a daily basis, and simply use the recipes in order to enclave your cloud application. You don't really need to dive very deep into the underlying LibOSes, which we have already done, in order to save you time and help you focus on the development of your applications rather than on becoming a security engineer who has to understand what needs to be done and how enclaving works. We really just want to speed up the process and this way contribute to enclavation technology becoming the next standard. So the project comes with documentation and a lot of examples of how you, for example, can customize your enclaves and applications. 
Some of those repos also have demo branches where we, for example, created some kinds of attacks showcasing the power of enclaved applications versus non-enclaved ones. Some repos also have demo videos, for example, if you first of all want to check out how SGX can help your application. And to speed up the process, because some of those applications need to be built and the build processes are time-consuming, there are images ready on Docker Hub. For those who just want to check out how it works, we also released the base on the Azure Marketplace, where you can just click the right VM and the application you want to try out and literally start with the development. And for those people who still believe that this is a hacky and time-consuming approach, we recently started to contribute to the Portainer project. Portainer is something like a configuration, development, and orchestration platform for Docker and Kubernetes based applications. Our contribution is the extension of Portainer towards the support of enclave containers. What we envision is that people just have simple templates in this UI where they can choose, hey, I want a MariaDB or I want a MongoDB. They just configure it, they deploy it as usual with Portainer, and the extension makes sure that the key management and the key provisioning are set up and in place. So the whole idea with Portainer CC is to simplify the development of enclave applications even further. Yeah, that's almost the end of my talk. As mentioned, the base as well as Portainer CC are open source projects, and we're interested in growing the community. We're looking for people that want to contribute, for example, with their own enclave applications, or help to add additional functionality to Portainer CC. So if you're interested, on GitHub you will find a little invitation link to our Discord server. 
So please join us, even if you just have questions or are interested in learning more about SGX or even AMD's counterpart technology. And if, on the other hand, you don't have the time to contribute to an open source project, but you're interested in, for example, using an application to protect your workload as an engineer because you're convinced that this stuff does make sense, I recommend you just go to enclave.cloud, which is the one-stop shop for confidential cloud computing. There, you really can, with a few clicks, configure the environment you're interested in, for example a VM, a Kubernetes cluster, a serverless function, or a managed database. Choose your cloud provider, at least, at the moment, the kind of cloud provider that supports confidential compute technology, and then literally use that environment in order to build your cloud application. That's it. Thanks for your time and hope to see you. Bye-bye. Thank you. |
A Study of Fine-Grain Compartment Interface Vulnerabilities: What, Why, and What We Should Do About Them |
Hello everyone, I am Hugo Lefeuvre, PhD student at the University of Manchester. In this talk, I will present the results of my research on compartment interface vulnerabilities, a work that will appear in NDSS 23. This is the result of a collaboration between Manchester, Bucharest, Rice and Unikraft.io. Before starting to talk about interface vulnerabilities, let me bring a little bit of necessary background. A very important notion in this work is compartmentalization. Compartmentalization is about decomposing software into lesser-privileged components, such that components only have access to what they need to do their job. Compartmentalization is not particularly new, so let me illustrate it with a real-world example, web servers. Typically, web servers are composed of components that, on the one hand, do privileged things like listening on port 80, and, on the other hand, of other components that perform risky operations like parsing network-provided data. If we have these two components in the same process, then this process has to be root, and that's problematic because if an attacker manages to compromise the network-facing component, for example, then they will immediately own the root process. So what people do in practice is decompose, or compartmentalize, the server into two entities in separate processes: the master, which is privileged and not exposed to risky operations, and the worker, which is deprivileged and exposed to network data. Both entities then communicate over shared memory. Thus, if the worker gets compromised, it will not be able to perform privileged operations and will remain contained. Recently, we have seen really nice advances in the field of compartmentalization. People have been taking more fine-grained, more arbitrary, and more automatic approaches to compartmentalization. 
What these works do is take arbitrary applications, identify a particular component that may be untrusted or risky, or trusted and critical, and compartmentalize it automatically or semi-automatically. The granularity of the component can be very variable, ranging from libraries to blocks of code. Notice that I'm talking about compartments here, not processes, as the isolation technology, too, is very variable. In short, the goal of these works is quite ambitious. It's about compartmentalizing legacy software with a low engineering effort and a low performance cost. Unfortunately, as we highlight in this work, things are not as easy as they might seem. In privilege-separated software, cross-component interfaces are the attack surface. And there, all sorts of things can go wrong security-wise. Let me give you a few examples. Let's say we have two compartments. One on the left, malicious, and the other one on the right, trusted, protecting some secret. The compartmentalization mechanism guarantees us that Compartment 1 cannot access Compartment 2's memory directly. So that doesn't work. However, Compartment 1 is still able to do legitimate API calls to Compartment 2 with, for example, an invalid pointer. If Compartment 2 doesn't validate the pointer, it will risk exploitation. Another example is the usage of corrupted indexing information, for example, a size, index, or bounds, as is done in this function. Another one is the usage of a corrupted object, such as a tampered file pointer. And there are many others, which we'll go through partially in the next slide. In this work, we unify all of these vulnerabilities under the concept of compartment interface vulnerabilities, or CIVs. CIVs encompass traditional confused deputies, Iago attacks, which are CIVs specific to the system call API, dereferences under influence, and probably many others. They are all attacks revolving around misuse of a legitimate interface. 
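The invalid-pointer example above can be modelled in a few lines. This is a toy model, not taken from the paper: a flat `bytearray` stands in for the address space, the `SHARED` and `PRIVATE` ranges and both function names are hypothetical, and the isolation mechanism itself is not simulated. It only shows how a legitimate API call with an unvalidated interface-crossing pointer turns the trusted compartment into a read oracle, and what a boundary check looks like.

```python
# Toy address space: addresses 0..31 are shared with Compartment 1,
# addresses 32..63 are private to the trusted Compartment 2.
MEMORY = bytearray(64)
SHARED = range(0, 32)
PRIVATE = range(32, 64)
MEMORY[32:38] = b"SECRET"   # secret living in Compartment 2's private memory
MEMORY[0:5] = b"hello"      # ordinary data in the shared region


def api_read_unchecked(ptr: int, size: int) -> bytes:
    """Compartment 2 API that trusts the caller's pointer and size (vulnerable)."""
    return bytes(MEMORY[ptr:ptr + size])


def api_read_checked(ptr: int, size: int) -> bytes:
    """Same API, but it validates that the whole buffer lies in the shared region."""
    if size < 0 or ptr < SHARED.start or ptr + size > SHARED.stop:
        raise ValueError("pointer/size outside the shared region")
    return bytes(MEMORY[ptr:ptr + size])
```

With the unchecked version, a malicious caller passing `ptr=32` reads the secret without ever touching private memory directly; the checked version rejects that call while still serving legitimate requests such as `api_read_checked(0, 5)`.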
CIVs are very common when compartmentalizing unmodified applications, as we further highlight in this talk. They affect all compartmentalization frameworks because they are a fundamental part of the problem of privilege separation. To put it in more precise words, we define CIVs as the set of vulnerabilities that arise due to lack of or improper control and data flow validation at compartment boundaries. We observe three classes of CIVs: data leakages, data corruption, and temporal violations. Within data leakages, we differentiate between address leakages, which can be leveraged to de-randomize compartments and mount further attacks, and compartment-confidential data leakages, which result in information disclosure. Both are due to data oversharing and sharing of uninitialized memory. We have already illustrated a range of data corruption attacks in the previous slide, but generally, they are bound to happen in situations where interface-crossing data is used without appropriate sanitization. They can affect control as well as non-control data. Finally, temporal violations include vulnerabilities like expectation of API usage ordering, usage of corrupted synchronization primitives, or shared-memory time-of-check to time-of-use. Temporal violations are usually caused by a wide range of behaviors, including missing copies, double fetches, and generally lack of enforcement of API semantics. This is a broad and succinct overview, but the paper provides a full taxonomy including an analysis of existing defenses. So having observed and characterized the problem, we asked a few questions. How many CIVs are there at legacy, unmodified APIs? Are all APIs similarly affected by CIVs, for example, taking library APIs generally versus module APIs generally? Do we observe systematic differences? How hard are these CIVs to address when compartmentalizing? And finally, how bad are they? 
If for some reason you don't fix one of them or just decide to not fix them at all, what is the impact on the guarantees that compartmentalization can give you? We believe that it is really critical to understand these points to be able to provide countermeasures that are adequate, systematic, and usable. And so the approach that we take in this work to answer these questions is to design a tool, and more particularly a fuzzer, specialized to detect CIVs at arbitrary interfaces, and we call this tool ConfFuzz. Then we apply ConfFuzz at scale to a range of applications and interfaces to gather a dataset of real-world CIVs. Finally, we study, systematize, and patternize the resulting dataset to extract numerous insights on the problem of CIVs in compartmentalization. In the next slides, I will give a quick overview of ConfFuzz before focusing on the dataset and insights. So let me give you a high-level overview of the fuzzer first. Taking unmodified applications, we instrument them to intercept cross-compartment calls. Compartments are freely defined, for example, a particular library boundary or an internal component interface. We based our prototype on dynamic binary instrumentation using Intel Pin, but also explored other instrumentation approaches, for example, LLVM-based. The interface between the trusted and untrusted components is automatically detected using binary debug information. Our fuzzing monitor then drives the exploration by ordering mutations of the data flow to simulate attacks from the malicious compartment to the trusted compartment. The workload used to drive the program is application-specific, for example, benchmark tools, test suites, custom workloads, etc. You could even plug another fuzzer like OSS there. Finally, the fuzzer automatically triages and stores crash reports, which includes de-duplicating, reproducing, minimizing, etc. 
The paper provides much greater detail on these technical matters, and I will be happy to elaborate if you have questions. Using ConfFuzz, we gathered a substantial dataset that we carefully dissected. Here you can see the paper's big table that summarizes the dataset. Let's have a closer look at it. Overall, we applied ConfFuzz to 25 applications and 36 APIs, for a total of 39 scenarios. We considered a selection of library APIs, module APIs, and internal component APIs, trying to focus on scenarios that make sense in popular software. In fact, 16 of these scenarios have been previously considered by about 12 studies in the literature, and the attacks that we find apply to them as well. In total, we find 629 CIVs. We classify these CIVs in five impact classes: read impact, write impact, execution, memory allocator corruption, and null pointer dereference. With this data, the first questions that we try to answer are how many CIVs there are at legacy or unmodified arbitrary APIs, and whether all APIs or code are similarly affected. Looking into this, we quickly confirmed that CIVs are absolutely widespread among unmodified APIs or code. Having said that, we also highlighted significant disparities of prevalence among scenarios, and that's the really interesting part. For example, we observed variations of CIV counts from 0 to 105 across APIs. That's quite significant. Take a look at this plot, which represents for each scenario the number of vulnerable API endpoints versus non-vulnerable ones. It clearly shows that CIV prevalence among applications and APIs is very heterogeneous. We have large and almost totally CIV-free APIs, and small and fully vulnerable APIs. In fact, we show an entire absence of correlation between API size and CIV count in this dataset. So while clearly, yes, CIVs are widespread, no, not all APIs are similarly affected. This motivates us to look into the patterns and effects that influence these observations. 
And doing so, we observe recurring API patterns that result in CIVs. This really reinforces the idea that the presence of CIVs is influenced by structural properties of the API, rather than API size or quantity of shared data. In this talk, I will present one of these patterns, but there are more in the paper. The particular pattern I want to go through concerns modular APIs. Indeed, we noticed that modular or module APIs are the most CIV-vulnerable interfaces in our study. On average, we observe that module APIs feature more CIVs and worse CIVs than any other class of APIs. And looking at the structure of these interfaces, it makes sense. Unlike library APIs, module APIs must be very generic and yield high performance. As a consequence, we have patterns with the application exposing its core internals and its core state to the module to achieve this genericity and performance. But this results in a much larger attack surface exposed to the module. Take the example of this data structure exposed to potentially malicious modules by the Apache HTTP core. It is very complex, with over 75 fields, 60% of which are pointers, referencing core data structures like memory pools, connection state structures, or mutexes. What we observe with this pattern is a somewhat counter-intuitive thing. Modularity is not always good for compartmentalization, and in many cases, it can even be counterproductive. This is only one of the patterns that we highlight, and there are more in the paper. Now, having shown that CIVs are widespread but affect applications, or APIs, unequally, let's look at their concrete security impact. The first thing that we confirm is that they are quite impactful. In fact, over 75% of scenarios present in our dataset show at least one write vulnerability. 
And worse than that, about 70% of write and read and 50% of execute vulnerabilities are arbitrary, which means that when the attacker controls a write or read primitive, they are likely to be able to read and write anywhere. And while only a smaller portion of these scenarios have execution impact, it is likely that read and write primitives will be combinable to achieve execution capabilities. In this talk, I will be concretely illustrating this impact with practical scenarios and real-world CIVs taken from the dataset, where we demonstrate key extraction from a protected OpenSSL. Once again, we show more details in the paper. So here, we assume a scenario with two compartments, where the goal is to isolate OpenSSL, for example, from a compromised web server, NGINX. Isolating OpenSSL, or part of OpenSSL, is a popular application of compartmentalization, both in the literature and in the industry. Thus, here, the compartment interface, and therefore the attack surface, is the OpenSSL public API. Unfortunately, we find several CIVs that enable read, write, and execution impact. Take this option setting primitive, for example, which is part of the OpenSSL public API. It dereferences an interface-crossing pointer, sets it, and returns it, clearly resulting in an arbitrary read and write oracle. Any attacker that can compromise the application's control flow will likely be able to extract SSL keys easily. Thus, clearly, if the API is not carefully enough sanitized, the benefits will be pretty low, at most a form of weak hardening. Now, you could tell me that it's not a good idea to protect at the public API anyway, and that we should rather choose the OpenSSL internal key API, which is much smaller. So, let's take a look at it. This time, we have NGINX and most of OpenSSL in the untrusted compartment, while we have the small key handling part of OpenSSL together with the keys in the protected compartment. 
Unfortunately, here too, we find several CIVs. Take a look at this function of the internal key API, for example. I only put the signature for simplicity's sake because the function is implemented in Perl-generated assembly. You can manipulate the in pointer to point to the key that you cannot directly access, encrypt with a known key, and then decrypt to get the secrets. Hence, here again, attackers that manage to compromise the application are likely to be able to easily extract the key. Unfortunately here, fixing the CIVs requires making the component stateful, which is a fairly drastic design change. Overall, through these two examples, I showed how existing OpenSSL isolation strategies collapse when confronted with CIVs, and how important they are security-wise. To conclude this talk, let's take a quick look at countermeasures. How do we tackle CIVs? Overall, we see two ways. First, making progress on automatic and systematic countermeasures. Our paper highlights their limitations as part of our CIV taxonomy. Second, learning from our study of patterns. We also believe that software component APIs should be designed to feature low compartmentalization complexity in the first place. We provide a set of guidelines to achieve this. The two approaches are complementary. Even in the presence of countermeasures, well-designed APIs are desirable, as the first point is known to be fundamentally harder. I will not have enough time to go over all the guidelines, but let me try to give you the gist of them. First, not every interface is a good boundary for privilege separation. Maybe a particular API doesn't fit privilege separation, and that's fine. In this case, it will be hard to harden anyway. Second, we recommend that major attention should be dedicated to reducing the complexity of interface-crossing objects, avoiding, for example, sharing of resource handles, system resource abstractions, synchronization primitives, et cetera. 
If this is not possible, it should bring us back to the first point. The interface is probably not the right point to compartmentalize, for example because components are too deeply entangled. Third, compartmentalizable components should enforce API semantics to be safe, for example ordering or concurrency support. Under distrust scenarios, it is not acceptable anymore to assume that the caller will respect them or face the consequences. We are slowly coming towards the end of this talk, so let me summarize the key points that I wanted to make. CIVs should be at the center of every compartmentalization approach, and you will likely not achieve tangible security benefits without considering them. API design patterns influence the presence of CIVs and their severity. Overall, it's not so much about the size of the API. It's about the complexity of API-crossing objects. Addressing CIVs is not just a matter of writing a few checks. In fact, strong solutions often require refactoring the API. Thus, compartmentalizing apps goes much further than just setting and enforcing bounds. We want this work to be an appeal for more research towards addressing the problem of CIVs: systematically finding them, addressing them, or telling you which interfaces make good compartmentalization boundaries. If you are interested in this work, I invite you to check out our paper and the code and dataset of ConfFuzz. I will now be more than happy to take questions. Thank you. Thank you, Hugo, for this very accessible talk on this important topic of securing interfaces. One question maybe that I can start with is something that you brought up yourself as well. You say it's more about compartmentalization, but it also obviously applies to TEEs. Can you comment a bit on that? Is that something you consider, that ConfFuzz, your fuzzer, could be extended to something like Gramine? Actually, maybe there are two different parts.
I think the conceptual part is about compartment interface vulnerabilities. Maybe we could remove the compartment out of compartment interface vulnerabilities, and just get interface vulnerabilities. I think it has also been described by other works previously, notably some of the work that you did, Jo. I think that applies to TEEs really, really well. I think it's just a generic problem about interfaces, and that fully applies to TEEs. Regarding the fuzzer, from a very technical point of view, I think that it might need some adaptation to be run on existing TEE software, but it's absolutely feasible. I think that it could apply there as well. We didn't really explore it because obviously at some point we needed to restrict the scope of what we're doing, but I think it makes sense. Following up on that as well, I think you mentioned in your slides one of the technologies that you could use for compartmentalization. It's not only TEEs, it's also something like CHERI. It uses capabilities, and I'm wondering: TEEs are not great with these vulnerabilities because you have these confused deputy attacks that you also explained, where you have a pointer that you essentially can dereference. With CHERI, with capabilities, you have sort of native mitigations for many of those. Capabilities, I think, were made with the idea of avoiding the confused deputy. Can you comment a bit on what the underlying technology can mean for the vulnerability of compartmentalization? I'm not sure if I can, I don't think I can share my screen, but maybe I can. But you can put a link maybe in the chat for people. Actually, in the paper we did talk about this, so I'm just going to share my screen, if I can. I'm sorry, I just broke everything. I just posted the link, I don't know if I triggered something terrible. I think I see the link, I think you unmuted that or something. So the paper goes into detail. Can you summarize maybe in the minute that remains? Absolutely, yes.
So CHERI provides some features that, as you said, are really nice in addressing some of the spatial part of the compartment interface vulnerability spectrum, of the CIV spectrum. It does not solve everything, it's not magic. Many of the leakage issues remain, and many of the temporal issues remain as well, because to some extent they are a little bit more high-level than just spatial things on memory. So they still apply. For example, the issues with ordering of interface calls. If you have an interface that has some ordering expectations, for example calling function one before function two, and you don't respect that, CHERI is not necessarily going to help you. So this is going to remain. So it does address part of the problem, but not all of it. Thank you. |
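The guideline that a component should enforce its own API ordering, rather than trust the caller, can be sketched in a few lines of Python. This is an illustration under assumptions: the `KeyHandle` class, its `init`/`sign`/`close` protocol, and the use of HMAC as a stand-in for the protected operation are all hypothetical and do not come from the talk or the paper.

```python
import hashlib
import hmac
from enum import Enum, auto


class State(Enum):
    CREATED = auto()
    INITIALIZED = auto()
    CLOSED = auto()


class OrderingError(RuntimeError):
    """Raised when the caller violates the expected call order."""


class KeyHandle:
    """Hypothetical protected component with an init -> sign* -> close protocol."""

    def __init__(self) -> None:
        self._state = State.CREATED
        self._key = b""

    def init(self, key: bytes) -> None:
        if self._state is not State.CREATED:
            raise OrderingError("init() must be the first call, exactly once")
        self._key = key
        self._state = State.INITIALIZED

    def sign(self, message: bytes) -> bytes:
        # The component checks its own state instead of assuming a well-behaved caller.
        if self._state is not State.INITIALIZED:
            raise OrderingError("sign() called before init() or after close()")
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def close(self) -> None:
        if self._state is not State.INITIALIZED:
            raise OrderingError("close() called before init() or twice")
        self._key = b""
        self._state = State.CLOSED
```

Tracking state inside the component is what makes it "stateful" in the sense used earlier in the talk; the cost is that every entry point has to carry such a check.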
Building a secure network of trusted applications on untrusted hosts |
Hi, my name is Roman, I'm Principal Software Engineer and Network Service Technique at Profian. Today I'll tell you how to build a platform-agnostic and hardware-agnostic secure network of trusted applications on untrusted hosts. We all love the cloud. It's convenient. It enables companies to save money and grow faster, and it eliminates the need for a ton of work managing and maintaining our own infrastructure. It simply makes our lives easier. Well, for the most part. Unfortunately, security breaches do happen and they're costly. According to the IBM Cost of a Data Breach 2022 report, $9.44 million is the average cost of a data breach in the US, $4.35 million is the average total cost of a data breach globally, and $10.10 million is the average total cost of a breach in the healthcare industry. Unfortunately, or rather quite fortunately given the risks, businesses from various highly regulated sectors like financial or medical simply cannot benefit from cloud offerings due to different laws around things like privacy and data protection. But it doesn't necessarily have to be this way. Confidential computing, by allowing protection of data in use, creates opportunities to do things which simply weren't possible before. One way to benefit from confidential computing would be to simply use the TEEs directly. For example, we could use the SDK provided by the hardware manufacturer and, equipped with a thick stack of documentation, off we go. It works, but there are quite a few drawbacks. First and foremost, security is hard. Writing software that communicates directly with a secure CPU is not exactly everyone's cup of tea. If all you need is a simple microservice application with a small REST API, diving deep into the internals of a particular hardware technology just should not be necessary. It takes away precious time that could otherwise be spent on developing revenue-producing business logic.
But let's say we went ahead and developed a secure layer interfacing with a particular CPU technology. Well, now we have to maintain it. Apart from that, we also have to fix any bugs, have it reviewed, and hope that none of them are exploitable. People make mistakes, and the more code there is, the more opportunity there is to make one. After putting all of this work in, now imagine that you want to switch to a different service provider, which does not offer the same hardware technology you used originally. Or, much more concerning, what if a vulnerability is discovered in the particular hardware technology you developed against? The different trusted execution environments are not exactly compatible. So you are left with just two choices, really: either you wait until the vulnerability is fixed and hope your application is not exploited in the meantime, or you go ahead and redo, for the new technology, all of the work you've already done for the original one. Last but not least, chances are that someone has already done this before, and fundamentally the concepts that make systems secure do not change. So most likely you're just going to repeat the same work someone else has already done. At Profian, we are custodians of the Enarx open source project, which among other things is designed to address exactly the issues I've just outlined. It's a thin, secure layer of abstraction between the host and the TEE. It's essentially a secure runtime, which lets you execute your WebAssembly workloads inside arbitrary trusted execution environments. Enarx supports various backends; today that's Intel SGX and AMD SEV-SNP, but as more and more TEEs are made available, support will be added for them as well. The Enarx project was started in 2019, and in 2021 Profian was founded, committed to being 100% open source and providing services and support for Enarx. In 2022, we also launched our enterprise products. So now, why WebAssembly? It's polyglot.
It's supported by languages like Rust, C, C++, Go, Java, Python, C#, Ruby, and the list goes on and on. It's designed to be portable and embeddable. It is functionally equivalent to a usual native binary, so for the most part, the development process is exactly the same as for developing any other application. There is an emerging system API standard, called WASI, to which, by the way, we also contribute. You can run Enarx outside of a TEE for development purposes. It runs on Linux, Windows, and Mac; both x86_64 and ARM64 are supported. Trusted execution is currently only available on x86_64 Linux. For SGX, you'll need a recent kernel and a few Intel-provided services running, like AESMD and PCCS, and for AMD SEV-SNP, all you really need is, unfortunately, a recent kernel with a patch set provided by AMD. The patches are not mainline yet, but we also maintain our own kernel tree with everything you could possibly need for this. Now let's see how Enarx is actually deployed. On the left here, we have a tenant. Let's call her Jane. On the right, we have a CSP server with a supported CPU, on which Jane wants to deploy her workload. How does Jane ensure the integrity of the workload being executed by the CSP and the confidentiality of her data in use? To do that, Jane will ask to execute her workload in an Enarx Keep. The first thing that the Keep does is ask the secure CPU to measure the encrypted memory pages containing the Keep itself. This is the execution layer and the shim. The CPU then returns a cryptographically signed attestation report containing the measurement along with information about the CPU, for example the firmware version used. The execution layer then sends the report to an attestation service for validation. In Enarx, this attestation service is called Steward. The Steward will make sure that the Keep is indeed trusted.
It will check the signature of the report to ensure the Keep is being run in a hardware-based trusted execution environment, will also make sure, for example, that the CPU firmware version used is not vulnerable, and will verify that the Enarx execution layer was not tampered with. On successful attestation, the Steward then issues a certificate for the Keep, which is used to fetch the workload from a registry. We call it Drawbridge in Enarx. The certificate is also used for performing cryptographic operations, for example for providing transparent TLS to the workload. Now let's see how this works in practice. To begin with, let's see how we actually run something within an Enarx Keep. The fundamental unit of work executed by Enarx today consists of just a WebAssembly executable and an Enarx Keep configuration. For example, here is what it looks like for my chat server that we are going to secure later. This is the Keep configuration. So here is my Steward configured, my personal Steward that I've deployed on a VPS, and my stdio configuration. In this case, I want to inherit everything from the host, so that means I want to print everything to the host and I also get stdin from the host. This file will also contain things like network policy, trust anchors, and other things like that. I've already uploaded this to my personal Drawbridge and I tagged it with a tag of 0.1.0. So let's see what that looks like. For that, I'll do a request to my Drawbridge, and what I get back for this request is a tag, right? We also call it an entry. An entry is nothing else than a node inside a Merkle tree. And it's a Merkle tree because it contains the digest of its own contents. What that means is that if I, for example, go one layer deeper and inspect the actual tree associated with this tag, I'll see that it contains the Enarx.toml and main.wasm we've seen earlier.
Now if I were to, for example, compute the digest of my Enarx.toml, you'll see that this is exactly the same digest we see here and here. Now I can, of course, go one step up and, instead of computing the digest of the Enarx.toml, compute the digest of the actual entry, the actual tag, right? For that, I will just do a request again to the same URL and again compute the digest of it. Now, if you remember, you'll notice that this is again exactly the same digest that we see in our tag, right? And so this digest is, in fact, a digest of the minified JSON of this object that we've seen over here, right? So this is nicely formatted for us by jq, but when we request it directly, we just get minified JSON, which we then hash. So here I'm logged in to an AMD SEV-SNP capable machine. I could, for example, read the CPU info and grep for model name, and I only want one entry, and we see that this is indeed an AMD EPYC 7513 processor. So I'm going to use enarx deploy and I'll also specify the backend explicitly to, well, deploy the workload we just looked at. I'm going to use again my, well, not my local, sorry, my custom Drawbridge. I'll deploy the chat server version 0.1.0, exactly the same one that we have seen before. And then I'm going to switch to yet another server, again remote. This one has support for SGX, and here we see this is an Intel Xeon 6338. Here I'll also do enarx deploy, and in this case I will execute the chat client. Now once it starts, it will ask me for a URL; I'll put here the address and the port. So you can see here I've connected, and here you can see the server also acknowledged the connection. And if you just look here, you'll see the exact same digest we've just seen in our entry. It was over here. So we also see the slug of the server we just ran on that other server, and the version.
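The digest checks walked through in the demo can be sketched in Python. This is a simplified model and not the actual registry wire format: the entry layout (`digest` plus `length`), the use of hex-encoded SHA-256, and the helper names are assumptions made for illustration. What it shows is the Merkle-tree idea: each file has a digest, the tree lists those digests, and the tag's entry carries the digest of the tree's minified JSON.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def minified(obj) -> bytes:
    """Serialize without whitespace, the form that gets hashed (assumed layout)."""
    return json.dumps(obj, separators=(",", ":"), sort_keys=True).encode()


def describe(content: bytes) -> dict:
    """An entry: the digest and length of some content (hypothetical fields)."""
    return {"digest": sha256_hex(content), "length": len(content)}


def build(files: dict):
    """Return (tree, entry): per-file entries, and the entry for the tree itself."""
    tree = {name: describe(content) for name, content in files.items()}
    return tree, describe(minified(tree))


def verify(entry: dict, tree: dict, files: dict) -> bool:
    """Check the tag entry against the tree, then every file against the tree."""
    if describe(minified(tree)) != entry:
        return False
    if set(tree) != set(files):
        return False
    return all(describe(content) == tree[name] for name, content in files.items())
```

Because the entry commits to the tree and the tree commits to each file, checking a single top-level digest is enough to detect a change in any file below it.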
So all this information came from the certificate. It's cryptographically signed data contained within the certificate, which Enarx actually parses for us and exposes to the workload. Similarly, the server has also received the slug that the client was deployed from, and it also received the digest of the workload. So by looking at the certificate, we can now know exactly what workload the other party is running. We could also try to inspect this. We can use OpenSSL to connect, and sure enough we see our certificate. You can see here that it's currently in the common name; it should be a SAN, of course, but it's just a proof of concept. So you can see here the certificate chain that we have: we have a certificate with a common name associated with the slug and the digest. It was issued to us by the Steward, the Steward that I have deployed in my infrastructure. And there's also my own CA in the root of the chain, which actually signed the Steward cert before. And if we look at the server logs, we'll notice the OpenSSL connection, which actually was not let in by the server. It says here that the client did not present a valid certificate. So this was not a Keep with a valid certificate issued by the Steward, therefore the server didn't trust it and didn't let it into the secure chat room. Similarly, if I were to use Enarx with a different backend than SGX, for example KVM, which is not a real TEE, right, it's just a KVM backend, it will not even attest to the Steward. So the Steward wouldn't issue a cert for us, and then we cannot actually execute the workload in Enarx. Now let's look at how we actually achieved this. To begin with, let's look at the client. You'll notice it's quite a small executable, actually. And notice also that this workload doesn't actually need to do any TLS itself or anything like that.
The Enarx runtime handles all the TLS connections for it, and by default all connections are TLS anyway. So we're going to use a virtual file system to connect to an address at runtime. Unfortunately, it's required right now due to a limitation of the WASI spec. There is work going on to provide these APIs, but currently it's not possible to just call a connect syscall like you would normally do, and that's why Enarx provides a virtual file system to actually connect to a particular address. Similarly, there's another virtual file system to extract the peer data from the connection we have established, and in this case we can simply match on that peer information. So here, for example, if we are presented with an anonymous peer, one which did not have a TLS certificate, we just simply abort. This would also be triggered if the certificate were not signed by a trusted party, like a Steward we trust. If it was a local workload, and it was executed in a real TEE, we could still trust it because we know the expected digest of the packages we have uploaded to the Drawbridge. This, by the way, is the exact same digest we have seen before; maybe you see it over here. So this is the exact same digest we've looked at before. Now, in the happy flow, of course, we're presented with an actual Enarx Keep, which is then associated with a slug and the digest. And what we can do here is match on the actual workload slug, so where this workload actually came from and its version, and in this case we don't even need to check the digest because we trust the Drawbridge slug. So in this case, we have verified these three versions, and we do not want to allow any other versions. Of course, this would eventually become a Keep configuration, probably; it could be specified in the TOML, but for now, just for simplicity, I've included everything in the source code. Now similarly, we have the server part.
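The client-side peer check described above can be sketched as a small policy function in Python. The real workload in the demo is a WebAssembly program and reads the peer data from a virtual file system; everything below, including the `Peer` type, the slug format `repo:version`, the repository name, the digest placeholder, and the version list, is a hypothetical stand-in that only mirrors the three cases from the talk.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy values; in the demo they are hard-coded in the workload source.
EXPECTED_DIGEST = "0" * 64
TRUSTED_REPO = "example/chat-server"
ALLOWED_VERSIONS = {"0.1.0", "0.1.1", "0.2.0"}


@dataclass(frozen=True)
class Peer:
    kind: str                     # "anonymous", "local", or "keep"
    digest: Optional[str] = None  # workload digest taken from the certificate
    slug: Optional[str] = None    # "repo:version" for workloads from a registry


def admit(peer: Peer) -> bool:
    """Decide whether to keep talking to the peer on the other end."""
    if peer.kind == "anonymous":
        # No certificate, or one not signed by a trusted party: abort.
        return False
    if peer.kind == "local":
        # Locally supplied workload: trust it only if the digest is the expected one.
        return peer.digest == EXPECTED_DIGEST
    if peer.kind == "keep" and peer.slug:
        # Registry workload: match on where it came from and which version it is.
        repo, _, version = peer.slug.partition(":")
        return repo == TRUSTED_REPO and version in ALLOWED_VERSIONS
    return False
```

Moving these values out of the source and into the workload's configuration file, as the speaker suggests, would leave `admit` unchanged and only change where the constants come from.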
And it has a very similar peer check over here, where it again checks for anonymous, local, and Keep peers. It actually doesn't want any local workload in, and it only allows essentially official releases that are verified and were issued by this entity over here. So let's get back to the slides. If you're interested in this project, you can get involved using one of the links provided over here. And now, a moment for a sad announcement. Just a few hours before recording this video, I found out that Profian is closing, and therefore the Enarx project is looking for maintainers, and I'm looking for a job. So if you know anyone who would be interested in the Enarx project or in me, please let me know. You can contact me via email or LinkedIn, and here's my GitHub handle. And now it's time for questions. Thank you. |
Scalable Confidential Computing on Kubernetes with Marblerun |
Hello everyone, my name is Moritz Eckert, and in this talk I want to highlight a challenge of confidential computing that goes more in the DevOps direction. I want to talk about how we can deploy, orchestrate, and manage confidential applications in a scalable, more cloud-native way using a tool called MarbleRun. Before we go into the details, I want to first set the scene, give you an understanding of a use case and of the problem set we're dealing with, and then explore step by step the solution that MarbleRun offers. In the end, we might even have time for a short demo. So we were doing a project together with Bosch where we were dealing with cars, vehicles, creating video streams that contain personally identifiable information, PII. That means license plates, faces, and other things that we wanted to feed into an AI model training pipeline, and we wanted to train that model on the raw, real data. That means no blurring, no cutouts. At the same time, we wanted to make use of an external labeling service: we would provide this material and they would label it for us so we can train our model on it. Of course, this should all be done in a privacy-preserving way, not sharing any of this PII. Therefore we developed a pipeline that essentially split the image based on what is PII and what is not, would then send the, let's call it safe, non-PII footage to the labeling companies and get back the labels, put the image together with the PII information again, and then feed it to our model training. And for that we wanted to use confidential computing. That means that from the car fleet to the actual trained model, data would always be encrypted, also at runtime, and we can have this whole pipeline verifiable. And this pipeline should be scalable; it should hold up against the challenges of a large car fleet sending a lot of footage, a lot of video data.
You can find more about this use case and this project; we presented it with Bosch at Microsoft Build and Intel Vision, and you will find those talks online. At Edgeless Systems, we build open source projects to realize such types of confidential computing applications. We have a bunch of tools targeting more of these new types of applications, and then we have a tool called Constellation that's more for the lift-and-shift side. For this specific project we chose the MarbleRun framework, which we call the control plane for scalable confidential apps. Now that we know the problem statement, let's see in more detail what the challenges are when you want to deploy such a scalable AI pipeline, or any other type of microservice architecture, in Kubernetes or any other cloud orchestrator. On the right-hand side we have our cluster that's already confidential-computing capable; in this case these were SGX-equipped machines. We already packaged our application into secure enclaves; in this case we used the Gramine framework for Intel SGX, which will be presented in the next talk after this one. So this is already done. Now, on the top left, we have our DevOps team, and the DevOps team wants to deploy all those different microservices that together build up the pipeline. They want to ensure that all of them have integrity and have the right identity; they want to do remote attestation, essentially. But how can they do that without doing it individually for each and every service, when services can potentially scale up and down dynamically? This is going to be a huge pain if you do it that way. And then they want to provide configuration, they want to provide parameters, similar to how you would in a regular Kubernetes deployment, but you can't trust the Kubernetes environment, you can't trust the control plane or anything. So how can they manage and orchestrate such applications?
And then, once deployed, how can the individual services securely communicate with each other, knowing that the other end of the connection is indeed a valid secure enclave, that it has the right identity, and that it's also part of the pipeline? And how can we do that decoupled from the application layer, so we don't have to do it over and over again when we build a new type of application and need to do remote attestation in the application logic? No, we want to do that in a service mesh fashion, where we decouple this functionality and solve it on a different layer. And lastly, how can an external party like the car fleet verify an entire deployment that's not just one big monolith but consists of several services? How can they verify it from the outside in one single, simple step without knowing the inner details of our architecture? These are the challenges we are facing when we do real-world deployments using confidential computing. So now that we know the challenge, let's see how MarbleRun approaches this problem. With MarbleRun, we deploy a control plane component; we call it the coordinator. The coordinator itself runs inside a secure enclave. That means our DevOps team can first verify that coordinator, and then they provide the coordinator with a deployment configuration we call the manifest, which is similar to any other deployment configuration for Kubernetes. But we deploy it to a trusted controller that we verified beforehand. So I want to show a quick example of what such a manifest could look like. You can see here it's in JSON format. The first thing we define here are packages. Packages are essentially the enclaves. This right here is specific to Intel SGX. We have the MRSIGNER, or here signer ID, and so forth. In the future, this will also support other types of CC hardware, for example AMD SEV and so forth.
But in this case, we have two SGX enclaves, backend and frontend, and they are identified by their unique signatures. Once we have these packages defined, think of them as if you had defined a container, a containerized enclave. We define the next section, which is our marbles. Marbles consume such a package; think of a Kubernetes pod consuming a container. In this case, the marble consumes the backend package, and then we can define several parameters, like files that should be available to that marble, the environment variables that you want to have defined for that marble, and the arguments. This is similar to any other Kubernetes deployment, but now, because it's in the manifest, and because you can verify the manifest, the coordinator can enforce this configuration for your enclaves and you can trust these configurations. That's the important point. What we also have here is called roles. So this marble is associated with two roles, cert reader and key reader. MarbleRun implements a type of role-based access control. So if I scroll down in this manifest, I will find a section that is called roles. Here, similar to any other role-based system, every role is associated with a resource type and specific instances of that resource, and it defines what actions that role can perform on the resource. We can then attach these roles to marbles, so a marble can do certain operations on resources. We also have users; users are authenticated using PKI, and then they can do whatever their role allows them to do. One of the things, for example, would be to define a secret. The secret could be a user-defined secret, for example an API key. And then we have a user, and we have a role that allows them to update the API key.
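The relationship between packages, marbles, roles, users, and secrets can be sketched as a Python dictionary plus two small checks. This is an illustration, not the authoritative manifest schema: the field names are modelled loosely on what the talk describes, and every concrete value (signer ID placeholder, user names, role name, action name) is made up for the example.

```python
# Illustrative manifest-like structure; field names and values are assumptions.
MANIFEST = {
    "Packages": {
        "backend": {"SignerID": "<signer hash>", "ProductID": 1, "SecurityVersion": 1},
        "frontend": {"SignerID": "<signer hash>", "ProductID": 2, "SecurityVersion": 1},
    },
    "Marbles": {
        "backend_first": {
            "Package": "backend",
            "Parameters": {"Env": {"ROLE": "first"}, "Argv": ["--first"], "Files": {}},
        },
        "frontend": {"Package": "frontend", "Parameters": {"Env": {}, "Argv": []}},
    },
    "Secrets": {"api_key": {"Type": "plain", "UserDefined": True}},
    "Users": {
        "key_owner": {"Certificate": "<PEM certificate>", "Roles": ["secret_manager"]},
        "devops": {"Certificate": "<PEM certificate>", "Roles": []},
    },
    "Roles": {
        "secret_manager": {
            "ResourceType": "Secrets",
            "ResourceNames": ["api_key"],
            "Actions": ["WriteSecret"],
        }
    },
}


def undefined_packages(manifest: dict) -> list:
    """Names of marbles that consume a package the manifest never defines."""
    packages = manifest.get("Packages", {})
    return sorted(
        name
        for name, marble in manifest.get("Marbles", {}).items()
        if marble.get("Package") not in packages
    )


def is_allowed(manifest: dict, user: str, resource_type: str, name: str, action: str) -> bool:
    """Role-based check: does any of the user's roles permit this action?"""
    roles = manifest.get("Roles", {})
    for role_name in manifest.get("Users", {}).get(user, {}).get("Roles", []):
        role = roles.get(role_name)
        if role is None:
            continue
        if (
            role.get("ResourceType") == resource_type
            and name in role.get("ResourceNames", [])
            and action in role.get("Actions", [])
        ):
            return True
    return False
```

In this sketch the `key_owner` user may set the API key while the `devops` user may not, which is the split of trust the multi-tenant scenario relies on.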
And this would, for example, allow you to have a multi-tenant scenario where you have the DevOps team that deploys this application, and you have another team that is managing another service, or access to something, and that provides a key or an API key to that application. Using the role-based access control, they can deploy or set that specific secret after the manifest has been put in place. So this key is then uploaded, is managed by the coordinator, which is trusted, and is from there distributed to your services, to your marbles. And that means the DevOps team can never access the secret. So you have a split of trust. You have the owners of that secret, who will always stay in control, and you have the DevOps team, and they can engage in a multi-tenant scenario. So we have now seen the manifest, we have seen the packages, the marbles, the role-based access control, and resources such as secrets. So we go back to our example. After the manifest is set, the coordinator will take care of providing those credentials, secrets, or configurations to your marbles, to your services. So once you deploy them using regular Kubernetes means of deploying the application, they will come up, they will be authenticated by the MarbleRun coordinator, and then receive their parameters and their credentials. Part of these credentials are TLS certificates. Every service has a unique leaf TLS certificate, which chains up to a root certificate that is established by MarbleRun as a PKI infrastructure. So for every deployment, you have one root certificate. Those can then be used to establish secure connections between the services. For certain runtime environments such as EGo, this can be done automatically in a mutual TLS fashion. If you're familiar with service mesh kind of things, you might be familiar with that term, but essentially it means you will wrap every TCP connection between your services automatically into TLS so they are secured.
For other types of runtime environments, or if there's a specific need, you can also consume those credentials using the MarbleRun manifest. You can place them into files, you can place them into the environment, and then your application can take them from there and use them to establish secure connections. That's very similar to how you would operate with cert-manager or other certificate management systems for Kubernetes. The difference is that those certificates are only available inside that secure, confidential environment, inside your secure, confidential deployment; that means the coordinator and then your services. So now we have our running deployment, everything is connected, and the services can communicate with each other. In the last step, an outside party can connect to the coordinator and obtain its attestation statement, which contains the MarbleRun manifest. So they can verify that it's indeed a valid MarbleRun coordinator, they can verify that indeed the manifest they expect is in place there, and they will obtain the root certificate of this deployment. After verifying that, they can use the root certificate to authenticate their deployment, in this case the frontend of the AI pipeline that consumes the video stream from the cars. And transitively, because they verified the coordinator and they verified the manifest, they verified the entire deployment in one concise, simple step. And that's it. Now we have solved all of those problems I showed you earlier: we can manage and orchestrate that deployment from the DevOps team side, we can communicate with that deployment from the car fleet, and potentially any other third party, legislator, or regulator can verify this deployment as well using the same type of mechanism. So in summary, what does MarbleRun do?
It adds orchestratability, configurability, and manageability to your confidential deployments; it makes them scalable, updatable, and manageable. It can run standalone, but it runs best on Kubernetes and the usual cloud environments. It supports several runtimes; so far it's SGX-specific, so for example EGo, Gramine, and Occlum, and in the future we plan to also support other types of confidential computing runtimes like AMD SEV and Intel TDX, most likely based on the Confidential Containers project. Okay, now we have a little bit of time left to look at a demo. Let me switch to my console. I have a Kubernetes cluster running here on Microsoft Azure with two nodes that are both equipped with SGX capabilities, and we will now install MarbleRun, set the manifest, and deploy a simple demo application. We use the MarbleRun CLI tool, which you can use to interact with the MarbleRun coordinator and perform your usual DevOps tasks. MarbleRun will make sure to install the MarbleRun coordinator; it should just take a second for everything to be set up. Then we can port-forward the client API to localhost so we can interact with the coordinator. If you're in a production environment, you can of course also use an ingress or any other way of exposing that service to any kind of environment you want to. We just need to wait a second for the coordinator to be up and running, and then I can use this client API from here. The second step would then be to verify that the coordinator is indeed a valid MarbleRun coordinator with the correct enclave. You can use any type of command, and it will automatically do this verification; in this case we want to obtain the root certificate of this MarbleRun deployment.
This would also be what an external party like a car fleet can do to obtain the root certificate and then connect to the application. Of course, you can always do that ahead of time and then just distribute that certificate as the root of trust. But once we trust the coordinator, which is just a plain constellation coordinator, we can do marblerun check providing the... oh sorry, it's not check, it's status. We can use marblerun status to see what status the MarbleRun coordinator is in. In this case it's waiting for a manifest to be set. Now we can set the manifest. I have one prepared here. It's a very simple microservice demo application with one front end and two back end services, so three packages and also three marbles for these packages, and together they build up this demo emoji voting service. We have already seen the manifest, so I'm not going to go into the detail again, but of course this is also on GitHub, so you can have a look at it if you want to. Then we can just set this manifest using marblerun manifest set. It will upload the manifest, and now the MarbleRun coordinator should be in the state of being ready to accept and authenticate marbles. So now we can just go ahead and deploy our application, and this is just a regular Kubernetes application deployment. In this case we're using Helm for this emoji voter demo application. It will install those three services and make sure that they are continuously running. And we can now check MarbleRun and see those authentication requests, where those marbles, for example the web front end or one of those two back end services, will contact the coordinator, authenticate themselves and be provided with their configuration parameters and their credentials. So we see the web front end was first, and it was successfully activated as a new marble. And this will continuously be running.
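The transitive verification described in this part of the talk (check the coordinator's attestation statement, compare the manifest, then trust the root certificate) can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not MarbleRun's API: the names `make_statement` and `verify_deployment` are hypothetical, and an HMAC with a made-up key stands in for the hardware-signed quote.

```python
import hashlib
import hmac

# Hypothetical stand-in for the hardware signing key; real attestation
# statements are signed by keys rooted in the CPU vendor, not a shared secret.
HARDWARE_KEY = b"toy-hardware-key"


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _signature(measurement: str, manifest_hash: str, cert_hash: str) -> str:
    body = f"{measurement}|{manifest_hash}|{cert_hash}".encode()
    return hmac.new(HARDWARE_KEY, body, hashlib.sha256).hexdigest()


def make_statement(measurement: str, manifest: bytes, root_cert: bytes) -> dict:
    """Toy coordinator side: bind enclave identity, manifest and root cert."""
    manifest_hash = sha256_hex(manifest)
    cert_hash = sha256_hex(root_cert)
    return {
        "measurement": measurement,
        "manifest_hash": manifest_hash,
        "root_cert_hash": cert_hash,
        "signature": _signature(measurement, manifest_hash, cert_hash),
    }


def verify_deployment(statement: dict, expected_measurement: str,
                      expected_manifest: bytes, root_cert: bytes) -> bool:
    """Toy client side: one check covers coordinator, manifest and root cert."""
    expected_sig = _signature(statement["measurement"],
                              statement["manifest_hash"],
                              statement["root_cert_hash"])
    return (
        hmac.compare_digest(statement["signature"], expected_sig)
        and statement["measurement"] == expected_measurement
        and statement["manifest_hash"] == sha256_hex(expected_manifest)
        and statement["root_cert_hash"] == sha256_hex(root_cert)
    )
```

Once this check passes, the client only needs the root certificate to authenticate every service the coordinator admitted, which is the "one concise step" idea from the talk.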
So now if I go ahead and get all the pods in the emoji voter namespace, I will see that all of these services are indeed running, and if I now scale one of them up or down, the new instances will be automatically authenticated again by MarbleRun and be added to this deployment, spanning a confidential overlay, a confidential deployment. Yeah, and that's it from the demo. I think we've seen how you can interact with the MarbleRun CLI, how you can install it and how easy it is to deploy a confidential application, and we can now keep using MarbleRun to orchestrate and update secrets and update things based on what we defined in the manifest. Let's go back to the slides. That's it. In summary, I think we've seen that MarbleRun makes it quite easy to deploy, orchestrate and manage confidential applications during their lifecycle, and it augments the usual DevOps stack of cloud native deployments using Kubernetes, your regular service mesh and so forth. So if you want to try it out or want more details, please see our GitHub page. It also links to the documentation, so you can very quickly create your first confidential deployment. We also have a simulation mode, so if you don't have access to any type of confidential computing hardware, you can just use the simulation mode to run it locally using minikube or whatever tooling you have in place. And then if you have further questions, please get in touch. You can find me on LinkedIn, you can join our Discord. I will also be there tomorrow for day two of the confidential computing devroom. Please hit me up if you have any questions or just want to have a chat. And yeah, before we go into the Q&A, one last cheeky little self-advertisement: we have the Open Confidential Computing Conference coming up in March. I think you're the right audience for that.
There are a lot of confidential computing open source projects that are going to be presented, some interesting applications and other insights into confidential computing in general. So yeah, it's free. It's online. Please register if you're interested. It will be very, very cool to see you all there. That's it. Thank you very much. And I think we have a bit of time left for questions. |
Gramine Library OS
Running unmodified Linux applications in Intel SGX enclaves |
Hi, I am Vijay Dhanraj, and welcome to a talk on Gramine. I am one of the contributors to the Gramine project, and I have been working on this project for the past few years. There was a talk on Gramine at last year's FOSDEM by one of the Gramine maintainers, covering the technical details of Gramine as well as a demo showing how to use Gramine and GSC. So in this talk I will quickly go over the technical details and focus more on the current status as well as the new features that were added to Gramine. We will focus more on EDMM, which is one of the new features that was added, and we will also discuss the future plans. I am assuming the audience is aware of Intel SGX as well as TEEs in general, so I will skip the background on this and move further. This is a presentation from an Intel representative, and thus I should show the following disclaimer. Gramine is an open community project, and currently Intel and ITL are the most active members of this community. There have been sizable contributions by IBM as well as Alibaba, and there are also other contributors from several other companies and academic partners. Productizing Gramine is currently done by Intel. Let us discuss the technical details of Gramine. Gramine is a library OS that emulates the Linux kernel. In other words, Gramine can be thought of as a tiny Linux kernel re-implementation that runs an unmodified guest in user space. Gramine supports applications with multiple processes, such as a bash script with many child workers or an Nginx application with many worker processes. This is possible because Gramine emulates syscalls like clone, fork and execve. Another important aspect of Gramine is that it is architected in such a way as to support different backends. Currently, there are two supported backends: Linux, also called direct, as this backend directly forwards the calls to the host OS, and Linux-SGX, which runs an application inside an SGX enclave.
In the future, we also plan to support other backends like TDX. Gramine is a community-maintained open source project, and it joined the Confidential Computing Consortium in September 2021. Gramine can be easily deployed in a cloud environment like Azure. One can get the Docker image from Docker Hub. With a few additional packages, one can get started with a simple hello world program very quickly. The latest production-ready Gramine was released in September 2022, and we plan to release on a quarterly cadence. One dot release is actually planned for this week, so keep an eye out for it. We have successfully enabled multiple widely used applications in Gramine, and we have also enabled support for many popular languages like Python, Go, Rust and Java. One of the recent applications that we have added is scikit-learn. We plan to support more use cases as we grow. One of the benefits of Gramine is that it is both modular and small, which allows unmodified applications to run on different platforms in different environments. So technically Gramine can take an x86-64 based application and its dependent libraries, without any modification or recompilation, and let it run in another environment. Currently in Gramine we can run an application in an Intel SGX environment on top of an untrusted Linux host. Hypothetically, we could also run an application in Gramine in a RISC-V Keystone environment on top of a Windows kernel, but this may need a new backend to enable such an environment. The takeaway here is that with minimal effort Gramine enables lifting and shifting an application from one environment to another. Gramine is a library OS, and so it runs as a normal user-level process on the host OS. Gramine is modular, and it runs as two closely interacting components. One is the Gramine main component and the other is the Gramine backend component. The backend component can be switched to a different implementation without any modification to the Gramine main component.
To allow switching between different backends, Gramine specifies an ABI between the main component and the backend. Applications enabled inside Gramine should be Linux applications that use Linux system calls. Gramine intercepts these system calls and tries to emulate them in the Gramine main component. Gramine currently implements around 169 syscalls. A few examples of syscalls that Gramine emulates are the affinity-related syscalls. There are also a few calls which Gramine forwards to the host OS, these being the input/output syscalls such as receive or send, as Gramine cannot communicate with the outside world without the help of the host OS. When Gramine forwards to the host OS, it doesn't simply call the host OS. There is one more indirection, and this is the Gramine backend. The backend is needed to adjust Gramine's requests to the capabilities of the underlying platform. So the backend is kept minimal and stateless. Gramine calls into the backend via 50 functions. So this is basically the Gramine ABI that's defined between the two components. The main code base remains exactly the same between different backends. To run a Linux application inside an SGX enclave, we can simply use Gramine with an SGX-specific backend. Linux-SGX is the backend that is primarily used in Gramine. As Intel SGX dictates separation of a process into a trusted and an untrusted part, Gramine's SGX backend is also divided into two parts: the trusted part that runs within the enclave and the untrusted part that runs outside the enclave. The trusted part performs OCALLs to exit the enclave and pass control to the untrusted part. There are approximately 42 OCALLs in total. The untrusted part forwards the request to the host kernel, gets the results back and then re-enters the enclave. It is important to know that these 42 OCALLs are the only attack surface in Gramine.
So we audit all these OCALLs and also add checks and validations to make sure there are no attacks from the untrusted host. We believe having these 42 OCALLs between the trusted and the untrusted world is a good compromise between security and performance. But when talking about trusted execution environments, the TCB plays a main role. On the metric of TCB, Gramine offers the main component, which is around 27K lines of code. This is basically implemented in C with a little assembly. The direct Linux backend is about 15K lines of code. For comparison, the open-source version of Redis without any additional modules is about 144K lines of code. The tiny configuration of Linux is about 270K lines of code. When we look at the SGX environment, the main Gramine component remains the same. It is around 27K lines of code, the SGX trusted backend is around 15K lines of code and the untrusted backend is around 4K lines of code. Since only the trusted backend runs within the enclave, the total TCB for Intel SGX is around 42K lines of code. And this is small for a framework that can run a wide range of applications. Another important aspect is SGX attestation. For this, I would recommend looking into the docs, which very nicely explain the SGX attestation flow. You can also look at last year's talk, which also covers SGX attestation. So next, let's talk about the enhancements since the last FOSDEM talk. In Gramine, we have now started releasing packages for popular Linux distributions such as Ubuntu and CentOS. We have started releasing official Gramine Docker images. And we have also introduced some curated applications with several workloads, which are available on Azure Marketplace. A few of the core Gramine changes that were done are a major overhaul of the memory management and a total rewrite of sockets, and also of ELF parsing and loading. We have also added support for Intel AMX instructions.
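The TCB figures quoted above add up as follows. This is a small arithmetic sketch in Python using only the approximate line counts from the talk; the dictionary keys are labels chosen here for illustration.

```python
# Approximate sizes in thousands of lines of code, as quoted in the talk.
KLOC = {
    "gramine_main": 27,
    "sgx_trusted_backend": 15,
    "sgx_untrusted_backend": 4,   # runs outside the enclave, not part of the TCB
    "redis_open_source": 144,
    "linux_tiny_config": 270,
}


def sgx_tcb_kloc(sizes: dict) -> int:
    """Only code that runs inside the enclave counts toward the SGX TCB."""
    return sizes["gramine_main"] + sizes["sgx_trusted_backend"]


if __name__ == "__main__":
    tcb = sgx_tcb_kloc(KLOC)
    print(f"SGX TCB: about {tcb}K lines")
    print(f"Share of Redis size: {tcb / KLOC['redis_open_source']:.0%}")
    print(f"Share of tiny Linux config: {tcb / KLOC['linux_tiny_config']:.0%}")
```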
There was also a sanitization of the CPU and NUMA topology, and we have added support for executable scripts. Recently, we added support for EDMM, which is Enclave Dynamic Memory Management, and we will discuss this in the next few slides. So why EDMM? SGX1 requires all enclave memory to be committed at enclave build time. This causes a few limitations. The developer needs to predict the maximum memory that the enclave needs, thus hogging critical EPC resources. The developer also needs to specify the maximum number of threads that can be created inside an enclave. And since the page permissions cannot be changed, all the enclave memory must have read, write and execute permissions. To improve on this, the SGX2 extensions were introduced, and as part of that we have EDMM, which allows runtime management of enclave resources. With EDMM, enclave memory is added and removed on demand. As part of that, we can reduce the startup time, because we don't add any EPC pages during enclave build, and so for bigger enclave sizes this is really useful. It also enables new functionality such as dynamic thread creation. These are a few of the SGX2 instructions. The blue table shows supervisor instructions, which comprise allocating EPC pages, restricting page permissions and modifying EPC page types. The yellow table shows user-level instructions. These are basically used to accept a page inside the enclave and to enhance page permissions. EACCEPTCOPY is a special instruction wherein, as part of the accept, you also copy contents into the augmented EPC page. So a quick overview of EDMM. As part of EDMM, we can dynamically allocate and deallocate EPC pages, we can dynamically create and destroy threads, and we can restrict or relax page permissions. By restrict, I mean reducing the permissions, for example, from read-write-execute to just read. And by relax, I mean the opposite, wherein you change from read to maybe read-write-execute.
But dynamic memory management requires a new approach. The OS is involved in page table management and also TLB flushes. But since the OS cannot be trusted, the EDMM architecture ensures that the enclave confirms the request before any changes take effect. So EDMM provides a new instruction, EACCEPT, to make sure the enclave accepts the new modifications. Here I will quickly discuss how a typical page allocation flow happens. This is a page-fault-based allocation. What I mean by that is that we invoke EACCEPT on an uncommitted page. This triggers an asynchronous exit to the SGX driver, calling the page fault handler. As part of the page fault handler, the SGX driver EAUGs the page and returns to the untrusted backend. The untrusted backend resumes the enclave, and then we retry the instruction. Since it is a faulting instruction, it will be retried, and this time it will succeed and the EPC page is accepted as part of the enclave. So as part of each page allocation, we encounter one AEX, two context switches and one ECALL. In the case of page deallocation, we issue an ioctl to the SGX driver to trim the page. What that means is that we tell the enclave not to use those pages. As part of this ioctl, the SGX driver invokes an instruction called ETRACK to remove all the TLB addresses from the CPUs. It issues IPIs to flush the stale linear-address-to-physical-address translations, and it returns to the untrusted backend. The untrusted backend now resumes the flow, and we accept this new modification that was done to the enclave. Then we issue another ioctl to remove the page from the EPC. The driver removes the page from the EPC and goes back again to the untrusted backend, which resumes the enclave, and we continue our execution. So this is a simple deallocation flow, and as part of the deallocation flow we end up having two OCALLs, four context switches and two ECALLs.
So this is a pretty expensive operation and might lead to some performance impact. In Gramine, we have enabled EDMM. EDMM support was enabled in the Linux kernel starting from v6.0, and Gramine currently supports the naive page allocation and deallocation that we just saw, as well as restricting and relaxing page permissions. To turn on EDMM, we use a manifest option called edmm_enable, but this is set to false by default. If this is set to true and we run on a CPU which does not support EDMM, Gramine will refuse to start. On the performance implications: as we saw, the naive EDMM implementation is fairly expensive, since adding and removing a page at runtime is more expensive than adding the page at enclave creation or build time. Currently we are working on optimizations to improve this performance. So our current ongoing work is EDMM optimizations to support quicker allocation and deallocation flows, and we are also working on adding support for dynamic thread creation and destruction. We are also continuing to develop support for additional runtimes and workloads, and we are currently working on integrating with confidential container deployments like Kata Containers and enclave-cc. We are also working on an interoperable RA-TLS and on standardization of this interoperable RA-TLS. As part of our future work, we want to support additional TEEs such as Intel TDX, and we want to support communication with hardware accelerators. We are also exploring coarse-grained partitioning of applications to enable use cases like DPDK, where the control plane is executed inside the enclave and the data plane is executed in the untrusted region. To get a roadmap for Gramine, please visit the following link. You can reach us on GitHub, and for those who are interested in joining us, you can contribute in the form of pull requests, issues, discussions or tutorials as well.
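The startup behaviour described for the EDMM manifest option (off by default, refuse to start if enabled on a CPU without EDMM) can be sketched as a small check. This is an illustrative Python model, not Gramine code: the manifest is represented as a plain dictionary, the `sgx` table around `edmm_enable` is an assumption about where the option lives, and `resolve_edmm` is a hypothetical name.

```python
class EdmmUnsupportedError(RuntimeError):
    """Raised when EDMM is requested but the CPU cannot provide it."""


def resolve_edmm(manifest: dict, cpu_supports_edmm: bool) -> bool:
    """Return whether EDMM should be used, mirroring the rules from the talk.

    Assumption: the option is read from manifest["sgx"]["edmm_enable"].
    """
    enabled = manifest.get("sgx", {}).get("edmm_enable", False)  # default: false
    if enabled and not cpu_supports_edmm:
        # The talk says the runtime refuses to start in this case.
        raise EdmmUnsupportedError("EDMM enabled in manifest but not supported by CPU")
    return enabled
```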
And you can read more about Gramine on the Gramine website, and there is also a docs section which is pretty useful. You can also reach us through GitHub, and yeah, I think that's the end of the talk. Thank you for your time. Thank you. |
Confidential Containers and the Pitfalls of Runtime Attestation |
Hello, my name is Tobin, and today I want to talk to you about an interesting security consideration that has come up in the Confidential Containers community and that might apply to other confidential computing projects as well. I know that remote talks can be a little bit flat, but if it's any enticement, I think this is maybe the biggest security issue that our community has faced so far and possibly also the most interesting. Before we start, I want to thank my colleague Dov Murik as well as community members Nan Krishnamurti and Mingwei Sho for helping to discover, uncover and think about this interesting problem. Before we get to the main issue, I want to revisit one design principle of Confidential Containers that is significant here, and that's the idea of decoupled measurement. With a lot of confidential computing projects, the hardware evidence is used to measure the workload, or includes the measurement of the workload. In some ways it's very intuitive to want to measure the workload, the most significant thing, with the hardware, but the drawback is that the hardware measurement is often platform specific and a little clunky. In Confidential Containers, we prefer to use standard techniques such as container signatures or encrypted containers. We do this with a two-stage process where we use the hardware measurement to measure the stack, essentially the Kata agent and its associated tools, and then, once those are trusted via this hardware measurement, we can use those tools to measure the workload itself. This is handy because, again, it allows us to do standard stuff, but it also means that we can use a generic guest image, which is a good fit for Kata Containers in particular. It will probably become clear why this generic guest image is significant here, but for now I want to talk about these so-called evidence factory attacks. I think that's more or less a made-up term, but it makes sense here.
The crux of this attack is that we're going to use an attestation report that we aren't really entitled to in order to get secrets that don't belong to us, to attack a target key broker. How are we going to do this? Well, attestation reports, once they're out in the wild, can't really be tampered with, and they can't really be stolen. They can't be tampered with because they're generally signed by a private key that's only known to the hardware, and they can't really be stolen because you should put a nonce inside your attestation report that links it to a very particular connection to the KBS. If you can't steal one that's floating around in the world, how are you going to attack the KBS? How are you going to impersonate somebody who has a valid report? One way might be to launch your own virtual machine: launch your own guest and request your own attestation report from that guest. Of course, if you do this, the launch measurement will probably be wrong, because the launch measurement will show, hey, this is some random person's VM with a bunch of random stuff in it. It's not the VM that the KBS is expecting to provision the secrets to. On the other hand, if you were somehow able to get the launch measurement to be correct, well, then the VM wouldn't be malicious and you wouldn't be able to access it, because the launch measurement should guarantee that there's no easy way into the guest. To be clear, what we're really trying to do here, the crux of this attack, is to figure out if there's a way for a malicious virtual machine to manufacture a valid attestation report. When I say valid attestation report, what I mean is an attestation report that reports the correct firmware, the correct initrd, the correct kernel and the correct command line. If there's a way to do this, then we have a bit of a problem. How is somebody going to break into the guest in order to request a valid attestation report?
Clearly, it's not going to happen by tampering with the boot process, because that would change the launch measurement. Well, that's not what this presentation is about, at least. Instead, we're looking at a later phase of execution, where the guest is already running, where the guest has already booted and has already used a valid initrd and kernel. It's already got the correct launch measurement. Can we get into the guest once that has happened? In the community, we've looked at the trust model of Confidential Containers a lot. Something that stuck out from the beginning was the API between the guest and the host. In this case, this is the API of the Kata agent. One discussion from the beginning was how we harden the API of the Kata agent so that we can't have arbitrary code executed inside the workload or inside the guest. But it turns out, when we were thinking about this, that we may have overlooked another pretty significant attack surface, a large API, and that is the API between the workload and the guest, the API between the container and the guest kernel. Essentially, this is a very large API. What if a workload could break out into the guest? Would that cause big problems? The answer, in short, is yes. If a malicious container can break into guest user space, it can generate a valid attestation report and use that to request secrets from any KBS. There are two things that are really bad about this in particular. First of all, when I say any KBS, I really mean any KBS, not just a KBS that belongs to a particular client. Like I said, we have a generic guest image, meaning that the launch measurement we're looking for is the same across the entire project. Anyone who's using Confidential Containers is looking for the same launch measurement, essentially. So if I can generate this attestation report from a guest that is booted correctly from that image, I can then request secrets from anyone. That's obviously not good at all.
Another thing that's pretty bad about this is that containers execute arbitrary code by design. We might hope that there's some way to prevent a malicious container from running. But in fact there isn't, because one of the features of Confidential Containers is that we can run containers. We can run any container you want, essentially. Now, some of the good news is that this is not actively exploitable, because it relies on there being a way to break out of the container into guest user space. But the flip side of this is that you might say that the security of Confidential Containers, at least in this case, reduces down to the security of non-confidential containers, just to this API. Once you break out of that, you have the ability to steal secrets. That's really not good. So let me walk through this attack step by step, just to make it absolutely clear how it goes. The first step is that the attacker needs to craft a container that can execute code in guest user space. This is not trivial, but it's far from impossible, and things like this have happened in the past. Then the attacker is going to run this container with Confidential Containers. They can do this locally using their own SEV machine or whatever, or they can do it in the cloud. It doesn't matter. Anywhere Confidential Containers is running, they can run this. The container, once it starts up, is going to connect to a KBS and make some sort of request. As part of that, there's an authentication flow where it will get a nonce from the target KBS. It's going to keep track of that nonce, and then the container will break out and get this valid attestation report. When it requests the attestation report, it's going to put the nonce that it got in it, and it's also going to put its own public key there. It's basically just setting up its own secure connection. Once that secure connection is there, it's just going to use it to request secrets.
It sends off the attestation report, and the KBS will say, yep, the measurement checks out, okay, here's the nonce that I sent, that's correct, let me use this public key to create a secure channel to wrap secrets. It's really a relatively straightforward attack in some ways. Let's talk about solutions. While this attack is in some ways very severe, there are actually a number of different things we can do to prevent it, although it's not yet clear exactly which is the optimal solution. Really the simplest solution is to revoke access to attestation reports, and what this means is that the guest would have some ability to revoke its own capability to get a future attestation report. This hinges on the idea that there are different phases of execution in Confidential Containers, or in a lot of projects, really. In the early phase of execution, we should still be executing within the footprint of what is measured, within the footprint of the kernel, the initrd, kernel parameters, firmware, things like that. As long as we're in this area that has been measured, we can be relatively confident about what is going on. One thing that is a big red flag here is that we are allowing arbitrary code to be executed within the guest in the context of a container. Maybe before we start doing this, we should figure out some way to get rid of any future attestation reports, to make sure that none are generated. With SEV-SNP, there's actually a pretty easy way to do this, which is that we can overwrite the keys that the guest needs to communicate with the PSP. They're called VMPCKs, and you can delete them. If you do that, you will not be talking to the PSP anymore. This has some limitations. I'm not sure of a good way to do this with TDX, but if you know, let me know. I think this is a pretty simple way to address the problem, and a pretty good solution in some ways. That said, there are things it would conflict with.
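The attack walked through above, and the idea of revoking the guest's ability to request further reports, can be sketched as a toy model in Python. Everything here is an assumption made for illustration: an HMAC with a made-up key stands in for the hardware-signed report, `Guest` and `NaiveKBS` are hypothetical classes rather than Confidential Containers code, and a real KBS would wrap the secret with the supplied public key instead of returning it directly.

```python
import hashlib
import hmac
import secrets

PSP_KEY = b"toy-psp-signing-key"            # stand-in for the hardware signing key
GENERIC_MEASUREMENT = "generic-guest-image"  # same launch measurement for everyone


def _sign(measurement: str, report_data: str) -> str:
    body = f"{measurement}|{report_data}".encode()
    return hmac.new(PSP_KEY, body, hashlib.sha256).hexdigest()


class Guest:
    def __init__(self, measurement: str):
        self.measurement = measurement
        self.vmpck = b"toy-vmpck"  # key the guest needs to talk to the PSP

    def revoke_attestation(self) -> None:
        """Mitigation: delete the key before any container code runs."""
        self.vmpck = None

    def get_report(self, nonce: bytes, public_key: bytes) -> dict:
        if self.vmpck is None:
            raise PermissionError("VMPCK deleted; cannot request a report")
        report_data = hashlib.sha256(nonce + public_key).hexdigest()
        return {
            "measurement": self.measurement,
            "report_data": report_data,
            "signature": _sign(self.measurement, report_data),
        }


class NaiveKBS:
    def __init__(self, expected_measurement: str, secret: bytes):
        self.expected_measurement = expected_measurement
        self._secret = secret
        self._nonces = set()

    def challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._nonces.add(nonce)
        return nonce

    def request_secret(self, report: dict, nonce: bytes, public_key: bytes):
        if nonce not in self._nonces:
            return None
        self._nonces.discard(nonce)  # each nonce is single use
        expected_sig = _sign(report["measurement"], report["report_data"])
        if not hmac.compare_digest(report["signature"], expected_sig):
            return None
        if report["measurement"] != self.expected_measurement:
            return None
        if report["report_data"] != hashlib.sha256(nonce + public_key).hexdigest():
            return None
        return self._secret  # toy: a real KBS would wrap this for public_key
```

In this model, code that has broken out into a correctly booted guest passes every check the KBS makes, because the report it obtains is genuine. After `revoke_attestation()` the same request fails inside the guest, before any report exists.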
There's been some discussion of something referred to as the passport model, where the workload would actually get an attestation report at runtime, and obviously that would be disabled if we don't allow any attestation reports to be generated at runtime. There are maybe ways to resolve that, but that's a discussion that will need to happen. Another solution is to use the host data field. I'm going to come back to this because it's a fairly complicated proposal, so I'll talk about that on the next slide, but let me go through the other two first. One of them is to use IMA to basically constantly check the state of the guest. This would maybe detect a breakout. Doing this would probably require a virtual TPM inside the guest, and this may be available at some point. It's not yet clear if we're going to utilize that for Confidential Containers. In some ways it might be overkill, but here it might actually be useful. The final thing I wanted to mention is VMPLs. This is virtual machine privilege levels, and it's a feature of SEV-SNP. This probably isn't feasible for use in Confidential Containers, exactly, but I bring it up because this is actually the way that the vTPMs that are being developed for SEV-SNP sidestep this problem. A VMPL is a way to create different privilege levels inside a guest, and one thing a VMPL can limit access to is the generation of a particular attestation report. If we had some sort of hardware feature like that, which could isolate the ability to generate these extremely sensitive attestation reports, that could help, and the vTPMs have exactly this in the form of the VMPL. So let's talk about the host-data-based solution. First of all, what is the host data? The host data is a field in the attestation report that is set by the host prior to the launch of the guest and then included in every attestation report requested from inside the guest. This is SNP terminology, but I believe MRINFO is a similar thing for TDX.
Even if the guest is malicious, there's no way for it to get a report without the host data in it. We could use the host data for a number of different things, but I'm going to focus on using it to store the public key of the KBS. This will essentially bind the hardware evidence to one KBS. Interestingly, this does not actually guarantee that we are connecting to the correct KBS. It doesn't guarantee the identity of the KBS, in part because the host sets the host data and the host is untrusted. Instead, what it guarantees is that the guest will only be able to connect to one KBS. When the attestation agent connects to a KBS, it will check that the public key of the KBS matches what is in the host data. If it doesn't, the connection will fail. Similarly, the KBS will also enforce that its own public key is in the host data of the report it receives with a request. Fundamentally, the only evidence available within this guest will be evidence that has the public key of one particular KBS in it. Now, on its own, this does not really prevent the attack, because we could direct all requests coming from the guest to the target KBS. But this is where signature validation comes in. When we turn on signature validation, the attestation agent will make a request to the KBS for the signature policy before any containers can be run. This is assuming we're using an online KBS. Since we can only connect to one KBS, as soon as we request signatures and that request is successful, we've collapsed the attack space dramatically. Now the only KBS that we can attack is the KBS that provides the signature verification policy. If we use a malicious KBS to provide a signature verification policy, for instance one that allows us to run the malicious container, then we still have to connect to the malicious KBS for the entire lifetime of the guest. So the only KBS that we can attack would be that malicious KBS. That's not much of an attack.
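The host-data binding described above amounts to two symmetric checks, one in the attestation agent and one in the KBS. A minimal Python sketch follows, assuming the host data holds a SHA-256 digest of the KBS public key; that encoding, and the function names, are assumptions for illustration rather than the project's implementation.

```python
import hashlib


def key_digest(public_key: bytes) -> str:
    """Assumed encoding: host data holds the SHA-256 digest of the KBS key."""
    return hashlib.sha256(public_key).hexdigest()


def agent_may_connect(host_data: str, kbs_public_key: bytes) -> bool:
    """Guest side: only talk to the KBS whose key the host data names."""
    return host_data == key_digest(kbs_public_key)


def kbs_accepts(report: dict, own_public_key: bytes,
                expected_measurement: str) -> bool:
    """KBS side: the report must be bound to this KBS, not some other one."""
    return (
        report["measurement"] == expected_measurement
        and report["host_data"] == key_digest(own_public_key)
    )


if __name__ == "__main__":
    target_key, malicious_key = b"target-kbs-key", b"malicious-kbs-key"
    # The attacker launches a guest bound to their own KBS so that it hands
    # out a permissive signature policy and the malicious container can run.
    report = {"measurement": "generic-guest-image",
              "host_data": key_digest(malicious_key)}
    print(kbs_accepts(report, malicious_key, "generic-guest-image"))  # True
    print(kbs_accepts(report, target_key, "generic-guest-image"))     # False
```

The second result is the point of the scheme: evidence produced in a guest bound to the attacker's KBS is of no use against the target KBS.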
On the other hand, we could direct everything to the target KBS instead, try to get the target KBS to provide the signature verification policy, and then attack it later with secret requests. But this isn't going to work because the target KBS shouldn't give us a signature verification policy that would allow the malicious container to run. So with these two properties, first requiring that all connections go to the same KBS by putting its public key in the host data, and then requiring that signatures are there so that we know we've connected to at least one KBS and we know it's the KBS that's actually running the container, we can really close off this attack fairly effectively. So there's a few things we should still note here. One of them is that we've made an assumption, which is that the target KBS is going to require signatures. And this might be setting a precedent for the project in some ways, saying that if you want to sidestep some relatively serious security issues, you need to use image signatures. Is this a reasonable thing to take on board or is it too restrictive? Is it leaving aside certain use cases that are really valuable? I think that's something we still need to think about. Another thing is that I mentioned that this is not the only possible use of the host data. So another thing we could do with the host data is that we could put workload-specific information in the host data field. For instance, we could put the measurement of the container image or something a little bit more sophisticated that achieves the same thing. In my mind, I think that this is redundant to the image signatures, which have so far been really the main way that we've done measurement of the workload in this project. And I think one of the important takeaways from this presentation is that there's ways to resolve this attack without compromising on our ideas about decoupling measurement. 
And in fact, we can have a generic guest image. We can still use signatures and all this stuff as long as we take a few precautions. But that said, I think it would solve the problem if we put workload-specific information into the host data. Okay. I want to talk a little bit about SEV-ES. So far, I've been talking about SNP, which isn't even implemented yet, so this is another reason why in some ways it's not actively exploitable. With TDX, I think the current TDX implementation probably is vulnerable to this, although I'm not 100% sure about that. What we've got currently on the SEV side is SEV-ES. And here, the story is sort of better and sort of worse. I'm not going to go into all the details here, but the important thing is that the connection to the KBS is actually made from the host, right? Because with SEV and SEV-ES, we use pre-attestation, which is driven by the host. So again, the connection is made by the host, not inside the guest, which means it's very hard to regulate. This thing I was talking about earlier where we check that the KBS is going to match a certain field or anything like that, it's not going to work because it's all happening in the untrusted realm. Unfortunately, we also don't really have the ability to revoke a connection inside the guest. The first solution that I mentioned, where we say no more attestation reports, we can't really do that with SEV in the same way using the online KBS that we currently have. So essentially, of the two main solutions that have been proposed, the host data won't really work for SEV, and revoking the attestation reports isn't really going to work for SEV either. The good news is that the VM can only connect to one KBS at a time. Just the way that the online KBS is implemented, you inject a connection to a very particular KBS. Now you could target someone else's KBS here potentially, so the attack still exists, but it can only be carried out targeting one KBS per guest, if that's any consolation. 
So in some ways, the attack is a little bit less severe, but it's probably harder to mitigate, to be honest. So I am still thinking about the best way to address this with SEV and SEV-ES. So what can we take away from this? First I want to mention some general things. For one thing, we realize here that the capability of generating a valid attestation report or valid hardware evidence is precious and needs to be protected. When I say valid here, I don't mean generating an attestation report that has a signature that checks out. I mean generating an attestation report that will convince someone's KBS to give you secrets. Being able to do that is, like I said, extremely precious. So let's think, how are we protecting this? For us, a big part of this is that we realize that our protection against this wasn't a hardware protection. Really, instead it was the software protection of the API between the workload container and the guest. It's also important to think about what happens if these protections should fail, right? Especially in our case where, again, it's not hardware protection, it's software protection. What happens if somebody does breach that and gains the ability to generate so-called valid evidence? Is it the case that they can attack anyone, right? There's some dangers of having a very generic measurement. We need to be careful in having a generic guest measurement and make sure that that doesn't mean that it's easy to impersonate some other person using confidential containers and convince some other person's KBS to give you their secrets. There's a more complex underlying question here about the identity of a guest. Confidential guests for a long time haven't really had much of an identity. There haven't been many ways to give an identity to a guest. In some ways, our project leans into that by having a generic guest measurement. But there's dangers in having no identity whatsoever. 
Now, still, I maintain that the decoupling of the workload measurement and the guest measurement is a good idea. And that having a fixed identity or an identity that corresponds exactly to the workload may or may not be necessary. But we do have to take precautions to make sure, for instance, that evidence isn't interchangeable between KBSs. A sort of meta note about all this is that you need to be careful about relying too much on one trust model. I showed the diagram earlier. It's a diagram that tons of people had spent a lot of time staring at and thinking about. But we've mainly been thinking about the trust model in terms of protecting the workload container. It hadn't occurred to us that you also need to think about protecting the ability to generate evidence from everyone, including protecting it from the workload. Finally, a few things that are a little more specific to the project. I do want to clarify, again, that this is, in some ways, pretty serious, but also not actively exploitable and pretty fixable. So we will work to get this fixed. We're working on the SNP implementation right now, and it will contain some of these mitigations. It has not yet been decided exactly which. If you have any ideas for this, please let me know. This is still a relatively young project. And we get confidence in the security of our project by going through things like this. And at an early stage where it's relatively harmless, it's good to discover these things. So if you have any ideas or any questions, please let me know. And let's keep the discussion moving forward on this topic. |
We need a Let’s Encrypt movement for Confidential Computing
The importance of protecting data in use |
This is the first session, and I wanted to explain a bit more and give an overview of the field for those who are learning. So I'm Nick Vidal. I'm the community manager at Enarx, which is part of the Confidential Computing Consortium. It's an open source project. I'm also serving as the chair of the Outreach Committee of the Confidential Computing Consortium, and it's a pleasure to be here. So let's start out by talking about the states of data protection. This is very basic. I mean, everybody knows about protecting data at rest and protecting data in transit, but protecting data in use is something that is relatively new. So what exactly is protecting data at rest? When you have your hard drive encrypted and you're traveling and your laptop gets lost, that data is safe as long as your hard drive is encrypted and nobody can get in there. In transit, when you open up your browser and you just type HTTPS and you access some website, the data that's flowing between your browser and the server, if that's using HTTPS, is encrypted and nobody can tamper with that data. That's secure. However, there's a third way that the data can also be accessible, and that's directly on the CPU. This is something that people mostly are not aware of, developers, even security professionals. They are not aware that when you have some type of application or data running on the CPU, in memory, if for some reason that system is compromised, somebody can get access to that data. Suppose that there's an exploit and somebody gains root access. They'll be able to see what's in memory. Confidential computing allows you to encrypt data while in use, while in the CPU. Even if somebody breaks in, even if somebody gains root access to that system, they won't have access to that data, because it's just like the hard drive example. It's just like the data in transit example. Confidential computing protects the data and the code, both confidentiality and integrity. 
Confidentiality means you cannot actually read the data, and integrity means you cannot mess up or tamper with that data or with that code. To achieve confidential computing, you have to at least provide data confidentiality, data integrity, and code integrity. How about code confidentiality? As part of the Confidential Computing Consortium's definition of confidential computing, that's not necessary, but some projects, like the Enarx project, provide all those protections. This is the official definition by the Confidential Computing Consortium: confidential computing protects data in use by performing computation in a hardware-based, attested, trusted... manage sensitive and regulated data. I wanted to read that because at the CCC a whole group of people worked together for one year to write this definition and another year to add one word, attested. I wanted to read this definition very clearly, so I wouldn't make a mistake, right? I don't want to memorize it and forget something. So what are the use cases for confidential computing, where can it be used? Actually it has many uses. Right now we have some sectors that are very much regulated, they have a lot of sensitive data, and in fact they cannot use the cloud as of today. They simply can't, policies won't allow them to benefit from the cloud. So we have for example banking, financial services, insurance. Of course they have a lot of sensitive financial data. We also have healthcare. There's HIPAA, for example, which in the US is a regulation regarding healthcare. We have telecom, edge, IoT, governments, a whole bunch of sectors that currently do not use the cloud because they can't, because they have a lot of sensitive data, because it's very much regulated by governments and policies. So confidential computing will open the cloud, the IoT, the edge to these sectors that have a lot of sensitive data, and that's the huge potential of this technology. 
If we can open up the cloud to these sectors, it will grow a lot. That's one of the reasons why cloud service providers, this year and this past year, have started offering confidential computing, and this is going to grow immensely. So I talked about the Confidential Computing Consortium. We are part of the Linux Foundation, and we bring together hardware vendors like Intel, AMD, ARM, NVIDIA, cloud service providers like Azure, Google Cloud, and so many others, startups as well, and software developers. We have a whole bunch of members here. As you can see, all the major players are betting on this, because in some ways, this is the future of the cloud in terms of security, and currently we have seven open source projects. We invite as many open source projects as possible. If you are working with confidential computing and you have a nice project, we welcome you to the CCC. So I work at Enarx, but we have Gramine, we have Veracruz, Veraison, a lot of really great technology here, which is fully open source. Now let's step back and look at the Let's Encrypt movement. Not many people are aware of confidential computing and the importance of protecting data while in use. If we go back 10, 15 years, that was the same challenge that we had regarding protecting data in transit. People were like, hey, I'm not an e-commerce site, I'm not a bank, so why should I use HTTPS? We kind of laugh right now. Hey, HTTPS is just the default, right? It's very easy. Why shouldn't we have this as the default? Even if you have your own blog and it doesn't have any sensitive data, you're going to use HTTPS because it's easy, it's convenient, and it's just more secure. This same mindset is what we need. We need a change so that people start really thinking about protecting data in use, and it will make everything so much more secure. 
It doesn't matter if your system gets hacked, if someone has root access to it. Even so, it's not game over, your data is still secure. We hear in the news all the time about all the vulnerabilities, and this could have been prevented by using confidential computing. So Let's Encrypt was started in 2012 by four people, two from Mozilla, one from the Electronic Frontier Foundation, and one from the University of Michigan. It's the world's largest certificate authority, it provides free TLS encryption, and the goal is really to make the web safer using HTTPS. They have a lot of sponsors and partners. As I mentioned, the Linux Foundation helps them and is a partner, also the Mozilla Foundation, the EFF. There's a whole group of people that see the importance of having HTTPS by default today. What makes it possible is that they have developed software that's very easy to use, so it's a very easy process to enable HTTPS. So they have the ACME protocol, and they have Certbot, if you know what that is. That's the Python application that creates the certificates, right, and on the server side they have Boulder. And these technologies, this software, and this protocol, they make it really easy to achieve HTTPS by default. So I'm not sure if you can see really well here, the contrast is not really good, but you can see the growth here. So this is the- It's quite well on the stream if you happen to have a device. So this is the start of the project. I can't actually see the years here, but as you can see it's growing, right, it has grown a lot, especially here. I'm not sure what happened here, but this is how many certificates they have given, right, and so it's very successful. Let's Encrypt is one of the most successful achievements to help secure the web. And how did they accomplish this? What can the confidential computing community learn from the Let's Encrypt movement? 
So here are some key ideas that I thought about, but you can also explore this, and this can be an open discussion as well. So first of all, make a campaign that brings awareness around the importance of encrypting data while in use, the same way that Let's Encrypt did this for data in transit. For us, HTTPS is just the default way. We can't think of any other way of doing this. Why would we use HTTP only, right, even for a blog or whatever? It just makes sense. Then adoption by CSPs, by cloud service providers. This is happening right now. All the major cloud service providers are really making this generally available. They're a bit expensive right now, but they should become more affordable in the future. Of course, all the hardware developers have made the technologies available. So Intel, and ARM is still going to release this, but you have AMD as well. They have invested a lot, sometimes even a decade, right, in the case of Intel SGX. We have to develop software that makes it really easy to deploy confidential computing. So one of the projects is Enarx. We make it really easy, we use WebAssembly to allow developers to deploy applications, and it's really nice if you want to check that out. We have to abstract the complexities. Confidential computing is really complex. You have to know about encryption, you have to learn about attestation, about the different models that exist to achieve confidential computing, and the software has to abstract all those complexities if you want to gain market share, right? It has to be CSP neutral, it has to be hardware neutral. Preferably the developer doesn't have to know if the application is going to run on AMD or Intel or whatever. It just should work. They deploy the application and, with confidential computing, everything will be encrypted and they don't have to care about attestation or whatever. It just works, just like Let's Encrypt. 
Promote open source software. I believe open source plays a very important role here, and the Confidential Computing Consortium has this as part of its mission, to promote open source software. That's why we adopted seven open source projects here. We have to make it affordable. Right now it's very niche, but we have to see that maybe in five years, this is just going to be the default way, right? And maybe eventually it might even be free. So we want to commoditize confidential computing at some point, right, to make this a reality by default. And so with that, I would like to thank you for hearing me. You can get in touch with us at the Confidential Computing Consortium using this email. And I invite you to join our tech meetings, which happen every other week, and also our outreach meetings. If you want to really learn or share your technical ideas, I recommend joining the tech committee, and outreach if you want to expand this idea and promote it. So thank you very much, and that's it. Do we have some time for questions? Yes, the first block is a little bit cramped, so there's a little bit less time for questions in the first block. But feel free to ask some, and I think you're also around for the rest of the day to answer any questions? So with TLS, I have the feeling that the infrastructure was already there, and it was just a problem of it being too complicated, and with confidential computing, as a developer, I don't really even know where to start. The infrastructure is coming. The CSPs are really adopting confidential computing. We heard some announcements, especially last year, of making this technology generally available, but really the prices need to drop, and we must make it really easy for people to adopt this. I currently can't use it on my laptop, for example. But confidential computing is mostly targeted at the server side, or the edge, or IoT. 
There is some, the Intel SGX for the PC or for the laptop, but in fact that was kind of deprecated. Intel is not supporting this anymore, but yeah. So Andrew, the next speaker, could maybe already set up while we have the next question. Maybe go ahead. Go ahead. Thank you for a very nice talk. You were talking about the definition of confidential computing that the Confidential Computing Consortium had decided on, and you mentioned that it took one year to add the word attested to this definition. So why was that, why wasn't it there in the first place? Yeah, so attestation was not part of the first definition. The reason why is because attestation is really complex, and maybe it wasn't given as much importance before, but attestation became a really big discussion afterwards, and in fact they created a special interest group around attestation, and that's when they decided to add the word attested to this definition. Okay. |
LSKV: Democratising Confidential Computing from the Core |
All right, our next speaker is Andrew Jeffery from University of Cambridge talking about LSKV. Hello. So yes, I'm Andrew Jeffery from University of Cambridge. My email's there if you want to email me about any questions that I can't answer today. As a brief kind of precursor, I come from the distributed systems world, not necessarily the confidential computing world, so this is kind of like a hybrid of both worlds here. So today we're going to talk about LSKV, aiming to democratise confidential computing from the core. So first of all, we've got to work out what this core actually is that we want to start replacing. And so we're going to start working with distributed key value stores. In particular, we're going to look at etcd. And as the etcd website defines it, it's a distributed, reliable key value store and, importantly, it's for the most critical data of your distributed systems. So etcd runs as a cluster. It's distributed, so you have this one leader node and you might have some more followers in this setup as well. etcd is also not alone. It's the core, so you have some applications around that. Some of those applications might be some sort of orchestration, so using Kubernetes on top is like one of the main candidates. Otherwise you might use M3 or Rook or CoreDNS or other applications that use etcd internally as well. So effectively, it's really widely used, it's quite a critical piece of a lot of infrastructure. And so you have to interact with etcd in some way, even if you're just using one of these services. And so primarily you use some key value operations. You can put, so you might write foo equals bar into the data store, and it keeps some history, so you kind of have this revision system. So when you do that first write, it'll be stored at revision five, and later writes will be six, seven, eight, going on like that. 
So after you've written, you can get something back out using range queries, and with this you can say, I'd like all keys between foo and foo five, so you would be able to get multiple keys at once. And you can also specify the revision here if you wanted to go back in time just to see what it was at some previous point. After you've read something, you might no longer need stuff, so you can delete it. You can also delete with a range as well, so you can delete, say, foo to foo five. Transactions are a nice ability here, so you can do some kind of conditional logic at the data store side. So you can use puts, ranges, and deletes, or even nested transactions, internally to do kind of bulk operations here, so you can say write foo two and foo three in the same revision. Additionally, you can have leases on top of the data store, so these can be used for building higher level primitives in distributed systems, and primarily you might want something like a leadership mechanism. One final thing here is the watch API that etcd provides. Similar to ranges, you can watch a range between a start and an end, and you can also watch from a certain point in history. So for watches, that revision is where you start watching from. So if you start at revision five, you'll be notified that foo equals bar, and then you'll be notified of the things in revision six, foo two and foo three, and everything that comes in after that as well while you keep the connection open. And this is just like a really core API that's used by lots of these other systems, so this is something we might want to mimic if we want to replace etcd. So etcd is a big system, and we want to run it somewhere. Primarily lots of people run it in the cloud. You don't always want to trust this cloud, because it's run by cloud providers. They might themselves be trustworthy, but the things that they're operating might not be. 
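As a rough illustration of the revision model described above (put, range, delete, transactions, and watch), here is a toy in-memory sketch. It is not how etcd or LSKV is implemented, and the class and method names are made up for the example. It assumes ranges are half-open, covering keys from the start key up to but not including the end key.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy model of a revisioned key-value store (illustration only)."""
    revision: int = 0
    history: list = field(default_factory=list)  # (revision, key, value or None)

    def txn(self, puts=(), deletes=()):
        """Apply several writes atomically under one new revision."""
        self.revision += 1
        for key in deletes:
            self.history.append((self.revision, key, None))
        for key, value in puts:
            self.history.append((self.revision, key, value))
        return self.revision

    def put(self, key, value):
        return self.txn(puts=[(key, value)])

    def range(self, start, end, revision=None):
        """Keys in [start, end) as they were at `revision` (default: latest)."""
        at = self.revision if revision is None else revision
        state = {}
        for rev, key, value in self.history:
            if rev <= at and start <= key < end:
                if value is None:
                    state.pop(key, None)
                else:
                    state[key] = value
        return dict(sorted(state.items()))

    def delete_range(self, start, end):
        return self.txn(deletes=list(self.range(start, end)))

    def watch(self, start, end, from_revision):
        """Replay events for [start, end) starting at `from_revision`."""
        return [e for e in self.history
                if e[0] >= from_revision and start <= e[1] < end]


store = ToyStore(revision=4)  # pretend four revisions already exist
store.put("foo", "bar")                              # stored at revision 5
store.txn(puts=[("foo2", "a"), ("foo3", "b")])       # both stored at revision 6
print(store.range("foo", "foo5"))                    # all three keys
print(store.range("foo", "foo5", revision=5))        # only foo, as of revision 5
print(store.watch("foo", "foo5", from_revision=5))   # events from revision 5 on
```

A real watch keeps the connection open and streams later events as they arrive. The replay here only covers what is already in the history.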
So if a hypervisor has a weakness, then you might get attackers going through to the lower layers and being able to access some of the hardware themselves. Clients that are interacting with your service might be within the cloud themselves, so they might have already accepted some of the cloud primitives, or they might just be outside of the cloud and having to use your service for some reason, so they might not want to use the cloud directly themselves. Additionally, they might not necessarily speak directly to your data store. Lots of things talk through a proxy, so if you're in Kubernetes, you might have kubelets and kubectl, and they speak through the API server, which basically terminates the TLS connections before parsing and doing some logic against the data store itself. And so today we're going to speak about two problems. Problem one here is this etcd cluster that's running in the cloud. If we're not trusting this cloud, then all of the data in memory is currently unencrypted, and so we want to be able to do something about that. And problem two is this proxy. If this proxy is terminating some TLS connections, we don't really want that to be able to happen. We'll actually see how the proxy can be a bit untrustworthy with our interactions. We want to be able to show when it's not being very trustworthy. So diving into problem one. So we've got this etcd cluster. etcd, like any kind of storage service, has some storage, it has some memory, and some processing application. It's also distributed, so we have some TLS connections between the peers. And effectively, as you can see in yellow here, we have some level of security. So recommended setups have etcd nodes communicating with each other over TLS. And we also can have this optional sort of file system encryption that gets put down to storage. One main problem with this is that all of those keys are in memory. So the TLS key that we were using for TLS is in memory, so the attacker's got that. 
And they've also got the private key for any file system encryption. So that basically renders our TLS connections and our storage encryption pretty much worthless if we can't trust that no one can access our memory. So if we actually swap out etcd for LSKV, which is our data store, we run LSKV inside of an SGX enclave, which gives us those confidentiality properties that were just mentioned in the previous talk, and then we can build some of these primitives to be a bit more trustworthy. So now that our memory is encrypted and integrity protected, we can store our TLS keys and file system keys there and trust that they're not going to be able to be accessed. The actual application itself is running in a secure enclave, so we can show you that it's not going to be able to be modified. With TLS connections, we can be sure that they're actually secure TLS because our TLS keys are in protected memory, and we can actually upgrade those to attested TLS where, rather than just trusting the application at the other end, we can make sure the other application is running in a secure enclave as well, so it's in the same environment that we trust. And finally, we've got this file system key in memory, so we can trust that anything we write to disk is actually going to be encrypted properly as well and safe. So this is like one nice solution to that problem one. So if we delve into what LSKV is a bit more, we can see that it builds on something called CCF. It runs in the SGX enclave, and like most services in the cloud, it will run on top of a hypervisor and has some memory and storage and other resources attached as well. So if we quickly jump into CCF, which is the Confidential Consortium Framework, it's a pretty nice project that basically splits up the interactions and the management of it into three distinct roles. So the first role is the operator. So this is the kind of cloud provider, the person who's standing up VMs for you. 
You might say, I'd like one LSKV node to run with, and then you might later want more to join the network. And so this operator is untrusted, so all they can do is stand those nodes up. They can't give those nodes access to the data in the cluster. The nodes don't auto join the cluster. That's the responsibility of the governance members, who we partly trust. There's a few of them, and so any interactions they do have to be done in some sort of majority way. And so these people will be responsible for things like, once a node has been stood up, checking the configuration of it, and finally accepting it into the network so it can start serving application requests and handle some of the data in the cluster. And finally, we have users that need to actually access the application that we're running. And so these are treated as kind of trusted towards the application, but the application itself can have internal access controls. And all the data is stored on an encrypted ledger that gets written to the disks. Governance requests are stored publicly in this ledger, and they're also signed so that everyone can see those and verify those. User interactions that go through the application are normally stored encrypted by default. You can make them public if you have certain use cases for those. So LSKV actually has an etcd-compatible API. This slide might look pretty familiar. It's basically the same API at the core. One asterisk is that the watch API at the bottom currently requires a patch on CCF because it hasn't got around to being merged in yet, but that's something that should be fixed and should be expected to work. So effectively, this basically means that we can switch out etcd for LSKV in most cases, solving problem one. So before we quickly go into problem two, we've just swapped out etcd for LSKV, and it might not always be as simple as this. So some quick trade-offs. 
LSKV is actually optimistically consistent in the way it replicates data, whereas etcd is strongly consistent. Otherwise we have the things we might expect: LSKV gives us confidentiality properties for our data, and it makes more operations transparent, so those governance operations we can see. It also has this etcd API at its core. It also has some extra features that we're not going to cover too much here. There's one later. So quickly on this optimistic consistency: rather than replicating data in a synchronous way, when you write to LSKV, it will replicate asynchronously, and in turn, you get an ID back. You can later follow up with this ID to ask, was this replicated properly or was it not? And this basically puts the decision at the client's side, so they can either be optimistic and say, I'm going to trust that that's fine, or they can come back later and say, no, I wanted to make sure that was replicated first. So this is just a key difference. If we go quickly on to problem two, we have this proxy that we might want to use to communicate with the data store, but in this instance, Alice wants to write 500 pounds into her account, but the proxy is going to intercept that and write that money into Bob's account instead. LSKV is none the wiser here. It's just got the request. It's going to process that request and return the response. The proxy in turn has an opportunity to return to the client and say, OK, Alice, we've written your money into the account. Now hopefully you can see here that Alice is not equal to Bob, and so she hasn't actually got the money. So LSKV gives us this receipt functionality where we can actually expose untrustworthy proxies. So when the client first does the request to write money into Alice's account, they can also ask for a receipt for that operation. This goes through the proxy. 
The proxy can rewrite the write request as before, so LSKV actually puts the money into Bob's account, but we still want to get that receipt back. So when the receipt actually comes back from LSKV, the client can detect that either the proxy's manipulated this receipt in some way, so it's no longer valid because it's signed, or it is valid, in which case it says that the money went into Bob's account, which is not what they wanted. So you can use this to flag to someone else that this proxy's not trustworthy, and you'd probably stop talking to it. So that's also how we can solve problem two quickly with LSKV. Just a quick summary of things here: we don't think current data stores are really suited for confidential operation, primarily looking at etcd. We don't think that lifting and shifting them into confidential environments gets you all the properties that you necessarily want. You get some kind of memory encryption and integrity just by running them in enclaves, but you don't get some of the other properties that we get from building on the ledger, and also from having a different trust model compared to what some systems have. We've introduced LSKV, which is a new confidential data store we've built on CCF, and it actually has an etcd-compatible API, which basically means that you can swap out etcd for LSKV in most cases. LSKV is also able to highlight these untrustworthy proxies using receipts, and yeah, it's kind of fast. Thanks to that optimism, it basically has a higher throughput and lower latency than etcd, so if you do replace etcd with it, hopefully the performance of your stuff will start increasing as well. And that's pretty much me, so I'm around for a few questions, and please check out the GitHub repo, it's open source and everything. And yeah, my email is there if you have any questions after the talk, so thank you. We definitely have time, so feel free to go ahead and ask your first question. 
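The receipt check from the talk can be sketched as below. This is a simplified stand-in, not LSKV's or CCF's actual receipt format: the function names are hypothetical, and an HMAC with a demo key stands in for the service's signature. In a real deployment the client would verify with the service's public key and would never hold a signing secret.

```python
import hashlib
import hmac
import json

# Demo-only stand-in for the service's signing key (assumption for this sketch).
SERVICE_KEY = b"demo-only-service-key"


def sign_receipt(executed_request: dict) -> dict:
    """Service side: issue a receipt over the request that was actually executed."""
    body = json.dumps(executed_request, sort_keys=True).encode()
    tag = hmac.new(SERVICE_KEY, body, hashlib.sha256).hexdigest()
    return {"request": executed_request, "tag": tag}


def check_receipt(intended_request: dict, receipt: dict) -> str:
    """Client side: detect a tampered receipt or a rewritten request."""
    body = json.dumps(receipt["request"], sort_keys=True).encode()
    expected = hmac.new(SERVICE_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, receipt["tag"]):
        return "invalid receipt"
    if receipt["request"] != intended_request:
        return "request was rewritten"
    return "ok"


intended = {"account": "alice", "amount": 500}
rewritten = {"account": "bob", "amount": 500}  # what the proxy actually sent

receipt = sign_receipt(rewritten)              # the service signs what it executed
print(check_receipt(intended, receipt))        # proxy passes the receipt through

forged = dict(receipt, request=intended)       # proxy edits the receipt instead
print(check_receipt(intended, forged))
```

Either way the client learns that something is wrong: a receipt that still verifies names Bob's account, and a receipt edited to name Alice's account no longer verifies.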
On the proxy side, the receipts: is that an API extension, or is that an API change? If I'm using the proxy and I switch to LSKV, do I need to change my application? So that's a question about the etcd API: does it include the write receipts by default? No, it does not. They're a separate gRPC service on top, so your clients would need to be modified for that. Either your clients, or you build that into a proxy, because the idea of the proxy is that it exposes a different API to a client, right? So if you do a write request to your proxy, the proxy could by default ask for a receipt and present that to the user as well; they might need some extra functionality to be able to verify it. I don't know how the receipt works, but what about extending the etcd API to have some sort of, I don't know, a nonce or a request for signature? Is that something that you're planning? Would it make sense to extend the actual etcd API? The question is, would it make sense to extend the etcd API for receipts? We don't really think so. We're not really planning on making changes directly in etcd, primarily because etcd has a different threat model and trust model, so some of the things that we protect against, etcd doesn't necessarily protect against. So that's one reason we're not going to start putting some of this back into etcd itself. It's designed to be a separate service. Yeah. Yeah. I'm wondering how the authentication would work. If I have confidential services, and only those should be able to access the confidential storage, how would the mutual authentication work here? Between, so do you mean... The etcd should just give out secrets to those confidential services, and the confidential services should know that they access the correct etcd storage, essentially. By etcd storage, do you mean a key or? I mean LSKV as an etcd replacement for the Kubernetes state. Okay.
So I think the question is about how you make sure you access the right etcd cluster from a confidential client? Yes. Yeah. So you'd be speaking through the API server in Kubernetes, right? Yeah. Or you're saying you want to connect directly? Maybe you have a confidential controller, or you connect directly, I guess. Okay. So I'm not sure if we can do that directly in LSKV. It has a cluster ID like a normal etcd cluster, so you can go about trusting that. But I don't think we've really worked through that use case directly. The main thing is that if you want to write a proxy around it, then you might want that in a proxy. Okay. Thanks. What are the things that LSKV has to be careful of as compared to etcd? In what context? Like, what are the things that LSKV needs to take care of that etcd doesn't take care of? Okay. Why do you need to do that? Yeah, yeah. So the question is about some things that LSKV supports that etcd doesn't. So one of the main things is storage. In etcd, everything gets written to disk right away, and that's part of the consensus process. Your request won't return until it's written to disk. In LSKV, that's not the case. We don't write to disk synchronously, which basically means that we're not trusting the host to necessarily persist our data or even give it back to us correctly when we ask for it. It's only used for disaster recovery scenarios. So that's one primary difference, which basically means that etcd is open to rollback attacks. So if you trust the host to store the data, they can cut some off, and when you load it back, you've got a subset of what you had before. Okay. Yeah? So I saw in the trade-offs that LSKV cannot have strong replication like etcd. So what was the reason LSKV had to make that trade-off? Yeah. So the question is about why LSKV does not have strongly consistent replication.
Primarily, if we're not trusting the host, then A, we don't want to write things to disk, like we just said. So that's one reason for that part. And then on the replication side, we're not trusting the host, so we don't want to block everyone on waiting to replicate everything. It does do the replication, and you can follow up with the ID from your operation. So if you do care about it being strongly replicated, you can just follow up with the ID and wait until it's committed. It's just giving the users a bit more flexibility in that operation choice. Okay. So I have one more; Tom will set up, as we're in the last session, but I have one more. I'm curious about your performance numbers, because a 50% latency loss I'd expect, but a three and a half times gain in performance is something I didn't expect. Can you say why that is, or how well it keeps things consistent? It's actually consistent. I mean, maybe I missed that. Yeah, of course you don't see the slide anymore. Oh, that's a good slide, yes. Yes. Great. Yes. So the question is about why it's so fast. Unfortunately, it was a consistency slide, but I think I missed it, but thank you. Yeah. I kind of won. Yeah. Even the slides are confidential now, but yeah, the slides are on there. One minute, second question. Otherwise. Yeah. So on one of the first slides, where you gave an example of a typical architecture, there were like simple types. Mm-hmm. Yeah, so primarily you can replace etcd with LSKV at the moment. If you took the clusters and swapped them, because the etcd API is still there, your things will still work. If you wanted to take advantage of the extra things like the receipts for the proxies, you would need to add some logic into your proxy or into your clients. Are there, I think, apps that support these receipts as well? No, there are currently no apps that support those receipts and extra bits at the moment.
We haven't got to that bit yet. The focus is just on the data store at the moment. Cool. All right. |
Keeping safety-critical programs alive when Linux isn’t able to
Using OP-TEE to deliver availability to applications in a Trusted Execution Environment. |
Next speaker is Tom Van Eyck, who is actually also from the same department that you and I are from. And he will, I think, be the only speaker that talks a little bit about Arm TrustZone, so take it away. Yeah, so I'm Tom. I'm a first-year PhD student at KU Leuven, and I'll be talking about the research we have done in the last year, year and a half. So to sketch a bit of context, we are doing research on cyber-physical systems. Cyber-physical systems are systems that interact with the real world, for example air compressors or robotic arms. These systems have a controller with safety-critical applications. However, in recent years, the industry has been moving to including a commodity OS like Linux to facilitate software updates and third-party applications for monitoring the cyber-physical systems. However, as you can imagine, to keep the safety-critical applications safe, you have to put them on a separate processor so that the commodity OS cannot influence them, but this becomes very expensive. So what industry wants to do is integrate both of these things on a single processor. So that's basically the requirement that industry has at this point, but a few problems result from it. The first is, I think, the most obvious. Whenever there's a bug in the commodity OS or an attacker, they can influence the execution of the safety-critical applications, which means that the availability of the safety-critical applications cannot be guaranteed, so the safety aspect is completely lost, which is not acceptable for these cyber-physical systems. The second issue that arises is that these safety-critical applications have real-time execution requirements. Although there is some support in Linux for real-time execution, this is mostly not what industry is looking for, so they want to have a real low-latency real-time scheduler as well.
And the third issue that arises is that industry also wants to share the peripherals between the safety-critical applications and the commodity OS, because these monitoring applications should be able to read out the peripherals. However, industry doesn't want them to be able to disable the peripherals, because then, again, availability isn't guaranteed. For example, if the train wants to brake and the peripheral for the brakes is disabled, the brakes cannot be applied, which is quite an issue. So out of this, our research question was formed: can we ensure availability for safety-critical applications while running a commodity OS on the same system with little developer impact? And to repeat, we need isolation of the critical applications, we need real-time execution of these critical applications, and we also need a transparent sharing system for the peripherals. As an aside, the threat model that we assumed was that there is a strong remote adversary with root privileges in the commodity OS who wants to launch a denial-of-service attack on the complete system, and we assume that the hardware, critical applications, and peripherals are trusted and everything else is not. So jumping into our first requirement, to isolate the critical applications, we chose to use Arm TrustZone, because it's integrated in high-end and low-end devices, embedded devices as well, and it has existed for quite some time, so the chance that industry already has a processor deployed with Arm TrustZone on it is quite high. So Arm TrustZone is actually just hardware-based isolation. It creates two worlds, as you can see, the normal world on the left and the secure world on the right. It does this by defining two security states with their own address spaces. In that way, we can ensure confidentiality and integrity for code and data in the secure world, because the hardware blocks any access to the address spaces of the secure world coming from the normal world.
We use OP-TEE, which is an open-source TEE implementation for Arm TrustZone, and it works together with Linux in the normal world. So, the architecture: you can see again the normal world on the left and the secure world on the right. All the grey boxes are the boxes that were already there in OP-TEE OS and Linux, but the white boxes we added. So requirement one, of course, is Arm TrustZone. For requirement two, we added a secure scheduler and a secure interrupt. For requirement three, we added a driver in the normal world and a secure driver in the secure world, and then we also have developed a use case where we monitor the Linux kernel, and that's also in the secure world, but I'll talk about that later. So for the real-time execution requirement, we basically need two things. We need a periodic interrupt, and we need a scheduling system. For this periodic interrupt, we use a hardware timer on the board. It's very simple. We set the interrupt to be the highest priority of the complete system, and we protect it from the normal world so that the normal world cannot disable it or reconfigure it. So when an interrupt is triggered by this hardware timer, it gets caught by OP-TEE OS. OP-TEE OS checks if it's a scheduling interrupt. If so, it passes execution on to FreeRTOS, which is a well-known, relatively small real-time operating system that supports task prioritization and preemption, which is very useful for industry. So whenever FreeRTOS gets control, it will schedule its tasks, and after all tasks have executed, control will be given back to OP-TEE OS so that the system can function as normal. And then for requirement three, we have... so this is obviously the normal way that an application at user level would interact with hardware peripherals. However, these peripherals need to be in the secure world, so we also need to move part of the driver into the secure world. This is often called driver splitting.
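The scheduling interrupt path described above can be modelled in a few lines. This is a toy simulation, not OP-TEE OS or FreeRTOS code: the interrupt number, the `Task` type, and the function name are hypothetical, and tasks here simply run to completion in priority order, so preemption is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SCHED_IRQ = 29  # hypothetical number for the protected hardware timer interrupt


@dataclass(order=True)
class Task:
    priority: int  # higher number means more urgent in this sketch
    name: str = field(compare=False)
    run: Callable[[], None] = field(compare=False)


def handle_secure_interrupt(irq: int, tasks: List[Task], log: List[str]) -> None:
    """Model of the secure-world handler: only the timer tick reaches the scheduler."""
    if irq != SCHED_IRQ:
        log.append("handled by OP-TEE OS")
        return
    # Stand-in for handing control to the real-time scheduler:
    # run every ready task once, most urgent first.
    for task in sorted(tasks, reverse=True):
        task.run()
    # After all tasks have executed, control goes back so the rest of the
    # system (including the normal world) can continue.
    log.append("control returned to OP-TEE OS")
```

Because the tick is the highest-priority interrupt and the normal world cannot mask it, this handler would run every period regardless of what Linux is doing, which is what gives the critical tasks their guaranteed slot.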
We basically introduce a secure driver in the secure world at kernel level, which is the liaison between the normal world and the peripheral. To keep the developer effort minimal, this secure driver should not contain a lot. In fact, very little. Only the hardware accesses need to be put in the secure driver, but you can also put some security policies in there. So for example, if a user application in the normal world wants to read something, it will be allowed. But if it isn't allowed to do something, for example to disable the peripheral, the secure driver is able to just stop the request and nothing will happen. The secure driver may also include some logic to share the access between the normal and the secure world. For example, if you have a screen as a peripheral, the secure world's content will always be displayed on top of the normal world's content. And the nice thing about creating such a system is that only the developers of the driver at kernel level in the normal world need to care about any changes made to the system. As far as the user-level applications know, nothing has changed, which is very useful for industry as well. Of course, you need read and write access, and this is given by a set of APIs included in OP-TEE OS called the GlobalPlatform APIs. These are a standardized set of function calls to facilitate calling into the secure world, providing data and getting data back. We've measured this to take on average 123 microseconds, which is plenty fast enough for industry. But of course, secure peripherals might also trigger an interrupt, and this interrupt must also be returned at some point to the normal world. So for this, we developed a notifier system that consists of two parts, one in the normal world and one in the secure world. So what happens is that if an interrupt is triggered at the peripheral, it will get forwarded to the secure world notifier by the secure driver.
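The policy check inside such a secure driver can be sketched like this. It is a simplified model, not the authors' driver: the operation names, the register map, and the policy set are assumptions chosen for illustration.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Op(Enum):
    READ = "read"
    WRITE = "write"
    DISABLE = "disable"


# Hypothetical policy, known at design time: the normal world may only read.
NORMAL_WORLD_ALLOWED = {Op.READ}


class SecureDriver:
    """Toy secure-world driver that owns the peripheral's registers."""

    def __init__(self, registers: Dict[int, int]) -> None:
        self._registers = dict(registers)
        self.enabled = True

    def handle_request(
        self, op: Op, addr: int, value: Optional[int] = None,
        from_secure_world: bool = False,
    ) -> Tuple[bool, Optional[int]]:
        """Return (allowed, result). Denied requests change nothing."""
        if not from_secure_world and op not in NORMAL_WORLD_ALLOWED:
            return (False, None)
        if op is Op.READ:
            return (True, self._registers.get(addr))
        if op is Op.WRITE:
            self._registers[addr] = value
            return (True, value)
        self.enabled = False  # Op.DISABLE, only reachable from the secure world
        return (True, None)
```

Keeping the driver this small is the point made in the talk: only the hardware access and the policy live in the secure world, while everything else stays in the normal-world kernel driver.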
Then the secure world notifier will trigger an interrupt in the normal world, which will be caught by the normal world notifier, and this will forward it to any driver in the normal world that wants to know that such an interrupt has happened, using a publish-subscribe system. So now we've solved all three requirements, but then we got to thinking, what can we do with this? So we developed a use case where we monitor the running state of the Linux kernel, whether it has crashed or not. We adopted a very simple system to do this. We basically challenge the Linux kernel using a notification from the system we just built, and we expect the response back within a certain time frame. If we get the response back, Linux is alive; otherwise it's not. It's as simple as that. The things that we can do with this are, however, more interesting. Whenever Linux doesn't respond in time and we know it's dead, we can, from the secure world, dump the kernel state and normal world memory, and we can even reboot the Linux kernel while still keeping the safety-critical applications running. So we did that. I will show a demo where we reboot Linux whenever the monitor in the secure world notices that the Linux kernel is dead. So first, to go back a bit, we store the kernel image at boot time, because then we know that Linux is in a good state and it's up and running, and we need access to the normal world file system. So we get the image, we store it in the secure world, and we protect that memory from the normal world so that no access from the normal world is possible anymore. So then when Linux crashes, we notice this. We disable all the cores except for our own, because we are on a multi-core system. We write the image back again to the normal world, and then we just jump to the kernel start address. I left some tricky things out, because OP-TEE OS needs to do some resetting of its own systems as well, but that's not that important.
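The challenge-response liveness check described above might look roughly like this. It is a sketch under stated assumptions: the deadline value, the idea of echoing a nonce, and all names are hypothetical, since the talk only says that a response is expected within a certain time frame.

```python
from typing import Callable, Optional, Tuple

DEADLINE_MS = 100  # hypothetical; the talk does not give the actual deadline

# A challenge function models the round trip into the normal world. It returns
# (echoed_nonce, elapsed_ms), or None if no response ever arrives.
Challenge = Callable[[int], Optional[Tuple[int, int]]]


def check_once(challenge: Challenge, nonce: int, deadline_ms: int = DEADLINE_MS) -> bool:
    """One liveness probe: Linux is alive only if it answers correctly and in time."""
    reply = challenge(nonce)
    if reply is None:
        return False
    echoed, elapsed_ms = reply
    return echoed == nonce and elapsed_ms <= deadline_ms


def monitor(challenge: Challenge, rounds: int, on_dead: Callable[[], None]) -> bool:
    """Probe repeatedly; on the first missed response, trigger recovery and stop."""
    for nonce in range(rounds):
        if not check_once(challenge, nonce):
            on_dead()  # e.g. dump normal-world state, then restore the kernel image
            return False
    return True
```

In the system from the talk, the recovery callback is where the secure world would stop the other cores, write the stored kernel image back, and jump to the kernel start address, all while the real-time tasks keep running.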
So I have a demo that basically demonstrates this. So again on the left, if you can see it clearly, we have the normal world; it isn't very important what you can see there. The most important thing is on the right, in the secure world. You can see at the top the output that the monitor is giving every 500 milliseconds. So every 500 milliseconds it's sending a challenge to Linux and getting a response back. If there's a response, it's obviously green. So now we go into Linux and we cause a crash in the system. A very simple crash, a kernel panic. We immediately see that the monitor notices this. It will start the rebooting process, and keep in mind that the secure world is still executing its tasks in a real-time fashion with a given, known latency. And after Linux has rebooted, we again see that the monitor notices Linux is alive. And after we wait a bit, we again get a shell, which we can use like on any other Linux system. Yeah, that's the demo. Thank you. So then to conclude, again, the research question. Can we ensure availability for safety-critical applications while running a commodity OS on the same system with little developer impact? We do believe so. We did this by leveraging TrustZone isolation to isolate the critical applications. We introduced a secure scheduling system with FreeRTOS. And we also introduced a transparent peripheral sharing system. We have some documentation online. I put it up as a tutorial, but you need this board to be able to run it. It's still ongoing research, and we will update this tutorial whenever we get new stuff in our research. You can also look at the documentation for OP-TEE as well. And if you have any questions at any time, just contact me at this email address. I'd be happy to answer them. So that was it. I hope you enjoyed the presentation. And if you have questions... Yeah. So one of the problems I see with this approach is that you move the device to the secure world.
You basically have a shim layer to make the kernel talk to the device you moved to the secure world. The problem is, and we've been discussing this for a while, that you expose a bigger attack surface, which is the drivers, basically breaking the main assumption in OP-TEE that we don't trust Linux, right? So you have a buffer that feeds into your driver and you kind of start to trust that buffer because it ends up in hardware, so there needs to be some kind of arbitration or sanitization process during that. Have you thought about this? Yeah, of course. It is indeed the case that if you move a driver into the secure world, you increase the attack surface considerably, and we thought about it, and we came to the conclusion that it is indeed a problem. However, if you have security policies which you know in advance, so for example, we know that a peripheral can be read by the normal world but not written, because of course you designed the system like so, you can see, based on the actual requests that are sent by the normal world, whether a request is allowed or not. So if it's, for example, a write request to an address, we know we cannot allow this. However, if it's a read request, we know we can just execute it and return the data. If you have a hardware vendor who will just listen and build hardware that handles those memory address cases, you should be fine. Yeah, but that's not the case, sadly enough. The thing is that there's an RFC that we haven't been able to reason about. We don't have a watchdog like you have; what we do have is that, once the kernel comes up, we measure portions of the text area of the kernel. We take a hash and then we periodically, randomly check that it hasn't changed. Now, arguably, that's not a very strong defense against recent, you know, ROP attacks and stuff like that, but you should check that out at some point.
Yeah, we are considering this for other research projects that we are running at the same time. It's a question that keeps coming back. How do we authenticate the normal world or the Linux kernel to the secure world? How do we make sure that it is running correctly? Because the kernel, when it comes up, basically changes a bunch of its own code. Yeah, indeed. Yeah. After some point, in this case, you arguably trust what happens. Yeah, indeed. And there is some research going into attestation, so not remote attestation but local attestation. And also some continuous attestation, but it's still a Linux kernel, which is very difficult to attest. The last thing is the kernel itself, right? So when you write it to memory and then reboot it, you need to have some kind of cryptographic check, right? Because the thing in the whole chain of trust is, you know, you boot with EFI, which verifies your kernel, and then you attest it. But if you load the kernel and jump to an entry point, first of all, you need to cryptographically verify what you're doing. Yeah, of course, yeah. And there's been some code and development going into EFI where, if you jump to the kernel entry point and not the EFI entry point, you lose a bunch of security services. For example, ASLR. Yeah, of course. So this was a proof of concept indeed. If you just write the kernel image back to the normal world and jump to it, it's of course a big problem. So if you want to actually build such a system in a secure way, you will need to check your image at boot time when you store it in memory. Once you've checked it and stored it, you know that it's safe, and then whenever you want to reboot, you can set up the normal world completely so that it is again in a well-known secure state like at boot time, and then write the image. We can talk. Of course, yeah. There are more problems, right? Because for example, if you boot like that, ASLR won't work properly, at least in our system. Okay.
Okay, we'll have a good discussion. Yeah. Yeah, go ahead. Maybe a follow-up question to this initial observation about the security aspects of sharing responsibility for the peripherals. Another reason why you would want to partition the peripherals rather than give complete control to the secure world is also performance, right? Because essentially now you're saying that for any peripheral access, you need that to go through the... Yeah, so there is... There is some latency number, but did you evaluate the overall performance degradation on Linux? Yes, indeed we did. I did not include the slide. Ah, yeah. So the question was whether we evaluated the complete latency of sharing peripherals between the two worlds. Yeah, we did indeed. I don't have a slide for the graphs, but if you look at these numbers, you can see that to go to the peripheral from the normal world takes around 103 microseconds. This is, however, with a standard call in OP-TEE OS, and there is also something like a fast call, and then you will get six microseconds of latency, which is very fast, certainly fast enough for the requirements set by our industrial partners. And the 68 microseconds to go back is also quick enough for these systems, because these systems often have a control loop of one millisecond. So that means that, yeah, even with these numbers, you have like 95% or above of the original performance or execution time in the normal world. So we think this is a cost that we can totally accept in developing these systems, just because it gives us so many benefits on the security level. Is that good? Okay. Yeah. So the question was, how does this system using a trusted execution environment compare to a hypervisor implementation? At our research group, DistriNet, we are working with very low-end embedded devices. So this is a proof of concept on quite a high-end processor, where it would be possible to do it in a hypervisor.
I don't know how to do it because I haven't taken a look at it that closely, but on these very low-end embedded devices, it's mostly not useful, or it's even damaging for the lifetime of the device, to use a hypervisor, because you have quite some overhead and a very limited battery lifetime. You have a very limited energy budget. So on these lower-end processors, there is an Arm TrustZone implementation that is also very energy efficient. So if we would, of course, not use OP-TEE OS and Linux, but take the same principles and apply them to such a chip, we believe we could stay within the energy budget that these low-end devices have. And we don't think it's something we can immediately do with hypervisors, but of course, that's not my research area, so that's something that can, of course, be very interesting to research. There was a question in the back. Is there an instance of this hardware that we can use in the cloud to try it out? I'm afraid not, so there is no instance in the cloud that is available to try right now. However, there is, I don't know if you know QEMU, possibly you do. It does support Arm virtualization. So we actually first started using that to develop our system, but we very quickly moved on to a hardware device because of our industrial partners. So the code, as is, won't immediately run on QEMU, but if you change the interrupt numbers and the things that are different, it should be relatively easy to reproduce the same results on a virtualized system as well. Just to continue on the question of virtual machines versus OP-TEE. I think one key difference is that OP-TEE is actually a real trusted execution environment, in the sense that, compared to a virtual machine where the hypervisor would have access to the address space of the protected application and its control flow, with OP-TEE, it doesn't. Anything that runs in the non-secure world does not see that physical address space and cannot access it.
Well, I guess, for example, our question is like the Siemens software controllers, where actually the hypervisor is not Windows or Linux, it's a custom hypervisor, and that also has access like proper. So if it is a hypervisor that uses hardware virtualization, it does have access to the address space of whatever VM is running on the machine. If it's based on the Arm64 virtualization extension, it will have access to the address space. With OP-TEE, the address space is only accessible to the secure world. It's controlled by the secure monitor, which is running at a very privileged level, one that is more privileged than the hypervisor. So the hypervisor does not get access to the secure application, whereas with a hypervisor, the hypervisor could just mess with the nested page tables and do whatever it wants, including accessing the secure peripheral data. Maybe to cut in here then: I love this, and that's what the coffee breaks are for. So please continue this discussion, but just for the general audience, we have like a 10-minute break now. I won't call it a coffee break, because there's not enough time to run downstairs; you wouldn't come back up. In 10 minutes, we will continue with the next talk, but use the break exactly for this type of one-to-one discussion and continue it in the next 10 minutes. One thing I wanted to pitch in. |
Open Source Confidential Computing with RISC-V |
So before the 35 minutes, you mean? So you're scheduled to stop at 14:45. So you give me a heads up. You can do it at 35. OK, yes. Sounds good. OK. All right, people. Let's do the second block today. I'm very happy. I'm actually excited about this talk. So Samuel from Rivos will talk about what's going on in the RISC-V landscape. And I think, yeah, I'm excited for the next big step in our community, right? From open source software to open source hardware. So take it away, Samuel. Thank you. Thank you. So yeah, I'm Samuel. I work for a company called Rivos. It's a startup that does RISC-V things. And today I'm going to talk about confidential computing with RISC-V, and how we want to implement, well, an open source implementation of confidential computing. The previous talks mentioned things like OP-TEE. Some of them mentioned things like SGX or SEV. Those are all hardware implementations of the security attributes that the first talk covered. Confidentiality, protection of memory, confidentiality of data in use. And this talk is really about how we want to achieve that with RISC-V. And the difference between the RISC-V implementation and all other existing implementations is that everything is done in the open. Everything is open source. And everyone here in this room is free to come and help and contribute to that implementation. So that's why I think it's interesting. Hopefully I'm not wrong. OK. Who was in the RISC-V dev room before? OK, so that's needed. RISC-V, what is RISC-V? RISC-V is a free and open ISA, not an open source ISA, because there's no source. It's an ISA, an instruction set architecture. So it's free. Everyone can use it and can build a CPU out of it without paying any license fees or anything like this. Actually, everyone is free to take half of the specification and implement some weird CPU. It doesn't matter. You can take whatever you want out of this specification.
And it's open in the sense that everything is defined in the open. So once the specs are frozen, they get ratified and accepted by the RISC-V International foundation. Once they're ratified, some modifications can still be added, but it's more difficult. But between the time they start to be specified and the time they are ratified, everything is open. So it's on GitHub. You can go and put some comments and some pull requests on CPU specifications that are actually used in the real world. So it's quite interesting. And yeah, the specifications are released under an open source license. There are two volumes for the specification. It's fairly small. It's actually 300 pages, which is, I think, almost the same number of pages that x86 uses for documenting the move instruction. So it's a good comparison. So yeah, it's very small. It's easy to read. Just go ahead and grab it. And yeah, the spec is split into the unprivileged and privileged specifications. And I'm going to talk about this next. Why is the RISC-V ISA interesting? So first of all, it's simple, as I just said. If you read the specification, there is no micro-architectural dependency. So the specification tells you what the ISA must look like. It doesn't tell you how it must be implemented. So everyone is free to go and implement the ISA the way they want. There is no dependency on a specific implementation. And probably this is why it's small, or actually smaller. It is modular, so it's the same specification for everyone. RV32, RV64, and it's the same specification for the developer boards that you can find on the market and the upcoming, like the Ventana, multicore, actually massively multicore, SoCs. It's the same spec. So it's modular. Everyone uses the same thing. And it's stable. So there's a base ISA and a set of standard extensions that are frozen.
That means that you can rely on this to implement your CPU, and you'll be able to run whatever applications are using those extensions. Those are frozen; they're not going to change. And if they change, they change in a backward-compatible way. And extensions are optional. So you don't have to implement all extensions to be called a RISC-V CPU. And this here is the base ISA. So that's the entire base ISA. This is small. It's very small. It's easy to read. Oh, kind of. Not on that slide, but it's easy to read and it's small. I talked about the spec being split between privileged and unprivileged parts. And I'm going to talk about privilege modes, which is what is defined in the privileged specification. I'm going to talk about this because it's really relevant to the confidential computing implementation. So there are three basic privilege modes for a RISC-V CPU to run in: user mode, supervisor mode, and machine mode. And you switch between those modes through instructions: ECALL, MRET, and SRET. So if your CPU is running in user mode, which is typically an application, you make an ECALL, which is a syscall, basically. So to implement a syscall, you're going to use the ECALL instruction. And if you're in the kernel and you need firmware services, you're going to make another ECALL, and you go down in the privilege level and you're more privileged. To go back, to go up and move to a less privileged world, you're going to call MRET from the firmware world, from machine mode. And you're going to call SRET to get back from a system call. And as I said, those modes actually map to real use cases, what we're typically used to. So user mode is the application mode. Supervisor mode is where your kernel is going to run. And machine mode is where your firmware, an EFI kind of thing, a UEFI kind of thing, is going to run.
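The mode switching described above can be captured in a small state machine. This follows the simplified picture from the talk, where an ECALL from user mode lands in supervisor mode and an ECALL from supervisor mode lands in machine mode; on real hardware, traps go to machine mode unless they are delegated. The `Hart` class and its saved-mode dictionary are illustrative stand-ins, not part of any RISC-V tooling.

```python
from enum import IntEnum
from typing import Dict


class Mode(IntEnum):
    U = 0  # user: applications
    S = 1  # supervisor: the kernel
    M = 3  # machine: firmware


class Hart:
    """Toy model of one core moving between privilege modes."""

    def __init__(self) -> None:
        self.mode = Mode.U
        # Maps a trap target mode to the mode it should return to. This is a
        # stand-in for the previous-privilege fields kept in status registers.
        self._saved: Dict[Mode, Mode] = {}

    def ecall(self) -> None:
        """Request a service from the next more privileged mode."""
        target = {Mode.U: Mode.S, Mode.S: Mode.M}.get(self.mode)
        if target is None:
            raise RuntimeError("no more privileged mode in this model")
        self._saved[target] = self.mode
        self.mode = target

    def sret(self) -> None:
        """Return from supervisor mode to whoever made the call."""
        self._return_from(Mode.S)

    def mret(self) -> None:
        """Return from machine mode to whoever made the call."""
        self._return_from(Mode.M)

    def _return_from(self, mode: Mode) -> None:
        if self.mode is not mode:
            raise RuntimeError(f"not running in {mode.name} mode")
        self.mode = self._saved.pop(mode)
```

Walking a hart through a system call and a firmware call and back shows the nesting: U to S to M on the way in, then MRET and SRET on the way out.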
One very important thing for the confidential computing implementation is the two additional modes, actually three additional modes, that have been added with the hypervisor extension. So there is an extension to the base RISC-V ISA. It's called the H extension, H as in hypervisor. And this is an extension that's been added for supporting virtualization and is frozen, so it's something that is not going to change. So the modes that have been added are the HS mode, the VS mode and the VU mode. So you can see in this diagram, you can run your application as usual in U mode. And then you're going to have your hypervisor, your host kernel. When the extension is enabled, it's going to run not in S mode but in HS mode, so hypervisor-extended supervisor mode. This is where your Linux KVM or Xen kind of thing is running. And then when you're going to create the virtual machine, the virtual machine is going to be split. If it's a full Linux virtual machine, it's going to be split into two different modes: the VU mode, the virtualized user mode, and the VS mode, the virtualized supervisor mode. So your guest kernel is going to run in virtualized supervisor mode and your guest applications are going to run in virtualized user mode. Okay? All right. So confidential computing. I just did like a crash course in five minutes of RISC-V. So I hope this makes sense. But anyway, I needed to do this to kind of explain where we want to go with confidential computing on RISC-V. So what we're defining currently in RISC-V for confidential computing is called the AP-TEE RISC-V specification. AP-TEE as in application processor trusted execution environment. So it's a technical group where everything, again, is open. So there's a GitHub repo for this technical group. All the specifications are there, the discussions, the meeting notes, everything. And it is not ratified yet, not frozen. So this is a work in progress. So again, feel free to come and join and help and provide some feedback on that specification.
But it is aimed at becoming the reference confidential computing architecture for RISC-V. So it's currently in a pretty late state. It's going to be, not ratified, but accepted pretty soon, in a few months. But it's going to be the reference confidential computing architecture for RISC-V. It's not an ISA specification. So we don't add to the RISC-V set of instructions and architectural definitions. But we do identify a few ISA gaps. For example, what we call the confidential memory attributes, which I'm going to talk about later. And just to clarify things, because we talked about OP-TEE, for example, there's an implementation of OP-TEE for RISC-V. The AP-TEE specification for RISC-V is not aiming at the same set of use cases. AP-TEE is really trying to support the same use cases as TDX, for those who are familiar with TDX, or SEV, for those who are familiar with this AMD technology. And basically, this specification is defining a new class of trusted execution environment for RISC-V. And this new class is trusted virtual machines. So same as TDX, same as SEV. The goal is really to run full-blown virtual machines in a confidential computing environment, where you will have memory and data confidentiality and integrity, as explained in the first talk. And the goal is really for people to take their existing workloads, their existing virtual machines, their existing Kubernetes nodes, and move that into a confidential computing TEE, the same way they're doing this, or aim at doing this, with SEV or TDX. So there are really two different sets of use cases, and AP-TEE is aiming at this specific set of use cases. So there are a few architecture components that I'm going to talk about. There's an AP-TEE bit per hart. Sorry, I didn't mention this, but a hart (H-A-R-T) in RISC-V terminology is actually a CPU core. It's a core, it's called a hart.
There are a few components that I'm going to go through: the security manager, the TSM driver, there's a dependency on the hardware root of trust, and there's a structure, a non-ISA-specified structure, called the memory tracking table. And to go through all these components and kind of explain what they are and how they're put together to reach the goal of memory and data protection and integrity guarantees when data is in use, I'm going to take an example of how, from a cold start of a RISC-V SoC, we could actually build a trusted virtual machine with the confidential computing architecture that I'm trying to describe. Okay, so we have a RISC-V SoC with a few components that are mandatory. We need an IOMMU, we need a root of trust, we need an MMU obviously. This is all dependent on the H extension on 64-bit RISC-V. It's basically RISC-V GC, which is the general-purpose specification plus compressed, but we don't need compressed, it's just the G part. But yeah, it's a full-blown 64-bit RISC-V SoC that's running there with an IOMMU. We do need and mandate the presence of a hardware root of trust and we need some sort of memory protection. So an MMU, a memory checker, something like this. The first thing that the root of trust is going to measure and load is called the TSM driver. So that's the first component of this confidential computing architecture. And the TSM driver is the trusted component that runs in M mode, in firmware mode, and that's going to split the world into non-confidential and confidential, okay? And the TSM driver is, yeah, a confidential world switcher, and it's the component that basically toggles a bit in the RISC-V SoC, the AP-TEE bit, to tell if the hart is currently running in confidential mode or non-confidential mode. So there is an AP-TEE bit that is part of the specification that tells at any point in time if a specific RISC-V core, a RISC-V hart, is running in confidential mode or non-confidential mode.
And the TSM driver is the component that's going to make that switch, the component that is going to toggle that bit. So it's part of the TCB, it's a trusted component, a trusted software component, that runs in M mode and does that. And basically, the TSM driver is going to switch from, for example, non-confidential to confidential when something in the non-confidential world, like a VMM or KVM or your Linux kernel, is sending a specific TEE call, which is an ECALL, basically a call that allows you to move from supervisor mode to machine mode, so basically from the Linux kernel to the TSM driver. The TSM driver is going to trap this, and then it's going to toggle the AP-TEE bit, which means it's going to atomically switch the CPU into confidential mode, and then it's going to move to something called the TSM, the trusted security manager, the TEE security manager, sorry. And to do that, it calls the MRET instruction and moves to the TSM. So we are in the kernel, the kernel makes an ECALL, the TSM driver toggles the CPU from non-confidential to confidential and then starts running the TSM, and we're going to talk about the TSM next. And this is what the TSM driver is mostly about. I'm going to talk about the TSM right after this, but the one very important thing that the TSM driver manages is called the memory tracking table. The memory tracking table is a piece of memory, and the structure of this memory tracking table is not specified in the confidential computing specification. It is up to each implementation to decide what it puts in this memory tracking table. What the spec tells you is what this memory tracking table is for, and this is what I'm going to explain now. And just to step back, the memory tracking table lives in confidential memory. So the memory tracking table lives in a piece of memory that is protected from the non-confidential world, which cannot see or tamper with it.
So it's encrypted, integrity-protected memory. So the memory tracking table enforces the confidentiality memory attribute for each and every page on the system. So it's what we call a PMA page tracker. It defines if any memory page is confidential or not. So you take a physical address, you give that to the MTT, to the memory tracking table, and the MTT tells you if this address belongs to a confidential page or a non-confidential page. So with this memory tracking table, anytime, for example, the non-confidential world is trying to physically access a page, the memory tracking table is going to be used by the CPU to actually check if this page is confidential or non-confidential. If you're trying to access a confidential page from the non-confidential world, if you're trying to read memory of your trusted virtual machine from your VMM, from your QEMU, from your KVM, then the memory tracking table is going to tell you this is a confidential page, and that's going to generate a CPU fault. And it gives you memory protection. Depending on how you want to implement memory encryption, basically, to protect your memory, the memory tracking table will be able to tell you which key you need to use to encrypt or decrypt that physical page. And you can decide how you want to implement this, how many keys you want to support, if you want to have one key per TVM or multiple keys. It's up to the micro-architectural implementation of the specification to decide what it does with it. Okay, so the TSM driver manages the memory tracking table, which gives us memory protection and integrity. And the next thing the TSM driver is going to do is load and measure the next component, the next trusted component, which now runs in a less privileged mode: the TSM, the TEE Security Manager. The TSM lives at the same level as the Linux kernel, as KVM, as the hypervisor, basically. But it lives in the confidential world.
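The memory tracking table check described above can be sketched in a few lines. The specification deliberately leaves the table's layout to the implementation, so the set of page frame numbers used here, and the names `ToyMtt`, `check_access` and `AccessFault`, are assumptions made only for this illustration.

```python
PAGE_SIZE = 4096


class AccessFault(Exception):
    """Stands in for the CPU fault raised on a forbidden access."""


class ToyMtt:
    """Toy memory tracking table: one confidentiality flag per physical page."""

    def __init__(self):
        self._confidential_pages = set()  # page frame numbers (hypothetical layout)

    def set_confidential(self, phys_addr, confidential):
        pfn = phys_addr // PAGE_SIZE
        if confidential:
            self._confidential_pages.add(pfn)
        else:
            self._confidential_pages.discard(pfn)

    def is_confidential(self, phys_addr):
        return phys_addr // PAGE_SIZE in self._confidential_pages


def check_access(mtt, phys_addr, hart_is_confidential):
    # A hart running in non-confidential mode must never reach a confidential page.
    if mtt.is_confidential(phys_addr) and not hart_is_confidential:
        raise AccessFault(f"non-confidential access to {phys_addr:#x}")


mtt = ToyMtt()
mtt.set_confidential(0x8000_0000, True)
check_access(mtt, 0x8000_0123, hart_is_confidential=True)   # TSM or TVM: allowed
check_access(mtt, 0x9000_0000, hart_is_confidential=False)  # ordinary host page: allowed
```

A real table would also carry whatever the implementation needs for encryption, such as which key protects a given page; that part is left out here.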
It lives and runs out of confidential memory, and it only runs when the RISC-V CPU is running with the AP-TEE bit on, which means it's running when it's in confidential mode. So the TSM, I don't know if people here are familiar with TDX, but there are some similarities here for those who know TDX, unfortunately. So the TSM is the TEE Security Manager. It's a trusted piece between the host VMM and the TVM. So the TVM is the trusted virtual machine that we're trying to build through those steps. And nothing from the non-confidential world can actually touch a trusted virtual machine without going through the TEE Security Manager, the TSM. One very important thing that the TSM does is it manages all the second-stage page tables. So the page tables that allow you to translate TVM physical addresses to host physical addresses, those are managed by the TSM in confidential memory. So with the confidential computing implementation, KVM no longer manages the second-stage page tables for the trusted virtual machine. It's all handled by the TSM, which is trusted, in confidential memory. So that's a very important piece of the TSM. And something really important to understand is that it is a passive component. It implements security services that are going to be called by the host VMM. It doesn't run by itself. It's not something that schedules TVMs or handles interrupts, it doesn't do anything like this. It just replies to security requests that are coming from the host. The host is in control of the machine. It's not in control of the trusted virtual machine. It needs to go through the TSM. And the TSM is only responsible for this: getting security requests from the host, from the host VMM, and replying to them. And we do have an open source implementation for this. It's called Salus. It's on GitHub again. And it basically implements everything that I just described, plus a lot more. It's all in the specification and it's all open source. So go there.
The TSM also manages the MTT. So whenever the TSM adds a page to a trusted virtual machine, it's going to add entries to the MTT, and it's a little bit more complicated than this because it needs to go through the TSM driver. But basically the MTT is something that is owned by the TSM driver and by the TSM. Okay, so the TSM driver started. It loaded the TSM. At some point we have a host OS, a Linux kernel with KVM, that starts. It boots some non-confidential virtual machines. And at some point someone is going to be starting a trusted virtual machine, a virtual machine that runs in the confidential world. And to do that, there's a set of ABIs between the host VMM on the left, the non-confidential world, and the TSM. And that goes through the TSM driver. The TSM driver is the trusted piece that actually proxies each and every request from the non-confidential world to the confidential world, to the TSM basically. And those are called the TEE host ABIs, because it's a set of binary interfaces that are called from the host to actually manage and request security features from the TSM. Everything is proxied through the TSM driver. So the TSM driver traps the host sending ECALLs, SBI calls, and basically it traps the calls from the host VMM, from KVM, for example, and it then schedules the TSM to actually run and handle those calls. So a few examples: creating and destroying a TVM context, converting non-confidential memory to confidential and vice versa, mapping pages from the non-confidential world to a TVM. All those security features are requested by the host VMM, by KVM, and they are managed by the TSM. So obviously we don't want KVM itself to actually take a page and add that to the TVM, to a trusted virtual machine's address space. It has to go through the TSM, which manages all the page tables for this TVM.
And for example, if we want to create a TVM, which is what we're trying to do here, it goes through a few steps, and all those steps here map to an actual TEE host ABI, the ABI between KVM and the TSM, and there are basically seven steps. The first one is to create a TVM context. So KVM will ask for a context so that it can use that context and then start configuring the TVM. The next thing KVM needs to do is to donate some physical pages to the TSM so that the TSM can actually build the second-stage page tables for the TVM that it's going to create. Those second-stage page tables live in confidential memory, so they cannot be handled, they must not be handled, by KVM, by the host VMM. So KVM donates pages to the TSM, and the TSM is going to use them to build those page tables. This memory is not meant to be used as TVM memory, it's meant to hold the second-stage page tables for the TVM. Then KVM is going to tell the TSM that some memory region needs to be reserved for the TVM. So that's basically the TVM address space. And then KVM is going to allocate pages, move those pages from non-confidential to confidential, and ask the TSM to map those pages in the memory region that it asked to create in step number three. The next thing that KVM needs to do is to create TVM CPUs, because basically all the CPU state is contained and managed in confidential memory as well. All the state of the CPUs that the TVM is going to run on top of is managed by the TSM in confidential memory, so that KVM does not see a TVM's general-purpose register values and cannot mess with them, obviously. So this is all handled by the TSM as well. And then KVM finalizes the TVM and eventually asks the TSM to start running the TVM. And this is where your TVM starts to run out of confidential memory, with a vCPU whose state is also kept in confidential memory and protected. So we have this. The TSM just created a TVM upon the host VMM's request.
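The seven steps above can be sketched as a sequence of calls against a toy TSM that refuses requests made out of order. Everything here is hypothetical: the `ToyTsm` class and its method names are invented for the illustration and are not the names used by the specification or by Salus.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTvm:
    page_table_pages: int = 0
    regions: list = field(default_factory=list)  # (base, size) pairs
    mapped: set = field(default_factory=set)     # guest physical addresses
    vcpus: set = field(default_factory=set)
    finalized: bool = False


class ToyTsm:
    """Toy, passive security manager: it only answers host requests."""

    def __init__(self):
        self._tvms = {}

    def _building(self, tvm_id):
        tvm = self._tvms[tvm_id]
        if tvm.finalized:
            raise RuntimeError("TVM is finalized; its configuration is locked")
        return tvm

    def create_tvm(self):                                   # step 1
        tvm_id = len(self._tvms) + 1
        self._tvms[tvm_id] = ToyTvm()
        return tvm_id

    def donate_page_table_pages(self, tvm_id, count):       # step 2
        self._building(tvm_id).page_table_pages += count

    def add_memory_region(self, tvm_id, base, size):        # step 3
        self._building(tvm_id).regions.append((base, size))

    def map_confidential_page(self, tvm_id, guest_addr):    # step 4
        tvm = self._building(tvm_id)
        if tvm.page_table_pages == 0:
            raise RuntimeError("no pages donated for second-stage page tables")
        if not any(b <= guest_addr < b + s for b, s in tvm.regions):
            raise RuntimeError("address is outside every reserved region")
        tvm.mapped.add(guest_addr)

    def create_vcpu(self, tvm_id, vcpu_id):                 # step 5
        self._building(tvm_id).vcpus.add(vcpu_id)

    def finalize(self, tvm_id):                             # step 6
        tvm = self._building(tvm_id)
        if not tvm.vcpus or not tvm.mapped:
            raise RuntimeError("TVM needs memory and at least one vCPU")
        tvm.finalized = True

    def run_vcpu(self, tvm_id, vcpu_id):                    # step 7
        tvm = self._tvms[tvm_id]
        if not tvm.finalized or vcpu_id not in tvm.vcpus:
            raise RuntimeError("TVM is not ready to run")
        return "running"


tsm = ToyTsm()
tvm = tsm.create_tvm()
tsm.donate_page_table_pages(tvm, 16)
tsm.add_memory_region(tvm, base=0x8000_0000, size=0x10_0000)
tsm.map_confidential_page(tvm, 0x8000_0000)
tsm.create_vcpu(tvm, 0)
tsm.finalize(tvm)
state = tsm.run_vcpu(tvm, 0)
```

In the architecture described in the talk these requests would travel from KVM through the TSM driver; the sketch calls the toy TSM directly to keep the ordering visible.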
And the TVM can also talk back to the TSM. The TVM never talks back directly to the host VMM. It only talks back to the TSM. The same way a non-confidential VM exit would be trapped by the host VMM, a confidential TVM exit, for example, or any service that the confidential VM needs, will be handled by the TSM driver or the TSM. So there is a set of ABIs between the TVM and the TSM. And, for example, a thing that I didn't talk about, attestation, is something that is requested by the TVM. So the TVM is going to ask for attestation evidence. And this is going to be serviced by the TSM through those ABIs here between the TVM and the TSM. So the TVM asks for an attestation report, a signed attestation report, evidence that it is going to send to a relying party to run the full attestation dance whenever it wants to do that. And part of this specification, the confidential computing specification, defines how this attestation flow is going to run. And, more importantly, how the attestation evidence is going to be built, out of which measurements, and how this is going to be formatted. Unlike TDX or SGX or SEV, we do use a standard format. We use X.509 certificates for building the evidence. So each layer in the chain here, from the hardware root of trust up to the TVM, loads, measures, and certifies the next layer. So this is based on a specification called TCG DICE. It's a layered specification for building attestation evidence. And this is what we use with the RISC-V confidential computing implementation. Eventually, the TVM, when it asks for attestation evidence, will get a certificate from the TSM. So the TSM builds the certificate with the entire attestation evidence that is part of the certificate as an X.509 extension. And this certificate is rooted all the way back to the hardware root of trust for a relying party to then verify and attest or not. The last thing I want to talk about is I/O.
I didn't talk about I/O because it's a chapter on its own. There are two kinds of virtual machine I/O. There's paravirtualized I/O, also known as virtio most of the time. Doing virtio with confidential computing, with a confidential VM, TDX, SEV, or RISC-V, is challenging because basically the virtio device implementation is done by the host VMM. So typically your virtio-net is going to be implemented by QEMU or by an external process running in host user space, for example. So you must share memory between your TVM and your host VMM. So it's complex. It's actually not very efficient because you need a software IOTLB and you need to do buffer bouncing between confidential and non-confidential memory to be able to share stuff. You need to harden your guest so that you can actually somehow trust the host implementation, etc. So there's a lot of discussion around this. If you go to the linux-coco mailing list, it's a Linux kernel mailing list, there's a lot of heated discussion right now. And the other I/O form, surprisingly, is direct assignment. That is even more complex. Direct assignment basically means you take a PCI device that you know nothing about, and you add that to your TEE trusted computing base. Basically you're going to say, I want my NVIDIA GPU to be part of my trusted virtual machine. And to do that, you basically need to attest and authenticate the device that you want to plug into your TVM. So there are a few PCI specifications, called TDISP and IDE, for protecting the link between your device and your TVM. You need collaboration from the IOMMU. It's a very complex topic. The first one, virtio, is very much in progress. The direct assignment one is still being defined. So I rushed through that. I'm done. Thanks a lot for listening. I hope it was useful. Thank you so much. And I have time for questions. |
Introduction to Secure Execution for s390x
KVM confidential VMs on IBM Z |
Hi, my name is Claudio Imbrenda. I am one of the maintainers for KVM on s390. I'm Steffen, I'm a contributor to the kernel on s390 and also a maintainer of the s390-tools. Yes, and we are here to talk about the same thing that the previous person talked about, but for s390. So, a few months ago, a colleague passed by my office and said, hey, there's a confidential computing track at FOSDEM. Maybe you should submit something. By the way, the deadline is tomorrow. So, I did. I went to the website and this was the call for papers, and it says, all major processor vendors, and there's Intel, AMD, ARM, Power, and, like... so we are here to fix that. How are we going to do it? First of all, an overview of how the whole of secure execution works. The lifecycle, a small glimpse into how we handle swapping. And then Steffen will talk about attestation and confidential dump. So, let's get started. So, what it does: confidential virtual machines prevent the untrusted hypervisor or host from looking into the guest or touching things in the guest. It does not protect against denial-of-service attacks. It does not protect the guest from doing stupid things. We want to protect the machine from malicious operators, malicious hypervisors, and for compliance. If you're here, you know what I'm talking about, right? So, yeah, this is the grand scheme of things, which I think looks like most confidential computing solutions. Of course, well, yes. So, we have what we call the Ultravisor. Everyone has a different name, so we call it the Ultravisor. It's the only trusted entity in the whole system. It's the only entity that has complete access to the whole system. And it's implemented partially in hardware and partially in firmware. Then we have the guests, which are trusted for themselves. I think they can shoot themselves in the foot if they so want, but otherwise, if they don't do stupid things, they will be secure. Notice that a secure guest can only access its own memory.
And the hypervisor cannot access the guest memory or Ultravisor memory. Nobody can access Ultravisor memory, in fact. The line is dotted because memory can be shared, but this will be seen now because, of course, I/O, surprise. And a non-secure guest is non-trusted, like the hypervisor. So, yeah. So, this is not news. So, again, Ultravisor memory is inaccessible. The guest memory is not accessible unless shared. And attempting to access guest memory or Ultravisor memory will just result in an exception. So, there is no memory encryption involved, like in AMD SEV. It's just that the page becomes inaccessible, and nothing in any way, shape or form, not even I/O or stuff like that, will be able to access the page. The guest decides which pages to share. So, it's the guest that decides to share, for example, a bunch of pages at the end for bounce buffers, which were mentioned earlier. Yes? The guest-to-host mappings are also secured and enforced by hardware, because, of course, this is also important. If everything is secured, but then the host could swap the mappings for some pages at runtime, things could break in the guest, and we don't want that. So, the host can change the mappings, but then it will basically crash the guest. So, it's fine. And for everything else, it goes to the Ultravisor. So, all the other interactions that go from the hypervisor to the guest, or from the guest to the hypervisor, go through the Ultravisor, and the Ultravisor will check and proxy the interactions. It makes sure that, for example, in some cases, some instructions are only allowed to return specific return codes. If the hypervisor is returning something that is not supposed to be returned, that will not be allowed.
The hypervisor still has lots of things to do: the actual I/O device models; handling some of the instructions, as I said, like I/O; some other instructions that are actually not handled by the hypervisor, but where the hypervisor needs to be notified that the instruction happened, because otherwise the guest will not be able to execute correctly; then scheduling and memory management, the usual stuff that's still there. So, what can we build with this? We have basically an almost unlimited number of secure guests at the same time. It's not unlimited, but you will run out of memory before you run out of guests. We're talking millions. This is very important: the boot image is encrypted, and it can contain secrets. So, the boot image is encrypted. We can swap. We can have remote attestation, although it's not needed, because the boot image is encrypted. But we still have it, because there are some cases where it can be useful. And host-initiated dumps. We do not have live migration. We do not have device pass-through. We do not have huge page backing, and we do not have nested secure guests. Maybe one day. So, what happens when the host starts? The first thing that the host does is to check if the Ultravisor is available, and if so, to query the Ultravisor to know specific parameters, like how much memory needs to be donated, for example, and then donate a bunch of memory to the Ultravisor, where a bunch of information and details and metadata about the pages are kept, which maybe rings a bell from the previous talk. Yes, that memory now belongs to the Ultravisor, and it cannot be touched again ever, until the host reboots, basically. So, it's gone forever. I mean, forever. I'm not even speaking.
So, the boot blob is encrypted, so you can either have a custom image for a specific guest, where you have your image and you put, I don't know, the LUKS keys, for example, inside the initrd, and then you just boot it, and it will boot safely. Or maybe you have a generic image because of some kind of orchestration that you're using, or because it's a vendor image and you don't want to touch it, and in that case you will need remote attestation. The boot image is encrypted using a public key. The private key is safely embedded inside the hardware, and so nobody can decrypt the boot image except the hardware. The boot blob contains the kernel, the initrd, the kernel command line, and some more keys provided by the owner, which can be used later for other purposes, for example, dumping. And, yeah. So, let's see. This is a simplified view of the guest lifecycle. The host creates a guest, the host asks the Ultravisor to create a secure guest, and by doing so, also donates a bunch more memory to keep some more housekeeping and more metadata about guest pages. The Ultravisor will, at this point, create the secure guest, and then the boot blob is passed to the Ultravisor, which will then make the pages inaccessible, decrypt it, verify the HMAC, verify the hash, and if everything is all right, if the image is correct and has not been tampered with, then it can finally be run by the host. Now, this is a slightly less simplified view, which, if you're watching the stream or if you watch it later, you can pause and have a look at. I will not explain it now, it's just for your convenience, for reference. So, memory donation.
I talked about memory donation, so we have the UV base storage, which is that big chunk of memory which is donated by the host to the Ultravisor very early in the boot process. It can be big, and it's absolute memory, and what we call absolute means physical memory, basically, so it needs to be done very early at boot because otherwise you will not find a very large block of contiguous physical memory. Then there's configuration base storage and CPU base storage. Configuration means guest in s390 speak. Storage means memory in s390 speak. So, those are small pieces of physical and contiguous memory, so it's not a problem to find a few pages. Configuration virtual storage can be big, it depends on the size of the guest, and this needs to be contiguous, but it can be contiguous in virtual memory, host virtual as in kernel address space virtual memory, so it's also not a problem, you can just do a vmalloc and you've got it. The configuration and CPU memory that is donated will be taken away, basically, until the guest is destroyed. Once the guest is destroyed, the memory can be used again by the host. Swapping, let's have a look at how swapping works. So, when the host wants to swap a page, is this readable? Actually yes, cool. When the host wants to swap a page, it will do an export page, basically asking the Ultravisor to please make the page accessible. The Ultravisor will first encrypt the page and save the hash somewhere, and somewhere means one of these regions that were donated when the guest was created. Once the page is encrypted and hashed, it's made accessible, which means that it's not usable by the guest anymore at this point, because it's not secure anymore. The host can then unmap the page and swap it, the normal stuff. When the guest tries to use it again, the host will get the usual page fault, you swap the page back in, you map it back, you run the guest again, but the page is there but it's not secure, it's still encrypted.
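The export step just described, together with the matching import that the talk turns to next, can be sketched as a toy model. This is not how the Ultravisor is implemented: the XOR keystream derived from SHA-256 is only a stand-in for real page encryption, hashing the encrypted page is an assumption, and the `ToyUltravisor` class and its method names are invented for the illustration.

```python
import hashlib
import hmac
import os


def _keystream(key, page_no, length):
    # Toy keystream, NOT real cryptography: SHA-256 in a counter construction.
    out = b""
    counter = 0
    while len(out) < length:
        block = key + page_no.to_bytes(8, "little") + counter.to_bytes(8, "little")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyUltravisor:
    def __init__(self):
        self._key = os.urandom(32)  # never visible to the host
        self._hashes = {}           # lives in memory donated by the host

    def export_page(self, page_no, plaintext):
        """Encrypt the page, remember its hash, hand the result to the host."""
        ciphertext = _xor(plaintext, _keystream(self._key, page_no, len(plaintext)))
        self._hashes[page_no] = hashlib.sha256(ciphertext).digest()
        return ciphertext

    def import_page(self, page_no, ciphertext):
        """Check integrity, then decrypt the page back into secure memory."""
        expected = self._hashes.get(page_no)
        actual = hashlib.sha256(ciphertext).digest()
        if expected is None or not hmac.compare_digest(expected, actual):
            raise ValueError("integrity check failed; page not accepted")
        del self._hashes[page_no]
        return _xor(ciphertext, _keystream(self._key, page_no, len(ciphertext)))


uv = ToyUltravisor()
page = b"guest secret".ljust(4096, b"\0")
swapped_out = uv.export_page(7, page)        # host may now write this to swap
restored = uv.import_page(7, swapped_out)    # host brings it back unmodified
```

The point of the sketch is the ordering: the host only ever handles the encrypted copy, and a page that was modified while swapped out is rejected on import.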
So, the guest tries to use the page, the host will get a fault again, a different type of fault, and this fault will trigger an import. Import means the Ultravisor will make the page inaccessible, decrypt it, and check the integrity, and if everything is all right, then the page will be accepted and the host can run the guest again and the guest will finally be able to run. Yes. Next is attestation. So, let's talk about attestation. Give me a sec. So, Claudio just said it's optional, then why should we use it? So, for IBM Secure Execution, we do not require explicit or external attestation to prove that the guest is secure. We encrypt the image, we verify the integrity, and if your image contains a unique secret, for example, if you want to use a stock kernel and a stock image, you can use a unique public SSH key, and if you can connect to that image using your matching private key, your successful login will attest that it is your secure execution image. But there are some cases where you might not want to or cannot do this. For example, explicit attestation is useful when you want to prove to a third party that you're running on secure execution without passing that private key to that other party, or when you want to verify that you're not only running a specific image but a specific image instance. So, if you have your image running multiple times, you probably want to differentiate between those. Or, to be more general, if you want to have trusted information about your secure execution guest or the execution environment it's running in. Also, another point would be if you have a generic stock image from your supplier; you probably want to first prove that this image is really secure and then deploy your instance-dependent secrets. And the workflow: the first three steps are, as Claudio talked about before, the host will start the secure execution image, the Ultravisor will verify the hashes and start the image, and the guest transitions into secure execution mode.
And we have a trusted system on the left, and on this trusted system we will generate a request, an attestation request, and this request will contain a public ECDH key and an encrypted measurement key. We do a measurement with that key later. This request is also encrypted with the public key of the Ultravisor, similar to the image. Then the request will be sent to the secure execution guest, and it will trigger an Ultravisor call to the Ultravisor. The Ultravisor will do a measurement. That measurement mostly contains the hashes used to verify the image and a configuration unique ID, unique for that instance. And then the response travels back to our trusted system, and on the trusted system we can redo the HMAC, and if they are the same, we can be sure that this image is attested and running on secure execution. If they are not the same, something went wrong. So for dumping, we are here talking about hypervisor-initiated dump. So there are two types of dumping. The guest can dump itself. It has the pro that you do not need an interaction between the guest and the hypervisor. However, you have to provide memory in the guest beforehand. So that's not always possible, for example if you don't want to donate the memory or if your guest is in a very badly crashed state, and also the dumping of yourself modifies your memory. On the other hand, host-initiated dumping works regardless of your guest state. It can be very badly crashed. It will not modify your guest state and the guest does not change during dumping. However, you need access to the guest state, and we promised that the hypervisor never has access to the guest state. So how do we do it? First of all, confidential dumping is an opt-in by the guest owner. So if you want to be very paranoid, you can just say, I don't want my image to ever be dumped. But it's a reliable and secure way for hypervisor-initiated dumping. Every piece of guest state the hypervisor receives is encrypted.
Also, there are no QEMU API changes, so you can just issue your virsh dump as normal. And s390-tools contains a tool, zgetdump, that will handle the decryption later. The decrypted dump can then be used as normal, using crash, for example, to analyze the dump. So, from a hypervisor perspective, if a dump was requested, the hypervisor stops all vCPUs and then imports all guest pages. Export, that's the other direction, right? So we will export all guest pages. And then the guest has no access to the pages anymore, but they are encrypted. It will call the Initiate Configuration Dump ultravisor call, which initiates the dumping process and sets up some state in the ultravisor. And then we need the state for each CPU: we dump the CPU state for each vCPU and get this encrypted CPU state. And then we need metadata for the memory to decrypt it later, so we have AES-XTS tweaks for the decryption later. And in the end, we just need a bunch of keys, initialization vectors, seeds, and nonces to decrypt the memory later. All of that is written into a vmcore ELF file with some extensions. So let's zoom out a little bit. During the generation of the secure execution image, the user or image owner first has to opt into dumping and then provide a customer communication key, the CCK. That key is later used to decrypt the dump. After the generation of the image, the guest owner transfers the image to their cloud provider or to the host. The host will start the image, the ultravisor will verify it, and the image will be started as a secure execution guest. At some point, someone will request a dump. It can be the guest image owner or the hypervisor; it doesn't matter. QEMU will then trigger the confidential dump ultravisor calls I had on the slide before, and the ultravisor will create the confidential dump, encrypted with the CCK. The encrypted dump data in the vmcore ELF file is transported back to the trusted system.
And then on the trusted system, you can decrypt the dump using zgetdump and your secret customer communication key. And then you have a decrypted, normal-looking dump to analyze. So just a quick summary. The current state is that you require an IBM z15 for secure execution, and if you want to use attestation or dumping, you need a z16. And from a software perspective, everything is upstream. Just to summarize the content of our talk: with confidential VMs for s390x, we create a secure image that can have secrets in it. We support swapping, implicit and explicit attestation, and also host-initiated dumping. And now we have a bonus slide. Yes, bonus slide. We are working on confidential containers based on secure execution. There's a pipeline kind of working, including attestation and everything except for the secure part. But the secure part itself works. It's just a matter of putting these together. So that's coming soon, hopefully. Yeah. Thank you. Questions? Question on the encryption of the image. So if I'm a guest owner, I want to run a confidential VM with SE. What's the process for building the... Okay, so the question is about how to build, basically, the boot image. So you need the public key of the machine, or the machines, you want to run the image on. And that's the... So if I have 5,000 SE machines... If you have 5,000 SE machines, you need 5,000 keys, but you probably will not have 5,000 SE machines. But then you need... Probably, you hope so. I mean, by all means, buy 5,000 SE machines, but in that case, yes, you will need 5,000 keys to... But you can encrypt the image, I think, with multiple keys. So you don't have to have 5,000 images. Okay, so I can have one image signed by multiple keys. Yes, you can have one image signed by multiple keys. I mean, encrypted with... Yes? Could you elaborate on what the endorsement mechanism for these public keys is?
So how do I actually obtain the public key for a particular machine, and how do I know that I'm using the right key? So the keys are signed by an IBM master key, which is published somewhere. I don't exactly know the details about that, but there is a certification authority, and you can know that the keys you're getting are real keys for an IBM z15 or z16 machine. So, to be more specific, during the image generation or the attestation request generation, you provide the IBM signing key, which is signed by a DigiCert CA signing key, which is signed by some root key. And we also have, obviously, a revocation list for that. So you can trust that you're really using a hardware key. Maybe one question from my side. One of the things that has been very useful in technologies like SGX is TCB recovery, right? With all these attacks, you have seen that they can recover from that. And I'm wondering, with the type of attestation that you have here, with a hard-coded private key, there's no real way of knowing that I'm even running on a patched machine. Do you have some sort of microcode patching, or something like Meltdown for you? Okay. So I think we do support some versioning, so that you can refuse to let your image run on an older machine, for example. I'm not sure about the rest. But, I mean, there is a revocation list for when a machine gets compromised. Yeah. So, yeah? It's on a per-machine basis. There's no notion that a machine can be in a compromised state if it loads an older microcode version. No, you cannot roll back, I think. So the firmware is per machine and controlled more or less by IBM. So either the whole machine is compromised or nothing. That's not per guest. We have no firmware per guest. So if you notice that there's a bug or whatever, you just revoke the whole key, the whole host key, and update your firmware with a new key. Or you generate a new key. Yeah, I see.
It depends where you have the key, because you mentioned in the slides that it's in hardware. Yeah, yeah. The private key is in hardware. So, the firmware will have to read it, obviously, to process it, though. So if that one is leaked, it's game over? For that machine. For that machine, it's game over. If that is leaked, for that machine, it's game over until the machine gets patched. Okay, well, you can then decrypt the image, right? With the key, you can then decrypt the image, the boot image. So if you didn't have secrets in the boot image, then you're safer, I guess. Excuse me, I didn't get the detail. How is it possible that the private key is in hardware, but updating the firmware changes the key? No, no, no, you don't change the key. In that case, right, no: if the key is compromised, you have to change the hardware as well. You update the firmware to get rid of the bug, but if the key is compromised, you need a new key. You send us the hardware and then we'll basically give it a new key. Yes. Keying as a service. So we have a bit of a break, so you're free to go around. I still have some questions, if you don't mind. By all means. Maybe from the audience first. So when you export the page, you mentioned there are two exceptions in a row. So if you don't have it, does it page fault? Yes. The first is a page fault? So when the VM notices that the page is not fully secure. Yes. Why do you have two exceptions for that, and why not just one? Okay, so the question... I don't know if the stream is still going, but I still have a question. Okay, then I'll repeat the question. So the question is: with swapping, we have two different faults here. We have a page fault and then we have the other fault. Couldn't we just have one fault? In theory, yes. Could you, like, you mentioned that you could also just export it and import it immediately? You can export it and import it immediately, but I don't know.
Okay, let me answer the first part first. So yes, we could have skipped the second fault by exporting the page directly, I mean by importing the page directly. The point is that when the page fault happens, we don't know whether it is for a secure guest or not. The page fault handler would need to be much more complicated, unnecessarily complicated, and importing a page takes time anyway, so the overhead of an extra exception is not so big compared to the amount of code we would have needed to write. So what was the second question? Okay, okay. Yes. Sorry, is there another question here? Oh, sorry, I thought it was... There's also a question from the online stream, so I'm going to read it to you guys and then you can handle it. So, Muhammad is asking: in the flow shown, can you explain why the measurement key is a part of the evidence? Why is it needed? Is the attestation flow specified somewhere? The measurement key is generated by the trusted system owner and then encrypted with the public key of the firmware or the hardware. We need it to do an HMAC, so it's an authenticated measurement. And only the person or the machine that has the measurement key can reliably create the HMAC. |
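As a rough illustration of the measurement check described in this talk, the following Python sketch recomputes an HMAC on the trusted system and compares it with the value reported by the guest. It is not the IBM Secure Execution wire format: the function names, the choice of SHA-512, and the simple concatenation of image hashes and configuration unique ID are all assumptions made for illustration.

```python
import hashlib
import hmac


def compute_measurement(measurement_key: bytes, image_hashes: bytes, config_uid: bytes) -> bytes:
    """Keyed measurement over the image hashes and the per-instance ID.

    Assumption: HMAC-SHA-512 over a plain concatenation. The real layout
    of the measured data is not given in the talk.
    """
    return hmac.new(measurement_key, image_hashes + config_uid, hashlib.sha512).digest()


def verify_measurement(measurement_key: bytes, image_hashes: bytes,
                       config_uid: bytes, reported: bytes) -> bool:
    """Redo the HMAC on the trusted system and compare in constant time."""
    expected = compute_measurement(measurement_key, image_hashes, config_uid)
    return hmac.compare_digest(expected, reported)
```

Only a party holding the measurement key can produce a value that passes `verify_measurement`, which is the reason the speakers give for including the key in the request.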
Tilting a Pyramid
Confidentiality in a Cloud Native Environment |
So, we're probably shifting gears a bit from all these deeply technical, intimidating talks and kind of connecting to the first talk of this track, basically. I found this really interesting, the idea of kind of emulating the Let's Encrypt moment for confidential computing, because it is more or less what we are doing. So, I'm a software engineer in Microsoft's Kinvolk team. We do a lot of exciting stuff in terms of open source, Linux, eBPF, and containers and Kubernetes, and my team is also involved in exploring ways to integrate existing confidential computing technology, all that we heard about in the talks before, with existing ways to deploy containers. So, we're on a tight schedule, so I really have to cut a few corners and will probably be a bit hand-wavy. So, sorry about that. Please reach out if you want to get details about something. Obviously, it's also not a comprehensive coverage of the whole confidential containers realm; it's quite wide and covers a lot of areas, so I'm focusing on a few things that we are looking at at the moment. Also, maybe some points in the talk will not age very well, because things are very much evolving, and whatever is mentioned here could in a few months already be at another stage. But the idea basically is to provide pointers for people who want to get involved in this, because from my perspective it's very exciting, it's a very practical problem you can solve, and there are a lot of open questions that are really accessible, I think, also for people without a very deep technical background in confidential computing. And first we have to define cloud native, so establish some terminology, but we cannot go very deep here. Essentially it's a bit of a buzzword, but you will find that it's more or less an ecosystem of practices, tools, interfaces, and APIs that aim to ease the deployment and management of applications on cloud platforms.
And that can be infrastructure as a service, it can be, I don't know, functions as a service, but in most cases containers are very prominent in this space, and this is also what we are focusing on. And Kubernetes, among other competitors in this space, has been adopted as the go-to solution for container orchestration and management. Let me quickly introduce Kubernetes in like two lines. So Kubernetes is a container orchestrator, a management API, an abstraction layer. I would say that it's not trivial to host and operate. So it's very popular for cloud service providers to offer some hosted solution for Kubernetes and provide developers and engineers with some API layers. And what we have in Kubernetes is maybe a bit unique: you have this notion of pods, which define a logical environment. They are isolated and resource-constrained, they share namespaces and cgroups, and they are composed of individual containers. So a node has pods, and in the pods there can be closely co-located containers, if you want, and this is an abstraction that is quite useful in confidential computing. In general, I think this is also quite common for people who work with confidential computing: there's this kind of trade-off scale. At one end you have a small TCB surface. This means you run enclaves, you have SDKs, and your workloads have to be customized for this. But this also means you don't have to care about a lot of stuff, because the TCB surface is quite small. At the other end you have a bigger TCB surface, like bare metal, VMs, or Kubernetes clusters for example, and those have the convenience of running unmodified workloads, this kind of lift-and-shift idea. And, if you want, the Kubernetes pods could fit somewhere on that scale, probably not exactly in the middle, but somewhere. That's because the idea is that we only need minimal adjustments to existing workloads that are already running in Kubernetes.
And as mentioned before, some workloads are simply locked out of cloud native and public clouds due to compliance issues. So our approach is basically to ease the adoption of confidential computing by enabling confidentiality with minimal upfront investment, because really only big corporations are able to invest in self-hosting environments that provide confidentiality. And as I said, Kubernetes has been widely adopted by the industry, and Kubernetes provides some abstractions and technologies. I think you could even argue that the main value of Kubernetes is not really the tech but the API abstractions that are there. If you move from one place to another, you will be able to adapt quite easily, because there are a lot of shared solutions these days. And we can use those abstractions. For example, if we look at SEV and TDX, they leverage VMs for confidentiality, and we already have an established solution with Kata Containers, which runs those pod units that we've seen before in virtual machines. So this is something where we don't have to start from scratch; we can see, okay, there's something that's been working so far, and maybe we can leverage it. And so ideally, you probably can't read this on the slide, but ideally the result is that you have a Kubernetes spec where you just add a property saying something is confidential, confidential: true. So this is a kind of lift-and-shift idea, where we enable customers to deploy confidential workloads with very low friction. The problems start when we look at Kubernetes privileges, which usually form a pyramid. The squiggly lines here indicate that it's not clear-cut: some parts are owned maybe by the platform engineers or by the so-called orchestrator, and some parts are taken over by cloud service providers. But the idea is that the hierarchy is pretty clear.
So the closer you get to the app users, eventually to the service users, the less privileged you are. The platform engineers are really an in-between layer that exists, like classical operators but on the Kubernetes API level. And then the confidentiality pyramid drops in, and it's a bit messy, because now we basically have a system that's been built for years, with everything figured out nicely, and now our privilege model doesn't fit anymore. From the confidentiality perspective you want to lock out the cloud service provider. And we also don't want the cluster administrators to mess with the workloads. Possibly not even the developers. And it might really be only the app users who have access to confidential data and compute. And this is something we have to deal with. These are basically the challenges that I think we have to overcome in this model. It's definitely not an exhaustive list, but these are three topics I picked, where I followed the discussions in recent months. I also don't think they're insurmountable. There's definitely stuff that we can solve, and partially they are nice engineering problems. Starting with image management. Container images are usually managed by the infrastructure. And this makes a lot of sense, because there are a lot of shared resources. Container images are organized in layers, so instead of pulling an image 10 times, it can be cached on the node. It can be shared across replicas of a single service, for example. And in a trusted execution environment we need verified or encrypted images for our workloads. Actually, there are already OCI facilities for those aspects. But the problem is they're running on the wrong layer, so they're not in the TEE. So if the infrastructure part is not part of the TEE, then we kind of need to drill a hole, or move stuff into a trusted execution environment. And so there are basically some approaches.
So I think the pragmatic band-aid to start with, which made a lot of sense, is to just pull everything into encrypted memory in a confidential pod VM. And I think this has practical merit at first, because you can start with a solution. But it's very clear that there are downsides to this, because you potentially need to provision big chunks of memory for it. And also each pod needs to pull its image layers individually. And some of those workloads, I know they run PyTorch images for example, are really gigabytes big. It's not something that you want to pull individually all the time. We can of course create encrypted scratch spaces to do this, so we get rid of the over-provisioning of the memory, but the unshared images will still yield pretty bad startup times. The good news is that there are approaches that we can use today: we can stream encrypted image layers, or otherwise chunked-up blocks, so there are different ideas, from the host to the confidential pod. And the technology to do this is also not something that we need to invent from scratch. It's already there in recent versions of containerd. It's called remote snapshotters, and they basically do this. It's not 100 percent there, it doesn't meet our requirements fully, but I think it's pretty close. So about the whole image topic, I think we can be pretty optimistic. There's another problem, which is about metadata in Kubernetes, and that's maybe a bit counterintuitive to people who don't know this. Kubernetes will freely transform your specified workloads in multiple ways. It will inject mount points and environment variables. It will change your image definitions, and this is all by design. The cluster operators that I mentioned before basically make sure that all the workloads are compliant, so if some engineer deploys some Redis image, usually like redis:latest or something.
They would basically have some admission controller that just rewrites this stuff on the fly. And from the confidentiality perspective this is not what we want. We want to verify what we want to run, and we cannot have the orchestrator or the platform engineers rewriting our specs. We want to run exactly what we specified, and the same goes for environment variable injection, for example. Kubernetes does this for very good reasons, for example for service discovery and all kinds of information the pod receives from the orchestrator. But it's very problematic. Think of a pod where you have a small batch thing that does some caching on Redis locally. If the control plane would just inject the Redis host environment variable and forward this to whatever Redis instance, this would obviously topple confidentiality. So this is something we need to deal with. And from my perspective this is a bit messier than the image part. At least I don't see another way than saying, okay, we have to review the delta between what the user specified and what is eventually being provisioned on the containerd side. And I think we can then validate, with user-specified policies, whether we're fine with the applied changes. But it's very hard. I think in some cases those policies are very complex to write. So the whole idea that we have a nice UX that says just confidential: true is not working anymore, because the users have to write those policies. And there are some dynamisms in Kubernetes that are very hard to model. So we either have to find a way to make this really convenient, or we find some other way to tackle this. There's another idea I've recently read about, which is a variation of this: it would more or less log the changes that have been performed.
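The delta review mentioned in this part of the talk (comparing what the user specified with what is eventually provisioned) could look roughly like the Python sketch below. It works on plain nested dictionaries instead of real Kubernetes objects, and the function names, the dotted-path policy format, and the example values are all hypothetical.

```python
from typing import Any, Dict, Iterable, Tuple


def flatten(spec: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn a nested spec into {"a.b.c": value} pairs."""
    flat: Dict[str, Any] = {}
    for key, value in spec.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def spec_delta(requested: Dict[str, Any], provisioned: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Every path whose value was added, removed, or rewritten."""
    before, after = flatten(requested), flatten(provisioned)
    return {
        path: (before.get(path), after.get(path))
        for path in sorted(set(before) | set(after))
        if before.get(path) != after.get(path)
    }


def delta_allowed(delta: Dict[str, Tuple[Any, Any]], allowed_paths: Iterable[str]) -> bool:
    """Accept only if every change touches a path the user explicitly allowed."""
    allowed = set(allowed_paths)
    return all(path in allowed for path in delta)


# Hypothetical example: the control plane rewrote the image and injected two variables.
requested = {"image": "redis:latest", "env": {"MODE": "batch"}}
provisioned = {
    "image": "registry.example/redis:7",
    "env": {"MODE": "batch", "KUBERNETES_SERVICE_HOST": "10.0.0.1", "REDIS_HOST": "10.0.0.9"},
}
changes = spec_delta(requested, provisioned)
accepted = delta_allowed(changes, ["env.KUBERNETES_SERVICE_HOST"])
```

Here `accepted` is false, because the image rewrite and the injected `REDIS_HOST` are not covered by the policy. Even this toy version shows why such policies get tedious: every legitimate mutation has to be anticipated by the user.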
And then we approach it from that perspective: we don't model what could happen, but we more or less look at the log of the changes that were made and see whether we are fine with those changes. But I think that's the same idea, only a variation. Eventually, and this is something that is a bit more challenging, we have to address the problem that the control plane API is in the host components. It interacts with the Kubernetes pods, and it's really a crucial part of all the tooling, of how developers interact with Kubernetes. They will, I don't know, exec into their containers for debugging, they will get logs, and all of this is currently going through the control plane. And basically our task is to cut out this kind of middleman between the user and the workload. And there are a lot of aspects, like observability. This is a key concept of cloud native workloads, and confidentiality and observability are obviously almost a contradiction. It's very hard to reconcile, but it's something that I think we have to address at some point. So we kind of need to obscure the logs, traces, and metrics from non-trusted parties, and those metrics are sometimes also used by the orchestrator directly. For example, it would use our garbage collection metrics to perform autoscaling of pods. So it's really a tricky question how to deal with this. We basically need to deprivilege the non-trusted parties and prevent them from retrieving those metrics, but also from executing commands in the scope of a confidential pod. And, as we saw with images, I think the pragmatic band-aid at the moment is probably locking down those problematic parts. So you cannot exec at the moment, and maybe retrieving logs is not as easy as it was before. And obviously that's not something that's practical in the long run for real workloads.
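The lock-down band-aid described here can be pictured as a simple deny list in front of the container management API inside the confidential VM. The Python sketch below is only an illustration: the request names and the `handle_request` helper are hypothetical and do not correspond to any particular agent's API.

```python
from typing import Callable, FrozenSet

# Hypothetical request names; a real agent API will use different ones.
BLOCKED_REQUESTS: FrozenSet[str] = frozenset({"ExecProcess", "ReadStdout", "ReadStderr"})


def is_request_permitted(request_name: str,
                         blocked: FrozenSet[str] = BLOCKED_REQUESTS) -> bool:
    """Untrusted host components may call everything except the blocked requests."""
    return request_name not in blocked


def handle_request(request_name: str, handler: Callable[[], str]) -> str:
    """Run the handler only if the request survives the lock-down."""
    if not is_request_permitted(request_name):
        raise PermissionError(f"{request_name} is disabled for confidential pods")
    return handler()
```

With this in place, creating a container would still go through while exec and log retrieval are refused for everyone, including the legitimate workload owner, which is why the speaker calls the approach impractical in the long run.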
And there are two solutions to this problem that I've stumbled upon recently. In the first, you more or less split the container management APIs into infrastructure and trusted parts, and operate a kind of shadow trusted control plane that users and tools interact with. This is also part of a TEE, and it would kind of mirror the Kubernetes API. The downside, from what I have seen, is that it's a large effort, and I'm also not sure whether it's maintainable in the long run, because you basically have to evolve with the Kubernetes APIs all the time. The other, alternative solution I've seen is more viable from my perspective. It means we have a kind of encrypted transport between the privileged user and the container management stack on the confidential VM, for example. And the downside to this approach is basically that it's also quite an invasive change, because you need to extend, or touch, many parts of the stack. Having this kind of confidential transport through all these layers means you have to change components in Kubernetes, in containerd, and even the clients and tools that sit on top of Kubernetes, so it's also not an easy thing to do. Yeah, I think those are the three points I wanted to mention. In summary, I think confidential computing and cloud native containers are a good match. From my perspective, it could really boost the adoption of confidential computing. There are definitely some hairy questions we need to figure out to make this work in practice. But from what I'm seeing, there's a very engaged community, and yeah, it's very exciting. So if you want to chime in, it's the confidential-containers org on GitHub. There are meetings. There's Slack. And I think that's it. We have around 10 minutes for questions. You mentioned the control plane issue, and that the API between the control plane and the TEE is not practical in the long run. What do you mean by that? No, no, locking it out is not practical.
So if you basically say you're not able to use the full API in a confidential context. Right now you can lock down the controversial parts of the API, because it's very hard to untangle those things. You know, when the APIs were conceived, nobody thought that exec would be a problem but create wouldn't be a problem. So at the moment, I think we can do this on the containerd level or the Kata agent level, where you can basically say, okay, there are a few things we just don't support. But I think this is not sustainable in the long run. So you're not saying the architecture is not practical. You're saying that the fact that you're shutting the API off is not a long-term solution. Exactly. But I mean, if you want to start with something, like the image pulling on the confidential VM, I think it makes sense to start with this. What do you think about the metadata problem and then the last problem? Having this trusted control plane also as a solution for the metadata problem, where you have a trusted controller, a trusted part of the API server that you can give your descriptions to, and it will enforce them on the container? I mean, that's obviously... pardon? So the question is whether, if you move parts of the control plane into the TEE, you basically get around a lot of those problems. And that's absolutely true, because these are self-made problems to a huge degree. If you just take the whole thing and put it in a TEE, then most of the things, maybe even all of the things, suddenly aren't a problem anymore. But this is, as I said, starting from the notion of existing users who use existing hosted Kubernetes offerings. How do you migrate those users to confidential? You would have to do those things. And so I'm not even arguing that this is maybe the best solution in the long run.
But if you want to do this, then you have to kind of overcome those issues. We have two online questions, and I'm going to hand this over so you can read them. Okay. Okay, there's a question: are there any new challenges related to attestation protocols in containers compared to the already existing attestation mechanisms in TDX, SEV, SNP, etc.? From my perspective, I think this is not something that necessarily conflicts with confidential containers. We more or less also have to follow the same attestation principles as non-containerized workloads. We're just moving on top. Yeah, you're just moving it to a different layer. But I think there's no fundamental difference or problem that is specific to containers. There's another question, about whether the use of a proxy is being considered. I think it's a bit too broad for me to understand where a proxy would sit. Yeah. Yeah, I mean, this is what I meant by having this kind of transport. But I think with a proxy you pretty much also need to start tweaking things at the client. Maybe you have a distinct kubeconfig for confidential pods. I didn't really look deep into this, but I think there are definitely changes you need to make also when you employ a proxy. And one question or comment from my side: well, there are a lot of challenges in getting this done on the infrastructure side. You mentioned how we pull the pods and whether we can share those or not. The images, sorry. That's a problem that mostly affects the people providing the service. But in my mind, the biggest problem we have is maintaining the availability, the observability, for the end users. And that's something where, well, we will have to think together about how to solve it. Yeah. I mean, the priorities, I agree. I think sometimes they are different, put it this way, because sometimes things are also just KPI-driven.
So you say, like, we have this solution and it starts in three seconds. And if we do this, then it doesn't work, because then we have regressed in some metric. But this is, I think, a product question. From my personal view, I also don't think that the image pulling time, the startup time, is a big issue. But then, if many of those workloads are machine learning workloads, it's like this: a PyTorch image is like 21 gigabytes or something. It's really crazy. And I understand that there's a concern. And startup is not really a problem for Kubernetes itself. We have to wait a lot with Maze. But for function-as-a-service offerings, it is. Yeah. And this is one of the main use cases that people are looking for. So, yeah. It's understandable to try to optimize that side as well. Thank you, Magnus. Thanks a lot. |
Salmiac: Running unmodified container images in Nitro Enclaves |
Yeah, hello everyone. Today I, Nikita, and my colleague Arithi, senior engineers at Fortanix, want to talk about the confidential computing project we built at Fortanix using the latest technology from Amazon, called Nitro Enclaves. Nitro Enclaves provides an environment to run an application confidentially, meaning that no one should be able to tamper with the application during its execution, compared to your regular operating system. Or in other words, you know, if your program is quite shy and doesn't want to be poked by outsiders, we provide a safe environment for it to execute in, or a safe space, if you will. Before I start talking about the project itself, I want to give a little bit of context about what parts Nitro consists of. The Nitro System, which Nitro Enclaves is built on, consists of three main parts. We have the Nitro hypervisor, which is a combined piece of software and hardware tech that provides CPU and memory isolation capabilities for the VM. There is the Nitro chip, which is hardware tech; it provides a root of trust that tells you whether you're running a tainted hypervisor or not. And there are Nitro cards, which are responsible for memory encryption. Now, how it looks from a VM standpoint: as you can see in the picture, we have two parts. We have an untrusted parent VM and a trusted Nitro Enclave VM. They are completely separated and they run their own kernels. And the Nitro Enclave is the VM that you run your application on. Now, the enclave and parent VMs talk through, and can only talk through, a socket called vsock. It is a special socket connection specifically designed to provide communication between an enclave VM and the hypervisor.
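To make the vsock channel a little more concrete, here is a minimal Python sketch of opening a vsock connection and exchanging length-prefixed messages over it. `socket.AF_VSOCK` is a Linux-only address family in Python's standard library; the length-prefix framing and the helper names are assumptions for illustration and are not taken from the speakers' project.

```python
import socket
import struct


def connect_vsock(cid: int, port: int) -> socket.socket:
    """Open a stream connection to (cid, port). Requires Linux with vsock support."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    sock.connect((cid, port))
    return sock


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message, prefixed with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes or fail if the peer closes early."""
    buffer = bytearray()
    while len(buffer) < count:
        chunk = sock.recv(count - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buffer.extend(chunk)
    return bytes(buffer)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)
```

The framing helpers work on any stream socket, so they can be tried without an enclave by using `socket.socketpair()` in place of `connect_vsock`.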
Now, we believe that any program, no matter how shy it is, still wants to socialize with the outside world, because complete hermits are quite rare after all, and it also wants to keep some personal diary with all the data and the things it has done, so it can reread it after the enclave has finished. Now, bare Nitro doesn't provide any of those. As you can see, it has no network capability out of the box and no durable storage. It gives you no networking capabilities inside the enclave, meaning that the application inside the enclave can't connect to anything in the outside world and vice versa. And as for persistent storage, Nitro doesn't provide any form of that. The kernel inside the Nitro VM is running on a RAM disk, and it gets cleared after the enclave finishes. So all of this drastically limits the number of useful applications that can be run on bare Nitro. For example, if you want to run a web server or a database, you can do it on bare Nitro, but it won't provide much value to the end user, because a big portion of the functionality of those programs is just not supported. And the Salmiac project that we created aims to solve those limitations by providing a confidential program execution environment without sacrificing fundamental system capabilities. On a higher level, the project looks like this. Fundamentally, it is a Docker image that runs on a VM with Nitro Enclaves enabled. The only required argument that needs to be passed is the Nitro socket, here on the right. And after that it behaves like any other Docker image would. You can start it, stop it, and monitor its status using your standard Docker commands. Functionally, the image can be divided into two parts. One is called the parent and one is called the enclave. Those are internal names, not to be confused with the parent VM or the Nitro Enclave VM. The enclave contains the client application as well as a runner program of our creation.
The parent contains only a runner program and is responsible for setting up the networking and the file system, as well as for Nitro enclave creation, monitoring and termination. The only communication channel between the two is the vsock connection, which effectively creates a confidentiality boundary between those two parts. Having shown the architecture, I can go further and discuss how we implement networking. Salmiac abstracts the network communication between the client application and the outside world. If someone wants to connect to the application inside the enclave, the only thing they need is a network address, so it works the same as connecting to any other service running on the internet. The other way around, the application inside the enclave can connect to anything on the internet, and it also needs only the internet address. The application inside the enclave doesn't know that it runs inside an enclave. A common example would be an nginx web server that runs inside the enclave: you can send web requests to it and it will answer whoever sent them. In detail, networking is implemented using a two-part program called the vsock proxy. As you can see, this is also a two-part program, similar to the parent and the enclave on the previous slide: one part runs inside the enclave and one runs inside the parent, and they are connected to each other only through vsock.
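One detail the talk does not spell out is how discrete network packets are carried over a stream connection such as vsock. The following is a minimal sketch of one common way to do it, length-prefixed framing. It is an assumption for illustration only, not Salmiac's actual wire format, and the names `encode_frame` and `FrameDecoder` are hypothetical.

```python
import struct
from typing import List

# Assumed framing (not taken from the talk): a 4-byte big-endian length,
# followed by the raw packet bytes.
HEADER = struct.Struct("!I")


def encode_frame(packet: bytes) -> bytes:
    """Prefix one network packet with its length."""
    return HEADER.pack(len(packet)) + packet


class FrameDecoder:
    """Reassembles packets from arbitrarily chunked stream reads."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add bytes read from the stream and return every complete packet."""
        self._buf += chunk
        packets = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # packet not complete yet, wait for more bytes
            packets.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return packets
```

In such a design, each half of the proxy would call `encode_frame` on every packet it reads from its network device and feed whatever it reads from the stream into a `FrameDecoder` before writing the recovered packets to its own device.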
One of the responsibilities of the parent, since it is the entry point, is to copy at startup all the necessary network configuration of the network device on the right, which is a Docker network device, and propagate those settings into the enclave. The enclave part of the vsock proxy then recreates said network device or devices, effectively mirroring them inside the enclave. Instead of a real device, it uses a virtual network device, a functionality provided by the Linux kernel called TUN/TAP, which lets a program in user space read and write network packets. If you are more interested in that, you can check kernel.org; they have a lot of information about it. After said devices have been created, both parts of the vsock proxy start forwarding packets: the parent forwards them from the external network into the enclave, and the virtual device inside the enclave forwards them in the opposite direction. So the only difference between those two effectively becomes the direction in which they transmit network packets. Said architecture also supports multiple network interfaces, as well as a private network between the parent and the enclave. The private network treats the parent and the enclave as two independent entities, and it allows us to run the Internet Protocol between those two on top of vsock. This lays the foundation for a file system solution, which will be presented by my colleague Arithi, and I will pass on the mic.
Okay, hello everyone. I will dive deeper into how we support a persistent file system with Nitro Enclaves in our project, Salmiac. I first want to go over how you actually create an enclave image using AWS Nitro. AWS provides us with a command line tool, the Nitro CLI.
It's used for managing the life cycle of Nitro enclaves: you can build an enclave, you can run it, and you can check its status, whether it's running or not and how much CPU or memory it's using. In particular, I want to discuss the build-enclave command, which takes as input a Docker image containing the application that you want to run inside an enclave and gives you an enclave image file. Along with the enclave image file, it also gives you enclave measurements, which are later used by the enclave itself to prove its identity and build trust with an external service, which is basically Amazon. So what do we do differently in the Salmiac project? We don't just create an enclave image file; we create a Nitro Docker image from the input image that we get. The first step is to look at the input Docker image and export the image file system into a block file; let's call it the init image. We store it into a block file with the help of dm-verity, a Linux kernel device mapper tool that uses the kernel crypto API to protect the integrity of the initial input image. What happens at enclave image creation time is that it creates a hash tree of the storage blocks present in the device and ends up giving us a root hash, and we store this root hash in the enclave image file. We also create another, empty block file; let's call it the run image. I'll discuss more about it in the following slides. As you can see here, the input Docker image is converted into a Nitro Docker image, which consists of the block files, the enclave image file and the Fortanix software, which is Salmiac. The next step is what happens at runtime: how do we make these block files, which are present in the Docker image, available inside the enclave? We use NBD, the network block device protocol, to serve these block files, which are present in the parent Docker container, into the enclave. It goes over the network, so we make use of the vsock proxy that
Nikita discussed earlier. Each block file is served through the NBD server program, which runs on the parent side, and the two block files use different ports on the server. Inside the enclave we have the NBD client program; we have two clients, which access the server on two separate ports to provide us with two devices. Let's call them dev read and dev write. Okay, so once we have the block devices inside the enclave, how do we actually mount them? I think I made a mistake on the previous slide: I mentioned dev read and dev write, and I want to change that to dev init. We use dm-verity again to open the dev init device, and when we do, we provide the root hash from the enclave image file. When dm-verity opens the block device, it makes sure that the hash tree it constructed at enclave image creation time matches what it finds at runtime. This way the initial contents of your client image are preserved; it provides integrity protection. Let's say we mount it at a location called mount lower. Then we have a second device, which appears as dev run. This device is opened using dm-crypt, another device mapper tool provided by the Linux kernel, which is used to encrypt a block device. On the first launch of the application, we format this device as a LUKS2 device and create an ext4 file system on it. Right now we're using the default cipher, and the encryption key can either be generated randomly or you can use an external key service to provide the encryption key. Let's say we mount this dev run device at the directory mount upper. So we have two file system layers here, mount upper and mount lower, and we need to combine them and provide them to the client application that we want to run. We use an overlay mount to merge the two and provide the result as mount root. This allows the client application to read verified data from the image block file, which is
presented as mount lower, as you can see, and any change in the state of the file system during enclave runtime is saved in mount upper, which is basically the run image block file. Once we have set up the overlay file system, we need to change the root of the file system so that the client application is unaware of these different layers. We use chroot to change the root of the client file system, and once we chroot, we also bind mount some of the system directories, such as /dev, /proc and /sys. I think that should give the client application everything it needs to run and to save any state changes into the block device. And yeah, I think that's all I have. If you have any questions about the networking or the file system persistence, you can ask me or Nikita about it.
Yeah, maybe talk a little bit about performance? I mean, you're tunneling everything through the vsock. Are there any boundaries you hit?
Okay, the question is how it impacts the performance of the application. We haven't yet measured the performance implications; this was more about being able to support what was initially not supported by AWS Nitro. Any other questions? Is there one? Yes.
Okay, the question is: is the TEE architecture of Nitro similar to TDX and SEV? I suppose it is a closed system by AWS only, so how can users verify that Nitro is secure? I'm afraid I don't know the comparison with SEV and the other platform mentioned here, but you need to be able to trust AWS. If you trust AWS, you trust the system. Apart from AWS, there's no one else that can access the contents, as far as we know.
The Nitro CLI is open. Yeah, so as far as I know, the Nitro CLI is open source, right, but the hardware that the enclaves run on is made by Amazon, or probably ordered by them, whatever. So you have to trust them for the hardware. Yeah, that's it. |
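As a footnote to the file system part of the talk above, the idea behind the dm-verity root hash can be illustrated in a few lines. This is a simplified sketch under stated assumptions, not the real dm-verity on-disk format: it uses SHA-256 without a salt, the block size and fan-out are illustrative, and it recomputes the whole tree at once, whereas dm-verity checks blocks as they are read. The function names are hypothetical.

```python
import hashlib
import hmac
from typing import List

BLOCK_SIZE = 4096  # bytes per data block (illustrative choice)
FANOUT = 128       # digests combined per tree node (4096 / 32-byte SHA-256)


def _blocks(data: bytes) -> List[bytes]:
    """Split data into fixed-size blocks, zero-padding the last one."""
    if not data:
        return [bytes(BLOCK_SIZE)]
    return [data[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            for i in range(0, len(data), BLOCK_SIZE)]


def root_hash(data: bytes) -> bytes:
    """Hash every block, then hash groups of digests until one remains."""
    level = [hashlib.sha256(block).digest() for block in _blocks(data)]
    while len(level) > 1:
        level = [hashlib.sha256(b"".join(level[i:i + FANOUT])).digest()
                 for i in range(0, len(level), FANOUT)]
    return level[0]


def verify(data: bytes, expected_root: bytes) -> bool:
    """Return True only if the data still matches the recorded root hash."""
    return hmac.compare_digest(root_hash(data), expected_root)
```

The point of the construction is that only the single root hash has to be stored somewhere trusted, here the enclave image file; changing any byte of the block file changes the root, so the check fails.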
Autonomous Confidential Kubernetes
How to securely manage K8s from within K8s |
We're going to start here. We have a presentation about Constellation, the last presentation for today, on how to manage Kubernetes from within Kubernetes. The presenters are Moritz and Malte, so a big shout out to them. Thank you.
Yeah, we'd like to pick up where, I think, Nick started off in his initial presentation, saying we need to have something like a Let's Encrypt movement, we need to make confidential computing a commodity. Then Magnus, I think, had a great talk showing all those bits and pieces that we need to bring together: the use cases, the cloud native world, the way we develop applications for the cloud, and the advantages that confidential computing technology gives us, how to bring them together and where the little gaps are. From a use case perspective, if I want to develop an application and ask where confidential computing can help me and how I can apply it, I would roughly split that into three tiers, if you will. The first is definitely managing keys: having enclaves that hold your cryptographic certificates and your keys and that process the crypto operations for you. Very small TCB, very small kind of application, right? The second one is where I package my entire application inside an enclave, inside a confidential container, and I think that's what we've been doing a lot lately. The third, I think, is what Magnus has described: how can we bring that together, making this orchestratable, manageable and deployable? I think there are different ways of getting from tier two, where we are, to here. One is, I guess, what Magnus has described: taking containers, making them confidential containers, and then having the problems with orchestration. An orthogonal approach that we would like to present now is the idea of confidential clusters, so instead of isolating individual containers, isolating the nodes. The downside
is probably a slightly larger TCB, and the advantage is being closer to where we are right now with deploying and developing cloud native applications. Talking about challenges for level three, I think one of the biggest ones is definitely UI and UX, right? There's little hope that people will drastically adjust the way they deploy and develop applications for the cloud just because they want to use this new type of technology, so we need to get very close to where they are and bring those worlds together. Then of course there are the challenges Magnus has described with orchestration and attestation: how can we attest all those different containers that are running in my cluster when I don't necessarily want to verify each individual instance? That could be a thousand or more of the same. And then, once I have a cluster, there are all those day-two operations of updating and upgrading, and doing that in a sensible way where I can always verify what's currently running and what the changes are. A big part of what we are going to present today is the right one here: a big benefit of the cloud is that I can give away some of this orchestration work and consume managed services that are operated by someone else, or autonomously, and I just consume them through an API or some other kind of interface. So, as Nick has said, the infrastructure is rolling out. We see all those confidential technologies in the cloud: AMD SEV, and we have heard about many more today, IBM, RISC-V. Most of them give us a confidential VM, which is, as we've seen, not necessarily the abstraction we want, but we can already consume managed Kubernetes that runs on confidential VMs, at least for the worker nodes. I think Azure has it, GCP has it. So this exists, but it's not really solving the problem. It gives us runtime encryption for the stuff that lives on those nodes, but all the edges, all the I/O, are not protected, right?
The API server is not protected. We've seen that in Magnus's talk: the metadata problem, the problem of the trusted control plane. And if I want to consume persistent volumes, are they automatically encrypted, or do I need to adjust my application to encrypt before writing to storage? So the idea of a confidential cluster is that I have somebody or something that fills in those gaps, so that instead of those individual confidential VMs I have one big context that I can verify through attestation and to which I can establish a secure channel. Then, if I'm in that context, if I'm in that cluster, I can just use Kubernetes as I am used to, and from inside there essentially everything is trusted, right? It's a different type of approach: it creates an envelope around my Kubernetes and isolates it as a whole. As I said, I think UX and UI and the way we use this are super important. It's not going to work if we need a lot of adjustments and a lot of additional steps in the development workflow. This is just an example from Constellation here, but having a simple way of creating such a confidential cluster and then using it is important, and all the challenges that I showed, which we need to solve below, we need to make more or less invisible, right? In terms of Constellation, we try to make the node operating system as verifiable as possible, strip it down as much as possible and harden it. Then, to stitch the nodes together into a cluster, we need to think about the supply chain, and we need to think about how we can automatically encrypt all the stuff that goes over the network and that goes to storage. Ideally this is all open source. Constellation, if I haven't mentioned it, is open source, and it's cloud agnostic, so it can run everywhere. And, as for most confidential computing stuff, I need some way of recovery, should things go south and everything is down and I need to get back into running mode.
So yeah, the big problem with the confidential cluster concept is this: I can create a cluster, and we will see in a bit what that means, and I have everything verified, but now I have to maintain and run it on my own. This is, I guess, the biggest problem with that concept, right? People want to consume managed stuff. When they have managed Kubernetes, they don't want to run and orchestrate their own Kubernetes. This is a big trade-off that people are facing, and we try to work on concepts for making that as easy as possible. Malte is going to show you how.
Yeah, so thanks for the introduction. The basic idea that we had was: how can we manage Kubernetes from inside Kubernetes itself? To sketch this idea, I will start by explaining what you can typically do today. On the left side you have the traditional on-prem model, where you have the whole cluster in your own hands, the control plane and the worker nodes, and it runs on your own hardware, which is great for security, right?
Because you have full control. But it also means you are responsible for every single interaction, like scaling up the cluster, joining the nodes and performing upgrades, both on the OS level and for Kubernetes itself. On the other side you have something that is super popular: just let the cloud provider deal with it. That means the cloud provider can scale your cluster up and down; if you have a burst of traffic coming in, you get new nodes. It is all super easy. You can set it up so that the cloud provider automatically patches your operating system and automatically upgrades your Kubernetes. This is great from a DevOps perspective: it is super simple, it scales, and it takes work away from the developer and the operator of the cluster. So what we thought is, why don't we meet in the middle? We have to run our own control plane in the confidential context, but if we do this, we lose all of these smart features from the cloud provider, so we will just reinvent them, but inside the cluster. That means we can still do auto scaling, nodes can still join by themselves without any human interaction, we can still roll out OS updates, and we can even roll out Kubernetes upgrades inside a running Kubernetes cluster. To explain how this works, I will first go into how Kubernetes nodes in Constellation can actually join the cluster without any outside interaction. What you have to understand here is that these are all confidential VMs and they make heavy use of the measured boot chain. I think we already had some good introductions to this, but I will still show you how it works with an example. First, in the confidential VM, we have the firmware. The firmware is basically just the first part that starts up, and its main task is to load the first-stage boot loader and to measure it. On AMD SEV, in the approach we are currently using, it is measured into a virtual TPM, and then we
load this boot loader and start executing it. The boot loader just has the task of, in our case, loading the next stage and measuring it, which is a unified kernel image. This is a very neat trick: it is basically just one blob that contains the Linux kernel, an initramfs and also the kernel command line. The nice property here is that we can measure all of these in one blob and don't have to take care of the individual components, which can be quite hard to do correctly. Inside the initramfs, we use the kernel command line to extract the hash that we expect for the root file system. For this we use dm-verity, which I will not go into in too much detail. It just allows us to have a read-only root file system of which we know in advance that it has not been tampered with, and we can efficiently check every block while it is read and before it is actually handed to userland. That's how we get to the root file system, and inside this root file system we have a small application whose task is to join this node into the Kubernetes cluster. Next to the completely unmodifiable OS, we have a state disk. The only task of the state disk is to hold the data for Kubernetes itself, like container images and runtime state that has to be stored on disk. It is initialized to be completely clean, it's encrypted, and it is a component we need in order to operate. The next question is how we make these things possible. For this we deploy some microservices inside Constellation. These are the node operator, which is responsible for actually rolling out updates; the join service, which attests nodes that are joining the cluster and decides whether they are allowed to join or not; a key service that handles encryption keys; and some more that are not really important right now. So how does a node actually join the cluster?
So, I mentioned there's the bootstrapper that is started inside the confidential virtual machine. It autonomously searches for the existing Kubernetes control plane and performs remote attestation using attested TLS, basically using the attestation statement from, for example, AMD SEV-SNP. The join service already knows what measurements to expect from a correct node that is running the expected software, so it can decide at this point whether the booted node is running what you wanted it to run and whether it is allowed to join the cluster. Based on this, the join service can then offer the node a join token, which allows it to join the cluster, and it also hands out a permanent encryption key for the state disk. Next we will have a quick look at how updates work. On a high level, we want the administrator to be in control; we don't want to give up complete control over the update process, but we want the actual execution to be completely automatic and seamless. We do this by basically just telling the cluster what to do, and the rest is done by a Kubernetes operator, which is a way to specify a desired state and let the cluster handle moving towards that state. An important thing to think about here is that we are running in a cloud environment and we don't want you to depend on individual nodes. This is also what GKE, EKS and others are doing. We are saying: if you want to upgrade, we will give you a new node that has the desired configuration, and we will never try to do updates in place. So how does the actual update work?
We basically hand in custom resources that describe the desired state, so the Kubernetes version and the OS image that we want to run, plus some properties needed for verification, like the expected measurements for the new image and hashes for the individual Kubernetes components. The operator reads this information and basically checks whether the desired state matches reality. If it detects a mismatch, it first stops any auto-scaling operations that are happening in the cluster, and then it starts replacing the nodes one by one. For this we use the different APIs of the cloud providers. In this case, we just spawn a new node in the correct node group that has the desired configuration. We wait for the node to autonomously join the cluster and to become ready. Next we cordon and drain the old node, which just means we safely move your running workloads from this node to other nodes in the cluster. Only when we are sure that your running workloads have moved over do we remove the old node from the cluster. This is basically how you get from having outdated nodes to having updated nodes, and it just goes on until your whole cluster is up to date. You can also parallelize this, and when it is done, you can just restart the auto-scaler and move on with your day.
All right, quick conclusion. In summary, the fundamental idea is that we take this confidential cluster concept, enveloping the entire Kubernetes cluster instead of protecting single containers or parts. What we gain is that we basically get all the orchestration for free; what we need to do is protect the edges, all the I/O and so forth. The downside is that we can't isolate inside that cluster, so it's one big envelope, of course. This works already. It's an open source tool; you can check out Constellation on GitHub and try it locally or on one of the big clouds. From a Kubernetes perspective, it's just vanilla Kubernetes, so it's not surprising that it's certified.
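The node replacement procedure described in the update part of the talk (spawn a new node, wait for it to join, cordon and drain the old one, then remove it) can be sketched as a small simulation. This is a toy model under stated assumptions, not the Constellation operator: it makes no cloud or Kubernetes API calls, the `Node` class and `rolling_replace` function are hypothetical names, and draining is simplified to moving all workloads onto the replacement node.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(eq=False)
class Node:
    name: str
    image: str  # OS image / version the node currently runs
    workloads: List[str] = field(default_factory=list)
    schedulable: bool = True


def rolling_replace(nodes: List[Node], desired_image: str) -> List[Tuple[str, str]]:
    """Replace outdated nodes one by one; return the steps that were taken."""
    log = []
    outdated = [n for n in nodes if n.image != desired_image]
    for old in outdated:
        # 1. Spawn a node with the desired configuration and let it join.
        new = Node(name=old.name + "-replacement", image=desired_image)
        nodes.append(new)
        log.append(("spawn", new.name))
        # 2. Cordon: stop scheduling new workloads onto the old node.
        old.schedulable = False
        log.append(("cordon", old.name))
        # 3. Drain: move running workloads off the old node.
        new.workloads.extend(old.workloads)
        old.workloads.clear()
        log.append(("drain", old.name))
        # 4. Only now remove the old node from the cluster.
        nodes.remove(old)
        log.append(("remove", old.name))
    return log
```

Calling `rolling_replace(cluster, "v2")` on a list of nodes leaves only nodes running `"v2"` while keeping every workload, and a second call returns an empty log because the desired state already matches reality, which is the reconcile behavior the talk attributes to a Kubernetes operator.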
To give some more references: if you're interested in this whole image part, there was the Image-based Linux and TPM devroom, with a lot of talks on these topics, also very interesting. This is the last talk here, but if you're interested in more confidential computing, let me sneak in a little advertisement for OC3, the Open Confidential Computing Conference, that's going to happen in March. It's virtual and free; you can just sign up and listen to the talks. A bunch of the folks that were here, I think, also have a talk there. So yeah, if you have any questions, please feel free to get in touch, and that's it. Thank you.
Oh, so yeah. The question was about the attested TLS: when we join nodes, we establish a secure connection based on attested TLS. Yes, so first of all, our implementation is open source; it's part of the Constellation source on GitHub. I think it's nothing fancy: we use the AMD SEV or Intel TDX and so forth remote attestation statement to exchange a key as part of the data that's sent over, and we bind the TLS session to that attested key. I guess there are a couple of implementations of attested TLS, and they work more or less the same.
Yeah. As far as I'm familiar with it, there is a vulnerability by which remote attestation can be faked by a machine, and now I wonder whether it is possible to find out, from the remote attestation of the whole cluster, whether any single machine in the cluster has faked its attestation, or whether it goes unnoticed.
Okay, so to repeat the question: there is a known vulnerability for attestation in confidential computing, and given this confidential cluster, can I infer from the whole-cluster attestation whether one of the nodes is faking its attestation? I have to say there have been several vulnerabilities in several of the CC technologies over time, and I don't know which vulnerability you're referring to.
Okay, so the way the cluster attestation works is this: you take, let's say, the first node, which has a known configuration, and it attests all other nodes based on this known attestation. If one node were able to perfectly fake that attestation, you would not know from the outside, from a cluster attestation perspective, which node this would be. But yeah, I guess that's what you can say.
It is super simple, but it is a big TCB. Do you have any plans to reduce the TCB?
Yeah, as I said, this is a trade-off. Yes, it's a much larger TCB than SGX, and a much larger TCB even than confidential containers. We will of course try to make it as minimal as possible. The biggest lever is of course the node OS and everything we can do inside there, so we'll definitely try to improve there. Yes?
So you mentioned that there's some firmware at the beginning of the boot process. Is that firmware provided by you or by the provider?
Very good question. Oh, sorry. Yeah, the question is: the first component that's booted in a confidential VM is the firmware; do we have control over the firmware? Ideally we would have, but as for what's provided by the cloud providers right now, Azure has something in preview that allows you to do that, which is not generally available, and GCP does not allow it. So the firmware, at least for GCP and Azure, is completely controlled by them. On OpenStack with QEMU/KVM, you can potentially fully control the firmware. Yes, next question.
Doesn't that create a huge trust problem, because you have to trust the firmware to be secure?
Does this create a trust problem, is the question. Yeah. I mean, this is controversial, and I fully agree with you. This is not how we would like it. This is just the best we can have. |
Devroom closing and goodbye |
Thank you all for being here. We're not going to keep it long, maybe just a couple of words to wrap up. This was the fourth edition, and we are very happy that it was again in person; it's a hugely different experience. I think we are doing very well as an open source community on trusted execution, and that's, I think, something to celebrate together. There are only two things you can do with trusted computing. One is to restrict the user, things like digital restrictions management, DRM. The other, and I think that's the future, is securing user data. To do that, I think the only way forward is open source and free software. So let's keep the pace up together as a community, and hopefully we get together again here in Brussels next year. I think Fritz also wants to say something.
Yeah, I just want to wrap up by saying thank you to everyone for joining. Thank you for the nice questions. I think it was nice in the last years. Please send us feedback, send us a few words about what you think, also about the online part of it, because we're not sure whether we want to keep this hybrid version for next year. So please send us any thoughts on whether the online part was actually worthwhile. I think this part will always be live streamed by FOSDEM, but we're not sure whether we will actually keep it hybrid. So please send us any thoughts you have, and please also reach out to us if you want changes made for next year. We really want to make this more of a community event and maybe also get a full day next year. So thanks a lot to everyone and have a great evening. Thank you for being here. |
Drawing your Kubernetes cluster the right way
how to present the cluster without scaring people |
Hello. Hello. Okay. So, welcome to the containers devroom for FOSDEM 2023. We're all very happy to be back in person in a completely filled room, as usual for this track. We've got our first speaker, who's going to be talking about drawing your Kubernetes cluster the right way. Take it away.
Okay. So, yeah, you see the title of the talk. The nature of Kubernetes is a little bit specific, I would say, and that's why this talk exists and, I hope, can be beneficial. If we speak about what is specific to Kubernetes, the first thing we think about is the high entry threshold. First of all, we have a tremendous amount of objects, or entities, I would say, which interact with each other in some predictable, or not always predictable, manner. The slide lists just a few of them. And as you know, Google likes inventing new entities with specific names, which do something non-standard from the point of view of pre-Google systems. So you have pods, and services which are not traditional services. You have a large number of controllers. You have labels and selectors. You have some objects which are really Kubernetes specific, such as deployments. Then you have specific tools, components or concepts to control this whole set of containers orchestrated together. Due to this complexity, you really need a good and simple drawing to illustrate it. Well, from the cognitive point of view, we divide cognitive load into several groups, actually three groups. One is the cognitive load that really belongs to the subject you are presenting, and the two other groups are more related to the way you present the subject. Decreasing the cognitive load of the presentation frees some mental resources of the consumer of your drawing and makes it much easier to understand. That is really needed in the case of Kubernetes. What is specific about a Kubernetes cluster comes down to two things.
The first thing is, well, we can traditionally think about a cluster of virtual machines as a cluster of machines connected in one network. But a traditional networking-based approach is not so beneficial in the case of Kubernetes. First of all, the network in a Kubernetes cluster is more about namespaces, network policies, granted or not granted access and so on. And, well, virtual machines are still machines; they are somehow virtually connected with each other. But that's not your problem as the cluster creator. It's the problem of Kubernetes itself to manage this, and there is no reason to draw it on your diagram. Instead, you have to draw objects, Kubernetes objects. You have to combine objects into groups, and just connecting them with some network lines is not helpful. Speaking about grouping, the other problem is that sometimes groups should be presented as objects and sometimes groups should be presented as groups. It's really the Kubernetes way of thinking about things, and you have to deal with it. We will see it a little bit later. So let's look at how people are drawing Kubernetes clusters. The first thing to think about is using a black and white drawing, the traditional way. Maybe an old-school Graphviz example would be a good starting point. Well, what's good about it? It's Graphviz, it's recognizable, and people will say, oh, he uses Graphviz, cool. But as you remember, your Kubernetes cluster is not a graph. It's not good to use a network diagram in the traditional way, so using graphs is no good either. By the way, at the bottom you have the source of the drawing on the slide, with regards to the author. Some are using LibreOffice diagrams with blocks and lines. You see not too many lines here. That's good, and it's really better than the others.
Compared with the previous one, it turns out to be really easy to understand. Then you see that shades of gray are used to distinguish groups of objects from one another. Actually, it's a good approach, so grouping should be actively used in black and white drawings as well. And you can see that if you just draw rectangles without using color, it will be a little bit dull, and a little bit empty. You will look at it and think, is it a diagram at all? Okay. If we are saying that shades of gray are beneficial to distinguish groups of objects from one another, then what about colors? Colors are also beneficial. Frankly speaking, this diagram comes from the official Kubernetes documentation from Google. What's good about it? It uses colors. What's bad about it? Never use black text on a blue background, because black on blue is one of the lowest-contrast combinations you can choose. Other examples, just a few of them, just to show that people are actively using colors. This one at least uses white on black, so it is more readable. Well, actually, if you choose colors, a good idea is to use a color wheel, especially if you are more confident with containers than with color combinations. A color wheel allows you to choose colors that have good contrast and are complementary to the main color you choose. You can even follow the color scheme of your website or of your slides or anything like this, and it will give you unique and recognizable diagrams. So a few links to color wheels available online are on the slide, and I would say it's a good thing to use. So, a little bit more about networking diagrams. There's actually a second drawing from the official Kubernetes documentation. What's good? It has stacks. Stacks are also good for grouping. It has some UML-like arrows which are actually not inheritance, just used for some aesthetic reasons, I think. And it's clearly drawn in LibreOffice. So if Google is okay with drawing things in LibreOffice, why not?
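The remark about black text on a blue background can be checked numerically. The following is a minimal sketch, assuming the WCAG 2.x contrast-ratio formula as the measure of readability (the talk does not name a specific metric); the color constants and function names are illustrative choices, not something shown on the slides.

```python
"""Contrast ratio between two sRGB colors, using the WCAG 2.x formula.

Assumption: WCAG contrast is used here as a stand-in for "readability";
the talk itself only says that black on blue is hard to read.
"""


def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an (R, G, B) triple with 0..255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """Ratio between 1.0 (identical colors) and 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


BLACK = (0, 0, 0)
WHITE = (255, 255, 255)
BLUE = (0, 0, 255)

if __name__ == "__main__":
    for label, fg, bg in [
        ("black on blue", BLACK, BLUE),
        ("white on blue", WHITE, BLUE),
        ("white on black", WHITE, BLACK),
    ]:
        print(f"{label}: {contrast_ratio(fg, bg):.2f}:1")
```

WCAG recommends at least 4.5:1 for normal-size text, so a diagram label can be checked against that threshold before the colors are fixed.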
And it uses some traditional networking icons, but just to put something other than rectangles here. A really good example, from Red Hat. You see colors with grouping. You see a few network icons, which are just objects to make the overall drawing more interesting without overloading your mind. A little bit more about networking diagrams. If you have lines and arrows not to show that everything is connected with everything, but to guide the viewer in how to scan the diagram, then in this case networks are really good. By network, I mean lines and arrows. But if you add more icons, it will be a disaster. The diagram becomes really difficult to read. Tiny icons all look similar and get lost. Sequence drawings as well. We were saying that arrows are good for setting an order in which the reader should scan the diagram. Sequence numbers are okay as well. But once again, don't overdo it. Also, it's popular to experiment with the shape of objects. Non-rectangular blocks are rather good. You can use them. The official Google documentation has a number of diagrams with something like this. If you are lucky, it will be a good sample of art. If you are not lucky, it will not. So consider whether you need it or not. Code fragments. When you create slides, it's typical to put code fragments inside your diagrams, some YAML files or something like this. It works rather well. Don't forget about colors. But probably not in web documentation or something like this. One more point is the official Kubernetes icons. Well, blue heptagons specify objects. Blue heptagons specify groups. They are very similar. Is it easy to scan? No, it's not. And this is a good example, sorry. We have much worse examples, which were officially presented as examples to follow. This one was present in the slides on Google Drive. I have noticed it was removed from the README file, but it's still pretty present over the Internet.
It's difficult to close things once you have made them public. So what to use? First, a lot of angled shapes increase the stress level. It's known from psychology. So if you use the official Kubernetes icons, use just a few of them, not too many. And you would probably like to use grouping with rounded-corner rectangles, just like this one, to reduce the overall level of stress caused by these angular icons. And what else? Fewer is better than more in this case. Well, the conclusions. The first thing to take care of is sparing use of icons, which saves your user from visual overload. More text, fewer icons. Like in Unix, you know, text is okay. Then color. Color runs, well, not the world, but it definitely runs your presentation. Just use good color combinations. It's a really good idea to follow. Then round corners will save your presentation as well. They are really important, especially if you use several official icons to let people know that you are, well, a person who knows that these icons exist. Then what about the ideal drawing? Rectangles, just a few lines as a guide for the gaze, then maybe some numbers or some arrows if you need to show a sequence of actions. Then a good idea is to mix the official Kubernetes icons with some other icons if you need more than two or three of them. It sounds strange, but it really works. Then the last piece of advice. Using several simple drawings instead of one complicated drawing is a really good way to present your cluster. Actually, this applies not only to Kubernetes. Several simple drawings are almost always better than one complicated drawing, but in the case of Kubernetes and this additional cognitive load, it's really important. Then, that's probably the last slide. Thank you. Thank you. |
Send in the chown()s
systemd containers in user namespaces |
Okay, then our next talk is by Fraser. Over to you. Okay. Thank you. So I'm going to talk about user namespaces and delegation of control of cgroups in container orchestration systems. So yeah, containers and container standards. This is the containers devroom, so hopefully most people have some idea. Then I'll talk about Kubernetes and OpenShift, user namespaces and cgroups, and then a demo of a systemd-based workload on Kubernetes, and more specifically OpenShift. So a container is a process isolation and confinement abstraction. Most commonly it uses operating-system-level virtualization, where the processes running in the container use the same kernel as the host system. Examples of this include FreeBSD jails and Solaris Zones as well as Linux containers. And the container image is not a container. The container image just defines the file system contents for a container and some metadata suggesting what process should be run, what user ID it should run under, and so on. On Linux, containers use a combination of the following technologies: namespaces (process ID namespaces, mount namespaces, network namespaces, et cetera) to restrict what the process running in the container can see. The container may have restricted capabilities or a seccomp profile that limits which system calls or operating system features it can use. There may be SELinux or AppArmor confinement, and it can use cgroups for resource limits. Not necessarily all of these are used at the same time. It depends on the implementation.
The Open Container Initiative defines standards around containers in the free software ecosystem, and its runtime specification in particular defines a low-level runtime interface for containers. It is not just for Linux containers: it defines the runtime semantics for Linux containers, Solaris containers, Windows containers, and virtual machines. There are a bunch of implementations of the runtime spec, including runc, the reference implementation, crun, and Kata Containers, which is a virtual-machine-based container runtime. The OCI runtime spec has a JSON configuration, and there's a link to an example. It lets you define the mounts, what process is to be executed, its environment, and lifecycle hooks, so extra code to run when the container is created, destroyed, started, or stopped. The Linux-specific features that can be controlled via the OCI runtime spec include the capabilities, that's the kernel feature called capabilities, namespaces, the cgroup, the sysctls (system controls) that should be set for that container, the seccomp profile, and so on. This is what an example runtime specification looks like. It has a process structure, which includes a field for the user ID and the group ID. It has this linux structure, which defines the Linux-specific attributes for this container, and one of those attributes is the namespaces list, which defines the different namespaces that should be used or should be newly created for that container. So Kubernetes is a container orchestration platform. It has a declarative configuration system and it integrates with many different cloud providers. Anyone not know what Kubernetes is in the room? So the terminology in Kubernetes: a container is an isolated or confined process or process tree. A pod is one or more related containers that together constitute an application of some sort, so it might be an HTTP server and a database that together encapsulate the entire web application.
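To make the shape of the runtime configuration described above concrete, here is a minimal sketch that builds a cut-down `config.json` as a Python dictionary. The field names (`process.user.uid`, `process.user.gid`, `linux.namespaces` with `type` entries) follow the OCI runtime specification; the helper name, the chosen `ociVersion` string, the `rootfs` path, and the command are assumptions for illustration, and a real configuration carries many more fields than shown here.

```python
import json


def minimal_oci_config(uid: int, gid: int, args: list, namespaces: list) -> dict:
    """Build an illustrative subset of an OCI runtime config.json.

    Only the parts mentioned in the talk are included: the process
    structure with its user, and the linux structure with its
    namespaces list.
    """
    return {
        "ociVersion": "1.0.2",  # assumed spec version for the sketch
        "process": {
            "user": {"uid": uid, "gid": gid},
            "args": list(args),
            "cwd": "/",
        },
        "root": {"path": "rootfs"},  # hypothetical bundle layout
        "linux": {
            # One entry per namespace to create (or join) for the container.
            "namespaces": [{"type": ns} for ns in namespaces],
        },
    }


if __name__ == "__main__":
    config = minimal_oci_config(
        uid=0,
        gid=0,
        args=["/sbin/init"],
        namespaces=["pid", "mount", "network", "ipc", "uts", "cgroup", "user"],
    )
    print(json.dumps(config, indent=2))
```

Printing the dictionary as JSON gives something close in structure to the example on the slide, with the user namespace requested by the `{"type": "user"}` entry.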
A namespace in Kubernetes terminology is not a namespace in Linux kernel terminology. A Kubernetes namespace is just an object scope and authorization scope for a bunch of objects in the Kubernetes data store, so if you have a particular team or project in your organization, you might deploy all of its Kubernetes applications in a single namespace. And a node is a machine in the cluster where pods are executed. There are different kinds of nodes: there are control plane nodes, and there are worker nodes where the actual business applications run. Kubelet is the agent that executes pods on nodes. There's a scheduling system: when a pod is created, the scheduler decides what node that pod should run on, and Kubelet is the agent on the node that takes the pod specification and turns it into containers running on that node. Kubernetes uses the term sandbox. That's an isolation or confinement mechanism, and there's one sandbox per pod, so there could be multiple containers running in the sandbox. And the Container Runtime Interface (CRI) defines how Kubelet actually starts and stops the containers for the sandboxes. So CRI-O is one implementation of the Container Runtime Interface. That's what's used in OpenShift. There's also containerd, which is used in some other distributions of Kubernetes. So, visualizing this: the whole box is one Kubernetes node. The Kubelet has a gRPC client to talk to a CRI runtime, the CRI runtime does something, and containers appear. We can instantiate the Container Runtime Interface implementation as CRI-O, and then we can see that CRI-O talks to an OCI runtime. It uses exec to invoke the OCI runtime. And we can go one step further and say that the OCI runtime implementation will be runc. And this is the setup that we use on OpenShift.
This is a pod spec in YAML format. We have kind: Pod, and the specification has a list of containers. In this case there's one. The container has a name, and it defines the image to use, the command to execute, environment variables that should be set, and so on. OpenShift, or OpenShift Container Platform, is an enterprise-ready Kubernetes container platform. It's commercially supported by Red Hat, and there's a community upstream distribution called OKD. As I mentioned before, it uses CRI-O and runc, and the latest stable release is 4.12. I think that came out just a week ago or so. By default, it uses SELinux and namespaces to confine the processes. Each Kubernetes namespace gets assigned a unique user ID range, and the processes for the pods in that namespace have to run as those host UIDs. You can circumvent this using the RunAsUser security context constraint, but that is not a good idea. You don't want your containers running as root on the container host, because if they escape the sandbox, then your cluster got owned. So this is the why of user namespaces. User namespaces, and here we're talking about Linux kernel user namespaces, can be used to improve the workload isolation and confinement of the pods in your cluster. They can also be used to run applications that require or assume that they're running as specific user IDs. Or, to phrase this a different way, you can drag your legacy applications kicking and screaming into the cloud and get the benefits of all of the orchestration and networking support that platforms like Kubernetes and OpenShift give you, while still running that workload securely. In other words, trick it into believing it is running as root. And yeah, there have been a bunch of CVEs in Kubernetes and the broader container orchestration ecosystem arising from sandbox escapes, where user namespaces would have prevented the vulnerability or severely limited its impact.
So, visualizing a user namespace: we can have two separate containers, each with a user namespace mapping from the UID range 0 to 65535 inside the container's user namespace to a range of unprivileged user IDs in the host's user namespace. So the processes running in the container believe that they are running as, for example, root, UID 0, or some other privileged user ID, when in fact they are running as UID 200,000 on the host, an unprivileged user ID. Escape the sandbox and you're still an unprivileged user on the host. For Linux, there are some references here to man pages about user namespaces and how to use them. The critical thing is the unshare system call, which is how you create and use a user namespace. In the OCI runtime specification, there are some fields, and again, this is Linux specific, so it's inside the Linux-specific part of the configuration, where you can specify that a user namespace should be created or used for that container, and you can specify the mapping: how do we map the container's user ID range to the host user ID range? User namespaces were implemented in OpenShift before they were implemented in Kubernetes upstream. We did it in CRI-O. It first shipped in OpenShift 4.7, but it required a considerable amount of additional cluster configuration to use it. Since OpenShift 4.10, you've been able to use it out of the box. You do still have to opt in using annotations on a per-pod basis. It requires the anyuid security context constraint, or an equivalent privileged security context constraint, in order to admit the pod, because the admission machinery does not yet understand user namespaces. So the pod spec says, I want to run as user ID 0, and the admission machinery says, uh-uh, no way. We need to circumvent that for the time being, but the workload itself will run in the user namespace. And depending on the workload, it may still require some additional cluster configuration. So, the annotations to opt in: you set io.openshift.builder to true.
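The mapping described above can be sketched in a few lines. This is an illustration only: the `IdMapping` type and helper names are hypothetical, the host base of 200000 is the example value from the talk, and the dictionary keys `containerID`, `hostID`, and `size` are the field names used for `linux.uidMappings` entries in the OCI runtime specification.

```python
from typing import List, NamedTuple, Optional


class IdMapping(NamedTuple):
    """One contiguous range mapped from the container to the host."""

    container_id: int  # first ID inside the container's user namespace
    host_id: int       # first ID in the host's user namespace
    size: int          # number of consecutive IDs covered

    def as_oci(self) -> dict:
        """Render the mapping as a linux.uidMappings entry."""
        return {
            "containerID": self.container_id,
            "hostID": self.host_id,
            "size": self.size,
        }


def to_host_uid(mappings: List[IdMapping], container_uid: int) -> Optional[int]:
    """Translate a UID seen in the container to the UID seen on the host.

    Returns None when the UID falls outside every mapped range.
    """
    for m in mappings:
        if m.container_id <= container_uid < m.container_id + m.size:
            return m.host_id + (container_uid - m.container_id)
    return None


if __name__ == "__main__":
    mappings = [IdMapping(container_id=0, host_id=200000, size=65536)]
    print(mappings[0].as_oci())
    print("container root is host UID", to_host_uid(mappings, 0))
```

With this mapping, a process that believes it is UID 0 is UID 200000 from the host's point of view, which is why escaping the sandbox still leaves it unprivileged.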
That activates a particular CRI-O, what we call, workload, which is basically an alternative bunch of runtime settings. And then we use the userns-mode annotation to specify that we want an arbitrary user namespace of size 65536. So that'll allocate a contiguous host UID range for that container and map it to unprivileged host user IDs. In Kubernetes upstream, it took a bit longer to get user namespace support, and it's still a work in progress, but the initial support was delivered in Kubernetes 1.25, and that is the version we've moved to in OpenShift 4.12. So you can now use the first-class user namespace support in OpenShift. It is an alpha feature, so it's not enabled by default. You have to turn it on with a feature gate. And at the moment, it only supports ephemeral volume types, so emptyDir, ConfigMaps, Secrets. There's no persistent volume support yet. You opt in by putting hostUsers: false in your pod spec, and currently it gives you a fixed mapping size of 65536 that will be unique to that pod. It is hoped that a later phase will deliver support for the additional volume types. The reason we don't have them yet is the complexity around ID-mapped mounts and how to adapt the volume mounts to map the user IDs between the host user namespace and the container's user namespace. There are also very simple heuristics around the ID range assignment. As I mentioned, it's a fixed size of 65536, which limits the number of pods that you can run in user namespaces on a given node, and there are still some other mount point and file ownership issues, for example with the cgroup FS. And that takes us to the cgroups topic. So OpenShift creates a unique cgroup for each container, and it also creates a cgroup namespace, so that the container, in the /sys/fs/cgroup mount, only sees its own namespace. Inside the container, /sys/fs/cgroup actually points to /sys/fs/cgroup/ plus a whole bunch of stuff specific to that container in the host's file system.
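As a rough illustration of the opt-in just described, the sketch below builds a pod manifest with `hostUsers` set to false and works out the upper bound on the number of fixed-size ranges that fit into the host ID space. The pod name, image reference, and command are hypothetical placeholders, and the helper names are invented for this example; only the `hostUsers` field and the range size of 65536 come from the talk.

```python
import json

RANGE_SIZE = 65536        # fixed mapping size mentioned in the talk
HOST_ID_SPACE = 2 ** 32   # default size of the Linux UID space


def userns_pod(name: str, image: str, command: list) -> dict:
    """Build a pod manifest that asks for its own user namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostUsers": False,  # opt in to a user namespace for this pod
            "containers": [
                {"name": name, "image": image, "command": list(command)}
            ],
        },
    }


def max_userns_ranges(id_space: int = HOST_ID_SPACE, size: int = RANGE_SIZE) -> int:
    """Upper bound on non-overlapping ranges; the real limit is a bit lower
    because some host IDs are needed by the host itself."""
    return id_space // size


if __name__ == "__main__":
    pod = userns_pod(
        name="nginx",
        image="registry.example.invalid/fedora-nginx:latest",  # placeholder
        command=["/sbin/init"],
    )
    print(json.dumps(pod, indent=2))
    print("at most", max_userns_ranges(), "ranges per node")
```

The JSON output can be fed to the cluster in the same way as a YAML manifest, since Kubernetes accepts both formats.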
If you want to run a systemd-based workload, systemd needs write access to the cgroup FS, but by default the cgroup FS will be mounted read-only. So, the solution: we modify the container runtime to chown the cgroup to the container's process UID. That is, we chown it to the host UID corresponding to UID 0 in the container's user namespace. But first, before we did this on an ad hoc basis, we engaged with the Open Container Initiative to define the semantics for cgroup ownership in a container. Those proposals were workshopped and accepted, and after that we were able to implement those semantics in runc. So what are the semantics? Well, the container's cgroup will be chowned to the host UID matching the process UID in the container's user namespace if and only if the node is using cgroups v2, the container has its own cgroup namespace, and the cgroup FS is mounted read-write. Only the cgroup directory itself and the files mentioned in /sys/kernel/cgroup/delegate are chowned. These are the ones that are safe to delegate to a sub-process. If that file, /sys/kernel/cgroup/delegate, exists, then the container runtime will read it and only chown the files mentioned there. So it can respond to the evolution of the kernel, where new cgroup nodes may come and go, and some of them may be safe to delegate and some of them may not. In OpenShift, cgroups v2 is not yet the default when you deploy a cluster, but it does work and it is supported, and to activate the cgroup chown semantics that I just explained, we still require an annotation in the pod spec. So let's do a demo. Here's a cluster I prepared earlier, and we can see, oc new-project test, okay, oc create user test, maybe test already exists, okay, we'll just use test. So we can now, oh, well, I'll show you the pod I'm going to create: cat pod nginx, hostUsers false. So this is a pod spec. Let's get some syntax highlighting.
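The ownership rules above can be written down as a small decision function. This is a sketch of the rules as stated in the talk, not the code of any runtime: the function names are hypothetical, and the fallback list used when the delegate file is unavailable (`cgroup.procs`, `cgroup.subtree_control`, `cgroup.threads`) is an assumption taken from my reading of the OCI runtime specification rather than from the talk.

```python
from typing import List, Optional

# Assumed fallback when /sys/kernel/cgroup/delegate cannot be read.
FALLBACK_DELEGATED_FILES = ("cgroup.procs", "cgroup.subtree_control", "cgroup.threads")


def should_chown_cgroup(
    cgroups_v2: bool, own_cgroup_namespace: bool, cgroupfs_read_write: bool
) -> bool:
    """All three conditions from the talk must hold for the chown to happen."""
    return cgroups_v2 and own_cgroup_namespace and cgroupfs_read_write


def paths_to_chown(cgroup_dir: str, delegate_text: Optional[str] = None) -> List[str]:
    """Return the cgroup directory plus the delegatable files inside it.

    delegate_text is the content of /sys/kernel/cgroup/delegate (one file
    name per line), or None when that file is not available.
    """
    if delegate_text is not None:
        names = delegate_text.split()
    else:
        names = list(FALLBACK_DELEGATED_FILES)
    return [cgroup_dir] + [f"{cgroup_dir}/{name}" for name in names]


if __name__ == "__main__":
    if should_chown_cgroup(True, True, True):
        # Hypothetical cgroup path, for illustration only.
        for path in paths_to_chown("/sys/fs/cgroup/example.scope"):
            print("chown to container root's host UID:", path)
```

Resource limit files such as `memory.max` are deliberately absent from the returned list, so limits set by the host on the cgroup directory stay outside the container's control.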
This is going to run nginx. It's a systemd-based workload, so it's a Fedora system that will come up, systemd will run, and it will start nginx. We're setting hostUsers: false so that it will run in a user namespace. I have already enabled the feature flag on this cluster. There's the annotation for the cgroup chown, and the name of the pod will be nginx, so let's create that. So oc, as test, create that. Okay, fingers crossed. Okay, so we'll say oc adm policy add-role-to-user edit, okay, let's try that again. So the pod has been created, and we'll just check its status to see which node it is running on. It hasn't started yet, so we don't have a container ID for it yet, but in the upper pane I'll get a shell on that worker node. Okay, the pod is now running, so we can run crictl inspect with the container ID, and we'll just pull out the PID. So this is our PID. Now if we have a look at the user ID map for this process, okay, here we see that we have a user namespace with a user ID mapping of size 65536, which maps user ID 0 in the container's user namespace to user ID 131072 in the host user namespace. And now we can look at the processes that are actually running there, so we'll do pgrep --ns, which says show me everything with the same set of namespaces as this process ID, and then I'll just pipe that to ps. Let's print the user, the PID, and the command, sorting by PID, okay. So we can see that this container is running, well, init, and then a bunch of systemd processes, and then eventually nginx, and these are running under, see, 131072, yeah, 132071, yeah. So these are running as various user IDs in the container's user namespace, and those are being mapped into the host user namespace as these UIDs.
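The numbers in the demo can be reproduced with a short sketch that parses the three-column format of `/proc/<pid>/uid_map` (ID inside the namespace, ID outside, range length) and translates host UIDs back to container UIDs. The function names are hypothetical, and the sample text is typed in to match the values read out in the demo rather than captured from a real system.

```python
from typing import List, Optional, Tuple

# (first ID inside the namespace, first ID outside, number of IDs)
IdRange = Tuple[int, int, int]


def parse_id_map(text: str) -> List[IdRange]:
    """Parse the content of a /proc/<pid>/uid_map (or gid_map) file."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(f) for f in fields)
            ranges.append((inside, outside, count))
    return ranges


def host_to_container(ranges: List[IdRange], host_id: int) -> Optional[int]:
    """Translate a host-side ID into the ID the container sees, if mapped."""
    for inside, outside, count in ranges:
        if outside <= host_id < outside + count:
            return inside + (host_id - outside)
    return None


if __name__ == "__main__":
    sample = "         0     131072      65536\n"  # values from the demo
    ranges = parse_id_map(sample)
    for host_uid in (131072, 132071):
        print(host_uid, "on the host is UID",
              host_to_container(ranges, host_uid), "in the container")
```

Under this mapping, host UID 131072 is the container's root and host UID 132071 is container UID 999, which is consistent with systemd running its services under several different users inside the container.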
If we look at the logs, we can see that it looks like a regular systemd system has come up, and indeed it has. Let's see, maybe we'll get a shell on the node, on the pod, and yeah, if we have a look at the container's view of the processes that are running, it sees that systemd is running as root or as other systemd-related users inside the container's user namespace, and nginx is running as the nginx user in that container's user namespace. But as we saw, these are all mapped to unprivileged host UIDs in the host user namespace. So that concludes the demo. Here are links to various resources. I have a lot of blog posts on this and related topics, so you can hit my blog and just look at the containers tag. There's a recording of a similar demo, a slightly earlier version. There's a link to KEP-127, which is where all of the discussion around how to do the upstream support for user namespaces in Kubernetes happened. The OCI runtime spec is referenced there too. And that's it, so I think there is time for some questions. Please stay seated until 25 so we can ask questions. Okay, there's one in the back. Do you want to read the one from the chat first? There's a question in chat, I don't see it anymore. It says: why would I want to run systemd in a container? It's cool that it's possible with user namespaces, but I lack an idea for a use case.
So the use case is that you have a complicated legacy workload that runs under systemd or makes assumptions about the environment it runs in, such as the user IDs that the different components run as. You've got two choices. One is to spend a whole lot of upfront engineering effort to break up that application, containerize it, and make it a cloud native application, which is expensive and typically has a long lead time. Or you can just wrap that whole application up in a container and run it, securely, hopefully, in a container orchestration platform, and get the benefit of all of the scaling, networking, and observability features that the orchestration platform gives you, without having to spend that effort to bust your application into a hundred pieces. So I would say that is the use case, and I think it's a compelling one. If you were building applications today you certainly wouldn't do that, but there are 100 million legacy applications out there, and people don't want to break them up and change them. Hi, thanks for the talk. I'm actually doing this right now at my company, but basically the container is running as privileged, so that's why it can access the cgroup. It doesn't use user namespaces, it just runs as privileged. So I was wondering if, using this method, you could set memory.max, memory.high, or other values for some processes in the cgroup running in the container, I mean. I'm sorry, I couldn't hear the question because it's rather echoey. Sorry, yeah, so can you set memory.high, memory.max values, CPU affinities, all these kinds of things you would usually set in the cgroup, can you set them in this particular use case of cgroups in the container? Yes, absolutely, because the container still has its own cgroup namespace, so all of the standard cgroup confinement and resource limit capabilities can be used.
Okay, I guess I got confused by the list of values that were allowed, listed on the previous slide. There was a restricted list of values. Yeah, so those were just the particular files that need to be chowned for a safe delegation of control of that branch of the cgroup hierarchy to another process. You can still set all of the limits on the cgroup directory, and the container won't be able to change those, because those will not be chowned to the container's process UID. Okay, thank you. So I have a question regarding the /sys/fs/cgroup chown. You mentioned it's going to be chowned to the UID of the container, of the container process. Can you do that if you want to have several pods each run their own systemd? Is your question about whether you can do it in a nested way? No, not in a nested way. You have three different pods, and each pod needs its own systemd. Yes, yes, absolutely. So if I created more pods in my demo, you would see that they would be mapped to different host UID ranges, so the limit is only how many of the range allocations you can fit into the host UID range. So the limit will be a little under 65,536, because the size of the host UID range on Linux by default is 2 to the 32, yeah. Okay, I think we have time for one more question, and thank you all for your patience. Thank you, Fraser. I just wanted to know if cgroups v2 by default in OpenShift is on the roadmap yet, and whether or not there's any sort of estimated timescale for that. Yes it is, yep. So there is a plan to eventually move to cgroups v2 as the default. I don't know the exact time frame. Thank you so much for your talk, thank you all for your patience, and I know this sounds weird, but you're free to leave. |
Fedora CoreOS - Your Next Multiplayer Homelab Distro
Using Fedora CoreOS in a Selfhosted Homelab to setup a Multiplayer Server |
We're starting now. Please. Please quiet down. Hello folks. Today we are here to talk about Fedora CoreOS and how you can use it to do some fun stuff. The fun stuff being, you know, hosting your own dedicated multiplayer server so that you and your friends can have some fun. I'm joined here by Sumantro, and I'm Akashdeep Dhar. We both work for Red Hat, but we are Fedora Project contributors, and part of the Fedora Council as well. So we welcome you to this talk. About the things that we want to talk about: the first and foremost thing would of course be the OS, the thing that we run our containers on, and why exactly you should give a damn about it when there is a plethora of Linux distributions out there with their own twists and turns attached to them. The next thing is of course putting that operating system to use, that is, having our own dedicated Minecraft server, and understanding how that process works and how easy or difficult it can be to actually put that to use. We'll put it to use again for a Valheim server too, because guess what? The community is great, and the folks have always created containers, and when it comes to containers, we always have some kind of platform to make use of. And by all means, if you trust me enough, you can totally scan this QR code, and it will lead you to the documentation, so you can follow along with the talk, because this will be more hands-on. We'll be doing stuff over here, and we'll be helping you folks understand what is happening behind the scenes and why exactly we are doing it. Or you can head over to the schedule page of the Containers devroom and find our links over there, so the documentation is over there as well. Speaking of the operating system that we're going to talk about, what exactly is Fedora CoreOS? To begin with, it's a set of packages that's very minimal in nature, but very focused on container-based workflows.
So you won't get to see a lot of bells and whistles out of the box. Sure, there's an option to add those yourself whenever you feel like it, but then again, it's the networking, the container tooling like Moby and Podman, which are installed over there, as well as some tools like a firewall that you would need to make sure that people can actually connect to your containers and do what they want to do. Those are pre-installed, which gives you just enough stuff to get started, and a canvas to add your own packages on top of it and grow upon it as you go. And there's this thing called rpm-ostree, which is based on libostree. The entire file system is transactional in nature, which essentially means that if I were to add packages on top of the existing deployment, well, an existing set of installed packages, I can do so with ease without actually worrying that, oh, what's going to happen if I add this bleeding-edge package? Will my deployment stop working? And if it does, you can always find your way back. It's a Git kind of workflow, so if you understand Git, it's pretty much like that. You can always rebase your deployments, your own collection of packages, at any point in time, and just fall back to any point in time you want to go back to. The next thing is the fact that this is secure as well as scalable. We, of course, would like to have this deployed not only on bare metal servers, but also on a plethora of VMs, each with its own purpose. Now, the way we do that is by the use of configuration, and it would be kind of difficult to configure tens of thousands of machines by hand, so guess what? You don't. You use something called a Butane configuration, which is a human-readable form of something called Ignition, and what it does is let you specify what kind of changes you want to make on that operating system.
Some users that you want to add, some files that you want to create, some networking rules, firewall rules, services to start, stuff like that. You can specify all of that so that on the first boot you get the operating system exactly the way you want it, and you don't really have to install stuff and then configure it, and then do that thousands of times just because scalability is the thing. The next thing that I want to talk about is the fact that this operating system is available not just for x86_64, but for other architectures as well, such as aarch64 and s390x, and we plan on providing support for other architectures in the coming times. Speaking of the minimal set of packages, how minimal are we talking? Let's give that a number. So we have three release streams: stable, testing, and next. They are identified by the number at the penultimate decimal place of the version, so 3.0 is stable, 2.x is testing, and 1.x is next. And you can totally see what those are for and why exactly someone would want to go for next over stable or vice versa. So, for instance, if there's a contributor who wants to develop this thing, who wants to test all the bleeding-edge packages that come to this platform, they know what they can choose. And the ones who really want to set up a server for the people at home don't really want to move around a lot of stuff. They can either go for stable or they can reach out to our friends at CentOS because they have a CentOS team called OS2. So how exactly does this operating system become secure and scalable? I mean, I sure do like giving a business pitch because it's all buzzwords.
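The stream numbering rule described above can be sketched as a small lookup. This assumes version strings made of four dot-separated parts, with the stream encoded in the third (penultimate) part as stated in the talk; the function name and the sample version strings are illustrative and are not taken from a real release list.

```python
# Penultimate version component -> release stream, as described in the talk.
STREAMS = {"1": "next", "2": "testing", "3": "stable"}


def stream_of(version: str) -> str:
    """Return the release stream encoded in a Fedora CoreOS style version.

    Assumes a four-part version such as "37.20230122.3.0" (made-up example):
    major release, build date, stream number, revision.
    """
    parts = version.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated parts, got {version!r}")
    return STREAMS.get(parts[2], "unknown")


if __name__ == "__main__":
    for v in ("37.20230122.3.0", "37.20230122.2.1", "37.20230122.1.2"):
        print(v, "->", stream_of(v))
```

A check like this can be handy in provisioning scripts that should refuse to run on anything other than the stable stream.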
So there are ways to make sure that the way the packages that you have installed, or so to say layered, on top of your existing installation get automatically updated is very much in your control. That essentially means that it can go out in the open and start downloading everything, every new and bleeding-edge thing, if you ask it to, or just hold back because you want a stable operating system and you really want to curate the packages that you end up having, so you can be less wary of the updates that you end up getting. So it's totally in your control, and by all means there are ways to totally disable automated updates as and when you see fit. And the updates for the packages that you have installed don't get applied as soon as they get installed. Rather, you apply those updates when you want them to happen. Usually it's at a reboot because, well, the service goes through all the packages and refreshes them based on the updates that have arrived since the last boot, but you can always do it either live or at a later time as well. So if at any point in time you feel that a certain update has gone through which should not have, you can rest assured, because it has not yet been applied, and you can always fall back to the previous deployment. And oh, I just happened to explain the last point. So that is rolling back whenever you feel like it. Depending on how you want to use this, you can use it on a Raspberry Pi if you have one on your shelf gathering dust, or you can have multiple VMs on that desktop of yours that you think is overkill and that you don't use for anything other than gaming. And of course there are choices to use it in the cloud too, which we totally suggest, because this is something that we want to deploy at scale. So the kind of replicated deployment that you can have depends on the kind of purpose that you want to use this for. Right.
So that's basically about the operating system itself. Now we're going to put that thing into use and understand how we can do some fun stuff, you know, set ourselves up an environment on this laptop itself. I set myself up a virtualization host and I'll configure it the way I want it. So if I want a user, I'll add it there. If I want a server to run in a certain way, maybe allow for no more than 32 people, I'd do that too. And by all means, here again, this is a thing that you can follow along. So feel free to read the documentation by scanning the QR code and we'll move on to the next one. Right. So the setup environment that we have in place is VMware Workstation. We really wanted to make sure that things are a lot easier and of limited scope for this presentation. But you can use QEMU, you can use VirtualBox, you can use anything that you want, or if you really want to nail it, you can make use of bare metal too. And the specifications that we provided for it are listed over here. For this case, because we want a server that actually runs, that won't get a lot of packages and a lot of updates down the line, and that runs in the long term, we have gone with the stable stream. Right. So I'm going to exit out of the slides for a bit and go into more detail about the stuff that I talked about over here. Right. So speaking of the demonstration, I have an entire directory of files that I need. Now these can be firewall rules, the systemd units that I want to enable on the first boot, the packages that I want to install, the configuration for the swap that I want to put into place, stuff like that. So to get started, like I mentioned, we require a Butane configuration. Now what exactly would that be? Let's find out. Butane config, and of course mine.bu.
Well, basically it's just a list of directives that lets you do the stuff that you want to make happen. So if you want to create users with a set password for them, add SSH authorized keys, stuff like that, you can have them over here. The same goes for storage: the directories, the symbolic links that you want to create from one place to another. And you can have files that you source from a remote location and place somewhere else. Then finally we get to the systemd units part, where you can actually declare services. Now these services can be for installing packages, adding firewall rules, enabling containers, stuff like that, and you can totally control them the way you want. And finally, you know, there's the ordering of what needs to be done first, which you can express with systemd ordering and dependency directives, like Before, and you can mention them over here as well. Right. So as you can see, we have roughly three systemd units mentioned over here. The first, of course, is to install podman-compose and firewalld. We have Podman pre-installed but not podman-compose, so we might as well get it. And the next is to allow the Minecraft server through the firewall. Before that we would, of course, reboot because, like I mentioned, your updates only get applied when you want them to be applied, which in this case, by default, is on reboot. There's also an option to apply them live, but you'd want to use that for applications like, well, a text editor or something of that kind, and definitely not for something that will end up becoming a service of its own. And then finally, starting the dedicated server, now that the firewalld service and the rules we needed are all in place. So just to avoid using more time during the presentation, what we did was, well, we did that beforehand.
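The shape of such a configuration can be sketched as a plain data structure. The snippet below is an illustration only: the user name, key, file path, unit name, and the spec version `1.4.0` are assumptions made for the example, not the values used in the demo. Butane configs are normally written as YAML; JSON is printed here because it is also valid YAML.

```python
import json


def build_config(user, ssh_key, units):
    """Assemble a Butane-style config as a dict (illustrative values only)."""
    return {
        "variant": "fcos",
        "version": "1.4.0",  # assumed spec version
        "passwd": {
            "users": [{"name": user, "ssh_authorized_keys": [ssh_key]}],
        },
        "storage": {
            "files": [
                {
                    "path": "/etc/hostname",  # hypothetical file to create
                    "mode": 0o644,            # serialised as decimal 420 in JSON
                    "contents": {"inline": "gameserver\n"},
                }
            ],
        },
        "systemd": {
            "units": [
                {"name": name, "enabled": True, "contents": contents}
                for name, contents in units.items()
            ],
        },
    }


# Hypothetical unit; the real demo units are not reproduced here.
example_units = {
    "hello.service": (
        "[Unit]\nDescription=Example unit\n\n"
        "[Service]\nType=oneshot\nExecStart=/usr/bin/true\n\n"
        "[Install]\nWantedBy=multi-user.target\n"
    )
}

config = build_config("gamer", "ssh-ed25519 AAAA...example", example_units)
print(json.dumps(config, indent=2))
```

Keeping users, storage, and units in one document is what makes the first boot reproducible: the same file describes the whole machine.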
And now what we have over here is the IP that we can use to connect to the container, on the machine that runs the firewalld service as well as the Minecraft dedicated server. To go into more detail, I'll show the status of the units that I mentioned, like the one for allowing the server through the firewall. Depending on the condition that you set, you can make them run only once: if the firewalld service has already been enabled, you don't want to enable it again. So you can always put a flag of some kind saying that if a certain condition is satisfied, which it seems to have been, it won't run again. So the next thing that I want to show you is, of course, the server itself. So if I were to follow a certain unit, let's just say start Minecraft server, but I'm going to save myself some effort and go like that. So we have this container right here that's running on Podman. And yeah, there's this internal IP address as well that lets us connect to it. And finally, about the service that lets you install both podman-compose and firewalld. We'll head over here. Where's the terminal? There it is. And go, of course, to mine. Just a moment. And under systemd, system, the unit allowing the Minecraft server through the firewall. So we have also set the condition which says that once that thing is done, you create a file called done, allow Minecraft server to firewall. So with services like these, we make sure that the service runs exactly when we want it to and not any more than that. So once it's done, it's done. And of course, the first-boot condition falls in line for the unit that helps us install these packages, podman-compose and firewalld. So the condition for the first boot is true, but you reboot after this thing has completely finished. And by that, we apply these packages on the existing deployment. Right. Let's say we'll go over here and we'll check the IP address again.
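The run-once pattern described above (skip the unit if a marker file exists, create the marker when the work is done) can be sketched by rendering the unit text. The helper name, the stamp path, and the firewall command with port 25565 are assumptions for illustration; the exact units from the demo are not reproduced here.

```python
def render_run_once_unit(description, command, stamp_path, after=()):
    """Render a oneshot systemd unit that is skipped once stamp_path exists."""
    lines = ["[Unit]", f"Description={description}"]
    for dependency in after:
        # Ordering only: start this unit after the named unit.
        lines.append(f"After={dependency}")
    # The leading "!" negates the check: run only if the stamp is absent.
    lines.append(f"ConditionPathExists=!{stamp_path}")
    lines += [
        "",
        "[Service]",
        "Type=oneshot",
        f"ExecStart={command}",
        # Leave the marker behind so later boots skip this unit.
        f"ExecStartPost=/usr/bin/touch {stamp_path}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


unit_text = render_run_once_unit(
    description="Allow the game server through the firewall (example)",
    command="/usr/bin/firewall-cmd --permanent --add-port=25565/tcp",  # assumed port
    stamp_path="/var/lib/done-allow-server-firewall",                   # hypothetical
    after=("firewalld.service",),
)
print(unit_text)
```

Because the stamp is only written after the command succeeds, a failed first attempt is retried on the next boot rather than silently skipped.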
This happens to be, this is not forwarded. So even if any of you folks have Minecraft installed, I'm really sorry that you won't be able to connect to this one, for security purposes. But I'm going to connect to it and see what kind of world it gets me into. So we have the IP 192.168.234.129. Let's see if it's reachable. Well, actually it's kind of a firewalld thing: if the service runs and if the rules have been applied, we would be able to. Hmm, just a moment, folks. Oh, it seems to have run. Now, to have a plan B and a plan C: I have heard stories of folks losing their presentations. They carry like three flash drives. So I also thought of deploying one in my home, but we probably won't need that because, guess what? We have a server that says it's running on this host, on, well, the FOSDEM setup that we have here. Well, the other one, the plan C, of course, does not happen to be a plan C anymore because it's something that I can't reach. So I'm going to connect over to this one. Right. So, the worst thing that can happen to a player in the entire Minecraft world has happened. But then again, if I were to Alt-Tab out and check the logs of the service, I should be able to see that I indeed have connected. So, you know, folks can totally get creative with what they can do with this. They can have their own servers hosted locally, maybe have an OCI cloud instance do some reverse proxying so their friends can connect to it as well. The possibilities are endless. And when it gets deployed at scale on the cloud, it just goes to the next level. And it's not just for gaming: if you want to do a local deployment of Plex, or anything for that matter which uses containers, it is possible. Now I'm going to hand it over to Sumantro to talk about the Valheim setup, so, back to the presentation. Over to you, Sumantro. Hey, guys.
So, much like the Minecraft setup, we have the Valheim setup, which is basically setting up the environment variables, configuring the host, and making it work. Technically all the documentation has been put out behind that QR code. So, if I go over here. Yeah, so all the required files have been put out here. So, like Akash explained, we have a Butane config. So, if I go inside, you have the Butane configuration from which this Ignition file is generated. So, if you look at it over here, the Butane configuration has the storage, the directories, all the files we need, any rules, and finally the systemd unit services that need to run, much like the case of Minecraft. If I go back, this is actually the Ignition file. This is, behind the scenes, how CoreOS looks at that configuration and parses it for its own purposes. So, this is perfectly machine readable and not supposed to be edited by hand. But, you know, if you guys want to change something, that's supposed to happen on the Butane side of things. Coming back to the configuration, there's a root. There's etc, systemd, and then the zram generator configuration. And this one is the swap-on-zram service. By default, we have put zram0 as the default setting. It requires a bunch of RAM, and we have put one of those services over there just to ensure things are going fine. Going back to the systemd units, there is the one allowing the server through the firewall. That's exactly the same. If I open it, that's a very basic thing. Going back, we have the start Valheim dedicated server unit, and this one has the podman-compose parts; that's an exec script with the up and down. Coming back to the point, one more thing: we actually hosted it on private servers back at home. The way that we can get it up and running right now is by reaching, where was the terminal? Go ahead. So, in the interest of time.
I should probably practice turning on microphones before speaking. In the interest of time, what we'll do is we'll just not show the Valheim demo. Unfortunately, apologies for that. But this is totally reachable on the VM that we have set up over here, and the port that it runs on is reachable. But the point is that these things are very possible, fun stuff, and they give you a reason why you would want to try a new workflow where the entire file system, the entire packaging workflow, is nothing but a Git kind of a thing. So you can always roll back or roll forward, depending on what you want to do. And the best thing that you can do is, well, try it out today if you have a VM to spare or a device to do so. Right. So that's about it for the presentation. We'd really love to have your questions and answer them too. We have a bunch of time for questions. So thank you very much for your presentation. I had a question about the relationship between Fedora CoreOS and persistent storage. My understanding is that when you're working in containers, you want everything to be ephemeral and temporary and not worry about that. But like you mentioned, if you have some sort of media server, how would you address that sort of persistent Btrfs data pool? Is that part of Butane, or how is that configured and managed? The way we have it is by setting up mounts. The way it works is that anything that gets affected by the installation, removal, or updating of packages is very transient in nature. So these can get affected. But if you have mounts that lead to some different place, they most likely won't be; the home directory would most likely stay untouched. So as long as it does not have anything to do with a certain package, if it's not a file that gets introduced when you install a package or something of that kind, it most likely won't be touched and it will stay safe. Any more questions? Okay.
What is the relationship between Fedora CoreOS and Fedora IoT? And does it make any sense? Has everyone got that? What's the difference between Fedora CoreOS and what? Fedora IoT. So for the record, Fedora IoT was an official edition for a long time, which means Fedora would push updates regularly. CoreOS recently became an edition, which was a release back, and that brings into question the release criteria for IoT and the boards that it supports. They were very officially declared as supported. But in the case of CoreOS, there was no such official support given before. It was never made an edition. So that's one thing that you're missing out on. The second thing is that IoT is released every six months with the Fedora release. Fedora CoreOS, in this case, has a stream cycle, which means that if today we get a next stream, in 15 days that becomes testing, and then in the next 15 days it becomes stable. And that's how it's going to roll. Obviously, the next stream is tested by CoreOS's own CI, which runs almost all the basic tests required for the thing to run, and both are based on OSTree. But again, every 15 days, CoreOS updates to the next stream or moves through the streams. In the case of IoT, you get it every six months. Hi, thanks for the talk. I would love to see this kind of bootstrapping of CoreOS happening on systemd-nspawn, for instance. Would that be feasible, like having that Butane declarative way of saying, just instantiate me something under systemd-nspawn? Is that something that can work already? It technically can. But then again, we have to understand whether the environment inside systemd-nspawn will have systemd or not. So it's very much possible. And one of the use cases that I've seen for it is building packages or testing them, for that matter.
So the very fact is that, you know, you don't really have to configure it when it's up, but rather decide how it's going to look from the get-go, and that deployment can be replicated anywhere. So that really makes it a prime image of some kind. And it does not even have to be a container image, right? It can be based on a virtual machine or systemd-nspawn, as you mentioned, and you can have that kind of a blueprint. But guess what? It's not a container. It's a full-blown operating system, which runs like it would on a bare-metal node. Okay. Thank you. Okay. This is the last question we can take. So I was wondering, this Ignition file is only applied on the first startup. Can we make some kind of declarative configuration for Silverblue? Or, not for Silverblue, for CoreOS? Is this also supported? So if I get your question correctly, you are wondering if a certain configuration can be run again, if I want to, on the same deployment. Right. So it's totally possible. You know, the thing that you end up getting after running these many steps, after running the Butane configuration, is a deployment. So what you can do is use that deployment and run a Butane configuration on top of it. So that becomes your base deployment. And anything that you add on top of it is the resulting deployment. So what a deployment means in this case is the state which the operating system is in right now. So, yeah. So if you just mangle the state to create a new one, that state becomes your existing state. That's about it. One last thing, guys. We have a Fedora booth. Feel free to go there, grab swag and whatnot. And thanks for attending. Thanks. Hey, guys. |
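The stream naming and cadence described in the talk and in the Q&A above can be summarised in a short sketch. It assumes only what the speakers state: the penultimate number of the version identifies the stream (3 stable, 2 testing, 1 next), and content moves from next to testing to stable in 15-day steps. The sample version string and the function names are illustrative.

```python
from datetime import date, timedelta

# Mapping taken from the talk: penultimate version field -> stream name.
STREAM_BY_FIELD = {"1": "next", "2": "testing", "3": "stable"}


def stream_of(version: str) -> str:
    """Return the stream implied by the penultimate field of a version string."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"not enough version fields: {version!r}")
    return STREAM_BY_FIELD.get(parts[-2], "unknown")


def promotion_dates(entered_next: date, interval_days: int = 15) -> dict:
    """Dates on which content that entered 'next' would reach each stream."""
    step = timedelta(days=interval_days)
    return {
        "next": entered_next,
        "testing": entered_next + step,
        "stable": entered_next + 2 * step,
    }


print(stream_of("37.20230122.3.0"))          # illustrative version string
print(promotion_dates(date(2023, 2, 1)))
```

The interval is a parameter because the 15-day figure is the one quoted in the session, not something this sketch verifies.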
Deploying Kubernetes across Hybrid and Multi-Cloud Environments Using OpenNebula |
Okay. I am Marco Mancini. I am a solution architect at OpenNebula Systems. OpenNebula Systems is the company behind OpenNebula, the open source software. So today, I will talk about how you can easily deploy Kubernetes clusters on hybrid and multi-cloud environments by using our open source solution. So let me introduce OpenNebula. OpenNebula was born around 14 years ago as a solution for private cloud computing, you know, for on-premises. It has evolved over the last years, and now we provide a solution that allows you to manage different types of workloads, going from virtual machines to application containers to Kubernetes clusters, along what we call today the data center-cloud-edge continuum. So you can have resources on-premises, you can have resources on public clouds or on the far edge and so on. So what we would like to do with this open source solution, OpenNebula, is to provide you with a simple way to manage different types of workloads along, let's say, this cloud-edge continuum. And so you can minimize the complexity of managing these workloads. You reduce the consumption of resources because you can manage different workload types across different kinds of resources and so on. So, at the core of OpenNebula, we use different virtualization technologies. We go from supporting VMware and KVM for virtual machine workloads, to LXC for system containers, to Firecracker, where you can manage micro-VMs and deploy container-based applications. And we manage these technologies with advanced features: you can have multi-tenancy and self-service, and you can provide resources on these different virtualization technologies and so on. We have a graphical user interface where you can manage all your resources across, as I said, this continuum, and we have also integrated different third-party tools, going from Terraform to Ansible to Kubernetes and so on.
So our vision for multi-cloud is that we would like to provide an easy way to automatically provision resources across multiple cloud providers. So what we have built is a tool called OneProvision. You can see at the bottom that we have a graphical user interface, but there is also a command line interface. So you can create resources on different providers. At the moment, we support providers that have bare-metal servers, like AWS and Equinix. But, yeah, we can support other providers. We just need to write some drivers that allow us to provision resources across those providers as well. Behind the OneProvision tool, we use open source tools like Terraform and Ansible. So with this OneProvision tool, we can build what we call edge clusters. An edge cluster for OpenNebula is an abstraction where you have computing, storage, and networking. So once you provision an edge cluster, every cluster can be managed uniformly from one place, which is our Sunstone graphical user interface or our command line interface. And so from just one panel, you can manage all your clusters across, for example, different providers or your on-premises resources. And then finally, what we have is the concept of a marketplace, where you can have appliances, and we have also integrated Docker Hub, so you can also have Docker images that you can deploy. So you can deploy virtual machines, multi-VM services, containers, and Kubernetes clusters across these different resources that we have provisioned. So this is how we manage, let's say, a multi-cloud environment: by using this OneProvision tool and then our graphical user interface and the marketplace. So let me also introduce how Kubernetes is integrated in OpenNebula. So for us, Kubernetes is just a service. So we have built an appliance.
I will talk soon about how we have built this appliance. So as I said, you can manage Kubernetes by using our tool for managing any application, right? And then you can deploy on different edge clusters, right? So you can exploit all the features that we have. Since we have a multi-tenant environment, you can deploy Kubernetes clusters for all your tenants within OpenNebula. If you want to deploy Kubernetes clusters on the same physical resources that are shared, they will be deployed in a secure way because you deploy them by using our virtualization technologies and so on. And also you are not locked in to any vendor, because you can deploy your Kubernetes clusters on any, let's say, cloud, edge, on-premises, or far-edge provider that you would like to integrate within your enterprise infrastructure. So the way we have integrated Kubernetes in OpenNebula is that we have defined an appliance. It's called OneKE. This is a complete Kubernetes deployment. It's based on RKE2 and we use version 1.24 of Kubernetes. So when you deploy this appliance, you have all the features included. You don't have to deal with managing the deployment of a storage solution or ingress controllers or load balancing. At the moment, we have used these technologies; on our roadmap, there are some features that we would like to include, especially a better integration with some of the features that OpenNebula has. But at the moment, yeah, we have this kind of solution that is based, as I said, on RKE2. These are the components of the OneKE appliance. It's based on OneFlow. OneFlow is a component in OpenNebula that allows you to define multi-VM applications. So in a OneFlow service, you can have different roles. In this case, for the Kubernetes appliance, we have defined different roles. For example, we use the VNF role. This is the load balancer for the control plane.
But it also does NAT and routing because we have two networks within our appliance. One is the public network and the other is the private network between the different components. So this VNF also allows the different VMs within the private network to communicate outside, to the public network. Then we have the master role. Its role is to manage the control plane: the etcd database, the API, and so on. Then we have the worker nodes that you can use for any workloads that you want to deploy on your Kubernetes cluster. And then finally, we have the storage nodes. These are dedicated, so they will not be used when you deploy workloads; they are used just for your storage needs. And we use Longhorn for persistent volumes within the OneKE service. As I said, the VNF, this virtual network function service, provides a load balancer. You can have multiple VNFs, in a high-availability mode. Take into account that OpenNebula offers you the abstraction of virtual machine groups. Usually, for a highly available solution, you would like to deploy your virtual machines on different hosts. So you can use OpenNebula VM groups, and then, by using some affinity rules, your VMs will be deployed, for example, on different hosts, so you have a highly available solution. And this is valid for any role that you have seen before: for any role, you can use these VM groups in order to be highly available. OneKE by default creates just one VM for each role, but you can modify and scale the solution to have multiple VMs for each role. So this is the VNF. As I said, for the persistent volumes, we have these storage nodes where we deploy Longhorn. So you can have replicas of your volumes on different VMs belonging to the storage role. Then there is access to the services that you deploy within your Kubernetes clusters.
For that, you can use an ingress controller. We deploy an ingress controller based on Traefik. This can be used for the HTTP and HTTPS protocols. And then you can access the service by just defining an ingress for your service. And we have also integrated MetalLB for the load balancer service type. In this case, you can use it for other kinds of protocols that are not HTTP or HTTPS based. Yeah, I would like to move on because, yeah, I have five minutes now, more or less. I have prepared a demo to show you how you can use OpenNebula. So I will show you how to use OneProvision in order to provision resources on AWS and Equinix. And then we can deploy a Kubernetes cluster on both edge clusters on these two public cloud providers. And then you can just access one of the Kubernetes clusters and deploy an application. Let me go to the demo. Okay, so this is the Sunstone graphical user interface that you can see here. If we go to clusters, we have just the default cluster. There are only datastores; there are no hosts. So at this moment, we have just our front end without any resources. Now what we do is go to OneProvision. We have already defined two providers, one for Equinix and one for AWS. And once you define these providers, you can create clusters on the two providers. So we are going to create a cluster, for example, in AWS. In this case, we have defined a provider for AWS in the London zone. And this will now create an edge cluster on AWS. As I said, we use Terraform and Ansible to create the resources and to configure them in such a way that you get an edge cluster for OpenNebula. And then here I'm going to create another cluster, this time on Equinix. Clearly, you have some parameters. You can define the number of hosts, you can define the number of public IPs that you would like to have, and so on. Okay? By the way, you can define two types of clusters with OneProvision.
One is an edge cluster, the basic one, or you can also create a Ceph cluster, a hyperconverged cluster. As you can see here, once you use OneProvision, in the Sunstone graphical user interface you will see the hosts that are going to be provisioned. It will take around five to ten minutes, depending on how much time the cloud provider needs to create the resources. But once the resources have been created, you can see the two clusters here. To use the Kubernetes appliance, we have to define a couple of private networks, one for the Equinix cluster and one for the AWS cluster. And doing this is simplified because we create a template, then you just instantiate the template, and you can create the private networks for both AWS and Equinix. We need the private network for the internal VMs, the roles like master, storage, and the worker nodes, and then we need the public network for the VNF, which is our main endpoint for accessing the Kubernetes clusters. Now what we are going to do is import the OneKE appliance from our marketplace into our OpenNebula. You need to do this just once. So we are going to import the appliance. And once you import the appliance, what will be imported are the templates for the VMs of each role and the template for the service. This service is based on OneFlow. The images related to the different roles are imported as well. So in order to create a new Kubernetes cluster, what we have to do is just instantiate a service by selecting the appropriate networks, for example. So in this case, you can see that I'm now creating a cluster on AWS. So I select the AWS cluster public network for the public network and the AWS private network for the private one, and then I just have to enter a couple of internal IPs. These are for the internal network, for the virtual IP of the VNF and for the gateway. And we can do the same for Equinix.
So, just by selecting the public network of Equinix and then the private network that we have defined. Also, in this case, I've used the same internal network addresses for both clusters. And here you see that we are now deploying the two Kubernetes clusters on the two different edge clusters, on AWS London and on Equinix. As you see, the first role that is deployed is the VNF. In OneFlow, you can define dependencies. And once the VNF is ready and running, OneFlow is going to deploy the other roles: the master, the worker, and the storage node. In order to access the Kubernetes clusters, you have to use the public IP of the VNF. And you can use SSH agent forwarding by, you know, first connecting to the VNF and then connecting to the master using its private IP. Okay. Here we can see the nodes. As I said, by default you have one node for each role. Clearly, this is not for a production environment. If you want a production environment, you need to scale each role, for example. So here, I just create an image, and I have also prepared a YAML manifest file for exposing the service through the ingress controller. And then you can use the public IP of the VNF to access the service. Okay. Clearly, OpenNebula doesn't have any tools for managing the deployment of applications on Kubernetes. We manage the infrastructure and the deployment of the Kubernetes cluster. Then you can use kubectl, you can use Rancher, you can use other open source tooling that, you know, maybe in the future we can also add. As you can see here, by using the public IP of the VNF, I have access to nginx. Another thing: you can scale the roles once you have deployed. In this case, I can scale, for example, the worker role. You just put the number here. We use OneFlow, and OneFlow allows us to scale the cluster for each role. Okay. And now you can see another worker is going to be deployed. Yeah, this was the demo and I think that's the conclusion.
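The deployment order shown in the demo (the VNF first, then the remaining roles once it is running) is a dependency ordering, which can be sketched with a topological sort. The role names follow the talk; the dictionary layout and the `deployment_waves` helper are hypothetical and are not the OneFlow template format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each role maps to the roles that must be running before it starts.
ROLE_PARENTS = {
    "vnf": [],
    "master": ["vnf"],
    "worker": ["vnf"],
    "storage": ["vnf"],
}


def deployment_waves(parents):
    """Group roles into waves; every role in a wave can start in parallel."""
    sorter = TopologicalSorter(parents)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(deployment_waves(ROLE_PARENTS), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

With these dependencies the VNF forms the first wave on its own and the master, storage, and worker roles form the second, which matches the order seen in the demo.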
Okay. Thank you. Thank you all. Okay. |
Touring the container developer tooling landscape |
our next talk. I guess I don't have to give a big introduction. A lot of you know Phil, right? He's going to talk about touring the container developer tooling landscape. All right. Hi everybody. I think I'm on. Yeah, you're on. Yeah, so thanks for coming. It seems like FOSDEM is back. We've got a packed containers room. I think my talk is mostly a warm-up for Akihiro after me. So, a lot of familiar faces here. A lot of good talks so far. Maybe this will be a little bit different from the last few talks. We've been talking a lot about containers and various environments, but we haven't really talked about tools. You've seen a few tools used in some of the demos. And so I'm just going to talk through where we are these days with development tooling. It's interesting that it's 2023, an interesting year in that we're now 10 years on since Solomon Hykes gave this demo during a lightning talk at PyCon. I think it was like April. So we're getting pretty close to 10 years since someone saw Docker being run at the command line for the first time. And what an interesting demo it was, because he misspelled Hello World and that's now permanently in the history of the Internet forever. We've been using containers for a long time now. Apologies to those who are big Solaris or BSD fans; obviously containers and the technologies behind them existed in other operating systems before Docker. But essentially, at this point, everything that's been demoed today has been on Linux. There was a great intro in one of the earlier talks about namespaces and cgroups. This picture is old because people keep creating new namespaces. So it doesn't work anymore. This was a cool image way back in the day because it was the perfect number to create a flat-packed box. So you create your box out of the namespaces and then you shape your box with cgroups. What size do you want it to be? What limits do you want to place on it? And apologies to my friends at Microsoft again.
There are containers on Windows as well these days. But again, for the lion's share of use cases, these are the features and the technologies we've been using to create containers. But let's not forget there are other pieces to the puzzle, whether you're using Docker or some other runtime. There's SELinux or AppArmor in use. There are seccomp profiles. The images we've been constructing, millions of them, are constructed around Linux concepts, libraries, and binaries; they are basically Linux user-space file systems. And then there are the Linux capabilities that we add or remove or default in our container runtimes. So again, all these things are very Linux specific. And yet, you know, where are developers developing these containers? What tools are they using, on what platforms? And I got a little nervous coming to FOSDEM because I thought, oh boy, everybody in this room... Linux on the laptop actually is alive and well at FOSDEM, but not so much in other places in the world, as many of you know. I spent way, way too long trying to create this slide because I kept trying to find better data on the split of which operating systems developers are using. It's pretty easy to find that Windows is still very heavily used if you work for a large company. They may hand you a laptop and enforce that you use a very specific image of Windows, locked down in various ways. Mac has been growing in popularity for a long time now. A lot of developers use Macs, myself included at the moment. The problem is it's really hard to gauge how many people use Linux. If you look at the Stack Overflow developer surveys, you get numbers as high as 30 or 40% in the past few years, but the way they're asking the questions, it's hard to know if people are saying I'm developing in a Linux instance somewhere in the cloud or I'm actually running Linux on my laptop.
And since we're in Brussels, if you're at dinner somewhere, it turns out someone might overhear your conversation at dinner and they're also at FOSDEM, so someone pointed me at a new data source: JetBrains has a developer survey that they've been doing for a number of years and they had slightly different numbers. They had 60% for Windows, and Mac and Linux were actually almost exactly the same at around 25 or 26% each. So regardless, we know that people are on various platforms and they want to develop Linux containers. The easy solution is, hey, we have tons of virtualization options. I don't know, it looks like a lot of younger people here. When I was a developer and VMware came out, I was like, wow, this is magic. I'm able to run this other operating system on my laptop. Parallels is out there, KVM, VirtualBox, Vagrant, all these options to be able to run a VM. And obviously that's one very simple solution to: I need to run Linux, but the physical machine that my manager gave me or that my work provides can't run Linux, so I'll just run a VM. But this solution brings about some new problems because now I have another OS image to manage and it's got updates and maybe security issues. And so now I'm managing my laptop or my desktop and also this other OS. I have these VM boundary issues. So I'm on my host and I've checked out some source code, but it's not in my VM and I've got to figure out file sharing and how to do networking. I want to run a container in the VM and I want to access it, and now I have to figure out how this works with the network. And there's also my developer workflow. There are some inhibitors: this thing's in a VM and I have a tool I want to run, but it's only on my host. And so again, it becomes potentially clunky to operate in these two worlds. Way back in the early days of Docker, one of the solutions that someone came up with was Docker Machine.
It was this really nice simple way to do this VM management on your behalf. You export your DOCKER_HOST variable and point it to the right place. And all of a sudden it seems like you're using Docker on the host and all the magic of the VM management is done for you. It was fairly simplistic, and so over the years pieces of that are what became Docker Desktop. This is a screen grab, I think, from one of the Windows versions. But again, in 2016, 2017 and beyond, a more complete solution was developed for Mac and Windows. It also included a ton of other tools. So you didn't just have your runtime. You had Docker Compose, you had image signing from Notary. You had a full Kubernetes cluster that you could access that was also being managed by this VM. So again, there were people trying to make this easier for the developer who wasn't on Linux to develop their Linux containers that maybe they were going to deploy into a production environment that ran Linux somewhere in the cloud or in a data center. So this was great. It felt seamless to the developer. It felt like I was running my container commands locally. I'm doing docker build, docker run. For the file sharing and networking, people smarter than me had figured out the magic of all this pass-through so that it just seemed seamless and easy to use. And now there was bundling of these other tools: relevant things that I needed to use were already there in the VM for me. Meanwhile, not everyone was using Docker. We had the containerd project, and I'm wearing my containerd t-shirt, but also a sweater so you can't see it. We have Podman. There have been some demos today that have used Podman. Red Hat was building their own suite of tools with CRI-O and Podman. And I don't know if we have any high-performance computing (HPC) folks in the room, but Singularity, now known as Apptainer, was gaining popularity.
So again, there were these other technologies, other runtimes, other tools that people were using, and maybe Docker Desktop was not really meaningful to that group of people. And so over the years, other solutions for those other runtimes have been developed. Obviously Podman Desktop is one of those. There was just a new release lately. Pieces of how to run Podman on your Mac and Windows have existed for longer, but the official Podman Desktop project has been around for about a year. You get Windows, Linux, and macOS support. It has Kubernetes. It has a plugin system. They just recently developed a new DNS and networking service that is a little more amenable to desktop and laptop environments than using CNI plugins. And again, it's built around tools that have been in development for many years: Podman, Buildah, Skopeo, and the containers libraries that these are built around. And because those things were built with certain features, like the rootless and unprivileged work and the daemonless runtime with Podman and crun, you get all those same features, but now you can run it on your Mac or your Windows system if that's your local developer environment, and you get all the same capabilities as if you were using Podman on a Linux system. So you get all the Docker command line compatibility that Podman originally developed, but with libpod you also get the Docker API, which may be important for tools you're using that try to integrate directly with the Docker API. If you're not in that world, there's Lima, nerdctl, and containerd as a stack of projects. nerdctl, similar to Podman, provides you that same Docker command line interface with Compose support. It uses QEMU for virtualization; this is the Lima component. That handles, again, the file sharing and the network pass-through via some additional projects that are part of the Lima scope.
Again, this is all focused on macOS so far today. I think there are some discussions around Windows support, and Akihiro is here and may be able to speak more to that than I can. One of the benefits of being built around containerd is that this stack can also expose experimental features like lazy-loading snapshotters, image encryption, and other sub-projects of containerd that are out there today. nerdctl, as it's packaged by default, gives you rootless unprivileged mode, so if you run it through Lima, you're getting, again, rootless unprivileged containers running underneath that on Mac. A few projects that are built on top of that are Rancher Desktop and Colima. Many of you have heard of the Rancher suite of projects and products. They created a desktop platform that builds on the Lima foundation for their macOS support. Both of these projects also found that some of their user base either needed the Docker API or had very specific ties to Docker. So you can get both of these projects not just with containerd and nerdctl, but also with the Docker engine. In fact, Colima, if you install it with the defaults, installs Docker. They both provide Kubernetes clusters. So again, if the combination of local development environments and Kubernetes is important to you, they both provide that. Rancher Desktop also adds Windows and Linux support, and that's not using Lima underneath. The last project I wanted to talk about came out of my team at AWS. This is a project we launched in November last year, so just a few months ago. It's called Finch, and it builds on the same stack as Rancher Desktop and Colima: we're using Lima, nerdctl, and BuildKit to provide that Docker command line, Docker build support, and Docker Compose support inside a VM on your Mac. There's Homebrew packaging and there are Apple-signed installer packages for that. It supports ARM64 and Intel.
And also, because of QEMU and its capabilities, no matter what your host CPU is, you can build containers for Intel or ARM64. And again, the host itself can be either an Apple Silicon (M1, M2) or an Intel-based Mac. So again, we're a young project. We're planning an extension framework similar to Podman Desktop and Docker Desktop, because we want that same model where you can add features and capabilities to extend Finch to other use cases without having to add them to the Finch project itself. And we're also planning, similar to Rancher Desktop, to add Windows and Linux support. Obviously, we're not really building a completely new tool. We're packaging mostly existing components. So we're working upstream. There's myself and a few other containerd maintainers. We're working in Lima. We have a few pull requests merged in Lima. In the latest nerdctl release a few weeks ago, we had five different Amazon folks mentioned in the release notes. We're planning to add some features to BuildKit. And we also have several people working in the OCI specs, like the recent reference types work. So again, a lot of the work we do in Finch is really building out capabilities in these underlying projects, not so much building a brand new interface on top. And we want it to be a community open source project. So we're working on a public roadmap. There's a GitHub repository here where you can go and see what we're doing, and we're open to external contribution. What we'd really love collaboration on is this added operating system support. Again, some of that work might be in Lima or elsewhere. But we'd love to add Windows and Linux support. And then there's understanding the best way to design this extension system, which you can already use with other tools that I mentioned. We're also on the CNCF Slack in the Finch channel. So with that, that was a whirlwind tour through what's available for desktop tooling today with containers.
And I think we have a few minutes for questions. APPLAUSE Yeah, any questions? Hi. What was the motivation to create Finch when there was already this whole ecosystem? If I understood the question: why create Finch when there was Rancher Desktop or Colima or Lima? Yeah, that's a good question. So each of those tools kind of has its own natural inclination. With Rancher Desktop, the focus was a great local Kubernetes environment and a GUI and some management around it, including Docker. We wanted something simpler that's just the command line tool. And so we talked to the Rancher folks about maybe having a common upstream. Maybe Finch becomes that common upstream of Lima, containerd, nerdctl, BuildKit. So that might still be in the works. And then Colima is a very small project. There's one maintainer. He's kind of working on his own. And again, we were looking at it: he's got Docker in there, he's got Kubernetes. And we wanted to, again, focus just on the container inner-loop lifecycle: build containers, run containers, push containers to registries. And so essentially it's just a simplification, and we think there's still lots of room for collaboration with those other projects because we're all using the same stack below us. We have time for one more fairly quick question. How easy is it to pick up Finch for someone who's just started working as a developer? Yeah, how easy is it to use? What's the learning curve compared to Docker? Yeah, so again, most of these tools are built around the well-understood Docker command line tool. So if you've already used Docker, it's the same commands, the same flags. In that sense, there's no real learning curve. Now, if you're brand new to containers, it's really the same effort that you'd need to learn Docker or Podman or Finch or anything else. So it's really about your understanding of the existing developer tooling space built around Docker. Okay, thank you.
Please leave quietly when we are still asking questions. Other than that, thank you. Thank you. |
Bit-for-bit reproducible builds with Dockerfile
Deterministic timestamps and deterministic apt-get |
Hi, I'm Akihiro Suda from NTT Corporation, Japan. In this session, I talk about bit-for-bit reproducible builds with Dockerfile, focusing on the determinism of the timestamps and the apt-get package versions. I have a demo, and you can reproduce my demo by yourselves using this GitHub repository. Let's begin with what reproducible builds are. It means producing exactly the same binary when you have the same source. For containers, the source means the Dockerfile and every source code file that is referred to from the Dockerfile, and the binary means OCI images, including the tar layers and the metadata JSONs. This reproducibility has to be attestable by anybody at any time, but not necessarily on any machine, because typically your machine has to have a specific version of the toolchain, and sometimes you have to use a specific version of the host operating system, with a specific file system, and with a specific CPU. This limitation is very far from ideal, but it is sometimes inevitable. So why do we need reproducible builds? It's because we want to verify the actual source code of the binary, not the claimed source code. The actual source code may differ from the claimed source code when the build environment, such as the developer's laptop or a CI server such as Jenkins or GitHub Actions, is compromised, or when the developer simply has malicious intent. So we want to be sure whether we have the actual source code, and if the builds are reproducible, we can be sure that we have it. Otherwise we are not sure whether we have the actual source code or not, and maybe we are using some compromised source code. So reproducible builds are really great, but they are not a panacea. In particular, reproducibility has nothing to do with whether the source code is safe to use. The source code may still contain malicious code, so reproducible builds make sense only when you actually review the source code by yourself.
That is a very time-consuming job, and very few people are motivated to bother doing it. Maybe this task can be automated using some AI in the next couple of years, I don't know, but this problem is beyond the scope of my talk. And it was hard to make builds reproducible, especially with Dockerfiles. There were three major challenges. The most obvious one is timestamps, such as the timestamps of the files in the OCI tar layers, and other timestamps in the OCI JSONs, such as the org.opencontainers.image.created annotation. We also have timestamps in the image history, which can be printed with the docker history command. So the timestamp problem is the most obvious one, but it is relatively easy to solve. The biggest problem is not timestamps. The biggest problem is the non-determinism of apt-get. When you run apt-get, the package version that is installable changes every time. And of course, this is not specific to apt-get. The same problem exists in dnf, apk, zypper, pacman, and almost all package managers. Actually, Nix, the package manager, has solved this issue with a version pinning system called Flakes, but Nix is very complex and still hard for most people to learn. Similar tools exist, but they are also complex and very hard to use, so most people still want to use apt-get or dnf or apk. And the third problem shown here is the characteristics of the file systems, such as hard links and xattrs. These special characteristics may differ across file systems. So reproducible builds were really hard in the Dockerfile ecosystem, but they are now supported in BuildKit version 0.11. BuildKit is a modern image-building framework made for Docker and Moby, and it has been embedded in the Docker daemon since Docker version 18.06.
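To make the timestamp problem concrete, here is a minimal sketch in Python. It is not how BuildKit builds layers: the helper names `build_layer` and `digest`, the file contents, and the epoch value are all assumptions for illustration. It packs the same file into an uncompressed tar twice with different modification times, then again with the timestamp pinned, and compares SHA-256 digests.

```python
import hashlib
import io
import tarfile


def build_layer(files, mtime=None):
    """Pack {name: (content, file_mtime)} into an uncompressed tar archive.

    If mtime is given, it overrides the timestamp of every entry.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed entry order
            content, file_mtime = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            info.mtime = file_mtime if mtime is None else mtime
            info.mode = 0o644
            info.uid = info.gid = 0  # fixed ownership
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def digest(data):
    return hashlib.sha256(data).hexdigest()


# Two "builds" of the same source, run at different wall-clock times.
build_a = {"hello.txt": (b"hello\n", 1_675_000_000)}
build_b = {"hello.txt": (b"hello\n", 1_675_000_999)}

# Hypothetical pinned timestamp, standing in for SOURCE_DATE_EPOCH.
EPOCH = 1_672_531_200

naive_differs = digest(build_layer(build_a)) != digest(build_layer(build_b))
pinned_equal = digest(build_layer(build_a, EPOCH)) == digest(build_layer(build_b, EPOCH))
print("digests differ without pinning:", naive_differs)
print("digests match with pinning:", pinned_equal)
```

The file contents are identical in both builds; only the recorded modification time changes the archive bytes, which is why pinning the timestamp is enough to make the two digests agree in this toy case.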
But it's not specific to Docker and Moby, so it can also be used as a standalone daemon called buildkitd, and buildkitd can be executed inside Kubernetes or nerdctl or Podman or any other container engine that supports the OCI specs. BuildKit version 0.11 was released last month with built-in support for reproducing timestamps, thanks to Tõnis Tiigi for the contribution of this work. This version 0.11 still needs really complex Dockerfiles, but the next version, 0.12, is likely to require less complex Dockerfiles. Reproducing timestamps is supported using a special build arg called SOURCE_DATE_EPOCH. This build arg conforms to the reproducible-builds.org SOURCE_DATE_EPOCH spec, which is available at https://reproducible-builds.org/specs/source-date-epoch/. The argument value is usually expected to be set to the Unix epoch representation of the git commit date, using git log -1 --pretty=%ct. So you get an integer number that corresponds to the seconds since January 1st, 1970. SOURCE_DATE_EPOCH is exposed to the RUN instructions of the Dockerfile as an environment variable, and in addition, it's also consumed by BuildKit itself for the timestamps in the OCI JSONs, but not for the timestamps in the OCI tar layers in BuildKit version 0.11. This is planned to be improved in version 0.12. So as I mentioned in the previous slide, there are a bunch of caveats in version 0.11. In particular, the file timestamps currently have to be explicitly touched using the find, xargs, and touch commands, like this very complex script. That already takes more than three lines. Also, you have to squash all the layers to eliminate the whiteout files that are created on removing files in the containers, because the timestamps of the whiteouts are not reproducible in BuildKit version 0.11. And there's also a restriction on the mount point directories.
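The idea behind that find, xargs, and touch workaround can be sketched as follows. This is a Python illustration under stated assumptions, not the script from the slide: `source_date_epoch` and `clamp_mtimes` are hypothetical helper names, the epoch value is made up, and the real workaround runs as shell inside a Dockerfile RUN instruction.

```python
import os
import tempfile


def source_date_epoch(default=0):
    """Read SOURCE_DATE_EPOCH (integer seconds since 1970-01-01 UTC) from the environment."""
    return int(os.environ.get("SOURCE_DATE_EPOCH", default))


def clamp_mtimes(root, epoch):
    """Reset the mtime of every file and directory under root that is newer than epoch.

    Symlinks are skipped. Returns the number of entries that were changed.
    """
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.path.islink(path):
                continue
            if os.stat(path).st_mtime > epoch:
                os.utime(path, (epoch, epoch))  # set both atime and mtime
                changed += 1
    return changed


# Hypothetical value; the talk derives the real one from `git log -1 --pretty=%ct`.
EPOCH = 1_672_531_200

with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "hello.txt"), "w") as f:
        f.write("hello\n")
    changed = clamp_mtimes(root, EPOCH)
    mtimes = {int(os.stat(os.path.join(root, n)).st_mtime) for n in ("", "hello.txt")}

print("entries touched:", changed)
print("distinct mtimes afterwards:", mtimes)
```

Only entries newer than the epoch are touched, so files that already carry an older timestamp keep it; after the pass, nothing in the tree depends on when the build actually ran.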
Cache mount points can only be created under a tmpfs such as /dev. And also, hard links are not reproducible, depending on the filesystem snapshotter. So in this version we still have a bunch of caveats, but these caveats are already being addressed in my pull request 3560. It's not merged into the master branch yet, but I hope that this pull request will be merged for the next version, 0.12, in the next couple of weeks or maybe in the next couple of months. The next topic is reproducing package versions. This is the most important topic of this talk. The package versions are hard to reproduce because most of the distributions do not retain all the packages. For example, Ubuntu does not retain all the packages, as far as I can see. Debian does, but the package archives are not mirrored widely. Basically we only have the central snapshot.debian.org and only a few mirrors. This is causing too much load on the central server, snapshot.debian.org. So basically this snapshot.debian.org server can't be used in CI environments because it's really slow and really flaky. And this slowness and flakiness problem will get even worse when more people begin to make their builds reproducible. The situation is very similar for Fedora and Arch Linux as well. repro-get is my solution for this problem. It is a decentralized and reproducible front-end for apt-get, dnf, apk, and pacman. The package versions can be locked with a SHA256SUMS file, and packages can be fetched from several transports such as HTTP, OCI registries, IPFS, local file systems, and NFS. By default, repro-get attempts to fetch the packages from deb.debian.org using the package name. The deb.debian.org server is fast, but it's ephemeral. It doesn't retain all the packages. So for older packages, repro-get automatically falls back to debian.notset.fr using the SHA256 hash. This is relatively slow, but this server provides persistent snapshots of all the packages.
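The locking idea can be sketched in a few lines of Python. This is not repro-get's implementation: it assumes the lock file follows the usual sha256sum layout (hex digest, whitespace, file name), and `parse_sha256sums`, `verify`, and the file name `hello.deb` are hypothetical names used only for illustration.

```python
import hashlib
import os
import tempfile


def parse_sha256sums(text):
    """Parse sha256sum-style lines ('<hex digest>  <name>') into {name: digest}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hex_digest, name = line.split(None, 1)
        entries[name.lstrip("*")] = hex_digest.lower()
    return entries


def verify(directory, entries):
    """Return the names whose file is missing or whose SHA-256 does not match."""
    bad = []
    for name, expected in sorted(entries.items()):
        try:
            with open(os.path.join(directory, name), "rb") as f:
                actual = hashlib.sha256(f.read()).hexdigest()
        except FileNotFoundError:
            actual = None
        if actual != expected:
            bad.append(name)
    return bad


with tempfile.TemporaryDirectory() as d:
    payload = b"pretend package contents"
    with open(os.path.join(d, "hello.deb"), "wb") as f:
        f.write(payload)
    # The lock records the digest of the exact bytes that were installed.
    lock = hashlib.sha256(payload).hexdigest() + "  hello.deb\n"
    ok = verify(d, parse_sha256sums(lock))
    # Any change to the package bytes is detected, wherever the file came from.
    with open(os.path.join(d, "hello.deb"), "ab") as f:
        f.write(b"tampered")
    tampered = verify(d, parse_sha256sums(lock))

print("mismatches for the original file:", ok)
print("mismatches after tampering:", tampered)
```

Because the check depends only on the content hash, the transport does not need to be trusted: a package fetched from a fast mirror, a registry, or a slow snapshot server is accepted only if its bytes match the locked digest.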
You can also configure repro-get to use OCI registries, IPFS, and local file systems. repro-get currently supports five distributions: Debian, Ubuntu, Fedora, Alpine, and Arch Linux. repro-get is expected to be used in containers, but it can be used in non-container environments as well. The command usage is like this. You run repro-get hash generate to generate the hash file, run apt-get install hello to install the hello package, and run repro-get hash generate again, and you will get a SHA256SUMS file like this. Inside the containers, you can run repro-get install with the SHA256SUMS file, and repro-get fetches the packages from an HTTP apt-get repo, or maybe from an OCI registry, or maybe from IPFS, or maybe from NFS, depending on the configuration. And here is the demo. To reproduce this demo, you have to run a specific version of BuildKit, version 0.11.0. In this directory, I have a SHA256SUMS file like this. This is mostly for running GCC. The Dockerfile is currently really complex. It's machine generated, and it has a bunch of workarounds like this for the SOURCE_DATE_EPOCH stuff, and you can use this to test reproducibility. This takes a few minutes, but the result is like this. You will get the same hash on any machine, such as on GitHub Actions or on local laptops, so you can try this by yourself on your own machine. Future work includes simplifying the Dockerfiles and cache management. I'm also trying to implement integration with xx-apt and xx-apk for cross-compilation. And also, reproducibility should be attestable, for example with provenances, ideally just with a single click, and more contributions are welcome for these topics. And here is the wrap-up of my talk.
Reproducible builds help attest the true origin of the binary. The challenges are non-deterministic timestamps and package versions. BuildKit version 0.11 adds preliminary support for SOURCE_DATE_EPOCH, and repro-get can be used for reproducing the package versions with SHA256SUMS. And, sorry, the demo is still running, so I can't show the result of the demo, but it should look like this result. Any questions? Would it be fair to say that this sacrifices security in favor of reproducibility, because you would have to keep that list of hashes maintained to make sure that the packages downloaded are always the most secure ones? So your question was how to make these hash files, right? How do you make sure the list of package hashes is always pointing to the most secure versions of a package? You can use the repro-get hash generate command to scan the installed packages and make the hash file like this, but you can also create a hash file by yourself, just with a text editor, or maybe with your own tool to maintain this hash file. Okay, we're out of time. Thank you for the talk. Thanks everyone for attending. |
Kubernetes and Checkpoint/Restore |
We want to start, and that means I need to once again ask you to quiet down, please, so that we can hear our speaker. Our next talk is by Adrian Reber, and he's going to talk about Kubernetes and Checkpoint/Restore. Hello, mic is on. So welcome everyone to my talk about Kubernetes and Checkpoint/Restore. Please quiet down. So I actually did a talk about container migration here in 2020. That was using Podman. In the last three years, I was able to move it into Kubernetes. It's not on. It's green. Better now? No? Too soft? What's too soft? I think the only thing that you can do is move it slightly down. Down? Better now? No. You can turn it up? Oh, there was... That's for the... Green is good. Better now? Better now? Not too good? Is it better now? No? No? No? Okay, we've got to make do with what we have for now, but please, if you all quiet down, we can hear our speaker a lot better. Okay, so I've been working on process migration for at least 13 years now. I've been involved in CRIU, which is the basis for the container migration we are doing here today, since around 2012, I think, and I've been focusing mainly on container migration since 2015. So the agenda for today's session here is... Can we turn something down? I get feedback. Okay, so the agenda is something like this. I'm going to talk a bit about the background of checkpoint/restore, especially how CRIU is integrated in different things. Then I will present use cases for container checkpoint/restore and container migration. Then I want to talk about a few technical details about CRIU. I might make this very short depending on the time. And then I want to talk about the future of checkpoint/restore, especially in Kubernetes, and what we are thinking about on this topic right now. So Checkpoint/Restore In Userspace is the name of the tool, CRIU. The reason for the name is that checkpointing and restoring has been a thing for over 20 years now in Linux, maybe even longer. And there were different implementations.
There were ones using an external module. There were ones doing LD_PRELOAD. And around 2006 or 2008 there was a patch set for the Linux kernel to do it in the kernel. It was over 100 patches. It was never merged because it was really huge and complicated. And because the in-kernel approach didn't work, CRIU was named "in userspace": it's not in the kernel, it's in user space. There are multiple integrations of CRIU in different container engines and sometimes orchestrators. The first one to mention here is OpenVZ. They invented CRIU for their container product many years ago to live migrate containers from one node to another. So the thing about CRIU is that it has been developed with containers in mind. At that time it was probably for different containers, but it's for containers, and that's why it works as well as it does today. Then we know that Google uses CRIU in their internal container engine called Borg. I have no details about it except the things I've heard at conferences from them. What they told us at CRIU upstream is that they use container migration for low-priority jobs on nodes. If there are not enough resources, the container will be migrated. They said they previously killed the container and restarted it somewhere else, and all the work was lost. Now they can just migrate it to another node. They say they use it for background tasks. The example they gave is YouTube recoding of things, which happens in the background and is not time-critical, so that's why they use checkpoint/restore for it. There's an integration in LXD which enables you to migrate a container from one host to another. Then it's integrated in Docker, and it's integrated in Podman, which is what I've mainly been working on for the last five years. And the thing I've been working on in the last three years to get it into Kubernetes is integrating CRIU support into CRI-O. This is one of the existing container engines which Kubernetes can use.
Interestingly enough, there's a ticket about container live migration in Kubernetes that has been open since 2015. Since then nothing has happened until now, where we kind of can migrate containers. We can definitely checkpoint them, and we introduced this into Kubernetes under the label forensic container checkpointing. This was an interesting experience for me because I was not aware of how the Kubernetes processes work to get something new in there. So I wrote some code, I submitted the patches, and then nothing happened. At some point people told me you have to write something called a Kubernetes Enhancement Proposal. It's a document where you describe what you want to do, so I did this. These are the links to the documents I wrote for this. The third link is the pull request for the actual code changes, which is marginal, and the last link is a blog post which describes how to use forensic container checkpointing in combination with Kubernetes today. The reason for the name forensic container checkpointing is that we were looking for a way to introduce checkpointing into Kubernetes with minimal impact on Kubernetes. The thing is, it's a more or less completely new concept for containers, because the way Kubernetes thinks about containers is: you start them, you stop them, they're done, you don't care about anything. And now there's something new which says, okay, but I can still move my container from one node to another node and keep all the state. So it was a long discussion to get it into Kubernetes.
The idea behind forensic container checkpointing is you have a container running somewhere and you suspect there might be something wrong, you don't want to immediately stop it, maybe the attacker can detect if you stop it and remove things from it so you can take a checkpoint from the container, the container never knows it was checkpointed, you can analyze it in a sandbox environment somewhere else, you can look at all the memory pages offline without the container running or you can restart it as many times as you want, so that's the idea for forensic container checkpointing and under which label it's currently available in Kubernetes. So use cases for checkpoint and restore container migration, I have a couple of them and one has a demo which relies on the network so we will see if this works. So first and maybe simplest use case for checkpoint restore for containers is reboot and save state, so I have a host with a blue kernel running on it, the kernel is getting out of the state, I have to update it, I have a stateful container because for stateless containers it doesn't make sense, but the container, the stateful container takes some time to start so what can you do with checkpoint restore, you can take a copy of the container, write it to the disk with all the state, with all memory pages saved as they were just before, you update the kernel, you reboot the system and it comes up with a green kernel with all security holes fixed, but a container you can restore it without waiting a long time, it's immediately there on your rebooted host. 
Another use case which is similar to this is quick startup use case, I have, people were talking to me about this, so this is what people actually use in production from what I've been told, so they have a container it takes forever to start, it takes like eight minutes to initialize, and they have some software as a service thing where they want customers to have a container immediately, so what they do, they don't initialize it from scratch, they take a checkpoint once it's initialized and then they can create multiple copies really fast of the container in matters of seconds, and so the customers can have their containers faster and maybe they are happier. The next thing is the combination of those things is container live migration, I have a source node, I have a destination node, and I want to move my container from one system to the other system without losing the state of the container, so I take a checkpoint and then I can restore the container on the destination system one or multiple times, and this is the place where I want to do my demo, so let's see, so I want to have a Kubernetes thing running here and I have a small YAML file with two containers, let's have a look at the YAML file, so it has, it's a part with two containers, one is called WildFly, this is a WildFly-based Java-based application and the other one is Counter, both are really simple, stateful containers, if I do a request to the container I get back a number, the number is increased and second time the number is the increased number, so let's talk to the container, hopefully it works. 
Okay, this is hard to read, but I think I need this ID, so I'll just do a curl to the container here, and then I need to replace the ID to figure out the IP address of my... where's my mouse here... container, and it returns counter zero, counter one, counter two. So it's stateful, but it's simple. Now, to use checkpoint/restore in Kubernetes: this is currently a kubelet-only interface because we still don't know the best way to integrate it into Kubernetes, so it's not straightforward to use yet, but it's there. So I'm also doing a curl. Now let's find my command in the history. No, that's the wrong one. Oh, there it was, missed it, sorry, almost have it. Okay, so this is the command. What I'm doing here is just talking to the kubelet. You see the HTTPS address at the end of the long line, and it says I'm using the checkpoint API endpoint and I'm trying to checkpoint a container in the default Kubernetes namespace, in the pod counters, and the container counter. So I'm doing this, and now it's creating the checkpoint in the background. If I look at what it says, it has just created a file somewhere which contains all file system changes, all memory pages, the complete state of the container. Now I want to migrate it to another host, so I have to create a kind of OCI image out of it. I'm using Buildah here, and I'll give it an annotation so that the destination knows this is a checkpoint image. Then I'm going to include the checkpoint archive in the image, and then I will say commit... that's the wrong one... commit, and I'm going to call it checkpoint image latest. So now I have an OCI-type container image locally which contains the checkpoint, and now I will push it to a registry. Here it was. I will call it tech39, and now it's getting pushed to a registry. So this works pretty well, but this VM is not local, and now I want to restore the container on my local VM, and that's happening here, right, crictl ps, 
so nothing is running. Then I have to edit my YAML file. It's pretty similar to the one I had before: I have a pod called counters, and I have a container wildfly which is started from a normal image, and the other container, called counter, is started from the checkpoint image. Now I say apply, and let's see what the network says, if it likes me. So it now says (it's really hard to read because it's a large font) that it's pulling the initial image; that's already there, so it doesn't need to pull. It created a container wildfly, it started a container wildfly, and now it's actually pulling the checkpoint archive from the registry. Oh, and it created a container and started a container. So now we have a restored container, hopefully, running here. Let's get the ID of the container and talk to the container again. Now we shouldn't see counter zero, but counter, I don't know, three or four; I don't remember what it was last time. This is the right ID, I hope. Yeah. So now we have a stateful migration of a container from one Kubernetes host to another Kubernetes host, by creating a checkpoint, pushing it to a registry, and then kind of tricking Kubernetes into starting a container, while in the background we used a checkpoint container. Kubernetes thinks it started a normal container, but there was a checkpoint container behind it. The restoring of the checkpoint all happens in the container engine below it, in CRI-O, and for Kubernetes it's just a normal container it has restored. So, back to my slides. Another use case people are interested in a lot, which I had never thought about, is spot instances, which AWS and Google have. These are cheap machines which you can get, but the deal is they can take them away anytime they want; you have something like two minutes before they take it away. So if you have checkpointing (this is independent of Kubernetes or not), but if you have Kubernetes on your
spot instances, you can checkpoint your containers right into some storage and then restore the container on another system, and still use spot instances without losing any of your calculation work, whatever it was doing. So, something about CRIU. I mentioned that everything we are doing here is using CRIU. The call stack is basically: the kubelet talks to CRI-O, CRI-O talks to runc, runc talks to CRIU, CRIU does the checkpoint, and then each layer adds some metadata to it. That's how we have it, but all the main work of checkpointing a process is done by CRIU. Some details about CRIU. Of course, the first step is checkpointing the container. CRIU uses ptrace or the cgroup freezer to stop all processes in the container, and then we look at /proc/PID to collect information about the processes. That's also one of the reasons why it's called "in userspace": because we use existing user space interfaces. CRIU over the years added additional interfaces to the kernel, but they've never been checkpoint-only; they usually just add additional information you can get from a running process. Once all the information in /proc/PID has been collected by CRIU, another part of CRIU comes in, which is called the parasite code. The parasite code is injected into the process, and it's now running as a daemon in the address space of the process. This way CRIU can talk to this parasite code and get information about the process from inside the address space of the process, for example the memory pages, to dump them all really fast. A lot of steps are done by the parasite code, which is injected into the target process we want to checkpoint. The parasite code is removed after usage, and the process never knows it was under the control of the parasite code. I have a diagram which tries to show how this could look. We have the original process code to be checkpointed; we move something out of the code (it's not perfectly correct), and we put the parasite code into the original
process. Now the parasite code is running, doing the things it has to do, and then we remove it, and the program looks the same as it did before. At this point all checkpointing information has been written to disk, and the process is killed or continues running; this really depends on what you want to do. And no, we are not aware of any effect on the process if it continues to run after checkpointing. To migrate the process, the last step is restoring. What CRIU does is read all the checkpoint images, then it recreates the process tree of the container by doing a clone (clone3) for each PID and thread ID, so the process tree is recreated as before. Then CRIU kind of morphs all the processes in the process tree into the original processes. A good example is always file descriptors; this is easy. During checkpointing, CRIU looks at all the file descriptors: it looks at the ID, the file name, the path, and where the file pointer is. During restore it just puts this back: it opens the same file with the same file descriptor ID, and then it points the file pointer to the same location, so the process can continue to run and the file descriptor is the same as it was before. Then all the memory pages are loaded back into memory and mapped to the right location. We load all the security settings like AppArmor, SELinux, and seccomp. We do this really late in CRIU, because some of these things make restoring very difficult, but since it's happening late it works well. Then, when all the resources are restored and all the memory pages are back, CRIU tells the process to continue to run, and you have a restored process. So now to what's next in Kubernetes. We can kind of migrate a container, like I have shown, but we are only at the start of the whole thing. The next thing would maybe be kubectl checkpoint, so that you don't have to talk directly to the kubelet. For kubectl checkpoint, one of the things which is currently under
discussion is that if you do a checkpoint, all of a sudden you have all the memory pages on disk, with all secrets: private keys, random numbers, whatever. So what we do for the current Kubernetes setup is make it readable only by root, because if you are root you could easily access the memory of all the processes anyway, so if the checkpoint archive is also only readable by root, it's the same exposure you already have. The thing is, you can take the checkpoint archive, move it to another machine, and then maybe somebody else can read it, so there's still a problem that you can leak information you don't want to leak. So the idea is to maybe encrypt the image. We don't know yet if we do it at the OCI image level or at the CRIU level; we're talking about it, but it's not yet clear what we want to do. At some point the goal is definitely to have something like kubectl checkpoint to make it easy. Then, I've shown only how I can checkpoint a container out of a pod and restore it into another pod, so the other thing would be to do a complete pod checkpoint/restore. I've done proofs of concept of this, so this is not really a technical challenge, but you have to figure out how the interface in Kubernetes could look to implement this. Then, if all of this works, maybe you can do a kubectl migrate, to just tell Kubernetes, please migrate this container to some other node, to some other host. And if this works, then maybe we could also have scheduler integration, so that if certain resources are getting low, low-priority containers can be moved to another place. Another thing which we're also discussing concerning this: I've shown you that I've migrated a container with my own private OCI image format, which is the thing I came up with. It's a tar file with some metadata in it, but we would like to have it standardized, so that other container engines can use the standard and not the thing I came up with, which just felt like the right thing to do. So this is the place where the standardization
discussion is going on. It's not going really fast or anything like that, but yeah, I guess that's how creating a standard works. And with this I'm at the end of my talk. The summary is basically: CRIU can checkpoint and restore containers, it's integrated into different container engines, and it's used in production. Use cases are things like rebooting into a new kernel without losing container state, starting multiple copies quickly, migrating running containers, and the new one, spot instances, which I've been asked about. This has all been done under the Forensic Container Checkpointing Kubernetes Enhancement Proposal, and currently we trick Kubernetes into restoring a container by using create and start, without letting Kubernetes know that it's a checkpoint. And with this I'm at the end. Thanks for your time, and I guess we have time for questions. I have two questions. The first one is, how are... One second, please stay quiet until the talk is over. So, two questions: how are network connections handled when the containers are restored, and the other question is, does CRIU support some kind of optimization, like incremental checkpoints? 
So the first question is network connections. CRIU can checkpoint and restore established TCP connections. Established is the interesting case; if they're just open and listening, it's not really a difficult thing to do, but it can restore established TCP connections. I'm not sure it's important in the case of Kubernetes, though, because if you migrate, maybe you migrate to some other cluster or somewhere else, maybe the network is set up differently, and you can only restore a TCP connection if both IP addresses of the connection are the same. It makes sense for live migration, because at some point the TCP timers will time out anyway. But I think maybe it would make sense if you migrate a pod and keep the TCP connections between the containers in the pod alive; then it would make sense. It's technically possible. I'm not sure how important it is for external connections, but for internal connections it makes sense. The other question was about optimization. CRIU itself supports pre-copy and post-copy migration techniques, just like VMs, so you can take a copy of the memory, move it to the destination, and then just do the diff at the end, or you can handle page faults on missing pages, and missing pages are then fetched during runtime. This is all just like QEMU does; the technology is the same. But it's not integrated into Kubernetes at all. It's... 
technically it's possible; in Podman we can do this. The only thing is you have to decide whether this is an incremental checkpoint or not, because the checkpoint looks different. If we know it's an incremental checkpoint, only the memory pages are dumped, and if it's the final checkpoint, we have to dump everything. And if for the first checkpoint you say it's the final checkpoint, you cannot do an incremental checkpoint on top of that one. Very impressive thing. Except network, what else do you know will not be possible to migrate? I'm impressed by this thing; except network, you mentioned something else that cannot be checkpointed. So the main problem is all external hardware, like InfiniBand, GPUs, FPGAs, because there's state in the hardware and we cannot get it out. Two years ago AMD actually provided a plugin for CRIU to get the state out of their GPGPUs, so CRIU should be able to checkpoint and restore processes using AMD GPUs. I've never used it myself, I don't have one, but they implemented it, so I assume it's working. So any external hardware where you don't have the state in the kernel, that's the main limitation. Hi, thank you for this. You said there's parasite code; does that mean it changes the container hash? So how do you propose to secure them again and make sure that it's your parasite code and not somebody else's? I didn't get it 100%; something about container hashes and making sure it's... I think the worry is that if you inject parasite code, the container hash has changed somehow. It doesn't. It doesn't change the container hash; the parasite code is removed afterwards. So it's okay. Thank you. Thank you, excellent talk. How big are the images: the size of the process memory used, or the total the process has allocated on the system? I don't hear anything in the front. How big are the images that you restore? Exactly. So the size of the checkpoint is basically the size of all memory pages which we dump; all the additional information which CRIU is dumping, compared to that,
is really small. Then it depends: if you do something like a diff in Podman or Docker, you see which files changed in the container compared to your base image, and this comes on top of it. All files which changed we include completely in the checkpoint; files which didn't change we don't include. While I'm bringing the mic over there. Has anything changed in terms of how complex the process trees are that you can restore? Because we discussed using it for systemd services, for example, and one of the limitations that you used to have is that as soon as you run something fairly complex inside of the container and you try to checkpoint/restore it with CRIU, it would just fail, because it would use kernel features that CRIU didn't support. So the biggest problem we're currently seeing is containers using systemd. Because systemd is very advanced, it uses things nobody else uses, so this is the point where CRIU might fail; from what I've seen, nobody uses as many new kernel features as systemd does, so it sometimes fails if systemd is running there. But I don't often see people in the OCI container world using systemd. I guess it would be a good idea to have a real init system even in your container, but it's not something people do, so it's not something we get complaints about at all. I also thought this talk was very interesting. I saw that you talked about having these kubectl migrate and kubectl checkpoint commands. I'm thinking that mostly what you want to migrate might be a stateful application, for example, what is it called, a stateful something. So I was thinking maybe you could have something in the StatefulSet, or whatever it's called, instead. Say you want to drain a node... Actually, in one of the first implementations I did, I was using drain. I added an option to kubectl drain which is for checkpoints, so all containers were checkpointed
during drain, and then they were restored during boot-up. Okay, sorry for being the buzzkill, but we're out of time. Thank you for the talk, that was really interesting, and thank you, everyone, for attending and being so quiet during the questions. |
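As a rough illustration of the kubelet call from the demo in this talk, here is a minimal Python sketch that only builds the URL of the kubelet checkpoint endpoint for the "default" namespace, the pod "counters", and the container "counters" named by the speaker. The request itself is an HTTP POST that needs the kubelet's client credentials; the host name and port 10250 used here are assumptions, not values taken from the talk.

```python
from urllib.parse import quote


def kubelet_checkpoint_url(namespace, pod, container,
                           host="localhost", port=10250):
    """Build the URL of the kubelet checkpoint endpoint.

    host and port are assumptions for illustration; the talk only says
    that the command talks to the kubelet over HTTPS.
    """
    # Each path segment is percent-encoded so unusual names stay valid.
    path = "/".join(quote(part, safe="") for part in (namespace, pod, container))
    return f"https://{host}:{port}/checkpoint/{path}"


# The namespace, pod, and container names mentioned in the demo.
url = kubelet_checkpoint_url("default", "counters", "counters")
print(url)
```

Sending a POST to this URL with suitable credentials is what asks the kubelet to write the checkpoint archive; everything after that (building the image, pushing it, restoring it) happens with other tools, as described in the talk.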
Exploring Database Containers |
Hi, everyone. How are you? How is the weekend going? Good? Yes. That's nice. I'm happy to be here. It's my first time in Europe, and it's the first time I am giving a talk in English at an in-person event. This is pretty nice. My name is Edith Puclla. I am a technology evangelist at Percona, and this is a very basic and friendly introduction to databases and containers. About me: I am from Peru, in South America. I have been working at Percona for six months. It's an open source company; we create free, open source database software. I am a Google Women Techmaker, I was nominated as a Docker Captain last year, and I am a database and container enthusiast. You can follow me on Twitter and LinkedIn, where I post about containers, Kubernetes, and open source. For the agenda today, we are going to look at containers. We will see the Docker architecture and the workflow between the components of Docker. We are going to have two examples: running a single Percona Server for MySQL container, and running multiple Percona Server for MySQL containers. We will see Docker volumes and why they are important when working with databases on containers. We will see backups and restores of databases, and best practices. Let's get started. What's a container? How many of you know or work with Docker? Yeah, a lot. Okay. That's nice. Docker, or do you use other tools? Yeah, there are different kinds of tools for containerization. A container is a single, lightweight unit of software that packages everything that you need for your application. When we build and run an application, we know that we need a lot of packages. For example, if you are building a Java application, you need libraries, dependencies, many things to run your application. So everything has to be containerized in a single unit of software, and this is going to be isolated from other things, like your infrastructure. 
And the good thing is that your container can run on different platforms: on your laptop, on your server, in your cloud. With this, we get rid of the problem we have when someone asks, hey, does your program run? and the answer is, yes, but it works just on my computer. No, it has to run on different platforms. We don't want to have this problem with dependencies and other kinds of things when we test our application on other platforms. There are different tools, as I said, for containerization. We have container interface, for example; we have containerd; and we have Docker, which is the tool we are going to focus on now. All these tools are also in the Cloud Native Computing Foundation ecosystem. If you look at the landscape, you will see a lot of tools there. There is a section for containerization, and there are more than three; there are a lot of tools. The Docker architecture works as a client-server model. We have the Docker daemon, which processes all the commands. It is always listening for the client, and the client sends requests to the daemon through the REST API. With this model, the Docker daemon can also manage networks, containers, images, and Docker volumes. If we go into more detail, we have the client; the daemon, which is also called the Docker engine; and another component, the Docker registry, which can be the public one, Docker Hub, where all the official images are published, or our own private registry, in case we don't want to share images with the public. This is the flow between the components. For example, if we do a pull, we bring the image from Docker Hub into the Docker daemon's cache. If the Docker daemon doesn't find the image in its cache, it brings it from Docker Hub, but if it is in the cache, it just takes it from there and starts processing. The same goes for docker build. When we run docker build from the client, the Docker daemon takes a Dockerfile. 
A Dockerfile is a recipe with a lot of instructions, where we put all the commands to build and run our application. So the Docker daemon takes the Dockerfile and builds the image, and if we want, we can also run it. When we run it, we create a container. The container is our application, already alive and ready to accept connections or requests. One more thing here is that we can have everything on our host, or we can have remote clients that make requests to the Docker daemon. Container benefits. There are pros and cons, but now I'm going to focus on the benefits that containers give us. One of these is that we can reduce costs, because we can run several containers on a single piece of infrastructure. That's because container technology is different from virtualization. In virtualization, we use a hypervisor, and when you create virtual machines, they consume more resources from your infrastructure. With containers it's very different: the technology makes it possible to run a lot of containers on a single machine. For that reason, it's possible to reduce costs. Also, containers are very friendly to continuous integration and continuous delivery processes. If you have a big monolithic application and you want to run it in a container and integrate it into the DevOps process, this is going to be hard. We have to work with microservices, make each service a container, and include it in the continuous integration and continuous delivery process. Then it's easy. When we build our application on a container, it's easy to kill it, it's easy to create it again, it's easy to fail, and the process is faster. Another benefit is multicloud compatibility. Over time, several companies have tried to migrate to a hybrid cloud. 
They don't want to have everything on premises. They also want to scale; they want to grow. For that reason they opt for the cloud, and containers fit very well there. You can install Docker; I know you already did. You can choose your distro, whether you use Debian, CentOS, anything, and go to the official Docker documentation and easily follow all the steps. When you install it, it will install the Docker client, the Docker daemon, and other tools that you will need to use Docker on your local machine. We already talked about containers, right? But this talk is about exploring databases on containers. We are going to talk about MySQL, which is a relational database. To run MySQL on containers, we need to understand how volumes work, because the most important thing when running databases on containers is the data. If we lose the data, we lose everything. For the next slides, we are going to focus on this part. We will use the image of Percona Server for MySQL. Percona Server for MySQL is open source. It's like MySQL, but with more nice things. You can use it; it's on Docker Hub. So we will use this image and create a Docker container. We will see how it works without volumes and look at the layers in Docker, and then we will create a persistent volume and see how that changes the layers. Just to note here: if you want to have an image, it's necessary to have a Dockerfile. You can write a Dockerfile by yourself. A Dockerfile is a recipe where you put everything for your application, so you need it to create an image. Then you need an image to create your Docker container. These are the three essential steps to remember for how Docker works. We will run a single Percona Server for MySQL container. We will use docker run to create the image. No. We don't use docker run to create the image. 
We use docker run to create a container. We pass -d to say, run this container in the background; I don't want it to take over the terminal. I will call it percona-server-1. I will pass the root password as an environment variable. This is not a good practice; it is just to show how we are going to create a container. And we will use the official Percona Server for MySQL image. With this I am creating a container, right? Okay? This command does two things: it pulls the Percona Server image from the official Docker Hub, which we can see with docker image ls, and it creates the container. So if we run docker container ps, our container is up. Okay. After we have the database, we need to add data. We will add databases, add data, change records, run transactions, many things that we can do as with a regular database. Okay. If we run a single Percona Server for MySQL container, this is how it works in layers. The layers we see in green are from the Percona Server image. This is the image that we pulled, which we cannot change; it is read-only. We can't change it, but on top of it a new layer is created. This layer is read-write; I can add data. This layer is the one that contains all the things I do in Docker on that container. I added a new database? Yes. I created a new record, I deleted it, I added transactions: all of this is saved here. But what happens if I don't have a volume? My container is ephemeral, right? It could die. It could crash. My machine could crash. And all my data is going to be lost. I will lose all the data. We will see how it works with multiple containers. 
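The docker run invocation described above can be sketched as an argument list built in Python. This is only an illustration under assumptions: the image reference "percona/percona-server:8.0" (in particular the tag) and the password value "root" are placeholders chosen here, not values confirmed by the talk, and, as the speaker says, passing the root password this way is not good practice.

```python
import shlex


def docker_run_args(name, image, root_password):
    """Return the argv for a detached Percona Server for MySQL container."""
    return [
        "docker", "run",
        "-d",                                       # run in the background
        "--name", name,                             # container name
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # demo only, not good practice
        image,
    ]


# Image tag and password are assumptions for illustration.
args = docker_run_args("percona-server-1", "percona/percona-server:8.0", "root")
print(shlex.join(args))
```

Passing the list to subprocess.run(args, check=True) would execute it on a machine with Docker installed; printing it first makes it easy to inspect what would be run.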
To run multiple containers with the same image (if we look, this is the same image, the same version of the image), we just change the name of the container. We can also change another thing, because this is a database, right? What can we change? It runs on a port, right? On which port does MySQL usually run? Yeah. So I need to change the port for the other container to avoid a conflict. Okay. How does it work in layers? The same. We use the same layer for the Percona Server image, which we cannot modify, but on top of that we are going to have two more layers: one for the first container that I created, and the second for the other one, where I can add data and change things. But once again, if I don't have a volume, this is going to die. That is fine if we want to create an application where it doesn't matter whether we save the state of the application. This is important. Persisting data is really important for databases, because sometimes we think of Kubernetes, for example, as having been created for stateless applications, but now we have options to run stateful applications on containers, and this is one of the reasons. Creating volumes. It's pretty easy to create a volume. We can create a volume just with -v or --volume. We use docker run in detached mode to create a local volume. We will call the container percona-server. It's the same process, and when we say -v, we are saying, okay, this will be my volume on the host, in my local data directory, and this one is going to be inside my container. So this is like a mirror of the directory in the container. And how does it work in layers? We have the same layer that we cannot modify, and on top of that we create another layer. But in this case, we are creating the mounted volume at /var/lib/mysql. There are other directories for which we can create volumes. 
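To make the two ideas above concrete (a different host port per container, and a -v mount at /var/lib/mysql), here is a small Python sketch that builds the argument lists. The host ports, the host directories under /data, the image tag, and the password are all hypothetical values picked for illustration; MySQL's default port 3306 inside the container is the only fixed number.

```python
def docker_run_args(name, image, root_password, host_port=3306, data_dir=None):
    """Return the argv for one database container.

    host_port is published to MySQL's default port 3306 in the container;
    data_dir, if given, is a host directory mounted at /var/lib/mysql.
    """
    args = [
        "docker", "run", "-d",
        "--name", name,
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",  # demo only
        "-p", f"{host_port}:3306",
    ]
    if data_dir is not None:
        args += ["-v", f"{data_dir}:/var/lib/mysql"]
    args.append(image)
    return args


IMAGE = "percona/percona-server:8.0"  # assumed tag

# Same image, different names, host ports, and data directories.
first = docker_run_args("percona-server-1", IMAGE, "root", 3306, "/data/percona-1")
second = docker_run_args("percona-server-2", IMAGE, "root", 3307, "/data/percona-2")
print(" ".join(first))
print(" ".join(second))
```

Each container gets its own data directory here on purpose: two database servers should not share one data directory.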
I am just adding this one as an example, because in MySQL we have configuration files, we have logs, we have other things, and we may want to create volumes for all of those things. I am just adding /var/lib/mysql as an example, which is a very important directory. And this local directory is the one that could be on my host. But that is not recommended, because if your host crashes, your volumes crash with it. It is preferable to keep it on a remote host. Okay. Backups. Who here makes backups? Okay. I use a very easy way to make backups. For logical backups, I use mysqldump in the container. For physical backups, in the company we use Percona XtraBackup, which has more features for physical backups. For restore, I will also use the output of mysqldump. We don't use Percona XtraBackup in this case, because it has a lot more to it. For the backup, I will execute the backup in a container that is already running; percona-server-backup is already running. Let's say we created it. We execute docker exec -it to enter that container and run the mysqldump command to create a backup of the database. So the backup is going to be in that file, dump.sql. And the same process with restore: we can take that backup, and this is a different container. I'm going to restore the .sql file in a different container, in this case percona-server-restore, using the mysql command. Okay. Best practices, or some recommendations for running databases in containers. One of these is to constantly monitor our database and the whole system, because we don't know when we are going to run out of resources for our containers. We should be aware of that, or have notifications that say, hey, you don't have enough disk, you don't have enough memory, so provision more or scale your resources. So we should keep monitoring, using some tools for that; one example is PMM. 
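The logical backup and restore steps described above can be sketched as two argument lists in Python. This is a sketch under assumptions: the container names follow the ones mentioned in the talk, the user "root" and password "root" are placeholders, and dumping all databases is a choice made here rather than something the speaker specified. The dump is written to, and read from, a file such as dump.sql by redirecting the command's standard output or input.

```python
def dump_args(container, root_password):
    """argv that runs mysqldump inside a running container.

    The SQL dump is written to standard output, so the caller redirects
    it to a file such as dump.sql.
    """
    return ["docker", "exec", container,
            "mysqldump", "-uroot", f"-p{root_password}", "--all-databases"]


def restore_args(container, root_password):
    """argv that feeds a dump into the mysql client inside a container.

    -i keeps standard input open so the dump file can be piped in.
    """
    return ["docker", "exec", "-i", container,
            "mysql", "-uroot", f"-p{root_password}"]


backup_cmd = dump_args("percona-server-backup", "root")
restore_cmd = restore_args("percona-server-restore", "root")
print(" ".join(backup_cmd), "> dump.sql")
print(" ".join(restore_cmd), "< dump.sql")
```

With Docker available, subprocess.run(backup_cmd, stdout=open("dump.sql", "wb"), check=True) would perform the backup, and passing stdin=open("dump.sql", "rb") with restore_cmd would perform the restore. Putting a password on the command line is for demonstration only.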
We can use open source tools to monitor our databases on containers. We should store the data in a persistent volume outside the container. It is recommended not to store it inside the container, because this makes it easy to create recovery plans, and we can also restore the data easily and fast. We should limit the resource utilization of our containers. We know that containers are small, but we should still set limits when there are a lot of them. We should regularly back up the database and store these backups in a different location. And having a plan for migration and disaster recovery is really great. In that case, having a monitoring tool helps a lot. And what more? That's all. You can find me on LinkedIn and Twitter. Okay, we have time for questions. If you absolutely need to leave and you can't wait until the talk is over, please do so as quietly as possible so we can understand the questions. Thanks. Hi. Thank you so much for your talk. It was really interesting. I'm wondering what kind of limitations you see when you're talking about having databases running in containers. Are there storage limitations, CPU, or something else? Guys, can you please be a little quiet so we can understand the question? All right, I will try it with the microphone. Yeah. Can people hear me? Thank you. First of all, really cool talk, thank you so much. My question would be, could you maybe talk us through some of the limitations that you see when you're running databases in containers? You didn't understand it? Thank you so much for the talk. It was really cool. Maybe you can share with us some of the limitations that you see when you go for the solution of running databases inside containers, right? You cannot really run a very big database; you will probably have a problem with that. What kind of limitations do you see? So, yeah, the question is about... sorry, the question is about what limitations you can run into with database containers? 
Yeah, I don't want to say this, but it really depends on the business. Okay, it depends on whether you want to invest a lot of money in infrastructure, because in the end, the volume that your database uses is not going to be part of your container; it's going to be outside. And this depends on you. You want to invest a lot of money to save that data? That's good. You want to replicate it? Please try to be quiet while we are asking questions. Are there any more questions? There is one more question from the back, so please be quiet. Thank you. Hello. I wanted to ask, did you notice any kind of performance issues? Did you benchmark things? Did you identify any overhead when you containerize a database like MySQL, or other kinds of databases, really? Sorry, I didn't get your question. All right, I'm just going to ask again. When you containerize a database, be it MySQL or Postgres or any open source database that you may have tested on this kind of setup, did you notice any overhead in compute, memory, or disk, essentially, where you can see that the database performance or operation is significantly affected by being containerized? I'm not sure about that, but if you use open source tools to monitor your databases on containers, you get a visualization of these things, so if you don't have enough resources it can show you alerts or things like that, and you can figure out where exactly your limitation is. Okay, so for example, did you run benchmarks? Could you help me? Okay, could you help me to answer? Okay, my friend is going to help me answer this. All right, thank you. Yeah, thank you. Hey, so usually the performance degradation is around two, three, four percent. The issue is more about how you configure the database and the kind of storage, whether it's local or network storage, but the overhead of the virtualization is minimal. It's like running on an EC2 instance. 
Okay, so there is an impact, measurable, you say around four or five percent, but you say there are configurations we can do to try to avoid that. Do you have any kind of paper or any kind of resources that we might use to avoid those kinds of bottlenecks? If I got that correctly, not much. The measure that we use for databases is TPS, transactions per second. So you will notice, if we're running benchmarks with sysbench, for example, something like three percent: if you are running 1,000 queries per second, you will get 980, 990 queries per second when containerized. Okay, and do you have any kind of generic recommendations, so that when you run a database in a container, here is what you can do to try and negate some of the performance bottlenecks that you guys have noticed? To be honest, in day-to-day activities, I would say 99 percent of the performance will come from how you configure MySQL. The containerization is just a small piece of the game. You can have more effect by modifying the database configuration. All right, thank you very much. Thanks to you. |
Safer containers through system call interception
(Ab)using seccomp to emulate the world |
Our next talk is by Stéphane. He's the project leader for LXD, a container manager, a former teammate of mine as well, and he's going to talk about safer containers through system call interception. Hello. It starts working well. Thanks, sir. All right. So, you kind of had the intro. I'm Stéphane. I work at Canonical. I'm the project leader for LXD, LXCFS, and a bunch of other stuff that we do, effectively the system container guy. And, yeah, we're going to be talking about system call interception today. First, just a tiny bit of going back to the basics. We kind of need to explain what we're trying to achieve. So there are two main kinds of containers out there. We've got privileged containers and unprivileged containers. The ones you want are the unprivileged kind. Privileged is bad, and just to clarify there, too, we don't mean privileged as in --privileged in Docker. That's extra, extra bad. Docker by default is privileged, and the definition of privileged here is that you're not using a user namespace. So in the case of LXD, which is the container manager that I'm working on these days, we default to unprivileged containers, and that's great. It means that root in the container is not root on the host. If there's a container escape of any kind, you get as much permission as a nobody user on the system, and that's great. The problem is, not being real root also means you don't get to do stuff that real root can do. A lot of stuff has been enabled now inside the user namespace that you can do yourself. You can create network devices. You can reconfigure a lot of stuff, which is great. But there are still things you can't do. You can't change your process priority to something higher than what you would be allowed to do as a normal user on the system. Otherwise, a user on the system could just create a new user namespace, go in it, bump their process priority to whatever they want, and bypass all kinds of settings on the system.
So there are a lot of things that are not quite possible. In general, we want to eradicate privileged containers because having real root is very, very bad. And it's kind of a game of whack-a-mole as far as trying to prevent nasty things from happening. We've got AppArmor. We've got seccomp. We've got a whole bunch of things that are all trying to prevent you from doing bad things. So that's done by us thinking about what all the bad things are and trying to block them. And someone just needs to find another bad thing we didn't think of, and then there goes the entire system. So we don't want those. We'd like to get rid of them completely. But for that, we need to find ways to allow unprivileged environments to do things that are normally only allowed in privileged environments, in a way that's still safe. All right. So all I'm going to be talking about today relies on seccomp, which is the system call interception mechanism in Linux. It lets you do a bunch of nice policies. You can just put in policies like: for this system call with those arguments, just deny it, or return this particular error number, for example. But with Linux 5.0 in 2019, it also grew the ability to notify user space instead. So you can put a policy in seccomp that says, if this is called and the arguments are such and such, instead of taking action right now, go and notify this file descriptor that something happened. And then the host system can have a privileged daemon monitoring that notification mechanism and taking actions. There is some complexity around security that I'm going to get into very shortly, because you can do very, very bad things with that. But if you do it correctly, it lets you run a more privileged action on behalf of a less privileged container, after going through some kind of allow list or that kind of logic on the host to make sure that this is actually fine.
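The notify-then-decide flow described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the real seccomp API and not LXD's code: `Notification`, `Policy`, and `decide` are hypothetical names, the syscall is identified by name instead of number, and the arguments are assumed to have already been copied out of the calling process.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Notification:
    """Hypothetical stand-in for one seccomp user notification."""
    pid: int
    syscall: str
    args: Tuple  # arguments already copied out of the caller's memory


# Hypothetical allow list: syscall name -> predicate over the copied arguments.
Policy = Dict[str, Callable[[Tuple], bool]]


def decide(note: Notification, policy: Policy):
    """Return ("emulate", args) if the host policy allows the call,
    otherwise ("deny", errno) so the caller just sees an error."""
    check = policy.get(note.syscall)
    if check is not None and check(note.args):
        # The privileged daemon performs the action itself, on behalf of
        # the container, using its own copy of the arguments.
        return ("emulate", note.args)
    return ("deny", errno.EPERM)


# Example policy (illustrative only): allow mount, but only for ext4.
example_policy: Policy = {"mount": lambda args: args[2] in {"ext4"}}

print(decide(Notification(1234, "mount", ("/dev/sda", "/mnt", "ext4")), example_policy))
print(decide(Notification(1234, "mount", ("/dev/sda", "/mnt", "xfs")), example_policy))
```

The point of the shape is that there are only two outcomes, "the daemon runs it" or "reject"; anything not explicitly on the allow list is denied.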
Now for the nasty issues. Time of check, time of use is a very, very common issue in security, and this mechanism has definitely got some issues around that. User space gets notified that a system call was made. The system call can have pointers to a bunch of different arguments and structures, and there's nothing preventing the caller from changing the values at those pointers. So you need to be a bit careful when you're processing those messages. You effectively need to start by copying everything and evaluating it. If everything looks good, then you can take action. But by taking action, we mean you can run the thing on behalf of the user with the arguments as you copied them, never reading them again, because otherwise they could change. Or you could just say, I don't want this, reject. What you shouldn't do is say, oh, based on those arguments, it seems fine, let it continue. Because there's absolutely nothing that prevents the calling process from just racing you and immediately changing the arguments to something else before it goes back to the kernel, and then running with a value you would not have allowed. So you need to be careful in your design so that this doesn't happen. Otherwise you're literally allowing people to run stuff with full root privileges inside of unprivileged containers, which would be very bad. So what do we actually do with this stuff? So far we've implemented quite a few things, and I'm going to go into more detail about each of those. Actually, I don't know if they are in the right order. One of the first things we implemented is mknod, which is useful for safe device node creation. I'm going to go into more detail shortly. setxattr we also added pretty early on. We've got support for bpf, so we can allow some specific eBPF programs. We've got support for sched_setscheduler, which is used to change some of the process priorities.
We've got support for mount, which was a real pain in the ass to implement, but we've got support for mount. And we've got support for sysinfo, which was also reasonably fun to implement. Now going over those things directly: mknod, what do we use that for? One of the things we wanted to enable is running Docker inside of LXD containers. As I said, LXD containers are unprivileged, they are nice and safe. Docker, by and large, not safe. But Docker running inside of an unprivileged LXD container: safe. So we figured we'd try and make that work. And we did manage to get it working. The main driver at the time was Travis CI. The Travis CI platform was using LXD containers on arm64, IBM Z series, as well as IBM POWER at the time. And they wanted the same behavior as they were getting on Intel. And on Intel, they were using full VMs in which you could do whatever you wanted. So we wanted to make sure that Docker worked properly in there. And what we noticed is that Docker layers, especially the directory whiteout files and file whiteout files, rely on either, I think it's c 0 0 device nodes, or on specific extended attributes that just say that this is a directory that was removed, effectively, in the underlay, using that as a whiteout. In both cases, those things were not allowed. Device creation in a container is usually a big, big, big no-go. Because if you can create, say, the device node for /dev/sda, then there's nothing preventing your unprivileged container from rewriting your disk, which would be bad. So that's usually not allowed. But some specific device nodes are fine. And the work we did there also allows for things like creating a new /dev/null device or a new /dev/zero device, or those kinds of devices which are inherently safe.
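The "only some device nodes are fine" rule above amounts to an allow list keyed on device type and major/minor numbers. A minimal sketch follows, assuming a hypothetical `is_safe_mknod` helper and an illustrative list; the list here only contains the devices named in the talk and is not LXD's actual list.

```python
import os
import stat

# Illustrative allow list of character devices as (major, minor).
# Assumption for this sketch only; not the list LXD really uses.
SAFE_CHAR_DEVICES = {
    (0, 0),  # c 0 0, used as a whiteout marker by overlay-style layers
    (1, 3),  # /dev/null
    (1, 5),  # /dev/zero
}


def is_safe_mknod(mode: int, dev: int) -> bool:
    """Decide whether an intercepted mknod(path, mode, dev) may be emulated.

    Only character devices on the allow list pass. Block devices such as
    /dev/sda (b 8 0) are always refused, since creating one would let the
    container write to the host's disk.
    """
    if not stat.S_ISCHR(mode):
        return False
    return (os.major(dev), os.minor(dev)) in SAFE_CHAR_DEVICES


# /dev/null-style node: allowed.
print(is_safe_mknod(stat.S_IFCHR | 0o666, os.makedev(1, 3)))
# /dev/sda-style block device: refused.
print(is_safe_mknod(stat.S_IFBLK | 0o660, os.makedev(8, 0)))
```

Keeping the check to an exact match on type, major, and minor means an unknown device is refused by default rather than allowed by omission.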
And making that possible means that you can now do things like running debootstrap or similar tools inside of an unprivileged container, because the few devices that need to be created as part of an image creation process are safe devices, and this allows it. We generally consider this particular interception to be safe, as in you can pretty much turn it on on any LXD container without having to think too much about who's in that container, like whether we actually trust the workloads and that kind of stuff. The other piece of that puzzle was setxattr. As mentioned, same deal with Docker and the whiteout files. We needed to implement that one. Similarly, it does not allow all of the xattrs. It just allows very safe namespaces of extended attributes. It will not let you do things like setting a security xattr, because that would let you do some really, really bad things, for example. And this is similarly considered to be safe on our side. Then we've got the pretty interesting one, mount. Again, mount is a bit of a problem because, first of all, it usually relies on having a block device. You kind of need to have that allowed in the container, which is already a bit fishy in many cases. You've got to be careful that any block device exposed to the container is considered untrusted from that point on, and that you never mount it as real root somewhere else, or they could try and attack you. And by attack you, what I mean is the kernel has a superblock parser that will process a new device as it gets mounted. And this is not guaranteed to be bug free. So a user that can craft a very specific block device might be able to trick something like ext4, XFS, Btrfs, or any of the other file systems into either crashing the entire system or doing arbitrary code execution in the kernel. Both cases are not very good. But we still enable support for that.
If you have a container that you trust, that you don't want to give full access to everything, but that you still trust, you can technically do this, and it will let you mount inside of the container. We added an extra layer on top of that, which lets us do a shift, because if you do the mount, the ownership information on that device is probably not aligned with the container. So we allow stacking shiftfs, which is a patch set that's hopefully dying soon, but that we implemented for Ubuntu, that we can stack on top and that fixes the permissions. So we support that as well. The really cool thing with this stuff is we also support redirecting to FUSE instead, which then becomes safe. So what we can do is say, any attempt at mounting ext4 inside of the container calls the fuse2fs binary instead. And yeah, that's safe. That actually does work pretty well. And I'll show that in just a tiny bit. And then we worked on bpf, not allowing all eBPF programs, but specifically allowing those that we need to do nested containers and device permission management throughout. So we can review what the program is; if it matches what we expect, then we load it, otherwise we just reject it entirely, and that's also considered to be safe. Then for an unsafe one: sched_setscheduler is not super-duper safe, because it lets you reconfigure scheduler options. It was needed to be able to run Android inside of an unprivileged LXD container. They were doing some slightly wonky stuff on startup that needed that. But we know that effectively the container could make itself unkillable, for example, or could raise its priority enough to slow down the rest of the system. So it's something to keep in mind. There's no way to escape that we're aware of by enabling this thing, but there's definitely some ability to affect the entire system. And then we had sysinfo.
That was kind of led by Alpine deciding not to use /proc/meminfo to figure out the memory usage in the free command. So you would run a container with a limit of, like, a gig of RAM and run free, and you would see 128 gigs, because it would just show you the host value directly. And that's because things like LXCFS, which is another project we run, that overlays on top of /proc to show the right values, don't work, because they were not reading the file system. They were going straight to the kernel with a system call, sysinfo. So we've implemented interception for that, and we fill it with the same values as you would be getting from LXCFS, and that gets us the right behavior. And so just switching to the demo. I'm also just rechecking something here real quick. Okay, I was just making sure that Christian was wrong with the time. That's good. I've got until 3. You showed me 10 minutes. All right. So let's just move here to the terminal. And the first thing I'll do is play with mknod, so, and that should be, yep, that's all right. It's launching a new container, Ubuntu 22.04, and let's try mknod. I'm sure I'm getting those wrong again, because I always get them wrong. mknod, name, then type. Depending on the kernel, this might work. Yeah, it actually does work now. Let me figure out /dev/null, this one shouldn't work, 1, 3. c 5 1. Yeah, c 1 3, for example, for /dev/null does not work out of the box. But if we stop this, then set security.syscalls.intercept.mknod to true on the demo container. In this case, we do need to restart the container, because the entire seccomp policy needs to change. For smaller changes we don't need to, but for that we do. And now that works properly. So that's the mknod piece. Then we've got Docker. For that one, I did prepare a tiny bit, because I did not feel like downloading Docker on the Wi-Fi here.
I mean, actually, I did, but it took an hour, so I'm happy I didn't do it during the talk. So for Docker, actually, let me show you the config first. So the container here has security.nesting enabled, which allows for running containers in containers. And it has interception of both mknod and setxattr set up. And in there, that part does use the network, so I'm just hoping that it's tiny enough. There we go. So that works properly. And the issue before was that unpacking the layers would just blow up. All right. So that was Docker, then on to the fancier one, which is mount. All right. So for mount, launch the container here. I'm going to go and pass it a block device. So I'm passing it /dev/loop11 on my system as /dev/sda inside of the container. Yeah, your signs are still wrong. I've got 15. I mean, it's until three. Okay. So in the demo mount container, mkfs.ext4 on /dev/sda, yes. So just formatting, that you can always do; there's nothing preventing you from creating a file system. Normally, that works just fine. What doesn't work is this: you're not allowed to actually mount it inside of a container. But now we can make that work, actually. So what we're going to be doing here is turning on mount interception. And then we need to set an extra one, which is the one allowing specific file systems, so in this case, ext4. And then restarting the container here. Okay. Exec back in there and try mounting again, and that works. And if we look at this, and we look at df, it's mounted normally, it's fine. Other than that it actually did this as real root, and I could have done very nasty things by crafting a particular device ahead of time, it works as expected. Now, what we can do to make things a bit more interesting here: let's go back in there, unmount it, and then install fuse2fs. That's a FUSE implementation of ext2/3/4. That's pretty readily available. You can install that.
And then what we need to do is remove the config key that allowed straight-up mounting of ext4. And we replace it with another config key that instead says ext4 is fuse2fs. So we put that in there, go back in, and then mount /dev/sda back to /mnt. And the funny thing is that you won't actually notice any difference whatsoever. It actually looks completely identical unless you go look at /proc/mounts, at which point you're going to notice that the file system here is not ext4, it's fuse.ext4. And if you look at the process list, you're going to notice that there's an extra process running in your container now. So that's pretty sweet, because it means we can literally forward any file system to FUSE. Because it's done at the syscall layer, the container doesn't really need to be aware of it. It is not doing anything to the mount command. You can just call the mount syscall at any point, and it will just forward it to FUSE and do the right thing for you, which means no changes to your workloads whatsoever. It just works. So that is pretty cool. And the last thing I wanted to show in the demo is to launch an Alpine edge container. So that's going to be the sysinfo one. And I'm going to set a memory limit of one gig. So if I go in there now, and I look at the free memory, we can see I've got 16 gigs, which is considerably more than one gig. The enforcement is in place, so that's where problems happen. Now if you run something that will look at the free memory output to figure out how much memory it can claim, it's going to claim the wrong amount of memory, and it will get killed by the out-of-memory killer. So that's a problem, which is why we did the work to fix that through system call interception. So you can set security.syscalls.intercept.sysinfo to true. And then bounce that container. And if I go in there and I look at free, now we've got a gig. So that actually works properly. It also fixes a bunch of other things.
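The memory part of the sysinfo demo above comes down to reporting the container's limit instead of the host total. A small sketch of that arithmetic follows, assuming the limit is expressed the way a cgroup v2 `memory.max` file expresses it (either the word `max` or a byte count); `parse_memory_max` and `reported_total_ram` are hypothetical helper names, not functions from LXD or LXCFS.

```python
from typing import Optional

GIB = 1024 ** 3


def parse_memory_max(text: str) -> Optional[int]:
    """Parse the contents of a cgroup v2 memory.max file.

    The file holds either the literal string "max" (no limit) or a limit
    in bytes. Returns None when there is no limit.
    """
    text = text.strip()
    return None if text == "max" else int(text)


def reported_total_ram(host_total: int, cgroup_limit: Optional[int]) -> int:
    """Total RAM an intercepted sysinfo-style call should report.

    Without a limit the host value is passed through; with a limit the
    container sees whichever of the two is smaller.
    """
    if cgroup_limit is None:
        return host_total
    return min(host_total, cgroup_limit)


host = 16 * GIB  # the 16 GiB host from the demo
print(reported_total_ram(host, parse_memory_max("max\n")) // GIB)
print(reported_total_ram(host, parse_memory_max(str(GIB))) // GIB)
```

A program that sizes its heap from the reported total then stays under the enforced limit instead of being killed by the out-of-memory killer.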
It doesn't just do the memory, but it also does CPU load, uptime, and a bunch of other things. So that's how sysinfo is now properly handled. The easiest use case we have is just free and uptime, because we know that those commands have been changed to use sysinfo instead of the file system, so it's a very easy one to prove the concept works. But that particular piece of work fixed a lot of other things. I think it also improved Java, which was using the wrong interface and the wrong amount of memory sometimes, so that fixed that. It fixed a bunch of other stuff. That was pretty good to have. So looking forward, where do we want to take this? We've got most of what we wanted covered, really. The big items that are really problematic have been resolved. Docker was a big one that we really wanted to solve, and it's working fine. Alpine behaving, that's really nice. We're happy with it. The mount interception allows for a lot of different stuff now. It's possible to do things like image building inside of an unprivileged container. You can really do a lot. And Android seems to be happy, too, with the tiny bit of stuff we added for that. The things we'd like to add now are kind of weird stuff, really. The one I've got in mind, mostly for the sake of it, but also because I'm sure we'll find a use case for it, is to implement init_module and finit_module. So the ability to do kernel module loading from inside of an unprivileged container, which is as terrifying as it sounds. But the idea there is that we would not actually allow the container to feed us the actual object file that we would then throw at the kernel. What we would do is receive that object file, look at what the kernel module name is, check it against an allow list that we have of kernel modules that are fine to load on the system, and if it's in that list, then we'll do the loading using the module from the host system that we know is correct.
This might help quite a bit with things like firewalls in containers that might need to load custom iptables or netfilter modules, and potentially some other things like file systems. So that would be an interesting one to implement, and I'm sure we're going to have to explain to a lot of people exactly how it is that we're doing it, because otherwise they're going to be absolutely terrified. More eBPF program handling is, I think, also in our plans. As I said, we currently only intercept the eBPF program that's used for device management, so for allowing device creation, device mapping, that kind of stuff within containers. That's because cgroup2 removed the devices cgroup file interface and moved to eBPF instead, so we implemented it that way. We seem to have some interest in other programs. I think systemd and some other pretty common pieces of software now generate eBPF programs that they hook either globally or onto specific interfaces, and some of those should be safe. Those should be things that we can effectively pull, validate that they match the expected pattern, and if they do, then tell the kernel this is fine, and that should actually make a lot of newer software that makes use of eBPF just start working. I don't think we're anywhere near getting something like bpftrace working safely inside a container. That's absolutely terrifying, because it's got access to all of the kernel constructs, and that's not something we'd do, but some subsets of those interfaces should definitely be fine. I think eBPF will solve a lot of those problems, probably, because then you can load unprivileged programs. And the other thing that I've had in mind for a while, and it's mostly a cool thing, not something I've actually had a use case for yet: seccomp has an interesting property in that it runs extremely early. It runs at system call entry time in the kernel, before the system call is resolved.
That means we can intercept system calls that don't exist. So we can intercept system call numbers that have not yet been allocated, and that means we get to actually implement new system calls purely in user space, that you can access through the normal kernel system call API. That's super interesting because it lets you do very easy prototyping and testing of potential system calls. If you want to try specific interfaces, see how they look, the layout, what kind of arguments you want, you can pretty quickly implement system calls through that, and already have user space software adapted to it, until you're happy with it, at which point you go back and do the actual kernel implementation of the system call. So that might be pretty interesting. I don't think anyone has actually done that yet, but that's a nice property of how seccomp works, that it runs before any kind of resolution, any kind of validity check of the system call number. All that seccomp gives us is the system call number and all of the pointers for the arguments. It doesn't care whether the thing exists or not. So it means we get to actually intercept things that don't exist. And that's it. So we can start taking a few questions. Also on your way out, if you're interested, we do have LXD stickers on the table over there, so help yourself. There's a question over there. I think this was the first one. Yeah, there's one here, there's one over there. When will we see the sysinfo system call being intercepted by default on LXD or other distributions? Sorry. When will we see sysinfo calls being intercepted by default? Is this going to roll out? Yeah, we've currently not decided to do any of that by default. Please leave quietly while we are answering questions. Yeah, so we've not decided to intercept anything by default yet. We consider it to be safe. The main problem we have is that it depends on the kernel version that you're running, whether it's going to be working or not.
And it's still recent enough, even though it's 5.1, which has been around a while now, it's still recent enough that a bunch of distros would not work properly. So we want to wait until we can generally assume that all of the distros, like all the long term support releases that are still supported, have it before we can start doing that kind of stuff by default. Please keep it down while we are answering questions. Thank you. Hello. Thanks for the great talk. So I have two questions. First of all, you said there is this time of check versus time of use issue. And so how do you solve it? He's still trying to ask a question, but I can't hear anything, hold on. So first of all, how do you fix this time of check versus time of use issue, where, you know, you call a syscall, the syscall gets notified, and you can, well, race it with another thread and change some arguments, right? I didn't really get that, but if Christian did, you can answer instead, because you probably know it. It's extremely noisy. Now, okay, I'm going to try this. Stéphane, how do you fix the time of check, time of use issue? Okay, so the time of check, time of use issue, you fix it by never letting the kernel execute after the check. So you never continue a system call after the check, effectively. If you want to intercept a system call, you are now in charge of running it. And so you copy the arguments as they are, and you do the check on your copy of them. You never, ever reuse the pointer that the user gave you, you go with your own copy of it, and that's perfectly safe. But if the argument is a pointer to a string, you need to copy the string, and while you are copying the string, it may be changed under the hood. So are you actually freezing the process with something like the freezer cgroup, for example?
So technically the calling thread is frozen by the kernel, but that doesn't prevent another parallel thread from modifying it, which is why we effectively map the memory of the process and copy the entire thing that we care about. If there's a pointer to pointers, we just traverse that and copy it. Once we've copied that, that's what we check policy against. And those are the arguments we're going to be passing to the actual kernel. And we just never look back at what came from the process, which means if they try to race us at that point, it doesn't matter. We create full copies of everything, and we never continue the system call, although that's an ability that I added a while back, so you can even say, continue the system call if I come to the conclusion that it's fine to do so. But if you do that, then the kernel needs to guarantee you that it's safe. For example, continuing the mknod system call after you inspected the arguments is safe because the kernel will just allow the creation of any device. So I have another question, because you said about mknod that if you mknod a device like this, nothing protects you against reading or writing into it, right? But there is this devices cgroup where you can actually protect this device from being written to or read from. And this is what Docker does, for example. So are you doing this in LXC? I'm sorry, I only get about 20% of what you're saying. We're out of time, so what we'll do is that I'm going to be outside and we can just talk, because you also have questions. So just follow me and we'll chat, it's going to be easier. Thank you very much. |
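The copy-then-check rule from the time of check, time of use answer above can be shown with a deterministic simulation. This is only a sketch under stated assumptions: a Python dict stands in for the caller's memory, the `racer` callback stands in for a parallel thread in the container, and `handle_unsafe` and `handle_safe` are hypothetical names, not the real interception code.

```python
import copy


def handle_unsafe(shared_args, is_allowed, run, racer):
    """BAD: check the caller's live memory, then read it again to act."""
    ok = is_allowed(shared_args)
    racer(shared_args)  # another thread rewrites the arguments here
    return run(shared_args) if ok else "denied"


def handle_safe(shared_args, is_allowed, run, racer):
    """GOOD: take a full private copy first, check and act only on it."""
    snapshot = copy.deepcopy(shared_args)  # follows nested structures too
    racer(shared_args)  # the race still happens, but no longer matters
    return run(snapshot) if is_allowed(snapshot) else "denied"


def is_allowed(args):
    return args["fstype"] == "ext4"


def run(args):
    return "mounted " + args["fstype"]


def racer(args):
    args["fstype"] = "evilfs"


print(handle_unsafe({"fstype": "ext4"}, is_allowed, run, racer))
print(handle_safe({"fstype": "ext4"}, is_allowed, run, racer))
```

In the unsafe variant the value that passed the check is not the value that gets used; in the safe variant the check and the action see the same private copy, so a late change by the caller has no effect.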
Bottlerocket OS - a container-optimized Linux |
And our next talk is by Sean, and he's going to talk about Bottlerocket. Thanks. Thank you. Yeah, so I'm going to talk about the Bottlerocket container-optimized Linux distribution. My name is Sean McGinnis. I'm an engineer on the Bottlerocket project, and I work at AWS. So I'll just go over what a container-optimized Linux is. I'm assuming most people in this track probably have an idea. But I'll go over the basics, talk a little bit about what Bottlerocket is, and show a little bit so you get a feel for it. And what I'd really like is to see others get involved. So the mission statement, I guess, is that it's a free and open-source Linux-based operating system for hosting containers. So what does that mean? Before we get into the motivations behind that, it's interesting to look at general purpose Linux distributions and some of the challenges with using those. When you have hundreds of nodes in a Kubernetes cluster, you really have a lot of containerized workloads that you're running, and you need to optimize how you're using resources. So with most general purpose distros, the configuration is mutable. You can log into the machine, you can make changes, make adjustments, install extra services. Out of the box, a lot of them come with a default baseline set of services that you might not necessarily need when you're just trying to run containers. And that uses more system resources and also creates more of a security risk, because there's more attack surface available. And because you can log in, change things, and tweak configuration settings, those kinds of systems more easily become pets, where you've really customized that node exactly how you want it. And you're less likely to just blow that away and spin up a new one, especially when, OK, I've made some changes there. I don't quite remember what I changed, because I was troubleshooting something.
There might be something important there. So now I need to take care of this node, and it becomes my pet. So container-optimized Linux distributions, and especially Bottlerocket, really try to optimize for just the services that you need running on your Linux machine to be able to run your containers. That means less resource usage for things that aren't very important to you. It means less attack surface for someone to compromise that machine, and you get all kinds of added benefits: faster boot time, smaller image sizes to transfer around. So with Bottlerocket, we try to make things as small as possible, just what you need, and try to make it more secure by default. And I don't say secure by default, because that's impossible; we try to make it more secure. So really locking things down, making sure that if someone were to try to compromise your host, they're going to have a hard time doing it. And it's open source. Bottlerocket is not a general purpose operating system. If you're looking to do other things besides container workloads, Bottlerocket's probably not the right distro for you. It is backed by AWS, but it is not an AWS-only solution. Coming from AWS, it is very well integrated with AWS, but I hope that doesn't stay the primary case for long. And it is not a container-based OS. So what I mean by that, and this comes up a lot in conversations, is that there's this confusion when you talk about a container distro. Different people already have a preconception of what that term means. So the two paradigms that come up are the distro that's the base image that you build on top of, so you've got a Dockerfile with FROM Bottlerocket, versus the OS that you're running, on which you spin up containers. When we talk about Bottlerocket being a container-optimized Linux, it's that second one. Bottlerocket is not something that you would use to create a container image.
A little background on Bottlerocket. If you see the date in the bottom left there, we launched in March 2020, and there were a few other things going on around that time. So we didn't quite make the big splash on the launch that we had hoped for. But it's great now to actually be back in person in front of people, being able to talk about the work that we've done in the last two or three years, and hopefully get the awareness out there a little more. Right now, we build and distribute different variants of Bottlerocket. The variant term for us is how we try to optimize things for your specific scenario. So if you're running Kubernetes 1.22, there is a variant specifically for Kubernetes 1.22, and there are variants for Amazon ECS and VMware. The reason to have these different variants goes back to that idea of no extra overhead. So for the metal variant, for example, we try to limit the number of kernel drivers that are loaded. The kernel drivers that you need for a metal deployment, where you're running on actual server hardware, versus where you're running in a virtual machine instance, are different. We know that if you're running in a virtual machine, there's a very small subset of the available drivers that you need to actually do that. So anything extra is taken out of there. Any specific agents that are needed to integrate well, like in VMware, we have those baked into that variant. So you pick the variant that you want for your scenario, and that gives you a well-integrated option. Now, Bottlerocket is far from the only container-optimized distribution. CoreOS is the one that really popularized the whole idea. Flatcar is very popular. I just wanted to acknowledge that and make sure that everyone's aware there are other options out there. None of the things here are meant to say one is better than the other. They all approach things from slightly different angles, and there's maybe just a smaller list for one.
If you're using that platform, then you're well integrated. Maybe that is the best option for you. But Bottlerocket is trying to address the same problem spaces that the other distributions are addressing. We all come at it in a slightly different way, just like how you have Ubuntu and Red Hat and many, many other general-purpose distributions. So to dive into Bottlerocket a little bit, there really isn't too much more than these blocks. We have the base Linux kernel. systemd is used to manage things. And we actually run two different containerd instances. The reason for this is, again, security. Everything on the left-hand side, the host containers, are things that are used to manage the node. They're things that have a little more privilege, that might be able to access things that regular pods can't. The regular pods run on the other containerd instance, the one that's used, say, with Kubernetes, and that is a little more locked down. So if there is any security vulnerability, that helps isolate things. It's an API-driven configuration. When you deploy an instance, you can give it a configuration, and even though that ends up being a file with settings, everything goes through the API. That's what actually sets the values for what happens when this instance boots up and runs. The host containers, so again, the things with a little more privilege, are really how you would access the machine. If you look at the actual Bottlerocket base itself, the actual Linux kernel running and all the system services on there, there's no shell. There's no sshd. It really is isolated. You need to physically connect in some way to this instance. So how do you actually do things? That's where the control container comes in. It provides an environment that you can connect to and has a few of the tools that you may need if you need to actually interact beyond the API configuration.
One step above that, then, is that from the control container, you can launch an admin container. And that actually does then give you the option to run sshd. It lets you access more of the system settings. But it's something that's not run by default, and it's not something that we recommend you keep running. Really, it's there as a kind of break glass if you need it. You need to get into the node. You need to be able to troubleshoot some things. This is how you would do it, by running an admin container. And then there are also other privileged things that we call bootstrap containers. They're container images that you can customize and spin up to be able to do special things. We've seen some cases where, OK, maybe I have some specific file system requirements, or I need to do some special thing. A bootstrap container is one of those host containers that's a little more privileged, that you can have run when the system starts up and initializes, and that can do some custom things for you. So to see what this actually looks like, since I've talked a bit vaguely about distributions, I'll just show you what it's like to interact with a Bottlerocket host. Typically, the only way you would interact with a host is through whatever container orchestration you're using. You shouldn't care what's running or what nodes are actually part of your cluster. You just have a Kubernetes cluster, and you use it. So you can use things like kubectl to get information about nodes. You could describe the nodes if you'd like. And if we do that, you can see here the OS image is Bottlerocket 1.12. But most of the time, you would just be running your services, your pods, your load balancers, DaemonSets, those types of things. If you need to connect, then you actually need to connect to the console. So if you're on a bare metal instance, this is actually plugging into the display port.
If you're using a hosting platform, this is whatever console interface that platform gives you, and you're presented with this. This is the control container that lets you actually access the host. So back to that diagram, within these host containers, under that containerd instance, the shell that I have right now is actually this control container, because there's no shell on the base OS itself. The banner gives you a little information to help you get started. There's the API client, apiclient. And I can use that to get my Kubernetes settings. But if I look for Kubernetes, which puts all of its configuration under /etc/kubernetes by default, there's nothing there. I can't access files, I can't make changes here. There's nothing on here that shows you that this is part of a Kubernetes cluster. And that's because we're just inside that control container. This isn't actually the host OS, the host file system. The trick that I have there is a hidden mount that will give you some access to the root file system. But if you notice, and I didn't point it out, there's enable-admin-container, which is one of the commands that the banner lets you know about. And that's what actually gets you into this admin container that has more access. So if I do enter-admin-container... I knew I should have reset things. Normally, you just run enter-admin-container. That spins up that container instance, because it's not running by default. It's only there when you need it. So I started up an admin container. Now I have a shell within the admin container. Now that I'm in that container, I have a little more access. I can see some more files, but it's still not going to give me full access yet. So there's a tool called sheltie. Once I run that, I have access to the actual underlying file system.
So now I can go into /etc/kubernetes and take a look at the kubelet config. All that information is there. So yeah, aren't command line demos exciting? In addition to being able to access the system only through these controlled mechanisms, we try to limit anything else that could be running, that could be used. There's a read-only root file system. So even if a container running in your Kubernetes cluster somehow was able to break out, it would only have access to the read-only file system. It can't make any changes there. We also use dm-verity as an extra layer of precaution. So even if something happened, that adds some checks. And things are locked down. So really it would be very difficult to compromise a system. And then we also use SELinux. So there are multiple layers of protection in place here to try to limit things. There's PCI compliance. Sorry, I don't know what happened to that slide. And we are looking at FIPS compliance in the future to be able to show that the system really is secure. So I mentioned it's a read-only file system. The way Bottlerocket is distributed is image-based. There's no yum, there's no dnf. You can't go in there and install extra packages. It is a static image. One nice thing about that is, if you want to upgrade a node to a new release of Bottlerocket, it'll actually download that newer image and write it to the second partition of the root disk. And then upgrading really is just switching over and pointing at that new partition. All the settings that you have are persisted as part of what was set through the API server, so it can switch over to this new image and reboot. When it comes up, it reads all those settings again, uses the new image, and the host is running. We do provide a few tools. There's a command line interface to be able to check for updates. That's a pain, especially if you have hundreds of nodes.
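The two-partition upgrade described here can be pictured with a small toy model. This is only a sketch of the idea from the talk, not Bottlerocket's implementation: the `Node` class, its method names, and the version string `1.13.0` are made up for illustration, and real updates involve signed images and boot configuration that this ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy host with two image partitions and settings persisted via an API."""

    partitions: dict = field(default_factory=lambda: {"A": "1.12.0", "B": None})
    active: str = "A"
    settings: dict = field(default_factory=dict)  # kept across image switches

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage_update(self, version: str) -> None:
        # The new image is written to the partition that is NOT booted.
        self.partitions[self.inactive()] = version

    def reboot_into_update(self) -> None:
        # Upgrading is a pointer flip; the previous image stays on disk.
        target = self.inactive()
        if self.partitions[target] is None:
            raise RuntimeError("no staged image to boot")
        self.active = target

    def running_version(self) -> str:
        return self.partitions[self.active]


node = Node(settings={"kubernetes.cluster-name": "demo"})
node.stage_update("1.13.0")  # hypothetical newer release
node.reboot_into_update()
print(node.running_version(), node.settings)
```

Because the settings live outside the image, the node comes back with the same configuration on the new image, and the old image is still present on the other partition.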
But there are things like the Bottlerocket update operator, which will handle a lot of this for you. If you have a Kubernetes cluster, you can schedule when you want maintenance activities to happen. It will automatically go out and look for new versions being released. And then it'll take care of interacting with Kubernetes to cordon off nodes, get workloads drained off onto others, upgrade those nodes, and then allow things to move back. So cleanly, over a period of time, it'll get all of the nodes within the cluster upgraded to the new version. Or, and this is my preference, just replace the nodes. They're not customized. You know you haven't left any special thing on the file system where you need to worry: am I going to lose something if I get rid of this? You just spin up new nodes, have them join, and then you can get rid of the old nodes and have a fresh system every time. Either way works. For the configuration, like I mentioned, most of the time you're passing in a user data file. And that's in the TOML format. But really, we're an equal opportunity markup project. So depending on how you're doing things, there's YAML: if you use something like EKS, everything's configured in YAML, so you can have settings there. I showed the API client. If you actually do want to go and make changes on the command line, you can use that API client's set command and give it a JSON string of the settings that you'd like, or the TOML. And at this URL at the bottom, in the repo, there's a full listing of all the different settings that you can use in those configuration files. Now, Bottlerocket handles things slightly differently than a lot of other distros. So that can be a stumbling block if you want to adopt Bottlerocket, or you're trying it out to see if it works. So there are a few things to be aware of when you do that.
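Since the same settings can be written as TOML user data or as a JSON string for the API client, it may help to see how dotted setting names map onto a nested JSON document. The sketch below is only an illustration: the `nest` helper is hypothetical, and the setting names and values are examples to be checked against the settings listing in the repo.

```python
import json


def nest(flat: dict) -> dict:
    """Expand dotted keys such as 'a.b.c' into nested dictionaries."""
    root: dict = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Example settings written the way they are usually named in prose.
flat_settings = {
    "settings.kubernetes.cluster-name": "demo-cluster",
    "settings.host-containers.admin.enabled": False,
}

nested = nest(flat_settings)
as_json = json.dumps(nested, indent=2, sort_keys=True)
print(as_json)

# The equivalent TOML user data would group the same keys into tables:
#   [settings.kubernetes]
#   cluster-name = "demo-cluster"
#   [settings.host-containers.admin]
#   enabled = false
```

The point is only that TOML tables, YAML mappings, and JSON objects all describe the same tree of settings.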
One common thing that I've heard from users is, well, my company requires that I run this antivirus agent on all of my hosts. If they've containerized that agent, great. If they haven't, that's an issue. Like I said, there's no dnf, there's no yum. You can't go in there and install software. So really, anything that you need to run on there, any kind of host agents that integrate with systems, can be run in privileged containers that can do a lot of things. It just has to be containerized to be able to run on Bottlerocket. And then, like I said, accessing it. A lot of sysadmins are used to, OK, I've got all these nodes out there, I just SSH in, and that's how I access my system and do things. So that can be a stumbling block, too. Things are done a little bit differently because things are so locked down; there are pros and cons. Because things are locked down, you need to enable that admin container. Then you can enable SSH if you need to, but it's not going to be there by default. And then the last thing I wanted to bring up, just because these are two AWS-initiated projects, both open source: I'll be talking to someone about Bottlerocket, and we'll go into some detail. And it seems like we're both on the same page, talking about the same thing. And then, somehow, they say something, and it's like, oh, wait, no, you're actually thinking about a different project. So Firecracker is another thing. That's actually a virtualization solution. So Bottlerocket, Firecracker: both explosive things, but not the same thing. My main motivation for getting out and talking about things like this is that I'd love to see more people get involved. Everything that we can put there right now is up on GitHub, under the bottlerocket-os org, and everything is Apache 2 and MIT licensed. We try to publish a roadmap under that org, so if you're curious what's happening, take a look there.
But the people working on the project would like to hear from folks about what they'd like to see. If you have ideas and you want to bring pull requests, we'd love that. To actually work on Bottlerocket: we have the Linux kernel, obviously in C, but most of the Bottlerocket pieces that put all this together are in Rust. So you will need Linux, you will need Rust, and you will need Docker to be able to do builds and things like that. We have a Bottlerocket SDK image that we publish, so that has the specific Rust version, and there are some Go pieces, so it has the Go toolchain. But there is a base requirement for your machine to be able to actually run things. And I say a decent amount of CPU, memory, and storage. I can't really give an exact number. You're compiling, you're building a distro. So if you want to do that in a two-core, eight-gigabyte VM on your laptop, you're going to have to be patient. That'll take some time. So really, the more CPU and the more memory that you have, the better that whole situation is going to be. But there is a BUILDING.md file in the repo. If you are interested in that, take a look, and that will go through everything to get you set up to be able to actually check out the repo, make changes, and compile it. Another area that I hope is going to help get people involved is what we're calling right now out-of-tree builds. So the variants that I spoke about, these variants that are very optimized for different situations: if you wanted to build your own, say you have your own container orchestration platform and you want to integrate Bottlerocket with it, right now you would have to fork the whole repo and do everything within the Bottlerocket repo itself. So we're looking at ways to separate things out and make this easier. If you have a customized Bottlerocket image that you'd like to make for your company, for your home lab, how can you do that without having to pull everything in?
So if you are interested, you can subscribe to this, issue 2669 in the Bottlerocket GitHub issues. And then even if you're not a developer, or you don't have the resources to build your own and compile everything, or you're not interested in that, that's fine. We'd love people to just join our community meetings. Let us know what you're looking for, let us know if there's anything missing from Bottlerocket, become part of Bottlerocket itself. That happens right now every other week. And we manage it through Meetup just to have an easy way to communicate when those are. There's a HackMD, you can throw your ideas in there, and we can discuss them. So I'd love to see anybody join there. That's meetup.com, bottlerocket-community. And with that, I'll open it up if there are any questions. Hey, thanks for a great presentation. Given that the file systems are immutable, where do logs go? Does Bottlerocket itself log? And I understood that the kubelet is also running non-containerized. So where do the kubelet logs go if you use the kubelet? Yeah, there are some very targeted areas where we mount a tmpfs. So, like I was saying about all the settings going through the API, you need to use those to run the kubelet. It needs to know its settings and needs to read them from a configuration file. So if we have a read-only file system, how does it do that? On boot, we mount these tmpfs mounts in the specific places where they're needed. And then, based on reading the configuration settings, that file gets written out from a template. So if changes somehow happen in there, you reboot, it comes back up, and you're exactly how you had things originally. There's a question in chat: is there a version of Bottlerocket which is built to run on KVM/libvirt? In the repo, let me see, where did I put that. In the repo, sorry, it must just be under building, there are instructions.
Bottlerocket can be run on QEMU, and that's a great development tool too: if you want to make changes and spin things up and just have it running to see how it works, you can run it as a virtual machine, yeah. Thank you for your... |
Automating secret rotation in Kubernetes
Minimizing mistakes by removing the human element |
Okay. Our next talk is going to start right now. Mark's already on stage. He's going to talk about automating secret rotation in Kubernetes, and please quiet down so we can understand him. Okay. Hello. Can you hear me? All right. So thank you for joining here today. My name is Mark. I'm an engineering tech lead at Cisco. For the last couple of years, or maybe the better part of a decade, my primary job has been helping engineering teams run their business applications on Kubernetes and helping them succeed without having to get into too much detail about Kubernetes. Let me start with a story. I'm pretty sure this will sound familiar to a lot of us here. A couple of years ago, I was in the middle of a debugging session. It was already the middle of the night. Everyone was tired. And finally, we found the problem. I committed the change, pushed the code, and then suddenly all the alarms went off. We received an e-mail from AWS that a pair of credentials had been committed to a public repository. Who has done something like that before? Come on. I'm pretty sure it's more than that. There's no shame in that. Everyone has to go through that once. So we obviously had to revoke the credentials, generate a new pair, and deploy it to production. And we were able to do that because we had a good secret management pipeline in place. This hints at why being able to rotate secrets is important: if you have an incident like this, you have to be able to act quickly and rotate those secrets. In a worst-case scenario, people may steal your data; in a better scenario, on AWS, someone might just start mining Bitcoin. But you have to be able to react quickly. Another reason why this is a very important topic is that we often have to meet certain compliance requirements that require us to rotate every secret we have, like, every 90 days. I'm pretty sure many of us have to deal with that.
But the worst situation of all is when you don't even know that a secret has been leaked. Or maybe an angry ex-employee took something home with them. And you don't even know that happened. And they are stealing your data. They are stealing your customers' data. Or they are mining Bitcoin, in the better situation. All right. So probably nobody disputes that secret rotation is important. But unfortunately, it comes with its own set of challenges, which often turns people away from actually caring about this. Managing secrets or configuration is a very complex problem, especially in a Kubernetes environment where you may have multiple different clusters, multiple different namespaces where you have to deploy these secrets, and many different secrets and integrations, which means it takes a lot of time to do it right. And it's still an error-prone process. In an ideal scenario, if you screw something up, it may not result in an actual outage or incident. But it may, and that would affect the business, which is what we wanted to avoid in the first place by doing these secret rotations. So all right. I'm going to talk about some of the key challenges and important points. First, secret rotation should be possible. I mean, it's probably always possible. But I've seen situations where rotating certain secrets would have been very, very hard. Like, it would have taken hours, which is a problem. So it should be possible, and you should be able to do it relatively quickly. Secret rotation should also be automated as much as possible. We are not really trustworthy, we make mistakes, exhibit A. So it should be automated as much as possible, and humans should interact with secrets and secret rotation as little as possible. And finally, secret rotation should happen periodically. You shouldn't have a secret that you use for years, because as I mentioned, you don't know if it's been leaked.
And if you don't know if it's been leaked, how do you know if your system is secure or not? So what does secret rotation look like in general? We are not even talking about Kubernetes here. First, you need to have a secret store. If you don't have a secret store, then the whole thing is a lot more complex than it should be. You have a secret store where you store your secrets, and then you have some solution to deploy those secrets to your production environment or environments. Now, when you need to change a secret, depending on what type of secret it is, you have to go to the secret provider, which may be a third-party provider like AWS or GitHub or anything like that. You have to issue a new pair of credentials or generate a new secret, change that in the secret store, and then you need some sort of mechanism to deploy the new secret. That should probably be an automatic process that notices the secret change and deploys the secrets for you in your production environment. Now, in some cases, if you have a secret store that supports it, for example, HashiCorp Vault, your secret store may be able to automatically issue credentials for you, for example, for AWS, your database, or whatever else Vault supports, so you don't even need to do that manually. Vault takes care of that, and that's the best-case scenario. Now, what does this look like in Kubernetes? First of all, you have to decide whether you want to use Kubernetes secrets at all. There are options where you don't have to use Kubernetes secrets, but that's probably the easiest way to manage secrets in Kubernetes. The reason why people generally don't like using Kubernetes secrets is that they have this notion that Kubernetes secrets are not secure because they are only base64 encoded. That's an entirely different conversation.
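The concern about base64 is easy to see in a couple of lines: base64 is an encoding, not encryption, so anyone who can read the stored value can decode it without a key. The secret value below is made up for illustration.

```python
import base64

# What ends up in the stored secret's data field: an encoded value.
stored = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone with read access can reverse it; no key is involved.
recovered = base64.b64decode(stored).decode("utf-8")

print(stored, "->", recovered)
```

This is why the protection of Kubernetes secrets depends on access control and on encryption at rest rather than on the encoding itself.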
The bottom line is that if you have envelope encryption enabled, which it is not by default, then you're probably safe using Kubernetes secrets. Now, if you have decided to use Kubernetes secrets, then you need something that deploys the secrets from your secret store to Kubernetes, and this could be, for example, the External Secrets Operator. There are other solutions, but this is probably the one that the community has rallied around lately. The External Secrets Operator is able to synchronize your secrets from an external store, external to Kubernetes in this case. For example, from HashiCorp Vault or AWS Secrets Manager or whatever else you have, the External Secrets Operator is able to synchronize secrets to Kubernetes secrets, and it's also able to pick up changes. It doesn't actively monitor changes, but periodically it takes a look at the secrets, and if something has changed, then it synchronizes the changes to Kubernetes. So we have that part covered, and then you can use the Kubernetes secrets either as environment variables or mounted as files, however you want to use them. Now, the secret changes. What then? If you mount secrets as files, and your application is able to pick up that change, then you don't have anything to do. Your application will already reload the configuration, and you have the whole thing covered. Now, if your application can't do that, or if your application consumes secrets as environment variables, that's a more difficult problem, and for years we didn't really have a solution for that other than manual restarts. A couple of years ago, a component called Reloader appeared on the market, which basically watches workloads that reference secrets, and it also watches the secrets, obviously, and when it detects a change, it triggers a standard workload rollout, similar to how you would do that with kubectl rollout, for example.
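One common way to turn "the secret changed" into "the workload rolls out" is to record a hash of the secret in the pod template, since any change to the pod template triggers a standard rollout. The sketch below illustrates that general idea only; it is not Reloader's code, and the annotation key and the `reconcile` helper are hypothetical.

```python
import copy
import hashlib
import json

# Hypothetical annotation key, chosen for this illustration only.
ANNOTATION = "example.invalid/secret-hash"


def secret_digest(data: dict) -> str:
    """Stable hash of a secret's key/value pairs."""
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def reconcile(deployment: dict, secret_data: dict):
    """Return (updated deployment, whether the pod template changed)."""
    updated = copy.deepcopy(deployment)
    metadata = updated["spec"]["template"]["metadata"]
    annotations = metadata.setdefault("annotations", {})
    digest = secret_digest(secret_data)
    changed = annotations.get(ANNOTATION) != digest
    annotations[ANNOTATION] = digest
    return updated, changed


deployment = {"spec": {"template": {"metadata": {}}}}
first, rolled_first = reconcile(deployment, {"hello": "everyone"})
same, rolled_same = reconcile(first, {"hello": "everyone"})
rotated, rolled_rotated = reconcile(same, {"hello": "FOSDEM"})
print(rolled_first, rolled_same, rolled_rotated)
```

An unchanged secret leaves the template untouched, so nothing restarts; a rotated secret changes the annotation, and the orchestrator replaces the pods with ones that read the new value.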
So it may change an annotation of the workload, and that would result in the workload being rolled out, which means that it would run with the new environment variables, and it would remount the secret with the changed file. And if we take a look at the whole process from the previous diagram, in this case we don't have one component that takes care of the deployment, but two: one that synchronizes the secrets from the secret store to Kubernetes, and another that takes care of the rollouts, making sure that the workloads notice the secret change. Well, let's take a look at a very quick demo of how that looks in action. I have a repository prepared, you can go ahead and try it if you want to, and I have a Kubernetes cluster running here with both External Secrets and Reloader installed. In addition to that, we have a simple echo server, which just, I believe, yeah, just outputs something. So let's take a look at how we configure External Secrets first. You configure External Secrets via custom resources, which means you create... can you see it from the back? Okay, cool. So you configure External Secrets via a custom resource called ExternalSecret, and you tell External Secrets from which external store it should synchronize secrets, and where it should put them. In this case, we are telling External Secrets to synchronize secrets from a store I created and called fake. This is basically a static secret store in this case. It synchronizes into a secret called foobar, and it's going to synchronize from the fake secret store, from under the key foo/bar, to a key called hello in the Kubernetes secret. So let's take a look at whether we do, in fact, have that secret there. So we have a foobar secret. That's good so far. And we have a hello key here.
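For readers following along without the slides, the resource described in the demo might look roughly like the structure below, built here as a Python dictionary and printed as JSON (which kubectl also accepts). This is a reconstruction, not the manifest from the speaker's repository: the `apiVersion`, the refresh interval, and the store kind are assumptions, and only the names fake, foobar, foo/bar and hello come from the talk.

```python
import json

external_secret = {
    # Assumed API version; it depends on the installed operator release.
    "apiVersion": "external-secrets.io/v1beta1",
    "kind": "ExternalSecret",
    "metadata": {"name": "foobar"},
    "spec": {
        # Assumed polling period: the operator re-reads the store on a timer.
        "refreshInterval": "10s",
        # The store named "fake" in the demo; the kind is an assumption.
        "secretStoreRef": {"name": "fake", "kind": "SecretStore"},
        # Name of the Kubernetes Secret that gets created or updated.
        "target": {"name": "foobar"},
        # Copy the store's foo/bar entry into the Secret's hello key.
        "data": [
            {"secretKey": "hello", "remoteRef": {"key": "foo/bar"}},
        ],
    },
}

manifest = json.dumps(external_secret, indent=2)
print(manifest)
```

The refresh interval is what makes the synchronization periodic rather than event-driven, which matches the polling behaviour described earlier in the talk.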
I'm not sure if you can see that. Now, if I change this secret right now, and this is just a command that patches the fake store to change the secret value, and I go back and check the secret value, it should be changed to everyone. Now, if I try to curl the service again, there are no changes here. So if I manually restart the pod, let's see, do I have the command here? Yeah, I have a rollout command. If I manually restart the pod and restart the port forward as well, then I should see that the secret value has in fact changed. Maybe I haven't shown you, but I do have the application deployment here that references the foobar secret. All right. So now we have the secret synchronization part covered. Now, let's see how it works if I want the workload to be automatically rolled out when the secret changes. I can annotate the echo server with this Reloader annotation, which will make Reloader start watching this workload and the secrets mounted in it. So nothing has changed yet. I should still see everyone. That's fine. And now let's change the secret again, to FOSDEM. So, yeah, the secret is changed to FOSDEM. And if we take a look at the, I probably have to restart this. If we take a look at the service, it should now say hello FOSDEM. So in this case, I didn't have to restart the workload manually, because Reloader did that for me when I changed the secret. When I changed the secret in the store, External Secrets synchronized it into the Kubernetes secret, and Reloader noticed that change, so it rolled out the deployment. So that's what I wanted to show you today. If you have any questions, I'm happy to answer them. Hi. Thanks for your presentation. Can we use a reloader? Can you speak up, please, because I can't hear you. Please stay quiet. Thank you. Can we use Reloader without Kubernetes secrets? Can we use Reloader without syncing to Kubernetes secrets? I mean, you absolutely can.
So with Reloader, you can watch either secrets or config maps, or both if you want to. But you need to use Kubernetes secrets and config maps. How you change the secrets is up to you. If you don't want to synchronize automatically, you don't have to. You can use Reloader just to trigger a reload without using External Secrets or synchronized secrets. So if you want to do that manually, you can absolutely do that. Does that answer your question? No. I would like to do something like synchronizing secrets right into volumes, for example, skipping Kubernetes secrets totally, because we don't want to persist them in etcd. So no, Reloader is probably not really useful in that case. But I see what you mean. If, for example, you use something like vault-env and you grab the secrets directly from within the pod and you want to trigger a reload, then no, Reloader can't be used that way. But we are actually working on it. I'm from Cisco, and before that I was working for Banzai Cloud, and we are working on a solution right now exactly for that, so we can have a component that watches secrets that have external Vault references and triggers reloads for workloads based on those changes. But none of these tools support that at the moment. So are there some risks in using this method instead of using, for example, Vault directly? I mean, with Vault, if you watch for a secret that should be written to a file or somewhere, and the secret changes, Vault usually emits a signal like a SIGHUP to reload the process. So what happens when the secret changes? Usually Vault emits a signal, a HUP signal, to reload the process and load the configuration. This way you are reloading the whole container, so there are some risks. The problem is that that only works if you talk to Vault directly from your workloads, and with this solution you don't have to integrate Vault directly; you can use whatever secret store you want.
And the problem is that Vault doesn't actually know where it should send its signal. In this case you may deploy the secrets to a number of different clusters, and Vault wouldn't know where to send those signals. So the main advantage is that the solution is fully transparent. I don't know. We have time for one more question. Any advice about tools to do the rotation on the other side, for example, to rotate the standard database credentials, something like that, which will automatically update the secret store and then trigger the workload? The problem with that is that there are many different secret providers. So it's really hard to build a central solution for that. But HashiCorp Vault is one. Vault has a bunch of, I think they're called secret backends or something like that, that you can use to issue credentials, for example, to a Postgres database. And that credential can actually have a TTL, a deadline. After a certain time, Vault would issue a new pair of credentials, and then External Secrets would be able to synchronize those credentials. We actually use that with the AWS backend. And that's how we rotate database credentials every 90 days. Okay. Thank you so much for the talk. Thank you for all the questions. Thank you for staying quiet. Thank you. |
Quick starting secure container storage using squashfs, overlay and dm-verity |
The next talk is by Scott. He's going to talk about quick-starting secure container storage. I'm Scott Moser, I work for Cisco Systems, and over the past three years or so we've been working on a project internally that implements a lot of the image-based workflows that were being talked about in another room, just building that up piece by piece. That is called Project Machine, and that's what I primarily work on, along with stuff around it. Through that we came to some needs and desires to change how we were running containers, and that's what we've got here. The goal of this talk is pretty simple. Our goal was really just to replace the tar and gzip format in an OCI image with SquashFS, and I'll discuss the benefits of that. I'll show some comparisons of what the registry data looks like and what the registry sees, and compare what the runtime looks like and what's different there. And then I'll give a little demo. And the sales pitch part: there are two tools that are ours. They're open source, and they're decent tools, so I'll show them here. We build with Stacker, we sign with Cosign, we publish to Zot, and then we run with LXC. Probably everybody here is familiar with LXC. So getting SquashFS file system images into a registry looks a lot like it does with tar and gzip images. We just put files into the registry, and the metadata contains a list of images. The index is a list of images, each of those images has a list of layers, and the difference really is just in the media type of the layer. In both cases we have a signed checksum of the tarball, or of the image, right, that data's there so you can know what it is. And then in addition, on the SquashFS one, we put the dm-verity root hash in the metadata and we sign that. That comes into play later. Oh, I went backward. There we go. So now, at run time, the images really do look very similar.
Both of them work much the same way: we either uncompress the layers with tar and gzip, or we copy the image out of the repository to a place on the disk. And then you can either share that same location for every container, or you can take a copy of it for each container that you're going to launch. That path makes garbage collection a little bit easier. And then in tar world, if you want to compare the file system that is running against the thing that you downloaded, that's a real pain in the rear end, right? You've basically got to look at the contents of all the files, look at their modification times, and compare that to the compressed tarball, or extract it and just compare two file system trees. It's a real pain. With SquashFS, the image is there. It was read-only, so you just shasum it, and if the shasum matches the shasum of what you downloaded, you know you're good, right? So there's a lot of benefit out of that. The primary reason that we got here and were looking into something else was really that once you've extracted a tar file system out, there's kind of no way to put it back in. You can't ever really go back and verify that you're running what you thought. And then at run time, the other benefit of SquashFS and dm-verity is what we get with privileged mounts. If we're running a container as real root and can do a mount, then we can use that dm-verity data that we put in the metadata, so that the kernel can actually verify the data as it reads it off the disk. But when we're running unprivileged containers, where you can't do a mount, we do the mount with squashfuse, and there you can't use dm-verity. There is, to my knowledge, no way to use the device mapper and get dm-verity on a block device without being real root. So, let's see.
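The checksum comparison described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the layer sits on disk as a single read-only file and that the registry digest is a string of the form `sha256:<hex>`; the helper names are made up for illustration and are not part of stacker or zot.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_registry_digest(path, oci_digest):
    """Compare an on-disk image file against a 'sha256:<hex>' digest string."""
    algorithm, _, expected = oci_digest.partition(":")
    if algorithm != "sha256":
        raise ValueError("this sketch only handles sha256 digests")
    return sha256_of_file(path) == expected
```

With an extracted tar layer there is no single file left to hash, which is the pain point the speaker describes; with a SquashFS layer the file that is mounted is the file that was downloaded.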
And then another benefit is that the file system doesn't implement write, right? So you're not going to be attacked through the file system. Nobody's going to be replacing a binary there. If they're going to get to it, they have to come in from the other side and modify the data on the disk, and that should be caught via the checksum or dm-verity. But that comes at a little bit of cost, because basically everybody and their brother can read a tarball, right? SquashFS is a little bit less readable; there are good tools, but they're not as widely deployed. Oh, yeah, and I just want to point out that the change here is not revolutionary, it's evolutionary. It's a small change. There are changes being discussed for OCI image v2, or v2 repositories, and different file formats that would really revolutionize things and do much, much better than this. But this is a significant improvement upon what's there right now. So, SquashFS doesn't have write support, so you end up having to use overlay. Overlay, again, was talked about in the image-based workflow track all the time. I'm not sure how well aware of it people here are, but I think it's probably fairly common knowledge. It's a kernel file system. It's very mature, and you can basically stack file system data on top of each other and get the same basic tree that extracting a series of layers from an OCI image gets you. I don't really know which came first, the whiteouts in the overlay file system in the kernel or the whiteouts that are in a tarball layer. I don't know which came first, but they look very similar.
So the ones we're using, the ones that stacker stores in the images, are just the same as the kernel writes them. So we just use overlay there, and it's simple and very useful. Yeah, and then I say overlay bugs are present in the kernel. It has slightly different semantics than some other file systems, but largely, over the past 10 years, they have been really well squashed. So it works really well. And then the last thing is that if you're using an overlay, you can easily see the changes that were made to a file system, because they're basically all in a single tree. You can look at the overlay layer and see these are the files that were written. So dm-verity is the device mapper's verification target. It's a feature in the kernel that uses a technology called a Merkle tree, which lets you provide a single hash at the top, with the hashes of each block cascading down beneath it and built into that. So basically you can mount the thing immediately and start reading. And what will happen is you get bad reads if there's bad data, or, as I learned today, you could also trigger a crash or something if there was an integrity violation there. So that's dm-verity, very useful. But again, that only works as real root. So let's go ahead and try to do a demo of this. Now, if you think I'm just giving a sales pitch, maybe I am. I don't say a lot of good things about software in general. But these two that I'm selling are reasonable pieces of software, and they may help you. So we're going to build with stacker. We're going to sign things with cosign. We're going to publish to zot, and we're going to run things with LXC. Let's see how this goes. This worked at 3am last night. So, you know, let's see. I just do that. All right.
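The Merkle tree idea behind dm-verity can be illustrated with a short Python sketch. This is a simplified binary tree over 4096-byte blocks, assumed here purely for illustration; the real dm-verity on-disk format uses a much wider fan-out, a salt, and its own layout, so the root computed here will not match a root hash produced by veritysetup.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this illustration


def _hash(data):
    return hashlib.sha256(data).digest()


def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut the image into fixed-size blocks (the last one may be short)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]


def merkle_root(blocks):
    """Hash every block, then hash pairs of hashes until one root remains."""
    level = [_hash(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Because every block hash feeds into the root, signing the root hash is enough to cover the whole image, and a reader only has to check the hashes on the path from the block it just read up to the root. That is why the image can be mounted and read immediately rather than being hashed in full first.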
So this is a stacker file. And then again, stacker is our build tool. It's really very similar to Docker in what it's capable of. It runs completely unprivileged. It can also run privileged, but it runs completely unprivileged and allows you to build OCI images. You can build them with either tar or SquashFS as the layer type. Let's see. It's a very mature project and it's working towards CNCF inclusion. So it works, and it pretty much works out of the box. It's not a single binary. It runs on distros pretty close to out of the box. So you don't need to have a huge stack to try it out. So I'm going to go ahead and do stacker build. I'm going to do it this way because, yeah, there's no way I was going to type that right. All right. So there we said build. I want to build both layer types, tar and SquashFS. And then substitute just provides a mechanism to substitute values inside the YAML file, because I don't want to go to Docker right now. I'd rather go to a local zot that I'm running. So there, it didn't actually build that. Clearly it didn't do all that apt and everything. It was already built, so it reused its cache. And now we can go ahead and cosign. Yeah, we'll publish those images. So there. Here are the two images that it built. It built one called talkroot-squashfs and one just called talkroot. And the one without the suffix is a tar. And then up there is gzip. And then you can see that these are the same image manifest type. And so largely, tools will still be able to read the squash data that we put up. Like skopeo will still copy it down. You can still move them around without a whole lot of extra work. Let's see. So now go ahead and publish those two images. And that just uploaded them to a local zot that I've got running on localhost. And let's see now. Is that right? What did I mean? Cosign? Come on. So I'll go ahead and generate a cosign key pair. And that is enforced currently in /etc/containers.
Yeah, that's there. Basically I say anything that come from local host there needs to be signed by this key that we just did. So we're going to need to go ahead and sign that stuff. Co-sign is telling me that in that log verbiage that nobody's really going to read, it's telling me that you should not just refer to an image in a repository by its name. You should give the hash because otherwise it might not be what you think you're signing. So that's bad practice. Shame on me. All right, let's see. So now we've got stuff published into ZOT. Our local ZOT is running. And we can see these images are in ZOT. And ZOT is just another thing that we run. It runs an OCI registry. It's really good software. It's really very easily. It's one binary. You take it a little bit of config and then you can run a Docker registry. The biggest benefit that I see out of it is that I don't hit the Docker bandwidth threshold. Because out of our company, whatever it is out of that lab, that usually gets hit by like 7 AM in the morning. So if you don't have something caching, then you're out of luck. Let's see. So now we've got images in our local ZOT. We've built them, signed them, and published them to a local repository. And now I can go ahead and try to run one. So this is the status quo. This is, I create a user namespaced container. And then I can LXC start, minus in. I meant now. It lets you watch it boot. So that's just the tar one. It extracted that to the file system. It mounted up the file system in a user namespace. And it let me run. Come on now. OK, now we can do the same thing. But instead of using the talk root of s, we'll use the talk root of s dash squash of s. We'll name this image. So that then, it copied the OCI, it pulled down the OCI data out of the ZOT repository, put on disk, and then is ready for me to run it. When I run it, I hope. There. All right. So now I've got running on the system. Let's see. 
Here I've got an overlay file system like is mounted underneath that. These squash fused binaries got mounted one, one, and then another, and then another, and then an overlay over the top of those three. So this is running completely unprivileged. I can mount those up and use them in place. Go ahead and see. How much time am I like, OK, then I think I can show another one running as root, but it's basically, oh, actually, yeah, I will go ahead and try to start that just because. If I can. The one that the thing to show there is that. Oh, no, I should end and ask and take questions. Yeah, because I was saying I wasn't sure if you were. So, yeah, so thanks for listening. So I want to thank God for letting me be here and, you know, spend another day on software and complaining about software. And thank my team for letting and for helping me out my family for letting me be gone and Cisco on you guys for coming. This is project machine and anybody got any questions. Sorry. We have time for one question. All right, very clear. Feel free to reach out and thank you. |
Cluster API: Operating Kubernetes with Kubernetes |
You're going to talk about Cluster API: operating Kubernetes with Kubernetes. How? Hello. Thank you for coming. My name is Alex. I'm a software engineer. I work at SUSE on Rancher. I do a lot of stuff related to cluster lifecycle. And today I'm going to talk about Cluster API and operating Kubernetes with Kubernetes. I hope it will be fun. Here is a short summary of what we are going to talk about today. I'll try to explain the problem of managing the Kubernetes cluster lifecycle. I'll try to explain what Cluster API is and how it approaches this problem, and we'll take a look at some building blocks of Cluster API. I'll also be doing a demo, and because I don't have enough time, the demo will be done simultaneously with the talk. So it's a live demo, nothing is recorded, and hopefully everything will be fine. I already had some problems with networking today, but let's see. So let's move on to the next slide. So cluster lifecycle is complicated, and why is that? If you have to manage more than one cluster, say you have 10 Kubernetes clusters or maybe 100 Kubernetes clusters, then the problem becomes similar to managing containers, which is why we invented Kubernetes. Cluster API tries to solve this problem of managing multiple clusters. Sometimes you also have to manage the underlying infrastructure, and that also needs to be done in a nice and consistent way. Then you also have to upgrade clusters, sometimes multiple clusters, and upgrading clusters is not always easy, especially when it comes to control planes. And you want to deploy your clusters on different infrastructure: let's say you have something running on AWS and you also have some bare metal things running, and you need to somehow manage that too.
And you don't want to use different tools that depend on your infrastructure, you want to use something that is a single point of management and it's consistent, it provides some nice experience, and it's easy to use and automate. So what is cluster API? Cluster API takes this approach where we install it, it's an extension to Kubernetes API that allows you to provision, upgrade, and operate your cluster, and you install it on your Kubernetes, then you use what we call management cluster to manage workload clusters. Yes, you can do this on a different infrastructure provider, you can have one management cluster managing stuff running on AWS, and you can have the same cluster managing your clusters on Azure. So this is the basic idea of cluster API, and next we are going to take a look at the building blocks of CAPI, and I will start my demo. But before this, let me switch to the terminal and show you what I have prepared in advance. So I deployed a management cluster where I already installed CAPI so we don't lose time, everything should be up and running, and yeah, let's move on. The main entity in the cluster API is called cluster, and it represents a Kubernetes cluster, it's not tied to some kind of infrastructure, so it's just a generic Kubernetes cluster. And to make it more clear, I will show you how it looks like. 
As you can see, it's a normal Kubernetes object that has a kind and metadata, but what's interesting for us is the spec. Here you can see that the spec references two things. The first reference is a reference to infrastructure, and for this demo I'm going to use Docker as the infrastructure provider. I didn't want to make any requests to a cloud because I wasn't sure the network was going to work properly, so I decided to use Docker; it's the infrastructure provider we use for development and testing. The second interesting reference is a reference to what we call a control plane provider. Because control planes are harder to manage than worker machines, we require a specific resource for that, and this control plane provider is based on a tool called kubeadm, which is the default that you can use with CAPI. So let me create this cluster, and we can take a look at the objects that are referenced inside. The first reference you saw is a reference to a Docker cluster. It's what we call an infrastructure cluster, and it's responsible for all the prerequisites that are required to run your cluster on a given infrastructure. For example, if you're running on a public cloud, it will provision networks, load balancers, security groups, VPCs, and whatever else you need. This reference is actually what makes Cluster API pluggable: if you want to add your own provider, you just have to follow the documentation, implement an API that follows some rules, and then you can reference it, and that's how you plug in your own provider.
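The shape of the Cluster object described above can be sketched as a plain Python dictionary. The transcript does not show the manifest itself, so this is an assumption based on the upstream v1beta1 API groups and the Docker and kubeadm providers named in the talk; the object names (`demo-cluster` and so on) are hypothetical.

```python
# A Cluster object with its two provider references, written as a dict.
# API versions and kinds are assumed from upstream Cluster API v1beta1;
# the names are invented for this illustration.
CLUSTER = {
    "apiVersion": "cluster.x-k8s.io/v1beta1",
    "kind": "Cluster",
    "metadata": {"name": "demo-cluster", "namespace": "default"},
    "spec": {
        "infrastructureRef": {
            "apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
            "kind": "DockerCluster",
            "name": "demo-cluster",
        },
        "controlPlaneRef": {
            "apiVersion": "controlplane.cluster.x-k8s.io/v1beta1",
            "kind": "KubeadmControlPlane",
            "name": "demo-cluster-control-plane",
        },
    },
}


def provider_refs(cluster):
    """Return the kind behind each provider reference in a Cluster spec."""
    spec = cluster["spec"]
    return {key: spec[key]["kind"] for key in ("infrastructureRef", "controlPlaneRef")}
```

Swapping the infrastructure provider means pointing `infrastructureRef` at a different kind; the Cluster object itself stays generic, which is the pluggability the speaker refers to.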
Let me show you how the Docker cluster looks in our case. It's pretty simple, as there is no real infrastructure to run, so I'm going to create it too. Okay, it's done. The next reference we saw in the cluster object was a reference to what we call a control plane provider. What it does is create the control plane machines, generate the cloud config, and it is also responsible for any other actions related to control plane management, stuff like, you know, etcd, CoreDNS, or whatever you implement or want to enable. Let me show you how it looks. This will be the biggest object we have so far, because it contains some configuration we require for our control plane, but as you can see, you can customize some Kubernetes components there using the Kubernetes API, so if you would like, you can specify anything you need here to provision control planes. You can also specify the replicas, and you also need the Kubernetes version there. Now, maybe I forgot to create it. Yeah. Okay, so let's talk about worker machines and how CAPI approaches managing machines. It's important first to note that a machine is just a host for your Kubernetes node, so it can be a virtual machine, it can be bare metal, it can be anything your infrastructure provider supports, and I'd like to show an example with pods. You don't manage pods manually, right? You don't use them as a standalone resource; you use something else. If you want to manage the replica count for your pods, you use something called a ReplicaSet that has just one purpose: create a certain count of pods. And then if you want to do more complex stuff like rolling upgrades, you use a Deployment on top of this that manages the ReplicaSet. CAPI followed the same pattern and created Machines, then there is a MachineSet that manages the replica count, and there is a MachineDeployment on top of that, which does more complicated things. Let's go back to the terminal.
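The single job of a ReplicaSet or MachineSet, keeping a certain count of things alive, comes down to comparing a desired count with what exists. A minimal sketch, assuming machines are represented by name strings and the oldest come first in the list; this is an illustration of the pattern and not the controller's actual code.

```python
def reconcile_replicas(desired, machines):
    """Return (number_to_create, machines_to_delete) for one reconcile pass.

    `machines` is a list of existing machine names, oldest first
    (an assumption made for this sketch).
    """
    if desired < 0:
        raise ValueError("desired replica count cannot be negative")
    surplus = len(machines) - desired
    if surplus > 0:
        return 0, machines[:surplus]  # scale down: drop the oldest ones
    return -surplus, []               # scale up, or nothing to do
```

A MachineDeployment then sits one level higher: it does not create machines itself, it creates and scales MachineSets, in the same way a Deployment drives ReplicaSets.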
I will show you a machine deployment. You can see that, similar to a normal deployment, it has a replica count, then it has a selector and a template, and inside, the spec is similar to what we saw with the cluster object. It has two references: one is for our infrastructure template, which is Docker for this demo, and the second one is a bootstrap provider, which is based on kubeadm. The infrastructure template, or Docker machine template, that we saw in the reference is just the specification for your host. Depending on your cloud provider, it can be an instance type, storage size, anything you put there. The second reference, to the bootstrap provider, is just a reference to an API that generates user data with the proper cloud config, so you can configure your Kubernetes components as you want. Let me show you how it looks. For the Docker machine, it's just an image in this case and some extra mounts, and for the bootstrap provider, we just have some arguments for our Kubernetes components, and this is it. Okay, so this was it. Let me now check if everything works fine. Yeah, everything works fine. As you can see, we have three control plane machines that are running inside Docker containers that we created before, and after some time we should get the worker machine that we just created. Let's take a look at how it all works together. We have a cluster object that represents the cluster. It has to reference an infrastructure provider, which is Docker in this case, and it also has to reference a control plane provider, which is based on kubeadm. Once these two are done with their job, you can connect your machine deployments, which have to reference a machine template, so CAPI knows what specifications you want, and also a kubeadm config template where you can configure your Kubernetes components. This is all you need to create a basic CAPI cluster.
Unfortunately, I don't have enough time to talk about other things that exist in CAPI, like machine health checks that help you track and remediate unhealthy machines. Then there are cluster classes, which are powerful templates for creating clusters. You can also connect the cluster autoscaler if you want, and there are day-two operations coming, so you can think of CAPI as a Swiss Army knife for everything related to cluster lifecycle. And we still have time, so I'm going to show you how we can upgrade the cluster. Let's check its state again. Yeah, so now you can see that we have three control planes, and they are all running Kubernetes v1.25. Let me upgrade them to Kubernetes v1.26. So how do I do this? In order to do this, we have to change the version in the control plane provider object, and we also have to change the image reference in the machine template. Just by doing so, I will start upgrading the cluster. As you can see, Cluster API started to spin up a new control plane machine with v1.26 that is going to replace the old ones, and it's going to take care of things for us like ensuring etcd quorum and all sorts of other things, so we don't have to take care of this ourselves. I'm going to go back to the summary, and let's go once again over what we saw today. I tried to explain the problem of managing Kubernetes clusters, and the main idea: we wanted to have a tool that provides a declarative and consistent API and allows you to provision and manage your clusters on different infrastructure in a nice way, so you have a single point of management for your clusters across all the infrastructures you're running on. And the approach is to use Kubernetes, because Kubernetes already provides a lot of tools for building a powerful API. I think that's it. Maybe I was a bit quick, but I don't have anything else. I'm ready to answer questions if someone has any. Okay, we have ample time for questions. Hi, thanks for the nice demo. This allows you to manage the workload clusters.
Can it also manage the life cycle of the management cluster, or how do you do that? Yes, you can. So what if it destroys itself, what happens then? It shouldn't. Depends on how you use it, but yeah. Works on local clusters, thank you. Thank you. A question about updates: is it possible to update components like kubelets without recreating virtual machines? Yes. And how does it work? It's done through your bootstrap or control plane provider. Yeah, and you also have to provide an image that will be used for your new instances. No, no, I mean if you need to update kubelets and you don't want to reorder new... Yeah, okay. Cluster API doesn't support in-place upgrades. It will create a new machine with a new image, new everything, and then replace the old one. Okay, got it. And can you tell a little more about control plane updates? Sorry? Control plane updates, updates of control plane nodes and components. So I just showed one: when you change the version, it will start replacing old machines with newer ones. You just have to provide all the specifications. You have to provide the new Kubernetes version you want and also a new image. We try to bake everything inside the machine image so you don't have to download new things, and it will just replace the old machine with a new one, with new versions. So it's a replace upgrade. It's not in place. It's the same as pods: if you change, for example, the reference to the image, it will destroy the old one and create a newer one. So it's the same concept. There's an online question in the chat. Are there any latency requirements between the management cluster and the workload cluster? It depends on your use case, but yeah, ideally you should make sure your management cluster is somewhere near the workload clusters, or is able to reach them within some limits. And one more. Does the management cluster need to run all the time, or can it be shut off when not doing life cycle work? So here is the thing.
If you disable it nothing will manage your Kubernetes cluster so they will be basically unmanaged. Yeah, your workload cluster will continue running but there is nothing that will keep track of them. For example, if you use cluster autoscaler or machine health checks you need your management cluster to be running all the time because it constantly looks at the state of your workload clusters. Okay. If there are no more questions and we can end a few minutes early. Thank you for the talk. Thank you all for attending. |
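The replace-style upgrade described in the talk and the Q&A (bring up a machine on the new version, then remove an old one, and repeat) can be simulated in a few lines of Python. This is a minimal sketch that assumes one extra machine is added at a time and models machines only by their version string; the real rollout also handles things such as etcd membership and health checks, which are left out here.

```python
def rolling_replace(machines, new_version):
    """Simulate a replace upgrade and return every intermediate state.

    `machines` is a list of version strings, e.g. ["v1.25", "v1.25", "v1.25"].
    """
    current = list(machines)
    history = [list(current)]
    for old_version in machines:
        if old_version == new_version:
            continue                     # already up to date, nothing to replace
        current.append(new_version)      # create the replacement first
        history.append(list(current))
        current.remove(old_version)      # then delete one old machine
        history.append(list(current))
    return history
```

Because the new machine is created before an old one is removed, the machine count never drops below the starting count during the rollout, which is what keeps a three-node control plane at three or more members throughout.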
7 years of cgroup v2: the future of Linux resource control |
Okay. I think we're ready to start. Oh, excellent. This time it worked perfectly. Thank you so much. Yeah. Chris is going to talk about cgroup v2, seven years of cgroup v2 in the kernel, a very exciting time, and the future of Linux resource control. Take it away. Hello, everybody. Oh, yes. Please go on. Thank you. That's it. I'm done. Goodbye. Hello. I'm Chris Down. I work as a kernel engineer at Meta. I work on the kernel's memory management subsystem, and in particular I'm a contributor to cgroups, which are one of the things that underpin our model of containers. I'm also a maintainer of the systemd project. So there are two things on this slide which you can hate me for. Most of the time I'm thinking about how we can make Linux just a little bit more reliable, just a little bit more usable at scale. We have a million-plus machines. We can't just buy more RAM. It's not really a thing we can do. So we need to extract the absolute maximum from every single machine. Otherwise, there's a huge loss of capacity that could result. So that's the kind of thing I want to talk to you about today: how, over the last seven years, we have done this at Meta, how we've improved reliability and capacity and extracted more efficiency. At Meta and in industry, we are increasingly facing this kind of problem where we can't effectively solve scaling problems just by throwing hardware at the problem. We can't construct data centers fast enough. We can't source clean power fast enough. We have hundreds of thousands of machines and we just can't afford to waste capacity, because any small loss in capacity on a single machine translates to a very large amount at scale. Ultimately, what we need to do is use resources more efficiently, and we need to build the kernel infrastructure in order to do that. Another challenge that we have is that many huge site incidents for companies like us and companies of our size are caused by lacking resource control.
Not being able to control things like CPU, IO, memory and the like is one of the most pervasive causes of incidents and outages across our industry, and we need to sustain an initiative industry-wide in order to fix this. So how does all of this relate to this cgroups thing in the title? Well, cgroups are a kernel mechanism to balance, control and isolate things like memory, CPU, IO, things that you share across a machine, things that processes share. I'm sure that if you've operated containers before, which I'm going to assume you have, judging by the fact you're in this room (otherwise you may be lost and looking for the AI room), you know that every single modern container runtime uses this. Docker uses it, CoreOS uses it, Kubernetes uses it, systemd uses it. The reason they use it is that it's the most mature platform to do this work, and it solves a lot of the long-standing problems we had with classic resource control in the form of ulimits and things like that. Cgroups have existed for about 14 years now and they have changed a lot in that time. Most notably, seven years ago in kernel 4.5 we released cgroup v2. I gave a whole talk around the time when that happened on why we were moving to a totally new interface, why we weren't just iterating on the old interface, and if you're interested in a really in-depth look at that, then here's a talk which you can go and take a look at. But the most fundamental change really is that in cgroup v2 you enable or disable resources in the context of a particular cgroup. In cgroup v1 what you have is a hierarchy for memory and a hierarchy for CPU, and the two will never meet. Those two things are completely independent. When systemd creates things in cgroup v1 it will name them the same, they get called something.slice or something.service, but they have no relation to each other across resources.
But in cgroup v2 you have just a single cgroup hierarchy, and you enable or disable resources in the context of a particular cgroup, so you can enable, say, memory control and IO control together. That might seem like an aesthetic kind of concern, but it's really not. Without this major API change we simply cannot use cgroups to do complex resource control. Take the following scenario. Memory starts to run out on your machine. So when we start to run out of memory on pretty much any modern operating system, what do you do? Well, you try and go and free some up. So we start to reclaim some page caches. We start to reclaim maybe some anonymous pages if we have swap. And this results in disk IO. And if we're particularly memory bound and it's really hard to free pages, and we're having to walk the pages over and over to try and find stuff to free, then it's going to cost a non-trivial amount of CPU cycles to do so. Looking through available memory to find pages which can be freed can be extremely expensive on memory-bound workloads. On some highly loaded or memory-bound systems it can take a double-digit percentage of the machine's CPU just to do this walking. It's a highly expensive process. And without having the single resource hierarchy we cannot take into account these transfers between the different resources, how one leads to another, because they're all completely independent. If you've been in the containers devroom before, you're probably thinking, I've seen this guy before, and I think he gave this exact talk about three years ago. I'm sure some of you are thinking that already. Well, the company name isn't the only thing which has changed since 2020. Some cgroups things have also changed since 2020, and obviously I don't want to rehash the same things over and over. I don't want to bore you. So this talk will mostly be about the changes since the last time I was here in 2020, with just a little bit of context setting, just a little bit.
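Enabling controllers together on one cgroup, as described above, is done in cgroup v2 by writing to the `cgroup.subtree_control` file of the parent cgroup. A minimal Python sketch, assuming the unified hierarchy is mounted (commonly at `/sys/fs/cgroup`) and the caller has permission to write there; the function name is made up for this illustration.

```python
from pathlib import Path


def enable_controllers(cgroup_dir, controllers):
    """Enable controllers for the children of `cgroup_dir` in one write.

    cgroup v2 expects a space-separated list such as "+memory +io";
    a "-" prefix instead of "+" would disable a controller.
    """
    request = " ".join(f"+{name}" for name in controllers)
    (Path(cgroup_dir) / "cgroup.subtree_control").write_text(request)
    return request
```

On a real system this would be called as something like `enable_controllers("/sys/fs/cgroup/my.slice", ["memory", "io"])`, where `my.slice` is a hypothetical cgroup. Since both controllers now apply to the same set of child cgroups, the kernel can attribute the IO and CPU cost caused by memory reclaim to the cgroup responsible for it.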
This talk is really about the process of getting resource isolation working at scale. It's what it needs to happen in production not just in a theoretical concern. The elephant in the room of course is COVID. The last three years have seen pretty significant changes in behavior due to COVID especially for a platform like Facebook which we own of course. This was by about 27% over what you would usually expect and this came at a time where not only you're seeing increased demand but you literally can't go out and buy memory. You can't go out and buy more CPUs. You can't go out and buy more disks because there's a shortage because there's COVID. So what we really needed was to make more efficient use of the existing resources on the machine right. We need to have an acceleration or existing efforts around resource control in order to do that to make things more efficient. Now almost every single time that I give this sounds like a personal point of concern. Every time I give this talk somebody on Hacker News comments why don't you just get some more memory? Now I don't know how trivial people in this room think that is when you've got several million servers but it is slightly difficult sometimes. For example there's a huge amount of cost involved there and not just the money which is indeed substantial and I'm very glad it's not coming out of my bank account but also in things like power draw, in things like thermals, in things like hardware design trade-offs. Not to mention during COVID you just couldn't get these kind of, you couldn't get a hard drive, you couldn't get some memory. You'd go down to your local Best Buy and do it but that's about it. So not really an option. So here's a simple little proposition for you, for anyone in the room who wants to be brave. How do you view memory usage for a process in Linux? Oh come on. Free! My man said free. Oh lord. This was a trap. So I appreciate it though, big up about that. 
So yeah, free and the like really only measure one type of memory. They do have caches and buffers on the side, but the thing is, okay, so for free, or for ps, which was shouted at the back, you do see something like the resident set size and some other details, and you might be thinking, hey, that's fine, I don't really care about some of the other things, that's the bit which my application is really using. For example, we don't necessarily think that our programs rely on caches and buffers to operate in any sustainable way. But the problem is that for any sufficiently complex system, the answer is almost certainly that a lot of those caches and buffers are not optional. They are basically essential. Let's take Chrome just as a facile example. The Chrome binary's code segment is over 130 megs. He's a chunky boy. He is. He's a big boy. We load this code into memory. We do it gradually. We're not maniacs. We do it gradually, but we do it as part of the page cache. And if you want to execute some particular part of Chrome, the cache that has the code in it that runs this particular part of Chrome isn't just nice to have. We literally cannot make any forward progress without that part of the cache. The same goes for caches for the files you're loading; especially for something like Chrome you probably do have a lot of caches, so eventually those pages are going to have to make their way into the working set. They're going to have to make their way into main memory. In another particularly egregious case, we have a daemon at Meta, and this daemon aggregates metrics across a machine. It sends them to centralized storage, and as part of this it runs a whole bunch of janky scripts, and these janky scripts go and collect things across the machine. I mean, we've all got one.
We've all got this kind of daemon where you collect all kinds of janky stuff and you don't really know what it does, but it sends some nice metrics and it looks nice. One of the things we were able to demonstrate is that while the team that had this daemon thought that it took about 100 to 150 megabytes to run, using the things that we'll talk about in this talk, it actually was more like two gigabytes. So the difference is quite substantial; you could be quite badly misunderstanding what is taking memory on your machine. So in cgroup v2 we have this file called memory.current that measures the current memory usage for the cgroup, including everything like caches, buffers, kernel objects, and so on. So, job done, right? Well, no. The problem here is that whenever somebody comes to these talks and I say something like, don't use RSS to measure your application, they go and say, oh, we've added a new thing called memory.current and it measures everything, great, I'm just going to put some metrics based on that. But it's quite important to understand what it actually means to have everything here, right? The very fact that we are not talking about just the resident set size anymore means the ramifications are fundamentally different. We have caches, buffers, socket memory, TCP memory, kernel objects, all kinds of stuff in here, and that's exactly how it should be, because we need that to prevent abuse of these resources, which are valid resources across the system. They are things we actually need to run. So understanding why reasoning about memory.current might be more complicated than it seems comes down to why, as an industry, we tended to gravitate towards measuring RSS in the first place. We don't measure RSS because it measures anything useful; we measure it because it's really fucking easy to measure. That's the reason we measure RSS. There's no other reason. It doesn't measure anything very useful.
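To make the "everything" in memory.current a little more concrete, here is a minimal Python sketch that breaks a cgroup's memory charge down by type, using the memory.stat file that sits next to memory.current in cgroup v2. The field names (anon, file, kernel_stack, slab, sock) are real memory.stat keys, but the sample numbers, the selection of keys, and the helper names are assumptions made purely for illustration, not figures or code from the talk.

```python
from pathlib import Path

# A handful of real cgroup v2 memory.stat keys; which ones to show is an
# illustrative choice, not an exhaustive list.
KEYS = ("anon", "file", "kernel_stack", "slab", "sock")

# Hypothetical sample content, shaped like a real memory.stat file.
SAMPLE_STAT = """\
anon 157286400
file 1610612736
kernel_stack 1048576
slab 209715200
sock 4194304
"""


def parse_memory_stat(text):
    """Parse the 'key value' lines of a memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def non_anon_share(stats, keys=KEYS):
    """Fraction of the selected charges that is NOT anonymous memory."""
    total = sum(stats.get(k, 0) for k in keys)
    return 0.0 if total == 0 else (total - stats.get("anon", 0)) / total


def read_cgroup_stat(cgroup_dir):
    """Read memory.stat for a real cgroup directory (path is caller-supplied)."""
    return parse_memory_stat(Path(cgroup_dir, "memory.stat").read_text())


if __name__ == "__main__":
    stats = parse_memory_stat(SAMPLE_STAT)
    for key in KEYS:
        print(f"{key:>12}: {stats.get(key, 0) / 2**20:8.1f} MiB")
    print(f"non-anonymous share: {non_anon_share(stats):.0%}")
```

With numbers like these, a tool that only looks at anonymous memory would report roughly 150 MiB, while the cgroup as a whole is charged for close to two gigabytes, which is the kind of gap described for the metrics daemon.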
It kind of tells you vaguely what your application might be doing, but it doesn't tell you about any of the actually interesting parts of your application, only the bits you pretty much already knew. So memory.current suffers from pretty much exactly the opposite problem, which is that it tells you the truth, and we don't really know how to deal with that. We don't really know how to deal with being told how much memory an application is using. For example, if you set an 8 gigabyte memory limit in cgroup v2, how big is memory.current going to be on a machine which has nothing else running on it? It's probably going to be 8 gigabytes, because we've decided that we're going to fill it with all kinds of nice stuff. There's no reason we should evict that. There's no reason we should take away these nice kmem caches. There's no reason we should take away these slabs, because we have free memory, so why not? Why not keep them around? So if there is no pressure for this to shrink from any outside scope, then the slack is just going to expand until it reaches your limit. So what should we do? How should we know what the real needed amount of memory is at a given time? Let's take a Linux kernel build, for example, which with no limits has a peak memory.current of just over 800 megabytes. In cgroup v2 we have this tunable called memory.high. This tunable reclaims memory from the cgroup until it goes back under some threshold. It just keeps on reclaiming and reclaiming and reclaiming and throttling until you get back under. So right now things take about four minutes with no limits. This is about how long it takes to build the kernel. And when I apply a reclaim threshold of 600 megabytes, the job actually finishes in roughly the same amount of time, maybe a second more, with about 25 percent less available memory at peak, and the same even happens when we go down to 400 megabytes.
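As a rough illustration of the experiment just described, the following Python sketch writes a memory.high threshold for a cgroup and reads memory.current back. The file names memory.high and memory.current are the real cgroup v2 interface files; the helper names and the cgroup path in the usage note are hypothetical, and writing to a real cgroup needs suitable permissions.

```python
from pathlib import Path

MIB = 1024 * 1024


def format_limit(limit_bytes):
    """cgroup v2 limit files accept a byte count, or 'max' for no limit."""
    return "max" if limit_bytes is None else str(int(limit_bytes))


def set_memory_high(cgroup_dir, limit_bytes):
    """Set the reclaim/throttle threshold (memory.high) for one cgroup."""
    Path(cgroup_dir, "memory.high").write_text(format_limit(limit_bytes) + "\n")


def read_memory_current(cgroup_dir):
    """Return the cgroup's current total memory charge in bytes."""
    return int(Path(cgroup_dir, "memory.current").read_text())
```

For a kernel build running in a hypothetical cgroup at /sys/fs/cgroup/build, the 600 megabyte run would correspond to `set_memory_high("/sys/fs/cgroup/build", 600 * MIB)`, and passing `None` removes the threshold again.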
Now we're using half the memory that we originally used, with only a few seconds more wall time. It's a pretty good trade-off. However, if we go just a little bit further, then things never even complete. We have to Ctrl-C the build, right, and this is nine minutes in and it still ain't done. So we know that the process needs somewhere between 300 and 400 megabytes of memory, but it's pretty error prone to try and work out what the exact value is. So to get an accurate number for services at scale, which are even more difficult than this because they dynamically shrink and expand depending on load, we need a better, automated way to do that. Determining the exact amount of memory required by an application is a really, really difficult and error-prone task, right? So Senpai is this kind of simple, self-contained tool to continually poll what's called pressure stall information, or PSI. Pressure stall information is essentially a new thing we've added in cgroup v2 to determine whether a particular resource is oversaturated, and we've never really had a metric like this in the Linux kernel before. We've had many related metrics, for example for memory we have things like page cache and buffer usage and so on, but we don't really know how to tell pressure or oversubscription from an efficient use of the system. Those two are very difficult to tell apart, even using things like page scans and so on. So in Senpai, what we do is use these PSI pressure stall metrics to measure the amount of time which threads in a particular cgroup were stuck doing, in this case, memory work. So this pressure equals 0.16 thing, kind of halfway down the slide, means that 0.16 percent of the time I could have been doing more productive work, but I've been stuck doing memory work.
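For readers who have not looked at PSI output before, here is a small Python sketch that parses the text format exposed by a cgroup's memory.pressure file (the same format as /proc/pressure/memory). The "some"/"full" lines and the avg10, avg60, avg300 and total fields are the real kernel format; the sample values and the function name are assumptions for illustration only.

```python
# Hypothetical sample in the kernel's PSI text format. The avg* fields are
# percentages of wall time; total is cumulative stall time in microseconds.
SAMPLE_PRESSURE = """\
some avg10=0.16 avg60=0.08 avg300=0.02 total=123456
full avg10=0.00 avg60=0.00 avg300=0.00 total=1000
"""


def parse_pressure(text):
    """Parse PSI text into {'some': {...}, 'full': {...}} with float values."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


if __name__ == "__main__":
    psi = parse_pressure(SAMPLE_PRESSURE)
    print(f"some avg10 = {psi['some']['avg10']}% of time stalled on memory")
```

Here "some" means at least one task in the cgroup was stalled on memory, while "full" means all non-idle tasks were stalled at the same time.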
This could be things like waiting for a kernel memory lock, it could be things like being throttled, it could be waiting for reclaim to finish. Even more than that, it could be memory-related I/O, which can also dominate, to be honest: things like refaulting file content into the page cache, or swapping in. And pressure is essentially saying, if I had a bit more memory, I would be able to run so much faster, 0.16 percent faster. So using PSI and memory.high, what Senpai does is apply just enough memory pressure on a cgroup to evict cold memory pages that aren't essential for workload performance. It's an integral controller which dynamically adapts to these memory peaks and troughs, an example case being something like a web server, which is somewhere where we have used it. When more requests come, we see that the pressure is growing and we expand the memory.high limit. When fewer requests are coming, we see that and we start to decrease the amount of working set which we give again. So it can be used to answer the question, how much memory does my application actually use over time? And in this case we find, for the compile job, the answer is about 340 megabytes or so, and that's fine. You might be asking yourself, what are the benefits of this shrinking? Why does this even matter, to be honest? Surely when you're starting to run out of memory, Linux is going to do it anyway. And you're not wrong, that's true, but the thing is, what we kind of need here is to get ahead of memory shortages, which could be bad, and amortize the work ahead of time. When your machine is already highly contended, it's already being driven into the ground and going towards the OOM killer, it's pretty hard to say, hey bro, could you just give me some pages right now? It's not exactly what's on its mind. It's probably desperately trying to keep the atomic pool going. So there's another thing as well, which is, you know, it's
pretty good for determining regressions, which is what a lot of people use RSS for, right? This is the way we found out that that daemon was using two gigabytes of memory instead of 150 megabytes of memory. So it's pretty good for finding out, hey, how much does my application actually need to run? The combination of these things means that Senpai is an essential part of how we do workload stacking at Meta, and it not only gives us an accurate read on what the demand is right now, but allows us to adjust stacking expectations depending on what the workload is doing. This feeds into another one of our efforts around efficiency, which is improving memory offloading. Traditionally, on most operating systems, you have only one real memory offloading location, which is your disk. Even if you don't have swap, that's true, because you do things like demand paging, right? You page things in gradually, and you also have to evict and get things into the file cache. So we're also talking here about a lot of granular intermediate areas that could be considered for page offloading, for infrequently accessed pages, the ones that are not really so frequently used. Getting this data into main memory again, though, can be very different in terms of how difficult it is, depending on how far up the triangle you go, right? For example, it's much easier to do it on an SSD than a hard drive, because hard drives, well, they're slow, and they also don't tolerate random head seeking very well. But there are more granular, gradual things that we can do as well. For example, one thing we can do is to start to look at strategies outside of hardware. One of the problems with the duality of either being in RAM or on the disk is that even your disk, even if it's quite fast, even if it's flash, tends to be quite a few orders of magnitude slower than your main memory is. So one area which we have been heavily invested in is looking at what we might term warm
pages. In Linux we have talked a lot about hot pages and cold pages, if you look in the memory management code, but there is this kind of part of the working set where, yes, I do need it relatively frequently, but I don't need it to make forward progress all the time. So zswap is one of these things we can use for that. It's essentially a feature of the Linux kernel which compresses pages which look like they will compress well and are not too hot into a separate pool in main memory. We do have to page fault them back into main memory again if we actually want to use them, of course, but it's several orders of magnitude faster than trying to get them off the disk. We still do have disk swap for infrequently accessed pages; there tends to be quite a bit of cold working set as well. But this is kind of a tiered hierarchy, where we want to have warm pages in zswap, hot pages in main memory, and cold pages in swap. One problem we had here was that even when we configured the kernel to swap as aggressively as possible, it still wouldn't do it. If you've actually looked at the swap code, and I've had the unfortunate misery of working on it, you'll learn that the swap code was implemented a very long time ago by the people who knew what swap did and how things worked, but none of them are around to tell us what the hell anything means anymore, and it's very confusing. So I can't even describe to you how the old algorithm works, because it has about 500 heuristics and I don't know why any of them are there. So for this reason, we tried to think, how can we make this a little bit more efficient? We are using non-rotational disks now. We have zswap, we have flash disks, we have SSDs. We want to make an algorithm which can handle this better. So from kernel 5.8, we have been working on a new algorithm, which has already landed. First, we have code to track all swap-ins and cache misses across the system, so for every
cache page we're having to page fault and evict, and page fault and evict, and page fault and evict, over and over again, what we want to do is try and page out a heap page instead. If we're unlucky and this heap page actually turns out to be hot, then no biggie, we've made a mistake, but we'll try a different one next time. We do have some heuristics to try and work out which one is hot and which one is not, but they are kind of expensive, so we don't use a lot of them. However, if we are lucky and the heap page does stay swapped out, then that's one more page which we can use for file caches, and we can use it for other processes, and this means that we can engage swap a lot more readily in most scenarios. Importantly, though, we are not adding I/O load. This doesn't increase I/O load or decrease endurance of the disk. We are just more intentional in choosing how to apply the I/O. It doesn't double up. We only trade one type of paging for another, and our goal here is to reach an optimal state, where the optimal state is doing the minimum amount of I/O in order to sustain workload performance. So ideally what we have is this tiered model of, like I said, main memory, zswap, and swap on disk. This is a super simple idea compared to the old model. The old algorithm has a lot of kind of weird heuristics, as I mentioned, a lot of penalties, a lot of kind of strange things. In general, it was not really written for an era where SSDs exist or where zswap exists, so it's understandable that it needed some care and attention. So what were the effects of this change in prod? What actually happened? On web servers, we not only noticed an increase in performance, but we also noticed a decrease in heap memory by about two gigabytes or so, out of about 16 gigabytes total. The cache grew to fill this newly freed space, and it grew by about two gigabytes, from about two gigabytes of cache to four gigabytes of cache. We also observed a
measurable increase in web server performance from this change, which is deeply encouraging, and these are all indications that we are now starting to reclaim the right things. We are actually making better decisions, because things are looking pretty positive here. Not only that, but you see a decrease in disk I/O, because we are actually doing things correctly, we are making the correct decisions. And it's not really that often that you get a benefit in performance, disk I/O, and memory usage all at once, instead of having to trade off between them, right? So it probably indicates that this is the better solution for this kind of era. This also meant that on some workloads we now had opportunities to stack where we did not have opportunities to stack before, like running, say, multiple kinds of ads jobs or multiple kinds of web servers on top of each other. Many machines don't use up all of their resources, but they use up just enough that it's pretty hard to stack something else on top, because what's left is not actually enough to sustainably run two workloads side by side. So this is another thing where we've managed to push the needle just a little bit, so that you can get quite a bit more use and efficiency out of the servers that exist. The combination of changes to the swap algorithm, using zswap, and squeezing workloads using Senpai was a huge part of our operation during COVID. All of these things acting together we termed TMO, which stands for transparent memory offloading, and you can see some of the results we've had in production here. In some cases we were able to save up to 20 percent of critical fleet-wide workloads' memory, with either neutral or even, in some cases, positive effects on workload performance. So this opens up a lot of opportunities, obviously, in terms of reliability, stacking, and future growth. This whole topic has a huge amount to cover. I really could just do an entire talk on this. If you want to learn more, I do recommend the
post which is linked at the bottom. My colleagues Johannes and Dan wrote an article with a lot more depth on how we achieved what we achieved, and on things like CXL memory as well. So let's come back to this slide from earlier. We briefly touched on the fact that, if bounded, one resource can just turn into another, a particularly egregious case being memory turning into I/O when it gets bounded. For this reason, it might seem counterintuitive, but we always need controls on I/O when we have controls on memory. Otherwise, memory pressure will always just directly translate to disk I/O. Probably the most intuitive way to solve this is to try to limit disk bandwidth or disk IOPS. However, this doesn't usually work out very well in reality. If you think about any modern storage device, they tend to be quite complex. They're queued devices: you can throw a lot of commands at them in parallel, and when you do that, you often find that, hey, magically it can do more things. It's the same reason we have I/O schedulers, because we can optimize what we do inside the disk. Also, the mixture of I/O really matters, like reads versus writes, sequential versus random. Even on SSDs these things tend to matter. And it's really hard to determine a single metric for loadedness for a storage device, because the cost of one I/O operation or one block of data is extremely variable depending on the wider context. It's also really punitive to just have a limit on how much can I write, how many IOPS can I do, because even if nobody else is using the disk, you're still slowed down to this level. There's no opportunity to make the most of the disk when nobody else is doing anything, right? So it's not really good for the kind of best-effort, bursty work on a machine which we would like to do. So the first way that we try to avoid this problem is by using latency as a metric for workload health. What we might try and do is apply a maximal target latency for I/O
completions on the main workload, and if we exceed that, we start dialing back other cgroups with looser latency requirements, back to their own configured thresholds. What this does is prevent an application from thrashing on memory so much that it just kills I/O across the system. This actually works really well for systems where there's only one workload, but the problem comes when you have a multi-workload stacked case like this. Here we have two high-priority workloads which are stacked on a single machine. One has an io.latency of 10 milliseconds, the other has 30 milliseconds. But the problem here is that as soon as workload one gets into trouble, everyone else is going to suffer, and there's no way around that. We're just going to penalize them, and there's no way to say, how bad is the situation really, and is it really them causing the problem? This is fine if the thing you're throttling is just best effort, but here we have two important workloads, right? So how can we solve this? Our solution is this thing called io.cost, which might look very similar at first, but notice the omission of the units. These are not values in milliseconds. These are weights, in a similar way to how we do CPU scheduling. So how do we know what 40, 60, or 100 mean in this context? Well, they add up to 200. So the idea is, if you are saturating your disk, best effort on the side, with a weight of 40, will get 20 percent of the work, workload one will get 50 percent, and workload two will get 30 percent. So it balances out based on this kind of shares-like or weights-like model. How do we know when we reach this 100 percent saturation, though? What io.cost does is build a linear model of your disk over time. It passively sees how the disk responds to these variable loads, and it works based on things like whether the I/O is a read or a write, whether it's random or sequential, and the size of the I/O. So it boils down this quite complex question of, how much can my disk actually do,
into a linear model which it handles itself. It has a kind of QoS model you can implement, but there's also a basic on-the-fly model using queue depth. You can read more about it in the links at the bottom. I won't waffle on too much, but it is something which you can use to do effective I/O control. In the old days, I came to this room and talked about cgroup v2, and the historical response was basically, that's nice, Docker doesn't support it though, so please leave. I've had a nice chat with some Docker folks. No, the Docker people are very nice, and so are all the other container people, and what's happened is we have it almost everywhere. Almost everywhere, cgroup v2 is a thing. We have quite a diversity of container runtimes, and cgroup v2 is basically supported everywhere. So even if nothing changes from your side, moving to cgroup v2 means that you get significantly more reliable accounting for free. We spent quite a while working with the Docker and systemd folks and so on and so forth to get things working, and we're also really thankful to Fedora for making cgroup v2 the default since Fedora 32. As well as making things more reliable behind the scenes for users, this also got some people's ass into gear when they had an issue on their GitHub that says it doesn't work in Fedora. So cheers, Fedora people. It was a good signal that this is what we are actually doing, this is what we as an industry, as a technology community, are actually doing, and that was quite helpful. The KDE and GNOME folks have also been busy using cgroups to give better management of that kind of desktop handling. David Edmundson and Henri Chain from KDE in particular gave this talk at KDE Akademy. The title of the talk was Using cgroups to make everything amazing. Now, I'm not brazen enough to title my talk that, but I'll just let it speak for itself for theirs. It basically goes over the use of cgroups and cgroup v2 for resource
control and for interactive responsiveness on the desktop. So this is definitely a developing space. Obviously there's been a lot of work on the server side here, but if you're interested in that, I definitely recommend giving the talk a watch. It really goes into the challenges they had and the unique features cgroup v2 has to solve those. Finally, Android is also using the metrics exported by the PSI project in order to detect and prevent memory pressure events which affect the user experience. As you can imagine, on Android interactive latency is extremely important. It would really suck if you're about to click a button, and then you click it, and that requires allocating memory, and the whole phone freezes. I mean, it does still happen sometimes, but obviously this is something which they're trying to work on, and we've been working quite closely with them to integrate the PSI project into Android. Hopefully this talk gave you some ideas about things you'd like to try out for yourself. We're still very actively improving kernel resource control. It might have been seven years since we started, but we still have plenty of things we want to do, and what we really need is your feedback. What we really need is more examples of how the community is using cgroup v2, and the problems and issues you've encountered. Obviously everyone's needs are quite different, and I and others are quite eager to know what we could be doing to help you, what we could be doing to make things better, what we could be doing to make things more intuitive, because there's definitely work to be done there. I'll be around after the talk if you want to chat, but feel free to drop me an email or message me on Mastodon. I'm always happy to hear feedback or suggestions. I've been Chris Down, and this has been 7 years of cgroup v2: the future of Linux resource control. Thank you very much. |
From a database in container to DBaaS on Kubernetes |
Our next talk is by Peter, and he's going to talk about going from a database in a container to DBaaS on Kubernetes. I hope I pronounced this correctly. Yeah. Okay. Hello, everyone. You hear me well? Okay. Cool. So, let me ask first, how many of you folks were involved with open source in the 90s? Anyone remember those days? Well, I was, and for me, in those times, you remember that open source was quite different than today, right? You needed a lot of elbow grease, right? I remember how you had to download the source packages, maybe patch them some way to make sure it works with your particular compiler, and figure out all the libraries, dependencies, all this kind of stuff to make it work, right? And you could feel a certain pride for just installing some applications. Since then, we have had this never-ending move to simplicity, making it possible to run open source software more and more easily, right? So, from that download sources, patch, and compile, we had the wonderful invention of .tar.gz binaries and an install script. Anybody remember those? No? Some do, right? And then we got packages with dependencies, and then those went into repositories, right? And in the end, we come to say, hey, you know what, now we don't really care about the disk space anymore, so we just bundle it all together as Docker or Snap packages, right, with no dependencies. So we got a lot of that move to simplicity, and obviously that is fantastic and convenient. One of those ways, which is very popular, is Docker. And the question in our talk, as it relates to databases, which is my background, is to what extent you can and should use databases with those technologies. And if you look at Docker, we use it actually quite a lot, in particular in test and dev environments, right?
What is wonderful about Docker is that if you want several database versions, or whatever, you can install them very easily on the same node, and they don't conflict with each other, right? Whereas in your classical Linux operating system, if you want to install MySQL 5.5, 5.6, and 5.7 at the same time, because you maybe want to make sure your application is working well, good luck, right? They all conflict on the shared files and so on and so forth. Docker enables that, right? That is absolutely fantastic. And also, you can use solutions like Docker Compose and a bunch of others if you want to deploy your application and the database it depends on in Docker containers, and do it very easily, very nicely. Now, if you look at Docker in production, though, it is also possible, though we actually see less of that. Some of the concerns come from overhead, and I would say these are mostly unfounded these days, but if you Google, you can still find some articles telling scary stories about Docker and databases being absolutely horrible, right? What you also need to take care of in this case is a little bit of extra complexity, which I know a lot of people, especially those who are not Docker experts, have been bitten by. Like, you have to have your database on a data volume for best results. Otherwise, you can remove your database container, right, and boom, all your data is gone. Well, that is a very different experience compared to, let's say, uninstalling the RPM or deb package on Linux, where you can uninstall the database package and your data remains, so you can install a different one, right? In Docker, unless you have the data stored in a separate volume, you will trash your data if you remove the container. Also, production is somewhere you need a lot of monitoring and observability. Okay. Hopefully, that will settle.
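To illustrate both points above, running several MySQL versions side by side and keeping the data on a separate volume, here is a small Python sketch that only builds the docker run command lines. The flags used (--detach, --name, --publish, --volume, --env) and the official image's data directory /var/lib/mysql and MYSQL_ROOT_PASSWORD variable are real; the container names, volume names, host ports, and placeholder password are assumptions chosen for the example.

```python
import subprocess


def mysql_run_command(version, host_port, password="change-me"):
    """Build a `docker run` argv for one MySQL version with a named data volume.

    The named volume outlives the container, so removing the container
    does not remove the data. Names and ports here are illustrative.
    """
    name = f"mysql-{version.replace('.', '')}"
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--publish", f"{host_port}:3306",            # distinct host port per version
        "--volume", f"{name}-data:/var/lib/mysql",   # data lives outside the container
        "--env", f"MYSQL_ROOT_PASSWORD={password}",
        f"mysql:{version}",
    ]


def start_all(versions_and_ports):
    """Start every listed version; requires a working Docker installation."""
    for version, port in versions_and_ports:
        subprocess.run(mysql_run_command(version, port), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them.
    for version, port in [("5.6", 3306), ("5.7", 3307)]:
        print(" ".join(mysql_run_command(version, port)))
```

Because each version gets its own container name, host port, and volume, nothing is shared between them, which is exactly what is hard to achieve with distribution packages.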
So, in production, we often need some observability, right, and monitoring, which initially lacked support for Docker, though I think that has got a lot better these days. So, what is the state of open-source databases with Docker? If you think about that, most open-source databases out there have official Docker images. For those which don't, you will find a variety of unofficial Docker images out there, right, so you can pretty much run it everywhere. It is very commonly deployed for test and dev. If you look at the Docker stats, you will see hundreds of millions of downloads, Docker pulls, for many of them, though I would say in production it is limited, right? I know some companies say, hey, we have deployed our production with Docker with, let's say, our custom orchestration system, but I would not say that is very common. At Percona, for our software, for databases, we essentially do the same thing. We provide Docker images for everything. So, if you are just sticking to Docker, to pure Docker, what problems are not solved very well in this environment? The first, and I think the most important one, is day two operations. Databases are interesting in that, unlike an application, where you often can say, let's just tear down and redeploy from scratch, right, and that is the approach which is increasingly often taken, instead of modifying the application in place. You cannot really do that with a database.
A database is something which has to retain the state, retain the data, not lose any transactions which have been committed, and so on and so forth, and that means that the majority of the complexity, and the majority of the life of a database, happens in what's called day two, right, after you have deployed it. And Docker, while it simplifies your installation, does not really do anything to solve all those needs: to upgrade the database, to deal with high availability, and so on and so forth, right? Also, we can see that a lot of database management problems for real production databases have to be dealt with in the context of a cluster, because every real production database will require high availability, and that cannot be done by a single instance, right? That has to be some sort of distributed cluster, and Docker doesn't really help us in this regard. So what does? Well, as you may have guessed, that would be Kubernetes, right? Really, there have been some other container orchestration systems for years, but I think you can say with confidence that Kubernetes has won at this level, and in this regard it has the largest market share. So where do we see the state of Kubernetes and databases? Well, the relationship has been kind of complicated through the years. Why? Because Kubernetes initially was designed for stateless applications, right? And if something is designed for stateless applications, you can say, using that for databases, are you freaking crazy? Databases are the opposite of stateless, right? The database is where we are supposed to have our state. And I think that is something which has been getting better, right, and now Kubernetes is actually quite capable of running databases as well. But that wasn't always the case, right?
I think it is interesting to look at this tweet, which is from, what, almost four years ago, right? At that time, Kelsey Hightower, who is one of the very well-known experts and thought leaders in the Kubernetes space, was not very sure about running databases on Kubernetes, right? Well, let's see what has changed and look at some stats. Now, over the last few years, in the Kubernetes space, we have had this DoK community, which stands for Data on Kubernetes community. It is very active and really working to enable running data-intensive applications on Kubernetes. And I think with quite good results, right? These are actually a little bit outdated, poll results from last year. But we can see that there is a fair number of companies running a significant number of data-intensive applications on Kubernetes already. Here are some stats which are newer, from the Cloud Native Computing Foundation, right? These are comparing 2021 to 2022, essentially just last year's stats, where we can see that databases were the second fastest growing kind of workload deployed on Kubernetes, right? So we can see that things are changing. Now you can see also some other stuff, like messaging and big data. All of those are also actually data-intensive applications, right? So we can see Kubernetes has moved into this field. Now here is another interesting data point, right? If you look at database as a service, public database as a service, you will see that many independent database-as-a-service solutions which have been released over the last three or four years are actually built on Kubernetes, right? Here are a number of companies which you may have heard of, which are running their public database as a service on Kubernetes.
So I hope by this point I have convinced you that databases on Kubernetes are, you know, quite possible and can be run quite successfully. Now what is wonderful about Kubernetes specifically? Well, I mentioned it as a container orchestration system, right, but you can also think about it as essentially an operating system which is focused on the, you know, data center, right, or a set of data centers, rather than on a single node. What I think is particularly great when it comes to databases is that it has very robust mechanics to deal with all kinds of failures, node failures and some others, right, because these are actually quite complicated problems, right. If you think about, like, very large systems, you have to be thinking about failures happening all the time, often maybe multiple failures at the same time, right, and really doing that manually, right, as you would have to do if you want to, let's say, roll out your highly available system on, you know, bare metal or a bunch of VMs, right. It is tough, right. It is tough to get that, you know, last one, two, whatever, or maybe 0.1 percent of edge cases right, which is absolutely essential for running applications at scale. Specifically for databases, these typically are being built with the operator framework, right. The operator framework is something which, as the name says, allows you to put a lot of logic in, right, and say, hey, do what a skilled database operator would do. Because, being stateful, databases need, like, extra care, right, in how you are going to upgrade the cluster, right. Well, you know what, you don't shut all the database nodes down, right, and then change the version and spin them up. Well, no, you don't do it this way in databases.
You often need to follow some process, you know, upgrade them one after another, maybe wait to ensure the upgrade was successful, that the database was warmed up appropriately, right, and all the very nuanced things which databases at scale need. Now, if you look at databases on Kubernetes, we can see that the pickup by vendors is slower, right. If you think about it, you know, many of them would have operators that are not quite existing, right, or quite limited. And I would say a lot of the reason for that is that, in this age, vendors often would rather want you to go to their database as a service solution, right. The idea is, hey, you know, how we would like you to develop cloud native applications is to go to our solution, right. If you are playing with, you know, MariaDB, go to SkySQL, right. If you are on, you know, MongoDB, go to MongoDB Atlas, right, and so on and so forth, right. But in this case, you often still see a lot of third-party solutions being developed, and then slowly but surely many vendors start to pick it up, right, because hey, you know what, it's better to have a Kubernetes operator than people doing something else entirely, right. So we have, for example, official operators for MySQL or MariaDB or even MongoDB; they are, though, relatively limited at this point. Now, from our side, we've been in this operator game, I think, for a while, and for MySQL, MongoDB, and Postgres we have pretty robust solutions, right, which you can use. What would I say is the problem with Kubernetes in this case? Well, if you look at Kubernetes, it can be quite complicated, right. And running a database is something where you need to be really careful, because often you don't get a second chance, right. If you sort of lose your database, well, that can already be a very big and serious issue for your business, right. And setting up Kubernetes for a database, for, like, storage and backup, right, can be quite an advanced skill at this point.
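The upgrade process described at the start of this passage (upgrade the nodes one after another, check that each upgrade succeeded and that the node is warmed up before touching the next) can be sketched roughly as below. This is a toy Python illustration under stated assumptions, not how any particular database operator implements it: the `Node` class and the `upgrade_node`, `wait_until_ready`, and `rolling_upgrade` helpers are all hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for one database node in a cluster."""
    name: str
    version: str
    healthy: bool = True
    warm: bool = True


def upgrade_node(node: Node, target: str) -> None:
    # Assumption: upgrading restarts the node, so its caches start cold.
    node.version = target
    node.warm = False


def wait_until_ready(node: Node, max_checks: int = 3) -> bool:
    # Poll a bounded number of times; a real operator would run health
    # probes and wait for the buffer pool or cache to warm up.
    for _ in range(max_checks):
        if node.healthy:
            node.warm = True
            return True
    return False


def rolling_upgrade(nodes: List[Node], target: str) -> Tuple[List[str], bool]:
    """Upgrade nodes one at a time and stop at the first failure."""
    upgraded: List[str] = []
    for node in nodes:
        upgrade_node(node, target)
        if not wait_until_ready(node):
            # Remaining nodes stay on the old version and keep serving.
            return upgraded, False
        upgraded.append(node.name)
    return upgraded, True


cluster = [Node("db-0", "1.0"), Node("db-1", "1.0"), Node("db-2", "1.0")]
done, ok = rolling_upgrade(cluster, "1.1")
print(done, ok)
```

The point of the sketch is only the ordering: at no time are all nodes down together, and a failed upgrade halts the rollout instead of spreading to the rest of the cluster.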
Now, if you look at databases, where we see the state of the art in simplicity, I would say, is in database as a service, right. And database as a service, as it is available in proprietary clouds, I think brings a lot of great usability, but of course also at a great cost. And in this case I mean both direct cost as well as the vendor lock-in which happens. If you look at database as a service as it exists right now, there are a number of proprietary database as a service offerings; obviously any large cloud has them. Then there are some database vendors, right, which have their own, if you think about MongoDB, SkySQL from MariaDB, Cockroach Cloud, right, Timescale; everybody also has their own branded database cloud these days. And there is also a bunch of other vendors, right, which also have their own proprietary database management framework, like Aiven, Instaclustr, et cetera. Now, why is database as a service important from my standpoint? Because it really removes a lot of toil, right: the management, high availability, things like, hey, you know, patching, security updates, it all can be done either automatically or, you know, with pretty much a push-button solution, you know, backups; it makes things easy to scale, right, hey, you know what, I want to scale, right, instead of figuring out how to do that. But the problem with database as a service, as it comes right now, is that it often has what I would call, like, Hotel California compatibility, right: you can move into something like Amazon Aurora, right, from your, you know, off-cloud installation, but then it may be very hard to move back. In fact, a lot of work out there is done exactly to make that, hmm, that hard. What you would also see with a lot of the cloud vendors is that those solutions are called fully managed, right?
Well, fully managed is kind of over-marketed in my opinion, right, because when you talk to Amazon, for example, they say, oh, our solution is fully managed. Okay, so who's responsible for database security? Oh, that's shared responsibility. Who's going to tune it? Oh, that's shared responsibility. Well, what if I could not share the responsibility, right? Everything is shared responsibility, and that means, well, which you may not find on the marketing pages, that you still need people who understand databases on your staff. Though, if the budgets have been reallocated to the fully managed database service providers, you may not have those people, or enough of those people, on the team. Now, my concern, of course, with those commercial database as a service solutions is the vendor lock-in, as I mentioned just now, right? Which may be, you know, painful for some, right? Maybe some of you have heard about 37signals, who recently wrote this article about why they are leaving the cloud, saying, oh my gosh, that is, like, so expensive, right? And they specifically mentioned the expense of a lot of, you know, fully managed database solutions they have been using. But that is also something likely to become even more painful. So does anybody recognize this young, good-looking guy out there? Anyone? Well, this is Mr. Larry Ellison, right? And what Mr. Larry Ellison was doing in the 80s, he was really saving people from the nasty Big Blue and the vendor lock-in which was happening, right, with the mainframe, right? But we understand what happened, you know, a couple of decades later, after people were sufficiently saved by Oracle. Now, what do we say? Well, Oracle doesn't have customers, Oracle has hostages, right? So that is what we should expect with database vendor lock-in as well, right, as you sufficiently adopt all of those wonderful extra features and you don't have a way back anymore, right?
You can expect the cost escalating, as with Oracle. In my opinion, though, there is a good way to use the cloud, I would say, as indicated here, where you can really use the cloud as a commodity and build the value through open-source solutions such as Kubernetes, and really look at this side, right? Instead of building a relationship with the proprietary cloud vendors, you can see how you can embrace solutions which are coming from an open-source stack, like the ones in the Cloud Native Computing Foundation. You can see this as an example, right? There are a lot of icons here. You probably cannot really read all of them, but the point I want to make here is just how big the open-source ecosystem is, and you can probably find some project for almost any need which you would have in, like, a proprietary cloud, but in open source. What I would like to see, and what we are working on at Percona in the database space, is to really provide a fully open-source solution which you can run in a variety of environments, right? Like, hey, you want it in a cloud, you want it in any of the on-prem environments, well, you got it, right? You should be able to do that with no changes. If you are just looking for the basics, actually... Well, I take it back. If you have a lot of Kubernetes experience in your company already, the Kubernetes database operators are actually already pretty cool, right? They really, you know, eliminate so much toil, right? And you can check this, you know, tutorial, which shows you how you set up a cluster, scale it, you know, whatever, back it up, right? Really, you know, just a couple of single commands compared to what you would do on Linux. On bare Linux it is a lot more complicated. And for those who like more of a, you know, graphical user interface similar to what Amazon RDS or other cloud vendors provide, we're working on that through our solution PMM, which is also 100% open source.
Well, like, you can, you know, check it out. So, in the end, what we would like to see, right, and hopefully we'll see more similar solutions coming from the industry, is an open source database as a service experience. Some people wonder in this case, like, what does that really mean? Because database as a service is supposed to be, like, fully managed. And what I mean by that is this, right? The first part of database as a service is your interface and experience. Like, hey, I deploy the database in a couple of clicks, right? Or, like, with a single API call. Well, nothing prevents us from having open source software which has those features. We can do it, right, and we should do it. Now, of course, there is another piece, right? Well, typically, when things go beyond the software's ability to deal with them, there are some people, you know, at Amazon, right, or, you know, SkySQL or Aiven, right, any of those providers, right? And of course, that is something, well, you don't get if you get the software alone. But that is something where I believe you should have a choice: either you build those troubleshooting skills in-house, if that is the choice you take, right, or you should be able to pick from a variety of vendors, right, which can cover that kind of need for you, right, to provide a full database as a service experience comparable to what you get in a commercial cloud these days. So, with that, let me finish up with this, right: if you look at the databases, they have been going from container to full database as a service experience on the open source side quite well. We can see that the Docker support is very mature. Kubernetes, I think, is getting there, right, and a lot of people are using it already. And the database as a service experience in the open source space is still a work in progress, but I would expect it's coming and maturing, both from Percona and other vendors, in a few years.
And, well, because it is open source, you can be part of the solution, in this case, by, you know, contributing to the ecosystem. So, I think database as a service has won, right, because of its unparalleled convenience, and you know what, deep down, we're all suckers for that. But software vendor lock-in sucks, right, I don't think anybody wants to build a company on that, and as in many other areas, I believe open source is coming to the rescue as well. With that, that's all I have. Okay, we have time for about one, maybe two questions. Hi, thank you for your presentation. Quick question about the more operational side of running a database in Docker or Kubernetes. So, the main part people are usually scared about is, of course, the stateful part of it. So, storing the data somewhere, whether that's, like, file system snapshots or doing backups and so on, so that it's, in a way, stored in a separate place afterwards. For that second building block, what kind of services would you suggest in that case? Yeah, so, well, the question is about some operational aspects of running a database on Kubernetes, right, and specifically as it relates to storage, right? Well, in our experience, a lot of that depends on what already exists. I think one of the big improvements in Kubernetes recently was having a unified CSI, right, the Container Storage Interface, right, which now allows a lot more flexibility than before, right, and it's ever-improving, right? Like, for example, snapshots, they're built in, right? Or you can now, like, scale the volume in many cases, right? So, that is what we rely on. Okay, we're unfortunately out of time. Thank you for the talk. Thank you for being here. Okay, well, and I will be outside, so if you guys have a few more questions, happy to answer. |
Lightweight Kubernetes Operators with WebAssembly
Towards serverless Kubernetes controllers |
So, hi everyone. I am Merlin and we're going to talk about lightweight Kubernetes operators with WebAssembly. So, basically, it's an attempt to lower the memory and CPU footprint of the Kubernetes control plane. So, I am Merlin. You can also say it in Dutch, Merlijn. And I am a researcher at imec and I teach at Ghent University. I'm also part of the Ubuntu Community Council. But right now, I'm here to talk about my research, which is service orchestration in the cloud and in the edge. And so, it's specifically the edge part of this research. Edge computing is becoming more and more popular. More and more people want to run their applications closer to end users, on devices inside of users' homes, for example. And as a result, you have a lot of these people who are coming from a background of developing cloud applications and who now suddenly want to develop applications that run on devices which are very low-powered. And they really like the development experience of the cloud. They like all the tools. They like the cloud-native experience with tools like Kubernetes, for example. But as most of you might know, Kubernetes isn't really a great fit for the edge. Kubernetes is incredibly resource-hungry. It really likes to gobble up RAM. It really likes to hog all your CPUs. And there are a lot of components inside of the Kubernetes control plane that do this. Part of it is the kubelet that runs on every worker machine. Part of it is the container runtimes themselves, or the API server. But what I'm going to talk about in this session, I think I have no idea why. I still have batteries, so I'm going to talk about operators specifically. Operators tend to eat up a lot of resources from your Kubernetes cluster. So first of all, operators: these are basically plugins to the Kubernetes control plane, which add additional functionality to the Kubernetes API.
For example, it could add a resource to deploy and manage a MySQL cluster, or it could add a resource to deploy and manage a Ceph cluster, for example. And these operators, they are also really resource-hungry. And part of it is because they are long-running processes. So these processes, they see something change in your Kubernetes cluster. They want to do something with it and then write those changes back to the API server in order to manage the applications. But after that writing is done, these processes keep running, because they keep listening for events from the Kubernetes API, or even sometimes manually checking whether some resource has changed. And so even if they're doing nothing, they're still running. A lot of them are written in Golang. And Golang really likes memory. They are running inside of containers. Most of them are running inside of separate containers. And they're basically sitting in RAM doing nothing, eating up that RAM. And so this is an issue if you want to run Kubernetes in the edge, on devices which have, like, 512 megabytes of RAM. These operators are basically unusable in situations like that. So how could we solve this? One of the ways that we think we can solve this is by using WebAssembly and the WebAssembly System Interface. And so yes, really, we're trying to lower the footprint of Kubernetes by taking a web technology and putting it inside of Kubernetes. If you don't believe me, this is a tweet from one of the co-founders of Docker, who basically said, like, if WebAssembly and the WebAssembly System Interface had existed in 2008, they wouldn't have needed to create Docker. It's a very interesting technology which we think is a very good fit to solve this issue in Kubernetes. So what is WebAssembly? Created originally for the browser, it's basically a binary code format. You compile your applications to WebAssembly instead of compiling them to x86 or to ARM.
And then this code runs inside of a runtime. You could call it a very lightweight virtual machine. It runs in your browser, it runs in the Node.js runtime, but there's also a whole bunch of new purpose-built, very lightweight runtimes such as Wasmtime, the one that we're using right now. And the WebAssembly System Interface is basically a syscall interface. So WebAssembly is your binary, but it doesn't have access to anything. And then the system interface is a syscall interface. So that's an interface that it uses to open files, open sockets, start new threads and stuff like that. And so if you combine these two, you basically have a very lightweight, super fast sandbox. And so the result of running these operators inside of WebAssembly containers is that they use a lot less RAM. So here on this slide, at the top, you see 100 operators running as Docker containers. Then you have 100 operators running as WebAssembly containers, and then 100 running just on bare metal. So we're not reaching the performance of bare metal. There's still some overhead. However, compared to the Docker containers, we're getting a lot closer to it. As an advantage that we didn't see coming initially, they also have a lot less latency. They run a lot quicker. This also shows the difference between Golang operators and Rust operators. So obviously, Rust will have a lot less latency and a much narrower latency distribution, because it's not a garbage-collected language. However, we were surprised to see that running them inside of WebAssembly gave them even better, even more consistent latency. So how did we do this? We basically work with a client-server model, or like a parent operator and a child operator. The parent operator is a WebAssembly runtime with a bunch of additions to it in order to support running operators inside of that runtime. And it watches the Kubernetes resources on behalf of the operators running inside of it.
So the operators don't have to keep running to watch it. They can just shut down when there's nothing to do. And the parent operator will call them once there is a change to process. The child operators, that is where the actual operators run. And the interesting part is that they are just regular operators compiled to WebAssembly using a patched version of the Kubernetes SDK. So in the future, this will probably make it possible to just take a regular Kubernetes operator, compile it to WebAssembly, and then use it in this system. Right now, we only support Rust, because Rust support for WebAssembly is very good and Golang support for WebAssembly is iffy. And we have a patched version of kube-rs, a Kubernetes SDK, to contact the parent operator instead of contacting the Kubernetes API itself. So how does this loading and unloading work? This is the WebAssembly engine. This is basically just Wasmtime, the WebAssembly runtime. And in here your client operator, your child operator, is running. Once the child operator wants to contact the Kubernetes API server, it does a syscall. We extended the WebAssembly System Interface to add a few syscalls to support this scenario. And this syscall goes through to the parent operator, and the parent operator is the one who actually contacts the Kubernetes API. Once these calls are finished, the parent operator contacts the child operator back again in order to give it the result of these calls. And if the child operator is not doing anything, the parent operator shuts down the child operator. And once there are changes to process, it starts it up again. And so the results I showed you on the first slides, those results are just not unloading anything. Just running Kubernetes operators inside of WebAssembly. So these results are what you get when you have a worst-case scenario for unloading operators when they're not doing anything.
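The parent/child split described above can be pictured with a small sketch. This is a toy Python model written only for illustration, assuming a much simpler world than the real system: the actual proof of concept uses Wasmtime, Rust operators, and extra WASI syscalls, none of which appear here, and the `ChildOperator` and `ParentOperator` classes and their methods are hypothetical names.

```python
from typing import Callable, Dict


class ChildOperator:
    """Stand-in for an operator compiled to WebAssembly (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def reconcile(self, event: str) -> str:
        # A real child would ask the parent to talk to the Kubernetes API.
        return f"{self.name} reconciled {event}"


class ParentOperator:
    """Watches resources on behalf of children and loads them on demand."""

    def __init__(self, factories: Dict[str, Callable[[], ChildOperator]]) -> None:
        self.factories = factories            # resource kind -> loader
        self.loaded: Dict[str, ChildOperator] = {}
        self.load_count = 0                   # how many (re)loads happened

    def on_event(self, kind: str, event: str) -> str:
        child = self.loaded.get(kind)
        if child is None:
            # The child was unloaded (or never started): load it now.
            child = self.factories[kind]()
            self.loaded[kind] = child
            self.load_count += 1
        return child.reconcile(event)

    def unload_idle(self) -> None:
        # Simplification: treat every loaded child as idle and drop it.
        self.loaded.clear()


parent = ParentOperator({"MySQLCluster": lambda: ChildOperator("mysql-operator")})
print(parent.on_event("MySQLCluster", "spec changed"))
parent.unload_idle()
print(parent.on_event("MySQLCluster", "replica added"), parent.load_count)
```

In this model the memory saving comes from `unload_idle`, and the latency cost mentioned in the talk corresponds to the reload that `on_event` has to perform when a child is not resident.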
And so we see that in a worst-case scenario, they still use 50% less RAM, because they're constantly being unloaded and then reloaded again once there are changes to process. However, this is obviously at the cost of latency. Even though WebAssembly starts incredibly fast, with latency that just can't be compared to Docker containers for starting applications, there is still some latency to start a WebAssembly application. And so this compounds, in the worst-case scenario of, like, 100 operators chained together, up to 12 seconds, which is an issue. So what are we doing now? So we have this basic proof of concept to show that this seems to be a very good approach to lower the footprint of the Kubernetes control plane. And we want to do more with this. Currently, we're improving the build tools and we're making more realistic tests. All the tests we did so far were a worst-case scenario of operators constantly doing stuff. However, in the real world, most operators don't do anything most of the time. So we're creating more realistic tests to see what the performance benefits of these operators are for real workloads. We're also working on predictive unloading, so that if we know that an operator is going to have to run again in a few milliseconds, we don't unload it, because it's better to just keep it running. In the future, we want to work on better support for controllers that wake periodically. So right now, we see that a lot of production controllers actually wake periodically, every five seconds or every 20 seconds, in order to manually check resources in the Kubernetes API, because some of those resources can't work with callbacks. So we are trying to figure out a way to actually put that functionality into the host operator itself, so that even when you're watching resources that don't support event-based APIs, the operator is still sleeping as long as there's nothing to process. And we're also really interested in upstreaming and standardizing this.
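The predictive unloading idea mentioned above (do not unload an operator that is about to run again) could take many forms. The sketch below is one possible heuristic, written in Python purely as an illustration; the talk does not say how the prediction is made, so the moving-average predictor, the `IdlePredictor` class, and its `alpha` and `reload_cost` parameters are all assumptions.

```python
from typing import Optional


class IdlePredictor:
    """Guess how long an operator stays idle from past gaps between events."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha                      # weight of the newest gap
        self.last_event: Optional[float] = None
        self.avg_gap: Optional[float] = None    # smoothed gap in seconds

    def observe(self, timestamp: float) -> None:
        # Update the exponential moving average of inter-event gaps.
        if self.last_event is not None:
            gap = timestamp - self.last_event
            if self.avg_gap is None:
                self.avg_gap = gap
            else:
                self.avg_gap = self.alpha * gap + (1 - self.alpha) * self.avg_gap
        self.last_event = timestamp

    def should_unload(self, reload_cost: float) -> bool:
        # Without history, stay loaded. Otherwise unload only when the
        # expected idle time is longer than the cost of starting again.
        if self.avg_gap is None:
            return False
        return self.avg_gap > reload_cost


busy = IdlePredictor()
for t in (0.0, 0.01, 0.02):
    busy.observe(t)
print(busy.should_unload(reload_cost=0.1))
```

An operator that receives events every few milliseconds stays resident under this rule, while one that wakes every few seconds gets unloaded between events.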
We have patches for kube-rs. We have an extension for the WebAssembly System Interface. It would be very interesting to see if there are people in the ecosystem who are interested in this, and in support for Golang, although this will probably not be work that we're doing; we'll just wait until Golang is better supported in WebAssembly. So I have to thank the developers. Francesco is somewhere here in the audience. We started from a prototype created by Francesco and Marcus, which runs Kubernetes controllers inside of WebAssembly. And we refactored it to use Wasmtime and we added the unloading mechanism. This was done by Tim as part of his master's thesis. And right now, our student Kevin is working on it, also as part of his master's thesis, to improve the build system so that it's much easier to get started with it, and to add predictive unloading and more realistic benchmarks, to have a better idea of what the performance is for actual production controllers. So the main reason I am here today is to say, like, hey, we have a really cool proof of concept, which solves an issue that we have been having. Is this solving an issue for other people in the community? And are you interested in working together on this? If you're interested in working together on this, please get in touch. If you're a student yourself and you want to do, like, an internship or a master's thesis working on this, we have a lot of opportunities, same for a PhD. So please contact us, send me an email to see what we can do for you and how we could collaborate. So this is the end of my presentation and there's now room for questions. I also put the link to part of our code here. I think this GitHub repo also links to the other repositories that you need. Okay, we can take a couple of questions. So why is Wasm so fast, and why is it not possible to do something similar with the JVM?
So definitely, the JVM and WebAssembly are very similar in that regard, and a lot of people position WebAssembly as being like a more cross-platform and more cross-language version of the JVM. But if you're only interested in Java and Java-based languages, then the Java runtime itself is a very good alternative to this. Okay, there was another one over here, right? Yeah. So if I understood correctly, you are deploying your operators outside containers and that makes them much more efficient. But, I mean, besides the security aspects, when you deploy in containers and Kubernetes, you have many other things that you can set, like resource limits, but also things like pod topology spread constraints and annotations to make sure that some processes are running on specific nodes and so on. How can you address that with WebAssembly? Because then you cannot package your operator like any other workload that you deploy in Kubernetes. Yeah, so it's a very good question. So one of the benchmarks was just running the operators on bare metal, but that's not actually what I'm proposing. It was just to see, like, what is the absolute maximum amount of performance we could get out of this. Our plan is to run each operator inside of its own container. It's just a WebAssembly plus WebAssembly System Interface container instead of a Docker container. And so most of the security profile and stuff like that that you have with Docker containers is very similar with WebAssembly. Some would even argue that it's more secure in WebAssembly, because it has a much smaller API footprint and it has some of the best teams working on it to make sure it's secure for the browser. Moreover, the code that is running in these WebAssembly containers in my proof of concept is control plane code. So this is code that the system administrator selected, like, okay, yeah, I want this specific system administration code to manage my applications.
And so in that sense, there's also, like, a higher level of trust put into the code, which means that for things like attacks and stuff like that, there's less of a risk. But even then, like, it's still running inside of containers. So one of the most important scalability aspects of Kubernetes controllers is the watch-based cache, right? So without it, the API server wouldn't be able to handle all the long polling and so on. And it's also one of the most memory-intensive aspects of Kubernetes controllers. I was wondering whether in your memory benchmarks you were cutting down on this watch-based aspect, or if it is still included in the parent operator. So for example, is the parent operator caching as a proxy for the child operators? Is that the case? Yeah, that's what's happening, basically. The parent operator is where the caches are, yeah. |
Making Continuous Delivery Accessible to All |
You want to start? People happy? We are good. You want to welcome people? You can. Hi, everyone. You can hear me. It's working. Welcome, everyone. Thanks for being in the CI/CD devroom. Thanks to the CI/CD devroom moderators for organizing this event. Lori, would you like to go first? Who we are. My name is Lori. I work for JFrog. I'm the CDF's marketing outreach chair. I'm also the chair of CDCon, which is May 8th and 9th this year in Vancouver. That's a picture of me and my kid. That's Instagram, because we look like we're happy, but we had just gotten lost in Central Park. Don't always believe what you see. Thanks, Lori. I work at the Linux Foundation as the executive director of the CDF. I've seen Lori's photo, so I had to put in my photo with my son. He's my son. Seven years old. We are not lost. We are enjoying, what is it, that rotating wheel. That thing. It is in Ireland. Today, we want to talk about a topic which we have been talking a lot about the last couple of years. As you see, we have all these different industries, finance, healthcare, telecom, and if we think about these different industries, continuous integration and continuous delivery or deployment is common to all of them, regardless of what they are doing. All these industries have certain needs and requirements when it comes to developing their products, getting them tested, getting them delivered and made available to their end users. And sometimes when we talk about continuous integration and continuous delivery, we focus on tools and technologies.
Yes, tools and technologies are really important; that is what enables us to establish new pipelines, run end-to-end software flows and get the software out there to the field. But when we think about what continuous integration and continuous delivery actually mean, that part is a bit tricky, because everyone has their own concerns, everyone has their own use case, and when you talk to a certain individual or a certain organization, they come up with different things, because the ways organizations and communities embrace continuous integration and continuous delivery differ from each other. Some of the organizations have been doing continuous delivery for years, they are very mature, they know what they need next, and some of the organizations are a bit slower compared to the others, and they are just starting their continuous delivery transformation, for example. And I actually remember, I think it was three or four years ago, it was this room, if I am not mistaken, I was talking about a similar topic for the telecom industry. And the telecom industry, as you know, has been, like, very coupled, and the telecom industry is going through this transformation, and it has taken a while for the telecom industry to actually embrace continuous delivery. And because of that we need to make sure that we put the individuals, the organizations, and the communities together to make sure we can collaborate across different industries, across different projects and different communities, to find out how we can best help each other and make sure everyone moves forward while they transform their business or communities for continuous delivery. So I'm sure all of you know what continuous delivery is. Continuous delivery is a software engineering practice enabling organizations to deliver software with speed and security, as well as sustainability and other things. And it has become really important for, again, as I said, organizations regardless of the size, type, industry and so on.
And without continuous delivery it may be challenging to push new software out there. But if you go back to what I was just talking about, this is the typical innovation adoption curve, and if you think about different organizations, different companies, different communities, some of them are ahead of others. Like, if you think about Silicon Valley companies, they have been doing continuous delivery for years, decades even, since this term, continuous delivery, was first coined in the 90s or the beginning of the 2000s, and they have come a long way. So those organizations have lots to share with others who may be just starting. And we have some organizations in between; they have already started working with continuous delivery, but they are trying to find out, you know, what to do next. So if you think about the, you know, organizations that are ahead of their peers, for example, they adopted continuous delivery years ago, they are trying to find out what they need to do next, they are thinking about different topics. But if we think about the other end of the curve, those organizations may not be interested at all; they may think they are different from the other companies, they are different from other organizations, and they may say, oh, our product structure is not like that, we can't apply continuous delivery to what we are doing. So we don't need this. And in between, there are different types of organizations doing different things. And that is where we see the actual need to collaborate broadly. Because again, if we look at the adoption curve, the organizations that are ahead of others are thinking about some really hard questions. They are trying to find answers to questions about security, scalability and sustainability. Like, when you start with your continuous delivery pipelines, you start simple. Your pipelines don't cross organizational boundaries. You may have one CI server that may be enough for your organization.
But when the scale comes then things start happening. And how to make sure that your pipelines passing through multiple organizational boundaries could be established providing feedback to different parts of organization from developers to sales teams, for example. And those organizations are looking at scalability aspects, interoperability aspects. They need somewhere to work with these topics and they need somewhere to share their experience with others. And the other group of people might be looking at developer productivity and experience because they have done with continuous delivery. It is working fine, but their experience of their developers are suffering, so they need to fix that. And if we think other types of organizations, they may be after standardization because everything is working fine, but all the teams in the organization doing things in different way and it is very costly and time consuming to maintain those things, for example. And finally, some organizations may be opposing to idea of continuous delivery because their products are different and they don't believe in continuous delivery. And they may need help talking about best practices, white papers and so on. And this is the key thing to make sure that, okay, as I mentioned, tools and technology, it is important, but more important than that is to come together and share all these learnings, challenges, concerns, use case with others so we can learn from each other and push the domain forward together. Okay, so the name of this presentation is making CD accessible for all. And so we are with the Continuous Delivery Foundation. So who here is a part of the CDF? Nobody? So this room is the CI CD room and we don't have one hand going up that, one hand! Yes, two! Okay, so the CD Foundation was brought together to help solve these challenges. It's a vendor-neutral organization under the Linux Foundation and we're here to make things work better. So where do you find us? So get your phones out. 
I know everybody likes phones and likes to take pictures of slides. I am QR code heavy in this presentation. So the first thing you're going to want to do is go to... Stop! Terrible! Bad Fatih, bad! Okay, so the first thing you want to do is check out our website, right? That's got all of our information on it. It's got our projects, it's got where to find us, blah, blah, blah. But if you want to get right into the conversation, go ahead and join our Slack channel. Yes, I know you're on a lot of Slack channels, but that's okay, because this one is alive and well. People are on there commenting and asking questions; the SIGs, the working groups and the projects are on there, and it's your best resource to find out what's going on in the CDF, what people are currently looking for, and where maybe you can contribute. And then when I come back next year and ask how many people know about the CDF and are in the CDF, y'all are going to raise your hands. It's going to be amazing. Okay, challenge! So if you didn't know, the CDF has nine projects and, ooh, I need my notes, hold on. Our first project is probably our most well-known project, which is Jenkins, and it has graduated. I'm just going to give you a little commercial for each of these so that you can go back and see if these are things that you currently use or maybe want to look into. So Jenkins is a leading open source automation server providing hundreds of plugins to support building, deploying and automating any project. And when I say that it's graduated, that means it was incubated within the CDF and then it became sustainable on its own and it's then moved out. We still do everything with Jenkins. They're here, they're in building K, go check them out, they've got lots of good stuff going on.
The next one I want to talk about is, these are all in just crazy order, Spinnaker, which is an open source multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence. Then Screwdriver. So I've got them in the order in which they came into the CDF, and this is just picture order, so let's just not pay attention. Screwdriver is an open source build platform that facilitates the workflow for continuous delivery pipelines. Then we have our little alien, Ortelius, which is all about microservice tracking. Then we've got Jenkins X, which is CI/CD for Kubernetes with preview environments on pull requests, using cloud native pipelines from Tekton. Tekton Pipelines is on there, which also graduated recently, last October, and it's an open source framework for creating CI/CD systems. Shipwright is a framework for building container images on Kubernetes. CDEvents was born out of a SIG and is one of our newest incubating projects. They are at version 0.1 right now, with 0.2 about to come out, which makes it a very exciting project if you're looking for something in the event space.
So CDEvents is a common specification for continuous delivery events, enabling interoperability in the complete software production ecosystem. They've got a lot of cool things they're working on to integrate with the other projects that we have. Oh, Jean-Marc from Jenkins, stand up! There's our Jenkins guy right there. He runs Google Summer of Code for Jenkins. We love him, he's amazing. And then our last project, which just got entered into the CDF, is Pyrsia, and it's creating a decentralized package management network. We don't like to use the word blockchain, you know, but an immutable ledger, all that good stuff, so you can track and see where everything was built. Alright, phones out. So, special interest groups, right? If you're like, I'm not ready to join a project, but what else is going on within the CDF? We currently have five special interest groups: best practices, interoperability, events, software supply chain and MLOps. As Fatih was saying earlier, with best practices it's: what are you doing, how did you scale, how did you get there, what tools are you using, what size is your company, my company is going to be that size, how do I get there in the easiest way possible? Boom, best practices, that's what they're trying to figure out. Interoperability: this one has got a lot of spiciness happening right now. This is sort of a big buzzword and they are working hard; this is one of the most active groups that we have. They are trying to figure out interoperability within an ecosystem: how do you track events, how do you see what's going on, and how do you make all your tools work together so that if something breaks you get notified? Shaking your head, it's fine. Next is our events SIG, and again, CDEvents came out of this SIG. They realized it was more than just a SIG and they wanted to create a project. We have a white paper there, like I said, it's in the back, yes, so there's a white paper in the back, you can learn about it today
and they're about to release their next version, and it's very exciting. We've got some big companies working on it. And again, the thing about CDEvents is it's vendor neutral, so we've got companies like Netflix, eBay, Google, Amazon, just huge corporations that are working to solve this problem collectively, which I think is super important so that we're all on the same page moving forward. Software supply chain: we've all heard about software supply chain attacks, they've gone up 742% in the last year, so this is something that's super important, and again, just like a lot of other groups, we're out there trying to figure out the best way to secure pipelines. And lastly MLOps: it's machine learning, bringing what's learned on the data and science side over to methodology and trying to move forward in a good, practical manner. So, five minutes, so we're good. We would love for you to join us and help build out the future of continuous delivery. We've got end users, we've got vendors, we've got projects; the whole thing is we're working together to solve the problems. If you want to join our mailing list, that's another QR code. And I know I said this earlier, but if you're interested in talking at cdCon, which is in Vancouver, combined with Open Source Summit NA and GitOpsCon, our CFP is open until next Friday. I'm program chair, I would love to read your abstracts and get you all a speaking slot, that would be amazing. And that's our session. Any questions? We've got five minutes for questions, or no questions. Awesome. I do have one question: how much time do we need to participate in the CDF?
So it's volunteer only, right, so you determine how much time you want to volunteer. As marketing chair, I volunteer a lot of hours, but I work for a company that allows me to work in open source, which is really cool, and I also do stuff on my own during my downtime. One of the things I love about working in the developer community is that you guys just love developing in your off time, during your work time, even when you're on vacation; I see you with laptops at the pool. So my suggestion is just check out what we have to offer and see if it's something that interests you. There's nothing wrong with striking up a conversation with somebody online to see: is this worth it, is this really an active group? I think that's what we find out a lot about communities that we join: sometimes they're active, sometimes they're dead. But just give us a shot and I think you'll be happy. Yes? I could have sat even further away. How do you see the adoption of CDEvents in projects outside of the CD Foundation?
Okay, so the CDEvents project is getting contributions from projects outside of the CD Foundation, such as Keptn. I don't know if any of you have heard of Keptn, which is an event-based control plane for continuous delivery and a project at CDF, sorry, CNCF, right, yeah, CNCF. So that kind of collaboration is happening. But the main thing that slowed CDEvents adoption was not having a release. As Lori mentioned, the project made its first release, 0.1, a month or two ago, and they are working on getting CDEvents adopted by two CDF projects, and I am sure the other projects outside of CDF will follow suit. Currently the discussion is around Jenkins and Spinnaker adopting CDEvents, so once they adopt it, others outside of CDF will hopefully follow. Yeah, and we've got it, okay. So do you produce any documentation for people or small organizations who want to use CI processes, or to start using them, to overcome common issues? Yeah, that's best practices. Okay, the question was whether we have any documentation to help organizations of different sizes, like small startups or large ones. Was that correct?
So the best practices SIG actually developed a website; the address is bestpractices.cd.foundation. On that website you can find such information: how to start with continuous delivery regardless of your size, or what you should be looking at depending on the size of your company. That website is a community-developed knowledge base, so you can look at it, and if you find something missing there, you can go and contribute and improve the documentation. So just go to the website, bestpractices.cd.foundation, and you have the entire website for that. And thank you, we are out of time. It's muted. I think Ernu is on. Okay, next one. We made it just on time. So, alright, one last thing: in order to stay in the middle of the screen |
How To Automate Documentation Workflow For Developers |
My name is Portia, I'm from Document Write. We write technical documentation, and today's presentation is called Automate the Pain Away: CI/CD Workflows for Documentation. So once again, it's me. I run a technical documentation agency. Before then I used to work as a Django developer. I spent years, too much time, tinkering with side projects and working with documentation, and you can find me at Document Write on Twitter and YouTube. What we'll cover: what problem we're trying to solve, the tools that exist, how to automate your docs, and the pros and cons of automation. So I talk to clients, and a common problem they tell me is: sounds like a developer wrote the docs. I'm like, okay, and we dig into what exactly that means, sounds like a developer wrote the docs. Some of the problems that companies have with their documentation include a wall of text, which means it's really hard to parse through the documentation. It lacks context; this is really common, because if you wrote the code, you really don't know what an outsider will be confused by. It's out of date; it's not updated on a regular basis. The documentation is incomplete; there are some features that lack documentation. And finally, it's not cohesive; it's very obvious that your Go developer, your React developer and your marketing person all worked on the documentation, and you can tell which section each of them worked on. So, a quick tour of voice and tone. If you want your documentation to sound cohesive, and not like four different departments touched it or like developers wrote it, it's really important to know these concepts. So this is a voice chart, and the purpose of a voice chart is to make sure that your documentation, like I said before, sounds cohesive, and it is based on different principles.
So you get together with your team and you figure out what do we stand for and based on what you stand for that actually informs the concepts, vocabulary and grammar that you will use in your documentation. I can talk about this but it's easier to actually give you some real examples. So this is Google's voice, this example I'm using comes directly from Google's documentation, I promise this is not my opinion, this is just Google. So if you're looking at Google's documentation, they tell you that some of their principles include timeless documentation, catering to a global audience and accessibility, that's all nice and good, but what does that mean? For concepts with timeless documentation, it means it's avoiding words and phrases that anchor documentation in a point in time, like 2015. Vocabulary, it avoids words like now, new and currently, that's in their style guide and grammar, it uses the present tense. So you take your principles and from there, you actually figure out what concept are you going to use, which vocabulary are you going to use and not use and what is your grammar going to be like. Second example, Microsoft. This comes from Microsoft's style guide documentation, once again, I've not made this up, these are not my principles. Warm and relaxed, crisp and clear, bias free communication. So if we look at crisp and clear, it means that we're to the point, we write for scanning first, reading second, we make it simple above all, vocabulary, whenever possible, choose words that have one meaning as opposed to words with multiple meanings and grammar, make every word sentence count, concise, clear sentences. So once again, it's easier for you to get to concise, clear sentences, it's easier for you to get to whenever possible, use words that have one clear meaning when you know what your principle is, that's something that should not be avoided. We talked about voice, there's also tone. 
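The step from principle to vocabulary can be written down as data. The following is a minimal sketch, assuming a hypothetical `VoicePrinciple` structure and using only the three words quoted above (now, new, currently); it is not taken from any published style guide tooling.

```python
import re
from dataclasses import dataclass, field


@dataclass
class VoicePrinciple:
    """One row of a voice chart: a principle and what it implies for writing."""
    name: str
    concept: str
    avoid_words: list[str] = field(default_factory=list)
    grammar: str = ""


# Hypothetical encoding of the "timeless documentation" row described in the talk.
TIMELESS = VoicePrinciple(
    name="timeless documentation",
    concept="avoid anchoring the text to a point in time",
    avoid_words=["now", "new", "currently"],
    grammar="use the present tense",
)


def find_avoided_words(text: str, principle: VoicePrinciple) -> list[str]:
    """Return avoided words found in text (whole words only), in order of appearance."""
    if not principle.avoid_words:
        return []
    pattern = r"\b(" + "|".join(map(re.escape, principle.avoid_words)) + r")\b"
    return [m.group(1).lower() for m in re.finditer(pattern, text, re.IGNORECASE)]


print(find_avoided_words("The new API is currently in beta.", TIMELESS))
```

Keeping the principle, the concept and the word list in one record makes it easier to see why a given word is flagged, which is the point of a voice chart.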
So you can have a voice, but you can have different types of tone for a voice. So if you have a warning label, you'll have like, it'll be urgent, you have an urgent tone, but the voice is the same. If it's empathy, like a getting started guide, you want that person to feel like, yes, they're welcome, yes, they should spend their whole weekend with your documentation, whole weekend. And inclusiveness, the introduction is also welcome. This is what we're about, yes, this documentation is for you and will solve your problem. Those are the different tones. And once again, you have one voice, but several different tones in your documentation. All of this is codified, not in the product manager's head, not in the lead engineer's head, but in a style guide. So the examples that we looked at were from Google and Microsoft style guides. So there's several different style guides that you can use for technical writing. There's a Google, Microsoft, Smashing Magazine has one. And the one that I personally use for my team is Digital Ocean. I love Digital Ocean's technical writing guide because it gets to the point, and it gets the information really quickly. If you're using Microsoft style guide, it's literally over a thousand pages, literally. And I think Google is around over a thousand pages as well. So even with the best intentions, one just doesn't have time for that. Here is a list of common documentation pitfalls. This data comes from the Google season of docs 2021, and some of the pitfalls that you have of documentation include the documentation is lacking, specific use cases, the documentation is disorganized, it's outdated, not consistent. You don't know of these problems, right? Your documentation's perfect. And the documentation needs to be converted into a different tool, platform, or format. Oh no! So many moving parts! No wonder why no one keeps up with their documentation. 
You have voice, you have tone, you have the fact that it's out of date, you have the fact that people are using different tools. How do you keep it together? Automation to the rescue! So this is where we're going to use CI/CD tooling and automation to make sure that we take our principles and best practices and actually implement them. So we're going to take a step back. Using a CI/CD workflow is part of a docs-as-code workflow, and a docs-as-code workflow has several parts. One, you're using version control like Git. Second, you're building documentation with an open source platform like Sphinx, Gatsby, Next.js or Docusaurus. The documentation is written in MDX or Markdown. I know there's some AsciiDoc people out there, but in this presentation we're going to talk about Markdown. And you make use of CI/CD tools. If you want to learn more about docs as code, you can read Anne Gentle's book, Docs Like Code. Her first edition was 2016, and she actually just updated it last month. She's amazing. I love that book. I recommend it to everyone. All right. Automating your team style guide: the starter pack. In this case, we're going to use GitHub Actions, Markdown and Vale. I'll talk about what those tools are. So GitHub Actions, the basics. It's a CI/CD solution provided by GitHub. It's made up of workflows and events. It has a marketplace of third-party actions, so you can write your own actions, but you can also use actions that are already written in the marketplace. And it's great for running linters and tests. Many of you probably already use it for JavaScript and Python and other programming languages. Markdown. It's platform agnostic. Use it with platforms such as Gatsby, Hugo or Docusaurus. Not as complex as code. I put this down because many people who are touching your documentation are not always engineers. In our team, we have a project manager, and she did not know anything about Git.
She knew nothing about code, but she actually updates documentation using Markdown. Another thing I like about Markdown is that you can use MDX, and MDX is where you can import and use React components. So if you want something that is interactive, MDX really gives you more flexibility than plain Markdown. Vale. It's a linter for prose. There are extensions for Visual Studio Code and JetBrains, and for the Vim and Emacs fans out there. It's not the only linter out there. You can check out Woke and write-good. The one thing I will say is that with Vale, you can use many linters within Vale, and Woke is what it sounds like. I'll let you check that out on your own time. The code. So I'm going to, let's see. If we have time, I would love to run a demo, but we'll see what happens. I would have to use a second screen, so let's just stay with this for now. This is the Vale config file, and what you're looking at is the styles path and the alert level. I believe there are three different alert levels. Then the kind of format you're using, which is Markdown, and the style guides. In this case, you're not writing your own style guide, but you're using style guides created by write-good and Google. And this is what a typical workflow looks like. This is the Vale one, and you have on, which triggers the workflow. The trigger is, in this case, a push, and this code will run on the main branch. You're not stuck on one branch; if you want to run this on multiple branches, you can, but in this situation, we're just using one branch. And if you go to lines 25 to 27, those are the style guides that we're using in this situation. We're using Google, and we're also using write-good. I just added this today, because I don't want this to seem like magic. When you're using Google's Vale YAML file, this is what it looks like. It's basically a regular expression doing the checking. This is the first-person one, and it's checking to see if you're using I.
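To take some of the magic out of it, here is a rough sketch of what a regular-expression rule with an alert level could look like. This is not Vale's implementation or its rule format; the `Rule` class, the `lint` function, the pronoun list and the level names are assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Assumed ordering of alert levels, lowest to highest.
LEVELS = {"suggestion": 0, "warning": 1, "error": 2}


@dataclass
class Rule:
    name: str
    pattern: str   # regular expression applied to each line
    level: str     # one of the keys in LEVELS
    message: str   # may contain "{match}"


# Hypothetical rule, loosely modeled on a "first person" style check.
FIRST_PERSON = Rule(
    name="FirstPerson",
    pattern=r"\b(I'm|I|[Mm]y|[Mm]e|[Mm]ine)\b",
    level="warning",
    message="Avoid first-person pronouns such as '{match}'.",
)


def lint(text, rules, min_level="suggestion"):
    """Return (line, column, level, message) tuples for every rule match.

    Rules below min_level are skipped, similar in spirit to a minimum
    alert level setting in a linter config file.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule in rules:
            if LEVELS[rule.level] < LEVELS[min_level]:
                continue
            for m in re.finditer(rule.pattern, line):
                findings.append(
                    (line_no, m.start() + 1, rule.level,
                     rule.message.format(match=m.group(0)))
                )
    return findings


doc = "In this guide I show my setup.\nRun the installer."
for finding in lint(doc, [FIRST_PERSON]):
    print(finding)
```

Raising `min_level` to `"error"` silences this rule, which is one simple way for a team to decide what should fail a build and what should only be reported.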
What you could do, if you want to make your own Vale linter rules, is go to Google, or go to Microsoft, and use their examples to help you get started. Finally, takeaways. The benefits of automating your documentation. It greatly speeds up the publishing process, which is a plus. Git increases workflow transparency. Git blame. It encourages developers to take a more active role in maintaining documentation, because it uses tools similar to the ones developers use to develop code, which are Git, Vim and MDX. And it upskills technical writers. There are many technical writers out there who are not software developers, and they ask a developer: well, what do I need to know? What kind of coding skills do I need? And they're like, C++, which is fine. It's fine. But we're not giving them the kind of advice they need to use this on a day-to-day basis. So if you go to Twitter, if you go to LinkedIn, if you go to the Write the Docs Slack channel, there are technical writers who want to know how to upskill, and knowing how to use Markdown, knowing a sprinkling of HTML, CSS and JavaScript can really give them skills they can actually use on the job, which is really important. So these are coding skills that they can actually apply, and that's one of the reasons why I like docs as code. Caveats. These are the cons. When I work with a team, I want them to already be using a style guide without the automation. This is important because you can figure out what you want in your style guide and what should be taken out, and you can actually build a process. Once you're confident with that process, then it's time to automate. Second, you figure out how to deal with false positives. The linter is going to be wrong. You're going to have to talk to your team and figure out what's going to happen when a tool gives you the wrong answer. I've had teams where they've gotten a false positive, and it really destabilized them.
And so having that plan is important. And finally, this is the most important part. Automation will not solve for poor team communication, pettiness and office bullies. You can't use CI/CD to deal with bullies. You actually need to talk to them, and you actually need a conversation. Thank you very much. You can find me at Document Write on Twitter and YouTube. Wait, wait, wait, yes, okay, yes, we're going to run this. I'm sorry, wait, wait, we only have two or three minutes, so let's not do too much clapping. This is what the repo looks like. This is bare bones. What I have is a getting started guide. I've used Obama Ipsum. You don't need to use Obama. There's actually a Trump one, too. So we have this, and git status, git add, git commit, Obama example. I know this is not great, and then let's push this. So I am now pushing this to GitHub, which is here. So this is my GitHub repo, and here you see the workflows, the config file, the getting started guide, and this is where your actions are located. And to the left are your actions. In this example, I'm only using one action, but you can use many actions, and here I'm actually running the code. So it's still building. And how much time do I have? Do I have one minute? I have, oh, someone said two. So build, yay, this built, which is always a plus during a live demo. And if we scroll down, here are all the issues. So there are different levels. One level is an error, the next level is a warning. So the error is: spell out all ordinal numbers. And what I like about this is, let's see if this is going to work for us. This shows you exactly what your error is, and you don't need a person to comb through the documentation and tell you this. This is a linter telling you exactly what you should change, and your higher-level problems you can have a product manager or a technical writer deal with. And do we have time for questions? No? All right. Thank you, everyone. |
Delivering a crossplane-based platform |
Hi everyone, I'm here today to talk about delivering a Crossplane-based platform. A few words about myself. My name is Maximilian Blatt. I'm a Kubernetes and Crossplane developer and consultant at Accenture in Germany. I've been working with Crossplane for almost two years, or yeah, it's two years now, and I'm the maintainer of several Crossplane-related open source projects, including provider-aws, provider-styra and provider-argocd, and I've contributed to many more, including Crossplane itself. Now, since this is the CI/CD dev room, I don't know if everyone is familiar with Crossplane, so I just want to spend a minute or two explaining what it is. Crossplane essentially is an extension to the Kubernetes API, and it allows you to create cloud resources the way you would create resources in Kubernetes. The thing on the left is something most of you have probably seen once or twice, which is a Kubernetes pod. It's a very common resource in Kubernetes, and it basically just schedules a container where you can run an application. On the right you see a bucket as you would create it with Crossplane, and it represents an actual bucket on AWS S3. If you look at both of these objects, you see that they are very, very similar, because they both live inside the Kubernetes cluster and they both have the same kind of structure. You have your API version and your kind. You have the metadata that comes with every Kubernetes object. You have a declarative spec, where you describe the state of the resource that you want, and then you have the status information about the resource itself. And that is one of the features that Crossplane provides: it connects any kind of external API with Kubernetes and lets you manage your whole cloud infrastructure through one Kubernetes cluster.
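The similarity between the two objects can be shown by writing both as plain dictionaries. This is only an illustrative sketch: the names, the image, and the bucket's API group, version and `forProvider` field are assumptions and may not match the slide or a particular provider release.

```python
# A Kubernetes pod, reduced to its top-level structure.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {"containers": [{"name": "app", "image": "nginx"}]},
    "status": {"phase": "Running"},
}

# A Crossplane-managed S3 bucket (group/version and fields are assumed).
bucket = {
    "apiVersion": "s3.aws.crossplane.io/v1beta1",
    "kind": "Bucket",
    "metadata": {"name": "example-bucket"},
    "spec": {"forProvider": {"locationConstraint": "eu-central-1"}},
    "status": {"conditions": [{"type": "Ready", "status": "True"}]},
}


def shared_shape(a, b):
    """Return the top-level keys that both objects have in common."""
    return sorted(set(a) & set(b))


print(shared_shape(pod, bucket))
```

Both objects share the same five top-level keys, which is why tools that already understand Kubernetes objects can handle Crossplane resources without special treatment.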
And the second very powerful feature of Crossplane is that it allows you to create your own custom Kubernetes APIs by using something called compositions, and that's the thing you can see in the middle. It's a very rough and simplified graph to show the way Crossplane works. It essentially always works like this: the user claims a resource from your API, which you have defined using a so-called XRD, or composite resource definition. That claim is then passed to a composition, and the composition spawns a number of managed resources. Managed resources are something you have seen on the slide before, such as a bucket or any other kind of external resource on any other kind of external API. Today I want to talk mostly about XRDs and compositions, because that is what you do most of the time when you are working with Crossplane. Now, developing a platform with Crossplane. If you look at a simple CI/CD pipeline, you usually have build, test and then deploy. That is very easy, and for most software projects it is also very easy to understand, but Crossplane is a bit different, and you do different things inside these steps. With Crossplane you are first building and pushing a package. You're not writing code; you are just writing YAML objects, which are then applied on the cluster and handled and treated like data by Crossplane. Then, when you are testing your Crossplane platform, you apply all your compositions and your XRDs to a test cluster, you claim them, and you see if they work. If that is okay, you deploy them, which means doing the same thing but on a production cluster.
I don't want to talk about the deployment today, because that is very simple. It is basically just like doing a Kubernetes deployment: you are building an OCI image, pushing it, and then installing it on a cluster using Crossplane, and that's it. There's not much to tell about it, but I want to talk about the building and the testing. Let's start with the building. If you have worked with Crossplane before, then this is probably very familiar to you. On the left you see an XRD as you would write it, and on the right you see a composition. So an XRD, as I said, basically just defines the API that your user applies to, and it's very similar to the custom resource definitions that you write in plain Kubernetes. You have your API schema in the spec of your XRD. Then in the composition you define the resources that should be created when the user claims this API. That can be an arbitrary number of resources, so you don't have to create just one resource; you can create dozens of them. I've written compositions where you are creating 30 or more resources at once. But that is essentially how it is done: you specify a base resource, and then you can modify this resource by copying information from the user claim into the resource that you want to create. That is what you do the whole time you are working with Crossplane: you write an XRD, then you write a composition or multiple compositions, and then the user can claim it and choose the composition that they want.
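The idea of a base resource that is patched with values from the claim can be modeled in a few lines. This is a simplified sketch and not Crossplane's implementation: the `render` function and the dotted-path helpers are hypothetical, only simple field-to-field copies are handled, and the bucket fields are assumed for illustration.

```python
import copy


def get_path(obj, path):
    """Read a nested value using a dotted path such as 'spec.parameters.region'."""
    for key in path.split("."):
        obj = obj[key]
    return obj


def set_path(obj, path, value):
    """Write a nested value using a dotted path, creating intermediate dicts."""
    keys = path.split(".")
    for key in keys[:-1]:
        obj = obj.setdefault(key, {})
    obj[keys[-1]] = value


def render(base, patches, claim):
    """Copy the base resource and apply each patch with values from the claim."""
    resource = copy.deepcopy(base)  # never mutate the shared base
    for patch in patches:
        value = get_path(claim, patch["fromFieldPath"])
        set_path(resource, patch["toFieldPath"], value)
    return resource


# Base resource and patch list, as a composition might describe them (assumed fields).
BASE_BUCKET = {
    "apiVersion": "s3.aws.crossplane.io/v1beta1",
    "kind": "Bucket",
    "spec": {"forProvider": {"acl": "private"}},
}
PATCHES = [
    {"fromFieldPath": "spec.parameters.region",
     "toFieldPath": "spec.forProvider.locationConstraint"},
]

claim = {"spec": {"parameters": {"region": "eu-central-1"}}}
bucket = render(BASE_BUCKET, PATCHES, claim)
print(bucket["spec"]["forProvider"])
```

A composition with 30 resources repeats this base-plus-patches pattern 30 times, which is where the repetition discussed next comes from.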
That now looks easy at first, but when you are doing this on an enterprise level, you very easily end up with compositions that can be thousands of lines of code, where you are creating dozens of objects. Because you are just dealing with pure YAML, you really start to hit the limit, because a lot of things inside compositions are very repetitive. You have very similar structures, let's say, if you are spawning a lot of similar objects on your cluster but in different compositions. Sometimes you have the same patches that you are reusing; for example, if you just want to patch the name of a resource with what the user has given you, then you are repeating this patch over and over for every resource and every file you are writing. And sometimes you have compositions that only vary in details. If you have different environments, for example, you are in different AWS accounts and you only want resources to appear in specific accounts, or you have different values like the region, or static values that you want to connect, like the account ID. Then you have to write the same composition over and over, just with different values, and you see that you are ending up with something that gets really, really complicated, because you're just doing a lot of copy and paste. So you need something to generate the YAML dynamically. In these two years I spent a lot of thought on how to simplify this process, and I have experimented with a bunch of stuff. We've tried out CUE, which is some form of JSON-like framework that allows you to build structures and have them validated, but it's very complex and not very easy for newcomers. So if you have new developers in your teams, it's a bit hard to onboard them on it, because the error messages are not very helpful in many cases. The tool that we ended up establishing was Helm. I'm not the biggest fan of Helm, because it's a bit quirky to use, and sometimes if
you have errors, it's hard to detect where the error actually is, because it just tells you, oh, there's something wrong with your YAML, but you don't know where exactly it happened. But the good thing about Helm is that it can do everything we need. You can replace common code blocks such as constants with things that you have written out in your values YAML, you can use templates to parameterize patches and save lines of code, and you can even replace the API schemas of XRDs with something that you generate, and that is a really, really cool thing. I just checked the code in our repository: we have about a hundred, I'm sorry, 10,000 lines of code for Helm, and we are generating 200,000 lines of compositions that are then applied on our API clusters. If you are generating code for Crossplane with Helm or any other kind of code generation tool, I recommend you check the generated YAML into your Git, because as it turned out, it's very hard to spot unintended changes that you make in Helm with your bare eyes. If you change one value or a template somewhere, it might have side effects that you don't see so easily. So I really recommend you check the generated YAML code into Git and not treat it as an artifact. Then, in your CI, what we are doing, and it is really helpful, is to regenerate your whole package and your generated YAML and see if any diff appears. If that is the case, you should treat this as an error and abort; if there is no diff, it's okay and you can continue and push your package to the OCI repository. So much for the building; now let's look at the testing. 
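The "regenerate and fail on any diff" check can be sketched as follows. This is a minimal illustration, assuming the committed and the freshly regenerated YAML have already been read into dictionaries keyed by file name; the file names and contents are made up. In a real pipeline the same effect is often achieved by regenerating in place and running `git diff --exit-code`.

```python
import difflib


def find_drift(committed, regenerated):
    """Return unified-diff lines for every file whose content differs."""
    drift = []
    for name in sorted(set(committed) | set(regenerated)):
        old = committed.get(name, "").splitlines(keepends=True)
        new = regenerated.get(name, "").splitlines(keepends=True)
        drift.extend(difflib.unified_diff(
            old, new, f"committed/{name}", f"regenerated/{name}"))
    return drift


# Hypothetical file contents: someone changed a Helm value but did not
# commit the regenerated composition.
committed = {"bucket.yaml": "kind: Composition\nregion: eu-central-1\n"}
regenerated = {"bucket.yaml": "kind: Composition\nregion: eu-west-1\n"}

drift = find_drift(committed, regenerated)
print("".join(drift))
exit_code = 1 if drift else 0  # a CI job would abort on a non-zero code
```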
The first thing you probably do when you start working with Crossplane is write your composition, apply it on a cluster, claim it, and then see if it works: whether all the resources become ready and whether you can use them. Then it's done. That is all manual, and it is very easy to do because it requires no additional setup; you can just use the cluster that you have. But when you really want to do automated, enterprise-level testing, that is not enough. Because you have manual steps, you have an outcome that is not reproducible, since you are doing all the things by yourself. You also haven't defined what the expected outcome actually is, because even if a resource becomes healthy, it doesn't mean that the resource is configured the way you want it. So we tried and tested a few things. We started with Go testing, but it turned out to be much more complicated because you have to write a lot of boilerplate code, and so we ended up using KUTTL. I don't know if some people know it. It's basically a Kubernetes testing toolkit that allows you to define all your test cases in YAML and then let KUTTL do all the work, all the applying of the YAML on the server, and then you define the resources that you expect afterwards. If you think of the graph that I showed you before, where you have the composition, you claim it, and then a number of managed resources are spawned: you can have the claim as an input, define the resources that you want to have created as an output, and let KUTTL handle all the rest for you. It can also do things in parallel and such, and this is a really, really great thing. 
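The "claim in, expected resources out" style of test boils down to a partial match: the expected document only lists the fields you care about, and the live object may contain more. The sketch below shows that idea in plain Python. It is not KUTTL's code, and the exact matching rules here (in particular, comparing lists position by position) are an assumption made for the example; the bucket object is hypothetical.

```python
def matches(expected, actual):
    """True if every field in `expected` is present and equal in `actual`."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and matches(value, actual[key])
            for key, value in expected.items())
    if isinstance(expected, list):
        return (isinstance(actual, list)
                and len(expected) == len(actual)
                and all(matches(e, a) for e, a in zip(expected, actual)))
    return expected == actual


# Hypothetical live object as it might come back from the API server.
actual = {
    "kind": "Bucket",
    "metadata": {"name": "my-bucket-x7k2p", "uid": "1234"},
    "spec": {"forProvider": {"region": "eu-central-1", "acl": "private"}},
    "status": {"conditions": [{"type": "Ready", "status": "True",
                               "reason": "Available"}]},
}
# The test only pins down what matters: region and readiness.
expected = {
    "kind": "Bucket",
    "spec": {"forProvider": {"region": "eu-central-1"}},
    "status": {"conditions": [{"type": "Ready", "status": "True"}]},
}
print(matches(expected, actual))
```

Checking the spec as well as the status covers the case mentioned above, where a resource reports healthy but is not configured the way you want.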
So I recommend KUTTL. Just to show you an example of what these tests look like: if we stick to this simple bucket example, you have your small bucket claim on the left, which is your test case, and on the right you define all the objects that you expect. You have the bucket claim itself, which has a resource status that should become ready. Then you have the composite resource, which is an internal resource created by Crossplane where it stores some reconciliation information, and which should also become ready. And then you have your actual bucket managed resource, which has properties that you expect it to have, and it also has a status. That is all you need to do testing with KUTTL for Crossplane. One thing I want to highlight: in Crossplane, the names of the composite resources are always generated by the Kube API server, so every time you claim an API the name is different, and you cannot influence it. What you can do with KUTTL is let it identify the objects that you are expecting via their labels. You don't have to pass the name; instead you just state in the YAML that you want an object with certain properties and labels set. KUTTL will then look for any object on the server, and if there is one that satisfies this constraint, you are good to go. 
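Finding an object by labels instead of by its generated name can be sketched like this. The object list stands in for what the API server would return; the `XBucket` kind, the names, and the label values are made up for the example, and the `crossplane.io/claim-name` label key is used here on the assumption that composite resources carry it.

```python
def find_by_labels(objects, kind, labels):
    """Return all objects of `kind` whose labels include every given pair."""
    return [
        obj for obj in objects
        if obj.get("kind") == kind
        and labels.items() <= obj.get("metadata", {}).get("labels", {}).items()
    ]


# Hypothetical cluster contents: names carry a random, generated suffix.
objects = [
    {"kind": "XBucket",
     "metadata": {"name": "my-bucket-x7k2p",
                  "labels": {"crossplane.io/claim-name": "my-bucket",
                             "crossplane.io/claim-namespace": "team-a"}}},
    {"kind": "XBucket",
     "metadata": {"name": "other-bucket-q9w1z",
                  "labels": {"crossplane.io/claim-name": "other-bucket",
                             "crossplane.io/claim-namespace": "team-a"}}},
]

found = find_by_labels(objects, "XBucket",
                       {"crossplane.io/claim-name": "my-bucket"})
print([obj["metadata"]["name"] for obj in found])
```

The test never mentions `my-bucket-x7k2p`, so it keeps passing when the next run produces a different suffix.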
One other thing that we have found works very well: you should run your tests in separate clusters for every pipeline that you are running. We are using virtual clusters, or vclusters, for that, which run inside a physical cluster. Of course you can create your own physical cluster every time, but spinning up physical clusters, at least on EKS, can take up to 30 minutes, and that is not something that you want for every test; it also costs a lot of money. So you just spin up virtual clusters, which are Kubernetes control planes running as pods inside a cluster, where you can then install Crossplane and its providers, apply the compositions, and run all the tests with KUTTL. Once you are done with the tests, you can just delete the cluster and everything is fine. You also don't have any interference between two different pipelines, which matters because compositions are cluster-scoped and would most likely override each other. Now, I've been talking a lot about end-to-end tests. They are really good, and I recommend you write end-to-end tests when you are building a Crossplane platform, but end-to-end tests also take a lot of time to run. Consider that you have an API where you are creating real, physical cloud resources: you always have to wait for your resource to actually start, and then after some time maybe it says that something is misconfigured, and you have to look for the error. If you're really just doing development, that slows you down a lot, because you always have these 10, 15, 20 minute gaps before something happens. And there are a lot of mistakes that you can make when you are writing compositions, so I just want to highlight a few. You have the composite type refs that link the composition to the XRD; they have to match, and they are only validated at runtime. Then you have the group names, which have to match the XRD name. Then you have an 
unstructured OpenAPI schema, because Kubernetes does not support recursive API schemas yet. Maybe that will come in the future, but as of now it's not supported. The same goes for the resource base, which can also have any kind of field. Then you have the resource patches. If you want to patch from one field to another and the path of your source does not exist, Crossplane's default behavior is to just ignore the patch; it will not throw an error or anything. In that case you might easily swallow errors, and then you're wondering why things don't work, when you just have a typo in your patch, and it's really hard to find these if you have two thousand lines of YAML code. Then you have types that must match: if the user is inputting a string, you have to make sure that a string is actually expected, and not an integer, on the actual bucket API, for example. And then you have the indentation, the big thing when you are writing YAML files. That is my big problem: if I'm writing YAML files I always mess up the indentation, and then things get all messy. So we need something to detect these errors sooner, because the sooner you detect an error, the easier it is to fix. Because there is nothing out there, or at least we couldn't find anything, we have developed a linter for Crossplane compositions. It loads the actual XRD and CRD schemas, compares them with the compositions, and applies a set of rules, like ensuring that the composition actually references a valid XRD type and that you don't have duplicate objects, which can sometimes happen, especially if you are generating things with Helm. The most important thing is that it actually validates the patches against the CRD and XRD schemas. That is really, really helpful: the first time we ran this against our 
production code, it turned out to have, I think, 800 errors that nobody had noticed, but somehow our platform still worked. Yeah. The other cool thing about our linter is that it's pure CLI: you don't need a Kubernetes cluster or a Crossplane installation. You can just run it locally without setting anything else up. It really takes maybe a minute or two, and then you have all your compositions linted, and that is really, really great. If you're wondering where to get it, there will be a link on the last slide where you can find the code. So, summing things up, this is the CI/CD pipeline that we have developed after a couple of years of testing and failing. We use Helm to write and build our compositions and to generate the YAML code dynamically, we use our self-written linter to lint our compositions, and we use KUTTL to run all the end-to-end tests. Then we just push things with crane or any other kind of OCI tool that comes in handy. So much for that. Here's a QR code for the linter. We are actually making this open source today, so you are the first ones to actually see the code, except us. Thank you. Do we have time for questions? Okay, any questions? So my question is more about Crossplane itself; this looks really good, though. How does Crossplane compare to things like Cluster API and the CRDs that it introduces? Where's the distinction between the two of them, if you're familiar with Cluster API? So Crossplane makes use of CRDs under the hood. If you apply your XRDs on the cluster, Crossplane will generate CRDs, which are then used as the API that the user can claim. If there are no more questions, then thank you. We're going to take a five-minute break. |
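As a rough illustration of the patch check described in the Crossplane talk above, the sketch below walks each patch's source path through an OpenAPI-style schema and reports paths that do not exist, which is exactly the kind of typo that is otherwise silently ignored at runtime. This is not the speaker's linter; the schema, the patches, and the function names are hypothetical.

```python
def path_exists(schema, path):
    """Follow a dotted path through nested OpenAPI 'properties' blocks."""
    node = schema
    for key in path.split("."):
        properties = node.get("properties", {})
        if key not in properties:
            return False
        node = properties[key]
    return True


def lint_patches(xrd_schema, patches):
    """Return one message per patch whose source path is not in the schema."""
    return [
        f"patch {index}: unknown source path '{patch['fromFieldPath']}'"
        for index, patch in enumerate(patches)
        if not path_exists(xrd_schema, patch["fromFieldPath"])
    ]


# Hypothetical XRD schema with a single user-facing parameter.
xrd_schema = {"properties": {"spec": {"properties": {"parameters": {
    "properties": {"region": {"type": "string"}}}}}}}
patches = [
    {"fromFieldPath": "spec.parameters.region",
     "toFieldPath": "spec.forProvider.region"},
    {"fromFieldPath": "spec.parameters.regoin",  # typo on purpose
     "toFieldPath": "spec.forProvider.region"},
]

errors = lint_patches(xrd_schema, patches)
print("\n".join(errors))
```

Because the check only needs the schema and the composition, it can run locally with no cluster, which matches the "pure CLI" property mentioned above.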
Continuously Update Everything
A recipe for disaster? |
Yeah, I think I will start; I mean, I like to talk a lot, so I won't have enough time anyway. Thanks for coming to my presentation. Today I'm going to talk about the challenges of updating everything. My name is Olivier Vernin. I'm also one of the CI/CD devroom maintainers. I'm working for SUSE on all things related to Kubernetes and Rancher, so if there is anything you want to talk about, feel free to reach out after my presentation. But today I'm not here to talk about what I'm doing at work. I'm here to present a project that I started before joining SUSE, back when I used to work on the Jenkins project, and that project is named Updatecli. Updatecli is a command-line tool that we use to automate things. It is designed to run on your machine, in the CI environment, wherever. You specify in your manifest what the update strategy should look like. Initially I wanted to talk first about Updatecli and then about all the challenges that you have when you want to automate Docker images, infrastructure, or whatever, but I won't have the time to do that here. For those people going to Ghent for Config Management Camp, I will have more time over there. Here I will just focus on what Updatecli is, what the problem is, and what I'm trying to do. The challenge that I face is that quite often, when I maintain a large number of projects, something that used to work does not work anymore. Say you are using Hugo, for example, to generate a website. At some point you cannot deploy the website anymore because, even though the project released a new Hugo version, they failed to build and publish the Docker image associated with it. 
Or, yesterday I was investigating an issue where Updatecli would deploy and then roll back a version of the NGINX Ingress Controller, and it turned out that the people maintaining that container had published a release, deleted the release, and forgot to remove the Git tag. In those situations you expect something to work, and you want to automate. And the thing is, when you get into those situations, you try to understand why it didn't work. I mean, it used to work for years, and then suddenly it doesn't work anymore. So you spend time trying to understand what the latest version is, what the changelog says, what failed, basically. When you automate those updates so you don't have to pay attention to them, it obviously has benefits, right? I'm curious: who's using, for example, Dependabot or Renovate bot to automate updates? Yes, a few people. It's only the start. Once you start automating things, it obviously gets easier to change your project, your infrastructure, your documentation, no matter what, because you get confident in the change that you want to make. In my case, most of my projects are hosted in Git repositories, and one of the challenges when you think about those Git repositories is that everything is a file. You try to automate updating them, but most of the time a tool has no idea what it is supposed to update, right? For example, Dependabot will just look for a package.json. If it finds a package.json, it will list all the dependencies and try to update them one by one. On the other side, for those people using, for example, a random JSON file, there is no way to know in advance what should be updated in that file. And then you have all the middle ground. For those people familiar with Dockerfiles, you have some instructions that you can update automatically, like the FROM instruction. 
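To make the Dockerfile point concrete, here is a small sketch of why `FROM` is the easy case: the image and tag sit in a fixed position and can be extracted mechanically. It is an illustration only, not how Dependabot, Renovate, or Updatecli parse Dockerfiles, and the regular expression is deliberately simple (it does not handle registries with a port or digest pins, for instance). The Dockerfile content is made up.

```python
import re

# Matches "FROM [--platform=...] image[:tag]" at the start of a line.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(?P<image>[^\s:@]+)(?::(?P<tag>[^\s@]+))?",
    re.IGNORECASE | re.MULTILINE,
)


def base_images(dockerfile):
    """Return (image, tag) pairs for every FROM instruction."""
    return [(m.group("image"), m.group("tag") or "latest")
            for m in FROM_RE.finditer(dockerfile)]


dockerfile = """\
FROM golang:1.20 AS builder
RUN go build ./...
FROM alpine:3.17
ENV HUGO_VERSION=0.101.0
RUN wget https://example.org/hugo_${HUGO_VERSION}.tar.gz
"""

print(base_images(dockerfile))
```

The `ENV HUGO_VERSION=...` line also pins a version, but nothing in its syntax says which upstream project it refers to, so a generic scanner like this one cannot pick it up.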
It's pretty straightforward to know what you need to update. What you want to update it to, that's a different story, but the thing is, you know that you want to update the base image. On the other side, you can put pretty much any information in the RUN instruction, the LABEL instruction, the ENV instruction, and that's where things get difficult. So when we started working on Updatecli, we thought, okay, we want to automate everything. So we want to define where the information is coming from, what the conditions to automate the thing are, and finally, what the state of the file in your Git repository should be. If I go back to my Hugo example, the source of information is the latest Hugo release. That could be the Git tag, the latest Docker image published, or the GitHub release, for example. But at some point we have to decide what the source of truth is for that specific application. Then you have a bunch of conditions that you want to apply, like: does it make sense to bump the version in production if you failed to bump the version in the dev environment? You want to be sure that you are using the same version everywhere. And only then do you bump all the files related to that version. So, coming back to Updatecli, the idea is that we have to write a manifest. That's the main difference from, for example, Dependabot. With Dependabot, you just enable the bot and it works, but it will only update what it can detect, and most of the time it has no idea what you should update. With Updatecli, we went the other way: we write a manifest. For example, in this one you have the source, the source of truth. In this case, it's a GitHub release, so the kind is GitHub release. Then you have the specification, where you provide all the parameters for that specific project. In this case, I am monitoring the gohugoio Git repository. 
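The source, conditions, and targets model described above can be sketched as a tiny pipeline. This is a conceptual illustration in Python, not Updatecli's implementation or manifest syntax: the source is a stub instead of a real GitHub lookup, and the version numbers, the set of published images, and the file content are invented for the example.

```python
import re


def run_pipeline(source, conditions, targets, files):
    """Fetch a version, check every condition, then update the target files.

    Returns only the files whose content actually changed.
    """
    version = source()
    if not all(check(version) for check in conditions):
        return {}
    changed = {}
    for path, update in targets.items():
        new_content = update(files[path], version)
        if new_content != files[path]:
            changed[path] = new_content
    return changed


# Stubs standing in for "latest GitHub release" and "image was published".
published_images = {"0.100.2", "0.101.0"}
source = lambda: "0.101.0"
conditions = [lambda version: version in published_images]
targets = {
    "netlify.toml": lambda text, version: re.sub(
        r'HUGO_VERSION = "[^"]*"', f'HUGO_VERSION = "{version}"', text),
}
files = {"netlify.toml": '[build.environment]\nHUGO_VERSION = "0.100.2"\n'}

changed = run_pipeline(source, conditions, targets, files)
print(changed)
```

The condition is what guards against the failure mentioned earlier, where a release exists but the matching Docker image was never published: in that case nothing is changed at all.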
This one gives me a version, let's say 100. That's the latest one. What I want to do is be sure that all my files named netlify.toml are up to date, so I look at the key. If I run this manifest on my machine, it will just bump the file on my machine. If I run this manifest in the CI environment, it will bump the file in the CI environment. The next step is, okay, it's one thing to have it working on a machine, but you also want to be sure that your Git repositories are up to date without having to pay attention to them, so you can focus on what really matters in your case. So the next step is to specify where that file is located. We have a bunch of other resources for that. In this case, it's an SCM of type Git, because I want to update Git repositories. Then I specify that I want the pull request approach, where I create a temporary branch and someone can review my change. When you think about all those building blocks, you can really have more advanced scenarios. This one is from another project where we use it: when someone releases a new version of Epinio, we use a GitHub Action that sends a bunch of release events automatically to other Git repositories, and those Git repositories trigger Updatecli. Updatecli then retrieves all the different information. For example, on epinio/docs, which is obviously the website, we retrieve the latest version of Epinio and check that all the download links are up to date. We check that we have the documentation version for that specific release, since we maintain one documentation set per major and minor version, so we try to be sure that those files are up to date. If that's not the case, we open a PR. Then, as part of the release process of Epinio, someone needs to review the PR and double-check that it contains the files you want to have there. Another example is the way we automate Helm charts. 
We define, okay, we are monitoring the Epinio UI, which is the frontend application, and we monitor the backend, Epinio. If there is a new version, it will automatically bump the Helm chart, bump the metadata, and so on. Once again, we have human validation, where someone can just come, look at the PR, and decide if we want to go one step further. Very briefly: automating updates is not as easy a challenge as I initially thought, so we split the project into three different categories. The first one is declarative. The idea is that you know in advance what should be updated, so you define in a manifest how you want to update something, because it's not something that can be detected automatically. The second, autodiscovery, is more for those people familiar with Dependabot or Renovate bot: you just run the command and ask it to automatically detect what could be updated. There are scenarios where you can find that information. For example, on a Maven project you have the pom.xml; you fetch all the dependencies and update them. That's pretty easy. On other projects, like Docker containers, it's kind of a mess, so it's super difficult to know what the next version should be. And then you have all those situations where you specify a version constraint, like you don't want to use a version higher than 1.0, but at some point the upstream project is way further along than you are in your project. At some point, you need to be aware that you will have to plan some work to catch up with the upstream project. That's another experiment, which is update monitoring. So I want to do a quick demo. I don't have good internet connectivity here, so I hope it will work. On the left side is one of the manifests. Is it big enough? Oops. 
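The version-constraint situation mentioned just before the demo can be sketched as follows: pick the newest version that still satisfies the constraint, and separately report whether upstream has moved past it. This is a simplified illustration that assumes purely numeric, dot-separated versions with an optional leading "v"; the tag list is invented and the function names are hypothetical.

```python
def parse(version):
    """Turn 'v0.10.1' into (0, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def pick(versions, below):
    """Return (newest allowed, newest upstream, whether we lag behind)."""
    limit = parse(below)
    allowed = [v for v in versions if parse(v) < limit]
    newest_allowed = max(allowed, key=parse) if allowed else None
    newest_upstream = max(versions, key=parse)
    return newest_allowed, newest_upstream, newest_allowed != newest_upstream


# Hypothetical tags, in the arbitrary order a registry might return them.
tags = ["v0.9.0", "v1.4.1", "v0.9.2", "v2.0.0", "v0.10.1"]

newest_allowed, newest_upstream, behind = pick(tags, below="1.0")
print(newest_allowed, newest_upstream, behind)
```

Comparing parsed tuples rather than raw strings matters here: as plain text, `v0.9.2` sorts after `v0.10.1`, which is one reason a tool has to enforce its own ordering instead of trusting the order in which tags are returned.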
On the left side, we specify a few things. In this case, we want to enable the auto-merge feature of GitHub, actually of GitHub pull requests. We specify labels. So we automatically open a PR, and if all the tests are passing, it will merge the PR automatically. I don't have to pay attention, which obviously reduces the noise introduced by those PRs. We need to specify which project we want to target; in this case, that's the Updatecli website. And finally, we specify where the information is coming from. In this one, we monitor gohugoio/hugo. As I said, instead of monitoring the GitHub release, I could have said, okay, I just want to monitor the Docker images, and in that case I just need to provide a different piece of information. Or you could say, for example, I want to monitor, here, writing from the IDE is the easiest way. You can specify different ways of filtering versions, because something that we noticed, and this is what I meant when I said the Docker ecosystem is a mess, is that you can put whatever information you want in a tag. So most of the time there is no way to know what the next version should be. And depending on the registry, they don't return you the latest version, because they don't sort the tags in the same way. So at some point you need to enforce a specific behavior. Then the target in this case is: if there is a new version of Hugo, we want to be sure that the workflow file has the correct version and that the Netlify file is up to date. So I don't have to care. And what it looks like on the other side is just a CLI, as I said. You can run it from your machine, Linux, Mac, wherever you want to run it. And then, voilà, you get the latest version and the changelog, depending on the situation. Based on that, you can just combine the pieces, and so we have a lot of different workflows where we automate things. The last thing... how much time do I have left? 
Five minutes? Okay. Where is that? It's not this one. So, the thing that I was mentioning about monitoring the different versions: this one is a different way to look at the problem. You want to monitor the version that you are using at some point. In this case, on one side, I say I want to read a version from the netlify.toml file, which gives me a version, 0.101.0. On the other side, I want to compare it with the latest Hugo version. If they do not match, then I know that I need to work on that at some point. And since I have a bit of time, I can quickly show what the autodiscovery looks like. Yeah, the autodiscovery is a bit more difficult because you need to know in advance what you want to do, but where is that thing? Yeah, this way. As you can see, we don't support a lot at this time; it's mainly around containers, because I'm working on containers most of the time. It will parse the files, and in this case it identifies that it's a Rancher project where we use Fleet. Based on that, it will try to fetch all the different versions specified in the Fleet project and suggest other versions. So that's it for my presentation. And... voilà. Are there any questions? One time, two times... yes, there is one over there. Hi there. Thanks for the presentation. You were talking about Dependabot, but you didn't say much about Renovate. I wonder how much this overlaps with Renovate, or if it's a bit more customizable. So the question: I mentioned Dependabot, but I didn't say that much about Renovate. If you compare the two, Renovate bot is way better than Dependabot. There are a lot of things that I didn't have the time to cover, but for example, one of the features that I really love in Renovate bot is that it allows you to group PRs, which reduces the noise. 
Because Dependabot, for example, especially for those people maintaining JavaScript projects, can just open 10, 20, 30 PRs, and then you have to review and test all of them. So there are different strategies for updating. Renovate bot is just way better in the sense that it supports more modules. On the other side, with Renovate bot it's not really easy to have workflows where you say, okay, I'll fetch a version, I'll check a bunch of things, and then I'll update other targets, basically. So I would say Renovate bot is better at the autodiscovery part, where it can detect things for you, but on the other side it's not really easy to have very complex update scenarios. And that's it. So Carlos, the floor is yours. Thank you. |
Continuous Delivery to many Kubernetes Clusters |
Hello. Thank you for coming to my talk. It's not a TED talk, it's just my talk: continuous delivery to many Kubernetes clusters. My name is Carlos Sanchez and I'm here to talk to you about our real-world experience. I'm not here to sell you anything, so I'll try to tell you, if I have time, about some of the mistakes we made too. It's not all beautiful and wonderful. I'm a principal scientist at Adobe Experience Manager Cloud Service. I'll talk a little bit about the product. On the open source side, I started the Jenkins Kubernetes plugin. Anybody heard about Jenkins? Yes, some people probably, yeah. Okay. And Kubernetes, anybody heard about Kubernetes? Yeah? Okay. Anybody using Kubernetes in production? I'm a long-time contributor to open source, on multiple projects around Jenkins, the Apache Foundation, and all that. So, a quick intro to what Adobe Experience Manager is, because every time I say Adobe, people say Photoshop and PDF and Flash. It's not any of those, right? This is a content management system that you have probably never heard of, but it's powering 80% of the Fortune 100, and it's very, very enterprise. I'm not expecting people to know it, but it is widely used, and it's based on a lot of open source. It's a distributed OSGi application that was started many years ago and uses a lot of open source components from the Apache Foundation, and we contribute back to those components, like Apache Felix, Sling, and a few things around content management. It also has a huge market of extension developers, people who write their own Java code that then runs on Adobe Experience Manager, AEM. When I joined Adobe, the goal was: let's move this into a cloud service. So this is AEM running on Kubernetes. We're currently running on Azure, and we have 35 clusters and are growing very quickly, because this is content management. 
We run it in multiple regions, 11 right now, so multiple ones in the US, Europe, Australia, Singapore, Japan, wherever, because people want low latency between their users and the content. Another interesting fact is that we don't run the Kubernetes clusters directly. We build stuff on top of them, and a different team at Adobe manages Kubernetes for us. One curiosity is that customers can run their own code: we take their code and run it inside our processes. So we have to limit cluster permissions for security, and we have several security concerns because this is a very multi-tenant setup. Each customer can have multiple environments, multiple copies, and they can self-serve: they can deploy new environments whenever they want, update them, and do a few other things, so it's not just us controlling what is running, it's also the customers. Each customer can have three or more Kubernetes namespaces where these environments run, and I like to call this a micro-monolith. We don't run one big service that spans thousands of instances; we run slightly different versions of the same service a thousand, ten thousand times over. So micro-monolith describes it very well. We use Kubernetes namespaces to provide the scope for network isolation, quotas, permissions, and so on. Internally, we have multiple teams building services, and different services have different requirements. People can use different languages, and we follow more of a you build it, you run it philosophy. Basically, each service is exposed as an API, or follows the Kubernetes operator pattern. 
To split the monolith, we also use a lot of init containers and sidecars. If you know Kubernetes, you can run multiple containers in the same pod, so the main application runs in one container, and then, to apply separation of concerns, we have many sidecars that do different things. It's an easy way to separate concerns without having to rewrite your whole architecture into a fully network-based, microservice-oriented architecture. On the continuous delivery side, which is probably what you are interested in here, we are moving from a yearly release to, I mean, we are pushing changes multiple times a day, right? Not just the application; the application may be slower to move. But on the operational side, all the services, operators, microservices, all these things, any of them can receive changes at any point in time, any day. We use Jenkins for CI/CD in some places. We have Tekton, which you heard about in one of the talks before; it's another open source project for running workflows on Kubernetes, and we use it to orchestrate some pipelines. We have also started using Argo CD for some new microservices. We follow a GitOps process, where most of the configuration is stored in Git and is reconciled on each commit, right? And we use a pull rather than a push model in order to scale; I'll go through this in a bit. We have a combination of multiple things being deployed to the clusters. We have the AEM application, which is deployed with a Helm chart. We have the operational services, the operators and services and all the other things that are not the application; these are deployed using Kubernetes files, but templatized. And we are also using Kustomize and Argo CD for some new microservices. On the Helm side, we use the Helm operator. In each namespace, we use the Helm operator CRDs to do a more state-based installation of Helm. 
We create the custom resource, and the Helm operator installs the application based on the parameters in it. A word of advice: don't mix application and infrastructure configuration in the same package if you cannot enforce the same Helm chart for all tenants. For example, as I mentioned before, customers can decide when to update things, right? So we have some customers on older releases and some on newer releases. This is something that we want to change. But in the meantime, if we want to update a specific version of something in an old release, it's hard when it is already packaged in the Helm chart. So we built a solution for this: from the platform level, we can go and manipulate the Helm chart. We can have overrides, and this is easy to do when you have the Helm operator. Whenever there's a request to install a new Helm chart, we can inject changes to the parameters. We can change Helm values, which is easy: instead of passing some values, you pass different ones. Or you can use Kustomize patches, which the Helm operator also supports. Kustomize patches are very interesting because they allow you to patch any Kubernetes resource, even if no Helm value was previously defined for it. So if we want to change a sidecar container image version across the whole fleet, we just have to change the patch. The patch is applied to all the clusters and all the namespaces, and all the Helm charts that were installed get reinstalled with the version that we want. So we combine the Helm chart on one hand with operational values on the other. Very important for us was the shift-left mentality, right? Detecting problems as soon as possible, not waiting for developers to push things to production, because the cost increases. So we run checks as early as we can, on pull requests. 
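The fleet-wide sidecar override described above can be illustrated with a short sketch: one patch, applied after rendering, rewrites the image of a named container regardless of which chart version produced the manifest. This is plain Python standing in for a Kustomize patch, not the Helm operator's mechanism; the tenant names, container names, and image references are hypothetical.

```python
import copy


def patch_sidecar_image(manifest, container_name, image):
    """Return a copy of a Deployment with one container's image replaced."""
    patched = copy.deepcopy(manifest)
    pod_spec = patched.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        if container.get("name") == container_name:
            container["image"] = image
    return patched


def deployment(app_image, sidecar_image):
    """Build a minimal Deployment as a chart might render it."""
    return {"kind": "Deployment", "spec": {"template": {"spec": {"containers": [
        {"name": "app", "image": app_image},
        {"name": "log-shipper", "image": sidecar_image},
    ]}}}}


# Two tenants on different (hypothetical) chart releases.
rendered = {
    "tenant-a": deployment("registry.example/app:2022.1", "registry.example/log-shipper:1.0"),
    "tenant-b": deployment("registry.example/app:2023.2", "registry.example/log-shipper:1.3"),
}

fleet = {tenant: patch_sidecar_image(m, "log-shipper", "registry.example/log-shipper:1.4")
         for tenant, m in rendered.items()}
for tenant, manifest in fleet.items():
    print(tenant, [c["image"] for c in manifest["spec"]["template"]["spec"]["containers"]])
```

The application image of each tenant is left untouched, which is the point of keeping operational overrides separate from what the customer-chosen chart release contains.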
That way the change is still fresh in your memory when something breaks; you want to catch it as soon as possible. We do this by generating all these templates: we have tests that render the templates and then run multiple checks on the output. The most basic check you can run is kubectl apply with a dry run. This will tell you if the manifest is wrong in some very obvious way, whether it's valid or not. Kubeconform is a tool that validates manifests against the Kubernetes schemas; it is the successor of Kubeval. Has anybody heard of Kubeval or Kubeconform? Okay. It's very useful if you have custom CRDs, or just to catch typical problems: you get the YAML indentation wrong, the manifest is no longer valid, and you catch this on the PR. You just run it and it tells you that this property is missing or that property is in the wrong place. Because everybody loves YAML, right? Conftest is another tool, built on Open Policy Agent. Is anybody familiar with Open Policy Agent, OPA? OPA allows you to write policies that check pretty much anything in any structured file. In the case of Kubernetes, you could say: don't run the pod as root; make sure you don't mount secrets as environment variables, or as files; enforce that all pods have certain labels. Any rule you can think of, you can write. Or: don't pull from Docker Hub, pull from the internal registry. You can do all that with Conftest and OPA policies. The only problem is that it uses the Rego language which, if you haven't heard of it, is very painful to work with, but it works great once you figure it out. We added another tool called Pluto. Pluto is just a CLI that tells you which API versions have been deprecated or removed.
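To make the kind of pod-level rules mentioned above more tangible, here is a minimal sketch in Python. It is not Rego and not how Conftest works internally; it only illustrates the shape of such checks on an already-parsed manifest. The function name `check_pod`, the registry prefix, and the required label names are assumptions made up for this illustration.

```python
from typing import Any

# Hypothetical values, chosen only for this illustration.
INTERNAL_REGISTRY = "registry.internal.example/"
REQUIRED_LABELS = {"app", "team"}


def check_pod(manifest: dict[str, Any]) -> list[str]:
    """Return a list of policy violations for a parsed Pod manifest."""
    violations: list[str] = []

    # Rule: every pod must carry the required labels.
    labels = manifest.get("metadata", {}).get("labels", {})
    missing = REQUIRED_LABELS - set(labels)
    if missing:
        violations.append(f"missing labels: {sorted(missing)}")

    for container in manifest.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")

        # Rule: pull from the internal registry, not from a public one.
        if not container.get("image", "").startswith(INTERNAL_REGISTRY):
            violations.append(f"{name}: image not from internal registry")

        # Rule: do not run as root.
        if not container.get("securityContext", {}).get("runAsNonRoot", False):
            violations.append(f"{name}: must set runAsNonRoot")

        # Rule: do not expose secrets as environment variables.
        for env in container.get("env", []):
            if "secretKeyRef" in env.get("valueFrom", {}):
                violations.append(f"{name}: secret exposed as env var {env.get('name')}")

    return violations
```

A check like this would run on the rendered templates in the pull request, failing the build when the returned list is not empty.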
So if you are thinking about upgrading Kubernetes, you run Pluto and it tells you that this API is deprecated, that one is going to be removed in this version, and so on, and you can enforce that. We built a tool that we call Git init, which is our own version of GitOps pull. We have the Kubernetes definitions stored in Git, and we deploy them to blob stores across regions, from where they are pulled in each cluster. Git init is a deployment that runs continuously in each namespace; we have around 10,000 namespaces in our fleet. It basically pulls the blob, applies the changes, and repeats this every so often. An example of why we do pull rather than push: we have a job that pushes to all the clusters, it runs in parallel on something like 20 threads, and it still takes about five hours. So we cannot push things whenever we want. On the Argo CD side, we have a new platform that lets teams build Argo CD-based microservices. Basically, it creates a new Git repo that comes with some templates, and that gets deployed to the cluster with Argo CD. We are thinking about moving this way, with each team having its own Git repo, because right now we mostly have centralized operators and everything else. This is good for the "you go in your own direction, you do whatever you want, you build it, you run it" approach. On the other hand, it's a bit tricky, because when we find that something is problematic, we cannot just look at one central Git repo, see who is doing it, and change it. But we are moving in that direction. Let me skip this and talk a bit about progressive delivery.
Progressive delivery is a name for something that you've probably heard of: canary rollouts, percentage-based rollouts, feature flags, blue-green deployments. Basically, don't update everybody at the same time, because you can break everybody. We can do rollouts to different customer groups in separate waves, and we can also do rollouts to a percentage of customers. By default, we have a time-based rollout that goes from dev to stage to prod candidate after a period of time. This runs on Jenkins and ensures that things have been running on dev and on stage before we merge them to prod. This is very basic. What we built on top was feature flags at the namespace level. We have 10,000 namespaces, and then the Kubernetes definition templates. What we allow developers to do is decide, for each namespace: I want to roll out this change to this environment (dev, stage, or prod), or to a specific cluster, or to a type of namespace, or to a percentage. And this is just using templates on Kubernetes objects. An example is a Kubernetes definition with a template that has a foo version or a bar version, or that enables or disables a sidecar container. Then, at the bottom, you can see the rules. By default, we want the foo version to be 1.0, but for all the namespaces in the dev environment, we want it to be 1.1. This allows us to roll out changes quickly, but progressively. We can also do it by percentage. In this case, we could say: I want all the namespaces in dev and all the namespaces in stage to have foo version 1.1 and the enable flag set to true, but for prod, I only want 5%. So I roll out a change to 5% of prod, and then I can continue from there.
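The rule resolution just described (a default value, overrides per environment, and a percentage for prod) could be sketched roughly as follows. This is an illustration under assumptions, not the tool shown in the talk: the `Rule` class, the `resolve` function, and the choice of hashing the namespace name to get a stable percentage bucket are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    value: str
    environment: Optional[str] = None  # "dev", "stage", "prod"; None matches any
    percentage: int = 100              # share of matching namespaces that get the value


def bucket(namespace: str) -> int:
    """Map a namespace name to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(namespace.encode()).hexdigest()
    return int(digest, 16) % 100


def resolve(default: str, rules: list[Rule], namespace: str, environment: str) -> str:
    """Return the value of the first rule that applies, else the default."""
    for rule in rules:
        if rule.environment not in (None, environment):
            continue
        # A namespace always lands in the same bucket, so raising the
        # percentage later only adds namespaces and never flips one back.
        if bucket(namespace) < rule.percentage:
            return rule.value
    return default


# Default 1.0; everything in dev and stage gets 1.1; only about 5% of prod does.
rules = [Rule("1.1", "dev"), Rule("1.1", "stage"), Rule("1.1", "prod", 5)]
print(resolve("1.0", rules, "customer-a", "dev"))
```

Hashing the namespace name rather than picking namespaces at random keeps the rollout deterministic, so re-rendering the templates does not move a namespace in and out of the 5% group.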
This has proven really useful: developers can test safely, development speed increases, and PRs are much faster, so it's all great. We are also working on getting Argo Rollouts in at the deployment level. Argo Rollouts allows you to do blue-green and canary rollouts, where you progress the number of pods over a period of time: instead of changing, say, 10 pods at the same time, you go one by one. And if you have a service mesh, you can go even more fine-grained and say: I want 5% of the traffic to go to the new version and everything else to the old version, then keep progressing that, with automatic rollbacks. With a service mesh you can fine-tune the traffic percentages, but you can still do it with plain Kubernetes services; it's just that you are limited by the number of pods. To sum up: shift left and guardrails. Keeping people safe in what they are doing increases development speed and reduces the issues you are going to have in production. You are never going to prevent issues in production entirely. What you can control is how many customers are affected and how fast you can fix them. For us, what was very useful was the progressive delivery techniques, like canaries, percentage rollouts, and automated rollbacks. The automation needed for controlled, progressive rollouts pays off over time. I think we have one minute for questions, or you can find me afterwards. Thank you. |
CI/CD for Machine learning models
How to test ML models? |
Hello, everyone. Can you hear me well? Thanks, it's a pretty large audience. If I may ask for a quick show of hands: who among you has some experience, at any level, with machine learning? Okay, cool. Awesome. So today I'll be talking about how to run tests on machine learning systems. There are different keywords for this: CI/CD, quality assurance. A few words about us: I'm one of the founders of Giskard. We are building a collaborative, open source software platform to ensure the quality of AI models, and in this presentation I'll explain a bit how it works. In terms of agenda, I prepared two sections. First the why: why a project on testing machine learning systems is needed, why I personally decided to work on this problem, some of the risks, and why classical software testing methods don't quite work on AI. Then I'll give some more concrete examples on two important quality criteria that you might want to test for in machine learning: one is robustness and the other is fairness. And if we have the time, since it's just 30 minutes, I hope we can do a quick demo of an example use case where we run a full CI/CD pipeline on a machine learning model. To start off easy, I put together a series of memes to explain my personal story of why I came to create a company, and a project, around this machine learning testing thing. About 10 years ago, I started in machine learning, statistics, data science. You start using the scikit-learn API and you're like: yeah, it's super easy, right? Anybody can be a data scientist. You just call .fit, .predict, and that's it, you're a data scientist. And probably, if you're here today, you're asking: have you tested your model? Yeah, sure: train, test. Reality, if you've deployed to production, is quite different.
If you've deployed to production, you'll often have this painful discovery. You told your product manager and business stakeholders: look, I worked really hard on the fine-tuning and the grid search to get to 85% accuracy. Then you push your first version to production, and things don't quite work out. You don't reproduce those good accuracy numbers. Well, this was me; I hope it's not you. My first experience deploying machine learning to production was on a fraud detection system. Fraud is a notoriously difficult use case for machine learning, because what you're trying to detect doesn't want to be detected. There are people behind it who have a vested interest in not being caught by your machine learning system. So in terms of performance, at least, I ended up doing a lot of hotfixes in production. It's bad. About five years ago, this was my stance on machine learning in production: a very painful, grueling experience where you never know when you're going to get a complaint or when you're going to be on call to fix something in production. That's when I decided to buff up and switch roles to join a software engineering team. I was at Dataiku back then, and I moved internally from data science to the product team. To summarize what I was told, as someone with a machine learning background but no real software engineering experience: you must learn the ways of CI/CD, otherwise your project will not make it to production. For context, at that time I was specifically in charge of creating an open source layer to federate all the NLP and computer vision APIs that cloud vendors provide, and then to do the same for pre-trained NLP and time series models.
What was difficult in this context is that I was not even the one in charge of the models, and the models would be retrained and fine-tuned. So, as an engineer, it's more difficult to give guarantees about the properties of the system: there are elements in the stack that you don't control. This is a bit of a repeat of a previous meme, and I really wanted to say: one does not simply ship an ML product without tests. The challenge I had then was that, from an engineering management standpoint, I was told: you know, it's easy, engineers all write their test cases, so you do machine learning, just write all the test cases. So this was me, back at square one. Okay, so you're telling me I just need to write unit tests? That will not really solve the issue. That was the beginning of a quest to build something to close the gap between "I need to test my models" and "how can I do that?" Because clearly, and I'll explain why, unit testing your model is really not enough. For a different angle on the why, I'll take a step back and talk about quality in general. I think in this track we all agree that quality matters. If you look at AI, it's an engineering practice that is far younger than software engineering or civil engineering, and it's just riddled with incidents. I encourage you, if you don't know that resource already, to look at incidentdatabase.ai. It's an open source database, a public collection of reports, mostly from the media, of AI models that have not worked properly. It's really great work that has been going on for about two and a half years. It's a global initiative, and in just this time they have collected more than 2,000 incidents. Since these are public reports, think of it as the tip of the iceberg, of course.
There are a lot of incidents internal to companies that are not necessarily reported in the media. The incident database has a very interesting taxonomy of the different types of incidents; it's very multifaceted. I took the liberty of simplifying it into three big categories. One is ethics, another is business and economic impact, and the third is security. We're talking about incidents that are very, very severe if they happen at the scale of a global company. In ethics, you can have a sexist credit scoring algorithm that exposes the company to lawsuits, brand image damage, et cetera. These are notoriously hard to identify. In a way, machine learning is precisely about discrimination, so it's hard to tell a machine that is learning to discriminate not to discriminate against certain sensitive groups. I'll speak about some methods that can be used on precisely this problem. Apple was working at the time with Goldman Sachs on deploying such an algorithm, and probably some tests and safeguards were unfortunately skipped. It was actually discovered on Twitter that, in one simple case, a male loan applicant got 10x the credit limit of his wife. That sparked a huge controversy that probably exposed Apple to some lawsuits. Another area does not involve sensitive features such as gender. There was a huge catastrophe a year and a half ago at Zillow, a real estate company, where a small bias was overestimating the prices of homes. They decided to put this algorithm live to buy and sell houses. It turned out that this tiny bias, which was left unchecked, was exploited by real estate agents in the US. This created a loss of nearly half a billion dollars. Again, going back to testing, maybe this could have been anticipated and avoided.
Now, on the cybersecurity side, there's a lot of good research from security labs showing that you can hack, for example, a computer vision system in an autonomous driving context. Here, you put a special tape on the road and you can crash a Tesla. We don't quite know whether these types of vulnerabilities have been exploited in real life yet, but AI is becoming ubiquitous, and obviously there are bad actors out there who might want to hack these systems, so this introduces a new type of attack vector. That's also something we need to care about. Both from the standpoint of AI practitioners and from a regulatory standpoint, testing just makes sense. Yann LeCun, chief AI scientist at Meta, took a stance on Twitter at the beginning of last year, saying that if you want trust in a system, you need tests. He was also making a slight criticism of some of the explainability methods, because two years ago, if you followed that field, people were saying: you just need explainability and then your problems will go away. Well, that's just part of the answer. Lastly, and this was covered in some of the talks this morning in the big auditorium, there's a growing regulatory requirement to put checks and balances in place. The regulation says that, specifically if your AI system is high risk, you need to put quality measures in place. The definition of high-risk AI systems is pretty broad. Obviously, you have anything related to critical infrastructure, defense, et cetera, but you also have all AI systems involved in human resources, public services, and financial services, because these are considered critical components of society. Now that we have agreed it's an important problem, here are some of the challenges. If you've encountered some of these issues, you have probably looked at some easy solutions, drawing analogies with what you might do elsewhere.
There are three points that make this problem of testing machine learning a bit special, meaning it's still very much a work in progress. Point one: it is really not enough to check data quality to guarantee the quality of a machine learning system. One of my co-founders showed experimentally during his PhD that you can have really clean data and a bad model. So you cannot just say it's an upstream problem. Technically it's systems engineering: you have to take the data, the machine learning model, and the user into account to analyze the system's properties. Moreover, the errors of machine learning systems are often caused by data that did not exist when the model was created. The training data was clean, but the problematic data did not exist yet. Second point: it's pretty hard to just copy-paste testing methods from software into AI. Yes, you can write some unit tests on machine learning models, but they won't prove much, because it's a transactional system and things move around quite a lot. So that's a good baseline. If you have a machine learning system and some unit tests, that's really step one; it's better to have that than nothing. But you have to embrace the fact that there has to be a large number of test cases. You cannot just test on three, five, a hundred cases; even a thousand cases will not be enough. The models themselves are probabilistic, so you have to use statistical testing methods. Also, there have been other systems that were heavily dependent on data, but AI greatly increases the number of data inputs compared to traditional systems. So you very quickly run into a combinatorial problem, and it's practically impossible to generate all the combinations. A very simple example of that: how can you test an NLP system? Lastly, AI touches a lot of different areas.
If you want complete test coverage, you really need to take multiple criteria into account: the performance of the system, but also robustness to errors, fairness, privacy, security, reliability. And also, and this is becoming an increasingly important topic with green AI: what is the carbon impact of this AI? Do you really need that many GPUs? Can you make your system a bit more energy efficient? Today, because I see we have 10 more minutes, I'll focus on two aspects: robustness and ethics. I'll start with robustness. Who has read or heard about this paper? Quick show of hands. Okay, one. And who has heard of behavioral testing? Because that's not machine learning specific. Yeah, cool. Three years ago, Ribeiro and the co-authors of this paper did, I think, a fantastic job of adapting behavioral testing, which is a really good practice from software engineering, to the context of machine learning, and specifically to NLP models. The main problem this research paper aimed to solve was test case generation. NLP, natural language processing, is by essence a hard problem here: the input is just raw text, and you need to test on it. What you can do is generate test cases that map the input text, and changes in the input text, to expectations. I'll give three examples, from very simple to a bit more complex. The first is the principle of minimum functionality. For example, if you are building a sentiment prediction system, you could have a test that says: if the sentence contains "extraordinary", the model should always predict a positive message. Now you will probably tell me: yeah, but what if the user has written "it's not extraordinary" or "absolutely not extraordinary"? And that brings me to the concept of test templates.
For NLP, what you probably need to do, and this is obviously language specific, is to start with templates where you change the text by, for example, adding negations. Then you test whether adding a negation moves the prediction in a certain direction. If the model has understood sentiment, it should understand that putting "not" in front of "extraordinary" or "good" changes the meaning, whereas replacing a word with a synonym will not affect the result much. So either you want your system to move in a certain direction, or there are cases where you want the opposite behavior: you want robustness. That's called invariance. For instance, you will want a system that is robust to typos, to changing a location name, to swapping in synonyms, et cetera. We've created this diagram to explain it. It's a really thriving field: there is a lot of research going on these days about testing machine learning systems, and metamorphic testing is one of the leading methods. To take an analogy, the principle is very similar, if you've worked in finance or have friends who work there, to backtesting an investment strategy. You simulate different changes in market conditions and you see how your strategy, your algorithm, behaves, and what the variance of that strategy is. This concept applies very well to machine learning. You need two things. First, you define a perturbation. As I was explaining earlier, in NLP a perturbation might be adding typos or adding negation. In another context, say an industrial case, it might be doubling the values of some sensors or adding noise to an image. Then, pretty simply, you define a test expectation in terms of the metamorphic relation between the output of the machine learning model and the distribution of the output after perturbation.
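As a rough sketch of the two ingredients just named, a perturbation and a metamorphic relation, the following Python uses a toy keyword-based classifier as a stand-in for a real model. Everything here (`toy_sentiment`, `swap_location`, `add_negation`, `pass_rate`) is a hypothetical name invented for the illustration, and the pass rate is a plain proportion rather than a proper statistical test.

```python
from typing import Callable

POSITIVE = {"extraordinary", "good", "great"}


def toy_sentiment(text: str) -> int:
    """Toy stand-in for a model: returns 1 for positive, 0 for negative."""
    words = text.lower().replace(".", "").split()
    score = 0
    for i, word in enumerate(words):
        if word in POSITIVE:
            negated = i > 0 and words[i - 1] == "not"
            score += -1 if negated else 1
    return 1 if score > 0 else 0


def swap_location(text: str) -> str:
    """Invariance perturbation: another city name should not change sentiment."""
    return text.replace("Paris", "Berlin")


def add_negation(text: str) -> str:
    """Directional perturbation: negating the adjective should flip sentiment."""
    for word in POSITIVE:
        text = text.replace(word, f"not {word}")
    return text


def pass_rate(
    model: Callable[[str], int],
    perturb: Callable[[str], str],
    relation: Callable[[int, int], bool],
    samples: list[str],
) -> float:
    """Share of samples where the relation holds between the output before and after."""
    results = [relation(model(s), model(perturb(s))) for s in samples]
    return sum(results) / len(results)


samples = [
    "The hotel in Paris was extraordinary.",
    "The food in Paris was good.",
    "The view was great.",
]
invariance = pass_rate(toy_sentiment, swap_location, lambda a, b: a == b, samples)
direction = pass_rate(toy_sentiment, add_negation, lambda a, b: a == 1 and b == 0, samples)
print(invariance, direction)
```

With a real model and enough samples, the pass rate would be compared against a threshold, or the output distributions before and after perturbation would be compared with a statistical test.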
Once you have that, and if you have enough data, you can do actual statistical tests, to see whether there's a difference in distribution, et cetera. I won't have time to dive into all the details, but we have written a technical guide on this topic, and the link is in the QR code up there. Next, I'll talk a bit about a really tricky topic, which is AI fairness. I want to emphasize that our recommendation, at least, is not to come at the problem of AI ethics with a closed mind, or with a top-down definition of "this is an ethical system" or "no, this is an unethical system". My co-founder did his PhD on precisely this topic and wrote a full paper on it, looking at the philosophical and sociological implications. The gist of it is that, yes, to a certain extent you can adopt a top-down approach to AI fairness. For instance, as an organization, you can say: we want to test fairness explicitly on three sensitive categories. We want to check for gender balance, we want to check for race balance. That assumes the country where you deploy your machine learning allows you to collect this data, which is not always the case. The challenge with these approaches is that, A, you might not have the data to measure this, and B, you may miss things, because when this exercise of defining the quality criteria for fairness and balance is done, you often only have a limited sample. So, drawing on sociological analysis, it's really important to have this kind of top-down definition of AI ethics meet the reality on the ground, and to bring in the actual users and makers of the systems to define what ethics means. That is better than a big organization, to put it as a bit of a caricature, saying: AI ethics, yeah, we wrote a charter about this. You read it, you sign it, and then, oops, you're ethical.
Having said that, there are some good top-down metrics to adopt as a baseline, and I'll explain one of them: disparate impact. Disparate impact is a metric from the human resources management industry that is at least 40 years old, so it's not new. It's expressed in probabilities, but essentially it's about setting an 80% rule. You define a positive outcome and measure its probability for a given protected population. Then you say: I want the ratio of the probability of a positive outcome for the protected population to the probability of a positive outcome for the unprotected population to be above 80%. To make it more concrete: say you're building a model to predict customer churn, and you want to check whether your model is biased or not for each class. This formula allows you to really define the metric and write a concrete test case. Right, I just have three minutes, so I'll highlight one of the features of our project: enabling human feedback. It gives you an interface where users, and not only data scientists, can change the parameters (so there's a link to metamorphic testing) and give feedback to point out where the biases may be. The benefit of this approach is that it allows the community to define precisely what they think the risks are.
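The 80% rule for disparate impact described above can be written as a small test. This is a minimal sketch, assuming binary predictions and a boolean flag marking membership in the protected group; the function name `disparate_impact` and the sample data are made up for the illustration.

```python
def disparate_impact(predictions: list[int], protected: list[bool], positive: int = 1) -> float:
    """Ratio P(positive | protected) / P(positive | unprotected)."""
    prot = [p for p, is_prot in zip(predictions, protected) if is_prot]
    unprot = [p for p, is_prot in zip(predictions, protected) if not is_prot]
    if not prot or not unprot:
        raise ValueError("both groups need at least one sample")
    rate_prot = sum(p == positive for p in prot) / len(prot)
    rate_unprot = sum(p == positive for p in unprot) / len(unprot)
    if rate_unprot == 0:
        raise ValueError("no positive outcomes in the unprotected group")
    return rate_prot / rate_unprot


# Made-up predictions: 3 of 4 positive in the protected group, 4 of 4 in the other.
predictions = [1, 0, 1, 1, 1, 1, 1, 1]
protected = [True, True, True, True, False, False, False, False]

ratio = disparate_impact(predictions, protected)
print(ratio)         # 0.75
print(ratio >= 0.8)  # False: this model would fail the 80% rule
```

In a test suite, the last comparison becomes the assertion, with the 0.8 threshold exposed as a parameter so that the people reviewing the model can discuss and adjust it.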
Sadly, we won't have time for a demo. In our project, we call this phase the inspection phase, and it comes before you test. This is super important and, again, one of the things that differ from traditional software testing: before you even test, you need to confront yourself with the data and the model. That's where we think explainability methods really shine, because they let you debug and identify the risk zones. And this is precisely what helps you know, once you have qualified feedback, where you should put your testing effort. So, in a nutshell, what I'm saying is that testing machine learning systems is not a matter of creating hundreds of tests or automating everything. It's rather about having a good idea, from a fairness standpoint and from a performance standpoint, of the 10, 15, maybe at most 20 tests that you want in your platform. If you want to get started, this is our GitHub, and if you have a machine learning system to test, we're interested in your feedback. |
Build CI/CD pipelines as code, run them anywhere |
First things first: the Dagger team was kind enough to send me some stickers. This is really the reason why I go to conferences. If you want to pick some up, I will leave them somewhere here. Okay, cool. So, good evening, everyone. Thank you for joining my presentation today. My name is Mark, and for the last couple of years, or maybe for the better part of a decade, I've been helping engineering teams focus on building their business applications instead of worrying about things like deployments or CI/CD. My current title at Cisco is Tech Lead. By the way, is there anyone here who saw my presentation this morning in the Go room? Okay. So I'm going to make it up to you guys. And please laugh at my jokes again, even though you have already heard them. I decided that I would come clean here today: this is a completely fake title, and in fact my real job is YAML engineer. Anyone else want to come clean and unburden themselves? Okay, cool. Oh, yeah, "engineer" is probably an overstatement. So let's talk about CI/CD. We have a bunch of CI/CD services available today, and the space is still evolving continuously, but there are a couple of challenges that cause pain to developers and other people every day. I've already hinted at one of them: YAML. Man, you put a space in the wrong place and it just breaks, and you don't even know why, because the CI tool may not even tell you where that extra space is. So YAML is really one of the core pains of all the CI solutions today. And yeah, I know there is Jenkins with Groovy, which is even worse, but YAML is really the standard language of CI/CD these days. There are a couple of solutions where it works kind of okay for simple pipelines, but for more complex cases it's just a nightmare.
Then CI has this tendency to break for no obvious reason: one day the pipeline works and the next day it just doesn't. For operations, for deploying your application to a production environment, you can always say: okay, that's an Ops problem, let them solve it. But for CI that's not really the case. Developers have to interact with CI, and if the CI is breaking, then it's probably the developers who have to fix it. And the problem with CI solutions today is that we don't really have an easy way to debug CI issues. If something is wrong, you probably have to guess where the problem is, maybe add a few echo lines to the YAML file, push the whole thing to the repository, wait for the CI to be triggered, and then go through this whole long feedback loop over and over again. I see a lot of people nodding. So when something goes wrong, it's your job to fix it, it takes a lot of time, and it's just painful. And sometimes it's actually not the CI's fault, but yours. The tests don't pass, or the linter doesn't pass, and that's often caused by things like having different tool versions in the CI and in your development environment. There are tools and ways to keep those as close to each other as possible, but it still happens very often; I had this problem about a week ago. So sometimes it's just your code that's not working in the CI, and you have to go through the same feedback loop: you push a change, hoping that it will fix your problem, and of course it doesn't work the first time, so you do it over and over again until, after an hour, maybe, if you sacrifice something to the CI gods, it works. I'm pretty sure there are other challenges with CI, but let's see how Dagger may be able to solve some of these. So first of all, who has heard about Dagger?
Who knows what Dagger is? Oh, cool. Dagger is a programmable, portable CI/CD solution. Portable is a pretty great feature here, because instead of going through the long feedback loop I was talking about, you can run your CI/CD pipeline on your own machine and figure out what's wrong much sooner than by pushing to the Git repository and waiting for the CI over and over again. It's much quicker to debug issues that way, whether they are related to the CI or to your code. It's also much easier to build the pipeline in the first place. When you build a new CI pipeline for a new project, you go through the same feedback loop, because you have to add new steps and figure out whether they work, and if they don't, you have to figure out the right parameters. So even building new pipelines is way easier, because the whole thing is portable. The other thing that makes Dagger great is that you can write your pipelines in basically any language. Dagger officially supports a couple of languages, like Go, Python, TypeScript, and CUE, but because there is a GraphQL API under the hood, any language that can talk to a GraphQL API can be used to build your pipelines with Dagger. And if you combine these two traits, being portable and being able to write pipelines in your own language, you can also avoid vendor lock-in. Obviously you would still need some sort of CI service, and a thin layer of integration that runs Dagger itself. But once you have a portable pipeline written in your own language, not in a proprietary or CI-specific YAML format, you are not locked into the CI vendor you are using right now.
And granted, you don't really switch CI providers often, but it happens, like when they buy your company and you have to move from one provider to another, and then you have to move again because reasons. And the fourth thing that makes Dagger great is caching. Now, most CI services already have some sort of caching solution that you can use to cache intermediate artifacts or dependencies or whatever you don't want to download or compile every single time your CI pipeline runs. But you still have to configure it properly: you have to make sure that you have the right caching keys, and you have to add the right paths to the caching configuration. You may end up with a huge cache at the end of the day, or you may not use the cache at all. So if you don't configure it properly, it may not work. With Dagger, you get caching by default. And by default, I mean that the result of every single step in your pipeline is cached, similarly to how a Dockerfile works: the result of every single instruction is cached if there are no changes before that step. So similarly to that, Dagger caches every step in your CI pipeline. Now how does it do that? How does it work? How is this portable? Any guesses? One word? Yeah, Docker, that's right. So containers, of course, containers. In order to be portable and to do all this magic, Dagger needs a reasonable level of isolation so that you can be confident that your pipeline will run the same way on your local machine and on your CI. So it runs your builds in containers. And I already mentioned that Dagger has a few official SDKs that you can use to build the pipeline in your own code. Using the Dagger SDK, you can talk to the so-called Dagger Engine, which is the API that implements the GraphQL specification. And the Dagger SDK will call this API with the steps in your pipeline.
And the Dagger Engine will build a DAG from these steps and then pass it to a container runtime to run. That's how your pipeline runs. And the good thing about this is that you can actually chain these pipelines, so the output of one pipeline can be the input of another. This whole thing goes through a single thing called a session. So in a single session, you can have multiple container executions, and you can chain their results into each other if you want to. Now let's actually take a look at how these things run. The reason I asked whether anyone here was at my presentation this morning is that I completely botched the demo and it didn't work at all. So let's hope it works this time. The example is in Go, but again, it could be TypeScript or Python or even CUE. I'm not going to go into much detail about the Go specifics here. But basically, you need to import this Dagger SDK in order to... By the way, can you see the screen or the code? Make it bigger? OK. Better? Cool. So you have to import this Dagger SDK if you want to use Go. And then the first thing you need to do is connect to the Dagger Engine. If the Dagger Engine isn't running locally, the SDK will actually run it as a simple Docker container. So the first thing you need to do is connect to this Dagger Engine. And then you can start launching containers and start building your pipelines. If it looks familiar, it's because it uses basically the same language as Dockerfiles, and it works basically the same way. So you have a base image. You have a bunch of mounted volumes for caching. And then you mount the source code and you run some sort of command. And that's it. That's your pipeline. Now let's see if it actually runs. I use this Makefile alternative for Go called Mage. So this is how I have this whole code implemented in a test function.
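The step caching and chaining described above can be illustrated with a small sketch. This is not Dagger's implementation or its API: it is a toy model in Python, assuming that each step's cache key is derived from the base image plus every step before it, which is how Dockerfile-style layer caching behaves. The class name, the step strings, and the `fake_run` helper are all made up for illustration.

```python
import hashlib


class StepCache:
    """Toy cache with one entry per pipeline step (illustration only)."""

    def __init__(self):
        self.results = {}   # cache key -> step result
        self.executed = []  # steps that actually ran (cache misses)

    def run_pipeline(self, base_image, steps, run_step):
        # Each key depends on the base image and on every earlier step,
        # so changing one step invalidates that step and all later ones.
        key = hashlib.sha256(base_image.encode()).hexdigest()
        result = base_image
        for step in steps:
            key = hashlib.sha256((key + "\n" + step).encode()).hexdigest()
            if key not in self.results:
                # Cache miss: run the step on the previous step's output.
                self.results[key] = run_step(result, step)
                self.executed.append(step)
            result = self.results[key]
        return result


def fake_run(previous, step):
    # Stand-in for executing a step inside a container.
    return previous + " -> " + step


cache = StepCache()
pipeline = ["mount cache volumes", "mount source", "go test ./..."]

first = cache.run_pipeline("golang", pipeline, fake_run)
ran_first = list(cache.executed)    # every step runs the first time

cache.executed.clear()
cache.run_pipeline("golang", pipeline, fake_run)
ran_second = list(cache.executed)   # identical pipeline: nothing re-runs

cache.executed.clear()
changed = ["mount cache volumes", "mount source", "go vet ./..."]
cache.run_pipeline("golang", changed, fake_run)
ran_third = list(cache.executed)    # only the changed last step runs
```

In this model the output of each step is the input of the next, which is the chaining idea from the talk. A real engine would key on actual inputs such as file contents rather than on step strings.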
Let's see if it runs. Okay. So it did run. Cool. Let's run it in debug mode to see what's happening in the background. So it pulls an image, the golang image. It mounts the code. It mounts the volumes. Then it runs my Go tests on the mounted code, and then basically exits and outputs the result of the tests. So, well, that's it. If you want to get started with Dagger, check out the documentation. It's getting better by the day. They actually released, either today or yesterday, a new quick start guide, which is pretty awesome. It has all three or four supported languages in a single document, so you can switch between languages if you want to. There is even a playground for the lower-level GraphQL API. So if you don't want to start a new project, you can play directly with the GraphQL API using the hosted version of the Dagger Engine. So thank you for your attention. If you have any questions, I'm happy to answer them if we have time for that. Thank you so much. Awesome. I have a question with regard to implementation. Let's say you wrote your pipeline, you commit it, and you want to run it somewhere like a CI environment: GitHub Actions, or Gaffer Bids, Jenkins, or whatever. How do you go about that? I can imagine that you need to expose the Docker socket to the pipeline, or how does it work? Yeah, so basically, if you have Docker running in your environment, you can run this pipeline. And you can run Docker basically anywhere today. You can run it in Jenkins. You have it in GitHub Actions, actually. And you probably have it on your machine as well. So wherever you have Docker running today, this pipeline will run. So you just invoke the Dagger command, the command that you just showed us? Yeah, it's not even a Dagger command. This is entirely my code. This is my Go binary, basically. Right, okay. And it will find the Docker API socket and just start containers there? Yeah. Very cool stuff.
So before I switch all my CI to Dagger, let's frame it like this. What would be the two things that you would really love to see an improved implementation of in the next version? Can you repeat the question? In your opinion, what are the two things about the current state of Dagger that need the most improvement? Okay, so one thing is secret management. Right now, it's not that easy to work with secrets in Dagger, so that needs to be improved, and they are actually working on it, so that's great. The other thing is that right now, if you build something in one language, for example, if I build a reusable library in Go to run my pipelines, I can't reuse it in TypeScript today. For that, they are working on a feature called extensions, so you can build extensions to the Dagger Engine. You can build these reusable pipeline pieces, like running linters and stuff like that, so you don't have to build that in your own code. You just build the extension once, and you can call it from whatever language you want. Basically GraphQL API extensions. Thank you. Last question. Hi, does Dagger support spinning up Service A concurrently with Service B, because the tests need something else to be running while they run, and then afterward you can continue to other stuff? Right now, I don't think it does. Again, this is something that they are thinking about, but it's not a trivial thing to do, so no. Currently. Okay, someone is working on it. Thank you. Of course. Thank you. Thank you very much. |
How We Gained Observability Into Our CI/CD Pipeline
Using best of breed open source to monitor Jenkins |
So, I hope this will be fun enough to keep you awake at the end of the day. I'm very excited to be here at FOSDEM, and specifically in the CI/CD devroom. Today I'd like to share with you how we gained observability into our CI/CD pipeline and how you can do it too. So let's start with a day in the life of a DoD, a developer on duty, at least in my company. It goes like this. The first thing the DoD does in the morning, or at least used to do before we did this exercise, is go into Jenkins. We work with Jenkins, but the takeaways, by the way, are very applicable to any other system you work with, so nothing too specific here. You go into Jenkins first thing in the morning and look at the status there, at the pipelines for the last few hours, over the night, and of course you check if anything is red and, most importantly, if there's a red master. Depending on that, you either finish your coffee or jump straight into the investigation. And to be honest, sometimes people actually forgot to go into Jenkins and check this, so that's another topic we'll maybe touch upon. So you go in, and let's say you see a failure, you see something red. You need to start going one by one through the different runs and start figuring out what failed, where it failed, why it failed, and so on. And it's important to note that you actually needed to go one by one through the different runs, and we have several of these: the backend, the app, smoke tests. Then you start getting the picture, getting the pattern, and understanding what's going on across runs and across branches. And on top of all of that, it was very difficult to compare with historical behavior, with past behavior, to understand what's an anomaly, what's the steady state these days, and so on.
So, just to give you a few examples of questions that we found difficult or time-consuming to answer: did all runs fail on the same step, did all runs fail for the same reason, is it on a specific branch, is it on a specific machine, and if something is taking longer, is that normal, is that anomalous, what's the benchmark? These sorts of questions took us too long to answer, and we realized we needed to improve. A word about myself: my name is Dotan Horovits, I'm the Principal Developer Advocate at a company called Logz.io. Logz.io provides a cloud-native observability platform that's built on popular open-source tools that you probably know, such as Prometheus, OpenSearch, OpenTelemetry, Jaeger, and others. I come from a background as a developer, a solutions architect, even a product manager, and most importantly, I'm an advocate of open source and communities. I run a podcast called OpenObservability Talks about open-source DevOps and observability, so if you're interested in these topics and you like podcasts, do check it out. I also organize or co-organize several communities: the local chapter of the CNCF, the Cloud Native Computing Foundation, in Tel Aviv, Kubernetes Community Days, DevOpsDays, et cetera, and you can find me everywhere at @horovits. So if you tweet something interesting, feel free to tag me. So before I get into how we improved our CI/CD pipeline capabilities, let's first understand what we wanted to improve. I actually see very often that people jump into solving before really understanding the metric, the KPI that they want to improve. Very basically, there are four primary metrics for, let's say, DevOps performance, and you can see them on the screen: deployment frequency, lead time for changes, change failure rate, and MTTR, mean time to recovery.
I don't have time to go over all of these, but they are very important, so if you're new to this and you want to read a bit more, I left a QR code and a short link for you at the bottom for a 101 on the DORA metrics. Do check it out, I think it's priceless. In our case, we needed to improve the lead time for changes, sometimes called cycle time, which is the amount of time it takes a commit to get into production. In our case that time was too long and was holding us back. Now, we are experts at observability in our engineering team. That's what we do for a living, so it was very clear to us that what we were missing was observability into our CI/CD pipeline. To be fair to Jenkins, and there are lots of things to complain about with Jenkins, there are some capabilities within Jenkins. You can go into a specific pipeline run, you can see the different steps, and you can see how much time an individual step took. Using some plugins, you can also visualize the graph, and we even wired Jenkins to send alerts to Slack, but that wasn't good enough for us. The reason is that we wanted a way to monitor aggregated and filtered information according to our own timescale and our own filters, to see things across branches and across runs, and to compare with historical data. So that's what we aimed at. We launched this internal project with four requirements. First and foremost, we needed dashboards with aggregated views, to be able to see the aggregated data across pipelines, across runs, across branches, as we talked about. Secondly, we wanted access to historical data, to be able to compare, to understand trends, to identify patterns, anomalies, and so on. Thirdly, we wanted reports and alerts, to be able to automate as much as possible.
And lastly, we wanted some ability to view flaky tests and test performance, and to be able to understand their impact on the pipeline. So those were the project requirements. And how did we do that? Essentially it takes four steps: collect, store, visualize, and report. I'll show you exactly how it's done and what each step entails. In terms of the tech stack, we were very well versed in the ELK stack, Elasticsearch and Kibana. We then switched over to OpenSearch and OpenSearch Dashboards after Elastic relicensed and it was no longer open source. So that was our natural starting point for this observability journey, and I'll show you how we did these four steps with this tech stack. The first step is collect. For that, we instrumented the pipeline to collect all the relevant information and put it in environment variables. Which information? You can see some examples here on the screen: the branch, the commit SHA, the machine IP, the run type (whether it's scheduled, triggered by a merge to master, or something else), the failed step, the step duration, the build number, essentially anything that you find useful for investigation later. My recommendation: collect it and persist it. So that's the collect phase, and after collect comes store. For that, we created a new summary step at the end of the pipeline run, where we ran a command to gather all of the information from the first step, created a JSON document, and persisted it to Elasticsearch and, as I mentioned, later to OpenSearch. It's important to say, again in fairness to Jenkins and for the Jenkins experts here, that Jenkins does have some built-in persistence capabilities. We tried them out, but they weren't good enough for us. The reason is that by default Jenkins essentially keeps all the builds and stores them on the Jenkins machine, which of course burdens these machines. And then you start needing to limit the number of builds and the duration, how many days, and so on and so forth.
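The collect and store steps described above can be sketched roughly as follows. The talk lists the kinds of fields collected but not their exact names, so every variable name here (`BRANCH`, `COMMIT_SHA`, the `STEP_DURATION_` prefix, and so on) is an assumption made for illustration, and sending the document to Elasticsearch or OpenSearch is deliberately left out.

```python
import json

# Hypothetical variable names: the talk names the kinds of fields,
# not the exact environment variables used in the speaker's pipeline.
FIELDS = {
    "BRANCH": "branch",
    "COMMIT_SHA": "commit_sha",
    "MACHINE_IP": "machine_ip",
    "RUN_TYPE": "run_type",
    "FAILED_STEP": "failed_step",
    "BUILD_NUMBER": "build_number",
}


def build_summary(env):
    """Gather known pipeline variables from an environment mapping."""
    doc = {field: env[var] for var, field in FIELDS.items() if var in env}
    # Assumption: step durations arrive as STEP_DURATION_<NAME>=<seconds>.
    prefix = "STEP_DURATION_"
    doc["step_durations"] = {
        var[len(prefix):].lower(): float(value)
        for var, value in env.items()
        if var.startswith(prefix)
    }
    # A run counts as failed when a failed step was recorded.
    doc["status"] = "failure" if doc.get("failed_step") else "success"
    return doc


example_env = {
    "BRANCH": "master",
    "COMMIT_SHA": "abc123",
    "MACHINE_IP": "10.0.0.12",
    "RUN_TYPE": "scheduled",
    "FAILED_STEP": "smoke-tests",
    "BUILD_NUMBER": "1042",
    "STEP_DURATION_BUILD": "120.5",
    "STEP_DURATION_TEST": "300",
    "UNRELATED": "ignored",
}

summary = build_summary(example_env)
payload = json.dumps(summary, sort_keys=True)  # the JSON document to persist
```

In a real summary step one would pass `os.environ` instead of `example_env` and send `payload` to the search backend of choice.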
So that wasn't good enough for us. We needed more powerful access to historical data. We wanted to persist historical data under our own control, the duration, the retention, and most importantly off of the Jenkins servers, so as not to risk overloading the critical path. So that's store, and after store comes visualize. Once we have all the data in Elasticsearch or OpenSearch, it's very easy to build Kibana dashboards or OpenSearch Dashboards and visualizations on top of that. And then comes the question: okay, so which visualizations should I build? For that, and this is a tip to take with you, go back to the pains, go back to the questions that you found hard to answer, and make them the starting point. If you remember, we mentioned things such as: did all runs fail on the same step, did all runs fail for the same reason, how many failed, is it a specific branch, is it a specific machine, and so on. These are the questions that will guide you to choose the right visualizations for your dashboard. I'll give you some examples. Let's start with the top-line view. You want to understand how healthy, how stable your pipeline is. So visualize the success and failure rates. You can do that overall or in a specific time window on a graph, and it's very easy to see at first glance what the health status of your pipeline is. You want to find problematic steps? Then visualize failures segmented by pipeline step. Again, it's very easy to see the spiking step there. You want to detect problematic build machines? Visualize failures segmented by machine. That, by the way, saved us a lot of wasted time checking for bugs in the released code. When we see such a thing, we just kill the machine, let the autoscaler spin up a new instance, and start clean, and in many cases that solves the problem.
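The segmented views described above boil down to simple aggregations over the stored run documents. The following sketch does in plain Python what a dashboard would compute with aggregations in the search backend; the documents and field names are made up for illustration and follow no particular schema from the talk.

```python
from collections import Counter

# Made-up run documents, shaped like the summary JSON stored per run.
runs = [
    {"branch": "master", "machine_ip": "10.0.0.12", "status": "failure", "failed_step": "smoke-tests"},
    {"branch": "master", "machine_ip": "10.0.0.12", "status": "failure", "failed_step": "smoke-tests"},
    {"branch": "feature-x", "machine_ip": "10.0.0.12", "status": "failure", "failed_step": "build"},
    {"branch": "feature-x", "machine_ip": "10.0.0.7", "status": "success", "failed_step": None},
    {"branch": "master", "machine_ip": "10.0.0.7", "status": "success", "failed_step": None},
]


def success_rate(runs):
    """Top-line health: share of runs that succeeded."""
    if not runs:
        return 0.0
    return sum(1 for run in runs if run["status"] == "success") / len(runs)


def failures_by(runs, field):
    """Count failed runs segmented by one field (step, machine, branch)."""
    return Counter(run.get(field) for run in runs if run["status"] == "failure")


rate = success_rate(runs)
by_step = failures_by(runs, "failed_step")
by_machine = failures_by(runs, "machine_ip")
worst_machine = by_machine.most_common(1)[0][0]  # candidate to kill and replace
```

Here every failure lands on one machine, which is the kind of pattern that points to an environmental problem rather than to the released code.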
So lots of time saved. In general, this question of code-based versus environment-based issues is definitely a challenge, I'm assuming not just for me, so I'll get back to that soon. Another example is duration per step. Again, it's very easy to see where the time is spent. So that's the visualize part, and after visualize comes the reporting and alerting phase. If you remember, before, the DoD, the developer on duty, needed to go manually into Jenkins and check the health status. Now the DoD gets a start-of-day report directly in Slack. As you can see, the report already contains a link to the dashboard and even a snapshot of the dashboard embedded within Slack, so at first glance, even without going into the dashboard, you can see if you can finish your coffee or if there's something alarming and you need to click that link and start investigating. And of course it doesn't have to be a scheduled report. You can also define triggered alerts on any of the fields, the data that we collected in the first phase, the collect phase, and you can use any complex queries or conditions that you want. For example, if the sum of failures goes above x or the average duration goes above y, trigger an alert. Essentially, anything that you can formalize as a Lucene query you can automate as an alert, and that's an alerting layer that we built on top of Elasticsearch and OpenSearch. One last note: I'm giving examples from Slack because that's what we use in our environment, but you're obviously not limited to Slack. There is support for many notification endpoints depending on your systems: PagerDuty, VictorOps, Opsgenie, MS Teams, whatever. We personally work with Slack, so the examples are with Slack. So that's how we built observability into the Jenkins pipelines, but as we all know, especially here in the CI/CD devroom, CI/CD is much more than just Jenkins. So what else?
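The alert conditions mentioned above ("the sum of failures goes above x or the average duration goes above y") can be sketched as a small rule check. In the talk these rules are expressed as Lucene queries on top of Elasticsearch or OpenSearch; the version below swaps that for plain Python over documents that have already been fetched, and the field names, thresholds, and alert labels are all illustrative assumptions.

```python
from statistics import mean


def check_alerts(runs, max_failures, max_avg_duration_s):
    """Return the names of the alert conditions that fire for a time window."""
    if not runs:
        return []
    failures = sum(1 for run in runs if run["status"] == "failure")
    avg_duration = mean(run["duration_s"] for run in runs)
    triggered = []
    if failures > max_failures:
        triggered.append("too_many_failures")
    if avg_duration > max_avg_duration_s:
        triggered.append("slow_pipeline")
    return triggered


# Made-up runs from one time window.
window = [
    {"status": "failure", "duration_s": 900},
    {"status": "failure", "duration_s": 1100},
    {"status": "success", "duration_s": 700},
]

alerts = check_alerts(window, max_failures=1, max_avg_duration_s=1000)
```

With these numbers, two failures exceed the limit of one, while the average duration of 900 seconds stays under the limit, so only the failure alert fires. Delivering the alert to Slack or another endpoint is left out.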
So, if you remember the original requirements, we wanted to analyze flaky tests and test performance. We followed the same process: collecting all the relevant information from the test runs, storing it in Elasticsearch and OpenSearch, and then creating a Kibana dashboard or OpenSearch Dashboards. As you can see, it has all the usual suspects that you'd expect: test duration, failed tests, flaky tests, failure count and rate, moving averages, failed tests by branch over time, all of the things that you need in order to analyze and understand the impact of your tests and the flaky tests in your system. And similarly, after visualize you can also report. We created reports to Slack, we have a dedicated Slack channel for that, following the same pattern. One important point is about the openness. Once you have the data in OpenSearch or in Elasticsearch, it's very easy for different teams to create different visualizations on top of that same data. So here is another example, a different team that didn't like the graphs and preferred table views and counters to visualize, again very similarly, test stats and so on. And that's the beauty of it. So just to summarize: we instrumented the Jenkins pipeline to collect relevant data and put it in environment variables, then at the end of the pipeline we created a JSON document with all this data and persisted it to Elasticsearch or OpenSearch, then we created Kibana dashboards on top of that data, and lastly we created reports and alerts on that data. So four steps: collect, store, visualize, and report. That was our first step in the journey, but we didn't stop there. Next, we asked ourselves what we can do to investigate the performance of specific pipeline runs. You have a run that takes a lot of time, and you want to optimize, but where is the problem? That's actually what distributed tracing is ideal for. How many people know what distributed tracing is? A show of hands?
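As a rough illustration of the flaky-test analysis mentioned above, the sketch below flags tests from stored test results. The talk does not say how flakiness was defined, so the rule used here is an assumption: a test is treated as flaky when it both passed and failed on the same commit. The record fields and test names are made up.

```python
from collections import defaultdict


def find_flaky(results):
    """Return tests that both passed and failed on the same commit."""
    outcomes = defaultdict(set)
    for record in results:
        outcomes[(record["test"], record["commit"])].add(record["passed"])
    # Mixed outcomes for one (test, commit) pair mark the test as flaky.
    flaky = {test for (test, _commit), seen in outcomes.items() if seen == {True, False}}
    return sorted(flaky)


# Made-up test results, one record per test execution.
results = [
    {"test": "test_login", "commit": "abc123", "passed": True},
    {"test": "test_login", "commit": "abc123", "passed": False},
    {"test": "test_export", "commit": "abc123", "passed": False},
    {"test": "test_export", "commit": "def456", "passed": True},
    {"test": "test_search", "commit": "abc123", "passed": True},
]

flaky_tests = find_flaky(results)
```

Under this rule `test_export` is not flagged, because its outcome changed between commits, which may simply reflect a code change.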
Okay, I see some hands, not everyone, so I'll say a word about that soon. Very importantly, Jenkins has the capability to emit trace data, spans, just like it does for logs, so it's already built in. So we decided to visualize jobs and pipeline executions as distributed traces. That was the next step. For those who don't know, distributed tracing essentially helps pinpoint where issues occur and where latency is in production environments, in distributed systems. It's not specific to CI/CD. Think about a microservice architecture and a request coming in and flowing through a chain of interacting microservices. When something goes wrong and you get an error on that request, you want to know where the error is within this chain, or if there's latency, you want to know where the latency is. That's distributed tracing in a nutshell. The way it works is that each step in this call chain, or in our case each step in the pipeline, creates and emits a span. You can think of a span as a structured log that also contains the trace ID, the start time, the duration, and some other context. Then there is a backend that collects all these spans, reconstructs the trace, and visualizes it, typically in this timeline view or Gantt chart that you can see on the right-hand side. So now that we understand distributed tracing, let's see how we add this kind of pipeline performance tracing to a CI/CD pipeline. Same process. For the collect step, we decided to use the OpenTelemetry Collector. Who doesn't know OpenTelemetry, who doesn't know the project? Just so I have some background. Okay, I see a few, so I'll say a word about that. And anyway, I added a QR code and a link in the lower corner there for a beginner's guide to OpenTelemetry that I wrote. I also gave a talk about OpenTelemetry at KubeCon Europe, so you may find that useful.
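The span model described above (a structured record with a trace ID, a start time, a duration, and some other context) and the way a backend reconstructs a trace from spans can be sketched as follows. This is a simplified illustration, not the OpenTelemetry or Jaeger data model: the `Span` fields, including the parent reference used to rebuild the tree, and the sample pipeline steps are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # which trace (pipeline run) the span belongs to
    span_id: str
    parent_id: Optional[str]  # None for the root span
    name: str
    start: float              # seconds since the start of the run
    duration: float           # seconds


def render_trace(spans, trace_id):
    """Rebuild one trace as indented lines, children ordered by start time."""
    children = {}
    for span in spans:
        if span.trace_id == trace_id:
            children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start):
            indent = "  " * depth
            lines.append(f"{indent}{span.name} start={span.start:g}s duration={span.duration:g}s")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Spans arrive in arbitrary order and may belong to different traces.
spans = [
    Span("t1", "a", None, "pipeline", 0.0, 90.0),
    Span("t1", "c", "a", "test", 30.0, 60.0),
    Span("t1", "b", "a", "build", 0.0, 30.0),
    Span("t2", "x", None, "other pipeline", 0.0, 5.0),
]

timeline = render_trace(spans, "t1")
```

The indented output mirrors the tree on the left of a timeline view, and the start and duration values are what a Gantt-style chart would draw as bars.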
But very briefly, it's an observability platform for collecting logs, metrics, and traces, so it's not specific to traces, in an open, unified, standard manner. It's an open source project under the CNCF, the Cloud Native Computing Foundation. It's a fairly young project, but at the time the tracing piece of OpenTelemetry was already GA, generally available, so we decided to go with that. Today, by the way, metrics is also soon to be GA, it's already in release candidate, and logging is still not there. So what do you need to do if you choose OpenTelemetry? You need to set up the OpenTelemetry Collector, which is sort of an agent to send the data to. You need to install the Jenkins OpenTelemetry plugin, which is very easy to do in the UI. And then you need to configure the Jenkins OpenTelemetry plugin to send to the OpenTelemetry Collector endpoint, using OTLP over the gRPC protocol. That's the collect phase, and after collect comes store. For the backend, we used Jaeger. Jaeger is also a very popular open source project under the CNCF, specifically for distributed tracing. We use Jaeger to monitor our own production environment, so it was our natural choice for this too. We also have a Jaeger-based service, so we just used that. But anything that I show here you can actually use with any Jaeger distro, whichever one you use, managed or self-served. And if you do run your own, by the way, I added a short link to a very useful guide on how to deploy Jaeger on Kubernetes in production. So what do you need to do? You need to configure the OpenTelemetry Collector to export, in OpenTelemetry Collector terms, all the aggregated information to Jaeger in the right format. Once you have that, you can visualize, and the visualize part is much easier in this case, because you have the Jaeger UI with predefined dashboards, so you don't need to start composing visuals.
Essentially, what you can see here on the left-hand side is this indented tree structure, and then on the right, the Gantt chart. Each line here is a span, and it's very easy to see the pipeline sequence. The text is a bit small, but for each step of the pipeline you can see the duration, how much time it took, and you can see which steps ran in parallel and which ran sequentially. If you have very long latency overall, you can see where most of the time is being spent, where the critical path is, where you'd best optimize, and so on. And by the way, Jaeger also offers other views: it recently added a flame graph, and you have trace statistics, a graph view, and so on. But this is what people are used to, so I'm showing the timeline view. So that's Jaeger, and of course, as we said before, CI/CD is more than just Jenkins. So what can we do beyond Jenkins? What you can do is instrument additional pieces like Maven, Ansible, and other elements to get finer granularity in your traces and steps. For example, here, the things that you see in yellow are Maven build steps. So what used to be one black-box span in the trace you can now click, open, and see the different build steps, each one with its own duration, each one with its own context, and so on. So that's, in a nutshell, how we added tracing to our CI/CD pipeline. The next step is about something I mentioned before: many of the pipelines actually failed not because of the released code, but because of the CI/CD environment. So we decided to monitor metrics from the Jenkins servers and the environment. That covers the system, the containers, the JVM, essentially anything that could break irrespective of the released code, following the same flow. So the first step is collect. We use Telegraf in production, so we used it here as well. That's an open source project by InfluxData, and essentially you need two steps.
First, you need to configure Jenkins to expose metrics in Prometheus format. We work a lot with Prometheus for metrics, so that was our natural choice, and it's a simple configuration in the Jenkins web UI. Then you need to install Telegraf, if you don't already have it, and make sure that it's configured to scrape the metrics off of the Jenkins server using the Prometheus input plugin. So that's the first step. The second step is on the store side. As I mentioned, we use Prometheus for metrics, so we used it here as well. We even have our own managed Prometheus, so we used that, but anything that I show here is identical whether you use Prometheus or any Prometheus-compatible backend. Essentially, you need to configure Telegraf to send the metrics to Prometheus, and you have two ways to do that: pull mode or push mode. Pull mode is the default for Prometheus. Essentially, you configure Telegraf to expose a /metrics endpoint for Prometheus to scrape from. If you want to do that, you use the Prometheus client output plugin. Or, if you want to do it in push mode, you use the HTTP output plugin. Just an important note: make sure that you set the data format to Prometheus remote write. So that's the store phase, and once you have all the data in Prometheus, it's very easy to create Grafana dashboards on top of that. I gave some examples here. You can filter, of course, by build type, by branch, machine ID, build number, and so on. This example is system monitoring, so CPU, memory, disk usage, load, and so on. You can monitor the Docker containers: CPU, I/O, inbound, outbound, disk usage, and obviously the running, stopped, and paused containers by Jenkins machine, everything that you'd expect. And JVM metrics, Jenkins being a Java implementation: thread count, heap memory, garbage collection duration, things like that.
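To make "metrics in Prometheus format" more concrete, here is a minimal sketch of reading that text format, the job Telegraf's Prometheus input plugin does for real. It is a simplified illustration only: it skips comment lines, ignores escaped quotes inside label values, and discards optional timestamps. The metric names in the sample are made up and are not the names the Jenkins plugin actually exposes.

```python
import re

# name, optional {labels}, value, optional integer timestamp
LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) samples."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE comments
        match = LINE.match(line)
        if match is None:
            continue  # not handled by this simplified parser
        name, labels, value = match.groups()
        samples.append((name, dict(LABEL.findall(labels or "")), float(value)))
    return samples


# Made-up sample of what a /metrics endpoint might return.
sample = """\
# HELP ci_queue_size Number of jobs waiting in the queue
# TYPE ci_queue_size gauge
ci_queue_size 3
ci_executors{state="busy",node="agent-1"} 2
ci_heap_bytes 1.5e+08 1700000000000
"""

samples = parse_metrics(sample)
```

Each sample is a metric name, a set of labels, and a numeric value, and the labels are what the Grafana filters described above (branch, machine, build number) would operate on.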
You can even, of course, monitor the Jenkins nodes, queues, and executors themselves. So again, you have an example dashboard here. You can see the queue size, a status breakdown, the Jenkins jobs, the count executed over time, a breakdown by job status, and so on. These are just some types. Obviously there are lots of other visualizations that you can create, and you can also create alerts. I won't show that for lack of time. So just to summarize what we've seen: treat your CI/CD the same as you treat your production. In production you use whatever you use, Elasticsearch, OpenSearch, Grafana, to create observability. Do the same with your CI/CD pipeline, and preferably leverage the same stack, the same toolchain, for that. Don't reinvent the wheel. That was our journey. As I mentioned, we wanted dashboards and aggregated views to see several pipelines across different runs and branches, over time, and so on. We wanted historical data and controlled persistence off of the Jenkins servers, to determine the duration, the retention of that data. We wanted reports and alerts to automate as much as possible, and lastly, we wanted test performance, flaky tests, and so on. You saw how we achieved that. Four steps. If there's one thing to take out of this talk, take this one: collect, store, visualize, and report and alert. And what we gained, just to summarize, is a significant improvement in our lead time for changes, in our cycle time, if you remember the DORA metrics from the beginning. On the way, we also got an improved developer-on-duty experience, with much less suffering there. And it's based on open source. Very important, we're here at FOSDEM. So it's based on OpenSearch, OpenTelemetry, Jaeger, Prometheus, Telegraf, you saw the stack. If you want more information, you have a QR code here for a guide to CI/CD observability that I wrote. You're welcome to take the QR code or the short link and read more about this, as this was very much in a nutshell. Thank you very much for listening.
I'm Dotan Horovits, and enjoy the rest of the conference. I don't know if we have time for questions. No. So I'm here if you have questions or if you want a sticker, and may the open source be with you. Thank you. We have time for questions, if there are any. We have time for questions, so if you want, we can take a few minutes. Is that a question? Yeah, and another question in the back. Okay. Who wants to ask the first question? Thanks. So have you considered persistence? How long do you store your metrics and your traces? Have you thought about that? So we have. That was part of the original challenge when we used the Jenkins persistence, because when you persist on the nodes themselves, obviously you're very limited. There's a plugin that you can configure per number of days or per number of builds and so on. When you do it off of that critical path, you have much more room to maneuver, and then it depends on the amount of data you collect. We started small, so we collected for longer periods, but the appetite grew as we went, and people wanted more and more types of metrics and time series data, so we needed to be a bit more conservative. But it's very much dependent on your practices in terms of the data. Yeah, the question was more about the process. So it's iterative, as you explained, it starts small. Yeah, exactly. And iterative is best, because it really depends. You need to learn the patterns of your data consumption, of the telemetry, and then you can optimize the balance between having the observability and not overloading and running up costs. Right. Thank you, very, very interesting. Thank you. There was another question in the back, yeah? Thank you. So what was the most surprising insight that you've learned, good or bad, and how did you react to it?
I think I was most surprised personally about the amount of failures that occur because of the environment, and what kinds of things, and how simple it is to just kill the machine, kill the instance, let the auto-scaler spin it back up, and you save yourself a lot of hassle and a lot of waking people up at night, so that was astonishing. How many things are irrespective of the code and just environmental. We took a lot of learnings out of that to make the environment more robust, to get people to clean up after themselves, to automate the cleanups and things like that; that's what was insightful to me. Thank you. Any other questions? Then I have one last one, sorry. No, no worries. My question is, who are usually the people looking at the dashboards? Because I maintained a lot of dashboards in the past, and sometimes I had a feeling that I was the only one looking at those dashboards, so I'm just wondering if you identified a type of people who really benefit from those dashboards. So it's a very interesting question, because we also learned, and we changed the org structure several times, so it moves between Dev and DevOps. We now have a release engineering team, so they are the main stakeholders to look at that, but the goal of this dashboard is, as I said, the developer on duty, so everyone that is now on call needs to see that, that's for sure, and the tier two, tier three, so let's say the escalation chain for that. It is also used at a high level by the team leads on the developer side of things, so these are the main stakeholders, depending on whether it's the critical path of the developer on duty and the tiers, or the overall thing, the health state in general, for the release engineers. Thank you. Thank you very much, everyone. |
Inside the FIM (Fbi IMproved) Scriptable Image Viewer
About a Small Command Language Powering an Image Viewer |
Back in 2006, I was a user of Gerd Hoffmann's Linux framebuffer image viewer, fbi. At some point, I wanted to add Vim-styled keys for movement, so I came up with a patch for fbi. Soon after, I wanted a simple command line and shortcuts to jump around. And then commands came, and a parser, and auto-completion; inspiration came from Vim, mutt, the shell, and so FIM grew, hack after hack. What is FIM now? FIM is a Unix tool specialized in viewing image files. Let me stress this: viewing, not editing. FIM is customizable via configuration files, and is interoperable with other Unix tools. FIM adheres to the Perl slogan, there's more than one way to do it. Thanks to caching and prefetching, FIM plays well with slower computers. It spares disk read time. It has a minimalistic user interface, no buttons, no menus, and it's flexible. It displays images as pixels or characters, via SSH, under screen. If invoked in the Linux framebuffer, FIM uses it. Under X11, FIM runs in a window or full screen. Another option is to display images as ASCII art, even in color. Runtime auto-detection will try choosing the most appropriate mode. You can also specify one yourself via the command line. FIM offers a consistent look and feel across those different graphical modes. Invoking FIM from the shell to open image files works as you expect. File decoding depends on file contents, not on the file name. You can also load directories, even recursively, or in the background. In scanning a directory, a file-name-based selection occurs, and this is to avoid opening and inspecting the contents of too many files. Interactive usage is keyboard-oriented. Arrow keys for movement, plus and minus for scale, N for next, P for previous, or Page Down do what you expect. Each one of those keys is bound to an action. An action may be invoking a simple command or a command with expressions as arguments.
An action can also contain control flow, as the one associated with Page Down on the bottom of the slide. The thing is, under the hood, there is a language interpreter. Just as in Vim, you access it with the colon key. And just as in Vim, or the shell, there is auto-completion. Configuration files are scripts written in this language. FIM's language consists of commands, aliases, variables, control constructs, and special shortcuts. I develop FIM for my daily use, to occasionally open files or to load a collection of photographs. Sometimes I come up with more tricky use cases, perhaps updating the configuration file with new aliases, or updating my shell configuration with new aliases using FIM. So far, I talked about introductory topics. I'm also presenting another talk at this FOSDEM. That talk is about general interactive usage, and it may be of interest to you too. But now, we will cover language-specific topics. Of the FIM commands, the most important ones are those to move around, scale the image, and get help. The purpose of the help, goto, scale, and pan commands is self-explanatory. Other important commands, like list or limit, manipulate the file list. Please check out my talk on interactive FIM usage for that. How to start using FIM commands effectively? You usually pick the functionality you like, experiment a bit, and express it as an alias. This can be a command with an argument, like pan left, or goto +1. Or it can be a more complicated statement. Here you have next 10 for a simple slideshow loop. Notice how the arguments are quoted. The idea is to streamline your work, your workflow. No matter how complex an action, you should be able to encapsulate it. As Larry Wall said, easy things should be easy, and hard things should be possible. FIM has each of its commands and variables documented. The manual pages are generated from the help command. The help command is also dynamic.
So you can use it to get the actual key bindings and aliases. Now, certain frequent actions have direct language shortcuts. You may want to use those occasionally via the interactive command line. One is jumping to a specific position in the file list. So for the third position, enter the command line with colon, enter the digit 3, and hit Enter. For the first or the last one, you may recognize the caret and dollar syntax as familiar here. The shortcuts on this slide instead are for rescaling images by specifying factors. As you see, this exploits the fine-grained control that the scale command offers. What you see are standalone FIM statements, of course. Another shortcut syntax prefixes a command by a number. The command, or a block, will be repeatedly executed that specific number of times. One can interrupt the iterations by hitting Escape, for instance. You can use a so-called range syntax to repeat an action on a filename interval. Just specify number, comma, number before a command. Use this to invoke commands on filenames, which will be substituted for the open-close curly brackets. This usage of the bracket substitution mimics the syntax of the Unix find utility. FIM uses dynamic variables and a weak type system. Internally, a variable can be an integer, a floating point number, or a string. You can combine expressions with several operators. Strings concatenate with dot. There are two quoting styles for strings. Within single quotes, you only have to escape, using backslash, single quotes. Within double quotes, we escape double quotes. FIM sets several variables, many of them meant to be used internally. Some other variables are meant to be controlled by the user. These are the configuration variables, and may, for instance, control caching behavior, or customize the status line or the window manager caption. In the case of the special variable random, think of it as if it were a function. Certain variables prefixed by i colon are contextual to the current image.
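As an editorial illustration of the two shortcut ideas described above (a numeric prefix that repeats a command, and a range whose filenames are substituted for the curly brackets, in the style of `find`), here is a small Python sketch. It is not FIM's parser and does not reproduce FIM's syntax; the function names `repeat` and `expand_range` and the sample file names are hypothetical.

```python
def repeat(count, command):
    """Numeric prefix: the same command is scheduled `count` times."""
    if count < 0:
        raise ValueError("count must not be negative")
    return [command] * count


def expand_range(first, last, template, files):
    """Range syntax: one command per file in the 1-based, inclusive range.

    Every occurrence of "{}" in the template is replaced by the file name,
    similar in spirit to `find -exec`.
    """
    if not 1 <= first <= last <= len(files):
        raise ValueError("range is outside the file list")
    return [template.replace("{}", name) for name in files[first - 1:last]]


# Hypothetical file list, as an image viewer might hold it in memory.
FILES = ["a.png", "b.jpg", "c.gif", "d.png"]
```

With these definitions, `expand_range(2, 3, "echo {}", FILES)` yields one command for `b.jpg` and one for `c.gif`, which is the behavior the talk describes for commands applied to a filename interval.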
These image-specific variables may hold the filename, the size, or the current width of the image. But also special information, like EXIF metadata. I, for instance, really like to have those displayed in the status line. Variables are the glue of the language. But auto commands are the glue of the FIM internals. Auto commands, inspired by Vim, are actions that trigger if the current filename matches a pattern on a specific event. There are a dozen specific events defined in FIM. You have a list of them in man fimrc. Auto commands are tricky, but also very powerful. Now, remember the special range syntax? I must say one can also use it to rudimentarily interact with the shell from within FIM. The example here repeats the copy command with a changing first argument. It copies each file into a specific directory. You can think of the open-close brackets as the i colon filename variable. This feature is to be used with lots of care. By now, you know enough of the language that you can experiment in the internal command line. You can also make FIM run custom scripts or customize the configuration. It is useful to know that FIM has command-recording functionality. See the --write-scriptout option, for instance. With it, when terminating your usage session, FIM can save all the executed actions, along with timings, into a text file, a script. You can then replay that session by executing that output script. Just remember that only executed commands are saved. The actual invocation arguments to FIM will not be there. Also worth knowing: in your scripts, you can control the exit status. This is good for interaction with shell scripts. You can signal success or failure of a script. Indeed, so far, we have mostly seen internal scriptability. But FIM has several command-line options. There are many things you can do with them, also interacting with shell pipes. You can read and execute a script via the standard input. You can read images via the standard input.
You can also print filename lists as the output of one FIM instance and read them back as input in another FIM instance. We are approaching the end of the talk. You might be curious and want to know that FIM consists of about 42,000 lines of C++ code. The language parser has been written in Bison and Flex. And, well, details of the grammar are in man fimrc. FIM's language offers many possibilities. But this can still be improved a lot. Variable identifiers are not always clear. Detecting mistakes in a program can be difficult. Output could be escaped or quoted better. Sometimes I wish less quoting were possible. Autocompletion could be richer. And wouldn't it be awesome to use your favorite extension or customization language instead of this one here? Well, this is the end of our tour of FIM's internal language. FIM is packaged in several Linux distributions. So you can check out yours to obtain FIM. The documentation is mostly inside two man pages. And perhaps the other FIM talk I'm giving at this FOSDEM, specifically about interactive usage, can be of interest to you too. So I hope you will enjoy FIM as much as I do. Thanks for your attention. We are live now. We don't seem to have any questions. You could add something if you have anything to add beyond the talk itself. Maybe I could ask questions. Did you consider using a Lisp-like language for FIM? And why did you choose a specific kind of syntax? I don't know if Lisp is a minimalistic language, in that it requires a certain amount of parentheses as far as I know. And that is already too much, or more than what I wanted. I don't exclude the idea of having Lisp to interact with FIM in the future. But still I wanted to have minimalistic syntax, like the one that you have seen with the shortcut jump command. I think you cannot have that in Lisp. So you wanted a really short syntax. I am referring to slides 17 and 18. Okay. There is a question by Piotr. Perhaps I can answer it straight away.
Yeah, good. I am not enough of an Emacs user to have... Could you read out the question first? Piotr asks, do you have mappings for Emacs? And my answer is, I am not that much of an Emacs user to know it. However, the mappings, I mean, if you tell me, Piotr, what Emacs style is in interactive usage, it's no problem for me to come up with a few key mappings, because this is configurable, as you will also see in the using-FIM presentation this afternoon, which is less nerdy than this one. And when it comes to the language, the command line, I didn't say it, or I didn't put much emphasis on this, but FIM uses autocompletion based on libreadline. Libreadline is what you use in many shells, and it allows you to have autocompletion. So that gives interactive autocompletion with Emacs style. However, that works under the framebuffer and not in X11, so it's not exactly as it should be. Perhaps it can be fixed. I hope I answered the two aspects of Emacs style working with FIM. Please come up with another question. So could you say something about your motivation for writing FIM? Why did you want to take on this project? Thanks, Arun. I wanted to be able to suit my very personal style of opening files and browsing files, especially PDFs at the beginning. I liked to have PDFs and I wanted to use a style which is more Vim-oriented. At the time, I didn't have a suitable alternative that I liked. It's mostly to suit my own needs, and it continues this way. So, for the fun; these are the motivations behind FIM. Okay, yeah, that makes sense. So you wanted a PDF reader and an image reader that can be used like Vim. Yeah, when it comes to the PDFs, I didn't say it here in this talk, but there is a script which transforms PDFs or similar files into PNGs. It's called fimgs, and it's nothing else than the translation of fbgs, because FIM means to be an improvement on fbi. It can also open CBZ files or tarballs to extract stuff.
The extraction is not recursive, unfortunately. It's something that I wish to have at some point: a recursive extraction of tarballs or PDFs into images, or archives. Archives also can be opened, by the way, somehow, yeah. Do we have any other questions? I don't see any. There seems to be a bit of lag in my question-score box, but there are no questions. So I'll ask one more. So when you convert these PDFs to images for viewing in FIM, can you search through the PDF, I mean, search through the text of the PDF? No, no. The PDF will become an image, and that's all. I think in principle, yeah, by using Poppler, which is one library which FIM uses, it might be possible in the future, but I don't plan this at the moment. For now, it's about images. Use something else if you really need to use the full PDF functionality. For now, stick to FIM for images. People use FIM for image frames, good for them, but I think that's not using FIM to its full potential. I think FIM should be used interactively for images. And when one wants to have some special custom command, then you are welcome to configure FIM, but mostly it's meant for interactive usage, so don't get me wrong here. It's not something for programming stuff; use a library for that. This is just for viewing, and viewing on steroids, hopefully. What do you mean by image frames? You said something about image frames. I mean small devices, which usually have a small screen and a very weak processor, which every five minutes, perhaps, change an image and do a slideshow, which is very slow, and usually not interactive. I think people use it a lot for this, and on the Raspberry Pi, for instance, because I think there is no X there, or, let's say, it's good to use the framebuffer there. But I encourage using FIM interactively every day, if you like this style of keyboard-oriented usage, of course. I wish to have menus, and I'm working on that, but that's not the spirit of FIM. Okay, I think the time is over.
We still have two and a half minutes. I mean, two hours, 45 seconds. Sorry, two minutes, 45 seconds. I still don't see any more questions. It's probably a bit early in the morning. Yes. Okay. I asked the question about the PDFs because Emacs has something very similar to look at PDFs, and it has the same problem with the search. It converts the PDFs to images, and then you can't really do much with the image. It's not a substitute for a PDF reader, that's the point. Can you display SVG images using FIM? Sorry, can you repeat, please? Can you handle SVG images in FIM? Which images? SVG, scalable vector graphics? Yes, yes, yes. There is an internal conversion which uses Inkscape to render internally a pixel map out of the SVG. You have different converters inside FIM. Sorry, not inside, but being invoked by FIM, to make a pixel map out of even your custom format. I invite those who wish to use FIM to my other talk this afternoon. I think it's at 3:30 or something like this. It will be mostly about usage. I think, guys, you should prepare for the next recording. The next talk, I think someone else will be handling. Good. So, we are almost done. Thanks, Arun. If you want to add something, you have 20 seconds more. No, you are welcome to follow me live in Brussels this afternoon, for those who are in Brussels. Otherwise, virtually. Thanks, Arun. Thanks to the organizers Manolis and Piotr. Thank you. Ciao. Bye. Thank you. |
LIPS Scheme
Powerful introspection and extensibility |
Welcome to my talk, LIPS Scheme: Powerful Introspection and Extensibility. My name is Jakub T. Jankiewicz; you can find me online with my handle jcubic. I am a senior software developer from Poland. I focus mostly on the JavaScript language, and I am an open source developer and a Polish Wikipedia editor. I am also a mentor and a teacher. We will talk about Lisp and Scheme history, we will do a quick introduction to Scheme, next we will talk about the history of LIPS Scheme, and the most important part of the talk is about LIPS Scheme and how it works. To get the most out of this talk you need to know the basics of JavaScript. Lisp was presented in 1960 by John McCarthy in his famous paper Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I. Part II was never created. It was based on the lambda calculus by Alonzo Church, and the paper explained an eval function that was written in Lisp itself. One of McCarthy's students, Steve Russell, decided to implement the eval function on the IBM 704, and it was the first interpreter of the Lisp language. The syntax of the interpreter was a little bit different from the one described in the paper because of the limitations of the keyboard on the IBM mainframe. Lisp stands for List Processing. The most important thing about the language is that it's homoiconic, which means that Lisp code is represented by lists, the main data structure. It was heavily used in AI research at the beginning and it was a great source of inspiration for most modern programming languages. There are a few so-called dialects of Lisp which are used today: Scheme, Clojure, Emacs Lisp, Racket, and Common Lisp. Scheme was invented in the 1970s at MIT by Guy L. Steele and Gerald J. Sussman when they investigated the actor model. The language is defined by specifications called RnRS, which stands for Revised Report on the Algorithmic Language Scheme, where the number indicates how many times it was revised. The second version was Revised Revised, so they used a power to make the name shorter.
There are also official extensions to the language, SRFI, Scheme Requests for Implementation, which add new language features. The official website for the language is scheme.org. Now let's talk about the basics of Scheme. In most modern programming languages, when you have a function call, you use syntax like this, where you have a function name and in parentheses there are arguments separated by commas. In Lisp and Scheme, on the other hand, the code is created from S-expressions: a list created by parentheses, where the first element is a function and the arguments are separated by spaces. And you can nest those lists. What is important with these expressions is that those are not operators. The plus and the asterisk are symbols, which are names of functions. So they are in fact function calls, and they are written in the same way as the sine function. As I've mentioned, code and data use the same data structures. So it's important to distinguish data from code. This is done by quotation. The first expression is code and the second is data, a list of numbers. To define variables in Scheme, you use define. That can also be used to define a function. And let is used to create local variables. And this is how you write an if statement that will print a message depending on a Boolean expression. The define, if, and let expressions are special syntax which works differently than normal functions. And you can define your own syntax like this by using macros. For example, we can define a macro that, when passed an expression in infix notation, will sum the numbers using prefix notation. In Scheme, there are two types of macros. First are Lisp macros, which accept code as data and return a new list that will be evaluated. And the second are hygienic macros, which use pattern matching syntax. Lisp macros are used probably by all Lisp dialects. But hygienic macros are specific to Scheme and dialects based on Scheme. To learn more about Scheme, I suggest a book, Sketchy Scheme by Nils M Holm.
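As an editorial aside, the ideas in this introduction (code as nested lists, `+` and `*` as ordinary function names, quotation to turn code into data, and `define` and `if` as special forms) can be sketched with a toy evaluator. The sketch below is written in Python rather than Scheme or JavaScript, it is not how LIPS is implemented, and it handles only integers, symbols, and the spelled-out `(quote ...)` form; all names in it are illustrative.

```python
import math
import operator


def tokenize(source):
    """Split an S-expression string into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists (code as data)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return items
    try:
        return int(token)
    except ValueError:
        return token  # anything else is a symbol


# "+" and "*" are plain names bound to functions, not operators.
GLOBAL_ENV = {
    "+": lambda *args: sum(args),
    "*": lambda *args: math.prod(args),
    "<": operator.lt,
}


def evaluate(expr, env):
    if isinstance(expr, str):   # symbol: look up its value
        return env[expr]
    if isinstance(expr, int):   # number: evaluates to itself
        return expr
    head, *rest = expr
    if head == "quote":         # special form: return the data unevaluated
        return rest[0]
    if head == "define":        # special form: bind a name
        env[rest[0]] = evaluate(rest[1], env)
        return None
    if head == "if":            # special form: evaluate only one branch
        test, then_branch, else_branch = rest
        return evaluate(then_branch if evaluate(test, env) else else_branch, env)
    function = evaluate(head, env)  # everything else is a function call
    return function(*[evaluate(arg, env) for arg in rest])


def run(source, env=None):
    return evaluate(parse(tokenize(source)), GLOBAL_ENV if env is None else env)
```

Note how `(+ 1 (* 2 3))` and `(quote (1 2 3))` are parsed into the same kind of nested list; only the evaluator decides whether a list is treated as a call or returned as data.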
You can find the older version of the book on the Internet Archive, but I suggest getting the latest version from this link. The main topic of this talk is LIPS, a Scheme implementation written in JavaScript. So let's quickly talk about the history of this project. It started on CodePen as a Lisp based on Scheme. I wanted to create Emacs in the browser and wanted to have something like Emacs Lisp. That's why LIPS from the beginning has had an optional dynamic scope, which is a characteristic feature of Emacs Lisp. I picked Scheme because it's much simpler than other dialects. You can still find the first version of the interpreter on CodePen. LIPS was inspired by Emacs Lisp and Python, mostly in the introspection features and in that all functions have documentation inside the code, which you can access from the REPL. The last version of LIPS that you can access from the npm repository as a stable release is version 0.20.3. But at a certain point, I decided that I wanted a full Scheme implementation, not only a Lisp based on Scheme. And I started working on the code on the devel branch. But at one point, it turned out that there were way too many breaking changes to release the next version. That's why I released it as 1.0 beta, and the latest beta is 16. At the beginning, the whole code was written in JavaScript. But when I was making an effort toward a full Scheme implementation, more and more code was written in Scheme. Now almost half of the LIPS code is Scheme. And now it is time for the demo. This is the official website for the LIPS project. And what's cool about this is that here you have a bookmarklet, and you can drag this link to your bookmarks and execute it on a different page. For instance, here there is the first lecture of Structure and Interpretation of Computer Programs, the classic video lectures from MIT. You can evaluate Scheme code that you see on the screen.
A feature of the REPL is that there is syntax highlighting and parenthesis matching, and also each macro and function has documentation if you hover over the name. Here you have the documentation for define. Here you have the documentation for the asterisk, the multiplication operator. You can also undock the panel with the REPL and use it inside a window that you can drag and drop on the page. Another cool feature of the REPL is that you can execute it on PDF files. But I've tested this only in the Chrome browser. This PDF document is the Scheme language specification. But it often gives problems if you try to execute code that is inside this document in the REPL. For instance, on page 12, there are these quotations. If you try to execute this code in the REPL, you get this kind of warning. But you can fix this error by executing this code. You execute it, and suddenly you can evaluate this expression. This is a special kind of function that creates a syntax extension. Here you have the documentation for this function. Syntax extensions allow you to define new syntax similar to the one defined in JavaScript, like those quotations. Here you can see vector literals, defined by the hash sign. Vectors are also created as syntax extensions. And Scheme vectors are just JavaScript arrays. A similar syntax extension is the ampersand, which defines JavaScript object literals. Here we can see that the representation of object literals looks the same as the code. This is another feature of LIPS, which allows you to define a new representation for different instances. Scheme vectors are also defined in the same way. You can use both features to define homoiconic data types. Records are the way to define new data types in Scheme, which is defined in the specification on page 27. You can define a syntax extension for this record. The third argument to set-special! indicates how the make-person function should receive the arguments: as a list or as normal arguments.
This feature may be removed in the future to simplify the code. The dot notation in the last argument is taken from JavaScript to simplify interaction with the hosting language. lips is a global object that you can inspect with the dir function, inspired by Python. In the same way, you can access any JavaScript object or function. But let's go back to our record example. A person is a class. And you can create an instance of that class with the new macro, or with the make-person function created by the Scheme record type. We can also use our syntax extension to create a new person object. Now let's add a representation of this new data type. And now we can evaluate the code and have the same representation. The q parameter indicates whether the result should be quoted or not. In the REPL, the strings are quoted because it uses the Scheme write function. But you can use the display function, which doesn't use quotations. And with set-repr! you can also make a representation of the records without the new syntax. With this feature, you can easily serialize and deserialize custom data types, for instance, when saving in browser local storage. We use eval because read returns a list as data that needs to be evaluated to get the instance of the person object. To get a property of the speaker object, you can use the dot special macro. Or you can use JavaScript dot notation. The next feature I want to discuss is introspection. You can use the apropos function to search the environment. This is a list of functions and macros that match a given string, in this case, vector. You can also use regular expressions to make the search more specific. And this is a list of typed vectors. Each of those constructors also has a syntax extension that allows you to create those vectors according to the Scheme specification.
Each Scheme typed vector is in fact a JavaScript typed array. And each of those arrays has its own representation, so they look like the code that defines them. You can access the documentation and the source code of every function, macro, and variable defined by LIPS. You can access the name, the documentation, and the source code of this function. What's cool about the code is that it's a live object that you can modify. The double underscore syntax is inspired by Python magic properties. You can also inspect the internals of other LIPS objects, like symbols or numbers. LIPS supports the full numerical tower. Here's a complex number, but it's not yet fully unit tested, so there are no guarantees that everything works 100% correctly. You can also inspect list objects. The instanceof function uses the JavaScript instanceof operator to check if the argument is an instance of the object. Here it checks if x is a LIPS list, and it returns true. You can use the standard list function to get the third element of the list. But you can also access the internal pair objects defined by LIPS. The apropos function that I showed a few moments ago allows you to search the environment, but in LIPS you can also access the environment objects themselves and do cool things with them. You can inspect them. As you can see, there is a double underscore env property that you can read. Object.keys is a JavaScript function that returns an array of strings, so it's represented as a Scheme vector. Those are all the objects defined inside the REPL, including LIPS internals. You can see our person class, and you can access it using the environment object. make-person was our representation of the person instance using set-repr!. Inside the environment object there is also another double underscore property, parent, which allows you to access the lexical scope chain. You can access both x variables from the scope chain inside one expression. You can also modify the scope inside the chain.
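As an editorial aside, the environment objects described above (a set of bindings plus a parent link forming the lexical scope chain, searchable in the style of apropos) can be modelled with a small Python class. This is a sketch under assumptions, not LIPS's actual implementation: the class name, the `bindings` and `parent` attributes, and the method names are illustrative only.

```python
import re


class Environment:
    """A scope: local bindings plus a link to the enclosing scope."""

    def __init__(self, bindings=None, parent=None):
        self.bindings = dict(bindings or {})
        self.parent = parent

    def _scopes(self):
        """Yield this scope and then every enclosing scope, innermost first."""
        env = self
        while env is not None:
            yield env
            env = env.parent

    def lookup(self, name):
        """Return the value from the nearest scope that defines the name."""
        for env in self._scopes():
            if name in env.bindings:
                return env.bindings[name]
        raise NameError(name)

    def set(self, name, value):
        """Modify an existing binding in the nearest scope that defines it."""
        for env in self._scopes():
            if name in env.bindings:
                env.bindings[name] = value
                return
        raise NameError(name)

    def apropos(self, pattern):
        """List every visible name that matches a regular expression."""
        regex = re.compile(pattern)
        names = set()
        for env in self._scopes():
            names.update(name for name in env.bindings if regex.search(name))
        return sorted(names)
```

For example, with an outer scope binding `x` to 10 and an inner scope binding `x` to 20, `inner.lookup("x")` sees the inner value while `inner.parent.lookup("x")` reaches the outer one, which is the "both x variables in one expression" situation from the demo.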
Set is a generic macro that allows you to modify any JavaScript object, not only LIPS internals. Another feature of LIPS is that you can access the stack frames of function calls. They are also environment objects, inspired by the R programming language, which has a lot of Lisp under the hood. Here the function test can modify the scope outside the function call. Another function similar to parent.frame is the plural parent.frames, which returns a list of stack frames. With both parent.frames and double underscore code access you can modify the function that calls a given function anywhere inside the call stack chain. The long arrow is a macro for invoking methods, and the callee is a LIPS object similar to the JavaScript object with the same name. The long arrow macro is a convenient way to create a chain of method calls. This code demonstrates another fundamental feature of LIPS, where everything is automatically awaited by default when needed. fetch is a JavaScript function that returns a promise which resolves to a response object. That object has a text method that also returns a promise. Here we can skip them, which makes the code simpler than the JavaScript equivalent. The match in the code is a string method, and the number one returns the first group from the regular expression. The whole expression returns the main header of the LIPS website, but when needed you can quote the promise and use it as an object. Now let's go back to our presentation for a final thought. As you were able to see from the demo, LIPS is pretty flexible and powerful, but it has its limitations. One of the limitations is that macros are run-time. There is no macro expansion time. There are also some performance issues. One of the reasons for that may be the lack of macro expansion time, but you can fix those issues by embedding JavaScript code in tight loops.
The most important things that are missing are first-class continuations and tail call optimization, and also syntax-rules, the Scheme hygienic macro system, is not working exactly as it should. All this can be improved in the future. Thank you for listening to my presentation. |
Bringing RISC-V to Guix's bootstrap
What's done and what we need to do |
Hello everyone, this is Ekaitz Zarraga, and we're going to talk about the RISC-V support in the Guix bootstrap, or at least the bits I've been working on during this last year. So I'm a telecommunication engineer, I'm a freelance engineer or programmer working at elenq.tech, which is just my website. I'm a Guix user and a contributor, so many things we're going to talk about today involve Guix one way or the other. Maybe you will remember me from last year: I gave this talk that you can see here as a recapitulation of my previous year working on RISC-V, and what I'm going to introduce to you today comes as the result of that talk. So last year I asked for the NLnet grant, which I mentioned in that talk, and we didn't know if it was given or not at that time, but now I can say the grant was given, and I've been working for that grant part-time during one year, and this is the work I'm going to show you today. So the work was based on the bootstrapping process of Guix, or in general of other distros, for the RISC-V architecture. And yeah, we're going to talk about what I did, what I left for the future, how it can be improved, and these kinds of things. So if you want to read a longer version, you have my blog here at the bottom, and in my blog there's a series of posts describing every single commit I did, so maybe read that one if you want to go into the details. In this talk we're just going to take a look at a couple of these commits in a very high-level way. So let's introduce the bootstrapping issue and why this is important, right. So we have free software and we love it, right, because it lets us audit our programs: we can read the source code of our programs and we can check if we like what they do or if we don't like it. But when we start to have software distributions, some other problems appear.
For example, if a distribution maintainer, or someone working on the distribution, decides to replace the binaries that are given to us through the package manager with other binaries that have some kind of vulnerability, they could do that. And we have no way to match the source code we suppose that binary is built from to the binary itself, so we don't know this relation. In Guix we have reproducibility, so if the Guix maintainers, for whatever reason, decide to give us the wrong binaries, we can always challenge the binaries from the substitutes and check whether the binaries they are giving us and the ones we can build ourselves are the same. That's very interesting in many other areas that we're not going to cover, but it only protects us from people being evil, and sometimes a program can be evil too. Imagine this compiler, this evil compiler with sunglasses, decides to introduce vulnerabilities or malware in our final binaries. We may think that, as we compile the software ourselves, the binary is going to have only the functionality we can see in the source code, but in fact the compiler can be introducing new functionality that may harm us as users. Reproducibility here only ensures that we are going to reproduce this environment and make the same binary, so if the compiler is corrupt, we are just going to reproduce the same corrupt binaries. We need something else to make sure that the compilers we use are not corrupt, because we don't really know. This kind of attack is described by Ken Thompson, one of the authors of UNIX, in a paper called Reflections on Trusting Trust. You can take a look at it; there's a link here to how this is done in real life. We have to remember that compilers are also programs, so this issue is recursive. If we want, we
could make a compiler that introduces this corruption in the next program that is built using it, and if the next program is a compiler, it could introduce it again to reproduce itself. So the problem might be here, in this compiler, but it might be here, and this corrupt compiler can introduce the vulnerabilities into the next one. The problem we have here is much deeper. It's a recursive problem, because the compiler has to be compiled. So what's the exit point of this recursion? Imagine, just in the realm of ideas and imagination, that we could make a compiler that is just source, that doesn't have to be processed. If we write something in a language that doesn't have to be processed, that is source directly, that would mean we don't need to add this arrow here: the conversion between the source and the compiler would be just one line with no extra things, so we could break this recursion. We're going to mention some projects that do that, and introduce how it works in real life.
In practice, the GNU/Linux distributions just take as given many prebuilt binaries, including bash and many others, so they have to trust that all those binaries don't have any kind of malware inside. They are trusting the ones who built those programs, and not only that, they are also trusting the compilers that built those programs. So there are many layers of trust there, because the compilers that compiled that software are also being trusted. It's really hard to know who is compiling each thing and where the compiler they are using is coming from, so there are many issues with that. What we know in practice is that we can compile most of the world using a powerful compiler such as GCC, so we can base our research, or our story, on GCC. Once we fix the bootstrappability issue in GCC, as everything, or almost everything, comes from a C program or a C compiler, we can bootstrap everything from there. The problem is that we can't use a prebuilt compiler for building GCC, because of what I've been talking about during these minutes. So the key here is: who is compiling the compiler? If we go to GCC, and we want to compile the world, we have a dependency, which is obviously GCC. But GCC also requires a compiler to be built, and normally that compiler is going to be a previous version of GCC, and that GCC is also going to require another compiler, which is probably going to be GCC, and that one is also going to need another compiler, which is also going to be a GCC, and so on. If we follow these points, we realize that we don't really know who built the first GCC in this list. Probably the first one was built with a program written in assembly or whatever during the 80s; we don't really know. So we have to break this recursion somewhere. There are also a lot of libraries here in the middle; it's kind of complex.
So we have to make a project, a very simple compiler, that's able to start this chain of compilations, adding one point that doesn't really need to be trusted. This exists already, and it is what we have in Guix nowadays. We have the world, and by the world we mean all the packages in the world, more or less. We have a modern GCC that is able to build most of that. We have a GCC 7.5 building the modern GCC, we have a GCC 4 building that GCC 7, and we have a GCC 2.95 building that. We have a Tiny C Compiler; in this case, you see, we are not using GCC anymore. The Tiny C Compiler is a very small C compiler, and we use it to build the GCC. In our case we are also using, somewhere in the middle here, the bootstrappable Tiny C Compiler,
which is a simplified version of TinyCC that is buildable using GNU Mes. And we also have stage0-posix, which is a set of tools that are written directly in source code, so we don't really need to trust any binary, and from there we can build a very simple compiler. With that simple compiler we can build GNU Mes, and with that build the world. Let me introduce these two projects, because maybe you don't know them very well. These ones are very easy to understand: GCC is a C compiler, and TinyCC is also a C compiler. But these two are a little bit complex. GNU Mes is a project that has, like, two legs. One of the legs is a Scheme interpreter and the other leg is a C compiler. The C compiler is designed so it's able to build the Scheme interpreter, and the Scheme interpreter is designed so it can interpret the C compiler. In the end, what we have is a Scheme interpreter written in C and a C compiler written in Scheme. And they are self-hosted, right? No, not self-hosted: they are mutually hosted. The Scheme interpreter can interpret the C compiler and the other way around, so they don't rely on themselves but on each other. The goal of this is to help create a full bootstrap for UNIX-like operating systems, and this is what they are doing in the case of Guix. Also, the Scheme interpreter is written, as you see, in simple C, and the C compiler is written in Scheme. The most important part of this description, which I just copied from the website, where you can read more, is that they can be bootstrapped using M2-Planet and Mescc-tools. These tools are also part of stage0-posix, and stage0-posix is that mini compiler thingy I told you about, the one that is literally source code. It starts from a very small seed. You can see here, it's a 256-byte seed written
directly in binary, in a slightly weird format to make it easier to read, but it's literally binary. We write that in binary, and it is small enough to be checked by the user to make sure that the binary is correct. From that we can start building small tools until we build something that is close to a very simple C compiler. Once we reach that, we can build GNU Mes using it, and from Mes we are able to build a much bigger C compiler such as TinyCC, and from that one we can start building GCC. Here you see this one is relying on the kernel, but there are other projects that don't even trust the kernel, so that's very interesting too. Wrapping up: there's no corrupt compiler if there's only source code. Think about that. That's how we break the problem, or how we break the recursion.
Now, the RISC-V support. This is the status of the RISC-V support one year ago, when I started working on this. The world is marked as not applicable, because there are many packages out there, but in another project Efraim Flashner is working on bringing the whole world to RISC-V on Guix, so that's very interesting too. Modern GCCs do have RISC-V support, starting with 7.5; the older ones don't have any RISC-V support. TinyCC does have RISC-V support, but the bootstrappable one doesn't, because it was forked a couple of years ago, maybe four or six years ago, I don't remember. And GNU Mes does have some support, but it hasn't been tested yet. So this is the status of the project. The spoiler of what I did is that I've backported the RISC-V support to this GCC, and I've backported the RISC-V support to this Tiny C Compiler, and we're probably going to remove this one, so we don't really need it. So in that case, everything has RISC-V support now. Well, not everything, but kind of. So, what I did: I started working on GCC.
So I went for GCC 4.6 and started putting in all the RISC-V support I had in a more modern GCC. But to understand that, we have to understand a little bit about how GCC works. GCC uses a model called Davidson-Fraser, and it's not the same model we see in most of the compilers we have read about in books. Here the intermediate representation of GCC is machine dependent, and it's based on something called register transfer language. In a very high-level way, what we have is a high-level language, imagine C or C++. We read that and convert it to an AST, which in GCC is called GIMPLE. I don't remember the reason, but there is a reason behind this name. That is converted to RTL, the intermediate representation that is machine dependent; this one is machine independent. We optimize on top of that, we generate new RTL with a different structure, and that structure is matched against some RTL templates, and from those we obtain the assembly. From the assembly onwards it's just calling as, the GNU assembler, and ld, the GNU linker. These conversions are kind of easy to do, but not really. The first one, the GIMPLE to RTL conversion, is done using identifiers. Imagine that we have the GIMPLE instruction add: we are going to search for the RTL instruction called add and just convert from one to the other. In the case of the optimizations, they are done by checking some RTL templates, so if we have several instructions together, and we have one instruction that can do everything together, we can expand or compress instructions. There are some rules we have to write. And then there is the RTL to assembly conversion, which is described in the RTL instruction itself. We're going to see an example here. This is an example of a machine description file, or a very small piece of a very big machine description file, that you
can find in the GCC source code at gcc/config/, then the machine you want to use, then whatever.md, where .md stands for machine description. This is how the instructions look. The instruction here is called adddi3: the add is from the add instruction, the di is from double integer, so it's for big integers, and the three means it's going to use three arguments. This is the name that is going to be matched against the identifiers in GIMPLE, so the conversion is going to be like that. Then, this is the predicate, so this instruction is only going to be used in case the target is a 64-bit target; that's because it's using a double integer. This is the assembly code this instruction is going to generate, and these are some attributes; I don't really care about those. The behavior of the instruction is described here, and it says you have to set the register operand zero to the value of the plus of the register operand one and the arithmetic operand two. That's the way it works, so this add instruction has this meaning. These match_operand parts are pieces that are going to be matched against the RTL code we have. First it's going to generate a general RTL code, and then it's going to match it against these blocks, and if they match, it's going to generate this assembly line. If we had another rule that is more specific than this one and matches before it, this one is not going to be matched and it's going to generate other assembly. This is a little bit how it works. These files are processed, C files are generated from them, and those C files are included in the code of GCC. That's the GCC build system, which is really, really complex. So you may think that, if all this is done in this kind of machine description files, which are a kind of configuration file, then if you take these machine description files from a GCC that supports RISC-V and
you move them to the older GCC that doesn't support RISC-V, it should work, right? You should be able to just compile the GCC and make it work. But reality is not that simple. We also need other kinds of things to bring another target to GCC. One of them is the target description macros and functions, which is just a very big header file with a lot of things defined there, and there are also libraries like libgcc and many others. It's more complex than it looks. So my process here was just trying to solve the missing pieces of code I found here and there. What I did was exactly that: I went to the GCC code base, took the commit where the RISC-V support was added, brought that to the past, and in the past I started fixing all the problems I was finding. There were many. There were things missing, so I had to fix them, add new ones, or use older constructs that were equivalent. There were also some RTL constructs that didn't exist, I had to make some extra predicates, I had to convert the new C++ API to the C API that we had in the past, many, many things. The hardest one was libgcc, because that was related to the build system, and the GCC build system is really complex. I didn't really understand how it worked, but I finally made it work somehow. You can read these two blog posts where I describe all the changes I had to make, and I go into quite a lot of detail in the description, so go and read those if you're interested in the code itself.
About TinyCC: I had a similar issue, but the git history wasn't that well described, so I had to take some files, try to overwrite, and do it a little bit by hand. Same thing: I packaged everything for Guix to be able to reproduce the work I was doing on another computer and let my friends help me. And I just started reading the code from the modern TinyCC and adding it to the old TinyCC that was the base of our fork, the bootstrappable TinyCC. It was really hard to do. TinyCC is super hard to read for me: it has many, many variables whose name is only one character, functions whose name is o or of, and stuff like that. It's really hard to read, but I managed to make it work; I don't remember very well how. If you want to read about that, you have this blog post that describes it in a little bit more detail. I don't recommend it either. In this case I didn't really do very interesting work; it was kind of complex. I think this one is pretty well done; the only thing missing is some optimization code, and I decided to leave it out of the project because it would affect other architectures too. So the only difference we have with the upstream code is this one. On the left is a program compiled with the optimized version, and the other is the one I did. The only difference you can see is the jump here at 4c, which doesn't have just one instruction: the instruction is doubled, because it's not applying the optimization of cleaning this up after the jump is set. It doesn't matter that much, to me at least, because this instruction is never going to be executed, so it doesn't change much, and this compiler is not going to be a production compiler, so it really doesn't matter.
So, considering both backends are kind of working and generating RISC-V binaries, there are some things that need to be done. We have to remember that I only worked on the backends, and something I didn't tell you is that, in order to test them, I was working in a cross-compilation environment. I was working on my machine making binaries for RISC-V, and I tested the backends that way. But we have
to test all these things on proper hardware, to make sure that we can build one thing with the next and make the whole chain work. For GCC, we have to properly package it and test it with C++ support; I didn't have time to test and fix all the libraries that may be missing on the C++ side. Then we have to describe how to build it using TinyCC directly, because before we had GCC 2.95 in the middle and we don't have it available on RISC-V. We also have to be able to build GCC 7.5 using the backported one on RISC-V. For TinyCC, we have to build the bootstrappable TinyCC using GNU Mes, and we have to decide whether we are going to use the upstream one or keep the bootstrappable TinyCC in the middle, so there are some decisions to be made here. In the case of Mes, we have to review the RISC-V support, which is something we haven't done yet, but I think it's pretty advanced, so it should be easy to merge. And on the Guix side we have to package everything, so the steps for building one compiler with the other are well described in Guix and everyone can benefit from it and use it. And in the end, as an extra, we have to do all this on real hardware, so we don't have the problems of cross-compilation and stuff like that, which are fine for testing but not if we want to make something that is useful in real life. So, as last words (yes, I'm just finishing): as you can imagine from the talk, there's a lot of work to be done, but most of it is just the integration that needs to happen after all the time I've spent bringing some of the dots in this line to RISC-V. So the work that is missing is packaging and integrating everything, and maybe I'm not the best guy for doing that. So we are looking for a little bit more funding from NLnet, and I think NLnet is open to giving us some help with that, and we are going to
involve more people in doing this kind of work, so if you like this kind of work you can also join us and help us do this. I also want to thank all the people who were involved in this, because I had help from many people, and some people told me that they want to help me further, so thank you very much to everyone. Just to finish: if you want to join us, or if you have any question or anything like that, you have my email there, and these are the relevant IRC channels you can join. And I know this is hard, I know this is complex, but if you want to join, and if you are open to learning new things and fighting against old compilers and some code that is not very easy to read, I'll be there. I'll be there trying to help you and give you the psychological support that everyone needs sometimes. So just join, it's going to be a lot of fun, I promise. Thank you very much, bye bye.
Thanks for being there. We are live then? Okay, that's great. Okay, let's answer some questions then. So there's a question about the hardware I'm using to compile. The easiest answer is: I just use my laptop. I have a laptop which is not a very powerful one, but it's a good one, and I just use that. I'm a very patient guy; I have to compile GCC many times. Yes, patient: I wait for it to finish, and that's what I'm doing. For the future we are looking at buying some RISC-V hardware in order to test all these things on a native machine, but at the moment I'm just using my laptop. So that's the answer. What do Guix and this project think about Rust versus C? Oh, that's a good question. I can't speak in the name of Guix, but I can speak for myself. Rust is a cool language, it's a fantastic language, but it's huge, so sometimes I think it's going to be really hard to bootstrap. We had issues with that, if I
don't remember wrong, right? We have to build it through GCC, and the bootstrapping process is really complex. So yeah, Rust is cool, but for the bootstrapping issue it's not the best. And in the case of C, we have... I think, yeah, the connection has been cut off. Okay, you can continue your point in the chat here, or someone can ask in the talk room. Okay, do I have the link to the next room, or how is it? So, if you're in the "Bringing RISC-V to Guix's bootstrap" room, that is your talk-specific room. Okay. So people can come and interact with you here even when the other talks are going on. Okay. So thanks for speaking. Thank you, thank you for being there and helping me. I'll catch you later then; I'll move on to the next. Bye bye. |
Using GNU Guix Containers with FHS (Filesystem Hierarchy Standard) Support |
Hi everyone, welcome to my talk about using GNU Guix containers with FHS support, Filesystem Hierarchy Standard support. It's a pleasure to be presenting here at FOSDEM; I wish I could be there in person. This talk, if you're watching it live, is also at a really early time for me in my time zone, so I'll do my best to be mentally present for questions, but I'm looking forward to discussing afterwards. I am not a container expert. I've definitely suffered a bit through some containers, just trying to make some things work and trying to explore other stuff. Containers seem to be pretty much everywhere, so I've come in contact with them, not as something I've developed personally, but as a practical matter of trying to get through and use them. In terms of what FHS is, it stands for the Filesystem Hierarchy Standard, and this is what we typically see in most Linux distros: things in /lib, /bin, lots of random things in /etc, and so on. That's the typical thing, but it is a rather big assumption, and I didn't even know the term or what it referred to until I really started working on this and coming to Guix especially and seeing what they do there. Also, let me start by giving you a brief overview of GNU Guix. I'm sure most of you are pretty familiar with it, but for those who aren't, I'll give a quick overview of some of the features and how it works. It's a distribution of the GNU operating system, and it follows the FSDG, the Free System Distribution Guidelines, meaning that it only deals with free software, essentially: no binary blobs, firmware, and things like that in the kernel, for instance. The whole distribution, from the package definitions down to the service manager, Shepherd, is built on Guile Scheme. As I mentioned before, I love all things Lisp, and this was really a big feature for me: being able to hack on the whole distribution from the top down in a language that is a Lisp.
And this has brought lots of cool features. Things are transactional: something either happens or it doesn't, you don't get stuck in weird states, and you can roll back to previous sets of packages. The whole system is declarative, so you can have just one file where you declare exactly how your operating system is set up, from where file systems are mounted, to your users, to the packages that are included and how you configure all sorts of things. There are other cool things like transformations: being able to take a package definition and easily change it to a different branch or a different Git commit, or to use patches that you have locally, all in a way that lets you reproduce the same output again. That's a big feature of what Guix tries to do. And to do a lot of these things, it's necessary that Guix does not follow FHS. In other words, in order to have different package versions for different users on the system, they can't all be in /bin, right? To have different dependency versions, or to try out a newer dependency for something you're building versus what other things use, you need to be able to separate out where things are without just throwing them all in one place. This also lets you do system configuration and roll back and change things basically just by changing symlinks. I'll leave a few links up here for those who want to read more. So then, a nice feature of Guix, and what this talk is about, as an additional option to it, is guix shell. This is essentially a very quick one-off environment. You can install something in this temporary environment and use it, do some testing, do whatever you want, without installing it in your main profile. Normally, in most distros, when you want to use something, you have to install it, use it, and then you forget about it, right?
I've definitely installed lots of little tools just to play around or try something, figure out which one I want, and then forgotten about them until I realized I'd installed lots of stuff taking up space. So this lets you do a one-off thing. Another nice feature is that this is cached. The first time, it may have to download substitutes, or build things if you're building locally, for whatever the package you want to run needs, so that can take a little bit of time. But afterwards it's nearly instantaneous, right? If guix shell has already computed the set of packages you want, it'll run pretty much as quickly as running the stuff directly, the launching at least. After that it's, you know, however fast the program runs. I've found this a really invaluable tool. I love having little tools I use once in a while; I don't need them around all the time. So, some simple examples. One is using, let's say, Python or another language where you just have some scripts you want to run occasionally. You don't need to develop them all the time; you don't need Python and tons of Python packages around. In this case, you can run guix shell with Python as an input, plus a particular Python package, here the Canvas API, which I use for some grading stuff, and then, after the double dash, the command you want to run in that shell. You can also just run guix shell without the part after the two dashes to enter an environment for as long as you need, and then exit back out to your normal shell. So it's a nice, simple tool, and it's really powerful. It's great for building up one-off environments, or for reproducing a set of packages to hack on something, for instance, without having to maintain all of that at once. The next step for this command is the container option. The long form is --container, the short form -C, and it runs things, as you might guess from the good naming, in a container.
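The one-off invocations described here could be sketched roughly as follows. This is only an illustration: the package name python-canvasapi and the script name grade.py are assumptions standing in for whatever is on the slide, and the commands are wrapped in shell functions so that nothing runs until you call them.

```shell
# One-off run: Python plus one library as inputs, then the command after "--".
# "python-canvasapi" and "grade.py" are assumed names used for illustration.
run_grading_script() {
    guix shell python python-canvasapi -- python3 grade.py "$@"
}

# Without "--" and a command, guix shell drops you into an interactive
# environment; --container (short form -C) isolates it from the host.
enter_python_container() {
    guix shell --container python "$@"
}
```

Calling run_grading_script runs the script once and leaves nothing installed in your profile; enter_python_container gives an isolated interactive shell that you leave with exit.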
And this container uses pretty much the same technology as everything else, which is namespaces, and it works basically the same way: you are in an isolated environment, so you have to specify everything you want to be in there. It's not quite a virtual machine, because we're not emulating, for instance, a different host CPU or something like that, but it's somewhere in between, and you specify everything you want. In a container you are, by default, isolated from the host. It lets you do things in a reproducible manner, in a way where you can keep things contained and not have to worry about being polluted by environment variables, other packages, or anything else you have going on. The new option that I'm going to spend the rest of this talk on is FHS containers. This is a new option, building on the container option, called --emulate-fhs, or -F for the short version. In short, it makes things in the container look like an FHS distribution. So we'll get things like /lib and /bin, and, on a more technical note, it also includes a glibc which will read from a global loader cache. That helps with compatibility, which is one reason why it was added. In short, the use here is that it gives you a nice minimal environment that's more typical, in a sense. As much as I love Guix and think it can take over the world, it hasn't yet, so it's good to be able to make contact with what most other people will be using in some way. Often language-specific tooling expects to be able to download packages from places or to set up an environment in a certain way, which may not play well with Guix. So if you're somewhere in between using packages in Guix and having to use other ecosystems, this can be a handy way to set up a nice, isolated environment for that.
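One way to see what the FHS option changes is to list the usual directories with and without it. The sketch below assumes coreutils is enough to provide ls inside the container; the function names are made up for illustration.

```shell
# Plain container: /bin should be nearly empty, as on a regular Guix system.
list_bin_plain() {
    guix shell --container coreutils -- ls /bin
}

# FHS-emulating container (--emulate-fhs, short form -F):
# /bin and /lib are populated the way a typical distro would have them.
list_bin_fhs() {
    guix shell --container --emulate-fhs coreutils -- ls /bin /lib
}
```

Running the two functions one after the other makes the difference between the layouts visible without touching the host system.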
Likewise for some binaries. I'll discuss testing too: if you want to be able to test something in a different environment, not going to a full virtual machine or anything like that, but something like a chroot-type testing setup. So, on to some more specific details. If we take our previous command and add the FHS option, we can see that /lib has a whole bunch of stuff. /bin also has a bunch of things, as we might expect in a typical distro, whereas here I see nothing, and on a regular Guix system I just have a symlink for sh, for compatibility reasons, right? That's it. Everything else comes about through those profiles, the symlinks I mentioned before, to be able to find things. So the FHS environment, in short, just sets up some further symlinks and puts things you'd expect to be in certain places where programs can find them. Let's look at a few examples. I think that's the best way to demonstrate the uses and how this is a nice, neat tool. First, the Tor Browser. If you're using it, you're usually concerned about privacy and about fingerprinting, in other words, tracking down who someone is based on things like font sizes, things that are installed, canvas sizes, etc. So for privacy, Tor and others recommend running a standard browser, without all the little things that users do to make it their own. In this case, running the official Tor Browser binary is a good idea. And why not? We're trying to be safe and private, so let's have some extra isolation and keep that environment pretty self-contained. So here is the command. In this case, I've downloaded the Tor Browser and already extracted it to a folder called tor-browser. Let me highlight some of the different pieces of this command. We're running guix shell in a container; the network option gives network access, which by default we do not get; and then the FHS option.
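A sketch of what such a Tor Browser invocation might look like is given below. Only ALSA, bash and coreutils are named in the talk; the rest of the package list, the launcher file name start-tor-browser.desktop, and the assumption that XAUTHORITY is set on the host are illustrative guesses, not the exact command from the slide.

```shell
# Launch an already extracted Tor Browser inside an FHS container.
# $1 is the extraction directory (defaults to ./tor-browser).
start_tor_browser() (
    cd "${1:-tor-browser}" || exit 1
    # --network: allow network access (off by default in containers)
    # --preserve/--expose: pass the X display settings through to the container
    guix shell --container --network --emulate-fhs \
        --preserve='^DISPLAY$' --preserve='^XAUTHORITY$' \
        --expose="$XAUTHORITY" \
        alsa-lib bash coreutils dbus-glib file gcc:lib \
        grep gtk+ libcxx pciutils sed \
        -- ./start-tor-browser.desktop -v
)
```

The body runs in a subshell, so the cd does not change the directory of the calling shell.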
Now, in order to display something on the host outside the container, we need to provide some environment for it. Preserving and sharing things like DISPLAY and XAUTHORITY for an X server will let us do that. The Tor Browser is pretty self-contained, but it still expects to have things from the typical distro it's being run on. So here I've added ALSA, bash, coreutils, and various other things you might expect in order to get it to run. And then we can just launch it, which is this last piece. So let's try that out. In this case, it starts up right away. I've already run it, so it didn't have to download and set up all those packages. And there we are, we're in our Tor Browser. It's very bright, but it looks good. Another example I mentioned earlier is tooling ecosystems from different languages. Rust is one that is very popular. It's in the Linux kernel, or just about to be. It moves quickly, and a lot of projects, you'll notice, use the nightly toolchain binaries and libraries. So if you're keeping up with that and developing things, you may want access to it. In Guix we're not as quick at updating Rust. There's a lot that needs to be built, and it's a moving target. So maybe you want to do that in a separate environment and be able to use the quote-unquote usual tools and directions that someone gives for setting up Rust. In this case, there's the rustup tool, which is just a little script that will download and set up a Rust toolchain and environment for you. What happens if you try this directly on Guix? I suppose I can show it, why not. These are the instructions given on the rustup website, which just say to curl the script and pipe it into a shell. If we try that, it'll download it, and then you'll get a cryptic "No such file or directory" error, which is weird because, if you look, the thing it's asking for does exist.
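The Tor Browser invocation described above can be sketched roughly as follows. This Python snippet only assembles the argument list; nothing is executed. The options are the ones named in the talk, while the launcher name `./start-tor-browser.desktop` and the fallback XAUTHORITY path are assumptions, and a real run most likely needs more packages than the three the talk names explicitly.

```python
import os
import shlex


def tor_browser_command(xauthority):
    """Assemble the container invocation described for the Tor Browser."""
    return [
        "guix", "shell",
        "--container",      # isolated environment
        "--network",        # network access is off by default
        "--emulate-fhs",    # provide /lib, /bin, and friends
        # Keep the X11 variables and expose the authority file to the container.
        "--preserve=^DISPLAY$",
        "--preserve=^XAUTHORITY$",
        f"--expose={xauthority}",
        # Packages named in the talk; the full list is longer (assumption).
        "alsa-lib", "bash", "coreutils",
        "--",
        "./start-tor-browser.desktop",  # assumed launcher inside tor-browser/
    ]


if __name__ == "__main__":
    # Assumed fallback location when XAUTHORITY is not set.
    xauth = os.environ.get("XAUTHORITY", os.path.expanduser("~/.Xauthority"))
    print(shlex.join(tor_browser_command(xauth)))
```

Keeping the command as a list rather than a single string avoids quoting problems with the regular expressions passed to --preserve.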
In this case, that's indicative of it trying to run something or use a library that isn't where it's expected, the loader in this case. There are ways around this. You can patch the paths to point to the right place and so on, but that's getting a bit tricky. Instead, we'll use our new tool. So here I'll run a shell again, giving it network access since we're going to download stuff, and then a bunch of inputs. This is a lot more than you need just for the rustup script. For that, I think you pretty much just need gcc:lib to load things, the toolchain for building, curl, and grep. The rest of these more graphical ones are for building, as an example, a Rust project, which we can show as well. The last option here says I'm sharing the temporary home directory I created as home in the container. By default, the container will just see the current working directory, so in this case I've also excluded the current working directory so it doesn't appear nested. But that's the default behavior: you'll just see the directory you're in, not anything else outside of it. So if you want to reuse your environment, especially an FHS one where you might be running things for a while and want to come back to it, then you'd want to set up a home directory for it. You could share your own, but I think it's good practice with containers to set up a separate one and let that build up the state. Then, if you want to erase it or go to a different one, you can just change that option and not have to clean things out and figure out what got touched. Okay, so let's run this. You can see it has some instructions, and we can just let it download things and run. That's it. It gives you some instructions to source something, and then we can see if there's a Rust. And indeed there is, 1.67 from five days ago, pretty recent. Outside the container, I don't have Rust.
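A minimal sketch of the rustup container described above, again only building the argument list in Python. The gcc:lib, gcc-toolchain, curl, and grep inputs follow the talk; bash, coreutils, and nss-certs are assumptions added so that a shell and HTTPS downloads would plausibly work, and the temporary home path is hypothetical.

```python
import shlex


def rustup_shell_command(temp_home, home):
    """Assemble an FHS container suitable for running the rustup script."""
    return [
        "guix", "shell", "--container", "--network", "--emulate-fhs",
        # Inputs mentioned in the talk for the rustup script itself.
        "gcc:lib", "gcc-toolchain", "curl", "grep",
        # Assumed extras: a shell, basic tools, and TLS certificates.
        "bash", "coreutils", "nss-certs",
        # Mount the separate home directory as the container's home.
        f"--share={temp_home}={home}",
        # Do not share the current working directory.
        "--no-cwd",
    ]


if __name__ == "__main__":
    # Hypothetical paths; substitute your own.
    print(shlex.join(rustup_shell_command("/home/user/temphome", "/home/user")))
```

Because the state lives in the shared directory, pointing --share at a different directory gives a fresh environment without any cleanup.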
From there, I can follow the usual directions for a project. The inputs I gave here came from an example, eww, a widget library. It's pretty popular for making cool desktop widgets. It uses the latest Rust, and the instructions for building the project are pretty much to clone the repository and then run cargo build. You can do that in the shell once rustup has given you the latest version and set up your environment the way you want it, without polluting and messing up your main environment and shell. That's quite nice, especially for things like this that download a lot of stuff and set things up. If you want to test things or build from a clean environment, this is a really nice tool for doing that. All right. As another example, addressing something in Guix in particular, though I think it's handy more generally: we don't have Electron-based applications. With the JavaScript packaging nightmare dystopia, as I've heard it called, it's just hard or impossible right now to package things from source, or to bootstrap from source all the way through. With the dependency chains and circular dependencies, the ecosystem is not really built for what Guix wants to do and the standards we have for how we package things. So you have some free software that's Electron-based, which there's no reason you shouldn't be able to run, but we can't build it from source. But we could use AppImages, for instance, which are supposed to be nice self-contained packages with everything in them, although, as we'll see, it's not quite as simple as that. Let's jump right to an example before looking at some detail. In this case, a trick I want to show is using the development option. What this does is grab all the development inputs needed to build, in this case, ungoogled-chromium from source.
So that will be all the libraries, the compilers, all the stuff you'd need if you were working on that project. In this case, I want to grab all those inputs. I've found this is a nice way, not a finessed or minimalistic one, of grabbing lots of inputs when you don't want to mess around, or when you expect to need all the stuff a typical browser does. For Electron things, I think that's helpful. gcc:lib is usually needed for a few libraries that binaries always expect to be around. And now we get to some more details for desktop applications, which often expect to have access to D-Bus and to be able to send and receive messages that way. So we preserve that environment and expose where it runs. Likewise, since this uses Chromium as a rendering engine, it tries to use hardware acceleration, so exposing a bunch of devices and other hardware paths is needed to make things run as smoothly as possible. While most of this is, I would say, reproducible in a sense, some of these options are probably more particular to my system in terms of what hardware is needed and what it tries to run with, so some of this may need tweaking on different systems. Once we have all that: some weeks ago I downloaded a version of VSCodium, which is the freely licensed build of Microsoft's VS Code editor. Once you've made it executable, you're supposed to be able to just run it directly, because it's supposed to have everything in there. But that doesn't work. Maybe we should try that first, just to see what it looks like. So I get the same "No such file or directory" if I try to just run it. As expected, a binary assumes you have some things and doesn't expect you to have nothing. So they're not really self-contained.
I've seen other AppImages where I've had to include other random inputs you'd expect to have been packaged in there. Other times, they make assumptions about what's on the host system, which is why I've included some other stuff here that, again, is probably overkill for this package, but serves as an example you can use. Again, the profile already exists, so it doesn't have to do anything here. And we get VSCodium. A couple of things to note. One, as I mentioned, I've exposed more from the host in order to get this to run. So even though this is a container and the AppImage is supposed to be self-contained, you still need things from the host to run a big graphical desktop tool. So it's a convenience to be able to run things like this without having to build them from source and all of that. On the other hand, it's not really completely contained and private if we have to start poking lots of holes to get things to work. On a technical note, the AppImage here was run with the option --appimage-extract-and-run. As far as I understand, an AppImage is a sort of archive or disk image. Normally, when you run it directly, it mounts itself using FUSE somewhere in your file system and then runs from within there. That doesn't work here because mounting something usually needs root or FUSE access, and you don't have that from within the container. There is a way to call out of the container using tools from Flatpak. As a test, I made a little wrapper that calls fusermount on the host to mount the image. And that actually works, except that within the container you don't see the mounted image. I would love it if someone could explain the details.
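The AppImage setup described above might be assembled along these lines. This is a sketch under stated assumptions, not the exact command from the talk: the D-Bus socket location, the /sys and /dev paths exposed for hardware acceleration, and the AppImage file name are all guesses that, as the talk notes, may need tweaking per system. Only --development, ungoogled-chromium, gcc:lib, and --appimage-extract-and-run are taken directly from the talk.

```python
import shlex


def appimage_command(appimage, xauthority):
    """Assemble an FHS container invocation for a graphical AppImage."""
    return [
        "guix", "shell", "--container", "--network", "--emulate-fhs",
        # Pull in every input needed to build a browser, plus gcc:lib.
        "--development", "ungoogled-chromium", "gcc:lib",
        # X11 access.
        "--preserve=^DISPLAY$", "--preserve=^XAUTHORITY$", f"--expose={xauthority}",
        # D-Bus access (socket location is an assumption).
        "--preserve=^DBUS_", "--expose=/var/run/dbus",
        # Hardware acceleration (paths are assumptions and system specific).
        "--expose=/sys/dev", "--expose=/sys/devices", "--expose=/dev/dri",
        "--",
        # Extract and run instead of mounting through FUSE.
        appimage, "--appimage-extract-and-run",
    ]


if __name__ == "__main__":
    # Hypothetical file name and authority path.
    print(shlex.join(appimage_command("./VSCodium.AppImage", "/home/user/.Xauthority")))
```

Each --expose line is one of the "holes" mentioned above, so the list doubles as a record of how much of the host the application can see.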
My thinking is that it has something to do with when the namespaces are created: since the container has already been set up with access to certain things, even if it has access to where that image is mounted through FUSE, it can't access what's in there. It sees nothing. If you create another container, or if you look from the host after you've run this mount, you will see it. So that's a technical detail about containers that someone can probably explain better. But for good reasons, you generally don't want things in the container to call out to the host, especially things that need special access like mounting disk images. All right, a few tips I want to close out with. In general, it's not clear which packages you need. Usually there's a lot of trial and error. You run something and it complains that it can't find a library. That's not always helpful, because often you'll get, as you saw, "file not found" or some other error, and it's not clear where it's failing to load something. strace can be a bit overkill, but you could of course use that or other tools to see which libraries it's trying to load and where it's breaking down. READMEs, surprise surprise, can often tell you what a project needs to build and run, but they're usually not complete. There are usually assumptions about tools everyone supposedly has on a machine in a distro. So there's a bit of trial and error there. flatpak-xdg-utils lets you call out to the host using portals, as they're called, which is a way of, for instance, passing a URL to be opened in the host browser. Lastly, what's next? I think some utilities to make this easier to script. You can definitely take those long guix shell commands, put them in your favorite script, and run them that way. You can also use a longer shebang to call guix shell, and so on.
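One concrete way to investigate the cryptic "No such file or directory" error is to check which loader a binary asks for. The kernel reports that error when the interpreter path recorded in the binary's PT_INTERP program header does not exist, even though the binary itself does. The sketch below reads that path from a 64-bit little-endian ELF file using only the Python standard library; other ELF variants are deliberately not handled, and the function name is a hypothetical helper, not part of any Guix tooling.

```python
import struct
import sys

PT_INTERP = 3  # program header type holding the loader path


def elf_interpreter(data):
    """Return the requested loader path of a 64-bit LE ELF image, or None."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        return None
    if data[4] != 2 or data[5] != 1:  # ELFCLASS64, little-endian only
        return None
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked, or not a dynamic executable


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        print(elf_interpreter(handle.read()))
```

If the printed path is something like a loader under /lib64 and that file is absent on the host, running the binary inside an FHS container is one way to supply it.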
But I think it would be helpful to have some ways of packaging up these common options, or ways to run things like that more seamlessly from the host. Those are some things we could work on to make this smoother. I'm also interested to hear what uses other people have. A few people on the Guix mailing list and IRC already chime in sometimes with things they're trying to do, and that's been helpful for seeing what works and what doesn't. So, to end, I think this is another great tool in the guix shell toolbox. I know I'm a little partial, having written the patches, though with help. But I think it lets us do a lot of things in guix shell that, for practical reasons, we want to be able to do. I would love to be able to build everything from source, to have it reproducible, to have a Guix package for it. That's not always possible. It's not something I reach for very often. I've needed it on very few occasions, but it's great to have it there, to be able to test something that I'd expect to work on another machine, for instance. And it's always, I think, a good learning experience. This has taught me a lot about Guix and about what's reproducible and what's minimalistic, to come back to the theme of this devroom. Being able to specify everything that's in your container and what's needed, and really understanding what's happening there, gives you a good understanding, I think, of how software is built and of the shortcomings. Even things like AppImages and Flatpaks, which are supposed to be all-in-one, are not really. And this gives us another way of running things like that, or maybe of developing ways of packaging them, in addition to the other tools Guix has. I'd definitely appreciate any input, feedback, and questions people have. And just to end really quickly, a thank you especially to Ludovic, who helped tweak and polish these patches and some fixes.
There was also previous work done on a third-party channel, which is non-free so I won't go into any detail, but that was the origin of some of this, and I worked on things there as well. Thank you everyone for your attention. It's been a pleasure to be here. I hope to see everyone in person at the next one. Thanks. |
Self-conscious Reflexive Interpreters |
Hi, I'm Will Byrd. This talk is on self-conscious reflexive interpreters, and is joint work with Nada Amin. The talk is largely inspired by Jon Doyle's 1978 MIT PhD thesis proposal entitled Reflexive Interpreters, but also incorporates ideas from John McCarthy, Marvin Minsky, and Doug Lenat. It's very much work in progress, and I hope you'll bear that in mind as you listen. We're hoping that the ideas in this talk will encourage you to explore some of the space of reflexive or self-conscious interpreters, which both Nada and I find very intriguing and inspiring. So, in 1959, John McCarthy, who is known for many things, including being the originator of the Lisp programming language, wrote a paper called Programs with Common Sense. This is one of the early papers in symbolic artificial intelligence, or artificial intelligence in general, and in it he proposes writing a problem solver that can learn from its experience and can also accept advice from, and communicate with, an external entity. He calls this software an advice taker, or the Advice Taker. So when I talk about the Advice Taker, I'm talking about the software he envisions in this 1959 paper. In particular, his notion of common sense is far beyond what you see in almost any piece of software today. Maybe with large language models, very recently, you could argue that there's some common sense, though even that is, I think, highly debatable. But when interacting with a compiler, a text editor, a word processor, or things like that, I think there's really no common sense in those programs, even many decades after this original proposal. McCarthy describes this notion of an advice taker, and he proposes features that he thinks would be critical for building one. For example, all the behaviors of the system have to be representable in the system itself.
So this is a system that, to some extent, understands its own capabilities and behaviors. The system also has to be extensible in a simple way, and it has to be able to improve its behavior. There are other features that McCarthy considers important. Then, in the paper, he talks about different ways you could describe such a system using imperative or declarative sentences, and about how you might go about constructing an advice taker. It's important to understand that in 1959 John McCarthy hadn't attempted to create such a system. This is basically a white paper describing how you might go about building it, what the design might look like, and what the desired features would be. But he certainly hadn't implemented anything like the Advice Taker at this point, and as far as I can tell, no one has succeeded in building something like it to this day. He talks about the main features, representing expressions in the computer, and so forth. Bear in mind when you read it that this paper is from 1959, fairly early in the history of computing, at least as we see it in 2023. But you can see the kernel of an idea here. For example, he has this notion of an immediate deduction routine. If you're familiar with Kahneman's idea of System 1 and System 2 thinking, McCarthy proposes something very similar, with fast thinking and slow thinking. Slow thinking is more reflective or introspective, and fast thinking is a sort of automatic thinking. So part of the idea in this paper is that a smart system, or common-sense system, like the Advice Taker would have access to what we might now call solvers: SAT solvers, SMT solvers, program synthesis tools, and so on. These may be fast at solving particular problems, but in some sense, at a global level, they aren't very introspective and aren't very smart.
So the overall software, which is supposed to display intelligent behavior, would be capable of using these lower-level solvers in an intelligent way, because the overall system would understand what the different solvers are good for. It would have at least a rough notion of how long a solver takes to run, whether it's likely to finish, and what resources it might use, and it would also know what resources the overall Advice Taker has available to it: how much RAM, how much, nowadays, flash storage, how much CPU horsepower, whether it's running on a parallel supercomputer or on a pocket calculator, and so forth. So the Advice Taker would be built in a layered fashion, with a hierarchy of reasoning and problem-solving tasks going all the way down to calls out to these solvers, which probably can't be introspected into. In other words, the overall program can't necessarily look into these solvers to understand exactly how they work, but it has some high-level description of how they work, or can learn how they behave over time through observation. The paper also presents a logic-based language for describing how a system might reason about, for example, going to the airport. Now, I find this paper very inspiring. However, read today, it can seem a little goofy, or anachronistic, or old-fashioned. Part of the reason, I think, is that this example is not very inspiring. I don't know where it came from, and I don't think it's the greatest use of logic ever. McCarthy later did a lot of work on logics in which you can talk about what's changing and so forth, having to do with the frame problem and related issues.
You can maybe see some of his early thinking about these problems in this logical description. However, I don't think it's a particularly exciting example, so a modern reader may look at it and think there's nothing really here. But I would encourage you, if you read this paper, to think in terms of what McCarthy was trying to accomplish, not in terms of his encoding or the particular logic he was using. Focus instead on the level of these five features of the system, or on his overall design, where you'd have a fast system that you can't introspect into, and then a high-level system that can represent aspects of itself and its behavior. I think understanding that is the important part of this paper. One other thing I found interesting, which is definitely different from how most papers work today, is that at the very end there's a record of the discussion from when McCarthy presented the paper. He's catching a little flak there from one of the pioneers of natural language processing, who says Dr. McCarthy's paper belongs in the Journal of Half-Baked Ideas. Part of this, I think, was because McCarthy hadn't even attempted to implement it. If you think about the computing resources available in 1959, this is a pretty optimistic system. The idea that he, or anyone, could build a system like this in 1959 was pretty amazing, given that they were still basically waiting for transistorized computers. Anyway, I think this paper, if read by a modern reader focusing on McCarthy's intent and the high-level concepts, is still inspiring today. And I think it's still true that, as McCarthy points out, computer programs don't have common sense. The programs I use day in, day out don't really learn from me. I can't converse with them.
McCarthy wanted to be able to use some sort of natural language processing, or maybe a stylized language, to converse with the program, and to have the program communicate aspects of its internal state to an outside advisor, which would have been a human in McCarthy's day but could now potentially be another program. So he's got a lot of interesting ideas here. I felt his frustration with the state of software when reading this, and I feel that same frustration today, which I think is probably one reason why people get so excited by something like ChatGPT, which does appear to maybe have some common sense, or to be able to have a conversation with a user. So this is a foundational paper in the history of artificial intelligence. It's full of interesting ideas if you read it with the right mindset, and I think we haven't made that much progress, even today, on what McCarthy was proposing in 1959. Part of what Nada and I are trying to figure out is: why don't programs today have common sense in the way McCarthy was talking about? Why can't a program determine when it's stuck and ask for help, or have a human or other external agent provide heuristics? McCarthy wanted one of these problem-solving advice takers to be able to learn to become good at a new domain, such as playing chess, by asking questions when it got stuck and getting advice from an entity more skilled than it was, with the idea that it could improve over time from its experience. Now, of course, today we have programs that play Go, chess, and board games like that extremely well. There's been a lot of work on reinforcement learning, and there's the work on AlphaZero and so forth.
However, if you think about the staggering amount of computation required to reach superhuman play with one of those systems, I think that's not at all in the spirit of what McCarthy had in mind. McCarthy, I think, was talking about how humans learn: when a human is learning to play chess, someone more experienced can sit there, watch, and give pithy advice, and the beginner can learn in real time with relatively limited communication bandwidth, without playing against themselves 100 billion times or whatever happens with these reinforcement learning systems. There's a relatively small amount of computation, in some sense. Now, I'm not saying that the brain isn't very complicated or doesn't do all sorts of things we don't understand, and I'm not saying the brain isn't capable of lots of computation. But in terms of symbolic manipulation and things like that, we know our brains are relatively limited. So certainly a human beginner doesn't learn to play chess the same way AlphaZero would. A human learning a skill with an expert sitting next to them as a guide learns in a different way, and that's really what McCarthy is talking about. That's what Nada and I are interested in exploring. Now that computers are millions of times more capable, could we take another crack at building one of these systems? Another person who was very interested in building this sort of system was Jon Doyle. Doyle was studying at MIT in the late 70s, working with Gerry Sussman. This was an era when you had Marvin Minsky and Sussman and a very strong AI Lab. The Scheme programming language came out of this environment. People there were also interested in symbolic representation, in how you represent information, and in things like metacircular interpreters.
So here's an area where you see ideas about programming languages, interpreters, and metacircular interpreters combined with symbolic artificial intelligence. It doesn't actually have to be purely symbolic. You could have neural networks helping out, or machine learning or reinforcement learning, interacting with the symbolic systems. But there is some notion of a symbolic system inside what we're talking about here with Doyle and McCarthy. Whether it's based on functional, imperative, or logic programming, there's still some sort of symbolic information and some notion of explainability, which turns out to be important in these ideas. In any case, Doyle was interested in this idea of a self-conscious metacircular interpreter. The idea is that you could take a crack at building McCarthy's Advice Taker if you had a metacircular interpreter that was augmented with information about the programming language it can interpret, its own code, and so forth, and that could reason about all of that. Part of this is also that the system has to be able to control itself and deal with the exponentially growing search space that comes up over and over again when you're doing reasoning. So Doyle proposes a whole bunch of interesting ideas. This is linked to a lot of work going on at the MIT AI Lab at that time by people like Guy Steele and Drew McDermott, in addition to Minsky, Sussman, and a whole bunch of others. I won't get into all of it, but I definitely recommend reading the AI memos from that time period and things like the AMORD interpreter. There were lots of very interesting papers back then.
Doyle talks about the language you might have and gives examples of reasoning, and compilation efficiency turns out to be a major idea. Once again, if you read this thesis proposal, which is from around 1979 or 1980, I think it's important to keep in mind the intention and where Doyle was trying to go, instead of overly criticizing specific examples that maybe aren't very exciting today. He also wrote a PhD thesis on related ideas, A Model for Deliberation, Action, and Introspection, which was published as AI Technical Report 581. Those are really interesting ideas to me. Doyle also talked about what he called a truth maintenance system, which should probably have been called a belief maintenance system. He proposed this architecture, which I think was largely inspired by work by Sussman and others but was formalized more precisely, as truth maintenance systems, or TMSs. This paper, AI Memo 521, is full of interesting ideas, including arguing truth maintenance systems: TMSs that could argue in front of other TMSs, so that a TMS observing an argument between two others could update its own beliefs. So these were systems that could update their own beliefs over time. There's all sorts of interesting work here, including default logics and things like that. One of the things that came out of the work on TMSs was the book Building Problem Solvers, by Ken Forbus and Johan de Kleer. It's basically a book on AI programming patterns and how they can be applied to various problems, and it covers things like the different types of truth maintenance systems.
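To make the idea of a system that updates its own beliefs a little more concrete, here is a deliberately tiny sketch in Python. It is not Doyle's algorithm and not code from Building Problem Solvers: it supports only monotonic justifications (no default reasoning or dependency-directed backtracking) and simply recomputes the believed set from the current premises. The class name and the weather facts are made up for illustration.

```python
class TinyTMS:
    """Toy belief store: premises plus justifications (antecedents -> consequent)."""

    def __init__(self):
        self.premises = set()
        self.justifications = []

    def assume(self, fact):
        self.premises.add(fact)

    def retract(self, fact):
        self.premises.discard(fact)

    def justify(self, consequent, *antecedents):
        # The consequent is believed whenever all antecedents are believed.
        self.justifications.append((frozenset(antecedents), consequent))

    def beliefs(self):
        # Forward-chain from the premises until nothing new can be added.
        believed = set(self.premises)
        changed = True
        while changed:
            changed = False
            for antecedents, consequent in self.justifications:
                if consequent not in believed and antecedents <= believed:
                    believed.add(consequent)
                    changed = True
        return believed


if __name__ == "__main__":
    tms = TinyTMS()
    tms.assume("raining")
    tms.justify("ground-wet", "raining")
    tms.justify("slippery", "ground-wet")
    print(sorted(tms.beliefs()))
    tms.retract("raining")  # dependent beliefs disappear with the premise
    print(sorted(tms.beliefs()))
```

The point of recording justifications rather than bare facts is that retracting one premise automatically withdraws everything that depended on it, and each belief can be traced back to the reasons for holding it.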
An early piece of related work is Gerry Sussman's A Computational Model of Skill Acquisition, where he tries to understand how a program could learn some complex domain the way a human could. This was foundational to a lot of the work that came later by people like Doyle, and it's also very interesting. It describes his HACKER system. I think if you look at HACKER, you can see something like a TMS inside of it. HACKER could learn, basically, how to program, or how to debug programs, and things like that. So this was an early attempt to deal with a problem-solving domain having to do with software development or programming. It was similar in some sense to McCarthy's Advice Taker, although as far as I know there wasn't the notion of interaction that McCarthy had talked about. So you can see something like TMSs coming out of this HACKER approach. Another piece of work that came out of the MIT AI Lab around that time was the notion of a Lisp Programmer's Apprentice, by Charles Rich and Howie Shrobe. There was actually a book published by ACM Press on the Programmer's Apprentice project, which ran for quite a while at MIT. The idea was to build a system that could learn the needs of a software engineer over time. This was an extremely ambitious project when it started in the 70s. It included things like natural language processing, voice recognition, and so forth, and different types of program synthesis at the architectural level, not just at the level of individual functions. So that was another interesting set of ideas going around. Okay, the last set of ideas I'll talk about in this vein is from Doug Lenat, who worked on several important programs. One was called AM.
AM is like an automated mathematician. There was another one called Eurisko. Here's Eurisko: A Program That Learns New Heuristics and Domain Concepts, which is part three of the series. You can find these papers under The Nature of Heuristics. This one is on heuristic-based theory formation; okay, so that's number two. The series was followed up by the paper Why AM and Eurisko Appear to Work. This is also a very interesting line of reasoning. So you have this idea of heuristic-guided systems, systems that can invent their own heuristics, and so forth. And you can see that all of these systems, along with the work by, say, Minsky on the Society of Mind, are similar in that they get at a certain notion of intelligence, which is the ability to get unstuck: the ability to ask for help, to recognize when the system is stuck, to use heuristics to get unstuck, or even to use meta-heuristics to develop new heuristics to get unstuck, or meta-meta-heuristics to develop meta-heuristics, and so on. That, I think, is at the core of understanding all of this work: a notion of intelligence that has to do with getting unstuck. Now, I could talk a lot more about these ideas, but I would like to change gears to what Nada and I have been exploring. We've decided we want to try to understand why something like the Advice Taker doesn't seem to exist today, at least to our knowledge. There are projects like the Soar project, that's S-O-A-R, and other symbolic AI projects that have been running for a long time. But as far as I know, there isn't anything that McCarthy would recognize as his Advice Taker. So the question we have is, why is that? Is it because the basic idea is fundamentally flawed?
Is it because the idea is not well defined enough, so you couldn't tell if it had been built or not? Is it because there's some fundamental limitation, like some notion of self-introspection or self-consciousness that we can't describe, or that runs into the halting problem or something like that? Or is it just because people have abandoned the idea? It's been, let's see, 60-plus years since that paper was published. Computers are millions of times faster and have far more memory available, and there's been lots of progress in algorithms and programming languages and solvers and large language models and so forth. So maybe it's possible today to try to build something like Advice Taker, or, if it's not, to at least try to understand why that's not possible. Now, of course, it could be that the reason Advice Taker hasn't been built is that it would take maybe a thousand people 20 years to build it. That might be the case. Or it may be that something could be built today using off-the-shelf components, the solvers we have, and things like that. By combining things that already exist in a creative way, maybe it would be possible for a small number of people to make a lot of progress. So we're not sure. We want to explore, and we want to explore by trying to build things and figuring out what we find hard and what we find easy. If nothing else, with no other objective, we at least hope that by exploring this space we will encounter interesting things we want to explore more, even if we can't build something like Advice Taker. Now, the line of research that I'm starting from has to do with this language called miniKanren that I've been working on with many people, including Nada and Dan Friedman, Oleg Kiselyov, Michael Ballantyne, Greg Rosenblatt, and many, many others. I can't name everyone.
But a whole bunch of people have worked on this language, going back to Dan Friedman's original implementation of it. And miniKanren has basically turned into a constraint logic language, a pure constraint logic language, for doing things like writing interpreters, type inferencers, and parsers as pure relations. And that allows you to do types of program synthesis. One of the things that came out of that was this paper, A Unified Approach to Solving Seven Programming Problems, where we show that by writing an interpreter for a subset of Scheme as a pure relation, and then combining that with constraint solving and a special type of search that Oleg Kiselyov came up with, it's possible to solve various program synthesis problems in a unified way. And another thing that came out of this is this Barliman tool. With Barliman, we can do little synthesis problems. So, for example, here I want to define append in Scheme. I can say append of the empty list to the empty list is the empty list. All right, so I'm giving an example. And then the system is going to try to fill in basically a template where ,A, ,B, and ,C are logic variables with no value associated with them, representing holes. In this case, the constant function that always returns the empty list has been synthesized, which is correct, but not very interesting. So we can try, say, what happens if we append the list (cat) to the list (dog): we should get back the list (cat dog). Now Barliman has to do a little more work and tries to come up with something a little more complicated, but it is still missing the recursion. So we can try one more call. How about appending (a b c) to (d e) to get (a b c d e)? And hopefully Barliman will be able to synthesize the recursion. In any case, you can see that we're doing a type of example-directed synthesis.
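To make the role of the examples concrete, here is a minimal Python sketch of how each additional input-output example rules out simpler candidate programs. It is not how Barliman works internally (Barliman searches through a relational Scheme interpreter); the three hand-written candidates and the helper name `survivors` are hypothetical and exist only for illustration.

```python
# The three input-output examples given in the talk: ((l, s), expected result).
EXAMPLES = [
    (([], []), []),
    ((["cat"], ["dog"]), ["cat", "dog"]),
    ((["a", "b", "c"], ["d", "e"]), ["a", "b", "c", "d", "e"]),
]

def constant_empty(l, s):
    # Candidate 1: the constant function that always returns the empty list.
    return []

def cons_first(l, s):
    # Candidate 2 (hypothetical): non-recursive, only moves the first element of l.
    return s if not l else [l[0]] + s

def recursive_append(l, s):
    # Candidate 3: recursion on the first argument.
    return s if not l else [l[0]] + recursive_append(l[1:], s)

CANDIDATES = [constant_empty, cons_first, recursive_append]

def survivors(examples):
    """Return the names of the candidates consistent with every given example."""
    return [
        f.__name__
        for f in CANDIDATES
        if all(f(*args) == expected for args, expected in examples)
    ]

for n in range(1, len(EXAMPLES) + 1):
    print(n, "example(s):", survivors(EXAMPLES[:n]))
```

With one example all three candidates survive, with two the constant function is eliminated, and only the recursive definition is consistent with all three, which mirrors the progression in the demo.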
And under the hood, Barliman uses constraint solving, unification, things like that. It also has type constraints for numbers and symbols, and does a type of complete interleaving search. And sometimes Barliman gets stuck. So Barliman is not an example of smart software in McCarthy's sense. Barliman will often get stuck. However, in certain cases, at least when there's enough context filled in, Barliman can be relatively fast. Right now, it's taking Barliman a while, so let's just fill in a little more of the template here. I'll say that we're defining a function called append, which takes two arguments, call them l and s, and see if this speeds it up any. A human can look at these examples and figure out things like the fact that the name of the function should be append. A human could also notice that all three of these examples include two arguments. Now, in this case, they're both lists, although for append in Scheme in general the second argument doesn't have to be a list, but that might help. Another thing we can do to help, if we want to, is say: hey, because this appears to be a recursive function, we can give Barliman a little more help and say that, since we have a list in the first position, we're going to guess that we check whether the list is empty; otherwise, we're going to do one of two recursive calls. So that might help. It might need more help. How much help does it need? Let's see. Okay. So, in this case, it figured it out. I think the fact that I'm recording a video right now is slowing down the processor enough, and we're using enough memory, that Barliman is having a little more trouble than usual. There are some tricks we can use to give Barliman hints. But you could see part of it was that I was able to fill out some of the structure.
So I could guess. Even a beginning Scheme programmer is someone we would teach certain heuristics to. So, for example: all right, given these examples, we know we're defining a function called append, and we can guess at least that the function takes two arguments. It might take more than two arguments, or it might take a variable number of arguments, so maybe it takes zero or more arguments. In fact, the full Scheme append can take any number of lists. In this case, we could do a two-argument synthesis. And if we guess that append should be recursive because we have lists of different lengths, and if we also guess that we're recurring on the first argument, then we can probably figure out a lot of the structure of the program automatically. And then we might not know some things, like what the base case is. In this case, it can still synthesize the program, even without knowing what the base case is. So we could create a little bit of a skeleton of a program just from looking at these examples and following a few heuristics. And then Barliman, even though it's dealing with this exponential search, could get enough of a hint that it can finish synthesizing the rest of the program. Okay, so that's an example of how Barliman, which isn't very smart, could be called from a smarter program that can do introspection on the examples. And the smarter program could then provide a template or skeleton or sketch of the program to be synthesized, based on what it observes from things like the tests or some sort of specification provided maybe by a human or by another computer program. So this idea of using Barliman basically as an external solver is part of what we're trying to explore as well. I should also mention that the software we're developing might also benefit from some of the ideas in Chris Hanson and Jerry Sussman's Software Design for Flexibility.
And this book is, in some ways, the intellectual successor to Structure and Interpretation of Computer Programs by Abelson and Sussman, but it can also be thought of as lessons taken from artificial intelligence programming at MIT and in the corporate world as well, distilled so you can use them in various other software projects. Jerry Sussman gave a nice talk in 2022 at the Scheme Workshop, which is available on YouTube, where he talks about one of these patterns, called layering, where you can add things like meta-information you want to keep track of in an intelligent piece of software through this layering technique. So that's worth looking at. Okay, so let's look at BAT, which is the Barliman Advice Taker. Now, if we ever release BAT, we're probably going to release it under an MIT license. But this BAT project that Nada and I are working on has not yet been released, and we're not sure if or when we will release it. Right now, it's in very early stages, and we're just exploring some of the ideas that I've been talking about. I'll walk you through some of the code and some of the examples to show where we're trying to go. But it is the case that we're still very early in the development of the software, and it's quite messy. A number of the files here need to be removed or cleaned up. We need to have documentation and more tests and things like that. So it'll just be a while before we'd be in a position where we want to release it, and we'd also want it to be more capable.
But the other reason I, at least, am hesitant to release it right now is that the ideas in the papers I've been showing have existed for a long time, and anyone who's smart and creative, thinks hard about those ideas, and is inspired by them could use their own approach to try to build something like Advice Taker, or something like what Doyle was envisioning with a metacircular reflexive interpreter. The fact that we're building something that uses Scheme, Chez Scheme, miniKanren, Barliman, a Scheme interpreter written as a relation in miniKanren, and all those sorts of things doesn't mean that that's the only way to approach what McCarthy or Doyle was thinking of. It doesn't mean that we're on the right track at all. In fact, we could be totally on the wrong track. So I'm a little hesitant to release what we have, just because people might decide to play around with it and use it as their starting point for exploration, rather than looking at the problem fresh, reading those papers, thinking really hard, and then using the techniques they're familiar with. So it may be better for people who are interested in this area to work independently a little bit, and then we could exchange notes, or maybe hold a workshop or something to talk about ideas, whereas just sharing code may actually not be beneficial. Anyway, if you have thoughts on that, let me know. I think it is a little double-edged to share this code right now, given that I don't think we really understand the special sauce that would be required yet. And so if you start from this code, you may be heading down the wrong path. Okay. So I am going to load BAT in Chez Scheme. All right. Okay, so that seemed to be running some tests.
Okay, you can see it's applying heuristics and it's calling Barliman and things like that. So let's look at this BAT software a little bit, and maybe talk about the organization of the current version of BAT. BAT stands for Barliman Advice Taker. The original idea was that we were going to build an advice taker program oriented around Barliman, which was the program I just showed you. Now, Barliman is not very smart. It's not capable of recognizing when it's stuck. It doesn't know anything about its resource utilization. It can't ask for help. It can't explain its own internal state. So you could think of Barliman in a sense as an opaque solver that might be called by an introspective system. Other solvers that might be called from an introspective system would be things like SAT solvers or SMT solvers, maybe something like Z3, or a solver written using answer set programming, or something neural-based, reinforcement-learning-based, or statistics-based, as long as the answer could be verified in the end or the system could reason about its confidence in the answer. So you have this idea of a solver, something that's fast but not introspective, that can be called from the advice-taking program. BAT is the advice-taking program, Barliman is one solver that it could use, and over time we may add additional solvers. McCarthy also had the idea of a problem domain, because Advice Taker is supposed to be a problem solver with common sense. That means you're trying to solve problems in some domain. Whether that's playing a good game of chess or trying to write a program or whatever, there has to be some problem domain, or maybe an advice taker could handle multiple problem domains. Even all the way back in 1959, McCarthy was looking at generating programs as a problem domain. Jon Doyle also looked at this domain.
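The split between a fast, opaque solver and an introspective caller can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Report`, `run_with_budget`, and `brute_force_sqrt` are hypothetical and do not come from BAT, the toy solver stands in for something like Barliman or an SMT solver, and a step budget stands in for real resource tracking.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional

@dataclass
class Report:
    status: str                      # "solved", "rejected", "stuck", or "exhausted"
    steps_used: int
    answer: Optional[object] = None

def run_with_budget(solver: Iterator, budget: int,
                    verify: Callable[[object], bool]) -> Report:
    """Drive an opaque solver step by step, tracking how much budget it uses.

    The solver yields None while it is still searching and a non-None
    value when it has a candidate answer. The caller, not the solver,
    notices when the budget runs out and checks the answer independently.
    """
    steps = 0
    for result in solver:
        steps += 1
        if result is not None:
            status = "solved" if verify(result) else "rejected"
            return Report(status, steps, result)
        if steps >= budget:
            return Report("stuck", steps)   # a point where help could be requested
    return Report("exhausted", steps)       # the solver gave up on its own

def brute_force_sqrt(n: int) -> Iterator[Optional[int]]:
    # Toy opaque solver: tries k = 0, 1, 2, ... until k * k == n.
    k = 0
    while True:
        if k * k == n:
            yield k
            return
        yield None
        k += 1

ok = run_with_budget(brute_force_sqrt(49), budget=100, verify=lambda k: k * k == 49)
stuck = run_with_budget(brute_force_sqrt(49), budget=5, verify=lambda k: k * k == 49)
print(ok)
print(stuck)
```

The "stuck" report is the interesting case: it is the place where an advice taker could explain its state and ask for a heuristic instead of silently continuing to search.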
In fact, many of the people in this area of AI, as you'll see looking at those papers I showed you, look at programming, reasoning about programs, generating programs, and fixing or repairing programs as a problem domain. And I think that's natural for two reasons. One is that anyone who's exploring these areas probably is a pretty good programmer, or at least has had to go through the process of learning how to program, and knows either how to teach programming or how to learn programming, having gone through that experience. The other reason is that the advice taker itself is a program; in this case, BAT is written in Scheme and miniKanren. If you could build a system that can reason about software, generate software, and repair software, then there's at least the potential of applying the advice taker to itself and therefore having the system improve itself. And I think this is at the heart of Doyle's idea of this introspective, or reflexive, metacircular interpreter: the interpreter for some problem-solving domain could understand its own code, at least to some extent, have access to its own code, and maybe know the semantics of its own code. So you can imagine a problem-solving interpreter that had access to the operational semantics of its own interpreter, or denotational semantics, or axiomatic semantics, things like that: formal representations of itself. Or one that could do abstract interpretation of programs, things like that. So that would be an example of an interpreter that had access to some smarts about its own behavior or capabilities. In addition, you could also have the system have access to information about the hardware it's running on, how much memory it has available, the processors, things like that. And furthermore, you could have this information organized in a way, along with other information about the problem domain.
So if the problem domain is chess, maybe there are concepts related to playing chess. One way to organize this information is in an ontology, which is often represented as a tree or a graph or a forest; often trees or forests of information. So it's hierarchical information, and you can often think of this as sort of an object-oriented type of thing where you have parent-child relationships. You can represent all sorts of information, including heuristics and meta-heuristics, in an ontology. So in BAT, we have this idea of an ontology. We have concepts, and you can see here we have a syntax-rules macro. And we have instances of concepts. So we have a concept and its fields, and we have a notion of an instance and the types of an instance and so forth. And here are a bunch of helpers. Because we want to be as reflective or metacircular as possible, the notion of a concept is itself a concept. The notion of an ontology is also a concept, and an instance is also a concept. So this is a little Smalltalk-y in a way, if you want to think of it that way. We also have the notion of a solver, so we have various types of solvers. And we have the notion of history, and the delta or change between two different states, and things like that. So we have a whole bunch of different concepts here. Initially we had an empty list of concepts. If I have the system list the concepts that it currently knows about, you can see that there are concepts like tail position, which is a grammatical property of software. And there are also concepts like advice, or the fact that there's a user, or BAT itself.
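A reflective ontology of this kind can be sketched briefly in Python. BAT itself is written in Scheme with a syntax-rules macro, and its actual representation is not shown in the talk, so everything here (the `Ontology` class, its methods, and the field names) is a hypothetical illustration of the one idea that the notion of a concept is itself a concept.

```python
class Ontology:
    """A tiny parent-child ontology in which "concept" is itself a concept."""

    def __init__(self):
        self._concepts = {}
        # Bootstrap: "concept" is the root and is recorded as an instance of itself.
        self.define("concept", parent=None)
        for name in ("ontology", "instance", "solver", "heuristic"):
            self.define(name)

    def define(self, name, parent="concept", **fields):
        if parent is not None and parent not in self._concepts:
            raise KeyError(f"unknown parent concept: {parent}")
        self._concepts[name] = {"instance-of": "concept", "parent": parent, **fields}

    def get(self, name):
        return self._concepts[name]

    def names(self):
        return sorted(self._concepts)

    def is_a(self, name, ancestor):
        # Walk up the parent chain until we hit the ancestor or run out of parents.
        while name is not None:
            if name == ancestor:
                return True
            name = self._concepts[name]["parent"]
        return False

onto = Ontology()
onto.define("meta-heuristic", parent="heuristic")
onto.define("tail-position", description="a grammatical property of software")
onto.define("bat", description="the advice taker itself")
print(onto.names())
print(onto.is_a("meta-heuristic", "concept"))
```

Because the system's own vocabulary ("concept", "ontology", "instance") lives in the same table as domain concepts, the same lookup code can answer questions about the system and about the problem domain.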
So BAT has a concept referring to itself, and it also has explicit notions of history, using a blackboard architecture. I won't get into what a blackboard architecture is, but that was a traditional 1980s-style AI system approach where you could write different things to memory and that would trigger certain types of actions. There are also notions of resources, and things like the arity of a function: whether the function is variadic and takes any number of arguments, or has a fixed arity where we know exactly how many arguments there are, or takes at least a certain number of arguments where we don't know what the rest are, and things like that. And so there is the notion of arity itself, the notion of a program template or a sketch, and the notion of variables and expressions. So we're getting into notions of the programming language that are represented, and the notion of a synthesis problem or a synthesis solution. All of these sorts of things, including heuristics and meta-heuristics, are important to be able to represent in the system. So those are ideas or concepts represented in an ontology. And the most important part is that the system can represent things about itself. It has a representation of itself, a representation of a user or a conversation, those sorts of things. In addition to an ontology, there's also a notion of communication between the advice taker and a user or an external entity. Currently, we have sort of a high-level sketch of how we might imagine a conversation. We don't have a working implementation yet, but you can see some examples of what a conversation with BAT may be. If BAT gets stuck trying to synthesize something like factorial, there may be a suggestion to try something like using an accumulator, and so to synthesize an accumulator version called factorial AC. And then there may be a sketch there that the system is able to come up with.
And then you can imagine this sort of conversation going backwards and forwards between some external entity and BAT itself. At this point, we're still trying to figure out how we would represent that communication. In McCarthy's Advice Taker paper, he talks about sort of a stylized language that could be used, and we're still figuring that one out. We also have the notion of an arity. That was sort of the first thing we wanted to do, similar to when I tried, for Barliman, to figure out what the arity is for the append function. That turns out to speed up the Barliman synthesis quite a lot, usually. So you can see that we have, in this case, the notion of trying to synthesize append, and we have heuristics that have to do with arity. Here, we have a notion of guessing arity, and then we have various helpers to help us with guessing the arity of a function. You can see here is a heuristic: find an arity sketch from input-output examples. And here you can see we have an instance in our ontology. So we have a heuristic, the name of the heuristic, when it's applicable, how you apply it, and so forth. There is also a heuristic having to do with Barliman itself. So there is a Barliman heuristic: if it's possible to invoke Barliman, that is a heuristic that's available to the Barliman Advice Taker. So that's one of the heuristics, and we have code here to transform problems into something that Barliman can handle. Let's see what else. Engines are something that we're not currently using, but that's a way to deal with timeouts. Let's see, we have append examples here. So we have input-output examples, similar to what I was showing with Barliman.
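The arity heuristic described above can be illustrated with a short Python sketch. BAT's actual heuristic is written in Scheme and is not shown in the talk, so the function names `guess_arity` and `make_sketch`, the parameter names `x1`, `x2`, and the exact template text are all assumptions made for illustration; only the idea (infer the arity from the examples, then emit a partially filled template with a hole for the body) comes from the talk.

```python
# Input-output examples for append: (argument tuple, expected output).
APPEND_EXAMPLES = [
    (([], []), []),
    ((["cat"], ["dog"]), ["cat", "dog"]),
    ((["a", "b", "c"], ["d", "e"]), ["a", "b", "c", "d", "e"]),
]

def guess_arity(examples):
    """Guess ("fixed", n) if every example uses n arguments, else ("at-least", min)."""
    counts = {len(args) for args, _ in examples}
    if not counts:
        raise ValueError("need at least one example")
    if len(counts) == 1:
        return ("fixed", counts.pop())
    return ("at-least", min(counts))

def make_sketch(name, arity):
    """Build a Scheme-style template string whose body is left as a hole (,body)."""
    kind, n = arity
    params = " ".join(f"x{i}" for i in range(1, n + 1))
    if kind == "fixed":
        formals = f"({params})"
    elif n == 0:
        formals = "rest"                      # (lambda rest ...) accepts any number
    else:
        formals = f"({params} . rest)"        # n required arguments, then a rest list
    return f"(define {name} (lambda {formals} ,body))"

arity = guess_arity(APPEND_EXAMPLES)
print(arity)
print(make_sketch("append", arity))
```

A fixed-arity guess is only a guess: as noted in the talk, the full Scheme append accepts any number of lists, so a system using this heuristic would need to be able to retract it.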
And then we have more sophisticated examples where we might have logic variables or holes even in the input-output examples themselves, and so forth. And you can see that there are different types of sketches that might be guessed. In fact, if the system guesses that the base case is not recursive, we can use a miniKanren absento constraint saying that the name append can't appear in the body of the base case, if we think that there's a base case, and also that there's no letrec, no recursive definitions, in the base case. So we can use some of the constraints in miniKanren and Barliman to enforce certain notions, like the idea that we have a base case. And we can imagine sort of the internal state of Barliman, how it would work through different aspects of this program as it's trying to interpret it. Part of the idea is to be able to simulate what a student learning how to program in a language like Scheme might think. And so there are heuristics that we teach to beginning Scheme programmers. For example, Dan Friedman always teaches that if you write a recursive function and there's no question asked about a certain argument, like whether it is null or a pair, then that argument will be passed along in any recursions without being changed. So that would be an example of a heuristic that we could add to BAT. Let's see. We also have information about things like tail recursion, which we're still working on, and we have some heuristics for tail recursion that we're working on as well. And then we have some code that can take one of our templates and turn it into a miniKanren example that Barliman can handle. We also have been exploring notions of types that we might find useful in the future. And, yeah, we also have some notes and motivating examples that we care about.
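The base-case restriction can be illustrated with a small Python check. Note the difference from the real thing: miniKanren's absento is a constraint that also applies to terms that are not yet fully known, whereas this sketch only checks fully known expression trees. The function names and the nested-list encoding of Scheme expressions are assumptions made for illustration.

```python
def mentions(expr, symbol):
    """True if symbol occurs anywhere in a nested-list expression tree."""
    if isinstance(expr, (list, tuple)):
        return any(mentions(sub, symbol) for sub in expr)
    return expr == symbol

def valid_base_case(body, fn_name, forbidden=("letrec",)):
    """A base-case body may not mention the function's own name or define recursion."""
    return not any(mentions(body, sym) for sym in (fn_name, *forbidden))

# Base case of append: just return the second argument.
base = "s"
# Recursive case: (cons (car l) (append (cdr l) s))
recursive = ["cons", ["car", "l"], ["append", ["cdr", "l"], "s"]]
# A body that smuggles in recursion through letrec.
sneaky = ["letrec", [["loop", ["lambda", ["x"], ["loop", "x"]]]], ["loop", "l"]]

print(valid_base_case(base, "append"))
print(valid_base_case(recursive, "append"))
print(valid_base_case(sneaky, "append"))
```

Used as a filter on candidate bodies, a check like this prunes every candidate base case that would recurse, which is the effect the absento constraint has on the search space.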
And so the high-level questions that we have are: what are the sorts of things that we want a system like BAT to be able to reason about explicitly, and what concepts have to go in the ontology? What does BAT need to know about itself? What does BAT need to know about potential resource usage? How is BAT going to communicate with external entities and concisely represent its own internal state, and also recognize when it's stuck, ask for heuristics or meta-heuristics, and have that conversation? So those are things that we're thinking about. If you find this interesting and you'd like to talk to us, please drop me an email, and maybe we can do a call. I also encourage you to try hacking on something like Advice Taker yourself. I think it's a very interesting set of problems. At a minimum, I think it's worth reading these papers and trying to understand where a lot of these 1980s and late-70s systems were headed. And it is worth rethinking, in the modern day, whether or not we could take another shot at it. And if you think that everything now is neural or machine learning, just remember that neural networks had multiple times when they were in vogue and then went out of vogue and so forth. So maybe it's time that symbolic systems, or at least neuro-symbolic combinations of systems, are revisited in the spirit of things like Advice Taker and McCarthy and Doyle and Minsky and Sussman and so forth. Thank you very much. |
GNU Guix and Open science, a crush? |
Okay, so the aim is to control the sources of variation. From a scientific point of view, when you are publishing a paper, an independent observer should observe the same results and reach the same conclusion, and this observation must be sustainable, so it doesn't depend on when you observe it, and maybe not on where either. So it's also collective. So the whole question in this scientific framework is how we can redo it later and elsewhere. I'm doing something on my laptop, and another person will try to redo the same thing something like two years later. So how can we redo later and elsewhere what I've done here and today? This is a big question and the challenge in reproducible research and in science, and I think Guix answers this question. So what does a computational environment mean? For example, Alice says: using this data, you need that C file and GCC 11 to run my analysis. Okay, but what is the source code of GCC 11? And GCC 11 requires some tools for building, and these tools also require a runtime, and this is recursive. Answering all these questions is controlling the sources of variation, so you are controlling your computational environment. This question is not new in computing, and we have solutions. The solutions are package managers and so on. We have package managers like APT, but there are some issues; for example, it's difficult to have several versions, and difficult to roll back. We have environment managers like Conda, pip, and modulefiles, but there is an issue with transparency: who knows what is inside a pip install torch? There are modulefiles, but how are they maintained, and do you use them both on your laptop and on another machine? There is the Dockerfile, but Docker and Dockerfiles are based on the previous solutions, so their drawbacks apply too. Guix, in fact, is all these solutions plugged together, and it fixes the annoyances of each. This is what Guix is.
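The recursive nature of "what does GCC 11 need?" can be made concrete with a short Python sketch that computes everything reachable from one package in a dependency graph. The graph below is a toy, hand-written approximation with hypothetical package names; it is not Guix's real package graph, and Guix itself does not expose its graph this way.

```python
# Toy dependency graph: package -> direct build/run-time inputs (illustrative only).
TOY_GRAPH = {
    "my-analysis": ["gcc"],
    "gcc": ["binutils", "mpc"],
    "mpc": ["mpfr", "gmp"],
    "mpfr": ["gmp"],
    "binutils": [],
    "gmp": [],
}

def closure(package, graph):
    """Return the set of all packages reachable from `package`, including itself."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen

print(sorted(closure("my-analysis", TOY_GRAPH)))
```

Controlling the computational environment means pinning every node in this closure, not just the package named in the paper; in a real distribution the closure runs to hundreds or thousands of nodes.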
So Guix is a package manager like APT, etc. It is transactional and declarative, and it produces shareable packs like Docker images. You can produce virtual machines and you can deploy them, for example like Ansible or Packer, and you can build a whole Linux distribution, and it's also a Scheme library. Guix is really awesome. Okay, we have 20 minutes, so I won't speak about all that because it's too much. I'll just explain how Guix is helping me in my daily job. You can run Guix on top of any Linux distribution, so it's really easy to try. If you haven't, you should. So Guix is just another package manager. You can install and remove packages without any privileges; this is more than APT from Debian offers, for example. You have declarative management, which means that you can have a configuration file and so on. It is transactional, so you don't have a broken state, you can roll back, and so on. And you have binary substitutes, so you are not compiling everything from scratch every time. Okay, this is what a classical package manager does, but you have more. You have isolated environments on the fly: you can create an isolated environment with Linux namespaces on the fly, and this is really helpful for checking the dependencies of your scientific analysis. You can also use Guix to produce images like Docker images, but without using the Dockerfile machinery. So you are saying, okay, nice, but the issue with science is reproducibility. You have a package manager that has all these features, but why is it reproducible? So the answer is about version.
So, for example, Alice says: I use GCC at version 11. So you have the GCC toolchain, but you need a linker like ld, so you need binutils. And GCC is a compiler, but the compiler needs MPC, and MPC needs MPFR. And the big question is: is it the same GCC if we replace this MPFR 4.1 by MPFR 4.0? Is it the same GCC or not? This is the issue that we can have when we are running analyses: we are not controlling this graph in enough detail, and maybe a difference in the version of one of these packages has a big influence on GCC at the end. We cannot know. Okay, so in Guix, the state of Guix is provided by guix describe, and this fixes the graph: it fixes the complete collection of packages and Guix itself. In fact, each node specifies a recipe: the upstream source code, but also all the tools that you need to build the package, so the compiler, the build automation (CMake, make, etc.), the configuration flags and so on, and the dependencies that you need to do that. So you have a kind of recursive graph, and this graph can be really, really huge. For example, for SciPy it's more than 1000 nodes, so it's not manageable by hand, and Guix provides you with fine control over this graph. So, collaboration in action: this is how Guix is helping me concretely every day. I write a manifest. A manifest is just a file where I describe my tools, for example Python, SciPy, NumPy, and so on, and I create my environment with guix shell. So this is nice. Then I can pin this graph: I run guix describe and I pin the graph, so now I have another file, state-alice, which pins this graph. But collaboration is about sharing the computational environment, so somehow it is about sharing one specific graph. So in fact, if I share these two files, the manifest, which describes the list of tools, and state-alice, which describes the state of the graph, then Blake, my collaborator, can spawn exactly the same computational
environment using guix time-machine. This means that Blake and Alice are running exactly the same computational environment, and if Carol also has these two files, Carol can run the exact same environment as Blake and Alice. So here we have something that is really easy. On my laptop, I specify the tools that I need, Python, etc., and I specify the state I'm running on the laptop. Then I deploy on the cluster: I just transfer the two files and I run this guix time-machine command, and on the cluster I have the exact same environment that I have on my laptop. So there is no question about what can be wrong between my laptop and the cluster, because it's exactly the same computational environment. This is a game changer when you are running analyses on different machines, or when your colleagues are running them in different places. So guix time-machine provides a way to jump to different states temporarily. If you imagine a timeline, Alice, Blake, and Carol are not at the same state in time, but they can jump artificially and temporarily to the same point in time to have the exact same computational environment. This is not possible with any other tool, except maybe Nix, and so this is kind of game-changing for me. To have that working very well, you need to preserve all the source code, and for that you need Software Heritage, which I'll talk about just after. You also need backward compatibility of the Linux kernel, and you need some compatibility of the hardware. For example, in five years, if the hardware is no longer x86, maybe we cannot jump back in time; but if the hardware is compatible, this works. The question is the size of the window in which these three conditions are satisfied. Right now we satisfy these conditions, so what is the size of this window? From my point of view, Guix is running a quasi-unique experiment, I mean a real-world
experiment at large scale since 2019, so we will see what this size is. Maybe in five years we will try to redo something from now and we will fail because of something, but we have this mechanism that is able to jump to different points in time. So, Software Heritage is a long-term source code archive. It collects and preserves all the source code of the world, I mean the open source code of the world: all of GitHub, GitLab, Debian, and so on. And Guix is able to save the source code referenced by Guix package definitions. For example, Guix uses the source code of GCC coming from the internet, but you can save it directly to Software Heritage, along with the package definition itself. What is really nice is that Guix is able to fall back if the source disappears. For example, if tomorrow GitHub is down for whatever reason, all the papers published with a line like "my script is on GitHub" and "the package is on GitHub" break. There are many popular platforms that went down, like Gitorious, I don't remember the name, and everything broke. With this mechanism it doesn't break. So I have five minutes, right? Okay. So Guix is able to pack everything. You can produce a Docker image with Guix: using the manifest, you run guix pack, and guix pack generates the Docker image. Then if Blake doesn't run Guix, she can run Docker, and the binaries inside the Docker image are exactly the same as the binaries inside the computational environment of Alice. And because of the time machine, this is reproducible over time, and this is also kind of game-changing in science. In fact, a container is just an archive format, and Guix is agnostic about this container format, so you can generate a tarball, Docker, Singularity; there is an experimental Debian binary package, and yesterday evening there was a patch about supporting RPM packages. So this is flexible for every context. The key point is to fully control the binary going
inside the container, and Guix does this job. So this way it's a factory for creating images. So Guix is helping me because there are three commands and two files, so this is really easy to explain to, for example, medical doctors and so on. And it's a packing factory, so I can deploy on infrastructure where Guix is not running, for sharing computational environments. And for me these are two keys for open science, research and so on. And if you want more information, there is this group, Guix-HPC, but like many things in Guix, the name is not good, because it's more like Guix for science than something specific to HPC. And it's running in production, so don't be afraid to install Guix. I mean, there are all these clusters already running Guix, plus all the laptops and desktops, so it's now in production. So yeah, this is the picture of Guix and science that I would like to have in the future. So thanks. [Audience question, partly inaudible:] Scientists can use, for example, different CPUs or different architectures. How much can optimizations for a particular CPU have an impact on getting different results? Can it ruin the whole idea? Do you have an idea of how much impact? So I have to repeat the question. So the question is: in an HPC context, you have performance that depends on the architecture, and you can have micro-optimizations for a specific architecture, so how does Guix deal with that, for the first question, and how do we do that for reproducibility, is the second question. From my understanding these are the two questions. I don't know the difference in performance from the micro-optimizations; it is the job of the researcher in the field to say that this micro-optimization provides this performance improvement. What I can say is what Guix does to
manage these micro-optimizations. So Ludo is giving a talk in the HPC devroom about the tune package transformation. What I can ask, when we are speaking about reproducibility, is whether these micro-optimizations fit reproducibility and the scientific method. So I don't have the answer, but at some point I think: is it worth having a micro-optimization, I mean gaining something like a couple of percent of improvement, if we lose all the reproducibility, if we lose the way to check that the computation is correct? So I don't know. I mean, this is a collective question for researchers in general. So there is a question. So the question is: is it enough or not to have all this machinery to have reproducible science? So the answer is, I don't know, because to have the answer everybody would have to run this to be sure whether that is the case or not. So I don't know, and I don't remember what I wanted to say about that. I mean, I don't show it here, but there is a paper published with this method, and yeah, there is more reproducibility than with the others. But at some point the software is just one part of the big picture of the reproducibility issue in science, and in fact it is a collective practice. Because, for example, I'm trying to reproduce a paper from this August, and some data are missing, the scripts are missing, the packages that were used are missing, so I cannot reproduce it, and this is not about Guix, it's just because the publisher didn't do a good job. Thanks, everybody. |
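The "three commands and two files" workflow mentioned in the talk can be sketched as follows. This is only an illustration that assembles the command lines without running them; the file names `manifest.scm` and `channels.scm` and the helper function names are assumptions, not something stated by the speaker.

```python
import shlex

# Hypothetical file names: a manifest (which packages) and a channels file
# (which Guix revision), i.e. the "two files" from the talk.
MANIFEST = "manifest.scm"
CHANNELS = "channels.scm"


def shell_cmd(manifest=MANIFEST):
    """Command line that enters the environment described by the manifest."""
    return ["guix", "shell", "-m", manifest]


def pack_cmd(manifest=MANIFEST, fmt="docker"):
    """Command line that packs the same environment as an image or archive."""
    return ["guix", "pack", "-f", fmt, "-m", manifest]


def time_machine_cmd(inner, channels=CHANNELS):
    """Wrap another guix command so it runs with the pinned Guix revision."""
    # inner[0] is "guix"; time-machine takes the sub-command after "--".
    return ["guix", "time-machine", "-C", channels, "--"] + inner[1:]


if __name__ == "__main__":
    for cmd in (time_machine_cmd(shell_cmd()), time_machine_cmd(pack_cmd())):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch usable on machines where Guix is not installed.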
How Replicant, a 100% free software Android distribution, uses (or doesn't use) Guix |
There was a lot of talk about Guix used in scientific environments for doing reproducible research, like in the Guix conference, the Guix birthday and even here before my talk. So this will be a completely different usage of Guix. So I will first present what Replicant is, a fully free Android distribution, and how we manage to use Guix, or fail to use it, basically. So yes, there are a lot of issues with smartphones. For instance, people still have to use them to be reachable, to do mobile computing, in some countries mobile banking and so on. And they are also often cheaper than laptops, so a lot of people use them. But yeah, the issue is that to produce a smartphone you have to mine metals and so on, so it's not good for the planet. The networks also know your location, so it spies on you, and the smartphones run non-free software. So it's not even clear whether smartphones can really empower people, because you can take pictures and things like that, so you can get proof of stuff, but you also get spying. So not using smartphones is not really a solution, because a lot of people use them: political activists, journalists and so on, even indigenous people for their security. So yeah, destroying smartphones and factories is not a good option, and it requires people willing to do that and people to support them. So the option to fix all that would be free software, so we can for instance make smartphones last longer and block some of the spying, but not the fact that the network knows your location and so on. So we can basically limit some of the risks, and there are also other ways to help than programming, like political pressure and funding, doing work to get funds to pay people to do stuff. So we want 100% free software that is usable and, like I said, limits the damage. So historically we worked to fork LineageOS and make sure we don't ship any non-free software.
So the idea is to have a 100% free software Android distribution. So yeah, like Guix basically. So this is a smartphone; you have a modem here. It communicates with the tower and also has access to your SIM card, and on this one it's also connected to the main system on a chip. So the issue is that you have a very high DPI on smartphones: the screen is very small with a high resolution. So you also cannot use it with very big fingers, and usually you have no hardware keyboard. So this is also why we are trying to use Android on smartphones, because it already works. So yeah. So you have several hardware options you can choose from; for instance, this one could run with Guix: the PinePhone. As for the status, you have nothing ideal yet. A lot of phones don't boot with free software; in others the modem is isolated from the system on a chip, so it can't access its RAM, take control of the operating system and so on. And hardware usability on GNU/Linux really depends. Power consumption, for instance, is probably good enough on the Librem 5, and on the PinePhone it's not there yet, but it could be improved. So yeah. So we have some fully free distributions. PureOS works right now; it supports one device. Replicant 6 is really old; it's still based on Android 6, so it has security issues. Guix could probably work, but it's probably missing a lot of packages. We probably need to have GNOME working on smartphones and so on to be able to use it, and other fully free distributions also lack packages. But you also need, for instance, packages to support the modem, and to integrate all that in the distribution. For the PinePhone, as I understand it, there was a talk at FOSDEM last year, and it's doable to package it in a distribution relatively easily.
So as I said, Replicant is a fully free software Android distribution; it supports some old devices, and we are still trying to get it working on Android 11. So yeah, I probably already explained why it's based on Android. So Android is far from ideal because we lack any package manager during builds. So for instance, in a distribution like Guix you have an abstraction that can compile software that builds with Autotools, CMake and so on. In Android you have everything downloaded and put in the same tree, so this is a big issue, because we can't easily mix GNU/Linux with Android. In Android you have Android.mk or Android.bp, it's their build system, and everything is integrated together, so you don't have this abstraction. And also the code is less fun to work with in Android, and you have a huge amount of code that's not very well known by, or is detached from, the free software community. And we also have issues with applications. For instance, F-Droid is really great, it has a strict licensing policy, but in practice it requires the non-free Android SDK to build applications. So that's maybe something that Guix could solve one day. So the Android architecture is really meant for selling phones. So basically at the beginning you had a modified Linux kernel, for instance you could have a completely different audio driver, and it talks to a hardware library that makes the abstraction between the kernel and the Android framework, so an Android application talks to the Android framework and can play audio, for instance. So still today there are a lot of phones that are not supported by upstream Linux, so there is still a need for it, it still works like that, and the situation is still not that good.
So the security model is also different: Android sandboxes applications, and GNU/Linux is much simpler for users. In Android you often have malware in Google Play, which is removed by Google obviously, but the system is meant not to trust the software, while GNU/Linux basically trusts the software, so it's much easier: you have root and so on. So yeah, maybe I will skip that. Yeah, no, maybe not. So basically in Replicant we try to replace non-free code, and now we are trying to support devices with an upstream kernel, but yeah, it takes some time basically. So now I will finally get to the Guix-specific stuff, now that I've got all the background done. Can we run Guix on top of Replicant? So Replicant 6 is a very old distribution, so I tried and it doesn't work. We can use, for instance, guix pack to try to deploy binaries, but you need a kernel matching the kernel headers used at build time for the applications to run. And changing the kernel headers in Guix, besides requiring everything to be recompiled, I tried it and it didn't work for me. But we can run it on recent Android versions, so we simply use guix pack and we can deploy command-line applications to Android. So it's really easy: it just cross-compiles the application. So you still have a lot of limitations with cross-compiling, because some build systems don't support cross-compiling. For instance, you can't run GNOME apps, but you cannot even compile them, because they use build systems that are not supported for cross-compiling. And this is also useful if we want to support smartphones in Guix, because if we manage to cross-compile software, we could build images that can then be downloaded from the Guix website, like for the Pine64, no, the Pinebook: everything is cross-compiled because the builders run on x86 machines. So yeah, and in Replicant we would like to support using Guix, but it's really too big to ship, so instead we will try to implement the missing dependencies and maybe add an
installer based on guix-install.sh; this is the standard way to install Guix on an existing distribution. So this doesn't work basically: we cannot build Android packages yet in Guix. So the issue is we would need a real Android NDK to do that, because Android uses a different libc, which is Bionic. So maybe if people want to do it, a new target for cross-compilation would need to be added, but this is probably a lot of work. But yeah, the issue is also in Android applications, and maybe in a lot of modern programming languages: when you install something with pip install, it will install a lot of dependencies, and basically you don't really know what's in there. For JavaScript, for instance, it's npm; you can have a lot of dependencies and you really lose control. So if Guix were able to build Android applications, it would be a way to fix the SDK issue that F-Droid is having, and also to really know what's inside the application and control it. Then there is the issue of: can we build an Android distribution with Guix? So as I said, there's a libc difference. So the issue is that, yeah, sorry, I already explained that: in the Android build system you don't have package definitions, so this causes issues with licensing, and there is no abstraction over build systems like Autotools, CMake and so on. So Android usually requires huge resources to build, because again you don't have a package manager, so you have to build everything at once, as far as I know. And so can we, for instance, package all of Android in Guix? There has been work that has started to do that. You hit some issues: for instance, the Android build system in Guix supports Android.mk, so it supports the old Android build system, so this is why it's probably limited to Android 7, and the newer Android.bp with Blueprint needs to be supported to be able to build Android components. And also it would be tied to glibc. I don't know if it would work, but yeah, maybe it works. So, but
yeah, the advantage also is that it would be easier to build, since you could just use Guix on an existing distribution. Again, you would know the licensing of the packages, how it's done; you wouldn't have this build problem of building everything, you could just build a part of it. So it would fix a lot of issues. But the issue is that if, for instance, Replicant does that, we would be tied to Guix, and there is a real risk of being stuck at some point, because Guix is really nice, it's strict about programming languages and so on. It tries to bootstrap things: even for Make, it uses a shell script to bootstrap Make, you cannot use Make to bootstrap Make. Sometimes there are exceptions, like for Haskell, where Guix uses a binary compiler to bootstrap the compiler, but usually it's done well. So it takes time, basically, to add a newer Android build system, and Julien Lepiller did a talk about that, explaining all the effort required for that. So yeah, it would be a big bet and it would be risky. So using Guix inside the Android build system, for instance for, I don't know, building the kernel and putting the binary inside, maybe that would be a better solution. Or even trying to use guix shell, for instance, to install the required packages and make a container of it. It's really, really simple: you do guix shell -C -F, because you can now have the FHS option. You have a standard distribution layout with /usr/bin, /etc; you don't have everything in /gnu anymore, or at least that's how the tools see it. And that's how, for instance, Bitcoin can be built: inside Bitcoin Core you have a way to build a reproducible binary, and it uses Guix in the background, not with FHS yet, but it was done before. But that's how it works basically. So you can use it to build stuff. And Android is also isolated from the host when building stuff, so it tries to avoid using Python from the host and so
on, so it could probably work. So this is where we are. Yeah, so we still have some real use cases of Guix, and we will go into them. For instance, we use old distributions to build Android, because the Android version we use requires that, so using Guix to ship some more recent tools that are required works, basically. And one of the big use cases is also automatic testing, because a lot of projects use things like Docker, and again you don't know what's in the container; it's like you give control to Docker Hub, you don't host your own infrastructure and so on. And with Guix it's much easier to do it without depending on infrastructure like that. So basically we have a modem and we have a library that talks to the modem drivers. It's a bit like, I don't know, QMI and things like that. So this is specific to Samsung phones. So we can use guix.scm to test that automatically. So this is in some software, mainly Guile software and Guix obviously: you have some people writing files like that where you have a package, sorry, wrong file. Yes, you have package definitions like that, for instance, sorry, yes, here. So people use package definitions to be able to automatically, ah, five minutes, sorry, to automatically test software. So we use that to test this library when built for GNU/Linux with GCC and Clang, and also with the Android build system. So basically, sorry, in the Android build system you have targets like that, so this is to build a tool, so there is the source code here, and you have, sorry, yes, it's that one. So Guix has an Android build system, which is a project that's just a Makefile that includes the Android-specific Makefile that can then build targets. So it's pretty limited. It's really nice because it's simple, but because it's Makefile-based, here it includes the Android-specific file to build software, but you can have only one target. So
what we can do is, in guix.scm, we parse the file and basically we compile it for all targets, and we can even run Valgrind and a lot of stuff inside. So yeah, how many minutes? Three, for questions, okay. Sorry. So we also use Guix for some of our infrastructure. So basically I did this talk because there are a lot of presentations on Guix for scientific software, but fewer on other use cases. So do you have questions? Yeah. What triggered you to use Guix for building Replicant? Basically, it's to have a complete test suite without much complexity. Oh, can you repeat the question? Sorry. What triggered you to use Guix for building? Yeah, for doing tests, it's fine. So if I wanted to do tests, I couldn't do all the tests, for instance with the various libcs and build systems. Usually in Autotools you can run scripts, but you can't rebuild the stuff with other dependencies, so it's really not supported. So the flexibility here is really huge: you can test almost everything, and you even have PowerPC support, so you can test for big-endian too. So yeah, other questions? So what is the future of Replicant? So I'm trying to get funding through NLnet again to support the PinePhone in Replicant and make GloDroid's code easily reusable by other distributions, and also port USBGuard to Android, basically to isolate the modem, because the modem could try to become a keyboard, it's really easy to do with a USB gadget, and then start typing commands. So we really want to isolate the modem. Are there other questions? Yes. Are you investigating using Nix instead of Guix? Ah, so can you repeat the question? Sorry. Have you considered using Nix instead of Guix? No, because Nix is not validated by the FSF, and we need to use stuff that's fully free, and so we would need to maybe fork Nix
and it's a lot of work. |
Exploring WebAssembly with Forth (and vice versa)
Artisanal, minimal, just-in-time compilation for the web and beyond |
Right, so welcome. My name is Remko and I'm here to talk about two very undeclarative but very minimal and hopefully useful languages. So the first one is Forth. Forth is a very minimal programming language that's been around since the 70s. It's had mostly applications in low-level contexts such as embedded systems, spacecraft controllers and so on, but it's had some other applications as well. Now if you look at Forth, the most obvious thing to notice is that it's stack-based. So it uses reverse Polish notation, where you first put something on the stack and then you call a function. But other than that, it looks like a regular high-level language, with syntax for constants, variables, comments, syntax for function definitions, loops and conditions and so on. But actually, that's an illusion. Forth has almost no syntax. So Forth executes through a very simple interpreter loop. So what it does is it reads something up until the next space and then decides: is it a number? Then I'm going to put it on the stack. Is it something else? Then I assume it's a function, which is called a word in Forth, and it's going to execute it. So a symbol is just like any normal word, so it's just a function in Forth. Same goes for the colon. The colon starts a new definition of a word. The colon, when it executes, puts the interpreter into a special mode called compilation mode. In this compilation mode, the interpreter still advances token by token, but when it encounters a number, instead of putting it on the stack, it generates some code that will put that number on the stack later, when this word is executed. Same for another symbol: instead of calling this function, it's going to compile some code that will call this function when this word is executed. The exception to this is that there is a thing called immediate words.
Immediate words are always executed, even if your interpreter is in compilation mode. An example of such an immediate word is the opening parenthesis, which starts a comment. When it executes, it will actually consume all the input up to the closing parenthesis. Another immediate word is the semicolon. So the semicolon is what you see when you end a definition. What this will do is put your interpreter back out of compilation mode into interpretation mode. Other immediate words are the loops and the if, then and else, and you can actually create your own immediate words and, as such, extend the compiler, because these are executed at compile time. So you extend the compiler and you create your own language. So in summary, Forth is actually nothing but a very simple interpreter loop with an integrated compiler. There is almost no syntax to Forth, just space-delimited tokens. All the behavior of the language is in the execution of these definitions, and you can actually extend the compiler yourself. This combination of super simplicity and power has actually made Forth a very attractive language to implement on a new piece of hardware or a restricted piece of hardware. Typically, these Forth implementations are targeted at hardware assembly, but you can actually do this in any low-level language, which brings me to the second language of my talk, WebAssembly. So I think everybody here knows WebAssembly. It's an open standard for portable binary code. Most browsers can execute WebAssembly. Many languages can compile to WebAssembly, so the result is that you can run all these languages in a browser. Although WebAssembly was designed for the web, there's actually nothing web-specific about WebAssembly. It's just an open standard for portable code. So most of the information you find online about WebAssembly is about how you compile your favorite language to WebAssembly or how you run WebAssembly in your browser.
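The interpreter loop described above (numbers go on the stack, other tokens are words, the colon switches to compilation mode, immediate words run even while compiling) can be sketched in a few lines of Python. This is an illustrative toy and not WAForth's code; the class name, the tiny word set and the token-based comment handling are assumptions made for the example.

```python
class Forth:
    """Toy Forth-style outer interpreter with an integrated compiler."""

    def __init__(self):
        self.stack = []            # data stack
        self.words = {}            # name -> (function, is_immediate)
        self.compiling = False     # False: interpretation mode
        self.current_name = None   # name of the word being defined
        self.current_body = []     # compiled steps of that word
        self.tokens = iter(())
        self.define("+", lambda: self.stack.append(self.stack.pop() + self.stack.pop()))
        self.define("*", lambda: self.stack.append(self.stack.pop() * self.stack.pop()))
        self.define("dup", lambda: self.stack.append(self.stack[-1]))
        self.define(":", self.colon)
        self.define(";", self.semicolon, immediate=True)
        self.define("(", self.paren, immediate=True)

    def define(self, name, fn, immediate=False):
        self.words[name] = (fn, immediate)

    def colon(self):
        # Start a definition: read the new word's name, enter compilation mode.
        self.current_name = next(self.tokens)
        self.current_body = []
        self.compiling = True

    def semicolon(self):
        # End a definition: store the compiled body, leave compilation mode.
        body = self.current_body

        def run():
            for step in body:
                step()

        self.define(self.current_name, run)
        self.compiling = False

    def paren(self):
        # Comment: consume input up to the closing parenthesis.
        for tok in self.tokens:
            if tok.endswith(")"):
                break

    def interpret(self, source):
        self.tokens = iter(source.split())
        for tok in self.tokens:
            if tok in self.words:
                fn, immediate = self.words[tok]
                if self.compiling and not immediate:
                    self.current_body.append(fn)   # compile a call
                else:
                    fn()                           # execute now
            else:
                n = int(tok)
                if self.compiling:
                    # Compile code that pushes the number later.
                    self.current_body.append(lambda n=n: self.stack.append(n))
                else:
                    self.stack.append(n)


if __name__ == "__main__":
    forth = Forth()
    forth.interpret(": square ( n -- n*n ) dup * ;")
    forth.interpret("7 square")
    print(forth.stack)
```

Note that `;` and `(` are ordinary dictionary entries flagged as immediate; nothing in `interpret` knows about definitions or comments.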
So a few years ago, I wanted to figure out what was actually under the hood of WebAssembly. And at the same time, I came across Forth. So what I did was I combined both, hoping that I would learn something about both. So that's why I created WAForth. WAForth is a small Forth system. It's completely handwritten in WebAssembly, and it compiles to WebAssembly. So the goals are: WAForth tries to do as much as possible in WebAssembly. Now the problem is WebAssembly is a portable standard, so you cannot do everything in WebAssembly. So it needs to do a few things outside of WebAssembly, for example writing a character to the output or reading from the input. WAForth tries to be simple. So it's just one big handwritten WebAssembly file. There are no dependencies, no complex tools. The compiler is very simply written. It still tries to be complete enough to be useful. There's an ANS standard that defines what a Forth interpreter needs to implement, the minimal set of words. WAForth implements these and implements a bunch of other words as well. What isn't a goal is speed. So of course, because WAForth is implemented in WebAssembly, you're going to get some speed for free. But still the compiler is very naive, so I don't expect it to be very fast. Same goes for the binary size of the system. It's written in WebAssembly, so it's going to be naturally very small. In fact, it's about 14 kilobytes of compiled binary WebAssembly. However, I'm not doing any code golfing or something like that to keep the system small, because I want to keep it simple. And most Forths are not really known to be very user friendly, and WAForth is no different, although it does emit some debugging information to make debugging easier, as you will see. So what can you do with WAForth?
Well, you can embed it in any JavaScript application, which means you can run Forth code inside your JavaScript, and you get bi-directional bindings to the system and back to JavaScript. To illustrate this, I have a few example applications. So the first one is the standard Forth console that always exists, where you can interactively execute Forth code, and you can even interactively compile code and then run this compiled code. So it's a REPL, actually. I also have a small graphical programming environment where you can create some graphics using a Logo-like turtle graphics language, but it uses Forth. It looks a lot like Logo, but it's actually Forth. And I took this a bit further and created a notebook extension, a VS Code extension to create VS Code notebooks. So these are actually formatted Markdown files interleaved with runnable code, so you can run this code. This is ideal for tutorials, because you can have the code directly there, you can execute it, you can change some parameters and then see what the effect is by rerunning the program. Now because this is just WebAssembly and it's just a very small system, there's also a script that converts these notebooks into a small standalone HTML file with all the functionality, so you don't actually need VS Code anymore to run it. Now let's have a look under the hood. Like most assembly formats, WebAssembly has a text-based format, which is much easier for humans to read than the binary format. So this text-based format is based on S-expressions, so it looks a lot like Lisp. So this right part here is the entire Forth interpreter that I described earlier, but it comes straight out of WAForth, and it's actually quite easy to understand.
So first it starts by parsing something, parsing the token, and then it's going to either execute it if it's a function, or compile it if you're in compilation mode, or if it's a number then it's going to put it on the stack or compile it. So this tree-like code structure is then transformed to binary WebAssembly using a tool from WABT. WABT is the WebAssembly Binary Toolkit. This is actually a toolkit with a lot of tools to work with WebAssembly files. It's a very interesting project to look at. So this is the entire interpreter. The interpreter is actually quite simple. The interesting part is the part where you have to compile something. So you have to compile a call when you're in compilation mode. So how does this work? Well, somewhere in memory there is a hard-coded binary header of a WebAssembly module with one function in it. So when a new word definition starts, what happens is some values in this header are reset and the pointer is initialized to start at the end of the header. So each time the interpreter, this is the piece of the interpreter, needs to compile a call to a function, what it does is it generates some raw binary WebAssembly hex codes and puts them at the end of the header. So for example, if it needs to do a call, it generates the hex code for a constant instruction with the index of the function to call, and then an indirect call instruction. And so the compiler keeps on adding binary code to the end of this module. Now once you reach the end of the definition, this binary piece of code needs to be loaded into the system. So WebAssembly doesn't support anything for this yet. So there's no support for just-in-time compilation, although there are some discussions about it. So what WAForth does is it takes a pointer to this piece of binary code in memory and passes it to the host system. So in this case it's JavaScript.
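To make the "append raw opcodes after the header" step concrete, here is a small Python sketch that emits the bytes for pushing a number and for an indirect call. The opcode values (`0x41` for `i32.const`, `0x11` for `call_indirect`, `0x0B` for `end`) and the signed LEB128 encoding come from the WebAssembly binary format; the `BodyCompiler` class and its method names are hypothetical, and the sketch only produces instruction bytes, not a complete module with its sections.

```python
OP_I32_CONST = 0x41      # i32.const <signed LEB128 immediate>
OP_CALL_INDIRECT = 0x11  # call_indirect <type index> <table index>
OP_END = 0x0B            # end of function body


def sleb128(n):
    """Encode an integer as signed LEB128, as used for i32.const immediates."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        sign_bit_set = bool(byte & 0x40)
        if (n == 0 and not sign_bit_set) or (n == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # more bytes follow


class BodyCompiler:
    """Accumulates the instruction bytes of one word definition."""

    def __init__(self):
        self.code = bytearray()

    def begin_definition(self):
        # Corresponds to resetting the pointer to the end of the header.
        self.code.clear()

    def compile_push(self, value):
        self.code += bytes([OP_I32_CONST]) + sleb128(value)

    def compile_call(self, table_index, type_index=0):
        # Constant with the table index of the word, then an indirect call.
        # Assumes type_index < 128 so it fits in a single LEB128 byte.
        self.code += bytes([OP_I32_CONST]) + sleb128(table_index)
        self.code += bytes([OP_CALL_INDIRECT, type_index, 0x00])

    def end_definition(self):
        self.code.append(OP_END)
        return bytes(self.code)


if __name__ == "__main__":
    compiler = BodyCompiler()
    compiler.begin_definition()
    compiler.compile_push(42)
    compiler.compile_call(5)
    print(compiler.end_definition().hex(" "))
```

In a real system the resulting bytes would still need the surrounding module header with corrected section sizes before the host can instantiate them.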
And JavaScript has a small piece of code running here. What it does is it takes this binary, it uses the WebAssembly API to create a new WebAssembly module, and it instantiates it. That's all JavaScript has to do. The rest is tracked by WAForth: it keeps track of which module corresponds to which function that it needs to call or compile later on. So here you can see the system in action. So what's happening here is you start the definition, you start by compiling something, so you're still in compilation mode. And so it's only when you reach the end of the definition that suddenly you're going to see a new entry in your WebAssembly debugger with a function that has been loaded. And this is the WebAssembly code that's been generated by the compiler. You can get even more control over this compilation process by writing your own WebAssembly inside Forth. So this is again no new syntax, this is just standard Forth with some user-defined words. And there's a direct one-to-one mapping from this to this, if you can read it, but you probably can't from there. The last implementation detail I want to note is that most Forths have very efficient execution by using a system they call threaded code. So threaded code actually means doing jump instructions all over the place, using values that come from memory or from registers. Now this is something you can't do in WebAssembly. WebAssembly only allows structured jumps. So WebAssembly is actually a structured programming language. What WebAssembly does have is function tables. So these are dynamic tables where you can put function references, and it comes with a special instruction where you can say: jump to the function at this index. This is the system that WAForth uses for calling words. Now the downside is that this is a very inefficient system compared to direct calls or jumps.
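The function-table mechanism can be pictured with a short Python analogy: words live in a growable table, and a compiled word body only stores table indices that are resolved at call time. This is a conceptual sketch, not how WAForth or a WebAssembly engine is implemented; `FunctionTable`, `make_word` and the sample words are hypothetical names.

```python
class FunctionTable:
    """Growable table of function references, addressed by index."""

    def __init__(self):
        self.entries = []

    def add(self, fn):
        self.entries.append(fn)
        return len(self.entries) - 1   # index of the new entry

    def call_indirect(self, index, stack):
        # Look the function up at run time, then call it.
        self.entries[index](stack)


table = FunctionTable()
DUP = table.add(lambda s: s.append(s[-1]))
MUL = table.add(lambda s: s.append(s.pop() * s.pop()))


def make_word(indices):
    """Build a word whose body is a sequence of indirect calls."""
    def run(stack):
        for index in indices:
            table.call_indirect(index, stack)
    return run


SQUARE = table.add(make_word([DUP, MUL]))

if __name__ == "__main__":
    stack = [6]
    table.call_indirect(SQUARE, stack)
    print(stack)
```

Every call goes through a table lookup, which is the extra cost relative to direct calls that the talk points out.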
So I said that speed wasn't really a goal for WAForth, but it's still interesting to get some idea of the ballpark numbers of speed and size involved. So I did a very unscientific thing: I took an algorithm, in this case the sieve algorithm to compute prime numbers. I took a Forth implementation, ported it to JavaScript, C and WebAssembly, and then ran it a few times to see what the result was. Again, this is not a very representative benchmark, but it's just here to get a feel for some numbers. So if you look at the execution times, WAForth is about 10 times faster than a JavaScript Forth version. This is to be expected. JavaScript Forth versions do pure interpretation, WAForth uses compilation, so there's no surprise there. But what is a bit surprising is that Gforth, which is a native Forth, is not much faster than WAForth. I have no idea why this is. I'm suspicious about this result; maybe it's because I'm using an architecture that Gforth isn't optimized for. JavaScript is 10 times faster than WAForth, which is also normal, because WAForth needs to do these constant indirect jumps, and JavaScript doesn't have this problem. It doesn't need to do any function calling at all. And then finally, if you take the C version and compile it to WebAssembly using Emscripten, it's about as fast as running the raw WebAssembly, and the native version of the algorithm is slightly faster. Although you have to say the WebAssembly engine is pretty good at running this code compared to native code. So if we look at the size of the runtime and the code that is executed, the main takeaway here is that WAForth is actually a very small system, it's about 15K, but you need a complete browser to run it, so that's of course huge. So the question is, can we improve this situation? So actually there are several standalone implementations of WebAssembly in different languages.
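For readers unfamiliar with the benchmark algorithm, a sieve of Eratosthenes can be written in a few lines. The Python version below is a stand-in for illustration only; it is not the Forth, JavaScript or C code that was measured in the talk, and the function name is an assumption.

```python
def sieve_count(limit):
    """Count the primes below `limit` using the sieve of Eratosthenes."""
    if limit < 2:
        return 0
    flags = bytearray([1]) * limit   # flags[i] == 1 means "i may be prime"
    flags[0] = flags[1] = 0
    i = 2
    while i * i < limit:
        if flags[i]:
            # Cross out every multiple of i, starting at i*i.
            for multiple in range(i * i, limit, i):
                flags[multiple] = 0
        i += 1
    return sum(flags)


if __name__ == "__main__":
    print(sieve_count(100))
```

The inner loop is dominated by simple memory writes and loop overhead, which is why the cost of word calls shows up so clearly when the same algorithm is expressed in Forth.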
For example, WABT has a reference implementation in C++, there's Wasmtime, which is security-focused and speed-focused, in Rust, and there are several others. But these only do the WebAssembly part, so there are still these small pieces that are outside of the system that you need to call out to. If you wanted to use all these engines, try this out, and create a standalone version, you would need to write this little piece of code in all these languages against all these APIs. Now luckily there's something called the WebAssembly C API, and this is a standardized black-box API that most of these systems implement. So actually the only thing I had to do was write these 200 lines of implementation-independent code, and then I could drop in any engine I wanted and have a standalone version of my system. Now if we look at the same benchmark again, we can see that speed-wise, WABT is about 100 times slower than the browser version, which is normal. I mean, this version in WABT is a reference implementation, it's very naive, it just does what it needs to do to be functional. What is a bit weird is that Wasmtime, which is supposed to be fast, is still about 10 times slower than the browser version, and there is no good reason for this. So I don't know why this is; I haven't tried other engines yet. Now if you look at size, you see that if you use a relatively optimizing system, you still have 90 megabytes, which is a lot smaller than a browser, but for a system of about 15K this is still big. Can we do something about this? Well, you need the WebAssembly runtime to be able to run your Forth code and to compile your code and load it. But for most programs, once you have done the first pass and all the compilation necessary, you no longer need a compiler if you want to run the program again. So you can do some ahead-of-time compiling, and this is where waforthc comes in.
So what it does is it takes your Forth program, it uses WAForth to run your program once, and at the end of the cycle it looks at all the modules that you created, combines them all, combines the final state, and then creates one big WebAssembly module out of this. Now it takes this big module and uses another tool from WABT, WABT is really a cool toolset, called wasm2c, to transform this big module into C, and then it uses your host compiler to create a native executable. So the end result is that you have a Forth-to-native compiler, and your native binary is your Forth code with the rest of the Forth system still in there, but with the compiler left out. And the cool thing is that because this is all platform-independent stuff up until the native compiler, you can actually do cross-compiling easily, from Forth to any architecture you want. All this code is about 500 lines and uses a lot of stuff from WABT, and WABT is the only dependency here. So if you look at our final table of benchmarks, we see that the speed is slightly better than it was before in the browser version, and the binary is becoming a lot smaller: the entire system is only about 116K of native code in the end. Now there's still room for improvement here. What waforthc does is just throw together all these modules and generate the big module. In this big module there are no cross-module calls anymore, so what you could do is some post-processing. You could change all these indirect calls into direct calls, which could speed things up a lot, because the calls are really the bottleneck here. Another thing you could do is throw away code that you don't need. So, in conclusion: this was a very fast talk. I could only touch upon things very briefly.
What I did was I used Forth to explore low-level language implementation in WebAssembly. Because Forth is so minimal, I was able to keep things very simple, try out a lot of things, and go a lot of places. But I think a lot of the things that I've shown here are actually applicable to other languages. You could use declarative languages if you want to compile to WebAssembly. Although I have to say, if you don't know Forth yet, I can really recommend having a look at it, because I find that there are some interesting philosophies and concepts behind it. Thank you. Questions? It was fast, wasn't it? Sorry about that. Sir, I have a question. We seem to be dealing in rather old languages today. Yeah, yeah, yeah. I always have been. It's at least the 60s, I think, or 50s even. Yeah, yeah. And Forth is early 70s. Yeah, yeah, yeah. WebAssembly is nowhere. Yes. WebAssembly is slightly newer. So yes, I... We'll have more from the 90s later. Okay. One question? Yeah, one thing is that there was a... Potentially, you could... I'm not sure. One potential direction. You could also consider doing the code generation in JavaScript, as in you can just create a function out of binary... out of text in JavaScript. And the same thing... I'm not sure how much of the infrastructure can be shared, but the same thing could happen on the JavaScript side, as in compiling the code on the JavaScript side, and then it's... So it could get to JavaScript. So the level of performance of JavaScript. I'm not sure if it's interesting. So the question is, can I reach the same performance if you do it in JavaScript? Potentially, there is this thing passing through WebAssembly and this JS port you mentioned, but potentially it's also possible to do code generation in JavaScript. So the question is, can you also do this code generation in JavaScript? Yes, of course you can. Potentially. Potentially you can.
So typically what you will see is that the handy part, because I'm working in WebAssembly, is that I have all the WebAssembly low-level facilities at my disposal. The hard part, if you go to other languages, is that you need to have something to manipulate these... For example, this function table is very critical. So you need to be able to talk to that and hook into that. That's going to be the tricky part, but it's definitely possible. But it's easier if you do it directly in WebAssembly. Of course, you would never write a complex language directly in WebAssembly. That's madness. So you can do it with Forth, but I would not recommend it with anything else. Thank you. One more question. Yes, I'm interested because you also used a WebAssembly-to-C compiler. Yes, I used it. You had poor performance compared to C. Have you checked the reasons? I didn't know. I used the WebAssembly-to-C compiler. The performance was quite on par with... So if I took the C algorithm, it was about... It's a bad benchmark, but the performance was about 10% slower. So it was not much slower than the native binary. So C compiled to native versus C compiled to WebAssembly: the latter was only a little bit slower. Of course, you are running... Okay, but you are running in WebAssembly, you are still running in a virtual machine, right? So the performance is going to be maybe a little bit slower, but I thought it was still okay, given that you're still in a VM. We need to solve. That's amazing. |
Whippet: A new production embeddable garbage collector
Replacing Guile's engine while the car is running |
We're starting. My name is Andy and I'm here to say we're talking garbage collectors in a really great way. This is my talk about Whippet, which is a new garbage collector for Guile. So Guile is an implementation of Scheme, as most people know. But if you looked at it and tried to determine its composition, you would notice that there's a big C library that's part of it. And it has an API, like we show: there's a cons function, which is defined as cons, and it takes some arguments, and it returns a value. And there's a lot of code inside Guile that uses this API, and a lot of code in external projects and files that also use this API. So it's exposed to third-party users. And Guile is a garbage-collected language. Data is allocated by the garbage collector, and the garbage collector takes responsibility for freeing it. And how is this going to work? So let's say I cons a value, I'm making a new object, I need to include it in the set of live data, right? So what's a live object? A live object is one of the roots, or anything referred to by a live object. So it's a circular definition. You compute the fixed point of this computation. And how are we going to do this? I'm sorry, I'm getting on to the next slide. So there are actually three strategies we can use here. One, we can ref count values, and you know, we used to laugh at this, but it's coming back in style, actually. Here you could register the location of this value with the runtime, and unregister it at some point when it goes out of scope. And another way we could find this value would be what is called conservative root scanning. And that's what Guile has done for many, many years now. And the idea, I don't know, if this is the first time you're hearing this, this is going to be wild. You know, like your brain's just going to go poof, because you take the stack, right? The machine stack. And you treat every word on it as if, like, it's an integer, you know?
But if it's an integer which is within the range of the objects managed by the heap, then we consider this maybe a pointer. And then we keep those objects alive. So it's conservative in the sense that it doesn't compute the minimal set of live objects. It's an over-approximation of the live objects. It seems to work, though, historically. It's not one of those things you have guarantees on. It's very strange. And Guile's very old, 30 years old, I think, today, or not today, but like this year, I think, something like that. We're getting older, also. And since its very beginning, it had a custom GC, which we inherited from a previous implementation that Guile was based on, SCM. And then in the mid-2000s, we added support for proper pthreads. We had other things before. It was a kind of buggy time, because threads and garbage collectors, it's a very tricky thing to get right. And if you just haphazardly add them together without understanding what you're doing, you can make some bugs. When we switched to a third-party collector called the Boehm-Demers-Weiser collector, I should have spelled it out here, a lot of these bugs went away, actually, because it takes threads more into account. It's better designed in some ways. And a nice thing when we switched to the Boehm collector is that it scans not only stacks, but also static data segments and pthread keys. It tries to find all the roots that it might possibly find. It grovels your system for special magic integers. And actually, with conservative collection, there are some advantages, some real advantages. It is very nice to program with a conservative garbage collector. I work on web browsers. They all have, well, two of the three major ones have precise roots, and it's a pain getting the handles right. And I've had bugs, you know, where you forget to register the location of a value, and everything blows up, but only sometimes; it depends on when the garbage collector runs.
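The two ideas above, liveness as a fixed point over the object graph and conservative root scanning, can be sketched in a few lines of Python. This is a toy model, not Guile's or BDW-GC's implementation: the heap is a hypothetical dictionary from made-up addresses to outgoing references, and `HEAP_BASE`, `HEAP_LIMIT`, `conservative_roots`, and `trace` are names invented for the illustration.

```python
from typing import Dict, List, Set

# Assumed address range managed by the toy heap.
HEAP_BASE = 0x1000
HEAP_LIMIT = 0x2000

# Toy heap: object address -> addresses of the objects it references.
heap: Dict[int, List[int]] = {
    0x1000: [0x1010],
    0x1010: [],
    0x1020: [0x1030],
    0x1030: [],
}


def conservative_roots(stack_words: List[int]) -> Set[int]:
    """Treat each stack word as an integer; keep those that land on a heap object."""
    return {w for w in stack_words if HEAP_BASE <= w < HEAP_LIMIT and w in heap}


def trace(roots: Set[int]) -> Set[int]:
    """Compute the fixed point: roots plus everything reachable from a live object."""
    live: Set[int] = set()
    worklist = list(roots)
    while worklist:
        addr = worklist.pop()
        if addr in live:
            continue
        live.add(addr)
        worklist.extend(heap[addr])
    return live


# 0x1000 is a real pointer; 0x1020 is just an integer that happens to look like one.
stack_words = [42, 0x1000, 0x1020, 0xDEADBEEF]
live = trace(conservative_roots(stack_words))
print(sorted(hex(a) for a in live))
```

Because 0x1020 merely looks like a pointer, the objects at 0x1020 and 0x1030 are retained as well, which is the over-approximation the talk describes.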
And it doesn't constrain the compiler, because the compiler doesn't have to keep track, you don't have to make the compiler tell the system about where the values are. And yeah, but on the negative side, you might leak values. We don't know to what extent this is a thing. It appears to be fine in practice. We actually don't have a lot of data there. With the advent of 64-bit address spaces, I think it is less of a problem, though. Another issue is we can't move values. If any integer that we ever find during the whole trace of the heap might be a pointer to a value, we can never compact the heap. And this is actually a real limitation for us, in the sense that we can't use some of the newer, better-performing garbage collection algorithms. And as a technical constraint, it also constrains the garbage collector from changing. It's very difficult to change to one of those garbage collection algorithms now, because we have so much user code, we have so much implementation, and it'll be hard. But what if I told you there is actually a better way? Because we thought we were at a local maximum. We couldn't get any better without getting worse for a while. We wouldn't reach that mountaintop without having to descend into the valley. But it turns out that you can have conservative roots and move objects and compact the heap. You can have conservative roots and do fast bump-pointer allocation, which we'll get to in a minute. And you can have conservative roots and eventually possibly add more precision to your scan. And the thing that came along that allowed me to know this was something called Immix. This is a paper that was published in 2008 by Steve Blackburn's group. And in that paper he characterizes a new class of fundamental GC algorithms. So you have basically four things you can do when you're doing a GC.
You can have what's called mark-compact, meaning find the live objects, and then slide them all to one side of the same space. So within the space that you found the objects in, you slide them all to one side. You have mark-sweep: find all the objects, and then collect all the holes into, these are the holes that are two words long, and these are the holes that are three words long, and these are the holes, like that, into free lists. This is what it's called. You sweep to a free list. Mark-sweep. Evacuation: find all the live objects, and as you find them, you copy them somewhere else. So instead of sliding to part of one space, you get them out of the space entirely. And that's a semi-space, for example, and a number of different Java collection algorithms. And this other new algorithm is mark-region. Find all the holes and bump-pointer allocate into them. As you allocate, you sort of sweep across the space, and you allocate in a bump-pointer fashion into this hole, and then into that hole, and then into that hole, instead of collecting free lists. And Immix is one of this new kind of collector. This is a diagram from the paper, the 2008 paper. Immix organizes the heap into blocks and lines. Blocks are about 64 kilobytes in size, which should be a multiple of the page size, and lines for Immix are 128 bytes. And as you allocate, here in this diagram, we can see that there are some blocks that are all full. A full block, we don't have to do anything about it. There are some blocks that have some lines which were marked in the previous collection, and some lines that were not marked in the previous collection. A set of contiguous lines that are not marked is a hole. You can bump-pointer allocate into the holes. Objects can be part of a line, in which case maybe many objects fit in a line. They can span multiple lines, but they can't span blocks, okay?
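As a rough illustration of mark-region allocation, the sketch below finds holes (runs of contiguous unmarked lines) in one block and bump-pointer allocates into them. It is a simplification under stated assumptions, not the Immix or Whippet code: `find_holes` and `BumpAllocator` are hypothetical names, offsets are relative to the start of a single block, and alignment is ignored.

```python
from typing import List, Optional, Tuple

LINE_SIZE = 128  # bytes per line, as quoted for Immix in the talk


def find_holes(line_marks: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets for each run of contiguous unmarked lines."""
    holes: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, marked in enumerate(line_marks):
        if not marked and start is None:
            start = i
        elif marked and start is not None:
            holes.append((start * LINE_SIZE, i * LINE_SIZE))
            start = None
    if start is not None:
        holes.append((start * LINE_SIZE, len(line_marks) * LINE_SIZE))
    return holes


class BumpAllocator:
    """Allocate by bumping a cursor through one hole after another."""

    def __init__(self, holes: List[Tuple[int, int]]) -> None:
        self.holes = list(holes)
        self.cursor = 0
        self.limit = 0

    def allocate(self, size: int) -> Optional[int]:
        # If the current hole cannot fit the object, move on to the next hole.
        while self.cursor + size > self.limit:
            if not self.holes:
                return None  # block exhausted; a real collector would take another block
            self.cursor, self.limit = self.holes.pop(0)
        addr = self.cursor
        self.cursor += size
        return addr


marks = [True, False, False, True, False, True, True, False]
allocator = BumpAllocator(find_holes(marks))
print(find_holes(marks))        # [(128, 384), (512, 640), (896, 1024)]
print(allocator.allocate(200))  # 128
print(allocator.allocate(100))  # 512
```

Note that the second allocation skips the tail of the first hole rather than searching for a best fit; that is the trade mark-region makes to keep allocation a pointer bump.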
When you allocate, you bump-pointer allocate, and you sweep through all the blocks in the system in the course of that GC cycle. When you trace, you mark objects in the same way as a mark-sweep collector, so there's a mark bit associated with every object, possibly in the object's header, possibly in a side table. But as you mark them, you also mark the lines that they're on, using address math. Typically the way this is implemented is that all these blocks are allocated as part of aligned 2-megabyte slabs, basically, and you can use address arithmetic to get to the side table of mark bytes for the lines. When you sweep, at the end of collection there is an eager sweep over all of the line mark bytes, so the contiguous array of mark bytes for lines, to identify which blocks are full, which are completely empty, and which are recycled, containing some old data, and for those you would bump-pointer allocate into the holes. The cool thing about it is that Immix does opportunistic evacuation, so it's not simply leaving these objects in place. If it determines that your system needs to be defragmented, then it can choose some set of blocks to evacuate, and choose some other set of blocks which are already empty to be evacuation targets. So it's still a one-pass algorithm over the heap, but instead of marking objects in place, it tries to put them into an empty block. And if you do this a couple of times, you'll completely defragment the heap. And it can fail because of parallel markers, and ordering, and alignment issues, and that's okay: if the evacuation fails, you just mark in place. It's always okay to mark in place, and it's always okay to try to evacuate; evacuation may or may not succeed. So when I realized this, that you can mark in place or evacuate, this is something that is compatible with Guile, right? We can do bump-pointer allocation now instead of allocating from free lists, which would improve throughput in Guile programs.
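The address arithmetic for line marking can be sketched as follows. This is only an illustration: it assumes 2 MB aligned slabs and 128-byte lines as mentioned in the talk, keeps the mark bytes in a Python dictionary rather than inside the slab itself, and the helper names `slab_base`, `line_index`, and `mark_lines` are hypothetical.

```python
from typing import Dict

SLAB_SIZE = 2 * 1024 * 1024  # slabs are assumed to be aligned to their own size
LINE_SIZE = 128

# Hypothetical side table: slab base address -> one mark byte per line.
line_marks: Dict[int, bytearray] = {}


def slab_base(addr: int) -> int:
    """Mask off the low bits to find the start of the aligned slab."""
    return addr & ~(SLAB_SIZE - 1)


def line_index(addr: int) -> int:
    """Index of the line containing addr, relative to its slab."""
    return (addr - slab_base(addr)) // LINE_SIZE


def mark_lines(addr: int, size: int) -> None:
    """Mark every line touched by an object (assumed not to cross a slab boundary)."""
    marks = line_marks.setdefault(slab_base(addr), bytearray(SLAB_SIZE // LINE_SIZE))
    for i in range(line_index(addr), line_index(addr + size - 1) + 1):
        marks[i] = 1


obj = 0x400000 + 300  # an object 300 bytes into the slab at 4 MB
mark_lines(obj, 200)  # covers bytes 300..499, i.e. lines 2 and 3
print(line_index(obj), sum(line_marks[0x400000]))
```

No lookup structure is needed to go from an object to its mark byte; the alignment of the slab makes it a mask and a divide.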
We can compact the heap, which is, I mean, I know there are many users here, and python-xyz.scm is one of the files you have, yes. I say no more. So I started a year ago on this work-in-progress GC implementation, "whip", hence where the name comes from. There are a couple of differences from Immix. Immix has these 128-byte lines, and if just one object on a line is left over, then the line is kept live, right? In the next collection, nobody will allocate, nobody will put an object in that line. It's not a hole, basically. And for various reasons, that didn't make sense to me, so instead in Whippet we have 16-byte lines, so effectively the line mark table is the object mark table. You only have one mark byte, it's a byte because of parallel markers, and it's a bit more overhead in terms of space, but maybe it's a bit more parsimonious with memory; we'll see how it works out. It's an open question here. And additionally, with these line mark bytes being more fine-grained, it's a loss to do an eager sweep over the heap, so we do lazy sweeping: as you allocate, you just sweep one block, and then sweep another block, and then sweep another block, like that. And the good thing about that is that it parallelizes things. The bad thing is that you don't know how much data was live at the previous collection right after your collection, because you haven't swept yet. Yeah, okay. So some comparisons of Whippet with the Boehm collector, and there are a number of different points here. So one of them is you can move values. If every edge in your graph is potentially conservative, then you can't move anything, because late in the trace you could find an edge that keeps an object live and doesn't allow moving. But if you can partition your edges into a set that's conservative and a set that's not conservative, a set that's precise, you do the conservative ones first, and any object which isn't reached in that conservative trace is then movable.
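A toy Python sketch of that partitioning: objects reached directly from conservative roots are marked in place (and thereby pinned), and objects first reached through precise heap edges may be evacuated. This is not Whippet's implementation; `collect`, the string object names, and the `evacuate` flag are hypothetical, and "evacuation" is reduced to recording which objects would move.

```python
from typing import Dict, List, Set, Tuple


def collect(
    heap: Dict[str, List[str]], stack_roots: List[str], evacuate: bool
) -> Tuple[Set[str], Set[str]]:
    """Return (marked_in_place, evacuated) for a heap of precise edges."""
    # Phase 1: conservative roots are marked in place, which implicitly pins them.
    pinned = {r for r in stack_roots if r in heap}
    in_place: Set[str] = set(pinned)
    moved: Set[str] = set()

    # Phase 2: everything first reached through precise heap edges may move.
    worklist = [child for r in pinned for child in heap[r]]
    while worklist:
        obj = worklist.pop()
        if obj in in_place or obj in moved:
            continue
        if evacuate:
            moved.add(obj)
        else:
            in_place.add(obj)
        worklist.extend(heap[obj])
    return in_place, moved


heap = {"a": ["b"], "b": ["c"], "c": [], "d": []}
print(collect(heap, ["a", "c"], evacuate=True))   # a and c pinned, b movable
print(collect(heap, ["a", "c"], evacuate=False))  # everything marked in place
```

Doing the conservative roots first matters: "c" is reachable from "b" through a precise edge, but because the stack also appears to reference it, it has to stay put.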
So what happens is you mark the stack first, and you mark in place, you don't evacuate. That is an implicit pin on every object that you mark. And then you go and you mark the heap, and if you find another object there, you can evacuate at that point. And then in Whippet, if we see that the heap is fragmented, we can turn evacuation on, and if we see the heap is not fragmented, we can always mark in place and not incur the overhead of copying. There is also explicit pinning, for various reasons. We can shrink the heap, which is nice: because these blocks are multiples of the OS page size, they're easy to return to the OS whenever we find that a block is empty. You can just mark it as being empty and madvise it with MADV_DONTNEED, and if you ever need it again, you can pull it right back in; it's zeroed by the OS. And additionally, there's a possibility to use adaptive heap sizing techniques, such as the one that I link here. It's an online algorithm that depends on what your current cost of GC is and how fast you are allocating. So a process which sort of stops and goes quiet gets its memory slowly reduced to the minimum. You can fit more on a system. And we can also do generational collection, if we want to, using the sticky mark-byte algorithm, which I link to here; it's described more in that post. For some programs, it doesn't make a difference, because some data isn't very generation-friendly. This is the case for the first mt-gcbench pair over there, where the first bar is Whippet without generational collection, and the second is with. But in some cases it's very effective, like in this one, where I'm making a bunch of quad trees, and it pretty much doubles the throughput of the system. Additionally, with Whippet, we scale a lot better for multiple allocator threads.
In BDW, you have these size-segregated free lists, the free lists of size two, three, four, and that sort of thing, and you need to lock the heap to sweep and find more and fill those free lists. In Whippet, you use uncontended atomic ops to obtain the next block, just basically incrementing a counter, because the blocks are contiguous in these two-megabyte slabs, and you sweep without contention. So these are two graphs showing the time it takes as problem size increases and the number of mutator threads increases. At each step, I'm adding an additional mutator, an additional thread, doing the same amount of work. So with two mutator threads, the heap is twice as big as it was with one, and with eight, it's eight times as big as it was with one. So we do expect to see some increase. What we see is that BDW takes more time, ultimately: it's at nine seconds with eight mutator threads, whereas we're only at three and a half with Whippet. It scales much better when you're adding allocators. And this is with a single marker thread, so we expect to see some increase as the problem size gets larger. This is, what do you call that? It's like when you make a quilt: apparently you're supposed to put a part in it that's incorrect, because you don't want to show too much pride in the face of God, right? It's like a touch of the hand sort of thing. This is my humility slide, showing Whippet being slower than BDW-GC on this one machine. I have no idea what's going on with this, because when I remeasure it on my other machine, it looks much better. But it does show that as you add marker threads, things improve, although I don't understand the relative BDW versus Whippet thing right there, so that's a question. So with a heap with twice as much memory as the problem takes, as we add markers, things get better for both BDW and Whippet, but a little bit better for Whippet. So, ephemerons.
This is weak maps, like you have in JavaScript, where you have keys associated with values, but what if the value references the key? Can you have a circular reference? Does the weak reference leak memory? I don't know. You people have heard about ephemerons, I would imagine. You cannot do them in the Boehm collector. It's impossible, right? I've tried a lot and thought about this, but with Whippet we have them. You really need deep GC integration to implement ephemerons. Right, and precision. So with BDW, you're always stack-conservative. You're always scanning the stack for smelly pointers, right, or smelly integers, integers that could point to the heap. And it's often configured in such a way that every edge on the heap is also conservative. And with Whippet we can configure it in a number of different ways. And probably where we're heading in the near-to-mid term is a conservative scan of the C stack, a precise scan of the Scheme stack, and a precise scan of the heap. So we will be able to get the advantages of motion and compaction and all that. But we could move to a fully precise stack as well. And potentially things get better. BDW-GC is terrible to hack on. I just counted, it's like 15 or 16% C preprocessor directives. You can imagine that probably 90% of the code is covered by #ifdefs. It's really, really hard. Right. So some more words about how it is that we are, we are, that royal we, right, okay, working on getting Whippet implemented in such a way that it could land in Guile and not break the world, because I'm going to make a confession. I don't maintain software, I develop software, and I throw it over the wall and forget about it. So if I'm going to get bugs in the garbage collector, I'd better not start, because, you know, I'm not going to fix them. So the repository's here. It is designed to be an embed-only library, kind of like an include-style library, but you actually do separate compilation.
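Going back to the ephemeron question raised above, here is a hedged Python sketch of why they need collector support: an ephemeron's value is traced only once its key is known to be live, and that has to be iterated to a fixed point during the trace. The function `trace_with_ephemerons` and the data layout are hypothetical and say nothing about how Whippet actually implements this.

```python
from typing import Dict, Iterable, List, Set, Tuple


def trace_with_ephemerons(
    roots: Iterable[str],
    edges: Dict[str, List[str]],
    ephemerons: List[Tuple[str, str]],
) -> Set[str]:
    """edges holds strong references; each ephemeron is a (key, value) pair."""
    live: Set[str] = set()

    def mark_from(start: str) -> None:
        worklist = [start]
        while worklist:
            obj = worklist.pop()
            if obj in live:
                continue
            live.add(obj)
            worklist.extend(edges.get(obj, []))

    for root in roots:
        mark_from(root)

    # Repeat until no pending ephemeron has a live key: tracing one value
    # can make another ephemeron's key live.
    pending = list(ephemerons)
    changed = True
    while changed:
        changed = False
        still_pending = []
        for key, value in pending:
            if key in live:
                mark_from(value)
                changed = True
            else:
                still_pending.append((key, value))
        pending = still_pending
    return live


edges = {"root": ["k1"], "v1": ["k2"], "v3": ["k3"]}
ephemerons = [("k2", "v2"), ("k1", "v1"), ("k3", "v3")]
live = trace_with_ephemerons(["root"], edges, ephemerons)
print(sorted(live))  # ['k1', 'k2', 'root', 'v1', 'v2']
```

The pair (k3, v3) is the circular case from the talk: the value references its own key, yet neither is kept alive, because nothing else reaches the key.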
But it's something that you include in your source tree, because it needs to be specialized with respect to the program that's using it. In the case of Guile, Guile will tell Whippet how to put a forwarding pointer in an object, for example, or how to do a precise trace of the heap. And then we also specialize Whippet with respect to the domain: what should we scan conservatively, what should we scan precisely, that sort of thing. We use LTO, link-time optimization, and it appears to remove the overhead of the separate compilation. I'm actually suspecting LTO for that other graph that I showed you. So we actually managed to get performance and abstraction at the same time by being inspired by MMTk. MMTk is a memory management toolkit, it's fantastic. It's a library of garbage collectors and technique and experience and knowledge, currently written in Rust, formerly part of the Jikes RVM research JVM, but now retargeting to OpenJDK and V8 and a number of other systems. We could actually slot this into Guile if we wanted to at some point. But we have enough information exposed in the API to allow a JIT to use that exposed information and generate machine code for the fast path for allocation, for example. And by having a real abstraction barrier between the two sides, we allow both sides to evolve at their own pace. And when we think about migrating Guile to Whippet, which is kind of where I want to go here, I know the talk description kind of oversold the item, right, it's like "now we have a new production garbage collector in Guile"; it's not there yet. So this abstract API can be implemented by the current garbage collector being used by Guile, by the Boehm collector, the BDW collector. And so that's going to be the first step: to switch Guile over to use the new API but still use the old collector implementation. And then we can look at switching to Whippet, and ideally that wouldn't require any code changes in Guile.
I mean, so you have the Whippet API, but then you have the Whippet garbage collection algorithm that we were talking about. There are a lot of variants on the algorithm. These are different ways you can configure Whippet, on two different tests: this one is mt-gcbench, this one is quads. And going across, we can first see serial Whippet: one marker, one marking thread, it's not going to be parallel marking. That's the first light blue bar on both of those sides. And then we have parallel Whippet; four markers in this case is what I was measuring. It improves things in some cases, and in other cases a little bit, minor improvements. Generational Whippet: collect more recently allocated objects more frequently than older objects. Parallel generational Whippet: four markers and generational. And then after that there are four more bars which are the same thing, but collecting stack roots conservatively. The previous four bars are a precise scan of the stack, and the next four bars are a conservative scan, and as you'll note, it actually performs better. And there are two reasons for this. One, conservative scanning can actually reduce the lifetime of objects: if the compiler determines that an object isn't needed at any given point, it can reuse its register or stack slot or what have you, whereas you have to wait for the unregister part of a registration/deregistration API if you're using precise roots. And the other thing is that when using this API from C, I don't actually have cooperation from the compiler, where it's going to write out a table of where all the values are. I have to explicitly say: now remember this one, okay, now forget it. And now remember this one, and now forget it. And that's overhead, right? And by doing a conservative scan, I remove that overhead. And then the final two bars, where I didn't include generational because it doesn't really make sense in this context, are a fully heap-conservative scan.
We increase a lot on this mt-gcbench benchmark because it allocates a very big array, and I don't have the equivalent of the pointerless allocation that the BDW API gives you. So we end up tracing all the elements of that really big array, which gives a big spike over there. And in the case of quads, we never have large objects, we're always tracing everything anyway, and it doesn't really matter. But heap-conservative does slow you down relative to just having stack-conservative. Right. And then as a project, it's written in C, which I know is a sin, but Guile has this sort of odd place in the supply chain of Guix, and it's useful to depend on a more minimal set of things rather than using Rust, for example. But it's a relatively modern C: it uses stdatomic, it uses things in a way that is constexpr-ish, in a way that you know that the compiler is going to reduce them down. It avoids void pointers completely, using instead structs containing a single number, which get boiled away by the compiler as well, and which can't be cast to each other; you need explicit conversions. That way you won't confuse a conservative reference with a precise reference, and things like that. And we don't actually have any API or ABI concern at all, because it's an embed-only library. If something breaks, don't update it. And it does have a bit of an abstraction for how you find conservative roots on whatever your platform is. It's not so bad, it turns out. So if we think about when it is that this might reach Guile, then it's when we can, right, you know, in the end. This is kind of a side project for me.
I have other side projects, children, you know, so I can't really give an ETA here, but I would mention that there are a few things to do. What we might end up with is that we could get a new release series for Guile, which I think is what would be required for this, maybe starting in six months or so, just switching over to the API and staying with the Boehm collector, and maybe we could release a new stable version in another six months, or really a little bit more. But we'd have to do a few things to get there. Whippet is mostly done, with the exception of actually growing and shrinking the heap, implementing finalizers, and having an API for checking in with Whippet, checking in with the GC as to when a mutator should stop. Because that's one other thing that the BDW collector does: it uses signals to stop all the threads, whereas Whippet relies on periodic safepoints. There are trade-offs. In Guile we'd have to switch over to these safepoints; I think it's possible. And I think we would start with a heap-conservative Whippet, just because it's the same thing that we do with the BDW collector, and then we'd move over to a precise scan of the heap. When we get to a precise scan of the heap, we have to implement a few things on the Guile side. There are some hazards around current uses of the API. In particular, if a third-party user ever allocates an object and then stuffs something in it that Guile doesn't know about, is it an integer or is it a pointer to the heap? And there are a couple of places where people can do that that are unclear. And we can't allow this if we want to trace the heap precisely and move objects. So this might require some small API changes and ABI breaks, because it's a new series, around this area. It might actually be time to remove smobs entirely, possibly. So that's what's actually pushing us to a new major release. So in summary, Whippet is a new GC, it's a replacement for BDW-GC.
It has the potential to reach a new local maximum, better than BDW. And I think we can get it into Guile 3.2. I would like to thank particularly the MMTk people for inspiration and discussions, because it's been really helpful to be able to properly learn about garbage collection over the last year or so. I'll leave you with one slide. When you evaluate a GC, you need to do so with a space-time diagram, because GC is a trade-off between space and time. So on the x-axis, you should have your heap size as a multiple of the minimum heap size. Here, I measured some algorithms at 1.3x, 1.5x, 1.75x, 2, 2.5, 3, 4, 5, and 6, or just 5, I don't know. On the y-axis, you should have whatever you're measuring, be it instructions retired or wall clock time or memory or something like that, because the response to heap size is one of the fundamental trade-offs in GC. Here we show the BDW collector, a semi-space collector, which is also implemented behind the Whippet API, and the Whippet algorithm, serial, one marker, one mutator, on this benchmark. We see performance as we change heap size. Whippet is the only one that gets to 1.3x. This is an analytical calculation of how big the heap should be. It's not measured as to how small I can get anything to run, but it's what I think the heap should take. So it might not precisely be 1.3, it's a number in that range. It can get to the smallest heap. It takes a bit of effort to do so. As you become more parsimonious with your heap, you end up tracing it more. So the curve goes up on that side, but it's the only one that actually gets to that point on the x-axis. And you want these numbers to be low. That's what you want. It quickly passes BDW GC. There's only one point where it takes more time than BDW GC, and that's concerning. I need to fix that one. Let me see the green line.
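The shape of such a space-time curve can be approximated with a common back-of-the-envelope model for tracing collectors. The sketch below is an assumption-laden toy, not the speaker's measurement: it takes the minimum heap size to be the live data size, assumes each collection traces all live data, and assumes a collection is needed every time the free headroom is used up. The function names are invented for the example.

```python
from typing import List, Tuple

# Heap multipliers similar to the ones named in the talk.
MULTIPLIERS = [1.3, 1.5, 1.75, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0]


def heap_sizes(min_heap: int, multipliers: List[float]) -> List[int]:
    """x-axis: absolute heap sizes for each multiple of the minimum heap size."""
    return [int(min_heap * m) for m in multipliers]


def trace_work(live: int, allocated: int, heap: int) -> float:
    """Toy model: number of collections times the live data traced per collection."""
    headroom = heap - live
    if headroom <= 0:
        raise ValueError("heap must be larger than the live data")
    collections = allocated / headroom
    return collections * live


def curve(live: int, allocated: int, multipliers: List[float]) -> List[Tuple[float, float]]:
    """(multiplier, modelled trace work) points for a space-time plot."""
    return [(m, trace_work(live, allocated, int(live * m))) for m in multipliers]


POINTS = curve(live=1000, allocated=100_000, multipliers=MULTIPLIERS)
```

In this model the work grows sharply as the multiplier approaches 1, which matches the remark that a more parsimonious heap means more tracing. It deliberately leaves out sweeping cost, so it cannot reproduce the difference between mark-sweep and semi-space collectors.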
This is a semi-space collector. With semi-space, as you add memory, it gets easier and easier and easier, right, because it depends only on the size of the live data. Whereas Whippet and BDW need to sweep the heap. So as you add memory, it sort of plateaus. It doesn't keep on going down. I don't know why it goes up at the end. This is my other little touch of the hand. I don't know. That looks like a bug to me. So that's something I fixed. Anyway, there's Whippet. Thank you for enduring this blathering. And good luck, everybody, in about 18 months when this starts rolling out to Guix. Just joking, because I won't be around. Good. So I'll take any questions. Even dumb questions. That's okay. Yes, sir? It seems to me like conservative stack scanning is incompatible with AddressSanitizer from LLVM or GCC. So how do you debug address bugs in the GC? So the question is, conservative stack scanning seems to be incompatible with AddressSanitizer from LLVM and GCC. I'm a professional C++ developer, and I work on web browsers. I don't know what AddressSanitizer does. I know it gives me bugs sometimes and tells me things I have to fix, but I don't know what's going on there. I should know. Can you tell us, why is it incompatible? Basically, every time you access something that wasn't registered properly via malloc, for example, or alloca, it tells you you're in the red zone or you're in something that doesn't work. So when you scan your whole stack, only part of it is actually valid. So the answer is that it only signals warnings if you ever access a value after it's been freed. Is that right? For example, you are in a function and you access something that wasn't. I think it's actually not a problem because we don't trigger the malloc/free detection at all. What makes it a completely third-party allocator is that you mmap the pages and we're just reading values from those pages, and so it doesn't trigger the particular logic there, which also means you have no tool support.
You're in the wild west with the bugs that go with it, so yeah, I guess that's the answer there. Yes, so the question: how will this affect Guix users? Well, this will affect Guix users in the sense that, one, when you rebuild the system, Guix launches multiple threads to compile things. And as we saw, there is contention in BDW-GC. It doesn't actually scale very well as you add threads if you have an allocation-heavy workload. And so I think that when Guile incorporates Whippet, Guix with multiple threads should scale better. In addition, we will be able to have better tooling for understanding the heap and heap usage, and ideally, be able to place ourselves better on the space-time trade-off: if you need more throughput, give it a bigger heap, and also let it shrink. And that can also affect longer-running daemons like the Shepherd and things like that. So it should yield a more robust system. Yes? There are some architectures which can use 64-kilobyte pages. Would that be a problem with using 16K blocks? They're actually 64-kilobyte blocks. So I think I chose the least common multiple or whatever. It's configurable, but I think the default size is such that they are large enough for any common architecture. The question was about page size, whether 16 kilobytes is big enough for blocks, but it's actually 64 kilobytes. Yes? With the collection, would it ever have to stop all threads simultaneously, or do the threads stop at different locations? Basically, is it stop-the-world? Yeah. Yeah, that's a very good question. I didn't mention this. So this is a stop-the-world collector. It's not a concurrent collector, with the exception that threads mark their own stacks while other threads are running. There's a little bit of concurrency there. We may add concurrent marking at some point, but you need write barriers for that to work.
And so that would be something to add once generational collection is working, because then you've proven that you have all the write barriers in the right place. A write barrier is just a little piece of code that runs whenever you store a pointer. And if write barriers can be used to indicate pointers from old objects to new objects, helping you do generational collection, they can also be used to mark an object as being allocated after the concurrent marker has already marked it in that cycle. I'm not explaining myself very well. But basically, you need write barriers to be able to minimize the stop-the-world component of the mark phase. Does that answer the question? Yes? Does this simplify Guile or complicate Guile with WebAssembly? Oh, yeah. It's a good question. So there's a project to compile Guile to WebAssembly. I think initially this will probably start by having a Guile library produce WebAssembly that has its own runtime. And this could grow to a whole-program standalone compiler in which Guile has a library that takes your Guile program and spits out a native binary. And in that case, that native binary would include Whippet, embedded in it, instead of having that native binary then link to the BDW collector. So the goal would be to produce one binary that's all finished. Is that it? Thank you. Thank you very much. |
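The write barrier described in that answer, a small piece of code that runs on every pointer store and records pointers from old objects to new ones, can be sketched as follows. This is an illustrative toy with assumed names (`Obj`, `REMEMBERED`, `write_field`), not code from the collector.

```python
from typing import Dict, Optional, Set


class Obj:
    def __init__(self, name: str, old: bool) -> None:
        self.name = name
        self.old = old                      # True once the object has survived a collection
        self.fields: Dict[str, Optional["Obj"]] = {}


REMEMBERED: Set[str] = set()                # names of old objects that point at young ones


def write_field(holder: Obj, field: str, value: Optional[Obj]) -> None:
    """Store a pointer, running the write barrier first."""
    if holder.old and value is not None and not value.old:
        REMEMBERED.add(holder.name)         # barrier: record the old-to-young edge
    holder.fields[field] = value


old_a = Obj("old_a", old=True)
old_b = Obj("old_b", old=True)
young = Obj("young", old=False)

write_field(old_a, "x", young)              # old -> young: remembered
write_field(old_b, "x", old_a)              # old -> old: nothing to record
write_field(young, "y", old_b)              # young -> old: nothing to record
```

A young-generation collection can then treat the remembered objects as extra roots instead of scanning the whole old generation. The same hook is what a concurrent marker would rely on, which is why the talk ties the two features together.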
Zig and Guile for fast code and a REPL |
People here are organizing the devroom, particularly Manolis, who has done a lot of the work before we organized this room. Anyway, I'm going to talk a little bit about Guile and Zig. So I prepared two talks. One talk you can download online. That was kind of an overview of why I made these choices and why we're doing this. But I think it's better just to hit the command line or a shell. Right, so many people will recognize this: Emacs. The letters fall off on the side, it shouldn't matter too much. And then I'm going to run in a shell. So I don't know if everyone is aware, but the guix shell is a proper container. So only the tools that we define are pulled in. And this is done in the guix.scm file. In the guix.scm file we define the dependencies we want. Right, so Guile is there and Zig is there. So in the container we find zig version, right, and then guile -v is also there. But vi, for example, is not. And if this was properly running on Debian, it would be visible. So what I'm going to do is I'm going to run Zig to build my library. It's a dynamic library. And then I use pkg-config to pick up the Guile compile switches. And I'm going to compile it against a little C file. Uh-oh. Yeah. Right, yeah, I missed the second line, I see, yeah. Right. So Guile is almost designed for linking against C. Right, so I wrote a little C program to show you that. And it calls Guile functions. So scm_from_int is a Guile function. So it turns test into a Guile integer, essentially. And then a boolean, and then I call this function here in C. Right, my increment int function. And you can see that it uses Guile objects to pass into the function. And there's also a Guile object returned. Right. So it's very minimalistic code. And I just need to get it compiled. Now it did compile, but now it doesn't find the library. So I need to add the library path. So I'm just doing this raw so you can see what is happening.
You know, I mean, if you had a proper build script, you could account for all this. But you can see it says hello world, from 3 to 4. Right, so that's what the C function does. Now I want to do the same thing in Zig. Right, so actually what I want to do is I want to call Zig from, you know, the same library that I'm using. I want to call this from Guile. Right. So let's try that. So I'm in Guile now and I added the local search path for the library. Right. Yeah, so here we load the shared library libmy.so. Right, it loads into Guile. And I want to try something like, and it says it doesn't find the procedure because I haven't defined it. Yeah, so that doesn't work very well. Let me see where we are. Yeah, so I called it ping_zig, with an underscore. All right, that's not very Guile-like, is it? Yeah, so there are some conventions and I already ignored them. Yeah, so Guile has a wide range of C functions in the library. And these can be called. You know, so if you look for the C function, which one did we use before? Yeah, scm_from_int, you can see here, right? scm_from_int. And so in the Guile reference manual, on almost every page you see the C functions that you could also call as Lisp functions. You know, and scm_from_int should be there. It's a long list, but that's the idea. So when you actually use the Guile manual, you will see the C interface, the C ABI. Now the interesting thing about Zig is that it faithfully, you know, uses the C ABI. So, you know, anything you define in Zig, you can essentially access from Guile. So let's look at my Zig file and I say ping. Yeah, so this is the equivalent, sorry, the Zig version of the function that we had defined in C earlier. Yeah, you have a ping_zig. It takes a SCM object as an input and it returns a SCM object, right? And it just pings it back. So how can I see the SCM object as it is defined now?
And as a matter of fact, Zig can, sorry, import C include files. Yeah, it's actually one command. Yeah, so you say zig translate-c, right, then the path to the include files, then the include file itself, and it turns it into a proper Zig file. And this Zig file you can just import and it will work. Yeah, so all the functions that are defined by Guile, that are exported, that you could use from C in principle, are now available in Zig, including the objects. Yes, if you look at this Zig file, it doesn't look very nice, right? But it's all there and it actually just worked in one go. I had to delete one line in it. All right, so yeah, the other thing of course is that I'm calling ping_zig with 1 right now. Okay, so let's try hello. And it pings back hello, right? I mean that's what we see in the Zig function here. And Guile is not a strictly typed language. Yeah, I mean it's typed, but not in the sense that, here we have a variable where, you know, in the one case I'm using an integer, in the next case I'm using a string. Yeah, and apart from the fact that I'm using a REPL where I'm actually talking to the, you know, to the Zig backend, it also gives me a lot of flexibility in, you know, how I define these functions and these variables that get passed. Okay, so let's do something a little bit more complex. And, you know, this exploration that I had with Zig and C and Guile, it's also all online, you can just read it. It's in a GitLab repository. Yeah, so if you define a function in Zig, you know, just naively like ping_zig here, it won't be immediately visible to Guile. So you need to create a mapping for that. Yeah, and this is in the Guile documentation, it's exactly the same thing in C. See if I can find it, yeah, here it is. So when you initialize the module, which means when you load the shared library, right, it will call this function, and you will define a sub, sorry, it will, yeah, define the function call here.
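The registration step described here, where loading the library runs an init function that makes each exported function visible under a name, can be pictured with a small model. This is a Python stand-in, not Guile's API: `PROCEDURES`, `define_procedure`, `call` and `init_module` are hypothetical names, and only the name `ping_zig` comes from the talk.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-in for the interpreter's table of visible procedures.
PROCEDURES: Dict[str, Tuple[int, Callable[..., Any]]] = {}


def define_procedure(name: str, arity: int, fn: Callable[..., Any]) -> None:
    """Make `fn` callable from the interpreter under `name` with a fixed arity."""
    PROCEDURES[name] = (arity, fn)


def call(name: str, *args: Any) -> Any:
    """Look a procedure up by name, as a REPL would, and apply it."""
    if name not in PROCEDURES:
        raise NameError(f"unbound variable: {name}")
    arity, fn = PROCEDURES[name]
    if len(args) != arity:
        raise TypeError(f"{name} expects {arity} argument(s), got {len(args)}")
    return fn(*args)


def ping(value: Any) -> Any:
    """Return the argument unchanged, whatever its type."""
    return value


def init_module() -> None:
    """Runs once when the library is loaded; registers what it exports."""
    define_procedure("ping_zig", 1, ping)
```

Before `init_module()` runs, `call("ping_zig", 1)` fails with an unbound-name error, which mirrors the "doesn't find the procedure" message seen earlier in the demo. Afterwards the same name accepts an integer or a string, since the function only passes its argument back.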
So in Zig here is ping_zig, right, ping_zig, and it has one argument. You need to do that to make the symbol visible to the Guile interpreter. That's basically what you have to do. There's nothing more to it, which is kind of boring, you know. So yeah, I'm also of the opinion that, you know, we need multiple programming languages. Yeah, when you talk about Zig, or even C++, you know, there's often the elephant in the room. I'm not going to name it. But this is a language that tries to be everything, you know, and you end up with a very complex language. Also, the compiler is dog slow. I don't know if anyone is using the unnamed language. And then it has, you know, a borrow checker, which acts like, you know, a nagging wife, you know, it keeps talking to you. And I tried it, you know, and I tried to love it because it's a functional language, you know, it's a functional programming language. But it kept talking to me and it kept getting me out of my flow, you know, I just couldn't keep moving. So I think, you know, it's probably wiser to have a language like C. You have to realize that most of the code in the world today is still written in C and C++. If you want that type of performance, you will end up with a strictly typed language, which is, you know, imperative to some degree, because CPUs are imperative, right? At this point, we don't really have functional programming CPUs. So to optimize that stuff, you end up, you know, with a type of language that has to cater for that. But nobody loves programming in C++, you know, and programming in C is also hard, you know, a shoot-yourself-in-the-foot language, I call it. So it's nice if you can have a language that has somewhat stronger guarantees, but is still blazingly fast and still, you know, kind of imperative. And then have something like Guile, which actually allows you to be, you know, productive, right?
And do functional programming when you want to. So you end up with this type of mixture. Have we got five minutes? Five minutes. Five minutes to two questions. Okay, so one additional thing I would like to show you. Sorry, that's mine. Yeah, so using the Guile libraries, you can essentially build up lists, you know, which are fundamental for many Lisp-like programming efforts. But, you know, when you talk about performance, you'd like to deal with arrays of data. So contiguous blocks of memory where you have integers in a row or doubles in a row, and you're able to address these integers and doubles. Of course, you can do that from Guile, but, you know, if you write high-performance code like we do, you want to be able to, you know, use it as a vector in Zig. Yeah, so you have the base pointer to the vector, and then you have an index, and you should be able to fetch out the data object that you want. So I just wrote a little example here. So this is the list example. Let me see if I can find the vector array. Yeah, so I wrote a little, let me move that down, a little Zig function, which is called my increment floating point 64 bits vector Zig, you know, I'm very good at naming, apparently. You pass in a Guile vector, which is, again, a SCM object. It returns a SCM object, which is, again, a vector, right? And then it calls this Guile function, scm_f64vector_set_x, oh, that's where the naming came from. It came from Guile, yes. So in the vector, I set at position one, right, so index one, the value 3.7. So this is kind of happening in Guile-ish C code, so it's calling essentially Guile C functions. Yeah, and I proved that it works, you can look it up. But here I'm using a proper index, I think, let me see, yeah. So you increment the f64 vector, right? This is the old version. Here I get a handle on the array. And then I get the data, so I get a vector here.
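The contrast between a list and a contiguous vector of 64-bit floats can be sketched outside of Guile and Zig as well. The following uses Python's standard `array` module purely as a stand-in: a handle on the storage is taken with `memoryview`, and elements are addressed by index. The function names are assumptions; the value 3.7 at index 1 follows the talk.

```python
from array import array


def incr_list(values: list) -> list:
    """List version: builds a new list of boxed floats."""
    return [v + 1.0 for v in values]


def incr_f64_vector(vec: array) -> array:
    """Vector version: update a contiguous block of 64-bit floats in place."""
    if vec.typecode != "d":
        raise TypeError("expected an array of 64-bit floats")
    data = memoryview(vec)                  # handle on the underlying storage
    for i in range(len(data)):
        data[i] = data[i] + 1.0             # indexed access into the same buffer
    data.release()
    return vec


VEC = array("d", [1.0, 2.0, 3.0])
VEC[1] = 3.7                                # set the element at index 1
incr_f64_vector(VEC)
```

The vector version never allocates a second container, and every element sits at a fixed offset from the base of the buffer, which is the property that makes the same data cheap to hand to a compiled function through a pointer and a length.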
If our data is a vector, yeah, of the elements of the vector array, right? And then I index the data points based on the vector, you know, of the floating points, and show number 0, 1, and 2. Yeah, and that's what it shows, so I'm not going to do that live, but that's the idea. Yeah, I'm done. Yeah, so that's it in a nutshell. The code and the slides are online, so you can have a look, have a go. Any questions? Yes. I tried quite long actually, so I read five books. I probably took two months and wrote a thousand lines of code or so to decide it wasn't for me. But yeah, I hear quite a few stories like this, which are very similar. I think, you know, it's a language for masochists. A language for masochists. You have to be a brilliant person, right? Well, a brilliant masochist, let's put it that way. You have to be brilliant to keep that all in your head. Yeah, so see. Yeah, you know, I'm not complaining about it, because people who program in Rust, they do better than in C++. So the compiler does help a lot, and I think it does lead to better code. I've given students, you know, work in Rust, and they do write better code because the compiler actually helps you. All right, but it also takes them a long time to get something done. So it depends on what your goals are, right? I mean, if you have to write, you know, perfect software to launch your rocket, you might, you know, want to do it that way. But if it's just like me, you know, in my job, we write mostly throwaway code. It doesn't pay. Bioinformatics, so I'm in science. We have long-lived code, but it's usually by accident, right? So you write something, and people start using it, and then 10 years later, it becomes mainstream. It's actually happened to one of our projects, yeah. It's kind of, yeah. And then, you know, then it's too late to do better. So at one point, you showed us how to convert the Zig file to C, right?
But it wasn't really necessary in order to call the Guile stuff from within Zig? No, I mean, Guile adheres to the C ABI, right? So it has a C calling interface. You can use the SCM object, yeah, so that came from there. But actually, the SCM object is really simple when you look at it. So it could be that, you know, you can just roll your own. Yeah? So you created the language from scratch. So why did you decide to use the semicolons and curly braces in it? Well, I didn't design Zig. I should have been clearer. Nor have I any input on Guile, unlike other people here. But, yeah, you know, I dabble in languages. You know, people often ask me, what is the language of the year? I think at that point it was Scala. I'm embarrassed to say. But, no, I think, you know, Zig does appeal to me. Yeah? The second question is, so if I want to make a performant program, and I do it in Fortran or Julia, why would I use it? So, you know, it's a bit difficult, because they're very different languages, Fortran and Julia. But I think, you know, Zig tries to be in the space of C. Yeah. It's general-purpose systems programming and uncompromising speed. And it is fast. Yes, the compiler compiles itself in about 10 minutes on a standard machine. But I think, for example, it doesn't have exception handling. Yeah, it uses a different approach, which is more like a Maybe monad. You know, in C++, typically you'll have exception handling, where, you know, every time you call a function, it has to carry state with it to be able to unwind the stack. And this causes overhead. That's one thing. Now, with C++, the other thing is that in the background, there's often a lot of memory allocation going on. Yeah, especially as, I mean, it's kind of unavoidable to use the STL these days. That's best practice.
And I find that in Zig, because it's closer to the C philosophy, it's actually much faster. Yeah, so. Oh, hey. Are you planning to write like a tutorial in the Guile manual for how to do this? Yeah, we should. Yeah, we should. I think the next speaker needs to go on. No? How much time do we have? Two minutes, one and a half. Yeah, you can. It's good to switch. Thank you. Thank you. Thank you. I'll switch on. |
Algebraic Effects and Types as First-Class Features in the Fuzion Language
Giving a pure functional solution for non-functional aspects. |
Okay. Thank you. It's great to be here for the first time really in person. I really feel. Yes. You're a regular feature, actually. Oh, so far only online. It was so sad always, always in my small little room. So here I am. I want to talk about Fuzion again, like in the last years. What I want to talk about here is algebraic effects and types, and what has changed in the last year in that respect in the Fuzion language. Short background on me. I did a lot in compilers, mostly in the Java area, for about 30 years. And I also did quite a bit on garbage collection. So I enjoyed the previous talks here quite a lot. But now to the Fuzion language, which is what I'm currently working on. The main idea behind the Fuzion language is that we see more and more languages getting more and more overloaded with more and more features and concepts and new things being introduced. While in Fuzion, we try to reduce this to one single concept, which is a Fuzion feature. And that's basically an abstraction for things like classes, methods, interfaces, and so on. And instead of having the developer choose what to write, whether to write a method or a class or so, we have the compiler, the implementation, make these decisions about what we actually need to implement those features. And also we see that more and more systems become safety-critical. So we need a simple language and we need tools to analyze that simple language to ensure correctness. So that's what we keep in mind when we work on Fuzion. Fuzion is on GitHub. I don't give any tutorial of the language in this talk. This is all online, so I will just basically throw you in the water and let you learn to swim by showing Fuzion code in this talk. But I think you will do fine. Fuzion is backed by a small company, Tokiwa, and now to this talk. No, before I forget, for those who only came for the Fuzion stickers, I have some here, so you can maybe pass them around. Everyone can take one.
So now, what I want to talk about here is basically three points. I start with explaining what algebraic effects are and how they are implemented, or how they are used, in Fuzion. Then I want to talk a bit about types and how types can be used as first-class features and how type values can be used in Fuzion. And then I bring these two together, because that brings a big advantage when working with algebraic effects. So, what do we need effects for? In Fuzion, a feature is a pure function, so there are no side effects, no mutation of data by default. And algebraic effects are used in Fuzion to model all the non-functional effects like state changes, I/O, any thread communication, or mechanisms like exceptions. So, let me first define what algebraic effects are. An algebraic effect is actually a set of operations. You could think of operations like reading data, so performing some I/O, getting the time, or doing something like a panic, so causing an error that stops the application, or logging some data somewhere. Typically, these operations in an algebraic effect model some non-functional aspect that is kind of orthogonal to the actual computation of your function. These operations have basically two things that they can do. They can either resume and produce a result, like a read that is successful, or they can abort, which is like throwing an exception: you just get out of that, you don't get anything from this. And an effect can be implemented by different effect handlers. So, one effect type could have different implementations. And to run code that uses an effect, this code has to be executed while a corresponding handler for that effect is installed. Now, the effects allow static analysis of the code, so that we can analyze what effects any feature has. And we require that for all library code, the full set of effects is documented in the signature of those features.
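The pieces of that definition (operations, handlers that either resume or abort, and the rule that code only runs while a handler is installed) can be imitated with a handler stack. This is a loose Python sketch, not Fuzion and not how Fuzion implements effects: the effect names `"out"` and `"panic"`, and the helpers `install`, `perform` and `Abort`, are assumptions made up for the illustration.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List

# Currently installed handlers, keyed by effect name (hypothetical names).
_HANDLERS: Dict[str, List[Callable[[str], None]]] = {}


@contextmanager
def install(effect: str, handler: Callable[[str], None]) -> Iterator[None]:
    """Install `handler` for `effect` while the body of the `with` block runs."""
    _HANDLERS.setdefault(effect, []).append(handler)
    try:
        yield
    finally:
        _HANDLERS[effect].pop()


def perform(effect: str, arg: str) -> None:
    """Invoke an operation of `effect` on the innermost installed handler."""
    stack = _HANDLERS.get(effect)
    if not stack:
        raise RuntimeError(f"no handler installed for effect {effect!r}")
    stack[-1](arg)


class Abort(Exception):
    """Raised by a handler that aborts instead of resuming."""


def hello_world() -> None:
    perform("out", "hello world!")          # resumes: control returns here
    perform("panic", "stop")                # aborts: the line below never runs
    perform("out", "unreachable")


OUT: List[str] = []
LOUD: List[str] = []


def panic_handler(msg: str) -> None:
    raise Abort(msg)


def run(out_handler: Callable[[str], None]) -> None:
    with install("out", out_handler), install("panic", panic_handler):
        try:
            hello_world()
        except Abort:
            pass


run(OUT.append)
run(lambda s: LOUD.append(s.replace("!", "!!!")))
```

The same `hello_world` code runs unchanged under two different `"out"` handlers, and calling `perform` with no handler installed is an error. The main thing this sketch cannot show is the static side: here a missing handler is only detected at run time, whereas the talk describes it as something the compiler checks.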
So, if effects are used that are not declared or are unexpected, that would cause a compile-time error. I start with a small example of a hello world feature. We use this exclamation mark syntax to mark that this code actually uses or requires an effect. And that is io.out in this case, because the library function say requires that effect. And when I run this code now, of course it prints hello world. And if I analyze the code for the effects that it has, I see that io.out is an effect required by this small example. Now, I want to run the same code, the hello world hasn't changed, using my own handler. So, I have defined a handler here, which is a feature inheriting from can print, and it redefines the print operation to print to io.err instead and to replace the exclamation mark by many exclamation marks. And now to run this, we need to first install this handler as the io.out handler. The io.out here is just a convenience function that installs the handler given as first argument and executes the code given in the lambda as second argument. And when I run this now, of course, I see the printout is the modified string, because we replaced the exclamation mark here. And if I analyze this for effects, I now see this no longer depends on the io.out effect, but on io.err, because we have kind of diverted the code to depend on the other effect. We could also implement a handler that doesn't do anything; then the hello world executed in that environment would not require any effect at all anymore. That much on effects. Now, let me talk about types as first-class features in Fuzion. To make it easier for you to understand what's happening, I first give an example of generics in Java, where I have a generic method here that takes an argument A of any generic type T and prints out the value. Doing that in Fuzion, we have type parameters, which are actually at the same level as value parameters.
So we have a function with two arguments, T, which is a type, and A, which is a value of that type. And this is not just syntactic sugar that makes this look the same; internally, too, the type parameters are argument features just like the value arguments themselves. Of course, we can now, oops, I went a bit further than I wanted. We can now call this function with two different type parameters and two different value arguments and print these two values. That's pretty standard for generic, type-parametric functions. Fuzion uses a lot of type inference. So in such a case, where the types are obvious from the arguments, they don't need to be given. So we have the rule that type parameters always have to precede the value arguments, and they can be left out if they can be inferred. So the code can be written like this. Then next, we could constrain type parameters. So we could say in this case, we want a type that must be numeric. And if we have such a constraint, we could use operations that are only provided by the type, like the plus operator that is defined on numerics. And if we run this now, we can also output the doubled value, still pretty standard for generics. But what we can also do is we can use the type parameter itself and call features that the type parameter provides; every type, for example, provides its name. So we can run this code and print those names. And we can go even further with that. And I want to show you that in an implementation of a feature that calculates the sum of a list of some numeric elements. The implementation of that feature would distinguish the case of an empty list from a list consisting of a head and a tail, where we can recursively calculate the sum. The question now is, what do we do in the case of the empty list? In a language like Java, we would have no way to produce any value here.
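In a language where types are ordinary runtime values, one possible answer to the empty-list question is to ask the type for its zero. The sketch below is in Python rather than Fuzion, and `zero` and `total` are names invented here. It passes the type as a regular first argument, ahead of the value argument, which loosely mirrors type parameters sitting at the same level as value parameters.

```python
from fractions import Fraction
from typing import List, Type, TypeVar

T = TypeVar("T", int, float, Fraction)


def zero(t: Type[T]) -> T:
    """Ask the type itself for its additive identity."""
    return t(0)


def total(t: Type[T], items: List[T]) -> T:
    """Recursive sum: the empty list yields the zero of the element type."""
    if not items:
        return zero(t)
    head, tail = items[0], items[1:]
    return head + total(t, tail)
```

For example, `total(float, [])` gives `0.0` while `total(Fraction, [])` gives `Fraction(0, 1)`, and `float.__name__` plays the role of a type providing its own name. Unlike a statically typed language, nothing here stops a caller from passing a type that does not match the list elements.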
What we can do is we can call a feature that is defined in the type numeric and redefined by all concrete implementations to provide the zero value for that actual numeric type. So numeric itself is defined as a feature with its corresponding type defining zero as an instance of exactly this type. And then something like an i32, a 32-bit integer, defines an implementation of type.zero to return the integer zero. And we can now use that function to print the sums of different lists here. And when I do this, we have a list of floating point values, a list of fractions. We have an empty list of floating point values and an empty list of fractions, so we get the corresponding zero values from the corresponding types. So that much on types. And now coming back to effects, I want to use these types and type parameters to give names to effects or to reference user-defined effects. And I'll give you an example using code that creates a linked ring, so a ring structure. To create a ring structure with references, you need mutation, because at some point you have to close the ring. So that code is mutable, it's not easily pure. So we will see that this depends on the mutate effect. And then we want to reimplement this or extend this to use local mutable state, or local mutability, to make this function pure. So I start with code to create a ring that uses the mutate effect. You don't need to understand the whole code. The important thing is that every element in that ring has a pointer to the next element and there's a reference to the very last, because if you extend the ring, you have to update the next of the last element in the ring to point to the newly added element in the ring. And here, for next, we create a mutable value, which is done by a base library function mut, which uses the mutate effect. And to update the next, we use this arrow operation.
Now we create a small demo, we create a ring with elements A, B, C and then we run 10 times through the ring to print them. So we do that, and we see that it circles around in that ring. But if you now analyze this for effects, what we see is that we have a mutate effect used by the code. There are lots of other effects as well, the out effect, because we print something, but there's also error handling in the library code that shows up here. But what I'm interested in here is that this has the mutate effect, because the code mutates the next element while building the ring. And now we want to get rid of this mutability in the code. I know, I think I'll make it in five minutes. And the way to do this is we define a local instance of the mutate effect. And to do that, I first need a bit more space in the code. And I'll add a type parameter M of type mutate to the code here and also pass this on in calls and in types of ring used in here. And now when we create an instance of such a mutable reference to the next element, we instead take the instance of the mutate effect M, which we got as a parameter here, from the current environment. The syntax we use in Fuzion there is the type, dot env, which is the effect from the environment, then dot and the operation new to create a new mutable variable. And now we can define our own mutate effect. Here mm is the local mutate effect defined here, which is just inheriting from mutate. And nothing follows after the is, so it doesn't do anything special, it is just a new sub-feature inheriting from mutate. So it basically only has the purpose of giving a new name: this is my local mutate here. And now we can pass this sub-instance of mutate, or the subtype of mutate, to the ring here, which means that all the mutable values that are created are created locally to mm, to our own mutate effect. Now we still have to create this effect and execute the code within the context of this effect, and this happens at the bottom here.
So we create an instance of mm and use it to execute the demo code. That means that the M.env.new call here will then take the instance of mm from the current environment to create the new mutable value. When we run the demo, we get the same output. If we analyze it for effects now, we see the same effects, except that mutate is gone, because the mutation is completely local here. So we can write code that, locally, to perform some calculation, creates mutable data and mutates data structures. But the result is a pure function anyway, because the mutation is only done locally. So I'm coming to the end. The status of Fuzion: it is still under development. It's a very experimental language, but the language definition is slowly getting more stable. There's still a lot of ongoing work on the base library. The current implementation has two back ends. One is a very slow interpreter running on the JVM, and the other is a C back end, which uses the Boehm-Demers-Weiser garbage collector, which we just learned about, as its garbage collector right now. I would like to have a precise garbage collector, and there is a lot to add there as well. And basic analysis tools like the one you've seen for the effects here are available. And yes, those who remember Ellie might wonder who is disturbing me now while I'm working on this: it is Felix. That's it, I'm coming to the end. So maybe one more sentence. I hope I could show you that algebraic effects and types as first-class features complement one another pretty well. They help to create code that encapsulates non-functional effects, and that makes it possible to work even with code that is not pure, but to manage this in a nice way. You find some links here to resources related to Fuzion. We are happy about everyone who gets involved. Please have a look. Join us. We are a small team, currently of three, working on this. There should be more. Yeah. Thank you very much. Can I pick?
Yeah. You think you were first? So earlier you said that a particular type can implement numeric, that's the terminology you used, so is numeric an interface? And then you were able to say A plus B. If you didn't say that it implements numeric, what would happen if you tried to compile that program? The question was: if a particular type did not implement numeric and you used the plus in there, what would happen then? You would get a compile-time error. It's completely strict typing. So if you call a function that requires a numeric type parameter, and you call it with, say, a string, where string happens to have a plus, but not the numeric plus, you will get an error that the actual type parameter is not compatible with the type constraint in the called feature, so that will not be accepted. You were converting the value to a string in order to print it. Is the string operation implied to be present for every value? There's a to-string operation in our Any, which is the parent of every feature. So a to-string is always available, yes. It's not very helpful if you don't define anything, because you just ... Just a second, the next speaker is setting up. I'm contributing to a language called Nim, which also has effect types and, more importantly, generic code. One problem that I've seen in combining these features is that when you have generic code, the concrete instantiations may trigger different effects. So how do you approach this, syntactically or semantically, in the language? The question was that actual code can trigger all sorts of different actual effects at runtime. One example I can think of: you could have a function converting an object to a string that maybe performs some logging, and some code printing that value would not expect that. My answer to that is that this must be part of the static analysis. We need to analyze the whole application and see what is happening there.
Library functions can do this to a certain extent, but they cannot predict if you have a dynamic value coming in, what the actual type will be, so we need a whole program analysis in the end there. Do we have time for one more? Yeah, another approach to have pure functional function and side effects is, like Haskell can do this. So I find it really interesting that with this language, it's possible to get rid of the effect of the algebraic effect. Is that like a decision, or do other languages like Unison, for example, also use algebraic effects, also have these features, and isn't that also like kind of, is it like on purpose, or is it like, what are the pros and cons of it? So the question was, in Haskell you have monads, which have a similar role like the effects, but you have them always explicitly, you have to carry them around and mark them. And the answer is here, this is actually, it's on purpose, we don't want to have the hassle of wrapping everything into a monad and carrying it around all the time. So the idea is to get rid of this as much as possible without losing the information you get from the effects. Time's up. Okay. Thank you for all the questions. |
IDP-Z3, a reasoning engine for FO(.)
A truly declarative approach to programming. |
All right, good. So thank you for joining me this afternoon for this presentation. I'm a researcher at KU Leuven, and I'll be talking about our work on a reasoning engine for FO(.). So this morning, I connected to ChatGPT, and I asked him, or it, I don't know: I'm twice the age of my son, who is 15 years younger than me; how old am I? And it started pretty well by saying, let's give a variable a name, and then let's write the formula. But then when it tried to solve the equation, somehow it got lost, and it was not able to find the correct answer. And that shows that ChatGPT is very capable at understanding English, but not so much at reasoning: it does not have the cognitive skills to solve the equation. And so if we are ever to create a machine that can pass the Turing test, of course it needs to be able to handle natural language, but it also has to have the cognitive skills that we humans have, like those listed here. It should be able to learn from others through symbolic communication. It should be able to apply knowledge in new ways to perform new tasks; the capacity to solve an equation, for example, is knowledge that you can apply in many different settings. And it should have other cognitive skills, like the capability to ask relevant questions and to explain its reasoning. We try to implement all these types of skills, and the field of study at the university is called knowledge representation and reasoning. That's what we are working on. Now, one way to have a computer solve a riddle, like the age riddle that I gave earlier, is to program the computer. But there's a big difference between a statement in a program and a statement of knowledge. If, for example, I write in a program that F is equal to M multiplied by A, that's a statement in a procedure: you can compute F if you know M and A.
But that's quite different from what you would see in a physics book, where the second law of motion really means that, given any two of those quantities, you can compute the third one. So you can really reason in multiple directions, unlike with a procedural statement. That's a big difference between a program and knowledge. You might think that Prolog, because it's a so-called declarative language, does not have that problem, but in fact it does. If you have a Prolog program like "you can vote if you are more than 18", it can only compute vote from the fact that you are more than 18 or not. It cannot go the other way around. And that's quite unlike what you would write in a logic statement: you can vote if and only if your age is larger than 18. With that kind of statement, you should be able to infer your age, or at least a minimum value of your age, from your right to vote. So really, Prolog is a programming language. It's not a knowledge representation language. Now, what is programming? It's actually the process of translating the knowledge that we have of a problem into a program that can then, given a problem, solve it. And then if you have another problem, you have to apply that knowledge and convert it into another program, possibly, to solve it. So depending on what kind of information you get as input and what kind of information you want as output, you will have to write different programs. And sometimes these programs will be quite different, which is a pity, because the knowledge that is implicit in the programs is the same. So why do we have to write so many different programs? This process of converting knowledge into a program is, of course, a big industry. It's the IT industry. That's what we live on. But still, there's possibly a better way to do it, and that's what we are working on.
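To make the contrast between a procedural statement and a statement of knowledge concrete, here is a small Python sketch. It is only an illustration, not how a reasoning engine works internally: the function names `second_law`, `can_vote` and `age_bounds` are hypothetical, and the voting rule uses "age larger than 18" exactly as stated in the talk.

```python
from typing import Optional, Tuple


def second_law(f: Optional[float] = None,
               m: Optional[float] = None,
               a: Optional[float] = None) -> dict:
    """Treat F = m * a as knowledge: given any two quantities, derive the third."""
    if sum(x is not None for x in (f, m, a)) != 2:
        raise ValueError("exactly two of f, m, a must be given")
    if f is None:
        f = m * a
    elif m is None:
        m = f / a      # assumes a != 0
    else:
        a = f / m      # assumes m != 0
    return {"f": f, "m": m, "a": a}


def can_vote(age: int) -> bool:
    """Procedural direction: from age to the right to vote."""
    return age > 18


def age_bounds(vote: bool) -> Tuple[int, Optional[int]]:
    """Reverse direction of 'can vote if and only if age > 18':
    returns inclusive (lowest, highest) possible integer age; None means unbounded."""
    return (19, None) if vote else (0, 18)


print(second_law(m=2.0, a=3.0))   # derive f
print(second_law(f=6.0, a=3.0))   # derive m
print(age_bounds(True))           # minimum age implied by the right to vote
```

Note that each direction still had to be written out by hand here. That is precisely the duplication the talk argues a knowledge base plus generic inference should remove.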
If we can represent the knowledge into a knowledge base and then develop some generic inferences depending on the type of problem that you have, then you don't have to rewrite the program every time. You can use a generic inference, give it the knowledge base that you have, the input of your problem, and it will give you the answer. So for example, for the age riddle that I had in the beginning, the generic inference is what we call model search. So we search for a model of the equation, and that's a very generic skill that can be implemented once and then applied in many different ways. So what is knowledge then? Knowledge is a statement of knowledge, it's a statement that is true in all possible worlds, like for example, the second law of motion. It's true in all possible worlds that you can imagine. You can also say that a statement of knowledge is true in all acceptable worlds. That's what the law says, like the regulations, what is acceptable as a behavior. Sometimes a statement of knowledge can be about what you desire the world to be. So it's an expectation, or it can be about a particular situation that you face, and all these are different propositional attitudes. If you are interested in philosophy, you can go and look at Wittgenstein and his book, but that's the idea behind this. And so we have been thinking about what would be a good knowledge representation language, what could be its good attributes, and it should be, it should use symbols that have very simple semantics, like the age of a person, like very simple predicates and functions. It should have statements that are close to natural language, so it should be very easy to express a statement in natural language into a statement in knowledge representation, and vice versa, when you have a representation of knowledge, you should be able to read it very easily into a natural language. And it should be expressive. It should be able to express complex forms of knowledge. 
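As a minimal sketch of the "knowledge base plus generic inference" idea, the age riddle can be solved by a brute-force model search in plain Python. This is an illustration under stated assumptions, not IDP-Z3's algorithm or API: `model_search`, the symbol names `parent` and `son`, and the age range 0 to 120 are all choices made for this example.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, List

World = Dict[str, int]


def model_search(domains: Dict[str, Iterable[int]],
                 knowledge_base: List[Callable[[World], bool]]) -> Iterator[World]:
    """Generic inference: enumerate every world in which all statements hold."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        world = dict(zip(names, values))
        if all(statement(world) for statement in knowledge_base):
            yield world


# Knowledge base for the riddle: "I'm twice the age of my son,
# who is 15 years younger than me."
riddle = [
    lambda w: w["parent"] == 2 * w["son"],
    lambda w: w["son"] == w["parent"] - 15,
]

# Assumed finite domains so that exhaustive search terminates.
domains = {"parent": range(0, 121), "son": range(0, 121)}

models = list(model_search(domains, riddle))
print(models)
```

The same `model_search` function can be reused unchanged with a different list of statements or with extra input facts added to the list, which is the reuse argument made above. A real engine replaces the exhaustive enumeration with a solver.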
So first-order logic is a nice candidate for expressing knowledge. That's why it's one of the basic languages to express scientific knowledge, for example. We do use it in school, for a good purpose. It has indeed some symbols with simple semantics. The statements are close to natural language, but still it is not as expressive as we would like it to be. It has a construct like quantification: you can say for every x something is true, or there is an x such that something is true. But it doesn't have aggregates. It doesn't have inductive definitions, which are complex ways of explaining how you would compute the value of some elements. And so we introduce a language called FO(.), the dot representing a list of extensions. It's first-order logic extended with types, or so-called sorts in the literature, with definitions or inductive definitions, and with arithmetic so that you can do some computations. It's at this point still limited to linear arithmetic, so you cannot do transcendental functions like sine and cosine. It supports aggregates, like counting the cardinality of elements that satisfy some conditions, as well as the minimum or maximum, et cetera. And it has some more advanced features, like dealing with partial functions, which are functions that are undefined for some values of their arguments, and intensional objects. You can look at the documentation if you want to have more details. Maybe I can give you some examples of statements. The first one would be for a regulation about COVID, and you could read it like this: if you want to do an activity of outdoor sports, then you have to finish it before 8 p.m. in the evening, and then either you have a mask or you have a COVID-safe ticket. So the hat is the logic symbol for "and", and the V is the logic symbol for "or". The second statement would be for the organization of an event like this one, or for the planning of a course.
For every course provided by the university, the number of students that attend the course should be less than the capacity of the room of the course. That's really a very simple statement that can then be used in the search for a correct planning. The bottom example is an example of a definition, where you have rules that can be applied. It looks a little bit like a Prolog statement, and it can be used to define the tax rate for selling a house, for example. And so at KU Leuven, we are developing these technologies. I mentioned FO(.), which is one of the two knowledge representation languages. This is the more powerful one. cDMN is a table-like way to introduce decision tables. It can be used by business analysts, I would say, and it's a little bit simpler than FO(.). Then IDP-Z3 is the reasoning engine that can use that knowledge base, as well as some inputs, to perform some reasoning tasks. And on top of that, we have developed the IC, the Interactive Consultant, as we call it. So it's a little bit like a machine that can pass a Turing test, but without the natural language capabilities; it can reason like a Turing test machine would need to do. And these parts are really generic. So once it is developed, it can be reused very, very easily. So IDP-Z3 is the reasoning engine that has these capabilities. It can answer questions like: is this possible, according to the knowledge base of the possible worlds that we give it? That's called model checking. You can ask it what would be a possible world, again according to the knowledge base. That's model generation, or model search, or model expansion. What is relevant? What information should I get to be able to construct a model of the knowledge base?
What are the consequences of some partial information that I have about a situation? I have some information about the situation that I face. What are the consequences of that? It can then give you some explanations about those consequences. Why this is a consequence? So it can explain its own reasoning. And it can also do some kind of optimization. Again, you can look at the website to have some more information there. So the reasoning engine is hosted in a Python program. So it's a Python program that will tell which inference to, to, to, to perform. And so it's easily downloadable from the, the Python package index. And so let me talk about the interactive consultants. Let's say that you have some challenge to engineer a design that meets some customer requirements. Well, to address that challenge, we develop a novel class of applets that can perform various forms of reasoning in, in the domain of expertise. And that will help the engineer finds proper design. So how does that work? So you have the requirements that come from, from the outside, from the environment, from, from, from the customer. So you have a set of requirements. The engineer will then interact with the application to enter the requirements that he knows. As well as some tentative decisions that he thinks would be proper design. And then in return, the system will ask him some additional question. And really, you, you, you should know exactly what is the property of this material or what is the expected operating temperature of the, of the system. So it's, it's starting to have a conversation with the engineer. It will tell him what are the prerequisites of his tentative decisions. So if he says, oh, I'd like to use steel, well, the system says, okay, but then you need to have, I don't know, some kind of pressure that is not higher than, than something else, whatever. It will tell him the consequences of his, of his decisions. 
It will be able to give some explanations and then do some optimization of the design. So it's really a consultant that will help the engineer come to a solution, and all that with some proprietary expertise of the domain of interest that will be used to do the reasoning. If I have time, I'll give you a quick demo; maybe I'll go through the slides first. So we are developing that in partnership with some industrial partners. One is a big multinational that prefers to stay private. I have five minutes left. Thank you. But we are also working with Siemens and with Flanders Make, which is a research lab for the industry in Flanders. We are also active in the banking sector, as well as with notaries. And the idea is to reduce the decision time of some experts. For example, at Flanders Make, it was to reduce the time to select a glue to glue two materials together. Typically they had to go through data sheets to find the proper glue. With the tool, they can now do it in less than five minutes. And the development cost was quite low. This is an example with the big multinational company. They had to design custom industrial components. The expertise of doing that was in the heads of the experienced engineers. But they wanted to empower the younger engineers. And so we formalized the knowledge of the experts into the system. And then with the Interactive Consultant, the younger engineers can play around with the different options and find a proper design that is right the first time. And the fact of formalizing the knowledge in the system really makes that knowledge an asset of the company that can then be managed like any other asset. And the organization becomes a learning organization. So why do we do this now? Why is it possible now and not before? That's because there are new solvers that are capable of making those types of reasoning.
It's big progress in the artificial intelligence world, but it's a little bit less well known than neural networks and so on. But it is quite nice progress, and we try to put that into practice. And also, we are getting a new understanding of what knowledge is and how to use it. And that's why this is an interesting area. Let me go back to the demo. So you can go to the IDP-Z3, ooh, I don't have internet. I did have it before. Yeah, sorry about that. So I won't be able to show it, but that's the end of my presentation. If there are any questions, I'll be happy to answer. Thank you. Yes. Can you explain how your approach differs from the classical expert systems? What's the next step? So yeah, the question is, if I understand well, how is this different from expert systems, right? Because expert systems were quite popular back in the 80s and 90s, and this looks very much the same. But the thing is that expert systems were very much like Prolog. They could make inferences in one direction, but they could not reason in any other direction. So they could not reason with partial information, for example. Well, here, even with partial information, it's capable of coming up with some conclusions. So it's very different, programming from knowledge. Yeah. Yeah. So in the use case you presented, you said that you took the expert knowledge. How do you formalize it so that it can be used by the reasoning engine? Yeah. So the representation of the data is in FO(.)? Yes. The representation of the knowledge is in FO(.). Yes. Okay. What's the process of transferring the data? We're just discussing with experts. Right. Yeah. So for the moment it's done by a knowledge engineer, as we call it, who talks with the experts and who looks in the data sheets and then formalizes it. It's like a programmer if you want, but for knowledge.
So he has to, to, to formalize the knowledge into the, the, the formal language. Okay. And one more follow-up. So data representation is something like a decision tree or something? The, the representation is what I showed earlier. These are statements in logic, like the one I showed. Like this, this one's here. Let's, let's go ahead and show. So it would be statements like these that look very much like statement in English, but they would use some kind of formal. So the knowledge base is really just a text file. Okay. Steven? Yeah. Maybe one more. You mentioned humor. You mentioned humor. Yes. It's not capable of that yet, but we are working on it. All right. All right. Yeah. One more. Yeah. Was it a lot of work to customize Z3 to work with this? To work with this? Yeah. It was some, some work. Yes. So the question is, we use Z3 as a back end for the reasoning engine. How much work did it require to build to the reasoning engine, engine that we, we have? Of course, Z3 is already quite capable. So it's very, very useful. On top of that, we have new language constructs like aggregates that Z3 does not have. Natively. And we have also some additional inference or reasoning capabilities on top of it. Like the term, determination of what is the relevant question in the particular case. So, yeah. The system capable of recognizing or even pointing out some coherent specifications in the knowledge base. Could you speak a bit louder? I don't hear you. Yes, indeed. Right. Yeah. Yeah. So the question is, what happens is the knowledge base have conflicting statements in it, because then it cannot reason correctly. And actually, we have an inference that will try to extract the minimum set of instances of statements in it. That will make the, that makes the knowledge base inconsistent. And then from that inconsistent set of statements, we can try, the knowledge engineer can try to resolve it. Has that been used in practice? It is used, yes, indeed. 
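The inconsistency inference mentioned in this answer can be sketched in plain Python as a brute-force search for a smallest set of statements that cannot all hold together. This is only an illustration of the idea; IDP-Z3's actual mechanism is not shown in the talk, and `satisfiable`, `minimal_inconsistent_subset` and the example statements are hypothetical names chosen for this sketch.

```python
from itertools import combinations, product
from typing import Callable, Dict, Iterable, List, Optional, Set

World = Dict[str, object]


def satisfiable(domains: Dict[str, Iterable],
                statements: List[Callable[[World], bool]]) -> bool:
    """True if at least one world satisfies every statement."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        world = dict(zip(names, values))
        if all(s(world) for s in statements):
            return True
    return False


def minimal_inconsistent_subset(domains: Dict[str, Iterable],
                                kb: Dict[str, Callable[[World], bool]]
                                ) -> Optional[Set[str]]:
    """Return the names of a smallest unsatisfiable group of statements,
    or None if the whole knowledge base is consistent."""
    names = list(kb)
    for size in range(1, len(names) + 1):        # smallest subsets first
        for subset in combinations(names, size):
            if not satisfiable(domains, [kb[n] for n in subset]):
                return set(subset)
    return None


domains = {"age": range(0, 121), "can_vote": [False, True]}
kb = {
    "vote_iff_adult": lambda w: w["can_vote"] == (w["age"] > 18),
    "votes": lambda w: w["can_vote"],
    "is_child": lambda w: w["age"] < 12,
    "age_non_negative": lambda w: w["age"] >= 0,
}

conflict = minimal_inconsistent_subset(domains, kb)
print(conflict)
```

The returned names point the knowledge engineer at the statements to revisit; `age_non_negative` is left out because it plays no part in the conflict.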
Has that helped in the specific case? In that specific case, I don't remember exactly. I couldn't say, but it could be. I don't know if. Yeah, this should come in. No, no, come on. And your picture is a big form. I thought you were expecting the general picture. There's still one question. If you could stay quiet, please. So the question is whether this could have applications in the medical field, where you need to reason with probabilities: if there is this set of symptoms, there's such-and-such a probability that this could happen. At this point, we are not focusing on this type of reasoning with probabilities. There's another group at KU Leuven that has developed a logic called probabilistic logic, which is an evolution of Prolog and goes in that direction. But this one does not do it. All right, thank you very much. |
LuaRocks and the challenges of minimalism |
it's green now okay now it's fine okay again sorry for those online it was muted the whole time yeah so let's follow through consistency like the design must be consistent or when worse is better the design must not be overly inconsistent right and and he also starts to look about priorities and we'll get to that like in the list world in the view of the right thing consistency is as important as correctness right whereas in the unique style worse is better well consistency can be sacrificed for simplicity in some cases but if you have to do that it's actually better to drop the feature all together if it's not that important right so that so it's better to drop the parts that are that deal with less common circumstances rather to introduce more complexity or inconsistency right so that's one way to keep it consistent is like not add more inconsistent stuff right if that was supposed to break down with the simplicity like and finally the the completeness both of them want the design to cover as many important situations as practical or reasonably expected cases should be covered right but in the right thing way of thinking completeness and consistency are what matter the most so that simplicity is not allowed to reduce completeness whereas in the unix world the unique way of thinking Lord in worse is better it can be sacrificed in favor of any other quality right consistency can be sacrificed to get completeness if not at the cost of simplicity right so these are these are different ways of working that are concerned with the same things and they are similar on the surface but as you go deep they are very different right and both of them work like they're both proven and they both built very successful software right so but what we can get out of this like what does it matter for us okay it's interesting that in the 90s someone wrote about this and identified this two ways of thinking but what matters for us in our everyday development is that there is no one right 
way to do like minimalistic software there are in the decision space in the design space there are forks in the road where you will be faced with decisions like okay I want to make this software clean and minimal and nice but okay do what do it this way or do it do it that way right there will be forks in the road and I'm gonna talk about all the ways that I got that wrong right so we will see what happens when things go wrong even if you have your best intentions of making clear and nice and simple minimalistic software right so bit of context I've been involved as like Andy said like many of us said we're all getting older like I've been involved with the Lua community for Lua programming language for like almost like 20 years now and yeah here's me at like the first Lua workshop back in 2005 like and giving a talk on a piece of software that I wrote so I was talking about how nice it was and like now 20 years later I talk about how terrible this after I write it right so yeah that's called maturing right so yeah so so yeah so this is Lua is like an example of like a very nice like minimalistic language like the whole source distribution is like 360k right and that includes like the bytecode compiler the VM the standard library like an amazing C API which is like stack based and like it's a triumph of like the power of simplicity and all that and the these three guys here who are is this and this they are like the creators of Lua and they have said like time and again how influenced they were by scheme right so but if you look at the surface it's like an all-goal style syntax it's like no parentheses and all of that right but like in essence you know because there are very much in that spirit of the right thing like it's it's it's very much in that mindset right but then when you get minimalistic systems like that and minimalistic languages right you end up with that batteries not included effect because that's how you get like such a simple and small thing right 
But when you have to interact with the real world, the minimalistic system is like the tip of the iceberg, because you have to deal with XML parsing and JSON and various protocols and all of those things. So you have to build on top of that little minimalistic base. How do you deal with that? We go modular. The idea is that instead of having that big monster system with every capability under the sun, which you see in some environments, you build the smallest system that you can. So enter LuaRocks, which is a package manager for modules for the Lua programming language. When I first got involved with that, Lua had no package manager, and this was a project that we started around 2006, so it's been around 15 years that I've been maintaining this project. And I understand if many people here don't like language-specific package managers. I know, because wearing my other hat I'm a Linux distro maintainer, and I know the pain that they can be. I have extra slides about that at the end if it becomes a topic, but let's agree that for many situations they are a necessary evil. So the story of making LuaRocks started with some sets of challenges, because Lua was designed to be an embedded language. People run it embedded: you have an nginx server, as we do at Kong, where we have Lua VMs embedded into it, or you have games, tons of games, which is where Lua first got really popular, where you have Lua scripting embedded in them. So the idea is that you take that small language, you put it inside your large application, you hook one to the other via the C API, and then your scripts get the functionality of the host application. You can write scripts that control your game or handle your requests and all of that stuff. But what ended up happening is that at the
time we started LuaRocks (and again, I think I'm using Andy's royal "we" in this case) there was no one Lua community; there were several scattered Lua communities, right? Because most of the applications were like this, were embedded applications, and even though there were some modules here and there that you could use, there was no convenient way to use them all together and build pure Lua applications. So one of the goals was not just to create the package manager, but also to kind of create that community of usage of modules. So we kind of had a point to prove with the system. So, to try to prove that point right from the beginning, I decided: okay, let's make LuaRocks, the package manager, into a pure Lua application of the kind that I want to show is possible once you have an ecosystem of modules, so we can do the proverbial dogfooding. So we set out from the beginning with that idea, and that proved challenging, because how do you make a system based on packages and modules when you don't have a package manager yet? You get into that "who watches the watchmen" situation: who packages the packages for the package manager? So then, of course, the solution is bootstrapping. Open source developers like to show their product's worth by dogfooding, and compiler and programming language people like to show the worth of their systems by bootstrapping, by building the system in itself. So that's what we did. Generally what a package manager does is a bunch of file system operations: moving files around, calling other compilers, building stuff, and putting things in the right places. So we had a library of file system abstractions and we made an
initial bootstrap version of that library which would have zero dependencies it would like because like core luon like has shell out command like a system can do like a CLI call right so we could so we made a library in terms of those operations that that was like the bootstrap version right so that we can that we could then build the rest on top right as I said who watches the watchman so we ended up with these two versions like the bootstrapping version and and the dog footing version and so this one could run with zero dependencies but then once you had the ability to install additional modules this one ran all of the file system operations in process so it ran a much faster right so because of deployment into systems and new operating systems that did not have any support yet we ended up having to maintain both of them in the code base right so there you start notice that things are getting smelly right like it's no longer that minimalistic anymore when you have like these two different implementations of right off your core libraries around plus the real world keeps getting in the way and once you have those little like Unix CLI abstractions often you start to have like to put little additions because some of the OS's are different right and then as time goes by you have to make an entire different implementation in order to support windows as well right so yeah so things are things are not getting like looking any nicer right yeah plus over time people start asking like this is a package manager you told us that low rocks was a pure low application why can't I like upgrade it in itself you know just run low rocks install low rocks and I looked at it and then I thought about like the two different bootstrapping situations and all of the complications that that would happen and after a while like I caved in I said okay let's try to do this we've done this for a while it worked if you had the exact same set of circumstances in your system you know for in order 
for that to work, because LuaRocks can be installed in a home folder without permissions, as a system-wide package with permissions, installed by the package manager, installed by yourself from sources, or on Windows, where you cannot modify files that are currently open, unlike in Unix. So over time we ended up dropping this, and it is no longer supported: you actually have to install LuaRocks by some other means. But we really tried. Another point where the idea of minimalism kind of backfired was the definition of the scope of the project itself. I was coming from the experience of being a maintainer for a Linux distro and writing the package manager for that distro as well, so I had a bunch of experience in package management and the development of that area. So I decided from the beginning: this is going to be a package manager for Lua modules specifically. Lua modules can be written either in Lua, as .lua files which contain source code, or they can be C dynamic libraries, like .so files or .dll files, which are C programs that use the Lua C API and are built into a library that can be loaded by Lua at runtime. And I limited the scope and said, this is it, because I didn't want people to start making crazy hacks with it. I knew that once people started doing crazy hacks with it, they would start stepping on the toes of the operating system package manager, even more than language package managers already do. I've had people ask me things like, "Oh, I have some Lua stuff that integrates with a JVM; I would love to distribute my Java files and classes with LuaRocks." I've heard all sorts of things like that. So at first I thought the decision was right. But then there were things like, "Okay, I'm doing web applications and I need assets like CSS," things that actually have a
legitimate case of being distributed with that, or "I'm making a game and my graphics need to come along with it." And at first I'd say, well, if you're making a whole application, you can use LuaRocks for your modules and then you have to package it some other way. That would make sense if you're thinking with the hat of an OS package manager. But in the end people would just end up doing crazy hacks, like renaming their files to .lua to please LuaRocks so that it would install them, and then loading them by other means, and things like that. And I myself was guilty of that at some point: I made a Lua module that could read the LuaRocks metadata and find out where LuaRocks installs its documentation files, and you could put other files in there, and things like that. So I should have put up a Jeff Goldblum slide saying "life finds a way." Maybe you can add that in post. So, the next story, I think this is the third story. Lua keeps its simplicity by having this model of "mechanisms, not policies." People ask for features, and the Lua authors do not give them the features; they give them the mechanisms with which you can actually write the feature. So instead of having a debugger, Lua has a debug API, where you can call functions and get information about the debug state, and people have used that to write debuggers and IDEs and all sorts of things. It sounded to me like a smart way of keeping things minimal and bounded, while being very principled about how they do it, you know, the right way. So in LuaRocks I kind of tried to follow the same path: when in doubt, make it extensible, for various things. For example, how to download code from the internet in
order to make your packages. Nowadays the whole version control wars are done and Git has won, right? But at the beginning, around 2006, you would have Subversion and Mercurial and Fossil and Darcs and all of the other ones that no one remembers anymore. So I came up with the whole thing where, depending on what you used in the protocol, if you put git:// or if you put, say, foobar, it would try to load the foobar module and see if we had hooks in order to download stuff, and things like that. The side effect is that now we have modules in the code base for version control systems that I'm sure no one has used for over 10 years, and they're still there in the code base as extensions. Extensible build types: another aspect of that herd of cats that was the Lua community, and the various subcommunities around it, was that before LuaRocks existed, everyone was building their module their own way. Some people used CMake, some people used make, some people just put the code out there and said, compile it yourself, you figure it out. Some people used configure with Autotools, and all of that. And I didn't want to annoy anyone, because the goal really was to get a community formed around it and not to segregate anyone. So we did the same thing with the build types. But this time I kind of followed my Unix-y itch and went full "worse is better" on it. In addition to all of those extensible build types, I created one that I just called "builtin," in a total lack of creativity, which was kind of like make, you know, just a list of things to do and things to compile, but without any sort of dependency resolution or anything. It was super dumb: it just goes through the list one by one, calls the compiler,
gets things done. And recently, a few years ago, I ran the numbers on the whole rocks repository, thousands and thousands of packages, and about 80% of them were using the builtin mode, because the most common case is that your Lua module is just a couple of Lua files, or it's a single C file. So that one really was a success, in that sense. And the last example is configuration. This is also not supposed to be read; it's just supposed to show how big the response is when you run luarocks config. At first, because everyone was using their own little ways of running Lua, I made every single path and subpath and folder name configurable, like, do you want your includes to be in a folder called include, or do you want your bin to be in a folder called bin? So nothing is ever hard-coded in there. End result: I've barely seen people customizing all those deep configurations. Plus, people tell me that the configuration is super complicated, and sometimes people who could maybe solve their problem by changing those things just can't find which is the proper entry to fix, because there are so many options and they're so very similar, like runtime external deps subdirectories and runtime external deps patterns, because maybe on a different operating system it doesn't end with .so, and things like that. So I kind of overshot in making it too configurable. So by now you must be convinced that LuaRocks is the worst piece of software in the history of mankind, or at least the worst package manager out there. But if you think about it, on paper LuaRocks is the minimalist's dream package manager, right? It has zero dependencies, it dogfoods its own optional dependencies, it has a well-defined scope in terms of the files that it manages,
its internal APIs are built on a minimal base, yet it's extensible. So what went wrong? I guess in the end what happened is that it also ended up becoming the kind of thing that minimalists despise the most, because it ended up being a large system that tries to be all things to all people. We said, let's make it extensible, then people started extending it, so it started growing in all sorts of directions. So why did we get to that point? Two things. One was the realization that reducing complexity is different from shifting complexity around. You say, well, I made a minimal base and now this is extensible, but then all of the extensions were bundled. So I still have code around for maintaining all of those old deprecated version control systems and things like that, because they were all still part of the core application. And if you were to use a third-party extension, it sort of felt like a second-class citizen: you would have to add an additional build dependency for it, and things like that. And second, the world is dynamic. Things change. When we talk about design, people often think about design as a blueprint, but a blueprint is something that you make before you start building the house. So people tend to see design as an initial step: you have to design, then you have to build. But the world keeps changing, the requirements keep changing, stuff keeps changing, and you keep accommodating those changes. So you have to think about how you're going to keep your minimalism over time, because it's easy to start small and then it spirals out of control. So that got me thinking a lot, and this is the final section here, about minimalistic software
maintenance. Because it's nice to think about this nice, small piece of software as something still in time, or at the beginning, but you have to think about how you make that work over time and keep those principles. And the way to do that is by setting boundaries. That's one thing that the "worse is better" approach, which puts simplicity at the front, does well, because it lets you say, oh, we're going to drop a feature if it reduces simplicity, or we're going to make it a little inconsistent as long as we keep everything simple, because simple is more easily maintainable. Things can be complete and consistent and end up being super complicated. One example within the realm of the "right thing" philosophy, I think, is that the Lua language itself favors simplicity over compatibility, because they do put consistency and completeness in first place. The "right thing" style of managing that growing complexity is by changing your definition of complete, or limiting your definition of complete. If you want to go for completeness and still be minimal, you have to put a bound on your definition of complete. So the way they did it was: okay, we have to make changes, so what are we going to take out in order to keep that going? So basically, in Lua, every major version, which is every 5.x version, 5.1, 5.2, 5.3, is essentially a different version of the language, in the sense that even source code gets incompatible, because they make changes that break things, as long as they keep the design consistent and complete within their definition of completeness. And when we think about it, the easiest way to go is the Unix
approach, which is: if you just follow the way of keeping simplicity, then you do get that long-term maintainability. And even in the original essay, Richard Gabriel said: "I have intentionally caricatured the worse-is-better philosophy to convince you that it is obviously a bad philosophy and that the New Jersey approach," that is, the Bell Labs Unix approach, "is a bad approach. However, I believe that worse-is-better, even in its strawman form," which is that caricature of simplicity being more important than correctness and things like that, "has better survival characteristics than the-right-thing." And it's interesting, because "survival characteristics" gives you the sense of something happening over time, right? You keep surviving. So here he is indirectly talking about maintenance, about the long-term success of the project, not something like, oh, I dreamed it up one day and I came up with Lisp and wrote a paper and that's it. Okay, what happens with Lisp over time? And so he said that the New Jersey approach, when used for software, is a better approach than the MIT approach. Later he wrote another paper called "Worse Is Better Is Worse," because he was still trying to advocate for the Lisp approach. But things can get out of hand, if you remember that Richard Gabriel was involved with Common Lisp, which is not exactly minimalistic. So that's what can end up happening. So what are the lessons learned? What about LuaRocks, what can we do about it after all this time? There are things that I'm thinking about doing. This whole talk is the foreshadowing of the changes that I plan to make to it. And this is the last talk, so we can relax a little bit. So
yeah, those are the last couple of slides here. Just to say: we can still provide zero dependencies for users if we vendor in our bootstrap dependencies, so we don't have to deal with bootstrapping anymore. Because now we know, 15 years later, that the base set of dependencies really doesn't change over time: we need a socket library, we need an SSL library, just a couple of little things. We can still make it zero dependencies for users, and we can manage a small set of libraries that we need to maintain ourselves. So that whole set of bootstrapping code can be done away with. We could simplify the scope, so that people don't have to do hacks in order to use assets and things like that. And that goes back to the whole "right thing" versus "worse is better" discussion, to the concern with implementation versus interface. I was very concerned with presenting a clean interface, where you only deal with Lua modules, but the internal implementation of LuaRocks is a general-purpose package manager that has to deal with moving files around in a completely general manner. So we could just expose that to users and trust their ability to not mess up the files: if they say, I'm going to put them in this folder, then all right, we'll just put them there for them. And the other thing is that the idea of the extensible minimal base is a good one, but it should be a minimal base that is extensible, not already extended. And the way to do this is that the extensions need to feel like first-class citizens, just like the rest of the code, and not have things like, okay, if you use this build type you don't have to pull in anything else, but for this one you do. If you put everything on an even playing field, then you can keep your base small. So in the end, yeah, it's still about
simplicity, correctness, completeness, and consistency. But I would say, as we get older and start learning those lessons about how time matters, that you need to think about simplicity over time, correctness over time, completeness over time, consistency over time. Now I'm over time, so thank you. Do you have any questions, or should we go for the drinks? We have questions. Okay. What do you mean by that? A moment where I was very overwhelmed by amazingness? Yes, yes. So yeah, thank you for allowing me to talk about the good parts. To me, there was a turning point. I think LuaRocks started in 2006, and I was hired to do a Lua consulting project around 2013, so that was quite a few years afterwards. There I had the experience of using LuaRocks as a user with a deadline, who actually had to put something together that solved a real-world problem for the customer. I needed protocol buffers and lots of things from the real world. I was so used to having to come up with everything on my own in Lua, and then: okay, luarocks install lua protobuf. Well, it actually works. Okay, now I need XML. And I pulled in five, six, seven, eight different packages and actually got my problem solved in a way that was much faster. And I felt like, yeah, okay, this is a reality. You only believe it when you use it, right? And sometimes the people who make the software are not the people using it, so you don't actually get to experience that. But yes, I think the part about building an ecosystem was a success in the end. We actually have an ecosystem now. Any other questions? Over there, yes. So that's the slide I prepared for this topic. So, language-based package managers and operating system package managers. Whenever people are reticent about the idea of language package managers, they go, oh, why don't we instead
just package it all as "insert my favorite package style": why don't we just do deb files for everything, or why don't we just do Guix for everything, why don't we just do RPM for everything, why don't we just do whatever macOS uses now? So, okay, this alone would sort of answer it. But then again, there's another argument, of scale. If you think about how many packages Debian alone, which is one of the largest distributions, has: last I checked, a couple of years ago, it was something like 60,000 packages. It must be higher now. JavaScript, okay, JavaScript is the worst example: JavaScript is over a million packages, so let's call that too much. But take something in the middle, like RubyGems, which will easily have over a hundred thousand, or maybe two hundred thousand by now. So the set of modules of a language ecosystem would easily overwhelm the whole package management capacity of the community of that distribution. So if you do the math, the matrix of every operating system and Linux distro versus every language out there doesn't add up. So you kind of have to have language module managers on one side and system managers on the other. And okay, this was the funniest slide. Okay, this is the actual slide. We actually wrote a paper on this problem, which was rejected many, many times, because you have operating system conferences and you have programming language conferences, and package management is always everyone else's problem. And I was there, living at that crossroads. The other two guys, Lucas from IBM Research and Michael from Victoria University in New Zealand, were both developers of GoboLinux, which was the Linux distro that we were developing together. And Michael is a
researcher in a programming language department as well, and Lucas works with operating systems, and I'm sort of in between. So it took a while. This was at PLOS, a rare programming languages and operating systems workshop that happened within SOSP, which is an operating systems conference, and at last it got accepted, because we came up with a taxonomy identifying the different types of package managers and what they get right and wrong. Because yeah, they do step on each other's toes a lot, and this paper has some recommendations on how not to, because LuaRocks was very careful not to step on toes. And now we're going to be kicked out of the room. Let's go out and have some beers. Thank you. |
Reviving Reverse Polish Lisp
Building an open-source HP48-like calculator |
Hello, my name is Christophe de Dinechin, and today I'm going to talk about DB48X, which is my attempt at resurrecting Reverse Polish Lisp, or RPL, on a modern open-source platform. So, first of all, what are we going to talk about? Why revive RPL, and why should you care? I'm going to give a demo of what DB48X looks like, both on a simulator and on real hardware. The idea is to have a handy calculator that does tons of stuff for you. Then I'm going to give a brief history of RPL and of pocket calculators, notably from HP. I'm going to talk about free software on modern calculators, and, since this is really the topic of this track, about the very minimalist aspects of calculators, even today. Here's a quick feature overview. I'm going to develop this a little with various examples, and I'm also going to explain how this works inside, and notably talk about a very compact C++ object model with garbage collection. So, let's start by running DB48X. I'm going to give two demos, one running on the simulator, and the second one on SwissMicros DM42 hardware. The demo on the simulator can be found online. You can scan this QR code to see the details of the demo. The point here is that we have a development environment that simulates at least the user-space portion of the application software, and lets us automatically perform a variety of tests that are difficult to do on the physical calculator. Now, on real hardware, you have a USB interface. You simply load a DM42 program through it, and that's essentially it. When you exit, you have your new environment, and you can switch back and forth between this and, for instance, the standard DM42 software. So, RPL has a command line where you type, for instance, numbers, and you hit Enter, and you see that the numbers go on a stack that is used for operations.
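To make the stack behaviour concrete, here is a minimal Python sketch of an RPN-style stack. It is not DB48X code: the function name `rpn_eval` and the token format are made up for illustration, and it only assumes that numbers are pushed and that a binary operator consumes the two lowest stack levels and pushes the result.

```python
import operator

# Binary operators take stack levels 2 and 1 and push one result.
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def rpn_eval(tokens, stack=None):
    """Evaluate RPN tokens on a stack (hypothetical helper, not DB48X's API)."""
    stack = [] if stack is None else stack
    for token in tokens:
        if token in BINARY_OPS:
            level1 = stack.pop()  # most recently entered value
            level2 = stack.pop()  # value entered before it
            stack.append(BINARY_OPS[token](level2, level1))
        else:
            stack.append(float(token))  # numbers simply go on the stack
    return stack


if __name__ == "__main__":
    print(rpn_eval("1 2 +".split()))
    # A tiny "program" made of the tokens 1 and +, applied to an existing stack:
    print(rpn_eval(["1", "+"], [41.0]))
```

Running the tokens `1 +` against an existing stack mirrors the increment program from the demo: each run adds one to whatever is on level one.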
When you want to add numbers or subtract them, you essentially operate on the last two items on the stack and push the result there. Operations like scientific computations operate exactly the same way. They operate on the top of the stack, level one. In terms of programming, RPL has these special brackets that identify an RPL program, and I push the program on the stack. Then I'm going to give it a name. So "1 +" increments the first level of the stack, and then to execute the program, I simply hit the soft key, and you see that every time I push it, I increment the number on the stack. So, fairly simple. There is a built-in Markdown help if you hit a key and keep it held. So, here we are seeing the help for that function, and then we can explore the hyperlinks. It's all Markdown based, so we can reuse the same help for the GitHub repository and for the calculator itself. Now, let me give you a very brief history of RPL. So, again, RPL stands for Reverse Polish Lisp, and it's an interactive language for calculators that has been used by Hewlett-Packard from 1984 until 2015. RPL has an ancestor called RPN, Reverse Polish Notation, and this started with the HP35 scientific calculator, which introduced this system that saves keystrokes and is very similar to the way you think about the computations you're doing. It also leads, naturally, to a step-by-step programming model where you simply record keystrokes and the calculator replays these keystrokes. That was introduced with the HP65, which was really a marvel for the day. The little slots you see on the side, for instance, are for card readers. So, you can actually store your software on little tapes. The last real complete RPN system was the HP41. I'm talking about a system because it could be connected to a variety of expansions. There was a bus, you could connect it to printers, to plotters, to data acquisition tools, and so on.
But there were later machines, released after RPL was introduced, that still used RPN. So, essentially, the main difference has been a fixed size for the stack and no real type system. And the high end of that series is the HP42. Now, RPL itself was introduced with the HP28C in 1984 or five. That machine had only two kilobytes of RAM, so that tells you this can run on a very, very small system. It also introduced a new Hewlett-Packard CPU called the Saturn that I'm going to talk about in a moment. The historical series culminated with the HP48 series, and that had equations, larger graphics, and was extensible. So, you had slots where you could put memory, ROM cards, etc. And then there were follow-ups like the 49, etc. that were not very different from the 48. That series was recreated with the HP50G and other calculators like the 38, the 48GII, etc. that are essentially running the original software designed for the Saturn CPU under emulation, with an ARM CPU that emulates the Saturn. And that gives you a significant boost in speed, and essentially it executes exactly the same software. Now, because the ARM CPU itself is much, much faster than the Saturn, a number of folks started developing software for it. And this series of calculators was based on somewhat standard platforms that could be flashed. And so, people developed open-source software and free software to replace the built-in firmware. An example shown here is newRPL, which is an ARM-native implementation of RPL that is relatively complete as far as the language itself goes, but is missing a number of features from the original calculator, including graphics, equation editor, etc. Now, how does RPL work inside? It's very interesting because it's a very smart, minimalist system. So, first of all, it's optimized for the HP Saturn CPU, which is a descendant of CPUs built for earlier calculators. And that's a four-bit CPU with 64-bit registers designed mostly for floating point.
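As a rough illustration of what a 64-bit register made of four-bit nibbles holding a BCD floating-point value looks like, here is a small Python sketch. The nibble positions used for the exponent, mantissa, and sign fields are an assumption based on commonly published descriptions of this CPU, not something stated in the talk, and the helper names are hypothetical.

```python
# ASSUMED field layout, numbering nibbles from the least significant one:
# exponent in nibbles 0-2, mantissa in nibbles 3-14, sign in nibble 15.
X_FIELD = (0, 2)
M_FIELD = (3, 14)
S_FIELD = (15, 15)


def field(register, bounds):
    """Extract nibbles first..last (inclusive) from a 64-bit register value."""
    first, last = bounds
    width_bits = 4 * (last - first + 1)
    return (register >> (4 * first)) & ((1 << width_bits) - 1)


def nibbles(value, count):
    """Split a value into `count` four-bit nibbles, most significant first."""
    return [(value >> (4 * i)) & 0xF for i in reversed(range(count))]


# In BCD each nibble holds one decimal digit, so the hex digits read as decimal:
# sign 0, mantissa digits 150000000000, exponent 002, i.e. 1.5 x 10^2.
register = 0x0150000000000002

if __name__ == "__main__":
    print("sign    :", field(register, S_FIELD))
    print("mantissa:", nibbles(field(register, M_FIELD), 12))
    print("exponent:", nibbles(field(register, X_FIELD), 3))
    # A 20-bit address is five nibbles:
    print("address :", nibbles(0x12345, 5))
```

The point of the sketch is only that a register is sixteen decimal digits wide and that sub-fields of it can be addressed separately, which is what makes the design convenient for decimal floating point.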
And so, you have four-bit nibbles that you can address individually in memory. Addresses are 20 bits, that's five nibbles. And the 64 bits in a register can be addressed in a variety of ways that correspond to a BCD representation of floating-point numbers. So, for instance, the X field is for the exponent, the M for the mantissa, the S for the sign. So, there are a number of pieces of free software and free calculator firmware that can run on ARM-based calculators, and that later led to platforms developed specifically to run this kind of software. In terms of available platforms, if you go beyond the HP calculators: first of all, the ARM-based HP calculators can be flashed. So, even a lowly HP 20-something can be given new firmware and a new life. You have an example here with something called WP34S, which creates a very advanced scientific calculator from a very inexpensive HP calculator. And there are also a number of free emulators for iOS, Android, etc. What you see here is a 42 emulator called Free42. And SwissMicros essentially started building hardware around this software. So, they created the DM42, which runs a variant of Free42 with some underlying firmware to provide operating-system-level services. And so, they have this platform, and that same platform, just with a firmware flash and changes to the keyboard, can emulate the HP 42, the HP 41, etc. Now, third-party firmware has started sprouting like mushrooms, but for really large and advanced firmware there are not that many variants. What you see here is the descendant of WP34S, which is called WP43S and has a number of really advanced features, but it's essentially still in the same spirit as the RPN calculators. In other words, it's still using the RPN logic with a fixed-size stack and not much in terms of typing. So, my first attempt to enter that space was to port newRPL to the DM42.
And so, I created a simulator, and you can see the results of this experiment there with a side-by-side setup where you have the DM42 on the left, the HP 50G simulator in the middle, and the HP Prime simulator on the right. And essentially, my work was to try to make the software more portable, support one-bit graphics on the DM42, and really take advantage of that platform. And on the simulator, it worked pretty well. The problem is, as I said, this machine is really minimalist, and it turns out that as soon as I started trying to run newRPL on the physical hardware, it just did not fit. Why? Because the platform is built around an ultra-low-power ARM Cortex-M4F, which has, among other benefits, a battery life on a battery like this of up to three years, according to the vendor. Now, that machine has only 96K of RAM, and only 70K free after the operating system loads. How much is 96K? Well, that's essentially one Commodore 64 and a half. And the Commodore 64 is not exactly yesterday's machine. There's only two megabytes of flash available. So again, in terms of old stuff, what remains free once you have loaded standard libraries and the floating-point emulation library from Intel, et cetera, is about 700K. So that's about the same size as an original Macintosh floppy disk. So my conclusion with these numbers is that I had better restart from scratch to create a firmware that was redesigned to fit in such a small system. So how does that work? Well, first of all, I wanted to use C++, a modern language with templates and various library utilities, et cetera. But I needed to have garbage collection for the objects, just like the original RPL, and a very, very minimal memory usage. Let's start with the features that are implemented today. And that's essentially based off the command set of the whole series from the HP48SX to the HP50G. The Intel floating-point library that ships with the platform gives me 34 decimal places for floating-point numbers.
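Thirty-four significant decimal digits is the precision of the IEEE 754 decimal128 format. As a rough illustration of what that precision looks like, here is a short Python sketch that uses the standard `decimal` module as a stand-in; it is not the Intel library used on the calculator, and results from the two may differ in the last digit for some functions.

```python
from decimal import Decimal, getcontext

# 34 significant digits, the precision mentioned in the talk.
getcontext().prec = 34

# The exponential of one, rounded to the context precision.
e = Decimal(1).exp()

# A simple division shows where the 34-digit rounding happens.
one_third = Decimal(1) / Decimal(3)

if __name__ == "__main__":
    print(e)
    print(one_third)
    print(len(e.as_tuple().digits), "significant digits")
```

Python's `decimal` module has no arctangent, so the sketch only covers the exponential; it is meant to show the precision, not to reproduce the calculator's function set.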
So you see e and pi here with the number of digits that were computed by running the exponential of one and four times the arctangent of one. My application software on top of that also supports large integers, like the HP50G, as well as based numbers that today can be in hexadecimal, decimal, octal, or binary. And I plan to support arbitrary bases between 2 and 36 in a later firmware revision. You see here how these numbers are entered in the machine with the hash sign at the beginning. And then when you put this hash sign on the command line, the cursor shifts to be like "binary" or "base." And then I can enter the numbers directly, and the first row of letters changes directly to let me enter numbers more practically. So let me show you that live. So I bring up the calculator. I click on shift, base. And you see that I have the hash sign here. And I can say #12A, and #22E, plus. And I have my hexadecimal conversion here. So as I said, RPL has a number of data types that include text, lists, and arrays. The lists are between braces. The arrays are between square brackets. And the text is between quotes. You see a program there on level two that takes the hello string, the world string, then does a plus. And when you evaluate that program, you get hello world. You also have programs and algebraic expressions. So I just showed what a program looks like. But you can also have algebraic expressions written the usual way. You see here, for instance, square root of x plus one. There is a plethora of scientific functions. The catalog in the HP48 series lists something like 1700 functions total. A little less on some other models, but that's the order of what you have. I also already implemented a storage mechanism for persistent values, so variables, directories, et cetera. And so what you see here is a three-level menu where when you hit the key, you evaluate what is inside the variable.
When you shift, you go to the second level in the menu, and that recalls the content of the variable. And if you shift twice, then you go to the third level of the menu and you store something in the variable. So again, I can show that live. I'm going to store the result I just had. So execute is for execute equation. I'm going to call that B and I do store, sorry, enter, store. Then if I go to the recall menu, that shows me the variables, and you see my B here. If I just evaluate B, I have the number I had. If I shift B, I recall the value. And if I want to store something else in B, I shift twice, hit that key, and now B is 12. So as you can see, the system works already at that level. In order to really have something efficient on such a small machine, I had to design a custom object model. I based it on RPL itself, the historical RPL, but I tried to make it much more compact. For instance, I use LEB128 to store all the objects in memory. LEB128 is the system used, for instance, in DWARF, that encodes integers with seven bits per byte, where the last byte in the series has its high bit clear and the others have it set. The type, that is the first byte or LEB128 value, is an index into the handler table used for evaluation. So instead of using direct addresses like in RPL, I use an index. That means I can have 128 one-byte types or commands, and 16384 fit in two bytes. As a reminder, in RPL that was 2.5 bytes, five nibbles, for each type. So I'm saving a little here. So you see here the catalog on the HP 450, I think. Let me compare and contrast the storage of something like the number one. To be precise, it's the internal number one on the HP48. The HP48 has no real user integers, whereas DB48X has. So when you type one, the most compact storage you have for... sorry, that's actually three, I got that wrong... on the HP48. So the value that you see here, that's the prefix.
And so the 02911 is the address of the evaluation handler for integers. And three, which should actually be one, is the payload. The storage in LEB128 is 14, that's the index for integer types, and 01 is the actual value. Because the high bit is not set, that stops here and we're done. If you look at how the text ABC is stored, the prefix on the HP48 is 02A2C. So that's the five-nibble address. Then you have the total size, and then you have the ABC encoding itself. Whereas for DB48X, you have the type, which is two, then you have the length, again encoded as LEB128. Because it's less than 128, it uses only one byte. And then I have the data itself after that. The name ABC has exactly the same encoding, except that the prefix is not the same. For DB48X, the type shifts from two to 1C. The types themselves change with every build, by the way. So that means the evaluation loop is extremely simple. Essentially, the way this works, and you can see the code here, is that for each object, you compute its size, skip to the next one, and then call the handler and evaluate it. So evaluating a program in DB48X is extremely fast. And there is a fast, simple copying garbage collector. The picture that was supposed to illustrate that was promptly garbage collected as well. So what is the improvement over existing HP calculators? Well, moving from a 4-bit to a 32-bit CPU means that it's much, much faster on various tests like loops, et cetera, between one, two, or three orders of magnitude faster. Scientific computations are even faster. There is a high-resolution monochrome e-ink display. That means that when you switch off the calculator, it keeps the picture that you display there. And so we have these fancy off images that you can use. So let me show you some examples here. You see, this is one off image. And if I shift, off, then I'm going to see another image.
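As an aside on the LEB128 object storage described above, here is a minimal sketch of the encoding and of a type-indexed evaluation loop. It is written in Python purely for illustration and is not the DB48X C++ code. The function names and the handler table are hypothetical, and the type indices 0x14 (integer) and 0x02 (text) are the values quoted in the talk, assumed here to be hexadecimal; the talk notes they change with every build.

```python
# Illustrative sketch only: names are hypothetical, not DB48X's actual API.

def leb128_encode(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (7 bits per byte)."""
    if value < 0:
        raise ValueError("unsigned LEB128 only encodes non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F              # take the low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)      # more bytes follow: high bit set
        else:
            out.append(byte)             # last byte: high bit clear
            return bytes(out)


def leb128_decode(data: bytes, offset: int = 0):
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:              # high bit clear: this was the last byte
            return result, offset


# Type indices as quoted in the talk (assumed hexadecimal, build-dependent).
TYPE_INTEGER = 0x14
TYPE_TEXT = 0x02


def encode_integer(n: int) -> bytes:
    """Type index followed by the LEB128 payload."""
    return leb128_encode(TYPE_INTEGER) + leb128_encode(n)


def encode_text(s: str) -> bytes:
    """Type index, LEB128 length, then the raw bytes."""
    payload = s.encode("utf-8")
    return leb128_encode(TYPE_TEXT) + leb128_encode(len(payload)) + payload


def read_integer(data: bytes, offset: int):
    return leb128_decode(data, offset)


def read_text(data: bytes, offset: int):
    length, offset = leb128_decode(data, offset)
    return data[offset:offset + length].decode("utf-8"), offset + length


# The type index selects a handler, instead of storing a direct address.
HANDLERS = {TYPE_INTEGER: read_integer, TYPE_TEXT: read_text}


def decode_objects(data: bytes) -> list:
    """Walk a sequence of objects: read the type, call its handler, move on."""
    offset = 0
    values = []
    while offset < len(data):
        type_index, offset = leb128_decode(data, offset)
        value, offset = HANDLERS[type_index](data, offset)
        values.append(value)
    return values
```

With these definitions, `encode_integer(1)` yields the two bytes 14 01 from the talk, and `encode_text("ABC")` yields 02 03 followed by the three characters.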
And again, because it's e-ink, it doesn't consume any energy. There are three rows for the softkey menu system. That's an improvement compared to the original HP calculators. Because of the high-resolution display, we can display the functions associated with the base function, shift, and double shift. And as I pointed out earlier, the highlighted portion in black moves as you hit the shift key. So let me show that again. You see that if I hit the shift key once, then I get to recalling the value B. And if I hit it twice, then I move there, and then back to the original location. There is a command catalog and auto-completion. That's better shown than explained, so let me go back to my demo system here. Let's say that I hit the shift key and I hold it: I shift to alpha mode. Now I'm going to type, for instance, A. And we are going to see nothing, because I was still in the recall menu. I hit plus. And you see that now I have auto-completion at the bottom with the various commands that begin with A. There is a plus here, and you might wonder why the plus is here. It's because it also takes the name add, and add contains an A. And I have ABS, for instance. Now I can do ABS, and I have evaluated ABS of 12 directly. So that's pretty neat. That's a good way to quickly access a very, very large number of functions. And it's optimized for the original DM42 key layout. I paid a lot of attention to this. For instance, I showed earlier how, when you type execute, which executes a command on the DM42, there is no real equivalent in the RPL model. So instead, I retranslate that as execute equation. And that does something that is very frequent in RPL, which is to have a symbolic value for something. You can also see that the cursor is changing depending on what I'm doing. For instance, here it's A for algebraic. And it's white because I'm in alpha mode. If I leave alpha mode, it's going to turn black.
But I'm still in algebraic mode. As for the arrow keys, I have only two, where the HP48 has four. So on the command line, up and down move left and right. It's an acquired taste. There is also no real run/stop for programs, so R/S is instead translated as eval: it evaluates the value that you have. And as I said, there is this Markdown-based online help. You saw that in the video, but we can show it live now. For instance, if I hit sin and I hold it, then it's going to show the online help there. You see that there is this home button. So I can go to home, and then I can go down and select, for instance, the first entry there. And I'm going to jump to help, and that explains how the help system works. So there is a lot that remains to be done. The future plans include support for complex numbers, which are not implemented yet. Vector and matrix arithmetic, which is integral to the HP48 RPL variant and also existed in the HP 28, et cetera. That's a relatively complex set of things, in particular because I would like to do it like newRPL does: newRPL supports matrices with symbolic values in them. So you can have a matrix with an X in there, ask for the determinant of that matrix, and you're going to get the result. Whether I can fit that in the available space is unclear. As I said, there are about 1500 functions that remain to be implemented in some way, including variants. For instance, the sin function for sine, the cosine function, all the trigonometric functions are implemented for real numbers, but they are not implemented for complex numbers yet, or for other data types. So there is some work that remains to be done even on existing functions. Plotting and graphing is a key feature of these calculators, so I'd like to have that. The HP50G is quite advanced in that respect, and getting to the point where we have feature parity is going to take a lot of time. So that's essentially what I had to show.
I hope that you found this interesting, and I'm really welcoming contributors if you want to take a look at how this works inside, if you want to help me add many of the new features, or even if it were only to write or extend the online help. Any kind of help is really welcome. And that's about it. Thanks a lot for listening. Now it's time for questions, and the questions will be live, and I'll have a calculator available if you want to play with it. So we should be live now. We have only 30 seconds left. So how do the fonts work on these calculators? Is it possible to load custom fonts for different scripts? Okay, so there are two parts to this question. The first one is the fonts themselves and the second one is non-Latin scripts. In terms of fonts, there were multiple formats that I tried. The current firmware supports two formats that I call dense and sparse. The sparse format is more efficient for large fonts that have a lot of empty space, and the dense format is more compact for smaller fonts. For instance, if you have five or eight pixels of height for very small fonts, then practically all pixels inside are used, and so you have a denser format for that. So that's for the representation of fonts. All the representations cover the 16-bit range of Unicode, and so they do include most of the non-Latin characters. So we do cover an arbitrary range of non-Latin characters. What the system lacks at the moment is that it doesn't know how to do combining glyphs and it doesn't know how to do right-to-left rendering. Those are a bit complex, and they are not implemented in the firmware at the moment. And in the GitHub repository, there is a tool that I wrote that lets you convert any TTF font to use as a font in the system. The font that I used is derived from an open source font, I forgot what the name is, and I changed a few glyphs inside just to make them look better on the screen.
So you can look at the GitHub history and you'll see that I tried a dozen fonts until I found one that I thought would look good. Okay, thank you. Thanks for speaking at FOSDEM, Christophe. I will catch you later, I'll move on to the next talk now. You can hang out in this room if people want to come and chat with you. This is a breakout room just for this talk. Yep, thanks a lot. Yeah, bye. |
An Introduction to Guix Home
Declarative $HOME configuration with Scheme! |
I'm David Wilson from the System Crafters YouTube channel, and I'm here today to give a talk called An Introduction to Guix Home. Here are some links where you can find me online. Definitely subscribe to the System Crafters channel on YouTube or Odysee if you're interested in learning more about GNU Guix, GNU Emacs, and other related tools, especially if you're interested in learning more about Guix Home, because I will be making more videos about it this year. And definitely check me out on the Fediverse, I'm on fosstodon.org at daviwil. So in this talk, I'm going to show you how to manage your user-level configuration, often called your dotfiles, in Scheme, using a futuristic package manager and system configuration tool called GNU Guix. When you start to care more about the configuration of the applications that you use for projects or for your day-to-day work, you inevitably have to find a way to store those configuration files so that, number one, you don't lose them, and number two, you can use them on more than one machine, or after you have to do a reinstallation because your machine got destroyed somehow, which tends to happen. So it's pretty common for people to use tools to sync their configuration files across machines, like Git or some other file synchronization tool, and then place those files in their home directory using GNU Stow or maybe even a bespoke shell script. But there are some subtle problems with this approach. First of all, how do you reliably install the software that your configuration depends on? There are probably a number of tools that need to be installed before you can actually start using your configuration.
So maybe you have a shell script that sets those things up. But then what happens if you start using your system's package manager to install more programs and you forget to add those to your installation script? The next time you have to install your config somewhere else, you just don't remember which programs you had installed to begin with. Definitely something I've had happen before. Second, how do you customize your configuration files for each of the machines that you use without it all becoming a mess? One thing that I've run into many times is that if I have my dotfiles stored in a Git repository, on each of the machines I use I have subtle tweaks, like font settings or DPI settings, that I keep locally and never check in, because there's not a good way to delineate those settings between different machines. So that does become a problem whenever you have multiple machines sharing the same dotfiles repository or shared configuration. Also, how do you fix your configuration after syncing a broken or half-committed change? Maybe you are in the middle of making some major changes to your configuration, then you sync it to another machine and start using it, and you realize that either your Emacs configuration doesn't load all the way or your shell doesn't work right anymore. Things like that. How do you fix that whenever it happens? It's definitely something that you have to consider when sharing your configuration across multiple machines. So I'm here to tell you today that Guix is the answer to this problem. Yes, I can say that. Blanket, across the board. I'm being a little bit sarcastic. If this is your first time hearing about Guix, or perhaps you haven't experimented with it yet, here's a quick primer. Guix is a functional package manager and declarative system configuration tool written in Guile Scheme. That probably sounds a little bit vague, or maybe not too easy to understand.
So more plainly, Guix manages the installation and configuration of the software that you use in a highly repeatable and resilient way. First of all, both the software that you install and your configuration are installed in a transactional way, which means no broken half upgrades. Whenever Guix installs an update to your configuration on your machine, it builds it all and only applies it when it knows that the build of the configuration is successful and that all the applications can be installed successfully. That's great, because it means that you don't end up with a half-upgraded program, and if your configuration files are not going to be applied correctly, then at least they won't get written out to your system. Also, every update that you make to your configuration is remembered, and you can roll back. Whenever you install updates to your configuration with Guix, they are installed as something called generations: each time you update your configuration, the previous configuration is still saved, so that in case of any problems you can always roll back to it very easily with a single command at the command line. It's really, really helpful. It has saved me a number of times. When you use Guix to manage your system or user-level configuration, you write the configuration as Scheme code using a declarative format. In other words, you declare what your system should look like, and then Guix makes it so. It's really nice to have this capability because it means that your configuration becomes pretty readable: it's code, but it also looks like a document. And you can also use the power of Scheme to automate certain parts of your config where you need to. So if you're a Lisp enthusiast like I am, it's really cool to be able to use Scheme to configure your system.
You can gain these benefits either by installing Guix on your Linux distribution of choice or by using the Guix System distribution to manage your entire machine. But for the purpose of learning Guix or Guix Home, I would definitely recommend installing it on your existing Linux distribution before you try to jump into installing the full Guix System, because it can be quite challenging. However, I do have videos on the System Crafters YouTube channel that go through the entire process, both for installing Guix the package manager and the entire system. So if you want to learn more about that, definitely check out the videos that I made for those. You can learn more about Guix by reading the reference manual, which is really good and really thorough, and the official website for the project. I've also made a number of videos about Guix on my channel. If you want the links to the homepage and reference manual, they're here in the slides, which will be available on the page for this talk, along with the link to my playlist for the Guix videos that I've made so far. I think I've made about five or six videos, so there's a lot of useful stuff in there in case you've never tried Guix yet. So let's talk about Guix Home. Guix Home is the feature of Guix which enables you to apply a complete configuration to your home folder for managing your user-level programs and services. For instance, it enables you to configure important things like your shell, whether you use Bash or Z shell, maybe fish; programs that you use regularly, like maybe Emacs or other things for your desktop environment; and also background services like Syncthing or many other programs that you may want to run as a user-level service, not a system-level service. Maybe it's something that's only useful for you as a user.
It's much more powerful than other dotfiles management programs like chezmoi, rcm, or yadm, because it also installs the software that you need for your configuration, since it knows which packages are needed for the different parts of the configuration that you're using. It also enables you to easily roll back to previous working configurations, which is a big deal whenever you're sharing your configuration across multiple machines. There's always a chance that something can go wrong, or maybe you install a newer version of a program; even programs you install can be rolled back to previous versions. So there are lots of ways where this rollback functionality can really help you whenever you're trying to manage your system with Guix. You can find the documentation for Guix Home specifically in the Guix Reference Manual at this link. Guix Home is actually a new feature of Guix that just got released as part of Guix 1.4, which came out maybe a month ago, but it's always evolving. So this is a link to the standard manual, but you would probably also want to take a look at the development manual for the latest details on configuring Guix Home. To get started with Guix Home, the first thing you need to do is install Guix. And like I mentioned before, I have a couple of videos that explain how to install Guix. Definitely check out either of these to get started. Installing Guix on your Linux distribution is the one you probably should use if you've never used Guix before, because it will allow you to just install the guix command in your existing distro, and the demo I'll do today is actually using Guix inside of Ubuntu. So that is certainly one way to do that. And you can easily get started with Guix Home by generating a home configuration from your existing home folder with the following command: guix home import, and then a folder path where it will write out this new home configuration for you.
What it will do is take a look at your home folder. If it notices that you're using Bash as your primary shell, it will look at all of your Bash configuration files and then create a section in your Guix Home config based on them. Also, if you've already been using Guix to install packages on your machine, it will take a look at all the installed packages that you have and put them together into a list in your home configuration. I have a VM here, which is running, I believe, Ubuntu 22.04, and I've already installed Guix. It's actually kind of nice: in Ubuntu 22.04 and also the latest versions of Debian, I think Debian Bullseye, you can install Guix from the apt repository. So this is another easy way to get Guix installed. But I have Guix installed here. So what I'm going to do is run guix, if I can type, guix home import home-configuration, or home-config, let's put it that way. And it really quickly just looked over my home folder and my Guix profile and created a configuration file for me. So I'm going to pull up Emacs so we can take a look at what it created. In fact, let's just look at the folder first. So home-config, let's do ls -al because I know there are some other files in here. In this folder, what it actually did is it copied the .bash_logout and .bashrc files that were installed originally by Ubuntu in my home folder. And then it also created this home configuration file. So now I can use emacs -nw and look at the home-config/home-configuration.scm file. Like I said before, this is all written in Scheme, and this is normal Guile Scheme code. If you've ever looked at Guile before, it's a very nice Scheme implementation. This use-modules section just pulls in the relevant modules from Guix that we need for writing our home config. And then we have this one home-environment expression at the top.
This is a special syntax provided by Guix which lets you describe your home environment, which will be applied to your home folder. It's made out of, I guess the two constituent parts are the... well, I keep hitting escape because of the configuration I'm used to. Anyway, it has two parts, packages and services. So here it's got the packages list. And you can see there's a list of two packages, htop and Emacs, and that's because I had already used guix install to install those packages after I set up Guix on this machine. If you've never used Guix before, this list may be empty, but if you've already been using Guix, you will see the packages you already have installed listed here. Then there's the services list. This is the more interesting part of the configuration, because it is where the various parts or programs that you use get configured. Usually these sections all have their own special configuration types. Here we have one service called home-bash-service-type, and it has its own home-bash-configuration. This configuration has its own fields, like the aliases that you want to set in your Bash configuration, your .bashrc file, your .bash_logout file, et cetera. We have a list of aliases here already, and that's because the guix home import command scanned over our existing files, found all the aliases that were there, and added them to this list. One thing I'm going to do here, though: this alert alias that gets added to the Ubuntu profile by default seems to result in some buggy behavior. So I'm just going to delete that one really quick and pull this line up. But you can see that it did put some things in here already. It's very easy for you to go ahead and add more aliases if you want to. Also, for the .bashrc and .bash_logout files, it has a local-file reference to the files that it copied over for us from our home directory.
I definitely recommend keeping these files in your home config if you're using an existing Linux distribution like Ubuntu, because sometimes those profile files contain important things, and leaving them out might make Guix Home break your desktop session. So make sure that you keep whatever is put here from the existing Bash files. So that's a quick look at what a home configuration looks like. I definitely recommend taking a look at the Guix reference manual for more in-depth detail about what all this stuff means. The last thing I'll say is that there are a number of services that already exist in Guix that you can try out, different programs and features that you may want to pull into your configuration. I'll show you a way to find more of those in just a little bit. One more important thing to mention here: this concept of a service does not correlate directly to a background service or a daemon that's installed. A service is actually a combination of different configuration aspects, for instance, the programs that need to be installed for a particular feature and the actual configuration files that need to be written out to the machine. It could be background services as well, but an individual service can provide multiple things to your configuration, not just a background service. We can try out this configuration right now without actually harming our home folder by using the following command: guix home container, and then the path to the home-configuration.scm file that guix home import had written out for us. The nice thing about this is that it allows you to take a look at what files and programs Guix Home will apply to your home folder before you ever actually apply them.
You can just go into a shell environment and explore what files are there and what programs are there, to make sure that the configuration you're going to write to your machine is actually what you expect it to be. So let's jump into the folder really quickly, or sorry, jump into the VM, and we'll give that a shot. I'm going to run guix home container home-config/home-configuration.scm. This may take just a second because it has to take a look at that configuration file and build everything up, but it seems to be done already. So now we are in this container environment. The only way you can really tell that the shell is different is that the original shell has a color prompt and this one doesn't. Not really a big deal. But if we use ls -al, we can see now that there are a number of files here that are symbolic links to paths under /gnu/store. /gnu/store is the store location for all of the files and programs that Guix installs for your system. It's just a huge folder with a bunch of folders that have hashes for the first part of the path and then the name of the file or folder that is being placed there. This is part of what makes Guix a functional package manager. For every file or program it needs to install, it creates a derivation, which creates an output that goes into this store. Any time you update your configuration, if you've changed the configuration file or maybe updated a program that you're using, a new folder gets produced in /gnu/store, and then your actual configuration will be linked to the newer version of the folder in /gnu/store relevant to the configuration file or program that you're using. It's a little bit hard to understand without more detail, which I'm not going to go into right now, but I will make a lot more videos about this, and the GNU Guix reference manual does a pretty good job of explaining it.
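To illustrate the store idea described above, here is a deliberately simplified Python sketch, not Guix's actual algorithm. It assumes a store path is just a hash prefix plus a name, derived here from the file contents; real Guix computes hashes from derivations and their inputs and uses a different encoding. The function names `store_path` and `build_home` are hypothetical.

```python
import hashlib

STORE_ROOT = "/gnu/store"


def store_path(name: str, contents: bytes) -> str:
    """Return a hash-prefixed store path for the given contents.

    Simplification: real Guix hashes the derivation, not the output bytes.
    """
    digest = hashlib.sha256(contents).hexdigest()[:32]
    return f"{STORE_ROOT}/{digest}-{name}"


def build_home(files: dict) -> dict:
    """Map each file in the home folder to the store path it would link to."""
    return {
        name: store_path(name.lstrip("."), contents)
        for name, contents in files.items()
    }


# Changing a file's contents yields a new store path; the old one is untouched,
# which is what makes switching back to a previous version cheap.
old_home = build_home({".bashrc": b"alias ll='ls -l'\n"})
new_home = build_home({".bashrc": b"alias ll='ls -al'\n"})
```

In this model the home folder only ever holds links, and an update swaps which store entry each link points at.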
So take a look at that too if you want to learn more. But we can see that we do have the .bash_logout and .bashrc files, which we mentioned directly in our home configuration. We also have .bash_profile, which is being placed there by something else, which we can take a look at in a bit. We have a .guix-home folder, which is related to our home profile in /gnu/store. That's not something you really need to deal with directly, but we can take a look at that too. And then a .profile file. There are also some other things, probably in the .config folder, if I do ls -al. Yeah, there's a lot. .config also gets generated. So lots of things are being generated by Guix Home, and we can take a look at what all those files have in them. Let's look at what's in the .bashrc file. The .bashrc file has basically the same contents as it had before, but I believe some of the stuff at the very beginning is actually being generated by Guix Home. All these early aliases here, I believe these are all coming from the aliases that we have in our Guix Home configuration. And the rest of this comes from the file that was originally there in our home directory. The other interesting thing to note about these files is that they're actually read-only. If I open up Emacs for .bashrc, let's see if this works. Okay, let's use Control-X, Control-R, .bashrc. Now I can't actually edit this file. Yes, it says it's read-only. And that's another interesting thing about Guix Home: whenever you apply a configuration with Guix Home to your home folder, any of the files that are placed there are read-only, because they are meant to be immutable, in the sense that you should only be able to make changes to those files by using Guix Home to do it. If we look at the .guix-home folder, and I believe it's profile/bin, you can see what programs get installed. So these are the only programs that have been installed.
We have Emacs, and we have some other things that are just basic things needed for fonts, the desktop environment, et cetera, and Bash as well. This list obviously will be longer if you have more programs installed, but you can see that it did set up all these programs in your home path. With a Guix Home setup, these will just be accessible like any other program on your system. Okay, now that we've taken a look at our configuration applied to this container environment, and we've made sure that everything we see there is what we expected to see, we can actually apply this to our home folder using the guix home reconfigure command, just pointing it at that same home-configuration.scm file. Once we do that, we're going to end up seeing a .guix-home folder in our home directory, especially since this is the first time we're going to run guix home reconfigure on this machine. It will store your profile information there, and you will be able to use Guix Home from that point forward. So let's jump back into our VM and run this command: guix home re... oops, actually, let's get out of the container environment first. guix home reconfigure home-config/home-configuration.scm. This will apply the same profile that we had just used in the container environment, and it doesn't take any extra time because it's already been built. Now we can take a look at our own home folder to see that our .bashrc, .bash_profile, and .bash_logout files have been linked into /gnu/store. We also have this new .guix-home folder that I mentioned before. .guix-profile was the original profile from when I was installing Emacs and htop myself, before I started using Guix Home. So that will also still be there, but all the Guix Home related things are in this .guix-home folder. And the .profile file is there as well.
So what happens to my old files? When you do this the first time, you already had that .bashrc file, .bash_profile, .bash_logout. What happens to those files? If you're applying a Guix Home configuration for the first time, you will probably overwrite existing files in your home directory, specifically those Bash files. But don't panic: Guix Home actually makes a backup of any file that it replaces. This happens any time you run Guix Home. Whenever a file exists where Guix Home would be placing a file, it will create a folder for you in your home directory with a timestamp at the beginning, which is the time when the profile was applied, followed by guix-home-legacy-configs-backup. If we go look at that folder with ls -al, let's see, that's the one, we can see there are these .bash_logout, .bashrc, and .profile files that it has saved for you, so that if Guix has broken your configuration and you really need to get your original files back, you can go find them in one of these folders. So it's definitely nice to know that you won't lose things that you have painstakingly put together on your system, or things that might cause your configuration to break once you start using Guix Home. Also, what if I made a change that broke my configuration? Well, Guix was built to solve this problem, and Guix Home is no different. You can run the following commands to inspect your old home configuration generations. Like I said before, a generation is one point in time of your configuration, or one execution of guix home reconfigure. You use guix home list-generations to list all the generations of your configuration, then guix home switch-generation to go to a specific generation number, and you can also just use guix home roll-back to go back to the previous one.
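The generation model just described can be pictured with a small Python sketch. This is a toy illustration under stated assumptions, not how Guix stores generations: the class `Generations` and its methods are hypothetical names that loosely mirror the reconfigure, list-generations, switch-generation, and roll-back commands, and a configuration is reduced to a plain dictionary.

```python
class Generations:
    """Toy model: every reconfigure is remembered and can be switched back to."""

    def __init__(self):
        self._history = []     # one saved configuration per generation
        self._current = None   # 1-based number of the active generation

    def reconfigure(self, config: dict) -> int:
        """Record a new generation and make it the active one."""
        self._history.append(dict(config))
        self._current = len(self._history)
        return self._current

    def list_generations(self) -> list:
        return list(range(1, len(self._history) + 1))

    def switch_generation(self, number: int) -> None:
        """Activate an existing generation; earlier ones are never deleted."""
        if not 1 <= number <= len(self._history):
            raise ValueError(f"cannot switch to generation {number}")
        self._current = number

    def roll_back(self) -> None:
        """Go back to the generation just before the active one."""
        if self._current is None:
            raise ValueError("no generation to roll back from")
        self.switch_generation(self._current - 1)

    def current_config(self) -> dict:
        return self._history[self._current - 1]


home = Generations()
home.reconfigure({"aliases": {"ll": "ls -l"}})
home.reconfigure({"aliases": {"ll": "ls -l"}, "env": {"EDITOR": "emacsclient"}})
home.roll_back()   # the first configuration is active again
```

With only a single generation recorded, `roll_back` raises an error in this model, since there is no earlier generation to return to.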
So let's just try it out really fast. I don't know if it's going to work the way we hope if we run it after just a single change, but let's see what happens. Let's look at guix home list-generations first. You'll see that there's only one; it says when I installed it, where all the files are, what the channels were, the time, et cetera. If we had run guix home reconfigure other times, this list would be longer. Let's see what happens when I type guix home roll-back. Okay: cannot switch to home environment generation -1. That makes sense, because this is the first one, but we will make another change in just a second, and then you'll be able to see what that looks like. So let's try making a change to our home configuration to see what it does. Let's say we want to set an environment variable in our Bash profile that gets applied to all future sessions. We can do that by adding an environment-variables entry to our home-bash-configuration. I've got an example here where we already have a home-bash-service-type with our home-bash-configuration. We have the aliases part that we've seen already, and we have bashrc, bash-profile and things like that, but we've got this extra thing here called environment-variables. Now, environment-variables is a known field of the home-bash-configuration type, and what it expects to see (actually, this is wrong on the slide) is an alist, or association list, of key-value pairs: the environment variable and the value to be placed there. When we put this here, it should affect our Bash profile by adding more environment variables to that list. Let's jump back over to our VM and give that a try. We're going to jump back into the home configuration file that we started with. I've already typed it out here, so we'll just uncomment it. We have this environment-variables list, and we're going to set our EDITOR to emacsclient. I'll save this file, jump back out to the console, and then go back
to run guix home reconfigure. Oh, actually, before I do that, let's take a look at our .bash_profile file, and you can see that there is no EDITOR environment variable in this list. Now when we run the, let's see, guix home reconfigure command (I know, I just keep scrolling up in this list), it will apply that new configuration change, and then we should be able to look at our .bash_profile file and see that we now have the EDITOR environment variable pointing to emacsclient. And if we were to log in to a new Bash login session... in fact, let's take a look now, it's actually kind of useful: echo $EDITOR. The change does not get applied to the current shell. You have to start a new Bash session before you see this, and it has to be a new login session; otherwise, in environments like Ubuntu, it doesn't load your profile unless you log out and log back in. So let's run bash --login, and now that I've done that, I can use echo $EDITOR and we can see that the emacsclient value is now set correctly. Now, as I was saying before, you can run guix home list-generations, and we see that we now have two generations. I can type guix home roll-back, and if I use cat on .profile, oops, it was .bash_profile, you'll see that the environment variable we added is now gone. So it just rolled back to the old configuration that we had before we added that environment variable, and everything's just fine. So that's great: we can generate a configuration and apply it, but how do you find more services to use? Thankfully, there's a nice command that you can use to find home services that you might be interested in trying, and that's guix home search. For instance, if you want to find services that are relevant to your desktop environment, just run guix home search desktop, and that will give you back all the services that are relevant to the desktop
environment. So if I go back to the shell and type, let's see, once again, guix home search desktop, we can see a few different services here: redshift, dbus, and this xdg-mime-applications. I think that last one you don't use directly, but this redshift one seems interesting. It says: run Redshift, a program that adjusts the color temperature of the display according to the time of day. So how do we use it? It doesn't really tell us much, except that you can go look at a location in the Guix code, but we don't have the repository cloned ourselves, or at least we don't think we do, so we're not sure exactly how to do that. What you can actually do is use the guix home edit command on a particular service, and it will open up the file relating to that service so you can take a look at it. So let's look at this home-redshift service. We're going to use guix home edit home-redshift, and it seems to have opened nano to look at this, which is fine. We can see that it's the home-redshift-service-type and it has an extensions list. We're not going to go into how exactly home services work, but I will make a video about this soon. You can see that it has one service extension, which is the home-shepherd-service-type. Shepherd is a service management tool written in Guile Scheme, and what this is basically telling us is that this redshift service is going to have a background process that gets spawned inside of your profile. You can also tell that there's this default value, which says home-redshift-configuration; that tells us there's a configuration type that's been defined for configuring Redshift. So if we cruise up in this file a little bit, we should be able to see, right here, the define-configuration for home-redshift-configuration. Each field in this configuration tells you what the type of the field should be, like file-like for redshift, and then a documentation string that says what the field
is for. So this is one way to see what fields are available to configure for a given service, but a lot of this also gets generated into the Guix reference manual, so you can take a look at the generated documentation for these services and you might get some more information on how to use them. But what if we wanted to configure the home-redshift service with the latitude and longitude of Brussels, so that Redshift changes the color temperature relative to that location on the planet whenever it gets dark outside? I've gone ahead and typed in the configuration for the home-redshift service so you don't have to sit here and watch me type. Two things to point out. One is that I had to pull in the (gnu home services desktop) module to find this. We can see that with guix home search desktop: it's in the file path gnu/home/services/desktop, which is the module path where this service is defined. That's how I knew I had to pull in the (gnu home services desktop) module. Then we added, to the front of our services list, this home-redshift-service-type with our home-redshift-configuration. I'm telling it I want the location-provider to be manual, because I don't want it to try to use GPS to find my location, and then I give it an explicit latitude and longitude for determining the time of day and when it should change its color temperature. So we can go back into our shell and use guix home reconfigure to apply this new configuration change, and we can see here at the very end: service redshift has been started. If I use the herd status command, that will tell me which Shepherd services are currently running on my system. Oh, it's not letting me do it. Let me do this, actually: bash --login. So, herd status: now it says that we have two services that are started, the root service, which is just part of Shepherd, and the redshift service. We didn't have to tell it what program to
install. We just said we want redshift and we want to give it these configuration values, and it just installed it and started it up for us automatically. Every time you log into your system, that service will be started up again. Once you try Guix Home, you'll quickly find that there aren't a whole lot of home services in the Guix repo just yet. That's because Guix Home is still relatively new, and it needs more user adoption so that we can have more people contributing new services. So please try Guix Home and consider contributing new home services for the programs that you use regularly, because that's the only way that the set of services is going to grow. More information on contributing to Guix can be found in the reference manual; there's a really good section there. Within the next couple of months I'm going to publish a video explaining how to write your own home services, so make sure to subscribe to the System Crafters channel on YouTube or Odysee to be notified when that gets released. Thank you very much for sitting through this talk. I hope you learned something about Guix Home today. I'll be available in the virtual chat for the talk to answer questions, so please feel free to drop by and ask anything there, or find me online; I'll be happy to answer your questions there as well. Thanks so much, and we'll see you next time. Happy hacking! You should be live now. All right, hopefully that was a decent introduction to Guix Home. I don't know if we'll have much time to discuss anything, but if there are any questions I can try to answer them. Yeah, I had a question about how you synchronize the config across machines. Yeah, I definitely use Git for that. I just have a dotfiles repository; I have all my Guix Home configuration files there, and I synchronize this between machines. The nice thing is that Guix Home makes it really easy to apply the same configuration after I've synced them across machines. It's a lot easier
than having to make sure that I have all my files in the right place. If you use something like GNU Stow, you have to make sure that you run GNU Stow every time after you sync. Obviously you have to run guix home every time after you sync too, but at least it's more obvious that you need to do that when using Guix Home. So it's been a lot of fun; I enjoy this approach. Yeah, definitely, I think Guix Home is the truth. Okay, I think it says your talk has ended, so if people have any questions, they might come here and post them in this room. Cool. So thanks for speaking. Yeah, I'll catch you later, next time. Okay, bye. |
Literate Storytelling: Interpreting Syntaxes for Explorers
Demonstration of the use of syntaxes to facilitate the search of information |
Hello, my name is Jonathan McHugh and I am from Icebreaker. I'm going to go through some of the design decisions in my project, as well as express some thoughts regarding knowledge management and the use of abstract symbols as a way of working with projects: to maximise flow, improve interoperability, and reduce the time and cost of expressing concepts, expectations and desires and unleashing them as actions. This is a diagram which I playfully did, which explains some of the facets of it. Before I really got into coding, I had varied experience of the information society from a wider perspective, and one of the things I found very useful for dealing with large information apparatus was to use abstract symbols. I did this for an association for which I was collecting tags from Delicious, which was a social bookmarking system from back in the day. The folksonomy and taxonomy I was collecting were getting so large that it would have been uneconomic to use, for instance, just the lettering of the terms, and it was better to aggregate them. What I did was make use of the non-alphanumeric characters, which are usually on the right-hand side of a keyboard and sometimes at the top as well. I found that this was very good for demarcating things, but it also created more of an impetus regarding how to order things. The gains from, say, doing completions were fantastic, because you weren't cycling through all the letters of a word to get the form you wanted; you could just smash in the term because you'd got an associated character or combination of characters. When I really got into programming, I started out by prioritising things like TeX, the documentation system, and Vim, the text editor, as well as using regular expressions by the
tooling sed as well as AWK. I tried to use the existing characters that I had from the syntax for delineating the terms and definitions which I was trying to collect in terms of programming, but I found that it didn't quite match. The association for which I had previously formed my system was dealing with public affairs, politics and cultural activities, and even though it touched upon technology aspects such as peer-to-peer, it didn't really delve into them; it was really at the kind of white-shirt level in terms of expectations and competences. I realised that I needed something better suited to what happens within a computer environment: what the components are and how they interlace. I spent a lot of time dogfooding, trying to work out a form. So I'm going to go through some of the aspects of this. I purposely chose something very abstract but also accessible. I chose it to be abstract because I was aware of semantic business process management, which is an approach to helping groups from different language communities work with common terms, and for me something abstract seemed quite useful because it would be culturally neutral. I subsequently found that the ambiguity is quite useful, because it allows you to do things quickly without being too bothered. It turns out there are also interesting psychological advantages to breaking things down into a certain number of units. For instance, I was reading that people can roughly put things into seven boxes. So when estimating the number of cows in a field, there's a variance which happens at around seven in terms of people and their expectations, as well as expectations with bitterness, and some people can be
more precise and more accurate, and there's just a kind of deviation once you look at the numbers for recollection and what people can hold and interpret. I fixated on the number six. So in effect I classified the Vim tooling which I was dealing with roughly between 1, 2, 3, 4... well, it was 10, 20, 30, 40, 50, 60, and I subsequently expanded upon that, realising that it needed a bit more subtlety. In effect, the first tier was used to delineate things: 10 for something which was personally dealt with, 20 for documentation, 30 for display, 40 for movement, 50 for environments such as configs, and 60 for external or system-wide things or external tooling. I subsequently adapted this into a two-level layer, with 1q10 to represent one facet and 1q30 to represent another. This played well, but I felt a bit of guilt, because the previous system I had come up with worked very nicely: you could describe things with the non-alphanumerics, and they're quite accessible on the keyboard. After a while my confidence grew, at least in terms of how I was dealing with the annotations, putting subsets of content which are quite similar into a specific folder, with these annotations applied. So for instance I would end up with something looking like this, which was compounded. I started using the letters in the home row, in the middle of the QWERTY keyboard, and it encouraged me to really have things in specialised forms and repositories, and I took the philosophy of really spreading things out. So for instance, if we look at this, this is an example of different, roughly parsing-based activities, and I could do
another form. So HQH would be roughly around parsing, and OQ would be representative of languages and tools. So for instance we could do Lua, and this would give an example of lots of Lua-based points. As you can see, these annotations allow you to really cut through and deal with various points. With regards to the Icebreaker project, I felt this stuff was interesting, but the Icebreaker project was very document-focused, for very good reasons. My Icebreaker project has been looking into how flat-file based documents could be used, particularly with regards to gemtext, the file format of Gemini, to express issues and problems, or also Kanban boards. That would be preferable to the walled-garden approach of services such as GitHub, where you in effect end up putting in all your repos with all the Git history and all the subtleties of that, but when it comes to improving any of these repos, you're not allowed to delve through the history and subtleties in the same way. So here, in the top pane, is an example of GeneNetwork's Kanban board repo, just the README file being run through Git in terms of its history and diffs. As you see, all of these lovely things, people adding things, people removing things, people altering things, you don't really get in GitHub, and that's a shame. I looked into that, and it was very satisfying that gemtext could provide such a minimal syntactic range for expressing the ideas. Ultimately I felt the need just to see how things could mix and perform together, and what I went with was the Koutliner format, which is within the Emacs Hyperbole package. It's a format which is hierarchical, and it's got very good interfaces, so for instance you could be adding parts like this, or adding a child form. What's very nice, as part of this being a personal information manager, is that you can cross link
blocks, including in other documents, and should the blocks change, you'll be able to catch up and work out where the block has been moved to, at least within the same document. That's one of the things which I really prioritised and dealt with this year, and I managed, with the interpreter, to canonically mix the syntaxes of the Gemini protocol's gemtext, the Koutliner format, as well as the key annotation system. I haven't explored it perfectly, but I believe that parsing expression grammars are capable of representing things within the same line, and are not necessarily relegated to comparing line by line. So with this kind of deviant exploration of how syntaxes can work across each other, I've potentially got something much richer and more subtle than the formats and syntaxes in isolation. Here, on the top pane, is an example, in the language and parser TXR, of a way of creating a definition and then providing the name of the reference. In this case I use the annotation style HQH just to point out the parsing forms, and here we have the various key aspects of the annotation. Returning to the key annotation, I should emphasise that it is formed around the green buttons: one of the letters in there provides the starting kernel for an annotation, but it is supplemented with QWE or 2, which allows you to provide an inference, and this was more of a later-stage innovation. You can also combine the annotation points; at least, I do it so that up to four can be compounded to be representative of one. Excluding blooms, which indicate what the document deals with and encourage a more recursive perspective on things, this has a very large permutative range, which even though that
wouldn't be satisfied by either an individual annotating things or a community left alone, the logical outcome of combining certain things together provides a very large range, in the tens if not hundreds of thousands of points. So as you add one or more annotations, you can really get a fingerprint of what things are. There's also a bit of subtlety regarding the lack of precision in this, which has been described to me in terms of, I think it was, Aristotle's use of hexis, in terms of what something is and the philosophy of what something is, and I've been trying to deliberate on the subtleties of that. For instance, there's always conjecture regarding who invented, say, the first submarine or the first camera. From my own perspective and prejudices, I've got a concept of an Irishman inventing the first submarine and an English politician in London who invented the first camera. That's perhaps just based on my own prejudices and upbringing, and on me not having a complete enough technological perspective and definition, and I'm sure there are people in other regions who have different opinions. That's all fine, because we all have a common idea of what a submarine is or what a camera is, but we might have different definitions of the point at which that thing moved from being something else into that form. That's why I like the vagueness of this key and its annotations: it doesn't have grandiose claims of completeness, and it has more of an etymological approach, in which you can form an opinion on things and switch it out, and it's all fine, including if the definitions change. In terms of how it works and operates for any user, it really
gives you convenience in dealing with things. So, just returning back to here, and looking at a more specific function: at the top here you've got RQR KWK. In effect, RQR would be a form of to-do, and KWK relates to hashes; the complementary description is: create hashes. Here there's the function definition, as well as a reference, in that you have the annotations being output here within the parentheses, in the light blue. Here we have the form for putting things in cases, which in TXR needs to be ended. This is a Lisp-based language, so as you can imagine, things start and stop in very clear forms. You have the syntax for creating a binding. This is an example of a URI variable being captured, which references a separate function and expects this outcome. Of course, you could have a function capturing multiple things, and this approach could capture different points. At least, my interpretation of why I'm doing this binding is that it seemingly allows you to inherit all the subtleties upstream, but I might be a bit flaky on that. As you can imagine, there are lots of different forms for dealing with things, and here's an example of one of the forms, which has classic regular-expression-style things in terms of repeating numbers and various points. Just to give an example of how I can use these annotations for racing through things: here is an example of RQR, which is a mechanism for just pulling up all of my to-dos, and here it's referencing the fact that there are 202 different tasks. And obviously
this is just covering a singular document, but you could, for instance, be performing a search across multiple documents or within the whole project, which within this Emacs operating environment means that you're pretty much just limited by time and ambition. As you can see, it's very terse, and this is quite significant for me in terms of flying by. I really try to make sure that I have maximum flow, and that I can switch from one mental state to another and handle multiple things without being overwhelmed. The fact that the documents and the directories are named in this way makes it easier, for instance, to jump to a different file. So if I press Ctrl-x b in Emacs, there's a range of points: just pressing mqm here, for instance, has given me a list of different buffers and their names. So I can really come up with buttons and hotkeys and actions to deal with that. Here's just one way in which the reuse of these annotations can be dealt with. At the top we have svg-tag-mode, which is a third-party Emacs library which allows you to turn various points into these black and white boxes; the annotations are, I would like to say, improved or visualised, but you can also add other things. I've talked in other activities about the use of Hyperbole in terms of navigation, so this time I thought I'd just look at one of the latest features from version 8, which is defil, which provides more of a regular-expression-based form. This form has the defining of the function, the name of this function, the opening context, the ending context, as well as the middle context, as well as what you're meant to do based upon
that, within these curly braces. The conjecture with this one is that within a specific annotation you could have different actions based upon where the cursor is within the annotation, let alone the potential in terms of any position before or afterwards. This is quite deep, because through the use of repetition and various forms you could build very complex workflows, not necessarily from key bindings or the use of classic menus; in fact, you could use the position of the cursor within, or relative to, one of these annotations, and that forms the action. That could get very fast, in terms of having a dedicated action button for dealing with things, so the menu would in effect be the cursor in relation to the annotation. And here's the HyRolo format, which is within the Hyperbole suite, which is very interesting and which I'll be looking into more. So for instance, an action could be to go to a subset of this; this is an example of the section shells, which has the subsection. I don't really have the time to go into this further, but there's a recent Emacs conference where this was one of the main talking points. Here is the product of a script, which I think is about 50 gigabytes, or maybe this is 50 or 20 megabytes, which is in effect a ripgrep through various points, trying to isolate various things. So let's just highlight some colours quickly. Let's see, pink. Yeah, well, that's probably slowing it down, but whatever. In effect, it's classic ripgrep answers: which document, which line, which character, and what was dealt with. Here I'm just running through the entire file system to find the pertinent annotations. Ah, but
yeah, there we go. So we'll use Swiper to reinforce that. This has the aggregate; it does have duplicates, at least with how this was set up, but that's fine, because the output is dealt with in an orderly fashion. So, ascertain... come on. Sorry, I think my computer's a bit overwhelmed; it's humming a bit, so there must be some background thing. Yes, here we go. So we've got ascertaining: ascertain whether in correct location, ascertain whether useful, nothing links to it. I do have tooling which deals with this from the point of view of a specific script, and I've worked out a way to in effect inject the annotations based upon hashing of content. I've also been developing hash trees with regards to the documents. But yes, I hope this is of interest. There are lots of interesting things with regards to Icebreaker, and I guess it would be best to visit the FOSDEM page for this talk for some more supplementary information. Thank you very much. |
tissue—the minimalist git+plain text issue tracker |
Good evening, and welcome to my talk. My talk today is on tissue, a minimalist Git and plain text based issue tracker. Small projects need small tools. Many free software projects unfortunately use GitHub, a proprietary platform. While free software solutions such as GitLab and Gitea do exist, they are not easy to host. They often require running complex database services, and they tend to blindly imitate GitHub without thinking about ease of hosting for small independent projects. Now, GitHub is not popular for no reason. It comes with a nice issue tracker, and issue trackers are really handy when you are coordinating work as a community. GitHub lets you host your project wiki, website, release tarballs, etc. on their servers, and free hosting is always nice. It has a nice search interface to search through issues, commit messages and even code. It allows you to easily hand over your projects to new maintainers when you grow tired of your own code or no longer have the time to maintain them. Tissue tries to do all this, but in its own way, while still being minimal and easy to host and maintain. It requires no database servers whatsoever, not even SQLite. Issues are simply plain text files committed into your Git repo. Tissue comes with a powerful full-text search engine built on the Xapian search engine library. And as an added bonus, you can even search through your documentation and commit messages. Now, since the entire state of a tissue repository is in Git, it is really easy to move from one server to another, or to back up and restore. To back up, you merely have to back up the Git repo, which you have to do anyway. Backup is a really important thing, and something that should be very painless for self-hosted services. I remember, many years ago, when I was using GNU Social and was quite active on the Fediverse, I would notice that from time to time instances would vanish from the Internet.
It turns out that this was because people were not backing up their GNU Social instances correctly, and they would lose it all and not be able to put it back when a hard disk crashed. Now, in theory, it is perfectly possible to back up a GNU Social instance and restore it when necessary. But in practice, it does not always happen. This is because people running self-hosted instances are usually doing it in their free time, and do not necessarily have the mental bandwidth to figure out all the details and implement all the best practices. So any piece of software that is aiming for wide adoption for self-hosted services should be really simple to use. It cannot merely be free software; that alone is not enough. Simplicity and minimalism are absolutely necessary. Let me show you how tissue is used in practice. Here is the tissue repository. You see the tissue code here. Let's go into the issues directory. You see many issue files here; each file corresponds to one issue. This is a typical issue file. It is written in gemtext format. Gemtext is a Markdown-like format used in the Gemini protocol. It is really minimal and interesting, and you should definitely check out Gemini and gemtext if you haven't already. This issue has its title, tags and text. Finally, this particular issue is closed already, so you see closed here. Let's try another one. This is another issue, and it is similar again: we have a title, tags and text. Now, let's actually try to use tissue on the command line. Let me run tissue. This lists all the issues and commit messages that tissue knows about. Maybe we want to search through them and list only the closed issues: tissue search is:closed. There are three closed issues here. Maybe we only want to look at issues that have the words emacs interface in them. Here you have those. Tissue is a small project with only a few issues and commit messages. Perhaps it would be more interesting to look at a larger project like Guix. Guix doesn't actually use tissue.
But if it did, what would the search experience be like? Here I have a locally running instance of the tissue web interface, which has indexed all of Guix's commit messages, all of its manual and all of the Guix cookbook as well. Now, let's try to search it. Perhaps I want to know something about the Guix garbage collector. I search for guix gc. Let's look at only the documentation for now. The first result is the Invoking guix gc page. I can click on it and immediately jump to the relevant section of the Guix manual. Now, interestingly, there are many of these Invoking pages that I keep having to refer to once in a while. But it's really hard to navigate the manual hierarchy and get to the page reliably. It would be really handy to have a search interface like this where you can simply jump to the section instead of having to traverse through the manual. Let's search for WireGuard. Here again you have a Guix cookbook page that tells you how to set up a WireGuard VPN. That's nice. But it's not just documentation. We could also look for commit messages that mention the word WireGuard. Let's search for use G-expressions. Now, many Guix packages are being rewritten to use G-expressions. So, you'll find a lot of commit messages that mention use G-expressions. One important thing to note here is that even commit messages with using G-expressions are matched. The search engine understands the English language well enough to know that using is just a derived form of use. So, what you see here is not a simple grep-like search. It is a more powerful, more natural search. In fact, searching commit messages with grep is a real pain. We should be using something with real natural language search like this. Now, maybe we want to look at all commits that update the SQLite package. So, we search for update SQLite. Here you see, all of these are SQLite updating commits. That's nice. Maybe we want to look at commits that update SQLite and remove a patch.
See that again here. There is an update commit with the patch being removed. Once again, it's important to note that the terms in the search query are scattered throughout the commit message. They are not exact search matches like you would get with grep. That's it for the demo. I hope that gave you some flavor for what tissue feels like. That's it for my talk as well. I'm really thankful to Pjotr and the GeneNetwork team. Tissue began as Pjotr's idea and grew very iteratively and organically within the GeneNetwork team as an internal issue tracker. Tissue certainly wouldn't be what it is today without all the early experimentation that they kindly participated in. Thank you for listening. Unfortunately, I couldn't make it to Brussels this year due to visa delays. Who would have thought that crossing an imaginary line on a map would be so difficult? But maybe next year. Have a nice day. Thank you. |
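The difference the demo draws between a grep-like match and a stemmed, term-based search can be illustrated with a small Python sketch. This is not tissue's or Xapian's actual implementation: the suffix-stripping `stem` function is a deliberately crude stand-in for a real stemmer, the commit messages are made up for illustration, and requiring every query term to match (AND semantics) is an assumption.

```python
import re

def stem(word):
    """Toy stemmer (an assumption, not Xapian's): strip a few English suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    if word.endswith("e") and len(word) >= 3:
        return word[:-1]
    return word

def stems(text):
    """Split text into lowercase alphanumeric terms and return their stems."""
    return {stem(term) for term in re.findall(r"[a-z0-9]+", text.lower())}

def search(query, messages):
    """Return messages containing every stemmed query term, in any position."""
    wanted = stems(query)
    return [m for m in messages if wanted <= stems(m)]

def grep_like(query, messages):
    """Return messages containing the query as one exact substring."""
    return [m for m in messages if query.lower() in m.lower()]

# Hypothetical commit messages, loosely modelled on the ones in the demo.
messages = [
    "gnu: sqlite: Update to 3.39.3.",
    "gnu: foo: Use G-expressions.",
    "gnu: bar: Rewrite using G-expressions.",
    "gnu: sqlite: Update to 3.36.0 and remove an obsolete patch.",
]

if __name__ == "__main__":
    print(grep_like("use g-expressions", messages))
    print(search("use g-expressions", messages))
    print(search("update sqlite remove patch", messages))
```

Here the substring match finds only the commit that literally says "Use G-expressions", while the stemmed search also finds the one that says "using G-expressions", and the four-term query matches a commit in which the terms are scattered through the message.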
(Keynote) What could go wrong? Me, I was
Containerised Applications are the way |
Good morning, wow. I was not expecting this much of an audience at 9am on Sunday at FOSDEM, so thank you all for coming. Yeah, I'm here to talk about how I was at FOSDEM five years ago. I told you all a whole bunch of things and I was utterly wrong. In so many ways, it's actually kind of amusing. But who am I? My name's Richard. I've been working on openSUSE since it began. I've been a customer of SUSE's, I've been a contributor, I've been a QA engineer, I've been working there for almost 10 years. These days I am a ridiculous advocate of rolling releases; it's what everybody should be using. I created the MicroOS Desktop, my day job is being one of the release engineers for Tumbleweed and MicroOS, I also do a bit of consulting and I also do a bit of photography. But a long time ago in a room, actually just on the other side of this campus, I was here at FOSDEM telling everybody that containerized applications, so things like Flatpak, Snap, AppImage, the idea of graphical apps in some portable format, are absolutely utterly terrible and nobody should be ever using them ever, and they were going to eat all of our users and, yeah, it was just going to go horribly, horribly wrong. And I even started the presentation with quippy comments, like those who don't remember the past are condemned to repeat it, and I even made really unflattering comparisons. Like showing diagrams of the Windows architecture and pointing out that Windows has all these wonderful runtimes where you can have different environments and run your application on top, and it was absolutely terrible in Windows, so it's going to be absolutely terrible when we do the same thing in Linux. Giving the examples of all of the security issues that you see in Windows with this kind of approach.
Things like security-relevant DLLs lurking in some folder in your Windows machine, being an absolute nightmare to patch, an absolute nightmare to fix when it goes wrong. All these horrible update issues: how do you end up getting an update on your Windows or your Mac machine? Well, you download some EXE or some bundle and then there's some updater and it does whatever the heck it wants on your machine. Licensing issues, especially with open source: how do you mix and match all these different licenses together in one cohesive thing? And it's just going to eat up all of your disk space. And then I went back to this slide again and started talking about the various technologies that at the time, 2017, were out there doing this containerized runtime stuff, and I would compare this lovely Windows diagram to this lovely Canonical diagram, which looks very, very similar because actually it is. The idea is similar, the concept is similar, but as you'll see, just because the concept is similar doesn't necessarily mean the whole idea is bad; execution does matter. And it wasn't just Snap. I wasn't just dumping on Ubuntu because I don't like Ubuntu; I was doing the same with Flatpak, and I was basically pointing out that this whole containerized application idea was repeating the same issue. We were going to be going down this road of security-relevant libraries lurking in all of these snaps and Flatpaks. Back then we didn't necessarily have a good story about how we were going to update these things, how we were going to keep them maintained, who was going to look after all of these base snaps and runtimes in Flatpak and the like. Who was going to look at all of the legal issues and review the possible licensing issues of bundling these things together, and who was going to buy everybody bigger hard disks.
And the kind of main conclusion that I left with, which, despite the fact you'll see I was wrong about a lot of what I said, I still actually hold true, is that at the heart of it, when distributing software, it doesn't matter if you're doing it as a container or as a full-blown fat OS distribution or anything in between with any kind of fancy technology, the responsibilities are the same. AppImage, Flatpak, Snap might make it easier to be the upstream and give out your application to the users, that's great, but the responsibilities are still the same ones that distributors have been handling in distributions for years. You have to worry about maintainability, you have to worry about the security, you have to worry about licensing and all this wonderful stuff. So they're going to have to borrow all of the same stuff. So five years ago I gave this presentation. There were lots of people in the audience from AppImage, Snap and Flatpak; some of them said very nice things to me, some of them said very un-nice things to me. Starting with AppImage, they took a lot of what I said surprisingly on board and really ran with it. I said all this stuff in February 2017, and by June 2017 I was saying stuff like this on stage. This was taken at the openSUSE conference, and this was on the AppImage website for, well, longer than I wish it was. But the reason was that in that short window, AppImage thought they could address most of my concerns by running to the Open Build Service, working with the Open Build Service guys and integrating AppImage really quite nicely with it at the time. So the idea being, AppImage wasn't the problem, maybe the way you build AppImages is the problem. If you build them in a nice auditing build system and have the whole thing tracked with dependencies in a build system, and you build it reproducibly and you do all the licensing reviews there, then OBS could be the solution to all of AppImage's problems.
And yeah, they worked really nicely with it and they gave all these promises: they'd be encouraging people to be using OBS as the main AppImage building tool and we'd all move on happy in a nice unified way forward. And I even said things to Snappy and Flatpak like, you're falling behind AppImage at this point, saying AppImage had a better build story and they were working with other people, and telling people to be more like AppImage. And I still was badgering on. By the way, you can tell all my old slides because they have this thing at the bottom, so you can see old me compared to new me. I was still worrying a little bit about dependencies because, as you'll see, AppImage makes some really interesting assumptions, but I was, you know, in June 2017, kind of hopeful that we'd get to a point where everybody would be working together and we'd have sort of maybe a new consistent runtime and things could move forward. I was also hopeful that we might have sandboxing finally because, you know, Snap kind of had some with AppArmor. Flatpak has Bubblewrap. You know, maybe AppArmor would be the way forward. How wrong I was. So now, five years later, where are we? And I don't want to go deep down into technical issues too much because a lot of this isn't just technical. You know, we're an open source project. Any technical issue can be fixed, right? It is a lot about what people are actually doing. What do they actually care about? Where are they actually taking things? What are we actually doing? So let's judge people by their own standards. This is a screenshot from the current AppImage website. And it says, use this to make Linux apps that run everywhere. But they don't run everywhere. And they say, as a user, it should be as easy to install as it is on a Mac or Windows machine. But they're not. And they say, you don't have to learn all these distributions with all these different distros doing things different ways. Technically, that's true.
You just need to learn all these different distributions and all the different things they do. And you have to build your own to put in your AppImage. And I'm not just, you know, saying this to throw shade on them. I have users on MicroOS who are trying to run AppImages. And they can't, because AppImages require FUSE 2. I'm a rolling release. I haven't shipped FUSE 2 for like a year. I've been using FUSE 3. And you can't get an AppImage to work with FUSE 3. It has to be FUSE 2. The portable image format isn't portable, because it makes assumptions about stuff that's on the base OS. And not just weird stuff like FUSE, but even down and dirty in the kernel. If you're running Debian and you're trying to run an Electron app, it's not going to work properly because the kernel in Debian isn't built the way that AppImage is expecting the kernel to be. So this is a great promise. And it's going to work in some places, but only if you're lucky enough that your distro has the same assumptions baked into it that AppImage has. And this is a recurring issue. Even reading the AppImage documentation for building AppImages, you know, it tells you, as a developer, think about all of the distros where you want your AppImage to run. So the whole promise of, you know, not worrying about distros goes away. You have to worry about more of them than you normally would. And put every single dependency which might not be fulfilled by that distro in your AppImage. So, yeah, avoid distros by building a huge one and putting it in a big tarball. It's a lot of work. It's way too much work. I utterly respect anybody using it because they're probably doing more work than I am doing a rolling release. Especially when the recommendation for what you put in that giant AppImage is the oldest, crustiest stuff you can find.
They recommend avoiding using anything new because anything new is more likely to have compatibility issues with older distros. So literally find the oldest distro that's still supported and use that as your base for building the AppImage. Which, you know, also seems a bit of a problem to me because, if you're always picking the oldest, the oldest is always the first one to stop getting maintenance updates. So you are always going to be rebasing on some crusty old almost out-of-date LTS to do what you want to do with AppImage. It doesn't make any sense by their own standards. And they tell everybody that installing is just like on a Mac. You know, just download the binary, put it on your desktop, right click it, make it executable and it will run. Which, you know, 15 years ago, was true. That's how you ran something on a Mac. I own a Mac now. That's not how you run stuff on a Mac. There's not a single Mac application I've ever installed that works that way. Even the Apple documentation makes it very, very clear that if you're downloading something from the Internet and you double-click it on a Mac, it's going to run an installer. Which is a terrible thing anyway. But it needs to run an installer. When you're downloading random stuff from the Internet, there need to be checks for dependencies. There needs to be some, yeah, modification to what's on the host. So every random downloaded Mac application has an installer just like Windows. Or it's done in an app store where, you know, Apple are controlling all that kind of thing and helping that along. So yes, I was wrong about AppImage. First in thinking it was terrible, because they did try and make an effort. But then I was wrong again, because it's actually even worse than I said five years ago. You know, they failed to do everything that they set out to do. They don't do anything to address the actual problems with releasing software.
Dependency problems are just hand-waved away, worse than anyone else could possibly do. Licensing issues, security, maintenance: good luck. Build a new distro and ship it again. And this is worse than what we do in distros, with all of the faults I will admit distros have on this. So please, do not use AppImages. And also they're not nice people. Because they kept publishing this for like four years after I told them to take it down, and I had to threaten to sue them. So they're just not nice. Now, Snap. Despite my reservations back in 2017, Snap was actually at the time the one I was most optimistic about. You know, at the time Canonical were actively collaborating with other distributions. They even invited me to a Snap workshop, trying to get Snap supported in as many Linux distributions as possible. They had an approach of upstream first. They were promising that all of their AppArmor patches and all of the enablement they had to do was going to end up in the kernel and going to end up being upstream. At the time in 2017, you could run your own Snap store. So you could have your own repository for downloading snaps. And unlike Flatpak, which is much more just graphical, they also had a story for non-graphical apps. And, you know, it's only five years ago. But back then, everybody wasn't necessarily using containers for server stuff the way we are now. So it was interesting on all those levels. But it's five years later. And all of the promises of Snap confinement working everywhere, so you can have your nicely sandboxed Snap application, haven't come true. You know, snapd does not support confinement on most non-Ubuntu distributions. And even some Ubuntu distributions. And, you know, this was posted on their forums three years ago now. That was the case three years ago. You know, users are still waiting to get any kind of proper sandboxing and security with snaps. Still not there. And then this was posted this month. Still promising.
It might happen. But it's been five years. None of the AppArmor stuff is in the kernel yet. None of the enablement we need is in the kernel yet. Distros can't easily, or really at all, keep with an upstream kernel and get Snap running in the way Snap should be running. So if you run a snap on a non-Ubuntu distribution, you're probably running it in an incredibly insecure way. You know, do you trust that random software developer with access to everything on your machine? Probably not. At least that random software developer using Snap isn't using their own Snap store, because they can't anymore. You know, in 2017 you could. Then they released a new version of snapd. So now the only Snap store that works with snapd is Canonical's. So, you know, it's an open source package format, but it's a closed source delivery format. You're only going to get that software from Canonical, and if you read up on it, there are lots of examples where Canonical have done the right thing and handled snaps that were malicious and got them off quickly. But how do you know? You're just trusting Canonical that they're always doing the right thing, because you can't see. You can't see what they're putting on there. You can't see how they get there. You can't do it yourself. You know, if you trust Canonical, that's fine. But I'm much more open source orientated myself. Even if I am trusting somebody else, I'd rather be able to have a look and see what's going on in there, maybe run my own, maybe compare something alongside. And yes, for most developers, or at least most small developers, this is free. So you can build your snap and publish it to the Canonical Snap store with no effort.
But as soon as you start getting bigger, as soon as you start becoming a bit of an ISV or doing stuff with IoT with lots of devices, then Canonical want you to have a brand store. And this is in the documentation for Snapcraft when it comes to building. When you actually have a look at the price list for having a Snap store, you know, the price list is kind of dear. Do you really want to be spending at least 5000 euros just to be able to publish your application on somebody else's server under your name? But I can understand if people are buying into this; I can definitely understand why Canonical aren't in a rush to change it. It's probably making them a good bit of money. On openSUSE, like I said, at the time in 2017, they were working with us. Now, it's not going so well. Snap is the only bit of software, in all of my years doing anything release-management-related at openSUSE, that has failed more than one security audit. It's the only bit of software I've had to reject multiple times. And there was good collaboration going on to get those issues fixed, but since 2019, that's kind of fizzled out. I haven't seen anything since. So when it comes to Snap, you know, I was wrong. I was really kind of keen on Snap back in 2017. And these days, I can't really say that much nice about it. The upstream first promises have all stalled. There doesn't seem to be an effort to get it really moving again on other distributions. So, you know, it's not a portable format by any stretch of any imagination. There's no open source delivery option, even if the Snap store may always be the best way of doing it anyway. There's a case to be made for that, even if there was an open source way. And, you know, it's not really a viable alternative to something like Flatpak unless you use Ubuntu, unless you trust Canonical, unless you're willing to give them money to distribute your stuff. And so, Flatpak.
Now, I need to kind of do a little bit of a detour on this, because when I was talking five years ago about all of this stuff, one of the things that I was trying to pitch on the side was this idea that, you know, everybody should be using rolling releases. I really, really believed that. And I still believe that now. And I really think, in this modern age, to get applications in the hands of users, a rolling-based operating system is the absolute key. You need to have it all built together, you need to have everything integrated, built consistently, tested consistently, taking its fair share of the maintenance and security burden, and then shipping it all in a way that the users don't really care that everything is churning around underneath; it just works. And at SUSE we've still been working on this. We have an operating system called openSUSE MicroOS. Vanilla MicroOS is much more server-orientated. It's immutable, like CoreOS and other similar immutable platforms. It can't be modified during runtime at all. It's rolling, so changing snapshots; it's actually using the same code base as Tumbleweed, so every day, almost. It's small, but not too small to do the job that it's meant to do. And the assumption is that the server is going to do just one job in a data center, so a VM running one RPM, or a VM running containers and then as many containers on top; the job is running containers from the operating system's point of view. And this is working really quite well. In fact, SUSE also has commercial products based on this. SLE Micro is based directly off openSUSE MicroOS, and the new SUSE ALP you might have heard of, where we're thinking of doing like a whole new ecosystem of enterprise distros, is building off what we did with SLE Micro and openSUSE MicroOS. But me, you know, I'm still a desktop guy.
So, you know, doing this with my day job, I found myself asking: okay, so I've got this nice small OS and it can run just one thing. What if that just one thing was a desktop? And so I started the MicroOS Desktop project, sort of alongside regular MicroOS. And, yeah, basically it's a modern Chromebook-like, Silverblue-like environment where you have a nice minimal base system with a desktop environment on top. My recommendation would be running the GNOME one. That's the one that's most maintained. And the basic configuration tools are in there, but everything else is provided from somewhere else. In fact, everything else is provided by Flatpak. So this is one of the reasons why I'm doing this presentation. I kind of have to explain how in five years I went from Flatpak is the devil to Flatpak is the only thing you should be running on your desktop. Because I talked to some of the people that I was talking to back then, and this is kind of their expression. Because five years ago, when I was talking about this stuff, I was meaner about Flatpak than all the other ones. I was even invited to GUADEC, and I gave the meanest talk I have ever given to anybody, right to the people who were actually developing the thing. And the guys from GNOME, they listened. I wasn't right. I'm not right about everything. That's the recurring theme of this presentation. They challenged some of my opinions, but they accepted at least the core ones that actually mattered. And Flatpak has changed. Like I was talking about earlier, responsibility is the key issue when you're talking about delivering software. No matter how you're distributing it, you need to be thinking about dependencies and licenses and maintenance and security. And one thing that Flatpak does very, very well is basically take all of that away from the distribution and make it the packager's problem.
Not great if you're a packager, but they do it in a way that actually probably lowers the burden for everybody. So that's nice. Automation and technology is great. But really, dependencies become the issue of the person making the Flatpak. Licenses become the issue there. Maintenance, security, etc. So distros can stop worrying about it. And Flatpak does this very well with their runtime concept, where if you're building an application for GNOME, you have a GNOME runtime. If you're building an application for KDE, you have a KDE runtime. Elementary have their runtime as well. And then for everything else, there's the generic Freedesktop runtime, which is a little bit heavier and clunkier, but gets the job done. And back in 2017, this terrified me. Not because there were competing distributions, because I'm used to competing distributions. The question was really, are these mini-distributions going to be maintained anything like every other distro out there? Are these going to handle CVEs well? Are they going to not have horrific licensing issues, etc., etc.? Well, they've been doing this for five years now. These runtimes are very well maintained. These are snapshots from their various git trees. They're all updating very, very quickly, keeping up with their respective upstreams of GTK and Qt and what have you. Handling CVEs very, very well. More about that later. So basically, they're handling this just as well as any other distribution does. Maybe even better in some cases, because they're narrow in scope. They've actually got less work to do themselves than a full-blown distribution with tens of thousands of packages. So you've got your runtimes and you've got your Flatpak application on top of that. But what about the Flatpak client? Especially if you think about what I was just talking about with Snap earlier, with all of the issues with AppArmor and custom patches and what have you.
Well, as a distribution guy, getting Flatpak in my distribution is really not that hard at all. You need to have the client on there. But you're not having to worry about a huge chain of dependencies and a whole bunch of plumbing to get it running. I don't need to have FUSE 2 on my distro. All I need to have is Bubblewrap, OSTree, and a couple of XDG packages. And they themselves don't really pull that much in either. So it's small, it's simple, it's relatively easy, self-contained. It doesn't cause me huge build chains when I have to rebuild the whole thing in Tumbleweed. It's a really nice ecosystem to just plop on top of my distro, and then all of the applications come from Flatpak. From a licensing perspective, all the Flatpaks on Flathub are checked. They all have to have some kind of license that allows open redistribution or legal redistribution. They do also support proprietary stuff. You can get a Spotify Flatpak. But obviously, you can't have the source code for the Spotify binary in their Git tree. So all of the proprietary stuff has to be pulled in through discrete, declared links. And the Flatpak team, specifically the Flathub team, are checking that, verifying that things aren't changing there, not letting nasty things happen and binaries flip around. So at the very least, you may not know exactly what horrible thing is in this sandbox, but it's sandboxed. It's not much of a threat to your machine anyway. And you know it's the one that was sent at submission time. You know it was the one that was reviewed. You know it isn't changing unexpectedly. So basically, it's as good as or better than any other distribution out there with their native packages. When it comes to maintenance, it's basically the same story. You know, just like openSUSE, Flathub doesn't like Flatpaks to have distro-specific patches or Flatpak-specific patches. They want everything as upstream as possible. They have an incredibly robust build, test, publish workflow.
They're not using OBS; I wish they were. They're not using openQA; I wish they were. But what they're using is just as good, maybe in some ways better. They can actually give everyone nice test channels for testing their application, which I really think I want to copy sometime. But yeah, it's maintained. It's easy for maintainers to keep their app maintained, and that is all ticking over nicely. From a security point of view, well, Flatpak is the only one that works everywhere. It's the only one where those applications are sandboxed. The portal concept, where basically holes are poked through the sandbox to give you things like access to the file picker and other parts of the file system and the like, has proven to be secure enough and expandable enough. It's not great, it's not perfect, nothing ever is. But it's doing the job, and it's doing the job well, and these applications are working very well. And Flatpak CVEs happen very, very rarely. And when they do happen, they're not these terrifying, scary things, because the thing is architected very, very well. So the last CVE that I could find was in February 2022. It was a medium score, and it was fixed incredibly quickly. I think every distribution had no problem adding that, because, like I mentioned earlier, given the client is very well structured, you don't have a huge dependency chain, so even the most ancient of LTSS distros can just happily get the patch in and get the thing running. So, when I started the MicroOS Desktop, I adopted Flatpak and Flathub, actually in November 2017, so if you put that on the timeline, I did change my opinion quite a bit from the beginning of February 2017 to the end. But I was using Flatpak as it was the one that I could work with. I couldn't use Snap, couldn't use AppImage.
And I didn't trust it that much at the time. I was thinking, like you've seen with other distributions, of building my own Flatpaks and using them rather than trusting Flathub, or doing like Fedora does, where they build their own and then they also offer Flathub with some kind of filtering. But I didn't really want to mess with that at the beginning of my project, so I just opted for trusting Flathub first and then waiting for the problems to surface. And it's five years later and I'm still waiting. We haven't had a single issue with the MicroOS Desktop where a Flathub application really got in the way and needed us to think, okay, we can't trust these guys, we should start doing that. It just hasn't happened. The few times an application hasn't worked right, well, we send a patch. We work with them, because that's how open source is meant to work, right? So as a distribution guy, I've realized we don't need to be building these giant, humongous code bases. Even though that's still what we do with Tumbleweed, I don't need it myself. I'm purely a MicroOS person now. All of my servers are MicroOS. My desktop here is MicroOS. I'm using a tiny 1,000 package fraction of my Tumbleweed code base. And everything else is coming from containers, some of which are built from that much bigger code base. All my graphical stuff is coming from Flathub. And my life is good and I'm happy. And this presentation is LibreOffice from Flathub. So, my final thoughts, which I realize I'm actually getting to a little bit early, but that's good. More time for Q&A. Flatpaks are ready for primetime. The other ones aren't. You know, don't use AppImage. Only use Snap if you trust Canonical. But, you know, we're here at FOSDEM. Flatpaks are the better way to go for people like you who are here at FOSDEM. And my system automatically updated in the background. Yeah.
Desktop Linux distros do not need to package the whole world. If you're a distro builder, think about following the model we are using with the MicroOS Desktop. Think about, if not narrowing your scope, because you're building the packages and you don't want to tell maintainers to go away, then at least drawing your focus more onto just what you need to be doing. Start testing that part more. Start telling your users that's the bit they can really, really trust, and give some secondary class to the old-fashioned way of doing things. Yes. So you're telling us that Flatpaks run everywhere. Is that also true for different architectures? That is true at least for ARM. For Z probably not, but do you really have that many desktops on the mainframe? Yes, of course. Yeah. Well, then that's something I'm sure the Flathub team wouldn't mind. Well, I'm sure we could get that working on Flatpak, if there's a need there. Then also thinking about RISC-V, of course, and stuff like that. Yeah. But then that point actually nicely draws me to my finishing point, really: none of this stuff is ever going to be perfect. No technology ever is. That's why we do this stuff in the open. That's why we do this stuff open source. So when things aren't perfect and aren't the way they should be, aren't covering an architecture that you want or whatever, isn't it better to go to a project that is already going in that direction, that is trying to be available to everybody, that is open to, you know, me yelling at them for months about how terrible they are, and then work with them to get it all done, rather than sticking in your own tiny little sandbox, doing it all on your own and then being burdened with it for decades? If you're doing graphical applications, this is the way we should be going. It's easier for package maintainers. It's easier for distros.
It's easier for everyone to keep up. It's easier for users too. I mean, you just have a nice little web store. They click on what they want. They can have the beta version if the developers publish a beta version. It's, yeah, it's a nice way of getting stuff done. So, yeah, please, if you're doing anything with graphical apps, please get it on Flathub. Please contribute to Flatpak. Please put Flatpak in your distro. And are there any other questions? Yes, right at the back there. You've addressed the outstanding question about CPU architecture, which is a great question. How do you feel about the fact, and I realize I'm asking a Linux question of a Linux distro maintainer, that containers tie everyone in the world to the Linux kernel interface, shutting out other open kernel options like the BSDs from participating in that ecosystem, and that the overall drive towards containers is further orphaning these already minimally represented, but very, very strong, options in other kernels? They're strong. But I guess the recurring point I get to with all of this kind of thing is, niche players are great for playing in niches. When you're talking about something that needs to have widespread adoption and/or widespread contribution, some degree of centralization does make sense. It doesn't make sense for everybody to go make their own kernel. It doesn't make sense for everybody to make their own distribution. I would say it doesn't make sense for everybody to go packaging their own graphical applications 20 times over. So as hard as it is to say to somebody who's clearly passionate about other kernels and BSDs and what have you, I'm fine with containerization and these technologies dragging everybody to the Linux kernel because that's where the contributions are.
So as long as the Linux kernel is open to contributions and everybody can steer it in a good direction, I'm kind of okay with that. Thank you for your talk. I was at your presentation in 2017, so I think it's very nice that you changed your views. That year I also watched the presentation about Atomic from Fedora, so it was funny how those things overlapped. I have a question about how you feel about the base system. Currently there are trends like Nix and SteamOS, which use an immutable image as a base. How do you feel about that? I think immutable distributions are the way to go. I think if you're running Linux, it should be immutable. Immutability does bring with it a bunch of extra questions. For us as geeks, and I think I can say that without insulting anybody in the room, we are keen to tinker with our machines, and of course immutability quite often can get in the way of that. If you can't change your running system, how are you going to install that one little thing that you want? I think there's a sweet spot and I don't think some of the other distributions get it. With image-based deployments, you've got a frozen image, and you can't really modify that image, or you have to build a whole new one. That's too much work. I don't like image-based immutable systems that much. Nix has an interesting way with everything being declarative, but it's a lot of hassle. Declaring everything kind of swings the other way for me, so I don't necessarily like the Nix way. OSTree has an interesting take on the whole thing, both from a user's perspective and the fact that it's immutable. It's nice, but then you end up with a million different layers of OSTree and that just gets technically burdensome. Obviously, I work on MicroOS. I think we found that sweet spot.
In our case, we're using Btrfs snapshots to do all the magic under the hood, where your running system never gets touched, but you can still do traditional package management against a new snapshot, and that becomes your next boot target. You never affect the running system, but you can do whatever the heck you want with your next boot. Then if that next boot goes horribly wrong, we just throw the whole snapshot away. It's super fast, super easy. It avoids all of that. You can still tinker with it, but unfortunately, the downside of that is I do sometimes have to tell people, don't tinker too much. The more crazy stuff you do, the more likely you're going to throw that snapshot away, but I think that sweet spot is better than super locked-down images or complete freedom with having to declare everything in a config file. Okay. Thank you for your presentation. I had never heard of Flatpak. On my Ubuntu, I'm using Snap to install applications. On my Mac, I'm using Homebrew. What do you think of Homebrew on Linux? I don't see the point of Homebrew on Linux. Yeah. Why? I get it on Mac. I've installed a few things on my Mac that I desperately need there, but my Mac, I use for photography. I don't do anything technical on it. Don't see the point. Okay. Thank you. How likely is it for the files stored in the home directory, especially the user files, to be affected if I roll back a snapshot after a failed upgrade? So yeah, that's a really MicroOS-specific question. The way we do it on MicroOS is, when we talk about the root file system, we're not talking about the root partition, because we're using Btrfs. With Btrfs, you have this concept of subvolumes. We have a subvolume for literally everything where the data should be changing. So /home; /opt, because that's third party, so it's not us; /usr/local, because, again, that's not us. Anything that isn't the distro is in a subvolume, and then the distro's root file system is just that last bit that's left.
So that bit's read-only. That's the bit that's managed by the package manager. All the subvolumes are freely available read-write. That does make /etc a little bit interesting, because that's the one folder where it's both. Distros put stuff in there. In MicroOS, we handle that with overlayfs right now, where we're basically taking copies of that, knowing what we put there, knowing what the user put there, or at least trying to, and then merging everything together so the thing works. Ideally, what we would like is for everybody to start using, like most people already are, /usr for distribution configs. It should be in /usr/lib or /usr/etc or whatever. Just like you see with systemd, where distros put their distro config in /usr/lib/systemd, and then users put their local config in /etc/systemd. That way works very, very nicely, but meanwhile, /etc is a bit of a mess, but a mess that we can manage. Thank you for the presentation. Why isn't Flatpak suitable for CLIs? It is suitable for CLIs. There are actually guides now for how to do that. The assumption is always probably going to be that it's graphical, but there's no reason why a graphical application can't start an xterm and run a CLI app. There are actually examples in the Flatpak documentation of how to do that. Generally speaking, for apps that might not fit that kind of model, I think a lot of that CLI or more service-based command-line stuff is handled so well by OCI containers, Podman, Docker, and the like. Why mess with that? You've got all those containers already out there. You've got everyone building the command-line tooling and server tooling in containers. That does very, very well in that context. It just sucks on the desktop, so have Flatpak just handle the desktop issue. You don't necessarily have to have one thing to do everything. I think Flatpak draws that line quite nicely, where it naturally starts getting painful when you head down that road. Any more questions? No?
Well, hopefully I will see you all in a couple of years when I'm back in. Thank you very much. |
Automating a rolling binary release for Spack
Scaling a modern CI workflow to a large distribution |
All right, so I'm Todd Gamblin, and I'm from Lawrence Livermore National Laboratory. Normally I would give an intro of what Livermore is, but who's been hearing about Livermore in the news lately? People have heard about the fusion ignition over in the US; that's our lab. So I'm from there. I work in the HPC area at Livermore, and so we have a big supercomputing center. And the HPC ecosystem is a pretty complex place. People distribute software mostly as source. You build lots of different variants of the package. Users typically don't have root on the machine when they install software, and so they're building from source in their home directory or installing something in their home directory. And you want the code to be optimized for fancy machines like these ones over here. So you're trying to build software that supports a really broad set of environments, including Power, ARM, AMD, Intel, and then also GPU architectures. So things like NVIDIA, and now AMD GPUs are showing up, and we've even got a machine coming, Aurora, at Argonne. This is near Chicago, with Intel Ponte Vecchio GPUs. On top of all that, the ecosystem has C, C++, Fortran, Python, other languages, Lua, all linked together in the same app. And so we want a distribution that can support this type of environment. And so Spack is a package manager that enables software distribution for HPC, given that set of constraints. Packages are not quite like the build specs that you would see in your standard RPM- or deb-based distribution. They're really parameterized Python recipes for how to build that package on lots of different architectures. And it has a DSL for doing that. I'm not going to get into that today. But the end user can essentially take one package and install it lots of different ways. So you could say, I want to install HDF5 at a particular version. I want to install it with Clang, not GCC.
I want to have the thread-safe option on, or I want to inject some flags in the build and have an entirely different version of it that's built with a different set of flags, or that's targeted at a particular microarchitecture, or that maybe uses a particular dependency. So you can build the same package with two versions of MPI. So we're trying to provide the ease of use of mainstream tools with the flexibility needed for HPC, so that we can get the performance everyone wants. And it builds from source, but you can also install relocatable build caches in Spack, much like you would with, say, Nix or Guix. Theirs are not relocatable, because they're not really targeting the home directory use case, but it's the same sort of build cache model. It's not a typical binary distribution. The whole project has a fairly large community of contributors, or at least maybe not large by some of the other distributions' standards, but we have 1,100-plus contributors. We maintain the core tool, and then there's a whole bunch of people who work on package recipes. So in some ways, it looks a lot like Homebrew or a project like that. And then there's a whole bunch of infrastructure behind the scenes to keep all this working, and all these things together enable people to build lots of different software stacks. And so there's the Extreme-scale Scientific Software Stack that's maintained by the US Exascale Project. AWS has a stack that they use on their ParallelCluster product internally, and also for users. Livermore has its internal software deployment. There are some math library stacks, viz tools, things like that. And every application, really, in HPC is kind of its own software stack. So you heard about Flatpaks and Snaps in the last session; well, making apps more mindful of how their software is actually a distribution is something that we've been pushing for a long time within HPC. The GitHub repo is a pretty busy place.
We merge 300 to 500 PRs per month, and it's something like 411 commits or more. And so managing that is kind of painful. And we're trying very hard to reduce downstream work, which is actually difficult for a source-based distribution. If you think about how Spack is structured, there's this mainline develop branch that most people actually use. They'll just clone it straight from the repo and build from that, kind of like you do with nixpkgs or something. External contributors contribute there. And we cut a release every once in a while, where we stabilize the packages and keep them sort of fixed so that you don't have a lot of version churn in the repo. And then to actually integrate with the HPC facilities, all the places that are deploying supercomputers, we have this E4S software distribution, where they end up doing a whole bunch of downstream integration at the site, where they're basically building the whole thing from source, essentially in a new environment. And there's a whole lot of debugging that takes place there that we would really like to be able to move upstream. The applications, likewise, are not necessarily using what the facility deploys. Some of them do. Some of them don't. They pull from basically all of these places. They might get a math solver library from the facility. They might get something else installed from Spack mainline, built the way that they want. And they may pull stuff off of release branches too, all to assemble an application and have it built. And so this is a lot of porting at the lowest end, and what we'd really like to do is take that software integration and move it upstream, and get to a point where we can have these types of environments building in CI all the time in sort of a rolling release, and do binary deploys on the supercomputers with actual optimized binaries. So that's what we're trying to get to. So we set out to make a binary distribution with a bunch of different goals.
The main one, and the one that's pretty key to our whole ecosystem, is that it has to be sustainable. We don't have that many maintainers. And currently, their workflow is basically to work with people who are making contributions on pull requests, help them get them merged, and then move on to the next one. And we don't want them to have to sit around and babysit builds on, say, a release integration branch all the time. We want a rolling release, because people do tend to use the develop branch. And so we want that to be up to date with pretty current binaries all the time. But some people do fix themselves to releases, and so we want snapshots for those releases as well. We need to be able to support, at least eventually, all the packages that are in Spack. And it still has to be source-buildable around those binaries. So if you want to build a component and rely on binaries for some other component, we want to support that. And then finally, people trust sources. They can checksum them. You can download the tarball, and you can usually checksum it, except for when GitHub changes the hashes. But we want to ensure that the binaries that we're generating are just as trustworthy as the sources. So we've taken some steps to ensure that. So Spack is a little different from your standard distro, if you haven't gathered already. If you think about a traditional package manager, you have a recipe per configuration. And so that's your RPM build spec or deb spec or whatever. It goes into a build farm, and you produce packages, at least for one platform, in sort of a one-to-one relationship with those specs. There's templating and things that go on to reduce that. But you're typically maintaining one software stack that gets updated over time. In Spack, we have these parameterized package recipes that go into a build farm, but it's really the same recipe that's being used across different architectures.
We force the contributors to work on the same package, so that essentially you're modeling all the different ways the software can be used, and we try to get a lot of reuse out of the recipes across platforms. Those go into the build farm, and you can use the same recipes to produce optimized binaries for lots of different platforms. So you could get a Graviton ARM build, you could get a Skylake binary, you could get a GPU build, and so on. And then you could do that for many different software stacks for different use cases. And then we want you to be able to build from source on top of that. So that's what we're trying to do. We put a CI architecture together that is based around this. Like I said, we want to be sustainable, we want to maintain the workflow that we already have on the project, and so we basically want GitHub to be the center of the distribution. What goes into develop is really maintaining the distribution as well as contributing to the project. And so we have a bunch of infrastructure currently stood up in AWS to support this. The binaries themselves and the sources are all distributed through S3 and CloudFront. We set up a big Kubernetes cluster to support autoscaling runners, and we're using high-availability GitLab in there to drive the CI. GitLab may seem like a strange choice for maintaining a distribution, but the motivation behind that is really that all of the HPC centers also have internal GitLabs, and so do a lot of universities and other sites. And so the goal is really for all of this automation and tooling to be usable not just in the cloud for the large distribution of Spack, but also for people's personal software stacks locally. And so the idea is that we're generating GitLab CI configuration, and you can use that either for this or internally or in an air-gapped network somewhere. So we're leveraging Karpenter on the backend for just-in-time instances for runner pools.
That's a tool for AWS, it's open source, you can find it on GitHub. It essentially lets you make requests for nodes with certain amounts of memory, certain target architectures, and so on, and it manages containers on the instances for you on the backend and moves work around so that you can have an efficient build pool in Kubernetes. We also have some bare-metal runners at the University of Oregon with more exotic architectures than you can maybe find in the cloud. So there's an AMD MI200 GPU builder in there, there's A64FX, which is what runs on Fugaku, it's the ARM architecture with vector instructions, Power9, and so on. And so we are able to do runs there for architectures that aren't supported in the cloud. There's some monitoring thrown in. We haven't really leveraged it in a smart way yet, but we are collecting a lot of data about our builds. And then there's a bot that helps coordinate between GitHub and GitLab. And so we have a sync script that allows us to build off of forks and things like that in GitLab over this whole setup. So it's fairly custom, but at least the GitLab component is recyclable internally. And we would like to be able to support more runners in the future. If maybe we want to work with Azure on their HPC setup and they want to provide runners for the project, or if other universities and places want to provide runners, we want to leave that open. For maintaining the stacks themselves, we made it possible to instantiate a new stack in a pull request. And so we have this directory full of the 16 or so stacks that we currently build in CI. You can see them there. Each one of those is some targeted software stack for some type of machine or some group. Each of those contains a YAML file with configuration for the stack in it. And so the YAML file itself is fairly simple. It has a list of packages that you want to build, and so this is the machine learning one for CUDA.
Those are all the names of the Spack recipes that you're building here. And then some configuration up here. And so for this particular stack, you're saying, I want to build for x86_64_v3, which is AVX2. And I want to disable ROCm and enable CUDA, except on LLVM, because there's some weird bug with the CUDA support there, at least in our stack. And so you can see it's fairly concise. You make a list of packages. You say, here's the configuration I want, and you can go and take this thing and build a bunch of packages. We make it easy to change low-level stack-wide parameters. So with the parameterized packages in Spack, you can tell it to build with a different compiler. And so we had essentially this large E4S stack with maybe 600 packages working in standard environments. We wanted to support the oneAPI compilers from Intel. And so that's Intel's new optimizing compilers. It is unlikely that anyone has ever run this much open source through a proprietary vendor compiler like that, but it is Clang-based. And so we were able to throw oneAPI into the config by just saying, here's where oneAPI lives, and make all packages require oneAPI. And so the build system swaps in the oneAPI compiler through some wrappers that are at the lower level. And we were able to get that stack working in a week or two, despite the fact that we'd never built a lot of these packages with oneAPI before. So I think that's actually pretty cool. In a lot of cases, it's not worth it to use a vendor compiler, because there are so many bugs and issues with software that's never been built with it. But here, we're just throwing a bunch of open source packages through, and it helped us communicate with Intel. We were able to say, hey, here are bugs that we're seeing with your compiler. We can link you directly to the build log for the build that failed. And that helps them patch up the compiler, and it continues to help them ensure that it can build everything it needs to.
In Spack, you don't. So like I said, the recipes are these parameterized things, and so there's actually a solving step to these stacks. You saw the requirements in the YAML file that said what I want to build. We run that through our package solver to get a fully resolved graph of all the things that need to be built in a stack. And then that is used to generate a GitLab CI YAML. And then one of the problems that we have to solve there is mapping builds to runners. So once the whole thing is concrete, and we've said here are all the dependencies, these are all the exact build configurations we want to make, we have to say how that should be mapped to particular runners. And so we don't currently support things like cross builds. So if you want to build for AVX-512 or the more fancy vector instructions on newer Intel CPUs, you need to make sure that you get one of those CPUs in the build environment. And so we say, if you match AVX-512, give me an AVX-512 runner. If you match one of these somewhat atrocious, hard-to-build packages up here like LLVM and PyTorch, give me a gigantic runner with lots of memory, things like that. And essentially what this is doing is just saying, here are the package properties up at the top, here are the tags that should be on the runner, make sure that I get a runner with those capabilities. And we haven't got a schema for all the tags yet, but I think we could standardize this and make it easy for someone to plug in runners at their own site for this sort of thing. All right. So one of the things that we did here to ensure trust is we have essentially a build environment going on in pull requests. If you trust Spack, you're basically trusting the maintainers. We want to ensure that the binaries are things that are approved by the maintainers. And so we can't just distribute binaries that got built in pull requests.
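The build-to-runner mapping described above can be illustrated with a small sketch. This is a toy model, not Spack's actual CI configuration or code: the `Build` and `Rule` types, the tag names ("huge", "avx512", "small"), and the use of the `x86_64_v4` target as the AVX-512 case are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    """One fully resolved build (hypothetical shape)."""
    name: str
    target: str  # e.g. "x86_64_v3" or "x86_64_v4"


@dataclass(frozen=True)
class Rule:
    """If a build matches, the runner must carry these tags (hypothetical)."""
    names: frozenset = frozenset()    # empty means "any package name"
    targets: frozenset = frozenset()  # empty means "any target"
    tags: frozenset = frozenset()

    def matches(self, build):
        name_ok = not self.names or build.name in self.names
        target_ok = not self.targets or build.target in self.targets
        return name_ok and target_ok


# Illustrative rules: big packages need big runners, AVX-512 builds need
# AVX-512 hardware because cross builds are not supported.
RULES = [
    Rule(names=frozenset({"llvm", "py-torch"}), tags=frozenset({"huge"})),
    Rule(targets=frozenset({"x86_64_v4"}), tags=frozenset({"avx512"})),
]
DEFAULT_TAGS = frozenset({"small"})


def required_tags(build, rules=RULES):
    """Union of the tags of every matching rule, or the default tags."""
    tags = set()
    for rule in rules:
        if rule.matches(build):
            tags |= rule.tags
    return frozenset(tags) or DEFAULT_TAGS


def pick_runner(build, runners):
    """Return the first runner (by name) whose tags cover the required tags."""
    needed = required_tags(build)
    for name in sorted(runners):
        if needed <= runners[name]:
            return name
    return None
```

A site plugging in its own runners would, in this model, only need to advertise a set of tags per runner; a build that no runner can satisfy yields `None` rather than landing on unsuitable hardware.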
So when contributors submit package changes, we have private buckets for every PR that we're supporting, where we're doing the builds. The maintainers come along and say, oh, it worked. They review the code. And then they say, okay, we can merge that, and we rebuild everything on develop and sign. So essentially everything in the main release is getting built from only approved recipes. It's not using any binaries that were built in the PR. All right. The pull request integration, yeah, definitely makes things easy for contributors. And we were able to take the system and announce our public binary cache last June with something like 4,600 builds in CI. And so it's mostly easy for contributors. They get a status update on their pull request. And mostly easy for users. They can just say, hey, use the binary mirror. So there are some problems. One issue is that build caches are a lot different from RPMs and debs. In most distributions, you would have a stable ABI for your packages. You rebuild a package, you can throw it in the mix with the others. Here, if you modify one package, you really do have to rebuild all the dependents. And so if you modify XZ here, then you have to build everything that depends on it again in the build cache. And so what that can mean is, if you have a gigantic software stack like this one and you modify, say, pkgconf at the bottom of it, it can trigger a massive rebuild of everything in the stack. And so that's one of the scalability problems that I think we're going to have to deal with in the long term: you can get these really long-running pipelines. Packages like VisIt and PyTorch and so on will build forever, and it frustrates contributors. The other thing that happens is, if you think about how the release works on develop, you're picking a commit every once in a while and building it. And if you have a PR that is based behind the last develop build, that's OK.
Although GitHub typically wants to merge that with head, which means that you'll build a lot of redundant things in your build environment. We can be picky and merge it with the last develop build to ensure that we get a lot of cache reuse in the build environment. But what that means is, if we get a PR that's out ahead of the last develop build, and say D up there is in progress, if you merge that second PR with D, you're basically going to be doing the same builds that D is doing, but in a PR environment. And so if you have a bunch of those, well, we've brought GitLab down before by accidentally building all of those PRs that are not caught up with the latest, or for which develop has not caught up with them. And so we have to be picky and hold back these guys until there's a build ahead of them, so that we get enough reuse out of the cache to support this. So the other problem with long pipelines is that, depending on how reliable your infrastructure is, the more things that you build in a pipeline, the more likely you are to get a build failure somewhere. And so because we're building this cone of destruction in our pipelines, we are subject to system failures happening in the pipeline somewhere. And so users have to kind of babysit and restart builds that have nothing to do with what they're contributing. So we're looking for ways that we could make that better. One issue that we have is consistency. So when you test on PRs, it's not always sufficient to ensure that your develop branch is working. So you may have this initial package state; a PR gets submitted, and you test with new B. Another PR gets submitted, and you test with new package C. If you take those and you don't require your PRs to be up to date with develop, when they both get merged, the state that's in develop is something that you've never tested, because you have basically new versions of those two packages together now. And so there are ways to get around this. One of them is merge queues.
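The consistency problem just described, and the way a serialized queue avoids it, can be sketched in a few lines. This is a toy model under stated assumptions: package states are plain dictionaries of name to version, "testing" just records which exact combination CI saw, and the function names are hypothetical rather than anything from Spack or GitHub.

```python
def record_tested(state, tested):
    """Remember that CI ran against this exact combination of versions."""
    tested.add(frozenset(state.items()))


def was_tested(state, tested):
    """True if CI ever ran against exactly this combination."""
    return frozenset(state.items()) in tested


def independent_prs(base, changes):
    """Each PR is tested against the base it branched from, then all merge."""
    tested = set()
    for change in changes:
        record_tested({**base, **change}, tested)
    merged = dict(base)
    for change in changes:
        merged.update(change)
    return merged, tested


def merge_queue(base, changes):
    """Each PR is tested on top of everything queued ahead of it."""
    tested = set()
    current = dict(base)
    for change in changes:
        candidate = {**current, **change}
        record_tested(candidate, tested)
        current = candidate  # assume the test passed and the PR merged
    return current, tested
```

With a base of `{"a": 1, "b": 1, "c": 1}` and two PRs that bump `b` and `c` respectively, `independent_prs` ends in a develop state that was never tested, while `merge_queue` ends in the same state but only after testing it.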
So we're looking at merge queues as a way to scale this pipeline out. They essentially allow you to have pull requests with a small amount of testing, where you then enqueue them in your merge queue up there, that's the gray stuff. And they are serialized for commit to develop. If they succeed, then they're merged directly in a fast-forward fashion. And then basically the full testing is only done on the merge queue. And you are always assured that the thing that you tested is the thing that gets merged into develop. So we're looking very much forward to GitHub making merge queue available in the next couple of weeks. The other thing we think that could do is allow us to stage the work on PRs. So we're looking at ways we could scale this out. Right now, for a relatively small number of packages, 4,600, we're able to do these massive rebuilds on PRs. But we need to stage the work to scale it out further, so that's what we're looking at now. We might build only the package, or only the package and its direct dependents, on PRs, and maybe phase how much work we do on the develop builds as well. But we do need to do a full build every once in a while so that there's a consistent state in the build cache. So that's where we're at. Thanks. Thank you very much for the presentation. You mentioned quite a few other technologies, like Nix, Guix, deb, RPM. You could have mentioned Homebrew as well, or maybe you did. And Docker. And it feels like all these tools could help you. Yeah. And it feels like you are building everything on your own. So is there a reason not to leverage any of these technologies? Which technologies do you mean? Yeah. So we are leveraging a lot of technologies, right? I guess which ones do you think we should? Nix, for example. So we don't. So Nix has essentially one version of everything in the mainline, right?
And in the HPC environment, what we want you to be able to do is not build that one thing that's in the mainline, but to be able to build a one-off very easily. So the whole point of Spack is, think of it as Nix with a solver, right? It's Nix where you can say, actually, no, build this version of this thing with this build option for that GPU, and it will take the recipe and reuse it for that purpose. Whereas in Nix, it's much harder to have package variants like that. So that's really the power of Spack. And so we're combinatorial Nix. You can think of it that way. Well, wouldn't you be able to leverage Nix and describe all these differences instead of redoing it? No. The Nix packages don't do that. |
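The "one recipe, many builds" idea from this talk can be made concrete with a toy sketch. It is not Spack's recipe DSL, solver, or hashing scheme: the `expand` and `build_id` helpers and the option values below are hypothetical, chosen only to show how a single parameterized description fans out into distinct, separately identifiable builds.

```python
import hashlib
import json
from itertools import product


def expand(name, choices):
    """Yield one concrete configuration per combination of option values."""
    keys = sorted(choices)
    for values in product(*(choices[key] for key in keys)):
        yield {"name": name, **dict(zip(keys, values))}


def build_id(config):
    """Stable short identifier for a configuration (illustrative only)."""
    blob = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]


# Hypothetical option space for a single recipe.
HDF5_CHOICES = {
    "compiler": ["gcc", "clang"],
    "threadsafe": [False, True],
    "mpi": ["openmpi", "mpich"],
}

if __name__ == "__main__":
    for config in expand("hdf5", HDF5_CHOICES):
        print(build_id(config), config)
```

Three two-valued options already give eight distinct builds from one recipe, which is why each concrete configuration needs its own identity in a build cache.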
Automation for Debian Packaging |
Good morning, so my name is Jelmer Vernooij, I work on Debian among other things, and today I'm going to talk about some things I've been working on to automate changes to Debian packaging. The talk is specifically about Debian, but hopefully some of the lessons are applicable to other distributions as well. Debian hopefully doesn't need a lot of introduction, but to set the scene a little bit, hopefully you're all familiar with Linux distributions. Distributions are system integrators, and there's a spectrum in terms of how much work various distributions do to integrate, and Debian definitely falls on the far side of that spectrum, so there's quite a lot of work that happens in order to turn an upstream package into a Debian package. There's work going on to have the various packages work together very well, and a lot of extra metadata, as opposed to some other distributions, like for example Nix, that are just a very, very thin wrapper around an upstream project. That does mean that there's more work involved in the Debian package, and it's also more likely to require more work when, for example, a new upstream release comes out, but a lot of the work that is involved here relates to that. Debian has about 1,000 developers and about 30,000 source packages overall, and the way that Debian has traditionally worked is that each package has a single person associated with it. Over time we have moved to a model in which there are teams that maintain large sets of packages, but there is still a very clear owner for each of the packages, which means that depending on the package there are different guidelines and different policies that you have to take into account. You can't just modify whatever package you want.
There's the Debian policy, which mandates some things around packages, but if you want to change, say, a random Haskell package, there's a Haskell package policy, and there's a particular group of people who maintain the various Haskell packages, so if you're not a part of that group you need to get one of them to review your changes. Traditionally Debian packages have all been just tarballs. The source packages have just been tarballs with a Debian directory and a couple of Debian-specific files in it, and those tarballs get shipped around, and an archive is made up out of a bunch of those tarballs, and of the binary packages that get built from those tarballs. And those tarballs are generally GPG-signed. So that's how Debian has traditionally worked. Over the last, I guess, 15 years, the ecosystem has evolved, so fortunately we're now using Git for a lot of the packages, not for all of them, since there's no mandate whatsoever within Debian that packages have to be in Git, and upstream packages have also moved to Git, which is great. I've worked on version control systems before, and as a version control geek I'm a little bit sad that we have a monoculture now, but it's really nice that there's a single version control system that you have to deal with, rather than a plethora, which makes it really hard if you're packaging to interact with all of the different systems. And there is, in Debian, also no formal requirement that you have to maintain your packages on our Git hosting site, but the vast majority of packages are on that site. There are also packages that are hosted on people's individual Git servers, or on GitHub, or wherever, and there's a header in the packages that allows you to declare where packages are located. And this is a graph of trends overall within Debian.
You can also declare, or you could also declare, other version control systems. You can see that the big red bar at the bottom is packages that don't use a version control system at all, the big brown-yellow bit is Git, and then there's a bunch of other version control systems which you can see gradually lose popularity over time. All of this means that you can now, for the vast majority of packages, find the Git URL that's associated, make a change, and hopefully create a merge proposal. That is an improvement over downloading a tarball of the latest upload to Debian, creating a patch and attaching it to a bug report, which requires more work on your part and might mean that your changes are against a version of the package that the maintainer has changed since the last upload. A merge proposal is also less work on the part of the maintainer. And then there's a bunch of other things that have changed. There's more metadata now on where the upstream is located: we have an extra control file that basically says this is where the upstream Git repository is, which makes it easier to, for example, pull in a Git snapshot from the upstream repository, or in theory, and I don't think we actually use this for anything yet, report bugs in the upstream bug tracker automatically. This particular file is still a draft proposal, but about a third of the packages in Debian already have it, and there's some tooling as well that can automatically generate it based on other metadata that exists in the upstream repository, like DOAP files or some build files. And then one of the other changes in the last couple of years has been the broad adoption of proper gate packages to the stable releases.
And then finally, we used to have a wide plethora of build tools in Debian, and over the last couple of years we have converged on a single build tool. That also makes it a lot easier to make changes across the archive: you don't have to first figure out what build tool is being used and how the particular change that you want to make is done in that particular build tool. And this is the graph of the adoption of, I guess, dh, debhelper, over time. So traditionally in Debian, when there's an issue, and this is an example, a particular control file contains a carriage return line feed, because we don't have a monorepo, you then wait for all of the maintainers to run the tools and gradually make changes, and over the course of five years everybody maybe makes that change, and the problem is fixed. So this is an example of a control file containing a carriage return line feed. There is actually a command in there that describes what you need to run in order to fix this, but everybody has to go and run this tool, get the suggestion, run the command, and do an upload of the package, so that can take quite a while. This is what I was describing: it can take quite long to actually get through this, and it requires quite a bit of attention from a large set of people who all have to learn a little bit about this particular problem, have to run this command, have to do all of this work. So this is slow and it takes a lot of time across Debian developers. Like I said, there was this command that was mentioned there. We could just automate running that: rather than having a command tell you what you should run, we could just do it for you. And that's basically the idea behind the tool called lintian-brush.
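The kind of mechanical fix described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how lintian-brush implements it; the function name `strip_carriage_returns` and the sample control text are made up for the example.

```python
def strip_carriage_returns(control_text: str) -> tuple[str, bool]:
    """Normalise CRLF line endings to LF.

    Returns the fixed text and a flag saying whether anything changed,
    so a caller can skip packages that need no fix. Hypothetical helper,
    not part of any real tool.
    """
    fixed = control_text.replace("\r\n", "\n")
    return fixed, fixed != control_text


# A made-up control file fragment with Windows-style line endings.
sample = "Source: hello\r\nPriority: optional\r\n"
fixed, changed = strip_carriage_returns(sample)
```

Returning a "changed" flag matters for automation: a tool that runs across thousands of packages should only produce a commit when the file actually differs.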
So it has a bit of cleverness to work out where the problem is and whether it can solve it with enough confidence, and then it just makes the changes. And it preserves as much of the rest of the packaging as it can, so it won't completely reformat the entire package and things like that. This is an example. I won't go into it too deeply, but basically this is changing the section of the package, or sorry, the priority of the package, because one of the priorities was deprecated. But then there's an adoption challenge here as well, because now everybody has to install this particular tool and has to run it regularly on each of the packages they touch. As you can see, this is a graph from popcon of the number of people who have lintian-brush installed, and you can see that by 2021 it's about 140. So it's not the vast majority of developers. So we can take a step further and say: what if we just go out and discover all of the Git repositories, run this tool on the repository for them, and create a merge proposal? That's the idea behind the tool called Silver Platter. It basically just automates that whole process. But then somebody has to go in and manually do this regularly. So now we take another step and build a cloud service that does this regularly for all of the packages within Debian. And that is called the Debian Janitor. It basically scrapes all of the Vcs-Git fields and finds the ones that are on hosting sites that it supports, so GitLab, GitHub or Launchpad, and it runs lintian-brush on them. And this is the kind of thing that it does. I don't think the diff is particularly readable, but you can see it's a relatively trivial change. In this case it's fixing a Vcs-Git header and some build dependencies.
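The discovery step, reading the Vcs-Git field and keeping only repositories on supported hosting sites, might look roughly like the sketch below. The host names in `SUPPORTED_HOSTS` and both function names are assumptions for illustration; the real service's list and parsing logic are not given in the talk.

```python
from urllib.parse import urlparse

# Assumed host list for the example: GitLab instances, GitHub and Launchpad.
SUPPORTED_HOSTS = {
    "salsa.debian.org",
    "gitlab.com",
    "github.com",
    "git.launchpad.net",
}


def vcs_git_url(control_text):
    """Return the repository URL from the first Vcs-Git field, or None."""
    for line in control_text.splitlines():
        if line.lower().startswith("vcs-git:"):
            value = line.split(":", 1)[1].strip()
            # The field may carry extras such as "-b <branch>" after the URL.
            return value.split()[0] if value else None
    return None


def is_supported(url):
    """True if the URL points at one of the assumed supported hosts."""
    return urlparse(url).hostname in SUPPORTED_HOSTS


control = (
    "Source: hello\n"
    "Vcs-Git: https://salsa.debian.org/debian/hello.git -b debian/sid\n"
)
url = vcs_git_url(control)
```

A package whose header still points at a host outside the list is simply skipped by this filter, which is the situation that comes up later with repositories on servers that no longer exist.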
But it's also a change that would take somebody at least a couple of minutes to make and verify. This tool also provides a way to do QA on all of the changes. It will have a human review the diff at least, but it will also do a build of the package with the changes and a build of the package without the changes, and it will look at the diff between the binary output as well, in particular to see if anything changed. There are also some teams in Debian that maintain 3,000 or 4,000 packages, so sending proposals works like TCP slow start. It will send out one pull request to a maintainer. If they merge that, they'll get another two pull requests. If they merge those, they'll get another four, et cetera, so it goes exponentially. But if they close the first pull request, they won't get any more. So if people are interested in engaging, they'll get all the pull requests they want, and if they don't care, or for example if they've got their own workflows and run lintian-brush themselves, they won't be spammed with lots of pull requests. And we monitor the pull requests for comments by humans, so it's not just a black box that sends you lots and lots of pull requests. If you leave a comment on a pull request because you have concerns, we'll have a look at the comment and, for example, make changes. There are some risks here though. The slow start that I mentioned is one of the things that we built in to avoid getting people into a mood where they'll basically ignore these pull requests because they're low quality or because they're spammy or whatever. We're really trying to save developers time and improve the archive, but we don't want another distraction that they now have to deal with, or something that annoys them, in which case they might completely ignore things. So there are a couple of principles that we try to follow.
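The slow-start pacing described above can be sketched as a small function. It is a simplified model under stated assumptions: each round is treated as all-or-nothing (either every proposal in the batch was merged or the maintainer closed one), and the name `batch_sizes` is invented for the example.

```python
def batch_sizes(round_outcomes):
    """Return how many proposals are sent in each round.

    round_outcomes is a sequence of booleans, one per round: True if the
    maintainer merged the whole batch, False if they closed a proposal.
    The batch doubles after every fully merged round and sending stops
    after the first rejection.
    """
    batch = 1
    sizes = []
    for all_merged in round_outcomes:
        sizes.append(batch)
        if not all_merged:
            break  # maintainer is not interested: send nothing further
        batch *= 2
    return sizes


engaged = batch_sizes([True, True, True])
uninterested = batch_sizes([False])
```

The doubling means an engaged team with thousands of packages ramps up quickly, while a maintainer who declines the first proposal only ever sees one.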
We don't make any experimental changes. We only make changes that we have very high confidence are correct. If we don't have enough confidence, we'll just not make the change at all. A human can always come by and make the change. And then we try to provide as much context for the developers as we can. So we'll do a build without the changes and a build with the changes, and we'll give them the binary diff as well. Usually that's only a line or two, but normally you would have to manually do those builds if you were making the change yourself. This is something that we can provide, so we should. And we try to listen as much as possible to feedback. I think I've already mentioned this: we try really hard to be conservative in what we change. In particular, it's very easy to make a particular change across a large number of packages and have people turn off and go, yeah, this thing just sends me incorrect changes, so I'm going to completely ignore it. So what has this done so far? We've at this point processed most of the Debian archive. Unfortunately, we lost a bunch of the data with the graphs, but I've got some old graphs where you can basically see its progress. We're at, I think, around the 25,000 mark at this point. So these were direct pushes to repositories and these are merge proposals. You can see in blue the open merge proposals and in green the merged proposals. And there are some trends visible where at some point somebody in one of the teams that maintains something like 5,000 packages decides that they're going to merge a bunch of merge proposals, and you get a spike both in terms of merged and also in terms of open merge proposals. And then there are a bunch of challenges around this as well.
So a lot of packages are still not hosted in Git, as you saw from one of the earlier graphs. But there's also a lot of packages that are hosted on Git servers that no longer exist. There was a Debian hosting site called Alioth before Salsa, and there's still a lot of packages that declare that they're hosted there. Nobody has really bothered updating the headers, or in some cases the packages haven't been uploaded since we migrated. There's also a bunch of packages that still declare that they're hosted on Subversion, while the Subversion server has long been turned down. So there isn't really any good way of dealing with those. And then for packages that are not yet on Salsa, we have to maybe figure out something else to do. Either we could try and encourage people to migrate to Git, or maybe we could actually generate patches and attach those to bug reports. But there's a lot of challenges around that as well, because then you have to make sure that the patch stays up to date. And maybe you want to have a different threshold for when you actually file a patch, because there's a lot more work involved with applying and uploading that. And there's a lot of merge proposals that sit idle. One of the ways in which GitLab works is that if you are a member of a team that owns a lot of repositories, you don't get any notifications when people open pull requests. So a lot of pull requests just sit there idle unless people actively subscribe to changes. I'll skip over this because I'm a little bit short on time. I've so far just introduced the small changes that we can make with lintian-brush fixes.
There's a bunch of other kinds of changes that we're making as well, in particular merging new upstream releases, doing backports to older Debian releases, and cleaning up some of the other fields in the control files: for packages that have been orphaned, updating the headers to reflect that; for people who have retired from the project, removing them from the Uploaders and Maintainer fields; and some other cleanups. That was it. Thanks for listening. All of this is a thin wrapper on top of a bunch of other infrastructure that has been built in Debian over the last couple of years. So these are some of the other services that it's based on. And here's some more links if you're interested. Any questions? Yep. So the question was, could this help companies who currently skip proper packaging? Yes, who skip packaging and just make very haphazard big containers or just deploy by copying stuff around. Would you recommend them proper packaging with this tooling? I think, as I was getting at in the beginning of the talk, there are probably other distributions that allow you to do a much thinner wrapper around the upstream that are probably more appropriate for that, because if you do a Debian package, there's quite a lot of extra information you have to add to get to a full Debian package that meets the Debian Policy. And if you don't want to invest a lot in the packaging, a Debian package is quite a lot of investment, even with more tooling. This isn't going to write your package description for you. Maybe it needs integration with ChatGPT, but we're not there yet. When you make a change, do you have one lint rule per PR or how do you bundle them? Sorry? When you make a PR, is there one lint rule per PR or do you combine multiple fixes in one PR? So it varies a little bit per campaign. For these things, these small, really one-line changes, I combine them.
And I also revisit the PRs regularly, and then it'll just add new commits to the existing PR. So you won't get spammed with 10 things for 10 tiny changes. Thanks. I had a question. We have a similar system in the Fedora project called Packit, which helps automate packaging for DNF as well as RPM. What's involved over there is making releases, and the changelog is picked from the releases on GitHub or GitLab or wherever that is hosted. Do you plan on making something like that happen, you know, pushing updates to the packages based on the releases made on their Git repositories? Yes, this does actually do that. It's not one of the things I talked about, but it's one of the other campaigns. |
Upstream Collaboration and Linux Distributions Collaboration - Is that excluded?
The Linux Distributions Working Group @ The Open Mainframe Project |
Thank you, and welcome to my presentation: Upstream Collaboration and Linux Distributions Collaboration, is that excluded? I'm representing two projects: openSUSE, where I'm a member and a representative, and the Open Mainframe Project, where I'm leading the Linux Distributions Working Group, which includes the most important Linux distributions running on the mainframe. I'm Sarah Kriesch, and my agenda is first something about myself and how I came to this topic. Then you will receive a short introduction to mainframes and the Open Mainframe Project, what it is and what we are providing, then why we founded the Linux Distributions Working Group. From that we come to the reasons why we wanted to collaborate, and in the last step I want to tell how we are including upstream projects at the moment and what we want to achieve. I would also like to receive a little bit of feedback in the Q&A session on how you, as open source projects, want to be included in such architecture-specific working groups or collaboration projects. About myself a little bit: I have been an openSUSE contributor for around 10 years, I'm also a member of the release engineering team, and I'm responsible for the s390x architecture, so I'm also the team lead for s390x at openSUSE. I wrote my bachelor thesis at IBM, and afterwards I became a DevOps consultant at Accenture. I'm also allowed to contribute a little bit to openSUSE via my job. Anyway, I had this idea to found the Linux Distributions Working Group, and with that I am also the co-chair of this working group. Mainframes: perhaps not everybody knows them. That is the latest one, the z16 system, on the right side. Mainframes are large high-performance computer systems and are also called big iron. The architecture behind IBM Z is the s390x architecture; the x at the end stands for the 64-bit architecture, which is also included in such a system, and such systems are used
for mission-critical data, for banking services, and everything else, and you can run thousands of VMs on such a system. The Open Mainframe Project is an open source project for the mainframe. It was founded in 2015, it is under the umbrella of the Linux Foundation, and it should be the focal point for deployments and the usage of Linux and open source in a mainframe computing environment. Therefore we have multiple mainframe-centric projects. I will not explain all the projects because time is a little bit short, but we also have a mentorship program included. COBOL perhaps some have known before, but most projects are more z/OS based. Therefore it is no surprise that we have also founded our Linux Distributions Working Group inside of this Open Mainframe Project, and besides that we also have a COBOL working group and a z/OS enablement working group. z/OS is an alternative operating system by IBM and a little bit commercial, but IBM is working on providing open source projects for that too. Here we can see an overview of all included Linux distributions. SUSE, Red Hat and Ubuntu have joined after the community Linux distributions Debian, openSUSE and Fedora. SUSE is the sponsor of the Linux Distributions Working Group, and then Rocky Linux and AlmaLinux have also joined afterwards. Our structure: Elizabeth and I had the idea to found this Linux Distributions Working Group at IBM Z Day two years ago. We said we don't want to have it only for one or two Linux distributions, we want to have it for all, so that we can achieve better support and better collaboration between all of them. Then we said we want to have a minimum of one representative for every Linux distribution, which is required for the input from the distribution side, and SUSE said they wanted to sponsor it. Our goals are creating a place for collaboration across Linux distributions via the Open Mainframe Project mailing list, the
wiki and the chat. Then we wanted to provide a space for distributions to request help on their s390x ports, if there is something not working or anything else like that. We also had the goal to ensure that any and all infrastructure required is available for supporting the ports, therefore you can request hardware and everything else. Debian has its own mainframe, as an example. You can request support from the LinuxONE Community Cloud; I have also included a slide about that. And I said I want to have better support from IBM to fix s390x-specific bugs, so we have the Distinguished Engineer Ulrich Weigand included here for that, and we are collaborating with him via the mailing list and our meeting sessions. Then our collaborative process: the first step is problem discussions on the mailing list. If anything is happening, we can discuss the problems, we can sometimes reproduce issues in our Linux distributions and discuss them on the mailing list, and then we are forwarding issues and ideas for improvements to IBM, which are then forwarded internally. We also have monthly meetings for half an hour every month. That is more of a get-together with a review of what has happened in the month and what the next steps are, any other problems or any other news, and we are also sharing our knowledge in this half an hour. This collaboration is a benefit for all. With that, upstream contributions are available for all, and that is lowering the research and development costs, because we have our point of contact where there is an exchange between these distributions. We can move forward faster, and that is a benefit for IBM and the community.
Additionally, we have the same solutions for our Linux distributions. We can use the patches from Debian, Fedora and openSUSE, share them, bring them upstream and test them together, and with that we also have the same solutions for s390x-specific problems. We share our knowledge between the communities, similar to here, where we have one devroom together and are sharing our knowledge; that is also available in our working group. Then we are also increasing innovation a little bit. We have diverse community ideas, we can bring them together, move them forward and bring the latest technologies into our Linux distributions, and with that we are also accelerating the Linux development for s390x. So if we are working together, I would say we can achieve more together than alone. That is also the reason that IBM is providing the LinuxONE Open Source Software Community Cloud. You can receive VMs on the LinuxONE systems, which is our mainframe configuration for Linux, sponsored by IBM in the United States. The LinuxONE Community Cloud provides 120 days of free access for single open source contributors, where you are receiving SLES or Ubuntu, and then it also provides long-term access for open source projects. In our case, we have five VMs available for openSUSE and we maintain them on our own. We have upgraded a SLES to openSUSE on them, and with that we can also develop on them. With that, we are coming to the idea of including the upstream projects. It is easy to include base projects like the Linux kernel, compilers like GCC, or KVM, because we already have developers at IBM who are contributing as maintainers and they can interact directly on our issues and bug reports. But there are many other projects included in Linux, not only the kernel and the toolchain. We are receiving new programming languages, we are using the latest databases or anything else. The Linux world is really wide.
This is an example of how we have done that in the GCC bug tracker. We can create a bug report, it arrives, and then Andreas Krebbel, as an example of an IBM developer, interacts on it, analyzes it and creates a fix. The process at GCC is: upstream bug report, IBM maintainers receive all s390x-specific bugs, and the maintainers interact. One hint in this direction: these developers are also open to joining our Linux Distributions Working Group for discussions. But from our point of view that is not enough, because there are many other projects. An example is Core, where I created an issue last month because the s390x enablement was not working. I created my issue on GitHub, the developer tried to fix it and said, I don't know how to fix it, I don't know s390x-specific things. I asked, should I forward it to IBM? Yes, please. I wrote my email to the mailing list, and Ulrich Weigand, the Distinguished Engineer, then responded directly: that is the problem, and it has been working. But such a process is a little bit longer and requires us as Linux distribution maintainers. In this case, should we include such projects with invitations to our mailing list, or is it our responsibility to forward from the upstream project to IBM? That is a question there. The reason for such required forwarding is that there are so many open source projects everywhere. IBM does not know all the software integrated in Linux and the latest technologies, especially the new ones, and most IBM maintainers are only available for the base projects, which are called the strategic open source projects, which is a little bit funny. In any case, they have their strategic projects, the maintainers are working on those, and that's it. If anything new is coming in, they have to find someone newly responsible for that one. And we, the Linux distribution maintainers, know our requirements and what we want to include as the latest technologies, therefore we
want to achieve a new connection between IBM and upstream projects, which is missing at the moment. If you are interested in joining us as an upstream project or anything else, we are open for that now. Here is our wiki link for the Open Mainframe Project with the Linux Distributions Working Group, then we have our mailing list and our meeting sessions on Zoom every second Tuesday. The invitations are coming via the mailing list. Now I want to use the rest of the time for discussion: how do you want to become involved in such architecture-specific working groups as an upstream project? One hint: there are also distribution collaboration mailing lists available now. Two weeks ago we received an email: there is one for the kernel, then I have seen there is something for security topics, and there is also one for the ARM architecture by Linaro, I believe. So a collaboration mailing list, something like that, exists already. But what is your expectation, and how do you want to be included if you have architecture-specific problems? Have you got any wishes? Ben, say here, nothing? Yeah, by the way, your consistent branding across the various sub-projects that you had on the other slide, I just want to compliment it, it is really nice, the way they are all using the same palette. Sorry, I didn't mean to mess up your slide system. You mean GCC and everything else? No, the various things under the Open Mainframe group. We have lost the slides now though. Anyway, yes, I think that is super cool, I love it. So the distributions you had in there, I noticed you had Fedora in there and I saw Red Hat mentioned. Does that mean RHEL is involved? Yes, Red Hat is also involved, if we want to give something forward to Red Hat, I am for open mainframe group, we have the Why is it not creating energy? Why does it not share it?
And here we have something about our Linux distributions, with all information about our architecture-specific mailing lists and everything else, and who is responsible for what. Dan is listed here, openSUSE is listed here, where I am responsible for the distribution. Rocky Linux is listed here with Louis Abel and Mustafa Gezen. Then we also have our meeting sessions page, where you receive an overview of who is attending our meetings. Here you can also see the Debian responsible person, Dipak Zope. Nikolai is then the one from SUSE responsible for s390x who is joining our sessions. David Edelsohn is the CTO for Open Source at IBM, and Ulrich Weigand is the Distinguished Engineer for Open Source and Linux. These persons are mostly the default persons joining us. Dan is only listed for Fedora because he has joined as a Fedora representative, but he is also responsible for s390x at Red Hat by default. I was just, I am with the CentOS project and we sit kind of midstream between Fedora and RHEL, so I am just wondering, you know, should we be getting involved or is it enough that we have Fedora involved? So, I will talk to our Fedora for a second. Yes, my wish was, by default, two different persons, because we have also separated openSUSE and SUSE in our case. I asked multiple times at Red Hat, can we receive an additional person? Yes, I will forward it, and nothing has happened. And then Dan said, I am also from Red Hat. Okay, then you are for both. We are open to having an additional person from Red Hat, but in general. Okay, I will forward it. Yes, if we have one person from every distribution, why not? As an upstream maintainer, if you want to support s390x, what will be the best way to do it?
So could we just document some contact points for your group, so we can say, if we have some issues with supporting the s390x architecture and can't fix them ourselves, try to follow up with your working group? Yes, that's our goal, what we want to provide. As a first step, we have our mailing list. If you have any problems or anything else, you can write to us and we will look at how we can support you. Okay, so. All the IBM people who are required are also on this mailing list, and they react within about two hours or something like that on the issues which are coming in. Therefore that's our first step for how we want to include you. We are also open to including you in our meeting sessions then, but that would be the first step. Okay, so that would be the preferred point of contact, I guess, for our development documentation. It depends on which upstream project you are. Which one is that? For example, Python. That is Python. Then for Python, I expect, there should also be a point of contact upstream from IBM. But they don't know the whole module things, that is the problem. Therefore, forward the issues and problems, and then you are receiving the support. Yeah, that makes sense. Any further questions? Ben, we have. Perhaps two minutes. It seems the people from the next session are joining now. Thank you for joining my presentation. Awesome. Thank you. Thank you. |
AMENDMENT Linux Distributions’ State of Gaming
A Case Study of Fedora Workstation |
Hello, folks. Welcome to my talk. I'm Akashdeep Dhar. Today I'm going to talk about Linux distributions' state of gaming. I'll talk a little bit about myself first. Basically, I'm someone who had been contributing to the Fedora project for around a couple of years before the folks who back the Fedora distribution as a corporate entity thought that, well, I'm not going to leave anyway, so they might as well hire me. The next thing that I find myself doing is working as a software engineer for a team that manages infrastructure for Fedora as well as CentOS. It's called the Community Platform Engineering team. As well as, you know, just because Fedora is kind of close to my heart, I mean, it's kind of CentOS at this point in time, but misappropriation. I work on the Fedora Council as well, being the objective representative for the Fedora Websites and Apps team. Gaming has been a prime concern for me, especially having laptops that don't quite run games. So the thing that I used to do is use distributions that can give greater headroom to the game instead of running some fancy stuff in the background. That has been the entry point to Linux distributions for me. Over the course of the last five years, I have written multiple articles and given multiple talks on how to run video games on GNU/Linux distributions, how to benchmark them, and driver installs too while we're at it. So I watched this movie called Zootopia, and there's this fox character that I can't remember the name of. He says that, you know, the best way of giving a talk is to ask yourself a question first and then answer that question. So I guess I'll do that. We're going to ask ourselves three questions about the state of gaming on Linux distributions. The first being, is it popular? The second being, is it convenient to make happen? And the third being, is it performant?
Like, why even consider gaming on Linux distributions when there are consoles and other platforms which are actually built to do that? So it certainly is popular. I mean, we can totally thank our friends at Valve for the Steam Deck, and other people who run a lot of games on their Android phones. I mean, Android is Linux, all right. But then again, is it popular the way we want it to be? So there's this small asterisk over there. We have things for emulation. We have operating systems dedicated to running video games, distributions like Batocera Linux, or Lakka, which runs RetroArch and nothing else underneath it. Then there's this thing called RetroPie, which runs EmulationStation. If you have gotten yourself a Raspberry Pi and looked for some DIY tutorial on the Internet, it's probably one of the first five things you will end up seeing. And finally, consoles that actually use Linux underneath. Speaking of convenience: is it convenient to run Linux distributions for gaming? You most certainly will have different opinions regarding the kind of configuration that you want to do, if you want to tailor-fit your stuff and get the frame rates that you want and the quality that you need. There are more configurable options like bare-bones Wine or RetroArch, so that you can tailor-fit your stuff, or something like Android phones or the Steam Deck, which can do that for you. And finally, we have performance. Now, I have seen this over the course of years running video games on Linux, the ones that are supposed to run on Windows: if the games run, well, if they do, they usually end up being 15 to 30 percent more performant. You can find the references in the slide deck, if the font is a bit too small, showing that this is actually the case. But what exactly is the sacrifice here? Also, I mean, I can pass through a GPU.
If I have this big GPU, I can pass it through to virtual machines and get near-native performance, instead of running on, say, hardware that is totally not OK for a certain game. But yeah, why exactly do I have those asterisks out there? If it's performant, if it's convenient, if it's popular, then what's with the "terms and conditions apply" kind of thing that I have over there? Well, there are things that we're missing out on. So the first thing is, it is popular. People are enthusiastic about it, but fewer people are enthusiastic about it. And it's usually the people who like to spend their hours configuring things, writing config files, and hacking stuff to be able to run some games on their desktops or handheld devices that run Linux distributions. And that's barely around 1.38%. And that's the survey where I got that 1.38% from. It is something that has been going up over the last couple of years. Here again, we have our friends at Valve to thank, because of the Steam Deck, for the increased usage. But a lot of these users are going totally unaccounted for, because telemetry is a big no-no for us. We definitely advertise telemetry as something that we should not do. And, well, there are things like Lutris, RetroPie, RetroArch, PlayOnLinux, and Wine, and these tools are doing the best that they can. But guess what? We don't get to know how exactly they are being used. And as a result, the publishers think, well, Linux distributions, who uses Linux distributions? Why should we port our games to Linux distributions? We'd better not. We might just recreate them for other platforms, well, I won't name any, but with comparatively higher market share, right, or for consoles. Because guess what? Consoles are supposed to be for gaming, not for writing code. And then, you know, third-party developers don't bother. They don't really care. They are like, ah, fine.
They'll use their stuff to emulate our games on their platform. They call Wine an emulator. We know the difference. They probably don't. So I go have my friends have a conversation with my friends that, yeah, you play this game on Windows. Here's how it can run on Linux. And they're like, oh, my God. That's too many configuration files. That's too many hours of work. And all my friends are on Windows. So sorry about that, brother. But I'm going to be at Windows as well. And the other person, they try installing games. But guess what? It's a multiplayer game. So, and it's like, oh, no, we don't recognize this platform. So you're cheating. That kind of stuff that totally puts people off. They don't want them to be here, even though they totally are not. Talking about convenience, it's, well, it's convenient for some people. The some, you know, I kind of count myself in the minority because I can totally go behind the screen, do hours of stuff. But for what about others, you know, what about the folks who just want to spend some time playing games on a weekend? You know, someone who have a busy life. So you don't expect them to actually sit behind their computer screens for like five hours configuring stuff, right? You want them to actually be able to play games on the get go. It does not quite happen that way for the most parts. Few games work as it is. Some of them require minimal configuration, like some slides here, some slides there, maybe some versions of DXVK. But others, they don't work at all. And, you know, you don't get to know that they don't work unless you spend hours of it banging your head on the wall, trying to make them work, and then you realize that they really don't. So the convenience, like, they follow through steps. But just because the Linux distributions are so fragmented, we have a certain version of package manager in a certain distribution. There's a certain way, certain root FS are installed, stuff like that. 
So you can't quite expect the steps that work on one distribution to be replicated on another. There would be certain steps required in between, and that only comes with experience. But can we expect experience? Of course not. So people have difficulties doing that, and then people don't really want to spend the time, and that's all valid, because guess what? People are there to play games, not to become contributors to open source software, right? Talking about performance, right? One of the things that happens with performance is that there are games that run comfortably, right? All fine, good frame rates, good graphics, and so on. At the very same time, if you use that same distribution and think, oh, this runs Final Fantasy, I might as well run Warframe with it, then it won't happen. It won't even load up, let alone have good frame rates. So there is some tailor-fitting required, but that tailor-fitting works for one thing and does not work for something else. So you don't have a one-size-fits-all kind of solution for gaming, which is sad, but then again, it is what it is right now. And, well, there are some publishers who do not even support these environments. They're like, nope, not this, not that, nothing at all. We won't let Linux users play our games, because that's not how we do things. And look, people are able to get frame rates. I was able to get high frame rates and good performance, but there have been times when it has been totally inconsistent. The same thing has happened with my friends as well, using a variety of Linux distributions. So it's definitely not just for me, but also for my friends who use Pop!_OS just because it allows for having NVIDIA drivers installed from the get-go. So you don't really have to pop open a terminal and do some crazy voodoo, according to my non-technical friends, to be able to install drivers. It just works from the get-go, but guess what?
Even they get inconsistent performance. And then there are ports made for Linux, but because there are not a lot of takers (here again, telemetry, because publishers don't get to know that there are actually people playing their games and working hard to make them compatible), publishers pull them, even if there was a version at some point in time. Right. So if all I have are complaints, is it all bad? Is it something that does not run at all? What is it like? You know, it's quite the opposite. It's not bad. The community has been doing a great job. If I look back 10 years, people had to use Wine as it is, right? And it's a tool that gets things done, but then again, if you want things to get done, things need to be abstracted for you to be able to understand them. And if someone like a web developer is made to understand the things that go on behind the scenes, oh my God, then it's totally not worth doing. So there are tools like Lutris and PlayOnLinux that abstract the stuff that Wine does. So it's a lot better right now than it has ever been before. But then again, there are things that we can totally do to make things a lot better than they are right now. So there's a silver lining. It's a small market share. People can be unsatisfied with big config files and stuff like that. But here are six ways that I think, you know, it's all subjective. There can be other things that people think of as well that I could add to my list and probably make it 60. But one of the things that people need to understand is whether it really is a technical challenge, right? We have a lot of people working hard to make these things work. Drivers, no matter how hard it can be for the proprietary drivers, the proprietary blobs, and the kernel modules to be loaded, people are working hard on them. But people are not understanding how exactly this is affecting or influencing the gamers, the folks who actually use GNU/Linux distributions to play these games.
So there should be some way of having open metrics, you know, some kind of telemetry that is not shady. It does not look through your contacts and work out, oh, this person reaches out to this one at a certain point in time. Not that kind of metrics, but rather which tools and which workflows are used to make these video games work. So these reliable metrics should be implemented for the developers to understand that, yeah, their work is indeed worth it. And for the publishers, because, oh boy, they think that the market share is small. The rise in market share has, for the most part, only been the one that we get to see from Steam, and we should have more than that. So I don't know if there are metrics in Lutris, Bottles, or emulators. And I guess there are none apart from the ones that they themselves collect, because guess what, they really want to see whether their stuff works or not, and how exactly they can improve their own software. But does it all add up into GNU/Linux gaming metrics as a whole, and make people understand that, yeah, folks using Lutris, folks using Bottles, folks using the Steam Deck, folks using Android, they all combine together into one market share for this entire gaming space? I don't think so at this point in time, but this is something that we should definitely consider. The next thing, of course, is to account for feedback and promote participation. I have seen over the course of the last many years that if you make people feel like they are being heard, no matter what kind of project it is, whether it's creating some websites, deploying things on the infrastructure, or gaming, if people are heard, if the features that they suggest are implemented and the bugs that they say are bothering them are fixed, chances are that they will tell their friends that this is some kind of tool that they make use of and that their friends should make use of as well.
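Returning to the open metrics idea above, here is a minimal Python sketch of what a non-identifying report and a combined share calculation could look like. Everything in it is an assumption made for illustration: the `GamingReport` fields, the `preview` helper, and the `tool_shares` function are hypothetical and do not correspond to any existing telemetry in Lutris, Bottles, Steam, or any distribution.

```python
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GamingReport:
    """Hypothetical report: only tool and workflow data, no personal identifiers."""
    tool: str        # e.g. "Lutris", "Bottles", "Steam"
    workflow: str    # e.g. "wine", "native", "emulation"
    game_ran: bool   # whether the game started successfully


def preview(report):
    """Return exactly what would be shared, so the user can inspect it first."""
    return asdict(report)


def tool_shares(reports):
    """Combine reports from all tools into one share-per-tool mapping."""
    counts = Counter(report.tool for report in reports)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tool: count / total for tool, count in counts.items()}


reports = [
    GamingReport("Lutris", "wine", True),
    GamingReport("Lutris", "wine", False),
    GamingReport("Bottles", "wine", True),
    GamingReport("Steam", "native", True),
]
print(preview(reports[0]))
print(tool_shares(reports))
```

The point of the `preview` step is the "open" part of the idea: the person sending a report can see the full payload before anything leaves the machine.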
And I like to think that reporting for bugs is also a very valuable contribution. So being empathetic to the users, understanding by putting themselves on their shoes, what kind of issues that they end up facing. And finally, understanding the tools that people make use of. If there are a lot of dials, if there are a lot of dropdowns, if there are a lot of things that people have to do before they are able to run their games, it's probably not the most convenient way of doing so. So you can't quite expect a person coming back home after a long day to be able to tweak those stuffs and make them work. So the usability of them, how do we make it more convenient, should be something that we should look into in these distributions. You know, kind of streamline the entire workflow in order to make sure that people know where exactly they need to go to, to get a certain function, to get a certain settings applied. And that's more about the convenience of, you know, so that people can focus solely on the video games and not around the operating system that's built for the sake of running games. Because trust me, if you have a PC, operating system, you know, video games are the one thing out of a thousand things that you will do. So it just makes a lot more easier to focus on what they want. For convenience, we totally should be able to abstract complex things when we need to. So customization is fine. It's one of the reasons why we are fragmented and I'm kind of thankful for it so that I have a choice that I can customize a distribution of my own kind to be able to make something, to serve a certain purpose. But for the folks who don't need, it's going to be overwhelming. They're going to be really scared of all those options put out there in front of them and they'll be like, oh my God, no, definitely not. And they'll run back to the thing that they were playing games on. So it's definitely not something that we would want to do. 
Being able to provide a balance between the two of them and organically finding, oh, fine, this person is scrolling down the menu. So probably it's looking for something that's a lot more extensive than what we are provided for. Some kind of organic way to find it and to be able to demonstrate how they can do it is a way that would strike a good balance between the complicated looking stuff and people who really want to get their job done as quick as possible. Finally, for distributions that actually prioritize these tools, these workflows, these applications, these should be available like natively in their own repositories or there should be a way to be able to install them and not like build from source or like dot-slashing them out of the blue because who runs shell script files anyway, right? You should definitely read them. Drivers, codecs, kernel modules and things like that, if you don't have a way to update them natively, trust me, it's really a bad choice to be able to using that distributions. Say, people are here to play games, not to build software from source, so definitely we should consider having all of these things packaged natively. And finally, to build standard workflows to be able to test and quantify that, yeah, what is good performance, what is bad? Now, me, I can be really biased towards good performance even if I see 60 frames per second on a 165-hertz screen. I can tell that is good, but for someone else, it's like, oh, no, it's just like one-third of that frame rate. How do you call it a good performance? In that very case, we need to understand and tell that, yeah, this is the criteria that was used to tell that, yeah, this video game actually runs and this video game does not and could use some more work before it's able to be, well, executing the way it should be. 
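The idea of agreeing on criteria for "good performance" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "1% low" definition used here (average FPS over the slowest 1% of frames), and the 60 and 30 FPS thresholds are all hypothetical choices, not a standard that any distribution or tool has adopted.

```python
from statistics import mean


def frame_stats(frame_times_ms, refresh_hz):
    """Summarise a capture of per-frame render times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    avg_fps = 1000.0 / mean(frame_times_ms)
    # "1% low": average FPS over the slowest 1% of frames (at least one frame).
    slowest_first = sorted(frame_times_ms, reverse=True)
    count = max(1, len(slowest_first) // 100)
    low_fps = 1000.0 / mean(slowest_first[:count])
    return {
        "avg_fps": avg_fps,
        "one_percent_low_fps": low_fps,
        # Fraction of the display's refresh rate that is actually reached.
        "share_of_refresh": avg_fps / refresh_hz,
    }


def meets_criteria(stats, min_avg_fps=60.0, min_low_fps=30.0):
    """Apply explicit, published thresholds instead of a personal impression."""
    return (stats["avg_fps"] >= min_avg_fps
            and stats["one_percent_low_fps"] >= min_low_fps)


# 99 smooth frames and one 50 ms stutter, measured on a 165 Hz display.
capture = [10.0] * 99 + [50.0]
stats = frame_stats(capture, refresh_hz=165)
print(stats, meets_criteria(stats))
```

Publishing the thresholds alongside the numbers is what lets two people with different expectations, such as 60 frames per second on a 165 Hz screen, agree on whether a game "runs".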
And when we have all of these things in place, probably the publishers of triple-A titles, popular ones like that, will be able to understand that, yeah, there are some kind of standards used in this fragmented world of distributions, and that, yeah, if we follow these rules when creating our games, they will be compatible with at least 85 percent, 75 percent (I'm saying this off the top of my head), but at least the majority of distributions will not have a problem, and you won't be told to use a certain distribution just because, well, your friend uses it. So let's have a case study of Fedora Workstation at the end. We have had distributions based on top of the Fedora Linux distribution, like Nobara, which have made meaningful additions on top to make sure that people who really want to focus on video gaming don't have to install much stuff themselves. And, you know, it's heavily popular with the folks who develop Bottles and Lutris, because they get the latest and greatest stuff in the official repository, so they don't really have to go out of their way to do so. And then there is the required tooling to be able to run these games, the drivers, and the ability to install them from RPM Fusion in case they're proprietary in nature, which is totally possible. And the fact that the GNOME desktop, well, what can I say, it's just one of the great ones. I'm a bit biased. But, you know, it totally keeps the workflows aligned and, well, intuitive. And finally, talking about consistent performance, well, let's just say that the configuration should be done in a modular manner, so that if I do a certain thing, I can copy that stuff and give it to my friend. Here, you know, you don't have to spend many hours like I did; paste that stuff and this should be running. Or something a bit more polished than that, but basically, no more repeating of effort.
And, you know, customizable enough to be actually minimize the footprint of the operating system, the distribution itself, to be able to dedicate more of that performance over to the actual video games. And that's pretty much about it. I'm totally open to your questions. Thank you for your talk. I got the impression that you're pushing for telemetry to be used more. Would that be right? Well, let's just say I'm pushing for an open telemetry. So you get to see what kind of information is being shared with the folks and what folks are you sharing that with. So you don't think that, oh, it's the shady number of information and with the shady number of folks that it is shared with, right? So telemetry is important. There are software that have telemetry like pre-built and they have it natively done, but then again, it's just a limited set of telemetry people. Once we unify this and have a place where we can say that, yeah, it's coming from them, so there's something that we can improve upon in the distributions level so that we can understand that, yeah, a certain application or an emulator is acting up and there's something that we can act on. Any more questions? Hey, thank you very much. I think on the telemetry side, there is a fundamental metric that the developers look at which is sales. So I think the main thing is like there is a sort of 1-2% audience on Linux that will buy games. I think that's pretty clear. I think there are other advantages for developers having Linux users in early. We tend to report bugs and if we do that in a helpful and non-annoying way, then we can be an asset particularly to the trend of people doing early access releases and wanting engaged users. So I think that's the thing that you can think about supporting early access games, supporting stuff on edge I think is helpful. 
The other side is, if you just want to play games, by far the easiest thing to do is ignore your distribution and install Steam. Proton handles the config wrapping around Wine really conveniently, and the best game of the last five years was released natively on Linux anyway, so play Slay the Spire. Thank you. Thank you so much. One of the things that you mentioned is that we could probably have some kind of telemetry done on a central basis, and that early access can be something that we provide to people using Linux. It's one of the things that we can totally use to actually increase our market share and make people feel like it's worth it to give it a try, and people might end up actually buying it. When it comes to Proton, I mean, geez, they have done a marvelous job of abstracting what's not important or what's really, really scary. It might scare people out of the room and make them leave the Steam Decks that they purchased with their hard-earned money to be able to play games while they're on the go. So it's some kind of abstraction that keeps things convenient that we are all looking for, and there should be a balance, so people should not be like, oh, geez, it's so abstracted that I can't do anything anymore, right? So that should definitely not happen. All right, folks, I'm going to give way to the next talk. Please find me over here if you have more questions. Thank you so much again. |
Building a Web UI for the Fedora installer
the reasons, the tools and progress so far |
Hello. Welcome to my talk about building a web UI for the Fedora installer. So my name is Martin Kolman, and I work in the team that's building the Anaconda installer used by Fedora, RHEL, CentOS, and related distributions. First, I would like to talk a bit about why we decided to actually build a web UI for our installer. And, yeah, first, very, very shortly about the Fedora installer project. Yeah, the name of it is Anaconda, which is very confusing for some people doing Python in the scientific domain, because there is a similar project, in that it's also a Python thing and it's called the same as we are, but I think we are older. So, anyway, right now we have a GTK3 UI for the installer. We have a text-based UI. It's also possible to fully automate the installation. We have things like add-on support, and, yeah, we are used, as I mentioned, by Fedora, RHEL, CentOS, and others. This talk basically concerns only the graphical user interface. We don't expect to have any changes for the text-based interface and the Kickstart-based automation in the context of the web UI. So, why did we actually choose to do something about the current graphical interface, and why did we choose to start working on a web UI? One of the points is that the current GTK interface comes from the year 2013 and kind of looks like early GNOME 3, not by coincidence. It was built at basically the same time. And over time, we added new features. We fixed bugs. We adapted to various Fedora changes, for example. And the stuff kind of got bolted on. It was not always possible to change the UI. So, in some cases, it's getting a bit clunky already. Another issue is that some of the technology we built it on is getting a bit old right now. GTK3 is not that old at the moment, but you already have GTK4. Eventually, we would have to port it. One of the issues is, for example, the Fedora installation image. The Fedora project tries to have minimal dependencies of applications.
So, like, over time, you want to have the minimal amount of libraries. So, we would quite possibly have to migrate to keep the image sizes small. That's one of the reasons. We also still run on top of X. There is even a hard dependency on it right now for keyboard layout switching during the installation. So, this is something we would have to address anyway. The remote access to a graphical installation right now is not the best. It's based on VNC. So, it's insecure. It's not very efficient. It requires you to have a graphical system running on the host that you are installing. And you need a special application that might not be available and that users might need to install. So, that's one of the issues. And also, I'm not saying it's not possible to test GTK3 interfaces, but basically, it's not that simple. And we don't really have any unit test coverage. There are people from, for example, the Fedora QA community who do test Anaconda. But what they are using right now is basically screenshot-based, graphical bitmap-based testing. So, this is something that could be improved. And also, what we have seen in the past years is that there seems to be a clear trend towards using web UIs for system management. Some of you might still remember some of the system-config tools on Fedora, CentOS, and RHEL that used to be available to configure stuff like services, networking, and firewalls. All of these, over time, effectively became Cockpit plugins for the Cockpit web console. So, this seems to be the trend overall for system management as far as we can tell. So, what we kind of found out is that there are some benefits of doing something about the current UI situation and doing it with a web-technology-based UI. While we are at it, we can address some of the UX issues we have right now, because it's effectively a fresh start. It's easier to achieve consistency because, yeah, you are building the whole thing.
So, you can make sure that it feels similar and that it's using the same concepts and the same workflows for everything, hopefully. Another thing is that, given the proliferation of web UIs basically everywhere, there seems to be a much bigger community of users and developers of these technologies. And there is overall more documentation, and there are even more resources for non-developer roles like UX designers or usability testing projects. And this seems to be, unfortunately, quite lacking in many native GUI libraries right now in comparison to the web technologies. And also, one quite big point for it is that, just to be specific, we are going to use the web UI both locally and remotely. So, we want to run it for the local graphical session, if any, but it also makes it much, much easier to access the installation remotely. So, for any headless installations, it should be much easier for the users of the installer to connect securely and much more efficiently to the host that is being installed. Also, the host doesn't have to contain any graphical dependencies, effectively, because all the rendering is happening on the client. So, the installation image could be much smaller. And also, the installation-time resource requirements could be much, much smaller. That could be an issue for stuff like Raspberry Pis or some IoT SBCs, which are perfectly fine for the tasks you will be using them for. But if you try to do a graphical installation on them, barring possible issues with drivers, it might need much more resources just to install, to bring up the graphical interface, than it will be using for its lifetime of doing some useful work. So, let's talk a bit about the technical details of the tools that we are using to build the UI for the Fedora installer right now. So, this is the overall architecture. The installer is already modular, in that it has a Python backend, which has a D-Bus interface.
Then we are using Cockpit to provide us a bridge between D-Bus and the web application, which itself is then written with React for the logic and PatternFly for the web UI widgets. The current Anaconda with the GTK3 UI, with the text UI, and even with the Kickstart support, is actually using the same Python backend, and even the GTK3 UI is already communicating with the backend via D-Bus. So, this makes it possible for us to work in parallel right now, in that we are building the web UI next to these other UIs. Just instead of directly calling D-Bus, you have PatternFly widgets and React talking via the bridge, making D-Bus calls to the backend. This is very similar to Cockpit plugins in general. Usually, you have the networking screen in Cockpit, for example, and what it does is talk to NetworkManager via this bridge. It's doing D-Bus calls from the browser. That's basically the idea of Cockpit, and we are reusing this for our project. Yeah, so, as I've already mentioned, this is not about having another UI that you can access remotely while keeping the current graphical interface next to it. Eventually, once we cover all the necessary functionality for the given project or product, it should replace the current graphical interface. But right now, it's being developed next to it, and thanks to the modular backend and the D-Bus interface, it's not that hard to do. Also, one more thing that we found very, very useful is the Cockpit test framework. This is addressing the issue I've mentioned previously about having no unit tests for the graphical interface. This is something that has been developed for the Cockpit project itself, which directly maintains a lot of the screens you will see when you install Fedora or CentOS or some such distribution and enable Cockpit. But there are also many other Cockpit plugins that are community-maintained, outside of the main community developing Cockpit.
So, there is a very comprehensive and, I would say, very nice test suite that makes it possible to essentially write Python unit tests that then manipulate your web UI or Cockpit plugin, in our case, the Anaconda Fedora installer web interface. And it also supports pixel testing, about which we were thinking, yeah, this is nice. But then we actually thought about the other issue that most web applications have, and that's dependencies. There are dependencies being pulled from NPM for PatternFly, for React, and for the other libraries you might need to use. And the problem with this is that the release cadence is pretty fast. There are new versions of PatternFly all the time. And it would be very easy to get left behind, basically to end up with a very big gap from using some old version, which makes it much harder to upgrade later on. And pixel tests make it much, much easier to update this automatically, or almost automatically, because you can effectively compare whether you see any graphical changes from the old to the new version. Same thing for any changes to the web UI. You can easily see what the new state looks like, whether you see some changes that are expected, if you changed some label or added a button, or whether the layout is totally wrong. So, yeah, this is something I can recommend for web applications. It's not something we expected to be using, but it helps a lot. And it's a part of the Cockpit test tooling. So, okay, how far did we get? This all started in earnest about a year ago. And right now, we have very simple but end-to-end installer images that can be used to demonstrate the web UI. And you will actually end up with a minimal but functional system. It's possible to select an installation language. We already support geolocation, like with the current GTK3 interface. It's possible to select disks. It's possible to dynamically add disks. Again, this is kind of a demonstration of some dynamic behavior we wanted to have there.
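The pixel testing idea mentioned above, comparing a rendering before and after a dependency update, can be illustrated with a small self-contained Python sketch. To be clear about the assumptions: this is not the Cockpit test framework's implementation or API. The `changed_pixel_ratio` function and the representation of an image as rows of RGB tuples are hypothetical, chosen only to show the comparison step.

```python
def changed_pixel_ratio(reference, candidate, tolerance=0):
    """Return the fraction of pixels that differ between two same-sized images.

    Each image is a list of rows, and each row is a list of (r, g, b) tuples.
    A pixel counts as changed if any channel differs by more than `tolerance`.
    """
    same_height = len(reference) == len(candidate)
    if not same_height or any(
        len(ref_row) != len(new_row)
        for ref_row, new_row in zip(reference, candidate)
    ):
        raise ValueError("images must have identical dimensions")

    total = 0
    changed = 0
    for ref_row, new_row in zip(reference, candidate):
        for ref_px, new_px in zip(ref_row, new_row):
            total += 1
            if any(abs(a - b) > tolerance for a, b in zip(ref_px, new_px)):
                changed += 1
    return changed / total if total else 0.0


WHITE = (255, 255, 255)
BLUE = (0, 102, 204)

before = [[WHITE, WHITE], [WHITE, WHITE]]   # reference screenshot
after = [[WHITE, BLUE], [WHITE, WHITE]]     # one pixel changed by an update

print(changed_pixel_ratio(before, after))
```

In a workflow like the one described in the talk, a ratio of zero after a library upgrade would suggest the update is visually safe, while a non-zero ratio would prompt a human to check whether the change is an expected one, such as a new label or button, or a broken layout.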
That's it right now for storage. Storage, I'll talk a bit more about it later on. But that's one of our main focus points because that's like 90% of every installer. We have a review screen where you can see the settings. And where you are also told that, yeah, you shouldn't really run it right now on any production system that has any useful data because you select the disks and we will use them. We will wipe them and use them. So, that's the minimal storage we have been able to come a bit for now until we have some more comprehensive screen where you can actually keep partitions and stuff like that. And the last one is just a progress screen where you can see the installation happening, where you can see some errors if there are any where you can kind of guess how long it will take because that's not always easy to tell the user correctly. So, to have at least some pictures in the presentation, so this is in general how it looks like. If you've seen the current Anaconda, this is quite a departure from it. We decided to have a flexible result layout. And if you've seen some pattern fly applications, this should look pretty familiar. And that's one of the aims as well, like people who use cockpit or some other applications using this tool kit could be quite more familiar than seeing some a bit outdated GTK3 interface in some unfamiliar theming. So, as you can see, it's pretty similar. This is the installation destination screen. We already have some built-in help support. You can click on some of the information links. You will get a doc with help content. This is demonstrating just some simple disk selection. You can plug in an USB drive already to add more disks. We expect this to grow in functionality quite a lot for stuff like network-attached storage and more complex disk layouts. And this is the review screen. Right now it looks very similar, but again we expect this to grow quite a bit, because as we add more screens, this should directly proliferate here. 
And we are looking into ways to, for example, visualize more complex storage layouts, because that will be challenging, but that was one of the pain points we heard about from users so far. Yeah, this is the progress screen. This is basically the last thing you will see. Then it will just tell you, yeah, you are done. So that's it, like four screens right now. But it's already producing functional systems. One other outcome of this project so far is the preview image. Sorry for the long links, but essentially the main information here is that if you go to Fedora Magazine and type in Anaconda, you will get a bunch of articles about the web UI, because that's what we are writing about Anaconda right now. So there is a preview image. The idea is that we will refresh it every time we add something visible. Right now, it's about a month old, but I would expect some new features landing in the next few weeks. So this will be updated regularly. And that's the best thing you can use right now to get a feel for how the web UI looks. It's a self-contained image that effectively dumps the F37 user space onto the machine that you run it on, so please don't run it on anything resembling production. So we found some challenges while working on this. Yeah, we have a huge amount of functionality. The project is old; the current UI has been in use for like nine years. So we are really trying to check what is being used and what is not, so we don't go insane implementing it all. That's ongoing. We try to identify and avoid some of the UX problems we have right now, and to keep things consistent. That's one nice thing about PatternFly. There are pretty nice UX guidelines that you can apply to many, many things. And that helps to keep the UI consistent. Yeah, another issue is how to run this locally. That's not that easy, actually, because the web engines are pretty monolithic, pretty big.
And they come with some requirements, mainly RAM, not to mention the image size requirements. And there are actually not that many usable web engines on Fedora. It's effectively GTK WebKit or Firefox. And each one of them has some pluses, some minuses. So right now we are still comparing these two and deciding which one to use. For remote running, that's kind of not our problem that much. And that's another thing with PatternFly: if we see some corruption, some layout issues, it quite possibly would affect other PatternFly users too, so we might not need to do something about it ourselves, unlike if we used some very, very custom web UI stuff. For remote running, another issue is how you actually authenticate the thing, how you encrypt in a useful manner. So this is still ongoing, how we solve that. It might not be pretty, but one way is to show some certificate fingerprints to the user, to show some generated passwords or stuff like that. Another option is to use custom images. That might be perfect for some cases, too bothersome for others. So we will see. In the web UI image you can use right now, this is disabled. But if you use the inst.webui.remote option, you can actually access the web UI remotely. But you need to pass it explicitly, because it's totally insecure right now. These mechanisms are not yet in place. So okay, this is really in the planning stages and we don't have much time to talk about it. But the main focus is definitely storage. This will be big. We plan to have something that you can do manually and something that guides you. And it should start landing in the next few preview image releases, definitely. And yeah, more screens, definitely. The priority is driven by the next, the first image we could reply to, basically. So right now there is some date and time work already running. We have some mockups for user and root password configuration. We need to add the error reporting, definitely. And other stuff. Definitely add-ons.
The current UI already supports them. We need to keep that. And yeah, I think this is in effect the last slide. And I think we can start with the questions just quickly. Like, storage is a big focus. This is a way you can provide feedback to us about it. And let's start with the questions. Thanks. Hey, Martin. I don't have a question per se. Oh, yeah. Right. So, I really appreciate the stuff that you folks are doing. I tried doing this myself by wrapping Kickstart with Vue.js and Flask, and I thought that it would be a really feasible, really easy thing to do. But when I started implementing it, I came to know the kind of edge cases that I had to take care of. So I'm totally looking forward to what you folks end up doing. All the best. Thanks. Okay. Anaconda now has such nice features as escaping to a terminal, for instance, to bypass things Anaconda can't do at the moment. Do you retain that too? What do you plan to do? So the current text interface stays, and if you can access the machine locally, it should still be possible to do anything in the terminal that you can do today. And you should also be able to use the existing text interface. We won't be changing that. Yes. But can you escape the web interface and get a terminal, or what is the way to do that? This is not yet settled, whether we will include it, but the Cockpit project has a built-in terminal emulator. I could imagine this being included in the web UI. So we might be able to include it in our web UI as well. Would be nice if you do it. Yes. Thanks. Thanks for the feedback. Thank you for the talk. I think this is very interesting. And I think it's a good idea. You know, it's certainly convenient to set up headless machines this way. But at the same time, I was wondering, I think it was Alex Larsson who wrote this Broadway backend for GTK.
So basically you could use GTK and it would output to something that goes into a web browser. And, you know, it just comes to my mind: why not use something like that instead? Because I think that we want to continue to invest in GTK and technologies using GTK, because we need GTK for Fedora; we don't really need web for sure. And so if we can end up using GTK and investing more resources there, maybe this makes it just overall better for the whole health of the ecosystem. And we get our web UI too. So thank you. Yeah. I must say I don't have really very recent information about it, but we looked at the Broadway technology a while ago. It definitely looked interesting, but it looked more like a tech demo back then. It could have progressed since then, but I think there have been some issues with authentication or possibly performance. So yeah, that's a good point, but I don't have the latest information on it right now. Okay. I mean, then you can have the GTK and you have your web for everyone else. Well, that's another question. Before Fedora 37, we had a discussion about software RAID installation on BIOS boot machines, and we found a good solution, but Anaconda currently acts a bit strange installing software RAID on EFI systems, because we don't use an EFI system partition but a RAID partition. Do we have a chance to get that fixed too? I'm not sure. There are not that many people actually in the installer team, and I have been very much concentrated on the web UI for the last couple of months, but definitely, if you reach out to us, we have a mailing list and we have a Matrix channel, I think, right now already for Fedora. So please reach out to us using some of these channels and we can look at it. Yes. Oh, you can do that.
Is it possible to provision the machine from Cockpit? Because you can already create virtual machines in Cockpit, so it would be nice to have it integrated in one place, in one console. Is it possible, or do you have such a plan? I don't think we have integration for it right now, but that's an interesting idea. And we have been thinking, for stuff like Satellite and some other provisioning mechanisms, that it would make sense to integrate more closely with the installer, with the web UI, because you could avoid the certificate and authentication issues if you could, for example, inject something into the image. So that could work. Like, you could have "create machine" or "provision bare metal", whatever, and it could include some trust chain anchors, whatever, into the installation run, and then you could directly connect to the machine. Yeah, we have been thinking about it, but we haven't yet implemented something like this, but it seems like an obvious choice for some mechanisms. And yeah, integrating it like this with Cockpit Machines, that could be a nice idea. So thanks for the suggestion. Okay, thanks. You're at it, guys. Okay. Thank you, Martin. Thanks a lot. |
How we build and maintain Kairos
A day in the life of a meta distribution |
Yeah, welcome to this talk about how we build and maintain Kairos. Let me just quickly introduce myself. My name is Mauro Morales. I'm originally from Guatemala, but now living in Belgium, and I really like it here. Just some random thing about me: the first distro I used was Memphis. I really liked it when I was going to university and never went away from Linux. My current daily driver is openSUSE Tumbleweed. I am a Ruby and Go developer, and there are some places you can reach out if you have questions about the talk or anything afterwards. I want to talk about how we build Kairos, but for that I need to tell you what Kairos is first. If you go to our website, you will see that we sell or advertise ourselves as the immutable Linux meta-distribution for edge Kubernetes. That sounds like a handful, so let me dive a little bit deeper into that. What is edge computing? There is a trend right now of moving the operational aspects from the data center outside. What does that mean? That you might have nodes in certain places. Let's say that you have a grocery store producing a lot of information about all the sales that are happening, all the different products. Instead of sending all the bulk of data up to the server and doing some processing just to send it back to all those nodes, what you do is the heavy processing already at the node, and the only thing that you send all the way to the server is the summary or the calculations that you are producing in these nodes. This is useful for doing some machine learning, some artificial intelligence, stuff like this. The thing is, if you don't pass all the raw data, first of all, you reduce the latency of the end result a lot, because you don't have to pass all this information through the wire. Second, it's a lot more private, because you are not putting all your eggs in one basket. If for some reason the main data center gets attacked, they don't have access to all the raw information.
Instead, the data is distributed across the different nodes, and the only thing that they might have access to is the resulting calculations, for example. Or even if a certain specific node gets attacked, that doesn't mean they can access the rest of the network. It's a very interesting concept that is being formulated right now, and Kairos wants to be a solution for those nodes that are being run there. How does Kairos do that? First of all, Kairos is immutable, and that means that the operating system is read-only. The user data, of course, can be written, but if for some reason someone takes hold of the machine, they cannot install any packages, they cannot change any configuration. Even if they manage to do it, as soon as the machine gets rebooted, you still get the same original image that you used to have. If we go even a little bit deeper into that, what does it mean that the OS is immutable? Because there are, for example, some cases where the root partition is immutable, but some other areas are not. Kairos, in this case, is distributed as an image, as a whole. That means that all the OS components, including the kernel and the initrd, are immutable. You cannot change them. The initrd is not built at the moment of installation; it's already there when we ship it in the image. Another interesting goodie of Kairos is that it's distribution agnostic, or I guess better said, we are friendly to every other distribution. We don't want you to lose the distribution that you already like and love for some reason. It could be that you already have a big know-how that you have built on your distribution, so you don't want to change to a different distribution.
It could be that you already have a license that you're paying a certain company for and you don't want to switch that because of costs or something else, or it might just be that your operational team has decided on working with certain golden images, and we don't want you to have to go away from any of these just because you want the goodies that Kairos can offer you. Right now, Kairos can play well with openSUSE, Alpine, Fedora, Debian, Rocky Linux, and Ubuntu, and we're trying to add more distributions on top of that. Another interesting thing that Kairos does is it tries to be really easy to configure and maintain. For that, what we do is that we have YAML, where you can define the way you want your system to look. In this case, for example, I am creating a user called kairos with the password kairos. I want SSH keys: I want to go and grab from the user mudler his public SSH key, and I also want to put this particular SSH key, given as text, inside of the system. I also want, for example, in this case, K3s enabled. As you can see, it's very easy to use, and it's very easy to keep versions inside your Git repositories, so you don't have to break the current flow that you already have if you're already doing DevOps. Like I'm saying, we want to make it really easy to configure and maintain. We provide a new web UI that was just introduced in version 1.5 of Kairos, where you can take that configuration that I was showing you and go to a certain node, so you look for the IP of the node. The node has just been booted. There's nothing else running on it, there's no installation on it yet. You just go to the IP of the node, you paste the configuration that you want, you say install, and the node gets installed the way you requested.
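To make the configuration described above a bit more concrete, here is a minimal sketch in Python that builds the same settings as a plain data structure and renders them as a cloud-config style YAML document. The key names (users, passwd, ssh_authorized_keys, k3s) and the github: prefix are assumptions for illustration and should be checked against the Kairos documentation; the GitHub user name and the SSH key are placeholders, and the small renderer is only a stand-in for a real YAML library.

```python
import json


def scalar(value):
    """Render a scalar: booleans as YAML true/false, everything else JSON-style."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return json.dumps(value)


def to_yaml(value, indent=0):
    """Render nested, non-empty dicts and lists as block-style YAML lines."""
    pad = " " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml(item, indent + 2))
            else:
                lines.append(f"{pad}{key}: {scalar(item)}")
    else:  # a list
        for item in value:
            if isinstance(item, dict):
                nested = to_yaml(item, indent + 2)
                # The first line of the nested mapping carries the "- " marker.
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {scalar(item)}")
    return lines


# Hypothetical values mirroring the talk: one user, two SSH key sources, K3s on.
config = {
    "users": [
        {
            "name": "kairos",
            "passwd": "kairos",
            "ssh_authorized_keys": [
                "github:example-user",  # placeholder GitHub account
                "ssh-ed25519 AAAA...placeholder user@example",  # placeholder key
            ],
        }
    ],
    "k3s": {"enabled": True},
}

rendered = "\n".join(["#cloud-config"] + to_yaml(config)) + "\n"
print(rendered)
```

Because the result is plain text, it can be kept under version control alongside other infrastructure files, which is the workflow the talk has in mind.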
We want to make Kairos easy to configure and maintain, but when you have a machine at the edge, that might sometimes mean that the person in charge of doing the configuration is not there physically. Sometimes you have someone who is maybe not technically savvy enough to do the work, or whom for some reason you don't trust with that. It could be, in this case, the manager of the store, let's say. For that, what we provide is that Kairos will present itself on installation with a QR code, and then all they need to do is send a picture of that QR code, and then whoever is doing the installation or the configuration can use a command line, take the configuration in YAML and also pass the image with the QR code, and Kairos will do the configuration itself. What else? Kairos also performs A/B upgrades. What does this mean? Like I was telling you, Kairos is distributed as a full image. So whenever we are doing an upgrade, we switch the image completely. So whatever image you have in active mode, the one being run on the system right now, we download the new image for the upgrade, and then after reboot, we do a transition in which the new image becomes the active one. That is helpful because if for some reason we cannot really start that new image, there is still your old version of the OS that can still run, right? So it's a little bit more reliable, we could say. Of course things can go bad, and even if you have your two versions of the OS, you could still screw things up, and for that we also provide a recovery partition, also in this OS part which is immutable, that you can access so that you can manually do whatever fixes you might need to do to one of those two partitions. Another goodie that Kairos provides is TPM encryption. Nowadays a lot of machines or IoT devices come with another chip, generally, which can do TPM encryption.
This is useful because you can imagine it's just like having a YubiKey, or I don't know, there are other providers, right inside of your system, and that way you can trust that only this machine, for example, is the one that is going to be able to decrypt the data of the user. This is useful because, like I was saying, if you're at the edge of the network and for some reason someone steals your hard drive and puts it into some other machine, they are not able to decrypt it, because they don't have that chip on their machine. There are multiple facets in which TPM encryption can work. I'm not going to detail all of them. If you're interested in that, you should check the talk yesterday by Lennart Poettering about TPM encryption; it was very interesting. But it can get a lot more complex, we could say. You could even say, for example, that not only do you need the machine, so the chip, with your data, but you also need a Kubernetes approval in order to do that. It will really depend on the security model that you want to have, of course. If you put all of these things together, we believe that this makes Kairos a great distribution to run at the edge. What I wanted to tell you about now is how we build that kind of distribution ourselves. For that, we use something that we call the Kairos factory. Just to give you an idea of how this works, we start with Linux container images that are provided by the distributions. We base our work on the amazing work that the distributions are doing, of course. We don't want to reinvent the wheel. We take that container image, we pass it through the Kairos factory, and what comes out at the end is what we call the Kairos images. Right now, we offer two different types. One is Kairos core, which provides the immutability. It has an agent, which can be used for upgrades, for installation, and many other things. It comes with the kernel and the initrd, like I was mentioning.
We also provide what is called Kairos standard, which brings everything from Kairos core, but on top we also add K3s, a Kubernetes flavor, and also EdgeVPN, used for peer-to-peer networking, VPN and other things. The way you consume these is that we have releases. If you go to our GitHub page, you can download, for example, an ISO, and you can install it on bare metal. In our docs we also have all the different distributions that we support and links to the OCI images, the container images, which you can use for upgrades. Not only can you use those for upgrades, but you can also use them for customization. This is where I guess it gets most interesting for the people who are going to be using Kairos at the edge, because it's very simple to extend the image that we provide you. If you're already using Docker, which is probably the case, if you're used to using Dockerfiles, which is probably the case if you're using Kubernetes, well, you don't have to learn anything new. All you have to do is, in the FROM part, you use our image, and then you do whatever you want to do. For example, I'm installing the application figlet, and then I just have to tag the version of the OS that I want to distribute. Then you can have whatever release cadence of your own that you might want to have. Now you might tell me, okay, that sounds good, but maybe I want something a lot more complex. Maybe I want to do a release the way you guys do Kairos standard. No problem. We also have this thing called providers, with which we allow you to do a lot more complex things on the Kairos machine that you're building. That's exactly how we do Kairos standard. That's how we put K3s and EdgeVPN in it. You can basically just start from any of the ones that we have and build your own. That's pretty much the process that I was explaining. Let me talk a little bit about the challenges of starting from all the different distributions.
First of all, it's not so easy with the packages, because some distributions come with certain packages, some with others. Sometimes they are named differently, sometimes they come in different versions. Another problem is that the base configuration is not the same in all the distributions. For example, we recently had a bug related to QR codes not being easily displayed because the configuration in Ubuntu was a little bit different. Another thing that is quite different is the init system. Right now we only support systemd, because it's basically the most mainstream one, I would say, on the mainstream distributions, and OpenRC. That's problematic because we have to maintain two different flows of code just for these two systems, but it is what it is. We also have issues with the C standard library. Most distributions that we consume right now come with glibc, but there are distributions like Alpine that come with musl, and that makes it a little bit challenging. For those distributions, we cannot provide a kernel or initrd from that particular distribution. If you check, for example, the image that we put out there for Alpine, it will have an openSUSE kernel or an Ubuntu kernel, because we need to be able to build it somehow. Let's dig a little bit deeper now that we know the challenges that we have. How do we try to address them? Well, again, starting from a Dockerfile, in that first FROM that you see on the left, we put the distribution image, so let's say openSUSE Tumbleweed latest, and then after that we install certain packages. That way we balance out all the different distributions that we have out there, so if there are some packages that are not there in openSUSE, we install them, or if it's in Ubuntu, we install them, so that they are kind of balanced out and we can do everything that we need to do. We also do a little bit of system configuration. Mainly it is about ensuring that certain init processes get started properly.
Then the result of that, we put into a container image. One thing I want to mention is that we are agnostic here: we can use any OCI-building engine. If you're using Docker, great; if you want to use Podman, whatever works for you. We start building that new image. Then we install the Kairos binaries. This can be, for example, the agent. The agent is used for installation, for upgrades, and other things. We then install Kairos packages. These are different from the distribution packages. These packages are specific to Kairos, and they are mainly for tooling. The really cool thing there is that they are completely agnostic, so that means that we can be building a Fedora image and be using packages from openSUSE and Ubuntu at the same time to do this. I personally find that really cool. Then after that, we do certain system configuration. This can be about how we're going to, for example, mount the different systems on the disk and stuff like that. The result of it is going to be a container image, an OCI image, that you can download, like I was saying, because you might want to do certain configuration on top of it. You might also use it for doing upgrades, or we pass it through something that we call the OS Builder, which will convert that OCI image into an image that you can actually boot. You burn it onto a USB stick and you can put it into hardware, or netboot it. Of course, all of these changes are prone to issues, to errors, to breaking things. To avoid having that kind of situation, we have a CI system that ensures that we can build every one of those distributions. If something fails there, we go and fix it before it can be released. Once every image has been built, we run certain acceptance tests: like, I don't know, we make sure that we can do an upgrade, we make sure that we can do an installation of Kairos, et cetera, et cetera.
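The first stage of the factory described above, taking a distribution image and balancing out its packages, can be sketched roughly as follows. This is only an illustration in Python, not the project's actual build code: the required tool list, the per-distribution "preinstalled" sets and the image tags are assumptions chosen for the example, while the package manager commands are the standard ones for each distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Distro:
    image: str               # upstream container image used in the FROM line
    install_cmd: str         # package manager invocation for this distribution
    preinstalled: frozenset  # tools assumed (hypothetically) to ship in the image


# Illustrative only: the tools every base image should end up with.
REQUIRED_TOOLS = ("curl", "rsync", "sudo")

DISTROS = {
    "opensuse": Distro(
        "opensuse/tumbleweed:latest", "zypper install -y", frozenset({"curl"})
    ),
    "ubuntu": Distro(
        "ubuntu:22.04", "apt-get update && apt-get install -y", frozenset({"sudo"})
    ),
    "alpine": Distro("alpine:3.17", "apk add --no-cache", frozenset()),
}


def base_dockerfile(name):
    """Render a Dockerfile that installs only the tools a distribution lacks."""
    distro = DISTROS[name]
    missing = [tool for tool in REQUIRED_TOOLS if tool not in distro.preinstalled]
    lines = [f"FROM {distro.image}"]
    if missing:
        lines.append(f"RUN {distro.install_cmd} {' '.join(missing)}")
    return "\n".join(lines) + "\n"


for distro_name in DISTROS:
    print(base_dockerfile(distro_name))
```

Keeping the differences in a small per-distribution table like this means the later stages (Kairos binaries, Kairos packages, system configuration) can assume the same tools are present regardless of the starting image, and a CI job can simply loop over the table to build every flavor.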
Putting it all together, that's how you can have a really nice, secure distribution running at the edge of your network, talking to your Kubernetes cluster. If you're interested in testing it out, please go to our website. There's a lot of documentation there. You can download the different releases and try them out. You can try it out on your Raspberry Pi. It's quite fun to do. You can talk to us via Matrix and GitHub Discussions, and we even have office hours: every Wednesday, 5:30 p.m. European time, you can chat with us if you want. That's all I have for you today. If you have any questions, please let me know. At the end, if you are interested in stickers, please come and grab one or more. |
CentOS Stream
RHEL development in public |
Alright. Hello everyone. Thanks for coming. My name is Adam. I work for Red Hat and I do CentOS Stream for my day job. I'm the CentOS Stream lead, and I want to talk to you about CentOS Stream and how we, Red Hat, use it to get work done on RHEL, and how you can use it to participate but also build your stuff, and we'll see. Okay, so just to set the context: before CentOS Stream we did something like this. When we created RHEL, Red Hat Enterprise Linux, I keep saying RHEL, we took what's in Fedora, that's where the innovation happens, and then we had a long process to build RHEL out of some of that, and when that got out, somewhere later, CentOS Linux happened. And yeah, that was interesting, but the problem was that when you find something in that rebuild, you can't really change much, because the goal was to be a rebuild of RHEL. So what we did with CentOS Stream, we kind of switched it a little bit. Now there are more steps in the process. We have Fedora still, there's nothing changing about Fedora, that's still the primary place where innovation happens, that's the upstream. There's something called Fedora ELN, which is like a subset of the content, rebuilt with RHEL configuration, and that's the next major RHEL release. So if you go out and look at Fedora ELN, that's what RHEL 10 sort of is right now. Then CentOS Stream happens. This is where the development of RHEL happens in public, and if you talk to Red Hatters, they will sort of combine these together, CentOS Stream and RHEL, because this is really our development space, and I'll have more details later. So this is tracking the next minor version of RHEL, and this is what you can use, but you can also contribute to it, and we'll get into details.
So this is basically, if you heard of Fedora ELN and CentOS Stream, this is what it is, and I have some more details here: this is Fedora, Fedora Rawhide sources, this has its own sources, and these are the build flags, and you can also see there's a different amount of packages. It's just more information there. Okay, so let's talk about how we get work done, or how you can get work done, in CentOS Stream, how that works. So there's this diagram, and we're not going to go box after box, no worries, but I just want to demonstrate that we have Bugzilla where work tracking happens, there's a merge request coming to GitLab, and then basically everything is synced. So the upper part is the CentOS Stream infrastructure, this is RHEL internal infrastructure, and as a change happens, it gets built in both, it gets tested in both, and when that passes the tests, both get released further into the process, and finally it gets into both RHEL and CentOS. There's something called the CentOS development compose and the production compose. A compose is sort of like a repo and ISO and container images, just like a snapshot that you can consume. And yeah, there's one that happens after the tests, that happens every day, and then the verification, this is like internal process, paperwork and stuff, and then that goes through to the production compose. And I can even show you how that happens in the system. So this is what a Bugzilla bug looks like. Someone was adding, this is like half a year ago, Multipath TCP to RHEL, so they created this, they did the merge request in GitLab, everything was visible publicly, and they submitted a build first in CentOS Stream. That got through, got built in RHEL as well. If I scroll down there are tags, those were the multiple steps: gate, pending, candidate. That's how you can know where it is in the process, and it basically got through that, and now if you're using CentOS Stream you already have it installed because
that's half a year ago for that change, but this is basically the flow, how it works. Let's talk about contributions now. For some context, starting with RHEL 8, Red Hat publicly said that we'll do minor releases of RHEL every six months and major releases of RHEL every three years, and this is what we've been sort of doing with 8 and 9. And something called ABI, I got that as a note for myself: with RHEL we make some promises to customers about ABI guarantees and support statements, etc., basically whatever you would expect from an enterprise OS, so we don't want to break things for customers within the major version, and this will influence what contributions we can take. So the easiest one is bug fixes. If you find a bug and you can fix it, feel free to do so. We'll be very happy to take it, test it, and if it doesn't break things for the customers, merge it, get it in, and that's the easiest way to contribute. You can also contribute stable updates from upstream, and by stable, that gets back to the promises. Basically, as RHEL ages, it gets further from the upstream, because we need to keep things sort of stable in the ABI way, so we still release updates in every single minor release, but most of them are backports. So again, you're welcome to contribute updates that are stable; we'll again fix them, test them, build them and get them in. And backported features here, this is what I mentioned basically already, I just have a slide for that. Okay, what we can't take is ABI non-compatible updates, and if you're wondering about details, there's a document called the RHEL Application Compatibility Guide. You can find it online on Red Hat's website and it'll explain exactly how it works, but most packages have the ABI stable for the entire 10 years of the RHEL life cycle, and we take it very seriously at Red Hat, because customers build applications and they want them to run forever without changing them, so we don't want to break this for them, so please don't submit things that would
break ABI; we would need to politely explain why not and reject it. That's what you can contribute to Fedora ELN, for example, and we'll get to that. Okay, I have a maybe for docs, typos, man pages. There's a thing for customers: if they go to the customer portal and they have a bug with documentation, they can report an issue and get it fixed that way. Otherwise we tend to batch them together so they land all at once, so maintainers can focus more on actual feature development, backporting and stuff. So these are welcome, but they might take longer to get in because of this. And this is a detailed image of the life cycle. If you want to get your change into a specific minor version of RHEL, we don't have a way in Bugzilla to really communicate it, but you can get in touch with the maintainer and you can sort of anticipate. By the way, the minor release, this is the dark blue; extended update support, this is the light blue; and then update services for SAP. So even when we're done with a minor release, we still might be supporting it for up to four years. And the arrows are where CentOS Stream work happens, so it tracks all of them, and changes just make it to the minor releases. And yeah, you can sort of anticipate where it gets, but there's no communication about where exactly, so if you really need to, you would need to talk to the maintainer. Yeah, we have this for 8, and there's also 7; there are a lot of things going on in the background. And if you want to contribute, let's talk about how. So you can open bugs in Bugzilla; you can test Stream, and if you find something, you can open bugs and hopefully get it into the next minor release. You can open merge requests in GitLab. Create a GitLab account, but first please make sure that you have a bug, so you start the conversation with the maintainer, so they know what's coming and they can also help you with the change. And then you can track the change. This is again from the diagram: we have these three tags in Koji which like
we use to track the process, and you can preview things in the development compose or the production compose, based on where it gets. You can get the composes on this URL, and there's slash production and slash development, but otherwise they go to the mirrors, so if you go to centos.org you will find CentOS Stream there. Okay, let's have a look at uses of CentOS Stream. Of course you can use it to preview RHEL, test features that are in development, see what's coming before it actually gets to RHEL. I think one of the interesting parts is that if you build something on top of RHEL, you can use CentOS Stream in your CI to preview how it would work on the future RHEL, so you can get ready for the next minor release. And one advantage compared to a rebuild is that if you find a bug in CentOS Stream, you can actually get it fixed and get it into RHEL proper, so this is what we're trying to do there. And this is actually one of the most interesting for me: we have special interest groups. There's the Hyperscale SIG, there's the Cloud SIG, there's the Kmods SIG, and they work in the CentOS Stream community and they build their own stuff on top of CentOS Stream, so they have a stable enterprise platform, but again, compared to a rebuild, they can actually influence what's happening. They can submit changes, unbreak things for themselves and get that into RHEL proper. I know the Hyperscale SIG are maintaining a bunch of stuff before they actually merge it, and there's really interesting stuff going on. You're welcome to come in and create your own SIG and use the community build system to build everything, and CentOS Stream is definitely there, that's the primary build target.
So that was mostly CentOS Stream. I have something about CentOS Stream 10 and RHEL 10 as well. Basically, we saw this diagram, and with RHEL 10 we're right here. So if you want to contribute towards RHEL 10, get it into Fedora Rawhide, which means getting it into Fedora ELN if it's within the RHEL package set. At this point you can change APIs, you can do whatever Fedora would normally do through Fedora Changes, so this is the most flexible time to contribute to CentOS Stream 10 and RHEL 10. Later we get to do Stream. This slide is from RHEL 9, but the process is the same: we have Rawhide and Fedora ELN. Imagine these like Git branches: there is Fedora Rawhide, the ELN rebuild follows it, we branch CentOS Stream from that, and then later we start doing RHEL. We call that phase bootstrap, and it will be happening somewhat later. That's how it happens, so you can get your changes into Fedora ELN right now, and that was a different... |
How to package BPF software for Linux distributions
…presented on Gentoo Linux |
With that, I will pass it over to our next speaker. I think we're right on time. How to package eBPF software, presented on Gentoo Linux. I will pass it over to you. Welcome. Thank you. So yeah, my name is Jakov, and this is the topic. So let's dive straight into it. What are we going to discuss in this talk? I'm going to briefly introduce you to eBPF technology, explain a little bit what it is, what we can do with it, a little bit about its history, and what development tools you can use to develop your own eBPF programs. After that, we're going to focus on the packaging side of things. We're going to talk a little bit about packaging on Gentoo Linux, and then about some challenges and problems we have faced packaging software in general, and BPF software in particular, on Gentoo, and how we can fix or overcome these issues. So, a little bit about me. I work for a company called Sartura. We are based in Zagreb, Croatia, and most of our work revolves around embedded Linux. We are focused on the network edge. We work a lot with network switches, CPEs, and things like this. We heavily use operating systems and build systems that are tailored for embedded systems, such as Buildroot, OpenWrt, and Yocto. But we've also been using Gentoo for some of our work that has to do with embedded devices. We're also passionate about open source projects. We love to contribute to open source and give something back to the community. As for myself, I've had the privilege of being a Gentoo developer since 2021. So I'm going to talk about some experiences. At Sartura we've been using BPF programs in some of our projects for our network devices. But I'm also going to share some things I've learned during my time as a Gentoo developer, and just how to package software in general. OK, so eBPF, short for Extended Berkeley Packet Filter. What is it? Essentially, it's a Linux subsystem that can run programs in a virtualized environment.
So it actually allows you to extend the functionality of your kernel without the need to change your kernel source code, recompile it, redeploy it, and repeat this whole complicated procedure multiple times. Essentially, it allows you to write your own program, and then your BPF program can attach to the kernel. And then you can extract or analyze different information based on various Linux syscalls. So for example, you can print information any time a user opens a certain file on their computer. Nowadays, when we hear the term BPF, most of the time it's referring to extended BPF, or eBPF. But before that we had what has since been named classic BPF, which is a simple in-kernel virtual machine designed to handle network packet filtering. It started back in the 1990s, and that's how it got its name, packet filter, because that's what it was used for. Extended BPF was implemented on top of this standard BPF, let's say. I believe the first kernel to have support for eBPF was 3.18, if I'm not mistaken, which was released around 2014 to 2015. So it's been less than 10 years, and in this short period this technology really grew into a huge ecosystem. Nowadays, it's a general event processing framework, and there are numerous tools, videos, and books available about what you can do with BPF. We're not going to go into too many details, because it's quite a long topic. But nowadays, it's used for observability, networking, security, application tracing, and things like this. If you're interested in learning more about this concept, or just in general about performance and observability, I highly recommend that you check out Brendan Gregg's content. He wrote a book, and there are many videos of him available online in which he speaks about performance, observability, BPF programs, and things like this. So definitely check his content out if you want to learn more.
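To make the classic BPF virtual machine mentioned above a bit more concrete, here is a minimal sketch in Python of a toy interpreter for three classic BPF instructions. It is not how the kernel implements BPF and it is not eBPF. The opcode values are the classic BPF encodings from the kernel headers, and the four-instruction program is the kind of filter tcpdump typically generates for the expression "ip"; the function names run_filter and ethernet_frame are made up for this illustration.

```python
import struct

# Classic BPF opcode encodings (as defined in the Linux UAPI headers).
LDH_ABS = 0x28  # BPF_LD | BPF_H | BPF_ABS: load 16 bits at absolute offset k
JEQ_K = 0x15    # BPF_JMP | BPF_JEQ | BPF_K: compare accumulator with constant k
RET_K = 0x06    # BPF_RET | BPF_K: return k (bytes to keep, 0 means drop)

# Each instruction is (opcode, jump_if_true, jump_if_false, k).
IPV4_FILTER = [
    (LDH_ABS, 0, 0, 12),     # accumulator = EtherType field of the frame
    (JEQ_K, 0, 1, 0x0800),   # IPv4? continue : skip the next instruction
    (RET_K, 0, 0, 262144),   # accept, keep up to 262144 bytes
    (RET_K, 0, 0, 0),        # reject
]


def run_filter(program, packet):
    """Interpret a tiny subset of classic BPF against raw packet bytes."""
    accumulator = 0
    pc = 0
    while pc < len(program):
        code, jump_true, jump_false, k = program[pc]
        pc += 1
        if code == LDH_ABS:
            if k + 2 > len(packet):
                return 0  # an out-of-bounds load rejects the packet
            (accumulator,) = struct.unpack_from("!H", packet, k)
        elif code == JEQ_K:
            # Jump offsets are relative to the next instruction.
            pc += jump_true if accumulator == k else jump_false
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0  # assumption of this sketch: falling off the end rejects


def ethernet_frame(ethertype):
    """Hypothetical helper: 12 bytes of MAC addresses, EtherType, dummy payload."""
    return bytes(12) + struct.pack("!H", ethertype) + bytes(20)


print(run_filter(IPV4_FILTER, ethernet_frame(0x0800)))  # 262144 (IPv4 accepted)
print(run_filter(IPV4_FILTER, ethernet_frame(0x86DD)))  # 0 (IPv6 rejected)
```

The point of the sketch is only the shape of the idea: a small, restricted instruction set that the kernel can run safely against each packet. eBPF extends that idea with more registers, maps, helper calls, and many more attach points.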
So now we can take a look at some of the tools that were developed to allow us to write BPF programs more easily. So how do BPF programs actually work? Before they can be loaded into the kernel, they need to be compiled into bytecode. You can actually write this bytecode directly, but it's going to be a very tedious process, not really suitable for development. So there have been a lot of different tools and toolkits implemented in various high-level languages that allow you to do these things more easily. I've just mentioned a few projects here, but if you go to ebpf.io, there are many more applications listed, depending on what you want to use or what language. For example, a popular one was BCC. It's a toolkit for writing BPF programs using higher-level languages, such as Python and Lua, and it uses an LLVM compiler backend. There's also bpftrace, which is a high-level tracing language. Then we have libbpf, which is a user space library for loading and interacting with BPF programs. This library actually lives in the kernel source tree. It grew very popular in recent years, along with a concept of BPF development called BPF CO-RE, compile once, run everywhere. It's a concept that allows you to develop programs even more easily compared to, let's say, BCC, which uses LLVM and Clang as a compiler, which introduces heavier dependencies or certain requirements that maybe you cannot meet if you're deploying something in an embedded environment. There's also, for example, the eBPF Go library, ply, which is a lightweight dynamic tracing tool, and many other applications. Most of these packages are available in Gentoo. We have BCC and bpftrace. I believe we don't have the eBPF Go library, but all of the other ones are present as packages. So if you're using Gentoo, you can just simply install the latest version available. Okay. So now we've talked a little bit about BPF, what it is, and what kind of tools we have.
Now let's focus on the packaging side of things. So Gentoo, I'm sure you've heard about it. It's a source-based distribution. This fact is what distinguishes Gentoo from other distributions. Usually, if you're using, let's say, Debian, Fedora, or Ubuntu and you want to install a package, your package manager is going to download a pre-compiled package from a repository, extract it, and install it into your file system. Well, Gentoo does things differently. Most of the packages are actually compiled from source. So the package manager has to do things such as fetching the source, then it has to unpack the source, configure, compile, install, configure after installation, and things like this. The main component of Gentoo is its package manager, called Portage. Portage is the component that allows you to have this highly flexible and customizable system. It allows you to have great control over your system. When you download Gentoo, usually you have a minimal archive which contains just a basic set of tools like compilers and libraries, the stuff that is required to actually build programs. So you just download this minimal set of files and then you build your own system, let's say, tailored to your own needs. Now, if we talk about packaging in Gentoo itself, I've written down some terms to help us understand how this is actually done. So we have the ebuild. An ebuild is a package file; it's simply a text file that contains build instructions, written in a Bash-like syntax. This ebuild gives Portage all of the instructions: how it's going to download the source, how to unpack it, how to configure it based on what build system the project is using, how to install the package, whether you need to do some additional things, and it specifies the dependencies that are needed as well. Then we have something called an eclass.
So an eclass, we can think of it like a library. It's common code used by different ebuilds, and it's there to avoid code duplication. The easiest way to explain this is that you have different build systems, such as the Autotools build system, CMake, or the Meson build system, and each of these build systems has its own eclass. Basically, it allows you to avoid having to write the same procedure for configuring, building, and installing the package over and over. We also have something called a USE flag, which we can think of as a feature flag or configuration flag. When you're compiling your package, building it from source, it allows you to selectively turn certain features on or off. Let's say you're building Gentoo for a headless system, for a server or something. You're probably not going to need support for a graphical interface. So you can look up which USE flag controls support for the graphical interface and simply turn it off in your configuration. Then the package manager is not going to pull in any of these dependencies, Wayland, X11, or whatever can be used, and all of your packages are going to be built with graphics support disabled. So when packaging things, these are probably the things we work with most of the time. Now we can look at a short example. I've put a link here to our GitHub repository. Essentially, it's a small application developed just to demonstrate how to write an eBPF program and how to package it. There are also links to two blog posts here. The first one talks about the BPF aspect of things, how the program was actually developed. And the second one talks about packaging it, mostly from the aspect of cross compilation.
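The effect of USE flags on dependencies described above can be sketched as a toy model in Python. This is not Portage's resolver; the function name resolve_dependencies and the particular mapping of flags to packages are assumptions made up for illustration.

```python
def resolve_dependencies(base_deps, optional_deps, enabled_flags):
    """Return the dependencies to pull in for a given set of enabled USE flags.

    base_deps: dependencies that are always needed.
    optional_deps: mapping of USE flag name to the dependencies it guards.
    enabled_flags: the set of USE flags the user has turned on.
    """
    deps = list(base_deps)
    for flag, flag_deps in optional_deps.items():
        if flag in enabled_flags:
            deps.extend(flag_deps)
    return deps


# Hypothetical package: one unconditional dependency and two optional ones.
BASE = ["sys-libs/zlib"]
OPTIONAL = {
    "X": ["x11-libs/libX11"],
    "wayland": ["dev-libs/wayland"],
}

# Headless server: no graphical flags enabled, so nothing graphical is pulled in.
print(resolve_dependencies(BASE, OPTIONAL, set()))
# Desktop with Wayland only.
print(resolve_dependencies(BASE, OPTIONAL, {"wayland"}))
```

In a real ebuild the same idea is written declaratively as conditional dependency groups, and the package manager evaluates them against the user's configuration.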
We're not going to focus on cross compilation, but it still gives a good overview of how packages are built and designed for Gentoo Linux. So this is the tree of our project. It's a pretty standard CMake project directory. We have some CMake-specific files which CMake uses to determine which libraries and which dependencies you depend on. Then we have a CMakeLists.txt which contains instructions for building and installing the package. Obviously we've got the source code. We've got some include headers. These headers differ based on what architecture you are building your program for, so they contain kernel definitions specific to each architecture. I mentioned BCC a few slides ago. If we had written this application using BCC, we wouldn't have these include files, because when you develop a program using BCC, you have to have all the kernel headers present on your system, since a lot of the time you don't know which headers you're going to need. That's because BCC actually does on-the-fly compilation, so the program is compiled at runtime. So you need both the kernel headers and Clang and LLVM present at runtime, which can present a serious challenge when you want to develop something for a small embedded system, or you just don't have the necessary processing power to do all these things. So now we can have a look at a simple one. Well, this is what an ebuild looks like. At the beginning, we have some header information, copyright, and so on. Then on line four, we have something called EAPI. It's just a variable that tells your package manager how to parse the rest of your file. On line six we have these eclasses, and their implementations can actually vary depending on which EAPI you are using.
So it's necessary for us to specify the EAPI so that the package manager knows which functions it's going to use and what they look like for each EAPI. Then we need to inherit some eclasses that are going to allow us to package our program more easily. So for example, obviously we're going to use the CMake eclass. We're also going to use the Git eclass, because we're going to build our package from a Git repository. We've got the linux-info eclass, which gives us access to some of the things related to the kernel, like checking which configuration options are present in your kernel configuration and so on. After that we've got some metadata, the description, the homepage, things like this, and then we specify in which repository the package manager has to look for your source. And we've got the license. The slot is not really important for us in this context. Keywords are just a way to specify for which architectures your package is going to build. There's actually a mistake in this package. We usually call packages that are built from Git live packages. When we have a live package, we never specify the keywords, because when we build packages from Git, their source can change at any time, so it's not really consistent. That's why we cannot guarantee they're going to build at all times. If a package has no keywords, it means it's an unstable package, so you can expect it to not work sometimes. But for the purpose of this demonstration, I've put them in there. Then we specify some USE flags, which are what I've talked about, some configuration switches we will use later on. Sometimes when you're building a package, for example, maybe you don't want your binaries to be stripped.
So you can also tell your package manager, okay, I don't want my binaries to be stripped, please don't do that. You just tell it that you want to restrict stripping binaries, so all of your binaries are going to be installed with the debugging info included in them. Now we specify some dependencies here. We have a few more types of dependencies, but these are the three main ones that we use. There are runtime dependencies, and then we have build-time dependencies, which can be split into two categories, DEPEND and BDEPEND. So why are they split? Well, it's usually because of cross compilation. For example, when you're cross compiling a package from, let's say, amd64 to ARM, you're going to need some packages, like headers or libraries, present for the target system that you are compiling for. These packages belong in the DEPEND group: packages that need to be present at build time, but for the target system. BDEPEND, on the other hand, specifies dependencies that the package manager has to run while the package is being built. So obviously, if you're cross compiling for ARM, you have to have them available on the laptop from which you are compiling. Normally, when you're building a package on your laptop for the same architecture, this does not make that much difference. But when you're cross compiling packages, it does make a difference, so that's why we have to specify this clearly. Then, BPF programs usually require certain kernel configuration options to be included in your kernel for them to work correctly. So we can also say, okay, I need these options, can you please check if these options are present while the package is being built? And if they're not present, your package manager is going to print you a warning after it's been built.
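The kernel option check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the linux-info eclass itself; the function name missing_kernel_options and the sample configuration are assumptions, although the option names used are real kernel options.

```python
def missing_kernel_options(config_text, required):
    """Return the required options that are not enabled in a kernel .config.

    An option counts as enabled if the config contains CONFIG_<NAME>=y
    (built in) or CONFIG_<NAME>=m (built as a module).
    """
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if name.startswith("CONFIG_") and value in ("y", "m"):
            enabled.add(name[len("CONFIG_"):])
    return [option for option in required if option not in enabled]


# Hypothetical excerpt of a kernel configuration file.
SAMPLE_CONFIG = """\
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# CONFIG_DEBUG_INFO_BTF is not set
CONFIG_NET_CLS_BPF=m
"""

for option in missing_kernel_options(
    SAMPLE_CONFIG, ["BPF", "BPF_SYSCALL", "DEBUG_INFO_BTF"]
):
    print(f"warning: CONFIG_{option} is not enabled in the kernel configuration")
```

A warning rather than a hard failure fits the situation in the talk: the package can still be built, but the BPF program may not work until the kernel is reconfigured.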
The warning says that you need these options to be included, otherwise your program may not work correctly. Then we have to configure the package, so we specify some CMake arguments. This is where our USE flags come into play. Basically, this is Bash syntax, so the second line just depends on whether the USE flag is set: if you have the vmlinux USE flag turned on, the argument is going to be ON. After that we have the installation part. We use the CMake eclass implementation, which is why we write cmake_src_install, and then we do some other things depending, again, on another USE flag. Now let's discuss some challenges that we face when packaging software, not only BPF software but software in general. Gentoo tries to support as many different build configurations as possible, whether you want to use a different compiler, a different linker, a different libc for your system, and so on. Sometimes it takes quite a bit of work. Obviously some things are not going to build with all configurations, but we try to provide as much support as possible for you as a user to be able to build your own system. Then we have to ensure compatibility with the latest toolchain, which is important because we're constantly building from source and we try to have the latest and greatest compilers and libraries available. I've mentioned two examples here. Usually, whenever there's a big release of a compiler or something like glibc, we have many build failures across our tree. What we have to do then is analyze and collect the packages that do not build. Sometimes there are hundreds of packages that do not build.
Then we have to go and try to patch each package and send the patches upstream, if there is an upstream and it's still active, which can be quite a challenging task. But in general, we have our toolchain team, which does a great job of staying on top of the latest development efforts. It's also always good to have this cross-distribution collaboration. That's why it's important to send patches upstream, because then someone from Debian or from Fedora can use our patch, or vice versa. There can also be cross compilation issues with packages, and heavy build-time and runtime dependencies for some of these BPF packages, so they may not be suitable for embedded systems. How do we deal with this? For some of the things that I mentioned, it's just not possible to avoid them. Whether we like it or not, if there's a new compiler, like a new GCC release, things are going to break. But that's why it's important for us as a distribution to proactively test packages and to write and submit patches upstream. I mentioned that already, but it's very important. I remember numerous times where I was trying to fix a build failure for some package, and I went and looked whether Debian or Fedora had a patch, and wow, they had a patch, and it's so nice and convenient to have a patch already written for you. So that's why it's important, and if I write a patch, or if we write a patch for something, we also try to submit it and give it back to the community. Yeah, that's pretty much it from me on this topic. We have a few minutes for questions, if you have some. Okay. So once you have a package on the system, how does that get loaded into the kernel? If the package is installed on your Gentoo system, is it loaded automatically?
Or can you have a BPF package installed that is not put into the kernel? No, usually you'll have a binary or a script that you can use to load your program. So do you package those loader scripts as a separate thing, or how does that part work? Yeah, we can just package them, we can just install them additionally. Usually, if upstream intends them to be installed, the build system is going to install them, so there's not going to be any need to install them separately. But if there's a configuration script or something that we need with our package, obviously it's possible to just use simple helper functions and install the different files that you need. So does the file being in a certain place in the tree make it get loaded into the kernel at that time? Sorry, I don't know how Gentoo does things at boot. Ah, well, with BPF it doesn't have to be that way; it can just be a binary program available in some random location, it doesn't have to be in a certain directory. Okay, sorry, maybe it's a super dumb question, but say I've got a binary sitting on my file system, a traditional ELF binary sitting there, right? Just because it's there, it doesn't run; you've got to have a systemd service or something. Well, yes, that depends on what you're packaging actually. You can have either a systemd service or you can... So you would package a systemd service? Yeah, you would also package a systemd service. To insert it, is that the standard way to do it, or is there a...?
Well, mostly, yeah. If it's something that's intended to be run periodically or continuously, yes, you're going to have a systemd service, which you can also install with your package. Okay. You have one minute left, any last questions? Last one. BPF programs are changing the functionality of the kernel. How do you make sure you are not distributing security issues? How do you solve that? Do you just use what's already in Gentoo? Yeah, good question. BPF actually has its own verifier. So before a program is loaded, it has to be verified, so that we know we are running a safe program. So this is how we handle it: BPF itself handles this by using its verifier before the program is loaded into the kernel. Nevertheless, a program can be functional and working, but still change the system in some way which is not intended. So you just rely on the Gentoo distribution to make sure the user is actually downloading what you have uploaded? I'm sorry, can you repeat that? When the package with the BPF program comes to the user, how do they know that you actually wrote it and that it was not tampered with on its way? How do they know that we are packaging the right thing? Yeah, so you're just using the normal Gentoo distribution mechanisms. Yeah, well, usually when you package software in Gentoo, you package it from a release tarball, and then you compare the hashes. The Gentoo package manager actually compares the hashes of what you are downloading with what is recorded in the source repository. So that's how. Nothing special, just Gentoo. How do you port this to other distributions? Well, that just depends. I mean, the concept is pretty much the same; it just depends on what tooling you're going to use, whether you're going to package for Debian, Fedora, or Arch Linux. Thank you. Thank you. Thank you. |
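The hash comparison mentioned in the last answer can be sketched as follows. This is a minimal illustration in Python, assuming a Manifest-style record that stores the expected file size together with BLAKE2B and SHA512 digests; the function names digests_for and verify_distfile are made up for the example and are not Portage's API.

```python
import hashlib


def digests_for(data):
    """Compute the digests a packager would record for a release tarball."""
    return {
        "blake2b": hashlib.blake2b(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
    }


def verify_distfile(data, expected_size, expected_digests):
    """Check downloaded bytes against the recorded size and digests.

    Returns True only if the size and every recorded digest match.
    """
    if len(data) != expected_size:
        return False
    return all(
        hashlib.new(name, data).hexdigest() == digest
        for name, digest in expected_digests.items()
    )


# Packager side: record size and digests of the upstream release tarball.
tarball = b"pretend these bytes are a release tarball"
recorded_size = len(tarball)
recorded_digests = digests_for(tarball)

# User side: an untouched download passes, a modified one does not.
print(verify_distfile(tarball, recorded_size, recorded_digests))
print(verify_distfile(tarball + b"!", recorded_size, recorded_digests))
```

Note that this only shows that the downloaded file matches what the packager recorded. It says nothing about whether the upstream code itself is trustworthy, which is the part the questioner was also asking about.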
From Linux to Cloud to Edge and beyond: Evolution of women contributors in distros & FOSS
A timeline from past, present, and future |
Hello, hi everyone. Thanks for joining us today. My name is Amita Sharma and I have Justin Flory with me. Today we will be speaking about the evolution of women's contributions in distros and FOSS, and we will try to map it to the changes in technology from on-prem to cloud to edge. So this will be our agenda today. We will give a brief introduction about ourselves. We will try to look at the past, the last 10 years or more, and where women stood in technology and in the open source world. What does the present look like for women in the open source world and in the tech industry? And how will the future be? What are the factors which had an impact in the past? What are the factors which are influencing things today? And what can we do to make tomorrow better for women? So with that, I am Amita Sharma. I am from Pune, India, and I work at Red Hat. I have a little more than 17 years of experience, 12 of them at Red Hat. I started contributing to Fedora 8 years ago. I started with Fedora quality engineering and testing, and then I started contributing to the Fedora diversity team. I also served as a diversity advisor while contributing to Fedora. But in the last few years, I faced some personal issues, and that's why I had to step back a little bit. Now I am coming back again with the help of my friends, and Justin is one of them. That was a not so quick introduction about me. I would ask Justin to introduce himself too. Hi, my name is Justin. In addition to being one of your devroom co-wranglers today, I have been in the Fedora community for about 8 years as well. Currently I am at Red Hat as the Fedora Community Architect, a role I joined pretty recently, in October. So a long time in Fedora, new to Red Hat.
But in addition to all these things I am doing in my day job and the things I did in Fedora in the past, I also worked together with Amita, and I think even a couple of folks who might be in this room now too, to co-found what was at the time our diversity team, now the diversity, equity and inclusion team, in 2015. And we've had a very interesting journey over the last 8 years as we've tried to think about how we can make our community more inclusive and welcoming for people across all kinds of different backgrounds. So that's a little bit about us. But before we get into the past, present, and future, here are some questions to help frame our conversation for today. Yeah. So as we go into the presentation, I would ask all of you to think about these questions and their answers. I'm sure the answers are not a binary yes or no, but they can prompt you to think a little bit about women's contributions in open source and in the tech industry overall. So, with the hope that you will leave this talk with some deeper insight into how you might answer these questions in your own communities, I would like to go through them. Technology is advancing rapidly. We all know how it is moving: the data is coming out of the data centers, from on-prem to cloud, and making its way to the edge and even to autonomous technology. Do we think that women are also advancing in their careers, in the tech industry, in open source, and in distros as rapidly as this technology is advancing? Over time, what are the critical factors that have influenced or impacted women's growth overall? What should we stop, start, and continue doing to influence women's growth and women's contributions in open source? How can we together build a cohesive environment for women to contribute, to support, and to grow, so that they can also reach their sky, when even data has come out of the data centers?
So with that thought, let's look at the past. When I say the past, I'm talking about the last two decades, about 20 years: how women contributed and where they stood in the open source communities and in the tech industry. What were the challenges which impacted their contributions and their ratio compared to other genders in these communities? Maybe we can ask ourselves this question: when you think about tiffin making, or doing the laundry, or taking care of the kids at home, who comes to mind? A mother, a female figure, right? So by default, it is a socially conditioned, widely accepted thought that the woman is the primary caretaker at home and not the man. Contributing to open source communities or to distros is something we do outside of our job or alongside our job. This is where men can sink in their time more easily, because they are not the ones giving birth, they are not the primary caretaker at home, and they don't have those primary responsibilities at home. So with that social obligation of taking care of everything at home while also doing their job, it was a bit difficult during that era to expect women to also contribute to the open source world. Balancing all of these duties was difficult at that point in time. Another factor which influenced women's contributions in the open source world is moving away from home. I'm not too sure about other countries, but in India, where I come from, and in most of the APAC countries, it's not so easy for women to step out of the home. I still remember my father telling me that I would only do my engineering if I got admission to a college which was near home. If you can commute daily to that college, then only you are going to join and do your engineering. Otherwise you can take any other random course, or a clothes stitching course, and still survive.
But we will not send you away from home. I don't blame him, and this was probably the case for most girls at that point in time. I'm talking about 20 years back, and I don't blame him, because the conditions and the situation were like that. If you look at the rates of crime against women in India, the rapes, the human trafficking, prostitution and whatnot, they were very, very high. I'm not saying that they don't exist now. They are still there, but awareness, the advancement of technologies like mobile phones, and education really helped to overcome those insecurities, and now parents are much more confident about sending their daughters away. I'm standing here, outside my country, so you can say that things are now much, much better. But back at that time it was really difficult. So if women could not step out of the home, their opportunities to get a better education and to take up the career and the job they would like were really very limited at that time. Another factor which influenced the number, or the ratio, of women in open source was biased growth opportunities, and I'm not the one saying it; the numbers everywhere say that. If you look at the numbers for leadership positions, the CEOs, the CTOs, and even the staff or the council members, up to 99% of them are male and not female. That's because this is sort of a boys' club: they hire more and more boys and men rather than women, and this all starts from the beginning. If you look at the schools, the technical colleges, the engineering colleges around you, women are fewer in number there as well. That's why there are fewer of them in the technical industry, and fewer in the open source communities and the distros as well. So that is why the numbers are uneven.
So that was about the past and the reasons why women were not seen that much around these open source communities and distros. Maybe now is the time to look at what the picture looks like at present, and for that I would ask Justin to help us. So we'll take a quick look, especially in the distro space, at where we are today. There are a number of things up here that you can also dig into, but I'd like to highlight a couple of them. One is the emergence of DEI communities in the distro space. Like I mentioned earlier, the Fedora diversity team, now the DEI team, launched in 2015, and one of the biggest things that we were working on at the time, pre-COVID, was our event guidelines, looking at things like our annual conference and the local release parties that we would do in countries all over the world. These were things that we could do to help support our community, to open up and try to identify gaps that might make it harder for people who are coming from different places to contribute and be a part of our community. Some specific examples as well: there's Arch Linux Women, a community that has been around for, I think, probably eight or ten years as well. There's Ubuntu Women, which has also had periods of more active growth at times. And I'd like to say there are two kinds of communities that have emerged here. There are more social communities, where people can connect, share experiences, and identify with other people who are like themselves, as well as more functional teams like the Fedora DEI team, where we're working with project leadership and providing advice and insight to our project on how we can take these steps.
Another one I think is very relevant in recent times is the flexible working style. Especially after COVID, this whole shift to remote and hybrid work has opened up more opportunities for people to work in the way that works best for their lifestyle. Whether that means spending some time at home, commuting into an office, or being fully remote, there are more opportunities now to choose a working style that's flexible and fits different kinds of lifestyles, rather than this traditional structure where you're always in an office nine to five, Monday to Friday. Additionally, one that I think is also very important now is support and mentoring, which has been really critical for bringing more people into the fold, really emphasizing this person-to-person connection. I mean, even for me, I can think of people who have helped mentor me and guide me in the distro community space, and I think that's really important for everyone to have: someone who they can relate to, who gives them not just visibility but advice on how they can grow and support their career or contributions in a community like a Linux distribution. One specific data point for this is the Outreachy internship program, which many of our distributions have participated in for quite some time, even going back to when it was the GNOME Outreach Program for Women. Outreachy is celebrating over a thousand internships this year, which is really exciting, but I think it really underscores this whole thing that mentorship is a really important part of how we've gotten to where we are today. But with that, I want to also look ahead into the future a little bit, and Amita and I will split this one up a little bit to think about how we can continue breaking down these barriers going forward into the future.
But first, before we can really look ahead to the future, I think we also have to acknowledge the fact that there's still a long way to go. There are still a lot of the same challenges that we were working with 10 or 20 years ago that are still present, either in the same way or maybe in a different form today. So we're still far away from being this totally ideal place of equality and equity, and it will help change the overall mindset if we can at least acknowledge the problem and show that we can start to take steps to resolve it. One thing I'll mention here as a data point: we do a lot in Fedora, and I feel really proud of the work that we've done in our DEI team, but if I look at our Fedora Council, including the role that I just joined, the three roles that are part of the top level of the Fedora leadership are all held by white men from North America, which I think does underscore the point that there's still room to grow in these leadership opportunities. Although I will also say our Fedora Council better reflects some of that diversity, we're not in a place where even we have solved all these things. We're also trying to figure out ways that we can better support people to get into leadership roles, and I think this is true whether it's in a leadership capacity or in many other ways. Everyone is at different points in this journey, so what that might look like for one community might be different for another, but we need to at least acknowledge that there's still some work to do.
Another one is coming back to this whole thing that many of the distro communities have started to build teams or initiatives around diversity, equity and inclusion. We really need to support that kind of work in our distro communities, so we need to invest more time, energy and resources into this kind of work, and additionally come up with ways to better measure our efforts so we understand what's working and what isn't, because we want to make sure that we're investing our time wisely on this too. And again, the examples I mentioned were Arch Linux Women and the Fedora DEI team, but even outside of the distro space there are lots of communities like this as well. I know the Drupal diversity team in the content management system world does a lot around their community and has had a big impact in Drupal. Additionally, we really need to emphasize this idea of really listening and trying out new ways and ideas of working. We need to try things that might not have been done before, or maybe were even done before and failed, but sometimes we also have to go back to some of these things and try again or take a different approach. Things won't change if we don't try new things, take new approaches, and listen closely to feedback from our community, really emphasizing this active listening piece. And the next part is also about diversifying our recruitment efforts. I think this kind of ties into leadership as well, but it means trying to make sure that we're growing our community in places that need more visibility and light. I think the Outreachy case is a really great one, because that's a great opportunity for projects to advertise these opportunities to work in even some very key parts of a project community. In Fedora Linux we've had folks working in areas from the Linux kernel to the metrics and data parts of the project, where they're getting more
insight into what's happening in our project and sharing those things back with the community to help us understand, whether it's a DEI focus or more broadly just around packaging and things that we're doing in the project. It's important that we're taking those steps to diversify who we're bringing into the project and making sure that we're taking full account of where we're spending our time and energy. And for the last bit I will pass it back to Amita to finish us out. Thank you so much, Justin. One of the very critical points here is that even if we are making a lot of efforts, which Justin has specified, in different communities, be it for the Fedora Pache Edda women and a lot of them right around us, and we have successfully brought the women in already and have diversified the group now, this is the time that we really give equality and equal respect, equal pay grades and equal rewards to these women. I know I cannot talk about some of the examples which I have seen myself, but even if the women and men are in the same position, in the same department, in the same organization, the pay gap is huge, still huge. And even the numbers say that: there is an article from the Australian Government which mentioned that women reach only 17.1% of CEO positions, 25.8% of company board memberships, and only 30% of the key management positions. So you still see there is a huge gap and a long way to go to overcome these gaps in salaries and in leadership positions, and the impact is also huge. We think that, okay, this is the current situation. No: over a period of time, if you look at the compounded salary, by the time a woman retires she has much less money than a man has. So if a woman loses her spouse or sees any unfortunate events in her life, she has to face poverty, even though she is working much harder at home and at the office and balancing both well. So with that, it's very critical to have the
equal pay and recognition. We need to encourage women to take leadership roles as well. We all have been saying how important it is, but one thing which I would like to share from my perspective and my personal journey as well: women try to reach that level, but then, because of some personal issues, or a birth or a death or whatever, they need to take a step back. And then when they come back, it is hard for them to walk those steps. I have seen that between when I was there in Fedora and now that I have rejoined: we have moved so fast, from IRC to Telegram to even now Signal, and from Pagure to GitLab, and everything has changed so fast. It is so overwhelming to catch up with all of these things. It's necessary to give a helping hand, and mentors and supporters, to these women who would like to come back to these communities and make an impact, so that is very necessary. And also, don't look at the speed of the contribution, because where they are coming from, they are trying to balance a lot of things: at home, at the office, in open source contribution. So don't look at the speed; don't try to measure their contribution by how many tickets they have done so far. They are still learning and trying to overcome that gap, which is huge and overwhelming, so be respectful of that. Last but not least, the mindset change is very necessary for everyone, because we all know that if we support inclusivity we can better innovate. And the mindset change is also necessary for the women as well. I'm standing here talking with all of you about all of these things, but at the back of my mind somewhere I'm thinking about my eight-year-old girl: did she have enough of a meal, is her father able to make her braids, because that is a necessary rule for the school. So all of those guilt trips: we women always have those guilt trips in the back of our minds because we think we are the primary caregiver. So we also need to change that mindset. If we are doing
something for ourselves to make our career better, please don't go on those guilt trips. Okay, so that is for us as well. With that, a very important question for... I don't know why the slide is not showing up. It's on the screen here. Let's try. I'll try playing an interview of someone who contributed, a quick one-minute one. We'll try to do this with laptop speakers and see how it goes. Where's this? It's okay, it's fine, we can skip. We figured we'd improvise this one a little bit, but basically one of the things that she's emphasizing in her interview is some of these recent changes, that flexibility, that for people who are also prioritizing family in their life there's more of an awareness that sometimes family comes first. It's not always about 50-hour, 60-hour work weeks. I think, going to that whole changing-mindset piece, there's more of an acceptance or more of an understanding for that now than there was 20 years ago. And we'll upload the slides to the FOSDEM website, so if you want to see the video, you can go, probably in the next day or two, onto the FOSDEM website and find and play the video interview too. And with that I would like to ask Justin what he thinks is the most important thing for the women, because now you are part of the Fedora Council and serving a very critical role. I think I kind of said one of my points: I see my role in the Fedora community, and I think one thing we can continue to emphasize and push for in the Fedora Council, which is our top-level group, is trying to make these leadership opportunities more accessible and also clear for people, so there's not an ambiguity or a feeling that it's so far away from people, but trying to maybe bring it a little bit closer and make those opportunities to contribute really powerful. And I have to acknowledge that, and Justin is doing that job very well, and he has helped me a lot to come
back. With that, I would like to open the floor for all of you, if you have any questions. I think we are at time, but if you would like to come and talk with us, we'll be around the room and in the hallway. Thank you. Thanks, everybody. |
Fixing Year 2038
Coordinating the 64-bit time_t ABI migration |
I'm Wookie, and I have come to talk to you about the year 2038 problem. I would quite like to get feedback and not talk all the time, which means I have slightly too much information to give you in the time available. I will try and get this about right. So just to make sure people understand what the problem is: time_t on Unix is a 32-bit signed int, which started in 1970 and counts in seconds, so it rolls over in January 2038, at which point bad things will probably happen. That is now less than 15 years away, which isn't very long when you consider the sorts of systems that will still be running 32-bit code in that time. There are a lot of other things which will also go wrong at various dates, but I'm not talking about those today. Now, I don't claim to be a particular expert in this area, but I have been taking a look at it for the last few months, because Debian needs to do something, and we should work out what it is. So I'll just try to relatively quickly cover the problem and the issues as far as we know them. I was hoping to have done a few more tests before I got here, so I could be a little bit more explicit about things that definitely will or will not break, but we are where we are, so let's see what people think and also discover from you stuff that you know will break, because I'd like to make a list. So obviously, most people already use 64-bit computers, which have 64-bit time_t, and this problem doesn't arise for, I don't know, many centuries. But there are still areas where cheap really matters, and those people still use 32-bit: cars, TVs, embedded controllers for buildings and heat pumps and plant and all that stuff, and also cheap phones. Talking to people in the business, 32-bit Android is still a thing and will remain a thing for quite a long time. And the problem is that the stuff that people are still doing that with tends to get installed and left for a long time, so 15 years already isn't very long. We're a bit late.
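The rollover described above can be checked with a few lines of arithmetic. This is a minimal sketch rather than anything shown in the talk: the helper names `to_utc` and `wrap_int32` are made up for illustration, and the wrap is simulated by reinterpreting the low 32 bits as a signed integer, which is what a two's-complement 32-bit counter does in practice.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold


def to_utc(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(seconds):
    """Keep only the low 32 bits and reinterpret them as a signed int."""
    return struct.unpack("<i", struct.pack("<I", seconds & 0xFFFFFFFF))[0]


last_good = to_utc(INT32_MAX)                   # last representable second
after_wrap = to_utc(wrap_int32(INT32_MAX + 1))  # one second later, wrapped

print(last_good.isoformat())   # 2038-01-19T03:14:07+00:00
print(after_wrap.isoformat())  # 1901-12-13T20:45:52+00:00
```

One second after 03:14:07 UTC on 19 January 2038, a signed 32-bit counter jumps back to December 1901, which is why software comparing or storing timestamps can misbehave.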
Quite a lot of the stuff that will break is already installed, you know, well, too late, but there will probably be more, so we should probably fix it. Most of that stuff will be embedded distros like OpenEmbedded and Android, you know, built for the thing, so the set of stuff using Debian-style binary distros that this applies to is fairly small, but we should probably still fix it. Some 32-bit architectures don't care because they sensibly started with 64-bit time_t. Big distros, I don't think, care about 32-bit anymore. If anyone wants to contradict me on that, that will be useful. I'm pretty sure Red Hat has given up already; I don't know about Fedora and how long they tend to... Okay, excellent. Obviously, most of the 32-bit architectures are varying degrees of obsolete, you know, sort of still in use now, but probably not in another 15 years' time, or at least not beyond stuff that will still be installed. But I think the one thing that does still exist and is still being used fairly heavily is ARMv7, and Debian intends to carry on maintaining that for as long as people are using it, so we kind of have to fix that. And this problem is harder if you're a binary distro than if you're a source, rebuild-everything-every-time distro, because how does the transition work? So what has been done so far? Arnd, who unfortunately isn't here, did quite a lot of work on this starting in 2017 with Deepa, fixing a lot of kernel stuff at the bottom of the stack. Turns out the Perl people fixed this 12 years ago, although in a slightly odd way. Of course, it's Perl.
musl was fixed a couple of years ago, glibc was fixed last year or the year before, and I think quite a lot of stuff as well. And because some of those embedded people especially have been building with new musl, which forces 64-bit time_t, quite a lot of stuff has been fixed. But I do not have a good handle on how much stuff still is not fixed and will just break if you build it with it, you know, it won't even build, never mind work, or stuff which will build, but will just explode your file formats or eat its old data or whatever. So again, I would like feedback on that. So quite a lot of people have done quite a lot of work. Arnd tried rebuilding Debian in 2020, but everything was too new and too much broke and it didn't get very far. So we've just done it again a couple of times last year, just rebuilding the base part, that's what I've done, and that seems to work, along with doing an ABI analysis. Steve Langasek did a load of work in Ubuntu on how many libraries actually change, which we'll get to in a moment. So how does this work? So in glibc 2.34, the way they've done it is to support the old 32-bit ABI and the new 64-bit ABI, and for each package you set a magic variable: are we doing 64-bit or not? So it's a per-build thing, and glibc doesn't just change. Normally with an ABI transition, you get the new library, it's different, you build against it, you get the new thing, so you don't have to do anything. But with glibc we were given the choice, which is kind of unhelpful really. So something somewhere has to set the magic variable, otherwise you'll get the old thing. They have some file format issues as well, the utmp and co. files, and apparently that's fixed. QEMU allegedly is still broken, but somebody has a plan. So just to make this a bit more complicated, glibc sets file offset bits to 64 if you set time_t to 64, so you have to have large file support if you're having 64-bit time.
So that's two different ABI changes, and large file support has been around a long time and people have been fixing it for ages. Again, I don't know how much software doesn't work if you turn large file support on: doesn't build, gets it wrong. My impression is that most stuff has been fixed and is actually using that already. But again, I'm not sure of the numbers. Gnulib: if you use Gnulib in your build system, it will just turn on 64-bit support if glibc provides it. So if you build with a modern glibc, you'll get 64-bit time_t, unless you specifically set this magic variable you've never heard of to stop it doing that. So some people are going to have already been bumped into 64-bit time when they weren't necessarily expecting it. And in the last Autoconf release they decided that if you turned large file support on, you should have 64-bit time_t, i.e. the opposite way around to glibc, but effectively that meant anybody who had ever put in large file support any time in the last 15 years suddenly got 64-bit time_t if they re-autoconfed, which some of us thought was a bit radical, so we told them not to do that. But the thinking there was: it's time. We should change. Please will people get a move on? So, you know, it wasn't completely crazy, but I think that would surprise too many people, and we do need to work out how we're going to do this without surprise. Yeah, so people transition, but when they're expected to, not because something they hadn't noticed changed. So the point here is, if you use time_t in any struct, especially in a public ABI for your library, or in any file, then if the size changes, the ABI changes and your file format changes, and that is the transition we have to deal with. Now, you know, new ABIs are not uncommon. It happens all the time. Libraries change and add new stuff. Adding new stuff is okay, but changing stuff is not.
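To make the point about structs and file formats concrete, here is a small sketch of what happens when the same record is laid out with a 32-bit and then a 64-bit timestamp. The record layout (an id, a timestamp, a flags word) is hypothetical, and the layouts are packed with no padding for simplicity; a real C compiler may add alignment padding, which changes the exact sizes but not the conclusion.

```python
import struct

# Hypothetical on-disk record: id, timestamp, flags (little-endian, packed).
OLD = struct.Struct("<iii")  # timestamp stored as a 32-bit signed int
NEW = struct.Struct("<iqi")  # same record with a 64-bit timestamp

print(OLD.size, NEW.size)  # 12 16

# A record written by a program built with the old layout...
record = OLD.pack(7, 1_700_000_000, 1)

# ...read back by a program built with the new layout. The data is padded
# with zero bytes so the read succeeds; a real reader would instead run
# into the next record or the end of the file.
padded = record + b"\x00" * (NEW.size - OLD.size)
misread = NEW.unpack(padded)

print(misread)  # (7, 5994967296, 0)
```

The id survives, but the flags word is absorbed into the upper half of the timestamp and the flags read back as zero. The same mismatch applies when a library and its caller disagree about the struct layout in memory.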
But this is a big one because it affects a lot of packages, though not as big as some we've done in the past. File formats, I think, are harder to quantify, to work out quite how big a problem that is or still is, and it's mostly an application-level thing, you know, if your app stores files and it needs to be able to read its old files and its new files without exploding. Disk formats are a bit trickier, but again, mostly that's been dealt with in the kernel, I believe. So the fundamental question, especially for Debian, but more generally really for everybody, is: are we just going to update the existing architecture with this new feature which changes the ABI, which is what everybody's done so far? You just rebuild it with the new 64-bit time and stuff changes, but we still call it arm-linux-gnueabihf, or whatever the i386-world equivalent is. That's not too bad if you rebuild from scratch every time. It's quite a lot harder in the binary distro world, where we expect to upgrade things as we go along. And it does change the ABI. So in a sense, this is kind of wrong, because that triplet defines an ABI. But on the other hand, we do this all the time: LFS changes the ABI, and nobody said we need a new triplet for that. But from Debian's point of view, a lot of people have said, well, this is too big and scary and everything might break, so let's not do that. Let's just start a new architecture with a new triplet, and then we'll know it's all done, everything will correspond, and we're much less likely to get random breakage from a half-and-half mix of 32-bit and 64-bit time_t. But on the other hand, other stuff will break just because you've changed the name and the triplet and stuff. There's always some breakage in there. The problem is that if we decide we need a new architecture and everybody else has done it with the old triplet name, now the old triplet name means two different things.
It means the new ABI elsewhere, or the old ABI in the Debian ecosystem, and that doesn't seem like a good place to be either. So I don't know. If we could all agree what we're doing amongst distro people who care, that would be really helpful. So as I said, glibc doesn't enforce 64-bit time_t, so just building against a new glibc doesn't put you in 64-bit world. Something somewhere has to set the magic variable, and the question is, does dpkg do that, or glibc do that, or GCC do that? I mean, it seems to me glibc should probably do it, but at the moment that's not a thing. We can work that out. As I said, the new triplet is kind of easier. We just start again. It would give us the opportunity to call it arm32 for the rest of time, which would be a much better name, but that's a very small bonus in comparison to how much work is involved. So in a way, this transition is just like any other large transition, and we've done it before. libc5 to libc6 was massive, and that was like everything had to transition in one go. Everything was smaller back then, which made it a bit easier, and it affected everybody, so everybody had an incentive. The problem here is that this is really only an armhf problem, but if we have a big transition with lots of libraries, everybody has to wait whilst that piles up and then goes through, and all the other architectures, proper architectures, will complain and go, will you people get a move on? You've bunged up the whole world. We have done this before: back in 2007, apparently, long doubles changed from 64 bits to 128 bits on a whole load of relatively minor architectures nobody cares about. Not now. I mean, they probably cared a bit more 13 years ago, and the world didn't explode. I didn't even notice, so it can't have been that bad. So I had a look at some numbers. How big is this problem? Is it completely intractable or not? I think that's quite an important question.
So six and a half thousand packages out of our 35,000 have time_t in them at all. So that's how big it could possibly be, but an awful lot of that is not in public APIs or file formats. Again, I have really no handle on how big the file format part of this is, but for the ABI part, we're getting a reasonable idea. So in the bottom 150 packages that you need just to bootstrap a system, 85 of those are libraries and seven of those actually change with time_t, so that doesn't sound too bad. Steve Langasek has done a very useful bit of work on Ubuntu, looking at all 767 library packages. 209 didn't analyze for tedious reasons, we can talk about that, but of the 558 that did, 17% of them, i.e. 82, did change the ABI. So that doesn't sound too crazy. If we assume the same sort of fraction in the bit that's not yet analyzed, there's maybe 115 libraries or something like that which would need to transition together. So that's a lot, but it's probably doable. I've just done some fairly random experiments, having built standard armhf, armhf with large file support, and armhf with 64-bit time_t and large file support, and just tried swapping binaries between them, and nothing obviously blew up. It didn't just immediately fail, so it's not that bad, but I didn't have a good set of things you should actually check. What are the tests I should run, having installed some of this on the old system, to see whether it broke? So I think we need a list like that to get a handle on how bad that part of the problem is. So here's a few things that have been mentioned as potential problems. NFS version 3: apparently the time there is sometimes signed and sometimes isn't, depending on the client implementation, so some of them won't break for another 150 years, some will, and that will annoy some people. I don't know how many NFS people haven't moved to v4 yet, probably some. Apparently ext3 is a problem; I don't know anything about that. XFS was a problem, but I believe that's fixed in kernel 5.10 or something.
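The extrapolation from the analyzed libraries to the whole set is simple proportion. The sketch below takes the counts quoted above at face value (82 changed out of 558 analyzed, 767 in total); the variable names are illustrative only.

```python
analyzed = 558  # library packages that could be analyzed
changed = 82    # of those, packages whose ABI changed with 64-bit time_t
total = 767     # all library packages, including the 209 not yet analyzed

fraction = changed / analyzed       # share of analyzed libraries that change
estimate = round(fraction * total)  # assume the same share across the rest

print(f"{fraction:.1%}")  # 14.7%
print(estimate)           # 113
```

With these counts the estimate lands at about 113 libraries, in line with the "maybe 115" figure given in the talk. Note that 82 out of 558 works out to roughly 15% rather than the 17% quoted alongside it.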
cpio was alleged to be a problem, but then somebody else told me that it has 11 octal digits and therefore 33 bits, so we've got another 150 years before that breaks. Does anybody know for a fact? There's this whole room that should know stuff like this. INN will break, but that's a thing that could be fixed. 32-bit Wine on 32-bit systems with 64-bit time_t allegedly may break, but then that's only on i386, and I don't think we care about that, so I'm probably going to ignore that. So yeah, what else? Again, we shall come to some questions in a moment. I'm sure there's quite a lot of other things, but I don't know what they are. So yeah, it's question time. I have done that reasonably quickly, excellent. I would like feedback from people. So on this point about whether we are doing a new architecture or not: thinking so far in Debian has been, this feels very big and intractable and we're not sure how bad it is, so it's a lot safer to do a new thing. But having done some research, the research I've seen suggests that maybe it's not so enormous that we can't do this, so I think an in-place transition probably is doable. There will probably be some breakage, but it'll only be on armhf, so we could just annoy those people for a while. It might eat your files, though, and I think that's the thing that is really going to piss people off. If you install a new thing and halfway through it just corrupts files, they could be important. That's not good. So that's the driver for going, let's just do a new triplet. But I don't think anybody else wants to do a new triplet, because most people have sort of fixed this with a standard transition within the system and a rebuild. So I'm not going to say what I think just yet, but I have been developing an opinion having done this research, and I would love to get a list of things like, is D-Bus going to break?
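The "11 octal digits, therefore 33 bits" remark can be checked directly. This is a sketch of the arithmetic only, assuming the timestamp is an unsigned count of seconds since 1970 stored in an 11-character octal field as described; it does not parse any real archive.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

max_value = int("7" * 11, 8)  # largest number 11 octal digits can hold
bits = max_value.bit_length()

# Last second representable in such a field.
last_second = EPOCH + timedelta(seconds=max_value)

print(bits)              # 33
print(last_second.year)  # 2242
```

Each octal digit carries three bits, so 11 digits give 33 bits, one more than an unsigned 32-bit field. Under the assumption above, such a field runs out in the 2240s, comfortably past 2038.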
Does it have any times in it? If half your things are using new time and the other half are using old time and D-Bus has, or things like that, you know, IPC mechanisms as opposed to ABIs. I think the ABI problem we understand; it's the rest of it I'm a bit worried about. Yeah. I'm not quite sure how to... does anybody want to say anything? Somebody wants to say something? Have we got a roaming mic? Yes. Excellent. Steve, yeah. Hi, Wookie. Hi. So, of course, having discussed this a bit in the past, this is most complicated on armhf. As you said, it's probably not possible to change the ABI of i386, because there are a whole load of old binaries that people will not be able to run anymore. And the only reason you'd ever care about running i386 is for shitty old binaries. It's less of an issue there for armhf, so that might be feasible. Yeah, so we need good test suites for all of these things, and good integration tests, as you say, for either end of IPC and that kind of thing. I have no idea on those. I wish I did. Okay. I have the same concern about 32-bit Wine, which is, people care about that for running shitty old Windows binaries. I mean, I care about that. I have some shitty old Windows binaries, and that's what it's for. I mean, somehow, Win32 has become this lingua franca of "I need an old program to run on any machine", and now, if we look at what Valve is doing with Proton, they've just decided, fine, Win32 is the standard for it. So I realize there's a difference between Win32 and the architecture that it runs on, but 32-bit Wine specifically: as I understand it, 64-bit Wine doesn't run 32-bit Windows binaries. Correct. So the thing is, Wine already does ABI translation, that's what it does, right? So it seems to me that if the underlying system has 64-bit time calls, I'm not sure it necessarily can't work with the old binaries. Oh, that's something I haven't thought about.
I may misunderstand exactly how this works. So I don't know whether, in fact, that is fixable, or, as I say, whether anybody cares enough about the i386 world, or we just leave it alone. I hadn't thought about the fact that that relies on the Windows time format, which isn't faced with this problem. Yeah. So yeah, I'm not sure whether it's actually a big deal. It just seems to me someone would have to do some work in some old code that everyone's left alone for a very long time, and nobody can remember how it works. I also wonder about ARMv6; there's an awful lot of Raspberry Pi 1s floating around the world right now. Yes. It's a good point, actually. Are there any Pi people in the room? What have the Pi people thought about this problem? I haven't talked to the Pi people yet, and he's quite right that it's a significant constituency. I'm not from Wine, but I remember something about them improving their abstraction so that you can run 32-bit Wine applications on a 64-bit Wine, so moving the 64-bit to 32-bit translation into Wine so that the native libraries are all 64-bit. I'm not sure if that would actually solve the year 2038 problem. Fair enough. So we're not sure. Totally good. So does anybody have things which they know will break, which have not been mentioned? Yeah, because, I mean, you've shown us this magic autoconf variable that switches the transition to 64-bit time_t off, but you cannot be sure that somebody hasn't used, well, maybe an unreleased part of Gnulib in their autoconf code or whatever, and there are lots of exotic build systems out there where somebody may just say we are using 64-bit time, and then you have some random piece of software that is compiled one way, and some other random piece of software that is compiled the other way. So we are kind of already in the mess, and we just need to find the best way to get out of it. And the other thing that I wanted to say is that, I mean, personally, I'm kind of a bit of a fan of the Gnulib.
Who want to keep some ancient software running with binary compatibility. So yeah, I think, as I said, the big boys just abandoned the 32-bit world, so they decided it's not their problem. Yeah, I'm also not from Wine, but for sure, regarding what was said before, yeah, Wine 8 is basically going to start to support running the 32-bit binaries on 64-bit, so that would be a problem. Another thing you mentioned: stuff like D-Bus won't be a problem because, I mean, D-Bus is not tied to time_t, it's basically passing numbers, you define the kind of... It's not passing time structs around to people. Yeah, I mean, it might be passing time, but it's not the time type, so basically you define the kind of... I don't know how it works, but yeah, any sort of protocol like that which is being given structs by one application and has them taken out by another, if they have a different understanding of how big the time variable is, won't work. Yeah, you know, that's exactly true, but I mean, in that case, it shouldn't be a problem. Where it might be a problem is for stuff like using sockets or FDs; that's indeed an issue, and that's... Right, I mean, just to respond to Andreas, the, what was I going to say? No, my brain's failed. No, that's right. The fact that, as you say, there's already a certain amount of randomness in what's built out there: if we're not noticing a huge number of problems from it, then maybe it's not too bad, right? It could be viewed as a good thing, as a good sign, I don't know. I mean, in any case, when you said that you have to do a large-scale rebuild, and that source distros have it easy, that's not entirely true, because we will also have to schedule a large-scale rebuild for our users. True. Easier, I think. Yeah. There was someone up there. Yeah, there's at least NTP, which has some kind of time_t built into the system. Yeah, that's 2036, that runs out. So actually, that's first, and maybe an even bigger disaster.
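The 2036 date mentioned for NTP follows from its timestamp format, which counts seconds from 1 January 1900 in an unsigned 32-bit field. A short sketch comparing the two rollover points (the variable names are illustrative):

```python
from datetime import datetime, timedelta, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# First instant that no longer fits in each counter.
ntp_rollover = NTP_EPOCH + timedelta(seconds=2**32)      # unsigned 32-bit
time_t_rollover = UNIX_EPOCH + timedelta(seconds=2**31)  # signed 32-bit

print(ntp_rollover.isoformat())     # 2036-02-07T06:28:16+00:00
print(time_t_rollover.isoformat())  # 2038-01-19T03:14:08+00:00
```

The NTP field wraps almost two years before a signed 32-bit time_t does, which is why it comes up first.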
No, I further understand the other time, t2. I need to find that link for you. Okay, yes, please. If anybody wants to send me examples of things that should be tested, that will be brilliant. I think that's probably the way to do it. So the other thing that I've failed to get onto is there's a list where we intend to discuss this and produce a plan. Quite soon, I think; we should do something this year, whatever the hell it is. And we need, within reason, some representatives from each; it doesn't need to be very many people, just the fairly small number of people who actually care about this problem. Which you're particularly worried about, yeah, I guess that's our time anyway. So thank you very much. Thank you. |
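The two dates that came up in the discussion above (2038 for a signed 32-bit time_t, and 2036 for NTP) can be checked with a little arithmetic. The following is a minimal sketch, not part of the talk; the function names are made up for illustration, and it assumes the usual definitions: a signed 32-bit count of seconds since 1970-01-01 UTC, and an unsigned 32-bit NTP seconds field counted from 1900-01-01 UTC.

```python
from datetime import datetime, timedelta, timezone

# Epochs assumed for this sketch: Unix time counts from 1970, NTP from 1900.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def last_signed_32bit_time() -> datetime:
    """Last instant representable in a signed 32-bit time_t."""
    return UNIX_EPOCH + timedelta(seconds=2**31 - 1)


def ntp_era_rollover() -> datetime:
    """Instant at which a 32-bit unsigned NTP seconds field wraps to zero."""
    return NTP_EPOCH + timedelta(seconds=2**32)


if __name__ == "__main__":
    print("32-bit time_t ends:", last_signed_32bit_time().isoformat())
    print("NTP era 0 ends:    ", ntp_era_rollover().isoformat())
```

Under those assumptions the NTP wrap does come first, which matches the remark from the audience.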
Creating and distributing debug packages |
Hi, everybody. Yes. So, I'm going to talk about debug packages and distributing debug packages today. My name is Morten Linderud. I go by the nickname Foxboron on the Internet. I have been a contributor to the Arch Linux distribution since 2016. I've been doing open source development since 2013. I do security team work and reproducible builds. I care about usable security, supply chain security and all of that stuff, and a lot of Secure Boot. But today I'm going to talk about what I've been spending sort of two years of my life working on, which is debug packages in Arch Linux. So, if this scales correctly. Normally, when you get some crashes at some point, you will see this fancy little stack trace. And if you use systemd, you will at some point have the crash handler catching the segfaults, which happens. And then you can just debug this with GDB. And if you look at the backtrace, you just see nonsense. There's nothing here that makes sense at all. You can't figure out what happened. You don't know what crashed. You have no idea. So, if you actually do this on an Arch Linux system today, what you'll see is not that nonsense backtrace. You'll instead see, no, let's press Y. And you'll instead get this, which has a lot more information. You'll see what happened, what crashed. You'll get all the symbols. And you did nothing. You did not download any debug packages. You didn't think about it. It just happened behind the scenes. And if we ask what actually happened, you'll see that there's some internal syscall that crashed. So, this is super nice. This is a lot better than what the debugging experience has been on Arch Linux previously. And it took me, I don't know, three years, two and a half years, implementing it a little bit on and off. So, why do we care about debug packages?
So, if we, for instance, take Pacman, which is a fairly simple and small binary, it's like half a meg in size if you build it. But if you strip away all the debug information, you can almost halve the size, which is nice. So, if you don't need all of that information on your disk, it's nice to have some space savings. And in more extreme cases, like KiCad, which has some shared object inside of Python, it's like half a gig. And if you strip away the debug information, it's 33 megabytes. It's nice to have the opportunity to debug all of this, but it can all be very large. So, what people do instead is that GDB implements what we call detached debug symbols. And that allows us to separate out the debug symbols from the binaries and re-link them together on the system. And one of the key elements for this is this fancy little build ID, which gets stamped into every binary on your system. And we use that to link. We find the build ID. We can make some standard directory on your system. We can split out the debug symbols from the binary, move them to that directory, add a debug link to the binary, and everything just works. It will be as if the debug sections were still in the binary. This is nice. And this is what Debian, Ubuntu, Fedora, all of them do to make those debug packages. But one of the things that you saw in the demonstration is that we also have the source code of the binaries. And that's more of a hack, which some distributions support and some distributions don't. So Debian and Ubuntu do not have source listings, I believe, while Fedora, SUSE, and now Arch as well have source listings. And the way this works is that you do a little bit of hacking. So if we build Pacman just normally and we run GDB on it and we ask what the sources were, you'll have your embedded project path in those binaries.
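To make the build ID lookup described above a bit more concrete, here is a minimal sketch of how a build ID maps to a detached debug file. It assumes the layout GDB documents for its build-ID method (a `.build-id` directory under the global debug directory, the first two hex digits as a subdirectory, the rest as the file name with a `.debug` suffix). The function name and the default root `/usr/lib/debug` are assumptions for illustration; a given distribution may configure a different root.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def build_id_debug_path(build_id: str,
                        debug_root: str = "/usr/lib/debug") -> PurePosixPath:
    """Map a hex build ID to the path where a detached debug file is looked up.

    Hypothetical helper: the first two hex digits become a directory and the
    remaining digits plus ".debug" become the file name.
    """
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or not set(build_id) <= HEX_DIGITS:
        raise ValueError(f"not a usable hex build ID: {build_id!r}")
    return (PurePosixPath(debug_root) / ".build-id"
            / build_id[:2] / (build_id[2:] + ".debug"))


if __name__ == "__main__":
    # A made-up build ID, only to show the shape of the resulting path.
    print(build_id_debug_path("ABCDEF0123"))
```

The build ID of a real binary can be read with a tool such as `readelf -n`; the value used here is invented.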
So what you can do then instead is to use debugedit. Historically, this has been part of the RPM upstream. So Pacman didn't want to have a dependency on RPM to support debug packages, which would be a bit weird. But this was split out into a separate project back in 2001, no, yeah, 2021, which is quite nice, and it makes it more accessible for other package managers. So instead of using the current working directory to embed stuff, we can rewrite all of those paths inside the binary to some standard path on the file system. So in Arch, we use /usr/src/debug and then we do namespacing so we can have sources from multiple versions of Pacman. And if you do these dumps, you'll see it has rewritten all of those source listings, which are part of the binary, which is super nice. And then you can get all the source code associated with the binary. So before debugedit was available as a normal thing, Pacman also had support for source listings, but it didn't use debugedit. It used awk instead. So it tried to parse out all of the file paths, I don't know, from readelf, tried to figure out whatever was there and tried to get it out. And this worked for simple C programs, but if you threw like Rust or Go at it, it had no clue what that was at all. So it was a hack. It worked. It was in the source code for, I don't know, six years maybe. So I ripped that out last year. So this, yes. So when these packages get built and you have the debug symbols and all of the source listings, we can then compile all of this into some package and distribute it. So all our packages in Arch Linux go to repos.archlinux.org, which is a tier zero mirror. That's where all the packages get distributed from to all our mirrors. And on this, there are two package pools. For core and extra, there's a packages-debug pool.
And for community, there's, okay, there's a community-debug pool, not packages. These can be fetched and distributed to all mirrors, but it's a huge amount of packages. So what we do instead is that we rsync this over to something called a debuginfod instance we have, which allows us to fetch everything over HTTP instead. So debuginfod is a very cool microservice which is capable of getting you the source code and the symbols for binaries over HTTP. So you don't have to think about which debug packages you need, or which ones you have to download to get a full backtrace. We can just point GDB at this instance and it will fetch everything for us, which is quite nice. So it's written and maintained by the elfutils maintainers. It's a web server in C in the year 2020. It's running on a few distributions; I think Void Linux has one, Debian and Ubuntu got one in the past six months, and Fedora and SUSE also have several of these. So it's super simple. We can just run debuginfod, tell it that these are some tar archives that you want to parse, and give it a package pool. And we just set the DEBUGINFOD_URLS variable and then we can run GDB on the binaries and it works. That's all you have to do to make GDB read those files instead of having to distribute them. So, yes. And then you have this debuginfod-find command line thing to fetch stuff for you, or you can use it as a library instead. But yeah. So running a web server in C in 2020 is, you know, a little bit iffy. So we wrote and distribute this hardened systemd service file. So if something gets exploited or something happens in that C code, you never know, it's still only really contained to a fairly restrictive set of policies. So you can't escalate your privileges, you can't really write anything to the system. But you can just read stuff, which is quite nice.
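As a rough illustration of the HTTP lookup described above, the sketch below builds the request URLs a client would try for a given build ID. It assumes the debuginfod web API paths (`/buildid/<id>/debuginfo`, `/buildid/<id>/executable`, `/buildid/<id>/source/<absolute path>`) and that `DEBUGINFOD_URLS` is a space-separated list of server base URLs. The function name, the example host names, and the example build ID are hypothetical; real clients such as GDB or debuginfod-find do this internally, so this is only meant to show the shape of the requests.

```python
from typing import List, Optional
from urllib.parse import quote


def candidate_urls(build_id: str,
                   debuginfod_urls: str,
                   artifact: str = "debuginfo",
                   source_path: Optional[str] = None) -> List[str]:
    """Build the URLs to try, one per server listed in DEBUGINFOD_URLS.

    Hypothetical helper for illustration only.
    """
    if artifact not in ("debuginfo", "executable", "source"):
        raise ValueError(f"unknown artifact type: {artifact!r}")
    if (artifact == "source") != (source_path is not None):
        raise ValueError("source_path is required for, and only for, 'source'")
    suffix = artifact
    if source_path is not None:
        if not source_path.startswith("/"):
            raise ValueError("source_path must be an absolute path")
        # The absolute source path is appended directly after ".../source".
        suffix = "source" + quote(source_path)
    return [f"{base.rstrip('/')}/buildid/{build_id}/{suffix}"
            for base in debuginfod_urls.split()]


if __name__ == "__main__":
    # Made-up servers and build ID, standing in for a real DEBUGINFOD_URLS value.
    servers = "https://debuginfod.example.org/ https://mirror.example.net"
    for url in candidate_urls("abc123", servers):
        print(url)
```

A client would request each URL in turn and use the first successful response.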
So the only paths this really has access to on our production system are just these two package pools and some cache directory, and that's everything it sees. So that's quite nice. I've been planning to upstream it. And I think Ubuntu and Debian use this as well, but it's not upstreamed properly yet, sadly. So, you know, we have debug packages, we distribute them, people can use them, but we can also collect metrics from people accessing this server. So I spent a little bit of time on that. Let's look at how this renders. Yeah. Okay. It does not like that. I don't know. I can't zoom out. I hate this. So what you see here is just some basic statistics on what people have been doing with it. We enabled debug packages for all our packages fairly recently this year. So that's why you see the biggest corpus spike going straight up, because we have more symbols now. But you also see that we reached two terabytes of data being sent out to different users in the past month. So that's the last 30 days with two terabytes out. And you can see some statistics on how much data people are fetching, the errors, and so on, and the statistics are quite nice. And you get this for free from hosting it. Yes. So all of this infrastructure that's been put up in Arch, of course, is all open source. There's no proprietary infrastructure. There are no hidden files. So all the stuff we use to distribute debug info is in our infrastructure repository under the debuginfod role. That's how we fetch all of the packages, how we do the service management stuff, and all of those things. Yes. So I'll probably have more time. Yes. So one of the things I also did, because, you know, debug packages are usually done for C applications and stuff, but I don't actually know C. I do Python and Go instead. So what I also spent a lot of time on is trying to get better debug info support in Go, because that's cool.
So here, just to give an example, we're going to crash the Tailscale SSH client, because that's a nice example, I think. So this instructs the Go compiler to actually give us a core dump. And then we can use the Delve debugger for Go. And it actually, with a few patches, is able to read out all the debug symbols and all of the source code, which is fetched from the debuginfod server as well, which is quite nice, as it gives us more opportunities to debug Go applications. It also works for Rust. It also works for Julia and whatever programming languages you want. Which is quite nice. So it's an improvement for the entire ecosystem as well. Yes. That was it. I'll have a lot of time for questions if anybody has anything. So I'm wondering what you actually store for the source. Is it the build tree or are you trying to remove some things to save storage? Because, I mean, you have a package, you have an upstream source, you have patches on top of the upstream source, and then maybe the build process even generates sources itself. Yes. So I don't quite know how, but this is just the binary; the source listing is generated as part of the DWARF metadata, I think. So there are some generated, optimized-out sources, I think, and there are some things that point around to different sources, but it will mostly just be the patched, generated, final sources, which get embedded there. So the source listing is a nice bonus, but it's not necessarily something you would normally be distributing with the binary. That answers the question. Yes. Any more questions? Thanks for using it. Could you upstream the systemd service files? Yes, I've been meaning to do this for a long time. It's a little bit problematic though, because you need to figure out how the paths and stuff get into the service file from some configuration file, but it can probably be done.
And I think that there will be people who use it as well. Yes, it should be upstreamed. Yes. Yeah. So, and we normally hide the HTTP server behind a proxy. Yes. It's written in C++ if that helps. Yeah, no, yes. It's actually C++, not C. It's all the ELF stuff that's mostly written in C, I think. Yeah, so it's a C++ program that uses libmicrohttpd and SQLite to store its data. Yeah. So we have it behind a reverse proxy to get the TLS configuration going and so on, but we also just wanted the hardening there because it's easy with systemd to just get the hardening there. So there's no reason not to do it. So it's quite nice, but I'll try to upstream it. Thanks. Yes. Are those statistics on your dashboard pulled from the HTTP server he was describing? Are those from your Nginx or whatever proxy you're using? What? Sorry. The statistics you had on your dashboard earlier? Yes. Are those pulled from the back end? Or are they from a proxy in front? So debuginfod has a /metrics endpoint, which is all Prometheus. So it just exports a bunch of metrics and you just point Prometheus at it and it will parse them. So that dashboard is something we made internally, which I just spent two weeks making, and that's also open source. So you can just fetch the JSON file for the Grafana dashboard, and everything there is all open. So you can go look at it. But it's just this /metrics endpoint of debuginfod. So the Red Hat people actually watch this for all the debuginfod servers that have been deployed, and they can look at the statistics and errors from all of the different servers and see the traffic between all of those, like how much Fedora is distributing compared to Arch and stuff, which is quite nice. That is not public, I think. But yeah, it's cool. Can you tell us a bit about the requirements in terms of storage?
Because I recently looked at another distribution and they didn't build all the packages because of lack of storage. So that's what I'm trying to figure out now, because we enabled debug symbols for all the packages, but we're not currently distributing them to our mirrors. So the total mirror size for Arch is like 60, 70, 80 gigabytes of data, I think. But I assume that would be several hundred if we actually uploaded all the debug packages. But I think Fedora in total is like three, four terabytes or something. So I assume it will increase in size three, four times or so. I know the LLVM stuff is like a two gigabyte package, I think, with symbols, and people try to optimize it a little bit so it's faster to upload. So, yeah, one main issue with debugedit and debuginfod and stuff is that DWARF 5 has support for compressed sections, but debugedit does not understand the compressed sections. So you have to decompress the sections before you rewrite the paths, and there's no good way to recompress everything again. So getting better support for compressed DWARF info would help fix a few of those space requirements on the mirrors, I think. Can I ask another question? Is there work on deduplication instead of compression? Because you have different versions of packages as well. So it's not that relevant for Arch, because we don't keep those versions and we don't really do delta files for the packages. So on the Arch side of things I don't think that's really relevant for us, but I don't know. It could probably be done at some level, at least for Fedora or Debian, which keep multiple versions of the same package. A small question: for which architectures are you generating those debug info binaries? So Arch only really supports x86_64. We don't really have any other architectures.
But we have the 32-bit port and we have the ARM people, and I think they're just pulling our packages and probably building debug info for them, but Arch itself is not really distributing anything other than x86_64 currently. So you mentioned different architectures. Do you know if there's a plan to upstream this, and in general RISC-V, because I know Felix Yan is working on this? Yes, I know Felix is working on it. We want to; this is more of an Arch thing, but we don't have a traditional build farm server setup. So it's a bit hard for us to do multiple architectures, because one package maintainer has to build the package for each architecture. So we want to have support for more architectures and better support for things like the v2, v3, v4 versions of x86_64 that you see being supported now. But we haven't really solved that in a good way currently. Okay, thanks. Thanks. Thank you. |
KDLP: Kernel Development Learning Pipeline
A comprehensive pipeline for bringing new talent into the Linux kernel and its orbit |
Okay, all right, is the mic working? Yeah. Yeah. All right. Great. Okay. All right. Welcome, everyone. I'm going to talk about a program that I've started at Red Hat called Kernel Development Learning Pipeline that I run with a small group of engineers kind of as a side project. So first, I'll talk about what it is, first of all, why it's a good idea, why we think it's a good idea and we're doing it, a little bit about the program and some of the growth that we've had in the program over the past year, two years approximately, and then conclude with some resources about the program. So first of all, what is KDLP? So that stands for Kernel Development Learning Pipeline, credit for that acronym to Julia Denham, actually, and we are building a comprehensive pipeline for Linux kernel talent and low-level talent more generally, because as it turns out, there's not a lot of younger people getting into the kernel. It's not an extremely popular area of study, it's barely really taught in school. So right now, the main component is we have a course that we're teaching at UMass Lowell and also on the Linux Foundation's platform. We are trying to recruit interns from this course and from this program and through kind of our network. We bring people in and then they serve as TAs and help develop the course and improve the content. And then ideally, we bring them in and recruit them full-time, that's the goal. So yeah, why is this a good idea? So like I was saying, a lot of senior Linux kernel engineers are getting somewhat close to retirement. That's kind of, definitely a much higher, I think, average age in kernel and in like low-level engineering, at least in the US that I've noticed, than in other areas of software engineering. And it takes a very long time for people to learn the Linux kernel, especially today, because a lot of people don't even learn C in school and they're generally separated from a lot of the low-level computer science concepts. 
Things that people 20, 30, 40 years ago, so I'm told, learned kind of as a standard thing. Today, it's sort of a niche topic. There are people getting into it, but only in niche communities at certain schools, if you're exposed to certain people or certain online communities. So the kernel itself is barely taught. People take maybe one class in operating systems, and a lot of the tools, like some of the more advanced Git usage, for example, are difficult to learn. How do people learn how to do email patches without trying and failing and just getting roasted online? It's kind of difficult. So for a lot of companies, it's difficult to find talent. I mean, I know at Red Hat, it can be somewhat difficult to find people who have some Linux kernel knowledge and are actually interested in doing it. There's lots of people who are capable of doing it and just maybe don't like it that much. So, of course, the way that we're pitching this to Red Hat, and the reason it's a good idea for any company, is that we think we can bring more value to the company for a lower cost. But for the community, what we are trying to do is train the next generation of developers from all sorts of different backgrounds, as opposed to just people who happen to bump into somebody at a recruiting event, which is how I got into the kernel. I just happened to run into somebody at an event that he wasn't even supposed to be at, in fact. So a little overview. There's kind of three main sections of the program. The first is this kernel development course that we've developed that is taught at UMass Lowell. So we've created this from scratch, originally based on the Linux Device Drivers, Third Edition book, but we've kind of gone further from that. That's actually one of the latest books on Linux kernel engineering and it was published in 2005.
But yeah, so we've been working on developing our own curriculum with a number of different labs and some slide decks and kind of a whole, you know, set of things that people can do to learn the kernel. And then from there, we try to bring people into internships as we're able to do that, or other kind of second-level or hands-on experiences where they can further develop those skills, and then ideally bring them in as full-time engineers at Red Hat. Or, I mean, the beauty of open source is if they get recruited somewhere else or if they go work somewhere else in the open source community, that's still a win for everyone in the community. It's a win for all the companies working with the community, because everyone working on the kernel is working on the kernel. So a little more about the course. The goal is to introduce students to Linux kernel development. You know, obviously we can't teach them the entire kernel. We can barely even introduce them to one subsystem, but the goal is to teach them what they need to learn to teach themselves, and to teach them what they need to know to work in one of these open source communities, specifically the Linux kernel community, but those skills I think are pretty applicable to other more niche open source areas that can be somewhat intimidating for a lot of people to get into. We mostly just require C language skills. I mean, that's obviously the most important thing when you're working in the kernel. That, and basic Linux experience, some programming experience in general, but you don't need a ton of Linux kernel experience to get started; really the most critical thing is C. So we teach kind of an overview of various kernel features and subsystems, kind of describe a map of the kernel, what there is, you know, how the different pieces connect together, how you can interact with it.
We approach it through device driver development. That's a good way to get people to work with a lot of the different APIs, because to write a device driver you need to interact with a lot of different areas, and you end up building something that's a somewhat complete piece that works on its own. And then we introduce people to more advanced usage of Git. I mean, people may be familiar with GitHub, but, believe it or not, a lot of people don't know the difference between Git and GitHub. So we help out with that and get them started with email patches and rebases and things like that. We also talk about bpftrace and other tracing. I don't think bpftrace or BPF in general is taught in any school that I know of, at least not in undergrad courses. And, you know, cscope and other ways to explore large projects and repositories, because in a computer science program people generally don't get exposed to working on large ongoing projects. You work on some one-and-done thing, and then you send it in and no one else looks at it and you get your grade and you move on, right? So we're introducing people to working on larger code bases, which is something that's somewhat unusual for a university course. So of course, all these course materials and the assignments are completely open source. We've linked it, it's all on our website, and we're continuously improving it. The source for the website is also on GitHub, so if anyone has any suggestions or changes, we're always open to them. So we have undergraduate and graduate level university courses that run at UMass Lowell that we've created over the past couple of years. It became a full-fledged course in the fall semester of last year. So it's relatively new. And this semester we are also running it as a kind of dual program with the Linux Foundation's mentorship platform.
So we have people applying on there to, I will also talk a little bit more about that later. We run through the course and kind of work with at the same schedule and at the same pace as the people at UMass Lowell, but they're from around the world and they don't need to be enrolled as students. They can come from anywhere, right? And kind of the bottom line is we try to reimagine how to teach this kind of material from first principles because it doesn't really make sense to teach it from a typical, like memorize this and do the exam kind of perspective, right? The goal is to get people to be able to explain things in their own words and to work on open source projects and work in open source communities. So we've replaced exams with presentations where they actually explain the work that they've done in their own words to their peers. So from there we recruit people as interns and try to bring them in. So we find people in the course who are enthusiastic and capable about so by the time we bring them in to the internship, they also have a lot of prerequisite knowledge that they would need. And so they're able to hit the ground running and do a lot more in their three to six month period or longer than they otherwise would have because actually, so I had a, the manager who hired me at Red Hat originally as a co-op, he said he didn't like to do summer internships in the kernel because by the time three months are over, they've just gotten on boarded. They've just started to understand how to work with a Linux kernel and then it's over and then they got to go. So he only really did co-ops. 
But the goal with this is if we kind of vet people ahead of time and we give them the skills that they need, you know, the Git skills, and they've compiled the kernel and they have the background, they can actually get some value from an internship, and we're more likely to bring people into internships who actually want to work in the kernel, and then they're more likely to go off and work in the kernel in general or work in open source communities. So we pair them up with RHEL engineers and they work on various initiatives within Red Hat that need people to learn these kind of combinations of old and new skills. But the goal is to train these new people to work in those areas. So we also bring them in to help with our program, and we've had people come in, they take the course and then they turn around and they're TAs and they work on improving the course. Maybe they do a couple of lectures here and there. They do some grading and they kind of serve as TAs. So another thing that we are doing is we have this new kernel devs group within Red Hat, which I thought I'll just briefly mention. It's a little bit tangential, but once people get into Red Hat and when people are working in the kernel, especially in a remote job, which a lot of these positions are, people get very separated, they get very siloed, there's not a lot of socialization. So we run a group to just bring people together for a meeting with very little agenda. Occasionally we have a presentation here and there, but it's just to talk about things and share concepts and tools and questions, and it's a Red Hat specific thing, so they can ask about specific things related to their job, and it's a place where, I mean, managers are allowed, but it's not a specific structured meeting for a business purpose.
And I think, you know, you might get that in an office somewhere when you're just walking around and talking to people, but it's a little bit more difficult to get when you're in a remote position. So we also bring interns who we've recruited into that group, and people from elsewhere within Red Hat who are interested in the kernel, and they can ask questions and find resources and, if they enjoy it, they can switch into the kernel. Because there are actually a number of people who are interested and they just don't know where to start. There's just not a lot of good resources. So now I'll talk about some of the growth of this program over the last couple of years. So first of all, we've partnered with Red Hat's main educational initiative, Red Hat Academy, which we found out about only through doing this program. They work with universities mainly on delivering and facilitating kind of systems management, you know, what do you call it, system administration and site reliability engineering, kind of the more standard Red Hat certifications that they do. But they don't have much software engineering. In fact, I don't think they had any software engineering component at all. So they were happy to work with us and we're happy to work with them. And they made us these nice posters, which we have down here if anyone's interested. Or I guess they're called leave-behind sheets, technically. We also ran a workshop last year for interns and co-ops in Ireland. And that was just every other week, kind of a casual thing, and over time we had fewer and fewer people, to be honest. But the people who stayed around really enjoyed it, and I think hopefully some of them will end up working in the space or in open source generally; the feedback from them was generally fairly positive. We've also been connecting to various educational programs in different countries.
A lot of this stuff's very preliminary, but we're hoping to package what we've done in our course in a way that can be replicated at other places, other universities and other countries, to bring people in from different places in the world. And then we've also partnered with the Linux Foundation, specifically their mentorship platform, which I will talk about next. So like I was saying, the Linux Foundation. We have just put our course on the Linux Foundation's platform essentially as a mentorship, but it's the same thing as the course. We've worked with Shuah Khan, she's been very helpful. She runs the Linux Foundation's mentorship platform and, you know, she's a great person to get in touch with if you're interested in learning the kernel or have questions about this. She's also a kernel maintainer, if you're not familiar. So we are running the same course, and this is kind of experimental, we're running the same schedule with the same assignments with a group who are doing the course for credit at UMass Lowell and, at the same time, people who just applied online and are just doing it. There are people, you know, just on random continents, some of them are students, some of them are working, some of them are doing, you know, I have no idea, honestly, some of them are probably just having a good time, but they're doing the same assignments on the same schedule, and they submit their assignments to a different mailing list than the UMass Lowell cohort, but they're in a shared Discord server and they're interacting, and we're seeing how that goes, and it's an extremely diverse group of mentees. We have people from, I think, like five different continents, people from various places in Europe, a couple in America, South America, someone in Mexico, like three or four from Africa, Nigeria, Kenya. So a very interesting group, a couple in India as well.
So a couple of statistics on what we've been able to do. It's a fairly new program, but I am happy to show some results. We were able to hire two people full-time from internships who originally did the, went through the class. This was during a year that was very difficult to hire people. There was a hiring freeze, so I think two is still, it's a lot more than zero. Two people who did co-ops or internships with us were recruited to Amazon and Microsoft, but one of them liked Red Hat so much and liked working with Open Source so much better than Amazon that he is pretty interested in coming back, leaving Amazon and coming back to Red Hat, which is pretty cool. But overall, we've had seven interns and co-ops who have been trained and have gone through this particular program via this KDLP thing. And last semester, we had our biggest class, well, until this semester, we had 12 students, undergraduate, mostly graduate students, graduate and undergraduate courses are the same, which I found out, same content. And we have about a dozen, a couple more than a dozen students this semester, and the course is actively going on right now. It just started a couple of weeks ago, and I think about a dozen in the LFX mentorship as well. And this is by far the most diverse group by gender and location, which is pretty interesting. So now, just briefly, some program information. So the team is, well, it's me. Julie Denham is a program manager. I think she's in the chat answering questions, if anyone's interested, back in Boston. Charles Marabelle, he's the content lead for the course. And Dennis Alexandrov, he is an intern who's been extended and also TA. He's been working on the course since last summer, originally went through the course from McGill University. And yeah, acknowledgments. I just want to give a shout out to two Red Hatters in particular. First Heidi Dempsey, the research and innovation director in North America. 
She's been a longtime supporter of this program from the beginning, from like, you know, before it was even an idea, you know, when we were talking about doing something in, like, February 2019. So her support's been great, and, you know, I recommend, you know, doing research and innovation with her if anyone's, you know, interested in doing that. And of course, Mike McGrath, the vice president of RHEL, who is the executive sponsor for this program within Red Hat. Now, a couple of resources. We have a mailing list, which is linked up there, which we pretty much just use for giving a quarterly update newsletter, which we just started this quarter. And it's read-only, so you're not going to get spammed with a ton of information. If you subscribe to that mailing list, you get invited to a weekly office hour session. That happens Tuesday at noon Eastern Standard Time, which is just for asking questions about the program or the kernel in general, or you can just send me or one of us an email. And then on the right, we have our website. We have a section of our website that talks about the information, specifically the structure and content of the course. We have the page on the UMass Lowell catalog that talks about the, you know, that just has a course description, in case anyone's interested in checking that out. And the mentorship page on LFX. Then down there on the bottom right, we do have a crowdfunding page on LFX. If you want to support the program development and diverse engineers, it's a very, you know, we'll be able to bring people on through that platform as TAs, potentially give them, you know, bring them on and continue them kind of beyond what we're able to do with just Red Hat. All right, any questions? Hi, I think there is a supply and demand problem. For every Linux developer, there will be like 1,000 JavaScript developers. 
And also people at the university, if you give them the choice between JavaScript and Linux, they will have more opportunities going with the JavaScript. They cannot do a mistake. However, if they go with the kernel, like, I don't know, in my country, for example, there's a lot of, there's a lot less jobs and you have less opportunities. So what's your opinion on this? Does people know the choice between like classic software development and kernel software development? So if I understand that correctly, you're saying there's fewer jobs in the kernel. And so, you know, what would I say about, just about what, it was like 1,000 to one or something, you said, well, I mean, that's true, but the competition for those jobs is much more intense. It's much kind of broader, you know, skill area. I mean, it's definitely, you know, I'm not condemning JavaScript developers. JavaScript's fine. It's necessary, you know, the web runs on it for all its problems. But it's also true that, I mean, working in the kernel, you know, the Linux kernel, there are fewer jobs, but there are a number of very solid jobs, and it doesn't, learning Linux kernel skills is somewhat transferable to other areas of development. You don't have to work as a Linux kernel developer. You don't have to, you know, go to Red Hat and work on RHEL and do backports or whatever. Right? You can kind of just do it on your own. You can do, you know, just some other low-level project and just having knowledge of the kernel and how it works and the knowledge necessary to work on the kernel, I think, could benefit people in many other areas and just is an overall boost to people's skills that is somewhat rare to find these days. So even for a JavaScript developer, if they know how the kernel works and if they know, you know, behind the scenes, you know, what's going on at a system level, I think that is, you know, can't be anything but a good thing. Thanks for the talk. 
I have a question about the audience, so is it mainly targeting students at the college level and then partnering with universities, or do you also consider a conversion program, like existing software developers that want to switch to more low-level kernel development? Yeah, so the course at UMass Lowell is, you know, a course at UMass Lowell for people enrolled at UMass Lowell, but the program on LFX, the whole idea of putting it on there is that we could accept people from anywhere in the world doing anything, students, non-students, you know, U.S., non-U.S., whatever. So any background, as long as they have some knowledge of C, good internet access, decent English and a computer, then we can take them. The only limitation is our ability to grade the assignments and handle that many students, and like the resources that we have to work on the program, because this is, you know, for the three engineers and, you know, potentially a couple more people who are interested and, you know, a couple of interns, right, this is all a side project for us. Like, I'm mainly a RHEL engineer. That's my main job. Like, this is just a side project, so we don't have a ton of resources and, yeah, we're basically limited by how much time and resources we have to run the program. But we're, yeah, we encourage people to apply from all sorts of different backgrounds. Okay, thank you. So back in the day when I used to participate in the development of the kernel, it required us to basically build the entire mainline. Your slide showed something with Raspberry Pis. Is it feasible to build the entire kernel using a Raspberry Pi? So, that originally comes from, we used to do a lot more stuff with Raspberry Pis before the chip shortage. It became more difficult to get them and then, you know, I think generally like a lot of the introductory stuff we were doing didn't require them as much, so we moved somewhat away from them. 
But, yes, we did work with Raspberry Pis and we were compiling the kernel on the Raspberry Pi. So I was able to do it on the three, it took a lot of weird tricks and it took like a day, you know, because I think it was like single threaded, and on the four, I could do it. It was a little bit faster, but if you didn't have cooling, the system would overheat unless you used fewer than four cores. But you can do it, and in fact, you can install Fedora, you can install CentOS, and you can install RHEL. Oh, yeah. Yeah, I know that. Yeah, all right, that was my question, thank you. Hey, you had said, with the alternate, the non-university route for doing it, you've got people who are working at the same time, possibly. I was just wondering, since you've got the same assignment schedule, what is the time commitment like for someone trying to get into it, but who might have full-time work? What is the time commitment like? Because it depends how quickly you complete the assignments. Okay, so, well, the lectures, the sessions that we have, which are, you know, on Google Meet, they're also recorded and we post them. There are about two of them per week, generally, and they're each an hour and 15 minutes, and it goes for, you know, like a standard 12-week semester. In terms of the assignments, I mean, if you're, you know, I think at the beginning, setting up the environment, sending email patches, if you haven't done that before, and compiling the kernel, and, you know, we have one assignment where you write a shell, you know, that can be, I mean, yeah, it depends how familiar you are with some of the concepts. But, you know, I think of it as, it's just, like, the same commitment as a, you know, maybe medium to, like, junior to senior level undergraduate computer science course, approximately. I mean, that's how we're designing it. Sure. Hiya, thanks for the talk. 
Any of these courses, like, I'm curious about back-porting and forward-porting stuff. So, when you're doing this course work, are you working on a specific version of the kernel? Like, yeah, this course is going to be on 6.2, because I find that a lot of problems that I run into are, okay, this driver was just introduced, let's say, in October, but I need it on 5.13, so I need to back-port it, and that might be trivial, right, might just be copying some files, but if the API has changed in some way, it can be really difficult to figure out how it changed, because the documentation, to my knowledge, is not great there. So, is that something that you actually sort of discuss in these courses? We haven't done back-porting specifically. We don't, I don't think we even, we don't have any assignments that specifically ask for back-ports, because we design them in a way that, you know, each student is doing, like, relatively unique work, so it's kind of, you know, and they're all posting it in a public mailing list, so, you know, we need to figure out how we can kind of either generate assignments or figure out assignments that are basically, by design, difficult to impossible to just cheat in. And so with back-porting, I mean, we could figure something out. We haven't really looked into it. We talk about, you know, git cherry-pick and rebase, and we talk about the background that you would need to do that, and I think we may have done a demonstration of it in the course. But that's a good idea. Do you anticipate that the recent admission of Rust as a programming language, at least for driver development in the Linux kernel, will cause renewed interest, especially among younger talent? And are you already anticipating that in the program? Potentially, yeah, we've discussed it, and we've heard some people talking about it. 
We had a few people who were in the course who were excited about Rust, and Rust in the kernel. So we have seen some interest in the kernel. Probably, yeah, it's probably increased interest from young people, I would say. But personally, I need to learn Rust. I don't know Rust. Do you mind if I do a little bit of, oh, sure, yeah, they're here, but people take us with them. Where you were asked, you know, to work on device drivers and then, you know, non-handling veterans and so on, actually, when we fly, so they were teaching us since, I think it's never been more, you know, for the people, maybe like, what are you, and you're a serious person. I mean, I'm probably fine. Sorry, let me get the mic away from you. Oh, okay. Yeah. Yeah. All right. |
AMENDMENT KubeOS: Container OS based on OpenEuler
A container operating system based on openEuler and a solution of cluster nodes upgrade |
So, hello everyone, first of all, this is not my talk. I've received this talk because my colleague didn't manage to get the visa on time. So I'm sorry, I don't know anything about Kubernetes. I'm usually more into low-level stuff, kernel, and embedded. But I will deliver the talk with the notes that I received, and if you have questions, you can direct them by email to my colleague. I wouldn't be able to answer. I'm sorry for this in advance. Okay. So before getting to the architecture and principles of KubeOS, let's define what it's all about. So there is cloud-native development that is encouraged by the Docker and Kubernetes communities. And much infrastructure is being cloudified. But some of the problems with general-purpose operating systems reappear in this cloud-native environment. So you have container management, workload scheduling, automatic service deployment, rollbacks of updates, and so on. Those are all capabilities that are provided by Kubernetes, but it is unable to control the cluster node operating system directly. So the first problem in cloud-native environments is the desynchronization between the OS and Kubernetes, which are managed and controlled completely separately. Also Kubernetes, like the operating system, needs its own management: upgrades, user access control, all these things. And then you can have like the ops, the operations guys, or operations people, sorry, that need to complete redundant tasks between the two systems. The maintenance is therefore usually poorly synchronized, and the upgrade or modification of the OS components can affect the availability of the OS, which requires additional monitoring from Kubernetes. So an example is that you have operations staff that must block the nodes to stop new workloads from arriving in order to upgrade the OS without interfering with Kubernetes. And after everything is clear and everything is updated, you can unblock the node again. So this makes it complicated and expensive. 
So another issue is the OS version management. So if you have a standard package manager and you can add, remove, modify packages independently on the OS, at the beginning you have an image which is clean, but then your different instances start to differ. So you have what they call OS version splitting. So you will have different packages installed on different nodes. The version of these packages can also differ, security updates and all that stuff. So you have this divergence that appears over time. So if you want to ensure some integrity and consistency for your OS nodes, this can harm this constraint. And yes, if you also want to update to a major version, it's also more difficult. So other people have worked on this problem. So rebuilding the operating system is an approach that has been taken to solve these problems. So previously you have many technology packages that are part of the OS that are moving to containers. So we rely less on the guest OS, so it can be replaced by a lightweight operating system with fewer services that are on and so on. So a container OS is a lightweight operating system designed to run containers. And so like on the figure on the right, it is the host OS and it's not the OS running inside the container. So you have three important aspects: minimalism, immutability and atomic updates. Minimalism means that you will only include what you really need as components in the host OS. So the container OS requires a Linux kernel, container engines like Docker or containerd, and security mechanisms such as SELinux to ensure the security. And other applications are running in containers because you don't need them in the host. And this can also reduce the attack surface because you have less in the host OS. Immutability is that you use a read-only file system that can be configured at the start of the deployment, and this also reduces the risk. 
And the atomic update is that you do the upgrade for the entire OS and not individually for packages. So CoreOS was started in 2013 and was the first widely used container operating system. You also have systems like AWS Bottlerocket, Flatcar, and Container-Optimized OS. So KubeOS, it's a container operating system built on openEuler, which is a distribution maintained by Huawei. So the main design concept of KubeOS is to use Kubernetes to manage the operating system. Once KubeOS has been installed on a cluster, the user only needs the kubectl command and a YAML file on the master node, and the OS of the cluster worker nodes can be managed. And the OS with KubeOS is connected to the cluster as a Kubernetes component, putting it in the same position as the other resources in the cluster. And containers and operating systems can be managed in a unified way through Kubernetes. So an openEuler-based reconstruction is used so that the operating system can be updated atomically, to avoid the problems I introduced before. So now we are going to go a little bit more in depth about KubeOS. So the first feature is the ability to manage the OS directly through Kubernetes. So we use the API extension mechanism, custom resource definitions, CRDs, to design the OS resource and register it in the cluster. We use the Kubernetes operator framework to create a customized controller for the OS to monitor and manage it. So the user only needs to modify this CR, enter the expected OS status to the cluster, and KubeOS and Kubernetes handle this, and you only have to manage it in the control plane. So the next one is atomicity management of the OS. A KubeOS upgrade is an atomic dual-zone upgrade. It does not include a package manager. The change of each software package corresponds to a change of the operating system version. Then the OS version corresponds to a specific OS image or RPM package combination. 
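The idea that any change to a single package is a new OS version, and that one OS version maps to exactly one package combination, can be pictured with a small sketch. This is only an illustration and not how KubeOS itself computes or names versions: the function names, the hashing scheme and the package versions below are all assumptions made up for the example.

```python
import hashlib


def os_version_id(packages: dict) -> str:
    """Derive one identifier from the complete package set of an image (hypothetical scheme)."""
    # Sort so that the same set of packages always gives the same identifier.
    canonical = "\n".join(f"{name}={version}" for name, version in sorted(packages.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def distinct_versions(nodes: dict) -> int:
    """Count how many different OS versions exist across the cluster nodes."""
    return len({os_version_id(pkgs) for pkgs in nodes.values()})


# Made-up package versions, for illustration only.
image_v1 = {"kernel": "5.10.0-60", "containerd": "1.5.6", "openssl": "1.1.1m"}
image_v2 = {**image_v1, "openssl": "1.1.1n"}  # one package changed -> a new OS version

# Nodes managed package by package drift apart ("version splitting") ...
drifted = {"node1": image_v1, "node2": image_v2, "node3": {**image_v1, "vim": "8.2"}}
# ... while nodes that only ever receive whole images stay identical.
atomic = {"node1": image_v2, "node2": image_v2, "node3": image_v2}
```

With these inputs, `distinct_versions(drifted)` counts three different node states, while `distinct_versions(atomic)` counts one, which is the consistency property the atomic upgrade is meant to give.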
Each software update, as shown in this diagram, is an OS version update. So you avoid the version splitting problems, and the cluster nodes remain consistent at all times. So KubeOS is lightweight, with unnecessary components removed to reduce the attack surface and enable faster start-up and upgrade. So this is a diagram of the KubeOS overall architecture. So you have two main parts. The first has three different components, OS operator, OS proxy, and OS agent, in the red box at the top of the diagram, which are used to connect to the Kubernetes cluster and carry out OS monitoring and management. And the second part is the KubeOS image creation tool. The user can use KubeOS scripts to generate KubeOS images from the openEuler repo source, which supports the generation of container images, virtual machine images, and so on. So the three main components I mentioned, OS operator, proxy, and agent, are critical to the ability to manage the cluster using Kubernetes. The OS operator and proxy are the operators we mentioned earlier. They will be deployed in the cluster as a Deployment and a DaemonSet, respectively, and will communicate with Kubernetes to issue upgrade instructions. The operator is a global OS manager that monitors all cluster nodes. When new OS version information is configured by the user, it determines whether to upgrade and sends an upgrade task to each node. The proxy is a single-node operating system manager that monitors the current node information. When the operator sends an upgrade notification, it will lock the node to expel the pods and forward the OS information to the agent. The agent is not included in the Kubernetes cluster. It is the real executor of the OS management: it communicates with the proxy via Unix domain sockets, receives messages from the proxy, and performs the upgrade, rollback and configuration operations. So, the upgrade process: we will use it as an example to explain the workflow. So we consider how the different components communicate and interact. 
First the user configures the OS information to be upgraded via kubectl and YAML files, such as the OS version, the address of the OS image, the number of nodes to be upgraded concurrently, and so on. Then when the OS instance changes, the operator begins the upgrade process, labels the nodes that must be upgraded, and limits the number of nodes to be upgraded each time to the number specified by the user. Then the proxy checks to see if the current node is marked as an upgrade node, locks the node to expel the pods, and retrieves the OS information from the cluster before sending it to the OS agent. After receiving the message, the agent will download the upgrade package from the address specified by the user, complete the upgrade, and restart. After restarting, the proxy will detect that the node OS version has reached the expected version and will unlock the node and remove the upgrade label of the node. So this is the complete upgrade process. Then finally the file system. So how do we design and upgrade the file system in KubeOS? It adopts a dual-area upgrade, like mentioned earlier, to upgrade the OS, so you have two root partitions. When running from partition A, the upgrade downloads the updated image to partition B, and then modifies the bootloader default to the B partition, and then you restart from B by default, and the opposite happens for the next upgrade. So it's a classical dual image thing. The file system of KubeOS is read-only, which improves the security, but we also support persistent data partitions. There is the union path, which is mounted as an overlay, so the files in the image other than the user's changes can still be seen. There is a writable path, which adds a writable layer over the image using bind mounts. The files in the image are not displayed, only user data is stored. And there is also the boot partition, which contains the bootloader files. 
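The upgrade flow and the dual root partitions described above can be sketched as a tiny simulation. This is not KubeOS code: the `Node` class, the function names and the version strings are hypothetical, the real components talk through the Kubernetes API and a Unix domain socket, and a real agent downloads an image and reboots the machine instead of updating a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Two root partitions; only one is booted at a time.
    slots: dict = field(default_factory=lambda: {"A": "1.0", "B": None})
    active: str = "A"          # partition the bootloader boots by default
    schedulable: bool = True   # False while the node is locked for an upgrade

    @property
    def version(self):
        return self.slots[self.active]


def agent_upgrade(node: Node, wanted: str) -> None:
    """Agent: write the new image to the inactive partition, flip the boot default, 'reboot'."""
    standby = "B" if node.active == "A" else "A"
    node.slots[standby] = wanted   # the running partition is never modified
    node.active = standby          # the next boot comes up on the new image


def proxy_step(node: Node, wanted: str) -> None:
    """Proxy: lock the node (pods are expelled), hand over to the agent, unlock when done."""
    node.schedulable = False
    agent_upgrade(node, wanted)
    if node.version == wanted:
        node.schedulable = True


def operator_round(nodes: list, wanted: str, max_concurrent: int) -> list:
    """Operator: pick at most max_concurrent outdated nodes and upgrade them."""
    pending = [n for n in nodes if n.version != wanted][:max_concurrent]
    for n in pending:
        proxy_step(n, wanted)
    return [n.name for n in pending]


cluster = [Node("node1"), Node("node2"), Node("node3")]
first = operator_round(cluster, "2.0", max_concurrent=2)
second = operator_round(cluster, "2.0", max_concurrent=2)
```

After the two rounds every node runs from partition B, and partition A still holds the previous image, which is what makes a rollback to the old version possible in a dual-image scheme.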
So we have determined the main concept and design of KubeOS, and implemented a set of components to complete the OS management, and we intend to continue completing more functions based on this process. One thing is the ability to provide configuration: like in the upgrade process, the configuration is delivered to the node via the Kubernetes cluster from the cluster control plane to ensure the consistency of the configurations of the nodes, and given that some of the configuration must be complete before the nodes join the cluster, more configuration capabilities in the KubeOS image creation are planned. Then there is the improved upgrade capability. We have realized the basic OS upgrade function, and we will provide upgrade strategies that users can customize, such as upgrading based on the cluster node label, to provide more upgrade solutions. In addition to richer functions, we intend to improve the usability of KubeOS by displaying the upgrade and configuration process and improving the image creation tool so that users can more easily customize the image. Okay, and that's it. Sorry again, but for questions, you can always shoot my colleague an email. |
AMENDMENT Parsing zone files really fast |
Hi everyone, I'm Jeroen. Well, I'm going to tell you something about parsing zone files really fast, and I work for NLnet Labs. Oh yeah, so some numbers. There are some caveats here, so the 50, I did not do proper measurements because I finished the slides like today, so I did the measurements on the train, but I think the 50 megabytes is actually slower. I'm pretty sure the 700 megabytes is correct, and we will go beyond, and I'm going to tell you how. So yeah, so basically the motivation is that currently the parser in NSD isn't very fast, and we have an operator where zone loading takes the better part of an hour, and at that point it stops being practical. So yeah, that's the motivation, and I actually like to take you on the journey that I went through. I will also show you the new algorithms, but I also want to tell you why parsing zone files, currently in NSD at least, is really slow, and to do that we have to tell you a bit about parsing. So I've included an example with a hello world C program, and the NSD parser is based on lex and yacc, and that's really useful if you want to parse things like a computer language, where each token has a meaning in and of itself. For zone files, however, that is definitely not the case. So also in this case I provide an example, but everyone in the room will probably make out that on the last line I try to define an A record with a corresponding IP address, and then what the zone parser actually does is it takes the A, it makes that the owner, and then throws a syntax error because, well, an IP address is not really a valid record type, obviously. So yeah, lex and yacc is really not a good choice, but then there's also the fact that I think zone files are only more or less standardized, they're not really standardized, and that's putting it mildly, and when you combine the two that just leads to a lot of trouble. And well, the first thing I did was analyze why the current parser is slow, and the current parser is slow, well, it's actually
inherent to the tool, because they're just not a good fit, because what the lexer does is it gives you each token and then passes it on to the parser, but it does so by matching a whole bunch of inputs and then taking the longest one, the longest match, and then executing the corresponding action, which in the case of the NSD zone file parser means that it copies the token, then tries to unescape it, and for names it needs the dots, right, because they have meaning in domain names, and what it does there is that it actually splits the input and passes each label separately, that is of course copied, and the parser then concatenates that all back together, and that is not really a fast process. So my first thought was, well, what if I just change the process a bit and scrap lex and yacc and cut all the memory allocations. That gave us better numbers, but they're not the numbers that I wanted to see, right, because a hundred-something megabytes is, I mean you can express it in gigabytes a second but then it becomes even less impressive, right, than a hundred megabytes a second. So yeah, I started looking into that, and I will show you the algorithms that I used and came up with in a minute, but to make you understand why these algorithms work it's important that each and every one of you knows that your CPU is a pipelined CPU, and all modern CPUs are pipelined CPUs, and what that means is that executing each instruction is not a single step, it's actually a multi-stage process. So there's a fetch and a decode step, and there's a lot more to it in practice, but this is a mechanism that was designed to optimize performance in the CPU, and the premise of getting fast code is that you keep the pipeline nice and full. 
That does not always happen. One case where that doesn't happen is when you get a pipeline stall, and that happens essentially when the next instruction that you want to execute depends on the result of the instruction that you're currently executing, and in that case the CPU has to stall the pipeline, it has to wait until the result is written back, it can then go on to decode and only then execute your instruction, and you'll take a hit of a couple of cycles. Then there's of course the well-known pipeline flushes, and those happen essentially if there's a conditional jump, for instance an if statement, and the CPU goes on to load the instructions that come after that, and if it turns out that it actually needs to execute other instructions then it needs to flush the pipeline, and only then can it go on to execute your code. And there's branch prediction that is used to improve the flow, and modern CPUs actually do a pretty good job of that, but well, it's prediction, so it's not always right. And it turns out, so if we look at that information then we can analyze why the process of parsing is just inherently slow, because if we go over it byte by byte, like the NSD zone parser for example, then before you can analyze the next byte you have to wait until you have resolved the new state for your current byte. And also it turns out that as far as the CPU is concerned the zone files are just random, right, anything can happen at any time, so it's hard to predict branches in that case. Right, so the base of the new algorithms that I'm using is a thing called SIMD, or single instruction multiple data, and my interest in all of this was really sparked by the simdjson project, and it caught my attention because they expressed their throughput in gigabytes a second. 
Now in the next slides I'm going to tell you something about the algorithms, but I'm not going to go into them in great depth because there's not a whole lot of time, so if you want to know more on that then I would advise you to watch the talk or just read the paper. SIMD, what that is, is actually an instruction set, and what it does is it adds vector registers and instructions to operate on those registers, and what that allows us to do is to classify blocks instead of just bytes. And there's some trickery involved, and there's a super simple example on the slide, but basically we can classify 16, 32 or 64 bytes in one go depending on your CPU, and then we repeat that multiple times for each input that we actually want to know about, and the idea is that we can cut branches and dependencies. So what's good to know about SIMD is that it's all vertical, not horizontal, so it's really an instruction that is executed for each of the inputs, so you can't actually do logic in SIMD, and the way to work around that is to convert the inputs to a mask. 
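The step of classifying a whole block and turning the result into a bitmask can be sketched as follows. This is a plain scalar emulation in Python, written only to show what the mask looks like; it is not the parser's code. On real hardware the loop would be a vector compare followed by a move-mask instruction, and the function name and the sample input here are made up for the example.

```python
def classify(block: bytes, targets: bytes) -> int:
    """Return a mask with bit i set when block[i] is one of the target characters.

    A block is at most 64 bytes, so the result fits in a 64-bit integer.
    """
    assert len(block) <= 64
    mask = 0
    for i, byte in enumerate(block):
        # Scalar stand-in for: compare all lanes at once, then collect one bit per lane.
        mask |= (byte in targets) << i
    return mask


# Hypothetical zone file fragment: owner, type, and a quoted string.
block = b'www A "a b"\n'

spaces = classify(block, b" \t")    # bits 3, 5, 8
quotes = classify(block, b'"')      # bits 6, 10
newlines = classify(block, b"\n")   # bit 11
```

Each character class of interest gets its own mask, and because every mask is just an integer, the later steps can combine them with ordinary AND, OR, XOR and shift operations instead of branching on every byte.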
So we would get a 64-bit mask for each of the inputs that we checked, and with those bitmasks in hand, the first thing that we are going to do is to classify all the escaped bits, because zone files allow for escaping, and this is actually an algorithm that the simdjson guys came up with, and basically what we do is that for each uneven number of backslashes we take the next character, and so that bit then represents the character that is actually escaped, and we need that information so that we can identify the quoted sections or, in the case of zone files, also the comment sections. And this was actually kind of a hard problem, because they don't have this problem in JSON documents, but in zone files comments can cancel out quoted sections, and quoted sections can contain semicolons, and then newlines delimit comments, but we only want the newlines that actually delimit the comment, because what we really want to do is find out which of the characters that we identify as structural characters are contained in quoted sequences or in comments, and there's a simple example at the bottom there. So, oh yeah, I did a number of experiments, but in the end it turned out that if there's a semicolon in the input we just branch, so we have a slow path, assuming that there are not too many comments in zone files, which for generated zone files I guess is okay. And once we have that information, all the bits that remain automatically belong to the non-quoted strings, and then, and this is oversimplifying it, but if we shift right and do an XOR then that would get us all the transitions, and with that information we can then go on to create indexes of those, because your CPU does not only provide SIMD instructions, it also provides bit manipulation instructions, really fast bit manipulation instructions. So the first thing that it does is it takes the population count to find out how many transitions are actually in your input block, and then we use the trailing zero count to
find out the relative position of the bit, and if we combine that with the index, that gives us the pointer to the exact input byte. There's some more trickery involved here, because for zone files, if there's an error, we want to report that error, and to do that we need a line count. Quoted sections may of course contain newlines, but we don't want to worry about those in the parser, because that would mean that each parse function would possibly need to update a line count, and that would just not be very convenient. So what we do there is take an unlikely branch if there's a newline in a quoted section, which really doesn't happen much, it's an edge case in zone files. We take the slow path to count all the newlines in the input, or at least the ones in the quoted sections, and then once we generate a token for the actual delimiting newline, we add the number of newlines that we found in quoted sections. That basically gives us a fast scanner. In my initial measurements, and I think it's a little bit faster now, that got me a scanning speed of two gigabytes a second for zone files, at least with an older .com zone, et cetera, so there are caveats there too. But it turns out that the rest of the DNS data, because of course we only tokenize now and we also have to parse it, allows for optimizations using SIMD as well. Of course we want to start with the data that occurs the most, and that is domain names. With the SIMD instructions we actually just repeat the scanning process: we quickly identify all the dots, turn that into a bit mask, and then use the bit manipulation instructions to go over the domain name, because most of the time, if we just fill in the length on the dot, that gives us proper wire format. Of course there's a slow path for edge cases there as well. Next is the record type, and normally I guess you would hash. I'd
initially just used binary search, which is faster than linear search of course, but that took away quite some performance. So we want a hash, but a hash table is pretty big, so I figured I want a perfect hash, and it turns out we can do that. There are not that many record types, and if you take the first character, there are never more than eight or nine record types that start with the same letter. If you then take the last character and add the length, it turns out that doesn't give me any collisions. Collisions could of course occur, but for all the record types of the last 40 years it doesn't give me collisions, so I guess we're good on that. Someone asked if this only works for record types with a number, and the answer is no, it works for all record types, because they're all really close together, right? They're alphabetic most of the time, sometimes there are numbers. So we just, I think, uppercase or downcase it, multiply to get a good distribution and then add the length. That gives me a unique key without collisions, and from there I can use a SIMD instruction to do a compare-equal, so I can do the exact string compare, and that gives me a really nice speedup. The people who worked on simdjson also did a lot of other projects, like using SIMD for decoding Base64, so the plan is to incorporate all those things as well. Then there's one tricky part. Your CPU actually supports multiple instruction sets, at least if you have a modern CPU. If you have an older one, like a Pentium 4, then you only get SSE4.2, but we want our software to be able to run on all those devices without recompiling. So we actually compile it four times in the case
of x86, then use the CPUID instruction to pick the right one. It's still a work in progress. I had hoped to be a little bit further along, but unfortunately not. It will be a standalone library, because it might actually be useful to other people, and that will make it easy to integrate into other projects. It was initially just intended for NSD. The numbers so far are pretty good, at least quite a bit better than what we have now. I think it's possible to go to one gigabyte a second. So if you want to check it out, there's a link in there. Finally, there's a slide with acknowledgements, because these people helped me a lot. I just sent them an unsolicited email at first, and then I happened to get answers back. They helped me, and they even took a look at my presentation and helped me there as well. So thanks to Geoff, Daniel and all the simdjson people. And with that, I actually finished in time. It's time for questions. Yes? Oh, no, but that's the slow path. Exactly, there the hash doesn't work, but that's the slow path. Sorry, I should repeat the question. What the person in the audience was referring to is what happens if we use the generic type notation, where the type is written as TYPE followed by a number. Obviously the hash doesn't work there, but it takes the slow path. Is the parser so complete that you can parse a zone, write it out and get the exact same output as you put in? No, it does not do that. Do you mean exact with respect to white space? No, it doesn't do that. And it also strips comments? Yes. How do you handle escaped decimals? Because in the example you gave, you look where the backslashes are and then take the next character, but if you have something like backslash 003, then you need those four
characters as a single unit to encode in the final output. Yeah, you're going to have to repeat the question. I hope there are no more questions, because you're going to have to keep doing that. So what happens if, for a certain type of input, I have a record with something like backslash 003, which encodes the byte with value 3, a single byte? How do you do that with your algorithm that just takes the next character when you have backslashes? So the question is what I do with escape characters, or with escape sequences. What I explained about backslashes is just for tokenizing. I don't strip any data, I just mark the starts and the ends of each string field, and the parsing comes after that. So there's no data actually stripped. The next question is what the output format is. The output format in this case is just DNS wire format. The idea here is that for each record it will invoke a callback, and it will just give you wire format with pointers to where all the fields are, like an internal description of the fields, so that you know the length and the type of each field. Yeah, there is definitely value in larger vectors, because it takes fewer instructions. I did not look into using the GPU, but that might be a benefit. No, I have not. So the question is why parsing zone files has to be fast. Well, because they get reloaded quite a lot: each time there's an update to a zone, you need to reload the zone. That happens multiple times an hour; it differs per zone, right? But in our case, if it takes the better part of an hour and the operator wants to reload more often, then that becomes a problem. So we just need to be faster. But then there's
other benefits as well. NSD, for instance, supports zone verification, where just before the zone goes live you can have a program verify that your DNSSEC data is correct. There you can use an AXFR, or you can let NSD feed you the zone data, in which case you just get the text representation, and if the zone is big enough, then you want that to be fast because it's in the critical path, right? The next question was whether splitting the files and multi-threading is something that we considered. Splitting the files, no. Splitting on newlines can be tricky, because zone files can contain parentheses, which means that the record continues on the next line. But a colleague actually did do a parallel zone loading implementation, and I guess we can even do that with this implementation. That was actually quite a bit faster, but the scanning process still takes a long time because you go over the input byte by byte. Now that we have a fast scanner, there's no reason why we cannot also include parallel parsing. Yeah, that could work. So the question is if we did it |
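As an illustration of the scanning stage described in this talk, here is a minimal Python sketch. It is not the zone parser's actual code: the function names, the tiny input and the plain loops that stand in for SIMD compare instructions are all assumptions made for the example. It shows the three ideas from the talk in order: build one mask bit per input byte, mark the byte that follows an odd-length run of backslashes as escaped, and turn the remaining structural bits into byte offsets with a population count and a trailing zero count.

```python
BLOCK = 64  # bytes per block, one mask bit per byte


def mask_of(block, chars):
    """Bit i is set when block[i] is in chars (stand-in for a SIMD compare)."""
    mask = 0
    for i, byte in enumerate(block[:BLOCK]):
        if byte in chars:
            mask |= 1 << i
    return mask


def escaped_bits(backslashes):
    """Bit i is set when byte i directly follows an odd-length run of backslashes."""
    escaped, i = 0, 0
    while i < BLOCK:
        run = 0
        while i < BLOCK and (backslashes >> i) & 1:
            run += 1
            i += 1
        if run % 2 == 1 and i < BLOCK:
            escaped |= 1 << i
        i += 1
    return escaped


def indexes(mask, base):
    """Turn set bits into absolute byte offsets."""
    out = []
    for _ in range(bin(mask).count("1")):                   # population count
        out.append(base + (mask & -mask).bit_length() - 1)  # trailing zero count
        mask &= mask - 1                                    # clear the lowest set bit
    return out


# Hypothetical input: the first quote is escaped, the last two delimit "z".
block = rb'x\"y "z"'
quotes = mask_of(block, b'"')
escaped = escaped_bits(mask_of(block, b"\\"))
quote_offsets = indexes(quotes & ~escaped, base=0)
```

A real implementation would compute the escape mask without a loop and would handle runs of backslashes that cross block boundaries; both are left out here to keep the sketch short.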
DNS for I2P: a Distributed Network without Central Authority
How Students Tried to Create a DNS for an Overlay Network without a Central Authority |
Thank you very much for having me. Do you hear me back there? Lovely. All right. I'm Conrad, and I have the opposite problem compared to Jeroen: Jeroen had performance issues, that was the talk before, where he tried to parse DNS zone files really fast. I have a million other problems. The I2P network is an overlay network which provides anonymity, and we have so few hosts that we don't even need to think about performance. We have, as I said, one million other problems. Now, the students of the University of Applied Science got the job from me to take a look at this DNS problem, because the I2P network has no DNS. So welcome to the Stone Age. |
Why resolving two names in a GUI program is hard
Summary of available name resolution APIs on Linux and why a new one is needed |
Hello everyone, this is Petr Menšík, and he will tell you why resolving two names in a GUI program is hard. If you've ever tried, you know. If you've never tried, listen. Okay. Good afternoon. My name is Petr Menšík. I work for Red Hat, so I took my hat with me, and this is a presentation about why resolving two names in a single program is not simple. So, how do you resolve names? The system offers the getaddrinfo call. It is protocol and address family independent, requires just a host name and a service name, returns a list of addresses that is somehow ordered, and works on the major operating systems. It is fine, but it blocks, and we don't want that. We can work around that by using asynchronous libraries, which are usually DNS only. That might be good enough, but not always, and typical applications should not be limited by that. So, good for servers, not for workstations, because name resolution can also be provided by providers other than DNS; some are obsolete, some are not. So, I think applications should try to use a common provider for any name. We have, for example, systemd-resolved, which provides different protocols, but it works in a way which breaks, for example, DNS-only applications. So, it's not a good solution, I think. So, how do I make multiple connections from a single program? I can use the socket interface. It works for TCP and UDP, is present on any system, and I can handle thousands of connections from a single program without problem. I just use poll or select and pick only the sockets which have something ready for me. So, why is blocking a problem? Because graphical applications don't use a blocking loop, just event-based loops, and they become non-responsive if any call they use blocks. We want to avoid that. Modern applications are implemented as just callbacks for the events they want to handle and nothing else. They spend most of their time waiting for something and conserve CPU. So, why not just use worker threads?
Because creating a new thread is simple, but getting the result from the finished worker thread back into the main thread is not so simple, and it increases the complexity of almost any application a lot. So, why would that thread be needed anyway? Name resolution on a Linux machine can come from multiple modules. Some are local only on the machine. Some need to ask a local or remote service: send some request and wait for some response, which may time out or not, then extract the fetched addresses and return them so connecting can start. And the most important part, waiting for a timeout or socket activity, is implemented by any framework for non-trivial applications anyway, because they need it. So, how can it be made non-blocking? I think we should make common code to implement protocol-specific plugins, and DNS should be only one of them, next to, for example, multicast DNS. It should provide integration with custom loops in different applications, because major graphical applications use Qt or GLib, but they may use some custom loops, and it should require a relatively small part of code to integrate with them quite nicely. So, we should rewrite existing modules to use callbacks like modern applications do and not just block, because the current modules are easy to write and maintain but difficult to use from normal applications. I think resolution should be simple even in non-trivial applications. So, what do we need? Just the ability to add and modify a socket in the watched list of events, to be notified after the time is up if no activity occurred, and to provide some code to handle those events. And we don't care too much about time precision, because we measure timeouts in seconds in DNS anyway. So, who cares? So, why non-blocking? Because it creates no race conditions. It's almost unlimited: it's limited by the number of sockets handled, which is usually quite high, so we don't care. And it allows many queries per thread without any problem.
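The three requirements listed here (add a socket to a watched list, get notified when the time is up without activity, and run some code for each event) can be sketched with Python's standard `selectors` module. This is only an illustration under assumptions, not the proposed API: `run_once`, the socket pair that stands in for a resolver connection and the fake reply are all made up for the example.

```python
import selectors
import socket


def run_once(sel, timeout):
    """Wait up to `timeout` seconds, run the callback of every ready socket,
    and return how many fired (0 means the timeout expired)."""
    events = sel.select(timeout)
    for key, _mask in events:
        key.data(key.fileobj)  # the callback was stored when registering
    return len(events)


replies = []
sel = selectors.DefaultSelector()

# A connected socket pair stands in for the link to a resolver service.
resolver_side, app_side = socket.socketpair()
app_side.setblocking(False)

# "Add a socket to the watched list" together with the code that handles it.
sel.register(app_side, selectors.EVENT_READ,
             lambda sock: replies.append(sock.recv(512)))

fired_idle = run_once(sel, 0.01)   # nothing to read yet: the timeout path
resolver_side.send(b"192.0.2.1")   # stand-in for a resolver reply
fired_ready = run_once(sel, 1.0)   # the callback runs in the same thread

sel.unregister(app_side)
sel.close()
resolver_side.close()
app_side.close()
```

Everything runs in one thread, so the callback can touch application state without locking, which is the "no race conditions" argument from the talk.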
And resolution would become easier to handle in a single thread only, not scattered over multiple threads during runtime. Of course, separate threads would still make sense for intensive work, but for small fetches of data from the network they are not necessary. I think server software should take advantage of this too. Unfortunately, there is no implementation yet. I think Pavel Šimerda wrote quite a good start called netresolve. It provides separate loadable modules with different providers, which can be used as a starting point. But its documentation is quite poor, and the non-blocking API that is needed is missing and waiting, I think, for me to write it. But I think we need a protocol-independent API for normal applications. And if we add just some metadata to the struct addrinfo provided by getaddrinfo today, I think we could also handle HTTPS resource records in the library and not require common applications to handle that and implement it each on their own. I guess there are many applications starting HTTPS connections, and it should not be re-implemented in every application doing that. Of course, some parts are similar; for example, multicast DNS could use the same calls, just in an asynchronous way. And that is almost all I had to say. So, are there questions? No questions? So, you want a solution for this? Is there a way for Red Hat to lead this in the freedesktop space, maybe? Well, maybe, yes. Who should lead the initiative? I am not sure. It's not an official Red Hat initiative yet. It's just my own opinion. So, it's not like Red Hat already has a project with people involved and such. It's still what I think should be done, but it's not yet decided who should lead it or who should cover it.
I definitely want to talk about it within Red Hat, but it's in fact not clear to me which mailing list or organization should start and organize this, because maybe it should be handled by the glibc people, or I don't know. I would like to talk and hear what others think about it, because I'm not sure myself. It occurs to me that this problem statement is a lot like the getdns problem statement plus things that aren't DNS. So I guess my recommendation would be: why not look and see if you can enhance that API to include these non-DNS naming systems? Which API? Well, the question was why not enhance existing solutions, like getdns for example. I'm not sure how you can do that, because I think what systemd is trying to do there is a good example of how to do that wrong. I think when an application wants to talk DNS only and nothing else, it should be able to. So if I use the getdns library and I think it should do only DNS, I should be able to choose. So how do I choose between different protocols, and how do I forward from the getdns library? Where do we go from there? Because I think getdns expects DNS record types or such things, and those are DNS specific. Those don't work in other protocols. Does that answer the question? Well, it's not so much a question as a comment on how getdns addresses this. getdns, for all its faults, is extremely flexible. You can enable and disable extensions. By default it does DNS only; if you say I want to have mDNS, then it starts doing that. So there is a way to extend it, and the same thing applies to record types. If you say I have a funky record type, it fits within the framework to have it. So, I think it addresses a lot of the technical problems. Yes, but look, it wasn't even a question. Please repeat the comment. I think the statement was that getdns is quite flexible and can adjust to those. Yes, why not?
I don't say we have to implement it from scratch, but it should be generic enough so it would be future proof. And I admit I don't know the details of the getdns library, so I can't comment on details. I'm just not sure it can do it, but why not? Even if it's another library, I think it should eventually land in libc or something like that, after it proves it works. So, maybe. Does anybody else have a question? No? You had a slide about callbacks near the beginning. Callbacks. But it doesn't matter. Do you expect every plug-in to handle things like TLS, or would TLS be something separate? TLS is kind of special machinery, but it should be somewhere inside. What the user receives should be a ready-to-use socket to work on. So he just puts in the host name and service name, and it does the heavy machinery inside. Well, a TLS socket is something on top of that. It's above a normal connection, so I think it should be extended. I'm not sure what it should be. It's above that. Yes, I am out of time. Yes, yes. That's perfect. Thank you. |
Connectbyname and the Proxy Control option |
So actually, I'm going to talk about three subjects: connectbyname, the proxy control option, and also a little bit of Rust thrown in at the end as a bonus. My name is Philip Homburg, and for a bit more than a year now I have worked for NLnet Labs. The question that has probably been posed by many people is: can you have just a function where a host name and a service go in and you get a socket back? That is sort of the starting point for this project. Because we got some funding, we officially defined that; Mikael Abrahamsson in the IETF once suggested that something like that should be done. Of course, we want to have options, so we can have a slightly more modern version where you have a context as the first argument and it returns an error code instead of overloading that onto the socket, but that is the general idea. Of course, this is completely bad because it is blocking. This is not what we want now. Unfortunately, because at NLnet Labs we basically only do DNS when it comes to name resolution, this talk ignores every other possible thing. We don't even do mDNS, and we definitely don't do anything fancy, but it should not be precluded. I mean, if people want to add it, why not? So to make it non-blocking, the obvious way to extend it is to take an event framework like libevent. In libevent speak, you create an event base, you do a bit of initialization where you pass the event base to the asynchronous library function, you start it, and it returns to say, well, okay, I'm busy. Then at some point it does a couple of callbacks, like this callback function that you pass in. The main loop is called event_base_dispatch, and as long as your entire application is written around it, the application just calls this one, and then you can call connectbyname as many times as you like.
So if you want to make this more complex in practice and do real engineering, getdns, for example, has support for, I think, three event frameworks, and you can define your own event framework and stuff like that. I'll ignore this; the only thing you're going to get here is libevent. But there are a couple of practical things that we would like to add, so now we get another full slide. So far I said you get a socket back. Implicitly, a socket back means TCP because, well, UDP is way too complex. But then in practice, who does plain TCP anymore? I mean, if you have a TCP socket, then you immediately call your SSL library and you want a TLS connection. At least I hope that people are not writing new code that ships unencrypted data over the internet. Now with libevent you're lucky, because it has a concept called bufferevent, that's why the callback there gets a bufferevent, and libevent can transparently do SSL. So you just write to the bufferevent, and if libevent knows that it is TLS, then it sends the data to OpenSSL, and if it's just a normal TCP connection, then it sends it to the socket. So that solves that problem, and it allows the library to also do a couple of other interesting things, as we will see on the other slide. But because we are an organization that is focused on DNS, we focused on all of the complex stuff that you can do with DNS. One thing that the library does, I forgot to mention, concerns multiple addresses. If you get multiple addresses back, then the traditional way is to write a for loop: you connect to the first address and then to the second address, and there's a timeout of, I don't know, many minutes on the TCP connection, so if the first address doesn't work, then it takes forever.
So your library needs to do Happy Eyeballs, such that you start a connect, wait not that long and then start the next connect. That also means that the timer system should not be in the order of seconds, it should definitely be in the order of milliseconds, because it should stay within human response times and not be like, okay, the network is down, we wait seconds. So that is stuff that this library can hide and that the prototype also does. But to get to the DNS part: if you have a modern web browser, then the web browser has an option to configure DNS, and that's highly controversial because it goes over HTTPS. But it's something where applications have now said, okay, we are done with /etc/resolv.conf; we, from an application point of view, want to be able to decide which upstream resolver we use. So we added configuration options where you can say, well, I want to have an upstream resolver that has authenticated encryption. I don't really like QUIC, so I list the allowed transport protocols: plain old DNS over port 53, which will always fail because it cannot do any encryption, but we do allow DNS over TLS, we do allow DNS over HTTP/2, but none of the fancy QUIC things. We have a name for authentication, and of course we can go completely overboard and also do SvcParams. So that extends the call a bit, because now the context has a way for you to say, well, this is my DNS policy, and then it goes out and does it. I mean, the basic interface is still more or less the same.
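The Happy Eyeballs behaviour mentioned above (start one connect, wait a short time measured in milliseconds, then start the next one without giving up on the first) can be sketched with Python's `asyncio`. This is a simulation under assumptions, not the prototype's code: `attempt` fakes a connection with a sleep, and the addresses, delays and the 50 ms stagger are made up for the example.

```python
import asyncio


async def attempt(addr, delay, ok):
    """Simulated connect: takes `delay` seconds, then succeeds or fails."""
    await asyncio.sleep(delay)
    if not ok:
        raise OSError(f"{addr}: connect failed")
    return addr


async def happy_eyeballs(candidates, stagger=0.05):
    """Start attempts `stagger` seconds apart and return the first success."""
    queue = list(candidates)
    pending = set()
    while queue or pending:
        if queue:
            pending.add(asyncio.ensure_future(attempt(*queue.pop(0))))
        # Only wait a short time while there are more addresses to try.
        timeout = stagger if queue else None
        done, pending = await asyncio.wait(
            pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task.exception() is None:
                for other in pending:
                    other.cancel()  # the losers are no longer needed
                return task.result()
    raise OSError("all connection attempts failed")


# Hypothetical case: the first address is very slow, the second answers quickly.
winner = asyncio.run(happy_eyeballs([
    ("2001:db8::1", 1.0, True),
    ("192.0.2.1", 0.01, True),
]))
```

With a sequential for loop the caller would wait the full second for the first address; here the second attempt starts after 50 ms and wins.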
So we worked on connectbyname; we built a prototype on a grant from the NLnet Foundation. We support asynchronous resolution, and of course asynchronous also means that your A and AAAA queries should go out in parallel, and Happy Eyeballs. Then of course the DNS community invented DANE, so if you do TLS, then you also have to do the DANE query immediately. And I forgot to list it here, but we also do SVCB, and if you have the patience to configure experimental OpenSSL libraries, you can also pass the Encrypted Client Hello configuration from SVCB into OpenSSL and stuff like that. The nice thing is you can hide it all in a single library. So what I would like from the community is, one, to hear what doesn't work and what extra stuff we need. But we also have a problem with how we go further with this. I mean, we built a prototype, but we cannot really make it into a product ourselves, for various reasons, so take a look at it if you are interested and let us know if you want to do something. The current problem for me is that it's built on top of getdns. getdns is an extremely nice library, but it tries to do everything, so it's also a very heavyweight library. If it's a library that you potentially want to link with all applications, should it be that heavyweight? So that's how we got to the next subject. This is sort of what the IETF has now created as what stub resolvers should do, and I left out some cases and other things, because ADD is busy and there are probably quite a few other working groups. So the stub resolver, which was a very simple thing that sends a query over port 53, has to do more and more and more stuff. So many applications have stub resolvers; how many libraries will implement all of those transports, especially if it's also implemented in different languages?
It used to be that a stub resolver had basically no state, but if you do DoT, DoH or DoQ, then you have connection setup. You generate load on the recursive resolver, because if you're constantly setting up, say, DoT or DoH connections, then that is a way higher load than if it's just a simple UDP query. And it's definitely bad for short-lived applications like ping, which have a way higher overhead setting up a connection to the local recursive resolver than the actual work that the application is doing. So the simple way to solve that, we thought, is to introduce a local proxy. That's not really something new, because lots of people run Unbound as a local DNS proxy. As part of the getdns project we also created Stubby, which focuses more on doing DNS over TLS. There are things like dnsdist, dnsmasq and systemd-resolved. So it looks like, okay, we don't have to worry about that, we can just talk to a local proxy. But then, if we go back to the example config I had for connectbyname, for the Firefox that wants to talk DoH: how do you tell your local proxy that you actually want to have an authenticated connection? What if your proxy is just sending it, I don't know, to one of the public resolvers over port 53? Maybe that's not what your application wants. And then this whole local proxy idea falls down, and you get, say, a browser again implementing its own stub resolver because it doesn't have any control. So we thought about it for a while and created a draft in the IETF with a new EDNS0 option. Basically, when you send the request from your stub resolver, you can encode all of the stuff that you want to have as a policy in such an option. So you can be very basic and set a flag like, well, only give me an authenticated connection, and if you can't do it, just report that it doesn't work. Or you could say, well, this is the recursive resolver that I want you to use, please use that. And then applications can trust the local proxy because they can control it.
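To make the idea of carrying a policy in an EDNS0 option concrete, here is a minimal Python sketch of the generic EDNS0 option framing: a 16-bit option code, a 16-bit length and then the option data. The framing is the standard one; everything about the payload is an assumption for this example. The option code comes from the range reserved for local and experimental use and is not the code point of the draft, and the single flag byte is a hypothetical layout, not the layout of the proxy control option.

```python
import struct

# Hypothetical values for this sketch only, not taken from the draft.
HYPOTHETICAL_OPTION_CODE = 65001    # local/experimental range
FLAG_REQUIRE_AUTHENTICATED = 0x01   # made-up flag bit


def encode_option(code, data):
    """Frame one EDNS0 option: 16-bit code, 16-bit length, then the data."""
    return struct.pack("!HH", code, len(data)) + data


def decode_options(rdata):
    """Split the RDATA of an OPT record into (code, data) pairs."""
    options, offset = [], 0
    while offset + 4 <= len(rdata):
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        options.append((code, rdata[offset:offset + length]))
        offset += length
    return options


# A stub resolver asking its proxy for an authenticated upstream only.
wire = encode_option(HYPOTHETICAL_OPTION_CODE,
                     bytes([FLAG_REQUIRE_AUTHENTICATED]))
```

A proxy that does not know the option code would simply skip it, which is why the talk later describes a separate way to detect whether the proxy understands the option.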
And it provides a nice way to basically reduce the stub resolver footprint a bit by moving all of the difficult transports to the proxy. We have a proof of concept for that, though I have to warn you that we revised the layout of the option in the draft that is listed here, and the proof of concept implements an older draft. But if you want to play with the general concept, then that is there. So we decided that, well, we can continue writing code in C, and of course for our existing products like Unbound and NSD we will just maintain them in C because they are written in C. But we would like to try to move to Rust for new code. I just copied a little bit of stuff from a prototype. The first thing uses Rust in creative ways, and that is something where it's now a prototype and we definitely need feedback from users of the library. Like, okay, it's very great that you can have a message builder that takes a static compressor type and it has a stream target, but you probably don't want to write code like that. So at the moment it's built to be flexible and use the language, but it should be modified somewhat to be more usable. Then here in the middle, you basically get the main thing. Because the whole thing is generic, if you want to send a query, you have to go to the question section and say, well, I want to push a question there, and then there is again a bit of a usability problem where you say, okay, I need this back as a builder and I need a clone of it. So this is the part that I experimented with. If you want to have a TCP upstream, then you say: create the TCP connection. And the nice thing with Rust is that it can do all of the asynchronous stuff with a nice syntax.
So basically you say, do this connect here and wait until the connect is done. Because this function is implicitly asynchronous, as a programmer you can just write this as if it's sequential code, but the caller can call this as an asynchronous function and you don't have to do anything extra. Here I have to do a bit more work to really figure out how it fits into the Rust ecosystem. The thing is, if you have a TCP connection upstream to a DNS resolver, and I wanted to have this as just the basis for maybe DoH or whatever, you want to set up the connection once but then potentially send many queries over it. So I need to have a separate thing that actually talks TCP, as a worker thread. But because it's all asynchronous, this is basically getting an asynchronous worker, and then I also say, well, give me an asynchronous query. And in Rust you can say, okay, you have two asynchronous things that you want to do at the same time, well, just do them both at the same time. Normally we expect to end up here, where we got a reply, and then we print the reply and we are done. So this is sort of the direction we want to go in, which is also why we have a bit of a problem developing the connectbyname prototype that we now have, because it is like, okay, we don't really want to have a new prototype in C, so what do we want to do with it? So that's what I wanted to tell you today. There is, I think, plenty of time for questions. I love the idea of having a function which can deal with not just name resolution but DNS name resolution and also the cryptography. But as a distribution maintainer, I have to say that a library function which makes applications behave differently from all other applications is really a non-starter. So I think that you need to consider some way to support NSS and the NSS plugins, through libc or however it works best.
You mentioned that probably a daemon is needed to get good performance, so maybe the DNS part is the less important one, which you can delegate to some other component. I'll try to summarize: you say there's something with distributing this, and there is something with, if you run a local proxy, then you don't have to focus as much on DNS, if I got that correct. There are already some projects in this space that you mentioned, and they are expected to work with the normal libc NSS plugins. For your library to be universally used, and I think that should be your goal, you need to support the normal name resolution which is expected by current applications, so it has to support the libc plugins. So you say the library will only be adopted if it supports the libc plugins. Yes, I agree. I mean, that's why we made the prototype, because we were looking into what the interface to the library should be, how the library should behave, stuff like that, sort of the high-level stuff, fully expecting that any production-quality implementation of the library has to take a lot of this stuff into account. And certainly dealing with nsswitch.conf is, I guess, mandatory for any production-quality library. For the proxy control option, because there are lots of daemons in that space, of course it's best if those adopt the option once it is actually standardized by the IETF. I mean, it's not that we want to write another proxy. It's just that we have a very specific problem that we want to solve, if we want to make stub resolvers small and still give them access to all of the encrypted transports. But yeah, if for example systemd-resolved would also do the proxy control option, then it would be perfectly fine. I mean, there's no reason to write a new one for the proxy control option.
Is it only the stub resolver that will tell the proxy server that it wants those policies applied, or does the proxy also communicate back to the stub resolver that it is actually applying those policies? Because there is the initial situation where nothing supports it, which you always have. So the question is what happens if you send a proxy control option to an older stub resolver that may not be aware of it. I didn't want to go over the entire draft, but we thought about that. Basically there are some priming queries. I forgot the exact name. Is it resolver.arpa that is proposed? Something like that. So you try to look up resolver.arpa and see if you get the right response. If you don't, then the only thing you leaked is that you were trying to look up resolver.arpa. We assume that that is safe, and if you do get it, then you know that the proxy understands it. Yeah. Any more questions? Okay, yeah. This is actually a comment on both this presentation and the previous one. You're tackling three moving targets at the same time. You're trying to figure out how to integrate with the event loop. You're trying to figure out what your API to the application looks like, and you need to figure out what your integration with NSS or the system looks like. The complexity is multiplicative, so you're cubing this. This is a horrible idea. You can at least remove the event loop integration as a moving target. There is an existing project called libverto which tried to solve just that one problem, by providing for libraries an API to integrate with an arbitrary event loop provided by the application. I think you need to reduce the number of moving targets, and maybe the event loop is the one to kick out first: try to consider separately how to solve that and then continue from there. So the comment was basically that it tries to deal with too much stuff at the same time: event loops, figuring out an API, and then also figuring out how to deal with nsswitch.
There's an existing library called libverto that makes it easier to be flexible with respect to event loops. That's definitely a good point. I'll try to look at it, but I specifically decided to only focus on libevent, just to get something, a prototype, up and running, and not try to support arbitrary things like that. More questions? We have some more time. Okay, it seems that we have run out of questions. |
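The pattern described in the talk (set up one upstream connection, hand it to an asynchronous worker, and run that worker concurrently with an asynchronous query) can be sketched as follows. This is a minimal illustration in Python's asyncio rather than the speaker's Rust code, which is not part of the transcript. The class `UpstreamConnection`, its methods, and the fake reply string are all hypothetical, and an in-memory queue stands in for the real TCP connection.

```python
import asyncio


class UpstreamConnection:
    """Hypothetical stand-in for one long-lived connection to a resolver."""

    def __init__(self):
        # Queries waiting to be sent over the (simulated) connection.
        self._requests = asyncio.Queue()

    async def worker(self):
        # Serve queued queries one by one until a None sentinel arrives.
        while True:
            item = await self._requests.get()
            if item is None:
                break
            name, reply = item
            await asyncio.sleep(0.01)  # pretend to wait for the upstream
            reply.set_result(f"reply for {name}")

    async def query(self, name):
        # Written as if sequential: enqueue the query, then wait for its reply.
        reply = asyncio.get_running_loop().create_future()
        await self._requests.put((name, reply))
        return await reply

    async def close(self):
        await self._requests.put(None)


async def main():
    conn = UpstreamConnection()

    async def one_query():
        answer = await conn.query("example.org")
        await conn.close()  # no more queries: let the worker finish
        return answer

    # Two asynchronous things at the same time: the worker and the query.
    _, answer = await asyncio.gather(conn.worker(), one_query())
    return answer


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Many queries could share the same `UpstreamConnection` before `close()` is called, which is the point of separating the worker from the query.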
iothnamed
a DNS server/forwarder/cache for the Internet of Threads |
Tell me when, tell me when. Five more seconds, okay. This seems silly, but it makes the video cutting process so much easier if we stick to a schedule. Good afternoon everybody, thank you for having me here. Let me start with a short introduction to what Internet of Threads means. What is an end node of the internet? That is, which are the communicating nodes of the internet? The legacy approach, what there was at the beginning, was the internet of hosts. Internet addresses were given to controllers, to each network controller. So actually the communication takes place between the controllers of hosts. This concept has been made wider, and the network endpoints are now virtual controllers of virtual machines or even of namespaces. Internet of Threads is one step further: it means that we want even just processes, or even threads within processes, as nodes of the internet. So the idea is to give IP addresses (actually IPv6 addresses, as we wouldn't have as many IPv4 addresses as we need) to threads or processes. The idea can be depicted like this. A long time ago there were fixed lines, and the telephone number was really connected to a place, a room. It was common to call a number and say, is Jack at home? Nowadays we use portable phones, the numbers are connected to people, and it's very easy. So, on the internet, what do we look for? We don't look for controllers, we look for services. So the most natural way is to have IP addresses connected to the process providing that service, not to controllers or machines, virtual or real. A network stack is just a layer between the API to the application layer and the API to the data link layer. Actually these are two layers, layers three and four of the OSI stack. But anyway, it's a slice in the middle. This implementation currently sits, most of the time, deep inside the kernel of the computer you're using. But it can be seen as a library, and this library can be linked to a user process.
And in this way the user process can talk directly with the network. We made one further step with this implementation, the libioth library for the Internet of Threads. It's not a library that implements a stack; it's a framework library that allows loading actual implementations of network stacks as plugins, providing a unified API to the applications. In this way it's possible to run applications that can use either the kernel stack or any implementation of the network stack as a library. Actually, the current implementation permits changing the stack implementation just by changing a string. The stacks currently supported by libioth are: the kernel stack; vdestack, which is actually a trick, a namespace using a TAP interface inside the namespace, so we are borrowing the kernel stack and using it at user level; then a real implementation of a user-level TCP/IP stack, picoTCP, for which the module is named picox; and, as work in progress, I'm working to port lightweight IP (lwIP) to it. Okay, what do we need in the API? The way to communicate is well known, and it is the standard way we use the stack: open and close a communication endpoint, send, receive. For all of these there are Berkeley sockets. But there is something that is not common. If you use the kernel stack, you take for granted the definition of the stack and its configuration parameters, like which is the IP address, which are the routing definitions, and so on. So we needed to add support for this to the API. The definition of a stack needed some specific syntax. When using the API, a new stack is created, and there is a pointer to a specific structure that can be used for communication. So the only difference for the socket API is that instead of using socket, there is a new call named msocket, which has one further parameter: the actual stack implementation to use. Okay, and then there are all the other API calls well known from Berkeley sockets.
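For readers who want to see the standard open, send, close sequence on the kernel stack, here is a minimal Python sketch that sends one datagram. The slides are not part of the transcript, so this is an illustration only; the function name `send_ciao` and the loopback demo are assumptions. With libioth, as described above, the `socket` call would be replaced by `msocket` with a stack handle as one further parameter; that variant is not shown because no Python binding is assumed here.

```python
import socket


def send_ciao(host, port):
    """Send one UDP datagram containing b"ciao\\n" using the kernel stack."""
    # Open a communication endpoint; the kernel stack is used implicitly.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # sendto returns the number of bytes handed to the stack.
        return s.sendto(b"ciao\n", (host, port))
    # The endpoint is closed when the with-block exits.


if __name__ == "__main__":
    # Loopback demo: a receiver bound to an ephemeral port on 127.0.0.1.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2)
        send_ciao("127.0.0.1", receiver.getsockname()[1])
        print(receiver.recv(16))
```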
But okay, we needed to create and delete the stack, so to do that we use those calls we have seen two slides ago, and the pointer can be used for communication. And what about configuration? Okay, for configuration, there is an RFC, used for example by the Linux kernel, that uses the address family netlink to provide messages for configuring the network. So there is no need for further API entries for configuration. We just need our stack to support AF_NETLINK configuration. Another point about these sockets: the socket call returns an integer, and if it's a library-level implementation of the stack, the problem is that this integer could be an internal number of the library. Instead, we need that integer to be a real file descriptor, because we need the file descriptor to be usable, for example, with poll or select. So we had to write a new kernel module and a library named vpoll, which creates a file descriptor on which events can be synthesized. And so a stack implemented at user level as a library can provide real file descriptors, and these file descriptors can be used in select, poll and so on. So we can use real file descriptors, file descriptors coming from different implementations of the network, and write an event-driven program all together. Just a quick look to give you a feeling of what this means. This is a program just sending ciao using a datagram. This is the legacy way. This is the same example using the Internet of Threads. Here is the implementation. Just by changing this string, I can use any implementation I want, provided I have support for that. Okay, now the core of the presentation. We needed an ecosystem around this idea. We needed a lot of stuff that we currently have in our tool set, but we need it implemented for the Internet of Threads. We needed calls to configure the network. These are not calls from the Internet of Threads library itself. These calls generate netlink messages.
So they can be used even for configuring the kernel stack from a program. Let us pass quickly through this. We need a library to query the DNS. Why? Because the C library implementation uses the kernel stack, and it uses the definitions in /etc/resolv.conf, and the string /etc/resolv.conf is hard-coded in the C library code. It's not possible to change even the file to be used. And we designed a name server, proxy, forwarder and cache specifically for the Internet of Threads. What are the characteristics of this DNS server? It uses the Internet of Threads, so it can use, for serving queries and for forwarding queries, different stacks defined by the user. But at the same time, it can provide further services, further features, specifically useful for the Internet of Threads. Let us move on. These are the configuration items. But I prefer to show you some scenarios in which iothnamed can be used. This is a common scenario. It's a common proxy scenario, like dnsmasq or similar, but implemented using the Internet of Threads. So the idea is that if a friendly node asks the proxy for an address, we can cache the result and provide it back to the querier, or we can add some specific local addresses. It doesn't provide a relay for external queries. The point is that we have a stack for the queries and a stack to forward the queries outside, in this configuration. I provide the service to all the processes connected to this stack, and I forward the queries on another, different stack with a different implementation. Okay. These are the tests, just to see that it is able to resolve foreign and local names. Or it can be used for a delegated subdomain. Okay. So we assume that there is an NS record defining this server as the one responsible for the subdomain, dummy, v2, and so on. So it provides back the resolution. And here the new point is that we can use different stacks. But I have kept some time to show you some more ideas.
Actually, managing DNS servers for IPv6 is a daunting process, very error-prone, because if you have to write all those huge, long numbers, it is very hard not to insert errors in the configuration. The idea is to create the host part of the IPv6 address using hash-based resolution, that is, using the result of a hash function computed on the fully qualified domain name. So given the fully qualified domain name, we can have the host part of the IPv6 address for free. So this is the proxy as in the previous example, but I can ask the server to resolve all the addresses like something.hash.local. I can use any string before the hash.local, and I get a name resolution. That means that if I add a new node, which can be a computer or even a process, I just have to baptize it, to give it a name, and it will be connected to the net without having to write a single line in the DNS server. The slides are on the website, so if you want to go through them and download the prototype from GitHub, you can test this. The same thing can be done the other way round, so having a delegated domain that also resolves addresses using hashes. So we can have a number of local machines that can be seen from the internet just by giving them a name. But there is one more result, which is one-time IP. One-time IP is a security feature like a one-time password. One-time password means that you have a password that lasts for a short period of time, so if somebody is able to eavesdrop on the password, it becomes useless in a few moments. This is the idea of one-time IP: the host part of the address is defined by a hash that changes over time. It's a hash not only of the name, of the fully qualified domain name, but it also includes a password and the time. So if a legitimate user wants to connect to the server, they know which is the current IP address. Any other user, even if able to trace the network traffic, gets some address that will be invalid in a few seconds.
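The two ideas above, hash-based addresses and one-time IP, can be sketched in a few lines of Python. This is an illustration only and not the algorithm iothnamed actually uses: the choice of SHA-256, the way name, password and time are combined, the 32-second lifetime, and the function names `hash_address` and `otip_address` are all assumptions made for the example.

```python
import hashlib
import ipaddress
import time


def hash_address(prefix, fqdn, secret=b"", period=None):
    """Derive an IPv6 address in a /64 prefix from a hash of the FQDN.

    Illustrative only: SHA-256 and the input layout are assumptions.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")
    # Normalize the name so that case and a trailing dot do not matter.
    material = fqdn.rstrip(".").lower().encode("ascii")
    if period is not None:
        # One-time IP: mix in a shared password and the current time slot.
        material += b"\0" + secret + b"\0" + str(period).encode("ascii")
    # The first 8 bytes of the digest become the 64-bit host part.
    host = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    return ipaddress.IPv6Address(int(net.network_address) | host)


def otip_address(prefix, fqdn, password, now=None, lifetime=32):
    """Address valid only during the current `lifetime`-second time slot."""
    now = time.time() if now is None else now
    return hash_address(prefix, fqdn, password.encode("utf-8"),
                        int(now // lifetime))


if __name__ == "__main__":
    print(hash_address("2001:db8::/64", "node1.hash.local"))
    print(otip_address("2001:db8::/64", "server.otip.local", "secret"))
```

A server and a legitimate client that share the password and have roughly synchronized clocks compute the same address for the current slot; an observer who only sees the traffic learns an address that stops being valid when the slot ends.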
Okay, I think I have no time to show the others. Minus means time for questions, doesn't it? Yes, four minutes left including questions. Just one point and then I go to the questions. namedhcp is another tool which works as a DHCP server, but this DHCP server queries the DNS to provide the address to the computer, or the process, or whatever, so that you can just tell your computer its name, and the resolution is provided, and all the networking configuration is provided by getting replies from the DNS server. Okay, I have to stop. I have a lot more examples to give you. Questions? You can just repeat it. What is the overhead of using a network stack in every thread, especially if you're using the one-time IPs for each connection? What is the overhead that you get within the application if you have a lot of connections coming in? For sure, on single connections we have some performance drop, but the point is that all the applications on your system together can still use the entire bandwidth, so you have the experience that you spread the bandwidth democratically among your processes. Questions? Sorry, I can't hear you. Is latency a factor like bandwidth? Actually, the point is that behind the user-level stack there is a virtual network, which is Virtual Distributed Ethernet, using a support like VXVDE, which is like VXLAN without VTEPs, so each process is part of the distributed switch. Latencies are quite good because you have a direct UDP connection, a unique end-to-end connection from a process on one machine to a process on the other machine, so there is a direct one-to-one UDP link. My idea is to have a new concept of computing elements, so instead of having your computer with its processes, you have locally a network of a cluster of computers, with all the processes being part of different networks.
So you have the real network, which is just the basic framework, and you abstract this into a number of virtual networks that connect the running processes. That's all the time we have. Thank you, Renzo. |
Implementation of the Drink server: programming details |
Hello everybody, this is Stéphane, Stéphane Bortzmeyer of AFNIC, who will talk about Drink, which is, I guess, an experimental DNS server? No, not at all. It's a tramway station in Belgium, actually. I hope that you will explain. Well, yes, it's a DNS server, and you can see here an example of it working at FOSDEM. I ask for the TXT record of 2plus2.diamondname and, that's extraordinary, I get four as an answer, which is really, really useful. It was not possible before. But, exactly, this is authentic. So you can be sure it's really, really four, because it's signed with DNSSEC. So now we are going to see how it is done. So Drink is a dynamic, authoritative name server with several services. The main one, which was the original goal, is to return the IP address of the client. You have a lot of services on the Internet doing this, but all of them are very minimal. They don't implement all of the funny things of the DNS. We also have other services, for instance ECS, EDNS Client Subnet, echo, which can be useful if you want to know what your resolver is sending about you, and you have other services such as a calculator. Well, the goals in developing Drink were first to learn, to have fun also, and also to implement a lot of DNS stuff that is missing from the typical dynamic DNS services, such as TCP, NSID, cookies, DNSSEC of course, extended DNS errors, et cetera. So that was the idea, and it also provides a platform to test ideas at IETF hackathons. IETF hackathons are great because you can have t-shirts, and it's an opportunity to test new ideas, new stuff, and modifying existing software like NSD or BIND or Knot is not always easy, so I wanted something which was easier, at least for me. So as you see, it does not pretend to be a competitor to things like PowerDNS, NSD, Knot, et cetera. It's experimental. So the implementation is done in Elixir.
For the people who don't know Elixir, it's mostly a functional programming language which compiles to Erlang bytecode, which is then executed by the Erlang virtual machine. The good thing about Erlang is massive parallelism; the virtual machine is really, really good for that. The syntax of Erlang is seen by many people as a bit odd. So Elixir was mostly done, at least at the beginning, to have a better syntax for the same bytecode on the same virtual machine. Also, it's always fun to learn a new language. I didn't do everything myself. I had to rely on several existing libraries, and it's one of the pleasures of free software: you have a lot of libraries with free software licenses compatible with the one you use, hopefully. The problem is that Elixir is not mainstream, so unlike languages like Go or Python, which have very, very good, mature, maintained, debugged DNS libraries, Elixir, well... There are some DNS libraries, typically with the last commit three or four years ago, sometimes older, and not always maintained, and things like that. So it's a typical problem when you program in Elixir. When you go to Hex, which is the main repository of libraries, you always find something, whatever you are looking for, but pay attention: is it still maintained, debugged, et cetera? You have many libraries for the same stuff, but not all of them are perfect. So it's one of the problems you have when you program in Elixir. Drink can itself call external microservices with HTTP or things like that, which has consequences for the implementation, because external services can be slow or unreliable, so you have to be careful not to crash, not to hang everything while you are waiting for the microservices. It's a bit like the talk about DNS resolution for graphical programs.
Unlike the typical authoritative server, which only depends on what is in its memory, so it's very predictable and the response time is constant, Drink has a response time and success rates which are highly dependent on the external services. It's free software, of course, because we are at FOSDEM, so I wouldn't dare to present it if it were not free software. The code is here, but let's go to the important implementation points. First one, which is probably the most important: parallelism. So I don't like events. I think that events are an invention of the devil. God intended parallelism to be done with processes, and Elixir (well, Erlang actually, because the runtime is the Erlang one) encourages you to use massive parallelism, and when I say massive, really massive. Whenever you have anything to do, you create a new thread of execution, and it's very, very efficient. So in Drink, every DNS request is a separate process. When I say process, it's not an operating system process, because of course creating them or managing them would be much too costly. But one of the funny things with the Erlang world is that they have a terminology which is quite specific. So words like process or application do not have the same meaning in the Erlang world as everywhere else. So a process here is what Go calls a goroutine, for instance. For those who have programmed in Go, it's more or less the same. Basically, it's very cheap to create and to manage. So don't hesitate. One of the things that we always tell beginners in Elixir or Erlang is: don't hesitate to create processes. So every request is a process. When it does TCP, every TCP connection is a process. And everything is done by processes. For instance, logging, statistics, and (well, it's not implemented yet) control through a local socket are also done by separate processes. As I said, there is a process for everything.
So as a consequence, if you crash, if there is an exception (remember it's experimental code and it's written by me, so there are a lot of bugs), you only crash one process. You don't take down the entire server. So that's a very interesting thing, because it's one of the mottos of the Erlang and Elixir programmers: let it crash. If a process crashes, it's not a big problem as long as the entire server continues to work. In the same way, if a request is stuck because you are waiting for something, you are calling a microservice somewhere at the other end of the internet and it does not reply, or not immediately, it's not a big problem for Drink because all the other requests will continue to work. Parallelism is really great and, unlike what many people are saying, it's even simpler than traditional programming. So for TCP, as a consequence, when I programmed it in the Elixir way, pipelining, meaning sending several requests over the TCP connection without waiting for the reply to the first one, worked immediately without me having to do anything at all. And out-of-order replies, which are not only allowed in DNS over TCP but also mandated by the RFC, also worked immediately; the first time I tested, it worked without anything specific, because every DNS request is a process. They work in parallel, so you have out-of-order replies. Remember that for a typical authoritative name server, out-of-order replies are not necessary because the response time is typically the same for every request. So there is not really any point in making out-of-order replies, unlike for a resolver, for instance. But Drink is a bit special because any request can take some time, a lot of time. So out-of-order replies are still very important. And as I said, parallel programming is simpler; this is something you have to teach to the students. Parallel programming is not something very complicated that you see only at the end of the year.
It's something very simple, very natural, and if you don't use events, everything is fine. And you don't care about things like, this request may block me; yeah, okay, let it block, no problem, other processes will work. So here is an example of Elixir code. It's a functional language, so we use map a lot. We don't do loops, because loops also are an invention of the devil. So we have a set of IP addresses, and we just map a function. The function simply listens on this address with some options, okay. Then you open the socket, and for each socket, you create a server which runs this function, the TCP loop acceptor, which will itself create a process for every DNS request received over the TCP connection. And that's all, and it's the end of the function that you map on the set of all IP addresses. Okay. Not even a bug in this one, no, I don't think so. Another important point when you write an internet server, whatever type of internet server it is, is of course robustness, because as you know, the internet is hostile. You see a lot of funny things, a lot of funny DNS packets, and sometimes even random binary data sent to port 53. So I assume everybody in the room has read RFC 9267. Is that the one that has no work? Yeah. Okay. It's very good reading if you are interested in DNS implementation, how it works. Basically, it's a list of the things that can go wrong when you parse DNS requests. It's not a complete list. So the internet is a jungle. Incoming packets can contain whatever, literally whatever; everything is possible. And of course, the main examples in RFC 9267 are compression pointers, because compression pointers can do things like pointing to themselves or pointing outside of the packet. So if you program in C in a completely careless way, you can imagine what will happen. And it indeed happens in the real world. Most of the examples in the RFC are from dnsmasq and Windows, but it can happen to anyone. EDNS is not mentioned in the RFC, but it can be fun also.
It was especially fun for me because the DNS library that I chose, I discovered later, has no support for EDNS. So EDNS had to be done entirely by me. And EDNS options, for instance, are type-length-value. So you can have a length which is too large or too small and makes the packet impossible to parse or, even worse, can trigger a crash of the server, or remote code execution in the worst case. If you program in C, this is the sort of thing that can happen. So here is an example of how to parse EDNS. On the second line, with the brackets (the brackets are for when you handle binary data), you extract the code. And for that you use pattern matching; because it's a functional language, Elixir relies a lot on pattern matching. So the equals sign here is not an assignment. It simply means that you pattern match. And if it fails, there is an exception. So binary_part, which extracts the first two bytes of the data, is a safe function, meaning that it itself uses pattern matching. If there are, for instance, not enough bytes to get the first two, you will also have an exception. You won't execute remote code or go outside of the memory or things like that. Then you do things. You also extract the length, and then you read as many bytes as the length says. So if you do this sort of thing in C without paying attention, you can imagine the catastrophic things that can happen. But here it's safe. In the worst case, you will have an exception here because there are not enough bytes. So here we trap the exception and we raise a proper exception, and then we will return FORMERR to the client. In case you have something unexpected, this may crash, of course. It may take down the process. But remember, each request is a separate process, so the other requests will be fine. DNSSEC. Ha! DNSSEC is fun. Because Drink is dynamic, you need to have dynamic signing. But one of the things I really dislike with cryptography is that with a single bit wrong, the signature is completely off the mark.
So it makes things really difficult to debug, because some software tells you that the signature does not match. Okay, what's the problem exactly? Did I forget a field, or did I forget something in the RFC? Ah, yes, something. So an example, a bug that I had, for instance, is that the default encoding of the DNS library uses compression for the data which is inside the RDATA, so the domain names in the SOA or NS records, for instance. But the RFC about DNSSEC says that the signing has to be done on an encoding which is done without any compression. So it didn't match, and it took me some time to figure out what the problem was. Also, the library I used did not allow encoding without name compression. So I had to redo everything myself. Like most programming projects, Drink was at the beginning, oh, it seems simple, it will be done in a weekend. And of course, in the end, it took much longer. So here is an example of code for signing, again with binary data. We put all the information that is mandated by the RFC in the pseudo-record, which is then encoded by myself, and signed. There are a few funny tricks; for instance, all domain names have to be put in lower case, the sort of problem that you discover when you go through a resolver which does case randomization. That's how you learn. But the most fun in DNSSEC is, of course, negative answers. So Moses came back from the mountain with ten commandments, and one says that you should not lie. But you have to lie here, because you have to say that there is nothing between this name and this name. And you don't know all the names, because the server is completely dynamic. So Drink uses something called white lies, which are described in RFC 4470. So the NSEC record goes from just a bit before the name to a bit later. It seems simple, but it's very hard to get right. At one step, for instance, when implementing the algorithm of the RFC, I had code which worked with Unbound and Knot, but failed with BIND.
And I never really discovered why, but after some tweaking, it worked. Also the encoding of NSEC bitmaps is quite interesting. NSEC bitmaps are encoded in a very clever way, but one which is very hard to get right, especially since the RFC has only one test vector. So it's very difficult to see if you are on the right track or not. But in the end, it works. We have everything necessary in Elixir: Enum is for all the things that you can enumerate. It's a very generic library, so you can do things like finding the minimum, filtering to extract some data, map to apply a function, et cetera; it's cool. Of course you need to test. Elixir, like most programming languages, has a framework for testing. But I also made external tests with a program written in Python, to be sure that I don't have the same bug in both the tester and the testee. It's also especially important in the DNS to test not only with proper DNS requests, but also with broken requests, to see how the server reacts. So here is some Python code to create, for instance, an incorrect EDNS option. There is a comment on the second line. NSID has no data, but here we put a random length, so any server that tries to decode EDNS stupidly will read too many bytes and something wrong will happen. So we create this EDNS option for the DNS packet, we send it to the server, and we hope that the server will reply, as the RFC says, with FORMERR; otherwise the test will fail. And that's all. So, time for questions. Yes, that's this. Good question. I have to think about it. The question was about byte order, because the DNS RFCs specify the byte order for things like a length in EDNS packets, for instance, and it's not explicit in the Elixir code. And that's a good question, because I don't remember how I did it. I ran the program on several machines with different byte orders to be sure that it was okay, but I don't remember how I did it. That's an interesting question. This is code that I wrote.
The last code that I wrote was DNSSEC, so DNSSEC is still fresh in my mind. The rest is a bit more complicated. I can probably add to that. When you specify the binary pattern matching, you can choose how you want it done, and you can specify the endianness, and you've got a default, but I don't remember which one. So you mentioned that when you added TCP, the pipelining just worked. How does it handle a larger reply? Is it always the case that the answer comes back whole, so there's no chance that a big answer has to worry about a small answer arriving while it's being sent, or anything like that? So, about TCP, when there are some questions or replies that are larger than others or take more time. Because of the parallelism, and because every DNS request is a separate process, they follow their own path. The only case where they meet is when they try to send the reply back. In that case, it's the Erlang virtual machine which is in charge of making sure that you cannot interrupt a write operation. So the way it's implemented is that everything goes through a process. For instance, logging works the same way. We send everything to a logging process, which then serializes. So we can be sure. And also, writing on the socket is done by the Erlang library, not by me, so it cannot be interrupted, so there is no risk of interleaving replies, if that was your question. And the Erlang socket library also does a few things that are not really important but are fun. For instance, when creating the socket, maybe you noticed this option, packet: 2. It means that a two-byte length has to be added automatically, which is good for EPP or for DNS. And also, by default, it's in network byte order, which is good again. Oh, performance. Yes, I tested with dnsperf, and I compared Drink with NSD. Drink is typically three to four times slower, which is expected, of course, because it's dynamic, because it has not been optimized for speed, and because NSD is very fast.
So of course, as you know, performance testing is something complicated. It depends on a lot of things. So I don't have strong, serious measurements, but the measurements I did on my machine show that the difference in performance is, in my opinion, quite acceptable. Three times slower than NSD is actually quite good. The question is, do I plan to add some caching in it, because some answers can take time to retrieve or to compute? No. I don't think so. As you know, caching is one of the two or three complicated things in computer programming. So in my opinion, it's not worth it. Caching can be done by the client, anyway. Or you can run Drink behind NSD, if you insist. Thank you, Stéphane. |
Hosting your own DNS for 'fun' and zero profit |
Alright. Hello everyone. Thank you for packing the room. No pressure. As Peter mentioned, I'm Kevin Fleming. I've actually spoken at FOSDEM quite a few times before, but never in the DNS room. So it was a bit of a stretch to think that I might even get a talk accepted here, and I'm thankful both that the team did accept it and that you all decided to show up. So as he mentioned, I don't know if I'd call it a mistake. I made a decision a few years ago about something that I wanted to do differently than I had been doing, and as a result I am now running my own public DNS servers. This is done on a personal basis. I don't do this for pay or anything like that. That's the zero profit part. Nobody pays me to do this. In fact, it actually costs money, so it's negative profit really, if you want to think about it that way. Also, this is a bit odd, but I wrote these slides in a somewhat unusual way, where every title of a slide is actually a question from the audience which I will then answer in the slide. They get somewhat humorous further into the presentation. So if someone in the audience wants to play the role of the person asking the questions as we go to each slide, that would be fantastic. In fact, you could all do it as a group. That would be even more fantastic. So if you decide to do that, that's great. If not, I will be happy to read them. Yes, exactly. So to start off, everybody in the audience who currently is responsible for running authoritative name servers that are reachable on the public internet, raise your hand and keep your hand up. Now, if you have your hand up but you're doing that because you run an internet service provider, put your hand down. So we're down to less than a quarter of the ones who had their hands up originally. That's more than I thought, actually. I didn't think as many people were as strange as I am. So you can see what's up here; I already covered most of this anyway. It's relatively easy to do. Obviously, because this is FOSDEM,
I wouldn't be talking about it if it required non-free software. It does not require anything that's non-free software. It can be fun for the subset of humanity that would come to a room like this. And it's also not expensive to do. So all right, now we're ready. We're going to the next slide. Okay, why don't you? Yeah, all right. Why would we do something like this? So the fun part we already covered. The second part is, for years in my personal infrastructure, I had been using Hurricane Electric. You're probably not familiar with them. They are bigger in the U.S. than they are here for hosting things. But they offer a free DNS hosting service, which I'd been using, and it was working fine. And then I decided I wanted to sort of enhance my DNS usage and implement DNSSEC for SSHFP records, so I could stop having to deal with SSH key distribution issues. And a couple of other things, like I wanted to get real TLS certificates for all of my services. And of course, five years ago, Let's Encrypt was just a new thing. And DNS challenges were the best way to do that, blah, blah, blah. So the things I wanted, the DNS hosting service that I was using did not offer. So I thought, fine, I'll go find a different one. And I can't even remember where it came from, but years ago, some wonderful person on the Internet used to maintain a Google spreadsheet of all of the DNS hosting providers, commercial and otherwise, and all these different attributes of what they supported. And so I went to that spreadsheet and I filtered down to all the ones that had the things that I wanted. And there were three or four left. And none of them were ones I would actually have wanted to use, most of them because the minimum cost was too high. They were designed for large enterprises that are doing this sort of thing all the time. So I thought, all right, fine. I will consider another alternative. All right. What do you need?
So for those of you who have been around on the Internet for a much longer time than a lot of the people I see in the room, who aren't old enough for this: if you ever had to fill out the actual paperwork to register a domain name with Network Solutions and fax it to them, you will know what I'm talking about. For those of you for whom that went way over your head, don't worry about it. So it used to be, a long time ago, that this was a really complicated thing to do. Today, it's not such a complicated thing to do. So these are the really simple things you need. You need some place to host the service that's going to be the source of truth for all of your zones and what they contain. Obviously, you can choose to put that in lots of different places. I choose to put it on a private network that's not reachable from the Internet, because it's better and more secure that way. You need a few places on the Internet where you can make your zones available to the Internet. Preferably, those are not all on the same machine, or all on the same network, or all in the same data center, or preferably not even all on the same continent, really. It would be best if they're geographically distributed, so they're not all going to go down because you put them all in the same hosting provider and their control plane breaks all the time (Microsoft) and their entire cloud infrastructure goes down all at once around the world. So keep those things in mind when you decide where to put these things. It's perfectly fine to put these on Raspberry Pis in your friends' houses, if that's something you choose to do. If you have friends geographically distributed around the world who are willing to do that for you, and who trust you, that's fine. But there are plenty of other ways to do it as well. You will of course need DNS authoritative server software. There's lots of those out there.
There's a bunch of people in this room who are responsible for PowerDNS, so I'm thankful to them for giving me the chance to do this. And then, and this is actually trickier than it would seem, your domain registrar, the place you host your domains, possibly also the place you bought them from, although they don't have to be the same, needs to give you the tools to set this up. They need to let you specify your own name servers; not all of them will. And if you want to do DNSSEC, they have to let you upload DS records for the signing keys for your zones. Fewer of them will do that. Many of them, especially the free ones, do not give you the ability to do that. Is that really enough? Yeah. That's really it. There's not a whole lot required to do this. Now, this is all written from the point of view of me doing this personally. I would actually be happy to use the same kind of structure if I was doing this for, let's say, a small non-profit organization or something like that. I would not go in this specific direction if I was responsible for doing this for 10,000 domains for a bunch of people who are paying me to do that. Things would have to be much more complicated in that situation. But now, going back to the way things were decades and decades ago, in Network Solutions times: back then we did not have Google Public DNS, Cloudflare public DNS, Quad9 public DNS, or massive ISPs with millions and millions or billions of subscribers that ran resolvers on behalf of their clients. So back then, if you wanted to host your own zones on your own authoritative servers, pretty much every small chunk of the internet, if it needed to know the answer for one of your names, was going to be reaching out to your servers to get it, because there was no caching layer in between them and you. That's no longer true. I mean, obviously, lots of us also run our own resolvers.
And so we are going to be part of the noise that's hitting your authoritative servers. But if you have a website that millions and millions of people on mobile phones are going to touch, those mobile carriers' resolvers are going to go ask your servers for the answers and then hold on to them for however long you've told them they're allowed to hold on to them: a day or two days or a week or whatever is appropriate. So you don't really need big, beefy servers to do this. Really, really tiny little machines are perfectly fine for doing this sort of thing. Also, even the smallest machines you can get in clouds nowadays have multi-gigahertz CPUs with very fast RAM. And as I think Stéphane pointed out in the last talk, a normal, non-dynamic authoritative server is very fast. It's very predictable. It doesn't take a lot of time to come up with an answer. Probably all of the data it needs is in memory already anyway. So these don't have to be really super powerful machines. Why did you do this? Yeah. So I covered some of that already. Actually, I kind of did that backwards, didn't I? I talked about that before I was supposed to. Darn it. So a couple of the things that I ran into when I was doing the search: there were quite a few of the available ones where, if you just wanted to host one or two zones, they were okay. The prices were reasonable; I actually considered doing it. At the time, I think I had seven. Now, I've probably got 11 or 12 for various different purposes, because we all just grab domain names for projects and then never use them for anything, but you keep them around. I think there was a thread on the Fediverse about that. Someone ran a poll: how many domain names do you own that you've never used? The number was high. And so there were some where, you know, if you went above three, it was like 100 US dollars a month for the service to be able to do this.
There were others who, shockingly, charged by query volume and not by the number of zones. It's like, well, I don't have control over that. I mean, if someone starts hammering the servers, not paying attention to the TTLs, I don't want my bill to suddenly be $14,000 for this. This is crazy. So fine, we are not going to do that. All right, ready. What do you use today? Well, what do I use today? So, like probably a lot of people at FOSDEM, I have at least one reasonably beefy computer at home, a network storage appliance. So there's a container on that where I run the hidden primary authoritative server. There are lots of people who will choose not to put things in Amazon Web Services. I'm not going to complain one way or the other; choose whatever you like. I will say that their ARM-based nano virtual machines are the cheapest machines I could find that were not VPSs, that I could actually install my own software on and open up my own ports and make them available to the internet. There are lots and lots and lots of cloud providers out there, obviously, and if you can do this with a, you know, one-CPU Graviton2 nano instance, you can certainly do it with a Raspberry Pi or some other small computer like that. If you weren't trying to use PowerDNS, you could probably do it with an ESP32, although I don't think I want to try that, but you could probably do it. So, I have three: I have two of those, and then I have a dedicated server in a data center which I use for off-site backups and running my Mastodon server and Matrix server and all that kind of stuff. I put another one there, just because. And it gives me good distribution. One's on the west coast of the US; one's roughly on the east coast of North America, not in the US but in Canada; and then one of them is in Europe, so that works out fairly well.
So, and then in our home network, because my home network is wildly over-engineered, I have two PC Engines APUs, which are our border devices that connect us to the internet and handle all sorts of things. They're the ones that run the resolvers, the recursive resolvers my wife learned about when she saw the slides. And then, of course, they also have authoritative servers sitting right next to the resolvers, for a reason that I will talk about in a minute. And now, you don't have to read this one, because we're on the same topic. So, I use the PowerDNS auth server on all of them. I use SQLite databases because they're really simple, and I want to try to keep all of the machines except the primary as stateless as possible and simple to manage and deploy. That also means that I use DNS-based distribution of the data. So, it's not like a shared MySQL database or any of that sort of much more complicated thing. But, again, as I said before, I would not do this if I was doing this for 10,000 zones for paying customers. SQLite 3 would probably not be the best choice. Well, actually, SQLite 3 probably would be okay. AXFR would not be okay in that situation, because you'd be constantly pushing out zone updates. And then, because I use Ansible to manage all of my infrastructure, and since there were no actual good Ansible modules to poke at the PowerDNS API to create zones and manage them and set up TSIG keys, I wrote some, which there's a link to at the end of the presentation. What does that cost? Oh, I thought you were asking for the Ansible modules. Yeah, good job. Good job. Excellent. So, what does all this cost? So, you already got the NAS; putting another container on it is free, right? Especially something as lightweight as a primary auth server, which uses almost no resources whatsoever. I translated all of the costs I'm paying AWS into Euro for you.
You can see that it costs me less than four Euro a month to host those machines. Now, that includes paying for them up front for three years, because AWS basically cuts the price in half if you do that, as opposed to paying for it on a monthly basis. But it's 78 Euro for three years. If you decide two years in that you don't want those machines located there anymore and you're basically throwing away 20 Euro of up-front payment, that's not the end of the world. We're not talking about a gigantic amount of money here. And then, because I have the rented server in the other data center, putting a server on that was free as well. The software, of course, is all free, and the modules I wrote are also free. So, cost is very low. Half of the cost per month is actually the storage cost from AWS for the root volume for the VMs. The actual cost of the VM is less than the storage cost if you do it this way. And they won't let you create a root volume that's smaller than 8 gigabytes, even if you don't need that much. Well, I'm using the Debian AMI, and it has a hard-coded minimum of 8 gigabytes for the root volume. So, I suppose I could make my own and try to cut that down by another few pennies or something like that. So, what do I do with that? As I mentioned before briefly, all of the network infrastructure I manage uses real Let's Encrypt certificates for all browser-accessible, and actually some non-browser-accessible, endpoints. I had been using self-signed certificates previously. For those of you who've done that, you know how painful that can be, having to make sure that everything has the right CA certificate to trust them. I don't want to do that anymore. So, it's much easier to use Let's Encrypt for that. Similar thing for SSH. All of these 25 different SSH-able endpoints across this infrastructure.
Managing host key rotation for them was a pain, because then things that had copies of the public half of the key all had to be updated and everything else. Using SSHFP solves that problem entirely, except that it requires DNSSEC. Well, I mean, OpenSSH requires that it be a signed answer, for fairly good reasons. So, I'm not going to try to work around that. And then something I didn't mention before is that doing this not only gives you the ability to host all of the, you know, standards-track RFC DNS records, you also get to do lots of cool things that aren't actually standards-track RFCs yet. So, for example, HTTPS records, which are still just a draft, and we don't even know if they're really going to end up being approved or not. They probably will. And Firefox and Chrome already know how to use them. I host all of those already for my HTTPS services, because why not? It took almost no effort to set it up. I do Ansible-based management for all of this stuff. And then I'm sure that most of the open source auth servers out there can do online signing, but the PowerDNS auth server certainly can. So, I don't have to worry about, you know, regularly re-signing the zones and handling all the keys and everything else, because it does all that for me. So, the last thing that's mentioned there, which I'll get to in just a little bit, but I have it there on purpose. How many of you know what catalog zones are? Wow. That's shocking, because it's you. You wrote the code. Of course you know what it is. So, catalog zones made this whole thing much simpler. In fact, we were joking at dinner last night that there probably would have had to be three slides in the section about how to do maintenance of all of this that don't have to exist anymore, because catalog zones take care of the whole thing for you. So, I just led right into this, didn't I? So, obviously, new software releases have to be deployed.
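To make the SSHFP idea from a moment ago concrete, here is a rough Python sketch of how the record data can be derived from an OpenSSH public key line. It is an illustration, not the speaker's tooling: the function name `sshfp_rdata` is hypothetical, and the key blob in the example is a placeholder rather than a real key. The fingerprint is the SHA-256 hash of the decoded key blob, with the algorithm and fingerprint-type numbers taken from the SSHFP RFCs.

```python
import base64
import hashlib

# SSHFP key algorithm numbers as assigned in the SSHFP RFCs.
KEY_ALGORITHMS = {
    "ssh-rsa": 1,
    "ssh-dss": 2,
    "ecdsa-sha2-nistp256": 3,
    "ecdsa-sha2-nistp384": 3,
    "ecdsa-sha2-nistp521": 3,
    "ssh-ed25519": 4,
}
SHA256_FINGERPRINT = 2  # SSHFP fingerprint type 2 means SHA-256


def sshfp_rdata(public_key_line: str) -> str:
    """Turn 'key-type base64-blob [comment]' into 'algorithm fp-type hex-digest'.

    (Hypothetical helper, for illustration only.)
    """
    key_type, key_b64 = public_key_line.split()[:2]
    blob = base64.b64decode(key_b64)  # the fingerprint covers the raw key blob
    digest = hashlib.sha256(blob).hexdigest()
    return f"{KEY_ALGORITHMS[key_type]} {SHA256_FINGERPRINT} {digest}"


# Placeholder blob: a real line would come from /etc/ssh/ssh_host_ed25519_key.pub.
placeholder = "ssh-ed25519 " + base64.b64encode(b"not a real key").decode() + " host.example"
record = sshfp_rdata(placeholder)
```

In practice `ssh-keygen -r <hostname>` prints equivalent records, so a sketch like this is mostly useful for seeing what the record contains.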
For the moment, although I know Peter's working on it, there aren't currently ARM64 (aarch64) packages in the PowerDNS repos. So, I use their builder scripts to just build them myself and publish my own packages to my machines to install them. Eventually, I won't have to do that first part and I'll just run apt-get update and it'll install the new packages. And then whenever I need to add or remove zones, I go to my Ansible playbooks and I change the list of zones that I want to maintain, go run them, and they poke at the API to do the right things. First of all, they make sure those zones now exist or no longer exist on the hidden primary server. And in the past, they also had to reach out to all of the other servers to make the corresponding change. So, every secondary had to know that you've added a new zone so that it could load the data. Catalog zones take care of all of that for me. So, now when I stand up a new secondary server, I only have to tell it which catalog zones it's supposed to pay attention to and which server those are supposed to come from, and then it automatically populates its secondary zone list all by itself, and it's automatically updated every time I add or remove zones. It takes, I don't know, a minute maybe for all of them to be updated, and everything is happy. It's really fantastic. So, there's going to be more cool stuff we're going to do with that in the future. So, step zero, one, two, three, four; put them all together. That first one is the most important thing. Go find out what your domain registrar supports. If they don't give you the tools you need to be able to do this, you're going to have to move your domains to a different registrar or give up. Those are obviously two choices, equally valid. But there are lots and lots and lots of really good ones out there, in different parts of the world. So, it's easy enough to find out. There's a very short list of things you need to be able to do.
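The "change the list of zones, run the playbook, and the right things happen" step described above boils down to comparing a desired zone list with what the primary currently serves. A minimal sketch of that comparison in Python follows; it is not the speaker's Ansible modules, the function names are hypothetical, and the actual API calls that would create or delete zones are deliberately left out.

```python
from typing import Iterable, List, Tuple


def normalize(zone: str) -> str:
    """Lower-case a zone name and make sure it ends with the root dot."""
    zone = zone.strip().lower()
    return zone if zone.endswith(".") else zone + "."


def plan_zone_changes(desired: Iterable[str], existing: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (zones to create, zones to delete) on the hidden primary.

    (Hypothetical helper: a real tool would fetch `existing` from the
    server's API and then act on the two lists.)
    """
    want = {normalize(z) for z in desired}
    have = {normalize(z) for z in existing}
    return sorted(want - have), sorted(have - want)


# Example run with made-up zone names.
to_create, to_delete = plan_zone_changes(
    desired=["example.org", "Example.net."],
    existing=["example.net.", "old-project.example."],
)
```

With catalog zones in place, only the primary needs this kind of reconciliation; the secondaries pick up additions and removals from the catalog on their own.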
The DS record is probably going to be the first thing to check. Will they let you upload DS records for your own zones? If they won't, you're dead. You can't do DNSSEC with your zones and that registrar. You have to switch to somebody else. The one I currently use supports both IPv4 and IPv6 glue records, but only IPv4 through the web interface. IPv6 has to be done via a support ticket, which is annoying, but it doesn't change very often, so I'm willing to live with that pain. So, decide which server software you want to use. Obviously, that's going to be an important decision for you. Decide where you want to put all the stuff, where you want it to live, both the hidden primary and all the secondaries. And then, how are you going to manage the zone list? And as you can see there, if your answers happen to fall into that category, you can do exactly what I did and be on your way. Now, a little bit of a bonus here. I didn't write that as a question. I forgot I changed it. So, on those network appliances in our home network, I run the recursive resolver for all of the clients on the home network to use to resolve names. And with our last ISP, we had more outages than I would have liked to have. And once you start setting up something like this, you forget the IP addresses for all of your own infrastructure. You only know the names. And when you've made the zones signed and your resolvers can't reach the internet, they can't resolve any of the names, because, in my case, my domain is km6g.us, so they have to know what's signed at the root, what's signed at .us, and all that stuff, right? So, we would have very bizarre cases where things inside the network stopped working because our internet link was not working correctly, even though the things we were trying to use didn't use the internet at all. They just couldn't talk to each other because they couldn't resolve names.
So, now I have the resolvers with an authoritative server sitting right next to them that hosts all of our zones, and that nothing talks to except the resolver, so that even if that box can't talk to the internet, it can still resolve any of our internal names with no problems at all. An additional thing there, thank you again to the... What's that? I am, yeah, yeah. So what happens here is a feature in the PowerDNS Recursor, which, I will say unabashedly, I wrote, which is that you can actually tell the auth server to send NOTIFYs to the recursors, which is not something you would think would ever happen; I mean, it's not something that's normal. It's really cool, though, because when that happens, the recursor says, aha, that's a notification from the authoritative server that any content I might have in my cache for that zone is probably stale; throw it all away. So that means if I've got internal infrastructure changes, say I've moved a container to a different machine and its IP address has changed, within a minute everything that needs to use that will get the correct address, as long as it's not running a local cache on the box itself. And then, as I said, over-engineering: I use anycast and OSPF and all kinds of other stuff to reach the recursive resolver. So, there's a bunch of links. The slides are all up on the FOSDEM website, so feel free to download them. All of these links are useful, and you can see even the HTTPS thing, which is not even a standard... What's that? Oh, you're right, I forgot to put that link in there. Tell me where it is and I'll add it. Yeah, so, and that's that. I have two and a half minutes left. So, questions. Yes, sir? You said the registrar is required to let you add or change DS records. For anyone else who wants to do this, please note that some TLDs require you to send the DNSKEY record and they will hash it to a DS. Ah, okay.
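The audience remark about registries hashing a DNSKEY into a DS can be illustrated with a short Python sketch. It assumes the standard DNSSEC definitions: the DS digest is a hash over the owner name in wire format followed by the DNSKEY RDATA, and the key tag is the 16-bit checksum from the DNSSEC resource record RFC. All function names are hypothetical, and the key material in the example is a dummy value, not a real key.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Canonical wire form: lower-case labels, each prefixed by its length, then a zero byte."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """DNSKEY RDATA: 2-byte flags, 1-byte protocol, 1-byte algorithm, then the key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """16-bit key tag checksum over the DNSKEY RDATA."""
    total = 0
    for i, byte in enumerate(rdata):
        total += byte << 8 if i % 2 == 0 else byte
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


def ds_rdata(owner: str, rdata: bytes) -> str:
    """DS presentation data using digest type 2 (SHA-256)."""
    digest = hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()
    algorithm = rdata[3]
    return f"{key_tag(rdata)} {algorithm} 2 {digest}"


# Dummy key-signing key: flags 257, protocol 3, algorithm 13, 64 zero bytes.
dummy_key = dnskey_rdata(257, 3, 13, b"\x00" * 64)
ds = ds_rdata("km6g.us.", dummy_key)
```

Whether you upload the DS or the DNSKEY, the registry ends up publishing the same DS data; the difference is only who runs the hash.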
So, he just made a comment that, depending on which TLD your zones are under, you may have to send a different thing to them, depending on exactly what they want from you. So, that's fine, thank you. Don't fight. Yes, that one. Just another comment: some TLDs are scanning for CDS and CDNSKEY records. Yes. So, you don't have to play with the registrar at all. Right. Emerging technology that's been around for a really long time. Yeah, so I just have to repeat that there are some top-level domains, not related to the registrar, that will actually notice the correct type of records in your zone and pick up the keys from there, so that you don't even have to manually update them when you change them or rotate them. Yes. Yes. Okay. Robust infrastructure, but you still use only one DNS software. That's a good point. I have diversity and robust infrastructure, but no diversity in the software. That's absolutely a good point. I'm okay with that risk, though. Yes, sir. So, how do I ensure that I'm not under attack on the virtual machines, and how do I apply CVE updates? So, because I'm a geek, every Saturday morning before my wife gets up I run this gigantic Ansible playbook that goes and touches everything and applies all package updates and reboots anything that needs to be rebooted and all of that sort of thing. So, I'm reasonably good there. Plus, on the public-facing machines I have the Debian unattended-upgrades thing in place, so that if a package update for a really important thing comes in, it gets applied by itself. So, I don't have to start that. What was the other question? What was the other part of that one? Oh, the attack thing. So, this is really cool. Those particular AWS VMs have a hard cap on the amount of CPU that they're allowed to use. So, if somebody tries to do a DoS attack on them, they just stop getting responses. So, I don't have to do anything. Granted, resolution of my zones would be harmed by that.
But these are personal zones. We're not hitting them all that hard from outside. So, that's kind of a neat feature; it's just a side effect of the way those particular VMs work that you're limited in how much CPU you can use. So, yep. And, are we out? Out of time, yeah. All right. Thank you, Kevin. Thank you. Thank you all. |
Moving from home grown to open source
A thrilling tale of RFC non-compliance, wildcard hell and scaling issues |
So, my name is Robin Geuze. I worked for TransIP at first, and Team Blue after they merged with a bunch of other companies, for about a decade, until a month ago. During that time period we transitioned from running our own closed source DNS server software to running open source DNS server software, and just like in the talk we just had, that happens to be PowerDNS. So, I'll take you through the issues we had going from closed source to open source, which took roughly the entire time I was there, about nine years. So, yeah, let's start. So, how it started for me. TransDNS, which is what they called the home-grown DNS software, was written originally in about 2003, 2004, and it had DNSSEC support added in 2012. When I started working at TransIP in 2013 as a PHP coder, I was asked to help them debug a crash in the TransDNS code. It basically came down to a buffer overflow, because one of our customers had managed to put more than 16 kilobytes of TXT record data on one single label. The really quick fix was to increase the buffer to 32 kilobytes. And one small disclaimer: I was involved in almost all the work that I mention here, but there are some things that I didn't do myself or just consulted on, stuff like that. I'll try to make that distinction, but I might miss some stuff. Yeah, so back then it was a really basic setup. We basically had three servers. They were all running TransDNS. There was no load balancing. The signing stack was built using DNSSEC-Tools, for those few people who still know what that is. And there was a lot of automation on top of DNSSEC-Tools, in PHP, to make all of that work and ultimately upload stuff to the registry, because we were a registrar, so a lot of the stuff was automated. All of the DNS propagation was done through cron jobs, which means it was very slow. It took roughly five minutes to propagate a DNS change, which back then wasn't really a big problem.
But as we went on, it became more and more of an issue, especially when we got Let's Encrypt and you needed to quickly update your DNS to get your certificate signed. At the peak we had, and I think we still have, roughly one million zones in the setup, most of which, so about 80 to 90%, are DNSSEC-signed. There were very few people back then that actually knew stuff about it and dared to work on it. I think maybe three or four people, one of which was I. It had very bad RFC compatibility, which I will get into a little bit later. Adding new record types, like SSHFP, which Kevin mentioned, was a lot of work, because there was an interpreter in TransDNS itself, which had to be written in C, and writing interpreters in C for strings is not fun. And well, I fixed that initial buffer overflow bug, but the main problem was that there was just not a lot of bounds checking in the code. So yeah, there were a lot of hidden bugs that probably should be fixed as well. So we took a few initial steps, because initially, since we had the three servers and there was no load balancing, it meant that if we restarted TransDNS, one of the servers would stop responding until the restart was done. And the restart took roughly 15 minutes, because every single record would get loaded into memory. And since we had a million zones, I think it was like 25 million records or something back then, it just took a lot of time, and might have used the quick DNS zone parser stuff. So the first thing we did was implement load balancing. This was before dnsdist was a thing. So what we tried initially was relayd, which some of the BSD folks might know. It did work, but we had a lot of weird issues. It was really hard to debug. And so eventually we switched to using HAProxy for TCP, which works; nothing more to say about it. And I wrote something rather quickly in C, roughly based on the TransDNS code, to forward the UDP stuff.
That worked quite well and actually enabled us to iterate on the TransDNS code, because we could do safe restarts without having to worry about queries being dropped. And that allowed me to fix the glaring issues, like there not being any bounds checking in the code. So we had less risk of buffer overflows. And I fixed a lot of the EDNS issues that were becoming a problem at that point. Eventually, when dnsdist was a little bit more mature, we switched to that, because otherwise I had to maintain another piece of software and I really didn't feel like that. In the meantime, I did improve the TCP stack a lot in TransDNS, because we noticed that especially SIDN, the .nl registry, did a lot of TCP queries, and the original implementation was basically to just spawn a new thread for every TCP connection, but once you get to about a thousand threads, that's not a great solution. So I changed it to a polling-based model. It worked great, got pretty high performance, and we never had a problem with it after that. The only thing I changed later is that when we moved to Linux, I changed to epoll. Yeah, so SIDN had validation monitoring, and we kept getting reminded about the fact that we were doing a lot of stuff wrong. So yeah, we actually had one specific case that basically covered most of those errors, I think it was about 80%, and it's basically two issues, but they have the same cause. So the first issue was the incorrect handling of wildcards. So if you have a wildcard, for example *.nl, and then you have a record c.nl, and then you try to resolve a.c.nl, it should not hit the wildcard, because c.nl exists, which means you should return a NODATA, or an NXDOMAIN in this case. But TransDNS didn't really care, so it would just return the data from the wildcard. Very useful, it makes it a lot easier to configure DNS, but it causes some issues, especially with DNSSEC validation. The second issue was basically the same, only with empty non-terminals.
If a.b.c exists, and you try to resolve b.c, even though there's nothing specific on b.c, you should say there's no data, rather than that it's a non-existent domain. That also causes DNSSEC validation errors. Same basic cause. The solution was to switch, in TransDNS, from an ordered map that used the type and the domain name as the key, to a map that only used the domain name as the key and had an array in there with the types, which could also be empty, so we would immediately notice if there was a label in our way. That worked well. On to the next slide: the only problem is that we couldn't just deploy that, because we might break stuff for our customers, and customers get a little bit difficult if you break stuff for them. So what I decided was, okay, for DNSSEC it's broken anyway, because DNSSEC-enabled resolvers would just return errors when you have one of these labels. So what I did was fix it in two steps. I initially enabled the correct behavior only for DNSSEC queries, and kept the wrong behavior for non-DNSSEC queries, and in between we just captured a large amount of queries. I think I did two days of tcpdumping, boiled it down to the actual unique queries, and compared what our name servers would respond for DNSSEC versus non-DNSSEC. For everything that had a difference, we contacted the customers and told them, hey, you need to fix this. I think it was only about 20 to 30 customers. It was actually not that many, so that made it a lot easier. And then at some point I decided I'll flip the switch. There were a few customers that didn't respond, but at some point you just have to decide to not give a fuck. One other small issue we had with RFC implementation was the NSEC implementation, because almost all of our zones use NSEC3. The NSEC implementation was not as well tested as the NSEC3 implementation, so it was wrong, like really wrong.
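Going back to the wildcard and empty non-terminal fix described above, a minimal sketch of the name-keyed lookup could look like the following. This is not the TransDNS code; the `Zone` class, its method names, and the return values are assumptions made up for illustration. The idea it shows is the one from the talk: key the map on the domain name only, register every intermediate name as a (possibly empty) node, and only fall back to a wildcard that sits directly under the closest existing name.

```python
class Zone:
    """Toy authoritative zone keyed by owner name only (hypothetical sketch)."""

    def __init__(self, origin):
        self.origin = origin.rstrip(".").lower()
        # name -> {record type: [rdata, ...]}; the inner dict may be empty,
        # which is how an empty non-terminal shows up.
        self.nodes = {self.origin: {}}

    def add(self, name, rtype, rdata):
        name = name.rstrip(".").lower()
        self.nodes.setdefault(name, {}).setdefault(rtype, []).append(rdata)
        # Register all ancestors inside the zone so that names "in the way"
        # are visible even when they hold no records themselves.
        labels = name.split(".")
        for i in range(1, len(labels)):
            parent = ".".join(labels[i:])
            if parent != self.origin and not parent.endswith("." + self.origin):
                break
            self.nodes.setdefault(parent, {})

    def lookup(self, qname, qtype):
        qname = qname.rstrip(".").lower()
        node = self.nodes.get(qname)
        if node is not None:
            # The name exists (possibly as an empty non-terminal).
            rrset = node.get(qtype)
            return ("NOERROR", rrset) if rrset else ("NODATA", [])
        # The name does not exist: find the closest existing ancestor and
        # only consider a wildcard directly below that ancestor.
        labels = qname.split(".")
        for i in range(1, len(labels)):
            encloser = ".".join(labels[i:])
            if encloser in self.nodes:
                wildcard = self.nodes.get("*." + encloser)
                if wildcard is None:
                    return ("NXDOMAIN", [])
                rrset = wildcard.get(qtype)
                return ("NOERROR", rrset) if rrset else ("NODATA", [])
        return ("NXDOMAIN", [])


zone = Zone("nl")
zone.add("*.nl", "A", "192.0.2.1")
zone.add("c.nl", "A", "192.0.2.2")
zone.add("a.b.d.nl", "A", "192.0.2.3")
```

With this layout, a query for a.c.nl stops at c.nl and never reaches *.nl, and a query for b.d.nl finds an empty node and answers with no data instead of a non-existent domain.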
I just rewrote the NSEC implementation from scratch, and then it worked, but yeah. So we started to think about moving to PowerDNS, and the main reason we did was because SIDN announced that we would no longer get a DNSSEC incentive for domains using DNSSEC algorithm 7. So that's the RSA plus NSEC3 algorithm. That would cost us a bunch of money, and that's a very good way to stimulate people to do stuff. So at this point we decided to bite the bullet and just start over from scratch, and build a really new, more modern setup. We picked PowerDNS partially because we already had some experience with it, and because we didn't really want to deal with zone files: we had a million zones, and putting them all on a file system makes things annoying. So PowerDNS was the only one where we thought, oh, this allows us to do changes via the API. We don't need to worry about having separate zone files for every single zone. So we needed to pick a PowerDNS backend to use, because PowerDNS is one thing, but you still need something to put stuff in. And there we sort of hit a problem, because TransDNS is really fast, because it's literally just a hash map in memory, so it can basically do instant answers. And the PowerDNS SQL backend is very nice and flexible, but it's not really fast, especially because we had a lot of zones that would not get very frequent queries. So there was a lot of non-active data, which means the query cache wouldn't really help a lot, which means that we would have a lot of SQL queries continuously, because those zones would still get queries sometimes, just not a lot. The bind backend had the same problem as all the other name servers. It didn't have API support, and it would mean we needed to use a lot of zone files, which we didn't want to. So, introducing the LMDB backend. This already existed at the point that we started looking at it, because Hubert had written it. It's very fast, and it has support for the API, which is really nice.
It only had one major issue. Because of the way Hubert had implemented it, it didn't really allow records bigger than 512 bytes. We had quite a lot of those. So I decided to fix that in the end. I wrote a pull request for the PowerDNS team, and I think that was pretty quickly accepted. It also included some migration code, so the older LMDB database would automatically be migrated to the new LMDB database format. It also improved performance in some corner cases, but that was not really the goal of this patch. So then we started moving over. We built a setup. It was really cool. There's a lot of automation around it. It does actually do all the zone transfers via AXFR, even though Kevin just said it's a bad idea if you have a lot of zones. But in practice, it works quite well, except for one issue. Every Thursday, our updates would take ages to go through. Basically, we traced it down to an enormous bump in the AXFR queues. We would literally have 400,000 AXFRs queued up. So that was a bit of a problem. The reason this happens is that PowerDNS renews its signatures every Thursday. Very nice. We don't have to think about it. The problem is, if you have a million zones, that takes quite a while, especially because we were running our hidden primary on a VM, so it was also not that quick to answer queries. So we could have just thrown more hardware at it, but we decided to look for a more sustainable solution, because, well, if it works with one million zones on that hardware, it will still work if you have 10 million zones. So I discussed it with the PowerDNS guys, and I came up with a solution, which is AXFR priority levels. So rather than treating all AXFRs that need to be done at the same level, we gave more priority to things that are user-initiated. So if you initiate an AXFR via pdns_control, it will be first in the queue. Whatever else is in the queue, that one is treated first.
After that, there's the API, then notifies, then SOA refresh, and signature refresh is the lowest priority. That meant that, yes, we would still have quite a large queue, but we could still process our updates very quickly. That was included in PowerDNS; well, it was included in 4.5. We still saw the huge queues, but zone updates would pretty quickly propagate. So that, for us, was fine. We never had a problem with it after that. Yeah, and then we had some other issues, most of which were minor and solved in the load-balancer layer or just fixed in PowerDNS updates. The TCP performance is still something I want to look at in PowerDNS just for fun, as an open-source developer. It's on my list of things I want to improve. We had various smaller bugs in the LMDB backend because it was quite new. We were not the first ones that ran it at really large scale, but we were one of the first, and we did see some problems that nobody else had had yet. One CVE we discovered literally within the day of rolling out a new version, so that was very fun for Peter, because he got to roll out a new release a day after he released the previous one. We had an issue where there were certain query patterns that we would get that were specifically designed to target a weakness in PowerDNS. In TransDNS we didn't care about them, but PowerDNS did get affected. We eventually resolved this by adding some detection at the load-balancer layer that would just block queries for those affected domains. It would mean that that customer's domain would have limited functionality, but at least it would still work, and all the other customers would not be affected, which was for us the most important thing. Yeah, so some closing thoughts. Yeah, migrating a home-grown setup is really not for the faint of heart. However, running one is also not for the faint of heart. Yeah, it is worth it.
Migrating just gives you a lot more flexibility, because now adding new record types is just a question of adding them in our front end and making them work. Whereas before, we had to add the new record type at every single step in the stack, and it just really took a lot of time. We can even, in theory, add different brands of secondaries. Currently, there are a few issues that prevent it, but they're relatively easy to solve, so we could just run Knot as a secondary, or NSD, or even BIND if we would want to do that for some reason. What I did really, really notice is: don't try to do this in one go, because it's a lot of work, and you'll make mistakes. If you do it in smaller steps, the mistakes will be smaller and easier to fix, and it also just feels a lot better if you can accomplish some things in between rather than trying to do it all at once. One thing I wanted to add: DNSSEC incentives work both to get people to use DNSSEC and to improve the quality of the DNSSEC, because we've seen, especially in the .nl zone, because I've also been involved in that work a little bit, some very bad implementations that got fixed when the rules were made stricter, including ours initially, but we weren't even the worst ones. Yeah, that is it. For the people that would like to see it, I open sourced TransDNS before I left the company, so I can see it myself as well, so that's fun. It's on GitHub. I've also put up the URLs for the two major pull requests I made. There's a bunch of other ones, but I haven't put all of them in there, and that's about it. So, questions? So, that makes it a bit more of a concern in your case, and I used to be a customer. On the upside, most of these mistakes probably weren't noticed by the majority, but I think you should take it more seriously if you are a company that actually makes money out of hosting. So, the comment is that we both made mistakes.
It was a bit related to the talk Kevin did, so a lot of the things that we said are related, and the comment is that Kevin was only doing it for himself, and we were doing it for paying customers. Yeah, I agree. When I started, there were a lot of issues, and I spent three years trying to fix them as much as I could. To be clear, I wasn't hired to maintain TransDNS. That just happened to be something that got shoved into my lap because I knew some C and C++. I became pretty passionate about it. I pretty quickly rolled into the PowerDNS community. I also made a lot of contributions to dnsdist when that was getting started. I agree with the initial statement. I've tried to fix it as much as I could. Sometimes, you set out with certain criteria. You build something that can meet those criteria, and it scales to a certain point. Eventually, you get to a million customers, a lot of customers. The company wouldn't have started off with a million customers, and maybe at the time this was a good system and things were fine. But as the business grew and things grow, you have to do exactly what he did. You evaluate it and say, you know, it's time for something different. He identified that and made the changes accordingly. A brief summary: he said that sometimes, due to scaling, you run into issues that you hadn't foreseen when you initially set something up. Just taking a step to resolve them in the end can be a good thing. There was a question there. The question is how did I get them to agree with open sourcing it. At the point that I open sourced it, I was sort of CTO slash head of R&D of the Dutch part of the organization. Also, I only open sourced it after we had totally taken it out of production, so it's mainly a historic interest thing. Did you ever consider open sourcing TransDNS before switching? So the question is, did we consider open sourcing before switching? No.
And I'll tell you, we weren't very proud of, at least I wasn't proud of, the source quality. I didn't write all of it myself. I only contributed to it later. I tried to improve it as much as I could, but it's still not... It's very focused on just doing one thing, and it's very good at that, but it's not very applicable to use by others. So I think it's interesting now to see some of the tricks to make things really fast that you can see in the code. But beyond that, I would never use it in a production environment other than the one it was in, because that one was built specifically to run around that code. What was actually the motivation for implementing the DNS hosting in TransDNS? I think even around the time when you started as a company, there was no software available that could have been used. So the question is, what was the motivation to implement our own DNS software? So to be clear, TransDNS was implemented in 2003, so this was roughly when PowerDNS started to grow, but the problem was that there were already quite a lot of zones in there, and it just got a little bit cumbersome using BIND, because that was the primary name server software you'd use back then, and yeah, that was the main motivation. BIND was getting annoying because you had to have a lot of zone files, and everything was running on FreeBSD using UFS, so there was a limit of 32,000 files per directory at that point, which also didn't help. I mean, there are ways to solve that, it's not that complicated, but that was the main motivation, as well as, I think, some performance issues in BIND back then that were relatively easily resolved. The other alternative would have been djbdns, but that had its own things, like the guy that wrote it, not saying you should use it. Anything else? |
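As an illustration of the AXFR priority levels described in the talk above, here is a minimal sketch of a transfer queue where user-initiated requests jump ahead of the weekly signature refreshes. This is not the PowerDNS implementation; the class names, the enum, and the zone names are assumptions made up for this example. Only the ordering of the levels (control tool first, then API, notifies, SOA refresh, signature refresh last) is taken from the talk.

```python
import heapq
import itertools
from enum import IntEnum


class XfrPriority(IntEnum):
    """Lower value means handled earlier (ordering as described in the talk)."""
    USER = 0               # started by hand via the control tool
    API = 1                # change made through the API
    NOTIFY = 2             # NOTIFY received
    SOA_REFRESH = 3        # regular SOA refresh check
    SIGNATURE_REFRESH = 4  # weekly signature renewal


class XfrQueue:
    """Priority queue that stays first-in, first-out within one level."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, zone, priority):
        heapq.heappush(self._heap, (int(priority), next(self._sequence), zone))

    def pop(self):
        priority, _, zone = heapq.heappop(self._heap)
        return zone, XfrPriority(priority)

    def __len__(self):
        return len(self._heap)


queue = XfrQueue()
for n in range(3):
    queue.push(f"bulk{n}.example", XfrPriority.SIGNATURE_REFRESH)
queue.push("customer.example", XfrPriority.API)
queue.push("urgent.example", XfrPriority.USER)
order = [queue.pop()[0] for _ in range(len(queue))]
```

The queue can still be very long on the day signatures are renewed, but a customer change only waits behind other changes of the same or higher priority, not behind the whole backlog.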
Bizarre and Unusual Uses of DNS
Rule 53: If you can think of it, someone's done it in the DNS |
Hello, everybody. Welcome to our last talk of the day. This is Peter Lowe. Am I going to be able to see them on the screen as well? No. Okay, I can look around. Yeah. Hello, this is Peter Lowe. He will be showing us many of the weird things people have done with DNS to finish our day. Thank you. The first thing I should say is that this presentation normally takes about 40 minutes because there are quite a lot of things to see, so I'm going to be running through it. It's going to be a whirlwind tour of bizarre and unusual things that people have done with DNS. Okay, so I'm going to skip through some of this stuff. This is all about who I am. Normally, I talk about myself a little bit, but nobody's really interested in that. You've come to see the weird stuff, right? I'm a security researcher. I do DNS stuff, basically. Bunch of different titles for this. Not very click-baity. This is a sort of intro to DNS. I'm assuming that everybody here kind of knows what that is, so we'll skip over that. And this is how it all started, which is as part of the DNS Abuse Special Interest Group at FIRST. My co-chair John Todd from Quad9 joked about some malware distribution via DNS, and we thought, oh, that's not possible. Wait, hang on. So I started going and collecting some other stuff. Yes, a lot of these aren't around anymore. Unfortunately, it's kind of like a museum of weird DNS things. Some of them still work, and there's a bunch of links at the end. If you're interested in any of these, I've tried to put resources and links to everything that you'll see. So first section, traceroutes. Not 100% DNS, but basically it works by setting up a static route and then making sure that the reverse DNS maps back to something interesting. So this was one of the first things I ever saw, the Star Wars intro via DNS. It was by a guy called Ryan. I'm going to be looking up at here as well, by the way, because I can't remember all the details of these.
Ryan Werber from Beagle Networks did it in 2013. It was one of the things I saw and thought, oh, how does that work? And it kind of got me interested in some of the stranger parts of DNS. There's an IPv6 version out there somewhere. It went down very quickly because of a DDoS, typical for a lot of these things, unfortunately. There is this one, which is hand.bb0.nl. This, as you can see, displays a hand, and it's done over IPv6. If you increase the number of hops, it does get more interesting as it goes on. This is the only space I had on the slide, so, yeah. Sebastian Haas, who will be featured multiple times in this presentation. I hope he's here, but if not, hi, Sebastian. He put up a thing where you could trace it and find the live scores from the Euro 2020 match, which was pretty impressive. He also wrote a thing called Fake Root, or Faker Tea, which allows you to set up IPv6 routes on your local machine. This is another one from MakerForce. This is some alternate lyrics to American Pie, which I'm sure you're all familiar with. Bad Horse: if anybody here is a fan of Dr. Horrible's Sing-Along Blog, there is a semi-famous thing in it called Bad Horse, where he sings a little song, and we have the lyrics here. If you go and have a look at the SSL certificate chain for bad.horse, yeah, signed.bad.horse, there's a little Easter egg there as well. This is a screenshot of Dr. Horrible. And the first time I did this, Andrew Campling, who's an encrypted DNS guy and other stuff, said, let's put something festive in, so I went out and had a look. And of course, someone's done a Christmas tree, so there's that one. Tools and toys. One of the first things I found was at postel.org, which I think is a great place to be hosting something interesting. It's a calculator. It's not as good as Stéphane Bortzmeyer's, where you can actually put the plus character in, and it's not around anymore. I put this. There's a reverse Polish calculator out there.
Apparently, this is a reverse Polish calculator, so it shows how much I know about maths. This is one of the more interesting ones. There's a bunch of different versions of this. It's a local IP echo. It tells you what your public-facing IP address is. It's actually quite useful because in scripts, instead of just doing a curl request to myip.org or whatever, this is going to get an answer much quicker and it's much more easily scriptable. Yes, this one is the myip service from Google. As everybody knows, Google is famous for discontinuing services, but they're never going to let the DNS service go. So, yeah. There are some tools here for network admins, some IP to ASN translators, lots of different options. There's Team Cymru. There's an example here. There are some other ones out there if you go looking for them. There is an example here. I think this is from Tony Finch. This is postcodes, which will translate them to... Oh, and Jan-Piet Mens... Sorry. This will give you the geolocation for postcodes in the UK. I think Jan-Piet did some... Oh, yeah. This is also the airport codes, I think. I'm also missing my speaker notes, by the way, so forgive me for this. And dns.toys. This is a great site. They put up a bunch of different things that you can look at. There's world time, IP echo, another one. Numbers to words. I genuinely don't know what this is useful for. I don't know why anybody would use it, but it's kind of fun to have it. This is one of my favorite ones. I'm quite a fan of geocaching. I don't know if you guys know about it, but it's like Pokemon Go for geeks. And you get to go out in the world and find things that are out there. There is one geocache whose author is listed as Mockapetris, but it's unfortunately not Dr. Paul Mockapetris. This is, again, Sebastian Haas, I think. And it's basically a host name, and if you look up the TXT record for it, it gives you the hint for the first part of the geocache, which is, I love it, personally.
There's a text adventure out there. This is very cool. I mean, this is what you want from DNS, right? It's by a guy called Craig Mayhew: you look up different host names, and it gives you different options, and he uses round-robin DNS for random decision trees. So it's pretty cool. Tunneling. OK, so people's definition of tunneling varies, right? It could be a simple kind of C2 communication, or it could be full file exfiltration over DNS. I've got some examples here. If you want to discuss what tunneling means exactly, let's meet afterwards and fight. This is an intro to tunneling, which I found from Slashdot in 2000, but the general concept is the same. Wikipedia over DNS, by a guy called David Leadbeater. This is very cool. It actually supports Unicode as well. I don't think it works anymore, unfortunately, but basically he took a local copy of the XML dump and then installed it, I think, on PowerDNS, and you could look up pages via that. Blogging, another very cool example, where you publish TXT records, and that is your static blog. I love it because it's going to be fast. It actually works. From a blog, all you really want is the content. So, yeah, you can get the index. You can look up specific posts, or order them by most recently posted. OK, so now we're getting a little bit more into the weeds. This is IP over DNS. There is a library out there called Iodine, which is the chemical element number 53, which is appropriate. It does full IP over DNS. There are some examples of how this can be used later on. I'm getting to the point where I lose words for how to describe this kind of stuff. I mean, it's all brilliant, but this is like, yeah, I don't know. HTTP over DNS, so we're getting even more crazy here. This is browser tunnel. It's actually quite useful for some things if you're in certain situations, like an airport or something like that, where things might be a bit restricted. This is basically how it works.
It does raise the interesting concept of, if you're familiar with DNS over HTTPS, HTTP over DNS over HTTPS. So, yeah. This is pretty cool. It's called Slow DNS, and it's a full VPN over DNS which is in the Google Play Store. I haven't used it because it does include ads, and I've never actually got the courage to check it out properly. But I think it's the kind of thing that can really work in airports and other restricted areas. So you run it, and it's a VPN that lets you access the internet, and it works over DNS. In a lot of places DNS is going to be unrestricted, and this is going to help you out in those situations. The ads thing, well, you know, give it a go, but the concept is amazing. Another library that I found really cool is called dnscat2. You install a server and you install a client, and then they communicate with each other over DNS packets. You don't have to have an actual domain working at all. You don't have to register a domain. It just uses the DNS protocol, and it's got a bunch of built-in functions like sending files and windows and messages and stuff like that. It's very cool. This is still on GitHub. There's a link at the end. You'll see. And a few other things. How much time? Oh, am I out of time? Okay. The benefits of being the last person speaking. Corey Quinn talked about how you can use DNS as a full config management system. This is BIMI, Brand Indicators for Message Identification, which works over DNS. It uses TXT records, which start with underscore BIMI. A full contacts database. Somebody in the UK has created a whole protocol and used it to put the yellow pages online, and they've got an SDK and all sorts of crazy stuff. There is dnskv.com. This is a full key value store. This works. This guy is quite dedicated to it. I have to say I'm quite impressed. It has really good documentation. Go check it out. A file system over DNS. Yeah. I know. Right? I mean, why not? Okay.
Ben Cox was in the audience when I did this presentation once, and I have to say I totally lost my... It's amazing. And there's an example of him on Twitter streaming an MP3 over DNS, and it working. It's just magical... Here you go. Here are the links at the end, and questions. There you go. Thank you. Any questions for Peter? No? No. Thank you, Peter. Thank you all for being here. Have a good day. |
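To make the tunneling idea from the talk above a bit more concrete, here is a minimal sketch of the first step most DNS tunnels need: packing arbitrary bytes into a query name. It is not taken from Iodine, dnscat2, or any of the other tools mentioned; the function names and the domain `t.example.com` are assumptions made up for illustration. The only hard facts it relies on are the DNS limits of 63 bytes per label and 253 characters per name, and that Base32 survives case-insensitive name handling.

```python
import base64

MAX_LABEL = 63   # maximum length of a single DNS label
MAX_NAME = 253   # maximum length of a full name in text form


def encode_query_name(payload: bytes, domain: str) -> str:
    """Pack payload bytes into labels in front of a (hypothetical) tunnel domain."""
    if not payload:
        raise ValueError("payload must not be empty")
    # Base32 only uses letters and digits, so it fits in host name labels.
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for a single query name")
    return name


def decode_query_name(name: str, domain: str) -> bytes:
    """Reverse of encode_query_name, as the server side would do."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError("name is not under the tunnel domain")
    text = name[:-len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the padding that was stripped
    return base64.b32decode(text)


example = encode_query_name(b"hello over dns", "t.example.com")
```

A real tunnel adds sequencing, responses carried in TXT or other record types, and retransmission on top of this; the sketch only covers how data ends up inside a name at all.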
Delta-like Streaming of (encrypted) OTA Updates for RAUC |
Yeah, thanks. Thanks for introducing me. Hello, a warm welcome also from my side to FOSDEM and to the embedded devroom. And we're going to hear something today about delta-like streaming of encrypted over-the-air updates for RAUC. So luckily I managed to put the entire abstract of the presentation already in the title. So what we will hear about is the changes and developments of the last roughly two or three years in RAUC, the RAUC update framework. And so it's basically the development that happened since we last met here, I guess. So, short notes about me. My name is Enrico. I'm an embedded software developer. I work at Pengutronix. I'm the team lead of the integration team at Pengutronix, and I'm the co-maintainer of the update framework RAUC that we will hear more about soon. Pengutronix, for those who don't know, is a company based in Germany, and we provide professional embedded Linux consulting and support and work closely together with the community, with, since the beginning, I think more than 7,000 patches in the Linux kernel. So, a short overview of what we will hear today. So the first thing is a short introduction into what RAUC is for those who are not that familiar with it, but very, very short. Then we talk about the bundle format, because this is a crucial development and the base for all the further features that are listed here. So the first thing I will talk about then is bundle streaming. Then we will hear about adaptive or delta-like updates and how to encrypt our bundle, I will give a short outlook on recent development about app updates, and at the end, we have a short look into what's coming next on features and what's in the ecosystem. So, yeah, a typical over-the-air field update scenario could look like this. We have here our server. The server builds the image that we want to deploy to the target.
We create an update artifact from it, sign it, upload it to our deployment infrastructure, and then we have the individual update targets here that download the update and install it. And there's also still this conventional, not-so-over-the-air use case of, for example, using a USB stick. So what RAUC handles is basically two parts. The first one is the creation of the update artifacts, the signing, verification and so on, and the second is the actual installation, the fail-safe installation of the updates on the target. So, yeah, basically RAUC is an embedded Linux update framework, so it handles the fail-safe and atomic update of A/B systems, so redundant systems where you are running from the active partition, and when you update, you write your update into the inactive partition. Once you're done, you switch in the bootloader to the formerly inactive partition, reboot, and everything is fine. RAUC is basically two parts on the target. It's the service that runs there and handles the update, and it gets its view on the system from the system configuration file. And, yeah, the artifact for updating we call in RAUC a bundle. A bundle consists of the images that should be installed. It consists of additional hooks or something like this, and a manifest that holds the description of, yeah, what these images are for, basically. So, it's written in C with some utility libraries to not reinvent the wheel for everything. It's licensed under the LGPL and hosted on GitHub. It was started in 2015, and I think the first release was in 2017. So, yeah, as I already mentioned, the bundle format is quite essential for the next things that we talk about here. So, let's first of all have a short look at the initial bundle format, because this was a motivation for changing the bundle format then. The initial bundle format was quite straightforward. It was just all the artifacts and the manifest packed together in a SquashFS.
We sign this SquashFS and append the signature to the end of the bundle. So, the verification is also quite easy. We just have to read the entire bundle and have to read the signature to be able to verify the bundle. So, yeah, this is also the downside. Even if we just want to access the manifest, we always have to read and authenticate the entire bundle. So, this is quite slow, and when it comes to over-the-air updating, it requires us to always download the full bundle before we can access any data in it. So, yeah, this is bad if we want to use it for streaming. So, this is why we introduced a new bundle format in 2020. And this bundle format is called the verity format, and it uses dm-verity. So, short intro: the device mapper system in Linux is a generic abstraction for, yeah, manipulating block devices. So, a device mapper has the same API as a block device has. So, for the upper layer, it looks like it's just talking to a block device. But below, it can manipulate the block device, authenticate data, or merge data together like we know from dm-linear, and there are several device mappers in the kernel. So, the one that we talk about here is dm-verity. It is basically an integrity protection method for read-only block devices. So, the rough concept is that you split the block image into several chunks and generate a hash for each. And over these hashes, you recursively do this again and again until you have a single root hash, and then you can verify each single block recursively up to the root hash. So, yeah, you have the data protection or integrity protection for read-only files. The verity table is just appended to the image. So, let's see how we use this in the RAUC bundle. So, we take our images here and first create the verity hash and the root hash. The verity hash is simply appended to the bundle, and the root hash is now placed in the manifest.
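The hash tree concept described above can be sketched in a few lines. This is a simplified illustration and not the dm-verity on-disk format: real dm-verity uses a salt, fixed-size hash blocks, and a superblock, none of which appear here. The function names, the block size, and the fan-out are assumptions chosen for the example. What it shows is the principle: hash every block, hash groups of hashes again until one root hash remains, and then check a single block against that trusted root without reading the rest of the image.

```python
import hashlib


def _digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_hash_tree(image: bytes, block_size: int, fanout: int) -> list:
    """Return all tree levels; levels[0] holds block hashes, levels[-1] the root."""
    blocks = [image[i:i + block_size] for i in range(0, len(image), block_size)] or [b""]
    level = [_digest(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent hash covers a group of `fanout` child hashes.
        level = [_digest(b"".join(level[i:i + fanout]))
                 for i in range(0, len(level), fanout)]
        levels.append(level)
    return levels


def verify_block(block: bytes, index: int, levels: list,
                 trusted_root: bytes, fanout: int) -> bool:
    """Check one block against the trusted root hash using the stored tree."""
    digest = _digest(block)
    for depth, level in enumerate(levels):
        if level[index] != digest:
            return False  # stored hash does not match what we computed
        if depth == len(levels) - 1:
            break
        group = index // fanout
        # Recompute the parent from the stored sibling hashes of this group.
        digest = _digest(b"".join(level[group * fanout:(group + 1) * fanout]))
        index = group
    return digest == trusted_root


image = bytes(range(32))
tree = build_hash_tree(image, block_size=4, fanout=2)
root = tree[-1][0]
```

Only the root hash has to be trusted, which is why placing it in the signed manifest is enough to authenticate every block that is read later.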
Then we just sign the manifest with an enveloping signature, which means that the manifest is the payload of the signature. And what this gives us is that the verification of the manifest is now quite easy. We just have to verify the signature and get the manifest content. And inside the manifest, there's also the root hash. This is then automatically trusted once we have verified the authenticity of the manifest. And then, we can set up dm-verity and use the hash tree appended to the bundle and the authenticated root hash. And then, each access to each chunk or block on the block device is authenticated by dm-verity in the kernel. And this allows you to have fully authenticated random access to your bundle. And you only need to verify the data at the time of using it. So, the next logical consequence is to implement streaming. So, up to now, RAUC was not so over the air. So, downloading means that we assume there is an external service like hawkBit, or an application, or an SCP that downloads the RAUC bundle to the target device. And then, with RAUC, we start installing it from the local storage. Well, the disadvantage of this is obvious. We have to have some extra space on the target where we can store the bundle. And the artifacts can, yeah, in a modern system become quite huge. And so, the approach is that we implement streaming or downloading in RAUC itself. And if RAUC is able to do this and directly write it to the target device that we update, then no intermediate storage is required. So, let's have a look at how this is realized in RAUC. So, first of all, what we do in RAUC is that we fork an unprivileged helper process, because RAUC, yeah, runs as root as it has to update the system. And you really don't want to use a root service to download data from the internet. So, it spawns an unprivileged helper. And this helper acts as a translation.
It presents a block device on one side and talks to the update server via HTTPS range requests on the other side. And, yeah, HTTPS range requests should be supported by all common web servers, also lightweight HTTP servers, and they are also supported by many content delivery networks. And if we combine this now with what we've seen with the access to a verity bundle, then we have fully authenticated random access to the remote bundle. And, yeah, since we can access it randomly, no intermediate storage is required. So, the next need when we are able to download things is normally that we want to save download bandwidth, because bandwidth is limited, expensive, or something. And the normal approach for this is to do conventional delta updates. It means you have two versions of your image on your host system, calculate a delta, and then you perform the update with this delta image on the target. So, if you have on your target the exact version that you calculated the delta for, this works very well. You can go here from version two on the target to version three. But if you now have a system that is on a different version, yeah, this fails, because the delta simply doesn't apply. So, it's an optimal diff. It allows very small updates. But, yeah, you are required to have access to the different image versions on the host, and you can only update step by step. So, from version one to version two to version three. So, in RAUC we've chosen a different approach, a more generic approach for optimizing the download. This is called adaptive updates. The concept behind this is that the bundle, or the manifest itself, provides a number of optimization options. So, with each option, there's normally some additional data connected that is stored in the bundle for optimizing the download. But, since we are able to stream the bundle, we don't have to download this additional data if we don't need it.
And then, it's the responsibility of the RAUC service on the target to see, okay, which of these capabilities do I support, which can I use, and which is the best one. And, there's always a fallback to use a full bundle download. So, you're always able to download the image you want to install. One adaptive method, a generic one, is the hash index. The idea behind this is that you split your image into several chunks, hash each chunk, and generate a hash list from this. And, for installation, you just basically do the same on the target. You take your target device, a block device, for example, you hash it with the same algorithm, and create the same kind of hash index. And then, for the optimization, yeah, you just download, first of all, the hash index that is stored in the bundle. And then, you compare it line by line with the hashes that you've calculated on the target. And then, you just need to download the chunks whose hashes differ between what's on your target and what's in the bundle. And this works both for the intended target version, but also, if you come from a completely different image, then you just have to download a bit more, because more hashes differ. For block devices, this is already implemented in the current RAUC version. And there are also plans to support this for file-based updates using rsync and offline generated checksum files. The next topic is bundle encryption. So, the motivation is, I think, quite clear. You will have some sensitive data in your bundle, and you want to protect it, because you have it on an unsafe cloud storage or an unsafe communication channel. So, in RAUC, we have implemented this in a two-stage approach. So, the first one is a symmetric encryption of only the payload. This is this part. This is what normally the build server already does. And this does not yet require access to the key material. And the second part is the individual encryption.
Then you can take the symmetrically encrypted image and encrypt it per recipient. You can just take one key and encrypt it for all your devices by using a shared key, or if you really want to do security, then you can also use per-device or per-recipient keys and encrypt the bundle for many individual recipients, many thousands. So, this again uses a device mapper, a different device mapper. Now we use dm-crypt. It's also quite simple. For the generation of the image we use for dm-crypt, we just split up the original image into equal-sized chunks, generate a random symmetric key, and encrypt each block individually. And the dm-crypt device mapper then just provides a transparent decryption of the image. So, if we access a chunk there, then, yeah, dm-crypt just decrypts the chunk we selected with the symmetric key, which is the same one used for encrypting. And if we combine this now in the bundle, so we have here the encrypted image and combine it with dm-verity, then we have a blockwise authenticated decryption. And since we have random access through the device mapper and the verity format, we also have the possibility to stream an encrypted update. So, short on time, a few notes about app updates. So far in RAUC we assumed, okay, the application is normally the one application. So, we assumed a somewhat monolithic system where the application is the one thing that the device should do. And so we said, okay, the application is normally either part of the root file system or you can have it in a separate slot. But it is actually, anyway, linked against the libraries that are contained in the root file system. So, it's fine to always install it together with the updated root file system.
The reality showed it's a bit different, and there are more and more demands for the capability of doing container updates, doing app-store-like updates, where you also have one vendor which provides a base system, which is rarely updated, and other vendors provide the applications, which are much more frequently updated, and additional data should be added there. And up to now we had no solution for this in RAUC and said, okay, then use RAUC for the base system and use another updater or update approach for these application or file updates. What we are working on, and this is in a quite early state, actually, is RAUC artifact updates. The basic concept behind this is that you have a slot for artifacts, and inside the slots we don't do image-based updates; what we do directly are file-based updates. And then we provide the same guarantees as we do for image-based updates. We ensure that the update is atomic, and we also support the case where we don't have any dependency of the app or the container on the base system, so this is what you basically see here. But we also support the use case of having a dependency on the root file system but the need to update your application more frequently and independently from the root file system. And together with rsync checksum files, the idea is that this again also supports streaming and delta-like updates. So just a very quick rush through the other features and community things. We've switched to the Meson build system recently. This is already merged. It wasn't when I started the slides. So a new feature we also have is adding custom metadata to the manifest that you can then access via rauc info or the D-Bus API for custom applications. And an ongoing development is also about providing more fine-grained progress, because currently we just have a per-slot progress, and if you have a large tar then you wait very long until the progress gets to the next step.
And a contribution that came from, or was started by, the community is the rauc-hawkbit-updater. This is basically an interface between the hawkBit deployment server and RAUC on the other side. It talks via the D-Bus API with RAUC, and this is a good example where the community started things and they then moved to the RAUC organization and are now maintained by the RAUC community. And with the latest version of rauc-hawkbit-updater we are also compatible with using streaming updates for hawkBit. And shout out to Leon, who is sitting somewhere here in the room. meta-rauc-community is a layer, or layer collection, started by Leon which provides some example integrations of RAUC into, for example, QEMU or for Raspberry Pi, and it's a very good starting point if you want to check out how to use RAUC, how to use all the features in RAUC. And yeah, I really recommend you to use this as a starting point. A final slide. For an open source project it's always hard to know who the users of your project are and where it's actually used. So it's always interesting for us to know this. One example where we became aware that RAUC is used is a very famous one. It's Valve's Steam Deck, which uses RAUC together with casync. Another example is the Home Assistant Operating System, which uses RAUC for updating the system, and the Eclipse Oniro project. And one thing that I also find very interesting is that the infotainment or information panels on the German ICE trains have a custom distribution they call Linux4ICE, and they also use RAUC for updating the systems. So this was very quick. Thank you for attending. I think we still have time for two or three more questions. Yeah, I think we have time for one or two questions. Yeah. Hi there. Thank you for that. That's absolutely intriguing, really interesting. So one of the questions was how do I plug this into BitBake and you've answered that. That's great. I know what to do when I get home.
The other was what's the granularity of this? I saw a sort of a 4K block size in there somewhere. In terms of your hashes and then downloading blocks through the streaming process, is that in 4K increments? How does that change? And what's the overhead in verifying those hashes as you download? What's the impact on performance and have you looked at any figures for that? Getting quite low. So the question was if the 4K is fine-grained enough for normal downloads. So it's currently fixed, but it could also be changed if that's not sufficient. But in the current approach, the 4K is a fixed size there. Okay, so it's getting late, so unfortunately we don't have time for any more questions, but don't hesitate to ask them in the Matrix chat or try to catch our speaker in the corridor. I'll be in front of the room. You can ask questions and we can discuss there. Thank you for a great talk. Thank you very much. Thank you very much. |
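Editor's sketch of the hash-index idea from the talk above, combined with the HTTP range requests used for streaming. This is a minimal illustration under stated assumptions, not RAUC's implementation: the function names, the use of SHA-256, the positional chunk-by-chunk comparison, and the `base_offset` parameter are all hypothetical. Only the 4 KiB chunk size comes from the Q&A.

```python
import hashlib

CHUNK_SIZE = 4096  # the 4K block size mentioned in the Q&A


def hash_index(image: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split an image into fixed-size chunks and hash each chunk.

    SHA-256 is an assumption; the talk does not name the hash algorithm.
    """
    return [
        hashlib.sha256(image[off:off + chunk_size]).digest()
        for off in range(0, len(image), chunk_size)
    ]


def chunks_to_download(target_index: list, bundle_index: list) -> list:
    """Compare both indexes entry by entry and return the chunk numbers
    that differ or that the target does not have at all."""
    return [
        i for i, digest in enumerate(bundle_index)
        if i >= len(target_index) or target_index[i] != digest
    ]


def range_header(chunk_no: int, chunk_size: int = CHUNK_SIZE,
                 base_offset: int = 0) -> str:
    """Build the HTTP Range header value for one chunk.

    HTTP byte ranges are inclusive on both ends. base_offset is a
    hypothetical position of the image inside the remote file.
    """
    start = base_offset + chunk_no * chunk_size
    return f"bytes={start}-{start + chunk_size - 1}"


# Example: only the middle chunk changed between the two images.
old_image = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE + b"C" * CHUNK_SIZE
new_image = b"A" * CHUNK_SIZE + b"X" * CHUNK_SIZE + b"C" * CHUNK_SIZE

needed = chunks_to_download(hash_index(old_image), hash_index(new_image))
for chunk_no in needed:
    print(chunk_no, range_header(chunk_no))
```

If the target holds a completely different image, `chunks_to_download` simply returns more chunk numbers, which matches the "download a bit more" fallback described in the talk.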
Matter and Thread as Connectivity Solution for Embedded |
Good morning, everybody. I'm happy to see such a large crowd here. I hope I can stand up for that. So, I will talk a little bit today about Matter and Thread as a connectivity solution for embedded. So, the agenda I have: giving a little bit of a scope for the talk, then giving you an overview about Matter. I don't know who is familiar with that, who is familiar with OpenThread and so on. So, we'll start a little bit with that. Then, how I'm using it, or what we are doing using it, on Yocto as well as on Zephyr. And then, also how we are using Matter on top of OpenThread. Talking about the more generic mesh capabilities that Thread is offering you and the border router that is needed to get it all hooked together. And then, about one more detail that has just been introduced, the multicast DNS discovery proxy, as well as the Service Registration Protocol. And then, how I tied it together for our use cases as a transparent gateway blueprint. So, I only have 20 minutes. So, if I'm rushing it a little bit, don't worry. The slides are available. There are also a lot more slides after that, as an appendix; you can look them up later on as well. So, for the scope here, my goal and my ideas have been: we need to do something for low power, and it should be wireless. I wanted to have IP version 6 end-to-end connectivity. I wanted to look at the power budget. So, having, for example, small sensor devices that can run on a coin cell battery for whatever, six months, a year, maybe two years, depending on the use cases and the usage. And having mesh capabilities, so that we don't have only direct connections, but are able to extend the network over time by adding more devices and so on.
And for the situation with the power budget, sleepy end devices that really only wake up if they are being interrupted for a specific use case, or only wake up for a short amount of time to query the parent to get data, that is also something I considered here. All the stuff I'm talking about here are obviously open source solutions. Thread as well as Matter have consortiums around them to do products and do testing and verification, as well as getting a certification and so on. This is definitely something different. So, if you want to build a product, you might need to pay for some specific parts, but all the software side or the engineering side is open source here, and that's what I'm going to talk about. So, Matter. Some people might have heard about it before. It was formerly known as Connected Home over IP, or CHIP, in short. It is part of the now so-called CSA, the Connectivity Standards Alliance. That, on the other hand, was called the Zigbee Alliance before. I think that is something that rings a bell with more people than CSA. They have an open source SDK for Matter. It's an application layer. So, you're basically like not doing any of these. So, it's built on top of IP version 6 and then does all the talk about how devices access, how the data is expressed and so on, what kind of device types are there. They call it like the language for IoT. I don't know if I sign off on that, but that's like the basic idea we're having here. So, the 1.0 release of the spec as well as the SDK was done in October last year. And one of the interesting parts, why it got so much hype in this industry, was that a lot of the big players sat together and are doing that here. So, getting groups like Google, Apple, Amazon and so on in one room and working together on this standard to actually try to get all the devices to talk together, even while keeping their own platforms. That is like a very interesting part.
And that could be something that a lot of smaller companies could take as leverage to get into these kinds of platforms, to be supported instead of working with each and every platform individually to get it enabled. So, you don't need to get your device to work with HomeKit and with Google Home or whatever Amazon has. So, you can just do it as a Matter device and then it should work in all of them. They also have a feature called multi-admin, which is something that would allow you to have, for example, you have an Android device and your wife has an iPhone or whatever, and both could control the same IoT devices in the same network, running the native platforms they are using. So, this is something where you can share the devices by using them on different platforms as well. So, Matter in Yocto and OpenEmbedded. So, the Matter SDK, the way they are building that and so on, I was not so familiar with that before I started it. So, they are using something called GN, which is short for Generate Ninja. So, basically it just does a whole run and then generates all the Ninja files to get it all built. And they have something called Pigweed, it's like their abstraction for how you get all the different kinds of vendor SDKs supported, and different cross-compile toolchains and so on. It is difficult to get that into something like Yocto or OpenEmbedded, which just focuses on the cross-compiling part here. So, that was quite a bit of work. So, we had to do a GN base class to get it supported and do a lot of workarounds here. But anyway, we got that part working. So, we have the core libraries building. We have the examples that are part of the SDK building. We have that all in our Oniro layer. But there's also a different layer from NXP, meta-matter, which does it a little bit differently. But in the end, you're getting the same result. You have the integration to run that on a Linux system.
So, that is all there. You can take that and then build on top of the library and take the example devices to do whatever you want there. On the Zephyr side. So, Oniro is a project that runs not only on Linux, but we also run it on the Zephyr side. So, we have multiple kernels there. So, we need to make sure that we have the integration part there as well. I have been working on a proof of concept to get Matter integrated as a Zephyr module. So, inside the Matter SDK, there is a platform abstraction for Zephyr which is based on two SDKs from, I think, Nordic and Telink. The SDKs are based on Zephyr as well. And they have a generic Zephyr abstraction now. And then we have the integration part where we have a CMake file and so on to hook that into the Zephyr build system. We are an external module. And you have a specific module YAML file that just tells where the CMake file is, how the build setup is and so on. And then you are setting things like OpenThread as dependencies. This is not ready yet. So, that's why I have no link here. I'm still working on that. I hope to get it ready, but as always it takes a bit more time than expected. But that's definitely something that could be interesting, to get that running on the Zephyr side as well. So technically, it's possible. I saw it working in the different other SDKs. But for our use case, we wanted to have it working with Zephyr upstream without any specific silicon vendor add-on and so on. So, this is why we are going with this approach. So, Matter devices. So, the device types that are available in the SDK coming from the 1.0 release are very limited. I think they only have like five device types specified right now. There are definitely a lot more coming in the upcoming releases. So, Matter is doing two releases per year, in spring and autumn.
So, there should be a lot more going on there. They are using the Zigbee Cluster Library. So, in case you are familiar with that from older projects or something. So, they are basing their device types and abstraction on that one, extending it a little bit and then using it. But it should be a very good base for your own devices. It doesn't mean it covers all the nitty-gritty details you maybe have on your product or something. But it could be a good entry point to cover the basic functionality, and then for details you might leave that out and have an out-of-band situation. Or you go to the Matter working group and work with them to extend that over time. As usual, the setup is the QR codes you might know from other devices already, like HomeKit and so on. You can also use a PIN, and then you have NFC that is upcoming, so that you can use that as well. So, in terms of connectivity layers, what they are supporting in the beginning. That is Ethernet and Wi-Fi. There is not much need to adapt it or anything. For Wi-Fi they are working on a better solution. So, right now it is a SoftAP setup if you want to bring a device in. But that means, for example, if you do the SoftAP with your phone, then you lose the connectivity to your normal data. So, they are working with the Wi-Fi Alliance to change that as a neighboring service. So, that is good. Bluetooth Low Energy is also available for device onboarding. It is not a connectivity layer they are using for data transmission; it is only for onboarding. And as I mentioned before, if your device supports more than the device type expressions they are doing, you either need to work together with them to extend that or you need to find a way to have that as out-of-band connectivity. But here comes the beauty of being IPv6. You have end-to-end connectivity to the device.
And if you, for example, have a mobile application or something that would control this, you could still do all the work with Matter to support the basic functionality and then hand over to the IP, to the end device over IPv6, and then control the device via an extended API you might offer on the device. But Ethernet and Wi-Fi are not really the ones I was looking into when looking at the power budget and designing devices that are really power-budget friendly and can run for like a year or so on a battery. So, I was looking into Thread for that. So, OpenThread is the open source implementation of the Thread specification. The Thread Group is the governance body again. Membership, you have to pay a fee if you want to get certified and so on. But you don't have to do that if you are just going for the implementation with OpenThread. It's BSD 3-clause licensed. So, that's all easy for you to integrate in your products and so on. It's mostly driven by Google, and formerly Nest, and it has a very established code base, already in products and running in the millions in the world already. So, Thread is a mesh network. So, what does it cover here? So, you have different types of devices that are part of the mesh network. You have full Thread devices, which are normally devices that have a good amount of power budget, being either line powered or having a big battery that they can operate on. These are often routers that can take the packet forward to another one and make sure that the whole data keeps flowing. And then you have router-eligible devices, which is something that will become a router if the mesh network needs them later on, to make sure that the data keeps flowing, or if they are in a corner of the mesh network where they need to increase the quality for all the other nodes available.
Or there could be just a simple full end device, which is just operating there, not doing any routing or something, but still being a full end device. And then for the power-constrained devices you have minimal Thread devices, which are minimal end devices, and they can be sleepy devices. So, basically they would spend most of their lifetime just being asleep, drawing as little power as possible, and only wake up if they are getting an interrupt, like when you open your window or something like that and you want to send the notification out for that, or you just have a short poll to a parent and ask if there is any data left for you. So, they are using 15.4 as the base layer, the PHY layer, and they have a functionality where you send out a packet to ask if there is any data available for you, and even in the ACK frame already you have a bit set where it says, oh, there is data waiting for you, don't go to sleep, call me again, and then you are getting all the data, and if not you can fall asleep again already. In the newer specifications you also have something like synchronized schedules, but that would mean you need a newer type of chips available; not all of them do that, so you would have to be on 15.4, the 2015 release, for that. So you need to find the silicon chips that actually support that and then you can get that running as well. Okay, I talked about the router things before: you have router devices, you have end devices, and then you have a leader of the Thread network.
This one is in charge of making sure that all the key material, for example, is distributed to all the networks, all the keys are getting rolled over if they are running out of frame counter and so on. It really makes sure that all the stuff is available and all the devices get the needed information. And then you obviously have a standby leader if the leader is running out of battery or someone tripped over the power cable or something like that. So you have all of that covered in the mesh functionality as well. And the other important device that is available in such a network is the border router, because you want to have these kinds of things obviously connected to at least your home network, maybe even to the internet, that's up to you, but you want to reach more devices than only within the Thread network, and that is where the border router comes in. I will talk about that a little bit later because that's also relevant to the Matter part for the integration. In terms of addressing, there are three different types of addresses. You have the link-local, what you can reach directly within your range, in your wireless receiving range or transmitting range, and then you have mesh-local addressing, which is available in the whole mesh network, and then you have the global addresses. It's all IPv6 addresses, and they allow you to individually target specific parts of the mesh and so on. I'm rushing through that a little bit because it's too much to go into all the details here in 20 minutes, but there is a little bit more in the slides. So in terms of the software they're having available there, there's the OpenThread core library, which is used for all of them, then you have abstractions for all the different silicon vendors integrating with their SDKs and so on, so you can see them all listed there.
If you have a specific device, for example, you could run that on bare metal as well, or you could go and run it, for example, as an OpenThread module on Zephyr, which is supported. And on the Linux side, they have two basic services that are running. There's the ot-daemon, which is basically only a full end device, which could operate as a normal end device in the network, and then you have the OpenThread Border Router POSIX, how they call it, that is the full border router setup that you would run on your Linux device if it's the border router. So, talking about all the power constraints you are having, there are two advancements that have been happening, driven by Matter mostly but falling back to Thread, to make it even more power efficient. So this is the multicast DNS discovery proxy and the Service Registration Protocol. So I talked before, the border router is like the central part here to shield the mesh network from the rest of the network, or the other way around. And if you look at a lot of the IoT devices you have maybe in your home or you know about, these are often from vendors where you have one specific hub for your specific device types and so on. Then you have the next hub for the other types and so on. This is all crowded and so on. And for the border router, this is often more a software component that can be updated in devices that are already available. So the 15.4 radios that are used for Thread are the same radios that are used for Zigbee. That means for all the hubs that you might have that already have Zigbee support, it is up to the hardware vendor if they want to change that over to a different firmware and then all the other software around it to make sure they can also support Thread. It's also possible to run both of them in parallel if you do multi-protocol on the firmware level, where you have Zigbee device support as well as Thread device support.
That's a bit more complicated but it is possible as well. One of the problems I saw when I worked with Thread the first time was that everybody needs to get another device being the hub and so on. But if you look around now, there's tons of devices already available that offer border router functionality. All the Apple devices like the HomePod, the HomePod mini, Apple TV, all the Google Nest devices, Echo and so on. All these things, if you have them in your house already, or people in your target audience have them in the house already, it's already sorted out. And then there's a lot more hubs doing the support as well. The new Ikea hubs have support for it. I think that the SmartThings hub is going to plan support for it and so on. Then a lot of the smaller vendors as well are coming out, hopefully over the years. So that means if you bring home an OpenThread device or a Thread device, you shouldn't be worried to get it onboarded as long as you have some of these. So it's not as easily supported as Wi-Fi, for example, right now, but it's getting good traction there. But coming back to the situation about the discovery proxy. So these kinds of wireless networks, they don't have any multicast support, right? So whatever you send in as a multicast there will end up as a broadcast in the whole mesh network. Which is obviously not a good thing if you're constrained in power and need to listen in and wonder what's going on. I mean, for the sleepy end devices it's not too difficult, because they would sleep and the parents would just discard whatever they have for them if it's not targeted directly at the specific device. But all the other ones would still draw on the batteries they're having to do that. So on the other hand, multicast DNS discovery is something that is very much used in the industry for all kinds of service discovery in networks. So we want to have that support as well.
So there's a component now that has been specified as an IETF RFC draft, which is sitting there and doing basically the proxy, what the name suggests, right? On the one hand, if you have Wi-Fi or Ethernet you do multicast DNS, and on the other hand you do a unicast DNS service discovery and so on. So that is basically proxying back and forth. Depending on the border router you're using, you're not forced to do that. So the end-to-end principle of IPv6 still stands, but it's an optimization you maybe don't want to miss out on. And so that is mostly covering the side where you have Wi-Fi and Ethernet flooding into the Thread network. But on the Thread side you also want to make sure that you announce the devices that are available and so on. That is the Service Registration Protocol, where you go and say, I have this service available as DNS. So this is what I'm offering here, and they can register on the border router service for that. So that would mean it would distribute that again as multicast DNS on the Wi-Fi and Ethernet side. So that is how I knit all the things together. So we have a blueprint, that is how in Oniro we talk about proof-of-concept demos we are doing, and we have a layer where we do all the integration parts here, all the OpenThread stuff. I upstreamed those already, they are in meta-openembedded (meta-networking), and the Matter stuff is still very much work in progress, as well as the Zephyr side. But if you're interested in that, I mean, I'm not hiding anything here. It's just not ready to show, but if you're interested we can talk about where these things are.
So this is an old example I had where I had just a Thread device being onboarded, so it could go ahead, have a secure code for this specific device, and then you even have an Android application to onboard it on the network and so on and do all the stuff, and then in the end, with all the components together, you have IP version 6 connectivity. And on top of that, with Matter, you would have real device abstraction and all kinds of platform integration and so on. And with that I'm done, and I should have a few minutes for questions. Hi there, thank you very much for that. We're currently looking at developing a Matter device. So what I'm trying to understand is, if I buy a Matter device, it might be a Matter device that supports Wi-Fi or it might be a Matter device that supports Thread over 802.15.4, which to my mind feels like it's going to be really confusing for people. And I'm asking, as we go to develop this border router, should we focus on supporting sensor-like devices that are Wi-Fi devices or 802.15.4 devices, or is it not that simple? I think it's sensible to make sure that you have at least a 15.4 radio in the device, because I think all the sensor devices you will see coming are most likely to use Thread. Because just power budget wise, I mean, at least that is the feedback I got in the working groups. Excuse me, if you're moving in or out of the room, can you please do that quietly whilst we're doing the Q&A and try and keep the noise down to a minimum, because it's getting very difficult to be heard here. So please be considerate to others. Okay, so I think you really need to make sure that you have that available, because all the companies that are looking into that, if they want to be conservative in power or something, they are definitely going for that. So, I mean, if you do the hardware setup, make sure you have the radio available.
If you enable that by default from the beginning, it's up to you, right? I mean, you can always have the firmware available and then ship the device and then enable it later on. I've seen tons of devices doing that, but having it available in the hardware, I would not sacrifice that. It's like, yes, it's a few euro maybe, depending on the volume and so on, but I'm pretty sure that most of the devices will come with that. So okay. Hey, I've got a question from online here. I'm over here. Online we have a question. What's the rationale for a non-router end device if it doesn't have any power management requirements? Can you repeat that? What's the rationale for a non-router end device if it doesn't have any power management requirements? Okay, I mean, the thing is it could be a router. Normally in that case it would be a router-eligible device, but maybe you don't need all the routers available in your network, right? I mean, if you have a mesh network, and depending on how the topology is and maybe your house or how the environment is, basically you have maybe enough routers available at that point. So all of the full end devices can do that, but you might not choose that. Okay. Okay, any more questions? Yes sir. Hi. So one of the most controversial parts of the spec. Can you speak a little bit louder? Sorry, it's very noisy. So one of the most controversial parts of the spec when it was released was that they were talking about authentication onto the network, like onboarding via blockchain. Can you discuss that a little bit? Okay, so I think what you are referring to is the distributed ledger. So one of the ideas that is floating around in the Matter working group is the distributed ledger where you can authenticate the devices, so you're basically not getting fake devices in the network and so on.
That's definitely something that could be problematic if you want to build your own device in your own home, for example, and get it onboarded. I still have to see if that is really enforced or not. That really depends on the platform and so on. How are you using that together? Yeah, but I need to hear; some microphone or something. Have you managed to get a DIY device onto an actual Google Matter network yet? Not on a Google network, no. I was able to get it all working together on my own setup, but I didn't work against the platforms and so on. Yeah, that sort of seems like one of the problems. Well, we have to see. I wouldn't rule it out right now, but I can't confirm that it's possible; it really depends on how you do it. But it's definitely a concern, you're right. It's possible? Okay, so someone here said it's possible. Okay, thank you everybody. Okay, thank you very much. I want to check the chat room. There's one more question in the chat room you could answer. Okay, I will have a look. Thank you. |
Developing Bluetooth Mesh networks with Rust |
Hey everyone, thanks for coming. Today I would like to talk a little bit about Bluetooth Mesh and what we did in the Rust ecosystem to support it, both on the embedded and on the Linux side. It's a good continuation of the topic that we had in the previous session, because it's a little bit comparable. There's a lot of material and little time, 20 minutes, so what I will give you today is a lot of teasers and a lot of pointers, and I hope you'll get interested and follow the links to investigate things further. So let's get started with what Bluetooth Mesh is. Bluetooth Mesh is based on BLE, so Bluetooth Low Energy technology, but it's designed to create a mesh network of devices on top of it, meaning that you should be able to connect nodes or devices directly in dynamic hierarchies, basically. What's different between Bluetooth Mesh and Thread, for example, is that Bluetooth Mesh doesn't use any routing. It's based on the managed flooding principle, meaning that a device will try to publish messages to all the devices in range, and those devices will then figure out what to do next with the messages they have received. And it supports the publish-subscribe model, as we will see in a minute. So this is basically how it looks, and it's similar to what Stefan showed us with Thread. We have regular nodes that can send and receive messages. We have relay nodes, which are only there to extend the range of the network, so they're just relaying things that they're receiving. And in a similar fashion to Thread, we can have low power nodes that are mostly sleeping and not active, and which are accompanied by friend nodes, which will buffer messages addressed to these low power nodes.
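To make the managed flooding idea concrete, here is a minimal sketch in Python rather than Rust, purely for illustration. It assumes a simplified model that is not taken from the talk: every node remembers messages it has already seen, only relay nodes retransmit, and a time-to-live (TTL) counter limits how far a message travels. The function name `flood` and the example topology are hypothetical.

```python
from collections import deque


def flood(neighbors, relays, source, ttl):
    """Simulate managed flooding and return the set of nodes that got the message.

    neighbors: dict mapping a node name to the nodes within its radio range
    relays:    set of nodes that have the relay feature enabled
    ttl:       TTL the source puts into the message
    """
    received = {source}                # doubles as the "already seen" cache
    queue = deque([(source, ttl)])     # (transmitting node, TTL in the message)
    while queue:
        node, message_ttl = queue.popleft()
        for peer in neighbors[node]:
            if peer in received:
                continue               # duplicate, dropped by the message cache
            received.add(peer)
            # Only relay nodes retransmit, and only while the TTL is 2 or more;
            # the relayed copy carries a TTL that is one lower.
            if peer in relays and message_ttl > 1:
                queue.append((peer, message_ttl - 1))
    return received


# Hypothetical topology: C - A - R1 - R2 - B, where R1 and R2 are relay nodes.
NEIGHBORS = {
    "A": ["R1", "C"],
    "C": ["A"],
    "R1": ["A", "R2"],
    "R2": ["R1", "B"],
    "B": ["R2"],
}
RELAYS = {"R1", "R2"}

if __name__ == "__main__":
    print(sorted(flood(NEIGHBORS, RELAYS, "A", ttl=3)))
    print(sorted(flood(NEIGHBORS, RELAYS, "A", ttl=2)))
```

With a TTL of 3 the message crosses both relays and reaches B; with a TTL of 2 it stops at R2, which receives it but no longer retransmits it. No node needs a routing table, which is the point of the flooding approach.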
The stack looks something like this. As I said, we have Bluetooth Low Energy as the base layer. There's a networking layer that's responsible for creating networks, exchanging keys and all those kinds of things. And then we have an application layer, which is completely defined in Bluetooth Mesh, meaning that all the models are predefined and we can use them, as we will see now. So as I said, the models are defined. For example, things like sensors or on-off switches are defined as models at the application level of the mesh, and we can have a client and a server model, meaning that the client and the server exchange messages and communicate like that. How it works then is that each device, each node, can have multiple elements, and those elements can have multiple client or server models, and each element has its own unicast address that can be used to address an element within the device. We can also create more complex hierarchies by defining group addresses and virtual addresses, which gives us a way to create more complex topologies and to have the full power of publish-subscribe architectures at the mesh level. And every device is part of a particular network and a particular application within that network, meaning that all the messages exchanged between the devices are double encrypted, with the network key and the application key. To onboard a device onto the network we need to go through a provisioning process, meaning that we need a special node that behaves as the provisioner of the network, and that node is responsible for creating and managing the keys, setting the addresses and things like that.
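The address types mentioned above can be sketched as follows. This is an illustration in Python, not code from the btmesh crate; the function names are hypothetical. It relies on the 16-bit address ranges of the Bluetooth Mesh specification (0x0000 unassigned, 0x0001 to 0x7FFF unicast, 0x8000 to 0xBFFF virtual, 0xC000 to 0xFFFF group) and on the rule that the elements of a node occupy consecutive unicast addresses starting at the primary element.

```python
def classify_address(addr):
    """Return the Bluetooth Mesh address type for a 16-bit address."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("mesh addresses are 16 bits wide")
    if addr == 0x0000:
        return "unassigned"
    if addr <= 0x7FFF:
        return "unicast"   # exactly one element
    if addr <= 0xBFFF:
        return "virtual"   # stands for a 128-bit label UUID
    return "group"         # any number of subscribed elements


def element_addresses(primary, element_count):
    """Unicast addresses of a node's elements, starting at the primary element."""
    addresses = [primary + index for index in range(element_count)]
    if any(classify_address(a) != "unicast" for a in addresses):
        raise ValueError("all elements of a node need unicast addresses")
    return addresses


if __name__ == "__main__":
    for value in (0x0001, 0x8000, 0xC000):
        print(hex(value), classify_address(value))
    print([hex(a) for a in element_addresses(0x0100, 3)])
```

A provisioner would hand out the primary address; a publication to a group or virtual address is then what gives the publish-subscribe behaviour described in the talk.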
So what are the use cases? On top of Bluetooth Low Energy we can have extended range and more flexible topologies, and we can also reuse existing hardware. So this is just another application that can be deployed on existing Bluetooth Low Energy hardware, but with more flexible topologies and with an option to connect a larger number of devices. I don't want to go too deep into this because it's probably a session of its own. We heard a lot about Thread here, and this is just a small comparison between all the competing technologies in the space and their respective solutions at all the different layers. When we started playing with this we had one goal in mind, and that was to create a full stack, meaning that we can create applications based on Bluetooth Mesh that cover the full stack, going from embedded microcontrollers to Linux and with support for these applications in the cloud, and to try to do all that in Rust. We will talk about that a bit more in a moment. The idea was to create a platform that makes it easy to build these applications both on devices and on the cloud side, but also to ease the management of Bluetooth Mesh networks. But before we dive into what we did in Rust, let's go a little bit through the current state. On the embedded side, Zephyr is the only thing that I found in open source that had support for Bluetooth Mesh. Of course, all the vendors have their own support in SDKs that can be used out of the box. On the Linux side, everything related to Bluetooth is basically under the BlueZ project. BlueZ defines the D-Bus APIs for communicating with the Bluetooth daemon for different kinds of things, and of course they have a mesh API over D-Bus as well, and it's used to send messages between the Bluetooth daemon and the applications that want to talk Bluetooth Mesh on the Linux side.
But the daemon is different, so if you want to use the mesh on a Linux box you need to install a different package, and basically disable and stop the regular Bluetooth daemon and enable the specific Bluetooth Mesh daemon. There's also a provisioner tool included, which is called mesh-cfgclient, and it's an interactive tool that allows us to do all the provisioning things: create new networks, scan for unprovisioned devices, add those devices to the network and create addresses for their models. One of the downsides of this tool is that it's too interactive, so it's not that easily scriptable, and that makes it hard to create the reproducible networks and environments that you want. And the final question is then how we create the applications on the Linux side that will do this, and there are even fewer examples of that on the net. All that I could find when I started looking into it was some of the Python examples from the Bluetooth white papers, and basically those are just simple Python applications that use the D-Bus interface to communicate with the mesh daemon. So coming from this kind of state, the end goal that we set was to see how far we can go with this tech and to try to implement most of these things in Rust. And now the question is, why Rust, of course. We found it to be a very good solution for systems programming. It's statically compiled and strongly typed, which means that it lets us create performant and safe programs without introducing runtimes and VMs, so again, very suitable for systems programming and for this kind of application. And finally, it's a fairly modern language with a lot of good tooling. So people coming from other areas, for example, I don't consider myself an embedded programmer, but I feel much more comfortable playing with Rust for these use cases than I would be if I tried to do the same thing in C. So, yeah.
So the first thing we did was to create the btmesh crate. It's a basic crate where we try to implement all the traits that are needed for implementing the Bluetooth Mesh specification. So as you remember all the layers of Bluetooth Mesh, everything needed for representing the application models or the networking layer traits should be defined in this one crate, and as you will see, we will be able to reuse that in all the different layers of the stack. But in order to be able to reuse it in the embedded space, this crate needs to be no_std, meaning that it shouldn't rely on the standard library. And this is a kind of go-to example to show how the sensor data representation could look as defined with the btmesh crate. So, embedded Rust. How many people here are using Rust today for embedded programming? Let's go. So what's the goal here? There's a Rust Embedded working group that is dedicated to this task, and its goal is to enable people to write firmware using Rust, firmware targeted at microcontrollers with small RAM and ROM capacities, without an operating system and without a memory allocator. As I said, we have only 20 minutes and there are a lot of things, so I'm just giving you the pointers. There's a lot more to be said about embedded Rust, but we don't have that much time. And the next cool thing about doing embedded with Rust is that it enables you to do quite modern programming things, even for firmware. So there's a project called Embassy, which allows us to use async programming for firmware. It provides a scheduler and the hardware abstractions that we can use to build quite capable asynchronous applications in Rust, and it has hardware support for all the major hardware platforms today. On top of that, the project that we are involved in builds on top of Embassy and tries to add more IoT things on top of the basic embedded development.
So communication with the cloud in terms of MQTT or HTTP, trying to support use cases like Bluetooth Mesh, and trying to create more advanced applications like OTA firmware updates. And you can see here one of the examples from the workshop that we did, which I'll mention later on. It shows, for example, how we can use the btmesh crate in the firmware so that every time we read the sensor data, we can package that sensor data in the proper Bluetooth Mesh sensor message and send it over Bluetooth. Then on the Linux side there's a project called BlueR, which is part of the official BlueZ project and which tries to implement the Linux Bluetooth protocol stack in Rust. At the moment it provides support for all the major features of Bluetooth, like GATT or Bluetooth Low Energy. What we try to do here is to provide support for Bluetooth Mesh in a similar way to how the rest of BlueR works. So again, a nice thing about Rust is that you can use a lot of crates and existing technologies that are there for different use cases. For example, BlueR uses the Tokio runtime, a very frequently used runtime for building all kinds of server applications in Rust, and it communicates with the mesh daemon using the dbus crate. The good thing, and that was part of the plan, is that we use the btmesh crate here as well for the mesh traits that we need. So this is the quick architecture of how things work on Linux; I hope you can see it well. We have a mesh daemon which communicates directly with the devices. It has its own state in the mesh config and the mesh storage volumes, and it communicates over the system D-Bus with arbitrary applications, be it the gateway application or some device simulator on Linux as well.
But the good thing that you can see here, and this is one of the things that I personally like a lot about using Rust for these use cases, is that this code running on Linux looks pretty much like the code running in the firmware. So here we are receiving the Bluetooth message, we are parsing it, we are creating JSON out of it and sending it over MQTT to the cloud. But, you know, it's much easier for a single person to jump back and forth across the different stack layers, using similar crates and a similar code style, than it would be if we went from writing C for the firmware to Python code for the gateway and then to something in Java in the cloud, for example. So the mesh support has not officially landed in BlueR yet, and this is all my fault, due to my laziness and other priorities. But hopefully this PR will be merged in the coming weeks, let's say. The final part of the project that we have been building is a kind of IoT-friendly cloud platform, again done in Rust. Here we try to provide all the services that your typical IoT application needs: being able to handle a lot of connectivity, having a capable device registry, and being able to integrate further into cloud applications, using digital twins and all these kinds of things on the other side. But again, I'm coming back to the same thing. There's a thing called a payload converter in the cloud that can intercept the messages coming from the gateways. If you remember, in the previous example we already parsed the Bluetooth messages and sent them as JSON. But if your gateway is sending just the raw bytes, you can do that step in the cloud, again with the same crates and with very similar code. So we parse the bytes, get the message, do some JSON processing, and forward that message deeper into the cloud. So we were playing with this for a while, and then there was a chance to actually put all of this to work.
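The parse, convert and forward step of such a payload converter can be sketched like this. It is an illustration in Python, not the Rust code shown in the talk, and the 4-byte payload layout (source address followed by a temperature in hundredths of a degree) is invented for the example; it is not the Bluetooth Mesh sensor status encoding. The function name `convert_payload` is hypothetical.

```python
import json
import struct


def convert_payload(raw):
    """Turn a raw gateway payload into a JSON document for the cloud side.

    Assumed (made-up) layout, little-endian:
      bytes 0-1: unsigned source address of the element
      bytes 2-3: signed temperature in 0.01 degrees Celsius
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(raw))
    source, centi_degrees = struct.unpack("<Hh", raw)
    message = {
        "source": "%04x" % source,
        "temperature_c": centi_degrees / 100,
    }
    return json.dumps(message)


if __name__ == "__main__":
    # 0x0102 as source address, 21.50 degrees Celsius
    print(convert_payload(struct.pack("<Hh", 0x0102, 2150)))
```

The same function could run on the gateway before publishing over MQTT or in the cloud as a converter, which mirrors the point made above that one parsing routine can be shared between the layers.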
At EclipseCon, we had a hackathon and a workshop where we tried to cover the whole area with a Bluetooth Mesh network, provide micro:bits for people to play around with, and provide some basic applications in the cloud that would all talk to each other. The big-picture architecture looks like this. We have a public sandbox for our Drogue Cloud, consisting of Kafka and all this kind of stuff. We brought a gateway based on BlueR. We provided some examples of how to use the micro:bits with embedded Rust, Drogue Device and Embassy, and provided a couple of applications that talk to the cloud using a WebSocket in the background. So just to recap how this architecture looks in the firmware: you have a couple of layers, with Embassy and the Bluetooth radio at the bottom. Then we have Drogue Device and the btmesh support next, and on top of that we can write our own application that does things with these messages. On the gateway side, we implemented the gateway using BlueR. And we also tried to use some of the, so to say, latest edge technologies to deploy and manage those gateways: MicroShift, which is Red Hat's version of a single-node Kubernetes cluster, paired with Open Cluster Management to deploy these gateways to the appropriate nodes. And I must say, to my surprise, it all worked pretty well. We had four or five gateways based on Intel NUCs and some Raspberry Pis. Because the Raspberry Pis didn't run Kubernetes, we used plain Podman and the Docker images to run the gateways there. And that provided very good coverage of a very large space. What we needed to do was to provide a couple of relay nodes, which you can see on this other picture, just to extend the range over some longer corridors that were there. But everything worked pretty well from this perspective. So that's all I have to cover today. As I said, there are a lot of teasers.
We didn't get into anything too deeply. But these are the communities. So hit us up on the Drogue IoT Matrix channel. That's where we all hang out and are happy to talk about these things. If you're interested in Embassy, I would suggest taking a look at that, and at BlueR, hopefully with official btmesh support very soon. Thanks. Thank you very much. |
5 errors when building embedded systems |
It's good to learn from the errors of someone else, I would say. We all make errors, but if you can avoid making all of them on your own, that's a little bit better. That's why I'm going to share with you today a set of my favorite errors I have seen in embedded products. And if you have worked with me in the past, don't worry, I'm changing the details of all of the examples so that you cannot figure out which project it exactly was, okay? So no panic. But before we start, a disclaimer: I'm a security person, so I have my bias, okay? And now a task for you, an important task for you. Concentrate. Concentrate and think about an embedded product or project you have been working on. It may be something that you are working on right now, or it may be something you have been working on in the past. Concentrate. You have one? Keep it. We are staying honest with ourselves, because you count one point for every single error on my list that was in your project, okay? You stay honest with yourself. First one. Easy. Binaries in Git. When we are thinking about this, we probably would say... Your microphone is muted. So... I think... No, it's not muted. Okay. It's great. Okay. When we get to binaries in Git, what you think about at the beginning is that there's some beginner developer who got the application, compiled it, and then committed everything to Git, right? But it's not the whole truth. I have seen binaries in Git for different reasons too. One important example: firmware, for a big project. And when I started talking to the team about why we have that binary in Git, the answer was: but it's hard to compile. You need that toolchain, that distribution version. Then you have to patch this and that. So it was too complicated. They just put it in Git. And on the to-do list: we are going to compile it later. But later took some time to arrive, right? My suggestion: if you are thinking about putting binaries in Git, first think. And then what you can do is, at the minimum, put in a script that compiles that binary.
At the maximum, in your CI, which of course you have, put a separate job that does all the complicated work to compile that firmware binary, or whatever it is. If this binary can be compiled from source, that makes sure that Alberto, who is here (for people who don't know Alberto, you should know him), won't be crying when he audits your repository for license compliance. And for me, as a security person, when I see binaries in Git, I tell myself: what do we have here? Probably five-year-old versions of everything, with all the CVEs from the last five years. Great. Try to avoid binaries in Git, except if you really know what you are doing. But really know what you are doing. Okay, forgotten dependencies is number four. Do you know what you have in your project? Really? No? Yeah, yeah. Not knowing what you have in your project quite often happens with embedded projects that use one Git repo and copy everything into it: libraries, configuration files. And then after 10 or 15 years, nobody knows what is in there. But it may also happen when you are using more advanced systems like Yocto, because there are quite few people who look into the Yocto dependency list to figure out what they have in their build. And when they do look for the first time, they start shouting and running away. A test for you: in your project, the same project that you are honestly counting points for, how many OpenSSL versions are there? Zero? Are you really sure there's zero? Okay, we are going to audit it. That could be fun. One copy? Yeah, there are some people for whom it may be this one. Okay, let's go forward. Less than three? More than three? Some people think that maybe more than three. And I think most people are not really sure. Okay, how many people are not sure? Yeah. And it's not only OpenSSL. For a security researcher, OpenSSL is the thing that you need to update frequently. But there are other libraries like that.
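One rough way to answer the "how many OpenSSL versions" question for a single source tree is sketched below in Python. It assumes that vendored copies still ship their `opensslv.h` header, which defines `OPENSSL_VERSION_TEXT`; copies that exist only as prebuilt binaries, or forks with renamed headers, will not be found, so a count of zero proves nothing. The function name `find_openssl_versions` is hypothetical.

```python
import os
import re

# Matches e.g.:  # define OPENSSL_VERSION_TEXT "OpenSSL 1.1.1k  25 Mar 2021"
VERSION_RE = re.compile(r'#\s*define\s+OPENSSL_VERSION_TEXT\s+"([^"]+)"')


def find_openssl_versions(root):
    """Map each opensslv.h found under root to the version text it declares."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "opensslv.h" not in filenames:
            continue
        path = os.path.join(dirpath, "opensslv.h")
        with open(path, encoding="utf-8", errors="replace") as header:
            match = VERSION_RE.search(header.read())
        found[path] = match.group(1) if match else "unknown"
    return found


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, version in sorted(find_openssl_versions(root).items()):
        print(version, "<-", path)
```

Running it over the repository gives a first list to discuss; a proper SBOM generator is still the better long-term answer.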
If you do not know what you have as dependencies, have a look and think about how you can improve here. And for those who have managers who do not understand why looking into dependencies is important, use the word SBOM: we are generating an SBOM. For those who do not know what an SBOM is yet, I assume that in 24 months you are going to learn that. The hard way. Number three. Number three is not considering, from the beginning, the vendor support for everything you use in your project. The classical example is support for a processor that is not very open source friendly or not completely up to date. But this is getting better. The example I would like to give you is an embedded product I was working with. They were using some quite specialized devices, good quality, the product itself was very good quality, with one asterisk. The chip itself was made by a company of three, including the people doing drivers. So of course the driver wasn't upstreamed when I looked into it, and it wasn't in a state to be upstreamed any time soon, with ifdefs all around the place in the code. They were very willing to accept patches, but you had to write all of them and test them yourself. My recommendation to everyone starting an embedded product: when you have the first list of components that you want to use, have a look at them and figure out how much it's going to cost to use that chip. Maybe choosing a different chip, even if that chip is a little bit more expensive or harder to get, is going to be less expensive in the end. Okay, number two: update added at the last minute. That is one of my favorites. Update quite usually has a pretty important impact on an embedded system. It quite often means that the flash size is too small, that the partitioning scheme has to be changed, that you need to change the whole boot process, and that you need to retest all of that from the beginning.
Legislation is lurking behind the scenes. If you are starting to work on an embedded project and an update system is not yet on the requirement list, it's good to have a look, because for some of you, what's going to happen is that just before the release, the management comes: we have a checklist here for you before we release, and on that checklist, update, SBOM. If you are not prepared, it may be a good idea to take a vacation before that. Now my favorite: developing an embedded system on the live system. My real example of that were people working on a system with a very expensive FPGA and very expensive peripherals, so they basically had one piece. As the team was small, they were all working on the same system. In addition, it was based on Ubuntu, so what they did was install packages, create symlinks because there was something they didn't want to compile, and change configuration files, and of course there was no single place where they documented it all. Then what happened when they started building the second prototype? That was a little bit complex. Why not develop on the live system? When you are prototyping, you do not know how it's going to work later on, or whether you are going to change the approach you are taking. What to do instead? In this case, DevOps. It's not just a catchy word to get more views of the video, it's really something that you can use. Use DevOps tools such as Ansible, for example, so that you have a script that deploys the system exactly as it needs to be, at the right moment, and keep the script in a version control system, so that you can work on it and update it during the system's life. We are getting to the end of my favorites list, and now I would like to do a check. How many of you have projects with five points? All five points? We have some. Okay. Congratulations for your honesty. Congratulations for your honesty to yourself. Yeah, that's the decision of our managers.
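The idea of recording the manual tweaks (symlinks, configuration changes) as a repeatable script can be sketched as follows. This is plain Python and not Ansible; it only illustrates the property that tools like Ansible rely on, namely that each step describes a desired state and changes nothing when that state is already reached. The helper names `ensure_symlink` and `ensure_line` and the example paths are hypothetical.

```python
import os


def ensure_symlink(link, target):
    """Make `link` a symlink to `target`. Return True only if something changed."""
    if os.path.islink(link) and os.readlink(link) == target:
        return False                     # already in the desired state
    if os.path.lexists(link):
        os.remove(link)                  # replace a stale link or plain file
    os.symlink(target, link)
    return True


def ensure_line(path, line):
    """Make sure `line` is present in the text file at `path`."""
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False                     # nothing to do on a second run
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return True


if __name__ == "__main__":
    import tempfile

    workdir = tempfile.mkdtemp()         # stand-in for the prototype's filesystem
    link = os.path.join(workdir, "libfoo.so")
    config = os.path.join(workdir, "app.conf")
    for run in (1, 2):
        changed = [ensure_symlink(link, "libfoo.so.1"), ensure_line(config, "debug=0")]
        print("run", run, "changed:", changed)
```

The second run reports no changes, so the same script can be kept in version control and replayed on the second prototype instead of being reconstructed from memory.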
I expected to do a little bit of explanation on how to explain this to managers, but I think that would be another talk. What I would recommend to you today: in a new project you are working on, take the list, choose one of the problems that happens in this project, and remove that single one for now. For quite many of them, talking about legislation, IP compliance, SBOMs and the like works with management. If you are not sure, talk to Alberto again. For some other cases it may be a little bit more complicated, but in my experience what helps is talking about legal and talking about cost. Maintenance has a cost. If you choose something that is hard to maintain, it's going to be expensive, and putting it in terms of company finances helps. I hope that was helpful for you, that you have learned something, that you learned some techniques. And now I have planned some time to get a little bit of feedback from the audience. We have a question here. Chris is on the other side. In the front row. Thank you. Thanks for the talk. Your first point was binaries in Git. When we are developing an embedded system and compiling a firmware, what's a good solution when we are not making releases, but in between, if we need to have access to the binary file and make sure that it's the latest version? What's the alternative to just putting the binary in Git? If I understand the question correctly, your question was: when you have a firmware in your product, you want to know that you always have the latest version? Yes. That's not a release, so I don't think it's doable with CI/CD. I can see two cases in such a situation. Either you are compiling the firmware yourself, or you are getting it from the vendor. If you are getting it from the vendor because there's some feature they have added, that's a little bit more complex. In this case, you don't really have an option.
If you are compiling it yourself and it's hard to compile, I prefer to have a separate build stage for the firmware itself. You may have different branches for the firmware, and you use every single dependency from a different build stage when you are using multi-stage CI. We can chat. Maybe I'm not as advanced as you are. Maybe we can chat about the details of setting it up later. Any other questions? We have someone in the middle, in the front. Yes, thank you very much for the presentation. I wanted to ask: if you have a product which is really long running, like several years, then regarding this vendor support for hardware components, sometimes on our project some of these components are reaching end of life. Is there a strategy or something like that to anticipate this kind of scenario, where your product really has a long life cycle and you have to think about what happens if some of your hardware components reach end of life? Unfortunately, the mic level wasn't great, so I'm not sure I caught everything. If I summarize what you have said: you have an example of a project using components that may be reaching end of life, and you want to support it for a very long time. So what to do in this case? It depends on whether it's about drivers or about other components. If it's about drivers, drivers in Linux get removed really, really late, so normally the driver should still be there in the latest system. There may be some changes that are not exactly compatible with what you are using. That's true. You may have a vendor BSP that they stopped upgrading, and when that happens, that's a big problem. One solution is to talk with the vendor, but if they do not want to understand what you need, I would probably try to create some abstraction layers, keep some parts on the older versions, and migrate the newer parts, the things that you can actually maintain.
Then in this case, it will depend exactly on the case, on the situation, on which component it is. It will really depend. Yeah. Complicated. Okay. I'm going to second. Thank you for your talk. What if you have to convince your colleagues to follow these practices? You put them in place, but management doesn't really care much about them. It doesn't enforce them. Okay. The question was how to convince the colleagues, even if the management is quite okay with those practices. What I use is a set of horror stories from my past: when people did it like that, six months later, what happened? People like developing new stuff. They do not like fixing old bugs and looking into history, so I use the argument that if we do it messy this time, then we'll have to maintain it, and it is you who is going to maintain that stuff. They have to get burned at least one time. That can help. Thank you. Okay. I think we'll be done now. Just one comment on one of the earlier questions. I think a good approach would be to look at the vendor and how they support Linux. I mean, look at one processor: it had 500 patches on top of a five-year-old kernel. And another vendor pushes everything to mainline, and you might want to think about who you want to choose. Absolutely agree with that. When I'm looking into which chip to use, I'm looking at the vendor's mainline support, and that's one of the criteria to start with, basically. I think it's a great point. I think we should all boycott vendors who don't have upstream drivers. Yeah, that's a separate topic. Just say no, okay? Thank you all. Thank you, thank you very much. |
WAM: an embedded web runtime history for LG webOS and Automotive Grade Linux
Introduction and retrospective |
Okay, let's start off then. So, first off, please. Thank you. Okay, now it's working. Okay. So, we're going to talk first about what WAM is. WAM stands for Web Application Manager. It's the LG webOS web runtime. It's built on top of Chromium nowadays. webOS is an operating system for embedded products. It's web-centric, so the idea is that web applications are first-class citizens, at the same level as native applications or even more prominent in webOS. The platform is built on top of Yocto. It uses Wayland for graphics, with QML for the Wayland compositor and Maliit for virtual keyboards. It has a unified media server. For IPC between applications, it has a JSON protocol that is named Luna. So, yeah, WAM is the centerpiece of the web experience in webOS. So, places where it was used: the HP TouchPad a long time ago, and there were some Palm phones where webOS was part of the Palm offering. Nowadays, it's a key part of webOS on the LG smart TVs. From 2013, that's the main OS used for those TVs, so we have hundreds of millions of users using this. Other products: we have implemented webOS and ROS integration, so we have some robot experiments, digital signage, some appliances like this fridge, and wearables like this clock. Since 2018 it is used in the webOS Open Source Edition. So, basically, it's a public distribution with open source versions of all these components. Nowadays it is also used on AGL, Automotive Grade Linux. It's the web runtime for AGL. So, there was a port of the LG webOS web runtime to AGL, which is not webOS. How does it work? So, this is the architecture. The reddish areas are what is implemented by Chromium. The orange part, which nowadays is also part of Chromium, is the integration with Wayland. Then there are the blue parts that are provided by webOS, which are the Wayland compositor and the IPC. And WAM is the green part.
So, it's built on top of Chromium, which handles running the web applications of the system in an efficient way. So, why would we want to put WAM in an OS? Basically, the idea is having good support for the web platform. The web platform moves fast, so if you want to stay up to date with the latest standards, you need something that implements the web standards and that moves as the Chromium baseline moves. It controls the application lifecycle. So, basically, when you run a web application, WAM takes care of running it, of closing it, and of saving memory, CPU, and GPU resources, so they are properly distributed across the system, saving CPU and battery when applications are not visible to the user. So, yeah, one of the great advantages is that this single runtime gives some performance improvements, because we are sharing as many resources as possible for running web content. The last one, launch time optimization, is also quite critical, because running a web stack is quite a heavy thing nowadays. So, being able to have things pre-launched and pre-warmed is quite important for a seamless web experience, where application launching is very fast and application switching is also fast. About security: well, it has all the web standards about how to run remote content and also local content through security origins. A security origin, basically, is the combination of the scheme, the host, and the port of the URL. We have some permissions declared in our application manifest, so we can determine which parts of the system a web application can use. And about developer tools, we have basically the same as we would have in a Chrome browser. So we have the Web Inspector in the developer tools, and we have Chromium tracing for performance analysis of the system.
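To make the security origin idea concrete, here is a minimal sketch in Python. It is not WAM or Chromium code; the helper names `origin` and `same_origin` and the small default-port table are assumptions made for illustration. It only shows that two URLs share an origin when scheme, host, and port all match.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes used in this sketch (illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # An omitted port means the scheme's default port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(url_a) == origin(url_b)
```

With this sketch, `https://example.com/a` and `https://example.com:443/b` compare as the same origin, while changing the scheme to `http` or the port to `8443` yields a different one.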
I think that one of the important things is that it's been running for a long time in millions of LG TVs and other devices. So, 10 years of experience: it's proven, it's running, it's stable, and it's been used for a long time. And now it's also part of the AGL reference platform, too. So we have here some links for the open source flavor of webOS, and all the components are the same as we have in the smart TVs: the ones related to WAM and webOS OSE. So basically, if you use webOS OSE, you can try all these components on your own devices. Okay, so let's move to the retrospective part. As was said, it's been 10 years. The main caveat here is that I joined the webOS project in October 2012. The history of WAM and webOS starts before 2008, but I will mostly focus on what I lived through and not on what happened before. Anyway, these are just lessons learned. I hope some of them may be useful or insightful for you. Okay, so a bit of history. webOS was developed by Palm, which was acquired by HP in 2010, and then in 2011 HP decided that they didn't want to develop any more webOS products. So the last device is a tablet, the HP TouchPad. But in 2012, a partnership started between HP and LG for porting webOS to the LG smart TVs. The idea was that webOS could be the basis for the future smart TVs offered by LG. They had something that was named NetCast before, and it was hard to maintain, and it was hard to keep moving it toward what was required for a future smart TV OS, because you would want more than one web application running at the same time and switching between them, which webOS was already providing at that time. So in 2013, basically, what happened was that LG acquired the business unit that had been Palm before, so basically the business unit that owned webOS. That business unit was renamed to LG Silicon Valley Labs at that time. And in 2014, LG webOS-based TVs were introduced at the Consumer Electronics Show in Las Vegas, and then the TVs were released a few months later.
I think it was April 2014 for the first LG smart TVs based on webOS. Okay, more about open source. As I said, HP decided to scrap all the new webOS products, so they stopped doing new products, and they published most of the source code as Open webOS. My feeling at that time, and nowadays it's the same, is that they opened it to attract interest in webOS, to attract investors. Very likely they were already considering selling the business unit. But when LG acquired the Palm business unit, they stopped maintaining Open webOS. So they went back to a non-open-source development model, and that was kind of a problem. But the work to port webOS to smart TVs and release it in products was very, very hard. There is something that sometimes is not very clear: when you have something like an open source project, some upstream, and then you want to get to release quality for a product for millions of users, it's not easy. There's a lot of work to stabilize, to mature things, even to pass the quality controls of the authorities. It's not very easy. So we all decided to focus first on having the smart TVs running webOS, and open sourcing again would be an afterthought after that. Usually that doesn't happen in the end. This kind of afterthought never happens, but in this case it happened. So in 2018, like six years after Open webOS was stopped, basically, LG released webOS Open Source Edition. The focus in this case was allowing people to take this to prototype ideas, do experiments, and make things around webOS, because basically you would have a UI, a user experience, and a way to integrate web content. Yeah, the idea is that students and independent developers would have a way to prototype and do things with that. So again, the idea is creating a community around it. So OSE is active nowadays. After five years it's still there, so it's not something that is going away.
The hardware target nowadays is the Raspberry Pi, currently the Pi 4 model. In my view, it simplifies testing new ideas. It allowed us to start things like integrating WAM and webOS parts in ROS for robotics and in AGL for automotive. So yeah, it was quite a success in that regard, and there are tons of experiments that have been happening that are quite useful for prototyping product ideas. So if you want to integrate a web UI in your experimental product, OSE can help. It's very easy to integrate web applications and web content, both third-party and even running locally. Okay. So how did the AGL part happen? We have webOS, and we have the Open Source Edition. There was the idea at LG Silicon Valley Labs that it would be interesting to port parts of the Chromium-based web runtime we have in webOS to AGL, so AGL would have a web runtime with all the advantages we talked about. So it has been a collaboration between LG and Igalia, and it was then presented to the Linux Foundation, so in the end it's a collaboration among the three. Before 2017, Igalia was assisting with porting the Chromium browser and its adaptation to Wayland, so the browser was running in AGL. But in May 2017, my team at LG Silicon Valley Labs did an experiment to port WAM to AGL. It was actually one month, and it was mostly working in two weeks. And then we have been maintaining it from 2019 to the present. Okay, the thing is that the experiment proved it was possible, and the focus then moved to being able to run AGL with only a web UI. Basically, the main UI at that time in AGL was written in Qt, and the idea was that if you didn't want to run on top of Qt, you could run on top of WAM and have the whole system UI and the applications there. It also allowed us to integrate with other applications and with the system services provided by AGL, and that's part of the continuous adaptation: as AGL evolves with different system protocols and system services, we need to evolve to catch up with that.
So that was what happened in the last four years: we did the WAM adaptation to AGL. Okay, another evolution. In 2012, we were using Qt WebKit. We moved to Qt WebKit 2 in the first two years, and the first webOS TVs were using Qt WebKit 2. Then we moved to Qt WebEngine. The idea was that everybody was moving to Blink, and it was not much more than that. We had this feeling that Chromium and Blink were moving faster and doing more for the web platform, so it would save maintenance costs to move to an engine based on Blink and Chromium. So yeah, we used Qt WebEngine. Qt already had a port, and that would save us time on that part. But from 2015, we moved to create our own binding layer and drop Qt WebEngine. So this webOS WebView is a new component that replaces Qt WebEngine, and WAM is now built on top of that. Why? There are a few reasons, but the main one was a concern about the licensing model of Qt WebEngine at the time. It was one of the first components that moved to LGPLv3. There were concerns about the patent clauses, and for a TV vendor that was kind of a problem. And not only for a TV vendor: we found several others that would have some concerns about that. So using Qt WebEngine was a problem at the time. In the end, we removed that dependency, at least on the web engine integration side. It also allowed us to simplify the continuous upgrades needed to track upstream Chromium, because at least the media stack in webOS is way different. So maintaining a different web stack on top of Qt WebEngine, on top of Chromium, and trying to keep up with that baseline and with upstream started to become quite hard. Then we did something different, and that's more recent. WAM was based on Qt for a long time, but there was concern from some stakeholders. Not LG, which has a strong partnership with Qt, so for LG it was not a problem, but for other stakeholders the LGPL dependency was kind of a problem.
And the other reason for using Qt diminished a bit, because basically C++ and the STL improved a lot and simplified things. So it was not that important anymore to have all the pieces that Qt was providing for free as part of the bundle. So last year, well, two years ago, we moved to not depend on Qt anymore, and now WAM is based on the STL and other C++ libraries for JSON parsing, and a bit of GLib for the main loop. We moved from qmake to CMake. Okay, but the thing is about stability. WAM didn't change a lot in the last 10 years, so the main ideas that we had running 10 years ago are still there: the way we handle running applications, et cetera. So it has been useful. The architecture has been flexible enough to adapt to the web engine changes we explained, to new products, and even to OS changes. So we've been able to port WAM to different OSes and different web engines, and it adapted very well. So, about the future: WAM is still here to stay. LG is spending lots of money on making it the main part of the webOS TV offering. They've even been allowing it to be used by third-party TV vendors. And there are some discussions about the future. We are now using GCC for building. We may move in the future to Clang, basically because it's the toolchain that Chromium uses upstream, so maintaining both toolchains is kind of problematic. And we want to improve the upgrade cycles. We want to be closer to upstream Chromium, so there is a lot of work happening nowadays to improve this as well. So these are the final remarks. Yeah, 10 years of the project, and more to come. It's in millions of products. It has proved to be useful. And it allows you to create product offerings. I mean, don't put this on an Arduino; it's not going to work. But for products with 512 MB, or even 256 MB, it's possible to provide a good web experience. So that's it. And thanks. These are the sponsors of the work, so it's important to show them. Thank you.
We have a few questions, starting with the online questions. The question online was: if I have an LG TV and I want to rebuild the firmware from sources, is that possible? I don't think you're going to like the answer to that one. No. LG webOS TV is a proprietary OS. I'd say that maybe around 10% of the software is proprietary, but you cannot build the firmware. And actually, the TV industry is quite bad in that regard, because of DRM and the requirements of the content providers like Netflix, Disney Plus, all these kinds of things. They want to have a strong hold on how the content is delivered, when the content can be delivered, how it is paid for, and so on. So it's not only a thing about LG. Samsung has the same problem. Other vendors have the same problem. It's that the industry that delivers content to the TV is quite problematic in that regard. That's something that I would like to see improve. Google has done a lot for at least reducing the burden that is related to DRM, but that's a problem of the TV industry that we need to deal with nowadays. So, we manufacture OEM boards, and over the years I've found it really difficult to recommend UX platforms to our clients. So nowadays, we're starting to look at Flutter. That seems to be something people are talking about. I'm really sort of not a UX person. So when would you say we should be recommending this and webOS as opposed to other alternatives? What are the sort of pros and cons, as it were? Actually, there are different lines. For the system UI, you can choose whatever you want. Flutter is quite efficient. Qt is quite good. Web is possible. The main thing is: do you need web applications on top of that? Because you may want to play third-party content, like a Twitter application that is web, like YouTube, things like that. If you need third-party content, you may also want to have a web runtime.
So you may have that need. You may still want to use web for the system UI as well. It's your choice. The tooling is a great advantage of web content, because there are tons of developers and tons of ways to do UI on the web, and it's pretty much a common standard for that. But yeah, it's a choice. Thanks, everybody. Time's up. Thank you. |
KUKSA.val Vehicle Abstraction
In-vehicle access to standardized VSS Vehicle Signals |
Okay, welcome to the next session. So I will be talking about KUKSA.val and in-vehicle access to standard VSS signals. I will tell you what it is. I hope I can convince you that it's a good thing and you want it, but let's see. I'm Sebastian. I'm part of the KUKSA project, and when I'm working, I'm working for ETAS on vehicle software. ETAS belongs to Bosch, if that is something that's more familiar to you. So yeah, this is about automotive software. So let's get started. We will start very high, you know, like sky high, and then I promise we go down to code. So there should be something for everyone. First thing: what's KUKSA? It's an open source software project, so I'm not totally at the wrong place here. First time for me, first time for KUKSA, I think. KUKSA doesn't stand alone. Actually, it's part of the Eclipse Software Defined Vehicle working group. So Eclipse is one of those many happy homes that exist for open source software. And the Eclipse SDV working group is basically a bunch of companies and people interested in automotive open source software, with a couple of other interesting projects besides KUKSA. But we're focusing on KUKSA. You could say Eclipse SDV shares the software-defined vehicle mindset. So what is that? I mean, later, just Google it. You will see that's basically the latest and greatest hype in automotive. So whenever somebody tries to sell you automotive software these days, they will probably put the SDV stamp on it. So what does it mean? I can only give you a very broad idea, because it's a little bit of a marketing thing. But what it normally promises is faster updates. I mean, today maybe you have a phone and you get a stream of updates for two years, and if it stops after two years you're pissed. If you have a vehicle, it's a bit different. Maybe, if you're lucky, after two years you get an update with some emergency patches, and yeah, you should feel pleasantly surprised about that.
And SDV is a little bit about making this better in cars, so that you have more of these apps and software functionalities added later, and not so much a big blob of firmware that we only update in emergencies. You often hear this term, "10 times faster development": not 10 times faster than you are, maybe, as IT guys, but 10 times faster than automotive is currently. Because a lot of software there comes, rightfully so, from the mindset of deeply embedded stuff, you know, embedded for real men and women: engine controllers, real-time issues. And there, of course, developing software is harder and a slower process. But somehow that has carried over to the higher levels of your vehicle, too. So even your infotainment doesn't update as fast as maybe your Android phone. And that is something SDV promises to change. Many companies are promising that. So that will make you as a user of vehicles more happy, developers more happy because finally they can write software for vehicles, and it gets cheaper, which makes corporations happy. So yeah, SDV is awesome. Just Google it; there are lots of press releases about it. So how do we do it? Well, the point is, in IT we can already do it. So it's pretty easy, actually. We take all of our favorite tech stacks we know and love, which are super productive. We put in a bunch of beefy hardware. Okay, we need a little bit more hardware, but that is an understanding that's coming to automotive: it might make sense to invest a little bit more in hardware to get cheaper and more productive on the software side. And we put it all in a vehicle, done. I mean, if we do it, it's pretty cool, right? Maybe you have your Kubernetes cluster in cars, and I'm not joking. People think about that and are doing that. So you can probably deploy WordPress easily, right? And if you put in good enough hardware, I think your Doom frame rate will be pretty acceptable. Of course, there are challenges, right?
I mean, it's a vehicle in which you drive around yourself, with your family, or things like that. So thinking about safety and security is probably a good idea, and we'll come to that a little bit later in the talk. But the problem is, if you put all this in your vehicle and you don't have any access to the vehicle's hardware, if you can't actually interact with the sensors and actuators in the vehicle, then maybe it ends at deploying WordPress, right? You can't do all these fancy automotive applications you have in mind. And that is sort of the challenge. It doesn't help you to just put all these tech stacks in there. Let's see, we're stuck here. Okay, so what does it mean to access vehicle hardware? The most interesting part of a vehicle is really signal-based, on a very low level, and you have maybe two kinds of interesting things: sensors (how fast am I going?) and simple actuators. You want to open the trunk, open the door, engage the wipers. Now you might think, no problem. I read about Linux. It has a CAN interface, that's automotive, I just enable it and off I go. But of course, it should be pretty clear that it's maybe not a clever idea to let anybody who deploys some software in my vehicle interact with all the bus systems and all the hardware in the vehicle, for safety reasons alone. And of course, you still want your vehicle to move. So that's probably not a good idea. The second challenge is just the way automotive software evolved in a vehicle. Let's say the serialization and the data formats are very much not standardized. So even for a simple concept like vehicle speed, how it's represented in bits, what the data type is, what the unit is, and things like that, all differ from manufacturer to manufacturer, from model to model, and from model year to model year. So that's a big pain if you want to write a piece of software that runs in more than just one vehicle.
Luckily, because we don't have so much time, challenge two is solved, or is in the process of being solved. There's something called the COVESA Vehicle Signal Specification (VSS); the link is at the end of the talk. So yeah, there's some homework for you, maybe. The point is that it's a very simple data model describing signals in a vehicle. It uses identifiers which are based on a tree. This is just an example; in fact, it's much more complex. But you can already see the gist of it, I think. Here we have a path like Vehicle, Chassis, Axle, Wheel, Tire, Pressure. So you already have an idea of what it is, or what it might be. And VSS then also defines what the data type would be and what the unit would be, maybe hectopascals. I don't know, don't quote me on that. And if you have that, and if your software stack integrates it somehow, then interoperability and portability get much easier. But that's a data model. It's actually a nice YAML-based file format that you can convert to whatever you like, but it's not live software. So how do you bring it to life? And the question is, where would you use this kind of nice abstracted model? Two things. Talking again about this really embedded layer: probably not. Because down there, where you have these super small microcontrollers, a few kilobytes of RAM, safety-critical stuff, maybe you can't or don't want to invest in this cost of abstraction. Because it's nice and neat, but it costs. Then, all the way at the other end, somewhere in the cloud backend systems that a manufacturer or a fleet operator might have, it might be a good idea to have an abstracted data model. And that is also the place where VSS is already in production today at several different companies. You can even go to AWS and buy a generic service for that, if you want. So there are solutions there, because that is where it first took root.
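The tree-based identifiers described above can be sketched in a few lines of Python. This is a toy model, not the real VSS catalogue: the dictionary `VSS_SKETCH`, the simplified tire-pressure path (spelled as it was spoken in the talk), and the data types and units are all illustrative assumptions.

```python
# A toy signal tree in the spirit of VSS. Branches are dicts of children;
# leaves are dicts carrying a "datatype" key. All metadata here is made up
# for illustration and is not copied from the VSS specification.
VSS_SKETCH = {
    "Vehicle": {
        "Speed": {"datatype": "float", "unit": "km/h"},
        "Body": {
            "Trunk": {"Rear": {"IsOpen": {"datatype": "boolean", "unit": None}}}
        },
        "Chassis": {
            "Axle": {"Wheel": {"Tire": {"Pressure": {"datatype": "uint16", "unit": "kPa"}}}}
        },
    }
}


def is_leaf(node):
    """A node is a signal (leaf) if it carries a datatype."""
    return "datatype" in node


def lookup(path):
    """Resolve a dotted path such as 'Vehicle.Speed' to its signal metadata."""
    node = VSS_SKETCH
    for part in path.split("."):
        if is_leaf(node) or part not in node:
            raise KeyError(path)
        node = node[part]
    if not is_leaf(node):
        raise KeyError(f"{path} is a branch, not a signal")
    return node


def leaf_paths(node=VSS_SKETCH, prefix=""):
    """Yield the dotted path of every signal in the tree."""
    for name, child in node.items():
        path = f"{prefix}.{name}" if prefix else name
        if is_leaf(child):
            yield path
        else:
            yield from leaf_paths(child, path)
```

Calling `lookup("Vehicle.Speed")` returns the data type and unit for that signal, which is the information an application needs in order to interpret a value without knowing the vehicle it came from.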
And the question is, on the way between the sensors, which are very deep in your vehicle, and the cloud, where would you do this transformation? Where would you move from these proprietary signals and very different variants to the standardized thing? And I think it's no surprise that in SDV the answer is: still in the vehicle, because you want to be better in the vehicle. But the point is, don't do it on a microcontroller; do it on a vehicle computer. I use the term very broadly, but that would be any computing unit in your car that actually has a processor and a real, let's say, POSIX-style operating system. So Linux, for example; for the QNX fans here, that would also work. But not microcontrollers: a microprocessor platform. And that is exactly what KUKSA.val is. KUKSA.val is an open source software component that can do just that: sit on a vehicle computer and provide access to the standardized signals. What does it look like in an architecture kind of view? Basically, you have something we call the KUKSA.val data broker. As you can see, that's like a small server that can run on your vehicle computer. It provides a gRPC-based interface, a network-based interface, that you can connect to. You can get, set, and subscribe to all these nice abstracted signals from the VSS tree. So it means the applications on top would be portable. They work on standardized signals. Of course, you need to get the signals in and out somehow, because down there we have the non-standardized signals. So there we have something we call a VSS provider, or VSS feeder, which is a term you'll often see in our documentation. And this is a software component that basically transforms the data from the proprietary form in a given vehicle to the standardized form as required by KUKSA.val and by VSS. Some important architecture decisions: the data broker is written in Rust and is pretty lightweight.
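As a rough illustration of what such a provider or feeder has to do, the following Python sketch converts two made-up raw encodings of vehicle speed into one standardized `Vehicle.Speed` value in km/h. The two vehicles, their payload layouts, and the function names are hypothetical; real CAN signal definitions come from the manufacturer and are not shown in the talk.

```python
import struct

MPH_TO_KMH = 1.609344


def decode_speed_vehicle_a(payload: bytes) -> float:
    """Hypothetical vehicle A: 16-bit big-endian value in units of 0.01 km/h."""
    (raw,) = struct.unpack_from(">H", payload, 0)
    return raw / 100.0


def decode_speed_vehicle_b(payload: bytes) -> float:
    """Hypothetical vehicle B: a single byte holding whole miles per hour."""
    return payload[0] * MPH_TO_KMH


# One decoder per vehicle variant; only this table is vehicle-specific.
DECODERS = {
    "vehicle_a": decode_speed_vehicle_a,
    "vehicle_b": decode_speed_vehicle_b,
}


def to_vss(vehicle: str, payload: bytes) -> dict:
    """Translate a proprietary payload into a standardized signal update."""
    return {"Vehicle.Speed": DECODERS[vehicle](payload)}
```

The point of the design is that only the decoder table differs between vehicles, while everything consuming `Vehicle.Speed` stays the same.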
The data broker does not have many features, so no history or things like that. Currently, it's less than four megabytes, statically compiled, you know, everything all in. And that's how we want to keep it, so that it could really run on even the smallest of these vehicle computers. There's also, and I'm not sure if we have old KUKSA.val users here, a version of KUKSA.val in C++, but I'm focusing on the Rust version for this talk. The architecture is the same. So what would it look like if you really want to write an application that can, let's say, open your trunk? Pretty simple, right? The user presses the button to open the trunk, and maybe you also want to see whether the trunk is open or not. In VSS terms it's pretty simple. You talk to the KUKSA.val data broker and say, I want to subscribe to the state of the trunk, so you can show this fancy graphic if it changes. If the user presses the button to open the trunk in this app, or somewhere else, you want to set the target position of the trunk. So you state the intent that you want the trunk to be opened. If it ends there, nice: you talked to a database, and nothing happens. So in the end, you need these two VSS providers. We talk about a feeder and a control service. The feeder would be the component that actually checks the state of the trunk and feeds it back to KUKSA.val, so everybody can get the updated state or subscribe to it. And the control service runs the other way around. If the target value is set, the control service is triggered, and then it can basically do something, right? I mean, you can write a CAN frame or use SOME/IP, some of these standards we have in automotive. I can make it a little bit more specific with these toy examples, because so far it's just PowerPoint engineering, and I think people sitting here want to see code, or at least want to see that it exists.
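The round trip just described (an app sets a target value, a control service acts on it, a feeder reports the resulting current value) can be sketched without any KUKSA software at all. In the Python below, `ToyBroker` is an in-memory stand-in written for this sketch and is not the KUKSA.val data broker or its client API; a marker file stands in for the trunk hardware; only the signal path `Vehicle.Body.Trunk.Rear.IsOpen` is taken from the talk.

```python
import os
import tempfile

TRUNK = "Vehicle.Body.Trunk.Rear.IsOpen"


class ToyBroker:
    """In-memory stand-in that keeps current and target values apart."""

    def __init__(self):
        self.current = {}
        self.target = {}
        self._target_subscribers = {}

    def get_current(self, path):
        return self.current.get(path)  # None means "not known yet"

    def set_current(self, path, value):
        self.current[path] = value

    def subscribe_target(self, path, callback):
        self._target_subscribers.setdefault(path, []).append(callback)

    def set_target(self, path, value):
        # Setting a target only states an intent; subscribers have to act on it.
        self.target[path] = value
        for callback in self._target_subscribers.get(path, []):
            callback(value)


def feed_once(broker, marker):
    """Feeder: report whether the (simulated) trunk is open right now."""
    is_open = os.path.exists(marker)
    if broker.get_current(TRUNK) != is_open:
        broker.set_current(TRUNK, is_open)


def start_control_service(broker, marker):
    """Control service: turn a target value into a (simulated) actuation."""

    def on_target(value):
        if value:
            open(marker, "w").close()  # "open the trunk"
        elif os.path.exists(marker):
            os.remove(marker)  # "close the trunk"

    broker.subscribe_target(TRUNK, on_target)


def demo():
    """Run one open/close round trip and return the observed current values."""
    marker = os.path.join(tempfile.mkdtemp(), "trunk_open")
    broker = ToyBroker()
    start_control_service(broker, marker)
    seen = [broker.get_current(TRUNK)]
    feed_once(broker, marker)
    seen.append(broker.get_current(TRUNK))
    broker.set_target(TRUNK, True)
    feed_once(broker, marker)
    seen.append(broker.get_current(TRUNK))
    broker.set_target(TRUNK, False)
    feed_once(broker, marker)
    seen.append(broker.get_current(TRUNK))
    return seen
```

Keeping target and current values separate is the important part: the application only learns that the trunk is open once the feeder has observed it, not when the request was made.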
So the point is, how would we make this feeder, and how would we make this provider? I'll show you examples in Python, for two reasons. First, Python is compatible with PowerPoint: I can fit everything on a single slide. And second, we have a very nice Python library in KUKSA.val that makes it easy. But as I said before, the interface is actually gRPC-based, and that you can generate in virtually every language. So if you want to write this in Rust or C++ or C or Go or whatever, you can just generate it. So, the feeder example; pretty simple. In this example, so that you can run it on your computer, we just simulate the vehicle access. Normally, there you would need some interface to CAN or some AUTOSAR API or something like that. So we just pretend that the existence of a file states whether the trunk is open or not. We just check for that. And if it changes, we set this Vehicle.Body.Trunk.Rear.IsOpen data point to the current state. And that is literally it. I mean, this code, if you get the slides, you can copy it, you can type it. You know, like in the magazines 20 years ago, where you could type in the code and run it. But this code will run. I didn't leave anything out for simplicity, except maybe error handling. But if you do everything right, it will run. The control service is basically the same thing. The control service is the thing where we listen for whether somebody actually wants to change the state of the trunk. And if we see somebody wants to open it or close it, then we enact the changes in the E/E architecture of the vehicle. And again, we simulate it via the file. And let me check; I think we are not so bad on time. I didn't dare to show it live. I'm not that, how to say, crazy, but I can show it to you semi-live. Same code you have just seen. So we start off by just starting the data broker, this component that will always run.
Then we use something called the data broker CLI, so you can just simulate what an app would do. I mean, it's a CLI for that. So you see, I will just query the state of the trunk. And it's not surprising that it says, I don't know, because we just started the KUKSA.val data broker; how should it know? We can now simulate feeding this value. It's again the CLI. We just pretend that the trunk might be open. So I set that. And it says OK, which is normally a good sign. And now if I query it again, OK, the data is there. So yeah, hell yeah, we can make a database. Now let's look again. That is exactly the code you saw before on the slide. We simulate whether the trunk is open or not depending on whether this file in /tmp exists. And you see, there's no extra code. I'm just running it now. It actually says the trunk is not open anymore, because we don't have this file. And if we query again in the CLI, you see that this is already reflected. Now assume you have a user and a vehicle, right? He opens the trunk, he closes the trunk, something like this. So again, we do this here with our mockup. And you see this is immediately picked up. You can see it in the logs of the feeder. And again, on the CLI, you see that the state changes. And that would already be all you need if you want to have these fancy graphics in there, right? I can do the same now with the control service, also the same code as you saw in the PowerPoint. Basically, it subscribes to state changes. We always differentiate between the current value and the target value. That's important, because some operations take longer. And we start it. Of course, at first nothing happens, because nobody requests anything. But now, again, we are back in the CLI. So now I can do a set call, which basically says I now have the intent to open the trunk.
I'm not good at typing slowly. There it is. I type it. But now it's fast. Yeah. And don't blink. So you see the trunk control service saw it immediately: open the trunk. You could see the feeder also picked it up immediately, because in this example it worked. And then, of course, we can again get the state in the CLI. Now we're closing the trunk. Our trunk is fancy; it can close by itself. The feeder picked it up, and you see it. So that's the round trip. But yeah, that you can also do at home; all the links you need are in the slides. Now I need to get my mouse back to the slides. So that is what we just saw. Now, one important thing I want to talk about. I mean, that's fine, and it works. And probably you believe me that it could also work if you know the magic CAN command or something like this. But the question is, would you want to run this in your vehicle, or should you? So the question of safety and security is very important, I think. The thing is, no matter what the application, at some point you need to think about it. Security is the aspect that you probably don't even want to give all applications access to all the data you might have. If you support the whole VSS tree, I think it has like 700 to 1,000 signals already, just in the standard catalogue. So maybe not every app should access all of it. And the other thing is, again, opening a trunk is maybe not a good idea if you're on the highway driving at 100 kilometers per hour. And I'm very sorry to tell you that I cannot magically solve that for you. I would love to tell you to download KUKSA.val and everything is fine. Of course not. But it's important to look at the architecture. KUKSA.val is running in something that in automotive we call the QM domain. That's basically, yeah, maybe well-tested software, but we don't trust it much for any safety-related things. So what we do there is, OK, we can have authorization, right?
An application needs to prove that it is allowed to write this value or read this value, or things like that. That's a security aspect. The same goes for the feeder and the service we saw in the example. Of course you could add some extra security there if you want, but the ones I showed you, the Python things, cannot give you any safety, at least not in the midterm. I think there are now some activities to try to make a safe Linux kernel. And of course, the data broker is written in Rust. You could test a lot of it. You can wait until you have certified Rust compilers. Maybe that's something that in the future could also carry some safety load, but for now it's just, hopefully, well-tested Linux software. But the point is, once you go further, let's say you write on the CAN bus or something, right? Or you do a SOME/IP request. Then you cross this boundary, and there are several patterns that might happen. Maybe you talk to some deeply embedded ECU that is actually controlling the trunk lock or the door lock or something like this. Those things already have safety guard rails in them, right? Even today in automotive, it's not that they would just blindly do whatever you tell them. So there would already be something in there that checks: I don't do it if I am on a highway at 100 kilometers per hour. Of course, in light of this architecture, you might need to rethink whether it really captures all the cases. But anyway, on this layer you have it. Another thing is, since the interface is gRPC, it's just a network-based interface. So the feeder and the service, the things that do the actual conversion, instead of running on the same Linux domain, the same Linux machine, could of course also run on another vehicle computer that might be able to carry safety load. In automotive, we would say that's maybe an ASIL B or ASIL C kind of thing.
So maybe something that runs QNX and only certified software. So maybe if you implement your feeders and services there, then they would be able to carry some safety load. And of course, you can always use underlying security concepts as you want. It's always a good idea for defense in depth. So the thing is, as I said, we don't give you the magic bullet, but instead of giving every random application access to any kind of vehicle hardware or vehicle buses, you have one thing, the data broker, a single entry point, and basically you can do all the security you want there. And then, since this is your control point, and you control what kind of feeders and services you build, depending on your application, you can see where you need to put some safety load and where you need to deal with safety. Because there are cases where maybe you don't need to, because in automotive we have lots of these gateway things everywhere, we like them. So if you have an application that is just doing telemetry and receives the data from some interface where you basically already have a data diode, where you can't even write, right? Maybe it's fine to do everything on the Linux non-safety side. But the moment you want to actuate something in a vehicle, you need to think about it. That's the most important thing, but I think we give you all the knobs you need. So regarding enabling SDV, just to repeat it a little bit. So yeah, if you want any application to access vehicle buses, that's probably a pretty stupid or insane thing to do. As you've just seen, at least with KUKSA, you have this single control point where you can already solve the security issues, and due to this architecture with the feeders and providers and this very generic interface, you can choose where you put those, and you can choose where you then put any safety control points. So that enables you to really build a safe system depending on your application.
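To make the single control point idea concrete, here is a minimal sketch in Python. It is not the KUKSA API: the `Broker` class, the permission table, and `trunk_provider` are hypothetical names invented for illustration, and the signal paths are only used as example strings. It shows authorization at the broker (the security part) and a guard on the actuator side (the safety part, which in a real vehicle would live in the ECU or a safety-capable domain, not in Python).

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy single entry point. Hypothetical, not the KUKSA data broker API."""
    # app name -> set of (action, signal path) pairs the app is allowed to use
    permissions: dict
    values: dict = field(default_factory=dict)

    def _check(self, app, action, path):
        if (action, path) not in self.permissions.get(app, set()):
            raise PermissionError(f"{app} may not {action} {path}")

    def read(self, app, path):
        self._check(app, "read", path)
        return self.values.get(path)

    def set_target(self, app, path, value):
        # Only the authorization decision is made here; the request is then
        # handed to whatever provider actually talks to the vehicle.
        self._check(app, "write", path)
        return value


def trunk_provider(requested_open, speed_kmh):
    """Guard on the actuator side: refuse to open the trunk while moving."""
    if requested_open and speed_kmh > 0:
        return False
    return requested_open


broker = Broker(
    permissions={
        "telemetry-app": {("read", "Vehicle.Speed")},
        "trunk-app": {("write", "Vehicle.Body.Trunk.Rear.IsOpen")},
    },
    values={"Vehicle.Speed": 100.0},
)
```

In this sketch the telemetry app can read the speed but gets a `PermissionError` if it tries to actuate the trunk, and even an authorized open request is refused by `trunk_provider` while the speed is above zero.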
The other thing, what we said before, I mean, another big challenge is that the signals in your common vehicle, they are all different. I mean, vehicle speed on five different vehicles is encoded in five different ways. But as in KUKSA you can use the COVESA VSS specification, you're basically on the safe side there. You can use the standard signals. They're really standardized. You can also describe your own ones if you want. That depends on how you want to apply it. But the point is you have a common language for that, and you can use this throughout your whole tech stack. So if you really go to the cloud or things like that, you do the transformation to COVESA VSS already on the vehicle, everything else is just piping through, and you don't need to deal with some weird bits and bytes from the CAN bus up at your cloud level. So that is, I mean, as KUKSA, we can't solve everything in the software-defined vehicle, because we want to leave something for other people, but the interface problem, that you have the same signals and the same interfaces in a secure and safe way, I think that's a very good starting point. And the thing is, in the vehicle, everything starts with those low-level signals, because the whole data fabric of the vehicle down there is built like this. To learn more, yeah, I mean, you can ask me all you want now or grab me outside. Otherwise, a couple of helpful links. So our GitHub, where the main KUKSA data broker sits. If you don't find me here or don't dare to approach me, you can also click the link, and then you find my contact address. I would invite you to check out Eclipse SDV; as I say, there are other interesting projects in there. COVESA VSS, which is gaining adoption throughout the industry as a data model, is a super interesting thing. And an advertising plug for one of our sister projects in Eclipse SDV, Eclipse Velocitas.
So I showed you these very bare-bones Python examples. If you want to have something more fancy in regard to how you write vehicle applications, like a whole development framework with generating deployable containers and things in CI, then Velocitas might be very interesting. Why I advertise this is not only because it's cool people, but because they also integrate KUKSA. So if you are a Velocitas app developer, you also get everything that KUKSA can do. Yeah, that's all I have to say. And if there are any questions, I'm still here for three more minutes, I guess. Okay, so we have a question from the chat room. I'm going to pick one, there's a couple. Are there available CAN or CAN FD data providers for some cars in an open repository? Yes, so what we have is something called a CAN feeder or DBC feeder. That's a very generic component to get CAN signals in. Basically what you need is something called a DBC file, that's an industry standard for how you describe CAN signals, and then you do a mapping. And it's always hard in industry to get example traces out, but for the CAN feeder, we have a trace from a Tesla Model 3, not because they gave it to us, but because the community reverse engineered it quite a lot. So basically you get a big trace of a Tesla and the mappings to VSS, which you can try out directly. Okay, I think the question was with regard to OBD, so the answer to that is it is pretty trivial to build an OBD feeder, and you can do it at home with, let's say, the Python examples. And if you're in a Python world, yeah, there's something called python-OBD, which is a very simple library for all the standard OBD signals. The reason why we don't have it in our repositories is that we are all Apache 2 licensed at the Eclipse Foundation, and python-OBD, just as one example, is GPL-based code, so we can't put it in the repo, but, I mean, nothing stops you from doing that. Thank you for the presentation.
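The mapping step described in that answer, from raw CAN bytes to a named VSS signal, can be sketched in a few lines of Python. This is not the KUKSA CAN feeder and it does not parse real DBC files: the `SIGNALS` table, the frame ID `0x123`, and the scale and offset values are all made up for illustration, standing in for what a DBC file plus a VSS mapping would provide.

```python
# Hypothetical mapping table in the spirit of "DBC file + VSS mapping".
# The frame ID, byte layout, scale and offset are invented for this sketch.
SIGNALS = {
    0x123: [
        {
            "vss": "Vehicle.Speed",  # target signal name in the common data model
            "start_byte": 0,
            "length": 2,             # bytes, little-endian
            "scale": 0.01,           # raw counts -> km/h
            "offset": 0.0,
        },
    ],
}


def decode_frame(can_id, data):
    """Turn one raw CAN frame into a dict of {VSS path: physical value}."""
    updates = {}
    for sig in SIGNALS.get(can_id, []):
        start = sig["start_byte"]
        raw = int.from_bytes(data[start:start + sig["length"]], "little")
        updates[sig["vss"]] = raw * sig["scale"] + sig["offset"]
    return updates


# A frame carrying raw value 10000 (0x2710) in its first two bytes.
frame = bytes([0x10, 0x27, 0, 0, 0, 0, 0, 0])
speed_update = decode_frame(0x123, frame)
```

Everything above this function only ever sees `Vehicle.Speed` in km/h; the vehicle-specific encoding stays inside the table, which is the point of doing the transformation on the vehicle.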
It's quite inspiring, and actually my question relates to this one. So let's say I go home and would like to implement something, then what hardware and software would I need, and how much would it cost me to start developing something? So if you start with your own vehicle, which is basically your own responsibility, then what we just heard: if you buy these OBD adapters, like from AliExpress for five bucks, you can get them, and with a few lines of code in several languages you can already get some interesting information about, let's say, your current vehicle speed, engine load, oil temperatures. Yeah, so that would be, I think, the coolest way if you just hack around, and we actually have many people, also students, who do that. Hardware-wise, I mean, if you want to go into a real vehicle, I think you need to dare to cut some CAN lines and go deeper, but then you go into, let's say, if you're not working with... If I would like to use your stack, so I would like to use KUKSA, well. Basically you can use, so, hardware-wise, I think there's also, KUKSA also has hardware, I hadn't time to present it here. It's KUKSA hardware, which is a base board for the Raspberry Pi Compute Module, and it includes two CAN FD interfaces and OBD. Currently the challenge is we can't sell them to you yet, but we have the manufacturing files online, so if you have an electronics shop, you can do it, but yeah, you cannot just buy them. |
Convergent camera applications for mobile Linux devices
What does it take to run your desktop camera application on your phone |
Happy? Hello, everyone. Thanks for attending. We were doing okay. Is that my device or I don't know if that's me or okay. So today's talk is written by Jacopo Mondi. Unfortunately, he couldn't attend today. He's his back. So I'm stepping in. So what I'm talking about today is not work I've done. It's about his experiences working on the PinePhone Pro. I don't want to touch you now. So my name is Kieran. Just like Jacopo, I'm an embedded camera engineer with Ideas on Board. We've been working on V4L2 kernel drivers and, for some time now, libcamera. We can be found on IRC, Matrix, any way you want to get a hold of us, on GitHub if you need, or after the talk. And today we want to talk about how we perceive the Linux camera stack on both desktop and mobile environments. And starting with the kernel, we see libcamera as being a big part of that to support the platform abstractions. And on top of libcamera, we see lots of applications desiring to use PipeWire. So we're going to look through there. And the overall goal is that applications shouldn't care what platform they're running on. They shouldn't care if they're running on a PC, a desktop, or a Librem 5 or a PinePhone Pro. And equally, any application you want to run, or camera framework, they should all be able to say, hey, I want to talk to the camera. Give me some pictures, please. And specifically today's talk is about the PinePhone Pro. Jacopo has spent some time over the last three months or so, or more, trying to make sure that we can bring up the PinePhone Pro with libcamera and standard applications. And the PinePhone Pro is an interesting device because, I think, it's promoted as like a test ground. So there's no official software, but it's a good device that people can play with and develop their own software. Interestingly for us, it has an RK3399, which is a chip that has an ISP. And it's actually a device that we have already been supporting for several years now.
It pretty much was one of the first devices we started supporting with libcamera. And part of why we created libcamera is because cameras got complex. This is a slide I've presented a few times. But on the left, we can see that beyond having just a single video node where you might say, UVC, give me some pictures, cameras started getting more complicated. They have multiple components and you want to configure them. And the one on the left has now been removed from the kernel. And the N900, which already has a lot of different nodes, that's 13 years ago. So if you can imagine, there's a lot of cameras out there now that are even more complicated than all the components there, particularly lots of crawl components. And with all those new components, it's very difficult for applications to know what to do with each of those things. Suddenly every application has to be aware of every platform. And that's going to lead to a lot of replication of code. Each camera application is going to have to deal with the media controller to configure the pipeline, has to talk to V4L2 to get frames, has to talk to subdevices to configure parameters on the sensor itself. And that changes for every platform. It's different on a Rockchip, it's different on a Raspberry Pi, it's different on an Intel. So libcamera's goal really is to fill the gap of that abstraction so that applications only have to look at one API again. And it sits on top of V4L2, it's not a replacement for V4L2. But what we have is a pipeline handler, which deals with the platform abstraction. And we have a component called the IPA. And that's crucial for devices like the PinePhone Pro, with an ISP and raw Bayer sensors, because you need control algorithms. And the IPA in libcamera provides the space to do that. On top of libcamera itself, we have a native libcamera API, which is C++. We've got Python bindings, there's people developing Rust bindings.
The Rust bindings are actually giving us C bindings, I believe. Aside from that, we've got Android HAL integration, which is important and comes up later. And integration into frameworks like GStreamer. And as I said, the Rockchip RK3399 is one of the devices we started supporting when we started libcamera, particularly on the Chromebook, the Chrome tab. But it's actually a really interesting platform because it's in a lot of small board computers as well. So it's readily available hardware, you can plug in off-the-shelf cameras from Raspberry Pi and play with it. And what I actually really like is that recently we've been working on the i.MX8M Plus, which has the same ISP core in the chip. So the same code that we've written for Rockchip also works on the i.MX8M Plus. So I've mentioned that these cameras are now complex and we've got this thing called an ISP, which is kind of getting in the way of people getting images out of their cameras. And the reason for that is the cameras themselves are now raw Bayer sensors. And that needs a lot more processing and support to get good images from; in particular the underlying format is a Bayer format, which most applications don't want to process. So that data is fed into the ISP, but the ISP needs to be managed. It produces something called, well, it produces statistics, usually custom to each platform. And there has to be code or an algorithm to process those statistics to then generate control parameters to configure the ISP for the next frame. And ultimately, that will then process in a loop and produce some images that the applications will expect, either YUV or RGB. And we already have an implementation for this. This is one of the things we started early. I believe a lot of this implementation is derived from Raspberry Pi, so it's quite compatible with the implementation that Raspberry Pi have at the moment. But we've got various components, like AGC, to handle how bright the image is, automatically or set manually.
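The loop described here, statistics out of the ISP, an algorithm, and new parameters for the next frame, can be illustrated with a toy auto-exposure step in Python. This is a sketch of the general idea only, not libcamera's AGC: the function names, the target brightness, the damping factor, and the one-line "sensor model" are all assumptions made up for the example.

```python
def agc_step(mean_luma, exposure, target=0.5, min_exp=1.0, max_exp=1000.0, damping=0.5):
    """One loop iteration: frame statistics in, exposure for the next frame out.

    mean_luma is the average brightness of the last frame in the range 0..1.
    """
    if mean_luma <= 0:
        return max_exp  # completely black frame: open up as far as allowed
    ideal = exposure * target / mean_luma            # exposure that would hit the target
    new_exposure = exposure + damping * (ideal - exposure)  # move only part of the way
    return min(max(new_exposure, min_exp), max_exp)


def simulate(scene_brightness, frames=20):
    """Run the loop against a toy sensor where luma = brightness * exposure, clipped."""
    exposure = 10.0
    for _ in range(frames):
        mean_luma = min(1.0, scene_brightness * exposure)  # "statistics" for this frame
        exposure = agc_step(mean_luma, exposure)           # parameters for the next frame
    return exposure, min(1.0, scene_brightness * exposure)


final_exposure, final_luma = simulate(0.001)
```

The damping is why the correction is visible over several frames rather than happening in one jump: each frame only moves part of the way toward the ideal value, which keeps the loop from oscillating.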
White balance is important, then lens shading, and those are kind of the three that you have to start with. But all that code is open and already exists in libcamera. The kernel driver itself has been in the mainline kernel now since, I believe, 2020. And it was destaged in '21. Helen from Collabora was working on that. And since then, it's still had active development. There's fixes that go up. And we've been working on it to extend support for the i.MX8M Plus. And so the kernel side and the libcamera side are looking pretty good. We've got support for processing the images. We've got the kernel drivers. But when we go back to the PinePhone Pro, for quite a long time there was no driver in the mainline kernel for the front camera, the OV8858. And even though there was a driver for the back camera, it wasn't supported very well. It wasn't tuned, really. So the PinePhone Pro has been left behind by libcamera for quite some time because no one was actively working on this. And it just meant that you couldn't use libcamera on a PinePhone Pro. And then Jacopo has been working on this in collaboration with others who wanted to push this forward and make it work again. And Nicholas Roth started this back in October, I think, where he wanted to get Waydroid running on a PinePhone Pro. So he was trying to find out what the missing pieces are, what do we need to upstream. And this talk really derives from the work that he kick-started. So he submitted support for the rear camera and front camera to libcamera. And he based that on the kernel driver that was in the PinePhone Pro, so a self-hosted driver, not self-hosted, Megi's tree. And interestingly, the driver hadn't been posted upstream, so it hadn't had any kind of review process. And it exposed itself with a name like m00_f_ov8858. So it was encoding properties in the sensor name about where it is and its location.
And that's not very good for libcamera because it's not generic: we can't have a handler that says only match the front camera in location zero when we want it to support every device that has the sensor. So the upstreaming process actually highlights where things need to be cleaned up. This has gone through some iterations. And Jacopo, who would have been talking, has taken this on to completion, and it will land in 6.3. It's accepted in the media tree for next now. So that's getting in in March, I think. The support required for libcamera, we merged that and made a release last week, 0.0.4. So now we've got a kernel with the ISP driver. We've got the sensor drivers and libcamera support all upstream and mainline. The other sensor needs a lot of work still. Interestingly, it's supported by Raspberry Pi, and the Raspberry Pi kernel has a lot of downstream patches. So if anyone wants to get involved, this is a really good opportunity to look at what is in the Raspberry Pi tree, take some of those cleanups, get them suitable for mainline and post them up. So if anyone's there, what's next to make it good? There's lots of patches to upstream still for the IMX258. The next stages really are about camera tuning. And that's part of the process that we're trying to provide in libcamera as a framework. We have a camera tuning tool, which is being developed and can be used already, and that's really about helping the control loops know how to process the images. You can tune the cameras at home with simple things like taking pictures of a white wall. Ideally, you want a color card like the one that was on one of the earlier slides. But with very inexpensive tools, you can make a pretty basic initial start at camera tuning. So if you've got devices and you want to investigate this, that's a great place to get started. So that is, not my work here is done, but Jacopo's. That's the front and back cameras now running on the device.
This is captured using cam, which is just a pure test tool in libcamera. We have cam and qcam. They're not really meant for end users. They're just for helping us develop the framework. The images probably need more work on the lens shading and white balance, but that's part of the tuning process that we mentioned. But users don't want to be using test tools. So what's also been going on and progressing nicely is support for the application layers on top. And Robert Mader, whom I met back in Prague, has since then also been working on this with his device, trying to get the same experience you get on the desktop to work on mobile. And that's been building up the camera portal in PipeWire, extending support in GStreamer to handle controls, and mapping that all through the camera portals. So from PipeWire's perspective, this is what the media stack looks like for desktop environments or anything using PipeWire, where PipeWire sits on top of libcamera and knows how to look at cameras that are V4L2 as well. But if it needs libcamera, it already has that integration. And then GStreamer and applications can sit on top of PipeWire. And Robert has been doing quite a lot of work trying to clean up and finish that integration of the application pipeline, particularly in getting the GNOME Camera app to work all the way through. And I remember when I first saw the GNOME Camera app, I saw it and thought, great, there's a standard design for a desktop. I want this to work on libcamera. So this has made me really happy, seeing that people have pushed this forward. GNOME Camera is a design, I've forgotten which team was designing it, but then James Westman took that design and created an application for it, which can be part of the standard GNOME environment and also run on mobile devices, which is the key point. If I could have put that in the slide.
So, Jacopo couldn't be here today, but he did manage to record on his PinePhone Pro, with a screen grab and encode on the device, running GNOME Camera on the PinePhone Pro, running through PipeWire into libcamera, running the libcamera algorithms and through the ISP. So this is a hardware-accelerated camera. He's taken a picture. He, in a moment, will change the camera to the front camera. There. And one of the things I like is, quite interestingly, you can see the algorithms kick in. So you can see it starts out green and then it corrects itself. So you can see that real-time action from the algorithms that are in place in libcamera. In consumer devices, that still happens on a UVC webcam, but usually you hide those frames. In libcamera, with the ISP here, we're just not hiding them, so you can see it. Excellent. So we can have real live demos instead of video ones. Can I get back? So, thank you. That demo was running GNOME Camera through the PipeWire camera portal. And thanks to Robert, if you have a device running PipeWire, you can install this Flatpak, get that application and run it on your device. I believe that will just all work. The instructions from Robert are over there. Okay. I went with what I had. So that's great. We can now run a camera application that's exactly the same on desktop and mobile. But there's more that we want to do on our phones or with communications nowadays. So getting browser support is really important there. But now we've got PipeWire integration and portals. For browsers, most of which use WebRTC, it would be really helpful if we had an integration in WebRTC that could talk to PipeWire. And Michael from Pengutronix has been working on that. There we are. I want to see you later. He has been working on that tirelessly for a year or more, I think. And I don't think I can point that far out. But what was fantastic is last week, it went green and that part is merged. There's still a few more series to get in.
But the core support is now there. So in some months, that's a very wishy number, we should be able to see browsers able to handle this pipeline, and you can make a video call on your PinePhone Pro, which is fantastic. It's not me. It's other people. I do talk a lot about other people's work, so I don't want to take credit. Talking to some of the distros, I know that once the code is ready, I believe we can start getting early PPA-type packages available so that we don't have to wait for it to filter through all the upstream processes. But that's up to the distros, not me. So actually, it's quite exciting. There's a lot of development going on with libcamera at the moment. Lots of application-side development. And another one that's been supported over the last three or four months is Adam Pigg, who is working on Sailfish OS. He had an application called Harbor Camera that ran there, and he's ported that to run on libcamera. So he's been working on that on the PinePhone, and it's Qt-based. So I've been running it on my desktop, my laptop and an Intel device. But even though he's developing it on the PinePhone, because it's based on libcamera, the platform is abstracted, so it will work on every platform that's supported by libcamera. So the same application is now going to run, just the same as GNOME Camera will run, on all the supported platforms, unmodified, which is brilliant. I've run it, as I said, on the Surface Go and my desktop. He has already started plumbing in manual controls, so you can do things that you expect from your mobile phone camera app to control the exposure and brightness that you might want to play around with. And then autofocus and manual focus should be coming up soon. That is a screen capture from Raphael, who is one of the other libcamera community members, who is just testing Adam's work on the PinePhone. But that's actually running on Maemo Leste, whereas Adam's working directly on Sailfish.
So again, it's nice to see that the distribution doesn't matter. It's all about cross-platform. A lot of this work here started because Nicholas Roth was trying to get Waydroid support working. He wanted to run Waydroid to get Android applications running on his phone. And so that is a way of running an Android environment in a containerized solution on your device, on a regular Linux system, such as we can now run on the phones. And I said earlier that libcamera already provides an Android camera HAL. So we've already got integration there for telling Android how to talk to the camera, which is great. So that can be reused. There's still a fair bit of work there, unfortunately. Nicholas has got it working in Waydroid. He can capture frames. But due to the format that the buffers are captured in, he can't display them. So that's going to need some more work in Mesa, and a look at how the buffer management is being handled. There may be some updates from Excellent. So there's recent developments that may improve this in the near future with Panfrost driver development. But it might be an opportunity if anyone wants to dig into those layers to work on. Sorry, Millipixels is a fork from Dorota, where she's been supporting the Librem 5. And the interesting part there is she's working on a GPU-based ISP. So on devices where we don't have support for managing the ISP, such as Qualcomm, or devices that don't have one, it would be really interesting to see that work extend libcamera to those devices. And Pavel's talk earlier was about sharp photos on a mobile phone; he's creating a software, CPU-based implementation, which will help get things started as well. These are lessons Jacopo wanted to highlight that he's learned. But they do apply widely. Fragmentation: when it's all split across lots of different stacks from vendors, it's very difficult to use that generically. So libcamera's goal is to try and pull this all together.
And to do that, it needs mainlining. We have to have a single definition of what is true. And mainlining is difficult. It takes a lot of effort. My friend that I traveled up on the train with to FOSDEM was saying he posted patches to one of the Linux lists. And after four, six weeks, he had no reply and he found that really demotivating. So it is important that you consider mainlining from the start. You've got to get in early. It takes a lot of time. And it's always slower than when you've got control of your own repositories. We've definitely learned more lessons from developing libcamera. A lot of us come from kernel development. So we've been on the other side. And now we're seeing just how important it is on the user space side that these V4L2 controls and interfaces are standardized and have a reference implementation. Since we've created libcamera, we're finding that the sensor drivers in Linux are improving rapidly, we hope. We've started saying that controls have to be defined by the sensors. There's so many missing parts to the drivers that are already upstream, and they need more work to get them supported generically. But doing so means it will all be more consistent and improve the experience for everyone. Thank you. I think I might have two minutes. Excellent. So two minutes if you do have any questions. It already does. I think there may be. Let's just repeat the question. The question was, will libcamera support the original PinePhone? So Pavel is brilliant to answer that. Wait, wait, wait. And so the kernel on the original PinePhone doesn't have the required APIs for libcamera. So you can either break libcamera to work anyway or you can fix the kernel, and people are doing both at the moment. It's in development. But Adam Pigg, who was doing the Pinhole camera app, he is working on the PinePhone. So no, on PinePhone. So it's one of the active platforms being used. So I believe so, yes. Any more questions?
Earlier you talked about releasing 0.0.4. What's on your roadmap for a 0.1 release? So what do you want to get done? I'm sorry. I couldn't hear the question. For libcamera 1.0. He's ducking. He's getting out of the way. There's key features that we want to support in libcamera, and it will break the API. So we already know exactly how we want to break things. I started tagging releases before Christmas so that we had defined points. So we're trying to do it every two months at the moment. It is in the plan. There's a big reconfiguration of the API that we wanted to try and get in first before we go for 1.0. But we need testing and apps to know how to get it right. Once we go 1.0, it feels like we're going to say that's it. But I think some of it's more psychological. But that's versioning. So we're working on it. I'm trying to make sure we handle ABI breakages automatically now. So we'll be able to improve on release management. We kind of always just said use the latest because we were trying to iterate so fast. And that is hopefully improving. So we're working on it. I'm out of time. Thank you. Thank you. Thank you for a great talk. |
Advanced Camera Support on Allwinner SoCs with Mainline Linux |
Okay, so let's start. So hi everyone, I'm going to be talking about advanced camera support on Allwinner SoCs today. So the topic is quite similar to the previous one in that we are also going to talk about sensors that need particular care and processing. So unfortunately I will maybe be explaining some of the things that were said by Kieran just earlier, so sorry about that for the people who just attended the previous talk. But hopefully you'll also learn a thing or two in addition. So I'm Paul, I work at a company called Bootlin. We are an embedded Linux engineering company, so we do services around that. And I have contributed a number of things, especially to the Linux kernel, so I worked on the Cedrus VPU driver, a little bit of DRM things on the Allwinner side, but also others. And I made a training that we give about displaying and rendering. But today I am here to tell you about the camera support for Allwinner SoCs using mainline Linux. So we are going to talk first about general things related to image capture technology and just complex camera pipelines in general. So let's start with kind of an overview of a typical capture chain. So we start with the optics, of course, where the light is kind of focused on a particular area where we have our sensor. So the optics are usually passive from a software perspective, but not necessarily, because you can have coil drivers to change the focus and things like that, so it's not always a passive thing. So after that you have the sensor, which actually samples the light and produces data from that. And we want to transmit that data to something else, typically your CPU, where you can do something with the data, like display it or encode it. But in between the acquisition and actually receiving the data, we need to do some processing. And that processing can sometimes happen on the sensor itself, in which case we talk about embedded processing, or it can be on the other side of the interface.
So on the side that receives the data, typically your SoC or your CPU package or whatever. So this processing step is really the one that is necessary to produce good-looking pictures from, let's say, just samples coming out of an ADC. So typically what we call a raw or Bayer sensor will produce not pixels, but data in a Bayer pattern, which is a grid of red, green and blue filters that is applied on the sensor. That gives you information for each of these different channels. And we kind of need to translate that to pixels. So this is called debayering, and this is how we get pixels. But these pixels typically look very bad. So you need to apply a number of operations and enhancements to have something that looks like a nice picture. So a number of things need to be done. For example, the brightness that you get from your ADC is linear, and we want to apply some gamma curves to that to make it look nice to the human eye. Typically, there's some dark level that we have to subtract, for example, because the darkest value that you get from the sensor is not necessarily zero, so you might need to subtract an offset, things like that. There's usually a lot of noise, and the colors will be off, so you will need to do some white balancing, things like that. So all of these different steps take place in what we call the ISP, the Image Signal Processor, and there's basically three domains in which we apply these enhancements. The first one is the Bayer domain, so that's really the first step that we apply to the data coming from the raw sensor. At the end of that step, we get some RGB data that we also want to enhance, and at the end of that, we typically convert it to a YUV representation, and then we can also apply some enhancements to that data, and at the end, we get a YUV picture that is, like, ready to be displayed and encoded, for example.
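The three domains mentioned here can be sketched per pixel in plain Python. This is a toy illustration, not how a hardware ISP or any particular driver does it: the function names, the RGGB layout, the black level of 16, and the gamma of 2.2 are assumptions chosen for the example. The YCbCr conversion uses the full-range BT.601 coefficients as used in JFIF.

```python
def debayer_rggb_block(block):
    """Bayer domain: collapse one 2x2 RGGB block into a single RGB sample.

    block is [[R, G], [G, B]]; the two green samples are averaged.
    """
    (r, g1), (g2, b) = block
    return r, (g1 + g2) / 2, b


def process_rgb(r, g, b, black_level=16, white_level=255,
                gains=(1.0, 1.0, 1.0), gamma=2.2):
    """RGB domain: black level subtraction, white balance gains, gamma curve.

    Returns values normalized to the range 0..1.
    """
    def one(value, gain):
        linear = max(value - black_level, 0) / (white_level - black_level)
        linear = min(linear * gain, 1.0)   # white balance gain, clipped
        return linear ** (1.0 / gamma)     # linear sensor data -> perceptual curve
    return tuple(one(v, k) for v, k in zip((r, g, b), gains))


def rgb_to_ycbcr(r, g, b):
    """YUV domain: full-range BT.601 conversion, inputs 0..1, outputs 0..255."""
    y = 255 * (0.299 * r + 0.587 * g + 0.114 * b)
    cb = 128 + 255 * (-0.168736 * r - 0.331264 * g + 0.5 * b)
    cr = 128 + 255 * (0.5 * r - 0.418688 * g - 0.081312 * b)
    return y, cb, cr


raw_block = [[255, 255], [255, 255]]        # a saturated white patch
rgb = process_rgb(*debayer_rggb_block(raw_block))
ycbcr = rgb_to_ycbcr(*rgb)
```

A neutral input ends up with both chroma values at 128, which is a quick sanity check that the white balance gains and the conversion matrix agree on what "gray" means.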
So yeah, that's a list of the different enhancements that we apply. I mentioned a number of them already, so I'm not going to go through the list, but you get some idea that there is really a lot to be done here, and it actually takes quite some processing power to do that. That's why we typically consider that it's not something you can do in real time with a CPU, or it's going to fully load your CPU just to produce pictures, let alone encode them and things like that. So lots of things to do. That's really the base that you need to have something that looks right. There's more advanced stuff that you can have in addition, like lens shading correction, which is about the fact that lenses will typically be darker on the edges than they are at the center, so you want to even that out. That's also an operation to do. Dewarping is for when you have a very short focal length and the geometry looks distorted, so you need to readapt that. That's also very intensive in terms of calculation. Stabilization can also be involved if you have very shaky footage, especially from a smartphone or something like that, so you want to also apply a stabilization step to your picture. And then, finally, you might also want to apply some style to your picture, so that will typically be a color lookup table where you can decide that you want to make it look, I don't know, like sepia tone or something like that. This is also some processing that you will need to apply. So I mentioned that there are basically two ways to deal with this operation.
The first one is to have the ISP in the sensor, in which case it's typically quite simple, and you get the data directly ready from the sensor. But when it's not, you get just the raw Bayer data, and you need to do all of these different enhancement steps on some system-on-a-chip ISP, so that's typically a hardware block that is dedicated to the purpose in your SoC. Nowadays, many multimedia-oriented SoCs do have such blocks, and in order to properly configure them, you might need some specific calibration data that really depends on the sensor, sometimes on the environment that is used, things like that, so it's highly specific to the setup that you have. This is just an illustration of the different steps, so that's the kind of picture that you would get as a raw thing from the sensor, and the steps you might apply to have something in YUV at the end that looks kind of okay. But it's not just about statically configuring an image processor to produce something good. Some parameters actually depend on the thing that you are shooting, so there are basically three things that you need to adjust depending on the situation. The first one is focus, so of course that implies that you have control over some coil to change the focus of the lens, but obviously your picture is going to look very different if it's out of focus or if it's sharply focused, so that's one of the things that the ISP is also involved with, basically to tell you whether an image is sharp or not. There is white balance, which highly depends on the source light that you are using, especially the color temperature of that light. If you are in broad daylight, it's not the same as being in a room with some particular type of lighting, so it needs to adjust to that. It will typically have an impact on how the image looks.
And of course exposure, so basically what is the window of luminance that your sensor is going to sample. If you are in a very bright environment, you need to apply a different gain than when you are in a very dark environment, so that's also something that needs to be adjusted depending on what you are shooting, and this is also something that the ISP is going to help with, by telling you basically how bright or how dark the scene is. So yeah, you can adjust those. Exposure, especially, you can adjust with three parameters, so if you do photography, you probably know about that. You can change the aperture of the lens, you can change the exposure time, so for how long you are waiting for light to come in to charge your cells that will be read by the ADC, and you can increase the gain, which will also typically increase noise. Advanced users will typically want to control these parameters manually to have exactly the picture that they want, but in most cases people just want to take their phone out, point it at something, press a button, and it just works. So the idea is that we want all of these different parameters to be adjusted automatically. That's what we call the 3A: the 3As are auto exposure, auto focus and auto white balance, and that is again typically something that will be done with the ISP.
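The auto exposure part of the 3A can be sketched as a small feedback loop. The Python below is a toy model, not any ISP's actual algorithm: the sensor is simulated by a single `scene` brightness factor, the "statistics" are just a clipped mean, and the target of 0.5, the 33 ms exposure limit and the gain limit of 16 are assumptions picked for the example.

```python
def measure_mean_brightness(scene, exposure_ms, gain):
    """Stand-in for ISP statistics: mean brightness in 0..1, clipped when saturated."""
    return min(scene * exposure_ms * gain, 1.0)


def auto_exposure_step(mean, exposure_ms, gain, target=0.5,
                       min_exposure_ms=0.01, max_exposure_ms=33.0, max_gain=16.0):
    """One loop iteration: rescale the total exposure so the mean moves to the target."""
    correction = target / mean if mean > 0 else 2.0
    total = exposure_ms * gain * correction
    # Prefer a longer exposure time over gain, since gain also amplifies noise.
    new_exposure_ms = min(max(total, min_exposure_ms), max_exposure_ms)
    new_gain = min(max(total / new_exposure_ms, 1.0), max_gain)
    return new_exposure_ms, new_gain


def run_auto_exposure(scene, exposure_ms=1.0, gain=1.0, frames=20):
    """Run the measure/adjust loop for a number of frames and report the result."""
    for _ in range(frames):
        mean = measure_mean_brightness(scene, exposure_ms, gain)
        exposure_ms, gain = auto_exposure_step(mean, exposure_ms, gain)
    return exposure_ms, gain, measure_mean_brightness(scene, exposure_ms, gain)
```

In a dark scene the loop first stretches the exposure time to its limit and only then raises the gain. In a very bright scene the statistics are saturated at first, so the loop needs several frames of halving before it gets a usable measurement, which is why this has to be a loop rather than a one-shot computation.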
So the 3A works with a feedback loop, okay. There are a number of algorithms in the literature that are known to be able to do that correctly and efficiently, but the way they are implemented in the actual ISP hardware really depends, of course, on the hardware itself, how it was designed and what the register interface is to configure these different things. That is often considered to be the secret sauce of the ISP manufacturers, so that's information that is often difficult to get and that they don't want to release, and that's why sometimes you end up with a big binary blob that does all of this and you don't really know what's going on. Okay, so that was for the parameters for image enhancement. Now let's take a little bit of a look at the hardware interfaces for capture. Historically there have been different ways to transmit pictures from one side to another. There used to be analog interfaces, which are now mostly deprecated, so let's not focus so much on those. And then we have typically two types of hardware interfaces that are used for cameras. The first one is parallel, also sometimes called DVP, and that's when you basically just have one line of data per bit and some sync signals, so it's a little bit like a parallel display, if you know about that, and you just send the data like that. There are also more advanced interfaces, which are also more robust to noise and typically work with serial lanes, so there is MIPI CSI-2 and other ones like LVDS, SDI, HiSPi. Those are the high-end interfaces that allow you to stream a lot of data; they typically go pretty high speed and they are more robust to noise, so they are considered to be the advanced ones. So MIPI CSI-2 is the one that we are going to focus on.
We will look at it through my particular use case involving the Allwinner platforms. In case you're not familiar with the Allwinner platforms, they are ARM SoCs, made by this company called Allwinner from China. They are widely available, especially on these kinds of form factors as developer boards, and there are a number of these platforms that support MIPI CSI-2 and that have an image signal processor, so it means that we can connect a raw Bayer sensor, get the data from that, pipe it through the ISP, and then get a picture at the end. That was the goal of the project that I had involving these platforms. The scope was the V3 and A83T platforms, using two different image sensors, the OV8865 and the OV5648, which are, like I just said, MIPI CSI-2 sensors that provide raw Bayer data. These sensors don't really have an onboard ISP. I think one of the two actually has one, but it does very little, so you still need to do a lot on the receiving end of the interface. So that was the goal. Now for the state of Allwinner camera support in general with the mainline kernel, because we wanted to use the mainline kernel, of course. Let's first take a look at the general Allwinner platform support. There is a community called sunxi, or linux-sunxi, which has been working towards mainline support for Allwinner SoCs, so it's very advanced and there are lots of people involved. You can check out this link, the Linux mainlining effort, which lists all the features of the different SoCs and how they are currently supported in mainline Linux, and it's pretty impressive nowadays. Many of the features are supported, especially for the older SoCs, because of course it takes time to get it right. But the multimedia areas are often the ones that come last in support, because they are typically a bit complex to implement. So when I started the project, there were two drivers for capturing data.
The first one is the sun4i-csi driver, which covers the first generation of these Allwinner platforms. The hardware then evolved into a second generation, which is supported in mainline by a driver called sun6i-csi. After that, Allwinner made a new generation of platforms, which have a third generation of CSI, which is currently not supported. The devices that I was interested in, the V3 and A83T, work with the second-generation driver. This driver basically allows you to receive images from the parallel interface, but it didn't support MIPI CSI-2, and it didn't have support for the ISP. There was actually some support for these features in the downstream Allwinner vendor kernel. So they do have some code for that, but the ISP part, especially, was implemented as a binary blob, so it was a static library that was linked into the kernel, which is not necessarily very legal, but never mind. So there were actually very, very few resources regarding how the ISP works on these platforms. Okay. So generally speaking, how do we support cameras in Linux, at least at the kernel level? There's this API called V4L2 that I think you've all heard of just before, and that probably many people know about. It's really about supporting anything that produces pixels that the CPU can receive. It supports lots of different types of devices, not only cameras, but also, I don't know, things like scanners, DVB-T receivers, now also decoders, encoders, things like that, so really lots of different devices related to pixels. And typically the way it works is that you have one device node that corresponds to a driver, so typically /dev/video0, and that device node gives you access to an API from user space where you can do all the different things that are necessary to get a picture from user space.
So typically negotiating the pixel format that you want to receive, doing the memory management like allocating the buffers, how many buffers you want, et cetera, and queuing and dequeuing buffers. User space provides a buffer to the driver, which will fill it with pixels and then return it to the application, and then the application has a buffer that has pixels in it that it can use to, again, display or encode them or whatever. This video device works well for, I would say, all-in-one devices where you basically just receive the finished data from a device like a USB UVC camera. The camera itself will do all of the processing inside, and it will just give you the final result over USB, and you get that through this API on Linux. And yeah, typically you need some DMA interface to do that transfer. But in the case of a more complex pipeline, especially when you have multiple components involved, like with the ISP, with the MIPI CSI-2 receiver, with a particular sensor that you can control directly, then you end up with a situation where you have multiple devices in the pipeline, and you need to configure each one of these devices individually. This called for a more advanced API, which is the subdev API, which allows you not only to have one big device for receiving the data, but also side devices that you can use to configure each component in the chain. And there is also the Media Controller API, which allows you to control the topology between these devices. The subdevs typically just represent one of the parts of the pipeline, and they typically cannot do DMA. So they will be connected from and to other devices through some interfaces that don't involve writing the data to memory. It could be a FIFO, or it could be an actual hardware interface like MIPI CSI-2. And basically, the top-level video device will be in charge of calling the next subdev in the chain, which will call the next one it's attached to, et cetera.
So that, for example, you can coordinate starting the stream and starting all the elements at the same time to start receiving an image. But these subdevs still need to be parented to the V4L2 device. Basically, they all need to be controlled under the same top-level entity to be able to, let's say, coordinate between one another. For that, there is an API in V4L2 that allows you to register the subdevs with the V4L2 device. Again, that's the parent controlling entity, which is easy to do if all of the support for the subdevs is in the same driver, because you have access to that V4L2 device pointer. But it can also happen that you have multiple drivers involved throughout the tree. For example, you have one driver for your sensor, one driver for your DMA interface to transfer the data, one driver for your ISP, and you could even have more. In that case, the drivers don't know exactly which other driver they should be attached to. So there is an asynchronous subdev registration interface, which allows you, when you have for example a sensor driver, to just make that subdev available to whichever driver is going to need it later. The subdev drivers will just make the subdev available to the rest of the world. And then the top-level drivers will need a way to identify which subdevs they actually need and to get a handle on them, which will allow registering these subdevs with the top-level V4L2 device. The way that this linking is done is through the fwnode graph, which is typically implemented in device tree. It uses the port and endpoint representation that maybe you've seen in some device trees implementing this, and this description also allows describing some characteristics of the interface. For example, if you have a sensor that is on a MIPI CSI-2 interface, it can use a different number of lanes. In MIPI CSI-2, you can have up to four lanes, but maybe the sensor only uses two.
So you have to be able to share this information, and this is also done through this fwnode graph description. You have some device tree properties that you add to indicate that, and then the drivers can call the endpoint parse helper to actually retrieve the information about the interface. To illustrate, on the left side, we have some sensor here, with the port and endpoint representation. The remote endpoint allows you to connect two sides together, and you have these extra properties here, like the data lanes and link frequencies, that really describe the characteristics of the bus, so at which frequency it should be running and how many lanes it should have. And then on the other side, you have the same thing. In this case, the link frequency is controlled by the sensor, so you only need to provide it there, but the data lanes property is present on both sides. So that's how you can link different devices and allow the top-level driver to retrieve access to the subdevs that you want to use. This is very flexible, of course, because then the same sensor driver, for example, can be connected to lots of different platforms in lots of different situations. The driver itself doesn't know about how it's connected. It's really the device tree and the fwnode graph that tell you how it works. So back to async notification, just quickly to illustrate how the top-level driver would gain access to a subdev. First, it has to match using that fwnode graph representation.
It has to match a particular subdev, and the top-level driver registers a notifier which has a number of callbacks that will be called when a particular device becomes available, and then it can bind to that device. The matching subdev will then be registered with the top-level V4L2 device, everything can be linked together, and the top-level driver actually has a pointer to a V4L2 subdev that it can use to apply some actions, like start streaming, stop streaming, configure the format or things like that. So this is how it all works together. That's also where the media controller API comes in. The media controller API is there to control the topology of how these different devices are actually connected to one another. It also implements particular functions, so you can say that this block attached to this subdev is an entity of this kind, okay? Each subdev has an associated media entity which lists pads, which are basically in and out points that you can use to connect other devices. Then you can create links between these pads, which represent the actual connections in the hardware. For example, you could have multiple links that are possible for one device and then decide to enable one at runtime. If you have a multiplexer or something like that, that would be a typical case where you would just select one of the inputs and have just one output. So this is really the API that allows you to configure the topology of the whole pipeline and how everything is connected together. There's also some runtime validation to make sure that when you connect two entities, they are configured with the same pixel format, so that everyone agrees on what the data that will be transferred is. And there is a user space utility called media-ctl that you can use to configure these links.
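The notions of entities, pads, links and format validation can be modeled in a few lines of Python. This is a sketch for intuition only, not the kernel API: the class names, the entity names `mipi-csi2` and `csi`, and the convention of numbering sink pads before source pads are all assumptions of the example. The `describe` method prints the link in the style of the link description that `media-ctl -l` takes, as far as I understand that syntax.

```python
class Pad:
    """An input (sink) or output (source) connection point on an entity."""

    def __init__(self, entity, index, is_source):
        self.entity = entity
        self.index = index
        self.is_source = is_source
        self.format = None  # name of the format configured on this pad


class Entity:
    """One block of the pipeline, such as a sensor or a CSI-2 receiver."""

    def __init__(self, name, sink_pads, source_pads):
        self.name = name
        # Sketch-only convention: sink pads are numbered first, then source pads.
        self.pads = [
            Pad(self, index, is_source=index >= sink_pads)
            for index in range(sink_pads + source_pads)
        ]


class Link:
    """A possible connection from a source pad to a sink pad."""

    def __init__(self, source, sink):
        if not source.is_source or sink.is_source:
            raise ValueError("links go from a source pad to a sink pad")
        self.source = source
        self.sink = sink
        self.enabled = False

    def describe(self):
        """Render the link as '"entity":pad -> "entity":pad[flag]'."""
        return '"{}":{} -> "{}":{}[{}]'.format(
            self.source.entity.name, self.source.index,
            self.sink.entity.name, self.sink.index, int(self.enabled))


def formats_agree(links):
    """Runtime validation: both ends of every enabled link use the same format."""
    return all(l.source.format == l.sink.format for l in links if l.enabled)


# Hypothetical two-entity pipeline: receiver pad 1 feeds capture pad 0.
mipi_csi2 = Entity("mipi-csi2", sink_pads=1, source_pads=1)
csi = Entity("csi", sink_pads=1, source_pads=0)
link = Link(mipi_csi2.pads[1], csi.pads[0])
link.enabled = True
mipi_csi2.pads[1].format = "SBGGR8_1X8"
csi.pads[0].format = "SBGGR8_1X8"
```

With both pads set to the same format, `formats_agree([link])` passes. Changing the format on only one side makes it fail, which mirrors the check done before streaming can start.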
So for example, here I'm configuring pad number one of this subdev to be connected to pad number zero of this subdev, and the one indicates that the link should be enabled. So yeah, it's a bit blurry. This is just to give you the big idea or a head start on that, but it's definitely complex, so it's normal that it seems a little bit blurry. It's just so that in case you have to work on that, you know what the things involved are. In the end, we can end up with very complex pipelines, okay? Each of the green blocks is a subdev, so they represent a specific functionality that can be connected in different ways. The yellow blocks are the actual DMA engines, so the video nodes that are visible from user space and that programs can connect to in order to receive the data. But of course, if you haven't configured the rest of the chain properly, then there will be no data available. So this is really what you use at the end, when everything is configured, everything is ready and it works. Okay, so that was for the general pipeline integration. Now let's talk about ISPs more specifically. ISPs are just a kind of subdev and media entity. They typically have an internal pipeline with multiple things in it, and we don't necessarily represent the internal pipeline unless it's relevant, so there will normally just be one subdev for the ISP. But this subdev will have highly specific parameters. Like I said, it depends on the hardware implementation, so the representation of the parameters that you give to the hardware will differ from one implementation to another. It means that it's actually very hard to have a generic interface that would work for every ISP and be the same. So instead of that, in V4L2, there are actually driver-specific or hardware-specific structures that are used to configure the ISP subdevs.
So the way it works is that we have one or more capture video devices, that's the same as the /dev/video0 where you get the typical data, the final data that you want. And we have extra video devices that we can use to configure the ISP and to get side information from the ISP. These are the meta output and meta capture video devices. The meta output is there for parameters. In V4L2, output is when you provide something to the driver, not when you get something from it, which is a bit confusing, but that's how it is. With that, basically, you will also use the same queue interface as you have with a video device. But instead of having pixels in the buffers, you will have particular structures that correspond to the parameters of the ISP, which you are going to fill with a particular configuration. Then you can push that as a buffer to the video device, and the ISP will be configured to use those parameters. For the meta capture, which is the data provided by the ISP, you get the typical feedback information from the ISP. Essentially it will be statistics about how sharp the picture is, how dark the picture is, things like that, so that you can use this information to create a feedback loop and then provide new parameters in the output video device to properly configure the ISP to respond to a change in the scene or something like that. For example, if you switch off a light and turn a different one on that has a different color temperature, then you will get that information from the statistics, and you will be able to adjust the parameters to respond to that change. So that's how it works. Here is an example from the RkISP1, the Rockchip ISP1, where you can typically find this same topology. The ISP is here. It actually has extra subdevs before the video devices for capturing the pixels. But you also find this statistics video device and params video device.
So the params will take a particular structure here that you can configure, and the statistics will take another one with the information provided by the ISP. Okay, so that gives you a big overview of how all of this is supported in V4L2 in mainline Linux. Now let's take a look at the thing I actually worked on for the Allwinner cameras, using, again, these same interfaces for the particular use case of cameras interfaced with Allwinner SoCs. In the Allwinner second-generation hardware implementation, we have MIPI CSI-2 controllers, which are really the components connected to the actual MIPI CSI-2 bus. They are separate hardware blocks that are connected through a FIFO to the CSI controller, which is really just a DMA engine that will get some pixels in and write them to memory, with some formatting and timing things. But essentially, that's what it does. This CSI controller, again, was already supported in mainline, but not the MIPI CSI-2 controllers. The CSI controller actually also needs to be configured specifically to take its input from the MIPI CSI-2 controller instead of the parallel interface, which is the only choice that was supported before. So that's one of the things I had to add support for. There was a lot of reworking of the CSI code to support that, even though the biggest rework was actually to support the ISP. We need to get some information from the sensor to properly configure the MIPI CSI-2 interface on the receiving side. For that, we use a V4L2 control that the MIPI CSI-2 controller is going to retrieve from the sensor driver, through the subdev interface again. So it knows what the clock frequency of the bus will be.
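To give an idea of the numbers behind that bus clock frequency, here is a small Python sketch of the usual relationship for a CSI-2 link over D-PHY, where each lane carries two bits per clock cycle (double data rate). The function names and the example figures are mine, chosen for illustration; the actual values come from the sensor driver.

```python
def dphy_link_frequency(pixel_rate, bits_per_sample, lanes):
    """Link frequency in Hz needed to carry pixel_rate samples per second."""
    if not 1 <= lanes <= 4:
        raise ValueError("this sketch assumes between one and four data lanes")
    # Each lane transfers two bits per clock cycle on D-PHY.
    return pixel_rate * bits_per_sample // (2 * lanes)


def dphy_pixel_rate(link_frequency, bits_per_sample, lanes):
    """Inverse relation: samples per second that a given link can carry."""
    if not 1 <= lanes <= 4:
        raise ValueError("this sketch assumes between one and four data lanes")
    return link_frequency * 2 * lanes // bits_per_sample
```

For instance, a hypothetical sensor mode producing 168 million 10-bit samples per second needs a 420 MHz link on two lanes, or 210 MHz on four lanes, which is why the receiver has to know both the lane count and the link frequency.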
We also use a part of the generic Linux PHY API for the configuration of the receiving side, because MIPI CSI-2 works with a physical protocol, or let's say physical implementation, from MIPI called D-PHY, which is the physical layer that is used by this interface. So there needs to be some configuration about that, and for that, we use the Linux PHY API. Now if we look more closely at the platforms that I got interested in, first, for the A83T, there was actually some source code provided in the Allwinner vendor releases that we could use as a base to implement a proper mainline driver. It has lots of magic values in registers, so sometimes it's just writing things to registers and we have no idea what they mean, but we basically just took that in and did the same, and it just worked. So there's still some magic involved, but that's unfortunately not so uncommon, so we just have to deal with it. The D-PHY part is separate, so it has different control registers, but that was also supported in that Allwinner SDK downstream code, so we could also just reuse the same thing and it worked. For the A31 and V3 support, which is again the second generation of Allwinner SoCs, we have a different MIPI CSI-2 controller from the A83T, so it was necessary to write a separate driver for that one. There was also reference source code available and some documentation in one of the user manuals of the platforms, so that was, again, sufficient to write a driver. It turns out that the D-PHY part is actually the same controller that is already used for MIPI DSI, which is a display interface that uses the same physical layer encapsulation, I would say. So there was actually already a driver for the D-PHY block used for MIPI DSI, in which case it's in transmit mode, because when you want to drive a display, you push pixels out. In our case, we reused that driver but configured it instead in receive mode for MIPI CSI-2, so we could get pixels in.
So that was also a change in this driver. It was then necessary to indicate in which direction it should be running, and there were different approaches that were possible for that. I think in the end, we settled on a particular device tree property to configure this mode. The outcome of this work was first some series to support the MIPI CSI-2 controllers, so about 2,600 added lines, which is pretty big. That's two new drivers here and here, some changes to the D-PHY, like I just mentioned, and some device tree changes, so that's most of it. I started this work in October 2020, and it was merged in Linux 6.0 in June 2022. So now these drivers are upstream, you can use them, and they work. I actually got a number of people writing to me and saying that they have been using this in different situations, and apparently it works pretty well, so I'm pretty glad about that. It's pretty nice. So that was for the MIPI CSI-2 part, and let's say the big part of the work was supporting the ISP. The ISP is connected to the CSI controller as well, but on the other side, meaning that the data will flow from MIPI CSI-2 to the CSI controller to the ISP. There also needed to be some configuration to be able to support that, and especially a big rework was required, because when you start using the ISP, the DMA engine that is used to write the data to memory is no longer the DMA engine of the CSI controller. So the CSI has to act like a regular subdev, okay? It's no longer the final sink for the data, but just one more element in the chain. The driver had to be reworked to support this different mode of working, where it will not register itself as the parent V4L2 device, but instead it will register itself as a subdev, and the parent V4L2 device will be the ISP driver, which is again a separate driver.
So that required quite some rework, and also support for both modes, obviously, because not everyone is interested in using the ISP, and not every platform even has an ISP. What else to say? I don't know if I put it here, but the ISP has a weird way of being configured. On typical hardware, you would just have some registers and configure them, and then the effects would be applied on the next frame or something like that. But this hardware actually has a DMA buffer where you write the new values of the registers, and then you trigger some update bits, and the hardware itself will go and read from the DMA buffer and copy that data to its registers, synchronously with the vertical synchronization, so when you receive a new frame. It's a very odd way of working, but that's how it does it. If you write directly to the registers, it won't actually do anything. You need to write to a side buffer, and then tell the hardware to update its registers from that buffer. So yeah, it's a little bit weird. If you look at the driver, you'll see that there is this buffer that is allocated for that, so that's the reason why. That's how it works, and that's what the Allwinner code is doing, so that's how it's done. So that's the final pipeline that we have, with the sensor here, connected to the MIPI CSI-2 subdev, which is a separate driver. Then it goes through the CSI driver, which in this case is configured as a subdev only. And then it goes to the ISP subdev, which provides a DMA capture interface where you have the final data that was processed and that should look good. It also has another video device for the parameters. Like I described with the Rockchip ISP, this one is implemented the same way, so we also have a specific structure to configure it.
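The unusual register update mechanism described above can be modeled as follows. This Python class is a toy model of the behavior as explained in the talk, not the driver's code: the names `DeferredRegisterBlock`, `load_buffer` and `trigger_update` are made up for the illustration, and the vertical sync is simulated by an explicit method call.

```python
class DeferredRegisterBlock:
    """Hardware whose registers are only updated from a side buffer at frame start."""

    def __init__(self, register_count):
        self.registers = [0] * register_count    # live state used by the hardware
        self.load_buffer = [0] * register_count  # side buffer in memory, read by DMA
        self.update_pending = False

    def write(self, index, value):
        """Driver side: new values only ever go to the side buffer."""
        self.load_buffer[index] = value

    def trigger_update(self):
        """Driver side: set the update bit asking the hardware to reload."""
        self.update_pending = True

    def vsync(self):
        """Hardware side: at vertical sync, copy the buffer if an update was requested."""
        if self.update_pending:
            self.registers = list(self.load_buffer)
            self.update_pending = False
```

A value passed to `write` stays invisible to the hardware until `trigger_update` has been called and the next `vsync` has happened. One benefit of this scheme is that a whole set of parameters takes effect at a frame boundary, rather than partway through a frame.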
Currently, there is no support for the statistics, but in the future, when such support is added, there will be another video device connected to this ISP subdev to provide the feedback data out. OK, so yeah, that's pretty much what I just said. A few details about the currently supported features in that config parameters buffer. Currently, we support the Bayer coefficients, so we can translate from the raw Bayer data to actual RGB data, and we can tweak how much of each color channel we put in. That will typically allow for different color temperatures, basically. We also support 2D noise filtering, which is called BDNF, so it's bi-directional noise filtering, which is basically like a low-pass filter, so it will remove the high-frequency stuff in your picture, and that will make it look smoother and nicer. And also easier to encode, which is one of the big reasons why you need to do noise filtering. And yeah, those are the main two features, so there's still a lot to be added. That's just the scope of what our project was at the time, but there's definitely a lot of room for improvement. The ISP itself has numerous hardware capabilities, and those could be added later in the driver. For that reason, it was submitted to staging in Linux, because we don't yet support all the features, so we don't yet have a complete description of that structure, and since it's part of the API, we want to make it clear that it's not finalized yet. There will be some additions to this structure to support other features that are currently not implemented. This code was submitted in September 2021, and it was merged in November 2022. So this is in Linux 6.2, and you can get that with the update, so that's pretty nice. This change was much bigger. You can see it's 8,000 lines of additions, so it's a whole new driver and a big rework of the previous sun6i-csi driver, which was more or less a complete rewrite of the driver, so it's pretty big.
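The effect of the two supported features can be illustrated with a short Python sketch. To be clear about the assumptions: the per-channel gains stand in for the Bayer coefficients, and a three-tap box filter on one row stands in for the noise filter. The real BDNF block is a 2D filter whose exact behavior is not described in the talk, so this only shows the general low-pass idea.

```python
def apply_channel_gains(pixel, gains, max_value=255):
    """Scale each color channel by its own gain, clamping to the valid range."""
    return tuple(min(max_value, round(c * g)) for c, g in zip(pixel, gains))


def box_filter_3(samples):
    """Average each sample with its direct neighbors (a basic low-pass filter)."""
    filtered = []
    for i in range(len(samples)):
        window = samples[max(0, i - 1):i + 2]  # shorter window at the borders
        filtered.append(sum(window) / len(window))
    return filtered
```

Raising the red gain relative to blue warms the picture, and the reverse cools it, which is how different color temperatures are compensated. The box filter spreads an isolated spike over its neighbors, which is the smoothing that also makes the picture cheaper to encode.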
Just to finish on what is left to do in this area. Currently the ISP only supports the V3 platform, but the same hardware is found on the A83T, and there are a few other chips that have previous versions of the same hardware, so they could be supported in the same driver. That's something that could be done in the future. I mentioned that there are no statistics currently, so that is also something that could be added in the future. It has numerous other features that we could support, scaling, rotation, and of course all of the modules inside the ISP for all the different features that I mentioned. And we don't have any 3A algorithm support in user space to implement this feedback loop, so that is also something to be worked on. Of course, doing that would be a great fit for libcamera. Kieran has just talked about it, so I won't go over it again, but that's definitely a good fit for supporting an ISP with mainline Linux, so hopefully it will soon be well integrated in libcamera. Someone recently submitted patches about this, so it's going in this direction, and that's pretty nice. That's pretty much the end of this talk. I just wanted to mention that Bootlin is hiring, so if you are interested in this kind of stuff, how to support everything, you can reach out to us. We have positions available, also internships, so feel free if you're interested. And that is pretty much it for me, so thanks everyone, and now I'll take questions if there are any. Hi there, that was fantastic, thank you, who knew it was so complicated. The last time I looked at some of this with NXP Freescale parts, we were using GStreamer with V4L sources coming into it, and a lot of the headache was that there was loads of buffer copying all over the place, and there were different memory maps and different access for different components to different memory maps.
So with what you're explaining here, a typical use case might be: we do this image processing, then I want to encode it with H.264 or H.265, maybe I want to push it into a GPU to do some kind of image analysis with AI machine learning techniques. Could you say something about how that hangs together with buffer copying and so forth? So basically nowadays the V4L2 framework has great support for DMA-BUF, which is a technology used for buffer sharing across different devices. So with that driver you could absolutely reuse the same memory where the ISP is producing the picture and use that as the source for an encoder or even the GPU, because DRM also supports DMA-BUF pretty well. So you could do all of that with zero copy, that's definitely all supported, and I didn't have to do anything special to have that work, it's just that the V4L2 framework has that now. So unless your hardware has weird constraints, like the GPU can only access this part of memory or things like that, which are not really well represented currently, in the general case it should work pretty well. So yeah, basically when we have an encoder driver for these Allwinner platforms we will definitely be able to directly import ISP output to encoder input, with no copy and low latency. Yeah. So. Anyone else? Yeah, thanks for your talk and for supporting, hopefully, more mainline Linux so we have more phones available. I have a question about the support for artificial neural network accelerators. Do you have any idea if this is somehow integrated into the kernel stack in this way? I mean, it's a lot of work like this as is, but well. Yeah, so the AI accelerator stuff, that's not really the same scope as the camera stuff, but that is definitely moving forward. There is an accel subsystem that was added to the kernel quite recently, which is based on DRM for some aspects.
And I think more and more drivers are being contributed towards that, so the main issue currently would be that the compilers that compile the models into the hardware representation are typically non-free and probably going to remain so in a number of cases. So feel free to push your hardware provider, or whoever, for free compilers for these models. Any more questions? You mentioned patches for libcamera for the ISP. Could you point me to them? Sorry? Could you point me to the patches you mentioned, and do you plan to work on libcamera? It's Adam Pigg, right? Sorry? It's Adam Pigg's. That's just for the CSI receiver as far as I'm aware, not for the ISP. Maybe I went a bit fast over that. So it's actually patches on the driver side, the sun6i-csi driver, to implement things that libcamera expects. So I think you know the one thing I'm talking about. So do you plan to work on the ISP support in libcamera? So personally, I would be very happy to do so; we're just looking for someone to fund that effort. So if, you know, someone has lots of money and interest, please come and talk to us. No, but seriously, I know that people would definitely be interested in that, so it's good to spread the word that we are available to do that. We just need someone interested and serious about funding this, but we would definitely be very happy to do it. So yeah. Okay. Cool. Thank you for a great talk. And that's the end of the questions. Thank you. |
U-Boot as PSCI provider on ARM64 |
Yeah, so let's start with this talk. My name is Marek Vasut and this talk is about U-Boot as a PSCI provider. Now, PSCI stands for Power State Coordination Interface. It's a standard drafted by ARM and it is used on ARM systems. It defines a software interface that's used by things like bootloaders and operating systems to bring up CPU cores, stop CPU cores, do a system suspend and resume, and perform reboot and power off. The presence of the PSCI interface is mandatory on ARMv8. It is optional on ARMv7, although you can find ARMv7 systems which also provide PSCI. There is a related interface called SCMI, and this is used for clock management and power domain management of devices. You may sometimes see systems which misuse PSCI for this kind of functionality, like power domain on and off, and this is wrong. That all goes into SCMI. We will not talk about SCMI right now, however. The reasons why PSCI exists are multifold. One of them is convenience. The thing is, doing these things like CPU core on, off, suspend, resume is a really horribly complex process, and hardware is full of bugs, so implementing it correctly, so that your system doesn't randomly crash during suspend for example, means the code is very complex, and if you want to run multiple OSes on an ARM machine there is a balancing act in place. Basically what ARM decided was to implement this once, implement it properly, and then expose to the operating system an interface which allows it to say: okay, now suspend, or now bring up a CPU core. And all this horrible complexity and all the workarounds for the hardware bugs are hidden behind this sort of interface, which is implemented once. So that pretty much covers the convenience and the complexity. The other thing is that the code which brings up CPU cores may interact with, say, regulators, and this could potentially damage the hardware if you do it wrong.
So if the hardware is very fragile it may also be a good idea to hide this from the operating system, which may crash and do something wrong and then potentially damage the hardware. That's why it's hidden in the firmware. However, what you may argue is: if we put this functionality in the firmware, what happens if the firmware is buggy? Then you have to update the firmware, which essentially means updating your bootloader, which is a dangerous operation unless you are very well prepared for it. It can break your machine if you do it wrong. So all of this is really a balancing act: whether to put it into the OS or into the bootloader. One completely separate reason for the existence of this API is virtualization. In a virtualized setup on ARM, the secure monitor firmware, which is running in the highest privilege level, provides a PSCI interface to the OS running in a lower privilege mode, which is EL2, and that OS itself can provide the same-looking PSCI interface to the OS running in virtualization, so in EL1, and to the OS this looks very much identical whether it's running in virtualization or on bare metal. So for that purpose too there is the PSCI interface, which allows you to bring up CPU cores, which in one case may be virtual and in the other case are the actual real hardware CPU cores. Now, the way PSCI is implemented is by means of SMCCC, which stands for SMC Calling Convention, which is another standard drafted by ARM. It basically tells you that on ARM64 there are two instructions, one of them the SMC instruction, the other the HVC instruction, and they both trigger a synchronous exception. In case of the SMC instruction the synchronous exception lands in exception level 3 (EL3); in case of HVC the synchronous exception lands in EL2. The SMCCC also tells you which CPU registers to set up before you call the SMC, and which CPU registers are then used as the return value from the SMC or HVC instruction.
As for the exception levels, there are four of them on ARM64, EL3 to EL0. EL3 is the most privileged one; this is where the secure monitor firmware runs, and this is also where the code which brings up the CPU cores and does the suspend, resume and all this is running. EL2 is less privileged, and this is where the operating system is running, the one which is running on bare metal. You can use SMC from EL2 into EL3 to request services from the secure monitor. EL1 is used for a virtualized OS, so an OS which is running in virtualization can do an HVC, which triggers a synchronous exception in EL2 in the OS which is running on the bare metal, and the OS running on the bare metal may provide some services to the virtualized OS this way. You can read all about these exception levels in the ARM specification. If you download the slides, which are in Pentabarf, you can use all these links, which will redirect you to all the specifications, so you can just read all about that. Suffice to say, for now, that there are these four exception levels on ARM. The way the SMC actually works is that if you want to do an SMC request you're supposed to set up CPU register x0 with a function ID, which basically says what kind of request you want performed by the secure monitor or by the OS, and then you're supposed to set up six additional parameters, x1 all the way to x6, which are parameters for the function you want to trigger. With this set up, you execute the SMC or HVC instruction. This instruction triggers a synchronous exception. The synchronous exception makes the CPU elevate its exception level to the higher one and enter the exception handler, which validates that the function ID is even okay for you to call and that the parameters for the function are okay at all, and if all of this is correct then the request represented by this function ID is performed by the secure monitor firmware or by the OS.
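The register convention described above can be modelled in a few lines. This is a toy model only, not firmware code: `smc`, `secure_monitor_entry` and the handler table are hypothetical names for this sketch, and the handler simply pretends to implement PSCI 1.1. The function ID 0x84000000 (PSCI_VERSION) and the -1 "not supported" return code are taken from the PSCI and SMCCC specifications.

```python
# Toy model of the SMCCC register convention; names are hypothetical.
PSCI_VERSION = 0x84000000  # function ID from the PSCI specification
NOT_SUPPORTED = -1         # returned in x0 for an unknown function ID


def psci_version(*_args):
    # Hypothetical handler: report PSCI 1.1 as (major << 16) | minor.
    return ((1 << 16) | 1, 0, 0, 0)


HANDLERS = {PSCI_VERSION: psci_version}


def secure_monitor_entry(function_id, args):
    """Handler side: validate the function ID, then perform the request."""
    handler = HANDLERS.get(function_id)
    if handler is None:
        return (NOT_SUPPORTED, 0, 0, 0)
    return handler(*args)


def smc(x0, x1=0, x2=0, x3=0, x4=0, x5=0, x6=0):
    """Caller side: x0 = function ID, x1..x6 = arguments, result in x0..x3."""
    return secure_monitor_entry(x0, (x1, x2, x3, x4, x5, x6))


print(smc(PSCI_VERSION))
print(smc(0x8400FFFF))
```

On real hardware the jump between the two functions is the synchronous exception itself; here it is just a function call, which is enough to show which registers carry what.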
Once the request is performed, the secure monitor firmware or the OS will set up four additional registers, x0 to x3, with the return values, and will return just past the SMC or HVC instruction into the calling software and resume execution at the exception level of the calling software. Then the calling software can collect the result of this call in the registers x0 to x3 and do something with it. This is roughly how it works. Now, about these function IDs. These function IDs are the requests you can make to the secure monitor firmware or to the OS running on the bare metal. You can actually not find them in the SMCCC specification, because the SMCCC specification just says there are function IDs, but the blocks of these function IDs are distributed across various specifications, like the PSCI specification, which has two blocks carved out of the function IDs, or the SCMI specification, which has its own set of function IDs. The PSCI specification has two sets of function IDs. One is for 32-bit PSCI calls, the other is for 64-bit PSCI calls. The only reason for this is that 64-bit PSCI calls take 64-bit parameters, so the function signature is slightly different. But beyond that, the 32-bit and 64-bit PSCI functions and function implementations are very much compatible. So you can look up the function IDs, obviously, in the PSCI specification. You can also look them up in the U-Boot sources. You can look them up in the Linux kernel sources. This stuff here is coming from the U-Boot sources. So what you can see here is, for example, the CPU_ON PSCI function, which is actually a macro that expands to something like 0xC4000000 plus 3. So this would actually go into register x0 before you call the SMC instruction. Now, there are multiple callers of the SMC instruction as well as multiple handlers. There are callers in U-Boot. This is all built around the smc_call and hvc_call implementation in fwcall.c. In Linux the PSCI implementation lives in drivers/firmware/psci/psci.c.
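As a worked example of how a value like "0xC4000000 plus 3" comes together, the sketch below composes PSCI function IDs from the bit fields that the SMCCC defines (bit 31 for fast calls, bit 30 for the 64-bit convention, bits 29:24 for the owning service). The helper name `psci_fn` is made up for this example; the resulting constants match the PSCI specification.

```python
# Compose PSCI function IDs from the SMCCC bit fields.
FAST_CALL = 1 << 31       # bit 31: fast call
SMC64 = 1 << 30           # bit 30: 64-bit calling convention
OWNER_STANDARD = 4 << 24  # bits 29:24: standard secure service calls


def psci_fn(number, is_64bit=False):
    """Hypothetical helper: build the ID for PSCI function `number`."""
    base = FAST_CALL | OWNER_STANDARD | (SMC64 if is_64bit else 0)
    return base + number


PSCI_VERSION = psci_fn(0)
CPU_OFF = psci_fn(2)
CPU_ON_32 = psci_fn(3)
CPU_ON_64 = psci_fn(3, is_64bit=True)

print(hex(PSCI_VERSION), hex(CPU_OFF), hex(CPU_ON_32), hex(CPU_ON_64))
```

The only difference between the 32-bit and 64-bit CPU_ON IDs is bit 30, which matches the point above that the two variants differ only in parameter width.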
The handlers are either in ATF or in U-Boot itself. The U-Boot SMC callers are all built around this smc_call function. So anything in U-Boot which does PSCI interaction is essentially smc_call with the PSCI function name and then some parameters for the PSCI function. If you look at smc_call in U-Boot, it actually very much copies what's in the SMCCC. So that means: set up register x0 with the function ID, set up a couple of parameter registers x1 to x6, then trigger the SMC instruction. Once the SMC request is done, the execution will return past the SMC instruction and continue here, where the U-Boot code will collect the registers which were set up by the secure monitor firmware as the return values from the SMC instruction, and then you can use them in the U-Boot code. There is a matching hvc_call a little bit further down in fwcall.c in U-Boot, if you want to look it up, which is used for the EL2 HVC call. U-Boot has the bonus that it actually has a command which is called smc. So on the U-Boot command line you can experiment with SMC calls; it's a command which takes up to seven parameters. The first parameter is the SMC function ID and the six additional parameters are the parameters for the SMC function. So if you want to do a PSCI call, I think this one here is PSCI version, you can do it from the U-Boot command line and you can experiment with this all you want. The return value from the smc command is four values, which are the x0, x1, x2 and x3 CPU registers. So you can then analyze what you got out of the SMC call, if it didn't fail, obviously. As for the Linux kernel, there is this additional thing in Linux: when the PSCI firmware driver is probing, Linux has to figure out whether it is running on bare metal or in virtualization.
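When experimenting with a PSCI version call like the one mentioned above, the x0 value that comes back needs decoding. A minimal sketch, assuming the encoding from the PSCI specification (major version in bits 31:16, minor version in bits 15:0, negative values for errors); the helper names are made up for this example and the sample values are not taken from any real board.

```python
# Decode the x0 value returned by a PSCI_VERSION call.

def to_signed32(value):
    """Interpret the low 32 bits of a register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_psci_version(x0):
    """Return (major, minor), or raise if the call reported an error."""
    status = to_signed32(x0)
    if status < 0:
        raise RuntimeError(f"PSCI call failed with status {status}")
    return (status >> 16) & 0xFFFF, status & 0xFFFF


print(decode_psci_version(0x00010001))
```

A return value of 0xFFFFFFFF would decode as -1, the "not supported" status, rather than as a version number.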
So if Linux is running on bare metal then it uses the SMC instruction to communicate with the secure monitor firmware; otherwise, if it's running in virtualization, it uses the HVC instruction to communicate with the OS that's running on the bare metal. But beyond that, the PSCI firmware driver in Linux just exposes the PSCI functions as a wrapper around SMC calls, and the actual SMC instruction call and the setup of the x0 all the way to x6 registers is implemented in smccc-call.S in arch/arm64, so it's very much yet again a wrapper around the SMC instruction, no matter whether it's U-Boot or Linux. But now let's talk about the more interesting part, which is the handlers. To be an SMC handler, the CPU core has to fulfill a couple of requirements. The main requirement to handle SMC exceptions is to be able to even receive the exceptions. So the CPU core basically has to be able to receive exceptions in EL3 if it wants to handle SMC. If you are on an SMP system you also have to be able to receive IPIs, inter-processor interrupts, because in order to bring up secondary cores it is necessary for the secondary cores to be able to receive IPIs to break them out of a loop in the PSCI provider firmware, because the OS is not immediately ready for the secondary cores. I'll explain that in a bit. In U-Boot most of this PSCI and synchronous exception handling code is actually in place already and it's all generic code. The U-Boot entry point is very much here in start.S and the PSCI synchronous exception handling code is here in psci.S. It's there both for ARM32 and ARM64, just in different subdirectories. All you as a user actually have to implement is psci.c, which holds the C callbacks of the actual PSCI functionality, which perform the stuff the PSCI functions are supposed to do with the hardware, like start the CPU core, stop the CPU core.
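The choice between the two instructions at the start of this passage can be sketched as follows. In the PSCI device tree binding, the firmware node carries a `method` property that is either "smc" or "hvc", so a caller only has to read that property. This is a simplified model with a plain dictionary standing in for the device tree node; it does not reproduce the Linux driver's probing code, and `pick_conduit` is a made-up name.

```python
# Pick the instruction used to reach the PSCI provider, based on the
# "method" property of the PSCI device tree node (modelled as a dict).

def pick_conduit(psci_node):
    method = psci_node.get("method")
    if method == "smc":
        return "smc"  # traps into the secure monitor in EL3
    if method == "hvc":
        return "hvc"  # traps into the hypervisor in EL2
    raise ValueError(f"unsupported PSCI method: {method!r}")


bare_metal = {"compatible": "arm,psci-1.0", "method": "smc"}
virtualized = {"compatible": "arm,psci-1.0", "method": "hvc"}
print(pick_conduit(bare_metal), pick_conduit(virtualized))
```

Everything above the conduit stays the same in both cases, which is why the OS cannot easily tell a virtual CPU core from a real one through this interface.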
So all this stuff is generic, and this stuff is SoC-specific, and if you decide to implement a PSCI provider in U-Boot you have to fill that in. Now, if U-Boot is configured as a PSCI provider then U-Boot is running in EL3, that means in the highest exception level. That means U-Boot is not able to perform any SMC calls, so you have to make sure there are none, because otherwise the system would just hang on boot. The OS will be running in EL2 and it will be able to do SMC calls into the U-Boot synchronous exception handler, so this is something to keep in mind. Beyond that, if U-Boot is configured to be a PSCI provider there is really only a little bit of additional setup when U-Boot starts up, in this armv8_setup_psci, and this code basically takes the parts of U-Boot which are marked with the secure attribute, which is essentially the PSCI handling code. It copies them into SRAM, then it sets up the MMU tables and flags this SRAM with a secure bit. That means no code running outside EL3, so anything lower than EL3, will be able to modify this secure handling code. Finally, U-Boot sets up the exception vectors so that when the synchronous exception happens it will land in the U-Boot synchronous exception handler and then enter the PSCI code. When such a synchronous exception happens, the U-Boot synchronous exception handler is entered, so when an OS does an SMC call it will land here in the armv8 psci.S handler, and at that point the synchronous exception can be anything, so first we have to figure out whether this is even an SMC at all, or whether it could be a hardware fault, or an unknown SMC exception which we cannot even handle. If it is an SMC exception we need to figure out whether it's a 32-bit one or a 64-bit one. Assuming it's an SMC64 on ARMv8, we still need to figure out whether this is even a PSCI exception, or whether it could be another type of SMC.
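The decision chain at the end of this passage can be written down as a small classifier. This is a sketch under stated assumptions, not U-Boot's assembly: it uses the exception class field of the syndrome register (ESR bits 31:26, where 0x13 and 0x17 are SMC executed from AArch32 and AArch64), bit 30 of the function ID for the 32-bit versus 64-bit convention, and the PSCI range of function numbers 0x00 to 0x1F within the standard secure service owner. The function name `classify` and the returned labels are invented for this example.

```python
# Classify a synchronous exception taken to EL3 (illustrative sketch).
EC_SMC_AARCH32 = 0x13  # exception class: SMC executed in AArch32 state
EC_SMC_AARCH64 = 0x17  # exception class: SMC executed in AArch64 state
OWNER_STANDARD = 4     # SMCCC owner: standard secure service calls


def classify(esr, x0):
    """esr: syndrome register value, x0: function ID passed by the caller."""
    exception_class = (esr >> 26) & 0x3F
    if exception_class not in (EC_SMC_AARCH32, EC_SMC_AARCH64):
        return "not-smc"  # e.g. a data abort or another fault
    is_fast = bool(x0 & (1 << 31))
    owner = (x0 >> 24) & 0x3F
    number = x0 & 0xFFFF
    if not is_fast or owner != OWNER_STANDARD or number > 0x1F:
        return "other-smc"  # an SMC, but not a PSCI request
    return "psci64" if x0 & (1 << 30) else "psci32"


print(classify(0x17 << 26, 0xC4000003))  # 64-bit CPU_ON
```

Only after this classification does it make sense to look up a per-function callback, which is the step described next.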
If it is PSCI, then U-Boot looks up the callback function which implements the PSCI function ID, if it even exists in U-Boot, and if it does, it sets up the C runtime environment and jumps into the C function, which then looks very much like this. In this C function you can just do a write into a register and, for example in this case, power off the system, and you don't have to care about the assembler before that. All you have to care about with the PSCI provider is very much this, because this is SoC-specific and this is something you have to implement. Now, on SMP there is this additional problem: when the operating system running in EL2 requests from the PSCI provider that it wants to bring up a secondary core, the operating system will pass through PSCI a pointer to the OS entry point, but you cannot just turn on the secondary core, which will start up in EL3, and point it at the OS entry point, because this would be a security violation. You would essentially start a CPU core which is running in the highest privilege level and make it enter the operating system in some sort of highest-privilege state even though the OS is running in a lower-privilege state. So what happens there is that the CPU core actually has to enter U-Boot. In the U-Boot init code the CPU core gets configured and set into a defined state so that it can enter the OS, the CPU core's GIC, the interrupt controller, registers are configured so that it can receive an IPI, then the CPU core drops into EL2, and then the CPU core starts spinning and waiting for an IPI, so that when the operating system is actually ready to receive the CPU core it can ping it with an IPI, and the CPU core will then be released to the operating system, jump to the operating system entry point, and then the operating system runs on two cores. So this is the detail with SMP. Finally, here is a summary of what to do in case you want to use U-Boot as a PSCI provider. You have to look up the GIC distributor and
redistributor base. This is something which you find in your SoC datasheet, or if there is a Linux device tree it's already there, and you define these two macros, GICD_BASE and GICR_BASE. Then you have to make sure that your DRAM is marked as non-secure, because sometimes it is marked as secure, and if it is marked as secure in the MMU tables then your OS will not be able to access DRAM and it will crash. You potentially have to configure other security-related registers of the CPU; this is again SoC-specific and you have to look it up in your SoC datasheet. Then finally the main part of the implementation: fill in your psci.c callback implementation, again SoC-specific, then remove the previous BL31 PSCI implementation blob, which potentially was ATF, enable these U-Boot config options, give or take, in the U-Boot board config, and compile, and then it should basically work. In case it doesn't work, U-Boot has the debug UART functionality, so if you have two UARTs on your machine you can point the debug UART at the other UART, not the console UART, and use this dedicated lightweight printing mechanism to essentially print some sort of debug output from the U-Boot PSCI provider, the secure part. Even while the Linux kernel is running it is possible to get some debug UART prints out of this. Here are the config options which you use for that. Okay, and now, since I am through my slides, I promised an example, so this will be very boring. Here is U-Boot, and if you are familiar with U-Boot, this is how it looks, and it just looks all the same, except if you are familiar with U-Boot on i.MX8M Plus, or i.MX8M in general, you may notice that there is no NOTICE here. The NOTICE comes from the ATF BL31 blob, and the blob is not there because U-Boot is the provider of that functionality now. But beyond that I can boot the Linux kernel all the same. The Linux kernel detects that there is a PSCI interface in the firmware, which is now provided by U-Boot, the Linux kernel brings up the CPU
cores, the CPU cores show up in /proc/cpuinfo, and the CPU cores just work, and that's actually all there is to show. It's exactly the same as it was with the blob, except now you have one less entry in the SBOM, so you no longer need the ATF BL31 blob which is bundled with U-Boot, because U-Boot can do it for you now. And by the way, this stuff is now upstream, since two days ago. In case you enable the debug UART you will see some sort of debug print out of the secure part of U-Boot. For example, here is PSCI CPU_ON 64: this is the Linux kernel sending the PSCI request to U-Boot, and U-Boot just brings up the CPU core for the Linux kernel. And that's it, thank you for your attention. Questions? Yeah, so if you have an existing ATF implementing PSCI and you want to move to U-Boot, you have to convert everything at once? There's no way to move functionality over step by step? So you see, the functionality is actually super simple. I mean, all you have to do is turn on a CPU core, turn off a CPU core, and suspend and power off and reset, and this is like 200 lines of code, so it's super simple really. And it's actually all upstream now for i.MX8M Plus, so you can actually just use that as an inspiration. |
barebox, the bootloader for Linux kernel developers |
So, hi everyone, welcome to my talk, welcome to barebox, and first of all, can all of you who have ever heard of barebox raise your hands? Okay, that's quite some. And who actually uses it in some projects? Okay, that's more than I thought. Okay. I'm Marco, I'm from Pengutronix, I'm an embedded software developer, and my tasks are related to kernel, to bootloader, to BSP stuff, graphics stuff and so on and so on. Most of the time I contribute my work upstream, sometimes not. I live in the north of Germany, yeah, and that's it. The agenda: there is just a brief introduction to barebox for those who are not familiar with it, and then let's add a new driver, as an example, then add a new board, so you can see how you can add your own board and upstream it to barebox, because we always welcome new boards, and then we do a short hands-on, hopefully we have some time for it. And yeah, okay, then welcome to barebox. barebox started in 2007 as a patchset for U-Boot, and the patchset was called U-Boot v2. This patchset was rejected, and then, yeah, Sascha renamed it and made an official fork of it in 2009. We have monthly releases, we have mainline support in PTXdist, we have mainline support in Buildroot, and we want to have mainline support in Yocto and OpenEmbedded-Core. I sent patches just yesterday; you can find the link below or in the slides. Please, any discussion is welcome. And if it's not getting into OpenEmbedded-Core, yeah, you can pull it from meta-ptx or meta-barebox. And okay, yeah, we have around 330 contributors, and it doesn't sound like that much, but we are living, barebox is living, we have around 1,400 commits per year, and okay, the graph is not that optimal, ideally it would rise all the time, but yeah, 1,400, it's not that bad. And we are alive, so let's add a new driver. And what are our design decisions in barebox?
We are coming from Linux, and we don't want to reinvent the wheel the third time or the tenth time, so we are taking the Linux device driver model and stripping it down to the bootloader use case. All the configuration has been done via device tree since day one, since we forked, and yeah, device tree and Kconfig, some of it, most of it device tree, and since we reuse the Linux device driver model, we also reuse the driver frameworks. We just strip them down to meet our requirements and then push them, and it's mostly just copy and paste with small adaptations. So then, let's add a new driver. As I said, it's copy and paste: copy it from the Linux source, copy it to the barebox source, and then adapt the code. So how does it look? As an example, I took the clock from some workshop, and I made a diff. You can see the diff above, it's Linux 6.1 or 6.2, so it's 6.2, blah, blah, and yeah, just adapt the headers, because, okay, of course, barebox does not have all those headers, and replace them with some barebox headers, and then replace some functions we do not support. That's not very important, it's just replacing some functions, removing them or replacing them, yeah. So we adapted the driver, we are finally done. So we have ported our new clock driver for this workshop. We have changed 50 lines of code for a driver of size about 171 lines of code. This makes about one percent or two percent of adapted code, and we ported the driver to barebox, to a bootloader. Most of the time, when you port a driver, well, clock drivers are a bit specific, but some drivers use IRQs, and in barebox we don't have IRQs, so you need to port them to some polling mechanism, but we have helpers for that. So you are welcome to port your driver, or whichever driver you need, into the bootloader. So then, after we have ported the driver, we need to compile it. So let's add a Kconfig entry and a Makefile entry, because we are a kernel-related bootloader.
So let's add a Kconfig menu entry and add it to the Makefile; it looks like in the kernel. So then we are finally done. We need to enable it, we need to compile it, and we have everything. And of course, we can test it with what? With the Linux device tree, because we are device tree based, and since we are using the complete driver from Linux, we also have the complete bindings from Linux. So we can use the upstream Linux device tree. You don't have to do anything, just copy and paste. So yeah, feels like writing a kernel driver, doesn't it? To me, at least. So to sum this up a bit, the barebox drivers are just stripped-down versions of Linux drivers, and drivers can be ported with little effort. There may be some more effort if you have more complex drivers, or if you port frameworks. But with frameworks, that really depends. There are some really easy frameworks, and there are some frameworks which are huge. So I added some examples of which frameworks we already support. A very recent one is the net DSA. I think some of you know the DSA stuff, this distributed architecture for switches. I think it's called Distributed Switch Architecture, anyway. And we support it in barebox, because we are using the net DSA framework from Linux. So we can do net DSA in the bootloader, so you can talk to your switch like you do in Linux. That's very impressive for a bootloader. And yeah, then let's move on. We have added our driver. Let's move on and add a new board. Before we can add a new board, I want to explain some stuff to you, some internals of barebox. I don't want to go into details, because we don't have that much time. If we have more time at the end, I will show some more examples. And yeah, of course, you can raise your hand if you have questions at the end. Anyway, we have a single binary in barebox, which you flash to the target. But this is composed of several components. One of them is the SoC header.
Then we have this prebootloader. Then we have some firmware blobs like the device tree, like third-party firmware blobs, TF-A or ATF or DDR firmware. Yeah, that depends on your SoC as well. And then we have the actual bootloader. So as you can see, there is a single image, but it's composed of different firmwares or different blobs. In barebox, we do this composing for the prebootloader, the firmware and the actual bootloader by linking them together. And the SoC header is appended. So, yeah, the SoC header is also very SoC-specific. And this appending mechanism depends on the image creation tool, which is also SoC-specific. Then also some booting stuff needs to be known before you can add the board. Most modern SoCs boot from a boot ROM. So you plug in the power and then the boot ROM comes up. The boot ROM loads some stuff from, yeah, the boot medium you configured via some GPIOs or some pins. And then it's loading, yeah, some kilobytes into some static RAM. And then it jumps to the static RAM and executes the prebootloader, and then the bootloader, and the bootloader finally the kernel. So now let's look deeper and add a new board. As I said, the boot ROM is loading this SoC header. It's decoding this SoC header. It's really SoC-specific, and it's executing the software which was loaded from the boot medium. Then the prebootloader gets loaded by the boot ROM and sets up the DRAM. And then it's loading the real bootloader to the DRAM after it has set up the DRAM. And then it jumps to the actual bootloader. Or if you have some more recent SoCs, like ARMv8-based SoCs, then it's jumping to the TF-A, and the TF-A is jumping to the bootloader which we previously loaded to the DRAM. So, okay, now we know that, and now we add the prebootloader stuff. The prebootloader stuff doesn't have any support for device tree or so. It's kind of low-level stuff. It's kind of dirty stuff. Anyway, in barebox, everything starts at the entry function.
The entry function is always the first point where it starts. And, okay, let's add the i.MX8MN EVK with this entry function. We have some helper functions after we load it into the DRAM. And then we are calling the NXP function. This is some board-specific code which we get here. And this sets up the UART. So, we have prebootloader low-level debug stuff. And then the function calls start ATF, and this function sets up the DDR, as I said. It loads barebox to the DRAM or DDR and starts the TF-A. And the TF-A is then jumping into the loaded image or into the given image, into the image which we loaded previously at the specific address we told the TF-A. And after we have loaded this, we get started by this last function, and that's the actual barebox entry function. And here you can see the prebootloader is passing the device tree. So, the prebootloader has also loaded the device tree and is finally passing the device tree to barebox. So, barebox is just running with the device tree. So, now let's come into barebox, to the bootloader. This is the place where all the magic happens, where all your board-specific fixups happen. Like, you have some overlays for some displays, you have different displays, you do some detection and then apply device tree overlays. So, your device tree is finally finished, finally fixed up for the kernel, and the kernel doesn't have to do anything. Such stuff can happen in the board code. And yeah, such board code is also a driver in barebox, and after the board code has finished loading everything and doing all the magic, it jumps to the kernel. So, as you can see here, this is the entry function for board code. Again, this looks to me like a kernel, like a kernel driver. You have some driver structure where you can put all the functions, and you have this compatible, which is also checked, and then, depending on the compatible which is given, the driver is loaded and executed or not.
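The compatible-based matching described above can be modelled in a few lines. This is a conceptual sketch in Python rather than barebox's C driver structures: `BoardDriver` and `probe_boards` are hypothetical names, and the compatible strings are only examples of what a board's device tree root node might carry.

```python
# Conceptual model of board code as a driver that is matched against the
# "compatible" strings of the device tree the prebootloader passed in.

class BoardDriver:
    def __init__(self, name, compatibles, probe):
        self.name = name
        self.compatibles = set(compatibles)
        self.probe = probe  # callback that does the board-specific fixups


def probe_boards(drivers, machine_compatibles):
    """Run probe() of every driver matching one of the machine's strings."""
    probed = []
    for driver in drivers:
        if driver.compatibles.intersection(machine_compatibles):
            driver.probe()
            probed.append(driver.name)
    return probed


log = []
drivers = [
    BoardDriver("imx8mn-evk", ["fsl,imx8mn-evk"],
                lambda: log.append("evk fixups")),
    BoardDriver("other-board", ["vendor,other-board"],
                lambda: log.append("other fixups")),
]
probed = probe_boards(drivers, ["fsl,imx8mn-evk", "fsl,imx8mn"])
print(probed, log)
```

Because the match is driven by the device tree, many board drivers can live in the same binary, and only the one matching the running hardware does any work.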
And then you have the board function, and within the board function you do this magic stuff I told you about. You detect where you booted from, do some magic stuff and set things up. This is something special in barebox. We have a barebox update handler. This is some kind of, yeah, you know this magic in U-Boot where you have some scripts to load something from MMC to some magic addresses, and here are magic addresses and there are magic addresses, and there are five commands or so involved. In barebox we have the barebox update command, and this is the handler for it. We just register the handler, and in barebox we can just call barebox update and we are finished. So, there is just one command involved. And yeah, then finally the board code is fixing up the device tree, and then it is, yeah, finished and starts the kernel. And the kernel does all the remaining stuff. So, that's not part of my talk. And yeah, okay. So, we have added a driver, we have added a new board, and everything, apart from the prebootloader stuff, feels like a kernel to me. So, I hope I could make you a bit excited. You can visit the link. This is a barebox hosted online on the TinyEMU emulator. Yeah, just click on it and you can try it out. Yeah, what else? barebox. Now it's coming to the impressive part. barebox has a rich shell with auto-completion. This shell has history. This shell supports scripting and so on. This shell is also what you will see when you visit the link. What else do we have? We also have a VFS, a virtual file system. This means no more commands and offsets like you know from U-Boot. It's just copy A to B. It's like Unix. It's like Linux. So, we have ls, we have rm, we have automount. You know automount from systemd, where you visit some directory and magic happens and it gets mounted automatically. We also have that in our bootloader. And we also have memory-mapped I/O access. So, we can visit it, or we can check the memory. We can manipulate the memory via md and via memory write.
So, mw. And yeah, as I said earlier, we have this barebox_update command, which is pretty amazing because it's hiding all the ugly stuff. It's just barebox_update and you are done. And also pretty amazing in barebox: we have multi-image support. It's something special in barebox. You are compiling barebox, and then with one compile, with one defconfig, we can build 100 or more boards. And this is something I wanted to show you right now with a hands-on. So, now, I hope everything works as expected. So, yes. Is it readable or should I increase the size? Okay. So, it's just a small script I wrote. It does nothing special. It just runs make defconfig, then it enters the menuconfig, so I can show you something about barebox. So, again, it's like a Linux kernel. And here you can see the system types. And here you can see all the boards we have enabled. So, all the boards, not just one board. And, okay, exit. Yeah, everything is fine. And then it's calling make to compile. So, okay, there is nothing fancy here. The fancy part starts at the end. Let that happen. And, hopefully. So, okay, now we are at the end. We have finished compiling all this stuff. And now we are building our images. And, as I said, we are building around 100 images right now with one command, with one defconfig. And that's pretty amazing, because I know BSPs where you have two different U-Boot configs just to have one U-Boot for an SD card and one U-Boot for an eMMC, or one U-Boot for SPI and one U-Boot for an SD card. That's not the way we are working. We are saying one image to rule them all. So, you compile it and you can select as many boards as you want. Yeah, it has some limitation: the architecture must be the same. So, this was i.MX7, sorry, i.MX... 7, so all i.MX6 and so on and so on. And now we can see the images. So, now I can demo the built flash images.
So, we have built 132 images in just one minute. And that's not doable with U-Boot; at least with my knowledge, I can't do it with U-Boot. And yeah, of course, then as I said, we have this, if it works, yeah, we have this online barebox. And then we also have an editor. We also have vi, but vi is not working that well within the web. But we also have edit, and then we can edit some stuff, like going here and typing blah blah or ha ha. Anyway, as you could have seen, there is this script, boot FOSDEM. It's nothing special. It's just a script which checks some environment variable, and that's it. I wrote it just half an hour ago. And then we can run this and see, okay, the global FOSDEM ref is not set. And then we can say, okay, since we have history, we can move up. And I didn't edit it. Okay, then we set the global FOSDEM ref. Oh, nope. Also auto-completion. So, global FOSDEM ref. And then boot FOSDEM. And then it says hello FOSDEM 20, yeah, 23. And yeah, that was a short hands-on. barebox is pretty amazing. I would really like to see you contribute to barebox, send patches, bring your board to mainline, and let us enjoy barebox. |
Building FPGA Bitstreams with Open-Source Tools |
And if you're building a ramp, there's going to be a black line, because that's where the camera is pointing to. Ah, okay, I see. You can compare it to there. It's in the black lines. Okay. Hello, everyone. So, my talk is about building FPGA bitstreams with open source tools. The important part here is open source tools, because building bitstreams for FPGAs was quite a pain in the ass. A while ago, you had vendor tools, which were large. You had to install them. Some of them were working on Linux. Some were working only on specific distributions, if you had the right version. It wasn't fun. And a few years ago, there were more developments on open source toolchains for FPGAs. And since then, a colleague of mine, Stefan, and I have been fiddling around with these tools, trying them out and doing stuff with FPGAs. And the experience from that is what I will be showing today. About me: I'm Michael Tretter. I work at Pengutronix as an embedded Linux developer in the graphics team. So, usually I'm doing software in Linux, sometimes user space, sometimes drivers. So, that's my background. Why am I doing FPGAs? First, the agenda. I will show you the open source FPGA toolchain. Then I will show you an example FPGA-based system. And in the end, I will show you the insights and pain points we had when we were developing this system, and give you a conclusion and an outlook on the next steps that we are tackling. Use cases for FPGAs: you want to use them if you have real-time requirements and you need a high data throughput. Those are two things that you need in graphics as well. You have high data throughput because you need to push all the image data from one point to another. You have real-time requirements because you have 30 FPS or 60 FPS that you have to meet.
So, you have a limited time span for each frame in which the frame must be finished. And you would also use them for prototyping such systems, because you can fiddle around and start experimenting with different implementations. The FPGA open source toolchain is basically these four steps. You start with some HDL description of your bitstream. Usually it's Verilog or VHDL. Then you have Yosys, which synthesizes the code into a netlist. nextpnr places and routes the netlist for your specific FPGA implementation. So, nextpnr needs to know about your FPGA architecture and which vendor you're using. So, this is different for Xilinx, for Altera, which is Intel now, or Lattice. So, you need to know something about your FPGA's internal workings for that. And in the end, you have a packer which takes the routed netlist with the configuration and writes an actual bitstream, which consists of all the bits that configure the FPGA. That was quick, and I won't go deeper into this because, as I said, we have been working on this for a while and have given several talks on that. One is by Stefan; you can find it on YouTube. It's about using those FPGA tools. There he goes deeper into how these tools work, how you can interact with them, how you can call them, and how all of this stuff is working. And a second talk, building open hardware with open software, is by me, on YouTube as well. And there I go into details on how we automated this and put the FPGA toolchain into a Yocto-based BSP, so that we have reproducibility for the bitstreams, for the tools that are used, for our configuration, and for the other libraries that we are using. So we can take a specific checkout of the Git repository of our BSP and be sure that we get the same build as the last time. This is also now in our CI, so we have some CI running that takes the current state and gives us a working bitstream again, like on the previous day. So what's in the bitstream?
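The synthesis, place-and-route, and packing steps described above can be sketched as a small Python helper that only assembles the command lines for a Lattice ECP5 part. The file names, the device flag, and the package are assumptions for illustration and have to match the actual FPGA on the board; the fixed seed is likewise just one possible way to keep place and route repeatable.

```python
import shlex
from typing import List


def build_flow(top: str, verilog: List[str], lpf: str,
               device: str = "--um5g-45k", package: str = "CABGA554",
               seed: int = 1) -> List[List[str]]:
    """Return the three tool invocations of an ECP5 flow as argument lists."""
    netlist = f"{top}.json"    # Yosys output: synthesized netlist
    textcfg = f"{top}.config"  # nextpnr output: placed and routed design
    bitfile = f"{top}.bit"     # packer output: the actual bitstream

    synth = ["yosys", "-p", f"synth_ecp5 -top {top} -json {netlist}", *verilog]
    pnr = ["nextpnr-ecp5", device, "--package", package,
           "--json", netlist, "--lpf", lpf,
           "--textcfg", textcfg, "--seed", str(seed)]
    pack = ["ecppack", textcfg, bitfile]
    return [synth, pnr, pack]


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without
    # the tools installed. Pass each list to subprocess.run() to execute.
    for cmd in build_flow("top", ["top.v"], "board.lpf"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems when a build system such as a Yocto recipe substitutes paths into them.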
Usually, if you come as a Linux developer, you want to run Linux on your bitstream. That's where you take RISC-V, a soft-core CPU. Some of them are able to run Linux. Most of them can be synthesized for your FPGA. So you have quite a few of these. VexRiscv is one, implemented in SpinalHDL; BOOM is implemented in Chisel, Rocket is implemented in Chisel, and CVA6 is implemented in SystemVerilog. I said previously that you write the code for your FPGA in some hardware description language. I said it's Verilog. These four cores are implemented in three different languages, so this is quite a few different languages, and each of these cores needs some peripherals to make it actually work, also implemented in the same language. So once you decide on a core, you have more or less decided which hardware description language and tools you are using. That's not something that's really flexible for us. And this is the point where LiteX comes in. LiteX is implemented in Migen, which is another hardware description language, one that is based on Python. And there is Linux on LiteX, which gives you various pre-configured FPGA bitstreams that contain some RISC-V core and support various FPGA boards, which you can just run. So here you see an example from the LiteX Git repository web page. It shows an Acorn-based board with a multi-core Linux SoC. You have some access to DDR. You have some access to SATA and UART. So this is basically an SoC that's able to run Linux, with enough peripherals to actually make it work. So there is an example for that available. The question that arose was: okay, we have this example system, but we want to fiddle around with the FPGA and want to run our own bitstream in there, or at least be able to customize it to some point. So our starting question for this was: can we add our own custom cores that are written in Verilog to the bitstream that falls out of LiteX? For that we decided to come up with a demo system.
So the requirements for that were: we are using a LambdaConcept ECPIX-5 board, you saw it before, with an ECP5 FPGA by Lattice, which is supported by Yosys and the entire toolchain. So support there is great. We want to put VexRiscv with Linux into the FPGA to run Linux, also because it's already there, so the demo systems have it. And we want to add an LED ring to our system, because LEDs are flashy, LEDs attract people. So it's something nice, and we want to have interaction with the system. That's why we put a hand wheel there as well, so you as a user can do some input to the system and get some feedback from the system. It didn't boot, probably because it didn't have power. So, starting with VexRiscv with Linux: as I said, LiteX already supports this. You can go to the Linux on LiteX repository, look for this file, and you'll see an implementation for VexRiscv running on the Lattice ECPIX-5. VexRiscv is written in SpinalHDL. That's neither Migen nor Verilog. In preparation for using it in LiteX, you generate the Verilog code for this VexRiscv core and wrap it in Python, or Migen. And after that, LiteX can just integrate the newly generated core into the SoC. It's an example target which supports LiteDRAM for memory access and LiteSDCard for using an SD card. The LED ring supports the WS2812 protocol, which is a single-wire protocol to control one or more LEDs. It's usually used in Adafruit LED strips, but you can find various cheap clones on Alibaba and wherever. There is already a core for controlling this protocol implemented in LiteX, so it's there, coded in Migen. It works as an I/O-mapped bus slave, so from Linux you just write to the registers and the LEDs flash and change their colors. And on the bus you have four bytes per LED, so that's the three colors plus whatever. Input is done via a hand wheel from Amazon, which is usually used for CNC stuff.
So it has 100 steps, and you can just turn it around, and it gives you a rotary encoder with two signals from which you can find the direction of rotation. We took some code off the internet for that. It's implemented in Verilog. There are various examples for that. We wrapped it in Python so that we can use it in LiteX, and this one runs as a bus master and is able to control the LED core via this connection. So it just sends "write this color to this LED" on the bus. So if you put all of it together: in the middle we have the FPGA. On the right side we have the already existing system with the VexRiscv running Linux, the LiteSDCard, which is connected to the SD card, and the LiteDRAM, which is connected to our memory. All of these are put together by a Wishbone bus. Details are not important. It's a memory bus to communicate between cores. On the left-hand side, up here, you see the encoder, which is implemented in Verilog and which is connected to the hand wheel. So this is one thing we added to the bitstream. And we have the LED.py, which is Migen, for the LED controller, which is also added to the bitstream and controls the LED ring. So how is this integrated into LiteX? We created a new LiteX target. LiteX distinguishes between targets and platforms. Platforms are the actual boards. Targets are the SoCs that you synthesize into your FPGA. So we took the existing platform and added a new target for our own SoC. Because it's Python, we have inheritance. We just inherited from the example base SoC with the LiteX core and the VexRiscv core and the other stuff. We configured it, instantiated it, reconfigured the pins so that we can actually connect our own peripherals, so the hand wheel and the LEDs, to the connector of the board, added the LED core, and added the rotary encoder. In our case it's a decoder for interacting with the hand wheel, and we are done. All of this customization is about 200 lines of Python code.
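The interaction between the hand wheel decoder and the LED core can be sketched in software. This is a plain Python model, not the Verilog or Migen code from the talk: the quadrature table is the standard one for a two-signal rotary encoder, while the 32-bit colour layout (green, red, blue in the low three bytes) and the ring size are assumptions.

```python
from typing import Iterable, List, Tuple

# Step contribution indexed by (previous_state << 2) | current_state,
# where a state is (A << 1) | B. Invalid or unchanged transitions count 0.
QUAD_DELTA = (0, +1, -1, 0,
              -1, 0, 0, +1,
              +1, 0, 0, -1,
              0, -1, +1, 0)


def decode_steps(samples: Iterable[Tuple[int, int]]) -> int:
    """Return the net number of steps for a sequence of (A, B) samples."""
    position = 0
    prev = None
    for a, b in samples:
        curr = (a << 1) | b
        if prev is not None:
            position += QUAD_DELTA[(prev << 2) | curr]
        prev = curr
    return position


def led_word(r: int, g: int, b: int) -> int:
    """Pack one LED colour into a 32-bit bus word (assumed layout: 0x00GGRRBB)."""
    return ((g & 0xFF) << 16) | ((r & 0xFF) << 8) | (b & 0xFF)


def ring_frame(position: int, num_leds: int,
               colour: Tuple[int, int, int]) -> List[int]:
    """Light exactly one LED, selected by the wheel position modulo ring size."""
    active = position % num_leds
    return [led_word(*colour) if i == active else 0 for i in range(num_leds)]


if __name__ == "__main__":
    clockwise = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
    pos = decode_steps(clockwise)
    print(pos, [hex(w) for w in ring_frame(pos, 12, (255, 0, 0))])
```

In the demo system the decoder acts as a bus master and writes such words directly to the LED core's registers; here the list returned by `ring_frame` stands in for those register writes.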
So, what we encountered and fixed during this process: after we added our custom cores, Linux was still booting, but suddenly it was not able to access the SD card anymore. We were suspecting various problems, like timing, or that our bitstream generation failed or something. In the end it turned out that, due to adding our new custom cores, the memory map on the bus had changed and the SD card controller had changed its base address, and Linux didn't know about that because we were just using the device tree that we took from the example. So we changed our Yocto build to take the configuration of the LiteX SoC, generate a device tree from that, and give it to Linux, and that way Linux uses the correct base address and is able to use the SD card again. So that's something that will come up later as well, because if you reconfigure your base system and the device tree changes, you have to make sure that your device tree matches the system that you're using. And one other thing is that there is some bootloader in the FPGA, or in the bitstream, for early bring-up, which corresponds to the ROM code or the ROM bootloader that Marco mentioned previously. So it's in the bitstream, and usually, when we compiled the bootloader, the build started to resynthesize the bitstream and run place and route again. So this is quite fast, but it's still six to eight minutes in our case. That's not something you want to have if you're just compiling a really small binary. So we changed our Yocto build to synthesize and keep enough space in the area, compile the ROM bootloader, and just put it together afterwards. This works a lot faster if you're fiddling around there. Pain points. Migen is not really maintained anymore. There is a successor for it. It's Amaranth. LiteX is currently still using Migen; migrating it to Amaranth is not really feasible. There are ideas to change it to a new hardware description language.
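The device tree problem described above comes down to regenerating nodes from the SoC's current memory map instead of copying them from an example. A minimal Python sketch of that idea follows; the map format, the addresses, the region size, and the compatible string are hypothetical, and a real node for the SD card controller would carry more properties than shown here.

```python
from typing import Dict, List


def dts_node(name: str, compatible: str, base: int, size: int) -> str:
    """Render a simplified device tree node for one memory-mapped core."""
    return (
        f"{name}@{base:x} {{\n"
        f'\tcompatible = "{compatible}";\n'
        f"\treg = <0x{base:08x} 0x{size:x}>;\n"
        "};\n"
    )


def moved_cores(old_map: Dict[str, int], new_map: Dict[str, int]) -> List[str]:
    """List cores whose base address differs between two SoC builds."""
    return sorted(n for n, base in new_map.items() if old_map.get(n) != base)


if __name__ == "__main__":
    # Hypothetical base addresses before and after adding two custom cores.
    example_soc = {"uart": 0xF0001000, "sdcard": 0xF0004000}
    custom_soc = {"uart": 0xF0001000, "leds": 0xF0004000,
                  "encoder": 0xF0004800, "sdcard": 0xF0005800}

    print("needs new device tree entries:", moved_cores(example_soc, custom_soc))
    print(dts_node("mmc", "vendor,example-mmc", custom_soc["sdcard"], 0x100))
```

Running a check like `moved_cores` in the build makes a stale device tree show up as a build-time difference rather than as an SD card that silently stops working after boot.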
I'm not sure what's happening there, but we saw, especially for simulation, that Migen generates invalid Verilog code, or code you wouldn't want to simulate. That's something we currently have to live with. Then another observation was that the Yocto environment wasn't as reproducible as we expected. We're not sure what's the cause for that, whether we are missing something, for example that there is some seed for compiling or for place and route which we haven't fixed. We saw that there is sometimes a failing bitstream falling out of the Yocto build that should be reproducible. We wanted to use JTAG to debug the early boot of the VexRiscv and look into it. We are already flashing our bitstream via JTAG onto the FPGA. We have JTAG connected to the board, but we are not able to connect it to the VexRiscv JTAG connector so that we could add the VexRiscv core to our JTAG chain. That's something we can work around. We can just use different pins, but we have to use a second JTAG adapter for that. That's something that's not that great and that we haven't figured out yet. Coming to the conclusion: adding and customizing LiteX targets is really convenient. It's something you have to figure out. You have to work your way into it. Once you're in there, as I said, it's 200 lines for configuring. That is nice. The step from Blinky, just getting an LED flashing, to an SoC is really large. It's not really surprising, because it's much more. But also in terms of things that you have to configure, things where you misconfigured something and it's just not working, you have really many knobs that you can turn, and your system will surprise you. With all this toolchain and LiteX, there are different modules in LiteX. All of them have to be in sync. Sometimes it happens that if you just update one component, you will run into surprises.
It's really important to make sure that you have something around this entire toolchain that keeps all your tools fixed on a specific version, so that you know which version you are using, also for reporting bugs. That's what Yocto is doing for us, or is planned to do. The next steps are: maybe we want to run KernelCI on the Linux on LiteX system, so that we can just take a Linux kernel and run it against the system. There comes the problem that KernelCI expects a device tree that's upstream, which kind of conflicts with the generation of the device tree that has to match your target. Then we see that Linux takes ages to boot on the VexRiscv, about two minutes. We're not sure if that's because the core is just slow or if there is something we are waiting for in user space that can be fixed. The VexRiscv actually supports multi-core systems. We weren't able to boot this yet. Maybe we want to look into using the same system on the ECPIX-5 with cores other than VexRiscv. As I said before, there are four different cores in different languages. All of them usually generate Verilog code, so it should be possible to integrate them into LiteX. Show me the source. It's on GitHub. This is a Yocto meta layer where you can find the code for the entire SoC configuration, the code for the hand wheel, and a few other fixes that we did. You can add this to a Yocto workspace and should be able to reproduce the bitstream that we built. Thank you for your attention. That's me. That's my colleague, Stefan. You can send us an email if you have questions, or just ask me and find me somewhere. Thank you. We have time for literally one question. This person here was the first one. Thank you for your talk. I had one question about the address changing in the device tree. In the microcontroller world, you can have a lot of microcontrollers with the same addresses for the common devices, like for STM32. Do all the STM32 have the same base address for RAM? |
Open Source Switching: Upstreaming ONIE NVMEM and switch BSP drivers
An overview of a DENT upstream WG project and network switch board support in the Linux kernel |
Okay, so this is the lightning talk section. Like I say, please don't get up and move after every single talk, otherwise it's going to be chaos. So you're in here for the whole 50 minutes, 60 minutes actually. So just sit down, relax, enjoy, starting with, I'm afraid I've forgotten your name. Okay, I hope everyone can hear me now. My name is Jakov, hello everyone. I am currently a firmware engineer at Sartura, and today I'll give you a very brief overview of what we have been doing in the open-source switching space. So the presentation is very brief, so I'll jump right in. Okay, so basically when we talk about switching, we are talking about network switch devices, right? Our team started out with CPEs. So these are embedded network devices like access points and routers. But a network switch is basically a multi-port device which has a dedicated ASIC which controls packet switching. And this ASIC also has to support some kind of hardware offloading for advanced features like LAG and so on. So in this space, for the Linux kernel, we have an in-kernel driver model which is called switchdev. This model allows the driver to offload the data plane onto the switch ASIC, right? And also for the switching part, we have a relevant project called ONIE. ONIE is actually a pre-install environment for network switches. But ONIE is not just that. It also specifies various hardware standards in this domain. So we'll get to it in a bit. Okay, so where are we right now with open-source switching? There are some challenges. So basically we have a limited number of platforms supported in the mainline kernel. The most common ones are Prestera from Marvell. We also have some Spectrum devices and the Sparx5 switch chipset family. Obviously in this space we want to build a fully open-source switching platform. This means that we have decided to join the DENT project and the DENT community.
The DENT project is actually looking to create something like a fully open-source, enterprise-grade network switch. And in this space we have been working to lead the DENT upstream working group, which has organized and funded work for open-source switching. Okay, so one of the projects which has been organized and funded by this DENT community has been the Linux ONIE NVMEM project. Basically the idea was that the specification from ONIE mandates that all network devices and hardware must have some kind of product data stored in non-volatile memory. This chip actually has to be supported in the mainline kernel. And the idea was to expose some standardized API in the kernel to allow user space to read from it. And this has been done by the Bootlin guys. So the other work that we have been doing in this space, independently from DENT, is actually for the Replica.one build system project. So we have been working on the PSU driver for the Delta chipsets. We have also done some PoE driver support and upstreamed the TN48M CPLD drivers. Here we have the GPIO driver and the CPLD reset controller. These are some of the patches that we have made. You can see that there have been a lot of revisions in the upstreaming process. So what are we actually planning for the future? Currently the support is, as I said, rather limited, because for example one of the most important aspects is the management of Power over Ethernet. There is currently no standardized Power over Ethernet management interface in the Linux kernel. And this is something that we would like to work on with, for example, the DENT community. Obviously this would also imply writing a PoE manager daemon which would manage the PoE features of the switches. And also this kind of work would allow other PoE controllers to be easily integrated into the kernel. Also, on the user space side, we have some support done in systemd-networkd. For example, virtual LANs.
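To make the product data idea concrete, here is a minimal Python sketch of a parser for the ONIE TlvInfo EEPROM layout as I understand it: an 8-byte "TlvInfo" header, a version byte, a big-endian total length, a list of type-length-value entries, and a closing CRC-32 entry. The handful of type names and the sample contents are illustrative assumptions, and this is not the kernel's NVMEM code.

```python
import struct
import zlib
from typing import Dict, List, Tuple

HEADER = b"TlvInfo\x00"
CRC_TYPE = 0xFE
# A few type codes for illustration; unknown codes are kept by number.
TLV_NAMES = {0x21: "product-name", 0x22: "part-number",
             0x23: "serial-number", 0x24: "mac-address"}


def build_tlvinfo(tlvs: List[Tuple[int, bytes]]) -> bytes:
    """Assemble a TlvInfo blob (used here only to create test data)."""
    body = b"".join(bytes([t, len(v)]) + v for t, v in tlvs)
    total_len = len(body) + 6  # the CRC entry is 2 + 4 bytes
    head = HEADER + struct.pack(">BH", 1, total_len) + body + bytes([CRC_TYPE, 4])
    return head + struct.pack(">I", zlib.crc32(head) & 0xFFFFFFFF)


def parse_tlvinfo(blob: bytes) -> Dict[str, bytes]:
    """Parse and CRC-check a TlvInfo blob, returning its fields by name."""
    if blob[:8] != HEADER:
        raise ValueError("missing TlvInfo header")
    version, total_len = struct.unpack(">BH", blob[8:11])
    end = 11 + total_len
    if version != 1 or end > len(blob):
        raise ValueError("unsupported version or truncated data")

    fields: Dict[str, bytes] = {}
    off = 11
    while off + 2 <= end:
        tlv_type, length = blob[off], blob[off + 1]
        value = blob[off + 2:off + 2 + length]
        if off + 2 + length > end:
            raise ValueError("entry runs past the declared length")
        if tlv_type == CRC_TYPE:
            # The checksum covers everything up to and including
            # the CRC entry's own type and length bytes.
            if length != 4 or struct.unpack(">I", value)[0] != (
                    zlib.crc32(blob[:off + 2]) & 0xFFFFFFFF):
                raise ValueError("CRC mismatch")
            return fields
        fields[TLV_NAMES.get(tlv_type, f"0x{tlv_type:02x}")] = value
        off += 2 + length
    raise ValueError("no CRC-32 entry found")


if __name__ == "__main__":
    sample = build_tlvinfo([(0x21, b"example-switch"),
                            (0x24, bytes.fromhex("001122334455"))])
    print(parse_tlvinfo(sample))
```

With the NVMEM work upstream, user space would read such fields through the kernel rather than parsing the EEPROM itself; the sketch only shows what kind of data sits in that memory.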
Our team has been working on the DHCPv4 static leases in this project. And also we have been working on adding wired 802.1X support in the hostapd project. So that's basically it. Thank you all for listening. If you have any questions regarding this kind of work, please get in touch. And thank you everyone. Thank you. |
A journey to the hardware world
A software engineer retrospective |
So, hi everyone, my name is Mathieu Othacehe, I'm a software engineer, and this talk is a retrospective of my last year when it comes to hardware development. So my background is that I do a lot of C programming. I've been doing so for the last 10 years: kernel drivers, libraries, mostly on Linux, so with Yocto, all of that mostly for profit, because you have to make a living. And I've also been hacking on more fun stuff like Guile, GNU Guix, functional stuff, for fun this time. And to me, up to last year, hardware was more or less like a black box. I mean, I was familiar with the surface of the box, which is data sheets, manuals, schematics, but what's inside of it wasn't that interesting to me. If I had something harder than finding out a GPIO number, or any kind of trouble, I found my hardware colleague, dropped the ball, and moved on. But last year, almost randomly, I discovered the world of hardware repair and micro soldering on YouTube, and it's a fun world. There are some really, really talented people who are repairing stuff. They are finding the one tiny capacitor that is failing, removing it, and just by buying some one-cent capacitor you are fixing a board. I found that quite interesting, and I think that the world will be really different in maybe 20 or 30 years, and having this kind of tool set could be interesting. And at the same time I had a project where I was involved in the design of a motherboard for an Intel CPU, and the hardware guy was a bit busy, so I was, somehow against my will, involved in the hardware selection. And so I learned that it's more or less picking out ICs, which means integrated circuits. It's often like a tiny black box, and you have to pick one, you have to pick one that is not out of stock, and that's challenging these days. And once you have picked two or so that maybe will ship, then you need to draw the wires between them. And it was quite a fun exercise, and it gave me the motivation to acquire some tools, because one of the differences between
software and hardware is that with software, well, you can have your laptop and work everywhere. With hardware, my experience is that it's not only that you need some tools, it's that you need all the tools. Like, if you don't have everything, you will still have something missing and you'll be like, oh no. And so this is about how to turn your desk into a terrible mess. So first you need to buy a microscope. You don't need a 5000x zoom; 5 to 40x is way more than enough. You need some LED lighting so that you're able to see what's under it, and it's quite a fun exercise to learn to solder under the microscope, and it's a nice tool to use. Then you obviously need a soldering iron; most of you are familiar with it. You need a hot air station. It's maybe even more important than the soldering iron. You use that to desolder some ICs, the tiny black things, or solder them, but it's difficult to manage because it's blowing 400-degree air, so it's easy to mess everything up and burn your board. You need a generator to power your board. You need an oscilloscope to be able to see your signals. You also need a breadboard so that you're able to experiment and try out some things. You need some components, some resistors, some capacitors, some inductors. As I said, it's nice to have all of those so that when you are trying out something you don't need to wait for two weeks or five weeks to experiment. You need a multimeter. You don't need a fancy one, it's like a two-euro multimeter. The main function is that when you connect the probes it beeps, and it sounds funny, but with that you can reverse engineer some tiny circuits, you can isolate some issues, you can do a lot of stuff.
You need some flux so that the solder is able to flow nicely. It's probably very toxic and as healthy as eating lead for breakfast, so maybe you also need some kind of fume extractor. I need to improve mine. You also need some solder wick or a solder pump so that you're able to remove the solder when you have made some mistakes. You need some tweezers, and more than that, you need to use them, because when you are dealing with 0402 components they like to jump to the other end of the room, so quite a fun exercise. You need a perfboard so that once you have a circuit on your breadboard you can make it more permanent. You need some wires to wire things, obviously. If you have a two-dollar multimeter, then you probably also need an LCR meter so that you are able to find the values of your components. And last of all, I'm a software engineer, I did some hardware at school but not so much, and I felt like I needed some kind of reference book. When it comes to software, to me it's SICP, Structure and Interpretation of Computer Programs. It feels to me like a novel, it's like a Stephen King novel, I can relate, I find it so entertaining. And I asked what's the reference book for hardware. People told me it's The Art of Electronics. So I can just say that it's a whole different deal here, to me at least. Reading more than two or three pages gives me horrible headaches, but it's a reference book, and you will find some more in-depth explanations on electronics, like Ohm's law and whatnot. And when I had acquired all that equipment, I tried to design some easy circuits. So this one is a flash programming device on a perfboard, so that you are able to put your IC in the socket in the left-hand corner and then flash it with flashrom. That was my third circuit. Then I managed to make a few repairs. I had a 0% success rate for like three months, then I did hit a 2% success rate by fixing my coffee machine, but since then I managed to fix quite a few
things. I also managed to hack a BIOS, it was a nice thing. I had a laptop with a password-protected BIOS. I tried to remove the BIOS chip, which is the IC there. I burnt it to ashes. I bought another one that I still burnt to ashes. I bought a third one, I flashed it with flashrom using the device you saw, put in a BIOS without the password, and managed to solder it back on with the hot air station I talked about. Then it wasn't working, because I had also managed to blow some copper traces, so I had to run some wires, but in the end the notebook booted, and it's not much, but it was quite a success to me. Then finally I did try to get into PCB design. So I have an LED ribbon around my desk and I tried to make it remote controllable, so I designed a PCB. It's like the worst use case you can ever think of, but you have to start somewhere. So I designed a PCB in KiCad with a CPU that was really fun to solder, a regulator, a USB port, nothing fancy. I found KiCad really fun to use. I mean, it was easy to draw the schematics and to do the routing, even without any experience. I mean, really nice software. I'll try LibrePCB for my next design because the name sounds appealing, but it was quite an interesting process. Yeah, that's it. So the takeaway message is that even if you are a 100% software engineer and you don't want to get into hardware, having a minimal set of equipment and a minimal set of knowledge allows you to do some fun stuff and helps you connect with your hardware colleagues, which is a good thing in my opinion. So thank you. Okay, thank you Mathieu. |
Ups and Downs with Remote Desktop Protocol (RDP) on Wayland, Weston and the Yocto Project |
So, last 10 seconds to leave the room; if not, you're staying until the end of the lightning talks. One, two, three, four, five, six, seven, eight, nine, ten. And we are starting the next lightning talk, by Leon. Thank you very much for joining. It's not the end of the world. It's just a lightning talk about, oh, is it? That's working? Kind of. Is it working? Yeah. All right. It's a little bit broken. So, it's not the end of the world. It's just a lightning talk about Wayland, Weston, the Yocto Project, and RDP. So, Wayland is a display protocol. It was started like 15 years ago with the idea to replace X11, and it's slowly getting there. How many of you are using Wayland on your desktop computers or trying to use it? All right. Pretty much all of us, all right. We're trying, right? I'm neither a Wayland nor a Weston developer, but I'm a user. Actually, I'm a contributor to the Yocto Project and OpenEmbedded. So here I'm sharing the ups and downs of the integration of RDP. So Weston is just one of the compositors for Wayland. There are so many different compositors. Probably Weston is not the best one of them, actually, but for embedded systems it's sometimes pretty useful because it's small and simple. And there are options for how to share your screen when you're running Wayland and Weston. And there is this protocol, RDP, which stands for Remote Desktop Protocol. It's an alternative to VNC. Basically, you are doing remote screen sharing with it. There are some ups and downs. A down is that it's a proprietary protocol. The up is that it's actually a semantic protocol. So you are sharing the fonts, the controls, all that kind of stuff with RDP. And the good thing about RDP is that you actually have an implementation for it in Weston, the reference compositor of Wayland.
So keep in mind that if you come into a situation where you're working on an embedded device, you have Wayland, you're using Weston, and at some moment you want to share the screen for one reason or another, you can use RDP as an alternative to VNC. So you might have heard about the Yocto Project. If you attended some of the other talks earlier today, they're quite famous because they make really good hoodies that you can order online. But the Yocto Project is not just about hoodies and t-shirts. It's a collaborative project of the Linux Foundation for building custom embedded Linux distributions, using the OpenEmbedded build framework and a build tool called BitBake. The Yocto Project comes with a reference Linux distribution, which is called Poky. So basically this is a way to get started relatively easily with some images that are out there, and you can pretty much build them out of the box if you have the right BSP for the hardware that you're touching. So I said relatively easily, but I have to say that the truth is the Yocto Project has a steep learning curve. It has amazing flexibility, but it takes some time to learn it. And it has releases twice per year, and nowadays there is a long term support release. So here's an example of the releases. This is the release that is going to come out in April. And we have Kirkstone, which is an LTS release. And we have Dunfell, which is also a long term support release. It was released almost three years ago, and it will be supported for a year more. So keep in mind that, especially in terms of the things that we're talking about here, Wayland and Weston and the BSPs, there are different versions of Wayland and Weston depending on your BSP, but also on the Yocto release that you are using. 
So you might end up in this situation where a feature is missing from Weston because you are using an older version of Weston, and you are using this older version of Weston because you are using an older release of Yocto. So my personal recommendation is that if you are not sure, and if you can make a choice, go for the latest and greatest long term support release. But of course, that's not always possible; it depends. So here is a simple example of how to do a .bbappend file. This is basically extending the existing recipe for Weston so that we can build the module for screen sharing with RDP. Out of the box, it's not built, so we have to go one step further and make this configuration. It's just a build configuration to make sure that in core-image-weston, which is a small image containing Wayland and Weston, we can have this module and we can enable, after that, screen sharing over RDP. And the RDP implementation in Weston is based on FreeRDP, so we have to add it as a dependency. So after we have this, at runtime, we need to do a little bit more configuration. Well, actually, we have to do it by hand unless we have already done it as part of the automation of the recipes with Yocto and OpenEmbedded. But this is just a simple example of how to generate appropriate keys. After that, you configure weston.ini; this is the main configuration file of Weston, with various settings depending on your system. So there, in the screen-share section, you have to set the command to be launched when you do screen sharing. And when this is done, remember to launch Weston or restart it just to make sure that the right configuration is loaded. But after that, there is one more thing you need to do. You need to press Ctrl+Alt+S. That's pretty cool if you have a keyboard. But some embedded devices don't have keyboards, right? So hold on, hold on, I've told you it's ups and downs, so sometimes there are downs. 
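The .bbappend step described above can be sketched as a small Python helper that renders the file content. This is only an illustration under stated assumptions, not the content of the speaker's slides: the file name `weston_%.bbappend`, a `PACKAGECONFIG` option named `rdp` in the Weston recipe, and the list of releases that still use the older `_append` override syntax are all assumptions to check against your own layers and release.

```python
# Sketch: render the text of a weston_%.bbappend that turns on the RDP backend.
# Assumptions (not taken from the talk's slides): the Weston recipe exposes a
# PACKAGECONFIG option named "rdp", and the releases listed below use the older
# "_append" override syntax while later ones use ":append".

OLD_SYNTAX_RELEASES = {"dunfell", "gatesgarth", "hardknott"}  # assumed list


def render_weston_bbappend(release: str) -> str:
    """Return bbappend content that enables Weston's RDP screen-sharing module."""
    separator = "_" if release.lower() in OLD_SYNTAX_RELEASES else ":"
    return f'PACKAGECONFIG{separator}append = " rdp"\n'


if __name__ == "__main__":
    for release in ("dunfell", "kirkstone"):
        print(f"# {release}: weston_%.bbappend")
        print(render_weston_bbappend(release))
```

The leading space inside `" rdp"` matters because an append override concatenates strings without inserting a separator. The FreeRDP recipe that satisfies the dependency has to be available from one of the layers in the build.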
Newer versions of Weston actually have an option that you can put in weston.ini, which allows you to start screen sharing automatically when Weston starts. And this has been added in Weston version nine. And I believe this feature was added by Marek, who had a talk here earlier about U-Boot. So you probably know him. So thank you very much for doing this, because this is a really useful feature if you are working on an embedded device that doesn't have a keyboard. And once you're ready with this, from another computer in the same network, you can launch a client that supports RDP and you can connect remotely to your embedded device. Here are a couple of examples depending on whether you're using Wayland, and based on your response to my question at the beginning of the session, it looks like a lot of you are using Wayland. So here you go, you just replace the IP and if everything is okay, you'll be able to connect to your embedded device remotely. And if you're still using X11, you're still in the game. So here is a very simple demonstration. And what we see here is a screenshot from my computer. I'm kind of a lazy Linux user, so it's just Ubuntu LTS. So we have Ubuntu with Wayland and GNOME, which we see in the back. And here in this window, we are seeing core-image-weston running on a Raspberry Pi 4 with the configurations that you have seen in the previous slides. All right. So that's all. That's pretty much how it works. RDP has some ups and downs. I guess the major conclusion from this lightning talk is that if you come into a situation where you need to do screen sharing on an embedded device that is running Wayland and Weston, as an alternative to VNC, you have RDP as an option. And use the Yocto Project and OpenEmbedded, it's pretty cool and it's pretty much everywhere nowadays. Thank you very much for your attention. I think we're just on time. |
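As a rough sketch of the weston.ini configuration discussed in this talk, the Python snippet below renders a `[screen-share]` section with a launch command and the automatic start option. The key names (`modules`, `command`, `start-on-startup`), the command-line flags and the key and certificate paths are assumptions based on typical Weston setups, not copied from the slides, so verify them against the weston.ini documentation of the Weston version in your image.

```python
import configparser
import io


def render_weston_ini(cert="/etc/freerdp/keys/server.crt",
                      key="/etc/freerdp/keys/server.key",
                      autostart=True) -> str:
    """Render a minimal weston.ini fragment for RDP screen sharing.

    All option names and paths are assumptions for illustration.
    """
    command = (
        "/usr/bin/weston --backend=rdp-backend.so "
        "--shell=fullscreen-shell.so --no-clients-resize "
        f"--rdp-tls-cert={cert} --rdp-tls-key={key}"
    )
    config = configparser.ConfigParser(interpolation=None)
    config["core"] = {"modules": "screen-share.so"}
    config["screen-share"] = {"command": command}
    if autostart:
        # Assumed to be the option for devices without a keyboard.
        config["screen-share"]["start-on-startup"] = "true"
    buffer = io.StringIO()
    # Write "key=value" without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_weston_ini())
```

Passing `autostart=False` leaves the option out, in which case screen sharing still has to be started from the keyboard shortcut mentioned in the talk.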
Bluetooth state in PipeWire and WirePlumber |
Hello, I'm Frédéric Danis, software engineer at Collabora, and I will present the state of Bluetooth in PipeWire and WirePlumber. PipeWire is a low-latency, graph-based processing engine intended to handle audio and video streams. It is intended to replace both the PulseAudio and JACK audio systems. WirePlumber is in charge of creating the audio and video nodes, and the links between the nodes, according to the policies defined by the system or the users. Both of them are designed for the desktop, where the main distributions are switching to them, and for embedded. More specifically for Bluetooth, the Bluetooth Classic audio profiles are divided into two main categories: the stereo and mostly unidirectional audio streaming profile, called A2DP, and the mono and bidirectional profiles, which are the Hands-Free Profile (HFP) and the Headset Profile (HSP), the latter less and less used. For the Bluetooth Classic audio profiles, A2DP stands for Advanced Audio Distribution Profile. It aims to manage audio streaming between a media player and a headset or speakers. And in this table, you can see the supported codecs. The first one is the SBC codec, which is low complexity, fast, and lossy, but is implemented on all devices. The A2DP specification allows other codecs, like AAC, which is optional and not implemented on all devices. And this specification also allows other codecs not defined in the specification. Most of them have been implemented to improve the audio quality, but are not supported by all devices. For example, the aptX family of codecs can be found on Qualcomm devices or needs licensing from Qualcomm, and the LDAC codec is found on Sony devices. In PipeWire, we use this ability to add some other codecs like Opus, which is an open format, or LC3plus, which is an enhanced version of the codec used in LE Audio, which we'll talk about later. 
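To make the codec situation concrete, here is a minimal Python sketch of choosing an A2DP codec that both sides support, falling back to SBC because it is the one codec implemented on all devices. The preference order and the function name are assumptions made for illustration only; this is not how PipeWire or WirePlumber actually rank or negotiate codecs.

```python
# Hypothetical preference order, highest priority first (an assumption for
# this sketch, not PipeWire's real policy).
PREFERRED_CODECS = ["LDAC", "aptX HD", "aptX", "AAC", "SBC"]


def pick_a2dp_codec(local_codecs, remote_codecs):
    """Return the most preferred codec supported by both devices."""
    common = set(local_codecs) & set(remote_codecs)
    for codec in PREFERRED_CODECS:
        if codec in common:
            return codec
    # SBC is implemented on all A2DP devices, so it is the safe fallback.
    return "SBC"


if __name__ == "__main__":
    laptop = {"SBC", "AAC", "LDAC"}
    headset = {"SBC", "AAC"}
    print(pick_a2dp_codec(laptop, headset))
```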
Some vendors have the ability to do bidirectional audio on A2DP with the FastStream codec, which is an evolution of the SBC codec, or the aptX Low Latency codec, which is one of the aptX family. Just one other thing. Last year, we were able to pass the Bluetooth qualification using both PipeWire and WirePlumber on the Steam Deck. HFP stands for Hands-Free Profile. It is used for communication, but unlike A2DP, it also defines the commands to interact with telephony using a set of AT commands. This can be done with external daemons like hsphfpd or oFono, oFono adding complete support for the modem, or with the native backend, which implements only a limited set of AT commands, enough to complete the connection with Bluetooth devices. Yes. And this can be used with conferencing applications. Last year, we added ModemManager support to the native backend, allowing complete telephony usage from inside PipeWire. So with oFono or ModemManager, PipeWire has complete telephony support for mobile device distributions. HFP supports two codecs. The mandatory one is CVSD, which is a narrowband audio connection. In this case, for CVSD, the audio is sent through the Bluetooth SCO socket directly to the Bluetooth chipset, which will encode the data. The second codec is mSBC, which is optional; it's a fixed configuration of SBC. But it needs support from both the kernel and the chipset, and this is automatically detected at runtime by PipeWire. But on some hardware devices, the chipset has a direct audio link connected to an audio card or to the modem. To be able to support it, we added a hardware SCO offload mode, in which PipeWire only uses the SCO socket to connect and configure the link to the remote device, while PipeWire will create pass-through nodes allowing the user to select the remote Bluetooth device as an audio output. 
And then the data is sent to the audio card, which passes it to the Bluetooth chipset, which will encode and send the data over the air. Now I will do a quick overview of the new Low Energy Audio specifications. The idea is to unify the stereo and mono audio profiles and replace both A2DP and HFP. It has better sound quality with the new LC3 codec. It has an isochronous radio channel to guarantee bandwidth and minimal delay. By default, it is able to support bidirectional audio for every usage. It supports multi-stream, replacing True Wireless. It also supports hearing aids. With the new broadcast mode, you are able to broadcast audio without interaction between the transmitter and the receivers. This ends up in a lot of new profiles and specifications. The ones in blue are already supported by BlueZ and PipeWire. But as there are not so many devices on the market to test with, they are still marked as experimental in both BlueZ and PipeWire, and some configuration has to be set if you want to use them. Regarding broadcast, the low level is already supported in the kernel, but there is still some work to do in BlueZ and PipeWire, mostly to find the correct UX to be able to share audio or to select the broadcast you want to listen to. Thank you. |
Exploring a Swedish smarthome hub |
All right. Oh, wow. Microphone. A couple of months ago, I went shopping at IKEA. Who am I? My name is Hawa. As the slide says, I've been playing with Linux and computers for over 20 years, and I've been doing it professionally for more than 10 years. I'm currently a software consultant at Mind, which is a local company. We have a rebranding, a new logo here. Not on the slides yet. So we're going to talk about this device. It's a new smart home hub from IKEA. Smart home: is that a good idea? I guess in this crowd we have two extremes. You either think it's part of the internet of shit, or you already have 20 of these devices at your home. Well, in my personal opinion, I like them a lot, but only if the data stays with me at my home, on my local computer. So no clouds. And preferably, it should run as much open-source software as possible. So that sets the stage for this review. So the app, well, it's really, really simple. It's IKEA. It's what you can expect if you have ever assembled something from IKEA. You either love it or you hate it. So this app is the same thing. You're either going to love it and be able to use it from scratch, or you're going to hate it, like my mother, who also doesn't assemble IKEA furniture. Great app. Regarding the point I made about the cloud, they're actually scoring really well. So there's a really clear and easy-to-read privacy statement. And the app is opt-in. So it asks you nicely if you want to send the usage data to IKEA. In the same manner, it has support for Apple HomeKit. It also has support for the Google AI stuff, but it's all opt-in. Really nice. What's inside? Well, we're in an embedded room, so we had to take a look. This device has a dual-core STM processor, a single chip of DDR3 memory, about four gigabytes of eMMC memory, and dual Zigbee slash Bluetooth radios. So this should be fun to work with. Well, I opened it up. This thing has a laser. I always recognize these three pins on PCBs. 
Those pads look familiar. So simple. A UART thingy on the laptop. You use Minicom, with the default settings for Minicom, of course. We get a boot prompt, but that's not really useful. It's the only thing that it shows. It's a Dirigera hub. Very nice of it to tell me again. What we can see is that it uses the STM32 secure boot stuff. I don't really know the details quite yet. I haven't worked with those chips that much. And then it starts up systemd, and it ends there. So what this is telling us is that they really cleaned up the information that they put on the UART. Right. In the app, they have a link to this website, gplcode.ikea.com. And if you press that download button, you get a zip file. Yay. And if you unzip that zip file, you get a bunch of directories. It's pretty much a huge dump of source code. The things that we do recognize in that entire dump: it's a kernel, systemd, base files. Everything is called base files. BusyBox, and it uses RAUC as the update mechanism. The naming of these directories makes me think that this uses Yocto, but there are no build scripts or flashing scripts included in this zip file. So if you go look at the Yocto reference manual, in chapter 35.3, you'll see this text. 35.3.1 talks about providing the source code. It shows you how to make a Yocto recipe to generate the directory that they just zipped and dumped on the Internet. The next subchapter talks about providing the license text, which pretty much just takes all the license.txt files from that first directory and concatenates them into one huge file. So that's what this button does, download license information, so you get that huge file. And then they forgot about the third subchapter, providing compilation scripts and the source code modifications. So IKEA, you still have some more work to do. I would like my compilation scripts. And I would really like to flash my own hardware with the GPL code that I received on my hardware. And that's my talk. Thank you very much. |
The PolyVent FLOSS Ventilator
A Free-libre Respiration Ecosystem |
Okay, thank you everybody for being here. I know it's the end of the day, it's been a long day. So thank you. I'd like to talk about the PolyVent Free-Libre Open Source ventilator. This is hardware in a little different sense than is used in this room. Normally when you say hardware at this conference, you mean chips and VLSI stuff, but this is an electromechanical hardware device. This talk is co-authored with Dr. Victor Sutrin. Victor, can you raise your hand? And Antal Zeiderwick is our chief mechanical engineer for the chassis part. If you meet us after the talk, we'll be happy to answer questions for you. And we are trying to recruit software engineers and electrical and mechanical engineers to work on the project as well. So I am Robert Read. I'm the founder of Public Invention, which is a US 501(c)(3) public charity. Our motto is to invent in the public, for the public. I think this conference will appreciate that we're trying to take the principles of open source software development and apply them not only to chip design, but to actual hardware inventions. So I'd like to set the scene: the spring of 2020 in the United States. So many people had died of COVID-19 so quickly in New York that they had to use refrigerated trucks as temporary morgues. At that time, there was a genuine belief that the Western world might need a million mechanical ventilators to try to keep people alive. That turned out not to be true, but it wasn't erroneous at the time, based on what we knew from the disease progression in northern Italy. What we didn't know at the time was that social distancing and lockdowns would work, and also doctors decided they didn't need to ventilate people as early with COVID as they had previously thought. Nonetheless, a very large number of humanitarian engineering teams all over the world attempted to make emergency ventilators to solve this problem. 
It was kind of a global effort, and Victor and a young man who was 16 at the time started working on their own ventilator in the same effort. Now they started with a bellows-based design; we're going to talk about that. The thing they designed, the PolyVent, was specifically designed to cope with fragile supply chains. So it was designed to be constructable within a low- or middle-income country, and that's one reason they went with bellows in the initial design. Originally, they weren't necessarily embracing open-source licensing because they didn't know that much about it, and everyone sort of believed, well, we're going to need large firms to make a lot of these, and if you have an open-source license on it, they won't want to use your product. Now, how do we know that 100 humanitarian engineering teams started? Because Public Invention evaluated all of them. So we made a spreadsheet which evaluated all of the open-source ventilators along a wide variety of dimensions here. Now at the time, and still today, what we're trying to do is to create open-source medical devices. That is harder than making open-source hardware, which is harder than making open-source software, which is harder than copyrighting text, both from a legal point of view and from an intellectual point of view. The cost of development for medical things goes up because you're attempting to produce regulated devices. Now originally, the PolyVent team was attempting to do that, but at that time in the United States, there was an emergency use authorization. So there was a belief that we might not need to do all the things that the FDA would normally require. So while this was going on, Public Invention published the Open Medical Technology Manifesto, which is that open, shareable, repairable medical technology will make us all healthier. The PolyVent ventilator is aligned with that, and I invite you all to find this and sign it if you agree with it. 
So the PolyVent team began working on a ventilator, and they had some success in Lens, and they designed a very extensible system that we're going to talk about. But the global pandemic urgency was dissipating by about six months from that spring. So by October of that first year, people were no longer excited about the idea. So the thing that I'm most proud of, perhaps, about this team is that they just kept going and continued to develop the ventilator. So at that time, they joined Public Invention: basically, in exchange for making it fully open source, Public Invention began paying for parts and manual labor to support the development of the ventilator. It's also the case that I'm mostly a software guy, and another non-profit, Helpful Engineering, had the VentOS software, which we're going to talk about, and the existing team didn't have any software. So it was a nice alliance. This is their original system. This is a fully functional ventilator. It uses dual bellows here. Bellows can be manufactured with 3D printers. So they can presumably be made in any country, was the idea. However, there were some problems. The bearings to drive the bellows up and down tended to wear out. And there were some other improvements possible. We started to make those improvements. The big switch we made was to a proportional-valve-based system that used pressurized air and pressurized oxygen. This was inspired by Smith College in the United States, which is probably the premier women's college in the United States. They had made an award-winning ventilator called the SmithVent. They stopped. I don't know why, but we continued and have used the same basic technology. Now we already had a spirometer, the VentMon, which was made by Public Invention. We used that as part of our system, and eventually we started to redesign for education. We started with the proportional valve on the left, which is a Bürkert proportional valve. 
It was really kind of an engineering mistake because it was larger than what we needed and the airflow was not as precise as what we needed. The valve on the right is difficult to source; it's made in the United States by IQ Valves. It's a very precisely controlled proportional valve. Like all projects, we learned as we went along. This was what we called the PolyVent One, even though it's after the bellows model. This was, again, fully functional. We performed some tests with professors of education in biomedical engineering. This system worked, but we decided to redesign it for education. So while this was going on, the COVID pandemic continued. In India, around this time, there was a terrible, terrible spike of death. Now this was not due to a lack of ventilators. People say it was due to a lack of oxygen. We, Public Invention, have also worked on an oxygen concentrator. The reason I bring it up is that what we're attempting to do, and what many of you are attempting to do in the software that you produce, is to make the world better for a lot of people. Making open source medical devices is a new way, a new avenue for open source philosophy to make the world better for a very large number of people. It's quite a technical challenge, but that's why we're doing it. So based on educator feedback, we made a lighter single-deck design. We made a transparent case. We made the inside spacious and modular so that students could look at it and you could also repair it more easily. You didn't have to take the whole thing apart. That is very different from the way professionally designed medical equipment is made. It's not made to be easy to repair. It's not made to be easy to understand. So it's quite a departure from what you would see in a normal for-profit sort of device. Also, Nathaniel did a really good job designing a modular card-based electronic control system. 
And this actually paid off when a second Public Invention team created a card that we were able to put into the device to control a general purpose alarm device, which we're working on. So that team did that with no interaction with Victor's team, just based on the documentation that we have. So this is the timeline, and we've been getting better and better as we go along, like most projects. In October, we did a classroom evaluation with 12 biomedical engineering students at Rice University in Houston, Texas, in the United States. This is the device as it stands today. This is the PolyVent 2. That's what the students used. As you can see, it actually uses an acrylic case so you can see all of the components. And I don't have a good layout diagram, but it's laid out in a way where it's physically modular as well as being electronically modular. The software is too, of course, because we learned a lot from the open source software community on how to do this. So it's now our intention with the PolyVent to continue to eventually make a design basis that can be used for a medical ventilator. But we believe that by sort of infiltrating the research and education community, we have a better shot at eventually accomplishing that. So the PolyVent platform right now is for medical and veterinary doctors, but really it's for biomedical engineering students; you can even teach business school classes on it. You can certainly do mechanical engineering, electrical engineering, software engineering. And we consider ourselves firmly part of the emerging discipline of humanitarian engineering. So what we did for the classroom instruction, and I am not a teacher, I'm mostly a computer programmer, is we made fake broken parts, and we asked the students to turn their backs and we would install a fake broken part, and then they would attempt to find it. Now this class they were taking is in fact a troubleshooting class. 
So it worked rather well, and of the 12 students who were there, they really strongly believed that this would be useful in other universities. So it's our hope to sort of sell this at cost, even though it's completely open source. They could make it if they wanted to; all the physical designs, all the software designs are completely open, but making things like this in hardware requires, as one of the gentlemen in the previous talk said, a certain amount of tooling and so forth. So people like a graduate school may find it easier to pay us $5,000 for one of these rather than build one themselves; the hardware costs are about $2,000 and it takes some labor to put it together, so that's kind of the cost for us to make it. But they could, they can build it and modify it themselves based on licenses that I'm sure you're all familiar with. So this is kind of a schematic of the design that you saw there physically, and the thing that's most important is that Nathaniel did a really good job designing an extensible electronic card system. And this is based on an IEEE standard I'm not familiar with, but basically you plug cards into slots and it exposes pins of the ESP32. So if you have a device that you would like to add to the ventilator, like a humidifier, a nebulizer, a heater, an additional set of instrumentation, you can just design a card and stick it in there. And that's what the general purpose alarm device team of Public Invention did. This is a physical photograph of how those things slide in there. This card right here is a card with a bunch of power transistors which control the solenoid valves which are in the system. Because obviously it takes 24 volts to do that. So now I'd like to talk about software. The software system is called VentOS. I didn't name it; really it's not an operating system. But we kind of think of it that way. 
It runs on an ESP32 and it was created by a different non-profit, of which I'm a board member, Helpful Engineering, and some other people worked on it, in particular Ben Coons. Now interestingly, this was forked to make an oxygen concentrator, which we have since quit working on, called the AUX. But that was forked to be used by me for NASA, the U.S. National Aeronautics and Space Administration, to make a control system for a high-tech ceramic oxygen generator. So a lot of times, as I'm sure you guys understand, open source code lives on even if its initial purpose is not met. If you write good code that's documented, with a good license, you can use it for some other purpose, and we're trying to do that. In fact, Ben made a number of improvements that really need to come back into VentOS, and I kind of need a volunteer to help me do that because there's always more software work to be done. So the VentOS architecture, and this is where we're really talking about an embedded system that you guys will understand, is a simple Arduino platform compiled with PlatformIO. Configuration environments in PlatformIO set preprocessor compile-time switches, which give us a wide variety of hardware architectures we can compile for, although the PolyVent is effectively the only machine on which it really runs today, but we could support other architectures. It almost doesn't run on an Arduino Uno because it's too big, but technically it will run on an Uno. We use an ESP32. We have a pretty good hardware abstraction layer. The basic architecture is what's called a superloop or simple loop architecture, and we believe that's appropriate for a life-critical medical device like the one that we're designing. So VentOS claims to be an operating system that is universal. It's a universal platform for mechanical human ventilation. How is that possible? Well, it's possible because all ventilators do almost exactly the same thing. They're relatively straightforward. They're simple devices. 
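The superloop plus hardware abstraction layer idea described here can be sketched in a few lines of Python. This is not VentOS code, which targets Arduino-class boards; the class and function names are hypothetical, and the loop is bounded by a tick count so that it terminates, whereas a real superloop runs forever.

```python
from abc import ABC, abstractmethod


class AirDrive(ABC):
    """Hypothetical hardware abstraction: anything that can deliver air."""

    @abstractmethod
    def set_flow(self, fraction: float) -> None:
        """Open the air source to the given fraction (0.0 closed, 1.0 fully open)."""


class LoggingValve(AirDrive):
    """Stand-in for a proportional valve; records commands instead of driving hardware."""

    def __init__(self) -> None:
        self.history = []

    def set_flow(self, fraction: float) -> None:
        self.history.append(fraction)


def superloop(drive: AirDrive, inhale_ticks: int, exhale_ticks: int,
              total_ticks: int) -> None:
    """Single loop, no threads and no scheduler: each tick decides the breath phase."""
    cycle = inhale_ticks + exhale_ticks
    for tick in range(total_ticks):
        inhaling = (tick % cycle) < inhale_ticks
        drive.set_flow(1.0 if inhaling else 0.0)


if __name__ == "__main__":
    valve = LoggingValve()
    superloop(valve, inhale_ticks=1, exhale_ticks=2, total_ticks=6)
    print(valve.history)
```

Because the loop only talks to the `AirDrive` interface, swapping bellows, a fan or a proportional valve means writing another subclass, which is the property the talk attributes to the hardware abstraction layer.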
Simple doesn't mean easy, because if you do something wrong the patient dies, but they are still relatively simple devices. Thank you. In particular, doctors normally want to vary the breaths per minute. As you become sicker, you require more breaths per minute. You hope that doesn't happen. The ratio of inhalation time to exhalation time is varied for the comfort of the patient. If you are approaching death, they may have to make that what would be very uncomfortable for a healthy person, to try to keep you alive. Pressure control ventilation keeps constant pressure through the inhalation. You want that pressure to be low because high pressure can cause damage to your lungs. But as you approach death, that pressure may have to go up to try to keep you alive. I'm not a medical doctor; Victor is a physiologist, not a medical doctor. Clinicians know how to balance these things. It's our desire to give them the power to do that. Basically you just blow air into the patient's lungs and then you stop, and the lungs deflate on their own. That's the way positive pressure ventilation works. It's simple, but you have to control all these things. This is sort of a diagram of a universal ventilator. All ventilators are sort of the same in this sense. There's an air drive, which produces air in one way or another, and that's the most mechanical part of the system. There's a sense module, and ours is completely separated in the sense that we use the VentMon, which is a separate device that we would like to productize. We gave a bunch away because we had a grant to give them away, but it's basically a spirometer. It measures everything about human breath, and if you connect it to the ventilator it allows you to see what the ventilator is doing. A controller is what this room would think of as the embedded system. That's where VentOS runs. For our interface, we use an Internet of Things based public data cloud, and we're still working on aspects of the clinical interface. 
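The relationship between breaths per minute and the inhalation to exhalation ratio is simple arithmetic, sketched below in Python. The function name and the default 1:2 ratio are assumptions for illustration and are not clinical guidance.

```python
def breath_timing(breaths_per_minute: float,
                  ie_inhale: float = 1.0,
                  ie_exhale: float = 2.0) -> tuple:
    """Split one breath cycle into inhalation and exhalation times, in seconds.

    The I:E ratio is given as two parts, e.g. 1 and 2 for a ratio of 1:2.
    """
    if breaths_per_minute <= 0 or ie_inhale <= 0 or ie_exhale <= 0:
        raise ValueError("rate and ratio parts must be positive")
    cycle_s = 60.0 / breaths_per_minute
    inhale_s = cycle_s * ie_inhale / (ie_inhale + ie_exhale)
    return inhale_s, cycle_s - inhale_s


if __name__ == "__main__":
    # 12 breaths per minute gives a 5 second cycle; at 1:2 that is
    # one third inhalation and two thirds exhalation.
    inhale, exhale = breath_timing(12, 1, 2)
    print(f"inhale {inhale:.2f} s, exhale {exhale:.2f} s")
```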
If we think about philosophy, the Unix way, and of course I didn't write this, this is on Wikipedia, you can find it, is to write programs that do one thing and do it well, write programs to work together, and write programs to handle text streams because they're a universal interface. This is from the 70s. This is a very old philosophy which has served the world in good stead, because Linux and open source software are eating the world. How do you apply the same things to the kinds of electromechanical devices that we're building? They aren't even chips. They're moving air around. Well, you attempt to do the same thing. You build machines that do one thing and do it well. That is not the way Johnson & Johnson would build a ventilator. They would put everything in the same case, but we're not Johnson & Johnson, right? We can do something different. We make physically separated devices where physical components handle one function at a time, and then they're integrated in a soft way. By using digital control, we make them all roboticizable, or controllable by a controller, so that we can use them and they can be reused in that way. In my experience, instead of handling text streams, the modern way to do this is you handle JSON objects that are communicated either via SPI or I2C, and that's kind of a universal control language that's easy for both programmers and hardware devices to understand. How realistic is this? That's debatable, because we're nowhere close to having an FDA-approved ventilator at the moment. However, we have done a lot with very little money. We built the VentMon, which is kind of our most realized device because it's much easier than a ventilator, right? VentOS is an existing operating system, PolyVent is a ventilator. I'm very proud that we've defined two data standards based on JSON, the Public Invention Respiratory Data Standard and the Public Invention Respiratory Control Standard. 
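As an illustration of JSON objects acting as the universal interface between devices, the Python sketch below encodes and decodes a single measurement message. The field names `kind`, `value` and `ms` are hypothetical and are not taken from the Public Invention data standards; consult those standards for the real schema.

```python
import json

REQUIRED_FIELDS = {"kind", "value", "ms"}  # hypothetical schema for this sketch


def encode_measurement(kind: str, value: float, timestamp_ms: int) -> str:
    """Serialize one sensor reading as a single-line JSON object."""
    return json.dumps({"kind": kind, "value": value, "ms": timestamp_ms},
                      sort_keys=True)


def decode_measurement(line: str) -> dict:
    """Parse a JSON line and check that the expected fields are present."""
    message = json.loads(line)
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return message


if __name__ == "__main__":
    line = encode_measurement("pressure", 12.5, 1000)
    print(line)
    print(decode_measurement(line))
```

One object per line keeps the stream easy to parse on both the microcontroller side and the browser side, in the same spirit as text streams in the Unix philosophy quoted above.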
Now, I'm going to come back to this but as you guys know, progress is often made through defining standards. It's often not very glamorous to do so but the work of defining the standards is really what allows other people to take your work and utilize it in a standard way. In this case, we've done work that has not been recognized. No one else is using these standards yet but I hope that will change. We tried to build an oxygen concentrator. We sort of stopped working on that. We also have vent display which gives a complete dynamic display of breath plots and the things that clinicians need to do. So if we map that to our diagram here, what we find is that the device that we're calling the ventilator really could be thought of as an air drive. It's the part that makes the air. We have a separate device, the Ventmon, which can be used as a sense module and we have a separate set of programs which happens to be an IoT defined public data lake and some JavaScript that runs in a browser to do the clinical GUI aspects of the system. We're also designing a general purpose alarm device. As I'm sure you understand, in any intensive care unit situation (thank you) you have to produce alarms when the patient needs care. Now, that can occur because your machine has broken or the battery has failed or you've run out of power or someone has tripped over a hose, and that happens a lot, but it also can occur simply because the patient's condition is deteriorating. In any case, you have to be able to produce a device which can generically alert people to the fact that something has to be done. Following the Unix way adapted to hardware, our idea is to make a separately packageable device that could be used for a cap door or a burglar alarm or all kinds of other devices in hopes that we can build a community of practice using that which will strengthen the use for medical alerts. This is the software that I was talking about. This runs in a browser.
This is what is produced by the Ventmon. I probably should be showing a video but this is actually dynamic: as the machine breathes or the patient breathes, you're seeing the pressure, the flow and various events like the measurement of the humidity and temperature, the end of the breath, the beginning of the breath. What you have on the right here is what a doctor in an ICU would typically compute about the breath traces. This is not super sophisticated but the thing that I really like about it is it runs in a browser so it's distributed generally and then secondly the software functionality of doing all those computations is completely separated from the ventilator. In most devices this is built into the panel of the ventilator and cannot be reused in any other way. Like lots of the things we've been talking about, VentOS can claim to be a universal system because it implements a hardware abstraction layer that lets you interface to turbines, fans, in our case proportional valves, bellows, other ways of producing gas. Following the open source methodology, it's not so much a machine as an ecosystem. We're trying to build a respiration ecosystem. As we've said, we've already seen that one piece of functionality has been added as a PCB that's put into the control module and that is an SPI interface to the general purpose alarm device which I mentioned previously. You might say, well why on earth would we ever have a respiration ecosystem? Well, there's a good reason from kind of a patient point of view which is all of these devices which accomplish various medical purposes, a ventilator, an O2 concentrator, a bilevel positive airway pressure machine, a CPAP machine, a PAPR, a bag valve mask monitor, all of those essentially need standards of respiration data exchange which we have developed but nobody else has used and many of them need the same sense module that we've been talking about in the Ventmon.
In that sense, if you think of the way open source software has made components that work together really effectively, what we're trying to do is to create hardware and software components integrated which work together as effectively in the realm of human respiration. In a sense, we're trying to democratize the field of medical respiration and education around it. Open source software has already shown us the way. We're just taking things that were developed by open source software and attempting to apply them to hardware. In particular, as I'm sure you guys know, the development of standards like HTTP, HTML, JSON, etc. are absolutely critical to the progress and interaction of multiple components in the embedded architecture world but open source software more generally. We're trying to accomplish the same thing by producing respiration standards. These of course exist in GitHub repos. Thank you very much. In short, we built the most open, extensible ecosystem for a classroom. It's the most open, best documented system. I can claim that because I evaluated all of the other ones. There are other open source ventilators but you cannot find their designs online. They're not really open. They're just thinking about being open. That concludes my talk. Thank you very much. Thank you very much. |
Reverse engineering a solar roof datalogger
"Hey, is that a Raspberry Pi in there?" |
Okay, so, last talk of the dev room: Paolo Bonzini is going to be talking about reverse engineering a solar roof panel. Sorry, roof. Yeah, whatever. Well, whatever he said. Here, yes. So as a quick introduction to myself and to set the expectations straight, I'm not a hardware guy. I'm not a security guy. This is basically something I did for fun. So I'm a beginner in these topics. On the other hand, I'm not a total idiot either. I know assembly pretty well. I work with compilers, kernel stuff, so I know hex dumps enough to do this stuff. So anyway, this all starts almost five years ago when I bought a solar roof for my family. And the installer asked about having this optional data logging component. And I didn't really want to have anything cloud related because IoT is short for Internet of Things that shouldn't be on the Internet. So I didn't want it to be on the Internet, but it's entirely local. So I said, sure, why not? And this is what I got from them. This is the normal solar roof setup, the stuff that I don't want to touch with a ten-foot pole. This is the part that this talk will be about. And it was already suspicious from the beginning. But I mean, who knows? Maybe they bought it on sale. I don't know. But the plot thickened when I by chance had Wireshark running and I saw there was a Raspberry Pi that I didn't know of. I have other Raspberry Pis, but none with this IP address. So actually, a few years later, not really while preparing for this talk, I noticed this on their website. So these are the specifications for this. It has a quad core ARM Cortex. It even has a microSD inside. Okay, so you know what kind of Raspberry Pi is going to be in there. But anyway, let's take a step back and go back to 2018 and say, let's see what this thing does. So it's pretty nice. I mean, you can only see this from outside, but actually there's extra things around. Here you can see that they power it through the GPIO.
In fact, the power brick, they actually cut out the USB part and they just screwed the wires to the GPIO here. So what does it do? It basically logs data to the SD card every five minutes. It lets you plot nice graphs. Unfortunately, I don't have any picture of the graphs because I don't use the software anymore. But it also has five relays. You can see them here. Some of them have normally closed, normally open. Some of them only have normally open. And it's based on five volt inputs, which is a bit weird because usually you use 12 volts or 24 volts DC for communication. Five volts is a bit weird. And it's not very useful also because you have to actually put the wires on the wall and it was already hard to find a place for the whole solar roof things retrofitted into a relatively old house. Another interesting thing is that it has a built-in UPS because the inverter has like a 10 ampere line that stays up 24-7. If the grid goes away, it's still battery powered. So this is another incentive to actually reuse the Raspberry Pi for something else. And also on a slightly lower level, what does it do? It has port 22 open. Unfortunately, there's no way to upload keys. It has a remote update, but it triggered exactly once when I plugged it in and then not anymore for four years. The web server is not Apache. It's not nginx. It's just an embedded web server with a vanilla server header in HTTP. But it has a nice JSON API. It's easily discoverable with the Firefox developer tools and that's how I used this thing for some time. So for example, there is the API login. There is the API dash, which is basically what is used for the dashboard. And it returns the instant data like one minute old at most with some really cryptic names. Okay, Ver is not cryptic. Temp is not cryptic, but the other ones are a bit weird. And it also has the possibility to get CSV data with similar headers for a particular day or average across the whole year. 
The daily one is the most useful because if you have a lot of data for the same header, it's easy to figure out what it might be. So for example, this one, ENPH, okay, V probably stands for voltage, W stands for watt, but H you may not know offhand. But if you look at the number, you can see pretty clearly that it's probably the frequency of the grid or something like that because it's about 50. So you get lots of data. You get voltage, current, power values. You get a few computed fields that are easy to recognize because the name starts with X. So for example, X home is the current consumption of the home, independent of whether there is power coming from the solar panels, or maybe instead it's coming from the battery. The actual data from the inverter are not well suited, for example, to plotting useful graphs, but they have a few computed fields that put things together. There's also a few dozen flags that are interesting. They are almost all zero, and so they will be a bit harder to reverse engineer. And also you can see the five-volt inputs and the relay outputs in the logs. I actually never use these, so they are always zero, but they're named like in one, in two, in three, in four, in five, so it's pretty easy to figure out. So what you can do with this is already do simple-minded hacks with curl. So for example, you can gather all the yearly data and do your own plots. For example, you can see here when I do laundry because the weekend and Tuesday are when I consume more power. And another thing that I did very early on was push data to MQTT to get some nice widgets on my phone that could give me instant data without opening the web interface. And also for home automation, at some point I was using these to turn on and off some Zigbee smart plugs. But this was not any reverse engineering at the Raspberry Pi level. It was just looking at JSON stuff and doing stuff with curl.
But this was already enough to find some interesting bugs. There is some weird stuff in the logs. So for example, here you can see this probably stands for day and hour, and this stands for month and seconds, but it's a value that has a decimal point in it, so it makes no sense. And I will show later what happens. Also, there are some fields that end with L and H, but sometimes they are swapped. So you can see that L is the one that stays always the same, and H is the one that keeps going up and down, up and down. And also with L and H, they didn't really care a lot about them because if you plot them, the plot looks like this. So this is the total solar production since the day that I installed it. And this is how much it had produced in January, 2021. This is in December, and it's really weird. What actually happened is that they put the low as a signed value, but it should have been unsigned. So whenever it goes from 2 to the 15 minus 1 to 2 to the 15, it actually flips back to minus 32,000, whatever. So you can see that this is exactly 655.36 kilowatt hours. And if you fix it, it's still wrong because there are some bugs. But I mean, this was not supposed really to be used by the buyers of the appliance. People were just supposed to look at the nice plots. Also, this is way more worrisome because there are flags that seem to be parsed incorrectly. If you look at the logs, it keeps giving these weird names that keep going on and off like a few times every day. And the name is very scary. It's battery charge over the current hard limit, and same for discharge. And the weird thing is that you can see this error in the web interface all the time, but you never see it on the inverter. The inverter has a little panel with errors, and you cannot see that any time. So that's weird. And I didn't really like that. I asked customer support, which were otherwise very nice, but they just said, yeah, don't worry, it's fine.
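The signed versus unsigned mistake described above can be reproduced in a few lines. This is only a sketch: the helper names are made up for illustration, and the resolution of one count per 0.01 kWh is an assumption inferred from the 655.36 kWh figure quoted in the talk, not something taken from the data logger's code.

```python
def to_signed16(word):
    """Interpret a 16-bit register value as a signed two's-complement number."""
    return word - 0x10000 if word & 0x8000 else word


def combine(high, low, low_is_signed):
    """Combine a high and a low 16-bit word into one counter value."""
    if low_is_signed:
        low = to_signed16(low)  # the bug: the low word must be unsigned
    return (high << 16) + low


# Assumed resolution: one count is 0.01 kWh (hypothetical, inferred from the talk).
COUNTS_PER_KWH = 100

before = combine(0, 0x7FFF, low_is_signed=True)    # last value before the wrap
wrong = combine(0, 0x8000, low_is_signed=True)     # buggy reading after the wrap
correct = combine(0, 0x8000, low_is_signed=False)  # what it should have been
jump_kwh = (correct - wrong) / COUNTS_PER_KWH      # size of the bogus drop

print(before, wrong, correct, jump_kwh)
```

With these assumptions the counter appears to fall by 65536 counts as soon as the low word crosses 2 to the 15, which matches the sawtooth shape described for the plot.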
So all the time while I was doing these things for fun, I had the lingering thought that the microSD card would die on me because they say diamonds are forever. I don't know if it's true, but certainly microSD cards are not forever. So even worse, I couldn't just go and say, okay, it's 10 years warranty, please send me another SD card. Because for the newer models, they moved the storage to the cloud, and they only have five years of service, which is suspiciously close to the time before my own SD card died. So if people don't pay for the cloud service, they don't have to repair the data logger. Anyway, I never did anything about it until last year. Last year, I got the data logger to shut down on me twice in a week, and I naively thought that it was just a network problem, and I said, okay, I will just take out the SD card and set up a static address. Actually, the SD card was dying. I don't know why it really had all these problems in general. It didn't have any problem until last November, and last November, it really died in a matter of a week. So I have no idea what happened, but I'm very thankful to the SD card gods for that. So when I open it, this is what it looks like, and this is the back of a Raspberry Pi 3. It's potato quality, but okay, the first impression was that it's a standard Raspbian install, so I guess I can forgive them for not obeying the GPL because they didn't do anything weird. I mean, it was just Raspbian. And there were two statically linked binaries in /home/pi. One was running on TTY1. Nobody ever noticed that it was running there because there's no room for an HDMI cable, but it was there, though. The main use of that one, I think, was that if you plug a keyboard through the USB and you press a key, it rebooted, so it probably was a quick way for them to do some testing. But it had some nice ASCII art of the company logo. The other one is the one that we care more about.
The data is placed in /home/pi/storage, and the nice thing is that strace is installed, so I was already thinking of ways to get some data from this because with strace, it's so easy. Anyway, it's basically, again, a stock install; some files are newer than the others, so that's an easy way to see what's going on and what they changed. There's a couple of systemd units. There's an I2C RTC. This is nice because at the time, I hadn't even looked at the PCB or anything, so I didn't know what was in there. And knowing that there is a real-time clock on the board is nice to know. They disabled Bluetooth for no particular reason. Well, there is a reason, but I will say it later. So what I do is just I copy the binaries to my computer, I add my SSH key, and I turn it back on and it works. So one thing that you need to consider before doing this kind of work is what about the warranty, especially for something as expensive as the solar roof. The thing is the connection to the inverter is through USB. It's not direct RS485. There is a scooter. This thing is away from the board, so I didn't touch it. I just removed the screws to take a picture, and that was it. It's basically an RS485 to USB adapter. So also, the inverter has the baud rate. You can customize it, and the user manual has basically three choices. Date and time. What was the other one? It was the total that it reports for the produced energy. So you can reset it to zero if you give it to another person or for whatever reason you want to change it. And the baud rate for RS485. So the user is supposed to know that it is based on RS485 and is supposed to look at RS485 data. I don't think this is any kind of lawyer analysis, but still I'm pretty sure I didn't break the warranty on the inverter, which is the really expensive part. So anyway, I got the binaries. The first thing I do is use strings, and just running strings is an awesome source of information.
And what I noticed by doing a quick search, for example, for the API endpoints and for /dev/ttyUSB, is that all the strings are run together, and this means the program is unlikely to be written in C, because in C they would be null terminated and laid out one after another. So it could be Go because, I mean, there's not that many languages that you would use to write a web server in and that produce a large binary. Certainly, it wouldn't be Rust because it was four years ago, so it wasn't as fancy as today. And anyway, if you do a readelf, you can see some section headers like .gosymtab, and .gopclntab is basically the Go format for the debug information. So it's almost certainly Go, like what? Certainly Go. Another thing that you can search the strings for is GitHub, because why wouldn't people use RPIO libraries from GitHub? And also, you can find some nice names and things. And this is the name of the model of my inverter, so knowing what they call it in the source code or in the files can be handy. So anyway, the thing is running, I have SSH access, so what I can do is just strace it and see what happens. And one nice thing is that it opens and closes /dev/ttyUSB0 every minute, so it's pretty easy to also get not just the baud rate, but also the parity, the stop bits. Okay, that's very little, but okay. I will go fast. So with Go, the thing is it has an event loop that can move a goroutine from one thread to another, so you need to track the file descriptor numbers. So what you get here is something that you can basically recognize to be Modbus. This is a read 16-bit register request. This is a read one-bit register request. So what I did is I basically took this from the logs and I put it in a small C program to decode what it was. I mean, I could probably do something with Wireshark or whatever, but strace gave me already some C strings, so I put it in a C program.
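For readers who have not seen Modbus on a serial line, here is a rough sketch of decoding a read request of the kind that shows up in the strace output. The speaker used a small C program; this Python version is an independent illustration, the function names are invented for the example, and the frame is built locally rather than copied from the real device. The layout (unit address, function code, big-endian start address and count, then a CRC-16 sent low byte first) follows the standard Modbus RTU framing.

```python
import struct

# Standard Modbus function codes for read requests.
FUNCTIONS = {
    0x01: "read coils",
    0x02: "read discrete inputs",
    0x03: "read holding registers",
    0x04: "read input registers",
}


def crc16_modbus(data: bytes) -> int:
    """CRC-16 as used by Modbus RTU (reflected polynomial 0xA001, init 0xFFFF)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def decode_request(frame: bytes) -> dict:
    """Decode an 8-byte Modbus RTU read request."""
    if len(frame) != 8:
        raise ValueError("a read request is exactly 8 bytes long")
    unit, function, start, count = struct.unpack(">BBHH", frame[:6])
    (crc,) = struct.unpack("<H", frame[6:])  # CRC travels low byte first
    return {
        "unit": unit,
        "function": FUNCTIONS.get(function, hex(function)),
        "start": start,
        "count": count,
        "crc_ok": crc == crc16_modbus(frame[:6]),
    }


# Hypothetical request: unit 1, read 10 holding registers starting at address 0.
body = bytes([0x01, 0x03, 0x00, 0x00, 0x00, 0x0A])
frame = body + struct.pack("<H", crc16_modbus(body))
print(decode_request(frame))
```

Checking the CRC is a cheap way to confirm that the bytes pulled out of a trace really are one complete frame and not two writes glued together.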
And this is enough, for example, to compare with the CSV files, at least for the 16-bit registers. So for example, I can see now that these are probably low and high. So this was the minutes and seconds, the hour, the day, the month, and the year. This is 21 in hexadecimal. And I can also see that some values are fixed point. So this one is the version multiplied by 100, and this one is the temperature multiplied by 10. For the discrete inputs, it's a lot more complicated. I could find in strings some nice names of the fields. So for example, GRN was no voltage from the grid and so on. It's a bit weird that they made it an alert that there was a blackout, but they didn't make it an alert that the fan broke, whatever. And also it doesn't make a lot of sense to average the bools, but whatever. So anyway, this is already nice because I have the names corresponding to all the fields, but I don't have the mappings of the discrete inputs, which is what Modbus calls the one-bit Boolean values, to the flags. And I knew that this was the part that was broken in the code. So this was probably not going to succeed, but actually it was successful. I used radare2 for this, and I will super, super quickly go through radare2. This is what I learned about radare2 because I had never used it before. So the commands are one letter per word. So adf means analyze data in functions. There are some people here that are old enough to remember Lotus 1-2-3. That's the way the menus worked in Lotus 1-2-3. Basically, you had like one letter per word in the command that you wanted to execute. And the main ones are seek, print, and slash for search. And another interesting thing to know is that the state of your work is saved in a project, which is actually a single file in the Git repository with thousands of commands in it that say all the nice things about your binary. You can get some information from the debug info that I showed earlier.
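The fixed-point convention mentioned above (version multiplied by 100, temperature multiplied by 10) amounts to dividing each raw register by a per-field scale. A minimal sketch, assuming the field names Ver and Temp from the JSON API and using made-up raw values:

```python
# Per-field divisors; fields not listed are passed through unchanged.
# The scales come from the talk, the sample values below are hypothetical.
SCALES = {"Ver": 100, "Temp": 10}


def decode_fixed_point(raw: dict) -> dict:
    """Convert raw integer register values to real-world units."""
    return {name: value / SCALES.get(name, 1) for name, value in raw.items()}


sample = {"Ver": 123, "Temp": 254}
print(decode_fixed_point(sample))
```

Keeping the scales in one table makes it easy to add a field once its meaning has been worked out from the CSV data.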
There's a nice command to do all the analysis that is possible, but it doesn't really work for a statically linked binary. So I use these ones instead. It starts giving some nice information. So for example, in the project file, after I do the analyze strings commands, I can see that it has these Cs commands. And now when I do the disassembly, it actually prints these as a string and not as instructions. Likewise, when I do axd, I can see that it loads these. And it also says what is the data that is loaded from here. This is probably in the constant pool. So for example, this instruction is the one that loads the address of /dev/ttyUSB0. One thing that you can do is also you can write your own commands and add them to the file. So for example, here, I know that this location is a data operand for this instruction. And I can tell radare2, for everything that is in here, make it dumped as bytes, not instructions. So after I add all these things to the project file, it will not be dumped as a rubbish ARM instruction. It will actually print the word. Then you can also search keys. For example, if I search for the two flags that went up and down, I can see that they are here. And here, they are close. So I decided to search for this address. You have to put it backwards, because it's little endian. So this is the first byte and this is the last one. But then I found that they were found relatively close, like 68 bytes apart. So you dump them and it's very helpful that it even tells you where the hits were from the previous searches. And here, you can see some nice things. It seems to repeat every 68 bytes. And it's always a pointer to the string followed by the length. And also, there's these nice numbers, which might be maybe the numbers of the discrete inputs, who knows. And if you go back and back, you can see using the seek command, I can go back 68 bytes at a time. And sooner or later, I get to a point where the format changes.
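The "put it backwards" step and the 68-byte stride can be illustrated outside radare2. The sketch below builds a small synthetic blob, so the addresses, the record layout beyond "pointer followed by length", and all names are hypothetical; only the little-endian byte order and the 68-byte record size come from the talk.

```python
import struct

RECORD_SIZE = 68  # stride observed in the talk


def pointer_needle(address: int) -> bytes:
    """Bytes to search for: a 32-bit address stored little endian."""
    return struct.pack("<I", address)


def find_all(blob: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in blob."""
    hits = []
    pos = blob.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = blob.find(needle, pos + 1)
    return hits


def make_record(string_address: int, string_length: int) -> bytes:
    """Hypothetical record: string pointer, string length, zero padding."""
    return struct.pack("<II", string_address, string_length).ljust(RECORD_SIZE, b"\0")


# Synthetic data standing in for the binary's read-only section.
blob = b"\xff" * 16 + make_record(0x00123450, 7) + make_record(0x00123460, 9)

print(pointer_needle(0x00123450).hex())
first = find_all(blob, pointer_needle(0x00123450))[0]
second = find_all(blob, pointer_needle(0x00123460))[0]
print(first, second, second - first)
```

Two hits whose distance equals the suspected record size are a good hint that both pointers sit in the same array, which is the observation that led to walking back 68 bytes at a time.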
68 bytes before, there's nothing like what was afterwards. So this was the beginning of the array. And now, I know exactly which discrete input was with which name and so on. I can also tell it to print data. So, for example, this was a floating point number. And if I print it, it's 0.1. So the guess is that this would also be something related to the fixed point values. And now, it's time to actually find the pointer to this. I search for this address here. And I find it here. It's also an address followed by the number of entries. So it's probably some kind of array descriptor for Go. And then, I search for the address of the descriptor. And then, if I go here, I see that there is a reference from the query function of my model of the inverter. So I guess we have a winner. And in fact, what I did then was rewrite the software using Modbus. It outputs the logs in exactly the same format. So I still have a continuous log from the date of the installation, except I didn't leave it exactly the same. I fixed the bugs. So, no averages of dates. The scary flags are not logged anymore. I can see now whether it's using the battery or not. And I even got the GRN flag during a blackout. So I guess that's full confirmation that it works. It also does the same thing that I was doing before with curl. Now, I do it natively. So every minute, I export it to MQTT with Home Assistant. I don't have the plot functionality, but I can get it from Home Assistant. So that's fine. This is the source code. And sooner or later, I will try to put up an Ansible playbook so that if the SD card dies once more, I will just have a very quick way to deploy it. As a bonus, for the last minute of the talk, I have a picture of the PCB because at some point, I wanted to update to Debian 11. The NIC name changed, so it didn't get the network anymore. I had to really connect the Raspberry Pi to a keyboard and monitor. So here's the PCB. It's a work of art.
It's all through-hole components. Don't ask me why. The inputs are voltage dividers, so learning that the blue resistors are the more precise ones finally paid off after, like, 40 years of my life. Here's the battery-backed RTC. There's a power LED and an alert LED connected to the GPIO. This is nice. This is a driver IC for the relays and also for the LEDs because these are powered at five volts, not 3.3. And there's also, this is nice if somebody wanted to hack further on it. These are test pins and they are connected to eight more GPIOs on the Raspberry Pi. But there's something really weird here because this part, these screw terminals are not exposed. They're unused. And this is a bias resistor. This is a termination resistor. This is another RS485 transceiver, because remember that it was connected through USB. And it actually works. I have no idea why it's there. But if you look at the website, there is probably an older version of the board where you can actually read here common, A and B. So at some point they wanted to use it. And then they didn't, so on the brochure picture, this is not used; on the website picture, they still have the older version. And that's it. |
Learn 8-bit machine language with the Toy CPU emulator
An emulator in the style of the Altair 8800 or IMSAI 8080 |
Hi, everyone, and welcome to my FOSDEM talk about Learn 8-Bit Machine Language with the Toy CPU emulator. It's an emulator in the style of the Altair 8800 or the IMSAI 8080. Now, let me go ahead and share my screen so we can all see my slides together. And so here we are. So let me go ahead and first introduce myself. My name is Jim Hall, and I am from the FreeDOS project as well as a number of other open source software projects. If you'd like to reach out to me after the conference, there's my email address on screen at jhall@freedos.org. I'm going to be talking about the Toy CPU, and you can reach that at my GitHub, and that's at github.com/freedosproject/toycpu. Now this project is something that I used in a class that I teach. To kind of back up a little bit, among other things, I also do instruction of university courses part time. And one of the courses that I like to teach is this class, MIS 100, Fundamentals of Information Technology in Organizations, and that's at Metropolitan State University. It's located near to me in St. Paul, Minnesota. That's where I live. And this course is not meant for computer science students. This is really an introduction, I would describe it as sort of an introduction to technology for people who are going into our College of Management or basically any kind of management major. These are not meant to be people who are going to be computer programmers or engineers of any kind. They're going to be project managers and directors and things like that later on in their career. The goal of the course is really two-fold. One is to kind of teach some tools like, for example, Word, Excel, PowerPoint, things like that. But also it's to build a basic understanding of how technology works because really everything that we do in business today and in our personal lives requires technology.
So it's important if you're going to be decision makers in an organization, you really should understand how that technology works, even at some sort of a basic level. Again, I'm not teaching them about computer programming per se, and I'm not teaching them how to build a computer, but they do need to understand at a high level how all that stuff works. My goal in this class is to basically remove the mystery around how technology works. So it's not just you press a button and magic happens on the back end, but they should have some understanding of what's happening in the back end to make different things happen. Now part of this class in terms of the outcomes is, well, certainly a number of things are going to be understanding some operating systems. You can see up here about computer security, but also a little bit of programming. Now, as I said, we're not teaching them how to be computer programmers. We're not going to expect them to come out of this and know how to program in C or Java or something. But again, they do need to understand how that stuff works on the back end. I don't want them to think as they graduate this course and as they graduate the university, I don't want them to think about computer programming as some sort of magical thing that they don't completely understand. They need to have some general idea about what's happening. Now this course is kind of similar to another course that Brian Kernighan teaches. And I kind of wanted to do something that he does in his course, which is basically teaching the students about how computer programming works on a very simple CPU, something that would be very common for a CPU in the 1960s, maybe the 1970s, where basically you have a series of very simple instructions and the computer has an accumulator and you can manipulate values in the accumulator to make the program do different things. And in his course, and I've interviewed Brian a couple of times, which is why I know about it.
And so in his course, he wrote this toy machine simulator. And as you can see, it's sort of similar to assembly. And so the program I've got up on screen would add two numbers together. It loads the value of one into the accumulator and then it adds the value of two to what's in the accumulator. And that, of course, results in the value of three. And then we'll store the value that's in the accumulator into a variable called sum. You can see that that is stored after the rest of the program instructions are done. And then it prints the value to this output device that he has defined. And so that's why you can see three on the first line on the output side on the right hand side of the simulator. And then the program stops. So it does the stop instruction and now the program is done. And of course, when it's done, you can see the output of three. And of course, it says stop, and the accumulator of course also has the value of three because that's what it had when the computer was done running. And I thought this was a really interesting way to teach kids about programming at sort of a high level. But I like to also talk a lot about computer history in my class. And so I'll do a lot of lectures where we kind of talk about how technology got from, you know, let's say A to B. For example, the first week in the course, we start talking about things like the ENIAC and the Colossus, and we walk our way all the way up through different eras in computing to today. And so I kind of like to back things up a little bit and kind of start with how did people used to program computers in a much earlier era? Now I wanted to kind of borrow what Brian had done in his toy machine simulator, but I wanted to take it in a different direction. And so I use this as a starting point to create my own toy machine simulator and I call that the toy CPU. Now where did I go with this?
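The add-two-numbers program described above can be mimicked with a very small accumulator machine. This is a sketch only: the mnemonics and the way the program is represented are assumptions for illustration and are not the actual instruction set of Brian Kernighan's simulator or of the Toy CPU.

```python
def run(program):
    """Run a list of (operation, argument) pairs on a one-accumulator machine."""
    accumulator = 0
    memory = {}   # named variables, such as "sum"
    output = []   # stands in for the simulator's output device
    for operation, argument in program:
        if operation == "load":
            accumulator = argument
        elif operation == "add":
            accumulator += argument
        elif operation == "store":
            memory[argument] = accumulator
        elif operation == "print":
            output.append(accumulator)
        elif operation == "stop":
            break
        else:
            raise ValueError(f"unknown operation: {operation}")
    return accumulator, memory, output


# Hypothetical encoding of the program from the talk: 1 + 2, store, print, stop.
PROGRAM = [
    ("load", 1),
    ("add", 2),
    ("store", "sum"),
    ("print", None),
    ("stop", None),
]

print(run(PROGRAM))
```

As in the talk, the accumulator still holds three after the stop instruction, the variable sum holds three, and three is the only value sent to the output.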
Well, actually, I wanted to do something that was a little bit sort of old school and to kind of really get the kids to understand how programming worked in sort of a switches and lights model. As I said, earlier in the semester in our very first week, I talked with the students about, you know, the history of computing and, you know, when they see the pictures of older computers that have switches and lights on the front panel and that's how you program the computer, they say, well, okay, I guess I understand that they use switches and lights, but I don't really know how that works. And so I'm like, well, let's talk about that. And so I then talked about the, you know, this Altair 8800, which is what you're seeing up on the screen, which definitely used a switches and lights model. But how do you actually program that? Well, I didn't want to actually use an Altair 8800. In my class, I figured that was way too much in terms of overkill. I really wanted to have something very simple that the students could sort of see just a very bare minimum of how things worked. And so I combined this concept of an Altair 8800-like machine with what Brian Kernighan had made for his toy machine simulator. And that's where I came up with the idea of the toy CPU simulator. And so it's meant to be sort of in the style of the Altair 8800 or the IMSAI 8080, basically where you have a series of switches and lights, and the lights will indicate in binary what's going on. Now, binary might seem to you like, well, if you're not teaching them programming, why are you teaching them binary? Well, actually in that class, I do teach these students how binary works, because binary ends up getting used in a lot of different concepts in technology. Again, they're not doing a whole bunch of stuff with it.
They're not, you know, doing adds and subtracts in binary, they're not doing binary operations, but they do need to understand how binary works, because we talk about it in networking and things like that. So going with a toy CPU simulator that used binary is actually in keeping with some of the other things that we do in the class. And so what you're seeing here on screen is the simplest interface I could come up with for the toy CPU simulator. You're seeing the counter, and that's in the upper left. That's a series of eight of what are meant to be LEDs. It shows an eight-bit, or one-byte, value, and that's the counter in the program memory. I'll talk about that in a second. Now, for each counter, you're going to have an instruction in memory. And so on the right-hand side, on the same line as the counter, you can see the instruction that's stored at that counter memory address. Again, it's an eight-bit, or one-byte, value, although we don't actually have that many instructions for the toy CPU. I'll talk about the instructions in just a second too. Now, as the computer runs, it has a very simple operation model with a single accumulator. You can put values into the accumulator, you can copy values out of the accumulator, you can operate on the accumulator, and things like that. So I need to be able to show what the accumulator looks like. And so that's what I've got there in the middle on the right-hand side: the actual accumulator itself. Now, the accumulator can hold values from zero to 255, so it's 256 numbers. And of course, it's binary, so it's eight bits, it's one byte. And on the bottom line, I also made a status display that shows what the toy CPU is doing. What this is showing is that the system is powered on, so you can see PWR, power, on the right-hand side. It's also in input mode: it's waiting for the user to actually do something.
And so it's in input mode. As you go to different modes in the toy, it'll go into edit mode, and so we'll light up the light for edit. When you run the program, it lights up run. And then if anything needs to happen in terms of aborting or having an error or hitting a halt, things like that, we will light up those lights as well. And actually, when the toy CPU boots up, I wanted to be able to show that it's initializing, that it's zeroing the memory. And so when we look at the toy CPU in a little bit, you'll see the initialization: the INI light will light up, and so we'll see power and INI lit up as it runs through memory and zeroes everything out. Now, the counter is eight bits, it's one byte. That means it can store values from zero to 255. And so this toy CPU has that much memory: it's capable of storing 256 bytes. And that's going to be shared between program instructions and memory values in the program. Now, in terms of the instructions, it's definitely a minimal instruction set computer. And you can see right there on the screen, on the lower left of that black rectangle, I've got what's meant to look like a piece of paper that's been taped to the front of the machine. That is showing you the different operations that this toy CPU is capable of doing, and it's meant to represent the binary instruction for each one of those. And so there's, you know, stop, and that'll obviously stop the program from running. That's basically the end of your program. And that's all zeros. I did all zeros because when the machine boots up, it zeros out all the memory, and then if you were to run the program right away, at least it would just immediately stop. So it's sort of a safe way of stopping the system. And then you can shift the binary values in the accumulator right and left, and you can see that those instruction values are one and two, or binary one and two.
And that is, you know, seven zeros and a one, or six zeros, a one, and a zero. What that's meant to represent visually is that the bit in those last two places is either on the right-hand side or the left-hand side, and that'll shift all the bits in the accumulator to the right or to the left. And so it's a way to just do a binary shift. Now, it's not a rotate. It's actually a shift. And by the way, it'll only shift right or left by one place, so if you want to shift multiple times, you just need to call the right or left shift multiple times. And so if you have a binary value of one in the accumulator and you do a right, it'll actually give you zero. So you need to be careful that this is not actually a rotate. This really is a shift like you get in C programming. And then NOT is visually meant to be represented in the instruction lights as four lights off and four lights on. It's meant to represent that this is flipping the values that are in the accumulator. And then there's the instruction for AND, which is actually just two plus NOT. And so that allows us to do an AND of what's in the accumulator with some other value, a value that's stored somewhere in memory. We can also do a binary OR, same thing, with some other value that's in memory. And by the way, I want you to notice I picked these binary instructions very carefully. You'll notice that if the fourth bit in has been turned on for an instruction, that means it takes an argument: the next value in the program instructions will be the actual value that it needs to operate with. So OR is going to OR what's in the accumulator with some other value in memory.
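To make the shift-versus-rotate distinction described here concrete, the following is a minimal sketch in Python rather than the toy CPU's own code. The function names are made up for illustration, and `rotate_right` is included only as a contrast; the talk says the toy CPU does not rotate.

```python
MASK = 0xFF  # the accumulator is eight bits wide


def shift_right(acc: int) -> int:
    """Shift one place right; the lowest bit falls off the end."""
    return (acc >> 1) & MASK


def shift_left(acc: int) -> int:
    """Shift one place left; the highest bit falls off the end."""
    return (acc << 1) & MASK


def rotate_right(acc: int) -> int:
    """Contrast only: a rotate would wrap the lowest bit around to the top."""
    return ((acc >> 1) | ((acc & 1) << 7)) & MASK


def bits(value: int) -> str:
    """Render a value as eight binary digits, like the row of lights."""
    return format(value & MASK, "08b")


print(bits(shift_right(0b00000001)))   # shift: the single 1 is lost
print(bits(rotate_right(0b00000001)))  # rotate: the 1 reappears on the left
print(bits(shift_left(0b10000000)))    # shift: the single 1 is lost
```

Shifting by more than one place is just a matter of calling the one-place shift repeatedly, which matches how the talk describes using the instruction.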
And so that's why the third bit in, or rather the fourth bit in, is turned on. I can also do an exclusive OR. And of course, very important, I need to be able to load a value from memory, so I can start working on it. If I can load a value from memory, I probably should be able to save a value somewhere back into memory. And so I can store a copy of what's in the accumulator into some part of my memory. And of course, I can add to the accumulator some value that's previously stored in memory, and I can also subtract from the accumulator some value that's already stored in memory. And then you need to have flow control of some kind. And so I have a go to instruction in here, where it'll jump to some counter somewhere else in memory. And it can also do a conditional jump. That's called if zero: if the accumulator is zero, then it will jump somewhere in memory. And so if you want to do a comparison on something, for example, you need to do some binary operation like an XOR to see if you get zero, and you can jump somewhere else in memory based on that. It really is meant to be a very minimal instruction set for the toy CPU. Now, I found I also had to have a null operation, a NOP, because sometimes it's helpful to just take out an instruction when you're debugging something. And actually, the way the program works is that if it doesn't recognize an instruction that you've given it, that's the same as a NOP. It's the same as saying, I'm just going to ignore this instruction. But remember, I said that if the fourth bit in has been turned on, that tells the toy CPU that the next value, at the next counter, is something it needs to use.
And so if you actually gave it an instruction that was just three zeros, a one, and then four zeros, that would still be recognized as a NOP, because there's no instruction that looks like that, but at the same time it will have the side effect of skipping the next instruction in memory. Don't rely on that, because maybe a future version of the toy CPU will get rid of it. But that's actually what happens. So the guaranteed safe way to do a NOP instruction is to use the one followed by seven zeros. Now, why would I create the toy CPU to begin with? I probably should talk about that. Why did I create the toy CPU as opposed to going and finding something else? Well, as I said, I could go and find an Altair 8800 simulator. They exist. I could have just used one of those, but that's a lot of overhead for my students. I didn't want them to have to learn all of the instructions in the Intel instruction set. That seemed like a lot for them to have to tackle. There is actually another minimal instruction set computer, called the Digirule, that I really liked. I think it's about $40, and it's a kit you can buy on some guy's website, but it's out of stock. It looks like it sold out and he hasn't made any more. So I would have bought one of those, but it wasn't available, and I had to build my own. So that's why I made the toy CPU. And so looking at the instruction set, this is what I was talking about here. You can stop. You can move the bits to the right and left by one, and there are these other instructions. And it's always meant to be represented by that little card that's taped to the front of the toy CPU. Well, let's actually look at how you might build a program for this. And this is where we actually get to explore the toy CPU.
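The argument-bit rule and the NOP side effect just described can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the toy CPU's actual source: the helper names `takes_argument` and `next_counter` are hypothetical, jumps are ignored, and only opcodes whose bit patterns are spelled out in the talk are listed.

```python
ARG_BIT = 0b00010000  # the "fourth bit in": set when an instruction takes an argument

# Bit patterns that are spelled out in the talk.
STOP = 0b00000000
NOT = 0b00001111
LOAD = 0b00010100
NOP = 0b10000000  # the guaranteed-safe no-operation


def takes_argument(instruction: int) -> bool:
    """True when the byte after the instruction is consumed as its argument."""
    return bool(instruction & ARG_BIT)


def next_counter(counter: int, instruction: int) -> int:
    """Counter after a non-jumping instruction: step over the argument byte if any."""
    return counter + (2 if takes_argument(instruction) else 1)


# LOAD takes an argument, so execution resumes two counters on.
print(next_counter(0, LOAD))
# The safe NOP only moves to the next counter.
print(next_counter(0, NOP))
# An unrecognized pattern with the argument bit set acts like a NOP
# but also skips the following byte, as the talk warns.
print(next_counter(0, 0b00010000))
```

The design point is that one bit decides the instruction length, so the decoder does not need a separate table of which instructions take arguments.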
So if you want to download the toy CPU and run it, there's a version, by the way, on my GitHub that is a binary for DOS. Since I have a FreeDOS background, obviously, that's why I wrote it for that. There is an old working prototype, version one, also for DOS, that doesn't actually let you input a program. It was just meant as an experiment to see what I could do with it. Version two was a Linux program that uses ncurses. If you want to run the program on a Linux machine, you can grab the source code to version two from GitHub. There's a version two release, and you can compile that with ncurses and it will work fine. But it was also kind of meant as a prototype. It's not very nice looking. The DOS version is the one I really wanted to use, and so that's what we're going to be using here. Now, let's look at this one. I find that before I write a program on the toy CPU, it helps to write everything down on a little piece of paper. So what I'm showing here on the left-hand side is what it might look like if you were going to write everything down on a piece of paper. And so I want to blink all of the lights on the accumulator. I'm not talking about lighting up each light individually, although you could do that. I'm just going to do a very simple example here. I'm going to light up the four right-hand-side lights, then I'm going to light up the left-hand-side lights, and then I'm going to light all of them up together, and then the program will be done. So we're going to write a very simple program to do this. Now, you'll notice, by the way, when we eventually get to run this, that the toy CPU has a very long delay. That's so you can actually watch the system run.
Now, let me actually bring up my the actual toy CPU. So here's here's the toy CPU and I'm going to go ahead and run the toy. I mean, before, so actually before we write a program, let's actually look at the toy CPU itself. So hit return on toy and this is the toy CPU. Now, I mentioned it's going to initialize the memory. And so it's basically going to wander through from zero to two hundred and fifty five and set each instruction to zero. And that's what you saw at the beginning. You can actually quit. You can actually watch, by the way, the light go over to halt. But let me just rerun that again. So let's do toy and you can see it go. The counter is going up from zero to two hundred and fifty five. And as it does that, it's setting each instruction to zero. The accumulator was already initialized to zero when the program starts up. And so here I am in input mode and of course, the system is on. So the power light is lit up and the input light is also lit up. And so I can see here, I've got the little card left inside. It's giving me the instructions for what I can do inside the toy CPU. Now, just to kind of wander through the interface a little bit here, this is in input mode right now. And so on the bottom of the screen, you can see the hints for how you can use input mode, input mode, the up and down arrows. We'll have to move between the counters. So I'm going to use the up and down arrows of my my keyboard. So if I go down one. So basically can imagine the top being zero and then everything after that. So basically we're trying to read instructions from one line to the next. It kind of made sense for me to do it that way. So that way we go down to go to the next instruction and up to go to the previous instruction, because when you write it down in a piece of paper, that's how you're going to write it. And so this is counter instruction zero. But we can see that the instruction itself is I'm sorry, the counter one. 
So this is the basically the second line of the program. But the instruction is zero. If I go down again, you can see I'm an instruction counter to the instruction self is also zero. So it's basically a stop command and the same thing. There's three, there's four, there's five, there's six. And so if I go all the way back up to zero, you can see that the system is has a stop instruction. Everything has been zeroed out. And so if I hit R, you can see down there on input mode, Enter will allow me to edit it. We'll look at that in a second, but R will actually run the program that I have in memory. And as I said, I zero everything out. And as you can see on that little piece of paper on the front of the toy CPU, all zeros means that the computer is going to just stop running the program. And that's a safe way to do it. So I hit R on my keyboard and it's going to run the program. You can see it will go to the run status and those lights in the bottom. Nothing happened in the because the first instruction, the counter zero had an instruction of stop, which means it immediately stopped the program. It also took a while. Every instruction has a delay built into it. So that way you can actually watch the CPU running. And that way, as I run the program, I can actually explain to my students or remind them about what's happening and they can kind of match what's happening on the program they wrote down, see it actually execute on the toy CPU itself. I'm just going to run it one more time. You can actually watch the status in the lower right hand corner is going to go from this input mode to run. But of course, nothing's going to happen because the program is immediately going to do a stop instruction on counter zero. So I'll do R right now. There it is, it's running, but it's got nothing to do. And so it immediately stops. And so we've actually moved pretty far, actually. 
This is this is something that my students learned how to see is that the computer is actually doing something. It's immediately stopping, but it's immediately doing something. So let's actually write a program. And so this is the program that I showed my students about how computers work, how programming actually worked in sort of this switches and lights model. And so the goal was that I would show them how to write a program. And then we would write a program. I would challenge them to write some programs and we would input it into the toy CPU and watch it run. So the first one's a very simple program. We're going to blink all of those lights on the accumulator. We're going to light up the lights on the right hand side of the accumulator. And we're going to light up the lights in the left hand side of the accumulator. We're going to light them all up and then we're going to stop. And so that's what we're doing here. We're just basically a series of load instructions. We're going to load the right. We're going to load the left and we're going to load them all and then we're going to stop. And so the way that I do that, if I bring up my little hint sheet and so if I were to look here at the next slide, you can see that using my little hint sheet there in the middle, I can now create an instruction set to put into memory. So instruction zero is going to be the load command. And that's what that binary looks like. It's zero, zero, zero, one, zero, one, zero, zero, right? So if you look on the card in the middle, you can see that that's what the load instruction is. And then we need to load from the right hand side. Well, as I write my instructions, I can write all of my instructions from zero to six. That's the actual program instructions and any memory after that from seven all the way up to 255. I can use for program memory. It just happens to be this easiest for your program memory that you're going to be using to be right after the stop instruction. 
And so this program is actually 10 bytes long from zero to nine. So I'm going to load from memory location seven because seven is the first instruction after I'm done with the program. And then I'm going to load from memory location eight because eight is the next one. And then I'm going to load from memory location nine and then I'm going to stop. And you can see that memory location seven has all the lights lit up on the right hand side of the accumulator. And then left will, number location eight will load up all the lights on the left hand side. And then memory location nine is going to load up all of the lights on the accumulator. It's going to be value 255. So I'm going to bring my virtual box in here. Let's actually write this program. And so here I am, I'm going to write a program. So I'm in input mode. And that means I can use the up and down arrows, right? So I can do up and down. That's instruction one. Here we are back at zero. I can do one, two, three. Going back up here to zero. So there's zero. So for zero, I want to have a load instruction. And so I'm going to hit return. So on the bottom of the screen, you can see a little hint. It says input mode and up, down for counter and that enter to edit. And that'll allow me to edit the instruction at counter zero. And so if I hit enter, you can see that now I get... First of all, the lights on the bottom indicate that I'm in now edit mode. And then I get a little underline under each light that I can turn on and off. And so now I'm in edit mode. And so the little hint on the very bottom of the screen says edit mode. I'm going to use left and right to change what bit I'm looking at. And we use space to flip the bit. And then when I'm done doing editing, I can just do enter. And that'll bring me back into edit mode. And that's right, input mode. And so I want to do a load. And so here I can... You can use my arrows right and left. We'll move the arrow or the indicator right and left. 
But I want to set the load instruction. So you can see the hint over there or the little program I got written out. The load instruction. And of course, I could use a little sheet of paper this tack to the front of the the toy CPU. But the load instruction is... I'm going to turn that on and that on. And that should be the load instruction, right? Sort of compare that with what I've got on my screen over here. 0, 0, 0, 1, 0, 1, 0, 0. That's the load instruction. So now I've got that set so I can hit return. And that has now set the load instruction. Now I want to load from memory location seven. I want to go down one. And so there's counter number one. And so now I want to load from memory location seven. So I'm going to hit enter. And I'm going to go and set the instruction of seven. This is going to say load from memory location seven. So it does the load instruction first. And then it has to say, well, it's going to load from somewhere. So the next instruction in memory tells it where to load the memory... Where to load from memory. That value. And so I'm going to hit return because now I've entered seven. And now let's go down to instruction two. We used the down arrow. So now I'm on counter two. And now I want to do another load instruction. So I'm going to hit return. And I'm going to put in a load instruction. And so that's another load instruction. It's the same one I had back on memory location zero. Let me go back to zero. See, that's the one I had for zero. And then go back to memory instruction two. There's my load on two. And we're going to load this. Well, I'm going to load this from the left-hand side, which you can see in my program is going to be memory location eight. So the next line here in memory location are counter three. I now need to enter in the value of eight. And so if you remember your binary, those of you who maybe don't know binary, binary goes like this. From the right-hand side, I'm counting my bits. 
And so this is the ones place, the twos place, the fours place, the eights place, the 16, 32, 64, 128. So to load from eight, well, that's zero, zero. So this is the ones place, the twos place, the fours place, the eights place. And so if we hit space on that, that is the value of eight. And so I hit return on that. And now I need to go down to instruction four. So use the down arrow. Now do another load instruction. And so I'm going to do a load instruction that looks like that. And now I need to load from memory location nine. So go down one more. And this is now instruction five or counter five. Hit return on that. Now I need to enter the value of nine. So again, if this is the ones place, the twos place, the fours place, the eights place, I need eight plus one is nine. And so hit return on that. And now I've got nine entered in counter five. And counter six is the stop instruction, which is all zero. So I can just leave that be. And then counter seven. Well, this is now the memory locations that are not instructions. They're just memory. And so I need to insert or light up all the right hand lights. And so hit enter on seven, counter seven. Let's go ahead and turn on all of these lights. And I'm going to hit return on that. And now go down to instruction eight, counter eight. And I edit this one. I'm going to light up the left hand side. I'm going to light these four up over here. Enter because I'm done doing that. And now we go down to instruction nine. I'm going to light them all up. So I'm going to hit enter. And I'm going to just light up all of lights. And that's it. We can now run the program. Now, every time you do run, it'll actually always start the program from counter zeros. Let's go ahead and do our to run the program. And we'll actually watch it light up the accumulator. 
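The ten bytes entered above can be written out and stepped through in a few lines. This is a minimal sketch in Python, not the toy CPU's own code: it models only the LOAD and STOP instructions, whose bit patterns are given in the talk, and the `run` function and its return value are illustrative choices.

```python
LOAD = 0b00010100  # bit pattern given in the talk
STOP = 0b00000000  # all zeros

# Counters 0-6 hold instructions; counters 7-9 hold the data they load.
program = [
    LOAD, 7,      # counters 0-1: load the right-hand lights
    LOAD, 8,      # counters 2-3: load the left-hand lights
    LOAD, 9,      # counters 4-5: load all the lights
    STOP,         # counter 6
    0b00001111,   # counter 7: right-hand four lights
    0b11110000,   # counter 8: left-hand four lights
    0b11111111,   # counter 9: all eight lights (255)
]
memory = program + [0] * (256 - len(program))  # the rest of memory is zeroed


def run(memory):
    """Run from counter zero; return the accumulator value after each LOAD."""
    acc, counter, seen = 0, 0, []
    while True:
        instruction = memory[counter]
        if instruction == STOP:
            return seen
        if instruction == LOAD:
            acc = memory[memory[counter + 1]]  # argument is an address
            seen.append(acc)
            counter += 2
        else:
            raise ValueError(f"instruction {instruction:08b} not modelled in this sketch")


for value in run(memory):
    print(format(value, "08b"))
```

Printing each value as eight binary digits gives the same right, left, all pattern that the accumulator lights are expected to show.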
So pay attention to the accumulator line. As we run it, it's going to load the value from memory location seven, which is going to light up all the lights on the right-hand side. It's then going to load the value from memory location eight, which is going to light up all the lights on the left-hand side. Then it's going to load the value from memory location nine, and it will light up all the lights on the accumulator at once. So I'm going to do R, and it starts with zero in the accumulator. And now it's going to load up all the right-hand side. It's going to load up all the left-hand side. It's going to load up all of them. And now the program is done and we've exited. And so that's the program, right? Let's run it one more time so we can follow along. You can see the delay built into the toy CPU, so you can actually watch the program run. So I'm going to load from seven. I'm going to load from eight. I'm going to load from nine. And I'm going to stop the program. There we go. And now I'm back to counter zero, which has the load instruction in it. And so now you know how to do that one. There's actually a better way, an easier way, to write a program that will light up all those lights. And that is to do it this way. We're going to use binary instructions. So I'm going to load all the lights that are on the right-hand side. And then I'm going to do a NOT. I'm going to take that zero, zero, zero, zero, one, one, one, one and do a NOT on it, so every zero becomes a one and every one becomes a zero. And so that will turn it into one, one, one, one, zero, zero, zero, zero. Now, because I don't want to move where my memory location is, and it makes it a little bit easier to not have to change things in the program, I'm going to just use a NOP instruction. That's where it ends up being handy to have this NOP. And then I'm going to do an OR from memory location seven.
So what the OR instruction does, if you remember, is it takes two values and basically lines them up. Every time you have a one in either of these two values, you get a one in the output. So you can have a one and a one, and that will give you a one, or you can have a one and a zero, and because it's an OR, it gives you a one, right? A one and a zero ORed together will give you a one. And the only time you don't get a one at the end is if they're both zero, right? Zero and zero ORed together gives you a zero. So I want to use this program to light up the lights on the right-hand side, then on the left-hand side, then all the lights together. By doing this, I don't actually need the last two memory values. So now my program is actually eight bytes long, from zero to seven. So let's go ahead and look at what that would look like. If I were to look at what the program looks like, this is what I have. I'm going to use these instructions here from zero to seven. I'm just going to ignore the values that I have in my program memory, that's in eight and nine. So let's go ahead and bring my toy CPU back up on screen. And now let's enter this program. So in memory location zero, I need to start with a load instruction. Well, I've already got a load instruction there, so I don't need to edit this one at all. And so now I go down to memory location one, counter one, and I need to load from memory location seven. Well, I've already got memory location seven in there, so I don't need to edit that. The next one is memory location three, I'm sorry, memory location two, where I need to enter the NOP instruction, right? So I'm going to enter the NOP instruction. If I hit enter on this, now I can edit it.
Now, the NOP instruction, if we look at that little piece of paper tacked on the front of the toy CPU, is a one followed by all zeros. And so that's my NOP instruction. In memory location three, or counter three, I now need to give it the NOT instruction. And this is a nice one. We just hit enter, and then we have these four bits be zero, and then these four bits turned on. And that is the NOT instruction, right? See how it looks, where you've got four zeros and four ones? It's meant to imply that that's a NOT: all zeros become ones. And so that's our NOT instruction in counter three. Now for counter four: just hit the down arrow to go to instruction four. We're going to do an OR instruction. So we're going to hit return on this and turn this load into an OR. That bit is still the same, but I need to turn this one off and turn this one on. And that should be the OR instruction. It always helps, when you do a program, to write out the binary instructions, so you're not having to do it on the fly. And so there it is. Instruction four is an OR instruction. And now I'm going to hit enter on that, and it puts me back into input mode. And let's go down to counter five. Counter five now needs to be memory location seven instead of memory location nine. So hit enter on this. I need to change this to memory location seven, which is binary seven, right? Again, remembering our binary: the ones place, the twos place, the fours place, the eights place. So one plus two plus four is seven, right? One plus two is three, plus four is seven. So one, one, one is the binary value of seven. And so that's what I've got in counter five, and I hit enter there. And I go down to counter six, and that needs to be the stop instruction, which, because I used the NOP, I don't have to change, right? It's already stop. And so I'm good.
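The place-value arithmetic used when entering addresses such as seven, eight, and nine can be checked with a short sketch. This is an illustration in Python, and the function name `to_lights` is made up for this example; it simply walks the places from 128 down to 1, the same way you would decide which lights to switch on by hand.

```python
PLACES = (128, 64, 32, 16, 8, 4, 2, 1)  # the eight lights, left to right


def to_lights(value: int) -> str:
    """Return eight characters: '1' for a lit light, '0' for an unlit one."""
    if not 0 <= value <= 255:
        raise ValueError("the toy CPU holds one byte: 0 to 255")
    lights = []
    for place in PLACES:
        if value >= place:       # this place fits, so switch the light on
            lights.append("1")
            value -= place
        else:
            lights.append("0")
    return "".join(lights)


for address in (7, 8, 9, 10):
    print(address, to_lights(address))
```

For example, nine is eight plus one, so only the eights light and the ones light are on.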
And then memory location seven, counter seven, if I use the down arrow, needs to be all the lights lit up on the right-hand side. Now, it doesn't matter what's in memory locations eight and nine, because I'm never referencing memory locations eight and nine. Using the down arrow, you can see this is memory location eight, which we're not going to use, and this is memory location nine, which we're not going to use. And so now let's go ahead and run this. What we should see is that it's going to load memory location seven. This is memory location seven. It's going to load into the accumulator the lights that are all lit on the right-hand side. And then it's going to do a NOT instruction. So it's going to flip those zeros to ones and those ones to zeros: anything that's off will get turned on, and anything that's on will get turned off. That will light up the lights on the left-hand side of the accumulator. And then it's going to do an OR, and that will turn on all the lights in the accumulator, because the lights on the left ORed with the lights on the right will turn on all the lights. And so it's the same program we had before, but now we only had to do it with eight bytes instead of 10. So we go ahead and do R, and this will run the program. We can actually watch it go. It's going to load from seven, and there it is on the right-hand side. It's going to now do a NOT instruction, so it's going to flip the bits over. And then it's going to do an OR, and that will light up all the lights in the accumulator, and now the program is done. Let's run it one more time, so we can actually watch it run. There it is: load from location seven, then it's going to do a NOP, and then it's going to do a NOT, and there's the NOT, and then it's going to do an OR with location seven, and it's going to light all of our lights up here and stop the program. And so that's how we're writing that program. Let's do another one.
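The eight-byte version can be traced step by step with a small interpreter. This is a minimal sketch in Python, not the toy CPU's source. The LOAD, NOP, NOT, and STOP bit patterns come from the talk; the OR pattern `0b00010010` is an assumption, since the talk only says it keeps the argument bit and differs from LOAD by one light turned off and another turned on.

```python
STOP = 0b00000000
NOT = 0b00001111
LOAD = 0b00010100
NOP = 0b10000000
OR = 0b00010010  # assumed encoding; not spelled out in the talk

program = [
    LOAD, 7,      # counters 0-1: right-hand lights into the accumulator
    NOP,          # counter 2: padding so later counters keep their positions
    NOT,          # counter 3: flip every bit
    OR, 7,        # counters 4-5: OR with the right-hand lights again
    STOP,         # counter 6
    0b00001111,   # counter 7: right-hand four lights
]
memory = program + [0] * (256 - len(program))


def run(memory):
    """Run from counter zero; return the accumulator after every instruction."""
    acc, counter, trace = 0, 0, []
    while memory[counter] != STOP:
        instruction = memory[counter]
        if instruction == LOAD:
            acc = memory[memory[counter + 1]]
            counter += 2
        elif instruction == OR:
            acc |= memory[memory[counter + 1]]
            counter += 2
        elif instruction == NOT:
            acc = ~acc & 0xFF  # keep the result within eight bits
            counter += 1
        elif instruction == NOP:
            counter += 1
        else:
            raise ValueError(f"instruction {instruction:08b} not modelled in this sketch")
        trace.append(acc)
    return trace


for value in run(memory):
    print(format(value, "08b"))
```

The trace has four entries because the NOP leaves the accumulator unchanged, which is why the right-hand pattern appears twice before the bits flip.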
And so I've got another program in here. We're going to count down from a value. So we can count down from three: we're going to do three, two, one, zero. This one requires a little bit of working out some values. I want to load a value. All programs really start by loading a value. So I'm going to load a value into the accumulator, and then I'm going to test it. Is that value zero? Because if it's zero, I can be done. If it's zero, I'm going to just end the program. So: if zero, jump to the end of the program. And so I need to draw a little green arrow there that points to where the end of the program is. That way I keep track of what's where. And then it's going to subtract. If the accumulator wasn't zero, then it's some other value, and it can only be a value between zero and 255. So I can now subtract one from whatever's in there. And I can't actually tell it to just subtract one. I have to tell it to subtract a value that's stored in memory. So I need to store the value one in memory somewhere, and I need to keep a note in here that that's the one, which ends up being placed at counter 10. And then, now that I've subtracted one, let's go back to the beginning, to where we test if it's zero. So I'm going to do a go to instruction, and it needs to loop back to the beginning. And so I draw a little arrow there, a little orange arrow, that points back to the beginning of my loop. That way I remember where everything is. And this is actually how I write programs for the toy CPU. I have to write them all out. And I'm going to write a little placeholder: I need to go to the end, I need to load a start value, I need to subtract one. And so I'll just put words in parentheses or in quotes to remind myself that that needs to come from somewhere else.
And so as long as I've got those labels written somewhere, then I'm good. Now if I were going to turn this into a program to run on the toy, I'm just going to go look at my next slide here. So I count down three, two, one, zero. And that means that I need to start with a load instruction, right? As I've got written on a little piece of paper on the left-hand side, that yellow piece of paper, I wrote down the load instruction. So on the right-hand side, that's what the load instruction looks like. I need to load from a start value. And as I wrote down on the left-hand side, the start value needs to be at memory address nine. And so that's what I've got written there in binary. Just write it down as binary nine. And then I need to do a test if zero. And so that's what the binary looks like on the right-hand side for an if-zero. Again, you can look at that little card in the middle of the screen. I just reproduced what the card looks like. So you can actually remind yourself that that's what the if-zero instruction looks like. And if it's zero, it needs to jump to the end of the program. And as I wrote down on my little piece of paper on the left-hand side, you can see that the end of the program is at memory location eight. And then it needs to do a subtraction. So there's my subtraction instruction on the right-hand side. And we're going to subtract the value of one, which is stored in memory location 10. And then we're going to go back to the beginning of the loop, which is at counter two. Then we can stop. And then we have our two variables at the end, one for the start value and one for the value of one. And so this program is 11 bytes long, from zero to 10. So let's actually enter this into the toy. Let's actually watch it run. And so there's my toy and I'm on counter zero. And it already has the load instruction for me. So that's good.
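The 11-byte layout just described (load from nine, if-zero jump to eight, subtract from ten, go to two, stop, then the two data bytes) can be tried out in a small interpreter. This is a minimal sketch, not the toy CPU's code: the opcode numbers are placeholders rather than the real bit patterns on the card, and wrapping subtraction to eight bits is an assumption.

```python
# Placeholder opcode numbers for this sketch; NOT the toy CPU's real encoding.
STOP, LOAD, SUB, GOTO, IFZERO = range(5)

def run(memory, max_steps=1000):
    """Run a program; return each accumulator value produced by LOAD or SUB."""
    acc, pc, trace = 0, 0, []
    for _ in range(max_steps):
        op = memory[pc]
        if op == STOP:
            return trace
        arg = memory[pc + 1]      # every other instruction here takes an address
        pc += 2
        if op == LOAD:
            acc = memory[arg]
            trace.append(acc)
        elif op == SUB:
            acc = (acc - memory[arg]) & 0xFF   # assumed 8-bit wraparound
            trace.append(acc)
        elif op == IFZERO:
            if acc == 0:
                pc = arg
        elif op == GOTO:
            pc = arg
        else:
            raise ValueError(f"unknown opcode {op} at counter {pc - 2}")
    raise RuntimeError("program did not stop")

program = [
    LOAD, 9,      # counters 0-1: load the start value
    IFZERO, 8,    # counters 2-3: if the accumulator is zero, jump to the end
    SUB, 10,      # counters 4-5: subtract the value stored at location 10
    GOTO, 2,      # counters 6-7: loop back to the zero test
    STOP,         # counter 8: end of the program
    3,            # counter 9: start value
    1,            # counter 10: the constant one
]

print(run(program))
```

Writing the addresses next to the mnemonics plays the same role as the arrows on the paper notes: it keeps track of where the end of the program, the loop start, and the two variables live.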
And then I'm going to now enter in memory location nine for counter one, right? So here it is with the value of seven. So now I change it to nine. And the value of nine is the value of eight plus one. And so that's what that is. One, two, four, and eight. And so eight plus one is nine. So now counter one has the value of nine. I go down to counter two. And now I'm going to do an if-zero instruction. So if zero, hit return here. Let's enter this. If zero should look like this as the if-zero instruction. And then counter three, I need to jump to memory location eight. So hit enter on that. And that is eight. And then counter four. That's four. I need to do a subtraction. And so I hit return on this to enter it. And you can see on the right-hand side, it always helps to write down what your binary instructions are, so you can just very quickly enter them into the toy. And so that's the subtraction. We go down by one. And now I'm going to subtract the value of one, which is stored in memory location 10. And 10 is eight plus two. So again, your binary is, this is one, two, four, and eight. So eight plus two is 10. And now counter six needs to have a go-to instruction. Well, before, this was the end of my program. And I need to actually enter the go-to instruction, which looks like that. And then I need to go to counter two. And so then you change this value here to two. Which looks like that. And then after that, counter eight needs to be the stop instruction, because that's the end of my program. And now memory locations nine and 10 have the start value and then the value of one, so I can subtract one from my value. So hit enter on, well, that's the stop. So I need to go down one. This is nine. And now I need to enter the start value of three. And so this is one and two, that's three, hit enter on that. Now I've entered three into counter nine.
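The place-value arithmetic used while toggling the lights (nine is eight plus one, ten is eight plus two, three is two plus one) can be checked with a couple of helpers. The function names here are invented for illustration.

```python
def to_lights(value):
    """Return the eight on/off bits for a value, leftmost bit first."""
    if not 0 <= value <= 255:
        raise ValueError("a single byte holds 0 through 255")
    return format(value, "08b")

def place_values(value):
    """List the powers of two that add up to the value, largest first."""
    return [1 << bit for bit in range(7, -1, -1) if value & (1 << bit)]

for number in (3, 9, 10):
    print(number, to_lights(number), place_values(number))
```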
And then counter 10 needs to have the value of one. And so now, if I've entered everything correctly, I can run my program. And you can see that it starts with zero. It's now loaded in the value of three. And then it's going to subtract one from three. And now I'm at two. And then it's going to check if it's zero, and it's not. So keep going. It's now going to subtract one again. And once it's subtracted one, now we go down to one. And it's going to check, okay, is this zero? It's not zero. So I'm going to keep doing the loop and subtract one again. And now it gets us down to zero. And now it's going to jump back to the beginning of the loop and test if it's zero. And it is. And now it's going to jump to the end of the program. And now we're done. There we go. And that is counting down three, two, one, zero. And this was a program that my students were actually able to enter. Once they saw how to enter a program, it took them a little while to kind of get their own first program, but they were actually able to do it. And it was really neat to watch these students, who are not going to be computer science students, be able to write a program for a switches and lights model of computer. Let's go and look at another one here. So let's move a light from the left-hand side to the right-hand side. This is a very simple operation using shift. And so I'm going to load a starting value. You can see instruction eight is my starting value, which is the value on the left-hand side. And then it's going to test if that's zero. Because if it's zero, we're done. And so if it's zero, it will jump to the end of the program. And then if it's not zero, then we need to shift that light off to the right-hand side by one. So let's do the right instruction. Right doesn't take an argument. It just shifts the bits off to the right by one. And if they roll off the right-hand side, then they're lost.
And then we're going to jump back to the beginning of the loop. And then instruction seven is where the program can actually end. And then instruction eight has the value that we're starting with. And so looking at turning that into a program, I only need instructions zero through eight. This is a nine-byte program. And so let's go ahead and enter that into the toy. And so I'm going to start on counter zero. And I'm going to load. There's my load instruction. It's already there. And now I'm going to go down by one. Let's go to counter one. I need to load from memory location eight. So it changes to an eight. And now counter two needs to be the test if something is zero. And this is already the if-zero instruction. You can see it's zero, zero, zero, one, one, zero, zero, one. That's the if-zero instruction. You can always tell because it's on a little card in the front of the toy CPU. But I also have written that out in binary on the right-hand side. So I know that counter two has the if-zero instruction. Go down to counter three. And this needs to be the location of the end of the program. That's where the stop instruction lives. So I'm on three, and this needs to be the value of seven. And so I'm going to turn that off and turn these on. And that is now the binary value of seven. And now let's go down one. We can go to counter four. And this needs to be the right shift instruction. Shift all the bits in there over by one. So we're going to turn this off. And that is now the right instruction. Enter on that to submit it. And then go down one. And now this is instruction five. And this needs to be the go-to instruction. So hit enter to edit this. And so these two lights and that one off, that should be the go-to instruction. And then submit that and then go down by one. This will now be counter six. That now needs to have the value of two because that is going to go back to instruction two in my program.
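The nine-byte light-moving program laid out above can be sketched the same way. As before, this is an illustration under assumptions: the opcode numbers are placeholders rather than the toy CPU's real bit patterns, and the starting value is taken to be the single leftmost light.

```python
# Placeholder opcode numbers for this sketch; NOT the toy CPU's real encoding.
STOP, LOAD, RIGHT, GOTO, IFZERO = range(5)

def run(memory, max_steps=1000):
    """Run a program; return each accumulator value produced by LOAD or RIGHT."""
    acc, pc, trace = 0, 0, []
    for _ in range(max_steps):
        op = memory[pc]
        if op == STOP:
            return trace
        if op == RIGHT:           # RIGHT takes no argument: one byte long
            acc >>= 1             # the bit that rolls off the right is lost
            trace.append(acc)
            pc += 1
            continue
        arg = memory[pc + 1]      # the remaining instructions take an address
        pc += 2
        if op == LOAD:
            acc = memory[arg]
            trace.append(acc)
        elif op == IFZERO:
            if acc == 0:
                pc = arg
        elif op == GOTO:
            pc = arg
        else:
            raise ValueError(f"unknown opcode {op} at counter {pc - 2}")
    raise RuntimeError("program did not stop")

program = [
    LOAD, 8,        # counters 0-1: load the starting light
    IFZERO, 7,      # counters 2-3: if no light is left, jump to the stop
    RIGHT,          # counter 4: shift the light one place to the right
    GOTO, 2,        # counters 5-6: loop back to the zero test
    STOP,           # counter 7: end of the program
    0b10000000,     # counter 8: assumed start value, the leftmost light
]

for value in run(program):
    print(format(value, "08b"))
```

Because the loop exits only when the accumulator is zero, the light has to fall off the right-hand edge before the program stops, which is why starting closer to the right shortens the run.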
So let's go ahead and enter the value of two. And let's go then to counter seven. So I'm going to use the down arrow and that brings me down to counter seven. And this needs to be the stop instruction. So I'm going to hit enter here and turn off this bit. And that now means that my program has stopped. And then eight is where we need to have the start value. Now it does take a while to run this program and actually watch the light move all the way from the left-hand side to the right-hand side. So for simplicity, just to kind of speed up the demo a little bit, let's start the bit right there. We're going to start the light right there. And we're going to watch it go from that end to the right and then it'll disappear and then the program will end. Now it doesn't matter what I have in instructions after that. So this is eight and then nine has some garbage value in it, but I don't care because my program never gets there. So let's go ahead and run this program. So again, what it's going to do is it's going to start with this value here from memory location eight. And it's going to light up that light. That's the fourth one from the right. And it's going to load that into the accumulator and then it's going to keep moving that light to the right by one until it gets to the value of one and then right shift one and that'll be the value of zero and then the program will end. So basically it's the same as moving that light to the right hand side. So we're going to just run R and now it's going to load the value. There it is. And now it's going to do various comparisons if zero and then it's going to do a right shift, which it just did. And let's go back to the beginning, do a test if it's zero and it's not. So we're going to do a right shift and there it is. And they're going to go back to the beginning and we're going to test if it's zero. And if it's not, then we'll do a right shift and then it's going to move it off by one. 
And let's go back to the beginning of the loop and it's going to test if it's zero. And if it's not, it's going to then do a right shift. There it is. Now it's zero. Let's go back to the beginning of the loop. It's testing if it's zero and it is. And now it's going to jump to the end of the program, which has a stop instruction, and there it is. So that is a program that will just move a light from the left-hand side over to the right. And I think we have time for one more program. So let's go and do one more program here. We're going to add the values one and two and we're going to store the result in another location. So this is again something I had my students do. So that way they would have a basic understanding of how computers are working, how you program a computer, a very simple computer, using a switches and lights model. And so I'm going to load a value into the accumulator. I need to load my first value. And then I'm going to add to the accumulator some other value, my second value, also stored in memory. And then the accumulator will now be the sum of the first and second numbers. And now we need to store that in memory somewhere else. And so I need to do a store instruction, store that sum somewhere else, and then I can stop the program. A very simple program, and it needs three variables just to store values. It's going to need one variable to store the first number we need to add; we'll put in the number one. And then the second number it needs to add; we'll put in the number two. And then another place for it to store the result. Now you can actually watch the accumulator change. We can actually go back later to see if the stored value has changed. And so we're going to put in just a garbage value here, with all the lights on. So we can actually see that one plus two will be three. We actually should see the value of three in there when we're done.
So let's use that little card in the front of the toy CPU to actually figure out what our instructions need to look like. So we can start with the load instruction. And we're going to load from... Now we've kind of figured this out by writing on the left-hand side, on what looks like a little sticky note. By writing out our program first, we can actually figure out that the first number it needs to pull from is going to be at counter seven. So on the right-hand side, we're doing a load from memory location seven. And then we're going to add to the accumulator what is in memory location eight. Turns out our second number is in memory location eight. And then we're going to take that value, we're going to copy that value, we're going to store it into memory somewhere. And so we're going to use the store instruction and put that into the place we set aside for that, which is memory location nine. And once we've done that, we're going to stop the program and then we're done. So this is another 10-byte program, from zero to nine. And by the way, I've borrowed many of these instructions from Brian Kernighan's toy, and I was able to use Brian Kernighan's toy after I showed my students the toy CPU. So let's go ahead and enter this add one plus two and store it in a variable. And so I'm going to now start at counter zero, which is a load of a value. And that's already got the load instruction on. So I'm going to go down by one to counter one, which now needs to have a value of seven. So I hit enter here. And now let's turn that to the binary value of seven. And I'm going to go down one more. So counter two, we're going to do the add instruction. So let's hit enter on this and edit this to be the add instruction. Now I've already written on the right-hand side what add needs to look like. And so add looks like that.
That's the add instruction. And so I'm done with that. And now we need to add from what variable or what place in memory. So counter three has the location in memory we're going to pull from. And that is memory location eight. And so that is the binary value of eight. And then counter four now needs to store that somewhere. So now we need to give it the store instruction, which we haven't used yet. So this is the store instruction. And where is it going to store it? Well, the next one here, counter five, is going to have the memory location that we need to store it in, which is memory location nine. And then the next instruction, instruction six, needs to be the stop instruction. So I need to turn off that one bit. And now that's a stop instruction. And then I go down one more. And now I've got memory location seven. It's my first number I want to add. And I'm going to make that the value of one. And then memory location eight. I'm going to add the second number. Let's make it a nice number we can count, which is the number two. And now one plus two will be three, but we're going to store it over here in memory location nine. And just so we can see how things get changed, let's actually change all these lights to ones, the value of 255. And so that's memory location nine. Enter on that. And now let's run the program. So again, you can see on the right-hand side what's going to be happening. It's going to load a value from memory location seven. That's the value of one. It's going to then add to the accumulator what's stored in memory location eight, which is the value of two. And so that will change the accumulator to the value of three. And then we're going to use the store instruction to store that somewhere. We're going to store it in memory location nine, which is what we've got on screen right now. And then it's going to stop.
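The add-and-store program just entered (load from seven, add from eight, store to nine, stop, with 1, 2, and 255 in the three data locations) can be sketched as follows. The opcode numbers are placeholders and not the toy CPU's real bit patterns, and the 8-bit wraparound on addition is an assumption.

```python
# Placeholder opcode numbers for this sketch; NOT the toy CPU's real encoding.
STOP, LOAD, ADD, STORE = range(4)

def run(program):
    """Run a straight-line program and return the final contents of memory."""
    memory = list(program)        # copy so the caller's list is not modified
    acc, pc = 0, 0
    while memory[pc] != STOP:
        op, arg = memory[pc], memory[pc + 1]
        pc += 2
        if op == LOAD:
            acc = memory[arg]
        elif op == ADD:
            acc = (acc + memory[arg]) & 0xFF   # assumed 8-bit wraparound
        elif op == STORE:
            memory[arg] = acc                  # STORE changes memory itself
        else:
            raise ValueError(f"unknown opcode {op} at counter {pc - 2}")
    return memory

program = [
    LOAD, 7,     # counters 0-1: load the first number
    ADD, 8,      # counters 2-3: add the second number
    STORE, 9,    # counters 4-5: store the sum
    STOP,        # counter 6: end of the program
    1,           # counter 7: first number
    2,           # counter 8: second number
    255,         # counter 9: result slot, pre-filled with all lights on
]

before = program[9]
after = run(program)[9]
print(f"memory location 9: {before} -> {after}")
```

Pre-filling the result slot with 255 serves the same purpose as in the demo: any change to that location is easy to spot.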
So when we're done with the program, we can go back to memory location nine. And we can actually see that we've changed the value from 255 to three. So we hit R to run the program. Now it's going to load. And there's our value. It's going to add. And now it's three. And now it's going to store it. And now the program is done. And so now if I use the up and down keys, you can actually go down: this is counter zero, counter one, counter two, counter three, counter four, counter five, six, seven, eight, nine. And nine is now the value of three. We actually were able to modify the contents of memory. And so this is another program that my students were able to write. Now again, they're not going to become computer science students. So they don't need to be experts in how computers work, but they need to have a basic understanding of how programs work. And now they have sort of the grounding of how we would program computers in just sort of a switches and lights model. Now it became much easier for them to see why we had programming languages that were higher level, like C, Fortran, things like that. And so we were then able to carry this forward and talk about other programming languages. So this is the same program, add one plus two and store it in another variable, written in the C programming language and the Fortran 77 programming language. So if you're a C programmer, you know that you're going to start a function called main. And then we're going to define a variable called sum. And we're going to add one plus two. And we're going to store it in that variable called sum. And then we're going to, in this case, print it back out to the user. So with these other programming languages, we can actually now have things like displaying things to the screen. And so that prints out one plus two equals three, because that's the value of the sum variable.
And then it returns back to the operating system. Or if you're a Fortran programmer, on the right-hand side, we're going to define a program called add. And it's going to define a variable called sum. And it's going to add the numbers one and two and store the result in the variable called sum. It's going to print out then the result, one plus two equals three. And so by starting with sort of the switches and lights model, my students then understood at a basic level how computers operated. They're not using disk interfaces and screens and things like that. They're not querying the keyboard and other types of interrupts. It's just a very basic, minimal-instruction computer that teaches them the basics of how a very simple computer might operate. And once they have that understanding, then they were able to carry that forward to then understand why we built other programming languages and why those other programming languages are so much easier for programmers to use. And so that is why I created the toy CPU. And those are just a couple of programs you can use in the toy CPU. I'm sure now that you've seen some demonstrations about how to write programs in the toy CPU, you can write your own. And I would encourage you to go ahead and do that. So again, on the screen, you can see the URL to download the toy CPU at github.com/freedosproject/toycpu. If you have questions after this conference is over, you can certainly feel free to email me at jhall@freedos.org. Otherwise, I think we're going to do a Q&A now. And hopefully I'm going to be on the chat and I can answer your questions there. All right. So thank you very much for attending this talk on how to use the toy CPU emulator to learn 8-bit programming. Thanks very much. And thanks for joining FOSDEM. I think we're live. We are. Anyway, so yeah. So thank you very much for doing this and for kicking off the emulator devroom. What year are we?
FOSDEM 2023 presentations. It was an amazing talk, and especially cool because, you know, it's from the educational system point of view, which is interesting. It's always interesting. So yeah, thank you for doing this. And yeah, we have a couple of questions and most of them are from me. So they're annoying questions, but please bear with me. Let's see. So yeah, so this was actually answered during the talk, but maybe you can go into it a little bit more. So yeah, why create a new project, right? For a toy CPU, there are a lot of old CPUs that have very simple instruction sets. So yeah. Yeah, and I thought about, you know, using something else. I mentioned in the video briefly that there's a project called the Digirule 2 that was exactly what I was looking for. I was looking for something that would be simple for my students to understand. You know, they're not computer science students; they're business students. And so I wanted something to be very easy for them to understand, but at the same time, I also wanted to show how the switches and lights model worked, because that was always the question. I've taught this class a couple of times and that's always the question. It's like, well, how do you do programming with switches and knobs and lights? And so I wanted to be able to show that. But I wanted it to be simple. I didn't want to, like, get them lost with, you know, segmented memory and registers and things like that. So I wanted something that was simple. And the Digirule 2 actually would have been perfect for that. And I think it's, well, it's less than $100. I know that for sure. And that would have been perfect, but they're out of stock. It's an independent project and they're out of stock. So I wasn't able to buy one to show in class. So for me, it was, let's just go ahead and write one. That's very simple.
That has a minimal instruction set that's enough to show the students kind of how it works. And so that's why I created the toy CPU, rather than, like, I could have used, you know, there are emulators out there that let you basically run a virtual, you know, Altair 8800 or something like that. But when I looked at it, it's a little bit too much, especially the ones that were showing how the internals of the CPU worked. And I didn't quite need to go that deep. So something like this ended up being simple enough, I think, that these 100-level students could understand it. Yeah, okay. And during the talk, you mentioned, of course, you'd like to explain the history of everything to the students. And so did you explain punch cards to them? I did explain punch cards to them, but, you know, we didn't actually do anything as a demo of punch cards. But I actually have a physical punch card that I had in my other office that I was able to show them and say, this is a punched card. And we actually were able to talk about how the punch cards work. Because yeah, I like to do that in my classes. You know, it's the history. They learn how to use tools, that's part of the class. But the other part is they need to understand how technology works. And the way that I like to explain it is to explain kind of how we got from there to here, not just like, here's a computer in 2023. But actually, here's how we got to computers in 2023. And yeah, I like to show a lot of how we got from there to here. Because I think if they kind of can see things, you know, changing over time, it's easier to look at, for example, a motherboard for a modern PC and be able to recognize things on it, if they started with what we do, which is starting by looking at the motherboard of an Apple II. So, you know, that way it's like, oh, there's a CPU, there's the memory, there's these other things.
And then we look at a more modern motherboard, then we look at a more modern board from that, and then we can look at, you know, a phone motherboard and things like that. And once the students kind of see, you know, how things started and kind of walk their way up, I find the students are much better at actually being able to understand the technology at kind of a high level. So that's why I like to do this sort of how we got from there to here. And that's also why I wanted to show my students kind of this switches and lights programming model. Yeah, all right. And a small question: in some of the slides you use simulator, and in others you use emulator. And so what's the difference from your point of view for this particular project? In terms of what? The term or just what I was doing? No, the terminology, excuse me. Oh, so yeah, I just wanted to, you know, basically simulate kind of an older computer, and that was the goal. And so, you know, I kind of use the terms emulator and simulator, I guess, interchangeably there. But, you know, the goal was just to simulate the way a simpler machine would have worked. And this is kind of the way that a CPU would work from, you know, the 60s or 70s, with the accumulator, single operations, a limited amount of memory, and things like that. And so that was the choice there. Yeah, all right. Yeah, another question we joked about in the room is, so yes, this is pretty awesome. But it might be a bit complex for students. So yeah, how did the students do, actually, business students? Yeah, and that's a great point. It is a little complicated. I mean, I'm kind of going back to old school on this one. And it is a bit much for students today to kind of understand how that works.
You know, I showed them how the toy works, and we walked through the simple load instructions, and they were able to understand that. They were like, okay, great, once I understand how this works, I understand how the loading operations work to actually load these different values. As part of the class, I actually do teach some binary, because binary ends up being used in lots of different things; like, they have to learn how networking works at a high level. And so learning how binary works allows them to understand some of that. So this was not the first time they had seen binary. But, you know, they could understand how load operations work. And so that was okay. You know, I would say most of the students were able to do the add one plus two program. That's pretty straightforward. That's just loading a number and adding a number to it and then storing it somewhere else. And then it's just a matter of figuring out where all those numbers need to go. And I would say most of the students were able to do that one. I would say maybe half, maybe if I'm lucky, half of the students were able to do the countdown three, two, one, zero, because that one's kind of complicated when you look at it for a non-computer science student, and maybe even for some computer science students at the 100 level, because it's got a loop in there and it's modifying values. And so I mean, that was a lot. So if I was lucky, maybe half of the students kind of got that one. And so that's why I kind of think, well, if I were going to do this again, and I'm scheduled to teach that class again, would I use the toy again? I think it was an interesting experiment, but I'm not sure that I would use the toy again. It kind of got a little bit deep in some parts.
So I think I might not use it again, but it was a great experiment to do this year. Yeah, okay. And none of the students saw the light and thought, I'm going to go do computer science now? Not that I'm aware of. What I tell them at the beginning, I said, I'm not going to try and convert you to be a computer science student. But if you want to, that'd be great. But I don't think anybody in that room probably got converted on that one. Yeah, all right. There's a question. So did you consider doing, like, a web-based version of this, like Brian did for toy? I did consider doing something that was web-based. But something else I didn't mention in the talk was, you know, I created a mock-up of this, just an experiment, that was, you know, something that wouldn't read input. You had to have a hard-coded program, but it would run. It would display everything correctly. And then I did another experiment, sort of a mock-up that ran on Linux in curses mode. And then I thought, well, maybe the next one should be, you know, written in JavaScript, run on a website. You know, in theory, that wouldn't be too hard. But what ended up happening was, and I didn't mention this in the talk, is that there was a competition called Open Jam. And that's where you have a weekend to write a game from scratch. And it happened that around the same time that I was like, I should rewrite this program to make it look better, is when Open Jam announced, you know, what the topic was. And the topic was lights in the dark, because it was over Halloween. And I thought, lights in the dark, that's switches and lights programming. And so I decided that I would use that opportunity to do a switches and lights version of the program. And that's actually where it came from. It's like, well, because I do FreeDOS, my fastest path on this is actually to write a little FreeDOS program to do it, which is what I did.
And so I wrote this version of the program in about a day and a half. But I was only able to get there because I'd done these, you know, those prototypes before. So, you know, web-based actually would be a much better way to do it the way that Brian did. I just, and I would have done it. It's just that it happened to be that this other thing for Open Jam came up at just the right time when I was thinking about rewriting it. And so then I rewrote it. Yeah, okay. All right. And a fun little question is, what was the most complicated program yourself, I guess, wrote for this because the students, they went to the countdown thing. Yeah. So I think the most complicated program that I wrote with the toy, that's a good question. What is the most complicated one? You know, it would probably be something that had loops in it. Probably that countdown was pretty complicated because it's got a number of comparisons and loops and operations in it. I'm sure I did something else more complicated. Just not coming to mind. But that loops, the one where you're doing countdown, that one's pretty complicated. I'd love to do something that can count to negative numbers because you'll note that it's only a capable of displaying positive numbers, 0 to 255. And so I'd love to do something that can actually do, like, write a little program within that limitation that takes the last bit and uses that as a sign or something. So that way, I'd actually watch it count 3, 2, 1, 0, minus 1, minus 2, minus 3, which is possible to do. I haven't written that one, but that would be an interesting one to do as a demo. But I could see in my head right now how I would probably write that. So I'll probably write that. Yeah. Somebody mentioned in the chat that multiplication could be interesting. Yeah. Yeah. Multiplication, basically, it's addition. And I thought about adding addition or sorry, multiplication and division to the instructions. 
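One way to picture the negative-number countdown mentioned above is to keep the 8-bit value as it is and only change how it is read. The sketch below uses two's complement, where a set top bit means negative. That is one common convention and an assumption here, since the talk only says a bit would serve as a sign; it also assumes subtraction wraps around from 0 to 255.

```python
def as_signed(byte):
    """Read an 8-bit value as two's complement: a set top bit means negative."""
    if not 0 <= byte <= 255:
        raise ValueError("expected a value from 0 through 255")
    return byte - 256 if byte & 0x80 else byte

acc = 3
raw, seen = [], []
for _ in range(7):
    raw.append(acc)
    seen.append(as_signed(acc))
    acc = (acc - 1) & 0xFF      # assumed wraparound: 0 minus 1 becomes 255

print(raw)
print(seen)
```

Under this reading the lights for minus one are all on (255), so the same subtract-one loop counts below zero without any new instruction; only the zero test that ends the countdown program would need to change.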
But then I was like, the whole point is, you can show that multiplication is just addition a number of times. So multiplication is just several additions. And I'd basically maintain two counters for that. And I could see that happening. Division's kind of the same way, although you can't display decimals. It can't be floating point. But that would be an interesting program to write as well. Yeah, OK. And of course, the question everybody's wondering about: is it Turing complete? Say it again? The question everybody's wondering about is, is it Turing complete? Is it Turing complete? I don't know that it is. It was only ever meant as a very simple programming language, but I guess I'd never really considered if it was Turing complete or not. Yeah, all right. And something that really came to mind is, so from your perspective, how... Somebody is saying, yes, it is Turing complete. So there it is. I'm not sure what the basis for that is, if you can type that, please. In the meantime, I will ask a different question. But from your perspective, because this is a very interesting experiment, but generally speaking, the process itself, creating this, could also be very interesting to teach how to create stuff like this, writing emulators, writing simulators and stuff. And I was curious if you have an opinion about that. How well would teaching emulation fare, and how useful would it be? Now, I don't teach a computer science class. I would love to, but I don't actually teach one. But one of the things that I believe is that in teaching things like this, it actually does help to show how we got from point A to point B. And I think that there is something in teaching computer science that you probably should have students work on problems that are old.
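The remark that multiplication is just several additions tracked with two counters can be sketched as a loop that uses only addition, subtraction, and a test for zero, the same kinds of operations the toy programs use. This is written in plain Python rather than as a toy CPU program, the variable names are invented, and the 8-bit wraparound is an assumption.

```python
def multiply(a, b):
    """Multiply two values from 0 to 255 by repeated addition, wrapping at 8 bits."""
    total = 0            # first counter: the running sum
    remaining = b        # second counter: how many additions are left
    while remaining != 0:              # the role of the if-zero test
        total = (total + a) & 0xFF     # add
        remaining = (remaining - 1) & 0xFF   # subtract one
    return total

print(multiply(3, 4))
print(multiply(16, 16))  # 256 does not fit in eight bits, so this wraps
```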
And for example, in a compiler class, it would be interesting to have a student write a compiler, or invent their own language, where you can only use the characters that are on a punched card, and see what kind of program they would come up with. I, as an experiment, wrote a version of LANPAR. That's the very first computer spreadsheet that ever existed. Just because I was curious to kind of see how certain things worked in that. And I now know a ton about how LANPAR works. And I think that if you ask students to do projects like that in a computer science class, where you write something that's very old, in this case, maybe write an emulator that simulates a very simple CPU, I think students will understand a lot more about how that technology works. Because they will have had to kind of dig into and actually understand for themselves how certain decisions get made, how certain limitations get worked around. Very simple CPUs, could they multiply? Is that something that we had at the beginning? Well, you can emulate that by writing a program that will do multiplication for you. And then is it easier to just create that as part of the CPU? So, yeah, I think I got a little off maybe from where the question was, but I think it'd be very interesting to have a student write a program like this, an emulator like this. It doesn't take very long. As I mentioned, it took me about a day and a half, basically a weekend to write this version. So it's very much within the capabilities of an undergraduate computer science student. And I think they would learn a lot about how computers work. Yeah, okay. I don't agree that it would take them a day and a half, but yeah. Yes, it might not take them a day and a half as an undergraduate computer science student. It'd be a good, like, senior project, mid-semester project, something like that, right? Because then they'd be able to work it out, try to figure it out on their own.
That obviously took me a lot of time, because I've been doing programming for so long. But yeah, absolutely. Yeah, all right. And you just mentioned, or a couple of minutes ago, you mentioned your, this course, you're going to do it again next semester or next year next semester? Yeah, if I'm scheduled to do it in summer and in fall. And so we'll see if I end up doing something quite like this in that class. If I do use it, I probably would show it rather than asking students to write programs for it, just because it was, I think we got a little bit lost in the class with like doing the, trying to debug the loop. But the, maybe I'd keep them, like write some simple programs, like the ad programs, almost everyone was able to do that one. That would probably be a good one to have them see. And then we move on and show, you know, like I showed in the slides, then we actually moved on and said, okay, now that we know how you'd write like an ad one and two together in the toy CPU, now let's actually write that same program, or rather, I'll write that same program for you in higher level programming languages, like C, like Fortran, so that way the students can see like, oh, okay. So I know how the, you know, the machine did it, the toy did it. And now I can see why programming languages are so much easier to write stuff in because, you know, it's like, all right, so that's the, that's the program that you saw in the slides. That's the same program, but now I've written in C and in Fortran. And the students were like, oh, that's, that's why programming is such a big deal. And then we were all, you know, saying, okay, you can create way more interesting and complex programs using these programming languages because now you don't have to manage everything at the lower level, compiler does that for you. 
And they kind of, they, I think that they, the feedback the other semester was that the students said that they understood programming and how computers work a lot better because we had shown them kind of how things work very simple levels. So, you know, will I do it again next semester? I'm not quite sure, but I might, but I think I'd probably keep it to the very simple programming examples for them to tackle, like the adding to numbers. All right, fair enough, fair enough. That's another question. It's quite long. So, it says that the way the programs were written on the yellow notes used one line per memory location. I can understand that, but also it's a little bit confusing because you were using labels to beginning and end, for example, that didn't really have a place. Did you consider another layout for the assembly? Yeah, I suppose I could. What I was doing there, I was trying to talk it through on the slide, you know, the way that I would write these things out for my students is I would write things using the sort of the assembly notation, but I had to do it, you know, one, you know, I would try to do it one line at a time, like it would actually be inserted in the toy. And the way that I would do it, like on the whiteboard is I would say, okay, so we need to load, for example, the first number. And so I would just say load, and then we just, as a placeholder, the next line would say first number. And then we would go through and say, okay, we're going to add the second number. And so I would just say the add instruction, and then next line, I would just say the second number. And then, you know, it needs to be stored somewhere. So I say store, and then, you know, in some other location, and then the end, and then then we need, so then we're able to like draw lines to, okay, so the first number is going to be stored in this memory location down here. 
And so now we know that that, that we can actually, you know, we're able to erase it on the whiteboard and say, okay, so the first number is being stored at this memory location. So I can erase that, and I can actually put it in there. Kind of hard to show in a slide, because I kept the text, and I just used arrows. But that's, that's actually how we did it in class. And so the way that I would, that we actually did it in class was we actually did use the names of those instructions. And then once we had our list, so once we had our list of what all the instructions were, and what all the memory locations were, and we just write the memory locations as plain numbers, then we would do another column on the whiteboard that actually says, okay, now we're going to actually turn that into instructions for the, for the toy. And then we would just write the zeros and ones for each one. And that way they could see on the left-hand side of the whiteboard, you know, what the, what the instruction was, or what the memory location was, and on the right-hand side of the whiteboard, they could see what the binder representation was, and that kind of reinforced that this is exactly what we're putting in. And that was just a matter of taking those, those binary numbers and putting them into the toy and running them. So it was a little bit harder to kind of show as a slide, but we actually did go through that process on the whiteboard. It was a lot easier to do, if you can sort of erase some things and scratch things out and put some arrows in. So it was, you know, it ended up being a little bit more than what I was able, if I tried to do that on a slide, I think it'd be too busy. So I tried to keep the slide a little bit simpler. Yeah, all right, all right. And the question slash suggestion, actually. So will you place this video or something similar on YouTube, right? Because it's very interesting. 
And yeah, and I think I understand the reason behind the question, because not everybody comes to the FOSDEM website. And yeah, so yeah. Absolutely. So, you know, I'm part of the FreeDOS project, and we have a YouTube channel. And so our FreeDOS project is freedos.org, and I will put a copy of this video on our YouTube channel, either today or tomorrow, so that way everybody can see it. And so if you missed the presentation or just want to show it off to somebody else, you can click on our YouTube channel and watch it there. I'll also make sure we put a news item out on the website so that way people can see it if they just visit the website. Again, that's freedos.org. Yeah, yeah. Unfortunately, the Q&A will still be here on FOSDEM, but I'm not sure if that's unfortunate or not. Sure. But yeah, the FOSDEM videos used to be uploaded to YouTube, but for some reason YouTube blocked FOSDEM because we were uploading too much to YouTube. So yeah, they just blocked the accounts and it's a running issue. So yeah. Oh, somebody's asking, how do you spell the YouTube channel? It's just freedos.org. In my slide, I had my email address, and that was jhall at freedos.org. And so the website is just freedos.org. The YouTube channel is youtube.com/freedosproject. Okay. It's also in the footer of our website. All right. I will manage to find it. All right, let's see. Did I get through everything? I think I went through everything. Let's just give, I see somebody typing. Let's give him a moment. Let's see. Let's see. Let's see. You still typing? All right. I think final question. You kind of answered it already, but still, so if you would make like, I can't really find the actual word, but like a sequel course for this one, how would you make it? Like a more advanced one, for people that like this course. Oh yeah. So, you know, this class is MIS 100.
It's meant to kind of introduce kind of how technology works and then how to use certain tools. You know, I would, I would love, I think it'd be very interesting to see a class that kind of goes into more depth. I think that, you know, for that kind of need to focus on certain things. And I really think that, you know, there's a lot to be said for any student. It doesn't matter what program they're in, any student, to understand how technology works kind of a basic level. The way that I tell my students at the beginning of the semester in that course is that, you know, I don't want you to graduate, and I mentioned this in the video, I don't want you to graduate from this class or from the, from the, from the university and, and think that technology is just you press a button and magic happens in the background and then you get back an answer. You should have some understanding. And so having, I think that, you know, to do more with this class, I think it's really just helping students to do more of a deeper dive on how technology works and how certain of these things, you know, come about. You know, I think that there's a lot that students can understand around how applications get put together. The challenge would be because these particular students are not meant to be computer science students or business students. And so how do you do it in a way that's going to be accessible to them? But I, but we do, we do have other classes that kind of go into that in more depth. And I think that that's, that would be, that was something I would encourage everybody to kind of every university to do something like that. All right, excellent. So I hope all universities listen. Yeah, we know how that goes. But all right, thank you very much, Jim. And we hope to see you soon in the Brussels again. Thank you. Thank you very much. Thanks. Thanks everybody. Bye. |
7 things I learned about old computers, via emulation
(p.s. it's not about games) |
Okay, then we'll start. Well, everyone's now seen all of the slides, so we can just go down the pub. So, as this part of the slide says, my name is Steve, and as that part of the slide says, I'm a bit of a geek. Now, if we've all turned up to the right room, this is a talk about old computers and emulation and things that I've learned about old computers through emulation. Over the course of the next 20 minutes or so, I'm going to look at why emulation is really quite fun. I want to mention these seven things that I've learned over the last year or so, and then the honourable mentions, there's some other stuff that I found that was too good to go to waste. So, before all of that, who am I? What have I done to earn a place on this stage? Or, as this slide should really be called, the ego slide, where the speaker brags about themselves for 10 minutes while everyone else gets bored. My slide has that on it. That's not a computer. That's Lego. Yeah. So, who am I? Well, I'm a game developer. I've been a developer for quite a few years. I've written console games. I've written mobile games. I've written a book, 20 GOTO 10, which was crowdfunded last year. And it's a very good book on old computers, because that's essentially all I do. I've spoken at this conference a few times before, and I still haven't got it right. So, everything that's on that slide is not really that useful. It's about what's not on that slide. I'm not a professional retro developer. I haven't built, you know, the Vega. I have not built the new Commodore 64 machine. Everything that I have done with retro has been just me having a fiddle around with electronics. Essentially, that's a long-winded way of saying, if I can do this, anyone can. So, why is emulation useful? Well, it just is. And it isn't necessarily about games. It's about being able to see the machine. How does it boot up? How does it start? What goes on under the hood? And you can do it safely.
You can't break anything when you're running this stuff through an emulator. So, where should we start? The Jupiter Ace. For each of these, I'm going to mention a couple of emulators that I've used and experimented with. Obviously, MAME's probably got all of these anyway, but MAME is too big for me to compile. I like the smaller emulators because I can compile them easier and I can actually see. That way, I've got the development environment of the emulator and I can then run it through GCC as well. So, what did I know about the Jupiter Ace before all of this? Well, it uses FORTH. The first thing anyone will say about the Jupiter Ace: well, it programs in FORTH. And it's just like a ZX81 because the guys, Vickers and Altwasser, worked on the ZX81. They also worked on the beginning of the Spectrum. And they said, why is Clive Sinclair making all this money on our computer? We could build a computer ourselves and we could make all the money. And they did. Unfortunately, they didn't sell any machines. So, they made all the money of about 4,000 quid. But it's a very nice machine and the prices of them on eBay are just stupid. It is why I don't have one. And when I said it's like the ZX81, it's very like the ZX81. This is the code from two different emulators that reads the keyboard. Can you spot the difference? No, there isn't. Well, there is. There is one difference. The Jupiter Ace added an extra shift button in the bottom right. And that's the only change between the two machines' keyboard code. One thing. What else do we know? Oh, it's got a font which is stored in memory. And this was really interesting. Because the font is stored in memory. But this is the font, but it's not stored like that. Well, if you have a look, look at some of the... It's stored here. That's not a very big memory location for the code. So, what's it actually doing? Well, if you look at letters, you know, some of these, they can go below the line. But most don't.
So, and you've got some things which go to the top and some that don't. So, if you've got a capital letter that doesn't go above the line or below the line, why do you need all eight bytes to store that letter? You don't. You just chop the top off. And then you write a piece of code that puts that byte back in before you render it. So, they managed to save a whole chunk of code. The graphic symbols you see at the top, they're not stored in memory. They're generated by code. Again, you've saved another 200-something bytes by doing that. In fact, the only symbol in the whole font which is copied in its entirety is the copyright symbol. It's the only one that uses all eight bytes. And everything else is just modified some way. And it took only 64 bytes of code to do all of this munging. So that's quite a big saving over storing every single top and every single bottom of each letter. Next machine. Oh, there's a couple of murmurs for that one. Good. It's one of my favourites. Obviously, when anyone mentions the Welsh computer industry, they have to mention the Dragon because it's pretty much the only one that existed. But it's a great machine. The 6809 processor is a phenomenal piece of work. Wonderful machine. It's also often said that it's a bit of a rip-off of the Tandy CoCo. It isn't. Both the Dragon and the Tandy CoCo were based on the same reference design that Motorola put out to basically sell their chips. So they both used exactly the same setup. The other thing that people know about the Dragon, other than the fact it uses Microsoft BASIC, is green. The video chip defaults to green. Which colour green? That's colour green. Seriously, dudes. Why did you think that was a good thing to boot up to in the morning? It means your games look like this. Now, luckily, someone had the bright idea that you could actually just not use the colour. Build all your games in black and white because then they look a bit better. And they look like this.
Which is fairly respectable for a machine from 1982. This one is Jet Set Willy. Very well known that the 48K Spectrum had Jet Set Willy but the much smaller Dragon 32 had Jet Set Willy with more rooms in it. Because they did all the graphics in black and white so there was less colour needed so they got more data space and of course the processor was a lot better than the Z80. What else do we know about the dragon? Yeah, that's its font. It's not very pretty I'm afraid. But this is the bit that I found interesting. This sets up the graphics mode. At no point are you actually sending data across the bus. If you write to a memory address ffc6 for example, it will set a bit. It doesn't matter what data is on the data bus, it's just not read. If you write to c6 it writes a 0 bit, if you write to c7 it writes a 1 bit. You can send data by just writing anything to an address. That I thought was a really interesting approach. Turns out it's not the only machine to do it. Turns out ARM did it as well. The Archimedes, they just said right we're not going to connect the data bus to this chip. It'll be cheaper and then we'll write data to it just by toggling addresses. Nice clever way. So what computer's coming next everybody? It's the Game Boy. Pretty sure most people have had one of these. They're a bit like the Nokia phones of their era. You drop it on the pavement and the pavement cracks. That sort of thing. So what did I know about the Game Boy? Well there are so many emulators out there that emulate this and some variation of it. Four shades of green and had a version of DRM. Now I know where we are so I'm pretty safe in saying DRM is a bad thing. But this is what they did. Now that's currently obviously too small for you to be able to see but that's not important. What's important is that it's all on the screen. That is the first 256 bytes of the Nintendo Game Boy ROM. 
And in this lot it has to set up the graphics system, the sound system, it has to do its copy protection on the cartridge you put in to make sure it's an official cartridge. And it does that in that 256 bytes of code and then this is the bit that does the check. You put two pointers into memory, one into the cartridge that's been inserted and one into the 256 bytes of ROM and it says okay, if all of these bytes in the cartridge match all of these bytes in the ROM then it's a legitimate cartridge. You're allowed to play and the game continues. But what bytes are these? Well there's not many of them, you can see it goes around the loop and if it doesn't match the machine just locks up, that's fine. Well these are the bytes that it checks. Can anyone spot the pattern in those bytes? Let me show you in a graphical form. Anyone spot the pattern in those bytes? The logo that comes up when you switch on the Game Boy, is that it? It's in code. So it checks for the Nintendo logo at the start. This means if you're doing a dodgy cartridge you have to copy the Nintendo logo into your cartridge. This means you are committing a copyright law infringement and because it's a logo it's a trademark law infringement as well. As much as I hate the whole thing, whoever came up with that idea, that was a smart idea. I just disagree with it. Pac-Man. Originally called Puck Man, I can't think why they would change the name. It's a lovely little machine and it's incredible when you consider how much memory it has. 3K of memory for a full screen of color graphics. Now the way that it does this is through tiling and through a lot of clever hardware. But the bit that's interesting is how it's laid out. Again you don't really need to look at this but if you come up close the memory goes in this direction then in that direction then in that direction then down the other side. It's a weird way of mapping. But it means that when you get to level 256 this happens.
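Editor's sketch: the boot ROM check described above, two pointers walking through the cartridge and the ROM and comparing byte by byte, can be written out roughly as follows. The cartridge offset 0x0104 and the 48-byte length follow commonly published Game Boy header documentation rather than anything stated in the talk, and the reference bytes used in the usage note are a made-up placeholder, not the real logo data.

```python
LOGO_OFFSET = 0x0104   # assumption: logo position in the cartridge header
LOGO_LENGTH = 48       # assumption: number of bytes compared


def cartridge_passes_check(cartridge: bytes, reference_logo: bytes) -> bool:
    """Compare the cartridge's logo area with the copy held in the boot ROM."""
    for i in range(len(reference_logo)):
        if cartridge[LOGO_OFFSET + i] != reference_logo[i]:
            return False   # the real machine simply locks up at this point
    return True            # every byte matched: the game is allowed to start


def make_cartridge(logo: bytes, size: int = 0x8000) -> bytes:
    """Build a blank hypothetical cartridge image with `logo` in its header."""
    image = bytearray(size)
    image[LOGO_OFFSET:LOGO_OFFSET + len(logo)] = logo
    return bytes(image)
```

For example, with a placeholder `reference = bytes(range(LOGO_LENGTH))`, a cartridge built by `make_cartridge(reference)` passes, and changing a single byte in the logo area makes the check fail.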
I don't expect anyone to have got there but thanks to emulation I can write one byte and I can actually play level 256. Now it looks like the screen is all corrupt because some kind of graph, it's not. You know when you're playing the game and you get oranges and strawberry and fruit in the middle of it here? That's what it's trying to draw. They never thought anyone would get to level 256 so they never had a piece of code that checked the number of pieces of fruit that it should draw. And this is just running over all memory drawing all the fruit that it can find and then the rest of memory that it can find. That bit there, brilliant. So Pac-Man done. ZX81. Any fans of the ZX81? I was expecting a bigger cheer to be honest. We've got two fans over there. First machine I ever had. Don't worry that's not one of my security questions. 1k of memory. 1 kilobyte. No colour, no sound, it's complete moot. But we had a chess program that ran in 1k. There was a 3D monster mice, not 1k but still 3D games on a piece of plastic, basically a black plastic cheese wedge. You could get a 16k RAM pack. Wow, think of all that extra memory. What can you do with that? A very interesting fact that I found out completely by accident. I was taking a photograph of this for the book and there's a big story about the ZX81 RAM pack wobble. And because this RAM pack is set on the back of the machine it does this and then it crashes. Well people solve this problem by blue tack and with glue. Some people solve the problem by just buying a better machine. I solved the problem by actually sticking the two together with screws and really folding it in nice and tight. I thought I'm going to get a picture of this and this famous RAM pack wobble thing. So I measured the angle between the ZX81 case and the 16k RAM pack. It's 16 degrees. So what did I find out about the ZX81? Well how does it know there's a 16k RAM pack in there or not? Well it just tries everything. 
It just writes to every memory address it can find and when it's on it says oh hang on I've got no memory addresses over here it says well that's where your memory stops. Just writes data into it and then reads it back again. Very simple but effective. The grown up version of the ZX81 was the spectrum. This one did have colour of sorts. It had sound of sorts. It went beep. But what a machine. There are so many emulators for this machine it's not even funny. Even I've written one. But what can we say about the machine or the keyboard other than being rubbery dead flesh thing? It's how the keyboard's read that's kind of interesting. And it works this way. A Z80 machine has an input output system using memory and an input output system using ports. So you can communicate with systems that aren't based on a memory bus. So what does it do to read the keyboard? It does an out request onto a port and if there is a zero in any one of these bit positions it means the key is down. Well a traditional thing in a lot of these machines of the early 80s it would always be negative logic. So if you have zero zero zero zero for example in that it means all of those keys are down. Which funnily enough means it's quicker to check for five keys being down than it is to check for one key. Because if you want to check for a very specific key you've got to check the individual bit. Whereas if you just don't care about it you just say right well if it's not if it isn't one one one if it's you know it will just trigger as one of these keys is down. That's all it needs to worry about. And finally of our seven the Commodore 64 which I'm kind of contractually obliged to sort of put in since I put in Sinclair as well before. Commodore 64 what can we say? Well it's a bread bin and there are far too many emulators as well to mention. It was originally called the Vic 40 because the one before this was the Vic 20 and by a nice coincidence 40 in hex is also 64. 
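Editor's sketch: the ZX81 memory-sizing idea described above, write to each address and read it back until the value no longer sticks, looks roughly like this. The `Bus` class is a hypothetical stand-in for an emulator's memory map. It assumes unmapped reads return 0xFF and ignores address mirroring, so it is a simplification rather than the actual ROM routine.

```python
class Bus:
    """Hypothetical memory bus with RAM fitted from ram_start for ram_size bytes."""

    def __init__(self, ram_start: int, ram_size: int) -> None:
        self.ram_start = ram_start
        self.ram = bytearray(ram_size)

    def write(self, addr: int, value: int) -> None:
        offset = addr - self.ram_start
        if 0 <= offset < len(self.ram):
            self.ram[offset] = value & 0xFF   # writes outside RAM are lost

    def read(self, addr: int) -> int:
        offset = addr - self.ram_start
        if 0 <= offset < len(self.ram):
            return self.ram[offset]
        return 0xFF   # assumption: nothing fitted here reads back as 0xFF


def find_ram_top(bus: Bus, start: int, limit: int) -> int:
    """Return the first address at or after start where a written value fails to stick."""
    addr = start
    while addr < limit:
        # Two different patterns, so a floating bus cannot pass by coincidence.
        for pattern in (0x55, 0xAA):
            bus.write(addr, pattern)
            if bus.read(addr) != pattern:
                return addr
        addr += 1
    return addr
```

Assuming RAM starts at 0x4000, a 1K machine modelled this way reports 0x4400 as the end of memory, and the same machine with the 16K pack reports 0x8000.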
A lot of memory, 64 kilobytes of memory. 64K of memory, hang on, wait a minute, how does that work? How can you have a machine with 64K of memory in? The address range is only ever 64K. So you've got 64K of memory and that's all you could ever get. Where's everything stored? Where's your BASIC interpreter? Where's the system? If you've got 64K of memory there's not enough room to store the system. Well this is what it did. It gave you all the memory and they said actually no, I need this bit for my ROM. I need this bit for the SID chip to do the sound. I need this bit for my KERNAL. So essentially you don't have 64K of memory. It's all hidden by the ROMs. But there was an instruction you could call which was that and it would basically say yeah, I know there's a BASIC ROM over there but really I know machine code. So just ignore the ROM and it just switches the whole ROM off. If you don't want to use the KERNAL or the system ROMs there to do the rendering to the screen you can just say well, I know where the screen data is, I'll just write to it directly, so you can turn that one off as well. And essentially if you're prepared to go the whole machine code route you can just turn it all off so you've got 64K of memory. That's the only way you could have done it. So what are the other mentions that we should do? Well I should probably mention the ZX81 again because it is just an amazing machine. I also have to mention the Game Boy again, obviously not so close as the brilliant emulator, but one of the talks later on is about the Game Boy emulator and they're in the room and I don't want to offend them. Isn't that right? The Spectrum, the Jetpac game, there's a couple of things amusing about this game. One, very good game. The people who wrote it later became Rare, who did GoldenEye. This drew the screen backwards. Most people think when you draw a screen you start at the top and you go down to the bottom, right? That's how you do it.
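Editor's sketch: the Commodore 64 trick described above, ROMs and chips layered over the RAM and switched off one by one, can be modelled as a lookup on three control bits. The address ranges and bit meanings below follow the commonly documented C64 memory map and are not taken from the talk. Cartridge lines are ignored, so treat this as a simplified illustration.

```python
def region_at(addr: int, port: int) -> str:
    """Say what the CPU sees at addr for a given value of the banking port."""
    loram = bool(port & 0x01)    # bit 0: BASIC ROM enable
    hiram = bool(port & 0x02)    # bit 1: KERNAL ROM enable
    charen = bool(port & 0x04)   # bit 2: I/O chips (1) or character ROM (0)

    if 0xA000 <= addr <= 0xBFFF and loram and hiram:
        return "BASIC ROM"
    if 0xE000 <= addr <= 0xFFFF and hiram:
        return "KERNAL ROM"
    if 0xD000 <= addr <= 0xDFFF and (loram or hiram):
        return "I/O" if charen else "CHAR ROM"
    return "RAM"   # everything else, and everything once the ROMs are off
```

Under these assumptions, a port value of 0x37 shows the power-on layout, 0x36 swaps BASIC for RAM, 0x35 also removes the KERNAL while keeping the I/O chips visible, and 0x34 exposes RAM everywhere.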
Well they couldn't do that because their code was a bit slow, so what they realised is, if I'm drawing the screen this way, if you don't get to the bottom of the screen before the TV gets to the bottom of the screen you only ever get half a screen. So what they did is they drew it backwards. You start at the bottom and go up to the top. That way it doesn't matter if your game is too slow, you're only ever going to get the crossover once per frame, worst case scenario. So as the screen is refreshing this way you're rendering that way and if you ever see a little black line when you play Jetpac that's what it is. It couldn't render the screen quick enough and it's just missed the flyback code. The other thing is it's also a 16K game which means you can fit it onto a cartridge and this cartridge can take the place of the Spectrum ROM and this was a very unpopular thing. I mean you'd think, you know, games loaded, who remembers cassette tapes? I've suddenly realised we might have an audience that have no idea what a cassette tape is. These games were loaded off cassette and they would make a screechy noise. That sort of thing. That's a good rendition, let's play. I'm sure someone, somewhere can actually sing the sound tunes from the Spectrum and actually load them into a machine. That's probably possible. I don't have perfect pitch but I'm sure it's possible. So they think, why would I want to listen to seven minutes of screechy noises? Surely I just want to put a cartridge in. So Sinclair made a cartridge system. It had 10 games on it and that was it. Flop. But because it fitted in 16K it could be done as a cartridge, and because it fits in the same memory as the ROM it means you can't use any other Spectrum ROM capabilities in your code. The Acorn BBC Micro. Popular for anyone who was in the UK during the 80s because this was the computer we had at school. It was deemed educational enough to be allowed in the classroom.
What's nice about this one is that they have a little credits page in the ROM. Now it's very difficult to see that on an actual real world machine because, when I showed you the Commodore 64 and the ROM was overlapping the RAM, the same thing is true of the BBC, its ROM was overlapped by something else. So you can't read the thanks-to page. But because of emulation you can, and you can see all the people they thank. Pretty much everyone and their wife, and they thank the city of Cambridge. NOP. NOP. No-op, or no operation. It doesn't do anything. It's an assembler instruction that pretty much every machine has. Just does nothing. So why would you ever code that? Well, when you're poor like what I was, you don't have an assembler. Most people, you know, you'd have an assembler, you'd type in some reasonably English type things into the assembler. It would convert it into machine code and then the machine code would run on the machine. I didn't have an assembler. So I had to hand assemble everything by looking up in a reference book and saying I want to do this instruction, this is the number I need to do, and I type that number in manually to my machine. Now when you do a jump that goes backwards, you have to write it in two's complement, and a jump like that, say F0, is back about, what is it, about 16 instructions. But I can never remember if it includes the instruction itself or not. So this instruction is two bytes. So is this going back 16 or 14? No idea. So all I would do is I'd put a big stack of NOPs. So wherever it jumped to, it's going to be a safe instruction and it's not going to mess up the machine. Turns out I wasn't the only person to have ever done that. Microsoft BASIC. If you type WAIT 6502,1, it prints Microsoft on the screen because these bytes are hidden in the computation for the sine function. But that's not ASCII.
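Editor's sketch: the "16 or 14?" question about the backwards jump above can be worked through in code. On both the 6502 and the Z80 the one-byte offset is a two's complement number counted from the address of the instruction that follows the jump, so a two-byte jump with offset F0 lands 14 bytes before its own first byte. The function names and the example address 0x1000 are made up for illustration.

```python
def signed8(byte: int) -> int:
    """Interpret one byte as a two's complement number (-128..127)."""
    return byte - 0x100 if byte & 0x80 else byte


def branch_target(instr_addr: int, offset_byte: int, instr_len: int = 2) -> int:
    """Destination of a relative jump located at instr_addr.

    The offset is applied to the address of the next instruction,
    which is instr_addr + instr_len.
    """
    return (instr_addr + instr_len + signed8(offset_byte)) & 0xFFFF
```

So `signed8(0xF0)` is -16, and a jump at the hypothetical address 0x1000 with offset F0 goes to 0x0FF2: back 16 from the following instruction, or 14 from the jump itself.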
If you mask off those bits, shift them around a bit, subtract something else, then it's Microsoft but backwards and it's there. Microsoft and Apple put a lot of these sort of things. Steve Jobs was very scared. And quite rightly so, loads of people were trying to copy the Apple 2. Very, very popular computer, particularly in America. So Steve Jobs put a thing in there where if you went up and you pressed a special combination of keys, it would put up a big icon saying stolen from Apple. I was going to put that on the slide. Unfortunately, copyright means I can't put it on the slide. But I can show you Microsoft, one of Microsoft's many little things they put in there. So with that, I shall end. My beer is getting seriously low. I will update my FOSDEM scorecard. There we go. If you think after 23, I'd be good at this by now, right? So with that in mind, I'll open the floor to questions or even if you've got nice stories about old computers, this is a talk where it's acceptable to use the phrase, this is more of a comment than a question. So with that in mind, I'll say thank you very much and over to you. How are we doing? We've got eight minutes for questions. Orgin, we'll chit chat, whatever we have. There is one over here. Can I tell you on there what's the way? So what about Amstrad? Amstrad as a company, if you loved Amstrad on I think the 5th of June 1984, you hated them the day after because they bought Sinclair. In America, there's a show called The Apprentice. In the UK, we have a version where Alan Sugar who created Amstrad, he had his own range of machines, CPC6664s and things like that. They are surprisingly good machines. For a machine that sold a couple of million copies, it doesn't sound like a lot nowadays, but back then it was a really good, well-made machine, surprisingly enough. Unfortunately, I don't have one. Their chips are called Roland and Dave, I think, named after the designers. It's another Z80 machine. 
The Spectrum and what it spawned really did lead people to say, well, we already know the Z80, so we're going to go and set up our own computer company and use the Z80 again. It's not that it was the best machine or the best processor, but the Z80 did get used in so many things. So it's kind of disproportionate to its value as a chip. Good to hear some Amstrad here. Yes, it was. So the question is about the, I didn't realize I was doing funnies. Yes, so the Commodore 64 was called VIC-40 in pre-production and prototyping. The ZX Spectrum was called ZX82 during its prototyping stage. Not unsurprisingly, it came after the ZX80 and ZX81. The bit which I'm so glad you brought up, because it's something I forgot to mention earlier: the ZX80 that Sinclair produced was named because of the Z80 chip. It was the Z80 chip with the extra special ingredient: ZX, for extra. The ZX81 was called 81 because it came out the year after and therefore was related to the year, not the ZX80. The Spectrum was called the Spectrum because it had color. Therefore, the ZX81 is probably the only computer ever named for its year of release. I've never found another one, except probably a Gateway 2000, but that's a PC and they're boring. Yes, so the question is, do I know anything about non-Western machines, Russian, Chinese machines, and the answer is no. Because I'm of that sort of age, I remember when there was an Iron Curtain and it was called an Iron Curtain; whether that's politically correct or not, I don't know. But we couldn't get those machines into the West and they couldn't get our machines across. So they would essentially either smuggle some in, or they would find circuit diagrams and then rebuild their own. There is a massive scene of clones and copies and variations from what used to be the whole Eastern Bloc. They are all impressive. They look like they're homemade, and they just look amazing. It's just the physicality of the machine.
They are carbon copies of other Spectrums and things of their ilk, but they are lovely. So repeating the question for the crowd, something, something, something, Japanese stuff. Tell me electromechanical computer. How do we do? Okay. I think you're about to kick me off, aren't you? Almost. Just a comment. I don't know if you are aware, but when reading the keyboard on the Spectrum, the last eight bits actually select each row of the keyboard, so you can do a very simple routine to check whether any key is pressed by just doing a single read. Yes. Press any key. If it's quick, think. The youngest machine I've made an emulator for. Tricky. It probably is a Spectrum or something like that. One of the first ones I did was a CHIP-8 emulator, which is an interesting one. If you want to write an emulator, write a CHIP-8 emulator, because the machine never existed. It was things like the COSMAC VIP back in the 1970s. This machine would actually run a sort of a simulator or an emulator inside itself to run the CHIP-8 code. CHIP-8 didn't exist in the machine. It was an interpreted language that was being run on a 70s processor. If you see any of the old TV games where you'd sort of play a version of Pong, the chip inside that is probably an RCA 1802, which would run the simulated CHIP-8 language. There was an Apple I story I wanted to get in, but I didn't think I could get that in the time frame. It's about the Apple I ROM. It doesn't do much. It basically just writes data into memory, reads data from memory, and executes a program. It uses a jump-if-carry instruction, because the carry flag is always going to be set in one particular way, so it acts like a different type of jump, because that different type of jump didn't exist. I didn't have quite the room to include everything, because my time is up. Get lost. Go home. |
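Editor's note on the speaker's "16 or 14?" question from the hand-assembly story: a minimal sketch, assuming the convention used by 6502 branches and the Z80's JR, where the signed 8-bit offset is added to the address of the instruction that follows the two-byte branch. The function names and the example address are hypothetical, and the default NOP opcode 0xEA is the 6502's (the Z80's is 0x00).

```python
def branch_target(branch_addr: int, offset_byte: int, instr_len: int = 2) -> int:
    """Return the address a relative branch lands on.

    Assumes the offset is relative to the address of the NEXT
    instruction, as on the 6502 and for the Z80's JR.
    """
    # Interpret the operand byte as a signed two's-complement value.
    displacement = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # Add it to the address just past the branch, wrapping at 16 bits.
    return (branch_addr + instr_len + displacement) & 0xFFFF


def nop_sled(length: int, nop_opcode: int = 0xEA) -> bytes:
    """Build a run of NOPs to pad the landing zone when unsure of the target."""
    return bytes([nop_opcode]) * length


if __name__ == "__main__":
    start = 0x1010  # hypothetical address of the branch opcode
    target = branch_target(start, 0xF0)
    print(hex(target), start - target)
```

Under that assumption, an operand of F0 is a displacement of -16 from the next instruction, which works out to 14 bytes before the branch opcode itself.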
Pushing the PSP
Emulating Dreamcast and DS on PSP |
All right, so yeah, we have Daniel, how did I pronounce your last name? Daniel Welcher. Daniel Welcher. Okay. So Daniel Welcher, he did some very interesting work that is, I guess, unusual, because most of the emulation we see is on PC, so this is different, and me personally, I'm very excited to see this start, which is starting 10 minutes late, but we're going to figure it out, so right now, I'm going to go there, take it away. Okay, so my talk today is called Pushing the PSP, and it's about writing two emulators for the PSP. So I wrote this with the help of one of the other main developers on this project, Zero, or Lorenzo, but he can't be with us here today, so I'm presenting on his behalf. So a bit of background on this talk. It is mainly about Dreamcast and DS emulation on the PSP, like I said, and this was first attempted about a decade ago, around 2009. There were proofs of concept made for both consoles on the PSP, but due to the small power gap, it was quite difficult to emulate them at any good speed, and they remained proofs of concept. But today, much better tools are available, and a much better understanding of all platforms involved is available as well, so we'll see how a newer team gets on. So a quick primer on what the PSP has to work with. The main CPU is called Allegrex, and it's a MIPS CPU at 333 MHz. The GPU is a custom Sony graphics chip at 166 MHz. The resolution is a little less than 480p, and we have 32 MB of RAM as our baseline, although most models have 64. We have another chip of interest called the Media Engine, which is exactly the same as the main CPU, but lacking a vector unit. That chip will become a big talking point later, because developers officially couldn't write arbitrary code to this chip, but we can. So we'll start with the DS, because this one is the lighter, easier machine, hopefully. So the DS is, as I said, a much lighter machine than the PSP.
We're looking at an ARM chip at around 66 MHz, and a secondary ARM chip at 33 MHz. There's no modern-looking GPU in it, so we just have about 656 kilobytes of VRAM and 4 MB of RAM, so in theory this looks quite doable. The first efforts at emulating the DS on PSP were by a developer called Yoshihiro back in 2009, who ported an old build of DeSmuME, which many of you may know as a popular Nintendo DS emulator, to the PSP. It booted a lot of games, but as you can see from the frames per second counter at the top, it did not run very well. For those who can't see at home, that says about four, okay, out of 60. So we've a bit of work to do. It's a very basic proof of concept, but it's an exciting effort, because as you might realize, both of these systems were released in 2004 and were still receiving games at this time, so we were effectively playing games on the rival platform. But today the code is quite outdated, DeSmuME has come quite far along, and it was never very well optimized in the first place. So, the challenges with emulating the DS on the PSP specifically. First of all, we're probably going to rely on an interpreter, at least for the beginning, which is quite slow, more on that later. The touchscreen: of course, the PSP does not have a touchscreen, so we'll have to find a way to work around this. The unique graphics architecture: we have a 2D and a 3D engine as opposed to a more modern graphics solution. And of course we have the two-screen problem. How we present two screens on one is one thing, but the resolution doesn't quite seem to fit either, so we'll need a unique solution to try and scale things as well as we can. And then the last question is, what DS emulator can we use as a base? Because while DeSmuME was the obvious choice back in 2009, many other options have popped up since. So these were the three emulators we mainly considered. On the left we had DeSmuME, where we'd just use a newer build of DeSmuME.
It's the most complete and it has high compatibility overall, but it is a bit slow, it has a lot of old code, it's missing some extra features, and the developers won't give us much support. melonDS is a newer emulator released many years after the original proof of concept. It's mostly complete and it's faster, but it's a work in progress. NooDS is in exactly the same situation. The developer is quite helpful and we did actually get in contact with him to help us build it, and it is under active development, but it is the least complete. It is portable, though, so all three of these emulators are worth a look. So we started with a more modern build of DeSmuME, around 2020. Zero led the charge on this one, and he began porting the most recent stable build of DeSmuME to PSP, and there's some success. Many games boot and 3D does work, but because we're only using an interpreter as opposed to a more efficient means of translating the code, we're still not seeing great speeds. For those of you who can't see, we're looking at about 5 frames a second on Super Mario and about 17 on Yoshi's Island, so about a quarter of full speed at most right now. So we'll see how we can improve. So what can we do? Well, first of all, we can use the PSP's GPU. We can accelerate drawing with the inbuilt graphics hardware, at least 3D drawing. We can use the PSP's VFPU to optimise maths and similar functions. We can underclock the emulated system, hoping that we can skip some cycles and gain some performance, and like I mentioned earlier, we can use the Media Engine. So just to explain a little bit about this chip, this originally could only be accessed through a Sony API by official developers. That meant it was pretty much limited to tasks such as audio and media decoding. But for us, we can take advantage of this to do whatever we really want. And so now the question pops up, could we even emulate the second DS CPU on our second CPU? Could we offload some functions to it?
We have a lot of options here, but we'll have to do a little bit of thinking to figure out how to use it. So the first step is hardware rendering, moving our 3D rendering to the PSP's GPU. And in fact, this demo runs at 71 frames a second, up from 20 using software, or CPU-driven, rendering. Of course, there are new issues introduced. You can see the dice is now missing its face texture, a little issue, but we fix it eventually. And it also saves some CPU resources, so it will hopefully have even more knock-on benefits for the emulator. So the big step is a dynamic recompiler, or dynarec, and just-in-time emulation. So I'm sure many people here know what a dynarec is, but just to recap, compare it to an interpreter. An interpreter fetches and executes instructions one by one, and it would never be fast enough to emulate the DS on PSP. But a dynamic recompiler translates to native code and caches it, so we can run much more of the code as if it were native to the PSP. So far, so good. So the dynarec at this point was less than half finished, but we are getting some big speed gains. Basic 2D scenes reach or even exceed full speed. That Zookeeper demo earlier is now running at over 50 frames per second. In fact, just for comparison, the PS Vita build, which was not optimized, runs this same scene at 22 on much newer hardware. So we were off to a great start, doubling the performance of a newer console. But 3D is still very slow because that's quite a complicated thing, so very little gain. It's difficult to see here. Professor Layton at the top is actually reaching and exceeding full speed, but the 3D drawing is still at about 13 frames a second in a real game situation. So the question is, is DeSmuME too slow? Is DeSmuME's convoluted, archaic code a big factor in the performance here? Because the emulator is about 20 years old now. It was one of the first efforts to ever emulate the DS on PC, let alone on PSP. So we ran a little test.
Zero compiled DeSmuME on his computer without any optimizations, turning off everything from the compiler and seeing how the emulator ran with pretty much no optimization. And even on a modern computer, Pokémon was barely reaching full speed. So resolving some of these speed issues would require a major refactoring of an already large, old emulator. So, time to explore our other options. DeSmuME would require a lot of reworking to get optimal speeds. So what about the other emulators? Well, NooDS shows the most promise. It's clean, simple and portable, even if it's early in development. So can we use NooDS to better utilize the PSP? Well, at first there were signs of promise. It has a very good start. And initial results seemed quite promising. Even in the very early build where we hadn't even fixed RGB color issues, we were seeing over 50% speed on Yoshi's Island, that same game that was running at 17 earlier. It's neater. Each loop calls its functions distinctly. So drawing, rendering, DMA, all these things are called as individual parts of the loop, meaning it is easier for us to split it up between both chips. And it means we might be able to parallelize the code a bit better as well. So we can use that Media Engine for a bit of extra performance. But it's harder than it looks in the end. So some cache problems emerge here. As some of you may know, having developed for dual-core systems, it's a bit different when you move to dual CPU. See, when you try to access the same code on both CPUs, you need to flush the cache. And this wastes a lot of time. This, on top of things like scheduling issues and trying to balance between both CPUs, made this a bit more difficult than we'd initially imagined. On top of that, NooDS was a very different code base, and we would need to re-implement our work on hardware acceleration and our dynamic recompiler. So that itself is a lot of additional work. And like I said, it's still a work in progress as well.
So this emulator would still need more time, even if we finished porting it ourselves. So, back to the drawing board. Despite its flaws, DeSmuME still seems like the best option. NooDS would need a lot of work, both from ourselves and from the developer, before it was in a complete state. And optimization is not there either. So the other alternatives also turned out lacking. Any of you with an Android phone trying to play Nintendo DS games may know of DraStic. This is a very popular, very fast emulator, but it's closed source, and that kind of rules that one out pretty quickly. melonDS is also an option, but it's very threaded, and that doesn't really work well on single-CPU machines, or in our case, dual CPU, for the reasons I mentioned earlier. So where are we now? Well, we've stuck with DeSmuME as the main DS emulator for PSP, and it's in a pretty good state. Many games do boot. Some 2D games are indeed playable with frameskip, and of course with sound disabled, on that note, because that is a whole different beast to emulate right now. The JIT, or the dynarec, is implemented, but it's very inefficient, and it could be much faster. And I'll return to this later, but we can use knowledge learned from our Dreamcast JIT to improve our DS one. In other words, we learn a bit more about the PSP through various different emulation projects, and we can use it to improve all of them. So what's next? Well, first of all, complete the JIT. It could be hard as the PSP is short on RAM, and we can't cache that much code, but work can still be done. We can optimize the emulator for newer compilers, as we don't actually use the latest toolchain. That introduces new issues, new bugs, new crashes, but that could in theory get us some extra performance.
We could try to use HLE, or higher-level emulation, for the ARM7, which is the second CPU on the DS. As opposed to lower-level emulation, which would be emulating the chip more directly, this would only emulate the necessary commands and functions that the DS would have used. This would save us some performance, but again, it's a whole different beast right now. And the final thing is PSP-1000 support, which is support for the 32-megabyte RAM models. As I mentioned, we're already short on RAM, but we might get there one day. So the second half of this presentation is about the Dreamcast. And this is a much bigger challenge, generally, than the DS. Let's have a little comparison here. So the Dreamcast is a few years older than both of these systems, but its CPU is a Hitachi SH-4 at 200 megahertz. Moreover, the GPU is actually a bit slower, but has more memory than we actually have. We're looking at 16 megabytes of RAM and 2 megabytes of sound memory as well, compared to the DS's 4. As you can see, we're kind of up against it here. If we go by the rule of thumb, emulation requires several times more power than the original system. This could get difficult, but we'll see where this goes, at least as a proof of concept. So, the first port, for a bit of backstory. In 2008, drk||Raziel, who's now known as skmp, ported his emulator, nullDC, to the PSP as a little proof of concept, basically for a bit of fun, and it booted commercial games. This here is the Dreamcast BIOS running on the PSP, and it's pretty much feature complete. It's slow, but it works, and that is a big start. But there are some glitches and issues. So I have here some footage from the original emulator. Now bear in mind, this build, the binary and the code, was lost for a long time, so this is drawn from YouTube from the time, so I apologize in advance for the quality. This is footage of Shenmue from the Dreamcast running on the PSP in 2008. It's a little problematic, shall we say.
But the fact that this is booting at all is a big step. And we're looking maybe at about 20% speed here. This was a big-budget title from around the year 1999, so the fact that we are emulating this at all is quite a big achievement. I think we peak at around 25% speed here, but there are clearly issues with hardware culling, transform, texturing; pretty much the whole boat of graphical issues is on display here. Crazy Taxi, we thought, would fare better, being an arcade game, but this is more interesting. It is so slow that it practically doesn't appear as a video here, right? We're looking at literally about three frames a second. But this is interesting because we thought Shenmue, being one of the most expensive games of the time, would be more difficult. Is the footage still running? Yes, it is. It's that slow. So we thought Crazy Taxi would be easier to emulate, but as it turns out, it looks like this has some unique quirks for our emulator. So a few years later, the source code actually returns. Our friend skmp finds the source code again somewhere, and he puts it on GitHub around 2017, almost 10 years later. It was added to the PSP Archive, which is a GitHub repository, in 2021 after it was cleaned up and confirmed to compile. This is where we come in. So we began to compile the emulator again and make some adjustments. Games do boot, albeit with some issues, but there is some promise. For example, the game here, I believe it's Power Stone or one of these fighting games, is actually hitting up to 38 frames a second, despite some obvious graphical issues. So it's a pretty good start, I'm sure you'll agree. Return of the King: original developer skmp actually returns to help us out on this project and helps us to plan out how we should use the PSP hardware. Obviously, he has been developing a Dreamcast emulator for many years, so his expertise has been invaluable.
He helps us with parallelization and hardware expertise, and he actually believes that full-speed emulation of the Dreamcast is possible on the PSP, which is definitely not what people would have thought 10 years ago. So work on the JIT begins. It's early, but it's a work in progress. The first full-speed milestone. Big titles are getting better too. Here the game Mr. Driller becomes the first Dreamcast game to run at full speed on the PSP, something we never even thought possible. Though there are of course a few caveats: there seem to be some texture corruptions going on and the performance does not stay above 60, but this is a big milestone for the emulator. Sonic Adventure, on the other hand, is a much more complicated game. It's running at about 25% speed for comparison, but it's a good start. Audio is still too big a performance hit right now to reliably implement, but we might get there. So this brings us to another new chip. I mentioned the Media Engine earlier, but it actually has a chip related to it called the VME, or the Virtual Mobile Engine. This is a reconfigurable chip that was designed for media decoding, but we still don't understand much about it to this very day, almost 20 years after the PSP was released. It's a bit like an FPGA in the sense that we can reconfigure it in software, and it's capable of 5 giga-operations a second, so if we figure out how to use this chip we could see big emulation gains, but at the moment we are still a little short on knowledge, so this is more of an area for the future. In terms of optimization and what can still be done: first of all, audio. Like I mentioned, in theory we could offload this to the VME once we figure it out, but if not, the Media Engine might still do the job. Texture optimizations: textures are currently stored in RAM, meaning that there is a speed penalty in transferring them to video memory and then rendering them.
Once we find out how to move things around more efficiently, we'll find some performance there. Rendering bugs, as you saw earlier, are still there due to some problematic implementations. So here's some footage from 2023 of the latest build, the same scenes recorded more recently. As you can see, Shenmue is still struggling for performance and there are actually new issues altogether. The main character is now running in a completely different direction, but we are seeing much better performance. In fact this hits up to 70% speed, and sure, there are still hardware culling issues on display here, but the fact of the matter is we're looking at nearly full speed already. And finally we have Crazy Taxi here. It's finally looking like a motion video, which is fantastic. It still requires a lot of work, but you know what, considering what you saw earlier, I think we will take it. So the state of play, just to recap: there's a long way to go, but there is a lot of progress. The early JIT implementation gets us some speed at the cost of some instability right now. It's already up to three times as fast as the original build, perhaps more, and we're seeing full speed in some games. 3D graphics are completely hardware accelerated right now, but the implementation is not yet finished. So the future is bright. This is the note I will end on. Despite key areas of the emulator being in early stages, progress is very good. We're seeing full speed in some cases, which is more than anyone may ever have expected. Big-budget games like Shenmue are heading up to 75, and once stability improves and our JIT comes along a bit further, who knows where the future could take us. That's the end of the presentation, but before I go to questions, I'd just like to thank Zero, as he was the main developer on both of these projects; unfortunately he cannot be here today. And additional thanks here to skmp and hlide, who provided us help with our implementations of the JITs on both machines.
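Editor's note: the interpreter-versus-dynarec distinction recapped in the DS half of the talk can be shown with a toy. This is a minimal sketch and not the project's JIT: the three-opcode guest instruction set is invented for illustration, and Python source compiled with the built-ins `compile` and `exec` stands in for the native MIPS code a real PSP dynarec would emit. All names here are hypothetical.

```python
# Toy guest program as (opcode, operand) pairs. The ISA is made up.
PROGRAM = [("addi", 5), ("addi", 7), ("muli", 3), ("halt", 0)]


def interpret(program, acc=0):
    """Interpreter: fetch, decode and execute one instruction at a time."""
    pc = 0
    while True:
        op, arg = program[pc]  # fetch
        pc += 1
        if op == "addi":       # decode and execute, repeated on every run
            acc += arg
        elif op == "muli":
            acc *= arg
        elif op == "halt":
            return acc
        else:
            raise ValueError(f"unknown opcode {op!r}")


class Dynarec:
    """Translate a block of guest code to host code once, then reuse it."""

    def __init__(self, program):
        self.program = program
        self.cache = {}        # guest start address -> translated host function
        self.translations = 0  # how many times the translator actually ran

    def _translate(self, start_pc):
        # Emit one line of host source per guest instruction, up to 'halt'.
        lines = ["def block(acc):"]
        pc = start_pc
        while True:
            op, arg = self.program[pc]
            pc += 1
            if op == "addi":
                lines.append(f"    acc += {arg}")
            elif op == "muli":
                lines.append(f"    acc *= {arg}")
            elif op == "halt":
                lines.append("    return acc")
                break
            else:
                raise ValueError(f"unknown opcode {op!r}")
        namespace = {}
        exec(compile("\n".join(lines), f"<block@{start_pc}>", "exec"), namespace)
        self.translations += 1
        return namespace["block"]

    def run(self, acc=0, start_pc=0):
        block = self.cache.get(start_pc)
        if block is None:  # cache miss: pay the translation cost once
            block = self._translate(start_pc)
            self.cache[start_pc] = block
        return block(acc)  # cache hit: no per-instruction decode at all


if __name__ == "__main__":
    jit = Dynarec(PROGRAM)
    print(interpret(PROGRAM), jit.run(), jit.run(), jit.translations)
```

The cache is also where the RAM pressure mentioned in the talk comes from: every translated block has to be kept somewhere, and a machine with 32 or 64 MB cannot keep many of them.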
So that's the end of that presentation. Do we have any questions? Thank you. So do we have any questions? Yes? Is the dolphin in the next project? Is the Dolphin emulator our next project? Yeah, in the next project. One more time? The dolphin in your slide. Oh. We will see. That was actually from a Dreamcast game called Ecco the Dolphin, which is why I used it as an image there, but we'll see, maybe I'll use the dolphin next time. Any other questions? Yes? In this new version of DeSmuME, what is the impact on the battery of the PSP? This is a good question. Okay, so originally the PSP, oh sorry, so in the modern version of DeSmuME, what is the battery impact, is the question. So this is a very good question because battery was always a concern for the PSP, especially at launch. In fact, Sony limited the PSP to 222 megahertz at launch to prevent battery drain. The battery usage is actually the same compared to the original build because we've been running at 333 megahertz the entire time. So in the end, there is little to no change in battery consumption. Anything else? Yes? Okay, good question. So this is about underclocking the emulated CPU. So if we say here that the DS's CPU is running at 66 megahertz, I think that is what I said earlier anyway, then we could potentially save some performance by emulating the CPU at a lower clock speed. So we could emulate, say, a 50 megahertz DS. This might come at the cost of some in-game performance but would lead to some more stability and some more performance consistency, if that makes sense. Or some games that do not fully utilize the DS might actually just see some free performance altogether. Does that answer your question? Thank you very much. Yes? You mentioned one possible optimization to skip code that's basically not, you know, only when the, only emulate the essential parts. Yes, do you mean higher-level emulation? Yes.
So being a little bit like, first of all, is that static, like you just know beforehand which command or which code you can skip? Exactly, like how much improvement can you get from that? Okay, so the question is about HLE, so emulating only the necessary functions on the DS's second CPU. To be honest, I'm still a little bit in the dark on the specifics of HLE, but I will say this. The idea is, instead of emulating the chip in a traditional low-level sense, we know that this chip can only use a handful of functions, right? So this chip was dedicated just to a few sound or auxiliary functions. So we try to emulate these, how do I put this? We know that there are only a few functions to emulate, so we focus on those, if that makes sense. That is the concept of higher-level emulation, although I unfortunately can't answer all the details about it right now. Thanks for the question. Yes? I'm just wondering if you can say a little bit more about the JIT. Like, is it very naive? Is it just getting rid of the interpreter overhead, or does it actually do some sort of register allocation or local optimizations when it translates? So is your question about the JIT for the DS or the Dreamcast, or both? Both, okay. So like I said, Zero was the main driving force on both JITs, and he had some help from outside, so I'm not fully aware of the current state, but I can say that from what he's told me, the DS JIT right now is complete, as in all the opcodes are implemented. But the implementation is naive, so it's quite inefficient right now. He didn't quite elaborate on what parts were inefficient, but that was the gist. As for Dreamcast, it's actually not fully implemented yet, but he says it's a more efficient implementation with respect to the PSP's hardware. Does that answer your question? Thank you very much. Yes? What's the compiler toolchain support for PSP? Can you just use upstream GCC or back? Yes. So, sorry, the question is about GCC and modern compiler tools.
Yes, we actually do keep the PSP quite up to date in terms of the toolchain. We tend to use GCC 11, if I recall, or quite a modern version of these upstream compilers, but this one is actually built on a toolchain that's a few years old because the newer ones introduced some instability. But to answer your question, yes, it's mostly upstream compilers. Anything else? No? I think that's good. Oh. What's the question? So, you showed some triple-A games and some smaller games. Had you tried homebrew or custom-built stuff, maybe even unit tests? You were having a lot of sound issues, so just to narrow that part of the emulation down with custom-built software. Had you thought of that, or is it all just kind of pre-built, whatever you get? Okay, so your question is, have we tried homebrew programs on either of these emulators? Well, actually, there were one or two homebrew pieces featured here. You might remember there was a dice demo displayed for the hardware acceleration. We used that as a simple way to test whether the 3D rendering worked, but the problem with homebrew is that often this uses code that's, shall we say, a little bit unofficial, and this can cause whole new issues that wouldn't really apply to commercial games. So we try to avoid it, but we do use it sometimes to test particular things. Does that answer your question? Thank you very much. Can I ask a question? Yes, go on. How do you debug this? How do we debug this? On the PSP? Yeah. Okay, so how do we debug? This is a pretty good question. So with the PSP, we can plug it straight into the computer, obviously, using a USB connection. There is a tool, what is it called? I forget the actual name of the tool now, that's how long it's been, but basically you can send the PRX, which is the binary, to the PSP, and it is still connected to the computer, so you can log what's going on in memory on the computer side.
Okay, but no breakpoints. No, no, no, nothing too fancy that I know of, anyway. If someone does know more about that, and I'm not using it, then that's great. Last one? Yep. All right. Thank you very much. Okay. Thank you very much. Thank you. Thank you very much. |
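Editor's note on the underclocking answer from the Q&A: the saving comes down to cycle budgeting. The sketch below uses the figures quoted in the talk (a 66 MHz DS CPU, a hypothetical 50 MHz underclock, and 60 frames a second); the function names are invented, and it assumes the emulator's cost scales with the number of guest cycles it has to emulate per frame, which is a simplification.

```python
def cycles_per_frame(clock_hz: int, frames_per_second: int = 60) -> int:
    """Guest CPU cycles the emulator must get through for each video frame."""
    return clock_hz // frames_per_second


def underclock_saving(full_hz: int, reduced_hz: int, frames_per_second: int = 60) -> float:
    """Fraction of per-frame guest cycles skipped by emulating a slower clock."""
    full = cycles_per_frame(full_hz, frames_per_second)
    reduced = cycles_per_frame(reduced_hz, frames_per_second)
    return 1 - reduced / full


if __name__ == "__main__":
    print(cycles_per_frame(66_000_000))                      # cycles at the quoted clock
    print(cycles_per_frame(50_000_000))                      # cycles when underclocked
    print(round(underclock_saving(66_000_000, 50_000_000), 3))
```

With these numbers the emulated CPU has roughly a quarter fewer cycles to get through per frame. A game that needed those cycles will slow down in-game, while a game that was idling through them gets the speed-up for free, as the speaker describes.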
An introduction into AMD/Xilinx libsystemctlm-soc |
All right. We are ready. We fixed it. We broke it again and we fixed it again, or I didn't do anything. The green shirts did it. All right, so next we have Francisco Iglesias. Now we're going to start building. Yes, yes. And to me at least, I'm just going to rant, but to me it's always interesting to see how emulation is used in the enterprise, in the, you know, people's money world. Or not, or not. Let's see how it goes. All right. Okay. Hi everybody and welcome to this presentation. My name is Francisco Iglesias. I work at AMD with QEMU development and SystemC development. Okay. So I'll try. I have a little throat problem, but okay. So today I will be speaking about our open-source co-simulation solution, and the agenda of the talk is as follows. First, I will give a short introduction to what co-simulation is. Thereafter, I will be speaking a little about the AMD/Xilinx QEMU itself and proceed with introducing libsystemctlm-soc and its sister repository, systemctlm-cosim-demo. And lastly, I will show a short demo where QEMU is co-simulating with a couple of RTL memories using the libsystemctlm-soc infrastructure. Sorry, can you speak up a little bit more? Even more. Yeah. So in this slide, I tried to capture one of the trade-offs that is made when you choose a simulation technique for your RTL, and it is the trade-off between speed and design visibility. And we see that the three techniques that are used for RTL development, RTL simulation, hardware emulation and FPGA prototyping, all come with a different cost in simulation speed. And on the left side here, we also have the virtual platforms that are fast and great for software development, but they do not help with pure RTL debugging or development.
So an approach that can be used here to try to get the best of the two worlds is to place the portion of the RTL that is of interest on one of the RTL simulation techniques and then keep the rest of the system modeled in one of the virtual platforms. This way you keep most of the system simulated at quite a fast speed while still keeping visibility into the portion of RTL that is in focus. So this is what we mean by co-simulation: you are mixing these two worlds. In our open-source co-simulation solution, we have the Xilinx QEMU, where we model the processing systems of the FPGAs, and then we have SystemC, which we use for modeling the programmable logic. And libsystemctlm-soc has bridges that allow us to connect the SystemC models to RTL and also to FPGA prototypes and hardware emulators. I will be speaking more about the bridges shortly. But first, a little about the AMD/Xilinx QEMU fork. So this is where we have our improved support and modeling for the Xilinx platforms. Today it is based on mainline QEMU version 7.1.0, and we upgrade it around once a year to a more recent mainline version. The AMD/Xilinx QEMU also has some extra functionality. One of these is that it can create machines through a hardware DTB. This allows us to have a more flexible machine creation and modification process. The AMD/Xilinx QEMU also has an implementation of the remote port protocol. This is the protocol that is used both when we co-simulate different QEMU architectures and when we co-simulate with SystemC. This is an overview of this, where we see an AArch64 QEMU co-simulating with a MicroBlaze QEMU and also with a SystemC application on the side. Continuing with libsystemctlm-soc. This is a project that was started by Edgar Iglesias in 2016, and the license is MIT.
One of the core features is that it has the remote port protocol implementation in SystemC, which is then used for connecting and co-simulating with QEMU. Going together with this, it also has what we call SystemC wrappers. These are wrappers for our Zynq, ZynqMP, Versal and Versal Net. The short description of a wrapper is that it wraps QEMU into a SystemC module, so that for the rest of the SystemC application, the interaction between the other modules and QEMU is done through the standard SystemC interfaces such as TLM, signals, etc. The library also has TLM bridges to AXI4, AXI3, AXI4-Lite, APB, ACE, ACE-Lite, CHI, CXS, PCIe TLP and XGMII. A bridge converts the communication from the TLM side into the protocol-specific side. So here's an example of the TLM-to-AXI bridge, which translates TLM into AXI. These bridges are what allow us to co-simulate, for example, in this case an AXI DUT that has been generated from RTL. We see here that the SystemC wrapper communicates through TLM with the bridge, which then converts this TLM to AXI signaling and communicates through this AXI signaling with the AXI DUT. This is how QEMU, on the left-hand side, can access the DUT. There are also RTL bridges in the library for AXI4, AXI3, AXI4-Lite, ACE, CHI and CXS. The RTL bridges have two components. The first one is the bridge itself, which is placed on the FPGA or in a hardware emulator. The other component is the driver of the bridge, which is placed on the SystemC application, software, side. The way it goes is that a TLM transaction enters the driver, which then configures the RTL bridge to replicate this transaction as an AXI transaction, for example, inside the FPGA or the hardware emulator. And this is an example of when these bridges are used with an Alveo U250 card, where between the bridge driver and the bridge we have some infrastructure: VFIO, PCIe and XDMA.
And one can see these components as a transport channel through which the driver accesses go towards the RTL bridge. How it looks inside a hardware emulator is very similar, but instead of PCIe, the vendor bridges are used for this transport. In the library we also have protocol checkers for AXI4, AXI3, AXI4-Lite, ACE and CHI. The protocol checkers are connected to the signals; they monitor the signals and try to find violations of the protocols. Also in the library we have modules that can be used for generating ACE traffic. We have ACE and ACE-Lite masters and an ACE interconnect. The masters here generate ACE transactions towards the interconnect, and the interconnect will, when required, snoop the other masters and otherwise forward the transaction to the TLM memory at the bottom. We have a similar setup for CHI, where we have request nodes that generate CHI traffic and a CHI interconnect that does snooping when required or forwards the request to a slave node at the bottom. Also in the library we have a tool called PySimGen that can generate simulations from IP-XACT descriptions. And there's a basic TLM traffic generator that one can configure to generate randomized traffic, or to which one can provide a description of the transactions to issue. There are some simple co-simulation examples that one can have a look at as a starting point. There's a lot of documentation for all the components, and we also have an extensive test suite. The systemctlm-cosim-demo is also a project that was started by Edgar Iglesias in 2016, and the license is MIT. It contains several QEMU co-simulation demos, where we co-simulate the ZynqMP QEMU and Versal QEMU with PL models on the SystemC side, and there's also a RISC-V demo where a RISC-V QEMU is co-simulating with an open source Ethernet controller core on the SystemC side.
We have several x86 QEMU demos that co-simulate with PCIe endpoint models on the SystemC side. And there is also a PySimGen demo where the SystemC side of the co-simulation has been completely generated from IP-XACT. These demos demonstrate how to embed the libsystemctlm-soc library in your own project and how to use it. For the demo that I'll show now, I will be launching a Linux system on the ZynqMP QEMU, and it will be co-simulating with a SystemC application that includes a couple of RTL memories. One of the RTL memories has an AXI4 interface and the second one has an AXI4-Lite interface. On the AXI4-Lite signals there's a protocol checker connected, and I also modified the AXI4-Lite memory and injected an error so that we can see that the protocol checker finds it. So let's see then. We see here that the left terminal is where QEMU is being launched, and the yellow terminal on the top is where the SystemC application has been launched. We will start by doing some accesses to the AXI4 memory; here come the accesses to the AXI4 memory. Thereafter we will do an access towards the AXI4-Lite memory that has an error in it. And here we see that the protocol checker found the error and output a descriptive message. After the simulation you get a trace that we can inspect, and here we can follow the AXI signals and look at the transactions just issued. We see that it is the expected data in here, and you can see at the bottom that this is the data that we were writing to the memory. The protocol checker's error is also connected to a signal in this case, so the transaction that failed can be found where this signal has been asserted. This is seen at the bottom here, where there is the asserted signal, and then we can look into the transaction here and find the problem. And that is all I have today. Thank you for listening.
That's a dumb question, which I'm known for. No, so because, like I said at the beginning, I'm very interested in how this works in enterprises, I'm curious: how do you guys decide on a feature to be implemented? How do you plan that kind of stuff? Do you know how that works in the community, or if you're in your basement? Do you mean in QEMU or in SystemC, or overall? So it ends up with me. Yes. So how do we decide the features that we implement? It's actually the demand that drives this. If we see that some team internally at AMD Xilinx needs a feature in QEMU, then we implement it. Or if we see that there's a feature that might be useful going forward, not right now but perhaps in a year or so, then we will consider implementing it too. And often it ends up that our demands are pretty similar to all other developers' demands. So if we implement a feature, it often becomes useful for others as well, not only for AMD Xilinx. A small follow-up. You guys probably do Agile like the rest of the world. I'm curious how you guys refine a story like this in Agile. And I'm very sorry. Okay, how do we use Agile development in this? I don't care about Agile. I really care about the refinements. I don't like Agile, actually. Like, how do you guys brainstorm together on a feature? What do you put on paper? Like, it needs to be this, but how do we do this? Because it's not always comparable to something that already exists with emulators. It's usually something that's never been done before. I'm really sorry about this question. No, it's a very good question, and I have to admit that I'm not sure if we have such a process as you're probably looking for here. We get a request in our group and we implement it. We need this feature from, for example, one of the RTL groups: they need a feature, they ask us, and we implement it.
So we don't really have a process where we do this in a very Agile way in that sense. This is our team. It might be different in other teams at AMD. So Chris, how do you get the SystemC model from Verilog? And does that also work for Coregen-generated IP, which might be encrypted? So how do we get the SystemC model from Verilog? There's an open source tool named Verilator that will take the Verilog and create the module for you. But it's not going to work for the Coregen-generated IP, which is encrypted and which Verilator cannot process. For that I'm not sure how to do it. Sorry for that. There is no free line. I can't really speak on that, because I have to admit that I'm mostly on the QEMU development side. But if you ping me afterwards I can take your card and see if I can give you a correct contact or something. Is there something for VHDL as well? I think there are tools that do this. Whether there is a tool that automatically generates a SystemC model from VHDL? There are tools, apparently. I'm pretty sure there are too. But we have not used them. Are you limiting yourself to the synthesizable subset of SystemC, or do you not care? No, we don't limit ourselves to synthesizable SystemC. I'm coming from the world of open source software-defined radio. I have flow graphs where I have data processing blocks that are running in software on an MPSoC's AArch64 core. What I want to do is take a block, implement it in some RTL and get it to run on the FPGA part. How does that work? I have some part of the software that I want to be accelerated by an FPGA accelerator. With these tools, you mean? Yes. In that case you could... Yes, how... Hardware acceleration of a software implementation. How do I go from the software implementation to the hardware implementation? I know how to write. Yes, yes. I have to admit that I myself am not an expert hardware engineer. I think that the way I would have done it is just to go ahead and write the Verilog code.
With this tool it's very sweet, because you can connect it to the QEMU system, just as a library, and say, okay, I have this AXI stream. Yes. Put it in there, and I call C functions in the end, right? You can launch your real software in QEMU, and it interacts with it. How do I exchange data with the library? What are the interfaces? I see internally it's here and it's called SystemC, right? Yes, yes. You don't have to choose that. But what's on the surface? How do I get data in and out? How do you get data in and out of the simulators? Yeah. Perhaps I would have needed a better overview picture, but... how you get data into your SystemC application... We don't have any magic frills, but... So the remote port protocol is just a protocol that transfers transactions from QEMU into the SystemC side or to another QEMU. So it's not really something that will allow you to load a bunch of data into the SystemC application. But... Any more questions? Did I answer your question? Yes, I think afterwards, and I can... Okay, we don't have time. Thank you very much. Thank you. |
Emulator development in Java |
So, my name is Neil Coffey. I'm a Java developer. Of course I'm a Java developer, with that surname. And so, this is a talk about a little side project that I started a couple of years ago. I was just keen to see how far I would get with developing an emulator in Java. This is the first emulator that I've developed from scratch. And it kind of started, you know, I had a bit of time, we'd had a lockdown, and I thought, well, what do I need to write an emulator? One of the things I might try and do to start with is get a ROM reader, to start from scratch. And then, I don't know if you've heard, but my country left the EU a couple of years ago, and I actually found it hard to source the ROM reader from Germany. So, the first thing I did was build my own, obviously. And by the time I'd done this, I was kind of committed at that point. Okay. So, what I'm going to go through, then, is my experiences of writing an emulator, as I say, the first time I've ever written one: decisions, challenges. It's going to be a little bit of a tour through some of the APIs that there are now in the Java platform, this kind of thing. And in all honesty, there are some pros and cons that I'll talk about. Yeah. And above all, some little tricks in the APIs that aren't always very well documented and that can help us. So, why Java? I'll be completely honest. The main reason for me was that it's the language I'm most familiar with. Yeah. I've been using Java now for about 20 years; the first JRE that I used came on floppy disk. Okay. So, that's how long. Java is obviously cross-platform. And these days, it's got quite a rich set of APIs, hopefully everything we need to develop an emulator. It's got good longevity.
So, you tend not to have this thing in Java that you sometimes get in Swift, for example, where you come in one morning, try to recompile your code and find it won't compile anymore because Apple has changed something. Java tends not to have that. It has maintained good backwards compatibility over the years. And so, hopefully, anything I write will also run moving forward. I won't have to have an emulator in a few years' time to emulate the emulator. Okay. As well, from a personal view, there are some APIs coming up for which I was keen to have a benchmark, to see, well, in a couple of years' time, you know, things like the Foreign Function and Memory API that's just about to hit stability. I was interested to see what I will be able to do with that when it comes out. Okay. So, I set myself some goals. I wanted my emulator to be accurate enough to allow most software to run on it. In all honesty, for version one of my first emulator, there were some things that I decided not to emulate, such as memory contention issues. There are some weird things that you can get on the Spectrum, which I'll maybe have time to talk about, with glitches in the video display. So, essentially, my overall goal was: anything that software uses that isn't a bug in the hardware that people might accidentally get around or use, I'll try and emulate that. As Roddy mentioned, one thing I was trying to do is get a baseline from the basic Java APIs and not bring in additional libraries as a starting point. I wanted it to be a cooperative application, not necessarily just full screen, and performant enough; yeah, as I say, I'm not trying to write a one gigahertz Z80 for my first project. Which machines do I try to emulate? I went for the trusty old ZX Spectrum. So, apologies to Steve, I'm adding to the pile of emulators now available for the ZX Spectrum.
And I also thought of the Sega Master System. So why these two together? A, these are the machines I had as a kid, okay? But B, if we look at the technical specs, there are actually some similarities that are going to help us. You can see the video resolution is similar, although the video chips and formats that they use are very different. The CPU is essentially a Z80 at around 3.5 megahertz. Actually, there are different models of the Spectrum with slightly different speeds, and I think it was 3.58 for the Master System. Probably everybody in this room is familiar with these machines, but for those who aren't: you can see that the Sinclair Spectrum, in comparison, was all about saving money. You had one custom ULA here that was handling the video and the sound and was also the memory controller, compared to the Master System, which had a bit more on-board hardware that we're going to have to try and emulate. So, just a little bit more detail on some of the difficulties, again for people who may not be familiar. The ZX Spectrum renders its video all from RAM, essentially with no acceleration as such. And it's got this format that really gives the ZX Spectrum its look and feel. Yes, you had essentially a one bit per pixel bitmap, and then over the top of that, you're allowed two colors, essentially, per 8 by 8 cell. Yeah, and this gave the Spectrum a bit of a unique look and feel, with bright and flash as well per cell. Compare that to the Master System, where you've got an actual dedicated graphics chip, but this was all tile based. Yeah, so you have a 32 by 24 tile display. Each tile is 8 by 8 pixels. The eagle-eyed amongst you will notice that you can't actually define enough unique tiles to make each pixel in that display a unique pixel.
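As a rough illustration of the Spectrum format just described (one bit per pixel, plus one attribute byte per 8 by 8 cell), here is a minimal Java sketch. The class and method names are made up for this example and are not from the speaker's code; the bit layout (ink in bits 0-2, paper in bits 3-5, bright in bit 6, flash in bit 7) is the standard Spectrum attribute layout, and the non-bright level `0xD7` is only an assumption, since emulators differ on the exact value.

```java
/** Hypothetical helper: decodes one ZX Spectrum attribute byte (one per 8x8 cell). */
final class SpectrumAttribute {
    final int ink;        // bits 0-2: colour used where the bitmap bit is 1
    final int paper;      // bits 3-5: colour used where the bitmap bit is 0
    final boolean bright; // bit 6: both colours use the bright palette
    final boolean flash;  // bit 7: ink and paper swap periodically

    SpectrumAttribute(int attr) {
        ink = attr & 0x07;
        paper = (attr >> 3) & 0x07;
        bright = (attr & 0x40) != 0;
        flash = (attr & 0x80) != 0;
    }

    /** Colour index for pixel column x (0 = leftmost) of one bitmap byte. */
    int colourFor(int bitmapByte, int x, boolean flashPhase) {
        boolean set = ((bitmapByte >> (7 - x)) & 1) != 0;
        if (flash && flashPhase) {
            set = !set; // during the flash phase ink and paper are swapped
        }
        return set ? ink : paper;
    }

    /** Colour index bits are green (bit 2), red (bit 1), blue (bit 0). */
    int toRgb(int index) {
        int level = bright ? 0xFF : 0xD7; // 0xD7 is an assumed non-bright level
        int r = (index & 2) != 0 ? level : 0;
        int g = (index & 4) != 0 ? level : 0;
        int b = (index & 1) != 0 ? level : 0;
        return (r << 16) | (g << 8) | b;
    }
}
```

With this, a whole cell is described by eight bitmap bytes and a single attribute byte, which is where the characteristic two-colours-per-cell look comes from.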
So for anything that looks like it does, you'll see you get these almost like little manga cards in some games. Or here, where we've tried to fill the screen, secretly, around the edges, we've actually got blank space. So there wasn't actually enough memory to have unique tiles for every space on the screen. But despite that, it did have features that were actively advocated by Sega to its developers to make the most possible use of the video chip. The way it worked, you have a series of registers to control things like the scrolling and the colors. And there was a mechanism, via an interrupt, by which on each scan line, or on every nth scan line, depending on how you programmed it, you could actually change those registers. Yes, you could change the scroll position at different parts of the screen. You could switch off the screen. You could potentially change the color palette. And so that's something where, when we're doing our video rendering, we're going to have to have a little think about how we can optimize that a little bit. I'll just give a very quick example. We're going to see here that we've got some parallax scrolling, where you see how on different scan lines we're setting a different X position. That's quite a nice effect; that's a game called Choplifter. In the next example, we're going to have a case where it's not literally turning off the screen, but it's changing the base address of the screen memory to effectively turn it off at that bottom part. And this is probably the most extreme example here, where on literally every other scan line we're changing the scroll position to give that effect there. So very briefly, I'll just give a little bit of the overall organization of the emulator, which is kind of the first thing you really need to think about.
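One way an emulator might account for such mid-frame register changes, sketched here in Java under assumptions, is to record the scan line at which each change happened and then split the frame into sections that each have a single scroll value. The class `FrameSections` and its methods are hypothetical names for illustration only, and only the horizontal scroll register is tracked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: splits a frame into runs of scan lines sharing one scroll value. */
final class FrameSections {
    /** Scan lines firstLine..lastLine (inclusive) rendered with one scrollX. */
    static final class Section {
        final int firstLine;
        final int lastLine;
        final int scrollX;

        Section(int firstLine, int lastLine, int scrollX) {
            this.firstLine = firstLine;
            this.lastLine = lastLine;
            this.scrollX = scrollX;
        }
    }

    // Scan line -> scroll value that takes effect from that line onwards.
    private final TreeMap<Integer, Integer> changes = new TreeMap<>();
    private final int lineCount;

    FrameSections(int lineCount, int initialScrollX) {
        this.lineCount = lineCount;
        changes.put(0, initialScrollX);
    }

    /** Called by the emulated CPU side when the scroll register is written. */
    void scrollChangedAt(int line, int scrollX) {
        changes.put(line, scrollX);
    }

    /** Sections in top-to-bottom order; a frame with no changes yields one section. */
    List<Section> sections() {
        List<Section> out = new ArrayList<>();
        Map.Entry<Integer, Integer> current = changes.firstEntry();
        while (current != null) {
            Map.Entry<Integer, Integer> next = changes.higherEntry(current.getKey());
            int last = (next == null ? lineCount : next.getKey()) - 1;
            out.add(new Section(current.getKey(), last, current.getValue()));
            current = next;
        }
        return out;
    }
}
```

A game that never touches the registers mid-frame produces a single section covering the whole frame, so the common case stays cheap, while a parallax effect produces one section per band.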
So it's how we turn this into code. This is very high level, obviously, but this is essentially what the hardware looks like: we've got an address bus at the top with the ROM and the RAM connected, and we've got a data bus at the bottom with any peripherals, which on the Spectrum were fairly minimal; there was a 128 version with the sound chip. And then on the Master System, you can see again a similar idea, but notice that the ROM essentially is the cartridge that you plug in. Yes, when you plug a cartridge in, you're directly communicating with the Z80, and any logic for things like memory paging you can have on the cartridge. And then there are a few more peripherals on the data bus: we've got the video processor there, the programmable sound generator, there's an FM unit, which I'll touch on briefly, and the controllers. And there's the clock there as well. What I try to do is to abstract that down, so that I'm going to organize the program this way. We've got the Z80 implementation, which is obviously a fairly fundamental part. But what I've actually done in my implementation is separate out the Z80 decoder from the actual instruction loop. This is quite nice if we want to add a debugger as well, because then you can go through the same code to decode the instructions for the debugger. And then we've got an abstract IO bus, from which again we'll derive our Master System IO bus, the Spectrum IO bus, etc. Memory is a similar idea: we have subclasses of these overall base classes. And the clock, which is actually working the other way round to the way the hardware does: the clock is effectively going to be a kind of brake on the CPU thread and is going to tell it when to pause to keep things at the right rate of instructions.
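The idea of the clock acting as a brake on the CPU thread can be sketched in Java as follows. This is a minimal sketch, not the speaker's implementation: `EmulatorClock`, `nanosToPause` and `throttle` are hypothetical names, and the approach (compare emulated time against real time and park the thread for the difference) is an assumption about one reasonable way to do it.

```java
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch of a clock that slows the CPU thread to a target cycle rate. */
final class EmulatorClock {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private final long cyclesPerSecond;
    private final long startNanos;

    EmulatorClock(long cyclesPerSecond, long startNanos) {
        this.cyclesPerSecond = cyclesPerSecond;
        this.startNanos = startNanos;
    }

    /**
     * How long the CPU thread should pause so that emulated time does not
     * run ahead of real time. Returns 0 if the emulation is already behind.
     */
    long nanosToPause(long cyclesExecuted, long nowNanos) {
        // Split into whole seconds and a remainder to avoid overflowing a long.
        long wholeSeconds = cyclesExecuted / cyclesPerSecond;
        long remainder = cyclesExecuted % cyclesPerSecond;
        long emulatedNanos = wholeSeconds * NANOS_PER_SECOND
                + remainder * NANOS_PER_SECOND / cyclesPerSecond;
        long realNanos = nowNanos - startNanos;
        return Math.max(0L, emulatedNanos - realNanos);
    }

    /** Called periodically from the CPU thread with its running cycle count. */
    void throttle(long cyclesExecuted) {
        long pause = nanosToPause(cyclesExecuted, System.nanoTime());
        if (pause > 0) {
            LockSupport.parkNanos(pause);
        }
    }
}
```

A CPU loop would construct the clock with `System.nanoTime()` as the start time and call `throttle` every few thousand cycles rather than after every instruction, so that the cost of checking the time stays small.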
And there'll be a little bit of feedback as well from the video thread, so that it can interact with the CPU to do the things I've just mentioned about accurately timing the scroll registers and things. So, just as an example, I end up with interfaces like this, and then the Z80 effectively doesn't care whether it's a Master System or a Spectrum it's communicating with; it just goes through these abstract interfaces. A little bit of detail on what I've just mentioned about the CPU. The implementation that I went for isn't necessarily the most popular one among traditional emulators: I tried to really break down the instruction set into more of an object-oriented form. So I've got instruction types, which you'll see there, and then for each type the individual instruction is returned as an object that says, well, it's this type, and it's from this source, to this destination. So I've tried not to have to write 900 different routines for all the various combinations that the Z80 had, and that gives quite nice code. There's a little bit of a performance trade-off, obviously, but it turns out not to be too bad. Okay, and then the other decision I made was, well, we're writing in Java in 2023 now, so I decided I want to make the most of multi-threading. So various of the components I've just mentioned will actually sit in their own thread. Okay, and that's kind of nice organizationally, and also, in terms of monitoring the performance of the app, it means we can break down a little more easily what resources are being used for each component. So, just to give a little bit of an overview of this. Well, does this work? Yeah, is that good? So we've got the CPU thread at the top there, which is going to be interacting with the clock and is periodically going to say, you know, I've done this many instruction cycles. How am I doing?
Do I need to pause to maintain the correct instruction rate? Then we've got the video controller, which is going to be periodically sending V-blank interrupts, every frame, to the CPU to notify it. We've then also got a separate rendering thread, which is going to do any of the heavy-lifting rendering that we need to do: anything like scaling, or calculating what the actual pixels are. And then the idea is that here, in the event dispatch thread, which is single-threaded, at that point we have to have our ducks in a row and know what we're actually going to render. Then an additional complication is that there's going to be an audio service in its own thread as well. So, the different APIs that we're going to use. There's the standard Java Swing API, so there are no additional OpenGL plug-ins here. There's Java Sound. I mentioned monitoring. NIO is kind of a little hidden one: when we're emulating cartridge saves and we want to write data, we actually open a mapped file to save the data. And threading is obviously important. I'm not going to really mention them too much, but there are also desktop and taskbar integration APIs that help with integrating into the desktop, with the system menus and things. So we'll start with the graphics. With the standard Swing and Java 2D APIs, which people may be familiar with, the idea is that you override the JComponent class and you implement a paintComponent method, and here, in principle, we can set various options to hint whether we want quality, speed, etc., and then finally we can render an image and it will be rendered with these different settings. But there are some caveats with that. Unfortunately it turns out that some of those options effectively end up turning off GPU acceleration, and they can be quite CPU hungry and inefficient.
It's not clearly documented which ones actually run on the CPU and which on the GPU, but it effectively ends up that the fast options, without any quality interpolation, are the ones that just go straight to the GPU. So we're going to have to be a little bit careful not to use too much CPU time for each frame render. And then there's also an additional problem: the standard API to set and get pixels on buffered images is actually quite inefficient for setting individual pixels. But we have a workaround. So this would be the standard API that we'd use. We create our image like this, lovely; there are about 15 different types that we could use, and then we can call setRGB, and whether that backing store is an int per pixel or bytes per pixel or whatever, it will work out how to set the RGB. Lovely. But in practice we're probably never going to have anything other than an int per pixel, so this is the least efficient way we could possibly imagine to set the pixel data. Luckily, with a little bit of jiggery-pokery, we can actually ask Java 2D for the underlying int array, and then we can just write directly to that. The advantage is that things like Arrays.fill and array copy then become available. There's a caveat that you normally wouldn't do this, because if you've got static images that you're rendering lots of times, what would normally happen is that Java 2D sends the image to the GPU once, and then subsequent renders are effectively free. But we don't really need that for our purposes. We're going to be rendering a different image on each frame, effectively, so that's not such a problem for us. So then, just to come back to what I was showing you earlier, with the different scroll per frame on different raster lines. We want to get the best of both worlds in how we then end up structuring things.
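The trick of asking Java 2D for the underlying int array can be sketched like this. The class name `FrameBuffer` and its methods are made up for this example; the calls used (`BufferedImage.getRaster()`, `WritableRaster.getDataBuffer()` and `DataBufferInt.getData()`) are standard Java 2D API, and the cast to `DataBufferInt` relies on the image having been created as `TYPE_INT_RGB`.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;
import java.util.Arrays;

/** Hypothetical sketch: a frame buffer that writes straight into the image's int array. */
final class FrameBuffer {
    final BufferedImage image;
    final int[] pixels; // live view of the image's backing store, one int per pixel
    final int width;

    FrameBuffer(int width, int height) {
        this.width = width;
        image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        // For TYPE_INT_RGB the raster is backed by a DataBufferInt, so the cast is safe.
        pixels = ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
    }

    /** Fills the whole frame with one colour using a bulk array operation. */
    void clear(int rgb) {
        Arrays.fill(pixels, rgb);
    }

    /** Sets one pixel without going through setRGB and its colour-model conversion. */
    void plot(int x, int y, int rgb) {
        pixels[y * width + x] = rgb;
    }
}
```

Writes to `pixels` are visible through `image` immediately, so the render thread can fill the array and the paint code can draw `image` as usual. As noted in the talk, obtaining the array this way gives up Java 2D's caching of the image on the GPU, which matters little when every frame is different.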
So what I do is basically break down the image and say, well, for this frame, where are the points where things like the scroll registers actually change? Some games will just have one setting per frame, and I can then efficiently render the entire frame without having to worry about clips per section, etc. So I don't literally go through pixel by pixel, chasing the beam. There's just a brief example here: I split the frame into sections, and then I can say, for that section, get me the relevant settings, and then go through and fetch from the tile map data and render it almost as you'd expect. So doing that, and using this trick of getting the raw int array, does allow us to get quite a good speed-up on the rendering. If there's one speed-up to think about when you're doing this in Java, it's probably this. So having known about that trick, there are some little tricks that we can do. Obviously people are familiar with CRTs: the way these systems worked, they kind of rendered every other scan line, and if you had a really good quality monitor it looked a little like this. Most people's monitors were a little bit more like this, where you had bleed in between the scan lines, and you also got ghosting effects, this kind of thing. So we can try and give a little bit of that look and feel. What I'm literally going to do here in the Java is, on every other scan line, render a darkened version of the scan line above, so I can produce something like this. You then just have to be a little bit careful with the scaling, because you can get moiré effects if you've got an odd scale factor, so I do a little bit of extra interpolation to try and get around that.
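The every-other-scan-line darkening can be sketched on a raw int array as follows. This is only an illustration under assumptions: `ScanlineEffect` is a hypothetical name, the output is simply twice the source height, and the darkening is a plain percentage per colour channel; the extra interpolation the speaker mentions for odd scale factors is not shown.

```java
/** Hypothetical sketch: doubles the height, following each line with a darkened copy. */
final class ScanlineEffect {
    /**
     * @param src     packed 0xRRGGBB pixels, width * height entries
     * @param percent brightness of the darkened lines, 0..100
     * @return a buffer of width * (2 * height) pixels
     */
    static int[] apply(int[] src, int width, int height, int percent) {
        int[] dst = new int[width * height * 2];
        for (int y = 0; y < height; y++) {
            int from = y * width;
            int brightRow = 2 * y * width;    // even output line: unchanged copy
            int darkRow = brightRow + width;  // odd output line: darkened copy
            System.arraycopy(src, from, dst, brightRow, width);
            for (int x = 0; x < width; x++) {
                dst[darkRow + x] = darken(src[from + x], percent);
            }
        }
        return dst;
    }

    /** Scales each 8-bit colour channel by percent / 100. */
    static int darken(int rgb, int percent) {
        int r = ((rgb >> 16) & 0xFF) * percent / 100;
        int g = ((rgb >> 8) & 0xFF) * percent / 100;
        int b = (rgb & 0xFF) * percent / 100;
        return (r << 16) | (g << 8) | b;
    }
}
```

Because it works on plain arrays, this can run in the rendering thread on the same int array obtained from the image, with no per-pixel API calls.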
Then another thing that we can do in Java is these ghosting effects: if we can define our effect in terms of a convolution matrix, which you may have seen, then we get a native library built in that will allow us to render that efficiently, and that will also access the integer data under the hood; it won't go through setRGB every time. So we can get effects like this, again at low rendering time, and then this is one of my favorite Spectrum games from childhood, doing something like this, combined with the CRT effect. Another issue we have is that there are multiple ways to scale images in Java, and depending on which one we pick, we get different performance characteristics. So the thing I'm actually looking at, which is the most stable, is to just hard-code the scaling myself, because then I can access the int array directly. Some of these other built-in APIs, unfortunately, go through getRGB and setRGB so that they can support different formats, but we don't really need that.
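Hard-coding the scaling on the raw int array might look like the following sketch. It assumes an integer scale factor and nearest-neighbour scaling, which is only one possible reading of "hard-code the scaling myself"; `IntScaler` is a hypothetical name.

```java
import java.util.Arrays;

/** Hypothetical sketch: nearest-neighbour scaling by an integer factor on int arrays. */
final class IntScaler {
    static int[] scale(int[] src, int width, int height, int factor) {
        int dstWidth = width * factor;
        int[] dst = new int[dstWidth * height * factor];
        for (int y = 0; y < height; y++) {
            int rowStart = y * factor * dstWidth;
            // Expand one source row horizontally: each pixel becomes `factor` pixels.
            for (int x = 0; x < width; x++) {
                int pixel = src[y * width + x];
                Arrays.fill(dst, rowStart + x * factor, rowStart + (x + 1) * factor, pixel);
            }
            // Copy the expanded row into the remaining factor - 1 output rows.
            for (int r = 1; r < factor; r++) {
                System.arraycopy(dst, rowStart, dst, rowStart + r * dstWidth, dstWidth);
            }
        }
        return dst;
    }
}
```

For example, scaling a 256 by 192 frame by a factor of 3 yields a 768 by 576 array, and the inner work is done entirely with `Arrays.fill` and `System.arraycopy` rather than per-pixel getRGB and setRGB calls.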
Okay, let's talk about sound. The Master System and the Spectrum had quite different ways of producing sound. The Spectrum obviously had this very simple speaker: it could effectively be a one or a zero, and you control a square wave literally from the CPU to produce your sound. But something like the Master System had an actual sound chip, and you would control the sound by setting registers to say, I want tone one to be this frequency, etc. So we want to abstract those two ways of producing sound, so that we can just have one generateSampleData method, and then our audio service is going to call into that. So here is just a brief snippet of what I do. That'll be the subclass, for example, for the Spectrum type of sound there, and then here it's a bit more complicated, but we effectively do a similar thing: whenever we're asked for some sample data, we calculate that sample data and spit it back. Yeah. And then the question becomes, well, given that sample data production, how do we actually pipe it down to the audio output? Java has this slightly quirky model where you have a notional mixer that's got inputs and outputs, and the slightly perverse thing is that everything is seen in terms of this notional mixer. So when you want to output sound, you're actually sending it to an input of the mixer, so we call it a source line, whereas to us it's not really a source, it's a target; but that's the reason for that. As you see here, they're also tied to particular drivers, and I can enumerate the different drivers on my machine. I found out, for example, that my Mac can listen through my iPhone microphone; that was the first time I found that out. So yeah, we query the available mixers, and then we query them for their available source lines. Okay, and then we can write the data to the source line: we open it with the format that we want, we write the data, and so this is now
where I can call my generateSampleData method; when there are some frames to send, I send them. Okay, people might have spotted a slight flaw with that: I've got a nice infinite loop there. On something like the Spectrum I need to be able to tell the difference between "there's no audio" and "there's no audio yet, but there's some on the way", and I don't want to sit in an infinite loop in the meantime. Okay, so yeah, this was just code examples of how we output those ones and zeros and then translate them, so I'll just skip quickly. So we get those ones and zeros, and then what we're actually going to do is use a Condition object, which is part of the Java concurrency API, so that in our audio service thread we can wait for a notification that there's actually some audio that we want to send. Okay, there we go. Okay, yeah, there's also a little bit that we can do with hybrid buffering, which is basically this: we ideally want to have a small buffer to fill and send, but that then incurs the problem that if we can't fill our buffer in time, we end up with choppy audio. So in practice what we can actually do is have a larger buffer, detect when it's half full and keep topping it up, and that's basically how I do it. Okay, and the FM synth, which I'll mention briefly. I never had one of these; I think they're quite rare, and I think you could only get them in Japan, but this was an option for the Master System. Okay, and what I actually do for this is cheat slightly: I use Java's built-in MIDI software synthesizer. So I translate the instructions to that FM synth into MIDI commands and I send these to the soft synth. I don't know if this is going to play on the projector, but I'll turn up the audio here and just see. So you'll hear the difference: you'll hear the normal PSG sound chip and then you will hear the FM synth. Oh, I don't hear that; it's probably too
quiet and you see there we can then start playing about with things like the the voices that we we assign to those okay so I'll just touch on very briefly because time is getting to the end and so I'll just touch very briefly on the timing and concurrency so the CP obviously we need to maintain it at a kind of our desired instruction rate so the way I do this is I introduce pauses and but then we want to be able to accurately measure those pauses and we also need to accurately measure the timings between the frames that were that we're sending and there are there's obviously standard APIs in in Java to do this a little issue that I did come across the standard executor framework that we'd normally use for doing this so here we say right okay I want to frame every every 60th of the second depending on your platform you can actually in practice get quite erratic intervals between between between the the events so you can see in particular on macOS I find you could get this kind of 20 error so this is just just one experiment for example if we and what I luckily found was that if we request low low sleep interval with the accuracy is actually better for low sleep intervals than the higher sleep intervals and it seems to it seems to max out a particular amount I'm not exactly sure of the underlying reason for that it was to meet in Darwin but then what this leads to is we can kind of come with depending on the platform we can come up with a different strategy for maintaining accurate timing and a challenge you know it's a perpetual challenge with Java really is then that the the the best strategy will will depend on the depend on the platform very briefly data manipulation which sometimes something a bit scared of in Java we all of the types are right well they're generally signed char is unsigned but they're generally fixed width and signed we can't do what we can in seeing other languages and defining our own types and so one way to work around this one want to do 
things like register access and the audio data is the byte buffer is generally the kind of the easiest way to do that and you'll notice that when we want when we want bytes because byte the byte type is signed so if on an unsigned byte then we would normally and promote it to an int and then we can basically undo the ff and lock lock off the lock lock off the the lowest bytes and then so there's just a I'll just skip further and there's just a question with that about well how do we check that the jit compiler is doing what we need to do and so I'll just let step forward slightly and what we can actually do we can ask it yeah so we can we can ask it to dump out the the the jit compiled assembler and then we can check if some of those optimizations are actually going in so this was very simple test I set up it's basically it's iterating through repeatedly effectively writing a word and then reading it from from a byte buffer yeah this obviously is slightly contrived this is you know really the kind of the contrived corner case example but it kind of illustrates the the kind of thing that's possible yeah so I'm effectively that bts effectively writing a two byte unsigned value into there via a byte buffer so it looks like I'm creating a byte buffer setting values on it calling a method on it but by the time we get down to the actual jit compiled assembly code in the best case we're actually not that just compiles down into a we are storing a half word in there and so that's the kind of thing that we can that we can do to kind of check for those things okay and I think we're skipped to the there we go yeah so mentions yeah the method those method calls are completely optimized out okay so so there you go so in conclusion using those various APIs together we can write them in Java a few pros and cons caveats around the different platform behavior a few things that still to add in here this is this is very much kind of version one however it was at the point where it 
will actually run quite a lot of the spectrum master system software if anyone's curious I've got initially released there on github there's going to be source code and further improvements on the way so watch that repo as they say a few references there that people may or may not have come across this book here by Chris Smith is I think kind of a remarkable piece of work about the kind of the very kind of low level details of how the spectrum works and the usual you know kind of reference guides that over the years have surfaced on the web and so with that I think I'll hand back |
OpenCSD, simple and intuitive computational storage emulation with QEMU and eBPF
After all, why not turn your computer into a distributed system? |
Okay. So, hello, everyone. This presentation is about OpenCSD, which is a computational storage emulation platform. The reason we're emulating it I'll get into shortly, but first I think I owe you an explanation of computational storage and what it actually is, because I don't think many people are familiar with that, even in this devroom. But I'm pretty sure most people are familiar with QEMU and eBPF. You can email me, and there's a link to the repo. This has been a long-time collaboration as part of my master's thesis at the VU. So, let's get started. I'm going to briefly explain who I am. I'm Corne Lukken. My handle online is mostly Dantali0n. I'm also a licensed ham radio operator, Papa Delta 3 Sierra Uniform, that is. My expertise is in parallel and distributed systems. I've been in academia for some while: associate degree, bachelor's degree, master's degree. And I've had some experiences throughout that time. I've worked on health technology for visually impaired people, I've worked on OpenStack with cloud optimizations, and I've done computational storage for my master's thesis; that's what this talk is about. Currently, we're working on SCADA systems for the LOFAR 2.0 radio telescope at ASTRON. So, why do we actually need computational storage? That's because we live in a data-driven society nowadays. The world is practically exploding with data, so much so that we're expected to store 200 zettabytes of data by 2050. These high data and throughput requirements pose significant challenges for the storage interfaces and technologies that we are using today. If you look at your traditional computer architecture, the one that's being used on x86, it's based on the von Neumann architecture. Here, we basically need to move all data into main system memory before we can begin processing.
So, this poses memory bottlenecks and interconnect bottlenecks on networks or PCI Express, and it also drastically hinders energy efficiency. So, how much of a bandwidth gap are we talking about here? Well, if you look at a server from 2021, say one using EPYC Milan with 64 SSDs, we're losing about four and a half times the amount of bandwidth that could be offered by all the SSDs in tandem but can't be utilized, because we can't move the data into memory that fast. So, that's quite significant. Now, what is this computational storage, and how does it actually solve this? Well, we fit a computational storage device, so a flash storage device, with its own CPU and memory. Now the user, the host processor, can submit small programs to this computational storage device and let them execute, and only the result data from the computation is returned over the interconnect into system memory, thereby reducing data movement and potentially improving energy efficiency, because these lower-power cores using more specialized hardware are typically more energy efficient than your general-purpose x86 processor. If we then look at the state of current prototypes as of September 2022, we see three main impediments. Firstly, there is the API between the host and the device. There's no standardization here. People are building hardware prototypes, but are not looking so much at the software interfaces. We also have the problem of the file system, because these flash devices have file systems on them and we want to keep those synchronized between the host and the device. So, how do we achieve that? We can't use cache-coherent interconnects or shared virtual memory, because by the time we make a round trip across the PCI Express interface, we'll have lost all the performance that we set out to gain. And how do we stick to existing interfaces? People that access file systems read, write, and use system calls. They are very used to this.
If you suddenly needed to link a shared library to access your file system, people wouldn't be up for that. So, we need some solutions here, and that's what OpenCSD and FluffleFS introduce. We have a simple and intuitive system. All the dependencies and the software itself can run in user space; you don't need any kernel modules or things like this. We use system calls that are available in all operating systems, or at least in most typical operating systems: FreeBSD, Windows, macOS, and Linux. So, I'd say that's pretty good. And we do something that's never been done before in computational storage: we allow a regular user on the host to access a file concurrently while a kernel that is executing on the computational storage device is also accessing that file. We managed to do this using existing open-source libraries: Boost, xenium, FUSE, uBPF, and SPDK. Some of you will be familiar with some of these. This allows any user like you to try and experience this yourself in QEMU after this talk, without buying any additional hardware. I'll get into that hardware in a second, because there's some specialized hardware that we would need if we wanted to have this physically in our hands. If we look at the design, we see four key components, and a fifth one that I'll explain on the next slide. We're using a log-structured file system, which supports no in-place updates, so everything is appended. We have a modular interface with backends and frontends, which allows us to experiment and try out new things: we can basically swap the backends and keep the frontend the same. We're using this new technology in flash SSDs that's called zoned namespaces. They are commercially available now, but they're pretty hard to get still; that's going to improve in the future. And the system calls that we managed to reuse, those are extended attributes.
So, with extended attributes, on any file and directory on most file systems, likely including the one you are using now, you can set arbitrary key-value pairs. We can use this as a hint from the user to the file system to instruct it that something special needs to happen. Basically, we just reserve some keys there and assign special behavior to them. Now, let's get back to the topic of zoned namespaces, because I owe you some explanation here. Back when we had hard drives, we could perform arbitrary reads and writes to arbitrary sectors. Sectors could be rewritten all the time without requiring any erasure beforehand. This is what is known as the traditional block interface. But there's a problem, and that is that NAND flash doesn't actually support this behavior. With NAND flash, your sectors are grouped in blocks, and a block needs to be written linearly. And before you can rewrite the information in a block, the block needs to be erased as a whole. So, in order to accommodate this, flash SSDs have to incorporate what is known as a flash translation layer, where basically all these requests that go to the same sectors are somehow translated and put somewhere else physically, just so that the user can still use the same block interface that they have been used to since the time of hard drives. So, there's this translation between logical and physical blocks, and when we try to synchronize the file system from the host with the device while a kernel is running, this introduces a whole lot of problems. So, how do we solve this? Now you know the answer: it's zoned namespaces. We basically present an interface that's not the block interface, but an interface that fits NAND flash behavior. So, when you use a zoned namespaces SSD, you, as a developer of a file system or the kernel, need to linearly write each sector in the block, and you need to erase the block as a whole.
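The two rules just described (write sectors linearly, erase only as a whole) can be captured in a few lines. The following is a minimal toy model in Python, not OpenCSD or NVMe code; the class name `Zone`, its methods and the sector size are all illustrative assumptions.

```python
class Zone:
    """Toy model of one zone on a zoned-namespace SSD (all names are illustrative)."""

    def __init__(self, num_sectors: int, sector_size: int = 4096):
        self.sector_size = sector_size
        self.sectors = [None] * num_sectors
        self.write_pointer = 0  # index of the only sector that may be written next

    def append(self, data: bytes) -> int:
        """Write one sector at the write pointer and return the index used."""
        if len(data) != self.sector_size:
            raise ValueError("data must be exactly one sector")
        if self.write_pointer >= len(self.sectors):
            raise OSError("zone is full; reset it before writing again")
        index = self.write_pointer
        self.sectors[index] = data
        self.write_pointer += 1
        return index

    def write(self, index: int, data: bytes) -> None:
        """Accept a write only if it lands exactly on the write pointer."""
        if index != self.write_pointer:
            raise OSError("zones only accept sequential writes")
        self.append(data)

    def read(self, index: int) -> bytes:
        """Reads are unrestricted, as long as the sector has been written."""
        if not 0 <= index < self.write_pointer:
            raise OSError("sector has not been written yet")
        return self.sectors[index]

    def reset(self) -> None:
        """Erase the zone as a whole; single sectors cannot be erased."""
        self.sectors = [None] * len(self.sectors)
        self.write_pointer = 0
```

Rewriting sector 0 after it has been written fails in this model until `reset()` is called, which is exactly the constraint that a conventional flash translation layer hides from the host.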
So, effectively, you become the manager of this SSD: the flash translation layer and the garbage collection live on the host, and we call this whole system host-managed. If we now combine this with a log-structured file system, which also doesn't have any in-place updates, then you naturally see that this becomes a very good fit. Now, with these two technologies together, we can finally synchronize the file system between the host and the device, and we can do that by making the file temporarily immutable while the kernel is running. We do that using a snapshot consistency model, by creating in-memory snapshots. So, we are able to create a representation of the file as it was on the host, with its metadata, put that into the computational storage device's memory, and ensure that all the data that is there will remain immutable during the execution of the kernel. Meanwhile, the user can actually still write to the file, and the metadata of the file on the host will differ, but that's not a problem. So, this is very powerful, and it also allows us to understand kernel behavior in a way, because we can now send metadata to the computational storage device that says: well, actually, if the kernel tries to do this, and remember, it's a user-submitted program that might be malicious, then we want to block those actions. So we have a security interface as well. The final piece of this design is that we want to be architecture-independent, and we do that through eBPF, the system that you're also using for network hooks and event hooks in the Linux kernel nowadays.
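Going back to the snapshot consistency model described above: the reason a snapshot is cheap on a log-structured file system is that old data is never overwritten, so a frozen copy of the metadata is enough. The sketch below illustrates that idea only. It is not FluffleFS code, and `LogStructuredFile` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Extent = Tuple[int, int]  # (offset into the log, length in bytes)


@dataclass
class LogStructuredFile:
    """Append-only storage; a file's metadata is just a list of extents."""
    log: bytearray = field(default_factory=bytearray)
    extents: List[Extent] = field(default_factory=list)

    def _append_to_log(self, data: bytes) -> Extent:
        extent = (len(self.log), len(data))
        self.log.extend(data)  # data is only ever added at the end of the log
        return extent

    def append(self, data: bytes) -> None:
        """Grow the file by appending data."""
        self.extents.append(self._append_to_log(data))

    def overwrite(self, data: bytes) -> None:
        """Replace the file contents: new data is appended, only metadata changes."""
        self.extents = [self._append_to_log(data)]

    def snapshot(self) -> Tuple[Extent, ...]:
        """Freeze the current metadata; this is what a kernel would be handed."""
        return tuple(self.extents)

    def read(self, extents: Optional[Sequence[Extent]] = None) -> bytes:
        """Read through the live metadata, or through a snapshot if given."""
        chosen = self.extents if extents is None else extents
        return b"".join(bytes(self.log[off:off + size]) for off, size in chosen)
```

A snapshot taken before a host write keeps returning the old contents, while the host sees the new ones. A real implementation would additionally have to keep garbage collection away from extents that a running kernel still references, which this sketch does not model.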
With eBPF, you can define system calls and expose those in a header, and this is actually the format of how you would do that; that's a real example. The vendor would implement that code, and you would define some behavior in a specification, but the vendor doesn't have to open source their code, which, in the case of flash SSDs and their vendors, is pretty important, because they don't seem to be that keen on that. This way, we can still have an interface, and users can write programs once and reuse them across all vendors without any problem. The nice thing about eBPF is that this instruction set architecture, which is what eBPF essentially is, is easily implementable in a VM. There are even pre-existing open-source implementations of this, and that's what we're using: uBPF. Now that I've explained all the key components of OpenCSD and FluffleFS, I want to start with a little demo and show you some of the actual practical use cases for this. So how can we use such a computational storage system in a way that makes sense in terms of data reduction and energy efficiency? For that, we're going to go to the example of Shannon entropy. This is heavily used by file systems that can perform background compression, or just by compression programs that compress in the background. What you basically do is try to quantify the randomness you have in a file. Typically, it's between 0 and 1, but for computers, that doesn't really make sense, so we use this log b that's over here to normalize this for bytes. Then we can look at the distribution of bytes. Because a byte has 256 different possible values, we create 256 bins, and we submit a program to calculate this. It runs in the background, and only the result is returned to the host operating system. The host operating system is then free to decide whether or not this file should be compressed.
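As a concrete reference for the computation being offloaded, here is a small host-side sketch in Python of byte-level Shannon entropy: 256 bins, normalized with a logarithm of base 256 so that the result lies between 0 and 1. The function names are illustrative, and this is plain Python rather than the eBPF kernel from the talk.

```python
import math
from typing import List


def byte_histogram(data: bytes) -> List[int]:
    """Count how often each of the 256 possible byte values occurs."""
    bins = [0] * 256
    for value in data:
        bins[value] += 1
    return bins


def normalized_entropy(bins: List[int]) -> float:
    """Shannon entropy of the byte distribution, using log base 256.

    0.0 means a single repeated byte value; 1.0 means all 256 values are
    equally likely, so the data looks random and is unlikely to compress.
    """
    total = sum(bins)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in bins:
        if count:
            p = count / total
            # log base 256 of p equals log2(p) / 8
            entropy -= p * (math.log2(p) / 8.0)
    return entropy
```

In the scheme described in the talk, only the histogram step runs on the device; the host receives the 256 counters and can apply a threshold of its own choosing to `normalized_entropy` when deciding whether to compress.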
So what does such a kernel look like, the kernel that you actually submit to the computational storage device? Well, you can just write them in C and compile them with Clang. We have two individual interfaces here that we are exposing. The yellow commands are introduced by the system calls, the eBPF ABI that we are defining, and the purple ones are introduced by the file system. What that means is that the system, as it is now, is not agnostic to the file system. It is agnostic to the vendor and to the architecture of the vendor, so whether it is ARM or x86 doesn't matter, but for now it's specific to the FluffleFS file system that we have written. I will address some possible solutions for this at the end. Another thing we need to realize is that the eBPF stack size is typically very small; we're talking bytes here instead of kilobytes. So we need a way to address this. What you can do in uBPF is allocate a heap, just like your stack, and then we have this bpf_get_mem_info call that we have defined as part of the ABI, which allows you to get your heap pointer. Currently you have to offset into this manually, which is a bit tedious, if you will. You see that that is actually done here. To store the bins, we offset the buffer by the sector size, so the data from the sector reads is stored at the top of the buffer, and the bins are stored at an offset of precisely one sector size. Now, when we look at the file system interface and all the helpers, data structures and additional function calls that we introduced, we can see that we could later also make a basic implementation of malloc and free here and just resolve this. But for now, for this example, it's a bit tedious. Now, how do you actually trigger this?
So we had the extended attributes, we had all these systems in place, and now you have this kernel: you have compiled it and stored it in a file, and now you want to actually offload your computation, well, in an emulated fashion, but you want to learn and see how you do that. So the first thing you do is call stat on the kernel object, so on your compiled bytecode, and then you get the inode number. You have to remember this inode number. You then open the file that you want to read from or write to; for the examples we're mostly using read. Then you use set extended attribute with our reserved key and set it to the inode number of the kernel file, and when you then issue read commands, the read commands will actually go to the computational storage device and run there. But when do you actually take these snapshots? The trick is: as soon as you set the extended attribute. This is just by design, right? It could also be once you call the first read or once you execute the first write, but we have decided to do it at the moment that you set the extended attribute. That means that if you make any changes to your kernel after you've set the extended attribute, nothing changes anymore. And the same goes for the file. Now I want to briefly explain some different types of kernel that you can have. What the example here is mainly showing is what we call a stream kernel. A stream kernel happens in place of the regular read or write request. So the regular read or write request doesn't happen; only the computational storage request happens, on the computational storage device. With an event kernel, it's the other way around. First, the regular event happens normally, and then the kernel is presented with the metadata from that request and can do additional things. This is interesting for databases.
For example, say you're writing a big table, and you want to know the average or the minimum or the maximum, and you want to emit that as metadata at the end of your table write. Well, you could use an event kernel: you let the write happen as is, then the kernel is presented with the data and runs on the computational storage device, and you emit the metadata afterwards, which you can store as something like an index. We have also decided to isolate the context of this computational storage offloading, so what is considered offloaded once you set the attribute, by PID. But we could also make this by file handle, or you could even set it for the whole inode. Moreover, we could use specific keys for file handle, PID or inode offloading, so it's just a matter of semantics here. Now, I have some source code in Python of the execution steps that I've just shown, because there are a few details that I left out in the brief overview. The first is that you have to stride your requests, and those have to be strided by 512K. Why is this so? Well, in FUSE the number of kernel pages that are allocated to move data between the kernel and user space is statically fixed. If you go over this, your request will seem fine from the user perspective, but what the kernel will do is chop up your request. Why is that problematic? Well, then multiple kernels spawn, because from the context of the file system, every time it sees a read or write request, it will take the kernel and move it to the computational storage device. Then here you can see how I set the extended attribute and get the inode number of the kernel, and what I want to show here at the bottom is that I'm getting 256 integers, one for each of the buckets of the entropy read, while I'm issuing a request of 512K. So that shows you the amount of data reduction that you can achieve using systems like this. 256 integers, 512K. Pretty good. Could be better though.
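The host-side steps described above (stat the kernel, open the target file, set the reserved extended attribute, then issue strided reads) might look roughly like the sketch below. This is not the script shown in the talk. The attribute key `user.example.csd_read_stream`, the encoding of the inode number as a decimal string, and the result layout of 256 little-endian unsigned 32-bit counters per request are all assumptions made for illustration; the real values are defined by FluffleFS.

```python
import os
import struct

STRIDE = 512 * 1024                           # request size named in the talk
NUM_BINS = 256                                # one counter per byte value
OFFLOAD_KEY = "user.example.csd_read_stream"  # hypothetical reserved key
RESULT_FORMAT = f"<{NUM_BINS}I"               # assumed layout of one result
RESULT_SIZE = struct.calcsize(RESULT_FORMAT)  # bytes returned per request


def strided_requests(file_size, stride=STRIDE):
    """Split a file into (offset, length) requests of at most `stride` bytes."""
    return [(off, min(stride, file_size - off))
            for off in range(0, file_size, stride)]


def parse_bins(result):
    """Decode one offloaded read result into a list of 256 counters."""
    if len(result) < RESULT_SIZE:
        raise ValueError("result is shorter than the assumed layout")
    return list(struct.unpack(RESULT_FORMAT, result[:RESULT_SIZE]))


def offload_histogram(kernel_path, data_path):
    """Follow the steps from the talk; needs a file system that honours OFFLOAD_KEY."""
    kernel_inode = os.stat(kernel_path).st_ino      # 1. stat the compiled kernel
    fd = os.open(data_path, os.O_RDONLY)            # 2. open the file to process
    try:
        # 3. Point the file at the kernel; per the talk, snapshots are taken here.
        os.setxattr(data_path, OFFLOAD_KEY, str(kernel_inode).encode())
        totals = [0] * NUM_BINS
        # 4. Strided reads, so FUSE never splits a request into several kernels.
        for offset, _length in strided_requests(os.fstat(fd).st_size):
            bins = parse_bins(os.pread(fd, STRIDE, offset))
            totals = [t + b for t, b in zip(totals, bins)]
        return totals
    finally:
        os.close(fd)
```

Under the assumed layout, each 512 KiB request returns `RESULT_SIZE` = 1024 bytes, a reduction by a factor of 512. `os.setxattr` is only available on Linux, and on an ordinary file system `offload_histogram` would simply read raw file data, so only the two helper functions are meaningful outside the emulated setup.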
The reason it's not better is that floating-point support in eBPF is limited to the point where you need to implement fixed-point math yourself. We could do this as part of the file system helpers, but that's not done for this prototype at the moment. Now, some limitations. This was master's thesis work. This was my first time designing a file system ever. It's solely a proof of concept. There's no garbage collection, no deletion, no space reclaiming. Please don't use it. Please don't use it to store your files. Yeah. eBPF has an endianness, just like any ISA would have, and there are currently no conversions. So if you happen to use something that uses a different endianness, all your data will be upside down. You have to deal with that yourself for now, but once again, we can make it part of the file system helpers to help with these data structure layout conversions and endianness conversions. As I mentioned briefly earlier, floating-point support in eBPF is practically non-existent, but we can implement fixed-point math. And I haven't shown any performance examples, because I don't think they are that interesting: what's currently happening when you emulate offloading is that the kernel just runs on the host processor as is, in eBPF. So it isn't representative of the microcontrollers that you would find on SSDs, and the time it takes to execute these kernels is much too short. That's something that we need to work on, I think, because then we can more easily reason about what the actual performance would be if we offloaded these applications to SSDs. Frankly, these SSDs do have very capable microcontrollers, typically even multi-core processors. The reason is that they need to manage your flash translation layer. So they are already fairly capable devices, actually. Only read stream kernels have been fully implemented for this prototype as well.
And that's mainly because event kernel performance is problematic: remember, the I/O request happens regularly, so all the data is moved back to the host processor, and only then is the event kernel started. What you really need is a two-stage system where you prevent the data from being moved back to the host. This requires some more tinkering. And the final thing: we need to make this agnostic to the file system. We can very easily achieve this using a file system runtime where, through an ICD, an installable client driver, much the same way that Vulkan, OpenCL and OpenGL work, you can dynamically load a shared library that implements all the functions you have defined in the header. This can also dynamically compile your programs and then store cached versions of them. And using statfs, we can easily identify what file system the program is running on. That allows users to write their programs once and run them on any architecture and for any computational file system, which I think is pretty powerful and flexible. So that's it. I encourage you to try this. I've also written a thesis on this that does have some performance metrics. It also shows you some interesting data structures that we had to design for the file system to be able to support these in-memory snapshots. There's a previous work called ZCSD that also has some early performance information. And I've written quite an extensive survey on the last decade or so of the history of computational flash storage devices, which is also quite interesting. So thank you. APPLAUSE Seven minutes for questions. Oh, that's quite good. I imagine this is quite difficult, right? Computational storage, what the fuck's that? So please don't hesitate to ask questions if anything is unclear. What's the availability of hardware that can do this? The computational storage? Yes, the computational storage.
There is one vendor that is selling a computational storage device that's not based on zoned namespaces storage. It's using conventional SSDs and it supports computational storage through a network interface. So you have the normal PCIe interface, and then there's this transport over NVMe to do TCP/IP, and you basically just connect to it over SSH and then you can do things on the SSD. That one's commercially available. I don't know what they would ask for that product. What does ZCSD have to do with zoned namespaces? Nothing in principle, but you need a way to synchronize the file system between the host and the device, and zoned namespaces make that trivial, whereas with conventional SSDs the logical and physical block translation severely hinders this and makes it extremely difficult to do. So why didn't you include the performance metrics from your thesis, or better ones? Because the performance... Oh, sorry. Oh, yeah. Why don't I... Very good. I forget that all the time, actually. Yeah, so why didn't I include any performance metrics if I have them? The answer is that I don't think I would have had time, and I don't think they're interesting enough to include. This is a very complicated subject. It's very new for most people, computational storage. Most people have never heard of it. So I'd much rather spend the time to explain this properly and try to show you that this is a very interesting concept for solving this bandwidth gap, rather than show you some metrics that are not representative anyway, because the kernel is running on the host CPU and you're not going to have an additional host CPU on the flash SSD. Can you talk about what kind of test setup you have for your metrics? So I don't... Yeah, of the metrics themselves. Yeah, so the framework... Okay, yeah, yeah, very good. What kind of test setup did I have to do all these analyses and to try these things out? I run QEMU on my own host machine, just a normal laptop, basically this one.
And QEMU then creates a virtual zoned namespaces device; that was actually quite recently introduced to QEMU. So you can now try zoned namespaces without owning a zoned namespaces SSD. That's the whole reason QEMU comes into play, because otherwise people would need to buy a zoned namespaces SSD, which is still quite hard to get. And then you just run the prototype as is. So that's all you need, and you really don't need any special hardware. Yeah, it could even be on an ARM laptop. It doesn't matter. Did you test it? No, I did not test it. So, whether or not I tested if it works on ARM: the answer is no, I did not test it. But I'm pretty sure QEMU compiles on ARM, so I'm pretty sure we're good there. Because you have to remember, and that's maybe not intrinsically clear from this presentation, that we didn't extend QEMU in any way. It's just a normal QEMU installation. You don't even need to custom install it; you can just get it from the package manager and use this. I have a lot of questions about that. Regarding the computational part, what are the limitations of what kind of CPU or kernel may run on these devices? I think the main limitations... what are the limitations of the kernels that you run on these devices? Well, first of all, you need to have data reduction, right? If you're going to read one gigabyte from the flash storage and you're going to return one gigabyte of data to the host, then there's no real point in offloading this, because the data is going to be moved anyway. So the first limitation is that you have to find an application that is reductive in nature: once you do the computation, you return less data. The nice thing is that's 99% of all workloads, right? So that's pretty good. And the second thing is that if it's timing critical and the computation takes a long time, then it's probably not that interesting, because the latency will then be too bad, since the performance of these cores is much lower than that of your host processor.
But you can implement specialized instructions that could be very efficient at doing database filtering or things like this, and that is where the whole ASIC and FPGA part would come into play. But if it's not timing critical and it's in the background, like the Shannon entropy for compression, those are ideal cases: a reduction in data and not timing critical. So what you mean is we can have software kernels with the back end in hardware, so we can also program the hardware? So like maybe a core like CPU board or GPU board. To repeat the question: whether it's just software or whether we also program the hardware. Of course, FPGAs can be reprogrammed on the fly, and we have seen prototypes in the past for computational storage devices where they do just that: from the host, the user sends a bitstream that dynamically reprograms the FPGA, and then the kernel starts running. That's not what we're trying to achieve here. What I envision is that the FPGA has specialized logic to do certain computations, and then, from the eBPF ABI, whatever code triggers those instructions will utilize the FPGA to do those computations, but they would be defined in the specification beforehand. Because reflashing an FPGA with a new bitstream typically takes quite some time, so in the interest of performance it might not be that interesting. So I'm going to ask a question. You might have mentioned it, but are there closed-source competitors? Whether there are closed-source competitors in the space of computational storage. Well, actually, that's one of the things that's been going really well in this scene. I'd say the vast majority of everything is open source.
At least if you look at the recent things. If you look at the past decade, then it's a bit worse, because there is a lot of research published that doesn't actually publish the source code, or rather the source code is published but everything is a hardware prototype and they didn't publish the bitstreams or the VHDL or the Verilog, so you're then stuck as well, or they didn't publish any PCB designs, so you can't reproduce the work, if you will. I'd say this is a much bigger problem in academia than just in computational storage, but it's also present here. Yes. Which one? The Python code. Complexity in terms of? So, in the face of performance, I have a nested loop here: why that, and why Python? The trick is that this is just for demonstration purposes. You can easily make this example in C or C++, and you should if you care about performance. The thing is that this program is already spending 99% of its time in I/O wait, because it's waiting for the kernel to complete, so in the face of that it's not that interesting. And the reason we have a nested loop is that floating point in eBPF is non-existent, or at least I didn't implement fixed-point math. So what I have to do after this, at the bottom, which you don't see here, is that from all these buckets, these bins, I'm actually computing the distribution using floating-point math in Python. That is why I don't get a single number from this kernel: if I had a floating-point implementation in eBPF, I could already do that computation in eBPF and only return a single 32-bit float as a result instead of these 256 integers. But the reason this is a loop is that I still have to stride the read requests, because I can't go above 512K even if my file is bigger than 512K. You said it's spending a lot of time in I/O wait.
Couldn't you just thread it there to improve it, or doesn't that make any sense in this case? Well, the trick is, okay, couldn't I implement multi-threading here? Currently, the eBPF VM runs as a single process, so even if you submit multiple kernels, only one will execute at a time. Why? It's a thesis prototype, right? Time, things like this. Okay. Thank you very much. No worries. |
Understanding the Bull GAMMA 3 first generation computer through emulation |
So, our museum is located in Namur, so it's not far from here, and if you have some time to come, you're welcome. We have different missions, of course. One of them is to preserve old machines, to show them to the public, and also to study those machines to keep understanding them. So my talk is more precisely about that. And why this machine? Actually, we have a big collection; part of our museum is a big mechanographic collection, you can see it here. So we have a whole bunch of electrical and mechanical machines that are still being maintained. Unfortunately, we don't have the Bull Gamma 3, it's very rare, but it was connected with those machines. So we have a lot of documentation about those machines, and we were interested in studying that machine more specifically. So I will go through the historical context, make you discover the machine, and then go into how to emulate it, looking at some existing emulators, and then detailing our own emulator and what we learned with it. So let's go back in time. You know we are here now in 2023. If we go back, how long is it, 70 years ago, to the 50s, just after World War II, the first generation of computers was being developed. At that time, the technology was very different from today, because there were no integrated circuits, there were no CPUs or microprocessors, those were developed in the 70s. There were no TTL circuits, there were no transistors, there were no magnetic cores. Actually, when you wanted to build a computer, you had technology like vacuum tubes, and delay lines and drums to try to store some memory. So it was really a very different technology, and of course, you can imagine, the memory was very small.
Another point is that at that time, of course, there was automation before the computer, and most of that automation was done through electromechanical machines, so tabulating machines. You know they were developed at the end of the 19th century with the Hollerith tabulating machine, and that later became the IBM company. And you can see here that there was some kind of transition between that era and those computers that were starting to be developed. And the interesting point is that the one I will show you was, at the beginning, not really a computer. It was still some kind of auxiliary calculator for a tabulating machine, the one that you can see in our museum. And afterwards it began to improve, and the dependency between the machines was reversed. So the Gamma 3 became the computer and the tabulating machine became the peripheral. You can see other machines afterwards, of course. You can see that Bull also developed the Gamma 60 and Gamma 30 machines in the second generation, so I will not focus on that. Maybe in the next one. And so how did we study the machine? Of course, we have documentation at the museum. There are also a number of existing examples of that machine: one in Angers, where it was built by Bull; in Grenoble, they acquired one and preserved it; and one in Frankfurt. So, of course, we don't have one, but we have that documentation, and we also have a lot of documentation that was provided by ACONIT, which is another museum located in Grenoble. And there are a few emulators, so we'll come back to that later. Let's have a look at the hardware. As I told you, it's a first-generation computer, it's based on vacuum tubes and delay lines. Actually, the code was stored in a connection panel, so you can see it on the top there.
So in order to program it, you actually had to plug in the instruction, to say: well, the first instruction has four characters, and the first character is that hexadecimal code, the second one is that code, and so on. So it's really like spaghetti coding, and that spaghetti coding was also used on tabulating machines, so it was the way to code at that time. And that's also the reason why we cannot really call it a computer in that form, because it does not follow the von Neumann architecture. In that architecture, you have to have the code inside the main memory, although somehow that panel was memory mapped, so you could consider it some kind of read-only memory. What about the memory? The memory itself was only seven registers. And each word was the equivalent of six bytes, so it's 12 characters of four bits. To keep the information, it was just circulating in a line with a regeneration system, so it's an LC circuit, and for just one word, for six bytes, you can see the device here, it's more than eight kilograms. You imagine the start of the... It was really very big. About the computation, it was also based more on diodes, so I will not go into all the details. It was mostly addition and subtraction. As we will see, multiplication and division were implemented through iterative addition and subtraction. And what about the frequency? The frequency was 2.5 hertz. Why that? Actually, internally it could go faster, but it was synchronized with the mechanical machine, with the punch cards, so it was limited by that part. And you can also see there is a nice drawer. It's really easy to open, of course, for maintenance, because when a vacuum tube had a problem, you had to replace it, and it was designed for that. So is it a computer or a calculator?
So in French, we have different names, but as I told you, we cannot really consider it a computer at first, because it was not following the von Neumann architecture and it was really designed as an auxiliary machine for the tabulating machine. As you can see there, there is a quote from a guy who designed the machine, in 1953. So it's really an extension, and the good point is that the computation was so fast that there was no delay caused by the calculation, so it was really transparent for the tabulating machine. And at that time, the programs inside the machine were more like auxiliary computations that were augmenting the capability of the tabulating machine. And then there was an evolution, that's the interesting point. That first version was only adding and subtracting integers, so there was a version that was able to do floating point. And then in 1957, there was a drum extension, that's the interesting point. It's about 100 kilobytes, and it could store the program. So from that time, we can say that it's really the first French computer, and it's also the transition between the electromechanical era and the computer era. Another interesting point is that those first computers were not using binary or hexadecimal representation, they were still computing in decimal. It's interesting because I found, it's in French with the translation there, a whole discussion about whether we should use decimal, binary or hexadecimal for computation. So there were some advantages and some disadvantages. You can see the advantages of binary: two figures, zero and one, it's really powerful for relays, it's ideal to map. And the disadvantages of binary: the words become very long, and we need to translate back and forth with decimal.
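To make that trade-off concrete, here is a small Python sketch (not taken from the talk or from any Gamma 3 software) that writes the same number once in pure binary and once with each decimal digit coded on its own four bits. The helper names `to_decimal_digits` and `digit_coded_bits` are made up for this illustration.

```python
def to_decimal_digits(n: int, width: int) -> list[int]:
    """Return the decimal digits of n, most significant first, padded to width."""
    if n < 0 or n >= 10 ** width:
        raise ValueError("value does not fit in the requested number of digits")
    return [int(c) for c in str(n).zfill(width)]


def digit_coded_bits(digits: list[int]) -> str:
    """Code every decimal digit on its own group of four bits."""
    return " ".join(format(d, "04b") for d in digits)


year = 1953
print(digit_coded_bits(to_decimal_digits(year, 4)))  # 0001 1001 0101 0011
print(format(year, "b"))                             # 11110100001
```

The digit-coded form needs 16 bits instead of 11, but every group of four bits can be read back as a decimal digit without any conversion, which matters when the inputs and outputs are decimal punch cards.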
So the conclusion is quite funny: we will use semi-decimal, which is actually the name for binary coded decimal, and they introduce the coding for binary coded decimal. So that was for the first version. Afterwards they came back on that decision, and the update for the drum extension was able to support the full binary mode. So what do we have as memory? As I told you, we have those registers. We have seven main registers; you can see a few more here because there were extensions. A register is one word of 12 digits, 12 characters of four bits, so it's actually six bytes, and the main memory was only 42 bytes. You see it's very, very limited. And if you look at the full architecture here, the full Gamma 3 with all the extensions, you can see the panel on the top left. The main registers are on the left. The top one, M1, is the only one where you can read and write, so all the computation will be performed in that one, and the other ones, M2 to M7, will be used as registers to read operands. And you can see the decoding of the instruction. The instruction is composed of four parts, I will detail them afterwards; they are called TO, AD, OD and OF. The rest are extensions, so this is more memory. You can switch those registers with those ones, and the drum extension can also map onto those octets, so you can load a part of a program from the drum into those parts and then execute it in the computer. So about the instruction set, you can see that there are four parts. The first is quite natural, it's just the type of operation, so you can have addition, subtraction, I will detail that afterwards.
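The numbers above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the instruction is represented as a string of four hexadecimal characters, the field names TO, AD, OD and OF come from the talk, and the example word `"A469"` is a made-up value for illustration, not a verified Gamma 3 instruction.

```python
from dataclasses import dataclass

REGISTERS = 7        # main registers M1..M7
DIGITS_PER_WORD = 12
BITS_PER_DIGIT = 4

# 7 registers x 12 digits x 4 bits = 336 bits = 42 bytes
MAIN_MEMORY_BYTES = REGISTERS * DIGITS_PER_WORD * BITS_PER_DIGIT // 8


@dataclass(frozen=True)
class Instruction:
    to: int  # type of operation
    ad: int  # address (which register holds the operand)
    od: int  # first of the two range fields
    of: int  # second of the two range fields


def decode(word: str) -> Instruction:
    """Split a four-character hexadecimal word into its four fields."""
    if len(word) != 4:
        raise ValueError("an instruction has exactly four characters")
    to, ad, od, of = (int(char, 16) for char in word)
    return Instruction(to, ad, od, of)


print(MAIN_MEMORY_BYTES)  # 42
print(decode("A469"))     # hypothetical word, for illustration only
```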
The second part is also quite natural, it's just the address, meaning which operand we will use. So for the addition, for example, we can see it means M4, register number four. What's a bit different and weird is that we then have two other pieces of information in the instruction that tell you which range in a register you will manipulate. The reason is that memory was scarce, so if you wanted to store two different pieces of information in the same register, you could address one part of it, and you could really select whether it was two digits and then ten digits, and things like that. So you can see here a very simple addition, and I can decode it with you. This means a transfer from one register to the accumulator, so the M1 register, and it's from M4. You can see that in M4 we have two parts, A and B: A is from 6 to 9 and B is from 1 to 5. The first thing is that we load the part 6 to 9 into the accumulator. Then we ask to perform an addition with what we can find in the same register four, in part 1 to 5, you can see 1 to 5 here. And you can see that, as an internal flag, it also remembers the shift it should use for the addition. Then it can perform the addition, and we have A plus B inside the register. Then you put back the result, so it's the reverse: instead of BO it's OB, to store the result back in M4. And of course here you have to think: oh, I've done an addition, so maybe there is a carry, an overflow. So you can see here that we have provisioned one more digit to be able to store the result back. So you can see all the mental gymnastics you have to do to be able to program with that kind of range in the registers. It means that when you are coding, you have to use this kind of sheet. You can see the mnemonics, of course, and you see here the translation, where you have to think about those ranges. To make that easier, you have to allocate your ranges and reason about your ranges also on
this sheet. So you can see here that the problem is computing that formula, and then you just perform the different calculations: multiplication, shift, shift to have the right power, and then divide by the square root of three. Okay, quickly, this is the full instruction set. You can see it's not very regular. Well, a natural thing is that no-operation is zero, it was already zero. You have operations for different kinds of jumps; there was an internal flag to remember how to jump. You have different memory transfers, I will not go into details, of course: to set memory to zero, to load a value, or to make the transfer between different kinds of registers. There is a logical AND. I didn't find any logical OR, I don't know if there was one, to be true. Then different comparisons, and then of course the most important ones, from A, B, C, D, E, F, the addition and the other arithmetic operations. And you can see there are two flavors of multiplication and division, because there was what was called reduced multiplication and reduced division, which was faster but did not operate on a double register, because of course the result of a multiplication could take twice as much space. Okay, so this is the code card. It summarizes the whole instruction set, and it reflects the complexity of its organization. I can point out just three things. First, it's called calculateur. In French the name was calculateur, because the name ordinateur was coined one year later for an IBM machine, so it didn't exist yet. You have to think about all that. You can see here the different arithmetic operations, for A to F, so 10 to 15, and you can see they are not always in order. The seven is presented higher because it's just the shift operation, and the two is not represented because it was an extension for the drum.
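The range-addressed addition walked through a moment ago (part A in positions 6 to 9, part B in positions 1 to 5, result stored back with one spare digit for the carry) can be sketched in Python. This is an illustration under assumptions of my own, not the Gamma 3 semantics in detail: a register is modeled as a list of 12 digits with index 0 as the least significant position, and the helper names are invented. The `base` parameter shows how the same digit loop could serve a decimal or a binary variant of the machine.

```python
def add_digits(a: list[int], b: list[int], base: int = 10) -> tuple[list[int], int]:
    """Add two equally long digit lists (least significant first), propagating the carry."""
    if len(a) != len(b):
        raise ValueError("both operands must cover the same number of positions")
    result, carry = [], 0
    for digit_a, digit_b in zip(a, b):
        total = digit_a + digit_b + carry
        result.append(total % base)
        carry = total // base
    return result, carry


def digits_to_int(digits: list[int], base: int = 10) -> int:
    """Turn a least-significant-first digit list back into an integer."""
    value = 0
    for digit in reversed(digits):
        value = value * base + digit
    return value


# One 12-digit register shared by two values, as in the talk's example.
m4 = [0] * 12
m4[0:5] = [5, 4, 3, 0, 0]  # part B = 345, five positions (one spare for the carry)
m4[5:9] = [6, 7, 8, 9]     # part A = 9876, four positions

a = m4[5:9] + [0]          # widen A to five positions before adding
b = m4[0:5]
total, carry = add_digits(a, b)
m4[0:5] = total            # store A + B back over part B

print(digits_to_int(m4[0:5]), carry)  # 10221 0
```

Without the spare fifth position the result 10221 would not fit, which is exactly the kind of bookkeeping the coding sheet is for.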
Okay, let me go quickly. So about existing emulators. This one was written in 1995 by Vincent Gauguin. It's in, sorry, x86 assembly code, and it still runs, but thanks to emulators, because you need DOSBox to run it. We don't have the source code. You can just see there that it's emulating everything, so it's quite complete. You can see that it's loading some information, so it's loading 0, 9, 4, 2, 7 into the memory three register. And then there is an emulated drum, and you have a number of programs on the drum that you can try. A more recent one is available online. This one is very interesting because it's very well documented, and you can even play with the panel. There is a full console where you can step through, and it was actually one of the sources of inspiration for our work, because that one was in JavaScript and ours is in Java, so we first studied that code and kind of transposed it. There is also an extension, a 3D visualization, which is funny because you can explore inside the machine. You can see the connections here, and there were big cables to connect the machine with the tabulating machine. Okay, about the emulation structure. Of course, what we have modeled is all the components. So you have the machine, you have the different kinds of memories: the banal memories, which are just the registers, the different series groups, and then a special one, the panel, which as you can see is memory mapped to one of the series. And of course you can also have connected machines and the drum. Then, of course, the whole instruction set. You can see the modeling there, the way the instructions are structured depending on their kind, whether it's for drum transfer and so on. All the arithmetic operations have some common parts, so we have some hierarchy there. And of course there is some execution management and tests, and you can see the code there on GitHub. And what's also interesting from the emulation point of view, of
course, is that every operation has to specify the different pieces of information. So for the addition, this is an inner operation, just to show you how it's implemented. Of course, you have to specify the range where you are performing the addition, and this is quite a standard implementation where you just loop over the different digits and propagate the carry. What is interesting is just that you have the base, so that code works both for the binary and for the decimal variant of the machine. So this one is trying to mimic the whole operation. Another approach, much simpler, is just to use the Java operations. For example, for the subtraction, we just translate everything to decimal, perform the subtraction, and then store the result. There is only one thing where we must be careful: we have to use long in Java, because those 12 digits need more than 32 bits. We skip the division. So, the current implementation: we have our prototype just in Java, and we are just using Eclipse as an environment and running the tests. So this is just a test. We have a small interface, which is not yet finished, and you can see here some quick code that just shows the Fibonacci sequence. You can see the result here. I will not go into the details, but you can see there is actually a loop, so there is a jump for 10 iterations, and then you have the different numbers. After a few iterations you have 8, 13 and so on being computed, so it's working. And now I will finish. So what did we learn? It was quite fun, and strange, to look at that machine. It's not so complex to code, but there are many implementation details, as you saw with those range manipulations. And we still have a lot to explore, for example all the floating point, and improving the user interface. And of course we are at the start, so we would like to
really study the code that was used at that time. So in summary, it was and still is very rewarding, from the technical but also from the historical and cultural point of view. Okay, thank you, and if you have questions, you're welcome. You have some references there about all the guys who have worked on that machine. Yeah? Like the core memory, and the rewriting that was required once you read it. So the question is about the core memory, simulating the reading of the memory. Well, it's a good question, because I don't know whether to call this a simulator or an emulator. What we are emulating is kind of an abstraction of the machine, I would say. So one limitation is that we don't really know the physics of the reading. We are assuming that we can read the information reliably and that we don't have any timing issues, if those are the things that are bothering you. But of course we don't have a working machine to compare with, so we can only compare with expected results or with what the older emulator is delivering. Actually, the older emulator had a problem; we discovered a mistake, and it was corrected by the guy who is still maintaining it somehow. But yeah, the point would really be to be able to study the electronic circuitry if we wanted to go to that level, but we don't have one, sorry. Yeah, there's a question from the stream. Is there a compiler for the Gamma 3? A compiler, well. So you could see that assembly language was invented two years before, I think, and the assembly was done manually at that time. So at that time the answer is no, there was no compiler. But today, actually, the guys from ACONIT have developed a compiler for a language that looks like Java, I think,
so yes, you can actually compile from that pseudo-code language into Gamma 3 code. That was done; I didn't try it. The next question is whether the program that I showed was executed from the panel. Well, the panel is just a way to specify the content of the memory. It's the same, but just with wires. But the emulator supports the drum? Yeah, it could load instructions. The drum could contain instructions or it could contain data. Yeah. So the question is about the cycle count of the different instructions. Yeah, we have information about timing for the addition, the subtraction and the different kinds of multiplication, so that's available. And that's a good point, because the emulator is not taking that into account, so it would probably be a good thing to try to reproduce that behavior. Thank you very much. |
I made a GameBoy emulator to learn about computers. And now I work with them...
A brief personal journey in emulator development (with a sprinkle of Rust and WebAssembly) |
All right, so yeah, our last speaker for the day, for this year actually, is German Gomez. And this is his first time doing a talk in general, so he's very nervous. Okay, so this is the title of my talk. It's a bit long, but the short version is at the bottom. I spent some time, a couple of years ago, making a Game Boy emulator, and I'm going to talk to you about it. So first, some introductions: that's my name, and if you want to reach out to me after the conference, those are some of the ways. I work as a software engineer. I don't work on emulators; I use them sometimes, but it's not part of my work. This is mostly just a hobby, so I've done all of this on my own time. And I can emulate the Game Boy Camera as well. So this is what I'm going to talk about today, points 1, 2, 3. I'm going to talk to you about my particular emulator and how you can run it if you want to do so. Afterwards, I'm going to talk more generally about Game Boy emulation and how you can build your own emulator. I'll give some tips that I found useful for debugging. And at the end, some lessons learned and hopefully, if there is time, a demo. So my target audience for this talk is mostly going to be emulator development beginners. I find the Game Boy to be quite beginner-friendly. One reason is that it's very heavily documented, and there are other reasons as well that I'll get to later. If you're interested in Rust and WebAssembly, you're going to see a use case. And if you're just generally a fan of this device, then you might enjoy that also. So why make this in the first place? The main reason, which I'm sure many people here or people making emulators will relate to, is nostalgia. I used to own one of these, so I want to know how it works.
Another reason, more generally speaking, is that this system is very attractive to emulate because there's a huge amount of software out there, so you can spend many hours just trying games and seeing if they work. And if they don't work, then you can spend many more hours trying to fix them. And it's just something I do for fun. I don't work on it much these days, but every time I do, it's a lot of fun. So it's made in Rust. The selling points for Rust are performance and memory safety. My main selling point is that it has a very useful package manager and build tool. It's very quick to prototype things, and I was able to put this together very quickly, actually. And one of the other main reasons I wanted to use it is WebAssembly. The support in Rust is great, so you almost get WebAssembly for free if you use Rust. The tools are very nice. And it runs on the web, because WebAssembly can run in the browser, so it's very portable. That's my phone, that's my PC. It also runs natively, it's not just WebAssembly. So if you want to run it, these are the commands you need to run. There's a native build, a single command. You give it the ROM and it will emulate it. The web build is a few more commands because you have to deploy a web application, but it's very straightforward. It just works. And that's the link if you want to try it. So I'm going to talk about the architecture and emulation. These are the two devices that I emulate. The original Game Boy came out in 1989. It was extremely popular. It was designed to be as cheap as possible, and lots of games were made for it. And it lasted close to 10 years. There were a few revisions in between, but it was mostly the same system. And then, almost 10 years later, Nintendo released the color version, which still has a very similar shape. Internally, the system is also very similar. So the Game Boy Color is like a superset of the original Game Boy.
So these are the two devices that I target. And I have to mention the Game Boy Advance. It's a completely different system. It's ARM-based. It was still backwards compatible, but it's very different under the hood, so I don't support it for the time being. So I'm going to talk about the architecture. If you open the original Game Boy, you'll see a bunch of stuff, but for emulation purposes we only care about those three chips. One of them has the CPU and the pixel processing unit, and the other chips are memory. So I'm going to limit this section to just talking about the CPU and the pixel processing unit, which does graphics, and at the end, to wrap it all up, I'll talk about the memory map, which is basically what allows the CPU and the pixel processing unit to talk to each other. So some basic stats about the CPU. It has 8-bit registers and 16-bit registers. It has 500 instructions, a 16-bit address bus and an 8-bit data bus, and it can run at two different speeds. The original Game Boy could only run at four megahertz, but the Game Boy Color could choose between either of those two speeds. So about the registers and some general information: it has general purpose registers. These are here for intermediate calculations. There's also a flag register, which has information about the last arithmetic instruction that ran. So if you add two numbers together or subtract them and the result is zero, this register will tell you, among other things. The 16-bit registers are basically just the 8-bit general purpose ones used in combinations of two, mostly just for pointers. It has the normal program counter, with the address in memory of the next instruction, a stack pointer for implementing subroutines, and there's a global switch for interrupts. It's a Boolean, so when you set it to zero, the CPU will stop listening to interrupts, such as button presses, until you set it back to one.
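The register layout just described can be sketched as follows. The emulator in the talk is written in Rust; this Python version is only an illustration, and the class and field names are my own. The flag bit positions used here (zero flag in bit 7 of F, carry flag in bit 4) follow the commonly documented Game Boy layout rather than anything stated in the talk.

```python
FLAG_ZERO = 0x80   # set when the last arithmetic result was zero
FLAG_CARRY = 0x10  # set when the last arithmetic operation carried


class Registers:
    """8-bit registers, with the 16-bit HL pair built from H and L."""

    def __init__(self) -> None:
        self.a = self.f = 0
        self.b = self.c = self.d = self.e = self.h = self.l = 0
        self.pc = 0       # program counter: address of the next instruction
        self.sp = 0       # stack pointer, used for subroutines
        self.ime = False  # global interrupt switch

    @property
    def hl(self) -> int:
        return (self.h << 8) | self.l

    @hl.setter
    def hl(self, value: int) -> None:
        self.h = (value >> 8) & 0xFF
        self.l = value & 0xFF

    def set_zero_flag(self, result: int) -> None:
        """Record in F whether an 8-bit result was zero."""
        if result & 0xFF == 0:
            self.f |= FLAG_ZERO
        else:
            self.f &= ~FLAG_ZERO & 0xFF


regs = Registers()
regs.hl = 0xC123          # a pointer into memory
print(hex(regs.h), hex(regs.l))
```

The other pairs (BC, DE, AF) would be built the same way; only HL is shown to keep the sketch short.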
So how can you model this in Rust? It's very simple. This is exactly what it looks like in mine. The state is very simple, it's just a few fields for the registers. So I'm going to talk about instructions. This CPU has 500 instructions. It has the typical instructions that you would expect: memory reads and writes, arithmetic, and branch instructions, so jumps and calls to subroutines. Some of the instructions can be conditional using the F register, and on this website you can see them color coded in a very nice table. So this is at the core of the CPU, this is how you implement the instructions. You have to do three things. First, you have to fetch the instruction from memory using the PC register. Afterwards, you have to decode the instruction, which means figuring out what instruction to run based on the byte that you just read. In C++ you would use a switch statement; in Rust you can use a match statement. And after you decode the instruction, you have to run it. So those are the three things you do: you fetch, you decode and you run, and you do that in a loop, and that's what the CPU does. So this is one example of an instruction. The code is very simple. This is a memory instruction. I'm only going to comment on the return statement. This particular instruction would take eight cycles of the clock on the real CPU, and we need to keep track of this because afterwards we need to use this information to synchronize the rest of the emulator, otherwise it would lead to bugs. So that's why I return the number. Another example of an instruction is an arithmetic instruction, an XOR operation. This one takes four cycles, and it's arithmetic, so it modifies the contents of the F register. And you can look up how to implement every instruction in this PDF. So you do this 500 times. You might make mistakes, but there are ways to fix those, I'll get to those later.
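The fetch, decode and run steps, with each instruction returning its cycle count, can be sketched like this. It is a Python illustration rather than the Rust code from the slides, and the class is my own. Only four opcodes are handled; their encodings and cycle counts (`0x00` NOP, `0x3E` LD A,d8, `0x7E` LD A,(HL), `0xAF` XOR A) are the commonly documented ones, not values quoted in the talk.

```python
class Cpu:
    """A tiny subset of the Game Boy CPU: four opcodes out of roughly 500."""

    def __init__(self, memory: bytearray) -> None:
        self.mem = memory
        self.a = self.f = self.h = self.l = 0
        self.pc = 0

    def fetch(self) -> int:
        byte = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return byte

    def step(self) -> int:
        """Fetch, decode and run one instruction; return the cycles it took."""
        opcode = self.fetch()
        if opcode == 0x00:    # NOP
            return 4
        if opcode == 0x3E:    # LD A, d8: load the next byte into A
            self.a = self.fetch()
            return 8
        if opcode == 0x7E:    # LD A, (HL): memory read through the HL pointer
            self.a = self.mem[(self.h << 8) | self.l]
            return 8
        if opcode == 0xAF:    # XOR A: arithmetic, so it updates F
            self.a ^= self.a
            self.f = 0x80     # zero flag set (A is now 0), other flags cleared
            return 4
        raise NotImplementedError(f"opcode {opcode:#04x}")


program = bytearray([0x3E, 0x42, 0xAF, 0x00])
cpu = Cpu(program)
cycles = [cpu.step() for _ in range(3)]
print(cycles, cpu.a, hex(cpu.f))  # [8, 4, 4] 0 0x80
```

A full implementation grows this chain of branches (or a match statement in Rust) to cover every opcode, which is where the mistakes mentioned above tend to creep in.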
So you do it 500 times and you will end up with a massive match statement or switch statement. The code inside each of the branches is very simple, but it's still error prone. This is an optional thing you can do: because this is going to run very frequently, it doesn't hurt to turn that into a sort of binary search, so you can optimize the code a bit. In Rust this is very straightforward using the match statements. So that's pretty much the CPU. I'm going to switch to the pixel processing unit. This is the chip responsible for graphics. So the Game Boy had an LCD panel. Its size is 160 pixels by 144, with a total of 4 colors, more on the Game Boy Color of course, and it runs at roughly 60 hertz. And the way graphics work on this particular system is by composition of three layers. You have the window layer, the sprite layer and the background layer. Just as the CPU has registers, this device also has registers to program how these layers are composited together. So I'm going to go layer by layer. The first layer is the window layer. This is usually reserved for things like game stats. It's fixed on the LCD; you can move it around, but the graphics within the layer are not movable, they are constrained to a grid. Can anybody guess this game? Yes, Link's Awakening, yeah. So that's Link. Link is a sprite on the sprite layer. Sprites are basically freely movable objects on the LCD. You can have 40 in total, and they come in two different sizes, programmable by registers again, along with other things like color, position, orientation and things like that. And finally the background layer, which I think is the most interesting one. It's basically a grid of 32 by 32 tiles, each tile is 8 by 8, so the total size is 256 by 256. So it doesn't fit on the LCD screen, but you can scroll it using registers. And furthermore, the scrolling wraps around, so you can be clever and implement infinite scrolling that way.
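The wrap-around scrolling of the background layer comes down to a little modular arithmetic. The sketch below is a Python illustration with names of my own; it assumes the 32 by 32 tile map is stored row by row and that the two scroll registers hold a pixel offset, which matches the description above but is not code from the talk.

```python
TILE_SIZE = 8      # each tile is 8 x 8 pixels
MAP_TILES = 32     # the background is 32 x 32 tiles
MAP_PIXELS = TILE_SIZE * MAP_TILES  # 256 x 256 pixels in total


def background_lookup(screen_x: int, screen_y: int,
                      scroll_x: int, scroll_y: int) -> tuple[int, int, int]:
    """Map an LCD pixel to (tile map index, x within tile, y within tile)."""
    # The modulo is what makes the scrolling wrap around the 256 x 256 map.
    bg_x = (screen_x + scroll_x) % MAP_PIXELS
    bg_y = (screen_y + scroll_y) % MAP_PIXELS
    tile_index = (bg_y // TILE_SIZE) * MAP_TILES + bg_x // TILE_SIZE
    return tile_index, bg_x % TILE_SIZE, bg_y % TILE_SIZE


# Bottom-right LCD pixel with no scrolling: tile row 17, column 19.
print(background_lookup(159, 143, 0, 0))  # (563, 7, 7)
# Scrolled almost a full map width: the lookup wraps back to column 0.
print(background_lookup(10, 0, 250, 0))   # (0, 4, 0)
```

A game that keeps increasing the scroll value while rewriting the tiles that are currently off screen gets the infinite scrolling mentioned above.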
So there are more registers, I don't have time to talk about all of them, but there's a link. By today's standards, graphics-wise, this system cannot do much, but there are games that are quite clever within these limitations. So this is one example. It's not really a game, it's more of a technical demo, but still. This particular example is using the background layer only, and it's modifying the scrolling register, so it's actually moving it around the screen. However, it's changing the value of the register on every single line, and what this accomplishes is a vertical stretching effect. At the same time they are stretching the Nintendo logo horizontally in memory, you can see it right there, and in combination these two things make it look like they are zooming in on the Nintendo logo, which is something that the Game Boy cannot do in hardware. They work around this by combining hardware and software, so I think it's quite interesting. And there are many more examples of games being clever, this is one. So implementation-wise, this pixel processing unit is a bit more tricky to implement than the CPU, and because of that it is the source of most of my bugs. And this game is easy to recognize, it's Tony Hawk.
So the reason it's tricky to implement correctly is that we need to keep the CPU and the pixel processing unit in constant sync. That's the reason I was returning the number of cycles from each instruction before, and if you don't do it accurately enough, it leads to stuff like this happening. However, I found that most games don't really care, most games are quite forgiving of inaccuracies. Every now and then you will encounter a situation like this. In this particular example the rest of the game looks fine, it's only the intro screen that is glitchy. And I think this is one of the reasons why the Game Boy is a good, beginner-friendly emulation project: you don't need to be super accurate to emulate most games. So yeah, this is how you would implement the synchronization, this is how I do it. On each iteration step, you run the CPU for one instruction, and it gives you the number of cycles that it took. Then you use that to synchronize the rest of the components, so you feed it to the rest of the components so that they catch up to the CPU. And you do this forever. Basically, this loop right here is the core of this emulator. There are a few more things, like getting the image onto the screen and so on, but conceptually this is an emulator, it's very simple.
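The synchronization loop just described can be sketched as follows. This is a Python illustration, not the author's Rust loop: `NopCpu` is a stand-in CPU that only ever executes four-cycle instructions, and the timing constants (456 cycles per scanline, 154 scanlines per frame including the blanking period) are the commonly documented Game Boy values, not numbers given in the talk.

```python
CYCLES_PER_LINE = 456   # commonly documented length of one scanline
LINES_PER_FRAME = 154   # 144 visible lines plus the vertical blanking lines


class Ppu:
    """Counts cycles handed over by the CPU and turns them into lines and frames."""

    def __init__(self) -> None:
        self.cycles = 0
        self.line = 0
        self.frames = 0

    def step(self, cycles: int) -> None:
        self.cycles += cycles
        while self.cycles >= CYCLES_PER_LINE:
            self.cycles -= CYCLES_PER_LINE
            self.line += 1
            if self.line == LINES_PER_FRAME:
                self.line = 0
                self.frames += 1


class NopCpu:
    """Stand-in CPU: every step is a four-cycle instruction."""

    def __init__(self) -> None:
        self.steps = 0

    def step(self) -> int:
        self.steps += 1
        return 4


def run(cpu: NopCpu, ppu: Ppu, frames: int) -> None:
    """Core loop: run one instruction, then let the PPU catch up."""
    while ppu.frames < frames:
        cycles = cpu.step()
        ppu.step(cycles)


cpu, ppu = NopCpu(), Ppu()
run(cpu, ppu, frames=1)
print(cpu.steps)  # 17556 instructions of 4 cycles = 70224 cycles per frame
```

A real emulator would feed the same cycle count to the timers and the sound hardware as well, and would stop at each frame to present the image.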
So I've talked about the CPU and the pixel processing unit. Both have registers, but they are separate things on the circuit board, so the CPU needs to be able to modify the registers of the pixel processing unit. The way this is done is through memory, because every register that is not a CPU register is exposed in memory. By reading and writing particular values at particular addresses in memory, you can modify the registers of these devices. The memory map looks a bit like this. You have the cartridge right there. The video RAM and work RAM are the same size, because they are those two other chips on the circuit board, and they are the exact same chip. There are other things; the buttons themselves are inside the I/O registers. There are some regions that are a bit special: you are not allowed to write to this region for some reason. And there are other details; this link has technical documentation of the rest of the map in detail. 
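A memory map like the one described above can be sketched as a single bus type that routes each access by address range. This is a simplified sketch, not the speaker's implementation: the `Bus` type is hypothetical, the address ranges follow public Game Boy documentation, and several regions (external RAM, sprite memory, high RAM, the interrupt enable register) are left out and simply read back as `0xFF`.

```rust
/// Hypothetical, simplified memory bus.
struct Bus {
    rom: Vec<u8>,       // cartridge ROM, 0x0000..=0x7FFF
    vram: [u8; 0x2000], // video RAM,     0x8000..=0x9FFF
    wram: [u8; 0x2000], // work RAM,      0xC000..=0xDFFF
    io: [u8; 0x80],     // I/O registers, 0xFF00..=0xFF7F
}

impl Bus {
    fn new(rom: Vec<u8>) -> Self {
        Bus {
            rom,
            vram: [0; 0x2000],
            wram: [0; 0x2000],
            io: [0; 0x80],
        }
    }

    /// Routes a read to the device that owns the address.
    fn read(&self, addr: u16) -> u8 {
        match addr {
            0x0000..=0x7FFF => self.rom.get(addr as usize).copied().unwrap_or(0xFF),
            0x8000..=0x9FFF => self.vram[(addr - 0x8000) as usize],
            0xC000..=0xDFFF => self.wram[(addr - 0xC000) as usize],
            0xFF00..=0xFF7F => self.io[(addr - 0xFF00) as usize],
            _ => 0xFF, // regions this sketch does not model
        }
    }

    /// Routes a write to the device that owns the address.
    fn write(&mut self, addr: u16, value: u8) {
        match addr {
            // ROM is read-only in this sketch (real cartridges may
            // interpret writes here as bank-switching commands).
            0x0000..=0x7FFF => {}
            0x8000..=0x9FFF => self.vram[(addr - 0x8000) as usize] = value,
            0xC000..=0xDFFF => self.wram[(addr - 0xC000) as usize] = value,
            0xFF00..=0xFF7F => self.io[(addr - 0xFF00) as usize] = value,
            _ => {} // ignored: regions this sketch does not model
        }
    }
}
```

In a fuller emulator the I/O range would forward to the pixel processing unit, the buttons and the sound hardware instead of a plain array, and some of those registers would reject writes.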
So implementing the memory is quite easy. You just list every single component and every single register, a bit like this: the cartridge, the video RAM, the pixel processing unit registers, the buttons, the sound registers, the interrupt controller. Then you need to be able to read from them, so based on the address range you route the read to the appropriate device, and you need a similar method for writing values. Some of the values will be read-only, so keep that in mind. At this point maybe you will have a sort of working emulator, but if it is your first emulator, as was my case, then you will run into bugs, and I found they can be a bit tricky compared to other types of software. So there are a few strategies you can follow in order to track down bugs. The first one is that, because there is so much documentation about the Game Boy, you can turn it into unit tests for particular sections of the hardware. The other reason why the Game Boy is so beginner-friendly is that you can actually run diagnostics: there are ROMs available that you can run, and they will tell you where you have issues. So if you make a mistake in the CPU, which is likely, then this particular ROM will tell you what the mistake was, and you can also integrate this into your testing framework to run in CI, for extra credit. The next tip is debugging, and I am going to show debugging using an example. After you have an emulator, the logical next step is to build a debugger for it, because it will teach you things about the games you are running, but it will also show you where you might be making mistakes. In this particular example, when I run this game, at the moment it doesn't work. This is what it looks like: it just gives you a black screen, so there is nothing going on. But if you spend time making a debugger, then you can start 
finding clues. In this case, I spent some time just getting the instructions, the registers and the disassembly, which is very useful. In this particular example, I know what the issue with this game is: it is reading a value from this address and expecting a value that is never there. This address corresponds to something called a DMA transfer, and what this tells me is that I have made a mistake in emulating it, so I can go to that particular section of my project and fix it. But I haven't fixed it yet, because I found it quite recently, and also I found that it is a lot more fun to add debugging features than it is to fix the issues themselves, and I've been a bit busy recently. So that's the end of my technical talk, and I'm going to finish with some conclusions. This is my favorite glitch, by the way; it only happens when you set the name to a particular name. It is very weird. So writing an emulator, at least a Game Boy emulator, is the easy part of emulating a Game Boy. Like I said, there's tons of documentation, and the hard part of the work has been done by other people who have been kind enough to write down their findings, so I just have to read the information, interpret it, and turn it into a program. I will keep that in mind when I move to the next system to emulate, because it might not be as easy. Most games, as I said, are forgiving of inaccuracies. This one is more of an issue with my emulator, but most games are forgiving of inaccuracies in the graphics, so this is yet another reason why it's friendly for beginners. And finally, WebAssembly and Rust are great. If you use Rust, using WebAssembly is very natural, and the support is great. I have a small demo; it runs in the browser. So that's the LCD; I'm also drawing the video memory and the color palettes. One of the things you can emulate on the Game Boy is the camera that was made for it, so if you load the camera on this application, it will request permission for the camera, but 
I've shown the picture at the beginning. If you cancel the permission request, it will still boot, so it has a fallback: it cannot get the webcam, because I haven't given it permission, but it can still put the file in there. I think you can play games with it, but I don't know how it works. But that's the demo, so that's it from me. Can I just break in? I'm leaving immediately, but as you go out, please continue your questions and your discussion, and please look around you, take any garbage that you see in the room here and put it in the back. If a lot of people help, it's not much work; otherwise we will be here forever. Thank you. I have a question. Can I modify it in such a way that I can mess with the logic of the game? The question was, can I modify particular things happening in the different games? Do you know about these trainers? No, I don't. Can I implement something like a GameShark to cheat on games? Yes, I could. The emulator is built as a library, so you can use it as a library and read and write arbitrary bytes at arbitrary addresses, so you could potentially build something like that, yes. Thank you. You have a single loop, where every part... Sorry, what? Okay. Yeah, I know. Well, my question was that you have a single loop that processes the CPU. Yes. What if you were emulating with Rust a system where you want to have different threads for different peripherals, but they are all accessing the memory? Wouldn't Rust have an issue with that? So, can I use Rust to run things in different threads without problems? Probably yes, but that was a can of worms that I didn't want to open. And also, if the system is simple enough, like this one, you don't really need to optimize like that. It can all run in a single thread. 
But for a more complex device, sure, I would have to investigate more on that. But I didn't have to do that on this one. Yeah. Why did you pick Rust? Was there any reason that you did not select C++? Yeah, why did I pick Rust over something like C++? Rust is what I use for my personal projects. It's what I like using. And you know Rust better than C++? Yeah. And the processor is a 6502, or is it? So the question was what the processor is. Yeah, it's not a 6502. I think it's a mix of a Zilog Z80 and an Intel 8080, so it's like a combination of the two. I think it is; I'm not really sure. You split your match up into the binary sets. Did you actually benchmark that? Because I thought the compiler would have just translated it into a jump table. And we're going to get kicked out. I'll be honest, I didn't benchmark that. |
Welcome to the online Energy Devroom |
Good morning and welcome to the Energy Dev Room. My name is Nico Rikken and I've been active in both the energy sector and the free and open source software community for many years now. Thank you for tuning in. The moment this recording is broadcast, I should be at FOSDEM and in the chat to welcome you there as well. But as finding a quiet place with good internet can be tricky, I pre-recorded this message. In the last couple of years, I've seen free and open source software starting to take over the energy sector. Being active in the energy sector, I thought it would be cool to host a dedicated energy dev room at FOSDEM. The FOSDEM team gave us the opportunity, we organized the dev room and you responded. We were overwhelmed by the number of interesting proposals you submitted. They ranged from home automation to green software and energy system modeling. By adding an online section and limiting the duration of the talks, we were able to accommodate most of them. The speakers will certainly have much more to share, so please ask them your questions in the chat. Energy of course is a timely subject. Our society runs on cheap and reliable energy, something we can no longer take for granted. Scientists warn that we are acting too slowly and need to ramp up the adoption of renewable energy. The Russian invasion of Ukraine made matters worse by disrupting supplies and causing a price hike. It is clear that our energy system is under pressure. Citizens are protesting and businesses are shutting down. Society needs solutions and demands them. Unfortunately it is not just a matter of installing wind turbines and solar panels. The decentralized and weather-bound nature of these generation sources requires changes in energy management and distribution. While energy generation, consumption and distribution differ per country, we can speed up the worldwide energy transition by a tremendous amount if we prevent the usual reinvention of the digital wheels. 
This is the mission that unites us here at FOSDEM in this energy dev room. Today's presentations cover software projects that model, manage and optimize energy. Unfortunately, these solutions alone will not be sufficient to solve the energy crisis and limit climate change. More impactful and direct changes will be necessary to achieve those goals, like changes in the way in which we live and consume. But these solutions can be the building blocks of our future energy system. The projects presented here today, besides being great projects, are available under a free and open source software license. This is important because it enables collaboration, mass adoption and customizations to fit specific use cases. This really fits the challenge of the energy sector, in which everything is in a state of change and systems will have to integrate to fulfill their purpose. I hope you will take this opportunity today to learn, get inspired and strengthen the free and open source energy software community. So please, share your questions and thoughts in the chat room. Presenters have been asked to answer questions in chat or in the Q&A livestream. In the afternoon, the dev room will be on campus. Of course we hope to see you there, but otherwise you can continue online by watching the livestream. The only thing left is to thank those that made this all possible. Together with my fellow dev room organizers, Nikolas Heunig, Kai Uwe Hermann, Dan Brown and Anna-Linda Helsen, I would like to thank the FOSDEM organizers and volunteers, all the people that put in the effort to submit proposals, all selected speakers and all of you for participating. I hope you have a great conference. Thank you. |
Energy policy by the European Commission
Brief overview of policies and opportunities for collaboration |
Ladies and gentlemen, members of the open-source community at the FOSDEM event, I am very happy to speak to you about the EU's energy policy and I would like to ask you for your help. But first of all, I would like to pay tribute to Shuli Goodman. The energy and drive that she brought to the global open-source community, to use it to fight climate change, will really be greatly missed, and my thoughts go out to her wife, her son, her family and friends. Now, to the topic of today. Of course, the Green Deal is nothing new to you. We have targets for 2030 and 2050 in terms of becoming climate neutral in Europe, in terms of renewables and in terms of energy efficiency, that are very, very ambitious. But as if that was not difficult and ambitious enough already, since Russia invaded Ukraine, we have agreed to step up our ambition even further through the REPowerEU plan. So we revisited the targets and we are changing the pace, not the direction. We are still going in the same direction. We want to become climate neutral, but everything has to be faster. And becoming climate neutral depends, first of all, on bringing more renewables into the system. Indeed, this is a system challenge. It's about more electricity, but also backup and storage in other forms of energy. It's about a more flexible, a smarter and a more digital energy system. And this is a system challenge that no one party can solve on its own. We need to work together to change it from production to consumption and everything in between: the grids, the networks, the transport. So this means that we need to share solutions, and sharing solutions in a time like this, when the system needs to be smarter and more digital, is all about sharing data, so that together we can optimize the system based on good information, models and automated systems that know when to use what energy, so that the use of renewables can be optimized in the system. 
And of course, when we talk about sharing data and when we talk about sharing solutions, open source is the excellent way to go. In the Commission, we know that this is important, so we want to support it. We do this in energy in particular through the Horizon Europe work program, where you may have seen that several calls in the areas of wind, solar, energy sharing and smart grids all refer to the use of open source as a way to support solutions. And of course, open source starts with people talking to each other and understanding each other's problems and realizing that by working together, they can find better, more sustainable solutions that have a better chance of scaling up. That is also why we support this through Horizon Europe and through the digitalization of energy action plan that came out in October of last year. So that's why I would like to ask for your help to keep the open source community vibrant: not just discuss it today at the FOSDEM event, but work together beyond this event, work together beyond individual products and beyond individual business cases. Because after all, open source is a state of mind, a state of mind that we are going to need for the energy transition, and it translates beyond software and beyond solutions. It is needed for the system change that we are looking for. So I wish you a very successful event today, a very successful session, and I'll be very curious to hear what comes out, and I hope that we will stay in touch. Thank you very much for your attention. |
What the energy industry can learn from how open source technology has transformed other traditional industries |
All right, so we're here for the energy FOSDEM panel, and I will first introduce myself, Ferdinanda Ponci. I'm a professor for monitoring and distributed control for power systems. I'm with the Institute for Automation of Complex Power Systems at RWTH Aachen in Germany. I work on solutions, microservice-based architectures for measuring, monitoring and controlling this energy grid undergoing the transition, so packed with renewables, with new types of users, with all kinds of energy storage systems. I have collaborated with Linux Foundation Energy to prepare the 2022 survey on transformation readiness for the electrical system. We were targeting utilities to find out how ready they are for the digital transition. I just want to mention that this effort was initiated and driven by Dr. Shuli Goodman, with whom I had the privilege to work and the joy to know, and who will forever be my role model. Okay, so I'm looking forward to hearing the experiences from other industries as far as the digital transition is concerned. I would like our panelists to first introduce themselves. So please, Kate Stewart, if you want to start. My name is Kate Stewart, I'm the VP of Dependable Embedded Systems at the Linux Foundation. My focus has been what we need to do to make open source projects and systems dependable in the embedded space, and obviously, given the energy sector and the critical nature of it, this is something that's very near and dear to the things I care about. Thank you, Kate, great. So we'll go in first-name alphabetical order now, Daniel. Yes, hi, I'm Dan Cauchy, I'm the executive director responsible for Automotive Grade Linux. Before that, I was the VP and GM of the automotive business unit at a company called MontaVista, and we were the first to put Linux in a car, that was my division, and we were also the first to put Linux in a mobile phone, which was the Motorola Razr mobile phone. So that's kind of my background. 
Thank you, Dan. Thanks a lot. Gabriele, please. Well, first of all, thank you for the opportunity here. I'm Gabriele Columbro. I have two roles. One is general manager for Linux Foundation Europe. This is one of the areas that makes it really interesting for me to be here. The role that brings me here is executive director of FINOS, the FinTech Open Source Foundation. We really spent the last few years waking up the financial services industry to the benefits of open source, and we are now seeing a lot of potential for this industry to better collaborate in the open. So I'm happy to share my experience here. Wonderful. Thank you. And Ranny, please. Yeah, hi. I'm Ranny Haiby. I'm the CTO of Networking, Edge and Access at the Linux Foundation, where I work with the telecommunication industry stakeholders on building the technology required for not just the digitization of that industry. As you know, the telecommunication industry has to almost reinvent itself every 10 years or so with the different generations of communication, the 4G, 5G, and we're already looking at 6G, and open source plays a large role in accelerating this innovation. I'm sure there are many similarities between the telecommunication industry, which is heavily regulated, and the energy industry, which can be used to learn and figure out how to do things right. Thank you so much. It already sounds promising and great, good. So I guess we can get started, and I guess we can all agree that the energy sector is really at the tipping point, and that it requires a transformation for decarbonizing the energy system, and to meet the decarbonization goals, the digital transition is a necessary step. Each of you works in an industry that has undergone a similar transformation before. So based on your experience in your own industry, what should the energy stakeholders keep in mind to make this transition as efficient as possible? 
Would you want to start, maybe you, Kate? So I think one of the things that's important to keep in mind here, as people are trying to transition, is that these sorts of journeys go better with company. Finding people that you can collaborate with, and effectively forming communities, will help build up more consensus, get more eyes on the problem, and will accelerate the quality coming into play. So I think that is the one tip that they should look at. We're seeing lots and lots of innovation happening in the embedded space right now, between the wind, the solar, and so forth, and a lot of microgrids and some of the new grid technologies emerging that are using open source technologies. So finding projects that you're interested in, that are relevant to you, and then starting to collaborate to extend them to make sure they fit your use case, is pretty much why open source tends to work. Thank you so much. Anybody who wants to follow up, maybe Dan? Yeah, I'll say a few words. My advice to the energy sector would be to identify a handful, maybe three to five companies that are key leaders or thought leaders in this space at the moment, and really get them to buy into this digitization and new open source model. I'll use Automotive Grade Linux, which we also refer to as AGL, as an example. We identified the top car makers like Toyota, Mazda, Honda, Hyundai, Mercedes, et cetera, and we got them involved very, very early. By doing so, it caused the entire supply chain underneath them to pay attention, but also it caused them to say, hey, we need to get behind this because our biggest clients are behind it. Right? If Toyota is behind this effort, we need to get behind this effort. And together, they formed this community, which now has over 150 companies. That would be my advice to the energy sector. 
You need at least the top dogs, you know, the top three to five companies, or at least the thought leaders, the ones that are innovative, to buy into what you're trying to do and really change the way the supply chain and the community will be working in the future. That would be my advice. Thank you so much. Very, very direct. Gabriele, do you want to say something? Yeah, absolutely. I certainly want to echo what Kate and Dan shared. It is not lost on me that, especially when you're talking about a regulated industry like finance or energy, and of course we will hear from telco, it is really hard to change an industry through open source if you don't have both a broad community and a community that really has the market leaders and the key players. It's not lost on me that FINOS wouldn't have experienced this growth if we didn't have the top 10 investment banks in the world already part of the community from the get-go. If I can add something specific to the nature of the industry: I come from open source communities, I'm a developer, I've always been very fond of the value of open source in technology, in terms of talent, in terms of collaboration, in terms of the solutions that it can provide. But when you're trying to build a community that is vertically focused on a specific industry, it's really important to hone in on the business value that can be driven out of open source, not just the technology value. In my world that has meant finding use cases and strategic challenges that are important for these large firms, but not just for those: you really need to involve every constituent in the value chain of that specific industry. In our case that means not only the banks but, of course, fintechs, the data and technology vendors that work in the industry, and, importantly, regulators. 
And know that there is sometimes an impedance mismatch in terms of their understanding of open source, and sometimes a fear of open source, but in the long term it is super valuable if you can shift left that engagement to the point that, maybe before my grandkids grow old, they come up with open source machine-readable regulation themselves. So that's a little bit of a call to involve all the different constituents, of course while having the top leaders involved. Thank you so much. Very interesting. Yeah. So I think something that I heard in the telecommunication industry that kind of slowed down the adoption of open source was the misconception that the industry is so unique, with its requirement for high availability and being regulated, that we need to invent everything from scratch and all the technology needs to be unique. In reality, there's some grain of truth to that, and there are some unique requirements. But if you really look at the required technology, I would say 90% of it can be reused from other industries, maybe the cloud service providers, maybe even from enterprise. By using that, maybe only 10% of the software needs to be unique per industry, but there's a lot that can be reused. So there's no need to reinvent the wheel. There's a lot that can be adopted. That's so true. Thank you. All right. So now that we had this round of tips from the different industries, maybe we want to go deeper into what the role of open source is specifically in your industry. How did open source impact digital transformation within your industry, so that we can make the case for the energy sector as well? Yeah, I can say a few words. In the case of automotive, it was quite challenging because it wasn't necessarily a technology discussion. It was a discussion with lawyers, because the automotive industry is very much risk averse. 
So everything they do is to mitigate risk and liability. The misconception was that open source was not secure and not safe and that it was a bunch of hackers doing stuff on some website. We had to go to these automotive companies and teach them that this is not the case, and we used a lot of methods to do this. One of the methods is that the Linux Foundation has some very good reports that show that open source is actually more secure. There are more eyeballs. There are more developers looking at the source code to make sure that it doesn't contain any nefarious code. That is unlike closed source, where one bad employee could really inject some bad code, and if that employee leaves someday, a lot of bad things can happen. So we had to educate the industry about all these misconceptions. We had to educate specifically the legal departments inside these car companies to make them understand that, no, this is really, really good, high quality software. In fact, this open source software runs most of the internet as we know it. Once they started understanding that, we started explaining to them that it's actually potentially less liability for them. I'll use the Toyota example. If Toyota is using a piece of open source software and Mazda is using it and Mercedes is using it and Volkswagen is using it, they're all using the same piece of software. There's a lot more safety in that, in that they're all using the same code. Their engineers are all looking at the same code and they're all debugging the same code. When they find an issue, they give the fix back to the community and everybody benefits. Once they understood this process and the value of open source, it became easier for them to adopt it. So that's how we approached it. Thank you. Very good advice. I was going to jump in with something similar, sort of echoing the last two speakers. 
Definitely financial services has the special snowflake syndrome that we heard about from the networking side of the house. I think it's generally related to the fact that sometimes they hide behind the regulated nature to say there are just things that we cannot do, but the reality is that, once you dig in, there's nothing preventing them from doing open source. So I think one of the ways open source has changed the industry is making them realize more and more that 80 to 90 percent of what they do is non-differentiating, and building it themselves is just a waste of resources when, if you think about it, this industry is now heavily competing with big tech and the West Coast really aggressively entering the financial market. So this push to be more efficient found a really good solution in open source. The second, related topic that I think is really changing the industry is the talent aspect. There are plenty of articles talking about how, 10 years ago, banks like Goldman Sachs or JP Morgan had even internal programming languages; they went to the very bottom layer of the stack, building everything proprietary and everything customized. Maybe 10 years ago, that was considered the secret sauce for them. In the last three years, there's been a clear realization that this is actually a weight, something that can become a weakness, because it's really hard to hire talent that you then need to train up on your internal platform, versus, for example, JP Morgan using the Python ecosystem, which every single day builds new components and new talent. So besides the differentiation, I think open source is really creating a whole new generation of talent that hopefully is going to result in a better financial system across the world. Thanks a lot. 
Thanks a lot for this critical topic, talent and training. Yeah, if I could, I'd like to build on what Dan and Gabriele are saying. Open source tends to be the base of innovation, and there are components out there that are used widely across all of our sectors. For instance, I think you'll see the Linux kernel, for one, is in a lot of these places already. I think something like 70% of all embedded applications are already running Linux. So we've got that innovation there. If you can leverage what is out there already, to the points we've just made, and focus on the value add and your differentiation, it's a much more efficient strategy. And quite frankly, it de-risks things to a large extent. Ranny, what's your perspective? Yeah, to add to everything that's already been said, I think when some people think about FOSS, they think about the free as in free beer, that it's cheap and inferior software. But in reality, it's more about the free as in free spirit, where everybody can take part in creating the technology. I think Gabriele alluded to that shift left, where the consumers of technology can work side by side, or shoulder to shoulder, with the creators of the technology. That leads to much shorter cycles of innovation. So it really opens up a new way for, let's say, the energy providers to work with their vendors and to have a short feedback loop for what their requirements and needs are, making sure that the software developed really addresses their needs. Open source communities and open source projects provide the platform for this type of collaboration. Thank you so much. Yes, the relationship with the customer is key right now in the energy sector, of course. Good. So I guess that after looking at the strengths, we can look at the potential pitfalls. What pitfalls should stakeholders look out for in the transition process? Maybe I can start here. 
I know I care about some of these things pretty deeply. I think first off is security. Make sure that the projects you're basing things on are sustainable projects: it's not just a single developer. If you're going to base something on a project, make sure that you have some diversity of community behind it, but also, quite frankly, that the community is paying attention to security and focuses on practices in that space. So making sure you understand the security story of the open source projects you want to incorporate would be the first recommendation. The second recommendation is knowing exactly which projects you're bringing in, because a lot of these projects will bring in dependencies. So think of technologies like software bills of materials, or SBOMs, and having a clear line of sight to all of the components you've brought in and their implicit dependencies. That is a function of security, but it's also a function of safety, because you cannot go through any safety or regulatory process without knowing in detail exactly which pieces you're bringing in. So, one, for the projects you're bringing in, make sure that you've got a sustainable ecosystem behind them and that they pay attention to security, and two, know all the dependencies that they're bringing in. Thank you, Kate. Dan, do you want to go? Yeah, I think this may sound strange, but I think one of the pitfalls is that companies want to form a consortium and they're all really eager to do it, they're all, oh my god, let's do this, this is great, and then there's no funding. That's the biggest pitfall in my opinion. 
You have to fund these projects, because these projects don't run themselves. The software is free and open, and you can go download it, sure, but the project itself, in order to build the community, needs at least some decent amount of funding. A lot of companies don't realize that, and they think they can throw a few thousand dollars at it and it's going to be fine. But raising funds to run the project and to build the community, having community management in place and having events, things like all-member meetings, these things are quite important to build a community, in my opinion. Number two, I would say, if you're going to be creating original code for the project, the choice of license is quite important. In most cases you have to pick a very business-friendly license, otherwise the adoption of the code will be really bad. There's a lot of precedent for which license to pick, so I don't think it's very difficult, but you have to make sure it's a business-friendly license for the members. 
I just want to connect with what the last three folks said. I've heard so many times, especially in the early days, this idea of, yeah, let's put this project out there and then the community will come. The reality is that it's not that easy. Besides, of course, considering foundations as a help to grow your project (a shameless plug here), I want to hit on something that Rene said. When you put out an open source project, either solo or in a foundation, you always want to consider how it is going to create an actual ecosystem, and potentially a commercially viable ecosystem. Ultimately, we all know that open source is free as in free spirit, but there's a really valuable virtuous cycle that can be built around an open source project, with the idea of increasing the sustainability of that project, both in terms of reinvestment, touching on the funding that Dan mentioned, and in terms of the sustainability of the maintainers that Kate touched upon. So I think it's really important, when you think about an open source project, to consider how you draw those bright lines and maintain expectations that fulfill the needs of individual developers in the community and are aligned with the ethos of open source, but that ultimately, again touching on the license choice, don't undermine the potential creation of a commercial ecosystem around that project, which is ultimately going to be what fuels the sustainability of the project in the long run. I admit this is still more an art than a science, but it's something that is becoming more and more mainstream, and I think it's important that the energy sector thinks about this as well. 
Yeah, and I would like to summarize, maybe, by saying: don't try to do too much open source at once. It may sound crazy, because we're advocating here for open source, but I've seen some companies trying to do too much open source at once and then diluting their resources, not having the funding, not having the personnel. So my advice would be to identify where it makes the most sense to start with open source, start there, and then grow your open source involvement as you go. You don't need to go from zero to 60 in three seconds. It can be more gradual, and it makes sense to make it more gradual: learn from your experience and then expand more into open source over time. So, I guess when developers and industries in the energy sector try to undertake this transformation, they may find it intimidating; the complexity of it may be intimidating. What is your advice to those who have to undertake this journey in the energy sector for decarbonizing the energy system? I would maybe start, again tying in to my last comment, with: start small. Yes, open source can be intimidating, and things can go in the wrong direction real fast if you don't do it right. So it's really important to identify where it makes the most sense to start, start with something small, and then build on that. Yeah, I agree with that. You need to identify maybe the biggest pain point, maybe the top two or three things that companies are dealing with and struggling with, and start chipping away at those things. In the case of Automotive Grade Linux, we first tackled infotainment, because this was the biggest pain point for the car makers. They were not keeping up with the mobile phone. Eventually we went on to support the instrument cluster, heads-up display and telematics. So now we support all sorts of things in the vehicle, but we started with the most important one, which was the biggest pain point for our members and for our community. 
That would be my advice: pick one to start with, go with that, and you'll build on that success eventually. I'll also say, taking the angle of going out and being in the community: find projects related to the things that you care about and have people sitting there and volunteering to do things, so that you're seen as active in the community and you are contributing back. Then, as you start to find the people that care about the same problems as you, you have a basis to build from, but you also know the behaviors of the community and you know the points to watch out for, as has been alluded to already. I mean, you mentioned both corporates and individuals here, and I couldn't agree more on the corporate side in terms of starting small, finding high-value challenges and, of course, ideally seeding the solution with an initial contribution or a project that is already out there and can be augmented. I also think we take a lot of pride in trying to engage individual developers and showing the value of the work that we're doing in our community and in financial services to individuals, whether that means incentivizing through an award or whether it means getting your next big job in a large financial institution. 
I hear a lot of talk about this, and I realize that some of our communities, being very professional grade, can be considered daunting for an individual who maybe comes out of school and wants to engage in open source. But the reality is that on the other side there's always a maintainer who is in need of help. So when I hear about a lot of imposter syndrome, my best suggestion, again for individuals (we talked a lot about corporates), is this: even if you are engaging with a potentially very complex, highly professional open source project, just understand that on the other side there's always someone in need of help. Doing your due diligence, reading the manual, reading the documentation and complying with the contribution guidelines is important, but don't be scared, take the leap. It's almost like when you're jumping off a cliff: just the jumping part is hard, after that you're on your way. Thank you so much, this was great. We got advice for the entire sector and down to the individuals, so thank you very much. I think our time is up. I want to thank the panelists for a very, very interesting discussion and for energizing me and hopefully the entire energy sector towards this transition. Thank you all and have a nice evening. |
Challenges in Home Energy Management
How to best use your own PV-generated power |
A warm welcome. This is a quick presentation on deploying energy management in households and on the challenges that we have encountered, from hardware to software stack, all the way up to layer 8, that is, the household owners. Analyzing the potential for energy savings, there are on average three major consumers worth investigating, and once you have exchanged your fossil heating for a heat pump, all of those are electric, so saving the planet is all about saving electric energy. Well, short of going vegan maybe, but I love my daily burger. So it's the car and, just like in computing, it's air conditioning and the big-iron appliances that cause the most headaches. Exchanging those for more efficient models means a major investment, and usually it doesn't really buy you any relevant savings in energy. But there is hope. There is still potential for energy cost savings if you have PV power and have proper control over when you consume that power. So how does that work? For example, doing the washing typically takes around a kilowatt hour. I myself just got a price increase to 46 cents. So that's my energy cost when I'm doing my washing at night using power from the grid. Now, if you have an energy management system, it can postpone the washing to around noon the next day, or more precisely to when there is excess PV power available from your roof. Then the cost is only the feed-in compensation you forgo, which is seven cents in my case. Figures will vary between countries; in Germany, moving all your laundry and dishwasher runs will save you more than a hundred euros a year. Potential savings in EV charging and heating are even larger. And now here's your bonus for listening rather than just skipping through abstracts: this would also work without a solar plant if you're using a variable power tariff. You might know those: you pay about the price at the power exchange plus a transmission fee. Savings are smaller, but still well worth it. 
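The washing example boils down to simple arithmetic, sketched here in Python. This is only an illustration under stated assumptions: the 46 ct/kWh grid price is an assumed figure for illustration, the 7 ct/kWh feed-in compensation and the roughly 1 kWh per run are the values quoted in the talk, and the number of runs per year is made up.

```python
# Illustrative arithmetic for shifting appliance runs from grid power to PV excess.
# ASSUMPTION: 46 ct/kWh grid price is an assumed value for illustration.
# The 7 ct/kWh feed-in compensation and ~1 kWh per run are quoted in the talk.
GRID_PRICE_CT_PER_KWH = 46.0
FEED_IN_CT_PER_KWH = 7.0


def run_cost_ct(energy_kwh: float, price_ct_per_kwh: float) -> float:
    """Cost of one appliance run in cents."""
    return energy_kwh * price_ct_per_kwh


def yearly_saving_eur(runs_per_year: int, energy_kwh_per_run: float = 1.0) -> float:
    """Saving in euros when every run uses PV excess instead of grid power.

    Using your own PV power is not free: you forgo the feed-in
    compensation, so that is the effective price of the shifted run.
    """
    grid = run_cost_ct(energy_kwh_per_run, GRID_PRICE_CT_PER_KWH)
    pv = run_cost_ct(energy_kwh_per_run, FEED_IN_CT_PER_KWH)
    return runs_per_year * (grid - pv) / 100.0


if __name__ == "__main__":
    # Hypothetical household: 260 washing and dishwasher runs per year.
    print(f"{yearly_saving_eur(260):.2f} EUR per year")
```

With these assumed inputs the saving per shifted run is 39 cents, so a few hundred runs a year are enough to pass the hundred-euro mark mentioned above.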
Now on to the challenges. First is getting hold of an EV or a heat pump in good time, and of a competent installer. That's a problem of its own, let alone for a good price. Well, let's skip that. Next is housing. In apartment buildings there are unresolved problems in ascribing the cost of PV and EV charging energy to single users, and only building owners will possess the required information on wiring, electric and HVAC devices, contracts with utility companies and contractors, and so on. And often they will not give any renter permission to modify anything, let alone property management companies that fight even balcony plants just because of how they look. So effectively, energy management deployments are limited to home owners who have physical and logical access to all relevant devices. So how does it work once your home qualifies for it? You need a computer on-site that can talk to the inverter, EV charger, heat pump and your white goods devices. So the first hurdle is to physically connect to those. Local control rules: running in the cloud is unreliable for technical and other reasons. Finding a proper state machine is a challenge, and you probably know all the risks to operations and financials with cloud services when suddenly that company drops out of service. Almost all inverters are Modbus-based. Most only have a serial connector, and there are serial-to-LAN or Wi-Fi adapters to take care of that. So technically it is not a big deal, but often it's a deal breaker when the installer on-site doesn't know, and the ones in the know, the software people, don't have physical access to the site. Modbus is a simple protocol, but then again every vendor has its own list of registers and logic. To be fair, most are offering specs for download, though not so much about the logic, and information is often incomplete and hard to get hold of. But the harder part is that there's no standardization in that. So implementers like ourselves have to figure it out for every single device. 
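Because every vendor ships its own register list, an implementer typically ends up keeping a per-device register map behind one common interface. The following Python sketch illustrates that idea only. The vendor names, register addresses and scale factors are invented, and `read_register` is a hypothetical callable standing in for whatever Modbus client is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class RegisterSpec:
    address: int   # 16-bit register address (invented for this sketch)
    scale: float   # factor to convert the raw value into watts
    signed: bool   # whether the raw value is a two's-complement int16


# HYPOTHETICAL register maps: real devices differ, which is exactly the problem.
REGISTER_MAPS: Dict[str, Dict[str, RegisterSpec]] = {
    "vendor_a": {
        "pv_power_w": RegisterSpec(address=100, scale=1.0, signed=False),
        "grid_power_w": RegisterSpec(address=102, scale=1.0, signed=True),
    },
    "vendor_b": {
        "pv_power_w": RegisterSpec(address=40083, scale=10.0, signed=False),
        "grid_power_w": RegisterSpec(address=40097, scale=10.0, signed=True),
    },
}


def decode(raw: int, spec: RegisterSpec) -> float:
    """Turn a raw 16-bit register value into a physical quantity."""
    if spec.signed and raw >= 0x8000:
        raw -= 0x10000  # interpret as negative int16
    return raw * spec.scale


def read_value(vendor: str, name: str, read_register: Callable[[int], int]) -> float:
    """Read one abstract quantity, hiding the vendor-specific register layout."""
    spec = REGISTER_MAPS[vendor][name]
    return decode(read_register(spec.address), spec)


if __name__ == "__main__":
    # Fake device memory instead of a real Modbus connection.
    fake_device = {40083: 250, 40097: 0xFF9C}
    print(read_value("vendor_b", "pv_power_w", fake_device.__getitem__))
    print(read_value("vendor_b", "grid_power_w", fake_device.__getitem__))
```

The rest of the system only asks for `pv_power_w` or `grid_power_w`; supporting another device then means adding one more map rather than touching the control logic.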
For chargers there's no wireline equivalent, and those put up another kind of challenge. There is no single unified logic in talking to those. There are also next-gen protocols such as EEBUS or OCPP, and those are becoming available in more and more field-deployed hardware. Chargers can all be controlled, because that used to be a prerequisite in many countries to get the installation government funded. However, control usually means app only, right? Some even use a Bluetooth link, and both are challenges when making them part of any automation. Now, connecting devices is where openHAB and, even more so, its community are great. This is what we use in our product. In the energy device domain there are openHAB bindings for more or less every officially published API. Bindings are modules that translate device-specific communications to the abstract representation layer that openHAB is based on. Many open-source-minded people have figured out how to control their own devices and thankfully shared all of this information. So together we have compiled a pretty good knowledge base. For chargers we moved on to EVCC, the Electric Vehicle Charge Controller, which is another great open source software. It's dedicated to handling charging and it has a web API too. So now we only have a single API, albeit a more complex one, and a unified logic layer to cope with for chargers. Oh yeah, this one is my favorite lowlight slide. Access to inverters in the field is often limited. All relevant device settings, such as those to enable external control and all Modbus and SG Ready related parameters, are on admin level. You might know that from the HVAC industry, but with inverters most installers don't even hand the password of the device to the owner of the device. With some inverters the admin account is even linked to the installer personally, and he has signed with the vendor that he would not pass it on, so he is really taking that seriously. And all of that is the result of successful lobbying in the past. 
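Coming back to charger control: one thing a dedicated charge controller has to do is turn the available PV surplus into a current setpoint for the wallbox. The Python sketch below shows only that arithmetic and is not how EVCC implements it. The 230 V phase voltage, the 6 A lower limit and the 16 A upper limit are assumptions chosen for illustration.

```python
# Converting PV surplus power into a wallbox current setpoint.
# ASSUMPTIONS (illustrative only, not taken from the talk or from EVCC):
PHASE_VOLTAGE_V = 230   # nominal phase voltage
MIN_CURRENT_A = 6       # below this the charger is assumed unable to charge
MAX_CURRENT_A = 16      # assumed hardware limit of the wallbox


def charge_current_a(surplus_w: float, phases: int = 1) -> int:
    """Return a whole-ampere setpoint, or 0 if the surplus is too small."""
    amps = int(surplus_w // (PHASE_VOLTAGE_V * phases))
    if amps < MIN_CURRENT_A:
        return 0  # not enough surplus: pause charging rather than draw from the grid
    return min(amps, MAX_CURRENT_A)


if __name__ == "__main__":
    for surplus in (1000, 2000, 5000):
        print(surplus, "W ->", charge_current_a(surplus), "A")
```

The gap between zero and the minimum current is why small surpluses cannot be used for charging at all, and why a coordinating layer has to decide which consumer gets the power instead.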
Grid operators, for example, denied certification to inverter vendors if they didn't implement this sort of crappy restriction. And installers are also very much intimidated by grid operators and afraid of losing support or getting sued or whatever, although there is absolutely no contractual relationship between the installer and the grid operator. That only exists with the home owner, the guy who owns the device, but he in turn is denied access to it. It's particularly nasty when it comes to battery control. Some greedy energy companies, albeit regulated on that, don't want users to feed the grid from the battery, and they manage to prevent this from happening that way. Fortunately, there is still hope. At least in Germany there have been major changes starting this year, but let's see how long it will take for them to apply to the installed base. Another lowlight along those lines is smart meter usage. Proper energy management requires metering consumption, but you have to install your own smart meter, even though you get an official one installed together with your solar plant, because you effectively cannot access that one's readings, let alone in real time. That is true even of the so-called intelligent meters, those that have a gateway, because their data is only being sent to the grid operator. That one makes it available in a portal if you're lucky, but you don't get it in real time. Ultimately, you have to install another meter of your own. That in turn normally requires you to get an electrician. There are some sensors, like a Shelly, that you can install yourself, but to many owners the electric cabinet door is really the line they don't dare to cross. There are complex electric setups out there with multiple inverters, usually from different brands, that we often encounter in upgraded older installations, and I'm not even counting balcony plants. 
Owners of heat pumps or night storage heaters also often make use of dedicated tariffs, and those require another meter. Connecting to heat pumps themselves is rather simple using the SG Ready interface. That's a two-bit wireline interface to signal that cheap power is available. Nonetheless, in reality, only very few HVAC installers set up SG Ready, because to them it's the electrician's job, and that one of course doesn't care to return once the PV is working. Any layman is allowed to set it up himself. It is low voltage, which means you won't risk your life and insurance, but few know that, it's badly documented, and so most home owners are unaware or afraid of doing it themselves. And the devil is in the details. During the recent installation of my very own heat pump, they put in the wires but didn't connect them on the pump side, so I had to call support. They sent me a picture of where to connect the low-voltage wires, and I got warned that opening the enclosure alone would expose me to the risk of touching 400 volts DC. So, should there not be any FOSDEM presentation of mine next year... There are drawbacks to SG Ready, but the worst thing about it is that energy management happens inside the inverter, and it is one-on-one only. So cheap power assignment to a heat pump competes and conflicts with all other major consumers, such as the EV charger. Device vendors will keep improving their capabilities, but inverters will keep failing to get the comprehensive picture right, and that's needed to coordinate all household devices. There are many competitors for power. Chargers likewise: there are some that already do their own metering to allow for excess-only charging, but this too will be a short-lived kink at best. 
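The coordination problem described above, several consumers competing for the same PV surplus, can be sketched as a small priority-based allocation in Python. This is a minimal illustration under stated assumptions, not the logic of any product mentioned in the talk: the consumer names, power limits and priorities are invented, and a real system would also have to deal with measurement noise, minimum run times and switching delays.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Consumer:
    name: str
    min_power_w: float  # below this the device cannot usefully run
    max_power_w: float  # most it can take
    priority: int       # lower number is served first


def allocate(surplus_w: float, consumers: List[Consumer]) -> Dict[str, float]:
    """Hand out PV surplus to consumers in priority order.

    A consumer is only switched on if the remaining surplus covers its
    minimum power; otherwise it is skipped and the next one gets a chance.
    """
    remaining = max(surplus_w, 0.0)
    plan: Dict[str, float] = {}
    for consumer in sorted(consumers, key=lambda c: c.priority):
        if remaining >= consumer.min_power_w:
            grant = min(remaining, consumer.max_power_w)
        else:
            grant = 0.0
        plan[consumer.name] = grant
        remaining -= grant
    return plan


if __name__ == "__main__":
    # HYPOTHETICAL household; all figures are made up for illustration.
    household = [
        Consumer("heat_pump", min_power_w=800, max_power_w=2000, priority=1),
        Consumer("ev_charger", min_power_w=1400, max_power_w=3700, priority=2),
        Consumer("washer", min_power_w=1000, max_power_w=2000, priority=3),
    ]
    print(allocate(3000, household))
```

The point of doing this in one place is that a single component sees all consumers at once, which a one-on-one signal between inverter and heat pump cannot do.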
Now, at the latest with variable tariffs coming up, local control like that can even have a negative impact on cost, because the cheapest times will be shifting around and will often be at night, and that is right when the existing excess-power-based control will result in exactly the opposite of what it is supposed to accomplish. I'm anxiously waiting to tell my classic grid operator goodbye. I really believe that Tibber is the new Twitter. Final slide. Saving on energy and cost is fine, and features such as notifications when the washing is done are a bonus, but they are not enough to get people to change their long-established practice of operating household appliances. They would rather stop using a new system than change their habits. So for an energy management system to get accepted, it must not enforce changes to handling. Connecting existing white goods is surprisingly simple. You don't need new smart-home-connected devices, because any washing program of a classic white goods device, once started, can be interrupted and proceeds when you restore power. So any Shelly or other plug socket switch with metering capabilities will do. Automation is more consistent than any human, and it doesn't get tired of doing the savings math. So you no longer have to worry and remember all day long when best to start your electric appliance. For those who know this type of problem, automation can really get you some peace of mind. So to conclude, deploy your own energy management system, an open source based one of course, and get peace rather than expensive, intelligent devices. Become the sovereign of your energy usage. Thank you for listening. |
Obstacles to open source in building energy technology
An analysis of the German research landscape |
Hello everyone at the energy devroom at FOSDEM. My name is Felix Riemann and I'm happy to be here and present you some insights from projects at the Einstein Center Digital Future and TU Berlin, both located in the heart of Berlin. Today I'm focusing on some technologies and some obstacles we find when dealing with energy technologies and related open source software in the German research landscape. To give you a short introduction, I'm going to talk today about why we need open source software for buildings and energy, how buildings impact the climate, how open source can reduce that impact, and what the current state of open source is. I'm going to give you a brief review of funded projects. I'm going to present some major obstacles we identified in the usage of open source and how researchers are applying it currently, and then, last but not least, I'm going to give you an outlook on where we will be in five years. But let's start with a short introduction of our job and our goal and why we think FOSDEM is the ideal place to be. We are the accompanying research. What we are doing is supporting more than 300 research projects and surveying them yearly through different means, and we focus on different aspects of digitalization. For example, we look at data governance, we look at which tools they apply and for what reasons they apply these tools, and we try to support and connect projects. This is especially why we're happy to be here at FOSDEM: we are coming from an energy perspective, and we would like to get to know more people in the open source community, get feedback from you, and foster the exchange between the national research community and the international open source community. Our goal is supporting standardization and integration of software and standards, so that other people can apply and reuse the solutions of the researchers we support. 
And why do we think residential buildings, or buildings in general, and neighborhoods are especially relevant for that case? If we look at Germany, around 35% of the final energy usage is related to buildings, around two thirds of that is related to residential buildings, and the majority of that is for heating. As you can see here on the left, there's some empirical data on buildings' energy usage: a lot of that energy is actually used in old buildings. What we can see from that is that older buildings use more energy than new buildings for a variety of reasons, e.g. different insulation standards. We can also see that we need to focus on the specifics of buildings, especially older buildings, and keep in mind that, with the long lifespan of buildings, around 30-plus years as you can see here, tomorrow's buildings are built today. So if we want to be climate neutral by 2045, we need to build climate-neutral buildings now. Open source technology can help with that for a variety of reasons, especially in three strategies. It can help reduce the demand through the installation of better insulation: if you, for example, have open planning tools, you can choose the right approach, where to insulate, what to insulate and what is cost effective. It can also help replace technology: you can, for example, find the perfect heat pump for your place, which can replace the boiler you are using. Or you can have better control strategies, which are mostly applied in non-residential buildings: if you, for example, have a demand-oriented or grid-oriented supply approach, that also reduces the total energy demand or makes it more grid friendly. 
So digital technologies are essential for climate-friendly buildings for these three reasons, and we need software to help plan and run these buildings. Our goal is to make this software open source, because it can foster the transformation, it can make it cheaper, and it can make it more transparent and also faster. How can it do that? For example, I have five different life phases of a building here. There is, for example, the planning phase, with which we normally start, and here we can apply open computer-aided design: if we have a better orientation of a building, we also have less energy demand, because the sun is actually helping with the heating. If we are dealing with the built environment, we have a variety of tasks, for example as-built classification, so that we know which boiler is actually installed in the building and not only which one was planned, which is a huge hassle in a lot of actual building environments. If we go to the usage phase, monitoring can be used to deploy building control strategies. We can, for example, see, oh, the heater is running while we also have an open window, which is suboptimal, so we can identify faults. If we look at the renovation phase of a building, we can choose the right fit between technology and insulation. And last but not least, if we go to a more life-cycle-oriented approach: 
Material databases can improve the recycling quota; for example, we need to identify which material is built in where, how it is used and how we can reuse it. And how is the German Federal Ministry for Economic Affairs and Climate Action helping to shape that? As you can see on the left, it roughly spends more than 100 million each year on research projects, so that's quite a lot. Just a short takeaway from this slide: in the last few years the funding has been shifting from buildings to neighborhoods and to district heating networks. We can also see, if you look at the general information, that integration and linkage of different technologies is becoming more and more important, and hence, for example, the funding for neighborhoods has been growing instead of isolated topics. And what are the topics that are being developed? In 2021 we did a survey. We surveyed 179 projects, and out of these projects around 128 said they were developing or using some kind of digital application. If you're wondering why there's a huge gap, to be honest, we are also wondering, but we also asked the projects to exclude their answers if they are, for example, using Excel as the software they are developing with. We see that actually quite often, because if you focus, for example, on energy planners, who are mostly self-employed or in very, very tiny offices, they actually use quite a lot of Excel. That's helping them, but it's not the kind of software we are looking at, so that software is excluded here. The kind of topics you see on the left side are mostly focused on simulation, operation optimization, monitoring, energy management and learning tools. What you can also see is that none of the projects is actually focused on social aspects, so there's a huge gap in how we can bring social aspects and digital aspects together in buildings and neighborhoods. So if you're looking for a research project, there it is. And what did the projects think about open source? 
A previous project by some colleagues of ours did a survey in 2018, and they found that only a minority, roughly 3% of surveyed projects, planned for a full open source release of the developed tools and software. So that's not a lot, is it? And most of the software uses at least one proprietary tool. Keep in mind that if they want to build software, they use on average four to five other digital tools. We call it a toolchain, and I will talk about this a bit more later. The majority of these tools, around 70%, are not open source. So there's not a lot of open source being released, and there's also not a lot of open source being used. What might be some reasons for that? There's a paper by some colleagues, Stefan Pfenninger et al., from 2017, and they found a variety of reasons. They focused on data and open source software, and some of the reasons they found are, e.g., ethical and security concerns: sensitive and personal information might be included, and for a variety of reasons everyone is afraid they might overshare personal and sensitive information. Then there is unwanted exposure: if your stuff is public, everyone can find the mistakes you make. Personally, I think it's important to have stuff public, because otherwise no one can find the mistakes, someone refers to you, and then they repeat your mistakes. So it's important to have stuff open so that mistakes can be identified, but that might also be a reason for some people not to do it, because someone might identify their mistakes. Then we have the protection of intellectual property, and I will talk about this a little bit more later, but expertise is a business model, or can be a business model. And then we have institutional and personal inertia: long-running practices are very hard to overcome, especially for huge organizations that apply a different standard and a different methodology. Let's continue with the obstacles we find in our research landscape, especially related to buildings, as this paper was more
focused on energy systems engineering. I grouped them into three categories: starting with technical obstacles, I will continue with cultural obstacles and finish with financial obstacles. Especially in buildings and related technology, we have a huge heterogeneity in data. As you saw on a previous slide, we have quite an age gap, with a variety of energy demands and a variety of energy-related technology, and this also makes the software people deploy or develop for it quite heterogeneous, because you have to, for example, identify different data points, you have, for example, different ICT, and so on. So on the one hand, some may think it's not worth publishing software anyway; on the other hand, some software may not be applicable at all elsewhere, because it's focused on a very special aspect. So people think, oh, why make it open source? But we think, hey, there might be at least a second person, and it's public money. Then the toolchains: if one part of the toolchain is not documented at all, or is not open source or open science, it cannot be reused. We think we need more modular and better documented toolchains, so the individual components need to be documented and well understood and then integrated into a chain. Instead of focusing on a complete chain, people should focus on the components. And then we have missing open software, which is especially relevant for CAD and solvers; in some areas there's just a huge gap, and this leads to technical obstacles in making the complete system open source. As we can see on the right, the basic technical prerequisite is that the interfaces fit, and we think that is relevant for all of these technical obstacles. So we need to have fitting interfaces for the data, for the toolchains and, last but not least, for the missing open source software, so we can have a whole ecosystem of software. Let's continue with the cultural obstacles. We have identified a variety of cultural obstacles, and one might be surprising, but we actually have a lack of development
skill. In our understanding, software development is underrepresented, especially in mechanical engineering education, at least in Germany; that's our impression. This also leads to a lack of common criteria for software quality. Additionally, in research, software is often seen as a tool rather than an output, so the researchers focus on publishing papers instead of publishing software, and publishing the software is often overlooked or doesn't happen at all. And last but not least, and I think this has more to do with the usage phase of software: we did a survey in 2021, as previously said, and out of the surveyed projects only a minority said they are testing the software with their users. In our opinion this reduces the applicability, and even if the software is open source, no one is using it, because it's not tested and evaluated and the users don't understand it. So we need a bigger responsibility for, or a bigger focus on, user testing, so that the people who are supposed to use the software in the end are actually using it. Last but not least, we have the financial obstacles, and the first one is especially relevant in our case. We quite often have the end of funding: when a PhD is done or a project is finished, you have to write your report, but the report is just text, and with that, often no one focuses anymore on developing or publishing the software. We need to find a financial structure, or a funding structure, that also focuses on publishing all outputs in a well-documented and well-understood way. Then we have business interests. Research is also a business; having tools only for yourself can help you get new funding and can help you stand out. And quite often we now see commercial alternatives, especially with project partners wanting to have a long-term service agreement, so they rather choose a commercial alternative, as those can provide a long-term service. Currently the maintenance and use of the software and the
development of the software often ends with the project. A participant of a workshop told us that it's important to find value beyond the project in order to have a community around deployed open source, and this is why we need to focus on value added to practice beyond your research project. Let me talk a little bit more about the toolchain example mentioned previously. Quite often we have two people looking to build software for the same solution. For example, you see user A up here, and he wants to build software for a demonstration, and user B wants to reuse that software, but she's missing a license, so she cannot reuse the tool or the software completely and has to rebuild it. So we have a lot of lost effort and a lot of inefficiencies due to that, and if you keep in mind that on average four or five tools are used to build software in a research project, this actually happens quite often. So we think partial open source is not enough. We need modularized and well-understood open source, and only this can foster an open source culture, as the software is only open source when all of its parts, so to speak the complete toolchain, are understood, well documented and open source themselves. This leads me to the recommendations. I only brought six, but we can discuss this later. As already said with the toolchains, I think modularized development is important: if each of the building blocks of your overall software is well understood, documented and open source, this enables maintenance and reusability of the whole software. Then we have more user-focused software development, so that the software can actually be used, redeveloped and maintained after the project. We also need to focus on standardized interfaces, especially with building technology and buildings, as everyone is using some kind of object orientation but not the same one, so we cannot apply the same interfaces, and so on and so on. And then we think it's
important to publish this data in one place. We already have a lot of software, documentation and related papers, but there's currently no general web page that collects and publishes everything that is being funded by public money, so that's also an important step. Then there is focusing on essentials: crucial software, e.g. monitoring, which every project uses and which every building can use to save energy, should be focused on. And last but not least, adapted funding, because for maintaining software we often need ongoing funding structures. Let me come to the last part of the presentation. You heard a lot about obstacles, and I gave you a brief introduction to which stepping stones have been missing in the past, but what's the current state of open source? On the left you see EnArgus, which is a database where publicly funded projects are listed, and if you look up open source you find around 446 projects, at least as of the beginning of January. The positive thing is that there has been a sharp increase in recent years: only 16 of these 446 projects were started before 2010. But it is quite often unclear what is considered open source, and, especially if you keep the toolchains in mind, if only part of a toolchain is open source, the whole can still not be reusable in an open source way. But I think there's been a sharp increase, and we have a variety of open source projects listed. And where will we be in the next years? If we think about the future, for the decarbonization and digitization of the energy system we need quite some building blocks, e.g. infrastructure, high-quality data and the involvement of people, as already said. We did a survey again in November 2020 and asked around 250 researchers whether there needs to be a promotion of open standard tools like EnergyPlus and corresponding libraries, and the majority of them agreed. So with that outlook, we think a lot of researchers and even the publicly funded structure are shifting
to a more open source approach and are considering the need for integration of these different open source approaches. Let me finish the presentation. I'm happy to answer any questions here in the chat or per email. Thank you very much for your attention, and have a great day at FOSDEM. |
EVerest: AC and DC electric vehicle charging with open source software and hardware |
Hello, my name is Kai and I'm going to talk to you about EVerest: AC and DC electric vehicle charging with open source software and hardware. First, a few words about myself. I have a background in computer science and robotics and I've been working at Pionix on the EVerest project since early 2021. So how do you actually charge a car? Most of you who have electric vehicles will probably be familiar with these methods, but I'm just going to recap them real quick so everybody is on the same baseline here. You have your basic AC charging, where you have a portable charger at home that you just plug into a wall socket, or maybe even a wall-mounted charging station that can charge your car with up to 11 or 22 kilowatts. In public, you sometimes still see these slow AC chargers where you maybe even have to bring your own cable. You just plug that one in, plug it into your car, and you authorize with an RFID card or maybe even an app, and then charging is properly billed to your account. There's an alternative to that, which I would call smart AC charging, with ISO 15118 or maybe even Plug and Charge, which is a much more secure way of authorizing your charging session with a backend provider. And what's probably very interesting for the crowd at this presentation is the future possibility of bidirectional AC charging. Think about vehicle-to-grid and vehicle-to-load scenarios where the car can be used as a solar battery for your home: you charge it when the sun is shining and energy is cheap, and then you use that energy at times when the grid is stressed a little bit and you want to reduce your demand on the electricity grid, so you just discharge your car and use it as a battery for your home. Also something that people will be most familiar with is DC charging, using the DIN SPEC and the ISO norm again.
These are usually the big highway fast chargers where you can charge with up to 200 or 300 kilowatts, but there are also smaller units for the home, think about DC-DC solar systems and things like that. And also here, what's probably very exciting for all of you is the upcoming bidirectional DC charging, taking energy back out of the car again. What is EVerest? It's a complete software stack for EV chargers. It runs on basically any embedded Linux platform out there, it is released under the Apache 2.0 license, and the aim is to support as many different hardware platforms as possible. In this talk we're going to mostly focus on building our own charger with an open hardware design that I will present later on. So, some of the features that EVerest has: it's built on a very modular architecture where different modules can do very specific things and then communicate with each other over MQTT. There is also a graphical setup web interface that you can use to configure different topologies of chargers, you can see some examples here on the slides, and you can use the same web interface to do energy management configuration as well. Next I'm going to quickly go through the steps that you would have to take to use this graphical web interface to configure your own charging station. First we start with an EVSE manager. This is a module that owns a charging connector and takes care of the charging logic and the whole charging session handling, and it orchestrates all the other modules' access to this one connector. Now we add a board support package, which in this case is the Yeti driver module, which will handle all the control pilot handling, the access to the relays and the reading of, for example, the RCD currents. Now we add an energy manager. This can be just a very simple configuration; a more advanced one I will show you in a few slides.
Following that we need an authentication mechanism. Here we add an authentication manager as well as two token providers that we will be able to authenticate our charging session with. In the next step we can add some cloud connectivity. In this example we add an OCPP 1.6 JSON module as well as a power meter via Modbus and a system module that supports the rebooting and firmware update of the charging station via OCPP. And in the last step we add an API module so that external applications can talk to the EVerest system and read out some telemetry but also control the charging session. As I mentioned before, you can use the same graphical configuration interface to also configure the energy management. Here you can see a more complex energy distribution tree to be able to load balance multiple charging stations. Here we add an energy manager as a root node, add a 22 amp fuse to our grid connection, and then as children of that fuse we can add smaller fuses that then connect to the EVSE managers underneath them. These EVSE managers now have different cars connected with different charging goals, and the energy management system is able to schedule charging with a global optimizer so that every car gets the most optimal charging schedule assigned to it. EVerest also comes with software-in-the-loop and hardware-in-the-loop simulation facilities, and it implements a lot of protocols that are relevant in the EV charging space at the moment, like OCPP 1.6, with 2.0.1 support coming very soon. We have support for ISO 15118 AC and DC, for the DIN SPEC, and for the basic PWM charging. We also have the possibility to communicate with Modbus devices, think about external power meters for example, and also an API over MQTT where you can get some data about the charging session to maybe integrate into your home automation system.
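The energy distribution tree described above can be sketched in a few lines of Python. This is only an illustration of how fuse limits propagate through such a tree, assuming a simple proportional split; it is not EVerest's global optimizer. The names `Node`, `demand` and `allocate`, the 16 amp child fuses and the requested currents are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A fuse (inner node) or an EVSE manager (leaf) in the energy tree."""
    name: str
    limit_amps: float             # fuse rating or connector maximum
    request_amps: float = 0.0     # current the car asks for (leaves only)
    children: list = field(default_factory=list)


def demand(node: Node) -> float:
    """Current this subtree could draw, capped by its own limit."""
    if not node.children:
        return min(node.request_amps, node.limit_amps)
    return min(node.limit_amps, sum(demand(c) for c in node.children))


def allocate(node: Node, available: float) -> dict:
    """Split `available` amps proportionally among the leaves below `node`."""
    budget = min(available, demand(node))
    if not node.children:
        return {node.name: budget}
    total = sum(demand(c) for c in node.children)
    scale = min(1.0, budget / total) if total > 0 else 0.0
    plan = {}
    for child in node.children:
        plan.update(allocate(child, demand(child) * scale))
    return plan


# Hypothetical topology: 22 A grid fuse, two 16 A sub-fuses, three EVSEs.
tree = Node("grid_fuse", 22, children=[
    Node("fuse_a", 16, children=[Node("evse1", 16, 16), Node("evse2", 16, 16)]),
    Node("fuse_b", 16, children=[Node("evse3", 16, 10)]),
])
plan = allocate(tree, available=22)
print(plan)
```

In this sketch no leaf ever receives more than it requested and no fuse carries more than its rating. A real scheduler would also have to take charging goals, time and minimum charging currents into account.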
EVerest itself is written in C++17, but there are also language bindings for Python and JavaScript available, so you can write modules in all three of these languages, whichever suits your needs the most. So, let's talk about the basic PWM charging. The car and the charging station can communicate over the so-called control pilot signal. This is just a plus/minus 12 volt signal where the car can lower the positive part of the signal to a specific voltage by adding load resistors and a diode. For example, 9 volts signals that the car is connected, and 6 volts means that the car actually wants to charge. The charging station can then use a PWM duty cycle to encode the available current for the car to draw. This is typically between 6 and 32 amps. So, how do you actually build one of these AC chargers? The good news is that an AC charger is not a complicated battery charger. That part happens in the onboard charger in the car. The AC charger is just a smart relay. So what you typically need is only a power path, so a mains connection, some relays, an RCD for safety, optionally maybe a power meter if you want to know how much your car has charged already, plus a microcontroller to interface with this control pilot signal. If you want to do some more advanced things, a Linux board is usually a good idea to have as well. I'm now going to talk about the open hardware design that we've released, the Yeti and the Yak boards. They are available under this GitHub repository and are released under the CERN Open Hardware Licence version 2 in the permissive flavor. This hardware design has been developed to be as developer-friendly as possible, so it includes a lot of features. It's obviously not optimized for cost savings or ease of manufacturing, but it has a lot of very exciting features, so you can build all kinds of charging stations on top of these designs. It's been designed in KiCad 6, and case design files for 3D printing are also available.
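The basic PWM charging described earlier can be made concrete with a small Python sketch. It assumes the commonly cited IEC 61851-1 mapping of 0.6 amps per percent of duty cycle, which holds for duty cycles from 10 to 85 percent, and it treats 12 volts as the level with no car connected. The function names and the 1 volt tolerance are hypothetical and not taken from the Yeti firmware.

```python
def duty_cycle_percent(max_current_amps: float) -> float:
    """Duty cycle that advertises the given current limit to the car."""
    if not 6.0 <= max_current_amps <= 51.0:
        raise ValueError("outside the 6..51 A range covered by this sketch")
    return max_current_amps * 5.0 / 3.0   # same as amps / 0.6


def current_limit_amps(duty_percent: float) -> float:
    """Inverse mapping: what the car reads from the PWM signal."""
    if not 10.0 <= duty_percent <= 85.0:
        raise ValueError("outside the 10..85 % range covered by this sketch")
    return duty_percent * 0.6


def cp_state(positive_volts: float, tolerance: float = 1.0) -> str:
    """Classify the positive control pilot level as sampled by the charger."""
    levels = (
        (12.0, "no car connected"),
        (9.0, "car connected"),
        (6.0, "car wants to charge"),
    )
    for nominal, meaning in levels:
        if abs(positive_volts - nominal) <= tolerance:
            return meaning
    return "unknown"


print(duty_cycle_percent(16.0))   # duty cycle for a 16 A limit
print(cp_state(8.8))              # a sampled level close to 9 V
```

The typical 6 to 32 amp range mentioned in the talk corresponds to roughly 10 to 53 percent duty cycle under this mapping.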
So let's talk about the first of these hardware designs, which is the Yeti power board. It is a 22 kW AC three-phase power board. Here on the lower left you can see a block diagram of this power board and on the right some pictures of the upper and the lower side of the board. Let's talk about the features that the Yeti board has. It is capable of doing the control pilot signal generation as well as the control pilot signal sampling in sync with the PWM signal. It also has onboard relays for three-phase power switching with welding detection, and three-phase power metering support with up to 8 kHz of sampling. There is the possibility to measure voltages, currents, power and frequencies of all phases plus the neutral. There is an integrated RCD module which can detect DC ground faults as well as AC faults, and it can output the measured leakage current as telemetry. There is also a 10-pin connector for a high-level board to control the Yeti board over UART. This is also used to connect the Yeti to our Yak high-level board design, which I will talk about later. If you want to use the Yeti as a stand-alone charger, which is totally possible, there is also an external connector for a small LCD. You can also add Modbus devices such as external power meters, we have some external GPIOs on this board, and the board itself can be powered just by the 110 V or 230 V mains connection with an internal power supply, which is also capable of supplying the Yak board. But you can also connect an external 12 V supply if you so choose. This board also has a lot more features, which you can look up under this link. The Yeti comes with an STM32 microcontroller on board, and the firmware for this microcontroller is also available on our GitHub page. It's licensed under the Apache 2.0 license, and the purpose of this microcontroller firmware is to control all the devices on the Yeti board. All the code relevant to electrical safety is encapsulated in that firmware.
On top of that it also does all of the communication of the Yeti board over the UART, using protobuf, with a high-level communication board and then with the EVerest software. How do you use this Yeti board? You can either use it as a stand-alone charger or you can use it as a power path for a smart charger. You can also configure it to switch automatically between these modes: in case the higher-level Linux board fails for some reason, you can still continue as a stand-alone charger. If you want to use the Yeti board as a stand-alone charger, it is a complete AC charger for electric vehicles supporting the basic charging I talked about earlier. This means it contains the complete charging logic that you need, and a car will charge immediately when you connect it to the board. There is also a UART connection that you can use to observe the status of the charging session and to have limited control over the charging system, such as pausing and resuming charging. This mode is what we call the high-level control mode of the firmware. But you can also use the Yeti board as a power path for a smart charger. Here you would switch it into the so-called low-level control mode, just with a UART command, and then you must provide the charging logic externally. Only the basic state machine, which is essential for electrical safety, remains in the microcontroller. An external board is then able to set the PWM duty cycle and to read back the control pilot events. This is also the mode that EVerest uses to enable the so-called high-level charging using ISO 15118 or the DIN SPEC. I will now explain what this high-level charging mode is. It uses power line communication on top of the control pilot PWM signal. It literally uses the same wire, using the HomePlug Green PHY standard, and the following steps need to be done to create a successful high-level charging session. First, a logical network between the charger and the car is set up using SLAC.
Then IPv6 link-local addresses are set up on both sides. The car will then send a UDP broadcast to find the charger, and the charger replies with its IP address and port number. A TCP/TLS connection is then created from the car to the charger, and over that the ISO 15118 protocol is spoken, which is XML data encoded in a binary XML representation called EXI. Now I am going to talk to you about the Yak high-level control board. Here on the right side you can see a few photos of one of these boards assembled, and on the left side you see a block diagram of this high-level control board. This is used to run EVerest on an embedded Linux system. Some of the features of this Yak control board: it can host a Raspberry Pi Compute Module 4, which is basically the system where you run your Linux on. It has a 10-pin connector for a direct connection to the Yeti board, a real-time clock with a backup battery, and a power line communication Green PHY modem for doing the high-level charging communication with the car that I just talked about. There is also a UART and power connector populated for popular RFID modules, and there is RS485 Modbus connectivity. You have a CAN bus available, you have Ethernet, wireless LAN, Bluetooth and USB ports, there is even a USB client port to be able to flash the flash storage of the Compute Module 4, and of course you have lots of external GPIOs to play with. Now we have everything that we need to put together a basic but also smart charging station. So from right to left: you just need a mains three-phase power-in plug, you need one of these Yeti power boards, you plug that in, and on the other side you plug in a Type 2 connector for your car. If you then plug this into your car you are already good to go, and you will be able to charge your electric vehicle with up to 22 kW if the vehicle supports it.
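The order of the high-level charging setup steps described earlier (SLAC, link-local addressing, UDP discovery, TCP/TLS, then ISO 15118 messages in EXI) can be captured in a tiny Python sketch. It only tracks the order of the stages and performs no networking. The class and member names are hypothetical and not part of EVerest.

```python
from enum import Enum


class Stage(Enum):
    SLAC = 1               # logical network over power line communication
    IPV6_LINK_LOCAL = 2    # link-local addresses on both sides
    UDP_DISCOVERY = 3      # car broadcasts, charger answers with IP and port
    TCP_TLS = 4            # car opens the connection to the charger
    ISO15118_EXI = 5       # ISO 15118 messages, encoded as EXI


class SessionSetup:
    """Tracks which setup stages are done and enforces their order."""

    def __init__(self) -> None:
        self.completed = []

    def complete(self, stage: Stage) -> None:
        order = list(Stage)    # Enum members iterate in definition order
        if len(self.completed) == len(order):
            raise RuntimeError("setup already finished")
        expected = order[len(self.completed)]
        if stage is not expected:
            raise RuntimeError(f"expected {expected.name}, got {stage.name}")
        self.completed.append(stage)

    @property
    def ready(self) -> bool:
        return len(self.completed) == len(Stage)


setup = SessionSetup()
for stage in Stage:
    setup.complete(stage)
print(setup.ready)
```

Trying to complete a stage out of order, for example `TCP_TLS` before `SLAC`, raises an error in this sketch, which mirrors the fact that each step depends on the one before it.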
If you want to do some more interesting things, like trying out some of the smart charging protocols and maybe developing some interesting solutions on top of that, you can add this high-level control board and then just start working on some interesting implementations. Another exciting project that we are working on right now is a DIY bidirectional DC charger. If you paid attention over the last couple of minutes, you will have noticed that the Yak board already comes prepared with everything that you would need for proper DC communication, because the DC communication is done over the same control pilot wire using the high-level charging protocols. The only things you really need in addition to build a proper DC charger are some power electronics and an isolation monitor, and then you are pretty much good to go. Obviously this is a lot more complicated, and we are still hard at work on creating a good design here, but you can definitely stay tuned for more coming from us in the spring or summer. If this was interesting for you, here is how you can get involved with the EVerest project. You can check out our code in the GitHub organization. You can also check out the hardware designs and microcontroller firmware. We do have a mailing list if you want to ask some questions. There is the project page on the Linux Foundation Energy website. We do have a quick start guide to help you get started with development, and on every fourth Thursday of the month there is a technical steering committee meeting where we talk about what we implemented in the weeks leading up to that meeting. It's always announced via the mailing list, and recordings are made available on YouTube shortly after. There is also a weekly developer sync meeting where you can join the EVerest developers, ask questions and start contributing. This meeting happens every Tuesday between 10 and 11 am Central European Time, and the meeting link for that is sent out via our mailing list.
Thank you very much for listening, and I am open to receiving questions now. Hello, I see there is at least one question right now: whether you can purchase these boards pre-made somewhere. I think not yet at the moment, and I also don't know if I want to do too much advertising here, but I think something will be available, at least from our company, at some point. But you can also totally build your own. Nico is asking how many of these boards have been produced or tested yet. A few of our developers have had charging stations based on these boards at their homes for way over a year now, so they've been tested quite heavily. I'm charging my own electric vehicle basically every day with a charging station based on this design, so they are already well tested. If you wanted to build a product with these boards, you would probably have to go through the certification processes because your designs might differ a bit, but functionality-wise they've been tested quite heavily. Thomas is asking if there are any plans for scaling up production. Well, I guess that kind of depends on the demand. Right now this kit is thought of as a basis for development, especially for doing EVerest development, but I could imagine that if the demand is crazy high, some scaling up of production would occur at some point in the future. I see some people typing questions. erdjonker asks if the DC will be CCS. Yes, for now the DC would be based on the CCS connector, but that's just because we're based in Europe and that's the common plug here. I've heard of some people working on something with CHAdeMO at the moment, but I'm not completely sure what the status there is. Thomas is asking if there's already an idea of how competitive this could be with regard to commercial chargers.
I guess this also kind of aligns with the next question from Wookie, that the boards have a lot of functionality on top, so I guess this would come out as quite an expensive charger. That's true. Like I said in the presentation, the boards are definitely not designed for cost saving, with only the few features that you need to build a charging station. It's more like a development kit that probably costs much more than an off-the-shelf, mass-produced charging station would cost, but that's also not really the goal. The goal of this board is to enable development and to have as many things to play with as possible. Think of the SDR space, where you have these boards that also have a lot of features in them, but they're not as cheap as your typical cheap television receiver stick that you plug into your laptop. I still see some typing going on, so maybe there are some more questions coming up. Yes, as Wookie is saying there, that's also our feeling. He said that there's a terrible shortage of open EVSE kits out there, and we think so too. There are some projects like you mentioned, but a fully featured project, especially with a nice open hardware design that you can just play around with and integrate into your own designs, and maybe even strip out half of the functionality that you don't need, is I think a good thing. Especially with EVerest being released under such a permissive license as Apache 2.0 and the hardware designs themselves being under the CERN Open Hardware Licence, this could definitely open up a lot of possibilities for people to play with their own charging station hardware. The next question is whether this is targeted more towards commercial vendors or more towards hobbyists. I personally would say it's targeted towards both groups. Of course, commercial vendors for the big DC charging stations: you probably don't want to build something like that in your garage at home. But you can use the same software stack and, like
I mentioned, also parts of the hardware stack for that. But it's for hobbyists as well, like if you want to integrate it into your home automation system or if you want to dig deeper into the communication with the car, especially with ISO 15118 coming up and the bidirectional charging possibilities that will soon open up for many vehicles out there. And as Marco also mentioned, academia is obviously also an interesting part there, so you can imagine students working on EV charging and things like that. All right, it looks like there are no more questions. I will definitely be around for a few more minutes in the public room afterwards, and I'm looking forward to some more questions. |
European Eichrecht
E-Mobility with Love & Security |
Welcome to my talk on the transition of the German Calibration Law, or Eichrecht, towards a common European Calibration Law. Why this talk? Well, we all know that in e-mobility a nice system architecture, security and privacy just do not exist, but there are at least some good starting points. So I thought, okay, let's fix this. But first a bit about myself. I studied computer science at the Technical University of Ilmenau, then I worked at multiple startups in the areas of graph databases, renewable energy and e-health. And finally in 2014 I started my own company because I thought, well, it would be easier to sell good open source and open data solutions where you can sell to both sides of an API. But back to e-mobility. What is an e-mobility user story? Obviously, an EV driver wants to find a free, compatible and working charging station, which is already complicated enough. Then he wants to charge, often as fast as possible, or at least as fast as it makes sense for him. Finally, he only wants to pay for what he really consumed, not too much and especially without any surprises. If he is a digital native, he might also demand a real digital process, which simply means he wants an API. And we as FOSDEM people, we want open source, and it should be free of bullshit. What is bullshit? We all know it: this is especially this big EV driver authorization bullshit. There are a couple of methods to authorize people in e-mobility, and all of them have not much to do with security, and none of them has anything to do with privacy. We even have a MAC-address-based authorization, which I just call cyberterrorism. And even from a business point of view, those methods just do not provide enough collision-free identifications for everyone trying to charge his electric car. So, just bullshit. On the other side, we have the charging station operator story. What does the charging station operator want? He obviously wants to sell energy and make money.
At the same time, he does not want to pay too much to his energy supplier. So as you can see, multiple parties have to trust the energy measurements. And in the future, we also need secure mechanisms for load-balancing services. Additionally, we have to remember that charging is a distributed remote sales process, and most charging stations run unsupervised, without anyone on site who could help you as an EV driver when there's a problem. So there's a real need for 100% security and safety for all processes. And finally, there's the engineer story. Measuring energy is hard. While we have known for more than 100 years how to measure AC, measuring DC is still hard, and measuring high-power DC is even harder. As for the security engineers: we can solve all your issues with cryptography. Nice. But now we have a key distribution problem to solve. Measuring energy is not only hard, it's also a heavily regulated area. There is the Measuring Instruments Directive, or MID for short, from nearly 20 years ago. It defines for all of Europe the minimum requirements for any metering device used for billing processes. But such an MID meter is still a very traditional analog device. Therefore, there are additional specifications and projects of the German PTB defining the minimum requirements for how to transmit measurements in a secure and trustworthy way over an untrusted computer network, like the internet. This is all about asymmetric cryptography and public key infrastructure, and again, it is more than 20 years old. All of this led us to the German Calibration Law, or Eichrecht, which was defined from 2015 to 2019, when it finally became law. Since April 2019, all charging stations have to measure correctly and send their results using digital signatures, or at least they should. Often we hear the term smart meter gateways when it comes to modern energy systems. What are those smart meter gateways all about?
We remember that the foundations of the secure transmission of measurement data are over 20 years old and well tested in different PTB projects. After this period, the German PTB and the German Federal Office for Information Security started to define a next-generation security architecture, which today we call smart meter gateways. But we have to keep in mind that smart meter gateways are in fact nothing more than VPN tunnels with application layer gateways to access remote smart meters. This is okay for what an energy supplier wants to do, but this is simply not the use case of e-mobility. In e-mobility, measurements have to travel many hops through different operator networks. So from the point of view of the German PTB, the entire value chain has to be certified, which is pure horror for every operator, because this would mean that every firmware update on every charging station and every software update on the backend system would have to be reviewed and certified by the PTB. Clearly, this would be the end of all innovation in this market. So a much better approach is to use digital signatures, because by using digital signatures we can be sure that measurements cannot be falsified by random errors, internal attackers or management fraud. The same idea is also used within a charging station itself. As the entire charging station is the measurement device, everything is regulated again. Even every small firmware update would be regulated, but you can make life much easier with the computer scientist's best friends: encapsulation and interfaces. Every regulated function is encapsulated within the so-called measurement capsule, which is more or less just a small energy meter with additional digital signatures and a good real-time clock. All this is located within a small enclosure within the charging station, so it is well separated from everything else within the charging station.
The strange part of the current regulation is that there must be a display, and the EV driver must be able to look at the meter, read the measurements and the public key, and maybe take a photo with his smartphone, because nobody can remember public keys. Unfortunately, this might only be a great idea when you sit all day in the ivory tower of the MID. In the daily life of an operator, it more often looks like this. Good luck finding the public key, reading metering data or verifying your invoice. Fun fact: this requirement exists just because of a single stupid sentence within the MID regulation, and even the German PTB complained about it 18 years ago, and no one has fixed it since then. So to conclude, the PTB "Gunstigelussel", as we call it in German, is about having a charging station with a secure smart meter, here on the left, which sends at least a charging start and a charging stop measurement, including some sort of EV driver or session identification, to the charging station operator backend, here in the middle. In the operator backend, we combine both into a so-called charge detail record and send it towards the e-mobility provider, here on the right. He puts all the information into an invoice and sends it to you, the EV driver. You or the PTB can take a so-called transparency software to verify the digital signatures of the measurements, and everybody might be happy. Well, in theory, this is true. By the way, when you ask yourself why we don't send it directly to the electric vehicle: even the ISO 15118-20 standard from 2022, so last year, does not support the use case of the German calibration law. Also, the fundamental data structures and the public keys do not fit together. Congratulations, e-mobility, you fucked it up again. But back to the good parts. What is this transparency software all about? Well, the transparency software is some sort of virtual display of the energy meter, which can validate the digital signatures of all measurements.
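The flow described above, in which the operator backend combines a signed start and a signed stop measurement into a charge detail record, can be sketched in Python. This is a minimal illustration under stated assumptions: the field names, the record layout and the opaque `signature` strings are hypothetical and do not follow any specific vendor format, and no cryptographic verification happens here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SignedMeterValue:
    meter_id: str
    session_id: str
    timestamp: str     # ISO 8601 text as reported by the meter
    energy_wh: int     # cumulative meter register in watt-hours
    signature: str     # produced inside the meter, treated as opaque here


def build_charge_detail_record(start: SignedMeterValue,
                               stop: SignedMeterValue) -> dict:
    """Combine start and stop values without altering the signed data."""
    if start.meter_id != stop.meter_id:
        raise ValueError("start and stop come from different meters")
    if start.session_id != stop.session_id:
        raise ValueError("start and stop belong to different sessions")
    if stop.energy_wh < start.energy_wh:
        raise ValueError("meter register went backwards")
    return {
        "session_id": start.session_id,
        "meter_id": start.meter_id,
        "energy_wh": stop.energy_wh - start.energy_wh,
        # The signed values travel along unchanged so that a transparency
        # software can verify them at the very end of the chain.
        "measurements": [asdict(start), asdict(stop)],
    }


start = SignedMeterValue("meter-1", "session-42",
                         "2023-02-04T10:00:00Z", 120_000, "sig-start")
stop = SignedMeterValue("meter-1", "session-42",
                        "2023-02-04T11:00:00Z", 131_500, "sig-stop")
cdr = build_charge_detail_record(start, stop)
print(cdr["energy_wh"])
```

The design point is that the backend only adds derived information such as the consumed energy. The signed measurements themselves are forwarded untouched, so the backend does not need to be trusted for the measurement values.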
Therefore, it's also a legal part of the charging station certification process, and it also suffers from all kinds of regulations. A common way to satisfy one of these regulations is to put the transparency software onto a Linux live ISO image. This is perhaps an unexpected but quite cool application of open source software. Because we disliked all these politics in e-mobility, we created our own transparency software. It was the first really open source transparency software and is still the only real open source project in the area of the calibration law. It understands signed measurements from different vendors, and it's based on the Electron framework, so it is based on TypeScript, SCSS and HTML. The source code is available on GitHub. Feel free to become a sponsor. Let's first look at a typical charge transparency record. In this case, this is just a simple JSON file. It has all the required measurement data, additional metadata and information on how to verify the digital signatures, which might be based on some other data format, often a binary data format. How this is done in detail, we will see in a moment. When we now load a typical charge detail record, we see one or more charging sessions here on the left. Already here, we can see whether the status of the digital signature is okay or not. When we click on one, we can see the details on the right. Here we can see whether the validation of the individual measurement values is correct or not, and whether all measurement values together are a valid charging session. This is important because, caused by errors within the charging station or the backends, one of the signed meter values might be missing, or it's a duplicate, or some other logical problem occurred. Now we click on the details of one of the measurement values.
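The session-level check described above, where all measurement values together must form a valid charging session, can be sketched like this. It is a simplified illustration and not Chargy's actual logic: the tuple layout, the `kind` labels and the message texts are assumptions, and signature validation is left out.

```python
def check_session(values):
    """Return a list of logical problems in one charging session.

    `values` is a list of (timestamp, energy_wh, kind) tuples, where
    `kind` is "start", "intermediate" or "stop" and timestamps are
    ISO 8601 strings in one common format, so they sort as text.
    """
    problems = []
    kinds = [kind for _, _, kind in values]
    if kinds.count("start") != 1:
        problems.append("expected exactly one start value")
    if kinds.count("stop") != 1:
        problems.append("expected exactly one stop value")

    seen = set()
    for value in values:
        if value in seen:
            problems.append(f"duplicate value at {value[0]}")
        seen.add(value)

    # The cumulative meter register must never decrease over time.
    ordered = sorted(seen)
    for (_, e1, _), (t2, e2, _) in zip(ordered, ordered[1:]):
        if e2 < e1:
            problems.append(f"energy decreased at {t2}")
    return problems


good = [
    ("2023-02-04T10:00:00Z", 120_000, "start"),
    ("2023-02-04T11:00:00Z", 131_500, "stop"),
]
bad = [good[0], good[0]]   # the stop value is missing, the start is doubled
print(check_session(good))
print(check_session(bad))
```

Each individual value can carry a perfectly valid signature and the session can still be inconsistent, which is why this check is separate from signature validation.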
We see how this measurement value was constructed and how it is validated: how the string of plain data must be constructed, how it is hashed, what the public key is and what the expected digital signature is. When it's correct, you will see a nice, valid signature. That's it. In the end, nothing really complicated. But as an EV driver, which transparency software do I have to use? There might be different transparency software for different vendors of charging stations. Which version of the software do I have to use? We have seen that getting the public key is also not that easy. Will I really understand the user interface and user experience of this software? What about billing? EV drivers want to verify invoices, not really metering values. So where do I get authentic and timestamped charging tariff information? Again, in theory, in Germany we have a law for this, and in other countries we even have laws that give you the right to get real-time tariff information before, during and after charging. So, again, we are missing an overall architecture. But don't get me wrong, Eichrecht as a digital process is very reasonable, but it fails in daily operations, and in e-mobility really nothing fits together. Security requirements are often not understood and security goals cannot be realized. And surprise, we even have some new regulations. Since the end of last year, we have the NIS2 cybersecurity regulation and a regulation for the resilience of critical entities. The entire charging station infrastructure is now part of the sectors of high criticality. At the moment this is not a problem, but it will become the next big problem of e-mobility. And do you really want to do grid load management with untrustworthy metering data? So yes, well, we have a problem. Again, let's reboot the e-mobility protocol landscape. This time, we hopefully think twice about fundamental protocol requirements and our design goals.
It must not again be just a loosely coupled union of very different protocol kingdoms, which do not play together nicely, just because no one wants to talk to the kingdom next to him. It must also not again ignore 40 years of computer science, security and privacy research and reinvent every bad idea that had already been deprecated somewhere else 20 years ago. It must become a true Internet of Energy, which means we have an open-source-first development and governance approach without any walled gardens, without any excuses. It must be a rock-solid, secure, privacy-aware and extensible architecture with a minimal governance overhead, just to coordinate the development of higher-level business applications. No one should again wait 10 or more years until basic protocol design flaws inhibiting his business innovation are fixed. We really need a common language for all entities, common semantics and a common understanding of errors and error mitigation strategies within distributed real-time systems. It just does not make any sense that we, for example, still have important e-mobility protocols which do not have any concept for charging stations, and everybody has to work around this limitation. The Chargy transparency software again will go ahead, and in the next version we will heavily extend the ways we make use of good cryptography. There will no longer be just cryptographic keys to sign energy meter measurements, but also keys for charging station operators to sign business-to-business and business-to-customer tariffs and invoices. Operators will also sign every update of static location and real-time usage data. This will close the missing link between the EV driver use case of validating a B2C invoice and the currently limited reality of just providing signed energy meter measurement values without any tariff information. Also e-mobility providers can sign their B2C invoices using their cryptographic keys.
Some keys will allow the e-mobility provider to sign anonymous EV driver identities. This is a newer concept which should replace the current EV driver authorization bullshit in the market and solve all related security and privacy issues. Those anonymous identities are just a guarantee for a charge point operator that an e-mobility provider will pay the debt. They will no longer leak personal data, and as all certificates have a very short lifetime of just a few days or even hours, correlation attacks will also be a thing of the past. Finally, all grid load management operations also need cryptographic security and transparency. When an EV driver receives less energy or fewer kilowatt hours than advertised, he should receive a trustworthy explanation why this happened. Chargy will also be supported by a sister project. The Chargy Software as a Service API will solve all issues around providing trustworthy charging station location, real-time and security-related data, issues which we see today not only in market-driven solutions but also in governmental e-mobility databases. This idea is also nothing new. In fact, it's just a copycat of regulations we can find in the e-health sector. The EU Medical Device Regulation and the EUDAMED database both define in great detail how vendors have to provide all data around the company and their device models, their device certifications and their sold devices publicly and as open data. Also the operators of those devices have to provide data about their companies, about how they manage those devices and about the most interesting data set: device self-tests. Because there is just nothing more trustworthy than a daily authentic digitally signed self-test from each individual device. When the e-health sector can provide such data, there is just no excuse for the e-mobility sector not to do the same.
For this we need a protocol suite which goes far beyond the current state-of-the-art Excel and SQL table transport protocols used in e-mobility. We want protocols which are designed for server-to-server communication, are fully asynchronous and real-time, and provide end-to-end security and privacy. We have to remember that currently state-of-the-art real-time data in e-mobility means real-time data that is at least three to five minutes old and thus often worthless for any real business decisions. This was my very short and fast introduction to the very interesting world of regulations and collaborations in e-mobility: why this is important, why we cannot and must not avoid it any longer, and why it is really an interesting starting point for a fundamentally new e-mobility protocol architecture and real digital processes. So thank you for your attention. Please use the FOSDEM chat applications for your questions and suggestions. Use the issue management on GitHub or send me an email. You can also sponsor our work and the further development of this project on GitHub. |
Presentation of the SEAPATH project |
Hello everyone and welcome. Today I will introduce you to the SEAPATH project and virtualization for real-time power grid substation automation. First of all, I will introduce myself. I'm a consultant in open source software at Savoir-faire Linux, which is a French and Canadian company that is an expert in open source software, and we offer technical services to other companies using open source software. The company is a member of the Yocto Project, the Linux Foundation and Linux Foundation Energy, which hosts the SEAPATH project. Okay, let's dive into the context. We're experiencing an energy transition, mostly because of two points. First of all, the new power production, for example renewable energy. You have many new power sources which are distributed around the country and not constant in power. But also because the demand for electricity has changed, with electric mobility, with smart services and the Internet of Things. We have new customers and new services for power distribution, and so we need a new power grid and a new power grid control architecture. We need to be much more distributed and to have a power grid that is flexible and adaptive. That goes with the idea of new and much more data management in the substation, to adapt quickly to the demand of the customers and the production of renewable energy. The SEAPATH project comes in response to this energy transition. The vision of the project is to move from a hardware-centric model to a software-centric model. We want to abstract everything we can from hardware to software. So instead of having many pieces of hardware in the substation that talk to each other, we want to control all these hardware pieces, the hardware boxes, with one main piece of software. That will offer the flexibility and scalability we want. The SEAPATH project also chose to go with open source development.
The idea behind that is not to depend on any manufacturer or any other company, but also to allow everyone to come and see what is developed in the SEAPATH project, and to let everyone from every other industry offer their point of view, to develop a project that will need to be suitable not only for the power grid. We need many visions for the project to grow. This is inspired by the North American model. For example, AT&T did that to develop the new 5G software, the new 5G network. They needed this flexibility and scalability and they made this transition from a hardware-centric model to a software-centric model. So here comes the SEAPATH project, which stands for Software Enabled Automation Platform and Artifacts (THerein). The mission of the SEAPATH project is to develop a reference design at an industrial level for all applications to talk to. This platform will be open source and will run virtualized automation for each goal we want to achieve, for each piece of software, for each actuator or sensor. They will be virtualized and encapsulated, with the platform controlling the overall action. We have many needs in this project. First of all, we need very high performance in terms of real-time and latency, because we want the grid to be flexible, yes, but also for safety, because we want to react very quickly if there is a problem. We also need the SEAPATH project to be adaptable in terms of security. We want to be able to deploy patches and to close security breaches really quickly. We want it to be hardware-agnostic so that everyone can use it, which goes with the open source mindset, and we want it to be really customizable and adaptable. Because we use open source software, we also will follow the state of the art of what already exists.
The idea is to not reinvent the wheel every time but to use already existing open source software, which we can because we are an open source project, to configure it, and to benefit from the fact that it is already really well done and already certified. We don't want to rewrite an entire program ourselves. That is nonsense in terms of open source software production. To meet these requirements, we first set up a Yocto project, which allows us to create a custom Linux distribution that is entirely hardware-agnostic. We just have to recompile everything for another hardware, and that gives us full control of the packages and versions that are in our Linux distribution. That's also really good for cybersecurity because we can easily track and patch every CVE. The Yocto Project informs the user if a new CVE is found in one of the pieces of software that is used, and we can patch directly using the source code because everything is open source. But by doing that, we ran into a problem: many manufacturers don't want to deal with the complexity of the Yocto Project, and it's much more suitable for them to use, for example, Debian. So we created another way of doing the SEAPATH project, using Debian and a real-time kernel, of course, and Ansible to configure the already created Debian. That's much more useful and easy to use, and we saw that there is a real need here, that many manufacturers want to use Debian instead of Yocto. But the two approaches exist today, and any customer that wants to implement SEAPATH can choose one of them. Okay, so here is our SEAPATH project. At the bottom of it, we have the hardware platform, here Intel, but that can be anything, as I said, a Linux real-time kernel above it, and, as I said, all the open source software that we want to use and to configure instead of writing it ourselves.
All the parts with Pacemaker and Ceph are used for a distributed file system and distributed VMs across many hypervisors, because the SEAPATH project will use a cluster of hypervisors. We don't want all VMs to shut down if a hypervisor is dead. We want to replicate them and relaunch them immediately, automatically. Open vSwitch is used for controlling switches, network switches, automatically. Of course, we don't want someone to come into the substation every time we need to make changes, so we use Open vSwitch. This comes with the software-centric approach, as I described before. DPDK is basically hardware acceleration for Open vSwitch, let's say that, and of course QEMU and KVM, which is the basic couple for virtualization in Linux. So this is our SEAPATH project. Two things are important here. First, the SEAPATH project in itself doesn't have any software itself. It is used to configure all this already existing software and to benefit from it. And on top of the SEAPATH project, we will have all the VMs, for example one for every vendor we work with, and if one day we want to change a piece of hardware because it's deprecated, we just have to shut down the VM, call someone else, and let them write software that will go into this VM. Changing the piece of hardware will not interfere with the rest. This is the basic idea, the basic innovation, the first innovation of the SEAPATH project. Okay, now I have described the project. I will go a bit more technical and speak about the testing of the SEAPATH project. How do we test an open source project when everyone can write and can propose changes, and how do we test a project that doesn't have any software in it, that provides a framework but doesn't have any software, because all the software, the customer software, will be in the VMs? I will speak here about the Debian version, because it's simpler, but I'll explain that later.
As I said, SEAPATH must meet many requirements and provide many guarantees, and for that, we launch the CI at every pull request. Someone proposes a pull request, it is accepted by a member of SEAPATH, and the CI will launch automatically from this pull request. That means the pull request must build, of course, but also all the tests must be successful. All the tests that we wrote must be successful for the pull request to be merged. That allows us to avoid regressions on the one hand, but also to display all future tests for the parts we haven't implemented yet. And of course, that will be visible for everyone on GitHub, and especially for the person that made the pull request. To achieve that, we generate a test report to display all tests that pass or not. This report organizes all the 1,500 tests we have by category and by machine, for example the cybersecurity category or the real-time category, and all the hypervisors we have. It links all tests to requirements. This is especially useful in cybersecurity, because we have a bunch of requirements to meet, and many tests link to a single requirement, so we can group them and see them in terms of requirements. It separates the non-regression part from the future work part, as I said before, and it is visible for everyone on GitHub. Now, what tests do we write? As I said, there is no software itself in SEAPATH, because it will be in the VMs. So all customer code is in the virtual machines. We cannot do any functional testing. That makes no sense here. But what we do want to do is to check requirements, for example for cybersecurity: system-level testing, let's say. I put some examples here. For example, we want to test that the key, the RSA key, has the right permissions for the right users. We want to check that the root password is randomized or is encrypted. We want to check that there is a bash timeout, that SSH has some permissions and doesn't allow some connections, et cetera, et cetera.
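To make the kind of system-level check listed above more concrete, here is a minimal sketch in Python. It is not how SEAPATH implements its tests; the function names are hypothetical, and the checks only rely on well-known conventions: POSIX file modes, the password field of an /etc/shadow entry, and the shell's TMOUT variable.

```python
import os
import re
import stat


def mode_is_at_most(path, allowed_mode):
    """True if the file has no permission bits beyond allowed_mode (e.g. 0o600)."""
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual & ~allowed_mode == 0


def root_password_is_hashed_or_locked(shadow_text):
    """Check the root entry of shadow-formatted text.

    A field starting with "$" is a crypt-style hash; a field starting with
    "!" or "*" means password login is locked. An empty field fails.
    """
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if fields[0] == "root" and len(fields) > 1:
            return fields[1].startswith(("$", "!", "*"))
    return False


def shell_timeout_seconds(profile_text):
    """Return the TMOUT value set in a shell profile, or None if it is not set."""
    pattern = r"^\s*(?:export\s+|readonly\s+)?TMOUT=(\d+)"
    match = re.search(pattern, profile_text, re.MULTILINE)
    return int(match.group(1)) if match else None
```

Each function answers one yes/no question about the configured system, which is what makes such tests easy to map onto individual cybersecurity requirements in a report.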
So this is very basic testing, no long functional testing, but configuration testing, let's say. To achieve that, we use a piece of software called Cukinia that I will introduce here. This is a Linux firmware validation framework, and it allows us to write human-readable tests. I put an example on the right. This is a Cukinia configuration file, and you can test, for example, if a user exists, if a process is running, if a disk is mounted, if a Python package is here, et cetera, et cetera. So all of this is human-readable and written in a simple text file. Cukinia offers the necessary abstraction and spares us from writing the complicated shell commands ourselves to test this and that, where we would inevitably forget an option or something. Cukinia is most of all used in the embedded world, because it is easily portable: it requires only a shell, not even bash, just a shell. It is itself written in shell, so it doesn't need any compilation or installation. It's really easy, and it can export the results for us, either in simple CSV or in something more complicated, for example XML, with a log of the number of tests that pass and so on, a lot of information. And of course, it is open source. Okay, now I have described everything. This is the complete CI that we have. A pull request is made on GitHub. All sources are downloaded on a self-hosted runner. We need to host a runner because we have a cluster to build. We can do that virtually. All hypervisors already have a Debian version as operating system. This Debian is configured by Ansible. We then deploy Cukinia, so the testing framework, and all the test files that I described before. The tests are launched and gathered in a PDF report, which is uploaded, and the link is given on the pull request on GitHub. This is the CI we currently have on SEAPATH. I will now go over two points of implementation that are interesting in our CI.
First of all, as I said just before, all hypervisors already have a Debian version, and they use Debian as their operating system. It is already deployed and already set up, which allows us to avoid two problems. First of all, the compilation. So there is no compilation in this CI yet. And the flashing problem, which is a very big problem because automatically flashing and rebooting a machine is complicated. With Debian, we do not even think about that. It is always the same operating system, and we just configure it every time with Ansible. That means we have to control the basic state, the default state of the CI, and this is done through Ansible, first of all, using idempotency, which just means Ansible will not do the same thing twice. But that is not totally sufficient in our case, because, for example, if we move or remove files, Ansible cannot roll back to the last version. It's not possible with Ansible. So to deal with that, we will shortly, it is not done yet, implement an LVM snapshot of the default Debian. So we create a snapshot of the default state of the Debian before the CI, we launch the CI, and we then roll back with LVM to the default version. This is the idea. Another problem that we encounter with this CI is that it is a complicated CI. We don't have compilation, but we need to fetch our sources, configure, launch tests, gather results, generate the report, upload the report, and so on. So many complicated things that require the runner to be configured to do all these things. And because the CI will evolve over time, we don't want to redeploy the runner or to reconfigure the runner every time. To avoid that, we use a Docker container, and we clone, re-download the sources of the CI, the code of the CI, every time, directly in the Docker container, and launch all the commands in this container.
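As a rough illustration of launching every CI step inside a container instead of on the runner itself, the following Python sketch builds a docker run command line. It is an assumption-laden sketch, not the SEAPATH CI code: the image name and the playbook name in the usage note are hypothetical, and mounting the working directory while running as the calling user is simply one common way to keep generated files writable on the host.

```python
import os


def docker_run_argv(image, command, workdir=None, uid=None, gid=None):
    """Build the argv for running `command` in a throwaway container.

    The working directory is bind-mounted at the same path inside the
    container, and the process runs with the caller's uid and gid.
    """
    workdir = os.path.abspath(workdir or os.getcwd())
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--volume", f"{workdir}:{workdir}",
        "--workdir", workdir,
        "--user", f"{uid}:{gid}",
        image,
        *command,
    ]
```

For example, `docker_run_argv("seapath-ci:latest", ["ansible-playbook", "site.yml"])` returns a list that could be passed to `subprocess.run`; since the runner then only needs Docker, new CI dependencies go into the image rather than onto the runner.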
So the Ansible commands for the configuration, the report generation and the upload of the report are each launched in the Docker container. That allows us not to deal with configuring and downloading packages for the runner itself every time we make the CI evolve. To do that, we also use a small tool called cqfd, which is really useful in our case. This is a simple command-line Docker wrapper, but it allows us to launch commands directly inside the Docker container. It maps the current directory into Docker, it creates a user in Docker with the same username as ours in order to deal with permissions, and so we can call cqfd run with an Ansible command, and it will execute Ansible in the Docker container. Really useful in our case. Okay. Before the end of this presentation, I will talk a bit about future work and what we will do later. First, as I said, this is a CI for the Debian version, because it was simpler, and we want to create the same for the Yocto version. There are many problems with that. First, Yocto has a very complicated and really long compilation, so we need to deploy other runners to do that, and maybe handle concurrency problems. The machine needs to be flashed every time. We cannot, as with Debian, roll back to an old snapshot. This is not possible. The flashing problem is really difficult. We already tried to use PXE, but that doesn't work. That's absolutely not hardware-agnostic, it really depends on the hardware, and it doesn't work every time. We have two ideas to deal with that. Maybe configure an update mechanism and consider the new version that has to be flashed as an update, which can do the job, or use a USB gadget to present the new version as a virtual USB key, as it would be with a laptop and a real USB key plugged into it. That is possible, but we don't have a configuration yet, and we are not sure which solution we will choose. The other thing to do is to run longer tests.
I didn't talk about real-time tests here, or latency tests, because those are really long tests, many days. We cannot launch them at every pull request, but we have to launch them at every release. That's the thing we will do in the future, to certify, to demonstrate that we met the requirements. That will have the form of this graph, for example, with all the measurements we do, the maximum latency we have, the median latency we have, and so on. That will be launched at every release of SEAPATH. Thank you for listening to me during this presentation. If you want to experiment some more with SEAPATH, it is open source, so you can go to the GitHub pages of SEAPATH. There are also some other conference talks about SEAPATH available on YouTube, and as for myself, I am available to answer all your questions. We already have a question from Markus. Markus says that it looks very similar to him, because he is a maintainer of openHABian, which is an extension to openHAB, and it uses essentially lots of advanced bash scripting on top of Debian, and they also roll out a virtual VM in a Docker container in CI on every pull request upload. Can you say something about that? I read that. I don't know the software that you are talking about, but that's interesting, because essentially, bash scripting on top of Debian is a sort of trick we found to make the SEAPATH project work with Debian, because originally, the only SEAPATH version was with Yocto, which offers much more configuration over the Linux system you have, but as I said, we found out that many manufacturers and many vendors don't want to dive into the complexity of Yocto, and so here we are creating many bash scripts to configure Debian properly, especially for cybersecurity questions, which are a bit complicated with that. But if you go to the GitHub pages of the SEAPATH project, you will see that many, many problems we have with Debian are not there with the Yocto version.
Okay, so I don't see any other questions for now, but I said two or three things that are not technically correct, and I only thought of that after I uploaded the video. I said that there is no software embedded in the SEAPATH project, which is not technically true. The interesting software will be the software that controls the actuators or the sensors, and that will be in the VMs, in the virtual machines, of course, but we do develop some software in the SEAPATH project. For example, we have a virtual machine management system written in Python, which is just to manage our virtual machines in the cluster in a simpler way, and we do develop that in the SEAPATH project. I also said that the Yocto version of the SEAPATH project doesn't use Ansible, which is also not true. Of course, we use it because we want to configure a cluster, so Ansible is used just to do the network configuration of the cluster, because in the Yocto version, we have already configured the Linux system, because we have built it the way we want. In the Debian version, this is not possible. We just plug in a Debian USB key with a real-time kernel, and that's all we can do, so everything else has to be configured through Ansible. It's a much messier, much bigger Ansible repository for the Debian version. Yes, I think Markus knows what we are talking about. And yes, as a last correction I can add that we have thought about how to implement the CI for the Yocto version and the problem of flashing the machines, and we found that the simplest way to do that is just to use an update mechanism. So we use software updates in the Yocto version, with two banks, a double bank, so we can just upload a new version of the SEAPATH project as if it were a new release, even if it is just another pull request to test. That works pretty well, and it avoids the flashing problems and PXE problems that we have.
Okay, so if there aren't any more questions, I say goodbye, and you can see me, I will be at FOSDEM this afternoon. You can look for the pink sweater and of course the hat with the text on it. I think you can find me there. Thank you. |
Green software engineering
Building tools and ecosystems around green software engineering |
So, hello and welcome to my talk about green software engineering, and more specifically about building energy measurement tools and ecosystems around software. My name is Arne and I work for Green Coding Berlin, which is a company that specializes in making open source tools for energy-aware software measurement. I would like to take you on a tour today of a concept for a possible future ecosystem we imagine, where the energy consumption of software is a first-order metric and available for every developer and user. So let's have a look at a hypothetical scenario. The Windows 10 operating system typically comes with minimum system requirements. So if you look on the vendor's web page, you can see the processor that is needed, one gigahertz, one gigabyte of RAM, a particular amount of hard disk space, a graphics card, etc. However, what is never given is the power, for instance in idle, that this operating system uses on this reference hardware that it apparently already specifies. So this should be pretty doable, right? Also something like the power for desktop activity. So how much power does it use just to go around in the operating system, opening the file explorer, using the taskbar and stuff like this, on the reference system, for instance, that Microsoft specifies or on a reference system that we or the community specify. And imagine that you could then make informed choices. So by just saying, hey, I'm looking at Windows 10, and I see that it uses 45 watts in idle. But apparently, my computer is mostly in idle. So it might be more interesting to use Ubuntu, for instance, which uses just 20 watts in idle, or its desktop activity is even lower. So why not choose this operating system if energy is my main concern, and this is what I cherish the most in the operating system, or what is an important metric for me.
If you think this process even further, you can think about comparing the energy of applications very specifically, not only in the idle scenario or in one scenario, but in very specific usage scenarios that reflect how people typically use such an application. What you see here are two radar charts. On the left side is WhatsApp, and on the right side is Telegram. Please keep in mind that these are concept pictures. So this is not actually the energy that these applications use for this use case. But let's say your use case is that you message a lot with an app, but you don't do that many video calls. So if you look then at WhatsApp, you see here that it has quite a high energy budget when it comes to messaging, whereas Telegram has quite a lower budget. Telegram is, however, very bad when it comes to video, where WhatsApp could be, for instance, better. So let's say that you are mostly doing messaging with your application, and you would like to preserve your battery life, or, if you use Telegram on the desktop, keep your desktop energy consumption low. Then with such metrics, you could actually make an informed decision whether WhatsApp or Telegram is the better app for you if energy is an important concern. And imagine as a developer, if you think even one step further, that you go to GitHub or to GitLab or wherever your software is hosted, and you look in the repository and you see right away, with something like an Open Energy Badge, as we call it internally, you see it down here, how much the software is actually using for its intended use case that the developer of the software had in mind. So you can compare one piece of software that maybe has a very limited use case to another piece of software or library, just by the energy budget, because you have the metrics so readily available.
We actually try to build these tools, and in this very short time frame that we have been given by FOSDEM, so just about 20 minutes, I would like to take you on a tour through the projects that we are doing, more as an appetizer, so you see what we are working on and what we think could be a possible ecosystem in the future. You will be presented with a view that looks like this: the Green Metrics Tool, Eco-CI, Open Energy Badge and Cloud Energy. The Green Metrics Tool is what I would like to talk about today, mostly, because I think it is the tool that best outlines our concept of transparency in the software community, and then we'll talk later about our approaches for CI pipelines or restricted environments like the cloud. So first of all, although I know people tend to hate diagrams or flowcharts to some degree, I think it makes sense to quickly go over how the concept of the tool works from a high-level perspective. In order to measure software, we follow the container-based approach. So we assume that your software is already in a containerized format or can be put in such a format. So for instance, even a Firefox browser, if you want to measure desktop applications, can be put in a container and be measured with our tool. Also machine learning applications, simple command-line applications, but also web applications. Typically when you develop software, you already have infrastructure files like Dockerfiles, a Docker Compose file, or even a Kubernetes file available, which our tool can consume. In all fairness, Kubernetes is still a work in progress, but Dockerfiles it can consume. And then the tool basically orchestrates the containers and attaches every reporter that you want in terms of measuring metrics. So here we are still very similar to typical data logging approaches, like Datadog does it for instance, or other big players.
So the memory, the AC power, the DC power, the network traffic, the CPU percentage, CPU and RAM are all logged during the execution of what we call a standard usage scenario. So in the first couple of slides, I've shown you the concept of looking at software from the point of view of how it is typically used. And people have already thought about this concept quite a lot when they make end-to-end tests for their software, because this is a typical flow that a user goes through in your application, or unit tests, which might test a very reduced amount of functionality in a block, or benchmarks that are already inside the software repository, session replays, shell scripts, or build files, where we could measure your build process. All of this is typically already available, and our tool can consume these files, will run these workflows and then tell you the energy budget over the time of this run in particular. This slide is more just, if you're not too familiar with Docker, the idea is just to have every service or every component of the application in a separate container, so that we can later on break the metrics down with finer granularity and better see which component might be interesting to look at if you want to do energy optimizations in particular. When you use the tool, and I will just go quickly over that and then probably go with you through a live version of what we are hosting at the moment, you will get a lot of metrics. So you will obviously get something like the CPU utilization, or the average memory that was used, or maybe the network bandwidth that was used.
But what is interesting about this dashboard, and basically its USP, is that you also get the energy metrics from the CPU and from the memory, you get a calculation of what the network has used in energy, and you get convoluted or basically aggregated values, where it often makes sense to look at CPU and memory in conjunction, or it makes sense to look at all the metrics that you have available to get something like a total energy budget. Then you obviously can also look at the AC side, so at the wall plug, so not only what your CPU and your RAM are using, but what the total machine is using, or, something that we have in our lab as a setup, you just look at the mainboard, so not on the outside of the PSU, so what is basically plugged into the desktop computer, but only the power that flows directly into the mainboard. And here you can see that our tool automatically calculates the CO2 budget based on the energy that was used for this run. The tool also shows you which reporters have been used in an overview, and then it shows you a lot of charts, so this is a sample chart, and what the tool can basically give you is not only an overview capability, but also an introspection where you, for instance, are interested in the idle time of the application. So what is my application doing when no user is interacting with it? Is it actually using energy, and is this too much energy for my belief or for the belief of the community? So for instance, here we have an example of a setup, a WordPress setup that we have done with an Apache, a Puppeteer container that runs Chrome, and also a MariaDB instance. And you can see here that there are a couple of requests that have been made to the WordPress instance, and then we are basically just idling, but still the web server is doing quite some work, and there have been no WebSockets active, so why is there server and database activity here? Is this valid, is this maybe some caching, some housekeeping, or is this unintended behavior?
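The step from logged power values to an energy budget, an idle-phase share and a CO2 figure can be sketched as below. This is a minimal Python illustration under stated assumptions, not the Green Metrics Tool's implementation: samples are assumed to be (time in seconds, power in watts) pairs, the idle window boundaries are chosen by hand, and the grid carbon intensity is an input the caller has to supply.

```python
def energy_joules(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total


def window(samples, start_s, end_s):
    """Keep only the samples whose timestamp lies within [start_s, end_s]."""
    return [(t, p) for (t, p) in samples if start_s <= t <= end_s]


def co2_grams(joules, grid_intensity_g_per_kwh):
    """Convert energy to CO2 using a caller-supplied grid carbon intensity."""
    kwh = joules / 3_600_000.0  # 1 kWh = 3.6 MJ
    return kwh * grid_intensity_g_per_kwh
```

Comparing `energy_joules(window(samples, idle_start, idle_end))` with the energy of the whole run gives a first number for the question raised above, namely how much the application consumes while nobody is interacting with it.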
We picture that our tool could highlight such energy hotspot or energy malfunctions, as we call them, to better understand how software uses energy. You can also look at energy anomalies, so we work sometimes with features like TurboBoost, which is typically not turned on in cloud environments, but very often for desktops, which brings a processor in kind of like an overdrive state so that it can react very quickly in a frequency above its normal frequency. However, what we have done here in this example, we have run a constant CPU utilization, but as you can see here, the CPU clocks at different frequency over the time, and sometimes it uses exponentially more energy for the same tasks. So it finishes quicker, but it uses more than only a linear amount more of energy to do the task. So this is a very interesting insight that our tool can, for instance, deliver when you try for energy optimizations of your software. So what is the whole idea that we have behind all this project? And let me move myself down here a little bit so you can see the full slide. We want to create an open source community or a green software community that focuses on the transparency of software so that you have basically an interface, which we call the usage scenario, where you can measure software against and then ask later on questions against a database or against an API, which has measured all these softwares, questions like how much does this software consume? Is there a more carbon friendly alternative, or is there a software that makes less energy requests, less network requests? The idea, if these softwares are available in your country, so Yucca to my knowledge is, for instance, from the US and code check is more like a German application, is we want to be the Yucca or the code check of software. 
So we want to deliver answers to developers where they can ask questions about the energy budgeting of a library, of a software, or of a functionality by providing a framework to make these measurements. So let me move up here again and then back to the slides. So let me show you our other tools that we believe are needed to build an ecosystem around green software because software is not only running in desktop environments or is not only on a single machine, it also runs a lot in the clouds, where these measurements that we have, and I would like to encourage you to read a bit on what sensors are available in our tool, but where these sensors are not available, which is for instance in the cloud. So let me bring up my browser again. So if you are on the homepage and you have seen the green metrics tool that I've just talked about, you'll also see that we have the cloud energy project and the EcoCI project. So EcoCI focuses on measuring the energy of software in a continuous integration pipeline that for instance runs in a virtual machine. Our focus is currently on GitHub actions. In order to estimate the energy in a virtual machine, because you cannot measure, you have no access to the wall plug in the data center, you have no access to sensors in the CPU or whatever, you have to estimate the machine based on measurements that you already have for the same hardware. If you click on cloud energy, you can see here that we have based our machine learning model on a research paper from InterACTC and the University of London, and they have basically taken the data from the spec power database, which is an open database for servers that have been measured just with a fixed workload to compare it against each other. And based on this data, we can create a machine learning model, which is also free and open source to use, that is just a Python tool, which you call with the information that you have. 
So let's say you have the information that your CPU is from Intel, that the frequency that you're running is 2.6 gigahertz, you have 7 gigabytes of RAM, and you know the CPU has 24 threads. But you don't have any more info. You don't know if it's a Skylake processor or a more modern Intel processor. You have no more information because the hypervisor hides this from you. So if you give the model more information, it can give you more accurate estimates, but it can also work with the limited information in the cloud. And then it writes to standard out the current wattage that you have been using, and then you can reuse that in a tool that we built upon that. So now that you've understood that there is a machine learning model behind the idea, I would like to bring you to EcoCI. So EcoCI is a GitHub action that is based on the work from the Cloud Energy project and that can give you, in a GitHub action, the information of how much energy a CI pipeline has used. So if you go, for instance, to the GitHub repository, you can also go to the marketplace. So we go one step further. And here you can see you can directly use it. It is very easy to use. It just needs two calls to initialize the tool and then one more call whenever you want to get a measurement. And what it does for you, so let's quickly go to our repository where we actually use GitHub actions to measure every one of our workflows in the tool. So we click on actions, let's say we go to manual test run virtual machine, we click on main. And you see here, I've run this run yesterday, it succeeded. So our log tells us, hey, all tests have run fine, also for the API. And for this run in the Azure cloud, where GitHub actions run as virtual machines, I have used 650 joules of energy. And you get a nice ASCII graph over time. We were a bit limited here in the graphs we can display in the GitHub actions overview. 
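To make the step from estimated wattage to a joule figure and an ASCII graph concrete, here is a minimal sketch. It is not EcoCI's code; it only assumes that the model is polled at a fixed interval and that each wattage sample is held constant over that interval. The function names and sample values are hypothetical.

```python
def energy_joules(watts: list[float], interval_s: float) -> float:
    """Integrate power samples into energy: joules = watts * seconds."""
    return sum(w * interval_s for w in watts)


def ascii_bars(watts: list[float], width: int = 20) -> list[str]:
    """Render one bar per sample, scaled so the peak sample fills `width`."""
    if not watts:
        return []
    peak = max(watts)
    if peak <= 0:
        return ["" for _ in watts]
    return ["#" * max(1, round(w / peak * width)) for w in watts]


# Hypothetical wattage estimates, one per second of a CI job.
samples = [10.0, 12.0, 15.0, 13.0]
print(f"{energy_joules(samples, interval_s=1.0)} J")
for second, bar in enumerate(ascii_bars(samples)):
    print(f"{second:>3}s {bar}")
```

A bar chart built from plain characters is a reasonable fit for a CI log, where richer graphics are usually not available.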
But you can see here at what point in time the energy, for instance, is the highest, and then maybe look at the later tests if you deem them to be more energy consuming than, for instance, at the start, where it was using only a fixed amount of energy. So this gives a developer and also a user the information not only of how much energy the software is using, but also whether the development of the software is maybe using more than we want as developers or maybe even as a community. And these are all concept tools to just get a first start of what we think could be possible in a near future where software is basically measured and the data of its usage is constantly published, also by developers. The idea is then to have something like an open energy badge that is basically in every repository and that tells you the energy for this software and for the usage scenario that comes with it. So be it, for instance, running the tests, or be it, for instance, building the containers, or the intended use case of the software. So let's say the NumPy library of Python has an energy badge where it says, hey, for a 1000 times 1000 matrix multiplication, this software uses this amount of energy on the reference system that we have specified. And when you use the same reference systems to compare software against each other, you come to the scenario that we have shown at the start in the first slides, where you can basically tell whether one software is more energy hungry than the other one when comparing the same use case. So let me quickly get back to my slide deck. So let's wrap up. Measuring software energy consumption, we believe, is still too hard. It should be as easy as starting a Docker container and it should happen transparently. Therefore we have created the green metrics tool, which can reuse Docker files and infrastructure files to make it very easy to orchestrate your architecture. 
And then in a flow that you already have, be it a Puppeteer file or be it just a shell script, you can run that with our tool just as a parameter appended and it will tell you how much energy has been used over this particular scenario that you feed in. Measuring software is also very complex. So this is why we have integrated best practices into our tool, like pausing between measurements, letting systems idle before you actually use them, turning functionalities like SGX off, checking whether TurboBoost is on, and many more features. Just inline measuring, like Datadog or other providers are doing at the moment, we believe is not enough and is too arbitrary to talk about energy. Software must be measured against a standard usage case. So we provide standard usage cases for software as an interface, but we also ask you, the community, or we need to see over time, what the standard usage cases are that we can all agree on. A software must be comparable to another similar software in terms of energy. This is why we need these standard usage cases to make it comparable. This also means it must be measured on reference machines that everybody has access to, which we want to provide for the community as a free service. Energy metrics must also be available in restricted environments like the cloud. So I've talked about estimation models that need to be open source and available for everybody to implement. And energy must be transparent and a first-order metric in developing and using software. People should know before they use the software how much energy it is consuming. And this is what we are trying to achieve with the tools we are developing. I hope I could pique your interest in our work and in the tools we are developing, some as concepts, some already production ready. And thank you for listening and now I hope it's time for questions. |
Carbon Intensity Aware Scheduling in Kubernetes |
Hello, everyone. Today we are going to talk about how you can achieve sustainability in computing, how you can do energy-efficient placement of Kubernetes workloads. My name is Parul Singh and I work as a senior software engineer at Red Hat. With me, we have Kaiyi Liu. He is a software engineer intern and today we are giving this presentation together. So we are part of CNCF and we are taking community-based initiatives on environmental sustainability. If you want to check our proposal, you can follow the link. We also have done a few projects. Again, using a community-based approach, the first one of them is carbon-aware scaling with KEDA. We did this with Microsoft and we investigated how you can use electricity and carbon intensity to make workload scaling decisions. Another one that we've been working on with IBM Research is CLEVER. That is the Container Level Energy-efficient VPA Recommender for Kubernetes. And if you want to check out both of these projects, you can just follow the QR code. So the agenda is very simple. We'll give a brief background of how things are at the moment, and then we're going to introduce a sustainability stack which consists of two projects, Kepler and the Model Server, and then we will have a demo. So here we have an interesting quote that sums up the motivation of our sustainability stack and the problem it seeks to solve. So according to Gartner, in 2021 an ACM technology brief estimated that the information and communication technology sector contributed between 1.8% and 3.9% of global carbon emissions, which is astonishingly more than the CO2 emission contributions of both Germany and Italy combined. The significant carbon footprint and significant energy consumption of the tech industry raises the following questions. How can we measure energy consumption quickly and indirectly? How can we measure energy consumption of workloads? And how can we then attribute power on shared resources to processes, containers and pods? 
So with these issues in mind, we introduce a cloud-native sustainability stack which seeks to address these questions and problems. Parul will first start by discussing the Kepler project and then I will discuss the Kepler Model Server project. Let's talk about the energy consumption attribution methodology used by Kepler. Kepler is based on the principle that power consumption is attributed to the resource usage of processes, containers and pods. For example, let's say you have a pod that consumed 10% of the CPU; that means 10% of the CPU power consumption is attributed to it. Similarly, if you have five containers and together they contributed 50% of the CPU usage, that means 50% of the CPU power consumption is attributed to them. And so on and so forth for other resources like memory, GPU, etc. We based this principle on studies, and we have attached the link to the paper. If you're interested you can check that out. So Kepler is the Kubernetes-based Efficient Power Level Exporter, and it uses software counters to measure power consumption by hardware resources and exports them as Prometheus metrics. Kepler does three things. The first is reporting. It reports per-pod energy consumption including resources like CPU, GPU and RAM, and it supports bare metal as well as VMs. So you can measure your workload's energy consumption even on AWS or Azure, etc. And it supports Prometheus for exporting the metrics, and you can see the dashboards using Grafana. It's very important that Kepler has a low energy footprint, because what we're trying to do is measure. So we don't want to have Kepler consuming a lot of power itself. So we used eBPF to probe the counters, and this considerably reduced the computational resources used by Kepler. 
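The attribution principle described above can be written down in a few lines. This is a sketch of the proportional idea only, not Kepler's implementation; the function name, the usage units and the wattage are assumptions for illustration.

```python
def attribute_power(
    component_watts: float,
    usage_by_pod: dict[str, float],
    node_total_usage: float,
) -> dict[str, float]:
    """Split a component's power across pods in proportion to their usage.

    usage_by_pod and node_total_usage must use the same unit (for example
    CPU time or instruction counts); the unit itself is an assumption here.
    """
    if node_total_usage <= 0:
        return {pod: 0.0 for pod in usage_by_pod}
    return {
        pod: component_watts * usage / node_total_usage
        for pod, usage in usage_by_pod.items()
    }


# A pod with 10% of the CPU usage gets 10% of a hypothetical 50 W CPU draw.
shares = attribute_power(50.0, {"pod-a": 10.0, "pod-b": 40.0}, node_total_usage=100.0)
print(shares)
```

Usage that belongs to no pod (the remaining 50 units in this example) stays unattributed, so the per-pod values do not have to add up to the full component power.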
And lastly, we support ML models to estimate energy consumption when you don't have a power meter. Kaiyi will talk more about it in the Kepler model server portion, but we use ML models to predict the energy consumption when an inherent power meter is not available. The second part of the sustainability stack is the Kepler model server. So by default Kepler will use a supported power measurement tool or meter to measure node-related energy metrics like CPU core and DRAM, and then it uses these to estimate pod-level energy metrics. But what happens when Kepler does not have access to a supported power meter? This is where the Kepler model server steps in to provide trained models that use software counters and performance metrics to predict relevant energy metrics. The tech stack of the Kepler model server includes TensorFlow Keras, scikit-learn, Flask and Prometheus. So let's take a look at some of the models the Kepler model server has implemented. For example, we have a linear regression model that predicts node-level CPU core energy consumption with the following categorical and normalized numerical software counters and performance metrics. This model also supports incremental learning, that is, incremental training on new batches of data to improve the model's performance on a cluster. The second example also provides a linear regression model capable of online learning, but it instead predicts node-level DRAM energy consumption with the following software counters and performance metrics. So let's take a look at how the model server fits into Kepler as a whole. So the first part is training our models on a variety of training workloads where Kepler can export node energy metrics and performance metrics because a power meter is present. In this case Kepler retrieves these node energy metrics from agents, and they are then collected and exported as Prometheus metrics. 
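Incremental training of a linear regression model, as mentioned above, can be sketched without any library. The model server itself is described as using TensorFlow Keras and scikit-learn; the class below is a dependency-free stand-in written for illustration, and the single normalized feature and its target values are made up.

```python
class OnlineLinearRegression:
    """Linear model updated one sample at a time with gradient descent."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: list[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def partial_fit(self, batch_x: list[list[float]], batch_y: list[float]) -> None:
        """Update the model on a new batch without retraining from scratch."""
        for x, target in zip(batch_x, batch_y):
            error = self.predict(x) - target
            self.bias -= self.learning_rate * error
            self.weights = [
                w - self.learning_rate * error * xi
                for w, xi in zip(self.weights, x)
            ]


# Hypothetical data: one normalized counter, energy = 2 * counter + 1.
batch_x = [[0.0], [0.25], [0.5], [0.75], [1.0]]
batch_y = [2.0 * x[0] + 1.0 for x in batch_x]

model = OnlineLinearRegression(n_features=1)
for _ in range(2000):  # each call stands for a newly scraped batch
    model.partial_fit(batch_x, batch_y)
print(round(model.predict([0.5]), 3))
```

Normalizing the inputs, as the talk mentions, keeps a fixed learning rate stable; with unscaled counters the same update rule could diverge.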
The model server scrapes these Prometheus metrics, sets up training, testing and validation data sets, and then trains, evaluates and saves the model with the new data. The second part is exporting these trained models to Kepler for prediction whenever a power meter is not provided. The Kepler model server can export the model itself as an archive to Kepler, and this is done with Flask routes. The model server can also export the model's weights directly using Flask routes and/or Prometheus metrics. In the future we would also like to export the model weights using the OpenTelemetry metrics API. Now that we have talked about the sustainability stack, let's see how you can do carbon intensity aware scheduling. So the use case that we are trying to solve is: can you put a check on, or can you control, the carbon intensity of your workload? For example, is it possible to fuel your workloads using renewable energy like solar power or wind power when available, and switch to fossil fuel when the renewable energy is not at your disposal? So the use case premise is based on a multi-node cluster where you have nodes in different geographical zones, and the workloads that we will be talking about are long-running batch or machine learning workloads that keep on retraining an algorithm, or any long-running batch workloads that run long enough to have a considerable impact on carbon intensity and that are not affected if you reschedule them on different nodes. So our demo setup is based on an OpenShift cluster, and for monitoring we're using Prometheus. 
We will be using features like taints, tolerations and node selectors to orchestrate on which node the workload is going to run, and we will have a carbon intensity forecaster that will forecast the carbon intensity of nodes. For this demo we are only considering a two-hour step, which means that the carbon intensity forecaster will predict the carbon intensity for the next two hours. So let's first describe the carbon intensity forecaster. The forecaster has access to an exporter which scrapes time series carbon intensity data from numerous public APIs like Electricity Maps or National Grid, and it then exports this data as Prometheus metrics. The forecaster will then scrape the Prometheus metrics from the exporter and update its models for each of the nodes with new time series data. In this demo we will have three nodes, so the forecaster will have individual models for each of the three nodes, which are in different zones of course. The carbon forecaster will then provide a prediction of the carbon intensity of the desired region a few hours in advance. Note that the carbon intensity forecaster and exporter are extendable interfaces. This means the forecaster can implement many different types of time series forecasting models and the exporter can scrape from many different carbon data APIs. So now that we have a carbon intensity forecaster, external applications like the CronJob will forecast the potential carbon intensity some time into the future for each of the three nodes. The CronJob does this by making an HTTP request to the carbon forecaster using its GET endpoint for the forecasted carbon intensity, and each of the three nodes is then periodically assigned a node label depending on the carbon intensity. Red stands for a relatively high carbon intensity, yellow stands for a medium carbon intensity and green stands for a relatively low carbon intensity. 
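The labeling step described above can be sketched as a ranking of the forecasted values. The talk only says that the labels are relative (high, medium, low), so the exact rule below, lowest is green, highest is red, everything in between is yellow, is an assumption, as are the function name and the forecast numbers.

```python
def label_nodes(forecast_g_per_kwh: dict[str, float]) -> dict[str, str]:
    """Map each node's forecasted carbon intensity to red, yellow or green.

    Assumed rule: the node with the lowest forecast is green, the one with
    the highest is red, and all others are yellow.
    """
    ranked = sorted(forecast_g_per_kwh, key=forecast_g_per_kwh.get)
    labels = {}
    for position, node in enumerate(ranked):
        if position == 0:
            labels[node] = "green"
        elif position == len(ranked) - 1:
            labels[node] = "red"
        else:
            labels[node] = "yellow"
    return labels


# Hypothetical two-hour-ahead forecasts for three nodes in different zones.
forecast = {"node1": 450.0, "node2": 300.0, "node3": 120.0}
print(label_nodes(forecast))
```

A relative ranking always produces a green node to schedule on; fixed thresholds would be an alternative when all zones can be high at the same time.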
So in this example node 1 is forecasted two hours in the future to have the highest carbon intensity, so it is labeled red. Node 2 is forecasted two hours in the future to have a medium carbon intensity, so it is labeled yellow, and node 3 is forecasted two hours in the future to have the lowest carbon intensity, so it is labeled green. Now that you have assigned labels to the nodes, it's on the pod to declare its intention: what kind of node it prefers and also what kind of node it does not prefer at all. So for example, in the pod YAML you specify the node selector carbon intensity as green, which means it prefers nodes that have the label carbon intensity green, and you also have to add tolerations, where the effect NoExecute means that this pod does not tolerate running on nodes that have been tainted as red. So if the scheduler tries to schedule this pod on node 1, which has the label and the taint red, this pod would be evicted within 5 seconds. So that's what the toleration seconds is for. So let's see what this looks like. So you have node 1, which had the label green, and now it's turning to red, which means its carbon intensity is increasing. So we will taint the node, and we will apply the taint as carbon intensity red and NoExecute. So as soon as this taint is applied, the pod is evicted from node 1 and it's assigned to node 2, whose carbon intensity is changing from red to green and which has been untainted. So tainting the nodes ensures that pods are evicted from the nodes if the pods have no toleration for the taint. So this is the whole picture: we have a carbon intensity exporter that queries the various public APIs to gather the carbon intensity data and exports it as Prometheus metrics. 
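The pod declaration described above can be sketched as a manifest built in Python. The field names (`nodeSelector`, `tolerations`, `effect`, `tolerationSeconds`) are standard Pod spec fields; the label and taint key `carbon-intensity`, the pod name and the image are assumptions, since the demo's exact names are not given. The helper is a simplified model of `NoExecute` eviction that only handles exact key, value and effect matches.

```python
from typing import Optional


def carbon_aware_pod(name: str, image: str) -> dict:
    """Build a Pod manifest that selects green nodes and leaves red ones quickly."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumed label key; only nodes labeled green are eligible.
            "nodeSelector": {"carbon-intensity": "green"},
            "tolerations": [
                {
                    "key": "carbon-intensity",
                    "operator": "Equal",
                    "value": "red",
                    "effect": "NoExecute",
                    # Stay at most 5 seconds once the node is tainted red.
                    "tolerationSeconds": 5,
                }
            ],
            "containers": [{"name": name, "image": image}],
        },
    }


def eviction_delay(pod: dict, taint: dict) -> Optional[int]:
    """Seconds until eviction: 0 is immediate, None means the pod may stay."""
    if taint.get("effect") != "NoExecute":
        return None
    for toleration in pod["spec"].get("tolerations", []):
        matches = all(
            toleration.get(field) == taint.get(field)
            for field in ("key", "value", "effect")
        )
        if matches:
            return toleration.get("tolerationSeconds")
    return 0


pod = carbon_aware_pod("long-running-batch", "example.invalid/batch:latest")
red_taint = {"key": "carbon-intensity", "value": "red", "effect": "NoExecute"}
print(eviction_delay(pod, red_taint))
```

The node selector only matters at scheduling time, which is why the taint is needed as well: it is what moves an already running pod off a node whose forecast has turned red.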
Now the node labeler is a CronJob. What it does is query the carbon intensity forecaster ahead of time for what the carbon intensity of the various nodes is going to be, and it patches the labels and taints based on the forecasted carbon intensity. So let's get to the demo. First I'm going to show you how you can install the Kepler operator on an OpenShift environment. The release that we have right now is v1alpha1, and it has a prerequisite that it needs cgroup v2. It follows the Kepler 0.4 release and it deploys Kepler both on Kubernetes and OpenShift. So when you're deploying it on OpenShift it also reconfigures your OpenShift nodes by applying a machine config and SCC. Right now Kepler uses a local linear regression estimator in the Kepler main container with offline trained models, but in the next release we are planning to provide an end-to-end learning pipeline where it can train the model as well as use the model. So if you're interested in the code you can follow us on the GitHub repository. And so let's get to the demo. To deploy the operator, go inside the Kepler operator project and run make deploy. That will create all the manifests and install the operator in the namespace Kepler operator system. So now I'm just going to go into the Kepler operator system namespace and, let's see if the operator has been, yeah, so you can see that the operator is running. Now I'm going to apply the CRD and wait for the Kepler instances to get started. 
So as you can see, the Kepler instances are up and running, and each of them is running on one of the nodes as a DaemonSet pod, so that's why you see so many of them. And now I'm going to deploy Grafana. Give it a second. Yes, so Grafana is deployed. Now to enable user workload monitoring I'm going to apply the config map, and that ensures that the Prometheus in the user workload monitoring namespace is capturing the Prometheus metrics. So let's see if the pods are up and running in the OpenShift user monitoring project. As you can see, all the pods are running. So now to see the metrics I'm just going to the Grafana URL, just sign in, and because we applied the Grafana operator, the default Kepler dashboard should be available. Give it a second, it will load. Yeah, so now you can see the energy reporting from Kepler. You can see the carbon footprint, you can see the power consumption in namespaces, total power consumption, pod process power consumption and total power consumption by namespace. So that's the default Grafana dashboard. So now that we have seen how you can install and play around with your Kepler operator, it's time to see how you can also do carbon intensity aware scheduling. So for that I have a cluster already prepared. You can see that there are six nodes on this cluster, and for this demo I'm not going to run anything on the master nodes, so I'm only going to do things on the worker nodes. So I have applied the CronJob and I'm just waiting for it to become active. All right, so that job has been scheduled, let's wait for it to get completed. As you can see, the CronJob has been completed, so let's see what taints and what labels it has applied to the three nodes. To see the labels I'm going to use a script that I have written. Okay, so you can see that the node 223 has got the label green, node 222 has got the label red and node 126 has the label yellow. So anytime that we are going to schedule a carbon
intensity-aware pod or workload, it should favor 223, which has carbon intensity as green. Let's also check what taints have been assigned and if they match the labels. So you can see that the nodes 126 and 223, which have carbon intensity as yellow and green, have no taints, while the node 222, which has the carbon intensity as red, as you can see over here, has the taint applied. Now I am going to test if this works as expected by scheduling a long-running workload. Before I do that I just want to watch all the pods in the namespace. So right now there's no pod. So I have applied this pod, and it has no toleration for a node that is tainted red, and it wants a node that has the label green. So over here you can see that the CITS pod, the pod that I just ran, had some problem in finding the right node. That's because the default scheduler was trying to assign it to a node that didn't have the right label or didn't have the right taint, so it took a while. So let's verify where this pod is running. So you can see that it was scheduled on 223, but right now it's running on 126, and 126 is a node that has carbon intensity as yellow, so that's completely okay. It was scheduled on either the green node or on the yellow node, which is okay as long as it's not scheduled on the red node. So that would be all, thank you for watching the demo. We would like to share a few lessons that we learned while working on this project. The first is that finding the zone carbon intensity data is not simple. Some points are missing and not all of them are free. We also need to support multiple and complex query types. For example, right now we are just querying what is the current or the average carbon intensity in zone XYZ, but we need to have more complicated queries, like which zone has the lowest carbon intensity. And we are also thinking of contributing the work that we have done with
the carbon intensity forecasting and integrating it with the Green Software Foundation Carbon Aware SDK, which is another open source community that has been working on sustainability and green software. So the road ahead for us looks like this: we are thinking of extending the multi-node logic to multi-cluster, and we're exploring how you can do that using kcp. And we are also thinking of integrating carbon intensity awareness in existing Kubernetes plugins. For example, the Trimaran target load packing is a scheduler plugin in the Kubernetes SIG, and we're thinking of integrating the profile with the carbon intensity awareness, and also thinking of how you can tune Trimaran further for energy efficiency. So that was all. If you are more interested in learning about the principle that Kepler is based on, you can follow the link and check out the project. We have attached the GitHub repo for the project as well as the model server. And thank you so much, and any questions? Okay, do you want to take that question? Do you see the question? Okay, so, yeah, sorry, I'm just trying to see the questions, I have to switch back and forth. Somebody asked how do we split the energy for the pod. Oh, yeah, I think I can answer that. This was done in Kepler, I believe, and it was developed by somebody else, but essentially there are two ways. For the model server, we also recently have models that will use the performance metrics and the software counters to directly try and predict pod energy. That's one option. And then the second option in Kepler is typically, once it generates the energy, it'll then try and attribute it, I believe, to each of the pods, and I think that's based on, is it based on CPU utilization, Parul? I don't know. Yeah, what we do is we monitor the CPU utilization, whatever CPU instructions or processes are going on, and then we use the cgroup ID to attribute how that
energy is related to which pod because we take the cgroup id and we translate that which particular process or container it's related to so that's how we gather the metrics so the important thing to note over here is Kepler uses the models to estimate or predict the energy consumption and these models are already trained they already have they are already being published so Kepler uses these models to predict pod energy level consumption on scenarios where you're not running on bare metal on those cases we don't have the access to the inbuilt power meter so in those scenarios we estimate or we predict what is going to be the energy consumption so another question is how what is the credibility of the uh uh greenness uh that data is as good as the data published by the public api for example we have electricity map in us and national grid in europe and uh that is one of a one of a problem as well that the the greenness or the accuracy of the carbon intensity is as good as the data that's being published by the public api we cannot control that okay i should probably note that we will also aim for any data that's from the government so i think national grid is uh straight from is from the uk government so i think that's pretty reliable and we will always make sure that the data that we use is from reliable sources |
Welcome to the on-campus Energy Devroom |
All right. I think we can begin. So, good afternoon. Thank you all for being here at the on-campus part of the Energy Dev Room. And as you might have seen, we kicked off the Energy Dev Room this morning with an online section. And there was quite a bit of interaction. We had nine great presentations, over 70 people in the chat. And yeah, of course, many, multiple of that, of people actually watching the live stream. So, I was glad that the hybrid form was working out. And yeah, we are here with the team that made this Energy Dev Room possible. Nicholas Hoening, Dan Brown, Annalena Helsen, and Kajua Hermann. And some of them will be presenting or have presented already. My name is Nico Rikke and I've been active in the energy sector and the free and open-source software community for quite a while now. And I will repeat some of the talks I did this morning. I assume most of you here have not yet seen that talk, that speech. So, in the last couple of years, I've seen free and open-source software take over the energy sector. And being active in that energy sector, I thought it would be cool to have an Energy Dev Room here at FOSDEM like this. And FOSDEM gave us the opportunity after we submitted a proposal. And then you responded with proposals. And we were overwhelmed by the amount of proposals we got. We got submissions that ranged from home automation to green software and energy system modeling. And by adding the online section in the morning and bit shortening the talks, we were able to accommodate most of them. But still we had to say no to a couple. So, maybe next year, right? Energy, of course, is a timely subject. Our society runs on cheap and reliable energy. And that is something we can no longer take for granted. Scientists, they warn us that we are acting too slowly. I need to ramp up the adoption of renewable energy. The Russian invasion of Ukraine made matters worse by disrupting supplies and causing a price hike. 
And so it is clear that our energy system is under pressure. Citizens are protesting. Businesses are shutting down. And so society needs solutions. And they also demand them. Unfortunately, it's not just a matter of installing wind turbines and solar panels, because these new sources of energy are decentralized and they are weather bound by nature. And thus they require changes in energy management and distribution. While energy generation, consumption and distribution differ per country, we can really speed up the worldwide energy transition by a tremendous amount if we avoid the usual reinvention of the digital wheels. And we think that this is the mission that unites us here today at FOSDEM in this energy dev room. Today's presentations you'll see or have seen will cover projects that model, manage and optimize energy. And unfortunately, these solutions will not be sufficient to solve the energy crisis or limit climate change by themselves. More impactful and direct changes will be necessary to achieve those goals. But these projects can be the building blocks of the future energy system. So the projects presented here today, besides being great projects, are available under a free and open source license. And this is important because it enables collaboration, mass adoption and customizations to fit specific use cases. And this really fits the energy system, in which everything is in a state of change and integration is needed to fulfill the purpose. So I hope you all will take this opportunity today to learn, get inspired and strengthen the free and open source energy software community. So please raise your questions after the presentations in the Q&A or join the online conversation in the Matrix chat room. Annalena will keep an eye on that. And so all of us would like to thank everybody that made this all possible. 
So we'd like to thank the FOSDEM organizers and volunteers, the people who put in the effort to submit proposals, all selected speakers, and all of you for participating. Thank you and have a great conference. |
V2GLiberty: The open stack that could
How we enable EV owners to be ahead of the industry, with open source software |
Briefly about us: I'm there on the left. We are Seita, an energy flexibility software startup that decided to go the open-source way two years ago. In this project we're working together with Positive Design, a small company in the Netherlands as well; they work more on the UX part of this. Something I really know a lot about are these two projects, because we are building FlexMeasures, a project we donated to LF Energy (before you leave the room, get some swag over there), and together we built this V2G Liberty project, which actually works as an umbrella for the rest. Vehicle-to-grid: some of you might know roughly what it is; it has been a buzzword going around. It basically means that, unlike most of the car charging that's going on today, which just puts power into the car, you could actually get power out of the car, for instance back onto the grid. Here I listed some use cases for why that is supposedly a good idea. The third one might be especially interesting, because in a scenario where you have a varying energy price, your car could be a trader, right? And the spreads in the energy markets are increasing by a lot these days, which suddenly makes it interesting. When I look at industry coming up with vehicle-to-grid by themselves, people observe a lot of delays. From my perspective, it seems that the actual industry players, the closed-source players, are looking to deliver an ecosystem. These are pictures you're getting from the big names, Hyundai, Volkswagen, Tesla: they're always thinking about putting multiple things in your home, basically taking it over, and that takes longer. That's something I'm not looking forward to, and that's why we decided to do this project. It's more than a year ago now, so we have more than a year of data from one location, and recently we've attracted some other enthusiasts, and there are five more locations where this is being deployed in reality. And quickly about the
motivation: why should we do that? Is this something that I want to, you know, sell with Seita Energy Flexibility and focus on completely? Actually, probably not. Seita Energy Flexibility, with our FlexMeasures project, is about making the best use of energy flexibility in general. But this is very cool to show that we can do something today; we don't have to wait. There are open-source projects that are super powerful if you put them together, and it was a great way to bootstrap ourselves and to challenge our technology. So I'll talk about the stack, the design of what V2G Liberty looks like if you use it, and some outlooks. First, what do you need in this context to actually get going? You won't find a lot of cars you can use for this. The Nissan Leaf is one of the only ones, specifically in 2021, that could do vehicle-to-grid. The same goes for the charger: this is a charger from the Spanish company Wallbox. Of course they promised that open standards like OCPP would be working very soon, and that hasn't happened yet. What else: you need some kind of computer in your house; this has been mentioned in talks this morning already. And then we work with an energy contract with dynamic tariffs. In the Netherlands you already have, I think, six or seven to choose from; that's moving very fast. Tibber, for instance, is launching in a couple of European countries and offers you that. So this is the software architecture in a nutshell; I don't want to make it too difficult. If you imagine you put this in your house, what you need is to install Home Assistant, that's this logo here. Home Assistant is a very stable home automation software, and we've basically built V2G Liberty as a Home Assistant plug-in. That's also actually fun to do; it's nice, and you get a lot of presents, like UI widgets and things like that. And then FlexMeasures is actually not running in the house. It could: it's dockerized, so you could put it in the house next to your home automation software, and I think a couple
of these enthusiasts who are now using this are doing that. But for such software it's sometimes nicer to run in the cloud, because it's more difficult to maintain. FlexMeasures itself is then responsible for getting the relevant real-time data that's important to schedule the car's charging and discharging, in this case the prices that the consumer contract is on. We could also get, and I think we actually do in a new version, it's not listed here, some public data that helps us look at the CO2 levels of your car's consumption, so there will be another box right there. And I talked about this connection: we have to somehow talk to the Wallbox, and we found out that for now this has to be Modbus, and we found out how it responds. We talked to the company, Wallbox, about whether we're allowed to put our code in a public repository, since you could see in the code which registers their hardware reacts to. But I think we've sorted it out; that was a bit difficult. Basically, what we have to do is simply say start or stop the charging or discharging, and we are able to read the state of charge of the battery. On the left side are some nicer UX features: you want to go on a longer ride, so you want to tell our system that the car needs to be full, and maybe you want some overrides. I'll come to those later. So first about the components. I mentioned Home Assistant. It has been around for a while now; I think it also has some origins in the Netherlands, but it's also developed in California now. Actually, we had a couple of the Home Assistant people over for a demo, these people here, so that was nice to have in real life as well. And as I said, you can write plugins to do your own logic on top of Home Assistant. Now, FlexMeasures is the project that I spend the most time with; our company develops it. Basically it's a data-driven platform to get the best timing for your flexible energy assets: when should they be on or
off. What I'm talking to you about today is an e-mobility project, but we also have some commercial projects in industry and the built environment, and our goal, our dream, is that this all comes together. For example, we're working on heating now, the energy flexibility of heating, and heating and e-mobility somehow happen right next to each other, so that's where we want to go. FlexMeasures itself has a UI, and I'm just showcasing that here, but in this V2G Liberty project it isn't really being used. What we want is for FlexMeasures to be a backend that you talk to through APIs; you usually build your own user-facing flexibility service, or you integrate what FlexMeasures helps you with into your existing service. That's what we did with V2G Liberty. So this is a kind of typical Home Assistant look for your dashboard. I think we have the goal to bring our own style into that when we have the help, but that's what you get. Usually you can see what your car is doing: what's the charging power right now, what's the state of charge. And then we come to the more interesting, self-built features. I will talk about this on another slide, but this basically shows you the state of charge that happened, in blue, what FlexMeasures has advised to happen in the upcoming hours, and the energy price. And here you have the ability as a user to simply say: I don't want this, stop the automation, or just charge the car right now, I don't care about the optimal result. Here you see if you've reserved the car. And that's where our partner Positive Design came in, to really think with us: if we get to design a V2G application, what do we want to experience when we use it? These are the goals. You want to be happy that it's there for a few weeks, but then you want to stop thinking about it every day; it should just happen, and you might look at your end result
and be happy. It of course needs to be ready for you to do trips, at least trips like, let's say, groceries, the hospital, or going to a nearby town, which in the Netherlands of course is quite easy; Utrecht to Amsterdam, for instance, works with 20 percent of charge. And CO2 saving and cost saving are of course the other goals, ones you can really put numbers on. Let's go one level deeper into the detail here; I'm not sure it was all clear. As I said, the state of charge history is shown to you in blue, and then it's shown what the planning would be. Here you can see, if you look at the price in gray, that in the future you will charge your battery because the price is low, and you will discharge because there's a higher price later, and you do that twice in that day. The new feature, as I said earlier, is that we are also plotting the CO2 intensity on the grid in those hours. That's something we are basing on the scheduled coal and gas power for the upcoming days. Of course there are also professional services for that; somebody in an earlier talk mentioned Electricity Maps, I think that's in the capital tool, that's a third-party integration. For cost reasons we basically developed our own version of that. What's interesting, as you see, is that there's a slight correlation, and we actually have a plot somewhere where we looked at the whole year of our data and checked: does low carbon intensity correlate with lower prices? Because in that moment you have a lot of sun and wind on the grid, which have zero marginal cost, so that can happen. In the day-ahead prices, of course, there's an economics that makes that complex, but it does actually happen: during the day you see a correlation, in the night not yet. So here are one or two features of this application. In V2G Liberty, for instance, you come home from some trip and you connect the car. V2G Liberty talks to the charger and asks: what's the deal with the car
right now, what is the state of charge? It comes back as below 20%, and then there's only one course of action: we have to get back up to those 20%. That's just a simple fallback that gets you to 60 to 80 kilometers, and when you've outlived your range anxiety, that should be okay. The other feature I've shown in the UI before as well: you can go to your calendar on your phone, and that's where Nextcloud comes in. I personally, and also the company, are Nextcloud users, and that's why I was so happy to bring Nextcloud into this as well, but it's basically just used for the calendar integration; you can use Google Calendar if you need to, that's no problem. So you make a calendar for your car, and you create an entry in it saying that you're going on a longer road trip, let's say tomorrow at 8 o'clock, and that will be picked up by V2G Liberty. What's nice here: this is the mobile view. I showed the desktop view before, but Home Assistant even gives you something that looks really good in the mobile app; it's basically the same widgets, just rearranged. So then this will show up. V2G Liberty will know about that reservation and will contact FlexMeasures automatically. FlexMeasures will realize: oh, there's a new constraint coming in, I need to recompute everything. And that will change the schedule: here you see the state of charge will go up to a hundred percent, trying to avoid that price peak there in the middle and do it cost-efficiently. All right, so where are we now with this project? It's working nicely. The first thing that comes to mind, now that the other enthusiasts are on board, is that the installation effort is still a bit high. We have written it all out in the V2G Liberty GitHub README, but there are a couple of steps: you need to install your Home Assistant and make that plug-in work. There we have some low-hanging fruit: in Home Assistant you can basically have an actual plug-in that is downloadable and updates itself and all that;
you do have configurations to make in a file; that could be a wizard. There's some stuff there. It also really helps us with FlexMeasures to see how it runs in the background, like what kind of monitoring we need; that's really helpful. Some people are installing FlexMeasures themselves as well, so then you're really a techie enthusiast. I will briefly go into some earnings, or economic results. Here are some hints that sometimes you have a day where there were huge price spreads and your car basically sat at home the whole day, and then you can really have a great day, with earnings above 10 euros in the Netherlands. It's good to keep in mind that energy flexibility is only usable if your asset is there and you're not using it. If you take your car for long rides every day, there's less time to do something with it, and then you have lower earnings. Excuse me. Actually, this user is driving a lot of kilometers. So here's a report from, I think, ten months of data, and this is an overview where you see how much has been charged and discharged. Large parts of these kilowatt-hours have actually been charged just to store them and give them back to the grid at a better time. This is where a lot of policymakers put high hopes, of course: that cars will work as batteries, so that the grid can use and carry energy at the best times. There's a big Excel spreadsheet behind this, but if you just look at the bottom line, you see they have driven their car for 3,000 kilowatt-hours and paid 200 euros for that. It's a pretty good price. And you could compare with scenarios: what if I had just a fixed-cost contract from a year ago? You wouldn't get that today. You arrive at some price. As you all know, these numbers change so fast these days. Before the Ukraine crisis you already had lots of movement in the markets, and after that more so. So this
is difficult, I think, to look at these as hard facts, like how many euros will I save if I install this in 2023 or 2024. I will not sign off on the number there, but there are savings, and there are some other people making these calculations on a higher level as well. But it's nice to have an actual project; this has really happened, and we can dive deeper into the data, so if anybody's interested, let us know. And of course another part of the cost is that V2G-capable chargers are currently much more expensive, but I think the difference is coming down soon. Right, so what will we do? The installation I already mentioned. Updating V2G Liberty, if you have it running and we have a new version: that can be easier. We want to actually show you KPIs, what did I save in money or CO2, let's say last month; I think per day can also work. Now we have more users, enthusiasts who install it at home and run it, so the learning curve is going up right now in the sense of how much information we get. That's great, and it's going to be an interesting year with more things to support. I'm not sure what kind of things the V2G Liberty project should support; that potentially has to do with demand from the community. But of course, if chargers actually support OCPP, then that's just a great idea. And on the FlexMeasures side, where I also know the near future a bit: I already mentioned that we are tackling e-mobility in projects like this, but also heat in other projects, and we look at the built environment. This has to come together. So our big next challenge is to really model the energy flexibility from these two usages of energy combined and make one computation about a building or a site that uses heat and e-mobility as two big flexible power demands, come up with one optimization, and then actually automate that. That's our next big milestone. And the other thing that's really important
of course to mention is network congestion. Network operators are coming up with ideas for how projects like this, or flexible consumers, can do their part, for instance by stopping their demand at a specific moment and in a specific region, which would help on the low- and medium-voltage parts of the grid. That's also really on the map for us. This is almost the end; of course we need to hear questions, but also for people to get in contact, I'm just listing the best contact points for each of the projects. For V2G Liberty, just come to the GitHub project and interact there. FlexMeasures itself has more channels you can contact us on; we listed them here, in the Read the Docs, for instance, and through LF Energy there are ways. I think I have two minutes, so one thing I have on the very last slide before I close is something we built in FlexMeasures recently. If you talk about projects like these after they happen and you're just summarizing what happened, they don't really come alive. What actually happens is that throughout the day new situations arise all the time, new circumstances. The car comes back and has a completely new state of charge: it could be lower, or maybe they charged on the way and it's higher. You don't know. This is a new situation, so we need to recompute, and for each of these situations you also need a different set of forecasts. So you ask yourself: at this point in time I was asked to recompute, and I look at the set of forecasts; what do I know now about the state of things? Maybe not all the devices have sent me everything yet; there are always delays, always lags in IoT applications. Which forecasts of the future do I have available now? That's what I mean by it doesn't come alive. Let's see if it works. Yeah, we made a UI that uses JavaScript so you can basically travel through time. Imagine that bar is now. You hit that button, and
we have kept the old schedules, we have kept everything, and we know when we knew it. That's how we can travel through time, and when we stop anywhere, we know what we knew at the time. Let's say we knew the day-ahead prices: you can see the day-ahead prices on top coming in in batches, for instance, and then the new schedules, that's the red dotted line. All right, let me stop in time and see if there are any questions. All right, there are a few, but who decides on the order? Well, let's just start on my left. I think that V2G Liberty... Oh yeah, sure. The question was whether this has been built for one car or whether there's the opportunity to have multiple cars at the same location supported on the charging station. V2G Liberty has right now been built for one household with one car, which is expandable in principle with some work, and FlexMeasures itself has a solver that could also schedule multiple cars, so that's not a technical problem on that side. So the question is that for the day-ahead prices, the energy suppliers usually, or traditionally, make a forecast of what all the consumers will consume; aggregated over thousands of consumers they get a nice curve, and they try to buy that. Now, if the consumers react to that price, you have a loop somehow, a more dynamic system, and the question was whether I have thought about that. Yes, of course; it's super interesting. There are two thoughts. One is that there are now a couple of providers like that in the Netherlands, and I think they basically adjust for that. They would assume that a bunch of their customers do act flexibly, and they might have to add a model for that, a behavioral economics model, and try to get it right. On the other hand, I think there are going to be more energy suppliers, because I know of a company in the Netherlands that is basically helping larger companies become an energy supplier; it's basically energy supply as a service, so you can brand your own energy supply contract. And some
of the organizations I've talked to think about adjusting their price signal themselves. So they buy something on the day-ahead market, but that's not the price they give to you; they give you a different profile to try to sort that out. Could be that you're not making as much money on the market, but now you add services. I was talking about hardware, so can you repeat the question? Do you mean the EVerest project? Yes, well, we are both in LF Energy, so we know about each other, so that would be a way out. The question, right; everything's super new. The question, for the audience online, was that there's the EVerest project, and they also have open-source hardware, although that's not the core of the company, but they offer that, and they had a great talk this morning, so that could help, of course. It's a great opportunity for the community, for anybody, maybe for us, to combine these two. I don't know, but especially young companies have to sprint; it's difficult. I want to also get some part of the... we are fine? I'll be answering questions in the chat if there's something burning, or right now in person, afterwards in the break. Thank you. |
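The charging behaviour described in this talk (fall back to charging when the battery is below 20%, otherwise charge when prices are low and discharge when they are high) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `plan_charging`, the hourly `step` size, and the "cheapest third / most expensive third" thresholds are hypothetical, and the sketch is a simple rule of thumb, not the solver that FlexMeasures uses. Calendar reservations are left out.

```python
MIN_SOC = 0.20  # fallback level from the talk: never stay below 20 percent
MAX_SOC = 1.00


def plan_charging(soc, prices, step=0.125):
    """Pick one action per hour and return (actions, final state of charge).

    soc    : state of charge at plug-in time, as a fraction between 0 and 1
    prices : hourly energy prices for the planning window
    step   : fraction of the battery moved per hour (hypothetical value)
    """
    if not prices:
        return [], soc

    # Assumed thresholds: act only in the cheapest and most expensive third.
    ranked = sorted(prices)
    k = max(1, len(ranked) // 3)
    cheap = ranked[k - 1]
    expensive = ranked[-k]

    actions = []
    for price in prices:
        if soc < MIN_SOC:
            action = "charge"  # fallback: ignore the price entirely
        elif price <= cheap and soc + step <= MAX_SOC:
            action = "charge"
        elif price >= expensive and soc - step >= MIN_SOC:
            action = "discharge"
        else:
            action = "idle"

        if action == "charge":
            soc = min(MAX_SOC, soc + step)
        elif action == "discharge":
            soc -= step
        actions.append(action)
    return actions, soc
```

Calling `plan_charging(0.5, [10, 30, 5, 40, 8, 35])` charges in the two cheapest hours and discharges in the two most expensive ones, which mirrors the "charge twice, discharge twice in a day" pattern shown on the slide. A real scheduler would optimize over the whole window and take reservations into account as constraints.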
OpenSTEF: Open Source energy predictions |
Okay, awesome to see there are so many people here; really cool that there's such big interest in the energy topic. My name is Frederik Stoel, and I'm from Alliander, which is a grid operator, and I'll be talking about OpenSTEF today. First of all, I put here in the graph a load profile, so the energy load somewhere in the grid, and you can see how it fluctuates over time. Sometimes it's positive, sometimes it's negative; this tells you whether there is net production or net consumption. Now you could ask: if we are at the red line right now, what will the load be in the future? That's what we want to predict, and if you're interested in that, you can use OpenSTEF, because OpenSTEF stands for open short-term energy forecasting. Okay, let's zoom out a bit first. Before I go into a bit more detail about what OpenSTEF does, I want to give a short introduction about why this is relevant. I don't have hours, so I have to keep it short, but there's a lot to talk about here. I want to start out with this picture, and it's actually quite cool, because I think the last presentation talked about flexible energy that consumers can use, and this is one of the many things that are changing in the energy sector. Consumers have flexible products, and they also start producing: consumers have solar panels, your local farmer might have a wind turbine somewhere, you have big wind parks at sea. So there are all kinds of developments going on right now that make it harder for grid operators to forecast what's going to happen tomorrow or even the day after tomorrow. I put this picture here because all the things that I mentioned you can see right there, and in the future it's probably only going to get harder and harder. For now I want to focus on the renewable energy part, because it's also quite impactful.
As you can see on this graph here, this is for the Netherlands, the percentage of renewable energy production, and you can see that in just a couple of years, like five years, the percentage of electricity produced by renewable sources has more than doubled. Renewable sources don't produce at a constant load, of course; they change all the time depending on the weather, and this means it's harder to forecast. To put that into perspective, I have another slide here, and this is a typical consumption profile. If you take your local neighborhood, this is what the energy load will often look like. You have the five peaks, so a peak for every day, and in the weekend it's a bit lower. You can see the dips in the middle of the day; that's because there are a couple of solar panels on some roofs. It still looks easy to predict. If you go to other places where there's way more renewable energy, you can see these energy profiles change dramatically. Here you can see a profile for a big solar park, and you can see these huge negative peaks, which are there on some days and not on others; probably that's a cloudy day, so there's no or less energy generated. And this is an energy profile for a big wind farm, which, as you can see, seems hard to predict, because there seems to be no real... Yeah, so negative energy means that the consumer, or the customer I mean, is giving back to the grid, so then it's negative for us. If it's positive it means... sorry? Yeah, exactly: a big negative peak means the customer is producing a lot of energy. It's just a convention; you could also switch the sign, but you have to choose one convention. Yeah, power, yeah. Yeah, so it's not just one site; it's more like a general profile for a substation of the grid operator, but connected to it is a lot of solar. Yeah, exactly, so on all of these there's load and production, but I just wanted to share this feeling.
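The sign convention discussed here can be written down in a few lines of Python. This is a minimal sketch with hypothetical function names (`net_load`, `substation_profile`): positive values mean the customer takes energy from the grid, negative values mean the customer feeds energy back, and a substation profile is taken to be the sum over everything connected to it.

```python
def net_load(consumption, production):
    """Net load per time step: consumption minus production.

    Positive means taking from the grid, negative means feeding back.
    The opposite sign would work too, as long as it is used consistently.
    """
    if len(consumption) != len(production):
        raise ValueError("series must have the same length")
    return [c - p for c, p in zip(consumption, production)]


def substation_profile(connections):
    """Sum the net load of all connections behind one substation.

    connections: list of (consumption, production) pairs of equal length.
    """
    profiles = [net_load(c, p) for c, p in connections]
    return [sum(values) for values in zip(*profiles)]
```

For example, `net_load([3, 3, 3], [0, 5, 1])` gives a negative value in the middle step, which corresponds to the negative peaks on the solar park slide.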
So this can be difficult, and this also leads to problems. These are two maps of the Netherlands; the colored areas are the areas where Alliander is currently active. On the left is energy consumption and on the right is energy production. This map shows, if you're a new customer and you want to be connected to the grid, that if it's red it's probably difficult, because there's no more room. According to Dutch law, the energy grid is full over there, and you can see that this is the case for huge areas in the Netherlands, and also for large areas on the consumption side. Of course Alliander is doing everything it can to solve this by building new cables and new substations, but this takes time, a lot of time, and we don't have that time, as you could see in the graph before. So, I don't have that one, but I assume it's very similar, because we're not the only ones having these issues. So how can we solve this? One important thing is that we need grid insight, and this also includes forecasts, so transmission forecasts, and these are important for all three parties in the electrical grid: for customers, for DSOs such as Alliander, and for TSOs such as TenneT, which control the high-voltage grid. Using these forecasts, operators can try to maintain grid safety and grid balance, and can give customers as much electricity as they want and need, because the need is high. With these forecasts we can also enable smart solutions, and I put here two brochure pictures of such solutions. One of them is the pilot Flexpower, which was in Amsterdam and was about charging electric vehicles: charging them faster if it's possible and not as fast if it's not possible.
We at Alliander supplied forecasts for this project. Another platform is the GOPACS platform, which is like a trading platform for electricity where customers can trade with operators to either consume or produce energy flexibly; this is also being used right now at Alliander, and we also provide forecasts for that. So it's no longer working, so let's use it. So now let's talk about OpenSTEF again, because that's why I'm here. I'm going to give a short introduction to OpenSTEF, then a short demo of how you can make a forecast with OpenSTEF, and I also want to talk a bit about using OpenSTEF in an operational setting. First of all, the primary thing OpenSTEF can help you with is that it's a complete machine learning pipeline. I'm just going to give a short list of what it can do. It handles input validation, such as checking whether your data is complete. It has feature engineering, so it automatically calculates lag features for you, or other features that are derived from input features. For example, if you feed it wind speed, it can calculate wind turbine power output for you, or the same for direct normal irradiance. Next, it does some kind of intelligent train-validation split of the time series. It has support for multiple types of regressors. Right now we have, for example, XGBoost, which is the most commonly used at Alliander, but we also had a collaboration with SOGNO, which added ProLoaf to OpenSTEF. We also have support for probabilistic forecasts, so that means not just one line but quantiles. And lastly, it has integrated model and artifact storage using MLflow. So what does this all mean? Let's go to an actual demo. I'm going to put this up here. That's a low resolution. Let's zoom out. It's a bit too much. Okay, so I'm just going to walk you through an example of how you could make a forecast. First we need to make some kind of config object; that's just what you have to feed OpenSTEF.
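Two of the pipeline steps listed above, lag features and a train-validation split of a time series, can be illustrated with a small Python sketch. The function names `add_lag_features` and `chronological_split` are hypothetical and do not come from OpenSTEF; the split shown here is a plain chronological cut, used as an assumption in place of the "intelligent" split mentioned in the talk.

```python
def add_lag_features(load, lags):
    """Build (features, target) rows from earlier values of the same series.

    load : list of load measurements, oldest first
    lags : list of positive offsets, e.g. [1, 2] for "one and two steps ago"

    Rows whose lags would reach before the start of the series are skipped,
    so every returned row is complete.
    """
    start = max(lags)
    rows = []
    for t in range(start, len(load)):
        features = [load[t - lag] for lag in lags]
        rows.append((features, load[t]))
    return rows


def chronological_split(rows, validation_fraction=0.25):
    """Keep the most recent rows for validation, the older ones for training.

    Shuffling is avoided on purpose: with lag features, a random split would
    leak information from the validation period into the training set.
    """
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]
```

With hourly data, a lag of 24 would mean "the load at the same hour yesterday"; which lags are actually useful depends on the data set.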
Let's close this. So let's run this line. Next, I put some example input in this project, so we can load it and visualize it. As you can see here, this is a pandas DataFrame, and here we have the load and a lot of predictors. For some of these the names should make sense, for example the amount of radiation predicted by the KNMI, or the temperature. All these predictors are already in this example data. If we have this, I can also plot it for you, so you can see this is another power profile. Okay, so now imagine you have this and you want to know what's next. Then first we need to train a model. OpenSTEF has a train model pipeline, which basically does all those things I just mentioned. So we can just call the pipeline, and let's hope the live demo does not fail me. It will take about 15 seconds, I think, to train a model and store it. You can see some info about what it's doing, and it has stored it. So let's have a look: MLflow comes with an interface, so we can directly see that we trained a model. Right here, this was the run. Now let's hope this works. I see that my internet is no longer working, so apparently this... then this figure, this will work. All right, I'm not showing it. So, well, this is the MLflow interface, and you can see that we just trained a model. You can also click on the model or on the training run, this is just MLflow, and you can see a bit more information about what happened during the training. Next, of course, we want to make a prediction. Again, OpenSTEF has a pipeline for that, so we can just say: okay, I want a prediction. It's loading the model and using the data to create a prediction, and then we can visualize that as well. And then we have a graph right there. So this is the forecast that it made; this was on some example data from 2021, but it's about 48 hours of forecast. So that's OpenSTEF in practice. Let's go back to the presentation.
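The demo follows a "train, store, then forecast" flow and produces quantiles rather than a single line. The sketch below imitates that flow in plain Python with a deliberately simple stand-in model. Everything in it is an assumption for illustration: the names `train`, `quantile`, and `forecast` are hypothetical, the model is a seasonal-naive predictor ("same hour one season ago, plus a typical change"), and none of it reflects OpenSTEF's actual pipelines or its XGBoost models.

```python
def train(history, season=24):
    """'Train' a seasonal-naive model by recording how much the load
    changed from one season to the next."""
    if len(history) <= season:
        raise ValueError("need more than one season of history")
    residuals = sorted(
        history[t] - history[t - season] for t in range(season, len(history))
    )
    return {
        "season": season,
        "last_season": history[-season:],
        "residuals": residuals,
    }


def quantile(sorted_values, q):
    """Simple rank-based empirical quantile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


def forecast(model, horizon, quantiles=(0.1, 0.5, 0.9)):
    """Return one dict per future step, mapping quantile to predicted load."""
    rows = []
    for h in range(horizon):
        base = model["last_season"][h % model["season"]]
        rows.append(
            {q: base + quantile(model["residuals"], q) for q in quantiles}
        )
    return rows
```

With hourly data, `forecast(train(history), horizon=48)` would give a 48-step forecast like the one in the demo, with a lower, middle, and upper line. The spread between the quantiles is the same at every step here, which is a limitation of this stand-in.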
Do this slide. Yeah, so this has been a minimal flow, but of course in reality, at least if you're a grid operator, you want to do this in an operational setting. This means that you want to make daily forecasts for a lot of different locations with all kinds of configurations. OpenSTEF also comes with a so-called reference implementation of how you could do this. This is a picture of what you would have to set up. We have OpenSTEF right here, which is basically the training and forecasting pipeline I just showed you. Then we have another package called openstef-dbc, the database connector, which can connect to a database. We use MySQL and InfluxDB to store all the data required to run it operationally. We also have a Grafana dashboard built upon this database stack, so we can see what's going on. And again, as I have already shown, you can use MLflow to keep track of all the models and all the runs. So I want to show this dashboard as well. This dashboard uses just example data, so it's not our real dashboard. But here you can, for example, see some load that was there on the system, and you can also see that this is not just one area but a sum of two systems, which is quite common in an electrical grid: you have a lot of measurement points that you have to add together with different signs. And you can see, for example, a live forecast for this location as well. You can also see plots of the feature importance obtained during training of the model. You can see on which data the model has been trained. Over here these plots are really small, but here you can see them. So it's a dashboard where you can see everything that OpenSTEF does for every location that you could be interested in. You talk about forecasts: are you, within the forecast, also taking other forecasts like the weather forecast into account, or are you forecasting that yourself?
No, so we use all kinds of data and the weather forecast is... oh, the question is whether we are using other forecasted data or whether we do those forecasts as well, like whether we forecast the weather ourselves. And the answer is that it depends a bit. So in general we use weather forecasts from multiple sources, and also, for example, prices, like the day-ahead pricing. So we use those data, but sometimes we also feed the prediction itself, or we feed one prediction into another prediction. So I mean, you can play around with that, but you have to feed OpenSTEF with all the predictors that you want it to know. Okay, let's move on. I have one last slide and that's basically key information, because that was my presentation, so here I put all the info you might be interested in on this slide, and if there are any questions then feel free to ask. So the information is really useful, but what is the purpose for the grid operator, what use does a grid operator make of this information? Is it for congestion management, is it for some kind of load shedding, what is the role of this exercise? So I think the question is why the grid operator would be interested in forecasts, I guess is what you're asking. So there are many reasons, but I think you already mentioned congestion management, which is indeed an important reason, but also, well, grid insight. So the more congestion management is going to be used for the grid, the more important it is also to maintain grid safety, and grid safety is not just one operator: we are all connected to multiple grid operators, so everyone has to communicate what they are going to do and what they expect the energy flow to be the next day, so every operator can decide to do what's necessary to maintain grid safety. So that's what I mentioned before, the transmission forecast: every operator has to communicate to everyone they are connected to what they expect the load to be on the next day.
I see that my time is up so I'm afraid I have to answer the questions in the chat. Yeah or in the hallway, I think for time management we have to learn from these talks and see if we can manage to keep a couple of more minutes for questions, sorry. Thank you very much for listening. |
4 Years of Energy Management with openHAB
A personal story about smart homes, PV systems and EVs. |
So let us get started. It's great to see such a crowded room here; I hope you're not all here just for the next talk to grab a seat. So my name is Kai, I'm a software architect and the project lead and founder of the project openHAB. I'm not going to talk about openHAB for smart homes in this talk today; if you're interested in that project, come to see me at our booth, directly here in this building at the entrance. But I'm going to talk about more or less my personal story as a consumer, as an end user, in terms of energy management, my experiences there over the years. And the story actually goes back much further than four years: it all started 15 years ago when we built the house and I more or less electrified everything possible in there. So starting from the lights (not that many people use candles nowadays anymore, sure), but also heating is all electric, warm water through a heat pump, a photovoltaic system on the roof. So everything was nicely connected to a KNX system, so it was controllable and I could get all measurements, but what was missing at that time was some software that really helps me to visualize things, to control things and so on. And that's why I started the openHAB project in 2010, directly as an open source project, with the intention to have a system that allows me to create overarching rules and overarching user interfaces over all the things I have at home that somehow have an API that I can connect to. The focus, in contrast to all the commercial solutions out there, was on local control. I said, well, it's my home, I have all the devices at home there, they should talk locally with each other, I want to have all my data locally, and I don't want to have any dependency on the internet for that system.
By now openHAB has grown quite a community, and we have more than 400 different so-called bindings, which are more or less drivers for certain radio protocols, other systems, and technologies, to reach out and really combine them into one single system, so you can more or less get everything that is available at home into that solution. Now, in terms of energy, what I did is that I hooked up electric meters with an S0 interface in my electrical cabinet, which simply provide impulses as an output, and I hooked them up to a KNX binary input, which then simply provides those on the KNX bus, and I created simple rules in openHAB that count the number of ticks over a certain period of time to calculate the current power out of that. As you can see here from the graph, I have several of these meters: one for heating, in blue here, which really just turns the load on and off; then the green, which is the photovoltaic power produced, and this was one day last week, so winter and not that sunny; and then in yellow the household energy that we use for more or less all the rest at home. Now having that in the browser as a visualization is quite nice, but how to engage my family members to actually also get a feeling for the consumed energy? Well, we put up a fairly simple device here, an energy light, which is basically an IKEA lamp with a Philips Hue bulb inside, and a fairly simple rule in openHAB that you can see here, which simply says: whenever our household power changes, then, if it's not night (because then we want all lights off), calculate a hue value ranging from green over yellow to red and simply post that as a new color to that light bulb. And interestingly, this device somehow works on your unconscious over time: we're passing it many times a day in the house, and after a while you suddenly feel that something doesn't seem to be normal. I didn't turn the dishwasher on or the washing machine and still it's showing red, so let's think what I might have forgotten then.
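The two rules described above can be sketched roughly as follows. This is plain Python rather than the openHAB rule shown on the slide, and the meter constant of 1000 pulses per kWh, the 3000 W upper bound, and the linear green-to-red mapping are assumptions for illustration only.

```python
def power_from_pulses(pulse_count, interval_s, pulses_per_kwh=1000):
    """Average power in watts from S0 pulses counted over interval_s seconds.

    pulses_per_kwh is the meter constant (assumed to be 1000 here).
    """
    energy_ws = pulse_count * 3_600_000 / pulses_per_kwh  # 1 kWh = 3,600,000 Ws
    return energy_ws / interval_s


def hue_for_power(power_w, max_power_w=3000, is_night=False):
    """Map household power to a hue angle in degrees.

    120 is green (low power), 60 is yellow, 0 is red (max_power_w or more).
    Returns None at night, meaning the light should stay off.
    """
    if is_night:
        return None
    fraction = min(max(power_w / max_power_w, 0.0), 1.0)  # clamp to [0, 1]
    return 120.0 * (1.0 - fraction)


if __name__ == "__main__":
    power = power_from_pulses(pulse_count=10, interval_s=60)
    print(power, hue_for_power(power))
```

Counting over a longer interval smooths the power reading but makes the light react more slowly, so the interval is a trade-off between the two.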
So it really gives a sense of and a feeling for the energy usage at home, not just for me but also for other family members, which is a nice effect of that one. Now for monitoring heating energy, this is quite a nice visualization which shows a calendar here. This one shows last December, where you might remember that here in central Europe we had a very cold phase in the middle and it was fairly mild over Christmas and towards the end. And so the background color here on each single day shows the minimum temperature of that day, ranging from the minus 10 degrees that we had at home to, I think on New Year's Eve, around 11 degrees minimum temperature that day, and the diamonds here then show the energy used for heating that day. And you see a very nice correlation here between those two figures, so this can also be used to see whether everything works nicely or if you should actually check whether something's not right. For monitoring the photovoltaic system I've set up a bit more complex graph that uses InfluxDB and a Grafana dashboard, which both integrate nicely with openHAB as a system to really get the data out here. So you can see in blue the elevation of the sun for that day, in red the luminance in the south direction, in yellow then the power of the photovoltaic system, and the gray bars show when it was raining that day. And so you really have a very nice visualization and you can check that everything's working all right, and there's also a very good correlation here between the light intensity and the photovoltaic power, so whenever something's off here you could create alarms on your Grafana dashboard to actually say, hey, check your system please. Luckily, so far, after 15 years with that system, everything has been smooth and I've never needed a single alarm on that.
Another nice event happened in spring 2015, when we had a partial solar eclipse at home, and it was on a bright sunny day without any clouds, and that really resulted in a very nice curve here. And the interesting thing is that with a partial solar eclipse, when you look outside you hardly notice it, because it's not going dark, it's still daylight, but here you see that the power of the sun really went down by a factor of 3 to 4 roughly, and it was almost as if it were dark, so it was quite a nice effect. So all the monitoring is nice and good, but in the end, when you're talking about energy management, you really want to do some load shifting, optimizing your consumption and all of those things. Now unfortunately, at the time that our photovoltaic system went live, there was no incentive at all for the end customer to self-consume the energy that is produced; everything goes to the grid and it's paid there and that's it. So there's no benefit for me to actually shuffle around some loads and do things, so my only option was to say, well, okay, our utility should provide different price levels over the day and I can maybe shift things for that. And thinking 10 years back, the standard example for shifting load was, hey, you can do your washing at night; that was what everybody came up with, more or less. And so I said, well, okay, sounds interesting, let's see. Such a washing machine that was smart grid ready usually cost around 300 euros more than the same model without such a feature. Okay, you could say, well, one-time investment, let's go for that, fine. And at the time, also in Germany, the utilities were legally obliged to offer you at least one smart tariff that had to have at least two different price levels, so I said, okay, let's check that out. And my local utility said, okay, we have a field trial here, and in order to participate in that you actually have to book our smart tariff, which was an additional 100 euros a year. I have no clue why, because we already had a smart meter, so there was no
hardware investment or anything involved in that, but they then provided an API which said, for the next day, for that hour of the day, it will cost you that much money. And the price difference between high and low was exactly 3 cents per kilowatt hour. So I quickly checked, okay, washing machine, what does that mean actually? As a yearly consumption it's roughly 150 kilowatt hours that you assume here, so I did some quick arithmetic and came to the conclusion that, hey, you can save four euros fifty a year by doing all your washing at night. And yeah, so that doesn't sound like that much, but you might now argue, okay, you can also use your tumble dryer at night, you could maybe wash your dishes at night as well, and maybe even move your warm meals to the night when everybody else is asleep, but even then you're not coming anywhere close to actually having any benefit from all of that. Okay, so that wasn't too interesting for me, unfortunately, and somehow my local utility also noticed after a while, hey, that doesn't seem to be too attractive, nobody really wants that. And actually they came by and told me that, hey, those smart meters that you have at home, they break so often, and then they can't read the LC display anymore and so they can't get the number out of the meter, and they have no clue what to do about that. So in 2016 they ripped that out and replaced it with an old-school Ferraris meter and said, hey, that one is really lasting 10 years, we don't have to come by, everything fine, so here you go. So that was it, more or less, with all my attempts at being really at the front there, doing energy management and trying to be cool with all the smart home stuff and automation here. And that stayed like that until more or less four years ago, when we bought this nice little blue Tesla here, which had a huge battery, and I thought, okay, so much battery to store energy, I have to do something with that now. As I said, the photovoltaic system wasn't really helping me here because there was no
incentive for self-consumption, so I had to put up a second photovoltaic system, this time on the garage roof, and in 2019 it was now the case that, for this one, giving power to the grid hardly gave you any money, so you had a big incentive to use all that energy yourself and really optimize that. And so, yeah, a big part of the household energy during the day is automatically covered then by such a photovoltaic system, but then, in combination with the car, surplus charging surely becomes very attractive here, to say that everything that exceeds what you need in the household should be used for charging your car. Quite lucky then, more or less for the pandemic times, was that, well, everybody worked from home, so did I, so the car was at home during the day when it was sunny, so that worked out really well. And this here now shows another openHAB rule that simply says: whenever the photovoltaic system power changes or the household power changes, then please check if the car is connected to the wallbox and adjust the current that the wallbox is delivering to the car. And I have a KEBA wallbox that accepts UDP packets to control it down to a milliamp granularity, which is really nice because you can steer it very precisely here. You have to go with at least six amps, though, which is more or less the minimum power to start charging the car, but with that rule I can do all of that. And so on the next slide you see more or less the outcome on a very nice sunny day. So in blue you have here the overall power that goes to the grid or comes from the grid, and the idea is to really level that out on the zero line, ideally. So in the morning, when there was no sun, we had to draw power from the grid; then the sun came up and we gave some power to the grid until the car started to charge, then up to a certain level, and then you can see that it's really fairly flat at zero, so that works pretty well. Then came lunchtime, when the household power consumption was a bit more bumpy, going up
and down, so it's a bit more tricky to level that all out, but it also worked quite well. Then I think at the end the dishwasher went on, which used so much energy already that the charging had to stop completely, and it actually turned out that there was some bug in the car firmware that didn't resume the charging afterwards anymore, so at that time I always had to go there manually and restart it. Luckily, by now this bug has been fixed by Tesla. And yeah, so for the rest of the day the charging rate was a bit reduced, and it works quite well. And overall, you can see on the next slide the yield of the photovoltaic system over all of last year, and on average that was between 10 and 11 kilowatt hours per day, and if you consider that half of that, so five kilowatt hours, is then used for the surplus charging, that corresponds to roughly 10,000 kilometers a year of driving the car, obviously a bit more in summertime and not that much in wintertime, but it corresponds to a saving of roughly two tons of carbon dioxide, which is quite a nice effect here. And yeah, that's my experience so far. I'm looking in the future to also integrate with other solutions, like evcc for example, which specifically looks into car and wallbox monitoring, or like OpenSTEF, which sounded quite nice, for looking into the future, predicting, and getting more machine learning stuff in there, which might be a topic for next year then. And with that I thank you very much for your attention. Are there any questions? Yeah. Okay, the question is whether I can imagine more or less giving control to the grid operator rather than controlling it myself. In theory I can imagine that, but from all that I've seen out there, it's still the far, far future that the utilities would really be in a position to make use of that data. And a problem that I see is also, you know, how do you actually make sure that the data is real, that I'm not just giving anything there for maybe benefiting in some way from a
better tariff or whatever. And, sorry? It's measured by the meter? Okay, if that's all just the pure meter values, that's, I think, anyhow already possible with the smart meters that are installed, not in my case now at the moment anymore. Yeah, as for allowing the utility to decide when to charge and discharge the car, I still want to be in the position to say, well, I actually need it at that charging state at that moment and so on. If that can be fulfilled, that works. The washing machine is also something that really goes into your own personal comfort a lot, so if they decide when to do it... And so it's all a bit tricky. I think it's better here for the households to really decide what to do, and to give incentives to do the right stuff. Yeah. Yeah, no, from the utility and the grid side it's obviously very important to not see a single household but to see more or less a whole city, part of the city and so on, and to be able to control things there to more or less get a decent level, that's for sure, but I think it's helpful to provide incentives to the single people by having an API to interact with, and then that might work. Okay, I see my time's up, thank you very much. If you want to discuss further, I'm at the booth |
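The surplus-charging rule described in the talk can be sketched as a small calculation. This is not the openHAB rule from the slide and it does not talk to a wallbox; the 230 V single-phase supply and the 16 A upper limit are assumptions, while the 6 A minimum and the milliamp granularity come from the talk.

```python
def charging_current_ma(pv_power_w, household_power_w, voltage_v=230, phases=1,
                        min_current_ma=6000, max_current_ma=16000):
    """Return the wallbox charging current in milliamps for the PV surplus.

    The surplus is PV production minus household consumption. Below the
    minimum current the car cannot charge, so 0 is returned (charging stops).
    """
    surplus_w = pv_power_w - household_power_w
    current_ma = int(surplus_w * 1000 / (voltage_v * phases))
    if current_ma < min_current_ma:
        return 0
    return min(current_ma, max_current_ma)


if __name__ == "__main__":
    # Sunny midday with low household load, then the dishwasher switches on.
    print(charging_current_ma(pv_power_w=2000, household_power_w=390))
    print(charging_current_ma(pv_power_w=2000, household_power_w=1800))
```

Re-running this whenever either power value changes is what keeps the grid exchange close to zero; the hard cut-off below 6 A corresponds to the moment in the talk when the dishwasher forces charging to stop.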
Combatting Software-Driven Environmental Harm With Free Software |
Thank you, everyone, for coming, and especially a big thank you to the organizers. This is a great event, and I'm really honored to be here. It's my first time at FOSDEM, and this is an incredible community, an incredible event that I've wanted to come to for years. I'm representing today the KDE Eco initiative. This is a community project involving several people. Some of them are here. Some were here earlier today in the Railway open source dev room in this room, and some presented earlier today in the online event of the Energy dev room. I'm going to talk today about combating software-driven environmental harm with free software. I'm not going to be as technical as some of the other talks. I'm going to focus more on some of the softer sides of free software and how that's good for the environment. There are a lot of links in the slides. If you want to download them, you can either go to our GitLab repository, or you can scan the QR code. I'll come back to this at the end. To get started, to get an idea of what the problem is: this is some data from a report from the Association for Computing Machinery. It's the oldest association of its type, since 1947. They estimated how much energy consumption the entire ICT sector, sorry, what the greenhouse gas emissions of the entire ICT sector are. In their estimates, they find that it's within 1.8 to 3.9% of global greenhouse gas emissions. This is roughly equivalent to the airline industry, which is estimated at 2.5%. This data includes everything from production to transportation to end-of-life treatment, Bitcoin, training machine learning models, and things like this. As they say at the very beginning of the report, computing can help mitigate climate change, but it must first cease contributing to it. In their projections, they estimate that by 2050, the ICT sector will contribute about 30% of global greenhouse gas emissions. Can I ask, we're going to net zero by 2050, where are they?
So this data is assuming nothing changes from today. And some of the major contributors to this are training machine learning models, which has increased 300,000 times between 2012 and 2018 and is currently doubling every few months in terms of energy consumption. That's one of the main contributors; the short lifespan of digital devices is another. Digital devices, they estimate, will be at 75 billion devices in the world by 2025. That's about 10 per person, if that's distributed evenly; of course it's not. And in their report, they claim at one point towards the end that efficiencies must be coupled with slashed demand, so conservation, to reduce the ICT sector's carbon emissions. And those are going to be two of the main points I'm going to talk about today, efficiency and conservation. This is from another report; this does not include such a vast data set as the ACM report. This is from The Shift Project, a nonprofit from France. This is from 2019, and this is looking at usage and production and how that is distributed in terms of energy consumption. This does not include things like Bitcoin, and it doesn't include transportation. So there are several things that are not in this data set. But they estimate, and this is just a good way to think about what I'm going to talk about today, that usage, which is on the left side, including terminals (that's all the end user devices), networks, and data centers, contributes about 55% of energy consumption, whereas production is 45%. And again, this is not including the full data set. For today, I'm going to talk a little bit about all of these things. I'm going to talk about production in sort of broad strokes, not going into any of the individual devices, and focus mostly on the terminals, so the end user devices, but it does have some relevance in terms of network and data center usage.
So as I said, I'm going to talk about efficiency and conservation. What do I mean by efficiency? I mean the same task, achieving the same result, but with fewer hardware demands. This is going to be focused on desktop software; KDE is a desktop software development nonprofit. And conservation, that is reducing waste driven by software, and that will become clear in just a second. This is some data looking at the energy consumption of two word processors. This is from a report from the German Environment Agency, in which they compared various software products doing the exact same thing. This is called a standard usage scenario; these are usage scenario measurements, so basically they're running the exact same script to perform the same task with the software, and then looking at how much energy it consumes by using an external power meter. And what they find is that word processor one, which they only identify as an open source word processor, is consuming four times less energy compared to word processor two, which they only identify as a proprietary software product. Now you might look at this and say, okay, for one individual user, this is maybe not that significant, but you have to think of it at scale. For word processors, every university, every office, every government institution is using word processors. When you multiply this up by millions, possibly billions of users, that really adds up. And I'm going to give an example of how that adds up. This is directly taken from an online course on sustainable software design from Detlef Thoms. In this example, he imagines a scenario where you just have a one CPU second reduction in your software. And that one CPU second reduction is about the equivalent of 10 watt-seconds of savings. When you multiply that by 1.5 million users, where that saving is perhaps encountered 20 times a day, 230 working days a year, that adds up to 19 megawatt hours of savings.
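The arithmetic behind the 19 megawatt hours can be checked with a few lines; the function name is hypothetical and the inputs are the figures quoted in the talk.

```python
def annual_savings_mwh(saving_ws, users, uses_per_day, days_per_year):
    """Scale a per-interaction saving in watt-seconds up to MWh per year."""
    total_ws = saving_ws * users * uses_per_day * days_per_year
    return total_ws / 3_600_000_000  # 1 MWh = 3.6e9 watt-seconds


if __name__ == "__main__":
    # 10 Ws saved, 1.5 million users, 20 times a day, 230 working days a year.
    print(round(annual_savings_mwh(10, 1_500_000, 20, 230), 1))
```

With these inputs the result is a little over 19 MWh per year. Multiplying it by a further factor of 5,000 (500 people each making 10 such reductions) gives roughly 95,800 MWh.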
What does that mean? To make a comparison, if you take a modern electric vehicle and drive it, that would be the energy needed to drive from Paris to Beijing and back six times. This is just from a one CPU second reduction. If I can convince 500 people to do 10 of those reductions with those exact same numbers, you end up with 95,000 megawatt hours of savings. That's the equivalent of the energy consumption of 30,000 two-person households in one year. This adds up once you start looking at it at scale. Going back to those two word processors, this is from that same report comparing the proprietary and the open source word processor, looking at the energy consumption over time. I'm not going to focus on what's happening before this blue line. I'm just going to look at what happens here. This is the point in that usage scenario script when the script saves the document and then goes idle. This lower plot is the open source application. What you see is that the document is saved and in fact it goes idle. By comparison, looking at the proprietary software product, it continues doing things. What is it doing? I don't know. It's maybe telemetry, phoning home, doing some sort of analytics. Can the user opt out of this? Probably not. This is probably outside of the user's control. Is it necessary for the functionality of that software? Probably not. I don't know, that's speculation, but when you look at what's happening over time, you can see a significant difference here. That's it for efficiency. I'm going to come back to some of this in the second half of the talk. I'm going to look at conservation now, reducing waste driven by software. This is an infographic and I'm going to go through it now. This is from a report based on UN data, I believe, from 2016; there's a reference to a tsunami of e-waste. This is actually increasing.
The data that they report is that the e-waste in one year would be the equivalent of the materials used to build 4,500 Eiffel Towers. I just thought about what would happen if you stacked all those Eiffel Towers up: that would be 17 times higher than Mount Everest. This is in one year, and it's increasing. Less than 20% of our e-waste gets recycled. In our landfills, e-waste accounts for about 2% of the waste, but it's 70% of the toxic waste in landfills. This is really damaging to the environment. What does software have to do with this? That's a hardware issue. Well, software determines how long we can use our hardware. You have problems like abandonware or planned obsolescence, where your device is no longer supported. My parents got this on one of their machines and I convinced them to switch to Linux because of it, because to update would have required buying new hardware. You have bloat and feature creep, where your device no longer meets minimum system requirements. The result is that new devices are produced and shipped, and functioning devices are discarded as e-waste. This is data from Apple. I got it from a book called Smart Green World. It is particularly scandalous that functioning devices end up as e-waste when you consider that, and this is from Apple's own data, 78% of the greenhouse gas emissions come just from production. This is completely useless waste and a contribution to the climate crisis. I said I'd talk about free software. I'm going to first focus on KDE's vision. My main point here is that what's good for the user is good for the environment. KDE has the vision. This is from about five years ago, when the community came up with what they want to see long term for KDE. What they want is a world in which everyone has control over their digital life and enjoys freedom and privacy. Each word is broken down at the website if you go to the link.
I'm going to focus on a couple of them. So, a world in which everyone has control over their digital life. How do they want to do that? They want to hand control over to the user. They want to put you in the driver's seat, and the way they do that is by making free and open source software. To enjoy freedom and privacy: without the freedom to make changes and share them, users are entirely reliant on the vendor's benevolence for apparent control. Transparency and user autonomy aren't features. They're inherent to free and open source software. Those same values are what make free and open source software already more sustainable than non-free software. It's not just me saying this. This is also the German Environment Agency, which released the award criteria for the Blue Angel eco-certification for desktop software in 2020, in which they recognize that transparency in energy consumption and user autonomy, in letting users decide how they use their software, actually is more sustainable. There are three main categories in the award criteria: resource and energy efficiency, potential hardware operating life, and user autonomy. In other talks, I go through what I'm calling the three steps to eco-certification, measure, analyze and certify: measuring energy consumption by running usage scenarios, analyzing that data using a tool like OSCAR, the Open source Software Consumption Analysis in R, and then collecting the data. I'm not going to talk about the measurement and analysis today. I'm going to actually focus more on those softer qualities, the user autonomy ones. In a bit more detail, this is what the criteria require. So resource and energy efficiency: it means that you are transparent about how much energy your software consumes when it's used by an average user. What an average user is, is not defined. You have to decide how you think your software is used by most users. Most importantly, you have to publish it.
You have to make it transparent what your assumptions are. And then, with that, you measure the energy consumption and publish it. For the potential hardware operating life, the requirement is that it runs on hardware that's five years old. Now this to me is far too low. I mean, most people who are using free software, and I have an example later, can use devices up to at least 10 years old. Five years is not very much. It's 2018 at this point. And then the user autonomy criteria. And this is where free and open source software really has an advantage. Connecting features or, sorry, qualities like uninstallability and modularity, that you can install only what you need, not more, not less. We have continuity of support, that the software can be supported beyond the original developer's intentions. Offline capability and freedom from advertising, that you can use the software without it having to connect to a server or run processes to feed you ads. Documentation of your use of open standards, how you can uninstall and things like this, and transparency. Now, I would say that most people in the free and open source software community take these for granted. We don't think of these things as being sustainable. And so I'm going to pick just three of them and talk a little bit about them now. And I think then I'll have plenty of time for questions. So uninstallability and modularity, right, this is not exciting news, right? We can uninstall things completely when using free and open source software. With a lot of proprietary software products you can't, right? By leaving things, so by running things that you don't want, right, you're creating inefficiencies when using that software. It's going to take longer to load and start, it's going to take longer to shut down. Those software components that you're not using might be adding CPU seconds that add up once you start thinking about scaling it up to millions, possibly billions of users.
Modularity: if there are things that are being installed with a software product that you don't want, right, that's again creating inefficiencies. Free software gives users the control to decide what they install or uninstall. And that creates a more efficient software product. Continuity of support. This is actually a picture; I asked around in the KDE community which hardware people are running KDE Plasma on that they know is no longer supported by the vendors. And one person responded. This is from, I don't know if this is the exact model, but a 2009 MacBook that had its end of life in 2019 with Apple's Mac OS 10.10. And they are now running it with an up-to-date operating system, Kubuntu with Plasma, long-term support, without any problems. You can do this because the support for free software doesn't have these arbitrary or planned end-of-life moments. With the Blue Angel, in their criteria, you don't have to be free and open source software to get the award, but you do have to have a plan for long-term continuous support after you stop developing that software product as a company. And if you don't, you have to make it free and open source software to get the eco-label. Offline capability and freedom from advertising, just to put some numbers to this, right? So at KDE, and like many other free software products, there's no forced opt-in telemetry. In fact, KDE does have a telemetry policy, but it's opt-in at all times. Users aren't automatically giving data to KDE. Most other software is not also requiring that. What does that mean in terms of energy savings? So this is a graph from a report for the EU, Carbon Footprint of Unwanted Data Use by Smartphones. And what I like is that it makes a very clear connection between the network and the data centers in terms of power consumption, right? So every time your smartphone or computer is going through the network, of course it's consuming energy.
In this report they say that 60% of EU citizens, when asked, would opt out of advertising on their smartphones if they could. They estimate that the savings, if those 60% of people could opt out, would be at its worst 3 to 8 million metric tons of CO2 a year. That would be equal to 370,000 to 950,000 EU citizens' annual energy consumption, right? For something that many users probably don't want. So yeah, these things add up. By making software that respects users, that gives users choice, we are actually making more sustainable software. There are many more topics to talk about. If you're interested in the topic, you get a sneak peek at our handbook about measuring the energy consumption of software. It will actually be officially announced next week. But it's online now if you want to go to our website, eco.kde.org, and in it we cover three main parts. Why does this matter is the first part. The second is, what is the Blue Angel? It's focused on the criteria as a benchmark for what a sustainable digital society could look like. And part three is then how you measure your energy consumption and how you fulfill the user autonomy requirements if you're interested in eco-certifying your software. KDE has been interested in eco-certifying its software. We are proud to announce that, with Okular, we're the first to have an eco-certified computer program in the Global Ecolabelling Network. This is from April last year. There are other initiatives that I just wanted to point out before my time is up that I think are really important. This is from the Free Software Foundation Europe. It's an open letter to demand that the right to repair must include software. It goes: software determines how long we can use devices, and if we have a right to repair them, we should have the right to put any software we want on those devices. Keeping devices in use is, again, the subject of a Free Software Foundation Europe initiative that's really great, about upcycling your phone.
Just look into it. I just wanted to point it out because I think they're doing great things. If you're interested, as a software developer, in measuring software, we set up a lab at KDAB. This is Arne, who gave a talk earlier today in the online energy dev room. Chris has helped out, setting it up. Several other people who are involved in the KDE Eco initiative have helped set this up. We have a lab that's set up so that you can measure the energy consumption with an external power meter. We're in progress right now on trying to make an online portal so that you can upload your usage scenario script and get a report back. You can use it either for data-driven decisions about your own software design or for applying for something like the Blue Angel eco-label or similar. As a final note, KDE voted in October to make sustainable software one of their goals, one of their three goals for the next couple of years. In KDE, we're trying to align various initiatives within the community, doing things similar to what was actually talked about earlier. We're trying to think of ways to give users information, similar to that light bulb that Kai was talking about earlier that gives you an indication of what's consuming energy, and we're thinking of how we can implement those things in something like an eco-widget, so that users can get information about, maybe, the grid intensity, what the power grid mix looks like at that moment, so they can decide if they want to do an update when there's more green energy, things like this. There are various other initiatives if you're interested. This is a community project. You're welcome to get involved. There are various channels to get in touch with us: email, Mastodon. We have a BigBlueButton online meetup every second Wednesday, that's next Wednesday, if you want to talk about various things, and then a mailing list and a Matrix room. Thank you.
I just have to note that this is a project, so I'm working in the Blauer Engel for FOSS project, which is a project funded by the German government. Thank you very much, and I look forward to your questions. Actually, I'm going to do one thing. We have online questions as well. I feel like online folks always get ignored first, so I'm going to just try. Are there any online questions that we could take? None so far. None so far. Okay. Then I'm going to bring it to the room. If you're in contact with the German government, can you vouch that they tell the hardware producers to open source their drivers? I can certainly mention it next time I'm at an event and I have someone's ear, which is not often. So the question was, sorry, I have to repeat it, whether, next time I'm in contact with someone from the German government, I can ask if they can force hardware vendors to open source drivers, is that right, yeah. And I would be happy to try to drop that comment if I can. I saw a hand over here before. I think it was yours, yeah. So the question is, what is the Blue Angel? Where do you find out information about the Blue Angel as a consumer? And the Blue Angel, so actually, I'll ask, I think there probably are some German speakers in this room, or people who are in German-speaking countries. Who here knows the Blue Angel, and what do you know it for? What is it known for? Paper. Paper. Most people say paper. So it's really well known for paper products, and toilet paper in particular, and I've started some talks by making the joke about what software and toilet paper have in common. They can be certified. So Blue Angel certifies a lot more than that. There are hundreds of products: cleaning, detergents, construction materials, things like this. In the IT sector, they certify servers, or server providers, and now software. And that's it.
They want to extend this, just to put this out, they want to extend the eco-label to not just desktop software, but also mobile apps and distributed software systems, or client-server type things. That's in progress right now. For the desktop software, how you can find out about it is, if you go to the Blue Angel website, they have a list of all the products there. I don't remember the link off the top of my head, but it might be, no. It's on our website. If you're buying a product, it's on the packaging. So that's the kind of thing. And what it says, so it's maybe just an important point: they're a type one eco-label, which means that it looks at the entire lifespan of the product, and it requires a third-party evaluation of compliance. Whereas other eco-labels, type two or type three I think are the others, don't require third-party evaluation. So it has a bit more of a stringent process in the evaluation. Is there time for more questions, or do we have to switch over? I'm happy to talk in the hallway, or online, or after the event, so thank you. |
Getting to a fossil free internet by 2030
A tour of the tech and policy changes to get us there |
Okay, shall I start, folks? I'm afraid I have quite a few slides to go through, so I guess I'll start now then. Hello folks, welcome to my talk, Getting to a Fossil Free Internet by 2030. Before I start, can you folks hear me all right? Excellent. Okay, all right. As you can see, there's the title: a tour of the tech and policy changes to get us there. Hello, if you can't see my face here because you're not in this room, that's what I look like, I suppose. My name is Chris, and I work at an organisation called the Green Web Foundation. I've spent the last 10 to 15 years working in a series of wacky climate and data themed start-ups, as you can see here. Loco2 is about locomotion, like trains, but also low CO2, and going loco, going on holiday, stuff like that. Puns are just a thing that I have been working with. But these days I spend most of my time working with the Green Web Foundation, where we track the transition of the internet away from fossil fuels. I also run a podcast where I talk all about green software with other people, and I also work with a larger organisation called the Green Software Foundation, where we do work on policy specifically. During the time I have with you today, we're going to cover these three things here. I'm going to tell you about why I care about a fossil free internet by 2030. I'm going to share a framework for helping you think about sustainability in the digital realm, and then I'm going to use this framework to look ahead at some policy and tech changes. So let's start. Why a fossil free internet by 2030? So the main one is, let's be clear about this. We're in a climate crisis because we keep burning fossil fuels, and every single time over the last 30 years we've had a chance to get off them, we haven't. If you want to actually speak to your friends or your boss about why a fossil free internet is important, these are the ways I think you might explain it to someone. So first of all, it's achievable.
There are big firms already doing this, and we'll see that small firms can do this too. The climate emergency, carbon savings, that's another one. It saves lives, because all the kind of stuff that gets put into the sky that doesn't heat the world up ends up poisoning lots and lots of people. It saves money because, as we know, fossil fuels are not cheap, if you've tried heating your house recently. And yeah, people like being in green firms, and yeah, there's a whole geopolitics thing that I think people in Ukraine will be very aware of, and that as Germans, or as Europeans, we're aware of too. So that's what I'd say there. Now, a framework to think about this is what we use where I work. We have a model that we call CID, consumption, intensity and direction, where consumption is about, can I change how much we need? Intensity is, can I change how much harm is done by the things that we have to do? And direction is about changing where we want to go. What kind of future do we want to build? So the first of these lenses is consumption. Because our society is largely still based on fossil fuels, using pretty much any resources in any way will create some form of carbon pollution and waste. And to start off with, if there are things that we want to do, it's a good idea to try to minimise the amount of resources we need to do that, no matter what. So with that in mind, we know that we should be reducing emissions. What do the trends look like over the last years, right? This is a chart of historical CO2 emissions created by Dr. Robert Rohde at Berkeley Earth, a kind of environmental data science non-profit. You can see the direction we're heading in. And the good news is that in 2015, almost every single country on Earth agreed at COP21, a climate conference, that we really should aim to reduce emissions and we should go for well below two degrees of warming for this century.
Since then, we've actually found out that we need to be aiming for not just well below two degrees, but more like well below 1.5 degrees, which is a much more aggressive target. And that's what you can see. These are the potential pathways we should be on. And what does that mean for our sector, the tech sector? Well, the Paris Agreement will require the information and communication technology industry to reduce greenhouse gases by 45% by 2030. So that's about 7% year on year if we started in 2020. And I'm not sure it went down over the last three years, all right? So this was actually something that was put together by the Science Based Targets initiative, the ITU, which is the International Telecommunication Union, and a number of other kind of official organizations who spent a decent amount of time looking at this. And this is just for two degrees. So we need to be doing more than 45%, really. So how are we doing so far? Well, there's good news and there's bad news, I suppose. So the good news is that something like two billion people have been connected to the internet in the last, say, five or six years, and we've been able to use the internet a lot more, which during the pandemic was very, very useful. I really appreciated having access to the internet during the pandemic. And what I want to draw your attention to here is that while we've increased usage of digital services, we haven't necessarily increased the usage of energy in the same kind of proportion, except for cryptocurrencies, about which, to be honest, the less said, the better. So I'm not going to really be talking about that too much, if that's okay with you, folks. However, and I should ask, why have the figures not been higher? Well, the simplest answer is that Moore's law has been helping us. And Moore's law is basically where computer chips have been getting steadily faster and more power efficient over time.
And as the sector has grown so fast, the average efficiency has increased. So the thing is, Moore's law is not infinite. So going forward, improvements in efficiency will likely have to come from new places, not just the chips themselves. And this is what this diagram from a paper in Science magazine illustrates. This paper is called There's Plenty of Room at the Top: What will drive computer performance after Moore's law? It's a fun read. And the general argument is that for efficiency gains to keep pace with growth, and they really need to be going faster than growth so that we don't just stay flat but actually go down, we'll need to start applying all these ideas at the top, all right? Which basically means it's going to land on us as developers and designers of digital services a lot more than it otherwise would have. We can't just rely on Moore's law doing the work for us. So that's consumption. And as Saul Griffith, the author of Electrify, says, you can't efficiency your way to zero. If we don't want to be cold and wet and without Wi-Fi, at some point there will be a minimal amount of resource usage that takes place. Faced with that, we need to think about the harm that occurs as a byproduct of every unit of usage that we have to make. So in our framework, we talk about this in terms of intensity, because there are multiple dimensions of intensity, from water usage to resource depletion to soil toxicity and air quality. But for the purpose of this talk, I'm going to focus primarily on carbon pollution. Now, in Germany, where I live, the carbon intensity of electricity is influenced by two main things. There's a lot of wind and solar for sure, but the majority of our power comes from burning coal. And these machines, that's what's powering our data centers. These machines, you can see, they destroy forests, they destroy people's homes just to dig up coal, which we then burn to heat up water to create steam.
The steam turns a turbine, which generates electricity, which is what we end up using. Along the way, we lose about two-thirds of all the energy through cooling towers, and we emit a lot of carbon. So a kilowatt-hour of power is around, like, a kilogram of CO2. This isn't the only way. An alternative approach would be to directly harness the energy around us, using solar, wind turbines, geothermal, and so on. Now, we're not burning fuels in this case. So we're no longer wasting lots of energy in the form of waste heat that we need to put into the sky. And it means that there's no need for the kinds of cooling towers that you frequently see, so it looks somewhat different as well. Also, the carbon emissions are much, much, much lower. Even when you factor in the actual making of the solar panels and disposing of them, for example, you can see the figure here, but it's something like 17 times lower. So this is why it makes sense to not be running on fossil fuels. Now, globally speaking, about 60% of the electricity we use right now comes from burning fossil fuels. But this is the global average. In certain parts of the world, energy is much cleaner. In other parts, it's much dirtier. Now, the numbers look tiny on this part. What you can see is that wind and solar, in the top right, are growing, and they've really been plummeting in cost over the last 10 years. Solar energy is nearly 10 times cheaper than it was 10 years ago. And something like 90% of all the new energy capacity that comes on stream these days is coming from renewables. So there is some hope. It's not just going to be that awful black line forever. So because the carbon intensity of compute is not uniform, like I said, all around the world, if you want to reduce the harm done by carbon pollution from the energy usage that powers your servers or powers any digital services, you have two main options. You can move work geographically to run things where the energy is cleaner.
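To make the intensity idea concrete, here is a minimal sketch in Python of comparing the same job across locations. The region names and the middle intensity value are invented for illustration; only the rough "a kilogram per kilowatt-hour" figure for coal and the "something like 17 times lower" ratio come from the talk, and neither is a measured value.

```python
# Hypothetical carbon intensities in kg of CO2 per kWh. Illustrative only:
# the coal figure echoes the talk's rough "a kilogram per kilowatt-hour",
# and the renewable figure applies the talk's "about 17 times lower".
REGION_INTENSITY = {
    "coal-heavy-grid": 1.0,
    "mixed-grid": 0.4,          # invented middle value
    "mostly-renewable-grid": 1.0 / 17,
}


def estimate_emissions_kg(energy_kwh, intensity_kg_per_kwh):
    """Emissions are simply energy used multiplied by carbon intensity."""
    return energy_kwh * intensity_kg_per_kwh


def cleanest_region(intensities):
    """Return the region name with the lowest carbon intensity."""
    return min(intensities, key=intensities.get)


job_energy_kwh = 50.0  # assumed energy needed by one batch job
best = cleanest_region(REGION_INTENSITY)
for region, intensity in REGION_INTENSITY.items():
    kg = estimate_emissions_kg(job_energy_kwh, intensity)
    print(f"{region}: {kg:.1f} kg CO2")
print("run it in:", best)
```

The job uses the same amount of energy everywhere; only the harm per unit of energy changes, which is why intensity is treated separately from consumption.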
So this is like geographic migration. That's like Iceland, the Nordics, and so on. Or, because the carbon intensity of energy changes based on where the generation is coming from, you can basically move things through time. So you can wait to run things at a different time of day, for example. So you can wait till it's windy or sunny, or send it to where it's windy or sunny. So that's consumption and intensity. Finally, we have direction. If you care about open source technology and free software, you know there are very important ideas that are kind of embedded into free software. And this is like, when you choose to work on one project over another, you're making one version of the future more likely to actually happen than another. And this is a quote that I find really useful. It's from a guy called Cade Diehm. He says, technology is a social, political, and environmental accelerant. I think it's useful because it really gets across the idea that we're making a deliberate decision about what we're supporting. And to give you some idea, this is a graph showing the direct emissions from a number of companies you've heard of in 2019, plus a single contract from Microsoft with ExxonMobil, one of the oil companies that recently, like in the last month, announced $150 billion in profit last year. All right? Basically, the consequence of using AI to help ExxonMobil drill for more oil and gas is around 6 million tonnes. So that's about a Facebook, all right? And this is the same figure. And this is actually why I talk about thinking about what you use it for, not just, can I make my stuff efficient. It's not all bad, again. So Microsoft, they do this, but they've recently been doing more work on finding ways to actually get coal-fired power plants off the grid and replacing them with other forms of power, like geothermal or nuclear sometimes, all right?
And every single time, that's like more than a Facebook, more than a Facebook, more than a Google's worth every single year coming down. So this is why it's important to actually be thinking about what you use it for. So let's use this framework to talk about some of the policy changes ahead of us. So remember that chart that I showed you before, about emissions going up when they need to be going down, all right? Okay. For better or worse, this idea of reducing emissions to within safe limits has started to be referred to as net zero. And we're finally starting to see some consensus about what it actually means. Now, this is the ISO, the people who set standards. They've created some guidance about what net zero should really mean. And the actual guidance is pretty strident. They basically say net zero claims are no longer considered credible if an organization is not halving emissions by 2030. So if your organization has a net zero target and it's not doing that, it's not really credible. They also say it needs to include the entire supply chain, not just your own stuff, basically. And also you need to have interim targets. So targets for 2025 or 2026. And this is because seven years is roughly the typical tenure of a CEO, so you can say, we'll be net zero by 2030, then leave, and leave it as someone else's job to actually fix. And this is why it's actually quite important to do this, because you need to have something which brings action early. Also in Europe, there's the, it's a bit of a mouthful, the European Corporate Sustainability Reporting Directive. In 2024, that's what organizations above 250 employees now need to be reporting under, which means you need to have started reporting two months ago, for 2023, if you haven't already started reporting. So this is a thing that's important. And then finally, the IFRS, they're a bit like the kind of Pope of accounting standards.
They basically set a decree that everyone ends up following. And they've said that, no, if you're going to talk about the carbon footprint of your organization, you need to talk about not just your organization, you need to talk about your supply chain as well. And that's actually really important, because for most organizations, around 90% plus of emissions are in the supply chain, not really their own emissions. So let's use the framework we spoke about, consumption and all that. Okay, can I change how much we need? Joseph did a really good job talking about things like the Blauer Engel and things like that. There are now emerging standards to actually try to label the sustainability of digital services. And as he mentioned, yeah, Okular was the first ever product that actually got certified for this. This is important because people who spend money on digital software, like, say, the public sector or large companies, they can write these into contracts and say you need to have a Blue Angel certificate in order for us to spend tens of millions of euros with you, for example. If you want to work with this stuff yourself, or work out the environmental footprint of this, we mentioned before the guy called Arne Tarara. He actually demonstrated a really, really cool piece of technology that does this stuff. If you have a digital service, it will basically create a system. So you can see this, and then it will come up with all these cool charts. So you can see this is like a chart, this is the output showing me doing various things and seeing how it creates spikes in usage. So that's testing something from the outside. You can also look at something from the inside. So the Firefox Profiler, that will actually self-report its own information. So it will tell you, I'm using this much power, and this part of my browser is using this much. That's really, really, really cool.
And there's a whole talk about that later. We did some work with them to actually turn some of these into carbon metrics. So you get an idea of what these figures come from. And this is the kind of stuff that my organization tends to do. If you know this, then you can actually figure out, okay, now that I understand there are hotspots in what I'm doing, what do I do? And this chart has been shared, or this table has been shared, in quite a few places. Basically, different languages have different kinds of memory footprints. And this might give the impression that you should just rewrite everything in C or something like that. That's not the message you should take away. In many cases, it may be that there are certain parts that are already being rewritten for you. And it's worth remembering that you might be using a language for other reasons than just efficiency. And, like, if you are building something super efficient that helps people dig for more oil and gas, that's probably not that useful. So intensity, can I change how much harm is being done? So as you might be aware, renewable energy from wind turbines and solar panels depends on the sun and the wind, and the sun doesn't always shine and the wind doesn't always blow. Data centers are normally on 24/7 though. So how do you power something that's on 24/7 with energy like that? The truth is that most of the time we rely on averages. So when Google said that they already run on green energy, they're basically saying they've generated the same amount of energy from green sources as they've used over the year, right? That sounds fair. Yeah, it's not quite that simple. Because you know how people say you should get eight hours of sleep a day, more or less. So over a year, which is maybe 9000 hours, that means you should probably get 3000 hours of sleep, all right?
Now if I got 3000 hours of sleep by sleeping from January the 1st to April the 1st, then survived on Red Bull and chocolate bars for the rest of the year, I mean, it's better than no sleep. But I'm not sure it's 100% sustainable, right? And this is actually one of the things that we'll be talking about later. So this is the difference between annual and hourly, and why it matters. And as I mentioned before, if you don't have massive solar farms and you can't invest in all this stuff yourself, one thing you might need to do is think about how and when you run computing jobs, or how you do stuff like this. Now, there was a talk all about this. And if you've heard of Kubernetes, right, you can basically say, please run this computing job where it's green, where it's sunny, or please run it when there's lots of energy on the grid, all right? There's a really good talk about it. So I'm not going to talk too much about it. Okay. So what I will talk about, though, is going into a bit more detail about what our computers really run on, I suppose. So when we think of computers just using energy right now, what we're really using is a mixture of electricity from different sources, each with slightly different properties. So if you have some onsite renewables connected to, say, a building, that's one source, and it'll be very low carbon. But you won't have much control over when the power comes on, because it's basically the wind and the sun that's deciding this. And you're dependent on the outside environment. If you've ever used a laptop, you've used battery storage here. And that means it comes on when you want it to. I mean, a laptop that only turns on when it's sunny would not be that useful. So this is what you refer to as dispatchability, whether things are dispatchable or not, all right? But batteries have limited amounts of power, sometimes, though it's getting a lot better.
Finally, there's the grid energy. Sometimes it's clean, sometimes it's dirty, and you can normally rely on it. But when awful geopolitical events happen, they kind of change how much it costs, both in human terms and in actual financial terms. Now, I apologize for going super nerdy on some of this stuff, but if you've ever used a virtual machine, you'll understand that what we do with virtual machines is we take a big server, and then we use virtualization to turn it into a kind of abstract set of resources, like computing power, network, RAM and storage. And once we've got this kind of broken-down way of thinking about it, we can then turn it into a series of smaller machines. So there might be one virtual machine that has lots of RAM, but only a little bit of storage. And by doing that, I can basically make much, much more efficient use of a set of computers. And this idea of breaking things down and building them up again, I think, is one thing that you can use when you think about energy as well. And there was a really cool paper that was published last year about this idea of ecovisors. And the general idea is that you can do the same with power as what we've been doing with computing resources. So if you virtualize the power as a kind of grid made up of three different things, then you can allocate maybe a significant amount of battery to one program that needs to be on all the time, and maybe less to another which doesn't need to be on all the time. So you can basically allocate it in these different ways. And this allows you to then do something like this. You could basically create virtual machines with certain amounts of virtualized power that you can actually work with. And this is actually something that you can then share visibility of to the machines, because computers are generally just aware of having power.
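The allocation idea described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names, the wattage figures, and the all-or-nothing `provision` rule are hypothetical and are not taken from the ecovisor paper or any real hypervisor API.

```python
from dataclasses import dataclass, field


@dataclass
class PowerSource:
    """One physical supply such as solar, battery, or grid (hypothetical model)."""
    name: str
    capacity_w: float
    allocated_w: float = 0.0

    def free_w(self) -> float:
        return self.capacity_w - self.allocated_w


@dataclass
class VirtualPowerSupply:
    """The slice of each source handed to one workload."""
    owner: str
    shares_w: dict = field(default_factory=dict)

    def total_w(self) -> float:
        return sum(self.shares_w.values())


class PowerAllocator:
    """Carves physical sources into per-workload virtual supplies."""

    def __init__(self, sources):
        self.sources = {s.name: s for s in sources}

    def provision(self, owner, requests_w):
        # Check every request first so a failure leaves nothing half-allocated.
        for name, watts in requests_w.items():
            if watts < 0 or watts > self.sources[name].free_w():
                raise ValueError(f"cannot give {watts} W of {name} to {owner}")
        for name, watts in requests_w.items():
            self.sources[name].allocated_w += watts
        return VirtualPowerSupply(owner, dict(requests_w))


# Assumed capacities, purely for illustration.
allocator = PowerAllocator([
    PowerSource("solar", 400.0),
    PowerSource("battery", 250.0),
    PowerSource("grid", 1000.0),
])
# The always-on program gets most of the battery; the batch job leans on solar.
always_on = allocator.provision("always-on-service", {"battery": 200.0, "grid": 100.0})
batch_job = allocator.provision("batch-job", {"solar": 300.0, "battery": 50.0})
print(always_on.shares_w, batch_job.shares_w)
```

Each workload can then read its own `shares_w` to see what kind of power it has been given, which is the visibility the talk goes on to describe.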
Computers don't really know what kind of power it is, whether it's green power, whether it's battery, or another one. But you can make some of this available for people to see. And there are ways that you can implement something like an API, so a program can be aware of the power that it's using. And that allows you to then basically design a system so it's aware of this and can make much, much better use of this, or turn things off when it doesn't need them, or actually use more when there's an abundance of green energy, for example. So that's one of the ideas which I think is super exciting and that I haven't seen anywhere else. You can also do some cool stuff with networking, carbon-aware networking. And I've got a few more slides, so I'm going to have to run through this quickly. But the short version is, if I want to visit a website from England, sorry, and the website's servers are in Poland, the traffic has to go all the way through here. Now, at every single hop there are going to be different kinds of energy that I use. But if the network was aware of this, I could take different routes which are cleaner and greener, some kind of low-carbon trick shot thing. So, A, this exists right now. There's a network architecture called SCION that does something like this already. There's also a way to extend IPv6 like this. We shared a paper specifically with this IETF Internet Architecture Board workshop all about this stuff. There are 26 papers all about this, and it's super exciting and interesting in my book. And finally, direction. Can I change where we're headed? So we work on this idea of a fossil free internet by 2030 because we think the internet should be a global public good, it should be healthy for the planet, and it should be healthy for the people who use it. And do you remember I spoke about the idea of hourly figures, like annual versus hourly?
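As a reminder of why that annual versus hourly distinction matters, here is a minimal sketch in Python using a made-up single day. The numbers are invented to mirror the sleep analogy and are not real grid data, and the two function names are just labels for this example.

```python
def annual_match(consumption, clean_generation):
    """Share of total consumption covered by total clean generation (capped at 100%)."""
    return min(1.0, sum(clean_generation) / sum(consumption))


def hourly_match(consumption, clean_generation):
    """Share of consumption covered by clean generation in the same hour."""
    matched = sum(min(used, made) for used, made in zip(consumption, clean_generation))
    return matched / sum(consumption)


# Toy day: a data center draws a flat 10 kWh every hour, while solar
# produces 30 kWh per hour but only during 8 daylight hours.
consumption = [10.0] * 24
clean_generation = [0.0] * 8 + [30.0] * 8 + [0.0] * 8

print(f"matched on totals: {annual_match(consumption, clean_generation):.0%}")
print(f"matched hour by hour: {hourly_match(consumption, clean_generation):.0%}")
```

On totals the toy data center looks 100% green, but hour by hour only a third of its usage lines up with clean generation; the other sixteen hours are the Red Bull and chocolate bars.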
Google, as one company, they've actually put a lot of time and money behind the idea of saying, well, we're going to run on 24/7 carbon-free energy, none of that Red Bull and chocolate stuff. We're going to do it properly, which is good. And this is something that we didn't realize was possible. Microsoft is now doing the same thing as well, but they're coming up with their own words, because they can't possibly have the same words for some reason. And I thought, wow, this is really, really cool. And then I found out literally like a month ago that this company here, Peninsula Clean Energy, they're a small nonprofit organization. They've said, oh yeah, we're already at 99% 24/7 matched, and we reckon we'll do it by 2025. So this is a small company, one that is maybe reaching, say, 100,000 people in California. They've basically built this into their governance to say, no, this is a priority over share buybacks, dividends to shareholders, everything like that. It's about intention. And they've basically got there. But what's super cool, and for the FOSDEM group, is that they've open sourced how they're doing it. They've actually shared the model they use to figure out where they buy the power from and the ideas behind this. And I think this is pretty freaking cool. And it's basically why I am excited about open source and everything like that. Finally, I've only used American examples so far. Now, I can't say that this was entirely Europe's plan, because Nord Stream was kind of switched off by, like, Putin and everything like that. But they've spun it this way. They've basically said, well, we've completely got rid of our dependence on Russian fossil fuels now. It went much faster than we expected. So we now have the possibility to redirect the money we would have spent on gas, via REPowerEU, 250 billion euros, to net zero industries. So this is basically a quarter of a trillion euros.
So basically to deploy the kind of cool green stuff rather than the bad brown stuff that I think we should be getting away from. So that's my recap. I wanted to share a model with you that I call CID: consumption, intensity and direction. Can I change how much we need? Can I change how much harm is done? Can I change where we're headed? And I want to tell you about a fossil-free internet by 2030. It's achievable. We've seen that big firms do this, but small firms do it too, faster and better. We've seen how it basically saves carbon. We know that doing this saves lives. It saves money, because we know that fossil fuels are actually really, really expensive. And I've already done this slide once. I'm just going to skip it because I'm running out of time. But folks, I think that's the time I've used. So if you want to know more, our organization, the Green Web Foundation, we publish open source code, we publish open data and we work with other organizations who are trying to do the same. We also offer training and consulting. If I've been speaking too quickly, there is a transcript of this entire talk at that website. And I'm also online as mrchrisadams on Twitter, but less so these days. And I'm on mastodon.social as well. And also if you use email, I use email too. Yeah. Thanks, folks. |
Power profiling with the Firefox Profiler |
Hello everybody, I will be talking about power profiling with the Firefox Profiler. Chris mentioned it briefly, but I will go into many more details about how this works. So the outline of the talk: I will first explain what the Firefox Profiler is, because I don't want to assume that everybody knows, and then I will go into the topic, which is to explain why we care about this, how this thing happened, where we support it, and show examples of which kind of information it gives us. And if I have some time left, I have a few more things I could share. So what's the Firefox Profiler? You can find it as a web application at this address. It's a built-in profiler inside Firefox, the web browser, and by built-in I mean that the part that collects the data is inside the web browser itself and the place where you see the data is a web application. It was initially created for performance work, especially when Mozilla and Firefox started caring a lot about making things go fast, because our competition said that they were faster and we couldn't compete on that. So especially for Firefox 57, Firefox Quantum, we worked a lot with the profiler, and the question was always: why is this thing so slow, can we make it faster? Over time it has expanded, and now we can use it for many more things, a lot of debugging, and it's a great way to get data about what the software is doing. It has multiple sources of data, but there are two main ones. The first is sampling: we have a timer, and at a fixed interval we interrupt the program and capture data, by getting the stacks of threads for example, or also getting the values of counters, for example counting memory allocations. The other is markers: we just record what happened at a specific point in time, and this is useful for things that happen very quickly but that we still want to see.
So if you want to get started with the profiler, as I said, you go to this address where you click the big button, Enable Firefox Profiler. You get this that appears; basically, clicking the button just made the toolbar icon appear here in your browser, but you could also find it by customizing the toolbar. Then you can select a preset. Here it says Nightly; it's a good default for general profiling. And then you can start recording, and it will show something like this. Then you can do whatever you would like to measure, for example loading a website. Very often people would like their website to load faster, so that's a good example. Do it, and a few seconds later, when you click the capture button, you will see something like this in a tab that appears. So here what I profiled is loading the Wikipedia homepage, and we can see there are many things in the user interface. It might be a little bit overwhelming at the beginning, but we get used to it very quickly, and you can move the mouse around and there will almost always be a tooltip like this that explains what you are looking at. So the first part here is what we call the timeline; it's things happening across time. We can see markers, those small things here that we noted; those are network requests or other kinds of markers. And here we are hovering on this yellow thing and we see the stack of what we are hovering. The stacks include JavaScript and C++; you can know everything about what was happening at this time. And then there's the bottom half that's showing data in various different ways. Here the call tree is just showing what the samples look like. This is the memory counter here, so a counter just counting how many allocations we did, and we also have markers here in the marker chart, where we can see which network requests we actually did. So this is a tool, again, that we developed to make things faster; then we will see how we can use it to make things more efficient.
So now we will talk about power profiling, and first say why we care about this, why we care and also why I care. So my work at Mozilla is to make Firefox more efficient, to understand how it uses power and reduce the power we use, and there are two main sets of reasons. The first one is still performance, because resource use is still a performance topic, and users actually care about power use because the fan is noisy, the laptop is too hot to type on, the battery life is too short. So all very good reasons, but at the individual level. And we also care about the more global level, for sustainability: Mozilla made climate commitments. Some interesting things that are mentioned here are that we want to lead openly and share our tools, and improve our products from a sustainability perspective. So sharing the tools, this is what I'm doing with the power profiling stuff, and the reason why we want to improve our product is that when we did a greenhouse gas assessment for Mozilla, it turned out the use of our product is 98% of our emissions. That should not really be a surprise; we have a product that's used by multiple hundred millions of people. So anything else we do, even if it's not clean, is still a tiny portion. Okay, so power profiling is to understand the local power use of Firefox or a website on a computer that's typically in front of us. I will explain the journey I followed to try to understand how much power we were using. So at the start it was: okay, I know what we are doing, I know how much CPU we are using, but I have no idea what that means in watts and where we can save power. And the first step was to buy one of those wattmeters that are always recommended for people who want to save energy by figuring out how much power is used by what in their house. It's easy, it's affordable, pretty accurate, but not that useful for software because you can't track what happens over time. Then I found something better; it's still a wattmeter, but it's sending data to a computer over Bluetooth.
There I can see the history of what happens, it's much better but still how do I match this with what actually happened, I need to remember what I did, it's not as convenient, maybe I could record a video of what happened on the computer but still it's painful to use. And I kept wondering what are other people doing? And then I found this article by Microsoft they were very happy to say that Edge was the most efficient, I'm bragging about it and blah, blah, blah, blah, blah, blah. One sentence caught my attention, power was measured on the surface book so Microsoft device because it has integrated hardware instrumentation. So what Microsoft did is they built their own computers with instrumentation so they could measure how much power is being used and they explain about what this thing is. Can I get one of those machines, sure, they are old, they were released in 2015 so getting old now but we can still play with those. So one of those machines is what they use when they compare Edge with everything else. The way they looked at it was with the Windows performance monitor so it's this application here and we can see indeed that I have power for the battery, CPU cores, GPU and the Wi-Fi. Pretty interesting and I looked for more recent devices, I spent multiple weeks searching for devices that might have those power meters because that was really interesting and the devices and the picture here they are the only two that I found that actually have working power meters and they are both Microsoft surface devices. This is what the UI looks like if you look at the Windows performance monitor. So those things they are here and they report data and it's numbers, good luck if you want to understand what that means. I have no idea. Well, I do have some idea but it took a while to understand and then I had a good surprise. I noticed on some machines that there were those things reported as energy meters and the names are pretty familiar. 
They are the same names that we see if we use a piece of software called Intel Power Gadget, which reports the power used by the CPU, the integrated GPU and those kinds of things. After some correlation I realized that all the machines on which I noticed this were running Windows 11 and they all had Intel CPUs. And I verified this: I found a Windows 10 machine, it didn't have this, I updated it to Windows 11, and then those things appeared. So it's really Windows 11 that brought those things. Intel CPUs are reported as power meters, and the very nice thing is there's a documented API to use those power meters. It's probably used by Performance Monitor or something like that. The UI I don't want to use, but with the documentation I could understand how to make use of this and what the unit was, picowatt-hours, so that was the answer. And we can query it many times per second; it doesn't have to be only once per second. And it's accessible in userland, which is something I care a lot about for Firefox, because I don't want people running Firefox as root or things like that. That would be absolutely terrible. And there's no requirement to install a specific driver. Before that, in our test infrastructure, when we were interested in measuring power we installed Intel Power Gadget, but we don't want to require users to do that, and it's not open source. So I started working on a prototype to include this in the Firefox Profiler, because as I said we record counters, memory counters, and it looked like this API was totally usable for profiler counters. This is the bug where I worked on it, and this is the first prototype I got. So the names, they match what we saw, and then we have those power tracks in addition to memory counters, network traffic and all the other stuff.
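Turning a sampled energy counter into a power track is simple arithmetic, and can be sketched as follows. This is not the profiler's actual code: it assumes the counter is a cumulative energy reading in picowatt-hours, and the sample values and function name are invented for illustration.

```python
# 1 picowatt-hour = 1e-12 Wh = 3.6e-9 joules.
PWH_TO_JOULES = 3600e-12


def average_power_watts(samples):
    """Convert (time_seconds, cumulative_energy_pWh) samples into average
    power in watts for each interval between consecutive samples."""
    powers = []
    for (t0, e0), (t1, e1) in zip(samples, samples[1:]):
        joules = (e1 - e0) * PWH_TO_JOULES
        powers.append(joules / (t1 - t0))
    return powers


# Hypothetical readings taken every 10 ms.
samples = [
    (0.00, 0),
    (0.01, 27_777_778),  # about 0.1 J used in the first interval
    (0.02, 33_333_334),  # about 0.02 J used in the second interval
]
powers = average_power_watts(samples)
print([round(p, 2) for p in powers])
```

The same calculation applied to a selected time range gives the total energy for that range, which is the kind of figure a tooltip over a selection would report.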
It actually looked pretty reasonable. So this is a profile of Firefox starting up: we use a lot of CPU at startup and then almost nothing, because we are done starting up. Here, this is the CPU being used, and here, this is the GPU; we use it at the beginning when we start showing something and then only every once in a while. Pretty reasonable. So I kept working on it and polished it enough that it would really work and be something that we would be happy to ship. So I showed the prototype; now I will say where it actually works, where we managed to get it working, because that's not everywhere, but still it's almost everywhere at this point. It works on Windows on those specific devices I mentioned before. It works on Windows 11 with Intel CPUs, and I've recently had reports that with AMD CPUs it started working. I don't know exactly when, but I suspect it's with this update, and I will try to verify soon. And one thing that's very interesting with AMD CPUs is that we have one power track per core, which might let us make a much better correlation about what's actually using the power. Mac: two different architectures on Mac, with a mostly undocumented or poorly documented API. Poorly documented means the name of the API is there, but there's no explanation about what it does or how to use it. But the kernel is open source, so by reversing a little bit I could figure out that this task energy thing was a value in nanojoules. And for Intel CPUs, there is a specific syscall with magic assembly code that we had implemented eight years ago; I didn't know anything about it, but someone pointed me to it. Great, so we can also support Intel Macs. And then Linux. On Linux we can use RAPL perf events. RAPL is Running Average Power Limit; it's the data that's reported by Intel CPUs.
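For readers who want to look at RAPL data by hand, here is a minimal sketch. Note that it does not use the perf events interface described in the talk: it assumes the Linux powercap sysfs files `energy_uj` and `max_energy_range_uj`, which may be absent on some machines and may need elevated privileges to read. The helper names are hypothetical.

```python
from pathlib import Path

# Package 0 RAPL domain in the powercap sysfs tree (assumption: may not exist
# on every machine, and reading it may require elevated privileges).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(name):
    """Read one integer value (microjoules) from the RAPL directory."""
    return int((RAPL_DIR / name).read_text())


def energy_delta_uj(prev, curr, max_range):
    """Energy used between two readings, allowing for one counter wraparound."""
    if curr >= prev:
        return curr - prev
    return (max_range - prev) + curr


def watts(prev, curr, max_range, seconds):
    """Average power over the interval, from two microjoule readings."""
    return energy_delta_uj(prev, curr, max_range) / 1e6 / seconds


# Offline example with made-up readings taken one second apart.
print(watts(0, 15_000_000, 10**12, 1.0))
```

On a machine where the files are readable, calling `read_uj("energy_uj")` twice, a known interval apart, and passing the results to `watts` along with `read_uj("max_energy_range_uj")` would give the average package power for that interval.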
One issue on Linux is that the data is not available as a regular user. The reason is that it used to be available directly, and there was a side-channel attack where people noticed that by querying power use very repeatedly they could actually figure out what data was being processed, and the way they addressed it was to restrict the access. So you need to run this command before starting power profiling, and it's actually the same command that you need to run to use Linux perf events to profile in general on Linux. So it's probably fine, and as long as it's just this and I don't require Firefox to be run as root, I think it's okay. AMD CPUs are supported since a newer version of the Linux kernel, but it's a few years old at this point, so probably fine. And if you try it, it doesn't work with the Ubuntu Firefox snap packages, but if you download Firefox from the Mozilla site and don't use the snap package, it works on Ubuntu too. Here's how you configure it. So I showed the profiler UI before, where you could use the Nightly preset. I said it was fine for most cases. If you want to profile power, we have a Power preset that's configuring the profiler. The two things it does: one is enabling power profiling, so that's this feature we have here in the configuration page, and the other is adjusting the configuration to reduce the amount of overhead, because if we have a lot of profiler overhead, the things we will see in the profile will be meaningless. Actually, we already tried to do power profiling a couple of years ago. There was supposed to be a picture here. That's strange. I see the picture on my screen but it's not here. So I was saying, we were afraid that the profiler overhead would make it impossible to get any useful data out of power profiles, because the profiler is actually using a lot of power itself to interrupt, sample the stacks and all that.
So we can reduce that overhead by using longer intervals between samples and ensuring that when we sample, we only capture the values of the counters and not the actual stacks. It appeared quickly. So we see the features here that were enabled: power profiling, markers for all threads, sampling every 10 milliseconds instead of every 1 millisecond. Now that I've explained how we got this power profiling thing, I will show examples of what it looks like when we power profile something. So loading the Wikipedia homepage again, this time with the power profiling preset, and we can see exactly how much power was used by the content process loading Wikipedia. We can select around here and we see in the tooltip how much power was used by this process during that amount of time. This profile was captured on Mac, so we have a track per process. Another example. By the way, the profiler is very easy to explore by moving the mouse around and looking at the tooltips. It's not so great for screenshots. So all my slides include a link to the profile that I'm showing. If you want to look at the slides later and click the link, that will be a lot more fun for you, I think. This time it's Firefox startup on Windows 11, and we can see how much power was used by the CPU when starting Firefox. So we can see it here. And this is an example I really like, because it's really pushing the limit of what we can profile, and I had never thought we could profile this, especially when I was afraid about the overhead. But if we profile, we see that when we do nothing with Firefox, literally nothing, like I was profiling Firefox on about:blank, so literally nothing, no websites, the one thing that's left is the cursor that blinks in the address bar. Every 500 milliseconds there's a power spike, and we can see exactly how much power it uses to show or hide the cursor in the address bar. So yeah, very detailed.
And then I will show some examples of how we used power profiling to validate fixes we've done. So this is something we did specifically for Windows 11. They have a new feature that they call efficiency mode, and it's visible in the task manager by looking at this icon here, this green leaf thing. It looks a lot like greenwashing, but it actually does something. It means we let the operating system know that this process is doing nothing that the user cares about immediately, because it's probably invisible, it's probably in the background. What we want, instead of doing the work as fast as possible, is to do it with as little energy as we can. It's typically doing this by doing two different things. One is ensuring we use the lowest possible CPU frequency, which uses less power. And the second thing is, on hybrid CPUs that have both efficiency and performance cores, to always use efficiency cores. And this power profile was captured on a modern Intel laptop, so we have an Alder Lake CPU with efficiency cores. And this is a web page that I made for testing. It's using 100% of the CPU, just a busy loop that does nothing except burning CPU. The process here is in the foreground. Here we see a process priority change, because we go into the background. And here you see how much power we use. It's about 10 watts for the CPU; here it's down to 2. So we divide the power used by 5 just by ensuring that we let Windows know that this process is for something in the background, don't worry about speed. And things actually run slower, by the way, but use less power. So the previous examples were where stuff actually works and power profiling was useful to show the stuff we cared about. And now I wanted to share a few funny examples of things I absolutely didn't expect. This profile is from one of the Microsoft Surface machines I showed before in the picture. I said we profile the CPU, the GPU, and the Wi-Fi chip there.
And that's the only machine where I can profile the power used by the Wi-Fi. And I noticed that the Wi-Fi chip power use is almost always half a watt, so 500 milliwatts, when the machine is plugged into a charger. And what happened here is I unplugged the charger, there was a power broadcast event, and now we only use power when there's probably network traffic. And I had no idea that Windows was doing this. I think it's to reduce latency, but it keeps the Wi-Fi chip alive all the time. And more Wi-Fi profiling, because CPU profiling you can do all you like, it's easy to do. For Wi-Fi profiling you need specific hardware, so I will share the fun. This time I was power profiling what it looks like to do a bandwidth test. So it was the website speedtest.net. It's a single profile, there's a link here, but I zoomed in on two different parts of the profile. The top half is when I actually ran the test on the machine. So we still use 500 milliwatts here, and we have peaks that go up to two watts for the Wi-Fi chip. We push it to the limit. It was actually not really testing the bandwidth of my internet connection, more of the Wi-Fi chip of that machine. And the second chart is when I did the exact same test, but on this laptop that was on the same desk. And here it's stable at the beginning, 500 milliwatts, and stable at the end. And for about the same duration we have almost the same shape, but it only goes up to 700 or 800 milliwatts. So we can see that if there are computers in the room or in very close proximity that use the Wi-Fi, it looks like there are more Wi-Fi packets that the machine needs to discard because they are not meant for this machine. And that actually uses power. And I think we can actually look at network traffic by looking at how much power is used by the Wi-Fi. And when I tried to get this, I was getting very confusing results. And then I realized someone was streaming something in a different room in the house on the same Wi-Fi network.
So we closed that computer and then I got better screenshots. And yes, also when I worked on that, I put wired network on all the other computers in my office, otherwise it was a mess. And another one that's still puzzling to me. I said on Mac we have a power track per process. I don't exactly know how they do it, but I suspect it's because they control both the CPU hardware and the kernel. And I suspect what they do is they have a power meter internally for each core, and whenever they context switch, they very likely take the value of the counter at this point. So they can know exactly how much power is used by each process. And this example, I still can't explain. So I was using a test web page again that's using 100% of a CPU core. And we see it's using 4 watts. By the way, it's three different screenshots that I merged into one so that you can see different tooltips. So if it's not perfectly aligned, it's because I'm not so good at image editing. So it's using 4 watts here with a process that's just burning CPU. And then the other processes, they do literally nothing. You will have to trust me on this or look at the profile, but I looked in the marker chart and there's literally nothing. The threads don't wake up. So the only thing we are profiling here is the actual power overhead of the profiler. And if you compare the numbers, we're talking about 4 watts here and 2 milliwatts here, so it's probably fine. Profiler overhead is probably not distorting the information too much. But the one thing that's really strange is that when we stop burning CPU here, a few milliseconds later the power overhead of the profiler drops dramatically, to about 10 times lower. I still don't have the correct explanation for this. I have ideas about what it could be. I suspect it is that when we are actually busy with the CPU, the operating system uses a higher frequency. So it's likely that it's actually correct, and it's just that we are using the CPU at a higher frequency here.
So the same operations took more power. But I'm not sure, it's just a guess. Memory contention or something, or core contention, there are lots of things you could be observing across the system. Yeah, there are lots of possible explanations, but I don't have a way to conclude about what the thing actually is. Another idea was that we also have efficiency cores on those machines. It could be that we are switching to efficiency cores, but I'm not sure it's a good explanation, especially given I have that many processes on only two efficiency cores. So yeah, I don't know, but it's fun things to look at in profiles. And if I have a few more minutes, I have three more slides. One thing I wanted to share is the Firefox task manager. Very often when you care about power profiling, it's because something is using too much power in your Firefox. And the good way to look at it is to look at the task manager here. That will give you all the processes used by Firefox, but in addition to showing just the process IDs and what percentage of the CPU each one is using, it will tell you which tabs are loaded in which process. So you can figure out if you want to close a specific tab that's using too much. But also, there's this profiler button here that appears when you hover the line, next to the PID. If you click it, five seconds later you get a profiler tab with everything that happened in that process. So in most cases you just need to close a tab, because you have one tab in the background that you don't care about that's doing crazy stuff. But if something really looks not the way it should be, you can do one-click profiling. And if your machine supports power profiling, the power tracks will be there. Another thing I wanted to mention, it was also visible a little bit in Chris's presentation: I worked on power profiling, but I didn't work on adding the CO2 equivalent. This was a very welcome contribution from Chris and Fershad from the Green Web Foundation. We are very happy about that.
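A CO2-equivalent figure of the kind mentioned above comes from multiplying measured energy by a grid carbon intensity. The sketch below only shows that arithmetic and is not the code contributed to the profiler; the intensity value is a made-up placeholder and the function names are assumptions.

```python
def joules_to_wh(joules):
    """Convert energy from joules to watt-hours (1 Wh = 3600 J)."""
    return joules / 3600


def co2e_grams(energy_wh, intensity_g_per_kwh):
    """Estimate emissions in grams of CO2e for a given energy use."""
    return energy_wh / 1000 * intensity_g_per_kwh


# Example: a process drawing 10 W for 60 seconds uses 600 J.
energy_wh = joules_to_wh(10 * 60)

# Hypothetical grid intensity; real values vary by region and by hour.
ASSUMED_INTENSITY = 400  # gCO2e per kWh

print(co2e_grams(energy_wh, ASSUMED_INTENSITY))
```

Because the result scales directly with the intensity figure, the same profile can give quite different CO2e estimates depending on where and when the machine was running.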
And the last thing I wanted to share here: so I explained in all of this presentation how great it is to power profile, and now I will explain why you don't really need it. In most cases, what's using most of your power is the CPU. And without power profiling, we can already profile CPU use. So we have CPU use per thread here. That's how we make the shape here. But sometimes we don't look at all the threads at once, and something else might be using power or CPU. So we also record the CPU use for the entire process. It's also a counter, so we record it in the same way. We don't show it by default because we're not too sure about the user interface we want to put for it. But you can access it from the DevTools console here. You type experimental.enableProcessCPUTracks(), and you see those process CPU tracks that appear here. And the shapes, they are extremely similar. Usually there are slight differences, mostly when we use the GPU a lot. Like the shape is slightly different here, slightly different here. Overall, it's mostly the same. And CPU profiling you can get on all machines. That's all I wanted to share for today. First, thanks for your attention. If you have questions, I think we have a little bit of time. You said the screenshots were public. Is there any data set underneath the screenshots that is public? I didn't say the screenshots are public. I said the profiles are public. And there's a link at the bottom of the slides to open the profiles. Yeah, but you can really get your own profile. And it's a lot more fun when you profile something you actually use, on your own computer. OK, but it could be useful to make a community data set covering, as you said, everything from a blank page with a blinking cursor up to a typical very bloated website. And then try to make a loop back to the website builders to give them information about what is significant. OK, so it was more a comment than a question: it would be useful to publish examples of what's using power.
One thing I should probably have mentioned is that the power numbers don't mean a lot in terms of what your actual users will experience, because the typical power use of a computer varies a lot. Some of the machines I was showing in the picture, they have four-watt CPUs. Some people have 200-watt CPUs. We also have telemetry at Mozilla, and we look at the power use of our users' CPUs: the most common CPU power is 15 watts, that's typical for laptops, and the second most common is 65 watts, and that's typical for desktop machines. Here's a question from the internet, where I think Friedland asks if it's detailed enough to verify that constant-time crypto algorithms are also constant-energy crypto algorithms. Should I repeat? So the question was, is it precise enough to verify if constant-time crypto algorithms are also constant-energy-use algorithms? If you can run the algorithm multiple times in a row, long enough that you can profile it, maybe. So we would need to run it many times with different inputs. I'm not completely sure, honestly, but you can try. And the sampling rate that we get is at most once every millisecond on some operating systems. And about the side-channel attack I mentioned: on Linux it was worked around by restricting access. On other platforms, the way they work around it is by ensuring we can't get new data more than once every millisecond. So if we access it more than once every millisecond, we get the same data again. So you can't profile it more often than every millisecond. Yeah, it comes up, right? Yeah. Thanks. |
Update on open-source energy system modeling in the global south and including Africa |
Okay, my name is Robbie Morrison and I'm here to talk about energy system modelling. I want to take you right up to the stratosphere. A couple of things on my background; I won't go through all this, but I started climate campaigning 33 years ago. I started high-resolution national energy system modelling 23 years ago, and I started open source energy system modelling 20 years ago, so I was right at the beginning of those trends, pretty much. I want to talk briefly about the Open Energy Modelling Initiative, which started about eight years ago, and it's an informal collection of modellers. We now have about 1,000 people involved. The bulk of them are early-stage full-time researchers, and that gives you an idea of how much interest there is in this open side. There is an entire parallel universe doing closed modelling that we don't have much contact with, in the power companies, in the World Bank, in the multilateral organisations, so I'm only going to talk about the open source side. The final point up here is that this whole field has flipped radically in the last year. I get contacted by corporations and economists and so forth now, which would never have happened two years ago, so this is a complete game change. I'm not going to talk very much about energy system modelling, but if you want an introduction I recommend this YouTube video, which is made with my partner in the car park, and it's descriptive and it's quite good. This is a quick schematic showing what these models can capture. This just happens to be one that I pulled up that's hybrid, with agent-based modelling in it, but you see a lot of the entities, if you like, that were being discussed in the previous talks, but brought together in a collective.
So we have households and we have market operators and we have lines companies and we have markets and we have AC power flow and we have a lot of kith and the system, hydrosystems, storage, gas turbine sets and so forth, and a whole lot of external characteristics coming in through weather conditions, interest rates and so forth, so that's the broad picture. If you want to look at the models that exist, this Wikipedia page is worthwhile, it's about half complete and it covers the various models. Some are directed specifically to the energy sector, but increasingly they're a sector coupled and they come into the whole energy system. The basic paradigm is operations research, so the underlying model produces a set of constraints in a sparse matrix, has a goal function which is normally minimum aggregate cost and feeds that all into a solver and returns a result. The way that the analysis proceeds is by so called comparative analysis of scenarios, so you pick a base scenario, a reference scenario and then you propose different scenarios that you want to explore with nuclear, without nuclear and so on and so on. These are the high resolution, they have a lot of detail in them, so they have the plant and the network and so forth in them. A lot of external circumstances, weather, demand for energy services and so forth. They are contiguous time which is really important nowadays because with renewables and storage you can't kind of do typical periods, you actually have to work your way through the entire system as it evolves. The evolution might be out for 30 years, out to 2050. There's a degree of different types of foresight, sometimes it's perfect foresight so we know everything about the future, other times it's stepwise so we do recursive dynamics. 
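The least-cost paradigm and the comparison of scenarios can be illustrated with a toy example. Real models hand a large sparse constraint matrix to a solver; the sketch below instead uses merit-order dispatch, which gives the same least-cost answer only for the special case of a single node, a single time step, and plants with constant marginal costs. All plant names and numbers are invented.

```python
def least_cost_dispatch(plants, demand_mw):
    """Meet demand from the cheapest plants first.

    plants: list of (name, capacity_mw, cost_per_mwh) tuples.
    Returns a dict mapping plant name to dispatched output in MW.
    """
    dispatch = {}
    remaining = demand_mw
    for name, capacity, _cost in sorted(plants, key=lambda p: p[2]):
        output = min(capacity, remaining)
        dispatch[name] = output
        remaining -= output
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return dispatch


def total_cost(plants, dispatch):
    """Aggregate cost of a dispatch, the quantity being minimised."""
    costs = {name: cost for name, _capacity, cost in plants}
    return sum(costs[name] * mw for name, mw in dispatch.items())


# Reference scenario and a "without nuclear" scenario (made-up figures).
with_nuclear = [("wind", 50, 0), ("gas", 100, 80), ("nuclear", 60, 30)]
without_nuclear = [p for p in with_nuclear if p[0] != "nuclear"]

for scenario in (with_nuclear, without_nuclear):
    d = least_cost_dispatch(scenario, 120)
    print(d, total_cost(scenario, d))
```

Comparing the two totals is a miniature version of the comparative scenario analysis described above; once networks, storage and many time steps are added, the greedy shortcut no longer applies and a proper solver is needed.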
Technological progress is included, one-factor or multi-factor. For example, with the uptake of a particular technology like solar PV, the model will internally reduce the costs for that particular technology as it's taken up and evolves through time. The optimisation is usually mixed-integer linear programming; anything more exotic runs into performance issues. Conceptual extensions include embedded decision taking using agency, multi-criteria optimisation, some assessment of co-benefits such as urban air quality, sensitivity to the framing of the problem, the role of uncertainty and the exploration of near-optimal solutions. So this is system modelling, and all systems and problems, if you like, together have natural boundaries. If you want to model Europe, or we want to model an energy system in Germany, you probably want to go to the boundaries of Europe, for example, because that's kind of a natural point. The methods naturally seek technical synergies; that's one of the advantages of using these systems, the least-cost approach will pick up the synergies and get them working. Future climate change is normally included, projected future climate change. These models may exhibit undue sensitivity to both data quality and system resolution, so they're not without issues that have to be explored by modellers. They started off with energy systems, electricity systems coupled into district heating and into gas and so forth, but they're increasingly branching out into land usage, water use, and the industrial sector when you're looking at things like hydrogen, ammonia, thermal integration and steel production. Carbon capture is included now outside of the energy system, so residual emissions from cement and from agriculture are now being included in these models.
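The one-factor case of technological progress is commonly written as a learning curve, where unit cost falls by a fixed fraction each time cumulative installed capacity doubles. The sketch below assumes that standard form; the talk does not say which formulation any particular model uses, and the 20% learning rate and starting cost are invented.

```python
import math


def unit_cost(initial_cost, initial_capacity, capacity, learning_rate):
    """One-factor learning curve.

    Cost falls by `learning_rate` (e.g. 0.2 for 20%) with every doubling of
    cumulative capacity: cost = c0 * (x / x0) ** (-b), b = -log2(1 - LR).
    """
    b = -math.log2(1 - learning_rate)
    return initial_cost * (capacity / initial_capacity) ** (-b)


# Hypothetical technology starting at 100 cost units per kW at 1 GW installed.
for gw in (1, 2, 4, 8):
    print(gw, round(unit_cost(100, 1, gw, 0.2), 2))
```

Inside an optimisation model this relationship makes cost depend on the model's own deployment decisions, which is one reason such formulations add to the computational burden.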
Comparability also and we've had some talks about vehicle charging but this is to look at the whole picture and not just the perspective of the householder or even the lines company. Co-benefits beyond climate change mitigation I mentioned. What isn't in the models is there is no embedded economy. If you want to do that then you have to go to process based integrated assessment models which are widely used by the IPCC and in which case you have a lot more kind of an economic take on the system. The model started off being open source but there are good reasons why we want to look beyond open source and the one, the first reason is to go to open science. So we want genuinely open data and we want it under communal curation. We want full transparency and as modelers we want an engaged overarching community so that we can compare and contribute and support each other. The goal in my kind of take is that we should be looking at public policy analysis which is based on peer production, on commons based peer production and the reason I say that and I think there was a talk earlier this morning from the European Commission, people like the European Commission do not have the capacity to explore the solution space and I will add nor do they have the creativity required. That's not a criticism, that's just an observation. So we really want a massive effort in exploring what our future could be out to 2050, the kind of trajectories and pathways and requirements that are needed. Some potential for public engagement but very few examples to date when these models are used for more specific projects. Our biggest Achilles heel is complete and coherent data for public interest analysis. We are not data scientists, we are desperate to have data which is complete and coherent. If it's dirty, it's a problem. If the semantics behind the data collection is somewhat inconsistent, it's a problem. If the information is missing, it's a problem. 
This may not be an issue for data scientists using statistical techniques or machine learning, but it is for us. One issue that doesn't get much airplay is data standards. Quite a lot of the data standards in this area, especially in the electricity sector, are proprietary. They come under so-called FRAND, we heard about that: fair, reasonable and non-discriminatory conditions. The problem is that if the data standards are legally encumbered, then the code bases that reflect them and the data sets that comply with them could become derivative works under intellectual property law, and we are in trouble. So we basically want CC BY 4.0 or something similar on the data standards. I'll skip the last bit on data sets, actually, and I'll skip the slide, but I just want to point out that the situation in Europe is pretty awful on a number of levels. You go to the US and you'll find a much friendlier environment for this kind of public interest information. Okay, the second part of my talk is about the global south, and the question is why someone who's white, male and old is standing here talking about the global south. My short answer is I'm from Aotearoa New Zealand, and New Zealand became bicultural over my lifetime, and I saw that process and contributed to it. I had radio programmes in the early 90s on sustainability and conservation on tribal radio, on iwi radio and so forth. I went to land occupations, I organised joint meetings with tribes, hui they're called, and they take place on marae. So that's kind of my back story about why I can talk about this, I think. This is a map of Africa with the high-voltage network present, and you will see that there is very little structure there. South Africa a little more. Davide is going to talk a little more about this, so I won't. This is another example, from a model called OSeMOSYS. This is in Africa, and these are the cumulative trades out for the next 30 years. So this is the kind of thing that the models are starting to look at.
There are two overarching projects in this area. The first is the OSeMOSYS Global project. OSeMOSYS is written in a high-level mathematical programming language called MathProg. The second one is PyPSA meets Earth, which is written in Python and which you'll hear a little bit more about. One of the interesting things, I thought: I looked it up, and Software Heritage collects the forks for a particular code base, and there are 135 fork repositories for OSeMOSYS and 308 fork repositories for PyPSA. So that gives you an idea of how the open source world works, when people fork the project (these aren't hostile forks, I presume), use it for their own work and hopefully contribute their changes back upstream. There's clear activity now in Central America (Costa Rica), South America (countries like Brazil), India and surrounding regions, South Africa and Sub-Saharan Africa, and most of this is in the context of academic work. We have no connection, or very little connection, crossing over with the multilateral development organisations and so forth. So this is the parallel universe I mentioned. How are we doing for time? One of the issues that we face, I think, is interacting with official agencies, because we are relatively informal and relatively self-directed, and we are also in competition with agencies like the International Atomic Energy Agency or IRENA or whoever, who are doing their own analysis. I quite like this quote from Oliver Geden: "Everyday politics is therefore dominated not by evidence-based policy making but by attempts at policy-based evidence making." And that's exactly what we want to avoid.
I talked to the incumbent NGOs about using our kind of analysis and they weren't very interested, but I feel quite encouraged now because there is a new set of foundation-backed think tanks who are actually very keen on this kind of stuff. I'm sorry I can't mention too many names, because I was ill for the two weeks prior to this talk and I didn't get consent to talk about them, but a couple are Climate Analytics and TransitionZero. Some official agencies are starting to talk about open sourcing their stuff, but they're not doing it in a particularly robust way, in my opinion, and this is a problem. They will either openwash or they will do what's called throwing their code over the wall, which is to put it on GitHub with no attempt to develop it; there are no issues listed, and whether it even runs is open to question. As regards working with the global south, I had about 10 interviews with researchers in the global south to try and find out what the scope and issues are. They were unstructured interviews, but it was kind of interesting. So the clear benefits of open source projects are, of course, few cost barriers, with the caveat that the commercial solvers can be expensive. One license for Gurobi might equal three full-time researchers in India, for example. There's a soft technology transfer; it's bi-directional, it's lightweight. All the software projects bundle associated communities, and this is, I think, a really useful part. And the work is transparent; it can be studied and challenged, which I think is really important. There are some cross-cultural considerations, I think, that are necessary to explore, and I talked about Aotearoa New Zealand becoming bicultural. Indigenous languages bundle different concepts, and they're quite noticeably different. Sovereignty is an issue; it's really easy to transgress sovereignty without realising it.
There's a question of representation, the projects are all pretty much white and male and the global north at the moment and the next question really is also a matter to be traversed is that the framing of the models and the problems from a global north perspective may not be very appropriate to the circumstances in the global south. Global slide challenges, just overarching challenges, most of these won't be very surprising, code maintenance is always a challenge, support for maintainers. Building a suitable knowledge commons is going to be a real challenge, for instance the international energy agency only sells its data under non-disclosure, we don't get hold of that although it's collected from our national governments, the European Union is focused on data commodification through its single digital market, the scientific institutions are unnecessarily protective, I talked about cross cultural issues, we need to find new ways of interacting with official agencies to get any of this information into the policy process and I'll just conclude with a quotation from an East German playwright, Heinrich Müller, optimism is just a lack of information, okay that's it, thank you, yeah any questions, can you speak up a little too if you ask questions, maybe you said the European Union has some issues with open data, I know that the European Space Agency has really strong footprint on doing all this or is it Sentinel data, stuff maximum open to drive a new economy, so has this lot to spread to the other agencies yet, no the ones I'm going to mention and I will mention some names, the Meridata for climate, future climate is under bespoke license, the YASA data on scenarios going forward also under a bespoke license and so on, so a lot of the Horizon 2020 projects are also problematic, the stuff under statute reporting is also legally encumbered, so I can't for the life of me understand why, but some of it is technically encumbered, so for example the transparency platform 
run by ENSOE is legally encumbered, the EEX data from the European energy exchange also and also technically encumbered, you can't cut and paste it off the website, it's not very deep protection but and we've complained my friends to ASA the regulator and they say it's compliant, sorry yeah. There's the open government license, UK 3.0, I don't know which one they're using, the other decent experience was with Elexon UK balancing, well I think I'm their only official licensee but other than that it told me to retract everything, I can use it completely open which is quite nice, so you know it can sometimes be. I just want to comment on licensing, the one that, the really the only license that works is CC by 4.0, if you go to the open government license UK 3.0 you'll find it's not interoperable with Creative Commons and so you end up with legal data silos, all the licenses are written by lawyers, I can assure you that and the lawyers all know what they're doing, okay okay okay, there's a question up there or no, yeah, yeah yeah, Remind went open, that's from Pic, went to one of the high GPL licenses, I filed a bug report on that because the GPL licenses have a clause on the, remember when Java was proprietary and you have to have an open language for a GPL license, they use GAMS which is not an open language and I filed a bug report and I know that personally the lawyer who responded who said it was okay, now look I'm not an open source lawyer, I didn't write the textbook but that was where that discussion went. Have you seen any new funding come into this particular field to open things up more, because all I know is that in December I know that the Creative Commons group, they've started, they've started to hire new roles in this specific role because they landed like a, you know small millions of Euros grant for this, but beyond that I don't know if there's, if you know any other groups starting to do stuff in this field. 
The overarching, okay, okay, yeah, thank you, oh sorry, the question was funding specifically for open source and the sort of short answer is, hang on, the short answer is that the funding, I'm talking about Germany let's say, has been quite good for modelling in general and it hasn't been specifically directed to open source. The high level organisation, the Open Energy Modelling Initiative hasn't needed resources as yet but what will happen going forward I don't know, but the funders are interested in the kind of open science component of what we do, that's quite clear and I presume that the next rounds of funding will start looking for real open source projects to be, to be for support. Yeah? So what would you say in all your years of experience has to change and how can we push for the change so that we get these open data, so what are the levels we have to pull? In a particular, well, the question was what levers are needed to come to genuinely open data, it depends on the jurisdiction, in the US it's quite good, federal, work by federal employees is public domain and there's been enough copyright, legislation around copyright that most of the stuff isn't actually covered, protected by copyright, they don't have a database directive. Working back to Europe, the only solution I can see is CC by 4.0 as a policy, which is lightweight, doesn't require legislation or change and so forth, but it does require the European Union to get out of the data commodification and I didn't mention it but there's a thing called the Data Producers Act which is still live which might come back into the data act, the proposed data act and that would be a complete travesty for us because that would mean all this machine generated data would now have its own intellectual property and I couldn't think of anything worse. Okay, yep, thank you everyone. |
Open data and open-source adoption in the energy sector
filling the gaps with the open community |
I'm Davide Fioriti from the University of Pisa, co-director of the PyPSA meets Earth initiative. Today I will talk about open data and open source adoption in the energy sector, and in particular I will give the example of how we are doing this in the PyPSA meets Earth initiative, in which we develop energy models. So let's start from the business-as-usual model in energy planning. I really thank Robbie for the fantastic introduction before. What most of the big players are doing, except maybe a few exceptions that are also presenting today, is that a lot of entities, which means governments or also TSOs and really big players, are paying for access to solvers, models and data to develop their own very specialised model, to obtain some results that they are highly interested in, and they are willing to pay a lot for that activity. But this clearly leads to problems of transparency of results and replicability, which we have been talking about all day today. To sketch a bit of what we mean, and to clarify the duplication that is occurring, let's say that three entities are working on different, very narrowly focused activities. Entity A may have some of the same features as entity B but a different geographical scope: let's say A is interested in Germany and B is interested in Italy. On the other side, entity C, which may develop something similar, is interested in adding another feature, for example stability features, but is still working on Germany. This, however, leads to duplication, because entities A and B need to develop the same feature twice, and entity C instead has to repeat the work of filtering the data, and this is an issue, and we are going to talk about this.
In the open-air approach in energy planning is that the idea is to open up the different tasks and with open source tools that Robbie mentioned about and so this can be avoided because the parts that are needed the data handling and the efficient they can be simply taken up and appropriately used. In a short way let's not reinvent the wheel let's say and awareness of this is rising as we all know and especially the industry level it's interesting to see here that the organization of DSOs at the European level is pushing towards opening and using open source in practice however reality is a bit different. We have seen Alieander that is doing a great job here but at least an Italian level I tell you that this is taking a while and I'm sure that this is kind of commonplace. In fact the reality far from true complete here we can see data and the access to data is an issue that is needed to use successfully open and fully transparent activities and you can see here that in terms of public available data on final energy there is none. IEA is a lot of data and recently has pushed towards releasing openly the data but currently they are missing the funds to do so in fact they asked the national governments to come to feed the gap. So open tools are great but users may concern in using them because of quality and security issues that have been discussed and developers they have some concerns as well. However we can solve that and there is the need to improve on these together so to quickly and go beyond these barriers apparent barriers. On one side we need users to coordinate the developments so to not waste the scarce resources and answer the quality issues and long term sustainability that we are facing and on the other end developers with appropriate license that can be protected and can still thrive with no problems. So we talk about open data and what about the numbers. 
A recent review in 2019 covered about 30 energy models worldwide, and thanks to the openmod initiative we see that these numbers are growing. Last week the number was 73, and this is great, but some concerns arise: is there some failed cooperation in all of this? How much duplication is there in these numbers? I'm not answering that, I'm just posing the problem now. That's why I think there is the need for the initiative to provide guidance, as they are actually doing, and better coordination, to work together rather than duplicate activities, and also to show possible users that the model they need is already there. In particular, I took a screenshot of the overview of the models, and we can see that by columns it is possible to focus on some features that are interesting. I think it could also be nice to show some application-based recommendations: let's say, if I want to develop an energy management system, what are the options, and so on. So now let's go to the tools used for energy planning. These tools have been used for different scopes and they have different features. We have recently summarised some of the features that the different tools provide, and in particular we focus on energy planning in Africa. Basically, the state of the art of energy planning in Africa is the use of PLEXOS. PLEXOS is a commercial tool that has a lot of features; it is very trusted by the industry and in fact very widely used. But as we can see, there are plenty of open source alternatives that can compete in terms of features with what PLEXOS is offering. So why are we not using the open source ones? That's a big question.
So and that's why we need to build trust and to work together to thrive on this and the Peter said that I want to show here and also recall later when we go deeper into the pipes and its initiative is that we need to work together and then duplicate efforts and also share knowledge and data because by intertwining and sharing them we can only achieve this goal. So now let's go into deeper and how we do that at the our initiative in particular what is pipes and it's earth but it's earth is an open independent research initiative that aims at pushing at the global speed up the global energy transition by using open data in open source tools. In particular we have no barrier in terms of a peep of preconcepts we are welcoming anybody any of you of any friends that you may have to join and collaborate with us and if you ask okay but what are you working on so here you can see our main four pillars and I want none ever to strengthen enough the concept of community in fact this is the first one and we have the first concept and this is very important we have a discord channel and currently we are over 200 people in this discord channel well above and that's not like 201 it's like about 250 something like that number 300 already okay fantastic and they changed it pretty fast because I checked last week like fantastic and we push for open data because we believe and that open data is the only way and not only that we also push for open models and but that's not that's not a novelty but it's something that is also interesting to show is the concept of solver on the right because having open data and open models is not enough if you want to produce the results you need a solver that is open to possibly use in fact the initiative we have also showed that and improve the visibility of other tools from this point of view something that we know we is I think one of the novelties here is that when the project started and I have to be honest when the project started I was not there 
I joined as others as well is that we grew an existing user base so we didn't start from new modeling and with nothing else we started from the PIPSA user base that was already large and established with a lot of tools for different elements in particular you can see here PIPSA that is a framework we call it that that enables to draw to draw easily equations automatically so that the user does not need to do that if you want to add for example a storage in a in a bus it does it for you while instead we have also them there are the also the other models PIPSA Europe and PIPSA Europe sec that are the frameworks populated with data to represent the Europe system what we did is to leverage on this knowledge and this expertise and draw the earth modeling and that you can see here in particular we are working on several packages the earth model that aims at modeling the earth and other two models is linked for data creation that I will talk to you about later briefly and you may ask why PIPSA but the answer is here and it's quite popular and thanks to that our community is pretty large and I really like this image with a lot of faces and I think that we should update these numbers hopefully not adding all of them but the major ones so if I you if it to summarize our recipe in a single slide I'd like to show this here so first we start from growing an existing user base not creating a new one secondly by create by leveraging a user base there is an existing model and what we want to gather in our procedure is to cut all the possible contributions that other user may do because this can benefit each other in an interesting way in this matrix you can see that if a user other country then another user adding the feature a may benefit to of the existing feature so that they can use the feature a on the existing model plus the contribution of the country see and similarly lack likewise the feature be added by another person may may add on top of each other and this is what 
actually what we are working on we put to produce open data and open source tools that are then shared with the other communities as well in fact we add also other chats with other communities so to to possibly provide the data that we produce to feed other energy modeling tools for example written in Julia which is not our and then obviously clearly by doing this doing this activity we share knowledge so how to plan for a bright future we need policy tools and analysis and for these reasons we have provided pipes earth that is our measure tool for energy sector model for earth scope and we are working on distribution level approach that is called pipes of distribution but this is highly under development so for the one on the left the package is stable and what about the data so we have a lot there are a lot of open data but especially in some regions that are missing I like this slide in which you can see that on the left they are the open data and on the right you can see that there are a lot of missing ones in their actual data set and that's why there are we are there are some packages to tackle this issue particularly we are relying on we are developing the tech energy that aims to estimate the energy infrastructure that may be missing and on the other hand there is the demand creator that aims to estimate the demand and leverage on existing AI tools to perform this task so to give a very quick recap of three slides for what we mean with the energy modeling and pipes earth first of all what are the functionalities and we can see that to satisfy the need by the policy makers that are robust, reliable, low-cost, simple and planning tools we provide our solution that means to leverage validated models with the community support and those and the open community absolutely and really many thanks to the existing great work that the pipes of community has done and I cannot say it's not it's never enough so these to satisfy these requirements we perform a complete 
procedure that aims to start from the data analysis in which we open we use and rely open data sets such as open street map and so on and we filter out to produce high resolution data and without a solution we mean that if you want to model your municipality so that if you want the municipality resolution you if you have a computer big enough can be are able to do that for the region that you may be interested to tackle are you interested in Africa are you interested in Nigeria are you interested in Australia that's what you can do and how can you make it easy and the procedure is the following like the general procedure they follow is that we decompose the large problem into small pieces in which every contributor can add these its own few lines of code to actually obtain the result starting from the creation file down to results and how is it easy easy to run two lines when it works obviously first you choose your countries of interest secondly you run this and when it does not work like always you can add see the documentation and also access our discord channel that in which you can interact with us very easily and to summarize and show you some results these are some results that are being published in our preprint and in particular you can see net zero energy planning for Nigeria and Iran for for Africa Nigeria has been validated Africa we need to work a little bit more on the data and what's next earth is next currently I'm working on validating each single country worldwide I've started with Africa South America in Asia and currently we are a status about 60% of those working so I can show you an image also about the current status and to do so that we absolutely need to share knowledge work together and data so thank you very much for being here and I'm up question yeah please okay that's a good question so the question was okay when you run the just two lines what are the outputs actually so in particular I can show so first of all we have also a YouTube 
channel in which you there are videos in which we run the models with some of the community and you can see everything from start to the end in particular the outputs are a lot of files that are all the intermediate files that produce time series for demand time series for the renewable production as well as the structure that encapsulates the networks that we produce the in terms of software they are basically CSV file and C file so there are different data structures that are all documented and that can produce I can show you also an example of folder so I usually work on a remote computer so here maybe a little bit so I can show you exactly my current folder that I use for developments okay and I can show you also the status of the countries that are actually working and obviously it's improving so this is a folder connections I'm not sure if it's going sorry I was connected but I got this disconnected before and so okay in the meantime I do have a clone here locally maybe okay yes so this is a typical pipe folder you can see that there are different data structure and different folders I don't want to go into the details but because but there are videos that are actually tackling these problems and we are we have actually run the models together but if you go and into the results basically the results are a folder which contains a network that is that contains is a pipes a network and you can open it also with a notebook from Visual Studio code and you can reproduce and see the outputs if in the meantime the folder has loaded in the meantime that's loading maybe okay now it's loading yeah please so currently pipe cert is more at the question was if we can model at municipality scale so basically the a question about the resolution spatial resolution that we are actually addressing so currently pipes earth is working at transmission level also it because we all have data for transmission level because we also rely on open street map while at medium voltage and 
low voltage lines the data sets are quite poor, if any; I mean, there are a few, but really few, which means that our model is currently more suited for larger areas. In particular, we work with administrative zones from the GADM data set, and what you can do is choose the level you are interested in and run the model for that level of resolution. This is possible. Obviously the data that are used are the data that are available from OpenStreetMap, but if you have highly detailed data that may be closed source, you can feed them in without releasing them; it is possible, and we are actually working on this. In the meantime it has loaded, and we can see, I mean, this is my current folder. There is a lot of stuff, but this is because I've actually run a lot of scenarios. You can see the number of scenarios that have been run, and each of them corresponds to a configuration file. For the image that you have seen, the light green countries are those that are being successfully executed; with the dark green we are almost there; and for the others we are actually bug fixing them, and in a few months we expect to have the entire picture green. The white ones have not been executed yet. Sorry, did you say that it would work with measured data or something? No, because the problems here are basically data management problems. In Japan, for example, the columns of the data set from OpenStreetMap contained, in the tag frequency, 50 comma 50 instead of 50 only. So it was really a data management problem that needed to be fixed, and that's why we have now improved the representation of Japan, and there is some other little bug fixing that we are currently working on. Okay, so the question, more or less, is what data we are using in the model. It depends on the type of information that we need. We merge weather
information coming, for example, from ERA5 data, and we collect GDP and population data; for the population data we use WorldPop, and so on. There are different data sets. If we look at the network data, we consider the OpenStreetMap data as the main source, but we are also working on possibly feeding in your own data, because maybe there is another data set that has better information than OpenStreetMap and you may be willing to use it instead of OpenStreetMap for a single country. We are working on creating this interface to feed in more data; it should be a few days of work, if anyone is interested. I think time is up, but for anything, you have my contacts. |
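The data management problem mentioned in the answer above (a frequency tag holding "50 comma 50" instead of just 50) can be handled with a small cleaning step. The following is a minimal sketch, not the project's actual code: the function name is hypothetical, and treating both commas and semicolons as separators is an assumption.

```python
import re


def clean_frequency(raw):
    """Normalise a raw frequency tag value to a single float, or None.

    Repeated identical values such as "50,50" collapse to 50.0.
    Genuinely ambiguous values (e.g. "50;60"), non-numeric values and
    missing values return None so they can be reviewed separately.
    """
    if raw is None:
        return None
    parts = [p.strip() for p in re.split(r"[;,]", str(raw)) if p.strip()]
    values = set()
    for part in parts:
        try:
            values.add(float(part))
        except ValueError:
            return None
    if len(values) == 1:
        return values.pop()
    return None
```

Returning None rather than guessing keeps the choice between conflicting values (for instance in a country that uses both 50 Hz and 60 Hz) visible to whoever validates that country.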
Elixir - Old wine in new casks
Intro talk about Elixir/Erlang |
All right, cool. So again, apologies for being this late. Really, don't take it out on the people that are organizing this room. It's really my fault. So I hope you still have a nice day. And I'll try to keep it short, so we stay on schedule. So this is kind of an introductory talk for people that are new to Elixir and Erlang. So Elixir is a language which has now already existed for 10 years. And it's built on top of the BEAM virtual machine, also called the Erlang VM. So it has some of the properties that the BEAM runtime has as well. And the BEAM runtime was actually created for telecom systems. So it's meant to be on 24/7. And because of that, it has to be fault tolerant, so if something goes wrong, it can still heal and keep on running. And because it has to be on all the time, it also means that any code changes should be done on the fly, while the system is running, without interruptions, without bringing systems down and bringing systems up, but just keeping things running and changing the code under the hood. It also needed to be concurrent, because it needed to handle a lot of incoming telephone calls at the same time. And it also needs to be distributed, because you have to connect telephone switches together and make sure that everything runs smoothly. So those are kind of the properties that Elixir also inherited from Erlang. So when you look at other systems: multi-threaded programming can be hard. In theory, it should all work: if you want to do something concurrently, we spawn a few threads and they do their work. But in practice, because threads can actually interfere with each other's work, it actually becomes a mess. So hence the second picture. The other property that Erlang has is fault tolerance. So in Erlang, you set up a supervision tree in which a supervisor is actually watching, monitoring, a worker.
And if one of those processes dies, then the supervisor actually makes sure that a new process is spawned in its place. And the system as a whole keeps running, even though one of the parts actually fails. And so the mantra that's very often told in Erlang is, let it crash. Nice timing, OK. Because people feel safe: if there's an exception, if your code always goes for the happy path and something goes wrong, Erlang developers tend to not care that much about it, because the supervisor will restart that process again. So very exceptional edge cases are sometimes not covered, because they feel comfortable having the system pick it up from there. Before Elixir came around, Erlang had also existed for quite a while. So Elixir also inherited some of the experience of 20 years of building telecom systems, which also pays off: for example, WhatsApp had only 57 engineers working for them when they were sold to Facebook. But only about 20 of them were Erlang developers. The rest were actually mobile developers supporting Android, Windows, iOS, et cetera. And they actually could handle a lot of users while having a small team. So then the question also becomes a little bit: why does Elixir exist? When people innovate, when they're building new things, there are approximately three ways they can go about it. Either they build something completely new, which didn't exist before. Or they try to combine ideas from other fields, for example. Or in some cases, people just put a new label on it and say, well, this is new, this is innovation. So hence the title of my talk: is Elixir really something new? Or is it just a new label on the existing Erlang foundation? And some other languages have tried to incrementally do some innovations. But after a while, the original sources picked up those changes.
CoffeeScript is a very famous example of this, in which the original language picked up those changes, and nowadays far fewer people actually use CoffeeScript. So how are we doing on time? Okay. So the question is then also, why did José, the creator of Elixir, write a new language? At the time when he wrote Elixir, he was working on the Rails team. And one of the things that he faced was trying to make Rails thread safe, so making sure that several threads that were running in the Rails program weren't interfering with each other. And while doing that, he was actually looking around: how did other folks, other programming languages, other frameworks, solve that issue? And that's when he actually stumbled upon Erlang. And he liked it. It was, you know, just the thing he needed to use. But there were also some things that he was actually missing. So for starters, the syntax stems from Prolog. So it's unfamiliar for a lot of people. So that means that new people who come to Erlang face a high barrier to actually get going, because they feel unfamiliar with the syntax. So he addressed that first. And he also introduced other new syntax, for example, the pipe operator, in which the result of the previous expression is piped into the next function as the first parameter. So by doing that, you can avoid having very nested function calls and have something that's more readable, more clear to other people. He also introduced more extensibility to the language by introducing macros and protocols. And one of my favorites is actually the bottom one. I'm not sure if everybody can read it, but it's an upcase function which takes a string and upcases every letter. And it does that under the hood via a macro. So the Unicode definition, the library definition of characters, is downloaded and actually translated to functions under the hood.
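To make the two ideas above concrete, here is a minimal sketch in Elixir. It is not the standard library's implementation: the module name TinyUpcase and its three-entry mapping list are made up for this example and only stand in for the Unicode data the speaker mentions. The loop runs at compile time and defines one function clause per mapping, and the upcase function uses the pipe operator to pass each result on as the first argument of the next call.

```elixir
defmodule TinyUpcase do
  # Hypothetical stand-in for the Unicode data; a real implementation
  # would read the full character database instead of three pairs.
  mappings = [{"a", "A"}, {"b", "B"}, {"é", "É"}]

  # This loop runs at compile time and defines one clause per pair.
  for {lower, upper} <- mappings do
    def upcase_char(unquote(lower)), do: unquote(upper)
  end

  # Fallback clause for characters without a mapping.
  def upcase_char(other), do: other

  # The pipe operator feeds each result into the next call as its
  # first argument, so the steps read top to bottom.
  def upcase(string) do
    string
    |> String.graphemes()
    |> Enum.map(&upcase_char/1)
    |> Enum.join()
  end
end

IO.puts(TinyUpcase.upcase("cab"))
```

Without the pipe operator, the body of upcase would be the nested call `Enum.join(Enum.map(String.graphemes(string), &upcase_char/1))`, which has to be read from the inside out.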
So when you call this, you're actually using, you know, some data that is transformed into functions for the language. I'll skip over this part because we don't have much time. And you also actually see that those macros are used everywhere. So even defining a module is a macro, defining a function, et cetera. Everything is actually implemented through macros. The other thing that he also introduced is the build tool, to make it easier for people who are, for example, new to the language. If you wanted a package manager, Erlang before didn't really have package management: you could add packages to your project, but you had to download them yourself, put them somewhere, and define in your config, okay, this is the path to the library that I'm using. With Hex and with Mix, Elixir made that easier by having a list of dependencies and downloading them from a central place. Documentation was also made more prominent. For example, the doctests, which are inspired by Python. So in this case, we have a function defined, and above it is a documentation comment in which there's an example. And this example doesn't serve only as documentation; at the same time, it's also a test. So if you change the implementation, you can directly see the effect of it, because the test that sits just above it as documentation fails. And, yeah, the documentation is also accessible from the REPL and from other places. And this was built before the LSP. So nowadays, you can just hover over a function in your editor, and you will see the documentation. But when Elixir was created, those functionalities weren't that common among other languages, and that's something that's really nice to work with. And the last thing that he kind of also introduced is a different culture, a culture which is a little bit more open to newcomers.
So it's not like Erlang, you know, shied away from newcomers, but it also didn't make it easier for people who are new to the language to get started with it, et cetera. So to come back to my question: is Elixir just kind of a new flavor on top of Erlang? I think there are projects stemming from Elixir which make it more interesting and which are really new. So, for example, Nx, Numerical Elixir, is an extension which makes machine learning easy, and that's something that, you know, before Elixir, nobody actually thought would be useful to do with the BEAM, with the Erlang VM, because it wasn't meant for that. It wasn't meant for number crunching. But this library, this tooling, actually makes it a lot easier to do, and that's very promising. Phoenix is a web framework which was inspired by Rails, and now it's the other way around: Phoenix is now an inspiration for Rails and other frameworks. And Nerves is also an interesting project, which makes it possible to run on smaller devices like Raspberry Pis or something like this. So to answer the question: is Elixir really different from Erlang? Is it really an innovation, or is it just rehashing? I would say it's not just rehashing. I think Elixir really adds something to the whole ecosystem, something that wasn't that easy before. So with that being said, thanks for listening. Thank you. Unfortunately, we don't have any time for Q&A, but people can find you, right? Yes. Here. Again, I usually have the handle toxified, so you can also find me on Twitter, if it still works, or on Mastodon. And I'll be around, I think, for today if you have any further questions. So thanks again for listening, and apologies for being this late. Thank you again. Thank you. |
Introduction to Gleam
by building type-safe Discord bots on the BEAM |
So, now we have Harry Bairstow with an introduction to Gleam, which is another language running on the Erlang VM, so give it up for him. Hi, everyone, my name is Harry, and this is, as was said, an introduction to Gleam. You might ask, what is Gleam? Gleam is a programming language for building type-safe systems that scale. It's powered primarily by the BEAM, but can also be run on JavaScript targets too. I thought I'd go first into the three key points which make Gleam what it is. First, it's safety. Gleam has powerful compile-time type checking built into its core. This helps you write fast code that's integrated with Erlang and Elixir while giving you the safety of a statically typed language. Secondly, it's performance. As was just discussed before, Gleam builds on the success of Discord, WhatsApp, Ericsson, and more with the BEAM. Gleam adds no overhead, so you get great type safety and the same performance with an enjoyable syntax. And finally, it's friendliness. Both the community and the syntax of Gleam are friendly. The community is more than happy to help with any problem or just friendly chit-chat; they even helped write some of this talk. And when you get something wrong, the compiler provides insightful help so that you can hunt down the issues and fix them. The syntax of Gleam is similar to that of Rust, but if you're not from that background, don't worry: there are several guides to get started if you're used to the syntax of Python, Elm, Erlang, or even Elixir. Here's an example of the start of a Gleam project. All Gleam projects have an exported main function in the project_name.gleam file, which is within your source folder. If you need IO, you can import the standard library's io module as shown there. And the standard library contains several modules to help you with everything you can think of, from regex to options, iterators, and more.
If you need target-specific standard library features, look at the gleam_erlang and gleam_javascript packages, which are both available on Hex and GitHub. Let's explore some Gleam examples to get a better understanding of the language. Once we've done that, you can go away and look at the docs yourself for more examples, and we'll go on to building some stuff with Shimmer. Variables in Gleam are created using the let keyword. A value is assigned to a name. The name can be reused later by other let bindings, but the values contained within are immutable, meaning the values themselves cannot be changed. Here's an example of blocks. Every block in Gleam is an expression. All expressions in the block are executed, and then the result of the last expression is returned. So as you can see here, the result will be false, even though hello and 42 plus 12 are evaluated. This can be used to build more advanced expressions where the order of operations is important. Here's an example of using blocks to convert from Fahrenheit to Celsius, making sure to subtract the 32 before multiplying and dividing. In Gleam, lists are all homogeneous. This means that elements in a list must all be of the same type. If you try to construct a list of multiple types, the compiler will present you with a type error and show you where you tried to use the different types, so you can find it and correct it. Prepending to a list in Gleam is very fast, and this is the way that Gleam's documentation recommends that you add new values to a list. In the standard library, there is a list module, which allows you to do more advanced operations and also add to lists that way. The above example uses a constant and a constant list, but the same principles apply whether you have one dynamic and the other constant, or vice versa. If you need multiple types in one place, you can use tuples, using the hash and bracket syntax there.
They can hold multiple types and can be pattern matched against. We'll look at pattern matching in a few slides, but if you want to access the values in a tuple, there's always the dot syntax, which I'll show you on the next slide, and which is similar to what you'd be used to in object-oriented languages for custom types and objects. Here's an example of a tuple which has two elements, and they're selected using the dot syntax and assigned to their own variables. It's not particularly useful here because they're constants, but with runtime variables, it makes access easy. Gleam supports custom types. Custom types in Gleam are a collection of keys and their values, and you can see them as objects. There's just one caveat though: types in Gleam don't have methods. Similar to tuples, you can use the dot syntax to access properties within them, but instead of dot and position, you use dot and the name. In Gleam, custom types can have multiple constructors, similar to enums in the Rust ecosystem. This does bring another caveat though, which is that the dot syntax now only works for keys that are shared across all constructors. In this case, the only key you would be able to use the dot syntax with is name; otherwise you would have to pattern match against them to make sure that type safety is preserved. Case statements can match anything. In this first example, we use basic integers, but there's more advanced pattern matching over the next couple of slides. You can see we match the first three numbers and produce a value, and otherwise we just capture the value as a variable, which we can either use or discard. We can also pattern match against tuples and even extract values from within. In this example, we're checking for two specific paths, where one is no and the other is yes. The unique thing about the yes path is that we're discarding the integer in the middle, but we could again take that as a variable and do further checks against it.
If you remember the custom type from earlier, this pattern matches against that, so we can extract the values into certain variables here, like talks and mic, and the rest can be thrown away with the two dots. You can also use the two dots and assign that to a variable so that you can then reconstruct the type afterwards to pass it on somewhere else. There's lots more about Gleam syntax that I don't have time to cover today, such as external functions, generics, the use keyword, and more, and stuff's always being added to the syntax. All of it's documented in the language tour, so feel free to have a look over there and get a better understanding of what else is available within Gleam. Now let's get on to building some bots to put our Gleam skills into practice. Shimmer is a library which I've dabbled in and out of over the last 13 months. I started it as a project to learn Gleam and get into the BEAM ecosystem, but in the process I've done much more. I'm doing this talk now, I've started contributing to the Gleam compiler and the wider ecosystem, and I use Elixir and Erlang more day-to-day now. At this point in Shimmer's development, we've moved away from using Erlang foreign functions, and now a majority of it is in Gleam. Some key features of Shimmer. First is compatibility. While Shimmer is built in Gleam, it can be used in Elixir, Erlang, and any other BEAM language. It's published on Hex and the source code is available online. I've been working on some examples for Erlang and Elixir, which I'll publish into the GitHub repository once I've got them to a stable point. Secondly, it's actor-based. As we discussed before with the BEAM's resilience, Shimmer is built on top of actors. When we're running in single shard mode, you only have one actor; multiple shards, that's not a problem. We use a supervisor tree so that all the shards stay alive, and it's built on top of Erlang's OTP using the gleam_otp package. And finally, it's type safety.
As well as being core to Gleam, it is a useful feature for Shimmer. While you're building your Discord bot in Gleam, we leverage all of Gleam's type functionality to ensure that the code you write for the BEAM is type safe. You only get the full type safety when you write all of your code in Gleam, but you can always trust that the core of the library will be type safe. As a little fun fact, we're moving more and more of Shimmer to Gleam. We're currently at 97% Gleam, and the rest is just Erlang foreign functions for small parts of networking which are yet to have libraries implemented in Gleam. For some of you, this might now be the most interesting part of the talk, and for some of you, it might not. But I'm just going to quickly touch on how Discord's gateway works so that you have a better understanding of why we use actors and how that's useful to us in Gleam and with the OTP package. Discord bots are powered by Discord's real-time gateway, which uses WebSockets to send and receive messages. For Shimmer, we use Erlang's gun library from Nine Nines to receive them, and we use a typed wrapper on top of that, which is based upon the nerf library by Louis, the creator of Gleam. The diagram here shows what happens when Shimmer opens a connection to the gateway. We use ETF encoding and hand the frames off to actors to parse them, manage them, and send them to the event loop, which eventually either triggers handlers or discards them. Inside of that, Shimmer has a powerful event loop built on top of actors and messages, which manages multiple messages as well as its own state. It accepts messages both internally and externally, so that you can send updates to Discord, or internally, we can manage the updates. The next slide shows a state diagram, which roughly shows how it works. The state diagram shows what happens at different stages, depending on the initial message. For example, here, if you have a WebSocket frame, it's then parsed. We then check what it's asking us to do.
We then either respond, discard it, or stop the bot and then terminate. This diagram isn't complete at all, but it just shows you how complicated it can get very quickly, and how Gleam and the BEAM can easily handle it. Now that we know some Gleam and understand how Shimmer works under the hood, let's actually get our bot written. Above is the boilerplate we're going to use, and as a side note, the final code for all of this is in the GitHub repository, which there's a link to at the end. Shimmer uses a handler-based system, which allows for one function to be registered for each event. For the purpose of this bot, we're only registering two events, but you can always register more as and when they're implemented in Shimmer. But before we have a look at that, let's understand how this code uses what we learned earlier and what it actually does. Here we create a new Shimmer client. Here we use a function that wraps around a custom type. The custom type holds both internal data as well as the token, intents, and other data you pass in. So we create a function to wrap it. That way you don't have to manage all of that state yourself. And then we pipe that into the connect function, which takes the client as the first parameter and your handlers as the second. Normally the token should be an environment variable, but for the purpose of this, we're just using a string. Finally, we tell Erlang to sleep forever so that our actor and supervisor can run in the background, accepting messages from the gateway and passing them to the event loop. Now that we know vaguely what it all does, let's revisit the handlers. First we're going to add a handler for the on-ready event. All handlers are passed their event, as well as the client. That way you can use the client to call other methods, such as updating the bot's presence or sending messages yourself across the gateway.
On the client, there are no private accessors, so you can access all the internal stuff as well if you want to add your own custom functionality. The client has its gun connection and all that other stuff in there as well, so you can adapt that as you please. Let's quickly zoom into the handler and explore that. Here, you can see the event in this case is an on-ready event, which provides us crucial information. As I said before, there's the client that we have just spoken about. The Gleam accessor syntax we learned about earlier makes it easy to access fields within the types, even when they're two levels deep. As you can see here, we're accessing the user's ID, which is in the user field of the event, and then we're printing it to the console using the standard library's io module. We can then make this into a function, and then we can pass that into our on-ready handler. That way we could have the functions in multiple different files and import them from across the project to keep everything tidy. Let's move on to actually receiving some messages and sending some responses. When we receive a message, we get the on-message payload as our event. This contains information about the message itself, such as the guild ID, mentions, the message content, and other variables. For now, we're going to assign the content to a variable for ease, but you can always collapse that into the case statement we use later on if that isn't something you need. Let's have a look at how we're going to use pattern matching to match against the content. Using Gleam's powerful pattern matching, we can check that it has a desired prefix, and then we can extract the part to the right of the prefix into a separate variable. If not, we can take the message itself, and we can just print that for easier debugging for now. Let's say we want a specific command, though.
We could either add another case statement onto that, or we could just edit it so that the string we're pattern matching against is the exclamation mark followed by the command we want. Let's say, for example, you wanted some arguments, though. You could put the two together, and you could have your prefix with the command and take all of the arguments out separately to then parse and manage them. Now we'll match against a specific command, and in the response, we'll use the message send function to reply to the user by sending another message. As before, we can use the handlers builder to add this in as a function, and the bot should be done. Now you have a basic ping pong bot where you can send and receive messages, using basically everything you learned from the introduction earlier. This code, as I said earlier, is available on GitHub as well, if you want to have a look and take a deeper dive there. Just to recap: at the start of the talk, we went over some Gleam syntax before we got going on our exploration of Shimmer. We found out how Discord's gateway works on a high level, and how Gleam OTP is leveraged within Shimmer for actors. Thank you very much for listening. There are some QR codes to the Gleam website as well as the Gleam Discord, if you want to talk there, and if there are any questions, I'm happy to take them if I have time. So there's time for questions. You showed the tuple access syntax, which was tuple dot zero, tuple dot one. Does that mean that if you use a record, or whatever it's called, you can still use zero as a key? Or not? If you use a custom type, no. When you use custom types, you have to use the keys you define in the custom type to access them; the index syntax is only available for tuples. Question there. I have a question about the handlers in the library, and about Gleam, I guess. When I'm writing the handler, do I know what the type of the event is, and of the client, at the time of writing?
Yes, so when a Gleam project is put onto Hex, we produce HexDocs, and they're all documented there as well. So you can look at the types on HexDocs, and also Gleam has an LSP built into it, which is going to give you the information in your editor. Okay. Hello. If you're used to Elixir, what are the things that you would miss in Gleam, or is there a big overlap? It has most of the features you're used to, along with type safety. The only, I guess, difference would be that in Elixir, you can define multiple modules in one file, whereas in Gleam, that's not really something you can do. Modules are files themselves. I guess that's the only thing I could think of off the top of my head. Right. Thank you. No worries. Are there macros as well? No, we don't have macros right now, but there have been several discussions about how we want to do them and what they're going to be like, so there's potential for that in the future. Any more questions? Okay. Yeah. I'm sorry. Thank you for your talk. It was very nice. I have one question. I think currently it's version 0.25 of Gleam, or 0.26. 0.26, yeah. I'm sorry. This week. Are there any big hurdles or plans before 1.0, for example? I believe Louis wants to get LSP features more properly implemented, but you can always join the Discord and talk there. I think Louis is probably better placed to answer, but I think we also have a GitHub milestone on the GitHub repository, which says what we want before v1. Any more questions? Okay. Thank you, Harry, again. |
Speak binary to me
Learn the powers of binary pattern matching |
Okay, now we have Charles Brozegard, and he's going to speak to us about binary pattern matching in Elixir and Erlang, one of the cool features of Erlang and Elixir. So if you speak binary to me, give it up for Charles. Thank you. Don't worry, I'm not going to speak binary, but yes, this is a talk about speaking binary to other devices. And this is me, and you can find me around the web at ARTRARBR, yes. I work as a software developer in Denmark in a company called Indelab, and part of my job is to work on innovations in the Internet of Things realm. So things like smart buildings, smart cities, and smart factories. In practice, this involves building gateways that link many different kinds of stupid things together, and then when we have a network of things, they can exchange information and we can embed the smartness into the system. And by many different stupid things, I mean things like remote terminal units, which are used in the electric grid and all the utilities; PLCs, which are used heavily in factories for automation; solar inverters, heat pumps, thermostats, and all kinds of smart home equipment. And at the lowest level, we have simple sensors and actuators. And the thing that all these things have in common is that I have to speak binary to them. They don't speak JSON, they don't know XML, they don't even know Protobuf. They have their own custom binary dialects, depending on the protocol that they are using. And later in this talk I'll show an example of a simple binary dialect so that we all know what it is. When I'm building an integration for a system like this, I always reach for Elixir first. This is because Elixir has some special affordances that make it extremely good at this kind of work. This is also the case for Erlang, LFE, Gleam, and the other languages on the Erlang virtual machine, but Elixir happens to be my happy place. Some of the BEAM's known strong points in this area are fault tolerance, state machines, and concurrency.
And then there's the bit syntax, and that's what I'll be talking about today. This is a beginner level talk. I don't assume you know anything about Elixir or the previously mentioned systems, but my hope is that if you should find yourself in a situation where you need to speak binary to something, this talk will help you get started. So, binaries. Computers today work by manipulating electric signals, and a signal can be either a logic high or a logic low. Such a signal is known as a bit, and a sequence of 8 bits is a byte. And this is also how computers communicate. No matter if it's Ethernet or Wi-Fi or whatever, it's about transferring a logic signal of high and low bits. There are different ways we can write down the binary signals. The first notation that I highlighted here is binary, where we just take every bit in the sequence and write it down as a 0 or a 1. And then we prefix it with 0b so that it's easier to tell apart from other numbers. And in this case, the sequence of 8 bits, which is a byte, can also be turned into a decimal number, because in a byte we can have 256 different combinations. So it's a number from 0 to 255. And in decimal notation, this sequence of bits is 75. There's also hex notation, where we use the characters 0 to 9 and A to F. This means we can describe the content of a byte with just two characters. So this is very convenient when we are dealing with binary numbers. Just a common example of where we write bytes as decimals: we do that with IPv4 addresses, because an IPv4 address is just four bytes. But then for human consumption, we write each byte as a decimal number. And MAC addresses use hex notation instead. While all the operations in our computers are processing bits and bytes, we rarely think about them when we are programming. We think in terms of integers, floats, strings, lists, maps, and all the structures. But binary data is just a flat stream of bits.
There's nothing inherent in it to tell one field from another, or any kind of structure. It's all down to how we interpret those bits. This means that when we need to speak binary to something from our programs, we need to write a translation layer that can take whatever lists or maps we have in our program and turn them into a binary sequence. I prefer calling this translation layer a codec, because it's short to write. And you can say that a codec encodes our data structures into bytes for sending. And then when we receive data, the codec decodes that data into our high-level data structures. Sometimes we're lucky and can find a library that will do that for us, but sometimes we need to do it ourselves. And that's where the bit syntax comes in handy. So let's take a look at that. Let's say I need to send a sequence of three bytes to some other system. They need to have the values 10, 20, and 30. To do that, I use the bit syntax, so I use double angle brackets to start the binary sequence. And then I write the bytes that I need in sequence, separated by commas. So that's pretty simple. Yes. And when I need to receive that data, like if I'm receiving the same sequence of bytes, I need to decode that into three variables, a, b, and c. Then I use this syntax: again, double angle brackets, and then I put variables in place of the numbers that I want to extract. So it's pretty simple. But of course, we are not really done yet. There are very few domains where we only work with integers in the range of 0 to 255. We need larger numbers. We need negative numbers. We need floats and strings. Many languages will just give you a byte stream, and then you need to do a lot of strange computations to turn that into lists and strings. But in Elixir and on the BEAM, we have the bit syntax, and we can do more. Specifically, with the bit syntax, we have the ability to specify modifiers.
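The slides are not part of this transcript, so here is a minimal Elixir sketch of the encode and decode steps just described. The variable name packet is chosen for the example; the values 10, 20, 30 and the variables a, b, c come from the talk. The last lines also check the earlier claim that one byte can be written in binary, decimal, or hex notation.

```elixir
# Encoding: three bytes with the values 10, 20 and 30.
packet = <<10, 20, 30>>

# Decoding: the same shape, with variables in place of the values.
<<a, b, c>> = packet

IO.inspect({a, b, c})
IO.inspect(byte_size(packet))

# One byte, three notations: binary, decimal and hex literals
# all denote the same integer.
same_byte = 0b01001011 == 75 and 75 == 0x4B
IO.inspect(same_byte)
```

If the incoming binary does not have exactly three bytes, the match raises a MatchError, which fits the let-it-crash style described in the earlier talk.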
And the modifiers can specify the type, the sign, the size, the unit, and the endianness of the sequence of bits we want to extract from the binary. So here you see that the type is integer. It is unsigned, so it's positive numbers only. It has a size of eight units, and one unit is set to be one bit long. And it's big endian, which I will talk about later. So these are all equivalent: 10, 20, 30 are encoded the same way. It's just different syntaxes. So you can see that if you don't specify any modifiers, these are the defaults that are used instead. And in the second line, I omitted size and just wrote eight. That's something you can do when you know the size is a constant at compile time. If the size is variable, you will need to use the full size modifier. And the modifiers can be combined in any order, so you can do whatever you like. And when we decode, we use exactly the same syntax. We can say the same things, like grab the first byte, tell the compiler that it's an integer, and then it will extract it like this. And instead of just going through all the different modifiers and their combinations, I'll move on to showcasing some examples. But before we do that, I want to mention where the bit syntax came from. The bit syntax comes from a place of pain. These two guys, Claes Wikström and Tony Rogvall, were working at the computer science laboratory at Ericsson on implementing networking protocols for Erlang, and it was painful. And so they sat down and, since they were so close to the makers of the language, they could invent a new syntax for use in Erlang. And this paper, which was published in 1998, describes the first version of the syntax as it was implemented in an experimental version of Erlang. I think a few months later, it was released with slightly different syntax, but with all the same concepts.
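As a sketch of the modifier defaults described above (the slide itself is not in the transcript, so the variable names here are made up), the following Elixir snippet builds the same three bytes with no modifiers, with the short size form, and with every modifier spelled out. It then shows a size that is only known at runtime, and the signed modifier applied while decoding.

```elixir
# No modifiers: each segment defaults to an unsigned, big-endian
# integer of 8 bits.
plain = <<10, 20, 30>>

# Short form: a literal size written directly after the value.
short = <<10::8, 20::8, 30::8>>

# Every modifier spelled out.
full = <<
  10::integer-unsigned-big-size(8)-unit(1),
  20::integer-unsigned-big-size(8)-unit(1),
  30::integer-unsigned-big-size(8)-unit(1)
>>

IO.inspect(plain == short and short == full)

# A size held in a variable needs the explicit size(...) modifier.
len = 16
<<value::size(len), rest::binary>> = <<1, 0, 255>>
IO.inspect({value, rest})

# The same byte decodes differently depending on the sign modifier.
<<as_unsigned::unsigned-size(8)>> = <<0xF6>>
<<as_signed::signed-size(8)>> = <<0xF6>>
IO.inspect({as_unsigned, as_signed})
```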
And that paper also explains what endianness is. It's actually just a fun word for byte order. If you have a 16-bit integer that you need to send to some other system, that's two bytes, right? Say one byte is A2 and the other byte is C1, and you have to figure out which byte to send first. Some systems will send the most significant byte first, so that's A2. Other systems will send the least significant byte first, that's C1. This obviously has consequences, because you need to know the byte order that the system you're talking to expects, otherwise it just gets confused. The byte ordering is said to be big endian when you start with the most significant byte, and little endian if you start with the least significant byte.
And this is kind of a thing: I've been working with this for years, but I didn't really know what endian means, because it's a sort of weird name, right? The paper by Claes and Tony pointed me in the right direction by referring to this Internet Experiment Note from 1980, "On Holy Wars and a Plea for Peace", which is, I think, the first place where endian and byte order were used together. And that sort of shows that this is an important topic, sort of like Vim versus Emacs, I guess. Endian is actually just a pop culture reference to a book called Gulliver's Travels, where Gulliver travels out into the world and meets the people of Lilliput and Blefuscu, I think, and they are in conflict because the emperor of Lilliput has commanded that eggs must be broken at the little end when you eat them for breakfast or whatever. That's obviously absurd, and so they wage a war. So big endian means we send the big end of the number first, and little endian means we send the little end of the number first.
So, examples of the bit syntax. For the purpose of this talk, I have invented the T-Box, which is a very simple device.
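To make the byte order discussion concrete, here is a short sketch using the 16-bit value with bytes A2 and C1 from the talk. The variable names are illustrative only:

```elixir
value = 0xA2C1

# Big endian: most significant byte (A2) first.
big = <<value::16-big>>

# Little endian: least significant byte (C1) first.
little = <<value::16-little>>

IO.inspect(big, base: :hex)
IO.inspect(little, base: :hex)

# Decoding with the wrong byte order does not fail.
# It silently yields a different number.
<<wrong::16-little>> = big
IO.inspect(wrong, base: :hex)
```

This is why the byte order has to be agreed on in the protocol description: nothing in the bytes themselves says which order was used.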
It has a name, it can measure the temperature, and it can tell you if there's an error in the timestamp or the measurement. It has a binary dialect, or protocol, which I also invented, and this sort of mirrors what you will find in a real protocol description for some kind of device. A client can connect to a T-Box and send requests to it, and the T-Box will respond with a reply. Every message that is sent includes a header, which is one byte long, and replies from the T-Box also contain a value. The header starts with four bits of magic, which is a constant value that is always there and is used to make sure this is the beginning of a message. Then there's a direction bit, which tells whether this message is a request or a reply, and then there's the attribute, three bits, which tells whether this is a name or a temperature message. We only use one bit of the attribute, but that's just because they expect to expand the protocol someday.
This is an example of a request-reply sequence. First we send the request with the header: the magic bits first, then a zero because the direction is a request, and then a one, which means we're requesting the temperature. The reply has almost the same header at the beginning, just with a one in place of the direction, and then the bytes with the value after that. If you're requesting the name of the T-Box, it will respond with 12 bytes, and it's always 12 bytes. If the name is shorter than 12 bytes, the rest of the bytes are just null bytes, and that looks like this: if the name of the box is FOSDEM, then you have six bytes of actual characters, and then null bytes for the rest. The temperature is a little more complicated. It has three fields. There's the time, which is a 32-bit integer counting the number of seconds since the 1st of January, 1970.
Then there's the temperature, which is a 16-bit float, and then there's the quality byte, which tells you whether there's an error in some of the measurements. Only the last two bits in the Q byte are used. The second-to-last bit tells you there's an error in the clock, and the last bit tells you there's an error in the temperature measurement. It's important to note that the numbers are little endian. This is what a temperature value looks like. First, it has a timestamp, which is from a couple of days ago. Then the temperature, which is about 32 degrees, and then both of the error bits are high, so you should not trust the sample.
Now the syntax. We want to send a request to the T-Box to get the name. I use the double angle brackets again. It's all integers, and it's all unsigned, so the only thing I have to specify is the size. Here I specify that the magic is four bits, then I have my direction, one bit, and the attribute, which is zero for getting the name. This shows how we can use the bit syntax to easily encode things that are smaller than bytes.
The reply that comes back looks like this. When we receive the reply, we do it like this. First I want to assert that the message I get back is what I expect. In place of the header, I assert that the values are what I expect: that I get the magic bits first, that the direction bit is high so that I know it's a reply, and that the attribute is zero so that I know it's the name. If this is all true, I know that the rest of the message is 12 bytes long, and I want to assign that to the variable name. Here you can see I use the bytes modifier. This changes the type, and it also changes the unit property. Before, with integers, I would specify, like in the header, where you can see it's two colon colon four for the magic, that it means four bits. But when I specify that the type of name is bytes, then the 12 means 12 bytes long and not 12 bits long.
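The name request and reply described above might look roughly like the following. This is a sketch under assumptions, not the speaker's slide: the module name `TBox.Name` is invented, the magic value 2 is taken from the spoken "two colon colon four" and may not match the slides, and stripping the null padding is an extra step the talk does not mention.

```elixir
defmodule TBox.Name do
  # Assumed magic value, based on the spoken "2::4".
  @magic 2

  # Header: 4 bits of magic, 1 direction bit (0 = request),
  # 3 attribute bits (0 = name). Together exactly one byte.
  def request, do: <<@magic::4, 0::1, 0::3>>

  # Reply: header with direction 1 and attribute 0, then exactly 12 bytes.
  # With the bytes type, size(12) counts bytes rather than bits.
  def decode_reply(<<@magic::4, 1::1, 0::3, name::bytes-size(12)>>) do
    # Extra step (assumption): drop the null padding after short names.
    {:ok, String.trim_trailing(name, <<0>>)}
  end

  # Anything else is not a valid name reply.
  def decode_reply(_other), do: :error
end

reply = <<2::4, 1::1, 0::3, "FOSDEM", 0, 0, 0, 0, 0, 0>>
IO.inspect(TBox.Name.decode_reply(reply))
```

Putting the literal header values in the function head is what the talk calls asserting on the header: a message with the wrong magic, direction, or attribute simply does not match the first clause.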
That's just to say that bytes is a type and has different defaults than integers. When I get the T-Box temperature, I request it like this, pretty much the same as before, just with a one for the attribute. For the reply, I again assert on the header that I get back what I expect. Then I have the timestamp, which was the 32-bit integer, but little endian, so let's put that in there, and the temperature, which is a float, 16 bits, also little endian. Then I discard six bits from the Q byte because they are not used, and I just pluck out the second-to-last and the last bit as clock error and temperature error.
That's the basics of the bit syntax. There's so much more to cover when writing whole applications or libraries that do this kind of stuff. What do we do when we don't receive the entire message at once? You have to frame messages when you're receiving them as a stream. There are generators: it looks like just a normal for generator, but it actually has some special optimizations for working with binaries when you need to generate a binary. We could talk about performance tuning, but as I said, I've been working with this for years and I've never really had to do any performance tuning. It's pretty performant as is. That's also one of the points in the paper by Claes and Tony, that this is performant. There are many other tools for working with binaries that help us when we're looking at this data and not understanding why it's not doing what we expect. Wireshark is definitely one of them. I recommend checking that out. If you want to explore in more depth, I also recommend that you check out Protohackers, which is a sort of Advent of Code-style challenge about networking protocols, and Andrea Leopardi from the Elixir core team has streams on YouTube where he goes through the problems one by one, so that's very good for learning.
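The temperature request and reply from the start of this part can be sketched in the same style. As before, the module name `TBox.Temperature` and the magic value 2 are assumptions, the timestamp is assumed to be unsigned, and 16-bit floats need Erlang/OTP 24 or later:

```elixir
defmodule TBox.Temperature do
  # Assumed magic value, based on the spoken "2::4".
  @magic 2

  # Same header as the name request, but with attribute 1.
  def request, do: <<@magic::4, 0::1, 1::3>>

  # Reply: header, 32-bit little-endian timestamp, 16-bit little-endian
  # float, then the Q byte: six unused bits and two error bits.
  def decode_reply(
        <<@magic::4, 1::1, 1::3, time::little-32, temp::float-little-16,
          _unused::6, clock_error::1, temp_error::1>>
      ) do
    {:ok,
     %{
       time: DateTime.from_unix!(time),
       temperature: temp,
       clock_error?: clock_error == 1,
       temperature_error?: temp_error == 1
     }}
  end

  def decode_reply(_other), do: :error
end

# A made-up sample: 32.0 degrees with both error bits set.
reply =
  <<2::4, 1::1, 1::3, 1_675_500_000::little-32, 32.0::float-little-16,
    0::6, 1::1, 1::1>>

IO.inspect(TBox.Temperature.decode_reply(reply))
```

Because the sizes in the pattern add up to exactly eight bytes and there is no trailing `rest::binary`, a reply that is too short or too long falls through to the `:error` clause.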
Andrea has also started writing a book about this kind of stuff. So that's it. Thank you.
Thank you for your thoughts, Rolf. Do you have any questions?
Hi. Thanks. I've implemented a few binary protocols, network protocols like HTTP/2 and other stuff like that. I'd love to know whether you have done any streaming of data and parsing of messages coming from streaming data. Generators would also be interesting. I'd just like to know how you approached it, because I know I did something, but I don't know how you did it.
I think I've implemented a few protocols that use, for example, TCP as the underlying transport, and that's a stream. There are a few patterns for how you want to handle that. I think Frank Hunleth and the Nerves team wrote a sort of framing behaviour, where you have a couple of callbacks that you need to implement in order to handle a stream: they give you bytes, and you return messages when you see a full message. It's actually a very good guideline for how to do that. Another approach is taken by Andrea Leopardi in his library for Redis: when you call the decoding function and it returns a result, the result is whatever messages were in the binary you gave it, plus a continuation function which you call the next time with more bytes, so that it continues to return new messages.
Any other questions? If I were to buy a T-Box from you, would I still get support about 16 years from now? Maybe five minutes.
I actually have a question for you, if there are no other questions. Do you have a library you'd suggest looking at? I implemented a library to decode QOI, the Quite OK Image format, which is a new image format, just to get my hands dirty with binary pattern matching in Elixir, but it gets very unwieldy very fast. I saw that in Jason they have some macros that generate binary pattern matching. Do you have any libraries you recommend checking out?
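The "bytes in, complete messages out" idea from the streaming answer above can be sketched for T-Box replies. This is a simplified variant that hands back the leftover bytes instead of a continuation function. It is not the Nerves framing behaviour and not the API of the Redis library. The module name `TBox.Framing` and the magic value 2 are assumptions.

```elixir
defmodule TBox.Framing do
  # Assumed magic value, based on the spoken "2::4".
  @magic 2

  # Returns {messages, rest}: every complete reply found in `buffer`,
  # in order, plus the trailing bytes that do not form a message yet.
  def split(buffer), do: split(buffer, [])

  # Name reply: one header byte followed by 12 bytes of name.
  defp split(<<@magic::4, 1::1, 0::3, name::bytes-size(12), rest::binary>>, acc),
    do: split(rest, [{:name, name} | acc])

  # Temperature reply: one header byte followed by 7 bytes of value.
  defp split(<<@magic::4, 1::1, 1::3, value::bytes-size(7), rest::binary>>, acc),
    do: split(rest, [{:temperature, value} | acc])

  # Anything else is treated as incomplete and kept for the next call.
  defp split(rest, acc), do: {Enum.reverse(acc), rest}
end

# Two replies arriving in two arbitrary chunks.
stream = <<0x28, "FOSDEM", 0::48, 0x29, 1, 2, 3, 4, 5, 6, 7>>
<<chunk1::bytes-size(5), chunk2::binary>> = stream

{first, rest} = TBox.Framing.split(chunk1)
{second, rest} = TBox.Framing.split(rest <> chunk2)
IO.inspect({first, second, rest})
```

The caller keeps `rest` and prepends it to the next chunk from the socket. Note that this sketch never discards anything, so a corrupt header byte would stall it; a real decoder needs a policy for bad data.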
I think the Redis library is pretty nice. I also have a KNX library, which is for a smart home protocol. I don't know if I would recommend it, but when I wrote it I thought it made sense.
Thank you. Any other questions? The last one, I guess.
Maybe a word about exceptions when patterns don't match? You didn't talk about that, if I'm correct.
That's true. I mean, sometimes we say we should just let it crash in this community. That's not always the case. Sometimes the protocol will say that if you're unable to decode a message you must ignore it and just keep going. In that case, what I usually do is define functions where I match on the data I received, and then I always have a fallback clause in case it didn't match anything, which just logs an error and continues. But I mean, probably the proper thing to do in Erlang is to just die. You could also, when you receive a message, have a special process that is only for decoding, so you start a task, decode the message, and get the data structure back. I think it depends on the use case how you want to handle bad data.
Okay, thank you very much. |
LiveView keeps you warm!
Building a knitting machine UI with Phoenix LiveView |
Now we have Arjan Scherpenisse, and he's going to talk about how LiveView can be used to keep you warm. So give it up.
All right. Thank you, everybody. Nice to see such a big turnout. When we last organized this devroom, I think three years ago, it was a much smaller room and it was really packed. Now it's not packed, but that's because it's a bigger room, so I'm very glad that everybody's here. My name is Arjan Scherpenisse, and I'm going to talk a bit about knitting with you today, and also a bit about LiveView. I'm going to give a little bit of background about my project and what I'm doing, then I'm going to talk about LiveView and how I use it, and then I'm going to wrap up.
So let's start with some background. I've been programming since I was a little kid. Yesterday, when we had a beer together, we talked about the first OTP release you used, and mine was actually R13. So you can guess my age a bit. It's from 2009, so I've been using Erlang since 2009. Before that I used PHP a lot, but then somebody introduced me to Erlang: Marc Worrell, who had a bit of the same story as José Valim had when he created Elixir. Marc also wanted to create a web framework for Erlang. Elixir was not yet born, so he created Zotonic, a web framework. I'm still a contributor to that, and it's still alive, although it's not as popular as Elixir, or Phoenix for that matter. Later I got used to programming in Elixir, and I've been doing that for quite a while. As for my background, I studied AI. I have a master's in that, from back when AI was not hot at all. It was the middle of the AI winter, and nobody wanted to do anything with it, but I thought, hey, why not. It turned out I did not really do anything with AI for a long time, so I just became a regular web developer, doing first PHP, like I said, and then Elixir.
But I'm still interested in AI, in hardware and software, and also in art, actually, because after I graduated from AI school I went to the art academy, the Rietveld Academy in Amsterdam, and I decided to pursue a career in art, or at least try to do something more creative outside of pure computer science. Let's talk about that a bit, because it slowly gets me to the knitting stuff. I worked with Klasien van de Zandschulp. She's a friend of mine and an artist slash interaction designer, and together we did quite a lot of projects related to this kind of stuff. For instance, we built an app where you could interact with a fake social network from the Dutch Golden Age. You would walk around in a museum, and when you walked up to a Rembrandt painting, Rembrandt would send you a private message and want to become your friend. That way it told a story about the Golden Age, which already was something like a social network. It basically told the story of history through the current situation, and it's still used, mostly by children, for education. Another, similar project was in Ghent, where we created an augmented reality installation with a little chat going on, in which an archaeologist was chatting to you about the objects that you would scan.
Another project that was really nice, during COVID actually, was Distance Disco. Distance Disco is an app where you dance silently with your headphones on and you're matched to somebody else. You have to find somebody who dances like you, because then you're probably dancing to the same song, and everybody else is listening to a different song. The back end for that was written in Erlang, actually, with processes for everything and for matching people together. But that's another talk.
Another talk that I actually gave here three years ago was about this printer. It was also a project for Klasien, where she created some kind of interactive cooking installation from the future. You would first have to interact with this Google speaker over here, and the speaker would ask you a few very personal questions, like do you believe in God, do you value your privacy, and what would you do if, I don't know. Based on that information, a little recipe would be printed out, and you would get instructions on how to make something in that installation.
So it was only logical that when Klasien got a new project, she thought of me. She had this customer, some art collective, I don't know exactly, who approached Klasien to make some kind of installation for a conference. So a conference like this one, where somewhere there would be a back channel with information on how the people at the conference are doing, for instance a graph of the mood, or tweets, or pictures, et cetera. And she thought, hey, why would we show it on a screen, why not show it in knitting? Which is logical, right? And then she thought of me. This doesn't work, sorry, I have to do it like this. Because ten years ago I had already hacked a knitting machine once, together with two very talented people who actually did most of the knitting. I just did most of the software around it. That was a long time ago, but I think you can still look this project up. So Klasien thought, well, I want to make a giant knitting for a conference, where everything that is happening at the conference gets knitted out, and then we have this big carpet that you can still look at afterwards, like a big blanket of conference feedback. So she asked me, do you want to do this project, and I was like, yeah, why not.
For me, it's not like I do this kind of stuff full time. It's more that I do it because I just like it, and whenever Klasien has an idea, I just do it. So we went on eBay, or Marktplaats actually, and we bought this Passap electronic knitting machine, which is a machine from, I think, 30 or 40 years ago. It's Swiss made, so very well made. Some people describe this machine as the Rolls-Royce among knitting machines. So I thought, hey, this is a nice machine to look at, and to see if I can make it knit what I want. So I bought it and I put it in my home. Now I have a room which is basically the knitting room, because it's a large machine. I couldn't bring it, unfortunately. And I thought, well, it's like a printer, right? It has pixels, every stitch is a pixel, and I basically just write a printer driver for it.
Well, it's not that easy, as it turns out, unfortunately. Over the last couple of months I've grown a lot of respect for the whole knitting industry and for robotics, because there's actually a lot more to it. We as software developers are very lucky to be in such a stable environment, where we just write code and it either does something or it doesn't, and there's nothing in between. Hardware is not like that, and it's kind of interesting to learn. I also found out there are YouTube videos of people operating this machine. Those people are usually something like 60-year-old women, but they can do it so well. There are a lot of instructions on how to do it, a lot of parameters to tweak, and a lot of weird tools that you have to use to get it right, but eventually I got it working somehow.
It's basically a parameter space: you have to have a certain thickness of yarn, and you have to have the proper tension on the yarn, because otherwise you get loops. It is unbelievable how much there is to it. It's basically like trying to learn to play the violin without a proper instructor present, or where the instructor has already died because it's such an old violin. I have a very small clip of me knitting, if it's here. Hello? Oh, there we go. It takes a while, and there's no sound. Oh, that's very short, but this is basically how it goes. The machine is there, there's yarn coming in from the top, and the carriage goes over the needles. There are a lot of needles here, and once this carriage goes over them, the needles hook into the yarn and make it into knitting, and the knitting comes out underneath.
So how did I want to automate this? I have to watch my time a bit. I actually found some Germans online who had hacked this machine before. Damn it, this doesn't work. So what I wanted to replace was this. The way you used to program it was that there was this big flow chart in the manual, where you would need to press the buttons in a certain combination, set all the dials, and then upload the pattern. The way you upload the pattern is that you take a piece of grid paper and make some of the cells black, and that's your pattern. So you draw pixel art on paper and feed the paper into this scanner, because this is kind of a scanner. Then this thing somehow says, okay, I've now remembered that pattern, and then the console communicates with the carriage that goes back and forth, and that actually knits that pattern.
I did not try to do that because it sounded very hard. Instead I found this project from the hackerspace in Bamberg, in which they used an Arduino. This is the connector that you need to plug into the console, so I basically replaced the whole console with an Arduino. On one side the Arduino communicates with the carriage, and on the other side it produces digital signals, just a serial protocol that goes back into the computer, and then the computer can tell it to knit this pattern, knit that pattern, et cetera. If you have more questions about this I can answer a lot, but I'm not going to do it right now. The new user interface that I'm working on right now looks a bit like this. It's basically just a browser, also because I wanted to use LiveView for something. So we're finally getting to that subject, yes.
This knitting interface shows where the machine is in the knitting process, it shows the current color that it is knitting, and it shows whether you have to move the carriage left or right. There is a motor that you can enable so it knits automatically, but you can also do it manually. There's a little counter, there's a start-stop button, and there are several configuration options, because it's a very wide knitting machine, so you have to specify whether to knit here or there. And then you can upload a pattern. You can literally type in ones and zeros here, and it will create a pattern, and then there's a state machine that loops through that pattern to send the proper instructions to the Arduino. I'm going to demo that a bit later, because first I want to go a bit into the details of this LiveView. Well, it's not only LiveView, it's basically just a Phoenix project, or just an Elixir project, that has several parts.
One of them, of course, is the user interface, which is all the way over there, and the other part is the actual knitting machine, over here. It's connected with an Arduino, like I showed, and the Arduino is connected to Elixir, so this middle part is the interesting software part that I've built. There are a few components here, and I've invented a kind of color coding for them: green is an Erlang process, or Elixir process, orange is state, just data, and anything without color is not very interesting. I use the Nerves UART library. I think it was already mentioned that the Nerves project is really nice for doing IoT kinds of things with Elixir. UART is the protocol, basically a serial port, so when you program an Arduino you can tell it to send and receive serial commands, and you can very easily listen to them from Elixir. There's a monitor that looks for the serial port, so I can basically hot plug the knitting machine into my computer and pull it out again. Whenever it receives a serial packet, which is basically just a line of text, it sends it over Phoenix PubSub to the rest of the system. Then there are several other components that listen to those serial commands. One of them is the control, which is a GenServer that holds the state of the knitting machine itself, like where I am in the knitting, et cetera. It has the task of transforming the pattern plus the settings into a sequence of commands that need to be sent out to the knitting machine, and its state is also updated whenever a new serial command comes in. The control is also connected to the LiveView, which actually shows everything that is being done.
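The role of the control process described above can be illustrated with a very small GenServer. Everything specific here is hypothetical: the module name `Knitting.Control`, the `{:serial, line}` message shape, the `"ROW_DONE"` line, and the pattern as a non-empty list of row strings. The real project's protocol and state are not shown in the talk.

```elixir
defmodule Knitting.Control do
  use GenServer

  # Client API
  def start_link(pattern), do: GenServer.start_link(__MODULE__, pattern)
  def current_row(pid), do: GenServer.call(pid, :current_row)

  # Server callbacks
  @impl true
  def init(pattern), do: {:ok, %{pattern: pattern, row: 0}}

  @impl true
  def handle_call(:current_row, _from, state) do
    {:reply, {state.row, Enum.at(state.pattern, state.row)}, state}
  end

  # Hypothetical serial line: the carriage reports a finished row.
  # The position loops back to the start at the end of the pattern.
  @impl true
  def handle_info({:serial, "ROW_DONE"}, state) do
    next = rem(state.row + 1, length(state.pattern))
    {:noreply, %{state | row: next}}
  end

  # Serial lines we do not understand are ignored.
  def handle_info({:serial, _line}, state), do: {:noreply, state}
end

{:ok, pid} = Knitting.Control.start_link(["1010", "0101"])
send(pid, {:serial, "ROW_DONE"})
IO.inspect(Knitting.Control.current_row(pid))
```

In the architecture from the talk, the serial messages would arrive through a PubSub subscription rather than a direct `send/2`, and the LiveView would read or be pushed this state in order to render it.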
So what I'm going to talk about for the rest of the time is mostly this part, because that's the title of the talk: it's about LiveView, what it actually is, and how it works in my case. Are there any questions? Oh, I have 10 minutes left, really? Okay. So let's go a bit quicker now. What is LiveView? Well, the docs say LiveView provides rich, real-time user experiences with server-rendered HTML. To understand what that actually means, let's dive into a little bit of web history.
When this whole internet thing started, the first thing there was was just a browser with HTML, right? You had a web server that you uploaded the HTML file to, and you just viewed it. That's basically it. When you wanted to go to another page, you just clicked a link and viewed the other page. This is what we now call a static website, very static. That worked pretty well for a while. Then at some point people thought, well, we can also make this not static, and send something different back every time. That's when it became DHTML, dynamic HTML. PHP was born, and other programming languages. I'm getting feedback from the mic now, it's a bit irritating, whatever. So basically, the HTML became dynamic. There was a lot of logic on the server: you would make a PHP file that rendered out HTML. The HTML was even different for each user, because there were sessions, and cookies where you store the state, like this user is logged in, this user has this stuff in their cart, whatever. So there's a lot of logic on the server that renders into HTML, and then the HTML is just sent over the wire and the browser displays it. Done. That worked pretty okay, but it was not really interactive, because every time you did something on the site, it would reload the page. So at some point people thought, hey, I can make little effects, I can do hovers and animations.
So JavaScript was born, basically. I think it started to become popular with IE version three or something. A bit of JavaScript was written to make things a bit more lively and more dynamic without having to reload the page every time. Well, of course, you know what happened next: JavaScript became very big, and a lot of the logic was moved to the client. Pages did not reload every time. You basically load the page once, and then JavaScript takes care of the rest. It replaces parts of the HTML with other parts, and it even sends you to another URL without actually reloading the page, with pushState kinds of things. So suddenly there was a lot of logic on the client. It would fetch data behind the scenes, not as HTML, but using what was called Ajax, and currently we have REST and GraphQL, all kinds of protocols to get data into the client and then do stuff with the data. And this is still the case. When you write JavaScript, any web project is quite heavy on the client side.
Now with LiveView, the pendulum has swung back the other way a bit, because the very interesting thing about LiveView is that we can do very interactive things. We don't have to reload the page every time to do something interactive. We can stay on the same page and still dynamically change parts of it, without having to create all kinds of APIs and do complicated things. So with LiveView, the logic is again mostly, for 99% I would say, back on the server. It's a bit like back in the old days. You just render something, you push it to the browser, and the browser displays it. It's as simple as that. You don't need to write a lot of JavaScript unless you really want to.
That's actually one of the promises of LiveView: that you can make UIs very quickly, just staying in Elixir and just templating from Elixir. So how does that actually work? Is there a diagram on the next slide? Yes. I can show this one first. There's one LiveView process. In Erlang, processes are very lightweight. Five minutes left, oh no. Templates are rendered on the server, and they are re-rendered every time you update the state. But it does not send the whole template over to the client, it just sends the things that have changed. So it works a bit like this. The first time it renders, we get some HTML, and then the browser connects over a WebSocket, and the server says, hey, I'm a LiveView process and I can now interact with you. The browser is now connected to the corresponding process on the server. When the state changes, the process re-renders and sends just the things that have changed to the browser, and the browser is intelligent enough to patch small parts of the DOM tree, changing just that part and not everything. That makes it very lightweight and very flexible. It's now integrated into Phoenix, the Elixir web framework, and I will give a little demo now, I think. So let's make the knitting live.
When you write a LiveView as an Elixir module, it looks a bit like this. There's always something that you have to write, and that's mount. When you mount something, you just return, okay, I'm mounted, and I have some assigns in my socket. It's a bit like normal Phoenix templating: you assign things, and then you can use those assigns to render something. Down below here there's the render function, and in this case it just renders an image tag, an img with class movie. Wonder what that is.
And the source is an image URL with a variable in it, a frame. The frame is an assign that is assigned here. So it renders a single image. Whenever a serial command comes in from the serial port, it calculates a new frame and assigns it to the socket again. This triggers another render here, and it will probably change the frame number, so it will change the image. So this basically connects the serial port to a LiveView. I think that's better. I actually have a demo of that, because, well, I did not bring my knitting machine, but I brought my knitting machine emulator, which is an Arduino with a potentiometer attached. And I can probably now, this will fail, but who cares, plug it in. And then we go to this. If we look in the source, somewhere near here, this is the one, we see the movie. It's now at frame number 15. Is it running? It should be running. If we now turn the knob, yes, okay. So now I can knit. See, I am knitting. So I did not bring my whole machine, but I brought a virtual version of myself that is now at home, and I can control it through this little Arduino over here. I thought of demoing it like this last night. It was not really prepared, but I hope that gives you a bit of an idea of how LiveView updates its state. In this case it is very simple, but an actual LiveView is, of course, much bigger. And actually, that is the rest of my talk that I still need to do. Let's continue very quickly. How much? Oh, time's up. Is it really up?
I think you can take one or two more minutes.
Okay. I will quickly skip through the next slides then. You can imagine that writing a single Elixir module with all the logic in it, where you get one big assign with everything, is not really scalable. There are actually two things to make that scalable. You make components.
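The demo LiveView described above, with a mount, a render that outputs a single image, and a handler for incoming serial lines, could look roughly like this. It is a sketch that assumes a Phoenix project with `phoenix_live_view` installed and will not compile on its own. The module name, the PubSub server `Knitting.PubSub`, the `"serial"` topic, the `{:serial, line}` message shape, and the image path are all hypothetical.

```elixir
defmodule KnittingWeb.MovieLive do
  use Phoenix.LiveView

  # Hypothetical PubSub server name and topic.
  @pubsub Knitting.PubSub
  @topic "serial"

  @impl true
  def mount(_params, _session, socket) do
    # Only subscribe once the WebSocket connection is established.
    if connected?(socket), do: Phoenix.PubSub.subscribe(@pubsub, @topic)
    {:ok, assign(socket, frame: 0)}
  end

  @impl true
  def render(assigns) do
    ~H"""
    <img class="movie" src={"/images/frame-#{@frame}.jpg"} />
    """
  end

  # Hypothetical message shape: each serial line carries a number,
  # which is used directly as the frame to show.
  @impl true
  def handle_info({:serial, line}, socket) do
    case Integer.parse(line) do
      {value, _rest} -> {:noreply, assign(socket, frame: value)}
      :error -> {:noreply, socket}
    end
  end
end
```

Assigning a new `frame` is all that is needed to update the page: LiveView re-renders the template and sends only the changed part, here the `src` attribute, to the browser.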
So one of the components is function components, which are basically just rendering templates inside functions. And then the other ones are Phoenix live components, and those are basically like sub-live views in your live view that have their own state and their own render function and their own mount function. So in this case, in my UI, these are, I just created a few components. One component is settings components, which contains a form. Another component is a row component that just renders a single row. So this row component is very simple. It just renders basically a set of divs, and then you can call it like this. So it's basically, in the template syntax, using function components, basically the same as a normal tag, but you prepend it with a dot, which is because it's basically a function call. And the live components are stateful. So live components have their own states, or they have a mount, you can assign things to there. And what I already said, it's a live view inside a live view. And so these signals directly communicate with the live components in this case. Then there's some more things like slots you can create like different parts of your component and make them into like separate things where you can put part of your DOM tree as well. And I just wanted to say there's a lot out there, like there's a big community, and I think live view is really getting a lot of traction. And it's actually a shame. I have not done anything really with live view in production, actually. I wanted to make a disclaimer there. But I really like where it's going, and there's a lot of projects popping up with component libraries and people making stuff on top of it. There's the storybook project, which is also very nice, which allows you to make a library of components and then have like a live environment somewhere where you can document these components and try them out and copy-paste the code for you to use inside your live view. 
So there's a lot of things. There's JavaScript integration, which I'm not going to show. There's LiveView Native coming up, which is also a very nice technology where you don't render things to the browser, but you render things into a native app, so you actually build a native app. It's like what React Native is to React, well, you get the drift. So thank you for listening. Thank you, Arjen. |
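As a rough illustration of the update cycle described in this talk (state changes on the server, the template is re-rendered, and only the changed parts are sent to the browser), here is a minimal Python sketch. It is not how Phoenix LiveView is implemented; the names `ViewProcess`, `render` and `diff`, and the slot names, are hypothetical and chosen only for this example.

```python
def render(assigns):
    """Render the 'template' as a mapping of dynamic slot name -> rendered value."""
    return {
        "title": assigns["title"],
        "frame_src": f"/frames/{assigns['frame']}.png",
    }


def diff(previous, current):
    """Keep only the slots whose rendered value changed since the last render."""
    return {slot: value for slot, value in current.items() if previous.get(slot) != value}


class ViewProcess:
    """Hypothetical stand-in for one server-side view process holding its own state."""

    def __init__(self, assigns):
        self.assigns = dict(assigns)
        self.rendered = render(self.assigns)  # the first render is sent in full

    def assign(self, **changes):
        """Update state, re-render, and return only what the client must patch."""
        self.assigns.update(changes)
        new_rendered = render(self.assigns)
        patch = diff(self.rendered, new_rendered)
        self.rendered = new_rendered
        return patch


view = ViewProcess({"title": "Knitting", "frame": 15})
print(view.assign(frame=16))  # only the image source changed
```

In this sketch a new frame number produces a patch containing just the image source, while assigning the same value again produces an empty patch, which is the property that keeps the traffic small.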
Distributed music programming with Gleam, BEAM, and the Web Audio API |
So, now we have Hayleigh Thompson, who is going to talk about distributed music programming with Gleam, BEAM and the Web Audio API. Give it up. Okay, so, hello everyone, yeah, today I'm going to be talking about a little web app I've been making using BEAM, Gleam and the Web Audio API. Just before I get into that, maybe a little bit about who I am. My name is Hayleigh. I'm a front-end Elm developer, actually, so I don't really do any back-end stuff. I'm totally new to BEAM, Erlang and Elixir. I've been doing Elm professionally, almost exclusively, for about three years now and kind of personally for four or maybe five, and I'm also a PhD student. I'm writing up my thesis at the moment on programming language design and particularly how it relates to sound and music computing. And finally, I am a Gleam community person. If you've ever dropped into the Gleam Discord, you've probably seen me spending way too much of my own time there. So distributed audio, what the heck am I talking about? What am I going to be making? This nondescript-looking box is called a monome, and one of the things it can be is a step sequencer. And so what that means is each of these buttons represents a note that can be played, and the columns are steps in time, and the rows are different notes, different frequencies. And what I'd like to make is one of these in software, and I want to supercharge that basically by making it networked and collaborative. So we want everyone to be working on the same instrument, you know, on different computers over the web. The way I structured this talk, I'm not going to be going into too many technical details about Gleam or the app itself. If you were here earlier this morning, Harry's talk would have done a really good job of introducing you to Gleam, and if you missed that, the language docs are a much better start than what I could give you.
So instead, I'm first going to go over some of the languages I could have chosen and didn't, and then briefly explain why I picked Gleam. And then I'm going to give you a very, very abridged tour of the codebase by basically building the thing from the ground up. So why not your favorite language? Why not JavaScript? Well, I've been doing Elm, as I said, for three, four, five years now. I've been in this great statically-typed, pure, functional fantasy land, and the idea of going back to a mutable, dynamically-typed, object-oriented thing terrifies me. I just don't want to do that at all. So okay, why not Elm then? If I'm so used to it, why would I not use that? Well, I actually maintain a package for doing Web Audio things in Elm, but if you've ever used Elm before, you probably know it has a rather interesting take on foreign function interfaces and interop with JavaScript, and I just don't want to deal with that for this particular project. And then it also leaves the question open of what to choose for the back end, and really I'd like just one language for the entire stack. And finally, why not Elixir? Well, I don't know it, for a start. As I understand it, I'm still going to need to use a lot of JavaScript for the audio side of things, even if I use something like LiveView. And I'm a bit of a type nerd, so the dynamic typing kind of puts me off a bit. For me, I think Gleam conveniently addresses all of these things. So I get to use the same language across the entire stack. Gleam targets both Erlang and JavaScript, and I get to share types across the stack as well. So my audio code and my messaging and stuff, this can all be well typed across the network boundary. It's also got a really good interop story. The FFI in Gleam is very simple, very, very easy to use. And so if I need to dip into JavaScript or Erlang or Elixir, that can be quite easy. And also, it's a very simple language.
So for someone like me that's very new to back-end programming, this is a great kind of soft introduction to BEAM and OTP and that sort of thing. Well, I didn't go to that slide, but that's the slide I just did. The first thing I want to do is make some sounds. And to do that, we need to have a bit of an understanding of the Web Audio API. And so a super, super quick primer on that is it's a lowish-level browser API for making sounds on the web. You create audio nodes, so they might be sound sources like an oscillator or some signal processing like a filter or a delay, and you connect those into a graph in JavaScript. But all the signal processing happens in native code that we don't write and we don't control. So this is just a very brief example of what that looks like in JavaScript. I don't know about any of you, but to me, this is really, really clunky. We create a bunch of nodes, then we set a bunch of properties, then we have to remember to connect them up, and then we have to remember to start some of them, and then at the end hopefully we get some sound. Instead, what I'd like to do is get a really nice declarative API for this, something that we might be used to for doing view code. And for that, I'm going to model that with these two types in Gleam. So we have a node type with a field t, which stands for type, and so that says whether it's an oscillator or a delay or a filter. And we have a list of parameters that we want set on that node, and then a list of connections. And then we end up with something like this. So this is the same audio graph that we just saw with, in my opinion, a much, much nicer API. You kind of get implicit connections based on how nested things are, kind of like a DOM tree or HTML or something. What I'd need to do then is write a little bit of JavaScript to turn those Gleam values into some Web Audio code, and we're not going to go into any detail on that here.
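To make the declarative idea concrete, here is a minimal Python sketch of a node with a type field, a list of parameters and a list of nested connections, plus a function that flattens the tree into the imperative steps (create, set, connect) that the Web Audio API expects. This is not the speaker's Gleam code or her JavaScript glue; the names `Node`, `Param` and `flatten` and the operation tuples are assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Param:
    name: str
    value: float


@dataclass
class Node:
    t: str                                            # node type, e.g. "oscillator"
    params: list = field(default_factory=list)        # parameters to set on the node
    connections: list = field(default_factory=list)   # nested nodes this one feeds into


def flatten(root):
    """Turn the nested description into a flat list of imperative operations."""
    ops = []
    ids = itertools.count()

    def visit(node):
        node_id = next(ids)
        ops.append(("create", node_id, node.t))
        for param in node.params:
            ops.append(("set", node_id, param.name, param.value))
        for child in node.connections:
            child_id = visit(child)
            # Nesting implies a connection from the parent to the child.
            ops.append(("connect", node_id, child_id))
        return node_id

    visit(root)
    return ops


graph = Node("oscillator", [Param("frequency", 440.0)], [
    Node("gain", [Param("gain", 0.2)], [
        Node("destination"),
    ]),
])

for op in flatten(graph):
    print(op)
```

The point of the design is that the caller only describes the graph; remembering to create, configure and connect nodes in the right order is pushed into one small interpreter.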
It took me about 50 lines of JavaScript to do that, and that is the only non-Gleam code that I wrote in this whole app. So assuming that all works, the next thing we want to do is render something onto a page. For that, we're going to use a framework that I made called Lustre. I've said maybe like 50 times now that I'm a big Elm fan, and so Lustre takes a lot of the ideas from Elm, particularly its Model-View-Update or the Elm architecture, and it basically applies it on top of React. So we actually have a wrapper for React, and we can use React components and all that sort of thing with this nice kind of unidirectional state flow. So we start off with a model, and this is what we're going to derive both user interface and audio code from. And so here, I don't have the type up on the screen, but where we've got rows, a row has the note, so the frequency to play, and then an array of steps that indicate whether each one is on or off, and we take that model and we render it into something. Now Gleam doesn't have macros, it doesn't have a templating engine, or really anything like JSX or anything like that. What we have is just functions. So here, we're calling element.div, and we're setting a class on it, and then inside we're rendering a button, and we have this message, this update step message, and basically that's going to be fired whenever the button is clicked on, and that goes through the runtime into our update function. We change some rows, update some program state, and the cycle continues. So the state changes, our UI changes, more interactions, blah, blah, blah. If all goes well, we end up with something that looks like this. And what we have here is just a simple client web app. This is the sequencer that I've been talking about. This only runs on the client, so anyone that loads this up is going to get their own thing. And so far, we haven't spoken about the back end, so I'm assuming you're serving this on GitHub Pages or your own server or whatever.
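The Model-View-Update cycle described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the framework's API: the names `Row`, `UpdateStep`, `update` and `view`, and the tuple-based representation of elements, are assumptions made for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Row:
    note: float    # frequency to play
    steps: tuple   # one bool per step: on or off


@dataclass(frozen=True)
class UpdateStep:
    """Message fired when a step button is clicked."""
    row: int
    step: int
    on: bool


def update(rows, msg):
    """Pure update: return a new model, never mutate the old one."""
    if isinstance(msg, UpdateStep):
        row = rows[msg.row]
        steps = row.steps[:msg.step] + (msg.on,) + row.steps[msg.step + 1:]
        return rows[:msg.row] + (replace(row, steps=steps),) + rows[msg.row + 1:]
    return rows


def view(rows):
    """Plain functions build the UI: one div per row, one button per step."""
    return [
        ("div", {"class": "row"}, [
            ("button",
             {"class": "on" if on else "off", "on_click": UpdateStep(r, s, not on)},
             [])
            for s, on in enumerate(row.steps)
        ])
        for r, row in enumerate(rows)
    ]


model = (Row(440.0, (False, False)), Row(220.0, (False, False)))
clicked = view(model)[0][2][1][1]["on_click"]  # message attached to row 0, step 1
model = update(model, clicked)
print(model[0].steps)
```

Because the view only attaches messages and the update function only returns new models, the same model can later be driven by messages that arrive from somewhere other than a local click.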
So what we want to do next is serve this with some Gleam code, and to do that, we're going to use two more packages. One is called glisten. This is a fairly low-level package that sets up a supervisor and manages a pool of connections that can manage things like TCP connections and sockets and this sort of thing. And on top of that, another package called mist, which is a web server written in Gleam that provides a kind of dead simple HTTP server that you can then configure to accept WebSocket connections or do SSL connections, these sorts of things. So far, I've been heavily abridging the code. This is pretty much all you need to start serving some static files using mist and glisten. The magic kind of happens just in this very simple serve static asset function, which takes a path. Ideally we'd do some sanitization on the path, but I've left that out to be brief. We read the file, and if the file exists, we just respond and we make sure we set the right headers, and that's it. Now we can host our little web app statically with more Gleam code. The final piece of the puzzle then is client-server communication. How do we make this distributed? How do we have everyone connected to the same instance? So for that, we need to set up WebSockets, and mist makes this dead simple as well. You just set up an upgrade handler on any particular path that you want. Here, it's just the WebSocket path, and that code looks like this. You set up some event listeners for when the socket opens or closes, and then also how you want to handle messages. On WS message here, we essentially just JSON-decode the message into something well typed and send that off to our app's main process. On the front end, we need to hook up WebSockets as well. There's a package for that called lustre_websocket. This isn't made by me. Someone else has very graciously made this.
For that, we just need to call ws.init in our app's init function, and that will set up everything that we need, so it will do all the plumbing into the runtime to make sure the events are dispatched and end up in our update function. So here, we pass in this WebSocket message constructor, and then whenever we get an event on the WebSocket, that goes into our update function, we can change our state, do whatever we need to do, and that will affect the app, and it re-renders and so on. Now, I messed up, that is the wrong text, but oh well. I mentioned earlier that one of the great things about Gleam is that we can share types across the front and the back end. And so, what we can start to do is have typed messages between client and server. So here, we have a to-back-end message type, so this is what the clients will send to the back end to ask it to make some state change. So for example, start the sequence, stop it, toggle a step on or off, update some parameters, and then we'd handle that in our app's main update function on the back end. So here, we're updating some shared state, and this is the state that is shared across all clients, and then we're broadcasting that state back to clients. And we do that with a to-front-end message, and so this is the same kind of idea in reverse. This will tell the client to update a particular part of its model. That looks like this. Again, we decode the JSON that we're getting from the WebSocket, and then we can just branch off of that, and this would be called in our update function.
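A minimal Python sketch of the typed-message idea described above: a closed set of client-to-server messages that are encoded to JSON, decoded back into well-typed values on the server, applied to the shared state, and then broadcast to every client. The message names (`Play`, `Stop`, `ToggleStep`), the `tag` field and the use of plain lists as client inboxes are assumptions for illustration; the app's real Gleam types are not shown in the talk transcript.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Play:
    pass


@dataclass(frozen=True)
class Stop:
    pass


@dataclass(frozen=True)
class ToggleStep:
    row: int
    step: int


# The closed set of messages a client may send to the back end.
TO_BACKEND = {"play": Play, "stop": Stop, "toggle_step": ToggleStep}


def encode(msg):
    tag = next(t for t, cls in TO_BACKEND.items() if isinstance(msg, cls))
    return json.dumps({"tag": tag, **asdict(msg)})


def decode(text):
    data = json.loads(text)
    cls = TO_BACKEND[data.pop("tag")]  # an unknown tag raises KeyError
    return cls(**data)


def update(state, msg):
    """Back-end update function: returns the new shared state."""
    if isinstance(msg, Play):
        return {**state, "running": True}
    if isinstance(msg, Stop):
        return {**state, "running": False}
    if isinstance(msg, ToggleStep):
        rows = [list(row) for row in state["rows"]]
        rows[msg.row][msg.step] = not rows[msg.row][msg.step]
        return {**state, "rows": rows}
    raise TypeError(f"unhandled message: {msg!r}")


def handle(state, text, clients):
    """Decode one incoming message, update shared state, broadcast it to all clients."""
    new_state = update(state, decode(text))
    outgoing = json.dumps({"tag": "set_state", **new_state})
    for inbox in clients:
        inbox.append(outgoing)
    return new_state


alice, bob = [], []  # stand-ins for two connected WebSocket clients
state = {"running": False, "rows": [[False, False]]}
state = handle(state, encode(ToggleStep(0, 1)), [alice, bob])
print(state, alice == bob)
```

In a statically typed language the decoder's result type makes the compiler check that every message variant is handled; in this Python sketch the closest equivalent is the explicit `TypeError` at the end of `update`.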
And so what we end up is this really neat, tidy kind of loop where the server sends a message to the client with some state to render, then user interaction happens, an event is emitted from there, and instead of updating the state locally, we send a message back to the back end, that updates the state on the back end, and then that state is broadcast back to the clients, and we have the same kind of event loop that we had just on the client, but now across the network. Now I've waffled on for a bit, I think it would be cool to maybe see a demo. I'm not sure we can get the sound. I'm going to check the sound of the video guys, let's try to do what you want to do. What would you like me to do? I'll try to play audio, and I will see if I can. Yeah, we are trying to play audio with the mini jack. I can just play out the speaker, it's fine. It's not a very big room. The mini jack audio is not coming off. Okay, well while they're dealing with that, I'll just explain what's happening, I think it's kind of clear. So we have two clients open here. Okay, that's important, no problem. Maybe it was me that was having no sound. If it was muted, maybe it was the plug in there, let's try. No. Okay, cool. It wasn't user error, it was okay. So we have two instances going on here, for some reason that one isn't going, there we go. So I can change the parameters on this side, you can see they're reflected on the other, add steps or whatever. Yes, and so this is all totally networked, conceptually you could run this on the web and have, I mean this is just running locally but I would have hoped that people could open up here. 
So just a recap, we've got a full-stack Gleam app, we have an OTP server on the back end, we have a React app on the front end, both written in pure Gleam, both sharing types, and we have this LiveView style of communication, but specifically, or kind of crucially, this communication is well typed and so we know all the messages that we're supposed to be handling on both the front end and the back end. And this is just a quick kind of look at how many lines of code are in this codebase, and so you can see 85 lines of JavaScript was all that was needed and everything else is pure Gleam. Which I think is pretty cool, it's pretty exciting that you can do that today. So yeah, thank you for listening. Thank you, are there any questions, yep. Thank you for sharing, maybe it was apparent from your presentation but I just wanted to check how the different clients are synchronized. Yeah, okay, so let me go back. We had this model, and when I introduced that, each client had their own model, and so basically the server has its own version of this now and it's broadcasting. Every time the sequence resets, it broadcasts the entire model to make sure everything stays in sync, and then whenever one client changes something it broadcasts a message to tell the clients to update their local version. So it depends on how the client gets this new information, and that's more or less okay enough for synchronization. Yeah, it seems to be kind of fine, I guess if one person is in Australia and one is over here there's going to be some noticeable ping, but then you wouldn't be stupid enough to do that. Thank you. So I don't know much about the Gleam front-end stuff, what was necessary to write in JavaScript that you couldn't write in Gleam? Yeah, the JavaScript is just the part that actually renders the Web Audio stuff. So that's the APIs that are available in Gleam?
Well, so Gleam doesn't really have any browser API bindings at the moment. I could have FFI'd the whole thing and probably brought a bit more into Gleam, but for that particular bit I've written that JavaScript myself quite a few times and so it was just quicker to keep that little bit in JavaScript. Thank you, thanks. Any other question? In the beginning you presented an API for connecting audio nodes by using nesting, my question is how would that work with more complex graphs that have forks and merges or feedback? So you're talking about this, right? Yeah, I actually presented a kind of stripped-down version of the actual API, and there we have keyed nodes, so you can assign an ID to a node, and then there are ref nodes as well, so you can refer to other nodes in the graph outside of the tree, and so that way you can keep this kind of tree-like structure but jump out and refer to anything you want and have loops or whatever. And so actually that's what's happening in this app, so we've got that delay that's going on in the background and that's the feedback loop and then it's going, yeah, does that make sense? Cool. Any other question? Hello, sorry, I didn't see the full presentation, I arrived in the middle and maybe I will ask something that you already shared, but I would like to know if we can apply this environment to live coding and improvised performance, or is it mainly dedicated to building clients and applications? Yeah, I think you could totally transfer these ideas to live coding or performance, I mean ultimately it just comes down to sending messages, right, and so here we're sending user interaction events, but you could do conceptually the same thing with code snippets or some other kind of data transfer, yeah. Any other question? Hi, Redsorg, I was wondering, you said it was compatible with React, so will it be compatible with other frameworks like Vue in the future?
Yeah, at the moment it's just React, but it's been on my to-do list for a while now to kind of factor out the state management that Lustre does away from the actual renderer that you choose, so right now just React; some nebulous time in the future, it could be Vue or morphdom or whatever. Okay, I think there's time for one more question if there is one. Okay. Thanks for the talk, but if someone wants to use some hardware devices to connect, does Gleam support some wrappers over Web APIs to speak with hardware parts like the USB serial port, etc.? Right. Do you mean from the browser side or, yeah, so like I said there aren't really any official bindings at the moment, but as I also said the FFI story is very simple, so it's actually quite easy to create bindings for these browser APIs yourself, which is pretty much the situation where we're at today. I mean the biggest thing maybe just holding Gleam back at the moment is that the ecosystem is just very, very young and so we don't have many packages or bindings for a lot of stuff. Okay, thank you again for your talk. Thank you. Thank you. |
The Actor Model as a Load Testing Framework |
Okay, now we have Nelson Vides with the actor model as a load testing framework. Give it up. Thank you very much. Thank you for coming. Let's get started. As you heard, I'm Nelson Vides. We only have so many minutes, so I'm not going to go deep into an introduction of who I am. Just ask me in the corridors. I love talking. I'm a Senior Erlang Consultant for Erlang Solutions and a core MongooseIM developer, a messaging back end, different questions. Again, ask me in the corridors. I would love to talk about it. Let's start with an analogy, an intro, a catchy intro. Now let's see how the internet works. While this loads, and I hope it loads, otherwise I have it downloaded, I had a fantastic teacher in high school, a fantastic physics teacher, kudos to him, wherever he is, hello. When we were studying aerodynamics and Newton's laws, we studied this bridge that is not loading. I think I will just save time and reproduce it here. Don't ask me how to make it bigger. Back in the 40s, they built a bridge in the Tacoma Narrows in Washington State, crossing from Tacoma to the peninsula, the other side of the Narrows, and the bridge had that problem. It had a very spectacular build. Through the build, they already realized that this was happening, that the bridge was not really stable, and very shortly after the grand opening, they had to evacuate. They left one car with the only casualty. Unfortunately, a dog was left inside of that car, the only casualty of this accident. This spectacular happening, if you check it on Wikipedia, it will be written that something like this left a mark in the history of engineering. Engineers went all mad and crazy: what happened here, what mistake have we made? Eventually, the bridge fell in 1940, so then there was World War II, they didn't have a chance to rebuild it. In the 50s, they built a new one. The old one, these pieces that fell are now a fantastic house for fishes at the bottom of the Narrows. Let's go back to the presentation.
Yeah, this never loaded, good that I didn't wait for it to load. Why am I talking about this bridge? Back in the days, bridges were like this. In Roman times, it was a solid piece of stone, you could just hammer it in all directions, it was just solid. What is the load that this bridge was having? A few Roman centurions walking, a hundred of them at a time. How heavy is that? Some armory, what armor did they have anyway? It was not like big modern missiles and things that weigh tons. But one day, we went from bridges like that to bridges like this, that are very lightweight. Even if they are much bigger and they span way longer distances, they are way lighter than the previous one and they are not as solid. So there are forces that didn't use to matter in the previous bridge, that now make a really big difference. For example, wind. The previous bridge, put it through a hurricane, probably like, what kind of hurricane do you need to do something. But that bridge, not this one in the picture, this is a model, but the Tacoma bridge fell under a wind of 40 miles per hour. It's not that I like miles, sorry, I'm a supporter of the international system, but the Wikipedia article was written by an American, so it's in miles. How many kilometers per hour that is, I don't know how to convert it. But it's not a lot, it's not a hurricane. So my analogy: in the previous bridge, there were just a few people with a small load, and forces that were there didn't make any difference whatsoever. But in the new bridge, there are hundreds of cars with lots of load, probably transports of goods and much bigger weapons than in the past, and forces that were always there really make a huge difference. Let's have an analogy that matters to us here, we are not bridge engineers. Not long ago we had these huge computers, but you could probably just punch them and nothing would ever happen. If I punch this one, the presentation is over. Those were used by just a few people with a few use cases.
And then we went to this magic infrastructure of God knows what is going on, of lots of things put somewhere, used by millions of people, God knows what use cases people are finding out. You know, you probably design your service with one or two use cases in mind, and then people surprise you. So, the questions again. What are all the interactions? There was one or two use cases, but one or two people now is the limit. What is the traffic capacity? In the Roman bridge, there was a centurion, an army, a small army, a division with a few weapons. Now, just imagine a modern bridge. What about the amplifying factors? The problem with the wind, ask me in the Q&A about the details of why this bridge fell. I love that story. There was a little bit of wind that amplified the movement to more than the bridge would support. This can happen also to us. Imagine a client sends a packet that is compressed. We decompress it and, you know, he sends half a kilobyte, but we decompress it and it's five gigabytes. And, you know, you run out of memory. What about amplifying factors? And what about all forces that didn't make any difference? For example, punching a computer. Now they really do. All right. Let's get started with a little bit of terminology. I'm coming back to the title of my presentation. What is a framework? Here you have a bunch of copy-pasted definitions from different dictionaries. And Wikipedia is the first, which is not the best dictionary, but we all love it. Basically, you probably have an idea, like Phoenix is a web framework, for example. It's a set of tools that gives you a way to build a system to solve a problem. In turn, what is a model? You can have a model of a bridge, but you cannot have a framework of a bridge. You have a framework to build a bridge and a model that represents the bridge. Again, some copy-pasted definitions from diverse dictionaries for you to enjoy. And ask me later.
This model, in particular, is the inverted model of the catenaries of the Sagrada Familia. Again, ask me. I love this topic, but we are here to talk about Erlang. This is how Gaudí designed the Sagrada Familia. That is just about to be finished any day now. Let's say, someday, we'll finish it. So we have a framework, a set of tools to solve a problem, and a model, a representation, a theoretical representation of your problem set. Testing and load. Testing, like kids go to school and they get a test, just to prove that they know what they're supposed to know. It's a process of making sure that things are doing what they're supposed to do, that they know their knowledge, that the software does what it's supposed to do, et cetera. And load. This is what Newton would probably love to call work. Again, thank you, physics teacher. Probably what Newton would love to call work is a mass or quantity of something that has to be worked on. Like, moved, or supported, or resisted against gravity, or wind, or transported in these virtual bridges of cables that we have under the ocean, et cetera. So load testing is testing that the software, a service, can handle the load that we are giving it. And how it behaves under different such quantities. So we have this rough scheme of, like, three points of performance testing, of load testing, that you have to test. Performance is basically how fast your algorithm is, like, executed once. It takes 10 seconds, or 10 nanoseconds. It's the theoretical performance, but what happens when you make a lot of requests, at the point where you expect your service to still be able to cope, but not more than that. It depends on the hardware you deploy, your architecture. You expect that this should behave like this, and then you test it. And then you put more load and see how it dies. We have this luxury in IT that we can destroy our software, because we can just replicate it, build infinite copies. You know, the bridge guy would be very jealous.
He cannot build two bridges to break one. He has no second chance. There is one bridge. Don't break it. It's very expensive. Make sure it works. How do you test what happens when it dies? So a load testing framework is going to be, of course, a set of tools that gives you a way to test these different kinds of loads. And for these kinds of loads, you need some units of measurement. What is a load? In the case of the bridge, Newton would love to call that the forces. And you need the interactions. How are these possible loads applied? You know, in the case of the bridge, we would usually think of gravity. There is just one interaction. It goes down, but wind and turbulence and your users can be very crazy. Forces can be applied in any way. So we need to think about the unit of measurement and the interactions. So as I said, there is the forces. Newton would love this. And the equivalent. You have a service, some backend that has users. And as I said before, you would never imagine the ways they find to use your service. You're usually designed with three or four things in mind, but you know. So I would say that the equivalent of the forces that can be applied in different directions are, like, self-independent programs. Imagine that each one of those users is a program that decides how to apply his force, decides how to interact. Like, each one of those many arrows that you can draw in this bridge, and this is infinite if you get involved with differential equations and, you know, complicated mathematics, everything moves like crazy. All those moving arrows can be represented with an independent program on its own. And those programs interact with each other. This is the model of the actor that I can imagine that most of you, more or less, would be familiar with, like, what we do in Erlang and Elixir. 
For those of you that are not, the idea, basically, by the way, before I go to the next slide, this is Carl Hewitt, the guy that named the actor model, that put it on paper. He died a month ago, or almost two, maybe, somewhere in mid-December. So, a bit of a tribute to him. Thank you for the theory. For those of you that may not be familiar with the concept of the actor, basically, it's the universal primitive. In a language like Ruby, for example, everything is an object. You can do whatever, dot, something, and maybe it will crash because it's not valid. The compiler may tell you, but you can. That's how you design your program. In a language like Lisp, everything is a function. Absolutely everything. You can do whatever, parentheses. And maybe it's not valid. Maybe it will crash. Maybe the compiler will tell you before compiling. In a language like Erlang, everything is an actor. You can do whatever, exclamation mark, send a message. And it's almost never valid. It's only valid by a process identifier, or if it has a name, a proper name. So, this is the model of your program. This is how you structure the program. How are we going to load test a service? Light thickens, and the crow makes wing to the rooky wood. This has lots of background. It's a very personal thing. First of all, of course, I love Shakespeare, but that's not the point. I work, as I said at the beginning, on the MongooseIM service. That is an XMPP implementation. And in XMPP, I don't know why, but I'm very happy about it, all the examples in the RFC are given with Shakespeare quotes. So, when it comes to messages, you know, there is Alice writing to, not Alice. Juliet writing to Romeo from the balcony, and then all the examples are like this. So, we made a piece of software with a name based on a quote from Shakespeare. It is called A Murder of Crows. I also love Hitchcock. If you haven't watched it, please watch this movie. So, there is this library that we created in my team to test MongooseIM under load.
That is called A Murder of Crows, because crows are dangerous and are there to kill you and eat your corpse. So, this is what we try to do, to just kill MongooseIM, see it dying, and then try to make it stronger next time. And with this project, we reflect on the interactions, the traffic capacity, the amplifying factors, all new forces. So, in the case of a messaging system, there is this vulnerability that happened to everyone back in the day. You know, there is compression. Somebody sends you a small packet, you decompress it, and boom, you run out of memory. These kinds of things, you have to look for these amplifying forces, the traffic capacity, how much traffic each client can send, how many clients you can have, all new forces. Something that may not be a surprise for old-school Erlang developers. This new world of cloud, that is someone else's computer, really. All your microservices' connections are a lot less stable, distribution is not as cool and easy as it was when Ericsson made it and hardware was indestructible, you know, the punching theory. Nothing happens. Now it dies. So, all new forces that now make a difference in the new way of building a system. In the case of MongooseIM, we have these usual use cases. Session establishment. So, you know, somebody logs in, authentication, password, passwordless, make up your mind. Sending messages. Obviously, it's all about sending messages. Fetching your archive. You reconnect after a while, you were on holidays, and then you fetch all the messages you missed. This is stored somewhere. It has to be stored as you send it. What is the impact that it has on sending, on receiving? Joining and leaving group chats. This is something that, in old classic XMPP, is a problem to scale, but old classic, with time, things happened. We have solutions for that. So, I had these problems. We need to test them. And we think about how to test them. So, you start your scenario, and at testing time, you need an init, a startup.
Like, start the metrics, start the functionality that is going to coordinate all your actors when they have some interaction between them. For example, in a group chat, you are going to create so many actors that will then join the same group chat and talk to each other. Or in a multi-user game, you are going to have millions of users, but they will cluster in groups. So, you need to coordinate them. So, you will start logic to capture users and to coordinate them and make them join the same group, et cetera, et cetera. So, you start all the actors. After all your init, you spawn all the processes, you know, and each one executes the program that it has been coded to do. And then you run it. Locally or distributed. At some point, the load that you can generate doesn't fit in a single computer, so it has to be distributed, so you need your service to handle the distribution for you. The purpose of load testing is checking how your software is going to survive or die, not implementing the load testing itself. We want a load testing library that will just give me all the users, give me a way to coordinate them when I have to, to throttle them when I have to and at the rate that I have to, and to handle whatever place I need to start this load testing from. And I don't want to think about all of that. I just want to describe the scenario that I'm going to use to kill my service. So, we built a library that does all that other stuff. A very important thing is the throttle idea. In the case of the chat service, imagine that a million users connect exactly at the same time and log in at the same time. It's probably not a real use case. You can test for that, but that is the stress part, when you want to kill the service.
Later, you would usually see what happens when you connect 100 per second, and then you increment to 200 per second, 500 per second, 1,000 per second, and you want to have a functionality that will throttle and progressively increment the rate. And then you look at your metrics: both the load testing library and the service that you're testing will output to Grafana, and then you see the correlations. You want actors to wait for permission. Am I allowed to do this already? And the cases are session establishment, but also joining a group chat, or how many messages you are going to send. There is this, you know, you have an arrow that is going to be applied in one place. How big do you want the arrow to be? You want that arrow to grow incrementally. And you may want to ask another actor to wait for the approval. You can tell the throttle logic to tell that actor to wait for something. And then that actor, which is not yourself, will wait for the action. For example, in the case of joining a group chat, first you have to create it. So there is a first user that says to everyone, wait, don't join, because I need to create the group chat first. Voilà, it's created, come here, et cetera. And another very important piece of functionality is the coordination idea. So as actors are appearing in your load test, one thing that you will want to do, as I said before, is to coordinate sets of them. For example, who is going to write to whom? So you want an actor to know about another one, so it can send it a message. You want a functionality that will pick up actors as they are starting, in a configurable way: either all of them that are started, or sets of pairs, or a list of them. And once the configurable amount of actors has started, then make them do something. There is a callback that will get the list of actors, with what they identify as, and will coordinate how they interact with each other.
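The throttle idea described above can be sketched in a few lines. This is not taken from the talk or from AMOC: the module name ThrottleSketch and its functions are made up for illustration, and it is written in Elixir rather than Erlang. One process hands out permissions at a configurable rate, and every actor blocks until it gets one.

```elixir
defmodule ThrottleSketch do
  # Hypothetical module for illustration only; this is not AMOC's API.

  # Starts a throttle process that grants roughly `rate` permissions per second.
  # Rates above 1000 per second give a 0 ms pause, i.e. no throttling.
  def start(rate) when is_integer(rate) and rate > 0 do
    spawn(fn -> loop(interval_ms(rate)) end)
  end

  # Changes the rate while the test runs, e.g. 100/s, then 200/s, then 500/s.
  def set_rate(throttle, rate) when is_integer(rate) and rate > 0 do
    send(throttle, {:set_rate, rate})
    :ok
  end

  # Called by an actor: blocks until the throttle says "go".
  def wait(throttle) do
    ref = make_ref()
    send(throttle, {:ask, self(), ref})

    receive do
      {:go, ^ref} -> :ok
    end
  end

  defp interval_ms(rate), do: div(1000, rate)

  defp loop(interval) do
    receive do
      {:set_rate, rate} ->
        loop(interval_ms(rate))

      {:ask, from, ref} ->
        send(from, {:go, ref})
        # Pause before serving the next actor so that grants are spaced out.
        Process.sleep(interval)
        loop(interval)
    end
  end
end

throttle = ThrottleSketch.start(100)

# Ten "actors" that all want to log in at once; the throttle spaces them out.
actors =
  for n <- 1..10 do
    Task.async(fn ->
      :ok = ThrottleSketch.wait(throttle)
      {:logged_in, n}
    end)
  end

results = Task.await_many(actors)
```

Calling ThrottleSketch.set_rate/2 from the scenario at fixed intervals would give the step-wise ramp (100, 200, 500, 1,000 per second) mentioned in the talk.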
And, yeah, the actors, as they join the coordinator, will be given the function that they have to run. So to us, this is what my load testing framework is supposed to help me do. We use it for XMPP. So we have scenarios and functionality written that know how to do the authentication for the protocol, that know the functionality of MongooseIM. But we don't believe that the load testing library is the one that decides your scenario. I have seen different load testing frameworks that give you functionality to run HTTP requests. So what if you are not testing something HTTP related? We believe that the best way to write what you want to test is to write the code that you know how to write anyway. So the idea is that you write Erlang. Elixir is on the way. This library is not integrated with Elixir yet, but pull requests are accepted. The library, as I say, is called AMOC, an acronym for A Murder of Crows, because you want to see your service dying. There is the repo, you can look it up. We have this other repo that we call AMOC Arsenal, where we have all the scenarios for XMPP and where you can take inspiration on how they work. And I'm about to finish here. I promised myself that I would make this presentation without showing a single line of code. So I actually cut the screenshot before the code starts. Let's see how it works. In previous presentations I have shown a lot, and it's a bit more complicated to explain. So the library has documentation. Another thing that I have pending is to use the new ExDoc documentation. It doesn't have it yet, but it has beautiful Markdown that you can read in GitHub Pages. And the scenarios library for inspiration. That is all I have for you. This is my handle. Those are the repo links for MongooseIM and for AMOC. And this is a picture that I have everywhere, if you see some Nelson videos and you don't know if it's me. It's going to be that one if it has that picture. That's all from me. Thank you very much.
Thank you, Nelson. So are there any questions? Yeah. I know that there's also a library called Tsung. It's a load testing library written in Erlang. So how is this one different? In that one you write the scenario in XML. And it has a, how do you call it? Like a domain-specific language, XML-based, to describe what you want to do. And the library has to offer you the protocol. So that library actually has HTTP and XMPP helper functionality. But if you want a different protocol, the library doesn't give it. So we thought that we just want to write the Erlang code. It's way more pleasant to write and also less limited. Okay, thank you. Other questions? So by using A Murder of Crows, did you already find any, like, bugs in MongooseIM that you've been able to fix based on the... Every single time. Fair. The usual bottlenecks sometimes are database interactions. And all the forces that didn't use to matter on the computer you could punch. But now, as you write messages, you need to make sure that they are recoverable. But the amount of messages that you can send might not be as scalable as the amount of inserts a database can handle. So this is something that we test a lot. And another thing that we measure is the time to delivery. So the sender puts in a timestamp and the receiver just measures the difference. And that's something that we also test continuously when we change something, to see that we didn't introduce a computation that would make the time to delivery longer. So those are the two most common tests that we run almost all the time, and then there are edge cases that we don't test as regularly. But we have all the scenarios for them. Any other question? I wanted to mention another library I saw that is called MZBench, I think. I don't know that one. Yeah, I think it was... I know it because it was used by VerneMQ to do its load testing, I think. And I think it's in Erlang, too, and you write scenarios. But is AMOC able to...
If I have an actor that has to perform some action and then pass the state to another actor, is that possible? Or is that... You have to write your own code, basically, to do that. I have. The coordinator would help. Okay. So in the coordinator you can say to pick up pairs of actors, and then they say... Okay, you had the first one. Okay. Yeah. Any other question? We have something similar for changing the owner of a room, and then actors have to pass the state, the knowledge, to another one. Okay. Okay, thank you again. We'll see if there are any other questions. |
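The time-to-delivery measurement mentioned in the Q&A above (the sender puts in a timestamp, the receiver measures the difference) can be sketched as follows. This is an illustration under assumptions, not code from AMOC or MongooseIM: the module DeliverySketch is hypothetical, and it uses two processes on one node, where the monotonic clock is shared. Across machines the sender and receiver would need wall-clock timestamps and synchronized clocks.

```elixir
defmodule DeliverySketch do
  # Hypothetical helper for illustration; not part of AMOC or MongooseIM.

  # Sender side: stamp the payload with the current time before sending.
  def send_stamped(receiver, payload) do
    send(receiver, {:msg, System.monotonic_time(:microsecond), payload})
    :ok
  end

  # Receiver side: wait for one message and return
  # {payload, time_to_delivery_in_microseconds}, or :timeout.
  def receive_and_measure(timeout_ms \\ 1_000) do
    receive do
      {:msg, sent_at, payload} ->
        {payload, System.monotonic_time(:microsecond) - sent_at}
    after
      timeout_ms -> :timeout
    end
  end
end

parent = self()

# The receiving "actor" reports its measurement back to the parent process.
receiver =
  spawn(fn ->
    send(parent, {:measured, DeliverySketch.receive_and_measure()})
  end)

DeliverySketch.send_stamped(receiver, "hello Romeo")

measurement =
  receive do
    {:measured, result} -> result
  after
    2_000 -> :timeout
  end
```

In a load test, each measurement would be reported as a metric so that a change that lengthens delivery shows up as a shift in the distribution.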
Shorter feedback loops with Livebook |
Okay. Okay, now we have Linus De Meyer with Shorter Feedback Loops with Livebook. Give it up. All right. Thank you. Can everybody hear me? Yeah. To my surprise, I am the first and only Belgian presenting here. Well, I can welcome you all here. Maybe the first question I want to ask, which is the most important one: who is hungry right now? All right. Sorry, I cannot help with that. But another question I would like to ask is, who has heard of Livebook? Who knows what it is, more or less? Yeah. Okay. That's a lot of people. Who has worked with Livebook professionally, or just has it on their computer? Okay. Fewer people, but there are some. If you want to follow along and if you already installed Livebook, then you can go to my GitHub repository. I have a little notebook prepared. And I will try to switch back and forth between this presentation and the Livebook. All right. The goals of today would be to introduce you to Livebook, to make sure you all understand what it is, how you can get it, how you install it, the various options. And then I think the most interesting part is how I used it in three different cases and what I learned from using it in a real project. And underneath all this, I hope I can bring across the message that Livebook really helps to start somewhere in the middle. So you don't spend time scaffolding an application and only after a few days or hours get to the most interesting part. So Livebook enables you to start in the middle. That's my main message here. Whenever you start Livebook, you're greeted with an introduction page. At the top you see your folder structure. There is a very nice learning section. And at the bottom you have your sessions. So you will often import a notebook, and then a session will appear and you can hook into that. That's actually what we are going to do here. I'm going to go to my notebook that I just prepared here.
And yeah, I just wanted to point out, if you are just starting with Livebook, there is a very good learning section. So please go through these. Also, if you're learning Elixir, it's a very good way to familiarize yourself with Elixir. And it also covers things like how you make pretty graphs or how you would use the Kino library, which is the one that is used to actually interact with your Livebook. It's all just Markdown. So, yeah, you can see here, it uses these code fences with the Elixir annotation. So it's very easy to check into your GitHub repository and make sure you can review it if you want. And GitHub also recently added the feature that it nicely formats your Livebooks. They have the extension .livemd. So it integrates nicely with your version control system. The basics. We have code cells which can be executed. They contain your code. And the first one is a little bit special in the sense that it often contains your setup. So you can pull in all your dependencies. So you can use the Mix.install function. And I'm going to use a few here, not too many. But I'll go over these once they become relevant. So right here we have our first code cell. We can just execute it. It takes a while to start up. But then we can go. It is being evaluated. You can see the green dots. So it's being evaluated. And you have all those nice features that you can expect from an IDE. So you can ask it to autocomplete. If you press Ctrl+Space one more time, you get all the documentation for that function. So you get a lot of help editing your code here. The result is being printed down below here. And you also have the ability to, like I did here actually, puts stuff. And that's also being printed underneath your code cell. So that's very nice. And yeah, maybe the most, or at least a very important feature, is that you can interleave your code blocks with just regular Markdown.
So it's a really nice way to do a little coding and then explain what you have done and then go on to the code again. Yes, a few words about reproducibility. It's very nice to have this notebook and to know what actually will happen if you execute all those code cells. If you start from the beginning, it's very, very clear. You go from top to bottom. But what if you are going to edit in the middle, make a change somewhere? Well, Livebook has you covered. It analyzes all those bindings that are being made in those code cells. And it makes sure that if you change something, the relevant code cells underneath are also going to be executed again. So that's actually the way you often build up state. You have a code cell that creates a binding, and then in the next code cell you can use that binding, so you can build upon it as you go through all the code cells. I can do a little demonstration of how branching sections work. So the sections are actually shown very clearly here on the side. I have a few of them, but one has the little branch icon. So this is a branching section. And this is just to show how the execution model actually works. So right here, I demonstrate that you can use the bindings from before, so from the main flow of the notebook. And if I start an infinite loop here, this is just going to keep printing in this little frame here. You will see that if I execute a code cell below, it will be queued, but it will never run, because the other code is just blocking that one. But if we carry on to the next section, we can see that all is well again. So this is still blocked, but this is the main thread of execution. So we are not blocked here anymore. And just to show that we cannot access the bindings from the branch, I just triggered this error, because we cannot access this variable. Okay, this is a pretty picture that I stole from José Valim.
I'm not going to go into detail, but I just want to point out that everything is based on, or is heavily using, the Erlang distribution mechanisms that we all know from the Erlang/OTP ecosystem. So we have a central application here. It's a LiveView application, actually, with a lot of JavaScript. And we can connect to it through WebSockets. That's all being handled for us. And it does not actually run the code in the Livebook application itself, but in normal mode, it will spawn a new node and run your code on this new node. So we call it a runtime. This runtime is not aware of anything Livebook related. It is just a plain node that can execute code, and you get the results back. So that's what's going on underneath. There are a lot of ways you can get Livebook on your computer. Well, I used to like the escript installation, but since it's tied to your Elixir installation, I recently switched to using the desktop application, which is getting very good at this point. You can also run it in the cloud. You can have it as a Docker image. That's all being covered. The various ways to start, not very interesting, but I think what's more interesting is my story of how I used it to mitigate risks early on in some projects that I've been doing. Yes, I just want to sum up here the benefits that I see. So it allows you to start in the middle. If you're using Livebook, you can jump straight into your problem space. It increases transparency, because you can use that Markdown in between to document your process. So all your thoughts, you can put them in between all those code cells, and it's, I think, way better than those obscure scripts we sometimes write, and you can also very easily share this document. So that's actually something that we did. I got some tasks to do. A client was asking something. We were doing something with machine learning and artificial intelligence. We did not know whether we could do it.
So I sat down, made this Livebook, and then documented all the steps I did, and in the end, got a pretty graph out of it, so I could convince the client that we could do it, actually, with Elixir. Just a little bit of context. I work for a small company. We often switch between projects. The company is named Zenjoy, and we are often working as a team of two. So documentation and collaboration are very important, and the communication with the clients is also very, very important. So in this first case, we were tasked to interoperate with, or to call, an undocumented legacy API. It was very low level. It was not as low level as tools explained to us, so it was not like we had to do the pattern matching on the bit level, but we had to use the gen_tcp module straight from Erlang. But it was very nice to have this Livebook environment where we could just throw the commands at this server that we could somehow use and see what came back. So in this way, we were able to create a notebook that documented all the commands we could send and how it reacted, and you see some pattern matching going on here. You also see some magic variables. So this was given to us, so we could not change this, but at least we could document it, and this became a very long document to refer back to. So this is another demonstration I wanted to do. It's not because you're in the browser that you're constrained in any way. You can still use all the process magic and all the GenServers you like, and this is just a demonstration of how you would go around and spawn a TCP server. I'm using Thousand Island here. In reality, I was using the other one, the older one, Cowboy and Ranch. Yes, thank you. But now for this demonstration, I got to use Thousand Island, and it's super nice, so you can just define your handler. It's just going to echo back whatever we send to it, and here I start it up.
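A minimal sketch of the kind of echo server described above, under stated assumptions: the talk uses Thousand Island, whereas this version swaps in Erlang's built-in gen_tcp so that it runs without any dependencies in a Livebook cell or a plain script. The module name EchoSketch is made up for illustration. The client half shows the same style of probing used for the legacy API: send some bytes, see what comes back.

```elixir
defmodule EchoSketch do
  # Hypothetical module for illustration; the talk's demo uses Thousand Island.

  # Opens a listening socket on a port chosen by the OS.
  def listen do
    {:ok, lsock} = :gen_tcp.listen(0, [:binary, active: false, reuseaddr: true])
    {:ok, port} = :inet.port(lsock)
    {lsock, port}
  end

  # Accepts a single client and echoes its data until it disconnects.
  def serve_one(lsock) do
    {:ok, sock} = :gen_tcp.accept(lsock)
    echo(sock)
  end

  defp echo(sock) do
    case :gen_tcp.recv(sock, 0) do
      {:ok, data} ->
        :ok = :gen_tcp.send(sock, data)
        echo(sock)

      {:error, _reason} ->
        # Client closed the connection (or the socket failed): stop echoing.
        :ok
    end
  end
end

{lsock, port} = EchoSketch.listen()
server = Task.async(fn -> EchoSketch.serve_one(lsock) end)

# Probe the server the way you would probe an unknown API: send, then read.
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
:ok = :gen_tcp.send(client, "hello")
{:ok, reply} = :gen_tcp.recv(client, 0, 1_000)

:gen_tcp.close(client)
server_result = Task.await(server)
:gen_tcp.close(lsock)
```

Instead of the Elixir client, a tool such as netcat pointed at the printed port would work as well.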
The only caveat, or the thing you have to be aware of, is that you can start your children, your processes, and your supervisor tree under the Kino code cell. So whenever you reevaluate here, you can't really see it, but another process is started, or a whole tree starts. So this is a nice interop with the Livebook environment. And once we have those, we can even... Yeah. Yeah, I guess that the gods are not with me today, but at least what I wanted to show is that you can actually draw a pretty picture of the supervision tree right here. But still, I think the server got started, and so now I can netcat into my localhost on this given port, and I can see how some stuff is being echoed back to me, so at least that works. All right. Back to the presentation. So it is a nice environment to stub out a server and set up a situation where you can then use your application to interact with this stubbed version of your API. All right. This is just an example of how you would integrate your Livebook with a regular Mix project. It's all just sitting next to each other. Oftentimes, I just make a folder where the notebook lives, and then your Mix project, whatever it is, it can be a Phoenix application, you can access it if you use the path way of referencing your dependencies. A few words about a typical lifecycle that I've observed: you often start to experiment in your Livebook. On good days, you add tests, and then you move all that code into the regular application, or into the regular Mix project, and you reference it from there on in the way that I just described. So you promote reusable code, and that's a way that often worked very well for me. The second case that I want to discuss is how I set up, or created, a concurrent ETL pipeline, which is a fancy word for just loading CSV files and then maybe transforming them and dumping them into Postgres. So I really got to learn a lot about how concurrent data processing actually happens.
I got to play around with Flow, which is a very nice library which builds on top of GenState. No, not GenState, the other one. GenStage, that's the one. And you can still use all the power of processes that is available in Erlang and Elixir. To demonstrate, I want to show how you can use Ecto and then this Flow library right within your Livebook. To start off, I create a repository, just like you would do in a Phoenix application. Don't worry if you do not recognize this. This is kind of standard stuff. You have to specify the adapter, and then you can emulate whatever mix ecto.create would do. So you make sure that your storage is up. In my case, it was already up, so that's what it reports. And then you can even write your migrations like you would if you used Ecto together with a Phoenix application. In this case, I also made sure there was an item in the database so we can query it later on. So I have to make sure that this repository actually runs. And then I migrate. Well, I do a rollback to make sure nothing is being left over, and then I migrate again. So we have the end situation that I can query right here. So you can see that our fresh entry is just inserted with this new timestamp. And then we can build upon this. This is another demonstration, very short. This is a definition of a flow. Also, don't worry if you do not recognize this. That's not the key here. I just want to show that you can use all those goodies, and you're not constrained in any way. Right here, we are just emitting a value every second, and we're going to wait for three seconds and then insert an item in the database again. So you get the logging, and if we query it again, you might recognize the nice Ecto query syntax. You see I've wrapped it in a data table, and you can now see the three new items appearing. So that's very nice.
Here you see me playing around and actually visualizing this ETL pipeline, where every color is another class of objects, or is being inserted in another table. It went very quickly, but when making this presentation, I also saw there is some room for improvement. So not all cylinders are firing together, but at least it was fast enough for our purposes. Another case I want to share with you is that we used Livebook to actually connect to a live running instance. So remember, as I have shown in the beginning, it's all just Erlang distribution under the hood. So instead of using the regular setup, where you use the Elixir standalone setting, you can also use an attached node configuration. The only things you have to know are your node's name, or the short name, and a cookie which you have to agree upon. And it's very good for doing one-off tasks. Maybe you don't have a UI for something yet, and you want to do it in a Livebook; then this is a nice way to actually have, like, a super admin interface. But be aware that this is still a live environment. So if you do this, make sure to put a big disclaimer on top of your notebook to remind you of the risks involved. All right. The last thing that I want to share or show is how you would do tests in Livebook. Like I said, on a good day, you write tests. And we have seen some examples in the previous presentations of how you would do, like, a doctest, where you attach some kind of formatted test and its expected output. Well, since a few versions, these doctests are actually run automatically. So if you define a module, in this case Christmas, you see that the doctests are failing. I think I can easily fix it by changing the expectation. And if you run it again, the doctests are green again. But you can also do just your regular testing. The only thing you have to think about is that you have to disable your auto-running.
But then again, you can do your testing, and you have to make sure you call the run function on the ExUnit module. So there is no excuse not to test, actually. I want to end with a reference to these two resources. There is an initiative by DockYard Academy. It's an open source curriculum to learn Elixir. And they have used Livebook notebooks to actually teach this to students. And the other thing you might have heard about in the Elixir news is the Bumblebee project, which allows you to actually play around with these new neural networks like GPT-2 and Stable Diffusion. And you can just do it locally. So it's a very nice way. It integrates very nicely into your Livebook notebook. All right. That's it for me. Thank you very much. Thank you very much. Is there any question? Could you maybe compare and contrast Livebook with Jupyter notebooks? Yes. That's actually a reference. Sorry. So the question was how this relates to Jupyter notebooks, which we might also know. I think it's very much inspired by it. So it's also a computational notebook. But I also see a lot of differences, although I do not know Jupyter notebooks very well. But I think, for example, the dependencies in the first cell, I do not think there is such a system in Jupyter notebooks. You would have to use, like, Conda or Anaconda to set up your dependencies. So it's a little bit less integrated. But I cannot say more about differences. But you're very right. There is a strong inspiration there. Yes. Thank you. Any other question? Cool. Thanks for the talk. I wanted to ask, actually, whether there is an option as well for Livebook being available as a UI within the IDE, so kind of connected closer to the development environment? No. Not that I know of. No. No. It runs in the browser, and that's where it lives. So you can install it as a standalone application, but it's still something that lives in the browser.
But you're right in the sense that it is not a full-blown IDE, and that's also one of the nuisances that I have noticed: if you have very large code cells, for example, you are missing some features. And if you're used to vi bindings, for example, you will not find them there. Yeah. Cool. Thanks. Yes. Okay. Last question. Does this work for multiple users collaborating on things? Yes. And I should have shown this. It is one of the nicest features. Thank you for opening that door. If you're using multiple sessions or, for example, multiple users in multiple locations, you see, for example, the selections they made. You see a little cursor where they are editing or you are editing, and you're actually editing the same notebook. So, yes, it's kind of a live coding environment. Yes. I don't know. No. It's building on top of these goodies we have. Yeah. Okay. Thank you again. |
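The regular (non-doctest) testing flow described in the talk above, where auto-running is disabled and the run function is called on the ExUnit module, can be sketched like this. The talk only mentions a module named Christmas; the function christmas?/1 and the test module below are assumptions made for illustration.

```elixir
# In a notebook, ExUnit must not wait for the VM to exit before running tests,
# so auto-running is switched off and the tests are triggered by hand.
ExUnit.start(autorun: false)

defmodule Christmas do
  # Hypothetical function; the talk does not show the real module's contents.
  def christmas?(%Date{month: 12, day: 25}), do: true
  def christmas?(%Date{}), do: false
end

defmodule ChristmasTest do
  use ExUnit.Case, async: true

  test "25 December is Christmas" do
    assert Christmas.christmas?(~D[2023-12-25])
  end

  test "any other day is not Christmas" do
    refute Christmas.christmas?(~D[2023-02-04])
  end
end

# Runs the test modules defined so far and returns a summary map.
result = ExUnit.run()
```

The returned map includes a failures count, so a later cell can check it or simply display it.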
Running Erlang and Elixir on microcontrollers with AtomVM
How to run BEAM code on a 3 $ microcontroller |
So there is more management overhead than talk, so great. Okay. This is Davide Bettio with Running Erlang and Elixir on microcontrollers with AtomVM. Give it up. Hello, everyone. So who am I? Basically, during my daytime I work on Astarte, which is a really nice Elixir project for IoT and whatever. And during my nighttime, I try to work a lot on AtomVM, which allows you to run Elixir, Erlang, Gleam, whatever can run on the BEAM, on a microcontroller. When I say microcontroller, I mean something really memory constrained, but not too much. I mean, it still has to be a 32-bit processor. It requires about 80 kilobytes of RAM, but we can do it. Pretty crazy, but we can do it. And so the software is mostly unmodified. I mean, we don't have to translate it to other formats or whatever. It can run BEAM files, so it's pretty standard. So how? So basically, well, I created it from scratch, and so the whole implementation has no code from the original BEAM implementation, because we are focused on memory. So rather than focusing on performance, which the BEAM is very good at, we are focusing on making everything fit in just a few kilobytes of RAM. And the virtual machine is compatible with, I mean, all the recent OTP releases. We already have some experimental support for OTP 26, so we are on par right now. And we have support for quite a big number of NIFs and BIFs from Erlang. So we implemented all the daily basics, so you can run your project if you are not doing anything weird. And well, there it is. And also we did something more. For example, we weren't able to run a REPL for Erlang or Elixir on a microcontroller. It's not really easy. So we did a simple Lisp implementation for testing stuff. So if you want to test registers or I2C communication or SPI communication, you can poke at registers using Lisp. It's not as good as maybe Erlang or Elixir or whatever, but, I mean, you can experiment a lot.
And everything can be packed into a single file that can be easily flashed. And right now we are mainly supporting ESP32, because we started the project with that powerful microcontroller, but we support, of course, Linux, macOS, and whatever, because, yeah, we need to test it. And we are working on improving and extending the support to other devices. I mean, as soon as I get a new development board, I try to run it. And sometimes I need help, of course. And it's pretty easy to port it, by the way. And when? It is already here, and it can be used for your simple or maybe a bit more complex projects. Again, you are running on a really constrained device, but you can do interesting stuff. And we are working towards the next release, which will feature a lot of cool stuff. We finally got SMP support, so we can take advantage of multi-core microcontrollers. And we recently also got really good code debugging features, so it's pretty nice. And yeah, this project has been possible thanks to the work of other contributors. And so thank you very much to everyone that has been working on it. Because, you know, open source projects are always a kind of teamwork. And it's hard to do something like this just alone, so a lot of thanks to all the contributors. And thank you to all of you, of course. Thank you, Davide. Four minutes and 40 seconds, I think. Okay, thank you. |
Dealing with a Monster Query
a story of Elixir & optimization |
And this is the last talk of the devroom, and Mackenzie Morgan is going to talk to us about dealing with a monster query. Give it up. So, hello. I'm Mackenzie. I actually work at Nextro now, and I did not put that on the thing, because it's kind of weird, since this is about something that happened at a previous job. Let's see. Let's go over here and hit the spacebar. There we go. So, back in 2020, I learned Elixir because the company that I was working at, which is Axios, it's a news company, launched our new mobile app, and then it promptly crashed every morning at 6 a.m., and I am not a morning person, so I did not want those pings. And so, we needed to do something about this, and so there was a quick rewrite into Elixir. I was not involved in the rewrite. I didn't know Elixir yet. They grabbed a couple of contractors and said, hey, learn Elixir, because we're going to be handing this off to you. Okay. And everything worked out, worked really great, except that there was this one query. So, we had this one query that had a whole lot of OR clauses in it, and this was responsible for the majority of our database load. And we also had the biggest day in U.S. politics coming up, the U.S. presidential election. If you have dealt with news organizations, you know that big political events mean a ton of traffic, right? And so, that is a huge day working in a news org, and this was the second newspaper I'd worked for, so I knew how this went. Usually, the advice for optimizing stuff is to move as much computation as possible out of the code and into the database, right? But this is the story of how refactoring in the opposite direction was what actually saved us. So, it's pretty standard in a CMS to have a structure that looks kind of like this, right, where you've got, okay, you've got a post, and it can be in a category, and it can be tagged, and it can be this, and it can be that, and you're trying to find posts in any of these different ways.
So, we had four different taxonomies that we were using to decide what we were going to show you in the mobile app. You could subscribe to a channel, or a tag, or whatever. And so, we had all these four ORs where you go get the post through taxonomy one, one's the aggregator, two, three, four, all those things. And that's where our four big OR queries came in. Which looked like this. And that's the simplified version. That doesn't have the sorting, that doesn't have the time limits. That's the simplified version. But that's what that looks like, and it's ridiculous. And so, AWS stats told us that this was going absolutely bonkers. I did a Postgres EXPLAIN ANALYZE on the query that Ecto generated. And Postgres said the cost for the analyze was over 3,600, and that it would take eight milliseconds to execute. But that's like, just computing what it needed to run was the huge problem for it. So, I'm going to go through how we refactored this to be super fast. So, okay, so we had four taxonomies, so four smaller queries. So, really, they each look like that. That's very simple. And those each take one-eighth of a millisecond. So, this is a good start. It's still kind of ugly if you write that four times, though. But what if we take advantage of atoms and the pin operator in Elixir, because Elixir's got some pretty cool syntax features? And we make it into a function that we can call four times, passing those in. And that's a bit better, but we're still calling it four times separately. And so, if we go a little bit further and we take advantage of the concurrency that we all know the BEAM has, we can pass in the list of the taxonomies that we're going through and use Task.async_stream.
And now, we can have all four queries running simultaneously, just by passing in that list, which makes it really easy: instead of copying and pasting more and more code, we just add to the list when we add another taxonomy. And guess what? By the time I left the company, yes, there were five. So, what did changing over from that big nasty block to this get us? The database CPU utilization went down from 50% to 40%, so that's a 20% drop, because math. I know it looks like 10, but 40 is 80% of 50. And remember I said the Postgres cost estimate was over 3,600? It was 16 after that. That was a much happier database server. And the execution time, like I said, went from eight milliseconds to one-eighth of a millisecond each, so a total of half a millisecond if you were to run them one after another. So, yeah, so much faster, and our database load down by 20%. Great. We also had a seven times increase in the number of requests per second that we could handle, according to our benchmarking scripts. So, we got to have a stress-free election night. I did not have to be restarting servers at two o'clock in the morning as we waited and waited for results. So, that's it. That's all I'm going to show you. And that's how to find me. |
Running Real-time Stream Processing Analytics On Traces |
Thanks very much, thank you, so welcome everyone, and I'm glad that you're here on a Saturday early morning in this first session, so I'd like to make it as easy as possible. Thanks to the organizers, Jerez and Yamur, for inviting me to talk today about stream processing. The fact is I don't know your background, so I'm not sure exactly how much experience you have with stream processing, so if some concepts seem easy, that's just to get everyone up to speed. I'll be talking today about stream processing on traces and logs, so that's what the main focus will be. Obviously this title as it is could be a startup company by itself, so you would expect to get some ideas today that you can use in your work, in your experience, or in your case study, whether you are a Java developer, a data scientist, or working in MLOps, it doesn't matter. There is something for everyone here today. So, anyone recognize these guys on the screen here? Right, so that's where I came from. I'm based in Liverpool in the UK, and on the right side is Liverpool Football Club, which is one of the top football teams in the UK. I wanted to highlight this just to tell you that stream processing is not tied to a specific domain, it could be in any domain. And do you know how long it takes, for example, for an eye to blink? Come again? Yeah, so it takes over half a second, so that's pretty fast. So if you think about minutes or hours, probably this is not the right discussion room for you. We're talking about milliseconds today, whether you use it in finance, in IoT devices, smart devices, in sports, hospitals, or machine learning, or for what we're trying to do today with stream processing. So that's the main idea.
And obviously, if you're working with real-time stream processing, you focus on the real-time data, right? I've seen it so many times where platforms and tools focus on how much data you can process, and you see these benchmarks everywhere on the internet, and this is pretty cool, I think, but the secret sauce is to combine real-time data with historical data. The main reason for this is context. Without knowing what's going on, you probably don't benefit much from the real-time data you're processing. So what you want is always to go back and check the context of this data. There is a problem with this secret sauce, obviously, because what you're looking at is two different kinds of data, and you want to make sure that you process them at the same speed, or very close to the same speed. It becomes a real problem when you try to scale it. If you have only a small amount of data to process, it's probably not too much trouble for you, but when you start to scale up, it becomes a real problem to understand how you want to scale. Do you scale your data, or do you scale your compute, or do you scale both, and at what speed? We will discuss all these concepts today. And if I ask you now how much data you process, in this room I would assume over a million transactions per second, or a few million; some of you might be processing millions of transactions per second. So that's pretty good. What we want today is to focus on a very specific area, and that area is analyzing traces. It doesn't matter if it's operating system traces or platform traces or programming language traces.
What we want is to make sure that we have an environment in which you can scale your logs, basically scale your processing, and which at the same time provides some kind of analytics, right? So again, if you look at how much data you're trying to process, the number by itself doesn't tell you much about what's going on. What you want is to find the specific information you're looking for, kind of like finding the needles or the hidden areas within your data. So if you look at how many logs you process per day or per week, and you store them somewhere on a hard drive, or in memory, or in the cloud, what you want is to make sense of them. Some companies do this manually, which means they run software and go through their logs, and this is kind of a batch process, and they try to understand what's going on within the logs. Obviously this is a problem when you want to scale, because with scaling you have different logs stored in different places, and you want a platform that gives you some kind of results. So for the sake of this discussion today, we'll focus on two different solutions. One of them is to provide some kind of alerts, and the other is to provide some kind of trends within your data. Obviously I work for a company called Hazelcast. Hazelcast as a platform allows you to do this, but you might have heard of other companies that do some kind of stream processing. So this is an overview of what's going on in this domain at this time. You can split it depending on whether you're looking for an open source solution or, I don't know, a hardware solution, or some kind of managed service.
And what you look at is which domain you work in: are you looking to capture your data, to stream your data, to do some kind of transformation on your data, or to do some kind of machine learning? So you can see that it is split into 12 squares, and the tools and platforms are spread over them. Some tools do not appear on this screen for whatever reason. This might give you some ideas, but it's hard to decide which tool you want to go for, simply because I think the distribution is not clear here: it tells you which tool is open source, for example, and where in the process you can use it, but it doesn't give you the full picture of how to do it in practical terms. So this is where it might be easier to understand what we're talking about. If you remember my slide where I discussed historical data and new data, today we're trying to split everything into two categories. On one side you get stream processing engines. These engines are pretty fast at streaming events. And on the far right side you have fast data stores, which are pretty fast at handling data at speed. So again, the solution for real-time stream processing is a combination: you want to process data in this moment, and at the same time you want to access data stored somewhere. That's where Hazelcast fits in. So, the platform itself, for those who don't know it, and by the way we have one of the masterminds of Hazelcast sitting in this room. This is the platform. It's an open source platform. It doesn't matter where your source is coming from, whether it's Apache Kafka, IoT devices, some kind of device applications, or even data within Hazelcast, or you can write your own connector and feed it into the platform.
So the platform historically used to be two different components, the IMDG and the Jet engine. Now it's all packed into one JAR file. As you see here, it allows you to load your data from hard disks into memory. So you have access to historical data pretty much instantaneously, and this provides context for what's going on with your data. At the same time, you can do stream processing. That's what the Jet engine is for. From here you can do data transformation, or stream processing as we will do today, or even apply machine learning if you want to. You can connect to it with clients. These are some clients here, written in various languages. If you're from a data science background, which means general-purpose programming languages are not your preference, you might consider using SQL to do what I'm planning today. So this is another option. And once you process this data, where you load historical data into memory and at the same time you have data coming in, and you combine them or even transform them, you can proceed to do some kind of visualization. The good thing about Hazelcast when it comes to scaling is that it's partition-aware, which means your compute, your Jet engine, your process, can detect where your data is stored. We're trying to have as low latency as possible when it comes to processing this data. This is very important to understand, because latency is your enemy when it comes to stream processing. What you want is a platform where you avoid network hops, for example, you avoid I/O to your hard disk, and you also try to avoid context switching between threads.
So you want to avoid all of these, but at the same time you want your processing to be as close as possible to your data. You can also do some kind of machine learning on this. And the scaling itself can be done in various ways. The main thing to take away here is that there's no master-worker relationship, so all nodes are peers. And we've done this study. It's a bit dated, but it's one million transactions per second on 45 nodes. What we're trying to do now is to add one zero to this number. And even though it's pretty impressive, what is nice about it is the linear scaling, which means that for more data you can add more nodes. So that's the background part of this talk. Let's move to the technical part. For this demo, what I wanted is to show you some ideas, right? You should be able to take these ideas and apply them anywhere. Obviously the solution itself could be a project by itself, so feel free to edit and change it. All source code is available on GitHub, and the documentation as well, so you can go through it. The main idea when it comes to analyzing or making sense of your traces and logs is to store them somewhere close to your compute, first of all, and they shouldn't be stored locally, right? So the first thing is to store them in the cloud. For this demo, I'm storing everything in the cloud. There is a solution called Hazelcast Viridian, which is offered as a service. You don't need to download a JAR and run your project; you can simply plug in and play. You can create an account and run everything I'm discussing today. You create an instance of Hazelcast, and from there you can pretty much proceed to what I'm planning to do. So the first option we were talking about is storing everything in the cloud.
So we're going to import the data. Obviously, we need some kind of trace message that makes sense. This trace message could be changed based on how you want to approach it, right? For example, if you're working with machine learning, you're probably looking for some kind of classification solution for your traces, or you could be looking at NLP. If you don't want to work with machine learning, you probably want to look for some kind of trends, so you look at processing your data. Whether or not you're using machine learning, you want to have the data stored somewhere. It could be in JSON format, or it could be VARCHAR strings. It depends again on how important speed is to you. So the first option is to go through the alerts. For alerts, what we're trying to do is to take everything and store it in the cloud. Obviously we don't store it in the cloud on disk; what we try to do is store it in memory. My preference in this case is to use some kind of map structure. A map structure gives you random access and rebalancing between the nodes within your cluster. At the same time, you want a key in order to know where each entry is coming from. In this case, the key could be the IP address and port number. And the value could be anything that makes sense to you. In this case, for example, you can track the level of this log, the message, for example if you want to do some kind of NLP processing on it, and the process or thread name. Once you have your key and value, you can proceed and store them in memory. This is where you connect to Hazelcast, and what we're trying to do is create the IMap.
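The key and value layout described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: a plain `dict` stands in for the distributed in-memory map, and the `LogValue` fields and the `make_key` and `store_log` helpers are hypothetical names chosen for this sketch, not part of any Hazelcast API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogValue:
    """Value side of the map entry: log level, message text, thread name."""
    level: str
    message: str
    thread: str


def make_key(ip, port):
    """Key side of the map entry: where the log line came from."""
    return f"{ip}:{port}"


# A plain dict standing in for the in-memory map (an assumption of this sketch).
logs = {}


def store_log(ip, port, level, message, thread):
    """Store one log entry under its source key and return that key."""
    key = make_key(ip, port)
    logs[key] = LogValue(level, message, thread)
    return key


key = store_log("10.0.0.5", 5701, "WARN", "disk almost full", "io-thread-1")
print(key, logs[key])
```

With this keying, each source keeps only its latest entry; a design that needs full history per source would append a timestamp or sequence number to the key.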
And once you have the IMap, you should be able to store data in it, access it, and do stream processing, as I will show you. So the first stage is to store it in the cloud, in memory. In this case, I'm using Hazelcast Viridian, and I'm using an IMap. The second stage is to do the stream processing, right? There are a couple of options here for you. The first option is to use SQL. SQL is built on top of Hazelcast, which means you should be able to query your data, so if you provide the specific messages that you're looking for, depending on your input, you can run some kind of SQL, whether it's an inner join, for example, or selects and so on. The other option is to do some kind of prediction. You're getting some logs, you don't know exactly what's going to happen next, and you try to predict in order to provide alerts or trends. So we need to build the trends. In order to do this, I use the same key, but for my value I'm using some kind of log score. The exact score is not important. What I'm saying here is that I want to give a value to every single log message. This could be, for example, how important this specific message is for you, or how serious or how dangerous the message is. Like levels in logs, you can define scores, so instead of having four levels, for example, you can spread them from one to 100. This lets you make predictions. Why? Because if you have, for example, 10 log messages with level warning, you don't know exactly whether the next message will be a warning or not. A level by itself doesn't tell you how close to a warning the next message will be, whereas if you use a numerical value, you can get as close as possible to this.
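One way to turn a handful of log levels into a 1 to 100 score is sketched below. The base scores, the keyword bonuses, and the `log_score` function are all assumptions invented for this illustration; the talk only says that a numeric score replaces the coarse levels, not how it is computed.

```python
# Hypothetical base score per log level (assumed values, not from the talk).
LEVEL_BASE = {"DEBUG": 10, "INFO": 25, "WARN": 60, "ERROR": 90}

# Hypothetical bonuses for phrases that make a message more serious.
KEYWORD_BONUS = {"timeout": 5, "out of memory": 10}


def log_score(level, message):
    """Map a log line to a numeric score between 1 and 100."""
    score = LEVEL_BASE.get(level.upper(), 1)
    text = message.lower()
    for keyword, bonus in KEYWORD_BONUS.items():
        if keyword in text:
            score += bonus
    # Clamp so the score always stays inside the 1..100 range.
    return max(1, min(100, score))


print(log_score("WARN", "connection timeout"))
```

Because the score is numeric, a sequence of scores can be fed to a regression or any other trend model, which a sequence of level names cannot.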
So we get the key from there, we get the score from here, and what we do next is to make predictions on the logs. In this process, what we have is exactly my key and the value, which is the score of each log message, and I import it into Hazelcast. Hazelcast allows you to read from and write to two different maps and do stream processing, so we'll build the trend based on previous log scores, and we'll use the prediction on top of it to provide some kind of alert. Zero means don't alert, one means alert. And as you can see here in the actual workflow, you read from the map, then you define the trend map, which is a normal map, and from there you can use it to predict what's going to happen next. Obviously you do some kind of visualization, so what does it look like? This is the prediction part of it. We take the logs map and we build the trend map out of it. The trend map starts reading messages and scores and builds the trend for you. On this trend I can use some kind of machine learning, though it doesn't have to be machine learning. In this case it's linear regression, but it could be anything, to be honest. We check the values and use a prediction based on the previous values to decide whether to send an alert or not. And this describes exactly that: when there is a one in your values, send an alert; when it's zero, don't. And here there are three ways to do it. This is where the stream processing comes into place. You could simply use SQL to read from the map and run a query if you want. Obviously this is batch, which means it's not real-time stream processing. Or you can create a pipeline, create a process, and from this process you can read the logs and do the same thing. So you can use either SQL or Java to do it.
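The trend and alert step can be sketched with an ordinary least-squares line fitted to the recent scores. This is a minimal Python illustration: the `fit_line` and `alert` helpers and the threshold of 80 are assumptions made for this sketch, and the talk itself does not state how the regression output is turned into the 0 or 1 value.

```python
def fit_line(scores):
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(scores)
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def alert(scores, threshold=80):
    """Return 1 (alert) if the predicted next score reaches the threshold, else 0."""
    if len(scores) < 2:
        return 0  # not enough history to build a trend
    slope, intercept = fit_line(scores)
    predicted_next = slope * len(scores) + intercept
    return 1 if predicted_next >= threshold else 0


print(alert([10, 30, 50, 70]))  # rising trend
print(alert([50, 40, 30, 20]))  # falling trend
```

In a streaming setting the same function would be applied to a sliding window of the most recent scores per key, so the alert value is refreshed as each new log arrives.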
But the first two options are batch, which means you cannot process the data in real time, and you want to track changes in real time. So the third option is the map journal. The map journal tracks all changes, so it is continuous. You have the logs stored, and you have logs coming into a Kafka topic, and you can store both into the map journal. We're on version 5.2 of Hazelcast; 5.3 will have the SQL features on top of it, which will allow data scientists, for example, to just run queries and change the data. And it's a ring buffer, which is very important to understand, so you can start processing your data from the start or from the end. And what you want is to use this for the alerts. So the first part is to read it. This is the actual map we built in the first option. From here you can define the key, for example, and the value, and you start, for example, to do some kind of filtering. This happens in real time, continuously, and the map itself allows you to track changes. So, to give you some takeaways and best practices, let's summarize everything we discussed today. Obviously there is more to discuss, but this should give you something to go out and try. First of all, you need to store your logs in some kind of data platform. In this case, I'm using Hazelcast, but you don't have to use Hazelcast. The idea here is to do some kind of compute on your historical data, to provide some context, as well as on real-time data. From there, you need to store it in the cloud, somewhere you can access the logs from multiple places. Obviously it has to be stored in memory. And from there, you need to choose the format. If you are looking to provide some predictions, for example, you probably need to use JSON format.
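The ring-buffer behaviour mentioned here, where a reader can start from the oldest retained change or only see new ones, can be sketched as follows. The `Journal` class and its method names are hypothetical and written only for this illustration; they do not mirror the Hazelcast API.

```python
from collections import deque


class Journal:
    """Fixed-capacity change journal; the oldest events are overwritten first."""

    def __init__(self, capacity):
        self._events = deque(maxlen=capacity)
        self._next_seq = 0

    def append(self, key, value):
        """Record one change and give it the next sequence number."""
        self._events.append((self._next_seq, key, value))
        self._next_seq += 1

    def oldest_seq(self):
        """Sequence number to use when reading from the start of the journal."""
        return self._events[0][0] if self._events else self._next_seq

    def next_seq(self):
        """Sequence number to use when reading only changes made from now on."""
        return self._next_seq

    def read_from(self, seq):
        """Return every retained event whose sequence number is at least seq."""
        return [event for event in self._events if event[0] >= seq]


journal = Journal(capacity=3)
for i in range(5):
    journal.append("10.0.0.5:5701", 20 * i)

from_start = journal.read_from(journal.oldest_seq())  # oldest retained changes
tail = journal.next_seq()                             # start at the end
journal.append("10.0.0.5:5701", 100)
from_end = journal.read_from(tail)                    # only the new change
print(from_start, from_end)
```

The fixed capacity is what bounds memory use: a reader that starts from the beginning sees only what the buffer still retains, not the full history.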
Or, for example, if you just want something faster, if speed is what you need, use some kind of map structure. Obviously you store it in memory, because this allows random access. You also need to consider how you empty your map, because you are limited in size, obviously. And finally, you need to consider security. Whatever you send to the cloud, you need to make sure that you don't include personal identifiers or anything like that. If you're interested in this topic, we're running a conference next month, so feel free to scan this code. We provide training for this, all free, obviously. And everything I mentioned today is open source, so you should be able to do everything I mentioned today. I'll be hanging around here if you want to have a chat or discuss it a little bit more. Obviously within half an hour there's not much to give, but hopefully you've got some ideas from this talk, and hopefully it will be useful for you. So with that being said, thanks very much for listening. I'll open for questions. Thank you. |
CDC Stream Processing with Apache Flink
A peek under the hood of a changelog engine |
Welcome, good morning everybody. So today I want to talk a little bit about change data capture, CDC, stream processing with Apache Flink. This talk is split into three parts. The first part is for people who have never heard of Flink before. The second part goes a little bit deeper. And in the third part, we dive really, really deep under the hood, just to get you interested in the software and in what we are doing here. So yeah, I already got an introduction, but just to summarize it: on the open source side, I have been part of Flink since even before it became part of the Apache Software Foundation in 2014. I'm a member of the project management committee of Apache Flink. Over the years I also made it to the top contributors; according to additions I'm in the top one. I don't know which refactoring made me the top one contributor. And I'm among the core people who design Flink SQL every day. Career-wise, I went through a couple of companies. The latest one, where I was a co-founder, was Immerok. Immerok got acquired by Confluent at the beginning of this year, and now I'm a principal software engineer at Confluent. So let's talk about it. Before I start with an introduction to Flink, I would actually like to talk about stream processing in general. Because when you do stream processing, you can basically always identify roughly four building blocks. So let's talk about those building blocks first. First of all, you need streams, right? You want to have data; you maybe want to create some pipeline from source to sink. You might want to distribute your streams because you have a lot of data. So maybe you want to scale out and scale in depending on the load. You want to join streams together. You want to enrich streams. Maybe there is a control stream and a main stream.
So you want to dynamically modify the behavior of the application while the application is running. And yeah, sometimes there is a bug in your application, or you just want to trace certain behavior, and then you also want to replay streams. Time, working with time, is also a very, very important concept. Because on one side, you want to make progress in your pipeline. But at some points, you also want to synchronize, for instance if you have two streams. Maybe you want to wait for the other stream. Maybe you want to block, or you want to buffer one of the streams. Maybe, if the second event doesn't come in, you want to time out after some time. Maybe you also want to replay historical data. So you want to fast forward the time. You don't want to wait another hour to fire an hour window. No, this should be quicker. Then, when we talk about buffering, what I just said, or storing data in general: state is a very important component for stream processing. State can be, for example, a machine learning model that is updated from time to time to classify your incoming streams. It can be a cache, if you don't want to look up the database for every record that comes in. In general, state can be of very different sizes. And state also, at some point, needs to be expired. If you have a stateful streaming application, a very useful component is a snapshot, actually making sure that I can create a snapshot of my streaming application. So I want a backup of the state of my streaming application. I want to make a version of it, so every night I want to create a snapshot version. Maybe I want to fork the streaming application into a staging cluster, into a development cluster, and play around with the data in the state. Maybe I want to do some A/B testing, or I just want to time travel the progress of my application. So let's talk a little bit about what makes Flink unique compared to other competitors.
First of all, as I just showed, Flink is one of the best stream processors for all of these use cases and building blocks. When you design a streaming application, you can start with a whiteboard and just draw some circles. What do you actually want to do? Maybe you want to read from some sources. Maybe you want to normalize your data. You want to filter some data out. You want to join the data, and in the end you want to sink it somewhere else. This is how it starts, and this is also how you have to reason when you're creating a pipeline. And what Flink does under the hood is that it has parallelism and scalability built in. So you don't have to think of threading or network transfers or anything like that. Under the hood there are shards, there are partitions, depending on the connectors. There are subtasks that execute operations in parallel. Some of these tasks can be stateful and can have some storage local to the operator. Very important. So the state basically scales out and scales in with the operator. You don't need to query a database, which would increase latency. And then, of course, data travels and the whole pipeline runs. And now comes the important part. What makes Flink really unique is the possibility of creating a consistent snapshot of your entire streaming topology. There are what we call checkpoint barriers, which travel through the topology and make a backup of the state of each operator. And this snapshot is then persisted to long-term storage like S3 or HDFS or some other distributed file system. When we talk about use cases, there are plenty of use cases. We can process transactions, logs, IoT data, any kind of events, user interactions. People use it for fraud detection, for machine learning, for event-driven applications, for ETL, for data integration, for analytics. So Flink has become, over the last 10 years, a very large platform.
You can connect various connectors for messaging systems, you can read and write files, databases, key-value stores. And as I said, there are also event-driven applications where maybe you want to send out an email and don't need a connector. You can also implement something custom that talks to some API. So I also want to quickly talk about Flink's APIs. This is the API stack. The two main APIs are the DataStream API and the Table and SQL API. And there is also Stateful Functions. Stateful Functions is a sub-project that tries to execute an actor model on top of Apache Flink, but I will not go into detail here. First of all, all the APIs are built on a dataflow runtime, so there's no batching or anything involved under the hood. It's really a dataflow runtime. Whenever a result is ready, it will be streamed to the next operator. On top of that, there is a lower layer, the stream operator API, which you can use. But this is for experts, I would say. And then we have the main APIs on top. The specialty of the Table and SQL API is that there is an optimizer and a planning stage in between. So this helps you when you're not creating the most efficient pipelines. The optimizer will make sure that the streaming job is executed more efficiently. Yeah, let's also quickly look at the APIs. This is a basic example of creating a stream of just three elements. Then you're executing this on a cluster or in your IDE, and you're retrieving the result back. You have an iterator locally, and you can just print it locally. It's not very useful, but this is a minimal example of the Java API. The important thing is that the DataStream API basically exposes all the building blocks that I mentioned on my previous slide. So you can have very arbitrary operator topologies, and you can use built-in functions like map, process, and connect.
Each of them takes different functions, and you can really define your business logic in those. You can also use completely arbitrary Java records, or for example Python records, that flow between the operators. And conceptually, which is interesting when we talk about change data capture, the DataStream API does not know about changes. It only knows about records, so there is no change flag or anything like that. Conceptually, the DataStream API works on an append-only, or insert-only, log. And you can also see this when you look at the output: one, two, three, four, five, six. So let's take a look at the Table API and SQL API. Usually you just say Table and SQL API, because it's a unified API. You can decide whether you want to specify your pipeline programmatically, or whether you want to use standard SQL for defining your topology. In the end, you also execute, and you can also print locally in your IDE. This API abstracts all the building blocks. So you have no access to timers or state or anything like this. This is all under the hood. Also, the operator topology is determined by the planner, not by you. The nice thing here is that you can focus on your business logic. You do this declaratively, and the optimizer makes something efficient out of it. Internally, it uses highly efficient records; that is also up to the engine, not to you. What you will see, maybe, is a Row type. If you really want to go out of the Table API, then you see a Row type, which you can work with in your program. And the interesting thing here is that conceptually we are working with tables here, tables and views, as you know them from databases. But under the hood, there is actually a changelog. And that's what I want to show in the following slides. You can also see that, for example, if you're printing here, you will get this output when you run it in the IDE.
And you already see that there is, of course, an f0 column with the 1, 2, 3 output. But there is an additional first column, which already shows that there is some change flag attached to every record. In this case, it's just insert. The nice thing about Flink's APIs is that you can mix and match them. So you can, for example, start with the DataStream API and then go to the Table API, or the other way around. If you have SQL, you can do the ETL in SQL first. And then if you have some more complex logic, like timer services, or very complex state, or whatever, you can go to the DataStream API, do it there, and then switch back to SQL. Or you can just use the SQL connectors but define the entire pipeline in the DataStream API. So that is up to you. The APIs for going back and forth between those two are there. So now let's really talk about changelog stream processing. If you think about data processing, in most cases data processing is always consuming a stream of changes. Because if this were not a continuous input stream, then your company, your project, whatever it is, would actually be dead, right? So it is actually very common that data flows in continuously. And the Flink APIs, the Flink runtime, see everything basically as a stream. They just distinguish between a bounded stream and an unbounded stream. Bounded means you define a start and an end, and the end is known. Unbounded means you start somewhere, which can be now or somewhere in the past, and then you keep processing into the future. So this is up to you. If you think a bit about this, batch processing is actually just a special case of stream processing. Batch processing means that, through the bounded nature of the stream, I can maybe use some more specialized operators, like sorting, for example. It's easier in such a setting.
And you can also use different algorithms if you have sorted data, like a sort-merge join or something like this. So the runtime has special operators and special handling for bounded streams. But in general, you can process everything as a stream, both bounded and unbounded data. So how do things actually look? How can I work with streams in Flink? The first answer to this, as I mentioned before, is that you actually don't work with streams. What you work with is dynamic tables. This is a concept we call dynamic tables. It's similar to materialized views and materialized view maintenance. What you do as a user is define your tables. On the left side, we have transactions; on the right side, we have maybe revenue. And in the middle, you define a standing, running SQL query, which gets translated into a pipeline topology. So then the question is: OK, is Flink SQL some kind of database? And the answer to that is no, it's not a database, because we are not in charge of the data. You can bring your own data and your own systems. So it's more like a processor that reads from all different kinds of systems. So if a table is not a stream, or if I don't work with streams, how do they actually relate to each other? An interesting term here is stream-table duality. You can basically see a stream as the changelog of a continuously changing table. So it is possible, and I will also show an example shortly, to convert from a table into a stream and from a stream into a table. A back-and-forth mapping is possible. Usually, as a user, you don't see that. Under the hood, the runtime, all the sources, all the sinks, all the operators work with changelogs. In Flink, we have four different kinds of change flags for each record.
So we have insertions; insertions are also the default input and output for bounded batch queries. Then we have update before, which basically retracts a previously emitted result. Then we have update after, to update something. And then we have delete, to delete the last result. When we see only inserts in a log, we call this an append-only or insert-only log. If it contains some kind of deletion or update before, we call this an updating table or an updating stream. And if it never contains an update before, but only update afters, then there's a primary key involved, and we call this an upsert table. So let me make a quick example. Again, we have transactions on the left side, revenue on the right side, and in the middle we are summing and grouping by name. What happens now is that in the logical table there is a new record coming in for Alice. This is how it will be represented in the changelog under the hood. And then this is what comes out. We are summing here, so 256 is the first result that we are emitting. Now the next record comes in. Again, it will also be added to the result table. But now another Alice record comes in, and we group by name. That means the sum is now outdated, and we need to update the sum to the newest number. That means, first, we have to remove the old record. And if we want to materialize our changelog into a table, we also have to remove the row in the table. And then we can finally add the updated row to the table. And this is what the changelog looks like. If you applied this changelog to a SQL database, to some key-value store, or to Elasticsearch or so, then the result would be there. And if we defined a primary key on the sink table, we actually wouldn't need this update before, because then it would be an upsert operation. And we can basically save 50% of the traffic if we do not need the update-before rows in the sink table.
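The Alice example can be replayed in a few lines of plain Python. This is only a conceptual sketch, not Flink code: the function names `sum_by_name` and `materialize` are made up for illustration, the flags `+I`, `-U`, `+U`, `-D` stand for insert, update before, update after and delete, and apart from the 256 mentioned in the talk the amounts are invented.

```python
def sum_by_name(rows, upsert=False):
    """Turn an insert-only stream of (name, amount) into a changelog of sums.

    In retract mode every update is emitted as a -U / +U pair.
    In upsert mode (primary key on name) the -U is skipped.
    """
    totals = {}
    changelog = []
    for name, amount in rows:
        if name not in totals:
            totals[name] = amount
            changelog.append(("+I", name, amount))
        else:
            if not upsert:
                # Retract the previously emitted sum first.
                changelog.append(("-U", name, totals[name]))
            totals[name] += amount
            changelog.append(("+U", name, totals[name]))
    return changelog


def materialize(changelog):
    """Apply a changelog to an initially empty table keyed by name."""
    table = {}
    for flag, name, value in changelog:
        if flag in ("+I", "+U"):
            table[name] = value
        elif flag in ("-U", "-D"):
            table.pop(name, None)
    return table


# Hypothetical input; only the first Alice amount comes from the talk.
transactions = [("Alice", 256), ("Bob", 10), ("Alice", 44)]
retract_log = sum_by_name(transactions)
upsert_log = sum_by_name(transactions, upsert=True)
```

Both changelogs materialize to the same table, but the upsert changelog is shorter because it never carries the update-before row. That missing row is the traffic saving mentioned above.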
So I already mentioned that each sink and each source declares a changelog mode: which changes they produce and which they can consume. And I give a quick example of various connectors. When we, for example, read from a file system, this is usually a scan operation. So it is very common that when you read from a file, this is just insert-only. There are no updates coming through the file system. Sometimes there are, but in the general case, you just scan through the file. Okay, so Kafka: in the early days, Kafka was actually just used as a log, and every record that came in through Kafka was also considered an insert-only record. Later, Kafka also added some upsert functionality. So we also have a connector called upsert-kafka for that. That means when a value in Kafka is null, it means deletion. So the upsert-kafka connector, for example, would produce insertions and deletions. If you define the JDBC connector, JDBC also doesn't have this concept of updates. In that case, we would scan the entire table and only produce insertions. So we have only insertions for JDBC. But then comes the most complex case. What happens if you use, for example, Debezium? You connect it to the database to consume the changelog from the database. You put this into Kafka and then you consume from Kafka. In this case, for example, this could produce all kinds of changes, which can then be evaluated by the engine. The optimizer basically tracks the changes through the entire topology, and the sink declares what it can digest. The optimizer could react sometimes with an error message, but sometimes there's more to it. So let's quickly also talk about these two different modes. I already said that sometimes you can do upserts where there's no update before. And sometimes you need all four kinds of changes. This is called retract versus upsert.
So retract has this nice property that there is no primary key required. This works for almost all external systems, which is great. You can also support duplicate rows, which you cannot support in an upsert table. And interestingly, retraction, this retracting of a previously emitted record, is actually often required in distributed systems. I also have a little example on the right side, which I will show shortly. Let me explain this first. So this is a count of a count. We are creating a histogram. The query itself is not so important. What is important is what is actually flowing in the cluster through the operators. When a record comes in, the first operator will identify: okay, this is the first time, so the count is one. And then, since we want to do the count of a count, the next operator will also count this as a one, and it will keep some state: how many records have I seen for this particular count? Now the second record comes in, and we have to update the count. So now the count is two, not one anymore. And interestingly, since we do a hash partitioning between the operators, it might be that the new count ends up at a completely different operator instance. But what happens with the old count? Now you have two threads, two parallel instances of the operator, that hold a count, and the count in the other operator instance needs to be removed. This is why retraction is required in this case, because the update before needs to go to subtask one and remove the outdated record. But in general, upsert is an optimization. It reduces traffic, it reduces computation. If it's possible, it's great, but usually there are a lot of retractions flowing in the... And I also have some examples here. If you do an EXPLAIN on some SQL query in Flink SQL, the bottom part is what you would see. So let's assume we have a table of transactions, a table of payments, and a table of results.
The result table can consume all kinds of changes. And you join transactions and payments. In the EXPLAIN, you can also get information about the changelog mode. For example, if the inputs here are insert-only and insert-only, then the join will also produce an insert-only result. If we do an outer join, in this case a left outer join, then things become a bit more complex. Here you have insert-only and insert-only, but the outer join will emit a null first: if one record comes in and there's no matching record so far, you have to emit a null for the other side first, and then when the other side comes in, you have to remove that null again and emit the final result. That's why, for example, here, you have all kinds of changes coming out of the join. And then we can make it even more complicated. What happens if we define a primary key on transactions and payments? Then the optimizer will recognize: okay, the left input spec and the right input spec now contain a unique key. That is great. So I can remove the update before; you can see here that it is not necessary anymore, because we can do upserts on the result. So this query is obviously more efficient than the other one. And the optimizer can also change between those different modes. I don't want to go into details here, but if it's necessary, you can go from upserting to retracting. That also happens under the hood. And depending on the operators, you can also switch between these modes. So for example, if you have a regular join between two append-only tables, then the resulting table will also be append-only. And I showed already that if one of the tables is updating, the result will be updating. And if there's an outer join, then the result is always updating. And now comes the interesting part.
If you have an append-only table and you join it with an updating table, there is a special kind of join, which we call a temporal join. A temporal join will actually produce an append-only table, because it looks at the table at a specific point in time. That's a very interesting operator. Unfortunately, we don't have enough time, but I just want to show you an example of this very, very useful join operator. Let's assume we have an orders table, and orders have a currency, and there is a currency rates table. Obviously, you don't want to join those two tables with the latest currency rates; you actually want to know what the currency rate was at the time when the order was created. And this syntax here, with FOR SYSTEM_TIME AS OF, actually allows you to consume all the changes from the rates table and join them with the orders at the point in time of the order. This is just one example of a very sophisticated join operation. So I have also prepared a demo. I think we still have seven minutes left, so it should be good. I also want to show you some of the CDC capabilities. I will run everything in my IDE. I will use Java for this example. I have a MySQL container, so I'm starting the MySQL container now. This container will create a MySQL instance, and it will already be filled with, I think, three or four rows. I have a few examples here. I can simply run the examples from the main method in the IDE. What I'm doing here is creating different tables that connect to MySQL. One is a JDBC one, which fully scans the table once, and the other one is a CDC one, which continuously monitors the table. So let's just run this. Here we see the first three results. As you can see, the application has not stopped.
So it is waiting for more records to come, and now I want to insert more values into MySQL. I could have used the MySQL CLI for that, but I can also use Flink SQL for it. So I have a regular INSERT INTO the transactions JDBC table, VALUES, blah, blah, blah, and I can run this main method here. It's a bit overkill to use Flink for just one value, but I think it's flexible enough that you can also use it for a batch query that just inserts one record into the database. And as you can see, it went from Flink SQL into MySQL, and from MySQL via CDC back to Flink, and then to the output. We also have more sophisticated examples here, but I don't think we have more time for that. We can do a lot of things with Flink SQL; I could spend a day talking about the way this works. So, to wrap up: Flink SQL is very powerful. It has been crafted over years and years by many, many companies and many teams. It's very flexible for integrating various systems with different semantics, and there is way more. I just showed some operators, but we have a large coverage of the SQL standard: we support OVER windows for aggregating, we support MATCH_RECOGNIZE for pattern matching and complex event processing, and we have time windows and session windows for cutting your stream into pieces. Then there is a huge CDC connector ecosystem, not part of Apache Flink itself, but also quite useful, with a lot of GitHub stars already. Then something new, Table Store, which tries to be like the first streaming data warehouse kind of thing. It's a very early version, but it's very promising. So I would recommend maybe looking into one of these sub-projects as well. Not only Flink is big, but also the ecosystem around Flink grows and grows and grows. I'm happy to take questions.
I think we have three minutes left, but otherwise I will also be outside for any questions. Thank you very much. Thank you. Yeah, so like... Can you please repeat the question? The question is how Flink handles transactions that take a lot of time before the transaction has ended. In general, I think we are not very good at transaction handling in Flink, but with the DataStream API you have a lot of possibilities. For example, you can buffer all the data from this transaction in state, for whatever time you want, and then just wait until the transaction closes, and then you emit the result of the transaction. That is possible. So personally, I would maybe do some stuff in the DataStream API first until the transaction has ended, and then push that. Thank you. Yeah, thank you both. So, in terms of running times, do I just think from bottom all, if it's an hybrid, I shift my interpretation? No, I think it's creating its own, its own runtime. So, you will be able to do mutual apps, some types of software, some types of software, some types of software, and then just look at the, like, I don't know if there's a lot of data in the stream. No, I don't. Is that all there? Okay. Let me check. There's a library that's called the HGIS, which is not for you to be used, but we can also buy it, and that's the idea for a line. But in general, yeah, the platform, it's rather a platform, that's where it looks. Now, the next logical question, how far does it scale? I would say, as the question was, how far does it scale, I can tell you that probably from the system, and like, there's single state, there's a billion, all of us, and all of us in the day or so. Yeah, and there is Apple, that processes things, like all the big banks use it for credit card fraud detection and stuff like that.
So I don't think that most companies in this room will reach the scalability limits of Flink, unless there are some Apple or Alibaba people in this room; then maybe, I don't know. You said that Flink does not own data, but then there is this Table Store project, so is it a move towards, you know, more ownership? Yeah, this Table Store is a very, very interesting approach. It started last year, or two years ago; it's rather new. I think it was last year, early last year. It doesn't really fit into Apache Flink, but it's still very useful. We will see, maybe it will leave, not the Apache Software Foundation, but the Flink project itself soon, because it doesn't really fit well. But in general, we still have this vision of Flink as a database. Yeah, we will see. Thank you. Sir, a question. This is about big state, because you mentioned that you can have terabytes of state. When you create a checkpoint, and this checkpoint is very big, storing it can take long. Is it a huge cost to write it into the store? Yeah, so the question was how we can actually snapshot large state in general. And this is exactly where Flink differs from its competitors, because there is a lot of engineering involved to make this as efficient as possible. I'm sure there's even more to do; it's still not perfectly efficient, and there are more optimizations that you can do. But, for example, there are differential snapshots involved, there is local recovery involved, and there are many, many algorithms under the hood to make it as quick as possible.
But yeah, of course, if you have terabytes of state and the machine has died completely, then you obviously need to restore these terabytes of state from S3 into your task manager again, and this can take time. So it tries its best, but of course you need to do benchmarks for your use case. One last question. Yeah, so we guarantee exactly-once end to end if the connectors, source and sink, support that. Especially for state, there are no duplicates in state. We might need to reprocess data during a failure, but end-to-end exactly-once semantics are possible. Okay, then, thank you very much, and I'm waiting outside. |
An introduction to Apache Beam for streaming analytics
Get to know how to leverage Apache Beam for your streaming analytics pipelines |
Okay, you're good to go. Thank you. Okay. Thanks. I think the technical issues are now solved. Thanks again, everyone, for being here. So I'm going to talk today about Apache Beam. Apache Beam is a framework for data processing that runs on top of several platforms, and it's especially meant for doing streaming analytics. So Javier already introduced me. My name is Israel. I work as a cloud data engineer at Google Cloud, helping customers do data engineering on top of Google Cloud. A lot of the work that I do is actually helping customers with Apache Beam, and particularly Dataflow, which is our runner for Apache Beam. So Apache Beam, what is it? Apache Beam is a framework for data processing. It allows you to run data pipelines, like Flink, like Spark, like so many other big data systems. It has two main features. The first one is that it's a unified computing model for batch and streaming. Any pipeline that you have written in Apache Beam for a batch use case, you may easily use in streaming as well. So you reuse all that code, and you have to add some small additions that we are going to talk about in a bit. And the other main feature is that it runs everywhere, for some definition of everywhere, and you can write your pipeline in many different languages. So you can write your pipeline in Java, in Python, in Go. You may also write your pipeline in any of the programming languages of the Java Virtual Machine, for instance. I have highlighted Scala here, because there's a framework called Scio, developed by Spotify, on top of Apache Beam for, let's say, Scala-native development of pipelines. So you don't have to write Java-looking code in Scala; it's functional code. There are lots of people also using, for instance, Kotlin on top of the Java Virtual Machine with the Java Beam SDK. So that's about the programming languages that you may use. And you may run a Beam pipeline on top of different runners.
So there's a direct runner for running and testing pipelines locally. It's not meant to be used with real-world use cases. But then you can run your pipeline on top of Dataflow, on top of Flink, on top of Hazelcast, Spark, many different runners. So basically, when you write a pipeline in Apache Beam, you are not tied to the platform where you're running it. You may move it to different platforms with minor changes. So Apache Beam is a theoretical model for computing. Not all the runners implement the model to the same extent. As of now, I would say Dataflow and Flink are probably the ones that cover it fully; other runners may have some gaps. As an example, Hadoop. You may run an Apache Beam pipeline on top of Hadoop also, but you cannot run streaming on top of Hadoop because it doesn't support streaming. So what you're able to do with Beam depends on the capabilities of the runner. There's no magic. Beam is batch and streaming, but if your platform doesn't support streaming, for instance, you cannot do streaming. So let's talk about streaming. What's the problem with streaming? It's extremely interesting. In streaming, you are getting data, a lot of data, continuously. There's no beginning, and there's no end. So you cannot know in advance where the boundaries of your data are. It's coming continuously from many different places. Think, I don't know, that you are designing a mobile application for a game or whatever, and people are using it and sending events to your systems every once in a while. Because the application is deployed in the wild, data will come who knows how. It will come out of order. Some users will be in the underground without phone coverage. The app will still attempt to send events, and then it will send the events late.
Like, for instance, here, let's see if I can put the pointer. So this is data that is supposed to be produced around 8 in the morning. Maybe there are network latencies and so on, but more or less you get it around 8 in the morning in your system. So this is the time when you are seeing the data in your system. But for whatever reason, you may also get very late data. And depending on what you want to do, you may want to process this data as it was produced, not as you are receiving it. So this is actually the problem with micro-batching. I remember when I started hearing about streaming in the first days, many years ago, a lot of people said, oh, Spark, I don't like it because it does micro-batching. It doesn't do real streaming. I had no clue what they meant. It's like, well, you have to group things to process them somehow. If the data is infinite, you will need to group it somehow to process it. The problem with micro-batching, which is not happening in Spark anymore, so this was really ancient times, is that you are making the batches as you see the data. And then you may have data that belongs together in separate buckets. Like, for instance, this was data that was produced at 8 in the morning. If you are doing buckets of one hour, well, you may capture this message here, but then if you have late data, you will capture it in a bucket where that element doesn't belong with the rest of the elements, if you want to process them together. So you need to solve this problem of lack of order in streaming. And this is what you can easily solve with Apache Beam. Let's talk about the watermark. There are many ways of doing stream processing. One of the most popular is using the concept of a watermark. There is not one dimension of time. There are at least two dimensions of time. There is the event time, the moment in which the data was produced, and there is the processing time, the moment in which you see the data.
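To make the bucket problem concrete, here is a small plain-Python sketch. It is not Beam code; the `bucket` helper, the field names and the timestamps are all made up for illustration. It groups the same three events into one-hour buckets, once by processing time (the micro-batching behaviour described above) and once by event time.

```python
from collections import defaultdict

HOUR = 3600  # seconds


def bucket(events, key):
    """Group event values into one-hour buckets using the chosen timestamp."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event[key] // HOUR].append(event["value"])
    return dict(buckets)


# Hypothetical events; times are seconds since midnight.
events = [
    {"value": "a", "event_time": 8 * HOUR + 60, "processing_time": 8 * HOUR + 65},
    {"value": "b", "event_time": 8 * HOUR + 120, "processing_time": 8 * HOUR + 130},
    # Produced shortly after 8:00 but only received after 11:00.
    {"value": "c", "event_time": 8 * HOUR + 180, "processing_time": 11 * HOUR + 5},
]

by_processing = bucket(events, "processing_time")
by_event = bucket(events, "event_time")
```

Bucketing by processing time separates the late event from the two events it belongs with; bucketing by event time keeps all three in the 8 o'clock bucket.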
They will never be the same. They can be really close sometimes, but you cannot guarantee how close or how far you are going to be from that moment. So we put time in two dimensions. The ideal is, for instance, this straight line in blue. For sure, this zone is impossible. You cannot see data before it's produced, or not yet at least, according to the laws of physics. But most likely what will happen is that you will have some delay. Sometimes it will be closer to the ideal, sometimes it will be farther from the ideal. And you need to take this into account. So let's see an example that might be a little bit more telling. So Star Wars. Have you ever watched a Star Wars movie? The Star Wars movies were released out of order. The first movie was Episode 4. This is purely streaming. This is what happens in streaming. You are expecting events at the beginning of the session, in the middle of the session, at the end of the session, and then you get end of the session, middle of the session, beginning of the session. And you need to reorder things. Depending on what you want to do, you may need to reorder. If you don't care, look, I just want to count how many movies per year were released, well, you don't need the event time. But if you want to reconstruct the story, who did what before or after, what happened, you need to actually be able to reconstruct that time. So the time when the movies were released is processing time; event time is the time in which the events are actually happening. And this is the kind of problem that we can solve using Apache Beam or Flink or many other streaming systems. Let's see how we can deal with this. The classical way is using windowing. Windowing is grouping things based on temporal properties. When we do a data pipeline, we need to answer one question, which is what we are going to compute.
But if you want to do this in streaming and group things based on temporal properties, you need to answer three additional questions: where, when, and how. Let's see some details. What? This is easy. We are going to be aggregating. So this is Java code; it's in Java here as an example. It's the Apache Beam API. I'm not going into details. There's a link at the end with more details, so don't mind the details right now. So we are aggregating things together. This is what we are doing. We are not doing any kind of temporal-based logic yet. We are just aggregating stuff together. We are summing up all the numbers here. So this is the operation that we are doing. Same as in batch. This is in batch. So imagine that we are getting this batch. The problem is that when we are working in streaming, we don't see the full data at once. And we need to produce output at some point, so we cannot wait forever. So we need to decide how to group things together. For instance, here, we are going to group things in windows of two minutes. But the windows of two minutes are not in processing time. They are in event time. For instance, this message here, we see it around 12:00, and we put it in the window of 12:00. But this message over here was received between 12:08 and 12:09. And we are still able to assign it to the window between 12:00 and 12:02 in event time. Because we can wait for late data and put it in the right window, despite the message being quite late in processing time. And the same with the rest of the windows. Now the question is: okay, good, so you are waiting until 12:08. What if your message shows up at 8 p.m.? What do you do? Like eight hours after. So we need to make another decision, okay? We have already made the decision on how we are going to group things together. Here it's with fixed windows. There are more window types in Apache Beam; I'm not going into details right now.
But now we need to decide how long we wait, okay? We are going to wait until the watermark. The watermark is this relationship between processing time and event time that, in the case of Beam and depending on the runner, is calculated and estimated on the fly as data goes through our pipeline. So this curve is estimated. And when you pass the watermark, you have a certain degree of guarantee that your data is complete, okay? A certain degree of guarantee. It cannot be fully guaranteed because, well, the future cannot be known. We cannot travel in time, okay? So here, for instance, we are processing data up to the watermark, and the nine, this number here that we were processing before, is now left out of the window. So what does that mean for the data we are processing? We were summing up numbers; that number, that nine, we are not counting it. As soon as we see it in our pipeline, it will be dropped, lost, okay? The pipeline will ignore it. And it may make sense, okay? You cannot wait forever. At some point, you will have to stop and move on. But maybe you want to take it into account. Maybe, I don't know, this is a billing, invoicing thing and every penny counts, okay? Then you need to process it. Well, you have to make yet another decision: how long we are going to wait for late data, and how we are going to actually update the output, okay? Here, I'm summing numbers. It's easy: commutative, associative, really no big deal. I can treat it like a monoid in big data processing, so I can just take the previous aggregation and keep aggregating. I don't need to keep all the numbers that I have seen so far, so it is easy. In other cases, for any non-associative, non-commutative operation, you may actually need the full data to produce an update, okay? And if you are working in streaming, maybe you don't want to accumulate all the data, okay?
Because that will increase the amount of resources that you will need for your pipeline. It will have an impact on performance, latency, and so on. So here, we are accumulating because the operation allows it, and we are actually waiting for late data, okay? So now, we are waiting for late data, but we don't want to wait forever. We want to have some numbers, okay? So we are actually producing several outputs per window. Like, for instance, here, continuing with the first window: when the watermark is passed, we produce an output, okay? And then when we see the new number, we produce an output again. We produce it really late, okay? But, well, we cannot do magic. This is when we see the data, so we cannot process it earlier than this, okay? We may actually decide to produce some output even before the watermark, because the watermark can be really slow. It depends on the pace of the updates of the data. If, for whatever reason, users are sending you data with a lot of lateness, the watermark can progress really slowly, okay? And so how you produce output is always a trade-off in streaming between completeness and latency. You need to make a decision, okay? So here, we put an early trigger. We're producing output soon, with low latency, but it's incomplete, because, well, later on we're going to keep seeing numbers until the watermark. Good. So basically, this is streaming in Apache Beam in 10 minutes. This is a lot of information, explained very quickly. If you want to go deeper, there's this example here, okay? It's available in two languages, Java and Python, and you can see everything that we have seen in the previous slides in full detail, okay? And you may run this locally if you want, so you don't need an environment, like a cloud environment, a cluster, or a stream processor or anything like that. It may run locally with some synthetic, made-up data, okay?
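The what, where, when and how decisions can also be played through without any runner at all. The following is a plain-Python simulation, not the Apache Beam API and not the example linked in the talk. The function name `windowed_sums`, the explicit watermark markers in the input, and all numbers except the late nine are assumptions made for illustration. It sums values in fixed two-minute event-time windows, emits an on-time output when the watermark passes the end of a window, and then either drops late data or, if lateness is allowed, emits an updated accumulated sum. Early triggers are left out to keep it short.

```python
def windowed_sums(stream, size, allowed_lateness=0):
    """Simulate fixed event-time windows with a watermark.

    stream items are ("data", event_time, value) or ("watermark", time),
    listed in the order in which the pipeline sees them.
    Returns (panes, dropped): panes are (window_start, sum, timing).
    """
    sums = {}      # window start -> running sum (accumulating mode)
    fired = set()  # windows that already produced their on-time output
    watermark = float("-inf")
    panes, dropped = [], []
    for item in stream:
        if item[0] == "watermark":
            watermark = max(watermark, item[1])
            for start in sorted(sums):
                if start not in fired and start + size <= watermark:
                    fired.add(start)
                    panes.append((start, sums[start], "on_time"))
            continue
        _, event_time, value = item
        start = event_time - event_time % size
        if start + size + allowed_lateness <= watermark:
            dropped.append(value)  # too late: the window is closed for good
            continue
        sums[start] = sums.get(start, 0) + value
        if start + size <= watermark:
            # Late but still allowed: emit the updated, accumulated sum.
            fired.add(start)
            panes.append((start, sums[start], "late"))
    return panes, dropped


# Hypothetical input; event times are seconds after 12:00.
stream = [
    ("data", 30, 5),
    ("data", 90, 7),
    ("data", 150, 3),
    ("watermark", 120),  # first window [0, 120) is considered complete
    ("data", 60, 9),     # the late nine, belongs to the first window
    ("watermark", 240),
]

strict_panes, strict_dropped = windowed_sums(stream, size=120)
lenient_panes, lenient_dropped = windowed_sums(stream, size=120, allowed_lateness=600)
```

With no allowed lateness the nine is dropped and the first window stays at 12. With ten minutes of allowed lateness the first window produces two outputs, 12 on time and then 21, which is the completeness versus latency trade-off in miniature.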
Now, this is the classic way of doing streaming in Apache Beam. This has been around for years already, okay? This is the same model that is implemented in Spark; it's the same model that is implemented in Flink, so they are all kind of similar. There are other things that you can also do in Apache Beam in streaming. Anything that you can do in Apache Beam, you can also do in streaming, and I'm going to highlight a couple of those here, okay? I'm leaving out a lot of stuff because, well, time is limited. I'm leaving out, for instance, SQL. There was a great talk by Timo focusing on SQL; you can also do SQL in Apache Beam in streaming if you want, okay? So similar examples to what Timo did, and you can actually run that on Flink if you want, okay? It may make sense if, I don't know, at some point you want to move away from Flink to Dataflow, or you want to move away from Dataflow to Spark, in order to have this portability. One thing that you can do in streaming is stateful functions, and stateful functions are very interesting for windowing, between quotes, that doesn't depend on time. Very typically, when I work with customers, they say: all these windowing and trigger things are super interesting, but look, whenever I see a message of this type, I want to have all the messages that I have seen so far in a group and do these calculations, and I don't care about time, okay? I don't care about grouping things in time. I want to group things by some logic. I'm going to give you a predicate; you pass a message, and if the message fulfills the condition, I want to close the previous window and start a new one. How can you do that in Apache Beam? You can do that with stateful functions, okay? Stateful functions: here we have some input, here we have a map, which is called a ParDo in Apache Beam, and we do some transformation, and we want to accumulate some state here, okay?
So depending on what we see, at some point we do something else, and this is mutable state. In a system like Dataflow, like Flink, like all the systems where Apache Beam runs, having mutable state in streaming that is computed in a consistent way is extremely difficult. One way to shoot yourself in the foot with systems like this in streaming is trying to keep accumulating state using some kind of external system. Because runners will sometimes have issues: there will be errors, there will be retries, infrastructure will die, you will have auto-scaling. There are all kinds of situations in which the runner may want to retract the computation and compute it again. And in these kinds of situations, having any kind of external system for mutable state is complex. It's doable. With Apache Beam, on any of the runners that you can use, you will have end-to-end exactly-once processing, but end-to-end exactly-once processing doesn't mean that your code is going to be executed exactly once. It may be executed more than once. This is what makes maintaining state external to a pipeline complex. But if the state is internal to the pipeline, then the system itself can take care of the problems of reprocessing and maintain mutable state in a consistent way. So this is what a stateful function is in Apache Beam, and you can use it for use cases like this one. For instance, say that I want to produce "windows", in quotes, based on some kind of property. So I keep seeing messages that I keep processing, and I keep accumulating the messages in some state. I maintain a buffer, where I keep every single message that I see, and I keep a count, because the buffer must have some bounds, okay?
So because this is local state that is maintained on the machine, in the worker, in the executor where you are running, and the executor will have limited resources. It might have very large resources, but they are limited anyway. So you keep accumulating, and then you keep processing here. Typically you can use this for batching calls to an external service, for instance, but you can also say: whenever I see a specific type of message, I emit some output, and then I have applied a window. All the messages that I have in the buffer, I tag with a new session ID, a new window ID, and then I emit them. I hold them for a while until I see the message that I need, and then I emit them. There are two problems here. Customers always think that streaming is complex, and they want to get away from all the time-based calculations. It's so complex, so messy; look, my algorithm is really much simpler. But it is not. You are in streaming, so you cannot ignore time. You will have situations where you see the messages out of order, and you will have situations where you don't see any messages for a while. And you need to decide what to do in these two cases, even if you don't want to. What happens when I see messages out of order? You may say, I don't care. That's unlikely, but in some situations it might be true. Or you may have to wait, with some timer, in order to give room for late data to arrive in your code and then produce the actual output. That would be an event-time timer. In event time you are going to see the messages in order, so you wait 30 seconds, two minutes, and so on, and when you have seen all the messages, that is the moment at which you apply the session. That's called an event-time timer. And you may also have problems of staleness. I'm waiting for the end of my session, but I've not seen messages.
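The buffer-and-emit logic described above can be sketched in plain Python. This is only an illustration of the idea, not Apache Beam's state API: the function name `sessions_by_predicate`, the `max_buffer` bound and the choice to include the closing message in the session it closes are all assumptions. In a real stateful function the buffer would be kept per key by the runner, and timers would handle out-of-order and stale data.

```python
from typing import Callable, Iterable, List, Tuple


def sessions_by_predicate(
    messages: Iterable[str],
    closes_session: Callable[[str], bool],
    max_buffer: int = 1000,
) -> Tuple[List[Tuple[int, str]], List[str]]:
    """Group messages into "windows" that are closed by a predicate, not by time.

    Returns (emitted, leftover): emitted is a list of (session_id, message)
    pairs, leftover is whatever is still buffered when the input ends.
    """
    emitted: List[Tuple[int, str]] = []
    buffer: List[str] = []
    session_id = 0
    for message in messages:
        buffer.append(message)
        # Flush when the predicate fires, or when the buffer hits its bound:
        # worker memory is limited, so the buffer cannot grow forever.
        if closes_session(message) or len(buffer) >= max_buffer:
            emitted.extend((session_id, m) for m in buffer)
            buffer.clear()
            session_id += 1
    return emitted, buffer


emitted, leftover = sessions_by_predicate(
    ["a", "b", "END", "c", "END", "d"],
    closes_session=lambda m: m == "END",
)
print(emitted)   # session-tagged messages
print(leftover)  # still waiting for a closing message
```

The leftover message is exactly the staleness case from the talk: nothing in this sketch ever flushes it, which is the gap that a timer has to cover.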
I haven't seen messages in the last five minutes, in processing time. The problem with event time is that it depends on the progress of the watermark. If you stop seeing messages, the watermark will stop advancing. In Beam runners, the watermark is always, or normally, estimated as the timestamp of the oldest message waiting to be processed. So you may literally stall your pipeline, waiting forever for some data that may never arrive. A processing-time timer solves this problem. After 10 minutes, measured with a clock, if nothing comes, I don't care about the watermark. Keep going. Data has been lost for whatever reason, and we cannot wait forever. So this is a stateful function, and it's very useful in streaming because it allows you to apply logic that goes beyond the temporal properties that we have seen in the previous slides. Here you have some examples and links. The slides are already available on the FOSDEM website, so I encourage you to have a look at these examples. What else can I do in streaming? Machine learning inference. There are many ways to do machine learning inference in streaming at scale, many of them quite expensive. You can deploy endpoints in cloud platforms, with GPUs, with a lot of stuff. Those solve a lot of functionality for you, but they are expensive. So what if you want to apply machine learning inference in a pipeline, in Apache Beam? You could be thinking: well, I can do that myself. I can import TensorFlow, load the model, apply it. You could do a lot of that yourself. But this is already solved for you in Apache Beam. You can run machine learning inference with the so-called RunInference transform. We see it here.
So right now it has out-of-the-box support for PyTorch, TensorFlow and scikit-learn, with more coming. When you're running a distributed system and you want to apply a model, each one of the workers in the distributed system has its own memory. Apache Beam runs on top of shared-nothing architectures, like Flink, Dataflow, Spark. Workers are independent of each other. They don't share any common state. The state that we have seen before is actually maintained per key, and per window if we apply windowing. It's totally local to the worker, and two workers cannot share state. But with the model, if we have 100 workers we don't want to instantiate the model 100 times, because the model is going to be the same for every worker, right? The model doesn't change. The model is actually read-only. RunInference solves this problem by having some state that is shared across all the workers, and it's transparent for you. This is something that you can always implement yourself in a distributed system, but it's complex. This is the problem that is solved with RunInference. There is only one copy of the model per machine where you're running, in memory, because you always need to make an instance in memory to be able to apply it. But if you have, I don't know, 100 threads, 100 sub-workers, 100 CPUs inside the same machine, you will not have 100 copies of the model, regardless of the computation model, of how the runner is implemented on top of the machines. There will be only one copy in the memory of the machine. So if you want to apply streaming inference, this is a very convenient way of doing it with very little code. And depending on the runner, this is possible in Dataflow, and it is also possible if you are running on top of Kubernetes with a runner that supports Kubernetes, like, for instance, Flink.
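The "one read-only copy of the model per machine" idea described above can be sketched in plain Python with threads. This is not how RunInference is implemented; the `SharedModel` class, the fake model and the thread count are assumptions made up to illustrate the sharing pattern within a single process.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedModel:
    """Loads a read-only model once and hands the same instance to every caller."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._model = None
        self._loaded = False
        self.load_count = 0  # how many times the expensive load actually ran

    def get(self):
        # The lock guarantees that concurrent first calls trigger a single load.
        with self._lock:
            if not self._loaded:
                self._model = self._loader()
                self._loaded = True
                self.load_count += 1
            return self._model


def load_fake_model():
    # Stand-in for an expensive model load (hypothetical): the "model" doubles its input.
    return lambda x: x * 2


shared = SharedModel(load_fake_model)

# 100 sub-workers on the same machine all apply the model to one element each.
with ThreadPoolExecutor(max_workers=100) as pool:
    predictions = list(pool.map(lambda x: shared.get()(x), range(100)))

print(shared.load_count)
print(predictions[:5])
```

Because the model is read-only, sharing one instance is safe; the only part that needs coordination is the first load.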
You can also hint the resources that your transformation is going to need. A hint could be: look, this transformation is going to need this amount of memory, at minimum. But a hint could also be: look, this is a step that is running ML inference, so use a GPU for this step. And then the runner will take care of making sure that the virtual machine running that step matches the infrastructure hints that you provide in the code, and you will have different types of nodes for different types of transformations. One of the problems of shared-nothing architectures is that all the workers are alike. All the workers are the same. With this, you can have different types of workers for different kinds of transformations, which in terms of cost is better. I won't say optimal, but it's a better alternative. Basically, you use GPUs in the workers where you need GPUs, and you don't use them in the workers where you don't need them. And you don't have to worry about assigning work to different workers. That's done by the runner, automatically, with these hints. If you want to know more, here you have some links. I have only five minutes left, so I'm leaving the best for the end of the presentation. Great, look, Israel, you showed some Java at the beginning, now you tell me ML inference is so cool, but it's Python, right? It's PyTorch, TensorFlow, scikit-learn, it's all Python. One of the things that you can do in Apache Beam is cross-language transforms. Anything that you have available in any of the SDKs, in any language, you may use in any other language, as long as the runner supports it. It's not supported by all the runners, but it's supported by the main runners. So basically, RunInference may be used in Java.
If you want to use any transformation from any language, you have to add some boilerplate code, not much, but a little bit. This is already done for the main transforms, the ones that are most popular or, let's say, that make the most sense to be used from different languages. Using a map from Java in Python doesn't really make sense. Using RunInference does. Using input-output connectors from another SDK in Python, for instance, makes sense, because the set of input-output connectors that you have per SDK is not the same. You may be able to write to databases in Java and to message queues in Python, but maybe you don't have the same functionality in all the SDKs. You can use any connector from any SDK, and this makes it quite flexible. These are the so-called multi-language pipelines, and basically it means that you can run any transformation in any SDK. This is implemented thanks to the runner environment being containerized. There's a container per language, and there's some magic that makes the containers communicate with each other, and there is the serialization and deserialization between programming languages. That is part of the boilerplate that you need to take care of. If you use things like Apache Beam schemas, which I haven't talked about in this talk, it will be transparent for you: anything that you have in one schema in one language, you will be able to serialize and deserialize in any other language. So if you follow the Apache Beam conventions, it's quite straightforward to use this kind of thing. Well, thanks everyone for your attention so far. We are almost there. Here are some links that I recommend you have a look at if you want to learn more about Apache Beam. I have covered a lot of stuff in a very short time, so there are a lot of things behind everything that I have explained here.
If you want to know more about windowing in streaming, triggers, watermarks and so on, I strongly recommend this book. It was released some time ago. You may think that it's outdated, but it's not: this is the same model that is applied in many different streaming systems. And this is not a book about Apache Beam, it's a book about streaming systems, with lots of examples coming from Apache Beam, but also examples coming from Flink, Kafka, Pub/Sub, and many other systems. It's very interesting. It's one of my favorite books, the other one being the book from Martin Kleppmann about data-intensive applications. And if you want to know more about Beam, I recommend Beam College. There are lots of videos on YouTube with lots of details about the things that I have explained here. Some of them are linked in the slides. And of course, there is the main Apache Beam site, with the guide and all the documentation that is there. If you want to learn more about Apache Beam, there are also the videos of the previous editions of Beam Summit. And if you want to participate, since you are here today you may be interested in streaming, the call for papers is open until March 20th, I think. Beam Summit will be in June in New York, and I encourage you to submit talks. Well, this is all. Thanks, all, for your attention. It's time for questions now. Thank you. So what's the advantage of using Beam if you are already using Flink? It's portability, mainly. If tomorrow you want to move away from Flink for whatever reason, you should be able to move to other runners that have the same level of functionality, like, for instance, Dataflow. I don't know, we have one of the main committers of Apache Flink here; say that he gets hit by a bus. We don't want that to happen, but it may happen. Anything may happen. The world is really very uncertain. So basically you have portability. Yes.
Thank you very much. Unfortunately, we don't have time for more questions right now, but I'm sure we'll be happy to answer any questions. Yes, anytime. Yes, thanks. Thank you. |
Ingesting over a million rows per second on a single instance.
Time-series processing using QuestDB |
So the thing about QuestDB, apart from being open source: we want people to know us because we try to be very performant, but specifically on small machines. Performing very well with 120 CPUs and 200 gigs of RAM is okay. Performing very well with 4 CPUs and 16 gigs of RAM is more difficult. So we try to optimize for that. Actually, in the past we were optimizing for the larger-instance use case, and then we realized not everybody has a super large instance at home, so we try to be better at that. We also try to be very good at developer experience, so that you get performance out of the box. There are many things you can tweak in QuestDB, as in every other database, every other system: lots of configuration, the memory, the page size, the buffers and whatnot, which CPUs do what, blah, blah, blah. By default, if you don't touch anything, it will perform well. And then if you have the expertise, you might fine-tune. But we try hard to make the developer experience simple, and that's why we also chose SQL for querying data. Other time series databases make the trade-off: we want to perform, so we need to use a different language. Which is cool, I get it. We chose SQL because we want developers to have an easy time learning QuestDB. For ingesting data, you can use SQL, but we also offer a different protocol, which is faster. That's why we have client libraries, so you don't have to go low level to be performant. That's the idea. And we are open source, and very proud of being open source. But why are we building another database? There are a lot of databases. If you walk around FOSDEM, you're going to hear about every type of database out there. Just today, here, I saw MongoDB, I saw ClickHouse, there's something about Postgres, there's something about SQL, about MariaDB. Why do you need another database, another open source database?
Well, because different data looks different and can have different problems. And in our case, we specialize in time series. We don't do anything else. I mean, if you try to use QuestDB for full-text search analytics, we are truly the worst database ever for that. If you try to use QuestDB for geospatial queries, we support some geospatial things, kind of, a bit. We have a specific data type for geohashes. But we are not good for geospatial unless it is part of time series plus geo. That's kind of the idea. So we specialize only in time series analytics, in data which is changing over time, where you want to monitor and track those changes. We are not good for anything else. If you try to use QuestDB for everything, boy, what a disappointment we are going to be. But if you try it as a time series database, it will be one of the good ones. And that's why we are building QuestDB, because there is a lot of time series data out there. And how do you know if you have a time series problem? I have a lot of things here. I'm going to read just a couple of them. Basically, it's if most of the time you are reading data over a slice of time. Tell me what energy consumption I had over the last minute. Tell me how the nuclear reactor has been doing in the past 10 microseconds. Tell me what the conversion for this user was in the past week. For a moving vehicle, let me know which was the last position where I saw it and which were the sensor readings at this particular point in time. If you have that, time series can be interesting. So with time series, you have all those types of problems. Data tends to be inserted faster than it is read. Databases, historically, have been optimized for reads. They try every trick in the book to make reads super fast.
When you insert data, you need to define indexes, and they are going to index by many different things, and they keep caching a lot of things in memory, and blah, blah, blah. So reading is the key thing, because usually you read data much more than you write it. In time series databases we can also support heavy reads on top of that, but we need to support heavy inserts and keep up performance while doing that. We don't use indexes. The performance you're going to see today is with no indexes. We don't need them. We don't want them, because having an index slows down ingestion. It's a luxury we cannot afford. We have some kind of indexing, but we don't have indexes, not as you know them. That's kind of the idea here. So it's slightly different. You have data that you are writing very often, that data is going to grow, and it can grow fast. And you need to have some way of offloading or deleting that data. On a traditional database, you just don't say: I don't know, I'm Amazon and I'm getting users, oh, I already have a million users, when I get to a million and one I'm going to delete the old users. You don't do that. I mean, sometimes you do, but you don't really do that on your databases. Among time series databases, almost all of them have some mechanism to deal with historical data and do something with it. In our case, you can unmount partitions, you can move them to cheaper storage, those kinds of things. We have the commands for it, and it is designed for that kind of thing. That's kind of the idea. There are many other things that tell you that you have a time series workload, but that's kind of the idea. But better than me just telling you, I'm going to show you some queries on top of demo data sets. You're going to get a feeling for why a time series database might be interesting, and then we're going to go into the details about ingesting data and all those things. Does that sound good so far, yeah? Do you have any questions?
I'm happy to take them during the talk, by the way, not only at the end. So we have a live demo, demo.questdb.io, which is running on a large machine on AWS. We don't need all the power, but since it's open to the public... Anyway, we have a few different data sets. There is one: you are in a big data room, so you are surely familiar with the New York City taxi rides data set. The city of New York has a data set, which is very cool for machine learning and for big data, of taxi rides in the city of New York: when the ride started, when it finished, also the coordinates, and a few things like the tip and the amount of the fare, how many people, blah, blah, blah. So we took that open data set and we just put it here on QuestDB, a few years of the data set. So, you know, a lot of columns here. Let me just show you how big this is. Right now, is the size okay, or maybe not? Maybe I have to make this a bit bigger first, and then, okay. So it's 1.6 billion rows, which is not huge. I mean, if you have a relational database with 1.6 billion rows, relational databases today are great, but 1.6 billion rows is like, yeah, I couldn't work with that, I'm not super comfortable. For us, it's cute. It's a data set which is respectable but not really huge, 1.6 billion rows. And now, what if I want to do something like, for example, calculate the average of this number: I want to average the fare amount over 1.6 billion trips. How long would you expect your database to take to do a full scan over 1.6 billion rows and compute the average, with no indexes, no anything? How long would you say, more or less, ballpark, for 1.6 billion rows? No one? How big is it in gigabytes, megabytes? I don't know for the whole data set, but this column is a double. I mean, I really don't know. It's big, it's big.
When you download it as CSV, each CSV is about 600 megabytes, and you have several of those. It's, you know, largish. But anyway, well, actually it was slower than I thought. Usually it takes half a second; this time it took 0.6 seconds. I know it's slow, I know, but it's for a reason, sort of. I told you, we are a time series database, and we are super slow for other things. This is not a time series query. Did you see any timestamp here? I didn't see any. This is just a full scan: we parallelize, we read the data, and we are slow. We take over half a second to go over only 1.6 billion rows. Unforgivable, sort of. But bear with me here. That's the thing, I mean, I'm kind of half kidding, but not really. But wait until I put in a time dimension. Now I want only, for example, one year of data, and I'm also going to add another computation, because I know that just counting data is super fast. So I'm going to add another computation: I'm going to count the data, only for 2016. And this is better. This is already 100 milliseconds, because we are going over only a few rows. It's only 146 million rows, which is much more manageable. Only 140 million rows, that's better.
So we can actually go very fast on this, and then if you keep going down, say I want only one month of data, which is still, I don't know, yeah, 12 million rows, a month of data is 60 milliseconds. For one day of data, of course, it is faster, this is already 50 milliseconds. If I go to one specific hour, a minute, it should be, you know, not much faster, because... oh yeah, it's under one millisecond actually, thank you for that. But still, we have partitions, so basically one thing we do is we only go to the partition where the data is stored, so we only attack that part of the data. That's kind of the thing: when you have that time component, we are quite fast, or fairly fast. That's kind of the beauty of a time series database. And we can also do other interesting things. If I go to the same table and I show you what this looks like, you can see that for the same second I have many trips, because this is New York, baby, and New York, you know, is the city that never sleeps, you can get a cab on every corner, you get rich when you land in New York. I spent one year there, and it's not like that. Anyway, in every particular second, even at midnight, you always have at least a few trips. So we could do something like: give me the date-time and how many trips are ending where this date-time is in, for example, June 21st... city, what are you doing there, man?
I didn't even know I had city here. Okay, so, for example, for this particular minute on one particular day, I want to sample in one-second intervals and know how many trips I have for every particular second. That's another thing you can do in a time series database: rather than grouping by columns, which you can also do, you can group by time. We call this SAMPLE BY. We can sample by anything from microsecond to year, I guess, so you can group by microsecond, millisecond, second, day, year, whatever. In this case, it's saying: okay, in this particular second I have six trips, and then five trips, and blah, blah, blah. You get the idea, yeah? Something else I wanted to show you, which is another cool one: I have this data set with several trips every second, and I have another data set, also with data from Manhattan, which is the weather data set. So maybe it would be interesting to join those two data sets. It would be cool to know the weather that I had for a particular trip, because maybe that gives me some insight, I don't know. The challenge is that this data set is, of course, real life. It's a different open data set, and it's not at the same resolution. We don't have weather changes every second. In my hometown that sometimes happens, and when I was living in London that was crazy, but in real life we don't measure, we don't store weather changes every second. In this particular data set, we have about two or three records every hour. So now I want to join a data set with sub-second resolution to a data set with sub-hour resolution. If I wanted to do it in other databases, I could do it, but it would take me a while, then I would think I had it and I wouldn't, and then it would be like, yeah, does this make sense, or not really, and a week later I would be crying. I don't know. So, one cool thing we have here, we have a demo set, it's an example, I'm going to
move on to another thing really quickly, because otherwise, well, but this one I really like. We have a special type of join, which we call an ASOF join, which basically does this: I'm going to select the data from the table I already showed you, for one particular day, and then I'm going to do what we call an ASOF join. Basically, this table has a timestamp, which we call the designated timestamp. You designate which column it is if you have several. So we have the designated timestamp in one table and the designated timestamp in the other, and we join each row with the one that is closest in time. In this case, ASOF means the one which is exactly the same or immediately before me, the closest thing that happened before. We also have a variant for strictly before me, where it cannot be the same. But that's the idea. So for joining two different data sets, I can just do that. I'm also going to add the timestamp from the other table here, so it's clear. If I run this query, now I can see that for each record in the New York taxi rides, I'm always getting the same timestamp from the weather data set, because I have only one entry every 40 or 45 minutes. If I move to a different point in this day, instead of at 12, at 12:55 for example, I should already see the time matching a different entry in this table. But that's it: I have different resolutions, and I don't care. We join by time, because we're about time. That's what I'm trying to say. I have more interesting queries, but maybe for a different day. So that's the first thing.
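The matching rule of the ASOF join described above (same timestamp or the closest one before) can be sketched in plain Python with a binary search. This only illustrates the rule, not how QuestDB executes the join; the function name `asof_join`, the `strict` flag and the sample data are assumptions made up for the example.

```python
from bisect import bisect_left, bisect_right


def asof_join(left, right, strict=False):
    """Join each left row to the latest right row at or before its timestamp.

    left and right are lists of (timestamp, payload) sorted by timestamp.
    With strict=True the right row must be strictly earlier.
    Returns (timestamp, left_payload, right_row_or_None) tuples.
    """
    right_timestamps = [ts for ts, _ in right]
    find = bisect_left if strict else bisect_right
    joined = []
    for ts, payload in left:
        # Number of right rows that qualify; the last of them is the match.
        position = find(right_timestamps, ts)
        match = right[position - 1] if position > 0 else None
        joined.append((ts, payload, match))
    return joined


def at(hours, minutes, seconds=0):
    """Seconds since midnight, to keep the example timestamps readable."""
    return hours * 3600 + minutes * 60 + seconds


# Weather changes every 45 minutes or so; trips happen every second.
weather = [(at(11, 20), "rain"), (at(12, 5), "cloudy"), (at(12, 50), "clear")]
trips = [(at(12, 0, 1), "trip A"), (at(12, 0, 2), "trip B"), (at(12, 55), "trip C")]

for row in asof_join(trips, weather):
    print(row)
```

Trips A and B both pick up the 11:20 weather entry, while trip C at 12:55 matches the 12:50 entry, which mirrors what the demo shows when the query moves to a later point in the day.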
So I told you: okay, now you get the idea of why time series is kind of interesting, the kind of things we can do, downsampling, all those things. For machine learning this is very important: you have data maybe every second, and then you want to do forecasting, and in many cases it doesn't make sense to train a model with per-second data, so maybe you want to downsample to 15-minute intervals. With this trick you can do it easily. So I was speaking about ingesting data. Ingesting over one million records per second on a single instance is interesting, but it's actually easy. I could just write to a file, appending lines, and that would be it. The interesting bit is being able to ingest data while you are able to query in real time the same data you ingested. That's the trick. Because with just ingesting, I mean, you put it there and you're like, why ingest a million records? When you think about it, it's like, well, wait, how long do I have to wait until I can query the data? So the idea is that you can query the data at the same time. All benchmarks are lies, of course. On the same benchmark that I'm going to tell you about, other people will tell you the contrary, and I'm totally fine with that. But a couple of years ago we published an article saying, hey, we can now ingest at 1.4 million... the slides are already linked on the FOSDEM page, by the way, thank you. So our CTO posted about how we were ingesting 1.4 million records per second. These records have 20 columns: 10 dimensions, which are strings, and 10 metrics, which are numbers. So we could ingest records of 20 columns, with 10 strings and 10 numbers, at 1.4 million records per second while running queries, which is the other bit. We were able to scan over 4 billion records per second at the same time, on relatively small machines, relatively small, so that's kind
of the idea, okay? And this benchmark, we didn't write it. There is a benchmark specifically for time series databases. As I told you earlier, you can load relational data into QuestDB and you can run queries, but if you try to run a conventional benchmark on QuestDB, it's going to be super slow. We are not designed for full-text search, and we are not designed for, you know, just reading individual records or updating data. We can do it, but we are not designed for that. And there are other time series databases. So InfluxDB, another open source database, created this benchmark, the TSBS benchmark, which is specifically about time series databases, so the queries and the ingestion patterns match what you would expect from a time series database. Now it's maintained by Timescale, which is another open source database, built on top of Postgres. There is an adapter for running it on top of QuestDB, and that benchmark is the one with which we are getting those results. So, you know, your mileage may vary, also depending on the hardware. If you try to run the benchmark in the cloud, it's always going to be slower, because in the cloud, by default, you use EBS on AWS, and on Google Cloud you use the attached storage. It's network storage, and it has latency. They are super cool, but they are not local disks, so it's always going to be slower. If you want to get these numbers on Google Cloud or on AWS, you can do it, but you have to use NVMe disks, which are local disks attached to the instance. They disappear when you shut down the instance, but with those disks you will be getting the same benchmark results. So hardware is also important for the benchmark. But that's the idea, you know, that's how we did it. And before I tell you a bit about the technical
decisions, which I will not have much time for, I want to show you how we are doing this ingestion. So let me just see if I can move this out of the way. This is a script in Go. I don't know any Go at all, but I know how to run this. As a developer advocate, I could tell you that I know a lot of Go, but I have no idea. Golang is a language, and I've been told it's pretty cool. So we have this library, or package, or whatever they call it in Go, which is our official package. Cargo or whatever, I don't know; I'm mixing up my languages here, thank you. So, yeah, this is our thing. I'm connecting to localhost, to the default port in QuestDB, and I'm going to be simulating data. I'm simulating IoT data, and I'm going to be outputting a device type, which can be red or blue or green or yellow, plus duration, latitude, longitude, speed, and a timestamp in nanoseconds. I'm going to do this in batches of 50,000 records, and I'm going to do it 200 times: 50,000 records, 200 times, 10 million records. So I'm going to be inserting 10 million records into a table that doesn't exist; QuestDB will create it automatically when it starts receiving data. So if I run this script in Go, which you run with go run (well, don't go!), so, go run, it's ingesting data. It should take less than 10 seconds because we are ingesting 10 million, and that's finished. So let me just go to my localhost here and select how many records we ingested. I have to refresh the tables. Okay, how many records did I ingest? 10 million records, that's good. Can you tell me the interval, so I can see what happened here, sampled by one second? And it's telling me, yeah, in the first second only half a million, because we didn't start at the top of the second, it was probably partway through the second or something, but after that, one million, one million, one million. You see the idea, okay,
that's not too bad. I can do this slightly better: I can run this script twice, ingesting into the same instance but into two different tables. So now, if I refresh, I should see that I have two tables, not only one. So I have two tables here, same hardware and everything. If I run the query again, I'm going to select only the last 10 rows, so we only see the latest run. You can see it's slower now. I was actually ingesting into two tables, so I'm ingesting only about 700,000 per second into this one. But if I look at the other table for the same time (I can just do a union... oh yeah, I cannot apply a limit here in a union, sorry), I should see that, even if I was going slower, the other table was receiving data as well. In this format you cannot see it very well, but we can do something I told you about earlier: rather than a regular join, I can do an ASOF JOIN of the first query with the second. So I should be able to do this. Now, in the first run we were running only one instance sending data, and this one is the run in which I was running two. So you can see, for this particular second, we were ingesting 700,000 records in one table and 700,000 records in the other at the same time, so about 1.4 million in total across different tables, out of the box. If I configure the writers and how many threads I have for processing things, I can get it slightly faster than this, but that's good enough. On a local M1 laptop with an SSD, it's fast. So that's the one million there; I was not lying, I was just, you know, telling you things. I have only a few minutes, but that's cool. How did we get here? First, we can make a lot of assumptions about the data. This is time series, so we know people usually want to get not individual rows, but computations over rows. We know people mostly want to group by things that are in the data, like strings, like the
country name or the device name or the brand or whatever. So instead of storing strings, we have a special type, which is called a symbol: if you give me a string, we convert it into a number and we do the lookup automatically. So we can make a lot of assumptions because we hyper-specialize on one particular use case. We optimize storage. We don't use indexes, because we always store everything in incremental order per partition. If we get data out of order, we have to rewrite the partitions, but we don't need indexes because we always have the data physically in order, so we can scan super quickly back and forth. That's kind of the idea. We also parallelize as much as we can, using different techniques. This is written in Java, and it's from scratch. You will see some databases which I love, like MongoDB, an excellent database for content, that have a time series module. They use the same MongoDB collections for doing time series, so they cannot be as fast, because they are using exactly what they use for content. It's very convenient, you can do everything, but it's the same thing with other engines that are built on top of other things. We don't have any dependencies; everything is built from scratch. Actually, we are rewriting some of the standard libraries in Java, like strings and loggers and so on, to avoid conversions. There are things that we don't use, so we don't include them. We have libraries for strings, we have libraries for memory management, we have libraries for absolutely everything, written in our own version. We have our own just-in-time compiler, because the original just-in-time compiler in Java was not performant enough for some of the parallelization in queries we wanted to do. So we wrote everything. Our Java is kind of weird. Jeremy can tell you more about that. It's super weird Java, but it's still Java. That's kind of the idea. We even wrote our own input/output functions. That's kind of a thing. Why?
Because we can get nanoseconds faster. This is Log4j. We don't speak about Log4j, but it's awesome. So this is Log4j, and this is the operations you can do in each nanosecond. With Log4j, logging an integer, you can do 82 operations per nanosecond; we can do 800 operations per nanosecond. Do you have to go down to the nanosecond? If you are doing a CRUD application, probably not. It really depends on what you are building. That's why we are writing things from scratch. So basically, the approach of QuestDB to performance, you know this one: "I don't know who you are, but I will find you and I will kill you." That's kind of the same approach I see on the QuestDB team. They are like, I don't know, we can get faster at some obscure thing here. So that's kind of the idea. And we try to be a good team player. Jeremy here has contributed, on his own, the connector for Kafka Connect and the connector for Apache Flink, so we try to integrate with the rest of the ecosystem. We'd love it if you try QuestDB. You are open source geeks; we like stars, you like stars. Please contribute, and please star us on GitHub if you like it. We have a Slack channel for contributors. We are quite friendly, we are fast, and we work on interesting problems. If you like interesting problems, if you like weird Java, we would love to have you here. So thank you very much, and I can take any questions outside.
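The ingestion demo described earlier in this talk can be approximated in a few lines. This is only a sketch in Python, not the Go script shown on stage: it builds batches of simulated IoT rows in the InfluxDB line protocol text format, which QuestDB accepts for ingestion. The table name `iot_demo`, the column names, the value ranges, and the helper names are assumptions made for illustration; only the list of fields and the batch arithmetic (50,000 rows, 200 times, 10 million rows) come from the talk. Nothing is sent over the network here.

```python
import random

DEVICE_TYPES = ["red", "blue", "green", "yellow"]  # device types named in the talk


def make_row(table, ts_ns, rng):
    """Format one simulated IoT reading as a line-protocol row."""
    device = rng.choice(DEVICE_TYPES)
    duration = rng.randint(1, 1000)
    lat = rng.uniform(-90.0, 90.0)
    lon = rng.uniform(-180.0, 180.0)
    speed = rng.uniform(0.0, 120.0)
    # Layout: table,symbol=value field=value,... timestamp_in_nanoseconds
    return (
        f"{table},device_type={device} "
        f"duration={duration}i,lat={lat:.6f},lon={lon:.6f},speed={speed:.2f} "
        f"{ts_ns}"
    )


def make_batches(table, batch_size, batch_count, start_ns, step_ns=1_000, seed=42):
    """Yield batch_count lists of batch_size rows with increasing timestamps."""
    rng = random.Random(seed)
    ts_ns = start_ns
    for _ in range(batch_count):
        batch = []
        for _ in range(batch_size):
            batch.append(make_row(table, ts_ns, rng))
            ts_ns += step_ns
        yield batch


# The talk used 50,000 rows x 200 batches; tiny numbers keep this quick to run.
batches = list(
    make_batches("iot_demo", batch_size=5, batch_count=3,
                 start_ns=1_700_000_000_000_000_000)
)
total_rows = sum(len(batch) for batch in batches)
print(total_rows)
print(batches[0][0])
```

In a real run each batch would be written to the database through a client library and flushed before the next one is built; that part is left out because it depends on the client being used.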
Oh, one question from the chat, thank you. Someone was asking whether QuestDB can work with GPS data. Yes, you can work with GPS data; we have doubles that you can use for that. We don't have a lot of geospatial functions. We have geohashes, which basically allow you to define, at different resolutions, in which square in the world something is. So if you are talking about finding where a point is in the world at a particular point in time, QuestDB is very cool. If you need to do other things, we support some math functions, cosine and all those things, so you can do your own calculations. But yeah, it can be used for GPS, and a lot of people are actually doing asset tracking with QuestDB. Thank you. |
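To make the "which square in the world, at different resolutions" idea concrete, here is a minimal sketch of the standard public geohash encoding in Python. It is not QuestDB's implementation, and the function name and default precision are assumptions for illustration. Each extra character narrows the square further, so a shorter hash is always a prefix of a longer one for the same point.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(42.6, -5.6))               # ezs42
print(geohash_encode(42.6, -5.6, precision=3))  # ezs
```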
Building A Real-Time Analytics Dashboard with Streamlit, Apache Pinot, and Apache Pulsar
Best of Both Worlds with Event Streaming and Real-Time Analytics |
Okay. Hello, everyone, and welcome to our talk today. Today we're going to talk about building a real-time analytics application with Apache Pinot and Apache Pulsar. So here today is myself, I'm Mary Grygleski. I'm a streaming developer advocate at DataStax, a company that has primarily been doing Apache Cassandra up to this point, and now we're going to be doing more Apache Pulsar, which is an event streaming platform that's optimized for cloud-native environments. I'm based in Chicago. I'm also a Java Champion and president of the Chicago Java Users Group, and blah, blah, blah, all these things. I was a developer before too, just so you know, mostly in Java. So now we'll have Mark introduce himself. Hello. And I do realize, Javier, that we've stolen your intro section, so we've gone straight in. Yeah, hi, I'm Mark. I work at StarTree. We do Apache Pinot, and I'm a developer advocate there. And so, like we had on the first slide, we're going to be showing you how to build a real-time analytics dashboard with Pulsar, Pinot, and a Python dashboard library called Streamlit. Half the talk will be that, and we're going to see how well the Wi-Fi survives our attempts to use live data. Let's hope the demo gods are in the room. So I guess the first thing to start with is to define what exactly this means. What is real-time analytics? We've seen lots of talks showing streaming data, so obviously that's a big part of it. But the real-time analytics bit is this bit that's highlighted: the goal is that we're trying to provide insights to make decisions quickly. And I guess the most important bit is that far side. It's cool that we've got the data. We can capture our IoT data, we can capture orders coming from a website, we can capture the logs, but we want to do something with it.
And so if we move on from the definition, all the talks we've seen so far focus on events. We've got events representing stuff: someone is purchasing something, someone wants to do a search query, someone is taking a taxi ride like we've seen in Javier's talk. And those events are cool on their own, right? But we really want to get some insights from them. What can we do? And eventually that leads to: okay, we've got some insight, and we do something as a result of knowing that this is happening. So say we know someone is searching for a pizza. Okay, let's get a pizza shop in front of them, if we're doing Google adverts. If someone's then going in and buying the pizza, let's show them in real time the things that people have also been buying along with that pizza or that hamburger, which we can suggest in the flow. So we want to try to react to those events coming in. We've seen lots of tools. We've seen Beam, we've seen Flink, we've seen lots of tools for getting the streaming data, but we want to do something with that data. That's the whole purpose, I guess, of all the applications that we're building. And in the real-time analytics space, imagine the value of data over time: in this world where we're trying to do something with that streaming data, the value of that data goes down over time. So if I know today that you made an order and something's gone wrong with it, I can try to do something to make you happy, like give you a voucher or call you up and try to fix it. If I only find out when I batch process that tomorrow, it's like, oh, you already hate me, so it's too late. And so we're focused on the left side of this diagram.
So in terms of real-time analytics, we want the data close to when it's coming in. Maybe not exactly when it comes in, but in the time just after that; that's the region we're living in. And there can be lots of people who are interested in this data. It could be the analysts inside a company; in our imaginary pizza shop, maybe it's the people actually running the operations of the pizza shop. It might be the management, who are saying, hey, I want to know what's going on now, what's the revenue that we're seeing right now, in the last 10 minutes or whatever it is. And it could be the users. That's the interesting thing that has changed, I guess, from doing traditional analytics to doing this real-time stuff: the data is almost coming back around. Users are creating the data, and then we're feeding it back to them in terms of products that they can then use. And yeah, I guess that's more or less what I wanted to say. So when we're building these applications, and they're not strictly like this, there are four obvious quadrants of applications that people build. Oh yes, I should show you: they go along two axes. We've got human-facing and machine-facing along the Y axis, and then internal and external on the X axis. So if we go to the top side, that would be the observability area, and actually a lot of the time series querying would be in there. This would be the area of tools like Datadog. So, hey, I've got all this telemetry data coming in and I want to know what's going on in my AWS cluster. What's happening with all my Lambda functions? Are there any that are suddenly really slow? Can I figure that out?
And maybe there's a machine that's actually interpreting that data, rather than a human looking at it and going, oh yeah, it's that one Lambda function there that's really slow. You'd probably be feeding it into some other tool that's figuring it out. If we come down here, this is where we're going to be today. So imagine this is a dashboard. I'm sure you've seen loads of dashboards, you've probably seen Tableau and loads of BI tools, and with a lot of them the data that you're using is maybe yesterday's data. What we're going to show you today is how you could build one that is updating as new data is coming in. If we come up to the top left, this is where the machine is processing the data, but it's for the users. This would be like what Javier was talking about, so we've got the fraud detection system. The data's coming in, it's being processed by something, and then I'm seeing the result, but I'm not going and working it out with my own query, as in, oh look, I wrote this really, really clever query and here's the answer. Maybe there's some pre-processing happening by some sort of machine learning algorithm. And then if we come down into the bottom corner, that would be some sort of external service, which in our pizza example could be an order tracking service. You know how on your phone you order something from, I don't know, what's the food delivery service here, Just Eat? Maybe you see, hey look, I can see exactly where it is. How far away is it? Why on earth has the driver gone the wrong way to my house? You can see all that sort of information. Maybe too much. Maybe they should just show you that it's always coming towards your house, and that it never went the absolute opposite way and got to your house an hour later. So, just to show you some real-world examples of where this is used.
So LinkedIn is one. The "who viewed your profile" feature, I guess most of you have probably seen this; it's for people spying on you, I guess. You can see all the people that looked at you, and it's very up to date, right? If I went and viewed one of your profile pages, you would see it straight away: hey look, Mark looked at it. The use for that would be, well, I guess it's often recruiters, right? You're like, oh, I wonder why that person is following me, I'm going to contact them. So it's almost a real-time way of interacting with someone. Maybe you wanted to collaborate with them on something, or I guess they've got a job that's available. They use it in the news feed as well. This is similar to what you would see in Facebook, I guess even in TikTok and those sorts of tools. The goal is to make you interact with the product more; they want you to stay on it. So they need to show you what is happening now so that you're going to stay on it and not go away and do something else, which potentially is more useful, but they want you to just stay on there. And then for LinkedIn, this one has a somewhat smaller user base, but for the recruiters, they can see the trends in terms of what jobs are available, what places those are in the world, and so on. And then let's do one more example: Uber Eats is another one. This one's like what I was saying before. This is very much a dashboard approach, and it's for a restaurant manager who is hosting their restaurant on there. And you can see the things that are interesting are, for example, that they would get missed orders.
So hey, we've made a mess of this order. We've got this order that's gone wrong; can we go and fix it now rather than waiting till tomorrow? It's almost like you're able to achieve the customer service that you have in a restaurant, where you can see with your own eyes, okay, these people look really angry with me. The data being shown is almost giving you the equivalent of the in-restaurant experience without being in the restaurant. Okay, so those are some examples of people who have built those things, and there are way more of them; those are just some that I've picked out. How do we build that? There are some properties that we need to achieve, and some of these Javier was talking about in his talk. The first one is that we want to be able to get the data in quickly. In these applications it's generally coming from a streaming data platform of some sort. In our talk, it's going to be Pulsar, and we need to be able to get the data into Pulsar and then into wherever we want to query it, in this case into Pinot, and we need to get it in there very, very quickly. Once it's in there, we want to be able to query it very quickly as well. One way of thinking about it is that in these applications we want OLTP-type query speeds on OLAP data. So we want to be querying everything, but getting the results in the time of, imagine, a refresh of a web page or of a page in a mobile app. I don't want to be sitting there waiting for five or 10 seconds for the results, right? And that particular requirement is a lot more the case when it's an external user. If it's inside a company, it's fine, you can just go and get a coffee and wait for the results. But if it's outside, users are not going to do that.
They're going to use another application instead. And then finally, we want to be able to scale it, right? It could be one dashboard doing loads of different queries and aggregating and bringing everything together into one view, or it may be lots of users doing it concurrently. But the end result is that lots of queries are coming in. We need to be able to handle those, and it can't affect those other two things either. So we need to still be able to do lots of those concurrent queries while ingesting big amounts of data very quickly. So how do we go about building one of those? Around the outside of this slide we've got some of the properties that you would need, and in the middle we've got a couple of tools that can achieve this, in this case Pulsar and Pinot. You can see we want to achieve real-time ingestion. The data will often be very wide, with lots of columns and lots of properties, potentially nested, and then we've got to do something with it to figure out how we're going to get it into a structure that we can query. And then you can see some of the others: the data needs to be fresh, we want to do thousands of queries a second, and the latency needs to be OLTP-style. So I'm going to hand the microphone back to Mary now. Okay, thank you, Mark. So now we're going to focus for just a couple of minutes on Apache Pulsar. Real quickly, how many of you have heard of Pulsar or are working with it? Oh, working with it. Okay, cool. So some of you have, and of course there are also folks using Kafka too. But here I want to tell you why we want to use Pulsar, right? For those of you who are new to Pulsar, there are a couple of components to it, but primarily there are brokers, which are a serverless Java runtime, right?
And they are running, but it's very flexible too. On the client side, you write your producers and consumers; it takes on a pub-sub type of architectural pattern, right? I won't have time to get into all the details, so I'll just give you the highlights. Producers and consumers are what you write. You can write them in Java, in Go, in Python, in any of the languages that are supported by the community. Then there are multiple brokers running, and it's also optimized to run in a cloud-native environment. Also, instead of managing all of the huge amounts of log messages itself, it leverages BookKeeper, which is the Apache BookKeeper project. That is a highly available, fast-read and fast-write distributed logging system. It also makes use of ZooKeeper to help it manage the cluster, that aspect of things. And now, really quickly, an introduction. Pulsar was first developed by Yahoo back in 2013 or so, basically recognizing that we need an event streaming platform that can run very effectively in a cloud-native environment. They contributed it to the Apache Software Foundation in 2016, and it very quickly became a top-level project in 2018. And again, it's very cloud-native in nature. It's cluster-based already, and multi-tenancy is supported too. I talked already about the simple client APIs that you can use from many different languages. One of the strengths of Pulsar is that it separates compute from storage: Pulsar manages all of the messaging and has BookKeeper manage all of the log messages and storage. Pulsar also has guaranteed message delivery. And it has the Pulsar Functions framework, which is very lightweight, so you don't need to rely on any external libraries or vendors to do message transformation as you are constructing your data pipeline too.
And another feature is that it has tiered storage offload. So if there are messages that become cold, they get moved off to offline storage such as S3 buckets and things like that. Okay, so just a real quick slide to show streaming versus not streaming. In modern-day streaming, as you can see, we're ingesting data, and this is what we'll be ingesting for analytics processing software like Pinot: you ingest data without actually writing to disk first, like the traditional way of doing things, so that speeds up the whole process, and it processes the data in memory. When you're done with processing, you can use a Pulsar function to transform your data however you need, and then you output your data to a sink. There are connectors that help you to do that. So the whole pipeline is designed to be very efficient. That's what it is. And now we go back to Pinot a little more, and then on to the demo too. All right, so what is Pinot? This diagram more or less explains it. As I say, we've got data coming in from sources. In our case it's going to be Pulsar, but you can see there are lots of other ones. And what's interesting is that, in theory, you could load streaming and batch data sources into the same tables and query them both together. Once the data comes in, it's a column store. Then you can do some aggregation, there are all the different types of indexes on top of that, and you can do some pre-materialization as well. And then on the right-hand side, we've got a couple of the use cases. So, just to quickly show you the architecture of how this works from the far side: the data is coming in from Pulsar here, and we're going to be using three components.
We've got a controller, we've got a server, and then we'll have a broker, which will come up here. The controller is the manager of the cluster, so it's taking care of, hey, where does this data need to go? Pinot uses a tool called Helix, which sits on top of ZooKeeper. It breaks data out into segments. The segments, in this particular case, will map to the partitions coming from Pulsar, so each partition will be coming into a segment. If we had, for example, four partitions, we'd get four different segments. In our case we'll just run it on my machine, but if you were doing it in a cluster, it would then decide where it's going to replicate the data coming from each of the partitions. It might decide that one is on server one and server four, and two is on server two and server three, and so on. The servers hold the data, so that's where the data is going. Remember, the controller manages everything. And then the broker is taking care of the querying. So the query comes in here; for example, here we're counting how many rows in a table are for the country US. The broker sends it out, in a scatter-gather type of pattern, to the servers that it knows have data that might be relevant. They send their results back, and then the broker takes care of aggregating them and sending the answer back to the client. Okay, now let's have a look at a demo. Let's see if I can get this to work. In case you want to look at it, this is where the demo lives. Hopefully that QR code still works; I'll just let you take a picture of it. And you can tell what it is from the name: it's a wiki demo. As I said before, it's going to sit in this bottom corner, so a real-time dashboard. And this is the architecture of it.
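The scatter-gather pattern described for the broker can be illustrated with a toy model. This is a sketch only, not how Pinot is implemented: the server names, the segment layout, and the function names are all made up, and a real broker routes the query only to servers that hold relevant segments rather than to every server.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: server name -> {segment name -> list of rows}
SERVERS = {
    "server1": {"segment_0": [{"country": "US"}, {"country": "FR"}]},
    "server2": {"segment_1": [{"country": "US"}, {"country": "US"}]},
    "server3": {"segment_2": [{"country": "BE"}]},
}


def server_count(server, country):
    """Each server counts matching rows in the segments it holds."""
    return sum(
        1
        for rows in SERVERS[server].values()
        for row in rows
        if row["country"] == country
    )


def broker_count(country):
    """Scatter the query to the servers, then gather and sum the partial counts."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: server_count(s, country), SERVERS))
    return sum(partials)


print(broker_count("US"))  # 3
```

A count is easy to merge because partial counts simply add up; other aggregations need their own merge step on the broker side.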
So we've got a streaming API, and the data is going to come in. We've got a Python application that's going to process it and put it into Pulsar, and from Pulsar it goes into Pinot. And then we're going to have a Streamlit dashboard on the end. We've got 15 minutes, so let's see how we go. Wikimedia has a really nice recent changes feed, which captures all the changes that are being made to different properties in Wikimedia. They actually store it internally in Kafka, but then expose it using the server-sent events protocol. This is an HTTP protocol; it basically just streams out loads and loads of events, so it's effectively an infinite stream of these events. And this is what it looks like. We get three properties: an event, an ID, and data. The main bit that's interesting is data. You can see it's a sort of nested JSON structure, and you can see we've got the URL that's been changed, a request ID, an ID. It says somewhere what the title of the page was, what the timestamp was, the revision. It's got lots of interesting stuff that we can pull out. Nope, not done yet. Okay, so now, that would have been amazing. So if I come here, can you see? Yes. All right, great. Let me see if I can get this to stay on my screen. Is that good enough? Yeah. All right, perfect. So let's see if I can type: pygmentize. Ah. Yeah, there we go. So we've got a script called wiki.py, and this is what it looks like. I guess the red is not entirely readable. This is the... don't do that. This is the URL that we're going to be working with. So let me paste that into my browser. Oh, I don't know what that's done. It's going to the wrong one. Come here. Hang on. I'll escape that.
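For readers who have not seen server-sent events on the wire, the three properties mentioned above (event, ID, data) arrive as plain text lines, with a blank line marking the end of each event. The following is a minimal sketch of a parser in Python; the sample payload is invented for illustration and is much smaller than a real recent-changes event, and the demo itself uses an existing SSE client library rather than hand-written parsing.

```python
import json


def parse_sse(lines):
    """Yield one dict per server-sent event from an iterable of text lines."""
    event = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            if event:  # a blank line ends the current event
                yield event
                event = {}
            continue
        if line.startswith(":"):  # comment or keep-alive line
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data" and "data" in event:
            event["data"] += "\n" + value  # multi-line data is joined with newlines
        else:
            event[field] = value


# Made-up sample stream containing two events.
SAMPLE = (
    "event: message\n"
    "id: 1\n"
    'data: {"meta": {"id": "abc-123"}, "title": "Example page", "timestamp": 1700000000}\n'
    "\n"
    ": keep-alive comment\n"
    "event: message\n"
    'data: {"meta": {"id": "def-456"}, "title": "Another page", "timestamp": 1700000005}\n'
    "\n"
)

events = list(parse_sse(SAMPLE.splitlines()))
payload = json.loads(events[0]["data"])
print(len(events), payload["meta"]["id"])  # 2 abc-123
```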
So if we paste that in here, this is what it looks like. You get loads and loads of messages. Chrome will eventually get very, very angry with you if you leave this running forever, but you can see the messages are coming through, and we're going to be processing those. So you can see we're just using the requests library in Python to wrap around that particular endpoint; that's this bit here. And then it's going to stream the events. We've got this SSE client, the server-sent events Python client; wrap that around it and we get an infinite stream of messages. So if we were to run this, let's just go here, and we'll pipe it into jq just so it's a bit more readable. You can see the messages are coming through. If I stop it and scroll up, you can see what it looks like. You've got the schema, meta, ID, type. You can see, oh, this is Russian; someone's changing Russian Wikipedia at the moment, apparently. There you go. Thanks. So we've got lots of different stuff. And that's the first bit, right? We wanted to get the stream into a form where we've got hold of it, and we've done that bit. The next bit is that we want to get it into Pulsar, so let's have a look at our second script. This one is to get it into Pulsar. Oh, hang on, it's a bit longer, so let's just pipe it into less first. We're going to be using the Pulsar client. The first bit is the same, right? We still build our streaming client here for the wiki. But then we also build a Pulsar client here. So we're creating a Pulsar client and we point it to localhost 6650; this is the port. We're actually running all this in Docker, but it exposes the ports out to the host OS. We create a producer, and our topic is going to be called wiki events.
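The second script can be sketched roughly as follows. This is not the script shown on screen: the function names, the topic spelling `wiki-events`, the no-op callback, and the final flush at the end of the stream are assumptions. The forwarding loop only relies on a producer object with `send_async` and `flush` methods, so a small stand-in is used for a dry run, and the wiring to a real broker (which needs the `pulsar-client` package and a running Pulsar) is kept in a separate function that is not called here.

```python
import json


def forward_events(events, producer, flush_every=100):
    """Send each event to a Pulsar-style producer without waiting for acks."""
    sent = 0

    def on_sent(result, msg_id):
        # Invoked by the client once the broker responds; ignored in this sketch.
        pass

    for event in events:
        payload = json.dumps(event).encode("utf-8")
        producer.send_async(payload, on_sent)
        sent += 1
        if sent % flush_every == 0:
            producer.flush()
    producer.flush()  # push out whatever is still buffered
    return sent


def run_against_pulsar(events):
    """Wire the loop to a real broker; requires the pulsar-client package."""
    import pulsar  # imported here so the dry run below works without it

    client = pulsar.Client("pulsar://localhost:6650")
    producer = client.create_producer("wiki-events")  # assumed topic spelling
    try:
        return forward_events(events, producer)
    finally:
        client.close()


class RecordingProducer:
    """Stand-in with the same two methods, used only for the dry run."""

    def __init__(self):
        self.messages = []
        self.flush_calls = 0

    def send_async(self, content, callback):
        self.messages.append(content)
        callback(None, len(self.messages))

    def flush(self):
        self.flush_calls += 1


fake = RecordingProducer()
sent = forward_events(({"n": i} for i in range(250)), fake)
print(sent, fake.flush_calls)  # 250 3
```

Sending asynchronously and flushing in batches trades a little delivery latency for much higher throughput than waiting on every message.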
And then if you scroll down here, this is still the same: we're still looping through that stream of events. But this time we then call the Pulsar producer and send it a message, and we're sending it async, which means we're not going to wait for the response, right? And then finally we flush every 100 messages. Anything else to add on the Pulsar producer? That's good. All right, cool. So now let's call that one. So if we call Python wiki to Pulsar, then it runs, and you can see it will run away and everything is happy. I can never remember exactly what the commands are, but we can then use the Pulsar client to check that these messages are making their way in. So we can call this Pulsar client and we say we're going to... actually, do you want to explain it? You're probably better at it than me. Yeah, it's consuming the events, and then there's also the subscription; you give a name to it. Okay, so it's going to pick up from wherever it was the last time I did it. So let's see. There we go. I mean, this is going way faster; I guess it's catching up on the ones that we've produced before, and then eventually it'll come down to the same sort of speed. So we can see these are the messages coming through Pulsar. We've put it in JSON format, but it can handle multiple formats, right, if you wanted to. So yeah, all the typical ones you would expect. I guess most people would be doing Avro with some sort of schema attached to it, but for simplicity in the demo, JSON is quite a nice choice. Okay, so we've got it that far. The next thing is that we want to get it into a form that we can query. Let's see, where's my... right. So as I say, Pinot's model is the table: you create a table and you can query it. And the first thing is that we need to have a schema for that table. So this is what the Wikipedia one looks like.
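Since the slide itself is not in the transcript, here is a rough reconstruction of what such a Pinot schema could look like, built as a Python dictionary and printed as JSON. The field list follows what is described in the talk; the data types, the timestamp column name `ts`, and the exact format strings are assumptions, so treat this as a sketch rather than the file used in the demo.

```python
import json

# Dimension columns named in the talk; STRING and BOOLEAN types are assumed.
string_dimensions = ["id", "wiki", "user", "title", "comment", "stream", "type"]

schema = {
    "schemaName": "wikipedia",
    "dimensionFieldSpecs": [
        {"name": name, "dataType": "STRING"} for name in string_dimensions
    ]
    + [{"name": "bot", "dataType": "BOOLEAN"}],
    # No metric fields: nothing in this dataset is being counted or summed.
    "dateTimeFieldSpecs": [
        {
            "name": "ts",  # hypothetical column name for the event time
            "dataType": "TIMESTAMP",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
}

print(json.dumps(schema, indent=2))
```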
So we're pulling out some of the fields. We've got ID, we've got wiki, we've got user, we've got title, comment, stream, type, bot, and then we've specified a timestamp down at the bottom. You notice maybe that this is using the language of data warehousing. So these are dimension fields. We don't have any metric fields in here because there isn't really anything that we can count in this dataset, but if you had something that you were counting, then you could put that as a metric field. And then finally we've got a date time field as well. Now in general, by default, whatever fields you put in here, it will map exactly. If there is a value in your source with this name, it will map it directly. So in an ideal world, everything is flat and we just go map, map, map, map, map. In this case, actually, it's not quite as simple as that, so I'll just show you the table config that goes with it. But the first thing to notice is we can use these transformation functions to get the data into our columns. So ID is under meta.id, and we're using this JSON path function to pull stuff out. If we had cleaned the data up before, we could have used Pulsar's serverless functions to do the same thing, and then we would have it in a cleaner state and it would just go straight into Pinot. So you can choose which of those options you want. This is not a replacement for a stream processor, right? This is just for very, very tiny adjustments to the data, if it was slightly wrong. And then the other thing was the timestamps the wiki gives you are in epoch seconds, and I need epoch milliseconds for Pinot. So I multiply by a thousand. Now let's, oh, sorry. Yeah, I forgot the top bit. So the top bit: the name of the table needs to match the name of the schema, so it's Wikipedia. We need to specify what the timestamp is.
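To make the two adjustments mentioned above concrete (pulling `id` out from under `meta`, and converting epoch seconds to epoch milliseconds), here is a small Python sketch of the equivalent mapping. It is not Pinot configuration and not the speakers' code: the function name `to_row`, the output column name `ts`, and the assumption that `stream` also sits under `meta` are all illustrative.

```python
def to_row(event: dict) -> dict:
    """Flatten one change event into the flat column layout of the table."""
    meta = event.get("meta", {})
    return {
        "id": meta.get("id"),          # nested in the source as meta.id
        "wiki": event.get("wiki"),
        "user": event.get("user"),
        "title": event.get("title"),
        "comment": event.get("comment"),
        "stream": meta.get("stream"),  # assumption: also nested under meta
        "type": event.get("type"),
        "bot": event.get("bot"),
        # Source timestamps are epoch seconds; the table wants milliseconds.
        "ts": int(event["timestamp"]) * 1000,
    }


# Hypothetical event, shaped like the fields listed in the talk.
row = to_row({
    "meta": {"id": "abc", "stream": "recentchange"},
    "wiki": "ruwiki",
    "user": "someone",
    "title": "Some page",
    "comment": "typo",
    "type": "edit",
    "bot": False,
    "timestamp": 1675000000,
})
```

In the demo this work is done by transformation functions in the table config rather than in application code; the sketch only shows what those functions compute.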
And then this is the config that's telling it, hey, I need to, actually, I'm looking at the wrong one even. I should be showing the table Pulsar one. There we go. So, sorry. You see here, we say, hey, I'm going to be pulling the data from the Pulsar stream. This is my Pulsar connection string. I then need to tell it which factory to use, so I tell it the Pulsar factory. And then this is, yeah, I mean, this is not really necessary for now. What we're going to do now is add our table. What this is going to do is create the table, and then immediately it's going to start consuming. If you didn't have any messages in there, it obviously wouldn't consume anything. But since we do, we should be able to see our table here, Wikipedia. And you can see the messages are coming in. At the moment we've got 9,000 messages. We can write SQL on top of it. So we can, thanks. We could say, okay, let's have a look at which user is doing the most stuff. Hang on. Oh, forgot the group by. Group by user. Oh, can't type. Order by count star descending. So we could do something like that, or look at where people are changing stuff. And then finally, let me just quickly show you exactly what we're doing. Okay. They're back again. Should I put it back? Yeah. Yeah. Okay. All right, easy one, easy one. So, oh, sorry. So what we're doing is all in Python. We're using the Pinot Python driver. We're connecting to Pinot here, and then we're basically just running some queries. So the one that we're showing you, the table on the top, this red is not entirely readable, is that aggregation plus filtering. So it's capturing what happened in the last minute versus what happened in the minute before.
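The "last minute versus the minute before" comparison described above can be sketched in a few lines of Python. In the demo this is a filtered aggregation query against the table; the version below works on an in-memory list of epoch-millisecond timestamps, and the function name `minute_window_counts` is made up for illustration.

```python
from typing import Iterable, Tuple

MINUTE_MS = 60_000


def minute_window_counts(timestamps_ms: Iterable[int], now_ms: int) -> Tuple[int, int, int]:
    """Count events in the last minute and in the minute before it.

    Returns (last_minute, previous_minute, difference).
    """
    last = 0
    previous = 0
    for ts in timestamps_ms:
        age = now_ms - ts
        if 0 <= age < MINUTE_MS:
            last += 1          # within the most recent 60 seconds
        elif MINUTE_MS <= age < 2 * MINUTE_MS:
            previous += 1      # between 60 and 120 seconds ago
    return last, previous, last - previous


now = 1_675_000_000_000
sample = [now - 10_000, now - 30_000, now - 59_999,
          now - 60_000, now - 90_000, now - 130_000]
last, previous, change = minute_window_counts(sample, now)
```

The difference is the number a dashboard would show next to the current count to indicate whether activity is rising or falling.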
And then we're going to just run the query and stick it into pandas. We do the same for the others. We build some metrics. Those numbers on the top there, Streamlit calls them metrics. So we can pass in a value and then we can build the deltas. And the delta is this minute minus the previous minute, so you can see the change. And, yeah, I mean, I guess that's probably enough on the code. But if you want to have a look at that, it's all in the repository. And in theory, you've seen me literally just running the commands in there, so it should just work. So hopefully you can follow along. But I just came back to here to conclude. So yeah, lots of people doing stuff with Pinot, and some of the users of Pulsar as well. So lots of different people using it. Just a conclusion and then one more slide. I hope that you can see that by combining these different tools together, you can build some quite cool applications. In this case, I'm just going to show you what the action would be because all the users have built. But you could imagine, if you were looking at real wiki users, you maybe want to try and encourage the ones you see are coming in new. We've built something that's fresh there, the fast gradient scale. And we've done it with a classroom. For me, I'm writing a book with an ex-colleague of mine, showing you how to build these types of things. If you're interested in that, then there's my contact on your left. And I'll let Mary conclude. Just, okay. Here, it's just how you can connect with me and also with Apache Pulsar. If you're interested, we have a Slack group and also a wiki page. Sorry, a wiki page of the Apache Pulsar neighborhood team that you can build more stuff on. So, okay, I think then this is pretty much it.
And if you need more links to Pulsar, this is the page, and we can share the slides with you so you can have more. And we also have DataStax Developers. We have it on YouTube, in five minutes, about Pulsar, if you're interested in that. Also, we have DataStax Developers where you can find examples too, because today we can't get into a lot of these things. And myself too, I actually have a Twitch stream every Wednesday afternoon, central time like Chicago, so that will be evening time here. So if you're interested, you can follow me on Twitch as well. And yeah, I think that's it. How did it become like this? Thank you. Thank you. Backup slides in case the camera gets disastrously. Two questions. Here's one. Yeah, quick one. As you know, with Kafka you can have a cluster of Kafka brokers and stuff like that. Do you have that same problem where one goes down and the others start going down? It's like ZooKeeper kind of causes that problem. Because with Kafka, there always seems to be a problem with the brokers speaking with each other. So that's more about, I can move it offline with you, and if folks are interested too, I actually also have another talk that goes deeper into Pulsar. As well as tomorrow, I actually have a talk in the OpenJDK room, but that's more focused on JMS. So that will be the OpenJDK room in building H, if you want to do that. But as far as Pulsar is concerned, it is very cloud native by itself. So a lot of things, which in fact I didn't get to talk about, it is very infrastructure aware. So things like, you know, you don't want to worry about how you deal with offsets like in Kafka. You don't have such a thing. And then there are other things too. I just don't have enough time. I think I was told time's up. So we can move it offline, and then follow me on my Discord. Actually, I didn't even give my Discord, but follow me somewhere. Twitter.
And I'll be happy to answer any questions you have. Thank you. Okay. Thank you. Thank you. |
Building Analytical Apps With ClickHouse |
Okay, so yeah, thank you, everyone, for coming. This is the streaming data room, the last talk of the day; after that, you're free to, you know, go elsewhere. For the last one, we have a super cool talk with Alex, he is the CEO of ClickHouse. Yes, he's going to tell us how to build a real-time application with ClickHouse. Yeah, thank you. So, the title of my talk is very similar to the previous one. So, let's see what the difference will be. I will try to build a small, simple analytical application, just about right now. And how to build an analytical application? We have to figure out what to do, where to collect our data, how to prepare and clean our data, how to load it, and how to visualize it. And I will use the following technologies. Apache Flink, Apache Beam, Apache Kafka, Apache Pulsar, Apache Spark, Apache Pinot, Streamlit, Debezium, Apache Iceberg, Apache Superset. Every time I say Apache once again, I'm looking more and more stupid. So, maybe I don't actually have to use all of these technologies, because if I do, at least I will have to be able to tell what the difference is between Apache Kafka and Apache Pulsar. If you cannot, don't even try to use these technologies. And what do I want to do? Actually, I just want to analyze data. Whatever data. Give me some data. I want to analyze it. I have no idea what I will get as a result. I want some interesting data set with logs, metrics, time series data. I want clicks, whatever. So, where to find this data? If you want some demos, there are plenty of sources of freely available, public, updatable data sets, like Internet Archive or Zenodo.org, whatever it means, or Common Crawl, GitHub Archive, Wikipedia, blockchain data from public blockchains, network scans. Sometimes you can do a network scan by yourself and get away with it, but there are plenty of downloads. So, maybe you will be surprised by my choice, but I selected the data from Wikipedia. Exactly, almost exactly as in the previous talk.
The data is available on dumps.wikimedia.org. It is public domain. You can do whatever you want with it. It contains data dumps, edit history, and page view statistics. And I will analyze page view statistics. It is updated every hour and represented by about 70,000 gzip files, 3.5 terabytes. What to do with 3.5 terabytes? Download it. So, the data looks like this. It looks kind of raw, and I like it. And how to download it? With this shell script. It looks kind of raw, and I like it. So, what is it doing? It iterates by years, it iterates by months, collects the list of links, and then simply downloads in parallel with wget and xargs. It is rate-limited to three concurrent requests, apparently. Actually, wget has a recursive mode, but it does not have parallelism, so I decided to simply parallelize with xargs. And after about three days, the data is downloaded. Let's preview it. If you decompress just one file, it looks like this. It is kind of a strange format. It is not CSV, not TSV, not JSON. It does not look like Protobuf. It is a whitespace-separated file. It has just a few fields. Title, project, subproject, the number of page views, and also zero, for whatever reason, an always-zero field. How to load this data? I like shell scripts, but I don't want to use sed, awk, and Perl. Even though I am at this open source conference, I will not use sed, awk, and Perl. Instead, I will use clickhouse-local. What is clickhouse-local? It is a small tool for analytical data processing on local files or remote files without a server. You don't have to install ClickHouse to use clickhouse-local. And it can process every data format. It can process external data from external data sources, data lakes, object storages, everything. And actually, clickhouse-local is not a unique tool. There are many tools for command line data processing. Here is a list. I will not pronounce this list because I like clickhouse-local. I don't like all these tools.
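The download approach described above (walk years and months, collect links, fetch with at most three requests in flight) can be sketched in Python as follows. The speaker used a shell script with wget and xargs; this is only an analogous sketch. The directory layout under `BASE_URL` and the names `month_index_urls` and `fetch_all` are assumptions for illustration, and the `fetch` argument is a stand-in so that the example does not touch the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

# Assumed layout of the page view dumps: one directory per year and month.
BASE_URL = "https://dumps.wikimedia.org/other/pageviews"


def month_index_urls(first_year: int, last_year: int) -> List[str]:
    """Build one index URL per (year, month) in the inclusive year range."""
    return [
        f"{BASE_URL}/{year}/{year}-{month:02d}/"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]


def fetch_all(urls: Sequence[str], fetch: Callable[[str], T], max_parallel: int = 3) -> List[T]:
    """Apply `fetch` to every URL with at most `max_parallel` calls in flight.

    Results come back in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(fetch, urls))


urls = month_index_urls(2015, 2016)
# Stub fetcher for demonstration: returns the URL length instead of downloading.
sizes = fetch_all(urls, fetch=len)
```

A real fetcher would download each URL to disk; the limit of three matches the rate limit the speaker mentions.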
Installing clickhouse-local is easy. Curl pipe sh. It is also safe because, keep in mind, it is pipe sh, not pipe sudo sh. Running it is also easy. And let's preview this data. It has an interactive mode. Let's run clickhouse-local. And we can select directly from a URL. What format to use? CSV does not work. TSV does not work. But there is a format, pretty simple, named LineAsString. What is this format? It interprets a file as a table with a single column, named line, with type String. So just a single column with all our data. We can use it for just filtering. We can also select from multiple files. As in this example, we can select the file name. We can filter by something. OK. Now we have some idea what our data looks like. Now we have to clean up, prepare, structure our data, maybe convert it into another format. And I will do it with this select query. What is it doing? It is selecting from files, all our 3.5 terabytes of gzip files, with LineAsString. It will split the string by whitespace into some values, represent them as an array, and select elements of this array as project, subproject, and path. The path can be URL encoded with percent encoding, so I will use the function decodeURLComponent. I will also extract the date from the file name with the function parseDateTimeBestEffort. And it looks like this. It is not Russian Wikipedia. It is ab Wikipedia, whatever it means. And what is aa Wikipedia? I don't know. It will be pretty interesting. Also, what I did with these 3.5 terabytes of files, I uploaded them to my S3 bucket. And I just made this S3 bucket public. So while we have money, you will be able to download something. But please be kind. And you can select directly from this S3 bucket as well, from all of these files. Yes, in the same way. Okay, so we just previewed our data. Now let's proceed to real data loading. Let's install a real ClickHouse server instead of clickhouse-local.
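The clean-up query described above does four things per line: split on whitespace, split the first value into project and subproject, percent-decode the page title, and take the timestamp from the file name. A Python sketch of the same steps is below. It is not the speaker's query; the field order (project code, title, hit count, trailing zero), the file name pattern `pageviews-YYYYMMDD-HHMMSS.gz`, and the function name `parse_line` are assumptions for illustration.

```python
import re
from datetime import datetime
from urllib.parse import unquote

# Assumed file name pattern, e.g. "pageviews-20230204-150000.gz".
FILE_TIME = re.compile(r"pageviews-(\d{8})-(\d{6})")


def parse_line(line: str, file_name: str) -> dict:
    """Turn one whitespace-separated line into a structured record."""
    match = FILE_TIME.search(file_name)
    if match is None:
        raise ValueError(f"no timestamp in file name: {file_name!r}")
    time = datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")

    values = line.rstrip("\n").split(" ")
    projects = values[0].split(".")          # e.g. "en.m" -> ["en", "m"]
    return {
        "time": time,
        "project": projects[0],
        "subproject": projects[1] if len(projects) > 1 else "",
        "path": unquote(values[1]),          # undo percent-encoding of the title
        "hits": int(values[2]),
    }


record = parse_line("en.m Caf%C3%A9 42 0", "pageviews-20230204-150000.gz")
```

In ClickHouse the same transformation runs inside a single `SELECT`, so the data never passes through a separate script.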
But actually, there is no difference between clickhouse-local and ClickHouse client and ClickHouse server. Everything is in a single binary. You just rename it to clickhouse-server and it automatically becomes a server. You can create a symlink. You can take this binary and install it, and it will install into /usr/bin, /usr and /etc. You can also run it without installation. So let's start it and let's create a table. Here is the table structure. Five fields: date time, because it is a time series, project, subproject, page title, named path, and the number of page views, named hits. I also enabled stronger compression with ZSTD and LowCardinality data types. ZSTD is just a compression codec. I will also index it by path and time, so I will be able to quickly select specific pages. And how to load data into this table? Let's use Kafka or Pulsar and automate with Airflow and do ETL with Airbyte or dbt. Actually, I don't know why dbt even exists, because I can do everything without dbt. I will do it with just insert select. Insert into wikistat, my select query from S3. And I will wait while it finishes. Let's take a look. You don't see anything. I will make the font slightly larger. Okay. Now it started to load the data. 0%, 57 CPU consumed, 2 gigabytes per second and 50 million rows per second. 50 million. I did not watch one of the previous talks. It was named loading more than a million records per second on a single server. So we are loading more than a million records per second on a single server. Okay. Let's take a look at what is happening, because just loading data is not enough. It will take a while. And what to do while it is loading? I will run dstat. dstat will show me the system usage, and I see that it is bounded by IO, 500 megabytes per second written. It is compressed data. IO wait is 68%. CPU wait is almost non-existent. I can also run top to see what is happening.
CPU 16 cores, and it works, and IO wait is 70%. But for me, it is not enough, because I also run this tool, perf top, because I always profile my code. So what is my code doing? It is doing compression, sorting, nothing. Okay. And after eight hours, my data is loaded. The table size on disk is just 700 gigabytes. The original was 3.5 terabytes, so it compressed about five times. It was in gzip, now it is in ClickHouse, with all the column-oriented compression. The speed was 50 million rows per second, but actually, it was not true, because after eight hours, it degraded to just 14 million rows per second. Still not bad. It degraded because data has to be merged on disk, and that means write amplification, it takes additional IO. So what is the size? 380 billion records, 0.3 trillion. The total page views on Wikipedia is just 1 trillion, 300 billion page views. Nothing surprising, Wikipedia is quite popular. And about my table. Every record took just 2.0 bytes compressed. All these titles, like the Wikipedia main page, were like 50 bytes, and now it is compressed to just two bytes. And if you look at the compression ratio, the path is actually compressed 170 times because we sorted by path. Okay, but so what? What to do with my data? I have loaded it. It was 3.5 terabytes, and I can be proud that I wasted eight hours loading this data, and it compressed so well. But what to do with this data? We need some actionable insights from this data. Let's make real-time dashboards. How to do a real-time dashboard? We can use Grafana, Superset, Metabase, Tableau, Observable, or even Streamlit. I don't want to use Streamlit, it looked too complex, too complicated in the previous talk. And actually there is no problem, I can use Grafana, Superset, Metabase with ClickHouse, it works perfectly, but I am an engineer. And why use Grafana if I can write my own Grafana in a day? Let's do it just now. Let's decide what JavaScript framework to use.
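The numbers quoted above can be cross-checked with a little arithmetic. The sketch below uses only figures from the talk (3.5 terabytes of input, 700 gigabytes on disk, 380 billion rows, eight hours of loading) and assumes decimal units; since the speaker's figures are rounded, the results are approximate.

```python
# Figures quoted in the talk (rounded by the speaker); decimal units assumed.
original_bytes = 3.5e12      # 3.5 TB of gzip files
table_bytes = 700e9          # 700 GB on disk after loading
rows = 380e9                 # 380 billion records
load_seconds = 8 * 3600      # eight hours

compression_ratio = original_bytes / table_bytes   # versus the gzip input
bytes_per_row = table_bytes / rows                 # compressed size per record
avg_rows_per_second = rows / load_seconds          # average over the whole load

print(f"compression ratio vs gzip: {compression_ratio:.1f}x")
print(f"bytes per row:             {bytes_per_row:.2f}")
print(f"average load rate:         {avg_rows_per_second / 1e6:.1f} million rows/s")
```

The average rate over eight hours comes out at roughly 13 million rows per second, which is consistent with the speaker's remark that the initial 50 million rows per second was not sustained, and the per-row size of a little under two bytes matches the "about two bytes" claim.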
I can use React, Vue, Svelte. I don't know what Svelte is, but it is popular. You know, if Rust were a JavaScript framework, I would use Rust. Maybe I should use not JavaScript, but TypeScript. But no, I will use modern JavaScript. What is modern JavaScript? Modern JavaScript is when you simply open an HTML file in Notepad or vi or whatever, and write code without frameworks, without build systems, without dependencies. Actually, I need one dependency, some charting library. And I just picked a random charting library from GitHub. It is named uPlot, from leeoniya. The description: uPlot is a fast, memory-efficient library. Okay, solved. I will use it. Another question: how to query my database? Should I write a backend in Python, in Go? No, I will query my database directly from JavaScript, from modern JavaScript, with the REST API. I will use async, await, the fetch API, and post my query to the database, and it will return the data in JSON format. Okay, enough modern JavaScript. So, ClickHouse has a REST API embedded into the server. It has authentication, access control, rate limiting, quotas, query complexity limiting, parameterized queries, and custom handlers, so you don't have to write a select query, you can just define a handler like slash my report, or slash insert my data. And you can actually open ClickHouse to the Internet and get away with that. I did that, it still works. Okay, here is a query for Wikipedia trends that we will use for a dashboard. It will simply select this time series rounded to some time frame, for some page. And here is a parameterized query. It looks slightly different, it's not like a question mark here. It is actually a strictly typed substitution. Okay, and how long will this query take? Let me ask you, how long will this query take? What do you think? Eight days. Eight days, why eight days? It should work on a table with 0.3 trillion records. How long will this query take? Twenty milliseconds. Okay, let's experiment. Nine milliseconds.
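The typed parameter substitution mentioned above can be illustrated with ClickHouse's HTTP interface, where a placeholder written as `{name:Type}` in the query is filled from a request parameter called `param_name`. The sketch below only builds the request URL in Python and sends nothing; the speaker does this from the browser with `fetch`. The table and column names (`wikistat`, `time`, `path`, `hits`) follow the talk, while the exact query text, the host and port, and the helper name `build_query_url` are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# A trends query with one typed placeholder; the value is supplied separately.
QUERY = """
SELECT toStartOfDay(time) AS day, sum(hits) AS hits
FROM wikistat
WHERE path = {page:String}
GROUP BY day
ORDER BY day
FORMAT JSON
"""


def build_query_url(base_url: str, query: str, **params: object) -> str:
    """Encode the query plus its parameters as param_<name> URL arguments."""
    args = {"query": query.strip()}
    args.update({f"param_{name}": str(value) for name, value in params.items()})
    return f"{base_url}/?{urlencode(args)}"


# Hypothetical local server address.
url = build_query_url("http://localhost:8123", QUERY, page="ClickHouse")

# Decode the URL again to show what the server would receive.
sent = parse_qs(urlsplit(url).query)
```

Because the value travels separately from the query text and is parsed as the declared type, a page title containing quotes cannot change the structure of the query.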
So, you are wrong. You are also wrong. I was scrolling back and forth. So, maybe ClickHouse is fast. What if I do MySQL? 29 milliseconds. Okay, closer. MariaDB, 20 milliseconds. What if I replace the equality comparison with LIKE and add a percent? The same, because a prefix is also using the index. But what if I add a percent at the front? Okay, now it started to do a full scan. And this full scan was quite fast, over 1 billion records per second, but still not fast enough for real time. But all the queries with exact matching were real time. Okay, let me show you this dashboard. It looks like this, a modern dashboard. It looks actually gorgeous. It has a dark theme. And you can see it compares trends on Wikipedia for ClickHouse. ClickHouse is growing. Spark is not growing. Greenplum is not growing. What was there? Snowflake is quite okay. Let's check it. Let's see what is inside. Every chart is defined with a parameterized query. You write select. Actually, it's not even parameterized. Okay, what about MongoDB? Here I define a new chart and here is Mongo. Okay, I made one mistake. It was filtered by outliers for Snowflake. Let's move. Okay, Mongo... No, Mongo is not doing great. ClickHouse is doing great. By the way, what if you just open a dashboard by default? It will present you an observability dashboard for ClickHouse, so you can see what the system is doing. It is actually the same code, the same dashboard, but different queries. You can use parameterized queries for these parameters, change parameters, change the time frame. It's not like Grafana, it does not have the features, but it is nice. And you can see, yes, it is a single HTML page and here is a proof. Okay. So what do we have? We have created a real-time dashboard with ClickHouse. We have loaded 0.3 trillion records of data from a public data set. It works, it works fast, it looks great. And if you want to build... Actually, I don't insist that you use modern JavaScript.
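Why a prefix match can use the index while a pattern with a leading percent forces a full scan can be shown with a toy example. The sketch below keeps page titles in a sorted Python list and compares a binary-search prefix lookup with a substring scan. It is a deliberate simplification and not how ClickHouse stores data: the real table keeps a sparse index over blocks of rows sorted by path, and the function names here are made up.

```python
from bisect import bisect_left
from typing import List


def prefix_lookup(sorted_keys: List[str], prefix: str) -> List[str]:
    """Find keys starting with `prefix` using binary search on sorted data.

    All keys sharing a prefix are adjacent in sorted order, so only the
    matching range is touched.
    """
    i = bisect_left(sorted_keys, prefix)
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches


def substring_scan(keys: List[str], needle: str) -> List[str]:
    """Find keys containing `needle` anywhere: every key must be inspected."""
    return [key for key in keys if needle in key]


titles = sorted([
    "Apache_Spark", "ClickHouse", "ClickHouse_Inc", "Greenplum",
    "MongoDB", "Snowflake_Inc.",
])
by_prefix = prefix_lookup(titles, "Click")       # like: path LIKE 'Click%'
by_substring = substring_scan(titles, "House")   # like: path LIKE '%House%'
```

Both calls return the same two titles here, but the first inspects only the matching range while the second has to look at every title, which is the difference between milliseconds and a full scan on a table with hundreds of billions of rows.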
I don't insist that you query ClickHouse directly from a user's browser. You can use Grafana, Superset, Metabase. Streamlit, maybe, I'm not sure. But you can also build these small applications. And I have built quite a few. There is ClickHouse Playground, where you can explore some data sets. There is a web page for the ClickHouse testing infrastructure. It is named "Are tests green yet?"; you can try and check what it is. And the source code, dashboard.html, is located in our repository. And just to note, this service is not original. I have found multiple similar services, for example, WikiShark, for the same trends on Wikipedia. But on WikiShark, there is a description that the author... I do not remember, maybe he made a PhD implementing a custom data structure for this. But he could simply load the data into ClickHouse. The experience of working with ClickHouse is worth multiple PhDs. Okay. Thank you, that's it. Thank you. We do have time for multiple questions. Modern JavaScript, for example. Why is this super fast? Very easy. Why is this dashboard fast? Because it's processing very fast. Why is it inserting fast? Why is it selecting fast? Because I always profile it. You have seen, I always look at what is happening inside the code. What can be optimized? If I see that some percent of time is spent doing nothing but memcpy, I'm thinking maybe I should optimize memcpy. Maybe I should remove memcpy. But actually it is a very long list about everything. But still we are talking about one machine. Can one machine process all the data? Yeah. I just created a machine on AWS with gp2 EBS, just in case. The data was in S3. I had uploaded it. By the way, maybe we have time for some demos. But the screen resolution is not great. And Wi-Fi stopped working. So probably no demos. But okay, next questions. Okay. Hello, thanks for the talk. You mentioned compression. Does that slow down select? Not quite. Actually compression can even improve select queries.
It is kind of paradoxical, but let me explain. First, because a smaller amount of data will be read from disk. Second, because data read from disk is also cached in memory, in the page cache. And the page cache will contain compressed data. And when you process this data, you will decompress this data into CPU cache without using main memory. So even... Yeah. LZ4 from multiple threads can be faster than memory bandwidth. ZSTD not always. But on servers like AMD EPYC with 128 cores, if you run ZSTD decompression on every core, it has a chance to be faster than memory. Thank you. So what is your total AWS bill for this project? I prepared it yesterday and also used S3, which I prepared before that. So let's calculate the S3 cost. I am storing the original data, three and a half terabytes. And it should be like 23. But 23 is the list price per month per terabyte. So it will be like $70 per month if you don't have AWS discounts. But I do. And for the server, it was about $4 per hour. And something for gp2. So maybe something like $5 per hour. And it is still running. I started it up yesterday when I prepared the talk. And so 24 hours will be how much? Something like maybe $50. Okay, I spent $50 for this talk. Is your S3 bucket public? Yeah, it is public. So keep in mind, if you abuse it, we will simply turn it off. Maybe another question about S3. What type of connectors do you have to S3? Is it just for uploading? Or can you also use S3 for indexing and storing data? Yes, you can. And in multiple different ways. First, just a bunch of files on S3. Process them as is. Parquet, Protobuf, Avro, it does not matter. Everything works. Second, you can process files in Apache Delta Lake or Apache Hudi. Iceberg will be supported maybe in the next release. So you can prepare data in your Databricks or Spark, and process it with ClickHouse, because ClickHouse is better than Spark. Third option. You can also plug in S3 as a virtual file system for MergeTree tables.
And it will be used not only for selects but also for inserts. And you can have your servers almost stateless, and the data will be in the object storage. Yeah, plenty of options. One more question. Yeah, for sure. You can use it in a cluster. You can set up inserts into a distributed table and it will scale linearly. And these queries will also scale. The queries that already take like 9 milliseconds, 10 milliseconds will not take less. Maybe they will take even more, like 15 milliseconds. But the queries that took seconds, minutes, they will scale linearly. Theoretically, no. But in practice, some companies are using ClickHouse on over 1,000 nodes. Many companies are using ClickHouse on several hundreds of nodes. When you have to deal with clusters with hundreds and thousands of nodes, especially if they are geographically distributed, you will definitely have troubles. But with ClickHouse, it is totally possible to have these clusters and it will work. Another question. Interesting question, because maybe you are asking about what the data structures inside are. Maybe you are asking, is ClickHouse based on some readily available data structures? The data format, ClickHouse MergeTree, is original. You can think that maybe it is somehow similar to Apache Iceberg, maybe. But actually not. The column format in memory and the network transfer format is also original, but it is very similar to Apache Arrow, just slightly different. The algorithms, actually, we took every good algorithm from everywhere. If someone writes a blog post on the Internet like, I have implemented the best hash table, instantly someone from my team will try and test it inside ClickHouse. Okay, looks like no more questions. Thank you. |
Building Personalized AI Apps with MIT App Inventor |
Okay, I think we can start already. Hi everybody, I'm Diego Barreiro. I'm one of the open source contributors to the MIT App Inventor project and today I'm going to be talking about App Inventor and how we can introduce artificial intelligence to kids using this platform. But before getting into it, I would like to introduce myself a little bit more. I first started coding with App Inventor when I was 14 years old. It was 2013 at that time and I basically wanted to build an app. I didn't know anything about coding and a high school teacher showed me this amazing platform. So I just spent the next couple of years building apps with App Inventor, and eventually I switched to Java coding and I was able to contribute to the project later on, as I'm doing right now. So what is MIT App Inventor? MIT App Inventor is an online platform that enables anybody to build any kind of application without having to learn a programming language like Java or Kotlin, which are nowadays the most popular ones. This is the interface. It has the mock phone in the center, and on the left side we have the components, the elements that will make up the app, like buttons, labels, text boxes, text areas, any kind of interaction that the user will have with the app. And then we can customize the properties like colors, text fonts, text sizes, whatever, on this panel, so we can make the app look as we wish for the final user. And how does the logic work? Well, most of you may know about Scratch. App Inventor works somewhat like Scratch, using this block language. So let's say that we want to play a sound when the app opens. We will be using a block that says when Screen1 has opened, we want to play a specific sound. And that's how we can make any kind of application. MIT App Inventor also allows existing Android developers to introduce new components using extensions.
And we will be using today one of those extensions, which was developed by a research project at MIT and makes it possible to classify images into different groups using artificial intelligence. And to give some numbers on App Inventor, it was started in 2008 as a Google project. And then a few years later it was eventually transferred to MIT. Right now it has gathered over 18 million users since it was transferred to MIT, with nearly 85 million apps that have been developed. And on a monthly basis we get roughly one million users. And in terms of open source contributions, we have seen 164 different contributors to the project on GitHub. So today I'm not going to be giving the classic talk. I'm going to be showing a tutorial on how to build an app, and people in the audience and at home can follow this tutorial by visiting the following link. And what we will be doing today is building the Peekaboo app. Peekaboo is a game that is usually played with babies: when you see the baby, the baby laughs, and when you hide yourself, it just cries. So to show the final result, this will be the final result. Let me just switch to my phone. I'm going to be mirroring this phone. So I open the final app and I'm going to start using the camera. I can see, here I am. I'm looking at the baby, which is happy. If I hide myself, it just cries. So let's get into it. So how can we use MIT App Inventor? The standard instance of App Inventor is hosted at ai2.appinventor.mit.edu, but that requires an account. So MIT has created this specific instance, code.appinventor.mit.edu, that allows you to create these projects without any existing account. We will be using an anonymous account so that it can be cleaned up later after finishing. And we will receive a code like the following one. Now this is blurred, but you will be able to see a code. And on the previous screen, if you want to recover the project later on, you can just paste the code that you previously got right here.
And once that is done, you will be able to access an anonymous account, as you can see over here, and you can just start creating the app. So let's get into it. We can visit fosdem23.appinventor.mit.edu and we can click on this link. This link is basically the code App Inventor instance, and we are loading a template project for the Peekaboo project. So I'm going to click on this one. I click on continue without an account and just wait a few seconds for the project to download from the repository, and here it is. That was faster than last night. I can see the code here, so I can just copy and paste the code to access this instance later on. And as this is a tutorial, we can see on the left side a description of what we are trying to build, with a detailed step-by-step guide. And this is the instance of the project. I can see the happy baby, and then the sad baby is hidden here. Okay, so let me just continue the presentation. The next step is training the classifier. I'm not going to get too deep into the machine learning and how it works. I'm just going to be providing a very high level overview of how this works. We will be using an image classification system that consists of creating two groups of images. We will be creating one group of images that is my face looking at the baby, and another group of images that is me hiding from the baby, so we can show the sad face. So how does this work? We can visit this website, classifier.appinventor.mit.edu, to train this model. Just as a side note, this instance only needs the internet to load once. The training of the model all happens in your browser, so no servers are involved and no images are transferred outside your desktop. And this website is also open source, so you can just check. There is the link on the FOSDEM23 website. So we visit the website and we start by creating the image groups.
So in this case we will be creating one image group for "me", where I'm looking at the baby, and one for "not me", where I'm hiding from the baby. If, for example, we are in a biology class and we want to classify trees, we would create one group of images for each kind of tree so we can recognize them later on. Then we turn on the camera to take photos of myself, selecting the group of images that I'm going to save each one to. So if I'm looking at the camera, it's going to be the me group. If I'm hiding, it's going to be the not me group. And once that is done, when we have a reasonable number of images for each group, which should be around 5 or 10 images per group, we can train the model. Again, this training happens on your computer, so no images are transferred outside of your computer. We can then test the model with new images to make sure that we have properly trained it and that it can identify us. And once that is done, we can export the model and load it into App Inventor. So I'm going to do that very quickly. Go to fosdem23.appinventor.mit.edu. I open the classifier instance. It's going to ask for permission to use the webcam. There it is, I just accepted it before. So the first step is creating the labels. First the me label, enter, and next the not me label. And well, the light is quite hard. I think it was... Let me take a few more images of me looking in different directions so I can train the model better. One more, one more. So seven images should be fine for this demo, so it doesn't take too much time. And now for the not me, I'm going to take one photo of me not being there, basically. I'm going to use my right hand to hide myself, so I can put it in front of my eyes, turn it upside down, diagonal, more images, the other hand as well, like that. So that should be enough for now. And once that is done, we can just hit the train button to train this model.
It will be built on your local machine without sending anything anywhere. Okay, this was faster last night. It seems the time that we saved loading the project, we lost here. So now it's training. Yeah, it's my laptop. So this is a React app that has been open sourced, and the only internet required is to get the initial webpage. Later on, we can just disconnect and it will work perfectly. It's just offline training. If you really want to run it fully offline, you can just launch the React app locally and it will work. So now the model is built, and to test it, I'm going to look at the camera as I did before. I capture the image and I can see that there is a 99.42% confidence that I'm looking at the camera. If I take a photo of myself hiding from it, there is a 99.33% confidence that I'm not looking at the baby. So once we have validated the model properly, we can export it to the app. And we will get this model.mdl file for App Inventor. So let's go back to the presentation. Once we have the model, it's time to code the app using blocks. I'm not going to go through the slides anymore. For people at home, if you have internet problems or the stream is down, feel free to follow the slides. It's a step-by-step guide. But here, I'm going to show the tutorial live. So let's go back to the project that we just loaded. Right here, we can see a quick description of the project and what we are going to do. There is a setup section on how to connect the MIT instance to the app. I'm going to show that at the end of the presentation. And we have here the Peekaboo example. This is the final result. This is one of the MIT curriculum developers, who made the original tutorial. And yeah, basically it says that we will be using the Personal Image Classifier extension that was developed by that research project at MIT. The first step is training the model. And then we have to upload the model.
To upload the model, we go to this section over here, the media section. We select the model file we just downloaded. It should be here. And now it's uploaded. It's in the assets of the app. And we can just change the model property of the Personal Image Classifier to the newly loaded model. And we have just loaded the model properly. To give an overview of how the app is going to look: there is this status label that will tell the user when the app is ready to work. Now it says loading, because that's the initial state. Later it will go to the ready state and it will just identify the faces. We have these two bars that show how confident we are, as a percentage, that we are looking at the baby or not. This is going to be the live image from the camera that I showed before. And these are the interaction buttons to start the classification and to toggle between the front and the back camera. And we have here the happy baby in this case. So, uploading the trained model. This is the sequence of events that I was just talking about. First we start the app. The app will switch to ready as soon as the classifier is ready to start working. The user will press the start button, and then the Personal Image Classifier extension will keep classifying the live video stream from the camera continuously. And once it has a result, if there is a higher confidence for the me group, we will show the smiley face; otherwise the baby will just start crying. So in this template, there is already a set of blocks available to speed up the process. And we can see here the "when Personal Image Classifier error" block. That means that if for any reason the Personal Image Classifier reports an error, maybe because something is missing on the phone or whatever, we will set the status label text to the actual error returned by the image classifier.
Once the image classifier is ready, we will enable the start button as well as the toggle camera button. We will set the status label text to ready, so the user knows that they can start using the app. And we will set the text boxes of each classification group to the previously defined labels, me and not me in this case. If the user presses the toggle camera button, we switch between the front and the back camera every time they press it, so we can use the front selfie camera or the normal back camera. And once the user presses the start button, if the Personal Image Classifier is already classifying, we will just stop it and show the button with the start text. Otherwise, we have to start the classification. To do so, we just invoke the start continuous classification method and we change the text to stop, because we are changing what the button does. And that's the quick overview of the code that is already available in the app. So how does the image classification work in MIT App Inventor? Well, we have this big block that runs when the Personal Image Classifier has successfully received a classification. We will receive this result variable. This result variable is a dictionary. To give a little high-level overview of what a dictionary is: it's a key-value list of elements. So if we have two different groups, me and not me, we will receive me equal to a specific value and not me equal to a specific value. If we have three groups, we have one, two, and three, each equal to a specific value. This is a little example of how it looks. So we have one key equal to this value, another key equal to this value, the next one equal to this value, etc. For the image classifier, as a specific example, we will have something like this: me with this value and not me with this other specific value.
So how can we retrieve a value? In the dictionaries block area, we have get value for a specific key. And we will be doing something similar to this. We have the original dictionary here. We are building it in this area, make a dictionary with me and not me. And we will be getting the value of the group that we want to use right now. In this case, it's the me example. If we want to take the not me, we just have to change this label to not me. And if we are using the wrong model, because the groups are not the same, we just return a zero, because we cannot classify that group. So let's get into it. By default, the tutorial provides these blocks that set a variable, the me confidence level, and we have to complete them. To do so, we take the get value for key in dictionary block. We join it to the me confidence block. We remove this, nice, get value for key. And from the text blocks we take an empty text block to attach right here. And we can type me. So we get the me group into the me confidence variable. The dictionary is the result, so we can just attach it here. And if it is not found, we will just return a zero value. For the not me confidence, it's basically the same. So we can copy and paste the blocks. We attach them to the not me confidence area. And we just have to prepend "not" in front of the "me". And now we have defined the me confidence variable, which can be accessed like that. It holds the confidence that we are looking at the baby. And the not me confidence is the opposite: it's how confident we are that we are not looking at the baby. Next step, using these variables. Now we just have to recap what we have to do in the app. In the app, we first have to update these labels here with the percentage. And we have to update these color bars with the correct confidence levels. We can do that by going to these components, to these two horizontal arrangements.
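For readers who think in text code rather than blocks, the lookup described above can be sketched in Python. This is only an analogy to the App Inventor dictionary blocks, not the extension's code; the numeric values and the `confidence` helper name are made up for illustration.

```python
# Illustrative values only; the real numbers come from the classifier at run time.
result = {"me": 0.9942, "not me": 0.0058}


def confidence(result: dict, label: str) -> float:
    """Return the confidence for `label`, or 0 if the model has no such group."""
    return result.get(label, 0)


me_confidence = confidence(result, "me")
not_me_confidence = confidence(result, "not me")

# A label the model was never trained on falls back to 0,
# like the "if not found" socket of the get-value-for-key block.
missing = confidence(result, "tree")
```

The fallback to zero is what keeps the app from breaking when a model with different group names is loaded.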
And we have percentage one, bar graph one, percentage two and bar graph two. For percentage one, we can update the text to the percentage that we are showing. One second. There it is. So the value that we get from the dictionary goes from zero to one, but we want to show a percentage, which goes from zero to 100. So we will take this me confidence value and multiply it by 100, so we get the zero to 100 range. We just join it right here with the math number and we multiply it by 100. But we will be missing the percent sign. To get the percent sign, we can use the join block from the text blocks. So we can join two texts together, and we just create a new text block with the percent symbol, like this. And this is for the percentage labels. For the bar graphs, we will be playing with the length of the actual bar. To do so, we have the width percent block, which can modify the width according to a percentage. And we have already defined the percentage right here, so we can just copy and paste these blocks and attach them to the width percent. And this is for the me group. For the not me group, we can copy and paste the percentage one blocks, change percentage one to percentage two, and change the me confidence value to the not me confidence value. And for the bar graph, it's going to be bar graph two right here, and me confidence changes to not me confidence. And with that, we already have the whole sequence of events for the label updates. We can just go to the next step and confirm that we have defined it correctly, which is the same result. The next step is the fancy image change: if we think that we are looking at the baby, we show the happy face, otherwise we show the crying face. We will be using if-then logic. So we go to the control blocks and we take the if-then-else block. We append it here.
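The label and bar updates described above can be sketched in Python as follows. The function names are hypothetical, and the rounding to whole percents is an assumption added here to keep the output tidy; the blocks in the talk simply multiply by 100 and join a percent sign.

```python
def percent_label(confidence: float) -> str:
    """Turn a 0..1 confidence into label text such as '50%'."""
    percent = confidence * 100          # 0..1 -> 0..100
    return f"{percent:.0f}" + "%"       # join the number with the percent sign


def bar_width_percent(confidence: float) -> int:
    """Width of the bar as a percentage of its container (0..100)."""
    return round(confidence * 100)


# Both labels and both bars are driven by the same two variables.
me_confidence, not_me_confidence = 0.75, 0.25
labels = (percent_label(me_confidence), percent_label(not_me_confidence))
widths = (bar_width_percent(me_confidence), bar_width_percent(not_me_confidence))
```

Computing the percentage once and reusing it for both the text and the bar mirrors the copy-and-paste of the same blocks in the tutorial.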
What we will do is compare the me confidence value to the not me confidence value to know if we are looking at the baby or not. We go to the math blocks, we pick this comparison block, attach it to the if statement, and we change the comparison to greater than or equal. We don't worry much about the equal case here. We take the me confidence variable again and put it here, and we take the not me confidence value and put it right here. Then we will be updating the background of the app, which is available on Screen1. We take the background color block, we attach it here, and the tutorial already provides the example colors. So I'm just going to drag these right below so I have them more easily accessible. And I can just join it here. For the baby images, we have two images available here, happy baby and sad baby. So if we think that we are looking at the baby, we show the happy baby. So we use the visible block. And, sorry, if we think that we're looking at the baby, we also hide the sad baby face. We go to the logic blocks, we take the true block so we can set visible to true, and we set visible to false for the sad baby, like that. And we just join it. That is for the case of me confidence being higher than not me confidence. For the opposite case, when we are not looking at the baby, we just change the background color to this pink color. We hide the happy baby face like that. And we show the sad baby face just like this. And now the app is finished. Here we can check the final code, which is exactly the same as what we have right here. There are other possibilities; for example, we could implement a classifier using a different person. But to show how this works, we can use the MIT AI2 Companion app that is available on the Play Store. Let me just show my phone again. Here it is.
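As a rough text equivalent of the if-then-else blocks just described, here is a small Python sketch. The `Screen` class and the color names are placeholders invented for illustration; only the comparison and the show/hide logic follow the talk.

```python
from dataclasses import dataclass


@dataclass
class Screen:
    background: str
    happy_baby_visible: bool
    sad_baby_visible: bool


def update_screen(me_confidence: float, not_me_confidence: float) -> Screen:
    """Pick background and baby image from the two confidence values."""
    if me_confidence >= not_me_confidence:
        # We think the user is looking at the baby: show the happy face.
        return Screen("happy-color", happy_baby_visible=True, sad_baby_visible=False)
    # Otherwise the user is hiding: pink background and the crying face.
    return Screen("pink", happy_baby_visible=False, sad_baby_visible=True)
```

Exactly one of the two images is visible in either branch, which is why each branch sets both visibility flags.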
So you can just go to the Play Store, search for MIT App Inventor, and you have the companion right here. You can open it. Yeah, you can just ignore this warning. It works without Wi-Fi. Continue without Wi-Fi. And over here you can connect the AI Companion. And now I can scan the QR code like this. I'm sorry, it just disappeared. It takes a few seconds to connect. Let's see if it is faster than last night. Now it's loading the extension, the Personal Image Classifier extension, onto my phone. This works over Wi-Fi or the mobile network. It doesn't have to be connected by cable. It's only connected because I'm mirroring the screen through that cable. And I see here the layout of the app. I can see that it shows ready. So I can just toggle the camera to the front one. And I look at the camera and I'm just going to press start. And there is a higher confidence that I'm looking at the baby. I just put a hand in front of my face. It's just crying. And yeah, that's it. Later, if you want to build any other apps, you can export them to APK files, so you can install them on your phone, or to Android App Bundles if you want to distribute them through the Play Store. But yeah, this is just a very high-level introduction to artificial intelligence in App Inventor. You can build any kind of classifier, for example to classify trees, flowers, even people. For example, for a faculty, if you want to build an app that recognizes the people in your class as a game, you can just use App Inventor and build any kind of app. Thank you so much, and I hope it was useful for everybody. Thank you. Any questions? Do you mentor a Technovation team? No. Sorry, I'm a software engineer. I just contribute to App Inventor. I started by building apps and then I transitioned to open source. I participated in Google Summer of Code; this option to export Android App Bundles was my project in 2020 for Google Summer of Code. But yeah, I'm more technical than actually teaching kids.
Any other questions? What's your experience with the relation between the number of pictures you have to submit to your classifier and your accuracy? That's a very good question. So, to repeat the question, he was asking what the experience is with the number of images that we use for the classifier. I haven't really tested it right now. But we have seen that if we go higher than 10 images for each class, for each group of images, we get really good results. In this case, because I was training very fast and using just a few images, you could see that the confidence levels were a little bit like 80, 20. But if we provide more than 10 images for each class, we should be able to get over 90, 95% confidence for each one. I'm not sure if there are any questions from the chat. Let me just check. What do you capture? Is that just for your face, or what was actually learned? It depends on what you are training, because in this case we are just providing very specific gestures. It's trained on my face, like any face looking at the camera, or a hand in front of the face. By default, the model that is available here was trained by Selim. He's the example guy at the beginning of the tutorial. And I tested it last night, and it worked with me, because it recognizes the gestures, not the faces. If instead we train it to recognize people, we will all be looking at the camera in the same way, so it will go for specific, how do you say, facial features. In this case, it will work. You can just try if you want. We can try. Yeah, it should work. It's going to be a little bit tough, but Mark, can you just try with Mark, for example? Toggle camera. Toggle camera. Yeah, and press start. It's a happy face, and if you put a hand in front, it's a sad face. It's recognizing the gestures.
So can you also train it to recognize specific people? Yeah, it can be trained, but in this case, because the biggest difference was the hand, the model is just looking for the hand. But if you don't show the hand, it will look for faces. Yeah, it can be a fit, because you can just use this website and train any kind of model. The only restriction is that it has to be an .mdl file, but yeah, it can classify basically anything. No problem. Any other questions? Well, I think we can leave it here. Thank you so much. |
Hedy: A gradual and multi-lingual programming language for education |
Okay, 10:20. Welcome everyone. Everyone can hear me, right? Good. Welcome to my talk about Hedy. But first of all, I promised to wave to my wife. She's watching at home. Can you all say hi, Susan? She never thought this would ever happen, being talked to from the biggest open-source conference in the world. She's on the couch with a broken ankle. Okay, my name is Mark Giesen. I'm a lecturer in IT at an applied university in the Netherlands. I'm doing, oh well, I had this wonderful new kitchen. And I'm doing this talk about Hedy. Hedy is a gradual, multilingual programming language for teaching. Unlike all the other languages here, which are for learning, I think, this one is specifically for teaching. And it's multilingual. And why is it here? It was invented and mostly built by Felienne Hermans, a professor in the Netherlands. Her research is mainly about how we learn programming. She wrote a book about it. You should definitely read it. What was it? The Programmer's Brain. You really become a better programmer just by reading that. But how did she come to this? Somewhere in the past, for her it was 2013, for me 2019, there was a group of children that wanted to learn programming. And, well, we program. So we said, I'll teach it. That's pretty easy. We started to think, how do you teach programming? How were we taught programming? Well, we weren't. We just sat in front of a computer with a cursor. And in my case, I started typing hexadecimal code. I copied it from Nibble or Byte. Those are magazines. Some of you might know them. Probably not. I see some nodding, but okay. And this would result in one of three things. It either worked; those moments were pretty sparse. Or it wouldn't work. Or it said beep. Those were the three options I had. Felienne was a little luckier. A few years later, I was a little luckier too. We did not have a teacher, but we had a book. In Felienne's case, a book on BASIC programming. And once you get to this, well, you can read it.
You can read it. It's plain English. To us programmers, this is plain English. It even has line numbers. So if there is an error, it actually tells you it's on line number 120. And it even puts out some text telling you what's wrong. Wow. Compilers are perfect teachers. That's what we thought. This is how we learned. We never realized that 90% or 99% of people didn't think this. And they quit programming. They never got any further than beep. So most coding dojos or code clubs start by using Scratch. It's pretty easy. It takes a lot of the syntax away by using block programming. You can actually ask children to build this. And some of them will say, oh yeah. And some of them will say, when they're 14 or 15, Scratch is for kids. If I Google code, it never looks like this. I want to do the real stuff. I want to do simsalabim. So Felienne started teaching code with this sort of Python. And she said, the first thing I'd like them to learn is input and output. She is a teacher. She knows how to teach in baby steps. The first thing is input and output. Let's use some output. Print, hello, everyone. Enter, and voila. Hello, everyone. Children really like this. It's easy. Even when there's some red squiggle below, they don't care. They still press enter. And this lovely teacher tells them some mumbo jumbo and then, on line five, name 'Print' is not defined. And the children will ask, well, I did define print. If they understand English, that is, and mine don't. They're eight years old and up. They don't know what it is about. And they don't notice that there's a capital P in there. This one's even worse. It's missing a parenthesis at the end. And your lovely compiler will tell you some empty lines and then unexpected EOF while parsing. This eight-year-old kid asked me, what is parsing? Try explaining parsing while print is not even in their minds yet. It's not that easy. There's too much going on, too much interference with learning.
This is not a way of learning. And there's an even better one, since we're using Python. Did anybody see it shift a little? There's a space in there. Well, Python knows it's not supposed to be there. Unexpected indent. Once I was finished explaining parsing, this was my next problem. So compilers are lovely teachers. Okay, that was true for me. That was true for some of you. For the other 99%, who didn't make it to this conference, it was not a lovely teacher. The next phase, if I get through input and output, is iteration, repetition. For i in range four, print i. Well, I can explain that, I think. I have a whole hour. Classes are an hour. Just one hour to explain repetition. Well, the kids only see colons, brackets and spaces. And they jumble them all together. I can give them an example and they can press run. And they say, okay, okay, but they can never reproduce this, not in an hour at least, not my kids. Maybe some others can, but not in my class. Syntax creates too much cognitive load. They have to remember too many things to actually get to the stuff I want them to learn. This is not unique to programming. This is true for any language. It's true for math. How do we do it there? Well, we just start with small bits. If you start writing, you write an a and an i and an n and an e. And we would say to the kid that produced this, wow, that's a lovely a. Oh, and you already made a word, "in". The letters are not separated much, but that's good. And that e, it's wonderful. It's even nicely on the line. Three compliments, four compliments, and a lot of learning. The next thing you know, they can create words: cat, in, tree. And once they grasp that, we make it a little harder. We start a sentence with a capital letter. And the capital S or T is completely different from the lowercase one. So they have to learn a completely new set of letters. The C looks alike, but some of them are very different.
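The three beginner mistakes mentioned so far (a capital P, a missing parenthesis, a stray leading space) can be reproduced with a short Python sketch. The `error_kind` helper is hypothetical and only reports the exception type, because the exact message wording differs between Python versions.

```python
def error_kind(source: str) -> str:
    """Return the name of the exception Python raises for `source`, or 'ok'."""
    try:
        code = compile(source, "<lesson>", "exec")  # syntax problems surface here
        exec(code, {})                              # name problems surface here
    except Exception as exc:
        return type(exc).__name__
    return "ok"


capital_p = error_kind('Print("hello, everyone")')      # unknown name
missing_paren = error_kind('print("hello, everyone"')   # never-closed parenthesis
leading_space = error_kind(' print("hello, everyone")') # unexpected indent
correct = error_kind('print("hello, everyone")')
```

Three different typos produce three different error categories, each with its own vocabulary, which is the cognitive load the talk is pointing at.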
And in some languages, almost all of them are different. After that, we teach them, okay, we start with a capital, we end with a period. And once they know a sentence ends with a period, they can actually spread sentences over lines. This is maybe sixth grade or something. I'm not sure where exactly we are now, but it's gradually changing. The rules are changing. It's not bad to change the rules every now and then. It's good in this case. Same for math. If you have five apples and I take three, how many do you have left? What, two? They know that. If you have three and I try to take five, how many do you have left? Zero. Yeah. Only years later you'll learn that there is a possibility you have minus two apples. Happy appetite. Dividing, same thing: eight divided by three, that's two. And then there are two remaining. I cannot divide those remaining two by three anymore. And then later it's two and two thirds, and then it's 2.6666. And that takes a long time. Can we actually teach code gradually? Well, yes, we can. Now we can, by using Hedy. This is Hedy. This is a programming environment. And the first line of code you can actually run is print hello world. And it works. And you see there are no quotation marks, no brackets, no nothing. And there are only three commands. There's print for output. And then there's ask and echo. Ask is for input, and echo is another way of producing output. It produces the text behind it, followed by the input. So if I run this: what is your name? And my name is Mark. And it says hello, Mark. This is fun already. And if it's not fun enough, they can actually have it spoken out loud. What is your name? Hello, Mark. And imagine a classroom full of 25 kids producing things like this. Within a few seconds you're hearing Mark is stupid. And they're having a lot of fun the whole hour. With only three commands, they're having fun.
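To make the three level-one commands concrete, here is a toy interpreter sketched in Python. It is not how Hedy is implemented; the function name, the list of canned answers and the wording of the fallback message are all assumptions made for this illustration.

```python
def run_level_one(program: str, answers: list) -> list:
    """Run a tiny print/ask/echo program and return the lines it would show."""
    output = []
    pending = iter(answers)   # canned answers stand in for keyboard input
    last_answer = ""
    for line in program.splitlines():
        command, _, text = line.strip().partition(" ")
        if command == "print":
            output.append(text)                    # no quotes or brackets needed
        elif command == "ask":
            output.append(text)                    # show the question
            last_answer = next(pending, "")        # remember the answer
        elif command == "echo":
            output.append(f"{text} {last_answer}".strip())  # text, then the input
        elif command:
            output.append(f"{command} is not a level one command")
    return output


demo = run_level_one(
    "print hello world\nask what is your name?\necho hello",
    ["Mark"],
)
```

With only three commands and no punctuation to get wrong, the whole language fits in one small loop, which is the point of starting at level one.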
As soon as they stop laughing, or usually the next hour, the next lesson, a week later, we introduce variables. This is probably the toughest part of programming, to kids at least. At least that's what I found. You can label things. Name is Hedy, age is 15, print name is age years old. I can run this code and it produces Hedy is 15 years old. They can play around with this. They can actually create pretty nice programs. They're busy for another hour, sometimes two, until they hit a snag. Sometimes some of the kids find out that if I ask what is your name, and they want to produce "so your name is name", well, most programmers will probably realize now we have a problem. There's one name that I want to produce as text and there's one name that I want as a variable. I can never, ever use the text anymore. This is a learning opportunity. We can actually say, okay, we have to make a distinction between these. If I mean text, I put quotation marks around it. Ah, okay, that's a smart thing. It's like they invented it themselves. Let them think so. So it's gradual, it's 18 levels, and at the end you're speaking fluent Python. That's the gradual part. It's also multilingual. Why? Well, we asked kids, do you like Hedy? There was a small study. There's a paper about it. It's in the slide deck at the end. There were only 39 kids in 12 online lessons. This was Corona time. We asked them, what are the benefits, the challenges and the improvements? These were the kids in Felienne's classes. And they said, well, it's great. It's stepwise. It's level by level and I can follow along. The teachers that actually work with it are usually regular teachers and not programmers. So they are teaching programming for the first time. They don't even know programming. And they liked it. And they said, oh, well, all the kids, my whole class is learning programming. Of course, there are some kids where you think, you're going to be a programmer, but the rest are still keeping up.
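Since the levels end in Python, the variable example and the text-versus-variable distinction can be shown in plain Python. This is a sketch of the idea, not output generated by Hedy; the variable names beyond `name` and `age` are invented for illustration.

```python
name = "Hedy"
age = 15

# Bare words are variables; text goes inside quotation marks.
sentence = f"{name} is {age} years old"

# The ambiguity from the talk: the word "name" as text versus the variable name.
as_text = "so your name is name"            # every word is literal text
with_variable = "so your name is " + name   # the last word is the variable
```

Quotation marks are the rule the children "invent" once a word has to appear both as text and as a variable in the same line.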
And some of the teachers said, even the girls now have the confidence to become a programmer. It's something they mentioned of their own accord. What do they want? They want some improvements. First of all, they want better error messages. And we thought we had very good error messages. I'll show you one. You cannot read this, but it says: you typed comma, but that is not allowed. And we put the comma here as a comma character between quotes. Remember, these are 10-year-olds. And they just learned that if there's a comma in a sentence, they pause. So they read: you typed, but that's not allowed. Why can't I type? Okay, so we had to change that into: you typed a comma, but that's not allowed. And almost all of those well-thought-out error messages weren't well thought out after all. Well, now they are. And they wanted something really weird. They want Dutch keywords. Well, we're in the Netherlands, so it's not weird that they want Dutch, but why? Everybody's coding in English. Print is the same in Dutch, echo is the same. But why? Let's look at the demo. Let's see a nice error message. Print, hello world, with a capital P. It says: Print is not a command in level one. Did you mean print? The first one has a capital P, and the last one a small p. And maybe it gets even better. If I put a space in front of it, it runs. But it gives me a warning. Oops, you started a line with a space on line one. Spaces confuse computers. Can you remove it? It's level one, eh? There's plenty of time to correct these ideas. Yeah, yeah, I should change that. And then there is translation. I'm in English now. I can move to 45 languages, like Dutch. Yes, I want to reload. And now this whole program, every text, almost everything is in Dutch, except for ask. And ask was, by the way, the main reason they wanted it in Dutch, because in Dutch we don't usually have the s and the k so near each other. And so they write x, and it's a weird letter combination.
There is a toggle where you can say, oh well, I want to change the commands to Dutch as well. So now everything's in Dutch. For those who do not speak Dutch: that's the same as what's your name. Well, it's still Mark. Hello, Mark. So it works even in Dutch. And once we had this, we realized that people were using it in India, with Hindi, and with Thai, Chinese, Arabic. Those kids are just like the ones in the Netherlands. They're just learning a language. They don't know a P if they see one. To us, to me, these are images. These are not letters. I cannot read this. I can copy and paste some of this, but I cannot read it. The same is true for them. They cannot read these letters. So they should be able to put them into Arabic as well. And this works, of course, I hope. I cannot check. Maybe I'm cursing now, but I guess it works. So that's why it's multilingual. It's for the kids. And yes, at the end, in level 18, we do tell them, you'd better put it in English now. You know what it all means. From now on, no more Dutch or Arabic or whatever. But they're old enough to do that. It's built for teaching. Actually, it's built for teachers as well as students. They like that the levels are a step-by-step guide. There's only one small thing they have to learn, and the rest is not even possible. The teachers that do not know programming, but are teaching programming using Hedy, think, oh, I can do this. This is not overwhelming. And well, when I was using Scratch for the first time in a CoderDojo, I had to think of a program they could make. What can my students make? There are a lot of examples on the Internet. I need to do research. I need to find the right thing. And if I'm planning a 20-day course, oh, somewhere in level 18, I'm probably going to mess up something I did in level 17. It's going to cost me time to get things right. I can find this one or that one or that one.
And if I'm not a real programmer, if I'm not very good at it myself, which one should I take? And if I choose one, I print it out and hand it over to my students, and they open up Scratch, and they start looking at Scratch, and then at the lesson again, the paper, and then at Scratch again. And what was that? They have to context switch over and over again. I see the students getting a little distracted. I don't remember. I don't remember. What was it? And they're a very impatient species, children. So we invented adventures. I'll show you in the last demo, because time's ticking away. And then there is class management. That's the last thing I want to talk about. In class management, you have very different types of kids in the class. There's the type that actually reads the paper, reproduces it in Scratch, and tells the teacher: look, I did it. They just want a pat on the back. Yeah, it's right. In five minutes, they produced a meowing cat that runs around. It's lovely. They did it. But then the other type, and there are quite a few of them, dragged just one thing onto the screen, and what do they do now? Well, you could read the paper again. Don't get it. And then there's this type. It's probably the level of Olivier. They produce something they want recognition for. And they did this in five minutes as well, because they've been doing this for years at home. And now they want my help to get some synchronized legs at the bottom of this Robin Hood. And I don't even know how to do that. But now at least I can call Olivier to help me out with this block programming. But all these kids are in the same class. I want them to move along at about the same page. What was it? Pace, that's it. So we created something we call customization. First of all, let's put this back into English. And let's see me in action. This is my account. And as you may have seen before, there are some tabs at the top of this. And this is what we call adventures.
There are maybe 10, 15 different adventures. And they come along at every level, 18 levels long. They can do rock, paper, scissors. At first, it's just printing. And at the end, it's deciding who won, playing against the computer, three players against each other, etc. So it's getting progressively more difficult. But it's still the same program. So as a teacher, I only have to know 10 programs. And it's all in view. I can just click on it, I want to do the story, and I code it here. So for the kids, there is no context switching to paper or anything. Then there's the teacher side. Oh, teacher. Is this the teacher version? Where is the teacher version? I should be logged in in one of them. No, I'm not logged in in any of them now. Why am I no longer logged in? I don't know my password, because that's very secure. So I need LastPass. I can't solve everything. Oh, well. How many minutes are left? Okay, that will work. What am I doing now? Yes. So this is where I was. I thought I was at my page. If you're a teacher, you can just request to become a teacher. Even if you're only teaching your own two kids, you can actually become a teacher. And then there's this "for teachers" page, and there's a complete teacher manual. And there's a demo class in which I am a student. I can look at some statistics. How many errors did I produce? What level am I at? What kind of errors did I produce? How many successful programs? How many programs with errors? And I can customize the class. I can switch all the levels on and off. And what I do in real life is switch them all off except for the first one, so the whole class has to stay in lesson one, in level one. I can automatically produce a schedule and put in opening dates, so levels automatically open every week, for instance. I can hide and show quizzes and puzzles. There's a Parsons puzzle at the end of all the adventures. And there's a quiz at the end, 10 questions.
And I can set this class to a minimum quiz score of 80 before they can advance to the next level. So if I save this and go to the other one. Oh, wow. Now it accidentally logged in. I think it's because I have two of the same browsers now. Mark and class. This one's not a secret. If I go to this now, you can see there's a lot less. Only two. So this class only has two adventures. It has a puzzle and a quiz. And they cannot go to level two unless they completed the quiz with at least 80 percent of the points. Okay, that's customization. That's about it. Hedy: it's named after Hedy Lamarr. Who knows Hedy Lamarr? That's a lot of you. A world-famous actress and a far less famous inventor. We're still using a lot of her inventions. And of course, we're open source. We'd love your help. Maybe you think, ah, this is a good idea, and I'd like to work on an innovative gradual parser. It's actually quite a feat. There's a paper about it, and it tells you a little bit about an EBNF extension that lets us merge partial grammars. We also want help with translations. Like I said, we have 45 languages. And these are the 45 languages. As you can see, English is doing pretty well and Dutch is doing pretty well. Some other languages are blue. We'd love help making all of them green. And of course, we want you to help teachers. So go to the schools in your neighborhood, tell teachers about Hedy, and maybe help them get started. Once they get started, it's like oil, it spreads. I'm only teaching 40 kids now. If I was teaching 40 teachers, that would go a lot faster. There are some building videos for teachers. And this is where you can join. Thank you very much. |
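The class customization described in the talk (levels that open automatically every week, plus a minimum quiz score of 80 before a student may advance) amounts to a small gating rule. The sketch below is a hypothetical illustration of that rule, not Hedy's real code: the function names, the dictionary layout, and the assumption that the first scheduled level needs no quiz are all made up for the example.

```python
from datetime import date, timedelta


def weekly_schedule(first_opening, levels):
    """Open one level per week, starting at first_opening."""
    return {
        level: first_opening + timedelta(weeks=i)
        for i, level in enumerate(levels)
    }


def can_enter(level, today, opening_dates, quiz_scores, minimum_score=80):
    """Decide whether a student may open a level today."""
    opens_on = opening_dates.get(level)
    if opens_on is None or today < opens_on:
        return False  # the level is switched off or has not opened yet
    previous = level - 1
    if previous not in opening_dates:
        return True  # first level of the class: there is no quiz to pass
    # Otherwise the quiz of the previous level must reach the minimum score.
    return quiz_scores.get(previous, 0) >= minimum_score
```

For example, with `schedule = weekly_schedule(date(2023, 2, 6), [1, 2, 3])`, a student who scored 70 on the level 1 quiz gets `can_enter(2, date(2023, 2, 13), schedule, {1: 70}) == False`, and the same call with a score of 80 returns `True`.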
MicroBlocks: small, fast, human friendly |
Let's just fetch MicroBlocks here. This happened to me before. Why did that go away? Let me just quit MicroBlocks a second. Oh, there it goes. No, I don't want to quit. Okay, so the cable wasn't quite plugged in, and for some reason I lost my display. So, as I said, basic sensors, just like that Hedy language. Actually, let me just quit MicroBlocks really quick; this cable is being a little flaky. No, no, no. Okay, so this is what it looks like. You have outputs, you have inputs, and then you can do all the pins, and you have your control, variables, operators, all sorts of stuff like that. I plugged in this board, and it showed green here. It's already connected, so it already knows what this board is, and when I click temperature, it tells me it's 19 degrees C. If I want to see the display, I simply turn on the display. Let's see, I need this quick camera. So for those of you at home, there's my happy face, and then I click temperature, and there it is. I keep losing this connection. Okay, all right, every time the HDMI pops in and out, it's causing MicroBlocks to be unhappy. Sorry about that. What's that? Okay, all right, let's just try this one more time and load MicroBlocks. Well, it's very flaky. Oh, I lost it. Let me see what happened to my power. So HDMI is lost; don't touch anything. Yeah, we might have a few screen issues here. Let me just try really quickly if I can quit and open again. Okay, quite briefly, I wanted to show you that you can graph something like the light level, and you can also see the data of an input value. Let's put zero at the bottom, and I can watch this in real time, and if I close my hand over it, it gets reflections and stuff. So you can see you can plot data in real time: x, y, and z, since it has an accelerometer. So that's very cool. All right, now we're going to go back. So that's liveness: there's no compile and download, it's just happening. The next point
is that you can do things in parallel. So I'm going to show you quickly how multitasking is handled, just by opening the two-button blink program right here. I open this, and, let me get my camera, okay, while I'm pressing button A, it's going to repeat and blink every 100 milliseconds. While I press B, it does the same thing. I can actually press A and B, and I can try to get them in sync, or I can try to get them alternating on and off, but it takes too long. So, just as Hedy has variables, I can do something like a delay variable, so I can play around with it in real time. I set the delay to, like, 500, whoops, and then I drag this delay variable in, so I can play around with a slower blink time and change it all I want while I'm experimenting. And when I set that delay variable, you can also use the say block, so it's nice to just set a variable and say it, and now I know what it is. And if I'm pressing A now, you see it's every 500, and now I can more easily get them on, off, on, off. If you were in an Arduino loop, just one continue-forever loop, this would be a lot harder. Here you have this multitasking done for you, which is very, very cool. The next thing is that it's autonomous. So if I am programming and I'm doing these blocks in my IDE, then what happens is there's actually an opcode representation of those blocks, which looks a bit like assembly language, that you can see, and then in the actual board it's actually bytecode. So if I were to turn on advanced blocks, I could show you the instructions, and then I could actually see the bytecode, and the bytecode is then what is inside the virtual machine. That will have cool ramifications for sharing our files when we get to that point. Okay, so blocks equate to bytecode. So it's autonomous in the sense that if I unplug it from my laptop and I plug it back in, then, what happened? Did I lose my
connection? Okay. Oh, I know what happened. For autonomy, I made a mistake: I actually need a when started block to set that delay, because otherwise there was a delay of zero, right? So I actually needed this to be set. Okay, we have to show our errors. And now down here, so there we go. And I love having a co-speaker. Okay, and then another very awesome aspect of it is its portability. So with MicroBlocks, if you look at how we build the virtual machine, we have a PlatformIO script with an ini file that has, I looked last night, up to 43 different boards. And there are about nine varieties that it will show you if you go into MicroBlocks itself and want to update the firmware on a board. Well, this one is plugged in now, so I'm just updating the firmware on it, but let me just unplug this for a moment. If you want to update the firmware on a board, there's the micro:bit, Calliope mini, the ESP32s, there's some new Elecfreaks boards, the Raspberry Pi Picos, and Adafruit has a whole bunch of boards. But there's also, in the PlatformIO ini file, a whole bunch of Arduino boards and other stuff. So it's portable, and I want to do a quick display of that, where I take this quick cam, I'm going to make it big, and we apply power. So I have a whole bunch of boards plugged into a USB strip. Let's hope we don't lose the HDMI. Bernat's going to turn that on, and now you can see I have the heartbeat program running on a whole bunch of different boards. I kind of cheated on a couple of boards that are new and didn't have full support, like this Mbits board, and of course the Circuit Playground doesn't have the same display, but on some of these OLED displays you can actually use the micro:bit-style display blocks and it still works. And then we go around and push the A button on everything, so the A buttons turn them all to smiles, and, whoops, dang, so now they're all happy faces. And of course if they have B buttons, we can go back to our heart,
and this is the only one that has a side B button. Voila. So, as you know, getting used to a programming environment takes time. When you pick up a different board, do you want to get a different programming environment every time? No, clearly. So portability is really cool. Okay, so did I lose it completely? All right, I'm about to hand this off, so that's good. All right, it's here. Okay, and then the last part to explain is shareability. There are a few ways to deal with this. Let me go into slideshow mode, because I'm basically wrapped up. One is that you can go to the file open and save menu in MicroBlocks. Let me just show you in real time here. So you can just go in here and say file, save, or file, open. And then there are these other options, like encoded in the URL. So you can say file, copy project URL to clipboard, and then you give that URL out. We do this, you know: you just put a hyperlink in some documentation, and you can open that right in the Chrome browser. Chrome or Edge browsers support the serial mode; you can't, unfortunately, use Firefox. I used to work for Mozilla. Okay, and then encoded in the picture. So in our documentation we have pictures on the website, in the wiki, in the Learn section, and the code is actually embedded in the picture. So you drag the picture into your IDE, standalone or in the Chrome browser, and it will actually load the code. And then the last part is that it's shareable by opening from the board. And so I'm going to have Bernat demonstrate that as we hand off to more cool demos for part two. Okay, can you hear me? Yes, of course. Again. Okay, so, yeah, we were talking about portability, and one of the aspects of portability is not just that the code is portable across boards, but the board itself is an example of portability. The board actually contains the program that you are seeing here, right? In
any other microcontroller environment, once the code is here, the code is here, and it's gone. If you've lost your program, if you've lost your source, the source is gone, right? In MicroBlocks we have a way, oops, not this mic, yeah, this MicroBlocks, sorry, that's open from board. Now I'm in the browser version. We haven't mentioned that, but we have versions for the browser and for Mac, Windows, Linux, Chromebook, Raspberry Pi, etc., so we have many platforms that we support. This is the browser version, and now it says plug in the board and click the USB icon to connect. So I'll just connect as it said. Now I select the USB port, and now it's actually reading back the code. And it's not like we embedded the blocks inside the board. I know this is a slightly geekier audience than our usual, so: we actually have a decompiler that John built in, inspired by the Squeak decompiler that John also worked on, and what it does is take these bytecodes and retranslate them into what we see here, right? And just to prove that it still works: it does. And you know, we like to joke that MicroBlocks is so portable that we could even port it to a board that does not exist. That's actually a joke, but it's true: we have a board that doesn't exist and can run MicroBlocks. Since MicroBlocks is VM-based, you could compile this VM for something that does not exist, that is virtual, and that's why we made Boardie. Boardie is a result of the pandemic. We were doing online classes, and it was really hard to get hardware to kids. So that was an idea from Christiane Bowie, my boss, and Jens and Jadka's boss at SAP: she got this idea that we needed something virtual, so that kids who don't have access to hardware could still at least try MicroBlocks, and that's why we made Boardie. And Boardie, as you can see, can run the same code. It's not a simulator, that's very important. It's not a simulator; it's running the exact same
VM. So it's a virtual board, not a simulator. It has its own capabilities: it has a couple of buttons, it has a speaker, it has a touch screen, it has a file system, and it can do HTTP client operations. So it's a different sort of board. Okay, yeah, right, so it does the same things. Okay, but you know, this is nice, but MicroBlocks was always about physical computing. This is nice if you don't have a board, and it's a good way to get started, but our aim was always to teach physical computing, to do tangible things in the real world, in the physical world. So let me show you one thing that you would not be able to do with Boardie, and that's connecting external sensors to your board. So there's this funny sensor that I have in here. That's an RFID sensor, which is technical mumbo jumbo for the thing that's in your credit card that lets you pay contactlessly, or in your subway card or your gym membership card or whatever. And I happen to have some of these cards with me, as you all do, I'm sure. So, for example, yeah, and maybe I need, yeah, I need a cable. Oops, it's physical. So this board has a battery, but it's dead. Those are the problems of the physical world as well. So I'm connecting an external battery to it. Okay, and now I'm just going to try. This has a MicroBlocks program in it, by the way, that I can show later. So, okay, I made it so it recognizes this particular card and plays a tone. Okay, that's interesting. Let's try another card. Okay, let's try this one. Okay, okay, okay, cool. I'm missing some. No, wait, yes, we'll talk later, I'll need a special number that comes with it as well. Let's try with my bank ones, and this one as well. Okay, so we have a bunch of cards, and we've seen that each of them can make a note. Let's try to maybe, oh yeah, where is it? I know it's somewhere. Oh, maybe, yeah, you know, when you're having trouble paying, that's why, because it's... okay. Okay, I'll try to play a song. Where is it? There's a second part. Kathy, can you hand me your keys? Okay, so that was to show that
programming is fun, but programming the real world and touching actual things is a very engaging way to get kids, and people who are not hardcore geeks, interested in what we do with these programming computers. That was all. Oh yeah, go to the MicroBlocks website if you want to learn more about it. Remember, microblocks.fun: small, fast, human friendly. That's our website. We have a very nice, if I may say so, Learn page with a lot of tutorials. MicroBlocks is also translated into a lot of languages, I forgot how many. The code is also translated, just like Hedy, and we need more translators, just like Hedy. Yeah, and if you want to help out, on our site there is a whole section about how to contribute, and we have a space for everyone who wants to help out. Thank you. Do we have time for questions? Just maybe one or two, right? Any questions? Is it expensive? I was getting asked that a lot at the booth yesterday, because we were doing some demos in the SFC booth. And the range, if I can pull this up: how much was this? Maybe 25 dollars. M5Stack, probably even less than 10 for the M5 Atom. The micro:bits were retailing at 15 until the pandemic supply chain shortage; now they're 20 or more. The Mbits, what was this one? Now 12 dollars 25. The pico edge is maybe 10. This is maybe 25. So I would say they range, you know, to less than 50 dollars. And then you can buy some of these educational boards, let me just show you, like some of the small boards with stuff already included, so you don't have to learn. But the new robot aston.com pico ed board, this is like 50 dollars, but it has all these sensors and actuators, and then you can pull apart the bricks, as they're called, and use cables to put them back together. I want to say, Boardie is free though. Yeah, yeah, it all ranges from zero up. And the cheapest, the ESP8266s, are a couple of dollars, but then you have to buy the sensors and actuators. I actually find for teaching you probably want to buy something with integrated stuff, and
then you can buy all these kits and plug them together and run robots and the robotic kits. There's a ton of hardware out there. Next question. So first of all, in the Learn site, you're going to find some resources that are like full classes, like, let's say, this one that teaches you about Maya numerals. We also try to do activities that are not just about technology, right, so we can get a more diverse audience interested. If you make a project about bits and bytes, that's going to interest probably the people in this room, but you're already interested; you're not the target audience, right? But if we talk about Maya numerals, maybe people who are interested in history or culture are going to see the value in programming and microcontrollers. So that's a whole activity, and then you can actually drag the screenshot pictures into the IDE and they will load, and then you have a teacher's guide with extra information about what's being talked about. And then we've done these things called activity cards. I put together a kit with a manufacturer, and they included these activity cards, 10 two-sided cards, in the kit. And there's things like flashlight tag, and sound, and two-button texting: it uses two micro:bits, and you use tilt to find the letters and punctuation, and buttons A and B to pick the letters and add them, and you can actually text messages directly between two micro:bits, for example, or Clues or other boards. What do you mean, can you code? We don't want people to have to go through this hurdle. That's why we're making MicroBlocks, so you don't have to care about bits and bytecodes. No, this is coding. Yeah, you can see the opcodes if you want, and you can build other editors for the VM. If you want a text-based editor on top, go ahead. I'm wondering, I'm familiar a bit with the micro:bits, and what you do is you add sensors, or you put them in little overcards, I think, like that
could? Yes, yes. If you go to the Learn site again, you can select the micro:bit here and you'll see all these activities for the micro:bit, and a lot of them use external sensors. Like, this one uses motors to make a micro:bit robot, this one uses a ready-made robot car. A lot of them use external sensors and actuators. So, for example, you ask, I deliver. There are just five commands that are sent over the radio: forward, back, left, right, and stop. Every time the buttons go up, I stop. And this one's running out of battery, so the other one's faster. Oh, we're over time. Okay, thank you, everyone. |
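The radio-controlled robot at the end of the demo uses just five commands, with "stop" sent whenever the buttons are released. A rough sketch of that protocol in Python follows. It is not MicroBlocks code: the button-to-command mapping, the motor speed values, and the function names are all hypothetical, chosen only to illustrate the idea of a tiny command vocabulary with a safe default.

```python
# Hypothetical (left wheel, right wheel) speeds for each of the five commands.
MOTOR_SPEEDS = {
    "forward": (100, 100),
    "back": (-100, -100),
    "left": (-50, 50),
    "right": (50, -50),
    "stop": (0, 0),
}


def command_for(pressed_button, button_map):
    """Sender side: a held button maps to its command; no button means stop."""
    if pressed_button is None:
        return "stop"
    return button_map.get(pressed_button, "stop")


def handle_message(message):
    """Receiver side: anything that is not a known command stops the robot."""
    return MOTOR_SPEEDS.get(message, MOTOR_SPEEDS["stop"])
```

With an assumed mapping such as `{"A": "left", "B": "right"}`, releasing the buttons makes `command_for(None, ...)` return `"stop"`, and the receiver turns that into zero speed on both wheels. Falling back to "stop" for unknown messages is a design choice of this sketch so that a garbled packet cannot leave the motors running.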
Snap! - Build Your Own Blocks
A visual programming language for Computing Education |
Okay, so hi everyone. Awesome that you're all here. I'm Jadka, this is Jens, and we will now present Snap, a programming language that we work on together with Bernat and a few other people that I'll mention in a second. And what we want to do with you today, or what you have to listen to, is this: first I'll briefly explain what Snap is and why we care about it. Then we'll show our three guiding principles, and if we have time at the end, Jens will show what's under the hood of Snap as well. Okay, so Snap is a visual, blocks-based programming language, and I'll show that to you in a second. Like MicroBlocks, it's also live and parallel, and we develop it together with people at UC Berkeley in California. So it's us three working together with three people from UC Berkeley, developing this thing together. Snap is a blocks-based programming language aimed at high school and college students, but as you will see in a second, you can basically start whenever you want with it. Okay, our three guiding principles. Let's start with the first one, which is low floor. This means that we really want to have engaging activities for entry-level programmers, so they don't get scared away, and that we really want to have fun activities that they can do. To show you that, I've built a super simple, what we call, microworld. So this is an extension of Snap that I customized myself, and it only has these three blocks here, and as you can see, you don't even have to read to use them. We can just try them out and see what happens. So let me increase the stage size a bit so I can click on that. As you can see, it's live, so I'm clicking on it while it's in the palette and it's still doing things. I can click on that, and nothing happens. I see, oh wow, there's an input slot here, maybe I could type a number in here. Let's see what happens now. Okay, it's doing something. This one is stamping a flower, and this one is supposed to be a sponge. It's clearing the stage again.
So since we're working with flowers let me draw a beautiful petal real quick using the costume editor. Let's do a yellow one. Oh that's brown. Okay and then let me draw something. And let me even fill it maybe. Yeah wow this is very beautiful and then let's make this even more beautiful and let's also, that's important, move the rotation center. And let's try the same thing again. Ah okay this is pretty cool. So now if I want more flowers I can build a larger script so I can use the central area of snap to build more complex programs. So I could go to a random position and draw another flower there and go to a random position and draw another flower there and so I could click on that forever. But it would also be cool to just have a forever loop that does that. So computers are really good at doing things automatically very often so why don't I make a loop. So I already prepared this page here and let's use for the look for the infinity sign because what I want to build is an infinite loop and let's just copy that and then let's build our own control structure by just hiding it in the actual, I don't know we don't hide it, let's just do it. So this is going to be a control block and I'm calling it infinity. And this is going to have two inputs, one is going to be, no it's only having one input, it's going to be the action. So let me build this block, I already have it here. So okay no let's do that again. Sorry I'll just delete that and start over again. Okay I want to make a control structure, it's called infinity. I want to add a parameter to that and this is the action that I'm going to run. I can even decide what I want this to look like. So as you might know from scratch our loops have the C shaped or from snap our loops have the C shaped command structure. So I can click on that and when I apply that you see that I have the C shaped block with this infinity sign on it and now I need to decide what I want to do with this block. 
So first I want to run the action. I don't have a run block here, so let me just open the hide blocks section, because I hid all the actual blocks from Snap, and I'll show you how to do that in a second, and let me drag out one of the run blocks. And what I want to do is run the action, and then repeat the same thing and run the action again. So I think that's correct. Let's try it. So I made myself a forever loop. Let's clear the stage so we can actually see what's happening, and let me drag this around here, and I can run my program. I built myself my own control structure, and it still looks very simple for kids, but all of you obviously could do that. So this is one thing that we really care about. We want really simple, entry-level, engaging activities, and this is one of my favorite starter projects. But we also want teachers, educators, to be able to build these microworlds very easily, and we want them to be able to customize Snap so that they can use it for their needs. What I also wanted to show you in this project, let me make the stage a bit smaller again: you can have several of what we call scenes in a project, so a scene is basically a project in another project. So I can now switch to that scene, and here I already prepared a petal, and you see that I have more blocks than I had before. I also have a separate stage, so I can switch between scenes, and we can use that idea to build something like language levels. So in the first microworld I just had that block that draws a flower. In the second microworld I want learners to be able to build this flower block themselves, so we give them all the tools that they need for that. So here, for example, we only have this block that stamps a leaf, so one of the petals. So what else do I need to build the flower?
I want to do that several times so I might need a loop and this one as you can see has a number as an input so I can specify the number of repeats and of course I need a turn in between. So what we want to do is we want to stamp, let's do a flower with six petals and then I need a turn in between and also I build this block myself and I want to turn 60 degrees each time so let's clear before we draw a flower. And then let's do one. So this is how the flower block was actually built in the first thing. So we can help learners to gradually get new ideas. And then which is, which I find pretty cool, it's also super simple in Snap to prepare your own libraries. So I made one if you have a kid in class that is just faster than the other kids. You can export a library and just let them add more blocks. So here I added a few more blocks. For example these ones let me change the appearance of the petal that I have. So this one for example changes the size of it. Let's set this to 100 and then you see the petal becomes super huge. I can set that to 10. Let's clear again. Then it becomes super small so maybe we want to pick something in between and we could even do that randomly. So for example I could add this here and I want the size to be between 20 and 40 and then I get differently sized petals each time I do that. So if I do it like this you see that this changes and I really like that about Snap that you can easily expand projects with fun ideas so they look differently and are more engaging but you only need like one or two more ideas. This one for example switches to the next costume so if I wanted to add another petal let's draw one. Let's do the one that I did before but in blue this time. Let's fill it with something. Okay and then again let's move the rotation center to one of the tips and we can now use that and each time we draw something we draw a different flower. 
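The flower script described here (repeat once per petal, stamp, then turn by 360 divided by the number of petals, with a random size between 20 and 40) can be written down outside Snap as well. The Python sketch below only computes where the stamps would go; it is not Snap code. The function names, the assumed 480 by 360 stage, and the choice of one random size per flower are assumptions made for illustration.

```python
import random


def flower(petals, heading=0.0):
    """Headings at which each petal is stamped: stamp, then turn 360 / petals."""
    turn = 360 / petals
    stamps = []
    for _ in range(petals):
        stamps.append(heading % 360)  # "stamp" at the current heading
        heading += turn               # "turn" before the next petal
    return stamps


def flower_field(count, petals=6, rng=None):
    """Place `count` flowers at random positions with random sizes."""
    rng = rng or random.Random()
    field = []
    for _ in range(count):
        field.append({
            "x": rng.randint(-240, 240),   # assumed stage width of 480
            "y": rng.randint(-180, 180),   # assumed stage height of 360
            "size": rng.randint(20, 40),   # size range mentioned in the talk
            "headings": flower(petals),
        })
    return field
```

For six petals, `flower(6)` gives headings 60 degrees apart, which is the "turn 60 degrees each time" step from the demo; changing `petals` changes the turn angle automatically, which is why the division by the petal count is worth keeping in the script.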
So the next one in line, and then again we can combine that with what we had before and just create a beautiful flower field with the blocks that we made. How do you do that if you want to create a microworld? Let me go to the third one I prepared. This is not actually a microworld; this is regular Snap with all the blocks that you have. As in other blocks-based programming languages, we have different categories here, and we have the palette on the left where you find all the blocks for a specific category. As I mentioned before, it's a live programming language, so you can just click them and something happens immediately. And what I also added here, or what I thought I added here but didn't, oh, that's unfortunate, is the project. So let's just build it. Again, we want to have all the blocks that we used last time. What we had was the next costume block, we had the set size to block, we had the pick random block, we also need a division block, we had a repeat block, we had a forever block, we had a stamp block, and we had a turn block. The second to last one was clear, and what was the last one? Go to random position, awesome. Let's assemble these so that we can actually make sure that our script is working. So for a flower, we want to repeat a specific number of times, let's do six times. Then we want to stamp, then we want to turn the number of degrees, that's 360 divided by the number of petals that I want, so six in this case. And before we do that, we wanted to go to a random position. So let's check that. That seems to work, awesome. We wanted to do that forever, and so that all flowers don't look the same, we wanted to set the size to a random number, let's do twenty to thirty maybe. And we need a second costume, so let's just duplicate that one and let's change the color a bit. Yeah, this is very different, awesome. This is exactly what I... oh wow, okay, okay. And so we
have two costumes, so we can actually also use the next costume block. We start with a clear stage, so let's add that at the beginning and see whether this works. Okay, awesome. And now if you want to build your own micro world, you can just go to the file menu and select hide blocks. That's the awesome thing that we added in the previous release: you can hide all the unused blocks. If I do that, I only have the blocks left in the palette that I used for my project, and they are in these categories here. I can even make the single palette that I had before by clicking on the settings menu and selecting single palette. Now you have all the blocks in one palette, and you can make your own micro worlds that have all the blocks that you need for the project that you want to do in class, or with your kids, or with some other people who want to learn programming. And then you can delete that, and then you have your perfect Parsons puzzle generator. So this is again the low floor idea that I mentioned: we really want to have engaging activities that have a cool artifact that looks beautiful or is fun to do, but we also want to help teachers and educators create these fun and engaging activities in a simple way. Okay, the second idea that I wanted to present, or that we care about, is wide walls. We want to allow for a huge variety of projects, and the ones that we care most about are media projects. We love this idea of media computation, meaning you learn general purpose programming by playing with sounds, images, and texts. One exemplary project for that: from Snap you can access the microphone of your computer. In this case, if it's running, it should be. And this, for example, is a visualization of the frequency spectrum of my voice that's just picked up through the microphone. This looks beautiful, and at the same time it's interesting because you can talk about sounds from a physical and
computational perspective. And we love to do stuff with the camera, so this is another project that I like very much. This is pasting the webcam of my computer to the stage. Right now I set the transparency to 50%. Let's make it fully transparent, so the video is still there but you can't see it anymore because all the pixels are transparent. Then I am sending a message to the other sprite, and what that's doing is drawing dots on the stage whose size corresponds to the brightness that it's measuring in the image that it picks up from the camera. This is actually a pretty cool technique. It's called dithering, and this is how images were made in newspapers back in the days when you weren't able to print different colors, so you just used differently sized dots to get a deeper color space. So this is what we mean by wide walls: allow a variety of projects that are engaging and fun, and that kids who don't necessarily like Fibonacci might also find interesting. And the last idea, for which I will hand over to Jens, is no ceiling. As I said, Snap is a programming language that's aimed at high schoolers and early college students, so it's Scratch but with all the awesome ideas that make programming fun, and Jens is going to show you this now.
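The dot-drawing idea from the camera demo can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Snap project itself: the image is assumed to be a grid of grayscale values from 0 to 255, the function name `halftone_dots` is made up for this sketch, and, following the description in the talk, the dot diameter grows with the measured brightness (a print halftone on white paper would invert that).

```python
def halftone_dots(gray, cell, max_diameter=None):
    """Turn a grayscale image into a list of (center_x, center_y, diameter).

    gray is a list of rows, each a list of ints in 0..255.
    The image is cut into cell x cell squares; each square becomes one
    dot whose diameter is proportional to the square's mean brightness.
    """
    if max_diameter is None:
        max_diameter = cell
    height = len(gray)
    width = len(gray[0]) if height else 0
    dots = []
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            block = [gray[y][x]
                     for y in range(top, min(top + cell, height))
                     for x in range(left, min(left + cell, width))]
            brightness = sum(block) / len(block)          # 0..255
            diameter = max_diameter * brightness / 255    # 0..max_diameter
            dots.append((left + cell / 2, top + cell / 2, diameter))
    return dots


if __name__ == "__main__":
    # Left half black, right half white.
    image = [[0, 0, 255, 255] for _ in range(4)]
    for dot in halftone_dots(image, cell=2):
        print(dot)
```

Averaging over a cell rather than sampling a single pixel is a design choice of this sketch; sampling one pixel per dot would be closer to a sprite that measures the brightness at its own position.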
Thank you, Jadga. So we're having kind of a conflict here, because we want to have this low floor and the wide walls for the kids, but we also have this idealist notion of blowing off the ceiling. We don't want to constrain kids; we don't want to dumb down the language so it's okay for kids. This is kind of coming from the sixties, when, as some of you might remember, Logo had the idea that you don't make stuff easy for kids, you make it welcoming for kids, but you don't constrain it; you let kids express everything they can. And this is only one part of the pedagogy. The other part of the pedagogy is geared, as Mark said, towards teachers, towards educators. Because if you look at these micro worlds that Jadga has shown, let me again go to... oh yeah, this is mine already. So here is a bunch of blocks. These aren't the regular blocks, these are the ones that we made. Okay, I need to stay more in the center so y'all at home can see me. We want teachers to be able to build these exercises, to build these micro worlds for their kids. So it's not just us building something and saying here's an exercise you can do with your kids; a teacher, an educator, wants to teach something, as Jadga did, about what was then called, you remember, the Total Turtle Trip Theorem. That was kind of what Jadga showed you, so we built a micro world for that. Which language are they going to build this sub-language in? Are they going to have to learn another language to build these little domain-specific languages? No, you saw Jadga build these blocks in Snap itself. And we want to take this even further and find out whether we can maybe even invent a language that lets us build a block-based language inside the block-based language. For this, the things that we can do in the UI, we want to be able to do in the language. So in this palette here I made a little block, and it shows me some tools. If I click on this I'm getting some
more blocks, and I can again hide these. It's a very simple block I can build myself that just shows and hides some blocks for me. And now I can explore some of these things. I can, for example, look at the flower block and see how it is defined, and see that it is defined with its own blocks. I can open this and edit it, but I can also get the definition of this in a program. So now I'm getting the definition of this block, and I can see that it's a function. I can take this out and do some other interesting things with it. Here is our split block, for example. I can split "hello world" by the space and I get "hello" and "world". I can split it by letters and I get each letter. So what happens if I split a script by blocks? Let's make this even smaller. If I split a script by blocks, I'm getting a syntax tree, a table of the syntax elements in there. I can do this with the definition of my flower block, and, oh wow, this is a table, so I can also flatten it. I'm saying I don't want the length of this, I want it flattened. Now I'm getting a list of all the syntax elements that make up the definition of this flower block. That's kind of interesting, because now I can find out things. For example, I can take out the turn block and ask: does this list contain this turn block? And it says yes, the turn block is part of this definition. But what about, for example, the clear block, is that also in there? No, it's not present in there. So this is interesting; I could maybe discover something else. I can look at all the blocks that I have in this micro world; this is a list of all the blocks in this micro world. I'd like to find out which blocks contain this turn block, so I could say I want to filter from my blocks those that contain this block. Let's see whether that works. Yeah, so I'm getting a list of two blocks that both use this block. Now I can turn this into
its own block. I make a block that says "callers of a block", and the block should be an input, so I'm saying this is a block. And in order to define this, well, I just did define this: I just drag this in, and I say this should work for any block, so not just this block but any block. So now I have this block that gives me the callers of, for example, this block. Let me see whether it works. This block is used by two other blocks, and this block isn't used by any other block. Huh, let me see, maybe I can get a whole report by looking at all the blocks and mapping over all the blocks, and I want to see the block and I want to see its callers, right? So what we're now doing is really an introspection of the block system in itself. If I map this, I'm getting this interesting structure. It's a data structure, it's a table, and it's a graph, really: a reverse dependency graph of all the blocks and the blocks that use them, and I get a report of the overall structure of my micro world. And folks, this is something interesting if you think about it. We're starting with this easy, simple thing, that we can build worlds for kids, and we want to build these little domain-specific languages, but we want to have a language that actually lets us build these things in itself. This is why we built introspection and all the goodies of functional programming into this language. But we didn't want to make it so that you have to go down to memory addresses; we want to represent everything in blocks. And this is kind of the idea of no ceiling, because we start doing this with kids, but we actually built it for the University of California at Berkeley, and they're using it for the introductory computer science course for non-majors. And this goes up quite a long way, so we actually want to be able to do Scheme in this and to really teach abstraction. So at one point we really want to blow off the
ceiling, and we don't want to do only this imperative style of programming where one thing follows the other and you've got the puzzle pieces. You actually do higher-order functions, you do recursion, you build your own control structures, you build your own language. It's a little stiff, and it's challenging to try to accommodate everything from "let's draw a flower with a bunch of three blocks" to "let's invent our own programming language" in the same environment. And in order to build such an environment, the whole thing that you see here, as you're nerds I can show you, right, the whole thing you see here isn't using Blockly or a library. What you're seeing here is actually an operating system that runs inside a single canvas element in the browser. There's sort of a pill you can take to switch to dev mode, and you're inside this environment that some of you might recognize looks a lot like Squeak, if any of you know Squeak. It's the same kind of Morphic environment where you can just grab any thing and directly manipulate it. For example, you can make this bold, but you can also, let's make a slider. I can attach the slider, set the target to the string morph, to the font size. I can change it, I can make it horizontal, and now my slider governs the font size of this. I can make other elements, for example a color palette, and I could set the target of the color palette also to the string, and now I can change the color and the size, and I can still edit it. So this is basically our own system that is self-sustained in the browser, and everything that you see in Snap is an application inside this OS inside the browser. It comes with its own green threading model, which is how we do parallelism. So in order to go up the ladder of abstraction, on the back end side we have to go down, not to the metal, but to a metal of sorts of the browser. Okay, this was kind of our
ideas. If you'd like to check it out, it runs on snap.berkeley.edu; give it a try yourself. You'll find lots of material there. It's free and open source, by the way, under an AGPL license, so it's copylefted. Everything about it is open source: we write the front end, the back end, the community site. It's not as big as some other languages, but we've got a vibrant community. It's all hosted on GitHub. Please do contribute, please do fork it, many people have forked it. Please let us know what you think, and thanks for coming. So I found that really fascinating, and as a Lisp programmer I couldn't help but think about macros when you were showing the exploding of blocks. Would it be possible, once you get the abstract syntax tree, to annotate it, let's say add a sound between each step, and then recreate a new block from an existing block? Yes, if we had more time I could show you this. We do have macros. I showed you the split block, which takes apart the stuff; we have a join block that you can pass the syntax tree to. Well, this is running on the web, in a browser, so your phone can run the thing in a browser. What you get here is you can publish it as a URL, and then you can run it on the phone, and it's something I like to do with kids a lot. It's also interesting if you do stuff on touch devices like tablets: you need to be considerate of which gestures to use, because you don't have mouse-over. So yeah, it's fun.
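The introspection demo from the talk (split a definition into its syntax elements, flatten it, check whether it contains a block, then map that over all blocks to get a reverse dependency report) can be sketched in Python. Everything here is an assumption made for illustration: scripts are modelled as nested tuples and lists, and the names `flatten`, `callers`, `report`, and the sample `DEFINITIONS` are hypothetical, not Snap's own data model or API.

```python
# A block is (name, [inputs]); an input is a literal, a nested block
# (tuple), or a nested script (list of blocks), e.g. a C-shaped slot.
DEFINITIONS = {
    "flower": [("repeat", [6, [("stamp", []),
                               ("turn", [("divide", [360, 6])])]])],
    "random flower": [("go to random position", []),
                      ("set size", [("pick random", [20, 40])]),
                      ("flower", [])],
    "spin": [("turn", [15])],
}


def flatten(script):
    """Yield the name of every block in a script, depth first."""
    for name, inputs in script:
        yield name
        for value in inputs:
            if isinstance(value, tuple):    # nested reporter block
                yield from flatten([value])
            elif isinstance(value, list):   # nested script
                yield from flatten(value)


def callers(target, definitions):
    """Names of the custom blocks whose definition contains target."""
    return sorted(name for name, body in definitions.items()
                  if target in set(flatten(body)))


def report(definitions):
    """Reverse dependency table: each custom block and its callers."""
    return {name: callers(name, definitions) for name in definitions}


if __name__ == "__main__":
    print(list(flatten(DEFINITIONS["flower"])))
    print(callers("turn", DEFINITIONS))
    print(report(DEFINITIONS))
```

In the talk the same steps are done with blocks as first-class data inside Snap itself; the sketch only shows the shape of the computation.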
That's a good question. I really have this dream that block-based programming, if we take away the kiddie stigma of it, and if we dare move on from this sometimes horrible imperative paradigm to the beauty and joy of functional programming, gives us these expressions that actually make things easier to understand, that make things more accessible, and that this is a great way to express many things in professional development. So we write the software in a way that we like to use it ourselves, and we hope that it might convince others to actually build and embed this into enterprise applications. Because often enough, if you think about these low-code, no-code things, you click together some preconfigured stuff, but then you need to write some glue code, and the glue code often enough is terrible and awful. This is where I would love to see a block-based environment that has no ceiling. Yes, it is Lisp, it is Scheme in the browser, yes, exactly. And we do have stickers and buttons here, take all you want. Yeah, thank you very much again. |
Lomiri Mobile Linux in Desktop mode
Lomiri and the myth of the pocket size desktop computer |
It's 10:30 on the spot. So this is the start of the first call. Perfect. Thank you. Thank you, everyone. All right, so just to give a little introduction, I need to mute myself real quick. Just to give a little introduction about myself: my name is Alfred Neumayer. I've been involved with the UBports project on Ubuntu Touch for about four, close to five years now. I started out with porting Ubuntu Touch over to devices that used to run Android kernels and Android drivers, not mainline. I used and reused the high bits that we made up. And for that matter, I gained a lot of experience with the desktop later on, because I started out with the phone and I really wanted to have this tablet experience with a touchpad and a keyboard. I couldn't get it quite right until some later hardware offered the ability to do so. But in the meantime, I was very interested in the desktop talk and what we can accomplish with the desktop, with convergence as the keyword. And I'm going to mention that one real quick. Let's see if the convergence thing actually does what it's supposed to do, right? Okay, black screen. Does it stay black? There you go. Okay, the fingerprint reader doesn't want to work, but this one does. So, yeah, with Lomiri at least, you can actually use the phone as the virtual touchpad and virtual keyboard, just to give it a little bit more of a presentation feel. Where's the full screen? Is there a full screen? Do we have full screen? Otherwise, that one is sufficient, I hope, right? Now it's very obvious where it's running. Sorry to break it to you. So, Lomiri on the desktop, the myth of the pocket-sized desktop. As we can see, it is totally possible to do so, considering that we're still at the starting phase of where this really can go.
So, I'm going to talk a little bit about the history, then about the evil convergence word, the word that we all like to use, but that so many people are using in so many different senses, so many different ways. Then how we migrated over a few technology pieces like the system layer and the windowing protocol. I'm going to introduce the UI toolkit to you, and notable components, should you be interested in either working with us or taking our stuff and working on your own stuff, bringing it over to another distro. You're totally free to do so, and I will mention a few components that you might be interested in. So, what do we have? The history. 2013, the year of Ubuntu Phone. As we hoped, as some of us hoped. It did not drop until it really dropped, and that was not until 2016. In 2013, they introduced the Lomiri shell. What's now called Lomiri they called Unity 8 back then, as a continuation of their desktop efforts. And they introduced Mir, the Mir server library that allows developers of compositors to easily create a new Wayland-based compositor. Back then they used their own protocol, which caused a lot of problems for some people in the community, but they turned around and implemented Wayland on top of their own display server, which is Wayland compatible nowadays, so that's something. 2016: five devices have been released with Ubuntu Touch pre-installed, which is quite interesting for something that was so niche. Yay, that was so niche. That gives me a hint that I should speed up talking. So, five devices released, one tablet and four smartphones, and they all ran Android adaptations that they ported over using libhybris. That's how they quickly made it to market, very easily. I know for some people that's not interesting at all, but if you want to go to market really quickly, that's sadly what they had to do, and others before them also had to do it.
Even Carsten Munk, who started the libhybris project, even he himself said, well, if you want to do it quickly, you need to do it that way. 2017: Canonical gives the Ubuntu Touch project to UBports. We took it over; Canonical couldn't finance it anymore. They had no interest in keeping up with the project overall and gave it over to us, and since then we have been trying to make the best out of it. We have released 24 OTAs, and we're in the process of releasing our first final images of focal-based Ubuntu Touch 20.04. We already have a beta version, which you can download and install on a select few devices. And Unity 8 gets renamed to Lomiri. Why? Because it was very important for others to be able to package it up without asking Canonical about every single copyright-related license question. So we renamed everything from Canonical and Ubuntu to Lomiri so that it's generic, others can make use of it, and they can package it for their distributions. So, convergence. What do we mean by convergence? Goddamn. Screen real estate, first of all. Let's go back to the basics. The screen real estate: how many things can I actually put on the phone, or on a tablet, or on a desktop? It varies very much, right? What's the next property? Is it multi-monitor capable? Is there a possibility that you can plug an external monitor into your phone and have it work easily, without much configuration? And can you do so wirelessly, too? Wired, wireless? Both of them can work. And which input methods are available to your device at the moment that you want to use it? We support various types of input devices, from touch to keyboard, everything that you expect nowadays from a personal computing device, which counts phones into the mix as well. And we are pretty much ready to deliver input events to applications whenever they need to be delivered, and they deliver everything that the application expects.
If there's a new protocol coming up, we are pretty much future-proof due to the Mir team working on their stuff. The way that Mir handles the input and passes it on to the compositor very much makes it possible to extend the abilities of input methods overall. And that gives me a hint to shut up again. So, convergence. What actually is convergence? In our mind, it is the ability to adapt to different usage scenarios. It's not which device it runs on. It does not only mean what output and input device you plug in. It also means how it is adapting. So, for example, this phone, when you use the wireless display capability, is still a phone. It's just got a wireless screen attached to it, which displays everything that a normal monitor would, but it's still a phone. It's not wired. It's only a matter of determining whether something is in phone mode or in desktop mode or in tablet mode, and that's what the shell usually does. I will get to the components and each of their responsibilities later. But for now, the basics of convergence: adapt to a usage scenario that you see fit as the shell or as a user, basically, because we cater towards what the user might want at a very specific moment in time. Both the shell and the applications need to be able to adapt. Without the applications adapting, you have a great shell and you can't use anything with it, right? So, we need the applications to follow the same rules, to follow the same standards, and we make that easy with our custom UI toolkit that we have built based on Qt and QML. It actually existed before Qt Quick Controls 1 was a thing. So, Canonical had to pull something out of their sleeves real quickly, and they made up what is nowadays called the Lomiri UI Toolkit. Back then it was just the Ubuntu toolkit. And another thing that goes into the whole convergence thing is pausing the application, as you expect nowadays from a phone, where the UI thread of an application is paused.
It doesn't do anything. How can we provide the same functionality with a typical GNU/Linux way of handling things? We don't have this Zygote Android process that spawns other services in the background and does everything, quote unquote, hidden away from the user. We as application developers need to think differently about how to approach a modern-day usage scenario, with a smartphone being able to save power as well as being able to do stuff in the background when it's supposed to do it. Not running in the background all the time, because that's wasting battery. But we need to make sure that the application lifecycle is something that works on a phone, and it might work differently on a tablet, and it might work differently on a desktop. It definitely works differently on a desktop, right? You don't want an application to pause underneath your feet, right? Pulling the rug away. So, we do this for preservation of battery life, and for that we need different policies for every usage scenario. We do that by using Lomiri App Launch, which is a little library that handles everything related to launching applications. It puts everything in a cgroup, as you expect nowadays. Specifically, it uses systemd user sessions for that. It also helps with application confinement and security, because on Ubuntu Touch at least, the main goal is to provide a store that people can download applications from without worrying too much about whether their data can be accessed or not. Of course, we could trust every application, but then we would need to review them. That's the typical traditional open source way, the Debian way, whatever you might want to call it, of importing software and code, auditing it, and then releasing it to others. That's not what we want to do. We want to give developers easy access to release their stuff to the users that they care about.
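The idea of a different lifecycle policy per usage scenario can be illustrated with a small Python sketch. This is purely hypothetical and not how Lomiri or Lomiri App Launch is implemented: the names `Mode`, `AppState`, `target_state`, and `apply_policy` are invented here, the rule that tablets behave like phones is an assumption, and so is the notion of an "exempt" app that may keep running in the background.

```python
from enum import Enum


class Mode(Enum):
    PHONE = "phone"
    TABLET = "tablet"
    DESKTOP = "desktop"


class AppState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


def target_state(mode, focused, exempt=False):
    """Decide whether an app should run or be suspended.

    Assumed policy: in desktop mode nothing is ever paused; in phone
    and tablet mode only the focused app and explicitly exempt apps
    keep running, to save battery.
    """
    if mode is Mode.DESKTOP:
        return AppState.RUNNING
    if focused or exempt:
        return AppState.RUNNING
    return AppState.SUSPENDED


def apply_policy(mode, apps, focused_app):
    """apps maps app name -> exempt flag; returns app name -> AppState."""
    return {name: target_state(mode, name == focused_app, exempt)
            for name, exempt in apps.items()}


if __name__ == "__main__":
    apps = {"browser": False, "music": True, "notes": False}
    for mode in Mode:
        print(mode.value, apply_policy(mode, apps, "browser"))
```

Keeping the policy in one pure function makes it easy to see how the same set of apps ends up in different states when the shell switches between phone and desktop mode.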
And for that, we need application confinement and a few security practices that are part of the whole convergence story of the application lifecycle model that we have. And the result, or at least the dream for Canonical, was this: you have a phone, a tablet, a laptop, and even a TV which can run the same shell. And that's the last missing property of convergence. You have one code base running everywhere. You have one code base that is sufficiently adaptive to each situation that it works as you expect. It looks similar enough. The input methods, of course, differ, but in reality they provide exactly what the user wants at a specific moment in time. Swipe on the phone, it works like a phone. Use your remote control on the TV, it works like a TV. It looks similar. It feels similar. The user experience part, right? So, next one. Migrations to new technologies. It took us some time to move over to systemd, but the concept still very much applies, because Upstart back then was already used to start stuff as individual session services. And our Lomiri App Launch reacts to apps appearing and disappearing for us. It gives us the events: hey, application XYZ is gone, remove it from the launcher because it's gone, and release all the resources that you have allocated in RAM. Right? Migration to new Mir and Wayland. That's a more interesting one, because that one is pain. Mir 2.0 dropped support for the old protocol that everyone complained about, right? Current Lomiri on Ubuntu Touch does a split like Charles Kloss and them. And in that sense, it runs both Mir client and Wayland. And through Wayland, it also supports XWayland. So, the Wayland support in Mir on Lomiri is good enough that you can actually spawn a GIMP application running through XWayland on a libhybris device or on a Wayland device, depending on which XWayland version you install. And it works. The problem is we want to go Wayland everywhere.
And for Wayland everywhere, we need to adopt a few new concepts, a few new things that are in development on the Mir side right at this moment. So, MirAL, an abstraction layer which allows basic windowing to be done, to receive data about a window being placed somewhere and stuff like that. MirAL takes it another step further, which gives us the ability to get those trust prompts that we use on Ubuntu Touch. Those are dialogs that pop up and overlay themselves on top of an application, where you have to say, okay, I give this application access to my GPS, I give this application access to my microphone. It doesn't do it by default. It doesn't allow it. You, as a user, need to allow it. And for that, we need a few more components, like trust prompt integration into MirAL. And screencasting is also going to be tough, because screencasting is just different enough on mainline devices versus hybris devices that buffer passing might work, but then how are we going to tell which buffer looks like an Android buffer? I think I need to figure out a few things on my side with regards to Wayland and how wl_buffers are supposed to be passed around. But if anyone here has a lot of Wayland knowledge and wants to help us out here and wants to have some screencasting for themselves too, please get in touch with me and we can figure out something that works for both sides, I guess. So the UI toolkit, the Lomiri UI Toolkit. This is a beautiful thing if you ask me. I really like this. Take our browser, for example, which you can see the chrome of at the top: that's the Lomiri UI Toolkit rendering the tabs. And depending on how you resize the window, it actually turns into something that you have to swipe up from the bottom edge of your phone. Because this responsive design thing is one way to achieve convergence in a sense.
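The responsive behaviour described for the browser tabs boils down to choosing a presentation from the current window width. A minimal Python sketch of that decision follows. It is an illustration only: the function name `tab_presentation`, the threshold of 80 grid units, and measuring width in density-independent grid units are assumptions for this sketch, not values or API taken from the Lomiri UI Toolkit or the browser app.

```python
def tab_presentation(window_width_px, px_per_grid_unit, wide_threshold_gu=80):
    """Pick how tabs are shown for the current window width.

    The width is converted to density-independent grid units first, so
    a small high-density phone screen and a large low-density monitor
    are compared fairly. Threshold and unit are hypothetical.
    """
    if px_per_grid_unit <= 0:
        raise ValueError("px_per_grid_unit must be positive")
    width_gu = window_width_px / px_per_grid_unit
    if width_gu >= wide_threshold_gu:
        return "tab bar in header"
    return "bottom edge swipe"


if __name__ == "__main__":
    # Same pixel width, different densities, different presentation.
    print(tab_presentation(1080, px_per_grid_unit=8))   # desktop-like window
    print(tab_presentation(1080, px_per_grid_unit=24))  # dense phone screen
```

The point of the sketch is that the application reacts to the available space rather than to the kind of device, which is what lets the same window switch presentation when it is resized.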
So the UI toolkit follows our human interface guidelines, allows for convergent use cases, and is used by the Lomiri shell throughout the code base. So we have something that can be used by both apps and the shell, so that both look the same, compared to others who write one toolkit for the shell and one toolkit for everything else. In this sense, we have something that works on both sides of the equation. It's abstracted by Qt, and Qt does everything for us. But what it does not do for us is handle the Wayland connections. For that we need Mir, and we need QtMir as an integration point that takes Mir's window contents, the buffers, and displays them on the screen, or rather in the scene graph of Qt. The scene graph is a tree type of structure: you put things on it, they get rendered, and then they get displayed on the screen as you expect, using OpenGL. QtMir also already does basic window management, but it doesn't do everything that Lomiri needs to do. It already does multiple monitors and also supports multiple input types, whatever you expect nowadays from a display that can be driven with input devices, right? Lomiri itself, the shell, uses QtMir for the application window presentation. So everything that QtMir gets from Mir it passes further down to Lomiri, and Lomiri displays it. We can even put some shader effects onto those tiny windows and make them a little colorful, make them explode as you were used to from Compiz times, and have them render quite interesting stuff, just because of the way that Qt does animation in QML, which is quite interesting. And it also does more sophisticated window management, so window snapping, multiple workspaces, the whole thing that you expect, right? And Aethercast, that's another one.
This one is not for the mainline folks yet, but take this as an invitation for mainline folks to help us get it working on the PinePhone Pro, for example, which someone in the community has actually started a port for, right? And they got Ubuntu Touch running on the PinePhone Pro with Lomiri, with everything that you expect, even the trust prompts, because they required a few patches on our side. And Aethercast in this scenario wouldn't work, because it doesn't know how to take those DRM and GBM allocated buffers and pass them on to an encoder. We just need the screencasting and the video encoding to be done quickly. Then we can achieve something like 20% CPU usage on one core for wireless display support. So 20% of one CPU core being in use, for me that counts as almost nothing, right? It's definitely usable, and I regularly do it with my Pixel 3a, and it works flawlessly, right? And one last notable component, or two of them, is the on-screen keyboard, which also plays into the whole Lomiri story, so to say. So, Maliit. We use Maliit as the on-screen keyboard framework, and Lomiri keyboard is a plugin for it, which connects over Wayland or over Mir client to the shell and says, hey, I'm here, take me as your input device. And the shell says, yes, I like you. In that sense, the Lomiri keyboard uses the Lomiri UI Toolkit as well, as you would expect. And it also has some nice little features. When you have the on-screen keyboard slid up and you swipe up from the bottom edge, you get to do text selections very, very nicely, and you can also do the same with the space key. That's a totally awesome thing to play around with. And with that being done, I think... We have five minutes left in total. In total? Yes. Then we're going to skip the demo because you saw everything already. Well, that's pretty much it. If you have any questions, I will play around with the device. We can interactively use it together, right?
OK, there's a question. So if I want to use my phone as an input device on Gentoo, do I install Maliit or something? As an input device on what? Gentoo. Are you planning to bring the shell, the Lomiri experience, over to Gentoo, for that matter? I want, for example, I have Gentoo, I have Wayland, and I want to use my phone as a touchpad. No, no, that's because it is integrated. This is the virtual touchpad of the phone, and it's because it's integrated with the shell, with the Wayland desktop that we are running. This is on screen. This is Lomiri, yeah. This is all integrated into Lomiri. So Lomiri knows this is a virtual touchpad, let me do all the virtual touchpad stuff, and it does so by... But is it available as a separate package that I'm creating? No, no, no, no. It's integrated into the system's desktop. We're not playing Lego here. I'm sorry, I didn't want to sound mean, but those things needed to be integrated because they wanted to release very soon, very quickly, very much on time. But they need to have a separate library, for example, using different... It uses uinput, the typical way that user space injects events into the shell. That's all it does. uinput specifically. Sorry for my intellectual environment. No. Is there a plan to have a separate package that I can use in a different distribution where I use Wayland, for example? This is Wayland. This is totally Wayland. Where you use Wayland. This is a Wayland desktop. This is the desktop, but it also does the virtual screen, the virtual input method for you. You could, in theory, create a new Android, Sailfish, all of them, Mobian application that just does that and transfers it over the wire to a desktop, but this is not it. This is all integrated. You want something like Synergy, but Synergy is the tool which allows you to, like, have a separate machine running and then move your mouse over to the open screen border.
They want something like Synergy, but then accept that you use the mouse over yourself. Right. But Synergy doesn't work on Wayland, but work is on the way to the way Synergy works. Oh, multiple workspaces. I haven't shown you that. Multiple workspaces. Add a workspace. Hooray. It's just connected and it's done. We do detect certain properties of the screen, whether it is a TV screen, projector, monitor, integrated thing. We have a whole enum full of different types of monitors that can be connected. And we also do a check by screen size, by, I think, by scaling of the device. So if you put it on an 11-inch device, and depending on the pixel density, it would behave differently, right? But yeah. That's pretty much how it looks like. This is a terminal that I'm not going to input my password in. Yeah? This is the Fairphone 4. All right. The last questions. Are you Pavel? OK. Pavel can already start setting up stuff for the mic. So that ends the demo part, I guess. Yes. Sorry for that. But you still have some time for questions and answers while Pavel sets up. All right. Sure. Thank you. |
AMENDMENT Sharp photos and short movies on a mobile phone |
Okay, so hello, I'm Pavel Machek and I'm here to talk about cameras, but you can also talk to me about clicker-trained horses, mobile phones, kernel, a smartwatch based on ESP32, Mobian on my molester. So first things first, Video4Linux is not for cameras, it's for frame grabbers, and they are really very different, which is basically what this talk will be about. They can do remote controls, but they cannot do autofocus for you and so on. But the interface is fairly simple, you just open /dev/video0, select format and capture. Unfortunately, what you get is a blurry photo, which will be either all white or all black. This is with autofocus and auto something. Anyway, there are phones with smart sensors, one such example is the PinePhone, and those are pretty close to the frame grabbers. They do basically everything in hardware. This used to be a pretty common design in the past, which made good sense at that point because USB had limited bandwidth and you could not push uncompressed data through it. It's easy to sanitize, but it doesn't make much sense today. If you have like five lenses on your phone, you don't want to have five JPEG encoders there. So we are moving to dumb sensors, which basically do the bare minimum. There you set parameters like exposure, gain, select area and so on. And it just passes the Bayer data over the fast bus and it usually ends up in your memory. And then you have a component called ISP, which is image signal processor, which will do the JPEG conversion and such stuff. Unfortunately, in case of the interesting phones, which is the Librem 5, PinePhone and PinePhone Pro, we either don't have the processor or we don't have drivers for that, so we can't use it. So this is how the image, this is a photo if you try to take it without the automatics. Can you recognize what's there? It's a USB connector. It's recognizable, I'd say. So what do we need to do?
Nokia N900 is another example of complex design, which used to be very important historically. And actually the photos in the presentation are from N900 with open source stack. In real time, you need to do auto exposure because otherwise you will have a black or white frame, and you need auto exposure for autofocus. On most cameras, you really want autofocus too because you can't just focus to infinity and expect a good image. And that's pretty much everything you need to do for the video recording in real time. Then you have preview. Preview is a bit less important than the video recording, but it's also important. You need to convert from Bayer to RGB. And you need to do gamma correction because the sensors are linear on one side and exponential on the other side. GPU can help here. And then there are extensive post-processing steps like auto white balance, lens shading compensation, getting rid of bad pixels and probably many others I forgot about. Advantage of this is that this can be done after taking a photo or after recording the video. And there are quite good tools for that, including RawTherapee, UFRaw and so on. So people were working, unlike the other parts, this got some work done before. So what we are talking about, for example, on the N900, you have LED flash, which is a completely independent device. You have a voice coil for autofocus, which is again a separate device somewhere on I2C. Then you have two sensors, front and back camera. You have a GPIO switch to select which camera you want. And then you have ISP, which is quite a complex piece of hardware, which will not be important for this presentation because we will do without it. So tools to use. There's a great set of tools to use, but they have some limitations. One which looks very nice is GStreamer. And GStreamer is really great if you have an unlimited CPU. Unfortunately, you don't have unlimited CPU.
If I was willing to hack its C code, it would be very powerful, but there's some learning curve involved in that too. And at the end, GStreamer might be right to use, but I found other tools easier. There's FFmpeg, which has quite nice and very simple command line interface. So I used it at the end. I didn't really need much. Just please take these images and compress me a video from there. There's Megapixels. Megapixels is a very nice application focused on mobile phones, very well optimized, but its origin is the PinePhone, and they don't use libcamera there. Then there's libcamera. Everybody says libcamera is the future of video on Linux. It probably is, but there are still many steps to get there. And there's Millipixels. Millipixels is a fork of Megapixels, which is ported to the Librem 5 and, more importantly, to libcamera. So in many ways, so Megapixels actually currently looks nicer because it is based on newer GTK. On the other hand, Millipixels uses libcamera, and that's important stuff. Okay, so this will be a bit of history and reasons and so on. I started to play with the camera on the PinePhone, and first idea was, hey, GStreamer is there to capture video. Let's use GStreamer, right? Okay. I started capturing raw Bayer data because that's what should be most portable. I did some shell scripting, media-ctl to set up the pipelines. That's not fun. And then just use GStreamer to save the Bayer images to the disk. And I could do 200 kilopixels, which is not great, but better than no video at all maybe. And I realized that CPU can compress at 70 kilopixels images in real time, which is, well, people were doing this, but it's some time ago. So I tried to improve. There's a YUV format the camera could do, which is the Bayer then converted for better processing. And I could capture up to 0.9 megapixel video with that. And if you wanted, you could take a look there. Maybe it's useful for someone. But, well, there was a reason.
The reason was called colorimetry. And someone in GStreamer decided to do a regression basically. And all the GStreamer stuff stopped working. And I realized that, well, perhaps it wasn't too great to start with anyway. So I started looking around. Quickly, I found libcamera, which is the future, right? And, well, it's C++. It didn't work at all on PinePhones. So I had to do some quite heavy patching. I got some help on the mailing list. And I realized it has JPEG support, which is, well, you avoid a lot of stuff, because JPEGs are already color space converted and compressed and so on. And I realized that maybe JPEG is worth having a second look. So I did. You can't save data in two-megapixel resolution to flash, because the flash is not fast enough. But it was like almost possible. So, hey, JPEGs are four times smaller. Perhaps this could be adjusted. And saving sound is easy. So maybe we can, well, maybe we already have everything we need. And this is how Unixy Camera was born. I realized the second reason. Someone decided that passing uncached data to user space is fun. And libcamera decided that passing uncached memory up to the application is great. I thought someone stole my CPU, because the performance penalty is about 10 times. But no. It's just the way it is. I believe this needs to be fixed. If you fight with GStreamer and the performance seems too bad, this is probably why it's too bad. And I don't know, talk to your kind of person which can change it. By the way, in the old days, we used to have a read interface to get data from the camera. This is now deprecated. Of course, it is faster to read the data than to get uncached memory, right? That's how badly uncached memory sucks. Anyway, so Unixy Camera started. Audio is really simple. You just create a small C application to record sound, split it to chunks so you can have easy processing later and timestamp them, which is important for synchronization.
libcamera with some small hacks can write 35 frames per second of two-megapixel data to the file system. All you need to do is add timestamps and symlinks so your preview can tell you which is the latest image. Very easy. Control application, you probably don't want to start your video recording from command line, but that's also very easy. You just take some GTK and Python. It creates timestamps, telling you, hey, start recording now, and displays preview, which is the most intensive thing there. And this is basically what runs during the recording, so this should be a bit optimized. Post processing is not that important, right? So you just use Python and FFmpeg to compress the resulting video stream. Easy. This is something I was pretty happy about. If you want to replicate it, you will need some setup like patching libcamera and so on, but code is out there, and there will be an easier method in future. So I like this solution because I could use multiple languages to do my camera recording, right language for the job. In the end, this was a few hundreds of lines of code total. And it could do some quite interesting stuff. Like you could take still pictures during recording. You simply copy the JPEG one more time. Easy. In video resolution, but if you are recording at two megapixels from phone camera, I'd say this is going to be a pretty decent picture anyway. You could take photos with arbitrary delay. Like you could even take photos before the user asked for them because you are taking all of them anyway, so you just don't delete them. This was fun. Then I've got access to the Librem 5, which is different in important ways. It has dumb sensors, so it won't give you JPEG. But it had better support. libcamera worked there out of the box. There was the Millipixels application, as I explained before, a patched Megapixels, but it had no auto exposure, auto white balance, or autofocus support. It couldn't record video.
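The timestamp-and-symlink trick mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the speaker's code: the function name `save_frame`, the file naming scheme and the link name `latest` are all made up for the example.

```python
import os
import tempfile
import time


def save_frame(directory, jpeg_bytes, timestamp=None):
    """Write one frame to a timestamp-named file and repoint 'latest' at it.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    if timestamp is None:
        timestamp = time.time()
    # Zero-padded microseconds, so sorting by name equals sorting by time.
    name = f"{int(timestamp * 1_000_000):020d}.jpeg"
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(jpeg_bytes)
    # Build the new link under a temporary name, then rename it over the
    # old one, so a preview process never sees a missing 'latest' link.
    tmp_link = os.path.join(directory, ".latest.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(name, tmp_link)
    os.replace(tmp_link, os.path.join(directory, "latest"))
    return path


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    save_frame(d, b"fake jpeg data")
    print(os.readlink(os.path.join(d, "latest")))
```

A preview process only has to open `latest`, and the post-processing step can pair frames with audio chunks by sorting the file names.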
And there's more issues on the Librem 5. Kernel could use some work. It only gives you 8-bit data, which is not really good enough for good photos. You can select one of these three resolutions, so one megapixel, three megapixels, or 13 megapixels, and for some reason only 23.5 frames per second work. I don't know why. Hardware has phase detection autofocus, which is a very cool sounding toy, and I have to thank Purism for their hardware and for the great work they did on the software stack. They are heroes. That's the best photo I got with Nokia N900. So Millipixels, it's a very simple application. There's a small development team, so it's easy to work with, it's plain C, it's easy to merge patches. It does all the processing on the CPU, which is great if you want to change the processing. So I started to do auto exposure because that's the most important part, and I did a very simple one. I prototyped on N900 years ago. So basically, if you have too many too white pixels, like overexposed, you need to turn down the exposure, right? And if you don't have enough white enough pixels, you need to turn the exposure up, and this is it, and this works well enough. It takes a few seconds to converge, can be improved, I don't know how to do that, but this is good enough to take photos. Other thing is auto white balance. This is not that important because you can do it in post processing. Anyway, they did have manual white balance, so I felt this is easy enough to do. It will need some more work. Again, if it's too blue, you make it more red. If it's too red, you make it more blue. That's it, works well enough. And in a few hundred lines of code, I had simple software-only auto exposure, and I got that merged. Next step is autofocus. Autofocus is something which deserves more respect because you really want it tuned, but well, if you want to do it simply, you just start from the infinity.
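Before the autofocus part continues, here is a rough Python sketch of the exposure and white balance feedback steps just described. It is a sketch under assumptions and not the code that was merged into Millipixels: the thresholds, step sizes, limits and function names are invented for illustration.

```python
def exposure_step(pixels, exposure, white_level=240, too_many=0.05,
                  too_few=0.005, factor=1.1,
                  min_exposure=1.0, max_exposure=10000.0):
    """One feedback step: count near-white pixels and nudge the exposure.

    All thresholds here are assumed values, not taken from the talk.
    """
    bright = sum(1 for p in pixels if p >= white_level)
    fraction = bright / len(pixels)
    if fraction > too_many:
        exposure /= factor      # too many overexposed pixels: turn it down
    elif fraction < too_few:
        exposure *= factor      # not enough white-enough pixels: turn it up
    return max(min_exposure, min(max_exposure, exposure))


def white_balance_step(mean_red, mean_blue, red_gain, blue_gain,
                       step=0.02, tolerance=0.05):
    """One feedback step: too blue means more red, too red means more blue."""
    if mean_blue > mean_red * (1 + tolerance):
        red_gain, blue_gain = red_gain + step, blue_gain - step
    elif mean_red > mean_blue * (1 + tolerance):
        red_gain, blue_gain = red_gain - step, blue_gain + step
    return red_gain, blue_gain


if __name__ == "__main__":
    exposure = 100.0
    frame = [255] * 100                  # a fully overexposed fake frame
    print(exposure_step(frame, exposure))
```

Running one small step per frame is what makes such a loop take a few seconds to converge; a larger `factor` converges faster but can oscillate.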
You compute blurriness of each frame, and you only need to take a look at part of the image if you want to save your CPU, and you start your sweep, you start to bring the focus closer, and when the image gets more blurry, well, you stop. You might want to go a little bit back because of the physical issues of the lens, but this works way better than manual focus, and I got it merged rather quickly. Next step was video, so I decided that I like the ideas from Unixy Camera, and simply did 0.8 megapixels recording directly to the disk. I hacked Millipixels to save timestamped frames, and left post-processing until after the user presses the stop button. Easy to do, obvious disadvantages, right? You are now limited by the disk space, and maybe you could say it's not quite nice to the flash to just stream raw data to it, but hey, the flash is cheap and the phone will die anyway. Post-processing is quite long, it is five times slower than recording, but I guess this could be optimized. This is again my old code, so some Python with FFmpeg. Ideally, there is hardware to do the encoding, we should use it, but I feel that doing that is an awful lot of work. Anyway, this is now upstream, so if you update your Librem 5, you should be able to take videos, and I believe it's important to have something other than video recording. Next thing I want to talk about, which is very exciting, is phase detection autofocus. You may want to Google it for nice explanations, but basically they have selected some blue pixels, they are special, and they are special in a way that they only take light from certain directions. So you have a lens, and if it's focused, it's okay, the light comes and meets at the sensor, but if you are out of focus, funny stuff happens, and light from the left part of the lens ends up at a different place on the sensor than the light from the right part of the lens.
But if you block the light from the direction on the chip, which is easy to do, you can use it for focus. So if you take a line from the sensor, and you have on the top you will have left special pixels, and on the bottom you have right special pixels, for example, then you will have this. The tree you will see on the line will be at different positions on different special pixels. Well, and you can use this to focus, right? You just compute correlation between the two lines, and it directly tells you how much out of focus you are, and in which direction you should focus. This was great to play with, it was like hacking. Unfortunately, it is not too usable on the Librem 5. There are two issues: the special pixels are quite far apart, which they basically have to be, because if you made all the pixels special, you would lose your resolution, and it only works in the high resolution mode, and you don't want to run your preview in high resolution mode. So if someone is interested in phase detection autofocus, I have the code, the code is on the GitLab somewhere. It was a fun experiment, it worked, but I decided, like, for real focus, you would probably have to do hybrid, like do coarse focus using the phase detection, and then do contrast detection at the end. It seemed like a lot of work, and with the driver, which would only give you 23 frames per second, and so on. Well, I decided not to take this much further. So I have some wish lists, and I think I have, like, five minutes left. So five minutes talking, or five minutes questions? Including everything. Including everything. Okay. So I have a long wish list for all of the world. I would like to have better media controller support in the tools, because it just doesn't work. APIs changed, and the tools didn't catch up. I would like a library to do conversions between formats, and so on. I would like better than 8-bit support. I would like multiple applications accessing the camera at the same time.
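Going back to the phase detection idea for a moment: the "compute correlation between the two lines" step can be sketched as below. This is an assumed illustration, not the experiment code mentioned in the talk. The function name `phase_shift`, the use of normalized cross-correlation and the search range are choices made for the example.

```python
import math


def phase_shift(left, right, max_shift=8):
    """Return the shift (in pixels) of `right` relative to `left` that gives
    the highest normalized cross-correlation.

    0 means the two lines agree (in focus); the sign gives the direction
    and the magnitude roughly how far out of focus the lens is.
    Both lines are assumed to have the same length.
    """
    n = len(left)
    best_shift, best_score = 0, -2.0
    for shift in range(-max_shift, max_shift + 1):
        # Overlapping part: left[i] is compared with right[i + shift].
        lo, hi = max(0, -shift), min(n, n - shift)
        if hi - lo < 2:
            continue
        a = left[lo:hi]
        b = right[lo + shift:hi + shift]
        mean_a = sum(a) / len(a)
        mean_b = sum(b) / len(b)
        cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        var_a = sum((x - mean_a) ** 2 for x in a)
        var_b = sum((y - mean_b) ** 2 for y in b)
        if var_a == 0 or var_b == 0:
            continue  # a flat line carries no phase information
        score = cov / math.sqrt(var_a * var_b)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


if __name__ == "__main__":
    left = [0] * 10 + [10, 50, 10] + [0] * 10
    right = [0] * 13 + [10, 50, 10] + [0] * 7   # same feature, moved over
    print(phase_shift(left, right))
```

Mapping the pixel shift to an actual lens movement would need a per-device calibration, which this sketch does not attempt.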
Better support would be nice, and someone should really solve the caching problem, because that's bad. For libcamera, I shouldn't be really hacking Millipixels. I should be hacking libcamera, but libcamera doesn't really support software ISP, and I'm not a great C++ hacker, so I could do it, but they will reject the patches if I do. So I would much prefer them to do the preparation, and then I would fill in the code. And that's pretty much it. So time for questions. Sorry? We do want your work on software ISP. The comment is that they want my work on software ISP, and I guess I will want to cooperate, but libcamera is not easy to hack for me, because of the C++ stuff. So be patient, and maybe it would be better if someone else did it. Yes, so well, there will not be much to see. So you know, Millipixels could use some work too, but I can take pictures, trust me. I didn't use autofocus for this, because, yes, I can do it. So it's now upstream, so you can just update the operating system, and you will get one, and it should be possible to do just a short video recording too, so now you have all been recorded, and now the CPU is busy converting that. Okay, so I guess. |
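As a footnote to the contrast autofocus described in this talk (start at infinity, bring the focus closer, stop when the image gets more blurry), a minimal Python sketch could look like this. It is not the merged Millipixels code: the sharpness measure, the `capture_at` callback and the fake camera are assumptions for illustration.

```python
def sharpness(gray_rows):
    """Sum of squared differences between horizontal neighbours.

    Higher means sharper; only a part of the image needs to be passed in.
    """
    return sum((row[i + 1] - row[i]) ** 2
               for row in gray_rows
               for i in range(len(row) - 1))


def focus_sweep(capture_at, positions):
    """Sweep the lens through `positions` (ordered from infinity to near)
    and return the last position before the image became more blurry.

    `capture_at(pos)` is a hypothetical callback that moves the lens
    and returns a grayscale frame as a list of rows.
    """
    best_pos, best_score = None, -1
    for pos in positions:
        score = sharpness(capture_at(pos))
        if score < best_score:
            break               # got more blurry: we passed the peak
        best_pos, best_score = pos, score
    return best_pos


if __name__ == "__main__":
    # Fake camera whose contrast peaks at lens position 5.
    def fake_capture(pos):
        c = 100 - 20 * abs(pos - 5)
        return [[0, c, 0, c]]

    print(focus_sweep(fake_capture, range(11)))
```

A real implementation would also step back slightly after the peak, as mentioned in the talk, to account for the mechanics of the lens.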
Mainline Linux on recent Qualcomm SoCs: Fairphone 4
A look into the work of getting a modern Qualcomm SoC into mainline Linux. |
as we did, so we try to cover as many, it's unfortunate, there's so many topics to cover when you do mobile, for Linux mobile phone stuff. Okay, welcome Luca, we have seen the Fairphone being used as a presenting device this morning, so now we are going to learn how to put the kernel on it. Thank you, yeah so kind of very quickly who am I, my name is Luca Weiss, I've been learning phone since like 2017, I'm a postmarketOS core team member and also my day job is Android platform engineer at Fairphone. Kind of about the background of the whole situation, so I mean Qualcomm has a lot of SoCs, like system on a chip, there's quite a lot actually already supported, so you see all these wonderful numbers here, the ones launched since 2018, so like in the last four years, and they are already supported in mainline as in they have a DTSI file and you can build something on top of this and it's booting. But of course there's also many, many others that aren't supported, especially mid-end ones, like the high-end ones are normally very quickly supported by Linaro, so like for example the SM8550 is the newest one, it was basically supported on day one, or the same with the SM8450, but yeah the other ones are not, but you can of course do it yourself. So the device, Fairphone 4, uses the Snapdragon 750G, the SM7225, yeah launched like a year and four months ago, running the 4.19 kernel, so which is already, I mean we have 6.2 nearly, and yeah like what I have so far working on the 6.1 or 6.2 kernel is like all the basics that you can see here, USB including nearly the USB role switching, so you can actually plug in for example a keyboard into the device and not just use it like as a gadget, and internal storage and the SD card, so the UFS and other things.
Display with backlight control which is separate components, touchscreen, GPU, Wi-Fi, the remoteprocs which are like separate cores on the SoC, they are actually all booting, but at least for the modem one I'm actually not really able to communicate with it. Mobile data could also in theory work, so the Linux driver initializes, it actually gets the remoteprocs up, it already does some initialization things there, but it's not really testable without actually having the modem up, so it's kind of untested, it's upstream already. Vibration motor, the flash and torch LED which actually was upstreamed recently or is in the process of getting upstreamed, the camera I2C bus, so with like the i2cset and i2cget commands I can talk with the camera, and like get the chip ID, so that works, but yeah, not really much more useful with the camera yet. And also lots of other plumbing which includes like, yeah, of course all the I2C buses, interconnects, like the bus scaling, cache scaling, and yeah, a bunch of other stuff that's useful. So kind of what isn't working yet after this one year and four months that I've been sort of working on it, I have some parts that are actually sort of working, it's like the speaker, I can get sound out of the speaker, it is super quiet for some reason, I don't know why, and also for some of the audio formats it actually doesn't play at all for some reason, I don't know. One of the problems with the speakers is also like not very many phones in mainline actually, like Qualcomm phones, have audio working, so it is still kind of a new area where, yeah, where a lot of things are kind of unknown. In Bluetooth I have, based on some patch set that I found, you can turn on the Bluetooth, you can make the phone discoverable, so you can see it on other devices, you can actually connect other devices to it, but when you try to like in bluetoothctl do the scan on command, like it just fails.
So which is a bit weird, I don't know why, probably need to spend some more time on it. And also like of course all the other parts that don't work, so the modem as I said before, I can talk with the modem via QMI, so the Qualcomm protocol, but when I say please enable yourself, it says nope, and doesn't say anything else, so it's kind of difficult. The microphones which are like also kind of, it's a different part of the audio stack. The camera subsystem which is used for receiving image data from the sensors, it's not working, including the time-of-flight sensors, like for autofocus it can be used. And the video encoding, decoding hardware, so you can play MP4 for example without actually doing the decoding on the CPU, NFC, the fuel gauge, so for battery percentage, and the charging driver, they are not working, they are actually, I was able to port the one from the 4.19 kernel to mainline, like just import it. It does sort of work, the fuel gauge driver, but apparently there's some weird, really weird things going on on Android where like a user space component writes something to the kernel driver, and without this like nothing works basically, it's super weird. And also this part of the USB-C, what Alfred already demonstrated before, like it works in the hardware, but it doesn't work with mainline, just with the downstream kernel. So kind of what are the things that you need to have when you're trying to get a new SoC to work, it's like one of the first steps is kind of also figuring out how can you make this bootloader boot what you want to boot. Because in the Android case, like Google requires some special things going on, and also the way that many SoC manufacturers implement it is kind of sometimes working, it is working for Android, and that's good enough for them.
And for example, the dtbo partition, which is device tree blob overlay, on some devices you can just fastboot erase it, and then it doesn't try to apply some overlay for the old kernel on top of the new kernel, which doesn't work and doesn't make any sense. On some devices it just crashes and burns, and yeah, it is not fun. So mostly you can do this, and there's also on new devices with GKI, the generic kernel image from Google, there's also the vendor boot partition, I actually have no idea what this one does, and how you need to wipe it to be able to do something. The serial console is actually quite useful if you can have access to it in the bootloader, if it doesn't boot. Like if you cannot even get Linux booting, normally on a serial console it will say what it's doing and why it's not doing the things that you want to do. It's like, yeah, on the Fairphone 4, on the new SoC, I got the first boot actually after some hours of working on it, which contains the early console, it's just basically an already set up area where the Linux driver can write to it, and you get serial output, and also the display via simple framebuffer, which is actually now way more easy than getting USB up or getting actually a proper serial console up, so it's super nice. Simple framebuffer is where the bootloader already sets up the display hardware correctly, so Linux actually just has to write to some memory area, the bytes for the pixels, and it will just magically appear on the screen. It is very nice, very useful, and yeah, the first boot was in like 180 lines of the DTSI for the SoC and 40 lines for the device, so yeah, total 220, and no single driver change was necessary for getting a completely new SoC booting anything, basically. Yeah, I was basically just following what Iskren wrote on his blog, mainline.dev, super nice, it really contains useful steps for the very, very first things that you need to do.
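To make the "just write the bytes for the pixels to some memory area" idea concrete, here is a toy Python model of a simple framebuffer. It is only a sketch under assumptions: a `bytearray` stands in for the memory the bootloader set up, and the 32-bit little-endian XRGB8888 layout and the `put_pixel` helper are chosen for illustration; the real format and stride are whatever the device tree node describes.

```python
import struct


def put_pixel(fb, stride, x, y, r, g, b):
    """Store one pixel into a framebuffer-like buffer.

    Assumes 4 bytes per pixel, little-endian XRGB8888; `stride` is the
    number of bytes per scanline.
    """
    offset = y * stride + x * 4
    struct.pack_into("<I", fb, offset, (r << 16) | (g << 8) | b)


if __name__ == "__main__":
    width, height = 4, 2
    stride = width * 4
    fb = bytearray(stride * height)      # stand-in for the real memory area
    put_pixel(fb, stride, 1, 1, 0x11, 0x22, 0x33)
    print(fb.hex())
```

There is no mode setting, no power management and no way to turn the screen off in this model, which matches the limitations of simple framebuffer mentioned later in the talk.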
Yeah, so if you want to go a bit further, you very quickly start to need the clock driver, which is GCC, global clock controller, driver, now you can basically just take whatever Qualcomm gives you, for example, for the 4.19 kernel, copy it over, modify a few small things, but then it works. You also, at least for the 4.19 kernel, also these power domains, which is like, is some concept in Linux, it's also called GDSCs, you need to, they were a bit differently implemented, not in the GCC driver, but you should put them in the GCC driver for mainline. Now there are more clocks with the RPMh, also like various other bits you should add to the DTS, because otherwise it just won't, like random things won't work, which are dependencies that are not really expressed in the device tree, but the drivers still need, for example, access to the SMEM for like doing various things. These definitions are basically all the same in downstream, so you can also mostly copy them over. Don't blindly copy them over, because it will be slightly different, but you can definitely get good inspiration of what you need to do. USB is, of course, kind of the next step, because just staring at the tiny text on the screen is not very good debugging, and you also don't have any input. You can do surprisingly much with simple framebuffer, but at some point you, of course, want USB, at some point also a pin control driver, which is for the pin multiplexing. This really only starts getting used once you get to more advanced components, let's say, for like I2C and other things, and also regulators are important at some point. I think that's actually, I don't know if you already need this for USB or not, but these are kind of the basic components that you need, and then you can start actually enabling various components that you find, like the flash driver, the vibration motor, and things to talk I2C and things to talk other protocols.
Of course, lots of things that can go wrong, the IOMMU especially on new Qualcomm chips, it's kind of annoying. I mean, it's less annoying than old ones, but it's still annoying, because a lot of things, like some things that you do, or yeah, let's talk about IOMMU directly first, so like what's a bit different between downstream kernel and mainline, is the bootloader already initializes something in this memory segregation, or like SMMU as it's also called. It initializes some things for the bootloader to use, for example, for the internal storage, it already initialized it. On downstream kernel, it normally just continues using those and adds some on top. On mainline, they actually get wiped completely, and they need to be set up again by Linux, which causes some problems if the downstream kernel doesn't express like which IOMMU to use, for example, for UFS, it's a very good example where it is bad. So you kind of need to find out, there is a patch, there you can actually dump the mappings, and yeah, you can use this to figure out what it is. Also, the device really likes to reboot when anything is not really right. If you access some register in the clock, and some clock that it requires isn't on, it just reboots. It doesn't give you a kernel panic, it just reboots. If you're writing to a wrong register, it reboots, if the IOMMU is defined wrong, it just reboots. Actually a thing for the IOMMU is that it sometimes gives you a message of why or at least that something isn't correct, but yeah. For debugging, what I've actually used sometimes is just printing the current line where it is in the driver, like sprinkling this everywhere, adding a sleep of like half a second, and then seeing like, oh, this is the last line that I was seeing, so it's probably messing up there. And maybe also increasing the sleeps, because sometimes the flushing doesn't happen, like printing it on the screen actually is a bit slow.
Also, like once you have more, like USB up, you can actually also build various drivers as modules, and this way actually, yeah, it's not built in, and if it's built in, it loads at like, I don't know, like kernel log second 0.5, which is quite early, and if it then crashes immediately, you don't really have any time to debug, but if you build it as a module, you can load it later and actually have something set up already. Yeah, like what is important to do if you work on this, actually if you have anything working, if you have something work in progress, just commit this into your repository already to have a reference point to go back to, because sometimes one single line change will fix everything or break everything, and like your first commit doesn't have to be perfect, obviously. But also don't let this sketchy branch that you have lying around once you have something working, don't let it sit around in your local repository, on your GitHub fork or in your GitHub repository, wherever, forever, and not upstream it, because then it will just rot there and nobody will know that it's there, and they will probably, like the next person has to do exactly the same thing again, even though you have already got it working. So like start upstreaming your patches early, like if you have simple framebuffer booting on the device, upstream it. It would be very nice, because there's also a better overview of which SoCs have already been worked on, and it's very nice. Of course like when you upstream it, you also have to do some extra things, for example, the new compatible strings that are used in the device tree need to be added to the documentation, and do some things there, but it's normally, it is, yeah, some extra work, but it's really not too bad. And also patches, just because of how Linux development works, just take some time to get upstream.
So like two months later, if you rebase on a new version, your patches are already there, so you can build on top, and don't have like 100 patches lying in your own tree. Also, git send-email is not difficult, there's a wonderful guide, git-send-email.io, from the sourcehut developers. It explains it super nicely, once you have configured it once, it just works, yeah. Thanks for listening. We basically have one minute for questions. When you get GPU working, you should also actually get the display hardware working properly. But yeah, this was fortunately done, for this SoC it was done by Konrad, who knows a lot there, like he got the display hardware completely up and the GPU also. This is used for actually, because with simple framebuffer, you cannot turn off the screen, you cannot basically do anything except just write pixels to some memory, write data to, or write bytes to a memory area, and that's it. Yeah, so you actually need to get the display hardware also up, but then you can also get the GPU up. And yeah, this one works really well in mainline. Like I ran a performance benchmark on it, it's actually not too bad, it's actually relatively close to the downstream version. Yeah. I've contributed to the SDM 625. Mm-hmm. Like, you know, how to upstream and manage the complexity of panels, generated panels? The panels? Yeah, the panel drivers are still, I think in general, a question of like how they should be handled upstream, because in theory, I think the panel drivers are not really generic, I mean, they are not generic. But in theory, they are relevant to the display controller, and not actually a panel itself, which is two separate parts, but like, without having actually access to like all the documentation that is like internal to the company, you won't find out which driver this actually is.
And currently, you let them sit around in your tree; most of them you can also just generate from a downstream DTB, and it works, so good enough for now. At some point we will probably figure it out, but the MSM8916 people already have like 20 or 30 panels there. I think TrustZone is always running, so like the bootloader, it is a signed binary, and you cannot really replace it without having access to the signing keys. It is running, I think, and I think this is also the thing that kills the phone when you're doing something wrong. You cannot get rid of it; you can probably somewhat communicate with it. I know that normally the fingerprint sensor is handled via TrustZone, so you actually talk to TrustZone for the fingerprint, but I actually don't know how this works. Okay. Thank you very much. Thank you. Thank you very much. |
Mobian: to stable... and beyond! |
Thank you everyone, and thank you for attending this presentation. So the title is Mobian: to stable and beyond, because right now we've only been doing development releases. But first, what is Mobian? You could think of it as a Debian derivative or, in Debian language, a blend, which is targeting mobile devices such as smartphones and tablets. We provide a separate package repository, but it's not a standalone distribution, and we have some ready-to-use disk images which are built for several devices, more on that later. But Mobian is actually a very, very small overlay. In our whole package repository we have 44 source packages, compared to 35,000 and more in Debian itself, so it's really some tiny bits, and actually we are planning to drop some of those packages. My hope is that one year from now we will be down to something like 15 or maybe 20 packages at most, because we have some transitional packages, and the most difficult to get rid of will be the device support packages, where we have downstream patched kernels and stuff like that. But in the end, Mobian isn't supposed to be a long-term project; it's really supposed to be merged into Debian itself and just improve the overall Debian ecosystem rather than being a standalone project aimed solely at mobile users. The question we have been seeing a lot over the past few months is basically: where can I find the latest Mobian stable image? You can't, because it doesn't exist yet. We target Debian testing, which is a moving target; you could think of it as a kind of rolling release distribution. Debian testing is frozen every once in a while, about once every two years, and then becomes Debian stable. The latest stable release from Debian was bullseye, which was released a bit less than two years ago, and back then we definitely weren't ready for prime time.
For example, we had version 0.6.8.2 while we are now up to version 0.24 for the compositor and shell sides, and there's been a lot of progress over the past two years. Back when bullseye was released, we didn't have stuff like eg25-manager, which is basically a piece of software handling the PinePhone and PinePhone Pro modem, configuring it properly to work as we expect. We didn't have MMS, and we had very few adaptive applications, because libadwaita did not even exist at the time: we had libhandy, but no GTK4 and no libadwaita. So in the end we decided against releasing a stable Mobian version for bullseye. The ecosystem was only starting to ramp up; there were still lots of issues, bugs and instabilities, and really a low count of actually usable mobile applications. So what does going stable mean for Mobian? If you look back at the bookworm development cycle, which is basically the past two years, we've seen some great progress both in the overall mobile ecosystem and in Debian itself. The ecosystem is really much richer than it was before, and it's still growing, and more and more people are creating or modifying applications so they can run just fine on our tiny displays. Graphical environments are more usable and way more stable than they used to be. I mean, if you were using Phosh two years ago, it was all tapping buttons and trying to get things right.
Last year we got the swipes, which was a huge usability improvement, and overall lots of bugs were fixed, so it can run smoothly on many devices, and that's just awesome. We even uploaded to Debian itself a lot of packages we were hosting downstream, and that even includes some Mobian-specific packages such as the splash screen theme, the installer settings and also the repository keyring. So we have the GPG keys for Mobian in Debian now, so if there's another mishap (it happened last year: we let the GPG key expire and users were stuck and had to download the keys manually), now they'll be able to just update the keys from upstream Debian and still have access to the Mobian repo. We also fixed some early mistakes and suboptimal choices regarding how we name packages, how we organize those and how we decided to ship all the device support tweaks. For example, we used to have for each device one tweaks package and one support package, which was just a meta package pulling in all the dependencies. Right now, for Qualcomm devices we have two packages which are in Debian itself: qcom-phone-utils, which contains all the tweaks which are common to every supported Qualcomm device, and droid-juicer as well, which I'll tell you a bit more about in a minute. So in the end, now seems a good time to finally go stable. So what will it look like? We have support for the devices we already support, basically, so those are the Linux-first phones: the PinePhone, PinePhone Pro and the Librem 5. We also have some Qualcomm-based devices, mostly SDM845, thanks to the awesome work the community has done on this kernel, and of course we also ship some x86 images, with or without non-free firmware depending on what you want, and those run just fine, for example, on the Microsoft Surface Pro and Surface Go tablets. This is really awesome. We'll also ship two flavors of Mobian, one with Phosh and the other one with Sxmo.
We would have loved to ship a Plasma Mobile flavor as well, but this won't make it. I'm pleased to announce that Plasma Mobile is finally in Debian itself, but we only have the basics, which are the calls, contact book, SMS and settings applications and of course the Plasma Mobile shell, and that's not enough to ship a stable image based on Plasma Mobile. So we'll hold that one back and start releasing it for the Trixie development cycle, which is the next Debian testing. And of course we'll ship an LTS kernel, and we'll commit to keeping it up to date with security updates and try to update it as often as possible for all the supported devices. We're also going to ship some kind of semi-universal images. One thing we'd like to achieve with Mobian is that you could just ship one image and flash it on any supported device, and the kernel would support the device. All the small config tweaks needed for this device would be applied automatically, the firmware could be extracted, and so on. We didn't quite get there yet, but we're getting closer. For example, the SDM845 devices are Android-based devices, and they need some proprietary firmware blobs to just work. The thing is, this firmware is shipped by the phone manufacturer. There's no clear license allowing you to redistribute it, so we just can't package those into Debian and call it a day. This is where droid-juicer comes in. It is a small runtime program. It runs on boot. It mounts the Android vendor partitions, fetches the firmware from there and copies it into the Linux user space root file system. Afterwards you rebuild the initramfs and reboot the device, and on the next boot you have your Android device with all the firmware you need, running just right, without the need for downloading firmware from the internet. By doing so we can also have one image for every single SDM845 device.
One root FS at least, because the boot image is using the device tree for the specific device, but you have one root file system and as many boot images as you have supported devices, and it just avoids the need for any device-specific tweaks. We hope that in the future this can be extended to other Qualcomm-based devices such as the Fairphone 4, for example, which by the way runs quite nicely on Mobian thanks to the work Luca has done so far. So that's one of the semi-universal images. The other one we're planning to implement is for all the Pine64 devices, because those need very few device-specific tweaks. Two of those, the PinePhone and the PineTab, already share the same kernel, and all we have to do, which is not that easy, is basically import the downstream patches for the PinePhone Pro into this kernel. This can happen quite easily, but we still have some details we need to work out, especially considering the audio configuration on those devices, due to the need to have the modem properly talking to the SoC in terms of audio, frequencies and so on. So this might get pushed back a bit, but we're working on it, and we really hope that it can be done for Bookworm, so that we only have SDM845 images, Pine64 images and one other for the Librem 5, which needs its own kernel because there are some patches that are incompatible with the PinePhone Pro kernel. They share the same block for the display output, and if it works on one device, it doesn't work on the other. Anyway, what will we do during the freeze period? Basically, Debian is being frozen in preparation for the stable release.
We cannot have new packages in Debian starting the 12th of this month, and one month later we cannot have any update at all unless it's a bug fix. But we'll still be able to work on downstream packages to improve the stability and fix the remaining issues, and hopefully, but we make no promise there, we'll be able to work a bit more upstream by submitting kernel patches and implementing proper Tow-Boot support for the Librem 5 and PineTab, for example. Maybe we could think of other things, but for now we're focusing on trying to improve things during the few months we'll have left before the stable release. So what's next once we have Mobian stable? Well, we'll obviously switch to the Trixie development cycle, tracking the next Debian testing and trying to get even better software support for mobile devices. We're going to try to make it easier to support new devices in Mobian. We're already paving the way with the SDM845 images and the Pine64 images and trying to get to a universal image, so we will hopefully make it easier for people to support their own device. We will also support 64-bit RISC-V. We actually have all the bits and pieces in place: we have a dev board which is set up as a GitLab runner and is able to build packages for this architecture, which is already supported in Debian, so that's one where we'll just flip the switch once the stable release is out. We'll keep packaging new software and new options for our users, be it Plasma Mobile, as I mentioned already, or Lomiri, the UBports user interface, and finally try to get this universal image thing working smoothly out of the box. That's basically it for me. You have a bunch of links there; the slides are uploaded to the website, so feel free to go there. I'm not sure we have time for any questions. A little bit. So first, thank you very much. One minute, two minutes for questions. Well, the question was, for the semi-universal images where we extract firmware from the Android vendor partition,
do we have a solution for getting the updates from the vendor itself? The answer is no. You just get what you have on the device at the time it's run. You can flash a new Android ROM on your device and then reinstall Mobian if needed, and then it will pick up the new firmware, but there's no automated way, and I really doubt that Android phone vendors will participate in LVFS to get updates to users in a timely manner. One last question, perhaps? Yes: would it be possible for Mobian to be completely assimilated into Debian? Almost. The only thing keeping us away from this goal right now is kernel support. If we manage to get fully supported devices in the upstream kernel, which means upstreaming lots of downstream patches and doing so for any new device which arrives in the next few years, then yes, we'll be able to be completely part of Debian and have no downstream repository at all. But for now we're being held back by the kernel situation, basically. Okay, thank you very much. We don't have more time. Thank you. |
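The firmware extraction described in this talk (mount the Android vendor partition, copy the firmware into the root file system, then rebuild the initramfs and reboot) can be illustrated with a minimal sketch. This is not droid-juicer's actual code: the function name `copy_firmware`, the file names in `WANTED` and the directory layout are hypothetical, and the mounting, initramfs rebuild and reboot steps are left out.

```python
import shutil
from pathlib import Path

# Hypothetical list of firmware files a device needs. How the real tool
# decides which files to fetch is not described in the talk.
WANTED = ["adsp.mbn", "cdsp.mbn", "modem.mbn"]


def copy_firmware(vendor_root: Path, target_dir: Path, wanted: list[str]) -> list[str]:
    """Copy wanted firmware files found under vendor_root into target_dir.

    vendor_root is assumed to be an already mounted Android vendor
    partition. Returns the sorted names of the files that were copied.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = set()
    for path in vendor_root.rglob("*"):
        # Take the first match for each name and ignore later duplicates.
        if path.is_file() and path.name in wanted and path.name not in copied:
            shutil.copy2(path, target_dir / path.name)
            copied.add(path.name)
    return sorted(copied)
```

On a device this might be called as `copy_firmware(Path("/mnt/vendor"), Path("/lib/firmware"), WANTED)`, with both paths being assumptions. As the answer above says, the result is whatever firmware is on the device when the tool runs; nothing here fetches updates.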
What's new in the world of phosh? |
Okay, the room is full. It's still one minute to go, but I think we can get started. Welcome Evangelos: what's new in the world of Phosh. He is a Mobian member and a Purism employee, so he knows what he's talking about. So welcome to my talk, and thank you for the nice words. We'll see whether I really know what I'm talking about or not. Maybe you can correct me if I say something wrong. Since you already did the introductions, there's not much that needs adding right now. Since I'm interested: how many people here in this room, maybe by show of hands, are currently running a Linux-first phone? Okay, that's nice. I mean, you're in the right room then. And out of those people, how many are running Phosh? Okay, that's, I guess, half of it or something. That's great. Okay, so what is Phosh, for those of you who might not know? It's a graphical shell for Wayland, and it's aimed at mobile devices, so you can use it with touch-based inputs. The UI is written in GTK, and we'll get back later to why it is really nice that it's written in GTK. And maybe just as a short history: originally Phosh started its life on the Librem 5 and on the PureOS distribution developed by Purism, which heavily invested in getting the GNOME-based mobile software ecosystem started, I would say. And while it originally was written with the Librem 5 in mind, it quickly spread to other distributions and now also runs on a lot of different devices; if you saw the other talks, for example Arnaud's talk about Mobian, you see that there's the PinePhone, PinePhone Pro, the Pocophone and so on. And maybe one thing that bears noting is that all of these projects are working really great together, from my perspective as someone who is upstream for some of the projects. It's really great to see all this cross-pollination going on between the different projects and distributions. So now to what has happened in the last year.
I guess most notable are the swiping gestures that were implemented. You can see in the video, for example, how it really tracks your finger movement, and it looks very nice. That is probably already old news to some of the people here in this room, but I think that was one of the greatest improvements in terms of usability, because, at least if you're anything like me, you would always accidentally open up the app drawer when you were aiming for the space key and went just one pixel below it. So gone are those days. Apart from that, there's been a bunch of quality-of-life improvements. For example, you can turn VPN on and off in the quick settings. The quick settings are now also accessible from the lock screen itself. And there's been all sorts of design overhauls, making sure the buttons and everything look really nice, thanks to the people that work on design and so on. Another thing that I find really enjoyable to use are the lock screen plugins, so you can have plugins to put some widgetry onto your lock screen. In these examples, you see a simple proof-of-concept calendar widget, the one on the left, that is probably not that useful unless you just happen to forget the date, and then it's great. Upcoming events is something that I really enjoy, because you see at a glance what's going on next. Emergency information is also something that you can have displayed there. And also notably, if you want to show tickets when the train conductor comes by, you can do that from the comfort of your lock screen. You can turn these plugins on or off in the phosh-mobile-settings application, which has a few different plugins you can enable or disable, and other settings like whether you want the keypad on the lock screen to shuffle whenever it's shown, so no prying eyes can learn your PIN from watching unless they watch really, really closely.
These are some more images from the mobile settings application. For one, you might find some device-specific things there. And in the compositor settings, which are also shown in the video on the right, you can, for example, enable scaling down applications whose windows are overflowing, to have them fit on your small mobile screen. Then we also have a nice thing that came about while at DebConf in Kosovo, where it was really nice weather: the main developer implemented automatic high contrast, switching between dark and light variants based on what the ambient light sensor reports. If you want to try that out, currently you would need to use these gsettings, and you may need to adjust the threshold to make sure that it works for you, depending on how sunny exactly it is. Apart from that, there were a few design overhauls on the Calls side, and also, maybe notably for people who have a large call history, it starts up a lot faster than it used to. The scrolling performance in the history has also much improved. It will get even better with GTK4, which is on the road map, but with GTK3, resorting to some hack that limits the number of widgets displayed in the list box makes a lot of difference, especially on weaker hardware like the PinePhone. And, I don't know if you're aware, but you can long press the entries in the call history, and from there you can, for example, start a new SMS or, if it's an unknown number, add it to a contact, and so on. Apart from that, oh, that image should not have been there, but yeah, Calls can also be used to make voice over IP calls using the Session Initiation Protocol. That was implemented some time ago and it should work for you, so if you have a, I don't know, JMP.chat or sipgate or some such account, you can use that for voice over IP calls.
And during the last year, support for encrypted media streams is also something that has landed. Actually, the call display will not tell you right now that it's an encrypted call, but you can trust me on that. On the Chatty side, that is the SMS application, or SMS and more, I should rather say, there was a lot of work on MMS especially, thanks to having mmsd. There was a lot of work on group messaging flows, and there's also work still ongoing on Matrix, which is something that I'm personally very excited about. Then maybe in the wider ecosystem, one of the things in GNOME that I really enjoyed is that we now have dark style preferences with the latest libadwaita and also in libhandy. And if you want to know more, I guess you can't click on the links, but if you go to the slides, there are some blog posts that I linked right here. And maybe just a few examples of some of the applications that have been made adaptive, since I think pretty much all of the GNOME core applications are now adaptive: Contacts, GNOME Software. And there's lots of things to look forward to; as I said, Matrix in Chatty is one of them. And fixing paper cuts, because I think we're in a position with Phosh right now where, as was evident from all the people raising their hands earlier, it is in good shape. There are still things that could be better, but it's definitely usable as a daily driver-ish. Okay, and if you want to reach out, look at the slides, and thank you. Three minutes for questions, plenty of time. Yes. I don't think there is, or maybe I'm wrong, but if you look at the notifications specification, I'm not sure that you can put real widgets on with all the bells and whistles.
As an application you can say, hey, here's an action, so the notification will give you some button to click on, which will then, I don't know, do something, reply to a message or something, but I don't think custom widgetry would work at all, because how would that work? Like, if it's a Qt app? Probably embedding it in the Wayland surface would be complicated, but the widget is, I mean, it's already built on GTK, so maybe. Okay. Oh, it's not specific to GTK though. The specification is cross-process, it has to work on X and on Wayland, so you can't embed something. I mean, since it's free software, there are probably always ways you could do things, but personally I'm not entirely sure how exactly you would implement something like this. Yes, hold on. Okay, I have a question. In the convergent mode, you have the application list there, and it would be great if you could press some button or something and say, make this application now run in full screen on the external screen, or maybe there are some key combinations. Yeah. You mean when launching the application? Okay. If you swipe. Yeah, I'm not sure, from the spatial model, since you already have these swipes to the right and left to go between the open applications, I'm not sure how you would do it. I see, a long press and some menu. Ah, okay, yeah, because that is quite missing for me. Okay, that's good to know. Please file a bug. Okay, so basically time is up, sorry, but we only have very little time for this. |
Ondev2: Distro-Independent Installer For Linux Mobile |
OK, so welcome Oliver. As I said, installing stuff: all the reviewers know first impressions count, so whether you have a nice installer or not will make or break your review of the Linux phone. So welcome Oliver. Thank you. So this is called ondev2, and I'm Oliver Smith. As you can see from the number, there's probably also a first version, right? So I'm going to tell the story of that a bit. It started with the PinePhone postmarketOS Community Edition, where we figured it would be nice if, when you bought the phone and had postmarketOS pre-installed, you would actually be able to encrypt it. So it's not just installed, but you can encrypt it like a proper phone. And that's how the idea for the on-device installer was born. It looks like this: basically a simple UI. You press the continue button on the welcome page, then you put in a password, a PIN for the user, and you can select whether you want full disk encryption or not. And that's that. After we released that, an additional feature came, which is that you can select whether you want to install to the eMMC from the SD card. So if you already have a PinePhone where something else is installed, or postmarketOS is broken or whatever, then you can just take an SD card, put the on-device installer on it, and then you will see this prompt here. It asks you whether you want to install to the SD card itself, which works after some complicated stuff, and you can also install to the eMMC and overwrite stuff there. That's what it can do, basically, and this is a cross-distro project, so not only postmarketOS is using this, but also Mobian. It looks like this; Mobian already did a talk. They boosted the font a bit, but it's basically the same. And they also added that you can choose the file system, which is quite nice. So I thought to myself, well, it's good that we have this, it works for what it does, and it was good as an initial version.
But what would be the perfect version of this, right? A lot of people tried this out, and some figured it would be nice if they had more options, like being able to select a file system or the host name, or what have you, like the 100 options you have in a desktop Linux installer. And other people were like, well, it's too many options already. What is this SSH user? (Which we also removed; there was a separate user which you could set up.) So it's a bit conflicting, and it was hard to figure out what would be the perfect version. Besides that, it would also be necessary to choose the language and locale, and it would be nice if it was adaptive, so it doesn't look like a letterbox like this when you run it on a laptop, which is the case with the first one. And also it would be nice if there was the same keyboard, because the keyboard was entirely different, and there was another keyboard in the unlock application, and then another one when you finally have it installed, so you had three keyboards, which is not very consistent. So my plan was to add a simple and an advanced mode, to deal with the fact that some people want more options and some people want fewer. The casual users can just go to the simple installation and be asked the fewest questions needed to get it running. And the advanced mode would be for the nerds, where you can pick everything, have it encrypted or not, choose separate passwords if you want, and so on. So I went ahead. This is the decision tree which you would have for the simple and advanced mode.
This is the simple path. It goes through welcome, then you choose your language and locale, then it asks you simple or advanced, then you say simple, and then you choose the installation storage if that's possible with your device. Then you would set up one combined password for full disk encryption and for the user, and it wouldn't even ask you if you want to encrypt it, because it's just assumed that that's the right thing to do for normal people who don't care. And then it's ready to install. And then we have the advanced path, which is more complicated. You get asked for the storage device, then you can pick the file system, the host name, the username, and maybe lots more options we can add later, and you set the user password. Then it asks you whether you want full disk encryption or not, and then it asks you whether you want to use the same password or not. The current installer didn't work like this: you had to type in the same password again if you wanted the same password, and that's not very optimal on the tiny phone screen, you know. After that, you're ready to install. So as you can see, this is a nightmare of choices, and it would need a lot of testing. You would need to test all the code paths every time, and to make this feasible, it needs to be automated. This was not possible with the current code stack, so I looked elsewhere. First, here's a slide: we need short test cycles and CI for all paths, and it should be easy to extend, so that it's not much effort to add a new option where it makes sense, right? Of course the idea is not to add endless options that don't make sense at all, but some of them are really useful for users. So I looked around to figure out what code I could use for the stack, and I found this nice application. It is called unl0kr, by Johannes Marbach, and it's a replacement for osk-sdl, which is the unlocking application you use to type in your password after you have installed it.
The nice thing about unl0kr is that it can use the keyboard layout from Phosh, so that's already pretty consistent then. It's very small, it's based on LVGL, it has a dark mode, and it's actually adaptive like this, so lots of great potential here. So I based ondev2 on this, and behold, on the next slide you will see what it looks like. It looks like this: it's still a similar format, you see some description, and then, in this case, you select the language and then continue. Here are a few example screens which it has. Here you can pick whether you want a simple installation or an advanced installation, and whether you want to install it on your eMMC or SD, and here you would set a combined password. I know the visibility icon is a bit blown up; this needs to be fixed, but that's the current state. And here it asks you if you want to use the same password or not, so basically what I showed in the decision tree earlier. Meanwhile, while it's displaying all these dialogs, you will see on the serial output that there's a text interface. It's kind of nice that you can also type in the answers on your keyboard, but it has some practical uses too. It can be used for accessibility: when we hook it up with some text-to-speech and speech-to-text stuff, you could actually talk your way through the installation. And it's of course very useful for testing, because then we can just run through the whole thing in an expect script. So there's one line which would interact with this dialog; there's a function, button, and it waits for a page with the title Advanced Options, then it looks for a button called Simple Installation, which has to be the first one, all based on the text output above, and it presses it. If the dialog doesn't show up, then it runs into a timeout and the test fails. So as you can see, we can add new tests and extend the
tests really easily. That was the goal, that it's super easy to test the whole thing. What comes next? Oh yeah, the code. This is what the code looks like for this Advanced Options page. The idea was to make it very small, and I believe I've accomplished that. You set the title and the description, which can be translated, that's why there's this T around it, and you set the buttons, and that's how you would add a new page like this. You don't need to add some XML file with the layout, or what have you, so that's quite short. And this is the button handling. It's also not complicated; you don't need to read it now, but basically that's my point, that it's easy to extend. Okay, and then we have the current state. It's still work in progress. What's done is: it runs entirely in the initramfs, which was not the case before, and this is also nice because it saves like 100 megabytes or so of overhead. There are these abstractions for pages and for installation steps, which also need to be extended. I just showed you the decision tree, but that's only the front end; the back end of course also needs to handle all these decisions and react based on what you chose there. Test cases are there already for installing from SD to SD and from SD to eMMC. These are just examples; we could also add NVMe of course, and whatever other installation media you have. But I think it's useful for the normal user to be able to see, okay, this is the SD card, and not that this is some /dev/mmcblk0 or what have you, so they know where they are going to install it. So I just called it SD, and you set that up in the config. And the tests run with losetup or QEMU. You can use losetup to run it on your own laptop, and this gives you the fastest test iteration cycle. You can also run it in QEMU, so you make sure that you actually have all the files in the initramfs, and that it works after rebooting, that you can properly boot
into the installed OS, so this is the more complete test. With losetup I actually rebooted my own PC once, so you need to look out for that if you run it next to your regular operating system. To do: what's needed to replace the first on-device installer is some more testing, fixing some FIXMEs and the usual stuff, and properly integrating it into postmarketOS at least. I would be happy if Mobian also wants to use this, and other distributions too, we have so many distributions out there, but postmarketOS is probably the first test case, because this was developed in tandem with it. And for the next level, this is the really exciting stuff: once it works and replaces the current one, we could actually support more devices, Android devices and Chrome OS devices, because for them you can't just install it on the whole storage device; you have to keep in mind that there are already partitions, and you need to use them in some way. And what's nice is we could download the OS images over Wi-Fi, so you would only flash this very tiny installer to your SD card, boot it up, and then you could pick any OS image and download it. You wouldn't need to create a new SD card every time, and it would be much faster than when you have to flash the whole image every time. Even better, we could also construct the OS images on the fly with the package manager, which is apk in the case of postmarketOS and Alpine, and then we might be able to get rid of some of the OS images, because it takes quite some resources to generate them every time, so this would be nice also. And that's the end of my presentation. Thanks to all these people who helped make this possible, and thank you for listening, of course.
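The simple and advanced paths of the decision tree described in this talk can be written down as a small function that returns the sequence of pages for a given set of choices. This is only a sketch in Python, not ondev2's code: the page names and the `installer_pages` function are hypothetical, and it assumes that the storage page is skipped on devices with a single storage option in both modes (the talk only states this for the simple path).

```python
def installer_pages(mode, can_choose_storage, encrypt=True, same_password=True):
    """Return the ordered page names for one walk through the installer.

    mode is "simple" or "advanced". encrypt and same_password are only
    consulted in advanced mode, because simple mode always encrypts and
    uses one combined password.
    """
    pages = ["welcome", "language", "mode"]
    if mode not in ("simple", "advanced"):
        raise ValueError(f"unknown mode: {mode!r}")
    if can_choose_storage:
        pages.append("storage")
    if mode == "simple":
        # No encryption question here: encryption is assumed.
        pages += ["combined_password", "install"]
        return pages
    pages += ["filesystem", "hostname", "username", "user_password", "ask_encryption"]
    if encrypt:
        pages.append("ask_same_password")
        if not same_password:
            pages.append("encryption_password")
    pages.append("install")
    return pages
```

Enumerating every combination of arguments gives the list of code paths that, as the talk puts it, would need to be tested every time, which is why the tests have to be automated.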
Yes, that's the plan, to use unl0kr to replace osk-sdl also, and that's quite far along, I believe. There were some bugs with some very few phones, this is why we didn't roll it out before, but it should be there sometime soon, yeah. Any more questions? Yeah? Yeah, you could flash the on-device installer to the place where you would install the operating system, and from there it can expand itself into the installation. This works as follows: you have first the boot partition, then you have a lot of empty space where you will put the installation, and then you have the on-device installer. It partitions and uses the empty space, creates the LUKS cryptsetup device there, then moves the data from the third partition in there, and then deletes the third partition and expands it, and that's how it also currently works. That's already implemented in the first version, and the second version can do that too. Yeah? I didn't look, but I would expect, I mean, it's LVGL, it's super tiny, maybe 50 megabytes or 100, it's tiny, really. You can compile the whole UI thing in like a second or so, so it's also fun to develop with this, yeah. I have a question, if there are no more: because it's distro-independent, what's the way to handle distro-specific things, like configuring repository sources, or the different ways to set host names on different distros, things like that? Good question. So there's a config file, and there are actually some distro-specific parts inside the main repository upstream. They are separated into a different directory, and there you have a directory structure for all the operating systems. My idea is that we test them in CI all the time, so we ensure that they don't break, so that when you make one change it still works in all the other distros. 
And you have a separate config file, and you can run your own code; there are some hooks. For example, after you're done with installing and you want to regenerate the initramfs, then you can put the commands for that in a shell script and run that after the installation. Okay, cool. Thank you. Okay. No more questions? Okay. So thank you very much. All the best. Okay. Thank you very much, Oliver. |
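The post-install hooks mentioned in that last answer (distro-specific shell scripts that run after the installation, for example to regenerate the initramfs) might be driven by something like the sketch below. It assumes a hypothetical layout in which each distribution has a directory of scripts named after the stage they belong to; the function name `run_hooks` and the naming scheme are illustrative, not taken from the project.

```python
import subprocess
from pathlib import Path


def run_hooks(hook_dir: Path, stage: str) -> list[str]:
    """Run every script named <stage>*.sh in hook_dir, in sorted order.

    Stops at the first failing script (check=True raises
    CalledProcessError) and returns the names of the scripts that ran.
    """
    ran = []
    for script in sorted(hook_dir.glob(f"{stage}*.sh")):
        subprocess.run(["sh", str(script)], check=True)
        ran.append(script.name)
    return ran
```

A distribution would then drop a script such as `post-install-10-initramfs.sh` (hypothetical name) into its directory, and CI could run the same hooks against a scratch directory to check that a change does not break the other distributions.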
Sailing into the Linux port with Sony Open Devices
A journey of adapting Sailfish OS to work on Sony Xperia phones |
So that's much better there. Yeah. Personally, I think in a perfect world everyone would just use mainline, and everyone would be happy. That would be a dream world. And while I'm a Jolla developer, in my own opinion, what postmarketOS does is awesome, and if everything worked this way, we at Jolla probably wouldn't have to use Sony AOSP. But the world is not at that place. And so I still advocate using hybris, especially if we have an open platform which also has open communication. This is why I think Sony Open Devices is still a good target to port to. A lot of Android-based adaptations have the problem that they don't have the blessing of the vendor. As long as you do it for yourself, it may be fine. Nothing will happen. But you don't know. A lawyer can write you a fancy letter and ask you why you're doing this, and now you're broke. And that's on you. And we at Jolla, we cannot do that. So that's why we choose Sony's Open Devices. And then one thing, especially for someone who is inexperienced: going to Sony Open Devices and trying to start a device is easier, because there's existing structure and guidance there. And the communication is open, not behind some back channel where you have to talk to some specific person. They talk to you. The communication is not always perfect, but they do talk to you, and this makes a huge, huge difference. And then also, the lifetime is really long. Whereas with other devices you may be stuck on one kernel, Sony Open Devices actively upgrades the downstream kernel for their devices. So that's also the reason why we don't use Android. So the Android updates don't really do that much. It's still an important factor for someone to decide. 
So it's not mainline, but it's in between: instead of being stuck on a really old downstream kernel, you have a more up-to-date LTS kernel that has most of the features that you need, which also makes it more attractive to port a Linux to. Yeah, I think I kind of skipped this already. So, contributing to Sony Open Devices. Personally, I started contributing to it because I wanted to have a different device to run Sailfish OS on, because I liked Sailfish OS as a platform, but I wanted to have more devices. While it was a reasonable choice that it started with the Xperia X series, I wanted a better device with a better display and better hardware, of course. So I chose that. And if you have some Linux experience, you can start trying to run AOSP on it quite easily. You just have to know how to use Linux a little bit and have a device and a computer, and that's it, really, except maybe some programming skills, but a lot of things you can just learn as you go. You don't really have to be an expert. Personally, I'm not someone who has a university degree or a vocational education, a formal certification or something. I have nothing. I just learned it as I went, and here I am. I think you don't always have to go the other way. Yeah. And then I wanted to highlight how Sony contributes, because I think many people wonder, like, what's the relationship between Sony and Jolla? Because it kind of looks like there's some deeper level of communication there. Well, yes, we talk to Sony. We have someone to talk to at Sony, but apart from that we don't really get special treatment. We are just one contributor, and anyone can be a contributor to Sony Open Devices. And the chance that your work then goes into Jolla devices is quite high, because of the way that Sony Open Devices works and the way that you can reuse work. So, we usually choose a mid-range Sony device as the reference. 
That Sony device is a reference port, which you can then use to port to other Sony devices too, or to other devices with the same Android base. Because nowadays, thanks to Google, things are more standardized, and you don't really depend on one specific device vendor, which is still not mainline, but it's getting there. And I think Google, as a company, they listen, and if they could make everyone work with mainline, they would probably do that, as they would do with Chrome OS, but device vendors are kind of stubborn, so things change slowly. Yeah, and usually we focus more on refining the adaptation for a specific target device, and through that the quality of the other adaptations usually gets better than it was before. Not everything is perfect, and some things are hard, like camera stuff, or audio. The audio guy in our company, when he talks, it's sometimes like he has some kind of black magic skills that I don't have, and it's really something. Yeah, and in general we contribute back quite frequently, so most of the work that we do for Sony Open Devices goes back to them. Some stuff is specific, but not much; it depends on how much effort it is. Yeah, so, as I said, originally I didn't really talk about mainline, but I still want to advocate for it, since I think it's a good thing, and if your device has a mainline port, and the mainline port is in a state where it's okay, then try to push it. So, I think the big thing about hybris is this: it's like when the devil in the Garden of Eden gave them the apple, the easiest solution, that's hybris. That's why I think it's still a good thing, and as I said, we're not in a perfect world, sadly. 
And then Sony's HALs: the quality is good to okay, depending on the area. Some areas, like camera, are okay, but that's just the topic in general, it's not really their fault. And it's open, so there's a clear line of separation from the blobs. Even open devices, even devices that have a mainline kernel, don't run without blobs at all; there's still some firmware, and it's just that the firmware sits on a different level, that's really the big difference. And because you have more features, there are probably also a few more blobs, but that's why it's easier to get there. In a lot of ways, mainline takes some time to get there, whereas with hybris it's easier to have some kind of functionality. And yeah, really, a mainline kernel is nice, and in the long run it's really better. The downstream kernel gives you a better base at first, but in the long run it's just annoying. And while Qualcomm is better than anyone else, really, it's still not nice; sometimes there's funny stuff in the kernels that you don't expect, and it robs your time when you just try to update the kernel, which Sony probably feels quite often. So, I kind of skipped this part already, but I wanted to give some ideas for target directions. I quite closely followed the port by Caleb to the SDM845, which is the, I think, 2019, 2020 high-end SoC from Qualcomm, and those devices are really very good today, a good start to try things and tinker around and then to go further. While those are mostly OnePlus, there's also a SoMainline port, I think, but SoMainline right now is not in a state where I could demo it. If it was, I would have shown something here, that was the idea, but sadly I couldn't get that far, so yeah. And then the PinePhone and the PinePhone Pro are really great devices, and we also have ports to the PinePhone and PinePhone Pro, and at our stand you can see them. 
We don't have a PinePhone Pro here, but we will probably find some people who have one, and then we can show you. And then, this maybe sounds strange, but in some ways KVM is a really good target to try mainline on, because it's PC hardware, essentially, and you can develop the basic middleware components on it, and then have a testing target that people can try out and just have running. It's great, and if there's time I can show the demo of it; I have one on my notebook, and if not, then at our stand, there. So, as I said, I'm a developer, but I come from the community, and I want to give back to the community, and I want to bring some of the processes that we have to the community in the same way, or in a somewhat similar way. Most of the processes will probably work for mainline ports and Sailfish too, or maybe also outside of Sailfish, but some stuff just doesn't apply. So, for example, since you port a reference device of a given year, for example 2023, to Sony Open Devices, most of the work can be used for any other Sony device of that year. For example, I personally port the high-end Sony devices, and I can either test the same commit when I work on the Jolla device, or I just have to change it a little bit and then import it from the Jolla repositories. That's why there's so much shared. It's like 95% of the port is the same as on the Jolla port. There's just different testing, so in the end there's more work, but really very much is shared. Sorry, can I see a timer somewhere? Yes, we have three more minutes. Oh, sorry, sorry, I have to process that, really. Including discussion. Yeah, okay, I'll try to speed up a little bit, sorry. And then, the other infrastructure can also be used, and the issue tracking can be done similarly to the Jolla ports, and the changelogs. So, yeah. Okay. Sorry, I'm not sure if I can cover all of this, but I will try. 
Yeah, so Jolla just tracks the work in our internal issue tracking, and then we generate the changelogs from that. But we can do the same with GitHub, and then generate changelogs with DNF, and have a Markdown-based changelog that is built from that. So you close an issue on GitHub, and then we generate a changelog for the RPM package, and then with dnf repodiff we scrape the difference between the repositories and have a changelog from that, which works really well. And I think it really helps people to just contribute and test stuff; that way a lot of people who are not programmers are still contributing, and it really helps a lot. So, yeah, okay, that's my last point. Yeah, and then the last thing is, right now the community doesn't build the Android parts of the hardware adaptation on the OBS, but with some scripts that I have, you can do that locally using the OBS and then upload the binaries, which is a lot cleaner than doing this with the HADK, because you always have a clean environment; it just takes a few more resources, yeah. So, how much time is left for questions now? Oh, first, thank you very much. Yeah. Thank you. We have one minute for questions, so any questions? Yeah. Yeah, you? Yeah, I have a short question. Yeah. In one of the slides about your process you had written that changelogs are done manually, I don't know, with? So, we have, so it's similar to the other one: we generate the technical changelogs, where just the raw changes are in there, they are generated, and then we have a manual changelog where we write an overview for the non-technical people, and that's the idea behind it. So, you should still look into it, whether it's exactly as you wanted, and not just take it. Okay. One more? Yeah. Yeah. Ah, hello, Ellen. Thank you for the talk. Yeah, thank you. 
I thought we should make two small clarifications. Yeah. One was regarding the fact that it's easy to port to other devices: it's easy to port because all Sony devices use one kernel. Yeah, yeah, I should have said that. Yeah. And the second one is regarding the bugs. If you find any bugs, you can always post them on the bug tracker. Even if you're not working for Jolla, everybody can post over there, and we won't give you that. Yeah, so the reason why I didn't mention it, and this is how I think of it: we are just one contributor, and anyone can report bugs, so anyone can do it. It's open, it's just there on GitHub, and you can fork it and then contribute and make a pull request, that's really it. Yeah. Okay, time is up, so thank you very much. I will also be outside of the room if you have more questions, and later I will also be at our stand, so in case you have more questions, you can just ask. I forgot to put a link here. Sorry. But I will post my slides, and I will have the references on the slides, and you can just find our project on GitHub, and the Sony Open Devices project of course, also on GitHub; just create a bug there and ask, really, that's the only thing that you have to do to get help. Yeah. Thank you. |
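The changelog workflow described in the talk (closing an issue in the tracker feeds a generated, Markdown-based technical changelog) can be illustrated with a small sketch. It works on an in-memory list rather than the GitHub API or any repository diff, and the `Issue` type, the function name, and the output layout are assumptions made for the example, not the project's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    number: int
    title: str
    closed: bool


def markdown_changelog(version: str, issues: list[Issue]) -> str:
    """Render the closed issues of a release as a Markdown bullet list."""
    lines = [f"## {version}", ""]
    for issue in sorted(issues, key=lambda i: i.number):
        if issue.closed:
            lines.append(f"- {issue.title} (#{issue.number})")
    return "\n".join(lines) + "\n"
```

The manually written overview for non-technical readers that the speaker mentions would sit next to this generated list rather than replace it.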
AMENDMENT Writing a convergent application in 2023 with Kirigami |
Hello everyone, we will talk about Kirigami, about our convergent applications for Plasma Mobile and Plasma Desktop. So, who am I? I've been a KDE developer since 2018, I've been a member of the KDE e.V. since 2019, and I started in KDE by basically working on the website and the documentation, and then I started doing applications. I've worked on multiple Kirigami applications, not alone, for example NeoChat, a Matrix client; Kontrast, a contrast checking application; Tokodon, a Mastodon client; Kalendar, which is a calendar application but also a bit more than a calendar, with for example contacts and soon emails; and now Arianna, an EPUB reader. I work at KDAB, but I just started three weeks ago; before that I worked at Nextcloud. So what is Kirigami? Kirigami is a framework, on top of Qt, to basically build applications. It's written in QML. QML is a language developed by Qt; it's a declarative language, basically a sort of mix of JSON and JavaScript, and it has really good integration with C++, so you can always use C++ as well. QML was basically developed to build mobile applications in the Nokia times, when Qt was owned by Nokia, and that's why we are also using it for Plasma Mobile, but also to develop convergent applications, not only for mobile but also for desktop. So yeah, basically how it works is that you have multiple pages. Here you can see two pages: one of them is the list of emails, and the second page has the content of the email. There's also a drawer concept: basically you can have things on the left and right with additional information. This is how it looks on the desktop, and on mobile you get only one page displayed at a time, as you can see, instead of two only one. But it's the same UI, it's still the same code, and the drawers are also mobile friendly, so you can see on the left that the drawer collapses.
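The convergent behaviour just described (two pages side by side on the desktop, only one page at a time on a phone, from the same code) boils down to deciding how many columns fit into the window. The sketch below shows that idea in Python; it is a simplification and not Kirigami's actual layout algorithm, and the function name and the default column width of 320 are assumptions.

```python
def visible_pages(pages: list[str], window_width: int,
                  column_width: int = 320) -> list[str]:
    """Return the pages shown side by side for a given window width.

    At least one column is always shown, and the most recently pushed
    pages (the end of the list) take priority.
    """
    if not pages:
        return []
    columns = max(1, window_width // column_width)
    return pages[-columns:]
```

With a page stack of `["inbox", "message"]`, a wide desktop window shows both pages, while a narrow phone screen shows only the message that was opened last.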
Basically there are two ways to show pages: either you put them in columns, like here, with the pages next to each other, or you can push a page as a layer on top of the others. Each page has some actions, so you can basically add some buttons at the top. There are more details if you want, but that's basically how you build Kirigami applications: you see there are a few concepts, with the pages and the drawers, and that's basically how most Kirigami applications are put together. For example, for chat applications like NeoChat, you only have two pages, one on the left for the list of users or chats and one on the right with the chat view, and yeah, this is how it looks. And basically how Kirigami started: in 2015 we announced Plasma Mobile, but at that time we were still using the Plasma components, so basically the same components that we use for the Plasma desktop, we were using them to create applications. But this caused some issues, because first, they had the look of Plasma applets rather than applications, and second, they didn't really allow convergence, so you basically had to build one application for mobile and another application for desktop with the same functionality, so there was a lot of duplication of effort. So one year later Marco Martin, the maintainer of Kirigami, created Kirigami so that you basically only have to build one application for both mobile and desktop. This is a long time ago, eight years ago, seven years ago. Then we tried to add more integration with the desktop, because at the beginning the Kirigami applications were still more like mobile applications. So one of the things that we did first was to create a style for the desktop called qqc2-desktop-style. Basically Qt Quick Controls, the QML framework that we are using, allows you to have styles that implement the controls, so we have a desktop style which looks like desktop applications, like normal QtWidgets applications. 
Later, we also did a lot of work on the colors, so that they use the same colors as the desktop: a Kirigami application uses the same colors as normal QtWidgets applications. Before 2020 there was almost no device on which we could use Kirigami, so the development was quite slow. I mean, there was the Nexus 5, which kind of worked, but it was still an old device that was slow, really slow. Then in 2020 Pine64 announced the PinePhone, and I think that really helped to get new contributors. I mean, that's the way I started contributing to Plasma Mobile, it was because of the PinePhone: you finally had a device that you could buy and that supported Linux. And last year we also launched the Kirigami Addons, I will go more into that later. Basically, in recent times we have brought in new components, for example, this is the settings page from Tokodon, the Mastodon client. We reworked for example NeoChat as well to have a nice look, and again the Tokodon settings. It's actually the MobileForm components; they are mostly for doing forms, but we are also using them for other stuff, basically to display information. We stole the idea a bit from GNOME for that, because theirs look quite nice, and it basically allows some consistency across the forms. One of the latest additions to Kirigami is the tree views. It's used in Kalendar, and it allows you to display information as a tree, which is pretty useful for the task view in a calendar, where you can see the tasks and the subtasks. Another recent addition that I contributed is a search pop-up. It's a nice component where you basically have a search field with a pop-up that appears when you click on it, with the results. It's pretty nice, I think, with nice animations, but yeah, that's mostly it. 
Thank you very much. Questions, comments? Yeah. Yeah, I was interested in how many devices support Plasma Mobile, or what is the most productive device to use Plasma Mobile on? Can you repeat? What kind of devices support Plasma Mobile well? I mean, there is the PinePhone and the PinePhone Pro, which supports it as well. The Plasma Mobile team basically doesn't really do all the other parts; we focus on the UI, because that's what we are quite good at. But the postmarketOS folks, I mean, everything that runs postmarketOS, for example, can run Plasma Mobile, because it's basically just a shell on top of postmarketOS, or Manjaro, which has a mobile edition, and there are a lot of other small distributions; I think openSUSE has a mobile variant. Basically everything that has mainline support. The important thing is that it needs to have mainline support. At the beginning we also supported non-mainline devices, but we had to stop that, because it was too much work to support both this sort of device and mainline. We had a lot of issues with the telecommunications stack, because we were using oFono, but it wasn't really working well for us, so we switched to ModemManager. But basically every device that runs mainline Linux would also run Plasma Mobile. I mean, you can also run it on the desktop if you want; I wouldn't recommend that, but you can do that. May I ask one more question: what is the long-term goal of the Plasma Mobile project? 
I mean, we have a partnership with Pine64, so they shipped phones with Plasma Mobile, but it's quite hard. I mean, no mainstream mobile manufacturer has support for mainline Linux, and they're not really interested in supporting that, and they're not really interested in having something other than Android run on their devices, because, I mean, there are not enough applications. That's one of the issues: Android's ecosystem has a lot of applications. There is actually a way to run Android applications on Plasma Mobile, but it will take years before we maybe see it installed on mainstream devices. I'm mostly focusing on applications, because that's the part where we are lacking the most, native applications; I don't really do the Plasma shell stuff. Other questions? Do you have some kind of support for progressive web apps? I think the browser does support web apps, if I remember correctly; it was implemented by Jonah a few years ago. Do you have any support for switching between light mode and dark mode? Basically it's using the color scheme API, and it works; in NeoChat it's implemented, so you can switch the color scheme, but by default it just follows what the Plasma colors are. Basically we are trying to have the Kirigami applications, which are quite new and not yet completely mature, look and behave exactly like the older QtWidgets applications, so if you change the colors, every application will change its colors, the QtWidgets ones and the Qt Quick ones. But you can as well implement that one application should only use one theme, one color scheme; that's possible. Yeah? Do you see more KDE desktop applications being ported to Kirigami in the future? 
It's always a bit of a controversial topic. With Kalendar, I've basically been working, with Claudio, on the entire PIM stack, all the Kontact applications, like calendar, mail, contacts. It's quite a bit of work. I think it's worth it because it looks a bit nicer, but it's still a lot of work to get it completely right. What we really want is basically that the Kirigami applications and the QtWidgets applications, because we can't get rid of all the QtWidgets applications, look the same and work the same, so that when you interact with them it's consistent. I mean, that's our goal, because we can't just redo everything in Kirigami. Thank you. No more? If not, then thank you very much. Yeah. So, we have 15 minutes left. |
Where do we go from here?
The future of Linux on Mobile could be exciting, scary, or both! |
Welcome to the final talk, wow. So lots of people want to know where to go from here. Welcome, Clayton Craft. We are really happy; his plane barely made it, despite bad weather and all that. Some might know him as craftyguy in some chat rooms. Well, I don't need to introduce him here. Welcome and yeah, glad to have you here. Thank you. Yeah, so I'm Clayton, also known as craftyguy. You might recognize me from some of my contributions to postmarketOS. Or you might recognize me from my avatar, right? I started contributing to this distribution back in 2017, mainly because it had initial support for the Nokia N900, which was the first Linux phone I owned. And by 2017, I was really tired of the two options for mobile operating systems. I wanted something that could run a recent Linux kernel, that had a familiar user space, and most importantly, wasn't trying to exfiltrate personal information all the time. And I still feel that way today, which is why I'm here, because I think that we as a community need to try to answer the question, where do we go from here? Because here today, the situation is quite a bit different and has improved in some ways over the last few years. For example, there's a number of phones out there now that can run Linux, some of them out of the box, which is really exciting. However, when you look at how many phones exist out in the world today, thousands and thousands of them, only a small handful can do this. So I think there's some room for improvement there, obviously. There's also a lot more distributions, both Linux and, as you saw in the previous talk, other alternative operating systems that can boot on these phones. However, I think there's not a whole lot of coordination between distributions today, because a lot of these distributions are targeting the same hardware. They also are targeting some of the same use cases. And so a lot of them are trying to solve some of the same problems and have some of the same goals. 
And a lot of distributions are kind of doing it on their own and not really comparing notes and trying to work collaboratively to solve these things in cases where the work being done to solve the problem could be used by multiple distros, for instance. And another exciting thing I think today is there's a lot more applications that have been created, both with some of the work from Purism and other folks in the community, that lets these applications work pretty well on mobile form factors. However, when you consider non-technical end users and what they expect from a modern smartphone, there's still a lot of missing functionality there. So again, more room for improvement. And I think there's a lot of people both inside and outside the community who are really interested in what we're doing. And I think a lot of them are kind of asking the same questions. Specifically, one question I think everybody's asking, no matter who you are, is what's it like, right? What's it like to have a phone that can run Linux and use it as a daily driver? What's it like to depend on that for navigation and communication? The answer to that question depends a lot on who's asking it and where they're coming from. For example, an end user, when they ask, hey, what's it like to use your phone that's running Linux, they want to know can they message grandma on WhatsApp or can they use it to navigate from their hotel to the FOSDEM conference and stuff like that. Myself, as an OS developer, when I think about this question, I tend to think about a lot of the problems I run into with developing and maintaining an operating system on Linux phones and how a lot of these problems, again, are shared between distributions because, again, we're targeting a lot of the same hardware and use cases. 
And I think about how hard it is today to create or solve problems in a way that can be reused by other distributions without a whole lot of rework on their part, right? A recent example of this is a Librem 5 user, Chris Vogel, who last week was trying to work around a problem on the Librem 5, and he created some patches for this workaround and submitted them to Purism, and I actually just happened to come across the patches because I was trying to address the same problem on postmarketOS. And his patches look good from the context of PureOS, but they were pretty much unusable for me on postmarketOS because of just differences in the distribution, right? His patches were relying heavily on systemd services in order to trigger things to apply workarounds, and I don't have systemd in postmarketOS, so that was a non-starter right there. I was able to talk to him and give some tips on how he could redo it so that it would work across multiple distributions, even ones without systemd. And I think this is kind of like the current happy situation where he's off creating something now that could be reused, right? But I think there's a lot of cases where, because people don't know there's other distributions or know what they need, oftentimes people run into problems like this and they create something which works totally fine for them, but is not usable by or not even known to other distributions with the same problems, right? So, like, I might end up recreating or redoing a lot of the work, and then it's inefficient, right? I would rather spend the time not solving problems that have already been solved elsewhere, but, you know, adding new functionality or supporting users who are using the distribution that I'm helping to develop. And I think, like, we need a number of things as a community in order to address some of these inefficiencies with, like, maintaining distributions that target a lot of the same hardware and use cases. 
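One way to picture the kind of portability discussed here is to keep the workaround itself in a single init-agnostic script and only vary the thin wrapper that triggers it. The sketch below is not what was actually suggested to the patch author; it is an assumed illustration. The `/run/systemd/system` test mirrors the documented check behind `sd_booted()`, while treating the presence of `/run/openrc` as a sign of OpenRC is an assumption, and the wrapper descriptions are placeholders.

```python
from pathlib import Path


def detect_init(root: Path = Path("/")) -> str:
    """Guess the running init system from well-known runtime directories."""
    if (root / "run/systemd/system").is_dir():
        return "systemd"
    if (root / "run/openrc").is_dir():  # assumption: OpenRC state directory
        return "openrc"
    return "unknown"


# The workaround logic lives in one shared script; only the trigger differs.
WRAPPERS = {
    "systemd": "systemd unit that calls the shared script",
    "openrc": "OpenRC init script that calls the shared script",
}


def wrapper_for(init: str) -> str:
    """Pick the thin per-init wrapper, falling back to a direct call."""
    return WRAPPERS.get(init, "run the script directly")
```

The `root` parameter exists only so the detection can be exercised against a scratch directory in tests instead of the live system.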
It'd be really nice if, like, in that previous situation I just spoke about, there was a place for a developer like Chris or myself or anyone, right, to ask for feedback directly from the community and be fairly confident that they're reaching, like, you know, critical mass of the community, right? And also where distributions can sort of, like, provide this feedback. So when people do solve problems that they're experiencing or when people are trying to implement things that they could really use, that they have the opportunity to provide feedback and, you know, the person doing the work can take or leave the feedback, but at least know that they're getting input or have access to this input so they can create something that's usable by everyone, and we don't have these cases where people are just kind of one-off doing the same thing, right? I think it would also be really nice if we had as a community a list of priorities that we care about, both, like, goals and also, like, these shared problems. The main purpose of this is, like, when contributors come along and they want something to work on or not sure what to do, they could see this list of priorities and, you know, if we come up with it as a community, we can put stuff up there we care about, obviously, and when people choose to work on these priorities, then we all benefit, right, because they're working on things that we said are important to us. And so, you know, maybe it'll provide some motivation or inspiration for folks that want to contribute and aren't necessarily certain how. And I think by kind of addressing the first two things, we'll inherently create a stronger relationships within the community, right, between individuals and projects, and I think that these strong relationships are critical. Like, if we want to have any chance of convincing, you know, businesses or governments or what have you, or even just end users, right, like... 
Like, if I want to try to convince, you know, a new group of users or something to give this a shot, we need to be somewhat organized, have an idea of what we're trying to accomplish, and be able to communicate that well externally so people know what we're all about. And these strong relationships, I think, are necessary for that. I mean, it's great meeting people here at FOSDEM, but it's very one-off, right, and we need to maintain that, and I think we maintain that by better organizing and, you know, trying to implement some of the things here I think we need. So, I'm here to propose forming a committee. I'm not even sure if committee is the right word for this, but the bottom line is we need to somehow be more organized than we are. Not necessarily, you know, dictatorial or anything like that, but at a higher level, just addressing things like having a place for people to get feedback, and I think a committee or some central place where distributions and projects are represented could be a place like that. I also think, as a developer, I'm not necessarily the greatest at communicating what I'm working on, what my motivations are for working on it, and what Linux on mobile or free software on mobile has to offer. So, I think we should also work on our public representation, and having a committee, or whatever you want to call it, be the single point for communicating to the world what we're doing and why we want to do it would be important. Like, I know why I'm here, right? I don't want to have a corporate, centralized device that's leaking personal information, and I want the freedom to hack on this thing and, you know, do what I want, more or less. But, again, I think a lot of us are developers or engineers, and we're not necessarily the best at communicating that, to non-technical users specifically.
So, I think, you know, having some central thing where we can work together to create something that can educate the world about us is, you know, nice to have. Now, I know what you're thinking: who is this guy? Why would I want some oversight committee, some authority, telling me what I can and can't work on during my free time? Because I know a lot of people here, myself included, are working on this during our free time. And, yeah, the last thing I want is somebody to be like, here are the priorities for you, when are you going to have them done by? Because that's silly. And I completely agree. That's not the purpose of this. And the point is, I don't really know what this looks like once it's organized, but I think we need to organize. So I created a working group. There's a link to the Matrix room on the slide, and I would invite everybody here, everybody listening online, everybody in this community to join in, and let's figure out how we can become more organized. And, oh, I'd like to thank my employer, Igalia, for sponsoring my travel to come here to give this talk. And, yeah, any questions, comments, opinions? Pretty short talk, but... Yeah, go ahead. Thank you very much. Yeah, that's one of the motivations for why I spent way too much time thinking about this. Yeah, for problems like that that exist across distributions, it'd be nice to know about that, right? If you're also trying to work on supporting a new device or, you know, improving something in your distro, you want to know if other people have had this problem.
And right now, you have to kind of know what other distros are out there who might be working on this thing, and then, you know, know where to find information, and then go search, like, a million different GitLab instances or whatever to figure out whether this is a problem that other people have seen or not. And it's kind of a mess, right? And the same goes for a lot of other problems I've come across. And so, yeah, that's the idea, right? Have a central place where problems like that can be expressed, and where the people who are working on them, no matter what distribution they're in, can work together on these things. Yes? Yeah. Yeah, that's tricky, right? Because in the desktop world, there's some focus on Flatpak and other ways to package the runtime. So then it kind of doesn't matter what the distro is, right? You can reuse the same runtime, and then you write your application and you target that thing and... I don't know if that's right for us, that's a specific thing, but I agree, there should be a way that people can talk about these things across the distributions we have. And I think there's a fine line. I like that there are a lot of distributions and I like that they're all doing their own thing, right? Because postmarketOS was started by Ollie for a very specific reason, and people started Mobian because they wanted to run Debian on their device, and there's less focus on what the runtimes are there. But I think it's a good thing that there's so much distro diversity within the community, and I wouldn't really want to try to shoehorn in any particular runtime mechanism, or whatever you want to call it. On the other hand, I know it's hard for application developers, right? Like, as you said, so...
On desktop Linux, for example, you can expect it to have some, like, Qt or GTK or something. And this still is not a given on the... Right. ...like some sort of Qt, some special GTK, or even something custom. Yeah. It's not like you can even count on a common Qt library... So it would be nice if you had a way to ask distributions, hey, what version of GTK3 are you running, or something, right? And be able to get input directly from them. So that way, at least, you know, here's the minimum version I need to support. I know it's not the ideal situation where you just support whatever you want, but... Yeah, basically, the idea is that you'd be able to go and say, hey, Linux mobile distros, I want to target this version of GTK4 for this application. If you care about this application, is there a version I should look at targeting? Today, if you wanted to do that, you'd have to know what all the distros are that might be interested in using this thing you want to create. And then you need to know how to contact them. And even when you do, the people who have an opinion might not even be online or, you know, might not be available, or maybe you asked the wrong person or something. So it's not great, right? So what I'm proposing is to have a way that you could ask for feedback from all the folks in the community, and people who care about what you're doing can be like, yeah, here's the version I use. Or maybe this person is representing their distro and they take it back to the person working on their distro who knows, and then they, you know, give you the answer back or whatever. So it's a way to convey information, basically, to and from people who are interested in, you know, solving those problems. Yes? I like that you mentioned the public representation. Yeah. Yeah, sure. I mean, I get reminded daily, right?
My wife's like, what are you working on exactly? And I'm like, oh, some phone stuff, you know. But I know there are a lot of people, friends, family, whatever, who have tried to search online for what this Linux mobile thing or free software mobile thing is all about. And they tend to see, you know, posts by the people and projects who are the loudest, talking about what they're doing specifically, but not what the whole thing is about, right? And, yeah, so if a business or you or somebody's interested, you kind of just get this hodgepodge collection of information, and it's hard to figure out what exactly is going on here. Yeah, it's like, you're interested, and you follow the trail of information and you end up at some point, and then you start from that point and maybe you go to another. And it's fine like that, but it's a big effort for the public. Yeah, yeah. For now, it's for the techies. Right. And it's okay, because it started from that group of people and we need those people, otherwise it wouldn't exist, but I think it's something that should be ensured. Yeah, thanks for proving my point. Yes. How is this different than, like, Linux in general? I mean, a lot of people use phones in a way that they're not very aware of what the operating system is, maybe. How is it different? Has Linux succeeded where mobile Linux hasn't, or something like that? Yeah, I mean, depending on how... You've already had this kind of central place, but I don't think it does really. No, no, no. Desktop Linux is what I call it, I don't know. Yeah, you know, like Red Hat and Canonical and those folks who are doing desktop and server distributions. Those communities are kind of dominated by certain companies who are out there to make a product and sell support and sell services and whatever. I mean, there are some OEMs in our community who are selling products.
Purism is the obvious one, but we don't really have any big corporate participants in this community yet. I honestly don't want to see that happen, because I think there's a lot of history with trying to run this type of environment on phones, right? And I think that some of the past failures were due to big corporations getting involved and dumping a ton of resources in. You could argue whether it was done effectively or not, but then they kind of just gave up when they lost interest because, you know, it didn't turn a profit as fast as they thought it would, or whatever. So I don't want to recreate that. And I also realize we don't have, you know, a ton of money pouring into this right now, which could be a good thing. So, I mean, this is my attempt to try to organize without waiting around for somebody to be like, hey, that's a business model I need to throw money at, and then just overwhelming us with, you know, one option and one or two devices, and sort of pigeonholing the whole community in that way. So, I'm hoping that by bringing this discussion up now, we can prevent that from happening. And, yeah, it'd be cool if desktop Linux had something like that. But I think we're also in a more unique situation than people who want to run Linux on their desktop. The hardware there is kind of boring, right? It's mostly x86. It's kind of a solved problem. Every so often, you'll get a Wi-Fi module that acts up and, oh, wow, you know, unsupported hardware. But on phones, it's like, oh, wow, the whole platform doesn't work, right? And then you kind of start from the ground up. And it's getting better, as Luca was talking about. With the work that people are doing on mainline Linux, it's getting better, right, with device bring-up. But there's still a lot of weird hardware out there.
And so, yeah, I think a lot of the organization can benefit some of that, because, again, a lot of these distros are targeting some of the same hardware. So when you run into a problem, it's almost certainly going to be specific to some device model or some family of SoCs. And so, as distros, we want to know what those problems are, so we're not having to try to solve them individually. Hope that answers your question. Yeah. I did think there's at least two things, you know, proposed. No, I think those are two things I'd like to see happen. I think they're very much related, right? What I mean by that is if the distros can get their shit together, then the end user experience gets better, right? In my opinion, if people who develop applications don't necessarily have to think too hard about the distros, or can at least, you know, get the feedback necessary to make something that works everywhere, that gives end users more choices. They can run more distributions, and they may not care, right, in some cases, but also, it kind of sucks using a phone and running into a problem that's distro-specific, right? You want your applications to work the same regardless of what distro you have, right? Because you don't want to have a phone running postmarketOS that supports, you know, these applications and these features and whatnot, and then a device running Manjaro or Mobian or something else where a different set of things works, and then, you know, another phone with some other distro on it where yet another set of applications and stuff works. So I think it's about getting all the distros, you know, organized, and I don't think it's just distros. I think OEMs and other projects that are in the community should also be a part of this as well.
And by kind of getting our stuff together, then we can help with providing a more consistent experience for end users who, you know, that's what they want. They want their phone to work. That sounds like a great closing statement. |
Welcome to the Friends of OpenJDK (Foojay.io) Developer Room! |
Good morning, everybody. Thank you so much for being here. In the Friends of OpenJDK developer room, the primary focus of what we'll be talking about throughout today is Java, as you'll see, but there's also Kotlin. OpenJDK is large and welcoming to all kinds of languages and technologies, of course. And what we're doing by means of the Foojay project is bringing everyone together across the ecosystem, across the users of the OpenJDK. Over the past years, of course, there's been a lot of innovation. There are a lot of different vendors of Java. There are a lot of different releases of Java. So it's really hard to keep up and to know what's actually happening in Java. And what we're doing by means of the Foojay project is bringing everyone together to have one place where people can go to be updated about what's going on, in particular in Java. So that site is foojay.io. Go there every day. There are new articles. There are tips. And what's also really important is there's information about the new releases. So there are frequent releases of Java, which is fantastic, that there's so much pace and innovation and change. On the other hand, it's really hard to know what's actually happening and what's been added to those releases and to the LTSs, and just to keep up. So on the foojay.io site, we have information about all the fixes and all the enhancements, et cetera, that you can find there. There's also a Slack channel that you can join. So you're going to see this particular URL quite a lot today. This is going to point you directly to the Slack. And the Slack channel is for everyone interested in just discussing Java in one way or another. If you're getting started with Java, if you're some kind of expert Java developer, if you want to share tips and tricks, if you want to collaborate with the Java community at large, just join the Slack channel and you will find many new friends. Today, we have a really great program for you, and it's a fast-paced program.
So every 20 minutes, there's a new session. And it's by a range of different people from a range of different companies. You're going to see Simon in a minute talking about, after all these years, why Java is still so popular. You have Johan talking about upgrading to Java 17 and then a range of other topics. So just stay sitting. You'll be entertained throughout the day. And if you're bored at any moment, just wait 15 minutes and you'll be in another session. And again, you're going to see this link throughout the day. This will bring you directly to the Slack channel. And with that, I'm going to tell you again to go to the site. Every day, there are new tips and insights and articles and so on. And the link directly to the Slack channel. |
After Nearly 30 Years, How Is Java So Popular? |
Okay, so good morning and welcome. It is great to be here in person. Now, what I'm obviously going to talk about is why Java, as a platform and as a programming language, is still so popular even after nearly 30 years. Can you believe it's going to be 28 years in May since Java was first released? The first thing that we need to think about when we're talking in this respect is what do we mean by popularity? Because there are many different ways of measuring popularity. If you think about something like music, you know, you can say Ed Sheeran, for example, is popular. And you can measure that popularity by saying he sells so many albums, he sells so many downloads, he can fill seats at stadiums when he plays live. So that's an easy way to measure the popularity of something like that. But when it comes to programming languages, when it comes to platforms, how do you measure their popularity? Well, one way is at conferences like this. If you want to do a presentation on a particular platform, how many people are going to turn up to listen to you talk? But from a wider perspective, there are people who do surveys. And there are a couple that we look at on a fairly regular basis. So who's heard of the TIOBE Index? Okay, yeah, a lot of people have heard of the TIOBE Index. Now, this is kind of a weird one, because TIOBE actually stands for The Importance of Being Earnest. Why? I mean, what does that have to do with programming languages? But for some reason, that's what they call it. So the TIOBE Index is one that tracks how popular programming languages are by looking at metrics like the number of job opportunities that are being advertised, the number of GitHub projects, the number of questions that are posed on Stack Overflow, things like that. And what we've seen with Java is it has maintained a very consistent level of popularity. It's literally been in the top three for the last probably 10 or so years until January.
This year, it dropped to fourth place, which is a bit of a surprise. But I don't think that that is really a trend that we're going to see continuing. It's not going to be slipping further and further down. Two reasons for that. First, if you look at the numbers that were in that survey, Java actually went up in terms of popularity. So even though it dropped in terms of its overall ranking, its popularity actually went up. And the second thing is that the one below it is actually C#, and that has less than half the popularity level that Java does. So I think we're some way away from seeing Java losing its popularity. So let's talk about why Java is so popular. The first thing we need to look at is what made Java popular in the first place. And if we go right back to 1995, I'm going to ask, anybody here remember the launch of Java? Anybody here? Okay, good, because there are a few old people in the audience. Excellent. And this was what really made Java something that people got excited about. This little dancing Duke here, because essentially what you had before Java was web browsing that was static. So there was no way of including interesting functionality in a web page. I mean, again, I'm old enough to remember Mosaic. Anybody remember Mosaic? Oh, good. Okay, yes. Mosaic is a web browser. Now that was purely text. Even if you wanted to show an image, you had to fire up an external application to render the image. That's how basic it was. So when Netscape came along and they included Java in their browser, suddenly there was this wonderful way of programming things and including applets. So that really fired up Java. People got excited about the idea of write once, run anywhere. You could move your application from one platform to another without having to change the code, without having to recompile it. Now, in terms of continuing that popularity, one of the things that was a bit of a drawback, if you like, was that it was controlled by one company.
It was Sun Microsystems, and they sort of wanted to maintain control over Java because it was something they'd invested in and they wanted to try and make money from it, very logically. But people were sort of pushing for an open standard. People really wanted more openness around the whole idea of Java. So in 1999, we got the Java Community Process. Now, rather than having, like, ANSI or ECMA drive the standard of Java, there was an open standard, but it was one that Sun still maintained a level of control over. But this was good because it maintained that popularity. And then the next thing in 1999 was also a shift in terms of how people use Java. Rather than applets, rather than desktop applications, which had been very exciting at the beginning, it turned out that wasn't really the best place to use Java. And what we've seen is the shift of Java onto the server side. That's where Java really hits its sweet spot. Server-side applications, the ability to scale to internet workloads, the ability to deliver on those types of things. And we got Java EE, Enterprise Edition. We got servlets. We got EJBs, all those good things. And again, that sort of morphed as well. So we now see things like Spring. Who here uses Spring? Okay, yeah, lots of people use Spring. So it's very, very popular as a framework for developing enterprise applications. And again, we've seen that sort of migration away from the JCP to the Eclipse Foundation for Java Enterprise Edition. And now we've got Jakarta EE. And then moving forward again, in terms of continuing to deliver on the popularity of Java, the thing that Sun did then was to say, okay, we'll actually open source the whole platform. This was really a push because Apache had the Harmony project. There was IBM, who were very heavily involved in that. And they wanted to create an open source version of Java. Sun resisted the push to do that for quite a long time.
But eventually they said, okay, there is going to be an open source Java at some point, it might as well be ours. And they created OpenJDK. Now, initially in 2006, that was just the HotSpot virtual machine and the Java compiler. There was a lot of due diligence that needed to be done in order to actually ensure that they had the rights to open source all of the code that was included in the JDK, because there was lots of stuff that had been contributed by other companies. So it wasn't till 2007, and in fact it was JDK 7 build 31, which was the first that you could actually build completely from the open source in OpenJDK. And what that's led to is a huge growth in terms of contributions from the community. This is the wonderful thing about this as a project. It hasn't just been Sun, and then obviously Oracle once they acquired Sun Microsystems back in 2010. We've seen lots of companies contributing all sorts of engineering, all sorts of work to the OpenJDK project. And I've listed, you know, not all of the contributors here. This is just, you know, some of the bigger ones who add not just bug fixes, but new features as well. You know, we've seen things like Shenandoah from Red Hat. We've seen all sorts of different projects being added to the JDK, not just from Oracle. So this is really what's helped to drive that popularity: it's been an open standard, it's been open source, and it's very popular on the server side. So what makes Java so popular today? Why are we still seeing Java's popularity? One of those things, and Geertjan sort of mentioned this at the beginning, is the fact that we're now seeing Java evolving much more quickly. It used to be that we had two, three, even four years between releases of Java. And now, since JDK 9 back in 2017, we have two releases a year. That's really quite significant. So there's a lot more progress in terms of delivering on the new features that programmers want, developers want.
That's in order to keep the language fresh, keep it appealing to developers so that they can continue to do the things that they want to do. And as the architectures shift, you know, to things like microservices and that type of approach, the language needs to adapt to that. And we've seen that. We've seen over several releases a number of new features being added to the Java platform, to the Java language, which is keeping it fresh, and therefore driving its popularity even further, keeping it at that level. Obviously, JDK 8 introduced the idea of lambdas and streams, which gave us a way of doing functional programming in Java. And I think lots of people here would agree that JDK 8 was a big release. It was one where suddenly people went, oh, this is great. Now I can do all these cool things with streams and lambdas and stuff like that. JDK 9 introduced modularity, a bit of a divisive one in terms of a feature. Lots of people didn't really see the benefit of it directly. But what we have seen, now that we are moving to microservices and we want to put things into containers, is that the ability to use modules, the ability to use jlink to create a runtime which is tailored specifically to your application, is very powerful. It means we can reduce from sort of 300 megabytes as a Java runtime down to maybe 40 or 50 megabytes by only including the modules that we need for our service. So that's a very powerful thing, even though it's not a language-level thing. JDK 10, local variable type inference. So, you know, JavaScript's got it. It must be good. Let's add it to Java. Again, you know, it's one of those sort of minor features, but still something that people liked. Then in JDK 14, we got records.
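As a minimal sketch of two of the features mentioned above, the following combines a JDK 8 stream with a lambda and a method reference, plus JDK 10 local variable type inference (`var`). The class name `StreamsSketch`, the method `longNamesUpper`, and the sample data are illustrative assumptions, not something shown in the talk.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names are illustrative only.
final class StreamsSketch {

    // Keeps the names longer than minLength and upper-cases them,
    // using a lambda (filter) and a method reference (map).
    static List<String> longNamesUpper(List<String> names, int minLength) {
        return names.stream()
                .filter(name -> name.length() > minLength)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 'var' (JDK 10) lets the compiler infer List<String> here.
        var names = List.of("Duke", "Loom", "Valhalla", "Panama");
        System.out.println(longNamesUpper(names, 4));
    }
}
```

Note that `var` only infers the type of a local variable at compile time; the variable stays statically typed.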
That's another big thing, I think, where people suddenly had the ability to create data types, and rather than having lots of boilerplate code for all of the classes that just wrapped a few values, suddenly you could use a single line and define a record with a set of values that you want to store in there. Sealed classes in JDK 15. Many people look at it and go, I don't get why I need sealed classes, but when you start getting into pattern matching and the idea of switch using pattern matching and having exhaustiveness in that switch, then sealed types do make a lot of sense. And pattern matching is really the kind of big thing that we've seen over the last few releases. Pattern matching for instanceof, pattern matching for switch, pattern matching for records, all of which is taking some of the rough edges off the language and just making it a little easier for developers, and then maintaining that popularity. And in terms of some of the bigger features as well, OpenJDK is driving those projects. We've got things like Amber, which is the whole idea of the small language features like records, like pattern matching and so on. But then there are other ones like Loom, which came in JDK 19 initially. We've seen some more stuff coming in JDK 20. That's another very exciting thing from the point of view of scalability of Java applications. If you have a server-side application that has lots of connections coming in simultaneously, but those connections are I/O-bound, meaning you have to wait a long time for something to happen in terms of I/O, using virtual threads through Project Loom is going to increase the scalability of that application by orders of magnitude and make it much more performant. Project Panama, a replacement for JNI. So now we have foreign function interfaces, foreign memory interfaces. Anybody here use JNI? A few people. I remember talking to one of the people who initially designed JNI and I said, it's always a little bit complicated to use that.
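To make the records and sealed types discussion concrete, here is a small sketch under stated assumptions: the `Shape`, `Circle`, and `Rectangle` names are hypothetical, and the example uses pattern matching for instanceof rather than pattern matching for switch so that it stays runnable on Java 17, where the switch form was still a preview feature.

```java
// Hypothetical example class; the type names are illustrative only.
final class ShapesSketch {

    // A sealed interface lists every permitted implementation.
    sealed interface Shape permits Circle, Rectangle {}

    // Each record is a one-line data type: constructor, accessors,
    // equals, hashCode, and toString are generated by the compiler.
    record Circle(double radius) implements Shape {}
    record Rectangle(double width, double height) implements Shape {}

    // Pattern matching for instanceof binds the cast variable directly.
    static double area(Shape shape) {
        if (shape instanceof Circle c) {
            return Math.PI * c.radius() * c.radius();
        }
        if (shape instanceof Rectangle r) {
            return r.width() * r.height();
        }
        // Not reachable in practice, because Shape is sealed.
        throw new IllegalStateException("unknown shape: " + shape);
    }

    public static void main(String[] args) {
        System.out.println(area(new Rectangle(2, 3)));
        System.out.println(new Circle(1.0).equals(new Circle(1.0)));
    }
}
```

With pattern matching for switch, the compiler can check that every permitted subtype is handled, which is the exhaustiveness benefit of sealed types described above and makes the trailing `throw` unnecessary.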
You've got to do things like header files, and then you've got to find the right libraries and stuff like that. He said, yes, we designed it that way. And I thought, that's not really the idea, is it? You don't want to design something to be difficult to use. But that was their guiding principle. So Panama is about making that much easier. Valhalla, we haven't really seen much of that yet, but that's a bigger thing in terms of how to create a different way of doing things in the Java language from the point of view of storage. Things like value types and the ability to store a collection of primitives. Rather than having an ArrayList of Integers, where you have to use auto-boxing and unboxing to use the wrapper class and create specific object instances, what we'll be able to do is have an ArrayList of ints and actually store those primitives without boxing them. So that's again helping to deliver on performance and solving some of those small problems. And then we've got things around the idea of startup time for the JVM. Because you're using bytecodes, you have to convert them into native instructions, and that's always been a bit of an issue for the way applications start. So we've got Java on CRaC, the idea of Coordinated Restore at Checkpoint. You can freeze an application and then restart it. We've got Project Leyden, which is again looking at the startup time of Java applications, more related to some of the work that's happened in GraalVM, where you have a native image and you compile for a specific platform rather than the write once, run anywhere idea. Freedom to choose. Again, Geertjan mentioned this at the beginning. This is one of the really powerful things about Java, that you have so many options, not just in terms of versions, but in terms of distributions. OpenJDK gives you the source code, and then lots of people have taken that, they've built it, and then they can provide that to you as a distribution.
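The boxing cost that Valhalla aims to remove can be illustrated with today's Java. This is only a sketch of the current situation, not of any Valhalla syntax (which is not shown here); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the names are illustrative only.
final class BoxingSketch {

    // Sums a List<Integer>: every element is an Integer object
    // that has to be unboxed when it is read.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (int value : values) { // auto-unboxing: Integer -> int
            total += value;
        }
        return total;
    }

    // Sums an int[]: the primitives are stored directly,
    // with no wrapper objects involved.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> boxed = new ArrayList<>();
        int[] primitive = new int[1_000];
        for (int i = 0; i < 1_000; i++) {
            boxed.add(i);       // auto-boxing: int -> Integer
            primitive[i] = i;
        }
        System.out.println(sumBoxed(boxed) + " " + sumPrimitive(primitive));
    }
}
```

Both methods compute the same result; the difference is that the list holds references to `Integer` objects, while the array holds the values themselves.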
You can install it, and then if you need support, if you need maintenance around that, you have lots of different options for that. So it's very important having that freedom. Last thing then, why will Java remain popular in the next 20 years, let's say? And I think there are several reasons for this. Java is open. The fact that we have an open standard, people can contribute to that, people can join the JCP, they can join OpenJDK, open source. Java's continuing to evolve. It's not static. It's not stationary. We're seeing lots of different things happening to Java which address the needs of developers and keep it fresh, allowing people to suggest things in terms of new features and so on. Java performs. We have this idea of the virtual machine. We've seen how things like JIT compilation could improve performance over statically compiled code. And we're looking at how to take that even further, using different ways of doing static compilation, different ways of doing dynamic compilation and so on. Java gives you choice. As we saw on the previous slide, the fact that you have so many distributions to choose from, you have so many versions that you can choose from, gives you that choice. Java's community, the fact that we have so many people here on a Sunday, that always blows me away, that I get to talk to people and they come in their time off to listen to things about Java. So there is this huge community around Java, and interest in sharing people's experiences. And the last thing, Java gives you freedom. The freedom to choose, the freedom to do what you want to do. And so that's it. That's my 20 minutes of presentation. I think I just about managed to get it in there. So thank you. |
Why And How To Upgrade To Java 17 (And Prepare For 21) |
So let's get started. This session is about why and how to upgrade to Java 17, and Simon already talked about the great new stuff that's been developed in the last couple of years. But I was wondering, what version of Java are you currently using in your projects for your company? So who's already using Java 21? Please raise your hand. Of course that was a trick question, because that's early access, so I guess you don't use that on a production machine. But who's using Java 17 already? Quite a few, so basically you can tell the rest how to do it now. Who's still using Java 11? Okay, quite a few as well. Java 8? Okay, older than Java 8? That's not too bad. So the interesting thing is, well, I mean, Simon talked about Java existing for almost 30 years, and a lot of cool features have come in in the last couple of years. Still, a lot of people are on 8, which is almost 10 years old, which is quite interesting, of course. You miss all the cool new stuff. And that's basically what this session is a bit about. So if you have any questions, it's a short session, so please ask me afterwards on Twitter, LinkedIn, or somewhere else. This session is based on the GitHub repository I created, where I collected all kinds of examples from projects where I did the upgrades to newer versions of Java, starting with Java 8 and up until Java 17, and a little bit about 21 as well. So it basically gives examples, shows you error messages that you see when you upgrade, things like that. So you can use that when you do the upgrade yourself. So why actually this session? I think a lot of people see it as challenging to upgrade to a newer version of Java. And a lot of people also estimate it as a really large amount of work.
To give a concrete example, at one of the companies I worked at, one of the teams estimated that just upgrading their application, which was a one-team effort and a couple of years old, would take them a quarter with a couple of people to go from Java 8 to Java 11. You can guess what management then said. They said no, basically, so they couldn't upgrade. They had already stayed on that version for a couple of years. And while I'm not trying to convince people to change their estimates, that estimate was really high. In the end, I didn't want to have all the discussions. And I was lucky that my manager simply said, you do whatever you think is right. So I upgraded it myself in about two weeks. Of course, I had some previous experience, so maybe it takes a bit more if you haven't done many Java upgrades in the past, but still, I did it next to my normal job. Most of the time, you're waiting for the build to finish anyway and all the tests to run. So it's not that much work. And what I find interesting is that we always blame management that we cannot upgrade our applications, but in the end, it's us who benefit most, because then we can use all those cool new features. So before I had the relaxed manager who said do whatever you want, I often did it on a Friday afternoon. I upgraded to a new version of Java, tried to compile stuff, tried to tweak it a little bit, and then I got a better idea of the work, like, hey, we only need to fix the tests, or maybe some stuff that doesn't compile. And then I could say to management, hey, give me two days, or give me a week. And then they are often like, okay, you're already working on it, we can give you a week and you can upgrade. And that's often beneficial to yourself as well. But there is also a good reason to do it for the company, because the new versions of Java also offer you free performance improvements, security updates, things like that.
So there are a lot of good reasons why you should upgrade to a new version. And I want you to actually upgrade. So something changes in Java, and you build your application on top of it. Most applications have some dependencies. I only know a few that basically build everything themselves. But in general, you have some dependencies, and if something changes in Java, those dependencies might break, or your application might break, or both. And then, as they sometimes say, a lazy developer is a good developer. If you wait, those dependencies will update their code to make sure that it compiles on the latest version of Java, often already before the new version is released, sometimes shortly afterwards. And then what will actually be removed from Java? It's basically anything, from tools like Java Mission Control to methods to all kinds of things. If you go to my GitHub, you can find some references where you can see this in detail. For instance, on the Foojay website, you can see what methods are deprecated, removed, et cetera. You can get a lot of those details, which is quite interesting. But you can also simply try it out and upgrade to a new version, and you get those compile errors for free. Then on to upgrading your dependencies, because I noticed that if you keep your dependencies up to date, it's relatively easy to update your Java version. If you have really old dependencies, it might be tricky, and then you first need to upgrade them before you can go to a new version of Java. And there are some interesting tools to help you there. For instance, the Renovate project automatically creates pull requests for all your dependencies in your GitHub repository. So then you only have to merge them. You don't have to search for the version or anything like that. There are also Maven and Gradle versions plugins that can show you the latest versions or can automatically update those versions for you.
Now, one thing to keep in mind with that is that sometimes your artifacts change their name. For instance, first we had Java EE, as was explained before, with the Java EE API packages in the javax namespace. Now it's Jakarta, so that basically changed. So make sure not only to use the latest version, but also to check whether an artifact has a fork or some other name. And there are also some nice plugins for that. The Old GroupIds Alerter plugins, for Maven and Gradle, basically alert you that, hey, you're using some older project, there's a newer project available, you should change your dependencies. So again, most of it you can automate. Then let's look at what changed in the different versions of Java, which you will probably encounter when you upgrade to a newer version. Starting with Java 11, one of the bigger changes was that JavaFX was removed. However, there are still separate builds of JavaFX, like the one from Gluon. You can use it as a Maven dependency. And what a lot of people are unaware of is that there are different builds of the JDK. So you have OpenJDK, there's the Oracle JDK, there's an Amazon JDK, there's a Microsoft JDK. Some vendors offer more than the standard JDK. They offer extras, like for instance, JavaFX included. So for instance, for the Liberica JDK, they offer a build with JavaFX still included. The ojdkbuild project is still mentioned here. I forgot to remove that one, because that project basically stopped, so they don't release any new updates nowadays. Another interesting thing that changed is that in the past, Java contained a few fonts. Just a really small set of fonts, used if fonts couldn't be found elsewhere, and now they are removed. On a normal operating system, that doesn't matter, because then Java will use the operating system's fonts.
But if you use a really small operating system, like Alpine Linux, that also doesn't have any fonts. So then Java has no fonts, the operating system has no fonts, and you get some really weird errors about missing fonts. For instance, tools like Apache POI, which you can use to operate on office documents, apparently use some fonts under the hood. A colleague of mine got this issue, and the solution is basically to install some packages with fonts. Something else that was removed is Java Mission Control. You can now download that separately. If you want to do some profiling or monitoring of your application, I can highly recommend having a look at it. It's really interesting, but it's no longer part of the JDK itself. It's a separate tool to download. And I think one of the bigger changes was that the Java EE and CORBA modules have been removed. I hope CORBA isn't used that widely anymore in your projects, but the Java EE modules are often used. If you look at a complete example, for instance JAXB: as mentioned before, it used to be javax, and now it's jakarta, so basically you need to change the imports in your application, and you need to add the dependencies explicitly, as they are no longer part of the JDK. And that goes for all the ones that you see here: all the different modules are on the left, and you can see the replacement artifacts on the right. For JAXB and JAX-WS, you need two dependencies, one for the API and one for the implementation. As you make the switch to Jakarta, you need to make these changes. Also, some people who didn't switch to Jakarta simply added the old Java EE dependencies explicitly. That's also possible, but those will no longer receive any updates, so if you want to receive updates, you should move to the Jakarta versions. Java 15: who's using Nashorn here? Let's see a few hands. In the past, I only saw it at conferences.
I always found it a really cool tool, but never encountered it in projects. Then I worked at a company and upgraded a lot of projects from various teams, and suddenly I got a Nashorn exception. So there are some places where it's being used, and in those cases, you can simply add this dependency and then you can keep on using it. Java 16 has a very interesting one. They're strongly encapsulating the JDK internals, so what does that mean? In the past, there was some internal logic of Java, like reflection APIs and things like that, that was never meant to be used by end users. It was only meant to be used by the people who build Java itself, but they couldn't hide it away. A couple of years ago, we got modules in Java 9, and now they can hide away that logic. But if they had done that immediately, a lot of applications would have broken, because they used these internals. So gradually, they are making it a bit harder to use those internals. So what happens now if you use some of those internals? You get exceptions like this. For instance, Lombok uses lower-level Java logic, and it gives something like this: the module jdk.compiler does not export some stuff to the unnamed module. Basically, it means Lombok doesn't have access to that anymore. So what's the solution? Again, a lazy developer is a good developer. You wait a little while, then Lombok releases a new version, which you add as your dependency, and then you can simply continue. If that isn't the case, maybe because you use some obscure older framework or library which is no longer maintained and won't be updated, then there is still a workaround available, and that's basically passing some compiler arguments. So, for instance, via the Maven compiler plugin, you can pass these arguments, which of course is a little bit dirty: it's like someone puts a lock on their door, you remove the lock, and then you leave the home open so everyone can access it again.
So it's not really a nice solution. So please, just update your dependencies whenever possible. And there is an even dirtier hack. That is, you can start the Java process with --illegal-access, and everything is open anyway. So that's really like smashing the door with a hammer. How long is that option going to stay? How long is that option going to stay? That's a good one. Thank you, that's a hop to the next version, because now we have another round of strongly encapsulating JDK internals. So Java 17: it stayed for six months, basically. With Java 17, the launcher option --illegal-access no longer works. So you had six months to fix your issue. Yeah, so then you'll get an exception like this if you try it. By now I think you know what to do to resolve this. Make sure you upgrade your dependencies. Those lower-level JDK internals, those methods and things, have replacements that are higher level. So if you use those methods in your own code, make use of the higher-level methods instead. Unfortunately, there is still the last resort. You can still use --add-opens to open up stuff. As far as I know, there are still no plans to stop that. So yeah, that might stay for a bit longer. That's a workaround to make this work. Then when you look at the newer Java versions, Java 18, 19, 20 and 21, they didn't announce any major removals, although the full content of 21 is still under debate. I tried it with another repository that I have for another presentation, in which I explain a lot of libraries and tools for Java. I tried upgrading all of those to Java 21, and then I still got a few issues. So not everything worked there, but that's mainly because Java 21 isn't supported yet by those frameworks, as we will see in a second as well. But no major things are being removed so far, so it should be a relatively easy upgrade.
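To make the strong encapsulation described above concrete, here is a minimal sketch of what deep reflection into a JDK internal looks like from application code. It is not taken from the speaker's repository: the class name `EncapsulationCheck` and the choice of the private `value` field of `java.lang.String` are illustrative assumptions.

```java
import java.lang.reflect.Field;

public class EncapsulationCheck {

    /**
     * Tries to open a private field of java.lang.String via reflection.
     * Returns "accessible" if the JDK allowed it, or a short description
     * of the runtime exception that blocked it.
     */
    static String tryDeepReflection() {
        try {
            Field value = String.class.getDeclaredField("value");
            // On a strongly encapsulated JDK this call fails unless
            // java.base/java.lang has been opened to the caller.
            value.setAccessible(true);
            return "accessible";
        } catch (NoSuchFieldException e) {
            return "field not found";
        } catch (RuntimeException e) {
            // Caught as RuntimeException so the sketch also compiles on
            // JDKs that predate InaccessibleObjectException.
            return "blocked: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDeepReflection());
    }
}
```

Run on Java 17 without extra flags, the call should be blocked; started with `--add-opens java.base/java.lang=ALL-UNNAMED`, the same code should report the field as accessible. That is the last-resort workaround mentioned in the talk, and it only hides the problem rather than fixing the dependency.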
Then one thing that I encountered a couple of times in the beginning was that I often got the exception "Unsupported class file major version 61". Does someone know what that means? Sorry? So basically, class file major version 61 is used to describe Java 17. So it basically means your stuff doesn't run on Java 17, but it says so with the number 61. If you see the same with the number 65, that means it doesn't run on Java 21. So if you get these kinds of weird errors, again, upgrade your dependency and often it's fixed. So I tried running some examples on Java 21, and Spring, for instance, doesn't support class file format 65 yet, so you get this issue. Often that's resolved after a while by the people maintaining those dependencies. So in the past, upgrading was basically a matter of find, replace, fix the imports, compile it, and then see what breaks and fix that. Nowadays there are more and more tools available to help you with that. So there is the more generic OpenRewrite project, which I think you can use for changing almost anything. It's a really advanced find and replace engine, so to say. And there's a session about that topic a little bit later, by Tim. So if you want to know more about that, stay seated here. There is the Spring Boot Migrator project, which is more focused on upgrading Spring applications to newer versions. And quite recently they introduced the Catch Migration Toolkit from Java, which is again a more generic toolkit that also helps with upgrading Java versions. I have to be honest, I only tried some of them in simple projects, to play around with upgrades I did in the past. I mostly did it by hand, because mostly, like I explained, it was a matter of upgrading dependencies and then seeing what code changed. And maybe I was lucky, but in most cases that was relatively easy. So after you've done this work, you have upgraded your dependencies and fixed the code that's breaking in your own code base.
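The class file version numbers mentioned above (61 for Java 17, 65 for Java 21) follow a simple rule: the major version is the Java release plus 44. A minimal sketch of decoding it, with the class name `ClassFileVersion` and its method names chosen purely for illustration:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClassFileVersion {

    /** Maps a class file major version to the Java release, e.g. 61 -> 17. */
    static int toJavaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    /**
     * Reads the major version from the first 8 bytes of a class file:
     * 4 bytes magic (0xCAFEBABE), 2 bytes minor version, 2 bytes major version.
     */
    static int readMajorVersion(byte[] classFileBytes) {
        ByteBuffer header = ByteBuffer.wrap(classFileBytes); // big-endian by default
        if (classFileBytes.length < 8 || header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file");
        }
        header.getShort();                  // skip the minor version
        return header.getShort() & 0xFFFF;  // major version as unsigned value
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ClassFileVersion path/to/Some.class
        int major = readMajorVersion(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("major " + major + " = Java " + toJavaRelease(major));
    }
}
```

Pointing this at a class extracted from a dependency jar shows which Java release that dependency was compiled for, which can help when deciding whether the dependency or the runtime is the thing to upgrade.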
Then basically you're happy again. You can finally use those cool new features from the last 10 years that a lot of people aren't using yet. So you can make use of records, sealed classes, all those nice features. And I think that's really helpful. So to summarize, if you look at the amount of work, I think if you upgrade from 8 to 11 and you switch from Java EE to Jakarta, it's a relatively bigger task. It takes you a bit of time to upgrade everything, get the right packages that are compatible with each other, and then get that to work. Still, it's not a quarter with a lot of people. It's still a matter of, I would say, depending on your application, days or maybe weeks. If you go from 11 to 17, that's a lot easier. For me, it was mainly updating Lombok and some test dependencies, because test tools like mocking frameworks always use those internals of the JDK. So they tend to break when you upgrade to Java 17. If you upgrade those, then most of it already works. I did, I think, six or seven projects from 11 to 17. And for all of them, that was basically all I had to do. Only one used some, I think, reflection logic in one of the unit tests. So I had to rewrite it, but I could quite easily rewrite the same test case with higher-level logic and test the same cases. So all in all, I would say from 11 to 17 is easier than from 8 to 11. And from what I see now, from 17 to 21 is maybe even easier, unless they introduce any big changes in 21. My advice is always to take incremental steps. So don't try to do everything at once. First, try to make everything compile. And you can also tell your teammates or manager, OK, now it compiles. Then one step further, now I fix my tests, and then we can deploy it and run it. Instead of saying, I'm 80% done, because we all know what 80% done means. It means 80% of the work still needs to be done. So that really helps. Make those small steps.
So also for yourself, then you know what the progress is. And as I mentioned before, we can blame management for not giving us the time to upgrade Java. But in the end, I think it's most beneficial for us, because then we can finally make use of those cool new features. Nowadays, Spring also requires Java 17, for instance, if you want to upgrade to the latest version. So I think that's one more reason to make sure that you're up to date. So that was it. I think I have five minutes left. So if someone has a question... Coming back to this problem with the class version, is it because we are trying to run classes which were compiled against JDK 19, for instance, at least under JDK 17, or stuff like that? Is this because the bytecode is too new and contains newer features of Java? Yeah, I think that's the problem. It doesn't support the newer bytecode. So basically, you need to upgrade your JDK. No, you need to update the dependency so that it is compatible with the bytecode of that new version of Java. So this happens when you run an older dependency on a newer version of Java, basically. Would you recommend GitHub Dependabot? So the question is, would you recommend GitHub Dependabot? I think it's quite similar to Renovate, which I used. I quite like it, but it depends a bit on what kind of dependencies you use. Because some dependencies release really often, and then you get a lot of those updates. But I think it's a lot better, because traditionally, in a lot of projects this is done by hand. It's not a fun task to find out the latest versions by going to the Maven repository or something like that, so it's often neglected. And if you use tools like Dependabot or Renovate, you basically get the issue smashed into your face, like you have to merge it, and it's more or less automated, unless, of course, the build fails after you merge it, and then you have to fix it.
But at least I think it helps to stay more up to date with your dependencies, because otherwise, and I've seen it in so many projects, most developers don't care about it, and they only upgrade once a security issue is found or something like that. Then they upgrade it, and otherwise it stays the same old version. If you keep it up to date continuously, sometimes you have to do some minor fixes, but then you don't have to do those big fixes when you want to migrate to a newer version of Java. And there's also a session on this topic, the one after the next one, about dependencies specifically. Yeah. Regarding the add-opens, is there a way at Java 11 compile time to report issues that will be thrown at Java 17 runtime? It's unnamed class opens, you know? Because we are compiling with Java 11, but when we run on Java 17, then we see that we need the add-opens, but it's like two, three... There was a flag that could warn you, but I'm not sure... Oh, sorry, yes. So the question was, the gentleman compiled with Java 11 and then ran the code on Java 17, and then it breaks because some add-opens were missing. Basically, they're using some internals of Java, and he asked, is there some way to find out on Java 11 if I'm missing those things? Because then I can fix it there already, right? So I know there was some flag, something like give a warning or something, but I don't know in which version of Java that was introduced. Would the solution be to compile with Java 17? Yeah. Yeah. Yeah, then you can run it on 17. That's of course the easiest way. Yeah, sorry. Yeah, then you get the warning immediately. Or you can see whether that flag already exists in Java 11. I'm not sure. It was introduced sometime, but I don't know in which version. And thank you all. |
Best Practices For Real-Time Stream Processing (With Hazelcast Open Source Platform) |
The next session is a very important one around streaming and Java. Of course, streaming is an increasingly important and popular topic, or has been for years. And Thibautz is going to tell us more about it. Yes. Thank you. You have to talk loudly. Yes. Yeah. So, welcome everyone. This session is mainly about real-time stream processing. So, what I'm planning to do today, because it's Sunday and early morning, is to make it as easy as possible. And the fact is, I don't know your background, so I'm not sure how much you know about real-time stream processing. So, I will take it from scratch, basically, to get everyone up to speed. And I will also demo how you can use real-time stream processing in your work as well. So, before we start, does anyone recognize these guys on the screen here? On the left. The Beatles. Yes, that's correct. So, these are the Beatles. On the right side is the Liverpool Football Club. That's where I came from. So, I wanted to highlight these two images here, because I wanted to say, you know, real-time stream processing is not domain specific. It could be anywhere. So, it doesn't have to be, for example, in financial institutions, or machine learning, or IoT or IT. It could be, for example, in sports, or music, or any domain, basically. The fact is, you're using real-time stream processing every single day. So, just to give you an idea of what real-time means, and how you can actually approach it: can anyone guess how long it takes for an eye to blink? A fraction of a second. So, yeah, it's milliseconds, roughly one-third of a second. So, that's pretty fast. The same thing applies if you want to clap hands, or if you want to take a photo as well. So, we're not talking about minutes here. We're not talking about days or weeks, which is what batch systems are all about. We're talking about, like, sub-milliseconds, and how you can process it in real-time. So, as you can see, it's everywhere.
So, it's not domain specific. And some of you who already work with real-time know that, basically, you have some kind of events coming in at this moment, and you try to make sense out of them. So, looking at it from the user perspective, what you want is to make sure that you have some kind of secret sauce, or key element, when it comes to real-time stream processing. I've seen it so many times: people approach it from the wrong angle. They try, basically, to read the data in real-time, and they try to provide some kind of meaning for this data. So, I'll give you a demo today with logs, so this should be easy to follow. But the secret sauce here is to combine new data, real-time data, with historical data. What we mean by real-time data is that this data is coming in at this moment, and you read it at this moment. Obviously, you want to make sense out of it. You want to understand what's going on here. And the historical data is the normal data we know about, stored somewhere on a physical drive, for example, or in a database, whatever. So, you want to make sure, basically, that you have these two types of data at the same speed. So, now we're talking about two types of data, but how many, you know, is too many, basically? What size are we talking about here? For some it might be a few thousand, for others it might be millions, for others it might be billions. So, essentially, what we're trying to do here is not taking a small data set and trying to process it, because that's, you know, easy to do. We're talking about over a billion, or over ten billion transactions per second. So, the idea here is to take a huge amount of data and, you know, try to find some kind of trends and alerts from this data. So, I'll give you an example here, so you can start to work out what's going on. Imagine, basically, you write a Java program, and you obviously have some kind of logging mechanism in your application.
So, in order to understand what's going on with your log system, essentially, what you want is to have some kind of platform that allows you to actually, you know, analyze it. But at the same time, we're not talking about logs from yesterday, events that happened yesterday, for example. You want to make sure that you actually do alerts or trends in the same moment. So, same thing for trends as well. If you want to know if your application is going to crash or not, what you want is to have a platform or solution where it is easy for you to actually look at the data in this moment and say, hey, something is going wrong here, I need to define trends out of it and do some alerts. Now, as manual work, this is kind of painful, because you need to go through logs, for example, and you want to make sure that you know how to scale it, and also know exactly where your data is stored, because your enemy, when it comes to real-time stream processing, is latency. So, you want to make sure your application has as low a latency as possible when it comes to delay, basically. Obviously, scaling is a bottleneck. And now, if you look at platforms, you might have heard of some of these. So, the easiest way is to split these platforms into various categories. On the vertical here, you can see, you can have open source solutions, or you can have hybrid, which is a mix between open source and managed service. And on the horizontal, as you can see, you need to capture your data, obviously in real-time, and you need to do some kind of transport as well as some kind of transformation and processing. So, you can split it into 12 squares, and it becomes, you know, easier to understand which tool you need to use. But obviously, this is still a bit complex, because the area of real-time stream processing is mainly about two different subjects.
So, it's not only, you know, capture or transport, because you have to do all of these at the same time. You basically need to decide if you're going to use stream processing engines on one side, or you want to have some kind of fast data storage on the other side. Stream processing engines are pretty good at handling data coming in real-time, which is, for example, Kafka or KSQL and so on. And on the far right side, you can see fast data storage, which is essentially caching solutions for your application. So, for example, MongoDB, Redis and so on. So, if you want to apply this solution, essentially you need one tool from the left and one tool from the right, which means that it adds more work on your side. So, what you want is to look at it in this way and say, hey, I want one solution for both, and that's where Hazelcast comes into place. Obviously, I work for Hazelcast, and Hazelcast itself is built on top of the Java virtual machine. So, it's Java based and it's open source. So, this is the platform here. It's kind of an A to Z solution. So, what you want is to capture your data. It could be coming from, for example, Apache Kafka or from IoT devices. It could be coming from some kind of custom connectors, because it's open source, which means feel free to contribute to this project. Or it could come from a file watcher, for example, or from WebSockets. So, once you have this real-time data ingested into the platform, the platform itself has two main components. The first one is the Jet engine. This is the engine for stream processing, and in the demo I will show you how to use it. The other is the fast data storage, or fast data management. This is essentially a component which allows you to load your data from external sources, and it's optional, obviously.
So, the data is, I don't know, in some kind of file system or database, or stored in the cloud, and you load it into memory. Why do you need to load it into memory? Simply because you want to make sure your application is as fast as possible. So, we're talking here, for example, about speeds of sub-milliseconds or fractions of seconds. So, this is very important. For example, in a fraud detection scenario, if you're using your card and someone else is using your card somewhere else, you want to get an alert at this specific moment. It doesn't make sense to get the alert, I don't know, in the afternoon, for example, or the next day. So, for this type of mission-critical solution, what you want is to make sure that your data is stored in memory, which allows you to access it really quickly. So, once you have this data, you can do some kind of transformation on your data, because remember, we're not talking about one single data source. For stream processing, we're talking about multiple sources. So, it could be, for example, I don't know, some transactions coming in on a Kafka topic, and some IoT device, for a weather forecast for example, coming from another topic, sorry, from another source. So, you want to make sure that you can also combine them. Obviously, because we're in the Java room here, you can use the Java client for it. So, essentially what you need is a Java jar. You need to download it and plug it into your pom file. But, for example, if you're a data scientist and you're not sure, you know, about programming languages, maybe you use a little bit of Python, but programming languages are not something you want to invest in, what you can do is do everything I mentioned today in SQL. So, which means you do everything for real-time stream processing, in terms of alerts, for example, or defining trends, or even querying your data, using SQL.
So, once you do this processing, you can actually output it to, I don't know, the same kind of thing as for your input. So, a Kafka topic, for example, or you can use a WebSocket, or you can create your Java application to do some kind of visualization for predictions, for example. Now, the cool thing about Hazelcast is not only the platform and the ease of use, but also how it scales. Remember what we're talking about here? We're not talking about a few thousand. We're talking about maybe a few million, or even billions of transactions. So, when it comes to scaling, you want to distinguish between two different topics. For some, scaling might refer to, I don't know, your data. So, you want to scale this data, for example. And others, who work, for example, in programming or development, focus mainly on the compute. So, you want to make sure, basically, to combine data and compute when you want to scale. And the cool thing about it is that it's partition aware, which means that if you have your data stored in multiple places around the world, for example, in different data centers, what you want is for your compute, or your process, or your application to be located as close as possible to your data. So, this will give you some kind of speed and lower latency when it comes to transactions. Now, we talked about transactions, but how many are we talking about today? So, we've done this kind of benchmark, obviously. It's a bit outdated now. It's kind of worth trying to add one more zero to it. So, it's one billion transactions per second on 45 nodes. And the cool thing about it is not only the latency, which is like 30 milliseconds, but also the linear scale, which means you just need to add more nodes to your application. So, with that being said, let's just move directly to the demo, and I'll show you how you can use Hazelcast within your application.
So, for this demo, the scenario I wanted is that, you know, you're writing your application, a Java application, obviously, and you have some kind of logging mechanism within your application. And your boss comes the next day and says, hey, your log messages, or your solution, we need to upgrade it in a way where it will provide some kind of alerts or predict what's going to happen next. So, your task, essentially, is to take exactly that application and make sure you have some kind of scaling mechanism and real-time stream processing in it. So, obviously, you have two options here. The first option is to download this jar, plug it into your application and run it. So, that's good. But the problem with this is that logs are usually stored in different places. So, what you want is for every machine to send its logs to some, you know, central place. Usually, it's a cloud. So, the idea of sending logs to the cloud is to provide one place for every single machine which sends logs. Hazelcast runs Hazelcast Viridian, which is exactly what we're talking about, in the cloud. And this is what you need to do. So, you write your log message in some way. Obviously, you need to have context for your message. And the idea is to store all logs in memory. We're not going to store them in a database or file system, because that means you're adding input-output latency to your application. So, we need to minimize this. So, the idea is to use some kind of map structure. And this map has a key, which is where this data is coming from, the IP address and port number, and the message, which is the value. So, it's your choice now whether you use VARCHAR, for example, a string, which is obviously faster, or if you want to use some kind of JSON format, for example, if you're trying to do some kind of machine learning and do, I don't know, maybe some classification on your logs.
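The key and value layout described above can be sketched in plain Java. This is only an illustration with a `ConcurrentHashMap` standing in for the distributed in-memory map; it does not use the Hazelcast API, and the class name `LogStore`, its methods, and the `ip:port` key format are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LogStore {

    // Key: where the log came from ("ip:port"); value: the log message as a string.
    private final Map<String, String> logs = new ConcurrentHashMap<>();

    /** Builds the map key from the sender's IP address and port number. */
    static String key(String ipAddress, int port) {
        return ipAddress + ":" + port;
    }

    /** Stores the most recent message for a sender, replacing any earlier one. */
    void put(String ipAddress, int port, String message) {
        logs.put(key(ipAddress, port), message);
    }

    /** Returns the most recent message for a sender, or null if there is none. */
    String latest(String ipAddress, int port) {
        return logs.get(key(ipAddress, port));
    }

    public static void main(String[] args) {
        LogStore store = new LogStore();
        store.put("10.0.0.5", 5701, "WARN connection pool almost exhausted");
        System.out.println(store.latest("10.0.0.5", 5701));
    }
}
```

Because a map holds one value per key, each sender keeps only its latest message in this sketch; keeping a history per sender would need a different value type or an event log alongside the map.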
So, once you have your key and value, you can proceed to save them into Hazelcast. Remember what we're talking about here: you want to get this map into Hazelcast. The first step is to send your logs to the cloud, obviously, because different logs come from different machines. The second step is to store them in memory, which allows you to access your logs much faster. And obviously, you need to define this mapping. What you see here is SQL: you can write the mapping in SQL. Once you have that, you can proceed to create the instance. In order to run Hazelcast, you need to create a Hazelcast instance and plug in your pipeline, so you pass the pipeline to Jet, for example, or to your processor. And here, what you're saying is basically: this is the IP address I want to run on, this is my port, and this is my data. Now, I mentioned SQL. The platform itself has Management Center, which is really cool, and it allows you to query your data. What you see here is me querying my log data and trying to understand what's going on. On the left, you see the key, which is the IP address and port, and on the far right, or bottom right, you see the values. Essentially, what we're doing is giving a score to each log message. You may already have a specific category for each log, whether it's information, warning, or error. That's good, but if you're trying to predict what's going to happen next, you probably need some kind of linear scale. So, for example, you use a scale up to 100 and give each log message a value on it, and that is the value you see there. So we take this data and ingest it into memory: we have the IP address and port number, and we have a score for each log message. And once we've imported it, we define a trend.
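The keyed map and the per-message score described above can be sketched in plain Java. This is a minimal illustration that uses only the standard library, not Hazelcast's API. The class name ScoredLogStore, the level-to-score table, and the sample address are assumptions made for the example; the talk only says that each message gets a score on a scale up to 100.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a distributed in-memory map; this is not Hazelcast's API.
final class ScoredLogStore {
    // Key: "ip:port" of the machine that sent the log; value: score of its latest message.
    private final Map<String, Integer> scores = new ConcurrentHashMap<>();

    // Assumed mapping from log level to a score between 0 and 100.
    static int scoreOf(String level) {
        switch (level) {
            case "INFO":  return 10;
            case "WARN":  return 50;
            case "ERROR": return 90;
            default:      return 0;
        }
    }

    static String keyOf(String ip, int port) {
        return ip + ":" + port;
    }

    // Stores the score of the newest message, replacing any previous one for the same source.
    void record(String ip, int port, String level) {
        scores.put(keyOf(ip, port), scoreOf(level));
    }

    // Returns null when nothing has been recorded for this source.
    Integer latestScore(String ip, int port) {
        return scores.get(keyOf(ip, port));
    }

    public static void main(String[] args) {
        ScoredLogStore store = new ScoredLogStore();
        store.record("10.0.0.1", 5701, "WARN");
        System.out.println(store.latestScore("10.0.0.1", 5701));
    }
}
```

A numeric score, rather than a level name, is what makes the later trend calculation possible, because a trend needs values that can be compared and fitted.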
So, in my case, I'm taking a window over this real-time stream, and in my case it's two minutes, but it could be anything. I'm defining the trend based on the score of each log message. Based on this, I create a prediction map, another map, which says whether to send an alert: a value of one means send an alert, zero means don't. Obviously, you need to do some programming for this. The idea is to group messages by score to define the trend, and from there you can use some machine learning, linear regression or classification, for example, to make a prediction. You also need to output it, which means sending it back to the user. So this is how it works, and this is the actual code for the prediction map. We take the score for each log message, we do the linear regression over the window I defined, and I simply check whether the value is greater than 100: if so, send an alert, otherwise don't. What you need to focus on here is that there are three ways to do stream processing. The first is to use SQL: you select from the logs map and store the result in another map, with some filtering. The second option is to use a processor or pipeline: you read from the map and, again, you produce alerts, for example, or a trend. These first two ways are batch processing, so they're not real time. The third option, and this is the main thing you want if you need real-time messaging, is to use the map journal, which is similar to a map but backed by a ring buffer structure, and which allows you to start processing your logs either from the start or from the end, depending on what you want. The pipeline itself takes this in-memory logs map and does some filtering.
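The window, regression, and threshold steps above can be sketched as follows. This is not the code from the slides: it is a plain-Java approximation that fits a least-squares line through the scores in one window and predicts the next value. The class name TrendAlert, the evenly spaced time steps, and the choice to compare the predicted next value against the threshold are all assumptions.

```java
// A rough sketch of "regress over the window, alert above a threshold"; names are hypothetical.
final class TrendAlert {

    // Fits y = intercept + slope * x through the points (0, s0), (1, s1), ...
    // and returns the fitted value at x = n, one step past the window.
    static double predictNext(double[] scores) {
        int n = scores.length;
        if (n == 0) {
            throw new IllegalArgumentException("window is empty");
        }
        if (n == 1) {
            return scores[0]; // a single point carries no trend
        }
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += i;
            sumY += scores[i];
            sumXY += i * scores[i];
            sumXX += (double) i * i;
        }
        double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double intercept = (sumY - slope * sumX) / n;
        return intercept + slope * n;
    }

    // Returns 1 to send an alert, 0 otherwise, mirroring the one/zero prediction map.
    static int alert(double[] windowScores, double threshold) {
        return predictNext(windowScores) > threshold ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] rising = {20, 40, 60, 80, 100};
        System.out.println(alert(rising, 100));
    }
}
```

In a real pipeline the window would be filled by the stream processor, for example every two minutes, and the result written to the prediction map keyed by the same IP address and port.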
So, if you see some value at a specific moment, you can raise an alert, for example, or send it on as a message. I wanted to cover everything, but we only have 20 minutes here and that's not enough time, so let me just give you a summary of what you need to do and then open the stage for questions. First of all, you need to send your logs to the cloud. In my case, I'm using Hazelcast, but obviously you can use any other cloud provider; you just need to deploy Hazelcast on that cloud provider. Once you've uploaded all your logs, you need to choose a format for them, for example VARCHAR if you want speed, or JSON if you want to apply some machine learning, and you need to use a map structure. The idea is to store logs in memory in order to get random access and rebalancing. Obviously, you need to configure this map, because there are size limits on it, so you need an eviction policy for your data. And obviously you need to consider security: whatever you send to the cloud, you need some security mechanism to make sure you don't send sensitive data. If you're interested in stream processing, we're running an unconference in March, next month. It's free to join, and there's a training workshop as well. All you need to do is scan this code and register for the Real-Time Stream Processing Unconference. It's community-based, so it's open source. You get the training and you get a badge on top of it, and we also have a round table where you can hear from industry experts as well as community users about how to contribute to open source projects. And you can ask questions if you have any.
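To make the eviction point concrete, here is a small sketch of a size-limited map in plain Java. It is only an analogy for what an eviction policy does: it uses LinkedHashMap from the standard library with least-recently-used eviction, and the class name BoundedLogMap and the limit of two entries are assumptions for the example. A distributed map would be configured through the platform's own settings rather than written like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical single-JVM illustration of a size limit with least-recently-used eviction.
final class BoundedLogMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLogMap(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, so the eldest entry is the LRU one
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after every insert; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLogMap<String, String> logs = new BoundedLogMap<>(2);
        logs.put("10.0.0.1:5701", "started");
        logs.put("10.0.0.2:5701", "started");
        logs.get("10.0.0.1:5701");            // touch the first entry so it is no longer the eldest
        logs.put("10.0.0.3:5701", "started"); // exceeds the limit, evicts 10.0.0.2:5701
        System.out.println(logs.keySet());
    }
}
```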
So, with that being said, thanks very much for listening, and I'll open for questions. Thank you. Thank you. Yeah, so it is possible to work with multiple streams, joining multiple sources. You just need to define the configuration for that specific case. I mentioned sources here: it doesn't have to be a single source, it can be multiple sources, and that's where you join them. So yes, it is possible. Any other questions? Yeah? Yeah. Please take a seat. So, the question is: what is the difference between Hazelcast and Apache Flink? There was a slide, but I decided to remove it. Essentially, for real time, what you want is to minimize latency within your application. We've done a benchmark between Hazelcast and Flink, and this is where things can make a difference. If you're writing an application for real-time stream processing, you want to minimize latency, so lower is better, and this is where Hazelcast has outperformed Flink or Apache Spark. So latency is the key difference between the two. The results are online, but I decided not to include them here. So, basically, compare the latency of the different platforms. That's what you want to focus on, whether it's Hazelcast, Flink, or any other platform that offers real-time stream processing. Thank you. Thank you very much, Thomas. Thank you. |
Keep Your Dependencies In Check |
Yeah. Good morning. So since we're all here, I probably want to talk quite loud. Sorry. Since we're all here, you probably, thanks, all use open source, which is great because it offers us functionality without us having to write it ourselves. But the downside is that for the dependencies we declare, we tend to pull in a bunch of transitive dependencies, and any of those can contain vulnerabilities. You might remember December of '21, because I do. And from your laughter, you probably spent those days in much the same way I did. I was working at a Dutch retail platform that uses microservices, and because of Log4Shell we got to update everything, urgently, twice, because after the first CVE was fixed, there were multiple other CVEs. So, you know, fun times. And then in March, we got to do it again, because of Spring4Shell. But at least we got the practice in the first time, so it was faster this time, right? As we know, using external dependencies has pros and cons. I have more on that, but not in these 20 minutes. I'll share a link to my website that has all of this information at the end. So we have to maintain our dependencies and make sure we keep them up to date. I'm going to give you an overview of different tools that you can use. Normally I would end with OpenRewrite, but Tim is here, so he's going to do the honors for that one. You probably use Maven or Gradle to manage your dependencies, so you probably also know that you can use mvn dependency:tree to get the tree of your declared dependencies and their transitive dependencies. You can ask Maven which ones have updates available, so that you know what you could be updating. And you can use a command to analyze your dependencies, to see which transitive dependencies you're using but haven't declared, and also which ones you have declared and aren't using, which you might want to remove or, in the case of JUnit, add some tests for.
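To illustrate how a few declared dependencies turn into a larger set of transitive ones, here is a toy sketch in plain Java. It is not how Maven resolves dependencies (there is no version mediation, scopes, or exclusions), and the artifact names and the class name DependencyGraph are made up for the example. It only walks a hand-written graph and collects everything reachable from the project.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: each key maps to the dependencies it declares directly.
final class DependencyGraph {

    // Returns every dependency reachable from root, direct and transitive, each listed once.
    static Set<String> reachable(Map<String, List<String>> declared, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>(declared.getOrDefault(root, List.of()));
        while (!toVisit.isEmpty()) {
            String dependency = toVisit.pop();
            if (seen.add(dependency)) { // only expand a dependency the first time we see it
                for (String child : declared.getOrDefault(dependency, List.of())) {
                    toVisit.push(child);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> declared = Map.of(
                "my-app", List.of("web-framework", "logging-api"),
                "web-framework", List.of("json-parser", "logging-api"),
                "logging-api", List.of("logging-core"));
        // my-app declares two dependencies but ends up with four on its classpath.
        System.out.println(reachable(declared, "my-app"));
    }
}
```

Any one of the reachable entries can carry a vulnerability, which is why the tools in this talk look at the whole tree and not only at what the build file declares.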
Gradle has a command as well to get your dependency tree. As far as I know, it doesn't have a command to list updated versions, but you can use a plugin, like the Ben Manes versions plugin. I'm also not familiar with an analyze command. If you're a Gradle user and you do know how to do that, please tell me. And as I currently work for JetBrains, I'm also going to tell you what IntelliJ IDEA can do to help you manage your dependencies. You can view your dependency hierarchy in the build tool window, for either Maven or Gradle. Here you can see, again, the hierarchy of the direct dependencies and their transitive dependencies, and you can expand and collapse as you like, which is easier than reading it from the terminal. You can analyze dependencies as well, using the Dependency Analyzer in IntelliJ IDEA. This past year, we added a feature called Package Search, which allows you to manage your dependencies right inside your IDE. You can use code completion, for example, to add dependencies right inside your build file without having to leave your IDE. It will also tell you, with a little squiggly line (I don't know what the official name is, so we're going with squiggly line), that there's a new version available for a dependency, and then either a hover or the familiar Alt+Enter, or Option+Enter, will suggest fixes, for example to update this version. I'm using a project that's really outdated, from my GitHub graveyard of projects. The project itself is pretty useless, but it's a perfect example of outdated dependencies. So, don't worry, this is not being used in production. So, yeah, you can use intention actions, like I said, to update the dependencies, and this works in build.gradle files too. And there's the Dependencies tool window that you can open to manage your dependencies.
As you can see in this example, you can upgrade all of them at once, upgrade individual ones, or select the version to use from the list of versions. You see information about the dependency right in this window. You can search for dependencies and find information about them right inside your IDE, so you don't have to go and search Maven Central or somewhere else. And in IntelliJ IDEA Ultimate, we have the Package Checker functionality, which also provides vulnerability information. This is what you see if you hover over a vulnerable dependency, which is highlighted in a yellowish color. It will tell you which vulnerabilities were found in this dependency, and you can click the links to go to the Checkmarx advisory for more information. You can also see that information in the Vulnerable Dependencies tool window, so you can see which of your dependencies have vulnerabilities and what the severity is, find more information, and fix it right inside your IDE. All of these tools are great because you can use them as you're working on your code, but the downside is that you need to be actually working on that project. If, like I did at my last job, you have a bunch of microservices, that adds up to a bunch of repos. You'd have to check out each individual repo and check for updates. And then, of course, you still have to apply those and verify that everything still works, et cetera. So hopefully your company will have some kind of software composition analysis tool that can scan your repositories, and sometimes also your Docker containers, and provide you with an overview of your repos and which ones have which vulnerable dependencies. The upside is that, as a developer, you don't have to individually check all of your repositories. The downside is that you still have to check the dashboard to see what's outdated and then, again, still apply and verify all of those updates.
The next generation of useful tools are bots that can create PRs for you. Since we're in the Java room, I'm assuming we use Java, and these are the options that we have: Dependabot, Renovate, and Snyk Open Source. Dependabot is now GitHub native, and it offers three features. It can alert on your repositories. It can create security updates, so PRs that update dependencies with known vulnerabilities. And it can do version updates, so PRs for when there's a new version for other reasons. Since it's GitHub native, you can configure it in your settings on GitHub. It's also available on other platforms, but I used GitHub to compare the three bots. If you have alerts, you'll see a yellow box with a button to press for more information. If it generates PRs for security updates, this is what that looks like. And if you want to use version updates as well, you'll need to add a dependabot.yml (insert obligatory YAML joke: YAML sounds like the Dutch word jammer, which means too bad or unfortunate), and you have to provide a little bit of configuration. You can set the schedule frequency, the maximum number of PRs, and some minor details on how to manage these PRs. The next option is Renovate, which is an open source project, but with a vendor behind it. It offers security updates and version updates, like Dependabot, but also a project dashboard and a jobs dashboard for more information. On GitHub, it's also fairly easy to apply to your projects. You can use the app, and you can choose to apply it to either all of your projects or only certain ones. So if you've never used one of these bots and you want to try one out, this is one that you can try on just one repository, and as far as I know it's the only one of these three where you can. It will then generate a configuration for you. Once you merge that basic configuration, it will start doing its thing and generating PRs.
You can also specify the maximum number of PRs and the maximum number of branches that you want open at any one time. It has more options than Dependabot, and those options are more fine-grained. The PRs provide more information as well: why it is trying to update these versions, how old the package is (this is an old screenshot, sorry), the adoption rate among Renovate users, the percentage of builds that pass with this update, and how confident they feel about the update, where neutral means either they can't tell based on the information they have or they don't have enough information yet. It will also add a dashboard to your project with a list of all of the things that you need to update. And there is a jobs dashboard where you can check the details of all of the jobs that have run. The last option is Snyk Open Source, which also offers security updates, version updates, and some dashboards, as well as the option to check for vulnerabilities in new PRs, making sure that you're not adding vulnerable dependencies, and it can check your source code. There are slightly more steps to enable it. You go to their website, authorize your GitHub account, select which repositories it can access, either public only or public and private, and add a token. It will then generate PRs for you, also providing some information on why they are giving you this update, with more information about the vulnerability if it's for vulnerability reasons. What Snyk does by default is bundle PRs that are related, so it generates less noise, fewer individual PRs. You can configure Renovate to do that as well, but you have to set that up yourself. Snyk also checks incoming PRs and provides a dashboard, again, with outdated projects. I hope your dashboard doesn't look like this. And it has some configuration options for frequency and to enable or disable it for either new or known vulnerabilities.
So if you want to start with only making sure you're not getting new ones in, and separately tackle your backlog, or if you want to apply it only to some projects, you can configure that. So, the pros and cons of these bots. They're fairly straightforward to add to your repositories. It's a lot less work to do this once than to manually check every time you're working on a repository. They create PRs automatically, so it no longer depends on you checking for updates. The downside is that they can create a lot of noise, depending on how outdated your projects are and the maximum number of PRs that you've set. And you still need to manage those PRs. If the build fails, you know that you have more work to do, because either some code doesn't compile anymore or your tests are failing and you need to look into it. That's at least good to know. But even if it's green, you still need to find the time to deploy it and make sure that everything still works, depending on how confident you are in your test suite. We had a Failsafe update that managed to stop running the integration tests. So, you know, it looked green, but really wasn't, and we had to revert that. Also, these bots only update versions; they don't make any changes to your code. That's where migration tools come in. You might not be aware that IntelliJ IDEA has a Migrate refactoring, which offers several standard, well-known migrations, such as Java EE to Jakarta EE and JUnit 4 to 5, as well as the option to create your own. If you're interested in the JUnit or Java EE migrations, we have videos on our IntelliJ IDEA YouTube channel that detail all of that. Basically, it updates the imports, but there are some manual steps that you still have to do. So it can help you a little bit. Then there are other tools. Error Prone is one of them. It's not intended to be a migration tool.
It's a static analysis tool that checks for known bug patterns in your code. It offers a number of standard bug patterns that have been identified, and it can either report on them or fix them. Included with Error Prone is Refaster, which does refactoring based on before-and-after templates. You can use that to help you migrate from one pattern to the next. I know, for example, that Sander Mak at Picnic has said that they used it to upgrade to newer Java versions. He's done talks on that at, for example, the NLJUG conferences J-Spring and J-Fall, if you want to go find them on YouTube. And then another migration tool is OpenRewrite, which Tim is going to tell you all about. Thank you. |
Major Migrations Made Easy With OpenRewrite |
Okay, continuing with the migration topic. There is a relatively new tool called OpenRewrite, and you're going to hear all about it from Tim. Hi. It's been mentioned a few times before, so I hope I can live up to the hype. My name's Tim te Beek, and I'm a staff software engineer at Moderne. I started just at the beginning of this month, and before that I was a migration engineer as a consultant for five years. What that means is I would walk into organizations, familiarize myself with all the old technologies that they were still using, and then hack away at lifting all those services up to the latest versions of Java and Spring. I would frequently find five- to ten-year-old versions of Java, JUnit, or Spring, which is not ideal from either a security perspective or a developer experience point of view. Initially I would migrate these services by hand, and gradually introduce more and more coarse automation. But then early last year I discovered OpenRewrite, a tool that promises to make light work of all such migrations. I got so excited by this technology that I started to contribute and even present about it at conferences, and then eventually quit my job to work on OpenRewrite full-time. So after a nice sabbatical, that brings us here today. Perhaps you've faced some of the same challenges that I did. At a conference like this, you'll hear all about new framework and language features. Yet back at work, you're stuck using Java 8 and JUnit 4. And migrating all of that by hand can seem daunting, if it ever gets priority. I want to show you how easy it can be to perform major migrations. That way you too can adopt all the latest language and framework features. It can be fun to adopt new language features such as records and text blocks. But you don't want to adopt these features manually, on only a single project.
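For readers who are still on Java 8, this is roughly what the two language features mentioned above look like. It is a small standalone sketch, not taken from the talk's slides; the class name, the record, and the dependency coordinates are made up. Records need Java 16 or later and text blocks need Java 15 or later.

```java
// Hypothetical example of two newer Java language features: records and text blocks.
final class ModernJavaSketch {

    // A record replaces a hand-written value class: fields, accessors, equals, hashCode, toString.
    record Coordinates(String groupId, String artifactId, String version) {}

    // A text block keeps a multi-line string readable without concatenation or escaped newlines.
    static String toPomSnippet(Coordinates c) {
        return """
            <dependency>
              <groupId>%s</groupId>
              <artifactId>%s</artifactId>
              <version>%s</version>
            </dependency>
            """.formatted(c.groupId(), c.artifactId(), c.version());
    }

    public static void main(String[] args) {
        System.out.print(toPomSnippet(new Coordinates("org.example", "demo-lib", "1.0.0")));
    }
}
```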
Instead we will look into automation to make all projects feel like new again, so you can benefit from JVM, language, and framework improvements. Here's a very brief overview of the types of migrations I'll be talking about. You've likely already performed some of these migrations in the past, and other migrations are always just around the corner. If you look back over time, there's a near constant stream of worthwhile improvements to pick up. And I like the challenge; I still get excited whenever a new version comes out. I just don't like the repetitive elements that come with upgrading. If you try to keep up by hand, you will hardly get anything else done, especially as microservices these days mean you're not just upgrading once, but dozens of times. Automation may then be the only option, especially for companies running thousands of services. Through OpenRewrite you can now migrate between versions of Java and Spring with a single command. You can even migrate between frameworks, such as from JUnit to AssertJ, and from Java EE to Spring. In this talk, I'll tell you all about OpenRewrite: how it came about, how it works, and what you can do with it. And finally, we'll briefly look at who is developing these recipes and how to apply them to open source projects. OpenRewrite was developed at Netflix, initially to aid in the migration from an internal logging framework to SLF4J. You can probably imagine that any logging framework is going to be pervasive throughout an organization, so you would only consider migrating with a perfectly accurate automation, especially when usage is spread across hundreds of services. So they developed a parser to accurately read Java and turn the source code into a lossless semantic tree. This model can then be modified to replace the old logging statements with calls to SLF4J. Next, the migrated model is written out as close as possible to the original source code.
That way the applied changes are minimal, leaving the surrounding code untouched. Later, the same developers moved on to work on Spinnaker, and while trying to onboard teams and organizations there, they found that teams often struggled with the same outdated languages and frameworks. To help teams adopt the latest versions, they applied a different set of migration recipes through the same lossless semantic tree parser. Let me just get this one on. This allowed them to quickly reduce technical debt and bring teams from Spring Boot 1 to Spring Boot 2 and from JUnit 4 to JUnit 5. The project has since been open sourced, with the company behind it committed to making all recipes available under the Apache License for open source software. The initial focus for OpenRewrite is on the Java virtual machine languages and surrounding technologies. There are parsers, for instance, for Java, Groovy, YAML, and XML, and these in turn unlock support for build tools such as Maven and Gradle, and libraries such as JUnit, AssertJ, and Guava. Ultimately, refactoring entire frameworks and platforms is supported, with recipes available for application frameworks such as Micronaut, Quarkus, and Spring. OpenRewrite is not the only parser capable of understanding and manipulating Java. However, three features set OpenRewrite apart from the competition. The first is the focus on exact type attribution. By having the exact type available on any tree element, we can be sure to only manipulate exact matches. The second characteristic that sets OpenRewrite apart is format preservation. The parser takes into account not only the functional code, but also the surrounding code style and indentation. This allows us to accurately reproduce your source code regardless of other changes. Changes made through OpenRewrite look just as if a colleague had worked on your code.
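The first two features, exact type attribution and format preservation, can be illustrated with a deliberately tiny model. This is not OpenRewrite's tree or API; the class ToyLosslessTree, its Call node, and the com.example type names are all invented for the sketch. Each node keeps the whitespace that preceded it and the resolved type of its receiver, so a rename can target one type only and the printed result keeps the original formatting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: a real lossless semantic tree models far more than method calls.
final class ToyLosslessTree {

    // One method call statement, with its original leading whitespace and the receiver's resolved type.
    static final class Call {
        final String prefix;       // whitespace before the call, kept verbatim
        final String receiverType; // fully qualified type resolved for the receiver
        final String receiver;
        final String method;
        final String args;

        Call(String prefix, String receiverType, String receiver, String method, String args) {
            this.prefix = prefix;
            this.receiverType = receiverType;
            this.receiver = receiver;
            this.method = method;
            this.args = args;
        }

        String print() {
            return prefix + receiver + "." + method + "(" + args + ");";
        }
    }

    // Renames a method only where the receiver has exactly the given type; other nodes are reused as-is.
    static List<Call> renameMethod(List<Call> tree, String type, String from, String to) {
        List<Call> result = new ArrayList<>();
        for (Call c : tree) {
            boolean match = c.receiverType.equals(type) && c.method.equals(from);
            result.add(match ? new Call(c.prefix, c.receiverType, c.receiver, to, c.args) : c);
        }
        return result;
    }

    // Prints the tree back to source text, prefix included, so untouched code comes out unchanged.
    static String print(List<Call> tree) {
        StringBuilder source = new StringBuilder();
        for (Call c : tree) {
            source.append(c.print());
        }
        return source.toString();
    }

    public static void main(String[] args) {
        List<Call> tree = List.of(
                new Call("    ", "com.example.OldLogger", "log", "warning", "\"disk full\""),
                new Call("\n\t", "com.example.AuditTrail", "log", "warning", "\"login\""));
        System.out.println(print(renameMethod(tree, "com.example.OldLogger", "warning", "warn")));
    }
}
```

Both calls read log.warning(...) in the source text, so a textual search and replace would change both. Matching on the resolved type changes only the first, and the tab-indented second line is printed exactly as it was.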
And finally, the serialization format ensures you're able to query and refactor your code faster and at scale. Together, these features make OpenRewrite exceptionally good at safe code transformations, especially as the changes are minimally invasive and guaranteed to work, in part due to the do-no-harm mentality. By manipulating the full lossless semantic tree, OpenRewrite can far exceed simple search and replace operations. With the full lossless semantic tree built, we need to instruct OpenRewrite what operations to apply and where in your code. Recipes are loosely defined as a group of search and refactoring operations. Together they accomplish a higher-level task, such as a framework migration. Recipes can consist of a single standalone operation, or be linked together with other recipes. OpenRewrite comes with a large collection of fine-grained recipes out of the box that can be combined for common migration steps. You can think of these as LEGO building blocks, ready to be applied with the proper parameters. There are hundreds of these building blocks to, for instance, change types, change methods, change arguments, manipulate properties, and alter dependencies or plugins. Full recipes are implemented as Java visitors that first match and then modify elements of the lossless semantic tree. There are plenty of examples available, but note that you only need a dedicated Java visitor when none of the existing recipes can achieve your goal. Typically, you can get very far by only configuring, combining, and applying existing recipes through a YAML description file. Modules then group these fine-grained recipes together into more coarse-grained, application-specific recipes. There are modules, for example, for logging frameworks, testing frameworks, and application frameworks such as Spring. Think of these as LEGO sets, with build plans for common migrations ready to be applied.
In my opinion, the lossless semantic tree, combined with a large collection of fine-grained recipes, is what sets OpenRewrite apart from other similar tools, such as Error Prone and Refaster. Now, I want to show you how migration recipes are configured in OpenRewrite. Let's briefly look at a migration from JUnit 4 to JUnit 5. I want you to imagine the steps needed for such a migration. You've likely applied some of those steps by hand in the past. For starters, you would have to update the test annotations. But you would also have to update the assertions, and sometimes the argument order. You would have to update all imports, you would have to update any test rules, and that's just getting started. Notice how each of these steps is reflected as a separate recipe in this YAML configuration file. Some refer to and configure generic steps, such as the ChangeType recipe. Others are implemented as an imperative step, a dedicated Java visitor that changes the lossless semantic tree. All these steps combine to achieve a complete JUnit 5 migration. This is a common pattern with OpenRewrite: large migrations are broken up into small, reusable steps. When we run this recipe, we get predictable results. Our imports are replaced, as we would expect, and our Mockito runner is replaced with the extension. Lifecycle annotations, such as @Before, are correctly replaced. But interestingly, we can see how OpenRewrite shines when it comes to expected exceptions. Having the full power of a lossless semantic tree, combined with a Java visitor, allows us to adopt assertThrows. These are the types of changes that would not be possible with a regular-expression approach. Running migration recipes is fairly straightforward. First, you apply a build plugin for OpenRewrite. I've used Maven in my example, but Gradle works just as well. Then, depending on the changes you want to make, you add a dependency on the respective OpenRewrite module.
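The expected-exception change mentioned above has roughly the following shape. The sketch is written so that it runs without JUnit on the classpath: the assertThrows helper and the Executable interface below are hand-written stand-ins that imitate JUnit 5's Assertions.assertThrows, and the test method and its input are made up. The JUnit 4 form is shown only as a comment.

```java
// Sketch of the before/after shape of an expected-exception test; not generated by any tool.
final class AssertThrowsSketch {

    // Stand-in for JUnit 5's Executable, so the example compiles without JUnit.
    interface Executable {
        void execute() throws Throwable;
    }

    // Stand-in for JUnit 5's assertThrows: runs the body and returns the expected exception.
    static <T extends Throwable> T assertThrows(Class<T> expected, Executable body) {
        try {
            body.execute();
        } catch (Throwable actual) {
            if (expected.isInstance(actual)) {
                return expected.cast(actual);
            }
            throw new AssertionError("Unexpected exception type: " + actual.getClass().getName(), actual);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }

    // Before, JUnit 4 style: the whole method is expected to throw somewhere.
    //
    //   @Test(expected = NumberFormatException.class)
    //   public void rejectsNonNumericPort() {
    //       Integer.parseInt("not-a-port");
    //   }

    // After, JUnit 5 style: only the wrapped statement is expected to throw,
    // and the exception is available for further assertions.
    static void rejectsNonNumericPort() {
        NumberFormatException thrown =
                assertThrows(NumberFormatException.class, () -> Integer.parseInt("not-a-port"));
        if (!thrown.getMessage().contains("not-a-port")) {
            throw new AssertionError("message should mention the rejected input");
        }
    }

    public static void main(String[] args) {
        rejectsNonNumericPort();
        System.out.println("ok");
    }
}
```

Moving from an annotation attribute to a lambda around a statement changes the structure of the method body, which is why this step needs a tree-aware visitor and cannot be done with a regular expression.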
Lastly, you run the OpenRewrite plugin with the migration recipe that you want to execute. The command seen here will migrate an application from Spring Boot 1.5 to the latest Spring Boot 2.7 branch, and we're also working on a 3.0 migration. This migration works all the way back to Spring Boot 1.5. It will update dependencies, properties, and deprecations from any older versions, and it includes the JUnit 5 migration we've seen before, as well as any Spring-specific test constructs. Now that we've seen how OpenRewrite works, let's have a look at what you can do with it. Obviously, by now we've seen it is well-suited to migrations. You've mostly seen migrations from one version to another, but you can also migrate from one framework to another. If you want to switch from Log4j to SLF4J, you can, and the same goes if you want to switch from JUnit to AssertJ, and even larger migrations are in development. Another application is fixing static analysis findings. A large collection of Checkstyle, Sonar, and security findings are supported, allowing you to reduce your technical debt in minutes. Finally, there's a whole class of recipes to enforce a code style, and rather than merely applying a formatter, these style recipes go a step further to actually change your code. This ensures your code style is consistent from project to project. In addition to what's already available, it's fairly easy to add custom migration recipes specific to your project. Now that we've seen how it works and what you can do with it, let's briefly look at what is still to come. As you've seen, OpenRewrite has dedicated parsers for multiple languages already, but we have some catching up to do still. We are working on parsers for both Java 18 and up and Kotlin. Note that you're perfectly able to run on Java 17, but you cannot yet migrate to some of the newer language features.
The interesting thing about Kotlin is going to be that the Java migration recipes that we have will also just work, even though the languages look very different. Another subject we're working on is data flow analysis, which takes into account not only the individual code statements, but also how data flows through your application. This will allow recipes to, for instance, add immutability or apply dedicated security fixes. Another interesting development is the Spring Boot Migrator project from VMware. It builds upon OpenRewrite to migrate projects towards Spring from other frameworks. It takes a slightly different, more interactive approach, which will come in handy when it comes to the Spring Boot 3 migration. All these features are in active development. It's not yet clear when you can use them in a production setting, but they're interesting developments nonetheless. The last subject I want to tell you a bit about is the company behind OpenRewrite. As I said before, Moderne has committed to making all recipes available as open source. Our focus is on applying recipes at scale. Through Moderne, clients can discover code patterns across an entire organization and target these for transformation. And even if you're not a paying customer, you can still use the web interface to browse available recipes and even apply them to open source projects. This can be a great way to start contributing back to open source software. And if you find that any migration steps are missing, OpenRewrite itself is very accepting of new contributions. The community plays a large role in the development of new recipes. Now, as you could probably tell from my email address, we're not exactly a big company. But we're pretty well connected in the broader Java community. Through collaborations, other companies contribute migration recipes for their own projects. This ensures their users are able to migrate easily and in good time with new releases.
And if you maintain or merely enjoy a particular library or framework, you can help other users by providing migration recipes. So with that, we are getting near the end of my presentation. Before I send you away, I want to recommend a few resources where you can learn more. There's extensive documentation available on OpenRewrite. Development is all on GitHub, with new suggestions typically picked up with surprising speed. And as you've all seen, it's quite easy to contribute minor migration steps. If you want to try some recipes quickly on open source software, have a look at public.moderne.io. And if you have any questions, you can reach out on our public Slack or via email. And finally, if you would like to play around with the commands you've seen before, I've written a blog post to accompany this presentation. This blog post migrates an old Spring PetClinic branch from Spring Boot 1.5 on Java 8 to Spring Boot 2.x on Java 17. That way you can play around with the commands and see the changes made at every step. For your own projects, I recommend you start with the testing framework migrations. They're an easy way to gain confidence in the tool and see what it can do for your project. And with that, I'd like to thank you for your attention. What's also really great about Tim's story is that he was enthusiastic about a project, he started contributing to it, and then he was offered a job, and now he's working at the company behind this project. It's really the textbook story of starting to contribute, then getting paid to do that, and joining the company behind the project. Well, thank you very much. |
Rethinking Ecosystem Security After Log4Shell |
So, my name is Steve Poole. My name is Mikaël Barbero. I am from Sonatype. Mikaël is from the Eclipse Foundation. I had a much longer title for this, but I couldn't use it. So, this is a short one. We're going to talk about security. I need to speak up. Wow. Okay. Can you hear me in the back? No. Good. Can you hear me now? No. Okay. So, we're going to talk about security. Very quickly: I want to scare you a little. I want to tell you a little about what's happening. And then we're going to tell you about some concrete actions. Mikaël will tell you about some concrete actions that are happening at the Eclipse Foundation. Pay attention, because this stuff is coming to you. So, the first thing is that the way you think about security has got to change. When you think about security, you probably don't even think about what it means. You think about authentication, encryption and things like that. We had a couple of talks about dependencies. That's beginning to percolate. You've heard about SBOMs and things like that. What you've got to understand is that the world that we started with, with Java 25 years ago, 30 years ago, has changed. And it's changed dramatically. From now on, and probably for the last two or three years, but now it's become a big thing, we have a new problem. And the problem is cybercrime, which has grown beyond all expectation. Cybercrime brings in $7 trillion. If cybercrime were a country, it would be the third-biggest economy. There's so much money coming in. But that's not the worst of it. The worst of it is that all the techniques the bad guys have used to steal money are now being weaponized, because it's become apparent that if you can use these techniques to steal money, you can use them to influence, to penetrate. So, can I get into your banks? Can I get into your chemical manufacturing? Can I get into your delivery systems? That's what they're going to do. That's what they're doing now.
Because if they can get into these systems, they can manipulate them. They can turn them off. You've probably heard of one or two of these things happening with the war in Ukraine. You may have seen these coming through. But that's just a little bit. It's happening all the time now. The new reality is that cybercrime is being used as a weapon to influence your economy. It's trying to get into your supply chain to do quiet, damaging things. Fake news is one of them. You've seen that. There are little things like getting into delivery systems and changing addresses so that things don't go quite right. Breaking systems, shutting down traffic lights, or disrupting the delivery systems for a supermarket. All these tiny things influence your economy. So basically every country in the world is beginning to understand this, and every country in the world, every disaffected group, is looking at cyber technologies as a way to get into our systems. This is it. I cannot stress this enough. You will hear more and more as this goes on, and this is going to affect us all. Not just in this room, but everybody who's at FOSDEM. We're all open source people. So the governments are looking at what's going on and seeing the value, but they are also beginning to understand the opposite. Because if you can attack somebody else's supply chain, somebody else's economy, you can be a victim too. Log4j was our wake-up call. It was the one where everybody went, there is this vulnerability that impacts everybody. Everyone in the world is running a business system that's got Log4j. One way or the other, they were impacted. And it became a government thing. You saw it, we all joked about it, but it became a demonstration of how dangerous these things could be. And it's still going on. How many people in this room had to fix a Log4j problem when it came out? Not good, was it? Tip of the iceberg, more to come. And we're not really good at this either. So I work for Sonatype.
We run Maven Central. We see what happens. We see all the downloads. And even now, this is live, you can go and see this yourself, 28% of the downloads for Log4j are still vulnerable versions. One third-ish of what people are downloading. They aren't safe versions. They're bad versions. Still, a year later, more than a year later. And this is just one example. This happens all the time. Because we don't have the tools or the knowledge or the awareness. Okay? At the back? Why don't you block those? Why don't we block them? Have you got 20 minutes? The simple answer to why we don't block them is that it's possible somebody has got a workaround for it and is protected. So we would break them if we did that. The only time we take things off Maven Central is if they contain malware. And we've done that once or twice. You wouldn't believe it, but Java does have malware. But other than that, you may actually not be affected. So you may be in the lucky group. I bet you the majority of that 28% just have no idea they're doing this. We know from talking to many large customers that they don't even know they're doing this. Couldn't it be indicated that if they aren't fixed by a certain date, they're removed from... We could do that if people want us to do that. It hasn't been socialized, but it's possible. I'll give you some more... Okay, so, scary. Here's where it gets even more scary. Because for the last two years, governments have been going, we should do something about this problem. And they've been waking up. We were in Washington at the end of last year at one of many group conversations where governments and bigger organizations were all looking at what rules they can apply to fix this problem. And the problem is that they think the problem is us, is open source. Because they keep looking at where the problem is, and the problem is in all this open source. We have Java, Node, Rust, Go, you name it. And it's not just the tech.
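The "28% of downloads are still vulnerable versions" figure above is a simple ratio over download counts. A minimal sketch of that calculation follows; the cut-off version is an assumption chosen for illustration (the Apache Log4j security page is the authoritative list, and some backport releases are also fixed), and the download counts are made up purely to mirror the figure quoted in the talk.

```python
import re

# Assumption for illustration: treat log4j-core 2.x releases below this
# version as needing an upgrade. Backport fix releases are not modelled.
ASSUMED_MIN_SAFE = (2, 17, 1)


def parse_version(version):
    """Return the leading numeric part of a version string as a 3-tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def needs_upgrade(version, min_safe=ASSUMED_MIN_SAFE):
    """True for 2.x versions below the cut-off (1.x is out of scope here)."""
    parsed = parse_version(version)
    return parsed[0] == 2 and parsed < min_safe


def vulnerable_share(downloads, min_safe=ASSUMED_MIN_SAFE):
    """downloads maps a version string to a download count."""
    total = sum(downloads.values())
    if total == 0:
        return 0.0
    flagged = sum(n for v, n in downloads.items() if needs_upgrade(v, min_safe))
    return flagged / total


# Hypothetical counts, not real Maven Central data.
sample = {"2.14.1": 20, "2.12.1": 8, "2.17.1": 40, "2.19.0": 32}
print(f"{vulnerable_share(sample):.0%}")
```

Running the same check against the versions declared in your own builds is a quick way to find out whether you are part of that share.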
So you've heard about some tools already. You're going to hear about some more tools. It's not just about the tools. They are concerned about our behavior. That is even more frustrating and worrying to these people than the tooling. Now, the good news is that we were in Washington, and this has been going on for, as I said, two years. There are lots of policy conversations. The Linux Foundation, OpenSSF, lots of people are getting together to work out what the right answer is. But whatever the community decides, this is happening. So there are two things here. The US put out an order on improving the nation's cybersecurity, or something like that. That's the US one, from May 2021. And recently, as you can see, the Eclipse Foundation are not happy with the European one. Because the European one is about making all of you, open source contributors and projects, into suppliers and making you comply with a bunch of rules. And the rules are pretty stiff. So how many people here have their own little open source project that they share? OK. So now, all of you are going to be suppliers. And so it's going to be: all of you will have to provide SBOMs. You've heard about that. You'll all have to have automated processes. You'll all have to have evidence of software integrity. You'll have to have audit processes, vulnerability processes. This is coming our way. And so we have to get our act together to make sure that the way we resolve this isn't as individuals, but as a community. Because you can see there are lots of things that people want to do. It's governments. They want to manage this. They want to look at what we do. They want to put processes in the way. So as a plug for Maven Central, since we're plugging tools: you've got Maven Central, and you can download stuff. We're adding more pieces in conjunction with people like OpenSSF and others. We're working out whether we can help assess the behavior of an open source project.
And that means: how good are the contributors, how good are the committers at reviewing code? What's the release pattern like? Can we spot unusual behavior? Because what the bad guys are trying to do is subvert your projects, your behavior, and get in there and get your software to deliver malware or to behave badly. So we're looking at ways of trying to codify this. We've got some ideas, other people have other ideas, so we're plugging that into Central. We have a visualization tool for SBOMs, i.e. your dependencies. So you can go and find that. You can see that for all of these projects. And again, we're trying to figure out how to score it so that we can give you the best advice. Because when you're choosing a dependency, as somebody was mentioning before about compile levels and things like that, there's a whole bunch of choices you can make as to what's the right one. But you need to know what your dependencies are and where the risks are. We're trying to help you do that. There's other stuff coming. We're not the only ones doing this. Everybody's concerned, and everybody's trying to fix these problems, right? May 12th was sort of when the clock started for us. Mikaël's going to talk about what's happening at the foundation, but my call to all of you is: start to pay attention, start looking at the tools that are being proposed as ways of solving this. Understand it's not just the tools. It's the people. It's the data. Dependency management is great, but not all dependency management tools are equal. So you have to look at the different tools. You have to ask yourself how the bad guys are going to behave. You have to start thinking differently. If you're an open source project contributor or a committer, you have to start thinking about this and getting involved. Start looking at the tools and the standards that are coming through and seeing how they can help you.
Because we need to have the community doing this and becoming a body of people who are behaving better in terms of how we develop software, how we think about security. Because if we don't do it, it's going to get done for us. Right, I should stop there and hand over to Mikaël, who's going to tell us about some practical things that we're doing. Yeah, because it's not all doom and gloom. Thank you. Thank you, Mikaël. Thank you, Steve. Are you scared? Yeah? So, I will talk a little bit about what we do at the Eclipse Foundation. The good news: we will solve everything. No, of course not. Our vision is to be a kind of role model in the open source world for how we can implement supply chain security best practices. But, of course, we realise that we cannot just put the burden of additional security on the shoulders of developers. They probably don't think it's important, or they don't have the time, or they don't have the skills to actually implement all those best practices. So, what we want to do is to help our projects by providing services, tools, and best-practice recommendations. And we've been able to build the capacity to do that for our projects thanks to the Open Source Security Foundation and specifically the Alpha-Omega project, which provides us with funds to build the capacity, to build the team, to actually help our projects. So, what I would like to show you today is what we are starting to do for our projects, what tools and practices we are implementing, and also give you some examples from one of our projects in particular that you may know already. So, of course, we try to do that with measurement. We want to measure the current status of security, analyse that status and try to improve it iteratively. So, the very first tool we are using to do that is Scorecard. It's an OpenSSF project, and what Scorecard does is run on your GitHub repository and give you a score, a global score, for the security posture of your repository.
So, do you have branch protection? Do you have a security policy file, and so on? So, we run that on our GitHub repositories, and the nice thing about being a foundation is that we have a large number of projects, so we can have a large dataset. We have about a thousand GitHub repositories, so we run Scorecard on all of them, and that's the histogram, the distribution of the global score of our projects. You can see we are not too bad, but we are not too great either. So, that's what we want to shift to the right, right? We want our projects to have better, higher scores. But this chart by itself, in isolation, does not tell us much about what to do. So, let's dive into some of the findings we have from this analysis. We found two issues, basically. First, most of our projects don't have branch protection. So, who knows what branch protection is on GitHub? Okay, about half. The most basic form of branch protection is that you cannot force push to your project. And that's very important, because if someone manages to steal credentials, they will force push to your repo and be able to add malicious commits to it. The other issue that Scorecard found is that most of our GitHub Actions workflows use high-privilege token permissions. By default, you may not know, tokens in GitHub Actions have write permissions to the repository. But actually, most GitHub Actions don't need write permissions on your repo. They don't need to be able to push. So, you can decrease the permissions of your GitHub tokens, but it's not the default on GitHub. So, those are the two main things that we will focus on to improve our score. To do that, there are many tools available. There is one from Step Security that is very helpful.
It's a tool that also runs Scorecard on your repository and provides you with the ability to create automatic PRs on your repository to fix some of those issues. So, for instance, on the bottom right, one lowers the permissions on the token; the other replaces the tag of the GitHub Actions you are referencing with the SHA. For those who know about Docker containers, it's better to use the SHA rather than the tag, because tags are not immutable and there are plenty of security issues with that. But what we want to do as well is to be able to disseminate those best practices to all of our organizations and projects. At the foundation, we have more than 100 GitHub organizations to manage, and we have, as I said, more than a thousand repositories. So, we need some tools to do that. I don't know if many of you in the audience have to manage that many organizations or that many projects, but going to GitHub to edit and configure your settings, and especially the security permissions and security settings of GitHub, is a pain. We are developing a tool called Otterdog that will help us use configuration as code to deploy the security best practices on GitHub. We are also following a security framework to improve the security posture of the supply chain of our projects. The one we are following is SLSA, pronounced "salsa", and you can find more on slsa.dev. It's basically a set of best practices with different requirements, and the more requirements you comply with, the higher the SLSA level you reach. And we have a way for projects to promote their security posture by displaying their SLSA compliance level. The most basic thing, but please do it now: activate 2FA for your account. Security starts with the developer, security of the supply chain starts with you, and we will start to enforce that for all of our projects. We also generate SBOMs, so that's an experiment.
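The two fixes described above, lowering the token permissions and referencing actions by commit SHA rather than by tag, can be checked mechanically. The following is a minimal sketch, not the Step Security or Scorecard implementation: it assumes the workflow file has already been parsed into a dictionary (for example by a YAML parser), and the function names and the example workflow are hypothetical.

```python
import re

# A full Git commit SHA is 40 hexadecimal characters.
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def is_pinned_to_sha(uses):
    """True if a 'uses:' reference like 'owner/repo@ref' pins a full SHA."""
    if "@" not in uses:
        return False
    ref = uses.rpartition("@")[2]
    return bool(FULL_SHA.match(ref))


def audit_workflow(workflow):
    """Return a list of findings for an already-parsed workflow dict."""
    findings = []
    permissions = workflow.get("permissions")
    if permissions is None:
        findings.append("no top-level 'permissions': the repository default applies")
    elif permissions == "write-all" or (
        isinstance(permissions, dict) and "write" in permissions.values()
    ):
        findings.append("top-level 'permissions' grants write access")
    for job_name, job in workflow.get("jobs", {}).items():
        for step in job.get("steps", []):
            uses = step.get("uses")
            # Local actions ('./path') live in the repository itself.
            if uses and not uses.startswith("./") and not is_pinned_to_sha(uses):
                findings.append(f"{job_name}: '{uses}' is not pinned to a commit SHA")
    return findings


# Hypothetical workflow; the 40-character value is a placeholder, not a real commit.
example = {
    "permissions": {"contents": "read"},
    "jobs": {
        "build": {
            "steps": [
                {"uses": "actions/checkout@v3"},
                {"uses": "actions/checkout@" + "a" * 40},
                {"run": "make test"},
            ]
        }
    },
}
for finding in audit_workflow(example):
    print(finding)
```

Only the tag-based reference is reported for this example, because the workflow already declares read-only permissions.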
We are starting to use ORT to generate SBOMs for all of our projects, and we are comparing them with what Sonatype and Maven Central will provide, and also with SBOMs generated by build tools. It's still an experiment, but we want all of our projects to be able to generate SBOMs. And thanks to this funding as well, we provide security audits to our projects. So we are funding security audits, thanks to our partner OSTIF. We just started three projects this week, with three more to come in the year. Of course, I cannot tell you much about them today because they are still being audited, but the results will be published by the end of the year. And finally, I want to talk about one project in particular at Eclipse. It's Adoptium and the Temurin distribution of OpenJDK. Basically, what the project is doing is trying to produce the world's most secure OpenJDK distribution. And to do that, they follow all the best practices I was talking about, and they are doing a tremendous job in leading the way for all of our projects. In particular, they are following two security frameworks to ensure the security of the supply chain: SLSA, as I already mentioned for all of our projects, but they also follow the NIST SSDF framework. The two are very similar to each other. One focuses more on the what and the other on the how. But the combination of the two makes it very, very secure. They are today at SLSA level two, and nearing level three shortly. And they already comply with some of the requirements of level four, which is the top level of SLSA. And finally, I would like to mention that what makes Temurin and Adoptium the world's most secure OpenJDK distribution is that it's actually reproducible. So if any of you know what a reproducible build is, it's the top-notch level for ensuring supply chain security.
You can rebuild the binaries on your laptop and check that they're exactly the same as the ones distributed on the website by the project, so that you know that the supply chain has not been compromised. And it's already doing that for JDK 17 and 19, for Linux and macOS. They provide all the patches to achieve that upstream to OpenJDK, and Windows should be there pretty soon. That's it for me. Okay, just one thing. Give him a clap. So I know we're over time, but: you don't have to do this. We're talking about open source projects. You do not have to do any of this, but what will happen over time is that those projects that do improve their posture, that become more security conscious, are going to end up being the software projects that get used more and more, because the governments are going to force the businesses to make choices, and the businesses rely on open source dramatically. But their choices are going to become limited based on our behavior and our actions. So we want to get ahead of the game. So I would encourage you: 2023 is the year of secure supply chains. Start looking at all this tech, learn about the standards, some of it's rubbish, some of it's getting better, look at the tools, just start to get your head in the game, start to make choices, start to get involved. That's it. Thank you. We actually have time for questions. Oh, we have more time? If there are questions for anyone or... Questions? Comments or... Anybody scared enough to do something? I have a question. So about the automated PRs, is there any possible way that an attacker can make a disguised PR that looks like something close to a proper security fix, but in reality is an actual vulnerability? Yes. So you're asking if PRs can be faked? Yeah. So we could do a whole day, a whole week, on how the bad guys will compromise your projects, but PRs are one of them.
If somebody turns up and they're very helpful and they like doing the merges for you, merging is a good place for other code to come in, because nobody checks it afterwards. So there are all sorts of places where that can happen, but honestly, most of the time, right now, if you want to know the one thing they attack, it's your build systems, because almost always the build is less protected than anything else, and it's easy to trigger. So one of the things that we watch for in terms of the scorecards is unusual build behavior. If a project releases once a month, and then suddenly it releases five times in a row, there's something wrong. But honestly, every way you can think of is happening, people are using it. Yes, absolutely. Dependencies can be contaminated. One of the things we talk about when you're looking at SBOMs: you think of Log4j, and you're assuming that you can find Log4j because it's listed as a dependency. Think about all the fat JARs that you've ever built where you've stitched things together. That information may or may not be in the SBOM, because it depends on whether the bad guys who've compromised your projects make it available. Does that make sense? SBOMs are really good, but you still need good scanning tools, because the bad guys are trying to hide from the SBOM. Nothing's new, it's just that the game has changed. Yes? Is it OK to use gradlew? I think so. I'm not sure I... Do you think it shouldn't be? Using gradlew, mvnw, sbt, whatever. I have no specific guidance on wrappers, because I don't know if there is or isn't a problem. If you think there is a problem, come and tell me afterwards, but I'm not aware of one. Just another remark about the proposal to pull some code from the repository again. Remember somebody who removed his code and broke the internet? Oh, yes. We're talking about situations where people have removed code from a repository.
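The "releases once a month, then suddenly five in a row" signal mentioned above can be sketched as a sliding-window count over release dates. This is an illustrative heuristic under assumed thresholds (a seven-day window and more than two releases), not the scoring that Sonatype or any scorecard tool actually uses; the dates are invented.

```python
from datetime import date


def release_bursts(release_dates, window_days=7, max_releases=2):
    """Return (start_date, count) for each window with more than max_releases."""
    dates = sorted(release_dates)
    bursts = []
    i = 0
    while i < len(dates):
        start = dates[i]
        # Count releases falling within window_days of this release.
        count = sum(1 for d in dates[i:] if (d - start).days < window_days)
        if count > max_releases:
            bursts.append((start, count))
            i += count  # skip past the burst so it is reported once
        else:
            i += 1
    return bursts


# Hypothetical history: monthly releases, then five releases on consecutive days.
history = [
    date(2023, 1, 1),
    date(2023, 2, 1),
    date(2023, 3, 1),
    date(2023, 3, 10),
    date(2023, 3, 11),
    date(2023, 3, 12),
    date(2023, 3, 13),
    date(2023, 3, 14),
]
print(release_bursts(history))
```

A burst is not proof of compromise, since hotfix series look the same, so a flag like this is best treated as a prompt for a human to look closer.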
Node was the best example recently, where people just took something out of the repository. Basically, one of your dependencies disappears, and of course your build process breaks. The one that's worse than that: you may have heard of occasions where the valid owner of a dependency, the committer of that thing, has put bad code in. Not deliberately, as in trying to crash the internet, but for instance, there was one who was trying to do geolocation and said, if you're in Russia when you use this, bad things will happen. And of course, they got it completely wrong and a lot of people were hurt. But that's just an example of the sorts of things that people try to do, and it's just going to become more obvious. We as a community rely on trust, and I'm afraid that trust is being diluted, because we have a lot of people who are going to exploit our trust, and so we have to learn to protect ourselves against it. Sorry, that's the way it is. Yeah. So, about reproducible builds: I know that for the JDK you're now getting a reproducible build, but what about all the Maven artifacts, like all the JARs? Can we actually get reproducible builds for the JARs? So the whole point of reproducible builds is that you can be absolutely certain that you can produce a binary that looks identical, apart from specified differences such as dates. So the reproducible build process says: if you're actually paranoid, check out the source code, compile it, and you should get a binary that you can prove is the same as the one that you got from a supplier. Now, conceptually, you can do that across all binaries, provided that you've got a way of describing the differences and they're not too big. So what the reproducible build process has done, like with Temurin, is work it down to just two or three differences.
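The check described above, rebuilding from source and proving the result matches what the supplier distributed, comes down to comparing the two binaries. A minimal sketch of the strictest case, a byte-for-byte match via SHA-256, is shown below. It assumes the build is fully reproducible with no remaining differences; where a few specified differences such as dates remain, a real comparison would have to account for them. The file names are placeholders.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(rebuilt_path, downloaded_path):
    """True if the locally rebuilt file is byte-identical to the download."""
    return sha256_of(rebuilt_path) == sha256_of(downloaded_path)


# Demo with throwaway files standing in for a rebuilt and a downloaded artifact.
with tempfile.TemporaryDirectory() as tmp:
    rebuilt = os.path.join(tmp, "rebuilt.bin")
    downloaded = os.path.join(tmp, "downloaded.bin")
    for path in (rebuilt, downloaded):
        with open(path, "wb") as handle:
            handle.write(b"identical build output")
    print(same_binary(rebuilt, downloaded))
```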
So it's easy to spot, because otherwise you could obviously rebuild a JAR file from source code and get a binary, and depending on what compiler you used, it might be slightly different in 50,000 places. So the idea of the reproducible build is to get the differences down to something that you can assess. And for people who are paranoid and want to ensure that they're not actually taking downloads, but are rebuilding from source, reproducible builds are the way to do it. A lot of big companies like to do that, to take the source and build it, but they need reproducible builds to make that happen, which is another reason why you'll see more and more of it. More? Yes? If you're looking for reproducible builds, also for maintenance and so on, look at NixOS, because they have... So if you're looking for reproducible builds for Java, it's on NixOS? Yeah, nixos.org. Oh, thank you, more resources. There are pretty good recipes for building reproducible JARs with Maven. So check reproducible-builds.org, or whatever the extension is, under the Java section, and you will find the process. Hi, so I'm an open source maintainer, and you really scared me with the idea of regulators, governments coming in, putting rules in place and telling me what I should be doing. Have you got any advice on what we can do as open source people to, you know, prevent that from happening, that a stupid regulation comes in that makes us... Can we prevent a stupid regulation coming in? I doubt it. What we can do is manage it. What we're trying to do, and what the Linux Foundation, Google, IBM and everyone are trying to do, is to come up with something that will work, because we need these protections. It's not like we don't need them. We just want it done in a way that doesn't mean that everybody who's writing open source stops and goes home, because 90% of any application is open source, which is why we're all scared.
So the basic advice I would give you is two things. One is: get your head in the game and start looking at these standards and at what's happening. Go to OpenSSF, start reading about what's happening. Look at the scorecard processes that people are putting together, because they will help you understand what you've got to do. The Linux Foundation is a really good example, because you're looking at SLSA and Scorecard, and CNCF has a thing called CLOMonitor. Some of them are really straightforward. There are simple things like: do you have a security.md file? Do you use branch protection, things like that? So there's that list, and that will get you somewhere. And if you start to say, I'm doing these things, and here's the protection I have in place, if you're public about your behavior, it becomes just a little bit easier for us to see that people are following it. My expectation, honestly, is that one or more of these standards will fall out and become the bar. And we've just got to make sure it's not a heavy bar, but something that we can all agree makes reasonable sense to do. And my final advice to you is: you have now got to learn to be suspicious. So when anybody contributes to your stuff, think about whether it is the right thing. When you design code, think about the unhappy path. When you use dependencies yourself, think about: do I trust the people who wrote that? Go look at the website, go look at the GitHub repo. How many of you have downloaded something for the first time? How many of you have gone to the GitHub repo and gone, oh, this hasn't been updated for seven years, I'm not going to use that? We all do that. So do it more often and just get a bit more thoughtful about your choices. I think we're way over time. No? Wow, okay, boy. I've got a question for you. To Mikaël, on the point earlier about the read and write GitHub token permissions.
So GitHub announced last week that essentially all newly created repositories would get a read-only token by default. But at this stage, they've left it open, so any existing repository will stick with write by default. I wanted to see whether you had an opinion on whether that was the right move to make, or whether at some point they should consider actually making the switch for everything across GitHub. That's a good question. Should GitHub switch the token permission? It's the same problem, or the same sort of question, that Simon asked about Maven Central. If you do that overnight, unilaterally, like if you said, okay, from now on all APIs are going to be paid for, where did that happen? You can see the consequences. So you have to understand. I would say, if you look at what GitHub are doing, GitHub will obviously have a program, and they're working through it and trying to make it safe, because GitHub's business is all about this. So they have a real vested interest in making sure that they're providing as many security features as possible. I think we agree that there are situations where you should have tokens that are read-only, because you don't need write rights. So why the hell haven't we got one? So they should produce one to make it obvious that it exists. Okay. Are you around today? Are you in this room today? I'm around today. You too. Yeah, come find us. And also one thing: the session after the one coming up now will be by people from the Linux Foundation and from CNCF. So we can continue this discussion there. And it will also be about SBOMs and the supply chain, etc. The same thing again. Good. I leave you with a completely different story. Thank you very much. |
Elasticsearch Internals |
So, before we go on with the next security-related topic, we're going to talk about something completely different, and that is Elasticsearch internals, by Martin from the Bulgarian JUG, and maybe also about security in this context. Test, test, test. Thank you. So, people coming in, please move to the middle of your row so that there's space on the side and people can sit. There are a lot of people coming in, and then we can start. If you're standing along the side, please take a seat. So, hello, everyone. My name is Martin, and I'm a consulting architect at the European Patent Office. I've also been doing a lot of consultancy on Elasticsearch in the past two to three years. So, just before we start with this session, how many of you are using or have used Elasticsearch in a project? Okay, more than half of the people. So, why this talk at FOSDEM? Multiple reasons, in fact. When I worked with Elasticsearch, I realized that even though it has quite good documentation, in many cases you need to go into the public code base and see what's in there to understand how it works. I've had questions from many people: how does this functionality work, or how can I achieve something with Elasticsearch? And it's not always clear from the documentation or from blogs on the Internet what you can achieve with Elasticsearch. So, in this short session, I'll try to show you how Elasticsearch works internally, and I'll talk about the Elasticsearch architecture. First of all, we'll do a 360-degree overview of the Elastic Stack, which I believe most of you are familiar with. Then I'll go into the Elasticsearch architecture, and at the end of this short session, I'll show you how you can write a very simple Elasticsearch plugin.
In most cases, you won't need to write an Elasticsearch plugin, because there is quite a rich ecosystem of Elasticsearch plugins that you can use. But many companies find that that's not always the case. So, sometimes you need to either customize something in Elasticsearch or write your own plugin to achieve something. All right. So, let's talk briefly about the Elasticsearch stack. In the middle, we have Elasticsearch, which is a Java application. It's being updated quite often. There are a lot of features being implemented in Elasticsearch, especially in the latest few releases. And around the Elasticsearch server application, there are different applications that are built to allow you to work more easily with Elasticsearch, such as Kibana. Kibana is a rich user interface for Elasticsearch that allows you to achieve multiple things: not only querying Elasticsearch, but also visualizing data that's already in Elasticsearch or building different dashboards that are quite nice, especially for management. Also, if you want to put data from a variety of sources into Elasticsearch, you can use Logstash. Originally, Logstash was implemented to provide a way to aggregate logs into Elasticsearch. But over time, Logstash evolved into an application that is used to integrate data into Elasticsearch, not only log data, but any kind of data. So, you can think of Logstash as a log aggregation pipeline that allows you to put data into Elasticsearch. And on top of that, we also have a set of so-called Beats applications, which are lightweight log shippers that allow you to collect data and put it either directly into Elasticsearch, or through Logstash into Elasticsearch, or into other data sources. The specific thing about the Beats applications is that they are lightweight in nature, so they are supposed to not consume a lot of resources such as CPU and memory.
And for that reason, they allow you to collect log data or other data and put it into Elasticsearch. Now, you can think of Elasticsearch as a web server built on top of the Apache Lucene library. The Apache Lucene library is an actively developed Java library that is used by different applications that want to implement some kind of search functionality. And Elasticsearch is one of them. I'll show briefly in a few slides how Elasticsearch interacts with the Apache Lucene library. Another way to describe Elasticsearch is as a document-oriented database. Elasticsearch is used by different projects not only for searching, but also as a NoSQL database. I've had a few projects where Elasticsearch was used purely as a NoSQL database, not as a search engine. And one can think, okay, Elasticsearch is a Java application, so why can't I use Apache Lucene directly? The reason is that Elasticsearch provides a number of features that are missing in the Apache Lucene library and that allow you to implement search in your project way more easily than by using Apache Lucene directly. One of these features is, for example, the JSON-based REST API, which is quite easy to use: it's quite easy to write search queries, to index data into Elasticsearch, and so on. There is also a really nice clustering mechanism implemented in Elasticsearch that allows you to bring up and scale your Elasticsearch cluster quite easily, something that's not possible if you use Apache Lucene directly in your project. And it has a number of other features, such as caching, that allow you to improve the performance of your search queries, and so on. Now, the basic data structure used by Elasticsearch is the so-called inverted index, and indexes are stored on disk in separate files, or Lucene segments. Search can be performed on multiple indexes at a time. That's one of the capabilities of Elasticsearch.
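To make the inverted index a little more concrete, here is a minimal sketch of the idea in plain Python. It is only an illustration under simplifying assumptions: the whitespace tokenizer, the sample documents, and the function names are all made up for this example, and Lucene's real on-disk segment format is far more elaborate.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive analyzer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return the IDs of documents that contain every query term."""
    result = None
    for term in tokenize(query):
        ids = set(index.get(term, []))
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical sample documents.
docs = {
    1: "Elasticsearch is built on Lucene",
    2: "Lucene stores segments on disk",
    3: "Kibana visualizes data",
}
index = build_inverted_index(docs)
print(index["lucene"])
print(search(index, "lucene on"))
```

A lookup goes from a term straight to the documents that contain it, which is why term-based queries do not need to scan every document.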
And in earlier versions of Elasticsearch, documents were logically grouped by types. That was effectively deprecated as of version 7 of Elasticsearch, and it's expected to be dropped. In order to ensure score relevancy when you search for some data in Elasticsearch, Elasticsearch uses an algorithm to score the relevance of results. In the later versions of Elasticsearch, this algorithm is BM25. In earlier versions of Elasticsearch, it was a simpler algorithm called TF-IDF. And the basis of those algorithms is how many times a term occurs in a document, and how many times this term occurs across all documents that are currently indexed in Elasticsearch. Based on that, by default, Elasticsearch scores every result that gets returned by your search query, and by default it returns results sorted by relevance score. Now, why would you use Elasticsearch in favor of, for example, a relational database? Well, it provides faster retrieval of documents in way more scenarios than a traditional relational database can. As you know, traditional relational databases provide faster searches through indexes. However, indexes in relational databases have many limitations based on the type of SQL queries that you write. In Elasticsearch, the inverted index data structure provides the capability to cover way more search scenarios using more complex queries. And for that reason, many projects choose to use Elasticsearch as a search engine. Now, documents in Elasticsearch also might not have an explicit schema, as you have in a relational database, and that's typical for many NoSQL databases. An explicit schema, however, can be defined on the fields, and certain fields can even have different types mapped to them. This is needed because sometimes you need to use different kinds of search queries based on the field type, and some field types pose limitations.
So, that's why you might need to have multiple types on a single field in Elasticsearch. Now, this was a brief look at what Elasticsearch is and how it works. Now, let's see what the architecture of Elasticsearch is. Elasticsearch, as I mentioned, is designed with clustering in mind. By default, in later versions of Elasticsearch, if you create an index, it has one primary shard and one replica shard. So, what is a shard? An Elasticsearch index contains one or more primary shards that distribute the data in the Elasticsearch cluster. Below that, an Elasticsearch shard is, in fact, a Lucene index, and a Lucene index is, in fact, the data structure that stores the data on disk in terms of Lucene segments. Lucene segments are the physical files that store data on the disk. Now, when you index data in Elasticsearch, you might also have replica shards. Replica shards give you the possibility to enable high availability and data replication at the level of the Elasticsearch cluster. So, there are two types of shards, primary and replica shards. The more nodes you add to the Elasticsearch cluster, the more the data gets distributed among shards. Now, it's very important that you plan the number of primary shards up front, based on the data growth that you expect. It's very difficult to change the number of primary shards later in your project lifecycle: you would need to re-index the data. However, if you want to change the number of replica shards, that's easier to do later. So, it's very important that you plan up front the number of primary shards on an index that you create. Now, by default, Elasticsearch tries to balance the number of shards across the nodes that you have. And one of the other capabilities that Elasticsearch provides is that if a node fails, you can still get search results: so-called partial results can be returned even if some of the nodes in the cluster are not available.
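One way to see why the number of primary shards is hard to change later: the shard for a document is derived from a hash of its routing key modulo the number of primary shards, as the talk describes next. The sketch below illustrates that with a few assumptions: `zlib.crc32` stands in for the hash function Elasticsearch actually uses, and the document IDs and shard counts are made up.

```python
import zlib


def shard_for(routing_key, num_primary_shards):
    """shard = hash(routing_key) % number_of_primary_shards."""
    # crc32 is only a stand-in (an assumption) for Elasticsearch's own hash.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Hypothetical document IDs; by default the routing key is the document ID.
doc_ids = [f"doc-{i}" for i in range(1000)]

before = {doc_id: shard_for(doc_id, 5) for doc_id in doc_ids}
after = {doc_id: shard_for(doc_id, 6) for doc_id in doc_ids}

# Count documents whose target shard changes when going from 5 to 6 shards.
moved = sum(1 for doc_id in doc_ids if before[doc_id] != after[doc_id])
print(f"{moved} of {len(doc_ids)} documents would land on a different shard")
```

Because the modulus is part of the formula, changing the shard count changes the target shard for most documents, which is why the data has to be re-indexed.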
Now, by default, Elasticsearch determines the shard where a document is indexed based on a relatively simple formula. You take the hash of the routing key of the document. By default this is the document ID, which can be generated in different ways: Elasticsearch can generate it if you don't specify one, or your application can supply the document ID, and so on and so forth. And you take that modulo the number of primary shards that you have defined on the index where you index the document. Now, as I mentioned, by default the routing key is the document ID, but you can also use a different routing key. And one interesting technique that some people use to control the distribution of data in the Elasticsearch cluster is specifying a custom routing key, which enables so-called shard routing. This is a technique that allows you to specify to which particular shard you want to send the document to be indexed. But that's used in some specific scenarios. In most cases, people rely on the default mechanism that Elasticsearch uses to distribute data in the cluster. Now, by default, new nodes are discovered via multicast. If a cluster is discovered, a new node joins the cluster if it has the same cluster name. If a node on the same instance already runs on a specified port, and you try to run another node on that instance, Elasticsearch automatically gives you the next available port. However, in some companies, multicast addresses are disabled for security reasons. And that's why the preferred mechanism to join new nodes to an Elasticsearch cluster is using unicast addresses. In the Elasticsearch YAML configuration, you just need to specify one or more existing nodes from the Elasticsearch cluster so that the new node can join that existing cluster. And in that list of unicast nodes, you don't need to specify all the nodes in the Elasticsearch cluster.
You just need to specify at least one node that has already joined the cluster. Now, when you bring up an Elasticsearch cluster, there are some considerations that you need to take into account. First of all, as I mentioned, sharding: it's very important to consider what the number of primary shards on the Elasticsearch index should be, and the number of replica shards, which is easier to change over time. You also need to consider how much data you store in an Elasticsearch index. Indexes with too little data are not good, because that implies a lot of management overhead. And the same goes for indexes with too much data. I've seen some cases where people store, let's say, more than two or three hundred gigabytes of data in an Elasticsearch index. And that really slows down search operations and other operations on that index. And people start wondering, okay, why is my indexing slow? Why are my search queries slow? And in many cases, the reason is that data is not distributed properly in the Elasticsearch index. The preferred amount of data that you should keep in an Elasticsearch shard is between five and ten gigabytes, roughly speaking. So if you have more data that you want to put on a shard, you should consider splitting that data. You either use more shards in the cluster, or you split the data into so-called sequential indexes. For example, you might have daily, weekly, or monthly indexes. Now, this is what I mentioned: you should avoid putting too little data in an Elasticsearch index. Also, if you have too many shards defined on an index, that introduces performance and management overhead as well. So you should consider splitting the data in the index rather than bringing up too many shards on a single index. And determining the number of shards should be a matter of upfront planning.
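The sizing advice above can be turned into a rough back-of-the-envelope helper. This is only a sketch: the 10 GB default comes from the upper end of the speaker's rule of thumb, while the `logs` prefix, the date format, and the function names are hypothetical choices for this example.

```python
import math
from datetime import date, timedelta


def primary_shards_needed(expected_gb, target_gb_per_shard=10):
    """Rough estimate: enough primary shards to keep each under the target size."""
    return max(1, math.ceil(expected_gb / target_gb_per_shard))


def daily_index_names(prefix, start, days):
    """Names for sequential daily indexes that share a common prefix."""
    return [f"{prefix}-{(start + timedelta(days=i)):%Y.%m.%d}" for i in range(days)]


# 300 GB in one index would need many shards at about 10 GB each ...
print(primary_shards_needed(300))
# ... so the data could instead be split into daily indexes.
print(daily_index_names("logs", date(2023, 2, 4), 3))
```

Splitting by time keeps each index small, and a shared prefix makes it possible to address the whole family of indexes together.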
Now, apart from the fact that you need to avoid putting large amounts of data in a single index, the main strategy that people use is, for example, a prefix when they split data into indexes. For example, you can use a prefix for daily, weekly, or yearly indexes. And if you do that, it's good practice to also use aliases to reference the data: don't directly reference a particular index in your application, but rather use aliases. In terms of concurrency control, Elasticsearch does not provide pessimistic locking like, for example, you have in relational databases. If you want to establish some form of concurrency control in Elasticsearch, in order to make sure that you don't have unexpected race conditions, Elasticsearch uses optimistic locking. The way this works is that when you index a document, there is a version attribute that can be specified. And if there is already a document indexed with that version, then the operation is rejected by Elasticsearch. Concurrency control can also be achieved with two parameters that can be specified when you index the document, if_seq_no and if_primary_term: if they do not match the document that's currently indexed, the operation gets rejected. So if you want to establish some form of concurrency control in Elasticsearch, you can use this optimistic locking provided by Elasticsearch. In terms of high availability, you can create one or more copies, or so-called replicas, of an existing index. The number of replica shards is specified when you define the index, or you can change it later. Once an index request is sent to a particular shard, determined based on the hash of the document ID, the document is also sent to the shard replicas. And one interesting property of Elasticsearch is that the replicas are not used only for high availability, but also for searching, to improve performance.
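The optimistic locking described a little earlier can be illustrated with a small in-memory simulation. In Elasticsearch's documented behaviour, a write that carries `if_seq_no` and `if_primary_term` succeeds only if both still match the stored document; otherwise it is rejected with a version conflict. The class below is a hypothetical stand-in for a single shard, not the Elasticsearch API, and the document contents are made up.

```python
class VersionConflict(Exception):
    """Raised when the caller's view of the document is stale."""


class TinyIndex:
    """In-memory stand-in for one shard (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (source, seq_no, primary_term)
        self._next_seq_no = 0
        self.primary_term = 1

    def index(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        current = self._docs.get(doc_id)
        if if_seq_no is not None or if_primary_term is not None:
            # Conditional write: succeed only if the document is still at the
            # sequence number and primary term the caller last saw.
            if current is None or (current[1], current[2]) != (if_seq_no, if_primary_term):
                raise VersionConflict(doc_id)
        seq_no = self._next_seq_no
        self._next_seq_no += 1
        self._docs[doc_id] = (source, seq_no, self.primary_term)
        return seq_no, self.primary_term


idx = TinyIndex()
seq_no, term = idx.index("1", {"stock": 10})

# First conditional update succeeds and bumps the sequence number.
idx.index("1", {"stock": 9}, if_seq_no=seq_no, if_primary_term=term)

# A second writer still holding the old sequence number is rejected.
try:
    idx.index("1", {"stock": 8}, if_seq_no=seq_no, if_primary_term=term)
    conflict = False
except VersionConflict:
    conflict = True
print("stale write rejected:", conflict)
```

No lock is held between the read and the write; the stale writer simply finds out at write time and has to re-read and retry.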
So when you have replica shards, they also participate in the search requests that you send to Elasticsearch. Now, this mechanism for improving performance is really nice, but this doesn't mean that you should simply increase the number of replicas, because, of course, that increases management overhead. So it's also a matter of determining up front how many replicas you would need. And later on, if you plan to scale your cluster, you can also increase the number of replicas. So you should not put a lot of replica shards at the beginning either, when you define your indexes. Now, how is a shard request processed? If we want to index a document in Elasticsearch, what happens? We send the request to a coordinating node. This is one of the nodes in the Elasticsearch cluster. And this coordinating node sends the request to the shard, to the node in the cluster where the document needs to be indexed and stored in Lucene segments. When the document reaches the Elasticsearch node in the cluster, the particular shard, it gets sent not directly to the disk, but to two in-memory areas. These are the memory buffer and the transaction log. Now, the memory buffer gets flushed every second to the disk. So when you index a document in Elasticsearch, you cannot expect it to be available right away for searching. But there is also a parameter that you can use to force it to be written to disk right away, without waiting for this one second for it to be flushed to disk. There is also the other area, which is called the transaction log, which gets flushed less often. It gets flushed every 30 minutes or when it gets full. So the important takeaway from this is that when you index a document, you should not expect it to be available right away for searching, but you can enforce that. What happens if you send a search query to Elasticsearch? First, the search request gets sent to one of the nodes in the Elasticsearch cluster, the so-called coordinating node.
Then we have two phases. First is the query phase. It asks all the shards, primary and replica shards, hey, do you contain some data for that search query? And this information gets returned to the coordinating node. Based on that information, the coordinating node determines which nodes it needs to query. And in the second phase, the fetch phase, it sends the request to the shards that have some data for that search query and returns the results to the client. Now, in terms of how the Elasticsearch code base is structured, this is a snapshot from the GitHub code base of Elasticsearch, from the public code base. What I'm speaking about in this presentation applies to the public code base of Elasticsearch, because as of version 7.16 there was a licensing change, and there is a lot of controversy in the open source communities over whether Elasticsearch is still open source or not. We can have a discussion about that after the session. I'm not going to go into the details, but the main point of this licensing change is to protect Elasticsearch from other vendors willing to provide Elasticsearch as a service, not from people willing to customize Elasticsearch or to use it for their in-house projects and so on. So this is the structure of the Elasticsearch code base, which has been like this since the Apache-licensed code base. Elasticsearch gets built with GitHub Actions. You can also see the definition in the .github folder. The main server application is in the server folder. The documentation that gets generated for the official Elasticsearch website is in the docs folder. We have the main modules for the Elasticsearch server application in the modules folder and the internal plugins in the plugins folder.
An implementation of the REST-based Java client for Elasticsearch, the high-level and the low-level REST clients, is in the client folder, and in the distribution folder you can find the Gradle scripts that allow you to build different distributions of Elasticsearch for Linux, Windows, and so on. Now, I would say the structure of the code repository is very logical. It's easy to navigate. So if you need to see, for example, how a particular plugin or module is implemented, you can just go to GitHub and check it out. Now, internally Elasticsearch is composed of different modules. In earlier versions, Elasticsearch used a modified version of Google Guice for module binding, but they're slowly shifting away from Google Guice in favor of their own internal module system. Modules are loaded on startup, when the Elasticsearch server starts up. And in this simple example, I've shown how modules were bound internally when the node starts up. We use a module binder. In the earlier versions, this was a Google Guice binder. And then we bind particular module classes to their implementation. And then, wherever you need them, you can reference them in the Elasticsearch code base. It's a very simple dependency injection mechanism. Now, when Elasticsearch starts up, you can imagine it's a simple Java application. The main class is org.elasticsearch.bootstrap.Elasticsearch. It boils down to calling the start method of the Node class. And the start method, in fact, loads up all the modules of the Elasticsearch node. Some of these core modules are, for example, the module that provides the REST API of Elasticsearch, and the module that allows you to establish clustering in Elasticsearch, the so-called transport module. There is a module that allows you to build plugins for Elasticsearch, and so on and so forth. Now, how does Elasticsearch internally interact with Lucene?
When you start up the node, the node also provides different services that are used by the modules of Elasticsearch. And, for example, when you start up a node, there is a createShard method that gets called, IndexService.createShard, to create and initialize the shard that is part of this Elasticsearch node. And then, if you want to index a new document, it boils down to calling IndexShard.applyIndexOperationOnPrimary. Then, this boils down to calling the index method on the IndexShard class. And the IndexShard class goes down to the InternalEngine class, which calls indexIntoLucene. Then, that calls InternalEngine.addDocs. And at the end, we just call addDocuments on IndexWriter, which is a class from the Apache Lucene library. So, it boils down to calling different methods from the Lucene API. And on top of that, we have a lot of initialization and services happening. So, in a way, you can think that, apart from all the functionality that Elasticsearch provides, the integration with the Apache Lucene library just boils down to calling the different APIs that Apache Lucene provides. And last but not least, I'll show how you can build a very simple Elasticsearch plugin. Now, if you look at the Elasticsearch code base, it already has some built-in plugins that you can use. And there is a very nice elasticsearch-plugin utility that you can use to manage plugins: to install them, remove them, and so on and so forth. If you build your own plugin, you can use the same utility to install the plugin, and it gets placed in a folder in your node installation. If you install a plugin, you need to make sure that it's installed on all the nodes in your cluster. Because many plugins are cluster-aware, a plugin needs to be installed on every node in the cluster. Elasticsearch plugins are bundled in ZIP archives, along with their dependencies, and all of them must have a class that extends the org.elasticsearch.plugins.Plugin class.
There is a plugin service which is responsible for loading the plugins in Elasticsearch. Now, let's see how we can create a very simple ingest plugin that allows you to filter words from a field of an indexed document. So, when you index a document, you can specify which words you want to filter out from which field. This is a very common scenario, for example, if you want to implement something that allows you to clean content from documents, and so on and so forth. It's probably one of the simplest plugins you might have. First, we have a filter ingest plugin class that extends the Plugin class and implements IngestPlugin. We have different interfaces for the different types of plugins you might have for Elasticsearch, and IngestPlugin is one of these interfaces. Then you implement the getProcessors method, because an ingest plugin needs to have processors that do something to the documents before they're indexed. And in the getProcessors method, what we do is get a filter word from the parameters that we supply on the ingest processor that we define in Elasticsearch. And then we get the filter field. So, we have two parameters: the word that we want to filter out, and the field of the document from which we want to filter it out. Then we create a map of processors, and in that map we put the filter word processor that we create from this class, and return it. You can also have multiple processors defined in that plugin. Now, what does the filter word processor look like? The filter word processor extends AbstractProcessor from Elasticsearch. It, again, comes from the core classes of Elasticsearch. And we have an execute method. In the execute method, we get the document that we want to index. This is the IngestDocument. We get the value from the particular field that we want to filter, and then we replace the word in that value with the empty string. And then we set the value back on that field and return the document.
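The processor logic just described can be sketched outside Elasticsearch. The block below is a plain Python stand-in and deliberately not the Java plugin API: the class and function names mirror the talk's description, while the `filter_word` type name, the `field` and `word` configuration keys, and the sample document are hypothetical.

```python
class FilterWordProcessor:
    """Removes one configured word from one configured field of a document."""

    def __init__(self, field, word):
        self.field = field
        self.word = word

    def execute(self, document):
        value = document.get(self.field)
        if isinstance(value, str):
            # Replace the word with the empty string, then tidy the whitespace.
            # Note: this naive replace also matches the word inside longer words.
            cleaned = " ".join(value.replace(self.word, "").split())
            document[self.field] = cleaned
        return document


def get_processors():
    """Stand-in for the plugin's processor map: type name -> factory."""
    return {
        "filter_word": lambda config: FilterWordProcessor(config["field"], config["word"]),
    }


# Build a processor from its two parameters and run it on a sample document.
processor = get_processors()["filter_word"]({"field": "description", "word": "crap"})
doc = processor.execute({"description": "this crap product works", "price": 5})
print(doc)
```

In the real plugin, the same two pieces exist as Java classes: the plugin class that returns the processor map, and the processor whose execute method rewrites the field.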
When you index a document and you specify that ingest processor, it applies the filtering to that document before it gets indexed into Elasticsearch. Now, besides those two classes, if you want to build a plugin, you also need to supply some simple plugin metadata, then build it, for example, with Maven or with Gradle, and then you can install it with the elasticsearch-plugin utility. And in that manner, you can build any plugin you would like for Elasticsearch. And since we are running out of time, I'm not sure if we have some time for one or two questions, maybe. Do we have time for that? Yes, of course. Okay, so if anybody? Yeah? Hey, thanks for your insights. We saw how too many cats can go out and fall into the pool. Yeah. Yes? I was curious, how does one know how many shards are going to be needed? Well, I would say it depends on an upfront estimation of how much data you expect to put in that index. So we need to do an upfront estimate: okay, in the first phase of my project, how many, let's say, gigabytes of data would I have? And based on that, you determine the initial number of shards, and if those shards still hold a lot of data, then you consider partitioning the index. It's a matter of upfront planning to determine that. Okay? Yeah? What is the structure used to store indexes and data? It's an inverted index. This is the data structure. Yeah? Inverted index. Inverted index. It's called an inverted index because it's a map between terms and documents: for each term you have a pointer to the documents that contain that term. So it's called an inverted index. |
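Because an inverted index maps each term to the documents that contain it, it also yields the term statistics that the relevance scoring mentioned earlier in the talk relies on: how often a term occurs in a document and in how many documents it occurs overall. The sketch below scores documents with the textbook BM25 formula, using the commonly cited defaults k1 = 1.2 and b = 0.75. The whitespace tokenizer and the sample documents are assumptions, and the result is not guaranteed to match the exact numbers Lucene produces.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Rank documents for a query with the textbook BM25 formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(terms) for terms in tokenized.values()) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for terms in tokenized.values() for term in set(terms))

    scores = {}
    for doc_id, terms in tokenized.items():
        tf = Counter(terms)  # term frequency inside this document
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample documents.
docs = {
    1: "lucene lucene search library",
    2: "search engine built on lucene",
    3: "dashboards for management",
}
ranked = bm25_scores("lucene", docs)
print([doc_id for doc_id, _ in ranked])
```

The document that mentions the term more often, and is shorter, ranks first; a document that does not contain the term scores zero.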
Securing Your Software Supply Chain One Open Source Project at a Time |
So, thank you all for coming. This is Securing Your Software Supply Chain One Open Source Project at a Time. My name is Lori Lorusso, that's a picture of me and my kiddo. I am the open source program manager at JFrog, I don't know if you guys have heard of us, I'm sure you have. And I'm also the marketing outreach committee chair of the Continuous Delivery Foundation. Feel free to find me on Twitter or on LinkedIn. Hi everyone, thanks for being here. My name is Fatih Degirmenci and I work at the Linux Foundation as the executive director of the Continuous Delivery Foundation. And that's my photo with my son, because she put her daughter, so I had to put my son. So, thanks again for being here. Okay, so I know you guys are hardcore Java developers, so I'm surprised that you're in this room and I appreciate you staying for our talk. So why is supply chain security important? Let's just do the math real quick. 99% of all software is developed with open source components. And of that, at least 85% of enterprise products are created using open source components. So any sort of work we do to secure the supply chain is going to have a tremendous impact, not only on your own projects but on the enterprise in general. So I have a quick question for you. Oh, no I don't, just kidding. So what happens when you change slides first thing? So supply chain security, right? Like, the shift left has happened, you guys are all working on securing your code before you go to production, everything is good to go, right? Maybe or maybe not. So now I've got my question. How many of you know what percentage increase in software supply chain attacks happened in the last year? Anyone, throw out a number. How much? 650 was in 2021, 2022 number. So we're already at 650. Anybody? Anybody? Okay. 742%. There was a 742% increase in software supply chain attacks in 2022. That's kind of a big deal, right?
That's kind of something that we should all look at and say, hmm, what can I do to help decrease this number and make sure that my projects are running smoothly, my work is running smoothly, I'm efficient. So when we do the numbers: 99% of software has open source components. 85% of enterprise software is built using open source components. And there was a 742% increase in software supply chain attacks in the last year. Okay. I think now we're all listening and thinking, yeah, this is probably something we should look at. So in the last 10 years, we've had six massive attacks. I'm sure you're all familiar with SolarWinds, maybe British Airways: the amount of money that these corporations lost, the amount of productivity, the amount of time, the effort that the developers had to put in to fix things that were broken from the beginning, because maybe somebody didn't maintain a project, because, you know, maintainers, nobody gets paid, right? Like, things happen, you move on. And so in the last 10 years, we had six massive attacks. And I'm sure if I say Log4j, everyone's going to shrug. That wasn't an attack, but that was a discovery that could have been significantly crazy. So what are the three types of supply chain security attacks out there? I think we can all read. There are known vulnerabilities, and unknown vulnerabilities like zero-day attacks. That's where Log4j came in. And then non-code issues: human error, people attacking. Hackers are getting smarter, they're looking at things in a different way, and they're going in and trying to cause disruptions. So this is a very American reference, because we all have cars. How many of you have turned your ignition and your car didn't start? So you had to replace your spark plug or your alternator. But if you got in your car and it started, maybe you wouldn't know that your tail light was out, because it's your tail light and you're just like, I can go.
So that's the genius of a supply chain attack. They attack the parts that are built in between, the things that you don't notice. They're not denting your car, they're not breaking a window, they're not breaking your alternator, they're going in from the inside and busting things out. So this might be a slightly outdated reference, but winter is here and it's time to really take this seriously, and Fatih is going to talk about some open source projects that are helping to secure the supply chain. Thanks, Lori. So things look a bit, how to say, risky or kind of demoralizing, because of all these attacks happening: new vulnerabilities are disclosed, some of those vulnerabilities are currently being exploited, and we are probably impacted by them one way or the other. And if we think about 2020, 2021, people started talking about these things, like software supply chain security and so on, and then governments started making some noise, and I'm sure most of you heard of this famous White House executive order, 2021, May, June or something. And then other countries, other governments followed suit: India released something similar, and the EU had this cyber security act and everything. And the governments, the regulators, they are putting a lot of focus on this topic. But as you know, with regulations, it takes a while for them to become real: it goes through all these processes, politicians are involved, and things are a bit slow. Because of that, the communities we are involved in, or the projects we are developing, or the contributions we are making, we all have a big chance to impact and improve the situation even before those regulations and laws become real. And as Lori mentioned, some of the communities are actively working on improving the state of the software supply chain.
And I took just four of those communities for this slide, and they are kind of different from each other, and they were put on this slide on purpose. The CD Foundation is where we are contributing, and as a foundation and with our projects, we mainly revolve around the continuous integration and continuous delivery ecosystem and projects. And some of our projects are very famous: Jenkins, they have a table here, Jenkins is under the CD Foundation, and Spinnaker, another CD orchestrator tool that is under the CD Foundation. And we try to contribute to these efforts from our perspective, because when you look at the, how to say, scope of the supply chain, it is vast, and no one foundation, no one project can fix those issues. And we, as practitioners from the continuous integration and continuous delivery ecosystem, do our bit to contribute to those efforts from the continuous delivery perspective. Because in the attacks Lori summarized, people are actually abusing the pipelines themselves: they are hijacking pipelines, which are kind of the route to production. And that's why within our community we are contributing to those efforts. OpenSSF is another foundation under the Linux Foundation that works on security best practices, standards, and so on. And they are very much focused on supply chain security itself. So CDF looks at it from the CD perspective, and OpenSSF looks at the supply chain as a whole. And OWASP is another project, one that's not under the Linux Foundation, and it works on improving the security of open source. And CNCF, which I'm sure most of you have heard of, Kubernetes and other cloud native projects: even though security is not their primary concern, they are doing lots of work around security by publishing white papers or releasing best practices for cloud native. They actually had another event, I think it was this week, called CloudNativeSecurityCon. So these foundations and communities are all different, but they are all trying to contribute.
As Java developers, as people contributing to the Java ecosystem, I think we can also try to contribute to these efforts and make sure that what we are developing and making available for people to use is as secure as possible, so they don't face problems. Again, we took examples from these different communities to show what they are doing. This is not an extensive list, just a snapshot; there are lots of other initiatives going on. If you look at OWASP, for example, they recently released a report, the Top 10 CI/CD Security Risks. I'm sure there are people among us who are doing CI/CD for their communities, projects, or organizations, and we make some of these mistakes without knowing it. There are people out there who exploit those issues, hijacking pipelines and injecting malicious code, so that instead of the real artifact, some bad artifact flows through our pipelines and goes to production, impacting end users. So this is one side, the CI/CD side. Another project is Supply-chain Levels for Software Artifacts, SLSA, pronounced "salsa". It is under OpenSSF, the Open Source Security Foundation. It looks at things from an end-to-end perspective: instead of focusing only on secure development practices for your source code, or only on signing your artifact, SLSA looks at the entire supply chain. It is a framework that gives projects, organizations, and communities different levels of security. It has four levels: you start with very basic build automation and documenting your build process, and you go up to the fourth level, which is the most secure and, I would say, increases confidence in your supply chain.
The practices documented by SLSA are pretty good, and many communities have already started adapting their community pipelines or production systems to adhere to the SLSA levels. Another project, again under the Open Source Security Foundation, is Sigstore. This is about signing artifacts. As some of you know, Maven requires PGP signatures, and Sigstore is similar to PGP, but it's keyless signing. It is a collection of open source projects, like Fulcio and Rekor and so on, that make it easier for developers, maintainers, and package managers to build and distribute artifacts in a secure manner. So when you publish your artifact, it's signed, and people can verify that signature to make sure they are getting the artifact they expect, and not some weird random artifact that they then use as a dependency. This project is actually pretty interesting, because when we think about open source, we want many people to use our projects and many other tools to adopt our libraries. Sigstore is showing great adoption: many other open source projects are adopting Sigstore for signing their release artifacts, for example. This is another snapshot I took from the Sigstore landscape. All of these are open source projects, located under different communities and different from each other, and they have started signing their artifacts. And this is Sigstore with Maven and Gradle. I saw a blog post yesterday about the status of Sigstore with Maven and Gradle. I believe the Apache Maven project has started working with Sigstore to adopt it for signing artifacts. They have a two-step plan and are currently working on the first step, and the expectation is that they will adopt Sigstore in the coming months. I don't know when, but the work is happening.
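The sign-then-verify idea described above can be sketched with nothing but the JDK. This is an illustration under stated assumptions, not how Sigstore works: Sigstore is keyless, while this sketch uses a classic long-lived key pair, and the class name `ArtifactSigning` and the sample artifact bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Illustrative sketch only: classic key-pair signing with the JDK,
// NOT Sigstore's keyless flow. All names here are hypothetical.
final class ArtifactSigning {

    // Creates a throwaway EC key pair standing in for the publisher's key.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(256);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("EC keys unavailable", e);
        }
    }

    // Publisher side: sign the artifact bytes with the private key.
    static byte[] sign(byte[] artifact, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(privateKey);
            signer.update(artifact);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Signing failed", e);
        }
    }

    // Consumer side: true only if the signature matches these exact bytes.
    static boolean verify(byte[] artifact, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(artifact);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed signature or unusable key: treat as unverified
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] artifact = "contents of my-library-1.0.jar".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(artifact, keys.getPrivate());
        byte[] tampered = "contents of my-library-1.0.jar plus malware".getBytes(StandardCharsets.UTF_8);

        System.out.println("genuine verifies:  " + verify(artifact, signature, keys.getPublic()));
        System.out.println("tampered verifies: " + verify(tampered, signature, keys.getPublic()));
    }
}
```

The consumer only needs the artifact, the signature, and the publisher's public key; any change to the artifact bytes makes verification fail. What the sketch leaves out is exactly what keyless signing is meant to address: how the publisher's key is managed and how the consumer learns to trust it.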
And Gradle, a build tool which you probably know, has also started looking at Sigstore for signing artifacts. If you go back to the title of our presentation, securing the open source supply chain one project at a time: CI/CD is a big area, and maybe I'm not a CI/CD person, I'm a developer, and I can't secure my software as well as some other people, because they may know CI/CD better than me. But one thing I can do, if I am developing a library and releasing artifacts, is at least sign that artifact, so that when people pull it as a dependency during their build process, they can check it to make sure they are getting the right one, without tampering. Another project I want to highlight is Tekton Chains. Tekton is a project like Jenkins, but it is a Kubernetes-native continuous delivery framework, and it has multiple projects underneath: Tekton Chains, Tekton Dashboard, Tekton Pipelines, and so on. The reason I want to highlight Chains is that, as a CI/CD practitioner, as I noted, there are lots of risks within our pipelines. People might hijack them, and we must make sure that the pipelines, the phases in our pipelines, and the tasks within our phases are only doing what they are supposed to do, nothing but what we tell them to do. What Tekton Chains comes up with is a kind of observability: it takes observability and applies it to pipelines. Whenever Tekton runs a task, Chains takes a snapshot of that task, records whatever happened during it, signs it, and stores it, so you can go back and audit your tasks. If you face security issues, you can check which task actually caused the problem and then make sure that task is fixed. So observability of pipelines will be an important thing going forward. And the last project I want to mention is the youngest project, the latest one that joined the CD Foundation: Pyrsia.
Again, we distribute packages and we hope that they are safe and secure and that people use them with confidence, but it is sometimes difficult, and it takes some effort to verify that those packages are built by the right people, community, or organization and that they are not tampered with. Pyrsia comes with Bitcoin... not Bitcoin, no. Blockchain? Blockchain, yeah, I mixed those two up. It uses blockchain-based technology, and it is a peer-to-peer package distribution network. Multiple nodes can build the same artifact, and they have an algorithm that checks whether the artifact is the same across all those distributed nodes; then the package can be consumed by users safely and securely. So these were five or six different initiatives, and there are tens or maybe even hundreds of other projects and initiatives. I think we as developers have a responsibility to our users. Obviously, as Lori mentioned, some of us are not getting paid for this and are doing it for fun, but someone is using our project, so we are still responsible. We make something available for people to consume, and if they face issues because we did not put some of these simple things in place, then we may feel responsible, so let's do our bit to improve each and every project we are involved in. So I just want to close with all of the foundations that we mentioned today. Each one has projects, each one has SIGs, each one has working groups. They are all alive and well in terms of communication, whether on Slack, Discord, or mailing lists. And if you look at the corporate sponsors of these foundations, you might be surprised that your corporation actually sponsors one of them, and with that come tons of benefits that you are eligible for but might not be receiving.
There are always trainings, there are webinars, there are opportunities for white papers and blog posts, and, like Fatih was saying, there are best practices in each of these, depending on what area you are interested in. So I would encourage you to go to each of them. There are tons of other foundations out there, but these are some core ones that are really working on supply chain security from multiple angles. I think it is something that, as Java developers, you should look at to see what other people are doing and how you can improve and move forward as well. We all know Java is not going anywhere, so it is time to see what else you can do to help yourself on the front end, so that on the back end you are not paying millions and millions of dollars in lost hours, wages, all that good stuff. So, any questions? And thank you very much. |
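The cross-node check mentioned for Pyrsia in this talk (several nodes build the same artifact, and the results are compared before anyone consumes it) can be pictured with a small sketch. The talk does not describe Pyrsia's actual algorithm, so everything below is an assumption made for illustration: the class name `BuildAgreement` is hypothetical, the builds are assumed to be reproducible byte for byte, and the rule used is the simplest possible one, namely that every node must produce the same SHA-256 digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch: accept an artifact only if independent builds agree.
// This is NOT Pyrsia's real consensus algorithm, just the underlying idea.
final class BuildAgreement {

    // Computes the SHA-256 digest of one build output.
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every JVM", e);
        }
    }

    // True only if at least one build exists and all builds share one digest.
    static boolean allBuildsAgree(List<byte[]> buildsFromNodes) {
        if (buildsFromNodes.isEmpty()) {
            return false; // nothing was built, so nothing can be trusted
        }
        byte[] reference = sha256(buildsFromNodes.get(0));
        for (byte[] build : buildsFromNodes) {
            if (!MessageDigest.isEqual(reference, sha256(build))) {
                return false; // one node produced different bytes
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] good = "artifact bytes".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "artifact bytes with a backdoor".getBytes(StandardCharsets.UTF_8);

        System.out.println("three honest nodes agree: " + allBuildsAgree(List.of(good, good, good)));
        System.out.println("one tampered node agrees: " + allBuildsAgree(List.of(good, bad, good)));
    }
}
```

A real network would also have to decide what to do when nodes disagree and how many nodes are enough; the sketch only shows why a single compromised build machine is no longer sufficient to ship a bad artifact.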
What I Miss In Java (The Perspectives Of A Kotlin Developer) |
Thanks for being here. Thanks for missing your lunch. And let me rant about Kotlin. I am Nicolas Frankel. I've been a developer for more than 20 years now. Perhaps I'm not a developer anymore; I'm a developer advocate, but I still see myself as a developer. I love developing. Right now I work on the Apache APISIX gateway. It has nothing to do with Java; it's just an infrastructure component. But since they allow me to be here, here is the slide. Anyway, why am I talking about Kotlin in a Java room? Well, they didn't want me to talk about Kotlin in the Kotlin room, so that might be one of the reasons. Also, as I mentioned, I have two decades of coding experience, and those two decades were spent in Java. I started with Java 1.3, 1.4... 1.3 perhaps. And, well, there were big wins and small losses. I tried Scala, and I didn't like it at all. I still have the certificate saying, hey, you are Scala certified. I didn't like it for multiple reasons. Then I went to a couple of conferences. I had a friend, and she gave all sorts of talks about Kotlin. And I said, we don't need Kotlin. We have Java. It's enough. Then I wanted to teach myself Android development, and Android development at the time was only in Java, and the API was super low level. And I said, no, I cannot write such code; it brings me back 15 years, to when I was writing Java on the server side. So I looked for solutions, and I found Kotlin. And I thought, wow, that's cool. And I stopped learning Android, and now I write Kotlin on the back end. So just a disclaimer: if Java is the best language in the world for you, just leave the room. I'm not trying to bash Java, but, depending on your culture, I'm pretty straightforward, and you might feel offended. So better leave now. It's up to you. OK, let's start slow. Immutable references. I'm telling you, hey, it's better in Kotlin. And you can tell me, hey, Java has it, right? Immutable references.
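The code shown on screen during the next part is not captured in the transcript. As a rough stand-in, here is a minimal Java sketch of what "final everywhere" can look like, with `final` on a field, a parameter, and a local variable. The class and method names are hypothetical and not taken from the speaker's slides.

```java
// Hypothetical illustration of "final everywhere" in Java.
final class ImmutableReferences {

    // final field: must be assigned exactly once, in the constructor.
    private final String name;

    ImmutableReferences(final String name) {
        this.name = name;
    }

    // final parameter and final local: neither can be reassigned.
    String greet(final String greeting) {
        final String message = greeting + ", " + name;
        // greeting = "Bye";     // would not compile: cannot assign a final parameter
        // message = "changed";  // would not compile: cannot assign a final local
        return message;
    }

    // Without final, reassigning a parameter is legal Java and easy to miss.
    static String withoutFinal(String greeting) {
        greeting = "overwritten"; // compiles fine, the caller's argument is ignored
        return greeting;
    }

    public static void main(String[] args) {
        System.out.println(new ImmutableReferences("FOSDEM").greet("Hello"));
        System.out.println(withoutFinal("Hello"));
    }
}
```

The three `final` keywords are the visual noise the speaker complains about, and `withoutFinal` shows the parameter reassignment that Java permits by default.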
OK, let's check how it works. Immutable references in Java. Oh, fun stuff. So here, this is my immutable reference class. I have to put final here. I have to put final here. And I have to put final here. Now I have immutable references. Who puts final everywhere? A couple of people. No, really. We had this idea 15 years ago, and then we decided, no, it just makes the reading so much harder. So perhaps not. Even worse, if you don't put final on the parameter, you can actually reassign it, which I believe is one of the worst things you can do. Like really, really, really bad. So I'm not saying that Java doesn't have it, but I'm saying that in Kotlin, from the beginning, you need to decide whether it's a val, which means it cannot be reassigned, or a var, which means it can be reassigned. If you are using IntelliJ (I don't know about Eclipse or NetBeans), the good thing is that if I say it's a var, so it can be reassigned, it tells me, hey, there is something fishy. It's not an error per se, but at least it's visually pleasing. And, of course... yeah, for those who don't know, Any is the equivalent of Object, so it's not very important; I could write Object. By default, you cannot reassign parameters. Any language that allows you to reassign parameters should be treated with the utmost caution. It's really not a great design idea. Of course, Java was designed a long time ago and doesn't have this, but Kotlin has learned the lesson. And if you think that val and var come from Scala, because you are a Scala fanboy, that's completely true. Kotlin has stolen every good idea. So don't pretend otherwise. That's fine. OK. Second, immutable classes. Well, that's fine. Now we have Java records. We've got them. That's fine. Let's continue. Null safety. Null safety in Java is not that fun, right? How many ways do you have to implement null safety in Java? That's a good thing.
Yeah, that's a good thing. Diversity is a good thing, because we work in IT. And if you really want to have fun, you might check: Nullable, Nullable, Nullable, Nullable, Nullable, Nullable, Nullable, Nullable. Oh, NotNull. Well, I'm sorry. Sorry? Yeah, exactly. I'm not sure it's an error; I think I just copy-pasted. So, I'm not sure. Yeah, that's really fun stuff. And, of course, they won't work with one another. So you need to have the preprocessor... sorry, the compile-time processor. And you need to choose which library you will be using. And then you hope that somehow it will work. Good. In Kotlin, what do we have? Well, it's baked into the language. Here I was too lazy to write it in Java. But basically, if you write something in Kotlin, you have additional types. For every normal type, this is a non-nullable type, and this says it's a nullable type. It means that if you are calling something on a non-nullable type, you can call whatever you want, plus or whatever. And if you call something on a nullable type, Kotlin will say... oh, it was saying something, and now it doesn't say it. Yes, because plus is smart. Plus knows how to operate on nullable types. But let's do something that is not safe. Which is... it's also very smart. The library is too smart for me. Empty. Reversed? Yeah, finally, thanks. And it tells you, hey, you know, this might be null, so please don't call it like this, because there is a chance you might encounter a null pointer exception at runtime. So you should take care. And afterwards, it's quite easy. You can do, OK, this stuff. I mean, if you have been doing Groovy, or, I think, Scala does it too. But the compiler tells you that you should be careful about this. And, again, it's baked into the language. So for every type, there are two real types: one that is nullable, and the other that is not nullable.
The good thing, I will show you afterwards, is that you can write extension functions that work on nullable types, which is really, really crazy. Good. So, the second thing that is better: the utils classes. Who has never written a single utils class in their life? Nobody. So even people younger than 30 have written them. In general, there might be a divide because, yeah, older developers have written them, and the younger ones are smart enough to use a library. But the thing is, well, at our age there was no library. So basically, we say that Java is an object-oriented language, and then we put everything in a class, we put static methods in a class, and we pretend that it's object-oriented. Right? Yes. Well, let's not pretend it's object-oriented. If there is no object, it's not object-oriented: a class with static methods is not object-oriented. So here I have created my amazing StringUtils class. And, of course, I need to remember that the users of my class, well, they might instantiate it. So I will just remove the constructor, make it private, again very object-oriented, and then I create this capitalize method, and I do whatever I want, and then I can call the capitalize method. Good. Scala and Kotlin have, I think, similar stuff. Let's not pretend we are an object-oriented language. We can just add methods and state, but mostly methods, to existing classes. That's crazy. Yeah. Of course, at the bytecode level, it boils down to a static method. That's not the problem. The problem is the user experience, the developer experience. We can see now that what we are really doing here is object-oriented development. So through this functional stuff, because here we create a function that is at the root level, we are able to write better object-oriented code, which is mind-blowing.
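For readers without the slides, the Java side of this comparison can be sketched as follows. It is a minimal guess at the kind of utils class being described (a private constructor plus a static `capitalize` method), not the speaker's actual code; the null and empty-string handling is an assumption.

```java
// Hypothetical sketch of a classic Java "utils" class.
final class StringUtils {

    // Private constructor: nobody can instantiate a class that only holds static methods.
    private StringUtils() {
    }

    // Upper-cases the first character; null and empty input are returned unchanged.
    static String capitalize(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        // The call site reads as "function applied to a string", not "method on a string".
        System.out.println(StringUtils.capitalize("kotlin"));
    }
}
```

The call `StringUtils.capitalize(name)` is what the talk contrasts with an extension function, where the same static method is invoked as if it were a member of `String`.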
Even better, as I mentioned, we can say, hey, this only applies... so here we can have an f of type String that might be null, so this is a nullable type, and here we cannot say f.capitalize because it might be null, right? It's a nullable type. Here it only applies to real strings, but we can do something like this. We can say, hey, the receiver might be a nullable type, and we can check: if this equals null, then we return a default, which might be an empty string. Yeah, I see faces like, what the hell? And in the end, that's a static method, but how you call it is object-oriented, which, in my opinion, makes Kotlin much more object-oriented than Java will ever be. Well, will ever be... no, I'm not sure. Let's see what we have in the future. So, that's already good stuff. And then, oof, reified generics, right? Who has already been bitten by the lack of reified generics? Yeah, so I have a collection of Thingy and a collection of Foo, and, well, at runtime you have nothing. So the trick when you do Java is to pass the class. Here is an example taken from Spring: when you get a bean, you say, hey, I want a bean of class whatever, and then you will get the whatever. OK? How can we do it in Kotlin? Kotlin has this reified T. When you call getBean, you can pass the type that you want, and it will get it for you, and it's a String. If we are a bit tricky, we can do this, and it's a list of strings. And it's all about compiling, because in the end, of course, the bytecode is still the same. The bytecode must be compatible with Java bytecode. There are no reified generics in the bytecode. So it's just about compiling. And here we can either set the type in the call, or we declare that s is of type String, and we will get a String. And I think that's pretty amazing. The only thing that we need to do is this: we need to tell it's reified. Sorry. Again, we need to tell it's reified, and for reified, you need to have inline. Why?
Because, as I mentioned, there is no trick. At compile time, it will actually replace the code. It won't be a call; it will just be copy-pasted. And so it knows which type you have. And in the end, since I still have a bit of time, we can do really, really fun stuff. I will... yes, fun stuff. OK. I will create a function beans, and for the moment, I will return Any. OK? So, some syntactic sugar. I don't think it's really important; I don't think it's what makes me want to use Kotlin. I don't think you need to use it. Either you return the type like this, or here any idiot can understand it returns Any. So the Kotlin compiler is not an idiot, but the Java compiler is. It doesn't make any sense to specify the type explicitly every time, but still the Java compiler requires you to do it. It makes no sense. OK. Sure, sure. If for whatever reason you want to specify what it is, because it might be complex (but then if it's complex, perhaps it doesn't belong in a one-liner), then you can still do so. Especially if you return a concrete implementation and you want your signature to be an interface. That's perfect. OK. And now I will have a class which I will call, let's say, BeansDsl. OK. Here I want this to return BeansDsl. OK. So now I can write something like this. I can have a main function. main, main, OK. Private static void main, OK. And I can call the beans function. Great. Nothing mind-blowing. Now what I want is to write something like this. OK. I will just use the compiler, OK. So here I accept a parameter that takes nothing and returns Unit. Normally I would write something like this. But in Kotlin, if the last argument is a lambda, you can move it outside the parentheses. So that's what I did before. And then, if there are no other arguments and there is a lambda, I can remove the parentheses. Good.
Now I can add the bean method inside. And I can say that I can actually call the bean method on the BeansDsl. So here I can do something like this. And what is it telling me? Unresolved method bean. So I still have an issue. Yes. Little trick. Yeah. Sorry, that's completely live coding. I thought I would be less fast. OK. And now we can say that the bean method is generic, right? So we can say it accepts a type T. And I don't remember how it's written, so I will be doing my stupid stuff. So this is the inline function. So here: inline fun, reified T. Here, and it returns a T. And if you continue like this, you can have this kind of stuff. So here this is the Spring Boot Kotlin DSL. You say this will create beans, and then you can define them either through a lambda or directly. And through the reified stuff, here you see the product handler actually requires two dependencies. And at compile time, it knows that it requires, I don't know, a Foo and a Bar. Because those ref methods use reified generics, it knows it needs to call the ref method to get a Foo and a Bar, and it will inject them. So here you have the magic at compile time, and at runtime it's the usual Spring Boot stuff. And I believe it makes my code much easier to read. Of course, you need to understand the trick. As always, you need to be very explicit in the beginning because you lack a lot of context. When you have the context, then it makes your stuff much, much easier. And that's all. I don't want to bore you with more details. You can follow me on Twitter. You can follow me on Mastodon because, well, you don't know what will happen to Twitter. And though the talk was not about Apache APISIX, you can check out Apache APISIX, which makes my job so much easier, so that I can come back here to talk about unrelated stuff. Is there any time for questions? There is. There is. Thanks. |
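For reference, the Java workaround that this talk contrasts with reified type parameters ("the trick when you do Java is to pass the class") can be sketched like this. `BeanRegistry` is a hypothetical toy container written for this illustration; it is not Spring's implementation, although Spring's `BeanFactory` does offer a `getBean(Class<T>)` method with the same shape.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy container illustrating the class-token idiom in Java.
final class BeanRegistry {

    private final Map<Class<?>, Object> beans = new HashMap<>();

    // The Class<T> argument carries the type that generics lose at runtime.
    <T> void register(Class<T> type, T bean) {
        beans.put(type, bean);
    }

    // The caller passes the class again; type.cast gives a checked, typed result.
    <T> T getBean(Class<T> type) {
        Object bean = beans.get(type);
        if (bean == null) {
            throw new IllegalArgumentException("No bean of type " + type.getName());
        }
        return type.cast(bean);
    }

    public static void main(String[] args) {
        BeanRegistry registry = new BeanRegistry();
        registry.register(String.class, "hello");

        String greeting = registry.getBean(String.class); // the type is stated twice
        System.out.println(greeting);
    }
}
```

A plain class token also cannot express a parameterized type: there is no `List<String>.class` in Java, which is one reason the "list of strings" case in the talk is awkward without extra machinery.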
Update on #JavaOnRaspberryPi and Pi4J |
Thanks for joining during your lunch. I have some pie for you, some Raspberry Pis with a little bit of Java coffee. So let's jump in. Normally there was a session about Kotlin planned now, so I have a little piece of Kotlin in this presentation. I'm not a Kotlin developer myself. What is the Raspberry Pi, if you don't know it? This amazing small board. It's really small. This is the Raspberry Pi Zero. And yes, indeed, that's 15 euros. This is a full PC, a full Linux PC, on which you can run Java and JavaFX. Anything that you do as a Java developer, you can also do on this small device. Of course, it's not so powerful, but it still allows you to do a lot of experiments. And what is special about the Raspberry Pi is those pins where you can connect electronic components. That's what I'm going to talk to you about. I see I have some missing images. That's a good start. There was also the Raspberry Pi Pico, launched a few years ago. That's actually a microcontroller. If you ever played with Arduino, the Raspberry Pi Pico is more comparable to that, so we cannot run Java on that one. So what is a Raspberry Pi? It's a single-board computer. You can run a lot of different Linux distributions on it. I mostly start with Raspberry Pi OS, which is the official one. But you also have gaming operating systems, NAS systems; anything you can think of does exist. There is a website, Awesome Raspberry Pi, where you will find all of these. There are many versions, 32-bit or 64-bit, which can be interesting if you want to do some specific experiments. And they make 400,000 Raspberry Pis a month. And still you cannot find them. Because of the chip shortage, these 400,000 are not enough. They do reserve a lot of them for industrial use. So as a consumer, a maker, you have to hunt for them. rpilocator.com is a website which polls a lot of websites that sell Raspberry Pis, and it lists them.
And if you follow them on Twitter or on Mastodon, you will get an alert when a certain type becomes available. I have been speaking at FOSDEM virtually, thanks to Foojay, in the last two years. In 2021, I spoke about how I got into Java on Raspberry Pi. I started doing some personal projects. I wanted to have a touchscreen control for my son's drum booth. I wanted to use JavaFX, and I was missing a lot of documentation, so I wrote about that. And I ended up writing a book even before building that thing for my son. Then afterwards, finally, he got his controller. In the book, and also in the FOSDEM talk, I explained how you get started with this and how you can use Pi4J. Pi4J is a library (more about that later) to help you as a Java developer. And I also gave some examples of running JavaFX on a Raspberry Pi. So 2021 was my explanation of how I got started with Java on Raspberry Pi, which has been my niche pet project ever since. And last year, I was here again because new Raspberry Pi boards had been launched, and we had to make some changes in Pi4J because it was not compatible anymore. So, in 2021, we launched version 2 of Pi4J, which is more compatible with the newer boards, uses Java 11 under the hood, and allows you to do a lot of fun stuff. I'll give you some examples. I also gave an example of CrowPi OS. CrowPi OS is an operating system based on the official Raspberry Pi operating system. FHNW is a university in Switzerland. They have a lot of courses where they use Java, Raspberry Pi, electronics, all kinds of stuff, and they contribute a lot back to the Pi4J project. They made an operating system with some additional tools for Java developers: the latest JavaFX is there, the latest Java is there. On the background of your desktop, you see the IP address of your computer, which is very handy if you have a lot of Raspberry Pis and you can never find the connection to them.
Also, some experiments with FXGL. Who has used FXGL or knows what it is? No? Definitely take a look at it. It's by Almas. He's a professor at an English university, and he created an amazing library for creating games. If you ever want to do some fun stuff and create a game with Java and JavaFX, FXGL is the project. You will also find a lot of info about that on Foojay. I also had some demos with HiveMQ. That's also something that is very easy to do: messaging from a Raspberry Pi to a cloud provider. HiveMQ is a messaging platform, and they have a free cloud solution for up to 100 devices. Every maker with more than 100 devices can now raise their hand. Nope. So that's the place to be, where you can find those things. Now Pi4J. Pi4J is a Java library. That means it's a dependency: you add it to your Java project. Inside the library is native code, which calls the different protocols that you can use to interact with the pins on your Raspberry Pi. The simplest thing: you connect an LED and you can make the LED blink. But you can go a lot further: read temperatures, control LED displays, all that kind of stuff. With the launch of Pi4J version 2, we also launched a new website. And actually that's my role in the project. I didn't contribute a lot to the sources of the Pi4J library; I focused on the documentation part. Just like Foojay wants to be the source of truth for all Java developers to find information about Java, Pi4J wants to be that for Java on the Raspberry Pi, including information about how you run JavaFX on a Raspberry Pi. One of the nice use cases of JavaFX is kiosk mode, so that a user interacting with your Raspberry Pi through a touchscreen cannot do anything other than use your application. They cannot reboot or get into the Linux terminal. Now let's look back at what happened last year. For me personally, the biggest change is that I joined Azul. Azul is one of the distributors of OpenJDK.
I'm part of the documentation team. And because of that, I can also focus a bit on writing Foojay articles and other documentation like that. But it was meant to be, because Azul has a lot of distributions of... it's called Zulu. The core product of Azul is a distribution of OpenJDK, like there are so many. The main advantage of Azul Zulu is that it is available for a lot of platforms, more platforms than most other distributors. That's the nice thing that I found out after joining Azul. They are even the only one that supports all the oldest Raspberry Pi models. Now, what I also found out... who knows SDKMAN!? Yes? OK, look it up. SDKMAN! allows you to switch between Java versions with one command. It didn't run on the Raspberry Pi. And of course that hurt. So, Foojay, the website for the Friends of OpenJDK: behind the scenes there is the Disco API. The Disco API is an API to search for Java distributions. The same Disco API is used by SDKMAN!. SDKMAN! is a tool for Linux and Mac with a one-line installation script; then you run sdk list java and you get a list of all the available Java distributions for your platform. Because of the Disco API, small changes made by Gerrit Grunwald, who is also an Azul colleague and maintains it, and changes in SDKMAN!, to which I made a very few, very small commits, we were able to get to this. If you have this Raspberry Pi Zero from the first generation, which has an ARMv6 processor, a different architecture than the newer ones, you will get four versions of Java that you can install. Unfortunately, it's only Zulu. As I said, only Zulu will work. There is still a problem with the architecture of that processor that causes some issues there. But so you can install Java with SDKMAN! on any type of Raspberry Pi because... yep, I have something else here.
If you run the same command on a newer Raspberry Pi with a 64-bit operating system (Raspberry Pi OS officially has that now, since recently), you get 45 extra lines. So there are more than 50 Java distributions, and this screenshot is from a Foojay article from last February, I think, so Java 19 is not on this list. So there are now more than 60, I guess, Java versions, distributions, that you can install on a Raspberry Pi. Another article I wrote for Foojay is about JBang. Who has used JBang? No? Since Java 11, I think, you can run Java files without compiling them first. If you have a single Java file which does some simple things, you can just run it; you don't need to compile it. What JBang adds is that you can define your dependencies in that one single file. And if you install JBang on a Raspberry Pi, or on any computer where you didn't run Java yet, it will install Java for you. Then you can just create a text file, and with this Gradle-style definition of dependencies inside your Java file, JBang has everything it needs to run your code. This example is based on the minimal code example that we have on the Pi4J website. It's just to control an LED. Let me see if the video works here. If it doesn't show, I will just refer you to foojay.io where you can find the full... nope, no video. OK. We didn't try this before. You see, the sessions here go very fast. How much time is there? Another fun project I love is Vaadin. Vaadin allows you to build user interfaces in pure Java. So if you have played with JavaFX, it's a bit the same feeling, but for web applications. With Vaadin you have button elements and table views and all that kind of stuff. I also wanted to create an example using Vaadin on the Raspberry Pi, and that's exactly what I have done, and this video will play. So this is the web interface, without any modification, that you get from a default Vaadin project.
There you have a custom setup with just an LED and a small button, and then you have the Vaadin application running. So this is a Spring application, a combination of Spring, Vaadin and Pi4J, and it's running on the Raspberry Pi, as you can also see at the top. I'm not going to show you any rocket-science experiments. It's just the pure basics. It's blinking an LED, the hello world of programming electronics, and you see that the info changes there after the button has been touched. All of that is documented. Five minutes, okay. Good. I promised you some Kotlin. I'm not a Kotlin developer myself, but Pi4J is a community project. It's an open source project, so we welcome anyone who wants to contribute. Mohammed Hashim, who once, as a student I think, developed a Kotlin implementation of the first version of Pi4J, said: I can do that again. So he created a Kotlin implementation on top of Pi4J. If you are a Kotlin developer and want to do Kotlin on the Raspberry Pi, you can do so and even control electronics. These are just some code samples that I took from his examples. The fun thing is that he also went back to the documentation part of the Pi4J website and added four or five pages of documentation about how to do this with Kotlin. So if you are interested in Kotlin on the Raspberry Pi, go to pi4j.com. Now a few things I can tell you about this year and what we're going to do. Just as an experiment again, I wanted to create a library. I got quite worried about how to maintain libraries and about the legal parts, so I don't know if this was a good idea. I wanted to create a library containing a database of all the Raspberry Pis, their history, what pins they have and what you can do with these pins. We need it for another project, so I wanted to create this library. On top of this library we then created api.pi4j.com. And again I used Vaadin. I know it.
I've used it before. So this application, api.pi4j.com, is public. It uses a library containing a database with Raspberry Pi information and visualizes it here, and because it's a Spring application we can of course have Swagger and all that kind of stuff. But the fun thing is, of course, that it runs on a Raspberry Pi. It runs on a Raspberry Pi that we got from the company FinalTech.com in the Czech Republic, so somewhere in a data center in Prague a Raspberry Pi is hosting this pi4j.com site. I don't know how performant it is, so if you all visit it at the same time, we will find out. And then something unexpected happened a few weeks ago. Daniel Frey asked me: how about Spring Boot and Pi4J, does that exist? No. But now it does, because he created it. Daniel Frey and DaShaun Carter are two guys from the Spring team, and they just developed this. I joined them on a Twitch stream. It was a bit chaotic, but yeah, it was a Twitch stream. We didn't finish it yet, but we're working on it. So you'll have a Spring Boot starter that helps you detect which Raspberry Pi you are running on and how it should be configured. It will create a context for you, which means it loads all the plugins to control the GPIOs. And you will also be able to use, how is it called, the info controller, the Prometheus list of all the data that you get from Spring, so that you can use it in Grafana. So you will have all this data available: which I/O pin is toggled, what is active, what signal is arriving at this pin. That kind of info. It's not finished. We're working on it, so maybe if I'm back here next year, I can show you. So what is next? You can find me on Twitter. I'm also on Mastodon, on the Foojay social server. With Foojay we of course also started a Mastodon service. I write a lot about all this kind of stuff, and you can find it either on Foojay or via Foojay. And that's all from me. Thank you. |
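Going back to the JBang part of this talk: the single-file workflow can be sketched as follows. This is not the Pi4J example from the slides. It is a minimal stand-in that only simulates the LED in plain Java, so that it runs on any machine; the class name `BlinkSketch` and the method `blink` are made up for illustration, and the dependency line is shown only as a pattern, not as a real version.

```java
///usr/bin/env jbang "$0" "$@" ; exit $?
// With JBang, dependencies are declared in comment lines at the top of the file.
// A Pi4J script would add lines of this shape (pattern only, no real version given):
//     //DEPS com.pi4j:pi4j-core:<version>
// This sketch needs no dependency: it simulates the LED in plain Java.

import java.util.ArrayList;
import java.util.List;

class BlinkSketch {

    /** Returns the sequence of LED states for the given number of toggles, starting from off. */
    static List<Boolean> blink(int toggles) {
        List<Boolean> states = new ArrayList<>();
        boolean on = false;
        for (int i = 0; i < toggles; i++) {
            on = !on;          // toggle the simulated LED
            states.add(on);
        }
        return states;
    }

    public static void main(String[] args) {
        for (boolean on : blink(4)) {
            System.out.println("LED " + (on ? "on" : "off"));
        }
    }
}
```

Saved as `BlinkSketch.java`, this can be started with `jbang BlinkSketch.java`, or, because it has no dependencies, directly with `java BlinkSketch.java` on Java 11 or later.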
Write Once, Run Anywhere... Well, What About Heterogeneous Hardware? |
Hello, everyone. Good afternoon. I am Thanos from the University of Manchester, and today I have the pleasure of presenting TornadoVM and its current state. In fact, I also want to focus on a slogan that is well known to everyone: write once, run anywhere, for Java. So I will start with that. This slogan dates from the 90s, when Sun Microsystems used it to advertise that the Java language, and the JVM in particular, is a platform that can ensure portability across different CPU architectures. The idea is that programmers compile their code once, and it runs transparently on different hardware architectures. However, hardware has changed in recent years. It is evolving, and perhaps this is no longer sufficient for the new types of hardware resources that are arriving. Lately we have GPUs and FPGAs, which complement the power of CPUs in order to maximize performance and reduce energy consumption. These are good, but they bring some challenges, mainly in programmability: how can programmers harness the power of these resources? I don't know if you have experience with OpenCL and CUDA, but the programming models that have been designed to give access to these hardware types are mainly focused on the C and C++ world. There are different programming models from different companies, like SYCL, oneAPI, NVIDIA CUDA, and OpenCL, which is a standard that can run on all the devices. And if you have FPGA expertise, then perhaps you can write RTL in Verilog, which is a hardware description language, but this is very low level. Here we are talking about Java, so we want to go high level. If you are a Java developer, you use the JVM and you go to the CPU.
If you want access to these devices, you need to write your own native interfaces with JNI and then tap into the C and C++ world. But you still need to be aware of how these programming models work, so you need to be familiar with them. This is exactly the problem that TornadoVM has been designed to solve. TornadoVM is a plug-in to existing OpenJDK distributions, like Amazon Corretto, Red Hat Mandrel, Azul Zulu, and others. It is built to enable hardware acceleration in an easy manner. It offers a Java API, and inside it has a JIT compiler for the hardware devices shown in this figure. Our compiler can automatically translate Java bytecode to run on multi-core CPUs, on integrated or discrete GPUs, and on FPGAs. The compiler has three different backend types. It can emit OpenCL C, and PTX, which is the assembly for CUDA, for NVIDIA GPUs. And recently it also got the SPIR-V backend, which makes it possible to use the Level Zero dispatcher from oneAPI. So TornadoVM is a technology that can be used as a JVM plug-in to enable hardware acceleration for JVMs. Some of the key features: it has a lightweight Java API. It is coded in a platform-agnostic manner, so the code stays the same no matter which device it will be executed on. And it can transparently specialize the code at compile time, because the code that is generated for a GPU is completely different from the code that is generated for an FPGA. So in the compiler we have different phases that are enabled for GPUs and different phases that are enabled to specialize the code for an FPGA. Our code is available on GitHub, so we encourage everyone who wants to have a look to fork it, download it, play with the examples, or even create their own examples. And also to come back to us.
I mean, feel free to use the GitHub discussions if you have questions, or to open issues if something is broken, so that we can fix it. We also have Docker images available for NVIDIA GPUs and Intel integrated GPUs. The next part I want to talk about is the API. Two weeks ago, we released a new version of TornadoVM, version 0.15, and it comes with many changes at the API level. Our goal was to make the API easier for Java programmers to use and understand, so that they know how to express parallelism from Java. But first, I have to make you familiar with the programming model of TornadoVM. This programming model is inspired by heterogeneous programming models like OpenCL and CUDA and the way they operate. In this sense, a Java program is composed of two parts. The first part is the host code, which is the actual core of the Java application. The second part is the accelerated code, which is the method or the set of methods that will be offloaded for execution on a GPU. Once this is clear, we can move to the execution model. Because the processing will take place on a device, the data first has to be moved from the host, the CPU, to the actual device. Then the processing is performed. And once the processing is finished, the result has to be transferred back to the host. Now, in the TornadoVM API we have exposed a set of objects and annotations for each of these two parts, the host code and the accelerated code. For the host code, we have the TaskGraph object and the TornadoExecutionPlan. The task graph corresponds to what to run on the GPU, and the Tornado execution plan to how to run it on the GPU. For the accelerated code, we have a set of annotations and objects that I will show you later.
So let's start with the task graph: what to run. Assume that you are a Java programmer and you want to offload the execution of a method, in this example method A, to the GPU. This method has some input and some output. This method corresponds to what we call a task in TornadoVM terminology. So each method that will be offloaded for hardware acceleration is a task, and it has input data and output data. And then we have a group of tasks, which is the task graph. A task graph is a group of tasks, which may or may not have dependencies between them, that the programmer wants to offload together for hardware acceleration. In this particular example, I have put one task in the task graph. Once we have defined what to run, one question that comes up is how often to transfer the data between the host CPU and the device. This can have a tremendous impact, because it affects the data transfer time and therefore the total execution time. So it can affect performance, but also energy, the power consumption. So how you transfer data matters, and it depends on the pattern of the application. An application may need to copy data only in the first execution, if the data is read-only, or in every execution, or only in the last execution, for example for the output, the result. Here is a code snippet showing how the task graph can be used to define this functionality in the TornadoVM API. We create a new TaskGraph object and assign a name, which is a string. In this particular example, it is TG. Then we use the exposed methods of the API to fulfill the execution model. First, we use transferToDevice, which takes two inputs. The first argument is the data transfer mode, which determines how often the data will be moved. In this particular example, it is the first execution, so the data will be moved only in the first execution.
And then we have the parameter, which is the input array. The second method is task, and it defines which method will be used for hardware acceleration. The first parameter is a name for the task, a string. It could be any name, and it is associated with the dynamic configuration, which I will show you later. The second parameter is the method reference, so the reference to the method that will be offloaded to the GPU for acceleration. Then comes the list of parameters of this method, corresponding to the method signature. The last method is transferToHost. Again, the first argument of this method is the data transfer mode, and in this example we copy the output data in every execution. Okay. Once we have defined the task through the task graph, the task graph can be appended to and updated. We can add a new task, a second task. We can change the way the data transfers are triggered, in every execution or only in the first execution. The next step is to define the immutable state of the task graph, so how to preserve the shape of a task graph. This is done by taking a snapshot of the task graph, using the snapshot method of the TaskGraph object. We get back an immutable task graph, and this is what can be used for JIT compilation and execution on the hardware. This ensures that Java programmers can create different versions of their task graph. They can update it, and the code cache that we have in TornadoVM can store all these versions. It doesn't need to recompile and overwrite the generated code. And this is the final step before we move to the actual execution plan. We have the mutable task graph, which can be modified, and the immutable task graph, which cannot be modified anymore.
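The relationship between the mutable task graph and its immutable snapshots can be illustrated with a small plain-Java model. This is not the TornadoVM API and does not offload anything: `TaskGraphSketch` and `TransferMode` are local stand-ins written for this sketch, and only the method names (transferToDevice, task, transferToHost, snapshot) follow the names used in the talk. Arrays and methods are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java stand-in for the idea of a task graph; NOT the TornadoVM classes. */
class TaskGraphSketch {

    /** How often a transfer happens, following the modes named in the talk. */
    enum TransferMode { FIRST_EXECUTION, EVERY_EXECUTION }

    private final String name;
    private final List<String> steps = new ArrayList<>();

    TaskGraphSketch(String name) {
        this.name = name;
    }

    TaskGraphSketch transferToDevice(TransferMode mode, String... arrays) {
        steps.add("copy-in(" + mode + "): " + String.join(", ", arrays));
        return this;
    }

    TaskGraphSketch task(String taskName, String methodName, String... params) {
        steps.add("task " + name + "." + taskName + " -> "
                + methodName + "(" + String.join(", ", params) + ")");
        return this;
    }

    TaskGraphSketch transferToHost(TransferMode mode, String... arrays) {
        steps.add("copy-out(" + mode + "): " + String.join(", ", arrays));
        return this;
    }

    /** Freezes the current shape; later changes to this graph do not touch the snapshot. */
    List<String> snapshot() {
        return List.copyOf(steps);
    }

    public static void main(String[] args) {
        TaskGraphSketch tg = new TaskGraphSketch("TG")
                .transferToDevice(TransferMode.FIRST_EXECUTION, "input")
                .task("t0", "methodA", "input", "output")
                .transferToHost(TransferMode.EVERY_EXECUTION, "output");

        List<String> v1 = tg.snapshot();               // first immutable version
        tg.task("t1", "methodB", "output", "result");  // the mutable graph can still change
        List<String> v2 = tg.snapshot();               // second immutable version

        System.out.println("v1 has " + v1.size() + " steps, v2 has " + v2.size() + " steps");
        v2.forEach(System.out::println);
    }
}
```

In this model both versions stay available side by side, which mirrors the point made above about the code cache keeping every snapshot instead of overwriting the generated code.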
So if users want to make a change, they can still change the task graph and take a new snapshot for a second version of their code. Now we move to how to run, how to execute the task graph. This is done through the execution plan. Here is a snippet of the TornadoExecutionPlan. We create a new object that accepts as input the immutable task graph, which doesn't change anymore. Then we can either execute it directly in the default execution mode, by invoking the execute method of the execution plan, or we can configure it with various optimizations. In this particular example, I have configured it to run with dynamic reconfiguration, which is a feature in TornadoVM that launches one Java thread per device available on the system to JIT compile and execute the application. So we can have a CPU, a GPU and an FPGA. A Java thread will run for each of the devices, and it is triggered with the performance policy, which means that the first device to finish the execution is considered the best and the remaining Java threads are killed. That concludes the host code. We can briefly go to the accelerated code, which is the way to express parallelism within the kernel, within the method, or the TornadoVM task, as we call it. We have two ways, two APIs. The first one is called the loop parallel API. Here we expose the parallel annotation, which programmers can use as a hint to the TornadoVM JIT compiler that these loops can run in parallel. The second one is the kernel API, which is exposed to users through the KernelContext object. This API is meant for OpenCL and CUDA programmers, or Java programmers who know OpenCL, and gives them more freedom in how to code things. They can get access to local memory, which for GPUs is the equivalent of the CPU's cache memory. So they have more freedom in what they can express.
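The difference between the two styles can be sketched in plain Java with a small matrix multiplication over flattened arrays. Nothing here runs on a GPU or uses TornadoVM: the nested `Parallel` annotation is a local stand-in with no effect, defined only so that the file compiles on its own, and `matMulCell` imitates the kernel style by taking the two thread IDs as ordinary parameters. All names in the sketch are illustrative assumptions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

class MatMulSketch {

    /** Local stand-in so this file compiles without TornadoVM; it has no effect here. */
    @Retention(RetentionPolicy.SOURCE)
    @Target(ElementType.LOCAL_VARIABLE)
    @interface Parallel { }

    /** Loop-parallel style: the annotated loops are hints that iterations are independent. */
    static void matMulLoops(float[] a, float[] b, float[] c, int n) {
        for (@Parallel int i = 0; i < n; i++) {
            for (@Parallel int j = 0; j < n; j++) {
                float sum = 0.0f;
                for (int k = 0; k < n; k++) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }

    /** Kernel style: one call computes one output cell, identified by a 2D thread id. */
    static void matMulCell(int idx, int idy, float[] a, float[] b, float[] c, int n) {
        float sum = 0.0f;
        for (int k = 0; k < n; k++) {
            sum += a[idx * n + k] * b[k * n + idy];
        }
        c[idx * n + idy] = sum;
    }

    public static void main(String[] args) {
        int n = 2;
        float[] a = {1, 2, 3, 4};
        float[] b = {5, 6, 7, 8};
        float[] c1 = new float[n * n];
        float[] c2 = new float[n * n];

        matMulLoops(a, b, c1, n);

        // On a GPU every (idx, idy) pair would be its own thread; here we simply loop over them.
        for (int idx = 0; idx < n; idx++) {
            for (int idy = 0; idy < n; idy++) {
                matMulCell(idx, idy, a, b, c2, n);
            }
        }

        System.out.println(Arrays.toString(c1));
        System.out.println(Arrays.toString(c2));
    }
}
```

In the loop style the programmer keeps ordinary sequential Java and only marks the loops, while in the kernel style the loops over the output disappear from the method body and the indices arrive from outside.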
And in fact, I have used this API to port existing kernels written in OpenCL and CUDA to Java. For more information, you can use this link to the TornadoVM documentation, which describes some examples. I will briefly go through one example, a matrix multiplication, which I presented last year at FOSDEM. In this example, we have the accelerated code and the host code. The matrix multiplication method implements matrix multiplication over flattened arrays in two dimensions. To express parallelism with the loop parallel API, we add the @Parallel annotation inside the for loops. That indicates that these loops can be executed in parallel. With the second API, the kernel API, we use the KernelContext object, and in particular the global IDs in X and Y, which correspond to the two dimensions that we have. So in a sense, it is like having the ID of the thread that will execute on the GPU. Here are some use cases for which TornadoVM has been used. And to conclude this talk, I would like to focus on a feature that we implemented in a research project that we are working on. It is called ELEGANT, and the idea is to create a software stack that unifies development for big data and IoT deployments. There, TornadoVM is used as a technology to enable acceleration as a service. We have implemented a REST API. It is still a prototype, but programmers can specify a method and the characteristics of the targeted device, and the service returns OpenCL code that runs this function in parallel. The interesting part is that the generated OpenCL code is portable across different programming languages. So it isn't bound only to Java. It can also be run from C++ or Python, because it is OpenCL. In this particular example, we have Java. We use OpenJDK.
We take the bytecode and pass it to TornadoVM. TornadoVM runs with an experimental feature called code interoperability mode, and in this mode it converts the bytecode to OpenCL that can be run from any programming language and runtime. So it is like prototyping in Java for parallel programming. Wrapping up, we would like to receive feedback. We are also looking for collaborations, if we can help to port use cases, or for any other issues. To summarize this talk, I briefly went through write once, run anywhere in the context of heterogeneous hardware acceleration. I introduced TornadoVM, which is an open source project whose code base is available on GitHub. And I introduced the programming model of TornadoVM and the new API, and how to use it. More is about to come on the Foojay blog, in a new blog post. Finally, I would like to acknowledge the projects that have supported our research at the University of Manchester. And I'm ready for questions. |
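Looking back at the dynamic reconfiguration feature from this talk (one Java thread per device, the first device to finish wins, the other threads are stopped), the selection idea can be modeled with the standard `ExecutorService.invokeAny` call. This is a plain-Java sketch under loud assumptions: it does not use TornadoVM, the "devices" are just named tasks, and the delays are invented numbers that stand in for compile and run time.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DeviceRaceSketch {

    /** Simulates compiling and running the same task on one device; the delay is made up. */
    static Callable<String> runOn(String device, long millis) {
        return () -> {
            Thread.sleep(millis);   // interrupted if another device finishes first
            return device;
        };
    }

    /** Starts one thread per device and returns the name of the first one to finish. */
    static String fastestDevice(List<Callable<String>> runs) {
        ExecutorService pool = Executors.newFixedThreadPool(runs.size());
        try {
            // invokeAny returns the first successful result and cancels the remaining tasks.
            return pool.invokeAny(runs);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("no device finished", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String winner = fastestDevice(List.of(
                runOn("CPU", 600),
                runOn("GPU", 50),
                runOn("FPGA", 900)));
        System.out.println("Selected device: " + winner);
    }
}
```

The model only captures the policy, namely racing the candidates and keeping the first finisher. What each thread actually does in TornadoVM (JIT compilation and execution per device) is described in the talk, not reproduced here.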
The Next Frontier in Open Source Java Compilers: Just-In-Time Compilation as a Service |
Hello. I'll get started. Okay. My talk is entitled The Next Frontier in Open Source Java Compilers: Just-In-Time Compilation as a Service. Whoops, this isn't working. My name is Rich Hagarty. I've been a software engineer for way too many years, and I'm currently a developer advocate at IBM. So, we're all Java developers. We understand what a JVM and a JIT are. The JVM executes your Java application at runtime and sends the hot methods to the JIT to be compiled. With that in mind, we're going to talk about JIT as a service today, and we're going to break it down into three parts. First, I'm going to talk about the problem, which is Java running in the cloud, specifically in distributed, dynamic environments like microservices. Then we're going to talk about the reason, which takes us back to the JVM and the JIT. It's great technology, but it does have some issues. And then the solution, which is JIT as a service. So, is Java a good fit for the cloud? For context, let's talk about legacy Java enterprise apps. They were all monoliths running on dedicated servers or VMs. To ensure great performance, we loaded them with a lot of memory and a lot of CPUs. They took forever to start, but it didn't matter because they never went down. We have clients who have been running Java applications for years. If they did upgrade, it would be every six months to a year, with some simple refreshes. That was the world of legacy Java enterprise apps. Now we move to the cloud. That same monolith is now a bunch of microservices talking to each other. They're all running in containers, managed by some cloud provider with a Kubernetes implementation to orchestrate them. And we have auto-scaling up and down to meet demand. The main motivators behind this, obviously, are flexibility and scalability. It's easier to roll out new releases. You can have teams assigned to specific microservices that never touch other microservices.
Once you're in the cloud, you can take advantage of the latest and greatest cloud technologies, like serverless. Obviously, you have less infrastructure to maintain and manage. And the ultimate goal is saving money. But before we start counting all our money, we've got to think about performance. There are two variables that impact cost and performance: container size and the number of instances of your application you're running. Here's a graph showing all the ways we can get these variables wrong. Starting down here, the containers are way too small and we're not running enough instances. It's pretty cheap, but the performance is unacceptable. On the opposite side, our containers are too big and there are way too many instances running. Great performance, but we're wasting money. So we need to get over here. This is the sweet spot. We've got our container size just right, and we have just enough instances for the demand. That's where we want to get to, and it's very hard to do. In fact, most conferences have a lot of talks about how to get there, or about fixes for this problem. Before we can figure out how to fix it, we've got to figure out why it's so hard, and in order to do that, we've got to talk about the JVM and the JIT. First, the good: it's device independent. Java became so popular because we write once, run anywhere, in theory. There have been 25 years of constant improvement, with a lot of involvement from the community. The JIT itself produces optimized code that runs great. It uses profiling, so it can apply optimizations that you can't get when compiling statically. The JVM has very efficient garbage collection. And as the JVM collects more profile data and the JIT compiles more methods, your code gets better and better. So the longer your Java application runs, the better it gets. Now, the bad. The initial execution of your code is interpreted, so it's relatively slow. The hot methods compiled by the JIT can create CPU and memory spikes. CPU spikes cause lower quality of service, meaning worse performance.
And memory spikes cause out-of-memory issues, including crashes. In fact, out-of-memory issues are a main reason JVMs crash. And we have slow start-up and slow ramp-up times. We want to distinguish between the two. Start-up time is the time it takes for the application to be ready to process its first request, usually while the code is still interpreted. Ramp-up time is the time it takes the JIT to compile everything it wants to compile to get to the optimized version of your code. Here we have some graphs to back that up. We take a Java enterprise application, and on the left you can see CPU spikes happening initially, all because of JIT compilations. Same thing on the memory side: we've got these large spikes that we have to account for. So let's go back to that graph about finding the sweet spot. Now we have a little more information, but we still need to figure out a way to right-size those provisioned containers, and we've got to make our auto-scaling efficient. We have very little control over scaling. We control the size of our containers, but as far as scaling goes, we just have to set up the environment correctly so that auto-scaling is efficient. On the container size side, the main issue is that we need to over-provision to handle those memory spikes, and that is very hard to do because JVMs behave non-deterministically. You can run the same application over and over, and you're going to get different spikes at different times. So you've got to run a series of load tests to get that number roughly right. And on the auto-scaling side, again, we talk about the slow start-up and ramp-up times. The slower those are, the less effective your auto-scaling is going to be. And the CPU spikes can cause other issues. For a lot of auto-scalers, the threshold for starting new instances is CPU load.
So if you start a new instance and it's spinning, doing JIT compiles, your auto-scaler may read that as a false positive and say: demand is going up, you need more instances, when in this case you really don't. That makes it very inefficient. So the solution to this problem is to minimize or eliminate those CPU and memory spikes, and to improve the start-up and ramp-up times. What we are proposing here is JIT as a service, which is going to solve, or help solve, these issues. The theory behind it is that we decouple the JIT compiler from the JVM and let it run as an independent process. Then we offload JIT compilations from the client JVMs to that remote process. As you can see here, we have two client JVMs talking to two remote JITs over here. We still have the JIT locally in the JVM, which can be used if the remote ones become unavailable for some reason. Since everything is in containers, it is all managed automatically by the orchestrator, which makes sure it is scaled correctly. This is actually a monolith-to-microservices solution. We're taking the monolith, which in this case is the JVM, and splitting it into the JIT as one microservice and everything that's left over as the other. And again, as I mentioned, the local JIT is still available if this service goes down. This technology does exist today. It's called the JIT server, and it's part of the Eclipse OpenJ9 JVM. It's also called the Semeru Cloud Compiler when used with Semeru Runtimes, and I'll get to that in a minute. I'm sure everyone here knows that OpenJ9 combines with OpenJDK to form a full JDK, and it's totally open source and free to download. Here's the GitHub repo. A little history of OpenJ9: it started life as the J9 JVM at IBM over 25 years ago.
The reason IBM developed it was that they had a whole range of devices they needed to support, and they wanted to make sure Java ran on all of them, all the way from handheld scanners to mainframes. So it was designed to scale from small to large, for environments where you have a lot of memory or very, very little. About five years ago, IBM decided to open-source it through the Eclipse Foundation. OpenJ9 is renowned for its small footprint and fast start-up and ramp-up times, which we'll get to in a minute. And even though it's got a new name, OpenJ9, all of IBM's enterprise clients have been running their applications on this JVM for years, so there's a long history of success with it. Here's some OpenJ9 performance data compared to HotSpot. Again, this doesn't take the JIT server into account. This is just the JVMs themselves, going left to right here. OpenJ9 is in green, HotSpot is in orange. In certain circumstances, we see a 51% faster start-up time and a 50% smaller footprint after start-up. It ramps up quicker than HotSpot. And at the very end, under full load, we have a 33% smaller footprint with OpenJ9. So, Semeru Runtimes. That is IBM's OpenJDK distribution. As someone just mentioned, there are a ton of distributions out there. This is IBM's, and it's the only one that comes with the Eclipse OpenJ9 JVM. It's available at no cost. It's stable. IBM puts their name behind it. It comes in two editions, open source and certified, the only differences being the licensing and which platforms are supported. And if you're wondering where the name Semeru comes from: Mount Semeru is the tallest mountain on the island of, anyone know? Java, there you go. See how that makes sense? If I had a t-shirt, I would have given it to you. All right. From the perspective of the application, the client talking to this new JIT server, these are the advantages it is going to get.
From a provisioning aspect, it's now going to be very easy to size our containers, right? We don't have to worry about those spikes anymore. We just size based on the demand, the needs of the application itself. Performance-wise, we're going to see improved ramp-up time, basically because we're offloading all the compiles and their CPU cycles to the JIT server. There's also a feature in the JIT server called the AOT cache. It stores every method it compiles, so when another instance of the same containerized application asks for a method it already has, it just returns it. No compilation needed. From a cost standpoint, obviously any time you reduce the amount of resources you use, you get cost savings. And as I mentioned earlier, with efficient auto-scaling you only pay for what you need. Resiliency: remember, the JVM still has its local JIT, so if the JIT server goes down, it can still keep going. So this is kind of an interesting chart. It's pretty big. We're going to look at some examples of where we see savings. This is an experiment where we took four (let me see if my pointer works) Java applications and sized them correctly for the amount of memory and CPU they needed, doing all those load tests to figure out what the amounts should be. And we have multiple instances of them. The color indicates the application, so you can see all the different replicas. The relative size is shown by the scale of the square. In this case, we used OpenShift to lay it out for us, and it ended up using three nodes to handle all these applications and their instances. Then we introduced the JIT server and ran the same test. Here's our JIT server, in brown. It's the biggest container on the nodes. But notice that the size of all our application containers goes way down.
So we have the same number of instances in both cases, but we've just saved 33% of the resources. And if you're wondering how they perform (whoops, went too far), you see no difference. The orange is the baseline, the blue is the JIT server. In the steady state, meaning once they've ramped up, they perform exactly the same. But again, we're saving 33% of the resources. Now we'll take a look at some of the effects on auto-scaling in Kubernetes. Here we're running an application and setting our threshold, I think it's shown up there, at 50% CPU. All these plateaus are where the auto-scaler launches another pod, and you can see how the JIT server, in blue, responds better: shorter dips, and they recover faster. Overall, your performance is going to be better with a JIT server. There's also that other thing I talked about, the false positives. The auto-scaler is not going to be tricked into thinking that the CPU load from JIT compiles is caused by demand. So you're going to get better auto-scaling behavior. Two minutes. All right. When to use it? Obviously, when the JVM is in a memory- and CPU-constrained environment. As a recommendation, always have 10 to 20 client JVMs talking to a JIT server, because remember, the JIT server takes its own container. And the communication goes over the network, so only add encryption if you absolutely need it. Some final thoughts. The JIT provides a great advantage, optimized code, but compilations do add overhead. So we disaggregated the JIT from the JVM and came up with JIT compilation as a service. It's available in Eclipse OpenJ9, where it's called the JIT server. That's the technology. It's also called the Semeru Cloud Compiler. It's available on Linux for Java 8, 11, and 17. It's really good with microcontainers. In fact, that's the only reason I'm bringing it up today. It's Kubernetes ready.
You can improve your ramp-up time and auto-scaling. And here's the key point I'll end with. This is a Java solution to a Java problem. Initially I talked about that sweet spot space. There are a lot of companies, a lot of vendors, trying to figure out how to make that work better. And a lot of their approaches involve doing other things besides what Java's all about: running the JVM, running the JIT. So it is a Java solution to your Java problem. That's it for me today. That QR code will take you to a page I have that has a bunch of articles on how to use it, also the slides and other good materials about it. That's it for me. Thank you very much. It sounds amazing. It's amazing. It really is amazing. Well, why wouldn't you? OpenJ9 is a perfectly, I mean, it's a viable JVM. It's nothing special, right? There's nothing unique about it that makes you change your code. It's a JVM that just points to the OpenJDK, the OpenJ9 JVM. Okay, here it comes. I think so, because I've seen examples of using those apps in tests. Check that, yeah. Yeah, okay. That may be a problem. You should go out and check the latest coverage of that. Well, the way the AOT cache will work in this case for the JIT server, it's going to keep all that information, and the profile has to match the requesting JVM, right? So if it matches, it'll use it, right? Because the clients also have their own cache. They'll keep it, but it goes away once they go away, right? Or when you start a new instance of that app, you have a brand new, fresh cache. I'm sorry. Yeah, so that's what we were talking about. You want to go static. You're going to get a smaller image running statically, but you lose all the benefits of the JIT. Over time, yes. So that may be a great solution for short-lived apps, right? But the longer your Java app runs, the more you're going to benefit from that optimized code, right? Yes?
So Eclipse OpenJ9 is not a certain set-byte, but my main server is also a set-byte for open edition, but today it has available binaries. But for Eclipse, they are not able to actually release the binaries because they cannot actually access the TCK certification process. So that whole TCK issue is a, I don't know. Well, I guess I could say it seems to be an issue more between IBM and Oracle, right? Our own tests are going to encompass all the TCK stuff. OpenJ9 is managed by Eclipse, but 99% of the contributions are from IBM. It's a big part of their business. It's not going to go anywhere. If you have to do open source, this is like the best of both worlds, I think. It's available. It's open. You can see it, but you know you have a vendor who has their business based on it, so it's not going to go anywhere, and they're going to put a lot of resources into making it better. So, you know, I'm just telling you right now that we just came up with the JIT server. We're going into beta on InstantOn. I don't know if you've heard of that. It's based on CRIU. So we're going to be able to take snapshots of those images, and you can put those in your containers. Those are going to start up in milliseconds. So the JIT server basically handles the ramp-up time, but InstantOn will handle the startup time. So we're talking milliseconds. That's coming out in the next couple of months or so. Anyway, thank you. Well, if you don't have the JIT, then you're going to be running interpreted. That's like the worst of everything. Oh, well, it won't be. But you still want to use the JIT remotely. Oh, you're talking about locally. It will not be used. It will not be used. By the way, yeah. And by the way, the JIT server is just another persona of the JVM. It's just running under a different persona. No, it won't do that. Okay. Thank you very much. Okay. Thank you. |
Afraid Of Java Cold Starts In Serverless? Fear Not, Java Is Super Fast! |
I used to work at Payara, so maybe some of you know me from my six years at the Payara company. Now I work at OmniFish, where we, my co-founders and employees, support GlassFish server, so back to the roots, kind of. But this time I'd like to talk about Java, plain Java and Jakarta EE, and how it all fits together when we combine that with AWS. So first, before I talk about AWS, let's ask: why do we want Java to start fast? I think everybody wants that, but why? Because it's cool, or because we need it? There were times when we really didn't need that. When we had application servers, it was a pain that they took a while to start, but in production they were already running, so there was no real business need for that, only to make developers happy and more productive when developing code. But now we have several use cases where it's really needed, because the more time it takes for a Java program to start, the more money it costs, and it's not user-friendly. A perfect example of this is AWS Lambda. So now, what is AWS Lambda? It's basically a service to which you can deploy your code, and this service runs your code only when it's needed. It also charges you, because we need to pay for the cloud environment, but if we run the code in Lambda, we are charged only for the time when the code is running. And that's pretty nice, especially if we have code that usually just sits there and responds to users just once in a while, or only during certain periods of time, for example during the day or in the morning when there is some business activity. So how does AWS Lambda do that? It basically creates an environment and deploys our code when it needs to be executed. And for that, if the code is not already deployed, it needs to create the runtime and initialize our so-called function, because this is what our code is called.
It's called a function because it's basically just called by the runtime, it gives some result, and then it's thrown away. In reality, it's not always thrown away, because AWS Lambda tries to cache our code so that it doesn't have to re-initialize it every time when it's run more frequently. So sometimes it stays there, and then AWS Lambda can skip the initialization phase. This is called a warm start, because the code is already prepared to serve requests. But if this doesn't happen, and the code is not available, it has to initialize everything. And this is usually referred to as a cold start, just a start from scratch. So the whole lifecycle of AWS Lambda is as on the slide. You can see there's the init phase. This happens only when the function is not initialized, so in case of a cold start. Then there is a phase which happens even for warm starts: the invoke phase, which is actually the only productive phase of these three. It actually does some job. The first phase only initializes, it gets things ready before the application can process requests. Then the invoke phase does the job. And then when the AWS Lambda service decides that it no longer needs our application running, because it's not doing anything right now and AWS wants to use the resources in some other way, it will tear everything down. So it will shut down the environment. And then we're back at square zero, and the next invocation needs to go through the initialization phase again. So let's now go back to the roots with a plain Java application, and let's think about how fast we can get with Java on AWS Lambda. Can we start Java really fast? I tried to start a very simple Java program on my local machine. And if you do that too on your computers, you will see that Java really starts fast. In my case, it was 50 milliseconds, 0.05 seconds. So a very small fraction of a second, in which the JVM started, printed something on the output and finished.
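To make the lifecycle described above a bit more concrete, here is a minimal toy model of it in plain Java. It is only a sketch: the class and method names are made up for illustration and have nothing to do with the real AWS Lambda runtime API. It just shows that the init phase runs only when no environment exists, and that a shutdown brings the next call back to a cold start.

```java
// Toy model of the init / invoke / shutdown lifecycle.
// All names are hypothetical; this is not the AWS Lambda runtime API.
final class LifecycleSketch {
    private Object environment;   // null means "no environment yet" or "torn down"
    private int initCount = 0;    // how many times the init phase had to run

    // Returns "cold" when the init phase had to run first, otherwise "warm".
    String invoke() {
        boolean cold = (environment == null);
        if (cold) {
            environment = new Object();   // init phase: create the environment once
            initCount++;
        }
        // invoke phase: the actual request processing would happen here
        return cold ? "cold" : "warm";
    }

    // Shutdown phase: the platform reclaims the environment.
    void shutdown() {
        environment = null;
    }

    int initCount() {
        return initCount;
    }

    public static void main(String[] args) {
        LifecycleSketch fn = new LifecycleSketch();
        System.out.println(fn.invoke());   // first call, nothing initialized yet
        System.out.println(fn.invoke());   // environment is reused
        fn.shutdown();
        System.out.println(fn.invoke());   // environment was torn down in between
    }
}
```

Running it prints cold, warm, cold, which is the pattern the rest of the talk is about: only the first call after a teardown pays for initialization.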
So we see that on a local computer, plain Java doesn't take very long to start. If we compare that with what's going on in AWS Lambda, it takes a bit longer in reality, because AWS Lambda needs to initialize the environment and only then can it run the Java function. But we can compare it to other languages. I haven't done this myself. This was done by some other guy who is more experienced with AWS Lambda than me and who compared performance in a more sophisticated way than just running on a computer or taking a few measurements. He did a lot of measurements across various different languages and various different runtimes provided by AWS Lambda. And he found out that Java is basically on the same level as JavaScript, Python and a lot of other languages; there's not much difference. There's a small difference in that, at that time, C# was a bit slower. But as AWS continually improves AWS Lambda, even these numbers would probably be better now. And C# and Docker will maybe be more even with the rest, because the technology running AWS Lambda is continuously improving. But this is just to compare and show that Java itself, or even the implementation of the Java AWS function or the environment, isn't worse than other languages. So now, what is the problem actually? Why do a lot of people perceive that Java starts very slowly? The problem, as I see it, is that many people don't think about Java in this simple way, as a simple application. A lot of people think about Java as a language that runs enterprise applications. And with enterprise applications, we're used to using frameworks that do a lot of the job for us. We run the applications on application servers, which are slow to start. And this is what we think about when we say Java or when we talk about Java. So now we're coming to that.
The thing is, if we can basically run applications that are similar to what we were used to before, but start them fast, we could solve the problem with Java cold starts, at least as we use Java now. So the question I have now is about Jakarta EE or some other frameworks like Spring Boot or something like that. Can that be as fast as plain Java? Can we run that in AWS Lambda to get good performance and fast startup? And the answer is that there are such frameworks and solutions. There are several. I don't have much time to talk about all of them, so I picked one that I personally like. It's called the Piranha Cloud framework. And this one is based entirely on Jakarta EE APIs. Previously, it was called Java EE. So it's a very well-known API that a lot of people already know and a lot of tools out there already use. So it's interoperable with an existing codebase. But the thing with Piranha Cloud is that the implementation, actually the engine of the framework, is new, very flexible, and allows our application to start very fast. Piranha Cloud is based on a lot of existing components. A lot of them come from the GlassFish server, which actually sort of proves that the server is not the problem and Jakarta EE is not the problem. The components are there, and they are quite fast. The problem is how they are assembled in traditional Jakarta EE servers, Java EE application servers. Because an application server usually has a lot of other things that we don't need in Lambda, like monitoring, a lot of vendor features, an administration console and a lot of other things. So here is an example. It's basically nothing else than a servlet, but this is an application already using some Jakarta EE APIs. And this application, this servlet, you can run on any Jakarta EE server. You can run it on Tomcat, you can run it on GlassFish, you can run it on anything that supports servlets.
So the only difference if we run it with Piranha Cloud is that it starts fast and it uses Piranha's own servlet container, which was designed from scratch. It's very flexible and fast. What is also nice about Piranha's servlet container is that it can be embedded very easily. And that's the crucial point. When we want to use Jakarta EE in Lambda, we need to basically shave off everything that we don't need. And in AWS Lambda, we don't even need an HTTP listener, because AWS Lambda basically only wants a method from us that will be called and returns some response. AWS Lambda is then responsible for mapping the HTTP request to an object that it passes to our method, and the returned object is mapped to an HTTP response. And it's not only HTTP requests and responses: Lambda can handle basically any type of JSON messages, JSON events. So the only thing that our application needs to do is parse some input object and return some output event. With Piranha, we can create an engine, map our servlet onto it, and just listen on some object. The request-response cycle is invoked by a service method, which accepts a request object and returns the response object. And this is exactly how we can use it in AWS Lambda. We just need to add one additional layer to map the AWS request object to the Piranha request object and back. So let's run this simple servlet on Piranha Cloud, which is comparable to the plain Java we were running before. If you remember, with plain Java on my computer, the startup time, actually not only the startup time but the time until the program printed some message and finished, was around 50 milliseconds. With Piranha, it takes a bit longer. But this already includes the first request. So it's very similar to the plain Java application. It's not only that the engine starts, but it actually serves the request and the response with a text message through the HTTP stack.
And with that, it still takes a comparable time, around 130 milliseconds. Now we can compare how it works in AWS Lambda. For AWS Lambda, I have a picture, but I hope I will be able to show you live in a minute. As I said before, it takes a bit longer when we start the function for the first time. This doesn't really depend on whether we run Java or any other runtime. AWS Lambda first needs to create some environment to execute our code in, and that takes a little bit of time. But together with creating this environment and running our code, our example Piranha function, it takes under one second to serve the request. Even if nothing was ready before, even the first time we try to run the function, it still serves the response in under one second. If we try it again and again, then the response times are much faster. This is on the right side here. It's under two milliseconds, because this is only the code that needs to serve the request. Everything was initialized. The environment was initialized. The Piranha engine was initialized. It's cached in a static variable, so it's part of the process that is already live. AWS basically just executes a method on the Piranha engine, which goes through the servlet and creates the servlet response. And that's it. That's why it takes only two milliseconds. This is only the time required to serve the actual response. So I'll try. I think I have a link here. How it works. Okay. So this is the actual AWS console where I already deployed the application, the function. And the AWS console has a nice feature, a test button. With that, we can directly invoke the Lambda. Normally, we would have to create an API gateway and map it to the Lambda so that we can access the Lambda via HTTP from outside. AWS can also generate some URL that we can use to invoke the Lambda. But this directly executes the Lambda without actually invoking an HTTP request.
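The two ideas from this part, the thin layer that maps the incoming event to an engine request and back, and the engine cached in a static variable so that warm invocations skip initialization, can be sketched roughly like this. Every type here is a hypothetical stand-in written for this example; it is neither the Piranha API nor the AWS Lambda Java API, and the event keys are assumptions as well.

```java
import java.util.Map;

// All types are hypothetical stand-ins, not the Piranha or AWS APIs.
final class EmbeddedEngineSketch {
    // Stand-ins for the engine's own request and response objects.
    record EngineRequest(String method, String path) {}
    record EngineResponse(int status, String body) {}

    // Stand-in for an embedded servlet engine exposing a service(request) call.
    static final class Engine {
        static int created = 0;            // counts engine initializations

        Engine() {
            created++;                     // expensive setup would happen here
        }

        EngineResponse service(EngineRequest request) {
            return new EngineResponse(200,
                    "Hello, World! (" + request.method() + " " + request.path() + ")");
        }
    }

    // Initialized once per process and reused by every warm invocation.
    private static final Engine ENGINE = new Engine();

    // Stand-in for the function entry point: event in, event out.
    static Map<String, Object> handle(Map<String, String> event) {
        // Map the incoming event to the engine's request type...
        EngineRequest request = new EngineRequest(
                event.getOrDefault("httpMethod", "GET"),
                event.getOrDefault("path", "/"));
        EngineResponse response = ENGINE.service(request);
        // ...and map the engine's response back to an outgoing event.
        return Map.of("statusCode", response.status(), "body", response.body());
    }

    public static void main(String[] args) {
        System.out.println(handle(Map.of("httpMethod", "GET", "path", "/hello")));
        System.out.println(handle(Map.of()));
        System.out.println("engines created: " + Engine.created);
    }
}
```

However often `handle` is called within one process, the engine is created only once, which is why only the first request in a fresh environment pays the initialization cost.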
So with this, yeah, there is some example input, but the application doesn't read anything from the request. It just responds with some hello world message. And if we execute it, you see it takes a bit of time. And this is what I had on my slide. Here it's even shorter, 850 milliseconds. But if we try it again, it's already pre-warmed, because AWS caches (where is it here?) the environment. And now it's just two milliseconds. So now the question is when the cold starts happen. They do happen. I don't have any experience with how much of an impact they have. I heard that it's not much of an impact, because they normally happen once in a while. So once in a while, the response takes one more second on top of request processing. But it can take five seconds, which can happen with a normal Spring Boot application or traditional frameworks or even application servers. Some application servers can be embedded, and then you can run them in AWS Lambda. But some of them are really hard to map to a method call, so you would have to install the application server, and it's not even possible to run application servers in Lambda that way. But if you did, it would take 10 or 20 seconds with some application servers. And that's really a difference. You pay for the execution time, but you also expose users to waiting for a couple of seconds, if it's a user-facing Lambda. If it's not, you maybe don't care so much. If it's some batch job that takes two or three minutes to finish, then a couple of seconds don't really matter. So here's a slide about Piranha Cloud. In short, Piranha Cloud is basically, as I said, based on a new servlet container designed from scratch, with a lot of components built on top of it. The servlet container, being a servlet implementation, can run any servlet out there. And a lot of Jakarta technologies are created as servlets. So for example, Jersey as a servlet can be deployed on Piranha.
And that's quite an easy way to get REST endpoints or a REST library on Piranha: deploy Jersey as a servlet. And then we have everything that Jersey provides. We can embed Piranha as I did in my demo, but we can also build a WAR application and run the WAR application with Piranha on the command line. This is using the distributions, which already contain packages of the Piranha functionality that is mostly used. And the last thing: it's plain Java. There's no real magic. There's no generated code. Everything is just clean code written by clever people, I think. At least judging by the code: when I looked at it, it looked like the people were very clever. So with Piranha Cloud, we were able to achieve quite fast startup times, but it still takes a couple of hundred milliseconds, 100, 200. It depends on how complex our application is. It may even take up to two seconds if we add all the Jakarta functionality that Piranha Cloud provides. If we want to reduce that even further, we have some general Java options to do that. We can first increase the CPU and RAM on the Lambda, which we can always do with any language. But we can also use a faster JVM. On the last slide, I have a table where I compared running the same application with Java 11 and Java 17. If you look at the numbers, Java 17 is a bit faster most of the time. So just by deciding which Java version we use, we can get a bit better startup time. Then the last option here is basically a combination. I did some experiments on which options work well for reducing the startup time. And in the end, not many things matter. But what matters is class data sharing, which basically caches class information, so it doesn't have to be loaded and processed in the beginning. It's already pre-computed before the cold start. And tinkering with the compiler: we can disable the second-level just-in-time compiler if we want to really focus on startup time.
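As a rough illustration of those two general options, the sketch below only assembles the command lines one might use; it does not launch anything. The flags are standard HotSpot options, but which exact flags were used for the measurements in the talk is not stated, so treat the selection as an assumption: `-XX:ArchiveClassesAtExit` (JDK 13 and later) writes a class data sharing archive after a training run, `-XX:SharedArchiveFile` reuses it, and `-XX:TieredStopAtLevel=1` keeps compilation at the first-level compiler. The file names `app.jsa` and `app.jar` are placeholders.

```java
import java.util.List;

// Assembles example command lines; nothing is executed.
// The flag selection is an assumption, not the speaker's exact setup.
final class StartupFlagsSketch {
    // Training run: write a dynamic class data sharing archive on exit (JDK 13+).
    static List<String> trainingRun(String archive, String jar) {
        return List.of("java", "-XX:ArchiveClassesAtExit=" + archive, "-jar", jar);
    }

    // Later runs: reuse the archive and stop tiered compilation at level 1 (C1 only).
    static List<String> fastStart(String archive, String jar) {
        return List.of("java",
                "-XX:SharedArchiveFile=" + archive,
                "-XX:TieredStopAtLevel=1",
                "-jar", jar);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", trainingRun("app.jsa", "app.jar")));
        System.out.println(String.join(" ", fastStart("app.jsa", "app.jar")));
    }
}
```

Stopping at the first compiler level trades peak throughput for faster startup, so it fits short-lived functions better than long-running services.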
And then there are other, more magical options, which can even reduce performance but reduce the startup time almost to zero. We can compile the code with GraalVM to a native binary, which runs the application almost instantly. Or we can use CRaC, which is the Coordinated Restore at Checkpoint mechanism. The next talk will also be about it. And what is also nice is that AWS Lambda has integrated that, basically, in one of their Java runtimes. It's called SnapStart. So you can get it for free, but only with Java 11. Hopefully Java 17 support will be coming soon. And this works in a way that, at build time, you can store a checkpoint of your application with all the memory, all the information. It's basically like hibernation: you can hibernate your application. And then it's started again and again from that checkpoint, and cold start and warm start in that case basically don't make a difference, because they start from the same point. That's all from me. If you have any questions, let me know. Thank you for watching. |
FireCRaCer: The Best Of Both Worlds |
So, hi, I will start right away because my talk is quite packed. I'm Volker Simonis, working for Amazon in the Amazon Corretto team. My slides and the examples are on GitHub. I will show this link one more time at the end of the talk, so you don't have to copy it now. I am a principal engineer in the Amazon Corretto team and have been working on OpenJDK for more than 15 years. I was with SAP before, also for more than 15 years, and I have various duties in the OpenJDK and the JCP. So let's get started with Firecracker. Firecracker is a minimalistic virtual machine monitor. It's KVM backed, and it only supports a limited set of devices, basically block and network devices which are virtualized through virtio, plus a vsock and a serial device. That makes it very fast and also very secure, because it doesn't support any exotic devices like, for example, QEMU does. It has a REST-based configuration, and it's completely written in Rust, which also makes it kind of safe. It was forked from Google's crosvm, and nowadays it's based on the rust-vmm library, which is like a base library for virtual machine monitors, and I think that's also used by crosvm meanwhile. It supports a microVM metadata service, which is basically a JSON store where you can share data between guest and host. With full virtualization it's not easy to exchange data between guest and host, because all the guest applications run on their own kernel, and with this metadata service, for example, you don't need a network connection between host and guest. And then the Firecracker process itself supports sandboxing in addition to the security provided by KVM: there is a jailer utility which basically places the Firecracker process on the host into an additional cgroup, chroot and seccomp environment. It's all open source, Apache 2 licensed, and it's the technology behind AWS Lambda. So every Lambda runs in its own Firecracker virtualized container. So here's just a picture of what I've just told you.
So we have the kernel with KVM at the bottom, and then we have the Firecracker process, which has a thread for each vCPU that you configure in your guest. Then it has a special thread to handle I/O and an API thread, which is low priority, to handle the REST requests. It boots the guest kernel, which has the virtio devices, and the VMM thread handles these virtio queues and maps them, for network, to tap devices on the host, and for the block devices, either to a native block device on the host or to a file which is exported as a block device to the guest. And then you can run your application in the guest, and you can run as many guests as you want. It's basically only limited by your amount of memory, and the overhead of Firecracker is just about 50 megabytes per, no, it's less, we will see, it's very small. So let's go to a demo. So I have to truncate the file. Here we just start Firecracker. We specify the API socket over which we communicate, we have a log file, the log level info, and the boot timer to see the boot time. And now from another terminal we start to configure this with JSON data, as I told you before. So we configure two vCPUs and 512 megabytes of memory.
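The configuration step just described, two vCPUs and 512 megabytes of memory sent as JSON to the API socket, can be sketched in Java as follows. This is an illustration, not the script from the demo: it assumes Firecracker's `PUT /machine-config` endpoint with the `vcpu_count` and `mem_size_mib` fields, builds the raw HTTP request by hand, and only sends it if a socket path is given as the first argument (the class and method names are made up, and Unix domain sockets need Java 16 or later).

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; builds a raw HTTP request for the Firecracker API socket.
final class FirecrackerConfigSketch {
    // JSON body for the machine configuration (field names as assumed above).
    static String machineConfigJson(int vcpus, int memMib) {
        return "{\"vcpu_count\": " + vcpus + ", \"mem_size_mib\": " + memMib + "}";
    }

    // Minimal HTTP/1.1 PUT request with a JSON body.
    static String httpPut(String path, String json) {
        int length = json.getBytes(StandardCharsets.UTF_8).length;
        return "PUT " + path + " HTTP/1.1\r\n"
                + "Host: localhost\r\n"
                + "Accept: application/json\r\n"
                + "Content-Type: application/json\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + json;
    }

    // Sends the request over a Unix domain socket and returns what one read yields.
    static String send(String socketPath, String request) throws IOException {
        try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX)) {
            channel.connect(UnixDomainSocketAddress.of(socketPath));
            ByteBuffer out = ByteBuffer.wrap(request.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }
            ByteBuffer in = ByteBuffer.allocate(4096);
            int n = channel.read(in);   // a single read is enough for a sketch
            return n <= 0 ? "" : new String(in.array(), 0, n, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String request = httpPut("/machine-config", machineConfigJson(2, 512));
        System.out.println(request);
        if (args.length == 1) {          // optional: path of the API socket
            System.out.println(send(args[0], request));
        }
    }
}
```

Without an argument the program only prints the request, so it can be tried without a running Firecracker process.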
I have here a root file system, an ext4 root file system, and a freshly compiled Linux kernel. So I will now use another REST command to configure the Linux image which will be booted, and I pass quite a lot of kernel arguments. It's mostly to switch off devices which we don't need anyway and which are unsupported. And we define the init script to just run bash, so the init process will be just a shell. Then we finally have to define a root file system. That's our ext4 file which I showed you before. And now that we've configured everything, we can just start the virtual machine, again with a JSON request. When we go back into our window, we see that the virtual machine has now been started, and it took about 200 milliseconds to start bash. And it's a fully configured Linux. The image was assembled from an Ubuntu 22 image, and the kernel I've compiled myself. You see we have two CPUs and about 512 megabytes of memory. If we exit the shell, it will just reboot, because that was our init process. Of these 200 milliseconds which it takes to boot, the serial device alone took about 100 milliseconds. So if you take that away (usually in production you don't need the serial device), it boots in 100 milliseconds, and that's on my laptop. Okay, so a very quick comparison of Firecracker and Docker. Firecracker is fully KVM virtualized, Docker has only cgroup and namespace isolation. The good thing about cgroup and namespace isolation only is that Docker images run on the same kernel, so they can do copy-on-write and page cache memory sharing. So if you run many of them, they are denser, whereas if you run several Firecracker images, they cannot directly share memory, so you have to use ballooning devices, for example, in the guest to give back memory to the host. On the other side, that's much more secure, because every container has its own memory and its own kernel. And Firecracker has snapshot support to checkpoint the whole container, with the kernel and everything together, and Docker can use
CRIU, checkpoint/restore in userspace, to do the same thing, basically to serialize a Docker container with all its processes to a file. We will see examples of that now. So now, what are CRaC and CRIU? As was mentioned before, CRaC stands for Coordinated Restore at Checkpoint. That's a new project in the OpenJDK, and it has basically three points which are important. The first one is to create a standard checkpoint/restore notification API, because many applications are not aware of being cloned, and there is state, security, time, all this kind of stuff which an application might want to react upon, not only when checkpointing and restoring, but especially when cloning the application. Think, for example, of an application which logs to a file. You checkpoint it and restart two clones, and they both write to the same file. They will usually corrupt the file. So you have to take some measures if you run many clones in parallel and the application is not prepared for that. CRaC is currently not part of an official OpenJDK release. It's still mostly a research project in the OpenJDK. But you can already make your application ready for CRaC by using the org.crac API that's available on Maven Central. It basically wraps the jdk.crac namespace, which is currently in the CRaC repository in OpenJDK, but if it finds javax.crac, once that should become available, it will switch to that. It also offers the possibility to pass a custom implementation through a system property. And then finally, what makes CRaC interesting for many people to experiment with is that it basically integrates with CRIU. It has a copy of CRIU packed with the CRaC distribution, so you can easily checkpoint your Java process and restart it. And as I mentioned before, CRIU is checkpoint and restore in userspace. That's an old Linux functionality which allows you to serialize a single process to the file system. It uses the kernel's cgroup freezer to
freeze the processes or process tree, and then writes all the memory to the disk and so on. Still, CRIU has some issues, because it has to look at all the open file descriptors, shared memory segments, stuff like that, which might not be available again when you restore the image, whereas Firecracker, as I said before, restores the whole kernel with all the file systems, everything in place, so it's much, much simpler from that perspective. So let's do a quick demo of CRaC. I have here OpenJDK 17 with the CRaC extensions, and then you simply pass the option -XX:CRaCCheckpointTo, that's a file, and this is just the Petclinic app, the Spring Boot Petclinic example application. I modified it to register the org.crac callbacks, as I said. You can see here it's registered to org.crac. And now that I've started it, I can use jcmd to checkpoint it. So I send it a checkpoint command, and as you see, just out of the box it didn't work. It shows some exception, because it found, for example, that the port 8080 is open, and this uses a vanilla version of Tomcat which is not implementing the CRaC callbacks. But that's not that bad. It has a developer option to ignore these exceptions, so for this simple case it will probably work. So let's try it. Start it one more time, prepare the checkpoint here, and let's wait until it becomes ready. And now checkpoint it. You see we also logged the resources. You see there were about 10 file descriptors, and most of them were okay, because the CRaC-modified VM already knows a lot of the file descriptors the VM itself is using, for example for the JAR files it has opened or for the module files, and it closes them by itself without the need to register anything. So the checkpoint worked, and what's interesting here is that before checkpointing, it calls the listener, the handler I installed in my Petclinic application, so I could do additional stuff before checkpointing. And now we can just restore this frozen process and you
see it starts instantly. It calls the after-restore hook I have registered, and we can send a curl request on 8080, and yeah, it basically still works. So that's nice. Let's go further. So now FireCRaCer. That's basically a combination of Firecracker and CRaC. I found it somehow funny that the words are so similar, so it's a play on words, and in my opinion it's the best of two worlds to combine these two. Currently, as I said, the CRaC project is based on CRIU, but I think it might be interesting to add support for Firecracker as well, and I'm currently working on that. With Firecracker you can basically checkpoint a plain JDK, even if it's not modified by CRaC, because, as I said, there is no need to worry about file descriptors. One issue with Firecracker, as I said before, is that you cannot trigger the checkpoint from Java. The CRaC implementation in OpenJDK can checkpoint itself, because CRIU is running on the same kernel as the Java application, so Java just calls CRIU via JNI and checkpoints itself. That's obviously not possible in Firecracker, because you cannot escape from the guest. That's the whole point of running it in a fully virtualized guest. So we need another means of communication, but that's not that complicated. It offers maximum security and speed. And as I said before, there is no copy-on-write memory sharing, but you can use ballooning and same-page merging, kernel features which also have their pluses and their drawbacks, but these are things to investigate. So let's do a Firecracker demo with Java now. To not bore you more with all these JSON requests, I've written a shell script which basically does all that in one script, and instead of calling bash it just starts Java as the init process. We can now submit the request, and you see it's working. Here is the request. I have still registered these callbacks, although I'm running on a vanilla JDK, by using the org.crac library, so they are empty, they won't do anything, and I can
now snapshot Firecracker. You see that's also quite quick. Firecracker is not resumed automatically, so I have to kill it manually. And now if I restart from the snapshot, you will see it takes just a few milliseconds to restart the whole image, and again I can curl into it and it works. You see the hooks are not being called, because there is no real CRaC implementation in the back in this case, but checkpointing for Java itself works. And it's also easy to run a second clone. Now, obviously we cannot run it in the same namespace, because it would use the same IP address as the first version, so we start it in a network namespace. So minus n zero is just to create a new namespace for the clone, and you see it uses ip netns exec to execute Firecracker, but it restores just as quickly. The initial IP address of the process in this namespace is now mapped to a different IP address on the host, but you see it's still working. So the guest still has the same IP address it had in the first place, it's just running in its own namespace, and inside the guest, again, Tomcat is running on the same port, all no problem. So we just kill the first instance and we kill the second instance. How much time do I have? Oh, okay. Okay, so just a few words. I realized that the talks which are rated highest usually show some animation, so I decided to do an animation, because usually I only show console demos. So a quick introduction: userfaultfd is a possibility to handle page faults from user space, and Firecracker offers the possibility, instead of mapping the image file right into Firecracker's memory, to use an external userfaultfd daemon. And if we write the userfaultfd daemon ourselves, we have the possibility to follow, page by page, which addresses get loaded at restore. I found that interesting, so I created an animation for that,
and for that we restart our Firecracker service with native memory tracking enabled, and we now ssh into our Firecracker guest, where Tomcat is running, and just call jcmd for the native memory details and put that into a file. We do the same thing with the pmap information; this is just a shell script inside the guest which basically prints all the virtual-to-physical mappings for all processes into a file. Now we can start the visualizer; it takes the logs of the userfaultfd daemon and the NMT output and the native mappings. What you see here is basically the physical memory layout of the guest: it starts at memory page zero and ends at one gigabyte, and every square is a four-kilobyte page. If you go to the Java process, for example, you see the dark ones: these are the pages in the RSS of the Java process. The blue ones are occupied by the Java process but are also in the page cache, so that's probably a file, for example a shared class archive. When you look at the NMT output, we see, for example, for the classes, you probably cannot read it, it says virtually 69 megabytes, RSS is 60 megabytes, and the userfaultfd daemon loaded about 10 megabytes of it. And here's the animation I promised you. This is how the pages got loaded when we did the first curl request on a resumed image. The yellow ones are all the pages which were loaded, and the orange ones belong to the virtual memory region I have selected here; so, for example, all the orange pages are the parts of the class space which got loaded for the first request. So there is a lot of room for more investigation. It would be nice to compact this more physically, because you want to prefetch the things which get loaded, especially if you download your images from the network, for example, but the
problem is that the virtual address space is contiguous but the physical pages are not, and I'm trying to look into possibilities to do that. So that's it, thank you. Thank you very much. There's about 30 seconds for questions. Has anyone got a question? I have a question regarding, when you showed the CRaC implementation, there was an implementation that put into the... Yes, unfortunately there is no time in 20 minutes to show that, but you can obviously use the current CRaC implementation inside Firecracker. Use jcmd, and instead of CRIU there is a backend called pauseengine; that's just a small program which, instead of calling CRIU, just suspends the whole process, and then you can send it a signal to restore it. So with Firecracker you basically checkpoint with the pause engine, then do the Firecracker snapshot, then restore Firecracker, and then just do an ssh with a kill signal on the process and it will restart. That's one possibility. Another one: I wrote a JVMTI agent which basically does the same thing even without CRIU. It suspends all threads, it calls System.gc(), and then waits on a port, so you just ping it with telnet or whatever, and it even calls the hooks, by using the possibility to plug in a custom implementation with a property; I tell org.crac to use my CRaC implementation to call the hooks. So that all works. It's in the repository; I had a resource slide which I didn't show, it has all the links. |
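A minimal sketch of the before-checkpoint / after-restore hook idea mentioned throughout this talk. The `Hook` interface and the `simulateCycle` method are hypothetical local stand-ins written for illustration; they are not the org.crac API, and the "snapshot" step is only recorded as an event, since the real snapshot is taken from outside the JVM.

```java
import java.util.ArrayList;
import java.util.List;

class CheckpointHooksSketch {
    // Local stand-in for a checkpoint/restore callback pair.
    // NOT the org.crac API; names and signatures here are hypothetical.
    interface Hook {
        void beforeCheckpoint();
        void afterRestore();
    }

    private final List<Hook> hooks = new ArrayList<>();
    private final List<String> events = new ArrayList<>();

    void register(Hook hook) { hooks.add(hook); }
    void record(String event) { events.add(event); }
    List<String> events() { return events; }

    // Simulates one cycle: notify hooks, request a GC (as the agent described
    // in the Q&A does), pretend a snapshot happened, then notify hooks again.
    void simulateCycle() {
        for (Hook h : hooks) h.beforeCheckpoint();
        System.gc(); // only a hint to the JVM
        events.add("snapshot");
        for (Hook h : hooks) h.afterRestore();
    }

    static List<String> demo() {
        CheckpointHooksSketch vm = new CheckpointHooksSketch();
        vm.register(new Hook() {
            @Override public void beforeCheckpoint() { vm.record("beforeCheckpoint"); }
            @Override public void afterRestore() { vm.record("afterRestore"); }
        });
        vm.simulateCycle();
        return vm.events();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a real checkpoint backend the two callbacks would typically release and re-acquire external resources; with a whole-VM snapshot, as the talk notes, they can stay empty.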
Classics Never Get Old: Two Easy Pieces For GraalVM |
Just two classical optimizations that will help a modern but mature virtual machine, the one that powers native images. And why is it important? Well, and who am I? My name is Dmitry Chuyko. I work at a company named BellSoft, which actively participates in the OpenJDK community, and we release our own JDK distribution, which you have probably met if you have ever built a Spring Boot container with the default buildpack. So it's in there. And now Spring Boot, since version 3, supports containers with native images. It can be built as a native image, and if you do that, the compiler being used is Liberica Native Image Kit, which is the BellSoft distribution of GraalVM. So that's another project that we participate in. GraalVM itself can be seen as different things; there are at least two major modes that we can observe. It can run as a JIT, where the compiler is Graal, or we can build a native image with static compilation, and it will use a special virtual machine, Substrate VM, and here it's different from the traditional way of how we run Java. Another interesting and peculiar point here is that it is written in Java. It is a complex project, but most of the code is Java, and this is beautiful: you have a virtual machine and a compiler for JVM languages, and Java in particular, written in Java. So if you look at Java itself, why is it so beautiful? Well, not so beautiful compared to Kotlin, as we know, right? But still, both Java and Kotlin share those concepts. From the very beginning, there is a way to write correct parallel programs. To write parallel programs, we need some means of synchronization, to orchestrate our threads if we share data, and most typically we do. And it's also a managed runtime, where we don't have to worry that much about freeing memory, because we have garbage collection and garbage is collected for us. Our programs can have a memory leak, but you have to work hard to get one.
And having that native image implementation sometimes makes our final binaries very performant. Of course, we have an instant startup; it was mentioned today several times. But we can also have very good peak performance. In certain cases; that's not a rule, but it can happen, like it happens here on this plot. That's just a simple Spring Boot application and we just ping the same endpoint. And here the native image works better, and it also warms up instantly and has very good latency. This is a small service; it takes a small amount of memory, a very small heap, and it also has low latency. And under the hood it uses, well, Serial GC, and we'll talk about that later. Well, what about the relationship between GraalVM and OpenJDK? We're here in the Friends of OpenJDK room, and Graal was integrated as an additional experimental compiler in JDK 9. It has been removed from recent JDKs, but what's left over? An interface to plug it in. So now there is going to be a second attempt to do that. Here on the slides it's mentioned that there is a discussion about a new project, Galahad, but last week there was already a call for votes in OpenJDK to start the project of bringing the sweetest parts of this technology back into OpenJDK. It's something that is happening right now. So, that default garbage collector that sometimes shows very good latency, even compared to Parallel GC or G1 in HotSpot, well, on small heaps. It's a kind of garbage collector we can easily understand: a generational stop-the-world collector. Here only one survivor space is shown, but actually it's 16 by default. Anyway, we stop all our application threads and we collect garbage in a single thread. So this is a kind of basic garbage collector, right? But on the other hand it's reliable and it's very effective, especially if you have only a single core available.
So you see the problem. We have a CPU which may be able to run many threads, but we run only one, at least for garbage collection. Now, garbage collection can take significant time during our application's execution; that's obvious. What would we do? Of course, we would like to do exactly the same thing, but in parallel, to decrease the time garbage collection takes, to reduce the garbage collection pause. It's still a stop-the-world pause, but we reduce it because we process data with multiple threads. So that's the idea of parallel garbage collection. The idea is not new, but surprisingly, this modern runtime doesn't have it yet. So we decided to implement it. It's still under review and some implementation details change, but the idea is very simple. You just pass the garbage collector selection during the creation of your native image. For instance, if you use some Maven or Gradle configuration for your Spring Boot container, you can also do that. And then you have some knobs at runtime, which you can also tweak when you run your application. And, well, you enable that implementation. I'll show some performance results later, but basically the implementation itself can be seen as a change in a big Java program, which GraalVM is. There are now two GC interfaces and implementations. And this functionality just reuses existing things in a, I would say, smart way, keeping only what is about the parallelization as new code; everything else is reused from Serial GC. Basically there's the problem of how we synchronize and share the work, because parallel garbage collection threads have the same problem: they work on the same data, so they have, or may have, contention. So we need to share in some smart manner. It's implemented by dividing the work into chunks. Every thread operates on its local memory, and that's a chunk of memory of one megabyte.
So if we need extra memory, like when we scan objects and fill up the set of data that we operate on, then we have an extra chunk. We can just put it aside so someone else can pick it up. So there's a stack that contains the chunks of work, and when its work is finished, a thread just takes the next chunk of work. There may be a situation when several threads try to copy, to promote, the same object. This is actually solved very simply: each of them reserves some space for the object and then tries to install a forwarding pointer using an atomic operation. As this is an atomic operation, only one thread succeeds, so the others just roll back, and this is a lightweight operation. Again this is Java. This is not a strict AML, sorry, but still all existing places that manage memory were reused without changing the architecture of Graal itself. There are already possibilities to add garbage collectors, so if you want to implement one, it's not that complex. The major problem is to be correct when you deal with memory, when you deal with concurrency, and when you inject your code into this virtual machine, because it's all declarative magic that requires you to be careful. Well, some performance results. With relatively large heaps with Serial GC, you can have pauses of several seconds, which is long, of course. And it makes a big difference whether you have a two or three or four second pause or you decrease it by one second. That's already possible with this implementation; that's the order of the improvement. With another benchmark, HyperAlloc, you see that the latency of pauses can be halved. Those pauses are not that big, and we have frequent collections here; the x-axis is the epoch, so each point is a garbage collection, and the y-axis is time in, I believe, milliseconds. Well, that's Parallel GC. So we can obviously improve many applications and many installations where we have an option to use several CPUs.
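The forwarding-pointer race described above can be sketched in plain Java, assuming a toy object model: `Obj`, `promote`, and `raceAgreesOnOneCopy` are hypothetical names, and a Java reference stands in for the raw forwarding pointer a real collector would install in the object header. This is an illustration of the compare-and-set idea, not the GraalVM implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class ForwardingDemo {
    // Hypothetical heap object: the forwarding slot stays null until promotion.
    static final class Obj {
        final AtomicReference<Obj> forwarding = new AtomicReference<>();
        final String payload;
        Obj(String payload) { this.payload = payload; }
    }

    // Returns the single winning copy, no matter which thread calls this.
    static Obj promote(Obj original) {
        Obj existing = original.forwarding.get();
        if (existing != null) return existing;          // someone already promoted it
        Obj copy = new Obj(original.payload);            // speculatively reserve space and copy
        if (original.forwarding.compareAndSet(null, copy)) {
            return copy;                                 // our copy won the race
        }
        return original.forwarding.get();                // we lost: drop our copy, use the winner
    }

    // Lets several threads promote the same object and checks they all agree.
    static boolean raceAgreesOnOneCopy(int threads) {
        Obj original = new Obj("data");
        Obj[] seen = new Obj[threads];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen[id] = promote(original);
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        for (Obj o : seen) {
            if (o == null || o != seen[0]) return false;
        }
        return seen[0] == original.forwarding.get();
    }

    public static void main(String[] args) {
        System.out.println("all threads agree: " + raceAgreesOnOneCopy(8));
    }
}
```

The losing threads simply abandon their speculative copy, which mirrors the cheap rollback the speaker describes.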
If we use one CPU, of course, we won't see much difference. There is some increase in memory used for service needs, but that's kind of moderate. So, other parts of this complex system. I mentioned synchronization, and, well, synchronization is useful, but it has tradeoffs. If we implement synchronization, we need to spend CPU resources to put aside threads that don't get the resource. We need to stop them, to queue them, to manage those queues, to wake them up, to involve the operating system in that process. So that's not cheap, but there are situations that, that's another queue, right? And that even influences the design of the standard library, because we all know StringBuffer and StringBuilder, right? One class appeared because, well, the other one wasn't very pleasant in terms of performance. Yeah, we need it sometimes, but in many cases we need a non-synchronized implementation. Say, like, Hashtable and HashMap; who uses Hashtable, right? But it's very well synchronized. But not all classes that have any synchronization in them have twins without synchronization. That would make no sense, right? So there's a well-known technique for dealing with the case where accesses to our data structures, to our classes, are mostly sequential, when at any point in time only a single thread owns and operates on an object. It's called biased locking or thin locking. Why is it simpler and more lightweight? Because we don't want to manage all the complex cases. We know that we are in a good situation. And if we're not, yes, we can fall back, and that's called inflating the monitor. Well, it existed in OpenJDK for ages, and it has been removed from OpenJDK. When it was deprecated, no one noticed, I believe, because, still, are there that many people using something newer than JDK 11? Well, some consequences were noticed probably too late. Well, what are the reasons, first of all?
What are the reasons to remove biased locking from OpenJDK, from the HotSpot JVM? Well, to ease the implementation of virtual threads, to deliver Project Loom, to decrease the amount of work there. So there are some consequences here that were initially undiscovered. In certain cases, things like input streams can be slowed down; here it's 8x or something. That's enormously slow. And for GraalVM, there is a mode where you say during static compilation: OK, this native image doesn't try to work with many cores, it's a single-threaded program. So it's simple, and it works really better in these circumstances. So there is an optimization for that, but you have to know it in advance, when you compile your program. And there is, of course, a runtime option that supports all kinds of situations, and it's complex. So the missing part is in the lower left corner: being able to dynamically handle a sequential access pattern. So we've implemented quite a classical approach to this problem, which brings thin locking to GraalVM. The initial idea was to operate on the object header, which already contains a pointer to a fat monitor object, but it can be treated as a word as well: we can atomically access it and put some information there. The probably close-to-final implementation that we have right now still, or again, uses a pointer, because it turned out to be not so easy to keep correctness across the whole VM with some memory that you treat as a pointer or as a word depending on the situation. Well, anyway, inside that part of the header, or inside that special object, we can have 64 bits of information. We can mark it as a thin lock, this is a flag, and we can do it atomically. We can keep the ID of the owner thread, which we can obtain when we work with threads. And a count of the recursive locks that we currently hold.
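A minimal sketch of such a 64-bit thin-lock word, under stated assumptions: the bit layout (one flag bit, 47 bits of owner ID, 16 bits of recursion count), the class name, and the `Result` values are all invented for illustration and are not the layout used in GraalVM. The point is only that flag, owner, and count fit in one word that is claimed with a single compare-and-set, and that a saturated count forces a fallback to a fat monitor.

```java
import java.util.concurrent.atomic.AtomicLong;

class ThinLockSketch {
    // Assumed layout: bit 63 = thin-lock flag, bits 16..62 = owner id, bits 0..15 = count.
    static final long THIN_FLAG = 1L << 63;
    static final int COUNT_BITS = 16;
    static final long COUNT_MASK = (1L << COUNT_BITS) - 1;
    static final long UNLOCKED = 0L;

    enum Result { ACQUIRED, CONTENDED, NEEDS_INFLATION }

    private final AtomicLong word = new AtomicLong(UNLOCKED);

    // ownerId must fit into 47 bits in this sketch.
    static long encode(long ownerId, long count) {
        return THIN_FLAG | (ownerId << COUNT_BITS) | count;
    }

    static long owner(long w) { return (w & ~THIN_FLAG) >>> COUNT_BITS; }
    static long count(long w) { return w & COUNT_MASK; }

    Result lock(long threadId) {
        long w = word.get();
        if (w == UNLOCKED) {
            // Uncontended fast path: one atomic operation claims the lock.
            return word.compareAndSet(UNLOCKED, encode(threadId, 1))
                    ? Result.ACQUIRED : Result.CONTENDED;
        }
        if (owner(w) != threadId) return Result.CONTENDED;          // a real VM would inflate here
        if (count(w) == COUNT_MASK) return Result.NEEDS_INFLATION;  // count field is full
        word.set(w + 1); // in this sketch only the owner writes while the lock is held
        return Result.ACQUIRED;
    }

    boolean unlock(long threadId) {
        long w = word.get();
        if (w == UNLOCKED || owner(w) != threadId) return false;
        word.set(count(w) == 1 ? UNLOCKED : w - 1);
        return true;
    }

    public static void main(String[] args) {
        ThinLockSketch lock = new ThinLockSketch();
        System.out.println(lock.lock(7));   // first acquisition by thread 7
        System.out.println(lock.lock(7));   // recursive acquisition
        System.out.println(lock.lock(9));   // another thread finds it held
    }
}
```

Both the contended and the saturated case are left as return values here; in a VM they would be the points where the monitor is inflated.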
That, by the way, means that after a certain number of recursive locks we have to inflate the monitor, because we can't store more information in that part of the word. Yeah. So again, it's a pure Java implementation where we work with some atomic magic and we update this information. What have we got? The most recent numbers are even better. We see the effect on exactly that example, the streams: we can speed them up. And even in a very nano-benchmark kind of measurement, you also see the improvement. And even in the multi-threaded case, there is now no difference from the original. |
AsyncGetStackTrace: The Improved Version Of AsyncGetCallTrace (JEP 435) |
Yes. Hi. I'm Johannes Bechberger. As I was already introduced, I work on the SapMachine team. It's another great distribution of the OpenJDK. Since the beginning of last year I've worked on my new project, AsyncGetStackTrace. It's essentially an improved version of the AsyncGetCallTrace API. I think many of you probably don't know this API; I didn't know it before I started this project. But essentially, it's related to profiling. So how does profiling work? Some of you might have already seen flame graphs. If not, there are some other talks on profiling in the Mozilla devroom that you can look up. But essentially, with profiling you want to see which parts of your application are slow; for example, here I can see that some JDK stuff is probably a thing that takes time. How it works under the hood is that we have a selection of threads, for example, here, five threads. Then we randomly select three threads, because we usually cannot sample all threads; it would be too costly. Then we pre-allocate some traces; that's just a data structure where we store the stack frame information. And then we ping the first thread, and with ping I mean we send it a signal. And then, in the signal handler, we walk the stack, because in the signal handler the thread is stopped, so we can walk the stack. We do this with thread two and with thread five, and we have the traces. Then we store them, and then we do some post-processing. That's essentially how async-profiler works, just in a loop. And for this we need an API. It's called AsyncGetCallTrace. We could use the JVMTI APIs, but they are safepoint-biased: they let the threads wait till they're ready, till they're at a safepoint. But we want to have the call trace at exactly the point where we ask for it. And so I think AsyncGetCallTrace is quite a cool API.
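The sampling loop described above (pick a few threads at random, grab their stacks, aggregate, repeat) can be sketched with public Java APIs. This is only an illustration of the loop: it uses `Thread.getStackTrace()`, which is safepoint-biased, exactly the limitation that AsyncGetCallTrace and signal-based sampling avoid. The class and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SamplingSketch {
    // One sampling round: choose up to maxThreads live threads at random
    // and count the method at the top of each sampled stack.
    static int sampleOnce(int maxThreads, Map<String, Integer> topFrameCounts) {
        // getAllStackTraces() is used here only to enumerate live threads.
        List<Thread> threads = new ArrayList<>(Thread.getAllStackTraces().keySet());
        Collections.shuffle(threads);
        int sampled = 0;
        for (Thread t : threads.subList(0, Math.min(maxThreads, threads.size()))) {
            StackTraceElement[] trace = t.getStackTrace(); // safepoint-biased, unlike ASGCT
            if (trace.length == 0) continue;               // thread finished or has no frames
            String top = trace[0].getClassName() + "." + trace[0].getMethodName();
            topFrameCounts.merge(top, 1, Integer::sum);
            sampled++;
        }
        return sampled;
    }

    // The profiler loop: sample, sleep for the interval, repeat.
    static Map<String, Integer> profile(int rounds, int maxThreads, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            sampleOnce(maxThreads, counts);
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        profile(20, 3, 10L).forEach((method, hits) -> System.out.println(hits + "  " + method));
    }
}
```

A flame graph is essentially this aggregation done over whole stacks instead of only the top frame.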
So how does it work? Here we have the stack, as it is on your system. At the bottom we have the pthread start; it's a Unix system. On top of that we have some Java frames, and then it goes up to the top, to a write-bytes method, because it writes to a buffered output stream. It's essentially hello world, just printing some strings. In the signal handler, we get the top frame; that's where the ucontext from the signal handler points to. And then we do some stack walking, and AsyncGetCallTrace does it for us. Essentially it returns the frames in a preallocated data structure, and the number of frames that we got. It also stores an error code in the number of frames if there was an error. What we get for every frame is the line number. It's called line number, but it's essentially the bytecode index. I don't know, it's historically this way, because this API is from around 2003. And we get a method ID. But we only get this information for Java frames. So what are the problems? Don't get me wrong, it has worked for a long time. But it's unofficial. It was there in 2003 for like three months, and then Oracle, Sun at the time, took it out. It's now just lying around as an exported symbol, but it doesn't have its own header. It's unsupported. So if there's a change in another part of the JVM that potentially breaks it, nobody notices, because there's only a single test, and it doesn't test that much. There's also missing information: it only gives us information on the Java frames of the stack, but not on anything else. And it misses information like inlining, which isn't that great. And so at the beginning of last year I started to work on a new API, because this, I think, is the best we have, and maybe we could do something better. So I started to work on AsyncGetStackTrace. It's now a JEP candidate; it's JEP 435.
So if you want to see the JEP in its entirety, just go to the OpenJDK website or read the blog post for this talk, and you get a picture of what it does. The idea was to create a better API that gives us more information and is far better supported, with lots of tests and with its own header. So again we have our stack, but we now get more information. For example, at the most basic level, we also get the kind of thread that we're running on: is this thread in Java mode, is it in GC mode, what kind of thread is it, which is quite neat. And we get more information per frame. For example, we get the BCI; it's now called BCI because, yeah, it's the bytecode index. We get the method ID. We also get the type: is it inlined, is it native? With native, I don't mean C/C++, but these boundary methods that are declared in Java but whose code is implemented in C/C++. And we also get the compilation level: is it C1-compiled, C2-compiled, or not compiled at all? So this is quite neat, because we get more information. But the cool thing is that we have options now. With this API, we can say via an integer: hey, we want to have non-Java frames, and we also want to walk non-Java threads. That leads to this situation where we also get information on these C/C++ frames, which is quite nice. For these frames, we also get the type, so that it's a C/C++ frame, and we get a program counter. So we can then go back, do some analysis of our own, and use dlsym and the other functions of the dl family to get the method name. And with these options we can also walk non-Java threads, so we see more information. It essentially makes the life of a profiler developer far easier, because we can now just use this API. It will be supported if it gets in. I'm working on lots and lots of tests. And yeah, I hope it gets in.
And as a bonus, I also introduced new methods for OpenJDK developers to walk stacks, because currently the code is spread between a few different places, and some of them are copies of others. So when you change one part, you have to change other parts too. It's essentially technical debt. There were good reasons in the years before, but still, I want to make stack walking easier. The new API that I used in the implementation of my JEP proposal allows us to just give a stack walker some options, like: hey, I want to walk stacks, I want to skip frames, I also want to walk non-Java frames. And I can just iterate and say: give me the next frame. And on this next frame, we can ask for all the information. Is this a Java frame? Is this a native frame? What is its compilation level? This makes it far easier to walk stacks, and hopefully makes it easier to combine all the stack walking, from some error-related stack traces, from AsyncGetCallTrace, and from JFR, using one API. So when you make an improvement in one of these APIs and implementations, you get an improvement in all of them. What I've also done is improve AsyncGetCallTrace, with the help of my colleagues, to be much safer. I wrote checking code that uses SafeFetch, so that it checks a pointer before it dereferences it. So it's far safer. Then, for AsyncGetStackTrace, I did lots of testing; for example, I did some fuzzing. I called AsyncGetStackTrace with a random ucontext, so with randomized frame pointers and stack pointers, and it doesn't crash, for hours on a large machine, which is quite cool. This covers async-profiler's use case, where it modifies the frame and stack pointer, and should alleviate some concerns about when the VM is in an undefined state. It needs a lot of convincing, so I'm still in the process where I have to talk with all the people from Oracle, all the JFR people. It's a long drawn-out process, but I hope I can convince them.
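The internal stack walker described here lives in HotSpot's C++ code and is not shown in the talk. As a loose analogy only, the public `java.lang.StackWalker` API follows the same shape: configure a walker with options, iterate frame by frame, and query each frame. The sketch below uses that public API; the class name and the `describeFrames` helper are invented for illustration, and the compilation level from the talk has no counterpart here.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StackWalkSketch {
    // Walks the current thread's stack with the given options, skips some
    // frames, and describes up to `limit` of the remaining ones.
    static List<String> describeFrames(Set<StackWalker.Option> options, int skip, int limit) {
        return StackWalker.getInstance(options).walk(frames -> frames
                .skip(skip)
                .limit(limit)
                .map(f -> f.getClassName() + "." + f.getMethodName()
                        + " bci=" + f.getByteCodeIndex()
                        + " native=" + f.isNativeMethod())
                .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        // SHOW_HIDDEN_FRAMES additionally asks for frames the JVM normally hides.
        Set<StackWalker.Option> options = EnumSet.of(StackWalker.Option.SHOW_HIDDEN_FRAMES);
        describeFrames(options, 0, 10).forEach(System.out::println);
    }
}
```

Unlike the API in the talk, this one only sees Java frames of the calling thread and cannot be used from a signal handler.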
But clearly, the people on the profiler side are really happy to have this, because it has many advantages for them. And of course, again, testing, because the whole point of this API is that you get more information, but also that it's a better-tested API. Currently I have six tests, and I'm working on more. So I hope that it gets in. Till then, you can see on GitHub there's a draft PR for the JEP; just search the PRs for the draft PR with ASGST in the name. And then you can, yeah, you can follow me on Twitter, or our team at SweetSapMachine. And that's all. Oh, yes, yes, yeah. I'm also blogging, mostly on Mostly nerdless, and all the blog posts I also put on Foojay. You can follow me there and read about all the topics that I talked about today. So, thanks. The question was, can SafeFetch be called from signal handlers, because it uses signals? I think it uses different signals, because I didn't have any problems using it from signal handlers. I have tests: to use AsyncGetStackTrace, you have to use signal handlers, and I haven't seen any problems so far. It's even a bit weird, because in signal handlers you cannot do any malloc, so you have to preallocate, but you can call fork. So it's quite interesting. Any other questions? Does it handle invokedynamic specially, because within that stack you get the whole stack of deciding how to dispatch the call? So the question was, does it handle invokedynamic specifically? No, it is just based on the frame stack walking and the internal mechanism of stack walking. So it doesn't handle it differently than, for example, AsyncGetCallTrace and JFR. Yeah, that's all Java frames, so that's probably fine. Do you have to change the native parts? Or does it work on all platforms? So the question was, does it work on all platforms?
It's known that it doesn't really work on Windows, just because Windows doesn't really have a concept of signals. If you have any ideas for getting something like this to work on Windows, feel free to drop me a message. So no, I didn't have to change any native parts. I had to create some native parts for testing, to modify the ucontext, because this is highly operating-system specific. The changes to OpenJDK as a whole are fairly minimal; they aren't that large, besides passing through some booleans to configure stuff. And the code itself is just a couple hundred lines, so it's also quite simple to understand. And there's a blog post that describes the reasoning behind it. Any other questions? Yes? Is it already in SapMachine? No, it's not yet in SapMachine, because I'm still in the process of testing it. There's of course the pull request; you can already use the JVM when you compile it yourself. I'm in the process of updating my demo repository, which contains a modified async-profiler that uses it, so you can try it out yourself. It should be ready in the next few weeks. It still has some bugs. Yeah. Anything else? Thank you very much. |
Quarkus 101: Intro To Java Development With Quarkus |
Alright, welcome. How do you like our little Duke rock stars here? There are stickers going around somewhere, so you can get some of these stickers. I think we have four different rock stars or something. Anyway, let's talk about Quarkus, obviously. Who am I? I'm a developer advocate at Red Hat. My name is Kevin Dubois. You can find me on Twitter or Mastodon. I know we've already talked in a few sessions today about, you know, traditional Java and the startup time and all that stuff, so I'll do that some more. So we'll talk about traditional Java. So traditional Java is, can I see the little hand? Traditional Java was designed for, let's say, different times, not necessarily for cloud-native workloads. It's designed for long-running applications. And what's important in traditional Java is throughput at the expense of footprint. So the footprint can be quite large, right? You typically have traditional Java applications running on pretty beefy servers. They're designed to be long running, and you have dynamic loading and all that stuff, with mutable systems. But in the cloud-native world, you get your throughput mostly through scaling. Your workloads are ephemeral, which means that, you know, if you think of containers, when you scale up a container, when you start up a new application, those containers are going to start up and then maybe they're going to get rescheduled on a different node. So containers kind of come and go. They're not going to stay around. And if you change something in a container, that change is not going to last, right? Because whatever you change inside that container is going to be gone when the container gets removed. So in that sense, we have to think of Java in a different way. We need to think about its footprint, because we want smaller containers that we can schedule across different servers.
You know, if you are familiar with Kubernetes and clusters, there are usually multiple servers on which it schedules containers. So, you know, we need to be able to handle that. And that's where Quarkus started, was kind of invented, I guess, because it's a framework that uses Java. But, you know, we call it supersonic because it starts up very fast, and subatomic because it's very small, like, subatomic, smaller than an atom. And it's still Java. So if we look at Quarkus in terms of startup time and in terms of memory usage, you can see here, this is a test that they did with a relatively small application. Running on a traditional cloud-native stack, it took 136 megs of memory. Running the same application with Quarkus, you already get, you know, a pretty good gain in memory, right? And that's running on the JVM. So it's the exact same application running on the JVM. And then, compiled down to native with GraalVM, you get, of course, even less memory usage. And you can see here, too, Quarkus starts up quite a bit faster than a traditional cloud-native stack, which is ideal when we're talking about cloud native: we're talking about containers, about serverless, where we need to start up really fast so we can react quickly to changing loads. So startup time is one thing. There's also the warmup issue. I don't know if issue is the right word, but when an application starts up, it takes a while with Java before you get your maximum throughput as well. So here we can see that for a traditional Java application, this is the point, I think it's at like 13 seconds or something, where it's actually able to work at maximum throughput; for this particular use case, it needed a certain amount of throughput to be able to handle the load. And then you can see here, with Quarkus, it gets there quite a bit faster.
Now, Quarkus isn't just about fast startup time, it's not just about memory, but it is kind of a nice feature of Quarkus. So if we think of containers and Kubernetes nodes, traditional Java applications, running on EAP or WebSphere or whatever, running on a Kubernetes node, you can see they take up quite a bit of space. Let's say that in this case only four instances of the application can run, which isn't so ideal, because if one of the pods, one of the containers, goes down, that means you lose 25% of your capacity, right? If you look at Quarkus on the JVM, you already have quite a bit more density, which means that if one of these guys goes down or needs to be rescheduled or whatever, you still have, you know, what is it, maybe 70% or something, that's still up. And, you know, we can compare that to Node.js or Go or something, where Go has quite a bit smaller footprint, and with Quarkus native we can actually be very comparable to Go, which is nice, because that means that we can use our Java skills and not have to, you know, change languages and reinvent the wheel, and still get all the benefits in the cloud-native world of having fast startup and everything. So how does that work? With a traditional Java application, basically, build time is when you do your packaging, and then as it starts up, it loads config files, does classpath scanning, and builds kind of its model of the world and everything, but this is when it starts up. So if you think of containers, again, that means that this all happens when the container starts up, and that takes a while. With Quarkus, what we try to do is, instead of doing all that at runtime, at startup time, we try to do all of this, or as much as we can, during build time, before the application actually gets packaged, which means that at runtime we have a lot less to do, right? So it starts up quite a bit faster. So that's kind of the cool thing about Quarkus.
And then, so you can use Quarkus on the JVM or you can compile it down to native, of course, just like most other frameworks. But there are some cool things about native compilation with Quarkus as well that we'll get into in just a second. So this is my favorite part about Quarkus. I mean, yes, it's nice that it starts up fast. It's nice that it has a small memory footprint. But what's really cool about Quarkus is that it has a bunch of different ways of making the experience of working with Java and Quarkus a lot more fun. So, of course, it's based on standards. Quarkus uses your Java EE standards, the Java standards, it uses Vert.x and all that good stuff. So if you're used to that, hey, great. You basically already know Quarkus for 99%. What's really cool with Quarkus is that there's this dev mode. Basically, you can start Quarkus on your local machine in dev mode. It's going to start up, and it's going to just keep checking to see if you make changes in the classpath. And so every time you make a change, it's going to automatically reload when you, let's say, go into your browser or wherever and make a new request. It's going to automatically reload your application, so you don't need to recompile and redeploy every time you want to test something. Quarkus does that automatically, so you can just go to your browser, hit refresh, and it's there. So make a code change, refresh, it's there. Which, if you're a developer in some other language where that just happens, is not so cool. But in Java, that's pretty cool, right? So we've got our little guy here that says, wait, so you just save it and your code is running, and it's Java? And the guy says, I know, right? Supersonic Java. So that's pretty cool. Another cool thing with Quarkus is that it has this concept of developer services. So who knows Testcontainers?
So basically it uses Testcontainers, built into Quarkus. So let's say that I'm developing an application and I'm adding an extension to use a Postgres database or a Kafka topic or something. Well, of course you have to have Docker or Podman or something running on your local machine. But Quarkus will look and see, hey, you've got this dependency on a database. Do you have something configured on your local machine? Do you have a database running on your local machine? Is that configured in your application properties? If not, no worries, I'm just going to start up a container with that dependency, for example a Postgres database, and wire that up. So it's going to set the configuration so that it connects to that database automatically. And then you can even go and see what exactly that configuration is and copy it down. Anyway, so that's the developer services. So Kafka or, yeah, there's a whole bunch of different developer services that you can use just out of the box, which is pretty nice, because otherwise having to configure a database on your local machine or, Lord forbid, a Kafka instance with all your ZooKeepers and all that stuff, that's not so easy. So you have live coding, and you also have continuous testing. It's kind of the same concept. If you have unit tests, you start your continuous testing, and every time you make a code change, it knows, hey, this class is related to this unit test, so I'm going to rerun this unit test every time I make a change here. Or vice versa: if you're making a change in a unit test, it knows this is what it needs to rerun. So it gives you quick and immediate feedback as you're developing, which, again, is pretty handy. It also has a dev UI. So it has a UI in your browser where you can go and look at all these different developer services that are running and what Quarkus is doing.
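The continuous testing idea above, rerunning only the tests related to a changed class, can be sketched in a few lines of plain Java. This is only an illustration of the selection step, not how Quarkus implements it; the class name, the method name and the mapping data are all hypothetical.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A minimal sketch, assuming we already know which tests exercise which classes.
// How that mapping is obtained is out of scope here.
final class AffectedTestsSketch {

    /** Returns the tests to rerun for a set of changed classes, in sorted order. */
    static Set<String> testsToRerun(Map<String, Set<String>> testsByClass,
                                    Collection<String> changedClasses) {
        Set<String> result = new TreeSet<>();
        for (String changed : changedClasses) {
            // Classes without known tests contribute nothing.
            result.addAll(testsByClass.getOrDefault(changed, Set.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> testsByClass = Map.of(
                "GreetingService", Set.of("GreetingServiceTest", "GreetingResourceTest"),
                "OrderService", Set.of("OrderServiceTest"));
        System.out.println(testsToRerun(testsByClass, Set.of("GreetingService")));
    }
}
```

Changing one class triggers only the tests mapped to it, which is what keeps the feedback loop short.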
So again, I was talking at the start about Quarkus doing some optimization, right? So during compilation time. In the dev UI, you can actually see what it's doing, how it's optimizing and what it's going to remove from the classpath, because Quarkus does some introspection to make sure that, hey, this is used, or this actually isn't used by your code, so I'm going to remove all that from the compilation. There's a Quarkus CLI, which, again, is not super crazy, but you can either use Quarkus with Gradle or Maven, or you can just use the Quarkus CLI, which means that you can do, like, quarkus dev or quarkus build or whatever. You can even say quarkus image build or image push, and it's going to build your application, build a container, and you can even push it automatically, all from one command, which is kind of handy, right? And then last but not least is the unification of imperative and reactive programming. Quarkus has a lot of reactive programming kind of built in underneath. Now, me, for example, I'm not a super deep expert in reactive programming, but what's nice with Quarkus, too, is that I can write imperative code, right? So every statement gets handled one at a time and it blocks every time, whereas with reactive, you've got these event loops. But you can use both at the same time, even in the same code, in the same class. For those who are familiar with reactive, usually you kind of have to decide, before you start writing the code, that the framework you're going to use is reactive, and you can't combine both. But with Quarkus, you can, which is nice. And best of all, it's still Java, right? So you get all these kinds of features.
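To illustrate what mixing the two styles in one class can look like, here is a dependency-free sketch. It uses the JDK's CompletableFuture as a stand-in for a reactive type purely to keep the example self-contained; Quarkus has its own reactive types and programming model, and the class and method names here are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// A minimal sketch of imperative and asynchronous styles side by side in one class.
// CompletableFuture stands in for a reactive type; this is not the Quarkus API.
final class GreetingSketch {

    /** Imperative style: the caller's thread does the work and waits for the result. */
    static String greetBlocking(String name) {
        return "Hello " + name;
    }

    /** Asynchronous style: return immediately and compose the result as a pipeline. */
    static CompletableFuture<String> greetAsync(String name) {
        return CompletableFuture
                .supplyAsync(() -> name)          // runs on another thread
                .thenApply(n -> "Hello " + n);    // transformation applied when the value arrives
    }

    public static void main(String[] args) {
        System.out.println(greetBlocking("imperative"));
        System.out.println(greetAsync("reactive").join()); // join() blocks only for the demo
    }
}
```

Both methods produce the same greeting; the difference is whether the caller waits inline or receives something it can compose further without blocking.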
And at the end of the day, if you're familiar with Java, this is really not reinventing the wheel at all. So it uses MicroProfile, Vert.x, RESTEasy, and if you want to add extensions, you can interact directly with Kubernetes. So you can push your code directly to Kubernetes. You can create config maps or secrets directly from Quarkus. You can work really easily with Kafka and OpenShift, of course, Apache Camel and all that. So in terms of native compilation, I think we've already had a few sessions about that, so I'm not going to go too deep into it, other than to say that you can run Quarkus on the JVM, and for probably 70 to 80 percent of the use cases that's a good way to go. If you really want to have the fastest startup time and the smallest footprint, then you can do a native build of your Quarkus application. With Quarkus, by the way, that's really easy, because if you create a new Quarkus application, it already automatically has a native profile built in. So you can decide, as you're doing your compilation, whether you want to do a native build or not. But yeah, Red Hat is on the GraalVM advisory board, and then there's the Mandrel project, which is a downstream distribution of GraalVM specifically for building Java native builds. And that's what Quarkus uses. For example, if I do a native build and I don't have GraalVM installed on my local machine, Quarkus will again pull down a container, it's really good at pulling down containers, to do a native build inside a container on your local machine. So again, you don't need to have GraalVM installed and configured on your local machine. So it really tries to make your life as easy as possible. So if you're thinking, should I do a native build or just run on the JVM? This is kind of an opinionated scoring.
But if you want the maximum developer joy, the best and easiest monitoring, peak throughput and reduced max latency, then you want to run it on the JVM. If, for you, it's important to have the lowest memory footprint, a small packaging, and a very fast startup time, then a native build is probably the way to go. So what can you use Quarkus for? Virtually anything. There are Quarkus-based Kubernetes operators. So there's an operator framework where you can create these automatic components in Kubernetes that manage resources in Kubernetes. You can create GitHub Actions with Quarkus. You can create just regular jobs. Yes, you can build traditional Java applications, even monoliths, with it. But of course, the sweet spot of Quarkus is cloud-native applications, so event-driven applications, reactive systems, microservices, and serverless and functions. So that's about it for my session. If you want to check out Quarkus more, developers.redhat.com has a ton of resources on Quarkus and on a lot of developer stuff. This dn.dev slash Quarkus tutorial is just a nice, lightweight introduction to Quarkus where you can create an application from scratch and then add some components as you go. You add a database and then check out the dev mode and all that stuff. So it's a pretty nice thing. Yeah. And like I said, if you want to keep up to date, you can follow me on Twitter or Mastodon. I try to post interesting stuff, but I don't know if that's really true, but we'll see. All right. And that's it for me. Thank you. Any questions? Yes. One of the first slides you compared startup time and peak performance and, like, the crowd time. I don't, yeah, I don't remember exactly the numbers on there. Yeah. So that, yeah, so the, yeah, definitely.
So the peak throughput time, there was a slide about that, the three graphs. So, yeah, I think with native compilation you're not necessarily going to have more throughput than on the JVM, but the time it takes to get to the maximum throughput is shorter when you're native. Yeah. Yeah. Yeah. We can look at it later. Yeah. Yes. Yeah. That's it. Yeah. Yeah. So the question was, how easy is it to migrate to Quarkus from, for example, Spring Boot? So Quarkus has Spring compatibility extensions, and that makes it relatively easy, because for the most part you won't have to change your code. You just have to add the Spring extensions. Of course, in your POM, you're going to need to make some changes, but it's fairly straightforward. My colleague, Eric Deandrea, wrote a book, Quarkus for Spring Developers. And he does talks too, but on developers.redhat.com there's a section about books, and you can find that book there. But it's, yeah, overall pretty straightforward. So like I said, there are extensions so that you can keep using your Spring annotations. Now, would I recommend just migrating your application and keeping all your Spring dependencies? Probably not. But it's kind of a nice way to migrate without too much work and then afterwards maybe migrate further. All right. Any more questions? Yes. So the question, if I understand correctly, is why should you use native compilation? Because on the JVM, you have all the capabilities that the JVM brings, right? So in terms of garbage collection, in terms of throughput and everything, the JVM is very optimized to do that. When you do a native build, the GraalVM compiler is going to take kind of an opinionated approach to how it does your native build, but then that's also it.
It's not going to be able to optimize afterwards like the JVM does. So, kind of depends. Yeah. I think we're out of time. But if anybody has any more questions? Any questions, you'll be up there. Yeah, yeah. Great. Thank you so much. All right. Thank you so much. |
Modernizing Legacy Messaging System with Apache Pulsar |
So, hello everyone. Welcome to our talk, and really, thank you so much for staying for this long. This is like the second-to-last session of the day, so we really appreciate you being here. Today we're going to be talking about modernizing legacy messaging systems with Apache Pulsar. And here we have Enrico and myself. We're from DataStax. Okay. But before we start, if you'd like a copy of our slide deck, here's the QR code and also the short link if you want. I'll let you take a moment. Good. Okay. Well, even if you missed it, don't worry, we'll be sharing our connection info with you. Then you can connect with us, and we can always be there to answer your questions too. So, with that, let me start. First, just a quick introduction. Who's Mary? I'm a streaming developer advocate at DataStax. DataStax is a company based in California, starting with Apache Cassandra as a managed cloud, and now we also have the managed cloud for streaming, which is Apache Pulsar. I was also a developer advocate before joining DataStax last year. I'm based in Chicago. I'm also the president of the Chicago Java Users Group, and I'm also a Java Champion. And before this, I spent over 20 years or so being a developer myself too. So, that's me. And then this is Enrico. Enrico. Oh, yes. Sure. Sure. I'm Enrico. I work with Mary. I really enjoy working with open source communities, so I'm involved in a few Apache projects like Pulsar, but also BookKeeper and ZooKeeper. I also collaborate with Maven and Curator. I'm also participating in some CNCF projects like Pravega, which is still about messaging and distributed streaming. And I also contributed to HerdDB, which is a distributed embeddable Java database. Okay. Great. Thanks, Enrico. I'm really happy to be here today with Enrico, because we were just working remotely and finally get to meet here in Belgium, when he lives in Italy and I'm in Chicago. So, okay.
So without further ado, this is the agenda, all within 20 minutes. So it's going to be a little bit quick, but we'll end with Enrico doing a quick demo as well. First, let's give an introduction to what JMS is, assuming not everybody is familiar with it. Then we'll talk about Apache Pulsar and why Pulsar. We'll also quickly describe the Pulsar architecture and how you do the mapping between JMS and Pulsar, and then how you use the JMS API with Pulsar, and Enrico will show that. So that's how we're going to do it. First of all, some core concepts of JMS. JMS is all about messaging, but it's very much a Java-centric technology. And as you can see here, it's also a publish-subscribe kind of model, making use of destinations, and it supports queues and topics. So messages, producers, consumers: this is the typical pub-sub producer and consumer type of pattern. It's a pattern, but JMS has its own implementation of it. And basically it makes use of the JMS context, which helps you with the connections and sessions. Okay, so about destinations. It supports both queues and topics, and it acts as a broker in the topic case, but also for queues. With a message queue, you drop the message there, it gets picked up by the consumer, and then it's done. This queue is browsable, with a first in, first out kind of approach. And then with a topic, it allows for multiple subscriptions, and messages are dispatched according to the subscription type as well. As far as consumer styles go, you can have blocking, which uses the blocking receive methods, and that's all application driven. And then there's also the message listener method, which is JMS driver driven.
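The difference between the two destination types described above can be sketched with plain collections. This is an in-memory illustration of the semantics only, not the JMS API: the class names are hypothetical, and real brokers add persistence, acknowledgements and subscription types on top.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal in-memory sketch of queue versus topic semantics (not the JMS API).
final class DestinationSketch {

    /** Queue: first in, first out, and each message is taken by exactly one consumer. */
    static final class SimpleQueue {
        private final Deque<String> messages = new ArrayDeque<>();

        void send(String message) { messages.addLast(message); }

        /** Returns the oldest message, or null when the queue is empty. */
        String receive() { return messages.pollFirst(); }
    }

    /** Topic: every subscription receives its own copy of each published message. */
    static final class SimpleTopic {
        private final Map<String, Deque<String>> subscriptions = new LinkedHashMap<>();

        void subscribe(String name) { subscriptions.putIfAbsent(name, new ArrayDeque<>()); }

        void publish(String message) {
            for (Deque<String> backlog : subscriptions.values()) {
                backlog.addLast(message);
            }
        }

        /** Returns the next message for the subscription, or null if none is pending. */
        String receive(String name) {
            Deque<String> backlog = subscriptions.get(name);
            return backlog == null ? null : backlog.pollFirst();
        }
    }

    public static void main(String[] args) {
        SimpleTopic topic = new SimpleTopic();
        topic.subscribe("billing");
        topic.subscribe("audit");
        topic.publish("order-1");
        System.out.println(topic.receive("billing") + " / " + topic.receive("audit"));
    }
}
```

With the queue, a message received once is gone; with the topic, both subscriptions see the same message independently.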
And as far as producer styles go, the blocking style is the send method, and there's also an async send, which works with a completion listener. So that's that, real quickly. As far as administrative operations go, as we know, JMS does not cover them: how you manage the destinations, connection properties, defining security models or resource limits, and configuring all of these things. JMS itself doesn't do it. So how do you manage it? It usually relies on your vendor, and all of the management is done through some vendor-specific way. There's also an API that lets you work with administered objects, which are supposed to be provided by the system as well. As far as destinations go, there are queue and topic references. And the connection factory is essentially the client object that allows you to connect to the system. And then the JMS API also allows you to interact with Java EE, which is now Jakarta EE, but back then it was Java EE. In that case, you can make use of EJB components. There are stateful and stateless EJBs. They're used in web servlets or in JAX-RS and JAX-WS endpoints, and they also allow you to do background work, like scheduling. And then there are also message-driven beans. These are essentially JMS-specific beans to handle messages, and they're managed by the container, the J2EE or Java EE container. When a message is received by the container, the bean is activated.
So the Java EE container provides support for all of the life cycle management, pooling, contexts and dependency injection, transaction support, and the standard security API. For all of these you're basically relying on the container to do that for you. And then, what about external resources? That relies on resource adapters, which essentially allow you to extend the Java EE container. So some key points: to use one, you need to have the resource archive file, the .rar file, that contains the code, and you then have to configure the resource adapter and everything. It allows you to create administered objects, and these objects conform to the standard API and are implemented by the code inside the resource adapter. These are the different packages, basically javax.jms. I think in the new version it would be jakarta.jms, but we're still talking about the older Java JMS in this case, with connection factory, queue and topic. Usually each object is bound in a JNDI, Java Naming and Directory Interface, registry provided by the container. So how you do deployment is specific to the container. And that's how it usually works. Now let's get introduced to Pulsar. We've talked about JMS, which is a bit more legacy stuff. So what are some of the options to leverage today's more modern world, which lets you work in a cloud native environment? We want to introduce you to Apache Pulsar. It's an open source platform, it's cloud native, and it supports distributed messaging and streaming. And this is the link where you can find out more information, or actually this is more the GitHub repo.
I just want to highlight a few things because we don't have too much time, but basically it's very cloud native in nature. It's born with cloud native DNA. So why do you want Pulsar? I think at least one of the key points is that it separates out the compute and the storage. So Pulsar can focus on message delivery, dealing with all the messages coming in and delivering them. And then you have a whole laundry basket of all the log messages, so what do you do with it? Rather than dealing with it itself, Pulsar says, let me get BookKeeper to handle it for me. That way Pulsar can focus on just the messaging part and coordinate with the bookies. So that's what it does. It also supports multi-tenancy, and that's a very nice way of helping you organize all of your messages, as well as some features that are more enterprise-ready, like geo-replication, which is also a major thing. And it also has what is called tiered offload. Basically, if your messages get cold in BookKeeper, you don't want them to take up too much room. They sit in warm storage, and you want to move them off to cold storage. All of this Pulsar has built in, and it knows how to do it. So there's native Kubernetes support, schema, all of these things. It has Pulsar connectors, and you can use the Pulsar IO framework to build different connectors. Currently we're supporting almost a hundred different kinds of connectors. For message processing, you can use the Pulsar Functions framework, so you don't need to use anything outside to do message transformation as you are building your data pipeline. And the nice thing, too, is that it doesn't restrict you to only using Java as your client.
You can use other things like C++, Python, Go, and other community contributions too. There's also Node.js, and also a .NET C# client. So that's really flexible and functions really well in Pulsar. So let's really quickly take a look. I already mentioned some of it. Essentially, it has blazing performance. That's what we all want. It provides you with true real-time processing. That's why we want it, right? Basically, millions of JMS messages can be handled if you have JMS leveraging such a platform. So it's all good. Horizontal scalability: if you expand your infrastructure, adding more servers and nodes, Pulsar will handle that for you. You don't need to rebalance all of your topics, and you don't need to deal with offsets, such as in maybe Kafka, things like that. It has its own way, so as a developer you don't have to worry about all of these infrastructure things. All of these things are just listed here. I know there are a lot of words in here, but it lets you get a bit more into detail, and we can share this with you. So let me pass this on to, let me see. Oh, let me quickly, I thought this was on. Okay, so just a really quick basic architecture. This pictorially describes what I just talked about. We only have so little time. So this is just describing that producers and consumers can be written in many different languages, not just Java, and storage gets managed by BookKeeper, which deals with all of the storage side of things, and it's very dynamic. As you can see, this quickly summarizes in a picture what Pulsar can do for you. Okay, and then here, just a quick summary of Apache Pulsar. Again, it's a pub-sub type of architecture, and it supports multi-tenancy, namespaces, and different subscription modes.
You can also leverage that to essentially turn Pulsar into a queuing kind of capability if you use an exclusive type of mode to do your subscription. And what other thing? Yeah, so there are different modes. It's just highly flexible, is what we're trying to tell you. So here, we have a little bit of a story about that. We can talk more about it later. Yeah, so I just want to map the Pulsar concepts to JMS. JMS is pretty straightforward. The model is quite flexible because it deals with queuing, but also with pub-sub. And in Pulsar, the mapping is really natural, because you can map a JMS topic to a Pulsar topic, whatever it is: a Pulsar standard topic, a partitioned topic, virtual topics. A JMS queue is like a Pulsar shared subscription, and a JMS message is like a Pulsar message, with an envelope and with the body. In JMS, we have several consumer types. I'm not going to enter into the details, but there is a subscription type that matches each of the JMS requirements. One important thing is that if you want to use JMS with Pulsar, you don't need to install any additional plugin, because the JMS API is built over the standard native Java client, because the Pulsar features are a superset of JMS. So it's only about implementing an API. You know, in JDBC, you have an API that allows you to connect to every database. In JMS, you just have to implement the API and follow the specs. If you want, you can deploy a server-side component just to push some of the computation to the broker. For instance, in JMS, you have filters. You can filter the messages. So if you want, you can filter them on the broker. Otherwise, you can simply filter them on the client side. I'm just showing some examples of how to use Pulsar with JMS. If you are already familiar with JMS, that's pretty simple. So in JMS, you start with a connection factory. So we have the Pulsar connection factory. And this is JMS 2.0. And you can get a JMS context. You get a reference to a destination. This is createQueue.
createQueue is not creating a queue. It's creating a reference to a queue, because JMS doesn't deal with administrative operations, as Mary said. You create a producer. You can send as many messages as you want. And if you want to consume, you create a consumer, and you can use receive or set the message listener. This is from standard Java. If you're using Jakarta or Java Enterprise, actually, yes, I've been helping a few companies to migrate from Java Enterprise to Pulsar, so I know many more cases about Java Enterprise than Jakarta. But that's it. So for instance, if you want to write, and you have an Enterprise JavaBean, then you can ask the container to inject the connection to Pulsar. And this is standard Java Enterprise code. So this code runs with ActiveMQ, with TIBCO, with whatever you are running. The container injects the connection factory and the destination. And, as in the standard Java code, you can get a reference to the JMS context and then you send. We will see later how the administrator, for instance with Apache TomEE, connects all the parts. For the consumer, usually in Java Enterprise you use message-driven beans to consume from destinations. So yes, this is a simple message-driven bean. You configure all the relevant things that you want. For instance, usually you configure the destination, which is still a logical name, and a subscription type, or the parallelism, that kind of thing. In many containers, you can configure these things in deployment descriptors or other configuration files. You implement a callback, onMessage. Every time a message is dispatched to the application, the code runs, and if everything goes well, the message is acknowledged to the Pulsar broker and it won't be delivered anymore. If any exception is thrown, Pulsar will deliver the message again. In TomEE, there is a very simple way to deploy the resource adapter. I'm deploying the resource adapter for Pulsar.
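The acknowledge-or-redeliver behaviour of the onMessage callback described above can be sketched without any broker. This is an in-memory illustration only: the class name, the method name and the cap on delivery attempts are hypothetical, and a real broker's redelivery policy is configurable and more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// A minimal sketch: a message is acknowledged when the listener returns normally,
// and put back for redelivery when the listener throws. Not the JMS or Pulsar API.
final class RedeliverySketch {

    /**
     * Dispatches messages to the listener and returns them in acknowledgement order.
     * maxAttempts is a hypothetical cap so that a permanently failing message cannot loop forever.
     */
    static List<String> dispatch(List<String> messages, Consumer<String> onMessage, int maxAttempts) {
        Deque<String> pending = new ArrayDeque<>(messages);
        Map<String, Integer> attempts = new HashMap<>();
        List<String> acknowledged = new ArrayList<>();
        while (!pending.isEmpty()) {
            String message = pending.pollFirst();
            int attempt = attempts.merge(message, 1, Integer::sum);
            try {
                onMessage.accept(message);
                acknowledged.add(message);       // success: acknowledge, never delivered again
            } catch (RuntimeException e) {
                if (attempt < maxAttempts) {
                    pending.addLast(message);    // failure: schedule a redelivery
                }
            }
        }
        return acknowledged;
    }

    public static void main(String[] args) {
        List<String> acked = dispatch(List.of("a", "b"), m -> System.out.println("got " + m), 3);
        System.out.println(acked);
    }
}
```

A listener that fails once on a message sees that message again later, after the messages that were already waiting behind it.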
So, Pulsar RA: you configure the connection to Pulsar. Now in the demo, I'm using localhost, and this is the most interesting part. I create a logical queue, so foo queue. This is a queue, and I bind it to a physical destination. So the container will create a Pulsar connection factory and also the Pulsar queue. The demo is on my GitHub space, so yes, you can run it by yourself. I'm going to use Apache TomEE 8 and Starlight for JMS. I'll talk about that later. It is basically the JMS implementation. I create the objects with the same file that we saw, and Apache Pulsar 2.11. So we have one application that consumes, one that produces, and Pulsar will run locally. So let me switch to the console. Oh no, yes, the code. The code is really simple. This is on GitHub, so you can check it out later. So this is the producer. I'm not writing the code that instantiates or assigns some value to the factory or to the queue. I'm scheduling the execution of this method every two seconds, and that's it. Very easy. On to the JMS listener. These are two separate applications. Usually in a real-world application, you have some application that produces the data, then you have a pipeline that transforms your data, and something else that consumes the data. This is pretty common. So here, in onMessage, depending on the type of message, I'm printing the content of the message. Here, I'm just declaring the reference to the logical queue that I want. In this case, OpenEJB, that is still TomEE, will resolve the binding with the physical queue via JNDI. We are running out of time, so I have a script to run the whole demo. The script simply installs two instances of TomEE and Pulsar, copies the configuration file, deploys the resource archives, and changes some ports, because I'm running multiple services on my machine, so there would be conflicts. It copies the consumer application to TomEE one, copies the producer application to TomEE two, then starts Pulsar standalone.
That is a quick way to start Pulsar locally with all the services, but only in one JVM process. TomEE one, TomEE two, and then we will see the logs. So there is some noise initially because it is installing everything. This is Pulsar. This is starting. These are the two TomEEs. Actually, we don't see. Oh yes, this is good. So TomEE two is sending the messages. TomEE one is receiving the messages. So it works. It's a very straightforward setup and a very common way to develop with Java Enterprise. Let's wrap up. Two minutes probably. Yes, okay, good. So JMS is very useful and it allows you to switch very easily to another vendor. Usually with JMS you don't use very specific features. In my experience with JMS, maybe you're using TIBCO, you're using ActiveMQ. You configure some special flags on the container, but the code usually is pretty standard. So yes, switching to Pulsar is usually easy. Pulsar is cloud native. It's horizontally scalable. So, like Mary said, really, it looks like a promise, but this is real: you can add or remove machines, and the service automatically adapts. Actually, at DataStax we are running it as a service on the cloud. And this is very powerful, because you can automatically adapt the resource consumption. And also you can move the data that is not actively consumed to tiered storage, and this allows you to really lower the cost. It's open source. It's a vibrant community. If you want, you can reach out to me in the community. And there are many people who are very enthusiastic. Pulsar is young. It is only five years old, something like that. But in the past two years, it grew very fast, because it is really the next generation. Maybe someone started by working with ActiveMQ. I did it in my previous jobs: ActiveMQ, and then Kafka, and then Pulsar. Now it's time for Pulsar. If you want to use Pulsar, you can use Starlight for JMS. I'm the initial author and main maintainer of Starlight for JMS.
So yes, feel free to ask me any questions. It's open source. It's on GitHub. There's the Pulsar connection factory, if you're using standard Java, and there is a resource adapter that works well with many containers. It's already tested and it is running in production. Okay. And these are just real quick. If you like, get this copy of the slide deck. Otherwise, there are resources in here: community info, references to all the Pulsar information on GitHub and also on our Pulsar site. And then just additional information about DataStax. If you're interested, we offer a $25 credit per month for personal projects. So I want to share that with you. I know it's not true open source in that sense, but we do have astra.datastax.com, and Astra Streaming is our company's offering supporting this in our cloud. So, oops, where did it go? Sorry. You tried to subscribe to us. Okay. So how do you contact us? This is the slide containing information about Twitter handles and LinkedIn, all of these things. So please do consider staying in touch with us. We'll be very happy to answer more questions that you may have, and if you want to share your project idea with us, we'll be happy to answer. And also on the Foojay Slack. Yes. That's right. So thank you. Thank you so much. And I think that's it. Any questions? Sure. What are Pulsar Functions? Pulsar Functions is a lightweight processing framework that makes it very easy to enrich the data that you have on your topics. So it's for very lightweight processing. If you have to do more complicated processing, you usually move to something like Flink or other things. But Pulsar Functions is very useful when you have to process your data. And it is also the base for Pulsar IO, which is the connector framework. So basically in Pulsar, you can deploy on the Pulsar cluster your code that transforms your data on your topics. Yes. It starts from a message on Pulsar and usually it ends with another message on Pulsar.
So it's really useful for transforming the data that is on Pulsar or for pushing your data outside of Pulsar. I don't know if this answers the question. We need to continue. Oh, yes. There is a question over here. If you want to have a discussion, you can also have discussions with people on the Foojay Slack, but usually at the top they are married. |
Fuzion — Intro for Java Developers: Mapping Java's Features to Simpler Mechanisms |
A really cool, interesting project: a new language called Fuzion, presented by Fridtjof, running on the OpenJDK. Final session of the day. Thank you so much for being with us, some of you all day. So let's start with Fuzion, or end with Fuzion. OK, thank you for staying so long. Thank you, Geertjan. He just downloaded the latest version of my slide deck and was a bit shocked that it's almost one hundred slides and we only have twenty minutes left, so let's see how that will work out. For those who came for the Fuzion stickers, please pass them around and take one. So, Fuzion: a new language. It's different, and this talk looks at it more from a Java perspective. But there's some overlap, you will see. So basically the original idea of Fuzion was to have something like a simpler Java, to simplify Java's features into Fuzion features. A bit of my background: I have worked on compilers for about thirty years, a big part of that on real-time Java implementations, real-time garbage collection and so on. I'll start with a motivating quote from John Backus, the inventor of Fortran, who worked a lot on functional programming but was very disappointed: he said his work on functional programming basically failed, and would likely always fail, because functional programming makes it easy to do hard things but incredibly difficult to do simple things. Fuzion has evolved into a functional language, and I hope I find ways to make even the easy things easy with it. So the motivation of Fuzion is that we see languages like Java get more and more things packed in. We already have classes, methods, interfaces, constructors, traits in other languages, records, structs, packages and so on. In Fuzion, all of these map to one single concept, which is the concept of a Fuzion feature.
Then, I see that today's compilers are much more powerful, so distinguishing whether some feature is used like a method, like a class or like a constructor is something that the compiler can decide; the developer does not need to decide that. And we see that more and more systems are becoming safety-critical, so we need to ensure correctness. I see that tools have to play a very important role in ensuring this correctness through static analysis. Fuzion is available on GitHub, and there is a website, flang.dev, that gives an introduction to the language with lots of examples, lots of design documents and lots of collections of ideas. Please go through that. I can't give a language introduction here, but you'll find more there. Fuzion is backed by a small company, Tokiwa, with currently four employees. One of them is sitting here with us, Michael. Now, coming to this talk itself. I will start with a very quick introduction to what the Fuzion language looks like from a Java perspective, then talk a bit about side effects and their dangers, then propose algebraic effects as a solution to manage side effects, and give lots of code examples of how you could do these things in Fuzion. So here is a small example in Fuzion. I will give the Java equivalent on the right side and the Fuzion code on the left side, so that you can quickly understand what it's about. As I said, Fuzion maps Java features to Fuzion features. If you have a package in Java, in Fuzion it's just a Fuzion feature, in this case demo. If you have a class in Java, it is also a Fuzion feature, nested in another Fuzion feature. If you have a method in Java, it is again a Fuzion feature, nested in this case in the surrounding hello feature. What makes this feature different is that it's a function that returns a result, which you can see from the result type here, which is unit.
The unit type in Fuzion is pretty much what the void type is in Java, with the exception that it is a real type, so you can declare variables of type unit and you can return it as a value. It is not a very interesting value, but it is a full-fledged type with one single value. In contrast to that, void is also a type in Fuzion, and that gets interesting, because void is a type that has no values. So the result type of something like System.exit would be void in Fuzion, which means it will never return. Then, printing something is easy: there is a standard library function say that, in this case, prints hello world. Fuzion uses a lot of type inference, so the result type unit here can actually be inferred, because that's also the result type of say, so we don't need to write it explicitly. Then, going back here, it is very similar to Java: if you have code like that, you don't have anything to run yet. In Java you need some main function. In Fuzion, there is one feature, called the universe, which surrounds everything, and code put in the universe like here gets executed when you run your application. You can pass arguments to features. Arguments are fields within those features, and fields are also features, so they fall into the same category, but they are features that are pre-calculated and hold a value. Fuzion has inheritance, so you can define a feature hello2 that inherits from hello. You can create an instance of that and call features on it. So much for a quick introduction to the language syntax and how it works. There are a number of things that Fuzion does not have, mostly because they are considered to a certain extent harmful in a safety-critical environment. There's no dynamic loading, there's nothing like macros, no reflection, no pointer arithmetic. Many of these things Java doesn't have either. There is no uncontrolled mutability, so you cannot easily change variables.
There's no direct support for exceptions in the language. The reason for this is that we must know what the code does. We want to do static analysis of the code to ensure safety and also, to a certain extent, to allow better optimizations that increase performance. A bit more on side effects and security. We learned a lot about security today already in earlier talks, but that mostly addressed security aspects of the software development process and the management of security issues. I now come from the language side and ask what we could do in the programming language to improve security. If you look back at recent security issues, we learned a lot about Log4j today, but there are similar things with Spring4Shell, and even the Rust community has similar issues. What these issues have in common is that library code that is used has unexpected side effects. You use a logging library; you don't expect it to go to the Internet, make an arbitrary connection and download code from somewhere else in the world. That is the common problem. One way that many new and upcoming languages use to control side effects is algebraic effects. Let me quickly explain what an algebraic effect is. An algebraic effect is basically a set of non-functional operations that code might perform. These are operations that do not have an effect on the actual calculation, on the return value of a function. Java already has one kind of algebraic effect built into the language, which is throws for methods that throw exceptions. But algebraic effects are a broader concept; this is just one example that Java supports. Any operation in an algebraic effect can either resume or abort.
So typically, if an operation of an algebraic effect reads some data from some external input, it would return the data that was read and resume operation with that value, while an operation like throwing an exception would perform an abort, so it will not return but jump back to the corresponding handler. Side effects can be implemented by different implementations of the effect handlers, so there is no strict, fixed wiring from the operations to a particular implementation. And, very similar to exception handlers, effects may be nested. There's a kind of contrary view towards algebraic effects. What I've presented so far sees them as the effects that the code might have, but you could also see them as capabilities that the code might require. Martin Odersky is starting a big research project in that area. What I do is define my exception, which is our exception implementation, as a feature inheriting from the base library feature simple effect, which is just a basic standard effect, and our implementation of throw is just abort, so the simplest way to stop an operation. Now we define one feature that throws an exception. What we do here is call the operation throw, but we need to have an instance of the algebraic effect. The syntax we use in Fuzion for that is the type of the effect, which is my exception, followed by .env for the environment. So .env means taking the innermost instance of that effect in the current environment and calling that operation on it. When we do that, we should declare in the signature of the function that this function requires the effect my exception. This is very similar to a throws clause in Java: if I throw a checked exception, I need to declare that. Here, if I require a certain effect, I declare this with the exclamation mark. Now I add some prints just to show what this code actually does, and now I want to call this feature f. To call it,
I have to first install an instance of the effect. So I create an instance of the my exception effect here and call use on it, which is a standard effect function that takes a lambda expression, which in turn calls the code that is executed while this effect is installed. Adding some more prints so that you see what is happening, if I now run this code, you see that it prints that the exception is installed, it prints the before throw, then throw jumps directly, very much like an exception, out of the use here, and we continue with the we are done. So the code after an operation that aborts will not be executed at all, very similar to exceptions. Now let me talk a bit about mutation. I told you that Fuzion doesn't allow direct mutation of fields, so fields are immutable. That means if we have code like this, where we declare a field x, assign 123 to it, print it, and then assign 2 times x to another field x, we see the expected behavior. But if we create a feature that prints this field x and try to compile or run this, we actually get an error. The problem is that it is not clear which x is referenced here, because we have two different variables: there are two x's, the first and the second, and each is only visible for the code following it. So we get an error message saying that there are two different x's and that this source code position doesn't know which one to choose. So really every assignment creates a new field, and these fields are immutable. To make them mutable, to actually get the desired effect that we can print x here, we have to create a mutable integer value, using the base library function mut, which creates a mutable instance, and assign this to the variable x. Now if we want to update it, we don't create a new field, which would be the colon-equals operator, but we use an arrow operator, which updates the value with a new value.
If we run this now, it behaves just like the code before, but this time the show x function can actually access this single variable, because now there is only one field left. We can now analyze this code for the effects that it requires, and if we do that, we see there are two effects: there's io.out, since this performs output, and there's the mutate effect, because we have an update of a mutable field in our code. Now, very few variables actually need to be mutable. Here's an example of a small loop with an index variable counting from 0 to 9 and printing the values. If we analyze this code for effects, we see that it only depends on the io.out effect. The reason is that every loop iteration creates a new instance of that variable, so we don't update the i variable here, but we have one independent instance for every iteration of the loop. So no variable is mutated; a new instance is created for every iteration. I want to talk a bit about error handling now, show how a function can produce an error, and show three different ways in which error handling could be done. The function I use is just divide, which divides two integers, and I call it in a show div function that calls divide and prints the result. Then I call this with three different value pairs, and if I run it, not very surprisingly, I get an error: there is a division by zero, the precondition of the division is not fulfilled. So that's the standard error handling in Fuzion, but it's not very nice because the whole application fails.
If you now want to somehow treat that error, what we could do is return an outcome, which is similar to Rust's Result and is basically a choice type between an error and an actual 32-bit integer. We check the case: if b is zero, we return an error, otherwise we return the result of the division. If we run this, the application now runs through, it doesn't terminate, and in the middle case we print the outcome, which is an error here. But if, after calling divide, we want to know whether the division was successful or not, we need to check the cases, so we need to distinguish whether we actually got a value or an error. We can do this with a match over the different choices. An alternative would be to use the standard library try effect, which is kind of the default exception mechanism based on algebraic effects in Fuzion. To do that, instead of returning an outcome, divide would just be a function returning a 32-bit integer, but requiring the try effect to be installed. Now, instead of returning an error, we raise the error on the try instance in the current environment. We don't need the else anymore because the raise aborts and returns immediately, so we can just continue with the code there. And when we call divide now, we have to call it with an instance of the try effect installed. Just like before, this can be done through a base library function, try, that installs an instance and calls the lambda that is provided as a parameter, and the result can then be matched very similarly to the outcome. The big difference is that now the code in between, between the position where the error occurs and where we have this call, does not need to pass these outcomes along all the way. I'll come to an end very soon. We can work directly with the i32s, and the try would jump out directly, so we would see this outcome.
So, the penultimate slide, the status of Fuzion: it's still very much in development. The language has been getting a bit more stable recently, but there's still a lot of work, mostly in the base library. The current implementation has two backends, one running on a JVM and a C code backend, and there are basic analysis tools available, like the effects analysis I've shown you. Java actually maps very well to Fuzion. There's a tool that creates Fuzion APIs from a Java module, so that we can call into all of the Java APIs. What doesn't work well yet is calling back from Java into Fuzion, but at least in one direction it's a one-to-one mapping. We have effects to encapsulate non-functional aspects, and I ask everyone to please have a look. We're happy about any feedback. Thank you for staying so long. I think time is over. The match is still needed because this try here installs the effect, and an effect, in the case of an abort, has to provide some way to join the value that is returned in the non-abort case with the value that is returned in the abort case. For the try effect, this join is just made by producing a value of type outcome, which is the choice between the error and the value, but there could be other effects that would just replace it by a default value in that case. So it depends on the effect, but here it's definitely still needed, yeah. Do we have time? Yeah. I saw that at some point you showed that there was an I/O effect, and I also saw a lot of code that uses the say function, which I presume uses that effect, but I didn't see the effect declared in any of the examples. Okay, yes, you paid very close attention, thank you. It is not decided yet where the compiler should be strict and require this annotation. The current idea is that for basic code we should not require the annotation, but for a public library function we definitely want to know what the effects are.
So I don't want to enforce this for everything or for all the intermediate values, and there are also some cases where only a static analysis of the whole application can actually determine what the effects are, so static analysis plays a very important role there. Basically, I don't want to enforce too much typing for these effects. Another one: Java, for example, has this distinction between runtime exceptions and checked exceptions, and there are these kinds of exceptions that pretty much any code can throw, like an out-of-memory error or a stack overflow, and I wonder how you handle these kinds of cases? Oops, they're shutting us down here. Okay. It's a small hint. Actually, none of that is done yet, but I think I would like to go one step further and make it user-configurable which effects you want to have considered acceptable in your environment. Maybe you want to add some debugging print or some logging somewhere nested in some internal function; that shouldn't force you to add effects all over the code. So we must have some way to define, for the debugging build, that these are the effects that are in there, and please don't complain about that. But we still have to see how we will actually do that. Thank you so much. Thank you for staying so long. Thank you. Thank you very much for attending the Foojay room. There will be a room next year again. Hopefully we'll have two days and more time for sessions, and hopefully many of you will submit proposals. You will all be very welcome to present in the Foojay room next year. Thank you very much for coming. |
The State of Go
What's new since Go 1.19 |
Good morning everyone. Finally I can hear you again after two years online, where I just had to stare at a boring Matrix chat. I am honestly so glad to be here, and welcome back everyone. Just like every year, we are starting with an update on the state of Go. We are going to talk about what is new in Go. I will only quickly touch on some topics; the interesting things about Go come in later talks. What I am going to look into today is changes to the language as well as the standard library, and tooling of course. I've got two interesting design drafts for future releases of Go, and of course an update on our Go community. What is new in Go since Go 1.18? Well, of course Go 1.19 was released in August of last year. Go 1.20 was released a few days ago. It is the first time that they released a Go version just before I do this talk. What are the big new changes? There are four changes to the language, the most we ever had. However, one is not really a change, so it is more like two and a half changes, but actually more like 2.25. Let's just keep it there. Two real changes to the language in this edition. The first one is that there is a new syntax for converting a slice to an array. Those of you who are new to Go might be confused, because what is the difference between a slice and an array? I call them both arrays. Well, technically, in Go an array has a fixed length and a slice does not. That is a bit too simple to be fully correct. But you can convert between those two, and you could do so since Go 1.17 using this ugly syntax, which has to do with pointers and how it works underneath. I would never have come up with this myself. In Go 1.20 it is more logical now: you can just write an array type with a fixed length of 3 and put your slice in it. This now works. The next change has to do with generics, which were introduced just last year. It has to do with the comparable constraint, which we can give to type parameters. You can say a type has to be comparable. Why would you use this?
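The slice-to-array conversion described above can be sketched as follows. This is a minimal illustration, not the code from the slides: the function names `firstThree` and `firstThreeOld` are made up for the example, and it assumes a Go 1.20 or newer toolchain.

```go
package main

import "fmt"

// firstThree copies the first three elements of s into an array using the
// direct conversion syntax available since Go 1.20.
// The conversion panics at runtime if len(s) < 3.
func firstThree(s []int) [3]int {
	return [3]int(s)
}

// firstThreeOld does the same with the Go 1.17 form: convert the slice to
// an array pointer, then dereference that pointer.
func firstThreeOld(s []int) [3]int {
	return *(*[3]int)(s)
}

func main() {
	s := []int{1, 2, 3, 4}
	a := firstThree(s)
	a[0] = 99 // the array is a copy, so s is unchanged
	fmt.Println(a, firstThreeOld(s), s)
}
```

Both forms copy the elements, so later writes to the array do not affect the original slice.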
For example, when you use something as a key in a map, the key has to be comparable. So you could write something like this to make a sum of some numbers, and you can use it wherever you want. You make a map from strings to ints and it will sum them. Okay, strings will work because they are comparable. How do I know that? Simple: you can compare them with the equals sign. But what about the empty interface? Is that comparable? Well, that depends on what is in the interface. Before, this was not valid. You couldn't do this. But in Go 1.20 you can, because now we have two kinds of comparable types. You have strictly comparable types like ints, strings, bytes, the usual, but also non-strictly comparable types like the empty interface. Do be careful, because this might panic at runtime. It's allowed, but it can panic. The next change is the comparison of struct values. It now checks one field at a time and it will exit at the first mismatch. Wait a minute, wasn't it always this way? Yes, it was always implemented this way, but it was never specified in the language specification. So this isn't really a change in the language. The next change has to do with three new unsafe functions. This is the unsafe package; you should just avoid using it. Next up, what is new with the Go tooling? Well, I say this every year, and before me Francesc said it every year: there are new warnings in go vet. Why should you care? Okay, there's a new squiggly line in your editor. You might not care about those squiggly lines in your editor. You should, but you might not. Okay, but vet also runs when you run tests with go test. So your CI might suddenly turn red if you update your Go version, because there are new warnings. That's why these are important. The first new warning this year, in Go 1.19, is that vet will now report an error when you pass a pointer to an error to the errors.As function, which is such a common mistake that GitHub Copilot wrote this code for me. It's not bad. It just doesn't work.
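The comparable constraint discussed earlier in this passage can be illustrated with a small sketch. The names `SumValues` and `insertPanics` are hypothetical and not taken from the slides; the second call in `main` assumes Go 1.20 or newer, where `any` is accepted for a `comparable` type parameter.

```go
package main

import "fmt"

// SumValues adds up all values of a map whose key type is any comparable type.
func SumValues[K comparable](m map[K]int) int {
	total := 0
	for _, v := range m {
		total += v
	}
	return total
}

// insertPanics reports whether using key as a map[any]int key panics.
// It does when the dynamic type of key is not comparable, such as a slice.
func insertPanics(key any) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	m := map[any]int{}
	m[key] = 1
	return false
}

func main() {
	// string is strictly comparable: this compiled before Go 1.20 too.
	fmt.Println(SumValues(map[string]int{"espresso": 2, "latte": 3}))

	// any is comparable but not strictly comparable: accepted since Go 1.20.
	fmt.Println(SumValues(map[any]int{"espresso": 2, 42: 3}))

	// The runtime panic the talk warns about.
	fmt.Println(insertPanics("ok"), insertPanics([]int{1}))
}
```

The compiler can no longer rule out the panic in the `any` case, which is why the talk asks for care with non-strictly comparable types.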
The next warning has to do with incorrect date formats. Let's say I want to format today's date in an ISO-like notation. Well, I would write some code like this. But wait a minute, let's think twice about my code. How do I format a date in Go? Well, you always have to think about Monday, January the second of 2006. This layout is February 1st. Many of you probably hadn't noticed. Again, this is a common mistake because people confuse the 1 and the 2. go vet will now warn you about the format above, because it is probably used nowhere. There are also some welcome changes to go doc. You can now make lists, links and headers inside your doc comments, which will be rendered into HTML. You see an example below, where I put a header in, I link to the RFC I'm implementing, and I'm also listing which kinds of coffee my machine supports. There is also a new unix build constraint in 1.19. If you want a file that will only be built on a Unix system, you can do that using //go:build. In the past, you could do it by listing all the different Unix systems. There are a lot of them. Okay, that's a lot of code, right? Well, in 1.19 you can just write //go:build unix. Wait a minute, isn't this a common thing to do? Because the file system on Unix is almost the same everywhere. Okay, so I asked ChatGPT, like every developer does these days. I asked, give me a Unix build constraint, and it gave me this thing: //go:build !windows. Okay, you all know what's going to happen, right? Okay, you say, it's an AI, I trust the AI. Let's think twice about this. Let's try to be smart and read the actual compiler code, like every one of you does all day. And immediately I found this thing. Oh, JavaScript is a thing: WebAssembly. And I didn't even talk about whether Plan 9 is Unix or not. No, just look at JavaScript. That's not Unix, and it's also not Windows. So just don't trust your AI, please. 1.20 also adds coverage for built binaries.
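The date layout mix-up described above is easy to reproduce. The sketch below is an illustration with made-up names (`formatBoth`, `render`) and an arbitrary sample date; it goes through a small helper so that the swapped layout is not passed to `Format` as a literal, which is the direct form that go vet reports since Go 1.19.

```go
package main

import (
	"fmt"
	"time"
)

const (
	isoLayout     = "2006-01-02" // year-month-day: 01 is the month, 02 is the day
	swappedLayout = "2006-02-01" // year-day-month: the mistake from the talk
)

// render formats t with the given layout.
func render(t time.Time, layout string) string {
	return t.Format(layout)
}

// formatBoth formats the same date with the correct and the swapped layout.
func formatBoth(year int, month time.Month, day int) (iso, swapped string) {
	d := time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
	return render(d, isoLayout), render(d, swappedLayout)
}

func main() {
	iso, swapped := formatBoth(2023, time.February, 4)
	fmt.Println(iso)     // 2023-02-04
	fmt.Println(swapped) // 2023-04-02
}
```

The swapped layout silently produces a valid-looking but wrong date, which is why a static check is useful here.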
Why should you care about this? Well, for many integration and end-to-end tests, you run them by making a special binary, running it, and getting your test results. If you also wanted coverage results, this wasn't possible before. If you now build it with -cover and add a GOCOVERDIR environment variable, then run the script, okay, you get your output, and you also get which lines of your code it touched, so you can put that into your favorite coverage tool. There are also a few small changes I want to touch upon. cgo will now be disabled if a C toolchain is not found. Many container people will now be happy. go generate and go test also have a -skip flag, to which you can pass a regex for what to skip. Okay, let's take a look at the standard library. Go has many things in the standard library, and of course we have changes in those every year. The first one is in 1.20, and I find it super useful: you can now wrap multiple errors. You can do so using fmt.Errorf, where you can now put multiple %w verbs. You can just wrap multiple errors. The functions that you run on them, like errors.Is or errors.As, just loop over all those errors. This works through the new underlying Unwrap interface, which just gives you the slice of original errors back. There is also the new errors.Join function, which you can just throw all your errors into. Why would you use this? Okay, you have all written code like this: you just loop over a list and you want to check for some errors. Okay, you could just return a slice of errors, but then you have to check for the length, whether it's not nil, et cetera. You just want a single error out of there. You can use errors.Join, and you get a list of all your errors, joined together. You can treat it like a normal error in your code, and even use errors.Is and errors.As. So you can just ask, oh, was there any empty string in this list?
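The validation loop described above might look like the following sketch. The sentinel errors, the length rule and the function names are invented for illustration; `errors.Join` and multiple `%w` verbs require Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrEmpty   = errors.New("empty string")
	ErrTooLong = errors.New("string too long")
)

// validate checks every item and returns all problems as one error.
// errors.Join returns nil when there is nothing to join.
func validate(items []string) error {
	var errs []error
	for i, s := range items {
		switch {
		case s == "":
			errs = append(errs, fmt.Errorf("item %d: %w", i, ErrEmpty))
		case len(s) > 8:
			errs = append(errs, fmt.Errorf("item %d: %w", i, ErrTooLong))
		}
	}
	return errors.Join(errs...)
}

// hasEmpty answers the question from the talk: was there any empty string?
func hasEmpty(err error) bool { return errors.Is(err, ErrEmpty) }

// wrapBoth wraps two errors at once with multiple %w verbs.
func wrapBoth() error {
	return fmt.Errorf("brew failed: %w, %w", ErrEmpty, ErrTooLong)
}

func main() {
	err := validate([]string{"espresso", "", "caramel macchiato"})
	fmt.Println(hasEmpty(err), errors.Is(err, ErrTooLong)) // true true
	fmt.Println(validate([]string{"latte"}) == nil)        // true
	fmt.Println(hasEmpty(wrapBoth()))                      // true
}
```

Because the joined value is an ordinary `error`, callers that only check `err != nil` keep working unchanged.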
There are also a few changes to the strings and bytes packages, which now have new cut functions. CutPrefix and CutSuffix work just like TrimPrefix and TrimSuffix, except they also return a boolean telling you whether a cut has happened. There is also a Clone function, which returns a copy of the same value in memory. Also a few small changes to the time package. We now have a Compare function, which is a combination of Before and After. It does both. It will return an integer, either minus one, plus one or zero, depending on whether the time is before, after or the same as a given time. There are also three new layout constants which you can use, and those actually came from you. The Go user survey showed that these are commonly used, so they added them to time. There is DateTime, which gives you an ISO-like notation. There are also DateOnly and TimeOnly, which give you only the date and only the time. There is also a change in the TLS package, like every edition. This time, it's a change in how it treats memory. It now shares a copy of your certificate in memory. Why is this useful for you? Well, let's say you have an application that makes many concurrent connections, like Kubernetes. Until now, it stored a copy of your certificate in memory for every connection. It is now sharing those among multiple connections, so you are saving memory. If you somehow have an invalid certificate, you also get a specific error that says the certificate is not valid, instead of a general error. And yes, we also have breaking changes in the standard library this edition. The first one happened in 1.19 in the HTTP package. The HTTP client will no longer give an error back if the server sends a 3xx response without the Location header set. If your code relies on this error to check whether the location is set or not, yes, your code will break now. Also, there is a change in the random package. It is now pre-seeded when you use the global random functions.
You no longer have to call the Seed function with some random number you get from somewhere. It now does that for you. Of course, this deprecates the Seed function. If you still need your own seed for predictable random numbers, you can do so by using the rand.New function. If this somehow breaks your code, and it could, you can disable it using this new GODEBUG variable. In tar and zip there are also some welcome changes: they will now return an error if your archive has an absolute path, an invalid character in a file name, or a reserved name on the Windows platform. It will now return ErrInsecurePath. This is to protect your server from being hacked. If you somehow don't want this, you can also turn it off in GODEBUG. There is one new package in the standard library, which is elliptic-curve Diffie-Hellman key exchange. Yay. Very exciting. This was already possible: you could do it using the lower-level elliptic package, but you had to implement more yourself. So it's probably more secure than what you would write. Okay, the Go runtime. We also have a few changes in there. Well, Go 1.19 has a revised memory model, and I have no idea how they did that. I don't know. So if you want something from someone who actually knows what he's talking about, Russ Cox wrote an amazing blog post about it. But what does this mean for us average Go developers, who are not compiler developers? Well, first of all, Go now has a soft memory limit. You can now tell Go the maximum amount of memory you want it to use. It's a soft limit. Okay, you can, for example, set it to one gigabyte. What will happen when Go gets towards that one gigabyte limit? It will try to trigger the garbage collector more often to get more memory freed. Yes, you can see the results if the limit is too low. What will happen if it's too much is that it will try to limit garbage collection to 50% of the CPU execution time which your process is using. There is, however, a warning.
If you set the limit to tens of megabytes, it might just not work, because your operating system says that's absolutely not enough. This release also brings new types in the atomic package, which provides low-level atomic memory access. So you can now access these variables from multiple goroutines. It works only for primitives like integers, booleans and unsafe pointers. It does this by exposing the functions Store and Load, also Add for integers, and CompareAndSwap. Okay, but if you use these, you need to know exactly what you're doing. You need to know how atomics work. As always. And it's still not really recommended. They recommend that you still share memory by communicating, for example with channels, and not communicate by sharing memory. So please only use this if it is your only option. Go 1.20 has a few small changes in the runtime. The garbage collector got better yet again. I've said this every year for five years: it got better. And it is now less erratic. There is also a new mode in which you can compile your binaries, which is PGO: you give it a profile of your program that has been running, and the compiler will try to optimize the binary based on that CPU profile from a previous run by, for example, inlining frequently called functions. The Go team claims that it is up to 4% faster. I had some colleagues who were looking into this, but not in time to get actual benchmarks. At last, I want to give you a small update on Go ports. So what is happening with the ports in Go? Well, Go 1.19 added a new processor architecture on Linux, which is LoongArch. It's a Chinese-built architecture. It's not yet in wide use, however. Go 1.20 will be the last one to support Windows 7 and 8. It will also be the last one to support macOS 10.13 and 10.14, but who cares? Go 1.20 also has experimental support for RISC-V on the FreeBSD platform. Yay. Okay. That is the current version of Go. But of course, let's take a look at the future.
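The typed atomics mentioned in this passage can be sketched as below. This is only an illustration with invented function names; it uses `atomic.Int64` and `atomic.Bool`, which were added to `sync/atomic` in Go 1.19, and the talk's advice to prefer channels where possible still applies.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently increments one shared counter from n goroutines.
// Add and Load on atomic.Int64 are safe to call concurrently.
func countConcurrently(n int) int64 {
	var counter atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Add(1)
		}()
	}
	wg.Wait()
	return counter.Load()
}

// claimTwice tries to flip the same flag twice with CompareAndSwap.
// Only the first attempt finds the expected old value and succeeds.
func claimTwice() (first, second bool) {
	var claimed atomic.Bool
	first = claimed.CompareAndSwap(false, true)
	second = claimed.CompareAndSwap(false, true)
	return first, second
}

func main() {
	fmt.Println(countConcurrently(1000)) // 1000
	fmt.Println(claimTwice())            // true false
}
```

With a plain `int64` and `counter++` the same loop would be a data race, which is the situation these types are meant to avoid.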
As always, we try to look into the future. It won't always work. I found two interesting design drafts. The first one is for structured logging, something you all do, but which doesn't exist in the standard library. There is a proposal to add an slog package under log in the standard library. They want this to produce machine-readable logging. And it hopes to replace the many, many, many, many, many structured logging libraries like logrus, zap, zerolog, logr, hclog, and however you pronounce all those. It proposes something like this, something every library probably already did: you set up a logger, you set up what you want to send it to, you put in messages, you put in variables, and it logs those out in something that is machine readable. This is the text output, which is just key-value pairs. So your computers can all read it, index it, and make it searchable.

How does it want to do this? It wants to give you a logger interface. Again, these are all interfaces. You can just implement them in your own library. It wants to give you helper functions like Info, Error, Warn, and LogAttrs. It then makes those into a record. This is just a struct containing all this data, and you give this record to a handler, and this handler will turn it into something that's machine readable. If you want JSON, you just give it to a JSON handler. If you want some proprietary format, you just make your own. So it tries to give you an implementation and interfaces in the standard library for different log levels like debug, info, warning, and error, for passing in data to be printed out, and for output in text, in JSON, and maybe more formats. Again, this is a design proposal. It's not yet implemented anywhere. If you have strong opinions about logging, you can read the full proposal at this link. I will publish the slides on the FOSDEM site later today, and you can go there, read everything about it, and leave some comments in their issue tracker.
The next big thing they want to tackle is Go version compatibility. Why do they want to do that? Well, we've been doing this talk ever since 2015. A lot has changed: a bigger room, different speakers, and especially different slide templates. But there's one thing that always stayed the same. It's this slide: breaking changes. But wait a minute, Maartje, isn't there the Go 1.0 compatibility promise? And yes. Go's emphasis on backwards compatibility is why we all use Go, because we don't have to rewrite our whole application every two years. However, there are times when it is not possible, for example with external security dependencies or just bugs that have to be fixed.

Okay, let's take a look at this in practice. Let's look at a big Go project: Kubernetes, again. When did Go break Kubernetes? Well, more often than you think. Just in the last few versions: Go 1.15 broke Kubernetes in some way by deprecating the X.509 CommonName. In 1.17, a bug fix in net.ParseIP broke it again. In 1.18, X.509 broke Kubernetes again because Go changed something, they deprecated something. And in 1.19, a bug fix in LookPath also broke Kubernetes. Oops. Of course, it's impossible not to break Kubernetes somehow. But still, let's try to avoid this in a language.

So we have a solution, and it's a solution we already have today: that GODEBUG flag I've been showing on my slides. What is this proposal? It is to commit to adding one of these GODEBUG flags for every breaking change in the following releases, and also to guarantee that they'll stay there for a few years, or maybe forever. They also want to add metrics to it, so you can look at your program and see how many of those there are that you have to fix. And also to put it in code, so you can use go:debug to override it inside the code yourself. Again, this is not yet fully implemented. There is a design proposal. You can read everything at the link there and leave any comments.
But wait a minute, Maartje, don't we already have this? I have to specify the Go version in my go.mod file, right? Yeah, but what does it actually do? Oh, I know. It sets the minimum Go version needed to build it. No. Any version will just try to build it. It's just a suggestion. It might fail. Oh, I know. It sets the Go version that is used to build it. Also no, sorry. It uses the version installed on your laptop. Nothing else. Oh, now I know. It sets the semantic rule set for that version. And yes, that is correct. But only the semantic rule set. So that slice to array conversion? Yes, that is set by this flag. The octal numbers which got added two years ago? Yes, that is also checked by this flag. But that's all.

Okay, they want to change this. And this is the Go toolchain proposal. They want to add a GOTOOLCHAIN environment variable which you could use to set a specific toolchain: okay, I want to use the 1.20 toolchain for this application. This will allow go get to fetch a new Go toolchain just like you would get your Go modules. But it also needs to change the go command a lot, because it has to get your toolchain from somewhere, and then first download it, check it, and run it. That changes our tooling a lot. And there is also a GOTOOLCHAIN=local setting if you still need the local one for some reason. Again, this is just a design proposal. I might be saying next year that this is implemented. If you have comments about it, there is a link here as well. There is also a proposal to add this to the go.mod file, right under the Go version. You say: okay, my application uses the 1.19 syntax, but it has to use a 1.20 RC toolchain. So if you build this module, it will go download this version of Go and build it using that.

Okay, that's the technical side. Let's talk about my favorite subject, the Go community. This is a map of all Go meetups in the world. We are pretty much covered everywhere where big populations are, but still not enough.
What are the numbers? Well, the professional Go developer network on Meetup counts 127,000 members. That's 8,000 more than last year. There is sad news for the first time: there are now only 190 meetups, which is six fewer than last year, and which also results in one country fewer being represented. Probably due to the pandemic. There are also the Women Who Go and GoBridge chapters, which are still stable at 41 chapters, and Berlin is still the most active one.

But now let's talk about my favorite community, the FOSDEM community. Our devroom is nine years old today. So we've been doing this since 2014. Okay, small room. Can anyone see themselves? We got upgraded in 2015, 2016. Okay, a bigger one. We stayed in the same size for three years, which was crowded enough, and even today it is a full house. In 2019, we got the biggest upgrade ever. We got a giant room. And in 2020, we got the biggest room they could find for us. But I regret doing that, because a month later we were all in lockdown. That caused our 2021 edition to be fully online for the first time. We all did our best. We turned our living rooms into giant television studios, trying to bring you some talks about Go. We learned a lot of lessons. And in 2022, we brought you gophers around the world, which we had great fun producing. But hey, welcome back. This is something you'll never, ever see again today. There was just one guy still sitting there. And he'll be here at 9.00 p.m. Good.

Let's talk about Go conferences. You're all in the mood, right? So there is a Go conference: you are here. Please stay. There are better talks than mine. If you quickly catch a plane right now, you can still make GopherCon Israel, February 7. Conf42 will still be held online in April. If you want to go to New York, you can do so on April 28. Go Conference Japan will be held online. GopherCon Europe will be in Berlin in June. GopherCon US will be in San Diego in September. And GoLab in Florence, Italy will be held in November, which I have not officially confirmed yet.

So we've got an amazing schedule today. I already want to thank all speakers for signing up to be in our devroom today. I hope you'll welcome me again next year. But before I leave you all, I want to give a few housekeeping announcements. First of all, out of tradition, we have lightning talks at the end of the day. We reserve the last half hour of the day for five-minute talks. The timing is strict. I will pull you offstage. We have a CFP for those. It's open till 17.00, or 5 p.m. for you Americans. And you can submit a title till that hour at govres.gov.slide. I'll write it on the whiteboard later. You just have to answer three easy questions. And if you fill those out, I can welcome you onstage in the last half hour. So you have time to submit a talk. Quickly think of something. Submit it. We need you.

If you want to talk about us on social media, you can do so by using hashtag golang and hashtag FOSDEM23 or FOSDEM2023 or FOSDEM, nobody agrees on that hashtag. But we tend to stay with FOSDEM23. We're also on the Fediverse this year, because who isn't. You can follow us, mention us, like us at godevroom at fosterdon.social. We have a social media responsible person this year. We will be happy to reply to all your angry tweets.

So this was the State of Go. I first of all want to thank the FOSDEM organization for welcoming us back at the ULB. I want to thank all the volunteers who are helping to make this room possible, as well as the AV team from FOSDEM, who make my camera work. And everyone else who is working at FOSDEM. And at last I want to thank all speakers for coming here today. And of course, you all for coming to the Go devroom again. Thank you. |
Recipes for reducing cognitive load
yet another idiomatic Go talk |
Okay, our first actual speaker today is Federico, who is a maintainer of MetalLB, which I personally use, thank you, over at Red Hat, and he'll be talking to us about cognitive load. So a round of applause for Federico. Yeah, it works. Yeah, so today I'm going to talk about cognitive load and how it affects our code base, why it matters, and how we can reduce it. And the reason why I put together this talk is because over the past, I would say, two years, I started first contributing to and then maintaining the MetalLB project. Is anyone using it? Okay, so if it's gotten less stable, that's because of me. But by doing that, I started reviewing a good amount of PRs, and over this period I identified the recurring patterns that I kept asking for over and over. And those recurring patterns, those scattered suggestions that I try to give in code reviews, are what this talk is about. In terms of code, MetalLB is a nicely sized project, not too big, not too small, and I think it's worth keeping alive.

So some quick words about me. I'm Federico, I work for Red Hat, and I'm part of a networking team in charge of making the OpenShift platform suitable for telco workloads. That means that I touch and contribute to a lot of different network-related projects, but that doesn't mean that I'm a network expert, because I'm not. So don't come asking me to fix your router, as my parents do, because I won't. All of these are my handles. Probably the Mastodon one needs to be adjusted, but you can find me there. If you have questions to ask, if you need to provide feedback, I'll try to reply.

So let's start with cognitive load, and this is the Wikipedia definition. Cognitive load is meant to be the extra energy, the amount of effort that we need to put in to understand something, and that applies perfectly to our code base. It might be because we are reading something that we wrote years ago, when we were less expert.
It might be because we are trying to review some code that somebody else is trying to push to our project. It might be because we got a bug report and we need to correlate the behavior that we get in reality with what we understand from our code. And the less energy we need to spend, the better, because it might be evening, we might be tired, and there might be some urgency. That's why it's so important. Sometimes this extra energy is proportional to the complexity of our code. Think about cryptography. Think about ultra-optimized code that runs in embedded systems. But some other times, it's not. Take this example, and take the same code run through an obfuscator. It takes a lot more energy to understand that this function prints hello world. So this is to say that we need to put in an effort, because that effort gets us a reward in terms of speed of development and speed of understanding.

A disclaimer: not everything is black and white. Of course, there might be exceptions to the suggestions that I'm going to give. And this talk is more or less a collection of scattered rules of thumb that I collected from sources that I trust. So in case you don't like them, blame the sources.

In general, I think that the code we write should take care of two sides. One is, of course, the implementation, and this implementation is pretty clear, I guess. This function is just doing the sum of two numbers. It's easy to understand. We can't argue with that. But what if we land in a code base that is calling it like this? That takes more energy compared to a better version of it, where the function is named nicely, so we understand what it's doing. This is to say, and this is something that is going to be recurring in this talk, that what matters is not only how we care about the implementation, but also how we care about the users of our packages, of our functions, of our objects.
So let's start with the first item, which is the line of sight. And this is something that I believe every good and idiomatic code base should try to foster. Basically, we have this leftmost indented line where all the happy path lives, and we have this indented one where we handle all the exceptions. And I expect every well-written code base that I land in to respect this rule. And there are a few tips to do this. It wasn't me. So these are just tips to implement this. And let's see why it matters, how it will make our code base better.

This was more or less a real example that I got from a real PR, and it was really hard to follow all the special cases. So I tried to give feedback and hammer it with suggestions: leveraging early returns, flipping conditions, removing else. When I see an else, it's something that I try to get rid of. It's a red flag, and I think three times before allowing it to go through. And then leveraging more returns, and then wrapping part of it into a function so we can leverage even more returns, because now we have a smaller scope. So we got from something that looked like this to something that looked like this. And I dare you to say that this isn't easier to understand. And remember, the first one is understandable, but it requires a lot of energy. The second one is clear. It's better because of all the reasons that I already gave before. There is this nice blog post from Mat Ryer about this very same topic. He more or less gives the same set of advice. Line of sight is not just a nice exercise. It's a rule of thumb that allows us to untangle our code and to make it slicker and easier to understand.

Next I'm going to talk about package names. This is another favorite of mine. We know that naming is hard, and that is particularly true in the case of package names.
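Going back to the line-of-sight rule for a moment, here is a minimal sketch of the kind of rewrite described above. The Node type and both functions are hypothetical and are not the code from the PR mentioned in the talk. The first version nests the happy path inside else branches; the second flips the conditions and returns early, so the happy path runs down the left edge.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical type used only for this illustration.
type Node struct {
	Name  string
	Ready bool
}

var (
	errNilNode = errors.New("node is nil")
	errNoName  = errors.New("node has no name")
)

// describeNested is the "before" version: the happy path is buried
// three levels deep and every branch needs an else.
func describeNested(n *Node) (string, error) {
	if n != nil {
		if n.Name != "" {
			if n.Ready {
				return n.Name + " is ready", nil
			} else {
				return n.Name + " is not ready", nil
			}
		} else {
			return "", errNoName
		}
	} else {
		return "", errNilNode
	}
}

// describe is the "after" version: conditions are flipped, special cases
// return early, and the happy return is the last line.
func describe(n *Node) (string, error) {
	if n == nil {
		return "", errNilNode
	}
	if n.Name == "" {
		return "", errNoName
	}
	if !n.Ready {
		return n.Name + " is not ready", nil
	}
	return n.Name + " is ready", nil
}

func main() {
	msg, err := describe(&Node{Name: "worker-1", Ready: true})
	fmt.Println(msg, err)
}
```

Both functions behave the same; only the shape differs, which is the point of the rule.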
We know that the name of a package should be short enough, because it consumes screen space, but it should also be good enough to let us understand the purpose of the package. In Go there is even more to it, because when we use an object, the name of the package is part of the name. So that is an opportunity for us to put some value in that part, which the reader can consume. And again, I'm starting with a bad example. We have this utility package, and we have this copy node function that is totally fictional, but that utility part is a wasted opportunity. It's part of the name that doesn't add any value. So it's better to split our package into smaller scoped packages that explain what they do. And in this case, from the calling side, you have node.Copy, which still explains the purpose of the function and is not wasting space. And this was taken from the official Go blog, and it says basically the same thing. There is no need to have these gigantic kitchen sink packages where we throw everything, because in Go packages are free. So it's fine to split them in a better way.

The next one is going to be about errors, and I also see this happening very frequently. In Go, errors are types, and let's say that the developer wants to handle a special error. The problem with approaches like this one is that we are giving away the fact that errors are types: we are converting them to a string, and we are treating them as a string. And since Go 1.13, which is legacy by now, we have better tools, so there are no excuses not to use them. There are two ways. One is to assert that the error we are checking is an instance of a given object that we have somewhere (sorry, this slide is new because the other one wasn't working). The other is to assert that the error we want to handle can be converted to a specific concrete type that implements the error interface. But there is more.
So in this way, you can have wrapped errors, and you can assert that the error you are checking matches not only the one that you are handling, but also any error inside the wrap. And this is how you wrap them. You can either use errors.Wrap, so the error returned from this function will contain the value returned by this one, and will also return true if we assert against the wrapped one. And there is also the way suggested by the standard library, which is using the %w format verb. Both of them will return you a wrapped error.

So now let's talk about pure functions and why they are important. A pure function has two properties. One is the fact that no matter when you call it, no matter how many times you call it, with a given set of input parameters it will always return the same output. And the other property is the fact that it shouldn't rely on the state of your system, and it shouldn't modify the state of the system, be it global variables or static variables or your input parameters or anything that is external to the function. And why does it matter? This is an example where the behavior of this function depends on the state of an external system that is accessed through a client. And then you have the business logic after that. And why is this bad? I would say mostly because it is hard to test. Sure, we can mock the external system, we can do tricks to replace the client, but moving the stateful part out of the function and having the business logic implemented as a pure function will allow us to be quicker in writing the implementation and in writing our tests.

And how about the second property? So we have a function that accepts a pointer, and in some random cases it changes the object. And what's the problem with that? The problem is now on the reading side, because it's not clear enough that this function is changing the node.
So you get your bug report and you look at the code, and you know that somewhere the name of the node changed, but you don't know why. And that's because it's not clear from the outside that this is what the function is doing, and it's harder to reason about it. So one way is to change the name of the function so it's clear, but I think that a better way, and this comes up quite often, is to delegate the responsibility of changing the object to the outside and to turn the function into a pure function. Again, this version is easier to understand, it's easier to reason about, and it's clear when you will have something to change.

And the same can be said about environment variables. In the world of pods and containers, adding a new knob as an environment variable is so convenient. Just add an environment variable, consume it from the function where you need it, and you are done. But the problem with that is that you then don't have control anymore over all the knobs, all the parameters that your program is consuming, because they are scattered across the code base. And that is bad, because you can't foresee what a given function is doing by reading its calling site. So again, this is something that should be avoided. Environment variables should be read in your main function and then be propagated through the stack.

So another topic that I care about is function arguments. And the first one is booleans. You start with something like this, where you have a simple setup function that is easy enough, and then, with all the good intentions of the world, the developer starts adding a parameter, but then we need another one, and then we need another one. And how does it look at the calling site? Something like this, and you think: true, false, true, true, false, what the hell? And then you need to stop, you need to go into this function, you need to understand: where was the enable webhook parameter?
It was the first one, and then you get back here. This works, but it adds friction, and getting a better version of it is so cheap that we should do it, because we are doing a favor to our future selves, we are doing a favor to the maintainer, and it's going to be easier to understand. Another option might be to pass a structure to the function. That also works. But not this.

Now I want to talk about function overloading, or the fact that Go doesn't have it, so it's more or less the same as the other one. Go doesn't have function overloading, so it's easy to end up with this full variety of the same function where we need to slightly change the behavior. So you start with creating a service, then you need one with a backend, then you need one with an IP, and then you need one with a backend and with an IP, and it's clear that this can easily get out of hand. So an approach that I really like is using a variadic argument with some modifiers that accept the parameter and do what they have to do. And this is how it looks from the calling site. Again, it's clear, it's easy to understand, and your future self will thank you for this. And there is also another version where you can have these generator functions. I think it's on the borderline of being too magic for me but, again, this one is easy to read.

So, the next one. I see this happening a lot in the world of controllers, where you have one file that basically implements all the methods related to a controller. So you have this file and you need to add a utility function, and all the other functions are methods, and what do you do? You add a new method, even if it doesn't have to be a method. So you look at something like this and you think: hmm, why is this a method? Is there something wrong with that? And this, again, is adding friction that could be avoided. So if a function is a function, just make it a function and not a method, because testing is also easier.
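As a sketch of the variadic modifiers approach described above for replacing boolean parameters and near-duplicate constructors: this pattern is commonly called functional options. The Service type and the WithBackend and WithIP helpers are hypothetical names chosen for this example, not the ones from the slides.

```go
package main

import "fmt"

// Service is a hypothetical type used only for this illustration.
type Service struct {
	Name    string
	Backend string
	IP      string
}

// Option is a modifier applied to a Service while it is being built.
type Option func(*Service)

// WithBackend returns a modifier that sets the backend.
func WithBackend(backend string) Option {
	return func(s *Service) { s.Backend = backend }
}

// WithIP returns a modifier that sets the IP address.
func WithIP(ip string) Option {
	return func(s *Service) { s.IP = ip }
}

// NewService replaces the NewService / NewServiceWithBackend /
// NewServiceWithIP / NewServiceWithBackendAndIP family with one function.
func NewService(name string, opts ...Option) Service {
	s := Service{Name: name}
	for _, opt := range opts {
		opt(&s)
	}
	return s
}

func main() {
	// The calling site names every knob it turns; there is no "true, false, true".
	svc := NewService("web", WithBackend("pod-a"), WithIP("192.0.2.10"))
	fmt.Printf("%+v\n", svc)
}
```

Adding a new knob later means adding one more With function, without touching any existing call.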
You don't have to have an instance that you are not using for anything just in order to test this function.

And then a word about pointers. Go has pointers, unlike many other languages, so people might find them hard to reason about. And when I see two functions like this, my first thought is that the first one is not changing the object and the second one is. So this is the rule of thumb that I'm trying to apply: if a function is not changing the object, then pass the object by value, otherwise pass the pointer. But there are also exceptions. There are some kinds of objects that can't be passed by value, or they can, but they will give you a bad afternoon: mutexes, file descriptors. We need to pass them by reference, because that's the way it works. We have linters that help us with that, and we have this rule of thumb that says: look at the object, and if all its methods are associated with the pointer, then use a pointer. One might argue: how about performance? We are passing the whole object instead of passing just the reference. Yeah, passing the reference is cheaper, but this is not C, this is Go, and that's not always clear. So what we should care about is readability. And we have a lot of tooling that will help us understand if something is in the hot path and can be optimized. And then we need to sacrifice a bit of the readability of our program in order to have better performance.

So now I'm going to talk about something that was advocated in Clean Code, where it says that our code base should read like a newspaper, which means that when you open a file, you should have all the high-level concepts at the top of the file, and then find all the nitty-gritty details of the implementation at the bottom of the file. And this applies pretty well to Go.
So what I expect from a well-written Go file is to have all the public methods, all the public objects, at the top of the file, because when I open the file, I see what this package has to offer to the external world. And so those are our high-level concepts by definition. And another thing that I think is sometimes underestimated is the fact that we can have our packages split into files. So again, in order to have better navigability of our code base, we can split the package into files, have a main file that is named after the package, and then have these smaller entities where we put the different pieces of logic. And this is basically what I'm trying to say here. Try to have the public fields at the top, try to move the utility functions to the bottom, split the package into files, because again, it's free. It won't cost any energy to you or to the executable. And have a main package file that is named after the package.

The next item is about asynchronous functions. And I saw this many times. It's one of the nice things about Go, right? It's so easy, so convenient to implement concurrent code. You can just start goroutines, you can pass channels and have fan-in, fan-out. But the problem is that something like this has some flaws. And I think that it is way better to, again, take the business logic and move it to a synchronous function that is easier to test, without all the infrastructure with channels and wait groups that you need to put in place to reverse the asynchronicity of your function just in order to test it. So if you can, move the business logic into a synchronous function and let the calling site handle the life cycle of the goroutine. So again, that part has to be delegated to the client code, and that will make our function easier to test and our code base easier to reason about. And again, I didn't invent this, as with everything else.
This is from the Go Code Review Comments wiki, and it's basically saying the same thing: try to use synchronous functions as much as you can.

The next item is about functions that lie, and here is what I mean by that. You have something like this. What would you expect this function to do? Clear the node. Exactly. That's what I would expect. But the developer found a very edgy corner case where, if the name of the node is do not clean, then it does not clean. And he was doing that in all the good faith of the world. He's trying to solve a problem here. But the problem is that, again, this is going to give us a bad afternoon, because we'll see that the node is not being cleared, and we'll have to put a lot of printfs in our code or do a lot of debugging in order to understand why this is happening. So again, this is done with good intentions, but the result is not so good. So again, as I said multiple times today, we should defer this responsibility to the calling site, because that will result in a code base that requires less energy and less effort to understand. What if we have this function called 100 times in our code base? Then, I don't know, call it clear the node but not the do not clean one, or have a filter function, whatever, but do not lie to the reader.

So, wrapping up. There is not much to wrap up. I mean, it was just a list of unrelated items. Maybe the only global takeaway is to say that we should be smart and let the calling site do a bit more, because that will give us, the readers of the code base, a better day in the future. I'm a strong believer in the Pareto principle, most often when I'm on the bad side of it, but in this case I think that applying this set of rules, which take very little to implement, will improve the quality of the code base a lot. And then I want to finish with this quote from Rob Pike: simplicity is complicated, but the clarity is worth the fight. And with that, I'm finished. Sorry?
Are there any questions? I'll try to come with a microphone. If it doesn't work, we'll have to repeat it. Hi, thanks for the talk. I was wondering, do you see any room for automating some of these rules and wisdom that you shared today, maybe something else as well? I don't know, I should think about that. Probably some of them, yes, like raising a flag if a function is accepting a channel, for example, but there are exceptions to that, so that shouldn't be blocking. There are some others, like the function that is lying to the user, that depend on the implementation. Or, for example, a function that accepts five booleans should be flagged. So I think that it depends on the case, but some of them might be automated. Any more questions? No? Thank you very much. How was it? |
Building a CI pipeline with Dagger in Go |
Okay. Thank you, everyone. Our next speaker has some interesting news for your CI. There are better solutions than the YAML you're used to. Mark is going to talk to us about building a CI pipeline with Dagger in Go. Thank you. Thank you. Can you hear me? Okay. So, very important information before we get started. I have some Dagger stickers here if you want to pick them up. I don't know. Maybe I can just leave them after the talk, or you can come to me and pick them up. I'll leave the stickers over here. Perfect. People can grab them. Thank you. An important thing: stickers are for your laptop, not for the room. Every sticker you put up inside a room involves a fee we have to pay. So keep them for yourself. Yeah. Well, that's what we are going to conferences for.

So thank you again for joining me today. My name is Mark. And for the better part of a decade I've been focusing on helping engineering teams focus on building their best business applications, instead of worrying about things like CI and how they are being deployed. And I have this fake title at Cisco, technical lead, but I decided that I would come clean here today: I'm really nothing more than just a YAML engineer. Ooh, that feels good. Anyone else want to unburden themselves? Any other YAML engineers here? All right.

So let's talk about CI/CD a bit. CI systems improved a lot in the last couple of years. We have new and more evolved CI solutions today. But we still have some challenges that we face every day. Like the one I've already been hinting at: YAML. Obviously, YAML is one of the biggest problems with CI systems today. Admittedly, sometimes using YAML to build a declarative pipeline can be fine. But, man, you miss a space, and the whole thing just breaks. And you might not even know where to start debugging it. So YAML often makes it really hard for people to even just touch CI.
And the other thing is that CI tends to break for no obvious reason. The pipeline that worked yesterday may not work today, and you don't really know why. And obviously, as developers, when something breaks in production, we can just tell the ops people to worry about it. But with CI, that's not really the case. We have to interact with CI. And if something goes wrong, we might have to be the ones who fix it. And with the CI solutions currently available, everything is running in the cloud or in a remote system, so you don't really have tools that you can use to debug effectively. You have to start guessing and start changing some YAML config. And you have to push that to a repository and then wait for the CI to get triggered. And you have to go through this whole long feedback loop to be able to debug what's going wrong and to be able to fix it. And that's a pain. It takes a lot of time. It's really a huge waste of time. And it's really painful to do.

Now, sometimes it's not the CI that's wrong. Sometimes it's you, pushing something to the repository that you shouldn't be pushing: tests are not passing, or the linters are not passing, or something else goes wrong. And again, you may have the tools locally on your machine, but you may not have the same versions. You may not have the same setup as in the CI. Even though you ran the tests locally and everything was green, it may still fail in the CI. And you still have to go through the same long feedback loop again and again, trying to fix that, instead of being able to run the whole thing locally and being confident that it will just work in the CI as well. And probably there are other challenges with CI, but these are the ones that wasted hours of my life in the last couple of years. So how can Dagger provide an answer to this problem?
So first of all, Dagger is a programmable and portable CI solution, which means you can run your CI pipelines basically anywhere. We will get to how it does that. But the important thing is that you can run your CI pipelines anywhere using the same environment, which means that if it runs on your machine, then you can be confident that it will run the same way in your own CI system. Now, that's a great thing for a number of reasons, because when you start building a pipeline using any of the CI systems today, you still have to go through that feedback loop: adding some config, pushing it to the Git repo, trying to figure out if it works or not, and then changing it until it works. With the ability to run this whole thing locally, you get a much shorter feedback loop, so you can build your own CI pipelines much more quickly than using some remote system. The other thing is that if something goes wrong, you have the whole thing running locally. So again, shorter feedback loop, and you have more tools to debug, so it's much easier to figure out what goes wrong, whether it's the CI pipeline or your code. The other thing about Dagger is that you can actually write your pipelines in your own preferred language. Now, not any language, obviously, only the languages that Dagger supports. You can write your pipelines in Go, Python or TypeScript, I think CUE even, but that's already much better than YAML. You can write your own pipelines in code, and you don't have to invent or use some weird syntax, for example, to represent dependencies between steps or between different pipelines. You can just do that in plain code, so that's great. And the possibility of writing pipelines in your own language points to the fact that you can avoid vendor lock-in entirely.
You would still have a CI solution like Jenkins or GitHub Actions or whatever, and you would still run Dagger on those systems, but you would only have to write a very thin integration layer just to run the Dagger pipelines. You would be much more confident that the pipelines would run the same way on the CI system as on your computer, and yeah, you can avoid vendor lock-in entirely. You can move to another CI system if you want to, and you may say that it doesn't happen often, but when it does, man, it's really painful. Converting from one YAML to another, or from YAML to, I don't know, Groovy or a Jenkinsfile or something, that hurts. And lastly, caching. Every CI system, or most CI systems, have their own caching solutions where you can cache the dependencies of your language or dependency manager, but that requires configuration. You have to make sure that you configure it right, otherwise it could either grow the cache endlessly, and then you will be paying a lot of money for that, or it would just be non-functional and it wouldn't cache anything properly. Now with Dagger, everything is cached by default: every step is cached. You can think about it like a Dockerfile. The result of every instruction is basically cached in a separate layer, and if nothing changed between the steps, then when you run it again, it will basically run the same way and the result will come from the cache. That's really how Dagger works. Obviously, you have some control over what you want to cache and how you want to do it, but by default, Dagger has that covered for you. Now how does all this work behind the scenes? If I had to describe it in one word, it's obviously containers. Now we can do it ourselves today, right? We could just run everything in a container and it would be reasonable to say that it will run on the CI the same way.
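The "every step is cached, like a Dockerfile layer" idea can be sketched in a few lines of Go. This is only an illustration of the general technique (a cache keyed by a hash of the previous step plus the current command), not Dagger's actual implementation; the names `StepCache`, `Run` and `key` are made up for this sketch.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// StepCache remembers the output of each step, keyed by a hash of everything
// that went into it (the previous step's key plus the command).
// Hypothetical type, for illustration only.
type StepCache struct {
	results map[string]string
	Hits    int
}

func NewStepCache() *StepCache {
	return &StepCache{results: make(map[string]string)}
}

// key derives a cache key from the parent key and the command, so changing
// an early step also changes the key of every step after it.
func key(parent, command string) string {
	sum := sha256.Sum256([]byte(parent + "\x00" + command))
	return hex.EncodeToString(sum[:])
}

// Run executes the steps in order, reusing cached results where possible,
// and returns the output of the last step.
func (c *StepCache) Run(steps []string, exec func(string) string) string {
	parent, out := "", ""
	for _, step := range steps {
		k := key(parent, step)
		if cached, ok := c.results[k]; ok {
			c.Hits++
			out = cached
		} else {
			out = exec(step)
			c.results[k] = out
		}
		parent = k
	}
	return out
}

func main() {
	cache := NewStepCache()
	exec := func(cmd string) string { return "ran " + strings.ToUpper(cmd) }
	steps := []string{"from golang", "go mod download", "go test ./..."}

	cache.Run(steps, exec) // first run: every step executes
	cache.Run(steps, exec) // second run: every step comes from the cache
	fmt.Println("cache hits:", cache.Hits)
}
```

Because each key includes the parent key, editing one step leaves earlier steps cached and forces only that step and the ones after it to run again, which is the behaviour described for Dockerfile layers.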
What Dagger adds to the mix here is that you can actually build pipelines with code, and that would be translated into build pipelines. So you would use the Dagger SDK, the language SDK that Dagger provides. Again, today, I believe it's for Go, TypeScript, Python, maybe CUE as well. But the underlying API is actually GraphQL. So if you have a language client for GraphQL, you can actually build your own SDK if you want to, or you can just write GraphQL queries and send those directly to the Dagger engine. But basically, you write your own pipelines using this SDK in your own language, and then the SDK will send GraphQL queries to the Dagger engine. Now, when you run the whole thing locally first, the Dagger SDK will actually launch the Dagger engine for you. All it needs is really a Docker-compatible container runtime. So if you have Docker on your computer or in your CI, then you can run your Dagger pipeline. So that's once more the portability of this whole thing. If you have Docker on your machine, and Docker basically runs anywhere these days, then you can run the Dagger pipeline there. So locally, when you launch this for the first time, the Dagger SDK will launch the Dagger engine for you and you send these GraphQL queries. You'll see a couple of examples of how that looks in the SDK. The Dagger engine basically builds a DAG, a directed acyclic graph, of all those steps, and then sends it through, well, it says an OCI runtime, I believe currently Docker is the only supported runtime, but it sends it through an OCI runtime and runs the whole thing in containers for you. And then, obviously, when a pipeline is finished, you get back the results, and you can use the results in another pipeline if you want to. For example, the result of your build pipeline would be used in your deploy pipeline, and you could deploy your project or whatever you have. So that's how Dagger works under the hood. And let's take a look at an actual example.
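To make "builds a DAG of all those steps" a little more concrete, here is a small, self-contained Go sketch that orders pipeline steps so that each one runs only after its dependencies. It uses a plain topological sort (Kahn's algorithm); the step names and the `Order` function are assumptions made for this example and say nothing about how the Dagger engine is really implemented.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Order returns the steps in an order where every step comes after all of
// the steps it depends on. deps maps a step to the steps it needs first.
func Order(deps map[string][]string) ([]string, error) {
	remaining := make(map[string]int)       // unmet dependency count per step
	dependents := make(map[string][]string) // reverse edges
	for step, needs := range deps {
		if _, ok := remaining[step]; !ok {
			remaining[step] = 0
		}
		for _, n := range needs {
			if _, ok := remaining[n]; !ok {
				remaining[n] = 0
			}
			remaining[step]++
			dependents[n] = append(dependents[n], step)
		}
	}

	var ready []string
	for step, n := range remaining {
		if n == 0 {
			ready = append(ready, step)
		}
	}

	var order []string
	for len(ready) > 0 {
		sort.Strings(ready) // keep the result deterministic
		step := ready[0]
		ready = ready[1:]
		order = append(order, step)
		for _, d := range dependents[step] {
			remaining[d]--
			if remaining[d] == 0 {
				ready = append(ready, d)
			}
		}
	}
	if len(order) != len(remaining) {
		return nil, errors.New("cycle detected: the steps do not form a DAG")
	}
	return order, nil
}

func main() {
	order, err := Order(map[string][]string{
		"test":   {"source"},
		"lint":   {"source"},
		"build":  {"test", "lint"},
		"deploy": {"build"},
	})
	fmt.Println(order, err)
}
```

In this sketch "test" and "lint" depend only on "source", so a real engine would be free to run them in parallel; the sort is there only to make the printed order stable.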
Let's see. So the example will be in Go, because this is the Go devroom. Can you see it from the back? Okay, cool. So the first thing you need to do to use the Dagger SDK in Go is import this module from Dagger. It's the Dagger SDK for Go that you can use to interface with the Dagger engine. And once you have that, you can basically start writing your own program. Now, in this case, I'm using Mage. I'm not sure if you're familiar with that, but it's basically a Makefile-like solution for Go. You can write these functions, and Mage will compile a binary from that and execute it, like it would work with make. Now, you can absolutely import this Dagger package in your own application if you want to. In the case of applications, it's probably not a huge deal if you have an additional dependency in your Go modules. If you're writing a library, though, you might want to create a separate module, for example called CI, and import the Dagger SDK in that separate module so you don't import Dagger as a development dependency in your library's go.mod file. I know it still won't be built or still won't be in the final binary if you import that library, but some people get annoyed if they see a dependency that is actually not necessary for the library. So make life easier for your peers: if you develop a library and use Dagger, just create a separate module and put all the Dagger-related code there. The first thing you need to do when you want to write a pipeline with Dagger is call this dagger.Connect function, which will basically connect to your Docker runtime, launch the Dagger engine for you and start a so-called session. Now, within that session, you can start building your actual pipelines using these containers. It's pretty similar to how a Dockerfile would look, for good reason, and what you can do here is basically use some of the same instructions as you would in a Dockerfile.
You can obviously start from a base image, which will be golang in a Go project, for example. You can mount your source code. That's how you would have access to the source code within the container, and then you can run a bunch of commands like test, or you can do the same with the linter, for example. And the other two here, these are the mounted caches. You can do that with BuildKit, actually. I believe that's a BuildKit functionality, so you can mount a cache directory to the container that will not actually be part of the container, but it will be a mounted cache directory from your host. Now, let's see if I can run this. So, I'm using the mage binary here. I'm telling it to change to the CI directory because it's a separate module, and then I'm telling it to use the current directory. Can you hear me? Okay. I don't know what happened there. And then I'm basically just telling it to run the test function here, again, similar to how it would work with a Makefile. Now, let's see what happens. Kind of hope that I don't have to download all those container images. Let's see. Let's get some logs here. I swear this worked like a couple of hours ago. Oh, you know what? I think I don't have Docker running. Yeah. That's a problem. Yeah, maybe we'll, yeah. So, I don't have the Docker engine running at the moment. Let's see. This should start a new container. I mean, this should start a new container. Let's see what's going on. All right. This is not great. Can we all pray to the demo gods, please? Thank you. Okay. You all just have to believe me that this actually works. Okay. So, here are the logs from the previous run. So, this actually worked before. Yeah, it says, let's see. Okay, let's try that. Let's see. Do we have internet connection here? Yeah, we do have internet connection. Okay. Well, we'll have to work with the logs from here.
So, well, basically what happens here, when it works, is that it just runs the whole thing within this golang image, mounts the source code, and then runs the go test command and just gives back the results. Obviously, this is the build log, this is the debug log, but normally it would just print the output of the go test command. Or the golangci-lint command. Let's see. It's still not working. Let's try from a hotspot. Maybe that works better. Anyway, well, if someone wants to get their money back, sorry, folks, this is a free conference. Anyway, yeah, you will just have to believe me that this works, but I will try to make this work after the presentation. Now, if you take a look at the code here, this is still not very user friendly. If you don't know how Dagger works or if you don't know what happens here, then it's not really useful to you. You will have to go to the documentation and understand how this whole thing works, when it works. But the good thing is that this is not an arbitrary YAML interface you have to use, so we can actually make this a bit better if we want to. And what I did in the last couple of weeks is that I built a higher-level library over the Dagger SDK. So, instead of writing all that container mount nonsense, you can just use this Go package. It's actually called the OCI. You can find it here if you want to give it a try. You still have to connect to Dagger, obviously, but instead of writing that whole container code, you can just use this much more friendly interface to run your tests or run golangci-lint, for example. And it's much easier for developers to interact with this. If they want to change the cover mode, for example, it's pretty obvious how you would do that in this case, compared to how you would do that with the lower-level Dagger SDK stuff. Let's give this another try. Now, from the... Oh, still on. Let's go to the...
Well, it doesn't work. Anyway, if you want to give this a try at home, you're absolutely welcome to do that. The documentation is getting better by the day. It has a bunch of different examples. You will find these examples on my GitHub as well. I promise it works. They've actually just released a brand-new kickstart guide. So far, the documentation... They had documentation for the different SDKs in different places. Now, they have a kickstart guide that is basically the same for all of the languages. Regardless which one you want to choose or if you want to try all three supported SDKs, you can do that with the kickstart guide. You can actually go ahead and run the code from there. And finally, they have a playground that works with their low-level GraphQL API. So if you want to give that a try, it's fairly similar to the SDK, actually. If you want to give that a try, then you can absolutely do that and see how dagger works without actually installing it on your computer. That's all I had for today. If you have questions, feel free. There is a sticker. Thank you very much for your attention. If there are any questions, raise your hand. I'll try to give you a microphone. I have one over here that's closer. Can you use it with something other than Docker? I'm sorry. Can you use it with something other than Docker underneath? I think you can use it with Docker compatible runtimes. So I think you can use it with Podman at the moment. I think, technically, you can use it at runtime. But I don't know if that's currently available as an option. But we have someone from the Dagger team here who can actually answer that question. Hi. So how does the portability work when parts of your deployment depend on publishing a Docker image to a repo that is external or AWS or Terraform or things that require secrets? How does that fit in running it locally? So that's a great question, actually. So the code itself should be completely portable. 
So the pipeline itself can run anywhere. What you would need to do in that case is you need some sort of either a central secret store that you can connect to from your own computer or you need to be able to load some sort of secrets or credentials from your environment variables, for example. You can absolutely do that with the Dagger pipeline. So from that perspective, if the processor is here, you can push to another registry or push to a development environment, for example. So you can parametrize pipelines based on where you run them. You would still run the same code, but you could deploy to different environments from locally. We have one more question there. Okay. Thank you. One last applause, please. |
Debugging concurrency programs in Go |
is going to talk about the most painful thing I ever did in Go, which is debugging concurrent programs. A round of applause for Andriy. Hi. Can you hear me well? Nice. I'm very pleased to see all of you here. In person, finally, after all this COVID. Today I will talk about debugging concurrent programs in Go, and a little bit about myself. My name is Andriy. I'm a software engineer originally from Ukraine, currently, unfortunately, living in Austria. I'm a big fan of sports, gymnastics, CrossFit, and different debuggers, etc. The interest in parallel programming has grown dramatically in recent years. And the added complexity of expressing concurrency has made debugging parallel programs even harder than debugging sequential programs. And usually, sorry, every day at work, I feel like I go through these eight stages of debugging myself. So: that can't happen. That does not happen on my machine. That should not happen. Why does it happen? Oh, I see now, I feel I know what the problem is. Then: how did that ever work? So in the last couple of days, I saw a PR like, oh, some code has not been working for two years. And I was like, who wrote this? And like, oh, wait, it was me. So the classical approach for debugging sequential programs is very easy and straightforward. We repeatedly stop and set breakpoints. We just go step by step. Sometimes we print something, sometimes we continue, rerun, etc. And this style we usually call cyclic debugging. But the problem is that, unfortunately, parallel or concurrent programs do not always have reproducible behavior, even when they run with the same inputs on the same machine. The output can be radically different, and it's hard to predict. This difference occurs, for example, when you run a program like this one. As you can see, it's a very dummy one, but the output is different each time I run it on my machine. Sometimes it's the same, but sometimes not.
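The program from the slide is not in the transcript, so the following is only a guess at the kind of "dummy" program meant here: a few goroutines report their ID, and the order in which they arrive depends on the scheduler. The function name `Collect` is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Collect starts n goroutines that each report their own ID and returns the
// IDs in arrival order. That order is decided by the scheduler, so it is
// not guaranteed to be the same from one run to the next.
func Collect(n int) []int {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		got []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			mu.Lock()
			got = append(got, id)
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return got
}

func main() {
	for run := 0; run < 3; run++ {
		fmt.Println(Collect(5)) // the order may differ between runs
	}

	// The set of results is stable even though the order is not.
	ids := Collect(5)
	sort.Ints(ids)
	fmt.Println("sorted:", ids)
}
```

Only the order varies; the set of values is always the same, which is why tests of concurrent code usually assert on the set rather than on the sequence.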
Yeah, I spent lots of time reading books and articles and watching videos on YouTube. I was always trying to find an answer to my question. Okay, we have books on how to write code, we have books on how to write tests, but on how to debug code, there are no books. There are not even books on how to debug concurrent programs. So, to start explaining my journey and how I usually do it, let's briefly recall what a goroutine is. So a goroutine is just an abstraction, yeah, and by the way, there is a struct which handles a goroutine under the hood inside Go. And usually goroutines are multiplexed onto multiple OS threads. So if one should block, like when we're waiting for some I/O call, others can continue to run. And the design also hides many of the complexities of thread creation and management. Go does it on its own, so it's nice. And creating a goroutine is very easy: just prefix your function call with the go keyword, and that's a new goroutine, nothing complicated. By the way, who knows why they named them goroutines? Maybe somebody has ideas, yeah, go ahead. Why not just call it a goroutine? So in each language, we can just replace the first letter, and like goroutine, yeah, it's, yeah, yes and no. So they call it that, at least from what I read, because threads, coroutines, processes and so on are not an accurate description of what goroutines do. A goroutine has its own simple model of how it's executed, et cetera, et cetera. And that's why they named it that way, cool. So next question, before I share my experience: how do you think I can debug my concurrent program? So nice, nice. Can you repeat what the answer was for the stream? Thank you. Can you repeat the question, you mean? If you have an answer from the room, can you quickly repeat it so it's recorded on the stream? Yeah, will do. So let's repeat: how can I debug my concurrent program?
So the gentleman suggested using prints, nice, yes, nice. This is the author of Delve, by the way. Okay, any other ideas? Okay, yes, yeah, it's a good idea, nice. So just to repeat for people who are watching, the ideas were using the Delve debugger, using strace or trace, using tests, et cetera. So my first assumption was, okay, the playground, let's play a little bit. And a few years ago, when I started writing this talk, to be honest, there was a limit: the playground worked only with GOMAXPROCS=1, so it always reproduced my program the same way. But right now, it more or less simulates local development. Okay, I have more bright ideas. So maybe we can just color the logs, I don't know, visualize goroutines, why not? So here's a funny package. What it does is just print different goroutines with different colors, like this. So yeah, I mean, if you do something very quick, you can just figure out which goroutine has which color, et cetera. Yeah, returning to serious things, there is an interesting article. It's quite old, but one of my friends from Ukraine wrote this article a few years ago, where he decided to visualize how all this scheduling of goroutines works with these fancy pictures. Also a very good article, I highly recommend it. Another idea is to try to print how Go schedules events. So there is an environment variable which can print some extra information for you. And of course, using debuggers. Today I will focus a little bit on Delve and a little bit on GDB. So next question. Can I set a breakpoint inside a goroutine? Any ideas? Yes? No? Yes? So the answer is yes, yeah, you can set a breakpoint inside a goroutine, you can jump into this goroutine, see what's inside, and yeah, it's very handy, especially if you develop servers and other stuff, okay. What about channels?
So if I decide to send a message to a buffered channel of size four, yeah, it's very nice that you can set a breakpoint, you can print the channel, and Delve has very fancy metadata which even shows you the current channel state. So you see I sent one, it's the first item, and some other information that is also useful. Then if I add another one, so next, you see now I have two elements in the channel. And the small problem: usually, if I want to send a message to a channel from the Delve CLI, unfortunately it's not supported. Here's the issue I created, yeah, and there's a comment that yeah, we can fix it, but yeah, I hope we will fix it some time, yeah, so you can't set, so technically it's possible, but it's not, I mean, so it can be the same semantics, you can set and Delve will handle it. Okay, now let's focus a little bit on how we can debug goroutines. So yeah, if you're inside a goroutine and you want to print the state of the goroutine, there's a keyword, goroutine, which prints the current goroutine where you put your breakpoint. But if you have lots of goroutines, there's an interesting feature I really use a lot. But let's step back a little bit. There's another idea and implementation: you can use these profile labels. It's inside the pprof package, so you can run pprof.Do and, inside it, run your code with the context, and it will mark your goroutine with a label. Usually you use these labels for profiling, so you can open pprof profiles and see some different metrics, but you can also do it with Delve, which is super cool. So you can label your goroutines with labels like this, or if you use a web server, you can use this middleware, I posted the link on the next slide, and it will automatically add labels to all your handlers, which is nice, so you can see which handler you are currently in. Because if you print goroutines, even in Delve, you will see lots of unreadable information, but if you
just need to focus on login goroutines or goroutines which are doing something with your database, you can label them in the same manner as you do with pprof. And then, yeah, you can also do it directly. By the way, this library which I mentioned is a very small one, it also supports setting labels, it's just a wrapper, so a very handy one. And then, if you run the goroutines command inside the Delve debugger with -l, it will print the goroutines with their labels. This is just a very simple hello world, which has this main goroutine and a few other goroutines without any labels, et cetera. But then I created another project, inspired by one article, and yeah, so here you can print all goroutines which are related to your label, page. And yeah, you can also go to the docs and find different options like group by, I don't know, filters, so it's very handy. And once you find your goroutine, you can switch to this goroutine, if you didn't know, and you can also print or list the source code, you can set a new breakpoint, it's very nice. And yeah, you can also use this demo project. It's not mine, and it's more written for GoLand, but to run it, you just need this small tweak: you need to pass some build flags with the debugger tag, otherwise this library will not work. And then you can repeat everything I did. I highly recommend playing with it, so when you need it, you will already have everything you need. Regarding GDB, yeah, I played a little bit with it. It doesn't quite support what I need for Go. And yeah, it has this info goroutines command, but as far as I remember, you can't filter goroutines and it's not readable, so yeah, especially this part. Yeah, and I decided not to waste my time, to be honest, because you can just use Delve for such problems rather than playing with GDB.
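The labeling described above uses the standard `runtime/pprof` package. The sketch below shows the basic pattern with `pprof.Do`, `pprof.Labels` and `pprof.Label`; the label key "page" echoes the one mentioned in the talk, while the function name `LabelOf` and the value "login" are assumptions made for this example.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
	"sync"
)

// LabelOf runs a function in a new goroutine tagged with a "page" label and
// returns the label value that the goroutine sees in its context.
func LabelOf(page string) string {
	var (
		wg   sync.WaitGroup
		seen string
	)
	wg.Add(1)
	go func() {
		defer wg.Done()
		labels := pprof.Labels("page", page)
		pprof.Do(context.Background(), labels, func(ctx context.Context) {
			// Code running here belongs to a labeled goroutine. A debugger
			// breakpoint set on the next line stops inside it.
			seen, _ = pprof.Label(ctx, "page")
		})
	}()
	wg.Wait()
	return seen
}

func main() {
	fmt.Println("label seen inside goroutine:", LabelOf("login"))
}
```

With a breakpoint inside the function passed to `pprof.Do`, the `goroutines -l` command mentioned in the talk should list that goroutine together with its label. Delve's documentation describes the available filter and grouping options, which may vary between versions.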
Cool, so next: it's not only with a debugger that you can find your problems. One important problem in the Go world is deadlocks. With deadlocks, usually the program gets stuck on a channel send operation which is waiting forever, for example, for someone to read the value. And it's nice that Go supports detection of these situations compared to other languages. For example, Python does not support this deadlock detection, which makes such problems hard to debug. And yeah, if you need real-world examples, you can look at this very interesting library, go-deadlock. Using this library, people also found lots of deadlocks in CockroachDB, and there are lots of interesting examples of how mutexes can be handled properly, how to write it properly, et cetera, et cetera. This library is an entire separate discussion. Returning to our case, yeah, I put this very simple example on the slide. So yeah, sometimes you have this conflicting access and you have these data races. I saw it a few times in some open source projects, and usually people do not check for it, so I highly recommend running your CI pipeline with this -race flag, especially the tests. Always run with this flag and it will print whether there are data races or not. This -race flag cannot always find all data races, the common ones yes, but sometimes no, but I highly recommend adding it to your project, so never skip it. So now I have seven rules for you on how to unblock yourself when you get stuck on something and you don't know how to debug it. So first, never assume a particular order of execution. When you write concurrent programs, try to always think about them not running in a particular order, especially when it comes to benchmarks and tests. I also saw it lots of times: when you run go test, by default, as you know, tests can run in parallel, but usually people say, no, run it sequentially, and that's not a good idea. Another piece of advice, it's more
about designing than writing code: try to implement any concurrency logic at as high a level as possible. Try not to pass around lots of channels, lots of goroutines, et cetera. Try to keep the logic separate and the concurrency separate. Yeah, don't forget, as I said, the built-in detection does not always help, because Go only reports a deadlock when the whole program freezes, not when only a subset of goroutines gets stuck. As the gentleman suggested, you can use strace and different tools for tracing, which can help you see whether you are waiting for some resource like reading a file or accessing the network. It's more low level, but it's very useful. Yeah, I showed it in another talk, but you probably know about it: you can use conditional breakpoints, which help you cover cases especially when it's a concurrent program, so you can catch only your case and not click next on every goroutine. As I said, you can use the scheduler tracer, you can use go-deadlock, and yeah, last but not least, use a debugger, don't forget about it. It's also very handy, and with every release, every version, I see how debuggers are adding new stuff, which is nice. Cool, so I have a few references, because it's hard to cover everything in 25 minutes. I will post the slides so you can read everything carefully, or maybe take a picture, and thank you, thank you. Are there any questions? Yeah, while you're thinking, if you want to donate to Ukraine, just let me know. A few of my friends are fighting right now, so we can help directly, if you're afraid. Thank you. Oh, I have a question. Have you tried using tools such as rr or Hermit, which try to execute the program in a deterministic fashion? You mean backwards? Yes, they can record the execution and then replay it, but the point is that the recording is deterministic. Yeah, I use it for sequential debugging, never for concurrent debugging. I mean, maybe it's possible, but in my case I covered what I just showed. Of course there are other cases, I will try.
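As a small illustration of the data race advice in this talk, the sketch below shows the usual fix for a shared counter: guard every access with a `sync.Mutex`. The slide's example is not in the transcript, so `Counter` and `CountTo` are names made up here. Running the tests of such code with `go test -race` is what the speaker recommends for CI.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is safe for concurrent use because every access holds the mutex.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds one to the counter.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// CountTo increments one shared counter from the given number of goroutines.
func CountTo(workers int) int {
	var (
		c  Counter
		wg sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// With a bare c.n++ and no mutex, this line would be a data
			// race, the kind of bug the race detector is built to report.
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(CountTo(100))
}
```

The unprotected version may still print the right number on many runs, which is exactly why such bugs survive until a detector or a production incident finds them.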
If you are leaving the room, try to stay quiet for a second. Do not talk, chairs are okay, so we can still hear any questions. Well, no more questions, that means your talk was very clear. Thank you and a lot of applause. |
What's new in Delve / Tracing Go programs with eBPF |
Okay, thank you. We have two traditions here in the Go devroom. That is that we start with the state of Go, and then around lunchtime we always have the next state, which is the state of Delve. So let's all get into Delve. Let's go debugging. Let's try this again. Hello, everybody. My name is Derek Parker. I am the author of Delve, and I continue to maintain the project along with my lead co-maintainer, Alessandro, who is also in the crowd today. And as mentioned, it's been kind of a tradition here at FOSDEM to piggyback on the state of Go and talk about the state of Delve. So this talk will be kind of a two-parter. I'll start with the state of Delve, and then I'll go into the main talk, which is debugging Go programs with eBPF. Now, if you're unfamiliar with what eBPF is as a technology, fret not, I will explain it in more detail throughout the course of the talk as we get into the real meat of everything. So just to introduce myself a little bit more, again, my name is Derek Parker. I'm a senior software engineer at Red Hat. If you would like to follow me, I am Derek the Daring on Twitter. And at Red Hat, I work on Delve and Go itself. So the first thing that I want to talk about is the state of Delve. What I'll go through is essentially what's changed since the last FOSDEM, and actually since the last in-person FOSDEM. So I was actually here in 2020 presenting a different talk before the world ended. And I'm happy to be here again in person with everybody, being able to talk and catch up and present these things. So thanks everybody for coming and for attending this talk. I really appreciate it. So one of the big milestones that I want to call out is that Delve turns nine this year. So to celebrate that, on the count of three, everybody in the room, we're going to sing happy birthday. One, two, I'm just kidding. Maybe next year for the 10th anniversary. I'll hold that off for a little bit.
But the Delve project was started in 2014. And yeah, it turns nine, still going strong. And I appreciate everybody who uses it and contributes to it. It's just really fantastic to see. So I'll go into some statistics about what's happened in the last couple of years. Since the last in-person FOSDEM, we've done 18 different releases. Now, the way we do releases of Delve is somewhat scheduled, somewhat ad hoc. We always produce a new release when the first release candidate of the new Go version comes out. So anytime a new Go version comes out, we ensure that the day that it's released, you can debug it. Once you compile your code, you have a debugger that's going to work with that version. And then aside from that, we have minor releases that come out throughout the year, in between those releases, to fix bugs, add new features, things like that. So within that time frame, we've added support for numerous different Go versions: 1.14 all the way through 1.20. And as 1.20 was just released the other day, we've supported it since the first RC. So you always have a debugger to go through your code even before the official release actually comes out. During that time, we've also added four new operating system and architecture combinations. With Delve, we strive to enable you to debug on any operating system and architecture that Go itself supports. We're getting closer and closer to that goal with each passing release. So I'm proud to say that within the last few years, we added support for four new platforms. And there are a few more already in the works that we'll be releasing later this year. So I want to call out a few major new features that have been developed. The first is that we've integrated a DAP server into Delve, which is probably not something that's super relevant to everybody here unless you're the author of an editor or something like that.
It's really for editor integration, but from a user's perspective, it really improves the usability of Delve within editors such as VS Code and things like that. We've added support for Apple Silicon, and that happened really quickly once we were able to get our hands on the hardware. We added the ability to generate core dumps from running processes. So while you're in a debug session, you can generate a core dump ad hoc, save it away and debug it later. We've added support for hardware watchpoints, which I think is a really, really cool feature. It's kind of difficult to do with Go due to some internal details of how Go looks at the stack and changes goroutine stacks as the stack grows and things like that, but we were able to implement them and get them working. So if you're unaware of what hardware watchpoints are, it's a really cool feature where you can say, I want to watch this particular variable or this particular address in memory, and I want the debugger to stop any time that value is read or changed. So you're basically just telling the debugger what you want to do and letting it do the heavy lifting for you. Really cool feature. And as was just shown in the previous talk, we've improved some of the filtering and grouping for goroutine output. So you can filter by label, you can filter by all different kinds of things. So in massively concurrent and parallel programs where you might have tons and tons of different goroutines, we've improved a lot of the introspection there, so you can filter and get the information that you really need. We've also added an experimental eBPF-based tracing backend. So that's what I'm going to be talking about today. And we also added support for debuginfod-find.
So this is really cool for a lot of operating systems where maybe you're debugging a package that you installed via your package manager and the DWARF, the debug information, is not included with the binary. Maybe it's in a different package or something like that. We've integrated with debuginfod to be able to automatically download those debug packages for you so that you can have a fruitful and successful debugging session. And there's also been a lot more. If you want a look at all of the details, go ahead and check out the changelog in the repo. It'll detail all of the changes that we've made. Next thing I want to talk about is a few upcoming features that I want to tease. So one of the biggest ones is we're working on support for two new architectures: PowerPC64LE and s390x. My colleague Alejandro is working on the PowerPC64LE port and he's in the crowd as well. So thank you for your work on that. We're looking at some more improvements to the eBPF tracing backend. I'll go into some more detail on that as well during this talk. We're also working on the ability to debug multiple processes simultaneously. My co-maintainer Alessandro is working on that and we're hoping to land that pretty soon. So if your process forks or otherwise creates new child processes, you can debug all of them within a single session. Another thing that we want to work on this year is improved function call support across architectures. That was a big feature that landed in Delve as well, the ability to call a function while you're debugging your program. It's very architecture specific. So one of the things that we want to do throughout this year is improve support for that across different architectures. There's tons more. We're always working on new things and we also always try to gather community feedback and user feedback.
So since I'm here and other maintainers of Delve are here, if you want to come and tell us something that you would like us to implement or something that you would like us to focus on, feel free to come chat with us. So now with that said and done, I want to move on to the real portion of this talk, which is debugging and tracing Go programs with eBPF. Now it's really cool that this talk comes after the talk right before, because I think the tracing feature in Delve is somewhat underutilized, and I think it's really good for debugging concurrent programs and seeing the interactions between goroutines as your program is actually running. So if you're unfamiliar with what tracing is, I'll show a little demo, but essentially what we're talking about here is, instead of going into a full-on debug session, what you're really doing is spying on your program. So if you're familiar with strace, it's the same concept, except for the functions that you're writing as opposed to the interactions with the kernel, like the system calls. So you can spy and see what functions are being executed, what are their inputs, what are the outputs, what goroutines are executing that function, so on and so forth. So to show a little demo of it real quick, let me increase my screen size a little bit. It may still be hard for folks in the back to see but hopefully that's good enough. So what I've done here is, instead of typing everything directly on the console, I've created a little Makefile just so that you can see the commands up there and they don't disappear as I run them. But the first thing that we're going to do is we're just going to run a simple trace. So to do this, we use the trace subcommand of Delve, and what you provide to it as an argument is a regular expression, and what Delve will do internally is set a tracepoint on any function that matches that regular expression.
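The matching step just described can be sketched in a few lines of Go. This is only an illustration of "set a tracepoint on every function whose name matches the regular expression": the symbol list is made up, `matchFunctions` is a hypothetical helper rather than anything from Delve's source, and whether Delve anchors or otherwise preprocesses the pattern is not something the talk states.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchFunctions returns the names in symbols that match pattern.
// A tracer would then set one tracepoint per returned name.
func matchFunctions(pattern string, symbols []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var matched []string
	for _, name := range symbols {
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	// Hypothetical symbol names standing in for a binary's function list.
	symbols := []string{"main.main", "main.add", "fmt.Println", "runtime.gopark"}

	matched, err := matchFunctions(`^main\.`, symbols)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	for _, name := range matched {
		fmt.Println("would trace:", name)
	}
}
```

Keeping the pattern a plain regular expression is what lets the same command scale from one function to a whole package.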
So you can do something like main.* to trace anything in the main package, extrapolate that out to any other package, and it's a really cool feature. So just to show how it works, we can go here and say make trace and we see the output there. So to explain the output a little bit, the single arrow is the call, the double arrow is the return. You can see there it labels what goroutine is running and calling that function. You can see the arguments to that function and then you can also see the return value. So again, really cool and useful if you have a bunch of different goroutines: you can see the interactions of them and see what goroutines are doing at any given time. Another option is, if you pass the --stack flag and give it an argument, you can get a stack trace anytime one of the tracepoints is hit. So if we say trace with stack, you see we get a similar output but we get a stack trace as well. So you can see a little bit more detailed information as your program is being traced. So the real meat of this talk is how we improved the tracing backend to make it more efficient, because especially when you're doing something like tracing, the lower the overhead the better. We don't want to make your program run significantly slower, because that's just going to frustrate you and it's going to take longer to get to root cause analysis, which is what you're really trying to do if you're using a debugger in the first place. So we'll talk quickly about how things are currently implemented and then how we can improve upon that using eBPF. So traditionally Delve uses the ptrace syscall to implement the tracing backend. ptrace is used by pretty much every debugger, every kind of tool like this. Delve is no exception.
And if you look at the man page it'll explain a little bit more about what it is, but it essentially allows you to control the execution of another process and examine the state of it, memory and things like that. So the problem is ptrace is slow, and it can be very slow. I ran some tests a while ago when I was implementing the first iteration of this eBPF backend, and I measured a simple program execution that executed in 23.7 microseconds. And then with the overhead of the traditional ptrace-based tracing, it went up to 2.3 seconds. So that's several orders of magnitude of overhead, which is definitely not what you want. But why is ptrace so slow? Part of the reason is syscall overhead. ptrace is a syscall, so whenever you invoke it, you trap into the kernel, you switch context. That has its own overhead, which can be pretty significant. And as I mentioned, the user space to kernel context switching can be really expensive. And it's amplified by the fact that ptrace is in a sense very directed. So when we're tracing these functions, we often have to make multiple ptrace calls per function entry and function exit. If you think about it, we need to read the registers, we need to read all of the different function arguments that are there. There's a bunch of different things that we need to do. So it balloons up really, really quickly, where we get into this situation where we're doing a ton of user space to kernel context switches every time you hit one of these tracepoints. And on top of that, all of these operations have to happen twice per function, right? The entry and the exit. So it's a lot of overhead, a lot of context switching, essentially a lot of unnecessary work that just slows down your program. So the way that we can improve upon this and work around this is by using eBPF.
So eBPF is a lot more effective and efficient, a lot quicker at doing this kind of work. So with the same task, again, as I mentioned before, the original program took 23 microseconds, with ptrace 2.3 actual seconds, and with the eBPF-based tracing we have 683 microseconds, which is still measurable overhead but significantly less than the traditional method of doing it. So I've been talking about this technology a lot, eBPF, eBPF, eBPF, right? But what actually is it? eBPF is a technology that enables the kernel to run sandboxed programs directly. eBPF programs are written primarily in a limited C. I'll get into some of the limitations later. But it gets compiled to a bytecode and loaded into the kernel, where it's executed and JITed as it's run. And it has a lot of use cases: observability, networking, debugging and a lot more. So you'll hear a lot about eBPF. I'm sure a lot of folks in this room have already heard of it in some shape or another. It started as a technology for networking and ballooned from there. Originally it was BPF, which is Berkeley Packet Filter, and it became extended Berkeley Packet Filter. And now the acronym doesn't really mean anything anymore. eBPF is just eBPF because it's way more than what it originally was. And the cool thing is these programs that are loaded in the kernel can be triggered by certain events. I'll talk about how we can trigger those events ourselves, but they run in response to something happening. So why is eBPF so fast in comparison to the way that we're traditionally doing things? The first thing is these eBPF programs run in the kernel. So there's a lot less context switching overhead. We're already in the kernel, so we don't have to keep asking the kernel for more and more information to get what we actually want. Relative to a traditional syscall, or a bunch of syscalls, the context switching is a lot cheaper.
You get small, targeted programs that, again, execute really quickly and can do everything that you need or want to do in essentially one shot. And a single program can execute many tasks that we would traditionally use multiple ptrace calls for. So you have access to the current registers, you can read memory, and a lot of other things like that. Now, when I was looking to implement this backend, I had a few requirements that I wanted to make sure could be satisfied with this eBPF-based approach. The first one was the ability to trace arbitrary functions. As a user, you just want to say, I want to trace everything in the main package, or I want to trace this specific function, or whatever. This new backend had to be able to satisfy that requirement as well. We had to be able to retrieve the goroutine ID from within the eBPF program. We had to be able to read function input arguments and we had to be able to read function return values. Now, let's talk a little bit about tracing arbitrary functions. Just as a little bit of background on how Delve uses eBPF from the Go side of things: we use the Cilium eBPF package. There are a few other Go-based eBPF packages out there. Originally, I implemented this using one from Aqua Security but ended up switching to Cilium for a few different reasons. But the first thing that we need to do when we're tracing these arbitrary functions is load the eBPF program into the kernel so that we can start triggering it with some of these events. Once we've loaded the eBPF program, we attach uprobes to each symbol. This slide is actually a little bit outdated because we don't actually use uretprobes. Uprobes can be attached arbitrarily to different addresses within the binary. Uretprobes are typically used to hook into the return of a function, which seems like something that would be super, super useful.
In theory, it is, but with Go, it doesn't work very well because of how Go manages goroutine stacks. When Go has to inspect the stack, it reads up the stack to unwind it a little bit, and if it sees anything that doesn't look right, it'll panic. Uretprobes work by overwriting the return address of the function that we're trying to probe. Go notices that during its runtime work and freaks out. So we just use uprobes. Again, we want to do as much in the kernel as possible to limit overhead. We have to communicate function argument and return value information to the eBPF program and get those values back from the eBPF program. So, the first thing we have to do is write the eBPF program. Second thing, compile the program and generate some helpers. This is what the Cilium package helps us with. Then we have to load the programs into the kernel. These are actually links. I'll publish these slides. You can follow along at home, but I'll show a little bit of the code here. This is an example of the eBPF program that we use, written in C, basically. We have access to a bunch of different eBPF-based data structures, like maps and ring buffers. These are just different ways to communicate between the eBPF program running in the kernel and the Go program that's running in user space. I won't go through all of this exhaustively for time, but again, if you want to look at it yourself, go ahead and follow the link. The second thing that we have to do is actually compile this eBPF program and make it usable from Go. The Cilium eBPF package has a really nice helper that you can just use with go generate to compile the object file that is your eBPF program. It generates a bunch of helpers for you that you can call to load and interact with that eBPF program. Then finally, we have to load the eBPF program into the kernel. Again, the Cilium eBPF library has a ton of helpers to facilitate that.
We open up the executable that represents the process that we're debugging. We call this helper that the package generated for us. Then we initialize some of the things that we need, like the ring buffer and the map data structure that we use to pass values back and forth. The next thing we have to do is attach our uprobes. First, we find an offset to attach to, we attach the probe to that offset, and then we go from there. We have a little helper here to convert an address within the program to an offset. The offset is just an offset within the binary itself as it's loaded into memory. Then from there, we attach our probe. It's as simple as taking the executable that we opened earlier, which we have attached to this eBPF context here. We just call this Uprobe method and pass it the offset and the PID. The nice thing about this is you pass along the PID so that this eBPF program is constrained to just the process that you're trying to debug, because these programs that you load in are actually global, so they're not really by themselves attached to any specific process. Then from there, we need to actually communicate with this program. We need to store function parameter information, and then we need to communicate that information to the program. I won't go too much into the code for this for the sake of time, but essentially we need to tell the eBPF program all of the function argument information and the return value information: where they're located, are they on the stack, are they in registers, and let it know where to find them so that it can read all of this information and send it back to the user space program. When we want to get the data back, we use a ring buffer to again communicate between user space and our program running in kernel space, and essentially it's just a stream of all of the information coming back, so all of the information that's being read and picked up by the eBPF program.
That's ultimately what gets displayed to you as we run the trace command. I'll go through another quick demo of actually using the eBPF backend. All you have to do to enable it for now is add --ebpf to the trace command, so if I run our make command here, nobody looking at my password. We see that the trace happens, and from here you can't really tell that it's significantly faster, but the output is a little bit different. As I mentioned, this is still an experimental, work-in-progress backend, so some of the output is a little bit different, and it doesn't have exact parity with the traditional, more established tracing backend, but you can see it works. You see the arguments, the return values, and everything like that, and this is all happening with significantly less overhead. So a few downsides of the eBPF approach. The programs are written in a constrained version of C, so you're not writing Go. You end up having to fight the verifier a lot. If you don't know what that means, that's great for you. Congratulations. There are a lot of constraints on stack sizes and things like that within eBPF programs, which can be kind of gnarly to deal with. It's difficult to write some control flow, like loops, and as I mentioned, uretprobes do not play well with Go programs at all: do not use them, do not try. And that's it. Thank you very much. Thank you. Unfortunately, we do not have time for questions, but if you see him in the hallway track, you can always ask him any questions, improvements, bug fixes, et cetera. If you leave, it's better to do so on this side. You may pause the stage, and there is also a swag table diagram. |
Go Even Further Without Wires
Long Distance Radio Communication Using Go and TinyGo |
While he sets up the slides, I'll quickly introduce him. I have a lot of things to say about him, but he's already running late. But I've never seen such dedication, debugging his code even five seconds before he came on stage. I've never seen such dedication for a talk. I think this is true conference-driven development. Thank you, Ron. At FOSDEM 2021, we learned to go without wires, and we discovered Go Bluetooth, a new package that lets you use Go to connect with Bluetooth. Not just on microcontrollers, but on Windows, yes, I said Windows here at FOSDEM, I'm very brave. Windows, macOS, and Linux. Then at FOSDEM 2022, we learned to go further without wires, and we discovered the mysteries of Wi-Fi and the Internet. Now at FOSDEM 2023, we will go even further without wires. This time, we go long. I am Ron Evans, deadprogram. I am technologist for hire, aren't we all these days, of The Hybrid Group, a micro-consultancy here on planet Earth, where we're all technologists for hire. So we do a lot of open source work, usually for little or no remuneration, and TinyGo is the result of the amazing collaborations of a huge community of people all over the world. So this is about going further without wires. What we're actually talking about here is low-power wide-area networking, or LPWAN. We talked about personal area networking two years ago, then local area networking, and now we're going for wide area networking. And of course, we're talking about LoRa, and LoRaWAN. So what is LoRa? That's a very good question. Maybe we should ask who is LoRa or why is LoRa, but let's start with what is LoRa. So LoRa is, of course, long range radio. It is a semi-proprietary but freely licensed protocol that was created in order to do long range wireless communication of digital data. And yes, I had to ask why is LoRa. Well, long range, of course, I mean you knew that from the name, right?
Ultra low power, not just low power, but ultra low power, and license-free spectrum. That means you do not need to go to any governmental entities and ask permission. But that does not mean free for all. That just means we must share the commons gently, because these airwaves are in fact the property of all human beings. So LoRa is the physical layer protocol. And what we mean by that is it's what tells us, when the radio signal comes in, whether it's a one or a zero. So a question: what do these three things have in common? A bat, a dolphin, and screen star of the 20th century, Hedy Lamarr. I know you're probably wondering. The answer is, of course, chirp spread spectrum. You had that, right? So Hedy Lamarr, in addition to being an actress, as probably many people know, was an inventor of what is now known as frequency hopping, which was a technology meant to avoid jamming and detection during World War II. And we use this today for the LoRa protocol. So to give you an idea, there's an up-chirp and a down-chirp. I will now imitate the up-chirp, and the down-chirp, and imagine that in a cute little dolphin voice. So by being able to parse and modulate these signals, it's able to actually send across long distances using very low power. So how do you use LoRa? Chips, of course. Thank you. Good night. No. Chips mostly from Semtech. Semtech is the company that created the LoRa protocol, and they make most of the chips, and they license them out. The two that are the most common are the SX126x series and the SX127x series. And so what we're going to see is you have a microcontroller, some type of device, and we're going to connect through the serial peripheral interface, which is a low-level serial interface, to the actual LoRa chipset, and then with the antenna talk out to someplace far, far away. So this is where TinyGo comes in, right? You knew that when we saw a microcontroller.
So, the Go compiler for small places. If you haven't checked it out, you can program Arduinos with Go. You'll see in a minute. So let's start with the hello world of things, which of course is a blinky LED. And we're going to start with a Raspberry Pi Pico, which, oh, I forgot to start my video. Let's see here. Because you need to actually see what's going on, or it's not quite as exciting. Now let's see here. Yes, I use all Linux tools, don't we all? Let's see if the camera will come up. Oh, wrong camera. It looks like, well, I think that is it. I forgot to take the lens cap off. That helps. No, I am not a professional photographer by trade. And of course, if we make that bigger, it's a lot easier to see. And we can even bring it into a little bit of focus. All right. So this is a Raspberry Pi Pico with the RP2040, which is a microcontroller made by Raspberry Pi. And as we were seeing a minute ago, it's got a dual-core ARM Cortex-M0+, which is a very, very low-powered, not very powerful 32-bit ARM Cortex microcontroller. It runs at 133 megahertz and has 2 megabytes of flash. So let's just take a quick look at some code, just so you get an idea of what it is that we're looking at. And the hello world of things is a very simple program. Can you see this OK? All right. So it's just a Go program, right? But it's run through the TinyGo compiler, and it compiles to code that can actually run on the microcontroller. So we'll import the machine package, which is a special package TinyGo uses to communicate with the hardware directly, then the time package, same time package, and our function main, you've seen this before. So first we're going to say led := machine.LED, which is the built-in LED that's on a lot of boards. We'll configure that as an output, meaning we're going to send a signal to it to turn it on. And then forever we're going to turn it low, meaning off.
We're going to wait for 500 milliseconds, half a second, turn it on, and then wait for another 500 milliseconds. All right. So let's go and let's see this actually work. So if we go back over here, there we go. And if we make blinky, I really like make. So we'll then compile that code, flash it on there, and you can see that it's a 7K program. Can you see that? Yeah. It's really small, both the type and the program. And then if we go and we take a look, if we, oh, I forgot to plug it in. I was a little rushed for time, I'll admit. Naturally it failed the flash. That would have been frightening if it had worked. There is no wireless in there yet. It's very inexpensive, meaning there's no wireless built on board. All right. So now it's flashed. And if we take a look, we can see an LED is turning on and off. Yes. All right. We're off to a good start. I've tempted the demo gods quite a lot today. So now we're going to use the TinyGo drivers package, which is a sister package to the TinyGo compiler, and which contains support for all different kinds of sensors, displays, and other interesting things like, for example, our LoRa wireless adapters. So our first demo is going to be showing LoRa, just the low-level protocol, transmitting and receiving. And we're going to use the same Raspberry Pi Pico, but we're going to add to it an RF Solutions LAMBDA62. So if we can actually take a look at that here, if we go to the video, we'll take away that one, and we'll put in this one, a different Raspberry Pi, and it's wired up to one of those chips that I showed you before. This, by the way, is the antenna. This little wire. Do I tell you that in the slides? I think I do. Yes. So we're going to take an SX1262 with an 868 megahertz radio, which is what you need in order to be legal and broadcast here in the European region, and it's got a wire antenna, which is literally just a short piece of wire.
And so if we take a quick look at the code for our SX126x, we can see it's not that much longer. It's got a package main, our machine package, time, and now we bring in the drivers for LoRa, which is the actual communication for LoRa, and then for the chip itself. And what we're going to do here in our main is we'll start by sleeping, and then we'll set up the LoRa interface, and then we'll try to receive data, transmit some data, and then sleep. So setting up the LoRa interface is really just about configuring the SPI interface, creating the driver that we need from the TinyGo drivers package, and attaching a radio controller, which we need because these chips have so many different variations, so we can tell it which wires are going to be turning it on and off, and then making sure we actually have the device detected and configuring it appropriately. So here we've got our 868.1 megahertz frequency, the bandwidth that we're using, et cetera, and then once we've got that configured, if you recall, we have our setup, then we'll receive data. So to receive, it's just a matter of saying loraRadio.Rx, and then how long we should wait, and if we don't receive any data, time out and return. And then transmit is almost exactly the same thing. That's going to be transmitting this message here, which is from the RP2040 saying hello, TinyGo, and then it's going to use loraRadio.Tx. All right, let's see if it actually works. The demo gods are just waiting, waiting for their chance. All right, so let's actually plug it in this time, since we are professionals, and let's run make, which will now flash that code, and that one is a whole 15K. Yeah, you have to add something to go wireless.
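The receive-then-transmit round described above can be sketched in plain Go as follows. The `radio` interface is a hypothetical stand-in for a LoRa driver, modeled loosely on the Rx and Tx calls mentioned in the talk; the actual TinyGo drivers API, its signatures and its timeout behavior may differ. The `loopback` type is a fake radio so the sketch runs on a desktop, and the message text and the transmit timeout are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// radio is a HYPOTHETICAL LoRa driver interface, not the real TinyGo API.
type radio interface {
	Rx(timeoutMs uint32) ([]byte, error)
	Tx(pkt []byte, timeoutMs uint32) error
}

// errRxTimeout is returned by Rx when nothing arrived in time.
var errRxTimeout = errors.New("rx timeout")

// exchange performs one round: listen for a packet, then send msg.
// A receive timeout is not treated as a failure, it only means that
// nobody was sending during the listen window.
func exchange(r radio, msg []byte, rxTimeoutMs, txTimeoutMs uint32) ([]byte, error) {
	got, err := r.Rx(rxTimeoutMs)
	if err != nil && !errors.Is(err, errRxTimeout) {
		return nil, err
	}
	if err := r.Tx(msg, txTimeoutMs); err != nil {
		return got, err
	}
	return got, nil
}

// loopback is a fake radio that hands back whatever was transmitted last.
type loopback struct{ pending []byte }

func (l *loopback) Rx(uint32) ([]byte, error) {
	if l.pending == nil {
		return nil, errRxTimeout
	}
	p := l.pending
	l.pending = nil
	return p, nil
}

func (l *loopback) Tx(pkt []byte, _ uint32) error {
	l.pending = append([]byte(nil), pkt...)
	return nil
}

func main() {
	r := &loopback{}
	msg := []byte("Hello, TinyGo from RP2040") // illustrative message text

	// Round 1: nothing to receive yet, so only the transmit happens.
	// Round 2: the fake radio returns the packet sent in round 1.
	for round := 1; round <= 2; round++ {
		got, err := exchange(r, msg, 10000, 1000) // 10 s listen, as in the demo
		fmt.Printf("round %d: received %q (err=%v)\n", round, got, err)
	}
}
```

Treating a receive timeout as a normal outcome is what lets the device fall through to transmitting when nobody else is on the air, which is the behavior seen in the demo.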
All right, and we're actually using one of the capabilities we added into TinyGo two releases ago, which is that it's got a built-in serial monitor, so we can see it's trying to receive LoRa data, and for 10 seconds there's no one sending, apparently, and then after that it'll try to send. So because there's no one sending, wait, what, who is that? That's my next demo. All right, the Yo Badge. You may have seen several of us are wearing these Go Badges. So the Go Badge, oh wow, it's upside down. It started out life as an Adafruit PyBadge, but we helped it transition to its final form, a Go Badge, and it's much happier now, I can tell you. I mean, just look at its display, not to mention that we've got such cute stickers. So we're actually running a different TinyGo program on there, which is called Yo Badge. So Yo Badge is using the Adafruit PyBadge, I told you about that a little bit, and it's using this Adafruit LoRa FeatherWing, which is a little daughterboard that can be added to some of these, and I soldered it on here, and it's got a u.FL antenna, which is one of those little antennas that clip on, so that way you can wear it as a badge, because I mean, it is in fact a badge. And then, naturally, I need to reboot it. Okay, so you can see the cool Yo logo, and then because the other program is still running, right? Remember, it's plugged in, the Raspberry Pi is still plugged in, so we could say yo to it, and within 10 seconds or so, it should say something back. Let's see, let's see if it's still here. Oh, yep, that was it. The machines are talking to us, and we're talking back. I feel so warm. I really like machines, if you haven't noticed that.
All right, so now let's talk about LoRaWAN, because this was all just peer to peer. Actually, before I do that, just real quick: we brought a few of these Go Badges that I'll give away here today to some very special lucky individuals. We'll do that this afternoon. So if you go on Mastodon or any of those other social media things that you're still using, and you send out some really great messages about how awesome TinyGo is, and how cool FOSDEM is, and how you really would like to be one of the kids with a programmable badge with wireless, then we'll arbitrarily decide who gets these badges. Maybe random. I didn't have time to write any more software. So we don't have that much time left. So, LoRaWAN: now we're going to go really wide. The first LoRaWAN specification was actually created in January 2015, so we're not cutting edge here, my friends, we're just catching up on what the cool kids have been doing since back when they were kids. So this is the WAN part of the talk, which means the cloud. Take a refreshing breath. So that means routable packets. If you want to go between internetworks, generally we use media access control addresses, or MAC addresses. You've seen these and wondered, that's so ugly. But we need this because with LoRaWAN, our architecture is a bit more complex. We have our end devices, as you saw, like the badge, and they talk to a LoRa gateway. And the gateway is what I was trying to get working before, but I had to do a router reset and I didn't have time to finish, I apologize. They didn't give me an ethernet cable, they were worried about what I would do with it. I don't know why. Anyway, the gateway then has a backhaul to the internets. And that's where the LoRaWAN protocol has three components that are very important: the join server, the network server, and then the application server. So by the way, LoRaWAN is already running on Go.
Go is always in all the good places, and it's already in LoRaWAN. What do I mean? Well, you may have heard of a company called The Things Network. Very, very cool company, real pioneers in the space, and they have a complete stack for a LoRaWAN server backend that's written entirely in Go. Come on, give it up for them. Not to mention an awesome free public service. And then ChirpStack, a little bit more recent entry, they're doing amazing stuff with a similarly entirely-in-Go backend stack for LoRaWAN, and they have a lot of cool tools and libraries that we're using. So give it up for them. But we're talking about devices here. I mean, they've really got Go on the backend, we don't need to reinvent that wheel, they're doing amazing work. No, we're talking about the actual end devices here. And the most important part starts with device activation. So device activation is like when you buy a phone and it turns on, you don't have to keep logging into your phone. Maybe you should be, but let's skip over that. That's another talk. So it connects, it's activated, you go to your cellular provider and now you just start making calls. Well, this is the same model, the same pattern that we use with LoRaWAN. And there are two kinds of activation. One is activation by personalization, which means pre-saved keys on the device itself. We're running out of power, 4%. The question is, what do I unplug? It's a tough decision. Oh, well, and also I don't have my adapter with me, so. You only have five minutes, so go. Oh, perfect. And then over-the-air activation, which means that you just connect to some server somewhere and you get your keys down from the cloud, and then you save those, and then you can use those. And then you use those for uplink and downlink, and one thing to remember, and that's really important, is that with uplink and downlink in LoRaWAN, there is really only uplink.
You uplink and then you maybe get a chance to download some data. So this is the reason why it's so low power: it mostly talks and doesn't really listen, which is the opposite of the app I showed you before, which is just a peer-to-peer thing. Also we have LoRa gateways, that's what this awesome antenna here is. It's a MikroTik KNOT, which I couldn't get rebooted in time, with a Yagi antenna, and this is a very powerful antenna. And you'll see what this is all about tomorrow. What do I mean? What I mean is TinyGlobo, a pico high-altitude balloon. So if you go to tinyglobo.com, and we'll see if we have internet, yes, it will redirect you to this page, which shows you, when it's turned on, the actual current location, altitude and stats of the high-altitude balloon. This balloon we will be launching tomorrow, here at FOSDEM. Uh oh, I think it may have fallen asleep. Nope. Twelve noon, Central European Time, weather permitting, of course, and that's the end of the talk. As the best ending ever, we still have some time for questions, weirdly enough. Thank you, battery. How did that happen? I have no idea. Any questions for Ron? I'm sure you've got a lot of questions. Apparently no questions. Sorry, sorry, sorry. Hi. Have you ever managed to compile the whole Raspberry Pi Pico SDK in C, and then import it successfully in TinyGo? I'm sorry, could you repeat the first part of the question? Have you ever managed to successfully compile the whole SDK for Raspberry Pi Pico, and then successfully import it in TinyGo? Because a year ago it didn't work. Well, so the question is, can we import the Raspberry Pi C SDK and then compile it into TinyGo? The answer is, I'm not really sure. I believe you, and actually I think you can, but that's not something we're really trying to do. You're probably interested in the Wi-Fi support. And then it's important, you know, watchdog, for example. Most things are probably better implemented in TinyGo itself.
There is no watchdog in TinyGo. There is no compile goal. There is an experimental branch with a watchdog (WDT), check that out. But yes, watchdog, low power, and bringing in C, those are all things that are part of the TinyGo continuum. So, 12 noon tomorrow, look for us outside somewhere. You'll know us by this antenna, look for this antenna, and some people wearing glowing helmets with actual balloons that are back in my hotel room. And by the way, all the parts are of Chinese origin, but it was made by these American hands, thank you. Thank you very much, and please do not tell the government about tomorrow. I'm sorry I don't have any cards. |
Optimizing string usage in Go programs |
Okay, our next speaker is going to talk about something we have all used in Go, which is strings. If you have never used them in Go, what are you doing here? So let's give a round of applause for Matej. Thank you, everyone. Thank you. Excited to be here, excited to see so many faces, excited to speak for the first time at FOSDEM. Also a bit intimidating, but hopefully I can show you a thing or two about string optimization in Go. About me: my name is Matej Gera. I work as a software engineer at a company called Coralogix, where we're building an observability platform. Apart from that, I'm active in different open source communities, mostly within the Cloud Native Computing Foundation, specifically in the observability area. I work a lot with metrics. I'm a maintainer of the Thanos project, which I will also talk a bit about during my presentation. And apart from that, I contribute to a couple of different projects, most interestingly OpenTelemetry. And yeah, these are my handles. I'm not that active on social media; the best is to reach me directly on GitHub issues or PRs. Let's get into it. So if nothing else, I'd like you to take at least three things from this presentation today. First of all, I'd like you to understand how strings work behind the scenes in Go. This might be old news for many people who are more experienced with Go, or it might be new knowledge for newbies. But I want to set kind of a common ground from which we can talk about the optimization. Secondly, I want to tell you about the use cases in the context of which I have been thinking about string optimization and where I think the presented strategies can be useful. And lastly, I want to tell you about the actual optimization strategies and show some examples of how they can be applied or where they have been applied. I won't be talking today much about stack versus heap, although a lot of this has to do with memory.
For the presentation, I kind of assume we'll be talking more about the heap and kind of long-term storage of strings in memory. I'm also not going into encoding or related types like runes and chars; although it's all kind of related, it's outside of the scope for today. So let me first tell you what brought me to this topic, what the inspiration behind this talk was. As I already said, I work primarily in the observability landscape with metrics, and over the past almost two years I was working a lot on the Thanos project, which I mentioned and which you can, for simplicity here, imagine as a distributed database for storing time series. And with these goals, it's intended to store millions of time series, even up to or more than a billion series; we have also heard about deployments like that. And as I was working with Thanos and learning about its various aspects and components, one particular issue that stood out to me was the amount of memory needed for certain Thanos components to operate. And this is partly due to the fact that the time series data is stored in memory in a time series database. And this is where I decided to focus my attention, where I started to explore some possible avenues for optimizing the performance here. A big role here was played by doing this in a data-driven way. So I started looking at different data points from Thanos, like metrics, profiles, benchmarks. And this is a small side note, because I consider data-driven performance optimization to be the most important thing when you're improving the efficiency of your program. So I don't want to diverge here, but I highly recommend you check out a talk by Bartek Plotka, who I think is in the room here. He's talking a couple of slots after me, and he is dedicating a lot of his time to this data-driven approach to efficiency in the ecosystem.
I don't have it on the slide, but there is also the presentation right after me, which has to do with squeezing Go functions; it seems interesting. So, a lot of optimization talks today, which I love to see. And you might also ask why strings specifically, what makes them so interesting or so optimization-worthy. And although I've been looking at Thanos for some time, something clicked after I saw this particular image at a different presentation. This was a presentation from Bryan Boreham, who I know should also be somewhere around FOSDEM, and who is working on a kind of neighboring project called Prometheus, which is a time series database on which Thanos is built. So Thanos is kind of a distributed version of Prometheus; we reuse a lot of the code from Prometheus, including the actual time series database code. So he shows, based on the profile and on the icicle graph that you see here, that the labels take most of the memory in Prometheus, and that was around one-third. And when I thought about it, the result was rather surprising to me, because we could think of the labels of the time series as some kind of metadata or some kind of contextual data about the actual data points, about the samples, as we call them, and these were taking up more space than those actual data points, those actual samples themselves. So there's been a lot of thought and work put into optimization and compression of the samples, of the actual time series data, but Bryan's finding indicated that there is more that can be squeezed out of labels. And what are labels, actually? Labels are key-value pairs attached to a given time series to kind of characterize it. So in principle, they are nothing more than pairs of strings. So this is what brought me in the end to strings. And it inspired me to talk about this topic to a larger audience.
I thought it might be useful to look at this from kind of a more general perspective. Even though we're dealing with this problem in the limited space of observability, I think some learnings from this can be gained and used also in other types of programs. So first let's lay the foundations for our talk by taking a look at what a string actually is in Go. Most of you are probably familiar with the different properties of strings. They are immutable. They can be converted easily into slices of bytes, they can be concatenated, sliced, et cetera, et cetera. However, talking about the qualities of strings does not answer the question of what strings really are. And if you look at the source code of Go, you'll see that strings are actually represented by the stringStruct struct. So strings are structs, shocking, right? You can also get the runtime representation of this from the reflect package, which contains the StringHeader type. So based on these two types, we see that a string consists of a pointer to the actual string data in memory and an integer which gives information about the size of the string. When Go creates a string, it allocates storage corresponding to the provided string size and then sets the string content as a slice of bytes. As you've seen, the string data is stored as a contiguous slice of bytes in memory. The size of the string stays the same during its lifetime, since, as I mentioned previously, the string is immutable. And this also means that the size and the capacity of the backing slice of bytes stay the same. When you put this all together, the total size of the string will consist of the overhead of the string header, which is equal to 16 bytes, and I'll show in a bit why, and the byte length of the string. We can break this down on this small example of the string I created with FOSDEM, space, waving hand emoji. So this is just a snippet.
I don't think this code would compile, but for brevity, I decided to show these three small lines. And by calling the Size method on the string type from the reflect package, you would see it return the number 16. Don't be fooled. The Size method returns only the size of the type, not the size of the whole string. Therefore, it correctly tells us it's 16 bytes: 8 bytes for the pointer pointing to the string data in memory, and 8 bytes for keeping the string length information. To get the size of the actual string data, we have to use the good old len function. This tells us it's 11 bytes. The string literal here is UTF-8 encoded. We count one byte per each letter and space, and we actually need four bytes to encode the waving hand emoji. And this brings our total to 27 bytes. Interestingly, for such a short string, the overhead of storing it is bigger than the string data itself. It's also important to realize what happens if we declare a new string variable that is copying an existing string. In this case, Go creates what we can consider a shallow copy, meaning the data the string refers to is shared between the variables. Let's break it down again on the example of our FOSDEM string. So we declare a new string literal, FOSDEM waving hand emoji, and then create a new string variable, newStr, and set its value equal to str. What happens behind the scenes? If you looked at the pointer of each of the strings, you would see different addresses, making it obvious that these are two different strings strictly speaking. But looking at their headers, we would see identical information: the same pointer to the string data, and the same length. But because... Excuse me, sir, can we turn the light at the front off first? I cannot. Sorry. Okay. Sorry. Yeah, it's a bit light, right, sorry.
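Editor's sketch: the arithmetic above (16-byte header, 11 bytes of data, shared data after assignment) can be checked with a small program. This is only an illustration, not the snippet from the slides; the helper names `headerSize` and `dataPtr` are made up for this example, and the 16-byte figure assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// headerSize returns the size of the string header itself
// (data pointer + length), not of the bytes it points to.
func headerSize() uintptr {
	var s string
	return unsafe.Sizeof(s)
}

// dataPtr returns the address of the bytes a string header points to.
// reflect.StringHeader is deprecated in newer Go releases, but it still
// mirrors the pointer-plus-length layout described in the talk.
func dataPtr(s string) uintptr {
	return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
}

func main() {
	str := "FOSDEM \U0001F44B" // waving hand emoji, 4 bytes in UTF-8
	newStr := str              // shallow copy: new header, same data

	fmt.Println("header bytes:", headerSize())
	fmt.Println("data bytes:  ", len(str))
	fmt.Println("total bytes: ", headerSize()+uintptr(len(str)))
	fmt.Println("same data pointer:", dataPtr(str) == dataPtr(newStr))
}
```

On a 64-bit machine the header is 16 bytes and the data 11 bytes, and the two variables report the same data pointer even though each carries its own header.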
But anyway, so these are two different strings strictly speaking, and looking at the header information, we would see that they point to the same string data and have the same length. Because they are two different strings, we need to be mindful of the fact that newStr comes with a brand new string header. So the bottom line is, when we do this copying, even though the data is shared, the overhead of 16 bytes is still there. So I briefly talked about my inspiration for this talk, but I also wanted to expand a bit on the context of the problems where I think the string optimization strategies can be useful. I think in general, many programs with the characteristics of in-memory stores may face this performance issue. Examples of such programs on this slide are the time series database, which I already mentioned numerous times, DNS resolvers, or any other kind of key-value store, where we come with the assumption that these are long-running programs, and over the runtime of the program the number of strings will keep accumulating. So we can be talking potentially billions of strings. There's also potential for repetition of strings, since many of these stored values may repeat themselves. So for example, if we associate each of our entries with a label denoting which cluster they belong to, we are guaranteed to have repeated values, since we have a finite and often small number of clusters. So the string cluster will be stored as many times as there are entries in our database. There are also certain caveats when it comes to handling incoming data. Data will often come in the form of requests through HTTP or gRPC or any other protocol, and usually we handle this data in our program by unmarshaling it into a struct, and then we might want to store some information, some string from this struct, in memory for future use.
However, the side effect of this is that the whole struct will be prevented from being garbage collected, because as long as the string, or as a matter of fact any other field from the struct, is being referenced by our database in memory, the garbage collection won't kick in, and eventually this will lead to bloat in the memory consumption. I think the second, different type of program where string optimization can be useful is one-off data processing, as opposed to long-running programs. So we can take the example of handling some large JSON file, perhaps some data set from a study or some health data, which I think were some good examples I've seen out in the wild, and such processing will require a larger amount of memory to decode the data during processing. So even though we might be processing the same strings that repeat themselves over and over again, such as the keys in the JSON document, we have to allocate such strings anew each time. So now that we have a better understanding of the problem zones, let's look at the actual optimization strategies. The first strategy is related to the issue I mentioned a couple of slides before, where we are wasting memory by keeping whole structs in memory when we only need the part of the struct that is represented by the string. So what we want to do here is to have a mechanism that will allow us to, quote unquote, detach the string from the struct so that the rest of the struct can be garbage collected. Previously this was also possible to achieve with some unsafe manipulation of strings, but since Go 1.18 there's a new function called Clone in the strings standard library package that makes it quite straightforward. Clone creates a new, fresh copy of the string. This decouples the string from the struct, meaning the struct can be garbage collected in the long term and we will retain only the new copy of the string.
So remember, previously I showed that when we copy strings we create shallow copies. Here we want to achieve the opposite: we want to truly copy the string and create a fresh copy of the underlying string data, so the original string can be garbage collected together with the struct it's part of. This we can refer to as deep copying. The next, most interesting, and I'd say one of the most widely used strategies in software in general is string interning. String interning is a technique which makes it possible to store only a single copy of each distinct string, and subsequently we keep referencing the same underlying string in memory. This concept is somewhat more common in other languages such as Java or Python, but it can be implemented effortlessly in Go as well, and there are even some ready-made solutions out in the open that you can use. At its simplest, you could achieve this by having a simple map[string]string, and you can keep the references to the strings in this map, which we can call our interning map or cache or anything like that. The first complication comes with concurrency, right, because we need a mechanism to prevent concurrent writes and reads to our interning map. So the obvious choice would be to use a mutex, which would incur a performance penalty, but so be it, or the concurrency-safe map version from the sync standard library package. The second complication, or noteworthy fact, is that with each new reference to a string we are incurring the 16 bytes of overhead, as I explained a couple of slides back. So even though we're saving on the actual string data, we're still incurring the overhead, so with millions of strings, 16 bytes for every string, it's a non-trivial amount. The third complication comes from the unknown lifetime of the strings in our interning map. At some point in the lifetime of the program there might be no more references to a particular string, so it can be safely dropped. But how do we know when these conditions are met?
Ideally we don't want to be keeping unused strings, as in an extreme case this can be a denial-of-service vector leading to exhaustion of memory if we allow the map to grow unbounded. One option could be to periodically clear the map or give the entries a certain time to live, so after a given period the map or the given entries are dropped, and if a string reappears after such a deletion we simply create the entry in the interning map again, so kind of like a cache. Naturally this can lead to some unnecessary churn and unnecessary allocations, because we don't know exactly which strings are no longer needed or referenced, but we might still be dropping them. Another, more elaborate way to do this is to keep counting the number of references to the used strings. This naturally requires a more involved and complex implementation, but you can see here that I linked some work done in the Prometheus project, which is a good example of how this can be implemented with reference counting. We can take this even to the next level: as I recently learned, there is an implementation of an interning library that is capable of automatically dropping unused references. The go4.org/intern library is capable of doing this thanks to the somewhat controversial concept of finalizers in the Go runtime. Finalizers, put very plainly, make it possible to attach a function that will be called on a variable that is deemed to be ready for garbage collection by the garbage collector. At that point this library checks the sentinel boolean on the referenced value, and if it finds that this is the last reference to that value, it drops it from the map. The library also cleverly boxes the string header down to a single pointer, which brings the overhead down to 8 bytes instead of 16. So as fascinating as this implementation is to me, it makes use of some potentially unsafe code behavior, hence the dark arts reference in the slide title.
However, the library is deemed stable and mature enough and has been created by some well-known names in the Go community. So if you're interested, I encourage you to study and look at the code. It's just one file, but it's quite interesting and you're sure to learn a thing or two about some lesser-known parts of Go. And as an example, I recently tried this library in the Thanos project. Again, I linked the PR with the implementation, which I think is rather straightforward. And we ran some synthetic benchmarks on this version with interning on, and this was the result. On the left side you can see, probably not very clearly unfortunately, a graph showing both the metric reported by the Go runtime, how many bytes we have in the heap, and the metric reported by the container itself. You can see the differences between the green and yellow lines and the blue and red lines, so it came to roughly a two to three gigabyte improvement per instance. This is averaged across, I think, six or nine instances, so per instance this was around two to three gigabytes, and we can count the overall improvement as around ten to twelve gigabytes. But more interestingly, on the right side of the slide there is another graph to kind of confirm that the interning is doing something, that it's working. Here we are following, again, a metric reported by the Go runtime, and we're looking at the number of objects held in memory, and we can see that it dropped almost by half when we look at the average.
Finally, there's string interning with a slightly different flavor, I would say, which I refer to as string interning with symbol tables. In this alternative, instead of keeping a reference to the string, we replace it with another referring symbol, such as, for example, an integer. So the integer one will correspond to the string apple, the integer two will correspond to the string banana, and so on. This can be beneficial in scenarios with a lot of duplicated strings. Again, this brings me to my home field and to time series databases, where there is generally a high probability of the labels, so also the strings, being repeated, and especially when such strings are being sent over the wire. So instead of sending all the duplicated strings, we can send a symbol table in their place, and we can replace the strings with the references into this table. Where this idea came from, or where I got inspired for this, was also in Thanos, but this was done by one of my fellow maintainers, so you can look at that PR, who implemented this for data series being sent over the network between Thanos components. So instead of sending all the long and duplicated label keys and values, instead of sending all of these strings, we build a symbol table that we send together with the deduplicated label data, which contains only references instead of the strings. All we have to do on the other side, once we receive the data, is to replace the references with the actual strings based on the symbol table. This saves us, on one hand, the cost of the network, since the requests are smaller, and also the allocations once we're dealing with the data on the receiving side.
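Editor's sketch: the symbol-table idea can be shown with a toy encoder and decoder for label pairs. This is not the Thanos wire format; the `Label` type and the `encode` and `decode` functions are assumptions made for this illustration only.

```go
package main

import "fmt"

// Label is a key-value pair attached to a series.
type Label struct{ Name, Value string }

// encode replaces every label string with an index into a shared table,
// so each distinct string is stored (and would be sent) only once.
func encode(series [][]Label) (table []string, refs [][]uint32) {
	ids := make(map[string]uint32)
	ref := func(s string) uint32 {
		id, ok := ids[s]
		if !ok {
			id = uint32(len(table))
			ids[s] = id
			table = append(table, s)
		}
		return id
	}
	for _, lbls := range series {
		row := make([]uint32, 0, 2*len(lbls))
		for _, l := range lbls {
			row = append(row, ref(l.Name), ref(l.Value))
		}
		refs = append(refs, row)
	}
	return table, refs
}

// decode rebuilds the labels on the receiving side from table and refs.
func decode(table []string, refs [][]uint32) [][]Label {
	out := make([][]Label, 0, len(refs))
	for _, row := range refs {
		lbls := make([]Label, 0, len(row)/2)
		for i := 0; i+1 < len(row); i += 2 {
			lbls = append(lbls, Label{Name: table[row[i]], Value: table[row[i+1]]})
		}
		out = append(out, lbls)
	}
	return out
}

func main() {
	series := [][]Label{
		{{Name: "cluster", Value: "eu"}, {Name: "job", Value: "api"}},
		{{Name: "cluster", Value: "eu"}, {Name: "job", Value: "db"}},
	}
	table, refs := encode(series)
	fmt.Println("table:", table) // five distinct strings instead of eight
	fmt.Println("refs: ", refs)
	fmt.Println("round trip:", decode(table, refs))
}
```

After decoding, repeated labels on the receiving side all refer to the same table entries, which is where the allocation savings mentioned in the talk come from.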
Lastly, you could try putting all of the strings into one big structure, into one big string. This can be useful to decrease the total overhead of the strings, as it eliminates the already mentioned overhead of the string header. Since the size of a string is always 16 bytes plus the byte length of the string, by putting all the strings into one we can effectively decrease the overhead of those string headers. Of course this is not without added complexity, because now we have to deal with how to look up those substrings, or those smaller strings, within the bigger structure. So you need a mechanism, because you cannot simply look them up in a map or symbol table, and obviously you also have to deal with the other already mentioned complications, such as concurrent access. I think a particularly interesting attempt at this is going on in the Prometheus project, which again is being done by Bryan Boreham, who I mentioned in the previous slides, so if you're interested, feel free to check out this PR. So I will conclude with a few words of caution. I have shown you some optimization techniques that I found particularly interesting when I was doing my research, but let's not be naive: these are not magic wands that will make your program suddenly work faster and with fewer resources. This is still a balancing exercise. Many of the presented techniques can save memory but will actually increase the time it takes to retrieve a string. So when I say optimization, I mean mostly a situation where we want to decrease the expensive memory footprint of our application while sacrificing a bit more CPU, a tradeoff that I believe is reasonable in such a setting.
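Editor's sketch: one possible lookup mechanism for the "one big string" approach is a table of offsets. This is not the Prometheus implementation mentioned in the talk; the `Packed` type and its methods are hypothetical, and the sketch ignores concurrent access and updates.

```go
package main

import (
	"fmt"
	"strings"
)

// Packed stores many strings back to back in a single string.
// offsets[i] is where string i starts; the last entry is len(data),
// so each stored string costs 4 bytes of bookkeeping instead of a
// 16-byte header.
type Packed struct {
	data    string
	offsets []uint32
}

// Pack concatenates strs and records where each one starts.
func Pack(strs []string) Packed {
	var b strings.Builder
	offsets := make([]uint32, 0, len(strs)+1)
	for _, s := range strs {
		offsets = append(offsets, uint32(b.Len()))
		b.WriteString(s)
	}
	offsets = append(offsets, uint32(b.Len()))
	return Packed{data: b.String(), offsets: offsets}
}

// Len returns the number of stored strings.
func (p Packed) Len() int { return len(p.offsets) - 1 }

// At returns string i as a slice of the big string (no copy is made).
func (p Packed) At(i int) string {
	return p.data[p.offsets[i]:p.offsets[i+1]]
}

func main() {
	p := Pack([]string{"cluster", "eu-west-1", "job", "api"})
	for i := 0; i < p.Len(); i++ {
		fmt.Printf("%d: %q\n", i, p.At(i))
	}
	fmt.Println("bytes of data:", len(p.data))
}
```

The lookup is by position only, which shows the added complexity the talk warns about: finding a string by value would need an extra index on top of this structure.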
I'm also not making any concrete claims about the performance improvements of the various techniques, as you have seen, and I think this nicely ties into the introduction of my talk, where I talked about the need for data-driven optimization. I believe there are still more data points needed to show how well these techniques work in practice, how well they can work in your specific use case, how they compare with each other when it comes to performance, and whether there are some other real-world implications, or maybe properties of Go, the compiler, or the runtime, that might render them not useful in practice, or where the performance gain might be negligible. So just to say that your mileage might vary, but I think these ideas are worth exploring and can be interesting. That is all from my side, thank you for your attention. I also included a couple more resources for those who are interested; you can find the slides in Pentabarf. |
Squeezing a Go function |
Okay, thank you. So our next speaker is Jesus. He's been talking a few times in the Go devroom about everything that has to do with the depths of the language, and today he's going to talk to us about what's going on in functions. A round of applause. Okay. Hello, everybody. Well, my name is Jesus. I'm a software engineer and I'm going to talk about squeezing a Go function. So what is optimization? I think it's important to know that optimization is not just being faster or consuming less memory; it depends on your needs. So is this better for squeezing juice? Probably everybody will say yes, but it depends on whether you are looking for convenience or for something that lasts forever. In that case, it's not the best option. Optimizing is about what you need and trying to address that. It's important to optimize at the right level. You can buy the best car, you can get an F1 car, and it's not going to be fast if this is the road. So always try to optimize at the upper level first, because the kind of optimizations that we are going to see in this talk are micro-optimizations, which are probably not the first place where you should be starting. Optimize what you need and when you need it. It's not about taking a Go function and trying to optimize it forever, trying to make it run super efficiently and scratching every single nanosecond, because probably the bottleneck is no longer there. You have to search for the bottleneck, you have to optimize where the bottleneck is, and then look again to see if the bottleneck is still there, because if it's no longer there, you are over-optimizing that function without much gain. So just take that into consideration: optimizing is an iterative cycle, and you need to keep moving and keep searching for the bottleneck. Do not guess, please. Yeah, I know everybody has instincts and all that stuff, but guessing about performance is an awful thing, because there are so many things that come into play that it is just impossible.
There's the operating system, the compiler, the optimizations of the compiler, if you are in the cloud, maybe a noisy neighbor; all that stuff comes into play with performance. So you are almost for sure not good at guessing about performance. So just measure everything. The important thing here is to try to measure everything and work with that data. That is probably what the talk after the next one is about, so I would suggest going there too, because it's probably a very interesting talk. So let's talk about benchmarks. The way that you measure performance in micro-optimization, so micro-benchmarks, is through Go benchmarks. Go benchmarking is a tool that comes with Go and is similar to the testing framework that comes with Go, but very focused on benchmarks. In this case, we can see here an example with two benchmarks, one for the MD5 sum and one for the SHA-256 sum. That's it. It's just a function whose name starts with Benchmark, that receives a testing.B argument, and that has this for loop inside. And that is going to do all the work to give you the numbers, and I'll show you the numbers now. If I run this with go test, we've got this -bench dot. The dot is a regular expression that means everything. So you can use it like the -run flag of go test, with a regular expression, for executing only certain benchmarks. And here you can see that the MD5 sum is around twice as fast per operation as SHA-256. So, well, it's just a number. Is it that important? It depends. If you need more security, probably MD5 is not the best option. So it depends on your needs. Another interesting thing is the allocations. One thing that you may have heard about is counting allocations. Counting allocations, why is that important? When we talk about allocations, we're talking about allocations in the heap. Every time we allocate something in the heap, that allocation is going to introduce an overhead.
And not only that, it's going to add more pressure on the garbage collector. That's why it's important to count the allocations when you are talking about performance. If you are not worried about performance at that point, don't count the allocations. It's not that important, and you are not going to gain a massive amount of performance from there if you are not at that point. Okay. Let's see an example here with the MD5 and SHA sums. We have zero allocations. So, well, this data is not very useful for us now. So let's use another thing. Let's open a file. Let's open a file thousands of times and see how it goes. Now I see that every single operation of opening a file, just opening the file, is going to generate three allocations, and it's going to consume 120 bytes per operation. Interesting. So now you are measuring things. You are measuring how much time it takes, how much time is spent processing something, how much is spent allocating things, how much memory is going there. So let's talk about profiling, because, well, actually normally you do the profiling first to find your bottleneck and then you do the benchmark to tune that bottleneck. But I'm playing with the fact that I already have the benchmark, and I'm going to do the profiling on top of the benchmark. So I'm going to execute the benchmark, I'm going to pass the memprofile flag to generate the memory profile, and I'm going to use the pprof tool. The pprof tool is going to allow me to analyze that profile. In this case, I'm just asking for a text output, and that text output is going to show me the top consumers of memory in this case. And I can see there that 84% of the memory is going to os.newFile. Okay, let's see what happened. Okay, it's that file, but I need more information. Well, it's that function, sorry, I need more information.
Actually, I kind of like this output, but if you don't like this output, you can, for example, use SVG, and you are going to get something like this, which is very visual, and it's actually kind of obvious where the bottleneck is there, and in this case, again, it is os.newFile. If I go to the pprof tool again and instead of that I use list on a function, I see here where the memory is going by line, and here I can see that at line 127 of the file file_unix.go I'm consuming the memory. Actually, there you see 74 megabytes; that is because it's counting and aggregating all the allocations. Every operation here is consuming only 120 bytes. Okay, the same with the CPU profile. In this case, most of the CPU consumption is in Syscall6. I can see it in SVG; this time it's more scattered, so the CPU is being consumed in way more places, but Syscall6 is still the biggest one. So I'm going to list that, and I see some assembly code. Probably you are not going to optimize this function any further, so probably this is not the place where you should be looking for optimizations. Anyway, this is an example of getting to the root cause during profiling. Okay, this talk is going to be more by example. I'm going to try to show you some examples of optimizations. It's meant to show you the process more than the specific optimizations. I expect you to learn something in between, but it's more about the process, okay. One of the things that you can do is reducing the CPU usage. This is a kind of silly example: you have a find function that takes a needle and a haystack and just goes through the haystack and searches for that needle and gives you the result.
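Editor's sketch: a find function of the kind described, next to a variant that stops as soon as the needle is found. The signatures and the boolean return value are assumptions, since the talk does not show them in the transcript.

```go
package main

import (
	"fmt"
	"strconv"
)

// find walks the whole haystack even after the needle has been seen.
func find(needle string, haystack []string) bool {
	found := false
	for _, s := range haystack {
		if s == needle {
			found = true
		}
	}
	return found
}

// findEarly returns as soon as the needle is found, so on a hit it
// only visits the elements up to and including the match.
func findEarly(needle string, haystack []string) bool {
	for _, s := range haystack {
		if s == needle {
			return true
		}
	}
	return false
}

func main() {
	// Build a haystack of numbered strings and look for one near the middle.
	haystack := make([]string, 0, 1000)
	for i := 0; i < 1000; i++ {
		haystack = append(haystack, strconv.Itoa(i))
	}

	fmt.Println(find("500", haystack), findEarly("500", haystack))
	fmt.Println(find("missing", haystack), findEarly("missing", haystack))
}
```

Both functions return the same answers; the early return only changes how much of the haystack is visited, which is why the gain depends on where the needle sits in the data.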
This is looping over the whole slice. The first thing I do is write a benchmark: I generate a lot of strings and look for something around the middle, not exactly in the middle but around there, and the benchmark says that it takes nearly 300 nanoseconds. If I just return early, which is a kind of silly optimization, nothing super smart, I save almost half of the time. This is because the benchmark is doing something really silly, and the saving varies depending on the input data, but the optimization is just doing less, and that is one of the best ways of optimizing things. Next, reducing allocations. One of the classic examples of reducing allocations is dealing with slices. This is a common way of constructing a slice: I create a slice, loop, and start appending things to it. Fine. I write a benchmark to check that, and it takes 39 allocations and around 41 megabytes per operation. That sounds like a lot. So let's build the slice again, but this time giving it an initial size of a million and just setting each element. The final result is exactly the same, but now we have one allocation and we have consumed only one megabyte, and if you look at the time, it is around 800 microseconds here versus around 10 milliseconds before, so it saves a lot of CPU time too. But you can squeeze it more. If you know exactly the size that you want at compile time, you can build an array, which is faster than any slice. If I build an array, I'm now doing zero heap allocations: it goes on the stack or into the binary somehow, but it's not consuming heap, and this time it takes approximately 300 microseconds. So that's an interesting option if you know that information at compile time.
Okay, another thing is struct packing. If you are concerned about memory, say you build this struct with a Boolean, a float and an int32. The Go compiler is going to align my struct to make it more efficient and work better with the CPU, and in this case it adds seven bytes of padding between the Boolean and the float and four bytes after the integer to get everything aligned. I build and initialize a slice of these, allocating once because that's what the slice does, and I'm consuming around 24 megabytes per operation. If I just reorganize the struct, putting the float at the beginning, then the int32 and then the Boolean, the compiler only adds three bytes, so the whole structure is smaller in memory, and now it's 16 megabytes per operation. This kind of optimization is not going to save your day if you are just creating a few structs, but if you are creating millions of instances of a struct it can be a significant amount of memory. Function inlining. Function inlining is something that the Go compiler does for us: it takes a function and replaces any call to that function with the code of the function body. I'm going to show you a very dumb example. In one version I explicitly prevent the function from being inlined, and in the other I use the version that the compiler inlines because it's simple enough. When I execute that, I'm saving a whole nanosecond. So it's not a great optimization, to be honest, and you probably don't care about that nanosecond, but we are going to see later why inlining is important, and it's not because of the nanosecond.
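The struct-packing effect described above can be checked directly with `unsafe.Sizeof`. This is a sketch with made-up type and field names (`loose`, `packed`), and the sizes in the comments assume a 64-bit platform where `float64` is 8-byte aligned.

```go
package main

import (
	"fmt"
	"unsafe"
)

// loose mirrors the first layout from the talk: bool, float64, int32.
// On 64-bit platforms: 1 + 7 (padding) + 8 + 4 + 4 (padding) = 24 bytes.
type loose struct {
	flag  bool
	value float64
	count int32
}

// packed orders the fields from largest to smallest.
// On 64-bit platforms: 8 + 4 + 1 + 3 (padding) = 16 bytes.
type packed struct {
	value float64
	count int32
	flag  bool
}

func looseSize() uintptr  { return unsafe.Sizeof(loose{}) }
func packedSize() uintptr { return unsafe.Sizeof(packed{}) }

func main() {
	fmt.Println("loose:", looseSize(), "bytes; packed:", packedSize(), "bytes")
}
```

With a million elements in a slice, 24 versus 16 bytes per element matches the roughly 24 and 16 megabytes per operation quoted in the talk.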
I'm going to talk now about escape analysis. Escape analysis is another thing that the compiler does for us. It analyzes our variables and decides when a variable escapes from the context of the stack: when the value can no longer live on the stack and still be accessible where it needs to be accessible, it has to escape to the heap. That is what generates those allocations, and we have seen that allocations have certain implications. Let's see an example. This is a non-inlined function that returns a pointer, which is going to generate an allocation. This other one returns by value; returning a value copies it to the stack of the caller, so it's not going to generate allocations. We can see that in the benchmark: the first version has one allocation of 8 bytes and the second one has zero allocations, and the version with the allocation takes 10 times longer. Ten times longer in this case is around 12 nanoseconds, which is not a lot, but everything adds up in the end, especially when you are calling things millions of times. And one interesting thing is escape analysis plus inlining. Why?
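A small sketch of the pointer-versus-value comparison described above. The function names are hypothetical, and `//go:noinline` is used here as one way to keep the compiler from inlining the functions (the talk does not say how its example prevented inlining). `testing.AllocsPerRun` counts the average number of heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the compiler from discarding the results.
var (
	ptrSink *int
	valSink int
)

// newPointer returns the address of a local variable, so the variable
// escapes to the heap and each call allocates.
//
//go:noinline
func newPointer() *int {
	v := 42
	return &v
}

// newValue returns a copy, so nothing needs to live on the heap.
//
//go:noinline
func newValue() int {
	v := 42
	return v
}

func allocsPointer() float64 {
	return testing.AllocsPerRun(1000, func() { ptrSink = newPointer() })
}

func allocsValue() float64 {
	return testing.AllocsPerRun(1000, func() { valSink = newValue() })
}

func main() {
	fmt.Println("allocs per call, pointer:", allocsPointer())
	fmt.Println("allocs per call, value:  ", allocsValue())
}
```

The compiler's own escape decisions can be printed with `go build -gcflags=-m`, which reports lines such as a variable being moved to the heap.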
Well, imagine this situation: you have a struct and a constructor that instantiates that struct. The constructor returns a pointer and does all the setup that it needs. Great, but it is generating 3 allocations and consuming 56 bytes per operation. What happens if I just move the logic of that initialization into a different function? If we do that, suddenly the new document constructor is simple enough to be inlined, and because it's inlined the value no longer escapes, so that allocation is no longer needed. Something that simple allows you to reduce the number of allocations when you have a constructor. What I would suggest is to keep your constructor as simple as possible, and if you have to do complex logic, do it in an initialization function, as long as that doesn't hurt readability. Here we have fewer allocations: now 2 allocations and 32 bytes per operation, and you are saving 50 nanoseconds every time you instantiate the struct, so this is a good chunk. Okay, optimization is sometimes a matter of trade-offs. Sometimes you can just do less: fewer allocations, less CPU work, less garbage collector pressure. But sometimes it's not about doing less, it's about consuming a different kind of resource: I care less about memory and more about CPU, or the other way around. Concurrency is one of the cases where you need to decide what you want to consume, because goroutines are really cheap but they are not free. Let's see an example with I/O. These are two functions that I created. One is a fake IO function, which simulates some I/O with time.Sleep, and the other is a parallel fake IO function, which receives the number of goroutines and does basically the same thing but distributes those hundred cycles between the goroutines. I built a benchmark using three different approaches: a serial one with no concurrency, one using as many goroutines as CPUs in my machine, and one using as many goroutines as tasks. Because this is I/O, this is the result: if I create one goroutine per job, the number of bytes per operation and the number of allocations spike, but the time consumed is way lower. I'm able to execute this function a hundred times with the one-goroutine-per-job approach and only 12 times with one goroutine per CPU, because this is I/O. So let's see what happens if I do that with CPU.
The fake CPU function simulates some CPU load using MD5 sums, and it's more or less the same approach as in the fake IO: the benchmark is the same, using the number of jobs, the number of CPUs, and no goroutines. Here it gets interesting, because for a CPU workload, using the number of CPUs is what gives the best efficiency. You can see that executing one goroutine per job is even slower than executing serially, so you actually have the worst of both worlds: plenty of allocations, plenty of memory consumption, plenty of time consumed, and you are not gaining anything. With one goroutine per CPU, you consume more memory but you get better CPU performance, because you are spreading the job over all your physical CPUs, while the serial one does everything using only one core. Whenever you want to optimize using concurrency, you have to take into consideration what kind of workload you have. Is it a CPU workload? Is it an I/O workload? Do you care about memory? Do you care about CPU? What do you care about? That is the whole idea.
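The I/O half of the comparison above can be sketched like this. The names `fakeIO`, `fakeIOParallel` and `elapsed`, the one-millisecond sleep and the way jobs are split between goroutines are all assumptions for illustration; the speaker's code is not in the transcript.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

const ioDelay = time.Millisecond // stands in for one blocking I/O call

// fakeIO runs all jobs one after another.
func fakeIO(jobs int) {
	for i := 0; i < jobs; i++ {
		time.Sleep(ioDelay)
	}
}

// fakeIOParallel spreads the same jobs over the given number of goroutines.
func fakeIOParallel(jobs, workers int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Worker w handles jobs w, w+workers, w+2*workers, ...
			for i := w; i < jobs; i += workers {
				time.Sleep(ioDelay)
			}
		}(w)
	}
	wg.Wait()
}

// elapsed reports how long fn takes to run.
func elapsed(fn func()) time.Duration {
	start := time.Now()
	fn()
	return time.Since(start)
}

func main() {
	const jobs = 100
	fmt.Println("serial:           ", elapsed(func() { fakeIO(jobs) }))
	fmt.Println("one per CPU:      ", elapsed(func() { fakeIOParallel(jobs, runtime.NumCPU()) }))
	fmt.Println("one goroutine/job:", elapsed(func() { fakeIOParallel(jobs, jobs) }))
}
```

Because the goroutines spend their time sleeping rather than computing, more goroutines finish sooner here. Replacing the sleep with real computation, such as hashing, would be one way to explore the CPU-bound case the talk contrasts this with.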
I just want to stress that all of this is about measuring everything: doing all these benchmarks and all these kinds of experiments to see whether you are getting an improvement in performance, and iterating over that. That is the main idea. I showed some examples of how you can improve things, and some of them can be applied in general, like trying to keep constructors small, or giving slices their size up front when you know it. Some references. Efficient Go is a really, really interesting book if you care about efficiency. Bartłomiej Płotka wrote that book, and he is actually going to give a talk after the next one; I am sure it is going to be super interesting. The High Performance Go workshop from Dave Cheney: there is a lot of documentation about that workshop, and it is really interesting too. The go-perfbook is also a good read. And the Ultimate Go course from Ardan Labs is an interesting course because it gives you a lot of foundation and cares a lot about hardware sympathy and all that stuff. All the images are Creative Commons, so I put the references here. Thank you. That is it. Thank you. |
Reconciliation Pattern, Control Theory and Cluster API: The Holy Trinity |
We still have one minute before we start. Ready to go? Thank you. Our next talk is by Sachin, and he's going to talk about a thing that I use every day in Go, but it's kind of weird because it only exists in this language, as far as I know. It's how Kubernetes is built: the reconciliation pattern. Go ahead. Thank you. Thank you all. Thanks for coming. Welcome to FOSDEM. Today I'm going to talk about control theory, the reconciliation pattern, and how we use them in Cluster API. A little bit about me: I work at Canonical, specifically on the MicroK8s team. Previously I worked at VMware, where I got to know Cluster API and BYOH, and I try to contribute to Cluster API upstream too. I'm very interested in distributed systems and cloud-native technologies, so ping me with your favorite tech. The agenda is like this. We start with first principles: what control theory is and what a PID control system is. Then we go up the stack, L0 to L2, one more layer of abstraction each time. We'll look at the reconciliation pattern and how it is used in Kubernetes, then at how we extend that pattern, and finally at those patterns in Cluster API, with a short demo to play with. So, a quick 101 of control theory. I'm talking to you, you folks are taking in feedback, and that's like 90% of control theory right there. Control theory is a branch of mathematics and engineering. A lot of folks were trying to find a common theme for dynamic systems, and they realized: wait, we are all talking about the same things, let's just unify it. That's how control theory came about. It's the study of how dynamic systems work, and the fundamental crux of it is bringing the current state to a desired state. That is what control theory is about.
Let's take a very simple example to learn more about it. What is an open-loop controller? A simple example: you have some wet clothes and you want to dry them. You put them in a dryer and you set the timer. The process in no way depends on whether the clothes are actually dry. The only variable is the timer, which determines when the dryer shuts down; it doesn't matter whether the clothes are dry or wet. So this is not a good approach. Before I introduce closed-loop controllers, there are a few terms we need. The system is the entity that we want to control. The set point is our desired state, and the process variable is the observed state. The error is the difference between the set point and the process variable: how far we have overshot or undershot. The controller is a simple finite state machine which drives the process variable to the set point. A favorite example of mine is the thermostat. We are in a room with an air conditioner, we have set the thermostat to maintain the temperature at T1, and currently the temperature is T0. The thermostat says, no, no, no, the temperature I want is T1. So it sends some signals to the AC, which does an adiabatic process or something to reach that state. In that case the thermostat is the controller, T0 is the process variable, T1 is the set point, the error is the difference between the temperature we want and the temperature we have, and the room is the system. But it's not always this ideal; the change takes time. It's not that the thermostat says, okay, make the temperature T1, and the AC does it instantly. It takes time to get there gradually. So for a non-ideal situation, what would an ideal controller look like? It needs to do three things, essentially. It needs to see how far it is overshooting or undershooting the set point.
It needs to compensate for large changes and adjust accordingly. And it needs to predict how to minimize the error based on what it has seen before. A very good example of this is the cruise control in your car. You turn on the cruise control and it identifies: okay, I'm going straight now, but there's a turn coming up, so I need to apply this amount of turn to avoid an accident. A PID controller is essentially what these three add up to. The P is the proportional, or linear, component. In the cruise control case, it's essentially the amount of turn that the car needs to take to make the curve. In the graph, if the set point is a straight line and the process variable fluctuates around it, P is the magnitude of the distance from the set point to the process variable. The I is the integral component; it is the compensator. It adjusts based on the current state and how far it is from the desired state, but it also needs to compensate quickly. You're going along a straight road and you need to make the curve. The car cannot just make the turn at the last moment when the curve arrives; it needs to make that change gradually, and the integral component represents that gradual curve. It is defined by the area under the curve in the magnitude-versus-time graph. D is actually really interesting. It's the predictor: it applies previous experience to control the state it is trying to achieve.
In our cruise control example, it's as simple as this: it sees the curve and slowly starts to make the adjustment based on previous experience, so that it doesn't overshoot when the curve comes but starts adjusting gradually. The other controllers that we have are variants of PID. The D is not used as much, but it's a really interesting one if you look into it. This funny-looking diagram is just a block diagram of how the PID controller manages the process; it has a sensor which measures the state of the process. In this example, r is the set point, the signal that we send into the controller. y is the process variable, e is obviously the error, and u is the signal that is sent to the process. This fancy-looking equation is just the state of the controller. u(t) is the signal sent to the process, as on the previous slide. y(t) is the measured output, as you can see there. The error is the difference between r(t) and y(t), where r(t) is our set point from the previous example. So this simple differential equation just describes the state of the controller and how it tries to reach the desired state. The coefficients K0, K1, and K2 depend entirely on the system that we are in. So, reconciliation patterns in Kubernetes. How does Kubernetes incorporate these patterns and use them to build controllers and reconcilers? At a very high level, this is what a simple reconciliation loop looks like. It's a forever loop with a desired state and a current state, which are the set point and the process variable, and an actuator that makes the change. It just tries to bring the current state to the desired state.
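The equation on the slide is not captured in the transcript. A common textbook form of a PID controller computes the control signal as u(t) = Kp·e(t) + Ki·∫e(t)dt + Kd·de(t)/dt with e(t) = r(t) − y(t); the sketch below is a discrete version of that form, not necessarily what the slide showed. The gains, the time step, and the toy "room" whose temperature changes in proportion to the control signal are all arbitrary assumptions, and the talk's K0, K1 and K2 presumably play the role of the three gains.

```go
package main

import "fmt"

// PID is a minimal discrete PID controller.
type PID struct {
	Kp, Ki, Kd float64 // proportional, integral and derivative gains
	integral   float64 // accumulated error (area under the error curve)
	prevErr    float64 // error from the previous step
}

// Update takes the set point r, the measured process variable y and the
// time step dt, and returns the control signal u.
func (c *PID) Update(r, y, dt float64) float64 {
	e := r - y
	c.integral += e * dt
	derivative := (e - c.prevErr) / dt
	c.prevErr = e
	return c.Kp*e + c.Ki*c.integral + c.Kd*derivative
}

// simulate drives a toy system whose state changes by u*dt on each step
// and returns the final value of the process variable.
func simulate(setPoint, start float64, steps int) float64 {
	c := &PID{Kp: 2, Ki: 0.5, Kd: 0.1} // illustrative gains only
	const dt = 0.1
	y := start
	for i := 0; i < steps; i++ {
		u := c.Update(setPoint, y, dt)
		y += u * dt
	}
	return y
}

func main() {
	// Thermostat-style example: current temperature 18, desired 22.
	fmt.Printf("temperature after 1000 steps: %.3f\n", simulate(22, 18, 1000))
}
```

In a real system the gains have to be tuned to the process being controlled, which is the point the talk makes about the coefficients depending on the system.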
This comes from the controller documentation, which you can check out; it has a very good specification of how to write a controller. Let's take a very simple example to see how it actually works in a one-node cluster. We have a Deployment, which has a ReplicaSet, which provisions two pods on the single node. The node talks to the API server, the API server talks to etcd, and there is a bunch of controllers that run to maintain that state. So everything is fine. Now a pod decides to bail out; it's gone, just like that. The desired state is no longer maintained. So what happens? The kubelet talks to the API server, and the API server checks etcd and says: okay, I need two pods, but one is missing here. So the API server notifies the controllers, the Deployment and ReplicaSet controllers and the scheduler; the scheduler assigns a new pod to that node, this is recorded in etcd, and finally a pod is provisioned on node zero. That is a very simple example of how controllers work in Kubernetes. Now, how do we extend the reconciliation pattern? How do we use it to make CRDs and so on? First of all, how many of you folks have used Kubernetes, Cluster API, CRDs, all these fancy words? Quite a lot. Most of the frameworks, Kubebuilder, Operator SDK, have this basic structure for making a controller. You create a spec, which is the set point in this case. We have a status, which is the process variable, the observed state at any point in time. And we have a schema that defines the object, Foo in this case, with the spec, the status, and the metadata, like the name, type, and all that information.
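The loop and the spec/status split described above can be sketched in plain Go without any Kubernetes libraries. Everything here is hypothetical: the `Foo`, `FooSpec` and `FooStatus` types only imitate the shape that frameworks like Kubebuilder generate, and `Reconcile` is a stand-in for real business logic, not the controller-runtime interface.

```go
package main

import "fmt"

// FooSpec is the desired state (the set point).
type FooSpec struct {
	Replicas int
}

// FooStatus is the observed state (the process variable).
type FooStatus struct {
	ReadyReplicas int
}

// Foo bundles metadata, spec and status, like a custom resource would.
type Foo struct {
	Name   string
	Spec   FooSpec
	Status FooStatus
}

// Reconcile takes one step from the observed state towards the desired
// state and reports whether the two now match.
func Reconcile(f *Foo) bool {
	switch {
	case f.Status.ReadyReplicas < f.Spec.Replicas:
		f.Status.ReadyReplicas++ // pretend to start one pod
	case f.Status.ReadyReplicas > f.Spec.Replicas:
		f.Status.ReadyReplicas-- // pretend to stop one pod
	}
	return f.Status.ReadyReplicas == f.Spec.Replicas
}

func main() {
	foo := &Foo{Name: "demo", Spec: FooSpec{Replicas: 2}}

	// A real controller loops forever; this sketch stops once converged.
	for !Reconcile(foo) {
	}
	fmt.Println("ready replicas:", foo.Status.ReadyReplicas)

	// A pod "bails out": the observed state drifts and is reconciled again.
	foo.Status.ReadyReplicas--
	for !Reconcile(foo) {
	}
	fmt.Println("ready replicas after recovery:", foo.Status.ReadyReplicas)
}
```

The error in control-theory terms is the gap between `Spec.Replicas` and `Status.ReadyReplicas`, and each call to `Reconcile` acts as the actuator that shrinks it.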
We need to fulfill the Reconciler interface, so we create a FooReconciler object and provide it with all the business logic that we need to get from the current state to the desired state at any given point in time. The way we do that is we define a CRD. Our CRD has a spec, which is the desired state, and the controller continuously watches the CRD to check: okay, this is the desired spec, but we are not at the desired spec right now, so something needs to change. It calls the reconciler, which executes the business logic we want. That is how we take the reconciliation pattern we saw earlier and extend it to our own custom objects. Now, how do we use these patterns in Cluster API? First of all, Cluster API is a Kubernetes project that uses declarative APIs to create, configure, and manage the life cycle of other clusters. In a very crude example, the user applies a spec to the management cluster, which is kind of a cluster of clusters: it manages all the other clusters that we have. The spec defines what those other clusters need to be, and the management cluster has four kinds of things: Cluster API CRDs, infrastructure provider CRDs, control plane provider CRDs, and bootstrap provider CRDs. All of these need to be present in the management cluster, and based on the specs in those CRDs it tries to maintain the state of all the other clusters. So what do these different CRDs do? What is their purpose? Cluster API itself provides the core objects, like Machine, MachineSet, Cluster, everything that upstream Cluster API gives us. The bootstrap provider does the job of turning your VM or any server into a Kubernetes node.
You can customize the logic for that, and convert the machine into the particular Kubernetes node that you want, for EC2, for OpenStack, whatever your cloud provider is. The control plane provider gives you the objects for the control plane, with all the reconciliation loops and controllers that the control plane needs to maintain its state. And the infrastructure provider covers the particular infrastructure, like EC2, OpenStack, whatever you have, and how it is incorporated into the bootstrap and control plane providers. This is how these different CRDs interact with each other. Cluster is from Cluster API, but we need to provide it with an infrastructure cluster, which comes from the infrastructure provider. All of these are very much dependent on which cloud you're using; we'll see an example in a few minutes. The control plane comes directly from the control plane provider. MachineDeployment and MachineSet are Cluster API objects, but we need to provide them with a bootstrap config and an infrastructure machine to work. MachineHealthCheck comes directly from Cluster API; its job is to keep checking the state of the machines and whether they are working fine. A bit about MicroK8s, because we're going to use the MicroK8s control plane and bootstrap providers. MicroK8s is the lightweight Kubernetes we have been working on. It is a one-touch, highly available Kubernetes with sane default configs, so you don't need to do much, and it has a very good add-on ecosystem where you can bring your own custom tools for your clusters; you don't need to rely on us to do all of that.
For the demo, a small one, we need three essential things. The Cluster API objects come from upstream, but we need to provide the other three: for the bootstrap provider we'll use our MicroK8s bootstrap provider, for the control plane the same, and for infrastructure we'll use the OpenStack provider. So let's see if it works. As I said, the Cluster object is from upstream Cluster API; we just take those CRDs, but then we need to specify which control plane reference and which infrastructure reference we are using, and that is all custom, based on what you want to do. Similarly, we have the OpenStackCluster, which is specific to OpenStack; there are different projects for the other infrastructures, AWS, Azure, EC2. Then we see the MicroK8sControlPlane, which is specific to MicroK8s. It defines the spec that a particular instance of MicroK8s will have, and this is the thing to look at: we define the version that this control plane will have. Then there is the OpenStackMachineTemplate that we saw before, and the MachineDeployment, which also has a version, and that is essential for our demo. The rest comes from whatever template you apply, so it's mostly defaults. Rather than going through the whole thing live, I have screenshots, because the entire demo takes too long to run. If I apply this cluster, I get this. I don't know if you can see, but I have six machines in an OpenStack cluster, each at version 1.24. As time progresses, each gets a provider ID, and at a certain point they are all in the ready state and good to go, all of them with Kubernetes version 1.24.
A thing to note is that the two groups are controlled by different providers: the machine deployments are handled by the bootstrap provider, and the control plane provider takes care of the control plane nodes. So let's see what happens when we update this cluster and what reconciliation takes place. If I go there, I change the version to 1.26, and then again to 1.26 in the other place. As soon as I apply this manifest, I have changed the desired state to version 1.26 on both the control plane and the worker nodes. When I apply that, both controllers, the bootstrap and the control plane controllers, see that 1.24 is no longer what we want, we want 1.26, so they start provisioning machines at version 1.26. These are rolling updates: a new node is provisioned, an old node is deleted, and this happens until all the nodes are in the desired state. There are also in-place upgrades, which are a very cool idea: rather than deleting the nodes, the upgrade is done in place without having to drain nodes each time, and that is a very good fit when you have a stateful application like a database. So it does that, the provisioning and the deletion, until the entire cluster is at 1.26, which was the desired state. So, to recap: we went from first principles, what control theory is and how it gives us controllers; then we saw how that is applied in the Kubernetes ecosystem, how we extended those patterns in Cluster API, and finally how we can build a feature from those first principles. These are some of the talks that I took inspiration from. I definitely recommend the control theory talk by Valerie; it has lots and lots to say about this.
Control Theory is Dope is a very good article that you should definitely check out. It also talks about reactive patterns, which is cool stuff with a lot of use in AI and so on. These are the other sources I used as references. So yeah, thank you for coming. I hope you didn't come in just for me. Thank you. I'll take questions if you have any. Are there any questions? Not any questions about Kubernetes, questions about the talk; I'm just going to get the microphone to you. Can you pass the microphone along? Thank you. Hey Guruji, thank you for your talk. In the theory you have the desired state and the current state of the system, and with the thermostat you have the desired temperature and the current temperature. How do you account for the case where this is not going to happen? Can the system predict that? Say I've been running the heater for 48 hours and I see that the temperature is not rising, not a single degree. How do you cater for that? First of all, if the system does not reach the desired state, it means that it has a fault. But it will treat that as experience. If I go back here, the derivative is the predictor component. It will see that at some point previously this change was not working, so it takes that into account, along with how we tried to make it work, and incorporates that experience the next time it tries. Thank you. Any other questions? I'll take that as a no. Thank you very much again. We have a small five-minute break, so you can stand up and stretch a bit. |
Five Steps to Make Your Go Code Faster & More Efficient |
Okay, welcome back. While you all have been walking in, I've been quickly reading this book, Efficient Go. It reads very quickly, and now Bartek has made sure that my code is ten times faster, so tell us everything about it. Thank you. Thank you very much, everybody. Welcome. I hope your travels went well. Mine involved a canceled flight and a change of route, so I had lots of adventures, but I'm super happy I made it and that we are at FOSDEM. In this talk, I would like to invite you to learn more about the efficiency of our Go programs. There have already been two talks before mine that mentioned optimization in their titles and, generally, how to make software more efficient. I wonder why; I wouldn't call it hype, but that's already three talks about one topic. Why is it so popular? Is it because everybody wants to save money? That might be a reason. But I'm super happy we are really uncovering this for Go, because Go alone might be fast, but that doesn't mean we don't need to care about making it better and about the resources we use when we execute it, right? So let's learn about that. It turns out that you can save literally millions of dollars in the long term if you optimize some code in production, so it really matters. But before we start, a short introduction. My name is Bartłomiej Płotka. I'm an engineer at Google; normally I work on Google Cloud's managed Prometheus service. But generally I'm an open source person: I love Go, I love distributed systems and observability topics. I maintain Thanos, which is an open source, scalable Prometheus system, I maintain Prometheus as well, and generally lots of things in open source. I mentor a lot, and I suggest you try to mentor others too; it's super important to bring the new generation of people up to speed in open source. And yeah, I'm active in the CNCF.
And recently, as you see, I published a book. I think it's kind of unique: everybody's doing TikToks and YouTube now, and I was like, let's be old school, because you need to be unique sometimes. I really enjoyed it, I learned a lot while writing it, and I would love you to learn as well, so I'm summarizing some concepts from my book in this talk. So let's go.
I would like to start with a story. Apparently the best talks have to start with a story, but this one is something that maybe triggered me to write the book. About five years ago we had just started an open source project called Thanos. It doesn't really matter what it does right now; what matters is that it has, I think, six different microservices written in Go. You put it in Kubernetes or any other cloud, and it's just a distributed database. One part of this database is the compactor. Again, it doesn't matter much what it does; what matters is that it touches object storage and processes sometimes gigabytes or terabytes of metrics daily. At the very beginning, as you can imagine, we implemented an MVP. It functionally worked, but the implementation was naive, definitely not optimized. We didn't even run any benchmark, other than just running it in production and saying, yeah, it kind of works. You're laughing, but this is usually what high-velocity development looks like. It was working very well until, of course, more people put load on it and we got issues like OOMs. One user pointed us to graphs with an incredibly high spike of memory usage on the Go heap, and then a drop, which means there was a restart or someone killed the process. And the numbers are not small, like 15 gigs. For a large data set maybe that's fine, but it was problematic.
It was really interesting to see what different feedback and suggestions the community were giving us. By community I mean everybody: users, other developers, maybe product managers. Sometimes we don't know what their role is, but probably depending on their background, the proposals were totally different. I would like you to check whether you had the same situations in your experience, because this is a very common, ongoing problem, and I would like to showcase it.
The first suggestion was: can you give me a configuration that doesn't OOM? And it's like, what? Do you expect a very new project to have a flag like "do not OOM" or "use less memory"? It's not as simple as that. Yet many, many users ask this question, of us and of other projects. Probably you have heard it too: what configuration should I use so it uses less memory, or is more optimized? How can I optimize using configuration? Maybe in Java, in the JVM, you have lots of performance flags, you sometimes tune things and it gets better. But it's not so simple in Go, which is kind of low level; you need to do more than that.
Another interesting approach, and a very good one in some ways, is: okay, I will just put this process on a bigger machine, and that's it. That's a totally valid solution, maybe short term, and sometimes it's enough. But in our case it was not sustainable, because of course you can't grow vertically more and more. And even if you found a machine big enough for your data set, you would obviously be overpaying a lot if the code is naive and wasting a lot of memory.
Then finally, the most fun approach: okay, let's split this one microservice into a scheduler and workers, replicate them in my super nice Kubernetes cluster, and it will just scale horizontally, so I can use many hundreds of small machines. It will work, yes, but you are putting so much complexity on one small microservice that it will generally be more expensive. There are network costs, and distributed systems inject things like having to replicate data, so you overpay more and more, and you are just distributing this non-optimized code to different places. That's not always the solution.
Sometimes the code cannot be optimized any further, and then we should probably scale horizontally, but not at the very beginning of the project. Yet that was the first suggestion from the community. Of course, you can also just switch from Thanos to something else. That's also a solution, but with this approach you would probably just keep jumping between projects. That is not super efficient, although maybe some parts of another project are better and some worse, so it's an option. Another suggestion, of course, is paying a vendor: they will solve the problems for me, for real money. But that's not always a good solution either. That's just giving up, and there is also the migration of data, the huge cost of learning new tools, and so on. And all of this while, in the code, there is waste that could be avoided in super easy ways. Of course, that example was malloc, so C++; in Go we don't have malloc and so on, but memory overhead and memory leaks like that are very common in Go. Just imagine how many goroutines you sometimes create, you forget to close some kind of abstraction, the goroutine is leaking, and so you are leaking memory just like with that malloc.
So what actually was the solution? Some contributor finally came and investigated this efficiency problem at the algorithm and code level, and we rewrote a small part of the compactor to stream data. Instead of building the resulting object in memory, the compactor streams it to the file system as soon as possible. A generally easy change, yet there were lots of discussions, lots of stress, lots of weird ideas, and over time I found it amusing that this story was repeating in many, many cases. And it's not only my experience: there are so many nice examples where a small change, a two-character change, brings a big improvement to a large system. So sometimes there are very easy wins that we can just pick up, but we need to know how.
So, two learnings from the story. One is that software efficiency at the code and algorithm level matters, and learning how to do it can be useful. The second is that there is a common pitfall these days. In the past we had premature optimization: everybody was playing with the code and trying to over-optimize things. I think now we are lazy, and we are more into DevOps, into changing configuration, into horizontal scaling, because we have this power, we have the cloud, and this is usually chosen more often than actually checking the code. I call it closed box thinking, and I think it's a bit of a threat in our ecosystem. We should acknowledge that there are different levels: we can sometimes scale, we can sometimes use a bigger machine, we can sometimes rewrite to Rust, if that makes sense, but that's not the first solution that should come to your mind, right?
Okay, before we go forward: I have five books to share. I will show the link to a quiz at the end, and it will be super simple, but pay attention, because there may be questions about the talk. Send me an email with your answers, and I will choose five lucky people to get my book. So pay attention. All right, five steps for efficiency progress.
One thing I want to mention. I don't know if you were in the previous talks, but they explained a lot of optimization ideas. Matej mentioned string optimizations with interning, and Jesús just mentioned allocations, struct padding, and generally all those kinds of ideas. This is fine, but optimizing is not like looking through a dictionary of things I did in the past; it's more fuzzy, more involved. So what I would like you to focus on is not the particular optimization in the example I will show, because it's super simple and trivial, but how we get there: how we found what to optimize, and how we found out whether we should optimize at all. So focus on that.
So the first step, the first suggestion, and this is from the book: I defined this name, TFBO, which is essentially a flow for efficiency-aware development that worked for me, and I see other professionals doing it a lot as well. TFBO stands for test, fix, benchmark, optimize. Essentially it's TDD with something extra. You are probably familiar with TDD, test-driven development: you test first, and only then you implement or fix until the test is passing. I would like to do the same for optimizations, so we have benchmark-driven optimization: we benchmark first, then we profile, and only then we optimize, and I will tell you later why. All of this is a closed loop, so after optimizing we have to test again. It feels complex, but we'll make one loop, actually maybe two, during this talk on simple code, so let's do it.
So let's introduce a simple function, super simple, super stupid. We are creating a slice with a million elements, and each of those elements is just a constant string. It's the first iteration of the program we want to write. So what do we do according to TFBO? We test. We already have some code that we maybe want to improve, so let's assume I already had the test. It could look like this, and I'm ensuring it's passing, so there is nothing I have to fix functionally. So what next?
Next is the measurement, the benchmark. Again, Jesús already mentioned how to write benchmarks, but I have some additions that you might find helpful. Note that we are talking about micro benchmarks here. It's the same as with levels of testing: for a small function like this create, a unit test is totally enough. That's the micro level. But if you have a bigger system, you sometimes need something on the macro level: an integration test, an end-to-end test, whatever is bigger. The same applies to benchmarks. This is a micro benchmark, a kind of unit benchmark. There are also macro benchmarks, which I cover in my book, and for those you need a more sophisticated setup with load testing, maybe some automation, and some observability, like Prometheus, which measures resources over time. But here we have a simple create function, so we can keep it simple with micro benchmarks.
Jesús already mentioned it, but there is a special signature you have to use in a test file, and then there are optional helpers that I actually like to put almost everywhere. One is ReportAllocs, which makes sure that this benchmark measures allocations as well. The other is ResetTimer, which is super cool because it resets the measurement, so anything you allocate or spend time on before it is discarded from the benchmark result, and the benchmark only focuses on what happens within the loop iterations. And then there is this for loop. You cannot change it, don't try to change it, always copy it; this is boilerplate that has to be there, because it allows Go to check the repeatability of your benchmark by running it hundreds of times.
Okay, so how do we execute it? Again, Jesús mentioned this, and this is how I do it to focus on one benchmark, but it is not enough in my opinion. By default, it runs the benchmark only once, for one second. I recommend explicitly stating some parameters. I have a one-liner in bash that I often use. Essentially, I create a variable so I can reference the result later on, v1 for example, so this will create a v1.txt file in my local directory. It runs the benchmark, and I specify the benchmark time explicitly, which is super useful, because later it's like: okay, I have this v1 file, what was I doing with it? Then you check your bash history: oh, that was one second, and that one was something else. So it's kind of useful. And then there is something crucial, something I don't know why I didn't learn in the beginning, maybe you did: the count, -count.
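
A minimal sketch of the micro-benchmark described above. The slide code is not part of the transcript, so the function name Create, the element string, and the slice size are stand-ins. In a real project BenchmarkCreate would live in a _test.go file and be run with go test; here testing.Benchmark runs it from main so the sketch is self-contained.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the result alive so the compiler cannot discard the work.
var sink []string

// Create is a stand-in for the talk's naive function: it builds a slice of
// n identical strings by appending to a nil slice, without preallocating.
func Create(n int) []string {
	var out []string
	for i := 0; i < n; i++ {
		out = append(out, "datadatadata")
	}
	return out
}

// BenchmarkCreate has the signature that `go test -bench` looks for.
func BenchmarkCreate(b *testing.B) {
	b.ReportAllocs() // also report bytes and allocations per operation
	b.ResetTimer()   // discard any setup cost that happened before this line
	for i := 0; i < b.N; i++ { // boilerplate loop: the framework picks b.N
		sink = Create(1_000_000)
	}
}

func main() {
	// Run the benchmark without the `go test` command.
	res := testing.Benchmark(BenchmarkCreate)
	fmt.Println(res.String(), res.MemString())
}
```

With the benchmark in a _test.go file, one plausible shape for the one-liner (a guess, not the speaker's exact command) is `go test -run '^$' -bench '^BenchmarkCreate$' -benchtime 1s -count 6 -cpu 1 | tee v1.txt`.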
What -count does is run the same benchmark a couple of times, six times here, so one second, six times. This is super important because then you can use further tools, which you will see, to check how reliable your results are. They essentially calculate the variance between the timings, for example, and if the variance is too big, then your environment is not stable. Then I pin to one CPU. Pinning in general is what's important, not pinning to one specifically. Pick something that works for you; for concurrent code, pick something close to what runs in production, but never change it between runs. I also recommend picking fewer CPUs than your machine has, because your operating system has to run on something, right? Those things matter. Also, don't run this on a laptop without power connected, because the CPU will be throttled. There are lots of small things where you think, oh, it doesn't matter. No, it matters, because otherwise you cannot rely on your results. So try to take this a little bit seriously and at least don't benchmark with the laptop on your lap in bed, because it will overheat. Small things, but they matter. I was doing that all the time, by the way.
So the result looks like this. You can see many lines. But this is not how I use it, or how we are supposed to use it, apparently. There is an amazing tool called benchstat that presents this in a more human-readable way. You can see it also aggregates and averages over those runs and tells you the spread as a percentage. For example, for time, the latency, there is a variance of 1%, which is tolerable. And you can customize how exactly it calculates this variance and so on.
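
To make the "within this percentage" idea concrete, here is a rough sketch that averages repeated results and reports the largest deviation from the mean as a percentage. It is only an illustration: benchstat's own statistics are more careful than this, and the sample values and the Summarize helper are invented for the example.

```go
package main

import (
	"fmt"
	"math"
)

// Summarize returns the mean of the samples and the largest deviation from
// that mean, expressed as a percentage of the mean.
func Summarize(samples []float64) (mean, spreadPct float64) {
	if len(samples) == 0 {
		return 0, 0
	}
	for _, s := range samples {
		mean += s
	}
	mean /= float64(len(samples))
	if mean == 0 {
		return 0, 0
	}
	var maxDev float64
	for _, s := range samples {
		maxDev = math.Max(maxDev, math.Abs(s-mean))
	}
	return mean, maxDev * 100 / mean
}

func main() {
	// Six hypothetical ns/op results, as a run with -count 6 would produce.
	runs := []float64{30.1e6, 29.8e6, 30.3e6, 30.0e6, 29.9e6, 30.2e6}
	mean, spread := Summarize(runs)
	fmt.Printf("%.0f ns/op ± %.1f%%\n", mean, spread)
}
```

A spread of a percent or so suggests a stable environment; a much larger one is a hint to fix the environment before comparing versions.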
So we can trust these results: the spread is within 1%. I guess up to 3% you could trust, depending on what you do, but generally it's not too bad. Allocations, fortunately, are super stable. So we benchmarked, we measured, and we know our function has these numbers. What's next? Everybody is like, yeah, let's make it faster. But wait, why should we make it faster? Okay, maybe 100 megabytes for every create invocation is a lot, but maybe that's fine. This is where I think we usually lack experience. You have to set some expectations: to what point are you optimizing? Usually we don't have any expectations. Even from product management we maybe get functional requirements, but never really concrete performance requirements. So we don't know what to do, and honestly, if you just ignore requirements and say, I don't have any, I just want to make it faster, then it's always premature optimization, because it's a random goal you don't really understand. And "just make it fast" is very fuzzy too, obviously, and not very helpful. So what is helpful?
I know it's super hard and kind of uncomfortable, but I suggest writing some kind of efficiency requirements spec, as simple as possible. I call it RAER, resource-aware efficiency requirements. What it means is essentially: try to find some kind of function, some kind of complexity, but not the asymptotic one, just a more concrete estimation of the complexity based on the inputs. For simple functions like ours, we can estimate roughly what we think should happen. For runtime, we know we do something one million times. We don't know how many nanoseconds each takes; let's pick 30. That is actually pretty big for one iteration of just append, but really just pick some number. You can iterate on this number later, but if you don't know where you are going, how can you make any decisions?
For allocations it's a little bit easier, because we expect a slice of one million strings, and as we learned from Matej's talk, every string has two parts. One part is 16 bytes and holds the length and the pointer; the other part lies on the heap. So we can assume every element is 16 bytes, and now we just multiply. That's our function, that's what we expect. With this, we can expect every invocation of create to allocate about 15 megabytes, but what we see is that we allocate 80 megabytes. So already we see that there might be easy wins, or something I don't understand about this program. This is what leads us to spotting easy wins, and to spotting whether we need to do anything at all. In terms of time, the latency is also more than we expected, but this is more of a guess; I just guessed those 30 nanoseconds.
Okay, so now we know we are not fast enough and we are over-allocating. So then we profile: we have a problem, now let's find out what's going on. On the micro level, we can use profiling very easily by just adding those two flags. They gather memory and CPU profiles into files like v1.mempprof. On the macro level there are other ways of gathering profiles, but you can use the same format and the same tools. There are even continuous profiling tools in open source, like parca.dev, which I really recommend, and with them it's super easy to gather those profiles over time.
What we really want to learn is what causes the problem. This is a CPU profile. The wider a block is, the more CPU cycles it spends; the depth doesn't matter, it is just how many functions we have in the stack. We can see that create, of course, is one of the biggest contributors, but also growslice. Why do we spend so many cycles growing the slice? I know how many elements I have, so why doesn't it grow just once? By the way, you can use go tool pprof -http locally; I use it a lot on such files to expose this interactive UI. You can do the same for memory, but honestly, here that is not useful, because append is a built-in and its internals are not very well exposed there. The CPU profile was more helpful, because it pointed us to growslice, and if you just Google that, you will notice it comes from append. Then you can go to the documentation of append and learn what it actually does. As you are probably familiar, because this should be a trivial case, append resizes the slice, or rather the underlying array, whenever it's full. And resizing is not super simple: it has to create a new, bigger array and copy things over. Garbage collection will clean up the old one, but not fast enough, so all of that adds up as extra allocation. That is what happens, and the fix is to just preallocate: when you create the slice, you tell it how much capacity to prepare. And that's it.
So what do we do now? We did the optimize step of our TFBO, so now we test, before we even measure. If you don't test that the code is still correct, you might be happy that things are faster while they are functionally broken. So always test; don't be lazy, run those unit tests, it's easy. Once they are passing, you can comfortably measure again. I just changed the variable to v2, to write to another file, and then I can run benchstat on v1.txt and v2.txt. Actually, I could pass 100 of those files and it would compare all of them, but here we compare two. Not only do we get the absolute values of those measurements, but also a diff, so you can see we improved a lot. And if we check the absolute values against our efficiency requirements, you see that we roughly met our threshold: we estimated 15 megabytes, and we have 15 megabytes, and it's faster than our goal. So now we are good to go and release it. That's the whole loop, and you repeat it until you're happy with your results.
So this is it. Five learnings again. Follow TFBO: test, fix, benchmark, optimize. Use benchmarks; they are built into Go and they are super amazing: go test -bench. Set clear goals; goals are super important here. Then profile. Go uses pprof, which you can Google as well; it's an amazing format and set of tools, integrated with various clouds and so on, and I use it every day whenever I have to optimize something. And finally, the key is to try to understand what happens, what you expected, and what's wrong. Reading documentation, reading code: this is what you have to do sometimes.
And a general tip: whenever you want to optimize something super carefully in some bottleneck part of your code, consider avoiding standard library functions, because they are built for generic functionality. They handle a lot of different edge cases that you might not have. A lot of times I just implemented my own integer parsing function, and it was much faster. This is a general tip that always works, but again, do it only when you need to, because you might introduce bugs in this code, right? So that's it, thank you. You have a link here: bwplotka.dev. Thank you. |
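
The preallocation fix and the allocation estimate from the talk above can be sketched as follows. As before, the function name and element string are stand-ins for the slide code, and ExpectedBytes is a hypothetical helper. The 16 bytes per string header holds on 64-bit platforms; the sketch asks the compiler for the size instead of hard-coding it.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// sink keeps the result alive so the allocation is not optimized away.
var sink []string

// CreatePrealloc is the optimized stand-in: the backing array is sized once
// up front, so append never has to grow and copy it.
func CreatePrealloc(n int) []string {
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, "datadatadata")
	}
	return out
}

// ExpectedBytes is the rough efficiency requirement for allocations:
// one string header (pointer plus length) per element.
func ExpectedBytes(n int) uintptr {
	var s string
	return uintptr(n) * unsafe.Sizeof(s)
}

func main() {
	const n = 1_000_000
	// AllocsPerRun reports the average number of heap allocations per call.
	allocs := testing.AllocsPerRun(10, func() { sink = CreatePrealloc(n) })
	fmt.Printf("expected about %d bytes per call, measured %.0f allocation(s) per call\n",
		ExpectedBytes(n), allocs)
}
```

On a 64-bit machine the estimate is 16,000,000 bytes, which is the "about 15 megabytes" figure from the talk when counted in MiB.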
Headscale: How we are using integration testing to reimplement Tailscale |
Next up, we have two speakers for the price of one. They are going to talk about everything to do with an open, more open source version of Tailscale. So let's give a round of applause to Kristoffer and Juan.
Hello. Hello. Okay. This is cool. Hello. My name is Kristoffer, and together with Juan there I'm going to talk a bit about how we use integration testing to reimplement the control plane, or the control server, of Tailscale. So first a little bit about us. Juan Font Alonso is the creator of Headscale. He works for the European Space Agency on cloud, DevOps, and infrastructure. He claims to have been my first manager, but I think that's incorrect. And he has the attention span of a goldfish.
Which makes the whole collaboration very fun. And I'm here with Kristoffer. He's a top contributor to Headscale and one of the other maintainers alongside me. He's a member of technical staff at Tailscale, and part of his time at Tailscale goes to improving Headscale. I was his manager, at least from a hierarchical point of view. And one of the challenges we have is that he always finds these super niche languages, like OCaml or things like that, in which he tries to reimplement Headscale.
But first of all, how many people here know Tailscale and Headscale? Good. That's pretty good. For the people who don't, we'll do a quick intro to what Tailscale is. Tailscale tries to solve this problem: you want to connect your organization or home or something like that, and you have an old-school or legacy VPN concentrator. You connect into your perimeter, you have access to absolutely everything, and there's a single point of failure and a massive bottleneck.
It does this by creating a mesh VPN that uses direct WireGuard connections and facilitates this for you using techniques like NAT traversal. It has a very, very powerful client that makes sure you always reach what you're trying to get to, and it offers a lot of different kinds of granular access, so you get a lot more power compared to your old-school bottleneck, single-point-of-failure VPN.
In Tailscale, the clients are open source, at least for the open platforms, and what they have is a closed SaaS. Still, they are quite open when it comes to explaining how the whole thing works. In March 2020 they published a blog post explaining how it all worked, how they use these NAT traversal techniques so you don't have to open ports in your router. There was a phrase in this blog post that caught my attention. It was talking about a coordination server: the clients talk to a coordination server, the core of their service offering, which is essentially a shared drop box for these WireGuard public keys. I was intrigued by that, so I took the open source clients and started reverse engineering, basically with a lot of printfs, to see what kind of payloads they were sending and what kind of endpoints or protocol they were using. This was around April 2020. I had a lot of free time back then, and in June I did the initial release. I talked to my friend Kristoffer about Tailscale, but he was very happy distributing WireGuard keys with Ansible, so I kept doing my own thing for a while. Headscale gained a little bit of traction, and around mid 2021 he joined, because he was quite curious about the whole thing. But he was afraid of breaking stuff, and that's kind of why we are here. He was not afraid of making a logo, though, and I think it's super nice.
So what I learned doing this reverse engineering exercise is that the Tailscale clients talk to what is basically a web service. This web service receives metadata from the clients, like their endpoints or the WireGuard public keys they use, and assigns them IP addresses, like a classic, traditional VPN service would. Once everybody knows about everybody, you can establish this mesh network across the clients without interference, because the data doesn't go through the web service.
So we arrived at the initial stage of Headscale: the illusion that everything works. And it kind of worked, until it stopped working. We had implemented the web service, the series of endpoints we found in the reverse engineering exercise, and we were assigning an IP address when a node arrived. But what happens when a second node arrives? It says: hey, I want to tell you that I'm here, I want to find my friends, and I want to communicate with them. To handle that, and to handle the metadata you need to establish the connections, we developed a little state machine that handles it: a new node has arrived, there's been a change in the map of the network, and we need to distribute the updated metadata. However, at that time I was still learning Go, and we followed a bit of a weird approach to concurrency, which was basically adding another mutex every time we needed one. This is a problem, because in the end we ended up with one giant mutex for this state machine. And that is a very big problem, because the Python track is tomorrow, so the giant locks belong over there.
So what ended up happening inside the state machine, or what didn't end up happening: one of the failure modes we saw was that a new node tried to register, we burned a couple of CPU cycles calculating some stuff, and then we did nothing. No updates were sent out or anything.
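
As a toy illustration of the "shared drop box" idea described above, the sketch below keeps node metadata in memory, hands out sequential addresses, and returns the peers a node should learn about. It is not Headscale's code: the Registry and Node types, their fields, and the example addresses are all hypothetical, and a real control server speaks Tailscale's protocol over HTTP rather than exposing a Go method.

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
	"sync"
)

// Node is the metadata a client reports. Field names are hypothetical.
type Node struct {
	PublicKey string
	Endpoint  string
	Addr      netip.Addr
}

// Registry is a toy coordination server guarded by a single mutex.
type Registry struct {
	mu    sync.Mutex
	next  netip.Addr
	nodes map[string]Node
}

// NewRegistry starts handing out addresses at first, e.g. "100.64.0.1".
func NewRegistry(first string) (*Registry, error) {
	addr, err := netip.ParseAddr(first)
	if err != nil {
		return nil, err
	}
	return &Registry{next: addr, nodes: make(map[string]Node)}, nil
}

// Register stores a node's metadata, assigns it an address on first
// contact, and returns its record plus every other known node.
func (r *Registry) Register(publicKey, endpoint string) (Node, []Node) {
	r.mu.Lock()
	defer r.mu.Unlock()

	n, known := r.nodes[publicKey]
	if !known {
		n = Node{PublicKey: publicKey, Addr: r.next}
		r.next = r.next.Next()
	}
	n.Endpoint = endpoint
	r.nodes[publicKey] = n

	peers := make([]Node, 0, len(r.nodes)-1)
	for key, p := range r.nodes {
		if key != publicKey {
			peers = append(peers, p)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Slice(peers, func(i, j int) bool { return peers[i].PublicKey < peers[j].PublicKey })
	return n, peers
}

func main() {
	r, err := NewRegistry("100.64.0.1")
	if err != nil {
		panic(err)
	}
	a, _ := r.Register("key-a", "192.0.2.10:41641")
	b, peers := r.Register("key-b", "198.51.100.7:41641")
	fmt.Println(a.Addr, b.Addr, len(peers))
}
```

The hard part the talk goes on to describe is not this bookkeeping but telling already-connected nodes that the map has changed.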
Sometimes a new node would join, we would compute everything and send some network traffic, but we omitted the new information, which was kind of crucial for everyone to know, so it ended up not working. Sometimes a new node joined and nothing really happened, but then eventually something else happened and an update went out to everyone, and that was, you know, useful. And sometimes, on the individual update channels for each node, some of these aforementioned mutexes deadlocked the whole goroutine, and then we just never sent updates to particular nodes. And sometimes we deadlocked the whole server, and you had to kick it to make it come back to life. But still, there was this notion that it worked pretty well, eventually, most of the time. It gave this illusion of working, because what you often saw was that you had, say, three nodes, and only two of them actually talked to each other. As long as those two had received the updates they needed, the user was happy and just thought: ah, it works, so I'm going to press the star button on GitHub and share it with my friends.
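
One common way to keep a slow or stuck receiver from blocking everyone else on per-node update channels is a non-blocking send. The sketch below shows that general Go pattern; it is an assumption for illustration, not the fix Headscale actually shipped, and the Notifier and Update types are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Update is a placeholder for a network-map change.
type Update struct {
	Seq int
}

// Notifier keeps one buffered channel per connected node.
type Notifier struct {
	mu    sync.Mutex
	chans map[string]chan Update
}

func NewNotifier() *Notifier {
	return &Notifier{chans: make(map[string]chan Update)}
}

// Subscribe registers a node and returns the channel it should read from.
func (n *Notifier) Subscribe(node string) <-chan Update {
	n.mu.Lock()
	defer n.mu.Unlock()
	ch := make(chan Update, 1)
	n.chans[node] = ch
	return ch
}

// Broadcast offers u to every node without ever blocking. It returns the
// nodes whose buffer was full, so the caller can retry them later.
func (n *Notifier) Broadcast(u Update) []string {
	n.mu.Lock()
	defer n.mu.Unlock()
	var missed []string
	for node, ch := range n.chans {
		select {
		case ch <- u:
		default: // receiver is not keeping up; do not wait while holding the lock
			missed = append(missed, node)
		}
	}
	sort.Strings(missed)
	return missed
}

func main() {
	n := NewNotifier()
	fast := n.Subscribe("fast")
	_ = n.Subscribe("slow") // this node never reads its channel

	fmt.Println("missed:", n.Broadcast(Update{Seq: 1})) // both buffers have room
	<-fast
	fmt.Println("missed:", n.Broadcast(Update{Seq: 2})) // "slow" is still full
}
```

The trade-off is that a missed node is now stale, so something has to notice that and resend, which is where tracking who is up to date comes in.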
But we figured that eventually this would catch up with us, and we were trying to get beyond this stage where it works most of the time. What we did have was a fair amount of unit tests. But the problem with unit tests is that we were reverse engineering something while also learning how it works. We spent a lot of time misunderstanding how it was supposed to work and writing unit tests that passed but were wrong, so you have a passing test and an entirely wrong implementation. And 90% of what we were actually trying to do was integrate with third-party software, and this is where we get to actual integration tests. So I found this dockertest framework, which basically allows you to create Docker containers programmatically. We started writing tests that spun up a Headscale container and a bunch of Tailscale instances, also running in Docker, associated them with a couple of users, and tried to emulate the entire environment so we could test everyone against everyone. We had them join the Headscale server, and since it takes a little bit of time for everyone to catch up with each other and send the updates, we put a sleep of two minutes in front of the test. That is a terrible idea, but you learn. After that sleep runs out, presumably everyone is up to date and can talk to each other. So the most basic test is: is my network working, can everyone ping everyone?
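
A common alternative to a fixed two-minute sleep is to poll for the condition the test actually cares about and stop as soon as it holds. The helper below is a generic sketch of that idea, not the project's test framework; WaitFor and ErrTimeout are hypothetical names, and the condition in main is a stand-in for something like "every node can ping every other node".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrTimeout is returned when the condition never became true in time.
var ErrTimeout = errors.New("condition not met before timeout")

// WaitFor polls cond every interval until it returns true or the timeout
// elapses. It returns as soon as cond holds, so fast runs stay fast.
func WaitFor(cond func() bool, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return ErrTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	// Stand-in condition: pretend the network converges after 50ms.
	synced := func() bool { return time.Since(start) > 50*time.Millisecond }

	err := WaitFor(synced, 10*time.Millisecond, 2*time.Second)
	fmt.Println("converged:", err == nil)
}
```

Compared with a fixed sleep, the test finishes early when the system is healthy and fails with a clear timeout when it is not.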
So we tried that, and of course it didn't work, because of all the errors we actually had in the code. I ran some initial statistics on my laptop, and out of 100 test runs we had 70 failures. That's pretty bad, but at this point we had an actual goal that we could measure, so we could improve on it. Quite rapidly we figured out that the two big blocks of problems were associated with two things. One was being able to reliably send updates to all of our clients. That was the deadlock problem: the update channels were just locking up. So we made a massive rewrite PR that redid the whole logic and made sure we were always able to send an update to a client as long as it was connected. The other problem was the state machine, which was very broken. We figured out that we could use a global state, and we tried to simplify first and optimize later. So basically a global state that lets us determine whether everyone is up to date: we know when you last received a successful update, and if you are behind, we re-issue one to make sure you know everything.
However, changing the Rambo culture takes a little bit of time. We kept merging stuff without proper integration testing, but as Kristoffer said, we didn't have the incentive or the pressure, because the thing really worked. Joining one node in your home lab is not the same as joining 100 nodes within one second, so if you were slowly adding machines to your tailnet, things were working. However, the project was gaining popularity, and we were getting more and more contributions in external PRs. This was around August or September 2021, something like that. So it was great: we were getting to a point where we could improve Headscale with confidence.
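
The "global state" idea mentioned above, knowing when each node last received a successful update and re-issuing one if it is behind, can be sketched with a version counter. This is an illustration under assumptions: the talk speaks of when the last update happened, and the sketch swaps timestamps for a counter to stay deterministic. The State type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// State tracks the current version of the network map and the version each
// node last received successfully.
type State struct {
	mu      sync.Mutex
	version uint64
	seen    map[string]uint64
}

func NewState() *State {
	return &State{seen: make(map[string]uint64)}
}

// Join registers a node. The network map changes, so the version is bumped
// and every node, including the new one, is now behind.
func (s *State) Join(node string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.version++
	s.seen[node] = 0
}

// Version returns the version a freshly built update would carry.
func (s *State) Version() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.version
}

// MarkUpdated records that node successfully received the given version.
func (s *State) MarkUpdated(node string, version uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[node]; ok && version > s.seen[node] {
		s.seen[node] = version
	}
}

// Stale lists the nodes that still need an update re-issued.
func (s *State) Stale() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var out []string
	for node, v := range s.seen {
		if v < s.version {
			out = append(out, node)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	s := NewState()
	s.Join("a")
	s.MarkUpdated("a", s.Version())
	s.Join("b") // "a" is now behind again, "b" has never been updated
	fmt.Println("needs update:", s.Stale())
}
```

A periodic loop that resends to everything in Stale is simple and hard to get wrong, which fits the "simplify first, optimize later" approach described in the talk.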
We could improve from a technical point of view, because, given that the project started as a reverse engineering effort, we had a lot of stuff that was not that great; we could maintain compatibility with the third-party external clients that we are using; and we could improve from a community point of view. I'm going to talk a little bit about this now. For starters, from a technical point of view, we could do massive refactoring within the project, or implement the second version of the Headscale protocol, without breaking the existing users. The only thing that breaks is probably the mental health of the reviewer who has to deal with 3,000 lines of code. But that's a different thing. Then, as I said, we have this minor detail that we completely depend on a third-party client, because we are using exactly the same official clients as Tailscale. We do have a very good working relationship with them, and every time they change something, we get a heads-up. Still, within our integration tests we keep quite a bit of commitment to supporting these clients. We target the head of the repository, we target the unstable releases, and we target nine minor releases of the client, to make sure that nothing breaks from their side or from ours, because, I mean, it can happen. And then I think integration testing also helps the community, because we as maintainers can better trust those random PRs from random unknown people that appear on GitHub, which is not a given. And in theory, or that's what one would think, by having integration tests, contributors, those external people that we don't know, should also feel more confident when submitting a PR. But that's the theory. It does still come with some challenges. One of the things that we see occasionally is that a PR comes in and it doesn't have a test, and then we ask nicely if they can add a test, and then the contributor disappears.
So we're trying to improve on this and always get those PRs in. What we try to do is, if they truly disappear, we pick it up ourselves if it's a feature that we really want and we have the bandwidth to do so. Sometimes we try to reach out and sit down with them, help them write a test, and onboard them into this kind of thing. For example, there is an SSH feature, and for the tests for that, I knew the developer and he was also in Norway, so once when I was dropping by Oslo, we sat down for an afternoon and paired on them. That's not available for everyone, sadly. But we always try to get this test message out there in a way. There are a couple of other challenges as well, and one is that adding tests comes with a learning curve. You need to know go test, you need to understand our test framework, you need to have Docker and all of this kind of thing, whereas not writing tests is a lot less code. And it's hard to convince people how awesome tests actually are, that they're not really a chore, and that you really, really thank yourself later for doing them. So one of the things we're doing to lower this barrier, since we're so heavily dependent on tests for compatibility and everything, is making our own test framework, v2. The old one depended on a lot of repeated and copied code, there was a really high bar for adding new tests, and it was really hard to update and change. It did depend on time.Sleep, which haunted me so many times, and it couldn't really be run in parallel for many of the previous reasons. The documentation wasn't really good either: I knew how to use it, Juan knew how to use it, and then that was about it. A couple of other people figured it out.
So what we're doing is abstracting things away a bit. We have this concept called a control server, which is essentially what Headscale is, and what the Tailscale product, the software as a service, is. It's implemented as Headscale-in-container, and it exposes convenience functions that now have godoc support and all of these things, to make it easier for developers to actually use it. Then we have the Tailscale client, which is implemented as Tailscale-in-container, and it has the same type of convenience functions. What this allows us to do is this: previously, the two files on the right here, sorry, on the left, were two different versions of the setup code for the tests, because when you needed something that was slightly special, you had to copy the whole thing and make a new file to be able to write a test case like you see on the other side here. Now, after abstracting that away and making it a lot more configurable, we allow people to write more or less regular test cases. You just set up what we call a scenario, which is a Headscale with a given number of Tailscale nodes, and then you let them ping each other or something like this. So what do we test right now? We kept all of the original tests, so basically we make all nodes join the network and we make them ping each other to verify that we have a fully functioning network, both by IP and by MagicDNS. MagicDNS is Tailscale's DNS system. We test Taildrop, which is a file sharing feature, a bit like Apple's AirDrop, and we send files from every node to every node to make sure that it works. We test all our registration flows, because we have broken them a couple of times, so it was better to do it that way: pre-auth keys, the web plus command-line flow, and even OpenID we currently have tests for.
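To make the scenario idea concrete, here is a minimal sketch of an all-to-all ping check over an abstract client. Everything in it (Client, Scenario, fakeClient, newFullMesh, PingAllToAll) is hypothetical and made up for illustration; the real framework drives actual containers, while this sketch uses in-memory fakes so it can run on its own.

```go
package main

import "fmt"

// Client is a hypothetical stand-in for a Tailscale node in a container.
type Client interface {
	Hostname() string
	Ping(target string) error
}

// fakeClient simulates a node that can reach a fixed set of peers.
type fakeClient struct {
	name      string
	reachable map[string]bool
}

func (c fakeClient) Hostname() string { return c.name }

func (c fakeClient) Ping(target string) error {
	if !c.reachable[target] {
		return fmt.Errorf("%s cannot reach %s", c.name, target)
	}
	return nil
}

// Scenario is a hypothetical bundle of clients joined to one control server.
type Scenario struct {
	Clients []Client
}

// PingAllToAll pings every other client from every client and
// returns the failures it saw.
func (s Scenario) PingAllToAll() []error {
	var failures []error
	for _, from := range s.Clients {
		for _, to := range s.Clients {
			if from.Hostname() == to.Hostname() {
				continue
			}
			if err := from.Ping(to.Hostname()); err != nil {
				failures = append(failures, err)
			}
		}
	}
	return failures
}

// newFullMesh builds a scenario in which every node can reach every other.
func newFullMesh(names ...string) Scenario {
	all := make(map[string]bool, len(names))
	for _, n := range names {
		all[n] = true
	}
	var s Scenario
	for _, n := range names {
		s.Clients = append(s.Clients, fakeClient{name: n, reachable: all})
	}
	return s
}

func main() {
	s := newFullMesh("node-a", "node-b", "node-c")
	fmt.Println("failed pings:", len(s.PingAllToAll()))
}
```

Because the test body only talks to the Client interface, the same check could in principle run against different control servers, which is the point of the abstraction described above.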
We try to isolate all of our network from the internet and test with our own embedded relay server, because Tailscale depends on some relay servers that we also embed in our binary. We have preliminary tests for the SSH features that we support, which are authenticated by Headscale so you can SSH into your machine; we test SSH all-to-all, and we try to do negative tests. We also test our CLI, because if you change something, you don't want to sit and type in every single command manually in a structured way, because that's just painful. In the future, we also want to improve the granular access control that Tailscale offers. Currently this is a very good example of where we have added a lot of unit tests and they all pass, but they're all wrong. Well, they're mostly wrong, so we have to redo most of this as integration tests first and then backfill the unit tests once we know how the implementation is actually supposed to work. One of the things we've been dabbling with, especially for this ACL feature, is to use the control server abstraction we showed before and use the Tailscale product to test our tests, because if they pass on the public server, we know they're correct, and then we can use them to verify our thing. And then maybe run Tailscale in a VM instead of Docker to test it properly, but that's more of a benefit for Tailscale than it is for us. So if you're just here waiting for the next talk, a little bit of a TL;DR is that we cannot overstate how important integration testing has been for the development of Headscale, given that we depend on an external party. I reckon the name is also excellent; Ponytailscale would have been worse.
I mean, with integration testing, we are able to maintain this compatibility with the client, and we are able to take contributions from third-party developers; otherwise it's a little bit more difficult to develop this trust across the internet, right? And even though the tests are not perfect and we still have to migrate unit tests towards integration tests, I think this is one of the keys to the success of the project. Some extra things: Tailscale is hosting a happy hour at a BrewDog by the station. This QR code takes you to a sign-up form. I'll quickly switch back to this slide at the end, but I have a question slide as well, so we'll go through this. Basically this is how to reach us: GitHub, we have a Discord community, and we're very happy to talk to anyone who wants to talk to us here at FOSDEM, so please feel free to reach out. I'll leave it at this one if anyone has any questions. We have some minutes, I think. Thank you. While I have your attention, we have a gopher that lost their wallet. Look to the left, look to the right, front and back; if you see a wallet that is not yours, please come right to the front. It will help this person a lot. Thank you. After you've looked for the wallet, if you have a question, raise your hand and I'll try to come with this microphone. How come the Tailscale guys are not mad at you, and not only are not mad at you, but they hired you afterwards? I mean, part of it... is it working? Yeah. No? Okay. I think part of it is that they are quite chill. I mean, they could have taken this way worse than they have, and I don't think we are competition. We are focused on self-hosters, on home labs, perhaps a little bit on small companies. And what usually happens is that people who use Headscale at home then go to their companies and talk about Tailscale, and when you're in a company, you actually prefer to pay for the service. So it's like a way of...
It's like a way of selling Tailscale, sorry, Headscale also. Okay, thank you very much. Last round of applause. If you have any questions, you can catch them in the hallway track. |
Our Mad Journey of Building a Vector Database in Go
Building a Database in Go |
It's four o'clock, so let's move on to our previous, sorry, our next talk. I have been doing some mad things in Go, but building a database is something I honestly have strong respect for. So next up is Etienne, who is going to tell us everything about crazy things in Go. Thank you, thank you. Welcome to our mad journey of building a database in Go. It's pretty mad to build a database at all, and it may be even madder to build a database in Go when most are built in C or C++. Let me start over in case you didn't hear it. Hi, my name is Etienne, welcome to our mad journey of building a vector database in Go. Building a database at all could already be pretty mad; doing it in Go, when most are built in C or C++, could be even madder, or even more exciting. We definitely encountered a couple of unique problems that led us to creative solutions. There are lots of shout-outs in there, and also a couple of wish-list items, since Go 1.20 was just released, and of course the occasional madness. So let's get one question out of the way right away: why does the world even need yet another database? There are so many out there already. But you've probably seen this thing called ChatGPT, because that was pretty much everywhere and it's kind of hard to hide from it. ChatGPT is a large language model, and it's really good at putting text together that sounds really sophisticated, sounds nice, and is sometimes completely wrong. In this case we're asking it: is it mad to write a database in Go? I might disagree with the answer, but either way, we're now in a situation where on the one hand we have these machine learning models that can do all the cool stuff, interactively and on the fly, and on the other side we have traditional databases, and those traditional databases have the facts, because that's kind of what databases are for, right? So wouldn't it be cool if we could somehow combine those two?
So for example, on the query side, if I ask Wikipedia "why can airplanes fly?", the passage that I want, the one with the answer in it, is titled "The physics of flight". That is difficult for a traditional search engine, because if you look at keyword overlap, there's almost none. But a vector search engine can use machine learning models that can tell you these two things mean the same, and searching through that at scale is a big problem. Then there's the ChatGPT side, where you don't just want to search through it, but maybe you also want to say: take those results, summarize them, and also translate them to German. So not just return exactly what's in the database, but do something with it and generate more data from it. And that is exactly where Weaviate comes in. Weaviate is a vector search engine, which helps us solve this kind of searching by meaning instead of by keywords, without losing what we've done in 20-plus years of search engine research. And most recently you can also interact with models such as ChatGPT, GPT-3, and of course the open source versions of them. So Weaviate is written in Go. Is that a good idea? Is that a bad idea? Or have we just gone plain mad? Well, we're not alone, that's good. You probably recognize these; they're all bigger brands at the moment than Weaviate, but Weaviate is growing fast. And some of those vendors have really great blog posts where you see some of the optimization topics and some of the crazy stuff that they have to do. So if you've contributed to one of those, some of the things I'm going to say might sound familiar; if not, then buckle up, it's going to get mad. So the first stop on our mad journey is memory allocation, and that also brings us to our friend the garbage collector.
For any high-performance Go application, sooner or later you're going to talk about memory allocations, and I definitely consider a database a high-performance application, or at least I consider Weaviate one. If you think of what databases do, in essence you have something on disk and you want to serve it to the user; that's one of the most important user journeys in a database. Here this is represented by just a number. I went for a uint32, so that's just four bytes on disk, and you can see these four bytes here. If you parse them into Go, they would have the value 16 in that uint32. This is, very much simplified, something that a database needs to do, and it needs to do it over and over again. The standard library gives us the encoding/binary package, and there we have this binary.Read function, which I think looks really cool. To me it looks like idiomatic Go, because it takes the io.Reader interface, everyone's favorite interface, and you can put all kinds of stuff in there. If you run this code and there's no error, you get exactly what you want: you turn those four bytes that were somewhere on disk into our in-memory representation of that uint32. So is it a good idea to do it exactly like that? Well, if you do it once or maybe twice, it could be a good idea. If you do it a billion times, this is what happens. For those of you who are new to CPU profiles in Go: this is madness. This is pretty bad. First of all, you see it in the center: parsing those 1 billion numbers took 26 seconds, and 26 seconds is not the kind of time that we ever have in a database. But worse than that, if you look at that profile, we have stuff like runtime.mallocgc, runtime.memmove, runtime.madvise. All these things are related to memory allocations or to garbage collection. What they're not related to is parsing data, which is what we wanted to do.
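For readers following along without the slides, the binary.Read approach described above might look roughly like the sketch below. The byte values and the choice of little-endian byte order are assumptions made for the example, and readUint32 is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// readUint32 parses four bytes into a uint32 using binary.Read,
// which takes an io.Reader, a byte order, and a pointer to the target.
func readUint32(raw []byte) (uint32, error) {
	var num uint32
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &num)
	return num, err
}

func main() {
	// Four bytes as they might sit on disk (little-endian, assumed).
	onDisk := []byte{0x10, 0x00, 0x00, 0x00}

	num, err := readUint32(onDisk)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(num) // prints 16
}
```

The call is convenient, but it involves a reader value and a pointer passed through interfaces, which is where the allocation cost discussed next comes from.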
So how much of those 26 seconds did we spend on what we wanted to do? Don't know. It doesn't even show up in the profile. To understand why that is the case, we need to quickly talk about the stack and the heap. You can think of the stack as basically your function stack: you call one function that calls another function, and then at some point you go back through the stack. This memory is very short-lived, and it is cheap and fast to allocate. Why is it cheap? Because you know exactly the lifetime of your variables, so you don't even need to involve the garbage collector. So, no garbage collector: cheap and fast. On the other side you have the heap, and the heap is this long-lived kind of memory. That's expensive and slow to allocate, and also to deallocate. Why? Because it involves the garbage collector. So if the stack is so much cheaper, then we can just always allocate on the stack, right? Warning: this is not real Go, please do not do this. This is a fictional example of allocating a buffer of size 8 and then saying, "please put this on the stack". That is not how it works, and most of you would probably say it's pretty good that it doesn't work that way, because why would you want to deal with that? But for me, trying to build a database in Go, sometimes something like this might be good, or maybe not. So how does it work? Go does something called escape analysis. If you compile your code with -gcflags=-m, then Go annotates your code and tells you what's happening there. Here you can see in the second line that the num variable we used was moved to the heap, and in the next line you see that the bytes.Reader, which represents our io.Reader, escaped to the heap. So twice we see that something went to the heap.
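To see escape analysis on something smaller than the talk's example, here is a minimal sketch with two hypothetical functions: one whose data never leaves the call, and one that returns a pointer to a local variable. Building it with go build -gcflags=-m prints the compiler's decisions; the exact wording and the set of messages vary between Go versions, so none are reproduced here beyond the well-known "moved to heap" note in the comment.

```go
package main

import "fmt"

// sumOnStack keeps its buffer local. Nothing outlives the call,
// so the compiler can typically keep buf on the stack.
func sumOnStack() int {
	var buf [8]int
	for i := range buf {
		buf[i] = i
	}
	total := 0
	for _, v := range buf {
		total += v
	}
	return total
}

// newCounter returns a pointer to a local variable. The variable must
// outlive the call, so escape analysis reports it as moved to the heap.
func newCounter() *int {
	n := 42
	return &n
}

func main() {
	// Build with: go build -gcflags=-m .
	fmt.Println(sumOnStack(), *newCounter()) // prints 28 42
}
```

Passing pointers and interface values into other functions, as binary.Read requires, is one of the patterns that can make a value escape in the same way.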
We don't exactly know what happened yet, but at least there's proof that we have this kind of allocation problem. So what can we do? Well, we can simplify a bit. It turns out that the encoding/binary package also has another method that looks like this, which is just called Uint32, on binary.LittleEndian, and it does kind of the same thing. You put in the buffer on the one side, so no reader this time, just the raw buffer with the position offset, and on the other side you get the number out. And the crazy thing is, this one line needs no memory allocations. So if we do that again, our one billion numbers that took 26 seconds before now take 600 milliseconds, and now we're starting to get into a range that is acceptable for a database. More importantly, the profile is so much simpler now. There's basically just this one function there, and that is what we wanted to do. Admittedly, we're not doing much other than parsing the data at the moment, but at least we got rid of all the noise, and you can see the speedup. Okay, so quickly to recap. If we say a database is nothing but reading data and parsing it to serve it to the user, and we do that over and over again, then we need to take care of memory allocations. The fix in this case was super simple: we changed two lines of code and went from 26 seconds to 600 milliseconds. But why we had to do that wasn't very intuitive; it wasn't very obvious. In fact, I haven't even told you yet why this binary.Read call escaped to the heap. In this case it's because we passed in a pointer and we passed in an interface, and that's a hint that something might escape to the heap. So what I would wish for is this: yes, this is not a topic that you need every day you write Go, but if you do need it, it would be cool if there was better education.
Okay, so second stop: delay decoding. This is the idea that we wouldn't want to do the same work twice. We're sticking with our example of serving data from disk, but the number example was a bit too simple, so let's make it slightly more complex. We have this nested array here, basically a slice of slices of uint64, and that's representative of a more complex object in your database. Of course, in reality you'd have string props and other kinds of things, but this is just to show that there's more going on than a single number. Let's say we have 80 million of them, so 10 million in the outer slice and then eight elements in each inner slice, and our task is just to sum those up. So these are 80 million numbers, and we want to know what their sum is. That is actually a realistic database task for an OLAP kind of database. But we need to somehow represent that data on disk, and we're looking at two ways to do this. The first one is a JSON representation, the second one is a binary encoding, and then there'll be more. JSON is basically just here for completeness' sake. We can rule it out immediately: when you're building a database, you're probably not using JSON to store stuff on disk, unless it's a JSON database. Why? Because it's space-inefficient. If you want to represent those numbers on disk, JSON basically uses strings for them, and then you have all these control characters, like your curly braces and your quotes and your colons, and everything takes up space. In our fictional example that would take up 1.6 gigabytes, and you'll see soon that we can do that more efficiently. But it's also slow, and part of why it's slow is again these memory allocations, but the whole parsing also just takes time. In our example it took 14 seconds to sum up those 80 million numbers, and as I said before, you just don't have double-digit seconds in a database.
So we can do something that's a bit smarter, which is called length encoding. We're encoding this as binary, and we're spending, in this case, one byte, so basically a uint8, and we're using that as a length indicator. When we're reading this from disk, that indicator tells us what's coming up. In this case it says we have eight elements coming up, and we know that our elements in this example are uint32, so that's four bytes each. So the next 32 bytes that we're reading are going to be our eight inner elements, and then we just continue: we read the next length indicator, and this way we can encode everything in one contiguous thing. Then of course we have to decode it somehow, and we can do that because we've learned from our previous example, right? So we're not going to use binary.Read; we're doing this in an allocation-free way, which you can see in the length line. Our goal is to take that data and put it into our nested Go slice of slices of uint64. In the code here you see we're reading the length, then we're increasing our offset so we know where to read from, and then we're repeating this for the inner slice, which is just hinted at here by the decodeInner function. So what happens when we do this?
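The length-encoding scheme described above can be sketched as an encode and decode pair. This is an illustration under stated assumptions, not the speaker's code: it uses a one-byte length prefix per inner slice and 8-byte little-endian uint64 elements to match the [][]uint64 target (with 4-byte elements only the element width changes), and the encode and decode names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encode writes each inner slice as a one-byte length followed by
// its elements as 8-byte little-endian values, all in one buffer.
func encode(data [][]uint64) ([]byte, error) {
	var out []byte
	var tmp [8]byte
	for _, inner := range data {
		if len(inner) > 255 {
			return nil, errors.New("inner slice too long for a one-byte length")
		}
		out = append(out, uint8(len(inner)))
		for _, v := range inner {
			binary.LittleEndian.PutUint64(tmp[:], v)
			out = append(out, tmp[:]...)
		}
	}
	return out, nil
}

// decode reverses encode: read a length byte, advance the offset,
// then read that many 8-byte values into a freshly allocated slice.
func decode(buf []byte) ([][]uint64, error) {
	var out [][]uint64
	offset := 0
	for offset < len(buf) {
		n := int(buf[offset])
		offset++
		if offset+n*8 > len(buf) {
			return nil, errors.New("truncated input")
		}
		inner := make([]uint64, n)
		for i := range inner {
			inner[i] = binary.LittleEndian.Uint64(buf[offset : offset+8])
			offset += 8
		}
		out = append(out, inner)
	}
	return out, nil
}

func main() {
	buf, err := encode([][]uint64{{1, 2, 3}, {4, 5}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	decoded, err := decode(buf)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(len(buf), decoded) // prints 42 [[1 2 3] [4 5]]
}
```

Note that decode allocates one slice per inner array plus the outer slice, which matters once the input holds millions of entries.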
First of all, the good news: 660 megabytes. That's way less than our 1.6 gigabytes before, so just by using a more space-efficient way to represent data we've done exactly that: we've reduced our size. It's also much, much faster: we were at 14 seconds before, and now it's down to 260 milliseconds. But this is our mad journey of building a database, so we're not done here yet, because there's some hidden madness. The hidden madness is that we actually spend 250 milliseconds decoding, while we spend 10 milliseconds summing up those 80 million numbers. So again we're in that situation where we're spending our time on something that we never really set out to do. So where does that come from? The first problem is that what we set out to do was flawed from the get-go, because we said we want to decode. We were thinking in the same way as we were with JSON: we said that we want to decode this entire thing into this Go data structure. But that means we need to allocate this massive slice again, and for each inner slice we also need to allocate again. So we're allocating over and over again, where our task was not to allocate; our task was to sum up numbers. So we can simplify this a bit and just not decode it. While we're looping over that data anyway, instead of storing it in an array, we can just do with it what we planned to do, and in this case that is summing up the data. Getting rid of that decoding step makes this way faster, so now we're at 46 milliseconds. Of course, our footprint of the data on disk hasn't changed, because it's the same data that we're reading; we're just reading it in a more efficient way. But we don't have to allocate slices, and also, because we don't
have these nested slices, we don't have slices that hold pointers to other slices, so we get better memory locality. Now we're at 46 milliseconds, and that is cool: 46 milliseconds is a time frame that can be acceptable for a database. Okay, so a quick recap. We immediately ruled out JSON because it just wasn't space-efficient, and we knew that we needed something more space-efficient and also way faster. Binary encoding already made it much faster, which is great, but if we decode everything upfront, we still lose a lot of time. In these kinds of high-performance situations it can be worth it to delay the decoding as late as possible, until you really need it, or to not do it at all, or to do it in small parts where we need it. No wish list here, but an honorable mention. They've actually removed it from the release notes because it's so experimental, but Go 1.20 has support for memory arenas. The idea of memory arenas is that you can bypass the garbage collector and manually free that data. If you have something that you know has the same life cycle, you can say, okay, put it in the arena, and in the end free the entire arena, which bypasses the garbage collector. So that could also be a solution in this case, if it ever makes it in; right now it's super experimental, and they basically tell you "we might just remove it, so don't use it". The third stop is something that, when I first heard of it, almost sounded too good to be true: something called SIMD. We'll get to what that is in a second, but first, a question to the audience: who here remembers this thing? Raise your hands. Okay, cool, so you're just as old as I am. This is the Intel Pentium II processor. It came out in the late 90s, I think 1997, and was sold for a couple of years. Back then I did not build databases, definitely not in Go, because that also didn't exist yet, but
what I would do was try to play 3D video games, and I would urge my parents to get one of those new computers with an Intel Pentium II processor. One of the arguments that I could have used in that discussion was, "hey, it comes with MMX technology". Of course I had no idea what that was, and it probably took me ten or so more years to find out, but MMX is the first in a long list of SIMD instruction sets. I haven't explained what SIMD is yet, but I will in a second. Some of those, especially the ones in the top line, are not really used anymore these days, but the bottom line, like AVX2 and AVX-512, you may have heard of. In fact, many open source projects just slap that label in the README, like "has AVX2 optimizations", and that signals to you: we care about speed, because it's low-level optimized. Weaviate does the exact same thing, by the way. To understand how we could make use of that, I quickly need to talk about vector embeddings, because I said before that Weaviate doesn't search through data by keywords but rather by its meaning, and it uses vector embeddings as a tool for that. An embedding is basically just a long list of numbers, in this case floats. A machine learning model comes in, does something with your input, and you get this vector out. If you do this for all the objects, then you can compare your vectors. So you can do a vector similarity comparison, and that tells you whether two things are close to one another or not, for example the query and the object that we had before. Without any SIMD, we can use something called the dot product. The dot product is a simple calculation where you multiply each element of the first vector with the corresponding element of the second vector, and then you just sum up all of those products. We can think of this multiplication and summing as two instructions, so if we
look at it (first shout-out here to the Compiler Explorer, which is a super cool tool to see what your Go code compiles to), we can see that this indeed turns into two instructions. This is a bit of a lie, because there's more stuff going on since it's in a loop and so on, but let's just pretend that we indeed have these two instructions, one to multiply and one to add. So how could we possibly optimize this even further if we're already at such a low level? Well, we can, because this is our mad journey, so all we have to do is introduce some madness. What we're doing now is a practice called unrolling. The idea is that instead of looping over one element at a time, we're now looping over eight elements at a time. But we've gained nothing yet: we're still doing the same amount of work. We're doing 16 instructions in a single loop iteration, and we're just doing fewer iterations. So at this point, nothing gained. Why would we do that, then? Well, here comes the part where I thought it was too good to be true: what if we could do those 16 operations for the cost of just two instructions? Sounds crazy, right? Well, no, because SIMD, and I'm finally revealing what the acronym stands for, stands for single instruction, multiple data, and that is exactly what we're doing here. We want to do the same thing over and over again, which is multiplications and then additions, and this is exactly what these SIMD instructions provide. In this case, we can multiply eight floats with eight other floats, and then we can add them up. So is all perfect here? Maybe not, because there's a catch; of course, it's our mad journey. How do you tell Go to use these AVX2 instructions? You don't. You write assembly code, because Go has no way to do that directly. The good part is that assembly code integrates really nicely into Go, and in the standard library it's used over and over again, so it's kind of a standard practice. And there is tooling here, so shout-out to Avo, a really cool tool that helps
you: you're still writing assembly with Avo, but you're writing it in Go, and then it generates the assembly. So you still need to know what you're doing, but it protects you a bit, and it definitely helped us a lot. So, SIMD recap: using AVX instructions or other SIMD instructions, you can basically trick your CPU into doing more work for free, but you also need to trick Go into using assembly. With tooling such as Avo it gets better, but it would be even nicer if the language had some sort of support for it. You may be saying now, okay, this is this mad guy on stage who wants to build a database, but no one else needs that. But we have this issue here that was opened recently and unfortunately also closed recently, because no consensus could be reached. It comes up again and again that Go users are saying, "hey, we want something in the language such as intrinsics". Intrinsics are the idea of having high-level language instructions for these AVX or SIMD instructions, and C and C++ have that, for example, as one way to do it. Maybe you're wondering: okay, if you have such a performance hot path, why don't you just write that in C and use cgo, or write it in Rust or something like that? It sounds good in theory, but the problem is that the overhead of calling into C or C++ is so high that you actually have to outsource quite a bit of your code for that to pay off. So if you do that, you end up writing more and more in that language, and then you're not writing Go anymore. So unfortunately that's not always a great idea, or it can be in some ways, but not always. So, demo time. This was going to be a live demo, and maybe it still is. I prepared this running nicely in a Docker container, and then my Docker network just broke everything and it didn't work, but I just rebuilt it without Docker and I think it might work. If not, I have screenshots that serve as a
backup. So, example query here. I'm a big wine nerd, so what I did is I put wine reviews into Weaviate, and I want to search them now. One way to show you that you don't need a keyword match but can search by meaning is, for example, if I go for an affordable Italian wine. Let's see if the internet connection works. It does. So what we got back is basically this wine review that I wrote about a Barolo that I recently drank, and you can see it doesn't say Italy anywhere, it doesn't say affordable. What it says is, like, "without breaking the bank". So this is a vector search that basically happened in the background. We can take this one step further by using the generative side, so this is basically the ChatGPT part. We can now ask our database, based on the review, which is what I wrote: when is this wine going to be ready to drink? So let's see. You saw before, that was the failed query when the internet didn't work. Now it's actually working, so that's nice. And here in this case, this is using OpenAI, but you can plug in other tools, you can plug in open source versions of it. This is using OpenAI because it's nice to have it hosted as a service, so I don't have to run the machine learning model on my laptop. Then you can see it tells you the wine is not ready to drink yet, it will need at least five more years, which is sort of a good summary of this. And then you can see another wine is ready to drink right now, it's in the perfect drinking window. So for the final demo, let's combine those two. Let's do a semantic search to identify something and then do an AI generation, basically. So in this case we're saying: find me an aged classic Riesling, best wine in the world, Riesling, and based on the review, would you consider this wine to be a fruit bomb? So let's have sort of an opinion from the machine learning model in it. And here we got one of my favorite wines, and the model says: no, I would not consider this a
fruit bomb. While it does have some fruity notes, it is balanced by the minerality and acidity, which keeps it from being overly sweet or fruity. And if you read the text, this is nowhere in there, so it's kind of cool that the model was able to do this. Okay, so let's go back now, that was the demo time. By the way, I have a GitHub repo with this example, so you can run it yourself and try it out yourself. So this was our mad journey, and are we mad at Go? Are we mad to do this? Well, I would pretty much say no, because yes, there were a couple of parts where we had to get really creative and had to do some, yeah, rather unique stuff, but that was also basically the highlight reel of building a database. And all the other parts, like, I didn't even show the parts that went great, like concurrency handling and the powerful standard library, and of course all of you, basically, representing the Gopher community, which is super helpful. And yeah, this was my way to basically give back to all of you. So if you ever want to build a database or run into other kinds of high-performance problems, then maybe some of those |
Building a basic event-driven application in Go in 20 minutes
Introduction to Watermill |
Okay, this speaker claims that in 20 minutes he is going to build an event-driven application. Well, to be kind, I gave him 25 minutes. So start your countdown clocks. Hello. So my name is Robert, and yes, I would like to show you today that we can build an event-driven application in Go, and that it can be as simple as building a simple HTTP server. And I actually decided to put the bar a bit higher. I think that I can do it within 15 minutes. All right, at the beginning, a couple of words about myself. During the day, I work at a company named SlashID, where I am a principal engineer, and we are creating an identity and user onboarding solution that is a bit more frictionless than the solutions available now on the market. And during the night, I'm blogging at the threedots.tech blog, where we write blog posts that cover how to create Go applications that are business applications, but are also maintainable in the long term. Maybe some of you had a chance to read at least one article there? There are some people. Nice. I will have something special for you later. You can find me on Twitter, GitHub, Mastodon, and there's also my email if you would like to write to me and ask about something. But what's most important for today: I'm the author of the Watermill library. And how did everything start with Watermill? I think that this is pretty important context. So a couple of years ago, I worked at a company where we were creating a product that was not doing anything super unusual. The idea was that each user was able to add some content, and we were storing it in MySQL, plus we wanted to have some more advanced search, plus the ability to create a feed for other users with some magic machine learning models that were doing personalization. And usually, if you are building such a system in a synchronous way, there's one problem.
So this part may sometimes be slow, because Elasticsearch is under high load, or the magic machine learning model decided that this day it will not work. Not nice, but it happens at work, unfortunately. Or even worse, for example, some part is not working at all. And it's not the best user experience if it's working slowly. So for example, you can imagine that you're adding some tweet and you're waiting for 10 seconds, because Elasticsearch needs to index something or the machine learning model is working slowly, or you are not even able to add this content. And it doesn't make sense, because everything that is done in this other part of the diagram could be done asynchronously. It's not a problem if, for example, some content that was added cannot be searched for one minute after it's added. It's much better than not allowing people to add anything. So, by the book, the default solution for such problems is using some kind of Pub/Sub and doing it asynchronously. In this case, we decided to use Kafka, because it's scalable, it's nice. But as usual with concepts that you read about in books or hear about at conferences, it's not that simple in practice. And it was also the case here. The first problem was that a big part of the team hadn't actually worked with asynchronous architectures earlier. That kind of makes sense, because if you're starting to learn to code, you're not starting with building some event-driven application, you're rather creating some REST API or website. So it makes sense that it was a high entry barrier for people who hadn't used that. And it was not the only problem, because event-driven architecture has a lot of concepts that you need to know, like consumer groups, partitioning, message ordering, at-least-once delivery, acks, negative acks, poison queues. And with all of that, you need to be sure that you didn't miss an event.
And it's pretty important in some domains. In some cases, okay, it's fine, you're missing some event and okay. But for example, I used to work in the financial domain, and losing one event may, for example, mean that somebody will not be paid out. Not nice. In general, I believe that as engineers, we should be responsible, because sometimes the code that we are building has a really big impact on real life. And after thinking for a while, I actually started to wonder: is there maybe something that I can do to make building these kinds of applications in Go simpler? And here we are. This is how Watermill was created. So far, we have more than 5,000 stars on GitHub. We have more than 50 contributors across multiple Watermill repositories. We are supporting 12 different Pub/Sub implementations, like Kafka, Google Cloud Pub/Sub, NATS JetStream, RabbitMQ, but we also have some more unusual implementations, like MySQL, for example if you don't have infrastructure for some real Pub/Sub, or if you would like to avoid the two-phase commit problem. If you are doing some more fun projects, you can just use the Go channel implementation or BoltDB, for example. But there is one thing more important than that: Watermill has a logo, and it is a logo with Go for Vomiting to Gobernati's logo, not as Muai. So let's go back to this HTTP server example. You can think about Watermill as something that makes your life simpler, like the standard library does for HTTP. So, for example, if you are implementing an HTTP server, you don't care about TLS, network layers, connection pooling and all this stuff, you are just implementing the logic in most cases. Sometimes, of course, you may have some specific scenarios where you care about that, but in most cases, you should just implement your handlers and not care about everything around.
And as some of you already know, I once wrote an article saying that I think frameworks probably don't work best in Go, and that's why Watermill is actually a library. And it has two pretty good upsides. The first one is that if you already have some system and you would like to migrate to Watermill, it's kind of simple, because Watermill doesn't add anything super custom and it can be integrated with any existing system. And vice versa: say, for example, that for some reason you decide that you don't like Watermill (but you will not). You can migrate from Watermill to some different library. So this is the good thing. And I think what's pretty important is how everything is done, because, okay, in theory it may sound nice that it's helping, but how is Watermill built? At the heart of Watermill, I would say that you can see in multiple places something that is named the UNIX philosophy. And it's kind of an old philosophy, because it's from 1978. It tells us to write programs that do one thing and do it well, write programs to work together, and write programs to handle, in our case, messages, because that is a universal interface. And a small question now. Do you know who that is? So it's Ken Thompson. He's the author of this philosophy. And what's also interesting, he's one of the authors of the Go programming language. Actually it makes sense, because if you look at Go, for example at io.Reader or io.Writer, this is pretty nicely visible there. And I know that a lot of people didn't know about the UNIX philosophy. And sometimes, when I have too much time to think, I have the impression that sometimes we forget about some good old ideas and we try to reinvent the wheel, even if some problems were already solved. And you know, it's maybe something like in the Dark Ages, when there were some nice old ideas, but they were a bit forgotten. And OK, maybe I'm thinking too much.
Let's go back to Watermill. So there are a couple of important types in Watermill. The first one is the message. If you compare it to an HTTP server, it's something similar to an HTTP request. In a message we have a UUID, which is pretty useful for debugging. We have metadata, which is something like request headers. Plus the payload, so this is the place where you are storing your event, for example. The next two important parts of Watermill are the publisher and the subscriber. With a publisher, you can publish those messages. And with a subscriber, you're right, you can subscribe to those messages from the provided topic and receive them through a channel. You usually are not using these interfaces directly, because they are used somewhere internally in Watermill. But for example, if you would like to add a new implementation of a Pub/Sub, this is what you're implementing. And each Pub/Sub implementation implements this interface. That's why I actually really like this interface. It puts some constraints on the implementers: okay, they need to implement it in that way. But it's also good, because it makes all of them compatible with each other. And the last but not least type is the handler function. A handler function is something like the HTTP handler that you are implementing in your HTTP server, with the small difference that instead of receiving an HTTP request, you are receiving a message. And optionally, you can return messages. So the idea is that you can react to some message, do something, and emit some other messages, so you can do some kind of chaining later. I will show an example shortly. And everything is magically connected. Sorry, it may be small, but you need to trust me that in the middle there is a router here. And this is connecting everything. So the message is coming from some publisher (it doesn't need to be Watermill), it goes through the queue and a subscriber to the router. The router passes it through middleware.
Middleware works in Watermill like HTTP middleware, so that's another thing that is pretty similar. And then it's processed by handlers. And later, if we want, we can publish some other messages. Not super complex. So do you know the first rule of live coding? Don't do live coding. So let's do live coding. What can go wrong? All right. I'd like to change the sharing settings, so one second, it's probably not this one. This is why you are not doing live coding. Yes. Okay. So something does work, that's good, but it's not really what I wanted. This is what I wanted to have. So I prepared a simple application here. And what does the application do? If you're not from Brussels, this may be something familiar to you. It allows you to book a room in a hotel. So you can provide a room ID, pass the guest count, and let's see if it works. Okay. It seems that it's not working sometimes. Sometimes it's working. Sometimes it's not working. Sometimes it's working slowly, slowly, slowly, slowly. Sometimes it's even not working slowly. So it's even worse. So let's check the source code of that application. So okay. Here we are running HTTP, so boring, signal handling, boring, but this is probably not boring. This is usually where the most interesting part of the application lives. Let's check our handler. So okay, we are unmarshaling stuff into a book room request, we have some advanced algorithm for calculating the room price, and we are taking payment. What can go wrong here? And okay, as we can see, our payment provider is not super stable, but okay, I don't know, let's imagine that it's our boss's colleague and we cannot change that, you know, politics. It happens. It's okay. What can we do? We can do it like that: go func. Okay, done, it works now, but there's one problem with that. If our server dies, there is a chance that we'll not take the payment, and that doesn't look like the best idea. So what will be my idea?
So instead of doing it synchronously within this HTTP handler, I would like to emit some event, listen to that event, and take the payment asynchronously. So let's do that, and let's do that with Watermill, of course. At the beginning, we need to get rid of that, and we need to have our publisher here. message.Publisher, so this is the interface that you should remember, all right. And I also prepared some code snippets so I don't lose time on boring stuff like RoomBooked. Well, we have our event, so RoomBooked, all right, guest count, and price, room price. All right, now we need to marshal that, because we are sending bytes between our processes through our Pub/Sub. So JSON, because JSON is kind of common and it's pretty easy to debug. So let's marshal that: payload, error, RoomBooked. Don't do such error handling at home, please. And now let's publish that. h.publisher.Publish, topic, so let's use bookings, and we need our message. Let's remember we need to have a UUID. It doesn't actually matter what format of UUID it is, it can even be empty for some implementations, but good luck with debugging. And the RoomBooked payload. All right, and it returns an error, so we need to handle that, not in a nice way, but it's live coding, so it's fine. All right, so we have the first part. We have our RoomBooked event, we're publishing that to the topic bookings, and, okay, now we just need to inject the publisher. So let's check where it's created. Okay, we no longer need payments. I heard that Kafka is nice and scalable, so let's use Kafka. I also have a snippet for that. There's nothing magical here, it's just this, from the Watermill documentation, and let's use this publisher. We don't need the subscriber yet, but probably we'll need it later. All right, by the way, I'm running some nice Docker Compose under the hood that is recompiling the project each time I put changes there.
At the end of the presentation, I will give you materials with all the source code, and with a description of how it's done so that it's automatically reloading after each change. All right, so we have our publisher, we are publishing our event, so let's check if it works. Hopefully it will work. Okay, so you can see that our API is pretty stable, and let's check if our event is really published. We'll use the mill tool. Mill is part of Watermill, as you can guess, and we'll consume from bookings from Kafka. Mill allows you to consume messages from multiple Pub/Sub types that are supported in Watermill. I know that there is a tool for that in Kafka, but it's not mine, so. And okay, as you can see, now we have an event here, so it seems to work. Okay, so done, thank you. Not really. We are not taking payments, so our company will probably go bankrupt pretty quickly, so we'll need to start taking payments. For that, we already have our subscriber, that's good, so let's uncomment that, okay. We need to have a Watermill router, so message router, error, router config, Watermill logger, error handling, and now we need to add a handler. We'll use AddHandler, so we'll need to provide a handler name, and it will be payments. It doesn't really matter what the handler name is, but again, it's pretty useful for debugging. Subscribe topic: we're subscribing to the topic to which we published this message, so this is bookings. For bookings, we need to use the subscriber, and we need a publish topic. We'll publish an event when we succeed in taking the payment, so payments, publisher, and handler function. Hopefully you remember the handler function signature, so yeah, we are receiving a message and we are returning messages, but we'll do it in a bit more fancy way, with a payments handler, because then we can inject some dependencies. I need to fix that, and that, all right.
So we have our payments handler, so we'll receive a message, and we'll take the payment and emit some event. So we need to have our payment provider, and what else? We need to have RoomBooked, we need to unmarshal, so message payload to RoomBooked. And compared to a standard library HTTP handler, you can return errors from a Watermill handler, so I don't need to panic. And all right, so we should have the payload that we published here, so that's good, and we can now use that to take the payment for the RoomBooked price, great, great. And as I said, I would also like to emit some event, so it may be useful. If we emit an event saying that we took the payment, we can have some BI, or we can, I don't know, do something else. I mean, I don't know, we can send a beer to this person after he booked the room, because why not? And, okay, so we need the second event, PaymentTaken, with fields: the room ID from RoomBooked as well as the price, and we need to marshal it again to JSON. Error. Cool, okay, and the last thing that we need to do is return a message: a new message, with a new UUID string and the PaymentTaken payload. I hope that I'm not writing too fast or too slow. All right, so in theory, there is a chance that it may work. So what are we doing? We are receiving our RoomBooked event, we are unmarshaling it, we are taking the payment, and when we succeed, we are emitting another event.
Sounds like done, so the only thing that we need to do is to use that handler. So we have that one, and handler, cool. Let's check if it compiles. It even compiles, so let's check if it's working. Let's book a couple of rooms. The idea is that by default the Watermill handler will retry if the payment provider failed, so in theory we should see some information that the payment was taken. And we don't see that. I know why we don't see that: because we didn't start the router. Run, context, error. It's a bit of a naive implementation, because it's not really a graceful shutdown, but in the documentation, as I remember, we have examples with real graceful shutdown. So, okay, and let's see. Okay, so we have some random error, and you can see payment taken. Hooray, our company is saved. All right, so this is working, but there's one problem with that. Now we figure out that, okay, Kafka is actually a bit hard to run, and we are on GCP, so maybe we can just use Google. So I think that I can change the Kafka implementation to Google Cloud Pub/Sub in one minute. I'm raising the bar high today, but I think that I can do that. Let's start the timer: one, two, three. Okay, let's check. I think I did that, so let's book, and okay, payment taken. We can double check, so let's use mill, and let's consume bookings. You see, it works. All right, so that will be it for live coding. One last thing that I would like to show you, because you may notice that, okay, there's a lot of boring JSON there, et cetera, et cetera. You may notice that I don't like boring stuff, because there are probably more interesting things to do than marshaling to JSON. So because of that we created a component that is named the CQRS component, and the idea is that instead of doing this JSON marshaling and all that stuff, you can provide configuration for which format you would like to marshal everything to, and under the hood it will be done. So you can use JSON, you can use Protobuf, Avro, I don't know, even
something custom if you really want. The idea is that you're only implementing this interface, so you're providing the name of the handler, you are providing the event that you are expecting to receive, so in that case it will be RoomBooked. And you may notice that it was pre-generics, so we have the interface here, but we are working on a newer version. And you are just receiving this event, with zero unmarshaling or whatever. And the same goes for when you are publishing an event, so you are just providing the struct and Watermill under the hood is doing all the marshaling stuff. Okay, so I think that will be all for live coding. It looks like I was lucky this time and everything worked. And yeah, of course it's still not a production-grade implementation. I mean, it's even hard to create a production-grade implementation of an HTTP server, so it's more a kind of inspiration to look deeper and see that, okay, it's not that scary. But you need to take into consideration that there are things like Kafka and Google Cloud Pub/Sub internals, and what at-least-once delivery is. I actually showed the CQRS component, but I didn't code with that, but it's helping a bit. So where should you start? Because okay, it may be a lot of sources for you, and a lot of stuff to check. I heard that we have pretty nice documentation. We don't have any consulting or whatever for Watermill, so we have no incentive to have bad documentation. So yeah, I heard that we have pretty good documentation, and at the end of the presentation it will be in the link. What else? We also have a lot of examples in Watermill, so I will encourage you to, it's black, oh, live coding. Okay, it's not only live coding that's risky. So yeah, we have a lot of examples that you probably cannot see because the screen is black, but you need to believe me that they are in the Watermill repository. At this point I wanted to say a big thank you to all Watermill contributors, because without you it wouldn't be like it's
now. And there's one announcement: we actually released Watermill 1.2 after having too many release candidates, so yeah, finally it's released, and you are all invited to an online release party where we will say what the new features are. It will be on March 1st, and at the last link there will also be a link for that. And I think that will be all. This is the, again it's not working, oh, yeah, so this is the link that I promised to give you. And the bonus that I have: I have super fancy holographic stickers. I'm sure that you have laptop stickers, but I'm sure that you don't have holographic ones, and I have a lot of them. And yeah, I think that would be all, so thank you very much for your attention. Thank you. Thank you. |
Is Go Object-Oriented? A Case of Public Opinion |
Okay, thank you. We have to stay on our schedule. So our next speaker, Rona, stood here three years ago, before the pandemic was a thing, and she gave us a challenge to solve the Go diversity problem within a year. Rona, is it solved? As far as I can tell, yes. Well done. Well done. No, no. So actually, Marty stole this from me because, yeah, three years ago, I did challenge this forum to solve the problem of lack of diversity within the Go community within a year. And then the pandemic hit, and it seems that these issues were pushed aside, unfortunately. But we can start again. So I'm here to talk to you today about a lighter topic: is Go object-oriented? Now, it appears to be something that people have many opinions about. And I hope that you do too, because that will be fun. So I am Rona. I am a Google Developer Expert for Go. I create workshops. That's kind of my thing: when I want to teach something, I make a workshop about it. And in 2022, after a few years of seeing developers struggling with different paradigms around the Go type system, specifically with interfaces, I figured, why not create a workshop about it. And I submitted it to GopherCon Europe. The name of the workshop was Object-Oriented Design with Go. And then the comments started coming. So something along the lines of "object-oriented is dead". Somebody posted a comment on my tweet and then blocked me, because apparently that's how the internet works. So, yeah, and somebody blamed me for introducing Spring into Go. Now, I have been a developer for 20 years, but I have not done any Spring in my life. And that is such a specific accusation. I was fairly surprised. So I am not here to promote my workshop, even though I am giving it again this year in Berlin in June, because that would be bad. So what is object-oriented programming?
So it's the idea of separating software into things, or objects, or concerns, or instances, and adding some functionality to them, usually called messages or methods or member functions, so that we can work with software in a more intuitive way, the way that we understand how we interact with the real world. That's it. Now, where there are things, it can get incredibly messy. That's the business model behind Marie Kondo. Yeah. I felt the sigh. You don't have 14 items maximum in your house. You're not alone. And we have this lovely quote from Joe Armstrong: the problem with object-oriented languages is that they've got all of this implicit environment that they carry with them. You wanted a banana, but what you got was a gorilla holding a banana and the entire jungle. And that feels like it does. It really does. So Go really went a different way. It tried to sort of stay away from this. But we will see in a second what actually remained from this and what we were actually able to let go of. So we're going to hold a trial. You're going to be the jury. You're going to decide if Go is object-oriented. I'm going to try and convince you. I'm going to show you the arguments for both sides. And I'm going to have to convince you that it is object-oriented, or I am responsible for defamation: I am taking Go's good name and dragging it through the mud, tarnishing it, hurting its reputation. You're the jury. You will decide. So, disclaimer: this is not a real trial. And I'm the judge, and the rules are what I say they are, which is to say that these proceedings are going to be ridiculous. But yeah, we're just going to have to do it. So what have we come to expect from an object-oriented language? Most of you know this by heart, really. We have classes, because classes and only classes can have methods. Classes can also have these constructors. Constructors allow us to create objects safely. And we also expect inheritance.
Objection. So in Go, we don't have classes. And therefore we don't have constructors. But we pretend to have them. So here's where we pretend to have something. Now this one is quite fun. So this is a package that I created. And you can see here that inside type Robot, the Godoc nicely aggregated a function called New. Now you can see that it doesn't actually have a receiver. It's not a method. It's not anything. It's just a package-level function that the Godoc understood to be a constructor. And then it added it where it should be, nicely nested inside the Robot type. Which is really interesting, because what that means is that the Go team essentially decided at some point that safe construction of objects is a tooling problem, not a language problem. It's interesting. We work with constructors. Okay. All types can have methods. So you've probably seen this or code similar to this at some point in your life. I created a new type MyThing out of an integer. So we have a new type with an underlying type int. I added a method Foo to it using the receiver. Just remember that we said earlier that we have objects interacting through messages. It's called a receiver because it receives. It's that easy. Come on. All together. Nobody agrees? What's going on? Yeah, okay. Pathetic. I said I was going to judge. All right. So now we created a variable t of type MyThing. We assigned it one. And then we're able to call a method on t, Foo. Voila. We have a primitive type and it has a method. Because in Go, all types are created equal. Thank you. Okay. So let's move on to inheritance. In Go, we get composition. We don't get inheritance. That's not something that's available to us. So this little snippet here is supposed to show the difference. I created a type A. It's an empty struct. I added a method Foo to it. It has a method Bar. Foo calls Bar. That's nice. Returns Bar. We have type B that embeds A.
It can embed as many types as it wants. It embeds A. That means that now it has Foo and Bar. And then it decided to override Bar. Fine. Variable b, lowercase b, of type uppercase B. To be confusing. Not to be confused. Okay. Get it. And we can call b.Bar. And we expect B's Bar to be invoked directly. And then we call b.Foo. Now, with inheritance, we would expect b.Foo, which returns the result of Bar, which is overridden, to call B's Bar. So we would expect to get B in both cases. But that's not what's going to happen, because we do not have this type of polymorphism. That is true. Who was it that said it earlier? Raise your hands. Round of applause to the gentleman. Okay. Moving on. So let's talk about single and multiple inheritance, because this really bugs me. So I started my career with seven years of C++. And yes, it sounds biblical because it is. Thank you. Yeah. So I started with seven years of C++. C++ actually has a really nifty feature. You can inherit a lot. You have multiple inheritance. It's not limited. Java, Ruby, they allow you to inherit exactly once. That to me does not feel like a feature. That feels like a limitation. I don't understand it. Let's say that you have a truck and you want to describe it as a container of goods and a vehicle. You cannot inherit from both. What do you do? Well, in Go we have composition. But in many, many languages that offer inheritance, you only get single inheritance. Now, I will say this. If you feel, and I know that a lot of people do, that inheritance is that important, it just doesn't make sense that it would be so limited a feature that you are not able to use it properly or fully. And I do believe also that this is the cause of all the messy code that we see, because the classes that I used to define were very small, one function, two functions. I didn't have to make odd choices about what is going to go into a class or what isn't. So it was really easy to be very expressive.
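The embedding example described above can be written out as a short sketch. The slide itself is not in the transcript, so the method bodies and return strings are assumptions; the behavior shown (the embedded type's Foo keeps calling its own Bar, even when the outer type defines a Bar of its own) is standard Go.

```go
package main

import "fmt"

// A is the embedded type. Foo calls Bar on its own receiver.
type A struct{}

func (a A) Foo() string { return a.Bar() }
func (a A) Bar() string { return "a" }

// B embeds A, so Foo and Bar are promoted to B's method set.
type B struct {
	A
}

// Bar shadows the promoted A.Bar when called through a B value.
func (b B) Bar() string { return "b" }

func main() {
	b := B{}
	fmt.Println(b.Bar()) // prints "b": B's own method wins
	fmt.Println(b.Foo()) // prints "a": Foo's receiver is the inner A, so A.Bar runs
}
```

The call `b.Foo()` is shorthand for `b.A.Foo()`, so inside Foo the receiver is an `A` that knows nothing about `B`. With class inheritance and virtual dispatch, both lines would have printed the overriding result.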
So that's my personal opinion about single and multiple inheritance. In most common languages, since it's usually single inheritance, I don't feel that it is something that is going to determine whether a language is or isn't object-oriented. Because if it is, then as far as I'm concerned, any language that has single inheritance cannot be object-oriented. So Go is not object-oriented. The usual argument summary, and you have to understand that I'm aggregating here everything that was said to me over the internet, which is a great source of information and also a great source of information. So Go is simple, object-oriented programming isn't, therefore Go is not object-oriented. Go doesn't have classes, so Go cannot be object-oriented. It doesn't have inheritance, so it's not object-oriented. And lastly, and this one is a great argument, we're going to dive into that one. Objects are not really messages, so Go is not object-oriented. Now, this one is fun. So this comes from the Alan Kay school. So all the fans of Alan Kay have jumped in to let me know everything about the history of object-oriented. So here's what I've got. One person said, technically they aren't methods, so he's referring to method receivers. They aren't methods on type T, they are functions where the first argument is the type. The promotion of them to methods is syntactic sugar. It's why you can call, now look at this, this is amazing: v.foo with bar, or T.foo, that is, invoke the method on the type and provide v as a parameter. And it's true. I'll show you what it looks like. So same code as before, we have an empty struct. We have method foo that we added to A. Quiet in the court. We have method foo that we added to type A, which is nice. And we can invoke it in two ways. One is clearly less common than the other. So this is the common way. Or we can invoke it on the type and pass in lowercase a as a parameter. 
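The two invocation forms are easier to compare side by side. This is a small sketch rather than the slide itself; the string returned by foo is an assumption made for illustration:

```go
package main

import "fmt"

// A is an empty struct with a single value-receiver method.
type A struct{}

func (a A) foo() string { return "hi from foo" }

func main() {
	a := A{}

	// The common form: a method call on a value.
	fmt.Println(a.foo())

	// The less common form, a method expression: the method is named
	// through the type and the receiver is passed as the first argument.
	fmt.Println(A.foo(a))

	// A method expression is an ordinary function value of type func(A) string.
	f := A.foo
	fmt.Println(f(a))
}
```

All three lines print the same string, which is the point the quoted commenter was making about receivers being a first argument.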
You have the screen? That's why, by the way, a pointer receiver can be nil. So I have this game. I play this game all the time with people who are new to Go, and I ask them what's going to happen. So we are able to create methods with pointer receivers. So I add foo and make the receiver a pointer. I create variable a, which is a pointer to uppercase A. Its zero value is nil. And I invoke a.foo. And then I ask people what will happen. Now, we're not going to be able to do this quiz here because we are running late and we have to make up some time. So I'm just going to run this. So "hi from foo" is actually returned by foo, which means that we are able to invoke a method on a nil pointer. It's possible because there is no receiver. There is no actual receiver. Most of the time when we call a method in other languages, what happens is that we have to go through some reference that's somewhere in the address space of the variable itself. This tells us that that's not where the method is at all. It's not where it's defined. It's not where the runtime looks for it. It's just not. So my Women Who Go co-organizer, Jessica Green, saw this and said, ah, so there is no spoon, which I thought was amazing, because this is kind of a design thing, right? So everything is in our minds. Really everything is in our minds. We sort of, you know, we have these philosophical ideas and then we put them into code. If there is anybody who actually thinks that the gopher that you saw on the screen in the maze is an actual gopher out there doing this, let me know. I want to hang out. So where do receivers come from? So listen to this, because this is very interesting. According to the Go team, the inspiration for method receivers came from Oberon-2, which is the object-oriented version of Oberon. Okay? And they're called receivers because they receive messages, except there is no receiver. So everybody's right. 
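For readers without the slide, the quiz can be reproduced with a few lines. This is a sketch under the assumption that foo never touches its receiver, which is what makes the call on a nil pointer safe:

```go
package main

import "fmt"

// A is an empty struct.
type A struct{}

// foo has a pointer receiver and never dereferences it.
func (a *A) foo() string { return "hi from foo" }

func main() {
	var a *A // the zero value of a pointer type is nil

	fmt.Println(a == nil) // true
	fmt.Println(a.foo())  // hi from foo
}
```

The method is resolved from the static type *A, so nothing is looked up through the pointer. A method that read a field through a nil receiver would still panic at that point.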
Another thing that the same person brought up, which is really cool, he said, well, I forgot to say who Alan Kay was. What's wrong with me? Alan Kay is the creator of Smalltalk and is also considered to be the person who coined the term object-oriented. So he's supposed to know stuff about it. So that's why people quote him, and that's fine. It's totally fine. And he has a lot of opinions, which is also great. We encourage opinions. So what this person said to me: in Smalltalk, you don't need to explicitly declare that an object can handle a specific message. You send it a message and then it decides whether to handle it. So duck typing. Now, this is really interesting. Can we do this in Go? Can we check if a certain value can handle a certain message or, in layman's terms, has certain methods? Well, the answer is yes, through the magic of interface conversion. So again, we have type A, we added function foo. It doesn't do much. We don't actually care what it does. And we created interface I, uppercase I. And this interface defines one function, foo, that returns a string. And A, coincidentally, or pointer to A, coincidentally also has foo, which returns a string. Now, this is where Go is completely different from other languages. In Go, interfaces are implicit. In most languages, if you have, let's say, class A and you want it to implement interface I, then at the time of creating that class you have to say something like class A implements I. That means that you cannot have a type that is not aware, has no idea, that a certain interface exists and still implements it. Not unless, not unless, that's why a lot of people move to scripting languages, because that allows them to pick some code from the internet and use it. You know, just use it regardless of what's going on. And Go allows you to actually download some random code from the internet and plug it in using your own interfaces. It's very strange. It's very unique. 
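A short sketch of that runtime check, which the Go specification calls a type assertion. The names A, I and foo follow the talk; the helper describe and the returned strings are assumptions added for illustration:

```go
package main

import "fmt"

// A knows nothing about the interface I declared below.
type A struct{}

func (a *A) foo() string { return "foo on *A" }

// I is declared independently; *A satisfies it implicitly
// because it has a method foo() string.
type I interface {
	foo() string
}

// describe asks at run time whether v has the methods of I.
func describe(v any) string {
	if val, ok := v.(I); ok {
		return val.foo()
	}
	return "no foo method"
}

func main() {
	fmt.Println(describe(&A{})) // foo on *A
	fmt.Println(describe(A{}))  // no foo method: foo is declared on *A, not A
	fmt.Println(describe(42))   // no foo method
}
```

The two-value form of the assertion reports failure through ok instead of panicking, which is what makes it usable as a "does this value understand the message" test.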
So with that in mind, if you have implicit interfaces, it makes sense to be able to ask whether a type implements an interface, because maybe it does. In other languages, it just doesn't make sense, because the answer will be no. Right? Right. Okay. So therefore, we use interface conversion. Now, this is the syntax. What you need to understand from this expression is that if everything was fine, then val will hold the value, which is a pointer to A, inside the interface I. val will have all of that and will be able to invoke foo, which is this one, which is exactly what's going to happen. And this is how stringers work. Now, we said that Smalltalk was created by Alan Kay and that he coined object-oriented. Well, actually, he created Smalltalk explicitly with his ideas of what object-oriented is supposed to be. And it's important, because according to Robert Griesemer, who is one of the creators of the Go programming language, Smalltalk was actually the inspiration for this kind of interface conversion, so that at runtime we can actually check if a type has certain methods. And Russ Cox compared it to duck typing. And by the way, this is a really, really nice read. So if you just Google Russ Cox duck typing, you will find it. It is a really nice read. He explains how the runtime does it and also how the caching works, because obviously at compile time you cannot check every type against all the interfaces in the world. It's really, really nice. And finally, we have a surprise witness. Maartje is going to be the proxy of said surprise witness. Do you have your? I am here. So before I introduce our witness, I am going to ask the witness: do you remember giving an interview in 2010 to Danny Kalev? I don't recall. Well, it's on the internet, so it must be true. It's true. I submit into evidence a web page. 
Exhibit 90210. So please read a portion of the text from the web page in your own voice. Go is object-oriented, even though it doesn't have the notion of a class. The type system is more general. Any type, even basic types such as integers and strings, can have methods. Thank you very much. So what makes you such an expert to be able to make such claims? I co-created the language. You created the language. What is your name? Rob Pike. Thank you, Rob. New glasses? Yeah, new dress. When they made it. Thank you very much. Thank you, Maartje. So yes, Rob Pike actually said that in 2010. I don't know if he has changed his mind since. But the truth is that I feel that at this point it is clearly a matter of opinion. So I would like to know yours, since you are the jury. Voting is now open. So the verdict. It's verdict time. I hope it's now open. Yeah, it is. The Go team is not permitted to answer. I believe one of my members asked the Go team last year if Go is object-oriented. That's a dare. I am going to give you 35 more seconds because we have to wrap it up. Am I correct? You already see 105 judgments. 108. Oh, no. I created a bunch of bots. All right. So I have to close this. Unfortunately, let's find out what you said. That's interesting. Why can't I see the results? Yeah, I don't know what happened. Let's do this. That's what I did. Wow. Okay. So I am going to cancel these proceedings because clearly you're out of your minds. As the judge, I condemn myself to providing you with stickers, lots and lots and lots of stickers. Iris, Rona, thank you very much. I have to get off. Thank you. If you have a sticker, you also have to give her one if she convinced you. Again, housekeeping announcement: if you submitted a lightning talk, check your mail, Matrix, Discord, WhatsApp, whatever you sent to me. I'll try to contact you if you got accepted. If you aren't sure, our Mastodon account has contacted me on Twitter. |
Visually programming Go
Let's mix Blockly + Go and see what happens! |
It's getting late already in Brussels, the sun is shutting down, but we still managed to find a second TinyGo talk somehow. Daniel is going to tell us everything about visual programming in Go, which I think means that I no longer have to write code. So, run for class. Can you hear me? First of all, thanks to Maartje and the rest of the organization for this beautiful Go devroom today. Also, FOSDEM. I'm Daniel Esteban, also known as Conejo, and I'm going to talk about visually programming Go. You probably use your eyes to program right now, but I'm talking about visual programming languages, where you use graphical elements instead of text or code. Usually, there are two main branches of visual programming languages. One is flow-based, and the other is block-based. And today we are going to focus on this block-based way of programming Go. Why? Because I like to make crazy things with Go, especially TinyGo, and I wanted to know if it's possible to make some code graphically and then translate it to Go. Because, well, more seriously, because I think programming is an essential skill for the future. It's a great way to introduce a non-programmer to programming, especially children. It's great for simple tasks like home automation or IFTTT. Also, the no-code and low-code movements are growing in popularity, and Go has a nice standard library, some nice packages, it's easy to read, and it has multiple targets. You can run Go on Mac and Windows, but also on a lot of microcontrollers. How are we going to visually program Go? Well, there is Blockly. Also known as MakeCode or Scratch or hard to block, all of them use the same engine, and we are going to see this in a few moments. Blockly is a pure JavaScript library, it's 100% client-side, no server-side dependencies, it's compatible with all major browsers, and it's highly customizable and extensible. 
Blockly, unfortunately, does not support Go officially, so I'm here to fix that. Unfortunately, I have a playground specifically made, but I left the last update at home. It's not on the internet, so I cannot run it on my laptop right now, but I will show you some screenshots and we are going to see some demos. As you can see on the left side, the Blockly editor runs in your web browser. Once we have the blocks, it generates the Go code, and then we send it to a server, so it can compile, it can format the code, and then we can get one file that runs on the web and the browser again, so we can see the output, or we just get the file for our device. For the server to compile, we are using TinyGo, which is a Go compiler for microcontrollers, but you can use regular Go. Let's see some examples, and we are going to see different features of Blockly. The first one is, of course, a hello world. So in Blockly, we have these different blocks already made, and we just drag and drop them, we can edit them, we can add what we want. For example, to print hello world, we just go for a text, we drop it here, Hello FOSDEM, and we are going to do this five times, and yeah, this doesn't work, sorry, but well, we just drag and drop the different elements. The code it will generate is pretty simple, and it's just Go code. So this was our first example. You will need to trust me that this is working, but like I said, I couldn't bring the last version of the playground here. Our next example is especially focused on children or non-programmers. I present you the Logo turtle. Turtles are educational robots in computer science, because it's really easy to see: you program the turtle, the robot, and you tell it move forward one meter, turn right, or turn left, move forward again, go back, and then you see the little robot moving, and it's very easy for children to understand the principles of programming. 
So they became popular with the Logo language and turtle graphics, so we made our own version. It is the Gopherino, this little one. Okay, this is a different robot: there is the chassis, the brain is a BBC micro:bit, which can go inside, and then the eyes are an ultrasonic distance sensor. We are going to avoid obstacles. In the first example, as you can see, we set a variable called distance, it's a number variable, and then in a forever loop we read the distance from the sensor, and if the distance is less than a value, we spin to the right; if not, we go forward, and this repeats indefinitely. For the generated Go code, Blockly will make the right imports, it will declare some variables needed for it, and these blocks will generate valid code. So we just run it. I will skip the flashing part, because again, it's just wasting time, and I think we are pretty tired after a long day, but we can just see, wait, maybe, yes, and when it finds an obstacle, it just spins. Bye bye, Gopherino. The next example is what I call the fish tank problem. I have a fish tank at home, and they need to live at a very specific temperature, and it turns out that water heaters are very cheap, but water coolers are not. Instead, you can just blow air with a fan at the fish tank, and it will lower the temperature. So I couldn't bring the fish tank here, but I have a hopefully still hot coffee and hopefully still cold water. The circuit is very simple: it's an Arduino Nano RP2040, it has a waterproof temperature sensor here, and instead of running a fan, I will just light some RGB LEDs, so you can see them at the back. The code is quite similar to before: we just have a variable called temperature, we initialize the device, we read the temperature, and if the temperature is higher than 30 degrees, we blink red; if it's under 20 degrees Celsius, it will be blue; and if the temperature is okay, it will be green. 
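The generated code for this demo is not in the transcript. As a rough, hardware-free sketch of the decision logic only, with the function name colourFor and the colour strings being assumptions rather than anything Blockly emits, the thresholds from the talk could be written as:

```go
package main

import "fmt"

// colourFor maps a temperature in degrees Celsius to an LED colour name,
// using the thresholds mentioned in the talk: above 30 is too hot,
// below 20 is too cold, anything in between is fine.
func colourFor(tempC float64) string {
	switch {
	case tempC > 30:
		return "red"
	case tempC < 20:
		return "blue"
	default:
		return "green"
	}
}

func main() {
	// Hypothetical readings standing in for the coffee, the room and the water.
	for _, t := range []float64{55, 24, 8} {
		fmt.Printf("%.0f°C -> %s\n", t, colourFor(t))
	}
}
```

On the real device the reading would come from the temperature sensor driver inside a loop, and the colour would drive the RGB LEDs instead of being printed.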
So the temperature is okay. We're going to put the sensor in the coffee, it turns red, and now in the water it will hopefully get blue. So we can make this small thing with very, very few blocks, and we save ourselves a lot of money instead of buying a water cooler. The code again is very simple, I just skipped some parts, but it makes just regular code. The next example is more focused on no-code, low-code, and WebAssembly. WebAssembly is getting supported by more and more entities, more and more services, like Fastly, Cloudflare, Capsuleware, Extism. You can have serverless functions right now, and Extism is trying to make WebAssembly what Lua was a couple of years ago: you can have your program and you can include some extensions from the user or the community in WebAssembly. So I just created a Wasm worker, it's a visit counter. In orange you can see the special blocks to work with Cloudflare. We just create a data store with the name counter for the connection, and each time we visit the main route on the web server, it gets the value from that data store, increments it by one and then puts it back. Just a simple visit counter. The code generated here is again an HTTP function like you have probably written if you have done some web server code. This is a bit of ugly code, I will explain later, but the rest just converts from string to integer, adds one and then puts it back again. And then we, wait, the URL where it's published is here; this is right now one, two, and if we keep reloading the page, it will keep counting, and if we go to the back end on Cloudflare, we can see the value is there. So again, with very few blocks, we can allow non-programmers on our team to write an extension for our main program or whatever tool they are using. 
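The string-to-integer round trip described for the visit counter can be sketched without any hosting service. Here kvStore is a hypothetical in-memory stand-in for the hosted data store, not a real Cloudflare API, and visit is a name chosen for illustration:

```go
package main

import (
	"fmt"
	"strconv"
)

// kvStore is a hypothetical in-memory stand-in for the hosted data store.
// Like many key-value services, it holds string values only.
type kvStore map[string]string

// visit reads the counter, converts it from string to integer, adds one,
// and writes it back. A missing or malformed value counts as zero.
func visit(store kvStore, key string) int {
	n, err := strconv.Atoi(store[key])
	if err != nil {
		n = 0
	}
	n++
	store[key] = strconv.Itoa(n)
	return n
}

func main() {
	store := kvStore{}
	for i := 0; i < 3; i++ {
		fmt.Println("visits:", visit(store, "counter"))
	}
}
```

In the worker from the talk, the same steps would sit inside the HTTP handler for the main route, with the store calls going to the hosting service instead of a map.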
Now let's make the blocks. You need to define the blocks. It's a JSON structure in which you have the type of block, the message to show, whether it has some inputs, the output, the next statement; you can even add a tooltip, some comment or documentation. And then you just define what the block does and what Go code it should finally write. There is even a block generator, where you can add different properties to your block. And now the features of the blocks: there is type checking, a little bit of type checking, so you cannot assign, I mean, a number to a text variable or something like that. For some blocks you can define what type they return. You also have conditionals, of course, and you can edit the block itself live to add more else-if or else conditions. You have some lists also. You have inline documentation: just wait a little bit and the tooltip appears with what the block does. It can be translated, of course, into any language, which makes it really, really easy for people to start, and you can have different colors and different styles for each type of block. You can also have an image on any part of the block, which again helps the user a lot to know that, okay, this block is an LED or a temperature sensor or something like that. 
The limitations: currently not everything is supported yet, you need to create a block for it. It's probably worse for vision-impaired people or screen reader users, because I guess a plain text file or code file is easier for them. Static typing is complicated because there are a lot of different types, and Blockly was made with JavaScript, Python, and Dart in mind, dynamically typed kinds of languages. You have to make a lot of decisions on behalf of the user. For example, in our web service example, we just assume that the response writer will be called D and the request is req, so you have to keep that in mind when you make other blocks. Ugly code is sometimes needed, because, for example, and just focusing on TinyGo, I expect all number variables to be int32, because it is common in TinyGo, so when you have to typecast to int32 or something like that, it gets ugly. And there is not much documentation right now about how to debug it or how to develop on Blockly, so it's kind of hard right now. Here are the links for the different projects, and that's it. Thank you. I'll give you some time. |
vfkit - a native macOS hypervisor written in go |
And one second, okay, our last speaker for a full-size talk of the day is Christophe, and I found it weird that I got a macOS talk at FOSDEM, couldn't even fire me, but it seems to be open source and it's Go, so the stage is yours. All right, hello everyone, thank you for staying so late. So yeah, we'll be talking about vfkit, which is a macOS hypervisor written in Go, which I've been doing for my work. So yeah, I will first present a few things, then I will present in more detail the virtualization framework, which vfkit is based on, and why vfkit, and then I will go into more detail about how you can use Objective-C code, how you can call it from Go. So a bit of background: my name is Christophe Fergeau, I'm working at Red Hat, I'm working in the CRC team, which is now called OpenShift Local, which was called CodeReady Containers. So what is it? It's basically a way of running an OpenShift cluster, so let's just say a Kubernetes cluster, on a laptop. You can do that on a Linux laptop, you can do this on a Windows laptop, and it can also be done on a macOS machine. So how do we do it? We create a virtual machine, and then we start the cluster in it. Yeah, basically it starts and you have a Kubernetes cluster. Why do we do that? It's aimed at developers, so if you want to develop a Kubernetes application, you can have all of that on your Mac. You don't need access to whatever on AWS or something. You create your Kubernetes application, you start the VM on your Mac, and you can do all your tests over there. So for a virtual machine, we need a hypervisor. On Linux we use QEMU, it's easy. On Windows there is Hyper-V, it's also easy. On macOS, it's been more complicated for us. We used to be using HyperKit, but HyperKit doesn't have support for Apple silicon hardware. So a few years ago, okay, they're switching to these new ARM CPUs, they're great, but we cannot really keep using HyperKit. 
So the next option was QEMU. QEMU is just great, it's in brew on Mac, you can install it, you can use it. For my specific project, at Red Hat, we would have to rebuild QEMU ourselves and then ship it, and we have nobody at Red Hat maintaining the macOS QEMU bits, so we were really worried that it would be on my team to maintain that. We are like three people, a little bit more, five people maybe working in the team, and QEMU is like millions of lines of C code, I'm not exaggerating. We really did not want to be the ones maintaining it. In other projects, I would just use QEMU, I would be happy to do it. For my project, it was like, yeah, not a great idea. So we needed something else. We were looking for something command line, we were looking for something which is free software. At the time, we did not find anything. Even today, I'm not so sure there are a lot of things we could use. But yeah, there was this new virtualization framework from Apple, which had just been released, and it looked very nice for our purpose, so I'm going to present a bit more in detail what it is. So it was introduced in macOS 11, so two years ago, I think, and it's some really high-level APIs to create virtual machines on macOS. You can create macOS virtual machines, which I've never tried actually, and you can also create Linux virtual machines using this framework. It's a framework, which means it's a Swift/Objective-C API, that's some kind of library. They call it a framework; for me, it's just like a shared library in C. But they don't provide user applications: there is nothing to manage your VMs, there's nothing to create them, there's nothing graphical. It's really something you can use as a programmer, but not something you can directly use as a user. So yeah, when I say it's a high-level framework, I mean it provides everything you need for a virtual machine, but yeah, not much more. So in QEMU, you would have support for real devices, like QEMU emulates, I don't know, for Ethernet, they 
emulate Realtek hardware, they emulate Intel hardware, and they also have some virtualization-specific devices. In this framework from Apple, you only have virtio devices, which are just virtual devices used in virtual machines, but there are no real hardware implementations of them. So they have virtio-net for networking, they have virtio-blk for disk images, virtio-rng for random numbers, virtio-serial for serial ports. There are plenty of devices like this, and there are some very useful devices for my use case and for other use cases, like containers on macOS, for example. So they have virtio-fs, which is a way of sharing files between the host and the guest, and yeah, it's quite efficient. I forgot which container solution switched to this on macOS, and they say, yeah, it's really great for performance, so it's really useful to have that. They offer virtio-vsock, if you need some communication between the guest and the host; it's a POSIX sockets API for easy communication between the two systems. And there's Rosetta support, which means you start a Linux ARM64 virtual machine and they provide you a way of running Intel Linux binaries inside this virtual machine. So they just reuse what they implemented for the Mac, and they make that available for Linux binaries as well, so it can be just great and useful. Yeah, so I don't know, is it readable enough, actually? I hope so. 
Yeah, I just wanted to show how easy it is to use. So you create a configuration for the virtual machine, you give the number of CPUs you want, you give the memory size that you want, and you need a bootloader, more on that in the next slide. Basically this is the very basic configuration, all that you need for a virtual machine. You could add a disk image to it, but it's not in that example. This is some Swift code, it's not Go code. So for the bootloader, it's also very easy to create. You need it in the configuration, but basically, it just needs the path to the kernel, which is at the very top. You can specify an initial RAM disk if you need that, but it's optional. You give the kernel command line arguments and that's it, you have your bootloader configuration for your Linux virtual machine. And then you can start the VM, which is that. So it's just a few lines of code, and you have a way of creating a Linux virtual machine on the Mac. So the virtual machine: you create it, you give it the configuration, and then you start it, and that's it. So this framework would be just great for what we needed, which was something to start virtual machines on the Mac. The framework is maintained by Apple, so we don't have hundreds of thousands or millions of lines of code to maintain, because Apple is kind of taking care of that. But yeah, it has some issues: it's written in Swift or Objective-C, basically the framework is non-free, and yeah, I'm in the Go devroom, so yeah, it's not a great fit for this room. In my team, we do everything in Go, so ideally for us, what we would use to start and manage the virtual machine would be in Go as well. So we're like, okay, we have this great framework, but there are no Go bindings. We would like some Go, so what do we do? 
This is where vfkit comes into play. So before vfkit, there was this very, very nice project, Code-Hex/vz, which is written by someone named Kei Kamikawa, and it's some Go bindings for the Apple Virtualization framework. So yeah, basically lots of Go code, lots of cgo code for the Go code to be calling the Objective-C code. It's written in Objective-C, not in Swift. It has a free license, MIT licensing, and yeah, one very important thing as well is that the maintainer is very active, and he was very fast in adding the new APIs which were added in macOS 12 and 13, and some of them are very important for us. macOS 12 is where they added file sharing, which we really needed. In macOS 13, they added lots of APIs, but mostly for graphical virtual machines; I don't really have a need for that, so it was not that important. But they have a way of booting UEFI virtual machines, which makes everything simpler. Like I showed, if you want to start a virtual machine, you need a Linux kernel, an initrd, a kernel command line. With this new feature in macOS 13, you can directly boot the disk image, and you don't need to provide these additional details, so it's just so nice to have that. And so, one could think that, okay, we have this, it's great, we just use it, and we are done. There is one more thing to know about the virtualization framework: it's an API, it's a framework, and this is binding the API to be able to use it in Go, but it's really not managing virtual machines or anything. So if you write your test code, you create a virtual machine, you start it, and as soon as your program exits, the virtual machine is gone, because basically, Apple provides a way of starting the VM and stuff, but it's up to you to keep the process alive for as long as you want the virtual machine to be alive. 
So Code-Hex/vz is also some kind of library, it's a Go package, but there is not this process which would be alive for as long as the virtual machine is alive, and I spoke too much. And so, we needed something, a process, to create the virtual machine and to be sure that it will stay alive as long as we need the virtual machine, so basically until we want to kill it, or until the virtual machine stops by itself. And so, we decided to write this program called vfkit. It's a command line interface on top of the Code-Hex/vz bindings, so this means you can just use it to create your virtual machine, to start it, and as long as the process is alive, the virtual machine will be running. It's written in Go, and it's also using a free license. So, just the command line: you specify the bootloader, this one, yes, CPU and memory as we saw before, and then the list of devices that you want, and that's it, a very simple way of starting the VM. And yeah, the command line can get long, so this is why we also added a way of creating the command line from Go. So you can use a nice Go API: you create a bootloader with the parameters, you create a virtual machine with the parameters for the memory and the CPU, then you add your devices, and then you ask it to give you the command line, and then you can just use that to start your virtual machine. So yeah, I put the name of the package at the bottom. And so I wanted to give a quick overview of how you can use Objective-C from Go, because at first I was like, yeah, okay, it's magic, but actually, yeah, it's a bit magic, but not so much. One thing to know about Objective-C: the syntax can be weird, but it's really a superset of C, so you can just use cgo, and so this import "C" syntax, to just call the Objective-C functions and methods and to do what you want. And so if you look at the code for the Code-Hex/vz bindings, which are the bindings for the Objective-C methods of the virtualization framework, you will see that the Objective-C 
types are usually changed to C types, converted to C types, and then we interact with the Go code using C types. This means C strings, and void pointers, C pointers that we can reuse. So what can we do with this support for Objective-C in cgo? One also nice thing is that we can build arm64 and amd64 code on the same machine, and so we can then generate macOS universal binaries on the same machine. There's a Go module to generate these universal binaries, so it's very nice. And a few annoying things, minor but annoying: I always forget the semicolon at the end of the lines, and then the compilation fails in the Objective-C code. And compilation can get quite slow, even for small programs, I don't know why, but I guess it's the compilation of the Objective-C code. Sometimes it can easily take like 30 seconds to just compile something, and yeah, sometimes I'm like, come on, hurry up. Yeah, and the samples are not going to be very readable, I'm sorry for that, but yeah, if I want to call a hello world function from Objective-C, I would just use this C prefix, which is the same as if you were using C code from Go: you would also add this C prefix to the name of the function in C. So in Objective-C, I define the method here in the comments, and you have to put this import "C" right after the comments, otherwise the Go compiler will miss it, will not realize it's Objective-C code or C code, and this will fail. Then in the comments, I put my Objective-C code, so it's just void hello world with no parameters, which matches what I call here. And one important thing is that at the beginning, you have to put some special flags which the compiler is going to be using to know it's Objective-C. So there's a flag at the end which says it's Objective-C code, and there are some libraries you need to add for the Objective-C code as well. But apart from that, you do this, and you will be able to just call the Objective-C code from your function. 
So next, here are more examples of passing data from Go to Objective-C, or the other way around, from Objective-C to Go. In this case, we want to get a string from Objective-C. We will not be using a native Objective-C string, which would be an NSString. We are just going to use a C string, which is just a char pointer, and then use that from Go. So the Objective-C code has an NSString, but we convert it to a UTF-8 string, which is the encoding Go expects for strings. We make a copy, because otherwise it's going to cause memory issues. Then from Go we just get the C string, and cgo has some helpers to convert from a C string to a Go string, which we use. And then we can just print it. I could do anything I want with the string: it's a regular Go string, so I could add more stuff to it, I could do some comparison, I could check if it has some given prefix. Here, I just print it. And then, since I made a copy of the C string in the Objective-C code, I need to free the memory I used. This is the part which is trickier when you are used to Go: you have some memory management to do, either in C or in Objective-C, so you have to be careful about it when you are at the boundary between the two languages. The next example is the opposite one: I want to pass a string from Go to Objective-C. Once again, in cgo there's a helper to convert, this time, a Go string to a C string. So I tell it, okay, I want a C string for the Hello World string. Then I still have this C prefix to call my method in Objective-C, and I pass it the string. And once again there is some memory management to do, so I need to free the string which I got from C.CString. It's all documented in the cgo documentation. And the Objective-C code is getting a regular C string, and it can print it directly, because Objective-C is quite close to C and can reuse some stuff from C.
And so, yeah, this way I pass the string from Go to Objective-C. The last example is for cases where you want to call a Go function from an Objective-C function, for example if your API has some callbacks. There is one API where, when you start the virtual machine, you can give it a function to be called if there's an error or if it stops. In this case, you would like a Go function to be called from the Objective-C code to tell you, okay, there was an error in the virtual machine, and then you want to do something in Go. Once again, it's similar to what you would do for a C function. In Go, you have to say that this function is going to be exported, which means it's going to be called from C code, for example. So you add export print before it. This one is a regular Go function: it's getting a string from the C code, so it converts it to a Go string and prints it. Then the Objective-C code also needs to be made aware of the Go function, and then it can just call the print function from Go with a C string. I put that in a comment. I'm not sure it compiles as is, but it was more convenient to show in the presentation. One thing to be aware of is that you can put that in separate files as well: you could have an Objective-C file and an Objective-C header, and then you could tell the Go code to make use of them. If you add more code, it's easier to have separate files for everything. And this one is a more sophisticated, but concrete, example of calling the Virtualization framework API from Go. It's more complicated because there are some parameters to pass and everything. The important thing I wanted to show is just this void pointer return value. So this creates an Objective-C object by calling the Virtualization framework.
This is here, so, yeah, the syntax is weird with these brackets, but basically it's how Objective-C works. So I'm creating a VZDiskImageStorageDeviceAttachment, which is a Virtualization framework object, and I want to return it to Go, to keep it around and use it later. I create it, I pass it some parameters, which are converted from C strings to what I need for that function, so it's similar to what I did before. And then I return it as a void pointer, which is like an anonymous pointer in C, which I can then reuse in Go, as I show here. So this is the Go code corresponding to the previous example. Here I have this C prefix to call the method I defined just before, and I pass it the same parameters. I get this C pointer back, and in Go you have the unsafe.Pointer data type, which I can use to store my Objective-C object. Then I can reuse it in later API calls. So, yeah, this one is very big. Yeah, there is not a lot to be seen. Here, I pass the unsafe.Pointer I got from the earlier API call to another Objective-C method, and the Objective-C method is able to reuse that pointer and pass it to some more Objective-C APIs. This is basically what allows me to use the Virtualization framework API from Go. Every time I make a call from Go to the Virtualization API to get a pointer for, I don't know, a configuration object, I store the pointer to this configuration object in Go. I can run some more Go code if I want to do more work. Then, when I want to tell the Virtualization framework to create a virtual machine from this configuration, I just pass the unsafe.Pointer I stored in my Go code back to the Objective-C code, which can then pass it to the Virtualization framework and create the virtual machine. And this is how all the interaction with the Objective-C code works.
It's through these Go string and C string helpers, and through this unsafe.Pointer and some casting back and forth, that you communicate with the Objective-C layer. So in the end you have this Go code. Under the hood it's calling some Objective-C code, but it's really close to traditional Go code. You create a device storage attachment and store it (here I don't handle the errors). You create a config by reusing the attachment you created. And here there is some memory handling to do: when you no longer need the Objective-C objects, you have to release them. But even for that, you can have some calls in Go to tell it, okay, when you dispose of the Go object, I want you to call this method to also release the associated Objective-C object. So you can hide even that, to really have typical Go code and not have to realize that you are interacting with Objective-C. I put the memory management rules here, which are compiled from this URL, but I'm not going to go into details about them because time is slightly short. Basically, they have some conventions about when memory is allocated in Objective-C, and when you need to keep track of the memory and get rid of it when you no longer need it. Otherwise, a lot of APIs are just returning new pointers, objects that you don't really need to dispose of afterwards because Objective-C is doing that for you. Some other features. cgo.Handle can also be useful. It was introduced two or three releases ago in Go. It's really useful for delegates, which is what I was talking about when I mentioned callbacks before. It is for when, from your Go code, you want to pass a function to the Objective-C code which needs to be called at some later point in the future. Before, it was quite complicated to do: you needed to record everything somewhere and look it up.
Now, with cgo.Handle, you can call a Go function from the Objective-C code, and from that Go function find the Go callback that you want to call. Objective-C has this notion of blocks. If you keep the blocks internal to the Objective-C code, you can use them, there's no problem, and you can have pointers to blocks. I did not try to see if you could pass the blocks back to Go and then back to the Objective-C code; it's really a very specific Objective-C feature. Objective-C also has exceptions, and this I never tried either. I'm quite sure you can catch the exceptions and convert them to Go errors, but I never tried that. So these are things that are still to be explored regarding doing Objective-C in Go. And this one is quite nice for my use case: in Objective-C you can dynamically check if some API is available. So at runtime you check, okay, if I'm on macOS 12, then I know I can use this API and I'm going to use it, but if I'm not on macOS 12, then do something different. The compiler takes care of that for you, and this means I can build a binary on macOS 13 and run it even on macOS 11. In C this can be complicated to do, because if you are using an API which is not available where you try to run the binary, basically the binary is not going to start. So this was really nice because, for example, file sharing is not available in macOS 11, it's only available in macOS 12, but I wanted to use it. With that I can have some nice error handling and fallback: if I try to use file sharing on macOS 11, at runtime I get an error telling me it's not available, so I can just ignore the error, or print it, or do whatever I want with it. And this is about it for vfkit. All the examples and the code I tried to show are in this GitHub repository. I put some contact information here at the bottom, and if there are some questions, there are a few minutes for that.
Thank you. Thank you. Question from this weird guy again. Hello, a lot of your code was censored. Will the slides be available? Sorry, a lot of my code was...? A lot of your code was censored. It's like some of them. Yeah, I should have tested them on the projector before. But will the slides be available? Yeah, yeah, I will put them on the page of the talk. So just go to the schedule page, click on the talk, and every speaker has uploaded slides there. So all slides for today should be findable on fosdem.org/go. Any more questions over there? I'm gonna quickly run around the room. This is my fitness for this week. Can I pass by please? Sorry. Thank you. Hello, thank you for the talk. So you said vfkit makes it possible for the virtual machine to not get suspended. So how are you achieving that? No, it's not about suspending the VM. It's really that the vfkit process has to stay alive. I mean, it stays alive as long as the VM needs to be running. But if you suspend your Mac, the VM is also going to be suspended at the same time. Right, so I mean, how are you keeping the VM alive if there is no activity inside? It just does a loop: at the end of the vfkit code, I have a loop which basically waits until the VM stops. The Virtualization framework tells you, okay, the VM has stopped. And I'm also catching the signals that you could send to tell the process, okay, just shut down now, it's over. It's not really a while loop, but a select that waits for a condition to happen and tells it to exit. Okay, thank you. Thank you, and our time is up. So one last round of applause. Thank you. |
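The "select that waits for a condition" from the Q&A above could look roughly like the sketch below. It is not vfkit's actual code: the vmStopped channel is a hypothetical stand-in for whatever notification the virtualization layer delivers, and waitForExit is a made-up name.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// waitForExit blocks until either the VM reports that it stopped or the
// process receives a termination signal, and says which one happened.
func waitForExit(vmStopped <-chan struct{}, sigs <-chan os.Signal) string {
	select {
	case <-vmStopped:
		return "vm stopped"
	case s := <-sigs:
		return "signal: " + s.String()
	}
}

func main() {
	// Ask the runtime to deliver SIGINT and SIGTERM on a channel.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)

	// Hypothetical stand-in: a real program would close this channel when
	// the virtualization layer reports that the guest has stopped.
	vmStopped := make(chan struct{})
	go func() { close(vmStopped) }()

	fmt.Println(waitForExit(vmStopped, sigs))
}
```

A blocking select like this keeps the process alive without spinning, which matches the speaker's point that it is not a busy while loop.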
Go Lightning talks
Come speak! |
Hello everybody, I will talk about goeval, which is a personal project that allows you to do one-liners in Go: you just type your Go code and call it with goeval, so you can simply write a Hello World from the command line. So this is like magic. I will show you a bit how it works under the hood. The whole project is about 300 lines of code, not more. Here is an example: you can call goeval and tell it to take the code from stdin. But here is how it works under the hood. From the Go code you give on the command line, goeval generates a full Go program, and the dash E flag lets you print the Go program that has been generated. It is sometimes useful when you want to debug the syntax errors that you make. And then goimports= lets you stop using goimports, because here you see that in that code there is no import of the fmt package; it is added by goimports, which is called by goeval. So I am announcing today that goeval 1.0 was released just a few hours ago, and the new feature of goeval 1 is that Go modules are supported. With Go modules, you get locked versions for the dependencies of the code you run with goeval. This lets you share your one-liners with other people, because the previous code that I showed depended on the dependency being installed in GOPATH. And so that's it. Try it, use it, report bugs, and I'm available for questions later. Thank you.
It's, weirdly enough, not the first open source project to be released when people are in the dev room. If this is your slide, you can come up now.
Hello, everyone. My name is Keegan. I'm a staff software engineer at Element, and I've been spending the past year debugging why Go servers are slow. So hands up, who's made a CRUD application before? Create, read, update, delete. That's basically everyone in this room, which is what I thought. Who's tried to speed up their server before? This is a slow request, 3.6 seconds.
Fewer people, but still a fair number of people. Cool. Who's used pprof before? So, flame graphs. It's great. Who's used runtime trace before? Not that many people. Okay. Who's struggled to figure out what was going on when using it? Right. Okay. Great. This talk is for you. So the first thing you really need to do is use spans to make these traces readable. Very easy. If you've ever used Jaeger spans before, they're basically the same sort of thing. You create a new task, and you get a new context back. You pass the context through to new functions. You can create regions from those, and you end up with something that looks a bit like the stuff on the bottom there. You can also add log lines for some contextual information. That'll appear in the UI, which we'll get to in a moment. The crash course in using runtime trace is: you make a trace in the same way that you'd make a CPU profile with pprof, except you hit a different endpoint and you also tell it how long you want to trace for, and then you use go tool trace to open that trace. You don't use go tool pprof, confusingly. You'll get something like the bottom over here, which is quite a lot of scary words and links, and you have no idea which thing to click. The only thing you care about is the user-defined tasks. If you click on that, you'll see something a bit like this. The only thing you care about here is this goroutine view, and if you click on that, you can profile basically everything. So, for example, here's a bit of a request which is slow because of garbage collection, and if you click on any one of those Gs at the bottom, which are highlighted with the red circle, you'll see stack traces that mention GC. Also, the blue bar in the middle there says GC, so, spoiler.
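The task, region and log annotations described above can be sketched with the standard runtime/trace package as follows. The handler, the region names and the user ID are invented for illustration, and the trace is written to an in-memory buffer rather than fetched from an HTTP endpoint.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"runtime/trace"
)

// handleRequest is a hypothetical request handler annotated for the tracer.
func handleRequest(ctx context.Context, userID string) {
	// A task groups everything that belongs to one logical request.
	ctx, task := trace.NewTask(ctx, "handleRequest")
	defer task.End()

	// Log lines attach contextual information to the task.
	trace.Log(ctx, "user", userID)

	// Regions mark time spent in a particular step of the task.
	trace.WithRegion(ctx, "loadFromDB", func() {
		// the database query would go here
	})
	trace.WithRegion(ctx, "render", func() {
		// response rendering would go here
	})
}

// traceOnce records one annotated request and returns the raw trace bytes.
func traceOnce() ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	handleRequest(context.Background(), "alice")
	trace.Stop()
	return buf.Bytes(), nil
}

func main() {
	data, err := traceOnce()
	if err != nil {
		fmt.Println("could not start tracing:", err)
		return
	}
	fmt.Printf("captured %d bytes of trace data\n", len(data))
}
```

Writing the same bytes to a file and opening it with go tool trace should show the task under the user-defined tasks view. On a running server, the net/http/pprof handler exposes the equivalent capture at /debug/pprof/trace with a seconds parameter.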
Another thing: if you have slow SQL queries, you can find those as well, because if you click on any of these things, you'll see stack traces, and those stack traces refer to any point where the goroutine yields for network I/O or syscalls or things like that. So you can clearly see, oh, it's doing something with SQL, and it's just doing the same thing with SQL, for not particularly long here, only 20 milliseconds, but still, it takes a long time. You can do the same thing for profiling functions, if functions are being slow. So this is calling the same function over and over and over again, which it probably shouldn't be doing in this particular scenario, but again, it depends on your actual code as to whether or not this is the right thing for it to do. Sometimes that is normal behavior; in this case, it's definitely not normal behavior. So the TL;DR is you should probably use runtime trace next time and not CPU profiles. For me, I've sped up requests that were taking 3.6 seconds to 96 milliseconds for the same request, and the bottlenecks came from various different things, from garbage collection to poor database queries and poor computational complexity in certain algorithms, and some of these things will only be visible if you use runtime trace. Flame graphs don't help you debug slow SQL queries, but runtime trace will. Thank you very much.
Thank you. If this GitHub repo is yours, come to the stage. And you've got 10 seconds to switch laptops. 10. No. And it works, which is a miracle for Linux.
Hi. I actually didn't create a slide, and this will be the fastest lightning talk of my life. Basically, I just wanted to talk about the JSON package and an issue that we faced, and that a lot of people have faced. Basically, it's the... Have you ever used a struct with omitempty?
Then, basically, this is where the issue comes in. There is an open issue here which is trying to fix this, but it's basically abandoned, and it's a pretty big issue: it was created in 2015, and there are nearly 200 comments under it. Basically, I just wanted to draw attention to this ticket, because if someone fixes it, you can do something like what I show you in this code. So it's really hard with point. Yeah. Probably use this package, the encoding JSON. I have a struct here, which is here. Thank you. Thank you. So basically I introduced a new struct, which is a NullString or something like that, and here I added omitempty. In this case, I implemented the IsZero method here, which says that if it's not valid, then it's basically zero, so I want it removed from the JSON. But if I run the actual code... please run it. Live demo in a lightning talk. You're brave. Yes, live coding. You see that it's still here inside the JSON, whereas I wanted basically an empty JSON. And there is another implementation with exactly the same code, but where I created a pumpkin seed JSON package, which is an exact copy of the built-in JSON package. The only difference is that, as the issue I mentioned suggests, the omitempty handling of the built-in JSON package checks whether the IsZero method exists on the struct. And if I run this one, it's basically doing what it should do. And basically that's it. This is something that I think should be implemented in Go, and this ticket with this number shows actual implementations for it. Right now, most of them are not declined, but not processed either. So if anyone has a good idea how to implement it in Go, it would be nice to put it into this ticket. This is also the actual change request against the Go language that the guy made, and I just copied his code. Yeah.
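The behavior shown in the live demo can be reproduced with a sketch like the one below. The NullString type is an assumption modelled on the description above, not the speaker's actual code; the point is only that encoding/json's omitempty never omits a struct value and does not call IsZero, while a nil pointer is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NullString mimics the kind of wrapper type described in the talk.
type NullString struct {
	String string
	Valid  bool
}

// IsZero is the method the proposal would like omitempty to consult.
// The omitempty option in encoding/json does not call it.
func (n NullString) IsZero() bool { return !n.Valid }

// byValue embeds the struct directly: omitempty has no effect on it.
type byValue struct {
	Name NullString `json:"name,omitempty"`
}

// byPointer is the usual workaround: a nil pointer counts as empty.
type byPointer struct {
	Name *NullString `json:"name,omitempty"`
}

// encode marshals v and returns the JSON text, or the error text on failure.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(byValue{}))   // {"name":{"String":"","Valid":false}}
	fmt.Println(encode(byPointer{})) // {}
}
```

The pointer workaround changes the type's ergonomics, which is part of why the issue attracted so much discussion. Later Go releases (1.24 onward) added a separate omitzero tag option that does consult IsZero.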
One disclaimer: you shouldn't use the pumpkin seed JSON package in production. And that's it. Thank you.
If this is your slide, come to the stage.
All right. Hello. My name is Michiel. I created Mox. I've been working on this for quite some time. I started using it two weeks ago and released it earlier this week. It's a mail server. So I'm curious, is anyone here running their own mail server on their own domain? One, two persons? Wow. Okay. Three, room for improvement. So let's go right ahead. This is the tagline: it's a modern, full-featured, open source, secure mail server for low-maintenance self-hosted email. So let's break it down. It's modern because it supports all the latest mail standards, and quite a few have been added over the years. It is full-featured in the sense that it aims to do everything in one place, meaning all the relevant email standards. So you just need this one thing; you don't need a whole bunch of components to make a working system. It's really to make it easier. It's MIT licensed. It is secure, meaning it supports all the latest security things about email, like TLS, et cetera, and of course a bit of secure coding. And low maintenance, so that you actually start using it, because I hear many people are moving all their email to the cloud, to some big providers, because it's apparently too hard to run a mail server. So it's for your self-hosted email. Email is one of the oldest decentralized messaging protocols, but we're making it more centralized by moving everything to the few big providers. So Mox is an attempt to make it so easy that you will all start using it. So, a bunch of features, a list of acronyms. IMAP, so you can access your mail. SMTP, so you can send mail. Nowadays, if you want to send mail, you need to configure SPF, DKIM, DMARC. Does anyone know what that means? Yeah, see, that's good. Automatic TLS, so you don't have to worry about any certificate stuff. So it's like the Caddy for email.
TLS reporting, MTA-STS, that's one of the latest additions to secure email. There's a reputation-based junk filter in there, so if you receive messages from people and you don't like those messages and you mark them as junk, the next time those people send mail, it's rejected. New senders don't have any reputation, so for those it can look at the content: there's a content-based Bayesian spam filter in there. Internationalized email, so you can have smileys in your domain names, that's what you want. And auto-configuration, so you get your Thunderbird and setup is just instant. No need to worry about all the port numbers, et cetera. It just works. So, getting started. Of course, now you're all convinced you want to use this. Luckily, there's a quickstart. You just set up a Linux machine, probably, give it your email address at your domain, and you get a configuration file that has this all configured. You can just start it right after. Not only does it make a configuration file, it also prints some commands and all the DNS records that you need to create, so you don't have to think. You can just copy, paste, and be happy. Then the code: 40K lines of implementation, 10K lines of tests, so quite some test coverage. There are integration tests and fuzzing tests. It's all pure Go, no cgo, so just go install, cross-compile, all the good stuff that you get from Go. The implementation is heavily cross-referenced with the RFCs, both ways: you can go from the code to the RFC, and back from the RFC to the places in the code where it's used. This is supposed to help with maintenance. It's implementing all these protocols, and it gets a bit overwhelming to understand all of that. If you coded it once and could not go back to the specification and back to the implementation, you wouldn't know how to fix bugs, et cetera. Let's move. Oh, wow, quick. So what's next? I just released it. I'm looking for feedback.
Please use it and tell me if it works for you, or why it does not work for you. I aim to make it very simple, so if you find something that's not simple, let me know. Of course, if you find bugs, let me know. And this is where you can find it. All right. Thank you.
If this is your slide deck, you can come to the stage. If this is nobody's slide deck, I'll just skip it. Something with Postgres. If this is your 404 page which you sent to me, please also come talk to me. So yeah, the speaker is also not found. That's the thing with last-minute talks. Then I had one backup speaker. You can come to the stage. And the gophers are also falling down. They are tired. Understandable, me too. Yes, I have HDMI. I also use USB-C. Let me just close this down for you. That's 4G clicker. Okay.
So thank you, first of all. I want to tell a Go story, and why we use Go to implement this idea of fluid pull requests. Before starting with that, I need to talk a little bit about pull requests. For that, I brought Robin and Kat with me. Robin wants to contribute to a project that Kat maintains. And what everyone does, or at least tries to do: they open a branch and make the changes they have to make. Then at the end there comes a time when it needs to be merged into main. That's when Kat comes in and says, wait a minute, we need to review those changes. This kind of methodology is important for critical contributions from interested parties. It's well known in open source projects, under the name of pull requests. We also use it inside our own companies, but it's best known in the open source community. And it's quite popular: as you can see, in 2021 we got a lot of pull requests. The process goes like this: you do whatever you want to do, then the CI triggers, you get the review, you get some feedback, and then you have to apply the feedback. And we enter a loop here until someone decides that it's good to go and we get our approval.
Then it gets merged and everyone is happy. The problem here is that Robin goes through this process every time, regardless of the type of change it is. And we are unaware of the fact that Robin and Kat have been contributing and working with each other for some time. So this idea that all pull requests are the same can actually be improved. For instance, in a scenario where Robin is just making some configuration change, why do we need a pull request? Maybe we can just go directly to main without a review. Another scenario: Robin just updates an API with some documentation or some warnings. Why can't it go to main, and then we do a review afterwards? And then, when it comes to critical changes, that's when we want to stop the process and say, okay, this is critical, we need to have a very good review here. And maybe instead of just asking one person, we can ask two people for their approval. So the idea of fluid pull requests is that all that I just said can be defined in rules, and we can apply those rules to our own process and minimize the time. That's why we came up with Reviewpad, which is written in Go and is fully open source. That's where we can define all these ideas of what the rules are for our team. So here's how we could work with this terminology. Behind this is Go, of course. For instance, if my changes are all in Markdown files, I want to merge my pull request right away, so no review. If, for instance, my author is considered a new joiner, and a new joiner could be someone that hasn't done 10 PRs, like Spotify does, I want to assign a reviewer from my tech leads. And then, for instance, if I want some compliance, making sure that my pull request has an issue, I can check that and make sure that the user gets notified as soon as possible in order to iterate on it. And then we can do some more incredible things.
I want you to look at the line at the top, where we have an annotation saying that it's critical, saying that this function is critical every time someone changes it. If my code touches a function that has this annotation, then I want to trigger my pull request review for critical changes: I want to assign a label, I want to ask someone from the tech leads to review it, and I want to notify John, who is the tech architect. Okay, we had a talk this morning from Federico about reducing cognitive load, and I want to show how we could do that with this terminology. So here's how we could look at line of sight and make sure that if someone uses a lot of tabs, which means a lot of nested loops, ifs and elses, we actually send a warning to the user. Another one is error validation, making sure that they don't use strings.Contains or equals for errors, but use errors.Is. And the last one, the mysterious Boolean: making sure that no more than one Boolean is used in the function signature. That's pretty much how we could use it to make our lives easier on pull requests. Thank you all.
Thank you. The last lightning talk of the day is from me again. What do we want to talk about today? Well, two subjects, what is naming God? No, I want to first of all say a big thank you again to everyone. First of all, to all the speakers who came here today to give an amazing talk, standing here with a lot of stress to say things. I also want to thank Eva again for helping me out. I also want to thank the two FOSDEM engineers in the back who made our audio and video work all day. I want to thank the people from FOSDEM who brought me food today. I also want to thank everybody at FOSDEM. And I also want to thank all the volunteers, I think they have left right now, who helped us with video, even with what they couldn't solve today. Thank you very much. Thank you all for coming, by the way.
Thank you for staying so late. Thank you. And now my second subject, which is that Go is a garbage-collected language. And you know you can trigger the garbage collection by calling runtime.GC. So when the time is 19 o'clock, I want you all to do runtime.GC: grab some waste you see around you and put it in any of our bins. But I think Eva wants to say something. Yes. Thank you. Thanks to everyone who has been here to help. But without you this wasn't possible. So a big thank you to Maartje. And thank you for coming. Thank you. |
TEDective
Opening up European Public Procurement Data |
I'm Daniel from the Free Software Foundation Europe, and I'm sitting over there, and I also work on other projects like TEDective, which we want to be an open source solution that makes European public tendering data, or public procurement data, explorable for people who don't know that much about procurement data. I want to do a couple of things in this talk. First, I want to describe why public procurement data is interesting and why we should take a look at it, and I want to discuss some problems with how this data is currently accessible in the EU context. Then I want to show you our project for alleviating some of these problems, TEDective. And then I want to show you how you can actually contribute to the project with your company. It's still very much in the early stages, just getting going, and we love the opportunity to show this now so that you can contribute even in this early phase of the project. So what's TED? TED's in the name, so what's TED? TED stands for Tenders Electronic Daily, and it's basically a data set that's published by the EU Publications Office. They've been publishing this data for a long time: since 2015, actually, they've been providing it freely on the internet. It's data about, basically, who buys what from whom: which public institutions in the EU buy what, for what price, from which organization. So it's really data about the relationship between business and government. So if, for example, your local school or some ministry in your EU country wants to buy something that's above a certain threshold (the thresholds are defined in EU legislation, you can look them up via the link here, I will upload the slides afterwards), it needs to go into TED, and it will be in this data set. There's at least 670 billion per year in value encapsulated in this data, and there are more than 700,000 notices published each year.
They've described this entire process of public procurement in the EU. It's very great that some of you want to join. So you put things... well, great, you publish it, so what's the problem with that? Well, the way this data is made accessible is via this UI. One funny thing is this button for statistics mode: I still haven't found out what that does, what it changes; maybe somebody from the EU can illuminate. But basically, you have to really know what you're searching for in the first place in order to be able to use this kind of interface. And there are also a lot of other problems with accessing this data. For example, you can't really search by organization, which would be interesting. I mean, it's about the relationship between government and business in money terms, so why is there no option to search for organizations that I'm interested in? I can only really do a full-text search over these huge XML files, which are really complex. I can do some other stuff, but there's no typo tolerance, for example, none of the really nice search features that we are used to. And most importantly, there's no ability at all to readily visualize the results that I get. If I type something into the search mask, I get back basically just an HTML list of notices, and then I need to understand what a notice is, and which of the different types of notices I'm interested in. So it's really hard. So that makes the case, right: accessibility is really bad with this data. So why is TEDective needed? In the past, there have been a number of attempts to look at this data and transform it into a more manageable or readily analyzable format. But we weren't really able to identify a single freely available solution, published under a free software license, that allows you to explore this data even if you don't have domain expertise or data science skills. And you kind of need both now to be able to make some sense of this data.
And we thought this would be interesting. So why isn't this more readily available? So we applied to last year's EU Datathon with this idea, basically, to make this data more accessible. And this is what we told them. Let's say we have a public servant who wants to find out who buys what within their state: who buys from Microsoft in Germany, and how much they spend on software from this company. And yes, maybe make the case for how much they could save if they used free software instead. Or let's say you're a journalist who wants to investigate recent purchases made from Microsoft, or by some authority. You could do that now with the TED interface, but it would be very, very difficult, and you'd have to jump a lot of hurdles to get there. So we want TEDective to be an application that lowers the barrier of entry to analyzing this data. So we thought, let's present the Publications Office with this concept, using free software and keeping it very simple. So we built something roughly with this architecture. This was very quickly built just for this Datathon, so I'll go through it quickly. We had this XML file. I transformed it to JSON for whatever reason, which was a very bad idea. And I parsed it in Python and put it in some ad hoc schema in Postgres. And then I used the Neo4j ETL tool to put it into a Neo4j database. The data I was interested in was relational data that shows the relationship between business and government. And then I used NeoDash to visualize that. And that actually already gave people at PUD some chance to see what might be possible if you open up this data. So I'll show you a little demo of how that looked. So basically this is just an overview. I parsed data for roughly three years, or two and a half years. This shows you the activity per country. These are just some general overviews, like roughly a million tenders. And it's not optimized yet. 
You basically search for Microsoft Germany and then you have this graph. You have a geographical distribution of commercial activity that's related to Microsoft. And you get this nice graph of relationships, with Microsoft Germany here in the center as an entity. And then the yellow or red ones are tenders. So here they sold something to some institution of the German government, mostly because Microsoft Germany mostly sells to the German government. And the red ones are tenders above one million euro. And that gives you a very quick overview of the commercial activity and the relationship between government entities and business entities. You also get more information here. You can actually go to the TED website to see the notice that this is based on. A short question: you search now for Microsoft, but usually they work through these service providers. Can we get back to the challenges that you face and how you can overcome them? So here I do the same with the Polish border authority. Here it's more about who this entity bought from over the past two and a half years. You can see what kind of fence, weapons and ammunition stuff they bought. I'll skip through this, because this is actually another problem that I'll talk about towards the end of the talk. It's deduplication. So in TED data, as it's published in these XML files, there's no deduplication of entities at all. So you can have Microsoft Deutschland GmbH, Microsoft Deutschland, just Microsoft, whatever that is. And like you can see here, Microsoft Ireland, there are all these different ones. So I did some very naive deduplication attempt. I also put that data in the Neo4j graph, but there's much more to be done on that front. And it's a very interesting problem, I think, also because you need to think about it from a policy side as well. Like, is Microsoft Deutschland a different entity from Microsoft Ireland? And if yes, what does that mean for my data analysis? 
Should I analyze them together? Because they're really operating as one entity. So there are interesting questions connected to this that are not only technical. So let's go back to my... So that was obviously limited in scope, because it was really ad hoc. It was quickly made, and there were lots of problems with how I parsed the data and with this deduplication. So now we're at the stage where there's actually a lot of interest in the FSFE doing this. I've heard from a lot of people that they would be interested in analyzing this data and being able to explore it. So what's next and what's already implemented? There's the Open Contracting Data Standard, OCDS, which is something that actually came after TED was first published. I told you already that TED was first published in 2015. I think OCDS started being developed around 2018, 2019, something like that. And if you now build any kind of public procurement platform, you use this data standard, because it's just a very nice way to do it. People have put a lot of thought into how we can describe this entire process of public procurement and how we can put it neatly into a data structure. And so now we're building TEDective with this data structure at its core. And the first task will be to parse this TED XML jungle into this nicely specified OCDS. So I built a relational database that roughly captures OCDS. You see a lot of JSONB, because some things I didn't model as many-to-many or many-to-one, but JSONB for now makes it much, much easier. Otherwise this table would not have been presentable. And now, this is the graph room after all, so the next question. I think analyzing public procurement data, analyzing these relationships between business and government, really lends itself to being captured in a graph database. So this is really the core of OCDS that's interesting, and that would be interesting to model in a graph database like Neo4j. You have this tender. 
A tender is basically a public entity saying we want to buy X in some amount. And then another organization can apply for that. They're usually something commercial. They say, look, we can fulfil this tender, we apply for this tender. And that's interesting data, you know, who applies for which tender, in which regions, and stuff like that. And then there are awards. That's basically who gets the contract after all. And so that would be a very simple place to start with the graph database: have all the TED data going back to 2015 parsed into OCDS, then take this subset of what's really central, put it into the graph database and really start exploring it visually. That's what we want to do. And part of it is already done. We are currently working on parsing this XML data. We use the lxml library for that, which is really nice. And I've parsed this into a relational database, and I specify the OCDS data schema with SQLModel, which is a really cool library. It basically gives you Pydantic models and SQLAlchemy models in one entity. It's really nice to work with. And then I want to create a CSV export to be able to import that data into Neo4j, put FastAPI and some scaffolding around that, and then also build some UI. We are currently researching which framework to use, and I'm also here to find out which one would be the coolest one, so I'll stay here, because I think there will be some problems in Neo4j's data. Yeah, but there's also React Force Graph, and we would really like a nice UI that's specifically geared towards that use case of analyzing public procurement data. And yes, have that backed by these two, the relational database and the Neo4j database, and choose, depending on the query, which data source you actually use. 
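The tender, bid and award core described above can be sketched in a few lines of Python. This is only an illustration using the standard library: the `Tender` class, the relation names (`ISSUED`, `BID_ON`, `AWARDED_TO`) and the sample organizations are hypothetical, not the project's actual schema or the OCDS field names. It shows how such records could be flattened into an edge list and written as CSV, which is the kind of file a graph database import typically expects.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Tender:
    """Hypothetical, heavily simplified stand-in for an OCDS tender."""
    tender_id: str
    buyer: str
    title: str
    bidders: list = field(default_factory=list)
    awarded_to: list = field(default_factory=list)


def to_edges(tenders):
    """Flatten tenders into (source, relation, target) rows."""
    edges = []
    for t in tenders:
        edges.append((t.buyer, "ISSUED", t.tender_id))
        for org in t.bidders:
            edges.append((org, "BID_ON", t.tender_id))
        for org in t.awarded_to:
            edges.append((t.tender_id, "AWARDED_TO", org))
    return edges


def edges_to_csv(edges):
    """Serialize the edge list as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source", "relation", "target"])
    writer.writerows(edges)
    return buf.getvalue()


# Made-up sample data, for illustration only.
tenders = [
    Tender(
        tender_id="T-1",
        buyer="City of Example",
        title="Office software",
        bidders=["Vendor A", "Vendor B"],
        awarded_to=["Vendor A"],
    )
]
edges = to_edges(tenders)
csv_text = edges_to_csv(edges)
print(csv_text)
```

Keeping buyers, tenders and suppliers as separate nodes means that questions like "who applies for which tender" and "who gets the contract" become simple traversals once the edges are loaded into a graph database.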
I'll go through the rest really quickly. If you want to get on board, the documentation is still rough around the edges. I'll do my best in the next days and weeks to really make the project approachable to developers. The plan is interesting. I want to work with you and the FSFE on this. So, some key characteristics that we really want to put a focus on with TEDective: it must be free software. It's REUSE-compliant, which means that every file has the license header and the copyright header, so that it can really be easily reused. And we want to make it for the people, so a lot of my work in the next weeks will also include speaking to people who analyze procurement data and asking them what kind of questions they would like to ask, because that's really important for the design of the system. Ask the people that are later going to use it: how could this be helpful? We have done some of that, but we will do way more of that, especially now that we start building the UI. And we want it to be interoperable, so all data that TEDective uses will also be published under the CC BY 4.0 license, and there will be an open API, so that will be completely available. Obviously with some limits if it gets too crazy, but we'll think about that when the problem arrives. And also, we fundamentally believe that linked data is more interesting, because once you have this data in the OCDS format, or once you have it in a graph, you can start linking it with other data sources. Things that come to mind would be OpenCorporates, where you can really enrich the data that you have on organizations with data that's in this public database of corporate entities. 
OpenSanctions would then allow you to flag people or companies that are on some sanctions list, and stuff like the Offshore Leaks database would allow you to highlight links to offshore companies and things like that, if that's of interest for your analysis. So this would be a future possibility that I'm really excited about, but the first step is obviously to get this into a nice format, and then think about extending it. One of the challenges is cleaning this TED data, because some of it is quite old. If you look at data that was published in 2015, there's a lot of tables there, and there are these huge XML files, and apparently there wasn't much validation on the forms that were used to input this data, so it's in some places very messy. And OCDS actually helps a lot with structuring this, because it's a very well-defined standard, and there's a mapping from TED to OCDS that some people have published, so it's pretty cool. And then the next big problem that we would like help with is deduplication of organization entities, which I already kind of outlined. So if you have a good idea for that, please contribute, because I think that's really central to TEDective being helpful. So how can you get involved? All the code is on our Git instance. At the moment, you can only really contribute PRs and issues if you make an account there. It's just been a couple of weeks, so that's it for now; if that doesn't work out, we'll think about mirroring to GitHub, but let's try this first. Maybe there's federation coming for the Git forges, but it's not there yet, as I understand. There's also a website with the documentation, and then you can also write an e-mail to, this will always reach the maintainers. Yeah, and I'm looking forward to your questions. Thank you very much. Thank you. 
Regarding funding, did you try to contact the official European institutions so that you can have funding for this site, and so that it becomes like the default site for that in Europe? So I know that... Ah, yeah. So the question was whether we asked the Publications Office for funding for this. Not specifically yet. I know that they are themselves working on a huge reform of the entire ecosystem. They do what they call eForms now, which is supposed to replace what used to be the TED forms, but eForms still isn't mandatory yet. There are discussions around that, but I don't fully understand the timeline, and they're also rebuilding the TED website. We should get in contact with them. I have the contacts because we won this Datathon, and we have a technical contact there, and we should make use of it, but I was just really keen to code the past couple of weeks. It would certainly be very helpful to reach out to them. Absolutely, and this will happen. And we already got some funding because we won this Datathon. We'll use this. So the data that you will publish, is it still in the TED format or is it also OCDS? It will all be in OCDS format. Honestly, I don't think anything else makes sense. So it's the whole data that we will republish as OCDS. There's a place like opentender.eu, which was a component project, which also does this republishing of TED data as OCDS, but it's not consistent in how regularly it updates its database. It doesn't seem very actively maintained. I've got a question. When you look at these tenders and the companies involved, are you also able to extract what the actual tender is about? So is there an underlying structure? This is about, I don't know, classroom furniture, and this is about military equipment, so that you can categorize by item or by product? Yes. So shall I repeat the question? Yes. 
So the question was whether there's also data on what has been procured, and details about what was being procured by a public institution. And the answer is yes. There's usually a title that's fairly descriptive, and a description, sometimes in English, sometimes in another language. And then there are CPV codes, the Common Procurement Vocabulary, which specify what kind of category this procurement is in. But some stuff is excluded by legislation, for example military equipment. It's not published in TED. It's not open. That's why we can't really talk about open procurement in this context yet, because there's still lots of sensitive data that's not being included. Do you plan to host it publicly? Yes. We plan to host it publicly. Yes, absolutely. It's just that at the moment the API is down because I've refactored so many things. But it will be up again. Of course it will be publicly available, but if everything crashes because there's so much interest in it, then we'll think about limiting it somehow. But there's a sister from there. Exactly, yes. So we'll see if there's really that much interest in it. So what was the biggest challenge in cleaning the data? So I would say one is just finding the English translation, if there is one available, for a specific field, because that's really not laid out well in this XML: if a translation exists, where is it in the tree, and what does it apply to? Another one was languages whose alphabet I didn't know, which were hard to parse. And in general company names: for a long time they didn't have any validation on what you could put in there, which makes it really hard. And it would have been very easy to implement upstream, but now it's because of the sounds. Thank you. |
ipysigma: a Jupyter widget for interactive visual network analysis |
Okay, welcome everyone. As you can see from Adam, Jupyter notebooks are a very important tool for every data scientist, but using graphs in a notebook is a challenge, for instance for visualization. And so Benjamin will talk about ipysigma today, which is a tool to use Sigma.js as a component in a Jupyter notebook. So I'm really looking forward to that, and without further ado. Thank you. I'm actually not Guillaume; he is sick. So I apologize in advance, because I'm not the creator of the tool, and so I will do as much as I can to present it, but Guillaume can answer by email or by Twitter or any other means if you have more questions than what I can actually answer myself. And so I will just start with a brief reminder of why we sometimes want to use graphs and actually visualize them, and not only do statistics in notebooks. So why do we do visual network analysis? It actually goes way back, to 1736 and the Bridges of Königsberg, which is a classical mathematical problem that was solved thanks to visualizing the graph behind it. Later on, Moreno drew a social graph in which he tried to visualize how students in a classroom were connected. And recently, thanks to computer-assisted tools, we can do this kind of visualization with massive graphs, and we can use computation to automatically spatialize nodes on the plane and also identify clusters within them. So that brings a different means to actually analyze graphs, and actually visualizing them helps a lot with understanding. We are coming from the field of social sciences and we use a lot of graphs to interpret social issues in general. And we actually use them as maps. But they're not maps in which coordinates make sense. X and Y don't mean anything. You can just rotate the map. 
But basically what you see on the plane carries information: the position of each node makes sense compared to the other nodes. But I guess most of you know that already. That's another example of a map that was made a long time ago. So to do that, a lot of tools have been developed over the past years, starting with desktop ones. So this tool is the direct heir of this long lineage, which started with Gephi. I believe later today there will be a presentation of Gephi version 1, which will finally come out soon after so many versions already. So you probably already know Gephi. But recently we could switch from desktop analysis to actual web representations thanks to a variety of libraries. D3.js offers a way to do it. There's also Cytoscape and a bunch of others. But our community works with sigma. And sigma has been developed by people who are actually close to the people behind Gephi. I don't think Alexis is here today. But Alexis Jacomy is the little brother of Mathieu Jacomy, who will speak about Gephi. He's the one who invented sigma, and Guillaume is the co-maintainer of sigma with Alexis. So please take a look at sigma. I will upload the slides to the conference site and you will find links to all the tools around it. And then thanks to sigma, we could build a lot of Gephi-like tools, but for the web, so that we could do all those interactions directly in a web page. There's been a long history at médialab and around it of trying to build such tools. Minivan was one of them. There's also Nansi, which is a very small, very publishing-oriented way of displaying a graph with very few options, so that you can just drop your GEXF file or GraphML file and very easily publish what you did in Gephi. Retina is the one developed by people at OuestWare right now and is very rich, offering a lot of features. And soon, I think Mathieu and Mathieu will talk about it briefly also in the later talk about Gephi version 1. 
There's a Gephi Lite version that's currently being developed and that should come soon. Which brings me to this: all of those tools are very nice. They are all interactive and you can visualize, explore, publish and manipulate all those graphs, but they all require pre-processed graphs. You cannot just work on your graph while you're visualizing it. You have to prepare your file, usually JSON or GEXF or GraphML, then you load it into the tool and then you can explore it. But we would like to be able to do that at the same time. And so that's where the idea of ipysigma came from: to try and put within a Jupyter notebook a widget that would display the graph using Sigma.js. So it's really easy to install as long as you have Jupyter. You usually need a tool to work with graphs in Python. There are two main ones that you might already know about, NetworkX and igraph, and ipysigma is built to handle graphs from both. And so you just install ipysigma in addition, and then I'll just switch to the brief demo. Maybe at the seat. So we'll do two small explorations of graphs. There's the first one that we're working on right now, which is on what I call the open source, I mean, actually larger than that, open access, open world. It's like FOSDEM, but just in France, with the French communities working on that. And so we built this network of websites, linked together, of those French free software communities. And let's take a look at it. So first, I will import the packages. Then I'm reading the graph that I built already. So that's this first example. So here we have a graph with 621 nodes and 7000 edges. Let's look at a node. So the first node, to see its information: it's april.org. I don't know if there are French people in the room, but people should know that April is from France. It's the main NGO in France about open source. So we have this whole page. 
That's all the data that was collected while making the graph. And then let's try to just visualize it by loading ipysigma, importing Sigma and applying it to the graph. Here is the widget with the graph, which is randomly spatialized. We have metadata information. So we can run ForceAtlas2 on it. So very easy. You see your spatialized graph after just a few seconds, and then we can also apply some... The graph is too dense for that, and so there's no effect. So yeah, but right now it's just a graph and we don't have much information. It's hardly readable. So let's go down and try to add a few other options to the Sigma call. So we can set the node size. Let's use the number of pages for that. So here I can see that for this graph we got a lot of pages on some specific websites. Let's go a little bit further and try to adjust the sizes of the nodes. We can adjust the range of the values, for instance. Here it's more readable. Okay, so we've got sizes. Let's add some colors. So ipysigma provides some internal metrics that you can compute on the fly. For instance, Louvain, which generates clusters. And we will apply those Louvain communities as colors. So here we get a graph with colors. We see that there are a lot of communities. Knowing this network and knowing this community, I can tell you basically what this is. Here we've got the open data, open command community. Here we've got April and basically the NGOs working on open source. Here we got GIL and it's mostly a lot of software, Fedora and all the Linux distributions. Here we've got FFDN, La Quadrature du Net, and all those activists working on the open internet. And I guess here is more the... Oh, it's also a mobilization. It's a form of formigated old form of food. I'll just speak a little bit. Okay, so now that we've got this, let's try to make it a little bit nicer. We can add, for instance, some border colors. 
So it just lets you see a stronger colored border. Graphs are a little bit sexier. We can also try to do, like in Gephi, curved edges. All of those are options. I guess I'll show you briefly later on a list of the different options. Here we also changed the font of the labels. So basically you can do a lot of things. But all of that so far is mostly like Gephi. There's no real new thing. But here's something that actually offers something else. So right now we can see one graph. But let's try and see multiple ones. So ipysigma provides what we call a SigmaGrid. And so I will put in the same graph, but it will be tripled. These will be common options that I set for all versions of the graph. But then, within the grid, I will add three different versions of the graph using different metrics for the size of the nodes. So there is one on the left, the one in the middle uses the degree, and there is one on the right. Now I'm going to run this. So here are the three graphs, which are all synchronized. If I spatialize it, it happens at the same time. If I hover over a node, I will see it in the three different versions. And then if I zoom a little bit, I guess we can see that... Wow. What can we see? We can see that Framasoft, for instance, is very connected globally, but especially it has a very strong in-degree and not so big an out-degree. Why is that? Framasoft is such a reference in France for open-source tools that it gets a lot of links from the whole community. And all websites of the free software community point to it, because it's like a resource. Whereas, of course, it cannot point to the whole rest of the community. On another note, I guess we could find... I think there was LinuxFr... LinuxFr.org is the opposite. It's a media outlet that pretty much talks about anything that happens around open source in France. And, of course, they're the ones having the most outlinks. All right, so that's just a small example. 
Then I can show you maybe another notebook that will show other things. So this one is a notebook that was built out of data collected by Laura Miguel, who is a trainee at médialab right now. She scraped the FOSDEM website, the agendas, to try and get all speakers and rooms over the past 15 years. So here we will have to build the graph progressively. We just had a CSV that she scraped, with the speakers and the room of each talk. Disclaimer, the speakers have been anonymized. So you won't find a name that you know, but they represent actual people. So let's take a look at, for instance, three examples of the data. So those are the first three lines. I mean, that's one line and two other lines that I picked specifically. This one is one speaker, and she gave a talk within this track. Here it was a talk that was shared between two speakers. So sometimes we get speakers separated by a pipe. And here is obviously someone who was still anonymized, but who should be in my seat right now, and who did many talks in the past, including in this room. So we will build the graph using NetworkX. For those who know NetworkX, it's quite simple. You just create a new graph, and then for each row in our CSV, if there's no speaker, we don't take it. Then we take the track and the year. We add a node for each track, and for each speaker of the talk, we add a node for the speaker. And then we add an edge between those two, and we increment its count if it's the second time that we meet it, for instance. And we also update the year to get, for the edge, the last year in which it was used. So by doing that, I built a new graph that has 5,000 nodes and 6,000 links. Let's take a look at my anonymized speaker here. So he is linked, in year 2018, to two talks in the Graph room. Yeah, he spoke twice in the room, back then. In JavaScript in 2019, and in 2020 in the Open Research Tools and Technology room. 
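The graph construction described above can be sketched roughly as follows. This is not the notebook's actual code: the column names (`year`, `track`, `speakers`), the edge attribute names (`count`, `year`), the `part` node attribute and the sample rows are assumptions made for illustration. Only the general recipe (skip rows without speakers, split speakers on a pipe, one node per track and per speaker, a counted edge that keeps the latest year) comes from the talk.

```python
import csv
import io

import networkx as nx

# Made-up sample in the assumed CSV layout; real data would be read from a file.
SAMPLE = """year,track,speakers
2018,Graph,speaker_1
2018,Graph,speaker_1|speaker_2
2019,JavaScript,speaker_1
2020,Lightning Talks,
"""


def build_graph(rows):
    """Build a bipartite speaker/track graph from talk rows."""
    g = nx.Graph()
    for row in rows:
        if not row["speakers"]:
            continue  # talks without a speaker are skipped
        track = row["track"]
        year = int(row["year"])
        g.add_node(track, part="track")
        for speaker in row["speakers"].split("|"):
            g.add_node(speaker, part="speaker")
            if g.has_edge(speaker, track):
                edge = g[speaker][track]
                edge["count"] += 1  # another talk by this speaker in this track
                edge["year"] = max(edge["year"], year)  # keep the latest year
            else:
                g.add_edge(speaker, track, count=1, year=year)
    return g


graph = build_graph(csv.DictReader(io.StringIO(SAMPLE)))
print(graph.number_of_nodes(), graph.number_of_edges())
```

Storing the node kind in an attribute such as `part` is what later makes it possible to color tracks and speakers differently, since the graph is bipartite.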
So let's take a look at this graph now. Oh, it's broken. Yes, there's a comma missing here. Here it is. Still, I tried to add this earlier, but I'm not expert enough with it, so I'll remove this. So here it is. So this time, it's a bipartite graph, since we've got two kinds of nodes, the tracks and the speakers. So I decided that the node color will be based on the node type. And if I take a look at it, we should see that all the big dots in blue are the rooms at FOSDEM, and all the pink ones are actually speakers. And so we can see that there are a lot of lightning talks, of course, every year, but there are some rooms that have way more speakers than others, probably also because they have existed for way longer. So maybe we can try and explore that, and that's the main idea. So sorry, I don't remember what this one is. Let's just run it briefly. I guess it's the same. Yeah, it's the same. Sorry, it's a copy-paste. So what we could do is try and apply other things. So let's do a grid again. But this time, we'll try and display for each node a gradient of color that will indicate the intensity of the node at a given moment. So to do that, we will, for instance, take a look at the year 2012 and the year 2022, and set the strength of the halo depending on how many talks were associated with this node for this specific year. So both graphs should show the intensity of the talks during those two years. So let's show it again. And here we can see that in 2012, the main rooms that were filled were actually more on desktops, Mozilla, Lightning Talks, and Embedded, whereas in 2022, there are way more rooms that are actually filled with talks. Then what we could do is continue working on our graph and continue exploring while working with it. So at médialab, we also have a tool called pelote, which allows us to do a bunch of metrics and calculations on graphs. It's already installed, so it goes faster. And for instance, it can do a monopartite projection out of a bipartite graph. 
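As a rough idea of what a monopartite projection does, here is a minimal sketch using NetworkX's own bipartite module rather than the médialab library mentioned above, whose API is not shown in the talk. The node names and edges are made up. Two tracks end up linked when they share at least one speaker, and the edge weight is the number of shared speakers.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Small made-up bipartite graph: tracks on one side, speakers on the other.
b = nx.Graph()
tracks = ["Graph", "JavaScript", "Embedded"]
b.add_nodes_from(tracks, part="track")
b.add_nodes_from(["s1", "s2", "s3"], part="speaker")
b.add_edges_from(
    [
        ("s1", "Graph"),
        ("s1", "JavaScript"),
        ("s2", "Graph"),
        ("s2", "JavaScript"),
        ("s3", "Embedded"),
    ]
)

# Project onto the tracks: the result contains only track nodes.
rooms = bipartite.weighted_projected_graph(b, tracks)

for u, v, data in rooms.edges(data=True):
    print(u, v, data["weight"])
```

Note that a track whose speakers never appear anywhere else, like `Embedded` in this sample, stays in the projection as an isolated node.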
So I'm just running this, and then we can try and display it. And here, in just a few lines of Python, I can see the other graph, the monopartite version of the graph, and see the links between the rooms depending on whether they share speakers. Let's continue. And the problem is that if I look at this graph, I can see there are a bunch of isolated nodes. And usually when I want to visualize a graph, those are a bit annoying because they take a lot of space in the visualization, and I don't want to see that. So let's just use pelote's crop to largest component function, which will keep only the biggest component of the graph. So then I can re-display this graph without all of those single nodes. And that's a rough idea of what could be done. Then we can work with the graph and just visualize it on the fly. And I guess I'll conclude by just showing the GitHub page of the tool. There are all the visual variables that are available. So I showed you node color already, but you can also play with the saturation of the nodes. You can play with the size, as we saw, but you can also play with the label size and, of course, the label color. You can adjust the border ratio, how big it is. So basically all the visual means that help you better interpret your graph are offered. You can also add pictograms and use shapes for each node. You can use halos like I showed earlier. And a lot of those also apply to edges. So you can play with their colors, their form, and so on and so on. And I guess that will be it for me. And I will take all of your questions. Sorry, I'm just scrolling back to things that are nicer. All of you. Yes. Can you preserve the layout between the different steps, so that you don't have to execute the layout every time you go to a new cell? That's a good question. I don't think it has been planned yet. Can you repeat the question? Yes, sorry. 
So the question was, can we maintain the layout from one cell to the other and not have to re-click to apply the layout every time? I don't think so. What I know is that the layout, the way ForceAtlas works, has some randomness. But here it's always instantiated with the same seed. So whenever you run it, it will always generate the exact same layout. So that's something. But it won't reuse the one from the previous cells. No. That could be an idea. Yes? Do you have any numbers on the upper limits of this system? And the size of the graph that you can render here? So can you run the values of noting this one or the limits? So the question is about volume and how big a graph we can display with this. So I believe the limit is actually the one of your browser. So it will depend on your GPU and your CPU and your RAM. But I know that Sigma.js properly handles graphs with, I would say, hundreds of thousands of nodes and links. I know I already displayed one with a few million links and hundreds of thousands of nodes. It takes a bit more time, of course. Do you support something like collapsing nodes and expanding them? For instance, in these kind of power graphs where the communities could collapse if you want to put height once, and they could be selectively expanded as well. So the question is, can we aggregate and split nodes that have, for instance, the same group? For me, I don't think it's built into Sigma, for sure. Maybe in pelote, the library I was showing: the monopartite projection is pretty much this kind of idea. I don't know, but it might be in pelote. Yeah. Does it use the GPU? Yeah, Sigma.js, sorry. So the question is, does this use the GPU to display the graph? Yes. Sigma.js relies heavily on WebGL. The previous version of Sigma.js let you choose between Canvas and WebGL. Right now, it's only WebGL, so it won't work with all browsers. But nowadays, most browsers know how to work with the GPU. So, yes. 
Thank you so much. Thank you. |
A case for DAG databases
Correlating revision history with CI results |
Looking at this from the angle, how can I manage such a large graph in a good way, and moving forward to that, so Nikolai. Thanks for the introduction. We have to speak up because the audience only for the- Okay. So my name is Nikolai Kondrashov. I work at Red Hat on the CKI project, which is building one of those Linux kernel testing systems for Red Hat and for upstream. I also work in the kernel, louder? Okay. I also work with the KernelCI upstream community on the KCIDB project, which is the source of this presentation, and I do electronics and embedded as a hobby. Okay. So I'm going to walk you quickly through the kernel contribution workflow, through the testing systems, then what we are trying to do with KCIDB at KernelCI, and then how we want to solve the problem, and what the actual problem is with the kernel CI process in general. Then I'll go briefly through the data model, a few queries that we need, how it went with Neo4j, and what we can do instead. So, the kernel contribution workflow. I don't know if everybody's familiar with that. I hope not, because it's not very pleasant. But basically, you make your changes, you commit your changes, then you make an email out of that and send it to a mailing list and to a maintainer for them to review and give you feedback. Then you repeat that until everybody's satisfied, including the maintainer and whoever is concerned with that change. After this, your patches get merged into a subtree for the particular subsystem that you were changing, and then sometime later, this gets merged into the mainline, which Linus maintains, and you're basically done. But at any point in that process, you can get some test results for your change. If you're lucky, you can get them before it even gets reviewed, or sometime while it gets reviewed, or after it was merged, any time. So there's a whole bunch of kernel testing systems; this is just a sample.
Each of them is trying to solve their own problem. For example, CKI is a Red Hat system; they test particular hardware that our customers use and particular features that our customers request, to make sure that they work and that the distribution works. Intel tests their hardware, their graphics cards, and makes sure that those work. Google fuzzes system calls with syzkaller and syzbot. LKFT from Linaro tests ARM boards. And finally, KernelCI is aiming to be the official CI system for the Linux kernel; it's supported by the Linux Foundation, and they're trying to run tests on whatever hardware others can provide. You can see everybody has their own interest in that game. So this is how the various email reports from those systems can look, and these are the dashboards from the different systems. So KernelCI, as I said, is striving to be the CI system, and we have a testing system, the hardware management, the framework and everything to run the tests in various labs. These labs can be located on different premises, run by people who have some hardware to run the tests on, and then that gets collected and put into the database. Then we have various other CI systems collecting their results and sending them to the KCIDB database. KCIDB was conceived as a system to try to reduce the effort that all CI systems have to put into their dashboards and their reports, and instead have one dashboard and one report if possible, or close to that. It is also meant to save the developers' attention, which is a precious resource, because as you see, it's not so easy to investigate every report from different CI systems: they are differently formatted emails, different data, different dashboards, you have to look at them this way and that way, and you have to figure it out. So that's KCIDB: the effort to bring all of this into one place.
So conceptually, it's very simple: the CI systems send JSON, which can contain various objects in any combination; we have the database we put them in, we have the dashboard to display that, and we have a subscription system where you can give some rules and say, okay, I want to see these results from this test and from this tree, or for this architecture, or whatever, and we can generate the reports based on that whenever you need them as the data comes in. One important note about this is that, compared to a regular CI system where you control everything, in this system the data can come in in any order. In a regular CI system, the results come in the same order as the commits come in. So if you tested something earlier, that means it's for an earlier commit; if you tested something later, it's for a later commit, and you can have a line of history with those results. But for KCIDB, since there are various different CI systems, the results come in any order you wish. So we have about 100,000 test results per day, a few thousand builds, and hundreds of revisions tested per day, received by the KCIDB database. Well, actually, I think, yeah, that's correct. That's the correct scale. So it looks something like this; this Grafana is a prototype dashboard. We're thinking about building a new one, but I don't know how soon that's going to happen. So graphs, tables, all that jazz. Our prototype reports look like this. So what's the problem with kernel CI in general, not with KernelCI, the project? First of all, the kernel is intended to be an abstraction layer for hardware. That's its whole purpose, to make it easier to write software. So in theory, to make sure that it works, you have to test it with every piece of hardware that you abstract away. But that's not possible, of course, and hardware is expensive, so there's a natural scarcity in this whole system.
Then the tests: since you cannot get all the hardware at the same time, and you cannot possibly run all the tests on all the hardware for every commit that people post, it means that sometimes the tests run on this hardware, sometimes on that hardware, and sometimes they don't run. The tests themselves are not so reliable either, because there's a lot of concurrency management in the kernel, and that's hard to get right, and in general, things happen at the same time in the operating system. So you can get a pass on your change even if it's broken, or get a fail on your change even if it's not broken, or even if it's somebody else's change that broke it. Basically, hell. So it's hard to remove noise from those results, and for developers, it's hard to investigate even a valid failure. Well, it's a kernel: you have to meet all the conditions, and sometimes you have to get the right hardware, or ask people for the right hardware, or ask them to actually run the test and send you results, which, you know, over email takes a while. So if we start sending people emails with results that are not valid, false positives and false negatives, then people kind of get pissed because of that, because it takes such a long time to reproduce them. So a lot of CI systems resort to human review before sending those reports: they see the failures, they say, okay, let's send this to this mailing list, and then they send them. Only a few manage without that so far. And obviously, nobody stops the development to fix CI, because there are just so many developers, and if one subsystem breaks something, another subsystem doesn't want to care about that, and the feedback loop is just too long. So tests keep running, keep failing, and it takes a while to fix them.
So instead of the ideal case where you only move on if the tests pass, and then go through all the stages, like a review, and then it's merged, and it's tested, and it's fine, and then you can upstream it, you get something like this, where all tests fail: okay, it's probably not our problem, we don't have time to investigate it, or we just didn't get any test result with the new one. So what we're trying to do is, we've got to fix this, right? We've got to fix the test results. So we fix the test result. We look at the test output, conditions, et cetera, and we add a rule to the database saying, okay, this failed, but we know about this, here's the bug that was opened, so don't complain to developers, don't waste their attention. And it looks like this, shiny and sparkly, but after a while, we get this fixed in the test, and we repeat the process with another issue. These things are already working in separate CI systems like CKI. There's a UI screen for an issue in the kernel; it says, okay, look for this string in the test output, and if you see it for this test, then we consider it a kernel bug and don't raise the problem. Okay, or CI Bug Log, from Intel's CI system: they have a huge form. There you can see another string that you're supposed to look for in the error output, and the conditions, and what kind of status you want to assign to the test, et cetera. So here's a dog tags for you to take a breath, and for me to take a drink. So I'll dive into the model.
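The known-issue rule described here (look for a given string in the output of a given test, and if it is there, attribute the failure to an already-filed bug) can be sketched in a few lines of Python. This is only an illustration, not the CKI or Intel implementation: the names `Issue`, `Result` and `match_issue`, the field layout, and the example bug URL are all assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Issue:
    """A known problem (hypothetical shape): which bug, which test, which output."""
    bug_url: str           # link to the report that was opened for this problem
    test_name: str         # the test this rule applies to
    output_substring: str  # string to look for in the test output
    culprit: str           # "kernel", "test" or "framework"


@dataclass
class Result:
    """One test result as it arrives from a CI system (hypothetical shape)."""
    test_name: str
    status: str  # e.g. "PASS" or "FAIL"
    output: str


def match_issue(result: Result, issues: Iterable[Issue]) -> Optional[Issue]:
    """Return the first known issue explaining a failed result, or None."""
    if result.status != "FAIL":
        return None
    for issue in issues:
        if (issue.test_name == result.test_name
                and issue.output_substring in result.output):
            return issue
    return None


known = [Issue("https://bugs.example.org/123", "ltp.mmap01",
               "BUG: unable to handle page fault", "kernel")]
explained = match_issue(
    Result("ltp.mmap01", "FAIL", "... BUG: unable to handle page fault ..."),
    known)
unexplained = match_issue(Result("ltp.mmap01", "FAIL", "timeout"), known)
```

A failure with a matching issue would be reported as already known instead of being sent to developers, while an unmatched one (`unexplained` is `None`) still needs attention.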
We start with checkouts, which basically just specify what kind of revision you're checking out: which repository and branch we have taken it from, which commit, whether you have patches applied on top, the patch log and everything like that. Then we aggregate that to get the revision data: multiple checkouts of the same revision give the same single revision. Then we have builds, which link to the checkouts, to say, oh, we just tested this checkout, and therefore link to the revision. The builds describe the architecture, compiler and configuration, output files, logs and everything. And we get the test results finally, and yeah, builds can fail; we have failed builds all the time and it stops nobody. For tests, we have which test we were running, the environment it ran in, what kind of result it was, the status, pass, fail, et cetera, and the output files, logs and stuff like that, very typical. Then we get the issues, which describe which bug it is and who is to blame, the kernel, the test or the framework, and we will have the pattern there for matching the test results: okay, this test, this output, what you saw on that screen, the status that it should have, and the issue version, because we want to change those issues over time. Finally, we have the incidents, which link those builds and issues together, saying, oh, this is the issue with this build, and things like that. So that's all we keep in the relational database, but then we've got to talk about the revisions. Revisions could be just a commit in Git history, and here's your graph. So that's the basic thing that we've tried to do, but we also need to have revisions with patches applied on top. Somebody posts the patch on the mailing list. We take it, apply it to some commit, which it points to, and we test it, we get the results, and we know it was applied to this commit.
Then somebody reworks that patch and posts a new version, and that gets a link both to the commit we tested upon and to the previous revision of the patch set. Then there is this weird thing where maintainers keep a special branch for CI, for the testing systems to pick up their work, test it and send them results, and they just keep pushing there: they're working on something, they push there, they get results after a while from testing, then they push a new version, and then they get new results. And we've got to say, okay, this is the Git commit history, but we also know that we checked this branch out previously, so this is the child of that branch, of that previous revision. That's basically it. Well, as you probably all know, this is a directed acyclic graph, so it has directed edges and it doesn't loop on itself. So that's about what I know about graphs. So bear with me. Finally, I think that there are just too many build and test results to put them all into a graph database, at least so far. I might be wrong, but that's my idea. We obviously need to keep the graph of the revisions to be able to reason about them, but we might be able to put issues there as well, in the same database, if it saves us something. So this is just a short list of what we want to know. As the data comes in, the test results, you've got to try them and match them against the issues. So we can say, okay, we found an issue here, so don't raise the flag, or something like that. Similarly: okay, there is no known issue on this test result, so we want to raise the flag, because there's actually an issue. We cannot possibly try all the issues against all test results, because there's going to be a lot.
So we have to build a priority for those issues, and then we have to cut off that priority somehow, and say, okay, at this moment, we can tell the developer that we've basically triaged these results, you can go take a look, but we can still continue and try those issues as time goes on. One of the criteria that we might need to base that on is how far, for example, that revision is from the current situation. If this issue only appeared somewhere, I don't know, 1,000 commits ago, or, 1,000 is not that much for the Linux kernel, okay, 10,000 commits ago, then we don't need to try it right now. We can tell the developer, okay, it's fine, and then we'll go and continue trying it, and if we find something, then we can raise the alarm. Then we can ask what the last X test results were, for this particular test, for this number of commits, to be able to say, okay, this test wasn't failing often, it was failing sometimes, but that's okay; but if it suddenly starts failing more often, we've got to raise the alarm, or if it stops failing so often, we've also got to raise the alarm and see what changed. Then we need to track the performance trends, of course, over the history of the development, and once again, we cannot do this just based on time, because some systems move at a different speed, and some systems might decide, okay, we're going to test this old branch because some of our clients want to base their BSP on it, want to release some software with that kernel, and we've got to start testing it, and data starts coming in for last year's release or something, and we cannot just take that data into account for testing the current releases, or vice versa.
Or, for a stable kernel maintainer, if Greg wants to release a branch, he might want to see which issues were discovered in this branch starting from the previous release. And finally, yeah, just for the dashboard: okay, I want to see issues in this branch, or which branches contain this issue. So, that's what we tried to do with Neo4j. I did basic things: I wrote a little script to get the Git log in a particular format, and then generate the data for commits and for relations. It was a little over a million commits, which look like this, and a little more relations, because as you probably know, a commit can have more than one parent in Git, and it looks like this, very simple. So, I loaded this into Neo4j with something like this. This is updated to the latest release; it was different then. I created an index for hashes and then loaded the relations, and it worked fine, but a few days ago, when I tried the fresh Neo4j release, it just hung like this forever. So I could not give you fresh data on how it works right now, but I tried it last year, and I couldn't get an answer to a simple question, whether these two commits are connected. It would just go on forever, then run out of RAM. But with APOC, I could do that. I could get the answer. It was okay, but if I wanted to get the nodes between those two commits, it would do the same thing. But with Git, I complete that in milliseconds. So, here you go. I think the problem, well, in my opinion, is that graph databases and software are aimed at the general graph problem, and not tuned to DAGs. As for how Git does that, Git is tuned to DAGs: they have a lot of optimizations for that, and there are tricks to make repositories like the Linux kernel work. So, I don't know anything about how you do this. This is magic to me, and this would be new to me in this book.
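To give a feel for why restricting yourself to DAGs helps with the "are these two commits connected" question, here is a small Python sketch. It is not Git's code and not a Neo4j feature; it only illustrates one DAG-specific trick, generation numbers, which Git's commit-graph file also stores to prune history walks. The function names and the tiny `history` example are assumptions made up for the illustration.

```python
from collections import deque


def generations(parents):
    """Generation number: 1 for a root, else 1 + the largest parent generation.

    `parents` maps every commit to the list of its parents (roots map to []).
    Only well defined because the graph has no cycles.
    """
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            node = stack[-1]
            if node in gen:
                stack.pop()
                continue
            pending = [p for p in parents[node] if p not in gen]
            if pending:
                stack.extend(pending)  # compute the parents first
            else:
                gen[node] = 1 + max((gen[p] for p in parents[node]), default=0)
                stack.pop()
    return gen


def is_ancestor(parents, gen, ancestor, descendant):
    """Walk from `descendant` towards the roots, looking for `ancestor`.

    Every ancestor of a commit has a strictly smaller generation number,
    so anything below gen[ancestor] cannot lead to it and is skipped.
    """
    seen = {descendant}
    queue = deque([descendant])
    while queue:
        node = queue.popleft()
        if node == ancestor:
            return True
        for parent in parents[node]:
            if parent not in seen and gen[parent] >= gen[ancestor]:
                seen.add(parent)
                queue.append(parent)
    return False


# Hypothetical history: "d" is a merge of "b" and "c"; "e" branches off "c".
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["c"]}
gen = generations(history)
```

With these definitions, `is_ancestor(history, gen, "a", "d")` is true and `is_ancestor(history, gen, "b", "e")` is false; on a long history the pruning means a query between two recent commits never has to walk back to the very first commit.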
But from a purely engineering perspective, I would have liked to see something like support for databases that are restricted to DAGs only, and that apparently could be done with not so much computation. Then, once you have that, you can do some branching and say, okay, if we are a DAG database, then we can do the optimizations and do the fast thing with them. So, the fallback plan is obviously to just put everything in Git: put the commits, and the patches, and all the branches, and all the subsystems; it's going to be a giant repo. Maybe we can manage that, and then query it with libgit2, which is the library that Git uses to work with the data. Then, well, shuttle the commits with the relational database. Okay, we want to see if between those releases we have issues, and we take the commit hashes from Git and then query the database with that. That's all. Thanks. So, we can help you with the Neo4j things. It's just like literally this string, this length. No, it's text index is for full text back for. Okay. Well, it was just this one thing. So, do you have the data somewhere to try it out? Of course. Of course. There's a link from the slides to the script that you can use yourself on any Git repo. Yeah. Any more questions? Yes? Did you try any other graph databases? Well, the question is, did I try any other graph databases? Yeah, I looked at a bunch of them. Some of them require so much setup that I was just floored, but I read the documentation. I couldn't see any indication that it would be any different, because nobody says anything about DAGs, any optimizations or anything. I tried Memgraph before this talk, but I had the same problem with loading revisions, I think, for some reason. Because previously, I could load revisions. I guess in Neo4j, the syntax for indexes has changed since then. Maybe I didn't create the index correctly, as I was just hinted at.
But I could load them in reasonable time before in Neo4j, and everything was fine, like in querying, except that thing. In Memgraph, I just hit the wall; because it's a slightly different syntax, it was slow. But yeah, no such luck, and it took like four gigabytes of disk space. So, not too bad, okay. What version of Neo4j was successful? I don't remember now. I think it was, if I take a look now, I think I- The version will also be successful, it's just research. I tried Neo4j Desktop 1.4 before, 1.4.15, and that worked. I don't know which Neo4j version was included. Any other questions? Thank you so much, Nikolai. Thank you. Thank you, everyone. I'm still looking forward to working with the data in a graph database, because I think that's actually good for the graph database. And so we can make it work, and then, Dexter, you can come back and do some large-scale analysis on the data. Okay, that would be great. That's what you can do. Yes, thank you. Thank you. Thank you. |
Visualization paradigm that will (potentially) replace force layouts
Visualization paradigm that allows an effective arrangement of the graph, through the use of AI |
The next talk will be on visualization again, and this time we're looking at layout algorithms. So if you have a large graph, computing a good layout for the graph is actually computationally expensive, and also hard, and oftentimes we end up with hairballs, as you've seen in the Sigma example, but there are other approaches as well, and so I'm really excited today for an ML-based approach, right? So we've all seen that ML models are taking over more and more of our jobs, so we can all just relax all day and not do anything anymore, because our ML overlords will take care of everything. So I'm really glad that Simone and Tomaso are here today to talk about a different approach to graph layouts, and so very much welcome, and enjoy the talk. Thanks very much. Can you hear me in the back? Okay. Good morning to everyone. Thank you for being here. Let me present myself briefly. My name is Tomaso. I'm here with Simone. We are two front-end and data visualization- So there's no amplification, so speak up so the last rows can hear you. Okay. It's okay. We are two front-end and data visualization developers for LARUS, a company from Venice, Italy, and today we are here to talk about a new artificial-intelligence-based approach to graph visualization discussed in recent papers. So let's start. Graph drawing is a very wide and vast field of computer engineering, so if we observe its evolution during the last 70 years, we can find many graph visualization algorithms. We can find hierarchical techniques, radial layouts, orthogonal layouts, geometric methods, and also, of course, force-based approaches. Typically, most of these algorithms work by applying some kind of heuristics or models or geometric relationships in order to achieve the final visual result. For example, we all know that force-directed algorithms work with physical models to unfold the graphs.
Yeah, force-directed algorithms are those that have probably become the most popular over time. This has happened for many reasons. They are very easy to use, easy to implement, can be used with all types of graphs, can be parallelized, and many other reasons. But they present, for me, of course, a limit in terms of design control. If we run our force-directed algorithms, it may not be easy to impose design constraints on the layout. We can only be sure that the graph will converge in order to find a balance between the forces, of course, but we cannot control which graphic features, which visual features, will be improved in the design. To do this, in recent years, the growing use of artificial intelligence models to solve problems has led to the creation of a new family of graph visualization algorithms. These algorithms work by exploiting the concept of a cost function. A cost function is basically a mathematical function that allows us to measure how far a given system deviates from an ideal state. So if we have a graph composed of blue nodes, any cost function related to that graph will take the nodes' x and y coordinates on the screen as input and return a number as a value. That number tells us how much the graph respects the graphic feature encoded in that cost function. And indeed, a cost function can be fully established and formulated by the programmer, the developer. Theoretically, we can encode any graphic feature in a cost function. So the question is, how can we formulate cost functions related to graphs? To answer this question, we have to ask ourselves which graphic features we want to improve. One way to do this is to observe a bad layout. If we look at the image, we can immediately notice that the topological distances, for example, are not respected. We can find pairs of nodes two hops apart that are seven times more distant than pairs of nodes one hop apart.
This is certainly a negative aspect of this design in terms of use of space. In addition, if we look at the topological node, we can see that the angles are not uniform across the graph. The structure is not homogeneous. The same goes for Euclidean distances. And finally, there are also occlusions of nodes, known to be the element that most compromises the quality of a layout. All these observations can be encoded in a cost function. So let's see an example now. For time reasons, we will talk only about a single cost function. Specifically, we will talk about the topological distances. If we look at the tree in the image, it makes sense to think that the Euclidean distances between the pairs of nodes should be somehow proportional to the topological distances, the lengths of the shortest paths. This is valid not only for tree graphs, but in general for all types of graphs. So our goal is to formulate a cost function that gives us a measure of how similar the current Euclidean distances between pairs of nodes are to the topological distances. As in any artificial intelligence problem, we have to follow a data-driven approach. So we need a source of data that tells us what the ideal state of the system is, in order to train our model according to it. In this case, our data source is a matrix, the topological distance matrix, so we know the length of the shortest path between each pair of nodes. With this data source, we can compute, for each pair of nodes i and j, the quadratic deviation between the current Euclidean distance and the real topological distance of nodes i and j. And finally, we can sum the contributions of all pairs, sorry, and build a single cost function in two variables that gives us a measure of how much the graph respects the topological distances. Once the cost function has been formulated, we can optimize it by running an optimization algorithm.
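The cost function just described (for every pair of nodes, the squared difference between the current Euclidean distance and the shortest-path distance, summed over all pairs) can be written down in a few lines. The Python below is a minimal sketch under stated assumptions: it is not the code from the papers or from the speakers' web application, the names `hop_distances` and `stress` are made up, and it uses the plain unweighted sum, whereas published stress functions often add a per-pair weight.

```python
import math
from collections import deque


def hop_distances(adjacency, source):
    """Shortest-path length (in hops) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def stress(adjacency, positions):
    """Sum over node pairs of (Euclidean distance - topological distance)^2.

    `adjacency` maps each node to its neighbours (undirected graph),
    `positions` maps each node to its (x, y) screen coordinates.
    Pairs in different connected components are skipped.
    """
    nodes = sorted(adjacency)
    total = 0.0
    for index, i in enumerate(nodes):
        topo = hop_distances(adjacency, i)
        for j in nodes[index + 1:]:
            if j in topo:
                euclid = math.dist(positions[i], positions[j])
                total += (euclid - topo[j]) ** 2
    return total


# A path a - b - c, drawn once as a straight line and once folded.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = stress(path, {"a": (0, 0), "b": (1, 0), "c": (2, 0)})
folded = stress(path, {"a": (0, 0), "b": (1, 0), "c": (1, 1)})
```

For the straight drawing every Euclidean distance equals the hop count, so the cost is zero; in the folded drawing nodes a and c are two hops apart but only about 1.41 units apart on screen, so the cost is positive. An optimizer then moves the coordinates to push this number down.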
So we can run vanilla gradient descent, stochastic gradient descent, momentum, Adam, and many other algorithms. We know that the cost function variables consist of the nodes' x and y coordinates on the screen. So if we optimize that cost function, we are moving the graph in order to find a minimum or a maximum of that cost function. So the graph will move in order to respect the topological distances. We have linked the papers on these artificial intelligence methodologies, and the authors have provided a tool to show these algorithms. We can change, for example, the type of graph. And we can see, in this case, we have the stress loss function, cost function. And we can combine many cost functions by applying a linear combination of all these cost functions. Well, in this introduction, I have talked about the methodologies. But our goal here today is to present contributions in 10 or 13 months. So I give the floor to Simone, who will talk about our contributions in our web application. Hi. When you design a layout, it must be analyzed on the basis of two terms: effectiveness and efficiency. Effectiveness, covered by Tomaso, is the ability of a layout to highlight important structures in the graph and ensure that they are understandable. Efficiency, which I will talk about, aims to visualize as many elements as possible while guaranteeing interactivity. This is very important nowadays, where everything is characterized by big data. Here we can see the results obtained with the classic solution explained by Tomaso, with only the stress function, not the whole of the other ten, but only the stress one. As we can see, with fewer than 3,000 nodes we can't guarantee interactivity anymore, because the layout can no longer perform at least 15 iterations per second. Keep in mind also that in our web application this is a CPU-intensive task, and it makes the application unusable for the entire time.
Our target for this project is to allow the visualization of as many nodes and edges as possible through the CPU and parallel programming, so parallel programming on CPUs, in our web application. Let's see together how the algorithm is composed. We are using only the stress function. So the first thing to do is create the topological distance matrix. Then, until we achieve the goal, for every iteration we calculate the gradient and the node positions. To calculate the gradient, we traverse the topological distance matrix. For every pair of nodes, we calculate the partial derivatives from the Euclidean and topological distances. And in the end, we update the node positions. Very simple. But as can be seen, this step of every iteration takes quadratic time, and you have to perform it in every iteration. The idea is to split this calculation of the gradient across various threads. If you would like to create a multi-threaded solution, you have to consider at least two aspects: the memory you are sending every time to each thread, and the load balancing across threads. To optimize memory, we saw that the topological distance matrix is symmetric, so it's divided into two triangles, the upper and the lower, and we use only one triangle. For load balancing, we take the triangle, flatten it into an array, and split it across the threads. Now every thread calculates a partial gradient, and in the end, the final gradient is the sum of all the partials. Now the results, with TypeScript and only five threads. The green line is the result of our multi-threaded solution. As can be seen, it's very close to what we would get if the layout were linear time, which is what this line represents. And let's see together the speed-up of the solution, considering a graph with 5,000 nodes and more or less 10,000 edges. So we are in this situation here.
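The splitting scheme described above (keep only the upper triangle of the distance matrix, flatten it into an array of pairs, hand equal slices to the threads, then sum the partial gradients) might look roughly like the Python sketch below. It is an illustration of the structure only, under several assumptions: the speakers' implementation is in TypeScript in a web application, whereas this uses `ThreadPoolExecutor`, which in CPython will not give a real speed-up for pure-Python arithmetic; the function names are made up; and the cost is the plain unweighted stress.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def upper_triangle_pairs(n):
    """Flatten the upper triangle of an n x n symmetric matrix into (i, j) pairs."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def partial_gradient(pairs, positions, topo):
    """Gradient of sum((euclid - topo)^2) restricted to the given pairs."""
    grad = [[0.0, 0.0] for _ in positions]
    for i, j in pairs:
        dx = positions[i][0] - positions[j][0]
        dy = positions[i][1] - positions[j][1]
        dist = math.hypot(dx, dy) or 1e-9  # avoid dividing by zero on overlaps
        coeff = 2.0 * (dist - topo[i][j]) / dist
        grad[i][0] += coeff * dx
        grad[i][1] += coeff * dy
        grad[j][0] -= coeff * dx  # the pair pulls j the opposite way
        grad[j][1] -= coeff * dy
    return grad


def stress_gradient(positions, topo, workers=5):
    """Split the pair list into equal chunks, one per worker, and sum the partials."""
    pairs = upper_triangle_pairs(len(positions))
    size = max(1, math.ceil(len(pairs) / workers))
    chunks = [pairs[k:k + size] for k in range(0, len(pairs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(
            lambda chunk: partial_gradient(chunk, positions, topo), chunks))
    total = [[0.0, 0.0] for _ in positions]
    for part in partials:
        for k, (gx, gy) in enumerate(part):
            total[k][0] += gx
            total[k][1] += gy
    return total


# Two connected nodes drawn 3 units apart although they are 1 hop apart.
too_far = stress_gradient([(0.0, 0.0), (3.0, 0.0)], [[0, 1], [1, 0]])
```

Here `too_far` comes out as roughly `[[-4, 0], [4, 0]]`: stepping against the gradient moves the first node right and the second node left, shrinking the gap towards one hop. Because every chunk holds about the same number of pairs, each worker gets about the same amount of work, which is the load-balancing point made in the talk.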
And comparing the solution with only the main thread in the application versus the five threads, we can see that the speed-up is more than eight. But that is possible with five threads. It's possible because, as can be seen here, when you have five threads in a web application, they are performing the layout while the main thread is free to do other things. If you have only the main thread, it has to handle everything. This is also explained by this other multi-threaded solution, with only one thread plus the main one, in which we have five or less. This means that this is a problem that parallelizes very well with five threads. Now I would like to show a simple example with a randomly generated graph, so we are now only looking at the performance and not the aesthetics. As can be seen, I hope you can see, it's very fluid, with 8,000 nodes and more or less 16,000 edges. It is searching for a structure, but the graph is randomly generated, so there isn't any structure. So, future work. We saw that the problem parallelizes perfectly. So the next step for us is to move the problem from parallel on CPU to parallel on GPU, because we are only using the GPU for the rendering but not for the computation. With another solution, in which we perform the classic force layout, we obtained performance around 900 times higher, reaching more or less one million nodes visualized. So that is the hope about efficiency, but we also want to study whether this kind of layout can guarantee effectiveness: using not only the stress function but more cost functions, and understanding whether it can guarantee a good visualization with 100,000 nodes. That's it for us. What? This version is not available but we will publish it, sorry. He asked if the code is available online. This version is not available online; you can find the version of the spring embedder here, and this is the same code, more or less.
Does it work for directed graphs? Yes, of course, yeah. Yes, of course, there are many quality measures that allow us to show complex graph as a clip. If you go on the table, you can find the quality measure called the net volume; this is the quality measure, and combining that measure with the stress and with another one, you can also show complex graphs. Are you using autodiff tools to calculate partial derivatives of the cost function? No. It's being done by hand? No, it's implemented by us. We have used a tool for calculating the derivatives, the partial gradients, etc. Now we have implemented it from scratch, but the derivatives are very easy to compute for most of the quality measures. They are not very complex to compute and also to build an efficient multi-threading solution because tool chains exist along with the parallelization on GPU, they come with 4-series of tools. Yes, yes. So, the former version was using TensorFlow.js, but it is easy to implement, and the quadratic time complexity doesn't come from there. Does your layout support interactive adding of new nodes, like when a user wants to expand to see new neighborhoods? Yes. So that the new nodes appear around the double-clicked node and the other nodes don't jump around? Yes, it depends on how you have the... Ribbit, yes, yes. He has... Sorry. If we continue to work, if we, for example, expand a node, or do something on the graph, yes, of course, it depends on how you have built your application. For example, if you have it continuously running over time, you can expand a node, update the graph topology, and the renderer will continue to work. Yes. You can also apply it to the newly introduced nodes. Sorry. Thank you. How does it compare to force layouts regarding number of iterations and convergence speed? Okay.
It is about the same, because if you look at the classic force layout, so the spring embedder of Peter Eades, it is quadratic in time: in every iteration you take the charge, and this is quadratic. Okay. The time complexity is the same, but the speed of convergence depends on how you have tuned the hyperparameters of the model and on the optimization algorithm that you have used. For example, gradient descent is known to be very slow to converge, but if you use a more efficient optimization algorithm, such as Adam, which accumulates momentum during the iterations, it converges faster than gradient descent. It also depends on the learning rate schedule that you put on the system. For example, you can add an exponential decay of the learning rate in order to take very large steps in the early iterations. And then when the graph starts to converge, you can reduce the learning rate in order to find a better minimum or a better maximum. Thank you. The last one or so? If you have more questions, you can just continue. Thank you so much, everyone. Thank you to Majora. Thank you. |
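The learning rate schedule described in the last answer can be sketched in a few lines. This is only an illustration under stated assumptions: the cost function is a toy quadratic rather than a graph layout cost, the optimizer is plain momentum rather than Adam, and the names `exponential_decay` and `minimize` and all constants are hypothetical.

```python
def exponential_decay(initial_lr, decay_rate, step):
    """Learning rate after `step` iterations: initial_lr * decay_rate ** step."""
    return initial_lr * decay_rate ** step


def minimize(grad, x0, initial_lr=0.4, decay_rate=0.95, momentum=0.5, steps=100):
    """Gradient descent with momentum and an exponentially decaying learning rate."""
    x, velocity = x0, 0.0
    for step in range(steps):
        lr = exponential_decay(initial_lr, decay_rate, step)  # large early, small late
        velocity = momentum * velocity - lr * grad(x)          # accumulate inertia
        x += velocity
    return x


# Toy cost function f(x) = (x - 3)^2 with gradient 2 * (x - 3); its minimum is at x = 3.
x_min = minimize(lambda x: 2.0 * (x - 3.0), x0=-10.0)
```

The early iterations move quickly because the learning rate is still large; later iterations take small steps, which is the behaviour the speaker describes for a layout that has started to converge.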
Graph Stream Zoomer
A window-based graph stream grouping system based on Apache Flink |
The next talk is on graph aggregation, graph grouping, especially on streaming graphs. Graph grouping is a really interesting challenge because we've all seen, in visualizations, hairballs and complex graphs. What graph grouping allows you to do is to take these graphs, group them by certain attributes, and you get kind of meta nodes that can then be selectively expanded. But you can also then, on a grouped graph, which is mostly a monopartite graph in many cases, run graph analytics, which is a really interesting problem. So I'm really excited to have Christopher here, because working both on streaming graphs and on temporal graphs with graph grouping is a really challenging and interesting aspect. So I'm really looking forward to the talk, and so without further ado, I come to the graph network. Yeah, thanks. Yeah, so thanks for that introduction and also thanks for accepting my abstract for this talk. Yeah, my name is Christopher Rost. I'm a PhD student at the University of Leipzig, and I'm currently writing my thesis, so I'm glad that I found some free time for doing this talk. Yeah, so about us, or our team: that's me, and I also have a master's student working on this project. We called it Graph Stream Zoomer, and this project is the result of two master's theses at our university, from Lia Salman and Rana Nurideen, and I think it's also nice to show the result of a master's thesis here at FOSDEM. At the top is our professor of the database department. Yeah, just to say that. Okay, what you should take away from this talk. First, you will see what a property graph stream is and why it's important to have the streaming idea inside the graph topic. Second, why you should or should not group a graph stream. And then you will learn what the Graph Stream Zoomer, so this specific project, is, and the main idea behind it, because we provide a Java API to do that. Okay, let's just start with what an event stream is.
I think I can skip that, maybe. So we say that anything that happens at a specific time and that can be recorded is an event, and an event stream is now a stream of these events, so a sequence that is ordered by time. And I think everyone knows why we need event processing: we cannot store everything in a database or whatever to analyze it. So I want to identify these meaningful events and respond to them as quickly as possible. Okay, what is now a graph stream? A graph stream is an event stream where each event is a graph element or some graph update. Yeah, that could be edges, could be vertices, triples or whatever. And a graph update could be a modification of this. For example, the addition of an edge, the removal of an edge, the addition of a property to an edge or whatever. So this is just an overview. Okay, why should I use a graph stream now? Because I can execute all algorithms and also all the mathematical stuff from graph theory on this stream of graph data. For example, calculate PageRank concurrently with the evolving graph of the graph stream. Okay, I can update my analysis results with low latency if I combine that with a stream processing engine. And my goal is to monitor the changes in the graph or to create some notification or some reactivity. For example, if some average goes over a threshold, then I create a notification. Okay, now, the graph stream could be very heterogeneous. That means it consists of many different types, and it can also arrive at a high frequency. So it is advisable to summarize the graph elements in a specific way. And we can summarize graph elements by three criteria. For example, by time: graph elements that belong together, for example to the same time window, are grouped together. By structure: that means, for example, edges that share the same source or target vertex can be grouped together.
And by content, that means vertices and edges that share the same label or a specific value of a property. And for our algorithm we introduced so-called grouping key functions. A grouping key function is a function that maps a vertex or an edge to a grouping key. And that key could be anything that is inside this vertex or edge. It could be the label, temporal information, some kind of property or whatever. So you can map everything that is represented by a vertex or edge to a key. And on this key, we group the graph. So that means, at the end, the result is again a graph stream, but the grouped representation of it. Okay, now you can ask, okay, why do I need this? So what are the applications of that? You can think about it as a pre-processing step. For example, before you calculate PageRank, you just group the vertices by the city attribute of users or something like this. A second application could be as a post-processing step. For example, after you apply a graph stream analysis, for example a community detection, you can now group on the cluster ID with our grouping algorithm to summarize the different communities. You can also use it to understand the graph stream in more detail. For example, just to know which vertex or edge types exist in my graph stream, how frequently these different types arrive, or how vertices and edges with different characteristics are connected. So just to get deeper insights. Or if you use our aggregate functions, for example counting or calculating an average, you can also reveal some hidden information that you would not see in the graph stream itself. Okay, so this is an introduction. Now I explain the ideas behind this Graph Stream Zoomer application just by an example, and then afterwards I summarize this. For example, we're using bike rental data that can have two different graph schemas.
I named them A and B. So the first Graph schema A is that a bike rental is an edge between two station nodes. You see it on the left side. So a station has several properties like the name, the number of bikes, latitude, longitude, and so on. And the trip edge has properties like the user ID, so who rented the bike, which bike was used by the bike ID, and from and to, so when this trip happens, until when, and the duration, for example, in seconds. So this is schema A on the left side. On the right side we have a more heterogeneous schema. So we have also stations and trips as vertices here. And we have also bike nodes and user nodes with several properties. So I just divided into this because I can explain the examples that follow a bit better compared to just using a simple schema like here on the left side. Okay, so how a Graph stream of these schemas could look like this. Yeah, of schema A I have just these trip edges between two stations and all information inside. And from schema B I have here the trip nodes connected with all the other vertex types. So this is just an exemplary view how a stream of this graph data could look like. Okay, so now begin with a very simple or a simple example of our grouping algorithm. So the input of the grouping is the graph stream, I think it's clear, and we need a grouping configuration. And the grouping configuration consists of five attributes. The first is the window because we are doing windowing on our graph stream. So I can define a window size, which is here, for example, 10 minutes. And then VG key are the vertex grouping keys, that means the key functions that leads to the grouping of vertices. EG key are the edge grouping keys, that means the key functions that are needed to group the edges together. And we also have a collection of vertex aggregate functions. This is VA-GG and EA-GG are the group of edge aggregate functions. 
So the four on the bottom are just an array of several key functions and aggregate functions. Okay, and now having the input stream and applying that grouping, we get a result. And just looking like this, because we define for the vertex grouping keys a function that maps every vertex to an integer value. And that results in, it doesn't matter which vertex exists in our graph stream, we group everything together to one vertex. And that's the white one with an empty label. And the same for edges, that means every edge that exists in our graph stream, we group them together to a super edge, we call them super vertex and super edge that is displayed here in gray. And because of the count aggregate functions, we add a new property to the super vertex with the count of all vertices that are grouped together. And a new property to the super edge with also the count value. And this is now our result for every window that we defined on the graph stream. For example, here the first window, second window and so on. That means we are creating a stream of grouped graphs here. Okay, that is for schema A and that's the most zoomed out view. So I group everything together that exists in the graph stream. For schema B and same grouping configuration, it looks the same because it doesn't matter which type label exists, so we group everything together. So we have just different counts because we have a bit more vertices and edges and also for the second window it looks the same. Okay, so this is my first example, so the most zoomed out way. The second example are called a graph stream schema. So that means now we are using as a vertex grouping key a function that maps a vertex to its label and a function that maps the edge to its label. So what's the result here? Now our node has a label station and the count because the count aggregate function stays the same and our edge has a label trip. 
So it's more or less the same because our graph streamer has just one node type and one edge type and that's it also for the second window. Now it gets a bit more interesting when we are using schema B. The result here with the same grouping configuration as before is now this. That means every vertex is grouped by their label and every edge is grouped by their label and now I have like a schema representation of my graph stream. And again with all counts because we are just using count aggregation. Okay, and that's for the second window and so on and so on, I just leave here the properties. Okay, the next example we stay with the vertex grouping keys and edge grouping keys, but now I added several aggregate functions to vertices and edges. For example I say okay for the vertices I want to calculate the average of all available bikes for these stations. And for the edge aggregate functions I want to have the minimum, maximum and average duration that a trip between two vertices has. And the result would be this. So same grouped graph, but now I have three additional properties on the edges. Minimum duration, it's in seconds here, maximum duration and average duration and the average bikes available on this station also as a new property. Same for the second window. And now my last example, it is, I call it, not the last example, there's one more. So I added now here a second vertex grouping key function and that's an important thing of the graph stream group. You can also implement your own grouping key function. For example this one called getDistrict consumes the latitude and longitude property of the vertices and then calculates like a district identifier. For example here of Priscilla, whatever, so in which district that belongs to. And then the graph is grouped on this representing district identifier. And we also say that for the edges we want to group the edges on the user type. 
So that means for every edge, so in this data set we have a user type subscriber and something else, we will see it shortly. So that means for every edge I now get two edges, one for the one user type, one for the other one. And also here some aggregate functions are added. And the result is something like this. So here exemplified for three stations. And here are the district IDs one, two, and three and the average latitude and longitude, for example for visualization purposes, to place it on the map. And for the trips between two stations or between two district representatives, we also calculated the minimum, maximum and average duration and counted them. And you see here the green edges are for the user type customer and the red edges are for the user type subscriber. So and the last example is then this one here, where as vertex grouping key function I say, please extract the identifier. That means every vertex that exists in the graph stream is placed here, and the same for the edge identifier. That means, since we have unique identifiers, all vertices and edges are placed here. And this is more or less like a snapshot of the current state of this graph stream for the specific window. So therefore I call that zoomed in. It's the most zoomed-in configuration that you could use. Okay, you could imagine implementing this is not that easy. So the master students found a way using Apache Flink and its Table API. So everything works distributed since we are using just the API functions of Apache Flink. But we also figured out several implementation challenges. So the first was to find a good graph representation. The second one is, since we are creating a workflow on this graph stream, we have to ensure the chronological ordering of every step in this workflow. A third point is that you also want to ensure scalability: if we scale out this algorithm, the scalability should also be high.
And also keep the state as small as possible and provide low latency and high throughput. So these were several challenges the master students solved quite well. And at the end we created a grouping operator looking like this. I don't want to go into detail. This is just an architectural overview of all the Flink steps we used. What is quite interesting is that we created like an operator encapsulation of this. That means the operator consumes a graph stream as input and has a graph stream as output. That means you can combine several of these grouping operators. Or if you define another graph stream algorithm that produces a graph stream as output, you can just put it before. So you can chain these grouping operators together. And like I said, this consists of the mapping of the input data, the deduplication of vertices, the grouping of vertices and edges, and then mapping it to an output graph stream. How would the API look? It looks a bit messy, but I think it's quite clear what's happening here. So first you have to define the execution environment of Flink. Then we read the data from some streaming source, for example a socket source, or some Kafka stream, or whatever you want, whatever Flink supports in our case. Then we map it to a graph stream object, which is the internal representation of our stream. The interesting part is here, where you define the grouping operator. So in the middle, that's the grouping config I showed you in the examples. You can define it here through the API. You set the window size. You set the vertex and edge grouping keys. You set the aggregate functions and so on. And at the end, you just execute this operator on this graph stream, and then you can define a sink or just print it to the console, or whatever you want. So that's the operator call, how you define it in the API. And as for the current state, the students are at about 90% of the complete implementation of that.
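The grouping walked through in the examples (a window size, vertex and edge grouping key functions, and a count aggregate) can be sketched in plain Python. This is not the project's Java API and does not use Flink; the record layout, the function names `group_window` and `group_stream`, and the sample trips are all assumptions made for illustration, and only the count aggregate is shown.

```python
from collections import defaultdict


def group_window(edges, vertex_key, edge_key):
    """Group one window of edges into super vertices and super edges with counts."""
    members = defaultdict(set)      # super-vertex key -> ids of the grouped vertices
    super_edges = defaultdict(int)  # (source key, edge key, target key) -> edge count
    for e in edges:
        src_k, dst_k = vertex_key(e["src"]), vertex_key(e["dst"])
        members[src_k].add(e["src"]["id"])  # sets deduplicate repeated vertices
        members[dst_k].add(e["dst"]["id"])
        super_edges[(src_k, edge_key(e), dst_k)] += 1
    vertex_counts = {k: len(ids) for k, ids in members.items()}
    return vertex_counts, dict(super_edges)


def group_stream(edges, window_size, vertex_key, edge_key):
    """Assign edges to tumbling windows by timestamp and group each window."""
    windows = defaultdict(list)
    for e in edges:
        windows[e["ts"] // window_size].append(e)
    return {w: group_window(es, vertex_key, edge_key) for w, es in sorted(windows.items())}


def station(station_id):
    return {"id": station_id, "label": "Station"}


def trip(ts, src, dst, user_type):
    return {"ts": ts, "label": "Trip", "user_type": user_type,
            "src": station(src), "dst": station(dst)}


# Hypothetical schema-A stream: trips are edges between stations, timestamps in seconds.
stream = [
    trip(30, "s1", "s2", "subscriber"),
    trip(200, "s2", "s3", "customer"),
    trip(550, "s1", "s3", "subscriber"),
    trip(700, "s3", "s1", "customer"),
]

# Most zoomed-out view: constant keys collapse each 10-minute window into one
# super vertex and one super edge.
zoomed_out = group_stream(stream, 600, lambda v: 0, lambda e: 0)

# Schema view: group vertices and edges by their label.
schema = group_stream(stream, 600, lambda v: v["label"], lambda e: e["label"])
```

Swapping in other key functions, for example `lambda e: e["user_type"]` for the edges or `lambda v: v["id"]` for the vertices, gives the user-type and most zoomed-in views from the talk without changing the grouping code.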
We found some bugs in the SQL or Table API of Flink that were not fixed yet, so we had to define some workarounds, which cost us time. But like I said, we found the workarounds. And the next steps are that we plan an evaluation. So how are the latency and throughput of this complete system? And we want to test it on real-world and synthetic graph streams. And maybe then publish some results, so let's see. And also, the user-defined key and aggregate functions are still under development. Okay, then that's it. That's all, folks. Please check out our GitHub repository, or maybe you want to contribute, so we are open for this. The two links with the icons here at the bottom are two other projects. The one is Gradoop. This is a big temporal graph processing engine also based on Apache Flink. I'm also a main contributor to that project, which was initially created by Martin Junghanns, who's now working at Neo4j. And also, the Temporal Graph Explorer is a user interface for that system, where you can play around with the evolution of a graph, but in a historical data set. Okay, so that's it. And please ask questions. Thanks. Yeah, please. On one slide, you said a problem was to decide on the, on slide 20, I think, as well. Yeah. The optimal graph representation in the streaming model. Yeah. What was the answer? And so the question was, what, so we had this challenge to find the optimal graph representation, and what was the answer? The answer was a triple stream, but a rich triple stream, we called it, since two property graph vertices are connected with an edge. That means every vertex consists of the label and possibly a big set of key-value pairs as properties, and the same for the edges. And this was our optimum, because you can then model everything with this model. But the drawback of this was that we have to do a vertex deduplication.
For example, if you have a self-loop, so an edge from one vertex to itself, we have a duplicate of that vertex, so we have to deduplicate it afterwards for this model. So this was one drawback. Yeah, but we figured out that using every concept of the property graph model there as a triple was the best choice for the students. So another, yeah? Would you comment a bit more on the scalability, like what graph size should we test this on? Yeah, so the question was some words about the scalability. The scalability is an open point for future work, so we don't have concrete results on that. We tested it with some city bike data that we interpreted as a stream, so some historical data. And we could process, I think it was about 600,000 edges in a few seconds. But these are just some first results, and we have not tried it on big and high-frequency graph streams on a cluster. We have huge Flink clusters at our university, so we can benchmark the scalability of that in a later step. Yeah, thanks. Yeah? These aggregate functions, are they part of, you know, like a Java API, and how do you define them? Yeah, so the question was how we define the aggregate functions. So we have a set of predefined aggregate functions, like count, average, min, max, and then you have an interface you can implement against, so there's an interface called aggregate function, and then you have to implement, I think, two or three functions, and then you can define your own and use it then here on, yeah, there. Where you give the classes of the count and average property, you can give your own class, and then it will be used. Yeah? Could you elaborate more on the real-life use cases or real-life applications? So the question is if we can elaborate more on real-life use cases and real-life questions. So applications. Applications.
So since, yeah, so I'm at the university, that means we are missing real-world data a lot, and we also need some input from companies to provide us with real-world data that we can use. So use cases could be, we only have this bike sharing stuff or Twitter data and whatever, and I think if you have something like this aggregate function, like here an average property, you can use it, because at the end it's a time series of changing values, for example of the average property, for defining a threshold and getting a notification afterwards. I think this is maybe one good application. So, for example, if you have network traffic, you see, okay, now the average, I don't know, packet size is increasing. Now I get notified, for example, like this. But this could be an application, yeah. The idea was to use like a video stream for that, but then the question is, how much graph is that, could that maybe not be done just in a regular stream processing way? So I think it is just advisable to use this if you have some quite complex relationships between entities; then you can use this system besides just an ordinary stream processing engine or complex event processing engine. So I think the unique point of this is to bring the graph aspects into the streaming world, yeah. So any further questions? Yeah. So the events are only additive, right? So you can only add to the graph, but not delete from the graph by streaming it, right? Yes, so at the moment everything is interpreted as insert-only, but since Flink supports everything, like also updates and also deletions, it is conceivable as future work that we can also support this.
Because at the end, our result is also, since we are using windowing, an insert-only stream, but if we maybe think about removing the windowing aspect, so that we have something like a continuous aggregation or whatever, then we need to support like a continuous addition at the end to update already existing aggregation results. Okay, any further questions? Thank you again. Thanks. |
The LDBC Social Network Benchmark |
I am Gábor Szárnyas and I'm here with David Püroja. We work at CWI Amsterdam and we're here to present you the LDBC Social Network Benchmark. What is the LDBC? The abbreviation stands for Linked Data Benchmark Council. It is a non-profit company founded in 2012 and its mission is to accelerate progress in the field of graph data management. And to this end, it designs and governs the use of graph benchmarks, and everything we do is open source under the Apache version 2 license. From an organizational perspective, LDBC consists of more than 20 members who all have some vested interest in graph data management. We have financial service providers like the Ant Group, database vendors like Oracle, Neo4j and TigerGraph, cloud vendors like AWS and hardware vendors like Intel. Also we have individual contributors like David and me who contribute to the benchmarks. So to put things into context, the last two decades have seen a rise in the use of modern graph database management systems. Typically, the data model used in these systems is called a property graph, which is a labelled graph where both the nodes and the edges can have an arbitrary number of properties. For example, this is a small social network consisting of five person nodes and a single city node, which is the city of Spa. And the properties can be on the nodes. For example, here the nodes have names, and the edges have attributes like the date when the friendship was established. We can see that Bob and Carl met in 2015. And if you want to run a query on this system, we can use a graph query where we look for matches of a given graph pattern. So here the query says we want to start from Bob. We want to use an arbitrary number of knows edges to reach some person who lives in Spa, and we want to do an aggregation to return the number of those people. If you want to evaluate this, we then start from the person Bob and traverse to all the people, transitively, who are known by Bob directly or via multiple edges.
This means all four people here. We shrink it down to the people who actually live in Spa, then add up the results and get the result two. So graph databases use something called a visual graph syntax, also known as the ASCII-art syntax, which is similar to the popular Cypher language of Neo4j. And here this query is actually really similar to the graph pattern that I have shown. So there are similarities in how the nodes are formulated, how the edges are captured in this text, and also how the transitive closure, the little asterisk, is captured in the query language. So this is a very intuitive and concise way of formulating the queries. If we deconstruct this query, we can see three main components. The first is relational operators. Obviously, we still need relational operators. We want to be able to identify people by filtering. So we filter for Bob, we filter for Spa, and also we want to sometimes aggregate. So the count aggregation is part of this query. The pathfinding is really elegant in this formulation because we have knows asterisk, which captures that we can use an arbitrary number of edges. And the pattern matching which connects the person to Spa is also very concise and readable. So what is interesting from a future work perspective on graph databases? Obviously, relational operators are quite well known at this point, and there are endless papers and techniques on how to implement these. But we believe that pathfinding and pattern matching are really good in graph databases compared to traditional relational systems because they provide a more concise syntax and better algorithms and implementations. Interestingly enough, even in the last 15 years, there have been lots of papers on better BFS algorithms, better factorized representations for graph patterns, multi-way worst-case optimal joins, and so on. So we believe that these should be adopted by more and more systems.
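The evaluation of the example query (start at Bob, follow any number of knows edges, keep the people who live in Spa, count them) can be traced with a breadth-first search. The sketch below is an assumption-laden illustration: only Bob, Carl and the city of Spa come from the talk; the other names, the exact friendships, the second city and the choice to exclude the start person from the result are invented so that the numbers match the walkthrough.

```python
from collections import deque

# Hypothetical undirected knows edges between five persons.
knows = {
    "Alice": {"Bob"},
    "Bob": {"Alice", "Carl"},
    "Carl": {"Bob", "Dan"},
    "Dan": {"Carl", "Eve"},
    "Eve": {"Dan"},
}

# Hypothetical home cities; two of the people reachable from Bob live in Spa.
lives_in = {"Alice": "Ghent", "Bob": "Ghent", "Carl": "Spa", "Dan": "Ghent", "Eve": "Spa"}


def reachable(start):
    """Pathfinding: everyone reachable from `start` over one or more knows edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in knows[person]:
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    seen.discard(start)  # the start person is not counted in this sketch
    return seen


def count_reachable_in(start, city):
    """Pattern matching on the city plus the relational filter and count."""
    return sum(1 for person in reachable(start) if lives_in[person] == city)


result = count_reachable_in("Bob", "Spa")
```

The three components named in the talk map onto the code directly: the BFS is the pathfinding part, the `lives_in` lookup stands in for the pattern match to the city, and the filter with `sum` is the relational part.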
And to this end, we designed benchmarks that try to push the state of the art and force systems to adopt better and better techniques. David will talk about these benchmarks. Yeah, hi. So I will give an overview of the Social Network Benchmark. And so first, we'll go through three parts of this benchmark: the data sets, two example queries, and the update operations done in this benchmark. So here we see a small example of the data sets, where on the left side we see persons with friendships, forming a network, and these persons post messages on the social network and can reply to each other, forming a tree-shaped data structure. And now we will do one query on this very small data set example. So with query nine, we want to retrieve messages posted by a given person's friends and friends of friends before a given date. And the dates are shortened here for simplicity. So if we start with Bob, we will traverse to their friends and friends of friends, retrieve the messages, and then filter for the ones that are actually before Saturday. And then we touch upon 10 nodes in this data. Suppose we start from another person, so for example Finn, and we traverse again to their friends and friends of friends. Here we see that we touch upon five different nodes. So half of what we touched for Bob. And this difference can actually be troublesome, since runtimes for the same query are different, and that doesn't help in understanding what's happening. So for this benchmark, we actually want to select parameters that have similar runtimes and also actually stress the technical difficulties in these systems. So we select the parameters more carefully. So here we see an example of when we do not select the parameters carefully, just uniformly at random. And we can see here a trimodal distribution, a bimodal distribution, and one with many outliers. And we don't want that.
So in the data sets, there are also statistics provided, in this example for each person the number of friends and friends of friends. Then we want to select persons with similar numbers to get more predictable runtimes. And so if we do that, then we can see here an example where we have unimodal distributions with very tight runtimes. And that also helps in understanding the behavior of the queries. So now we're going to the updates. And for example, if Eve and Gia want to be friends, we insert a knows edge. And this then becomes part of the network. Suppose that the next operation is inserting a comment. So Gia replies to a message posted by Eve. And both messages are posted on the same date. Then we have another problem. Because when we are executing these operations concurrently, it can happen that the reply is inserted earlier than the message it replies to, causing an error. And to mitigate this, we introduce dependency tracking. So for each operation, and this also includes the edges, but just for simplicity only the nodes are shown here with the dependent dates, we include a creation date and a dependent date. The creation date is when it's scheduled to be executed, and the dependent date is, like in this case for M6, the creation date of M3. And here we can see, actually, that the operations depend on each other, forming a whole chain in the social network. Suppose now that Eve wants to leave the social network and removes her account. And so we start with deleting the node of Eve, and this will trigger a cascading effect, since we then need to remove the edges connected to Eve, the messages posted, and also the replies to those messages. We can actually see this huge cascading effect, and that can actually have a large impact on the data distribution, and therefore also on the executability of these operations. And furthermore, it also influences the selection of the parameters, which we have shown before.
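The dependency tracking idea can be illustrated with a small scheduler that holds back an operation until the operation it depends on has completed. This is a simplified sketch, not the benchmark driver's implementation: it assumes integer dates, unique creation dates and a single dependency per operation, and the name `run_with_dependency_tracking` and the operation M7 are hypothetical.

```python
def run_with_dependency_tracking(operations):
    """Execute operations in an order that respects their dependent dates.

    Each operation is (name, creation_date, dependent_date), where dependent_date
    is the creation date of the operation it depends on, or None.
    """
    completed = set()  # creation dates of operations that have already run
    executed = []      # names in the order they were actually executed
    pending = list(operations)
    while pending:
        progressed = False
        for op in list(pending):  # iterate over a copy because we remove items
            name, created, depends_on = op
            if depends_on is None or depends_on in completed:
                executed.append(name)
                completed.add(created)
                pending.remove(op)
                progressed = True
        if not progressed:
            raise ValueError("an operation depends on a date that never completes")
    return executed


# The operations arrive out of order, as they might under concurrent execution:
# the reply M6 depends on message M3, and a hypothetical M7 replies to M6.
arrivals = [("M7", 7, 6), ("M6", 6, 3), ("M3", 3, None)]
order = run_with_dependency_tracking(arrivals)
```

Even though the replies arrive first, they are only executed once the message they reply to exists, which avoids the error described in the talk.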
And we want to include these deletes because they rule out append-only data structures in databases and also stress the garbage collector of these systems. Now we are going to give another example to also stress the temporal aspect of this benchmark. So suppose we want to find a path between two persons. So we have a start person and a destination person, for example Finn and Gia. Then we can see here that we have a four-hop path between these persons. But at one point in the benchmark, it can happen that a knows edge is removed, and then there is no path anymore. It can also happen that another edge is inserted between Carl and Gia, and then we have a path again. And so for the same parameters, we can actually have three different outcomes. And to mitigate this, we do temporal parameter selection. So each parameter is assigned to a time bucket to actually ensure that we have similar results and therefore also similar runtimes. Now going through the benchmark workflow. So we start with the data generator, and the data generator provides us with a temporal graph spanning three years of social media activity, and it is simulated to be similar to the Facebook social network. It's a Spark-based data generator that can generate data up to 30 terabytes, and it produces skewed data sets, for example with respect to the knows edges and the person data. And so the output is a data set suitable for loading into the system under test, updates which are then executed during the benchmark, and statistics from which we can select the parameters. And the selection of the parameters is done in the parameter generator. This ensures stable query runtimes and assigns parameters to temporal buckets. So a parameter may only be used once the entities it refers to have been inserted into the data set and before they are removed from the network.
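One simple way to pick query parameters with similar expected runtimes is to sort the per-person statistics and take the tightest group. The sketch below only illustrates that idea and is not the benchmark's parameter generator: the function name `pick_similar` is hypothetical, and apart from the counts for Bob (10) and Finn (5) mentioned in the talk, the numbers are invented.

```python
def pick_similar(stats, k):
    """Return the k persons whose statistic values lie closest together.

    `stats` maps a person to a count, e.g. the number of nodes a query touches
    when started from that person. Assumes 1 <= k <= len(stats).
    """
    ranked = sorted(stats.items(), key=lambda item: item[1])
    # Slide a window of size k over the sorted values and keep the one with
    # the smallest spread between its largest and smallest value.
    best_start = min(
        range(len(ranked) - k + 1),
        key=lambda i: ranked[i + k - 1][1] - ranked[i][1],
    )
    return [person for person, _ in ranked[best_start:best_start + k]]


# Hypothetical statistics: nodes touched by the query for each start person.
touched = {"Bob": 10, "Finn": 5, "Eve": 9, "Gia": 11, "Carl": 30}
parameters = pick_similar(touched, 3)
```

Choosing Eve, Bob and Gia avoids mixing a light start person like Finn with a heavy one like Carl, which is what produces the multimodal runtime distributions shown for uniform random selection.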
And then we have a benchmark driver, which schedules these operations and ensures that they can be executed correctly using the dependency tracking. This is especially important when executing the operations concurrently. And lastly, we have the system under test, where we have, for example, graph databases, triple stores, or relational databases. And now Gabor will go further into the workloads. Okay, so graph workloads are actually quite diverse in terms of what they are trying to achieve, and our benchmark reflects that by having multiple workloads. We have the social network benchmark interactive workload, which is transactional in nature, so it has loads of concurrent operations. The queries here are relatively simple: they always start from one or two person nodes, the same as David presented before. Here the systems are striving to achieve a high throughput, so the competition is about getting as many operations per second as possible. We are happy to report that we have official results from the last three years, where systems started with slightly above 5,000 operations per second and have sped up exponentially, now being close to 17,000 operations per second on a 100 gigabyte dataset. The other workload of the social network benchmark is called business intelligence. This is an analytical workload where the queries touch on large portions of the data. For example, the query on this slide shows a case where we start from a given country and then find all triangles of friendships in that country. It's easy to see that this is a very heavy-hitting operation. It may touch on billions of edges in the graph, and it also has to do a complex computation to find those people. Here systems can use either a bulk update or a concurrent update method, and they should strive to get both a high throughput and low query runtimes. This benchmark is relatively new.
It was released at the end of last year, so we only have a single result, which was done by a collaboration of TigerGraph and AMD. We're happy to report that there are more audits under way, so we are going to release more results in 2023. So probably you can see from this presentation that these benchmarks can get fairly complex, and implementing them is not trivial. So we did our best to provide everything our users need. For each of the workloads that we have presented, we have a specification, and we have detailed academic papers which motivate the design choices and the architecture of these benchmarks. We released a data generator as well as pre-generated datasets, and we have benchmark drivers and at least two reference implementations for each of the workloads. Moreover, we have guidelines on how to execute these benchmarks correctly, how to validate the results of a given system, and how to ensure that the system will not lose your data or mangle the transactions. So we have ACID compliance tests and recovery tests. This leads us to our auditing process. Similarly to the TPC, the Transaction Processing Performance Council, we have a rigorous auditing process where vendors can commission an independent third party who will rerun the benchmark in an executable and reproducible manner and write it up as a full disclosure report, so that the benchmark result is understandable by whoever wants to see it. This is important because LDBC is trademarked worldwide, and we only allow official audited results to use the term LDBC benchmark result. This is not to say that we don't allow people to use this benchmark. Researchers, practitioners, and developers are welcome to use the benchmark. They can run it. They can report the results if they are accompanied by the appropriate disclaimer that this is not an official LDBC benchmark result. I would like to talk a bit about standard graph query languages.
This is an important topic because this has been a pain point for graph systems for many years. There is a bit of a Tower of Babel out there with many languages, most of them using some sort of visual graph syntax, but always with slightly different semantics and a slightly different syntax, which makes it difficult for users to adopt these techniques and may put them in a position of being locked in by their vendors. In the next couple of years, there are going to be new standard query languages. These focus on pathfinding and pattern matching. The first one is called SQL/PGQ. This is an extension to the SQL language, and PGQ stands for property graph queries. This is going to be released next summer, and GQL, the standalone graph query language, is going to come out in 2024. We are happy to report that even though we have two new languages, the pattern matching core of them, the visual graph syntax that we all know and love, is going to be the same, so users can port at least those bits of their queries. To give you a taste of what this will look like, here is query 9 that David presented in the social network benchmark interactive workload. This query can be formulated in SQL. It's not too difficult, but the new variants, SQL/PGQ and GQL, can represent it in terms of a graph pattern, and this is a much more concise formulation. The difference is even more pronounced for query 13 with the path queries. Here we can see that in SQL/PGQ, the pattern is really similar to the visual representation. It just has a source, a target, and an arbitrary number of knows edges, denoted by knows* in between. In SQL, this is a lot less readable, hard to maintain, and it's even less efficient, because it just implements a unidirectional search algorithm instead of doing a bidirectional search, which has a better algorithmic complexity. The way LDBC is involved in these new query languages is manifold.
First, it had the G-CORE design language released in 2018, which influenced these languages. Then LDBC has the formal semantics working group, which formalized the pattern matching core of these new languages, and LDBC is doing further research to advance the state of the art on graph schemas. We have an industry-driven and a theory-driven group, and what they do will end up in the new versions of these languages. The outlook is the LDBC Graphalytics benchmark. This is a broader benchmark because it can target analytical libraries like NetworkX, distributed systems like Apache Giraph, or the GraphBLAS API. This is everything that has to do with analyzing large graphs. Here the graph is an untyped, unattributed graph, so there are no properties and no labels. We do use the LDBC social network benchmark dataset, but it is stripped down to the person-knows-person core graph. Additionally, we have included a number of well-known datasets like Graph500, Twitter, and so on. The algorithms that we run are mostly well-known graph algorithms. There is BFS, which starts from a given node and assigns to each of the other nodes the number of steps that need to be taken to reach it. We have the famous PageRank centrality algorithm, which highlights the most important nodes in the network, and we have the local clustering coefficient, community detection using label propagation, weakly connected components, and shortest paths. This benchmark is a bit simpler than the social network benchmark. It does not have a rigorous auditing process. We trust people that they can run this benchmark efficiently and correctly on their own infrastructure, and they can report results. If they do so, they will be able to participate in the Graphalytics competition, which has a leaderboard for the best implementations. Wrapping up, you should consider joining the LDBC because members can participate in the benchmark design.
They have a say in where we are going in terms of including new features. They can commission audits if they are vendors, and members can gain access to the ISO standard drafts that I mentioned, SQL/PGQ and GQL. Otherwise, these are not available to the general public. Pricing-wise, this is free for individuals, and there is a yearly fee for companies. To sum up, we have presented three benchmarks: the social network benchmark's interactive workload, its business intelligence workload, and the Graphalytics graph algorithms workload. We have more benchmarks. There is the semantic publishing benchmark, which targets RDF systems and is set in the media and publishing industry. There is the financial benchmark, which is going to be released this year. It targets distributed systems, it uses financial fraud detection as its domain, and it imposes strict latency bounds on queries. This is quite a different workload from the previous ones. Of course, graphs are ubiquitous, and they have loads of use cases, so there are many future benchmark ideas, including graph neural networks, mining, and streaming. Thank you very much, and we're open to any questions. Yes. So, in this one overview there was the graph data set, and the updates were kind of separated. Is there a possibility to create a graph data set where the updates are included in the data set, so that the nodes and edges get time stamps for when they were deleted or when they were added? Yes. So, the question is whether it is possible to create something like a temporal graph with the time stamps of when a specific node is created and deleted, and this is actually very easy, because this is the first step that the data gen creates.
So, when David said that it creates a social network of three years, that has everything that was ever created or deleted during those three years, and we have attributes like creation date and deletion date. Then, when we turn it into something that's loadable into the database, we hide the deletion dates, because the database, of course, shouldn't be aware of them, but this is something that the data gen supports out of the box. Okay, but then is it also possible to get this data set with the deletion dates, because you said that they are hidden? They are hidden, but we have one which is called the raw temporal data set, and that is available, and we even published that, so that's something that, yeah, has a good chance to be influential in the streaming community, I believe. All right, more questions? Yeah, Michael? So the question is, can we extend to other domains? We usually emphasize that social networks are not really the primary use case for graphs. We just use this domain because it is really easy to understand, we don't have to explain person-knows-person, and you can put all sorts of interesting technological challenges into a graph domain like this. It would make sense, and sometimes we are approached by our members saying, we want to do a new benchmark in the domain X, and we then send them the process that is required to get one of these benchmarks completed, and that's usually the end of the conversation. But we are definitely open to having more interesting benchmarks, and of course, a good data generator is worth its weight in gold to all the researchers and the vendors in this community, so that's usually the hard point, and I would definitely be interested in having a retail graph generator. Carlo? Hi.
The question is specifically, what do you see the impact of this being on the industry? Is it that the systems would improve, or that the systems would get more robust, in that you detect things that go wrong and they get fixed, or what's the, yeah. So the question? Yeah, the question is about the potential impact. What could all this achieve? We believe that it will help accelerate the field in the sense that systems will get more mature, because if you want to get an audited result, you have to pass all the ACID tests, you have to be able to recover after a crash, and ideally you would have to be fast, so that is hopefully one of the other things that systems will take away. They will have better optimizers, improved storage, and better query execution engines. We have seen this in the aftermath of the TPC benchmarks, which resulted in quite a big speedup. So that's one area, and of course there is pricing: we would like users to get more transactions per dollar. The third one, which we are personally quite interested in, is the new accelerators that come out. Especially in the field of machine learning, there are cards that do fast sparse matrix multiplications. Those could be harnessed specifically for the analytical benchmarks that we have, and it would be interesting to see how big of a hassle it is to implement this and how big of a speedup they give. Cool, all right, okay, thank you very much. |
Gephi towards v1.0: the codebase, and the rest |
We've seen all the developments, we've seen Gephi Lite is coming, but also what will happen with the main Gephi, with the main Gephi code base. So, thanks a lot for coming and for having us, and enjoy the talk. So hello everyone. Gephi is a network analysis and visualization software that is, I think, primarily used in teaching. It's Java-based and it's very visual. I had wrongly assumed that most of you would know what it is, but basically you can interact with the network's nodes and edges visually, and it's really a good entry point into, say, network science. It's used in teaching, and I would say it's quite field-agnostic: it's not dedicated to any industry, it's not for the social sciences only, it's not for biology, it's not just for visualizing a graph database output. It's for just any graph, but it's kind of for beginners. It has existed for more than 10 years now, it's pretty popular, and Mathieu and I are basically the two fathers of that tool. He's really the architect of the code, and I'm more the guy who ensures the continuity of the project, and I designed some algorithms and the user experience, if you will. This is a community-oriented talk. We'll talk about the code base a little bit, but it's also about what's around it, hence the code base and the rest. So Gephi has basically been unfinished forever, like so many products we know. That's why the version number is below one, and it's also because there is no big entity funding Gephi. It's our free time, so we push it as much as we can when we can, but there is always something missing. We have known for a very long time that we want to finish it, but we're not there yet. We patch it and we keep it working, but that doesn't mean we finish it.
So for instance, we do not have all the features we would like to have that would make a coherent or consistent whole, like a set of features that go well together, and we also want it to be reliable enough and efficient enough that it can live a longer life, right? So we want to set goals and fulfill these goals. Basically the idea is that when we get there, that's when we are going to call the version 1.0, right? Because of course the project is going to live on, and we're going to work on it after that, and we're also going to work on it before that. So the 1.0 is today's Gephi, if you know it, but finished in that sense. Now the question is what that entails, and that's what we want to talk about. The way I'm going to start is by explaining how we think of the roadmap to version 1.0. So there is a demand for Gephi that's established. We're very lucky, that's very nice: people want it, they use it, good. Our goal is to have a project that is healthy, or sustainable if you will, and also a code base that is sustainable, so that we can fix it when it breaks and so on. So how do we go from this need, or this demand that is there, to those goals? I separate them because they have different implications, but none of them makes sense without the others, so they are kind of connected. One way to take this problem would be to look at what the threats to the project or the code base are. The first threat would be if we clash with the community of users: if we don't want the same thing, then that obviously threatens the project. But then there are also two different threats for the code base. One would be, what if no one wants to fix the code? And the other, what if some people do want to but they cannot, for instance because they don't know how to understand the code and so on.
So those are the three main threats we identify at the project level, and then we can see the roadmap as fixing or preventing these threats. So this would be having a thriving Gephi community, and I'm saying community at large. I include here the community of developers and also the community of users, who are sometimes the same and sometimes not, but all of that is okay. It's kind of fuzzy, I'm going to acknowledge that now, so I don't know exactly what the community is, but there is a community. And then we want to be able to pay developers for maintenance, because we think that we could get there, and that would prevent the problem of no one wanting to, because of course money is an incentive, and we can retain competencies to do that. We also want to have a clean and robust code base, so that when newcomers come and want to participate in the project, they can actually get involved relatively comfortably in fixing issues, improving the code, and participating in the evolution of the project. But those steps towards sustainability also depend on previous steps. One of the very big steps that is an issue for us is the fact that for us to pay someone, we have to have a structure, like a fiscal or legal structure, where if we receive funds we can legally hire someone. Right now the project works without this administrative shell. So far it's worked for us, but it cannot go beyond the need of hiring people ultimately, so we need to get what's called a fiscal sponsor, basically our legal and fiscal representation. We also need to raise funds. We are thinking of doing that through crowdfunding, because we think that enough people are interested in giving some money, and we don't need that much money either. And we also need to attract Java developers, and if you paid attention, we are looking for Java developers to contribute to Gephi, so maybe that's you.
And we want to have better community tools, meaning also websites, tutorials, and so on. So those things derive from having a demand for Gephi, and of course they prompt an order in which we have to pile up the bricks. We have to start by having a fiscal sponsor, because then we can raise funds. Then we can enroll Java developers, because then we can pay them if they want to participate in the project. Then we can organize the maintenance, and we can have a better code base that is more sustainable over the long term, and that will make the Gephi project more sustainable, at least on the code base level. I didn't number the track on the top, which is the community track. It's kind of another branch of those things. It can happen at the same time, and it's also ongoing right away. I'm just going to acknowledge that I don't want to forget about it, but it's independent to some extent. So that is how we think of our roadmap. The most important thing for us is to make the project sustainable. In practice it may mean things such as making the software easy to install for the users, something that might be overlooked but is so important to the public of Gephi, which is beginners. Gephi is based on an OpenGL visualization of the network, so drawing on the GPU is very important for efficiency when you have to display a large network, but it has to work on many different configurations, which is not obvious because of drivers and all of that. Gephi is multiplatform, I didn't mention it, so that means Linux, macOS, Windows, and so on. We want to be able to test everything if we can, we want it to be stable, we want to be able to fix the major bugs when they happen, and so on.
We want to finish the project at some point. Now it's ripe for that, so that's where we are heading. And then there are other things: we would like to have the ability to stabilize core contributors, so people who come and whom we want to keep because they get involved in the project, you know how it works. I've listed a few other things here. I'm not going to mention them, we will come back to that. There are also a lot of other aspects that are important, even though they are kind of all over the place for the project. So our technical roadmap for the tool towards version 1.0 had a number of items. We established it a few years ago, like two years ago, and we've done part of it already. So some stuff is done: embedding Java, so that you don't have to install Java. You install the app, Java is in it, and people don't have this issue of having the right version of Java. We've updated to the latest NetBeans Platform. We are now GDPR compliant in the bug reports that we receive, if people want to send them to us. We are now working actively on the new OpenGL engine, and we'll explain why. The quick search is there: there is a prototype already working in the latest version of Gephi, so you can search your nodes right in the interface. New icons are coming, and we are reworking the UI. And then we have a lot of other things that we think should be there before we can call it Gephi 1.0, namely the highly demanded undo feature, autosave, and managing parallel edges. The hardest part of that is done, but the last mile has not been done in the algorithms and so on. So it's coming, along with a bunch of other things.
So one of the things we've tried to do to address that issue is to organize what we've called Gephi code sustainability retreats. Now they are becoming Gephi Week, for obvious reasons of shorter naming, I think, but the idea was to make the code sustainable, and it's evolved into something that has gotten out of hand in some ways. That's the kind of thing I want to talk about. The first one happened in 2021, and it was basically about restarting this effort of maintaining Gephi. We had to show that we are still there, that we're going to go back to Gephi, improve it, maintain it, and so on. We attracted just a few people, so it was a small team, we were just six. The edition of last year was at the end of August, beginning of September 2022, a few months ago, and then we had like 22 participants in Paris, including a lot of different people, and this was more about pushing forward to other things. We're going to show you a few pictures and tell you a little bit of the story of what happened in these meetings. So we made a call, some people answered that they wanted to participate, and we were able to fly them in: we paid for travel and accommodation. For instance, on the left you have Antonin Delpeuch, who is the maintainer of OpenRefine, if you know that tool. It is dedicated to cleaning data and is very much used in the social sciences, newsrooms, and so on. OpenRefine used to be Google Refine, so it's an open source project that had a life in the industry before being released. It's coded in Java and it's a very well structured project, so Antonin was really good at telling us how he did it for OpenRefine to have a fiscal sponsor, and he helped us design the upcoming undo feature. For instance, we also had Tiago Peixoto, whom you may know for his work on community detection algorithms. So he's more of the hardcore researcher, if you want, but he's also the developer of graph-tool, if you know these science
tools for networks. And you see Mathieu explaining the code base, so there was also a kind of school aspect, raising the competencies of every participant to understand the code base. Compared with the year before, there were many more people and much more varied people: designers, researchers in social science, people who are doing open source intelligence, or OSINT. So it was kind of broad, and of course there were developers, and we all worked together. Of course it's kind of a free-for-all in those situations, and some people pushed in unexpected directions, which was good. For instance, you can see someone working on the new icons and finding new ways to visualize community detection, so it was half consolidating the code and half exploring new directions. The takeaway is that it works. These Gephi Weeks are really working for the project. They are not that expensive for what they bring us, but they also go in many other directions that are super nice but not exactly what we wanted, which was making the project more sustainable, and they contribute only indirectly to that. So it was an unexpected result for us, because it attracted unexpected people and it had unexpected outcomes. It's becoming kind of a mini festival, and we hope to do that every year, so we're going to make new calls for participants. It's really a nice place to be, because there are so many interesting people to talk to and it's really lively, but we try to maintain a balance so that we don't forget about making the code robust, which might be the most boring part but is also the most necessary part, because if Gephi just doesn't work, then none of that exists. Hence the question we have, which is how to attract Java developers. We've not been so great at doing that in the Gephi Weeks. It's the flaw of this whole operation that we fail to attract Java developers. So why is that so? The idea we have is
that we could look at how Java developers got involved in the project. The first way you can get involved is that you made Gephi yourself, so that's one way. Another way is that you've developed a plugin. The plugin is an entry point, because you can develop plugins for Gephi. Some people do that because they want to implement a niche algorithm they've developed, and from there they can get to know the code base better and better, and we can maybe draw them towards the core of Gephi. It has happened, for instance with Matthew Tote, who is not here. You could also be a user of Gephi who happens to be a developer and through that become involved in Gephi. This has happened, but it's a very slim overlap of people who have a use for Gephi and also happen to be developers, and usually it's unconnected, right? You don't use Gephi because you're a developer, you just happen to be both. Also good, but it's not so many people. So the question we have is also a question for you, because it might be you, but you could also have an opinion about that. Is there such a thing as a fourth path, which would be "I am coding with or using Gephi in the industry as a Java developer"? Then we would like to connect with you and see how we could draw you into the project, how you could help us, and so on. Or are there just no people who work as Java developers with Gephi in the industry? I would like to know that. And from there I'm going to give the mic to Mathieu, who is the architect of the code. Just to give you the battery, we're going to swap. Oh, you want me to install it? Okay, so the mic is here. Okay, it's fine, I'll just keep it. No worries. Okay, well, welcome everyone. I'll continue the presentation from now, and I'll talk a little bit more about the code aspect, what we've done to
eventually engage more developers. I mean, of course we are open source, you can browse all the code on GitHub and so on, but that's only the starting point, right? So what do we really need to do to make the project more inclusive for new developers? We've done some actions and we plan to do more, but it essentially is about documentation and entry points. We try to reduce the number of GitHub issues, so it's not a complete mess when you try to understand what's going on. We started to do a few YouTube videos to teach the code base, and we try to improve the experience for somebody landing on Gephi and having an issue, trying to convince them it's actually not that hard to get started on the tickets, and here are a few videos or things you can watch and follow. There are other ideas, but this is the main idea, and we also have some open questions there. Are there actually technological problems, so that maybe the number of people knowing how to do Java UI is not high enough, or it is not interesting enough, and that prevents this from happening? Or are there some other reasons? Obviously the user base of Gephi, as you know, is mostly non-developers, right? So we're talking about a fraction of those users that might potentially be contributors. We also have the Gephi Toolkit, which is the library version of Gephi, so only the core, which is more used in automation. That could be another entry point, but we haven't explored that. I want to talk now a bit about the actual software and why it is so challenging. First of all, Gephi is a full product in my view. There is a user interface part. There are also some design elements, right? You have to be able to understand how to get to the different functions. There are some performance issues and challenges, right? So it's large graphs,
memory efficiency, so it kind of has it all. That's also what I found personally very interesting. Then there are also some technological challenges, such as Java itself. We use Java Swing, for example, which is not so popular anymore. How do we overcome that in the long run? Also, support for OpenGL in Java is not amazing. How do we overcome that? Then we have a couple of principles that we have put into the software for a long time that may actually deliver a lot of value but are also quite challenging, such as its multi-threading aspect, right? There are lots of issues and bugs that may be related to the fact that everything needs to happen in parallel. That's kind of cool about Gephi, but for a developer it's also pretty hard. And then we also commit to having a multi-platform software, right? There are certain processes, for example, that you need to go through to certify your application on macOS. It's pretty cumbersome, and it's a completely different process on Windows. We have now developed some automation for this, but you need to spend a lot of time to make sure it works on Linux and so on and so forth. Historically we haven't put enough effort into testing, so the quality assurance: when you put a new release out, how do you make sure that it actually works and doesn't have critical bugs? Then there's also the plugin lifecycle. If you submit a plugin, we need to review that it works, and there are dozens of plugins, and everybody wants to do a new version. How do you review that and manage that process? And finally, the ecosystem is very wide. You have graph databases, you have plenty of other tools, and Gephi needs to be somewhat interoperable with these other tools, right, to make the user's life easy. So I'm going to stop here. It looks really doomed, right? So it's like we're not
gonna make it, but still we have some belief in it, and partly also because the challenge is very big, right? As I said, personally I like big challenges, so this long list of things is kind of fun as a developer. But I think we also have some assets that make us believe in the project in the long run. First of all, it's a desktop tool, and we've seen, for example with the recent release of the M1 processor, that there's still a lot of innovation happening in hardware. The web is great and it can flourish, but there is also great stuff on desktop. We have a great modern architecture. We have a great graph structure called GraphStore that's very robust and very well tested, and we have a fully automated build cycle, so you can release in one click, so to speak. It doesn't take that much effort to put a new release out. We've built all of these things over the years, and we think that they can bring us forward. Recently we've also been successful at rewriting certain modules and seeing that they work well and are more reliable, so we can rewrite piece by piece, not do a big bang 1.0, but get there along the way by testing milestone by milestone. We also have some cool work in progress, such as the new visualization engine, so we know that there's potential in what we have in the works. And finally, the foundation on which Gephi is based, for example Java itself, keeps evolving. The language is very mature, but there are also lots of cool new things coming out. We could probably 10x the performance of Gephi if we were to leverage all of these recent innovations. I will now pass it back to Mathieu for the web version. Yes, and we have one or two of the developers here with us today. So we're branching out to the web, and maybe I should start by saying, how many years ago, three years ago, the last
time there was a physical FOSDEM, so we skipped two years. So I think it was about three years ago that we were here in this dev room presenting the dystopian future of Gephi in JavaScript, a new benchmark. So we've actually considered very seriously the question of going full web for Gephi, and the answer is that there are still limitations to web technologies, so that, for the moment, it cannot be a replacement for Gephi, because it cannot go as far as Gephi in terms of scalability to large networks. What we understand of the people using Gephi is that one of the things they value is that you can start with very small networks and do your tinkering as a beginner, and when you have actual data that comes from the real world and is really big, you can scale up to very large networks, and your knowledge is going to scale. So we are going to have Gephi and, along with it, a web version, but that's going to be more tailored towards beginners and networks that are not that big. If you have really big needs, then you use the Java version, but for teaching we think that the web version could work, and that's why we're calling it Gephi Lite. It has slightly fewer features, but in some ways it also has more features, because the web offers opportunities. For instance, you could modify your network through JavaScript, because it's the ecosystem where everything exists. I want to say that this is a different team. We know them very well, but it's now ripe, it's ready. The slides are on the FOSDEM website; I do not have any more time, but you can see the demo and the repository. So I'm going to skip this slide for the sake of time, but basically these are the things you can expect are coming, and they will end with Gephi 1.0.
Only then will we talk about what Gephi could be differently. And what do I conclude, or do we conclude together? What do we do? I'm going to do it, and if I miss something, you add it. So we're looking for fiscal sponsors. If you have feedback about that because you have an open source project, we'd like to know. And we have a question for you: should we release version 1.0 as a way to get better crowdfunding, or should we do crowdfunding in order to get to 1.0 quicker? Which is at the service of the other? It's an open question, and we'd like to know what you think. We're looking for Java devs. Spread the word and contact us if you're interested. It's a really interesting, really exciting community. And we want to do a new Gephi Week this year, so in about 10 months. If you can offer some sponsorship, we are really interested and we'd like to discuss it with you. That's why we are here today. So thank you very much for your attention. |
Welcome to the Haskell devroom |
So, welcome to the first Haskell Dev Room at FOSDEM, and thank you for coming. We have a great lineup of speakers today on a diverse range of topics, and a bunch of the topics are really aimed at newcomers to Haskell or people who haven't been programming Haskell for very long. So, please encourage people you know here at FOSDEM to come along. I believe that there is a live stream, there is a Matrix chat room where you can put questions or discuss the topics, and we're going to have a raffle at the end of the Dev Room. We will hand out raffle tickets to people who are new to Haskell. So, the prizes: we have three copies of Graham Hutton's Programming in Haskell, second edition, and we have the booby prize, a jar of Vegemite from my native country. We have stickers up the front here, so come and grab some stickers to adorn your laptops or books or whatever you like. We have a lot of stickers, so feel free to take a handful if you want. The first talk will be in just a few minutes, so I will introduce the speaker in a couple of minutes. Thank you. |
A quick overview of the Haskell tooling |
Okay, so here is a quick overview of Haskell tooling. I am Julien Dehos, I am an assistant professor in computer science, and I have used Haskell since 2015, initially for teaching functional programming. Since the beginning of the language, many tools have been created for developing in Haskell, and today most Haskell developers use the GHC compiler. For building Haskell projects, we have nice tools such as Cabal or Stack. Haskell is now quite well integrated in editors such as Visual Studio Code, Vim or Emacs, thanks to LSP implementations such as HLS, and all these tools can be installed using tools like GHCup or Nix. In this talk, I will focus on Cabal with VS Code and the Haskell plugin. First, we have some online tools such as Hackage, which is a package archive, so you can go to the website and search for packages. There are, for example, libraries for doing whatever you want. You can access the documentation of these libraries, so you can see what to do with a library and how to use it, and you also have access to their source code, with nice colors and code navigation, which is quite useful. We also have Hoogle, which is Haskell's Google, so it's a search engine. You can type the name of a function, and Hoogle will give you a link to the documentation of that function on Hackage. If you don't know the name of the function, you can also write the type of the function, and Hoogle tries to find a function that matches that type, so you can see its documentation on Hackage, and you can see if it's the function you are looking for. To work on a Haskell project, you can use Cabal, which is a tool for building and packaging projects. To use Cabal, we have to write a cabal file, which is a configuration file where you can specify some information about your project, and also define the targets you want to build in your project. For example, if you have a library or executables, you can write them here.
You can also add some dependencies, for example, libraries available on Hackage. Once we have this file, we can use the cabal tool, so we can run cabal build to build our targets. When you do that, Cabal will get the dependencies from Hackage and run the compiler to build all your files. We also have the cabal run command to run specific targets, and you can also give command line arguments if your program requires that. We have a REPL, which is a read-eval-print loop. This runs the compiler in interpreter mode, so you can write some Haskell expressions, and the compiler will evaluate these expressions and print the results. It's very interesting for testing some code, and you also have more specific commands. For example, here, you can ask for some information about a type, a function, or anything. Okay, so to work on a Haskell project, we can use editors like Visual Studio Code with HLS and the haskell.haskell plugin. It's a very classic tool, so you have the files of your project. You can open them and edit them. You also have code navigation and documentation, so if I put the mouse pointer over a function, VS Code will show me the documentation of that function, and if I control-click on the function, VS Code goes to the definition of the function. We also have code completion, so VS Code tries to complete the code you are typing. We have integration of the compiler, so if there is an error in your code, VS Code will show you where this error is, and it can give you the message from the compiler. Even if your code is correct, VS Code can help you improve your code. It can give some hints to refactor it. For example, here, it says that my code is correct, but it would be better if I used the <$> operator instead of the fmap function. We also have typed holes. Let's say you are writing some code, and you don't know what to write at a specific place. You can put this underscore character, and the compiler will tell you what it expects at this place.
For example, here, it says that it expects a function that takes a String and returns an Int. We have an inline REPL, so you can type some Haskell expressions as comments in your code with a specific prefix. When you do that, VS Code will show a button, and if you click on this button, it will evaluate your expressions and add the result of these expressions in the comments below. It can be very useful for adding some examples as comments in your code for documenting the code. Speaking of documentation, we have Haddock, which is a classic tool where we write the documentation of our project as comments inside the code, and then we can run cabal haddock, and this generates the documentation as HTML, which looks like this. As you can see, it's the tool that is used for generating the documentation on Hackage. Finally, we have some tools for testing our project. First of all, Haskell has a quite powerful type system, so it already prevents us from writing many errors, but we still need to test our code. We can do that with very classic unit tests. For example, here, we just write a Haskell expression with a specific input, we call a function on that input, and we write the value that we expect for this input. So we can write many inputs and test many functions, and when we run cabal test, this will compile our testing program and run it, and it checks that every expression is evaluated and returns the expected value. If there is a problem in one of the tests, cabal test will tell us which function fails. We have more than that. We can use property-based testing. Instead of giving a specific input, we can write a property, which is a function that takes an argument and returns a Boolean. This Boolean says whether the property is satisfied or not. And when we do that, QuickCheck will generate random inputs. Here, it says that it has generated 100 inputs, and it tests the properties on each input.
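As a rough illustration of the editor features just described (the refactoring hint, typed holes, evaluated comments and Haddock comments), here is a minimal sketch. The function names countDigits, totalDigits and nameLength are hypothetical and are not taken from the speaker's slides; only base is assumed.

```haskell
import Data.Char (isDigit)

-- | Count the digits in a string (hypothetical example function).
--
-- The line starting with @>>>@ is the kind of comment the editor can
-- evaluate in place, writing the result on the comment line below it.
--
-- >>> countDigits "a1b22"
-- 3
countDigits :: String -> Int
countDigits = length . filter isDigit

-- A typed hole: writing an underscore where the argument of map goes, as in
--
--   totalDigits = sum . map _
--
-- makes the compiler report the type it expects at that place,
-- here a function from String to Int.
totalDigits :: [String] -> Int
totalDigits = sum . map countDigits

-- The kind of hint mentioned in the talk: both forms are correct,
-- but the operator form is the one usually suggested.
nameLength :: Maybe String -> Maybe Int
nameLength name = length <$> name   -- same result as: fmap length name
```

Running cabal haddock on a project containing such a module would turn the comment starting with the vertical bar into the HTML documentation of countDigits.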
If one input makes the test fail, QuickCheck will try to shrink the input to the smallest value, so that it's simpler for us to debug our program. To conclude, Haskell has had some nice tools for many years. All these tools, Cabal, the REPL, QuickCheck, are quite old and now very mature. More recently, we have had very nice integrations in editors like VS Code or other editors. This is quite easy to install, at least on Linux. You just have to install VS Code and the haskell.haskell plugin, and that's it. You have a nice Haskell environment for developing your project. These slides and the code shown here are available at this link. You can also see the link below, which presents other alternatives. If you prefer to use Vim or Emacs, there is a tool you can use to do that. And that's all for me. Thank you for your attention. Thank you very much, Julien. There is time for questions. Five minutes. Just shout it out and we can repeat the question. Can you please repeat the question? What is the difference between the cabal REPL and GHCi, which is the REPL from the compiler? In fact, I think it's pretty much the same tool. Cabal will call GHCi, the REPL from the compiler. But if your project has some specific dependencies or some modules, the cabal REPL will take all of that into account, so you can inspect that code, and it's a bit more powerful. But it's the same tool in the end. Which tools do you recommend for debugging Haskell? I don't use the Haskell debugger very much. Debugging in Haskell is quite different from other languages, I think. There is a debugger where you can inspect memory or the runtime system using the compiler. It's not something that I can do very well, so I won't recommend anything. In your examples, you showed that you can have comments evaluated as a REPL. I know Rust has something quite similar, and it makes sure that the examples in your documentation are tested. Can you do similar sorts of things?
You had an example of a REPL evaluated in comments in your code. This one. Is that specifically the VS Code extension? I think so. It's provided by Visual Studio Code with the Haskell extension, and it's automatic. Yeah, Rust has this feature. The evaluator is doing the test for you. Yeah, so what is very confusing for beginners sometimes: if you come from another language like Rust or so, then there is one way to build a project. In Haskell, you have Cabal and Nix together, and then you have Cabal v1, v2, new. Is there some plan to clean this up someday, to give a simple way to build a Haskell project? In fact, Cabal and Stack don't do exactly the same thing. Stack is based on snapshots, so it's more secure if you want to have the same build for every project. But both tools are compatible, so Stack can use the cabal file. You can just write a cabal file, then add the version of the snapshot you want to use for Stack, and that's it. But I agree with you, there are many tools. I'm not familiar with Stack, but I thought that Cabal has something like a lock file nowadays, like a freeze. Is that similar to that? Cabal freeze. Nix with Cabal, so there is a freeze with Nix, but I don't use Cabal very much. Okay, we're out of time, so please once again thank Julien. |
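The two testing styles described in the talk, unit tests with a specific input and properties written as functions returning a Boolean, can be sketched roughly as follows. This is only an illustration under assumptions: myReverse and the test names are hypothetical, and the block deliberately uses nothing beyond base so that it compiles on its own.

```haskell
-- Hypothetical function under test.
myReverse :: [a] -> [a]
myReverse = foldl (flip (:)) []

-- Unit-test style: a specific input and the value expected for it.
unitTests :: [(String, Bool)]
unitTests =
  [ ("reverses a short list", myReverse [1, 2, 3 :: Int] == [3, 2, 1])
  , ("leaves the empty list alone", null (myReverse ([] :: [Int])))
  ]

-- Names of the unit tests that failed (empty when everything passes).
failedUnitTests :: [String]
failedUnitTests = [name | (name, ok) <- unitTests, not ok]

-- Property style: a function that takes an argument and returns a Bool
-- saying whether the property holds for that argument.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = myReverse (myReverse xs) == xs

prop_keepsLength :: [Int] -> Bool
prop_keepsLength xs = length (myReverse xs) == length xs
```

With the QuickCheck package added as a dependency, calling quickCheck prop_reverseTwice (from Test.QuickCheck) in the cabal REPL or in a test suite would generate random lists, 100 by default, and shrink any failing input toward a smaller counterexample, as described above.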
Hackathon HaskellKatas style
Install a complete hackable haskell katas environment for a new hackathon concept |
And our next presentation is a hackathon, so it's the hackathon HaskellKatas style, and we have Methodis and Renaldo running this hackathon. So thank you very much. Over to you. Thank you very much. Okay, so we are, as he's already told you, Methodis Cordero and Renaldo Cordero, and we are going to present to you our project called Haskell Katas. To give a brief introduction of the content we're going to talk about: first of all, we're going to set the goals for the learning improvements that we want to implement in our project. Then some characteristics, which will just be a summary of the things we're going to do in our project. Then we're going to talk about the project, and finally we have about five minutes for questions and answers. So, for learning improvements, we want to set some goals. We found three goals. The first goal is that we want to learn on the way, so that we can speed up our learning curve for Haskell and also make it comprehensible, with a comprehensive method. Okay, the characteristics we want in our project are these. First of all, we want katas as a way of learning. Katas are something that we're going to explain later in the presentation, and the thing we want to implement is some exercises for which we can upload our solutions, so that people can interact with other solutions and see how other people respond to those cases, and also to have many different exercises so that we can learn. So let's talk about the project. We have made here a summary of the steps of this project. First of all, there is an installation step, which includes a validation step so that we check that everything is all right, and when we have finished the installation, we have to log out. In step two, we recommend that people write new key help in the terminal, so that it gives a brief summary of what the project is about.
Step three is the examples that we want to show you, but it is also the step in which you do the exercises. We're going to do a very, very simple exercise, a multiplication one, so that you can really follow it step by step. So what is a kata about? Well, it's a type of exercise that has this appearance. On the right, you can see the editor and the instructions in separate windows. The editor is the window where you are going to do the exercise, and it has two files, the test file and the solution file. On the left, you can see the editor changes window, where you can see the changes made in the editor, and right below it the compiler, which shows that everything is all right. So as we have already told you, the editor is where we're going to do the exercise. The two files, yeah. So step one. All the steps are written in the comment window, where it tells you what to do. First of all, it tells you to go to Codewars and update the local files, so we have left you a link here so that you can enter, and when clicking here, you will see this. Here you have the solution and the sample test. Here you copy the solution, take it to your Haskell kata and paste it, the type and the parameters. In this case, we want to multiply, so we're going to take an integer and another integer and return an integer, and the parameters are going to be a and b, simple. On the other hand, we're going to take the code that is in the link below, the link we already talked about, and we're going to paste it in the sample test, the test file, yeah. So this is what we had before, and now we incorporate that code from Codewars. Step two. Step two tells you to get a red. This means that it compiles but it doesn't pass the test, so you will receive a red message in the terminal of the stack build. So the next step, it gives you a final step that shows you these things.
First of all, resolve the exercise and doing that, you will receive a green message. Then you can certificate where here you will modify your own solution, trying to be the most expressive as possible and also you will validate with the code words. You will take the code and then you will paste it in the code words. And this is where the interesting part is, is that first of all, it allows you to see other people's solutions to the same exercise. So you can learn like different ways to do that. Also, it allows you to see the whole test cases and also upload your solutions to the test cases. And to commit, you only have to write in the terminal new key two and that's it. Some hot cases that you want to use while doing this. In the editor, you can use addp to save all the changes. Some other shortcuts could be shift addp to show turbulence, shift addq to lockout completely and shift add 125 to go from one terminal to another. And that's it. Thank you for listening and if you have any questions, my dad will come. Pa pa menti. Yes, we have until 20 past. Perfect. He's going to do a demonstration of it. Great dad. Well... Do you want to do it? Yes, of course. Yes? For the work to be done. Thank you. First of all, I want... First of all, I quit this. Okay? Simple. Shift addq. And let's start a terminal. Open a terminal. Now we can... Okay. Multiplie. Multiplie. Okay. This is the time to set up a Cata to continue a Cata. Okay? This is a Cata when I was working with this Cata and then I interested in continuing the Cata. Okay. This is the normal state for a Cata. You are in a red state, so it compiles, but it can test our work. Let's get a solution. A plus B. I think it's a correct solution for this. So, but in this... Well, if you read the instructions, the first step is to Cataficate code. It's very important this idea. To Cataficate code is giving this code that is correct out P to trial. It's an experiment. All tests are okay, so perfect. 
But in this time, the idea is to Cataficate this. To make this very simple code with more meaning. Please, somebody do something with this code to give more meaning to this. It's very interesting because good programmers don't know how to do with this. But the idea is you must try to give and Haskell is perfect because it's perfect for this. I gave one solution. Look, this is an equal. So, I think after the equal, after the equal is distinct from before the... So, I think I can do this. And maybe this is my personal and my personal try to Cataficate this code. So, first line, one meaning, second line, another meaning. Okay? Yes? You could rename the input variables. Yes, yes, perfect. You can rename the input names. That's not your personal approximation for the solution. And it's easy because give me a name, please. Operate one. Operate one, okay. The other one is Operate two, ever. Okay, no, no, no, no. Okay. Sorry. Perfect. Because that is correct and... Well, almost everything you made and that... Wait a minute. That compile is okay. Okay? And you are experimenting. That's the idea. Okay? Okay. And P and very fast. Okay? Your experiment is very fast. You don't mind about tooling, about anything. You are only... And this is an idea. Okay? When you get this, you can continue with this. But there are a lot of other approximations. You can do a lot of things with that. With this. Look at this. You can simply make this. Okay? A, B. Okay? And... Okay? So in columns. Okay? So... Oh! Two, three. One with another line. Two with another... Another names. And three. But there are a lot of... There are a lot more because the spacing is important. Because it's your space, your... Okay? Imagine this one. Okay? Pass the test. Okay? Yes. The service experiment. So you... Maybe from time to time you fail. Okay? So it's very good for... Experimenting. And then we are here. And then new K3. So you simply... New K3. And then... Ah! Okay. Commit your certificate code. 
Please enter a commit message. In this case what... To a commit message for this. Like saying these symptoms. Maybe a rename. Rename? Okay. That's it. And then here is your code. The idea is to make fast experiments and... And forgot about environment, tooling, everything. You just... Okay? This is a very, very simple code. But when you reach other codes, you experiment a lot of interesting interactions with... With what you... With other catification you have made on your style, improve greatly and make a lot of interesting experiments. That's the idea behind this. And then another in the first position. The idea is you. No. My... Okay. Yes. This is Haskell Catastrophe of Comset in Google. So you get this. This is the webpage of the project. If you... If you... Go down here. There is a way to install. Install. Install... Here. Recommended. Install via script. You only... Unload... Download this code. And execute and you install the whole... The whole system in very fast, very easy. It's all free software. It's... It's very simple. It's a... It's a bash script. And... Okay. This is the code of the installer. It's a regular... It's very, very, very simple. But... Ah. One curiosity, it uses a cinema to... To validate the... Well, to... So you can revise the code, the installation. If you install using a cinema and you can revise what was... What were your... What have you done? Well, this is the... This is the code wars web. This code wars... When you... Download... Take the... This one. The... Okay. This one. And this one. This one. Okay. And you get this. The solutions of the... Of the... Of the... Of the... You get these solutions. Okay. It's very important because if you... You can get the test cases. Not only the simple ones, but the... You see in proper... The advanced test. Okay. You see property. So it's more robust to validate. Okay. Shift alt P. And you can... Wait a moment. Okay. Yes. This is the... Okay. Of course. 
And when you use this on your computer, You have total control and you can use your own strategy. Your own... Your libraries, everything. You decide everything. Not only... It's like code wars. But in a more... You have more control. And then the cat-a-ficate... The idea is you take the next one. Next solution. This one. Okay. And you copy. And past here. Okay. Look at the idea. Okay. That's the idea. So obviously, because it's a solution, It works. Okay. And then you... Okay. Alt P. Okay. Obviously, it compiles. And then you do another cat-ification. What's your... What's your approach to this cat-ification? And one idea. Maybe if you... Maybe you hit after the equal. And in column, multiply. So visually, you establish this idea. The first is the second. So it's very curious because... It's not normal to do this in a professional... According. To do this, it's easier to get this idea. So maybe it's a cat-ification. But the idea is you can establish your cat-ification. Okay. Alt P. And another one. You, following the instructions, New K minus three. New K minus three. And another. Maybe this is a section. Well, if you don't know what... You can invent whatever you want. I know this section. So maybe a section. Let's see. And another one. And you take another solution. Okay. Look at the solution. It's very interesting. It's a very interesting solution. Okay. Here. The idea is very fast. You can do this very, very fast. Okay. Yes. This compiles. Okay. And then you cat-ificate this. Maybe it's a... Well, I imagine you can do this. And maybe you can do this. Or whatever. Because I know this is... No, you can do whatever you do. It's okay. Okay. This is another... Here. So maybe try again. Should be. Should be. Should be. Okay. Small problem because... No, no, no. Ah, maybe it's not exactly okay. Ah, okay, okay, okay. Ah, okay. Yes, exactly. Because I... Andu. Andu, andu, andu, andu. Okay. Exactly. That's the idea. Because I lost the... Okay. Alpine. Experiment. 
I'm very fast. If you don't know where you are. Andu, andu. And you... The idea is you don't get stuck using Haskell. You continuously... In contact with Haskell. With no pain. And experiment. And very easy. And that's the idea. Okay. Okay. And this is... This is for a workshop. For a workshop. In a workshop. It's very easy because I can... All of... The fun is in doing this. And when you cataficate other code, you realize that this is very easy, very fast, etc. That's the idea. Okay. Do we have any questions? Any questions? Okay. That's it. Thank you very much. Thank you. Thank you very much. Thank you. Thank you. Thank you. Thank you. Thank you very much. Thank you. Thank you. Thank you. Thank you. That is... So there is... Thank you very much for that, Ronaldo. And there is 20 more minutes in this session. So I would encourage you to... for your Haskell Carter's environment. And if you are able to assist people, answer questions, then we can do that for the next 20 minutes. Okay. Thank you. I have 20 more minutes. 20 more minutes. Yes. Oh, I... You want to keep going? Oh! I have 24... Perfect. Please. Come here and please... So 20 more minutes here. Feel free to come up... Perfect. And maybe just fly all your limbs if you have a question. And... Okay. Thank you. Oh! Of course. Please. Sofia. Okay. Okay. Ooh. Okay. Maybe we can make another one. We can... Cataficate another one. Maybe this one. Isn't it? No. No. No. No. No. Okay. This one. Is the next? Okay. Copy this. Copy this. Paste. And replace the code. Okay. Look. This is based on observation. Look. This is new. I... A new line import, but I can... I can see this. So my mind is taking note of this. Okay. And of course it compiles. Of course because it's a solution and then the idea is to cataficate this. Okay. If you cataficate, you use your previous catification style and try to do the same. My style was after equal, a return, make a return, after another sign, make a return. Okay. And if you do this... 
Okay. Oh, it's a compile. So it's perfect. But well, I know do is a block. So normally blocks are... So this is... If you don't know this at this moment, maybe later you can realize that. Okay. And look, all these symbols, if you put these symbols on the left, the idea is you can't forget about it. It's... Because your mind is focused on the rest. So... Okay. Okay. This compiles. Perfect. Okay. So another... This is cataficated. So... Yes. Wait a minute. Okay. So this is the... Comments for... And maybe this from... From just. Because these are from just. I think it's interesting. Okay. But you can do whatever you want. Because if you don't do... Well, you learn revising the code later. And because you made... If you do this, you remember a lot of things, a lot of details. At the same time, when you saw a movie, you remember... You can remember a lot of details very fast, because your mind is prepared to do this. It's not... If you look someone who is watching a movie, I can't look in that person. I don't understand that. You are at the wheel. It's very different. So it's important to cataficate by yourself. Okay. And this... You take another one, another one, another one. Okay. I... Okay. The installation. Because it's the... It's the most important thing. You are interested, I think. Okay. The installation is very simple. Well, here. You need to... Close, close. Okay. Here. The installation is here, in the project... Download from installation. Install, we are recommending. You get a file. Let's do it. Okay. Here is the... Install. Okay. Open the installation. You see? Because it's there. Moment, moment. Okay. Okay. Okay. So you... Install. Make sure it's... Okay. It's already bad. It is. Okay. Okay. And, of course... Okay. Installation script. Okay. A cinema is currently installed. So it recommends to... Excuse me. Okay. Here. Okay. The password? Your password. Okay. Here it is. Okay. And that's it. You... You will... You wait. You wait. And, of course... 
You get a file you can revise later, if you are interested. If they... You can get some warnings and even some errors, but it's okay. The program compiles correctly and is ready to use. And the idea is using this environment is very fast. You get a test environment... Well, you are encouraged to use this test, this scientific way of learning programming because you use continuously tests in two ways. Knife test and then you get the extra test. So it's a good two-phase way of learning because you get some errors in the... You pass the code, but when you pass the extra test, you realize your code is not okay. And the idea behind this is modeling your mind, realize... You realize... Okay. It's... No, no, no. No, no. It's... No, no, no. It takes... It takes... It takes quite... If the... In the installation, the screen is apparently doing nothing. It's doing... You need to wait. It's okay. Okay? But in some time, the installation... It's only one installation and it's okay. And you get a system. You can do this... And you can use this for whatever you want. Well, if you want to see a little... A little of the code. This is the install, but the... The code is here. This one is the program. It's a very simple program. Very, very simple. And very hackable. You can change a lot of things very easily. Even you can... You open at least five... Five... Five terminals. Okay? The editor. The REPL. A REPL. Okay? Okay? So you can... You... For your experiments... Your diff. Because you... Make some... Okay? Oh. Oh. Sorry. Some... Okay? Alt P. Alt O. Okay? You can revise the code you are... Change. Three... Okay? Obviously... It doesn't compile. And... U.K. Obviously, this is a... Normal terminal. So you can move. You can rearrange. And you can even create another one for other... With other... Intentions. Okay? This is the... Help. For this... Oh. Okay? Minus... H. Two catas. Recommended catas to start with. But you can use a lot of... In code words. Okay? You get... 
Exactly one thousand seven eight... Seven... A lot of catas you can use in the same way. So it's very easy to... And it's very fast to set up in your... In your laptop or in your computer. Because it's copied to small things. And start working with. Okay? That's it. Okay? Ah, yes. This is the help. And you can help. Okay? It's only three phases. First rate is number one. Extended test when you... With your first solution. Okay? And you get extended test. And the other one is next. It's another... Grab another solution. And it's very ready to a hackathon. Because you are with a lot of people doing the same catas. Other people can give you your solutions. And it's very fun because your audience solutions fit your catification phase. Okay? So it's quite fun. You can get fun. And a lot of people who are interested in this. Since very small child to seasoned programmers. Okay? Well, this is a help. Phase one, phase two, phase three, three, three, three. This is lock. Okay? New K. This is the first lock. Okay? With instructions. This is another help with instructions. But with less... It's more simple than this one. Okay? Another interesting is show. So you can revise your solutions. New K minus. Okay? This is... And you can establish visually some similarities. So you can revise what you are doing. Okay? This is one. Okay? Another is faster because it's only one color. Another one, maybe this is... Ah, you can compare. Okay? Compare. Compare. Because when you get this, you can use these codes. This is from Git. So you can do this. New K minus compare. Oh, close. Minus compare. Okay? Copy. And maybe this one with this one. Okay? This is... The idea is the most important thing about this is it's fast to use. You forget about tooling. You can... It's hackable. So I don't want to use Emacs. So you go to the... In the script, you go and change this and change for your favorite editor. I don't like... It's very easy to... Because it's a bash script. So it's very hackable. 
It's very expandable. It's a script, so you can add another terminal with whatever you want to open, and it's very easy to use. Okay? Thank you very much. This is our proposal, and please use it. |
Web application architecture in Haskell with flora.pm
A case study of a Haskell community platform in 2022 |
Hecate is a Haskeller from the trenches with an interest in resilient systems and documentation. When not at work or in the Haskell community, they are a trombonist in various orchestras and brass bands. Hecate uses they and them pronouns and farcical amounts of caffeine to retain human form. And they're going to give us a presentation entitled Web Application Architecture in Haskell with flora.pm. Thank you, Hecate. Thank you very much. Can everyone hear me? Perfect. So welcome to this talk, entitled Web Application Architecture in Haskell: when the domain drives the types and the types drive the program. This talk is intended for a mixed audience of software engineers who are acquainted with the practice of domain-driven design and Haskell programmers who are interested in crafting better systems. The goal is to create a bridge between the practitioners of domain-driven design and the users of Haskell. My name is Théophile Choutri, aka Hecate to the community. I am a backend engineer at Scrive. We are a Swedish company, and we have an e-signature platform for contracts and various documents, as well as a digital identity hub where we aggregate various national identity providers, like itsme in Belgium, for example, or BankID, or FranceConnect for the French people here. I am also privileged to be a board member of the Haskell Foundation, and this is one of my numerous involvements in the community. So, Haskell, the pure functional programming language: these are words, and words mean nothing until we can practically apply them. These words bring us concrete features, like native support for recursion without blowing your stack, a type system that doesn't hate you, higher-order functions, which almost every language has today, and many other features. There are two features especially that I want to talk about: the ability of the language to adequately represent business domains, and the ability to track side effects in a semantic way.
For the adequate representation of business domains, we can use algebraic data types, which allow us to model the real world and its nuances with more precision. Encoding rules by construction at the type level is something that we can easily do. In this example, the members of the Access data type, Visitor and Admin, are promoted to the type level for this User type, which means that there can only be two values in the privileges parameter of this User type. It also means that I am going to get rejected if I pass a visitor user to this function called viewBackOffice, and I'm going to get rejected at compilation. This is something that is trivial to implement at the value level: I could have a check on a property of the object, an isAdmin Boolean, but I would have to write the check manually, and possibly for every function that needs such a check. If I encode this immutable property in the type system, at the level of the types, the compiler is then tasked with checking that I'm doing my job correctly. Next, tracking side effects semantically. If we look at the previous example, we can see that viewBackOffice has a result type of IO HTML, which more or less means: I am performing observable side effects and I return you a value, an HTML. But that doesn't adequately represent the reality of the function. Here, for example, we have a way of tracking the side effects that are being executed in a more human-readable form, and we can declare them at the type level.
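As an illustration of the privileges example described above, here is a minimal sketch using the DataKinds extension. The slide itself is not in the transcript, so the exact definitions are assumptions: the field name `userName`, the `HTML` stand-in type, and the sample users `alice` and `bob` are all hypothetical.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

-- Access levels. With DataKinds, the constructors Visitor and Admin
-- can also be used as types (written 'Visitor and 'Admin).
data Access = Visitor | Admin

-- A user tagged with a privilege level that exists only at the type level.
-- The field name is hypothetical.
newtype User (privileges :: Access) = User { userName :: String }

-- Hypothetical stand-in for an HTML document type.
newtype HTML = HTML String
  deriving (Eq, Show)

-- Only users whose type says 'Admin may view the back office.
viewBackOffice :: User 'Admin -> IO HTML
viewBackOffice (User name) =
  pure (HTML ("<h1>Back office for " ++ name ++ "</h1>"))

alice :: User 'Admin
alice = User "alice"

bob :: User 'Visitor
bob = User "bob"

-- Uncommenting the next two lines makes compilation fail, because
-- User 'Visitor does not match the expected User 'Admin:
-- rejected :: IO HTML
-- rejected = viewBackOffice bob
```

The privilege check costs nothing at runtime in this sketch: `User` is a newtype over a plain `String`, and the `privileges` parameter is only seen by the type checker.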
So here we have a function whose signature is: we get an Int, which is an identifier, and we return text in this Eff monad, and this Eff return type also has a list of effects. So what are these effects? Basically, I declare that I'm going to perform database access and possibly mutation, I am going to access a Redis server for the caching, and I'm also going to ship my logs to an external platform, or at least perform the logging effect, whatever that means. Then we can have a more useful breakdown of which functions have which effects. So getEntityName is composed as follows: first we get the entity from the cache if we have it, then if we don't, we switch to the database to perform a perhaps more costly access, and finally we log. Logging, DB, Redis: all of them are then unified in this list of effects at the type level. When a type system is both strong and expressive, we get a lot closer to fearless refactoring. What does that mean, since many languages today claim to provide such a thing? With fearless refactoring, you more or less get vibe-checked by the compiler, which is pretty cool, because the compiler then keeps you in check regarding the changes in your program and how they affect the program's behavior. You also have to come to terms with the fact that your worst enemy is now yourself: you can't blame errors in production on "undefined is not a function". There are limits, of course, to correct by construction, and I think it's especially important to be intellectually honest about that. Haskell is not a prover; you can't prove lemmas and theorems with it, so you have to write tests. Tests come in many shapes and forms: you write tests for your integrations, your properties, and end-to-end tests. Tests are not optional, unlike Maybe.
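To make the idea of declaring effects in a signature concrete, here is a rough sketch of the getEntityName function described above. It deliberately does not use the Eff monad from the talk: to stay self-contained with only the base library, it swaps in type class constraints (one class per effect) as a stand-in for the type-level effect list. The class names, method names, and the `Pure` test interpreter are all assumptions made for illustration.

```haskell
-- One class per kind of side effect (hypothetical names).
class Monad m => MonadRedis m where
  cacheLookup :: Int -> m (Maybe String)

class Monad m => MonadDB m where
  dbLookup :: Int -> m String

class Monad m => MonadLog m where
  logMsg :: String -> m ()

-- The constraints play the role of the effect list: this function may
-- touch the cache, the database, and the logger, and nothing else.
getEntityName :: (MonadRedis m, MonadDB m, MonadLog m) => Int -> m String
getEntityName ident = do
  cached <- cacheLookup ident
  name <- case cached of
    Just hit -> pure hit        -- cheap path: cache hit
    Nothing  -> dbLookup ident  -- costlier path: database access
  logMsg ("resolved entity " ++ show ident)
  pure name

-- A small interpreter for experiments: it threads a list of log lines
-- and performs no real I/O.
newtype Pure a = Pure { runPure :: [String] -> (a, [String]) }

instance Functor Pure where
  fmap f (Pure g) = Pure $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Pure where
  pure a = Pure $ \s -> (a, s)
  Pure f <*> Pure g = Pure $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Pure where
  Pure g >>= k = Pure $ \s -> let (a, s1) = g s in runPure (k a) s1

instance MonadRedis Pure where
  cacheLookup 1 = pure (Just "cached-name")  -- only entity 1 is cached
  cacheLookup _ = pure Nothing

instance MonadDB Pure where
  dbLookup ident = pure ("db-name-" ++ show ident)

instance MonadLog Pure where
  logMsg line = Pure $ \logs -> ((), logs ++ [line])
```

For example, `runPure (getEntityName 2) []` goes through the database fallback and returns the name together with the accumulated log lines. An effect library with a type-level list, as in the talk, expresses the same declaration in the return type instead of in constraints.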
So, functional application architecture. What I showed you gives us tools to focus now on the topic of functional application architecture, and this time we arrive in the land of domain-driven design. There will be a brief overview of the terminology and the techniques that were created in the discipline and how they apply to us. So we have, without surprise, the concept of a bounded context. It's a context, and there are workflows in it. So what is it really? It's an autonomous subsystem: it's responsible for one or multiple workflows, and it has well-defined boundaries, which is extremely important. So we have to formalize, or at least be very explicit about, how we talk to it: its inputs and its outputs. If we take the example of flora.pm, which is a community website, an alternative package index for the Haskell community, the schema is fairly simple. We have the web component, which goes to the core component, which is then tasked with interfacing with the database, and we have a jobs worker for the job queue that also talks to the core component and to the database. But we know that it's not going to talk to the web component. So this is a kind of setting boundaries: they are in a healthy relationship, and we know they will not talk to each other. One more step towards fearless refactoring. Now, something that the Java and C# world gave us are the concepts of data transfer object, data access object, and business object. Sometimes they're the same thing: sometimes you are lucky, and the JSON payload you receive is the same object upon which you will perform your business computations and which you will store in your database. But sometimes they are not, and really it is pure luck when these types align.
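The distinction between data transfer objects, business objects, and data access objects can be sketched with three separate types and explicit conversions between them. This is only an illustration under assumptions: the package example, every type and field name, and the validation rules below are hypothetical and are not taken from Flora's codebase.

```haskell
import Data.Char (isAlphaNum)
import Numeric.Natural (Natural)
import Text.Read (readMaybe)

-- Data transfer object: what arrives on the wire, loosely typed.
data PackageDTO = PackageDTO
  { dtoName      :: String
  , dtoDownloads :: String  -- numbers may arrive as text
  }

-- Business object: validated, used inside the bounded context.
newtype PackageName = PackageName String
  deriving (Eq, Show)

data Package = Package
  { packageName :: PackageName
  , downloads   :: Natural
  } deriving (Eq, Show)

-- Data access object: flat columns that a database row can hold.
data PackageRow = PackageRow
  { rowName      :: String
  , rowDownloads :: Integer
  } deriving (Eq, Show)

-- Conversion layer at the boundary: wire format to business object.
fromDTO :: PackageDTO -> Either String Package
fromDTO dto = do
  name  <- parseName (dtoName dto)
  count <- parseCount (dtoDownloads dto)
  pure (Package name count)

parseName :: String -> Either String PackageName
parseName raw
  | null raw          = Left "package name is empty"
  | all validChar raw = Right (PackageName raw)
  | otherwise         = Left "package name has invalid characters"
  where
    validChar c = isAlphaNum c || c == '-'

-- Read as Integer first so that negative input is rejected explicitly.
parseCount :: String -> Either String Natural
parseCount raw = case readMaybe raw :: Maybe Integer of
  Just n | n >= 0 -> Right (fromInteger n)
  _               -> Left "downloads is not a natural number"

-- Conversion layer at the other boundary: business object to storage.
toRow :: Package -> PackageRow
toRow (Package (PackageName name) count) = PackageRow name (toInteger count)
```

With this layout, a change to the wire format only touches `PackageDTO` and `fromDTO`, and a change to the storage schema only touches `PackageRow` and `toRow`; the business type in the middle stays stable.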
An example of how this bit me when I was young and hopeful, obviously without much practice: during meetings with other people, we would try to define a JSON-based format for data exchange between several systems. We had Elixir, PHP, Ruby, and Python systems, and these all give you slight differences in how you can have data types encoded. For example, if you are dealing with Ruby or PHP users, they will try to push heterogeneous lists into the data format, so you can have an int and a string in the same list. Natively, in Haskell, we don't do that, so we would have to create some abstraction on top of it. And I realized that I was constraining myself to the capacity of each language in creating this JSON-based data format. But you don't have to do that: you can have your fully external way to talk to your mates, your other systems, and have a different representation inside your core components. For example, if we apply this to Flora, we can see that I have my business objects living inside my bounded context. When I need to store them, I serialize them to a data access object that will be compliant with what my database expects, which means no fancy mutually recursive types, for example, or something like that. And when I need to send that on the wire, I serialize it to a format that is easily representable in XML, JSON, and the various other cursed binary representations that we may find, especially in the banking system. So in the end, to summarize bounded contexts: I showed you a very simple diagram of Flora earlier, but now, how do the components interface with each other? We have DTOs between each component, and especially between the clients and us, DAOs for storage access, and inside each component we operate on our business objects. It so happens that the business objects can be extremely similar between the web and the core components, but sometimes they are not, and I think it's very liberating to know that you
don't have to keep to a single representation from A to Z, all the way. You really can have conversion layers between your components, between your interfaces, and it's perfectly all right. For example, reading configuration: the twelve-factor application model tells us to read configuration from the environment, from the shell. What we have on the left is the Config type, which models what I get from the environment, with a twist, because I can force some types; it's not all text-based. I can force my parsing of HTTP ports to be a Word16, for example, because I'm not so interested in having a port number of one million, at least not without overflow. So I've got my external configuration, whose first member, for example, is the dbConfig with a pool configuration, so it's all the information I need for the database pool, and then the internal configuration, which is the pool itself. It's very useful because I then have this very explicit conversion, and it's perfectly all right to change something inside or outside my core components, because I only have this bottleneck that I can easily change. One more step towards fearless refactoring: separating commands and queries. This has practical effects in terms of operations and infrastructure, and also in terms of ergonomics for the people who read our code. In a practical way, if we know that we have a recurrent, fairly heavy processing query that runs and can take significant locks or CPU on our server, we have the option to have these queries run on a read-only replica of our database, and to put this replica on another machine. PostgreSQL is a very specific example, but one I can talk about: you can have read-only replicas, which take read-only queries and will be very angry at you if you ever try to mutate the state of the replica. So you have your primary server, upon which you perform mutating commands, and it will then stream these changes to the read-only
replica, and then the replica will provide you with a read-only interface that is not only enforced at the level of the types, for example in your application, but fundamentally in the protocol itself: you will get a runtime error if you try to mutate this state. So you can't unsafePerformIO or unsafeCoerce your way around that. Something I learned at my current place of employment, Scrive, is to have a separation, a physical separation in the code, between the types, the commands, and the queries. The idiom that we have here is that we have these .Query and .Update modules, in which we put the read-only and mutating queries, and when we import them, we qualify them, for example with an import qualified as Query. Then there is a visual indicator, very bare-bones, but it does work, that this is a query that is going to be read-only: it's not going to increment a counter in a side table because you have performed something that is seemingly read-only. A good example is LinkedIn. When you view someone's page on LinkedIn, they get a notification. You would think that viewing something is fundamentally read-only; even the terms, reading, viewing, you would think it's read-only. But perhaps there is a counter that is increased, with user tracking, so that you can later be told who has viewed your page. If you can go one step further in separating the queries and the commands, then it's much easier to know which kind of operation you're performing at which place in the code. We could go even further and declare queries and commands as effects, with their own connection pools. For example, I don't only have the DB effect in my stack; I'm declaring that I'm performing a read-only operation on the read-only replica of my PostgreSQL database. So, one more step. Of course it can be a technical detail, but I also think it's very
important to be able to tell the readers of your code what you are performing and which side effects it has, especially in a system that you have ownership of. Now that's my anarchist tendencies coming up: let's keep our distance from the state. The state is best contained. The cache of our application is actually a bounded context of its own: it has its own lifecycle, data storage, and API. By decoupling our application monolith from its state, we have walked a significant portion of the path to a setup where we can have multiple instances of our application serving data from the same database and cache. At this point, by ensuring that the database server keeps operations in sync, we've got higher consistency of the application. That's the CAP theorem: is your application consistent, is it available, or is it tolerant to partition? In some industries where you work with very sensitive data, if you have a production incident, you can't risk having inconsistent data, or an inconsistent state where people can read someone else's private folder. So it's better to shut things down for a bit: we keep our calm, we take a deep, long breath, and then we restart the system. Availability has to take one for the team in order to stay consistent, and partition tolerance can go out the window. So for Flora, for example, it's very simple: we can have our clients talk to our nginx gateway, and then the multiple instances of Flora all speak to the same database server for mutating operations and to the same replica for read-only queries. I'm not selling you microservice architectures and scale-to-the-moon type of stuff, but I think it's a very decent way to start a monolith. We all know that a good distributed system has to start as a monolith and then be split further and further. If you start with a
microservice-based architecture, you might end up with a distributed monolith, whereas the whole point of a microservice-based application is to have independent contexts that can still run. So here we don't take the bet that every component is fully independent. We acknowledge that we have some boundaries between the web, the core, and the job worker components, and that they each have their own context. So it's also about realism: do you want to scale to the moon and raise hundreds of thousands of dollars from venture capitalists, or do you want to create a nice community website that indexes packages for the Haskell environment? I'm going to make a short detour here, and it's about directing our workflows with types. It's a technique that brings together type safety and ergonomics, which is one of my favorite subjects, to create type-directed state machines. Very fancy words; basically, it's really about the way your operations are composed together, and you will be driven to compose these operations via their types. It can be a bit scary sometimes to think of your business operations as a state machine, but it gives us a terminology and a literature to draw from when thinking about how we organize and compose our operations. So for example, here we have a WorkflowState, which can have three values, Arrival, Processed, and Departure, and a Workflow that has this state as a type parameter. We have a newWorkflow value that creates a workflow, w1, and then the processWorkflow function takes a workflow, but not any kind of workflow: it has to be set to Arrival. It can only take newly arrived workflows, and it then sets them as Processed. Again, this could be, in quotes, trivially implemented at the value level with properties of the workflow objects, and we could very easily check these properties at the value level in the code. But here I factor out all these checks and put them in a place where the compiler can guide my hand and
tell me where I went wrong. Finally, the sendBackWorkflow function can only take processed workflows, by the laws of the types, and it then sets the workflow as departed. So if I compose the functions in the right order, newWorkflow first, and then a pipeline of functions where I pipe it into processWorkflow and sendBackWorkflow, everything is good. If I try to skip a step, I will get a compiler error that says: you wanted me to take an Arrival workflow, but actually I need a Processed workflow. And you're not sending that code to production, because you cannot compile it. In terms of web application development: we Haskellers like to put everything at the level of types and think of our code as being formally proven or correct by construction, but sometimes we must not drink all the Kool-Aid, or all the Club-Mate. For example, database layers that promise type-safe SQL. If you ever find a database layer that promises type safety, either it's the kind of type safety that is trivial to implement, and it's totally expected of the tool to have it, or it has encoded the semantics of SQL at the type level, and we've either found the golden goose or someone who has clearly underestimated the difficulties of SQL semantics.
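The type-directed workflow state machine described earlier in the talk can be sketched as follows. The names WorkflowState, Workflow, newWorkflow, processWorkflow, and sendBackWorkflow come from the talk; the contents of the workflow (a list of step labels in a `steps` field) and the `pipeline` value are assumptions added so that the example has something to run.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.Function ((&))

-- The three states; DataKinds promotes them to the type level.
data WorkflowState = Arrival | Processed | Departure

-- The state parameter exists only in the type. The steps field is a
-- hypothetical payload that records what has happened so far.
newtype Workflow (state :: WorkflowState) = Workflow { steps :: [String] }
  deriving (Eq, Show)

-- A fresh workflow always starts in the Arrival state.
newWorkflow :: Workflow 'Arrival
newWorkflow = Workflow ["arrived"]

-- Only newly arrived workflows can be processed.
processWorkflow :: Workflow 'Arrival -> Workflow 'Processed
processWorkflow (Workflow done) = Workflow (done ++ ["processed"])

-- Only processed workflows can be sent back.
sendBackWorkflow :: Workflow 'Processed -> Workflow 'Departure
sendBackWorkflow (Workflow done) = Workflow (done ++ ["departed"])

-- Composing the steps in the right order type-checks.
pipeline :: Workflow 'Departure
pipeline = newWorkflow & processWorkflow & sendBackWorkflow

-- Skipping a step does not compile: sendBackWorkflow expects a
-- Workflow 'Processed but newWorkflow is a Workflow 'Arrival.
-- skipped :: Workflow 'Departure
-- skipped = newWorkflow & sendBackWorkflow
```

Because the state lives only in the type, the ordering rule is checked once by the compiler and leaves no runtime check behind.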
Also, SQLite for development and PostgreSQL for production: that's something that the Python community popularized in the 2000s and 2010s. We can accomplish great things by lying to the universe, but we rarely accomplish anything by lying to ourselves. SQLite is its own system, and unless you somehow code perfectly in the common subset of SQL supported by both implementations, you will be maintaining two sets of database migrations, and sometimes of code. PostgreSQL has very good features; SQLite has different but also good features, not its type system of course. But if you get used to one locally and then discover the second one once you're deployed, you're going to have a bad time. And the muscle memory, because the brain is a muscle, that you will have accumulated with SQLite will be fairly useless with PostgreSQL. So where to go from here? Documentation. We have many ways of producing documentation, and we also hold tremendous power in the types. Coupled with introspection, it means that algebraic data types, the sum types and the product types, so the enums and the records, can serve as the backbone for further documentation. The types themselves are not documentation, but they can be used to guide the reader. And you remember how I told you to write tests? The best tests are those that describe real-world behaviors. If you can even produce a summary web page that shows the behaviors and the high-level paths taken by your program according to some input, this is particularly helpful for less technical people, like product managers, who want to know the behavior of your program. If you can present a nice interface showing how the code is executed according to some high-level business operation, it's even better.
So I have a couple of sources for what I'm saying; I'm not pulling that out of my arse. The first one is Domain Modeling Made Functional by Scott Wlaschin. It's an excellent book, written with F#, for functional and DDD practitioners, and I encourage you to read it. The other is Living Documentation by Cyrille Martraire, and that one is also excellent: it really treats documentation as its own living system, for which you will have real clients, because PMs and other engineers in your organization are consumers of documentation. And, of course, farcical amounts of caffeine, as Fraser said earlier. So that would be the end of my talk. Do we have a couple of minutes for questions, perhaps? Yes, thank you, Hecate, and we do have about 10 minutes for questions. Yes, young man there. Is this on? This is more of a comment than a question. One little detail that I think you could also have sold is the fact that when you do the DataKinds annotation on your workflow, instead of checking that at runtime, we do the type annotation, and that's actually more efficient, right? Because of type erasure, there's no runtime data or check that has to happen. Yes, so what Björn says is that indeed there is a matter of efficiency, because with DataKinds, when we encode the nature of the parameters in our workflow, it all goes away at code generation. So if you are in a setup where you need very minimal generated code, if you are in a tight loop for example, this code is completely erased at the time of code generation, and indeed you spare some CPU cycles. Any other question? You can also call me out on my bullshit. I won't be offended. Yes? Do I need a mic? I can repeat your question.
The libraries that offer type-safe database access, which are, I'm sure, hideously incomplete, also offer abstraction over different database backends, which is one of the problems you were talking about, like why you're using Postgres to develop locally. So my question is: are there really situations for flora.pm where those libraries didn't provide a feature that you needed? Yes, so the question is: those libraries that encode all of the semantics of SQL at the type level, are there situations where they don't provide features that I would need for flora.pm? Yes. As I said earlier, I'm very preoccupied by ergonomics. Twenty minutes of compilation time and 20 gigabytes of interface files on disk: I would consider that a problem in terms of feedback loop for contributors. At my previous place of employment, we used the Squeal toolkit for SQL queries encoded at the type level, because they were business-critical, so we wanted to invest in something very type-safe because of the critical aspect of these queries. It was hell. It was horrendous, not only to view and to review, but also because it took so much time to compile, unironically 20 minutes, and we had some problems with Stack because the interface files on disk were taking way too much space. Type families in Haskell are best consumed responsibly. I'm a Servant user, so I can't shit too much on type families, but in some very specific cases it is best to rely on the expertise of outside systems. For example, my best friend, who's actually here at FOSDEM, is my database administrator at work, and I keep him close, you know.
Do you have any experience onboarding a newer developer to maintain the project or even build new features, and if so, what's the experience, especially if they don't have any Haskell experience, or experience with this kind of technique? Yes, very good question: do we have any experience onboarding new developers on the project? Actually, this talk was supposed to be the continuation of the different onboarding sessions that I run on flora.pm. Sometimes, if you find me on Discord or Matrix, I will share my screen and introduce people to the codebase, and I think that's one of the most important aspects of Flora as a project. Not only is it a community tool that aims to satisfy its users, but it's also a vessel for teaching. Many of the techniques that I explained in this talk I have implemented in Flora, and Flora is my de facto codebase for teaching these techniques. I have had very bad experiences with community tools that have aged badly, where the code is only known by the 10% of maintainers who stick around, even though the vast majority of contributors to a project are the 90% of people who just make one pull request and then go away forever. So it's very hard to retain institutional knowledge, and it's also very hard not to aim to please only the 10% of people who stick around and submit patches on the regular. So yes, and that's the goal of Flora: onboarding new contributors easily is actually a feature, and if it can't be done anymore, it's a bug. Any other question? Nope? Oh, sure. With such a representation, is it possible to write a function that, say, generates a diagram? It technically is; I have references for you. So the question is: can we generate diagrams from such representations? Indeed, we have the possible values at the type level, and we can do many things with our types, including inspecting them. So yes, I believe there are several libraries on Hackage that aim to do that. For example, provenance: it's a
library that gives you the path that the data takes, and the provenance of your data throughout the code. I would say one of the greatest things to be able to do is to represent your code, and to extract facts and movements from your code, in a higher-level representation. So yes, I believe we can do it today. I don't do it personally, but I think it's possible. There's time for more questions, or you can duel me if you want to challenge my beliefs. Okay, that seems like it, so thank you again, Hecate. Thank you very much. |
The Haskell Security Advisory Database
Status and next steps |
Hello, everybody. Welcome back. Our next talk is from Fraser Tweedale. He lives in Brisbane, Australia, where he works on identity management and PKI at Red Hat. He's passionate about functional programming and security, and really likes playing with little plastic bricks from Denmark, a detail which he clearly included to pander to his co-organizer. Okay. So, just a very quick talk, giving an update about the Haskell Security Advisory Database. So, why do we want one? Well, security matters a lot. It is very important. Many programming languages have been improving their security tooling and introducing vulnerability databases of their own, and audit tools in their package management and build systems. And so we should follow suit. We should not just follow; we should try to lead, in fact, and have best-in-class security tooling for our best-in-class language. It's increasingly important for industry adoption that language ecosystems have a way of reporting security vulnerabilities and disseminating that information, so that commercial or non-commercial users of the language can find out about security issues when they arise and respond to them. It's also needed for some industry certifications like ISO 27001, which is therefore also important for industry adoption. So, in August 2022, there was a tech proposal through the Haskell Foundation Tech Proposal process to establish a security advisory database and a team who would manage it. That proposal was refined and accepted. The database repository was created, bootstrapped, if you will, in November 2022, but as of now it's still empty. So the next step is to assemble a security response team who will populate and manage that database. And the call for nominations for the Haskell security response team will be going out in the next couple of days.
The responsibilities of the security response team, or SRT, will be to triage and assess incoming security reports and, for real vulnerabilities, to move them into the database. So we will update and maintain the advisory database and ensure that the data is in a form that is useful for downstream security tooling. These could be tools like cabal-install, where we can implement an audit command to check whether your program or any of its dependencies have known security vulnerabilities. GitHub Dependabot is another tool that can consume the data in the advisory database and automatically notify maintainers and project owners when there are security issues, and potentially even bump bounds, automatically create pull requests, and test the projects, in order to make life easier for maintainers when a security issue is found. Developing those tools is not the responsibility of the SRT, but working with and liaising with the developers of those downstream tools is. And there will be a quarterly report on the team's activities and on the trends in the reported security issues. So who will be on the SRT? We're looking for five volunteers who can commit to an initial term of six or 12 months, so that the terms will then be staggered. We're looking for people with experience in security topics such as (this is not an exhaustive list) secure development and web application security, pen testing, incident response and vulnerability research, cryptography, authentication, security management, GRC, that's governance, risk and compliance, and any other security-related topics. Obviously no one has all of these, but we're looking for people who bring different experience in these different topic areas, so that we have broad coverage within the security response team. As I mentioned, the call for nominations will go out in the next day or so.
We're looking for people to self-nominate, so if you know someone who you think would be great, please encourage them to nominate themselves. The nomination process will be to email me at that address, say that you want to do it, and give a brief overview of your background in security. We will aim to announce the initial security response team around the end of February. And that's the update. Thanks for listening. |
On the path of better interoperability with Rust! |
Okay, so our next talk is by Ivan Sraka. Ivan has commercial experience working in 3D graphics and runtime design with Rust, Nix and Haskell, and he also designs algorithm competitions and children's coding workshops. He lives in Belgium, loves biking, hiking, climbing and vegan cooking. His topic today is On the path of better interoperability with Rust. Thank you, Ivan. Thank you. Hi, everyone. So this talk is about a thing I have worked on for the past few months, and there is also a blog article on engineering.iog.io, I guess, which is basically the content of these slides with links and references and everything. The issue at IOG is that there is a really large Haskell codebase to maintain, and some parts of the codebase are C bits, mainly for cryptographic libraries like cryptonite. And so there is a will to more easily integrate Rust libraries, because for cryptography there are cool Rust library implementations. It's not quite simple right now to interface Haskell and Rust, so this talk is about how to make the experience easier. And there is a repository; the link is here, in the subtitle, where you can find the source and everything.
When we try to integrate two runtimes, Rust and Haskell in our case, there are a lot of ways to make different programming languages interoperate. One solution which is often used is to use, I don't know, sockets, or writing to a file or a pipe or something like that, together with a protocol, for example Protocol Buffers from Google, or JSON, or an HTTP API, all this kind of stuff. But if you do that, you rely on system calls, like I/O, to make interoperability work. If you want really little overhead when calling the library you rely on, you prefer to use something closer to systems programming, which is called FFI, the foreign function interface. It basically means jumping into the memory of binary code generated by another language and hoping it will work the right way. Making it work requires special attention, and the tools needed to handle all that are what we will look at right now. If we look at what exists for FFI between Rust and other programming languages, because it is such a common way to interoperate, we can see that from C to Rust there is a thing called rust-bindgen, and from Rust to C there is a thing called cbindgen. I will not name them all, but they are all about bindgen, which is generating bindings. Maybe you already know that, but why do we do binding generation?
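To make the FFI idea concrete, here is a minimal sketch on the Haskell side only. It is not the speaker's code: it binds the C standard library's cos function rather than a Rust symbol, and the name c_cos is chosen here for illustration. It shows that the Haskell type of a foreign import is simply trusted, which is the reason hand-written bindings are risky.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CDouble (..))

-- Bind the C function `cos` from math.h using the C calling convention.
-- The signature below is taken on trust: the compiler checks that the
-- symbol exists at link time, not that the types match the C prototype.
-- `c_cos` is an illustrative name, not something from the talk.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

main :: IO ()
main = print (c_cos 0)
```

If the declared type were wrong (say, two arguments instead of one), the program would still compile and link, which is the failure mode that generated bindings are meant to rule out.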
Because FFI is really something dangerous: if your two foreign function interfaces don't match, you will not know. Your compiler will not warn you about that. It will warn you if a symbol is missing, but it will not check the types, the number of your arguments, or whether you respect the same calling convention. So you want this interface to be generated, so you can ensure it matches. It is something really common, but it did not exist between Rust and Haskell, so that is basically what this project is about. And so here is a bit of Rust code. The way I chose to generate bindings uses a macro, which works like a function decorator. It is a function that does something really simple, it just prints hello and a name, with string interpolation in Rust. Here I import my library, and here I have my custom thing, where I tell my binding generator what my wanted Haskell function signature would be. A macro will expand the code, so it is code generation, like macros in most languages. So here is the code expanded by the macro. I have the no_mangle thing, because, maybe you know, mangling, which also exists in C++, is changing the function symbol in the binary, and we want to preserve it: we want the binary to have the same symbol, to be called from the other side, from a Haskell program. The other thing I want you to look at is that I use extern "C" here. I specify C, which means that in Rust I want to use the C calling convention, the C ABI. The ABI is the calling convention and the memory layout of types, which are more or less part of the same definition. The thing is, when you define an FFI in Haskell, and I will show you Haskell code just after this slide, you don't have the choice, you use the C calling convention. In Rust you have a bit more choice, but you can't use... that's not true? What did you say? You have the choice in Haskell? Never mind. I just want to point out that the Rust ABI, the Rust calling convention, the calling convention that Rust functions inside the binary use between themselves, isn't stable, which
is something that exists in Haskell too. It is a way for the Rust core team to remain able to break the mechanisms inside rustc without changing the major version of the compiler. For example, the calling convention has changed several times since the stable version of the language, which is Rust 1.0 something. So if we build a thing on top of the Rust ABI and calling convention, it will not be stable, it will be a hack, and maintaining it will be really laborious, because it means working with internal documentation which is not made for people to use as a public API. And so why a Rust macro? I want to point that out because there are a lot of ways to do binding generation, and often it works as external tooling. The issue with external tooling is that it is really easy to forget to integrate it into your build, and if you do, your bindings will be out of sync with your code, so your program will not work. So we want binding generation to be part of the crate compilation (a crate is a Rust module), and so I did it that way. I also want to point out that there are other programs in the Rust space, like cbindgen, that just try to interpret Rust code in a search-and-replace, regex-like way. So if, for example, you have two identifiers in different namespaces, cbindgen, which generates C bindings for Rust, will not be able to understand their meaning, and then it's undefined behavior. That is one of the limitations of that library, and not of the bindgen library we present here, which understands Rust semantics because it is implemented as a macro. So it expands Rust code, and it has to generate, as a side effect, Haskell code, Haskell modules that just have the wanted signature: same symbol, same signature, things that stay in sync together. If we look at the Rust code again, there's another thing I want to talk to you about. Here you see you have the C-compatible calling convention types. The thing I want you to look at is that I use
traits, ReprC and ReprRust. Traits are Rust's type classes: they are a way to define contracts for data structures, meaning that a data structure should implement this and that method. So I want every data structure used here to implement those methods. Internally it's implemented as a virtual table, like in C++. Why do I use that? I use it to be extensible and programmable, so I get nice errors, since it's part of the Rust type system, and users can add types to this framework. For the types which are part of the standard library, I implement the traits myself and take care of the memory management, and we will talk about that a bit later. The other thing is that in an FFI in Rust you can use only what are called FFI-safe types, meaning types that have a memory layout in the C calling convention, and most Rust types have an undefined memory layout for the C calling convention. So, for example, if I get a Rust string, I cast it safely to a C string, so that it is represented the way C strings work, which is different from the way Rust strings are actually represented. And so what about GC? The thing is that Rust has a destructor mechanism based on ownership. There are a lot of Rusty things there, but the idea is about destructors: when you go out of a scope, the destructor of the value, which is drop, is called. What I do here is tell the Rust type system not to call drop on the Rust value, because it will live on the Haskell side, so it is the other side of the bindings that needs to free the value, which is a thing that you can do with Foreign.Marshal (I don't know how to pronounce that library). I also learned during this project the real semantics of the safe and unsafe foreign call definitions. I finally understood that safe means the foreign code may want to play with objects on the Haskell heap, which is not what we actually do in the default case, when we pass basic data types. That does not need the garbage collector to pause because it doesn't know if we could get an
inconsistent state through something that the foreign call will do. So if users use safe, they will get a warning that says: caution, a safe call will slow down your whole program, are you sure you want to do that or not? And yeah, that's the Rust library. But there is another thing I did, which also exists for a lot of other Rust tooling: a CLI tool that helps you set up a project, because you have to tweak your Rust build file and you have to tweak your Haskell build file. Actually, you have to do a Setup.hs build customization on the Haskell side unless you use Nix, which I can understand a lot of people won't want to do. And on the Rust side I use a build.rs if you build a dynamic library, because GHC needs the GHC version in the name of the dynamic library it fetches, so that is something I have to handle as a build customization, this kind of tweak. But overall this library is really small. I'll show the next slide in just a minute. The whole library, the whole tooling I present here, is less than 1000 lines of code. So, okay, it's 1000 lines of code, so it's really small, minimalist, and KISS. Actually, I forget what I was about to say. All this plumbing is simple, and I'm not sure myself that I want to make it more complex, because it just works. The thing is, it works, but you use really simple data types, the C data types that are representable in Haskell and Rust, and there are a lot of things that are not representable in C. For example, a Rust slice: a slice is a pointer and a size, and you have a guarantee that there is memory behind it, so you can iterate on it. You can represent it in C with a struct, but it's not really a C type, you have to write a custom struct. Those things are targeted by a Rust RFC named interoperable_abi, which is about creating an ABI which is stable and which has more Rust-y things than the C ABI, but which is flagged as stable, which is not the
case for the Rust internal ABI. The other thing is that Cabal customization is a bit of a mess, writing the Setup.hs to find the library and everything to make it work, because of Cabal bugs. So I would really love to have a standard way to integrate, for example, the Rust toolchain into Cabal. Here you have to run the two toolchains in two different steps. And that's it. I can take a lot of questions, or I can do a demo, if my talk is shorter than I was expecting. So this is the time to ask questions. Yeah, I can do a quick demo. Sorry, it's not easy because my screen is not mirroring, so I have to look at the screen while I'm doing this. What can I do? Yeah, did you...? Yeah, it will be unsafe on the Haskell side if you do that, for example. It's an interesting example, I didn't try passing a function pointer. The thing is, if you want to pass complex data types, it's always more meaningful to use serialization, because they don't match between the two language models. But functions, I honestly didn't think about it. You mean a function pointer that will cross the FFI barrier? So that's the whole question, right? Yeah, yeah, yeah. In Erlang, for example, you have no concept of sharing pointers: you copy the whole data and send it to the other node, including functions, which are serialized and then reinterpreted on the other node. I guess it will not really be an issue, because in fact there's no real boundary at the end, it's one binary. The only thing I want my mental model to have a better grasp of is how the GC, the Haskell GC, will behave in this case. I think there is no special issue with doing that, but I should experiment with it first before saying it's completely okay to do that. Yeah, thank you for the question. Yeah, did you see something?
Fuck. Fuck. So, sorry. So if I go here, for example, I have a little Rust thing, src. Yeah, I can show you the lib.rs, which does cryptographic primitives, for example, and what you manipulate is clearly Rust types, not C types. And I have a warning, and I don't know why. Yeah, because I didn't install it first, and if I do, I guess it will work. I hope so. I can check what happens after macro expansion. Will it work? Yeah. Sorry. Demo effect. Yeah. Hmm. Why? I don't know why it does that, but maybe I have no internet connection and that's the issue here. Yeah, I guess that's the issue. So never mind. I can show you the Haskell things that are generated. So that's a Cabal file generated by the CLI, and it looks like that. And I have my generated code, it looks like that. What else can I show you? I can show you the build customization, so you can see what it looks like. It's not really interesting, but it does a few things. And what else? I have a little file that ensures that the CLI you use to generate the whole thing is compatible with the version of the library you use, in case I want to change the whole behavior in the future. And on the Haskell side, it looks something like that. So I have a Cabal project that looks like that. And the test thing just works like that: I have a test.cabal, it looks like that, and I can use it as if it were a normal Haskell dependency. And my Haskell code is quite simple, I guess. Yeah, so there is a FIXME, but don't look. Yeah, you have something like that. You always manipulate somewhat low-level data structures, but that's often what you want to do when you write something in a systems programming language. That's one reason to use Rust over something else.
Otherwise, I don't know. I got the impression that the bindgen doesn't really need more advanced data types. For example, OCaml has an interop thing with Rust, ocaml-interop, that helps to represent ADTs in both Rust and OCaml. And I don't know if we really need ADT conversion between languages for most use cases. I don't know. Yeah, that's mainly it. I don't know if you have other questions or are just curious about something. Yeah. Okay, there's a question. So you said at one point that Rust will not drop the value, and that it's for Haskell to free it. So how does that work? Is that done by the GC, or does it need to be done manually? You mean how I tell Rust not to drop a value? No, if Rust doesn't drop the value, who does? Is it the Haskell GC, or is it somewhere we need to do it ourselves in our program? Yeah. The idea is that, because it's part of the Foreign.Marshal things, you will have to free it explicitly, as I understand it. Maybe somebody could tell me if I'm mistaken on this point. It's really hard to debug what the GC actually does, and so far I didn't experiment so much with that. But the real thing is that if you do not force Rust not to drop things, it will, because it's part of its type system, put code that frees the memory, through the memory allocator you use, straight into the code. Rust statically decides where to put allocation and release of memory inside the binary. It's computed statically, at compile time. So you have to tell the type system not to do it for specific types. There are internals for that.
But on the Haskell side, I'm not sure. I'm pretty unsure that the GC will track it as a garbage-collected object, I think, because you have to free it explicitly. Yeah. But that's one of the points where I'm a bit unconfident, so I want to check this kind of scenario again. Yeah. Was there a question? What do you do if there's an external Rust library that you don't have control over, because they might not want to add hs-bindgen macros to their code base? Is there a way of generating a shim or something? Okay. In fact, it's really easy in Rust to do a re-export. You can always create a new crate and re-export a crate: you say, I depend on this crate, I re-export the things I want to re-export, and I decorate each element or not. Yeah, that works on functions that exist in the other library. Sorry, so you can put those macros on the functions in the other library then? Yeah. In fact, it's just wrapping function by function, so you can do whatever you want. And you can opt some functions of a library in and others out, which to me is an advantage of using macros and function decorators over whole-code parsing. That's not free though, is it? So if you're re-exporting functions, are you actually re-exporting the same function, or are you exporting a wrapper around the re-export? Yeah, okay. In Rust, there is a clear idea of which symbols I want to expose and which symbols I don't want to expose, so I'm pretty confident that the compiler has the liberty to inline things or not. For example, most of the trait implementations I do, which are casts, are inlined: I explicitly say I want them to be inlined. So yes, it's a function that calls a function, but in the end it has no runtime cost. So yeah. Okay.
We're out of time. Thank you. So thanks very much, Yvan. Let's give him a round of applause. |
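On the earlier question of who frees a value that lives outside the Haskell heap: the sketch below only illustrates the general rule that memory obtained through Foreign.Marshal is not tracked by the garbage collector and has to be released explicitly. It is an assumption-laden illustration, not the speaker's code. It uses C-heap memory allocated by newCString; a value allocated by Rust would instead have to be handed back to a deallocation function exported by the Rust side. The helper name roundTrip is hypothetical.

```haskell
module Main (main) where

import Control.Exception (bracket)
import Foreign.C.String (newCString, peekCString)
import Foreign.Marshal.Alloc (free)

-- Copy a Haskell String into malloc'ed memory, read it back, and free it.
-- `bracket` guarantees that `free` runs even if reading throws.
-- (`roundTrip` is a hypothetical helper written for this illustration.)
roundTrip :: String -> IO String
roundTrip s = bracket (newCString s) free peekCString

main :: IO ()
main = roundTrip "hello from the C heap" >>= putStrLn
```

Without the call to free, the buffer would simply leak: the garbage collector never sees it.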
2D animations in Haskell using gloss, lens and state |
Hello, so I will speak about 2D animations in Haskell using Gloss, Lens and State. I am Julien Dehos, I am an assistant professor in computer science, and I use Haskell mostly for teaching functional programming. Haskell is not the most widely used language for implementing animations, but it still has some interesting tools, such as library bindings like SDL2. We also have some entity component system implementations, which is a classic technique for developing games, and we also have functional reactive programming, which is a technique for implementing complex user interfaces, for example. And you can find some cool projects developed in Haskell, for example The Defect Process, a game available on Steam that has been open sourced recently, and also the reanimate library, which can make quite impressive animations. In this talk, I will show how to implement several animations on concrete examples using functional programming, and how to improve this code using some features of Haskell, like data types, lazy evaluation, the lens library and the State monad. So first, let's look at a very simple example, let's say... I want to know, can we get it a little louder? Oh, okay. It's a little hard to understand. Okay. So as a first example, let's say we want to draw a disk on the screen with a fixed radius. To do that, we can use the Gloss library, which is a classic library in Haskell for implementing animations and 2D graphics. This library provides functions for drawing primitives and for handling user events, and it also provides some main loops that will run the main application. So basically, all we have to do is write some handler functions, which say how to render the scene or how to handle the user inputs, and then we pass these functions to the main loop and that's all, we can run the program. So let's do that.
For this first example, we don't have any particular data, we just want to draw a disk with a fixed radius, so there is no data to remember for describing the scene. So we can write a type which represents the model of our application, but here we don't need anything, so we can say it's the unit type, which means no data. Then we have to write a function that renders the scene, so this function should take a model and return a picture. Here we use the circleSolid function, which is provided by Gloss, to draw a disk on the screen, and we say we want a disk with a radius of 50 pixels. We also need a function to handle user events: that function should take an event and a model and return a new model. This is a very classic way of modifying data in functional programming. We can't mutate a variable, because that is a side effect, and in pure functional programming we can't do that with pure functions. So we just take the current model and return a new model, a copy of the model, which contains the modifications. For now, the scene is static, so we just return the same model. And finally, to handle time, we just need a float, the time elapsed since the previous update, and the current model, and we return the new model with the modifications. Once again, the scene is static, so for now we return the same model. Now we can write the main function. We just have to set some parameters, for example the initial value for the model, and some parameters for the window, the background color, and the frame rate of the animation. Then we can call the play function, which is a main loop provided by the Gloss library, and we just pass our parameters and our handler functions to this function. This is a very classic way of doing things in functional programming. We have functions that we can pass to other functions, and we can organize the code like this. So we get something like this, we can run the program, it's really impressive. Nice. And now let's add some animations.
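The slides are not part of this transcript, so here is a minimal sketch of what the static-disk program described above could look like, assuming the gloss package is installed. The handler names, window title, window size and frame rate are illustrative choices, not taken from the talk.

```haskell
module Main (main) where

import Graphics.Gloss.Interface.Pure.Game

-- The scene needs no data, so the model is the unit type.
type Model = ()

-- Render the scene: a solid disk with a radius of 50 pixels.
handleDisplay :: Model -> Picture
handleDisplay _ = circleSolid 50

-- The scene is static: user events leave the model unchanged.
handleEvent :: Event -> Model -> Model
handleEvent _ model = model

-- The first argument is the time elapsed since the previous update.
handleTime :: Float -> Model -> Model
handleTime _ model = model

main :: IO ()
main = play window white 30 () handleDisplay handleEvent handleTime
  where
    -- Window title, size and position are arbitrary choices.
    window = InWindow "disk" (400, 300) (100, 100)
```

The only impure part is the call to play; the three handlers are ordinary pure functions that the main loop calls for us.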
So let's say we want to refresh the scene every second and change the radius using a random number. To do that, we can use a pseudo-random number generator. We need to model our scene differently, so we write a type, which is Model here, which has two fields: first the current radius of the disk, and then the random number generator that we can use to update the scene. So this is a record type in Haskell. We have two fields, each of which has a name, and we can then use that name as a function here. So the name of the field is also a function that can access this field in the model. So here we get the radius of the model, and we use that as the radius for drawing the disk. For the handle-time function, all we have to do is generate a new radius. So we take the generator inside the model, and we call this function to generate a new radius. Since we cannot mutate the generator, we have to return a new generator for the next random generation. So this is why we get a new radius here and a new generator here. And that's it, we can build and return the new model, which is the result of the function. We need to update the main function. We have to get a random number generator. We can do that with this function, which gets the standard number generator from the system. And we can also generate a first random number for the first radius of the animation. And the model is constructed here. We get something like this, which is not so impressive, but there is some animation. So this is a very classic way of generating random numbers, but in Haskell we can do it differently. Since Haskell has lazy evaluation, we can define an infinite list of all the radii of the animation, and Haskell will compute the numbers when it needs them. So instead of the generator, we can use a random list here, an infinite list, and that's all we need. We will consume the elements of this list to get new radii. The handle-time function can in fact look like this.
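As a sketch of the infinite-list idea (not the speaker's exact code), the model below stores the current radius and a lazy, infinite list of future radii produced by randomRs from the random package. The field names, the range (10, 100) and the fixed seed are assumptions made here so that the example is reproducible; the talk obtains its generator from the system instead.

```haskell
module Main (main) where

import System.Random (mkStdGen, randomRs)

-- Current radius plus an infinite, lazily computed supply of future radii.
data Model = Model
  { radius :: Float
  , radii  :: [Float]
  }

-- Each update consumes the head of the list; the tail is kept for later.
handleTime :: Float -> Model -> Model
handleTime _ (Model _ (r : rs)) = Model r rs
handleTime _ model              = model -- not reached with an infinite list

-- Build the initial model from a seed (fixed here for reproducibility).
initialModel :: Int -> Model
initialModel seed =
  case randomRs (10, 100) (mkStdGen seed) of
    r : rs -> Model r rs
    []     -> Model 10 []

main :: IO ()
main = print (take 3 (map radius (iterate (handleTime 1) (initialModel 42))))
```

No generator is threaded through the update function any more: laziness means only the radii that are actually consumed get computed.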
So we have the infinite list here. And we can just take the first element as the new radius, and the rest of the list will be used for the next updates of the scene. In the main function, we have a function to get an infinite list. So instead of the randomR function, we just have to call the randomRs function. And this gives us an infinite list of random numbers, and we don't have to handle a random generator explicitly. Let's say we want a ball that moves inside the window and bounces against the border of the window. I can show the result. So we want a ball that moves inside the window, it can bounce against the border of the screen, and if I hit the Enter key, the scene is initialized with a random velocity and a random position for the ball. So how can we do that? We need more complex types, so we can first describe a ball as a position and a velocity. These fields are 2D vectors, and now the model is just the current ball and the infinite list of the other balls we can generate randomly, as we did before with the radii. These types are more complex than before, because we have a model that has a ball, and a ball has two fields which are 2D vectors. These vectors have an x-coordinate and a y-coordinate, so we have nested types, which are a bit more complex to use. We can handle these types in Haskell using standard record syntax, there is no problem with that. The syntax is just a little bit more complex. So here we get the ball field of the model, and here, for example, we return the same model as the argument, but we change the ball field to this ball here, which has been computed before. All the other fields of the model don't change, we just copy them, in fact. So this function updates the scene, and I have implemented it in two steps. First we move the ball and then we compute bounces against the border of the window. So let's look at the update bounces function.
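To illustrate the nested record types and the record update syntax just described, here is a small sketch using only base. The Vec type and all field names are assumptions made for this illustration (the talk most likely uses a vector type from a library), and only the move step is shown.

```haskell
module Main (main) where

-- Hypothetical 2D vector type, defined here to keep the sketch self-contained.
data Vec = Vec { vx :: Float, vy :: Float } deriving (Eq, Show)

data Ball = Ball { pos :: Vec, vel :: Vec } deriving (Eq, Show)

data Model = Model { ball :: Ball, balls :: [Ball] }

-- Move the ball: record update syntax copies `vel` and replaces `pos`.
moveBall :: Float -> Ball -> Ball
moveBall dt b = b { pos = Vec (vx p + dt * vx v) (vy p + dt * vy v) }
  where
    p = pos b
    v = vel b

-- Update only the `ball` field of the model; `balls` is copied as is.
handleTime :: Float -> Model -> Model
handleTime dt m = m { ball = moveBall dt (ball m) }

main :: IO ()
main = print (ball (handleTime 2 (Model (Ball (Vec 0 0) (Vec 1 (-1))) [])))
```

Even for this small change the nesting shows: reaching one coordinate means unpacking the model, then the ball, then the vector, which is the friction the lens library addresses.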
We have to compute the collisions with the borders of the window, so we take a ball as input and we return the ball after all the collisions have been computed. To do that, we can use the record syntax as we did before, to change only the fields that need some modification. But in fact, it's sometimes simpler to fully reconstruct a ball, so that's what I did here. I have detected a collision with the left border, and I have to return this ball, so I can state explicitly what the new position vector and the new velocity vector are. In fact, there are only two fields which are different, the x-coordinate of the position and the x-coordinate of the velocity. So to avoid reconstructing the ball, we can use a library in Haskell, which is lens, and which can simplify this code. The lens library enables us to access and modify nested types, so we can go deeper inside the type to make just a small modification. To do that, we need to construct lenses. Lenses are just functions that can access a data type. And when we have these functions, we can use all the functions and operators provided by the lens library. So let's do that. We can build these access functions using this function, makeLenses, and that does everything for us. So we just have to call makeLenses for the ball and for the model. And that's it. We can use all the operators provided by the lens library. This can look like this. Here I return the model with two modifications: the first modification, which is applying this function to the ball field, and the second modification here, where I apply the update bounces function to the ball field of the model. And finally, the model with these two modifications is returned. We can do more than that. For example, for the update bounces function, instead of reconstructing the ball, we can now just go deep inside the type to apply some changes.
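Here is a hedged sketch of the lens-based style, assuming the lens package is installed. The types, field names and the bounceLeft helper are made up for this illustration; only makeLenses and the operators (^., &, .~ and %~) come from the library.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main (main) where

import Control.Lens

-- Hypothetical types; the underscore prefix lets makeLenses derive
-- lenses named x, y, pos and vel.
data Vec = Vec { _x :: Float, _y :: Float } deriving (Eq, Show)
data Ball = Ball { _pos :: Vec, _vel :: Vec } deriving (Eq, Show)

makeLenses ''Vec
makeLenses ''Ball

-- Collision with the left border at xMin: set the x position to the
-- border and negate the x velocity, leaving every other field untouched.
bounceLeft :: Float -> Ball -> Ball
bounceLeft xMin ball
  | ball ^. pos . x < xMin =
      ball & pos . x .~ xMin
           & vel . x %~ negate
  | otherwise = ball

main :: IO ()
main = print (bounceLeft 0 (Ball (Vec (-3) 5) (Vec (-2) 1)))
```

The composed lens pos . x reaches two levels into the ball, .~ sets a value there, %~ applies a function there, and & chains the modifications from left to right.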
For example here, I set this value in the x field of the position field of the ball, and finally the ball is returned. And then I can chain another modification here: I apply the negate function to the x field of the velocity field of the ball. So I can chain several modifications and go deep inside the type to make some modification, setting a value or applying a function. So this is quite interesting. We can still improve this code. As you can see, we take a ball and we return a new ball, so it's just updating a ball. And to do that, we have computed several steps here, which correspond to the collisions with all the borders of the window. In fact, we are modifying a ball, but we can't do that in pure functional programming. So we have to use intermediate variables that store the modification after this collision and this collision. So the code is quite cumbersome, and we can improve that using something in Haskell which is called the State monad. The State monad is a very well-known monad in Haskell, it's a very classic monad. It's just a context where we simulate mutating a state. Each action inside this monad is an access to or a modification of the current state, and we can get the final state, or another result, we can do that also. And that works well with the lens library, because the lens library provides stateful versions of its functions and operators. So let's do that. We can change our function like this. Instead of applying several modifications, we can just execute the state actions defined here. This function takes a first parameter, which is the previous ball, and when we have applied all the state actions, we get a final state, which is the final ball that we can use to update our model. Let's see the update bounces function.
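A sketch of what a State-based version of such a function could look like, assuming the lens and mtl packages. The types, the border values and the function names are illustrative, not the speaker's code; use, .=, %=, when and execState are the library functions involved.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main (main) where

import Control.Lens
import Control.Monad (when)
import Control.Monad.State (State, execState)

-- Hypothetical types, as in a plain lens version of the same code.
data Vec = Vec { _x :: Float, _y :: Float } deriving (Eq, Show)
data Ball = Ball { _pos :: Vec, _vel :: Vec } deriving (Eq, Show)

makeLenses ''Vec
makeLenses ''Ball

-- A state action whose state is the ball and whose result is unit.
-- Only the left and right borders are handled, to keep the sketch short.
updateBounces :: Float -> Float -> State Ball ()
updateBounces xMin xMax = do
  px <- use (pos . x)           -- read a field of the current state
  when (px < xMin) $ do
    pos . x .= xMin             -- set a field of the current state
    vel . x %= negate           -- apply a function to a field
  when (px > xMax) $ do
    pos . x .= xMax
    vel . x %= negate

-- Run the actions on an initial ball and keep only the final state.
bounce :: Ball -> Ball
bounce = execState (updateBounces (-100) 100)

main :: IO ()
main = print (bounce (Ball (Vec 120 0) (Vec 3 0)))
```

The stateful operators mirror the pure ones, with an equals sign in place of the tilde, and no intermediate ball variables are needed between the collision checks.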
So instead of taking a ball and returning a ball, now it's clear that we are in a State monad, and this is a state action where the current state is a ball. We can return a value, but here we don't need that, so the function returns unit. That means that every action inside this function is now a state action: reading the state, modifying the state and so on. For example, here we can access the pos field of the current state, which is a ball. Here we can set this value in the x field of the position field of the ball, or apply a function to the x field of the velocity field of the ball, which is the current state. Since the State monad is a monad, we can use all the features available for monads, such as the do notation, so we can chain several actions like this, and we can also use some functions provided for monads, such as the when function. As a result, the code is a little bit simpler, and it's clear that this is a state action, that we have a current state which is modified according to the code, and then we get the final state, and this is checked by the compiler. So to conclude, we have seen that using functional programming and Haskell we can implement animations, and this is very natural in functional programming, since we just have to pass some functions to other functions, like a main loop, and we can decompose and organize our code like this. We use infinite lists to generate random numbers, so we don't have to handle random generators explicitly. We just consume the elements of this list. We also use the lens library to access or modify nested types, and we can go deep inside these types. Then finally, we simulate mutable state using the State monad, so we can modify variables and get the final result.
So all of this is still based on functional programming so we just manipulate pure functions and static typing and this is quite easy to read and less work run since we have no side effects function only depends on its argument and produce the same result if the arguments are the same. And all of this is checked by the compiler. So this code, this state and the code shown here are available at this link and you can find some information in the documentation of the libraries and see things that sit for me. Thank you for your attention. Thanks to the organizer. Thank you, Julia, and there's five minutes for questions. If you have a question put up your hand, I'll bring the mic. Do we know what the performance of class is like for complex applications like could you write a complex QI in Gloss? Do we know what the performance of Gloss is for complex display? Gloss is based on OpenGL so it's not that slow but I don't know for very complex animations. I believe some projects use SDR and it seems they have no problem of performance but I have no experience more like that. In the play function it pretty much makes the whole program pure with no I.O. What if you do want to do any I.O. in an application? So the Gloss library provides both two interfaces, one which is purely functional and another where you can do I.O. So there is a version where you can do that. Any more questions? Yes. Can you explain the operators used for the lenses? There's many, many operators. Is the person signed? Yes. Okay. So there is two versions of these operators, one which is purely functional so you just take your data structure that it can access and return the value. So this is such operators. So that means we apply a modification, so the ball zero is returned after this modification. So this is what this operator means. Here it's for accessing another field. So it's an X field of the position field of the ball. 
And this operator says we set the value in this field and this operator says we apply a function on the field. And the stateful version is the same but we have an equal sign instead of the tilde. Like a get and a set. Yes, we can say like this. We can say that. Any more questions? Okay. Let's thank Julio. Thank you. Thank you. |
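The operators discussed in that last answer can be sketched without the lens library. The block below is an illustration under assumptions, not the library's implementation: it defines hand-written lenses in the van Laarhoven style and simplified versions of `^.` (get), `.~` (set) and `%~` (apply a function), and the `Ball` and `Vec` types are hypothetical. The stateful variants with an equal sign (`.=`, `%=`) are not reimplemented here.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Function ((&))
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A lens focuses on a part 'a' inside a structure 's'.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical nested types, similar in spirit to the ball in the talk.
data Vec = Vec { _x :: Double, _y :: Double } deriving (Show, Eq)
data Ball = Ball { _pos :: Vec, _vel :: Vec } deriving (Show, Eq)

pos, vel :: Lens Ball Vec
pos f b = (\p -> b { _pos = p }) <$> f (_pos b)
vel f b = (\v -> b { _vel = v }) <$> f (_vel b)

x :: Lens Vec Double
x f v = (\a -> v { _x = a }) <$> f (_x v)

-- Read the focused field.
infixl 8 ^.
(^.) :: s -> Lens s a -> a
s ^. l = getConst (l Const s)

-- Apply a function to the focused field, or overwrite it.
infixr 4 %~, .~
(%~) :: Lens s a -> (a -> a) -> s -> s
l %~ g = runIdentity . l (Identity . g)

(.~) :: Lens s a -> a -> s -> s
l .~ a = l %~ const a

main :: IO ()
main = do
  let ball0 = Ball (Vec 1 2) (Vec 3 4)
      -- Set the x field of the position, then negate the x field of the velocity.
      ball1 = ball0 & pos . x .~ 0 & vel . x %~ negate
  print (ball1 ^. pos . x)
  print ball1
```

Lenses compose with ordinary function composition, which is why `pos . x` reaches the x field inside the position field, and `&` simply feeds the ball into each modification in turn.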
Open-Source Opportunities with the Haskell Foundation |
All right, so our final talk for the Dev Room today is by David Christiansen. He is the Executive Director of the Haskell Foundation and has worked with Haskell and functional programming in both academia and industry. He was a major contributor to the first version of Idris and its Emacs mode. Together with Daniel P. Friedman, he wrote The Little Typer, an introduction to dependent type theory, and he is currently working on Functional Programming in Lean, an introduction to writing programs in Lean 4. His presentation today is Open Source Opportunities with the Haskell Foundation. Thank you, David. Thank you. So, as a brief overview, I'll be presenting three major sort of categories of speech today, I guess. A bit about lore and values of Haskell for those who are new to the community, a bit about institutions and projects that kind of keep us all up and going, and at the very end I will plug my employer, the Haskell Foundation, as one does. So to begin with, lore and values. Many people coming in here who don't know Haskell so well might be thinking, what is Haskell? And I'm not going to answer that at all. I'm going to answer another question, which is, who is Haskell? And the reason why I think this is an interesting question to answer, well, my thinking was really influenced by this essay by Kent Pitman from back in 1994 called Lambda, the Ultimate Political Party, which was kind of a riff on a series of MIT AI Lab memos, Lambda the Ultimate X, where X is things like opcode and so forth. And most of this is a bunch of details about the Lisp standardization project of the early 90s, which is not so relevant anymore to most of us.
But for those who are not historians, the author makes a wonderful point where he proposes a thought experiment, and in this thought experiment he says, take the Lisp community and give them the C programming language, don't let them talk to C programmers, come back, check in after some time has gone by, and what should you expect to find? Well, you should expect that the version of C that the Lisp hackers have been hacking on has grown lambda, it's grown garbage collection, it's grown fancy interactive environments, it certainly won't run on low-end machines anymore, whereas you'd expect that C would be sort of basically like C, but better in the ways that C normally gets better, like maybe bool has become a type and things like that. This is from the perspective of 1994, that was not the case yet. And what this really says is that the way we should understand a programming language is really as like a shared artifact for a community of practice, like a group of people who are working together on some project. We're not all identical, but we all have something in common which draws us to this programming language, and what this really means is that the values of a community shape the development of a programming language over time, and the shape of the programming language affects the values of the community, because if you hate it, you're going to leave. So, what are some values of the Haskell community? Well, we think that elegance is a very important thing, very, very important. But we also like to build things that really work, you know, there's a stereotype of Haskellers who, you know, like once their program type checks, they just like delete it because they're done, and that's not true at all.
We really do like to build things, you know, like, I used xmonad for like a decade, but also we really do appreciate a mathematical inspiration, you know, we like to sort of be able to look at a thing and say like, oh, from this branch of mathematics, this means this thing. And a few of us do that, and the rest of us look at those people and say like, yeah. But as opposed to, eh, right, we also really like playing with things, you know, like playing practical jokes in the type system is something that will get you respect rather than disgust in Haskell. We have a real culture of like kludging through a thing, and then cleaning it up and making it elegant and beautiful later, like you're not expected to prove theorems and then do your work, it's more like find a thing that works and then see if you can prove something cool about it. We tend to be pretty anti-authoritarian and independent people, that makes my job as chief cat herder more kind of like chief mountain lion herder, which is fun, and typically when we want to make a thing that works well, we're going to be using kind of lightweight semi-formal methods like, you know, fancy types or property-based testing, things like that, as opposed to other processes that are used more so in the rest of the world. We like cleverness, you know, if you can come up with a thing that's like fancy and powerful and cool, like people look at that and say, yes. We tend to have low power distance, you know, if you go to a Haskell event and the person with like the gaggle of young Haskellers following them around says something and you talk back to them, you're likely to be met with respect rather than with like rejection for having dared cross the great leader. We like novelty, and we tend to have a lot of respect for knowledge, and we don't want to say like, oh, that's just book learning, and we also tend to be a bit insular.
There's a bit of like a not-invented-here thing that happens quite often in Haskell, and also we've sort of mass-imported a lot of random Unix stuff, like, you know, a preference for kind of grody command-line applications. I think there are some non-values of the community. One is like achieving correctness through formal discipline and organizational design. That certainly happens in some organizations, I think more in the corporate world than in the open-source world. But we also tend to not use traditional software engineering practices even when it might be relevant and useful because we just kind of look at that and say, huh. And also, while we value simplicity and beauty in our language, the things outside of the language we often look at and say, yeah, it's okay if that's a bit crunchy. So where is all this coming from? You know, a community has values, but it also has history. Well, back in 1976, two very important things happened. The first was a paper by Dan Friedman and David Wise called Cons Should Not Evaluate Its Arguments. Cons is the name of the list constructor in old-school Lisps, and in Lisps today. Also David Turner made a new version of the St Andrews Static Language, which was lazy. And this sort of gave rise to a cottage industry of cool things that are not all on the slide here because there's too many of them. One notable example in 1984 was Lazy ML from Chalmers, and also of course Miranda in 1985. And as we all must know, Miranda is a trademark of Research Software Limited. And so then a bunch of other languages came out, like Orwell, Alfl, Clean from the folks at Nijmegen, a really great, interesting language with some cool ideas you should look at, Ponder, and then people would think, well, we've got this nice compiler over here. We've got this nice library over here. We can't use the library with the compiler because they're different languages, but there's not anything importantly different about them.
It's just that this one came from this university and this one came from that university. And to be clear, this was a very university-led phenomenon, all of this lazy programming in the 80s. And so a committee got together, and from 1987 through 1990, they started working on the sort of committee language where they would essentially shave off all the things like, this one uses a single colon for the types, this one uses a double colon, this one uses a capital letter for constructors, this one uses a lowercase letter, and shave off those differences and make this language where you could use a library from the one site on the compiler from the other site. And fortunately or unfortunately, when you get researchers in a room together, they tend to do research, and all of a sudden they figure out type classes and they start to do I/O with monads, and then Haskell was born. And the, yeah, yay! So the 1990s was like a period of furious hacking both on the definition and the implementation. The two kind of went hand in hand. There were lots of implementations of Haskell at the start, you know, because the idea was that it would be a common standard for implementations, sort of like, in this sense, more like C or Common Lisp, and less like, you know, Python or Perl or Rust, which are sort of defined by their canonical implementation. You know, in 1992, work was started on GHC, in 1995 on Hugs, has anyone in here used Hugs? Ah, good, some hands, yeah, so back in the day, GHC didn't have a REPL, so you'd use Hugs for the nice interactive environment and the error messages, and then you'd use GHC to compile your code, kind of like Standard ML programmers do with like SML/NJ and MLton.
You know, and by the end of the 90s, we had the Haskell 98 report put together by the committee, and by 2001, GHCi came out, and I see this as kind of a watershed moment, because it's when GHC began to kind of serve all of the needs of the Haskell community rather than just the need for a batch compiler to make your code fast. You know, in the 2000s, we had a fairly finished language standard, right, the Haskell committee disbanded itself after Haskell 98 came out in 1999, and a whole lot of work was put into making Haskell go fast, like let's make the compiler generate better code, let's look at all these nice optimizations we can do, also into doing like reliable concurrent programming, so we got a lot of cool like parallel Haskell features, we got software transactional memory, all this like space age technology stuff from the perspective of the mid-00s at least. And in the 2010s, you know, or I should say, in 2009, we got the Haskell 2010 report, and that was actually the last major revision to Haskell, and there isn't really a committee around anymore that feels like they can define a new Haskell language, and in some sense, this report was a little bit anachronistic, because by this point, GHC was the Haskell implementation that everyone was using. Through the 2010s, GHC was extended with all sorts of super fancy types. You've seen a little bit of them today, but there was this whole line of research on how we can extend the expressive power of the type system while still keeping a lot of the properties we like of Haskell, like being able to write down a simple program and have it tell me what the type should be, as opposed to having me tell it what the type is first, and then having it check the program.
And it's, you know, we're only a little bit into the 2020s, but I think that what's happening here is that we're going to finally deliver on our potential of having the best experience driven by the fancy types and the fearless concurrency and all these things, but we'll see what happens. So a little bit about institutions. I've been talking a lot about sort of community and history, but a community is more than just a group of people, there's also, you know, figures within any community who kind of set the agenda for that community, who others look to for leadership and inspiration from time to time. And an interesting thing about Haskell, this comes up a lot in various discussion boards, is people will say, what is the Haskell X, right, where X is drawn from the set containing build tool or tutorial or book or IDE or compiler or whatever. And in fact, we can't answer that question because unlike something like, you know, Python or Rust or many other of these implementation defined languages, there isn't really any organization that owns Haskell and can say, we're going to now say that this is the official Haskell X. And, you know, like, we're essentially defined by GHC, but it doesn't have, the GHC project doesn't have this kind of leading role in the same way that, like, the Python project has in Python. So that's, we do have various committees that are, you know, that exist and people mostly do what they say. And like I said, we're a fiercely independent bunch of people. So there's the Core Libraries Committee and they're the ones who are maintaining and controlling the standard libraries, so things like strings and lists and all the basic stuff that you need that essentially every Haskell program is going to need either directly or transitively through other things. Then we have the GHC Steering Committee and the name of that committee is a bit misleading. In fact, what they do is they evaluate changes to the language implemented by GHC. 
So in some ways, this is the forum in which changes to Haskell are discussed, and if you have input about it, that's where you should show up. Also if you'd like to participate, you know, they regularly have new nominations, so it's a place that you can do that. And then we have the Haskell.org Committee, which is responsible for administering the haskell.org site, and that's both the website but also, like, the sub-domain namespace. So when I needed to get errors.haskell.org for a thing, they're the ones that I went to and asked. And they also, for historical reasons, run the Google Summer of Code, or Haskell Summer of Code when there is no GSoC. The key tools in Haskell, as we've seen earlier: there's the major compiler, GHC. We've got HLS, which gives us all those fancy features that we saw earlier in the talk. Cabal and Stack are the two major build tools. There's GHCup, which is like a toolchain installation and management program, which is quite convenient to use. There's Hackage, which is kind of our CPAN or CTAN or crates.io, depending on where you come from, one of those might make sense to you, which is a centralized repository of packages. Luckily, we've just got one of those. Stackage is a version-pinned distribution of packages from Hackage that have been tested to work well together, so you can get a coherent set of stuff. In the old days, as the name suggests, Stackage used to work with Stack. These days you can also point Cabal at it as well, if you want. Then there's Haddock, which is a documentation generator, and it's in need of some serious refactoring. The maintainer is sitting over here, so if that's something you'd like to get involved with, you should go talk to them, because that could be a really useful way to help out. Also, GHCup, I know, is looking for a co-maintainer to share some of the burden there, so that's another good place to get involved, where you don't have to be a super type system expert.
Tom Smeding's Haskell Playground is looking for volunteers as well. This is a sort of up-and-coming project to have a sort of online place where you can go put in some Haskell code and run it and see what happens, without having to install anything on your machine or anything like that. You can think of it as, essentially, like an active pastebin. I've been instructed to tell you to look at the help-wanted and good-first-issue labels on the issue tracker, and mentorship is available from the author. The Haskell Foundation is the other institution that I didn't talk about with the first ones. We're a very, very new non-profit, just a couple of years old, and we are trying to broaden the adoption of Haskell because a programming language is more useful when more people use it, so the more people we can get making cool Haskell stuff, the better it gets for all the rest of us. Also, we think that there's a lot of really good things in Haskell, like good ideas, that haven't spread as far as they could yet. The rest of the world definitely deals with first-class functions now, so we've succeeded there. The rest of the world is basically catching on about monads, so we're succeeding there, but there's a lot of other cool stuff that I think we still have, and if we want to make the world a better place, we can spread those good ideas. The point of the Haskell Foundation is not to come in and take over everything. Our goal is really to support existing processes from our fiercely independent Haskell community and figure out where there are opportunities to help out. I am the executive team, so I've been in that role since May of 2022, so I'm still fairly new at it. I used to work at Galois and Deon Digital. Before that, I have a PhD from the IT University of Copenhagen from 2015. I also did a postdoc at Indiana University. I worked on Idris 1, and I helped write The Little Typer, and I'm working on Functional Programming in Lean.
As you can see, I'm into dependent type stuff, but that's really not the focus of where I'm working in Haskell. I think dependent types are cool, but there's way more cool things than just that. The other full-time person we have at the Haskell Foundation is Bryan Richter, who you may know online as chreekat. I'm actually not sure how to pronounce that. CHR is a consonant cluster found in my own name, but nonetheless unusual in English. He's doing full-time DevOps and CI work for the GHC project and helping to unstick things there and make it easier for both the existing team and new contributors to work on it. He's also looking for volunteers to help out, so if you have knowledge of CI and DevOps things generally, and in particular GitLab, Nix, Python, Bash, and/or PowerShell, get in touch with him and he'll put you to work doing useful, interesting stuff for GHC. Another project of the Haskell Foundation is the Haskell Error Index. This is a new website, which you can get at errors.haskell.org. It really got its start at ZuriHac last summer. The way this works is that participating Haskell development tools, and so far there's three, that's GHC, Stack, and GHCup, can assign a unique code to each of the error messages and warnings, and then these can be looked up on this website. The website contains, for a given error message, a detailed description of it that's sort of longer than you could put in the error text itself. It can contain any number of examples, so ideally we're going to be providing sort of a before and after example, like a program that exhibits the error message and then one in which it has been fixed, along with details about why that program exhibits the error message. I could really use some volunteer help on this one, so if you're good at CSS and JavaScript, then dark mode support would be super useful, and also we put some work into the CSS, but it could use more, like it's not the most beautiful of websites yet.
Also writing documentation content is super useful. If you know enough Haskell to understand one error message and you can write a Markdown file and use Git, then you have the skills necessary to contribute documentation for one error. So far we have 72 in there. I hope that with the new GHC release coming out that supports the error codes, we'll quickly get the 371 remaining errors documented, no, 271. And also, the site backend is a static site generator written in Haskell using the Hakyll library, and right now the deployment script takes too long to run, and if you can help me with the caching to fix that, that would be awesome, because I'm terrible at that stuff. As you saw earlier, Fraser is running our security advisory database. This is a new project, a new initiative, sort of inspired by the Rust and npm advisory databases. The idea is that it's going to serve as a data source for tools like Cabal, Stack and Dependabot. And in particular, a lot of organizations that want to use Haskell need to pass ISO 27001 certification, and doing that is certainly possible without one of these automated scanners, but then you have to have a conversation with the auditor, and that makes things slower and riskier and more expensive. So if you can just check the box, that's much better. Also there's real value in finding out whether or not one of your dependencies has some sort of a known issue. Volunteers are wanted for the security response team, which is going to be administering the actual contents of the database, but also for tool development. It would be great to be able to sit down and say, you know, cabal audit or stack audit and have it spit out a list of things to look out for, and also some public communication help could be useful. Right now we have like a data format for the database, but generating a nice website that documents everything and makes it searchable would also be a really useful contribution.
We have a podcast, the Haskell Interlude podcast, which is looking for guests, members of the Haskell community. So if you'd like to get on there and have a discussion with some leading Haskellers about what you're up to, that would be really cool. Email podcast at haskell.foundation. The Haskell Optimization Handbook is an in-progress text on how to make Haskell code go fast. I blurred out the address on there a little bit just because it's in the process of moving from one address to another one. But if you Google it, it'll come right up. This is being organized by Jeff Young, who works at IOG. He's known as doyougnu on the Internet, so you can get a hold of him if you'd like to find ways to contribute text or infrastructure to this project. In addition to all of these sort of concrete technical things, we're also orchestrating the Hackage security signing process. So Hackage uses an instance of The Update Framework, which is a sort of standard way of securing software repositories against man-in-the-middle attacks and untrusted mirrors and these kinds of things. And part of this process is that we have a collection of trustees who have keys, and any three of them have to sign the metadata file from time to time just to keep the thing going. And they're certifying that all of the associated roles are correct. And that's been a volunteer-led process in the past. And we've had a couple of times where we thought, uh-oh, we've got to sign this real fast, otherwise things are going down. And so, with the HF being a professional organization, we can put a thing in the calendar and get the process going in plenty of time and all of those things. Also, we're going to be doing a sort of lottery factor audit of key projects in infrastructure soon and try to find more places where we need to recruit extra maintainers for important projects and that kind of thing.
Some people call the lottery factor a bus factor, but I'd rather think in terms of how many people can win the lottery and retire from computing forever without the project collapsing. It seems a little happier. We've also spent some time helping out the GHC developers. The results of this end up on discourse.haskell.org, typically. Recently they asked us for some help in going out and surveying a certain number of GHC users about priorities for the next six to nine months. And then we collected feedback on that, and then they developed a report where they said what they're actually going to do based on the feedback. So this is available to be read. And right now we're trying to do something similar for a project on making nightly releases easier to get to. So right now you can get nightlies if you know the incantation, but we'd really like it to be super easy to get a hold of them. And so if you think that you could use nightly releases for something, then please go find the Discourse thread and post, because that way we make sure that whatever solution we have incorporates your use cases as well. And I've already discovered a few that I hadn't thought of, so that's been a useful process. We're organizing a workshop for new GHC contributors. So if you'd like to get started on hacking GHC, but don't yet know how, then you should come to ZuriHac three days before ZuriHac and get an introduction. This is still an in-progress thing. We don't have a specific speaker list yet. Simon's definitely talking, but the rest is depending a little bit on a survey that's out there. So if you fill out the survey and say what you're most interested in learning to hack on, that'll affect the people who we invite to come and present parts of GHC for potential contributors. And also if you know how to run a hybrid event well, get a hold of me, I don't.
I'm going to do my best, but if you could spare 45 minutes on the phone with me to tell me all the things I'm about to do wrong, that would be very valuable. So email me if that's a thing. We have two big working groups at the moment. We have a technical working group which evaluates various project proposals, and especially proposals where Haskell Foundation administrative time or money would be useful. And so if you have something like that, please come and give us a proposal and we'll discuss it and try to refine it and eventually, hopefully, fund it or administer it or otherwise carry it out. We've also used this to host like a community RFC process in cases where that's needed because, as I said earlier, there isn't really anyone who owns Haskell. So it seems as good a forum as any to have some of those discussions. And we have a stability working group which meets every two weeks, which involves GHC developers and academics and others, and we're trying to find ways to reduce the difficulty posed by updates to the Haskell ecosystem. And that's going to be some combination of social and technical means over time, lots of small stuff. Thank you for listening. If you want to get a hold of me, I'm david at haskell.foundation or I'm christiansen on matrix.org. And then the Haskell Foundation itself is haskell.foundation. You can also look at haskell.org to find out more about Haskell itself. We're on Mastodon and Twitter as well, and the names are up there. And I believe there's a few minutes, five minutes left, so if there's any questions. Thank you, that was a great talk. So I observe what the Haskell Foundation does and there's a lot of great initiatives, and I think it pushes the language forward. But I'm wondering, does the Haskell Foundation measure somehow how popular Haskell is? Every other month there is like a thread saying that Haskell is dying. So I was just wondering, do you have some concrete data that would say otherwise?
Like that, I don't know, like a job postings are growing or the number of companies that using Haskell are growing or like the community is growing somehow. Like do you gather some data like this? Not particularly, no. I haven't found a good way to do it that I think is going to be more signal than noise. I don't think Haskell is dying. I think that there's a couple of people who feel that way and they're entitled to that feeling and they say it regularly. But I don't think that that's a commonly held feeling is my impression. I get the impression that's more a feeling that a couple of people have. I know that I keep hearing about new users of Haskell who I'd never expected because they're not very public about it. Hackage continues to grow. I see job posts on a regular basis which I didn't see a couple of years ago. So those are all anecdotal qualitative things. If you have ideas about non-misleading measurements that are cheap enough that I can do them being essentially one person with a, you know, not tiny but not infinite budget then. Yeah, definitely. I'm not saying it's easy. I was just wondering if you maybe do a self-light. As you know from my background, as you could see from my background, like it's very much on the like programming language side rather than the market research side. So if you do have good resources, I'd love to hear them. Sure. Thanks. Okay. Any more questions? It seems not. So thank you very much, David. Thanks. Thank you. |
Acknowledgements, *prize draw* and farewell |
So, thank you everyone who has come today, those who are here and those who are not in the room and were earlier. I would like to thank our speakers, Julia, Methodis and Renaldo, Hecate, Ivan and David. Thank you very much for presenting at the first and hopefully not the last Haskell devroom at FOSDEM, and let's give them all another round of applause. Thank you to David Antwells who have helped a lot with the organising of this event, selecting the program, and for the swag which was donated by the Haskell Foundation. We have stickers here at the front, there's stickers on the table at the back as well. Feel free to grab a bunch as you leave. And for the books, for the prize draw: Programming in Haskell by Graham Hutton, second edition, three copies donated by the Haskell Foundation to give away in the prize draw today. So, thank you very much David for organising that. Thank you to the FOSDEM organisers and volunteers for making this possible, and especially for all of the AV side of things, which is a massive undertaking. So credit to them, and if you see some volunteers please remember to thank them. Okay, and now the prize draw. So I think the highest number ticket was 39, if you have a ticket with a higher number than that, jump up and yell so that I put the correct range in the random number generator. So let's import System.Random, let's make a generator, a newStdGen, actually I'll use a monad for this, newStdGen, bind, and what do we need to do here? randomR, so random in a range from 1 to 39, it's inclusive, okay, take my word for it, it'll be fast enough. Okay and we'll need to do a randomRIO, is that a function? We'll use newStdGen. We have to use a monad and a bind here just because this is the Haskell dev room and we need to play up to those sayings that people have that we're all obsessed with monads in the Haskell community and that's all we ever talk about.
Okay and we'll need to, let's see, pull out the first value, there we go, okay 35, who's got number 35? Not here, I'm just going to keep going then. Number 9, yes we have a winner, okay, I'll pass that back, thank you, okay I might actually hand these down and get someone else to do the running around, okay, number 26, okay number 29, yep here we go, 33, there we go, everyone's right at the front, okay and the grand prize, Australia's finest export, number 3, well done, okay, I'll bring up to you in a minute, okay so thank you everyone, that's a wrap, cheers. |
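For readers following along, the draw typed on stage can be approximated as below. This is a sketch, not the code that was actually on screen: the `winners` helper and the `ticketRange` name are assumptions, and it needs the random package for `System.Random`. It draws from an infinite list of random ticket numbers and drops repeats, so the same ticket cannot win twice.

```haskell
import Data.List (nub)
import System.Random (newStdGen, randomRs)

-- Tickets in the room were numbered 1 to 39; the range is inclusive.
ticketRange :: (Int, Int)
ticketRange = (1, 39)

-- Take the first n distinct tickets from a (possibly infinite) stream of draws.
-- n must not exceed the number of distinct tickets, or this never finishes.
winners :: Int -> [Int] -> [Int]
winners n = take n . nub

main :: IO ()
main = newStdGen >>= \gen -> print (winners 5 (randomRs ticketRange gen))
```

The bind in `main` mirrors the joke in the talk: `newStdGen` is an IO action, and the generator it produces is passed to the pure `randomRs`, whose lazy result is consumed only as far as needed.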
Efficiently exploit HPC resources in scientific analysis and visualization with ParaView |
All right. Good morning, everyone. Welcome to the HPC Dev Room. Thanks for being here so early in the morning, maybe not entirely sober. We'll let Nicolas get started with opening the Dev Room with a talk on ParaView. Thank you, Nicolas. Hello, everyone. Thanks for being here early in the morning to begin this HPC day at FOSDEM. A big thanks to everyone who organized this room. It's really great to be here. Thanks, Kenneth, and all your team. So, about me, I'm Nicolas Vieille. I'm a C++ developer, and I have the chance to make my job about making open-source contributions. I'm working at Kitware Europe, and I work mainly on ParaView, so developing the software, but also interacting with the community, so you may want to reach me later. So, ParaView, it's an end-user application for scientific data analysis and visualization. We have an open community on GitLab for the code and Discourse for discussions. It's supported by Kitware, which is also behind VTK, the Visualization Toolkit, which you may already know about. So, what do we do with ParaView? It's for displaying and analyzing scientific data sets. So, it's mainly a 3D visualizer, but you've got also some charts and spreadsheets and so on. It's also intended for data processing. So, we have a concept of filters that take an input, compute some stuff on it and give you outputs. So, basically, you can extract the data of interest from your raw data. We also have some other modules like realistic rendering. So, you can do your communication directly with your real data sets and not just with some kind of fake one. So, here is a basic screenshot of the application. Who is using ParaView? It's a generic application. So, we cover a large range of domains like fluid dynamics. So, we can compute streamlines, particle tracking, and so on.
We also have volume rendering, which is really nice for medical applications where you have a 3D scan and you want to understand what happens inside. That's a non-exhaustive list. A lot of domains can be covered, but here are the more well-known ones. How do we use ParaView? As I said, it's an application. So, basically, the first way to learn is to use the GUI. You click on buttons, you do some stuff, and you're happy. But you can also use the Python wrapping to write scripts, and then you can run a script as batch processing without having to be behind the computer. It's also a framework, because you can code your own extensions, your own derivative work from ParaView, in the native C++ language, but some features can also be done in Python code. And it's all based on the Visualization Toolkit. I mean, all the hard work of processing the data and doing the rendering comes from VTK. As I said, that's also supported by Kitware, so I do sometimes work on VTK for bug fixes or small new features. Where can we run ParaView, on which hardware is it possible to use it? Basically, on your small, classical desktop: we have official binaries you can download and just run. It's cross-platform, I didn't say, so you can run it on proprietary systems like Mac or Windows as well, and it should work out of the box on Linux too. You can also build it. We have a large selection of build options that you can enable or disable, depending on what exactly you want to do: Python, data distribution, parallelization, custom rendering, a lot of stuff. We have some documentation about it, and we can help you on the Discourse if you are trying to achieve some specific build. And which kind of usage do we have of ParaView? Both research and industry are using it. For instance, recently, there was the Super...
So, before winter, there was the Supercomputing conference in the US, and they organized a SciVis contest where people are asked to upload nice videos made from their scientific analysis, and most of them are using ParaView, either just for the data processing, but sometimes also for the video generation and to animate their data. So, why are all those people using ParaView? Because ParaView does some stuff efficiently to process the data on their infrastructures. That's what I want to talk about in the next part of my talk. So, what does ParaView use under the hood to make this possible? First, we have a client-server architecture. I said earlier that you can do things from Python, but the GUI and Python are just two clients. You can do exactly the same stuff with one or the other. There's no real limitation about using one or the other. Second, you can also run with a remote server, and you can run your server in a distributed environment. In that case, you can either just run the server part with an analysis script, or you can connect your local client to your remote server and use the graphical interface to do the stuff as if it was on your local machine, when instead it's the supercomputer or the remote architecture. At the bottom, there are two other modes that are available. Typically, if you have some graphics nodes on your server, you can use them just for the rendering part and keep the data management on the CPU nodes, and you can still connect your client to it to see what happens and to control it from a graphical interface. And the last mode, I will come back to it later: we have an in situ infrastructure. Basically, your simulation can call an API that feeds a ParaView analysis script, and you can even connect with a graphical client to see, time step per time step, what is happening in your simulation. So, those are the different modes of use for ParaView.
So, first, to run on HPC infrastructure, we implement data distribution for the analysis. Basically, we rely on the MPI standard. Our readers are MPI-aware, so they can distribute the data over the ranks early in the process, when you read your data from the disk. Then, most of the filters are okay to run with just their subset of the data, but some other filters need to know about the neighborhood to execute correctly. For that, we support the concept of ghost cells, where each rank knows a little bit about the rank that is next to it. In that case, mainly, we split the data geometrically. So, a subset is really a geometric subset of your data, and the different ranks can know about and communicate with each other for specific tasks. Lastly, we have some filters, and what I call a filter is really something that the user can instantiate from the client and ask to process, that can ensure load balancing by redistributing the data during the process. The visualization can also be distributed over several ranks. For that, we use an internal library called IceT, which is also based on MPI, and ParaView supports different models of rendering. If you have dedicated rendering nodes, you can, as I said, create a ParaView server just for the rendering part and connect it to the data server. You can have multiple GPUs per rank to do the rendering, and that's also possible locally: if you have just one machine that has multiple GPUs, you can ask it to render on both simultaneously. Concerning performance now: distribution is not about performance, it's just about running with data too big to run on your machine. It's a requirement, when you have huge data, to be able to distribute it over your computer or your supercomputer.
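The ghost-cell idea mentioned above can be sketched in a few lines. This is a one-dimensional toy written in plain Python, not ParaView or VTK code: the function name `split_with_ghosts` and its parameters are hypothetical, and the sketch only assumes that each rank owns a contiguous block of cells plus a thin layer copied from its neighbors.

```python
def split_with_ghosts(n_cells, n_ranks, ghost=1):
    """Split cells 0..n_cells-1 into contiguous per-rank pieces.

    Each piece lists the cells the rank owns and the ghost cells it
    additionally sees from its neighbors (hypothetical toy layout).
    """
    pieces = []
    base, extra = divmod(n_cells, n_ranks)
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks get one more cell to balance the load.
        size = base + (1 if rank < extra else 0)
        stop = start + size
        lo = max(0, start - ghost)
        hi = min(n_cells, stop + ghost)
        pieces.append({
            "rank": rank,
            "owned": range(start, stop),
            # Ghosts are cells in the padded window that the rank does not own.
            "ghosts": [i for i in range(lo, hi) if not start <= i < stop],
        })
        start = stop
    return pieces


pieces = split_with_ghosts(n_cells=10, n_ranks=3)
for piece in pieces:
    print(piece["rank"], list(piece["owned"]), piece["ghosts"])
```

A filter that needs neighborhood information (a gradient, for example) can then read the ghost values locally instead of asking the neighboring rank during the computation.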
Now, let's talk a little bit about performance, because if you have big data, you also need to be performant in how you analyze it and how you proceed with it. For that, we have a thin layer for CPU parallelism that we call SMP tools in our code base. The goal is to parallelize code, mainly for loops, and the main point is that you can choose the backend at build time and then at run time, if you enable the OpenMP or TBB backends, and if you don't want an external lib, you can also use C++ threads to do that. As it is used, for instance, to parallelize a for loop or fill operations, it's really widely used in a lot of our algorithms, and you have some environment variables that control the backend, the number of threads, the size of the thread pools, or whether you allow nested pools and so on, depending on your resources and backend. It has some documentation, and we made some improvements to it last year. Still about performance, we also use, as an optional dependency, the VTK-m project, where the "m" stands for many-core, which is intended to be used on heterogeneous systems. Basically, when you want performance on a supercomputer, you still need to be aware of the current technology and the state of the art, and as we saw in the past decades, a lot of new architectures are emerging. So we thought about using a dedicated library to be able to use these new architectures. Inside the VTK-m library, the goal is to split all operations into really atomic operations, and then, at runtime, it can dispatch all of that onto the hardware and the backends that are available. So, with VTK-m, you can use CUDA, OpenMP, and also TBB, to do the computation.
This time, with VTK-m, it's not just about accelerating some specific loop inside an algorithm. It's more that VTK-m implements whole algorithms, like extracting an isocontour or so. And then we embed this into ParaView with some kind of wrapper to communicate with the VTK-m world. That's optional, and it's enabled by default in the binaries we provide. Another point about performance, but one that really depends on the use case and on the data you are using, is the in situ world. Traditionally, when you have your simulation, it dumps some data to the disk every time step or every N time steps. And then, to analyze it, you have to load the data back with post-processing tools. But that adds the cost of writing to and reading from your disk. And you need the whole size of the data available on your disk, so you need a big disk. And it really has a cost in terms of time when you have to write a full mesh or full data to disk and then read it back with another process. So, basically, the goal of in situ is to make the simulation communicate directly with the processing tools. Then the processing tools can wrap the memory in place and analyze directly in RAM, without writing to the disk, to save some I/O time. In the context of ParaView, we have a standalone API that's called Catalyst. It was recently released, as we made big improvements to Catalyst in the past years. The goal of Catalyst is to have a really minimal and stable API, so you can choose at run time the implementation you want. One other goal is to minimize the instrumentation you need to do directly in your simulation code. So, it's really easy for a simulation developer to understand the few key places where they have to put new code to call our API. So, here is a really basic example from one of the tutorials we have.
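The slide itself is not reproduced in this transcript. As a rough illustration of the instrumentation pattern being described (initialize once, execute at each time step, finalize at the end), here is a minimal sketch in plain Python. `CatalystStub` and `run_simulation` are hypothetical stand-ins written for this illustration, not the real Catalyst API, and the script name `pipeline.py` is an assumption.

```python
class CatalystStub:
    """Hypothetical stand-in for an in situ backend; NOT the real Catalyst API."""

    def __init__(self):
        self.log = []

    def initialize(self, script):
        # A real backend would load the analysis pipeline here.
        self.log.append(("initialize", script))

    def execute(self, step, time, fields):
        # A real backend would wrap the simulation arrays in place
        # (no copy, no disk write) and run the pipeline on them.
        self.log.append(("execute", step, sorted(fields)))

    def finalize(self):
        self.log.append(("finalize",))


def run_simulation(backend, n_steps=3, dt=0.1):
    """Toy solver showing the three places where instrumentation goes."""
    pressure = [0.0] * 4
    backend.initialize(script="pipeline.py")        # once, before the loop
    for step in range(n_steps):
        pressure = [p + dt for p in pressure]       # the "simulation"
        backend.execute(step, step * dt, {"pressure": pressure})  # each step
    backend.finalize()                              # once, after the loop
    return backend.log


log = run_simulation(CatalystStub())
for entry in log:
    print(entry)
```

The point of the pattern is that the solver only gains three call sites; everything about what is computed or rendered lives in the script handed to the backend.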
We need to initialize, of course, and you need to call some method at each time step where you want the processing to happen. And then finalization. Of course, you still have to write a little layer to describe your data. For that, we rely on a third-party library to help us. So, Catalyst is, as I said, standalone: it's an independent project, independent of ParaView. But of course, the first real implementation is an implementation for ParaView. So, yes, we implement the Catalyst backend, so you can run a ParaView pipeline directly from your simulation, at each time step or whenever you call it. So, how does it work? The idea is that you code the communication between your simulation and Catalyst through the API. But then the actual script that is executed, the actual pipeline and visualization you want to produce, is all scriptable thanks to the Python wrapping of ParaView. You can even load some representative data in the graphical interface of ParaView, do some analysis, export this as a Python script, and use the script to feed Catalyst. Then, when you run your simulation with Catalyst enabled, it will reuse the script you produced directly from the GUI. So, people who are not developers at all can still do some stuff with Catalyst. And the last point is that, when you have a running simulation with the Catalyst pipeline on your dedicated server, you can also use the GUI to connect to this ParaView server and get, in real time, screenshots of the visualization and the analysis that is proceeding on the server. So, you can have feedback, time step per time step, on what is happening in the simulation.
So, if you see that something is diverging or going wrong, you can stop your simulation directly, and you don't waste all that time before seeing that something went wrong and that you should tweak a parameter and start again. So, yeah, I was quite a bit faster than I expected. So, in conclusion: to be able to run efficiently on a supercomputer with ParaView, we implemented a client-server mode. The server is MPI-aware and can be run in distributed environments. We are relying on well-known libraries, such as implementations of MPI for the distribution, but we are also really looking toward new libraries that can help us. And we are able to integrate a new library, and do some performance analysis on it, when it is aware of a new supercomputer architecture or a new technology. That's the case, for instance, with VTK-m and others. And we have this API to do in situ, which can save a lot of time and disk space. Just a slide to summarize the organization. We have different kinds of ways to interact with ParaView: the GUI, the Python scripting, the Catalyst in situ stuff. You can also build some custom ones. We have some web examples of clients. And at the bottom, we have a non-exhaustive list of libraries on which we rely to do the effective work: basically, OpenGL, MPI, OpenMP. Concerning the roadmap, we have several improvements that are coming. First, I talked about in situ. In the current implementation, each rank that does the simulation also does the co-processing work. That's not always what is intended. Sometimes you want to do the co-processing on other ranks, just because, for instance, you have dedicated ranks for visualization. So, you want to do all the processing on the visualization nodes.
That's not possible with just the in situ implementation, but we have an in-transit implementation where the simulation can communicate with those different nodes, and the analysis can happen on other ranks than the simulation. So, the simulation can go forward directly. We also recently started using a new library called DIY, which is here to do some wrapping for us. We take it as a wrapper around MPI. DIY allows us to cut the data into different blocks. Then, at runtime, DIY itself is aware of what to do: okay, I should put three blocks on each rank, or only one block. So, yeah, it's just a new abstraction over cutting your data for distribution. We are also looking for better VTK-m integration, to be able to run on a lot of hardware. And something very cool that is very new: it's just in the development branch of VTK, so absolutely not in ParaView yet. It was merged, I think, one or two weeks ago. It's what we call implicit arrays. Basically, it's really cool from a memory point of view, because it's some kind of view on memory. For now, in the ParaView process, your data is really an array in your memory. With implicit arrays, we have views, so you can implement any pattern on it. For instance, when you compute an isocontour of your data, you know that the resulting data will all have the same value. So, if after the isocontour you still have one million points, you know that all the points will share the same value. For now, it's duplicated one million times in your memory. So, that's not efficient. With an implicit array, you can store the value only once and say, okay, this should be an array of size one million, and the value you should return is this one.
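The constant-value case described above is easy to picture with a small sketch. This is plain Python, not the VTK implementation: `ConstantArray` is a hypothetical class that behaves like a read-only array of a given size while storing a single value.

```python
from collections.abc import Sequence


class ConstantArray(Sequence):
    """Read-only 'array' of `size` entries that stores one value only."""

    def __init__(self, value, size):
        self._value = value
        self._size = size

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice of a constant array is a shorter constant array.
            new_size = len(range(*index.indices(self._size)))
            return ConstantArray(self._value, new_size)
        if not -self._size <= index < self._size:
            raise IndexError("index out of range")
        # The value is computed on access, never materialized per element.
        return self._value


# One million points on an isocontour all share the contour value.
iso_values = ConstantArray(0.5, 1_000_000)
print(len(iso_values), iso_values[123_456], sum(iso_values[:10]))
```

Consumers that only index or iterate cannot tell the difference from a materialized array, which is what lets the memory saving stay invisible to the rest of the pipeline.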
But you can also imagine a compressed array in your memory, with an on-the-fly decompression algorithm that runs just when you want to read your data. So, it has a cost in terms of computation time. But if you run out of memory with too-huge data, that can be really great. Okay, I could still say a lot of things, but that's the end of what I put in the slides. So, thanks for attending and for being here early in the morning. And if you have any questions, I think now is the time. I just put a lot of resources at the end of the slides, so you can get them from the FOSDEM website. Thank you. Thanks, everyone. Thank you very much, Nicolas. Do we have any questions? Thank you. In our group, we are happy users of ParaView. One thing that maybe I could add to a wishlist, or, well, maybe just for discussion, is that we have quite some headaches when using ParaView on GitHub Actions for multiple platforms. So, to set up environments for Linux, Mac and Windows with the same version of ParaView coupled with Python, just to be ready to use it. It's a bit of a headache, especially when you go to Windows and you need to download things, brew things, apt-get install things, and then they do not necessarily work all together. So, wishlist thing: a GitHub Actions ParaView setup thing. Unless it exists already, but I haven't found it. So, to repeat the question: the use of ParaView with GitHub Actions, so in a continuous integration, limited environment, I guess? Yeah, it's a wishlist. Yeah, well, we don't use GitHub directly. We have the GitLab, where you can find a lot of stuff with our CI and CD. We produce nightly releases of ParaView through the GitLab. So, I don't know if that's some sort of part of the answer, but. So, what kind of stuff are you doing with ParaView and GitHub Actions?
Is it rendering or, rendering with Python? The fact is, I don't really know about GitHub Actions because I don't choose GitHub anymore. So, I don't see what you can do with that, that you should not able to do otherwise. Any other questions? There's a question on the chat. Okay. Yeah, there's a question on the chat. If I want to put Catalyst in my simulation, what is the first step? Oh, sorry. If you want to use Catalyst. In your, yeah. What's the first step? We have some, I think we have some tutorials on example in the code base of ParaView. We have some examples where there are some dummy simulations with just a main, so you can enter from it to, to see how it is organized. And, and yeah, the first, one first thing is to be able to know what do you want, which data do you want to, to send through, through Catalyst and where you can access it in your code. And then, it's, and then so you, at this time you, you have located the entry points from your simulation code and then you will be able to, to start writing the, the small wrapper you need to wrap your data on the need to the actual API of ParaView. Thanks. Okay. Any other burning questions? Maybe one last, yeah. Last question? Thank you for the talk. A very naive question, because I, is it working? A very naive question because I know almost nothing about, about ParaView. You had many components there. One of them was the client that does the visualizations. Yeah. Is it, would it be possible at some point in the future to be like a web client where you just log into the website and it just displays everything? Or is it just, due to the architecture is it like super complicated to do it that way? We, so the question is, yeah, the question is, are we able to use a web client for ParaView? Just, just for the part that does the visualization, if that could be like a, we are, we have some web client for ParaView already. 
So we have a framework called trame, T-R-A-M-E, that's intended to connect to a ParaView server. And then you build your own front end for these applications. So basically, yeah, you should build your own. But it can be: okay, I have a server on which I always open this data, and the front end is just a 3D render view. That's already possible quite easily, I think, with the trame framework. And Jupyter Notebooks also, right? Jupyter Notebooks, I think I saw it on the user interface line. Yeah. Well, as we use Python intensively, we also made the step to be supported from a Jupyter Notebook, and we also have a plugin that allows you to control a ParaView GUI. So you can do some stuff in the Notebook. And if something goes wrong or you don't understand, you can launch a magic command, run ParaView, that opens the ParaView client with all your Python pipeline, and you can introspect it in the GUI. And then you can go back to your Notebook. Okay. Thank you very much, Nicolas. Thanks. Thank you very much. |
Simplifying the creation of Slurm client environments
A Straw for your Slurm beverage |
Okay, the next talk is by Pablo, who is going to explain to us how to set up Slurm client environments more easily. My name is Pablo and I have been running HPC clusters for about nine years. I was running the HPC clusters at CERN and got involved mostly in Slurm, running Slurm. That's when I came up with the idea for this tool. Since about eight or nine months ago I have been running HPC clusters elsewhere, and I'm also participating in the SKA project, hence the pretty background, where we also do things related to HPC infrastructure. So, just a brief introduction to Slurm in case anybody is not familiar with it. Slurm is basically both a resource manager and a job scheduler, meaning Slurm will manage the allocations: it will track which machines are allocated to which jobs, which users own which CPUs and which nodes, etc. And it's also the job scheduler. You have your happy users over there, or hopefully they will be happy users, and they want to run something on your cluster, so they make a job submission, usually by writing a script that launches some workloads. They will basically interact with Slurm, and Slurm will manage all of these job submissions. You won't just have them one by one; you will have hundreds or even thousands of jobs that are scheduled to run on your infrastructure, and Slurm will manage the queues, the priorities, the accounting, etc. So basically it's a batch manager, but it does both the resource management and the scheduling of the jobs. Digging a bit deeper into how Slurm works, because this is relevant for this talk, there are basically two main components, two daemons that are the most relevant: the controller, which is called slurmctld, and the daemons that run on the worker nodes at the bottom, which are slurmd. And then you have other daemons, like slurmdbd and slurmrestd. Those are not relevant for this talk; I will mostly focus on the part on the left here.
So users and client tools basically interact with the controller over the Slurm protocol. There's nowadays slurmrestd, so you can also interact over REST with some scripts, but mostly all the userland tools, and almost everything in the Slurm ecosystem, just talk to slurmctld. This controller holds the source of truth for Slurm, so it knows which resources are allocated where, it knows which jobs exist, it knows who the users are, etc. The controller talks to the slurmd daemons on the nodes, and the slurmd daemons are in charge of launching the jobs, doing the cleanup, setting up the cgroups for the jobs, whatever you have. Now, what's important here is to know that for all of this to work, you need at least two things. You need the Slurm config files, and they need to be in sync across the whole cluster. You may have some differences, but mostly they should be the same. There was no audio online? Okay. So as I was saying, slurmctld holds the source of truth. The slurmd daemons are in charge of launching the jobs, and the two important things are these. You need the Slurm configuration files. It's mostly the slurm.conf file, but there are other files as well. Those need to be in sync across the whole cluster, and they need to be basically the same. They should have the same hash, ideally. And then you should also have a shared secret, so that a rogue client cannot just add a worker node to the cluster and start doing malicious things. Usually it's a munge secret. There's a daemon called munge, and you have a shared secret as well for the whole cluster. And this fact is very relevant for this talk. Now, on to containers. Containers are increasingly becoming a super popular tool to run infrastructure, for reproducibility, for automating deployments. And just in general, they're becoming super ubiquitous in our industry. I think for good reasons.
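The "same hash" requirement mentioned above can be checked from the outside with a few lines of Python. This is only an illustrative sketch: the helper names `config_digest` and `configs_in_sync` are hypothetical, SHA-256 over the raw file bytes is an assumption chosen for the example, and it is not the check that Slurm performs internally.

```python
import hashlib
import tempfile
from pathlib import Path


def config_digest(path):
    """SHA-256 of the raw file contents (illustrative, not Slurm's own hash)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def configs_in_sync(paths):
    """True when every listed copy of the config has identical contents."""
    return len({config_digest(p) for p in paths}) == 1


# Demo with two throwaway copies standing in for controller and container.
with tempfile.TemporaryDirectory() as tmp:
    controller_copy = Path(tmp, "controller-slurm.conf")
    container_copy = Path(tmp, "container-slurm.conf")
    controller_copy.write_text("ClusterName=demo\n")
    container_copy.write_text("ClusterName=demo\n")
    same = configs_in_sync([controller_copy, container_copy])

    # Someone updates the cluster config but forgets the container image.
    controller_copy.write_text("ClusterName=demo\nSlurmctldHost=ctl1\n")
    drifted = not configs_in_sync([controller_copy, container_copy])

print(same, drifted)
```

A check like this only detects drift after the fact, which is why keeping a second, manually managed copy of the config is the part worth avoiding.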
And there are, I think, very good use cases for using containers with Slurm. In this talk, I will focus on the use case where you use containers on the user and client side of things. So, those tools that will talk to Slurm, mostly to the controller, to do things on the cluster. This could be some automation that you run to do whatever. For instance, you could use it for monitoring purposes. You could write a tool that does health checks on the cluster, or for accounting. I've used it extensively for accounting as well. But also integration with other services, right? If you want to connect a Jupyter notebook with Slurm, you will end up with some tools that talk to the controller. Now, there are basically two scenarios in which you can use containers with Slurm. On the left, we have the local use case. Imagine you have a frontend node, a machine that's configured so that users SSH to it, and from there they can run the Slurm commands to launch jobs, to track their job usage, et cetera. It's conventionally called the frontend node of the cluster. If you just add the Slurm client container on that node, it's very simple. Because, as I said, you need a secret with munge, and you need the config files. And that scenario is very simple because you can just do bind mounts: you can access the munge socket to talk to Slurm, you bind mount the Slurm config directory, and you're done, basically. So that's sort of easy. However, for the use case on the right, you have the distributed or remote use case. In that case, you may run your Slurm client container in a different service, on a different network, or you may run it on Kubernetes or somewhere else. In that case, you obviously can't just do the bind mounts, because you need to give it all those things.
So you would have to give it all the Slurm config files and somehow the munge shared key so that your external service can talk to your cluster, specifically to the Slurm controller. Now, this is an extract from a Dockerfile. This is the naive approach. This is how I started trying things. Easy, right? You just take the Slurm config, and you just copy it to the destination. And this will absolutely work. But I was not happy with this approach, because then you end up managing two copies of your Slurm config. When you do configuration management and automation of your infrastructure, I really like having a single source of truth. And managing it this way with containers is very fiddly, because it's very easy to forget to update it, or for something to fail to update it automatically. It's just not ideal. I didn't like this approach, but it will work. And some of you who know Slurm may say, oh, but Pablo, why wouldn't you just use Slurm's configless feature? Slurm configless is a new feature, since Slurm 20 or so, that basically allows a client to just pull the config files from Slurm. So the slurmd daemons that run on the worker nodes, when they start, will just grab the Slurm config files. So you can just remove the need to even copy the Slurm config, right? Well, that's a trick question. Not necessarily, because then you need to run a slurmd daemon in your container. And you also need the munge daemon. It sounds easy, but it's really not. You will need to do a lot of hacks. This is an extract from a container that I was creating. And you run into lots of awful things. Like, the slurmd daemon expects this release_agent file to exist in the cgroup, and the containers just don't create it. I tried it on Docker. I tried it on different Kubernetes versions. It just doesn't exist. I don't know why. I couldn't find out why.
If anybody knows, please tell me. I googled around and found that it could be related to some privilege escalation issues. However, if you just remount the cgroups, the file appears. So I'm not sure what's going on there. Another fun story is that, for instance, if you're using Kubernetes, Kubernetes likes to give you a symlink to your secrets, and munge refuses to take the secret from a symlink, for security reasons. It makes sense. And there's more. So you will need to put in hacks. And it's hacks on top of hacks on top of hacks just to run these two daemons. And yeah, I was not very happy with this approach either. So basically, when you arrive at this situation, you're faced with two options. Either you do the first, naive approach, where you just copy all the stuff into your Slurm container, and you manage a copy of your Slurm config files. But as I said, if you want a single source of truth, this might not be ideal. You also need munge, of course, in this use case, and you need to supply the munge key. Or you can try the configless approach, but then you need to add slurmd to your container so it can pull your config files via configless. But then you also need munge anyway. And you need to add the munge key to your container somehow, and manage secrets. I mean, if you're running Kubernetes or some other container manager, that might not be a big issue. But you will still need to maintain all these extra daemons with nasty hacks. And we don't always like having lots of hacks in our infrastructure. There's a third option, by the way, which is trying to go secretless. It doesn't work in combination with configless, where you try to use JSON Web Tokens. It gives a lot of issues. It doesn't really work. I tried it. So I didn't include it here. I'm just mentioning it in case somebody thought about it. So Pablo, you talked about the bad and the ugly. What about the good? Is there any good part to this?
I'm glad you asked. Yes. What if we had a single-shot CLI tool, a very simple tool that was just able to authenticate to the controller, either using munge or JSON Web Tokens, which Slurm also supports, fetch the config files, and then it's done? That's all you really want to do, right? Because then the Slurm tools can work, because they have the Slurm config files, and just by having the JSON Web Token in your environment, you can talk to the Slurm controller. And yeah, that's the tool that I wrote. It's a very simple tool. It does exactly what I described there. And it's open source. You can find it on GitHub. I uploaded it in the past month. Fun story about this. As I said, I had the idea for this when I was back at CERN. I worked on this a year ago already. But then I somehow lost the source. I don't know what happened. Just before I left CERN, the source was just lost. I must have deleted it by accident. So after I left CERN, I kept in contact with my ex-colleagues, and they were telling me that they wanted to do this integration with SWAN. Who here knows SWAN? Anybody? Okay, one, two, three. Yeah, so it's the Jupyter Notebook service for CERN, which also does analytics. We wanted to connect it to Slurm, and we ran into all these issues, because this is a service that's exposed to the whole internet. So we didn't want to have the munge key for the Slurm cluster in the container, et cetera. Anyway, so then I left CERN, and my colleagues were telling me, oh, it would have been so useful to have this, what a pity. And then a few months ago, I just didn't like the fact that I had lost the source and all those days of work. I had spent a couple of days reverse-engineering the Slurm protocol, and I just didn't like losing it, so I rewrote it more properly in Python and made it public.
So if you're interested in making client containers like this, feel free to give it a try. It looks a bit like this. It's very simple. You can choose between munge or JWT, JSON Web Token, authentication. If you choose JWT, which is the simplest one, you just need an environment variable with a token, you can tell it where you want to store the config files, and then you have verbosity as an option. So it's very simple. It has very few dependencies. The tool speaks several Slurm protocol versions, because with every major release, Slurm changes the protocol version. You can list them with -l, and it will show you all the versions that it supports. So imagine you have a Slurm JSON Web Token in this variable. You can just tell it to do JSON Web Token authentication with the server. It supports multiple controllers, in case you have a high availability setup in your Slurm cluster, so you can specify a list of servers that it will retry until it succeeds. And then you tell it the protocol version of slurmctld, because it needs to know which protocol it should speak. Protocol version negotiation, I think, doesn't exist in the Slurm protocol, so you have to tell it which version you want it to speak. And that's it: it will just download the Slurm config files, and happy days for your containers. Conclusions; I think I'm ahead of time. This tool, called straw, can simplify the cost of creating and maintaining your Slurm client containers. It can also increase security, because you don't need to put the munge key everywhere you're running your client containers. JSON Web Tokens suffice. Caveats. I think this tool should not exist, because ideally this would be supported upstream. So, you know, if anybody has any influence on SchedMD's Slurm development, I think it would be nice if we had this built into Slurm.
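The "list of servers that it will retry until it succeeds" behaviour described above can be sketched as follows. This is not straw's source code: `fetch_config` and the `fetch_one` callback are hypothetical names, the actual protocol exchange is replaced by a fake function, and the sketch assumes the token is passed in or read from the `SLURM_JWT` environment variable that Slurm's own client commands use.

```python
import os


def fetch_config(servers, fetch_one, token=None):
    """Try each controller in order; return (server, files) from the first success.

    `fetch_one(server, token)` is a hypothetical callback standing in for
    the real protocol exchange with one slurmctld.
    """
    token = token or os.environ.get("SLURM_JWT")
    if not token:
        raise RuntimeError("no JSON Web Token available")
    errors = {}
    for server in servers:
        try:
            return server, fetch_one(server, token)
        except OSError as exc:  # connection refused, timeout, ...
            errors[server] = exc
    raise ConnectionError(f"all controllers failed: {errors}")


def fake_fetch(server, token):
    """Hypothetical stand-in: the primary controller is down, the backup answers."""
    if server == "ctl1:6817":
        raise ConnectionRefusedError("primary controller is down")
    return {"slurm.conf": "ClusterName=demo\n"}


server, files = fetch_config(["ctl1:6817", "ctl2:6817"], fake_fetch,
                             token="dummy-token")
print(server, sorted(files))
```

Because the tool is single-shot, a container entrypoint only needs to run it once before starting the actual client, and no long-lived daemon or munge key has to live in the image.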
And then the second caveat is that the JSON Web Token needs to be associated with the Slurm user, basically. So ideally, you would be able to just generate a JSON Web Token for a user that's going to run on the Slurm cluster, and then if the secret for some reason is exposed, you've only exposed the JSON Web Token of a single user. However, this is a limitation built into Slurm, basically. You cannot pull the Slurm config file over the protocol unless the token belongs to the Slurm user, or to root. Still, I think it's an improvement over having your MUNGE key available everywhere. Feel free to try it out. That was it. I'm happy to answer any questions you might have. APPLAUSE Thank you very much, Pablo. Time for questions. So what kind of clients need the config file? Could you do everything over REST nowadays? Is it still necessary to use the config file? Yes, so anything that wants to run srun, sbatch, squeue, sinfo. For instance, if you have the Jupyter Notebook plugins, they will just run those commands. Or if you want to run a client that uses PySlurm, for instance, or any library really, anything that uses libslurm underneath will automatically read the config files, right? So, of course, you can write your own client, handwritten from scratch, that just interacts with the Slurm REST API to do stuff. Yes, but you cannot leverage all the existing user client tools, and libslurm, PySlurm, etc. So if you want to create a Python tool, for instance, that leverages PySlurm, this would be, I think, a good solution. I think Slurm does have, like, a REST API, but it's considered very insecure. So even the documentation tells you, like, don't use this. I just haven't understood, like, for a long time now, why everyone needs the config file, right? I mean, why does it need to be in sync? Like, couldn't they just exchange the information over the protocol now and just say, like, this is your Slurm server? Yeah, that's the configless feature. 
That's the configless feature, essentially. Yeah, but the configless feature just downloads the config. Yes. It's not, like, config-less. Okay. Yes. I download the config. I don't need the config beforehand. It's like serverless. There's always a server somewhere. Yes. Yeah, exactly. So that's just how Slurm works. Yeah. So I'm still a little confused about the Slurm client container. So the container is an application on the actual Slurm client, because you have to declare in slurm.conf, you have to sort of say what your clients are so that the scheduler can intelligently decide how to schedule jobs, right? I'm missing something. No, you don't really need to declare all the clients for Slurm. You just need to declare the worker nodes that are part of it. But you can have any... I mean, it depends on how you've configured it. You can limit in Slurm which clients are allowed to connect, but you don't have to. So you could just... But even if you do, you will need this, because even if you authorize a host name to connect as a client, it will need to have the MUNGE key and the slurm.conf files, et cetera. Does this answer your question? Well, no, so in slurm.conf, you sort of detail what your partitions are, and you have to kind of tell it what the capabilities are of your clients, of your Slurm clients, right? So that Slurm can decide how to schedule jobs. I'm missing something. Well, I think you're thinking about the compute nodes. Yeah, I am. Yeah, the node names part of slurm.conf. So the containers run on the compute nodes? No, the containers would be... Let me go back to one of the slides where... So you're thinking maybe about the compute nodes, each of which runs a slurmd daemon, and those you have to declare. Yes, I think in 2023, by the way, you will be able to dynamically spawn compute nodes, but that's the future. 
What I'm talking about is all the users and client tools that connect to the controller to run squeue, sinfo, like when you use Slurm and you... Hello. So if you had some tooling that you automated to gather metrics from Slurm or, yeah, a Jupyter notebook service, for instance, that connects to your cluster, that wants to launch jobs, that wants to run sbatch, squeue, whatever, that's in that domain. Yeah, I mean, I think the newest version of Warewulf is set up to run containers on the Slurm clients, right? It's sort of, you're actually launching containers as applications, so that was kind of... That's on the compute nodes. On the compute nodes, yeah. Yeah, yeah, that's the compute nodes. Thank you for your talk. So I have a question. You are saying that you can pull the configuration with your tool, but there are many files you can't pull with configless. For example, all the SPANK plugins, or I think topology, you can pull it, but various others, like I said, SPANK plugins and so on. So how do you manage this kind of config file that is not handled by default by Slurm? Right, that's correct. So when you use the configless feature, it will download, you know, slurm.conf, cgroup.conf, a lot of config files, but it will not download your plugins, your plugin files. But I think those are usually not needed if you're running a client, because those are usually just needed for the slurmd daemons, right? Even for the worker nodes. Like the epilog, the prolog, you mean all of those plugin scripts, right? The authentication plugins. Those are usually needed by the slurmd daemon, but if you're just writing a client, say you're automating something with PySlurm to interact with it, you don't need those files. And you can happily run all of those commands without those files. 
Yeah, okay, so if I just summarize, the idea is just to create some frontend nodes, but not really worker nodes. Is that right? So if you want to use configless to set up a frontend node, you might need those files from somewhere else. But if you're just creating a container to interact with Slurm and send Slurm commands, you don't need them, basically. Because the plugin files are usually the... Yeah, the epilog and prolog for slurmd or slurmctld. And that's not what these Slurm client containers are about. So short answer, you usually don't need them. Hello, thank you for the talk. I'm wondering, in huge institutions, like CERN or EPFL, would you run your own forked or patched Slurm so you could fix maybe the authentication privileges? Or is it just not done because it's... I've never carried any Slurm patches, to be honest. Both at CERN and at EPFL, we just use Slurm out of the box. It works well enough for our use cases. It is true that you could, for instance, do a patch to enable finer granularity for the permissions. For instance, you could enable any user to pull the config file. That would be a nice patch. We don't do it. Okay, thank you. We have time for one short question. Hi, thanks. We actually are very interested in this because we have a JupyterHub frontend that talks to a Slurm cluster through SSH, because we don't want to install all that stuff, like MUNGE and the full Slurm deployment, on the JupyterHub host. And I'm wondering, how does it actually talk to the Slurm controller? So is the Slurm controller always listening to any of the hosts that will talk to it? Yes. Or are there any restrictions on who is connecting to the Slurm control daemon? So there's an AllocNodes setting in slurm.conf, I believe, which will allow you to restrict from which nodes you can allocate resources. Okay. So you can limit it. 
However, if you don't have that, the Slurm will happily accept anything because if you have the shared secret, it's considered good enough. Okay. Or a valid JSON web token. Okay. Yeah. Thank you. Thank you very much, Pablo. Thanks. Thank you very much. |
Troika: Submit, monitor, and interrupt jobs on any HPC system with the same interface |
Okay, the next talk is a perfect fit after the previous talk. So Olivier and Axel are going to talk about Troika, a system to easily manage and submit your jobs to any HPC system. Yeah, thanks for inviting me and thanks for letting me talk. So yeah, Troika, as Kenneth said, is a system so that we can interact with job submission systems through one given interface. So just before I start, a bit of context on where I work. I work for the European Centre for Medium-Range Weather Forecasts, which is a European-based international organization. And we run an operational weather forecasting service four times a day that we send out to national meteorological services and private customers. We also operate quite a variety of services, like we have our own in-house research to improve the models, to do climate analysis, reforecasts. We operate services linked to climate change, for instance, as part of the EU Copernicus Service, and we've just started a new project called Destination Earth. So I'll talk a bit more about that because it's a nice entry to what I will present. It's an EU program for weather and climate. It's a large collaboration that we drive with ESA, the European Space Agency, and EUMETSAT, the European meteorological satellite organization. And the goal is to run simulations of the Earth at one kilometer resolution. So for those who are wondering, that's about 256 million points per vertical level. So this project is quite big, and it will run on multiple HPC systems across Europe. So for instance, I think Barcelona with BSC and LUMI in Finland, just to name two. And that means we will require some level of flexibility to run our workflows. You notice I didn't say job, because in weather forecasting and also for these projects, we have lots of different tasks that we run together. So here you can see an overview of what we run operationally. But in practice, that's a few thousand tasks that run every time we want to run one of these pipelines. 
And we have multiple types of workflows in-house. So the main one is the operational one, of course. But then researchers have their own workflows. We have support workflows like CICD, deploying software, or just fetching data and analyzing data and things like that. And that amounts to about half a million tasks per day on our HPC cluster. And so sometimes we run parallel jobs, but most of those tasks are just small, like one CPU or a few CPU tasks, just to do some processing. So for that, we use a workflow manager that we developed called ECFlow, which basically manages a task graph as a tree with additional dependencies. So you can have dependencies on dates, loops, and things like that. And that runs a script for every task. So a task being one leaf in the tree I show here. It stores variables for pre-processing if needed, keeps track of the task status, fetches log files on demand. What it doesn't do to keep it simple is connect to remote systems and talk to specific queuing systems. So ECFlow just runs commands on the server host, which is usually VM, and provides three entry points, which are submit, monitor, and kill for every task. And so if you want to run an actual job on an HPC system, that means you have to have some kind of interface. So first you can start by just saying, oh, yeah, the command is SSH to my cluster and submit a job, and that's it, which works. But when you change cluster, or even like there is an option to put, you're in trouble because you have to change that variable, and it can be a bit painful, especially if you have thousands of tasks, or you don't want to regenerate the whole workflow. So next possible thing, you write a shell script. So you could do multiple actions in your script. You have a bit more flexibility, but I don't know if you tried handling configuration in a shell script. Usually it ends up quite easily into a nightmare. It's very hard to maintain, and if you deal with several people, everyone has their own. 
So we tried to have something a bit cleaner, and so we want to delegate it to a submit interface that can be made generic, gives you lots of flexibility, and that you can also maintain as a proper piece of software, which means versioning, testing, and at least some level of reproducibility. We call our software Troika because it mainly runs those three actions: submit, monitor, and kill. It's able to handle the connection to a remote system, mostly using SSH. It's also able to prepare the job script for submission, interact with a queuing system, and optionally you can run hooks at various points. It's written in Python. We put a strong emphasis on making it configurable, so everything can be driven by configuration. I'll show how this works afterwards. And we want it to be extensible, so you can add new connection methods if running locally on your server node or running over SSH isn't enough. If you want to support another queuing system, you could just add a plug-in, same. And if you want to add some hooks, for instance, to create directories before your job runs or copy files over before or after submitting a job, et cetera, you can also do it. So as an example, that's how you would run Troika. It has quite a simple command line interface where you can control most of the flags you will need in your day-to-day life. You choose the action you want to do: submit, monitor, or kill. You give it a machine name, which is defined in configuration. Some options like the user, and you tell it where to write the output file, because that will stay on the server. And it serves as a reference: if you want to copy some other files, they would be put alongside this one. And so here you can see the log below that shows the commands that would actually be executed when doing that. As I said, everything is configurable. Each site has a name to identify it on the command line. And then you define the connection type, local, SSH, whatever you want to add, and a type. 
So for now we support direct execution, Slurm, and PBS. And then you can add some hooks, for instance, oh, yeah, before I start doing anything, check the connection just to see whether it will actually work, or, oh, yeah, before submitting the script, just make sure the directory containing the output file exists, or once the job is submitted, copy the log file to the server so that we can see everything in the same place rather than having files scattered around every system. And so that's all good, but just having an alias to sbatch that does it remotely is not really helpful. So we also need to modify the job script to add some options that are understandable by the submission system. For that we decided to have a new language, because obviously the directives are not interoperable across submission systems. And so we need some kind of translation. We input some generic directives, and we can add some in the configuration as well. And then we translate them, for very simple things like, oh, yeah, the output file in PBS is -o, in Slurm it's --output. This kind of translation could also have plugins that compute resources, like if someone gives you the number of nodes and the number of tasks per node, and you need the total number of tasks, things like that, so you could add plugins, or if you have some specific resource management in your HPC, you can add that as well. And then on the output side, we have a generator that's site-specific, again, because we need to adapt the directives to the system. It can make the last few translations, for instance, the actual syntax of some options, like mail options. Most submission systems allow you to specify an email address to which to send an email for some of your tasks. Only the syntax is slightly different for everyone, so it does that translation, and it's able to add code, if you need, for instance, to define environment variables in your software. 
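The simple one-to-one translation described above can be sketched as a lookup table per queuing system. This is only an illustration under stated assumptions: the generic directive names (`output_file`, `job_name`, `queue`) and the `render_directives` helper are hypothetical and not Troika's actual directive language; the native options shown (`--output`, `--job-name`, `--partition` for sbatch, and `-o`, `-N`, `-q` for qsub) are the standard ones.

```python
from typing import Dict, List

# Comment prefix each queuing system expects in front of a directive.
PREFIX: Dict[str, str] = {"slurm": "#SBATCH", "pbs": "#PBS"}

# Hypothetical generic directive names mapped to native option syntax.
NATIVE_SYNTAX: Dict[str, Dict[str, str]] = {
    "slurm": {
        "output_file": "--output={}",
        "job_name": "--job-name={}",
        "queue": "--partition={}",
    },
    "pbs": {
        "output_file": "-o {}",
        "job_name": "-N {}",
        "queue": "-q {}",
    },
}


def render_directives(system: str, directives: Dict[str, str]) -> List[str]:
    """Translate generic directives into native job script header lines."""
    syntax = NATIVE_SYNTAX[system]
    lines = []
    for name, value in directives.items():
        if name not in syntax:
            raise ValueError(f"no {system} translation for directive {name!r}")
        lines.append(f"{PREFIX[system]} {syntax[name].format(value)}")
    return lines


if __name__ == "__main__":
    generic = {"job_name": "forecast", "output_file": "/logs/forecast.out"}
    for system in ("slurm", "pbs"):
        print("\n".join(render_directives(system, generic)))
```

Raising on an unknown directive is one possible choice; a pass-through mode that leaves native directives untouched would be the other obvious design.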
So the main components that are extensible in Troika are, as I said, the interaction with the queuing system. You have a parser that reads the native directives so that you can use them if you need them for your processing, it generates the job script, and it runs the appropriate commands, so either using qsub, sbatch, or whatever. It could use APIs if you have another system. And it can also keep track of the submission, so most of the time, just keeping a job ID so that if you want to monitor the task, you just say, oh, yeah, the script was this, and Troika will know, oh, yeah, I put the job ID in that file next to the script. I don't need you to tell me where it is. And so you can choose how you want to interact and define new interfaces if you want. Same for the connection. The connection mostly does the running of commands on the remote system. It's able to copy files over, if needed, both ways. And you can have some hooks at various points: at start-up, just before submitting, just after killing a job, for instance, if you want to tell a workflow manager that, oh, this task doesn't exist anymore, I just killed it, or at exit if you want to move your log files around, for instance. And that allows you to perform extra actions. And then the last thing you can customize is the translation. So if you want to generate more directives than the user provided, you can also do it. And basically, you just pass a function that takes the input set of directives and updates that set to whatever you need. So as a bit of a success story for us, we've just switched to a new HPC with a new set of ecFlow server VMs, new location, new everything. It's much simpler to actually be able to just change a config file rather than rewrite a whole shell script that does all the submission for us. And also, since we have lots of different users, they have different needs, they have different ways of working. 
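The translation step described above, a function that takes the input set of directives and updates it, can be sketched as follows, using the earlier example of deriving the total number of tasks from nodes and tasks per node. The directive names (`nodes`, `tasks_per_node`, `total_tasks`) and the `apply_translators` helper are assumptions made for illustration, not Troika's actual plugin interface.

```python
from typing import Callable, Dict, Iterable

Directives = Dict[str, object]
Translator = Callable[[Directives], None]


def add_total_tasks(directives: Directives) -> None:
    """Derive total_tasks from nodes * tasks_per_node when it is missing."""
    nodes = directives.get("nodes")
    per_node = directives.get("tasks_per_node")
    if nodes is None or per_node is None:
        return  # not enough information, leave the set untouched
    # setdefault keeps a value the user gave explicitly
    directives.setdefault("total_tasks", int(nodes) * int(per_node))


def apply_translators(
    directives: Directives, translators: Iterable[Translator]
) -> Directives:
    """Run each translator in order on a copy of the input directives."""
    result = dict(directives)
    for translate in translators:
        translate(result)
    return result


if __name__ == "__main__":
    user = {"nodes": 4, "tasks_per_node": 32}
    print(apply_translators(user, [add_total_tasks]))
```

Working on a copy means the user's original directives stay available for logging or for a later resubmission with a different set of translators.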
And what we managed to do with Troika is that we managed to bring them all together to use a single tool, which runs the operational workflows where they need to have tight control over what they actually submit and all the options. Research workflows, which need to be very flexible because every researcher might have their own specific needs, but in the end, they run mostly the same kind of code. So we need to have an interface that allows that. And then we run also general purpose servers. If someone has a data processing pipeline, for instance, they can just spawn a server and do their work. And that needs to have an easy to use interface because we don't want to teach people, oh, yeah, you also need to know that to run your job. So now what we do is we provide them with VMs where Troika is pre-installed, and many of them just don't even notice that it's there. And as a summary, so I said at the beginning that we handle about half a million jobs per day, and most of them now pass through Troika, and it hasn't failed yet, so hopefully it works well enough. What it will help us with also going forward is supporting our software development. So it's not necessarily tied to a workflow manager. We want to control our CI CD pipeline also using that because some of the elements of the pipeline have to run on our HPC system. So basically what we could do is from a GitHub runner, we could use Troika to connect to our HPC run jobs there to do testing, deployment, and everything. We, as I said, run our in-house workflows, and we will continue to do that for the foreseeable future. It will help us to adapt to new HPC systems because every time we make a tender, any provider could answer, and we don't control which submission system we will end up with, and even which site-specific variants there will be in the set of options. And then for destination Earth, as I mentioned before, we want to support multiple HPC with minimal changes to the code. 
And so just to tell a bit more about where we want to go from here. We want to support more queuing systems because, I mean, we support two, and one of them quite well because we use it, the other one a bit less maybe. We also want to add functionality to inquire about the submission systems, for instance, which queues are available, the partitions, things like that, so that the user doesn't need to go to the server and check before running. You could just run a command that fetches all the information in a useful way and gives it to you, so we're basically abstracting the specifics. We also want to add some generic resource computation routines. We have some in-house, but they are very tied to the way we function, and so there will be some work to make them more generic and then integrate them in the main source code rather than in a plugin. And for improvements to the code, we want to improve script generation. For now it's a bit clunky, but it works. We want to widen the test coverage, because you never test enough, and provide packages to install it on Debian-based machines, for instance, or RPMs for Red Hat systems, et cetera. And if you want to contribute, feel free to talk to me or go to our GitHub page, and I'll stop for now and take questions. Hello, thanks for the presentation. So basically I've done something quite similar for my employer. Sadly it cannot be open sourced, but the problem that we have is we have legacy clusters with legacy job submission systems. How did you manage to get the traction to migrate to Troika and to convince the users to port their jobs, their developments, to this new system? So what we did first is that we made it as seamless as possible. If you want to interact with your job submission system without using our directives, you can. They will just pass through, but you lose on the genericity. 
And then what helped us is that we changed our HPC system, and that means we did basically start afresh and everyone had to make changes, so we just pushed that onto them. And I must say many of them have been happy because that meant we can do that for them rather than them having to figure out the details of how they submit jobs on that new system and everything. We can just tell them, oh yeah, it's pre-installed, it works. And so yeah, that has been really helpful. I actually have a follow-up question to that. So one thing we have been doing, we switched recently, well, four or five years ago, from Torque to Slurm, and we didn't want to make all our users retrain themselves and learn the Slurm commands, because in our experience, Slurm is a bit less user-friendly than Torque is. So what we did is we rolled a wrapper so that people can still use qsub, but they're actually submitting to Slurm, and it translates the script in the background. Troika doesn't do that now, right? You have to use the Troika command, but it knows about the Slurm header. Yeah, so you could technically do it. We didn't want to encourage that, but technically you could, like, I think you could write in three lines a plugin that just takes the directives. You would probably need to support all the directives you need, but we have a built-in parser that is able to read, like, Slurm directives, for instance. And so you just need to tell Troika, oh yeah, use those on top of whatever is specified in configuration. Is that something you would take pull requests on? Yeah, if you want to. Okay, we had another question. Yeah. Passed them away. Hi, thank you for the presentation, very interesting. So I'm an early programmer myself, and so my question for you is, how does it fail? Like, have you studied or provoked, you know, intentional failures of the system, and have you encountered funny behaviors, like, or plain hilarious faults of the system? 
Yeah, we had, I mean, getting a new system has its lot of failures, so I don't know if Axel, you want to take over for that, because you probably have handled some of the failures. In the example of the command line provided, you can see that we redirect the output for each submission, and this is a chance to analyze the submission and to decide what's the best approach to deal with an erroneous submission, meaning that some of them have to be reported the hard way to make it clearly visible that this is a problem. And some others can be handled in a hidden way, or a not so visible way, still deterministically, so that the problems are handled automatically when they occur. And this is what we expect with so many jobs to submit: to focus on what is critical and essential on the human side, and to have a chance to teach the machines, through the hook system, to cope with the specificities we have identified as problematic but want to ignore or manage automatically until a fix comes from the queuing system, for example, if it is related to a queuing system problem or to identified issues that may be fixed with the next release. So this is a way to deal with the failures that can occur at job submission. Thank you. Did I understand correctly that when you're monitoring a job, the reference is the script? Yes. Correct. So that means everyone has to make sure their scripts are uniquely named each time, otherwise, or is it the sort of script and where it is in the file system? It's where it is in the file system. So you are correct. If someone deletes or renames their script, or submits with the same script, then it can cause a problem. It's not a problem for us because our workflow manager basically does some pre-processing, meaning that the script name has some additional things like, oh, yeah, it's your second try at that submission, so I will add .job2 at the end. 
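The bookkeeping discussed in this exchange, where the script path is the reference and the job ID lives in a file next to it, can be sketched as below. It is a minimal sketch under assumptions: the `.jid` suffix and the helper names are hypothetical, not Troika's actual file layout. The second submission in the demo shows why reusing one script path loses track of the earlier job.

```python
import tempfile
from pathlib import Path
from typing import Optional


def jobid_file(script: Path) -> Path:
    """Hypothetical convention: the job ID is stored beside the script."""
    return script.with_name(script.name + ".jid")


def record_job_id(script: Path, job_id: str) -> None:
    """Remember which job a submitted script corresponds to."""
    jobid_file(script).write_text(job_id + "\n")


def lookup_job_id(script: Path) -> Optional[str]:
    """Find the job ID for monitor or kill, given only the script path."""
    path = jobid_file(script)
    if not path.exists():
        return None
    return path.read_text().strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "task.job1"
        script.write_text("#!/bin/bash\necho hello\n")
        record_job_id(script, "123456")
        print(lookup_job_id(script))
        # Submitting the same path again overwrites the earlier ID, which
        # is why a unique name per try (task.job2, ...) avoids the clash.
        record_job_id(script, "123457")
        print(lookup_job_id(script))
```

Keeping the state in a sidecar file avoids any database dependency, at the price of relying on script paths staying unique and in place.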
And so that's how we circumvent this issue, but you are definitely correct, and that's something we will need to improve at some point. But we didn't want to have to link to a database or something so that we can keep it simple. Thanks. You could just copy the script on submission, no? We could copy it. It's just that, yeah, if you have half a million scripts, we need to think, per day, we need to think about cleanup. Yeah. Other questions? Hello. Users like things to be as simple as possible. In order to do that, they would probably be nice to have some sort of central location where recipes of various clusters would be sort of combined accessible for people to be able to get access to. Is that in your plan or? What do you mean under configuration side? So I could imagine a user turning up going, oh, I'm going to download Troika, and I'm going to talk to this cluster that I have access to. How do I get the configuration? Oh, OK. I see. So we don't have that, but if Troika gets attraction, I think we could come up with a website where you can host your configuration files or have some kind of index where you can list them. I think we would have all that's needed to do that pretty easily. I think, hopefully, the configuration is easy enough so that you don't need to do much on top of what's actually provided as examples. But yeah, you are correct. We could, if it gets popular, just provide configuration files for several systems or, I mean, HPC system providers could also just give a configuration file with the system so we can have it where Troika is installed and then the user doesn't even need to bother about it. Very small second one. Given you've just done all this stuff, have you heard of a project called DRMAA, Distributed Resource Manager Application API, it might make the insides of this slightly nicer for your EC flow stuff, maybe it might take some inspiration for that. Thank you. A question, but also an observation. 
A long time ago, there was a standard called DRMAA, it was an API. Just mentioned. It seems not to be used, maybe I'm wrong, but very quickly, your system, if you had cloud-based resources on AWS, you've got an SSH connector. Could you, in the future, maybe spin up some machines on AWS? Yeah, that could be an option. As long as you can write Python code to spawn an image, a container somewhere. Yeah, sure. I think the API is there for that, it just needs to be a plug-in that does the connection, and that's it. Cool. Okay, we're out of time. Just a comment. I don't think you have any people using Troika outside of ECMWF. No, that's the first time we've actually presented it outside. All right. Good. So you're trying to start, or trying to get people to start using it? Yes. You're building a community, you're getting yourself into trouble. We're going to get pull requests and bug reports, but okay. Thank you very much. Thank you. Very nice. Thank you. I'll just switch. Okay. |
Self-service Kubernetes Platforms with RDMA on OpenStack
K8s, OpenStack and RDMA are just like oil, vinegar and bread? |
Next speaker is John Garbutt from StackHPC who's going to talk about self-service Kubernetes with RDMA on OpenStack. Thank you. Hello, everyone. Yeah, I pressed the button. Excellent. I'm green. Hello, everyone. I'm John Garbutt. I'm here to talk to you about OpenStack, RDMA, Kubernetes, and are they oil and water mixing or are they bread, oil, and vinegar? Hopefully I'll convince you it's something nice. So I'll start with some thank yous to my sponsors. I work at StackHPC. We're about 20-something people now. We've got people across the UK and across Europe. I'm based out of Cambridge, but the head office and a lot of people are around Bristol, with people in Poland and people in France as well. And we work on helping people create OpenStack clouds, train them up on how to look after them, and support them through that journey and everything that's happening there. For this particular topic today, I want to say a big thank you to all of these organizations. These are all in the UK. Lastly, JASMIN. So I'm going to talk today about how we package up these solutions and stamp them out for people as reusable pieces, and this is a project that's come out of the JASMIN institution. And that got taken on by IRIS, which is an STFC community cloud project. They're trying to find ways in which more STFC-funded activities in the UK can share the same sets of infrastructure. How do we get one pool of infrastructure and share that between all of these different research use cases? And in particular, there are lots of organizations we've been working on getting feedback from. We've been working a lot with the SKA community in the UK, particularly the SLC community at the moment. And they've been giving us great feedback on some early versions of all of this and how to improve things. And that's actually been funded partly also by the DiRAC project, which is the HPC center, a group of HPC systems. Also note the small i, not the capital I, DiRAC, just to confuse everything. 
If you look for the small I, Dirac, that's the group of the HPC centers as opposed to the job submission system. And we've been working very closely with the research computing services at the University of Cambridge and tying this together. One of the iris sites and one of the Dirac sites, and we're starting to reuse the things coming out of Jasmine. Anyway, big thank you to all those folks. So I want to start with, why on earth would you use OpenStack and Kubernetes and not just have one big batch schedule? And really it's about getting the most value out of the infrastructure investment you've made. And today also it's worth saying that the, getting, what I really mean by that partly is the, that investment in your infrastructure is also investment in carbon cost. How do you get the best out of that investment in carbon to manufacture these machines and run these machines? And what do I mean by value? Well, that's different things to different people. I mean, had we reduced time to science, how do we get more science out of that particular investment that a community has made? So firstly it's a bit about sharing diverse infrastructure. Hopefully people aren't hungry, apologies. I've spent far too much time on unsplash, so thank you to unsplash. So there's increasing diversity, as in different flavours on the pizza here, in lots of the user requirements. So in terms of the iris community, they're currently working actually a lot more with large international collaborations, and often those users come with a system that they want to run on your infrastructure, regardless of everything else that's happening. And so one of the problems that's been happening is you sort of silo your infrastructure into well, this was bought for purpose A, this was bought for purpose B, but actually those infrastructures are getting more diverse. There's only so many GPUs anyone person can afford in a particular institution, and everyone wants to use them. How do we share that out? 
How do we share out the accelerators and all the special bits of kit between these different use cases that day to day might be different people wanting to use those bits of infrastructure? That's kind of how do we slice it up? And also one physical server, particularly when you're doing test and development, is getting bigger and bigger in terms of consuming it, so giving people one whole server can be a problem. The other thing, and I'm speaking as a developer here before I bash developers, we love breaking things. So if you give people access to the kernel and they're going crazy and they crash the kernel, if it's just your little kernel, then it's only you you've just crashed. That's a bit of an extreme example to be fair. I don't really mean crashing the kernel, I more mean crashing the thing that you put in the kernel more likely to be particular. Anyway, how do we separate this up? And actually probably a better analogy rather than pizza is sort of a reconfigurable conference room. So if you plan ahead, you can make this kind of change. So sometimes you want to use all of the room for a really big meeting, like this one. Sometimes you want to divide it up, and when you divide it up, you kind of want a certain amount of isolation, and not accidentally, you can get the noisy neighbor problem in these setups. So you have to be careful about actually how you're doing that dividing. And so one of the things that's also changed most recently is how do we get these reusable bits of infrastructure. So I said we've got a well reusable platforms on top of the infrastructure. So one of the things I said about the IRIS project is it's working a lot with international communities coming with a thing to run. Very often these days that thing to run is packages and Kubernetes. Sometimes people are developing on Kubernetes on their laptops, and they need a bigger Kubernetes, but this is certainly becoming a thing now. 
People just say, you know, whereas this is how I'm wanting to deploy, how do I carve out the Kubernetes infrastructure and have Kubernetes on top of it to do what I need to do? And actually it's been very helpful in terms of giving us a higher level of abstraction that we're working with to kind of, you know, to package up web applications and interactive applications and a whole manner of things. Okay, so the next piece in the topic was why RDMA networking or why random access, remote direct memory access. I can remember that, so I put it in there. I thought to try and prove my point, I'd show a pretty graph. This is open foam. At the bottom here, there's a link to the tool that we use to actually run these benchmarks, and to make it nice and repeatable. Essentially you can describe in a Kubernetes CRD the kind of benchmark you want to run, and then it basically submits a job to Volcano, monitors the output and just tells you what the output of that was. It's just a way of just making it nice and quickly reproducible. So if you look at this graph, it's showing basically wall clock time for the simulation. And on these lines, we've got lots of different networking technologies that were being tested out, and not unsurprisingly the ones that were performing the best have all got the lowest wall clock time, so the best result in this particular benchmark. As you can see, this was probably an interesting configuration in the sense that as you were scaling out the compute, there was actually no benefit at all in terms of the simulation time. Actually, interestingly, because of this slightly wackadoodle configuration, or the job was too small essentially, you can actually see in the TCP ones above, they gradually actually get worse as they've got more cross communication, as you would expect with MPI underneath here. 
So if we dive down into MPI, on the left-hand side we've got the latencies. For people at the back of the room, the bottom lines here are two at about five microseconds and one at about half of that. These are the interesting ones, these are the RDMA ones. Actually, I'm saying RDMA here, but these are all RoCE, using Ethernet, as you probably guessed because I just said what the latencies were, if you're interested in that kind of thing. So there's no such thing as a free coffee, unless you're at FOSDEM, I guess, but let's just compare very briefly those three technologies. If we have a look at the bandwidth, there's something interesting happening here. It would be slightly more interesting if we'd actually had the hardware for long enough to run the rest of the points, but you can see that the one with the lowest latency caps out at about 100 gigabits a second, and the ones with a slightly higher latency, or double if you're being mean, go all the way up to 200 gigabits a second. There's a difference in the way in which that's been wired up, which I'll go into in a bit more detail later, but essentially one of them can use the whole bond, and one of them can only use one side of the bond. So these were on servers with bonded 100 gig Ethernet. If you pay a latency penalty, you can use both sides of the bond in an interesting way. If you want the ultimate lowest latency, you kind of have to dedicate and just use one side of the bond. Anyway, so RDMA makes a big difference to these kinds of workloads. I'm referencing a talk here that was at KubeCon, Five Ways with a CNI. If you look at the FOSDEM session information for this talk, one of the links on there is to a blog that we wrote about this kind of thing, and there's a video from KubeCon you can watch for more detail on this particular set of benchmarks and all these different ways of wiring the networks. So that all sounded a bit complicated, right?
How do we actually stamp this out in a useful way for users and get this all tied together? So how do we manage that operational complexity? The first side of this is deploying at the OpenStack layer and configuring all of that. We've got tools from the OpenStack community, from the Kolla community in particular, Kayobe and Kolla-Ansible, and we use those with Ansible playbooks so that, once you've got a working configuration, you make sure you do that every time. It involves ensuring you can re-image the machines easily, apply the Ansible on there and get the same thing each time. So we package that up, and that is all open for people to reuse. And then the next stage is the users need to actually consume this infrastructure. If we give people OpenStack directly, they can get very confused, because the people that are trying to just create a platform are typically not experts in using cloud infrastructure. So how do we make that easier? So I want to talk about Azimuth. This is the project that I mentioned at the beginning coming from the JASMIN team, and the idea here is for the people creating platforms, so for the platform creators, people who want to create a JupyterHub or a DaskHub or a Slurm cluster that's isolated and dedicated for their own needs. This might be for a development use case or otherwise, or to create a Kubernetes cluster. How do we just package up those good practices and make that really easy to deploy? We're calling this platform as a service. If you've seen me talk about this before, one of the changes here is that you get all of the platforms in one view now, and you can log in using your OpenStack credentials. So there's the cloud operator, and then there's the platform operator, who logs into Azimuth and creates the platform. Then on top of the platform, you can choose which users can log into that, just to make all of that much easier to do.
So I'll quickly go through the types of things that are going on here and the different types of platforms. So firstly, there's Ansible-based platforms. So things like, give me a bigger laptop, which is a particular case, and so give me a Linux workstation that I can just guacamole into, or then give me a Slurm cluster. What we do for that is they're not Kubernetes-based. We use Terraform to stamp out virtual machines, and then there's Ansible, basically, Ansible's running Terraform to stamp out machines and do any final configuration that might be required. So when you click the button, all of that happens in the background, and it sets up the infrastructure and you can get straight in. The other type is, give me a Kubernetes cluster, I go into this in a bit more detail in a sec, but you choose your Kubernetes cluster, set that up, and it stamps that out. And the third type, which is relatively new now, is, well, I just want a Jupyter Hub or a Dask Hub, and so for those kind of situations, we're deploying those on the Kubernetes cluster, so you can go through that. So let's go into a bit more detail. This is more just a bit of an eye chart, particularly because it's not rendering at all. The idea is you just ask some basic questions about creating a Kubernetes cluster, what size nodes you want, what name it is. If you're creating your Kubernetes application, and you've pressed go into the Kubernetes application, you give it a name and the basic constraints for the notebooks, and it's sort of pre-configured and you tell it which Kubernetes cluster to put it on, or create one if you haven't got one yet. And then finally, when you've stamped out all of these bits of infrastructure, you can see there's a nice single sign-on to go and dig in. So if you've got Dask Hub, you can click on the link to sort of open your notebook, and it gets you straight in. 
One of the issues we've got at the moment is that the cost of IPv4 addresses, or the shortage of IPv4 addresses, is a big deal. So we're actually using a tunneling proxy here, called Zenith. Essentially, when we create the infrastructure, there's an SSH session poking out, doing a port forward, out into the proxy. The proxy secures that, it does all the authentication and authorization, and then punches that through. So essentially it means that you've got a VM inside your private network, and then it goes out through the NAT, not consuming floating IPs for each of these bits of infrastructure that you're stamping out. I'm not going to go into too much detail on all these things. If you create a Kubernetes cluster, it's easy to get the kubectl config out. It's got monitoring included, and Slurm, similarly, comes with monitoring and Open OnDemand dashboards. So in this case, you can get in and out through Open OnDemand, although this one does require a public IP so that you can do SSH. I said about the bigger desktop, so if you just want a VM you can get into without worrying about SSH, without having to configure all that, you can go in through Guacamole and get a web terminal or a desktop. Again, you can stamp out all of these without consuming a floating IP. Another mode, which is a bit like BinderHub but just inside a single VM, is repo2docker, where you just specify your repo. Same kind of idea. It spins up the Jupyter Notebook, punches it out with Zenith, so it's all nice and simple to just get that up and running. Okay, so let's do a shortish technical dive into how you actually get RDMA in LOKI. What the heck is LOKI, you may ask. If you've been in some of the OpenInfra talks, Thierry described this quite well. This is the idea of Linux, OpenStack and Kubernetes giving you dynamic infrastructure. How do we get RDMA into this stack? There's three main steps.
First of all, you do need RDMA in the OpenStack servers that you're creating. Second step is, if you want Kubernetes, you need the Kubernetes clusters on those OpenStack servers. The third step is you need RDMA inside the Kubernetes pods, executing within the Kubernetes clusters. So let's just drill down into each of those. So how do we do RDMA inside the OpenStack servers? Well, there's two main routes here. The first route is if it's a bare metal server: you've got the NIC there, and RDMA is generally available in the way it's normally available. There's not a lot special to do there. I should stop there for a moment. What I've said is you're using the standard OpenStack APIs and all the Terraform tooling and you're stamping out bare metal machines. That's totally possible. When you select the flavor dropdown, it might be, give me a box with eight A100s on it, I want the whole thing. That's perfectly possible. So I referenced Cambridge as helping us out with this. Cambridge's HPC clusters are actually deployed on OpenStack using the bare metal orchestration. So it doesn't get in the way of anything in terms of RDMA or InfiniBand or whatever. You get the bare metal machine. On the VM side, it's a little bit more complicated. Essentially, the easiest way to get RDMA working in there is that we pass in an actual NIC using PCI passthrough or SR-IOV. So the VM itself has to have drivers appropriate for the NIC that you've passed through. Now there's a whole bunch of different strategies for doing that, but I wanted to quickly go through this one, which is specific to some Mellanox cards, and there are other ways of doing this. Essentially you do OVS offload onto your virtual function. So if you do SR-IOV into the VM, that virtual function can actually get attached into OVS. Now that sounds insane, because that's a really slow path and you've just put a nice fast thing into a slow path. What happens is OVS gets told to look for hardware offloaded flows.
So when you actually start getting connections going into your different machines, it notices the MAC and IP address pairs, and those flows in OVS get put into the hardware, and then the traffic goes onto a fast path. The other part of this is that you connect OVS directly to your bond on the host, and the VFs are actually getting connected to the bond. So in that earlier graph where I was showing 200 gigabits a second and basically getting line rate, that's using this setup where essentially your VM with its virtual function is going through the bond rather than through one of the individual interfaces. And this is actually quite a nice setup in terms of wiring. So if you've got a server that's got dual 100 gig Ethernet going in, or dual 25 gig Ethernet, you don't have to dedicate one of those ports to SR-IOV. You have the host bond on there and you can connect the virtual functions into the host bond. Okay, so the next bit, create Kubernetes. I'm not going to go into that in too much detail. Essentially we're using Cluster API. I really like its logo. Basically, you create a management cluster, in CRDs you describe what you want your HA cluster or your other cluster to be, and it stamps that out for you using an operator. This has proved to be a really quite stable and reliable way of creating Kubernetes. One part of this is that, well, while I'm in the room I'm trying to fix the unit tests on it, but we're developing a Magnum driver for OpenStack Magnum to actually consume Cluster API and just stamp clusters out. To make this repeatable, it's all been packaged up in Helm charts, which are here. So now we've got OpenStack machines that have got RDMA in. We've set up a Kubernetes cluster that's using those OpenStack machines that have the virtual function in that's doing RDMA at line rate. Now how on earth do we get the Kubernetes pods to actually make use of RDMA?
Now if this was a bare metal machine, there's actually quite a lot of standard patterns. It seems to be quite well documented in terms of actually passing virtual functions into the pod. If we're inside a VM, we've already done the PF to VF translation, so you can't go again. You can't have a VF of a VF yet, although vDPA and other things might change this. So what we're actually doing is we're using Multus and something called the macvlan CNI. So essentially when you create your Kubernetes pod, you give it two interfaces: your regular CNI interface, so that has all the usual smarts, and an additional MAC and IP address pair on your virtual function for the VM. Now at the moment, you have to turn off port security to ensure that those extra MACs that are auto-generated inside Kubernetes are punching out correctly and not restricted by the virtual function. There's a plan to try and orchestrate that, so you can use allowed address pairs to explicitly decide which ones. But that's basically it: you use Multus to say, give me two network connections, and you use macvlan to get that connection to your RDMA. And there's also some permission stuff, which is actually quite a simple decorator on the pod. But essentially, it's extra pod YAML to opt in to getting this all wired together. Okay, so it'd be really great, if people have these problems and this is interesting, to get involved. There's a whole load of links, but yeah, thank you very much. And I think we've got time for half a question. Yeah, I thought you mentioned that you are doing the bond on the network interface, and you're getting the full bandwidth of the bond. Whenever I do LACP bonding, for any particular connection I only get half the interface. So I'm just wondering how you're doing that. It depends on your bonding mode. But yeah, so with LACP bonding, I only seem to get half. Well, no, so there's a hashing mode on your bond.
So what you need to make sure is that you do something like L3 plus L4 hashing, so that from a single client, each of your traffic flows gets hashed onto a different bit of the bond. So you need drivers that are respecting that hashing function. But yeah, if you get enough different flows, then it will actually hash across the bond, okay. It's all about the hashing modes. Not all switches support all hashing modes, which is the gotcha in that. Yeah, the other question I have is, I don't understand the connection between macvlan and RDMA. Sorry, what's that? The connection between macvlan and RDMA. The connection between the macvlan and RDMA. Yeah, why do you need macvlan to do the RDMA into your VMs? So you could just do host networking. If you did host networking on the pod, you would just have access to all of those host interfaces. But if you want to have multiple different RDMA flows with different MAC and IP address pairs, then macvlan allows you to have those multiple pods, each with their own identity on your VLAN that's doing RDMA. Anyway, emails for the next questions, I think. So I should let the next person set up. Any other questions for John? Can it? Oh. Yeah, last one. Actually, I had two, but okay, I'll rework it. Okay. So I saw that you were also creating Slurm clusters. Yes. So how do Kubernetes and Slurm play together for the network topology and placement of your networks? Well, I have lots of ideas for that after your talk. At the moment, not really. So they just, the pods get placed wherever and then... Yeah, at the moment, they're totally isolated environments. So you stamp out a Slurm cluster and it's your own to do what you need. And then super briefly, the pink line that was legacy RDMA, SR-IOV. Yes. Is that bare metal or is that also virtualized? That was... Is it running on... Because that was... We can catch up later. Okay.
So for that specific scenario, I definitely recommend watching Stig Telfer's talk, Five Ways with a CNI. I think that particular setup was actually bare metal with a virtual function. So it was actually Kubernetes on bare metal with the virtual function passed into the container. Right. I believe we got similar results without doing that legacy path into the VM as well. The extra cost, I believe, is on the VF LAG piece, because there's an extra bit of routing inside the silicon, I believe, but I'm not certain on that, so I'd have to check. Thank you. Pleasure. Thank you very much, John. Thank you. |
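The answer about hashing modes in the Q&A above can be made concrete with a small sketch. It is a simplified model in the spirit of an L3 plus L4 transmit hash, not the exact function any particular driver or switch implements; the function name `pick_bond_member` and the example addresses and ports are made up for illustration. It shows why a single flow always lands on one side of the bond, while several flows from the same client can spread across both sides.

```python
import ipaddress

def pick_bond_member(src_ip, dst_ip, src_port, dst_port, members=2):
    """Pick a bond member for a flow from its addresses and ports.

    Simplified illustration only: real bonding drivers and switches
    use their own (often different) hash functions.
    """
    ip_bits = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    port_bits = src_port ^ dst_port
    return (port_bits ^ (ip_bits & 0xFFFF)) % members

# One flow is pinned to one member, so it can never exceed one link's bandwidth.
flow = ("10.0.0.1", "10.0.0.2", 40000, 5000)
print(pick_bond_member(*flow), pick_bond_member(*flow))

# Several flows between the same two hosts differ only in source port,
# and an L3+L4 hash lets them land on different members.
for port in range(40000, 40004):
    print(port, pick_bond_member("10.0.0.1", "10.0.0.2", port, 5000))
```

With a hash that only looks at MAC or IP addresses, all four flows in the loop would map to the same member, which matches the "I only get half" experience described in the question.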
How to deal with validation as an HPC software?
An approach to power software testing at scale |
All right, we'll get started with the next talk. Julien Adam from ParaTools is going to talk about testing and validation. Hello, everyone. Thanks again for the introduction. So, yes, I'm Julien Adam. I'm going to talk to you about testing. As you may notice, I'm quite happy to have a mic today, so please let me know if I become inaudible. So, first of all, a bit of background. Some years ago, we had, and we still have, a team developing an MPI runtime, and while developing this runtime, we had a major stake in developing a validation system to assess our software quality, but also to be able to compare our implementation to others. So, like everybody in the HPC field at this time, we started to build our own ultra-specific shell scripts to validate our implementation, because we considered that our implementation was too specific to be able to use mainstream tools. So, we started with shell scripts, great idea. The fact is that with the team growing, people working in separate places, working on multiple heterogeneous machines, we had a huge issue getting people to continue to use this validation system, because it was slow, not really efficient, and hard to grow. Especially about maintenance: when we wanted to add anything to the validation process, it was just a nightmare, it was really costly in terms of time. And as our software was evolving and growing, adding just a little test into this non-regression base was also a nightmare. So, we started to consider, why not create a validation system able to take care of an HPC environment and what it implies, like validating on multiple architectures and multiple machines, by multiple users, with multiple benchmarks, and we thought about having one generic tool able to handle all of that at once.
So, what do we want in that scenario? We are here in the case of validating an MPI implementation, which means having a standard API, okay, so there are a lot of well-known MPI benchmarks already existing. We don't have to rewrite a whole non-regression base: we have benchmarks, we have proxy applications, and so on. So the question is how to scale these benchmarks to any runtime or any project using them to build a proper validation process. So, what do people want, I mean, what do users of this validation system want? A simple tool. You don't want to have a really complex Qt or GTK interface, or any kind of complex architecture to deploy, just to test. So, what we want is just a command line interface, basically, with really, really little setup and really, really little configuration to deploy to have such tests working. Maybe it's not a generality here, but a lot of people hate tests, basically, so we have to convince them that testing is good for software quality. So we want something really simple, but also able to handle any really complex scenario we may find in HPC. Running an MPI application is not that hard, but it may become really, really complex, and we don't want to have to rewrite our in-house tool every two years because a new technology, a new approach, a new paradigm is implemented and the validation process has to be rewritten. So, from that state, we decided to write a tool to fit our needs, but also able to be used by people meeting the same requirements. Before going further, I would like to tell you that testing is not a brand new field, and some other projects are tackling this kind of issue, so please take a look at them if you think that PCVS does not completely fit your needs. They are really, really powerful tools in the field. So today, I'm here to talk to you about PCVS.
It's a tool maintained by the CEA. It's written in Python, yes, everything is in Python now, so we are based on Python. It's a simple command line interface with a few configuration files, obviously in YAML, it's a trend. The design of this testing framework is to bring simplicity to users when writing test logic, so we want tests to be simple to write and easy to port, okay. And we want these benchmarks, written by the user, to be able to run in multiple environments, so we don't want to bind a test suite to a given application. Let's consider an MPI runtime: we don't want to have our LULESH or our IMB benchmark bound to a specific MPI implementation in order to be run on any kind of architecture. So this is what we call in PCVS the retargeting approach. The other approach we want to focus on is the fact that we have heterogeneous test environments. We have benchmarks; how do we automatically scale these benchmarks to the actual test environment? Consider, for example, users wanting to validate with the IMB while working on their workstation. We don't want to launch up to, I don't know, 100 MPI processes on their workstation, they are not going to be happy with you crashing their machine. But once this validation system is run on a real HPC cluster, you want the test suite to automatically scale to the supercomputer's resources without having to rewrite hundreds of configuration files, or even one file, which is already too much, okay. So this work is maintained by the CEA, which is a French research administration, and we are collaborating with them to make this tool more generic and attempt to reach as many users as possible.
So at a glance, some of the features PCVS provides. The idea is to split the test effort into specific fields. The first one is the specification of tests: what a benchmark needs to expose to build the tests. This kind of information is carried by the benchmark, obviously: how to build, what is the program, what are the parameters, and so on. And then the environment, the testing environment, is carried by the people deploying or providing the testing resources, so most of the time it's just a team policy to schedule jobs, or our system admins, right. All of this is to pursue two goals: retargeting tests automatically when the user is calling this benchmark, depending on the compiler and runtime you want to target, and auto scaling tests to the test environment. Obviously, as PCVS is a kind of test framework, we had to add some functionality around testing, like in-place reporting, because most users are running their tests on HPC clusters where the set of functionality can be restricted, so they don't have access to all their GitLab, GitHub, Jira and so on. So we added some basic tools to answer these needs, and beyond a single validation run, we added a way to build the history of the validation for a given application by storing results over time and allowing the user to run simple analysis, still in Python, to produce statistics over time. So quickly, the architecture of PCVS: it's based on a CLI and based on the file system. It's parsing some user inputs, and it runs through a dedicated connector, mainly to Slurm currently, but we are working on supporting other batch managers. So the idea is that the benchmarks express job descriptions and resource requirements, and the environment provides resources.
Let's consider, for example, a basic component: the number of MPI tasks the job is taking. The job will say, okay, my job is only running up to two processes, or it's only taking, I don't know, a cubic number of MPI processes. Users of the LULESH proxy application will know what I mean by cubic. So the job will describe constraints, and then we will have environment configuration files, called profiles in PCVS, where the admin will say, okay, in that context, I will have up to 100 nodes, so you will be able to spawn up to 100 MPI processes to run your application. Based on that, PCVS will cross this information and will say, okay, I have MPI jobs, I can run up to 100 MPI processes based on this specification, so why not run the user benchmark 100 times, once for each combination. PCVS takes an opt-out approach, so it will consider that every combination provided by the environment will be used to scale your tests, okay. So if your test is not able to run at that scale, it's up to you to specify that you can't reach this limit. So here is a quick example, I don't know if you can see up there, but we have a really, really basic environment configuration where you can see that there is what we call an iterator. This is a variadic component, and it can take up to four values for the n_mpi attribute. When describing a simple job, we're just saying, okay, my job just consists of running a program, and PCVS will unroll up to four command lines to run four independent tests. So PCVS will automatically build your test scenarios based on your specification.
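The crossing of job constraints with the environment profile described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PCVS code or its configuration format: the names `expand_scenarios`, `is_cube` and `command_line`, and the use of `mpirun -np` as the launcher, are assumptions made for the example.

```python
from itertools import product

def expand_scenarios(profile_iterators, job_constraints=None):
    """Cross every iterator value the environment offers (opt-out model).

    A job only removes values by supplying a predicate for an iterator;
    anything it does not restrict is kept.
    """
    job_constraints = job_constraints or {}
    names = sorted(profile_iterators)
    allowed = []
    for name in names:
        keep = job_constraints.get(name, lambda value: True)
        allowed.append([v for v in profile_iterators[name] if keep(v)])
    return [dict(zip(names, combo)) for combo in product(*allowed)]

def is_cube(n):
    """True when n is a perfect cube, as a LULESH-style job would require."""
    root = round(n ** (1 / 3))
    return root ** 3 == n

def command_line(program, scenario):
    """Render one scenario as a launch command (launcher is an assumption)."""
    return f"mpirun -np {scenario['n_mpi']} {program}"

# Environment side: the profile offers four values for the MPI task count.
profile = {"n_mpi": [1, 2, 3, 4]}

# Job side: an unconstrained program is unrolled into four command lines.
for scenario in expand_scenarios(profile):
    print(command_line("./a.out", scenario))

# A job that only accepts cubic process counts opts out of the other values.
big_profile = {"n_mpi": list(range(1, 28))}
print(expand_scenarios(big_profile, {"n_mpi": is_cube}))
```

Adding a second iterator to the profile (for example a thread count) multiplies the number of scenarios, which is why the opt-out constraints on the job side matter.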
So how do you write a test, and more specifically a compilation test? There are a lot of things to customize in how you build your tests, so it looks complicated, but as you may have seen on the previous slide, all the keys in that YAML are optional, except the files one: obviously you have to specify which files you want to compile. The framework will try to auto detect your language to select the proper compiler, and obviously you can also set it manually. If you are not compiling source files directly but already using a well-known build system, we also have an interface to invoke the build system directly to build your benchmark. I'm thinking about, for example, LULESH, which is using a Makefile, the IMB, which is using a Makefile, or even the MPICH test suite, which is using a configure, make, make install style approach. So you have many options, all of them optional. What I would like to highlight is the variant. A variant is a capability from PCVS for a job, a benchmark, to expose a specificity or requirement it needs in order to run. So in that case, the OpenMP keyword probably means something to everyone here, but it's just, you know, a token saying that to run this job, the variant, the component OpenMP, is required. And in the environment, the user will say, okay, what is my variant OpenMP: in the case of GCC or a GCC-like compiler, it will be -fopenmp; if it's Intel, it's not the same option. You see the idea? PCVS will add flags depending on what you have specified in your environment. Now, how to write a run test. This is where the iterative component, the iterator, takes place. We didn't port it yet to the compilation model because we have issues with race conditions when reaching the same file on the file system, but we are planning to containerize, to encapsulate, to isolate such a model to be able to support compilation tests as well.
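The variant idea from the compilation part above, where the benchmark only names a token such as OpenMP and the environment decides which flag that means, can be sketched as follows. This is an illustrative model rather than PCVS's own data layout: the table `VARIANT_FLAGS`, the function `compile_command` and the family names are hypothetical. The flags themselves are the usual ones, `-fopenmp` for GCC-like compilers and `-qopenmp` for the Intel compilers.

```python
# Environment side: each compiler family maps a variant token to its flags.
# (Hypothetical structure, for illustration only.)
VARIANT_FLAGS = {
    "gcc": {"openmp": ["-fopenmp"]},
    "intel": {"openmp": ["-qopenmp"]},
}

def compile_command(family, compiler, sources, variants=(), output="a.out"):
    """Build a compile command, resolving variant tokens for the given family."""
    flags = []
    for variant in variants:
        try:
            flags.extend(VARIANT_FLAGS[family][variant])
        except KeyError:
            raise ValueError(
                f"environment {family!r} does not provide variant {variant!r}"
            ) from None
    return [compiler, *flags, *sources, "-o", output]

# Benchmark side: the test only says "I need the openmp variant".
print(" ".join(compile_command("gcc", "gcc", ["main.c"], ["openmp"])))
print(" ".join(compile_command("intel", "icx", ["main.c"], ["openmp"])))
```

The benchmark description stays identical across both calls; only the environment entry changes, which is the retargeting property the talk is after.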
So what as a user have to do to integrate such test in your workflow, just write a PCVS.yml file, you put it anywhere you want in your pass, I mean your benchmark pass, and it looks like just a run node and everything below is totally optional. Here we can see that we restrict, we restrict our add.out program to specific MPI values because as a tester, I know that my test has this constraint. You can even create this variadic component for your own application if you want to programmatically generate a list of scenario that PCVS should integrate to its own process to build multiple scenarios. So in that case, with this, I'm sorry, why am I moving in the program node, you'll see that with this three simple lines, it will be able to build three times the number of scenarios that you were expected to have initially, right? And what a test without having a way to express or validate a test, obviously. So for any test, not only for run, you have a YAML description to say, okay, so I want my job to exit with this particular return code, having an execution time within this range matching or not matching this kind of pattern. Even here is my script, give him the regular output of my test and I will tell you if it's okay or not. Okay, so I just write my test, how to run them now. It's just celibate, so you just have to call PCVS run, but before running my benchmark, I just have to create a profile to express the resources my environment has. Obviously, in case of MPI, we provided some templates, some basic templates to initiate the testing process. So here, we are just creating a profile named my MPI based just on MPI. So by quickly running that, I will have a full profile based on MPI running tests for one, two, three, and four MPI processes, but from that, you can then expand the profile to fit your needs. The whole build of PCVS relies on a single directory. 
And in that directory, you will find anything required to analyze the results and even rerun the tests in the same conditions, for reproducibility. You can see on the right the repository we provide alongside PCVS, which is called PCVS benchmarks, and we attempt to put in that repository many well-known MPI benchmarks, PCVS-enabled, right? So here is a fancy view of PCVS. Obviously, there are many options when running a validation. You can have an interactive approach, a non-interactive approach, Slurm-enabled, I mean, batch-manager-enabled, running inside an allocation, outside an allocation. And once the whole configuration has been generated, we have commands, especially pcvs exec, to interact independently with each of your benchmarks. For instance, what people are used to doing is to run their validation, and after maybe 10 seconds some failures appear, and they would like, without interrupting the non-regression system, to rerun their tests in an isolated environment to see why they failed. So we have some extra commands to rerun this special pattern, okay? Obviously, I'm going to pick up the pace, obviously, everything is not always perfect, and sometimes the static approach of the YAML file is not what you need; you would like something more dynamic because you have some stuff to interpret, to read, to process before knowing what you want to test, even within a benchmark. So we have a dynamic approach. Instead of providing a static YAML file, you provide an executable script, an executable file, whatever it is, and it will produce by itself the actual YAML file. This way, you can programmatically generate your benchmark suite without having to know it in advance, without knowing in which environment your non-regression base will be run. Let's consider the NAS framework, where within the name of the binary you have the number of MPI processes, cool. Now, we have run our tests, but we would like to see what the results look like, okay?
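The dynamic approach described above, an executable that prints the test description instead of a hand-written file, might look roughly like the following. This is a sketch under assumptions: the binary naming convention (`<kernel>.<class>.<nprocs>`), the `generate_tests` helper, and the output keys (`run`, `program`, `n_mpi`) are all hypothetical and do not claim to match the real PCVS schema. The output is emitted as JSON, which is also valid YAML, to keep the example free of third-party dependencies.

```python
import json
import re

# Assumed naming convention for this example only, e.g. "bt.A.4".
BINARY_NAME = re.compile(r"^(?P<kernel>\w+)\.(?P<cls>\w)\.(?P<nprocs>\d+)$")


def generate_tests(binaries):
    """Build one test entry per binary whose name encodes an MPI process count."""
    tests = {}
    for name in sorted(binaries):
        parsed = BINARY_NAME.match(name)
        if parsed is None:
            continue  # not a benchmark binary, skip it
        test_id = f"{parsed['kernel']}_{parsed['cls']}_{parsed['nprocs']}"
        tests[test_id] = {
            "run": {"program": name, "n_mpi": [int(parsed["nprocs"])]},
        }
    return tests


if __name__ == "__main__":
    # A real script would list a build directory; a fixed list keeps this runnable.
    print(json.dumps(generate_tests(["bt.A.4", "cg.B.16", "README"]), indent=2))
```

Because the process count is read from the file name at run time, the same script works unchanged whichever set of binaries a given build happens to produce.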
We have a test framework, not just a job runner. So obviously, we have some tools to report results to the user; spoiler, they cannot be compared with real tools for that, okay? The idea is just to offer users a way to grab their results directly on their machine, in place, okay? And essentially, it's just a way to look at tests at a glance, to be able to rerun them if necessary in the process. So as I said, we can isolate and rerun jobs independently, which is pretty convenient when some failures need to be explored right away. And finally, we are using a web server to report directly in a web browser, offering more interactivity for your results. So what does it look like? For example, here, gathered by label, you can see that there is some red, so some failures. Let's dive into it. You can see some trouble with MPI-IO, what a surprise. And when clicking a job, you'll see the complete log of this run, so the command line and the actual, I'm sorry, I truncated the actual error, and you can directly dive into the error without leaving your actual SSH terminal. So a quick overview of how to configure your site, so the test environment configuration part. This is also YAML; you will define directly the compilers, compiler and runtimes, and the special variadic component here. It's split into five different modules. Why? Because this whole profile can be split into up to five independent blocks that can be distributed over a cluster, because it's not always the same team that is responsible for a particular block. Let's consider, for example, the variadic component: it's up to the test team to build this list, while the compiler and runtime may be in the charge of the sysadmins of the test machine.
And after running a single job, what I would like to see is a trend over multiple runs, how my test suite behaves. In PCVS we integrated a way to stack multiple runs over time and then run analysis on them, to build trends, to have more than just a test result: a test result over time. So here is an example of what you can do afterwards, running analysis directly on this stored history. This is enabled thanks to, I would like to call it a DSL, but it's actually just a Python API to interact with that, and you can build such beautiful graphics to see over time the rate of success inside your test benchmark. So finally, just a quick glance at Spack plus PCVS. We are not in Spack, but we are supporting Spack, especially to do such things: by specifying a simple package, a simple Spack package, we will be able to check any combination of building this package to see if there is some curiosity in your package recipe. For future work, we have many things scheduled, and one of the most interesting is capturing metrics, the capacity of PCVS to capture some metadata directly to be able to then run analysis on it, and many other things, but I think I'm running out of time. Thanks for your attention, I have two questions. From your configuration file, I assume you already have control of the cluster, at least you have allocated some nodes or something. Do you have some step that then allocates and deallocates these resources on the fly for each one of the tests? So actually, currently, most of our test scenarios are run through an mpirun with Slurm enabled, or srun commands directly, so they are taking care of the resource allocations. Some other users are just running the whole PCVS inside a given allocation, like a resource allocation, just an salloc, for example, and then any test doing srun does not pay the cost of waiting in the queue.
If some tests need one type of CPU and other tests need another type of CPU, and one of them is unavailable because another user is using it, you have to wait instead of fail. Yes, this is something we still don't have a solution for, which would be to be able to put a job aside while we have the allocation. Yes, this is something we are currently investigating, absolutely. Do you have any questions? Yeah, so one thing that I wanted to ask was about your future work: you had mentioned building out a graphical front end using Textualize. I was wondering how much assessment you have done into that, because I've done some work trying to build GUIs with Textualize, and while I do think it's a very interesting framework and it's great for making textual GUIs, I think it still has a bit of a way to go before it can really make a standalone or comprehensive CLI or textual interface, so I was just wondering what your thoughts were on that. I'm not sure I understand the whole question, but you mean, why did we choose Textualize? Absolutely. We discovered it just recently, because we were using Rich to highlight the output of PCVS within the console, and we are looking for a solution to present things graphically in a terminal. We are still looking for the ideal framework, and as Textualize is, I'm sorry, based on Rich, we are considering Textualize, but if you have any other option to propose, I would be happy to discuss that with you. Thank you. |
LOFAR: FOSS HPC across 2000 kilometers
The unknown world of open source radio astronomy software |
Okay, we're ready for the next talk by Corné, who's going to talk about open source radio astronomy. So hi everyone. I work for ASTRON, which is the Dutch Institute for Radio Astronomy. We're a government institute, so we're publicly funded. Apart from being publicly funded we have a lot of research grants, and that's basically what pays my salary. I'm going to talk about the biggest instrument that we have, which is called LOFAR, and we utilize a lot of open source software there, almost exclusively, with some caveats. My name is Corné Lukken by the way. I'm also into amateur radio; my call sign is Papa Delta 3 Sierra Uniform. This is going to be quite a relaxed talk. I'm going to give a high-level overview of all the different components that are there. There are quite a lot of them actually, so it's not possible to go into detail with the time we have. This is my first FOSDEM talk ever and also my first talk ever in a lecture hall, so that's quite interesting. Now some background. I mentioned that we're a government institute. We firmly believe public money means public code, and we stand by that in almost everything we do already. We have an open source committee that also ensures that we do that, and we have basically two very big telescopes: one is called LOFAR, which stands for Low Frequency Array, and the other is the Westerbork Synthesis Radio Telescope, also called WSRT. There are some sister institutes that we work closely with or that are related to us: one is called CAMRAS, which maintains a telescope that we've stopped using, and the others are JIVE and EVN. I'm not going to talk too much about those today. What I want to tell you is that there is this principle that our radio telescopes work on, called very long baseline interferometry, and this enables us to do radio astronomy in a way that wasn't possible with traditional radio telescopes.
This is the whole map of LOFAR. There are 54 stations in total, and roughly 25 of those are located in the Netherlands. I say it's around 2,000 kilometers in diameter, but that's no longer true, because the one in Rozhen is new and we're now about 2,500 kilometers in diameter. This diameter is also called the baseline, which is where the name very long baseline interferometry comes from. If we then zoom into a station, you see all these tiles and you see these little squares, and those are the different types of antennas. That is what makes this type of radio astronomy so interesting: we don't have a single antenna to catch radio waves, we have lots of them, about 20,000 in total actually, which is quite substantial. This center is called the Superterp and it's located in Exloo, the Netherlands. How can we actually combine this data? I told you that traditional radio astronomy relies on a parabolic dish or a single antenna, and we try to scale those up, make them bigger and bigger, but of course physics sets a limit at some point: you can't make a structure from steel that's like 500 meters in diameter. What we do instead is combine smaller antennas to act as if they are a parabolic antenna. The trick about a parabolic antenna is that all radio waves, no matter where they are coming from, have an equal distance to the receiver, so we need to emulate that with our antennas, and we do that in two ways. One is an analog artificial delay, made by just lengthening the line the signal needs to travel across on the PCB or the coax cable, but we can also do it digitally after the data has been sampled. Then we can aim into the sky and create a very narrow beam that observes a very small portion of the sky, and that allows us to zoom really deep into space and make very detailed images. But what are these radio waves actually? What are those? What are we observing?
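The digital delay trick described above is commonly called delay-and-sum beamforming. The sketch below illustrates the principle only, under simplifying assumptions that are not from the talk: a one-dimensional line of eight antennas spaced 2.5 m apart, a single 60 MHz tone, and an ideal plane wave. The function names are made up for this example and this is not LOFAR's station firmware.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def geometric_delays(positions_m, theta_rad):
    """Arrival-time offset at each antenna for a plane wave from angle theta."""
    return [x * math.sin(theta_rad) / C for x in positions_m]


def beam_power(positions_m, freq_hz, source_theta, steer_theta):
    """Normalised power of the summed signal when steering at steer_theta."""
    arrival = geometric_delays(positions_m, source_theta)
    applied = geometric_delays(positions_m, steer_theta)
    total = sum(
        # Phase picked up on arrival, then undone by the applied delay.
        cmath.exp(-2j * math.pi * freq_hz * t_in)
        * cmath.exp(2j * math.pi * freq_hz * t_fix)
        for t_in, t_fix in zip(arrival, applied)
    )
    return abs(total) ** 2 / len(positions_m) ** 2


if __name__ == "__main__":
    antennas = [2.5 * n for n in range(8)]  # assumed layout, in meters
    source = math.radians(20)
    print(beam_power(antennas, 60e6, source, math.radians(20)))   # steered at the source
    print(beam_power(antennas, 60e6, source, math.radians(-30)))  # steered elsewhere
```

When the applied delays match the arrival delays, all antenna signals add in phase and the normalised power reaches its maximum; steering elsewhere makes them partly cancel, which is what gives the array its narrow beam.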
There are two types of radio waves that are being emitted by objects in space and the galaxy. We're only going to describe one phenomenon today, called synchrotron radiation. Basically, if you have an ion, a charged particle, and you accelerate it, then it starts creating radio wave emissions. The frequency and the intensity at that frequency are very dependent on details that are not very interesting for this talk, but some of the entities that emit these types of charged particles are black holes, and we'll see an example of that at the end. So I mentioned black holes; there are other types of radio astronomy that are very interesting. We can also model our own ionosphere, or even lightning. Pulsars are pretty interesting: these are stars that are rotating at a periodic interval, and they have very strong radio waves coming from the poles of those stars. So what does LOFAR actually look like? Very small antennas, as I told you. We can see on the left that there are wires attached to those poles; those are actually dipole antennas, and if you configure them like this, in a V-shape, they are called inverted Vs. These are the low-band antennas, and on the right side you see the high-band antennas; they have a clover shape, like a bow-tie shape. Then we combine all these antennas, low-band antennas and 69 high-band antennas in a station, and we send the data at around 3 gigabits per second to our HPC clusters. There's a two-phase cluster here: the first is GPU processing, where we do correlation and beamforming, and the second is central processing, which is more CPU-based. In the early days our computing cluster looked something like this: we had IBM Blue Gene machines, which were based on PowerPC, and they had a 3D torus interconnect, which is actually quite an interesting interconnect.
This was problematic because utilizing the floating point vector extensions required manually rewriting assembly, which wasn't that nice, and it was pretty hard to find developers who were willing or able to do that. So we moved to commodity hardware: GPUs, two CPUs per socket, two GPUs per server, lots of networking. That's really what you see here. We had 32 gigabytes of 10 gigabit Ethernet, and then in 2018 when we upgraded we had 24 times or 23 times 100 gigabit InfiniBand, but you also see that there's a lot of 10 gigabit Ethernet per device, and I'm going to go into why that is in a minute. If you look at the data flow, more from the software side, then you see that the antennas have ADCs, so these convert the incoming analog waves to digital signals, and then we do beamforming on the stations and we send data to the correlator, and this correlator also does the correlation afterwards. You can see that once the correlator is done with it we store this to disk, and once it's stored on disk it's made available to the central processing. So the correlator and our GPU cluster COBALT are doing streaming, and the central processing is more like your traditional HPC. When we look at the data flow in COBALT, there's all this incoming 10 gigabit Ethernet, and this is why we have four or three 10 gigabit Ethernet links per COBALT server. The stations are streaming the data, and we configure per station where it needs to send its data to. Then once it's there it's being transposed at roughly 240 gigabits, and once it's transposed we have two pipelines that essentially run in parallel: one is correlation and one is additional beamforming, so we actually beamform twice in a sense. It's a little bit more complicated than I'm sketching here, but I'm keeping things simple, because stations also have the capability to not beamform and send unbeamformed data.
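At its core, the correlation step mentioned above multiplies the signal of one station by the complex conjugate of another station's signal and averages the product over time. The sketch below shows that single operation on synthetic data. It is a textbook illustration, not COBALT code: the helper names, the tone frequency, and the 0.7 radian phase lag are assumptions chosen for the example.

```python
import cmath
import math


def tone(n_samples, cycles_per_sample, phase=0.0):
    """Complex sinusoid standing in for one station's sampled signal."""
    return [
        cmath.exp(1j * (2 * math.pi * cycles_per_sample * k + phase))
        for k in range(n_samples)
    ]


def correlate(x, y):
    """Time-averaged product of x with the conjugate of y (one visibility)."""
    return sum(a * b.conjugate() for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    station_a = tone(1024, 0.05)
    station_b = tone(1024, 0.05, phase=-0.7)  # same wave, lagging by 0.7 rad
    visibility = correlate(station_a, station_b)
    # The amplitude tells how strongly the two signals agree and the phase
    # recovers the lag between the two stations.
    print(abs(visibility), cmath.phase(visibility))
```

A real correlator does this for every pair of stations and every frequency channel, which is why the data has to be transposed first so that all stations' samples for one channel end up on the same GPU.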
We have a special buffer that's called the transient buffer, where we dump raw samples and can send those to clusters, but the general pipeline is what I'm sketching here. If I zoom into these two pipelines, the correlator pipeline and the beamformer pipeline, I don't want you to look at the details too much here, because that's not interesting and I really don't have time to explain it, but the trick is that almost everything you see here is based on signal processing, digital signal processing. That's what we're doing. We're using the fast Fourier transform and finite impulse response filters, and transforming the data into the frequency domain, if you will. Then it's put back into CPU memory at COBALT, some final transformations take place, and then it's put onto disk so that's how it can work on it. At ASTRON we do a lot with software, and I've now shown you how the data flows, but I haven't told you what software components are making that data flow happen. For COBALT, it's actually one solid product that lives in the LOFAR repository. Please don't all visit our GitLab instance at once now, because it will die if you do that. Try to spread that out a little bit over the day. I will upload the slides soon after, so you don't have to remember all this. Then all these tools that are listed here, except for COBALT, which lives in the LOFAR repo, are more like what you would find on the CEP side of things. I want to stress that this is just the tip of the iceberg. On our GitHub repo you can find a lot more stuff that you can play with, and I would encourage you to do so, because it's quite interesting. Then there's also Casacore, which is heavily being developed at ASTRON as well, but it's not actually a product just by us. A competitor, or more like a friend, of Casacore would be Astropy; both are very widely known packages in the industry.
If you look at the radio astronomy toolkit, so the necessary things to go from antenna to image, if you will, then these are your friends. There's DP3, WSClean, IDG, Radler, and AOFlagger, and I'm not going to talk too much about EveryBeam. What do these things do? What are we looking at here? DP3 is where you define a complete pipeline: you have the incoming data, you need to do some transformations on the data, maybe you want to identify some noise sources that might be in your data, and eventually you want to create an image, and for this imaging you need deconvolution as well, and you also need gridding and de-gridding. AOFlagger is where you identify noise sources. This can be anything, as radio instruments are very sensitive; noise sources in particular would be electric fences, windmills, solar farms, bad-quality LED lighting. Then we move to the imaging part with WSClean, because when you have a radio telescope consisting of many small antennas, there are holes in between your antennas, and that means that you're not receiving the data as you would with a very large parabolic dish; there are some differences. This creates some kind of fringing in the image that you need to filter out, and that's what WSClean together with Radler and IDG are doing. And IDG is your translation from the data domain to a domain that is useful for radio astronomical imaging. So I talked a little bit about Casacore and how it is widely used in the industry. It's based on older packages that have been around for a very long time, but we've actually switched it around now, so now Casacore is built on top of these older packages, rather than Casacore depending on these older packages. There are several unique features here. The UV domain is pretty interesting: that's the domain about the plane that is filled, so those holes in your surface area, if you will.
And there are some Python bindings here, so these are all very nice tools that you can just play with. We also use a lot of open source tools, and we're doing quite well; there's still some closed source software, and I'll get into that in a minute. So we use Open MPI, OpenMP, Slurm, GitLab, Grafana, and the part that I work on is PyTango, which is a SCADA system. SCADA stands for supervisory control and data acquisition; that's basically the interface that we have on the individual stations, and those stations then configure the underlying hardware with the antennas and the ADCs, and they report how they are configured to a higher-level system. We also use Prometheus, Docker, and a variety of interesting open source tools; this is just the tip of the iceberg as well, there's much more. Next to that, our CEP cluster is also pretty interesting, which is actually where we use Slurm. We also have a DAS-6 cluster, which is a cluster shared with many universities within the country. Things where we can improve: well, we use CUDA, so that's not really open source compared to OpenCL or Vulkan. We're using WinCC for monitoring; maybe you've heard of that package. It's Windows-based, that's why it's called WinCC. We're trying to phase it out in favor of Grafana and Prometheus, and that's going quite well, I'd say. We have a lot of closed source FPGA vendor blocks, so if you have your Xilinx, or what have you, or your Altera, then they for instance offer IP blocks to implement 100 gigabit Ethernet interfaces, and they're not too keen on you sharing those with the whole wide world. Then InfiniBand firmware is pretty closed source. I believe there are open source versions of that, but I don't know if they work quite well. And then the main area where we're actually struggling is office management tools.
This is definitely the area where we can improve the most at ASTRON. We use Office 365, Slack and Zoom, and as you can see, with Kopano, Mattermost and Jitsi there are definitely open source alternatives to these, so there's no real reason why we should be using them. Of course, you need to host the infrastructure, and that also costs money, so there's some leeway there. I'm not saying that it's definitely cheaper, but there are open source alternatives. Now I want to show you how these tools work in practice: I told you about IDG, which does the gridding and de-gridding, and I told you about WSClean and the Radler part that does the deconvolution. So we have an ideal point source, the most ideal radio source that can possibly exist; it creates a very sharp point in the sky. We put it through the gridding, and we get a point spread function. What this basically is, is the error of our telescope. So we now know, okay, this is the error it's going to generate in our images, because we don't have complete filling of the UV plane; there are holes in between our antennas. Then we can use the WSClean imager together with Radler for deconvolution, an iterative process in which we iteratively remove the noise from the image. So actually I'm going to see, oh yeah, that's nice. Here you see these lines; these lines are the fringes that I told you about, and if you then perform this iterative cleaning process on what are called calibrated visibilities, then we see that the image is drastically improved with each iteration. So now some example of this: what is the science that we do with LOFAR, what does this look like?
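The iterative cleaning described above can be illustrated with the classic Högbom CLEAN loop: find the brightest pixel in the residual image, subtract a scaled copy of the point spread function centered there, record the subtracted flux in a model, and repeat. The sketch below is a one-dimensional toy version under assumed inputs (a five-sample point spread function and two synthetic sources). It is the textbook algorithm only and does not claim to be how WSClean or Radler implement deconvolution.

```python
def convolve_same(sky, psf, center):
    """Smear each source by the PSF, keeping the output the same length as sky."""
    out = [0.0] * len(sky)
    for i, flux in enumerate(sky):
        for j, weight in enumerate(psf):
            k = i + j - center
            if 0 <= k < len(out):
                out[k] += flux * weight
    return out


def hogbom_clean(dirty, psf, gain=0.1, threshold=1e-3, max_iter=1000):
    """Return (model, residual) after iteratively subtracting scaled PSFs."""
    residual = list(dirty)
    model = [0.0] * len(dirty)
    center = max(range(len(psf)), key=lambda i: psf[i])  # PSF peak, assumed 1.0
    for _ in range(max_iter):
        peak = max(range(len(residual)), key=lambda i: abs(residual[i]))
        if abs(residual[peak]) < threshold:
            break
        flux = gain * residual[peak]
        model[peak] += flux
        for j, weight in enumerate(psf):
            k = peak + j - center
            if 0 <= k < len(residual):
                residual[k] -= flux * weight
    return model, residual


if __name__ == "__main__":
    psf = [-0.2, 0.3, 1.0, 0.3, -0.2]  # toy PSF with sidelobes
    sky = [0.0] * 32
    sky[10], sky[20] = 2.0, 1.0        # two point sources
    dirty = convolve_same(sky, psf, center=2)
    model, residual = hogbom_clean(dirty, psf)
    print(round(model[10], 2), round(model[20], 2))
```

The small `gain` means each pass removes only a fraction of the peak, which keeps the loop stable when the sidelobes of neighbouring sources overlap.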
Well, this is a paper by Erk Timmermann, so you can look at it when you have some spare time. What we're basically seeing here is huge jets of the synchrotron radiation emissions that I talked about, and you can see that over millions of years they vary in intensity. At the center of this image is actually a black hole, but you can't see that because it's a black hole, and in the background of this image there is an overlay of the optical domain, so not the radio domain, from the Hubble Space Telescope. This is what we have been able to capture with LOFAR. So we're doing groundbreaking science, and we're going to do a lot more. We're in the middle of a big upgrade that's scheduled for the end of 2024. WinCC is going to be replaced with Grafana. We're thinking about Alerta, but I've heard, actually at FOSDEM today, that Grafana has persistent alarms, so we might not need Alerta, and we're using Prometheus. We had these low-band antennas and the high-band antennas. I briefly skimmed over that, because, yeah, you have to cut some corners somewhere, but basically with LOFAR 2 we'll be able to use both of them in a single observation. We'll also be able to use multiple beams. We talked about the beamforming: currently LOFAR is only able to have a single beam per observation, and we will also be able to point at different points in the sky and change that during an observation, and we call this mega mode, don't ask me why. Yeah, we're completely revamping the SCADA system, and we're now using PyTango. We have significantly upgraded hardware, UniBoard2s; we actually sell those to external institutes as well, so they're available. And we're drastically improving the timing distribution. We're currently GPS-based, so everything is synchronized using GPS, all the stations across Europe, and we're going to use the White Rabbit protocol that's made by CERN, which is based on the Precision Time Protocol.
Now very briefly this mega mode: what would this schematically look like? This is basically what's running on COBALT, our GPU cluster, and we do imaging and beamforming. Right now we have one beam and several pointings, and they stay the same during observations; in the future we can have multiple beams, and we can repoint during the observation. That's going to create a lot of flexibility for the astronomers, and I'm very excited about the science that is going to come from this. I want to leave you with some links: as mentioned, our ASTRON repo and the ASTRON website. There's a very interesting page about 10 years of LOFAR, because LOFAR has been in production since 2008, so that's been quite some time. There's this very nice map on which you can actually see the physical locations of all the stations and how many antennas are active or working or broken, so this is all open data, you can just look at this, and there's some history of all the presentations that I've done in the past. So, any questions? Maybe first a short comment: the Chinese built a 500 meter dish. But what I really wanted to ask is whether you have collaborations with other astrophysical observatories, like the Square Kilometre Array or something like that? Well, actually we collaborate on the Square Kilometre Array, so there's definitely, can you repeat part of your question, because people were just leaving? Well, whether there is shared development in software and stuff?
Yeah, yeah, for sure, for instance on Casacore as I mentioned, but also on WSClean we see contributions from external collaborators, and especially JIVE, the part about JIVE that I showed at the very beginning, let me see, shouldn't be too long. So here I mentioned JIVE and EVN: this is a huge collaboration of parabolic dishes that are all connected, and all the data is sent centrally to Dwingeloo, at the headquarters of ASTRON, and that's actually where the EVN network processes all this data. But all these dishes that we use are not ours, right, those are from other parties. Someone's asking: since everything is processed digitally, can these telescopes focus on multiple targets at once by processing the data multiple times? That's an interesting question, and that depends. As I said, you have the transient buffers, which dump raw samples, but typically what we do is beamform on the station already, and if you do the beamforming on the station, you're already looking at some point in the sky. You're only sending the result data from that beamforming to the COBALT cluster, and you can't beamform again then; the data is lost, it's reductive in nature. But if you sent the raw sample data to COBALT, and it could somehow process all the data, which I don't think it has the bandwidth to do, then you could, at a later point in time, point at any point in the sky again, and that's the job of the transient buffers. Thanks. Maybe I have a question here: would there be any ways or interest for amateur astronomers, or radio astronomers, to help or work with ASTRON?
Well, there are definitely job positions on our page all the time, I think. I don't know if most are in the field of radio astronomy, but what typically happens, and I can briefly explain, is that we have a system called NorthStar, in which astronomers submit their proposals and describe what they want to do with the instrument, and then we have a committee that looks at that and actually accepts these proposals, and then they are scheduled. This is actually a very good question, because I completely skipped this in the talk, but I wanted to talk about it. Then things are scheduled in a system called TMSS, and that basically looks at, okay, which of these stations are required to do these observations and collect these data. Then the data are collected and processed on COBALT and CEP, and the data products are made available to the individual astronomers who proposed that, and they get exclusive access for a period of time to do their research. Okay, thanks. I was more thinking about, just if someone is in Africa with a homemade dish, is there any way to capture something with an SDR and add a little bit of data, or is the scale of things so different that... What's actually very important is that we need to model a lot of things and calibrate a lot of things, so that's why all the stations are roughly similar in shape and have similar hardware. So it would definitely be possible to buy your own station, or build your own station with the same hardware, and then hook it up; that happens all the time. Different countries do that, buy stations, and then we add them, but having vastly different hardware and then hooking this up to the system would be very difficult; it's not designed to do that. Okay, so if you would make a very cheap station that could be built by amateur astronomers, you could deploy that everywhere in the world, and then make your public radio astronomy like that. Interesting, thanks. |
HPC Container Conformance
Guidance on how to build and annotate containers for HPC |
First lightning talk is Christian. Yeah, thanks, Dennis. Corné said that he had a relaxed talk. I have only 10 minutes, so I need to speed up. What I would like to talk about today is HPC container conformance, which is a project that came out of the HPC Container Advisory Council, which meets every first Thursday. We try to provide guidance on how to build and annotate HPC containers. So conformance, what, you might ask, are we trying to achieve? We focus on two applications, maybe a third, but mainly GROMACS and PyTorch, and we want to go through an exercise of providing best practices on how to build or shape the container and also how to annotate the container. And I think the annotation part is the most important part, by the way. What we don't want to do is boil the ocean by making everything work everywhere. So that's why we focus on these two applications. We also want to allow for both generic and highly optimized images and, with annotations, make sure that people can actually discover those, and also provide some expectation management for them. We are going to focus on OCI images, and most likely on Dockerfiles. I mean, if people throw a lot of Singularity build recipes at me, then maybe I will change my mind, but for starters, we are going with Dockerfiles and OCI images. And if we have a Dockerfile that is derived from other artifacts, like a Spack YAML file or an EasyBuild recipe or an HPCCM recipe, then of course we also want to include those to make it easy for people to reproduce and tweak the actual container. When going through this project, I got in touch with the BioContainers community, and they created a paper in 2019, which is pretty interesting, where they provide some recommendations on how to package and containerize bioinformatics software. Of course, they don't compile for different targets, and they don't use MPI a lot.
So it's just a baseline, I think, for our work in HPC, but it's a good baseline, and I highly recommend that people read this paper. So the first thing in the HPC container conformance project is the expected image behavior. I think we have all been there: we have different images, we want to swap one out, and then we realize, oh, the entry point is different, or the container does not use an entry point but the application name. So we want to make sure that at the end of the day, all the containers that we produce in the HPC world are built so that they behave the same way, so that you can just swap out the container. You want to run GROMACS, you try out multiple different containers, and you don't need to change your submit script, only the name. So at the end of the day, the container should drop you into a shell, like you're logging into a node over SSH, and it should also have a very small, or even no, entry point so that it's easy to debug as well. If the entry point takes forever or makes a lot of changes, then it's hard to debug the container. So the container should have a very small or even no entry point, and maybe it changes some environment variables to pick up the application that is installed, maybe by EasyBuild or Spack, but it should be very small. The main part of this project is annotations, and why annotations? The basic idea is, and we have all been there, everyone who's done HPC containers, that we encode the information about the specific implementation of the image in the tag or in the name, and we don't want to do this anymore, right? We want the information to be annotated on the image and not be part of the name, because the name might change. So what do we want to do with these annotations? We want two things. First, to describe the image, the content of the image, and how the image is expected to be used, so that sysadmins and end users know what to expect.
So what user land is provided by the image? What tools are installed on the image? How is the main application compiled, like for what target, for what microarchitecture of the CPU, for which GPU, which MPI is used and so on, so that we can take this information and make maybe configuration examples for different container runtimes, so that hooks can react to those annotations. Podman and Sarus, for instance, can already react to annotations. So depending on what the image provides as information, the runtime can adapt and say, okay, I have an Open MPI container, I do this hook, I have an MPICH-based container, I take this hook. So I think that would be great if we can agree on certain annotations, and agreeing on certain annotations, I think, is a huge task, but I'm hopeful that we can achieve this, and then make sure that the configuration is done so that the application is tweaked the right way. And another piece that we can achieve here is that we create maybe a smoke test that looks at the host that it is running on, looks at the annotations of the container that you want to run, and just tells you, okay, this thing will segfault anyway, you are on a Zen 2 and you have an application that's compiled for Skylake, it won't work. So that you don't download 30 gigabytes of image layers just to realize that your image won't work. So I think that's also a very important part that we can do this. Another part as well is not just to describe the image, but to make it easy for end users to discover what images are around. So you want to run GROMACS, and you know or don't know the system you are on. So maybe you can just run a tool or have a website that tells you: you want to run GROMACS, I have looked through all the annotations, I know a little bit about your system, here we go, this is the image that you want to use. So also for discovery, I think that's important. Of course, we will have mandatory and optional annotations. 
So mandatory ones might be what CPU architecture it is compiled for, I think that's the obvious one. And optional ones, of course, if you want to add a CUDA version because your image has CUDA installed, then of course that's an optional one. Or you want to annotate the whole software bill of materials. Maybe it's too much information, but maybe not. So there are optional and mandatory annotations, I think that's pretty clear. And I created a couple of groups, like annotation groups that I think we should think about. I won't go through every single line item here because I only have 10 minutes and there are only 3 minutes left, so just maybe grab the slides afterwards and then go through it. It's not written in stone, it's just a proposal, so I'm happy to have feedback on this as well. So the first big one, and I talked about it already, is of course hardware annotations. So what is the target optimized for, the architecture, generic architecture or the real microarchitecture and then a key version, a value for this. As I said, CUDA versions, driver versions and so on, I think that's obvious that we need to annotate the container so that it defines what the actual execution environment should look like. Also obvious HPC things like the MPI and interconnect annotations so that you define what the implementation in the container is, is it Open MPI, is it MPICH based, is it even thread-MPI because you only want to run single node. What framework is used, libfabric, UCX, what have you, and now I'm going through all the line items so maybe I should stop, but at the end I think the last line is also important. What is the container, 2 minutes left even, what is the container actually, how is it expecting to be tweaked, so is the MPI being replaced, libfabric injected and so on, that's also I think important so that the sysadmin or the runtime knows what to do with the container to make it work at line speed. 
Sysadmin annotations I think are also important so that we know what the container expects from the kernel, what modules are introduced and so on, and also what the end user can expect, what tools are installed, is jq installed, is wget installed and so on. Another annotation is of course documentation, which would be nice as well. Base64-encoded Markdown would be great so that you can render how-tos and build tweaks and so on directly. Okay, one minute, how to annotate. I think that's obvious as well, that's a layered approach. Of course the base image should have annotations that we can carry over, and if you build subsequent images, add the annotations that are important, and after the image is already built you can use things like crane or Buildah or Podman, I think, to annotate images at the end without even rebuilding them, just repurposing them, or we could also collect annotations offline in another format and then annotate it. Okay, ideally, and that's like Kenneth and of course Todd as well, EasyBuild, Spack, they should annotate it correctly so that we don't need to teach everyone to annotate but the tools just annotate the image for us. And that's the external piece, so I created a tool, MetaHub, where we define images for different use cases and we can also annotate those images without actually changing the image but just with this. So okay, 10 seconds, last one. We need of course a fingerprint of the system to match the annotations against the host itself, so there needs to be a tool, time is up, and yeah, so we need to discover the right image, need to have a smoke test and help tweak the container. That's like the last bit so I think that's it. Thank you for the excellent example on how to do a lightning talk on time, we'll take one question, any questions for Christian? Do you need the clicker? 
Thank you for your presentation, I would like to ask how this relates to existing software supply chain metadata databases like Grafeas. Does this complement their functionality, or is this something completely different? I mean, we are good at HPC to build our own thing and then just say that everyone should adopt it. We want to complement it, we want to use these two applications and go through the exercise and then maybe learn from what we did with this project and try to push these ideas also in other things. But I think the AIML folks maybe didn't realize that they won't have this problem so we try also to not only think about HPC here but also think about other communities as well. So I'm open to everyone and the project is as well. Thank you very much Christian, if you want to chat with Christian he'll be around probably outside the door for the rest of the day or in the room and we'll switch it over to the next. |
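The smoke test that Christian describes, comparing image annotations against a fingerprint of the host before pulling any layers, could look roughly like the sketch below. The annotation keys (`org.example.hpc.*`) and the shape of the host fingerprint are invented for illustration; the talk proposes annotation groups but does not fix any names.

```python
# Hypothetical annotation keys; the talk does not define final names.
ARCH_KEY = "org.example.hpc.cpu.arch"            # assumed mandatory
MICROARCH_KEY = "org.example.hpc.cpu.microarch"  # assumed optional
MANDATORY_KEYS = (ARCH_KEY,)


def smoke_test(annotations, host):
    """Return a list of reasons why the image is not expected to run on host.

    annotations: dict of image annotations (key -> value).
    host: hypothetical fingerprint with "arch" and "supported_microarchs".
    An empty list means no problem was found.
    """
    problems = []
    for key in MANDATORY_KEYS:
        if key not in annotations:
            problems.append(f"missing mandatory annotation: {key}")

    arch = annotations.get(ARCH_KEY)
    if arch is not None and arch != host["arch"]:
        problems.append(f"image built for {arch}, host is {host['arch']}")

    microarch = annotations.get(MICROARCH_KEY)
    if microarch is not None and microarch not in host["supported_microarchs"]:
        problems.append(f"image optimized for {microarch}, not supported by host")
    return problems


# A Zen 2 host and an image compiled for Skylake, as in the talk's example.
zen2_host = {"arch": "x86_64", "supported_microarchs": {"x86_64", "zen", "zen2"}}
skylake_image = {ARCH_KEY: "x86_64", MICROARCH_KEY: "skylake_avx512"}
generic_image = {ARCH_KEY: "x86_64"}

print(smoke_test(skylake_image, zen2_host))
print(smoke_test(generic_image, zen2_host))
```

Because the check only needs the annotations, it can run against the image manifest without downloading any layers, which is the point made in the talk.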
The LDBC benchmark suite |
Hello, HPC Room, my name is Gábor Szárnyas. I work at CWI Amsterdam as a researcher, and today I'm here on behalf of the LDBC. The LDBC stands for the Linked Data Benchmark Council. We are a non-profit company founded in 2012, and we design graph benchmarks and govern their use. Additionally, we do research on graph schemas and modern graph query languages, and everything we do is available under the Apache v2 license. Organizationally, LDBC consists of more than 20 companies. These are companies interested in graph data management. We have financial service providers, database vendors, cloud vendors, hardware vendors, and consultancy companies, as well as individual contributors like me. So we design benchmarks, the first one being the LDBC social network benchmark, which targets database systems. Let's go through this benchmark by a series of examples. I will touch on datasets, queries, and updates that we use in this benchmark. As the name social network benchmark suggests, we have a social network that consists of person nodes who know each other via a distribution that mimics the Facebook social network. The content that these people create is messages. These form little tree-shaped subgraphs and are connected via author edges to the people. On this graph, we can run queries like the following. Let's take a given person, enumerate their friends and their friends of friends, get the messages that these people created, and then filter them based on some condition on their dates. So a potential substitution on this graph could be that we are interested in this query for Bob and the date set on Saturday. And if we evaluate this query, we start with Bob. We traverse the knows edges to Ada and Carl, then continue to Finn and Eve, and then we move along the author edges. And then finally, we apply the filter condition, which will cut message three and will leave us messages one, two, and four. So obviously, a social network is not a static environment. 
There are always changes. For example, people become friends, Eve and Gia may add each other as a friend. That will result in a new knows edge. That's simple enough. Gia can decide to create a message. This message will be a reply to message M3. So we add a new node and connect it to the existing graph via two edges. The heavy hitting updates are the deletes. A person may decide to delete their account, and that will result in a cascade of deletes. For example, if we remove the node Eve, that will result in the removal of their direct edges and all the messages they created. And in some social networks, this will even trigger the deletion of all the message trees and, of course, all the edges that point to those messages. So this is quite a hard operation for systems to execute. It stresses their garbage collectors, and it disallows certain append-only data structures. So if we want to weave these three components together, the data set, the queries, and the updates, we need a benchmark driver that schedules the operations to be executed. It runs the updates and the queries concurrently, and, of course, it collects the results. The system under test that we run the benchmark on is provided by our members who are the database vendors, and we go to great lengths to allow as many candidate systems as possible, so graph databases, triple stores, and relational databases can all compete on this benchmark. Speaking of relational databases, some of you may think, is SQL sufficient to express these queries, and the answer is that in most cases it is. So the query that we have just seen can be formulated as a reasonably simple SQL query. It is a bit unwieldy, but it is certainly doable, and the performance will be okay. However, this being a graph benchmark, it lends itself quite naturally to other query languages. There are two new query languages that are going to be coming out, and both of them adopted a visual graph syntax inspired by Neo4j's Cypher language. 
The first one is called SQL/PGQ, where PGQ stands for property graph queries. This will be released this summer, and as you can see, it's an extension to SQL, so you can use SELECT and FROM, but it adds the GRAPH_TABLE construct, and the query can be formulated in a very concise and readable manner. Then there is GQL, the graph query language, which is a standalone language that is going to be released next year, and it shares the same pattern matching language as SQL/PGQ. So the social network benchmark has multiple workloads to cover the diverse challenges that are created by graph workloads. The first one, the older one, is the social network benchmark interactive workload. This is transactional in nature, and it has queries like the one I have shown before. So these queries typically start in one or two person nodes. They are not very heavy hitting. They only touch on a limited amount of data. They have concurrent reads and updates, and systems are competing on achieving high throughput. So this benchmark has been around for a few years, and we have seen actually very good results. In the last three years, we witnessed an exponential increase in throughput, starting from a little above 5,000 operations per second to almost 17,000 operations per second this year. Our newer benchmark is the social network benchmark business intelligence workload. This is analytical in nature, and it has queries that touch on large portions of the data. For example, the query on this slide enumerates all triangles of friendships in a given country, which can potentially reach billions of edges, and is a very difficult computational problem. Systems here are allowed to do either a bulk or a concurrent update approach, but they should strive to get both a high throughput and low individual query runtimes. This benchmark being relatively new, we only have a single result, so it's a bit difficult to put it into context. But it allows me to highlight one thing. 
Many of our benchmarks use different CPUs. We actually have quite a healthy diversity in the CPUs. We have results with the AMD EPYC Genoa, like this one achieved by TigerGraph. We have results using Intel Xeon Ice Lakes and the Yitian 710s, which use an ARM architecture. We have more and larger scale results expected this year, and we are also quite interested in some graph and machine learning accelerators that are going to be released soon. So our benchmark process is quite involved. For each workload, we release a specification. We have an academic paper that motivates the benchmark. We have data generators, pre-generated data sets, as well as a benchmark driver and at least two reference implementations. We do this because we have an auditing process that allows the vendors who implement this benchmark to actually go through a rigorous test, and if they do so, they can claim that they have an official benchmark result. So we trademarked the term LDBC such that the vendors have to go through these hoops of auditing, and we still allow researchers and developers to do unofficial benchmarks, but they have to say that this is not an official LDBC benchmark result. Another benchmark I would like to touch upon briefly is the Graphalytics benchmark. This casts a wider net, so it targets graph databases, graph processing frameworks, embedded graph libraries like NetworkX and so on. This uses untyped, unattributed graphs, so it's only the person-knows-person graphs of the social network benchmark or other well-known graphs like Graph 500. We have six algorithms. Many of these are textbook algorithms like BFS, which just traverses the graph from a given source node, or we have PageRank, which selects the most important nodes in the network. We also have clustering coefficient, community detection, connected components, and shortest paths. This benchmark is a bit simpler to implement. We have a leaderboard that we update periodically. 
The next one is going to come out in Spring 2023, so talk to us if you're interested. So wrapping up, you should consider becoming an LDBC member because members can participate in the benchmark design and have a say in where we go, they can commission audits of their benchmarks, and they can also gain early access to the ISO standard drafts, SQL/PGQ and GQL, that I have shown. It's free for individuals and has a yearly fee for companies. So to sum up, these are our three main benchmarks. We have other benchmarks and many future ideas. If you're interested, please reach out. Again we have time for one question. Any questions for Gábor? This is a newbie question. I'm not into graphs. Apart from advertisement, optimization, mass surveillance, and perhaps content distribution, which I don't know if they're the major applications, but it's just what my naive mind comes up with. What other applications are those benchmarks meant to optimize? So the big one this year is supply chain optimization, like strengthening supply chains, ensuring that they are ethical, ensuring that they are not passing through conflict zones. It's something that is very important these days. You can also track CO2 emissions and other aspects of labor and manufacturing. So that's certainly a big one, and that's something that we have seen. And there are, of course, all the classic graph problems like power grids, a lot of e-commerce problems, and financial fraud detection, which is going to be part of our financial benchmark this year. |
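The example query from the start of Gábor's talk (a person's friends and friends of friends, the messages they authored, then a filter on the date) can be sketched over plain dictionaries as below. The toy graph, the dates, the assignment of messages to authors and the "created before the cutoff" filter are all invented for illustration; only the names Bob, Ada, Carl, Finn and Eve and the outcome (messages one, two and four survive) come from the talk.

```python
from datetime import date


def friends_within_two_hops(knows, person):
    """Friends and friends of friends of `person`, excluding the person."""
    direct = knows.get(person, set())
    second = set()
    for friend in direct:
        second |= knows.get(friend, set())
    return (direct | second) - {person}


def messages_before(knows, authored, created, person, cutoff):
    """Messages authored within two hops of `person`, created before `cutoff`."""
    people = friends_within_two_hops(knows, person)
    return sorted(
        msg
        for p in people
        for msg in authored.get(p, ())
        if created[msg] < cutoff
    )


# Invented toy data; "knows" edges are stored in both directions.
knows = {
    "Bob": {"Ada", "Carl"},
    "Ada": {"Bob", "Finn"},
    "Carl": {"Bob", "Eve"},
    "Finn": {"Ada"},
    "Eve": {"Carl"},
}
authored = {"Ada": ["m1"], "Carl": ["m2"], "Finn": ["m3"], "Eve": ["m4"]}
created = {
    "m1": date(2023, 2, 1),
    "m2": date(2023, 2, 2),
    "m3": date(2023, 2, 5),  # on or after the cutoff, so it is filtered out
    "m4": date(2023, 2, 3),
}

print(messages_before(knows, authored, created, "Bob", date(2023, 2, 4)))
```

This is the same shape of work the interactive workload asks of a database: start from one person node, touch a small neighbourhood, and filter.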
Multiple Double Arithmetic on Graphics Processing Units
GPU acceleration to offset the cost overhead of multiple double arithmetic |
Next lightning talk is Jan Verschelde talking about multiple double arithmetic on GPUs. Thank you very much to the organizers for allowing me to speak here. So I hope to talk about computations that I've been doing with multiple doubles. So the multiple doubles go back actually to the time when the hardware was not yet supporting doubles. So this was the late 60s. So this is actually a similar idea. So you use the hardware arithmetic to extend your precision. It has a lot of advantages. So if you're used to working with complex arithmetic, then double-double arithmetic has about the same intensity. So speaking of intensities, relative to the previous talk where we were working with graphs, so in the previous talk I had the impression that everything was about graphs and it was memory bound. My problems are compute bound. So I get really good arithmetic intensities. There are some disadvantages, of course. If you want to work with, say, 17 decimal places, you can't. Also if you want to work with truly infinite decimals, well, you can't either because you still have your 11 bits of the exponent. A disadvantage might also be that you can still do numerical analysis. So this might be an advantage or disadvantage. I got into this by power series arithmetic. So this is about the EXP and the EPS. So when I started working with power series, I was using 11111111. And I know the binomial theorem. Well, I only knew it when I saw the numbers blowing up on me. So you know it when you don't know it. So here is a table. The exponential has a very nice development, nicely decaying. And if you multiply these exponentials, you don't have any blow-up. However, the last coefficient, if you want to represent that, and you have to think about GPUs. GPUs are actually quite happy if they can do things in groups of 32. So actually an order 32 power series is still very small for GPUs. But there you already have to use quad doubles. 
Otherwise, your last coefficients, you can't represent them anymore. OK. So I started working with the QD library. And then we were doing multi-core. Me and my student, Genady Yoffe, we looked at each other, should we do this on the GPU? Should we write the entire library on the GPU? My student didn't really want to do it, and I didn't want to do it. But then we discovered GQD, and we used GQD. And the recent package that we are using is CAMPARY. It's actually the only software I know that is named after a beverage. I don't know if that's a good sign or not. In my supermarket store in Chicago, I once saw Campari, but it's not my drink. So I didn't want to ruin the taste of using CAMPARY. So I stayed off this. CAMPARY is actually quite good, because it allowed me to go to quad double, and now also octo double. The numbers in this table are kind of good, because I really want to have performance. But it is also somewhat misleading, because as soon as you're using complex double-double, everything becomes compute bound. And the problems that you have with memory transfer and all, you do a lot of arithmetic operations on a relatively small amount of data. I also like to do quality up. If you can afford the time for, say, a double precision calculation, well, you will see that everything is not really right. But then you can allow the same amount of time, and you quadruple the precision. So the 439 there, think about 1 gigaflop, 2 gigaflop, and then you go to teraflop. So the 439 is kind of, if you have teraflop performance, it's as if you would be doing this on a single core. So I mentioned the funding agencies on the very first slide. I would like to have a Hopper. But for now, I have to deal with Pascal and Volta. And the last one is a gaming laptop, which is also actually quite a powerful GPU. My first teraflop card was Kepler. And this last list of GPUs actually gets there. Okay. If you think of a double-double, there is a double2. 
And then for a quad-double, there is the double4. So that was what the GQD was using. And that's very good for memory coalescing. But we actually got into trouble with the complex quad-double because there was no double8. So if you work with a vector of quad-doubles, instead of a vector of arrays of length four, you actually better use four vectors. The first one with the highest double, then the second double, third double, fourth double. So it's a little bit similar to working with power series. So a power series is invertible if the leading coefficient is not zero. You can work with matrices of power series. But actually, that's not good. You should actually work with a series where the coefficients are matrices. Same idea here. QDlib is still a very good library. It's quite complete. So I have extended the square root, for example, to octo double precision. OK. So here is then my beginning. So I mentioned, so you saw this eight. So if you take a vector of 64 random complex numbers, then the norm is eight. Should be eight. So that's a really nice test property. If you work with GPUs, you actually define kernels, and kernels, the name says it itself, should be small. So think small. And actually, this problem is a small problem, mathematically speaking, but it has all the richness and the complexity of all the problems that you will run into. You will have to study the prefix sum algorithm, for example. So that is needed. You also have to tune your software for large vectors or for small vectors. You can only have one block of threads that is active. The square root works with staggered. So you apply a Newton method. And then actually, this is where the dot comes in. So the nice thing about double doubles, quad doubles, is that everything fits into registers. So it's also very good if you do multi-core. So you don't have to use the heap ever. But of course, when you get to complex quad doubles, you have these eight arrays. 
If you do octo doubles, so it doubles and doubles and doubles. So my old graphics cards can no longer even compile the octo doubles if you inline too much. So it's still very interesting that, actually, you have to tailor your kernels towards the precision levels. So here is my last slide. I did more than just norms. So we have teraflop performance when we evaluate polynomials and differentiate them. The QR, the blocked Householder QR, is also wonderful. You already get teraflop performance with 1,000 in complex double-double. And then the last paper is where you try to combine these things by computing Taylor series developments for solutions of polynomial systems. Newton's method is actually quite a nice operator. You start with a multivariate system where all the variables are linked to each other. And what Newton actually does, it spits out power series for each component. So actually, it untangles all the linearities, all the nonlinearities. So I listed the arXiv. So the IEEE puts things behind the paywall. So you have the arXiv versions there. And you're more than welcome to the bottom line of this slide. I mean, the conclusion, actually, is that all the software is free and open source. I have the GitHub handle there. |
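The core idea of Jan's talk, using hardware doubles to extend the precision, rests on error-free transformations: a sum of two doubles is stored as the rounded result plus the exact rounding error. The sketch below shows a standard textbook two-sum and a simple ("sloppy") double-double addition in plain Python, whose floats are IEEE doubles. It is not taken from QD, GQD or CAMPARY, and the function names are chosen here for illustration.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def quick_two_sum(a, b):
    """Same as two_sum, but only valid when |a| >= |b|."""
    s = a + b
    return s, b - (s - a)


def dd_add(x, y):
    """Add two double-doubles, each a (high, low) pair of doubles.

    This is the simple variant that adds the low parts without
    tracking their own rounding error.
    """
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return quick_two_sum(s, e)


# In plain double precision the small term is lost entirely.
print((1.0 + 1e-20) - 1.0)

# As a double-double, the small term survives in the low part.
one_plus_tiny = dd_add((1.0, 0.0), (1e-20, 0.0))
print(one_plus_tiny)
print(dd_add(one_plus_tiny, (-1.0, 0.0)))
```

Every double-double operation expands into a fixed handful of double operations on a few values, which is why the arithmetic fits into registers and the kernels become compute bound, as described in the talk.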
Overengineering an ML pet project to learn about MLOps
Force yourself to do pushups while working from home! |
The next lightning talk is Victor, and this should be a really fun talk, I think, about MLOps. Yes. So, hello. This is probably going to be the least serious talk you have seen today, so I'm sorry about that. We're going to be automating weight loss with AI. It's a stupid project I made in a weekend, or like in a few weekends, but I want to talk about it. So, who am I? Lightning talk version. I'm Victor Sonck. I work at ClearML. Hi. Let's make something. So, that's the reason why I'm here. The problem statement was I'm working at home, and I'm not working out enough, like probably a lot of us are, and so the solution is why not lock my PC every hour and force myself to do push-ups, and then it automatically opens again. That was the main idea. I want to do this with AI, obviously, because over-engineering, and I'm a machine learning engineer, so if I am a hammer, everything looks like a nail. This is going to be the diagram, so left top is an OAK-1. It's an AI camera. I'll talk about it in a second. That will run one model, and then, because one AI model is not enough, I have two, so there is a second model that runs on the Raspberry Pi that will lock my PC. This is what it looks like, so you get like a notification, workout time, lazy bum, and you have to do push-ups. It's a Raspberry Pi running in the corner of my room. You can follow along with the diagram at the right top. So I'm going to do pose estimation with the OAK-1. Now what is the OAK-1? The OAK-1 is a 150 bucks open-source hardware AI camera, which is really cool. I highly recommend it. They run the Intel Myriad X, so if you look at the speeds there, if you have the OAK-1, because it does the AI on the chip itself, on the camera itself, it can get a lot higher FPS on the Raspberry Pi, because it doesn't have to go to the Raspberry Pi to do anything. Even when compared to another AI accelerator connected to the Raspberry Pi. 
It also has excellent documentation, which is a unicorn these days, but yeah, it really is a nice library. So they have a bunch of cool examples that you can try, like there's DeepLab with segmentation, there's other stuff. But luckily for me, I didn't have to do any work, because they also had pose estimation. So thanks to geaxgx for implementing this. This is an awesome repository. It's still being maintained, if I remember correctly. So definitely check that out. That's really cool. Now, what does this repository do? It basically gives me pose estimation, so it films me on that AI camera. I have one with me, by the way, so after the lightning talks, I can actually give a demo lightning talk. I can't do it right now. Basically it draws like a skeleton on top of you at like seven, eight frames a second on the Raspberry Pi, which is awesome, and then it even positions them in 3D. So that's nice. This is stage one. We want to go to a pushup detector. This is stage two. So we now basically have a skeleton. If we just throw away the pixels, these are the only points that we actually care about. And then now it just basically becomes a tabular problem. So the second part of the machine learning is going to be really simple. Now we just have a few points, and we want to classify them. For this second model, though, this is not pre-trained, so I actually have to label a few images. It's not a very complex model, but you have to do something. So what we want to do is we want to say, okay, this is a pushup, this is a pushdown, and then we can do some additional logic to actually count them while they're happening. Right. But then the question becomes, how do I take a picture when I'm actually doing my pushups? Because like there is a camera there, do I need a button, but then it might overfit on me pressing the button or like something else. So if you're a machine learning engineer, the answer is throw more AI at it. 
So basically overengineering using an unnecessary amount of AI, I set up a microphone while I was pushing up and pushing down. There's a really cool open source package for Python that can do voice recognition. It does send it to the proprietary API of Google, but at least the code is there. And then you can just basically shout label me, and if the word label is actually found inside of what you said, it will take a picture. So that's that. Now we have a third AI model that's really useful. And then I did a lot with ClearML. So this is actually the MLOps part. I now have two models. I want to be able to train them. I want to be able to maintain them. So what I did is this, this is the labeling tool. So right, left top, left top for you. Left top is the OAK-1 that will take a picture. When I shout, take a picture, it will send it to ClearML, which is actually an open source MLOps tool, one that I work for. And they have data versioning, for example. So every single time I run the labeling tool, it will create a new version of my data set, which is very useful. And then I can use this new version of the data set. I can pull it in. I can use the experiment manager of ClearML to keep track of all my code. Every single time I run or I train, I will get all of my output, all of my plots. And then you can actually build this into pipelines. You can run this automatically on remote machines. So I over-engineered the crap out of it, but I can't really tell everything in a lightning talk. The main idea is you have a lot of different tools in ClearML that can help you with that and automate a lot of that stuff. Now training my own model. So now we have all of those points. For each of those sets of points we have whether it's a push up or a push down. Where do you go from here? Training my own model, it's this. Like it's super simple. It's three lines of code these days. So this is just sklearn. It's an SVM. It's a simple classifier. It takes points in. 
Gives you one point out. Push up, push down. It's not ideal. I should do a no class, but I was lazy. No class basically meaning it's nothing, none of the two. So now when I walk to it, it will maybe register a push up, which is not ideal. So in order to combat that, I made a very simple, even simpler piece of code that basically primes it. So here on the left, you can see one is basically push down, two is push up, and so you can see it happen. I think, yeah, you can basically see it happen there in the beginning. But when I run to my place to start to do the push ups, here you can see that there's like a bit of jittering going on because it doesn't know the zero class. So in order to catch that, what you can basically say is, if the length is 10, so if you've at least been doing detection for some time, then you can check if the last 10 of them were push up. So I'm basically ready in my position, only then prime it. And then once it's primed, you can start counting. So that's just a very simple, stupid way of doing it. Two minutes left. Excellent. So actually, that's it, but I have two minutes left, so I'm going to do one more thing. Locking the computer. So I use Linux, which allows you to do everything, which is awesome. So locking the computer was easy, but unlocking was hard, as it probably should be. You have to put in a password. So there was no real way to get a custom password. I tried thinking of like maybe I should scramble my password and then fill in that scrambled password, but never do that. You will be locked out if your code is buggy and it happened. So no way to get a custom password, and there is one big problem. I know my password, so if I can't change it and I lock my computer and I really don't want to do push ups, I can just fill it in and be done with it. So the best and simplest solution I can come up with is just to use xdotool and then spam backspace. 
So xdotool actually allows you to type automatically while your computer is locked. So you can just spam backspace, which does not allow you to fill it in because it's like backspace 20 times a second, and then when you do the push ups, it just fills in your password. And that's it. So yeah, a lot of over engineering, and I hope you found it interesting and you learned something. So thank you so much for listening. One last note before any questions. There is a YouTube video about it on the channel MLMaker. |
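The priming logic Victor describes (only start counting once the last 10 classifier outputs were all "push up") is small enough to sketch. The label values 1 and 2 follow the talk; the class name, the window handling, and the rule that one repetition is a push down followed by a push up are assumptions made here, since the talk does not show the counting code.

```python
from collections import deque

PUSH_DOWN, PUSH_UP = 1, 2  # label values as shown in the talk's plot


class PushupCounter:
    """Counts push-ups from a stream of per-frame classifier labels."""

    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # keeps only the last `window` labels
        self.primed = False
        self.went_down = False
        self.count = 0

    def update(self, label):
        """Feed one label and return the number of push-ups counted so far."""
        self.history.append(label)
        if not self.primed:
            # Prime only when the window is full and every label is "push up",
            # i.e. the person is holding the starting position.
            window_full = len(self.history) == self.history.maxlen
            if window_full and all(l == PUSH_UP for l in self.history):
                self.primed = True
            return self.count
        # Assumed counting rule: a down followed by an up is one repetition.
        if label == PUSH_DOWN:
            self.went_down = True
        elif label == PUSH_UP and self.went_down:
            self.count += 1
            self.went_down = False
        return self.count


counter = PushupCounter()
jitter = [1, 2, 1, 2]          # walking to the spot, classifier is guessing
hold = [PUSH_UP] * 10          # holding the start position primes the counter
reps = [1, 1, 2, 2, 1, 2]      # two push-ups
for label in jitter + hold + reps:
    total = counter.update(label)
print(total)
```

The jitter before priming is ignored, which works around the missing "none of the two" class without retraining the model.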
Reproducibility and performance: why choose?
CPU tuning in GNU Guix |
Okay, the final lightning talk for today is Ludovic talking about Guix. All right, thank you. Hello HPC people. So my name is Ludovic Courtès. I work at Inria, which is a French research institute in computer science. And I work as a research engineer. So I'm very much concerned about engineering issues in general. And in particular, I'm concerned about deployment. So if you're an HPC devroom aficionado, we've probably met before. I gave a couple of talks, I guess, in this room, more specifically about Guix. So maybe you've heard about Guix. It's a software deployment tool. So we have EasyBuild, Spack, also RPM, well, you know, apt, et cetera. And this is yet another deployment tool, if you will. But we have this very particular vision, you know, the grand vision where we're trying to build a tool for reproducible research and HPC. So the thing that you see here is the vision, so to speak. So at one end of the spectrum, we have, you know, research articles, and we want the research to be solid. So we want the computational workflows to be reproducible. And at the other end of the spectrum, on the left, we have archives, source code archives like Software Heritage, which we really need to have if we want that scientific source code to, you know, remain available over time. And in the middle, well, we need a bunch of tools, in particular a deployment tool like Guix to, well, to deploy software reproducibly. Yes. So in a nutshell, yes, Guix provides actual tools for reproducible research. I'm not going to go into details, but basically you can say, all right, I've made an experiment, a computational experiment. So now I'm going to pin the exact revision of Guix that I used. This is the first command here. And the second command is, you know, some time later, when I or some colleague wants to reproduce the results. And so they use the time machine to jump to that specific revision of Guix.
And from there, they deploy the exact same packages that I specified in that manifest file, bit for bit. That's the idea. All right. So in HPC, I guess most people in this room would agree, we have two obsessions. That's MPI and AVX, well, vector instructions. We want things to run fast, right? We have those fancy clusters. So we want to make sure that the communications are going to be fast. We want to make sure we're going to use the latest vector instructions of our CPUs. And that makes a lot of sense. But sometimes, maybe, we have preconceptions about the implications of all this. So here I'm quoting Todd Gamblin, who's maybe in this room actually. Hi, Todd, if you see me. This is an example where, well, Todd here was saying, you know, binary distributions like Debian or Guix or Fedora, for example, are just targeting the baseline of the CPU, like x86_64 without AVX, for example. And that's a problem for performance. Because of course, if you have that latest fancy Intel processor, then you probably want to use those vector instructions. But the conclusion that, because of this, we cannot use binary distributions, you know, distributions like Guix or Debian that provide binaries, is not entirely accurate. That's the point I'm trying to make in this talk. So yeah, as most of you know, there's a whole bunch of vector extensions. It keeps growing, you know, every few years we have new vector extensions in Intel or AMD CPUs, or even AArch64 CPUs, POWER9, et cetera. And it's even worse if you look at the actual CPU models. For example, this is just for Intel, and there's a whole bunch of things. It's not always a superset of the previous CPU, you know, we were discussing it the other day at dinner. And yeah, sometimes it's complicated. You cannot tell that Skylake-AVX512 is exactly a superset of Skylake. It's complicated. And yet you want to be able to target these CPUs specifically, these micro-architectures.
And it makes a big difference. So this is an example from an Eigen benchmark. So Eigen is a C++ library for linear algebra, specifically targeting small matrices. And well, you know, on my laptop, if I'm compiling with -march=skylake, then I get a throughput that's three times the baseline performance. So it's a pretty big deal. So we definitely want to use that. We want to be able to compile specifically for the CPU micro-architecture that we have. But the good news is that, to a large extent, that's been a solved problem for a long time. So there is this thing called function multi-versioning that is already used in a number of performance-critical libraries. So if you look at libc for string comparison, or if you look at OpenBLAS, if you look at FFTW, GMP for multi-precision arithmetic, you know, many libraries and programming language runtimes already use function multi-versioning. So what's the deal here? Well, roughly, when you have function multi-versioning, you can say, well, I have one function that does some linear algebra stuff, for example, and I'm actually providing several variants of that function. And when I start my program, at runtime, the loader or, you know, the runtime system is going to pick the most optimized one for the CPU I have at hand, right? So if I use GMP, for example, for multi-precision arithmetic, it's going to pick the fastest implementation it has, you know. So you can compile GMP once, and then it's going to use the right one at runtime. And even if you're using GCC or Clang, you can specify in your C code, well, I want this particular function to be cloned, so to have several variants, one for each CPU micro-architecture, and GCC or Clang is going to create several variants of that function so that it can pick the right one at runtime. So kind of a solved problem, in a way, well, except in some cases.
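As a rough mental model of function multi-versioning, the toy sketch below keeps several variants of one function and resolves, once, which one to use for a given set of CPU features. It is only an analogy in Python: in C the variants are generated by the compiler (for example with the GCC `target_clones` function attribute) and selected by the loader or runtime, and the variant names, feature strings and `resolve_dot` helper here are hypothetical. All three variants compute the same thing; in a real library each would be compiled for a different instruction set.

```python
def dot_baseline(a, b):
    """Portable variant, always usable."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx2(a, b):
    """Stand-in for a variant compiled with AVX2 enabled."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx512(a, b):
    """Stand-in for a variant compiled with AVX-512 enabled."""
    return sum(x * y for x, y in zip(a, b))


# Most specialised variant first; the baseline is the fallback.
VARIANTS = [("avx512f", dot_avx512), ("avx2", dot_avx2)]


def resolve_dot(cpu_features):
    """Pick the most specialised variant supported by `cpu_features`."""
    for feature, implementation in VARIANTS:
        if feature in cpu_features:
            return implementation
    return dot_baseline


if __name__ == "__main__":
    # The resolution happens once, at startup; callers then just use `dot`.
    dot = resolve_dot({"sse2", "avx2"})
    print(dot.__name__, dot([1, 2, 3], [4, 5, 6]))
```

The point of the pattern is that one binary serves every CPU: the choice is deferred to run time instead of being fixed when the package is built.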
Well, one particular case where we have a problem is C++ template libraries, like Eigen, which I was mentioning before. They are not able to benefit from function multi-versioning in any way. So when you compile your Eigen benchmark, well, you really have to use -march=skylake, for example, if you are targeting a Skylake CPU. And this is because if you look at the Eigen headers, for example, there are many places where you have #ifdefs: do I have AVX-512 at compilation time? If yes, then I'm going to use the optimized implementation, otherwise I'm going to use the baseline implementation. And this is all happening at compilation time, so you really have to have a solution at compilation time to address this. And so this is where Guix comes in. So Guix is, you know, a distribution, like Debian, like I was saying, that's targeting the baseline instruction set, but we came up with a new thing that's called package multi-versioning. It's actually one year old or something. Roughly, the idea is to take the same idea as function multi-versioning, but apply it at the level of entire packages. So let's say I have those Eigen benchmarks. I can run them using just the baseline x86_64 architecture, using this guix shell command. It's, you know, taking the eigen-benchmarks package, and in that package running the benchBlasGemm command, right, on a small matrix. And then I can say, all right, now I want to tune that code specifically for my CPU, and then I just add that extra --tune option, and it's telling Guix, all right, please optimize that eigen-benchmarks package directly for the CPU I'm on, which is Skylake in this case, and this is it. And what happens behind the scenes is that, on the fly, Guix is creating a new package variant. So it's taking that eigen-benchmarks package and creating a new package variant that is built specifically with a compiler wrapper that passes the -march=skylake flag.
And I get the performance, and I'm happy, right? So this has been in Guix since 2022, and it's still reproducible, you know, because we can still say, all right, what precise option did I use, what --tune option did I use, and it's Skylake, all right. So the build process of the package remains reproducible, right, I'm still getting the same binary if I use --tune=skylake on my laptop or on some HPC cluster, whatever. And there are no world rebuilds, which means that the build farm, for example, the official build farm of the project, is providing several variants of those packages, those, you know, performance-sensitive packages, built for Skylake, Skylake-AVX512, you know, different things. So if you install them, most likely you're going to get a pre-built binary that's specifically optimized for that CPU. And if not, well, that's fine, it's going to build it for you, that's okay. So my conclusion here is, you know, we keep talking about performance of MPI, vector instructions and so forth. Well, I think we can have performance, we can have portable performance, that's what we should aim for, and we can still have reproducibility. We don't have to sacrifice reproducibility for performance, that's my take-home message. Thank you. Thank you very much. Again, time for one question. Okay, yeah, this whole --tune thing looks awesome. But what if the majority of the computation time is spent in libraries that the package is actually using? How do I tell it to optimize those instead as well? Right. So the way it works in Guix, you can annotate packages that really need to be tunable, right? So you can add a property to a package, so it would be eigen-benchmarks in this case, or it could be the GNU Scientific Library, GSL, and you say this package needs to be tunable, so if I use --tune, please tune specifically this package.
And it's going to work even if you're installing, you know, an application that actually depends on GSL, for example. All right, thanks a lot Ludovic. Thank you. Thank you. |
LIBRSB: Universal Sparse BLAS Library
A highly interoperable Library for Sparse Basic Linear Algebra Subroutines and more for Multicore CPUs |
Alright, we're going to start the next talk. If you're going to stay in the room, please take a seat. If you want to leave, please leave. Okay, so the next speaker is Michele, who's going to talk about a universal sparse BLAS library. So, yes, at the core of many technical or scientific computing problems, we end up reducing the problem to solving a system of linear equations. If the system of linear equations were a simple one, like a two-by-two one, the method of solving would be pretty simple. And in any case, it would involve representing the linear system by the data structure of a matrix, so a table of symbols or usually numbers, and a few vectors of numbers. So, the matrix is the basic structure of scientific computing. In the case of such toy problems or school problems, we have exact direct solutions at our disposal, which work fine. However, once we go into problems involving simulation of larger domains, so engineering problems, the linear systems to be solved become large. And also, the methods that we use for smaller systems are not applicable here anymore, because the numerical stability that we have for, let's say, toy problems or small problems is not there anymore. Simply, with those methods, the numbers, the results, they diverge. And the time to solution also increases more than exponentially. So, they're simply infeasible and don't make sense. So, we need completely different techniques for large linear systems. So, furthermore, if the systems are not only large, but also full of zeros in the matrices, what do we call these systems, or what are we dealing with here? We are dealing perhaps with sparse systems or sparse problems. This is the way we call it. So, in this acoustics matrix, or matrix coming from acoustics, we observe that, on average, less than half a percent of the elements in each row are non-zero. So, we would call this a sparse system, perhaps, so the system coming from this matrix.
Indeed, usually we are happy with the definition of Jim Wilkinson, where we say a problem or a matrix is sparse if, with our technology, with our technique, we can make use of the amount of zeros there to our advantage. So, this is the definition. It's not about numbers. It's really about what we are able to do with the way the matrix looks. So, among the different matrices we can encounter, we could have matrices from a circuit simulation, which look like this, and have such clustered elements in them. Sometimes the elements are more clustered around the diagonal, like in this, I think, quantum chromodynamics matrix. Computational fluid dynamics matrices are a bit more regular, I could say. So, it means that you can exploit all of those matrices, perhaps, in different ways. This is why I'm showing you this gallery. This is another CFD, so computational fluid dynamics matrix, a structural matrix, another material problem matrix, structural and so on. This is also a CFD one. So, this was just to tell you that sparsity really is related to the technology we use to deal with it. So, usually, we are happy using iterative methods with sparse systems. Iterative methods, because something is being iterated. So, there is a loop, and with the most common methods, Krylov methods, the loop usually has a bottleneck, a core operation, which is predominantly multiplying the system matrix by one vector or many vectors. It depends a bit on the technique. There are several of them. Here, I'm showing a GNU Octave implementation of one such iterative method. So, there are two kernel operations, or main operations: multiplication of the matrix by many vectors, or let's say, by another dense matrix, and the triangular solve, so the solving of matrices which are called preconditioner matrices, but are sparse. And these are the core operations which we are interested in.
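To make the role of the matrix-vector product concrete, here is a minimal pure-Python sketch of one Krylov method, unpreconditioned conjugate gradient, running on a matrix stored in the common compressed sparse row (CSR) layout. It is not the Octave code shown on the slide and it does not use librsb or the Sparse BLAS API; the function names and the small test matrix are illustrative assumptions. The only place the matrix is touched is the `csr_matvec` call inside the loop, which is the kernel the talk refers to.

```python
def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector, skipping all stored zeros."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = csr_matvec(indptr, indices, data, p)   # the core kernel
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]] in CSR form.
    indptr = [0, 2, 5, 7]
    indices = [0, 1, 0, 1, 2, 1, 2]
    data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
    b = csr_matvec(indptr, indices, data, [1.0, 2.0, 3.0])
    print(conjugate_gradient(indptr, indices, data, b))
```

Because every iteration performs one such product, any speedup of the sparse kernel translates almost directly into a speedup of the whole solver.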
And I want to mention that those operations, the sparse matrix-vector or multi-vector operations, can have many variants. The variants can be on the matrix, which could be perhaps complex and Hermitian, or symmetric. It doesn't always have to be square. It could be any rectangular one. And perhaps it already has a unit diagonal, and we can exploit this. This is what I'm saying. Many things change if the matrix has complex numbers, or long complex numbers like a speaker before me spoke about. So, that changes the balances in the performance profile here. And other things might change. And all of this has an influence on the specific kernels. And if you think, like Ludovic has spoken about, of the different variants that one might want to build for different architectures, you see that you end up with code bloat if you really want to optimize each subcase. Also, the operands have their own variants, so, in the way the data are laid out in the dense matrix. Yeah. Similarly, for the triangular solve operation, there also you have different variants, which lead to a multitude of different kernels, or ways you wish to write them, kernels of code. So, this led a committee of people, at the end of the 90s, to meet together. It was mostly US people, but also delegations from Europe, to standardize an API, which they called Sparse BLAS, Sparse Basic Linear Algebra Subroutines, to somehow just give an API to the different variations of the operations that I spoke about. So, it's not like the full BLAS, if you know the dense BLAS. It's mostly about creating sparse matrices, destroying them, and doing a few operations, not only those ones, but these are really the core operations. And they talked about C and Fortran because, yeah, it was 20-something years ago that they finalized the final document.
Now, after 20 years, we could say that, well, what they wrote is, in my opinion, perfectly portable, and allows some parallelization, even if it's not specified at all. They didn't foresee extensions, but it's possible. If you look at the API, you see that you can have extensions. So, they're not blocked somehow. The name Sparse BLAS has been copied by every major vendor you can imagine. The sad thing is that each major vendor has completely violated the API. So, they changed something in a slightly incompatible way, which is sad, simply sad. And the original Sparse BLAS didn't think about GPUs, but actually, in my experience, looking at how people program code, I see so much technical debt that I think you can make compromises. And with small adaptations, you could adapt the Sparse BLAS to GPU computations to some extent. So, I think you can save this API to a good extent. And this is the reason why I'm here. So, I wrote a library which respects the original Sparse BLAS. So, a Sparse BLAS program can look like this, where you have a notation for the Sparse BLAS operations, which is logical if you know BLAS a bit, so you can understand it a bit. And going in the direction of my library, it's centered around a data format, a sparse matrix format, which I came up with. It's called recursive sparse blocks, because there are recursive subdivisions. There are blocks which are sparse. And the motivation for this data structure is to not exclude the Sparse BLAS operations. So, I have made compromises in order to allow Sparse BLAS operations to be there. I didn't want to preclude these operations. So, it's a compromise. And the core idea here is to have, let's say, cache-sized blocks, more or less, and a way to give each core of a multi-core something to work with. So, it's oriented towards multi-core. It's not for GPUs, or not at the moment, at least.
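The idea of recursive subdivision into sparse blocks can be illustrated with a small quadtree-style sketch: an index range is split into quadrants until every leaf block holds at most a chosen number of nonzeros. This is only a conceptual illustration and not librsb's actual layout or algorithm; the function `subdivide`, the `max_nnz` threshold (a stand-in for "roughly cache-sized") and the coordinate-triplet input are all assumptions.

```python
def subdivide(entries, row0, row1, col0, col1, max_nnz):
    """Split [row0, row1) x [col0, col1) recursively into quadrants.

    `entries` is a list of (row, col, value) nonzeros inside that range.
    Returns a list of leaves: ((row0, row1, col0, col1), entries_in_leaf).
    Empty quadrants are dropped.
    """
    single_cell = row1 - row0 <= 1 and col1 - col0 <= 1
    if len(entries) <= max_nnz or single_cell:
        return [((row0, row1, col0, col1), entries)] if entries else []
    row_mid = (row0 + row1 + 1) // 2
    col_mid = (col0 + col1 + 1) // 2
    leaves = []
    for r_lo, r_hi in ((row0, row_mid), (row_mid, row1)):
        for c_lo, c_hi in ((col0, col_mid), (col_mid, col1)):
            inside = [(i, j, v) for (i, j, v) in entries
                      if r_lo <= i < r_hi and c_lo <= j < c_hi]
            leaves.extend(subdivide(inside, r_lo, r_hi, c_lo, c_hi, max_nnz))
    return leaves


if __name__ == "__main__":
    # A 4x4 matrix with a full diagonal and two off-diagonal nonzeros.
    nonzeros = [(0, 0, 1.0), (1, 1, 1.0), (2, 2, 1.0), (3, 3, 1.0),
                (0, 1, 0.5), (3, 0, 0.5)]
    for bounds, block in subdivide(nonzeros, 0, 4, 0, 4, max_nnz=2):
        print(bounds, len(block))
```

Densely populated regions end up finely subdivided while sparse regions stay as large blocks, and each leaf is an independent unit of work that could be handed to one core.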
So, for the matrices which you have seen before, with this data format, the data structure looks a bit like this. The color is based on the population, on the number of nonzeros there. Then there is another color coding with other information. But this is just to tell you that the irregular aspect of those matrices is reflected also here, to some extent. Yeah. So, the library itself wants to provide Sparse BLAS. So, building blocks for iterative solvers. The library is pretty compatible. It works with C++, Fortran, Octave, and Python. I say it's quite compatible. So, it uses, let's say, established technologies. And it's quite compatible also with your software. The only data structure which is custom is the matrix. You don't need extra data structures for vectors. And the program you saw before, written with Sparse BLAS, needs just one extra init and one finalize function to use RSB. So, I really respect that API. But, however, it's nice to also write the 15th standard. Just joking. This is not the 15th standard, but just the internal API. So, if you want, perhaps you can exploit the internal API of librsb, or not internal, but the native one. Please tell me when I'm at 10 minutes. Yeah. Which is primarily in C. Then you have wrappers for C++. And there's also the Fortran one. These are the native APIs. And what is specific about RSB is that it's not so clear which blocking is best. Because, yeah, depending on how you block, you could have better or worse performance. And for this reason, there is an idea of using automated empirical optimization in this library. There is a special call, a function which you call when you want to invest time to ask the library to optimize the data structure a bit. So, you sacrifice a minute, perhaps, for optimizing the data structure a bit.
And you do this in the hope that the many hours during which you will be using this matrix afterwards will be decreased, thanks to the optimization. Because, as I said, this is meant to be used for iterative methods. So, you will be running this for many hours. And, therefore, spending a few minutes in automated optimization is something that should pay off. No guarantee, but that's the idea, and that is usually how it goes. To give an idea, the C++ API is what you would expect. So, there is a class templated on the type. So, there is type safety here. When you say, this is my library, sorry, this is my matrix. These are my non-zeros, because this is what we are representing here. You have flags, C-style flags, for options like symmetry, or for asking to discard zeros rather than keep them, because sometimes you want to keep structural zeros for modifying them later, for instance. So, you have many such options here. And this is why I'm showing this slide, to tell you that there are many options which I'm not showing you here. Yeah. And the only data structure here is the RSB matrix, no other custom stuff. And you can enjoy the span interface of C++20, which doesn't really force you to have any weird custom vector type apart from the standard C++ ones. If you want to use, for instance, GNU Octave and enjoy the multi-core speedup from librsb, you can use the sparsersb plug-in which I wrote, which uses the C++ librsb pretty efficiently. So, apart from a few conversions, it should have almost native performance. Similarly for Python, the PyRSB plug-in, sorry, standalone package, has an interface which is copied from SciPy's csr_matrix. So you use it mostly the same way. But underneath, librsb runs. You don't see it. Or you see it if you ask it to use the auto-tuning routine.
Because, as I said, in all of those language implementations, you can also use all of the functionality of librsb, which includes the auto-tuning, also here in Octave. And I want to stress this. GNU Octave doesn't have multi-threaded sparse operations. With librsb, you can have them. Same for SciPy sparse. As far as I know, it's not multi-threaded. With librsb, you get it. librsb is by default licensed under the Lesser GPL 3. Which means that, as long as you don't modify it, you can distribute it with your proprietary code. If you modify it, well, it's more complicated. You have to release the modified version. If you want to learn to use the librsb library, it makes absolute sense to use a packaged version from Debian, Ubuntu, or most Linux distributions. Or, if you use Windows, you can use Cygwin. Or, once you want the performance, I mean, you can just compile it by yourself because it's quite trivial. Or enjoy what our colleagues here from Spack and EasyBuild have done and use the packaged version from those distributions. And some people have written wrappers for Rust and Julia. I don't know these languages, so I didn't use them. I think the Rust one covers, like, the entire API. I think the Julia one is more in Julia style, so just the core functionality is there, I think. Yeah, that was everything. I don't know how much time I took. Oh, 50 minutes. So, thanks. |
numba-mpi
Numba @njittable MPI wrappers tested on Linux, macOS and Windows |
Thank you for the opportunity to present our project, numba-mpi. Let me first acknowledge the co-authors. My name is Sylwester Arabas, and we are here with Oleksii Bulenok and Kacper Derlatka from Jagiellonian University in Kraków, Poland. Maciej Manna from the same university contributed to this project, and we will also be presenting some work from David Zwicker from the Max Planck Institute for Dynamics and Self-Organization in Göttingen. So let's start with a maybe controversial, provocative question: Python and HPC. And let's try to look for answers to this question in a very respected journal, okay? So maybe you have some guesses what's written there. 2019: in scripting languages such as Python, users type code into an interactive editor line by line. It doesn't sound like HPC. Next year: a level of computational performance that Python simply couldn't deliver. Same year, same journal: Numba runs on machines ranging from embedded devices to the world's largest supercomputers, with performance approaching that of compiled languages. Same year, Nature Astronomy: astronomers should avoid interpreted scripting languages such as Python. In principle, Numba and NumPy can lead to an enormous increase in speed, but please reconsider teaching Python to university students. Same year, Nature Methods: implementing new functionality in SciPy, Python is still the language of choice. The full test suite should pass with the PyPy just-in-time compiler as of SciPy 1.0. Are they talking about the same language? No. The left-hand side are papers about Rust and Julia. The right-hand side are papers about Python. So maybe that's the reason. So just to set the stage, let me present a way that I think is apt for thinking about Python. So Python as a language lacks any support for multi-dimensional arrays or number crunching, because it leaves that to packages. Python also leaves it to implementations to actually interpret its syntax.
And CPython is, of course, the main implementation, but it's not the only one, and actually solutions exist that streamline, for example, just-in-time compilation of Python code. Moreover, NumPy, while the de facto standard, is not the only implementation of the NumPy API. And alternatives are embedded in just-in-time compilation frameworks and GPU frameworks for Python, and they leverage typing and concurrency. So probably the highlight here is that Python lets you glue these technologies together and package them together, leveraging the Python ecosystem and its popularity, et cetera. And, arguably, I would say that's an advantage. I'm not saying: please use Python for HPC instead of Julia. Probably vice versa, actually, but still, it's an interesting question to see how it can perform. OK, so let's check it. I will present a brief benchmark, a very tiny one, that we have come up with in relation to this project, and it uses Numba. Numba is a just-in-time compiler that translates a subset of Python and NumPy into machine code that is compiled at runtime using LLVM, OK? So here is the story about the super simple benchmark problem. It's related to numerical weather prediction. So you can imagine a grid of, well, numbers representing weather here. And numerical weather prediction, or part of numerical weather prediction, the integration part, involves solving equations for the hydrodynamics, that is, the transport of such a pattern in space and time, and, of course, the thermodynamics that tell you what's happening in the atmosphere. A super simplified picture. I'm not saying that's the whole story about NWP, but for benchmarking Numba, let's simplify it down to, in this case, a two-dimensional simple problem. You have a grid, x, y, some signal.
And if we look at just the transport problem, a partial differential equation for transport, we can see what happens if we move around such a signal, which could be some, I don't know, humidity, temperature, whatever, in the atmosphere, OK? So we have a sample problem. Here I'm showing results from a three-dimensional version of what was just shown. And let's start with the right-hand side plot: on the x-axis, the size of the grid. So if it's 8, it means 8 by 8 by 8, super tiny. If it's 128, it's 128 by 128 by 128. And wall time per time step is on the y-axis, OK? Green is a C++ implementation of one particular algorithm for this kind of problem, and orange is PyMPDATA, the same algorithm numerically, but a Python implementation. So here you see that actually Numba, the JIT-compiled version, outperformed C++, maintaining even better scaling for the tiny matrices, but they are kind of irrelevant for the problem. And please note that in both cases we have used multi-threading. So here on the left-hand side, you can see the number of threads on the x-axis and wall time per time step on the y-axis. And again, the green line is the C++ implementation. These two are two variants of the Python one, JIT-compiled with Numba: almost an order of magnitude faster execution, five times faster. And what's probably most interesting for now is that when you compare with just setting the environment variable that disables Numba's JIT, we jump more than two orders of magnitude up in wall time. So this is how Numba timing compares with plain Python timing. But there are two important things to be mentioned here. The Python package is written with Numba in mind, that is, everything is loop-based, which is the reason why plain CPython with NumPy performs badly. This line is kind of irrelevant, just a curiosity. On the other hand, the C++ version is kind of legacy. It's based on the Blitz++ library. Back then, when it was developed, Eigen didn't have support for multiple dimensions.
And it's object-oriented array processing, which was reported and measured to be around five times slower than Fortran 77 for this kind of small domain. It's not the same for larger domains. Anyhow, we can achieve high performance with Python. But what if we need MPI? We need message passing in our code. How would we use it? Let's say that, in a domain-decomposition spirit, we divide our domain into two parts. So the same problem, same setup, just half of the domain is computed by one process or node or anything that is distributed, that has different memory addressing than another worker. So this is how we want to use it. Why do we want to use MPI? Well, because despite the expansion in parallel computation, both in the number of machines and the number of cores, no other parallel programming paradigm has replaced MPI. At least as of 2013. And already in 2013, people were writing this, even though it's universally acknowledged that MPI is a rather crude way of programming these machines. Anyhow, still, let's try it. And let's try it with Python. So here is a seven-line snippet of code where we try to import Numba to get the JIT compilation of Python code. And then we use mpi4py, which is the Python interface to MPI. What do we do? We define some number-crunching routine, and we try to use MPI from it. And then we try to njit it. njit means the highest-performance variant of Numba JIT compilation. We try to JIT-compile this function and straight away execute it. What happens? It doesn't work. It cannot compile because Numba cannot determine the type of mpi4py.MPI.Intracomm, because it's a class. Classes do not work with Numba, at least not the ordinary Python classes. So something doesn't work. So the problem is that we have Numba, which is one of the leading solutions to speed up Python, and MPI, which is clearly the de facto standard for distributed-memory parallelization. We try to use them together, but it doesn't work. So, Stack Overflow. Let's Google it. Nothing. Let's Qwant it.
Nothing. Wrong search engine, right? Someone must have solved the problem. Nothing. Let's ask the Numba guys and the mpi4py guys. In 2020: you will not be able to use mpi4py's Cython code. It was not designed for such low-level usage. Well, okay, it's Cython. But I mean, it must be doable, right? We have two established packages. The aim is kind of solid and makes sense. So it must be doable. And 30 months later, 120 comments later, 50 PRs later from five contributors, on a totally unplanned side project, we are introducing numba-mpi. numba-mpi is an open source, kind of small Python project, which allows you to, let's jump here to the Hello World example, which allows you to use the Numba njit decorator on a function that involves rank, size, or any other MPI API calls within the Python code. As of now, we cover size, rank, send, receive, allreduce, broadcast, barrier. The API of numba-mpi is based on NumPy. We have auto-generated documentation. We are on PyPI and conda-forge. A few words about how it's implemented. Essentially, we start with ctypes, built into Python, to try to address the C API. There are some things related to passing addresses, memory, void pointers, et cetera, super interesting. Probably the key message here is that we are offering a send function that is already njitted, which means that you can use it from other njitted functions. We handle non-contiguous arrays from NumPy, so we try to be user-friendly. We then call the underlying C function, and that's kind of all. But really, there is the key line number 30. This one. Well, it looks like nothing, but, in principle, without it, Numba optimizes out all our code. Anyhow, these are the kind of things that you see when trying to implement such things. Unfortunately, there are quite a few more such hacks inside numba-mpi. The next slide is the kind of thing that you would prefer never to see, but it cannot be unseen, in a way, if you work with it.
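The flavour of the ctypes route, handing raw addresses to a C function as void pointers, can be shown with the standard library alone. The sketch below is not numba-mpi code and involves neither MPI nor Numba: it copies one buffer of doubles into another with `ctypes.memmove`, with the two buffers standing in for a send and a receive buffer. The helper name `copy_doubles` is an assumption made for illustration.

```python
import ctypes


def copy_doubles(values):
    """Copy `values` through raw void pointers, as a C API would see them."""
    count = len(values)
    # Two plain C arrays of doubles: a source and a destination buffer.
    src = (ctypes.c_double * count)(*values)
    dst = (ctypes.c_double * count)()

    # Raw addresses wrapped as void pointers, the form a C API typically takes.
    src_ptr = ctypes.c_void_p(ctypes.addressof(src))
    dst_ptr = ctypes.c_void_p(ctypes.addressof(dst))

    # ctypes.memmove wraps the C memmove function: (dst, src, byte count).
    ctypes.memmove(dst_ptr, src_ptr, ctypes.sizeof(src))
    return list(dst)


if __name__ == "__main__":
    print(copy_doubles([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Once data is reduced to an address and a byte count like this, no Python class needs to cross the compiled boundary, which is the obstacle the talk describes with the mpi4py communicator object.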
So please just think of it as a picture of some problems that we have tackled. We essentially wrote to the Numba guys asking how it can be done, and we got this kind of tool for handling void pointers from ctypes in Numba with Python, NumPy, et cetera. But well, that's utils.py, and that's it, and it kind of works. And how do we know it works? Because we test it. Let me hand the mic to Oleksii to tell you more about testing. Okay, it's focused. So I'm going to tell you about the CI that we have set up for numba-mpi. The CI was set up on GitHub Actions, as was said, and this is a screenshot of the workflow. We start by running pdoc, pre-commit, and Pylint: pdoc is for automatic documentation generation, Pylint for static code analysis, and pre-commit for styling. After that, if these steps were successful, we move to the main part, where we run our unit tests. This is not an example but the actual workflow file that we run. As you can see, we run the CI against multiple operating systems, different Python versions, and different MPI implementations. Here we should say a big thank you to the mpi4py team for providing the setup-mpi GitHub Action, because this has saved us a lot of time. So thank you, mpi4py. As for operating systems and MPI implementations: on Linux, we're testing against Open MPI, MPICH, and Intel MPI; on macOS, Open MPI and MPICH; and on Windows, of course, the MS-MPI implementation. But when we are talking about MPICH, there is a problem that has recently occurred: starting from version 4 of MPICH, it fails on Ubuntu on our CI for Python versions below 3.10. So if anyone has ideas how to fix it, please contact us; we will appreciate any help. Okay, so we are running the unit tests on different systems and so on. Let's see a sample unit test.
In this test, we are testing the logic of the wrapper of the MPI broadcast function, and the main thing that you should remember from this slide is that we test this function both in the plain Python implementation and JIT-compiled by Numba. We have also set up an integration test. The integration test is in another project, named PyMPDATA-MPI, and this is just a scheme of the test. We start by providing the initial conditions for the PDE solver, and these initial conditions are written to an HDF5 file. After that, we do three runs: the first with only one process, the second with two processes, the third with three processes. In the first we don't divide the domain, but in the other ones we divide the domain accordingly, and in the assert stage we just compare the results: we want the results to be the same for the different runs. These results are also written to the HDF5 file. An interesting fact is that everything works on Windows except installing the HDF5 package with MPI-IO enabled for concurrent file access; we have trouble setting that up on Windows, but everything else works fine. There is also an independent use case, the py-pde project, which uses our package and is not developed by us, so there is a user. It is a Python package for solving partial differential equations. It focuses on finite differencing, and the equations are provided as strings. The solution strategy is as follows: start by partitioning the grid onto different nodes using numba-mpi; after that, parse the expressions using SymPy and compile the result using Numba; and then iterate the PDE, exchanging boundary information between the nodes using numba-mpi.
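The integration-test idea described above (same initial condition, one, two and three "processes", identical results) can be sketched without MPI at all. The following is only an illustration in plain Python, not the project's test: the function names, the upwind scheme and the periodic 1D domain are assumptions made for the example, and the list-based halo exchange stands in for the real send and receive calls.

```python
def upwind_step(chunk, left_halo, courant=0.5):
    """Advance one chunk by one upwind advection step.

    left_halo is the last cell of the neighbouring chunk on the left.
    """
    padded = [left_halo] + chunk
    return [padded[i + 1] - courant * (padded[i + 1] - padded[i])
            for i in range(len(chunk))]


def run(initial, n_parts, n_steps):
    """Solve on a periodic 1D domain split into n_parts contiguous chunks."""
    n = len(initial)
    bounds = [n * p // n_parts for p in range(n_parts + 1)]
    chunks = [initial[bounds[p]:bounds[p + 1]] for p in range(n_parts)]
    for _ in range(n_steps):
        # Stand-in for the MPI halo exchange: every chunk receives the last
        # cell of its left neighbour (index -1 wraps around: periodic domain).
        halos = [chunks[p - 1][-1] for p in range(n_parts)]
        chunks = [upwind_step(chunks[p], halos[p]) for p in range(n_parts)]
    return [value for chunk in chunks for value in chunk]


initial = [1.0 if 3 <= i < 6 else 0.0 for i in range(12)]
results = {n_parts: run(initial, n_parts, n_steps=10) for n_parts in (1, 2, 3)}

# The "assert stage": the decomposition must not change the answer.
assert results[1] == results[2] == results[3]
```

Because every cell sees exactly the same arithmetic regardless of how the domain is split, the comparison can be exact here; a real distributed solver may need a tolerance.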
Take-home messages. There is a common mismatch between the Python language and the Python ecosystem: we should remember that the language can be slow, but we should also consider the ecosystem around the language, the libraries that are available, and the different implementations. Python has a range of HPC solutions such as just-in-time compilation, GPU programming, multithreading, and MPI. As for numba-mpi, it is a package to glue MPI with LLVM JIT-compiled Python code. It is tested in CI on GitHub Actions, we are aiming for 100% unit test coverage, and there are already two projects that depend on this package. Here you can find the links for numba-mpi: the GitHub link, and the links to the packages on PyPI and Anaconda. We also welcome contributions. The first two issues I have mentioned earlier; we also welcome and encourage contributing a logo for numba-mpi, as well as adding support for other functions. We are also aiming to drop the dependency on mpi4py in our project, and the plan is to benchmark the performance of the package. We also want to acknowledge funding: the project was funded by the National Science Centre of Poland. So thank you for your attention, and we probably now have time for questions. Thank you very much. Any questions? A question from an MPI expert? Hello, thank you for the talk. So the interface you are proposing is very close to the, let's say, C MPI interface: when you do a send, you work with a buffer. Or do you try to provide a somewhat higher-level interface, for example serializing some Python objects? That could be very useful.
Yeah, the interface is as thin as possible, very close to the C MPI. One of the reasons is that within Numba njitted code, things like serialization might not be that easy to do. There is no problem in combining mpi4py and numba-mpi in one code base, so when you are outside of the JIT-compiled code, you can use mpi4py, which has high-level things like serialization, et cetera. But within LLVM-compiled blocks, you can use numba-mpi for simple send, receive, and allreduce, I mean, without higher-level functionality. Having said that, we do, for example, transparently handle non-contiguous slices of arrays, so there are some things that are higher-level than the C interface, but in general we try to provide a wrapper around the C routines. Okay, thank you. Any other questions? Thanks for a great talk. It seems really interesting what you are working on. I have got a couple of questions, probably born out of ignorance, but I wondered if you could help me with them. Firstly, I was wondering why you went with making a separate package rather than trying to build this functionality on top of mpi4py. Would it have been possible to add the feature of making things JIT-compilable into mpi4py? And secondly, I was wondering about the MPI-IO thing that you were looking at with Windows: if that requires concurrent file access from separate processes, is that just completely a no-go on Windows? Because I understand that's something that the Windows kernel doesn't support. Thank you. Thanks, let me start from the second one.
So here, well, essentially it's a fun fact that everything else worked on Windows. We do not really target Windows, but it was nice to observe that it all works. It's one of those advantages of Python that you code and you don't really need to care too much about the target platforms, because the underlying packages are meant to work on all of them. And here everything works with Microsoft MPI; the only thing that actually was a problem for us was to even install h5py on Windows with MPI support. So we don't really know what the true bottleneck is, but even the documentation of h5py advises against trying. As for the first question, why do we develop a separate package instead of adding it on top of mpi4py? On the slide with the story of the package there was a link, in the bottom footnote, to an mpi4py issue where we asked whether it would be possible to add it. The goal of mpi4py is probably to provide a very high-level API for MPI in Python. Discussing with the developers there, we realized that this is probably not within the scope of a very high-level interface, so we started off with just a small separate project. But I mean, great idea, it could be glued together. As of now, we aim to drop the dependency on mpi4py, which we now use just for some utility routines, not for the communication or anything that is used by Numba. And that might be an advantage, because eventually you should be able to install numba-mpi with very few other dependencies. numba-mpi is written purely in Python, so to install it you do not need any Python-related C-compiled code, and you can do it quite easily. Okay, thank you very much. |
Running MPI applications on Toro unikernel |
All right, we'll get started. We have another talk on MPI, but I think a very different one: running MPI applications on the Toro unikernel. Exactly. Yeah. So, hello, everyone. I'm Matias. Here I'm going to talk about running MPI applications on the Toro unikernel. Roughly speaking, a unikernel is a way to deploy a user application closer to the hardware by trying to reduce the operating system interference. So, overall, it should perform better than deploying a user application on a general-purpose operating system. First, I would like to introduce myself. I am passionate about operating system development and virtualization technologies. I have worked at Citrix, TTTech, and Huawei, and I'm currently at Vates. Here I have my email and my GitHub profile, if you want to know more about what I'm doing. So, I'm going to start by presenting what exactly a unikernel is, then I'm going to go into the details of what makes Toro special, then I will show the current implementation of the MPI standard on top of Toro, and I will finish with a benchmark that I'm trying to do to see if the current implementation is working as expected or if there are things that could be improved. So, maybe you are already familiar with this picture. This is more or less how a user application is deployed, either in a virtual machine or on bare metal. What we have is the operating system, the user application, and the two different modes, ring 3 and ring 0, which are the different privilege modes of the x86 processor. In general, when a user application wants to open a file, send a packet, or whatever, it's going to trigger a syscall, and then there is going to be a switch of the mode in which the processor is running, from user space to kernel space. The request is going to be processed there, in kernel space, and then control comes back, right?
In general, when we look at what we have inside the kernel, we have different components, for example the scheduler, the file system, different drivers, and so on. In particular, the scheduler is going to choose the next process to be executed. One of these processes, or several of them, is going to be your MPI application, for example. So, if you deploy your MPI application on a general-purpose operating system, your application is going to compete with other processes in the system for sure. The scheduler also has some policy, which decides which process is going to run next. We also have components like the file system, and since we have a general-purpose operating system, we're going to have several drivers for different file systems, different devices, and so on. What some people observed was that there was too much generality in using a general-purpose operating system for a single-purpose application, like an MPI application. So some people came up with a new architecture, which they called the unikernel. You have some projects there, like OSv, MirageOS, Unikraft, or NanoVMs. What they do is take the user application and compile it together with the kernel itself. So, in the end, what you have is a single binary that is going to be deployed, either in a virtual machine or on bare metal, right? So, instead of having syscalls, as we do with a general-purpose operating system and its different modes of execution, in the case of a unikernel we simply have function calls, which are cheaper than syscalls.
In general, the projects that I presented before all conform to the POSIX standard, which means that if you have an application written in C that conforms to POSIX, you can theoretically compile it with the unikernel without any modification of the user application. In reality, this does not happen: most of the time, the POSIX implementation of the unikernel is not complete, so you have to do some work. You cannot just take your application, compile it, and generate something; it doesn't work out of the box in most cases, right? In this context, Toro is also a unikernel. It's an application-oriented unikernel, and the idea of Toro is to offer an API which is dedicated to efficiently deploying parallel applications. Toro is not POSIX compliant: even if the names of functions like open, close, and so on are more or less the same, the semantics of these functions are slightly different, so I would not say that it is POSIX compliant in that sense, and I will explain that later. So, let's say that the three building blocks of the Toro unikernel are memory per core, a cooperative scheduler, and core-to-core communication based on VirtIO. Here I'm talking about the architecture of the unikernel; I'm not yet talking about how we're going to build an application to compile with Toro, right? I'm going to explain these three points. First, in the Toro unikernel we have memory dedicated per core. At the beginning, we split the whole memory into regions and assign these regions to the cores, and for the moment the size of these regions is simply determined by the number of cores that we have.
This makes the memory allocator, for example, quite simple. We have one allocator per core, which means that we don't require any synchronization in the kernel to allocate memory for a core. We call it per-CPU data, let's say. So, for example, if we have a thread on core one and it wants to allocate memory, it's always going to get it from the same region, and the same happens on core two, which is going to use region two. The idea is that by doing this, we can then leverage technologies like HyperTransport or Intel QuickPath Interconnect, in which we can say: this core is going to access this region of memory, and if it accesses another region, it's going to pay a penalty, right? Talking about the scheduler, in Toro we only have threads. We don't have processes, which means that all threads share the same view of the memory. We have mainly one API to create a thread, called BeginThread, and it has a parameter to say on which core the thread is going to run. The scheduler is cooperative, which means that it is the thread that calls the scheduler, which then chooses another thread; this relies on the API call SysThreadSwitch. Most of the time, this is invoked because we are going to be idle for a while, or, for example, because we're going to do some I/O. The scheduler is also very simple. We have, again, per-CPU data, so we have one queue per core, and the scheduler simply chooses the next thread that is ready and schedules it. This means that we also don't require any synchronization at the level of the kernel to schedule a thread, so each core runs independently of the others.
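The cooperative, per-core scheduling described above can be illustrated with Python generators. This is a toy model under stated assumptions, not Toro's implementation (Toro is not written in Python): the `Core` class, `begin_thread`, and `worker` are hypothetical names, a `yield` plays the role of the voluntary switch back to the scheduler, and the cores are simulated one after another instead of running in parallel.

```python
from collections import deque


class Core:
    """One core with its own ready queue; no other core touches it."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.ready = deque()

    def begin_thread(self, thread_fn, *args):
        # The creator decides on which core the thread lives.
        self.ready.append(thread_fn(self.core_id, *args))

    def run(self):
        # Round-robin over ready threads until all have finished.
        while self.ready:
            thread = self.ready.popleft()
            try:
                next(thread)               # runs until the thread yields
                self.ready.append(thread)  # still alive: back of the queue
            except StopIteration:
                pass                       # thread finished


def worker(core_id, name, steps, log):
    for i in range(steps):
        log.append((core_id, name, i))
        yield  # cooperative switch: hand control back to the scheduler


log = []
cores = [Core(0), Core(1)]
cores[0].begin_thread(worker, "A", 2, log)
cores[0].begin_thread(worker, "B", 2, log)
cores[1].begin_thread(worker, "C", 1, log)
for core in cores:  # real cores would run in parallel; here one after another
    core.run()
print(log)
```

Since each queue is only ever touched by its own core, the model needs no lock, which is the property the talk attributes to the per-CPU design.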
Finally, I am going to talk a bit about how cores communicate. Basically, each core has one dedicated reception queue for every other core in the system, so we have one-to-one communication. This relies on two primitives, send and receive-from, which take the destination core and the core from which we want to get a packet, respectively. These two primitives are the ingredients to then build more complicated APIs like MPI_Gather, MPI_Bcast, and MPI_Scatter; they are the building blocks for those APIs. To implement this core-to-core communication, I used VirtIO, following the specification. I will talk a little bit about this without going too much into detail, just so you understand roughly how communication between cores is done. As I said before, each core has one reception queue for every other core in the system. This means that, for example, if core one wants to get packets from core two, it has a reception queue, and if core one wants to send a packet to core two, it has a transmission queue. The number of queues is going to be different if you have three cores, for example, because the virtqueues are dedicated. Basically, a virtqueue is made of three ring buffers. The first is the buffer that only contains descriptors to chunks of memory. The second is the available ring, and the third is the used ring. The available ring holds the buffers that core one is exposing to core two. So if core two wants to send a packet to core one, it gets a buffer from the available ring, puts the data in it, and then puts it in the used ring.
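The buffer hand-off just described can be modelled in a few lines. This is a deliberately simplified sketch, not the VirtIO memory layout and not Toro's code: the class name `VirtQueueSketch` and its methods are made up for the example, Python deques stand in for the rings, and there is one queue per ordered (sender, receiver) pair.

```python
from collections import deque


class VirtQueueSketch:
    """Toy model of one dedicated queue between a sender and a receiver core."""

    def __init__(self, n_buffers=4, buf_size=64):
        self.buf_size = buf_size
        # "Descriptor table": indices refer to chunks of memory.
        self.buffers = [bytearray(buf_size) for _ in range(n_buffers)]
        # "Available ring": buffers the receiver exposes to the sender.
        self.available = deque(range(n_buffers))
        # "Used ring": buffers the sender has filled, with the payload length.
        self.used = deque()

    def send(self, payload):
        """Called only by the sending core."""
        if len(payload) > self.buf_size:
            raise ValueError("payload larger than one buffer")
        if not self.available:
            return False  # receiver has not recycled any buffer yet
        index = self.available.popleft()
        self.buffers[index][:len(payload)] = payload
        self.used.append((index, len(payload)))
        return True

    def recv(self):
        """Called only by the receiving core."""
        if not self.used:
            return None
        index, length = self.used.popleft()
        data = bytes(self.buffers[index][:length])
        self.available.append(index)  # expose the buffer again
        return data


# One dedicated queue per ordered pair of cores: 3 cores need 3 * 2 queues.
queues = {(src, dst): VirtQueueSketch()
          for src in range(3) for dst in range(3) if src != dst}
queues[(2, 1)].send(b"hello from core 2")
print(queues[(2, 1)].recv())
```

With exactly one producer and one consumer per queue, each ring is written at one end by one core only, which is why the talk can claim that no lock is required.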
This is basically how VirtIO works. If you are familiar with VirtIO: in this case the consumer of the available ring is core two, whereas if you are in a hypervisor and implementing a VirtIO device, the consumer is not going to be core two but the device model, QEMU, for example. But it's the same scheme. Since we have one producer and one consumer, we can access the virtqueue without any synchronization, at least as long as we have only one consumer. So you don't require any lock, for example, to access the virtqueue. So yeah, I have already talked too much, and I don't know how much time I have left, but I wanted to show some examples of the implementation; maybe that's more fun than all the slides. So how do we deploy an application using Toro? We have the MPI application, which is a C application for the moment, and you compile it and link it with a unit that contains the implementation of the MPI functions; for example, MPI_Bcast, MPI_Gather, and so on are implemented at this level, in the MPI interface. This unit in turn uses some API from the unikernel. In the end, what you get is an ELF binary that can be used to deploy your application, either as a virtual machine or on bare metal. You don't have any operating system in between: you have only your application, the threads, and so on, and nothing else. As for how it is deployed: if you take an MPI application written in C, what happens is that we take the main and instantiate it once for every core in the system, as a thread.
So, to benchmark the current implementation: I'm not very familiar with the MPI world, I come from another domain, so I was not really sure how I should benchmark such an implementation. I chose the OSU micro-benchmarks, maybe you know them, maybe not, and I just picked one of them, for example MPI barrier. The benchmark itself is quite simple, so I decided to implement it. I could not take the benchmark as it is; I had to do some rework to make it work. My idea was then to see how it behaves when deployed as a single VM with many cores. The hardware that I used is this one. Since I'm not familiar with the high-performance computing world, I'm not really sure if this is hardware that you often use. It's a fairly new Intel machine; you can get it at Equinix, and the price is four euros per hour. So I launched the test and tried to measure things. I was measuring the latency, taking into account the maximum latency over four, eight, sixteen, or thirty-two cores. I am getting values on the order of microseconds. Then I found this paper, which also used this benchmark to measure their platform, and they were reporting around 20 microseconds and 13 microseconds on their platform. In any case, I would be very cautious about this graph, because I was getting a lot of variation in the numbers. For example, I was testing on a machine with thirty-two cores, and the VM already had thirty-two vCPUs. You should not test on that sort of setup, because one of the threads is going to compete with the others, with the main one of the host, so you should always test with fewer vCPUs than physical cores.
And, yeah, the idea is to continue doing this, I mean, to improve the way I am measuring this, and also to try different hardware. At the same time, I found a lot of bugs in the unikernel by doing this. For example, at the beginning it only supported more or less four cores, so I went from four to thirty-two; well, it was a number in a constant. But anyway, I found many bugs when I was doing this. This is just a proof of concept and a work in progress, so don't take it too seriously. I don't want to jump to any conclusions from this. And, yeah, it was fun to do anyway. So that's all. I don't know if you have any questions. So you said this runs on bare metal. Sorry? The unikernel runs on bare metal. Yeah, some of them do. How do you even install it? I mean, operating systems are kind of complicated, right? Sorry? How do you install it on bare metal? Like, if I had this, how would I install it on bare metal? Is there an installer? An installer, you mean? Yeah. No, you can just use some device to boot from, for example. So it's bootable? Yeah, that's it. Okay. There are many ways to do that; you don't have to install it. You can boot from a removable device, for example. Any questions? Thanks. Thank you very much. |
MUST: Compiler-aided MPI correctness checking with TypeART |
Okay, we're good to get started with one more MPI talk, but I think a very different one compared to the others. Hopefully. Compiler-aided MPI correctness checking. Yeah. Thank you. So my name is Alexander Hück, and today I'm going to talk about the dynamic MPI correctness tool called MUST. In particular, I'm going to talk about the compiler extension called TypeART, which is supposed to help with MPI type correctness checking. First of all, as we heard before, the Message Passing Interface is the de facto standard for distributed computations in the HPC world, right? It defines a large set of communication routines and other things, and it's also designed for heterogeneous cluster systems where you have different platforms that communicate and compute something. However, in that sense, it's also a very low-level interface where you have to specify a lot of things manually, and in general you can expect only a little error checking from the library itself. For a simple MPI send operation, the user is required to specify the data, which is transferred as a typeless void buffer. The user has to specify the length of the buffer and the data type manually, and the message envelope, so the destination of the message, the communicator, and so on, also has to be specified manually. So there's a lot of opportunity to make a mistake, basically. And this is a quick question to you: if you look at this small code, try to figure out how many errors you can spot in this small example. Just try to look at every corner, basically. While I'm talking, I can also spoil it: I'm going to show you every issue in this small example in a couple of seconds, so to speak. When I first looked at it, when my colleague Joachim showed it to me, I couldn't find the simplest one, which was a bit crazy to me. Sometimes you can't see the forest for the trees. OK, so the most basic one: we don't call MPI_Init, right?
In MPI applications, that's usually the very first call you're supposed to make, where you initialize the MPI environment. Likewise, if you look at the end of the program, we do not call MPI_Finalize. So those are two simple mistakes. But in total, we have eight issues. I don't know how many you found. I'm not going to talk about each one of them, but if you look at each individual issue, it's quite easy to guess that it could happen to you as well. And these are the pointers to where they are. In particular, I want to talk about the receive-receive deadlock, where, for instance, two processes wait on each other without being able to continue. You can argue that all those issues, except maybe the deadlock, could be found by the MPI library itself. But typically on HPC systems, the library does not do any checking, for performance reasons. That's why many of these issues will maybe cause crashes for unknown reasons or just produce some strange results. Well, that's why the dynamic MPI correctness tool MUST was developed in the past. It is a tool that checks for issues during runtime and produces reports like this one when it finds some. This is a report of the deadlock we have seen in the example code, where the message itself just describes that there's a deadlock. In the bottom left, you can see a wait-for graph, which shows you which rank waits for which other rank, causing the deadlock. This helps you see where the deadlock occurs and why it occurs. MUST can also produce so-called call stack information, where you can see the path from the main of the program to the origin of the deadlock, but this was omitted here. Okay. So, to facilitate correctness checking for MPI, MUST uses a so-called distributed agent-based analysis, which means that you have your normal MPI application with four ranks, four processes that communicate as you would expect, as the user wrote it.
But MUST will also create an analysis network, which helps you do local analysis and also distributed analysis. If you think about a deadlock, you need information from more than one process to figure out that a deadlock occurred in your program. MUST creates this network completely transparently to the user, so you would use MPI_COMM_WORLD and any other communicator as normal, and MUST takes care of creating such a network. What is maybe the focus of the talk today is the local analysis, where we look at process-local checks. If you think about MPI type correctness of a send operation, you can, or rather should, do a lot of things locally, and this is the focus. So, for MPI type correctness, we focus today on the buffer, the user-specified length, and the user-specified MPI data type. MUST can already detect mismatches of, for instance, a send and receive communication pair: MUST creates a so-called type map, looks at the user-specified buffer size and the user-specified data type, and compares them to the corresponding receive operation. If there is a mismatch, obviously there is going to be an issue, and MUST creates a report about that. This of course also works for collective communication, where you can make sure that all ranks call, for instance, a broadcast operation with the same data type. However, since MUST only intercepts MPI calls, it cannot look behind the interface, that is, it cannot see what happens in user code. So we cannot reason about the type of the void buffer data, and this is why we were motivated to create the tool TypeART, which helps with figuring out what the memory allocation is that you pass into your MPI calls.
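The send/receive comparison mentioned above can be pictured as comparing flattened type signatures. The sketch below is an illustration only and makes no claim about how MUST implements it: datatypes are modelled as tuples of basic type names, `PARTICLE` is a hypothetical derived datatype, and the matching rule is simplified to "the send signature must be a prefix of the receive signature", on the assumption that a receive may post a larger count than is actually sent.

```python
def type_signature(datatype, count):
    """Flatten (datatype, count) into a tuple of basic type names."""
    return tuple(datatype) * count


def matches(send, recv):
    """True if the send signature is a prefix of the receive signature."""
    send_sig = type_signature(*send)
    recv_sig = type_signature(*recv)
    return send_sig == recv_sig[:len(send_sig)]


# Basic datatypes are one-element signatures in this toy model.
MPI_INT = ("int",)
MPI_FLOAT = ("float",)
MPI_DOUBLE = ("double",)
PARTICLE = ("int", "double", "double")  # hypothetical derived datatype

print(matches((MPI_INT, 4), (MPI_INT, 4)))       # same type, same count
print(matches((MPI_DOUBLE, 4), (MPI_FLOAT, 4)))  # element type differs
print(matches((MPI_INT, 4), (MPI_INT, 2)))       # receive buffer too short
```

Note that this check only sees what the user declared in the two calls; it says nothing about what the buffer behind the void pointer really contains, which is the gap the talk goes on to address.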
So, if you look at this small example on the right side, which is completely process-local, there is some memory allocation: in this example it's a double buffer that was allocated by malloc, let's say. The question now becomes: how can we make sure that the data buffer, which is a void buffer, fits the user-specified buffer size, so that it is of length buffer size, and that it is also compatible with the MPI_FLOAT type? Of course, we can already see that between double and MPI_FLOAT there's a type mismatch, but MUST cannot answer such a question without further tooling, because it just intercepts MPI calls. Okay, so to show you that this is not an academic example, there are two well-known HPC benchmark codes which have some issues. One was reported in the past by others: there's a broadcast operation that uses a big int, which is an alias for a 64-bit data type, but the user specified MPI_INT, which is a 32-bit data type, for the broadcast operation. So there's an obvious mismatch that is likely a problem. And in MILC, there's an allreduce operation where the user passes in a struct with two float members, and it's interpreted as a float array of size two. That is benign, to be honest, but it could be a portability issue in the future, you know: depending on the platform, maybe there's padding or whatnot, and maybe it's an illegal operation, so this could also become an issue in the future.
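The double versus MPI_FLOAT mismatch from the example can be reproduced with Python's ctypes, where an array object still knows its element type and length. This is a sketch of the question being asked, not of the tool: the mapping `MPI_TO_C` and the function `check_buffer` are hypothetical helpers, and the MPI datatype names are used purely as labels.

```python
import ctypes

# Hypothetical mapping from MPI datatype names to the matching C types.
MPI_TO_C = {
    "MPI_FLOAT": ctypes.c_float,
    "MPI_DOUBLE": ctypes.c_double,
    "MPI_INT": ctypes.c_int,
}


def check_buffer(buffer, mpi_type, count):
    """Return a list of problems for passing buffer as (mpi_type, count)."""
    expected = MPI_TO_C[mpi_type]
    problems = []
    if buffer._type_ is not expected:
        problems.append(
            f"buffer holds {buffer._type_.__name__}, "
            f"call claims {expected.__name__}")
    if count > buffer._length_:
        problems.append(
            f"call claims {count} elements, buffer has {buffer._length_}")
    return problems


data = (ctypes.c_double * 8)()  # stands in for a malloc'ed double buffer

print(check_buffer(data, "MPI_DOUBLE", 8))  # no problems
print(check_buffer(data, "MPI_FLOAT", 8))   # type mismatch
print(ctypes.sizeof(data), 8 * ctypes.sizeof(ctypes.c_float))
```

In C, the same information is lost as soon as the pointer is cast to a void pointer, which is why it has to be recorded separately at allocation time.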
Well, from a high-level point of view, how does MUST work? You have your MPI application, and during runtime MUST intercepts all the MPI calls and collects all the state that is needed for deadlock detection and so on. We added TypeART, which looks at all the allocations that are passed to MPI calls for the local analysis of buffers. TypeART is a compiler extension based on LLVM, so you compile your code with our extension, and the extension instruments all allocations, be it stack or heap, which are related to MPI calls. We also provide a runtime, so during runtime we get callbacks from the target application for all allocations and all free operations, and so we basically have the state of the memory allocations in the target code. We also, of course, look at the allocations and parse out their types. A simple case is a buffer of type double; more complex cases would be structs or classes. We pass the serialized type information to our runtime, which then enables MUST to make queries. For instance, for an MPI send operation, MUST gives the TypeART runtime the typeless buffer, and the runtime returns all the type information necessary to ensure type correctness of the buffer handle. This is the whole high-level process behind it. If you then take a look at an example of a memory allocation, here is a small heap allocation of a float array. This all happens in LLVM IR; I'm just showing C-like code to make it easier to understand. We add a TypeART alloc callback, for which we need the data pointer, of course, and then a so-called type ID, which is just a representation of what we allocated and is later used for type checking. And of course we need the dynamic length of the allocated array to reason about where we are in the memory space, so to speak.
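The runtime side of this scheme (allocation and free callbacks feeding a table that can later be queried by address) can be sketched as follows. This is a toy model with made-up names (`AllocationTracker`, `on_alloc`, `on_free`, `get_type`), not TypeART's API or data structure; addresses are plain integers, and the element size is passed in explicitly for simplicity.

```python
import bisect


class AllocationTracker:
    """Toy runtime: remembers (type_id, count) for every live allocation."""

    def __init__(self):
        self._starts = []  # sorted start addresses of live allocations
        self._info = {}    # start address -> (type_id, count, elem_size)

    def on_alloc(self, addr, type_id, count, elem_size):
        """Callback the instrumented program would issue after allocating."""
        bisect.insort(self._starts, addr)
        self._info[addr] = (type_id, count, elem_size)

    def on_free(self, addr):
        """Callback the instrumented program would issue before freeing."""
        self._starts.remove(addr)
        del self._info[addr]

    def get_type(self, addr):
        """Return (type_id, elements remaining from addr), or None if unknown."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start = self._starts[i]
        type_id, count, elem_size = self._info[start]
        offset = addr - start
        if offset >= count * elem_size or offset % elem_size:
            return None  # past the end, or not on an element boundary
        return type_id, count - offset // elem_size


tracker = AllocationTracker()
tracker.on_alloc(0x1000, "float", count=10, elem_size=4)  # float[10]
print(tracker.get_type(0x1000))  # start of the array
print(tracker.get_type(0x1008))  # pointer to the third element
tracker.on_free(0x1000)
print(tracker.get_type(0x1000))  # no longer tracked
```

The interior-pointer case is the reason the dynamic length is recorded: a checker needs to know how many elements are left between the pointer handed to the MPI call and the end of the allocation.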
We also handle stack and global allocations. For stack allocations, of course, we have to respect the automatic, scope-dependent lifetime properties, and for globals we just register them once and then they exist in our runtime for the whole program duration. And of course, for performance reasons, you can imagine that the fewer callbacks the better, hence we try to filter out allocations where we can prove that they are never part of an MPI call and just never instrument those. This is basically possible on LLVM IR by data flow analysis. So in the function foo we have two stack allocations, and then we try to follow the data flow, where we can see that A is passed to bar, and inside bar there's never any MPI call, so we can just say, okay, we do not need to instrument this, this is discarded. Likewise for foobar, we can see that B is passed. If it's in another translation unit, we would need to have a whole-program view, which we support, but other tools have to create such a call graph with the required information. Anyway, if we had this view, we could see that foobar also does not call MPI, so neither stack allocation needs to be instrumented, which helps a lot with the performance. Okay, so the type ID which is passed to the runtime for identification works as follows. Built-in types are obviously known a priori, so we know the type layout: float is 4 bytes, double is 8 bytes, depending on the platform, of course. For user-defined types, which means structs, classes and so on, we basically serialize the layout to a YAML file together with the corresponding type ID, so we can match those during runtime. There we have the extent, the number of members, the offsets, byte offsets basically from the beginning of the struct, and also the subtypes are listed, which can then be used for making type queries about the layout and stuff like that.
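The filtering idea from the foo/bar/foobar example can be sketched as a reachability question on a call graph: an allocation only needs instrumentation if it is passed to a function that may eventually reach an MPI call. The sketch below is an illustration, not the actual LLVM data flow analysis; `may_reach_mpi` and `allocations_to_instrument` are hypothetical names, and functions whose bodies are not visible are conservatively assumed to reach MPI.

```python
def may_reach_mpi(call_graph, start):
    """Depth-first search: can `start` (transitively) call an MPI function?"""
    stack, seen = [start], set()
    while stack:
        func = stack.pop()
        if func in seen:
            continue
        seen.add(func)
        if func.startswith("MPI_"):
            return True
        if func not in call_graph:
            # Body not visible (e.g. another translation unit): be conservative.
            return True
        stack.extend(call_graph[func])
    return False


def allocations_to_instrument(call_graph, passed_to):
    """passed_to maps each allocation name to the functions it is passed to."""
    return {
        alloc
        for alloc, callees in passed_to.items()
        if any(may_reach_mpi(call_graph, callee) for callee in callees)
    }
```

In this model, a whole-program call graph makes the filter stronger: once `foobar` is visible and shown not to call MPI, the allocation passed to it can be dropped as well.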
And then of course, MUST needs to have some API to figure out type correctness, and this is provided by our runtime, which has quite a few API functions. The most basic one would be this typeart_get_type function, where you put in the MPI buffer handle, and what we put out is the type ID and the array length. Then you can use the type ID subsequently, for instance in this call, where you put in the type ID and you get out the struct layout I mentioned earlier, and this way you can assemble some iterative type checking, which is done in MUST. And then, putting it all together, if you want to use our tooling, you would first of all need to compile your program with our provided compiler wrapper, which is a bash script and does the bookkeeping required to introduce the instrumentation, the TypeART stuff. So you exchange your compiler, that's the first step. It's optional; you don't have to do it if you don't need this local TypeART checking. Then you would also need to replace your mpiexec or mpirun, depending on the system, with mustrun, which also does some bookkeeping for MUST to execute the target code appropriately, spawn all the analysis agents, the agent-based networking and so on. Then the program runs as normal, and a MUST output file is generated with all issues found during the execution of your program. As a side note, maybe: as I said, MUST uses this agent-based network, and in the simplest case of the distributed analysis there's an additional process needed for the deadlock detection and so on. So for Slurm or whatnot you need to allocate an additional process; however, you don't need to specify it in the mustrun stuff, it happens automatically in the background. All right, so that's it. If you look now at what the impact of our tooling is, well, that's quite dependent, as I kind of alluded to, on how many callbacks you have, how many memory allocations you actually have to track, and how good we are at filtering them. So here are two examples, LULESH and
Tachyon, which are again quite well-known HPC benchmarking codes. LULESH is quite favorable for our presentation, because there are not many callbacks and hence our runtime impact is almost non-existent, so to speak. You can see that, compared to vanilla, without any instrumentation and without our tooling, TypeART has almost no impact, and enabling the TypeART analysis has almost no additional impact. For Tachyon the picture looks quite different, as you can see: there's an overhead factor of about three when you introduce TypeART. This is because there are a lot of stack allocations that we cannot filter, so we track a lot of stack allocations and the runtime impact is quite high. This is reflected by those runtime and static instrumentation numbers. First of all, the above table shows you what we instrument during compilation. You can see that there are some heap and free operations that we find and instrument, and there are some stack allocations and globals that we instrument. Of course, those numbers do not represent the runtime numbers, because heap and free operations are sometimes written in a way that they are centralized in a program; that's why those numbers are not as high as you would expect. For stack allocations we find 54, and out of those 54 we can filter, for LULESH at least, 21%. Globals are much easier to follow along the data flow in LLVM IR, so we can filter many more of them and much more effectively. Going to the runtime numbers, which are basically the number of callbacks that happen during our benchmarking, we can already see that the high overhead which we observed in Tachyon is explained by the almost 80 million stack allocation callbacks that we have to track during runtime, which is a lot of context switching and so on, which is not good for the runtime. All right, so this is already my conclusion. What we have done is basically that with TypeART, MUST can now check all phases of the
MPI communication with respect to type correctness. The first phase, which MUST could already check, is this one, which is basically the message transfer. However, there is also the phase of message assembly, where you go from the user process into the MPI process, and you have to check this. And of course, if you think about it, you would also have to check the message disassembly, where you go from the received data to your user program again. So TypeART enables these kinds of local checks to ensure type correctness. Thank you very much. Any questions? Yeah, so I really liked the talk, I thought it was really interesting. One thing I wanted to ask was: how does one get MUST, how do you install it? Is it available from distribution package managers, or is it more that you have to compile it yourself? Good question. I think you have to compile it yourself, even on our HPC system. But it's not that tedious to compile, I think; maybe I'm biased. Just go to the website and there's a zip file that includes every dependency that you need, and I think the documentation is quite straightforward. You need, of course, maybe Open MPI installed, but not much more, to be honest, and then you should be good to go. Yeah, I think it's CMake-based; I don't know if you have problems with that, but it should be straightforward to try it out. Thank you. Another question there, I'm on my way. So, on the type analysis that you do: if you look at malloc and it has a type cast, then you know what the type is. But if it doesn't have a type cast, if you malloc into a void pointer, and if the number of bytes you are allocating comes from some constant or macro or some argument, how far do you follow it? And if you can't see it, do you give a warning, do you crash?
That's a good question, and that's basically a fundamental problem, right? We have to have some expectations of the program, and our expectation is that the malloc calls are typed. Otherwise we would just track it as a chunk of bytes, and I think our analysis is quite forgiving, so we would just say, okay, this is a chunk of bytes, it fits the buffer, and this is fine. Yes, you kind of lose that, right? If you just know it's a chunk of bytes, then you lose the alignment checks. Say you malloc a struct and then you do some pointer magic for your MPI buffer and you point between members, into the padding area. Only if TypeART knows about the malloc'ed struct can it warn that you are doing some illegal memory operations. If we just see a void pointer due to the typeless malloc, then we have basically lost. Anyone else? Do you have any thoughts on using Rust, which is a much more memory-safe language than C and C++? Have you looked at it? Not really, not yet. For now we have so much to do in the C and C++ world to support typing better, to get more robustness and so on, so not yet, to be honest. Maybe all that work becomes irrelevant if Rust gets popular enough. I think in general, and maybe I'm a complete newbie when it comes to Rust, the MPI support itself is still in the works. I read some papers about generating bindings for MPI which are inherently type safe; not sure how that goes. I think everyone will be happy if Rust or some other type-safe language becomes more used by people and this kind of work becomes irrelevant, but while people still use C++, this is very relevant. That pays my bills. Thank you very much. Thank you. |
Link-time Call Graph Analysis to facilitate user-guided program instrumentation
An LLVM based approach |
Okay, so we're switching topics a bit, away from MPI to something at least a little bit different: link-time call graph analysis. All right, thank you. So, yeah, we're going to talk about a user-guided program instrumentation approach, and especially a link-time call graph analysis extension to that project, which makes it more usable. To give some background on this work, I'm involved in a project called exaFOAM that deals with the OpenFOAM computational fluid dynamics toolbox. That's a very complex code, quite large. The goal here is to improve the performance of OpenFOAM for HPC systems, especially for the exascale era. One of the things we do is develop empirical performance models, and for that we have developed a workflow. We start with several measurements where we get an initial overview and identify hotspots. Then, based on these hotspots, we do an analysis of the critical kernels. And finally, we can do the empirical modeling in order to find scalability bugs and predict performance when scaling out. Especially for the focused measurements and the modeling, you need quite accurate and reliable methods to be sure that the data you collect is right and that you have the right level of detail. What we use for that is code instrumentation. Just to give the background, I want to give an example. In GCC and Clang, you have the -finstrument-functions flag, which does a very basic instrumentation. If you activate that flag, the compiler inserts these function enter and exit probes, which then interface at runtime with a profiling tool that records runtime or, yeah, more involved metrics, maybe performance counters, stuff like that.
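To make the idea of enter and exit probes concrete, here is a rough Python analogue. It is not how `-finstrument-functions` works internally (the compiler inserts the probes into compiled C or C++ code); the decorator `instrument` and the two dictionaries are invented for this sketch and simply stand in for a profiling tool.

```python
import time
from collections import defaultdict
from functools import wraps

call_counts = defaultdict(int)    # function name -> number of calls
total_time = defaultdict(float)   # function name -> accumulated seconds


def instrument(func):
    """Wrap func with an 'enter' and an 'exit' probe."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # enter probe
        try:
            return func(*args, **kwargs)
        finally:
            # exit probe: runs even if func raises
            total_time[func.__name__] += time.perf_counter() - start
            call_counts[func.__name__] += 1

    return wrapper


@instrument
def tiny(x):
    return x + 1


@instrument
def work(n):
    return sum(tiny(i) for i in range(n))
```

Calling `work(100)` fires the probes once for `work` and a hundred times for `tiny`, which hints at why instrumenting every small, frequently called function inflates run time and why a selection mechanism is needed.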
And the big problem with this instrumentation approach, especially compared to other mechanisms like sampling, is that it can increase run times by orders of magnitude if you're not careful, if you just instrument every function. So you have to have some kind of selection mechanism in place to prevent that. The method that's most commonly used is profile-based filtering, either manually, where you look at the profile that is generated by the instrumented measurement and then determine which functions are, for example, very small or called very often and don't do much work, so that they can be filtered out. And there are also tools like Score-P which can help in generating these filters. However, this is done on a purely profile basis, and there's no calling context involved in the decision. So there are some other, call-graph-based approaches, where you have a static call graph that is then used to make the selection. I want to highlight two. There's PIRA, which does automatic iterative refinement. So we start with a static selection based on the statically collected call graph, and then iteratively compile the program with instrumentation, execute it, measure it, and then look at the profile in order to determine what to filter out. And then the tool we're working on is the CaPI tool, which is short for compiler-assisted performance instrumentation, and that one is more focused on allowing the user to specify what they want to measure, so they can say, okay, I'm interested in MPI communication, so that is what I want to measure. So here's a very high-level overview of how CaPI works. You have your source code, and that is put into the static analysis in order to generate the call graph. At the same time, the user specifies the measurement objective in the form of a selection spec. We have a simple DSL for that.
And this together is then fed into the CaPI tool in order to generate the instrumentation configuration, which hopefully has low overhead. So let's consider an example for that. We might have the following scenario: I want to record all call paths that contain MPI communication. Additionally, I want to measure functions that contain loops with at least 10 floating-point operations, and I don't care about system headers or inline functions. This is how that looks as a CaPI spec. I won't get into the details of the syntax, but you can combine different selection modules, where you start with the set of all functions and then, here, select inline functions or system header functions. Then you can combine them to produce, in the end, one set that you want to instrument. For this particular example, this reduces the number of instrumented functions by 74 percent for our OpenFOAM test case. So the call graph is the base data structure that we use for all the analysis and for the selection. It is currently generated at the source level by a tool called MetaCG. This can be a bit cumbersome for very complex applications such as OpenFOAM, because you need a separate source code analysis step, and that can be difficult to set up, especially if there are shared library dependencies and stuff like that. So it can be tricky. And so we looked into how we can maybe generate the call graph at different stages. This is what Tim is going to talk about, how we can generate the call graph at different program stages. And then we introduce the CaGe compiler plugin, where we evaluate how this can be done at link time. So, well, thank you. So, we were interested in whole-program call graphs. This is because CaPI, the tool that does our instrumentation, needs to be aware of every function inside our call graph, which means that it's necessary to have a whole-program view.
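The selection scenario described above (functions on call paths to MPI, plus functions with floating-point-heavy loops, minus inline and system-header functions) can be written down as plain set operations on a call graph. This sketch does not use the CaPI DSL, whose syntax is not shown in the talk; the function names, the metadata keys `loop_flops`, `inline`, and `system_header`, and the graph format are all assumptions made for illustration.

```python
def on_call_path_to(call_graph, targets):
    """All functions from which any target is reachable (targets included)."""
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    selected, stack = set(), list(targets)
    while stack:
        func = stack.pop()
        if func in selected:
            continue
        selected.add(func)
        stack.extend(callers.get(func, ()))
    return selected


def select(call_graph, meta):
    """Combine structural and metadata-based selectors into one set."""
    all_funcs = set(call_graph) | {c for cs in call_graph.values() for c in cs}
    mpi_funcs = {f for f in all_funcs if f.startswith("MPI_")}
    mpi_paths = on_call_path_to(call_graph, mpi_funcs)
    hot_loops = {f for f in all_funcs if meta.get(f, {}).get("loop_flops", 0) >= 10}
    excluded = {
        f
        for f in all_funcs
        if meta.get(f, {}).get("inline") or meta.get(f, {}).get("system_header")
    }
    return (mpi_paths | hot_loops) - excluded
```

The structure mirrors the description in the talk: each selector produces a set of functions, and the final instrumentation set is a combination of unions and differences of those sets.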
And the tools that were used, like MetaCG, were able to create a call graph, and they have the distinct advantage of being able to annotate metadata to this call graph. This is where the name MetaCG came from. This means that we can not only answer structural questions, like wanting to instrument every function that eventually calls down to some MPI call, but we can also instrument based on metadata, for example the loop depth specification or floating-point operations. And yes, there are multiple possible ways to generate a call graph. One of them is the source code. One of them is the intermediate representation that is basically part of every compiler, especially the LLVM compile pipeline, and machine code at the very end. So the basic idea is: well, we have the source code, let's do call graph generation on the source code, which is relatively easy. MetaCG does it on a source code basis. But since you feed every single source code file into MetaCG, your view of the program is limited to one translation unit at a time. So you then need to merge the partial call graphs of all translation units together to generate the overview of the whole program that you need. The information you then gather from your profiling maps very cleanly back to your source code, because once you find a function, well, it's named the same way in your source code. But on the other hand, what you write in your source code is not necessarily what's actually executed on the machine, right? Because there might be code transformations, constant propagation, dead code elimination, inlining. And this is actually something we want to be aware of if we are doing instrumentation; we do not want to instrument a function that doesn't exist anymore. Also, this merging of the translation units means that the user is involved. When using MetaCG, the user currently has to specify which source code translation units belong to one target.
And then the user manually has to tell MetaCG: these functions are all to be merged, please generate the call graph for that. And the user might not perfectly emulate the linker behavior. There are different resolution strategies that a linker might choose if there are identically named structs or classes. And, depending on how you implement your merging process, you might have slight differences between the call graph that you generated and what the linker will eventually do. So the other extreme would be: well, let's do it on the compiled machine code then. Reverse engineering tools like radare2 or Ghidra are able to generate call graphs on binary data just fine. Those have the very distinct advantage that this is actually what is run on the machine. There are no code transformations left. You have the advantage of being able to see machine code optimization passes if they influence the generated call graph. But, on the other hand, a lot of information, the metadata that we would also like to be able to instrument based upon, is lost as soon as we go down to machine code. Inlining has already happened, so there is no function annotated with "please inline" anymore. Also, pointer type information, as we heard in the talk earlier, gets lost as soon as we go down to machine code. And constness is also something that has to be inferred rather than being stated once we go down to machine code. And so we decided the best of both worlds is probably LLVM IR, because it's a heavily annotated representation. It is close enough to what will run on the machine that we have the ability to observe the code transformations. We are able to give more specific estimates of what the actual cost of a function might be, because we have a clearer way of tracking, for example, instruction counts, floating-point ops, and integer ops. On the other hand, it's also close enough to what the user actually wrote, because we're not down at the machine code yet.
And we can figure out the inlining stuff, the constness, the virtual functions. We can get type information in the IR. And if we do it at link time, we are not even limited to the translation-unit-by-translation-unit scope that source-code-based approaches have. So, if you have your pretty default compile pipeline, you have your source code, which forms a translation unit and gets fed into the compiler, which outputs the intermediate representation. Then the optimizer runs on that, and the optimized IR modules of multiple source code translation units are fed into the linker. We can do our call graph analysis inside the linker, solve our translation unit problem, and have all our information ready. To do this, we developed the CaGe plugin. CaGe stands for call graph embedding LLVM plugin. It basically generates a call graph using some of the LLVM tools and does some annotation and virtual function call analysis. This can run as part of the optimizer in the pipeline, but it can also run as part of the LLVM linker, also as a plugin, for which we use a slightly modified version of the LLVM linker, but the basic logic of running plugins in the LLVM linker was already there. So then we do a vtable analysis and our metadata annotation, because it's all available to us. And then we embed the result into the binary, which enables dynamic augmentation. I will come to that one later. So, at link time, we're basically doing static analysis. And as I already mentioned, we split our information into basically two types. One of them is structural information, like call hierarchies, call paths, call depth, the number of children, how deep we are in our path. And you also have virtual function calls, which are mostly structural, because once you have virtual and polymorphic calls, you have a set of functions that might be called through that pointer.
And so we can narrow down the possibilities of which functions are called, but we cannot actually determine statically that this particular function is getting called. So it's slightly metadata-based, but it's also mostly structural. On the metadata side, we have the instruction composition, so we can determine the relation between arithmetic operations and memory operations, for example, or we can generate local and global loop depth estimates. And then we have inlining information, which is metadata, because inlining is not a must for a compiler: just because you have specified inlining for a function doesn't mean that the compiler will actually inline the function. So it's partly structural information and partly metadata. You see there's no clear line between those, they blur at some points, but we can represent all of them in the metadata-annotated call graph. And if you remember, we were able to do dynamic augmentation. Well, what is dynamic augmentation? If you remember, each object contains the call graph that we generated, which means that the call graphs can be aggregated at runtime if a shared library is loaded. Because even if you are at link time, even if you can see all the statically linked objects, all the translation units that belong to your target, your binary might load a shared library, which, well, you're unaware of. So the idea is that, as soon as the main executable is loaded, it passes its embedded call graph on startup to a runtime collecting library. Then the main executable can load whatever shared object it wants. And if this shared object also contains a call graph, then the runtime collector gets passed this call graph on the first load of the shared object and can aggregate it, like merging. So we're basically back to the translation-unit-by-translation-unit approach, but now we're doing the merging on the basis of shared libraries, binaries, and executables.
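The dynamic augmentation step can be pictured as a collector that receives one embedded call graph per loaded object and merges it into a growing whole-program graph. The sketch below is a simplified model, not the actual runtime library: the class `CallGraphCollector`, its `register` method, and the dictionary-of-sets graph format are assumptions for illustration.

```python
class CallGraphCollector:
    """Toy runtime collector that merges per-object call graphs."""

    def __init__(self):
        self.edges = {}      # function -> set of callees, whole-program view
        self.loaded = set()  # objects whose graphs were already merged

    def register(self, object_name, embedded_graph):
        """Merge a graph on the first load of an object; ignore repeats."""
        if object_name in self.loaded:
            return False
        self.loaded.add(object_name)
        for caller, callees in embedded_graph.items():
            self.edges.setdefault(caller, set()).update(callees)
        return True
```

In this model the main executable registers its graph at startup, and each shared object registers once when it is first loaded, so a function that looked like a leaf in the executable's graph can gain callees from the library that defines it.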
And then we attach all this data together into one really big whole-program call graph, now including shared objects. And then we can export this, for example, and pass it back to CaPI for some further refinement of the instrumentation. Go ahead. All right. Thanks. So, to put it all together, for CaPI we have the call graph analysis approach that Tim just explained. For each object file, we have the call graph analysis and then the embedding. And then, on the runtime side, we can merge the main executable's call graph with those of the shared libraries as they are loaded. And we defer the instrumentation, using a dynamic instrumentation approach, in order to apply that selection dynamically. And, yeah, that's how it works. So, to summarize, we are developing the CaPI tool for instrumentation selection based on call graph analysis. And we have explored this CaGe plugin for generating the call graph information at link time, which allows whole-program visibility and this dynamic augmentation. Together with CaPI, we can use the embedded call graph to run the selection at runtime, thereby improving CaPI by making the compilation process and the analysis process more streamlined. This is under active development, so at this point we don't have a very detailed evaluation of performance and stuff like that. There are some concerns; for example, when you go to very big programs and do LTO, there might be performance problems. So there might be more work needed to make it viable in that regard. But, yeah, it works well in a prototype fashion. Yeah. Thank you. Any questions? Perhaps I was a bit distracted. So, I have two questions. If you can comment on lambda functions, OMP sections, so OpenMP sections of the code, not specific constructs, and the instruction cache. If you can comment on how those are handled, those three aspects, let's say. So, when it comes to OpenMP, we don't at the moment have any specific OpenMP profiling interface that we target.
So this might be something that is useful in the future. But for now, we just do the basic function instrumentation and select on that. Yeah. Then the other point was, sorry, could you repeat? So, do you want to comment on that? So, lambdas and caches. On caching, no, there is no logic to handle caching in any way. And regarding lambdas and OpenMP, if you're talking about profiling, you've got your answer right there. But if you're talking about generating call graphs in which lambdas and OpenMP runtime calls are available, then the call graph will actually figure out that there are OpenMP runtime calls and, if I remember correctly, will figure out that this function calls back into the OpenMP runtime, because once you are in IR, the runtime calls are actually carved out and all the pragmas have been removed from the actual source code. So we are aware of OpenMP, but we are not currently using that information for any profiling. But you could do metadata-based CaPI selection with it. So every call path that eventually leads to an OpenMP runtime call would be a valid instrumentation selection using CaPI. The question about the instruction cache is whether, and this is my ignorance, if you are introducing a lot of new instructions here in the code, or in the code that is being read, I was wondering whether too much data ends up there; maybe I didn't understand well. So this is more of a performance-related question, right? Okay, so, yes, of course, you introduce new instructions whenever a shared library is loaded, because we then add instructions that pass the call graph back to our runtime collecting facility, and we also introduce instructions because we are using a profiling approach, and those are function calls. So, yes, we are impeding the instruction fetching and instruction caching flow, which is also why instrumentation-based profiling has rather high overhead compared to sampling approaches, for example.
But as Sebastian said, we have not really extensively profiled our application, which is quite ironic, so we do not know how big the benefit or the impact actually would be. So, in your slide, you say you have a fork of LLD. The obvious question is: what's stopping you from upstreaming this? So, yes, we have a fork of LLD that basically just exposes the flag, load new pass manager plugin, and you pass the plugin, and it does the rest. What's currently holding us back from upstreaming is that it's not very well developed; it was cobbled together in half a week or something. And there already is a merge request on Phabricator that implements this exact functionality for, if I remember correctly, LLVM 9, which sat untouched for a year until it was totally abandoned and closed, and so we didn't actually figure out how to make this more interesting. Don't take that as a signal. People just move jobs or whatever, so try it again and bash people, and I can help you with that as well, to find the right people to get this in, because it seems like a simple and obvious thing to have. It isn't actually that hard. Apparently, there wasn't much interest in the community back in 2020? Yes, that's too long ago. We will polish it a little bit, and then hopefully get this one upstream. Thanks. Any other questions? No. Okay. Thank you very much. Thank you very much, Sebastian. |
How the Spack package manager tames the stat storm |
All right, it works, cool. So welcome, all, to this talk. It's my first time at FOSDEM, so I'm excited. I will be talking about taming the stat storm in Spack. So what is the stat storm, and why should it be tamed? This term was coined, let's say, by the Guix developers, who happen to be here as well. For you to be affected by this problem, you need a few ingredients. One is a package manager that installs every package into its own prefix: Nix, for instance, Guix, or Spack. You need a loader, like a dynamic loader, or an interpreter like Python, that has to locate dependencies at application startup. And you have to have a typical HPC file system that is slow and shared. With all these ingredients, you get horrible startup times for applications. And that is what the stat storm is about. First, before we look into the problem more, a little bit about Spack. I guess Todd will also introduce it. Spack is a flexible package manager, primarily targeted at HPC. One of the nice things that attracted me to it was that you don't need root privileges to start using it. It builds on top of your distro, so it can also integrate with, like, MPI libraries that are already there. It supports installing multiple flavors of the same package. I'm saying flavors because a version does not really describe a specific package. There are tons of compile-time toggles, and usually you can swap dependencies in and out, so a version does not uniquely describe a package. It also comes with a very powerful dependency solver. And it is quite easy to contribute to, because the package recipes are written in Python. So, for example, below here is how you could write part of a recipe to specify a conditional dependency on Python under these particular conditions. And if you're used to, like, apt-get install or whatever, you can do that with Spack too. You can say spack install fftw. But you can also be more precise.
That is something unique about Spack, I think. You can say spack install fftw and set the variants, the compile-time toggles, like precision: I want floats and doubles. I want FFTW to be compiled with MPI. The particular provider for MPI will be MPICH, and it should be not version 4, but constrained to version 3. So that is the input you give to Spack. Then it goes off for a think and spits out a dependency graph with all the details filled in. That's the concretization process, as it's called in Spack. That can then be installed. And every package, as I said, will be installed into its own prefix, and the directory name of this prefix will contain a hash derived from the dependency graph. That allows you to have multiple flavors of a package installed at the same time. So that makes Spack an intentionally non-Filesystem-Hierarchy-Standard-compliant package manager. There is no root-level bin directory or lib directory. Everything is in its own prefix. So it can look like this, for instance. But then the problem is that packages actually have to be located at runtime. And I guess the classical solution, and it's not unique to Spack, the classical solution in HPC is to use module files, for example. So you log into a system, you do module load fftw, and before you know it, you have dozens of kilobytes of environment variables set. For binaries, typically LD_LIBRARY_PATH is filled with stuff, and for Python, PYTHONPATH, et cetera. This is not necessarily great, especially for Spack: if you want to use Spack executables and also system executables, and Spack sets LD_LIBRARY_PATH, then your system executables may change behavior. Or if you use two different Spack executables with conflicting dependencies, and you have this one global variable, that might also lead to issues. So these environment variables are not the way to solve it. Let's focus just on ELF binaries, so executables and libraries on Linux.
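The idea of a prefix whose directory name contains a hash derived from the dependency graph can be illustrated with a short sketch. This is not Spack's actual hashing scheme or directory layout: the functions `spec_hash` and `install_prefix`, the nested-dictionary spec format, the store root, and the use of SHA-256 over canonical JSON are all assumptions chosen only to show why two flavors of the same version end up in different prefixes.

```python
import hashlib
import json


def spec_hash(spec, length=7):
    """Hash a fully specified (concrete) spec, including its dependencies."""
    canonical = json.dumps(spec, sort_keys=True)  # stable key order
    return hashlib.sha256(canonical.encode()).hexdigest()[:length]


def install_prefix(root, spec):
    """Build a per-package prefix such as <root>/<name>-<version>-<hash>."""
    return f"{root}/{spec['name']}-{spec['version']}-{spec_hash(spec)}"


# Two hypothetical flavors of the same FFTW version, differing only in MPICH.
fftw_a = {
    "name": "fftw",
    "version": "3.3.10",
    "variants": {"mpi": True, "precision": ["double", "float"]},
    "dependencies": [{"name": "mpich", "version": "3.4.3"}],
}
fftw_b = dict(fftw_a, dependencies=[{"name": "mpich", "version": "3.3.2"}])
```

Because the dependency list is part of the hashed data, `install_prefix` gives `fftw_a` and `fftw_b` different directories even though name and version are identical, which is what lets both be installed side by side.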
If you have an ELF executable, in there you find a section that says what interpreter to use. That is the dynamic loader. It's at least one thing that is mentioned by absolute path, so that's what the kernel finds. The kernel starts the loader. The loader interprets the executable. It needs to find a bunch of libraries listed in the dynamic section, and it recursively finds the libraries those need. So that's basically how the story goes. We want users to be able to run executables without all the magic, opaque variables. The typical solution, which I think is shared among Nix, Guix, and Spack, is to have a compiler wrapper that injects RPATHs. RPATHs are basically binary-local search paths, so that you don't have to use global variables anymore. The exact behavior of the loader then depends on what libc you use. For instance, in glibc, RPATHs take precedence over LD_LIBRARY_PATH during the search, which is something that Spack exploits. That's how Spack can actually run executables on messy HPC systems that do set these variables for other things. On musl libc, the search behavior is slightly different, but you don't see musl that much in HPC anyway. However, there is a cost to RPATHs, and that is a runtime search. Normally, for system executables, when you start the executable, the loader basically just loops over the things it needs and looks them up in a global cache, the loader cache. This is quick, and nobody complains about startup times, I guess. In Spack, you have at least a double loop, and in glibc maybe even a triple loop. You loop over the needed libraries, then the search paths, the RPATHs, and then there are hardware-related subdirectories of the RPATHs where you can maybe find optimized libraries. That is actually kind of redundant in the Spack world, because we optimize every package for a specific target, so there is no need to look in subdirectories.
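The triple loop described here can be sketched in a few lines of Python. This is only an illustration, not the loader's actual code: the prefixes, library names, and the subdirectory name "hwcap-example" are made up, and a set of paths stands in for the file system.

```python
def count_probes(needed, rpaths, subdirs, installed):
    """Resolve each needed library the way a naive RPATH search would.

    needed    -- library names, standing in for DT_NEEDED entries
    rpaths    -- search directories, tried in order
    subdirs   -- subdirectories tried inside each rpath; "" means the rpath itself
    installed -- set of paths that exist (a stand-in for the file system)

    Returns the resolved paths and the number of failed probes.
    """
    resolved = {}
    misses = 0
    for lib in needed:                      # loop 1: needed libraries
        for rpath in rpaths:                # loop 2: search paths
            for sub in subdirs:             # loop 3: hardware subdirectories
                candidate = "/".join(p for p in (rpath, sub, lib) if p)
                if candidate in installed:
                    resolved[lib] = candidate
                    break
                misses += 1                 # one wasted file system lookup
            if lib in resolved:
                break
    return resolved, misses


# Hypothetical layout: three packages, each in its own prefix.
installed = {"/opt/a/lib/liba.so", "/opt/b/lib/libb.so", "/opt/c/lib/libc1.so"}
rpaths = ["/opt/a/lib", "/opt/b/lib", "/opt/c/lib"]
resolved, misses = count_probes(
    ["liba.so", "libb.so", "libc1.so"], rpaths, ["hwcap-example", ""], installed
)
print(misses)  # 9 failed probes for only 3 libraries
```

With a global cache the same three libraries would need three lookups. The number of wasted probes grows with the number of libraries times the number of search paths, which is why large dependency graphs are hit hardest.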
Well, in any case, there is this triple loop where eventually there is some syscall. Generally that is not a big deal, because how many dependencies do you really have? If you look at Git, maybe there are three packages involved, so there is not a whole lot of searching going on. But things get really wild if you look at, for instance, Emacs with GTK support. It doesn't fit on the slide to show all the dependencies. You can get something like 150 libraries with about 700 DT_NEEDED entries in the binaries. So there is a lot of startup overhead. If you use strace, you can actually see what happens, and you get horrible things like about 5,000 syscalls, of which 4,000 are basically searching for a library in a path where it can't be found. So yeah, there is some overhead to it. I tried this on a production system, and even with a warm cache there is very noticeable overhead, where a lot of time is spent in system time just to print the version of Emacs, which should really just print immediately, of course. So with the dynamic loader in Spack, the overhead typically shifts towards loading objects, and not towards relocation, which is what the dynamic loader is usually known to be slow at. And in HPC this is really a problem, because typically you don't start one process, you start a whole series of processes across different nodes. So there is a good reason to try and improve this. An obvious solution would be to just switch to static linking, because then there is no dynamic loader involved anymore. But generally, there is still use for shared libraries, I would say.
For one, you can avoid all the symbol clashes. Especially with these huge dependency graphs, like in the Emacs example, the odds that some symbol name is used twice are quite high, and shared libraries have good ways to say this is my public interface and this is my private interface, and if you have clashes in the private interface, well, there is no problem. Also, LD_PRELOAD is still nice to have, I would say, for swapping out a malloc just to see whether it improves performance, for instance. That would be gone with static linking. And there are some other issues: with dynamic languages, if you have to interface with them, you kind of have to use shared libraries anyway. The Guix solution, which has been there for over a year at least, is to patch glibc. They basically create a package-local cache of libraries, so that you know that a library name maps to a particular path. It is made package-local instead of global, which I think is quite elegant. But for Spack it is not really usable, because it requires glibc: musl doesn't have a loader cache, for instance. It also requires patching glibc, and we currently don't control glibc. So for Spack it's not really an option. Another solution would be to emulate a loader cache by symlinking. In our prefix, we add a bunch of symlinks from the libraries that we probably need to the dependencies where they actually are, and then we can replace N RPATHs with one, so there is a single search path. That is easy, it also works for musl, and according to the musl maintainers it's the recommended way to emulate this cache. But there are some technical issues. You can still have relative RPATHs with $ORIGIN semantics, and they become relative to a symlink and not to the actual library in the prefix directory where it lives, so it may not always work.
Another solution is Shrinkwrap. This is a bit more recent, and it's currently a pull request to patchelf from the NixOS project. Their idea is basically to replace all the DT_NEEDED entries with the absolute paths of the transitive closure. So if you run ldd on your executable, you get a bunch of libraries out of that, and all of them go into the DT_NEEDED entries by absolute path, and the dynamic loader will do no search, it will just directly open them. So it's interesting. It's built on top of patchelf, which is also used a lot in Nix. At the same time, patching ELF files that way is kind of tedious, there are bugs every now and then, and there are some side effects I'll talk about in a bit. But before we look at the current Spack solution, let's step back a bit. A typical issue for a user who is not very familiar with loader internals: they build their software on an HPC system, they submit their jobs, and it doesn't work. The loader cannot find particular libraries that were located during the build but not at runtime, or they suddenly end up with the wrong libstdc++ or whatever. That is an issue of discrepancy between the linker and the loader. The basic example is this: you create a shared library, libf, you create an executable, you link it to the library, you run the executable, and oh no, it cannot find the thing. Obviously, we can understand why that happens, but at the same time it's a bit dumb: you just linked it, so why can you not find it now? In general, of course, we are probably going to install the library, and maybe it ends up in a slightly different location, so we cannot fix the path ahead of time. But if you think about Spack, all the dependencies are pretty much fixed in their location in a prefix, so they're not going to move anywhere. So if linking immediately bound the library path, that would be great.
One way to do that: if you think about what the linker does, it does a whole lot of things, but one of them is that it copies the shared object name of the library you're linking to into the executable or library that needs it. The dynamic loader then performs a search for that name, always, except if there is a forward slash, a directory separator, in it. Then it directly opens it. So what if you create a library with a shared object name that contains a forward slash? That is the trick, and it is actually quite popular on macOS, just not very popular on Linux. What you get is that any linker you use will basically copy a path directly into a DT_NEEDED entry. That raises the question: can you just change shared object names? Generally, yes, you can. They're mostly a cache key anyway; it's not a very special field in a binary. There is some possibility of introspection with dlinfo in C, but it is rarely used. I've only really seen it in Java, where they check what libc version is used, for instance. But then, in Spack, we can say: okay, leave the soname as it is for that specific package. And that leads us to the current trick that Spack uses. We have an opt-in setting in Spack 0.19 that you can enable with this command. Basically, after something gets installed, we replace all the shared object names with the path where the library itself is located. What you get is not only better performance, because there's no search anymore, but also more stability, or hardening, because whatever you link to is also what you get at runtime. There's no discrepancy anymore between the linker and the loader. They will always use the same libraries.
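The forward-slash rule can be illustrated with a small Python sketch. It models only the rule as stated in the talk: a name without a slash is searched for, a name with a slash is opened directly. The paths are hypothetical, and the helper that builds a patchelf command line is an assumption about how one could set a soname by hand, not Spack's actual implementation.

```python
def resolve(name, rpaths, installed):
    """Return (path, attempts) for one needed entry.

    A name containing "/" is used as a path directly (one attempt, no search).
    Any other name is searched for in each rpath in order.
    `installed` is a set of paths standing in for the file system.
    """
    if "/" in name:
        return name, 1
    attempts = 0
    for rpath in rpaths:
        attempts += 1
        candidate = f"{rpath}/{name}"
        if candidate in installed:
            return candidate, attempts
    return None, attempts


def set_soname_command(library_path):
    """Build (but do not run) a patchelf call that sets the soname to the
    library's own absolute path, so that linkers copy that path into the
    needed entries of anything linked against it."""
    return ["patchelf", "--set-soname", library_path, library_path]


# Hypothetical prefixes.
installed = {"/opt/zlib-abc123/lib/libz.so.1"}
rpaths = ["/opt/gcc-def456/lib", "/opt/zlib-abc123/lib"]

print(resolve("libz.so.1", rpaths, installed))                       # searched
print(resolve("/opt/zlib-abc123/lib/libz.so.1", rpaths, installed))  # direct
```

The command returned by set_soname_command could be passed to subprocess.run on a system where patchelf is installed. It is kept as a plain list here so the sketch has no external dependencies.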
It also works outside of Spack. If there are things installed with Spack and people start linking against them, they will automatically always use the Spack libraries without having to set environment variables or set RPATHs themselves. In some cases, though, the trick happens a bit too late. If you, I don't know, build curl, and curl links to libcurl, so intra-package linking, then libcurl's shared object name has not been replaced yet. We do that post-install, so sometimes there may be some small issues. The last thing I want to say about this is: how do you replace shared object names? Currently we simply use patchelf. It is generally good, I would say, apart from the issue tracker, which has multiple dozens of problems reported, but it generally works. But there is one downside. It solves the stat storm problem, but at the same time it may change the ELF files in non-trivial ways and create new load segments. So you end up with fewer stat calls but more mmap calls, for instance. If we can avoid patchelf, that would actually be nice. And there is another trick, which is under consideration as an open pull request: reserve some space in the dynamic section of the executables and libraries with a dummy RPATH. Then, in Python, within Spack, we just move the shared object name into that placeholder space. That way we can update executables and libraries in place, and it doesn't require all the advanced patchelf logic. Okay, so with that solution, do we improve the startup time of Emacs, that is, the time it takes Emacs to print its version? The answer is pretty much yes. The system time goes down quite a bit, so that's good. But we still don't capture glibc. This is what the ldd output looks like: it's all absolute paths, except for glibc, which is still searched for.
And now we end up in a rather funny situation, where basically everything that the dynamic loader opens or needs is found directly, except that it spends about 400 syscalls looking for glibc, and the loader itself is part of glibc, so it feels a bit dumb. In musl libc, that is actually not an issue at all, because they're quite smart about it. The loader is also the libc implementation, so they never have to locate libc, and that's also a reason why musl binaries may start a little bit faster than glibc binaries. But if we tackle this last issue, if we make the paths to the glibc libraries absolute or preload them, then we finally reduce the startup time to something reasonable. Then the stat storm issue is solved: there are zero stat calls, and the openat calls are significantly reduced. So, to answer the question, have we solved the stat storm in Spack? Mostly. It would be easier if we also controlled glibc, but we are not there yet. At the same time, it is definitely possible, and, well, for sure, you get the second runtime for free, and if you push a little bit harder, we could still make the paths to glibc itself absolute, for instance, and then you get the proper performance. Here are some further links. There's also a whole discussion going on in Nix. Their issue has been open since 2017, I think, where people reported slow startup times, and lately there's quite some discussion going on there too. They have the same issue: they want to support not only glibc but also musl, so it's interesting to read up on that too. And I'll leave it at that. Thank you. Any questions for Harmen? Hello. Hi. So I have a question. How do you load the prefixes of your software packages? Do you use a module system like Lmod or something like that? So we have multiple ways to actually use the software. You can generate modules, yeah.
There is also a way which I like a little bit more: you create an environment, you add all the packages you need to it, and then you generate a view. What you get out of that is a more classical directory structure, where everything is merged. Because, for instance, in Lmod, with the modules, you can swap modules on the fly, so that can be used by the user. So I'm wondering, if you're using these absolute paths and one of my users decides to do a module swap on the OpenMPI library to something else, how is that handled with this system? So if you use absolute paths, one thing that you lose is the ability to use LD_LIBRARY_PATH, but you can use LD_PRELOAD. To be honest, I'm not sure why LD_PRELOAD is not used that much, because it has the advantage that you can say very specifically: I want to use this library. Yeah, that's quite hard to say. Yeah, but it's also not very different from, like... It's prepended everywhere, so yeah. It's also not very different, in my opinion, from using LD_LIBRARY_PATH, but yeah. Thank you. Thank you. |
Keeping the HPC ecosystem working with Spack CI |
So the next speaker is Todd Gamblin. As I think a lot of people here know, I'm very much involved in the EasyBuild project, which was actually the excuse we used to start the HPC devroom. But we're also very open to other projects that are very similar to what we work on every day, so in some sense Spack is our mortal enemy, but we do allow them to give talks in the devroom as well. Yeah, with that, thanks. Okay, who's heard of Spack? Okay, cool. People have heard of Spack. We don't need to do too many introductions for this talk. This is less of a talk about Spack and more of a talk about the CI that we've started doing since introducing binary packages in Spack. I don't think I need to tell people why they need Spack for HPC. I think lots of folks have talked about that already today. Harmen said that I'm supposed to talk a little bit about deployment. To deploy Spack, if you want to try it on a new system, just clone it from the Git repo and run it. All you need is Python and a few other tools on your system. So you can run it straight out of the repo if you want to play around with it and build stuff right there. Spack is designed to install lots of different versions of things, like others have said. This is sort of a snapshot of the syntax. Some of the things that you can do: you can install HDF5 at lots of different versions, you can inject flags into the build, you can pick a compiler. You can do all that on the fly, and it will build you a custom version of that software and let you use it. You can get it into your environment in a lot of different ways. What we're trying to do with Spack is provide the ease of use of the mainstream tools that people are used to, but with the flexibility needed for HPC. Whether we fully accomplish that is a whole other question, because there's still a lot of complexity in this, because it is intended for HPC. Originally it was designed to build from source, because it was trying to automate people's common workflow.
The Fermilab and CERN folks added a first implementation of binary packaging to Spack, and I talked about some of that at a past FOSDEM. Since then we've actually started relying on the build caches a lot more. Spack has relocatable build caches that you can build in a build farm, or you can make one right out of your Spack build. You may not want to make one yourself that way, because then you won't have padding on the paths. Like you said, the patchelf relocation is dangerous. Generally, if we build binaries for wide use, we pad the paths pretty extensively so that we can just poke values in instead of having to do all the patchelf stuff. Anyway, you can install Spack binaries from a build cache in S3 to your home directory. You can make a common build cache in the file system. You can use a build cache to accelerate CI. It's very handy because it eliminates the need to rebuild lots of stuff all the time. If you look at the Spack project as a whole, I think people know most of this. There's a community. We maintain the core tool. There are package recipes. But the part that you don't see is all the infrastructure behind the scenes that keeps the thing working. Originally, we did not have CI for Spack, or at least not for the package builds. We've always had CI for the tool itself, and we've done unit tests and checked a bunch of things about concretization and so on, but we weren't building all the packages. We're still not building all the packages, but we're building quite a few of them. With the infrastructure that we have, we have a system where essentially you can build lots of software stacks on top of Spack. You can write a YAML description of what you want in the software stack. You can have E4S, an AWS stack, Livermore's math stack. There's a Viz SDK within the Exascale Computing Project. Every application is its own software stack these days. Our production codes have upwards of 100 dependencies that they are using for multi-physics.
Each of them is essentially maintaining its own little private software distribution in some sense or another. We'd like to be able to build all of this stuff and ensure that these things keep working. That's hard to do, given that the GitHub for Spack is a pretty busy place. There are almost 7,000 packages in Spack now. Over the whole life of the project, there have been over 1,100 contributors. You can see down there that this month there have been 122 people active on the GitHub repo, with over 400 commits and 300 to 500 PRs per month that we have to merge. Ensuring that everything stays working with that many changes is pretty hard, and you'd be nuts to do it without CI. One of the problems that we have, though, is that CI for HPC is hard. If you want to test in the HPC environments that you actually care about, you can't just take an HPC node and hook it up to random pull requests on GitHub. People don't like that when the machine might have export-controlled software on it, because you're effectively allowing some random person in a pull request to run software on your HPC machine. This is the model for Spack. We have a bunch of external contributors on GitHub constantly contributing to the develop branch. We have stable release branches, which some people rely on, where we freeze the packages to reduce the churn. Most users are actually on develop, at least according to our surveys, which is a little surprising to me, but that's where we are. Then, off of the release branches, there is a software distribution within the Exascale Computing Project in the U.S. called E4S. There are a few others that freeze a commit from Spack and do their own integration after that. That's really supposed to be the deployment mechanism for the 100 or so packages in ECP, and it's like 600 with dependencies.
What happens with this is that it gets deployed to the facilities, and the E4S team goes and ensures that everything works, but we're not able to run on these systems for pull requests in CI, and it's very frustrating. Essentially, this is a bunch of downstream work that we would really like to get rid of. Moreover, the applications are also doing downstream integration. They may have their own CI, which may be good. They may pull from a facility deployment. They may pull from develop. They may pull from a release. They're essentially integrating from all these places, so there's a lot of downstream porting there. What we would ultimately like to do in Spack is take all of that work that's going on downstream and move it upstream, so that we're actually doing CI testing on develop along with everything else. This is progress towards that, but we're not there yet. Essentially, the main obstacle for us to build stuff that looks like the HPC environments right now is a licensing issue: we can't take the Cray PE container and run it in the cloud, because that's just not something you can do with HPE's license. We are pushing them real hard on this and trying to get an exception for us to build things, in which case we would be able to do the work upstream and, ideally, deploy at the facilities from the binary cache, which I think would be way more stable and less error-prone than what we do right now. We set out to make this CI system to enable this, with a bunch of different goals. One of the goals is that we want to be sustainable. We don't want to change the maintainer workflow. We already have few enough maintainers for the amount of work there is that we don't want to change what they have to do. They're used to going out on GitHub and approving PRs, getting them merged, checking if they build, and so on.
We don't want them to have to do something different, so we don't want them to have to both maintain PRs and think about how the integration branch is doing, like some distros do. We'd like that to just happen. In that vein, we want a rolling release where we're constantly building binaries for the develop branch, and we basically snapshot develop for every release that we do and say: okay, it's stable, everything built, we are ready to do the release. We'll just cut one, and then we will backport bug fixes to the Spack tool on that release if we need to. We want it to eventually support all 6,900 packages. That's not something we're doing now. And we want source builds to still work effectively with these binaries once it's done. We want to make sure that the recipes are still versatile enough to do all those combinations of builds that I showed on the first slide. Then finally, and this is a big one, we want to ensure that the binaries that we have in Spack are just as trustworthy as the sources. If you feel like you can trust our maintainers and rely on the sources in Spack packages, with their checksums, you should feel just as comfortable with the binaries that we're putting in the build cache for you. If you think about how this works in traditional package managers, like, say, APT or YUM, you have a recipe per package configuration that gets thrown into a build farm. For each of those package configurations, think of easyconfigs, if EasyBuild had binaries. You throw that into a build farm and you get portable, unoptimized binaries, for a theoretical EasyBuild with binaries, where there's one of those per package configuration or per spec file or whatever. So this is more like APT. You're managing one software stack that's meant to be upgraded over time, and there's a consistent ABI across the distribution so that you can swap one package in for another.
The solver in those distributions really operates on the binaries. In Spack, you have parameterized package recipes that we design to be portable, and we want the maintainers to work on them together so that they remain portable and you can use them in different environments. You throw those into the build farm, effectively test the parameterized package recipes in lots of different configurations, and spit out different stacks: lots of different stacks optimized for different environments, from the same portable recipes, for different systems, OSs, compilers, MPIs, and so on. We also want you to be able to choose, at any time, to build something from source along with that if you want to customize some aspect of the pipeline. To enable that, we came up with this architecture. We have a bunch of AWS resources, because AWS has been nice enough to donate some cycles to the project. They are interested in using Spack in their ParallelCluster product, and their motivation is that they want binaries ready to go if someone spins up a cluster in the cloud. They don't want you to spin up a cluster and have it sit there and build software for hours and then run after having been charged a bunch of money, which is nice of them, because they would make a lot of money. But then no one would use their service. They want binaries. In there, we use S3 and CloudFront to distribute the binaries around the world. EC2 is really the main build resource, and RDS is in there, but it's not that important. We've got a Kubernetes cluster in there that we have autoscaling runners in, so we're building mostly in containers inside of Kube, and there's a GitLab instance in there too. We have a high-availability GitLab instance. We chose GitLab because the HPC centers actually have GitLab CI themselves. The same CI logic that we run in the cloud, you could take that and run it internally and have these pipelines generated for you at your own site, too.
You could slap another back end on this and have it generate build graphs for some other system, but that's the one that we're using. We're using runner pools with something called Karpenter to basically get just-in-time instances and allocate the containers on them efficiently. Then we have some bare-metal runners at the University of Oregon with some fairly exotic architectures on them. So if we need to build, or specifically run, on something that has an AMD GPU or an A64FX and so on, we can do that. And we could add more runners to this eventually. And there's a bot that coordinates all this work. So it's a lot of stuff. Every time I look at this, I am amazed at how complicated CI is. It's one of those things that seems like it should just work, but there is a lot to maintaining a reliable service for doing this many builds. I suspect other distro maintainers have realized that too, and I'm just late to the game. The way that contributing a stack in Spack works is that we have this directory in the repo that has all of the cloud pipelines in it. You can see that some of them are for AWS. Some of them are different variations on E4S. Each of those directories contains just a spack.yaml that defines the stuff that is to be built. If you look inside of one, it's basically just a list of packages. So here's the ML CUDA one, which has the build of, I think, PyTorch, TensorFlow, Keras, JAX, and friends for CUDA. It's just a list of packages, plus there's a target setting up there for all the packages. You could have a matrix of targets if you wanted. And then there's: disable ROCm and enable CUDA on everything except for LLVM, because there's that bug that's linked there. I'm not entirely sure about the specifics of that. But the configuration part is up here, and it's fairly minimal for the stack. If you look at these currently, there's a bunch of other boilerplate stuff for things like mapping runners.
I'll get to that in a minute. But there's a PR that's going to go in after which this is basically all that's going to be in your stack. You might include some stuff from elsewhere, but this is essentially a stack definition. And this makes it very easy to change low-level parameters in the stack. So we had a working E4S stack with something like 600 or 700 packages building. We wanted to get better testing for oneAPI, because that's what they're going to use on Aurora, so we wanted to use the oneAPI compilers. We added some compiler config and we said everything should use oneAPI. And then, at the very least, we got a pipeline generated, with some errors for oneAPI. It made it really easy to iterate on this with Intel, where we would basically say: okay, this package is broken, here's the bug, go fix it. Then they would come back with another version of oneAPI, and we would iterate with them until it was done. I think this is probably more open-source software than anyone has recently run through a vendor compiler. Just being able to do this, I think, is big, because it might make those compilers actually viable things to use for real programs that have lots of dependencies. At the moment, you have to sort of piece your program together and build parts of it with, I don't know, PGI was the infamous one that broke on everything. But I think this could help with the vendor compiler being a viable second option and maybe instill some competition among the vendors, because they can do this frequently and show benchmarks against these packages. So this was, I think, a win. Yeah, thank you. Each of those stacks gets concretized. As people know, in Spack you take that abstract description of the things that you want to install, which is basically the requirements, and you run it through our dependency solver.
You get, essentially, a concrete description of what you're going to build, which is the whole concrete graph. Then we generate a GitLab CI YAML from that, which describes the jobs that need to be run to build the whole thing. This is the part that we could swap out for something else. We've looked at Tekton pipelines. We've looked at other options. I don't know, some people use Jenkins. There are all sorts of things out there that you could potentially map the jobs to, and I think we could generate a description like that from the representation that we have. For mapping those jobs, we have a section in the spack.yaml right now that basically tells you how to generate the GitLab piece. You see this mapping section here. There's a match section. If you match any of the specs there, and the first three are just some names, then we have special tags that we put on the runners that say: get me a special resource for these things. That first block is basically so that I don't run out of memory building LLVM, TensorFlow, or Torch: get me something with a lot of memory and a big CPU to build that one. It has to run on a big instance, because those are the long poles in our tent in CI. Then down at the bottom, there's just a mapping where everything else gets something that supports x86_64_v4, and it's a little smaller than the other one. You could do this for lots of different architecture combinations and so on, and you can ask for images and things like that. I said that we needed to ensure that the binaries are as reliable as the source. So we sat down and asked ourselves: what is it that people trust about the Spack project? And it's really the maintainers. If you use any open source project, you're trusting the maintainers, or you really shouldn't be using that open source project.
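The runner mapping described here, where the first matching block wins and everything else falls through to a default, can be sketched in Python. The package names and runner tags below are illustrative assumptions, and this is not Spack's actual configuration schema or matching code, which works on specs rather than plain names.

```python
# Each entry: (set of package names to match, runner tags to request).
# Names and tags are hypothetical placeholders for the slide's contents.
MAPPINGS = [
    ({"llvm", "py-tensorflow", "py-torch"}, ["x86_64_v4", "huge"]),
]

# Catch-all for everything that matched no earlier block.
DEFAULT_TAGS = ["x86_64_v4", "medium"]


def runner_tags(package, mappings=MAPPINGS, default=DEFAULT_TAGS):
    """Return the runner tags for a package: first matching block wins."""
    for names, tags in mappings:
        if package in names:
            return tags
    return default


for pkg in ["llvm", "zlib", "py-torch"]:
    print(pkg, runner_tags(pkg))
```

Ordering matters in this scheme: putting the memory-hungry packages first ensures they never fall through to the smaller default runners.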
And so I don't see where we can do better than that. What we've said is that the place where bad things could get into a build, at least from Spack, is in the build environment. If you give people control of the PR environment where they're submitting things, they could push a commit that puts something in a binary that gets cached. And if we took that binary and stuck it out there for anyone to use, there could be bad things in it. So we have this separate set of untrusted S3 buckets where we only build PR things. Each PR gets its own build cache. That enables the maintainers to see if things work, and then they come along and review the code. Once things are actually merged to develop, we don't trust any of the binaries that we built on PRs. We go and rebuild everything and sign it, specifically from the sources that got approved, so that we know that we didn't cache anything from that environment. So that's where the develop and release caches are coming from; they're entirely separate from the PR environment. And the signing here is ephemeral. There's a signing key locked up somewhere in a secret server, we have subkeys, and then we generate ephemeral keys for the signing in the pipelines. So whatever it is that you got signed with doesn't actually exist anymore by the time the user consumes the binary. We could look at Sigstore for this. It wasn't quite ready for arbitrary binary signing when we did this, but that's an option to reduce some of the custom GPG stuff we had to do here. So the pull request integration, I think, makes it easy, at least for most of the contributors. They get status updates on PRs.
And it's fairly easy for users because they can just add one of these binary mirrors and then start using the build cache. I'm not going to get into the details here, but in Spack, for a very long time, it was easy to get a lot of cache misses, because we would just look up hashes. I have another presentation about our reusing concretizer. The summary is, if you add one of these build caches and you have those binaries available, Spack will prefer to use them before it tries to rebuild something. So with the reusing concretizer, this is actually quite powerful. And so, yeah, what could go wrong? Well, there is a burden to doing this. A build cache distribution like Spack, Nix or Guix is different from an RPM distribution because every node has a hash. The deployment model is really that you have to deploy with what you built with, so you can't just swap in a new version of zlib in a stack. If something has a particular hash, that implies all of its dependencies' hashes, and so you need to deploy the build cache with everything that it was built with. So if, for example, you modify xz, right? Yep. Then you're going to need to rebuild all of these things, too. And you're going to need to do that all the way up to the roots of your environment every once in a while so that there's a consistent build cache for people to deploy. That can be bad if your stack is this big. This is E4S, right? Someone comes in and submits a PR, which you can do, by the way, that modifies pkgconf. And then all of a sudden, this is what happens to your CI system, right? Your whole graph is rebuilding again. It can take a long time for develop to catch up with a change like this. And right now we are rebuilding all that stuff on PRs, so your pipelines can get long. You dig in there and you see that VisIt is still building, and you're like, this is the fifth time I've built VisIt today.
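To make the point about hashes concrete, here is a minimal sketch of why changing one low-level package forces rebuilds all the way up the graph. It is not Spack's hashing code: the toy graph, the version numbers, and the function names `dag_hash` and `needs_rebuild` are all made up for illustration. The only idea taken from the talk is that a node's hash covers the hashes of everything it depends on.

```python
import hashlib


def dag_hash(name, versions, deps, cache=None):
    """Hash a package from its own version plus the hashes of its dependencies."""
    if cache is None:
        cache = {}
    if name not in cache:
        h = hashlib.sha256(f"{name}@{versions[name]}".encode())
        # Folding in each dependency's hash is what makes a change propagate upward.
        for dep in sorted(deps.get(name, ())):
            h.update(dag_hash(dep, versions, deps, cache).encode())
        cache[name] = h.hexdigest()
    return cache[name]


def needs_rebuild(old_versions, new_versions, deps):
    """Return the packages whose hash differs between two sets of versions."""
    old_cache, new_cache = {}, {}
    return sorted(
        name
        for name in old_versions
        if dag_hash(name, old_versions, deps, old_cache)
        != dag_hash(name, new_versions, deps, new_cache)
    )


# Hypothetical toy graph: app -> libarchive -> xz, and app -> zlib.
DEPS = {"app": ["libarchive", "zlib"], "libarchive": ["xz"]}
OLD = {"app": "1.0", "libarchive": "3.6", "xz": "5.2", "zlib": "1.2"}
NEW = dict(OLD, xz="5.4")  # only xz changes

print(needs_rebuild(OLD, NEW, DEPS))
```

In this toy graph only `zlib` keeps its hash; `xz` and everything above it would have to be rebuilt, which is the effect described above at the scale of a whole E4S stack.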
I think Harmen once commented that he was worried that Spack would eventually cause the heat death of the universe because of ParaView builds. Or no, that the ParaView builds would eventually bring on climate change in the U.S. So we worry about that. We don't want to do that all the time. The other thing is that there's a delicate balance between redundant builds and holding back PRs. I didn't think about this before we really got into CI, but it matters what commit you pick to merge with when you're doing a build-cache build. Say you have a pipeline like this, where you've built B, and develop has now picked up D, and that one's building up there. And you get a PR like this. So PR1 comes in. You can merge that with B, get a lot of reuse there, and get pretty good testing on PR1. If instead you get a PR up here that is based beyond your last develop build and you try to merge that with D, or even C (I guess it's already based on C, so you can't really merge with C), then you're going to be duplicating the work that's already being done on develop. And if you get a bunch of PRs like this at the same time, you can get a whole bunch of builds at the same time that are effectively already being done on develop. So this is a difficulty of navigating these PR-based CI systems. If you had a server that ensured that each hash was only ever built once, then you could get around this. Otherwise you have to be picky about this: hold up PR2 until the next thing is built, then merge with that commit and send it to GitLab to be built. And this can annoy contributors, because they have to wait for that to happen for their PR in order to keep the CI system sane. We actually did bring down GitLab once with a bunch of PRs like this. Essentially something got broken in develop, develop got held up, people started submitting a bunch of PRs, they were all doing redundant builds, and GitLab fell over.
So that was fun. CI does keep things stable. We have heard, at least anecdotally, that our package maintainers at the lab are much happier with how reliable their builds are for packages on the machines since we've had CI. But like I said, the committers get frustrated. The other thing that happens if you're doing so many builds on PRs is that if your CI system has occasional system errors and you're building a thousand things on a PR pipeline, it's very likely that you're going to get a system error in there. So what ends up happening is that you have to babysit PRs a bit, and that can be painful. The other thing is that it's hard to stay correct. Testing on PRs doesn't really ensure that you have a working develop branch. Say you have a setup like this with an initial package state. You get a pull request that updates B. You get another pull request that updates C. You test both of those configurations on your PRs and they work, and you merge them. The thing that you now have in develop is actually updated B and updated C, and you never tested that. So keeping that state consistent is rather difficult. Before we had CI, I think we just didn't see these kinds of issues. They would just manifest on users, which is not great. But now we run into them in CI because we can see that things are broken on develop. So we're looking into using merge queues, which actually solve this problem and a couple of others that we have pretty effectively. You can do faster iteration on PRs with merge queues because you're merging in sequence and testing in parallel. I'll describe what that looks like in a minute. It's a good balance of CI versus responsiveness because you can do sort of sparse tests on the PRs, queue them, and then do the heavy tests.
And it actually does preserve the security model, because anything queued in a merge queue is actually approved by maintainers, and you can take the builds and move them straight into develop. So what that looks like is this: you might have the same initial package state, you get two pull requests, you do some small testing on the pull requests, and then you set up this merge queue where effectively you're doing heavy testing on things that are staged exactly as they will be merged if they are successful. Okay. So that gets committed, that gets committed, and now you've tested the final configuration on develop and you're not in an inconsistent state. So we're going to stage the work that we do in CI. On PRs, we're probably going to build just the package, or just the package and its dependents, which is similar to what Nix does. On most merge queue pipelines, we may build a bit more than that, and then every once in a while we'll build everything on develop, and we'll see how it goes. We can probe what the balance is here. So that's where we're at. Thanks. Okay, I think we have time for one or two questions. Any questions for Todd? An off-the-wall question: we have, for example, the Software Bill of Materials dev room. You mentioned export-controlled software and also being able to trust binaries. I work with classified customers who have isolated networks, probably Shopify, MI6, if I told you who they were. They're now asking what software is running on these systems. I mean, what does that question mean, really? Could Spack help with producing a report on exactly what software is there? Yeah, we have a PR right now so that every Spack build would produce an SBOM in some standard format. There's a whole dev room on SBOMs today, which gets into that. And so I think, yeah, I mean, we know everything in the graph, and so do Nix and Guix and the other systems that do this.
We don't expose it in a standard format that auditing systems can scan right now, but that's what we'd like to do. So very briefly: Debian, a while ago, did something on reproducible builds, which turned out to be much more difficult. So if you haven't looked at that, it might be interesting for you. Yeah, so we would like to have fully reproducible builds. It's a lot of upstream patching, right? And even Debian isn't fully reproducible right now. I think that would be something we could consider once we go all the way down to libc, because at the moment we have to run on things like Crays, where there's so much that depends on the module environment that we have to include the external environment to get some of these builds done. But yeah, I would like to have a much more isolated build environment with that. It's a good practice. Okay, one more question here, and then we need to switch. Hi, so you were talking about padding your binaries' headers to allow relocating paths. Given that you don't have a static path or a pre-defined destination as in FHS-type locations, are you in serious danger of running out of space in that header? Well, we're not building in a static path. We might be building in a home directory, right? So you can put padding in your install tree prefix, which is like the Nix store, and you can say build with 256-character-long paths. You wouldn't want to have a user actually deploy in a path like that, but you can build that way, create the binary, and then redeploy in a short path. So you've got a space where you can have an arbitrary-length path when you're running. A lot of stuff doesn't build with overly long paths, though. If you get to 512, Autotools starts breaking down and not supporting that length of path, and the packages actually don't support it. So the sweet spot seems to be about 256. Okay, thanks. |
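The padded-prefix answer can be illustrated with a small sketch. This is not Spack's relocation code, and real tools edit ELF headers properly rather than scanning raw bytes. It only shows the idea under simple assumptions: paths are stored as NUL-terminated strings, the build prefix was padded to be long, and a shorter deployment prefix can be written in place while extra NUL bytes keep every offset in the file unchanged. The function name `relocate` and the example paths are made up.

```python
def relocate(blob: bytes, old_prefix: bytes, new_prefix: bytes) -> bytes:
    """Swap a padded build prefix for a shorter one without changing the file size."""
    if len(new_prefix) > len(old_prefix):
        raise ValueError("new prefix must not be longer than the padded build prefix")
    out = bytearray()
    pos = 0
    while True:
        start = blob.find(old_prefix, pos)
        if start == -1:
            out += blob[pos:]  # nothing left to patch
            return bytes(out)
        end = blob.find(b"\0", start)  # end of the C string holding the path
        if end == -1:
            end = len(blob)
        segment = blob[start:end]
        patched = segment.replace(old_prefix, new_prefix)
        out += blob[pos:start] + patched
        # Pad with NULs so everything after this string stays at the same offset.
        out += b"\0" * (len(segment) - len(patched))
        pos = end


# Hypothetical example: a build prefix padded with underscores, deployed under /opt/store.
build_prefix = b"/home/user/" + b"_" * 40 + b"/store"
blob = b"\x7fELF" + build_prefix + b"/lib/libz.so\0" + b"rest"
moved = relocate(blob, build_prefix, b"/opt/store")
print(len(blob) == len(moved), moved[4:26])
```

Because the replacement can only shrink the string, the amount of padding chosen at build time sets the longest deployment path that can be supported later, which is why the build prefix length matters.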
Developing effective testing pipelines for HPC applications |
Okay, now we come to it. Alright. So, last talk. Alright, testing, testing. Okay, so I'll start here. My talk is about developing effective testing pipelines for HPC applications. Just to give a short introduction about who I am, since I'm relatively new here: I first started in the HPC industry while I was in college. I joined my university's HPC institute as a software consultant, so in that role I was more focused on support. I was working on Singularity deployments, helping debug Fortran (I'm not very good at writing Fortran), helping build custom kernels for Jupyter, all that good stuff. Then I left for a bit to go work at another university on a workflow orchestration framework for adversarial machine learning. And then, eventually, I came back to my university's HPC site as an engineer instead of in a support role. What I worked on in that case was building a suite of Singularity containers that clients could consume, and also working on debugging tools that interfaced with Moab and Torque and PBS and all that. After I graduated university, I joined Canonical. That was probably nine months ago now. So what sent me down this path of wanting to develop effective testing pipelines for HPC applications? Well, first, when I started at Canonical, to be frank, I wasn't very good at Debian packaging. I had mostly deployed my software using bash scripts, compiling from source. And part of that is I had to write provisioning scripts, since I mostly work on cloud orchestration, deploying nodes and whatnot onto clouds. So I had to develop dangerous code: it's installing packages, making configurations, setting preseeds in, what is it, Debian and Ubuntu.
And so basically, I had this dilemma: I want an easy way to test these provisioning scripts without having to go through all this manual effort. Originally, I just used Vagrant, VirtualBox, or really any kind of virtual machine or hypervisor: bring it up, run the script, and then once I was done with it, blow it away. I also had a desire for reproducible tests, because even though I know how to bring up a virtual machine on my system, I might be working with someone who's on Windows or Mac, so they don't have the same setup that I do, or they're using a different distribution. So this gave me an idea. If I'm on my personal workstation here, so my laptop (unfortunately, my current laptop's X server isn't working, so the HDMI output gets all screwed up, so thank you, EasyBuild folks, for lending me your computer). But basically, I had the idea: what if I take a test that's written on my system and then have the ability to run it using any hypervisor I want, on any operating system that I need to test that code on, without the extra hassle of having to support or teach someone else how to bring up that instance? Basically the case of, oh, I have to write code, or run anything else, that needs to run on Ubuntu, CentOS, Alma, or Rocky. After working on that for a bit, I had an initial prototype that would bring up a basic operating system image. And then, as I got more into my HPC work at Canonical, I realized, okay, maybe I don't want to rack up the cloud bill trying to test and deploy to AWS or Azure, or even our own internal private cloud using OpenStack, just for trying to provision HPC nodes.
So, for example, you have identity management with OpenLDAP, a shared file system using NFS, and you have other options. And then there's also the case of just setting up and configuring a Slurm cluster, all headless, without any manual user intervention. So I had this revelation where I'm like, why waste precious compute time on your HPC cluster? Because the fact is it's expensive; you're paying for that tenancy. What if you could test your simulations, jobs or applications on a mini HPC cluster that's similar to the target platform that you're deploying to, but is more of a mock, so that you can get the general feel for the platform before moving on to the more expensive resources? So I started working on a custom testing framework called cleantest, which is basically a fancy Python testing library that allows you to bring up mini HPC clusters on your local system, and is for general usage by developers who are in a hurry. So that breaks into a question: what exactly is cleantest? Originally I named it simpletest, because I just wanted it to be really dead simple, but for some reason PyPI wouldn't let me register that name. So cleantest it is. Basically it comes in three parts. First, you have the bootstrap and configuration stage. That's where you can register hooks and whatnot for configuring the instances that you're bringing up, or do some more advanced bootstrapping. So, in the example I showed on the previous slide, that's where you're able to bring up NFS, OpenLDAP, Slurm services and whatnot, and then inject scriptlets into them. And that's when we get to the second part here, which is a testlet.
It's an interesting little word that I came up with, joking with some of my colleagues, but basically a testlet is an entire Python program that's wrapped into a regular function. The idea is that testlets contain the full body of the test that you want to be able to run inside this virtual machine, container or test mini-cluster. And then the last part here is the evaluation and reporting aspect. With that, I took a test-framework-agnostic approach, because I know that everyone has their own taste. Some people like pytest, some like unittest. I don't want to sit here and make opinions for you; I want you to be able to write tests in the format that you're most comfortable with. Oh, that came out small, but that's okay. So the first part here is going more in depth into what exactly bootstrap and configuration is: you're able to bring up example nodes, and you're able to provision them. What is it? You can also register hooks, so it's similar for anyone who's done any Debian packaging. Listening to Todd, I don't think he's still here. Yeah, you seem to have some experience with Debian developers. So, but... How do you, like, circuit 2,000? Yeah, circuit 2,000, yeah, it's cool. But, yeah, if anyone's ever worked with pbuilder before, one of the features that I really like about pbuilder for building Debian packages is that you can create different hooks that run at certain parts of the package build. So I wanted to replicate that same functionality. The general process is instantiating a Configure instance, which is a Python singleton class object that basically takes in all that info and shares it across the whole test suite.
Then you bring up the nodes that you need, and you define your hooks, so that's, what do I want to run when I start the environment, and then you register them so that, when you run the test, the program knows about them. The next part here is the testlets. That part uses a little bit more tricky programming. In this case, I use Python decorators and metaprogramming, so reading class descriptors and whatnot, and assessing the current state of the program. What the decorators do with the metaprogramming is that, rather than having that function run locally, they take out the body of the function that you defined, inject it inside of the container instance, and run it there. So the idea is that you could be working off of an Ubuntu machine, but you're developing for a cluster that's running Rocky Linux. In that case, you can just easily bring it up, inject the test in there, and get the results back. And then finally, the last part, evaluation and reporting. Each testlet returns a results object, which contains an exit code, standard out, and standard error. From there, you can evaluate the results locally instead of having to do it inside of the container. And if you're doing a spread test, say, I need to test this on Ubuntu 22.04, 20.04 and 18.04, it returns a generator object of names and results. So let's say that your code works on 22.04 and 20.04, but the version of Python on 18.04 is too old for what you're trying to do; that gets reported back as an error. So then, breaking into how exactly it works. The idea is that you start on your local host, so that's your computer there.
So you have the host operating system, and then you have the cleantest package installed as part of your Python interpreter; it's a regular Python package. The idea is that then, as you see with that dotted line in the middle there, it makes a request to the hypervisor of your choice and tells it: hey, the user who wrote this test told me that I need to bring up a certain instance, they say they need a CentOS image, so bring up a CentOS image for me. Once that's done, cleantest takes that test body function and creates a simple JSON packet, which has a checksum to verify the authenticity of the testlet; the data, which is basically the testlet encoded, plus any data necessary for the testlet; and then the injectable, which is basically, hey, when you get this data packet, here's what you need to do with it. Once that happens, cleantest copies itself onto the container image, and from there it ingests that data packet, does the evaluation that you requested, and returns that result object back to the cleantest that's on the local host. Then, for how you control the hypervisor, there are two different ways that it works. The first way is the Archon, which is a fancy word for director. I kind of wanted to have a buzzword in there somewhere. The Archon is a more declarative approach to doing cleantests: rather than saying, automatically do this, wrap it, you can direct the deployment of said mini HPC cluster. The second is the Harness: instead of having to explicitly declare the infrastructure that I want for my deployment, it just brings up an instance based on the function that it's been wrapped around.
So this is a short demo video. Let's see if I can choose better quality here. Oh, don't tell me. It came as a PDF, but let's see here. All right, YouTube, sweet. Let's go full screen. There we go. Oh, settings, playback. 1080p. There we go. It's a little interesting. But yeah, so basically what happens here is that I'm just using tox; in this case, I use tox as the test administrator. I started a test, which is called LXD Archon, which basically says bring up a test environment instance. First it starts with LDAP, so it's provisioning an LDAP node on top of the LXD hypervisor. That's what I'm using here. And then, after a few minutes for it to boot (I was on crappy hotel Wi-Fi when I was doing this)... see there? Now you have the NFS image. That starts provisioning, and then somewhere in here... now you have the slurmctld node that comes up. And now you have the three Slurm compute nodes that come up. The idea is that the framework then injects a testlet inside of slurmctld, and from there it uses sbatch to submit the job off to the test cluster. Takes a bit. Ooh. A little too far ahead. Give it a few seconds. Yep, and then it cleans up the cluster so that it doesn't linger on afterwards. Oh, goodbye. Oh, God. What happened here? All right. Okay. That's not fun. Okay. Yeah, I want to go back to the video. I don't know why it jumped into the other video. I just had an autoplay moment. Wow. I feel like a school teacher. This is right at the end. Okay. All right, thank you. Yeah. All right, okay. Ooh, no autoplay. All right, so basically what happens here, I'm going to full-screen it so it's a little bit bigger. What? Come on. I'm an engineer, not a YouTube-video-player-on-a-projector kind of guy, but what is it? Yeah. So what you've seen here is that the test starts and brings up the nodes that you need to use.
So in that case, it was just LDAP for basic identity management, NFS for the shared file system, and then Slurm for resource management services. From there, it just injects a little test script to run. And then, if the job succeeds, it copies back the results. And then it says, okay, we get the result that we expect. In the case of the test that I wrote, it just prints out, basically, "I love doing research", and then you see "I love doing research" in the log file for standard out. So, yeah, it's pretty low fidelity right now; a lot of the work went into just getting it to work and all that. Okay, now I want to go back to the slides. There we go. So now that you've seen that video, let's go over some of the current limitations. The first is that I'm kind of bad at playing YouTube videos in presentations. But the next part is that, right now, a big issue is the lack of robust multi-distribution support. Currently, I've mostly developed it to work on Ubuntu and work with Ubuntu, I wonder why. You can launch Alma, Rocky, CentOS, Arch instances, et cetera, but the macros, hooks, and utilities that I built into the framework aren't fully there yet for supporting them. And then public documentation is behind because, you know, usually I write code before I write documentation. Unfortunately, it just seems to be how it always goes. So I need to update that. Another big issue right now is the lack of package manager integration. A lot of the support has been added ad hoc. Currently, I support charm libraries, which is something that Canonical uses, and then snap packages, which are kind of controversial, depending on your opinions. And then also just pip, because I do a lot of Python development.
But in the future, I hope to add support for Debian packages, RPMs, Arch installs, Spack and EasyBuild. And then lastly, I'm the only developer currently. Code developed in isolation isn't reviewed as thoroughly as it could be, so I make design choices based on what I think is appropriate. The last thing, too, is that the testing framework, I think, is a lot cooler than the video that I struggled to play here. So if you're interested, feel free to scan the QR code. The slides are online as well, so if you're not available right now, you can check it out later. And then lastly, this is a call for involvement. At Canonical, we're trying to start getting Ubuntu geared better for HPC. We know that we're a little bit behind Red Hat in the case of network driver support and whatnot. So if you're interested in using Ubuntu for your workflows, we have a public Mattermost channel for HPC; you can scan that QR code or check it out later. If there's something that's missing, or there's some reason why you're being held back from using Ubuntu for HPC, we'd really love to hear that feedback. So, yeah, that's it. Thank you. Any questions for Jason? Last chance for today. So this is all Python code that does this for you? Yes, yes. So, how are you spinning up the LXD containers? So, in the case of LXD, it has a public API socket that uses, I think, the OpenAPI standard or something. So you can make HTTP requests out to that API that basically say, oh, I need this instance, or tear this down, or set this configuration so I can install Apptainer or some other container hypervisor inside of LXD and whatnot.
So, yeah, I'm just using the API. Any other questions? So, yeah, I'm interested in this. You had an NFS server and some other stuff. Are those preassembled images that you're just using? Or are you building them up with some kind of configuration management, like using Ansible, or how do you do that? Yeah, so currently I'm using base Ubuntu images for provisioning, but you do have the ability to register your own custom instances and pull them in as well. Basically, I have a little mechanism built into the framework where you can write provisioning scripts using the cleantest utilities. There's some stuff for installing apps, running commands as subprocesses on the unit, and you can also reach out to the network and download anything you need. So, yeah, I'm just using custom Python scripts to do the provisioning... Yes, yes. Anyone else? Last chance there? All right. It's for the stream and the recording. We've been doing well all day. Let's keep it up. Thank you very much. Would you mind moving to the previous slide so I can scan the QR code correctly? Okay, thank you very much. You're welcome. That was a good last question. The other one? Which one are you looking for? This one or...? Yeah, this one. Okay. That's all good. All right. If there are no more questions, we can wrap up here. Thanks a lot, everyone. That was a wrap for today. This was the 9th HPC devroom at FOSDEM, which means that if FOSDEM likes us, we can have a 10th one next year. That would be really nice. Some practical stuff. If you're leaving the room and you see any trash, please take it with you and dump it in an appropriate place outside. And the FOSDEM team has asked us to ask you to leave the building as soon as possible so they can lock up the whole building. There's another talk going on in Janson, in the really big room.
I would recommend going there. It's a really nice way to wrap up FOSDEM, and there will be places there to get one or maybe two or three more beers and have some more chats with people. Thanks a lot, and hopefully see you next year. Thank you. |
Devroom kick-off talk: UKI? DDI?? Oh my!!!
Introducing and decoding image-based Linux terminology and concepts |
Good morning and welcome to the very first Image-based Linux and Secure Measured Boot devroom. Bit of a mouthful; we'll try a shorter one next year. So let me start by introducing myself. My name is Luca. By day I am a software engineer in the Linux systems group at Microsoft, where I work on the Azure infrastructure. And by night I contribute to various open source projects. I am a systemd maintainer, Debian developer, DPDK LTS maintainer, and a bunch of other stuff that I forget and neglect. So I will give you the introduction to the devroom and an overview of all the topics that we touch on, to hopefully give you a holistic view of what image-based Linux is. Let me start by saying thank you to all the organizers and co-organizers of this devroom, especially to Thilo, who unfortunately could not make it to Brussels this year, but he did most of the work on the FOSDEM website and CFP and so on. So thank you, Thilo, if you are watching. Now some boring logistics. This devroom has a five-minute break between talks to allow for switching and some spillover. We have a 10-minute break at 10 past 12 and we finish at 20 past 2. The next devroom starts at half past 2. Now, in case this is your first FOSDEM, or it's not but you never noticed, everything is live streamed and recorded. If you're not comfortable with the back of your head being recorded or live streamed, the best I can suggest is to sit at the sides. If you ask questions, remember, again, they will be live streamed and recorded. If you're not comfortable with that, there's a Matrix chat. You can ask a question there and our devroom organizers will repeat it for you. We do want people to ask questions, so please do so. Please do not just start shouting at the presenter; raise your hand and then we will come to you with a microphone. If you're a speaker and people shout a question at you, please first repeat it and then answer it. And I think that's it. Now let's get into the interesting stuff. What is image-based Linux?
Now if you're an embedded person or adjacent to that world, you're probably thinking, what are these guys talking about? We've been doing image-based Linux for like 30 years, it's nothing new. And you wouldn't be completely wrong. Now the difference is that our focus within this group of people, this ecosystem, is on the security aspect. Because let's face it, Linux runs everywhere, right? Most of our infrastructure and economy runs on Linux these days. All the public clouds run on Linux, even Azure. So we want to get our security story straight. What does that mean? What are some of the basic concepts? First of all, we want to have first-class support for at least one of UEFI Secure Boot and TPM-based measurements, hopefully both. Because the goal here is to extend the chain of trust at boot. Now if you're using a generic Linux distribution like Debian or Ubuntu or Fedora, the story in your firmware-to-kernel chain of trust is pretty well buttoned up by now. Because a lot of people did a lot of work in the past 12 years to get that story straight, and they keep doing that to maintain it. So in your generic distribution, you will have your firmware, which verifies the first-stage boot loader, which will be shim, signed by Microsoft; then shim, the first-stage boot loader, verifies the second-stage boot loader, which verifies the kernel, and the kernel verifies the kernel modules. So if you are within ring zero, the chain of trust is pretty solid. There is this small little thing to the side called user space where things are not so peachy, and that is what we are trying to improve. So just a quick summary, we'll go into more details, but we want initrds to be signed. Initrds are completely unprotected right now in most distributions. They are built locally. If an attacker or malware can get write access to that, they can embed their malware in there and you will be none the wiser. That's a bit of a problem. Same thing for your root FS.
It probably is encrypted these days, but that helps against offline attacks, not online ones. We want to do better there. One of the key requirements to use any of the specifications, infrastructure and tools that we'll see is that you need to have a hermetic /usr. What does that mean? It means your vendor tree must be within /usr. If you are on one of those handful of distributions that still have top-level /bin, /sbin or /lib directories that are not symlinks, it's time to move on. Even Debian managed, finally, kicking and screaming, to get into merged-/usr. That is our core requirement. The people who work on this stuff kind of got together from various distributions, companies and projects, and we created this UAPI Group. We have a nice website. We have a GitHub organization with a discussion tab. There are already quite some interesting discussions going on there, so I encourage people who are interested in these topics to check that out. Now, what does image-based Linux actually mean? This is my personal understanding and analysis from my point of view. I see at least three different flavors of this. There's the pure image-based one. It's the one that I do in Azure, where you build images, whole images, and you ship them to the machines. You have dm-verity to cover the root file system. I won't explain what that is; the next talk will go into details. And then we have A/B schemes for upgrades and downgrades. Nothing groundbreaking. Android has been doing this for 15 years or whatever. The second camp is the OSTree one, pure or rpm-ostree-based, like Fedora CoreOS, for example. What they do there is they build either packages or OSTree snapshots, and then they apply them locally. You reboot into the next snapshot or a different one. It's like having a Git tree for your file system. The root file system there is either ephemeral or immutable at runtime, I cannot remember. You cannot change it. You reboot into the new one. The Btrfs camp is very similar.
Instead of OSTree, you use the Btrfs snapshotting capabilities. So when you install a package, it doesn't get installed in your root FS. You install it into the new snapshot and then reboot into it. The core thing I want you to take away from this is that there are different flavors, different ways of stringing things together. That's okay. That's what Linux is great at, this diversity. But the core important concept is that we share goals. We share tools, infrastructure, and specifications, because we want the same thing in different ways. So let's look at some of these specifications. Fair warning, there are a lot of acronyms coming your way now. I apologize in advance. Now, UKI, you will hear a lot about this today: unified kernel image. What is this? A UKI is a single PE binary, a UEFI executable. Why is it good? Because you combine a UEFI stub, a kernel, and an initrd. And then you can sign it for Secure Boot. Remember I talked about the initrd being unsecured before? This is how we fix it. The initrd is no longer built locally. It's built by the vendor and shipped inside the UKI. So it's signed and it's verified as part of the boot process. I won't go into details on this process because one of the next talks will tell you everything about it. So the UKI is dropped into the boot partition or the ESP. And then it's auto-discovered by boot loaders implementing the BLS, the Boot Loader Specification. What that means is that you don't need to configure your system to pick the UKI up when you boot. The boot loader will parse what's available, get information out of the UKI itself and present you a menu. It's drop-in, plug and play basically. So this is supported by systemd-boot, and there are patches for GRUB as well. I think Fedora will ship with those patches and hopefully they make their way upstream. Another good thing about UKIs is that not only do we sign them and verify them as one, but we can also predictably measure them into the TPM, in PCR 11.
So the hashes will always match. If that doesn't make any sense to you, that's okay. Later we will tell you everything about TPMs and measurements. I just mention it here so you have an overarching view of why this stuff is good. And we want to do some future work here, but the important thing is that the specification is at this URL. And that's it for UKIs. Now, the next one, this is my favorite one: DDI, discoverable disk image. What is this thing? It's just a raw image wrapped in a GPT partition table. The good thing is that it is self-described. Each partition is tagged with a UUID that is fixed and tells you what the purpose of the partition is. You don't need to say root=/dev/sda5, because the partition is tagged with a UUID that says "I'm the root FS". Also, because security is important to us, this natively has first-class support for dm-verity for protection. Again, dm-verity will be delved into later. I won't tell you what it is; it's for securing the partition against tampering. All tools that support DDIs support dm-verity natively. The other important thing is that, given they are self-described, you just pass them to the right tool and they do the right thing that you expect out of the box. If the disk where your ESP is is a DDI, systemd will automatically find the root partition by looking at the UUID and boot it from the initrd. If you pass the DDI to nspawn, it will automatically use it as the root file system of the container you're booting. You pass it to portabled or to a system service as RootImage= and it will automatically be used as the root file system of that mount namespace. You pass it to sysext and it will be automatically used to extend the root file system. We'll see in an example what that means. So one image format, self-described, give it to many tools, they do the right thing automatically. Security as a first-class concept. Now what's a sysext? This is important for the initrd talk later. So it's a specific type of DDI.
It can be used to extend a root file system. So it will ship the /usr hierarchy, or /opt if you're a third-party vendor, and it's identified by the extension-release file, which matches the format of the os-release file that you're probably familiar with. The specification of this is at that URL. So you get a root FS DDI, a bunch of sysext DDIs and bam, you get an overlayfs, read-only, that sums the content of all those images. And again, this is important for the later talk. Again, security is a first-class citizen: dm-verity is supported for all of these and for all the tools that use these sysext DDIs. Same idea as before, you pass it to the right tool, it does the right thing by default. If it's in your ESP, and we'll see how later, it will be used to extend the initrd. If it's on /var or /etc, systemd will use it to extend your root FS. You pass it to portabled or to a system service, and it will extend the root FS of the service or portable service. Again, with security in mind, so it's all protected, read-only and enforced by the kernel. You pass it to nspawn and nothing happens, because we don't support it yet. We should probably add that at some point. The specification is at this URL. Now, all of this might sound like the crazy mumbling of a raving lunatic, but I swear it's real. It exists. It's used in production. So what you can see here is the stuff that I work on. It's a PCI Express card that has an ARM64 system on a chip. It's used in the Azure hosts. So the machines that run Azure virtual machines have these cards plugged in, and the Linux OS running on them provides offloading and acceleration for the Azure nodes. So if you use Azure, you already use this DDI stuff. You just don't know about it, because we use DDIs extensively, and we are looking into UKIs as well. So this is all real. It's all true. Now, to conclude, come talk to us. We usually don't bite. Here's the URL again, for the website and for the GitHub organization.
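The matching between an extension-release file and the host's os-release file described above can be sketched as follows. This is a simplified illustration, not systemd's implementation: the function names are made up for this example, and the matching rule (same ID, then SYSEXT_LEVEL if the extension declares one, otherwise VERSION_ID, with ID=_any matching everything) is an assumption based on a reading of the sysext documentation that should be checked against the specification.

```python
def parse_release(text):
    """Parse os-release style KEY=value lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of surrounding quotes, as os-release values may be quoted.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def extension_matches(host, ext):
    """Simplified compatibility check; the exact rules are an assumption."""
    if ext.get("ID") == "_any":
        return True
    if ext.get("ID") != host.get("ID"):
        return False
    if "SYSEXT_LEVEL" in ext:
        return ext["SYSEXT_LEVEL"] == host.get("SYSEXT_LEVEL")
    return ext.get("VERSION_ID") == host.get("VERSION_ID")


if __name__ == "__main__":
    host = parse_release('ID=debian\nVERSION_ID="12"\n')
    ext = parse_release("ID=debian\nVERSION_ID=12\n")
    print(extension_matches(host, ext))
```

The point of the check is that an extension built for one OS release is never merged over a different one, which keeps the combined /usr tree consistent.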
So we want you to join us and embrace a more secure way of doing Linux. We want you to help us extend the specifications, and we also want to finally get a whole class of security bugs extinguished. And any questions? Yes, microphone. Hi, how would you compare UKIs to FIT images from U-Boot, which also support signing and packaging all these parts into one single image? Yes. It is actually quite similar. So U-Boot is a firmware slash boot loader environment used by embedded devices, essentially. They support this FIT format, Flattened Image Tree, and they have a very similar concept, absolutely. The main difference is that this is done with TPMs in mind. I'm not sure U-Boot supports that and measures everything. Okay. I'm not very familiar with that, but they are very similar concepts. I don't know what the main difference would be. It's just different environments. I guess this is mainly for UEFI, and U-Boot is a specific boot loader environment, right? All right, okay. They are very similar, I guess, then, yeah. Thank you for the talk. From my understanding, in the usual distributions we often have shim signed and GRUB signed, but we don't have systemd-boot signed or extlinux signed or U-Boot signed. What is the plan for the future to maybe have those signed? Excellent question. Now, there is a group of people working on this problem of UEFI and UEFI 2.0 and everything that happened with the Secured-core PCs. Things are moving. I can't tell you much more than that right now. We do want to get sd-boot signed for some internal use cases, so we want to get that allow-listed so it can be used as a second-stage loader payload for shim. We have not done that yet. We would like to have that done at some point in the near future. Thank you. Can I make a comment?
So, a kind of intermediate option is to have it signed by a certificate that is provided by the distribution, and it's protected by the hardware security measures and so on, but it's not trusted by shim, and then you can self-enroll on the first boot and have a trust-on-first-boot thing. We've done a bunch of work on sd-boot to allow self-enrollment on first use, so you can always do that. Of course, it doesn't work by default. You need to do the self-enrollment, but it's a step in the right direction. Yes, thank you. Anything else? Can you pass me this, thank you. If you compile the Linux kernel yourself, what do you have to do then? So, in systemd 253, we ship a tool called ukify, pronounce it however you want, and that will allow you to easily put together a UKI. Of course, in that case, you cannot sign with an official key. You need to do self-key enrollment and whatnot, but it is possible to build a UKI locally, absolutely. You need to sort out the signature by yourself, of course. I think there was one at the back, yeah. You mentioned, well, I saw in one of your slides the abbreviation DPS. DPS, yes. What does that mean? Sorry, yes, I should have said that. Discoverable Partitions Specification. Yes, I told you there were a lot of acronyms. So that is the list of all the UUIDs and what they define: root FS, verity, /var, /tmp, blah, blah, blah, et cetera. Thank you. Thank you. I completely forgot about that. I think we have three minutes. Any more questions? Anything online? Nope. Going once, going twice, well, thank you very much then. |
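To make the idea of self-described partitions from the talk concrete, here is a minimal sketch of a lookup by partition type UUID. The two UUIDs are quoted from memory of the Discoverable Partitions Specification (the EFI System Partition and the x86-64 root partition) and should be verified against the published list; the `find_root` helper, the device names and the input format are hypothetical and exist only for this illustration.

```python
import uuid

# Partition type UUIDs as recalled from the specification; verify before relying on them.
PARTITION_TYPES = {
    uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b"): "esp",
    uuid.UUID("4f68bce3-e8cd-4db1-96e7-fbcaf984b709"): "root-x86-64",
}


def find_root(partitions):
    """Return the first device whose type UUID marks it as the root file system.

    `partitions` is a list of (device_name, type_uuid_string) pairs, a made-up
    stand-in for what a GPT parser would report.
    """
    for device, type_uuid in partitions:
        if PARTITION_TYPES.get(uuid.UUID(type_uuid)) == "root-x86-64":
            return device
    return None


if __name__ == "__main__":
    table = [
        ("/dev/sda1", "C12A7328-F81F-11D2-BA4B-00A0C93EC93B"),
        ("/dev/sda5", "4f68bce3-e8cd-4db1-96e7-fbcaf984b709"),
    ]
    print(find_root(table))
```

Because the purpose is carried by the partition itself, no root= argument naming a device is needed; the tool only has to scan the table.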
DM-Verity Rootfs Protection
Blockwise Hashtree |
So the presentation is about dm-verity, to establish root file system integrity. My name is Renk Riewerger. I've got a small demonstrator, following this link, so that's my sandbox to evaluate those techniques. So let's jump into the presentation. So what is dm-verity? As already mentioned, it belongs to a family of kernel device-mapper modules. It maps a physical block device onto higher-level virtual block devices. The first one to mention in this family is dm-crypt. It's intended for encryption, realizing confidentiality of your partition or the data in your partition. It establishes read-write access. dm-integrity is kind of journaling, also establishing read-write access, and there is dm-verity for integrity and optional authenticity, establishing read-only access to your device. dm-crypt, as already mentioned, only establishes confidentiality. That means authenticity or integrity is not enforced, so it might be possible to modify the content of your encrypted file system and you would never notice: if by luck the block or the file structures or the directory structures still fit, you wouldn't notice. dm-verity is different. If you use dm-verity, any modification of your partition, file structure or content of your files will be noticed, and the integrity of your partition is enforced. It's also possible to sign your dm-verity setup. In this case, you achieve authenticity, so you know for sure that, if the signature matches, whatever you delivered has not been manipulated. dm-verity has been available since kernel 3.4 or Android 4.4, so it's quite old, late 2013, so it's not a new feature we're talking about. So how does dm-verity work?
dm-verity is based on a hash tree. So you have got your block device, for example containing your root file system; these are the blue boxes on the bottom. For every block of the device, 1K, 4K, whatever you choose, a hash value is calculated, and a group of hash values forms one more hash value on a higher level, and so on and so on, until you reach a single hash value on top called the root hash value. And this root hash value represents the state of your partition. If you sign this root hash value, you achieve authenticity of your overall partition. The good thing is that to achieve this, you don't need any secret on the target. You just have to ship a signed entity to your target and the public key, and using the public key, the authenticity of your partition can be verified. So it's different from TPM-based approaches or dm-crypt: for dm-crypt you need a secret, here you don't need a secret. So how does it work? Once you have created this hash tree, you install your root file system, your partition, for example on sda3, and your hash tree will be placed in the partition sda4. dm-verity in the kernel is set up using both partitions, and it provides a virtual block device to user space, and every time user space reads a block from your partition, it will be verified against the corresponding hash tree. So each block from the root file system in sda3 will be hashed, and the hash is compared to the hash value in the hash tree. And the hash will be verified up to the root hash. And as the root hash is signed, you are sure that not only integrity is given, but authenticity is given also, because of the signed root hash value. So what do we achieve now using dm-verity? It's a countermeasure against one of the major threats for embedded devices in the field, IoT devices being installed somewhere along the roads or whatever: detecting manipulation during startup.
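The hash tree described above can be sketched in a few lines. This is a simplified model, not the dm-verity on-disk format: it assumes SHA-256, uses no salt, groups only two hashes per parent (the `fanout` parameter is an illustrative choice), and keeps the whole tree in memory. The function names are invented for this example, and at least one block is required.

```python
import hashlib


def hash_block(data):
    """Hash one data block or one group of child hashes."""
    return hashlib.sha256(data).digest()


def build_tree(blocks, fanout=2):
    """Return all tree levels; levels[0] holds leaf hashes, levels[-1][0] is the root hash."""
    levels = [[hash_block(block) for block in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hash_block(b"".join(prev[i:i + fanout]))
            for i in range(0, len(prev), fanout)
        ])
    return levels


def verify_block(index, data, levels, root_hash, fanout=2):
    """Check one block on read: hash it, then walk up the tree to the trusted root hash."""
    if hash_block(data) != levels[0][index]:
        return False
    for depth in range(1, len(levels)):
        parent = index // fanout
        group = levels[depth - 1][parent * fanout:(parent + 1) * fanout]
        if hash_block(b"".join(group)) != levels[depth][parent]:
            return False
        index = parent
    return levels[-1][0] == root_hash


if __name__ == "__main__":
    blocks = [bytes([i]) * 4096 for i in range(5)]
    levels = build_tree(blocks)
    root = levels[-1][0]
    print(verify_block(3, blocks[3], levels, root))
    print(verify_block(3, b"x" * 4096, levels, root))
```

Only the root hash needs to be trusted (for example through a signature); the tree itself can sit on an untrusted partition, because any change to it breaks the path up to the root.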
You can also detect manipulation during runtime, because every time a block is read from your root file system, it will be hashed again, and it will be compared to the signed hash tree. So this way, even after startup, during runtime, it's not possible to modify the content of your partition unnoticed. You can use dm-verity to terminate the execution of your kernel and the overall operating system in case manipulation has been detected. It can deal with forward error correction in case of wear-out of your flash devices or re-arranging of blocks on your flash, and it requires minimal runtime overhead and almost zero latency during startup. So compare it to a naive way of verifying the integrity and authenticity of a root file system: let's say it's 150 megabytes and you hash the complete 150 megabytes during startup, at 50 megabytes per second, so that will take several seconds, at least, to verify. Using this one, there is zero latency, almost zero, so it's not noticeable. It's just reading a few blocks: only the few blocks being read to start systemd and your basic services have got to be verified and compared with the corresponding hash tree managed by dm-verity. So where should I... In line 11, the signature of the root hash, it's about 560 bytes, so we are talking about half a K of command line parameters here. So what do we need in the kernel? We need some kernel features. We have got to integrate into the kernel the device-mapper init capability, Verity, of course, the root hash verification and a trusted keyring, and we have got to specify the certificate being used to verify the root hash. And this is the only cryptographic item we need to compile into the binaries. It's a public key, it's a certificate required to verify the hash tree. So let me get back to this one. As you know, we must make sure that we have got a secure boot process without gaps.
So you showed that public key. Is that public key compiled into your kernel, is that how it works? Yes, it's a certificate, but it contains the public key, and that's all you need. So we have got to make sure that the boot process is secure. And with the signed boot loader, for example U-Boot, we must make sure that there's no possibility of escape; we have got to lock down U-Boot. This is difficult if we want to establish some kind of A/B booting, booting to A, booting to B. So U-Boot must provide some support for the U-Boot environment. So here we need some features to lock down U-Boot, to allow only certain variables to be read and evaluated from the environment. If we do A/B booting, we will have two different kernel command lines. We have got to specify the device or partition representing slot A, and we need another command line representing slot B. So we can't manage this in the U-Boot environment, with the command line also containing the seed, the root hash value and the signature of the root hash. So for this reason, the device tree now contains the boot arguments, and we can provide two different configurations in the FIT image: one device tree containing the boot arguments for slot A, and one for slot B. And the only thing U-Boot has to specify now is whether it should boot to slot A or to slot B. So it works like this: U-Boot loads the FIT image and specifies which configuration it should boot. It just specifies boot configuration A or boot configuration B. And these then represent the device tree configurations, either for slot A or for slot B. And that's the reason the boot arguments have been moved into the FIT image. So we can provide two different device trees with two different sets of boot arguments, either for slot A or for slot B booting. So the benefits: dm-verity introduces a very low overhead.
It allows us to do root FS integrity and authenticity, it's terminating the application in case manipulation of the root file system has been detected, and, well, it's just a nice feature. And I wondered, there's not a lot of documentation about this feature as far as I got to know. So are there any questions? Yes. Yeah, thank you for your talk. I have a question about the verification. So as I understand it, it's an on-the-fly verification of the image, meaning that the system is already in use when there are still some of those blocks to be verified. Let's say, in a secure boot, you might have the condition that you say I only execute signed code, meaning that either I know that this all has integrity, or I'm not starting it up at all, right? Yeah. For, let's say, some critical applications, this might be important, because if you have some kind of control device for, I don't know, an autopilot or something like that, then maybe you don't want to get into that application if you don't have the assurance that everything's okay. Yes. That's the reason the kernel must not be stored in the root FS. For some embedded systems, you will find that Buildroot is putting the kernel into the root file system by default. You can't do that here, because you have got to start the kernel first, and only then are you able to verify the dm-verity tree. For that reason, the kernel is located in the FIT image and the FIT image is verified by the boot loader. So once we start the kernel, we know that the integrity of the kernel is given. If you start an application from the root file system, it's read block-wise and it must be read into memory and linked. And if during reading block-wise ... |
Image-Based Linux and TPMs
Measured Boot, Protecting Secrets and you |
So, welcome again. The next speaker is Lennart Poettering, known for systemd, PulseAudio, and other fundamental projects. And, yeah, he will talk about his journey and learnings of using TPMs and, yeah, how that is working for image-based Linux operating systems, measured boot, and so on. Hi. Yeah. What he said. So, I'm going to talk about image-based Linux and TPMs. We don't have much time. I think it's just 20 minutes, and it's a complex topic. I already know right now that I'm not going to be able to finish, but, yeah, that's not too bad. I hope the slides will be online sooner or later, so if we don't finish, it's not too bad. So, let's jump right in. What's an image-based Linux? I don't think I have to say too much about that, because Luca already gave an introduction about this. But, yeah, it's really about having coarse file system images instead of fine-grained packages. Many benefits. I'm not going to go into detail. Here, we'll focus simply on the relevance for building trusted systems. These images we now call DDIs. Luca had that on his slides as well; it stands for discoverable disk image. It just means that there's a GPT partition table which describes what's on it, and it usually has Verity and things like that. The other thing in the title of my talk was TPM. Yeah, I hope some of you have a rough idea what a TPM is. It's a security chip. Traditionally it's a discrete one, but nowadays it can also be in firmware or even in software. It's a cryptographic coprocessor in a way, and usually people then think, oh, it must be something fast, but it's usually not. It's actually terribly slow, usually, but that's not what it's supposed to be. It's standardized and widely available. All your laptops have one; in particular, the business laptops usually have it as a discrete chip, and the cheaper ones nowadays have it in firmware. It's pretty universal.
Conceptually, it's similar to smart cards and FIDO keys, right, like it's a storage for keys, but it's also very different. It's supposedly tamper-proof. It can store at least one key securely, actually more than that, but the primary purpose is that there's a seed key stored on it, and then everything else can be derived from this. That key cannot be extracted, that's the fun part about it, so ideally you have to have that specific chip, and if you don't have that chip, then, yeah, you can't retrieve the keys. You can do lots of things with these things. The most interesting one, I think, is that you can ask the TPM to encrypt and decrypt data, ultimately with that seed key that's stored on it. That basically means that, while you don't have that much storage on the TPM, you can seemingly store as many keys as you want with the TPM simply by wrapping the key, as they say: you take any secret key you have, pass it to the TPM so that it encrypts it with its own built-in seed key, or some key derived from that, and then you get it back and store it on disk, and so you basically have infinite ways to bind keys to the TPM. What's fun about this part is that this encryption can involve policy. Policy basically means that you can make restrictions about how the key can be decrypted. For example, you can require that a PIN is also provided, like a human-typed PIN. PIN just means password, by the way; it doesn't mean it has to be a number. It can also require that the system has to be in a specific state for the key to be decrypted. State means, for example, that a specific software, firmware or boot loader runs, that the system is in a specific boot phase, that a specific disk encryption volume key has been used, and things like that. That's done via a concept called PCRs; more about that later. It's probably mostly that part that we're going to talk about. It can also mean you can bind policy to time, so that you can specify a time when a key can be decrypted, with decryption prohibited otherwise.
It's actually a useful feature. It sounds crazy, but it's actually pretty useful, and hopefully we can quickly talk about that later. The prior slide, that's just the TPM stuff that I find interesting and that we're going to touch on in this talk, but there's a lot more to it. It has an RNG, you can actually store stuff there, there's dictionary attack protection, like millions of things. You can not just encrypt, you can also sign, I don't know. I'm not going to go into detail on that, with only 20 minutes. These PCR policies I do want to go into detail on. It's the most interesting type of access policy, and most people who have dealt with TPMs, I think, who have encrypted their hard disk and linked it to the TPM, have usually just played around with the PCR policies. PCR is short for platform configuration register. They're basically just registers, like little data variables on the chip. Usually you have 24 of those. Initially they boot up as zero. You cannot set them to anything you like. The only thing you can do is pass some data to the PCR, and then it's going to execute this: you update the PCR with the number N, you have 24, so that's zero to 23, and it will calculate a hash value of the old value concatenated with the new data. I wrote SHA-256 there because it's effectively what people use these days, but it can actually be anything. But that's the only operation it does. That basically means that if you pass data, and later more data, and later more data, it's going to come up with a value, and this value is derived from all the data you passed to it. This is how we can implement a chain of trust. This operation where you send the data to the PCR is called extension of the PCR, or you can also say measurement of the data, depending on which way you look at it. How are these PCRs used on most of the laptops that you have? The firmware will already start writing data to them. What data does it actually write?
Basically all the code it executes, right before it executes it, and also all the configuration for that code, right before it makes use of it. Afterwards, the PCRs have a specific value, and this value you can then use to prove that the system went through certain stages, because every part of the system always measures the next thing, so that if every element in the chain was originally trusted, then it will pass on the trust and so on and so on, and nobody can change the past, only the future, and that's how you implement a chain of trust for measured boot. The other good thing is that if you know in advance the elements of that chain that you're going to measure, you can pre-calculate PCR values. You can basically say offline: okay, I'm using this firmware, this boot loader and this kernel, so I can tell you ahead of time what the PCR is going to look like, because I can actually calculate this hash operation myself in a very, very trivial way. This is awesome, because if you can pre-calculate the PCR values, and these PCR values can be used in policy on secrets and keys that you encrypt with the TPM, then you can basically say that keys can only be decrypted if the system booted up with software that I actually trust, and that's what is kind of useful about PCRs. You can do other stuff with that too, but this is the key takeaway. If we can bind the decryption of secrets by the TPM to the state of these PCR values, we can protect the secrets so that they can only be recovered on very specific systems running specific software with a specific configuration. Disk encryption is a primary use case for this. Image-based OSes have the benefit that they can be measured as a whole.
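The extend operation and the offline pre-calculation described above can be sketched as follows. This is a model of the arithmetic only, not a TPM interface: it assumes the SHA-256 bank, a PCR that starts as 32 zero bytes, and that each component is hashed first and its digest is what gets extended (the talk describes hashing the old value concatenated with the new data; hashing the data first is how measurements are commonly fed in practice, and is an assumption of this sketch). The function names and the component strings are invented for illustration.

```python
import hashlib

PCR_SIZE = 32  # size of a SHA-256 PCR in bytes


def extend(pcr, data):
    """Model of PCR extension: new = SHA256(old || SHA256(data))."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


def precalculate(components):
    """Replay a known boot chain offline, starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


if __name__ == "__main__":
    # Placeholder byte strings standing in for real firmware, boot loader and kernel images.
    expected = precalculate([b"firmware", b"bootloader", b"kernel"])
    print(expected.hex())
```

Because each new value depends on the previous one, a component cannot undo an earlier measurement: the final value matches the pre-calculated one only if every element, in the same order, is the expected one.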
If you have the entire OS in an image, it's very easy to calculate the hash value of it, and then it's very easy, if you take that hash value and pass it to the TPM, to know in advance what the PCR is going to be at the end, when the system is fully booted up. This is systematically different if you have a package-based OS, right, a package-based OS where you locally create a file system, locally unpack all these packages, and sometimes you update them at this time, and sometimes you update them at another time. With a package-based OS, you end up with a file system that is going to be wildly different from that of everybody else who installs the same OS, right? Image-based OSes don't have this vulnerability, because they're coarse, they're updated as a whole, you can do these precalculations, and they're going to have the same hash values on every single system, and hence the same PCR values on every single system. Yeah, what's also great is that it's not just about the hash values that eventually show up in the PCRs, because that sucks a little bit: it basically means that if you have a secret that you bind to a specific hash value, and then you update the image that you ultimately calculated this from, then all these hash values will change. That sucks, because it basically means that you can't update your software anymore, because if you do, you lose access to all your secrets. Nonetheless, this is what Windows actually does, but we can do better and actually sign these things. Because if we can precalculate them, right, if we know that this OS version is going to end up with these PCR values, then we can actually sign them. The vendor can do this and provide the user with the OS itself, plus information about what the expected PCR values are, plus the signature for it. Then you can use a TPM and, instead of linking secrets to the hash values, link them to the public key that belongs to the signature that you get.
This is awesome because it basically means that you can update as much as you want, you will not lose access to your secrets. Now I'm going to talk about systemd, because I'm the systemd guy. Quick overview of all the TPM stuff we now have in the most current systemd version, like, it actually covers the unreleased version that's coming up, 253. But yeah, systemd-stub, it's a stub, like a UEFI stub that you can glue in front of a kernel, and it does measurements of the kernel that's about to boot. It does lots of other things, but this is, yeah, for this context about TPMs, this is what I find interesting about it. It still runs in UEFI mode before Linux is invoked and then passes control to Linux. There's a system service called systemd-pcrphase. What that does is, it measures certain strings at specific times of the boot into a specific PCR. What's the purpose of this? Sometimes, or I guess always, it's kind of useful that, for example, the encryption key for the root volume can only be unlocked in the initrd, but not after. And that's an awesome property because it basically means that when somebody exploits your system after the root file system is actually activated, then they can't talk to the TPM to get the password back, because basically this phase stuff has destroyed the PCR values, and because those PCRs are used for the policy, yeah, the key is inaccessible until a reboot is done and we end up in an initrd again. Then systemd-pcrfs, what that measures is like FS identity, like file system identity, that's UUIDs of a file system, things like that. The thinking about that is that there should be a PCR where you can basically say, okay, that identifies a specific installation that I have. We use PCR 15 for this. So basically, you can guarantee, yeah, my laptop is going to have that value there, another laptop is definitely going to have a different one. So it's about being able to bind policy to specific installations.
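The boot-phase idea can be sketched as a toy model in Python. This is not how a real TPM or systemd-pcrphase is implemented: `ToyTPM` is a made-up class, the phase strings are only illustrative, and a real TPM enforces the policy in hardware rather than in a Python `if`. The sketch only shows why a secret bound to the "in the initrd" PCR value becomes unreachable once the next phase string has been measured.

```python
import hashlib


def extend(pcr: bytes, data: bytes) -> bytes:
    """PCR extend: hash the old value together with the digest of the data."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


class ToyTPM:
    """Hypothetical stand-in for a TPM with a single PCR."""

    def __init__(self) -> None:
        self.pcr = bytes(32)
        self._sealed: dict[str, tuple[bytes, bytes]] = {}

    def measure(self, word: str) -> None:
        self.pcr = extend(self.pcr, word.encode())

    def seal(self, name: str, secret: bytes, required_pcr: bytes) -> None:
        self._sealed[name] = (secret, required_pcr)

    def unseal(self, name: str) -> bytes:
        secret, required_pcr = self._sealed[name]
        if self.pcr != required_pcr:
            raise PermissionError("PCR value does not match the policy")
        return secret


# Pre-calculate what the PCR looks like while the initrd is running.
initrd_value = extend(bytes(32), b"enter-initrd")

tpm = ToyTPM()
tpm.seal("root-volume-key", b"s3cret", required_pcr=initrd_value)

tpm.measure("enter-initrd")
key_in_initrd = tpm.unseal("root-volume-key")  # works: we are in the initrd

tpm.measure("leave-initrd")  # from here on the PCR value is different
try:
    tpm.unseal("root-volume-key")
    unlocked_later = True
except PermissionError:
    unlocked_later = False
```

After the second `measure` call the old PCR value cannot be restored without a reboot, which is exactly the property described above.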
There's systemd-pcrmachine, which measures /etc/machine-id. /etc/machine-id is just an explicit system ID that we introduced a while back. systemd-cryptsetup is now something that can consume all this stuff that I was talking about there. These things measure; this one then actually is able to unlock disk secrets based on these PCR policies. There's something, yeah, cryptsetup is actually not just about making use of the PCR policies, it also measures a hash derived from the volume key of the root file system, for example, or actually you can measure any volume key with that. So the idea is basically that the secret is also measured into the TPM and that this gives a much stronger protection for the PCR 15 stuff that I was talking about. In future we'll probably also add something similar to systemd-veritysetup, so that basically, yeah, you can Verity-protect file systems and have the top-level hash measured. Cryptsetup can consume this. Then there's also systemd-creds in the service manager, that's a concept we recently added. systemd-creds is basically how you can pass secrets into services, and they can be encrypted via TPM stuff and things like that. So it's kind of useful that you basically can say, I want my X.509 secret key encrypted in a way that it can only be unlocked on my specific server, running my specific OS, during the initrd boot phase but not later, or something like this. systemd-measure is a tool that can pre-calculate expected PCR values for UKIs. UKIs, unified kernel images, we had talked about that before, so I'm not going to explain that. It can also sign them, so that you can basically have a UKI that carries the signature along with it, so that then later people can just make use of this to unlock their volumes.
There's a recently added tool, ukify, I don't know how we're supposed to pronounce that yet, but anyway, it builds on systemd-measure and a couple of other tools, and you just give it the components and it builds the UKI, signs it for secure boot, signs it with systemd-measure, and then you have this one blob that you can ultimately deploy. The net result of this is three relevant PCRs. PCR 11 is where we're going to measure the kernel and the root file system into. In PCR 12 there's going to be basically the configuration you pass to the kernel, that's primarily kernel parameters and things like that, but there's also more: we have this credentials concept that I mentioned, those are going to be measured into that if they are passed into the system, and we'll soon have something called syscfg, which is like a secure way to manage /etc stuff, and it's also going to be measured into that. And then PCR 15, as I mentioned, is kind of the identity of the local system. So yeah, and then you can consume this, and I mentioned this already, with systemd-cryptenroll, where you can basically say, now this disk can only be unlocked on a system that has physical access to this very TPM, so it's bound to hardware; that it can only be unlocked if it's booted with properly signed OS UKIs and DDIs by a specific vendor, so I can say, I install Fedora here and this secret key shall only ever be unlocked if it's actually Fedora that is booted, but not if Hacker OS or something is booted. I can link it to the configuration. I can say that secrets can only be available during a specific boot phase, for example in the initrd but not later, so that you know that if you leave your booted-up laptop somewhere in your hotel room or something like that, and you're not supervising it, at least they'll not be able to get your root volume encryption key by directly talking to the TPM.
You can also add a PIN, you can do rate limiting and things like that, so that, like, this is dictionary attack protection, so that people cannot just go and hammer the TPM trying a couple of things, trying a couple of PINs or something like that, because the TPM eventually says no, you have to wait or something, or you have to provide an unlock PIN. I'm good on time actually, so far so good. PCRs are really useful stuff and we're probably going to extend the uses of them all across the system for other things. They have other uses like remote attestation, like you can use them to basically remotely verify that a system is in a specific state. People usually use that for servers and stuff, so that if you have a cloud and you have lots of nodes, before you give a workload to a specific server, you can verify that it's still running exactly the software that it should be and nobody hacked it and booted something else and things like that. It's also useful actually, yeah, back to the hotel situation: I leave my laptop in the hotel, I come back, and I can use remote attestation so that this device can prove to me that it's actually my device, running my software as I intended and not something else. So, yeah, configuration and images I already mentioned, credentials I already mentioned briefly. Yeah. Last thing, secure boot and measured boot, which people shouldn't confuse. Measured boot is what I was just talking about with all the measurement, like this chain of PCRs and things like that. There's also secure boot in UEFI. Usually you do both, but they have different purposes, right? Like, one is ahead-of-time stuff where it doesn't even allow software to boot that isn't signed properly. Measured boot is different, like, the software can all run, but it will not get access to secrets if it's the wrong software. Anyway, we don't have much time anymore. I would like to take at least some questions, so let's do that. Hello.
So, all these measurements, are they going to be in the TPM event log for systemd too, where I can debug that stuff? Yeah, that's a good question. So, yeah, like, we currently mention this in the logging framework, like the regular, like syslog and the journal. We do not write a TPM event log. I know that there's a specification for actually writing a TPM event log. We totally should. But I didn't want to implement this TCG spec for that yet, and I didn't find a good library doing this that I wanted to use. But ultimately, like for the remote attestation case, we absolutely need this. Eventually, I want to get to this world where, like, before we run any service, any container or any VM or something, we can measure ahead of time what we do, and we write that result to that event log, and then people can consume the event log. So, I think systemd should write that. Yeah. I mean, I'm not a direct fan of the format either, but if you implement that stuff, we now have three event logs. We have the UEFI one, we have the IMA one, and then now we have the systemd one, so it's kind of, it's a kernel thing that actually the kernel should provide, I think, at least. Well, I mean, it can grow indefinitely, and I'm not sure you want the kernel to write stuff indefinitely. I don't know. I'm amazed that. If you care about these things, like, I don't think we should come up with our own standard for this, so I kind of, if we write anything, it probably should be the TCG one, but I don't know. I didn't look into details like rotation or whatever else. Anyway. Okay. Thanks. So, right at the moment, systemd-cryptenroll has a least-privilege problem, in that whenever you're handling cryptographic materials, you should always expose them to the least number of subsystems. And the problem is that systemd-cryptenroll is unwrapping the TPM key and then passing the unwrapped key down into the kernel.
But the kernel itself has a system for handling the unwrapping of TPM keys, so what we should be doing is passing the TPM key straight into the kernel and allowing the kernel to unwrap it. Are there any plans to actually implement that? I didn't know this existed. So I don't know, is it documented anywhere? Yeah, it's called the trusted keys subsystem. Yeah, I mean, yeah, so far we don't really make much use of the keyrings thing because it's a complex beast, but if it exists, I see no reason why we wouldn't use it. Okay, thanks. I have a talk on that tomorrow, on trusted keys, if you are interested. My question is about something else: do you know about systems where the TPM2 communication is encrypted and authenticated, to avoid someone desoldering the TPM and messing with it? We do parameter encryption, and originally it was just bound to the PIN, and in the recent versions there's some salting going on, but I think we are like, you know, the good thing is that nowadays even the tpm2-tss maintainers themselves send us all the patches to fix these things. So I think we're pretty, like, at least we're way better than what Windows currently does, and we keep on improving. Hi, how do you integrate this into a GRUB boot process? So as far as I know, you always need an initial RAM file system to set up or to process all this TPM functionality, and then initialize a LUKS key store, and then switch root into the root file system. So is there a better way? You're asking me if GRUB is a better way for anything?
You're asking the wrong guy, I think. GRUB, yeah, I'm not going to say bad things, but you know, we have this other bootloader, like systemd-boot, I mean it's not a bootloader, it's just a boot menu, but sd-boot, and I'm not a fan of GRUB in these things, and I systematically think it's really, really stupid to do disk encryption in GRUB, because it's not what you want there, you want authentication, not encryption. And so, I don't know, like, to build a trusted boot chain, I'm not convinced that GRUB should be part of the solution. It's just, like, you know, on EFI systems, at least if you do secure boot and things like that, you have all the building blocks that you need, you don't need another way of protecting things. I know that many distributions, like, they want to use an encrypted /boot, which I think is stupid, but if they do this, then yeah, of course, you need to add disk encryption support to GRUB. But I think the problem is you shouldn't have an encrypted /boot, you should use UKIs and things like that, and then have your secrets come in once the initrd initializes and tells the system what to do. So, if you ask me, is there a better way? No, what you're proposing is the worst way, this is the better way, so yeah. Thank you. A quick question: you didn't talk a lot about recovery, and I'm more interested in whether, through your tests, you have encountered a TPM2 wearing out and a TPM2 dying, and what can happen in those situations regarding field recovery, disaster recovery, and so on.
So, you know, in all the stuff we're doing, like, the TPM stuff is just one option among many, right, like, we have multiple key slots in LUKS and stuff, and so we actually added to systemd-cryptenroll high-level support for concepts like recovery keys, which are just regular passwords. The only difference is they're generated by the system instead of selected by users, so they have high entropy and should be good quality. But I would say, yeah, like, it's what everybody else does too, right, like on macOS you can have a recovery key, on Windows you can have a recovery key, and this is what we should do here as well, right, like use TPMs absolutely, and then have high-entropy recovery keys for when things are fucked up, but yeah, it's a matter ultimately for the distributions to have a nice UI for this, right. With all of this, I kind of want to push in the direction that the TPM stuff is the default, and then if you want something else, like FIDO or whatever else, you enroll it in addition to that, and that's, yeah, anyway, here, thank you very much. |
Building initrds in a new way |
Please sit down, if you can. So the next speaker is Zbigniew Jędrzejewski-Szmek. And he will talk about mkosi-initrd, a new project to build initrds from system packages. Yeah, so my talk builds on previous talks. So my name is Zbyszek, I work at Red Hat in the Plumbers group, and I work on systemd and Fedora. I contribute to RPM, rpmautospec, build tools and stuff like that. And so we're here today about a new approach to delivering the kernel and the user space, the root file system, to computers. And all the stuff that was mentioned today, so secure boot, to trust your code, PCR measurements, boot phases, locking secrets to the PCR state: this creates a situation where we should think about how we build initrds. And I think it's a good opportunity to kind of throw away a lot of old approaches and use a new approach. And the things that I'm talking about today would have been very hard a few years ago, because we didn't have those mechanisms that we have right now, like credentials and system extensions. So look, I talked about system extensions, so the compact disk images that carry a file system in one partition, the Verity data in another partition, and a signature for the root hash of the Verity data in a third partition. And this is squished into an image with minimal padding. So actually it sounds kind of awful, but it's basically just a file system image with some metadata that can be trusted. And those tools allow us to do things in a completely different fashion than we used to do them in the past. So what do we do now? I mean, this varies between distributions, but the approach is generally that on a general-purpose distribution like Fedora, the user wants to have an initrd, so they scrape some files off the file system, whatever is installed there, sometimes with some local modifications, sometimes not.
You use ldd to figure out what libraries should be loaded, and whenever there are extra files that need to be put in the initrd, well, somebody has to remember to do that, so essentially we duplicate the packaging layer. And we do it on every user machine, so this costs time during upgrades, it's actually quite noticeable because of compression. And so this was before we booted, and after we have booted into the initrd, generally, for example, the Fedora initrds, they have systemd, but they also have lots of extra functionality added that came before systemd was there. And over time, this functionality has been moving into systemd. And now we're at the point where it's completely useless, I mean, the part that is apart from systemd is not useful, because we can just get rid of it and because it is implemented by systemd in a better way, in general. And when people hear, OK, now we should do something, some kind of access to a file system in the initrd, oh, let's write some bash to do it. And why? I mean, we should just do the same thing we do in the host file system and use proper tools. And if those tools are not good enough, then we should fix them so that it's nice to use them in the host file system and they're also nicer to use in the initrd. And it's like a legacy that this initrd environment used to be much different, but we use systemd, and systemd does the setup where it sets up /proc and /dev and mounts everything that needs to be mounted. And in reality, the environment in the initrd doesn't have to be different. The fact that it's different from a real system is just something that we should get rid of. So we have this duplication where we have the systemd way to do things and the non-systemd way to do things in parallel. It just adds complexity and it's not beneficial. And everybody does the initrd in a very, very different way, every distribution. And even some distributions have multiple ways.
I know that one of the goals of dracut was to unify the approach between distributions. It didn't really work, so this is another approach, we'll see in 10 years. So, okay, so we sign stuff. If we sign the kernel, but not the initrd, then we are just pretending, right? It's a waste of time. We need to sign both. But in general, the users don't want to play with local key management. It should be signed by the distro. If it's signed by the distro, then the initrd must also be built by the distro. So all this functionality that we have to inject things into the initrd based on local configuration, well, we cannot use it. So essentially, the idea is that, okay, if we are going to move the whole build of the initrd into distro infrastructure and build it like a package, then let's do it in a clean way. And for me, this means taking a declarative set of distribution package names, letting the distribution package manager figure out all the dependencies, and then using the distribution package manager to put the files in a directory and then just zipping it up into an initrd. So, before I talk about the specifics of how to do this, let's talk about some problems that immediately appear, right? If we try to build one initrd for, let's say, the whole of Fedora, then we will end up with this gigantic blob that will take 60 seconds to load whenever the kernel boots. This is not nice. So we need different initrds for different people. And one option is to just build multiple variants, and we definitely plan to do this. For example, like a simplified initrd that works for VMs, that only has some basic stuff that you need in the VM and no other drivers and no other tools, and like one for laptops and so on. But this only scales so far. We can have maybe five variants, but anything more than that would be annoying. And the other approach is to use systemd system extensions.
So the idea would be that you have the basic initrd, and let's say you want, I don't know, sshd in your initrd, and you install this additional extension. And I should mention what happens with trust here. So the bootloader verifies the kernel and the initrd before loading it, and then after the initrd has been loaded, the initrd code loads the extension. So it checks the signature of the extension before loading it. So systemd system extensions are a mechanism to add code in a way that it is trusted, I mean, the trust is checked before it is used. And if we use UKIs, we need some way to inject configuration, well, I mean, we build an image that's supposed to work everywhere, so it cannot have local configuration, and we need to deal with this issue somehow. And one approach is to just use auto-discovery of partitions and not have any local configuration, which is nice if we can make it work. But a more general approach is to use credentials and to inject configuration that has been signed and bound somehow to the local system using systemd credentials. And I would say that this is an area of active research, because I don't think that anybody really knows how this is supposed to work in detail. And I wanted to mention that if we build initrds from packages, we have build reproducibility, because we have an exact list of packages that was used and we can download them again and unpack them again in the same, I mean, the exact same way. And we should have a bit-for-bit identical result, which is good for checking that the initrd was put together correctly. But it's also very useful if you want to build some system extension afterwards, because you build the extension by adding additional stuff and then taking the difference between the old image and the new image.
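As a rough illustration of the two ideas above (a declarative package list whose dependencies are resolved by the package manager, and an extension built as the difference between two images), here is a small Python sketch. The `PACKAGES` table, the package names, the file paths and both helper functions are invented for the example; a real build would call the distribution's package manager instead of walking a dictionary.

```python
# Hypothetical package metadata: name -> (dependencies, files shipped).
PACKAGES = {
    "systemd": ({"glibc", "util-linux"}, {"/usr/lib/systemd/systemd"}),
    "util-linux": ({"glibc"}, {"/usr/bin/mount"}),
    "glibc": (set(), {"/usr/lib64/libc.so.6"}),
    "openssh-server": ({"glibc", "openssl-libs"}, {"/usr/sbin/sshd"}),
    "openssl-libs": ({"glibc"}, {"/usr/lib64/libcrypto.so.3"}),
}


def resolve(wanted: set[str]) -> set[str]:
    """Return the wanted packages plus all transitive dependencies."""
    closure: set[str] = set()
    todo = list(wanted)
    while todo:
        name = todo.pop()
        if name not in closure:
            closure.add(name)
            todo.extend(PACKAGES[name][0])
    return closure


def image_files(wanted: set[str]) -> set[str]:
    """Return every file that ends up in an image built from `wanted`."""
    files: set[str] = set()
    for name in resolve(wanted):
        files |= PACKAGES[name][1]
    return files


# The base initrd is described only by its top-level package list.
base = image_files({"systemd"})

# An extension is whatever the bigger image has that the base does not.
with_sshd = image_files({"systemd", "openssh-server"})
extension = with_sshd - base
```

Since both images come from the same package set and the same resolver, rebuilding `base` later gives the same file list, which is what makes taking the difference meaningful.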
And the tool that we are using for this is, well, the project is called mkosi-initrd, because it uses mkosi, which is a very simple tool that takes a list of packages and calls the package manager to download the packages and put them into an image. And it does all the things that we happen to need. So it supports GPT tables and dm-verity and signatures, so we can do system extensions. And it can also do archives, and this is what you need for the initramfs format. And mkosi-initrd itself is just a set of configuration files for mkosi, right now only for Fedora, but the concept carries very cleanly into other distributions, so I think that if it works in Fedora, maybe other people will pick it up too. So just a list of packages and some tweaks to turn the installed packages into an initrd. And the same for system extensions. And well, in general the plan is to do it on the distro side, but right now this hasn't happened yet, so we have a kernel-install plugin, and you update the kernel and you very slowly build the initrd from packages on the local system. It's not very efficient but it works surprisingly stably. And for Fedora 39 we want to propose changes to the build system to allow initrds to be built in this way in the build system, and deliver these packages for people who want to try it out. I think it's like years from being the default, if ever. And so this works, I mean it works without too much issue. The initrds are bigger, but they're not infinitely bigger, they're maybe just twice as big. And the biggest difference surprisingly comes from kernel modules, because what dracut and other initrd builders do, they select a specific list of kernel modules that are needed for the local system. I wanted to avoid this, I wanted to take all the modules from a kernel package and just put them in without knowledge about specific modules. I think that this is not feasible, we'll have to do some kind of filtering too.
But the code itself, there's very little difference in the size of executables and libraries installed into the initrd, and this is because the code that we use in the initrd is the same code as in the host file system. So it has the same library dependencies and you need to put in the same set of libraries, and actually most of the space is taken by libraries, because the functionality has been moving more and more into shared libraries, because we build more complicated stacks. So this means that in principle the size overhead is not too big and can be reduced. And I mean this works in some cases, like for simple cases, for laptops and for some types of storage, but certainly not for everything. And what do we get? We have fewer things. We use the package dependency resolution mechanism, so we don't duplicate packaging, we don't need to care about installation because we have RPM to do it for us, or whatever. We have reproducible builds because we don't take files from the host, and everybody has the same image, which is important for people trying to debug errors reported by users. And well, if we build things centrally, we can sign them, and we use systemd and just get rid of the extra stuff, so our life is simpler. And you know, there's like a common set of things that people complain about, I mean, that arise when this is discussed. So I wanted to underline that systemd is already used in the initrd, so we're just removing things, not adding new things. And systemd sets up the environment, so things are already like they should be. So like all the extra work that people have put into having scripts that work without /proc mounted, it's not useful. And we remove stuff, and we would be moving from scripts to daemons anyway, right, and shared libraries. Because if somebody tells you to provide FIDO2 authentication for decryption of the root disk, you're not going to script it, you're going to use some compiled code to do it anyway.
And like, I don't know, Netlink, timeouts, retries, localization, D-Bus, all this stuff is just semi-incompatible with scripting. And we would be moving in the direction of normal compiled code anyway. And so mkosi-initrd itself is kind of, it's implemented and it mostly works. The stuff that is happening is like in the surrounding areas. So in particular, the development in systemd related to credentials is very important, because we want to make use of this. And support for unified kernel images is growing. There are patches, there's a link here, patches for GRUB 2 to load unified kernel images. And I mentioned that we want to build mkosi-initrd images in Fedora. So this will require changes in the Koji build system. And that's what I have. I do have time for questions. One minute, three minutes, okay. I was thinking about systems that boot from network, where, for instance, the root system is on iSCSI or NVMe over Fabrics, and you need some information that is really device-specific. How can we separate that from the initrd? Because you want a single initrd for all the systems. Yes. So one option is to put it on the kernel command line, if this is an option. And the second option is to provide a credential that is unpacked and becomes part of the configuration in the initrd. So essentially, yes, you just inject this configuration, but it wouldn't be part of the initrd itself. It would be delivered in a different way. The same question is, what would you do if you want to have files from the local file system? For example, you need a custom mount command that does more than a feature mount. Well, I mean, I would say, ask why do you need that? But if you need that, then just do the build locally. And the difference is, I think kind of the same question was asked before about unified kernel images: you can do the build locally, you just don't have the distro signatures. Thank you.
It might be a bit similar to the previous question, but thinking from a distribution standpoint, there is a lot of hardware out there which is incompatible with default configurations or default initrds. And you need to add patches to kernels and you need to add special kernel modules which are not enabled by default. What is going to be the flow to support this hardware, to use it on a distribution by default? So I think that this is much less common than people think, right? Because, I mean, how many people build their kernels nowadays? A small minority. Like, from the distro point of view, this is already outside of scope, right? If you come and report a bug that your custom-compiled kernel does not work, then nobody is going to help you. Because people have too many bugs reported for the standard distribution. The existing ways of building things locally, they will always be there, right? I mean, they are not going away. So basically the answer is, well, I mean, if you are doing something specific, then you keep doing this specific thing. And this, I mean, this is the way to make life for the distro easier, but it's not going to cover all cases, maybe like 90%. Any more questions? Yeah, you mentioned the kernel modules making the initrd bigger. Could this be shipped in a system extension somehow? Can you repeat, please? Could you ship the kernel modules which are very rarely needed in a system extension? Yes, definitely. So the kernel must have enough stuff built in to understand the initrd, and the initrd must have enough modules to be able to load system extensions. But once you do that, you can have an extension with kernel modules and whatever. Yes. So, last question. Is there a path to getting your initramfs core from somewhere and running a different distro? Like essentially your project or some project providing the core initramfs with the systemd init inside it and everything inside of it.
You get the modules from elsewhere, and then when you pivot, is that a hard line that you can live with? I think that technically it's doable, because systemd is kind of used everywhere and there's really no reason why it wouldn't work. I assume that distros would want their own code in the initrd, but technically it's not required. Okay, thank you. I mean, there's this general requirement, because there's a switch root where state is passed from the initrd to the host, and you don't want to pass the state from a newer systemd to an older systemd. So the initrd would have to be just older, old enough. Thank you very much. |
Ultrablue
User-friendly Lightweight TPM Remote Attestation over Bluetooth |
Okay, so today I'm not going to talk about image based or anything, but I'm going to talk about the containers. So what do you mean by containers? So again, maybe for the remote audience, when you've got a boot chain, you boot up your computer, you have UEFI, it goes to the bootloader, it goes to the kernel, it goes to the initramfs. And then, because we're talking a lot today about security things, you want to encrypt your disk partition and so enter a passphrase to unlock it. Then if you want to make sure that what you've loaded is what you expect, you can store measurements, so signatures, hashes of every step, inside the TPM. A TPM is a passive component. It is just there to store information, perform some cryptographic computation on it if you ask it, but it doesn't block anything in the boot chain, right? It is just there to store and execute commands that you pass to it. And so later on, after the boot is done, you can go back, for instance, in init in your kernel, to read back what you had in the TPM and check that what you have is what you expect. The problem is, what if there is an issue? If it's not what you expect, you end up here, you enter your passphrase, but your system was already compromised, and the passphrase that you've just entered maybe got exfiltrated, and now your disk is compromised as well. That's typically the case when you have offline attacks on a laptop that you may have forgotten or left unattended in a FOSDEM conference room. So what we can do, and what people in large organizations typically do, is use remote attestation. So you have an app started, for instance, in the initramfs, which will talk to the TPM, get the measurements, and then ask a remote attestation server somewhere trusted to check that the measurements are what's expected.
And that's where what you got in the previous talk, that you want to have images that are reproducible, that are signed, becomes key, because a remote attestation server would typically be your organization's server, it knows about the signatures, so you can check that the image is the one that it signed, that it is the one that is blessed by your security team. And then, if everything is fine, typically the remote attestation server would hand back a secret to your dedicated attestation client, and the client could use that secret to decrypt your disk. And in that case, you don't even need to enter a passphrase anymore. I mean, you could also add a passphrase if you want to be more sure, but if you had some kind of control over the attestation server, then you may not even need the passphrase. The issue is that everybody in this room, maybe some of you work at big companies, but not everybody obviously, and so how do you set up this server? You need to rent something in the cloud, it's super inconvenient. As a matter of fact, all of you, I think, have a server in your pocket, it's called a mobile phone, and so you just need to run the remote attestation server, that's a security part because I work for a government agency, and so you have the server run on your phone, and then you can communicate with your laptop over Bluetooth. And that's what we implemented, a user-friendly, lightweight TPM remote attestation running over Bluetooth with a server, well, client, it depends, we will change names, but one of them running on your phone and the other on your laptop. It's an idea that was prototyped by Matthew Garrett a couple of years ago at Linux Conf Australia, but it was just a rough prototype, and basically we're trying to make it production grade, at least hacker grade. So we have the server running on the laptop, you can note I'm switching names between client and server because we can't decide.
So on your laptop, you have the server. It's written in Go and it talks to the Bluetooth stack. Then we have clients for Android and iOS, written in Go for the TPM part, and in Kotlin or Swift for the UI. The way it works is that when you start the server on the laptop, it shows a QR code containing a key, so that we can encrypt the communication between the laptop and the phone and pair them in a secure way. You scan the QR code on your phone, which gives it the key, and then you have a trust-on-first-use step, where you need to trust the measurements the very first time. Right now we do not support downloading trusted measurements from an external server, the way you would in a larger organization. Remember, this is for individual users, but we could extend it to support that use case if needed. And then every time you want to attest that things haven't changed, you just run the server again on the laptop. You don't need to scan a QR code anymore, you just start the attestation on the client. It pairs over Bluetooth automatically using the key that it remembered, and it exchanges the results of the measurements. Thanks to the TPM remote attestation protocol, it can check that it is the same physical machine, or at least the same physical TPM. If you left your laptop unattended in the FOSDEM room, unless the attacker pried it open and swapped the TPM, things like that, it can really make sure it is talking to the same hardware, or at least the same TPM. Then it checks that the measurements of the boot chain haven't changed since last time. And it can optionally send back a secret to the laptop if you want to use it for disk decryption. Of course, it doesn't send the secret back if the checks have failed. So we have a demo, and because we know how demos work, it's a recorded video. I mean, I can also try the demo for real afterwards.
So this is running in a virtual machine created with mkosi, which was mentioned in previous talks. Here I'll first show the enrolment. The virtual machine is booting for the first time. On the left you have the virtual machine, on the right you have the phone. In this case it's the iOS client, because it is much more beautiful, because my intern spent a lot more time working on it. But I don't have Xcode, so the client I reviewed is the Android one, and so the one you get on GitHub has fewer features. I'm sorry, but eventually we will merge everything back. So here's the first time we boot. We need to enter a passphrase because we haven't enrolled yet. And then, sorry, it's at the bottom of the screen, but we run the Ultrablue server to perform the initial trust-on-first-use enrolment now that we are fully booted into a Linux system. So here you have to trust your system. It's the first time you boot; if it's already compromised, you're toast. It's going to display the QR code that we are going to scan with the phone. And then it starts the attestation, showing the steps on both sides, gets the measurements, and now we have the device registered on the right. We can change its name, we know when we created it and when we last ran an attestation. And as you'll see, you can edit the security policy. The way it works is that you choose which kinds of boot measurements you want to check. Again, this is only in the iOS application, but it's coming soon on Android. Then the next step is to run systemd-cryptenroll, because when we enrolled the phone, it sent back a secret to the laptop, which was stored in the TPM in a special register, register 9, PCR 9. And so here we run systemd-cryptenroll so that we add another factor, another slot to unlock the disk, based not on the passphrase but on the contents of TPM register 9. So here basically, yeah, that's what we did. And now we regenerate the initramfs using dracut, with dracut modules.
We would like to get rid of them because they're very buggy, and maybe, see the previous talk, one day we can. We regenerate it so that we can enable Ultrablue at startup. And here we start again, but now startup stops at some point, starting the Ultrablue server. And we go on the phone and just press this button, the play button. Starting an attestation, fetching the measurements, it works. So we send back the secret and we don't need to enter a passphrase anymore, because the disk is unlocked based on that secret. Now what happens if something has changed in the boot? Again, we start an attestation, yada, yada. And here we do not send back the secret. We say, oh, something is different this time. It's in PCR 8, right? It has changed. And PCR 8, we provide some info to the user, is expected to contain the hash of the kernel command line. Then you can even look at the diff, and if you look at the diff you can see that a new command line option, super-heavillopt, has been added, right? You cannot fully trust this data by itself, but you can trust that its digest is the same as the one stored in the TPM. And then you can choose what you want to do. You can say, OK, this is a legitimate update, for instance, I updated my firmware, it's an expected change, so I save the new one. You can say, OK, trust, but only once, as you would do in a browser. Or you can say, oh, no, it's definitely an attack, I want to reject it, maybe call some security team. We haven't implemented that yet, but we do not send back the secret in any case. That's what we do here. We reject it. And so it falls back to asking you for a passphrase to unlock your disk. And that's about it for today. So it's versatile, you can embed it in any initramfs. You can use it as a second factor. In the release we provide a sample mkosi script to build the VM that I just demonstrated.
We also published a release just today, just for you, FOSDEM attendees, so that you do not need to rebuild everything. You can just test. Then if you're interested in the protocol, you have five minutes of questions. But trust us, we asked the security team, I mean, the cryptography team in our lab, to review it. It's fairly standard for a remote attestation protocol. And so, yeah, we want to integrate more. That's why we are here today to talk to you. So go to ANSSI-FR/ultrablue, and add /releases if you want some binaries for Android and x86 Linux. And yeah, I'm here for questions. I'll start with a question. How practical would it be to run this on your laptop? Well, I basically just mentored Loic, and I co-mentored Loic with Nicolas Bouchiner, who should be in this room somewhere, I think, yes, here. And Nicolas mentioned just yesterday that he's now quite ready to use it on his laptop. So I guess ask him. I don't even have secure boot enabled on this laptop. So I guess the French NSA doesn't allow you to use that? I mean, this is public, so it's endorsed by our agency. It's been endorsed in the sense that it's OK to release the code publicly. We have reviewed it as carefully as we can. We do not consider it production grade right now. Have you thought about using it to assess the IMA log at runtime as well, so for files that are accessed, checked against some network-accessible database, not only at boot but also at runtime? So I think for checks at runtime, there are two things here. For runtime checks of your file system, I think you would rather use something like dm-verity and the other mechanisms that we have. But the TPM doesn't only have uses at boot time. So remote attestation could also be run later. The advantage of doing it early is that you can detect that you're compromised before inputting your password, basically. Yes, I see some questions here in the back.
Let's say that nobody swaps your TPM, yes, put that case aside. Could you get the same level of attestation just with signatures and the TPM? Let's say you don't think that somebody is going to swap your TPM. Could you just check some parts of the boot chain, sign the hashes and, at boot, verify the signature with the TPM? The problem is: how do you trust what you see on the screen of your laptop? The idea is that the phone remains in your pocket, so the phone is trusted, or at least it's more trustworthy to have two devices than one, because it's harder to compromise two devices. If something in your bootloader is compromised, it could show you on your computer that, okay, the TPM measurements check out, et cetera. And this is even more true for something called dynamic root of trust for measurement. So not only trusting your BIOS to make the measurements, but using special instructions in CPUs with which you restart the root of trust later in the boot process. But for this to work, the check has to be external. You said it's not production ready. Do you see it being production ready anytime soon? Is that the goal? It's not a stated goal of the agency. I'd be very happy for it to be used more widely. But I think for this to happen, we need people outside of the agency to be interested. Yeah, looks great. I have a question regarding the app on the mobile phone. How does the app know that it's speaking to a trusted server, or the server that it can trust? Does it exchange keys with the server on first use, or how does it work? So there is a trust on first use. There are two levels of security. First, the Bluetooth communication is encrypted, and the key for that communication is embedded in the QR code that you scan first. And then there is the identity of the TPM. The TPM has a secret key inside and uses it to... And so the public key is stored in the app on the mobile phone. And so the app can check that it talks to the expected TPM.
So basically, the encryption key is used to attest that it's the same operating system or initramfs, and the TPM key is used to check that it's the same TPM. Yeah, but as far as I understand, that still does not prevent a malicious server from sending the app wrong values, or from just making up the values and sending them. So there is something that we haven't implemented right now, because we wanted it to be easy to test with virtual machines and software TPMs. The TPM keys are certified, signed by the vendor, and you have a list of the vendor keys that are used to sign those certificates. So when you get the quote from the TPM, you can check that it comes from a hardware TPM by some well-known manufacturer and not from some emulated TPM. So you can be sure that you're actually talking to a physical TPM device, and the server cannot mess with that. But you're right, we haven't implemented this right now, on purpose, so to say, because otherwise the demo wouldn't work. That's an option I want to add. Very good question. This is using Bluetooth, and the Bluetooth specification is very complicated. How much does it rely on Bluetooth working fine, or as expected? So it's using Bluetooth Low Energy. It's relying on the Go Bluetooth library to work fine. We don't use any of the Bluetooth stack encryption for the encryption part. We roll our own, and we layered the code so that it's mostly agnostic to the underlying transport. So if you want it to run over infrared, it should be relatively easy to port. I just don't know how to code that. And we are extremely conservative, in that we just abort as soon as we receive out-of-order or unexpected packets. OK, last question from online. Someone asked if any distribution expressed any interest in integrating Ultrablue? Yes, in fact, but not officially, just some Twitter messages.
We presented this at the Open Source Firmware Conference last September as well. And after that, there was some interest online, with people saying that Fedora may be interested, but it was not an official Fedora position, just some Fedora contributors. I can say that this is interesting to that. Thank you. Thanks, for the speaker. We have one request. |
Converging image and package based OS updates |
Yeah, hi, I'm Ludwig from SUSE. I'm a research engineer there, working in the so-called future technologies team, and today I'm presenting a crazy idea. Not because we're going to build a product with it, but just because we can. So, first let's take a look at the difference between package-based and image-based. I mean, I'm from the package-based world, so I don't have too many insights into actual image-based systems like in the embedded world, so that's just my view. Anyway, packages are known to all of you, I guess, from your desktop Linux installation. You have individual components pre-built, the vendor ships the packages, and the client side decides which ones to install, so there's some kind of dependency resolver that takes the components and puts them on your local system. Usually you don't need to reboot for that. This has advantages and disadvantages. The advantage: if you need a new Vim version, you just get it and it works right away. The disadvantage: if it breaks, it's broken. On image-based systems, on the other hand, you get a full Linux system pre-built by the operating system vendor. That means it's a ready-made file system that is typically downloaded and dd'ed to some partition, for example, or it's a tarball that gets extracted somewhere. Installing that one, or activating the install, requires a reboot. That's an advantage because it can't end up in intermediate states, it just either works or doesn't. The disadvantage is that it's typically not extendable, at least not the original image, so you would need some other mechanism, like systemd-sysext nowadays, that plays some tricks to get extra layers on top. So, let's first take a look at the filesystem layout of a typical image-based system, or at least one as, I think, systemd envisions it. We have the operating system in /usr, for Linux systems after the usr merge. Then /etc, the /etc volume, is on /, which is writable.
The boot partition nowadays should be the ESP, no matter where it's actually mounted, and your data is in /var, on a separate volume again. So, how do you do updates? Usually you have separate partitions for /usr. The standard case is A/B, so two versions: you dd your operating system into one while the other one runs, and on reboot you just switch. That's quite an easy technology because it's just a regular partition table. It's actually read-only if the filesystem you use is a read-only one. You could use rsync or casync to download deltas from the server, and the signing is also pretty easy because it's one image, so you can put some GPG signature on it and that's it. You can even verify it afterwards, because the image is unmodified on your partition. Disadvantages: there's no deduplication, so you always consume twice the space basically, or however big your /usr partition is. Even if your updates are small, you still need all that space. The number of versions is limited, so usually if you have only two partitions, you have two versions of the operating system, and the space is pre-allocated. Again, that could be an advantage because there are no surprises: the space is just there, and if it's there, you can put the image, period. The disadvantages: your image can't grow, and the updates are always of a fixed size basically. So how can we optimize that? You could actually use Btrfs for the operating system. That's how MicroOS works, with some more details on that later in Ignaz's talk. Anyway, we use a subvolume for the operating system. That means you get the copy-on-write semantics automatically, so deduplication means your updated system does not automatically need twice the space, but only the changed amount. You can still use rsync or casync to only apply deltas. The number of versions you can store is basically infinite, it only depends on how big your updates are.
So if you only have updates to text files, it could be a lot of versions. Disadvantages: it's not really read-only, it's just a Btrfs flag, and that can be changed of course. Also, I put a question mark on verification. Previously, with an image, we could just run GPG for example, and verify whether the image was modified or not. Here we have to take a look at how to solve that later. But how do distributions actually build those images? An operating system vendor like openSUSE would use packages to build the image, just on the server side. We learned that we can even use this technology for building initrds. The way the image would be built is to just install packages somewhere on a server system and then throw away everything that is not in /usr. That means all the scriptlets that run in packages are just modifying something that's not relevant. In /etc and /var, for instance, it doesn't make any sense to have a scriptlet doing something there. So when, for example, a package needs to add a user, it can't just call useradd. It needs to use systemd-sysusers. Same if you want to enable a service: you can't just call systemctl enable, you need to ship presets. And likewise, packages can't just put the kernel in /boot anymore, because that's not in /usr. That means there needs to be some extra tool that somehow makes the system bootable when there's a new kernel, for example. So back to verification. Actually packages, at least RPM packages, have signed headers, and the headers have checksums for each file. So in the end, an image is a list of RPMs with signed headers, and by verifying each header, you can also verify each file. So in the end, you have a tree that can be verified: you can check that no file was added, no file removed, and no file modified, just by looking at the RPM headers. The disadvantage is that in images that you ship, you typically remove the RPM database, because it's this ugly binary blob.
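The "no file added, removed, or modified" check can be sketched like this. It is a minimal Python illustration that assumes the per-file checksums have already been extracted from RPM headers whose signatures were verified; parsing and signature checking of real RPM headers is out of scope here, and the `expected` mapping and the function name are hypothetical stand-ins.

```python
import hashlib
from pathlib import Path


def verify_tree(root: Path, expected: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with expected {relative path: SHA-256 hex digest}."""
    actual = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            actual[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    common = actual.keys() & expected.keys()
    return {
        "added": sorted(actual.keys() - expected.keys()),
        "removed": sorted(expected.keys() - actual.keys()),
        "modified": sorted(p for p in common if actual[p] != expected[p]),
    }
```

A tree counts as verified only if all three lists come back empty. Note that this compares file contents only; ownership, modes and timestamps would need separate handling.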
Even if it's a SQLite database, it's still an ugly binary blob with more binary blobs in there. So that's why people really hate having the RPM database there. This is something that nobody wants to see in an image. So how can we fix that? We could actually store the RPM headers as files. We just dump the header part of each package into a directory. That means the directory is the RPM database, which looks much less ugly. You can diff two directory listings to actually see which packages got updated. That is quite useful if you already use MicroOS, for example, and do some snapper compare: it tells you that this RPM database and that RPM database changed, but you don't know what. If the database is a listing of files, it's naturally visible what changed. And we still have the RPM headers, so /usr is fully verifiable because they're signed. So the question now is, what do we actually need an image for? You don't need to take those RPMs, put them into some file system, and then download the whole file system, this image or tarball. You could actually define an image as a text file that lists RPMs. And then you download those RPMs, which are the bits of your image, and just put them into the file system or partition that is /usr. A disadvantage of this method, again, would be that with today's RPMs you would lose the ability to do easy deltas, because the payload is actually a cpio archive that is compressed with some compression. So if you don't want to use delta RPMs or other fancy things, you need to find a solution for the payload of RPMs. The payload doesn't actually have to be a compressed cpio archive. It could actually be completely uncompressed. You just concatenate all the file contents after the RPM header. And because the RPM header also contains the file sizes, you know which file is where. Now if we do another trick and align the data of those files to the page size, you actually get reflinkable packages.
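The alignment trick comes down to simple arithmetic on the file sizes recorded in the header. A minimal Python sketch, assuming a 4096-byte page and a hypothetical layout where file data follows the header directly; the real reflinkable RPM format may differ in its details.

```python
PAGE = 4096  # assumed page / filesystem block size


def align_up(n: int, alignment: int = PAGE) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def layout(header_size: int, file_sizes: list[int]) -> list[tuple[int, int]]:
    """Return (offset, size) for each file's data, each starting on a page boundary."""
    extents = []
    position = align_up(header_size)
    for size in file_sizes:
        extents.append((position, size))
        position = align_up(position + size)  # padding up to the next boundary
    return extents


# A 1000-byte header followed by three files of 10, 5000 and 4096 bytes.
extents = layout(1000, [10, 5000, 4096])
```

The padding costs some space inside the package, but aligned extents are what allow a filesystem to share the blocks instead of copying them: on Linux, range clones such as the `FICLONERANGE` ioctl generally require block-aligned offsets.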
That means you could download this uncompressed RPM, for example by means of casync, which would actually compress it on the server side. Then you have the RPM on disk. And then, instead of copying the payload to some other location, you just use reflinks, a file system feature that reuses the blocks. So you have this one big chunk of data as the RPM, and to create your actual files, you just link their data to it. That means /usr/bin/bash is not actually a copy, it's just sharing the same data that is in the RPM stored there, which conveniently is at the same time your RPM database. So it's not just the headers that are in this directory, it is the full package. In this example, I put it in /usr/lib/sysimage/packages. So /usr/lib/sysimage/packages would be your image. That means /usr is just a view. And if you map this onto Btrfs, like in this example, you have several versions of this image directory as snapshots. And then you could create other Btrfs volumes that just link into those RPMs. Or you could even omit this /usr completely: a colleague of mine, Fabian, even wrote a FUSE plugin that just creates a file system by looking at the RPMs in the directory. Quite crazy. So to summarize, we could build an image-like system by leveraging Btrfs. Instead of using A/B partitions, we just use snapshots. The behavior would be exactly the same, because you prepare this new snapshot, put all your data in there, and then you have to reboot to activate it. But since it's still packages, you retain the flexibility to actually change it on the client side. You could ship the image as a list of RPMs in a text file, but you could also add RPMs locally in this directory, and you have them installed. So best of both worlds. And this is not just something completely crazy in my head. I actually built a prototype that kind of works.
So it uses BusyBox, because it was easy to modify the RPM implementation in there to work with those reflinkable packages. It uses rsync for updates, and it uses systemd's kernel-install to make this system actually bootable. So you can try it out. There's also a pull request open for RPM, I think, to have this reflinkable stuff, and I sent a patch to BusyBox, but I don't expect it to be accepted. It's just a proof of concept. Of course, to make this work in practice, there are lots of to-dos. In existing distros like openSUSE, we have to fix all the packages to no longer use scriptlets. We need to talk about the Btrfs subvolumes. The naming should be standardized. There was actually a discussion about that on the list, like two years ago already. An RPM reflink payload would be nice to have upstream. There are other stakeholders that would also like to have that. I'm already working on systemd's kernel-install to make it usable for this use case. In the case of MicroOS-like systems, we want to have rollbacks of / and not just of the operating system. For installing deltas, I would like to use casync, which would need to be revisited. And last but not least, all of that should be native in RPM or zypper, in my opinion, and not just some extra tool. And that's it already. So any questions about this crazy stuff? Okay, me first. So how did you handle the timestamp issue? Because if you're using Btrfs and doing an rpm -i, you get the atime, mtime, and ctime in the inode. And an image-based system is supposed to hash end to end the same. And if it has different timestamps, it doesn't hash end to end. Yeah, well, you can't modify the timestamps. You can only look at the ones that are under your control. But then doesn't that mean that effectively we can't use this to reconstruct image-based systems? Well, it's not bit-for-bit identical. What is on disk is not bit-for-bit identical, of course. Only the actual RPM payload.
But you know that the payload that is linked is actually the same. So I mean, maybe there's a use case where the timestamps need to be exact. But in my view, it's not important, because /usr/bin/bash is /usr/bin/bash, no matter what the timestamp is. Well, the use case is just image-based systems. The end-to-end hash tells you that you've done the right thing. And it's simple to compute. With your system, you basically have to do a tree hash down all the packages to prove that this is what you're supposed to have installed. Semantically, they are equivalent. It's just that the latter is more difficult to do than the former, which is why people like image-based. I mean, you only need to hash the directory with all the RPMs in it, right? So you have the checksums of all the RPMs, and then you can verify the RPMs. Yeah, I'm not disputing that. Exactly. You're just saying it's more complicated. Yeah, of course, there's always a trade-off. Of course, yeah, yeah. And it depends on how you construct this /usr view. I mean, if it's really Btrfs, a real file system tree, then you have to walk it. But if it's only a view, like with the FUSE thingy, then it doesn't actually exist on disk. It's just, you know, looking at the RPMs. So it cannot be modified. You don't need to walk the tree. You just hash the RPMs. Okay. Yeah. So how would you integrate that with what we have heard in the previous talks? During boot, I specify something similar to my dm-verity root hash, so that I know that I'm actually booting from an unmodified rootfs. That's a good question. That problem is not solved yet. Yeah. So far, the challenges are already at the point of how you actually specify which version of the /usr tree you want to boot. All the current models assume that if you ship a new image, like a new /usr version, it also comes with a new kernel and a new initrd. So this initrd knows what disk to boot.
But in the Btrfs use case, there is a kernel that can have a number of initrds, and those initrds match a number of snapshots that they can actually boot. So I'm already struggling with this part. The verification comes later. So dm-verity gives you authenticity of the blocks at runtime, so the device cannot switch them underneath. And I guess that if you verify the RPM headers, I don't know, when loading the image, this wouldn't give you the same runtime properties. I haven't played with those technologies yet, to be honest. Another interesting thing would be this fapolicyd daemon. I only read a bit about it: it uses the audit subsystem to actually block access to modified files by comparing them with the information in the RPM header. So it would be another area to explore, how to integrate some verification technologies into this model. Any more questions? If not, then I guess we'll wrap up this talk. Thank you very much. |
Ubuntu Core: a technical overview |
Okay, we can start a bit early. So the next speaker is Valentin David. He will give a technical overview of Ubuntu Core. So, thanks for having me. Yes, so this will be a very deep technical talk about how we do things in Ubuntu Core. Ubuntu Core is a distro. It's based on Ubuntu builds, so the Ubuntu packages, but you can't install or remove packages, because we removed APT and dpkg and everything. The system is split into four atomic snaps, so there are four parts that you can upgrade independently. If you want to install anything on it, you need to install it as a snap. Ubuntu Core targets the Internet of Things for the moment, so it's not a distro for desktop. You can have graphical applications on it, but it's not ready for desktop. Because it's targeting IoT, it supports lots of bootloaders, but in this talk I will focus on UEFI, because this is what is really interesting. Ubuntu Core has been doing secure boot with full disk encryption using the TPM for a while, and this was done before systemd had lots of nice tools to do that. This is why I thought it was interesting to talk about how things were done, because they might seem peculiar, or it was a different approach. So what is a snap? You will see that there are different types of snaps, but what is common to all snaps is that a snap is a squashfs image with a specific metadata file that describes what this image is for. Then, depending on the type, there will be more information that you can get from it. The first type of snap is the application. It ships the application and its runtime, and it expects to have another snap to run on top of, which is the base, the root file system. So it doesn't have the root file system, it just has the runtime for the application itself. Typically, applications export services or commands.
In the desktop world, we also have desktop applications, but for Ubuntu Core we don't care about these kinds of things, and those applications run confined. There is another type of application snap, which is not confined, that we don't support in Ubuntu Core. We only support confined ones in Ubuntu Core. Then we have the base snaps. Those are the root file system, and they are used for the applications, but they're also used for the bootable system, so a base has systemd in it. An application that runs on Ubuntu Core doesn't necessarily need to have the same root file system as the host itself. It can just use a different version if it was built for another version. Then we have snapd, because to handle snaps, to unpack and install them, there is a daemon, and it is distributed as a snap itself. It's not an application snap because it has specific things, so it has its own type. Then for Ubuntu Core, we have the kernel snap, which provides the kernel, which is a UKI, so it's a UEFI kernel image that is signed and has the initrd in it. The snap also provides modules and firmware files. And then the final type is the gadget. I don't know why it's called gadget, but it provides the bootloader, so in our case it's shim and GRUB, and it also has lots of configuration for snapd, such as how to make the image. There is some cloud-init initialization, and it is also where you put extra kernel command line options if you want some. The disk image starts with just one partition, which is the ESP, which we call the seed. It contains GRUB and shim, of course, and it has the seed snaps, which are like the factory snaps that you can revert to if you need to. It also has a seed kernel that has to be unpacked from the seed kernel snap, because GRUB cannot read from there, or I don't know, and it also has another file, a signed metadata file. How to explain... it describes how to update things, and it's a kind of authority file.
Once you have done the installation... I will not explain how the installation happens, because I don't think it's fun. I will explain how we run normally. Once it's installed, we have three other partitions that will appear on your disk. The second one is the boot partition, which also contains a GRUB, and typically GRUB will chain from the seed one. So you will always boot from the seed, and if it finds that it doesn't have to do recovery or anything, that it has to do a normal run, it will go to the second one. The second one will have the currently active kernel that you have installed, and it will also have a sealed key for the data partition, because at the time I think it was not common to have this sealed key in the header, the LUKS2 header, so it was done like that. Then we have two partitions that are writable and encrypted. There's a save partition, which basically is there to identify the device. There's not much on it, it's a very small partition. And then we have the data partition, which contains most of the writable data. To get a runnable system, we have to do things in the initramfs. First of all, we use systemd in both the initramfs and the main boot, but here I'm going to show the few things we do that might be different from other systems. One of the first steps is to mount all the disks that we have, and the first thing we do is measure the epoch. For now, I think it always just measures zero. It's something that was probably done for revocation: if we need to revoke the code, we can just increment that. But I don't know how useful it is, because I don't think it has been touched. Then we mount the boot and seed partitions. Those are not encrypted. From the seed partition we find the model and we measure it into the PCR.
After that we find the sealed key on the boot partition, unseal it, and open and mount the data partition. Then we do the same with the save partition: the sealed key for the save partition is on the data partition, so we have to mount the data partition before we open save. Last, we find the base snap that we need to mount. We find it from the data partition, where a file describes what is installed, and we mount the base, the kernel and the gadget. The base will be our sysroot; the kernel is needed because we need the modules and firmware; and the gadget has some configuration that we need. Optionally we need snapd to be mounted as well, but I will talk about that later. Then there is some locking of the sealed keys so that we don't unseal them again. Once we have mounted all the disks, we have to prepare the file system. First we bind mount the base onto /sysroot, where we will do the switch root. Then we bind mount /usr/lib/modules and /usr/lib/firmware. We mount the seed and boot partitions under boot; there is a specific way to do it. After that, we bind mount specific paths from the data partition onto the root file system. Typically, for example, you want a path under /var to be writable, and your base snap is not writable because it's a squashfs, so you want a bind mount. For now we have a script that reads a list of paths to bind mount, and we can also configure what happens if a path doesn't exist yet: either we initialize it with the data that is on the base snap, or we leave it empty. This script will probably be replaced by tmpfiles.d and fstab, but when it was written, that wasn't there.
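The writable-paths script described above can be pictured as a planner that turns a list of paths into copy, mkdir and bind-mount steps. The sketch below is an assumption-laden illustration: the two-column file format, the `copy`/`empty` keywords and the `/data/writable` location are all hypothetical and are not the format Ubuntu Core actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WritablePath:
    path: str  # absolute path inside the root file system
    init: str  # "copy": seed from the base snap, "empty": start empty


def parse_paths(text):
    """Parse lines of the form '<path> <copy|empty>', ignoring '#' comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        path, init = line.split()
        if init not in ("copy", "empty"):
            raise ValueError(f"unknown init mode: {init}")
        entries.append(WritablePath(path, init))
    return entries


def plan_mounts(entries, existing, data_root="/data/writable", sysroot="/sysroot"):
    """Return the ordered actions needed to make each path writable."""
    actions = []
    for entry in entries:
        backing = data_root + entry.path
        if backing not in existing:
            # First boot for this path: create its backing directory.
            if entry.init == "copy":
                actions.append(("copy", sysroot + entry.path, backing))
            else:
                actions.append(("mkdir", backing))
        actions.append(("bind", backing, sysroot + entry.path))
    return actions
```

The planner only returns a list of steps and never touches the file system, so it can be inspected or tested without root privileges.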
We don't bind mount /var and /etc directly as writable, because most of it is not meant to be writable. This is for updates: we control everything that needs to be updated in /etc, and if someone modifies some file in /etc, that is very hard to track. So we have a specific list of bind-mounted paths that you can write to. An example is /etc/systemd: we need it to install services and mounts for all the snaps, so this path needs to be writable. But some things are really annoying for us with this approach, for example /etc/hostname, because systemd does an atomic write. That means it creates a temporary file in the directory, /etc/hostname with some temporary name, and does a switch, but that's not possible if /etc is not writable, so there are some patches for that. And /etc/localtime is even worse, because it's a symlink that has to be rewritten, and we need to follow the symlinks until we find one that is writable. All these things are a bit confusing and a bit annoying. So that was the initialization done in the initramfs. After that we switch to the normal boot, and most of what happens is just systemd. There are two main things done by snapd: mounting the active snaps, and starting the services of the applications that are installed. Those are just systemd units, installed by snapd inside /etc/systemd, and systemd starts them for us; we just have snapd installing those things there. The problem is that on first boot snapd is not installed. That means we have something special for the first boot: when we find that snapd is not installed, we have to find the snapd that was mounted from the initramfs and run it to tell it to install itself, so that we can continue. This is called the seeding of
snapd. The disks are encrypted using LUKS, and we use a TPM. The PCRs that we use for the policy are 4, 7 and 12. If you don't know what 4 and 7 are, they come from the system, so you don't have to worry about them; 12 is the one that we deal with ourselves, where we do our own measurements. We might have several expected values, because the kernel command line parameters change if we want to do recovery or some other things that still need to mount the encrypted file systems, so I think there is a policy OR over several values for the PCRs. Another thing in the policy is a counter that we use for revocation. We need to reseal our keys every time we change what the values of the PCRs can be, and to prevent the old values from being able to boot again, there is a non-volatile counter that we increment each time we make such a change. As for what we measure into PCR 12: one part is done by systemd-stub, because we use systemd-stub to make the UKI. That is the kernel command line, and it's important for us. Then, as I said before, in the initramfs we measure the epoch and the model. There are some interesting things about what happens when you have a failure and want the emergency shell. First, Ubuntu Core tries not to have passwords: you can have one, but by default you don't. And even if we had one, the initramfs has to be signed as part of the UKI, so there can only be one build for everybody. That means there is no password in the initramfs, but we still need an emergency shell if we want to be able to debug things. To allow that, we only provide the emergency shell if there is a specific kernel
command line parameter, which is dangerous. You have to remember that, since we measure the kernel command line into PCR 12, you will not be able to unlock your disk with the TPM if you are in that shell, so you will need to use a recovery key if you want to unlock it. But we can do this: we can do some debugging and figure things out, at the cost of changing things in a way that disallows unlocking with the TPM. Maybe there are other ways we could have done this; maybe we could have done a specific measurement whenever there is any form of emergency, I don't know. Let's look quickly at updates. When we update the gadget, the interesting thing to know is that the gadget snap provides shim and GRUB, and we might want to update them in different partitions, the seed or the boot. Usually we only want to update the boot partition, but we might also want to update the seed, which does the recovery. There is some versioning that we use for that, which is very important. When we upgrade the kernel snap, we copy the new kernel in a specific way into the boot partition and update a GRUB environment file. GRUB will boot, see that there is something to try, and try to boot it; if it finds that it already tried to boot it and it didn't work, it rolls back. After that, if snapd finds that it managed to boot with the new kernel, it knows that the update has finished and updates everything. This we can skip, it's not that important. Any questions?
I'm from the Ubuntu Touch team, and the Ubuntu Touch team still uses the writable-paths system, the same one that is still in use in Ubuntu Core today. Does Ubuntu Core have any plan to move on from that system, or are you planning to keep using it for the foreseeable future? Any update on that?
My plan is to move to tmpfiles.d and fstab, so to use just tools that come from systemd, because they can do those things. It means there is a bit of duplication in how you write things, but it also means we have one less shell script to maintain, one that is maybe not the nicest thing. But I don't think there is any date for when it will happen. We know it works, and we have to decide when we move to it. Thank you.
You mentioned resigning or resealing on updates. Could you do away with that and use the trick of binding to a hash of a certificate, and then using the certificate to sign whatever needs to be signed?
Yeah, that would be better. What I showed you is the state as it is now, and it would be better not to have to reseal, but I think there was not that much experience in the community about this at the time, so when it was designed I think people didn't realize they could do something better. At some point we will get there, but I don't think we have any proof of concept on Ubuntu Core using this yet. We have talked about it, that's it.
Do we still have time for questions? Yes, last question.
It feels to me like using snaps in this way is a bit complicated. What benefits do you get from this approach compared to something like the systemd system extensions that were talked about earlier today?
The point is that application snaps are the important thing for the user experience, because users want to have their own applications, and they can do quite a lot with them, which is very interesting. Having everything updated in the same way makes things simpler for them. It might not look like it, because I showed you the complex things behind the curtain, but I think the point of having snaps there is simplicity for the users: people who make images for their applications have to deal just with snaps and not with many different technologies. We could probably, I don't know, use other things and make them look like a snap. I'm not sure; maybe there is a way, which would be nice, because if we have less code to maintain and just a wrapper around something that someone else maintains, that's much better. But for the user, I believe it has to look like a snap.
Okay, round of applause. |
openSUSE MicroOS design
A functional read-only OS in an imperfect world |
Okay, so this is our final in-person talk; the last talk will be a recorded playback. Now we have Ignaz Forster talking about openSUSE MicroOS and the technical details. So thanks for coming. Thanks for the introduction and thanks for joining. As said, this talk will be about openSUSE MicroOS. We have several topics to cover and not much time, so I'll go over them quickly. We can't go into too much detail here, but we'll see. Just one slide back: my name is Ignaz, and I'm working for SUSE as a research engineer. So what are today's topics? First, I'll give you a short introduction to openSUSE MicroOS itself: how is it built up, and on which foundations? Then we'll have a short look at the update and rollback mechanisms we are using, and at something we haven't talked about a lot today, configuration file management, and our approach to handling it in image-based systems. We probably won't make it to the full-disk encryption and trusted boot part, but for that I can refer to Lennart's talk from earlier today. So when we are talking about openSUSE MicroOS, we are talking about one product of the whole openSUSE universe. The best-known products are Tumbleweed and Leap. Tumbleweed is our rolling release distribution, and Leap is our stable distribution; stable not in the sense that Tumbleweed crashes all the time, but in the sense that the API doesn't change all the time. Leap itself is based on SUSE Linux Enterprise, and SUSE Linux Enterprise in turn is based on a Tumbleweed snapshot, which is then turned into a stable product. When we are talking about MicroOS, we basically still have the same products: openSUSE MicroOS, which is based on Tumbleweed, and openSUSE Leap Micro, which is based on Leap. So we also have the same ingredients in there. We still have the RPM packages as a base.
To be exact, they are exactly the same packages we have in the Tumbleweed or Leap distributions, and we are building on these. You may know openSUSE as one of the large users of Btrfs. With Leap and Tumbleweed you can use any file system you want, Btrfs is just the default, but for MicroOS Btrfs is mandatory. You have to use it, because we rely deeply on its features. First of all, there are the snapshots: when we create an update, we need them for the forward and rollback handling. And Btrfs has a huge advantage, as we heard in a previous talk already: it has copy-on-write functionality, so when we do an update, it is basically the smallest update possible. We don't have A/B partitioning, where we would need two partitions of the same size; we can have a minimal snapshot instead. So if MicroOS is based on Tumbleweed and Leap, what is the actual difference compared to them? First of all, it has a read-only root file system, which is common to all the image-based systems we heard of today, and it contains a minimal package selection. Minimal doesn't necessarily mean that it's the smallest distribution possible, but that it has a minimal package set for what it's trying to do, and what it's trying to do is to be a single-purpose system. It evolved from a project called openSUSE Kubic, which was a Kubernetes distribution, until we discovered that you can use it for so much more. Nowadays it is a single-purpose system, which includes running containers, but you can use it for any use case you actually want. There is also a MicroOS Desktop, where your single use case is running a desktop, with Flatpaks as the packages for your applications.
That single-purpose approach also means that you can install packages from the Tumbleweed and Leap distributions, but it's not designed to be used that way. You should use containers for your actual workload, or you install one dedicated package, for example if you want to run the system as a mail server or whatever. If you have a look at our commercial side, we have the SUSE Linux Enterprise Micro product. There we have an even more limited package set: you can't even install all the packages you can from Tumbleweed and Leap, to make clear that it's really meant to be used in a quite restricted way, for example for workloads, for cluster nodes, or for embedded systems. Especially for embedded systems there is another very important point: it's a self-maintaining system. You basically install it, and then, in theory, you can forget about it, because it will just update and maintain itself. We'll have a short look later at how that works. So when we are talking about updates: I said it's a read-only system, so you still have to be able to update the system somehow if it's not image-based. For that we are using the Btrfs snapshot functionality. The Btrfs snapshots are snapshots of the root file system; that's the important part. If you have a workload, for example containers, it's running in /var. Of course you also have /etc for your configuration files, but the root file system, basically what's below /usr, is the part that is actually snapshotted and read-only. So when we have an update, we just create a new snapshot, the blue square at the second point, and perform the update in there. The blue snapshot is a Btrfs snapshot, so it has minimal overhead. It is created in microseconds; there's no performance penalty for doing that.
Then the update is performed in that snapshot, for example by just calling, in the openSUSE world, zypper dup to do a system update. If anything goes wrong in that step, the snapshot can just be discarded again and we are back at square one: we just have the currently running system, and the active system won't even have seen the new snapshot. If the update was successful, the snapshot is marked as the new default snapshot. That means that if we reboot the system, just like with A/B partitioning, that snapshot will be used as the new root file system. We have a second step, but that will be later... well, let me talk about it immediately. We have a second step at boot, called health-checker. If health-checker detects that something is not working as expected, for example a service you expected doesn't come up, an automatic rollback is performed. We are back at square one again, the system behaves as if nothing had changed, and we can wait for the bug fix for whatever broke the update. All the magic happens in an application called transactional-update and a new library called tukit. transactional-update was originally a shell script. It called all the openSUSE-specific stuff, for example zypper for the package management or mkinitrd for rebuilding the initrd; everything that needed write access, that had to modify the root file system, was wrapped by that script. Out of that emerged a new library called libtukit. That one is supposed to be platform-agnostic. It includes C, C++ and D-Bus bindings. The only implementation currently is for Btrfs and snapper, but if you want to, you could write another backend; it's prepared for more general snapshot management. We have, for example, a DNF plugin, or microdnf plugin to be exact, so you could use that one library for that.
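The update and rollback flow described above can be modelled in a few lines. This is a toy simulation only: `SnapshotStore` and its methods are invented for illustration, a dictionary copy stands in for a copy-on-write Btrfs snapshot, and none of this is the tukit or transactional-update API.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotStore:
    snapshots: dict = field(default_factory=lambda: {1: {"pkg": "1.0"}})
    default: int = 1   # snapshot used on next boot
    running: int = 1   # snapshot the system is currently running from

    def transactional_update(self, update):
        """Apply `update` to a new snapshot; the running system is never touched."""
        new_id = max(self.snapshots) + 1
        candidate = dict(self.snapshots[self.default])  # stand-in for a CoW snapshot
        try:
            update(candidate)
        except Exception:
            return None  # update failed: discard the snapshot, back at square one
        self.snapshots[new_id] = candidate
        self.default = new_id
        return new_id

    def reboot(self, health_check):
        """Boot the default snapshot; roll back if the health check fails."""
        previous = self.running
        self.running = self.default
        if not health_check(self.snapshots[self.running]):
            self.default = previous
            self.running = previous
            return False
        return True
```

A failed update leaves `default` and `running` unchanged, and a failed health check after reboot returns both to the last snapshot that was running, which mirrors the "back at square one" behaviour from the talk.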
So I just said that if you want to activate the snapshot, you would usually reboot. That's the atomic part, which is important to make sure that you don't touch the currently running system and that the update is activated in one atomic step. With the next transactional-update version there will also be a new option called transactional-update apply. What it does is basically mount the new /usr into the currently running system. That became possible, if you heard the talk from Ludwig two talks ago, because of the usr merge: practically everything the system requires is below /usr. Where it isn't, that should be changed in the future, but for MicroOS it is working very reliably already. If you just installed a new package in the new snapshot, this will be totally painless, because the new package is simply available immediately. If you changed system services and I don't know what else, you should maybe still think about rebooting afterwards, because it's basically the same as running an update, a zypper dup, in a running system. I talked about rollback already: if you reboot the system and something doesn't work as expected, it performs an automatic rollback. health-checker, whose URL you can see at the bottom, provides an interface for custom plugins. You have to know yourself which services are running on your system, and you have to know yourself how to check them. As I said, it's a single-purpose system, so I hope you know what you're doing. You can extend it with custom plugins in which you define what a correctly running system is. Let's have a look at the time; excellent. Let's have a look at configuration file management.
When we are talking about read-only root file systems, we still need some directories or files to be writable; we've just seen that in the Ubuntu Core talk. In contrast to Ubuntu Core, for openSUSE MicroOS all of /var and all of /etc are writable. /var is a separate subvolume. For /etc we are using overlays, basically just the kernel's overlay file system, and that may need a bit more explanation. If we look at the /etc/fstab entry for /etc, it looks overly long and overly complicated, but in the end it comes down to the two colored lines. There is the upperdir; when we are performing an update, that's the overlay of the next snapshot. The lowerdir is the directory of the currently running system, and to avoid having stacks over stacks over stacks, we just sync the state of the previous system in as the base of the new snapshot. So those are the three layers we are talking about. Let's look in detail at what happens on configuration file changes. File 1 is a really old file: it has always existed and was never changed, and that's no problem at all. We look from the top to the bottom: nothing is in upperdir, nothing is in lowerdir 1, so the version from lowerdir 2 is used. File 2 is interesting because it has an old state, and during the update the file was changed, it seems; either the package refreshed it, or some post script modified it, or whatever happened to it. In any case we have a new version of that file, so as soon as you reboot into the new snapshot you will see the new version, and older snapshots will see the older version. Files 4, 5 and 6 are similar and quite easy: that one is new in the next snapshot, that one was new in the previous snapshot, and those were modified in the snapshot before.
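The top-to-bottom lookup through the three layers can be sketched as follows. The layer names follow the talk; the dictionaries standing in for directories and the file contents are invented, and real overlayfs features such as whiteouts are deliberately left out.

```python
def resolve(path, layers):
    """Return (layer name, content) from the first layer, top to bottom, that has `path`."""
    for name, files in layers:
        if path in files:
            return name, files[path]
    raise FileNotFoundError(path)


# Hypothetical contents, ordered from the top layer to the bottom layer.
layers = [
    ("upperdir", {"file2": "new version", "file4": "new in next snapshot"}),
    ("lowerdir1", {"file5": "new in previous snapshot"}),
    ("lowerdir2", {"file1": "never changed", "file2": "old version"}),
]
```

With these layers, `resolve("file1", layers)` falls through to `lowerdir2`, while `resolve("file2", layers)` stops at `upperdir` and hides the older copy below it.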
The interesting one is file 3, because it exists both in the current snapshot and in the new snapshot, and that may indicate a conflict. As soon as you perform an update, or create the new snapshot, the state of the file is basically frozen, and that state is used as the base for the new snapshot. If you change the file before rebooting the system, the new layer won't see that change, and either you are operating on an old version of the file or you have to check it yourself. There will be a warning when you boot the system that there may be a conflict, if the file in lowerdir 1 is newer than the file in upperdir. That's our approach to handling current systems and current packages as we get them from vendors and from our own distribution: a pragmatic approach to handling those configuration files. But in the end we don't want to have that at all. In the end we want the packaged files to be completely in /usr; we don't want them in /etc at all. Only changes that were explicitly made by the admin are supposed to be in /etc. We are getting really close to that final goal, and for openSUSE MicroOS we have almost achieved it. There are still some problematic files; /etc/fstab, of course, is such a singular file that you can't extend. If you have an application and want to make it possible to split its configuration files in this way, we have developed a library called libeconf. You can just embed it and it will do all of this automatically, in case you're interested. We have also made several pull requests and patches for projects that are used by openSUSE MicroOS to fix those use cases for us.
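The goal of keeping packaged defaults under /usr and only admin changes under /etc can be illustrated with a simple key=value merge. This is a sketch of the general idea only: the functions below are invented for this example, they are not the libeconf API, and the precedence rules libeconf actually implements may differ from this key-by-key override.

```python
def parse_kv(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def effective_config(vendor_text, admin_text=None):
    """Vendor defaults (shipped below /usr) overridden by admin changes (in /etc)."""
    config = parse_kv(vendor_text)
    if admin_text is not None:
        config.update(parse_kv(admin_text))
    return config


vendor = "# packaged defaults\nport = 25\nlog_level = info\n"
admin = "log_level = debug\n"
```

Here `effective_config(vendor, admin)` keeps the packaged `port` and takes `log_level` from the admin file, so a package update can replace the vendor file without touching anything the admin wrote.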
So in the end I think we took a quite pragmatic approach, using existing technologies so that you can keep using your existing infrastructure and your existing packages and preserve compatibility with all the legacy software that may exist out there. We are basically doing the opposite of most other distributions by extending an existing distribution to the new use cases. As I said, I'll just refer to Lennart's talk from earlier today: we do support full disk encryption, we do support trusted boot, and we also have measurement support. So let's get to the questions part, just in case you have some.
Thank you very much for your talk. I was pestering your colleagues out here moments before this talk. Do you have any plans to migrate the old Kubic docs to MicroOS? Because for someone who is new to transactional systems, it isn't immediately obvious how the classical administration style translates to the transactional one.
Yes, we have a new documentation server. The problem is that MicroOS doesn't include documentation. If you're in an embedded use case or using it for containers, the man pages are just overhead that is not needed. So we have a separate documentation server that provides all the documentation, but I forgot which one it is. You'll have to ask me later and I'll look it up for you. I hope the search engines have picked it up in the meantime; it's a few weeks old.
There's a question in the chat: how is MicroOS related to ALP?
Okay. The current state of openSUSE ALP, which is supposed to be the next generation of our SUSE Linux Enterprise, or whatever emerges out of that, is basically that openSUSE ALP is based on openSUSE MicroOS. There will be more use cases than just the container use cases, but that's the base it was initially put onto. So a lot of things you can do with openSUSE MicroOS you can also do with openSUSE ALP. We even share parts of the documentation currently, but openSUSE ALP is more than just openSUSE MicroOS.
It's just the same foundation currently. Questions? Going once? Going twice? Thank you very much. |
MachineOS: a Trusted, SecureBoot Image-based Container OS |
We've been working on it for a couple of years, and its design points are focused on appliances for data centers: lights-out, hands-off environments. Over a year ago we presented at the Linux Security Summit, with details about securing secrets in TPMs for data center appliances and some information about how we are going to keep those secrets in these environments. The key pieces that we've incorporated from that discussion into MachineOS focus on the use of Secure Boot and the TPM 2.0 devices that we need to guard a unique identity and keys, which are placed into the TPM and which we use for identity and device authenticity. Further, the operating system will only be able to extract the TPM secrets if we have securely booted all the way up to kernel and user space and the chain of trust is still verified. That ensures that we are on the device we expect, and running the software we expect, before we can access these secrets. It also includes support for unattended storage access for data protection, and it incorporates continuous updates. The root of trust for MachineOS is a certificate and an associated key pair, which are protected in hardware. The certificate holds a product ID and a serial number, which binds it to the physical system that people are using. The certificate and key pair are inserted into the TPM at manufacturing time, and the certificate gives us an immutable identity to verify that the device is authentic, that it is running the software that the product expects, and that it hasn't been tampered with. At runtime, in the boot sequence from the last slide, we start with our bootloader, and we're executing a fairly normal view of how the kernel comes up. We transition into the initramfs, systemd starts up, and then we get into MachineOS.
We unlock the storage to pull out our EA policy, which is loaded into the TPM, and then we attempt to access the PCR registers to make sure that the values in these registers are what we expect. If they're not, then we halt the system: we're on the wrong board, or something has been modified, and we don't have access to any of the secrets in the TPM that we've locked away. On success, though, we can recover our trust keys and certs and the passphrases that are protected in the TPM, loaded into the TPM and used for only 5 minutes. Once we have extracted that information, we extend PCR7 with another measurement, which protects against further access to the TPM from any of the runtime services. Before we start the containers of the runtime services, we have to go through steps to ensure that the images that were included on the system match the product and the product's expectations and have been signed by the product certificates, so we verify certificates and signatures. Then each of the OCI images in the manifest is mounted, and the services can be started. We also optionally can put this into a particular OCI that we call the OCI. We want to ensure that the stored images haven't been modified, and the way we do that is that stacker builds a squashfs for all the layers in the OCI image, which is a read-only file system when mounted. In addition to the squashfs layer, the OCI image has verity hashes that were calculated at build time. With that, we can build a dm-verity block device in Linux, and the first time we read a block from that device, the kernel does the work of hashing it and evaluating whether the block is valid or has been tampered with. As for updates, we have continuous updates: we can sync from our image repository, and an update is much like what we do when we boot up.
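The verify-on-read idea behind dm-verity, mentioned above, can be sketched with per-block hashes computed at build time and checked on access. This is a simplification under stated assumptions: real dm-verity uses a multi-level hash tree with salts and runs in the kernel, while the sketch below uses a single level and invented function names.

```python
import hashlib

BLOCK = 4096  # block size in bytes


def build_hashes(image: bytes):
    """Build time: hash every block, then hash the hash table to get a root hash."""
    blocks = [image[i:i + BLOCK] for i in range(0, len(image), BLOCK)]
    leaves = [hashlib.sha256(block).digest() for block in blocks]
    root = hashlib.sha256(b"".join(leaves)).hexdigest()
    return leaves, root


def read_block(image: bytes, index: int, leaves, trusted_root: str) -> bytes:
    """Read time: return the block only if it still matches the trusted hashes."""
    if hashlib.sha256(b"".join(leaves)).hexdigest() != trusted_root:
        raise IOError("hash table does not match the trusted root hash")
    block = image[index * BLOCK:(index + 1) * BLOCK]
    if hashlib.sha256(block).digest() != leaves[index]:
        raise IOError(f"block {index} failed verification")
    return block
```

Only the root hash has to be signed or otherwise trusted; a change to any block, or to the hash table itself, is then detected the moment the affected block is read.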
We have to go through the same sort of verification of signatures, so that we know that this is the software that's expected for the device we're running on and that all of the layers that we've built have been signed correctly. Once we've applied that, we can update our configuration database to point to those versions. For the currently running containers, we can signal them to restart, but if we have changes to the root file system or the UKI, we need to reboot the system so that we can execute the new versions. Lastly, for offline protection or other prevention of maybe that is sort of a maximum down, there are zero days for another kernel, we need to talk about verification. |
Why we ditched JavaScript for Kotlin/JS |
So, do you have an answer to that question? Okay, don't worry, I have some answers. We'll start with the wrong answers, of course: because Kotlin is cool, because of the hype, because we love to hate JavaScript. First, JavaScript was designed in just seven days, while Kotlin was designed over six years. That's not completely false, but I don't think it's very accurate, because JavaScript kept evolving, and Kotlin as well, and JavaScript is still older than Kotlin anyway. So what can we do on the web with Kotlin? I think it's important to answer that question first, because, well, it's Kotlin/JS, so it still has some JS inside. Can we do as much as we can do with JS? I want to show you a demo of a website, and hopefully switching tabs will go smoothly. Give me a second. Oh crap, I forgot to connect to the internet. How, in 2023, do people still forget to connect to the internet? Give me 10, 20 seconds. So this website has been made with Kotlin/JS. It's a place where you can play with data2viz; that's the name of our company. It's about data visualization. Some of you may know D3.js. Who knows D3.js? Okay. So basically there's an open source repository named data2viz, same as the company. It is D3.js, but for Kotlin, and it doesn't compile just to JavaScript but also to the other platforms that Kotlin supports; you will see later which platforms those are. And you can do things that look like these. I wanted to show you a few dynamic examples, because this was just a web page, and just a web page is not very interesting nowadays. So this is entirely made with Kotlin/JS. Please... what is happening? It should respond. Yeah, so you see it's reacting with forces; I can fix a big one, we'd have a better effect. This was done with Kotlin/JS, both the calculations and the rendering. There's also this one, rip curtains.
It didn't like switching density between monitors, but now it's smooth again. Okay, it's really uncomfortable to have two different screens. This one is the same, but with some color. Okay, anyway, now I also want to show you something built on top of data2viz, which is Charts-kt. Unfortunately, Charts-kt is not open source, but the examples are. I know we are at FOSDEM, so it's a bit, how can I say, borderline sometimes, but yeah, this is the kind of thing it does. It's a charting library: basically, with as little code as possible, we want to have a chart that makes sense, with the right bounds, the right zooming, and all these kinds of things. There are also a lot of other charts, which you will find in the presentation. The links are clickable; I will share the link at the end. Okay, so what can we do on the web? A lot of complicated things, just like with JavaScript: you can manipulate the DOM, draw on canvas, write reliable and shareable business logic. There's no hard limit. And now the actual answers: why we ditched JavaScript for Kotlin/JS for this, and for other projects as well, sometimes, I should say. The first reason was safety and reliability, especially when working with multiple people over time. We don't need to transfer a lot of knowledge, because it's already built into the code itself thanks to typing: this is a number, this is a string, this is whatever object we have, maybe this is a duration. A duration is not just an integer or a long, it's a Duration, so we know what we are dealing with. Then the language design: the concise syntax, which is also something we like in JavaScript and like a bit less in Java, for example, and having a concise syntax is actually important; type safety; cancellable async programming, also known as coroutines; and other features of the language design. But also code sharing.
So we can share what we want, and we can have native interoperability with a platform. Say at some point we want to make a mobile app: we can actually compile the same code the way native apps are made, granted that we are not using JavaScript or web-only APIs. And there are some APIs that work the same everywhere. For example, plus, adding two numbers, works on all platforms, and there are a lot of other things that work on all platforms, actually. So let's focus on code sharing first, because sharing is caring, right? That's what some people say. And so I want to answer which platforms Kotlin runs on, or compiles to, I should say, because Kotlin, in a way unfortunately, but at the same time it's a strength, is a compiled language, which means that you might need to wait a bit before you can actually test your code. So it compiles for the JVM, which means server, desktop apps, libraries; for JavaScript, in the browser and Node.js as well; for Android, where it runs within the Android Runtime, which is kind of a JVM, but not really; and for native, so kind of like C++. You can compile Kotlin for iOS, watchOS (I don't have an Apple Watch, but you get it), macOS, tvOS, Linux, Windows, and also the Android NDK, which is kind of outside the Android Runtime, the kind-of-JVM ART. You can also compile for it if you want; it can be useful if you make a video game, for example. So basically Kotlin runs natively on almost all devices, at least devices where we run what we call apps. So maybe not a microcontroller, but if it's slightly more powerful, yes. But Kotlin is not all about Kotlin, actually. Interoperability is a key thing in Kotlin, and Kotlin has direct interoperability, which means that you don't have to use a special API where you pass a string for the name of the method and then the list of arguments; you just call it as if it were made in Kotlin. So it has direct interoperability with Java, with JavaScript.
Now with TypeScript as well, very recently, in the last release of Kotlin a few weeks ago. Objective-C, and therefore Swift, and C. But unfortunately there are quirks, because languages have specific designs and sometimes it's hard to have the square fit into the square-ish. So it works both ways for most of the languages, but sometimes you are limited. I just want to be completely honest about that. For Java it works well; actually I should put Java in orange as well, because a few features now don't work super naturally with Kotlin, and you have to jump through some slight hoops when it comes to async programming. If you do it the best way in Kotlin, you cannot apply that in Java, but you will be able to support all the results that you've got yourself. What direct interoperability also means is that you are not reliant on third-party plugin updates. For example, with Cordova or Flutter, you are waiting for them to support the new release of something, or that new standard on the web. You don't need that in Kotlin, because you are still doing a native app; you just don't have to duplicate the code for each platform. It's like you are doing JavaScript if you use it for the web. So in JavaScript, no, it doesn't need to compile for JavaScript. You can use native APIs as if they were built in Kotlin, and call Kotlin code from Java as if it were built in Java; the same for Swift, the same for JavaScript. Sometimes there are limitations, as I told you, but for most use cases it works just like this, and sometimes it's only a minor thing and you will find the answer on the internet. Kotlin nowadays is the most interoperable programming language, unlike C++, which doesn't compile for JavaScript, for example. And I know C++ was used for multi-platform libraries, even though I don't really want to deal with overflows on a daily basis, personally. And so, back to that slide, now we look at safety and reliability, because, oh, okay, let's move on.
So for me, it's reason number one. I don't know if it speaks to you, but your users might be in this plane. And that's what JavaScript is like. It doesn't care. It's like: it's on you to check whether you still have the wing. Missing bolt? Wing off? No problem. Let's try to fly it anyway. Let's try to run it anyway. That also applies to other programming languages like Python, Bash, old PHP (not PHP 7, but yeah), and basically languages that say: yeah, types, we don't need them, or only when you want. On the other side, Kotlin is more like this: bolt required, no takeoff possible; wrong bolt type, no takeoff possible; expected a pilot seat but got an economy seat, no takeoff possible. So you don't have to fly the plane to realize that one wing wasn't attached properly or that you were missing the wheels for landing, because you want to realize that before taking off. And that also means that compiling is not just about browsing the web while waiting. Compiled languages mean fewer tests to write. You don't have to check whether the input is of the right kind. You only have to check the values, not things like: someone gave me a horse but I was expecting an integer. So it can save you time. But not at the time you might expect. That's the quirk. Anyway, so what should we do on the web with Kotlin? Because in my opinion, it doesn't suit all use cases; a lot of them, but not all. So first, web apps. When I say web apps, I mean something where you are not just going to look at one page but where you are going to do some interaction, say watch a video ad. Oh no. Web pages with a lot of dynamic logic that can't move to the server side: Kotlin might be a good alternative for that. Because when you have a lot of logic, types and correctness are even more important. Personally, I feel less stressed when I do complex logic with a programming language that checks a lot of things for me. Anything that doesn't have to be less than 300 kilobytes.
If you have video ads, you don't need to be less than 300 kilobytes. Otherwise, if it's a landing page, maybe it's not the best solution; maybe you want to generate your web page, maybe even with Kotlin code. But you don't really want to have a bundle compiled with Kotlin on your landing page. But for anything further than a landing page, or if your landing page is heavy, or all your users have fast internet, you don't really care. Network is cheap these days, when you are lucky. So that's when you're lucky; think about it. You can generate HTML with Kotlin, as I told you. With Kotlin on the JVM, it's actually the simplest way to do it. But there's an API, kotlinx.html, that works on both JVM and JS. You can also make libraries for Kotlin consumers, and complex libraries for TypeScript and JavaScript consumers also. This is a good fit for Kotlin. And now I want to talk about language design, because I talked about Kotlin, but who of you has ever seen the syntax of Kotlin? Who hasn't? Yeah. Okay. Anyway. A lot of curious people in this room. So basically, if you want to have a variable, you can just write it like this. You see, we don't declare the type, because we don't need to, it's obvious. It's an integer, 23, what else could it be? If you want to increment it like this, it won't compile, because it's an immutable reference, val. As opposed to val, with var you can change the value like this, and it will increment. You have functions. You write the keyword fun, and here, if I pass, I don't know, FOSDEM as the name, then it will return welcome back FOSDEM, a few years after. And we can actually put it on one line if we want, if it's short enough. So that's great for conciseness. Sometimes you just want to give a name to a dumb operation, but you want it to be more clear in the code, and we usually spend more time reading code than writing it. You can also omit the type if it can be inferred from the expression after it.
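The slides are not part of this transcript, so here is a minimal sketch of what the syntax described above might look like. The identifiers (age, counter, welcomeBack, welcomeBackShort) are assumptions chosen for illustration and are not taken from the talk.

```kotlin
// Illustrative names only; none of these identifiers come from the talk's slides.

// A regular function with an explicit return type.
fun welcomeBack(name: String): String {
    return "Welcome back $name"
}

// The same function written as a single expression; the return type is inferred.
fun welcomeBackShort(name: String) = "Welcome back $name"

fun main() {
    val age = 23        // immutable reference; the type Int is inferred
    // age++            // would not compile: a val cannot be reassigned
    var counter = 23    // mutable variable
    counter++           // allowed, counter is now 24

    println(age)
    println(counter)
    println(welcomeBack("FOSDEM"))
    println(welcomeBackShort("FOSDEM"))
}
```

Uncommenting the age++ line is a quick way to see the compile-time error the speaker mentions.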
Only when it's a one-liner, though. Even though you could have something that has some curly braces, it has to be one entire expression. You cannot have multiple expressions or statements if you want to omit the type. You can make a class. This actually compiles. If you want to put something in it, you can add parentheses. And now it's a data holder. Usually we write it like this for indentation, but it's not mandatory. It will still compile. It's not Python, come on. You can also add the data keyword. The data keyword can come in handy when you want the equals implementation to be based not on the identity of the object, but on its content. Say you use a hash map, for example: if you make a copy of a key, but it's still exactly the same (somehow you make a copy in your program because it's complex), then the key will match. But if you don't use data and you still want that behavior, you have to implement the equals function, or let the IDE implement it. There's also hashCode, which speeds up dictionary operations. And there's toString, for when you want to log the content. So with the data keyword, you just add this, and the compiler implements those things. You can make a class open. By default, classes are not open, so you cannot subclass them, because usually people didn't really want their class to be subclassed, so yeah. And then you can have a subclass like this. Then this is how you can implement an interface and override a function. If Comparable was a class, you would have parentheses next to it, but otherwise it's exactly the same syntax, the same principle of inheritance. There's the handy TODO function for when you want your code to compile, but you are not done yet. If it actually runs that function, it will crash, but maybe you can see if the plane is able to take off. And then it's very easy to find the TODO calls again, because you Control-click or Command-click on TODO.
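As a rough sketch of the class features just described (data classes, open classes, implementing an interface, and TODO), assuming hypothetical types that do not come from the talk's slides:

```kotlin
// Hypothetical types for illustration; they are not from the talk's slides.

class PlainPoint(val x: Int, val y: Int)   // equals() compares identity by default
data class Point(val x: Int, val y: Int)   // equals(), hashCode() and toString() are generated

open class Vehicle(val name: String)       // 'open' is required to allow subclassing

// Subclassing a class needs parentheses (a constructor call); implementing an interface does not.
class Car(name: String) : Vehicle(name), Comparable<Car> {
    override fun compareTo(other: Car): Int = name.compareTo(other.name)
}

// Compiles, but throws NotImplementedError if it is ever called.
fun notDoneYet(): Int = TODO("implement later")

fun main() {
    val labels = hashMapOf(Point(1, 2) to "first")
    println(labels[Point(1, 2)])                   // a content-equal copy of the key finds "first"
    println(PlainPoint(1, 2) == PlainPoint(1, 2))  // false: two different instances
    println(Point(1, 2))                           // Point(x=1, y=2)
    println(Car("a") < Car("b"))                   // true, via compareTo
}
```

The hash map lookup is the case the speaker describes: without the data keyword, the copied key would not match unless equals and hashCode were written by hand.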
It will take you to the declaration, and if you do it again at the declaration site, it will show you all the usages in your project. A few cool language features: null safety, nullable types, so you are safe from null pointer exceptions, which are kind of like the type error you get when you try to add to an integer but it's not initialized. For example, this doesn't compile because String is not nullable. If you want it to be nullable, you have to add the question mark here. And now this can compile. But this doesn't compile, again, because greeting might be null, and actually you set it to null. And even if you remove that, it still doesn't compile, because you explicitly stated that the type was nullable. But with the question mark before the call, it becomes what is named a safe call, and this compiles. If there's no value, if the string is set to null, then you will get null instead; it will just skip the function call. You can also crash the program in case it was null. In this case, it will work fine because hello world is there. But if you put null, it compiles again, but it crashes at runtime. But you wanted it, right? This doesn't compile because the type is Int, but it returns Int question mark. But you can also use the Elvis operator. If you turn it 90 degrees, it looks a bit like Elvis's haircut. When the first part is null, you get the expression after it. Don't overuse it, but it's quite cool for such cases. The double equals checks for equality and doesn't check for identity, unlike in Java. So this is similar to this code in Java, if you compile for the JVM. And if you use a triple equals, then it's comparing the identity. In Java, that's a double equals. We think it makes more sense the way it's done in Kotlin. I think it's very similar to what's done in JavaScript, in a way. And if both values are null, it doesn't crash. It won't try to call equals on a null object. It will see: oh, both are null, okay, equal. One is null, one is not: not equal.
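The null-safety operators and the two kinds of equality described above can be sketched as follows. The helper function names are assumptions made for this illustration; only the operators themselves (?., !!, ?:, == and ===) are Kotlin's own.

```kotlin
// Helper names are hypothetical; only the operators are part of the language.

fun lengthOrNull(text: String?): Int? = text?.length      // safe call: null in, null out
fun lengthOrZero(text: String?): Int = text?.length ?: 0  // Elvis operator: fallback when null
fun lengthOrCrash(text: String?): Int = text!!.length     // throws NullPointerException on null

fun sameContent(a: Any?, b: Any?): Boolean = a == b       // calls equals(), null-safe
fun sameInstance(a: Any?, b: Any?): Boolean = a === b     // compares identity

fun main() {
    println(lengthOrNull("Hello world"))   // 11
    println(lengthOrNull(null))            // null
    println(lengthOrZero(null))            // 0
    println(lengthOrCrash("Hello world"))  // 11

    val first = listOf(1, 2)
    val second = listOf(1, 2)
    println(sameContent(first, second))    // true: same content
    println(sameInstance(first, second))   // false: two separate lists
    println(sameContent(null, null))       // true, no crash
    println(sameContent(null, first))      // false
}
```

Declaring a parameter as String instead of String? and passing null to it is the case that fails at compile time rather than at runtime.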
Seems more intuitive when you say it like this, I think. Other things are properties. So val and var, if you put them in a class, are properties. Otherwise, if they are not in a class, they are local variables. Properties are similar to fields, but they also include getters and setters. And they can be delegated. You can Google for Kotlin delegated properties if you want to find out more, but basically it helps you not to repeat the get and set over and over. So these are properties. You can specify the type or not; if not, it infers it from whatever the expression is. This is a custom getter and setter. And if you type get or set in the IDE, it will suggest the right curly braces to make it work. And there are also extension functions, which are static methods that look like instance methods when you call them, but you don't need to edit the class. So it works on any type, including Int, which is built into Kotlin, or rather into whatever platform is behind it. This allows you to have isEven on any integer, wherever the extension is visible or imported. You can use it like this: 3.isEven(), it works. I think I'm out of time. Maybe I have one extra minute. Okay, a few seconds. I want to show higher-order functions, so you can pass a function. And I guess some of you are familiar with lambdas. Higher-order functions are functions that take lambdas. And in Kotlin, they can be inline. And so this is a lambda. This is a lambda, so it's just curly braces. And whenever you see curly braces like this, it's not just a function, it's a lambda. You can also check types, and then you don't have to cast again, like you would in Java. And string templates: if you want to use a variable, you don't have to do concatenation. You just use a dollar sign and put the name of the variable. Anyway, the entire content is here. I think I said the most important things.
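A short sketch of the remaining features mentioned here: properties with custom accessors, an extension function, an inline higher-order function taking a lambda, smart casts, and string templates. Apart from isEven, which the speaker names, every identifier below is an assumption made for illustration.

```kotlin
// Apart from isEven, all names here are hypothetical.

class Temperature(celsius: Double) {
    // Property with a custom setter; 'field' is the backing field.
    var celsius: Double = celsius
        set(value) {
            require(value >= -273.15) { "below absolute zero" }
            field = value
        }

    // Read-only property with a custom getter, computed on each access.
    val fahrenheit: Double
        get() = celsius * 9 / 5 + 32
}

// Extension function: callable like a method on any Int, without editing Int.
fun Int.isEven(): Boolean = this % 2 == 0

// Inline higher-order function: it takes a lambda as its parameter.
inline fun repeatTwice(action: (Int) -> Unit) {
    action(0)
    action(1)
}

// After the 'is' check, value is smart-cast to String, so no explicit cast is needed.
fun describe(value: Any): String =
    if (value is String) "text of length ${value.length}" else "something else: $value"

fun main() {
    println(3.isEven())                               // false
    println(4.isEven())                               // true
    repeatTwice { index -> println("call $index") }   // the curly braces are a lambda
    println(describe("Kotlin"))                       // text of length 6
    println(Temperature(100.0).fahrenheit)            // 212.0
}
```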
You can flash that QR code and go on speakerdeck.com, with my handle, and you will find the presentation with all the links, including some links to get started if you have never tried Kotlin. Thank you. So we have time for one or two questions. And I'll ask the first question, which is for the next speaker. Are you here? Okay, so you don't have to move. Next question then. You raise your hand, I run to you with the mic. I can't toss it because it's dangerous. Okay, so you mentioned kotlinx.html, right? So is there something like that for CSS? I think there's like a kotlinx CSS, for extensions also for JavaScript. Yeah, CSS, it's a bit of a complicated topic. There are multiple ways. You can do it the way you do with plain HTML, where you have a link rel. But there's also some special support where you can compile Sass into CSS. Or you can also write Kotlin that translates to CSS, but I think it's runtime CSS. I think the best way to answer the question will be to Google for Kotlin CSS, because I don't have the perfect answer in my head. But yeah, there are some things for that. Another question? Thank you. Thank you. |
Doom on the browser thanks to WebAssembly and .Net
Or how I ported Managed Doom to Blazor |
Hi, everyone! Hi, FOSDEM. How are you doing? I hope that you're doing great. Yeah! Let's go! So I'm really honored to present my talk about how I ported Doom to the browser with Blazor WASM. So it's a topic about another language, as we will see. A quick word about myself: I'm Yassine Benabbas. I'm a DevRel at Worldline. I'm also a teacher. And during my spare time, I love to play video games. So that's one of the things that made me make this port. But before going further, let me explain what a port is. Game porting consists of making a game run on a platform other than the original one. There are many ports released nowadays. Some are good, some are bad, depending on how they are developed. It consists of adapting the source code of the original game to the new platform. Adapting means that maybe we need to change some bits of the code. Sometimes it can be a whole rewrite, depending on the differences between the platforms and how the game has been developed. Using a virtual machine or an emulator is not really considered porting. You really need to have access to the source code and adapt it to the new target platform. In the beginning, I wasn't really confident about making a port. I considered it a complex, difficult task. I didn't have a clear vision of what it is. And what really gave me the first inspiration to consider porting games is Modern Vintage Gamer. Who knows about MVG? Yes. So he's a game developer and a YouTuber who makes great videos. I didn't expect it, but he made a video where he showed how to port Heart of Darkness, one of the greatest games of retro gaming, to the original Xbox. In the video he showed himself changing includes. Watching this video really made me more confident, and made me consider porting as one of my activities. But that's not the only thing that gave me the idea of porting a game. The other thing is that I like to play with the .NET framework.
I really like this framework, because it has many good things. One of them is that it's an open source, cross-platform, general-purpose framework. So it runs on Linux, Windows, Mac, Android, iOS, a lot of platforms, and even other platforms that I'll talk about now. The main language of this framework is C#. C# is a really, really good language that keeps evolving over the years. It has features that you find in modern languages: null safety, extension functions, and this kind of stuff. So it really keeps evolving; great language. However, in the beginning of the framework, the browser was not a target. But in 2020, with the release of .NET 5, .NET introduced support for the Blazor WASM framework. It's a component-based framework, like Angular, React, and Vue, with components. But the code of the component is in C#. And it runs locally, natively in the browser, thanks to a WebAssembly stack, as you can see here. So Blazor has a WebAssembly implementation, which allows the developer to access Razor components, access the .NET framework, and also communicate with the DOM. This is an example of a Razor component. It's similar to what you see in Angular or Vue. The difference is that the code below is in C#. In addition to that, you can also have CSS, of course, and you can even call JavaScript, and it interoperates with C#. So when I saw this, C# and JS interoperating, I was like this. I was amazed. I was really happy to see this. And I told myself, it's time to make this port. So now I need to find a game to port. There are many games with source code available. And the game that I chose, no spoiler here, is Doom. I will tell you next why I chose Doom. But one of the reasons is that it's one of the most successful first-person shooters. Shooters, not footers, sorry.
And also, technically speaking, it's well developed, in my opinion, because the logic of the game is separate from the resources. So you have the famous WAD files of Doom, which contain the assets of the game. And you have what updates the game state, the position of the character, the game logic, in a separate project. So that makes Doom portable by design. And in terms of ports, Doom has a lot of them, on video game consoles, of course, and even on anything that has a screen and some processing power, as you can see here. And there are even more. So here comes the reason why I chose Doom: I found that there is already an existing .NET port of Doom, of Linux Doom, which is the source code released by id Software. On GitHub, there is a repository with a port of Doom in .NET. However, this port uses libraries that communicate with hardware, for graphics, audio, and input, which are not compatible with the browser. So my work was to take this port of Doom and make it work in the browser. Just to be clear, I used V1 of Managed Doom. They are currently developing a V2, which uses another library. So to summarize, id Software released the source code of Doom for Linux. Sinshu developed Managed Doom, which targets any platform that is targeted by SFML, desktops mostly. And this is where I come in: I based my work on this port and made it work in the browser. Before starting my porting work, I made a strategy, which is this one. This is an AI image, by the way. I typed Doom monster typing on keyboard, and I got this. So my porting strategy was to get something that works, like a proof of concept, and to be able to demonstrate it quickly. The first step is to take the source code and compile it with the Blazor framework, as simple as that. And as soon as I see a compilation error, I delete the code and add a TODO. So, another presentation with TODOs, that's fine.
After that, once the code compiles, I replaced, little by little, the bits of code, the methods or functions, that are not implemented, giving priority to frame rendering, because it's always nice to see something on the screen rather than working blindly. In terms of optimization, I always left that for later, unless it was really necessary. And in terms of reading documentation: how Doom is implemented is really well documented, but I only read the parts that are really relevant and important, specifically how the Doom image is drawn on the screen once the frame data is generated by the engine. With this kind of porting strategy, after two or three weeks of part-time, side-project work, I was able to achieve a port that can be run, even if it's not perfect yet. We'll see in the demo later how it works. Now let's go into more detail on how, concretely, I ported this to the browser. First of all, before giving further explanations, let me show you how a game is developed most of the time. So this is the big picture of the game algorithm. First of all, we have a while loop, which is an infinite loop, but it doesn't iterate as fast as possible. It iterates only when the frame pacing says so. For example, if you have a 30 FPS game, the next iteration will wait a little bit if the previous frame was computed very quickly. This gives a frame pacing that is correct and pleasant to the eye for the user. Once the frame is due, we get the user input since the previous frame. Really simple. After that, we compute the next frame of the game. So we run update game state; it's just an example name for the method. It takes the input of the user and, for Doom in this case, the WAD file. And then it advances the game by one frame. So it updates the player position, the monster positions, the ammo, the status, the player's life, all this kind of stuff.
And it also generates a frame to be rendered and some audio. This algorithm is run for each frame, and it updates the game each frame. Once we get a frame and some audio, we play them and render them to the user. So when you see this... In Managed Doom, all of this is done in C#. That's clear. And you can start to see which parts need to be adapted for the browser, the parts that cannot be done in C# but need to go to the JavaScript realm. But to show you more precisely what I ported, let me show this in another way. So here, we have the while loop and the frame-pacing step. Next, the user input is sent to update game state, with the WAD file as an argument. And then we generate some audio and the frame to be rendered. And it loops. What needs to be ported is what you see in red. What you see at the top is frame pacing. It's not really Blazor-relevant, but browser-relevant: there is a better way to pace frames in JavaScript. The rendering needs to be replaced too, since SFML is not available in Blazor. Then there is update game state, where everything is in C#. Even though it is meant to be platform-agnostic, that's not 100% the case, so some bits needed to be adapted to the browser. But fortunately, approximately 70% of the code was cross-platform and runs in the browser without any problem. So after some work, some coding, some fun, some fails, and some learning, I achieved this result. I replaced the while loop with requestAnimationFrame. Does anyone here know about requestAnimationFrame? Yes. Nice. So yeah, requestAnimationFrame is how you tell the browser: I want to render frames in a manner that is optimized for the browser. For example, when the user switches tabs, don't do anything, to save energy. So you ask the browser for a new frame.
When you find it relevant to compute a new frame for my game, call me back. So it's a callback, and for each frame, the browser calls it. Once this has been changed, the rendering changes too. For the audio, it's the AudioContext API, and for rendering, it's the canvas, of course. AudioContext is the audio API of the browser. But there is one thing that I didn't mention yet, and that you see here. As I said earlier, Blazor is a component-based framework, like Angular, Vue, and React. You need to have some kind of main component, which is the entry point of your program. Here it was missing. So that's why I had to add a Blazor component, which only serves as the entry point to invoke the JavaScript, which then goes back to C#. So this is C#, .NET; I use C# and .NET interchangeably. This is JavaScript. We go back to C#. We go back to JavaScript. So there is a lot of context switching, or language switching. And this is achieved thanks to this API: Blazor provides an API that allows you to go back and forth from one language to the other. This is the Blazor way of doing things before .NET 7. Starting with .NET 7, there is an even better way to do this. I'll show it at the end of the presentation. Okay. So now we have something that runs. I will quickly show you some code, for the audio parts, or maybe just the entry point, and then continue the presentation. So this is the main component. As you can see in the code, we have the canvas here. And here we initialize the Doom object, the game object. And then here we invoke the JavaScript method that calls requestAnimationFrame. So we invoke the JavaScript method here. Here we handle the frame pacing.
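The frame-pacing step mentioned here is not shown in the transcript. As a rough, platform-neutral sketch of the idea (compare the timestamp of each callback with the time of the last computed frame, and skip the callback if less than one frame interval has passed), written in Kotlin purely as an illustration; the port itself does this in JavaScript, and the class name, method name, and 30 FPS default are assumptions:

```kotlin
// Hypothetical helper; the names and the 30 FPS default are assumptions, not the port's code.
class FramePacer(targetFps: Int = 30) {
    private val frameIntervalMs = 1000.0 / targetFps
    private var lastFrameMs: Double? = null

    /** Returns true when enough time has passed since the last computed frame. */
    fun shouldRunFrame(timestampMs: Double): Boolean {
        val last = lastFrameMs
        if (last != null && timestampMs - last < frameIntervalMs) return false
        lastFrameMs = timestampMs
        return true
    }
}

fun main() {
    val pacer = FramePacer()
    // Simulated callback timestamps (ms), roughly what a 60 Hz display would deliver.
    val timestamps = listOf(0.0, 16.0, 34.0, 50.0, 68.0)
    // Every other callback is skipped, which gives about 30 computed frames per second.
    println(timestamps.map { pacer.shouldRunFrame(it) })  // [true, false, true, false, true]
}
```

This simple version resets the reference time to the current timestamp on every computed frame, so small delays accumulate; a real implementation might carry the remainder over instead.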
And then we call back into .NET to run an iteration of the game, for the Doom engine to compute a frame. And then we call requestAnimationFrame to prepare for the next frame, which calls back this method. So this is like an infinite loop. And this method that you see here, which invokes .NET code, just invokes the game object and asks it to compute a new frame with the user input. This part I will just skip. This is how audio and video are rendered, so it's communication between C# and JavaScript. And I continue. So, what I learned from this. In Blazor, avoid copying arrays, big arrays. Originally, in the Managed Doom source code, the final image is generated by converting a one-dimensional array into a 2D array. This slowed down the game a lot, a lot, a lot. So I removed this part from the Managed Doom source code and moved it to JavaScript. That's what was shown in the previous slide about frame rendering, which I don't have to cover. But yeah, avoid copying big arrays in .NET. This is in .NET 5; maybe in .NET 7 it has been improved. Avoid extensive logging. And communication between JavaScript and Blazor can be very fast if you use the correct API. However, as I said, the API that I used is undocumented. And I confirm it, because I found no documentation, just some source code in some obscure GitHub repositories. But hopefully in .NET 7 it's improved. In JavaScript, I learned that requestAnimationFrame is the way to pace frames. And to play audio programmatically, you need to have some user interaction first, or the AudioContext API doesn't work. So here is the demo. I click to enable the audio; that's the user interaction. And let's see. Yes, of course. Yes. And here we go. So just to show you that we have sound. Don't be afraid. It's just a game. And just to show you that you also have secret passages. I don't know if you know this one. You get 200 armor.
But that's another topic. And it runs at a correct frame rate. This is a 2012 MacBook and it runs at 30 FPS. No problem. Okay. So, the last two slides: the interop in .NET 7. Here's how interop works now. You don't need Blazor. It means you don't need to create a component anymore if you want to interact between JavaScript and .NET. And I'm currently working on this part, because it's really exciting to see this kind of work. To call JavaScript methods from .NET, you just need to export your JavaScript method as you do in any JavaScript module. And here you just import the method and you have access to it. In the opposite direction, you just export your .NET method and then you import it in JavaScript using this kind of code. And that's it. So I'm working on changing the port to use this. In terms of next steps, then: migrate to JS interop. Update to Managed Doom V2; maybe I will gain some more performance. After that, I would like to have some game music, and also to be able to play other WADs. Currently, only the Doom 1 WAD works. I don't know yet why. And long-term, and it's really a wish, maybe this can be integrated into the official Managed Doom project. So as a conclusion, WASM makes existing code compatible with the browser. I mean, WASM is not just a very fast JavaScript alternative. It also opens the way to make many, many languages and many, many technologies run in the browser. So that's what I really like, what's really exciting for me, at least, about WASM. And porting games is fun. Developing is fun. Do you agree? Yes. Thank you very much. Thank you. So we have time for a couple of questions. Who wants to ask the first question? Thank you. Hi. First, thanks a lot. It was really, really insightful. Thank you. I have a question about requestAnimationFrame. Because requestAnimationFrame runs at 60 FPS, right? And then I think I saw you do something with timestamps to try to do 30. Yes.
Does it ever drop or become inaccurate? Or is that the right way to achieve 30 FPS? Yeah, I guess it's... maybe. I'm not a JavaScript specialist on this. Me neither. I'm curious. It's here, I guess. Yes, it's here. For me, it worked. Yes, I didn't have issues here. That's what I found on the Internet. I tried it. And you've seen the demo. It doesn't drop. When it drops, it's really when there are a lot of things happening. When there is a lot of audio; the audio part is still not optimized a lot. But this frame pacing, for me, works. You compute the duration between the last requested frame and the new requested frame. So for me, it's okay. Excellent. Thank you very much. You're welcome. Next question. Yeah. The next speaker, who is speaking after? Nobody? Nobody is speaking after you? I mean, who is taking that seat, who is standing there? Okay, we'll call them. So it's a follow-up to the previous question, actually. Have you tried removing this check to see how fast you could run the game? Can you do 1,000 FPS? No, it's not 1,000. No. I actually tried removing it. I don't remember exactly, but it's not 1,000 FPS, that's for sure. It's not really, really fast either. Like maybe 40, 50 FPS; it also depends on the machine. It depends on the hardware that you have. On the processor that I have (I don't have a gaming computer), it was like maybe 40, 50 FPS. Okay, thanks. But yeah, that's a good question, because you see, when we talk about good ports and bad ports: this is a quick-to-achieve port, but it's not the most optimized one. That's what you see when game companies make ports. And also, for example, I said that the array copy in .NET is slow. If you don't have time to optimize, you just leave it as is, and you get a crappy port with a slow frame rate. But I made the effort to at least do this part in JavaScript. Welcome. So, next question. He's going to ask more if you don't, so he's ready.
Go for it. Yeah, another question is, what is the size of the Wasm files or whatever needs to be downloaded to play this game? It's big. Let me show you. I don't know. It's like this. It's a big file. It's a big file. So let me inspect. When you go to Application here, you see the storage: you have like 21 megabytes. It's a big file. It's not huge. It's not like Windows when you start on desktop. It has a little bit of overhead, but it's not downloaded each time, you know, maybe only the first time. Yeah, that's a good question. So we can have one last question. And in the meantime, while people are still thinking, please don't stay at the edge of the rows, because people are standing in the back. People arrive a couple of minutes late. So if you are at the edge, if you're here, and there is an empty seat, you need to shift just a bit. And you can also optimize this by making a service worker. I did it, but it doesn't work anymore. But you can also make this a service worker. And now if you're here, if you see me looking at you, please shift a little bit. I don't do this for the pleasure of annoying you. It's because there are people who are going to enter the room. We're going to have more and more people, hopefully, who are going to ask written questions. So then please let them sit next to you. Also, there is a trash bin right there. So when you exit the room, if you see something, even if it's not yours, please pick it up. There is another trash bin there. Thank you. Thank you. |
Controlling the web with a PS5 controller |
Hello everyone and welcome to the talk Controlling the web with a PS5 controller. Now before I jump into the talk, there are a few things I wanted to share with you all, because I gave this talk at a small meetup and I got two very important questions coming in from the community. So I just wanted to admit that, yes, I do own a PS5. Please don't ask me how much I paid for it. And the second is, I am not a gamer. I know it sounds odd after the very first statement, but the last time I played a game on my PS5 console was maybe six months back. So I'm sure that I am not a gamer and I wouldn't be able to answer any gaming questions. So who am I? Well, I am Harsheela Kewal and I work as a developer advocate at Contentful. If you haven't heard about Contentful, it's a content platform with an API-first architecture. I'm not going to talk more about that, but I'm also a Mozilla representative. So here at FOSDEM I am volunteering with Mozilla. So if you want to catch up with me, talk about the open internet, security, privacy and everything around what Mozilla is doing, you can find me at the Mozilla stand. And I like to think of myself as an experimentalist because I do a lot of different kinds of experiments, build a lot of silly projects and share my learnings with the community. This is what I'm going to do today as well. And if you want to know a bit more about it and find me on the internet, you can head over to that link. Alright, so the agenda for today looks something like this. I'm going to start with an introduction, talk a bit about APIs and WebHID. We'll then look into the implementation and understand how you can implement WebHID in your applications. And with that I'm going to try to do some live coding, if the live coding gods are kind to me. And then I'll show you a little demo that I created and walk you through the code.
Now before I talk about Web APIs, can I get a show of hands for the folks who already know what an API is? Alright, so the majority of you know what an API is, that is wonderful. But for the folks who are still not sure what an API is, here's a very quick recap. API stands for Application Programming Interface and it allows two different programs to communicate with each other. And Web APIs basically allow programs to communicate on the web. So it can be clients interacting with a server, servers interacting with other servers and stuff like that. Now these Web APIs can be put into two categories. The first is third-party APIs. Now very important to note, these APIs are not built into the browser. What these APIs allow you to do is interact with a web service, that is, a service external to the browser. So for example, you want to create something for YouTube, right? You can use the YouTube API. To create a bot, you might use the Telegram API. If you're building something on Contentful, you're going to use the Contentful API. Now in comparison with this, there are browser APIs, which are built into the browser. And these APIs allow you to extend the capabilities of the browser. For example, we have the Web Audio API, which allows you to interact with the audio that you have in your application. There is the Web Storage API, which allows you to store data locally in the client's browser. And then there is the WebHID API, which we are going to look into today. Now before I go on talking more about the WebHID API, one thing that is very important to note over here is that it's still experimental. Not all browsers support it. So if you are creating applications which you want to use in production, please make sure that you have a check for whether the browser supports it or not. Now the WebHID API allows you to create applications which can interact with human interface devices, or HIDs.
So these devices basically are bi-directional communication devices, which means that they can take input from the user and can also provide output to the user. So in my case, I have a PS5 controller. So if I click on this, nothing's going to happen right now, but it's basically taking an input. And if it vibrates at some stage, then basically I am getting that output from the application. Now prior to HID, devices could only use strictly defined protocols, which were mainly designed for keyboards and mice. So hardware innovation either had to overload data in the existing protocols, or vendors had to create non-standard hardware with its own specific drivers. Which kind of created a lot of complexity if someone wanted to create an application on top of that. HID basically reduces that overhead and allows you to interact with devices. Originally, the HID protocol was defined for USB devices, but now it has been implemented for other protocols such as Bluetooth. So now talking about the implementation part of it, it is a part of the navigator API of the browser. So this is the code snippet for how you can connect your device to your application. Now as I mentioned earlier, it's still experimental. So whenever you are implementing the code, just add a check. Just add an if condition to see if HID is available in the browser or not. And with this requestDevice function, you need to compulsorily pass the filters parameter, which takes the vendor ID and the product ID. So each device has a unique product ID, and the vendor ID is basically the ID for that manufacturer. And if you want to find out the vendor ID and the product ID for the device that you want to control, I have added a link in the slides so you can go ahead and get the information from that list. Now similar to requesting devices, you might have a situation where you want to not include particular devices. You want to exclude particular devices.
You don't want your users to use specific devices for your application. In that case, you can use the exclusion filter. Now this is an optional parameter. And again, this takes the vendor ID and the product ID. Again, this is unique to that particular device. So let's try to write the code for this. So over here, I have a very simple application which is not running. Let me quickly start the server. It has basically just two buttons right now, connect and get devices. If I click on connect, it's undefined because the device is not yet connected. And get devices is not going to do anything either. So over here, I have the connect function. I already have my filters in here because it makes it easier to work with. What I'm going to do is I'm going to use... Now again, one thing I forgot to mention is that by default, this is an asynchronous API. I can simply use await and then call the function. I'll type this out by hand, as IntelliSense does not help me out over here. Okay. Now what this will do is give me an array of devices that the user might connect to the application. Now what I'm interested in is just the one device that I am connecting. So over here, I'm going to set this up: device is coming from devices. If I save this, if I come back to the browser, I'm going to do a quick refresh. And if I now click connect, I get the option to choose which device I want to connect to. Right now it's just one, because I have set the ID for the wireless controller that I have right now. I can click on this, click on connect, and now you see we get some information in here. So my device is now connected to the application. Now you might also want to get the list of devices that are already connected to the application. And you can do this using the getDevices function. This will again return an array of all the devices that are connected. Now it returns an interface, which is called the HIDDevice interface, which contains all these properties.
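The connection flow described so far (feature check, requestDevice with a mandatory filters array, then getDevices for devices that were already granted) can be sketched as below. This is a hedged sketch, not the code from the live demo: the helper names are made up for illustration, and the vendor and product IDs are assumed to be those of a Sony DualSense controller, so verify them against your own device. The navigator object is passed in as a parameter so the helpers can also be exercised outside a browser.

```javascript
// Assumed IDs for a Sony DualSense (PS5) controller; verify for your device.
const DUALSENSE_FILTER = { vendorId: 0x054c, productId: 0x0ce6 };

// WebHID is experimental, so always feature-detect before using it.
function isWebHidSupported(nav) {
  return Boolean(nav) && "hid" in nav;
}

// Ask the user to pick a device. In a browser this must be triggered by a
// user gesture such as a button click.
async function requestController(nav, filters = [DUALSENSE_FILTER]) {
  if (!isWebHidSupported(nav)) {
    throw new Error("WebHID is not available in this browser");
  }
  // requestDevice() resolves with an array; we only care about the first entry.
  const devices = await nav.hid.requestDevice({ filters });
  return devices.length > 0 ? devices[0] : null;
}

// List devices the user has already granted access to (no chooser shown).
async function listGrantedDevices(nav) {
  if (!isWebHidSupported(nav)) {
    return [];
  }
  return nav.hid.getDevices();
}

// Browser usage (hypothetical button element, not executed here):
// connectButton.addEventListener("click", async () => {
//   const device = await requestController(navigator);
//   console.log(device ? device.productName : "No device selected");
// });
```

Devices to be left out of the chooser would go into an additional exclusionFilters array next to filters, using the same vendor ID and product ID shape.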
The first is collections, which returns an array of report formats; a report is the format of the data for each of the devices. Then it contains an oninputreport event handler. So this event gets triggered whenever we interact with the device. And there is an opened flag, which will tell you if the device has an open connection or not. So by default the connection is always closed. We have to use the open function to open the connection between the device and the application. And then you get the product ID, the product name, and the vendor ID of the device. Now as I mentioned, we need to use the open function to open that connection. Otherwise we won't be able to interact with our device. So I'm going to come back over here, and in here I'm just going to do await device.open(). So just a quick note over here. Right now opened is false because the connection is not open. I'm going to quickly refresh this. Connect it again. And now we see the connection is open and now I can interact with this. Now that my connection is open, I still need to add an event listener. But before we go into that, I just want to quickly talk about the format of the data that is exchanged between the device and the application. And as I mentioned earlier, with HID these are called reports. And there are basically three types of reports. There is the input report, which is basically the data that is coming in from the device to your application. The next is the output report, which is the binary data that is going from the application to your device. And then there is the feature report, which is a bi-directional report. Now whenever you want to listen to the interaction that the user is making, there is an inputreport event listener that you can hook up. Or you can also use the oninputreport handler to do that. So let's take a quick look at the full process.
So over here, now that my device is connected, I am going to... Again, this can be an asynchronous function. I am passing the event parameter in here. And this contains three basic pieces of information. You can get the data. So this is going to be in binary. You can get the information about the device. And you can get the report ID as well. So almost all HID devices have a report ID in their reports. So whether it's an input report, an output report, or a feature report, each of those has a unique ID. If the device does not have one, you can simply use zero as the report ID. Let's log the information in here. Now I am going to try this again. I don't have much to look into in the memory for this. For demo which I should have done before but now... I am going to connect it over here. Now you see, I get this stream of data coming in. If I click on this, it's going to log more things. It's going to be logging all the information or the data that is coming in. If I click on one of these, you can see the information is in binary form. Now there are other events that you might be interested in, like the connect event. So whenever the device is connected to the application, you can use the connect event to trigger anything that you want. For example, you might want to show the user the device that got connected. So you can use that connect event listener over here. Similar to connect, there is disconnect. Maybe the user by mistake disconnected the device. So you can use this to listen to that, and you can show that the device is disconnected, ask whether they want to connect again, and have that logic going in there. And remember, we had explicitly opened the connection between the device and the application. Now, after the user has interacted with the application and they are done with it, you might want to close that connection. So you can use the close function in here.
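The open-then-listen sequence walked through here can be sketched as follows. It is an illustrative sketch and not the demo's source: listenForInput is a made-up helper name, while open(), the opened flag, the inputreport event, and its data, device and reportId fields are the parts described in the talk.

```javascript
// Open the device if needed and forward every input report to `onReport`.
// Returns a function that detaches the listener again.
async function listenForInput(device, onReport) {
  if (!device.opened) {
    await device.open(); // connections are closed by default
  }

  const handler = (event) => {
    // event.data is a DataView over the raw bytes of the report.
    const { data, device: source, reportId } = event;
    const bytes = new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
    onReport({ reportId, productName: source.productName, bytes });
  };

  device.addEventListener("inputreport", handler);
  return () => device.removeEventListener("inputreport", handler);
}

// Browser usage (assuming `device` was obtained through requestDevice):
// const stop = await listenForInput(device, (report) => console.log(report));
// ...later: stop(); await device.close();
```

The connect and disconnect events mentioned above are listened for on navigator.hid rather than on the individual device, which makes them a natural place for showing a reconnect prompt.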
But the close function will still let the pairing exist. It just closes the connection in a way that no information is exchanged anymore, but the device is still going to be connected to the application. So if you want to remove access, you can use the forget function, which will basically tell the device to close the connection and remove the access. So if the user wants to connect again, they will have to connect to the application and go through that whole process again. Now as I mentioned very early on, this is an experimental API. So it's only available in Chromium-based browsers, and then Opera as well. There are a lot of security concerns that the community is discussing, and that is why a lot of browsers have still not implemented it. Hopefully when we come up with better security and privacy features for this particular API, we might get support in all the other browsers as well. Now as I mentioned, I have created a demo, a very simple demo. If you have a PS5 controller with you and you want to try it out, you can go to the website and try it out. Here is a quick demo. So I'm going to click on connect and I'm going to connect my controller in here, and I have a very nice little demo where I have Mario with me. So what I am doing is I'm really interacting with Mario with my PS5 controller, so it can move back and forth, it can go up and down. And one thing that I kind of added for fun was lasers. It's very handy to me but I enjoyed the process of creating this and learning more about WebHID, and I'm just going to quickly walk through the code where I'm actually implementing WebHID. So over here I have a few states defined. So I have the left stick, the left axis or the left thumb stick, the right stick, and then there are buttons. I haven't configured all the buttons for now because it didn't make sense to add everything in here.
And then as I mentioned, make sure to add a check for whether the browser supports it; that's what I'm doing here. If it's supported, I am connecting to my device. After it gets connected, I'm just doing some changes in the UI and then I am listening to the inputreport event. Now I have added a report ID check just to make sure that I am getting the correct input from my device. And then I am reading the data with getUint8, which corresponds to the left stick x-axis, and I'm normalizing it to get a value between minus 1 and 1 inclusive. And when that happens, I have another function that takes care of the change of position. I'm doing the same with the left stick y-axis, and then I am doing something similar for my buttons. Now for buttons, I have configured all of this, but right now it's just implemented for the circle button, which gets its value as a Boolean. If I click on that, it is true, and that stuff happens in the UI. Now this is the update position function. So if I come over here, it checks the left stick x-axis: if it equals 1, it's going to move forward. If it equals minus 1, it's going to move backwards. If the left stick y-axis is 1, it's going to go up; similarly, if it's minus 1, it's going to go down. And for the circle, if the circle is true, I am basically adding new elements and trying to give it the effect of shooting a laser. So that was the very small demo that I created. There are other applications that you can try out. Ergometer Space is one of the most interesting applications that I came across. It basically allows you to connect your ergometer machine to a web application, and you can kind of get a gamified experience when you are using that ergometer machine. The source code is publicly available, so you can go and check it out. There is another project called Remap, which allows you to customize your keyboard key mappings. Again, this is using WebHID.
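The normalization and position update described above can be sketched like this. It is an illustrative sketch rather than the demo's source: the byte offsets, the step size and the threshold are assumptions, since the real layout depends on the device and on the report ID.

```javascript
// Map an unsigned byte (0..255) from the report to the range [-1, 1].
function normalizeAxis(byte) {
  return (2 * byte) / 255 - 1;
}

// Read the left stick from an input report. The offsets are assumptions for
// illustration; look up the real ones for your device and report ID.
function readLeftStick(data, xOffset = 0, yOffset = 1) {
  return {
    x: normalizeAxis(data.getUint8(xOffset)),
    y: normalizeAxis(data.getUint8(yOffset)),
  };
}

// Move a character by `step` units once the stick is pushed far enough.
// Follows the talk's convention: x = 1 is forward, y = 1 is up.
function updatePosition(position, stick, step = 5, threshold = 0.5) {
  const next = { ...position };
  if (stick.x > threshold) next.x += step;
  else if (stick.x < -threshold) next.x -= step;
  if (stick.y > threshold) next.y += step;
  else if (stick.y < -threshold) next.y -= step;
  return next;
}
```

Using a threshold instead of comparing against exactly 1 or minus 1 is a deliberate deviation from the description: analog sticks rarely report the exact extremes, so a strict equality check would miss most movements.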
I'll show you another one, which is again open source, so you can go and take a look at the code. Now then, these were a couple of resources that I found helpful while I was... Thank you. |
Finite state machine (and some retrogaming) |
up to you. Can I start? Okay. Can you hear me? Okay. I am Gabriele Falazca, a front-end developer working in a company called Surzents, located in Rome. And, if you hadn't guessed, I will talk to you about finite state machines, with some examples inspired by retro games and arcade games of the 90s. Let's start with this slide, which is very clear and self-explanatory. A finite state machine is an abstract machine that can be in exactly one of a finite number of states at any given time. It can change from one state to another in response to some inputs. This is the Wikipedia definition of a finite state machine. And it's a very theoretical definition. But now we will see how to apply that pattern in programming. To represent a finite state machine, you can use state charts, which are a sort of graph where the nodes are called the states. And the links between the nodes are called the transitions. This is the state chart of an application that makes a fetch call. So the states of the application are idle, loading, success, and failure. And the events that trigger the transitions are the fetch event, which triggers the transition from idle to loading. The reject event, which triggers the transition from loading to failure. The retry event, which triggers the transition from failure to loading. And the resolve event, which triggers the transition from loading to success. Another state chart is this one, a little bit more complex. This is an elevator. An elevator starts in the idle state. When a user calls the elevator, it passes to the state prepare up or prepare down, based on the floor where it is in that moment. When the user selects the floor, the elevator passes to the state moving until it reaches the right floor. After it reaches the right floor, the elevator passes to the state door opening. And when the user has left the cabin, the elevator returns to the idle state. Now let's create a state chart. Yes, we are going to create the state chart of this animation.
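The fetch state chart described above can be written down as a plain transition table. This is a minimal sketch for illustration, not code from the slides: the names fetchChart and transition are assumptions, and unknown events are simply ignored.

```javascript
// States and transitions of the fetch example:
// idle -FETCH-> loading -RESOLVE-> success
//               loading -REJECT->  failure -RETRY-> loading
const fetchChart = {
  initial: "idle",
  states: {
    idle: { FETCH: "loading" },
    loading: { RESOLVE: "success", REJECT: "failure" },
    success: {},
    failure: { RETRY: "loading" },
  },
};

// Return the next state, or stay in the current one if the event has no
// transition defined for that state.
function transition(chart, state, event) {
  const transitions = chart.states[state] || {};
  return transitions[event] || state;
}

// Usage: replay a failed request followed by a successful retry.
let current = fetchChart.initial;
for (const event of ["FETCH", "REJECT", "RETRY", "RESOLVE"]) {
  current = transition(fetchChart, current, event);
}
console.log(current);
```

Keeping the chart as data means the same transition function can drive the elevator example as well, just with a different table.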
Okay, the character takes off his pants, makes a military salute, and returns to idle. So the states of the state chart are the idle state, when he is waiting, the pants state, and the military state. Before defining the transitions, I think I should give you some context about this character. If you don't know him, this character is called Hyakutaro Ichimonji, and he is a side character of a famous arcade game of the 90s called Metal Slug. He triggers this animation when the main character, Marco, walks near him, and he plays this animation to drop bombs as a reward for the main character. So the first event, which links the idle state with the pants state, is Marco is near. The second event, which connects the pants state with the military state, is reward dropped. Because after dropping the reward, he puts his pants back on and greets Marco with a military salute. And afterwards, when Marco is far, he returns to the idle state. Now let's see this live, but before that, I want to show you the simplest code of a finite state machine. It's in JavaScript, because here we are in a room called the JavaScript devroom. But you can apply this pattern in every modern programming language. It's a simple ES6 class with two methods. One for setting a state, and the other one for executing the routine of the current state. We can see this machine live here. This is a little part of the Metal Slug game in the browser that I made for the demo. When you press the right and left arrows, Marco walks back and forward, and when he's near Hyakutaro (on this very big screen it takes a lot of iterations), Hyakutaro plays his animation, takes off his pants and greets Marco. When Marco is far again, he returns to the idle state. Let's see the code of this demo. We have, you see, I have to zoom. Okay. Okay, this is the HTML page. It's very simple. It has a scene, that's the scene, and two images that are the sprites of the two characters. Now, we have a simple entry point of the application that has an event listener for the arrow keys, and initializes the scripts for Marco and Hyakutaro.
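A machine of the kind described here, an ES6 class with one method to set the state and one to run the routine of the current state, might look like the sketch below. It is not the code from the slides: the Prisoner class, the sprite names and the heroIsNear flag are assumptions used to mirror the three states of the animation.

```javascript
// The simplest finite state machine: one active state, two methods.
class StateMachine {
  constructor() {
    this.activeState = null; // a function that runs the current state's routine
  }

  setState(state) {
    this.activeState = state;
  }

  update() {
    if (typeof this.activeState === "function") {
      this.activeState();
    }
  }
}

// Hypothetical character using the machine as its "brain".
class Prisoner {
  constructor() {
    this.heroIsNear = false; // set from outside by whatever observes the hero
    this.sprite = "idle";
    this.brain = new StateMachine();
    this.brain.setState(this.idle);
  }

  idle = () => {
    this.sprite = "idle";
    if (this.heroIsNear) this.brain.setState(this.dropReward); // "Marco is near"
  };

  dropReward = () => {
    this.sprite = "drop-reward";
    this.brain.setState(this.salute); // "reward dropped"
  };

  salute = () => {
    this.sprite = "salute";
    if (!this.heroIsNear) this.brain.setState(this.idle); // "Marco is far"
  };

  update() {
    this.brain.update();
  }
}
```

Each state routine both performs its work (here, choosing a sprite) and decides whether to hand over to the next state, which is why the machine itself can stay this small.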
Our machine is the same as in the slide. And then there are the two scripts for Marco and Hyakutaro. Marco is a simple script that has just two methods, for going back and forward. And the Hyakutaro script has the finite state machine as its brain. So it has the three methods that are the states we have defined before, and another utility method for observing Marco and triggering the events that change the state of the machine. And it's just this code. Back to the slides. Another type of machine, a bit more optimized than this one, is the stack-based finite state machine. This kind of machine does not have a single active state, but has a stack of states, and considers the state on top of the stack to be the active one. So in this model, you can navigate through the states backward and forward. Think of the history of the browser as a finite state machine, where the web pages are the states, and the back and forward events of the browser are the events that trigger the transitions. Okay, it's clear. This is the code of the stack-based finite state machine. It's very similar to the previous one, but we have the stack of states instead of the active state, and three utilities for pushing and popping states on the stack. Very simple. If you have to develop a more complex machine, there are various tools and frameworks. In JavaScript, the most famous is XState, which is a set of utilities for state charts and finite state machines. This is the code of a machine created with XState. It's written in a very functional style. We have a utility for creating the machine, all configuration based, and a toggle service for defining and sending the events and defining the transitions. My goal in this talk is not to show you a single framework, because you can choose one and study it by yourself. I want to explain the theory and the pattern, and how to apply it to real life. I introduced XState because it has a cool tool, the XState visualizer, that can help you test your machine. The tool is this. I don't know if you see.
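The stack-based variant described above keeps a stack of states and treats the top entry as the active one. The sketch below is an assumption-laden illustration rather than the slide's code: the method names are made up, and the browser-history usage mirrors the analogy from the talk.

```javascript
// Stack-based finite state machine: the state on top of the stack is active.
class StackStateMachine {
  constructor() {
    this.stack = [];
  }

  getCurrentState() {
    return this.stack.length > 0 ? this.stack[this.stack.length - 1] : null;
  }

  // Make `state` the active one, remembering the states below it.
  pushState(state) {
    if (this.getCurrentState() !== state) {
      this.stack.push(state);
    }
  }

  // Drop the active state and fall back to the previous one.
  popState() {
    return this.stack.pop();
  }

  update() {
    const state = this.getCurrentState();
    if (state) {
      state();
    }
  }
}

// Usage, following the browser-history analogy: pages are states,
// navigating pushes a page and "back" pops it.
const visited = [];
const home = () => visited.push("home");
const article = () => visited.push("article");

const browserHistory = new StackStateMachine();
browserHistory.pushState(home);
browserHistory.pushState(article);
browserHistory.update(); // runs `article`
browserHistory.popState(); // "back"
browserHistory.update(); // runs `home`
```

Compared with the single-state machine, a state no longer needs to know which state to return to: popping the stack restores whatever was active before.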
This is the same state chart of our previous event, and it's interactive. You can trigger the events directly from the chart, and see what's happened. On the sidebar, you have three tabs. In the first one, you can put the code of your machine in Xstate way. The second tab is the state that contains all information of the current state of the machine. From the third tab, you can programmatically send the event for testing your machine. As you can see, I can send the event directly from here. You can reset the machine. My talk is finished. Have you got any questions? Who has the first question? And do raise the hands. Hello. Thank you for your presentation. What happens to your state if an event is triggered that you can't handle? What happens to your state if an event is triggered that you can't handle at that state? If they are not connected to a transition, nothing happens, because the machine doesn't respond to this event. This depends on the implementation that you use. In Xstate, the most famous implementation, this thing is safe. It depends on which machine you use and how you have implemented. Okay? Next question, raise your hand very high, please. While you think about it, thank you for the video team up there with the green T-shirt. They'll hear it after. Don't worry. They hear everything. How is the animation graphically that we see on the screen is handled with regard to the state machine because there is some delay. It has some time of duration. Thank you. The animation is how it's made. Okay. I have just the three sprites and every state of the machine set the correct sprite in the image tag and prepare the machine for the next state. Observe Marco is updated the state and run the routine of every state. Next question. Okay. Will you be around? Will you be around? Will you be in the deaf room or outside? There are no more questions. Thanks again. Okay. Thanks for the fish. |
Javascript for Privacy-Protecting Peer-to-Peer Applications
Usage of the I2P-SAM Javascript Library: Anonymized and End-to-End Encrypted Communication |
Hello, everyone. Thank you for joining this event, and thank you very much for the organization of this devroom. Much appreciated. I know how much work this is. Awesome work. Thank you. So thanks a lot to the whole FOSDEM team. Really cool. This presentation here is mainly about privacy. And the I2P network is a so-called overlay network, which I will shortly introduce. And I'm the maintainer of this JavaScript/TypeScript library, which allows you as developers, me as a developer, to write privacy-by-design applications. Privacy by design means a few things, which I'm going to talk about shortly after the introduction. I'm a totally independent researcher and developer, and I'm one of the co-founders behind Diva.exchange. We're just a loose bunch of developers and researchers spread all over the world, very much interested in privacy topics. And one of the topics is free banking technology for everyone, which is not part of this presentation. But there is no centralized model involved in my work, so there is no business model at all involved, because if I'm fully distributed, not only decentralized but fully distributed, it's totally impossible by design to introduce business models. Obviously, no coin, no token, or things like that. I'd like to talk quickly about the motivation. So, why I2P-SAM, why this SAM library got developed, and how we set up a completely distributed network like I2P, an overlay network. And I'd obviously like to talk about the creation of applications, so how we do that, and how we can do that. We look at the use cases and then some questions and takeaways. All right. I'm Conrad, I live in Switzerland, so, bonjour. Great to have you here. And I lecture at the University of Applied Science in Lucerne, Central Switzerland, a bit about microservices and fully distributed systems, where I'm a bit of an alien in this cloud world, because today everything is cloud, but I'm not cloud, I'm peer-to-peer. And now we're here at this I2P network.
Let me ask you a question. Please raise your hands. Who ever got in touch with an overlay peer-to-peer network like I2P? Again, I'm not totally lonely, so thank you very much. There are a few which have heard of it, and in a nutshell, I2P is a fully anonymous confidentiality giving messaging system. So it's, you have the general internet as you know it, and where all the cloud applications are running somewhere in central services. And this I2P network is a layer on top, it's a software layer, and everybody who's running such an I2P node is becoming a client and a server. So when I'm talking about a node, a node which might be run by every one of you, you're a client and you're a server. You're both at the same time, and you're helping the network. There are around 34,000 I2P routers in the network, which is a joke. That's nothing. That's compared to the internet infrastructure as we know it today. That's tiny. That's nothing. But still, these 34,000 routers, more or less, they run this fully anonymous and fully confidential messaging system. And please, it's an overlay network. It's not, well, some media call it, but it's not a darknet. It's just an overlay network. It's a piece of software. It's a technical solution to a problem. And the problem is we want anonymity and we want confidentiality, because these two things, by definition, define total privacy. And if I want to disclose my private stuff, it's my decision and only my decision. And that's the point behind privacy. All right. So I ask you now, please, in this room, to be open towards peer-to-peer applications which are a bit more complex, but not really complicated, and open your mind for something which has nothing to do with the cloud. All right. Why did I do the work and develop a library, an I2P SAM library? Well, the I2P core developers, they are super cool, hardcore network guys. And they love what they do since 20 years. 
Divachain, which is a fully distributed storage layer, so something to store data in without trust. And that's kind of a problem. You can't be spied on. Everything you exchange is totally private. And there is no man in the middle. There is no man in the middle. Because, again, this I2P network works like a garlic. All the messages which are hopping through this network from node to node, from peer to peer, are encrypted multiple times. So you send your message from your application into the network layer, and it ends up at the destination, and it's encrypted multiple times. Just by using the library. That was the motivation. When you go peer to peer, just by definition, you get a bunch of problems you don't really want. And it's a bit complicated to get into it. So at Diva, we thought, hey, come on, let's build a few Docker containers to simplify this process. And today, the students at the University of Applied Science in Lucerne are able to set up a complete test network and a complete developer network within a few minutes. And that's this Docker container you find on GitHub. And, by the way, it's also mirrored to Codeberg. But you find it on GitHub. And then you can start by initializing these containers with a simple script. And in one go, you have your I2P connectivity available. You have, if you like, a storage layer available. And you can start programming. You can start developing without needing to care about all the complexity of such a peer-to-peer network. And this is a screenshot of GitHub. And here I'd like to be totally open. All we do at Diva and all I'm doing is really, really free, libre software. There are no strings attached or strange stuff or things you need from somewhere else. It's really free. It's really libre. And it's very strict licensing, which we're doing. So that's quite important for me personally, to have open source software at its core. And that's very important for me. So there exists also a simplified version.
I told you, you need a network to communicate between your peers. You need maybe a storage layer on top. But the storage layer is not a necessity. So if you say, well, I just want to communicate, I do not want to store anything, I do not want to store data, then you don't need a blockchain, because you don't want to store data. So if you just need to communicate in your application between peers, then you have this simpler setup. You go with npm install, I2P-SAM. And in there is a YAML file. That's the last one. So this is sam.diva.i2p.yml. And you initialize this container in there. And you have a very much simplified application development environment available, without storage capabilities. The library got quite popular in the last months. It has to do with one thing we did for the DNS crowd, domain name system, domain name service. And the students at the University of Applied Science got the job from me to create an API for a DNS system for I2P, because I2P does not even have a DNS. So welcome to the Stone Age. And so the library got used by the students and got more popular in the last months, which is nice. And here we have this. By the way, who is familiar with Docker? Who is using Docker? Okay. Right. So great. Almost everybody. So yeah, here you have a YAML file. I don't have to say much. You use it. And, ta-da, you have your environment available. And everything is available on GitHub and mirrored to Codeberg. Now I want to go theoretically through two simple use cases, to inspire you to create your own privacy-by-design application. We go through two examples. One is reading and the other example is writing. As I said, as every node in the network, you are a client and a server at the same time, because you're a router within the I2P network. So what we're doing first is reading something from the network. Now the documentation on npm, the documentation on GitHub for this library is quite grown up. It's quite complete.
That's my personal view on it. If you have a different view, please do not hesitate to tell me and improve this documentation, because I can learn that much from you. So here we have an example of creating a reading stream. So you want to read some data from another node in the I2P network. You can simply use this very first quick start example and then replace only the IP, which points to your Docker container, which we have seen in the YAML file just before, and ta-da, you're communicating through the I2P network. That's it. So privacy by design and exchanging private messages, totally confidential and anonymous, over the existing internet infrastructure isn't difficult anymore. Here it is. It's not more than that. And the same thing holds if we're looking into writing data, which means nothing else than that you're offering a service on the overlay network I2P. There is the other example in the readme, which is doing both things at the same time. The second example is creating a writing instance, so serving some data, and at the same time, that's the very last part here at the end, it's reading data. And it's not doing this locally by simply connecting from the reading instance to the writing instance. No, it goes completely through the overlay network, through I2P, and it does its job. A word of warning: I2P is not fast. Confidentiality and total anonymity have a price tag attached. And this price tag is called speed, latency. So to give you an idea, when we're reading and writing data from the Diva blockchain, where we're exchanging this data between peers distributed all over the network, we have latencies of three to five seconds. Three to five seconds. That feels like 1992 or something. So that's the cost of privacy. You don't get privacy for free. Right, a few use cases. And I'd like to highlight the second one. The first one is free banking. That's what I'm working on together with others, and everybody is invited, because we're totally transparent.
So if banking is your thing, yeah, join in. If chat is your thing, then the I2P development team would be super happy if somebody hops into the chat challenge. You don't have to worry that the chat application is not good enough, because I2P simply has nothing. So it would be a great thing to start somewhere. And if you're a good user interface designer or user experience designer, hey, they would be in heaven if they got something like that. That would be, wow. Additionally, games could be a topic for some people. But the latency could be a killer there. So it would be interesting. Right. Since I now have around eight minutes left, as my colleagues have shown me, which is great, I'd already like to enter the links, discussions, feedback and questions phase of this presentation. So please, any questions? Oh, yes. Call to action. There are some questions. And there is a microphone. Hi. Thank you very much for your presentation. So usually in secure systems, one of the issues is that due to security, there is more friction for the user. And that's also part of the cost of implementing secure systems. So, of course, here, almost everybody uses Docker. So that's not an issue. But for, let's say, my grandma, that's going to be a bit more difficult. She's probably also not the target audience. But on the network side, have you tried, for example, setting up a compatibility layer with WebSockets or WebRTC so that the full stack could be run from the browser? Yeah. Short answer: yes, WebSockets. WebSockets, not WebRTC. WebSockets are used by Diva, which is a real-time banking exchange system running on your own device. Yes. Everything which you, as JavaScript and TypeScript developers, know is on board. It just might sometimes be a bit slow. But I do not believe that there are additional user experience challenges. Obviously, you're totally right. But since you are the developer, I just deliver the glue.
I just deliver the glue between the privacy network and the end user interfaces, so the human-machine interaction, which we as developers should create. But this library here is just the glue, which gives you privacy by design. Thank you very much for this question. More questions? Please. Hi. Thanks for the presentation as well. How does it compare to other peer-to-peer networks such as IPFS, for instance? Thank you very much for this question. There are other presentations in the lightning talks, in the lightning room, just afterwards. First, I have the I2P presentation, and then there are other overlay networks. Honestly, I can't compare it because I'm the I2P guy. It says I2P here. But there is quite some research around which compares these networks. What I'd like to point out is that on ResearchGate, which is the academic network for papers, there are some interesting papers to read about darknets, and now I call it darknet, which have storage capabilities suitable for large files. Please do your own research. Please think about what you're doing. Privacy is important, but there are also bad actors out there. So do your own research, and please read the ResearchGate papers and articles about overlay networks. Is this okay for you? When are the lightning talks? The lightning talks are today. There will be lightning talks today comparing those different networks. Okay. So the speed of the networks, the latencies. No? Don't worry. I'm going to check the links. Thank you. More questions? Sorry. I actually had a question about the latency. Is the problem the number of servers, that we have only 34,000 now? If we got more, would that mean that we can speed it up? Would that let it go faster? Interesting question. The question is, if there are more nodes in the network, will the network become faster? When building overlay networks, now theory, tunnel building is involved. A tunnel means a message hops over several nodes in the network.
Now, a message hop can be only as fast as the slowest node in this route, so in this tunnel. Just stacking up additional nodes in this network does not necessarily decrease the latency of the network. It depends on the available bandwidth and performance of all the nodes involved within one tunnel. So the answer to your question is, it depends. More questions? Yeah. Since there are no other questions, could you give some more context about your free banking use case, the first one? Right. Yeah. It's a JavaScript and TypeScript application. It's built to exchange any existing or any future digital value, which can be something like, to take an example which everybody understands, Bitcoin, but can also be something like a piece of music or art which is digitally available. It has nothing to do with Ethereum directly. It's just an exchange system for all digital values. And here we require, by definition in our foundation, that it has to be private by design, because we want people to decide and not some operation in the center. That's the context I'd like to give here. Other questions? And thank you very much for your time. Thank you a lot. |
Strong Dynamic Type Checking for JavaScript
Where TypeScript is helpless, JavaScript Proxies come to the rescue! |
Thank you for being here. I was not expecting such a large room of JavaScript developers, and nothing has been broken yet, so it's unbelievable. So yeah, I'm here to talk about strong dynamic type checking for JavaScript, which may sound a bit weird, because you are not expecting strong type checking and JavaScript in the same sentence, right? But I will prove that we can do something about it. So first, what is strong type checking? What do we mean by that? So let's do a bit of vocabulary. I tried to find one definition online of what strong type checking is and couldn't find any. It's like a never-ending argument about which language is strong and which one isn't. But I found two commonly accepted definitions. The first one is that strong type checking means that you have this kind of explicit binding between some variable name and some type. So this variable and this type are bound together. That means that every time you refer to the variable by its name, you will get some data that matches the type you expect. The second definition is more about programming language features, like no lack of type safety due to looser typing rules. For example, in the case of JavaScript, we have implicit type coercion. That means that it's perfectly fine to use the plus operator between a number and a string. For JavaScript, it's just string concatenation, automatically casting the number to a string. But in other languages, like Python, you will get a type error. So let's take this as an example to show that Python is stronger than JavaScript. So when it comes to JavaScript, whatever definition you pick, you might say that JavaScript is not a strongly typed language, and you will be right. Because JavaScript is based on dynamic type checking, dynamic typing, that means that you can assign a JavaScript variable a value of some type and then move to another type and so on. It doesn't matter.
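To make the two points above concrete, here is a minimal JavaScript sketch of implicit type coercion and of a variable changing type during execution. The variable names are purely illustrative and not taken from the talk's slides.

```javascript
// Implicit type coercion: `+` between a number and a string converts
// the number to a string and concatenates (Python would raise a TypeError).
const concatenated = 1 + "2";

// Dynamic typing: the same variable can hold values of different types over time.
let value = 42;
const seenTypes = [typeof value];
value = "forty-two";
seenTypes.push(typeof value);
value = { answer: 42 };
seenTypes.push(typeof value);

console.log(concatenated); // the string "12", not the number 3
console.log(seenTypes.join(", ")); // number, string, object
```

None of these lines raises an error, which is exactly why the resulting types are hard to predict from the source code alone.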
So because types can change during program execution, that makes types in JavaScript quite unpredictable. The creator of the language, Brendan Eich, justified his choice by saying that developers can be more expressive with dynamic typing, which means that they can get to the result faster, but he also agrees that it is more error-prone. So that's the image I took for static versus dynamic typing. I think it sums it up pretty well. So yeah, JavaScript and strong type checking, not really. Actually, every time you see someone complaining about JavaScript or mocking JavaScript, it would be about one of these memes, right? So these are some of my favorites. Maybe you know others. But as you can see, almost all these jokes about JavaScript are basically about the lack of strong type checking, which is too bad. So some people will decide to just get rid of JavaScript and maybe go to Kotlin or .NET or whatever language we have seen this morning. But I think that the most common solution to this problem has been TypeScript, right? I mean, this thing has been invented specially to address this issue by adding some optional static type checking on top of JavaScript. So how many of you are using TypeScript? Raise your hand. Wow. I was expecting that. Like, almost 100% of the people in this room are using TypeScript. I mean, why wouldn't we use TypeScript? It's so popular. Almost all the libraries in the npm ecosystem today have been converted to TypeScript or provide TypeScript definition files. So we have seen this kind of exponential growth in popularity over the years. TypeScript is 10 years old now. It will be 11 in five days, actually. And it has never been so popular. So did we solve the issue of type checking JavaScript with TypeScript? I would say not entirely, and I'll explain why. Here are some things I learned about TypeScript after 10 years.
The first one is a bit of a shock: type checking is not actually the main selling point of TypeScript. The main selling point of TypeScript is developer experience. So if you have practiced TypeScript, and the whole room has done that, it's great. You have seen some improvements in your development experience. So many things, like being able to explore some API by using autocomplete, being able to detect typos that you have made in your code, being able to refactor your code more easily thanks to the static type annotations, maybe having some documentation right inside your IDE. Some compilers are using static type annotations to bring some compile-time optimizations, which is great. You get type inference, good type inference, in TypeScript, so you don't have to write all the static type annotations every time. And we have seen some innovative uses of TypeScript. For example, the Angular community is using a lot of TypeScript annotations to do some code generation, which is great. So all of this is part of developer experience and is great, but it's not really about type checking anymore. It's much more than that. I figured out that type checking was not the main selling point of TypeScript, or at least not anymore, when I looked at the esbuild project, one of the most important JavaScript projects of these last years. esbuild, the famous bundler. So maybe some people in the room are using the Vite development server. Vite is based on esbuild. And does esbuild support TypeScript? Of course it does. Everyone is using TypeScript. But the fact is that esbuild does not actually do any type checking when it compiles TypeScript code. All it does is look at the TypeScript code, look at the static type annotations, and just get rid of them. That's all it does. Nothing else. And they say that running the whole TypeScript compiler is actually a waste of time today.
Because of this developer experience, developers already have this whole integrated, type-safe environment and development process. So that means that you don't need to do it a second time on compilation. The second point about TypeScript that I learned 10 years later is that type safety in TypeScript can be easily defeated. What I mean by that is that in many scenarios in your application, you are relying on type assertions. That is, these little elements like the "as" keyword or the exclamation mark here, which is me assuring the compiler that a value is not null. All these things are not bringing any type safety. It's just the developer saying to the compiler, trust me, I know what I'm doing. And most of the time, we do not. So yeah, these type assertions, you can find them easily in any web application. There are actually many parts where you are forced to use this kind of assertion because of the nature of the web. You can have your perfectly type-safe TypeScript application and still have to deal with lots of unpredictability. Dealing with the unpredictable is most of your job as a front-end web developer. And what I mean by unpredictable is that it is only known at runtime. That means it changes every time, for every user, and so on. So for example, your application may have some back-end services, may call some APIs, maybe some third-party cloud providers. And you are trusting the responses of these servers, right? You are not validating any of the server responses on the application side. So this could break. You are also relying on a browser. And some browsers have bugs and quirks. They do not fully support ECMAScript APIs, the web standard APIs. I chose this logo for a reason. You may also have some client-side stored data for your application. Maybe you are storing some user preferences in local storage, or some locally stored user cache. So this is likely to break as well, because sometimes the cache is outdated.
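The type assertions and the outdated cache mentioned above can be illustrated together. The sketch below is only an assumption-laden example: the cache content, the `parsePreferences` helper and the property names are hypothetical, and the storage read is simulated with a string so the snippet runs anywhere.

```javascript
// Simulated content of an outdated client-side cache entry.
// Assumption: an older app version stored the font size as a string.
const rawCache = '{"theme":"dark","fontSize":"large"}';

// In TypeScript one might write `JSON.parse(rawCache) as Preferences`.
// The assertion is erased at compile time, so this is all that runs:
const unchecked = JSON.parse(rawCache);
const brokenResult = unchecked.fontSize * 2; // NaN, and no error is raised

// Hypothetical hand-written runtime check for the same data.
function parsePreferences(raw) {
  const data = JSON.parse(raw);
  if (typeof data.theme !== "string") {
    throw new TypeError(`expecting theme to be string, got ${typeof data.theme}`);
  }
  if (typeof data.fontSize !== "number") {
    throw new TypeError(`expecting fontSize to be number, got ${typeof data.fontSize}`);
  }
  return data;
}

let message = "cache accepted";
try {
  parsePreferences(rawCache);
} catch (error) {
  message = error.message; // points at the source of the problem
}

console.log(Number.isNaN(brokenResult)); // true
console.log(message); // expecting fontSize to be number, got string
```

The unchecked path fails silently and far from the cause, while the checked path fails early with a message that names the offending property.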
It comes from an older version of your application, or maybe it has been modified by the user themselves, who knows. And finally, maybe the most unpredictable part of every developer's job, did you guess it? The user. The user can be very unpredictable. If you have some application in production and have a look at the production database, you will always find some crazy stuff like, how did it get there? I don't understand. This is the user. All these things need to be validated. Otherwise, this is a recipe for disaster and can break your perfectly safe TypeScript application. So if you look at TypeScript and wonder, how can I do that? How can I type check all of these things? No luck. It's not a compiler problem, because all these things happen at runtime. You cannot anticipate them. So it's more an application problem and not a compiler problem, which means that TypeScript is completely helpless, and it is up to you, the developer, to find a solution to these problems. So how do we deal with runtime errors? The truth is that often we don't. Maybe in the best of scenarios, you are writing some custom logic to validate the data and trying to fix the stuff the best you can. But most of the time, you have so many different possible runtime errors that you would have try-catch blocks and try to show some error messages to the user, telling them to call you or send an email in case something bad happens. And I also saw that we often have some kind of global unexpected exception handler that is just sending all the runtime errors that you didn't catch to maybe a monitoring service. And they are added to a log file that you are checking like once a month, looking at a bunch of errors and saying it's not worth my time, so I should move on to something else. I don't know if some of you do that, but it happens, right? So it's too bad, because we could figure out a way to solve all these runtime errors. So back to this idea of strong dynamic type checking.
How can we do that? So I'm going back to the definition of a strong binding between a variable name and a type here. What if we could do this kind of strong binding, but at runtime? What would it mean? First, it would mean that the type errors that we get would still be runtime errors, right? Because it happens at runtime. But at least they would be more explicit and caught earlier. That means that instead of having something like "undefined is not a function", you would get an error message like "this variable has been set to undefined and it was not supposed to be". So instead of pointing to the consequences, it points to the source of the problem. That helps a lot to reduce the investigation job that you have to do as a developer when debugging. The second thing is that this strong binding should not be just a one-time validation pass. I'm sure that there are plenty of JavaScript libraries that do that. That is, you are throwing a bunch of data at it and it validates, saying true or false or just throwing a type error. But we need more than that. We actually need to have this binding. That means the type information needs to live along with your data. So this type checking should happen on every reassignment or mutation of this data. And finally, the goal of this is to get rid of some silent errors, because we have many mistakes in JavaScript that are just silent. That is, you are not noticing them until it's too late. And it can also make runtime errors more predictable and so more manageable from a developer's point of view. So this is the main reasoning I had when I worked on the open source library that I want to present to you today, which is ObjectModel. It's definitely not a new project. Actually, I've been working on this for the past eight years. So I'm at version 4.4.1. That means that I have written the entire thing like four times now. It's obviously the hardest thing I've had to code in my life, I would say.
It's very complicated, but it works. So I'm glad. And I would also say that it is my most used-for-real open source project. By used for real, I mean that it is used in business projects. So I use it in my professional projects. Other people are using it as a fundamental component of their business projects, and I receive lots of positive feedback about this library. You've got an example here. So what is this library doing? How do you use it? It's pretty simple, actually. The first thing you have to do is define the dynamic types. I call them models, and I'll explain the difference later. But basically, let's say you are working on an e-commerce application. You can declare an object model for the order, for example the customer order, saying that you have a product which has a name property which is a string, a quantity that is a number, and also an order date. After you have declared this model, you can now bind it to some data. So this is where you have this strong binding between the type and the variable. Here, I used the constructor pattern, so that means you are calling new Order. I think it's probably the most intuitive form of binding for the developer. And it also helps to store the type information at the prototype level of the object. So that's how I get this strong binding. We already have a binding between objects and prototypes in JavaScript. And after having done that, you get the myOrder object, which you can manipulate just like you would do with a regular JavaScript object. But when you are assigning, for example, a Boolean to quantity instead of a number, you will get a dynamic type error at runtime with an explicit message saying, expecting product quantity to be Number and got Boolean false instead. So because this happens every time you are doing any mutation on an object, it is really easy to quickly find mistakes that you are making as a developer, and so it improves the debugging experience.
So that's great, but how does it work? Let's start from a pretty basic example. Here, I have a class User having a constructor taking a name and an age. If you want to validate the parameters that are passed to a function and not rely on static type annotations, that means that you would validate this data at runtime. What you could do is use these if conditions, check the type of these different variables and throw type errors like that. Pretty easy. The problem with that is that it only works in the constructor. So maybe you could decide to declare some setters like set name, set age, and have this validation process on every single attribute, but it's a bit tedious and we can do better than that. We can improve this by using a feature of JavaScript, which is the Proxy. I don't know if everyone knows about proxies. This is a feature of JavaScript that was introduced in 2015 as part of ECMAScript 6. And proxies are actually a really great feature, really powerful. The way proxies work is that they enable you to intercept some operations that are done on some object and react to these operations. So in this example, I just use the set trap of the proxy, which means that every time I am reassigning a property, I can execute some custom logic. So I can move my "if typeof name is not string" checks and so on into the set trap and be able to detect the different issues. So that's great. It works both for the constructor and for future reassignments, future mutations. What we can do as a first step is try to make a generic function out of this. Like so. So now I just moved the definition part into an argument of this generic type check function. The type check function takes two arguments. The first is the data. The second one is the definition, or the type if you prefer. So it makes clear that you have this strong binding between objects and types. And as you can see, it is really easy to make a generic function to do this kind of stuff.
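The slides are not part of this transcript, so the following is only a minimal sketch of what such a generic function could look like, assuming a definition that maps property names to the expected `typeof` result. The name `typeCheck` and the error wording are illustrative; this is not ObjectModel's actual implementation or API.

```javascript
// Hypothetical helper in the spirit of the talk's type check function.
// `definition` maps property names to the expected `typeof` result.
function typeCheck(data, definition) {
  const validate = (key, value) => {
    const expected = definition[key];
    if (expected !== undefined && typeof value !== expected) {
      throw new TypeError(
        `expecting ${String(key)} to be ${expected}, got ${typeof value}`
      );
    }
  };

  // Validate the initial data once...
  for (const key of Object.keys(definition)) validate(key, data[key]);

  // ...then keep validating on every later assignment through the set trap.
  return new Proxy(data, {
    set(target, key, value) {
      validate(key, value);
      return Reflect.set(target, key, value);
    },
  });
}

const user = typeCheck({ name: "Joe", age: 30 }, { name: "string", age: "number" });
user.age = 31; // valid reassignment, goes through

let caught = null;
try {
  user.age = "old"; // invalid reassignment, rejected by the set trap
} catch (error) {
  caught = error;
}

console.log(caught.message); // expecting age to be number, got string
console.log(user.age); // 31
```

This sketch only covers top-level properties and plain assignment. Nested objects, property deletion and richer type definitions would need additional traps and logic.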
So the type check function that you see here is a very basic version of what ObjectModel is. Of course, the library is much more complicated than that. It can cover many other use cases, but you get the idea with this example. So as you can see, it is really easy to reuse this type check function and apply it to many different models. So why did I call these models and not types? Actually, I wanted to find another word just to stress that there are a few differences from the types that you know from TypeScript, for example, because everything happens at runtime. This is runtime data validation. That means that models are more powerful than just static types. For example, they can check not only the types, but also the values. Let's say I have a shirt model which can have a size which is either a number or a letter like M, S, L, XL, and so on. I could decide to have this kind of type annotation to check that it is either a number, or the letter M, or a string matching this regular expression. I can also have some more complex assertions. For example, if I want integers in JavaScript, yeah, integers in JavaScript. So it will be a Number in the end, because that's how JavaScript handles numbers. It's a 64-bit double. But I can add another assertion on this number model to say I need to check that it is an integer. And maybe if I want it to be a positive integer, I can add another assertion to make sure that it's above zero. So this is the kind of assertion that you can have. And again, every time you are manipulating this property, for example the age of the user, it will check all these assertions automatically. And the last difference between models and types is that model validation can affect application logic. Because it's happening at runtime, that means you need to react to it and have some strategies for how to handle these runtime errors.
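The value checks and stacked assertions described above can be sketched without any library. In the example below, `createModel`, `Size`, `PositiveInteger` and `accepts` are hypothetical names invented for illustration, and the list of size codes is an assumption; ObjectModel's real API differs.

```javascript
// Hypothetical mini "model": a set of named checks run against a value.
function createModel(name, checks) {
  return function validate(value) {
    for (const [label, check] of Object.entries(checks)) {
      if (!check(value)) {
        throw new TypeError(
          `${name}: assertion "${label}" failed for ${JSON.stringify(value)}`
        );
      }
    }
    return value;
  };
}

// A size is either a number or one of a few letter codes:
// a check on values, not only on types.
const Size = createModel("Size", {
  "number or letter code": (v) =>
    typeof v === "number" ||
    (typeof v === "string" && /^(XS|S|M|L|XL|XXL)$/.test(v)),
});

// The Number type is a 64-bit double, so "integer" and "above zero"
// are extra assertions stacked on top of the basic type check.
const PositiveInteger = createModel("PositiveInteger", {
  "is a number": (v) => typeof v === "number",
  "is an integer": (v) => Number.isInteger(v),
  "is above zero": (v) => v > 0,
});

// Small helper to turn a throwing validator into a boolean.
function accepts(model, value) {
  try {
    model(value);
    return true;
  } catch (error) {
    return false;
  }
}

console.log(accepts(Size, 42), accepts(Size, "XL"), accepts(Size, "huge"));
// true true false
console.log(accepts(PositiveInteger, 3), accepts(PositiveInteger, 2.5), accepts(PositiveInteger, -1));
// true false false
```

Because each assertion carries a label, a failure can report which rule was broken, which is what makes the error usable as input for application logic rather than only for debugging.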
For example, if you got some error, some type error, on your shirt model, maybe you just want to cancel the order, so that you are making sure that everything is happening correctly in your application. So these are the main differences. So let's have a look at the pros and cons of this library, first the pros. You get descriptive error messages. They are pointing to the initial problem, not the consequences. Just that saves you a lot of time. And it means that you now have this kind of continuous data validation as part of your development process. That means you get faster debugging and more reliable applications in the end. Regarding how you manage these runtime errors, because you need to do something, right? Not only showing an error message, but maybe applying some planned strategies. You can define some global, or maybe per-feature, per-model strategies for how to manage these errors. Maybe some errors are easily manageable, for example by cleaning up an outdated cache. Or maybe some of them are more complex, and then you may need to log them to a monitoring service. Some of the cons of this library: one is about performance, of course, because since it's happening at runtime, it has a performance cost. Don't worry, it's not too much. But if you are doing some heavy mutations, like more than 100 times per second, maybe you should avoid using dynamic type checking for this specific scenario. But most of the time you don't have to do this, so it's fine. The second problem is that it relies on proxies. So you need support for them. Today, modern browsers support ES6 proxies very well. But if some users have older browsers, this can be an issue. So, which is better? Static type checking or dynamic type checking? The correct answer is that you should use both, because they address different issues. TypeScript, we saw it. It's awesome. It improves the developer experience a lot.
It gives you a code base which is reliable and makes sense, which is logical. But you should also take care of all the unpredictable parts that are happening at the runtime of your application. So my personal recommendation would be to stick to TypeScript for the core application logic, but also add this ObjectModel layer for every external interface that you have to deal with, like the server, user inputs, the local stores or the browser APIs. And this can lead to a more reliable application. That's all I have for you today. So thank you for listening, and I'm taking questions. Thank you very much. So, we have time for questions. Who would like to ask the first question? My question is, have you ever tried using this library with other libraries for, what is it called, immutable data, or for other validation? Have you tried using this library with other libraries for dynamic checking like Yup, or for immutable data like Immer, or other libraries? Yeah, so immutable data should work fine. For other validation libraries, I mean, it's kind of the same thing, the same job, so maybe it doesn't work that well and doesn't really make sense. But I think it should work perfectly with immutable data structures. So Immer should work fine. Yeah. Hi. Do you think it would be possible to generate the object model definitions, I don't know what they're called, the object models, from TypeScript? Okay, that's a good question. So actually, you can do the opposite. That means that if you are using models, it will generate TypeScript types for you. But because this is more than type checking and, as you saw, this can affect application logic, we cannot do this simple conversion. If you use dynamic type checking just like you would use TypeScript types, you are using only about 10% of the potential of the library, and that wouldn't make any sense to me. So you should maybe see the website to have more examples of that, but yeah, it's a bit more than that. Yeah, I see hands.
You were there. The next speaker also, could you raise your hand or stand up? Okay, so we'll have to contact them. A fantastic idea. Love the library. You mentioned rewriting it four times over eight years. How stable would you consider the project to be? How often do breaking changes to the API get introduced, that kind of thing? Yeah, it's true that I've written it four times, but the API never changed. That's one thing. And also I use it for professional projects. So I would be embarrassed if I had to throw it away, all right? So it has been quite stable for many years now. Hello. Thank you for the presentation. I would like to know whether you would recommend using ObjectModel on projects that do not use TypeScript yet, only JavaScript. Thank you. Yeah, I mean, that could be a thing. Although if you are into strong type checking, you're probably already using TypeScript. If that's not the case, maybe it's fine. I don't know. But yeah, it's totally possible. But most people are using TypeScript. Thank you very much. We still have time, if you have time for another question. Yeah, you'll have to be loud because people are moving. And if you sit down, please make sure there is no empty space, because the room is pretty full. Hi. Thank you for your project. So one other approach is using, for example, JSON schemas that then translate to types, should I speak? Sorry, I can't hear you. One other approach is using JSON schemas, for example, on the validation side, let's say, in a controller. And the JSON schema then compiles to, or deduces, the TypeScript type that the schema defines. So that's one way, for example, to do validation and not have a runtime penalty besides doing the validation itself. Have you considered this approach for your use case? Yeah, so good question. So you can indeed use this kind of type declaration.
One problem is that, again, if you are sticking to what TypeScript can do with static type checking, you are only using a fraction of what can be done. For example, I told you about custom assertions. I told you about the fact that you can check values along with the types. So all of these things would not match the model that you are describing with JSON. So that means we need to have another API for that, and that's why ObjectModel has its own API. One last question, and then please, everybody, squeeze in, no empty seats. There are a lot of people still standing. So JavaScript is executed with V8, and there is a lot of optimization underneath, where you have inlining and optimization of functions going on. When you use proxies, all of that is going to be gone immediately. The performance hit is not only when you set something and would normally go through the proxy. It's not a hook. How's it called again? I forgot the name. Anyhow. When you have the trigger for it, but also for anything that relies upon that data from that point on. So it's going to be a huge performance hit. I would definitely not recommend this in production. Yeah, I talked about the performance issues. One thing is that it's only useful for applying to external interfaces like network requests. You can just validate everything related to one request. So the loss of time due to the proxy, compared to the loss of time due to the network request, is acceptable in my opinion. You can debate after. Don't worry. I mean, I run a bunch of assertions and it takes less than 10 milliseconds. So I don't think the loss of time is that much trouble. Thank you very much. Thanks again. |
Secure by accident
How performance optimisation can lead to more secure apps |
Hello, everybody. Can you hear me? That's cool. I'm André Jaenisch. I will talk today about Secure by Accident, and the slides are to be published under the Creative Commons Attribution 4.0 license. I have been a freelancer since last month. I'm doing web development and consulting. I'm available on Mastodon, for example, and in other places. So if you have questions, please get in touch with me either afterwards, preferably outside, or on Mastodon. Who is the target audience? I expect you to have some knowledge about Angular, TypeScript and webpack, because I can't go into those; there's not enough time for it. I'm interested in security and performance. If you are too, we are in good shape. What will you learn today? I'm going to show something which I will get to in a minute, and I will show you which steps I took to reproduce it, help you to understand the results of a webpack build, and then go into more detail, because you can see the child routes from Angular in the result, and I want to mention how you can protect them, what benefits that can have, and how code splitting works and helps with that. A few words about how this came to be. I was approached by Pechida last year, who was researching something for security and had questions about Angular, because that's not her area of expertise. I explained how Angular works, what the different files are meant for and how to read them, and learned which information you had better not include in an Angular build without prior authentication and authorization. So we have four more seats here. This presentation will focus on Angular, but the issue at hand is not limited to Angular. It applies to React as well. It's not something the framework can help you with, because it's a responsibility that lies with you as an app developer. I used Angular for this presentation. It's a minimal application, and to help me read the webpack build, I used Prettier, which is also quite nice.
So I started with a brand new Angular project, using the recommended approach of ng new, and called it fosdem, because that was what I had in mind when I was writing the slides. And because we are dealing with Angular, the documentation isn't complete, so I had to install some more dependencies to be able to generate a build in a dev server environment. As a result of the build, we get several files. For this presentation, I just looked at the JavaScript files, but I can also quickly go through what the files are meant for. We have, for example, the index.html. This is a minimal app, so it contains the bare minimum of HTML you need to load the Angular files and some CSS, and to have the JavaScript files loaded that we saw before. Then we have some styles.something.css, which is empty because at that point we don't have styles applied to components, otherwise they would go in here. The hash is generated by webpack at that point. We have a runtime file that contains the Angular runtime, that is, those parts Angular needs to parse the templates and manage the dependency injection and everything. It's mostly not relevant to what we are interested in. We have polyfills that contain something like zones and certain Promise features as polyfills, so if some browser doesn't support them, they get added to the global namespace. And we have the main file, which is mainly what we as app developers wrote, but also some boilerplate code, for example for RxJS or some template parsing elements. So what's the case right now? I looked at the routing file, the routing module, as it's generated by Angular, and we are mainly interested in that routes variable over here. So I looked up what the TypeScript definition for a route is. This is not complete, it's just a part of it.
We are mainly interested in those properties, especially the path and the component, which is basically used if you don't use child routes and just have a mapping from a path segment to a component. It's also helpful to have redirects; that's something you will add early on to have a catch-all route or redirects to a 404 page. What is interesting for this presentation is also the children property, which is another array of routes, and loadChildren, which is used for lazy loading other route segments, and you can put canActivate in front of it to guard it, which means you have some check whether the current user is allowed to access that route. If you then look into what is produced by Angular, you get some JavaScript file and can search it for the main app component, because that's part of the index.html that gets generated, and it's generated by the boilerplate code, and that's the entry point. Above that line is just webpack boilerplate code, so you can ignore it for now. Below that is what you as an author wrote. Next, I created some components: a page-not-found component, and I thought about two more components, for example the speaker component and the slides component. Let's say we want to protect the speaker component for some reason. If you then generate a new build, you will see no changes to it, so it's identical. Why? Because the build gets tree-shaken by webpack, which means that if you don't load those components, they will not be part of the build. Let's include them then. We want to have them as routes, so I extend the routing module and declare a route for the slides and for the speakers, right now without children, and therefore I also have to import them. I added the imports because when I look at the Angular documentation, I often find that it isn't complete, which is a bummer, so if you want to reproduce this yourself, I give you the necessary hints to follow along.
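As a rough illustration of the route definition described here, the following sketch uses a simplified stand-in for Angular's `Route` type so that it runs without the framework. The stand-in interface and the empty placeholder component classes are assumptions made for this example; only the property names come from the talk.

```typescript
// Simplified stand-in for Angular's Route type. Only the properties
// mentioned in the talk are modelled; this is not the full definition.
type ComponentType = new () => object;

interface Route {
  path?: string;
  component?: ComponentType;
  redirectTo?: string;
  children?: Route[];
  loadChildren?: () => Promise<unknown>;
  canActivate?: unknown[];
}

// Placeholder components (in a real project the CLI generates these).
class SlidesComponent {}
class SpeakerComponent {}

// Flat mapping from path segment to component, no children yet.
const routes: Route[] = [
  { path: "slides", component: SlidesComponent },
  { path: "speaker", component: SpeakerComponent },
];

// Every path declared this way ends up readable in the main bundle.
const visiblePaths = routes.map((route) => route.path);
console.log(visiblePaths);
```

The point of the sketch is only that each entry is plain data, which is why it can be read back out of the build output.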
So I added those routes, re-ran the Angular build command, ng build, and looked into the results. Then, for example, I see that the JavaScript now contains some more lines. The variable names or identifiers might change, because Angular, or rather webpack, is minifying the variables, but it has something like this structure, for example for the slides, or directly below for the speakers. So that's what Angular makes out of your route definition. We also have a mapping from the path to the components; the components are the function calls from before, and that's how you can read it. So if you are a security researcher, that is what you would look at and try to make sense of. I find it helpful to ask: okay, we have the result now, how would I translate it back? Reverse engineer, so to speak. The next thing we can add is a general catch-all path that redirects to the slides page, to some kind of landing page or index page, and for everything else that can't be mapped, we have a page-not-found component. So now I want to go into more detail about how I would actually see the components that I added. The app component HTML is really bloated, and it tells you in an HTML comment that you can drop certain parts. If you just remove everything within that div container with the role of main and replace it with something more semantic, you can actually see what's happening here. So now we have updated our component. Now we have our own HTML, and I can look into the guarding and into defining child routes, and I would like to have some kind of model. So I'm going with a template-driven form here. It might also be possible to use a reactive form, but most of the Angular applications I have worked with use template-driven forms, so I'm more familiar with those. And I just defined a model here. It's called auth, with some string that is used as password. It's just for demonstration purposes.
You wouldn't use this kind of thing in an actual application, I hope. Now that we have defined it in the component's class, we should also update the template. Here I removed that "speaker works" text and replaced it with some basic HTML. Still no styling, because that's not relevant for what I'm doing here. I added a form element and an input, and to demonstrate that we have something working here, I show the link to the slides subpath once the form is valid. And here we have an input with a type of password, so the browser uses that obfuscation with dots or stars or what have you. And you have a model, which is helpful for Angular as well, so it's a two-way data binding. And once it's valid, you get access to that anchor. The last thing you should have, because you need one when you have child routes, is a router outlet that tells Angular where to display that child route. So now we have updated the template. The next step is to also update the route definition. So back to the routing module. I extended that speaker route. It's now not just a mapping from path to component, but it also declares that we have some child components. In this application, it would be /speaker/slides to access that child component. It's still rendering the parent, the speaker component, but also the slides component in the router outlet. At that point, it is still possible to look at the bundle and enumerate every route in the application. So as a security researcher, I know these are the places I have to look at and check for security. Going more into protecting the thing, the first thing I would do is define a new speaker module, because that's the part of the application I want to protect. The Angular CLI offers you a way to generate the boilerplate for that as well, with ng generate module followed by the name of the module.
I want it to have routing as well, and I want to have it as a sub-module of the app module. Once that speaker routing module is defined, you can update the routing module of the app and replace everything you had there with just the path, and then tell it to lazy load the speaker module. I will get to that in a minute. So now I lazy load the whole speaker module, and I have to redo the slides subpath there. So I go into that module, load the slides component, and declare that I want the slides component as a child of the speaker component. The path for the speaker component here is an empty string, because I'm already in the speaker path from the parent. It's also important to remove the speaker component from the app module definition, because otherwise Angular will be upset that it can't have it in two places. Once I generate a new build, I now see a second part about lazy chunk files, which has some hash and some other hash. I'll come back to that as well. It has the speaker module, because that's what we defined, and we have some sizes. So I can tell that right now I really have some part of my application that gets lazy loaded. That means it will only be loaded once that path segment is entered. Now, I told you that I would like to protect that path segment, so I define a route guard, and the Angular CLI has a guard schematic for that as well. So I run ng generate guard, or ng g guard, and name it something. Here it's usually something like canActivate, canDeactivate, or the other types of guards you have. You get a small interactive prompt asking what you want the guard to be able to do, and here it's for protecting the speaker. What I receive is again some boilerplate code provided by the Angular CLI, and I follow the documentation about how to define guards and define the UserToken and Permissions classes. Right now it's just always returning true; you would have some more checks here.
For example, is there some JSON token set in local storage, or whatever you want. I also noticed that the documentation isn't complete here, because either you have to export those two classes, or you have to declare them as injectable. Once those two classes are defined, and you can have them somewhere else, you need to inject them into the guard itself and declare that the guard implements the CanActivate interface. That means it has to have a public method called canActivate that takes an ActivatedRouteSnapshot as one parameter and returns an observable or whatever, which is not relevant for our case. I just want to use the injected permissions here, call the canActivate method on it, and hand over the current user and the route params. That's one way you can do it, and then you can check the case: I want to enter that route with this user, is he or she allowed to do that? In our case, for demonstration, I just return true, so it's always allowed, but you can also set it to false to test it yourself and discover that you get redirected to the index page, the slides. Now that we have a guard, I have to update the app routing module, because I want to protect the speaker. That means it gets a new line for canActivate, which is a list of guards, in our case only one, and I also have to tell Angular that there are some providers for those injected dependencies. Therefore it's important to have those two classes, Permissions and UserToken, exported as well, otherwise I couldn't use them here. And now I'm at the part where I would like to use the webpackChunkName magic comments from webpack. After I pitched this talk and got accepted, I discovered that, hey, Angular doesn't support that, but you can turn on the namedChunks property in angular.json, which gives you not the hashes but the whole filename that you would have on a dev server, and then you can apply some more security measures. I will get to that on the next slide.
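The guard described above could be sketched roughly as follows. To keep the example runnable without Angular, the snapshot type is a local stand-in and the dependencies are passed through the constructor by hand instead of being injected; the name `SpeakerGuard` is an assumption, while `UserToken`, `Permissions` and `canActivate` follow the names used in the talk.

```typescript
// Local stand-in for Angular's ActivatedRouteSnapshot (assumption for this sketch).
interface RouteSnapshotLike {
  params: Record<string, string>;
}

class UserToken {}

class Permissions {
  // Demo only: always allows access. A real check might look for a token
  // in local storage or ask a backend.
  canActivate(_user: UserToken, _id: string | undefined): boolean {
    return true;
  }
}

// Hypothetical guard name; in Angular this class would implement CanActivate.
class SpeakerGuard {
  private readonly permissions: Permissions;
  private readonly currentUser: UserToken;

  constructor(permissions: Permissions, currentUser: UserToken) {
    this.permissions = permissions;
    this.currentUser = currentUser;
  }

  canActivate(route: RouteSnapshotLike): boolean {
    // Hand over the current user and a route parameter to the permission check.
    return this.permissions.canActivate(this.currentUser, route.params["id"]);
  }
}

const guard = new SpeakerGuard(new Permissions(), new UserToken());
const allowed = guard.canActivate({ params: {} });
console.log(allowed);
```

Changing the return value in `Permissions.canActivate` to `false` is the equivalent of the test the speaker suggests: the route would then be refused.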
For those of you who don't know what a chunk name is: when you have that import statement for lazy loading, you can add a JavaScript comment called webpackChunkName and declare a name, and webpack will use that one instead of the generated hash. So my idea is that you have certain files that are then on your server as static files, and you can add HTTP headers, for example Content Security Policy. You can also look at the Angular documentation, and it says that you need 'unsafe-inline' in particular for scripts, which is bad, because it opens you up to more attacks. If you can't use Content Security Policy that way, you still have the ability to declare a SHA hash or a nonce, which is a bit more labor, but in my experience, having worked with companies who deploy every other week or so, it's doable to compute a hash and add it to the HTML. The idea is that nobody else is able to crawl the route of that JavaScript chunk and look into it further. For example, you could have a list of certain passwords in there, which was the case for Percadia. If you wanted to attack that application, you could just exclude those passwords, because they won't be valid anyway, and that would help with credential stuffing, that is, trying out passwords and credentials you found somewhere else. Or you could narrow down your brute force attack, because you know the password has to have this length, or that special characters are allowed or not allowed. That way you would make the criminal's life easier. So I would like to have that not be part of the bundle, but loaded from some JSON file or something else, which can then be protected. So basically, have some authorization for certain information, and only give out more information once the user is authenticated, for example by an Authorization header and the response. So what have you learned today?
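To make the "compute a hash and add it to the HTML" step concrete, here is a minimal sketch using Node's built-in `crypto` module. The function name `cspHashFor` and the sample inline script are assumptions for illustration; the `'sha256-…'` source expression is the standard Content Security Policy syntax for allowing one specific inline script.

```typescript
import { createHash } from "node:crypto";

// Builds a CSP source expression of the form 'sha256-<base64 digest>'
// for the exact text of one inline script.
function cspHashFor(inlineScript: string): string {
  const digest = createHash("sha256").update(inlineScript, "utf8").digest("base64");
  return `'sha256-${digest}'`;
}

// Hypothetical inline script body; the hash must be recomputed on every change.
const inlineScript = "console.log('hello');";
const policy = `script-src 'self' ${cspHashFor(inlineScript)}`;
console.log(policy);
```

Because the hash changes whenever the script text changes, a step like this would typically run as part of each deployment.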
I hope you got a bit better understanding of what the result looks like, what Angular is producing, and what named chunks are in Angular. You learned a bit about Content Security Policy, which is really important, and there's documentation on MDN as well, and you learned about some ways to secure static files. I used Creative Commons images and provide the sources, with the exception of the profile picture of Percadia, but I got written permission that I am allowed to use that one. Thank you. Do you have questions? |
The problems you will have when creating a plugins system for your shiny UI project |
Okay. I guess I can start. Thank you very much for being here. Thanks also to the folks organizing the dev room and the event. My name is Joaquim. I work for Microsoft, and I'm here to talk to you about some things that we found while building a UI project, a JavaScript UI project. Maybe it's helpful for you too. All right. So this is kind of a high-level presentation, and I'm not here to tell you how to do stuff. Certainly, some things will be very basic. With others, hopefully, you will also disagree. But the idea is to tell you about the patterns that we found, so that if you want to do something like this, you will already be aware of these things. So just to set up the context of what I'm talking about, and I should have started the timer. When we say plug-in system, we're talking about applications like, for example, ones you probably know, Mattermost or VS Code. They have extensions, or plug-ins, if you will. I will usually just say plug-ins, but it's essentially the same thing. The talk is not about the product that we build, but just to give you an idea of the context: this is a Kubernetes UI. It's built with React, and there's a server, or backend, and a front end, so it's very traditional in that sense. And you can run it as a web app or as a desktop app. When I talk about plug-ins, still in the context part, what this means is that, of course, it's code that should be loaded dynamically. It has an API from a library for you to change stuff. In our case, this is used basically for changing the UI, most of it. But you can also change certain core things, like adding routes, deleting routes, or changing how the token is obtained when you need to get a token, stuff like that. So this is essentially the context of what I'm talking about when I say plug-ins and functionality. Okay, so let's start by looking at what the plug-in should look like.
Usually we're talking about a bundled single JS file, right? Let's not talk about several JS files; you know the drill, hopefully. So you have the plug-in code, and maybe that's enough. For us, it was enough for a while. But likely you will need some information together with the code. That's often called a manifest. In certain plug-in systems that I worked on before, it was kind of programmatic, so the plug-in itself declares: here's my name, here are my dependencies on other plug-ins, or whatnot. We recommend not doing that. We recommend using a text file, a manifest file, where you have all that information about the plug-in. It turns out that package.json is pretty good for that already. The advantages of not having this in the code are hopefully obvious, but the thing is that you don't have to load the code before you know if you should load the code. So if you have metadata that you need in order to decide whether to use the code, for example you want to tell the user, okay, here's the name of the plug-in, do you want to enable this plug-in, then it's better that the code is not already running. Loading and unloading plug-ins. This is coming into how the plug-in should be structured. Of course, you have to load the plug-ins dynamically, and usually there is this pattern of an activate method. This activate is about telling the plug-in developer more or less when the code will be loaded. That's the sole purpose of it. It is not a guarantee that if you don't put code inside activate, it is not run. Don't trust that. But it can also be used to tell the system something when you try to activate the plug-in: you can have a return value from that activate method, and the plug-in can tell the system, I could not activate. We'll see a couple of examples.
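The idea of reading the manifest before touching the code, and of letting `activate` report back, might be sketched like this. All names here (`PluginManifest`, `Plugin`, `loadPlugin`, the example manifest values) are hypothetical and chosen for illustration; the talk only establishes the pattern, not this API.

```typescript
// Hypothetical manifest shape; package.json already provides fields like these.
interface PluginManifest {
  name: string;
  version: string;
}

interface Plugin {
  // Returns false when the plug-in declines to activate.
  activate(): boolean;
}

// Hypothetical loader: the manifest is consulted before any plug-in code runs.
function loadPlugin(
  manifest: PluginManifest,
  isEnabled: (manifest: PluginManifest) => boolean,
  loadCode: () => Plugin,
): "skipped" | "declined" | "active" {
  if (!isEnabled(manifest)) {
    return "skipped"; // the plug-in code was never loaded
  }
  const plugin = loadCode();
  return plugin.activate() ? "active" : "declined";
}

const loaded: string[] = [];
const result = loadPlugin(
  { name: "example-plugin", version: "1.0.0" },
  () => false, // e.g. the user chose not to enable it
  () => {
    loaded.push("example-plugin");
    return { activate: () => true };
  },
);
console.log(result, loaded.length);
```

Note that the return value of `activate` is informational only; as the talk stresses, the host should not rely on it for safety.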
Our plug-ins, for a while, also didn't have the activate. You can do without it, but then you should assume that the whole loading is the activate. Both methods are fine. Like I said, this is, in many cases, a matter of taste, but it is a pattern. So if you have an activate, should you have a deactivate? This is when you tell the plug-in, okay, you're going away. Maybe there are certain things that the plug-in needs to do before it goes away, like cleaning up. This is unlikely to be used by most plug-ins, but just as you shouldn't trust that the plug-in will only run code inside activate, you should also not trust the plug-in to deactivate itself. So this is mostly about telling the plug-in developers: here's a way to know when the code is going to be activated and deactivated. For unloading, you should mostly just reload the system without the plug-in. And that's pretty much the conclusion of the section: don't trust plug-in code, I guess. Now going into the structure, and hopefully this is a bit more interesting. You have to decide how people can write plug-ins. Since we are extending functionality, extending something where you already know how it will work, maybe a plug-in class sounds good, an object-oriented approach where you extend stuff. But of course, right now we have a lot of functional code going on, so ultimately this is a matter of taste. We can argue about what's fastest, what's quicker to do, but in such applications, it's fine if you're not squeezing out that extra 200 milliseconds. So, yeah, these plug-ins do exactly the same.
For example, in this case, the plug-ins would say, okay, we only work on Mondays, so if the, you know, if the day is not on Monday, then you just tell the system, I'm not activating, but these are different flavors. However, maybe an interesting plot twist would be that, of course, plug-ins have a life cycle, right? They activate, they deactivate, so that's what React kind of gives you for a component. So why not making a plug-in be a React component, right? You already have, you know, certain life cycle things, like use effect, for example. Another advantage is that you could also use hooks inside it. So we, for example, in our system, have some hooks, and if you just use the, this method, it's going to be a bit complicated to do, right? But if you use this, then probably it's a good idea. And if you, of course, this only works if you have a React-based system, the other systems, I don't know, but yeah. But then the other thing about the functionality is, okay, how about implementing actual stuff to change the system, right? And in here, I think there are mainly two options, so you can make it so that it's very textual and declarative. So let's say that you have a top bar, we do have a top bar, and you want actions. Actions would be like a button or a string or something else that you want to put there, right? So if you make it declarative, like the left example there, of course, the system is responsible for interpreting whatever you put there, so as long as you don't implement stuff, you won't give any power to that plug-in implementation. On the other hand, this is kind of a bit limiting, right? So because you always have to develop more and more functionality to support it. If you want to be a bit more flexible, although not as, I guess, friendly to new developers, you can just say, okay, just put a component here and that's it, right? And it can be a string or something else. So that's the right side. 
And of course, maybe people will do stuff with that which you're not expecting, but you also basically support anything. So it depends on the level of control that you really want to give. Now, the functionality itself. I won't bother you with the functionality that we have, but usually you will say, okay, I want people to add, again, top bar actions, or a new route, and all that, and it's likely that you will also need a way to remove those. So you can add an action, and somebody will say, okay, now I need to remove it if it's not Monday or something. So it's some sort of CRUD. Let's look at what it could look like. In our case, I talk about header actions, and I put this screenshot here just to illustrate: this is a header action, it's just a header with an action. If we want to support something like this, should you have one function per operation, like the register header action we have here? So you declare the button and register it, or maybe, if you can add one button, you can also add a list of buttons, and it just keeps getting appended there. Sounds great. Then you have a counterpart for the operation that removes header actions. In this case, you can call it deregister, to be the opposite, or remove, to be a bit more direct. However, how can you actually identify what you added there? If you declare the component, or the function in this case on the left, then you have access to it. So you just pass it again, hopefully it compares equal to the same thing internally, and the system can understand, okay, this is something that we have here, so let me remove it. But let's say that we already have default actions. How can the user refer to the default actions? Will they import them? And they cannot really refer to the actions by function name, because the code gets minified and then things don't work.
So of course, one solution is to add IDs. It's probably a good idea that whenever you have a function where you're just passing a component or something else, you identify it, if you are going to refer to it later. But then you're very happy that it works, and somebody will come and say, hey, that's cool, but you keep appending the actions; I actually want my actions prepended. And then your world turns upside down. So either you add something like an index to the function when you call it, so now you have the ID, the index and the actual action. Or you scrap all that and, for example, just use a list processor. Instead of registering an add function, a remove function and all that, you can say, here's my list processor for header actions. It's going to be fed whatever the default actions are, and you can add to them, remove them, shuffle them, whatever. Of course, you have to identify them as well, so the ID stays. But this is, I think, a more flexible way and less work to maintain. Now, developer experience. It's supposedly important that users can start and develop plug-ins easily for your system. So just like with other programs that you have probably used before, there should be a bootstrap way of creating a plug-in. Either that, or you have a folder of examples where you say, okay, just use this and modify it. We do have something like a bootstrap script. It's called headlamp-plugin, because we're original. That's interesting because you can just generate the base plug-in. But you should also take into account that you should require the developers to configure as little as possible. So one way would be to say, okay, here's the package.json that we generated, here's the tsconfig that we generated, here's the webpack configuration that we have, and all that.
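Returning briefly to the list processor mentioned a moment ago, a minimal sketch of the idea could look like the following. The names `HeaderAction`, `registerHeaderActionsProcessor` and `resolveHeaderActions`, as well as the action IDs, are assumptions for illustration and not the project's actual API; a plain `label` string stands in for a component.

```typescript
interface HeaderAction {
  id: string; // stable identifier, unaffected by minification
  label: string; // stand-in for a component or button
}

// A processor receives the current list and returns the list it wants.
type HeaderActionsProcessor = (actions: HeaderAction[]) => HeaderAction[];

const processors: HeaderActionsProcessor[] = [];

function registerHeaderActionsProcessor(processor: HeaderActionsProcessor): void {
  processors.push(processor);
}

// The host feeds the default actions through every registered processor.
function resolveHeaderActions(defaults: HeaderAction[]): HeaderAction[] {
  return processors.reduce((actions, run) => run([...actions]), defaults);
}

// A plug-in prepends its own action and removes a default one by ID.
registerHeaderActionsProcessor((actions) => [
  { id: "my-plugin.hello", label: "Hello" },
  ...actions.filter((action) => action.id !== "delete"),
]);

const finalActions = resolveHeaderActions([
  { id: "edit", label: "Edit" },
  { id: "delete", label: "Delete" },
]);
console.log(finalActions.map((action) => action.id));
```

With this shape, adding, removing, prepending and reordering all go through one registration function, which is the maintenance benefit the speaker points to.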
However, of course, the next time that you need to upgrade the plug-ins, you have to upgrade all that stuff. So a probably better idea is to add as little as possible. For example, in the case of the tsconfig, instead of shipping the whole tsconfig, where you then have to figure out how to upgrade it if you need to, we use the fact that our module is already installed in the plug-in at the development stage. So we just point to it. We ship the tsconfig that we want inside the module, and the plug-in's tsconfig points to it. If the developer touches their own file, that's fine, because we're never touching that again. We're just touching the file that we ship, so it's updated automatically as long as you update the module. We actually try to keep the dependencies as simple as possible. In this case, it would be just one, our headlamp-plugin package. All right. Next is bundling. So now you have your API with the list processing and so on. And webpack is very easy to use. Yeah. So you get your bundle, you get your single JavaScript file, ready to be run. But of course, if the plug-in imports React, for example, then you get React bundled in there. If you import, in our case, headlamp-plugin, then you're going to have that library bundled in. And you should try to avoid that, because the plug-in is going to run inside a system that already has these libraries. It should run with the same versions, and, even if only for size reasons, it should not pack the same thing again. Webpack has this thing called externals, external modules, and you can say, okay, whenever you find this import, we actually mean this variable. So when it finds React Router, it says, don't care about React Router, just use whatever you put there.
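As a rough sketch, the relevant fragment of such a webpack configuration could look like this. `externals` is webpack's option for mapping an import to something provided at runtime; the global name `pluginLib` and its property names are illustrative assumptions, and the object is typed loosely so the fragment stands alone without webpack's type definitions.

```typescript
// Fragment of a webpack configuration for a plug-in bundle.
// Each key is an import specifier; each value is the runtime expression
// that webpack should use instead of bundling the module.
const pluginWebpackConfig = {
  externals: {
    react: "pluginLib.React",
    "react-router": "pluginLib.ReactRouter",
  },
};

console.log(Object.keys(pluginWebpackConfig.externals));
```

The host application is then responsible for exposing that global before any plug-in bundle is evaluated.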
In our case, it's something like pluginLib.ReactRouter, and it's going to use the same code. So we were like, okay, this is great. We can avoid shipping all this stuff. We're going to keep our headlamp-plugin package really lean. It's going to be great. We even thought, okay, we're not even shipping our own library. We're just shipping the type declarations. That's going to be fine. So we spent many hours wiring tsconfig and webpack and whatnot to make sure that everything was happy when users are developing. They can see that the imports seem to work, even though the code doesn't exist inside the library. And then somebody wanted to test their plug-in. And it was like, oh, okay, now you cannot test the plug-in, because you don't have the libraries around. So maybe you have to use the program itself to test the plug-in, but that's probably not a great idea. So we were like, okay, we have to ship the actual library. It still works as an external module, so we're not bundling it, but we're shipping it. And it's fine because, yes, the headlamp-plugin package is slightly larger, but that's okay, because it's a one-time cost. So take that into account and don't get too extreme with not shipping stuff. Yeah, we're getting to the end. So, running the plug-ins. Now you've got your bundle. It's not bundling React, because we've got it. It's not bundling your library, because you've got it. And you're going to run it. However, at some point you will break the API. And if you do break the API, it would be nice not to load a plug-in that will be broken, because you'll break the system for your users. package.json has something for this already, one of its known fields, called engines. So you can put an entry there for your system, and then, when you run the plug-in, you should check that before running it. Now, how to run the actual system?
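A check against the `engines` field might be sketched as below. This is deliberately simplified: it only understands ranges of the form `>=MAJOR.MINOR`, whereas a real implementation would more likely use a semver library. The engine name `my-app` and the function name `isCompatible` are hypothetical.

```typescript
interface PluginPackageJson {
  name: string;
  engines?: Record<string, string>;
}

// Simplified compatibility check; supports only ">=MAJOR.MINOR" ranges.
function isCompatible(pkg: PluginPackageJson, engineName: string, hostVersion: string): boolean {
  const range = pkg.engines?.[engineName];
  if (range === undefined) {
    return true; // the plug-in declares no constraint for this host
  }
  const match = /^>=(\d+)\.(\d+)/.exec(range);
  if (match === null) {
    return false; // unknown range format: refuse rather than guess
  }
  const [hostMajor = 0, hostMinor = 0] = hostVersion.split(".").map(Number);
  const wantMajor = Number(match[1]);
  const wantMinor = Number(match[2]);
  return hostMajor > wantMajor || (hostMajor === wantMajor && hostMinor >= wantMinor);
}

const plugin: PluginPackageJson = { name: "example-plugin", engines: { "my-app": ">=0.8" } };
console.log(isCompatible(plugin, "my-app", "0.9.1"), isCompatible(plugin, "my-app", "0.7.0"));
```

Running the check before loading means an incompatible plug-in can be skipped with a message instead of breaking the application at runtime.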
So now you have everything in place, and you can check for it before you load it. But how do you actually load it? Of course, it's going to be loaded in the front end, since that's what we're talking about with JavaScript. And this is highly specific to each project. Maybe you have something where the users can load the files directly, and it just refreshes and loads. In our case, we didn't want to do that. We wanted it to be very transparent to the user. So if the user downloads a plug-in, the next time they start the application, they should see that things changed. But our application also works if you deploy it as a web service. In that case, we don't really want the users to use different plug-ins. It's more like, whoever deployed it is giving you a user experience with the base code plus a bunch of plug-ins that the users shouldn't really need to know about. And for security reasons, of course, we don't want the users to load plug-ins into something that, even though it could run locally, is deployed for this user and other users. They would have different user experiences if they kept adding their own plug-ins. So what we do in our case is that we have the backend, or the server. We call it one or the other depending on whether it's running locally or actually on a server. That's the part that has access to the plug-ins themselves. It reads the plug-ins, and it has an endpoint. The front end, when it loads, before it loads everything, asks, okay, what are the plug-ins that you have? The backend says, I've got these 10 plug-ins. The front end says, okay, give me the plug-ins now, or give me this one plug-in now, for example. Then it gets the actual bundled JS code from the backend, loads it dynamically, and then you get a shiny thing.
So this way, of course, if you're running it locally and you have the plug-ins in the plug-in folder that it expects, the backend has access, it actually keeps watching the folder in our case. It gets refreshed whenever you change stuff there, and that's how we do it locally. If you're running on the server, then, of course, we don't check if the plug-ins change or not. That's not supposed to happen, but you still get essentially the same experience. But like I said, this is mostly, this is very tied to each project. And that's all I got. So thank you. Thank you very much. We do have time for questions, so raise your hand high and we'll start with the first question. You said putting some dependencies in the plug-in, so having a module that they import so they can use some things like hooks. Do you have to, or do you do anything against a plug-in modifying those things that would then mess with other plug-ins, like changing the objects you've passed in? I cannot understand what you're saying. So if the plug-ins are depending on a module that you've made, and they're the same ones being passed into different plug-ins, could they modify the things you're passing into, then mess with the other plug-ins? Yes, but what was the actual question? Is there a way to mitigate against them changing how the plug-in system behaves for the other plug-ins? But you mean for example in the example of the actions, whatever goes there? So you had a button, I say you have a button class you're passing in, they can extend. What if they changed the behavior of that button class to then other plug-ins have a modified version when running? Yeah, but that's actually something by design, right? So you're supposed to, let's say that you changed the delete button and now you still have the delete button, but the delete button will no longer delete. It will just say, actually it wasn't on the example, but it will just say not today, right? 
So the delete button on the left just says not today, right? It actually replaces the delete button in this case, but that's fine, right? Because that's what the plug-ins are supposed to do, right? So the plug-ins are supposed to do, and maybe you have even plug-ins that, okay, they expect you already to have other things in there, so if you have a combination of both plug-ins they can see that you added stuff, and yeah, so that's by design. Of course if you install plug-ins that will make your system not do anything, well that's also, you should be careful about what you install. You talk about security in the front-end, but isn't that something that the back-end should handle more and just keep the JavaScript as light as possible instead? About the security, what? You talk about security in the front-end, so that users can't add their own plug-ins, but isn't that something that the user would be responsible for anyway? No, no, yeah, maybe, I mean I was rushing maybe I didn't explain that correctly. No, the thing about the security is not so much about the security, it's about the user experience. So you suppose as a user to add your own plug-ins of course, but that's if you use in our case our application as a desktop application, because then it's you who is responsible for that application. When you go and you use it because you access some service that gives you in the browser, then it's the person that deployed that or the company or whatever that is supposed to give you the plug-ins that you are supposed to see, so you shouldn't change the way that the application works, but that's of course our decision, right? In other cases like a guest lag or something like that, you can add actually different plug-ins for yourself, and that's cool too, but this is like I said, this was a highly intimate decision for our own project. 
So a couple of months back when we were checking for the plug-ins, so usually a few applications run the plug-ins in an isolated environment, like they ship their micro runtimes and run in them and then try to communicate. So in your use case, are you running them in the parent application context? Because in that case, we can't always trust what users are writing in their plug-ins, right? So they can steal stuff from Windows, things like that. So do you have any check to see all the plug-ins and do that due diligence before I load them to the store or something like that? Yeah, security would be a whole talk about it. Which we don't have time for? We're just running the plug-ins as is because as of now, you know, we don't have, for example, you cannot just download plug-ins from NPM right now, right? We're going to have that. When we have that, we're going to have a different way to run them, hopefully. I know that depending on the system, you're going to find that some people do have a way to isolate them. There's a good blog article by Figma doing that. And that's kind of cool that you say what approaches they took and what conclusion they got to. Other, you know, other programs, they just say, okay, you're supposed to install stuff that you trust and they go through some, you know, just like when you install an NPM package, it can be harmful, right? But there are mechanisms to kind of mitigate that. So I want to make it as secure as possible, but that's not, you know, it was not security from the start before we actually have the system. Do you have any ways of handling code splitting and other stuff like that? Maybe a plug-in wants to load some components later on. Is there a way you can handle it using your method of doing that? If a plug-in wants to add components? If a plug-in developer wants to use code splitting and loading stuff later on, is it fine? 
I mean, if you have an active method, I mean, if I understand your question, you have the moment where the plug-ins are loaded, right? So you can just say, okay, I'm not supposed to be, I'm not supposed to be running the buttons, nouns, or changing the functionality now. I'm supposed to be changing the functionality whenever. Of course, that's a responsibility for the plug-ins, right? We just say, okay, we're loading you. Now you should make sure that you do whatever you want. But it should be like, you can, of course, this is just code. You can change when it wants, right? Thank you again. |
Is it time to migrate to Vue 3?
TLDR: it depends |
Okay. Hello, everyone. Today I will talk to you about migrating your old projects to, of course, Python. No, just kidding. How many of you are using Python for work? Okay. We are in the wrong room. We need to go downstairs. So I will talk to you about migration of your Vue.js projects from version two to version three. And this is not working. Great. Okay. I am Denny. I work as a full stack developer with JavaScript and Python. Not Go. It's fake news. And I work as a front-end developer also using, of course, Vue.js. And let's start with a quick walk through the Vue.js versions. They released version 2.6 in 2019, a long time ago. Then in 2020 they released version three, and 3.1 and 3.2 one year later. And then last year in February, version three became the new default. So when you run npm install vue, you install version three now. And then in July 2022, they released version 2.7, in maintenance mode. So they won't release a new version of Vue 2. And it will reach end of life at the end of this year. So 2023 will be the end of Vue.js 2. Then let's do a quick recap of the Options API. The Options API works with Vue.js 2 as well as Vue.js 3. And you simply can define a data function. Inside that, you can return an object. And that object will be exposed in the template, as well as methods. So you can define functions. And those functions will be exposed to the template as well. Then again, in 2020, they officially announced Vue.js 3 with the Composition API and script setup, experimental at that moment. But now it's official. And I know all of you are sad about this. No Internet Explorer support. But now it's dead. So who cares? Then at that time, I was like, oh, wow, nice. Let's drop Vue.js 2 and move everything to Vue.js 3. Well, it wasn't so simple. So I started to check out the new setup script in the Composition API. So you can define a setup function.
And if you return all the constants and functions you define in there, they will be automatically exposed to the template. So, well, not so clear, but nicer than before. So you can group the constants and functions of a single piece of logic in the component and then return them. So if you want to use this in Vue.js 2.6, you need to install @vue/composition-api and use it like in this example, importing ref, computed and all the other functions from the Composition API plugin. And you can use a kind of setup script like in Vue.js 3. But spoiler alert, your tests will break. Well, easy fix: you need to use the Composition API plugin also in your test files, on Vue or on the local Vue instance you are using in tests. So, well, easy fix for now. Then the best part about this is the Composition API plus script setup. So you can define a script setup in your single file component and define constants and functions in it. They will be automatically exposed to the template. So, well, easy, clearer than the setup script, in my opinion. And everything is clear and neat. So, another big news, 2022. They released Vue 2.7 with support for the Composition API and, again, partial but nice, script setup. So, again, I was like, well, great, let's try Vue.js 2.7 before everything else. And, yeah, it was really nice. Now it's working with basically all components and you can look at the documentation for this. So you can upgrade your Vue.js dependency to 2.7 and that should be it. It should work. And you can change the script in your 2.7 projects to script setup like this. So removing the script, changing to script setup and changing it like that. And if the component is not too complicated, well, everything should work as expected and it will be ready for Vue.js 3. Yes, now we can git commit, push and deploy to production. Well, maybe not, because if you are using Vue Router, then it will break. So you need to use the new version of Vue Router, at least 3.6.5. And you need to import useRoute and useRouter like this.
And instead of using this.$route and this.$router, you need to use route and router like this. Pretty much the same for the Vuex store. Well, not the same, but you need to import your store from your definition and use the store like this. So store.state.propertyName and not this.$store anymore. Then everything is great. But another problem is tests. Well, tests will return a lot of errors because of vue-jest compatibility with Jest. It's not working with the version installed with the Vue CLI, that is 27, because they released support for that in September 2022, but just for Jest 29. So not good news. But we have a workaround for this. We can remove the usage of the Vue CLI service and use Jest directly. It's not so nice to do. We need to copy and paste the default Jest config from the plugin used by the Vue CLI into our Jest config file. And then we need to update our package.json test script to use Jest instead of vue-cli-service test:unit. And everything should work kind of like before. And then we need to remove the CLI plugin for Jest and install everything the CLI plugin was installing automatically, but at version 29, like we can see over there, and fix a couple of deprecation warnings, like this one, just a minor change. And that's it. We are ready to work with Vue.js 2.7 and tests working. Now it's time to move to Vue.js 3. They released an entire website explaining new features, breaking changes, recommendations, and the migration build. So tonight, before going to bed, I suggest you take a look at this, and maybe every night before going to bed, because it's really interesting. And let's skip breaking changes, because there are a lot, but it depends on your usage. And let's have a look at the new recommendations. They're recommending to use the new versions, of course, of router, devtools, test utils, all of them with Vue.js 3 support. They have a new build toolchain. So they are suggesting to use Vite instead of the Vue CLI.
And this is great, but not at the moment. And they are also suggesting to use Pinia instead of Vuex. And Vuex is there to stay, but for now they are suggesting a new default. So Pinia, move to Pinia. And new IDE support. So instead of using Vetur, for example, for VS Code, you need to use Volar. So, easy migrations. Well, devtools: just update your devtools in the browser. IDE support, same as before: just remove or keep Vetur, but install Volar, and that's it. Then you, of course, have a lot of mandatory migrations, for example, Vue Router, the new version compatible with Vue.js 3, Vuex, test utils, and of course, third-party libraries, for example, Vuetify, Quasar, Element UI. Before starting your task to upgrade Vue.js from 2 to 3, you need to check all your dependencies and check if they released a version compatible with Vue.js 3. They should have, because, well, it's been one year by now. But it depends on the library. So you need to check that first. For now, you can avoid migrating from the Vue CLI to Vite and from Jest to Vitest, even if they are great tools. If you just want to move from Vue.js 2 to Vue.js 3, for the moment, you can avoid migrating to them, but maybe in the future you should migrate to them, because they are great tools and they are the new default. And now it's time to talk about the migration build. So they released a new dependency, @vue/compat, that is a build of Vue.js 3 with configurable Vue 2 compatible behavior. And it runs in Vue.js 2 mode by default. And it will display a lot of runtime warnings about changes and deprecated features used in your code. It has some limitations, for example, dependencies that rely on Vue.js 2 internal APIs or undocumented behavior, for example, Vuetify. Same for usage of private properties on VNodes, again, Vuetify, Quasar, Element UI.
So if you are using server-side rendering in production, well, you should complete the migration before releasing to production because, well, it won't work anymore with @vue/compat. And so let's start our migration workflow. So at first, we need to upgrade our Vue CLI using this command here, just vue upgrade. And if you are using a custom Webpack setup, you need to upgrade your vue-loader to version 16. After upgrading, you need to update your installed Vue.js version and install @vue/compat. You can drop, for the moment, vue-template-compiler. It's useful in tests, too, but for the moment, we can remove it. And then we need to create an alias for Vue.js. So every time we import stuff from vue, we will import from @vue/compat instead of Vue.js. And again, in our configuration file, we need to enable, of course, compatibility mode via the Vue compiler options, using mode 2 for now. If you are using TypeScript, you need to update your Vue typings file using this. Again, this is explained on their website that you will read tonight. And now you can run your code using npm run serve, for example. And if you see compile-time errors, you should fix them. They should be, for example, configuration issues or, well, small changes. And when they are fixed, you can switch your compatibility configuration to use Vue.js 3. And you can run the app. It should work. You can open your DevTools, look at the console. And you might see a bunch of errors, a lot of them. And now you need to focus on those errors and fix them one by one. What I'm suggesting is that you focus on fixing your own source code errors and warnings first, because you will also have a lot of errors and warnings from Vuex, the router, and so on. But those come from Vuex and from Vue Router, for example, and we will update them in a moment. So, first of all, fix your own code warnings. And then update a couple of things that won't display warnings.
For example, transitions. If you are using transitions, you need to find them and replace them with this. And then, okay, working. You need to update your application entry in your main.js file. So, instead of using new Vue, you can use createApp, because this is the Vue 3 way. And for the moment, we can pass router, store, and everything else to createApp. But we will remove them in a moment. So, now it's time to upgrade Vuex to version 4. It is the version compatible with Vue.js 3. And we need to update our Vuex store definition to use createStore, moving from an object in state to a function. This is the main change, I think. And then, we can remove the parameter from createApp and use app.use(store) instead. And in all components using the store, we need to change to this. So, again, like before, instead of using this.$store, we need to import the store in this way for Vue.js 3 and use it through the store constant. Same for Vue Router. So, we need to upgrade to Vue Router version 4. We need to update our configuration, importing createRouter and createWebHistory, because it's a new breaking change, and change them in this way. Then, again, we can change our main.js, removing the router from the createApp parameters and using app.use(router). And change everything in the components, so importing useRouter and/or useRoute and using them in the components. And after this, you should pick off individual warnings, if any remain, and fix them one by one. For example, upgrading Vuetify, upgrading Element UI, Quasar or other dependencies. Just solve them one by one in order to isolate each problem. After this, you can remove the migration build and switch everything to the official Vue.js 3, but only when all warnings are fixed. And if you have dependencies that rely on Vue.js 2, you may not be able to do so, so please check all of your dependencies before doing this last step. And that's it.
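The advice to check dependencies before the last step can be partly automated. A small sketch, assuming the dependencies object comes from your own package.json and using only the minimum versions named in the talk (Vuex 4, Vue Router 4, vue-loader 16); the function names are made up, and third-party UI libraries such as Vuetify, Quasar or Element UI still need a manual look at their release notes:

```javascript
// Minimum major versions named in the talk for the Vue 3 side of the migration.
const VUE3_MINIMUM_MAJOR = {
  vuex: 4,
  "vue-router": 4,
  "vue-loader": 16,
};

// Naive parse: first number in a range string, e.g. "^3.6.2" gives 3.
function majorFromRange(range) {
  const match = /\d+/.exec(String(range));
  return match ? Number(match[0]) : null;
}

// Return a human-readable line for every known package that is still too old.
function findBlockers(dependencies) {
  const blockers = [];
  for (const [name, minimum] of Object.entries(VUE3_MINIMUM_MAJOR)) {
    if (!(name in dependencies)) continue; // not used by this project
    const major = majorFromRange(dependencies[name]);
    if (major === null || major < minimum) {
      blockers.push(`${name}: found ${dependencies[name]}, need ${minimum}.x or later`);
    }
  }
  return blockers;
}

const dependencies = { vue: "^2.7.14", vuex: "^3.6.2", "vue-router": "^4.1.6" };
console.log(findBlockers(dependencies));
```

With the sample input above, only vuex is reported, because vue-router is already at version 4 and vue-loader is not listed.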
Now it's time to git commit, push and deploy to production. Well, maybe not. Maybe not. You should update your Vue test setup, too. So, you need to remove vue2-jest and move to vue3-jest at version 29, hopefully. And also update Vue Test Utils to version 2, changing it in the Jest configuration, and everything should work. Apart from a couple of breaking changes in Vue Test Utils: propsData, when you force a prop into a component when mounting, you will have to change from propsData to props. There won't be a createLocalVue anymore, because every app is an isolated instance of Vue. You need to move mocks and stubs inside of the global options. And findAll().at() has finally been removed. So, now you can use findAll like a proper array, finally. And for all other problems in your tests, you should go to this website and look at problems and warnings, for example, and solve them with their great documentation. And that's it. Now we can be happy with our Vue.js 3 instance. So, as a quick recap, if you want to stick to Vue 2.6, for example, you can use @vue/composition-api so you can test and try the Composition API with the setup function only. If you want to try the new Composition API with script setup, you can upgrade to Vue.js 2.7, while waiting for dependencies to update their compatibility with Vue.js 3. But if you need longer support for your code base, your projects, then now is the time to migrate from Vue.js 2 to Vue.js 3 properly. So, here's the link to the feedback form for FOSDEM, if you want to leave feedback for this talk. And thank you very much for being here. Thank you. So, if you have a first question, please raise your hand higher than this because I can't see it. Don't be shy. While you think, if there is a seat that is empty, aim for the center. Please don't stay at the sides, in the aisles. Defrag the rows, please. Questions, hands up. If you are the next speaker also, please come to the front now. We're waiting for you. Hello. That was helpful.
I haven't really got an app to upgrade. I did one app in Vue.js 2. Can you hear me? Can you hear me? I did one app in Vue.js 2 years ago, which I've left. Now I'm trying to learn Vue.js 3. That was really helpful. I'll use your links because I'm mentally upgrading, if not upgrading the code yet. So, not really a question, but one of the problems I've found is finding examples that are only Vue.js 3 and getting mixed up with examples from Vue 2. Is there a good place to go? I mean, obviously vuejs.org is good, but like Stack Overflow is full of mixed examples and mess. Any help? Of course, the official Vue.js 3 website contains a lot of examples of the Composition API and the Options API. So you can switch, I think, on the page, your preference, if you prefer the Options API or the Composition API. And all examples in there will be switched to use Options or Composition. But other places, well, it depends. If you go, for example, to Vue.js website, they have two versions of the website. So the old version, it's all about Vue.js 2, so just Options API examples. And instead, the new version contains examples about the Options API, but a smaller section of the website also contains Composition API examples. I don't know why they are not upgrading all of their website to the Composition API, because I think it's great. But maybe it's just because now we are, again, in a passage between Vue.js 2 and Vue.js 3, so they prefer to leave Options API examples just to involve everyone from Vue.js 2 to Vue.js 3. But I don't have an explanation for this. |
In the loop
or: How I Learned to Stop Worrying and Love the Event Loop |
Hello, everyone. I am Bhavan. Yes, I'm here because I love software. Also, I really love talking. This is however the first time I'm giving a talk, so go easy on me. I work in Munich as a senior dev at this small startup called WorkerBase. I love DIY and I love making cocktails. That's me making one when I shouldn't be. I don't use a lot of social media, but you can find me on LinkedIn. Let's jump right into it and talk about some JavaScript architecture. I have a secret to share with you. I've had the worst layover of my life at Berlin Airport and all I want to do right now is just sleep on a bench. You guys have to help me out a little bit here and make this more interactive so I don't go to sleep. Usually, it's the other way around. I had questions in there hoping that the audience will not go to sleep, but now it's on you guys. First thing, and this is a fairly easy and uncontroversial statement, I hope, right? The JavaScript engines are asynchronous. Is there anyone who disagrees with this? Do you all agree with it? Who agrees with it? Just raise your hand. It's fine. It's fine. Don't be ashamed. Just raise your hand. Okay. You agree with it? Three, four people. Who disagrees with it? Okay. Can one of you maybe tell me why you disagree? I mean, don't give a reason that because you're asking this question, obviously, the answer is no, but can you give a reason apart from that? Yes, exactly. So JavaScript engines are actually synchronous. Right? JavaScript runtime environments are, in fact, asynchronous, right? And we'll talk in this talk about what's the difference between the two, right? But so far, does anyone get what I'm saying? There's the engine and then there's the runtime environment and they are two separate things, right? Final question. Is Node.js single-threaded? Yes. Anyone says no? Okay. One guy. Two. Good. I'll not bother you too much with this.
This is a bit of a bad question, so to say, because I should have defined what I mean by Node.js here. Do I mean the runtime or do I mean the whole ecosystem? But colloquially, when you say Node.js, you mean the whole thing, right? And that is not always single-threaded. There are parts of it that are actually multi-threaded and we will try and demystify some of these things, right? So this is what the JavaScript runtime environment, wow, that's a mouthful. This is what it looks like, right? Up there, you have your V8 engine, which is the JavaScript engine. I mean, it doesn't have to be V8. It's V8 for Chrome, SpiderMonkey for Mozilla, so on and so forth. But that's the JavaScript engine, right? That's the thing that understands JavaScript and parses it and does a bunch of things, right? It reads JavaScript, does stuff. And that thing, as we just mentioned, is synchronous. There are a few other things here that will actually give you the asynchronous part, right? But let's talk first only about V8, right? So what does V8 do? And when I say V8, I'm only using it as a placeholder for JavaScript engine, right? It could be any engine. It doesn't matter for the purpose of this talk. So what does it do? It does memory allocation. You have your heap, so it manages, it randomly allocates memory whenever it needs to store like a variable or something. It has the execution context, which is a fancy term for your call stack, right? And we'll talk about what the call stack is in a slide or two. It is also single-threaded, right? And synchronous. Okay. So yeah, I pretty much covered that, I guess. A quick intro to the call stack. If you have ever seen an error message in JavaScript, what you see there is your call stack, right? It's a snapshot of your call stack when that error happens. That's basically what it is. And it's single-threaded, which means that there is only one call stack in the JavaScript engine, right?
So in other words, it can only do one thing at a time, right? If it has to do two things, it cannot. It has to first finish what it's doing and then do the next thing. Right? Let's look at a quick example for this, right? So what I have here is a simple pseudo code. Well, not pseudo code. It's working code, right? So we have three functions here. It's pretty self-explanatory, right? I don't need to explain to you what's going on here. What we will see on this side is what's happening with the call stack, right? So as the execution starts, you have your first function. Okay. First of all, you have sort of the general execution context, right? Which is sort of like the main equivalent of your JavaScript engine. Like if you've ever written C or C++ code, there's this main function, right? So that's this. You have a bunch of functions, nothing to do so far, nothing to execute. And then we actually come to a statement, right? And what do we want? We want to print the greeting. So that adds something to the call stack, right? So now we are going to this function, printGreeting. What do we do inside of printGreeting? We call the greet function, right? And we call it with some value, but that's not important. So when we call the greet function, one more thing gets added to the call stack. And what are we doing in the greet function? We are calling the joinWords function, right? One more thing gets added to the call stack. And now you hit return, right? So now we actually have to return something. So something gets popped from the stack, right? So now you are out of the joinWords function. Now you go to the return statement of greet, you are out of greet. You go into the next statement of your printGreeting. You do your console.log. And there's no return statement here, but it's the end of the function. So you're going out of this one as well, right? And you're back to your main thread. And that's it. That's how synchronous code runs, not asynchronous.
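The slide code is not in the transcript. A reconstruction of the three-function example, with the function names and bodies guessed from the description, plus an explicit callStack array that is only bookkeeping for this sketch so the pushes and pops can be printed:

```javascript
// Bookkeeping only: a real engine manages its call stack internally.
const callStack = ["main"];
const snapshots = [];

function enter(name) {
  callStack.push(name);
  snapshots.push(callStack.join(" > "));
}

function leave() {
  callStack.pop();
  snapshots.push(callStack.join(" > "));
}

function joinWords(first, second) {
  enter("joinWords");
  const result = `${first} ${second}`;
  leave(); // returning pops joinWords off the stack
  return result;
}

function greet(name) {
  enter("greet");
  const result = joinWords("Hello", name);
  leave();
  return result;
}

function printGreeting(name) {
  enter("printGreeting");
  console.log(greet(name));
  leave(); // no return statement, but the end of the function pops the frame too
}

printGreeting("FOSDEM");
console.log(snapshots.join("\n"));
```

The recorded snapshots grow to main > printGreeting > greet > joinWords and then shrink back to main one frame at a time, which is the sequence walked through on the slide.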
That's how synchronous JavaScript code runs, right? So far so good. Everyone with me? Great. Another example, like I was saying, if you've ever seen an error stack, essentially what you see is the call stack, right? You see a snapshot of the stack when the error happened, right? Or if you ever use the debug tool, for example, you are also seeing the call stack over there. Now let's look at something slightly different. What happens in this case, right? We all know what would happen without referring to the stack, right? It'll give hello, it'll give there, and then it'll give FOSDEM, right? But now there's sort of two things happening here. There's a mistake there. I forgot to add the time. It'll be there in the next slide. Just ignore that. But yeah. So what happens, right? Because our call stack cannot do two things at a time, right? And we know from experience that setTimeout is going to sort of run in parallel while the stack moves on to the next thing, right? So that's where the other things that we had in that previous picture come into play, right? So you have mainly three things here. You have your web APIs. In the browser, you'll have web APIs. If you're doing Node.js, you'll have what's called libuv. And towards the end, I'll also talk a bit about libuv. But for now, let's assume we are in the browser, right? So we are on the client side. You have these web APIs here. You have your task queue. And you have the star of the talk, the rotating thing, what's called the event loop, right? And we will go through it in more detail. But just to summarize, all the stuff that is sort of slower, right? That's not happening immediately. That's not being invoked immediately or running synchronously. That gets delegated to the web APIs, right? So here you have your DOM manipulation. You can make XHR calls. You can do setTimeout. If you were in Node.js, you could make a call to the database. Anything that's slow runs here, right?
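The example on the slide is not reproduced in the transcript. A version that matches the described output, assuming the three strings are hello, there and FOSDEM and using the two-second delay mentioned later in the walk-through; the output array is added here only so the order can be inspected:

```javascript
const output = [];

function log(message) {
  output.push(message);
  console.log(message);
}

log("hello");

// The timer itself is handled outside the engine (web APIs in the browser,
// libuv in Node.js); only the callback comes back to the call stack later.
setTimeout(() => log("FOSDEM"), 2000);

log("there");

// All synchronous code is done at this point:
// output holds "hello" and "there", and the callback has not run yet.
```

Run in a browser console or in Node.js, this prints hello and there immediately and FOSDEM about two seconds later.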
But the whole point of doing this is that after you run something, you want to do something back into your main thread, right? You are finally, you're running JavaScript. And you are making a call to some external system web or doing some delay. But finally, you then what, right? You need to do something. And that something is handled by the task queue, right? So take a simple example of a setTimeout. You have a callback and some delay, right? So the actual waiting for the time, say you put a delay of 300 milliseconds. So the actual waiting for 300 milliseconds is done by the web API. Then your callback goes to the task queue. And then it has to somehow go back to your main call stack and get executed, right? Because your callback is still JavaScript. I mean, we are assuming it's a simple callback here. Obviously, that itself could have another callback and then the process repeats, right? But let's assume for now we are just doing like a console or you know, callback, right? So that's pure JavaScript and it has to go back to the stack. And the event loop is what decides this part, right? So the event loop checks if the stack is empty, if the call stack is empty, as in if it's idle, then check the task queue. If there's something in the task queue, move it to the call stack, right? This is in a nutshell what's happening. This is in a nutshell how JavaScript manages to have asynchronous features, right? While still itself being single threaded and synchronous. So let's look at our function. And now we have, let's run this, right? So we have our main. We go to console log and we log hello. Okay. I don't know. We'll just, we'll not do full screen. Ah, I know what's not working. So that there's, there's a gif here of, of like a timer. So just, just assume that for now, right? But, but yeah, it's not important. We can still manage without it, right? So let's, let's go back. Let's start from the top, from the bottom, right? 
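To make the rule above concrete, here is a toy model of it, not how any real engine or browser is implemented: a call stack, a task queue, a stand-in for the web API that simply assumes the timer has already elapsed, and a loop that moves a callback onto the stack only when the stack is empty. All names are invented for the sketch.

```javascript
const callStack = []; // frames currently executing
const taskQueue = []; // callbacks waiting for their turn
const log = [];

// Run a function as one frame on the toy call stack.
function run(name, fn) {
  callStack.push(name);
  fn();
  callStack.pop();
}

// Stand-in for the web API: pretend the timer has already elapsed
// and hand the callback straight to the task queue.
function fakeSetTimeout(callback) {
  taskQueue.push(callback);
}

// A real event loop never ends; this one stops when no work is left.
function eventLoop() {
  while (taskQueue.length > 0) {
    if (callStack.length === 0) {
      const task = taskQueue.shift(); // first in, first out
      run("callback", task);
    }
  }
}

run("main", () => {
  log.push("hello");
  fakeSetTimeout(() => log.push("FOSDEM"));
  log.push("there");
});
eventLoop();

console.log(log.join(", "));
```

The callback is queued while main is still on the stack, so it can only run after main has finished, which is why its output comes last.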
So you have your main, you have your main function, then you call your console.log, and that gets executed. So it gets popped off the stack. You guys know this, but I'm doing this practice because I'm also going to ask you questions about this, right? So it's a nice thing to visualize for some examples, right? You have your setTimeout, right? setTimeout is not on the stack. It gets delegated to the web API, right? Which starts a timer. And then it knows that it has to run a callback when the timer is finished, right? While this is running, we move on to the next thing that we can do, right? In the stack. And then we have another console.log that gets executed. It goes away, as in it's popped. And then the main thread is free, right? And then after some time, so, you see that tick? There was supposed to be a loading thing there, right? To show that it's still counting. After some time, your two seconds are done. Your timer is over. You move the callback to your task queue, right? And then the event loop checks if the call stack is free, which in this case it is, right? It moves the callback function over to the stack and then it runs it, right? So there's again a console.log that gets printed and you're done, right? So to summarize, the JavaScript engine itself is synchronous and single-threaded. The JavaScript runtime is asynchronous because it manages the asynchronous things outside of the JavaScript engine, right? In a separate thing, right? Which is usually written in C++, either web APIs or libuv. And the event loop is the glue that ties all of this together, right? So wherever your thing is running, the callback goes to the task queue and then the event loop decides when to run these callbacks. All right. Before I go there, does anyone have a question so far? Yeah? During that event loop, are they green threads that are created? How does that work from lower level?
Sorry, can you repeat it? Are they green threads that are created then? In terms of the lower level, does it become multi-threaded at that moment? Yeah? You want me to give them the mic? Okay, yeah. So the question is, in the event loop, are there green threads that are created? The event loop does just one thing. It's a loop. It's basically a while (true). We will try to make a more complex model of it, but for now, for our understanding, it's doing just one thing: go to the queue, and if there's something in the queue and nothing in the stack, pull the first thing from the queue and put it on the stack. That's all it's doing. So there's no multi-threading. There's nothing. It's just a while loop. There is multi-threading in the other part over here, but in Node.js, not in the browser. All right. So, exercise time. This is a super simple function. Ignore all the boilerplate in the HTML; I was too lazy to remove it. All that HTML is doing is loading a script.js, and the script.js is over here. What is the script.js doing? Pretty similar to what you guys saw earlier. It has two console.logs, a setTimeout in between, and another console.log in the callback. So the first question is this: what's the output? Perfect. That's very simple; we just saw it. Second question: what will be the max number of tasks in the task queue? This is a little bit tricky. One. Two. Who said two? Why? Yeah, but console.log is just going to run on the stack. So, see, your answer is right, but the reason is wrong. There will in fact be two things, because reading the script tag itself is a task. In this case, the script tag points to a different file, so the browser, and I don't know how browsers do it internally, actually has to go to the location and read the content. But even if the script is inline in the HTML, it's still a separate task.
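The "it's just a while loop" model can be written down as a toy simulation. This is only a sketch of the simplified mental model from the talk, not how any browser or Node.js is implemented; `callStack`, `taskQueue`, `fakeSetTimeout` and `run` are made-up names, and the loop stops when the queue is empty instead of running forever.

```javascript
// Toy model only: real engines are far more involved.
const callStack = [];
const taskQueue = [];
const output = [];

// Run one function to completion, tracking it on the simulated stack.
function run(fn) {
  callStack.push(fn);
  fn();
  callStack.pop();
}

// Simulated web API: pretends the delay has elapsed and queues the callback as a task.
function fakeSetTimeout(callback) {
  taskQueue.push(callback);
}

// Stand-in for script.js from the exercise.
function script() {
  output.push('hello');
  fakeSetTimeout(() => output.push('callback'));
  output.push('bye');
}

// The script itself is the first task.
taskQueue.push(script);

// The event loop: if the stack is idle and the queue has work, run the next task.
while (taskQueue.length > 0) {
  if (callStack.length === 0) {
    run(taskQueue.shift());
  }
}

console.log(output); // [ 'hello', 'bye', 'callback' ]
```

Treating the script as just another queued task is what makes the callback run last: it can only be picked up after the script task has finished and the stack is empty again.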
And this is important to understand if you have to guess the output, because there's basically a race condition. When we come to microtasks, this will become a little bit more complex, and we have an example for that. But just keep in mind for now that the script tag is also a task. And that's the last part of the learning here. So this talk, I don't know if anyone read the description, is in three parts. There's an intro to what happens in the browser, then there's a deep dive into the loop itself, or, well, the task queue, and then the third part is Node.js. And we are done with the first part. So everything looks good so far, right? It's a little bit janky, they did some weird things there, but we get what's happening now, right? Well, not exactly. I mean, we do, but there are more nuances. So let's do a deep dive into what's actually happening inside the loop and how different tasks are handled. As the line says, not all tasks are created equal. This is the model we had so far: the task queue is a simple single queue, which had callback 1, callback 2, callback 3, whatever. And the event loop is a while loop. Each cycle of the while loop is, by the way, called a tick. That's just the term. You might have heard of nextTick sometimes; that's what it's talking about. Okay. I've been told I don't have a lot of time. Let's try to fit this in, because we are, like, barely halfway there. In reality, there are multiple queues inside your task queue. So what I told you earlier was a bit of a lie. And JavaScript, or, well, the JavaScript ecosystem, does not handle each task equally. Some are given a higher priority than others. Click events, for example. And this varies a bit from browser to browser, and it's different in Node.js, so don't take this as the Bible. This is just an example to show you, and it's also oversimplified, obviously.
But click events are given a higher priority than everything else. There is also something called the requestAnimationFrame callback queue. And what the F is that, right? So the browser, sorry, the JavaScript runtime, is also responsible for rendering. It's also responsible for drawing things on the screen, because the browser is doing that. And it doesn't do it on every tick, because that would be wasteful. A tick happens roughly, let's say, every millisecond. Not really, but let's assume that. Whereas if you have a 60 hertz screen, you only need to refresh every 16 or 17 milliseconds. So it's smart enough to understand that it doesn't need to render all the time, but it does need to do it at some point. And if you block the event loop, you're going to freeze your screen. That's the big take-home message. So let's look super quickly at an example. We have three frames here, and there's a rendering step happening on each frame. And now what we want to do is... So the time is up, but if you guys don't mind, I'm maybe going to take five more minutes. If you have questions, maybe hit me up later, but I at least want to finish this part. So you have the rendering step. Now say you want to run some logic in the rendering step, or related to rendering. You would just do a setTimeout or something, right? Say, run this logic at 60 hertz frequency. But that's not very good because, as we saw in the setTimeout zero example, the time you give to setTimeout is not guaranteed. It's the minimum time that it'll wait. If your stack is not empty when your time is up, then it'll wait for the next tick, it'll wait for the stack to be empty, and only then will it do something. So if you just ran this as is in a setTimeout, that'd be a very bad user experience.
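The claim that the setTimeout delay is only a minimum is easy to check. The sketch below asks for 10 ms but keeps the call stack busy for 50 ms; the specific numbers and variable names are assumptions chosen for illustration.

```javascript
const requestedDelay = 10; // ms we ask setTimeout to wait
const blockFor = 50;       // ms we keep the call stack busy
const start = Date.now();
let elapsed = null;

setTimeout(() => {
  // This cannot run until the busy loop below has finished.
  elapsed = Date.now() - start;
  console.log(`asked for ${requestedDelay} ms, ran after ${elapsed} ms`);
}, requestedDelay);

// Block the stack: the timer expires during this loop, but the callback has to wait.
while (Date.now() - start < blockFor) {
  // busy-wait
}
```

The measured time will be at least `blockFor`, never the requested 10 ms, which is why timers are a poor fit for work that must line up with screen refreshes.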
Because you might skip a few frames, you might have visual artifacts, and all kinds of weird things can happen. So that's why we have a separate queue for this, and that queue can be accessed through what's called requestAnimationFrame. And that's this one in the parlance of my coloring. So let's quickly summarize. The event loop is responsible for rendering frames, but not in every cycle. When it does render something, it takes everything in the requestAnimationFrame queue and runs it all together. So you have this, and then you have microtasks. And microtasks are basically promises. They're not really, but we are oversimplifying, and we are out of time anyway, so let's stick with that for now. The big difference with microtasks is that the microtask queue has to be emptied whenever it's run. Even if a microtask creates another microtask, that one also needs to get executed; you don't wait for the next tick. So let's look at an example super quick. In one tick, in one cycle, in which animation is also going to get rendered, you will pick up one of the regular tasks. You will do all the microtasks. If they queue more microtasks, you will do all of those too. Then you will go to the animation frame callbacks. If those queue more animation callbacks, you will not do them; you will wait for the next animation cycle. That's the big difference between these two. And again, as you can see, if you mess up microtasks, you can end up freezing your screen. You put in a while loop, or you do a task that calls itself, like a promise that calls itself, and then everything is going to get frozen. That's because of this. All right. I think we are out of time. This is maybe the one last thing I want to do, if you guys don't mind, and then we'll stop. So can someone tell me what would be the answer to this? Awesome. Can you explain it? Perfect. Yes.
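The difference between regular tasks and microtasks can be sketched with a timer and two promises. This is not the quiz from the slides, only a minimal illustration under the talk's simplification that microtasks are promise callbacks; the `order` array is an assumption added for inspection.

```javascript
const order = [];

// Regular task: queued on the task queue.
setTimeout(() => order.push('task: timeout'), 0);

// Microtask: queued on the microtask queue.
Promise.resolve().then(() => {
  order.push('microtask 1');
  // A microtask queued from inside a microtask still runs before the next task.
  Promise.resolve().then(() => order.push('microtask 2'));
});

order.push('script end');

// Once everything has settled, order is:
// ['script end', 'microtask 1', 'microtask 2', 'task: timeout']
```

The script runs to the end first, then the microtask queue is drained completely, including the microtask added along the way, and only then does the timer callback get its turn.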
And also keep in mind that the script itself is a task. Why would it be otherwise? So why not just put the setTimeout callback in the queue and execute it right away? You can't, because the script itself is in the queue, so the setTimeout callback will go afterwards. All right. That's about it. Thanks a lot. Hope you guys learned something. Thank you. |
jxr in /engine/ - coding in WebXR on a plane
Custom JavaScript subset open scaffolding to spatially and textually explore interfaces |
So this is more of an entertaining talk, let's say. It's a lot less technical, probably a lot weirder, I would say. And hopefully, I will go fast enough so that you can have all the crazy idea questions at the end. Even a little demo, I have some hardware with me, if you didn't try VR, at least I think today you should try this, if you feel like it. So that will be, like, say, just there where it's fresh. So I will, so those are my slides, it's a jiz and frail, yes, why not. I will start by saying that I gave a couple of workshops to kids, actually, on discovering VR, and that was pretty nice. I thought initially I would give workshops and give, you know, a roller coaster. Some kids asked me about murdering clowns on top of a roller coaster. I did not prepare this, I admit, but that was, yeah, interesting, different inputs. I think it's a really interesting way also to consider teaching, computer science, programming, because I was saying there are my Pine Watch, like there is a computer everywhere. That's just a computer. That's why it might sound weird, it might be a little bit outside of people's comfort zone, but in the end, you can program a computer, you can program that. So I will introduce myself a bit first. So my name is Fabien, Fabien Benetou, Utopia. I work mostly at the European Parliament as a WebXR consultant, so I just do VR and AR on the web, that's it, and I do it as a prototypist, submitting all the tips you had or the suggestion you had at the beginning of the day on the TypeScript, on quality code. I don't know any of that. As I was saying before earlier, chatting with somebody at the entrance, if my code runs during the duration of the demo, I'm happy. The next day, yeah, it's a bonus, but yeah, that's the length mostly. And I gave a couple of talks here at first, them on connecting exotic hardware, like the Watch or IoT, everything, of course, using JavaScript. And this one will be on how to do this on a plane in a VR headset. 
So my motivation behind this is that I have a bunch of notes. I have a wiki, that's the wiki, and I have, you can still hear me, a 2D visualization of it, but it feels flat. I don't want to have my notes in my Moleskine, I don't want to have my notes behind a thin screen; I want them everywhere, so I can play with them and organize the space. For this, I already had a couple of versions. I used Mozilla Hubs. So, a little show of hands: how many of you have tried VR at all? Okay, I would say half to two-thirds. How many of you have tried Mozilla Hubs? Okay, five, six. So it's a social VR experience. It's a pretty good solution, to be honest. It works quite well, and you can do this, meaning you can be in your headset with your hands moving, you can be on a laptop or on a mobile phone, and you can see this shared virtual space. It's using three.js for the 3D environment, and you can have PDFs, you can have your webcam, you can take screenshots; you can honestly do most of what one would want to do. So I did quite a few hacks or modifications of it. It's obviously open source, that's the interesting part, but it's not enough. So I did a bunch of hacks and explained a little bit how, for example, I turn the lights in my office on and off while I'm in VR, a director's kit on how to record with travelling shots, a bunch of different things, to the point that I made my own little toolkit on how to customize it. Yeah, that's not going to work here. You will have the link on the page, so imagine a bunch of functions related to Hubs to do all this. So basically the point here is that Hubs is a good solution. It works quite well if you want to do just social VR.
The thing is first of all, it doesn't work offline, and I started after the end of the pandemic to start to fly again, to go see family, and I thought that's one of the best use case for VR, like I don't like being on a plane, I don't know if you do, but I think if you're the pilot, it's fine, but otherwise as a passenger, it sucks. And one of the beauty of those things is you put the headset on and you're basically somewhere else. But I don't want to be in VR on a plane and do a roller coaster or something, I just want to code, I want to be there and be able to do something that I have agency over, I want to build stuff, I want to create things, and I don't want to be like this. I want to feel like I'm somewhere else. Then I started to build on that, including, let's say, managing the history in my browser. So I would take a snapshot of the different pages that I visited, be able to organize them. As you might be able to see, this doesn't look at all like Mozilla Hubs. So Mozilla Hubs is amazing, but it's a huge stack, actually, and it's a stack that depends on AWS, which I obviously don't have on my plane or the plane I'm flying with. So then I started to rebuild it from scratch, basically. One warning, also, when you saw to hear programming in VR, I had a discussion with somebody just yesterday about it, it might be productive, I would say, at some point, but that's not my point. I don't find that interesting, like I have a really nice desk at home, it moves and all, and I have my ergonomic keyboard, and I can move my 4K screen, I like it, but I don't have that on the plane. So the point is being able to move in space when I have the space, and if I don't, to be able to use the space around me in a really compact manner. So the point is, this whole argument is like, don't look at this presentation thinking, is it to replace the way I work today? That's not the point. 
It might be, if you're into it, but it's more about exploring what could be interesting in terms of interfaces. So yeah, words per minute: if you sit next to me with your laptop and we do a competition on how efficient you are, you're going to win. I mean, I hope so for you, but that's not the point. And I was actually able to make friends on the plane, because the person next to me was like, what is this guy doing with his keyboard on the tray? Plus there was no USB-C on it, so I had to plug in the adapter and put it on top of the device; it was not wireless back then. So now I'm pretty happy, I just received this little thing. It's a Bluetooth keyboard, and it's a mechanical one, which is way more efficient. Back then I didn't have this, so that was a good excuse to meet someone and chat about what it actually means to program in VR. The next step was actually tinkering with it. So I went from having the history of my browsing behavior, let's say, to moving content, and moving content by just pinching, basically, a page. That's a render of a page, not a web page itself. I don't know if you can see all the way from there, but basically with my right hand I'm able to move things around, and with my left hand I'm able to execute code. So that's, let's say, the main trick. If there is one thing you need to remember from this presentation, it's about moving things around freely or naturally. But you will also see that the sphere there is red, and then if I pinch on that piece of code, the sphere is blue. So I'm just moving things around, but I'm starting to manipulate text, that text is code, and then the code can change the environment you're inside of. And the one trick to do this, I will hide it away, but you can guess. How do you execute code in JavaScript? The one thing you're not supposed to use: eval. Because it's evil. I think that's the one use case where you can use it.
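The red-sphere-turns-blue trick can be sketched outside of VR. This is not the speaker's actual code; `scene`, `snippet` and `execute` are hypothetical names, and the 3D scene is replaced by a plain object.

```javascript
// A stand-in for the 3D scene (hypothetical; the real one is a WebXR scene).
const scene = { sphere: { color: 'red' } };

// The piece of text the user would pinch in the headset; here it is just a string.
const snippet = "scene.sphere.color = 'blue'";

function execute(codeText) {
  // Direct eval runs the string with access to the enclosing scope, so it can see `scene`.
  // Only reasonable for code you wrote yourself and run locally, as the talk stresses.
  return eval(codeText);
}

execute(snippet);
console.log(scene.sphere.color); // blue
```

Because the text is evaluated in the same scope as the scene, editing the string and executing it again is enough to change the environment, with no build step in between.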
It's fine when your string of text is actually code and you're not running a bank or something like this. This is my code running on my headset. I don't have anything that I can't accept to break, let's say, and even then it's executing locally. So that's the one reason, in my opinion, to use it. So the idea behind it is: oh, I can manipulate text. That text can be code. I can change that text directly while I'm in the headset, either by pinching it, like single letters, or with a keyboard to be more efficient, and then I can execute it, which is going to change the whole environment I'm inside of. The trick for it was basically what I like to think of as the escape valve, which is to say that, yeah, I could design an entire programming language, I could make a programming language for VR, but I can also rely on one I already know and that already has an engine in there, namely JavaScript. So that's a little, let's say, trick. I'll show a couple of examples, but that was an excuse not just to manipulate text and code, but then to start to make my own programming language. So how many of us here have made their own programming language? Yeah, I see, there are very few hands that are like this; most hands are like that. I'm like that too. Honestly, I feel maybe not ashamed, but I'm not proud of it either. What I'm trying to convey is that the endeavor, trying to do it, is super interesting. I don't think I've learned as much recently as by trying to do this. I also recommend, if you check the slides or the presentation afterwards: don't learn my programming language. I don't think it's a good one. It's a very interesting one for me, by building it, by seeing the limitations of my understanding of what a programming language is. But why I recommend this course specifically is that he's not going to say, let's do an alternative to C, which for most of us would be too far and not practical.
He's basically saying, well, JavaScript is a language, the browser or Node is an environment you can already work with, you can transpile, you can use Babel, and that's enough, let's say, to get the foundation of what a programming language is. So I really recommend this kind of course if you're considering making your own programming language. And also, I don't know how many of you are familiar with this book. Yeah, I don't think I can go back easily either. So yeah, if you ever try to learn the foundations of programming and programming languages, you often do it with Lisp, which works well because it's so compact that you can basically keep it all in your mind. But again, most of us are not familiar with Lisp, and even if we are, it's not what we work with on a daily basis. So, SICP: the JavaScript version has been out for a couple of years now, so if you want to learn the foundations of a programming language while being familiar with the language itself, I really recommend it. It's quite demanding, but I think it's a good investment. So then I'll show a couple of the features of the environment, but I won't show all of them or go too deep in there. What's interesting is that you don't have just one line of code; you start to have groups of lines of code, or a history, the stack of what you executed. Of course, when you're in VR, code is not everything, so for example, some of the code is about loading a 3D model. If you pinch it, you execute that line of code, you get a 3D model out of it, and then you can do it a couple more times. You get, I see that the contrast is not high there, but you get the history of what you just executed, then you can save it and execute it again. So you start to have the same, let's say, primitives you're used to. And then yeah, you can do shortcuts, you can visualize the groupings.
So for the people who twisted their hand instead of raising it, this is also where I'm at in the sense that I don't actually know if it's a fair thing to call this a programming language. Is it just like a bunch of utils, a bunch of shortcuts? I'm not sure. Again, what I find interesting in that process of making a programming language is, yeah, it's up to you. It's up to what you find interesting, it's up to what you find efficient. So yeah, I start, try to document it, stack back a bit, but yeah, all what it prompts me to think of is really, really interesting, really valuable. And also, if you don't go down that route, there is a subreddit for it, it's a subreddit for pretty much anything, but yeah, for R slash programming languages, where you can ask simple questions like this. Like what is the most interesting aspect of it? So of course, I had to omit that, so I display my issues from my repository in my VR environment so that I can move them like as if I was on a wall. I can display 3D models from, let's say a library of 3D models, this way I can organize it like tiles and then start to make a 3D environment to work out in off. I can do honestly this part, the graphs, I mostly did because a friend asked me during lunch, I was like, this is annoying, everybody find graphs sexy, it is looking good, but I'm not sure what's the point, so just okay, I'll just make a graph. So those are actually the pages of the wiki where I'm taking the notes out of the display there and they're all manipulable. 
Taking screenshots in VR, also: I think that's a great way to document a process. Of course you can have the code out of it, you can save it back to your wiki, for example, but for the movement, how things are going to be organized, how you want to, let's say, arrange the tiles to make the 3D world, or if you want to tell a story to someone who might not have, or know how to use, a headset, then just taking a screenshot, capturing it and sending it elsewhere is a pretty efficient way to do it. This one was just this morning. It's the same thing of managing issues, displaying the issues from the repository live, and I can again grab them and move them in places, but using Swagger. If you're not familiar with Swagger, it's a meta API, an API to manage all APIs. So when I load using Swagger, I can load from GitHub or from whatever is going to have a Swagger specification, and that means making them into, for example, manipulable blocks. It's federated, so it's social: if you have your own server, then you can connect to it, and we can be friends, we can exchange code also, so I can move the code in my environment, and you can receive it and execute it. One part that is kind of funny is snapping. So those are blocks. You cannot hear it, so you'll have to trust me, but when the blocks are close enough, you hear them snap. The point of those blocks, or rather of them not being just blocks, is that you can attach code to them, so then you can manipulate them literally like a Lego block. That's again the part of the interactions that becomes interesting. Yes, you can have a keyboard, and the keyboard works well, but if you only use the keyboard, you could just use a bigger screen, and VR is not that interesting for it. But if you can have both, programming with a keyboard and naturally, or let's say directly, manipulating with your hands, I think that becomes interesting again.
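The snapping behavior, where blocks lock together once they are close enough, could be sketched as below. The talk does not show how it is implemented, so the threshold, the block shape (`position`, `width`) and the rule of snapping to the right-hand side of another block are all assumptions.

```javascript
const SNAP_DISTANCE = 0.05; // metres; an assumed threshold

function distance(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// If `moving` is close to the slot directly to the right of `anchor`, snap it into that slot.
// Returns true when a snap happened (this is where a sound could be played).
function snapIfClose(moving, anchor) {
  const slot = {
    x: anchor.position.x + anchor.width,
    y: anchor.position.y,
    z: anchor.position.z,
  };
  if (distance(moving.position, slot) <= SNAP_DISTANCE) {
    moving.position = slot;
    return true;
  }
  return false;
}

// Usage: a block released about 2 cm away from the slot snaps into place.
const anchorBlock = { position: { x: 0, y: 0, z: 0 }, width: 0.1, code: 'loadModel()' };
const heldBlock = { position: { x: 0.12, y: 0.01, z: 0 }, width: 0.1, code: 'paintBlue()' };
const snapped = snapIfClose(heldBlock, anchorBlock);
```

Each block carries a `code` string here only to echo the idea that code is attached to the block; chaining the code of snapped blocks would be a separate step.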
So again, you need to imagine those are not just blocks, they are blocks of code, let's say code for Swagger, so that you can get information from an API or execute something through that API. Why do I also do all this with the blocks? Because I think that's a great way to learn. You can imagine the blocks running code that is abstract, let's say list things, but you can use that for chemistry: let's say it's a C and a C, or it's an H, an H, and an O, and you bring them together, and they snap when it makes a molecule. I think the same kind of thing applies to pretty much any field of study, so physics, chemistry, sociology: you have a set of atoms and a set of rules, and if you can have a certain organization, spatialized together, then you can learn this field, hopefully, in a more embodied way. Nope, I guess that's going to be it. I have a Twitter archive, because it's not working properly, but not for recent things. Well, okay. It's fine. All this, I think, is pretty interesting. The one thing I'm even more awkward about than calling the thing I've done a programming language is this. That headset is an Oculus Quest, or a Meta Quest. I don't have a Meta account. I don't think surveillance capitalism is a good business model, let's say, for a society, but the headset works really well. So that's pretty awkward. I really don't think it's such a good thing, and I call it an adversarial dependency, because it helps me to build more and explore more, but on something I don't want to rely on, because it's not aligned with my set of values. I still show all this because, yeah, you won't see the photo, but there are a couple of headsets, like the Lynx, that should be there in a couple of weeks. The point is, don't associate VR with just Meta or Facebook. It is a convenient one, but there are alternatives coming out. There is also SimulaVR. Meta is not the only way to do VR.
So, adversarial dependency in the sense that, yes, you can rely on it temporarily, as long as you have it just as a replaceable block, and as long as you can swap it whenever you get something that's more aligned with your values, which is much better. But that's it, yeah. The whole point of this is building your own scaffolding, be it a programming language, be it a toolchain, using VR, because I find that to be the most interesting interface. So that would be my recommendation for you: just make your interface, make your scaffolding, using VR or not. And if you want to try the VR one, I'll be around with it. Thank you. I think we have time for a couple of questions. Do you have any questions? I really like the VR experience on the plane, but have you thought about the incidents that may occur, like hitting my neighbor, punching him? No. Thank you. Yeah. So, I don't recommend punching anybody, but it's your code, so you define the interactions however you want. Basically, if you put an object, either a block of code or a hat, whatever, a meter away from you, you're going to punch your neighbor. But because you now have access to the code in there, you can just say, oh, I bring, let's say, everything within a 50-centimeter radius, and yeah, you can just do it on the spot. Again, it's using WebXR, so it's VR and AR on the web, meaning there is no, like, Unity build, there is nothing there. Everything is done directly in the headset. You can have your Node.js server, or even your Python server, whatever you want to run. So it means you can refresh the page and then you have your new content, the new way to handle it, or you can even do it without having to reload; you can fix it on the fly, let's say. |
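The "bring everything within a 50-centimeter radius" fix mentioned in the answer could look roughly like this. It is a sketch, not code from the talk: `pullWithinReach` is a hypothetical helper, positions are assumed to be in metres relative to the user, and objects are assumed to be plain `{ x, y, z }` records.

```javascript
// Returns a new position that is no further than maxRadius (metres) from the user at the origin.
function pullWithinReach(position, maxRadius = 0.5) {
  const dist = Math.hypot(position.x, position.y, position.z);
  if (dist <= maxRadius) {
    return { ...position }; // already reachable without stretching
  }
  const scale = maxRadius / dist; // shrink along the same direction
  return { x: position.x * scale, y: position.y * scale, z: position.z * scale };
}

// Usage: a block one metre to the right is pulled in to 50 cm.
const farAway = { x: 1, y: 0, z: 0 };
const reachable = pullWithinReach(farAway);
console.log(reachable); // { x: 0.5, y: 0, z: 0 }
```

Keeping the direction and only shortening the distance preserves the spatial arrangement of the objects, which matters when the layout itself carries meaning.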
Visualize the NPM dependencies city ecosystem of your node project in VR |
Okay, thank you for coming, and thank you to the organizers as well for organizing this FOSDEM and this JavaScript devroom. I'm David Moreno, I'm a PhD student at the Universidad Rey Juan Carlos in Madrid, Spain, and I'm going to present how to visualize the NPM dependency city ecosystem of your Node project in virtual reality. I would like to say that this is not a technical talk. It's just a prototype that I designed, and it's more related to academia. I'm a PhD student, so it's about showing some new things and maybe getting your opinion on whether this work is useful or not. So first of all, I'm talking about a city. How many of you know something about CodeCity? Please raise your hand. CodeCity? No? One, two, three, okay. So the city metaphor in visualization just takes the characteristics and layout of a city and represents something with them. And the city layout just means quarters and buildings. CodeCity is one of the best-known tools that uses this metaphor, for visualizing Java source code. In that case, each building is a source code unit of the Java project, and the quarter is the level of that source code in the package structure of the Java project. So we developed a new version of this city metaphor, a code city visualization using web technologies, which relates to the last talk that Fabien gave us. We use WebXR and WebGL and everything related to the web; in this case it's A-Frame. How many of you know A-Frame? Probably more, right? We changed this metaphor a little bit by changing the layout algorithm for the buildings. We use a spiral algorithm instead of a treemap layout, because we want to organize the buildings in a different way: the treemap layout is too fixed, and when you want to add a new building there, there are many problems. So in this visualization we have metrics that we can represent in the buildings and in the quarters.
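To make the appeal of a spiral layout concrete, here is a minimal sketch of a square spiral on a grid. It is not the algorithm used in the speaker's tool, which also has to handle buildings of different sizes; `spiralPositions` is a hypothetical helper that only assigns grid cells.

```javascript
// Returns `count` grid cells, starting at the centre and winding outwards in a square spiral.
function spiralPositions(count) {
  const positions = [];
  let x = 0, y = 0;
  let dx = 1, dy = 0;      // current walking direction
  let segmentLength = 1;   // cells to walk before turning
  let stepsTaken = 0;
  let turns = 0;

  for (let i = 0; i < count; i++) {
    positions.push({ x, y });
    x += dx;
    y += dy;
    stepsTaken++;
    if (stepsTaken === segmentLength) {
      stepsTaken = 0;
      [dx, dy] = [-dy, dx];                   // turn 90 degrees
      turns++;
      if (turns % 2 === 0) segmentLength++;   // every second turn, the side gets longer
    }
  }
  return positions;
}

// Usage: the first nine buildings fill a 3 x 3 block around the centre.
const nine = spiralPositions(9);
const ten = spiralPositions(10);
```

The property that matters for the talk's argument is that adding a tenth building leaves the first nine where they were, whereas a treemap generally has to be recomputed when an element is added.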
The quarters represent a tree layout, because you have quarters, then you can have quarters on top of quarters, and so on. The buildings have different metrics, like the area of the building, the height of the building, or the color of the building. We are using these metrics to represent the NPM dependency ecosystem of a project. So if you are at your computer right now, or on your mobile phone, I want to invite you to go to that URL or scan this QR code and visit this live demo, so you can follow the demo with me. So this is the demo, this is the city of, sorry, it's tinyurl.com slash varia.force them. That's the name of the tool that I'm developing in my thesis. Just a minute. With your mobile phone you won't have as much interaction as you can have with your computer, but you can get an idea and you can visualize the city. So this is what you can see in the demo. This is the dependency city of a project. In this case it's the user interface of SortingHat, which is a project under the GrimoireLab project. In this city, each building represents a dependency of the project. What we can see here is that, okay, there are buildings with different colors and so on, but the first thing that we can see is that the quarters are elevated. We can see here this big quarter, then this quarter, this quarter, this quarter, this quarter, and this quarter. The quarter means the dependency level within your application. And this first quarter, the big one, is like the bottom, the package.json of the GrimoireLab project, where you define the first level of dependencies, okay? So we have the elevated city; we have to be clear about this. The second thing that we can see in this city is where the buildings are. The quarter that a building is laid out on means that this building, or this package, is at that dependency level, okay?
For instance, this building is in the first level of dependencies, so you probably define this package in the package.json of your Node project or whatever. And these three buildings that are laid out on this level belong to the third level of dependencies of your project, okay? I'm explaining this because it's just a matter of explaining how the visualization works. And some buildings go through a quarter, which means that the building that goes through a quarter is the owner of the dependencies that are in that quarter. In other words, this package goes through this quarter, so this, this, and these are sub-dependencies of this building. Okay? Again, for instance, this package goes through this quarter, so these dependencies are sub-dependencies of this package. This has to be clear, because it's kind of the core of this visualization. And if you are in the demo, you have probably noticed that there are some buttons here, which are the metrics available in this visualization. These metrics only change the characteristics of the buildings, okay? So we are going to focus first on the third row, which is the height. I don't know if you can clearly see what it says here: age in days, okay? Height. This means the height of the buildings. So now the height of a building represents the age of the package in number of days. Okay? We have this numeric metric in height. For the area of the buildings, we have three metrics. Now we select this one, which is loc/age, meaning the lines of code of the package divided by the age in days. And you are probably wondering why it is divided by the age in number of days. Because if you multiply the height by the area of a building, of a box, you have the volume.
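The reason for dividing by age becomes clear with a small calculation: with height set to the age in days and area set to lines of code per day of age, the volume of the box comes out as the total lines of code. The function and field names below are assumptions for illustration, not the tool's API.

```javascript
// Maps package metrics to box dimensions as described in the talk.
function buildingDimensions(pkg) {
  const height = pkg.ageDays;           // height metric: age in days
  const area = pkg.loc / pkg.ageDays;   // area metric: lines of code divided by age in days
  const side = Math.sqrt(area);         // side of a square base with that area
  const volume = height * area;         // height * area gives back the lines of code
  return { height, area, side, volume };
}

// Usage with made-up numbers: 12,000 lines of code, 400 days old.
const dims = buildingDimensions({ loc: 12000, ageDays: 400 });
console.log(dims.height, dims.area, dims.volume); // 400 30 12000
```

An old package with little code therefore shows up as a tall, thin tower, while a young package with a lot of code is short and wide, even if both have the same volume.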
So, if you combine these two metrics by multiplying them, you get a third, computed metric, the volume, and in this case the volume of the buildings will be the total lines of code. But we can change this metric to size, the size in bytes divided by age. We can click on the metric, and then the visualization is updated. And the third one is the number of commits that the package has divided by the age in days. But this is probably not as important, because in this prototype we are focused on the color of the buildings. And now you realize why there are many metrics in the color row. So, by default, what you see as the building color is the type of license that the package has. In this case, we can see that there are many packages with a kind of purple color. And then, on the right, we have this legend that says which license each one is using. So, with this color and this city, we can have a quick overview of the packages that are using the same license, or see if a package is using a license that we don't want to have in our projects, so that's probably one of the goals of this color metric. Then, we have two metrics, times installed and times appears. These are not categorical metrics. These are numeric metrics. These metrics follow a heat map color from blue to red. And this is interesting because I think that you know the npm ecosystem pretty well. When you are installing a package with npm, if npm finds a dependency more than once, in the same version, npm will install the dependency once. We all know that. But if npm finds the same dependency defined in two different versions, then npm will install the dependency twice. So, this metric represents, as the color of the building, how many times the dependency is installed. In other words, how many times the dependency is defined in different versions.
And this one is how many times the dependency appears in the project, regardless of versions. For instance, we are developing something in GraphQL and we use GraphQL in many things. We define GraphQL in, I don't know, a package that uses GraphQL, but then we use another package that uses GraphQL as well. So, this represents how many times this package appears. How many times it is installed there. And these two metrics are closely related to the first row, which I call the attributes row. It's used for changing the transparency or the wireframe features of the buildings. Because now you notice that there are some buildings that are kind of transparent, which means that these buildings are not installed. These buildings are replicas, and only the solid ones are the ones that are installed. And if we go with our mouse or with our VR controller over a building, if we hover over a building with the mouse, we see here that this dependency is GraphQL, and then some other buildings are highlighted in white as well, but only one of them, this one, is solid. So, it means GraphQL is defined in many, many locations, but only this one is installed, although it appears nine times. We also have metrics related to the community behind the repository of the package. The first one is the last-commit days, which means the number of days since the last commit. So, in a heat map from blue to red, if the package is red, it means that this package has probably had no recent activity. For instance, we can see this package here, this is script 002, 2000 days since the last commit. So, probably this is kind of a smell in this kind of visualization.
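To make the two color metrics concrete, here is a minimal sketch in TypeScript. It follows the simplified model from the talk (one install per distinct version) and assumes the dependency tree has already been flattened into a list of name and version pairs. The `DepEntry` type, the `countMetrics` function, and the sample data are hypothetical names and values for illustration only; they are not part of Babia or npm.

```typescript
// Hypothetical shape: one entry per place a package shows up in the dependency tree.
interface DepEntry {
  name: string;
  version: string;
}

interface DepMetrics {
  timesAppears: number;   // every occurrence, regardless of version
  timesInstalled: number; // distinct versions (simplified: one install per version)
}

function countMetrics(entries: DepEntry[]): Map<string, DepMetrics> {
  const versions = new Map<string, Set<string>>();
  const appearances = new Map<string, number>();

  for (const { name, version } of entries) {
    appearances.set(name, (appearances.get(name) ?? 0) + 1);
    if (!versions.has(name)) versions.set(name, new Set<string>());
    versions.get(name)!.add(version);
  }

  const result = new Map<string, DepMetrics>();
  appearances.forEach((count, name) => {
    result.set(name, {
      timesAppears: count,
      timesInstalled: versions.get(name)!.size,
    });
  });
  return result;
}

// Made-up sample data: graphql is required three times, in two different versions.
const sample: DepEntry[] = [
  { name: "graphql", version: "15.8.0" },
  { name: "graphql", version: "15.8.0" },
  { name: "graphql", version: "16.6.0" },
  { name: "lodash", version: "4.17.21" },
];

const metrics = countMetrics(sample);
console.log(metrics.get("graphql"));
```

In the city, a package whose `timesAppears` is higher than its `timesInstalled` would correspond to one or more transparent replica buildings next to the solid ones.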
Also, we have the number of commits in the last year in the repository of the package, the number of unique committers in the last year in the repository of the package, and of course, the number of vulnerabilities that we can find with the npm audit tool as well. This visualization has the goal of merging all the information that we can retrieve about the package and then showing it in this city. And the last one is the issues ratio, which is just the number of open issues of the repository divided by the number of issues closed in the whole history of the package. So, this is the first overview of the tool. I invite you to play with it. I was really quick. Of course, where are we? In the slides there are plenty of QR codes with links to the documentation and to the step-by-step tutorial, in order to have this visualization with your own project, because what you need is just to have a project installed and then run a set of tools that I'm going to explain right now. But I said that this is for virtual reality. So, in academia, some researchers validated the city metaphor in virtual reality, agreeing that the notion of having a city on, like, a table or something like that helps the user, or the participant of the experiment or whatever, to get better information than using a computer or a 2D screen. So, that's the reason behind this research. So, in this case, we are using Meta Quest 2 glasses. This is an experiment, and we have in the left controller the user interface that we already showed in the demo. And then we can see that when we click on the buttons, the city is updated instantly. I didn't tell you, but there is a second row in the user interface of the demo, I forgot that. The demo has four different projects to analyze in this city. Right now it's Sorting Hat.
That is the project that is selected right now. But we also have the data from the PM2 package. If you know PM2, you can click on it. We can click on the PM2 project and then the city is automatically updated. We also have data from the Portainer user interface and the GitHub Desktop user interface, because it's developed in Node. So, we're going to focus on the data, just to finish. This QR code will redirect you to the repository where I have all the code that I used for producing the data, because this is really, really easy to replicate: what you saw here is just an HTML page using A-Frame, using Babia, which is the tool that I developed, and then a JSON with all the data. So now we're going to talk about how to produce the JSON. But you can produce the JSON however you want. So you can follow this tutorial or not. So the first step is to have your npm package or application installed, of course, because we are representing the npm dependencies of the application. The second step is to use npm to get the dependencies list and the vulnerabilities list. Yeah, just a simple question. |
Micro-frontends in React
Using Webpack Module federation to break free from monoliths in UI |
My name is Bipal and today I'll be talking about micro front ends. The title might be a bit misleading because it says in React, whereas it's general and can be applied to other libraries or frameworks. The main focus of my talk is Webpack's plug-in called Module Federation, which makes this paradigm of micro front ends possible. So yeah, what are micro front ends? Front ends for ants or something like that? No. So, oh yeah, before I dive into micro front ends, something about me. I work as a UI developer for IBM, on the OpenShift Data Foundation product. We use this extensively in our product, and that's how I got to learn about this feature; it's not something that was written or created by me. It's an open source project created by some guy named ScriptedAlchemy. I don't know his real name, but that is his GitHub user ID. So if you missed the description of my talk on the FOSDEM website, I'll basically be talking about going away from monoliths to this micro front end paradigm. Traditionally, what we would do is have this huge repository with all the code: let's say it's a 10-page web application, then you'd have all the code for all the 10 pages in the same repository. But to make it easier to decouple the teams that are working on these different pages, there is this paradigm where you can divide the pages into different repositories. So you would create different builds of these different pages and integrate them at runtime. So you're not stuck on the same release cycle as the other teams who are working on the other pages. So to explain it with a simpler analogy, I would use something like building a house or building a hotel.
So there is one team which would build, I guess, the core of the house, and then there's a team which would install the electrical wiring, and there's another team which would do the interior decoration. In the end, all these efforts are combined and you get a hotel. Similarly, you have a team which creates, say, the home page, another team creates the calendar component, some other team creates other component libraries that are used in your project, and you can seamlessly integrate them all at runtime. So that would help you avoid duplicating code, and basically gives you all the benefits that you would get by following the DRY principles and the SOLID principles. So more about Module Federation: it came as part of Webpack 5. The main concept around it is having different modules, the remote modules and the local modules. So let's say you have your host application, Application A, which ships with the home page. It has its local modules. Let's take React as an example: it would have your React components related to the home page. Now say you click on the About Us page; then what it would do is fetch modules from a remote location, a different web server where you are deploying the bundles related to the About Us page. Those are known as remote modules, and you would basically pull them into the runtime and create them over there, which would be handled by Webpack, obviously, and that is your module federation. So the problem that this helps solve is that monoliths are growing bigger and bigger. Once a monolith gets bigger, the build time increases, and there is a lot of dependence across the multiple teams that are working on the same repository. In the end, it makes things harder. If we go back to the olden times before microservices were there, you would have a bunch of teams working together and everything being deployed at the same place. That caused its own issues.
Now since the front ends are getting larger, that is bringing its own set of problems. So basically this is the issue that Module Federation aims to solve. So like I explained previously, let's say you have an application with a home page and About Us; then using this method, you could split the repositories, split the projects, build them separately, and combine them at runtime. So let's take an example of how a single-build application would work. First you send a GET request from the browser, and it would return the index file. From there, the script tag would ask to load the main bundle. And then if there are buttons or anything else that you use over there, those would also get pulled from there. I have missed some of the responses from the server. Sorry about that. But with multiple applications, it would be something different. You would have your web server A, which would be the host application server, where you would have your index file, the main bundle, and some of the chunks that are coming from the main bundle. And you would have your second web server, where you host your remote bundles. So this is a setup where you have one host and one remote application. So first, it would ping the host application. It would get the index file and pull the modules that are served from there. And then it sees that it requires something like a button bundle, a button component in the home chunk. Then it would basically ping web server B and get that. How it figures out that it needs to ping web server B, and where the button bundle actually is, is something that I'll go through after this slide during the code demo. So if you look at the requests that go out, similarly, it hits your main application server, then it gets the main bundle. And then if you see here, there's something different. It hits localhost 3002 instead of 3001, and it looks for this file called remoteEntry.js.
So let me go to the code and explain it a bit better than I have so far. All right. This is my remote application. Let me go to my host application. Is this visible? Awesome. So the most important part of this configuration is the plug-in part, which is the Module Federation plug-in. Here what we're doing is specifying the remotes. In the remotes, you can see that I specify remote module, and I specify that it is at localhost 3002 slash remoteEntry. This remote entry is a special JavaScript file. It does not have any code-related chunks. I mean, none of the application-specific code is present here. It's just a way for Webpack to expose the modules that you're trying to expose from your remote build. So you see that I specify that remote module is present at that address, so Webpack knows where to go to connect to the remote build. And something related to React and similar libraries is that I have mentioned that React and React DOM are shared. That means there are not multiple instances of these libraries being loaded into the runtime. And I also make sure that each is a singleton. That means only one instance of it is present, because if you have multiple instances of libraries like these, they would not work well together. Hence, you have to mention that it's a singleton. And similarly, if we go to the remote bundle, you see that here I specify the file name for the remote entry, and then I expose this component named Button. If you see here under source, there is this component named Button, which I have exposed through there. And in my host application, I go here and I import Button. And you can see here I import it from remote module slash Button. Webpack basically knows this is not part of your node modules, and that it has to pull it from a remote source, which it does. So let me try and run this, if we see here. And we get a warning, perfect. But we can always ignore the warning.
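The two configurations described above can be sketched roughly as follows. This is not the speaker's actual code: the option objects are written as plain TypeScript data with a small local type, so the sketch type-checks without Webpack installed. In a real `webpack.config.js` each object would be passed to `new ModuleFederationPlugin(...)`, available from `require("webpack").container`. The host name `host`, the path `./src/Button`, and the exact spelling `remoteModule` are assumptions for illustration.

```typescript
// Local, simplified types (an assumption for this sketch; the real plugin accepts more options).
interface SharedConfig {
  singleton?: boolean;      // allow only one instance of the library at runtime
  requiredVersion?: string; // optional version constraint for the shared library
}

interface FederationOptions {
  name: string;
  filename?: string;
  remotes?: Record<string, string>;
  exposes?: Record<string, string>;
  shared?: Record<string, SharedConfig>;
}

// React and React DOM are shared between builds and forced to a single instance.
const shared: Record<string, SharedConfig> = {
  react: { singleton: true },
  "react-dom": { singleton: true },
};

// Remote build: emits remoteEntry.js and exposes the Button component.
const remoteOptions: FederationOptions = {
  name: "remoteModule",
  filename: "remoteEntry.js",
  exposes: { "./Button": "./src/Button" },
  shared,
};

// Host build: maps the import prefix "remoteModule" to the remote's entry file.
const hostOptions: FederationOptions = {
  name: "host",
  remotes: {
    remoteModule: "remoteModule@http://localhost:3002/remoteEntry.js",
  },
  shared,
};

console.log(remoteOptions, hostOptions);
```

With a setup like this, the host code would import the component with something like `import("remoteModule/Button")`, and Webpack resolves that prefix through the `remotes` entry instead of `node_modules`.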
And you can see that there's a toggle button that comes up. And I would just like to show a few things in the network tab as well. Let's see that warning one more time, all right. So the first thing that it loads is the main bundle. Then you can see that there is the remote entry that comes from 3002. And only after that is the button actually pulled in. So it gets the remote entry, it sees what chunks are present over there, and basically pulls the button chunk from there. And you can see that the button is working. So it's a remote module that's coming in over here. Some useful links are the documentation for Module Federation, which is hosted on the Webpack site. And something that I would also like to mention is the OpenShift Console project, which uses this. I think we were among the first to use it. OpenShift Console is the user interface for OpenShift, the enterprise distribution of Kubernetes created by Red Hat. One interesting use case that we had is that we had one static application with a bunch of plugins that were all pushed into a single repository. So we had like eight products, along with the core application, that were shipped together, and all our release timelines were synced because of that. And that would cause massive headaches for the PMs of multiple products, because it would stop the release of another product and so on. So what we did was basically remove all the plugins from that repository, create different repositories for them, and combine all of them at runtime using this same Webpack plugin. So this has helped us tremendously to have our own release cadence and to send bug fixes whenever we want, and not be stuck with one timeline for all the products. So in the end, yes, it has helped us a lot. So with that, that's all from my side. Any questions? To what degree does this sandbox code and/or impact security?
It does impact security, because it gets access to the whole runtime. That's why you would do this with applications that you trust. And if it is something from a third party, it's usually safer to sandbox things. I wondered what happens if the server with the remote components is down but not the whole environment. I think that could lead to very weird user experience bugs. Yes, that would lead to a very bad user experience. So you'd make sure that the remote application is also working, but there are some other things that you could do. There are fallbacks that you can specify for dependencies, so that it would not crash the whole application. You mentioned that for now it's on Webpack. Would other bundlers have something similar? I am not sure about that. Thank you. Yeah, probably similar to the security question, but if you've got a system where users have to authenticate, and now you've got a web token that the different micro front ends potentially need to have access to, how is that kind of handled? I think we faced something similar when we were deploying our application. So basically what we did was have our own proxy in the back end. So instead of doing a direct CORS request like I did, like it hit 3002, we would put a proxy between the web application and the different back-end servers that we had. And the proxy would basically add all the credentials that you would require. I was just wondering how it impacts the performance on the server, because you have several instances of the code running at once. Sorry, I didn't get you. I was wondering how much it impacts the performance to split the code like that. Once the bundles are loaded into the runtime, it does not make a difference, because it's as if it's the same build once it enters. So the only difference is that, yes, the chunk loading operation might add a few network calls, and the performance penalty is limited to that. Yes, thank you. Thank you for the talk.
It was very interesting. Yeah, I was just wondering, well, it's something that I kind of want to verify, I guess. This would maybe also be something that you could very well use for scenarios where you would normally lazy load stuff, but then instead take it from another location, right? Because, I mean, otherwise it would affect the first paint, I would expect. Oh, could you repeat your question, sorry. So this would be more for scenarios where, for example, when you route to pages, you would normally lazy load things. In this case, it seems like if you load from a remote location, it's not instantaneous. So it would be better in those scenarios, and less so for when you load a page and want to show a lot of stuff on first paint, because it seems a bit delayed. Yes, that is there, because when you call it from another location, there is the added step of loading that chunk. So yes, if you have too many components on the initial page, you'd rather have them in the original host application and not load them from a remote. What's the difference between this technique and Angular's way of doing lazy loading? Lazy loading is basically delaying the loading of the chunk until it is required, whereas here you are loading it from an entirely different place. Angular would not allow that with lazy loading. If you try to do it from a different place, it would just fail. But this allows you to get it from other places as well. Yeah, I just want to correct that: there exists for Angular a specific way to lazy load from a different server, from another front end. But my question was, you said that you were extracting some code from your repository, and you also said that you wanted to share some libraries like React. How do you manage multiple versions of React?
If each team wants to upgrade to its own version of React, then you duplicate the React library many times, right? Yes. So here, let me just show you, maybe I've not mentioned it here. But you could have a thing called expected version over here. And with that, you could tie it to a certain version only. With that, you would manage the issue of different versions of React. Is there a question? Any questions? Sure. Thank you. Does the main project have to know the definition of the remote component beforehand, or is this completely independent? It could be known at runtime. For a better developer experience, what we usually do in a production environment is export the types of the components that are part of the remote module, so that the developer knows them ahead of time. But the components do come in at runtime only. But at compilation time of the parent, of the main project, it has to know the definition of the remote, right? If you're using TypeScript, then it would require types. But for JavaScript, it would just be a promise, right? And then it would resolve later on. Okay. About the dependencies, is it only about React? Can you use the expected version there for others, for example, I don't know, Material UI? Yes, you could share any kind of dependencies between different bundles. And as the remote module, could you use a module from npm, if you publish it, and use it there? So remote module usually should point to another build of a Webpack application that uses this particular plugin. Well, is that it? All right. Thank you so much. Thank you. |
Managing customization in UI library
How to allow customization in complex React components library. The example of MUI. |
Okay, so welcome everybody. My name is Alexander, and today we'll talk about user interface libraries and the problem of customization. To give you some context, I'm a React engineer with three years of experience. I'm working at MUI during the day, and during the night I work for Open Food Facts, which is a kind of Wikipedia for food products. At the end, don't hesitate to ask for stickers for both. So for the few who don't know what Material UI is, this is the homepage, so that you can get an idea of the components we provide. And we have a usual problem of user interface libraries, which is: what does the user want to do with it? One aspect is making internal tools or a small MVP. You work in an NGO, for example; this is one project I'm working on. I'm clearly not making a lot of effort, but you can see the colors are not the default ones, which is already a lot better than most of the projects using it. But you also have kind of real websites that do a bit more customization with the library, but all the components are from MUI. So you get a conflict about what is interesting for you, because for me, working out of the box is super important. I don't want to spend more than five minutes setting up a button. If you're paying engineers, they can spend a bit of time on it. Being beginner-friendly is super important for me, to get new contributors on the project, and for companies, so they don't spend time on onboarding. And about customization, I think the example before was clear enough. So I will present how I perceived customization across my learning journey. I started in a consulting company, and what you have to do most of the time is: the designer comes to set up some specific colors, and so you add a CSS specification, selectors. It has to be a bit more specific for some of them, so you add a class name. You want to be even more specific, you add another class name, and it works by magic.
And one year later, you learn that in CSS, the more classes you put, the more likely you are to override the properties, which is called CSS specificity, for those who are not aware. For a user interface library, this means only one thing: put all your styles into one CSS class name so that your users will be able to override them. So here is one of the simplest components to override: one class with a meaningful name, and developers can override it. That's not all for customization. Of course, you need to pass some attributes to your HTML, for example to, strictly speaking, disable your buttons. And you can allow a bit more with React, for example passing custom props. So that's all for me. If you want to customize the style, it's CSS. And if you want to modify the behavior, you will use props. That's basically what you learn after about one year of development. And for me, it was the time to level up and to join MUI. So I was thinking, yes, I will work on those tiny components. And the guys said, hmm, it will be slightly more complex: you will work on the data grid. I was a bit scared about at least being able to use it, but it's fairly simple at the beginning. You define what the rows are, a collection of objects, and the same for the columns. You pass everything, and the component magically does the stuff. But of course, such a component has a lot of features. And here is the problem of complex components: being able to manage them internally, but at the same time also to document them. Because if users were just taking your component and putting it into their website, it would be nice. But of course, they want to modify how the filtering is working, how the cells are rendered. And sometimes they have very bad ideas to test. So here is one of the first issues I got. It's complex because you can open a filter panel. In this filter panel, you can select on which column you will do the selection.
And they wanted to sort it in alphabetical order instead of the order of the columns. It makes a lot of sense. But the problem is, how do you allow developers to access this really specific rendering without having to break everything? If you're following what you learned during your first year, you say, okay, it's a behavior modification, so I add a prop. And then you get a problem about what your documentation will look like in about one year. So you need to find a solution. There are three main solutions, which we will talk about. The first one is headless, and basically it means you remove the problem. You say, I will not give you the components, I give you only the logic. So you provide options here, for example what the columns are, what data to render, and it returns all the utility functions to quickly set up filtering and rendering. So you're no longer responsible for rendering the select, and developers can do whatever they want in it. It's super customizable, because you do not deal with the components. It's clearly not out of the box, because even the quick start examples are more than a hundred lines. And beginner-friendly, yes and no, because you strictly give them something like LEGO pieces. They have a function, they know what it does, they can do whatever they want before, after, or with the results. But they are also responsible for passing all the attributes to the HTML elements, so they have to learn each of them. If you don't know that you need to pass, for example, a label ID to be ARIA compliant, there is no way to tell you, hey, you forgot this specification in the components. Another way to do it is to provide the basic components, which is the case of React Admin, which, like the name says, is an administration panel. Basically this is how you define your administration panel. You get a provider that wraps all your application, and you define resources, which define how the table will be rendered.
Most of the time, you don't want the default table because, like always, you want to do some customization, and it's easy. You define your own list, so here is an example, and you can specify, okay, this column ID will be text, the category is text also, published at will be a date, and so on. And if you're not happy about the default way to render a date, you just remove the date field, and you create your custom component with the smaller components they provide, and you can go as deep as you want. The idea, in fact, is that they provide the basic components, as small as possible, and the way to wrap them all together, which is the providers, and you get your application. And if you know this image, you know that the result can also be this one. So you know Material UI, so you will recognize the code. It's a form control with an input and a select, and just below, it's a form control, a label, and a text field. It's real code I wrote when I arrived, and it was not working. You can see the standard variant is not applied to the text field, and you probably already guess why: it's because the text field is already a form control with a label, an input, and a helper text, so without realizing, I was putting a form control into another one, and it did not work. The problem is not that it's not working, because it was my fault. The problem is that I did not get any warning in the console to tell me, hey, you're doing stupid stuff. But that's because people can take all the components they want and wrap them in the order they want, and you cannot handle all the stupid edge cases that developers could come up with, so you cannot prevent them from making a mistake. So here is a correction.
If you want an example, this is Ikea page, fairly simple, but there are already six ways of how not to do it, and in the documentation, developers already do not read the section on how to do it, so a section on how not to do it, it's likely that they will never read it. So customization is nice, because you can already override your components. It works out of the box: if you stick with the default ones, it's nice. But it's not beginner-friendly, because you cannot prevent errors. And there is another fact: if you want to do customization, you need to understand how the providers are working, and that's super hard, first, to document, and second, to explore. You cannot say, oh, I console log the props, and I see all the attributes that are available, and I guess how they work; you have to use some provider and consumer. So now the last one, which is the slots used at MUI. If you remember, the first example I gave about customizing a button was to add an end icon. We can apply the same thing for custom components. For example, I want to provide a custom filter panel, so I pass a filter panel prop. But we arrive at the same problem, an infinite list of props. So we add some context: we define the slots property, and we document in a specific place, these are all the components you can override. And that works, but this is a kind of basic filter panel with a lot of subcomponents, a lot of logic and edge cases that have already been solved, and you want to take advantage of all the issues we resolved.
So the idea is to also provide slot props. So you say: to my filter panel, which is kind of a box I can remove and replace with a custom one, I also want to pass some inputs. For example, here I want to provide the property saying the columns are sorted in ascending order, and the default component might support it. Of course, if you replace it with a custom one that does not support this prop, it will not work, which is a complex case, but this is currently the solution to the initial issue, sorting in ascending order. So it's customizable, because in the worst case you remove the component and replace it with a custom one. It works out of the box, because if you don't want customization, it's working. And it's beginner-friendly, because we have a single place for the interface between your code and our code, which is the props. So you remove the component, and you know the props passed to this component are the only responsibility we have towards you; you don't have to expect props coming from somewhere else, or a modification of the wrapper because it sees it has a new child. So which one is the best? I've already seen this question a lot of times on Twitter, and of course the answer is there is no best. It depends on how much time you are able to spend on your project and how custom you want it to be. And there is a last solution, which is that you can also not choose; it's an upcoming way to do it. So for example, for a filter panel, you can use slots to override it, so you have a way to say that the props are your only interface between the library and your custom components.
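The slots and slot props idea can be modeled without any framework. The following TypeScript sketch is a simplified illustration, not MUI's implementation: components are reduced to functions returning a string, and the names `Renderer`, `FilterPanelProps`, `columnsSort`, `DefaultFilterPanel`, and `renderFilterPanel` are hypothetical. Only the `slots` and `slotProps` naming is taken from the talk.

```typescript
// A "component" is modeled as a function from props to rendered text.
type Renderer<P> = (props: P) => string;

interface FilterPanelProps {
  columns: string[];
  columnsSort?: "asc" | "desc"; // hypothetical prop: order of the column selector
}

// Default slot shipped by the "library"; it already handles the sorting case.
const DefaultFilterPanel: Renderer<FilterPanelProps> = ({ columns, columnsSort }) => {
  const ordered = [...columns];
  if (columnsSort === "asc") ordered.sort();
  if (columnsSort === "desc") ordered.sort().reverse();
  return `<select>${ordered.join("|")}</select>`;
};

interface GridProps {
  columns: string[];
  // slots: replace a whole sub-component with a custom one.
  slots?: { filterPanel?: Renderer<FilterPanelProps> };
  // slotProps: keep the sub-component, but pass extra props to it.
  slotProps?: { filterPanel?: Omit<FilterPanelProps, "columns"> };
}

function renderFilterPanel({ columns, slots, slotProps }: GridProps): string {
  const Panel = slots?.filterPanel ?? DefaultFilterPanel;
  // The props object is the only interface between library and custom code.
  return Panel({ ...slotProps?.filterPanel, columns });
}

const columns = ["name", "age"];

// Out of the box: default panel, original column order.
const plain = renderFilterPanel({ columns });

// Common customization: keep the default panel, only pass a slot prop.
const sorted = renderFilterPanel({ columns, slotProps: { filterPanel: { columnsSort: "asc" } } });

// Full customization: replace the panel; it ignores columnsSort unless it implements it.
const custom = renderFilterPanel({
  columns,
  slots: { filterPanel: ({ columns: cols }) => `custom:${cols.length}` },
  slotProps: { filterPanel: { columnsSort: "asc" } },
});

console.log(plain, sorted, custom);
```

The third call shows the complex case mentioned in the talk: once the default component is replaced, a slot prop only has an effect if the custom component chooses to support it.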
You can use headless inside your custom components, to say: okay, I know what I will get as props, so I can provide you a hook that will transform them into helper functions. And if there are a lot of elements inside, you can also provide the basic bricks to build your filter panel, consuming, for example, the filter item props that are provided by the headless approach. So there is a way to say: okay, you can do whatever you want with a filter panel. If it's really common, you will just have to pass props. It's more general if you want, for example, to modify the DOM; that's when you have to mix everything. And if you want to do something completely different, you put ours in the trash and put in your own components. And that's all for me. Thank you very much. We obviously have plenty of time for questions, so who wants to ask the first one? Hi, first, thank you very much for creating the data grid, it has saved my job countless times. My question was more related to, oh yeah, I don't want to, okay, my question is related to another talk that we saw today about Penpot, which is basically an open source version of Figma. MUI has some design kits to use in different design tools. Are you guys looking at Penpot, are you thinking about creating a design kit for it? I would love to use it, thank you. I have clearly no idea, because it's a completely different team that is, okay, thank you. Hello, thank you for the presentation. A quick question regarding the design phase: did you consider backward compatibility? About what? Backwards compatibility with previous versions of Material UI. Yes, but what is the relation with backwards compatibility? So with each new major version of Material UI, there are usually quite a few breaking changes. So when you were talking about the considerations in the design phase, about being beginner-friendly and all this stuff, was backwards compatibility a thought at that stage?
I honestly don't understand how you want to provide backward compatibility for such a UI library. I don't understand what you're expecting to get solved by the library. So you usually expect backwards compatibility so that every time a major version is upgraded, your code does not break a lot, but that is not the case, unfortunately, usually. As long as you do not go from one major to another one, it's working. I mean, we follow semver, so if you're on V4, you can upgrade as long as you want and it will keep working. Okay, thanks. Thank you. While I walk up, I see a little bit of trash here and there, so please, even if it's not yours, pick it up and put it in the bin. Thank you. So you showed us slot props and slots. You said that it was basically the best trade-off between being beginner-friendly and customizable. I was wondering, what if you have really complex components, where the slot components also have slots themselves? Are you able to do this kind of cascade of slot props? There is this problem sometimes. It's problematic from a documentation point of view, because if you allow people to override the sub-slot, they will not implement all the edge cases of interfacing with the props provided, and after that they want to just pass props to the wrapper component, and of course it will break, because they did not implement the internal ones. So it's problematic, but I would say the best way to mitigate it is that you do not provide a slot for every component, just for the stuff that can be considered independent. For example, a filter panel is super easy, because you have an entire piece of software. It gets some filters as input, it needs to update their value, and that's all. It's super clear. You will not have sub-components you want to customize, except by writing them yourself. So it's really rare to have a sub-component inside another one for slots. Okay. And for the rest, you are using props, I suppose? Yes.
And I think for now, we do not do it because we don't need to reach this level of simplifying customization, but I think providing the sub-components to do it is maybe the best way. For example, we have a toolbar and the way to overriding is you provide a slot to define what is the toolbar and we provide all the default buttons you wish to put into your toolbar. Okay. Thank you. |
A practical approach to build an open and evolvable Digital Experience Platform (DXP) |
All right, welcome everyone, we're going to go, yeah, it's four, so we did a kind of a dry run this morning and we realized that there's too much content, so we're going to have to, you know, skip, we're going to do the introduction, it's going to be very quick. So this is where we work, it's called Mirai, blah, blah, blah, connect to the website, you're going to see what we do, we do basically consultancy on a number of different clients on development, architecture, digitalization, so on and so forth. This is us, you can tell who's who, these are our Twitter handles and this is the Twitter handle of Mirai, if you like to talk or if you don't like to talk, just feel free to share on Twitter. What we're going to talk about today, so this is the result of a number of lessons learned that we had on a number of projects where we tried to solve a problem that we see coming up over and over again, okay? So it's just, we just took stock of the situation, we saw what was around, all the mistakes that we made and we kind of condensed it into a presentation that we're going to give to you, so this is by no means an idea of giving you best practices of any sort because there's a lot of context that goes behind these kind of problems, but it's just the way we are approaching the problem, the way we are analyzing the problem right now and the way we are approaching it. And we're going to talk about this idea of digital experience platform. You will see that like behind the name, there is a thing that you will know that is this idea that you start to have technical ecosystems that are more and more complex and these technical ecosystems are done by a number of systems that need to integrate together and they need to communicate together and they need to give back to the user in some form or another on one channel or another, a number of information, a number of capabilities and interaction. So that said, what is the state of the art? 
What is the thing that is the trendy thing right now? It's kind of synthesized, but it's this idea of the MACH kind of technologies. There's an organization behind it, you may know about it or not, but it's this idea that the new architectures have four major characteristics. They're based on microservices; we're not going to go into the debate about microservices, what micro means and whether we need microservices or not. They're API-first, they're cloud-native and they're headless. We talked about headless; I think Alexandre talked about headless for Material UI in the previous talk. I think the two that you have to retain for this talk are these ones: that you have systems where you don't necessarily care how the information or the capabilities are going to be exposed, because it can be on a mobile app, or on a web application, or on a kiosk that you have in a store, or on something else. You just don't care. The system is capable of giving you these capabilities and data regardless of the channel that you use. And how do they do that? Precisely by having an API-first approach. So you don't have the interface, you have the capabilities. You can call any kind of HTTP REST service that is behind, the capability is there, and then you take care of the visualization, the user interface or the user experience somewhere else. Am I going too fast? No? That's okay? All right, but there's a kind of "yes, but" situation. So let's say that you start from a simple website. This is what happens most of the time: you have a relatively complicated, interactive website and you choose your framework. Here you have React, you may have a number of others, we basically don't care, but that's one of the possible starting points. And then you say, okay, but I would like to manage this content. So what do you do? You add a content management system. That is okay.
You connect your content management system to your front end, off you go. Now you have a number of editors that can manage your content. All right, sounds good. Then you want to add some commerce, because the company is doing well and they want to add commerce capabilities. Well, very good. You have a commerce engine that is headless and API-first, you add your commerce engine and then you start to connect it to your front end. But you also have to connect it to your CMS, because they need to share certain kinds of data. And this is the situation that you have. And then you have a lot of products. So you start to need a digital asset management system, a DAM, because you need assets for your mobile app, you need assets for your kiosk, you need assets for the web. And you can't store them here, because these systems are not made for this job, so you start to see a bit of their constraints. So you connect the DAM, and you connect it with the commerce, you connect it with the CMS, and then you connect it with your front end as well, because it's the ultimate client. Okay, you kind of see where I'm going with this, right? Search: you start to have a huge catalog, you need to add search. You add Algolia or Elasticsearch, we don't care, it's the same kind of concept behind. And then you connect Algolia, and you need to connect it with your commerce layer, you need to connect it with your CMS, you need to connect it with your front end, and off we go. Then you have a Netlify, because why not? And then at some point somebody says, you know what, I would like to do some customer relationship management with the data that I have, and then you start to add the CRM. And that's where you end up with this situation. Now if we remove all the capabilities that are provided by the systems, which is a huge step forward, because before you had to do all kinds of plumbing behind, you needed to reconnect your user interface to the system behind, it was very tedious and very complicated.
Now we have all those pieces of software that are capable of doing this by themselves, where you're basically left with this. And I was finding it funny what Alexander, I don't know if he's still here, was saying in the previous talk, I would say you go headless and you just remove the problem, where you would see that there's kind of a law of conservation of complexity, you just don't remove the problem, you just shift it somewhere else, you know, someone or something still need to take care of this. And then of course, because we're not stupid and we're a good intelligent developer, what we say is like, okay, this is not possible, we're going to create something that is going to aggregate this all together and we basically create a layer. But this complexity is still here. So you kind of have a tension between the fact of having these Lego blocks and the modularity and the fact that someone or something still needs to orchestrate all this kind of stuff. And it can get very, very messy and very, very complicated very, very quickly. All right, so this is the first problem, kind of big problem that we were trying to address. The second kind of problem is that if you take kind of a random e-commerce kind of interface, what we start to see more and more is that before it was just a matter of your CMS and your commerce, you had the catalog and the basket, these were coming from the commerce system and then your content was coming from the CMS. What you start to see right now is that if you take all the blocks that compose, let's say, a fashion e-commerce, in the same blocks, you have pieces of data that come from different systems and you need to aggregate those and you need to present them as if they were a unit because that's what the user cares about. 
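The aggregation layer mentioned here can be sketched as a small function that asks each system for its part and returns one unit to the front end. This is only a sketch under assumptions: the `Backends` interface, `loadCatalog`, and the in-memory stubs are hypothetical and stand in for real search, CMS, DAM and commerce APIs.

```typescript
// What the front end wants: one product, regardless of where the pieces live.
interface ProductView {
  id: string;
  title: string;
  imageUrl: string;
  skus: string[];
}

// Hypothetical contracts for the headless systems behind the layer.
interface Backends {
  search(query: string): Promise<string[]>; // returns product ids
  cms(id: string): Promise<{ title: string }>;
  dam(id: string): Promise<{ imageUrl: string }>;
  commerce(id: string): Promise<{ skus: string[] }>;
}

// The layer: search gives the ids, the other systems fill in each product.
async function loadCatalog(backends: Backends, query: string): Promise<ProductView[]> {
  const ids = await backends.search(query);
  return Promise.all(
    ids.map(async (id) => {
      // The three lookups are independent, so they can run in parallel.
      const [content, asset, offer] = await Promise.all([
        backends.cms(id),
        backends.dam(id),
        backends.commerce(id),
      ]);
      return { id, title: content.title, imageUrl: asset.imageUrl, skus: offer.skus };
    })
  );
}

// In-memory stubs so the sketch runs without any real service.
const stubBackends: Backends = {
  search: async () => ["p1", "p2"],
  cms: async (id) => ({ title: `Title ${id}` }),
  dam: async (id) => ({ imageUrl: `https://assets.example/${id}.jpg` }),
  commerce: async (id) => ({ skus: [`${id}-S`, `${id}-M`] }),
};

loadCatalog(stubBackends, "shirt").then((views) => console.log(views));
```

The complexity has not disappeared: it now lives in `loadCatalog`, which has to change whenever one of the systems behind it changes.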
I need to have a catalog, and the catalog is normally coming from the search, because the search is optimized, so you don't ask the CMS for it and you don't ask the commerce for it, because you want to do all sorts of categorization and classification. But then you don't show the IDs of the catalog, you show the product, and the product is made of the image from the DAM, some content from the CMS and the SKUs from the commerce system. So, again, you see where we're going with this, it can get very complicated very quickly. And the problem for us as developers behind is that every time you need to change something here, you need to make sure that everything stays coherent and consistent behind the scenes. And again, this can get very tedious very quickly. So, I hope what I said kind of made sense as an introduction for the context. You raise your hand if that was not the case, okay? Everything is clear? Sounds good. I was getting scared. Okay. So, we have the problem. Now what? And that's where I hand over to Bouba, who is going to show you the solution that we're applying right now. So, full disclaimer, it's an approach that I'm showing you. It's not a silver bullet, it's an exploration, so bear with us. It's not perfect, we're just trying stuff and trying to see where we can solve this problem. So, the first thing that I'm asking you is: you tell me which framework I'm using. Basically, I'm running on ports 4200 and 4201, and if you look at the two websites, okay, we are in JavaScript, I have to make a joke about CSS. I'm really bad at CSS, CSS is dark magic for me, so that's the reason why I don't bother with the logo, it's normal, it's okay. Just focus on the product here.
Could you tell me, on the right or left, depending on where you are, which one is built with Next and which one is built with Nuxt? No? Okay, so let's see. I will just open it. If I do an inspect and scroll a bit, one of them is based on Next, okay, and the second one is based on, voilà, Nuxt. So you will say, why am I asking that? Okay, so what if I tell you that these two pages use the same base code, and I just compile it, and then I get a component that works for React, that works for Vue, and it's not a component, okay. So do you know a tool that can do that? Nobody? Okay, so I tried. Okay, so let's go. So for this, I will be speaking about some tools that we are using to orchestrate everything. The first one is Nx. If you don't know Nx, it's a tool that you can use to handle your whole monorepo. It's a really good tool, because I don't want to set up webpack, guys, or anything else, so I use this kind of tool to handle all the configuration: handling webpack, handling Cypress, Storybook, everything is handled by this tooling. I just use that, and it gives you a good base when you work with your team, and really good standards. The second tool that we will be showcasing is Mitosis. Mitosis will be the base of how we build a component. Like I said, it's an exploration, it doesn't solve everything, it's just a base, and we are trying to explore. So does someone know something about Mitosis, did you hear about it? No? Okay, so to give you some context, Mitosis is a kind of compiler, a component compiler created by the Builder.io team. Builder.io builds a kind of CMS, I would say a completely visual CMS, where basically they have a lot of SDKs, because they would like to support all the frameworks that we have in JavaScript. So it's a lot of frameworks, so they hired multiple people, and then it was like, okay guys, you're hiring too many people, okay, maybe you
should find a way to be able to handle all this complexity. So they began to create a compiler for components. They use a common layer, like JSX, and based on this common layer, they compile it natively to React, Vue, Svelte, or a whole bunch of stuff. Basically they take the approach of LLVM, if you know about it: having a common layer that you can target, and then all the architectures can compile from that. So we'll be using that. Frankly speaking, it's not perfect, it's really at the beginning. What's really nice is that they're really pushing the boundaries of it, because of all the SDKs: if you use Builder.io, basically you're using Mitosis under the hood. They use it to build all the SDKs for the visual editor, so everything that you're using for React and Vue is based on that. So it's not perfect yet, it's improving all the time, but it gives you really kind of good code. I will not say it's perfect, a human will do better, yeah, but in the end you save time and energy, because you write once and you build everywhere. Okay, it's a dream since like 20 years, I know, but voilà, every 10 years we try to do it again and fail, maybe, but voilà, someone tries. So basically, if you take back what Maurizio said, we are going headless, so our goal is really to reduce the friction. We have multiple frameworks, and it's really difficult to go to a client and say, hey, you use React, you use Vue, you use Angular, and every time we're building the button, we're building the same thing. So by trying to find a common layer and using Mitosis, we can really focus on what I called most of the time the presentation components. It's not meant for really big components with charts, it's not meant for that, and don't try to do it, you will hate it, and you'll be like, this is shit. Okay, it's really for, I will say, common components that you will have, that
will be really simple, mostly UI based, not too complex, but kind of complex for some stuff, so you can really reuse them. Basically, if you're on React, the presentation components; we shouldn't say it anymore, since Dan said it's bad, but voilà, I use the term. Okay, so I will show you how we did it, what we have built, and all the advantages and issues that I got with it. So I'm using upstone, okay, and basically here on the left side, you can see it's just a standard application on Nx, so you have the apps folder, the libs, and then the whole bunch of stuff that you have every time in every project. The libs contain all kinds of libraries. What's cool with Nx is that you don't need to publish them, so you can really put them in the monorepo and it handles all the configuration. So when you import one in React, it basically uses tsconfig paths under the hood and makes the magic in webpack, and it just works. It's kind of like magic, but it's not too complicated under the hood: it's just tsconfig paths and bringing the import into webpack, or Vite. So basically we create a library, a core library, using Mitosis. If you open it here, you can see I have a core library, this is the Mitosis one, which I call core because it's the base for everything, for all the components we are using. Then I create a subfolder called UI. UI will be all the targets that I'm pushing to people, so I have the UI for React, and I have the UI for Vue. So let's first focus on the core. In the core I have a Mitosis config. Basically it's a configuration layer where I set up what I want to target and how I want to compile everything. You can see it's a bit messy, because like I said, it's not completely perfect, you have some edge cases, but behind it's just basically a config that you write and you say, okay, let me scroll, where you put your files, where you put the overrides, really important, I will explain it after,
and then the targets we would like to build. It gives you also some options: for example, you can set whether you want to compile to TypeScript, whether you want to run Prettier, already handled, or, for example, how you want to handle a strict act, and you have a kind of plug-in system. Basically it helps you to hook into the system when you compile. It's really basic, it's not like a big compiler, it's just basic, but it's already enough to fix some stuff that's not perfect. If I go a bit further here, to React Native, Qwik, Svelte, for Svelte you can see here, like, holy shit, what is this guy doing? Basically it's replacing stuff under the hood and patching stuff. Yeah, sometimes you have to do it, but voilà, it's just to show you the extent to which we can go. Most of the time you don't need that, this is really for edge cases, but you already have the code. Okay, so then you have the src. The src is basically just where we put all the code, so I will focus mostly on the components. The convention is really straightforward: you create a component basically with .lite.tsx, it's the convention that will be picked up automatically every time by the compiler, and then we compile it to Vue or React. So basically here I'm just taking a simple component. This one is more tricky, I will open another one, I will say an easier one, not this one, let's take this one. Okay, so here basically under the hood it's using a kind of JSX, but a more constrained JSX. If you know SolidJS, if you already played with it, no? Basically Solid is kind of like React, what React should have been if they wrote it now, and basically it uses a really constrained JSX system to make it more performant. And basically they kind of copied some stuff from Solid. For example, if you want to do a map, you don't do a map like in React, where it's just JavaScript. Yeah, okay, we know it's just JavaScript, but here you have
to use the For component. Why? Because it makes the compilation easier, basically to be able to say, okay, if I'm in Vue I use a v-for, if I'm in React I use a map, if I'm in Angular I use an ngFor. So the reason why they created these constraints is to make the compilation easier. If not, they have to play with the AST, the abstract syntax tree, and that just makes the complexity increase, so they try to make it simpler. And then they come with some common ground. For example, useStore: it's kind of like a hook, and basically this will be compiled to a useState, or something similar in Vue. So they give you some rules that you have to respect. It's really strict, it's not easy at the beginning, and that is what makes using this kind of tool complex, because you really have to think, as I would say if I was working in Java before, in an interface-driven way. What I mean by that is you have to think about the common ground between all the components and all the frameworks. You cannot just say, hey, I will write it like that, when basically this does not exist in another framework, or it's not compatible. You have some degree of flexibility, but it's not always the case, so you really have to find the common ground. But apart from that, it's kind of like just React, with some extra stuff. Okay, so basically you write this component, and it will be compiled, and you can see it in the output folder here. It will be compiled automatically to React, so if I take the, here. Why do I have client and server? It's because I'm already testing stuff. Whoa, five minutes, okay, I have to go straight. So client is basically where we put all the normal components, server is for when you do RSC and so on, I will not speak about it. So in components here we have the feature section. So basically, if I take this one, under the hood it looks really like something that you could have written yourself. Okay, the compilation is
not perfect, but it's going really well. The good thing with that is that it gives you also an escape hatch under the hood. The escape hatch is the override system. So if in some part of your application you need to have a component that's too complicated to compile, you can override it. And how do you do that? Basically it just follows the path here, and then you can say, at the compilation level, when you meet this import, replace the file by this one. And here, for example, I do it for Headless UI. I say, okay, when something meets the headless disclosure, basically under the hood it's switching to Headless UI, and because Headless UI works both in React and Vue, it works fine. When I have that, I copy it completely automatically to my UI library here, and basically I just create a library, a simple library, with the client, and I do an export, as simple as that. Tomorrow I can take it and publish it on npm, and it works. Then I take it, and I use it in my application directly. Thanks to Nx, I don't need to do npm, blah blah blah, it's already handled automatically. So to give you an example on Next.js, on the page here, I have my application. So basically I will now introduce the layer. Do you see, here we spoke about the components, but now we need a layer, because everything is interconnected. So we need a layer to be able to say, okay, I have a common layer where I can connect everything and handle it. This layer is called Uniform. It's a tool that we use and explore a lot with the teams, so let me show you. So Uniform is basically that. It's kind of like a visual editor that you get. It starts out like a CMS, but it's way more powerful than that, because we can connect a lot of stuff. So here basically I'm connecting my Next application directly with my components, I will show you in two minutes, and I'm also able to connect directly the data coming from multiple
sources if I want. So here, okay, to show you an example, here I have my sitemap, and I have what I call my FAQ mesh, it's just an example. The FAQ mesh is basically connecting to, I hope it works, connecting to my CMS. And you can see here at the container level, when I'm clicking on this one, it provides me a component called loops that allows me to make reusable stuff. And then, okay, the first component is the template, and for the template I can say, okay, I want to connect it to something. The title, the title is connected to something. What you see here is a node from my API, from Storyblok. How do I get that? Uniform comes with a lot of stuff. The first thing that it provides you is a component layer. All the components that you saw basically are recreated here, so you can take them, reuse them and use them as building blocks, like in a CMS. Then with the components you can create what they call a composition. A composition is basically just an aggregation of your components, or basically here I create pages. And then you have a third system called the data type. The data type is basically your injector. Do you see all this mess? So they provide you a tool to be able to connect, in one way, in a standard way, every system that you want. Here I will show you with my CMS. So basically here I'm configuring my CMS, and I can go and say, okay, I have multiple code, and the code is really simple, basically it's an HTTP request. I can say what the name is, I can say where I want it, I can create variables that can be reused in the visual editor, and then here I can get that and connect everything. So what you saw me using two minutes ago is basically a connection with that. This gives you the full extent, completely, when you want to have full power. But if you're lazy, sometimes it comes with an integration. They have some integrations: you go here, they have all this
integration already available, you just plug and play. They provide it in your system: you connect, you put your credentials, you have access, and you're ready to go. For example here, I set up Shopify, blah blah blah, and I will remove stuff, but basically here, if I want, I can connect Shopify. Then it provides me the information, so I can just reuse it and pick, for example, my product, magically, automatically, and then I get one endpoint where I can get the data. One last thing that I would like to speak about, and why this tool is also interesting, is personalization and testing. It's already built into the system. Because it's the orchestration layer, you don't need to add something like Optimizely next to it: it already handles a bit of personalization for you, it already handles a bit of testing for you, voilà. Do you have a question? Before the question, we'll say thank you. Thank you. |
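Looking back at the component compiler part of this talk, the reason a dedicated loop construct makes compilation easier can be shown with a toy. This is not how Mitosis is implemented; `ForNode` and `compileFor` are hypothetical names, and the emitted strings are simplified (for instance, the React output omits the `key` a real compiler would have to add).

```typescript
type Target = "react" | "vue" | "angular";

// A constrained loop description: no arbitrary JavaScript to analyze,
// just "for each item of a collection, render this tag with this text".
interface ForNode {
  each: string; // expression yielding the collection, e.g. "props.features"
  item: string; // name of the loop variable
  tag: string;  // element rendered per item
  text: string; // expression rendered inside the element
}

// One small, explicit translation rule per target framework.
function compileFor(node: ForNode, target: Target): string {
  const { each, item, tag, text } = node;
  switch (target) {
    case "react":
      return `{${each}.map((${item}) => <${tag}>{${text}}</${tag}>)}`;
    case "vue":
      return `<${tag} v-for="${item} in ${each}">{{ ${text} }}</${tag}>`;
    case "angular":
      return `<${tag} *ngFor="let ${item} of ${each}">{{ ${text} }}</${tag}>`;
  }
}

const featureList: ForNode = { each: "props.features", item: "feature", tag: "li", text: "feature.title" };

console.log(compileFor(featureList, "react"));
console.log(compileFor(featureList, "vue"));
console.log(compileFor(featureList, "angular"));
```

Because the loop is declared as data rather than written as a free-form `.map(...)` call, the compiler never has to guess what a piece of JavaScript means, which is the trade-off the speaker describes: stricter authoring rules in exchange for simpler compilation.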
Using the Firefox Profiler for web performance analysis
Capture a performance profile. Analyze it. Share it. Make the web faster. |
Hello everybody. Thanks for being here. I'm so glad to be here at FOSDEM. That's the first time in three years that we're here in person. So yeah, I'm so glad. So I'm Julien. I have worked at Mozilla for like 10 years now, and you can find me on Mastodon, you can find me on Twitter, you can find me on Matrix, you can find me in Paris. And of course, Mozilla, you know it well. We are responsible for Firefox. It's high performance, open-source, privacy-conscious. So I'm sure you already have it on desktop, you'll have it on your phone, so you link them together and share your history and your passwords. You know that already. Mozilla is also famous for MDN of course, for all your development fantasies. And for VPN and for Firefox Monitor, a lot of good stuff. But today I want to talk about the Firefox Profiler. So you can find these slides here, and with the QR code as well. And let's dive in. So this is what I want you to leave the room with today. First, I want you to know what a profiler is. I want you to know how you can capture and how you can share a profile. How you can share a profile with your colleagues, but also how you can share a profile with us at Mozilla in case you have performance problems with Firefox. And finally, we will do a quick tour of the UI so that you can get a first glance at how you can analyze a profile. So first, the Firefox Profiler, at least the UI, is just a web app. It's a React-based web app with no privileges at all. It uses a service worker so that it works offline. And it's also an open source project. You can find it on GitHub. You can contribute. You can look at the source code. You can even adapt it to your needs. So yeah, come and do some React-based contributions with us. So first, what is a profiler? So I was scared by this cat because it's so big here. So yeah, how do you measure web performance? You can measure it in three different places, basically. You can measure it locally on your computer.
You can measure on the CI with automated tests and benchmarks. And you can measure on your users' computers. This is called real user monitoring. But they also have different timing characteristics, right? Locally, you get the results right away. You profile, you look at it, you get a measure. On the CI, you get the result after maybe one hour. And with real user monitoring, of course, that takes a few days. And the profiler is the local one. So this has some advantages. Of course, you get the result right away. But this also has some drawbacks that we will dive into later. So basically, this is a tool. This is just a tool in your toolbox, right? It helps you analyze performance issues in your application or in Firefox. But here it's more for your application, for your JavaScript or CSS stuff. It gives you insights and clues. But this is really detective work. When you have a bug, and when you have a performance bug, that's the same. It's just detective work. You have to look at the problem from different angles with different tools. With console.log sometimes. And this tool can make things easier. But it's just a tool. And yeah, something I think is the most important with performance is that if you try to guess, you will guess wrong. Very commonly. Like, you know the web, right? You get some property on some DOM element and suddenly you get a full reflow of the whole page and it takes a few hundred milliseconds. And you can't really see that clearly in the code. Also with the JIT, with the JavaScript virtual machines: the JIT can optimize a lot of good things for you, and other things will just be very slow, and it's very difficult to know in advance. So yeah, do not guess. Try to measure, be it with the Firefox Profiler, with the Chrome profiler or anything, even with performance.measure, with timers in your code. But try to measure, because things are not always obvious, it can be confusing on the web, right?
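For the "timers in your code" option, a minimal sketch with the standard User Timing API could look like this. The helper `timeIt` and the label names are assumptions made for this example; `performance.mark`, `performance.measure` and `performance.now` are the standard calls, available in browsers and in recent Node.js versions.

```typescript
// Hypothetical helper: runs fn, records a User Timing measure, returns the duration.
function timeIt<T>(label: string, fn: () => T): { result: T; durationMs: number } {
  const startMark = `${label}:start`;
  const endMark = `${label}:end`;

  performance.mark(startMark);
  const t0 = performance.now();

  const result = fn();

  const t1 = performance.now();
  performance.mark(endMark);

  // Records a named measure between the two marks, so tools that read
  // User Timing entries can display it alongside their own data.
  performance.measure(label, startMark, endMark);

  return { result, durationMs: t1 - t0 };
}

// Usage: measure instead of guessing which part is slow.
const timedSum = timeIt("sum-loop", () => {
  let total = 0;
  for (let i = 0; i < 1000; i++) total += i;
  return total;
});
console.log(`sum=${timedSum.result} took ${timedSum.durationMs.toFixed(3)} ms`);
```

A single run like this is noisy, so it only gives a first clue; repeating the measurement and comparing it with a captured profile is closer to the detective work described above.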
So now I'll try to do a live demo. Hopefully that will work. I will show you how to capture a performance profile. Let's switch to that view. So this is a view of the profiler, but don't pay attention to the UI right now. I'm here just because I have a bug. A bug was reported lately: this is very slow when we do a panning here. Like when I pan here, it's not so slow right now because the resolution is smaller here, but it's still kind of slow. So I want to measure this. I want to profile this. So what I'm doing is first I'm going to the profiler website. And there is this handy button here. Just click here. This puts a profiler icon at the top with a bunch of settings. There's a settings preset for web developers here. Or you can profile the whole browser if you prefer. You can also edit the settings here. You can change the interval, for example, if you want to reduce the overhead, or you can change some features. I will maybe talk about that later if I have enough time. But now I will just start the recording. And I will run my scenario. Like that. And then I can capture. And I get a view into what happened here. We will dive into the UI a bit later. And I will zoom in also, because maybe it's a bit small for you. But I just want to show you that you can also upload the profile then. Of course, there can be personal information in anything you capture from your browser. There can be things about the URLs you visited, about the tabs that you currently have open. Maybe there are some tabs you don't want everybody to know about. And there can be paths on your local computer. So you can disable some of these things. So for example, I will remove, yes, the threads that I don't see here. I can upload here. The Wi-Fi works surprisingly well today. And then you have this handy URL. You can take this URL and give it to your co-worker. And this will open exactly the same view with the same profile.
And so this is very handy for discussing it with your colleagues. Some of you may have a different understanding of different parts of your code, and so you can have different views like that. You can also share this URL with Gecko developers. If you have a problem in Firefox, you can open a bug and put this URL there. And that will help the Firefox developers solve problems that you may find in the browser. And finally, because that's your data, you can also delete it. There is a delete button here. You can delete the data that was uploaded. Because otherwise, it's public. You need to understand that it's very public. Of course, you need to know the URL. But otherwise, it's public. There is no password, nothing. So you can delete it when you don't need it anymore. So now I come back to the talk. We'll dive into the UI a bit later. But first, I want to explain how you get good data. You need to isolate the problem as much as possible. Let's say you have a lot of tabs, lots of websites in the background that are running stuff. The problem you want to test should be isolated. Otherwise maybe there will be some performance differences just because you have some websites in the background. Or maybe you're building something in the background on your computer and that will skew the results. So try to isolate the problem. Try to have a computer that is not too busy at that moment. When you get the data, try to ensure that the results you get are the ones you kind of expected. You can look at the screenshots and see if this is the thing you actually wanted. And you can keep recording. There is not one truth. There are several truths here. If your system wasn't in the best shape at that moment, maybe you need to record again so that you have something that you can work with. So now, yeah, I want to go into the captured data. We have two types of data. The first type of data we capture is sampling.
So what we are doing is, every interval, every millisecond here, we look at what the current program is executing. So we look at the stack. Here for example, this is an example of a program that starts with main. This is the main function of a program, which calls A, which calls B, which calls C, C will return, et cetera. So this is like a normal program calling normal functions. And every millisecond we look at where we are. So here we are in A, no, we are in main, sorry. A, we don't see it here. But here we are in B. So we are in B, but we are also in A, right? And we are also in main. And then at this point we are also in B, in A, and in main. Here we are just in main, for example, and here we are in main, in A, in B, and in C. And then we take all this data and we squeeze it together, which is a technical term meaning that we aggregate the data. So we squeeze it, and we show this information as the call tree, like that. So that's basically the same. If I come back here, we can see that in main we spend 100% of the time, right? It's the main function of the whole program. In A we spend maybe 70%, and in B we spend maybe 50%. And that's what we see in the call tree then. Here, the one at the top, we spend like 98% in it. But we also spend 56 samples right in this function. So the difference between this column and this column is that this column is this function including everything it called, and this column is this function only, excluding everything it called. Another view is the flame graph. The flame graph is the same data, it's just more visual. The time we spend in one function is the width of a rectangle. The ones at the top are the ones where we are at one point. So very visually we can see where we spend time. So here we can see we do a lot of canvas rendering stuff. This is exactly the scenario I was playing earlier, by the way.
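The aggregation of stack samples described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Firefox Profiler's implementation: the sample stacks are made up, and the counts are keyed by function name only, whereas a real call tree is keyed by the full call path. "running" counts every sample in which a function is anywhere on the stack (the function including everything it called), and "self" counts only samples where it is the innermost frame.

```python
from collections import Counter

# Hypothetical stack samples, one per sampling interval (outermost frame first).
samples = [
    ["main"],
    ["main", "A", "B"],
    ["main", "A", "B"],
    ["main"],
    ["main", "A", "B", "C"],
    ["main", "A"],
]


def aggregate(samples):
    """Return (running, self_) sample counts keyed by function name."""
    running = Counter()  # function is anywhere on the sampled stack
    self_ = Counter()    # function is the innermost frame of the sample
    for stack in samples:
        for name in set(stack):  # set() so a recursive frame counts once per sample
            running[name] += 1
        self_[stack[-1]] += 1
    return running, self_


running, self_ = aggregate(samples)
for name in sorted(running, key=running.get, reverse=True):
    share = 100 * running[name] / len(samples)
    print(f"{name:5} running={running[name]} ({share:.0f}%) self={self_[name]}")
```

Because each sample has exactly one innermost frame, the "self" counts always add up to the number of samples, while the "running" counts overlap.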
So we see that we're spending some time on the canvas, and we are maybe redrawing too much stuff at that point. We can dive into that later also if we have time. I don't know how much time we have, okay. So yes, sampling is a view into the program that is currently executing. But we can miss things. Like you've seen here, because we look only every millisecond, we can miss some of the code, and maybe it's important, right? So in this case we see this one only once because it was a bit longer, but the small ones here we don't see. They can be missed. So that's why we can instrument the code. Firefox itself is instrumented for important things like the reflows, the restyles, or the GC, or things like that. But you can also instrument your own code with the User Timing API, performance.mark and performance.measure. You will see that in the profile. That's what we call markers in our little technical jargon; some other profilers call them events. And we get another view on what's happening in the program here. So for example, we see all the events here. This one is the mouse move event, because that's exactly what I was doing, right? I was moving the mouse while I was panning. So I get mouse move events that are pretty big actually. 30 milliseconds for one event, that means we spend maybe too much time there. We also see all the user timings, because the code of the profiler is instrumented to add a performance.measure any time we do some canvas drawing. But you can also see things that are happening inside Firefox. We can see, for example, the restyles here, the reflow here. We can see some GC. We can see when it's awake or when it's idle. So we can see all sorts of stuff, more or less complicated. We want to make this view a bit simpler for web developers. Currently you still have everything. And you probably need to ignore a part of this and just focus on what's important for you, like DOM events or restyles and graphics stuff.
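The difference between sampling and instrumentation can be illustrated with a small simulation. This is a sketch under assumptions: the timeline, the function names draw_canvas and update_tooltip, and the 1 ms interval are all made up for the example, and times are in microseconds to keep the arithmetic exact. In a browser, the role of `instrument` below is played by performance.mark and performance.measure, as mentioned in the talk.

```python
# Hypothetical timeline of calls: (function name, start_us, end_us).
calls = [
    ("draw_canvas", 0, 3400),
    ("update_tooltip", 3400, 3700),  # shorter than the sampling interval
    ("draw_canvas", 3700, 5200),
    ("update_tooltip", 5200, 5600),  # also falls between two ticks
    ("draw_canvas", 5600, 6500),
]


def sample(calls, interval_us):
    """Sampling: record which call is running at each tick of the interval."""
    end = max(stop for _, _, stop in calls)
    hits = []
    for tick in range(0, end, interval_us):
        for name, start, stop in calls:
            if start <= tick < stop:
                hits.append(name)
                break
    return hits


def instrument(calls):
    """Instrumentation: every call records its own marker with a duration."""
    return [(name, stop - start) for name, start, stop in calls]


hits = sample(calls, interval_us=1000)
markers = instrument(calls)

print("sampled functions:", sorted(set(hits)))
print("markers:", markers)
```

With these made-up numbers, every 1 ms tick lands inside draw_canvas, so the sampled data never shows update_tooltip at all, while the markers record both of its calls with their durations.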
But yeah, you get all the stuff here. So you can dive into how Firefox works. You get the animation stuff. I don't have animations here, but you can see CSS animations. You can see the target of the events. So yeah, markers are pretty interesting. So there are limits. As I said earlier, you're measuring on your own computer. And this also has drawbacks. Your computer, as a developer, is usually very performant, right? You have the best CPU and you have a shitload of memory. And so this means that the performance on your computer might be different from the performance on your users' computers. And also, just the act of profiling itself can skew the results. Because by inserting markers, for example, we need to lock the memory in some places to insert the data there. And we need to capture the stack sometimes. And this takes a lot of time. And so that can skew the results. We try to keep the overhead constant, but it's not always the case. So there are limits. And you need to sometimes take a step back. Sometimes what you're looking at is not an absolute truth, right? It's just a tool again. It's not absolute truth. You can take a step back. Maybe what I'm saying is not exactly the truth. Yeah, that's about it. So more about the UI now. I will show you how we can navigate in this UI. So let me put that full screen. It's not this one. Yeah, full screen. And zooming a bit. Okay. Come back to the first one. Okay. So what do we have at the top? I can't use the laser pointer, because using it reloads everything on this computer. So I don't use the laser. I use my finger. So at the top here, you have what we call the timeline. It's a chronological view of what's happening. At the top, you can see the screenshots. You can even hover over them and see what's happening.
So here I was on the first tab, and then I go to the second tab, and here I'm scrolling it and panning it. And yeah, that makes it possible to see visually what's happening and what your scenario was. And then you have a bunch of tracks here that are interesting. The first one here is the parent process. This is the process for the UI of Firefox. That's not really interesting in your case. There is also the localhost one here. This is localhost because I was running my UI on localhost so that I could get the names of the functions. They're not minified, because we don't support source maps yet. That will happen hopefully this year. But currently, we're not supporting that. So to see the functions, I'm profiling the development version of our application, which also has some implications because, of course, the development version uses the development version of React. And the development version of React is slower than the production version of React. So again, taking a step back is always a good idea. Knowing that, we can dive into what's happening. So I can dive into this part because that's where things are happening, right? There is nothing elsewhere. So here you have what we call the category graph, because the colors correspond to some category. Like the blue is DOM stuff. Yellow is JavaScript stuff. So here we clearly see that we don't do a lot of JavaScript. We mostly do a lot of something else. So DOM in this case is canvas stuff. I will switch to JavaScript actually. Yeah. And then we see that we are doing the fillRect stuff. So we see the stack when we hover, right? So we see this is a full stack of React stuff. And then at the end, we just call fillRect on the canvas API. That's where we are here. And so I can hover and have just a sense of what's happening. I'm also setting fillStyle here, for example. And I can click there and just select whatever is below.
So this is basically a way to navigate into this profile. There is some red stuff here. The red stuff means we spent more than 50 milliseconds waiting for the browser to catch up. That's probably a part where we want to zoom in, by the way. So we can do that. I can click there. I can zoom in here. And I can look at what's below. Below is where you can see all the details. So there are a bunch of tabs. The first one is the call tree. I already explained a bit of that. Then there is the flame graph. I can show the flame graph by clicking. It's better, right? This is something I was showing earlier. So I can also hover and see what the stack was and have an idea of what's happening here. The stack chart is a chronological way of looking at your data. So here we can see, for example, I think we can see that we're rendering twice. So we're doing the same thing twice. And I think we see that more here. Like for one mouse move here, I have two renders here. So that's how markers are handy, right? Here it's clear. It's clear that for one mouse move we do two renders. And we shouldn't do that. So I guess I found my problem here. I need to look at the code. Of course, that doesn't solve everything. But I have an idea of where to look in my code. It probably means that I changed some state on some React component and that triggers a new render. So we need to look at the code then and see what's going on. Because this is just a tool. This doesn't solve everything. But that saves a shitload of time in finding the right data. So let's come back to the slides. Okay. Some more advanced topics now. The advanced topics are less exposed for a reason, because they're also less finished, less polished. You can enable this stuff from the edit settings button that I was showing earlier. And one thing you can do is memory allocations. We have two types of instrumentation for that. And I think this is working only in Firefox Nightly.
One of them is that you can see every allocation happening in Firefox: allocations and deallocations, and things that are retained, that are not deallocated but should be. So you can see all these and where they were allocated. And another one is the JavaScript allocations. And that works also in release, I think. So you can have a look at the documentation here. One thing that could be interesting for you is that we have an importer for Chrome and Node also. So you can take the JSON from Chrome, put that in the profiler from the profiler homepage, and then you can do all the things I was showing you earlier. You can share it. You can zoom in, maybe better than in the UI of Chrome, and use some transforms that I didn't explain today, advanced stuff you can do with the UI. You can do that with Chrome profiles too. And that's a parenthesis: the Chrome tools are also very good. And they provide a different angle on your application, because it's a different browser engine. So between Firefox and Chrome, it makes sense that they don't work the same, so they won't show the same thing. It's interesting to use both tools, and the ones in other browsers too. But you can import them into our tool and share with your colleagues. So that's pretty interesting. And you can also compare profiles. There is some documentation in our docs. That makes it possible to see the impact of a change. Like, you think you fixed the problem, you can compare the profiles before and after and check that you actually fixed your problem. Because as I said earlier, you need to measure. Sometimes you think you fixed it and you didn't. Or maybe you fixed one part of it. Like you fixed one state change, but there was another one elsewhere. And so there is still an update happening. So my conclusion now. Again, the most important thing is: measure, don't guess. I hope you'll really remember this sentence. This is just a tool in your toolbox. And yeah, sometimes you need to take a step back.
You can also use a profiler to debug. I didn't insist much on that. But because you can instrument your code, you can also use it to just debug your code. It's not just for performance. You can just get a view of what your program looks like, of what's happening in your program. And finally, you can share profiles with your team, especially when you're distributed like we are at Mozilla. You can share it, put it on Matrix or IRC or whatever you're using, and go to a Zoom call and then talk about it together. And thank you for being here at that time on a Sunday. So you can find the Firefox Profiler website here. Documentation also here. We have a Matrix channel. So come tell us if you have questions. If you have a question about a profile, or you want help to analyze it, come here. Ask us. We will gladly help you. And you can find these slides here with this QR code too. So I'm happy to answer questions now if you have questions. Thank you. So the microphone is here. There is a question there. I am very curious how you ensure the security of the profile information, because if I am on my usual admin account and I'm debugging and all the network information is there, it will have all the tokens and everything. All the API responses probably have the API keys, because I'm debugging stuff. And somebody nefarious with a strong Russian accent could be waiting and just iterating over all the possible URLs all the time. How do you deal with that? So let me repeat the question. What about the security? Because you can have API keys in the captured data, right, for example. So yes, that happens. But when you share a profile, you can uncheck the "include URLs" option, and that will remove everything related to URLs. So you won't have that anymore. We don't capture headers and all that currently. We just capture the URLs. Even in the network configuration? Yeah. Okay. So it's not like the normal. Thanks for the talk. Small question.
I believe there is kind of the same tool in the performance tab of the Firefox Developer Edition DevTools. Is there any difference between the two programs? Good question. I wanted to mention that and I forgot. There used to be a different performance panel in DevTools. We removed it, and now it's been replaced by this one. So we still have a performance panel in DevTools, and it's the same thing, except that when you open your DevTools, you have some overhead. So our recommendation is that you use the pop-up like I've shown here. Okay. So the general recommendation is to use the external profiler. Okay. You can also profile Firefox on Android, and you can profile remotely like that using the DevTools. Small question. Are there plans to add support for flame graphs generated from perf, for example, for native applications, just to use it as a kind of viewer for other perf data not related to the web? Yes. So we have an importer for Linux perf, actually. We have some documentation about that. There is also the fantastic tool by Markus, who is somewhere here, called samply. It is very well integrated with the profiler to profile native applications too. So it's called samply, S-A-M-P-L-Y. You can compile it on your computer. It's made in Rust, and it's just an amazing tool to profile your native applications. So you can try that. Hello. Is there a way to profile a full-stack application? I mean the front end and the back end at the same time. You can't do that at the same time currently. What you can do is profile the front end, of course, but probably not the back end. Not Python, for example. There is a tool called FunctionTrace, which I don't know well, that uses the profiler to profile Python applications. I don't know much about it, so I don't want to say too much about it. So currently we can't do what you're asking directly. So sorry. Last question, and then we'll close the room, after of course all of us pick up all the trash that would be left, if any.
Thank you for the talk. Are there any plans to add support for importing pprof profiles, which some profilers output? So can you repeat that again? Sorry. Are there any plans to add support for pprof profiles, which for instance Go produces as an output of its profiler? You mean to have output of... no, I think I didn't get the question. Sorry. The output in the pprof format, which is for profiling, so that we can view pprofs with this viewer. Okay. We don't really have a plan, but it's easy, I mean, it's not really easy, but it's possible to write converters for our format. Our format is very well documented. So we have converters for Linux perf, for Valgrind's format called DHAT, I think, for ART trace, for Chrome. And currently that's all, but we also have external converters for ETW from Windows and, I think, JFR for JVM stuff. So it's possible to do converters. Now we just need somebody to do it. Sorry. But if there is enough interest, maybe you can look into it, or at least help somebody do it. Thanks again, and I don't know if you're going to be hanging around outside if you want to chat. And there are more stickers. And there were stickers. Thank you. |
Hardening Kernel Subsystems by Architectural Capabilities |
Hello, everyone. I'm Zahra. I work for Microsoft as a part of the Linux Systems Group, LSG. This talk is going to be about an ongoing project that we have on hardening the Linux kernel with architectural capabilities. To be clear, this is not a product by Microsoft. It's more of an exploratory project: we want to see if this hardware feature can be used for security issues that we want to fix in the Linux kernel, and also, as a part of this process, whether we can find some new vulnerabilities, attack vectors, and things like that. So I'm going to start with a very brief background, an intro on CHERI and the state of Morello Linux, and then some of my work on capability-based hardening, and future work and opportunities where I'm really hopeful that the open source community can help us. So the big picture problem is that, as you all know, operating systems are really complex. We have millions of lines of code with a lot of complex abstractions, and basically any form of proper hardening for a kernel is still an open problem. We have all kinds of software-based approaches, like control-flow integrity, compiler-based techniques, fuzzing, and all of these approaches are helpful, but we still see lots of vulnerabilities. Many of them are memory safety problems, but some are also logical problems caused by this complex monolithic structure of the kernel. At the same time, the Linux kernel also has different security subsystems. We have a combination of, for example, LSMs, DAC, and sandboxing techniques such as seccomp and eBPF, and in this complex stack we still don't have a clear solution for the proper integration and hardening of these security subsystems themselves. At the same time, we also have a lot of hardware security features that are being added to our hardware platforms.
For example, just on ARM, we have a combination of coarse-grained privilege separation techniques, like TrustZone TEEs. At the same time, we have finer-grained memory safety features, like pointer authentication and the memory tagging extension, and on modern hardware we can have, for example, resource domain controllers. All of these hardware features are there, but our operating system doesn't really use them in a fundamental way, with a principled approach. So this is the big picture: lots of problems that we have both for hardening the kernel and for using these hardware security features properly. And CHERI is one of these fine-grained features, both for memory safety and for scalable compartmentalization. It has a really long history at the University of Cambridge. I think about 14, 15 years of research is behind it. And the concept of capability-based security models is not new. We have it even in the file descriptor abstraction on Linux. So basically, it means having an unforgeable token of authority for accessing any kind of sensitive object. But the novelty of CHERI is that you have this hardware-software approach for bringing this memory safety concept both to your hardware architecture, at the instruction level, and also to your software, so you really have the opportunity to redesign your systems based on that. So they have these extensions on MIPS, on RISC-V, and recently on ARM. And I think their most complete example system is based on FreeBSD. The Linux one is a new one that mostly ARM folks are working on. So what's Morello?
Morello is basically the new experimental development board for adding CHERI to ARMv8. It extends basically the entire instruction set with new registers, and it also extends the existing system registers for ARM. It basically introduces 129-bit pointers. So every CHERI pointer has, besides the value, a whole set of metadata that contains its bounds, the boundary of the memory region, the object type, and the permissions for any kind of access through that pointer. And ARM also added this notion of controlled non-monotonicity. As you know, ARM has at least six execution levels, EL0, EL1, EL2, and at the same time the ones in the secure world. So somehow this notion in Morello should be extended to all of these execution levels. That's why they added it through a new set of exceptions, new Executive and Restricted modes, privileged execution, and also some unsealing operations that I'm going to describe later. So for every pointer, we have this permission set, where basically, in hardware, you can say what kind of access permissions this piece of memory should have. It can have load, execute, or store, or even more complex access controls: for example, if you want to make a region immutable through sealing, or if you want to have system access controls, or software-defined access controls, if you want to have your own custom permissions defined for that pointer, that capability.
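The properties just listed (bounds, permissions, sealing, and the rule that derived capabilities can only get narrower) can be made concrete with a toy model. This is a sketch for intuition only, written in Python with made-up names and addresses; it does not reflect the Morello encoding, its instruction set, or any real API. On real hardware these checks are done by the processor, and a failed check raises a capability exception.

```python
from dataclasses import dataclass, replace


class CapabilityFault(Exception):
    """Stands in for the hardware exception raised on an invalid capability use."""


@dataclass(frozen=True)
class Capability:
    base: int             # lowest address this capability may touch
    length: int           # size of the region in bytes
    perms: frozenset      # e.g. {"load", "store", "execute"}
    sealed: bool = False  # a sealed capability cannot be used or modified
    tag: bool = True      # validity tag; an untagged capability is unusable

    def restrict(self, base, length, perms):
        """Derive a narrower capability; any attempt to widen is refused."""
        if self.sealed or not self.tag:
            raise CapabilityFault("cannot derive from a sealed or untagged capability")
        if base < self.base or base + length > self.base + self.length:
            raise CapabilityFault("bounds may only shrink")
        if not frozenset(perms) <= self.perms:
            raise CapabilityFault("permissions may only be removed")
        return replace(self, base=base, length=length, perms=frozenset(perms))

    def seal(self):
        """Return an immutable, non-dereferenceable copy of this capability."""
        return replace(self, sealed=True)

    def check(self, addr, size, perm):
        """Check an access the way a load or store through the pointer would be."""
        if not self.tag or self.sealed:
            raise CapabilityFault("invalid or sealed capability")
        if perm not in self.perms:
            raise CapabilityFault(f"missing permission: {perm}")
        if addr < self.base or addr + size > self.base + self.length:
            raise CapabilityFault("out of bounds")
        return True


# Hypothetical root capability and a read-only buffer derived from it.
root = Capability(base=0x1000, length=0x1000, perms=frozenset({"load", "store"}))
buf = root.restrict(0x1100, 0x40, {"load"})
print(buf.check(0x1100, 8, "load"))
```

In this model, `buf` can be read within its 0x40 bytes, but a store through it, an access past its bounds, an attempt to widen it back to the root's region, or any use of `buf.seal()` raises `CapabilityFault`.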
And the interesting thing is that this permission for accessing system registers, which you can define on these capabilities, has a behavior that is not really affected by your hypervisor mode, HVC calls, SMC calls to the secure monitor, or supervisor mode. So as you can see, besides the notion of capabilities, ARM had to change several of the system registers, including the control registers. We have new registers, for example, for bounds, for converting pointers to capabilities. We have to have a new program counter, PCC, for example, instead of PC. But at the same time, the execution levels, EL0, EL1, EL2, and EL3, should all be aware of the concept of capabilities as well. So most of the control registers have also changed. You see, for example, CTLR is now a capability-based CTLR. And it's a similar thing, for example, for thread IDs and things like that. So, for example, the Morello Linux thread struct, instead of the traditional thread ID, has capability-based thread IDs, or Restricted-mode thread IDs. You can find most of these details in the technical manual. Similarly, as I said, we have a new set of exceptions, basically capability-based exceptions, for any faults that you get from permissions, from accessing capabilities, from setting boundaries right or wrong, and things like that.
So, as I said, it's basically a lot of details, and you can find them mostly on the CHERI site and the Morello project site. All of it, especially the ARM one, is open, so I'd really like anybody in the community to go and check it. So, about the state of Morello Linux: the maintainers are mostly from ARM. They are really doing a very good job. In a very short time, they have a stable environment for Linux development. If you go and look at it, you see that they have already enabled most syscalls. They already have distros, like Debian. And even if you don't have the development board, you can just use their FVP, the Fixed Virtual Platform, which is basically a good emulator. And the whole system is really ready for experiments, both from the user space and for kernel development. Also, from their side, they have made most of the main memory-management modifications for adding the capability-based architecture, and things like that have been added. The main problems that I'm going to discuss now are from the security perspective. They can come from the intersection of user space and kernel space security, their interactions, their shared memory, and things like that. So, for example, in my experience, I first started with enabling some of the security features on Morello Linux, and the experiment was really easy. I was just trying to, for example, get the TEE stack, the TEE driver, running, adding trusted keys, and checking whether eBPF is working properly on Morello. And in most of the cases, when you want to add these features to Morello Linux, the issues that I was seeing were minor issues.
So, basically, mostly pointer mismatches. In the current architecture, where they have a pure capability-based ABI, most of the issues when you enable these features come from traditional pointer abstractions that you need to convert to capability-based abstractions. And, for example, there was the time when I was working on enabling uaccess. As you know, uaccess is where you have all of the Linux abstractions, the functions for communicating with the user space, passing pointers, passing shared memory, and things like that. And so, this required adding low-level capability instructions to Linux. And after we did that, changing put_user, get_user, and things like that, the kernel basically breaks in several places. But the breaks were mostly like this one: for example, here, in inotify_user, it says that this pointer is an integer user space pointer, so it's not a capability. So, basically, we need to dig out what kind of pointer it is, whether it's an address or just an integer pointer, and things like that, and try to use the CHERI abstractions to convert it, in a secure way, to a capability. And, for example, this one was a kind of large patch that still needs more cleaning up. But after fixing this, about 50 files but a very small number of lines of code, we have capability-based user space access back ends for Linux.
Actually, the good thing about this process is that you can dig out whether there is a vulnerability. For some of them it was not just a CHERI thing: the pointer mismatch was something that could become an issue in the future, a memory vulnerability or things like that, or, for example, whether they had the boundaries right, things that can go wrong even if we don't have CHERI. So, the good news is that you have a lot of helper functions from the compiler for using CHERI, for getting and setting boundaries, and for converting capabilities to pointers and vice versa. So, that's good. The other thing is that the current state of Morello Linux is that there's a main root capability, and basically every other capability is generated from that. So, what I'm working on is basically adding more finer-grained capabilities for sealing, for making especially the sensitive parts of the user space or the kernel immutable after the operations are done. So, basically, we need to add more root capabilities for the user space, and a specific capability for sealing and making both the kernel subsystems and the user space subsystems immutable. And we also need to make better use of the concept of software-defined permissions in CHERI. So, for example, some of the custom permissions that are added on FreeBSD are the permission on syscalls and the permission on, for example, virtual memory. This lets them define sandboxing environments, sandboxing abstractions that are basically backed by CHERI. This is a really useful thing that, for example, can be useful for eBPF or seccomp-based syscall filtering.
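The idea of a software-defined permission that gates system calls can be sketched as follows. This is a toy model in Python and an assumption-laden illustration, not the FreeBSD or Linux mechanism: the permission name `SW_PERM_SYSCALL`, the `syscall_gate` function, and the handler table are all hypothetical. The point is only that the kernel can refuse a system call when the calling code's capability does not carry the custom permission, and that a sandbox is created by deriving a capability without it.

```python
# Hypothetical name for a software-defined permission bit on a code capability.
SW_PERM_SYSCALL = "sw_syscall"


def syscall_gate(caller_perms, name, handlers):
    """Dispatch a syscall only if the caller's capability carries the syscall permission."""
    if SW_PERM_SYSCALL not in caller_perms:
        raise PermissionError(f"{name}: caller capability lacks {SW_PERM_SYSCALL}")
    return handlers[name]()


# Made-up handler table standing in for the kernel's syscall table.
handlers = {"getpid": lambda: 4242}

# Permissions of the code capability for trusted code, and for a sandbox
# derived from it by dropping the software-defined syscall permission.
trusted = frozenset({"load", "execute", SW_PERM_SYSCALL})
sandboxed = trusted - {SW_PERM_SYSCALL}

print(syscall_gate(trusted, "getpid", handlers))
try:
    syscall_gate(sandboxed, "getpid", handlers)
except PermissionError as err:
    print("blocked:", err)
```

Because permissions can only be removed when a capability is derived, code running with the `sandboxed` permissions in this model has no way to regain the ability to make system calls.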
So the other thing we are working on is what the better combinations of these software-defined permissions are for Linux and sandboxing. It would be really useful to get feedback on that from the eBPF folks and from the people who are working on sandboxing, to see if we can add these kinds of abstractions and integrate them properly into the Linux security subsystems. As I said, the goal of this project in the end is that we want to use this hardware feature, and similar hardware features, for protecting the Linux security subsystems: LSMs, eBPF, and namespaces. So basically this is a very open area that can really benefit from the open-source community being involved. Besides that, one of the things that is at a really early stage is the whole system stack, from the hypervisor to secure kernels and trusted execution environments. Now, for the first time, we have the option to integrate fine-grained memory protection and scalable compartmentalization features into, for example, a trusted execution environment. There's a huge area of attack vectors in the interactions between the secure world, the normal world, and the kernel environment, and all of these system stacks now have a way to protect us from secure RPCs passing pointers, which is basically the main attack vector for all TEE environments. Now we have the option to redesign these stacks based on these fine-grained security features. So, to summarize, Morello Linux is really ready for the open-source community, especially, to get involved.
And there are a lot of open problems on which we're looking forward to getting feedback, especially from the part of the Linux community that is working on the security subsystems, both on hardening the kernel itself and its security subsystems and, at the same time, on adding more compartmentalization and sandboxing tools based on these kinds of fine-grained features. Because in the end, Linux privilege separation is really coarse-grained, and most of the problems that come from these huge, intricate stacks can be solved if we have a better abstraction for privilege separation and compartmentalization. Also from people who are working on debugging and tracing: that's also a huge open space for capability-based systems, how we do it securely, how we do it properly. Now you have the option, for example, to define secure domains that are given some capabilities for secure debugging. And then you don't need to worry about shutting down, for example, secure boot or other security features of your system just to do debugging. Because, as you know, debugging is by nature an insecure operation. So I'm happy to take any questions if you have any. And let me know if you're interested in working on this project. Yeah, thank you for the talk. |
Hybrid Networking Stack Demo |
Okay, ready for our next talk? The next talk is by Maryam, and she's going to give us a hybrid networking stack demo. Thank you. Hi, everyone. My name is Maryam Tahhan. I'm a software engineer at Red Hat. And today I'm going to talk to you about a concept I've been researching that I've coined hybrid networking stacks. So if anybody has better names, I'm open to suggestions. So what I'm going to do is introduce what a hybrid networking stack is. We're going to talk a little bit about an open source project called Cloud Native Data Plane, or CNDP, that gives us an example of such a networking stack, or at least some components of it. We're going to have a live demo, with a star there, because we're going to cross our fingers and toes and pray that it all goes to plan. After that, I will try and sum up what we discussed, and hopefully there will be some time for Q&A at the end. Okay, so what is a hybrid networking stack? Well, it's a networking stack for applications that want to take advantage of the XDP hook, and AF_XDP in particular, without having to reimplement the full networking stack in user space, but rather lean on the existing Linux stack. It relies very heavily on the concept of control plane and user plane separation. So parts of the stack can run in user space and other parts of the stack can run in the kernel. Even if they're part of the control plane, they can run either in the kernel or in user space, and the same goes for the user plane: you can run stuff either in the kernel or in user space as part of that networking stack concept. This concept relies very heavily on the principle of classifying traffic into application-specific traffic and non-application-specific traffic. Application-specific traffic is redirected to the user plane, and non-application-specific traffic is redirected to the control plane to be handled.
So in that way, applications only really need to process the types of traffic that they're interested in. And what's really important then is that you filter this type of traffic as early as possible in your networking stack. So if your NIC hardware supports that filtering, you can take advantage of that. If it doesn't, then you can always rely on eBPF at the XDP hook to do that level of filtering for you. So in the example I'm showing here on the slide, you can consider FRR and the Linux networking stack the control plane. FRR is just an open-source routing protocol suite for Linux. And then on the user plane side, you would consider the CNET graph from CNDP your data plane, or user plane, for this demo. The CNET stack that comes with CNDP, and I'll just talk about it for a minute before we dive into the next topic, is based on the graph architecture from VPP. So with VPP, the concept was that you could build your whole application, or the parts of the stack that you want to leverage, using a graph. And then your packets are processed by traversing each node in this graph. They're processed in batches as well, to keep your instruction cache relatively warm, and you get all the performance benefits from doing all of that good stuff. So the CNET stack is based on the exact same concept. And obviously, as your packets traverse the nodes, they're either terminated as part of that stack, forwarded on, or dropped, depending on the decision that was determined previously by the control plane piece for your application. So let me introduce CNDP to you folks. CNDP, or Cloud Native Data Plane, is an open source framework for cloud native packet processing applications. It's built on the performance principles of VPP and DPDK, but it doesn't have any of the resource demands or constraints, as it's completely abstracted from the underlying infrastructure.
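The graph idea, where a whole batch of packets visits one node at a time and each node decides the next node per packet, can be sketched in a few lines of Python. This is a toy illustration only, not CNDP or VPP code: the function `run_graph`, the node names, and the dictionary-based packets are all assumptions made for the example.

```python
from collections import defaultdict


def run_graph(nodes, entry, batch):
    """Push a batch through a graph of nodes, one node at a time.

    `nodes` maps a node name to a function that returns the name of the next
    node for a packet. Names that are not in `nodes` are terminal outcomes.
    """
    pending = {entry: list(batch)}
    finished = defaultdict(list)
    while pending:
        name, pkts = pending.popitem()
        if name not in nodes:              # terminal: tx, drop, local delivery
            finished[name].extend(pkts)
            continue
        handler = nodes[name]
        next_hops = defaultdict(list)
        for pkt in pkts:                   # the whole batch runs the same code
            next_hops[handler(pkt)].append(pkt)
        for nxt, group in next_hops.items():
            pending.setdefault(nxt, []).extend(group)
    return dict(finished)


LOCAL_ADDR = "10.0.2.1"  # hypothetical address of the router itself


def ip4_input(pkt):
    return "ip4_forward" if pkt["ttl"] > 1 else "drop"


def ip4_forward(pkt):
    return "local" if pkt["dst"] == LOCAL_ADDR else "tx"


nodes = {"ip4_input": ip4_input, "ip4_forward": ip4_forward}
batch = [
    {"dst": "10.0.3.2", "ttl": 64},   # forwarded on
    {"dst": "10.0.3.2", "ttl": 1},    # dropped
    {"dst": "10.0.2.1", "ttl": 64},   # terminated locally
]
result = run_graph(nodes, "ip4_input", batch)
```

Running one node's code over many packets before moving to the next node is what keeps the instruction cache warm; a per-packet walk through the graph would reload each node's code for every packet.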
It's also written completely using standard Linux libraries. So what CNDP gives you is really three things. The first thing it gives you is a set of user space libraries for accelerating packet processing for your application, cloud application, or service. The second thing that CNDP gives you is the CNET graph as part of the hybrid networking stack, and also a netlink agent that's capable of communicating with the kernel to retrieve relevant information, like routing information and so on. And the last thing that CNDP gives you are the Kubernetes components to provision and manage an AF_XDP deployment, more so than just a CNDP one. Those components are the AF_XDP device plugin, which provisions the netdevs that you want to use for AF_XDP and advertises them up to Kubernetes as a resource pool that your pods can then request when they come up. And then you have the AF_XDP CNI, which essentially plumbs your AF_XDP netdev from the host network namespace into the pod network namespace. So just one last point on CNDP before we move on: it actually supports multiple packet IO backends, not just AF_XDP, but for the purposes of this hybrid networking stack we've focused in on AF_XDP itself. Okay, so it's nearly demo time. So, excuse me. So what am I going to show you? I'm going to show you the CNDP FRR vRouter that we built. Originally, I set out to see whether I could build some sort of a hybrid networking stack application that could accomplish DPDK-like speeds, but completely leverage the kernel smarts. And so the scenario we came up with was that we would have two clients, client one and client two, residing in two different networks, network one and network three, and they're interconnected via a pair of vRouters, which learn routes using OSPF. So for the demo we're going to bring up four Docker containers: client one, CNDP FRR one.
We actually call this container CNDP FRR two, but for the purposes of the demo, I'm only going to run FRR in it, just to show full interworking. And client two will then be our last Docker container. At the start of the demo, we're just going to bring everything up. No FRR will be running, no CNET stack will be running. And so when we try to ping from client two to client one, we're going to see nothing happen. And then we're going to bring up all the components in turn, see the routes being learned, hopefully have a successful ping, and maybe even run an iperf session between client one and client two also. So if we just zoom into this CNDP FRR node for one second, I just want to show you one thing. We can see here that it's going to have two veth interfaces, one connected to net one and the other connected to net two, and these are here. We're going to inject an eBPF program on the XDP hook that's going to filter all UDP traffic to the CNET graph and non-UDP traffic to the Linux networking stack. So one of the other things I'm going to show you is that we're not going to see ICMP traffic traverse through CNET. And then when we run iperf with UDP traffic, we're going to see the actual traffic flow through CNET also. So here we go. Let's just check that we have nothing running. Yep, that's fine. And I presume everybody can see the text. Okay, cool. Okay. So all the script is doing is setting up the four containers and the relevant networking between them right now. We can ignore the permission denied; pretend we didn't see that for now. So we see we have four Docker containers here: Client 1, Client 2, CNDP FRR1, and CNDP FRR2. And if we try to ping Client 1 from Client 2, essentially nothing happens. Okay, so let's start up our FRR agent on CNDP FRR1 as well as the CNET graph. So, sorry about the formatting. It looked a lot better when I was presenting.
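The filtering decision described for the demo, UDP to the CNET graph and everything else to the Linux networking stack, can be mirrored in plain Python. This is only a sketch of the decision logic, assuming untagged Ethernet frames carrying IPv4; the real filter is an eBPF program attached at the XDP hook, and the function names and the "user_plane"/"control_plane" labels here are hypothetical.

```python
import struct

ETH_HDR_LEN = 14      # destination MAC (6) + source MAC (6) + EtherType (2)
ETH_P_IP = 0x0800     # EtherType for IPv4
IPPROTO_ICMP = 1
IPPROTO_UDP = 17


def classify(frame: bytes) -> str:
    """Decide where a frame goes: UDP to the user plane, the rest to the kernel."""
    if len(frame) < ETH_HDR_LEN + 20:          # too short for Ethernet + IPv4
        return "control_plane"
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:                  # ARP, IPv6, ... stay in the kernel
        return "control_plane"
    protocol = frame[ETH_HDR_LEN + 9]          # IPv4 protocol field
    return "user_plane" if protocol == IPPROTO_UDP else "control_plane"


def make_frame(protocol: int) -> bytes:
    """Build a minimal Ethernet + IPv4 header pair for the given protocol."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ETH_P_IP)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, protocol, 0,
                     bytes([10, 0, 1, 2]), bytes([10, 0, 3, 2]))
    return eth + ip


ping_path = classify(make_frame(IPPROTO_ICMP))   # ping stays in the kernel
iperf_path = classify(make_frame(IPPROTO_UDP))   # UDP goes to the graph
```

In an XDP program the two outcomes would correspond to passing the packet up the stack or redirecting it to an AF_XDP socket, which is why ping traffic never shows up in the CNET statistics while UDP traffic does.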
But the key part here is that if we try and check the routes, what we see is the two netdevs that are attached to CNDP, or the CNDP FRR1 vRouter, but most importantly we just see Network 1 and Network 2. So let's start up the FRR agent on this node. If we have a look at the information that's been set up so far, we can see this vRouter has an IP address. It's adding Network 1 and Network 2 to the same OSPF area. And if we try to show IP OSPF neighbor at this point, it hasn't learned anything because we haven't started FRR on the other vRouter. So let's go ahead and do that. And here this vRouter has an IP address and is adding Network 2 and Network 3 to the same OSPF area. And if we show the OSPF neighbor, it's picked up the vRouter at the opposite end. And if we do the same on CNDP FRR1, it's also learned about the other route via OSPF. So at this point, if we try to ping again from client 2 to client 1, we can ping. And if we check the routes on CNDP, we have the new Network 3 added in. And just to show you that no traffic is flowing through CNDP yet, these are the ETH0 stats for RX and TX. We see they're still 0, and the same for ETH1. So let's kill that off for the moment and try and run an iperf UDP session between client 1 and client 2. And this time we should see traffic flow through the CNET graph. And if we check here, you can see an increment in the stats. And this doesn't show as nicely as I hoped. And this kills the app. Let's try it one more time. Unfortunately, I won't be able to get this right just yet. Oh, there we go. OK, let's try and run it one more time. OK, folks, bear with me. So we can see an IP4 input node at the top here and an IP4 forward node, and they're passing UDP traffic through those nodes. Now, we're not going to the UDP nodes that are listed there, because obviously the traffic isn't destined for the CNDP FRR vRouter, it's destined for the client attached to it. And that's why it's forwarded on.
Applications can also hook on to the CNET graph via a socket-like architecture. All the function calls look exactly the same as for a socket, except it's just called a channel, and you prefix all of your normal socket calls with channel underscore before hooking up into the CNET graph. So that's the demo. So the next step was to essentially take that CNDP FRR vRouter and put it through a heck of a lot of permutations in terms of the interfaces that we hooked it up to, leveraging things like XDP redirects between the two vRouter instances and so on, to try and see what kind of levels of performance we could push this to. And what we noticed was that for AF_XDP, the performance is completely dependent on the deployment scenario. So for north-south traffic that was coming in on a physical interface or going out of a physical interface, with AF_XDP in native mode, so hooked in at the XDP hook, this example actually yielded performance comparable to DPDK. However, when we moved to something that was completely local to a node, so east-west type traffic with all virtual interfaces and AF_XDP, while the performance for AF_XDP in native mode was still better than vanilla veth, it wasn't what we had expected it to be. So there's definitely some level of optimization that we need to look into on that front. And then we tried one other thing, which is AF_XDP in generic mode, so that's your program hooked in at the TC hook, and that actually yielded better performance than native mode. But again, that shows that some optimization is needed on that front. So just to sum up, we set out to show whether it was possible to build some sort of a hybrid networking stack. I think the building blocks are there for sure. I think we've demonstrated that it is possible to do something like that, especially for these high-performance use cases that want to take advantage of in-kernel fast paths, essentially XDP and AF_XDP.
There's obviously an opportunity as well to make sure that we hook eBPF a lot more into the puzzle, especially from the user plane aspect; not everything has to go into user space and so on. So I just want to summarize the generic challenges that we have noted for AF_XDP. The first one is that we still can't take advantage of hardware offloads. It's been great to see the XDP hints kfunc support getting merged into the Linux kernel, or at least agreed on as a model and then merged, which has been fantastic, and it will form a great cornerstone for a lot of this work. The only thing that I would ask is that, for the containerized environment, we put the onus on the infrastructure to lifecycle manage the BPF programs and take that level of responsibility and privilege out of the scope of the application. So the application, especially if it's using AF_XDP, doesn't need to know any special formats or do special compilations of BPF programs or anything along those lines. That should all be managed on the infrastructure side. The next thing that's been a gap was really jumbo frame or multi-buffer support for AF_XDP, but we've seen lots of activity on that on the mailing list in the last couple of months, so hopefully that's something that we can take off the list very, very soon. And lastly, there's going to be a need for some level of optimization of AF_XDP in native mode for veth. And here are just some links for folks who are interested in some of the stuff that we used for this talk. So thank you very much, folks, for your time, I really appreciate it. And it's been a pleasure presenting at my first FOSDEM. It's a bucket-list item I've ticked off there, so thanks a lot. Thank you for the talk. We do have ample time for questions. Thank you for the presentation. I have a question about XDP: does it run in hardware or is it in software?
The XDP hook itself is typically supported by the drivers, and I think there's a good host of drivers that support it right now: most of the Intel ones, and a good few of the Mellanox ones as well. The thing with AF_XDP is that if the hook isn't natively supported by the driver, it automatically falls back to the TC hook, which is what we call generic mode, and it'll still work there, except that you don't get the raw buffer from the driver; you will essentially be working with the equivalent of an SKB. So there's some level of allocation and copying that happens there before you can process the packet. Okay, understood, thank you. More questions? Okay then, thank you for the talk. Thank you for being here. Thank you very much, really appreciate it. |
meta netdevices |
Okay, we're ready for our next talk. Daniel is going to talk about meta netdevices. Thanks. All right. Thanks a lot. So, yeah, this talk is about meta netdevices. This work has been done by my colleague and myself. We are software engineers at Isovalent, working on the kernel and also on Cilium. So, really, the question we were asking ourselves was how we can leverage the BPF infrastructure in the kernel, and also the networking features, to really achieve maximum performance for Kubernetes pods. And before I go into the kernel bits, just a really quick recap of Kubernetes and pods and what they are. So, basically, what you can see here is a host. The host can have one or many pods. Kubernetes is an orchestration system, essentially, and a pod is usually defined as a network namespace, and it is typically connected with veth devices to get traffic in and out of it. A pod can have one or many containers that share this network namespace. And a CNI is basically a networking plug-in which will set up various things. When a pod comes up, for example, it will set up net devices and assign IP addresses. It has an IPAM infrastructure to manage pools of addresses. It will install routes. In the case of Cilium, which is a CNI, it will also set up the BPF data path to basically route traffic in and out. It has various features on top as well, such as policy enforcement, load balancing, bandwidth management, and so on and so forth. But I don't want to make this talk about all the different features of Cilium, but rather about performance. There was an interesting keynote last year at SREcon from Brendan Gregg, where he talked about computing performance and what's on the horizon. He had a couple of predictions, and one was quite interesting, when he was talking about OS performance. The statement that he made is that, well, given that the kernel is becoming increasingly complex,
the performance defaults are getting worse and worse. So he stated that it basically takes a whole OS team to make the operating system perform well. And the problem is that, given all these performance teams are trying to optimize at the larger scale, nobody's actually looking at the defaults anymore and how they can be optimized. So this was quite interesting. So in the case of defaults, we were wondering: given two Kubernetes nodes, pod to pod, connected with 100 gigabit NICs, we wanted to look at a single flow, a single TCP stream, and we asked, what's the default baseline you can get to? Where are the bottlenecks? How can they be overcome? And can we provide better defaults out of what we figure out? Why bother with single-stream performance? Well, first of all, it's interesting for the kernel to be able to cope with growing NIC speeds, so 100, 200 gigabit or more. How can this be maxed out with a single stream? There are lots of data-intensive workloads from user and customer sites around machine learning, AI, and so on. But generally, it's also interesting to be able to free up resources and give them to the application instead of the kernel having to block them. So the assumptions for our test are basically that the Kubernetes worker nodes that users run are usually quite generic; they can run any kind of workload. What we are also seeing is that a large number of users typically just stick to defaults; they don't specifically tune the kernel. There's an interesting cloud-native usage report where they tried to get some insights into what Kubernetes deployments usually look like. There's definitely an increasing trend towards a higher density of containers per host. So around 50 or more is expected these days, and the number of pods per node is also increasing.
So yeah, the question now is, okay, take a very basic compatibility setting. For example, in the case of Cilium, if you deploy it in a basic mode, we just use the upper stack for routing and forwarding. There are various reasons why people might want to use that. For example, in the case of Kubernetes, there's a component called kube-proxy which uses netfilter/iptables for service management, for service load balancing. Some people stick to that, or they have custom netfilter rules, so maybe they are required to use the upper stack for that, or they simply went with defaults for now and might look into more tuning at a later point in time. So when we look at the performance for that, what you can see here on the yellow bar is the host-to-host performance for a single stream, so we got to 44 gigabit per second, but then if you do the pod-to-pod connectivity, it's really reduced dramatically. And one of the reasons, and I will get into this in a bit, is that the upper stack is giving false feedback to the TCP stack. One thing that we did a year ago or so was to introduce a feature called BPF host routing, where we don't use the upper stack. And for BPF itself, given that we attach to physical devices but also to veth devices, we added a couple of new helper functions there. One is called bpf_redirect_peer. What this basically does is add a fast switch into the network namespace for the ingress traffic. So basically, instead of going the usual xmit route through the veth devices, we can retrieve the veth device inside the pod and just scrub the packet to remove the data that we typically remove when switching network namespaces, and then just set the device to the device inside the pod.
And then we circle that around in the main receive loop without going to the per-CPU backlog queue that you would normally go through when you transfer data through veth devices, and we don't need to use the upper stack because all the information is already available in the BPF context. And for the way out, we added a helper which is called bpf_redirect_neigh. That one will basically insert the packet into the neighboring subsystem of the kernel. So usually we can do a FIB lookup out of BPF, there's a helper for this as well, and then this is combined with that for the resolution of neighbors. It means that you don't need to go to the upper stack, and the nice benefit you get with this as well is that the socket context of the network packet, of the SKB, is retained all the way to the physical device, until the packet is actually sent out. And this is not the case when you normally go to the upper stack. There, the TCP stack actually thinks that once you go to the upper stack, the packet has already left the node, but that's actually not the case. And this way it can be retained. And this is what the complete picture looks like. And if you look at the performance, it's already much better. So we were able to get almost 40 gigabit per second under 1.5K MTU. So this was interesting. Now the question is, how could we close the remaining gap? Under 8K MTU, we also did some tests, and one thing to note here is that for a single TCP stream we were able to get to 98 gigabit per second for the host-to-host case, but the situation still looks much the same for veth with BPF host routing. So there's still a small gap that we want to close here as well. And that's where we introduce a new device type as a veth replacement. We call this the meta device because it's programmable through BPF, and you can implement your own business logic in it. So it's flexible.
And this time, the main difference is that this also gets a fast switch on the egress side for the egress traffic, so it doesn't need to go through the per-CPU backlog queue for egress either. So if you look at the flame graphs, that's the worst-case scenario, where we compared the veth and the meta device. What you can see here on the veth device is that on xmit it will basically scrub the packet data and enqueue the packet to a per-CPU backlog queue, and then at some point there's a net_rx_action. It will pick up the packets from the queue again, and in the worst case, that can be deferred to the kernel's ksoftirqd, and then you see this rescheduling, where you have a new stack in which this is processed again. And then from the BPF side, it will reach BPF on the TC egress of the host veth, where we can then only forward this to the physical device to leave the node, right? And all of this can be done in one go, without rescheduling, through this meta device. So it will scrub the packet, it will switch the network namespace, it will reset the device pointers, and it will then directly call the BPF program. And if the BPF program says, based on the FIB lookup and so on, that it will forward the packet directly to the physical device, then it avoids this rescheduling scenario. So on the right side, I mean, it's really straightforward. That's what the implementation of the driver xmit routine looks like. It will basically just call into BPF, and then based on the verdict, push it out. It's really just around 500 lines of code. So it's very simple and straightforward. I think it's just one-fifth of the veth driver that we have right now. And the other thing that we wanted to focus on is compatibility, so that, given that in Cilium we need to support multiple kernels, the ideal case would be that we don't need to change much of the BPF program and can keep it as is.
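The difference between the two transmit paths can be made concrete with a toy model. This is a sketch under loud assumptions: it is Python rather than kernel C, it is not the actual driver code, and the function names, packet dictionaries, and device names are hypothetical. The verdict constants use the TC action values from the kernel headers, and mapping them onto the meta device this way is an assumption for illustration.

```python
from collections import deque

# TC action verdicts as defined in the kernel's pkt_cls.h.
TC_ACT_OK = 0
TC_ACT_SHOT = 2
TC_ACT_REDIRECT = 7


def scrub(pkt, peer_dev):
    """Clear per-namespace state and point the packet at the peer device."""
    clean = dict(pkt)
    clean["mark"] = 0
    clean["dev"] = peer_dev
    return clean


def veth_like_xmit(pkt, peer_dev, backlog):
    """veth-style path: scrub, then defer via a per-CPU backlog queue."""
    backlog.append(scrub(pkt, peer_dev))      # picked up later by a softirq
    return "queued"


def meta_like_xmit(pkt, peer_dev, prog, phys_tx):
    """meta-style path: scrub, run the BPF program inline, act on the verdict."""
    pkt = scrub(pkt, peer_dev)
    verdict, target = prog(pkt)
    if verdict == TC_ACT_REDIRECT:            # straight to the physical device
        phys_tx.append((target, pkt))
        return "redirected"
    if verdict == TC_ACT_SHOT:
        return "dropped"
    return "passed"                           # hand over to the peer's stack


def forward_to_phys(pkt):
    """Hypothetical program: pretend the FIB lookup resolved to eth0."""
    return TC_ACT_REDIRECT, "eth0"


backlog, phys_tx = deque(), []
pkt = {"dev": "pod0", "mark": 42, "len": 1500}
veth_result = veth_like_xmit(pkt, "host0", backlog)
meta_result = meta_like_xmit(pkt, "host0", forward_to_phys, phys_tx)
```

In the first path the packet is parked in a queue and only processed after a later pickup, which is where the rescheduling seen in the flame graph can occur; in the second path the forwarding decision is made in the same call chain.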
And in the case of XDP, we didn't want to implement it, because for the veth case it really is very complex. It even adds multi-queue support, which you normally would not need on a virtual device. So we wanted to keep it as simple as possible and to have the flexibility that this can be added as a single or a paired device. So for the veth replacement, it would be a paired device, but you could also do it as a single device and then implement whatever logic you want in BPF for that. So looking at the performance again, for the TCP stream under 8K MTU, this approach is really able to reach the full 98 gigabit per second. And in terms of latency, we did some netperf TCP_RR measurements as well, where you get the minimum, the P90, the P99 latency and so on. So this is really on par with the host. So now we were asking ourselves, can we push this even further? I mean, we were able to get to 98 gigabit per second, but what about the cost per megabyte transferred, can this be pushed even more? And there's a relatively recent kernel feature which is called BIG TCP. It landed for IPv6 only in 5.19 and was developed by Google, and the whole idea behind BIG TCP is to aggregate even more aggressively for GRO and GSO. So normally with the aggregation, the kernel will basically try to create a super packet out of the incoming packet stream and then push it up the networking stack, so the stack only needs to be traversed once. And the limit up until that point was 64K packets, simply because that's the maximum packet size that you can express in the IP header. And the idea of BIG TCP for IPv6 was that, well, maybe we could create a hop-by-hop header in the GRO layer, and then the 16-bit packet length field can be overcome, because there's a jumbogram extension in there which allows for a 32-bit field. So you can do much more aggressive aggregation.
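The 16-bit versus 32-bit length point can be illustrated by building the IPv6 hop-by-hop header with the Jumbo Payload option as laid out in RFC 2675. This is a minimal Python sketch of the on-the-wire layout only; the function name is hypothetical, and it does not reproduce how the kernel's GRO layer builds the header internally.

```python
import struct

MAX_16BIT_PAYLOAD = 0xFFFF   # largest value the IPv6 payload length field holds
NEXTHDR_TCP = 6
JUMBO_OPT_TYPE = 0xC2        # Jumbo Payload option type (RFC 2675)
JUMBO_OPT_LEN = 4            # the option data is one 32-bit length
HBH_HDR_LEN = 8              # next header + ext len + option type/len + length


def build_jumbo_hbh(next_header: int, upper_layer_len: int) -> bytes:
    """Build a hop-by-hop header carrying a 32-bit jumbo payload length.

    The jumbo length counts everything after the IPv6 header, including this
    hop-by-hop header itself; the 16-bit payload length field is then set to 0.
    """
    jumbo_len = HBH_HDR_LEN + upper_layer_len
    if jumbo_len <= MAX_16BIT_PAYLOAD:
        raise ValueError("fits in the 16-bit field, no jumbogram needed")
    if jumbo_len > 0xFFFFFFFF:
        raise ValueError("exceeds the 32-bit jumbo payload length")
    # Hdr Ext Len is 0: the header is exactly 8 bytes long.
    return struct.pack("!BBBBI", next_header, 0,
                       JUMBO_OPT_TYPE, JUMBO_OPT_LEN, jumbo_len)


# Hypothetical example: a 200,000-byte aggregated TCP segment.
hbh = build_jumbo_hbh(NEXTHDR_TCP, 200_000)
```

With only the 16-bit field, an aggregated packet tops out at 65,535 bytes of payload, so the 32-bit option is what allows a single super packet to carry several times more data per stack traversal.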
And yeah, so this is also supported now with the new Cilium release, where this will be set up automatically for all the devices underneath for IPv6. Actually, this week it was also merged for IPv4. So this will land in kernel 6.3, which is exciting. And when we looked at the performance again under BIG TCP, it turns out that using the upper stack is currently broken in the kernel, so that still needs to be fixed. We will look into that. So forwarding there wouldn't work. And with the host routing cases, it will basically bump up the regular veth one to get it on par with the meta device and also the host, so it will basically hide those glitches. The latency is still better for the meta device in terms of short packet request/response type workloads, so that's still on par with the host. So what is the remaining offender when you run all these features together? It's basically the copy to user space, so between 60 and 70% of the cycles is really spent on copying all this data to user space. So the next question we asked ourselves in this experiment was: what if we combine the whole BIG TCP stuff with TCP zero-copy, so what if we could leverage memory-mapped TCP? And it turns out that's currently not possible in the kernel. That's a limitation, because in the GRO layer BIG TCP will create a frag list, which is essentially a list of SKBs presented as a single big one that is pushed up the stack. And TCP zero-copy only works on the SKB frags, so that's an internal detail. So basically you have a single SKB and it has the pages attached as read-only in the non-linear section. So that currently does not work. Combining those two would probably have really big potential, but what we did for now is look at just using TCP zero-copy to see how it looks without BIG TCP. Practically speaking, it's not as straightforward to deploy, because first of all you need to rewrite your application in order to leverage memory-mapped TCP.
This can be done for RX but also for TX, or both. And it needs driver changes, in particular driver changes to be able to split the header from the data, because the data is what you want to memory-map to user space. Some NICs might do this in hardware, and for some others you would have to do some kind of pseudo header/data split where you basically just copy the header into a linear section. So this is how it would look. We tried this for 4K and 8K MTU. There's a great talk from Eric Dumazet, the TCP maintainer, about all the things you need to do in order to be able to make use of this; for example, you also need to align the TCP window scale to 12 segments so that you fill exactly those pages. And yeah, the driver support can be very different. In our lab we had mlx5 100 gigabit NICs, and they did not implement the header/data split. So my colleague Nikolai did a POC implementation to change the driver to be able to do that, so that we could get some measurements out of this. And if you want to look at an example application and how you can implement that, there's one in the networking selftests in the upstream tree called tcp_mmap, which is also useful for benchmarking. So you really need to align various different settings. For our test implementation, we used the non-striding mode for mlx5, so the legacy mode, which means you need to switch striding off with ethtool first, because for the POC it was easier to do given the way the packet layout is done there. Then you need to align the MTU to either one page or two pages, and you need to do various other settings; for the sake of time I will not go into all of the details. And generally I think it would be a useful addition to the kernel to have a configuration framework for that and to have more drivers supporting the header/data split.
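Aligning the MTU to one or two pages means choosing the MTU so that the TCP payload of a full-sized segment is exactly a multiple of the page size once the headers are accounted for. A rough arithmetic sketch in Python, assuming 4096-byte pages, no IP options or extension headers, and TCP timestamps as the only TCP option (10 bytes padded to 12); the function name is hypothetical and the right values for a real setup depend on the options actually negotiated.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
IPV4_HDR = 20             # IPv4 header without options
IPV6_HDR = 40             # fixed IPv6 header
TCP_HDR = 20              # TCP header without options
TCP_TIMESTAMP_OPT = 12    # timestamp option (10 bytes) padded to 12


def mtu_for_pages(pages: int, ipv6: bool = True, timestamps: bool = True) -> int:
    """Return the MTU at which a full segment carries exactly `pages` pages."""
    ip_hdr = IPV6_HDR if ipv6 else IPV4_HDR
    tcp_opts = TCP_TIMESTAMP_OPT if timestamps else 0
    return pages * PAGE_SIZE + ip_hdr + TCP_HDR + tcp_opts


one_page_v6 = mtu_for_pages(1)                # 4096 + 40 + 20 + 12
two_pages_v6 = mtu_for_pages(2)               # 8192 + 40 + 20 + 12
one_page_v4 = mtu_for_pages(1, ipv6=False)    # 4096 + 20 + 20 + 12
```

If the payload is not page-sized, the received data does not fill whole pages and cannot simply be mapped into the application's address space, which is why the 4K and 8K MTU tests need these slightly larger values rather than a round 4096 or 8192.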
There's actually one in the Windows kernel, it turns out, so there's some documentation around that which we found while preparing for the talk. The other thing, the caveat, is that the benefits of TCP zero copy might be limited if you then actually go and touch the data in your application, because then the data needs to be pulled into the cache. It would already be there if you had to copy things to user space, but it may not be if you just memory map. So the applications there are mostly on, for example, the storage side, where you wouldn't have to do that. And looking at the data, for 4K MTU we got to 81 gigabit per second with the implementation, so that's a bit limiting. It could also be that this is mostly because the implementation was, you know, a POC with lots of optimization potential still there. But we looked at the 8K MTU, and there we were able to get to 98 gigabit per second. The interesting piece here is the cost per megabyte, which we were able to reduce from 85 down to 27 microseconds per megabyte. So this really is significant, and yeah, it's because you don't copy anymore. So as a summary: we started with the default, and then, you know, switched to the 8K MTU. Then we went to the host routing that we covered, then the addition of the meta device where we can get the low latency for ingress and egress, then BIG TCP. And that all works without changes to the application, and with that you can already get a 2x improvement. And then it gets even more dramatic with the zero copy, but that is of course dependent on your application. Some of the future directions: as I mentioned earlier, it would be useful to have a generic header data split framework for NIC drivers, so that they can implement it and expose it as a setting.
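To put the per-megabyte cost figures quoted above in proportion, here is a small arithmetic sketch. Only the two numbers (85 and 27 microseconds per megabyte, 8K MTU run) come from the talk; the function and constant names are made up for illustration.

```python
# Figures quoted in the talk for the 8K MTU measurements.
COST_WITH_COPY_US_PER_MB = 85.0  # copying to user space
COST_ZERO_COPY_US_PER_MB = 27.0  # memory-mapped TCP (zero copy)


def cost_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (speedup factor, fractional reduction) between two per-MB costs."""
    return before / after, (before - after) / before


speedup, reduction = cost_reduction(COST_WITH_COPY_US_PER_MB, COST_ZERO_COPY_US_PER_MB)
print(f"{speedup:.2f}x cheaper per megabyte, {reduction:.0%} less time spent")
```

So the receive path spends roughly a third of the time per megabyte once the copy is avoided, which fits the earlier observation that most cycles went into copying.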
There's potential here as well to optimize. For example, the head page could be packed with the headers, and it could get page recycling, which we didn't implement here. And in future, BIG TCP would be interesting to combine with the zero copy, so that this covers the frag list in GRO. And the other thing that is at some point on the horizon is to push BIG TCP actually onto the wire as well, if the hardware supports that. And yeah, so with that I'm done with the talk. The prototype for the meta device you can currently find on this branch. We are working on pushing this upstream in the coming months, and there's also the public prototype for the mlx5 header data split. And yeah, so the plan is basically to get this into the upstream kernel and then also to get this integrated into Cilium for the next release. Yeah, thank you. Are there any questions? Thanks, we already have two questions in the chat, so we can start with them if there are none in the room. Alright, so the first question we got was, well, not really a question as much as a comment I think, which is: can we please come up with a better name than meta for this, just call it BPF or BPFDF or something like that. The other thing was: if the perf benefit comes from calling through from inside the network namespace, can we just make veth do that instead? I think we could, but the question is, I haven't looked into how it would affect the XDP-related bits, so it felt easier to do something simple and start from scratch. One thing I don't really like about the veth devices, to be honest, is how complex it got with all the XDP additions, and it's not even that beneficial. And it added multi-queue support and all of that, which was not needed at all for a virtual device. So I really wanted to have something simple, and yeah, it's easier. Any more questions from the audience here?
I have a question about BIG TCP and TCP zero copy. In your lab you only did benchmarks host to host, right? Yes, but also pod to pod, so we tried both. Because, do you know how it would work with the switches? Do you need special switch support for this? No, no. So BIG TCP is only local to the node, so the packets on the wire will still be your MTU-sized packets. It's just that the aggregation for the local stack is much bigger, so it doesn't affect anything on the wire. That's a nice thing. And for the zero copy? I mean, for the zero copy you definitely need to change the MTU, which of course also affects the rest of your network, so that's one thing that would be required for that. Thank you. |
MPTCP in the upstream kernel
A long road that started almost 15 years ago |
We're ready for our next talk. Matthieu is going to talk about MPTCP in the upstream kernel. Thanks. Yes, hello, everybody. So welcome to this short presentation about MPTCP in the Linux kernel. So it was a long road that started almost 15 years ago. I'm indeed Matthieu Baerts, working at Tessares in Louvain-la-Neuve, so it's 30 kilometers from here. And let's start with a quick overview of the agenda. So today I suggest beginning with a short introduction to MPTCP and its main use cases. I will try to be quick for those who already know about that, but still try to make the concepts clear for everybody, hopefully. Then I will explain what we can do today and what's expected for later. I will finish by giving some explanation about why it took so long to have it included in the official versions. So MPTCP is short for multipath TCP. This is an extension to TCP that breaks the assumption that a connection is linked to a fixed pair of IP addresses and ports. In one sentence, it allows exchanging data for a single connection over different paths simultaneously. Now that you can have multiple paths for the same connection, you can then have more redundancy, more bandwidth, and many more things. But enough with the nice definitions, let's have a look at a typical use case. Here is a classical MPTCP use case with a smartphone. So a smartphone can typically connect to both Wi-Fi and cellular networks. That's a completely different view from the 70s, when TCP was designed and where everything was fixed and clearly not transportable. Let's take a typical scenario. So you are here in the room, connected to the Wi-Fi access point. Quickly you realize that, A, you have had enough and don't want to listen to me anymore, and B, you got called by the smell of the fries outside. You then decide to watch a video stream about the history of fries, why not, on your smartphone, and leave the building to get real ones, much better.
Slowly, the Wi-Fi signal will become weaker and weaker, and likely the video will stop. It will only restart when the system has detected the issue, and each app will then have to handle that by reconnecting to the server and then continuing where it was, if it can. It's clearly not a smooth experience for either devs or users, of course. In other words, do not leave the building if you don't have MPTCP on your phone. Of course, there are fries for everybody. So I guess you already got that MPTCP is going to improve this situation. And yes, it will, because it helps support seamless handover scenarios. MPTCP allows creating multiple paths for the same connection. So these paths are called subflows, and they look like TCP connections when you look at packet traces, except that these packets contain additional TCP options to let the client and server attach new subflows. They can also announce available IP addresses. Of course, they need to have some numbers to reassemble the data, and more things. Multiple paths can be used at the same time, like here on the slide. So with the same walk-out scenario, the frustration of being disconnected from one network goes away. Indeed, MPTCP can quickly take the decision to continue the communication on another path, and even use multiple paths at the same time when one can no longer cope with the demand. This kind of use case is already supported by Apple with apps like Siri, Maps, Music and others, but also by Samsung and LG in some countries like South Korea and Turkey. Another use case, which is one that kept us busy for a bit of time at my company, is the hybrid access network case. Many people are stuck at home with a not so great internet connection. That's usually because they are using a copper line and are far away from the street cabinet. Improving the situation is costly but also takes time, especially if it is needed to dig new and long trenches to bring fibre to the home.
On the other hand, different assets of the network operator can be used, like the available capacity on the mobile network, so I mean 4G and 5G. With the help of a transparent proxy installed in the residential gateways for the client side and in the telco cloud of the operator for the server side, MPTCP is used in the middle to offer more bandwidth to the end users. One last use case that can be quite interesting is that MPTCP can also play a key role in managing data between cellular networks like 5G and fixed ones like Wi-Fi. So the 3GPP, which is the organisation in charge of defining the 5G technologies, suggests that operators set up an ATSSS core function. The goal is to use MPTCP to have a seamless handover between networks, so 4G, 5G and Wi-Fi, not to break connections when you go from one to another, but also to reduce the utilisation of the mobile network and avoid the saturation of these mobile networks in the future. MPTCP is then part of 5G, but I cannot tell you if this is the same 5G as the one they put in the COVID vaccine. Anyway, enough with the theory. How do we use it, and what can we do with it today? So MPTCP in the upstream kernel is fairly new, so a recent kernel is required. An application can create an MPTCP socket and use it like it would do with a TCP socket, so it's just a one-line change. You can see on the slide that IPPROTO_MPTCP is used instead of IPPROTO_TCP. So yes, the application needs to explicitly pick MPTCP, but it is also possible to change the behaviour of existing applications by forcing them to create an MPTCP socket thanks to LD_PRELOAD. It is also required to configure the network side to tell the kernel that multiple interfaces can be used. So tools like NetworkManager and mptcpd can help do that automatically, but it is also possible to do it manually with the ip tool. So it's probably better with an example.
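As an illustration of the one-line change described above, here is a minimal Python sketch (the slide itself is not reproduced in the transcript, so this is not the speaker's code). It assumes Linux, where IPPROTO_MPTCP has the value 262; the helper name open_stream_socket and the fallback to plain TCP are choices made for this example only.

```python
import socket

# IPPROTO_MPTCP is 262 on Linux. Python exposes socket.IPPROTO_MPTCP from
# version 3.10, so fall back to the numeric value on older interpreters.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_stream_socket(prefer_mptcp: bool = True) -> socket.socket:
    """Create a stream socket, asking for MPTCP when the kernel allows it.

    Hypothetical helper: falls back to plain TCP if the kernel has no MPTCP
    support or MPTCP is disabled.
    """
    if prefer_mptcp:
        try:
            # The only difference from TCP is the protocol argument.
            return socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
        except OSError:
            pass  # e.g. old kernel, or MPTCP disabled by the administrator
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
```

After creation the socket is used like any TCP socket (connect, send, recv). Checking sock.proto against IPPROTO_MPTCP tells which of the two branches was taken.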
So just install a recent GNU/Linux distribution, so Fedora, Ubuntu, you name it, then you set up the network configuration. So here in this example, you can see that we need to declare which other IP addresses can be used to create new subflows. That's for the client side, at the top, and we also declare which IP addresses to signal to the other side. It is also needed to tell the kernel that the traffic generated from one IP should go through the right interface. So here we do that manually, but this can be done, of course, by NetworkManager and others. Finally, at the end, you can see that we need to run the application, and here we use iperf3 and we run it with mptcpize just to force it to create an MPTCP socket. So the latest stable Linux kernel supports most of the protocol features: using multiple subflows, announcing IP addresses, priority, fast close, which is the equivalent of a TCP reset, and many other things. It also supports many socket options used by many apps. So for example, TCP Fast Open can be used with MPTCP, for those who know what it is. And it's also important to support these options, because some existing applications depend on them and would fail if they are not supported. It is also possible to retrieve information from user space thanks to MIB counters, an inet_diag interface and the MPTCP_INFO socket option, which is the equivalent of TCP_INFO. It's also important to mention that two path managers are available and one packet scheduler, but maybe it's better if I explain what they are. So quickly, about the MPTCP path manager: it's a component that is in charge of creating additional subflows, removing them if needed, announcing addresses, priority, etc. It is needed on both ends, but serves different purposes. So for example here, it is traditionally the client who creates new paths and the server which announces additional addresses.
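The manual configuration described above can be sketched as follows. The slide is not part of the transcript, so this is a reconstruction under assumptions: the addresses, interface names, routing table number and limit values are hypothetical, and the Python helpers only build the command strings for the ip tool and mptcpize rather than running them.

```python
from typing import List


def client_commands(extra_addr: str, dev: str, gateway: str, table: int,
                    server: str) -> List[str]:
    """Commands for the client side: use extra_addr to create new subflows."""
    return [
        # Allow extra subflows and accept addresses announced by the peer.
        "ip mptcp limits set subflow 2 add_addr_accepted 2",
        # Declare an additional local address usable for new subflows.
        f"ip mptcp endpoint add {extra_addr} dev {dev} subflow",
        # Source-based routing: traffic from this address leaves via its interface.
        f"ip rule add from {extra_addr} table {table}",
        f"ip route add default via {gateway} dev {dev} table {table}",
        # Force an unmodified application to create MPTCP sockets.
        f"mptcpize run iperf3 -c {server}",
    ]


def server_commands(extra_addr: str, dev: str) -> List[str]:
    """Commands for the server side: announce extra_addr to the client."""
    return [
        "ip mptcp limits set subflow 2",
        f"ip mptcp endpoint add {extra_addr} dev {dev} signal",
        "mptcpize run iperf3 -s",
    ]


if __name__ == "__main__":
    for cmd in client_commands("10.0.1.2", "eth1", "10.0.1.1", 1, "192.0.2.10"):
        print(cmd)
```

The split mirrors the roles mentioned in the talk: subflow endpoints on the client, which creates the new paths, and signal endpoints on the server, which announces its additional addresses.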
There are two path managers available: one where the user can define global settings to get the same behavior for all the MPTCP connections, that's per network namespace, and another one where the kernel notifies user space of MPTCP events via Netlink and accepts commands to, for example, create new subflows, announce IP addresses, etc. So in short, user space can control the path manager and take decisions per connection. The other important component that I mentioned before is the MPTCP packet scheduler. Its role is to decide on which of the available paths the next packet will be sent. It can also decide to retransmit one packet on another path if needed, and that's what we call a reinjection. The packet scheduler relies on the TCP congestion control algorithm used on each subflow to know if more data can be pushed. But additionally, to better use all available resources and sometimes limited buffers, it also has to send packets in a way that reduces packet reordering on the other side. And on top of that, it might decide to penalize some subflows that could impact the MPTCP connection, because some networks are quite bad, with losses, bufferbloat, and others. So the packet scheduler, in this case, might also be able to trigger a reinjection of data from one subflow to another, for example if a failure has been detected. So there is an in-kernel packet scheduler for the moment, and only one, but it will be possible to build other ones with eBPF. So yes, we need eBPF for the packet scheduler, and not just to look cool or to be accepted at conferences. In fact, eBPF here will save us from maintaining all sorts of different packet schedulers in the kernel. It's a bit similar to TCP congestion control: there are a few in the kernel, but some are no longer maintained. So quite a bit of work has already been done, and it is already possible to do some experimentation if you use a development version from our Git tree.
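To make the role of the packet scheduler more concrete, here is a toy model. It is not the in-kernel algorithm: it assumes a simple policy (among the subflows whose congestion window still has room, pick the one with the lowest smoothed RTT), and the Subflow class and its fields are invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Subflow:
    name: str
    srtt_ms: float   # smoothed round-trip time estimate
    cwnd: int        # congestion window, in packets
    in_flight: int   # packets sent but not yet acknowledged

    def has_room(self) -> bool:
        # The congestion control of each subflow says whether more data can be pushed.
        return self.in_flight < self.cwnd


def pick_subflow(subflows: Iterable[Subflow]) -> Optional[Subflow]:
    """Return the subflow for the next packet, or None if all windows are full."""
    candidates = [s for s in subflows if s.has_room()]
    if not candidates:
        return None
    # Preferring the lowest RTT tends to limit reordering at the receiver.
    return min(candidates, key=lambda s: s.srtt_ms)


wifi = Subflow("wifi", srtt_ms=20.0, cwnd=10, in_flight=10)       # window full
cellular = Subflow("cellular", srtt_ms=60.0, cwnd=10, in_flight=3)
print(pick_subflow([wifi, cellular]).name)
```

Here the Wi-Fi subflow is faster but its window is full, so the next packet goes out on the cellular subflow. A real scheduler also has to handle reinjection and penalization, which this sketch leaves out.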
But this work is currently on hold, because we ended up discussing the behavior of the current in-kernel scheduler and its API, and yes, some work is still needed here. There are also some socket options that still need to be supported, but most likely they are specific to some very particular use cases. So it should be fine, but feel free to report them if some are missing. And one last thing that is worth mentioning is the support in Golang. As you may know, Golang does not depend on a C runtime library, or libc, and it is then not possible to use the LD_PRELOAD technique with mptcpize to use MPTCP. So the default net package doesn't allow applications to create MPTCP sockets, only UDP or TCP, and a feature request has been sent to let apps easily create MPTCP sockets. But quickly, the question Golang developers asked was: then why not use MPTCP by default when a stream connection is requested, so when asking for TCP? And the proposal has been accepted, so we hope that stream applications using the net package will be able to create MPTCP connections, and maybe later that will become the new default behavior. So I will now finish this presentation with a bit of history. I think it is worth telling you, because it was not easy to get MPTCP into the official Linux kernel, so it could be good to say a few words about that. Still, it was not as long and intense as having the full real-time support, and I see that some people here really know what I am talking about. The development of multipath TCP in the Linux kernel started in Belgium, at the university in Louvain-la-Neuve, something like 15 years ago. Surprisingly, it didn't involve beers. No, of course it did. The legend says that the idea popped up when the young authors were drinking beers at a crowded pub where the bartender was able to cope with the high demand by using multiple beer pumps at the same time. More seriously, it started as a fork, but more to do some experimentation and to validate the concept.
So at the beginning of his PhD, Sébastien just wanted to prove it could work. He started to modify TCP by adding more conditions, so just: if it is multipath TCP, do that, if not, do something else. Later, more people, mostly Christoph and Gregory, joined the project to help Sébastien. They then took over his work to make it, let's call it, production ready, but also to be able to reach high performance. In other words, to get there, the modifications in the Linux kernel were substantial and optimized for the MPTCP use case. In parallel, the MPTCP v0 RFC was published in 2013, and the same year, a big company with a logo looking like an apple, if you see, announced its support for the client side. And of course they needed to have the support for the backend side, and I will let you imagine what they used. So if we concentrate on the very beginning of the project, we can say that it was easy to fork, but you will pay for it. Yeah, please don't read the two lines above out of context. But anyway, there are different ways to use a fork. You can pick your level. So I let you guess which one has been picked here, probably ultraviolence. Maybe because the Linux kernel is big, it's also complex, and the development is very active. So small modifications should not be difficult to maintain in a fork, but here we are talking about quite a lot of code, and an important part is modifying the network stack, which still has many adaptations specific to MPTCP. In fact, among those, there are even duplicated functions that were adapted for the MPTCP case. So imagine that the code is modified on the TCP side: we don't see it directly, and then we need to adapt it later for MPTCP. But still, that was not the nightmare level. This is the nightmare level. So imagine that you have to deploy it on various embedded systems with different LTS kernels, starting from very old versions like 3.4.
So that's what we had to do at Tessares, and it may explain why some of my colleagues here look like the avatar just by mentioning kernel backports. In the meantime, very old versions have been deprecated, but thanks to the embedded system world, this took time. So of course, these backports brought the drawback of having to deal with many conflicts. But good tools like git rerere and TopGit help a lot with that. So add to that a bunch of bash scripts, and it was possible to automate most of this laborious task. TopGit allows us to create a tree with dependencies, that's what we can not really clearly see on the slide, but it is also very handy if a fork has to be maintained by a team where regular syncs with upstream have to be done as well. So in the end, what we were doing is that we were applying the patch, luckily, at the bottom, then propagating it to all the kernel versions, and then we had to resolve a few conflicts. But luckily we were not doing that too much. In the end, the fork is still quite well used today despite all the work that has been done on the upstream code. I even published new releases last Friday, and probably some of the last ones. But on the bright side, the migration process has started, it just takes time. The MPTCP support in the upstream kernel started in 2020. Why such a long delay? Was it an homage to the Belgian railway company? No, it was not in fact a new idea. A few discussions and attempts had been made in the past, but were not successful. In any case, it was not an easy task to upstream MPTCP. That is because the Linux TCP stack is highly optimized, but also because the netdev maintainers have been clear on that topic. It is okay to include MPTCP in the official Linux kernel, but the new implementation cannot affect the existing TCP stack, which means no performance regressions, it has to be maintainable, it must be possible to disable it, and it should be extensible from user space.
Now, with what I said earlier, you might already understand that we were not allowed to take the initial fork as it was. So it was built to support experiments and rapid changes, but it was not generic enough. Also, in the end, it was and still is used in environments where the majority of the connections are using MPTCP, and not the opposite. So what were the solutions? A rewrite almost from scratch was needed. That's probably why it took so long to say, okay, we need to do it. A key difference with the upstream kernel is that a new socket type is used. So there is now a clean separation. User space interacts with the MPTCP socket, which controls the different TCP subflows. Thanks to the TCP upper layer protocol, ULP, that was introduced in 2017 for kTLS, it was possible to minimize the modifications in the TCP code while still avoiding duplicating code. An SKB extension mechanism has also been initially developed for MPTCP, not to increase the socket buffer (SKB) structure size for the generic case. This is also used by other components today. Also, we had to be very careful when modifying the TCP stack. So any ideas to avoid that were good to take. One last point is that the APIs have been defined so as not to have to maintain multiple versions of path managers and packet schedulers in the kernel, even if the work on the last one is still ongoing. And that is also one thing where we needed to do a lot of work. Here I just want to say a special thanks to our ex-co-maintainer, Mat Martineau, and other fellows at Intel who had to step out very recently. In conclusion, it was a long road and it's not over. Thank you. Thank you, we have time for a couple of questions. Thank you. Just two quick questions. One, when you have multiple connections, can you kind of do it RAID 1 style, where traffic goes on both simultaneously so that you don't have to resend something if something gets dropped?
And can you speak also about SCTP and what's going on, if it's dead or not, you know, because it's sort of in a similar space and I never understood why people focused more on MPTCP than SCTP. Thank you. I will maybe start just with the SCTP aspect, because I don't know much about it. From what I remember, here with multipath TCP we do an extension to TCP. So most likely, where TCP was working before, MPTCP can work. There are some exceptions with some nasty middleboxes, but I think that's the main reason why we can see multipath TCP in the field and maybe not SCTP. I think it is not dead and is still used in data centers, but I don't know exactly about it. For the other question, I might not have understood everything. You said that you wanted to aggregate multiple paths? You have your two paths, can you send the same data simultaneously? Yes, you can. So there is even a packet scheduler called the redundant packet scheduler. There is one small bit that is important to mention, which is that each path is still a TCP connection, which means that if you have some losses on one path, you still need to retransmit on the same path. So at some point it might be okay to say that, okay, the other side received it via the other path, so if you got a loss on one path, the end host doesn't need it. But because there are middleboxes and others on the path, you need to retransmit it at the TCP level. I don't know if it's clear, but you can do reinjection, and you need to continue to retransmit on the same path too. You can't just, when you're trying to resend that request, drop it? No. If you want to do that, the best is probably to stop the connection. If you want low latency, maybe don't use TCP, but that's another question, not the topic. But if you want to do that, it's probably best to stop the path and recreate it.
So I looked at the sysctls for MPTCP, and I found one called DSS checksum, and reading the patch notes, it's something to do with middleboxes. So is that giving you issues? And last question, depending on that, why is it not on by default? Yes, good question. So in short, middleboxes are nasty. They like to modify everything, and I will not comment too much about that because at my company we do a transparent proxy, so we are kind of a middlebox. But what can happen is that middleboxes can change a lot of things in TCP. For example, you have old protocols like FTP, where the IP address is sent in the bytestream, in clear text. Which means that if you have a NAT, you probably have a NAT that starts to look at the connection, identifies that it is FTP, and modifies the text in the bytestream, like the IP addresses. But because it does that, the size can change, and if they don't update the MPTCP header, because we need to add some information to be able to reassemble the data on the other end, they can mess up MPTCP. So there is this checksum mechanism. But there is one big inconvenience, which is that for the moment there is no hardware acceleration, so it's quite costly. And the other thing is that, in the end, it's quite rare that you have some middleboxes modifying the bytestream like that. I know that in the past you had some. If you were going on some website, for example without HTTPS, it's possible that something was injected into the bytestream. And probably when they do the injection, they don't modify MPTCP. Sorry, we need to move on. Yeah, sorry. Otherwise, we won't be on schedule. So that's why we don't have the checksum on by default. But thank you. Thank you so much for the talk. Thank you for the questions. Thank you. |
Graphing tools for scheduler tracing |
Julia is going to be talking about graphing tools for scheduler tracing. Okay. Do you hear me? Okay, so thank you. So, I'm going to be talking about graphing tools for scheduler tracing. So, I'll start out, like someone actually started out yesterday, with what a task scheduler is. So, for me, a task scheduler is an important part of the Linux kernel. It does two things. It places tasks on cores when they are either created with fork, or when they wake up, or when there's load balancing. And also, when a core becomes empty, it decides what task should run next. Basically, I'm interested in the Linux kernel files core.c and fair.c, so CFS. I'm not at all interested in the second point. I'm interested in where tasks get placed when they wake up. So, that's going to be the entire focus of this talk. So, the next question is, how does the scheduler impact application performance? Basically, a scheduler is confronted with a bunch of tasks and a bunch of cores, and it needs to make decisions: where is it going to put these tasks on these cores? And so, sometimes, if you make a bad decision, it can have a short-term impact. Maybe some task has to wait a little bit of extra time. But there's kind of a domino effect. You make one bad decision and then other bad decisions follow from that. So, one issue that one might be concerned with is what's called work conservation. So, we have a machine that has four cores. We have a task that wakes up, T1. Where should we put it? So, based on the information we have right now, we've just got an empty machine and a random task. Maybe we have no idea, so we'll just put it on core zero. Now, another task wakes up. What should we do with T2? So, kind of intuitively, it might be good to put T2 on either core one or core two or core three, because they're not doing anything at the moment. Putting it on core zero would perhaps be a poor choice, because it will have to wait for task one to finish.
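The four-core placement example above can be written down as a small model. This is only an illustration of the idea, not how CFS is implemented: run queues are plain lists, the placement policy simply picks the least loaded core, and the check flags any state where one core holds more than one task while another core is empty.

```python
from typing import Dict, List


def place_task(run_queues: Dict[int, List[str]], task: str) -> int:
    """Put the task on the least loaded core (lowest core number on ties)."""
    core = min(run_queues, key=lambda c: (len(run_queues[c]), c))
    run_queues[core].append(task)
    return core


def is_work_conserving(run_queues: Dict[int, List[str]]) -> bool:
    """False when some core is overloaded while another core is idle."""
    any_idle = any(len(q) == 0 for q in run_queues.values())
    any_overloaded = any(len(q) > 1 for q in run_queues.values())
    return not (any_idle and any_overloaded)


# The situation from the talk: T1 already runs on core zero, then T2 wakes up.
queues = {0: ["T1"], 1: [], 2: [], 3: []}
chosen = place_task(queues, "T2")
print(chosen, is_work_conserving(queues))

# The poor choice: stacking T2 behind T1 while three cores sit idle.
stacked = {0: ["T1", "T2"], 1: [], 2: [], 3: []}
print(is_work_conserving(stacked))
```

In this toy the scheduler sees all run queues at once for free. On a real machine, finding the idle core has a cost, which is part of why the decision is harder than it looks.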
So, that seems completely obvious as a human looking at boxes on the screen, but the scheduler is going to have to hunt around to find those empty cores. And so, actually, CFS is not work conserving. The basic principle is that no core should be overloaded if any core is idle. So, if you have an overload, you should have put the task on the idle core instead. Another issue is locality. So, instead of having just four random cores, we may have a multi-socket machine. We've got cores zero and one, which are together on one socket. Cores two and three are together on another socket. We have T1 on core zero. Where should we put T2? So, we have those three idle cores, but maybe core one would be a better choice, if either T2 has already allocated all of its data on the first socket or if T2 wants to discuss things with T1. If we put it on two or three, things might get slower. So, basically, you can see that there's a lot of potential for things to go wrong. So, we need to understand what the scheduler is actually doing, but the problem is that the scheduler is buried down there in the OS. When you're in the application, you don't really know where your tasks are running. So, we want to consider how we can see what the scheduler is doing. So, fortunately, there are some tools that are available. So, the most helpful one, I would say, is trace-cmd. So, trace-cmd allows you to trace all different kinds of kernel events. Basically, it's a front end to ftrace, but in particular, it lets you trace scheduling events, so that's the part we're interested in. So, you can see a trace here, and if you get this trace, it will have basically all the information you need to solve all of your scheduling problems. On the other hand, it unfortunately has all the information you need to solve all of your scheduling problems. That is, it's a sequential thing. It's ordered according to time.
If your application runs for a certain amount of time, you'll end up with a huge file. And you can see, even in this little tiny trace that I'm showing, that the activities on different cores are mixed up. We have core 26 and core 62 here. And so, in practice, it can get very hard to actually sort out what's going on, who's doing what, and so on. And so, the next tool, which is very helpful, is one that's called KernelShark. So, this gives you a graphical interface that lets you see what's going on on your different cores. And it also gives you that same textual output at the bottom. You can kind of correlate them to each other. You can zoom in quite easily, and so on. On the other hand, in my personal application, where I'm interested in actually very large machines, KernelShark is a bit difficult to use in some cases. It's great if you want to really zoom in on a specific problem. It's not so great if you actually don't really know where your problem is, and you want to somehow see an overview with everything at once. Here, I'm only showing two cores. You can see that the display is kind of a bit spread out. It's going to be hard to get 128 cores to fit on your screen and be reasonably understandable. So, what we would like is some way of understanding what's going on on big machines. So, the thing I put the emphasis on previously is that we want to see what's going on on all the cores at once. Something that I've also found extremely useful in practice is to be able to collect traces, collect these images, share them with my colleagues, put them in papers, put them in slides, and so on. So, I found it useful to collect lots of traces, compare them, store them, look at them later, and so on. On the other hand, I have, at least for the moment, completely abandoned this nice feature of KernelShark, which is that you can zoom in or zoom out and find out exactly what you want to see at what time.
My proposed approach that I'm going to present in this talk is completely non-interactive. So, you run a command, you get a picture, you look at your picture, and you run another command, you get another picture, and you look at that picture. So, actually, in the last few years, I've made lots and lots of tools that start out with trace-cmd input and visualize it in different ways. The ones that have sort of stood the test of time are the ones I'm going to present, which are dat2graph and running_waiting. The names are not super imaginative, perhaps. dat2graph takes a dat file, so that's what trace-cmd produces, and it makes a graph for you. So, basically, we have the x-axis and the y-axis. The x-axis is the time, and then on the y-axis we have our cores, and we see what's running on each core at each time. So, kind of like what KernelShark showed you, but in a much more compressed format. And running_waiting is just a line graph. It shows you how many tasks are running at a particular time and how many tasks are waiting on a run queue and are not able to run. So, we'll see how that's used. So, for the rest of this talk, I'm going to present these two tools, and I'm going to be motivated by this patch that I submitted a few years ago. I'm not going to discuss the patch in detail now. We'll see it later, after we've seen all the examples. The application I'm going to be interested in is part of the NAS parallel benchmarks. These are a bunch of, well, you can read what it says. They're small kernels having to do with HPC kinds of things. We're going to focus on the UA benchmark. It does something. What's important for our purposes is that it has n tasks, and they're running on n cores. And so, at least superficially, they seem to run all the time. You would expect that they would just choose their cores, stay on their cores, and run on those cores forever.
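To show what a running/waiting line graph is counting, here is a simplified sketch. It is not the running_waiting tool itself, which works from trace-cmd data. The event format (timestamp, task, new state) and the three state names are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# (timestamp in seconds, task name, new state); hypothetical, simplified format.
# States: "running" (on a core), "waiting" (on a run queue), "sleeping" (blocked).
Event = Tuple[float, str, str]


def running_waiting(events: List[Event]) -> List[Tuple[float, int, int]]:
    """Return (timestamp, number running, number waiting) after each event."""
    state: Dict[str, str] = {}
    series = []
    for ts, task, new_state in sorted(events):
        state[task] = new_state
        running = sum(1 for s in state.values() if s == "running")
        waiting = sum(1 for s in state.values() if s == "waiting")
        series.append((ts, running, waiting))
    return series


events = [
    (0.0, "T1", "running"),
    (0.1, "T2", "waiting"),   # T2 woke up on a core that is already busy
    (0.2, "T1", "sleeping"),
    (0.2, "T2", "running"),   # T2 finally gets the core
]
for point in running_waiting(events):
    print(point)
```

Plotting the second column gives the line of running tasks and the third column gives the line of waiting tasks. Any stretch where the waiting count is above zero is an overload situation.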
So, you would expect this benchmark to be completely uninteresting from a scheduling point of view. If we take this benchmark and run it a few times, here I ran it 10 times, and I've taken these runs and sorted them by increasing run time, you can see that something is going on. There are these runs on the left-hand side, which start out around 20 seconds. It gets a bit longer, by a small amount, but then there's a definite gap here, and it jumps up to closer to 30 seconds. So, maybe we have 40% overhead between the fastest one and the slowest one. It's only 10 runs. That's quite a lot of variation for a benchmark that we expect to just run along without doing anything interesting at all. So, we can ask: why so much variation? Now we can actually look and see what's going on at the scheduling level. So, these are the graphs. As I said, we have the time on the x-axis, and we have what's going on on the different cores on the y-axis. It says socket order for the different cores. On my machine, the core numbers go round robin between the different sockets, but I have organized the graph so that we have the first socket at the bottom, the second socket above it, and so on. It's not very important at this point, though. So, this is the fastest run. It looks like what we expected. Things are not moving around. Not much is happening. This is a much slower run. The previous one was 22 seconds. This next one is 28 seconds. So, that's a big overhead. And here we can see that things are not going as well at all because, in particular, over here in this region, we have these white spaces. White space means that nothing is happening on that core. There could be two reasons why nothing is happening. One of them is that there's nothing to do.
So, maybe one of these tasks has gotten way ahead of the others, and so it needs to sleep and wait for the others to finish what they want to do. The more unpleasant reason that nothing is happening is that several of these tasks can be stuck on the same core, and they're going to have to bounce back and forth between each other. In that case, we have a work conservation problem: some of the cores are idle while tasks are waiting. We can see which case we're in by looking at the running_waiting graph. Here, this time, we have the number of tasks on the y-axis, but we have n tasks on n cores, so the scale is the same. At the top, we have a dotted line, which is the number of cores on the machine. The green lines are the number of tasks that are running. So, it's kind of like all the tasks are running all the time, but not exactly. There are some times when only very few tasks are running, down here. And then over here, this is the place where we had the gaps on the other graph. Here we often have almost all the tasks running, but not quite. And we have these red lines here, and red lines mean tasks that are waiting. So, we're in an overload situation. Some tasks have been placed on the same cores as each other, and so they have to wait for each other to finish. This is more of a problem for this kind of application. So, basically, we have two problems: we have tasks that are moving around, and we have some cores that are overloaded, so the tasks don't get to run as much as they ought to. Now what we're going to do is zoom in on some of these situations and see what the problem could be. Here's the first one of these situations. If you look over here, basically around three seconds, at this point that I've circled, you can see we have an orange task and then a blue task.
And so, something is happening to cause the task on this core to change to another one. And if you look up a bit more, there are some other situations where that happens, all in the same area. So, we can look into that in more detail. If we zoom in a bit, here I have the command line that you have to write. The socket order option is to get the cores ordered in a certain way. Min and max say that we want to go from 3 seconds to 3.2 seconds. Target UA means it's going to give our application special colors, and other things that happen are going to be black. Then we can see if there are some other strange things happening on the machine. Now that we have zoomed in at this level, we can see that things are actually not as nice as they looked when we were zoomed out. Here, almost everybody has stopped for a little gap. And then here, this is basically the fourth socket. There are a lot of unfortunate things happening up here. So, we can zoom in a bit more. Now I've zoomed in just on this big vertical line here. And when we zoom in a bit more, we start to see that there are some other things going on on our machine. There are the colored lines, and then we have some little black lines. So, we can try to find out what the little black lines are. Dat2graph has another option where the graph is colored by command. The colors are chosen not by the PID, but by what the command is. Mostly we have our command, UA, which is blue. But we have some daemons here. These are kind of inevitable. The kernel needs to do some things. And so, if we jump back here, we can see, for example, in this place, our task is running along, a daemon comes, and it interrupts our task. So, our task is not going to be working, but at least our task is staying in the same place.
And so, nothing extremely horrible happens, but these things get a bit unsynchronized. Some of them get slowed down and so on. So, that's one kind of slowdown that we can have. But, in principle, it shouldn't have a huge long-term impact. Now we can move a bit further off to the right. We can see there are some more of these little black things here. Here we have an orange task. Here we have a black line. And here we have another orange task up here, at almost the same position, a little bit off to the right. What's happening here is load balancing. The kernel thinks: okay, there are two things going on on this core. We should find one of these many idle cores up here and put this task on one of them. But that's actually quite a poor choice in this case because, in this application, all the sockets are being filled up with their own tasks. By doing this load balancing, we have put an extra task up there on the fourth socket. And that's something we will come to regret later, one might say, even though it seems okay for the moment. What this leads to is a cascade of migrations. We put something on that core. Someone else is going to wake up for that core. It will have to go somewhere else. And in that other place, someone is going to wake up for that core, and so on. Then the third situation, which is actually in the same region. We see another situation over here. Here's another case where we are changing from one task to another one. But this time, there's no little black dot which is somehow motivating this change. Nothing strange seems to be happening. It just seems to happen spontaneously. So, we can look again at the running_waiting graph to see what's happening. It's not super easy to see. But basically what's happening is we have a green line which is below the number of cores.
And we have a red line that's just above it. So again, we have an overload situation. These blue and orange tasks here are sharing the same core. And so, they're going to have to bounce back and forth between them. We can try to look and see how we ended up in this situation. This here is a graph that I made more or less by hand. It's just focusing on the two specific cores that we're interested in. Here we have this orange task. It's running along. It prefers to be on core number 111. It then goes to sleep. And then after some time, we move along over here. At this point, it wakes up. It's actually the task on core number 68 that is going to wake it up. And so, we need to find a place to put it. The obvious place to put it would be on core 111. That's where it was before. And the important thing is that that core is actually not doing anything. But that's not what happens. What happens is that it gets placed on core number 68. It gets placed on the core of the waker, as opposed to the core where it was running before. This seems very surprising. We expect that we would prefer to put tasks where they can run immediately. Why does it, for no particular reason, get moved off? The key to the situation is this mysterious little dot here. It's a kworker that woke up and took advantage of this empty space so it could run immediately. These graphs all come from Linux 5.10. At that time, basically, there's a decision whenever a task wakes up: should it go on the socket where it was before, or should it go on the socket of the waker? They are different sockets in this case. And the issue is that when you make that decision, you take into account the load average.
And the load average is something that is collected over time, and the old information gets decayed a bit over time. So, because we have this kworker here, the load average is not zero. And so, this core is seen as being not completely idle, even though it is completely idle. When that situation arises, there's some kind of competition between the core of the waker and the place where you were before. And for some reason, core number 111 is going to lose this competition in this situation. The kernel thinks that this core down here would be a better place for the task, which in this case it definitely is not. So, that's where this little patch comes in. All it does is check whether the core where the task was previously is completely idle, and if so, just use that instead of considering other possibilities. If we apply that patch, then here we have the pink lines. We still have a slight increase, we still have our tasks moving around, it's not going to solve all the problems, but we don't have this big jump, which happens when the overload situation is introduced. And we can see how the patch impacts another, completely different application. This is a Java program. It's part of the DaCapo benchmark suite. This patch gives tasks a better chance of remaining where they were before. And on this benchmark, what happens after we have the patch is that all the tasks manage to stay on the same socket, because there are actually not that many of them running at a time and they fit there nicely. Previously, they were tending to move over the entire machine, and we had more than 20 seconds between the fastest and the slowest run. Here, we have a much more uniform running time, and obviously the running time is also much faster. So, it had multiple benefits.
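The wakeup decision and the effect of the patch can be illustrated with a toy Python model. This is only a sketch under loud assumptions: it is not the kernel's wakeup code, the geometric decay factor is arbitrary, and the unpatched "competition" is hard-coded to send the task to the waker's core whenever the previous core's load average is nonzero, which is simply the outcome observed in the trace. All names (`decayed_load`, `select_core`, the dictionaries) are hypothetical.

```python
def decayed_load(load, periods, decay=0.5):
    """Old load contributions fade geometrically (decay factor is illustrative)."""
    return load * (decay ** periods)

def select_core(prev, waker, load_avg, is_idle, patched):
    """Pick a core for a task that is waking up (toy model, not kernel logic)."""
    # Patched behaviour: if the previous core is idle right now, just use it.
    if patched and is_idle[prev]:
        return prev
    # Unpatched stand-in: a nonzero load average makes the previous core look
    # busy, and in the traced situation it then loses to the waker's core.
    if load_avg[prev] == 0:
        return prev
    return waker

# Core 111 ran a short kworker a few periods ago, so its load average has not
# yet decayed to zero, even though nothing is running there any more.
load_avg = {111: decayed_load(1.0, 3), 68: 1.0}
is_idle = {111: True, 68: False}
print(select_core(111, 68, load_avg, is_idle, patched=False))  # waker's core
print(select_core(111, 68, load_avg, is_idle, patched=True))   # previous core
```

The point of the sketch is the distinction between "idle now" and "load average is zero": the patch consults the former before any comparison based on the latter.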
So, in conclusion, if you want to understand what your scheduler is really doing, you have to actually look and see what your scheduler is really doing. Just seeing the numbers, now it's faster, now it's slower, something like that, doesn't really give you a good understanding of what's going on. We found that different perspectives provide different kinds of information. The running_waiting graph is actually very coarse-grained, but sometimes it can show you that the problem is exactly in this region, because there's overload in this region. So, we have our two tools: dat2graph, what's going on at what time, and running_waiting, how much is happening at each point in time. As for future work, these graphs are a little bit slow to generate because, at the moment, for technical reasons, we go through PostScript and then go to PDF. It would be nice to be able to generate them more quickly, to be a bit more interactive. And also, as I mentioned in the beginning, I've made lots of tools. If these tools could become a bit more configurable, then maybe I wouldn't have to restart the implementation each time, and they would be more useful for other people. Everything is publicly available. So, thank you. Thank you. We have time for one or two questions. Thanks for the talk. I have two questions, basically. Do you have a solution to visualize the latencies due to cache misses, for example, after a migration? The second one is, do you have a way to visualize when tasks are synchronizing on a mutex, for example, which can also bring some latencies? So, no, we haven't done anything for cache misses. It could be interesting. I have another tool which deals with events, and I think there's some way to add events to dat2graph, so maybe you could see when different locking operations are happening. I definitely think that's useful.
I don't think the support is as great as one might like at the moment, but it's definitely an important thing. I have one more question. Hello, Julia. I was wondering, is there a way to show the CPU state at the time you are plotting? Because your graph is making the assumption that, typically, the CPU frequency or whatever is stable over time. It would be very interesting to know the physical state of the processor at each point, because maybe the CPU frequency is higher on one core than on another. So being able to visualize that this application is running on a fast or a slow CPU could be very interesting. Actually, the tool does that, but the unfortunate thing, and I didn't talk about it for this reason, is that you have to go and add a trace_printk line to your kernel to actually print out that information, because it isn't traced anywhere in the kernel. So that's the only issue. Once you print it out in the proper format, the tool does everything, and it can show you just the frequencies, so you can see different colors for how fast each core is going. You can also see the merged view, where you have the frequency in one line and the activity in the other line. Sorry, we're out of time. Thank you for the talk, thank you for the questions. We can't take all questions, but I'm sure you can find Julia later. |
Walking native stacks in BPF without frame pointers |
I'm going to talk about walking native stacks in BPF without frame pointers. So my name is Vaishali. I work at Polar Signals as a software engineer, mostly on profiling and eBPF-related stuff, and before that I worked on different kernel subsystems as part of my job. My name is Javier, and I've been working at Polar Signals for a year, mostly on native stack unwinding, and before that I was working on web reliability and dev tooling at Facebook. So before we get into the talk, let's talk about the agenda. We'll first address the question that is always asked: why is there a need for a DWARF-based stack walker in eBPF? Then we will briefly go through the design of our stack walker, how we went from the prototype to making it production-ready, a bunch of the learnings so far, especially when interacting with the eBPF subsystem of the kernel, and then our future plans. As I said, we work on production profilers. Generally, sampling profilers collect stack traces at particular intervals and attach values to them. For that, profilers generally need both user application stacks and kernel stacks. Stack walking is part of the process of collecting the stack traces. In simple words, it involves iterating over all the stack frames and collecting the return addresses. Historically, there has been a dedicated register to store the address of the current frame on both x86 and ARM, although it has fallen victim to compiler optimizations, so most runtimes actually disable it. It's called the frame pointer, as many of you have heard. When we don't have the frame pointer, walking the stack becomes orders of magnitude more work. Instead of a couple of memory accesses per frame, which is quite fast, we have to do more work in the stack walker. Stack walking is also a common practice in debuggers.
So what's the current state of the world with respect to frame pointers? It's not a problem for hyperscalers in particular. As you may know, in production they always run applications with frame pointers enabled, because whenever they have to inspect incidents, getting fast and reliable stack traces is a must. The Go runtime has enabled frame pointers since Go 1.7 on x86-64 and 1.12 on ARM64. macOS is always compiled with frame pointers. There's also amazing work going on for a compact frame format. It's called the simple frame format (SFrame); support is being added to the toolchains, and there is now also a mailing list discussion going on in the kernel about having a stack walker to unwind user stacks. But the thing is that we want it now, we want to support all the distros, we want to support all the runtimes, and the one thing that is common across a lot of these is DWARF, and that's why we are using it. So where does it come from? Some of you might be wondering about exceptions in C++, or for example the Rust toolchain, which always disables frame pointers, but when you hit a panic it always has the full backtrace. The reason is that it has the .eh_frame section, or .debug_frame, which is used for that. Most of the time, toolchains emit one of these sections. The other idea is that you can also build the unwind tables by synthesizing them from the object code. This is the approach used by ORC, the kernel stack unwinder that was added, I guess, five or six years ago. We'll talk about .eh_frame in detail in a minute, but before that, let's see who else is using .eh_frame. Of course, we are not the first ones to use it. Perf does that.
I think since this was introduced to the perf_event_open syscall in 3.4, it can collect the registers of the profiled processes as well as a copy of the stack for every sample. While this approach has been proven to work, it has a bunch of drawbacks which we wanted to avoid. One is that the kernel copies the user stack for every sample, and this can be quite a bit of data, especially for CPU-intensive applications. Another is that when we are copying the data into user space, the implications of one process having another process's data can also be complicated. So those are a bunch of the things we wanted to avoid, and stack walking using BPF makes sense to us because we don't have to copy the whole stack. Instead, a lot of the information stays in the kernel, especially in the case of the stack walking mechanism. Once it has been implemented, we can leverage the perf subsystem to get samples on CPU cycles, instructions, cache misses, etc. It can also help us develop other tools like allocation tracers and runtime-specific profilers, for the JVM or Ruby, for example. Now, some of you might be wondering why we want to implement something new when we already have bpf_get_stackid. The reason is that it also uses frame pointers to unwind, and having a fully featured DWARF unwinder in the kernel is unlikely to happen. There is a mailing list discussion you can go and check out to see why. Now, before we dive into the design of our stack walker, I want to give some information on what .eh_frame has and how we can use it. The .eh_frame section contains one or more call frame information (CFI) records.
The main goal of the call frame information records is to provide answers on how to restore the registers of the previous frame and where they are located, such as whether they have been pushed to the stack or not. Written out in full, all of this would generate huge unwind tables, and for this reason the format attempts to be compact and only contains the information that is needed. The unwind tables encoded in the CFI format are in the form of opcodes, and we basically have to evaluate them. In the case of stack walking, once they have been evaluated, we generate a table that contains, for each instruction boundary, how to restore the values of the previous frame's registers. The format has sort of two layers. The first helps with repetitive patterns and allows for a more compact representation of some data. In some cases there are specialized opcodes that consume one, two, or four bytes, so an entry doesn't have to be four bytes all the time. The second layer is a special opcode that contains another set of opcodes, which are arbitrary expressions, and these need to be evaluated as well. This would mean that we would have to implement a full-blown VM in eBPF to evaluate any expression, which is not practical. So we are also going to mention what we are doing to overcome those challenges. For those who are not aware of the general flow of eBPF applications, this is how it looks at a very high level. In user space we have the driver program, which is written in Go. We use libbpfgo. It creates the maps, attaches the BPF program to the CPU cycles perf event, and then reads, parses, and evaluates the .eh_frame section of the process. In the BPF program, we fetch the table for the current PID and then have an unwinding algorithm which processes the raw information.
We will go in depth on each component, but first let's see what the algorithm looks like. First, we read three registers. The first one is RIP, the second one is the stack pointer, RSP, and the third one is RBP, which is commonly used as the frame pointer when frame pointers are enabled. Next, while the unwound frame count is less than the maximum depth, we find the unwind table row for the program counter, add the instruction pointer to the stack, calculate the previous frame's stack pointer, update the registers, and continue with the next frame. The lookup is a very simple binary search, but when it has to scale, we also need to think about storing the unwind information and how it can work with all the runtimes, etc. Javier will now talk about that. Cool, so as Vaishali said, we need somewhere to store the unwind information. We are going to look later at what this table looks like. But first, let's see what the possibilities are here. One possibility, for example, would be to store the unwind information in the process itself. We could do this using a combination of ptrace, mmap, and mlock. This would require us to basically hijack the process's execution and introduce a new memory mapping inside of it, and then we would have to lock the memory, because in our type of BPF programs page faults are not allowed. The problem with this approach, of course, is that it alters the execution flow of applications, which is something that we never want to do. It complicates things a lot, and one of the biggest problems is the life cycle. For example, if our profiler dies before we finish cleaning up, who is going to clean up that memory segment? Or how is this going to be perceived by developers if they see that the memory of their application has increased behind their backs just because some observability tool is doing something that is not great? But there's also another problem, which is that sharing memory is harder.
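The unwinding loop described at the start of this passage can be sketched in a few lines of Python. This is a deliberately reduced model and not the profiler's BPF code: it assumes every table row has the form `(start_pc, cfa_offset)` with the rule "CFA = RSP + offset", it ignores RBP and every other CFI rule, it models process memory as a plain dictionary, and it uses Python's `bisect` module in place of a hand-written binary search. `MAX_DEPTH`, `find_row`, and `unwind` are hypothetical names.

```python
import bisect

MAX_DEPTH = 127  # illustrative bound on the number of frames

def find_row(table, pc):
    """Binary-search the row with the largest start_pc <= pc.

    table is a list of (start_pc, cfa_offset) tuples sorted by start_pc.
    """
    i = bisect.bisect_right(table, (pc, float("inf"))) - 1
    return table[i] if i >= 0 else None

def unwind(table, table_end, memory, rip, rsp):
    """Collect program counters frame by frame using only the unwind table."""
    frames = []
    while len(frames) < MAX_DEPTH:
        # A pc outside every table is treated as the bottom of the stack.
        if not table or not (table[0][0] <= rip < table_end):
            break
        _, cfa_offset = find_row(table, rip)
        frames.append(rip)
        cfa = rsp + cfa_offset        # canonical frame address of this frame
        rip = memory.get(cfa - 8, 0)  # on x86-64 the return address sits just below the CFA
        rsp = cfa                     # the caller's stack pointer is the CFA
    return frames
```

For example, with `table = [(0x1000, 8), (0x1004, 24), (0x2000, 8)]`, `table_end = 0x3000`, and `memory = {0x7010: 0x2008}`, starting from `rip = 0x1010` and `rsp = 0x7000` yields the two frames `0x1010` and `0x2008` before the loop reaches an uncovered program counter.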
There is same-page merging in the kernel, but if you leave that aside, it's a problem to have the same information generated over and over, for example for libc, for every single process on your machine. That's why we ended up using another solution, which is pretty typical in the BPF space: BPF maps. In case you're not familiar, BPF maps are data structures that can be written or read from both user and kernel space. We're using hash tables everywhere, which in the case of BPF are basically a mapping of bytes to bytes that stores arbitrary information. Some BPF programs are allowed to lazily allocate memory for their data structures, but in the case of our tracing program we can't do that, and this has some implications. User space needs to mlock that memory, otherwise our program wouldn't be able to run. By using this approach we are also able to reuse these memory mappings, which is great, because that means we don't have to do the same work over and over and we use less space. So let's take a look at the logical representation of the unwind tables. This is not how the layout is in memory, but think about, for example, how the unwind tables for libc, MySQL, zlib, and systemd would be laid out in memory if we could allocate one large chunk of memory. In reality there are limits everywhere, obviously. In BPF we have done some tests, and on the kernels we want to support we can allocate up to 250,000 unwind entries per value of the hash map. Obviously this was a problem for us, because we have some customers with applications whose unwind tables have 3 or 4 million rows, which is quite ridiculous. Just to give you an idea, libc, I think, has about 60k entries, so having a couple million is significant.
But we came up with the same solution that you would use in any other data-intensive application, which is to partition or shard the data. The way we're doing this is that we have multiple entries that are allocated when our profiler starts running. We allocate a different number depending on the available memory on the system and the overhead that you're willing to pay. Depending on how many shards you have, you get a different CPU-to-memory trade-off: the more memory you use, the more has to be locked in memory, since it can't be swapped out, which for some applications is not ideal, but at the same time it means that you don't have to regenerate the tables when they are full and you want to give other processes a fair chance to be profiled. The way this works, for example for a process like systemd: this is the representation of the size of its unwind tables, and because it's bigger than the size of a shard, it has to be chunked. Here we can see how it is chunked in two. The first chunk goes in shard zero, and a bunch of the unwind entries from the tail go to shard one. Of course, because we have this new layer of indirection, we need to somehow keep track of all this bookkeeping and know the state of the world, and we're doing this, of course, with more BPF maps. So a process has multiple mappings, each mapping has one or more chunks, and each chunk maps to exactly one shard, and in particular to a region within that shard, because a chunk can have anywhere from one unwind entry up to 250k. This has the benefit that I was mentioning before: because we are sharing the unwind tables, we spend not that many CPU cycles doing all the work that Vaishali was mentioning before. We need to find the ELF section where the DWARF CFI information is, but we also need to parse and evaluate it.
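The chunking of one large unwind table across fixed-size shards can be sketched as a small bump-style allocator in Python. This is an illustration under assumptions, not the profiler's bookkeeping: the class name, the `(shard_index, low, high)` chunk representation, and the choice to refuse an allocation that does not fit are all hypothetical; only the 250,000-entry shard capacity comes from the talk.

```python
SHARD_CAPACITY = 250_000  # unwind entries per shard, the figure quoted in the talk

class ShardAllocator:
    """Hands out regions of fixed-size shards, splitting tables into chunks."""

    def __init__(self, num_shards, capacity=SHARD_CAPACITY):
        self.num_shards = num_shards
        self.capacity = capacity
        self.shard = 0  # index of the shard currently being filled
        self.used = 0   # entries already used in that shard

    def allocate(self, num_rows):
        """Return chunks as (shard_index, low, high), or None if out of space."""
        free = (self.num_shards - self.shard) * self.capacity - self.used
        if num_rows > free:
            return None  # caller decides whether to wait or wipe and start again
        chunks = []
        remaining = num_rows
        while remaining > 0:
            if self.used == self.capacity:  # current shard is full, move on
                self.shard += 1
                self.used = 0
            take = min(remaining, self.capacity - self.used)
            chunks.append((self.shard, self.used, self.used + take))
            self.used += take
            remaining -= take
        return chunks
```

With a capacity of 10 and two shards, a table of 6 rows lands entirely in shard 0, and a following table of 7 rows is split into a chunk at the end of shard 0 and a chunk at the start of shard 1, which mirrors the systemd example in the talk.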
We have two levels of VM that have to run, which is not something that is very CPU-consuming, but it still has to happen. It has to process a bunch of information and generate these unwind tables in our custom format. By sharing this, for example, libc will be shared by all the processes, which means that we only need to add the bookkeeping data structures, which are really cheap to generate. In some of the tests that we've been running, generating the unwind tables takes less than 0.9% of the total CPU cycles our profiler uses. Of course, there are a lot of things that we need to take into account, for example: what happens if we run out of space? What we do is adaptively decide what to do in the moment. We might wait a little bit before resetting the state, or we might decide to give other processes a chance to be profiled, so we wipe the whole thing and start again. As you can see, this is very similar to a bump allocator. It is basically a bump allocator that has been chunked up. So the process of unwinding is: we start with a PID and check if it has unwind information. Then we need to find the mapping. For each mapping we know the minimum and the maximum program counter, so we do a linear search to find it. Then we find the chunk. With the chunk we already have the shard information, and once we have the shard information we have to search up to 250,000 items. We do this with a simple binary search, which takes roughly 17 or 18 iterations at most. Once we have the unwind action that tells us how to restore the previous frame's registers, we do those operations and we are ready to go to the next frame. We are pretty much done for that frame. How do we know the stack trace is correct? In applications with frame pointers, the bottom of the stack is defined as reaching RBP equal to zero; this is defined by the ABI.
When you have unwind tables, it is defined by the program counter not being covered by any unwind table and RBP being zero. These are requirements of the ABI, so if some application doesn't respect them, it is completely broken. Once we have verified that the stack is correct, that we have reached the bottom of the stack, we hash the addresses, add the hash to a map, and bump a counter. We do this, I think, 19 times a second for every single CPU in your box, and every couple of seconds we collect all this information, generate a profile, and send it to some server for inspection. Of course, BPF is an interesting environment to work with. It is amazing and we really like it, but we need to be aware of some things. First of all, because we cannot page in or page out the pages that contain the unwind tables, they have to be locked in memory, so we need to be very careful with how we organize our data and the layout of that data to make it as small as possible. We basically pack every single thing that can be packed. Then there are some interesting BPF things that are well known to most people who have written BPF programs, but I just want to talk a little bit about how we are dealing with some of the BPF challenges.
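The aggregation step mentioned above (hash the addresses, add the hash to a map, bump a counter, and periodically collect everything) can be sketched in Python as follows. This is a user-space illustration only: the choice of SHA-256, the 16-character identifier, and the class and method names are assumptions for the sketch, not what the BPF program uses.

```python
import hashlib
from collections import Counter

def stack_id(addresses):
    """Derive a compact identifier from a list of return addresses."""
    data = b"".join(addr.to_bytes(8, "little") for addr in addresses)
    return hashlib.sha256(data).hexdigest()[:16]

class StackAggregator:
    """Counts how often each distinct stack was sampled."""

    def __init__(self):
        self.stacks = {}         # stack id -> tuple of addresses
        self.counts = Counter()  # stack id -> number of samples

    def record(self, addresses):
        sid = stack_id(addresses)
        self.stacks.setdefault(sid, tuple(addresses))
        self.counts[sid] += 1

    def flush(self):
        """Return (addresses, count) pairs and reset, as a periodic collector would."""
        profile = [(self.stacks[sid], n) for sid, n in self.counts.items()]
        self.stacks.clear()
        self.counts.clear()
        return profile
```

Storing each distinct stack once and only incrementing a counter on repeats keeps the per-sample cost small, which matters when sampling every CPU many times per second.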
One of them is the stack size, which is easy to work around. If I am not mistaken, we have 512 bytes, which is not a lot, so we use another BPF map as a sort of global data structure that basically acts as your heap, if you will. Then there is the program size. This is a limitation that comes in two ways. First, there is probably some limit on how many opcodes you can load into the kernel. But there is also the BPF verifier, which ensures that the BPF code is safe to load, for example that you don't do any dereference that could go wrong and that your program terminates. It could theoretically run forever trying to verify your program, so it has some limits, and we hit these limits everywhere in our code. For example, we hit them when running our binary search: the verifier complains that the program is too complex for it to analyze. So what we do here is that not only have we sharded our data, we have also sharded our code. Code and data are the same thing, right? We basically have our program split into many sub-programs, and we keep the state and execute one program after another, carrying the state along. One of the techniques we use is BPF tail calls. Two other things that are way more modern and amazing are bounded loops and bpf_loop, which is a wonderful helper. The problem is that while we use bounded loops right now, we don't use bpf_loop, because it's only supported in modern kernels, but it's great and if you can use it, you should. Now, because we're a profiler and we want to minimize the impact we have on the machines we run on, I want to talk a little bit about performance in user space. Our profiler is written in Go, and many Go applications and APIs aren't designed with performance in mind. This is something that can be seen in the DWARF and ELF libraries that Go ships in the standard library, as well as in binary.Read and binary.Write, which we use everywhere because we're dealing with raw bytes and we read
them and write them all the time to the BPF maps so it is interesting to know that both these binary read and binary write low-level APIs actually allocate in the fast path which can be problematic so there's a lot of things that in the future we're going to be reinventing to make faster and then we profile a profiler a lot in production we have found a lot of opportunities and there's a lot more work to do because there's not much time I'm gonna quickly skip through testing but the great idea here is that we try to be pragmatic and we have a lot of unit tests for the core functions and then we use snapshot testing for the unwind tables and we have a git sub repository where we have a visual representation of the unwind tables and then we generate them every time on CI and locally and we verify that there are no changes compared to last time I think there's only like two or three minutes left so let me talk about the different environments and some of the interesting stuff that we have found while we were profiling our profiler in production we realized that we were spending a ridiculous amount of CPU cycles reading files from this I think the total this is just like a bunch a part of the flame graph but I think it was like 20% of the CPU cycles so turns out this was because our cloud provider has very slow disks that are orders of magnitude slower than our fast NVMEs in the team and another thing that is very interesting that is not a new fact and everybody knows about is that different configuration is the biggest source of trouble and we could see this the other day and if you're interested you can check the board request it's our all the whole project is open source which is the interaction between signals and BPF what happened basically go has an embedded profiler and we use it only in production for reasons but it triggers SIGPROF every couple a couple times a second that it was interrupting the process execution and at that time our process of booting app 
and it was loading our BPF program because it's quite long and complex the verifier takes a couple milliseconds to load it but it was getting interrupted all the time the BPF whenever it detects that the verifier has been interrupted it retries the process basically wasting all the previous CPU cycles because it starts from scratch but then it retries up to five times and then it says I couldn't do it and of course when we can allow the BPF program we are completely useless so we just crash and there is many other considerations such as what do you do with short live processes because you have to generate a data but even though we have an optimize for this and is we are not that bad and we can profile processes that run even for one second on your box and then the important thing here is that this is our format for our custom on wine table but it doesn't matter the important bit here is that it mostly fits in L2 cache so we basically incur on two L2 misses and it is pretty fast on a machine with a bunch of processes with up to 90 frames we can do the full on wine processing 0.5 milliseconds on a CPU that is five years old cool and so we are going to do mixing on wine mode so being able to unwind JIT sections we're applying RM64 support by the end of the year and this feature is going to be enabled by default in a few weeks because right now it's under a feature flag we have many other things that we want to support including high level languages and we are open source so if you want to contribute or you have anything you want to discuss we meet by weekly on Mondays as part of the Parker project so there's a bunch of links that we're going to upload to the presentation in the FOSM websites and yeah thank you. I think we have time for maximum one short question. |
composefs
An opportunistically sharing verified image filesystem |
So, we're ready for our next talk. Alex is going to talk about the new file system that they're proposing, composefs, an opportunistically sharing verified image file system. Thank you. All right. Thank you. Can you hear me? Yeah. All right. I'm Alex. You may also know me from hits such as Flatpak, Flathub, GNOME, GTK, all sorts of stuff. But this here is my first kernel file system, which I proposed on the list a couple of months ago. It's not actually a real, general-purpose file system. It's targeted at read-only images, where you would typically have many of them on the system, maybe on a container host. In my case, my primary concern is the OSTree verified boot use case. So rather than talking about composefs first, I'm going to talk about OSTree, because it kind of explains where this comes from. In OSTree, we have images. Normally the images are not as simple as this one; they are actually the full root file system of the system that you want to boot. But they're just a bunch of files, and they have metadata and permissions and names and whatnot. So they're basically images. And we have this repository format, which is the core format of OSTree. What we do is we take all the regular files in the image, we hash them, and we store them under the hash as their name in this repository format. So if you look at any of those files, they're just the same file, named by its own content. Then we take all the directories we have, such as the sub-dir thing, and we make a small descriptor for each of them: the names of the files in there, their permissions and whatnot, and a reference to the content of each file by its checksum. And we do the same for the root directory, and this time we refer to the subdirectory by the checksum of its descriptor.
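The repository layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not OSTree's actual on-disk format: the in-memory `objects` dict, the JSON descriptors, the fixed `0o644` mode, and the function names `store_blob` and `store_tree` are all assumptions made for the example.

```python
import hashlib
import json


def store_blob(objects, data):
    """Store raw bytes under the hex digest of their own content."""
    digest = hashlib.sha256(data).hexdigest()
    objects[digest] = data
    return digest


def store_tree(objects, tree):
    """Store a directory; `tree` maps names to bytes (file) or dict (subdirectory).

    Returns the digest of the directory descriptor. JSON is used here purely
    for readability; it is a stand-in for a real serialization format.
    """
    entries = {}
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            # Subdirectories are referenced by the checksum of their descriptor.
            entries[name] = {"type": "dir", "ref": store_tree(objects, node)}
        else:
            # Regular files are referenced by the checksum of their content.
            entries[name] = {"type": "file", "mode": 0o644,
                             "ref": store_blob(objects, node)}
    descriptor = json.dumps(entries, sort_keys=True).encode()
    return store_blob(objects, descriptor)


objects = {}
image = {"hello.txt": b"hi\n", "sub-dir": {"data": b"hi\n"}}
root = store_tree(objects, image)
```

Because the two files in this toy image have identical content, they end up as a single object, which is the same property that later makes sharing between images automatic.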
And then we add a final commit description, which has a pointer, meaning the checksum of the root directory, a parent (this one doesn't have a parent, because it is the first commit), and some metadata. And then we add the refs file, which is basically just a file that says: we have a branch called image, and this is the current head. So if anyone thinks this looks like the .git directory in any of your checkouts, that's true. It's basically Git for operating systems. There are some differences in exactly how the object files are stored, but basically the entire structure is just a copy of Git. You can see it even more clearly if you create a new commit: in the new version, we added the readme. All we have to do is add the file, the new root directory, and the new commit that points to the previous one, and then we update the ref to the latest head. So it is basically a re-implementation of Git for large binary trees. But you can't use this directly; you can't boot from a repository like this. So what we do is what we call deploy. When we run a new version of something, typically you have a pre-existing version, so you download the new version of the thing you want to run. That is very simple, because you can just iterate over these recursive descriptions of the image, and whenever you reach a reference to an object you have already downloaded, you can stop, because you know you recursively have everything below it. So it's very efficient to get the new version. And then we create a deploy directory, which is basically a hard-link farm that points back into the objects, the regular file objects. We create the directories with the right permissions and whatnot, and whenever there's a regular file, we just point it at the same file in the repository by using a hard link.
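The "stop as soon as you already have the object" download can be sketched as a small recursive walk. This is a simplified model, assuming every object is a dict with a `refs` list of child digests and that `remote` and `local` are plain dicts keyed by digest; the names `pull`, `commit1`, `root2`, and so on are hypothetical and the digests are shortened to readable labels.

```python
def pull(remote, local, digest, fetched=None):
    """Copy `digest` and everything it references from `remote` into `local`.

    If an object is already present locally, the walk stops there: by
    construction, everything it references was fetched when it was stored.
    """
    if fetched is None:
        fetched = []
    if digest in local:
        return fetched
    obj = remote[digest]
    local[digest] = obj
    fetched.append(digest)
    for child in obj["refs"]:
        pull(remote, local, child, fetched)
    return fetched


# First commit: a root directory with one file and one subdirectory.
remote = {
    "fileA": {"refs": []},
    "fileB": {"refs": []},
    "sub1": {"refs": ["fileB"]},
    "root1": {"refs": ["fileA", "sub1"]},
    "commit1": {"refs": ["root1"]},
    # Second commit: same content plus a readme, pointing at its parent.
    "readme": {"refs": []},
    "root2": {"refs": ["fileA", "sub1", "readme"]},
    "commit2": {"refs": ["root2", "commit1"]},
}

local = {}
first = pull(remote, local, "commit1")
second = pull(remote, local, "commit2")
```

In this toy run the second pull only transfers the new commit, the new root directory, and the readme; the unchanged file and subdirectory are never revisited.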
And then we somehow set some boot configuration that points to this particular commit, which names this directory, and somewhere in the initrd we find this directory, bind-mount it read-only on top of the root, and then we boot into that. There are some clear advantages of this over the alternative, which is the traditional A/B block device approach: you have two block devices, you flash the new image onto B, and then you boot into B. First of all, it's very easy to do efficient downloads, and deltas are very small. You can easily store however many versions of things you want, whether they're related or not. If it's multiple versions of the same branch, you can keep the last 10 or whatever. You can also have completely unrelated ones: you can have both Fedora and RHEL, or Debian, or whatever, and you can easily switch between them atomically. And all the updates are atomic: we never modify an existing thing that you're running, we always create a new deploy directory and boot into that. Also, the format is very verifiable. It recursively describes itself, and all you need is the signature: there's a GPG signature on the commit object. So if you trust the commit object, you trust the root hash, which means you trust the hashes of the subdirectories and the files and whatnot. The problem that I want to address is that this doesn't do runtime verification. We verify when we download things, and we can verify when we do the deploy, or rather the fact that we're deploying causes us to verify things. But if at some point after that something gets modified, say we have a random bit flip on the disk, or we have a malicious evil-maid-style attack, someone could easily remove or modify a file in the deploy directory. To protect against this, the kernel has two features, dm-verity and fs-verity.
dm-verity is what you use in the typical A/B image system, because it's block-based. But it's a completely read-only block device; there's no way we can do OSTree-like updates to its file system, you just cannot write to it. So the other thing is fs-verity, and fs-verity matches very well with the OSTree repository format, because if you enable fs-verity on a particular file, it essentially makes it immutable. And immutable is exactly what these content-addressed files are, so that's good. But the problem is that fs-verity doesn't go all the way. It only protects the content of the file, while you can easily make the file setuid, or replace it with a different one that has a different fs-verity digest, or just add a new file or whatever. So it doesn't protect structure. That's why composefs was created: to have another layer that handles the structure. Now I'm going away from the OSTree use case, and this is the native way to use composefs. You just have a directory with some data, the same kind of example that I had for the repository format, and you run mkcomposefs on that directory. It creates this image file that contains all the metadata for the structure of the thing, and this objects directory, which is just copies of these files stored by their fs-verity digest. They're obviously just plain files; you can cat them, and they're regular files with the same contents. They're actually pure data files: you can see they don't have the executable bit, and if you have some complex metadata, extended attributes or whatever, these are still just regular files with content. Then you mount the thing using composefs, pointing it at this objects directory, and you get a reproduction of the original image that you can look at. Whenever you cat a file, it will just do overlayfs-style stacking and read the backing file.
So everything always comes from the page cache, and the actual mount is not a loopback mount either; we just do stacking-style direct access of the file from the page cache. That gives you the general ability to reproduce this image, but to get complete structural verification, you actually use fs-verity on the descriptor itself. If you enable fs-verity on that, it becomes immutable, so the file can't change on the file system, or at least the kernel API doesn't allow you to change it, and if it's somehow otherwise modified on disk or whatnot, that will be detected. And you can see that I actually passed the expected digest, and whenever it mounts the image, it starts by verifying, before it does any I/O, whether the file actually has the expected fs-verity digest. If so, we can rely on the kernel to basically do all our verification for us. In the metadata, we have the expected verity digest for all these backing files, so if you replace something, or if there's a random bit flip, it will detect it. And the descriptor itself is very simple. This is not a traditional file system where we have to update things at run time, so we can just compute a very simple descriptor. It's basically a fixed-size header followed by a table of fixed-size inode data: if the file system has N inodes, then there are N copies of that structure. Some of them point into the variable-size data at the end, which we find with the vdata offset in the header, and that's basically all there is to it. Inode zero is the root inode. You can look at that, and if it's of type directory, then the variable data points to a table of dirents, which is basically a pre-sorted table of dirents plus names, on which you use binary search. You get a new inode number, then you just look at that offset, and all this is done by mapping the page cache directly.
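The lookup scheme described above, a flat inode table plus pre-sorted directory entries searched by binary search, can be modeled roughly as follows. The Python list of dicts stands in for the fixed-size inode table and the variable-size data; the field names (`type`, `dirents`, `digest`) and the function `lookup` are assumptions for illustration, not the real binary layout.

```python
import bisect


def lookup(inodes, path):
    """Resolve `path` to an inode number, starting from inode 0 (the root).

    Each directory inode carries `dirents`, a list of (name, inode_number)
    tuples sorted by name, so every path component is one binary search.
    Returns None if the path does not exist.
    """
    ino = 0
    for name in (part for part in path.split("/") if part):
        inode = inodes[ino]
        if inode["type"] != "dir":
            return None
        dirents = inode["dirents"]
        # (name,) sorts immediately before any (name, number) tuple.
        i = bisect.bisect_left(dirents, (name,))
        if i == len(dirents) or dirents[i][0] != name:
            return None
        ino = dirents[i][1]
    return ino


inodes = [
    {"type": "dir", "dirents": [("README", 1), ("sub-dir", 2)]},  # inode 0: root
    {"type": "file", "digest": "aa"},                             # inode 1
    {"type": "dir", "dirents": [("data", 3)]},                    # inode 2
    {"type": "file", "digest": "bb"},                             # inode 3
]
```

Nothing in this structure ever needs updating after it is built, which is why a sorted table is sufficient and no tree or hash structure is required.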
So it's very simple in terms of structure. If you actually want to use this with OSTree, it's slightly different. We don't want to take the OSTree repository, create this directory, and then run mkcomposefs on it. Instead, we ship a library, libcomposefs, and you basically link OSTree with this library so that it can generate these images directly from the metadata that exists in the repository. So we don't have to do any kind of expensive I/O to create these images, because it's just the metadata, and it's not very large. You can load it into memory, generate the image, optimize it, and write a single file. The way we do it is very flexible, so we can ensure that we can use the existing repository for the backing files. It's also designed to be done in a standardized way, so every time you create a new image based on the same OSTree commit, we will create the exact same binary file, bit for bit. So when you create the commit on the server, you basically generate one of these, take the fs-verity digest of it, and put it in the signed commit. There's no need to extend the OSTree format on the network or anything. When you deploy a commit, instead of making this hard-link farm, you recreate one of these images, and then you use the supplied digest as the expected digest when you mount it. So if anything anywhere went wrong or was attacked or whatever, it will refuse to mount. Obviously you have to put that trusted digest somewhere in your secure boot stack or whatever. It has to be trusted, but exactly how is outside the scope of composefs, and it's very similar to what you would do with dm-verity in a pure image-based system.
But another interesting use case is the container use case. Giuseppe, who is actually not here, is one of the other people behind composefs, and he is one of the Podman developers, so his use case is to use this for containers, because containers also use images, right? And it would be nice if we could get what I call opportunistic sharing of files. If you use layers and such, you can sort of share things between different containers, but you have to manually ensure you do the right thing. With this opportunistic style of sharing, whenever you happen to have a file that is identical, it automatically gets shared, both on disk and in the page cache, because of the way these things work. We also don't want to change the container format. There was a talk yesterday about using dm-verity with SquashFS. That's not about sharing, but it's a similar way to mount an image. However, it forces all the users to create a new form of container, and we want to allow this for all existing tarball-based layered OCI images. An image in the OCI world is a list of tarballs that you extract in order and then mount using overlayfs. There is an extension of this called eStargz, which is some weird-ass hack where you put an index at the end of the gzip, and then you can use partial HTTP downloads to get just the index, see which parts of the layer you already have, and use HTTP range GETs to download only the parts you are missing. So if you happen to have one of those archives in your layers, we can, in combination with the locally stored content store, avoid downloading the parts that we don't need. If you don't have them, we have to download everything, which is what we do now, but we can do better.
But even then, you can still hash them locally and get all the sharing, and then you combine this with creating a composefs image for the entire image. This is for the local storage of images: instead of storing these layers, you store the content store repository plus these generated composefs images, and then whenever you run the container, you just mount it, and it goes. It's also nice that you can easily combine all the layers. If you have a 10-layer container and you want to resolve libc, which is in the base layer, you have to do a negative dentry lookup in every layer before you reach the bottom-most one. But since the image is just metadata, it's very cheap to create a completely squashed composefs image for single-layer lookups. I don't know if anyone is following the list, but there are some discussions about this. We're trying to get it merged upstream, and one alternative has appeared: there are ways that you can actually use some overlayfs features to sort of get these features. If you use the not very well-known or documented features called overlay redirect and overlay metacopy, you can create an overlayfs layer that does a similar kind of thing: it holds the metadata for a file, plus an extended attribute that redirects it to a different path, which would be the content-addressed name in the lower layer. Then you can use some kind of read-only file system for the upper layer, where you store all these files with extended attributes that just contain this structure. For that upper layer, the combination of overlayfs plus EROFS is probably the best approach right now. So you can sort of create this. Unfortunately, that doesn't do the verification. You can verify the read-only file system itself, but you need some kind of extension to overlayfs itself to allow recording the expected fs-verity digest of the backing file. But that does seem like a trivial thing.
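The cost difference between looking a file up through a stack of layers and looking it up in a squashed image can be illustrated with a toy model. Here each layer is a dict from path to content digest, and a value of `None` marks a deleted file; that whiteout convention, the digests, and the names `layered_lookup` and `squash` are assumptions for the sketch, not the OCI or composefs formats.

```python
WHITEOUT = None  # hypothetical marker: "this path was deleted in this layer"


def layered_lookup(layers, path):
    """Search from the top layer down; return (digest, number_of_probes)."""
    probes = 0
    for layer in reversed(layers):
        probes += 1
        if path in layer:
            return layer[path], probes
    return None, probes


def squash(layers):
    """Merge layers (bottom first) into one flat path -> digest table."""
    merged = {}
    for layer in layers:
        for path, digest in layer.items():
            if digest is WHITEOUT:
                merged.pop(path, None)
            else:
                merged[path] = digest
    return merged


base = {"/lib/libc.so": "d1", "/etc/old": "d0"}
top = {"/etc/old": WHITEOUT, "/app": "d2"}
layers = [base] + [{} for _ in range(8)] + [top]  # a 10-layer image

digest, probes = layered_lookup(layers, "/lib/libc.so")
flat = squash(layers)
```

With the ten layers above, finding libc takes ten probes, nine of them negative, while the squashed table answers with a single dictionary lookup. Since only metadata is merged, building the flat table does not touch any file content.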
The less trivial part, and this is where opinions on the list vary, is that I think this kind of combination of things is way more complicated than the simple one. composefs is, I think, 1,800 lines of code. It's very direct. It doesn't use loopback devices or device-mapper devices. When you do a lookup of a particular file in this combined thing, you do a lookup in the overlay layer, in the read-only file system, and in all the backing layers. So there are about four times more inodes around, there are four times more dcache lookups, and it just uses more memory and performs worse. So I ran some simple benchmarks. Some people complain about the measurements here. I'm just comparing a recursive find or ls -lR, which basically only measures the performance of lookups and readdirs. But on the other hand, that's all that composefs does. All the actual I/O performance is left to the backing file system, so wherever you store your real files, that's where streaming performance and things like that would show up. I'm personally in the automotive use case right now, and we have very harsh requirements on cold boot performance, so the cold cache numbers are very important to me. This is recursively listing a large developer snapshot, a three-gigabyte CentOS Stream 9 image, so it's not an operation you would typically do. But just looking at the numbers, the recursive listing is more than three times slower in the cold cache situation, because it has to do multiple lookups. And even for the cached case, where most things should come from the dcache anyway, I think I've seen better numbers than this, but there's at least a 10% difference in the warm cache situation. I hope that was useful. Yeah, we have some time for questions.
So you said about halfway through that one of the goals was to keep reading the OCI image format, but I think everybody pretty much agrees the OCI image format is crap for lazy pulling of container images, basically because it has an end-to-end hash, so you can't check the hash until you pull the whole image, and that means signatures are completely rubbish anyway. In order to fix this, we have to do a Merkle tree or something else anyway, so the image format is going to have to change radically to something that will be much more suitable for your image. So I think keeping image compatibility, which is partly what the argument over this versus the in-kernel alternatives is about, is not going to be a good argument, and I think you should reconsider it. I agree and I don't agree. I'm not a fan of OCI. I've been part of the OCI specification team for a bit, and I used to be one of the Docker maintainers a long time ago. It is not nice, but it is what we have, and it's everywhere. It's so easy as a developer to sit around thinking this is bullshit, we should just fix it, but there are like trillions of dollars invested in the existing containers, so it's going to be a long time. But even when we replace it, this will still do the right thing. But there are trillions of dollars invested in the hosting cart, so it doesn't stop us going to the OCI. True, true, but there are discussions of OCI v2. I don't follow them because the whole thing is bullshit, but even then, if we just had a better way to get partial updates for an image file, you could still use this. Before taking the next question, I'm obliged to point out from the chat that these performance numbers are from before optimizing overlayfs and EROFS. Yeah, yeah, there's been some work on that, and there are ideas to make the overlay stuff work better. Would that be possible, Joe? Maybe. Oh, I actually still had a question. So, here in the back.
So, it's not really a question, more a remark. I think there's sort of one missing slide in your deck, namely one use case that you haven't considered at all but that is still really worth calling out. Many remote build systems, such as Goma, Bazel, et cetera, are all nowadays converging on a single remote execution protocol called REv2, and that one also uses a CAS as a data store for storing both input files, like compiler binaries, source files, and header files, and output files, like the object files that are generated. I actually maintain one of the implementations, and one of the hard parts of implementing such a build cluster is instantiating the data that's stored in the CAS in the form of an input root on disk, where you can just run a compiler against certain source files. A tool like composefs would really help in such an implementation. That's just something I wanted to call out. You should really also market it towards those kinds of use cases, because it makes a lot of sense. Yeah, I'm sure images are used for all sorts of stuff, and I'm sure there are many use cases other than the ones I've mainly focused on. Okay, then since you already ended the Q&A a bit early, and the next talk is going to be a recording, it gives us a bit more time to prepare. Thank you very much for the talk. Thank you for all the questions, and for being here. |
EROFS filesystem update and its future |
Hello, everyone, thanks for listening to my topic, EROFS file system update and its future. Due to my visa application issue, I didn't find a proper way to go to Brussels on site, therefore I have to upload a video for an online presentation. My name is Xiang Gao, and I've been working on Linux kernel stuff for almost seven years, mainly focusing on local file systems. I guess EROFS is still unfamiliar to some people, so here I will try to give more useful information about EROFS. Hopefully it is helpful to everyone. EROFS stands for Enhanced Read-Only File System. It was originally started in late 2017 and has been available since Linux 4.19. It is designed to be a generic high-performance read-only file system with a very simple but effective core on-disk format design. As a result, it performs very well among the current in-kernel read-only file systems. EROFS is kernel-mountable, as a seekable archival format replacement for traditional CPIO and TAR. It is currently contributed to by community developers from Alibaba Cloud, ByteDance, Coolpad, Google, Huawei, OPPO, and more. As an option, EROFS supports per-file LZ4 and LZMA transparent data compression. However, EROFS can live without compression as well. It is targeted at various high-performance read-only solutions, such as system partitions and APEXes for Android smartphones and other embedded systems, LiveCDs, and container images, as well as AI data sets. There are many useful features which are actively under development, so if you have any suggestions or contributions, they are always welcome. There are several main use cases for EROFS. The first main use case is Android system partitions. Android has several read-only partitions which behave as system firmware, which means the Android core can only be changed by way of an update. This has many benefits. For example, it is easy for vendors to ship and distribute images and to keep the original signing of the images for every instance.
And it is easy to roll back to the original shipped state or to do incremental updates. It is easy to check for data corruption or to do data recovery even at a very low level, such as the hardware. Also, it is easy for real storage devices to do hardware write protection, and even more. So why did we introduce EROFS? Basically, because the existing read-only compression solutions in the kernel could not meet our performance requirements, but we needed to do compression for our low-end Android smartphones at that time. That is why we designed EROFS and sorted it out from the beginning. We handled many basic common issues of generic read-only use cases to get a high-performance read-only file system. In addition, it is good to switch APEX to the EROFS on-disk format as well. Also, currently APK is another archival format. If it becomes EROFS-mountable, it may be able to leverage the latest on-demand lazy pulling as well. So here is the first demo, in which EROFS is running on the Android Cuttlefish emulator. Our second main landed use case for EROFS is container images, with a user-space program called Nydus. Dragonfly Nydus is a user-space example which uses in-kernel EROFS and leverages its functionality to do faster container image distribution, with features like lazy pulling and data deduplication across layers and images. Currently, Nydus can do lazy pulling for Nydus EROFS images as well as stargz images and original OCI images with an extra minimal index, which is very similar to another project called SOCI. For more details on Nydus itself, you can also refer to another talk called Nydus Image Service for Confidential Containers in the Confidential Computing devroom. On the left-hand side is the Nydus architecture. You can see that an image format can be built with advanced features such as lazy pulling, data deduplication, and native or OCI-compatible modes.
And then a read-only file system for runC, Kata, and Kata CC containers, AI models, and software packages can be served by nydusd with Linux FUSE, virtiofs, and EROFS over fscache with page cache sharing. On the right-hand side are some partners which are landing Nydus and Dragonfly solutions. In the second demo, EROFS is running with Nydus for container images. First, we run the Nydus container, and it finishes in 16 seconds. Then we run the OCI container, and you can see that it finishes in 27 seconds. So Nydus reduces the time thanks to lazy pulling. This is the third demo. In this demo, EROFS is running with original OCI images and Nydus slim indexes for lazy pulling. Note that this use case is still under development, so we could optimize it even further. First, we start the ordinary OCI container, and it costs 26 seconds. Then we build Nydus zran indexes for the OCI images. Next we start the zran OCI container, and you can see it costs 21 seconds. And that is the file system: you can see that the Nydus slim indexes are very small. Next I will take some minutes to give a brief introduction to the EROFS core internals. As an effective read-only in-kernel solution, the core EROFS on-disk format is quite simple. Almost all EROFS on-disk structures are well aligned and laid out within a single block, which means they never cross two blocks, for performance. On the left-hand side is the on-disk superblock format, which contains the overall file system statistics and the root inode NID. Each EROFS inode is aligned to an inode slot so that the basic inode information stays in the same block and can be read at once. On the right-hand side is the EROFS on-disk inode format. Short extended attributes can be kept just next to the core on-disk inode, as can chunk indexes, compression indexes, and inline data. Here is the EROFS on-disk directory format. EROFS directories consist of several directory blocks.
Each block contains two parts, called the dirent part and the name part. With such an on-disk design, EROFS can do a name lookup with binary search, which makes EROFS more efficient than other existing in-kernel read-only file systems while keeping the implementation simple. Here is an overview of the Nydus use case. You can see that it has two main parts. One part is called the bootstrap, or the primary device, which has meta blocks and data blocks. The meta blocks can hold superblocks, inodes, and some inline data. The data blocks can hold directory blocks or some blocks for regular files. The other part is called the blobs, which hold external data separated into chunks, so that in such a design blobs can be referred to from the metadata. The details of the compressed data format are not quite trivial, but you can find them through the following links if you are interested. Here are the recent EROFS updates. The first feature is called chunk-based files, which can implement sparse files and data-deduplicated plain files. The next feature is called multiple devices and blobs, so an EROFS image can refer to other external data as well. Since 5.19, EROFS over fscache has landed, which is already covered by some materials available online at the following links. Since 6.1, EROFS has introduced a special inode called the packed inode for tail data, so that tail data or whole files can be deduplicated or compressed together. Also since 6.1, EROFS supports global compressed data deduplication by using a rolling hash. EROFS over fscache page cache sharing is still a work in progress. Here is an EROFS compressed data deduplication test result. You can see that, compared with SquashFS, EROFS saves more space by using this new optimization. For the next year, we've already planned some new features.
Many of them are already works in progress, such as verification solutions and data-deduplicated encryption solutions. We also have fscache improvements together with the ByteDance folks, such as failover, multiple daemons and directories, as well as a daemonless mode. More features can be found through the following links. So that's all of my topic. Thanks for listening again. If you are interested in EROFS, please kindly contact and join us. Thank you. We actually have time for five minutes of questions. We don't know how bad the lag actually is, but we can type the question into the chat if you have one. Or you can just ask it. Thanks for the talk. There was a mention of a self-contained verification solution. Can you compare it with fs-verity, and what advantages do you see in the verification solution you are working on? I mean, you can also write it, yeah? You have no idea what the lag is. Sure. Do you have the app installed, like the FOSDEM app? If you go into the schedule, then you just need to click a link. Ah. Thank you. Thank you. This is a text only development room, by the way, as you can see. Thank you. Thank you. Just saying, just saying, just saying. |
Having Something To Hide
Trusted Key Storage in Linux |
Hello, our next talk is going to be by Ahmad, about having something to hide. Thank you. Yes, so my name is Ahmad Fatoum, I'm an embedded Linux engineer with Pengutronix, and thanks for attending my talk on having something to hide: trusted key storage in Linux. So Pengutronix, the company I work for, is a German Linux consulting company. We specialize in embedded systems, so everything around embedded Linux consulting: drivers, bootloaders, kernel porting. In the course of one project, I had occasion to get more familiar with the kernel's trusted key subsystem, which I will talk about today. But first I will talk about what we need to store these keys for. This is usually disk encryption. If you install a recent Linux distribution, on many systems you already have whole-disk encryption out of the box, and it's really just a one-click affair. But what are the mechanisms underlying that? That's usually dm-crypt. So dm-crypt is the device mapper with the crypt target. What it does is map physical devices to a virtual device and apply some transformation to it. In this case, it's cryptography. You can see what that looks like in code at the end of the slide. You specify a range: the first block to start from and the number of blocks. You specify that you want to use crypt. You specify your crypto parameters, for example, here it's AES. And then you reference the crypto key that you want to use for this symmetric encryption. Here it is a 32-byte-long key with the name key, and it's of type logon. In the line after that, you see this key being added. And that's all you need to do. To initialize your dm-crypt, there is the dmsetup tool you can call, and then you have dm-crypt running. You can use this virtual device and just write to it, and everything that gets written to the physical device will be encrypted with the parameters that you have set. Most people don't do this manually via dmsetup, but they have a wrapper around that.
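As a rough illustration of the table line described above, here is a minimal Python sketch that assembles a dm-crypt table string referencing a key in the kernel keyring. The cipher string, sector count and loop device are placeholder assumptions, not the values from the speaker's slide; only the general field order and the `:<size>:<type>:<description>` key reference follow the dm-crypt table format.

```python
def dm_crypt_table(start, num_sectors, cipher, key_size, key_type, key_desc,
                   device, iv_offset=0, dev_offset=0):
    """Build one dm-crypt table line that references a kernel keyring key."""
    # A leading ':' means "look the key up in the kernel keyring":
    # :<key_size>:<key_type>:<key_description>
    key_ref = f":{key_size}:{key_type}:{key_desc}"
    return (f"{start} {num_sectors} crypt {cipher} {key_ref} "
            f"{iv_offset} {device} {dev_offset}")


# Placeholder values: a 32-byte logon key named "key" on a loop device.
table = dm_crypt_table(0, 8192, "aes-xts-plain64", 32, "logon", "key", "/dev/loop0")
print(table)
```

The resulting string is what would be handed to dmsetup as the table argument; the sketch only builds the string and does not touch any device.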
That's usually cryptsetup with LUKS. So LUKS is a disk encryption specification. You see at the end how the header is laid out. You have this binary header that's still there for compatibility. And you have a JSON area that can describe the parameters that we had in our dmsetup table, like what sort of algorithm is used or what HMAC is used. And then there is this key slots area. In this key slots area, you have the volume key, which is that 32-byte-long key that we had. That key is what's actually used for crypto. But if that leaks, yeah, all your data is encrypted with it. So the idea on top of dm-crypt, what cryptsetup and LUKS do, is that you can have multiple passphrases. For example, your normal passphrase that you always enter, or a recovery key. In turn, you encrypt that volume key each time with a different key, and that's stored in the key slots area. That way, you can have multiple passphrases for the same volume. And yeah, where does that passphrase come from? It's usually entered by the user. So in the initrd, you are asked for the passphrase, and then you enter it. You could be a bit more sophisticated and insert a USB stick that has a key file. That's the same code path, basically. You could insert a FIDO security key or a smart card, but what all of these have in common is that the user is inserting or typing something, so you need user involvement. In my project, it was an embedded system, and we don't really have a user powering up the devices. So we need some sort of automated solution for unattended boot. And here is where trusted storage comes in. In the regular case, the trusted storage is the memory of the user or their USB stick. But for an unattended boot, you need some on-chip or off-chip device that's appropriately secure and can hold the key. On many systems, such a device is the TPM, the Trusted Platform Module. This is an industry-wide standard.
It's also an international standard, and it's mandated by Windows 11, which helps its adoption in a lot of modern systems, because you couldn't boot Windows otherwise. TPMs are available as discrete devices, as chips, sometimes on something like a breakout board for your motherboard, but they can also be implemented in firmware. TPMs have a standardized interface through which you can talk to them, and they provide you with a lot of services. What's interesting for us is that a TPM has a random number generator built in, so it has its own entropy source and gives you access to it. And it holds a unique, never-disclosed key. With this unique never-disclosed key, you can encrypt arbitrary data. So instead of having a passphrase that you need to remember, you could have an encrypted passphrase that you pass to the TPM; the TPM will decrypt it with the unique never-disclosed key that it has inside and then hand you the data in decrypted form, which you can then pass into dm-crypt or into cryptsetup or whatever. And you can even make this dependent on having reached a certain state; that's integrity measurement. Each boot stage can verify the boot stage after it and then tell the TPM: this is the measurement value. These measurement values are concatenated and hashed and kept in the TPM. And you can configure the TPM to only release, to only decrypt, the data when it has reached that state. So when you configure it correctly, the TPM will only decrypt your encrypted blob when you are indeed in the measured boot state that you want to be in. You can even bind it to a time, so that after a given time has elapsed, you can't access it anymore. Yeah, what does it look like in practice? The kernel has drivers that abstract away the different modes of communication. It can be I2C, it can be SPI. You don't need to worry about that in user space. You have device files that provide you access.
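The "concatenated and hashed" chaining of measurement values mentioned above can be sketched in a few lines of Python. This is a simplified model, assuming a single SHA-256 register that starts as all zeroes; the function names are made up for illustration, and a real TPM additionally involves banks, policies and a command interface that are not modelled here.

```python
import hashlib


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold one measurement into the register: new = H(old || H(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot_chain(stages) -> bytes:
    """Extend an all-zero register with every boot stage, in boot order."""
    pcr = bytes(32)  # assumption: register is zeroed at reset
    for stage in stages:
        pcr = pcr_extend(pcr, stage)
    return pcr


good = measure_boot_chain([b"bootloader-v1", b"kernel-v1", b"initrd-v1"])
tampered = measure_boot_chain([b"bootloader-v1", b"kernel-evil", b"initrd-v1"])
print(good.hex())
print(good == tampered)
```

Because each value depends on everything extended before it, a changed stage anywhere in the chain yields a different final value, which is what lets the TPM refuse to unseal in an unexpected boot state.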
There are user space libraries that wrap that, and there has even been systemd support for, I think, a year and a half or so, where you can enroll LUKS keys into TPMs. It's very easy to set up. But whatever you do, the common way of using this with LUKS has, you could call it an issue, that privileged user space has access to the key material. You have seen there is this JSON area where you could store stuff, so you could store your encrypted key there. What would happen on boot is that cryptsetup or systemd-cryptsetup would go there, get this encrypted key, and send it to the TPM. The TPM would do its checks and see: okay, I'm in the correct state. It would decrypt the data and send it back to user space. Your user space now has this passphrase, which it can use to decrypt the dm-crypt key, and then it passes that into the kernel again. So it's a really roundabout way to get the dm-crypt key into the kernel keyring. The idea behind trusted keys was: why not directly decrypt the TPM-secured key into the kernel keyring and reference it from there, without involving user space at all? And yeah, so it has been implemented. It was first added in 2010; the first kernel with it was released in 2011. It was originally TPM-specific, but the naming was kept generic enough, I think in the hope that it could be extended in the future. The same patch series that added it also added encrypted keys. Encrypted keys are keys that you can only observe from user space in encrypted form. That's how it should be. You tell the kernel: generate a key for me. When you try to export the key, you only get it in encrypted form. And when you want to load it, you give it to the kernel in encrypted form and it will decrypt it, but it stays in kernel memory in decrypted form. That's encrypted keys. Trusted keys additionally have a hardware root of trust.
So they use a TPM for doing the encryption and decryption. In theory, you shouldn't be able to load a trusted key and have it decrypted on a system other than the one where you generated it, because on the other system, you would have another trust source with its own unique key that is used for the encryption. What does it look like in code? It's basically the same line we have seen before, but instead of having a 32-byte-long logon key, we have a 32-byte trusted key here. It's called kmk. To create it, you can use the keyctl command: you add a trusted key. You don't specify the key material like we did with the logon key, because you can't do that. You just ask the kernel to generate a 32-byte key for you. And then when you pipe it, which is the command to output the key contents, unlike a user key, which would just output the key material in plain text, it outputs the encrypted key blob, which you can store wherever and use on subsequent boots. What the rest does is set up a loop device, set up dm-crypt on it and write to it, and then reboot. On the second boot, if you were to create a new trusted key, it would be completely different, because it would be generated randomly. You want to use the key that you have stored already, which is what the blue line is doing. It does an add trusted kmk, but instead of creating a new key, it loads the key blob that we have stored. With that, you should be able to read back what you have written before. Yeah, so that's how it works. We have a way to do it in user space already, and that's how it's usually done. And not everyone agrees that there's a strict advantage to doing it in the kernel. But what was interesting to me is that it is a very useful interface to represent much more than just TPMs.
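The create, export and reload steps described above could be scripted roughly as follows. This Python sketch only builds the keyctl argument lists and does not execute anything; the helper names, the `@u` keyring and the blob value are illustrative assumptions, while the `new <len>` and `load <hex blob>` payloads follow the kernel's trusted keys documentation.

```python
def create_trusted_key_cmd(name, key_len=32, keyring="@u"):
    """First boot: ask the kernel to generate a fresh trusted key."""
    return ["keyctl", "add", "trusted", name, f"new {key_len}", keyring]


def export_blob_cmd(key_id):
    """Dump the key as seen from user space: an encrypted, hex-encoded blob."""
    return ["keyctl", "pipe", str(key_id)]


def load_trusted_key_cmd(name, blob_hex, keyring="@u"):
    """Later boots: hand the stored blob back so the trust source unseals it."""
    return ["keyctl", "add", "trusted", name, f"load {blob_hex}", keyring]


# Hypothetical key id and blob, for illustration only.
print(create_trusted_key_cmd("kmk"))
print(export_blob_cmd(123456789))
print(load_trusted_key_cmd("kmk", "0123abcd"))
```

On a system with a usable trust source, these lists could be passed to `subprocess.run`; the dm-crypt table would then reference the key as `:32:trusted:kmk` instead of `:32:logon:key`.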
That's because on modern systems, you can have off-chip secure enclaves, basically a TPM that doesn't speak the TPM protocol and doesn't implement everything, but implements part of it. You can have an on-chip trusted execution environment. You can have crypto units inside everyday SoCs. Very often you have a crypto accelerator that also has access to a key that it can use for wrapping and unwrapping data. And indeed, in 2019, work was started by Sumit Garg at Linaro to generalize trusted keys and add TEE support in the first instance. So what is TEE? TEE is also an API standard. It's about having a hardware-isolated environment where you can run trusted applications on the same CPU where you execute your Linux. Thanks to this hardware isolation, normally Arm TrustZone, if you do everything right and have firewalls in place and all that stuff, you shouldn't be able to read the secure memory from your normal world, which is Linux. And these trusted applications can do basically everything. You can have a trusted application that offers you a TPM, and in that case, you could just use trusted keys with TPMs. But you can do basically anything. It's software. You can do random number generation in the TEE. You can do key sealing and unsealing with a hardware unique key. That's available on some processors: when you are in secure mode, you have access to a key that you can never see from Linux, which is unique and fused in. And there are even people doing clock, reset and power domain support in it, because they don't want Linux to have access to these things. If you are interested, you can just grep the kernel tree for TEE client drivers and see all the stuff that's there. What was interesting to me was the crypto unit inside the i.MX SoCs. It's called CAAM, by Freescale. We already have a CAAM driver in Linux. It does random number generation. It does crypto acceleration. It works a bit like a network card.
So you have these shared DMA rings where you push the jobs you want the CAAM to do, and then the CAAM replies to you. You can do, as I said, crypto acceleration and RNG. It also has access to a one-time programmable master key that's fused by NXP in the factory and that's unique between devices. That's the selling point. The CAAM can use it for red blob generation, which means it seals and unseals user-supplied data using it, basically the same as we have seen with the TPM and with TEE. And it has black blob generation. TPMs are very slow, and I don't know if they support crypto offloading, but you probably don't want to do that if you want to do something quickly. The CAAM can do it much quicker, and you can have the key never leave the CAAM and use it for crypto inside the CAAM. You are, of course, limited to the crypto algorithms the CAAM supports. But the possibility is there if you don't want your key to even enter the kernel; it stays in the CAAM itself all the time. And yeah, what do we need that for? The common use case is certificate storage. You are a vendor, you need to call into your own cloud, and you have client certificates for that. You don't want someone to be able to desolder the eMMC, read it out and get access to your certificates. So you store the certificates encrypted and at runtime decrypt them into memory, maybe normal memory, maybe on-chip memory, whatever. And yeah, we had many customers that needed something like that, and we had been carrying out-of-tree patches for it. In 2015, we sent it out for the first time to get some feedback. Back then it was using the standard thing, a custom sysfs interface. In the following years, NXP tried to upstream their own new key types to wrap this hardware functionality. Finally, in 2019, work began on generalizing trusted keys, and yeah, it was finally merged in 2021. In 2021, I then also started implementing it for CAAM.
And that support has been available since 5.19. It's usable exactly the same way as with TPMs. You can't do the measurement stuff, because the CAAM doesn't have support for that. But on NXP SoCs, you would rather use their form of verified boot. This unique key that's inside the CAAM is only released when the SoC believes it's in a high-assurance boot state. That means the boot ROM has verified the bootloader, and then you are supposed to keep that chain of verification going: the bootloader verifies the kernel, which verifies the initrd, and so on. Yeah, some interesting tidbits from upstreaming the series. TEE and TPM both don't use the kernel entropy pool. TPMs always have a random number generator; for TEE, it was specified that they need to provide random number generation. That's not something that I wanted to do for CAAM, because we have a perfectly fine CAAM RNG driver. Not everyone was fine with it, but eventually, stubbornness prevailed. And yeah, you can now choose it for the existing backends as well. You can specify trusted.rng=kernel, and then, even for TEE or TPM, you use the kernel entropy pool if you want that. The default is leaving it to the trust source to decide what it wants to do. That's also useful for devices like the i.MX6 UltraLiteLite. You can guess from the name, it's supposed to be very lightweight. Their crypto unit doesn't support an RNG as is, so you would rather use the kernel driver that's available and does this differently than have to do it in your own driver. What was also interesting: hardware feature bits were broken on some variants. You can ask the CAAM what features it supports, and there are CAAMs that say they have blob support but lack AES support, so they fail with an internal exception when you try to use it, because the sealing and unsealing is AES-based.
But yeah, that's one more thing the kernel needs to take into account to work on these systems. That's also something I only learned about while getting review feedback; it was not something I anticipated. As you have seen, NXP had made different attempts at getting this into the kernel, and they applied that to their vendor tree. They called it secure keys, and during upstreaming I was asked whether I would change my key modifier to be compatible with the NXP kernel, so people have an easier time migrating. That was no problem for me. It broke compatibility with my sysfs interface, but I needed a migration step anyway, and this makes things easier for most of the users that want to switch, so I did that. Why did I need a migration step? Because I was using LUKS before, but LUKS doesn't have trusted key support. So what I did is use dm-crypt directly. I basically did the same things that LUKS would be doing, but only the dm-crypt part, and I would exclude the header you had seen on one of the first slides. You can specify the range of blocks that it should work on, so you can just cut out the LUKS area and do dm-crypt directly. And yeah, you need a one-time import step, because the first time you don't want to generate the trusted key randomly; you want to take the one that you have already been using for years. Of course, in a new product, you don't want that non-upstream patch I linked there, but in an existing product, that's how you could do it: the old key blob is put into sysfs, you get a plaintext key out, keyctl imports it, and you have the new key blob. We store both alongside each other, so if the update fails for whatever reason, you can fall back to the old system and use the old key blob, and both work. Yeah, finally, what more is there to do? There's encrypted key support for dm-crypt, eCryptfs, EVM, and NVDIMM.
There's direct trusted key support, without involving encrypted keys, for dm-crypt, and yeah, you can use encrypted keys. Future candidates would be fscrypt; there have been attempts, one for the old key setup scheme, the second by me for the new key setup scheme. UBIFS authentication also currently uses a logon key that could be changed to a trusted or encrypted key, but yeah, these patches have died down. LUKS support would be awesome, because with LUKS it just works out of the box; with dm-crypt, we still need to do it manually. But that enables us to do it completely in the kernel without involving user space, and you don't really want user space messing with a DMA-capable device that could just overwrite the kernel if you give it access, so trusted keys were the correct solution for us there. And that concludes my talk, and I would accept your questions if you have any. Thank you, and we have some time for a few questions. I have a question: are you aware of any way to automate this step of getting the secret from the hardware into the kernel as well, so you don't need user space interaction or user space utilities? My use case is mainly the root file system, and to forgo using an initramfs that needs to run a lot of commands, so you could, from the kernel command line, similar to dm-init, also get the key. Personally, if I had that requirement, I would consider doing it from the bootloader and then have the kernel read it off the kernel command line, because there is nothing confidential about the encrypted key blobs. So yeah, in theory the kernel could accept it over the kernel command line, but there is nothing like that currently. I can repeat the question. Is there a way to also combine these hardware keys with some PIN and LUKS, so you have to authorize yourself to the device?
That's not really how it's meant to be used, because, well, the key material shouldn't leave the kernel: you insert the key into the kernel keyring and directly reference it from dm-crypt, so I don't know how to easily factor in a user PIN. Apparently there is a passphrase option when using trusted keys that I need to look up. So thanks for the talk. Would it be possible to add a manual step before communicating with the TPM, for example, a fingerprint scanner or anything like that? Is there a hardware and software option to combine the two verification steps? You could. So currently you need to have an initrd. In my case, I don't even have an initrd, since I don't use it for the root file system. But if you were to use it for the root file system, for example, you could first check in the initrd that the fingerprint is there. But there is no way to wire into the kernel that this needs to happen first; that's more of a policy thing that you would do in user space. |
Optimizing BPF hashmap and friends |
All right, so let's get started again. The next talk is by Anton, about optimizing BPF hash map and friends. Hello, I'm Anton, I work on the datapath team at Isovalent, and yeah, this is a talk about how simple it is to optimize the BPF hash map. About a year ago, maybe a little less, Andrii Nakryiko suggested trying new hash functions for the BPF hash map and the BPF stack trace map. In Cilium and in Tetragon we use hashes extensively, so for us it's a big deal, and I decided to give it a try. I will briefly provide some benchmarking 101, then we will take a look at different hash functions, and then we will see benchmarks for hash maps which use the old hash functions and the new hash functions. So the first thing to do when you benchmark is to try to reduce noise, because modern CPUs do everything to ruin your benchmarking. They will run at different frequencies, hyperthreading will get in the way, and in the best case you benchmark inside the kernel, because there you can disable preemption and interrupts. So, to benchmark some function, we first read some kind of time source or clock source, then we execute our function, maybe in a loop, then we read the time once again, and then we divide the elapsed time by the number of loops and get our result. In some cases we can't execute our function in a loop. For example, if it's not an abstract hash function but just part of the kernel, then n is obviously equal to 1, and we need to take into account the time it takes to get the time. One obvious way to do this is to benchmark an empty loop, and this gives us roughly how long the get-time function takes. Typically people benchmark with something like the rdtsc instruction, and if you don't reduce noise, then the rdtsc instruction is pretty unstable.
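The measure, run, measure again and subtract-the-clock-cost recipe described above can be sketched in user space. This is only an illustration in Python with `time.perf_counter_ns`, not the in-kernel rdtsc setup from the talk; taking the minimum over many empty measurements as the overhead estimate is an assumption of this sketch.

```python
import time


def timer_overhead(samples=10_000):
    """Estimate the cost of reading the clock by timing an empty region."""
    best = None
    for _ in range(samples):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        delta = t1 - t0
        if best is None or delta < best:
            best = delta
    return best


def bench(func, loops=1, overhead=0):
    """Return nanoseconds per call, with the clock-read cost subtracted."""
    t0 = time.perf_counter_ns()
    for _ in range(loops):
        func()
    t1 = time.perf_counter_ns()
    # Clamp at zero: for tiny functions, noise can exceed the measurement.
    return max(t1 - t0 - overhead, 0) / loops


overhead = timer_overhead()
print(bench(lambda: hash(b"some key"), loops=1000, overhead=overhead))
```

With `loops=1`, the subtracted overhead can easily dominate the result, which is exactly the situation the talk describes for functions that cannot be run in a loop.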
So here my CPU obviously runs at two different frequencies, and if the function I want to benchmark runs for 30 or 40 cycles, then the deviation in this case is bigger than the function itself, so I can't rely on it. If we reduce noise, as I said, then the results become way more stable. Here it maybe looks scary, but it's actually 37 plus or minus one cycle, so even for very small functions this makes benchmarking more reliable. But in this form rdtsc doesn't work, because it's not a serializing instruction, which means that if you insert your code here, then rdtsc can be executed in the middle of your code or even after your code, and it can differ from execution to execution. So we need some way to serialize it, and luckily there are known ways to do this. There is a white paper, written maybe ten years ago, about Itanium, and it's still valid, with some changes from architecture to architecture, but yeah, we can do this. In this case, benchmarking the offset takes a little bit more, it's not 37 cycles anymore, it's 70 cycles, but again it's pretty stable, and we can use this number to offset our measurements. In fact, I did such benchmarking without running in a loop, and then switched to more dumb benchmarks where you do loops over BPF maps, because the former is harder to port: if you want someone else to try these benchmarks, they will have to patch their kernel, and this is not simple. So let's take a look at several hash functions of interest. jhash is the hash function currently used in BPF. It's Bob Jenkins's hash, and it was probably developed some 30 years ago. SpookyHash is another one; it's not a newer version, it's a newer hash from Bob Jenkins. Then there are the xxHashes: xxh32 and xxh64 are the previous generation of these functions, they are available in the kernel and we can try them as well, and xxh3 is the state-of-the-art hash function.
So if we just take a look at this plot, the orange line there is jhash, which is currently used. The green line, which looks to be winning here, is the previous generation, xxh64. The blue line is the newest generation, xxh3, and while it looks here as if it doesn't perform as well as xxh64, for small keys it does outperform it, and for BPF hash maps we are primarily interested in small keys; I actually never use it with huge keys. In any case, xxh3 works faster than the Jenkins hash, and later I will show that it can actually run even faster. One interesting thing is SpookyHash: it performs pretty badly for small keys, because it has a lot of setup which it does in any case, but later it starts to outperform every other hash function. I was interested in when it does so, and it does so at a key size of about 9000. It's cool, but it's not the key size of interest for us. So if we take a look at xxh3 and jhash, we can see that the blue line, xxh3, actually outperforms jhash for all key sizes. There is also this green line, jhash2. It's an optimized version of jhash which can be used if your key size is a multiple of 4, and it's actually used in bloom filters but for some reason not in hash maps. If we take a closer look at small key sizes, we see that yes, xxh3 outperforms jhash, so for me that's reason enough to try to benchmark maps with it. Let's now take a look at the BPF maps which use hash functions: the first one is the stack trace map, and then the hash map and bloom filters.
So the stack trace map was actually the main reason for Andrii to propose xxh3. What it does is take a stack trace, hash it, and create a map of IDs which refer to these traces. If there are hash collisions, then old stack traces are lost and we get an incorrect picture of the system. Stack traces are not too random, so if your hash function is not very good, if it doesn't have very good avalanche properties, then it will create more collisions for less random data. xxh3 behaves way better than jhash in terms of avalanche properties, and this is the main reason to use it for the stack trace map. The other reason, as the benchmark shows, is that for stack traces it also runs about twice as fast for typical key sizes, because the typical key size is 8 bytes per stack depth, which is typically 60, 80, 100 bytes. So xxh3 is a very good candidate here. For the hash map benchmarks I primarily focused on lookups, because this is what we do the most and it is easy to measure compared to more complex pictures. There are some links to the benchmarks I used and to scripts to actually execute the benchmarks and plot them, because for every change I had to draw some 100 or 150 pictures for different key sizes and for different fullness of hash maps, so it's impossible to do otherwise. I have a few pictures here. If we just use xxh3, then it looks like the new map, which is orange, outperforms the original map, which is blue. Here the vertical axis is lookup speed in cycles and the horizontal axis is key size, so the bigger the key, the more gain from using the new hash function. But if we take a bigger map, I see that xxh3 as it is degrades for key size 4, and for me this is already a blocker: I can't propose a change which degrades existing applications. Then I went to a different microarchitecture, and here I saw that it degrades for a different key size; here it
degrades for key size 12, and if we take a bigger map, it degrades for key size 24. Then I thought about how to fix this, because if it works for bigger keys, then maybe I can utilize this, and I did the same thing as the bloom filter currently does. The bloom filter executes jhash2 for key sizes divisible by 4 and uses jhash in other cases. So I did the same: use jhash2 for key sizes which are divisible by 4, but only for small ones. The key length divided by 4 that you see here is not actually computed on each call; it's just the key length divided by 4, but it's computed during hash initialization, and we can decide for which key sizes we do this. With this hash function I finally see that it doesn't degrade anywhere. So this is 10k, 100k, and 100% full, which is the worst case. If we take another slice, this is a 100k map with key size 8; on the left side it's almost empty, on the right side it's 100% full. And the bigger the key size, the bigger the gain: for key size 64, the map with the new hash function runs about 50% faster, and for key size 128 it runs almost twice as fast. Bloom filters, as I mentioned, use jhash2 for keys divisible by 4, so I don't expect any gain for keys divisible by 4, at least for small keys, and it looks like this. This is an extreme case of a bloom filter with 9 hashes, which I did just so that it reproduces the plot of the hash function here. And yes, for small keys it is the same, and for bigger keys we have a gain. Here is key size 240, where the xxh3 function originally starts using vector instructions, and we obviously can't use vector instructions in BPF maps. There is also a scalar implementation which works faster than jhash, but it degrades at this point. Another thing to mention is that the old hash functions, jhash and xxh64, were designed and optimized with the O2 option in mind, so if we switch to O3, then they will behave the
same, but xxh3 actually runs 50-60% faster, so it performs way better with O3. So I'll just jump here. I know that O3 is a no-go for the kernel; there were several attempts to introduce it, and the reason given against it was that there were no candidates which benefit from O3. But this one is a particular candidate: if we could use O3 for the hash map, and not only for xxh3, because it should be inlined, then we would be able to get rid of this composite hash which mixes hashes and just use xxh3 as is. So yeah, as I said, for the stack trace map it definitely makes sense to use it. There is both a benefit in speed, a small one, because for the stack trace map the bottleneck for speed is not the hash, and a benefit in collisions, because the bottleneck for hash collisions is the hash. For the hash map it's a question; maybe someone could advise me on what to do with O3. After I run benchmarks on a slightly bigger number of architectures, I think this is also a good candidate to use in the hash map. So here are some links to the benchmarks and the paper I used, for those who will be reading this, and thank you. All right, thanks a lot. Any questions? For the O3 thing, can you maybe compile only the hash map file with O3? Yeah, currently it is disabled for every file in the kernel. For a custom build we can enable it, but generally we just pass O2 everywhere. If it's possible to compile just the BPF maps with O3, this will solve the thing, yes. It's not such a big change, so you don't have to compile the whole kernel with O3. Yeah, it's local code. Okay, any other questions? No? Then thanks a lot. |
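The composite hash described in this talk, where the hash function is picked once from the key size at map creation, can be sketched as follows. This is a Python illustration only: `zlib.crc32` and `hashlib.blake2b` are stand-ins for jhash2 and xxh3, and the `small_key_limit` threshold is a hypothetical parameter, since the talk does not give the exact cut-off.

```python
import hashlib
import zlib


def word_hash(key: bytes) -> int:
    """Stand-in for jhash2: only valid for keys made of whole 32-bit words."""
    assert len(key) % 4 == 0
    return zlib.crc32(key)


def bulk_hash(key: bytes) -> int:
    """Stand-in for xxh3: used for every other key size."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=4).digest(), "little")


def make_hasher(key_len: int, small_key_limit: int = 32):
    """Choose the hash once, at map creation, from the fixed key size."""
    use_word_hash = key_len % 4 == 0 and key_len <= small_key_limit
    return word_hash if use_word_hash else bulk_hash


hasher = make_hasher(key_len=8)
print(hasher is word_hash, hasher(b"12345678"))
```

Deciding at initialization keeps the per-lookup cost to a single indirect call, which matters when the lookup itself is only a few dozen cycles.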
eBPF loader deep dive |
All right. Let's get started again. So, welcome back, everyone. The next talk is from Dylan, about eBPF loader deep dive. Yes. Hello, everyone. Thank you for attending. Before we start, I have to make a quick confession. I'm only 80% done with my talk. No, but really, today I'm going to talk about eBPF loaders, and while I'll do my best to go as deep as I can within the time constraints, there is of course so much more to go through. So, let's start with what a loader is, for those of you who are not in the know. The term can be used in multiple contexts, but for the purpose of this talk, I will refer to a loader as any program that interacts with the kernel via syscalls. Or, what you more commonly see, a program that uses an eBPF loader library to do most of that work for it. Examples of loaders are ip and tc, which can be used to load XDP programs or TC programs, for example, but also bpftool, which can do the same, or bpftrace, or even your own app if you decide to use a loader library and make something great. Loader libraries are basically abstractions over the eBPF syscalls to make them easier to use, kind of like libc, but for BPF, which is where the name of the first example comes from: libbpf. But of course, there are many others, like Aya, which we had a talk about earlier today, or BCC, or cilium/ebpf. These are all examples of loader libraries, libraries that load BPF programs into the kernel. So, why do we need loaders? This is the example program we're working with today. It's quite simple. On the left side, I declare a map, which we will be using to store flow data, so packets and bytes per second, for each combination of source address and destination address, and on the right is a bit of logic that checks that we have enough data to interpret it as IPv4. Now, there's a handle IPv4 function mentioned here, but it doesn't fit on the slide, so we'll get to that later.
When I compile my program, I get what's called an ELF, an Executable and Linkable Format file. If I were to pull any random Hello World C program from the internet and compile it with the command shown above, we'd get out an executable, and you can use it out of the box, no need for trickery. You make it executable, you execute it, and you get Hello World on the command line. If you take an eBPF program and try to compile it with commands you found on the internet, you'll get a relocatable. Now, if you try to execute it, you'll get an error, so it doesn't work. What you need is a loader. The executable that we have is like premade IKEA furniture, but the relocatable we get for eBPF comes in pieces, with, perhaps, if you're lucky, a guide on how to put them together. And this is the job of the loader: putting the pieces together and providing the guide, to make it easy for you to use. Now, an ELF as we generated it has the following structure. We have this large file. We start with an ELF header, which contains information like: this contains eBPF, it's this many bits, for this machine. And it has a bunch of segments, sorry, sections. These sections have names, and each of them can have a different format. So the string table has a bunch of strings, our programs have a bunch of program code in them, et cetera. But they also refer to each other. So you have all the arrows, and they point to each other, and they link to each other. But in this form, it's not that usable. Because the kernel only understands syscalls and eBPF programs, it doesn't know how to handle such an ELF.
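The difference between the executable and the relocatable is visible in the ELF header itself. The following is a minimal sketch, not taken from the talk: it decodes only the header fields relevant here and runs on a synthetic header built by the hypothetical helper `fake_header`, so no compiler or real object file is needed. The constants `ET_REL`, `ET_EXEC` and `EM_BPF` are the standard ELF values; everything else is named for illustration only.

```python
import struct

ET_REL, ET_EXEC = 1, 2   # e_type values from the ELF specification
EM_BPF = 247             # e_machine value assigned to eBPF

def describe_elf(header: bytes) -> dict:
    """Decode the few fields of an ELF64 little-endian header that matter here."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_class != 2 or ei_data != 1:
        raise ValueError("this sketch only handles ELF64, little-endian")
    # e_type and e_machine follow the 16-byte identification block.
    e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return {
        "relocatable": e_type == ET_REL,
        "executable": e_type == ET_EXEC,
        "is_bpf": e_machine == EM_BPF,
    }

def fake_header(e_type: int, e_machine: int) -> bytes:
    """Hypothetical helper: build a synthetic 64-byte header for the demo."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)   # ELF64, LSB, version 1
    return (ident + struct.pack("<HH", e_type, e_machine)).ljust(64, b"\x00")

# What a compiled eBPF object is expected to look like: relocatable, machine BPF.
bpf_object = describe_elf(fake_header(ET_REL, EM_BPF))
```

On a real object file, the same function could be fed the first 64 bytes read from disk; a loader would go on from there to walk the section headers.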
So what does the BPF syscall look like? If you pull up the man page, we have a bunch of commands. Each command has attributes, and in the kernel they're defined in a very big union, and every command has its own set of attributes that you can use to instruct the kernel, to ask the kernel to do something for you. I can't go over all of them because of time constraints, but the most important ones are loading your program, creating a map, loading BTF, and of course interacting with that map, attaching the program somewhere, et cetera. There are quite a few commands, and each of them does slightly different things. The loaders, in most cases, provide functions that either call multiple of these to do a big, high-level operation, or they provide small wrappers for you to do your low-level operations yourself. There are also links, which is a newer concept, and you can pin your objects to the file system, so they live longer than your program. And we have a few other miscellaneous functions for doing measurements, statistics, iteration, et cetera, but I can't go into that in this talk, unfortunately. So back to our program. When we write a program, we have a macro here that says SEC. That's quite unique to BPF. Every BPF program needs to have this section tag there, and this tells the compiler to put all of the program code in the specific section that we named. The name of this section also follows a convention, which the loader can use to learn that this is an XDP program, so it should be interpreted as such. Now we can dump this section. If we dump it with llvm-objdump, then we get out this, which is hard to read if it's not annotated, but it's a bunch of BPF instructions, each starting with the opcode, so the actual opcode that tells it if it's add, subtract, whatever. Then source and destination registers, which these opcodes act on, with offsets for jumps.
These are relative. And then immediate data to, say, load a constant value into a register. And sometimes we can use two of them together to represent a 64-bit number, but we'll get to that later. We can also ask objdump to disassemble this for us, and we'll get the disassembled BPF program. So the bytes are on the left side and the actual program on the right side, but you'll notice that there's a call here. One thing that I didn't tell you before is that the handle IPv4 function that we have is marked in such a way that it won't be inlined, so it's a separate program. BPF can do BPF-to-BPF function calls, and if you do that, the compiler puts out this instruction, a function call instruction, but with minus one. Where do we call to? Well, currently nowhere, because we haven't assembled the pieces of our furniture yet. What also happens is that the compiler will emit relocation information, which we can, again, visualize, and it says: all right, we have a certain instruction at a given offset, and you should put the relative address of this other function in here. Then we can go to the symbol table and look up this name, and it says, oh, that function lives in the .text section, where, for BPF programs, all of the functions used for function-to-function calls live together. So we have these two separate pieces of the puzzle, and they refer to each other. But the kernel only has one pointer for our instructions. It expects that every program we give it is one contiguous piece of memory with instructions, and it all should work. So we have some work to do. We, or rather the loader, need to figure out how to lay out our programs, so piece all of the puzzle together, find all of these references, and then put in the correct offset. All of this happens in user space before we even go to the kernel. Now, the second fun thing is that we can define our map.
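The instruction layout and the call fix-up described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not how any particular loader library is implemented: it assumes little-endian encoding, a single call site, and that the loader simply appends the callee after the main program. The helper names `encode`, `decode` and `link` are invented for this example; the opcode and `src_reg` constants are the ones defined by the BPF instruction set.

```python
import struct

BPF_CALL = 0x85          # BPF_JMP | BPF_CALL
BPF_PSEUDO_CALL = 1      # src_reg value that marks a BPF-to-BPF call

def encode(opcode, dst, src, off, imm):
    """Pack one 8-byte instruction: opcode, dst/src nibbles, s16 offset, s32 immediate."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, off, imm)

def decode(raw):
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return {"opcode": opcode, "dst": regs & 0x0F, "src": regs >> 4, "off": off, "imm": imm}

def link(main, subprog, call_index):
    """Append subprog after main and point the call at its first instruction."""
    insns = main + subprog
    target = len(main)                      # index where the callee now starts
    old = decode(insns[call_index])
    if old["opcode"] != BPF_CALL or old["src"] != BPF_PSEUDO_CALL:
        raise ValueError("relocation does not point at a BPF-to-BPF call")
    # The immediate is relative to the instruction that follows the call.
    relative = target - (call_index + 1)
    insns[call_index] = encode(BPF_CALL, old["dst"], old["src"], old["off"], relative)
    return insns

main = [
    encode(0xB7, 1, 0, 0, 0),               # mov r1, 0
    encode(BPF_CALL, 0, BPF_PSEUDO_CALL, 0, -1),   # call with placeholder target
    encode(0x95, 0, 0, 0, 0),               # exit
]
subprog = [
    encode(0xB7, 0, 0, 0, 2),               # mov r0, 2
    encode(0x95, 0, 0, 0, 0),               # exit
]
linked = link(main, subprog, call_index=1)
```

The result is one contiguous instruction array, which is the shape the kernel expects when the program is loaded.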
So again, we have the SEC part, .maps, to put it in the .maps section. And this is the function, the part that I have been hiding from you until now. It's also quite simple as BPF programs go. We get an IPv4 header, check that we can use it, and we get a value from the map, and if it doesn't exist, we write a new one, and we increment the values every time this happens to account for some information. So keep this program in mind. If we look at the instructions again, the disassembled version this time, we see that we have two of these long lines which are zero at the end. These are the 64-bit immediate values that I was talking about, and they are long to pre-allocate room for actual memory addresses later, instead of relative jumps. But they are zero, and these should be references to our map; later on these will become pointers, when the kernel has its way with them. And in our case, we again need to figure out what to put in here. So, same routine. We have relocation information. The relocation information points to the instructions that we had. It says you need to plug in the flow map here. We go to the symbol table, and there it says we have a .maps section, and there lives a flow map. In this case, we handle it slightly differently: we have to go load this flow map first and get a file descriptor, which is our unique identifier for the map, and we need to actually put that file descriptor into these empty values, so the kernel knows where to go. Creating maps is also a command, so we have the map create command, and it takes these arguments. I cut out a few of the later ones, but these are the essentials: the type, how big my values are, et cetera. Give it a nice name. And there are two ways to define these. We have the new way of doing it, which are colloquially called BTF maps, on the left. But there's also the old way of doing it, using a bpf_map_def definition, on the right.
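The map fix-up follows the same pattern as the call fix-up, except that the value plugged in is a file descriptor rather than a relative offset. The sketch below is illustrative only: the file descriptor `7` stands in for whatever the map create command would return, and the names `apply_map_relocations`, `relocations` and `map_fds` are invented for this example. The constants `BPF_LD_IMM64` (opcode `0x18`) and `BPF_PSEUDO_MAP_FD` are the ones defined by the BPF instruction set.

```python
import struct

BPF_LD_IMM64 = 0x18        # BPF_LD | BPF_DW | BPF_IMM, occupies two 8-byte slots
BPF_PSEUDO_MAP_FD = 1      # src_reg value meaning "the immediate is a map fd"

def encode(opcode, dst, src, off, imm):
    return struct.pack("<BBhi", opcode, (src << 4) | dst, off, imm)

def decode(raw):
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return {"opcode": opcode, "dst": regs & 0x0F, "src": regs >> 4, "off": off, "imm": imm}

def apply_map_relocations(insns, relocations, map_fds):
    """Return a copy of insns with each map placeholder replaced by its fd."""
    patched = list(insns)
    for index, symbol in relocations:
        old = decode(patched[index])
        if old["opcode"] != BPF_LD_IMM64:
            raise ValueError("map relocation must point at a 64-bit immediate load")
        patched[index] = encode(BPF_LD_IMM64, old["dst"], BPF_PSEUDO_MAP_FD,
                                old["off"], map_fds[symbol])
    return patched

program = [
    encode(BPF_LD_IMM64, 1, 0, 0, 0),   # r1 = <map placeholder>, first slot
    encode(0x00, 0, 0, 0, 0),           # second slot of the 64-bit immediate
    encode(0x95, 0, 0, 0, 0),           # exit
]
relocations = [(0, "flow_map")]         # (instruction index, symbol name)
map_fds = {"flow_map": 7}               # assumed result of creating the map first
loadable = apply_map_relocations(program, relocations, map_fds)
```

This ordering is why the map has to be created before the program is loaded: the descriptor must exist before it can be written into the instruction.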
Don't use it. If you go to libbpf, it will warn you that you shouldn't use it and should go for the left side. But the odd thing is that if you use these newer BTF maps on the left and you look at what's actually written to your .maps section, it's all zero. It still allocates room for your map, but it'll all be zero, and there's no information there. All the information, instead, is in the type information of the flow map. So we have to get into what BTF is. BTF stands for BPF Type Format. It's derived from the DWARF debug symbols that are already used for normal C programs, but it is a way more compact version, which is only really concerned with type information and not with where and at which moment a variable lives. And this is used because eBPF by itself is just too limiting, and we want to do more, especially in the verifier. So we have, for example, features like spinlocks, which should only be used on maps that have spinlock values in them. Or we have callback functions, so we can define these BPF functions but, instead of calling them, give them to a helper function. But this helper then needs to know that the callback has the correct number of arguments and the correct types. All of this type information we can give to the kernel. And that's why, especially if you want to use these new fancy features, it's important to use the BTF information. It also allows for flexible map arguments. So for example, if I go back, we have the definition, and one of the things you'll notice is that we have pinning as an attribute here. But you will not find it in the syscall attributes. This is purely something that we communicate to the loader library, and not just to libbpf, although that's the name it currently carries. And we can do a lot of different cool things with that. It also provides debug information for us.
So if we look at loaded programs, they will be annotated with the line information and which file they came from. And perhaps one of the coolest features is Compile Once, Run Everywhere, which allows the loader and/or the kernel to modify our program slightly, so it will run on multiple versions of the kernel, even if the internals have changed. If we dump the BTF from our example program, it looks like this. Features to note are the numbers on the left in square brackets. Those are the type IDs. Beside each is the actual kind of type. So we have pointers, integers, arrays. You can basically represent every C type in BTF this way. There's an optional name, and then there's a lot of information about the specific type. And they refer to each other, so you'll notice a lot of "type ID is something else". You can also visualize it by nesting it. I've done this manually, by the way; there's no command for this, but this is how you can do it yourself. So we have, for example, a .maps section with a flow map in it. And you can see that we have the type, the key, the value. And we have this very detailed description of exactly how it's structured, at which offsets which things live, and names for them, which are used to check all of these things. The loader will also use this to infer the actual value and key sizes to give to the kernel. This BTF lives in the .BTF section, and it's structured sort of like this. We have this header, then types, and a lot of strings. And each type starts with the same three fields. We have a name offset, so an offset into the strings. We have info, and a size or type, depending on what the info says. This translates into the name and the kind of the BTF type, and then the last part is specific to that kind. So an encoding for ints, or a list of fields for a structure, et cetera. We also have .BTF.ext, the extended version of it.
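The header, types, strings layout described above can be walked with a short parser. This is a sketch under stated assumptions: it handles only two kinds (`INT` and `PTR`), assumes little-endian data, and runs on a tiny blob built by hand rather than on a real `.BTF` section. The field layout (a 24-byte header with magic `0xEB9F`, then 12-byte type records with the kind in bits 24 to 28 of `info`) follows the kernel's BTF documentation; the function name `parse_btf` is invented for this example.

```python
import struct

BTF_MAGIC = 0xEB9F
BTF_KIND_INT, BTF_KIND_PTR = 1, 2
KIND_NAMES = {BTF_KIND_INT: "INT", BTF_KIND_PTR: "PTR"}

def parse_btf(blob):
    """Return (type_id, kind, name, size_or_type) for each type in a tiny BTF blob."""
    magic, _version, _flags, hdr_len, type_off, type_len, str_off, str_len = \
        struct.unpack_from("<HBBIIIII", blob, 0)
    if magic != BTF_MAGIC:
        raise ValueError("bad BTF magic")
    # Section offsets are relative to the end of the header.
    strings = blob[hdr_len + str_off: hdr_len + str_off + str_len]
    pos, end = hdr_len + type_off, hdr_len + type_off + type_len
    types, type_id = [], 1                  # ID 0 is reserved for void
    while pos < end:
        name_off, info, size_or_type = struct.unpack_from("<III", blob, pos)
        pos += 12
        kind = (info >> 24) & 0x1F
        if kind == BTF_KIND_INT:
            pos += 4                        # kind-specific part: the int encoding word
        elif kind != BTF_KIND_PTR:          # PTR has no kind-specific part
            raise NotImplementedError("kind %d is outside this sketch" % kind)
        name = strings[name_off:strings.index(b"\0", name_off)].decode()
        types.append((type_id, KIND_NAMES[kind], name, size_or_type))
        type_id += 1
    return types

# Hand-built blob: [1] INT 'int' size=4, [2] PTR to type 1.
string_section = b"\0int\0"
type_section = (
    struct.pack("<III", 1, BTF_KIND_INT << 24, 4) + struct.pack("<I", (1 << 24) | 32)
    + struct.pack("<III", 0, BTF_KIND_PTR << 24, 1)
)
header = struct.pack("<HBBIIIII", BTF_MAGIC, 1, 0, 24,
                     0, len(type_section), len(type_section), len(string_section))
parsed = parse_btf(header + type_section + string_section)
```

For a pointer, the third common field is the ID of the pointed-to type rather than a size, which is the "size or type, depending on what the info says" mentioned above.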
And this contains function information, line information, and optionally CO-RE relocations. The line information contains a bunch of lines, so it will annotate this instruction as part of this line of your original source program, and the function information labels every one of the BTF functions that you have defined. Loading the BTF itself is quite simple. You use the BTF load command of the BPF syscall and give it the blob that we have. It needs to be slightly changed, especially for the data section type, but it would take more detail to explain exactly why. You also give it a bunch of logging information. Once you have it, we get a file descriptor of the BTF object, and of course we have all of these type IDs. So when we are loading our map, there are these fields where you can say: this is my BTF object, which contains all of my types, this is the type of my key, and this is the type of my value. That's how we wire everything together. The same goes for programs. We give it the program and the BTF the program uses, and then we give it these func info and line info blobs, which will make sure that everything is nice and annotated in the kernel. So we end up with a sort of hierarchy that looks like this. We start by loading the BTF. We can then load our maps, which use it. And then, once we have our map file descriptors, we can load our programs, after we have of course assembled all of the pieces of our program. And all of that can happen within one call to a loader library. Now for the last part, CO-RE, which I touched on a little bit earlier: like I said, Compile Once, Run Everywhere. There's this really good blog post, which I recommend to everyone who wants to use the feature, which contains information on how to actually use it. But what it boils down to is that in libbpf there are these macros to make your life easier, and they boil down to a bunch of compiler built-ins.
And they're basically questions to ask the loader, or the kernel, just before or while loading the program. Like: what is the offset of this field? Does this type even exist? Do I have this enum value? I have this small program that captures the cookie value of a socket when it closes. Not useful at all, but it does help us illustrate the point. When this macro resolves, it looks like this, and the important part to notice here is that we do a helper call, and where the arrow starts, we have the socket pointer, and we add an offset, which we get from this built-in function. This offset then gets encoded in the 104 that we see here. This is the offset that we add to the pointer in the actual code. But the compiler will also emit this relocation, which tells us that this might be a piece of the code that we want to tweak if the structure changes. If we again look at this relocation: unfortunately, as far as I'm aware, there is no good command line tool to visualize or decode this, so I decoded one manually. It looks like this. It says, okay, instruction number two, which is the instruction that we were at, refers to type ID 18, and it has this accessor string. This accessor string is a bunch of numbers, which are basically the indexes of the fields that it tries to access. So the socket, then the second field would be sk_common, and then cookie, and so forth. Now, this type information that we knew when we created the program is included in the .BTF section. But the kernel also has BTF for all of the types it has. So we can do a comparison and see that, for example, a field changed position, or that we can't find a certain field.
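The field-offset relocation described above can be modelled as a lookup by name in two type descriptions. This is a deliberately small sketch: the struct layouts in `LOCAL_BTF` and `TARGET_BTF` are invented for illustration (they are not real kernel layouts), type information is reduced to plain dictionaries, and only the "byte offset of a field" question is handled. The function names are hypothetical; only the 104 comes from the talk.

```python
import struct

# Invented layouts: struct name -> list of (field name, byte offset, nested struct).
LOCAL_BTF = {      # what the compiler saw when the program was built
    "sock": [("pad", 0, None), ("sk_common", 8, "sock_common")],
    "sock_common": [("a", 0, None), ("b", 8, None), ("cookie", 96, None)],
}
TARGET_BTF = {     # what the running kernel reports
    "sock": [("sk_common", 0, "sock_common"), ("pad", 200, None)],
    "sock_common": [("a", 0, None), ("extra", 8, None), ("b", 16, None), ("cookie", 112, None)],
}

def field_path(btf, root, accessor):
    """Turn an accessor string such as '0:1:2' into a list of field names."""
    indexes = [int(part) for part in accessor.split(":")]
    if indexes[0] != 0:
        raise ValueError("this sketch only handles element 0 of the root pointer")
    names, current = [], root
    for index in indexes[1:]:
        name, _offset, nested = btf[current][index]
        names.append(name)
        current = nested
    return names

def offset_of(btf, root, names):
    """Sum the byte offsets along a path of field names, matching by name."""
    total, current = 0, root
    for wanted in names:
        for name, offset, nested in btf[current]:
            if name == wanted:
                total, current = total + offset, nested
                break
        else:
            raise LookupError("field %r not found in %r" % (wanted, current))
    return total

def relocate(raw_insn, root, accessor, local, target):
    """Rewrite the immediate of one instruction from the local to the target offset."""
    opcode, regs, off, imm = struct.unpack("<BBhi", raw_insn)
    names = field_path(local, root, accessor)
    if imm != offset_of(local, root, names):
        raise ValueError("instruction does not carry the expected local offset")
    return struct.pack("<BBhi", opcode, regs, off, offset_of(target, root, names))

add_insn = struct.pack("<BBhi", 0x07, 1, 0, 104)     # r1 += 104
patched = relocate(add_insn, "sock", "0:1:2", LOCAL_BTF, TARGET_BTF)
```

Matching by name rather than by index is what lets the program survive a kernel in which fields were added or reordered; a missing field surfaces as a lookup error instead of a silently wrong offset.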
And our loader can do this: it can resolve this, see the difference, and then patch our code, changing this offset value right before we actually load it, which makes it possible to use the program on so many different kernel versions. I'm out of time. That's everything I can offer you for now. Are there any questions? And thank you. Thank you. Any questions? There's one in the back. All right, okay. It's difficult now. Can you pass this on? Hey, thanks for the great talk. So I haven't dealt that much with BPF, but since we have those binaries that we cannot really launch, because we have to load them from another ELF, right, at least as I understand it: would it make any sense to make either a loader that would just work out of the box for those binaries, or to use the binfmt_misc feature of the kernel to be able to load those BPF ELF files through some kind of generic interface and just load them and run them? Yeah, I think it does make sense to some extent. For example, the ip tool doesn't do anything additional; it takes this ELF and just loads it as best it can. And there is probably some way to use the interpreter in the ELF itself, just like we do for dynamically loaded executables. As far as I know, no one has tried it so far, but I think it could work at least for a limited use case, where you would only load something and pin it and then allow some other application to actually work with it afterwards. Thank you. All right, thanks. We are out of time. If you have more questions, you can find Dylan in the hallway. And yeah, thanks again. |
Hacking the Linux Kernel to get moar FPS |
So the next talk is by Andre, on hacking the Linux kernel to get more FPS. Thank you. Can you hear me? Hello. Yes. Hello. Hi everyone. I'm a kernel developer from Brazil, and I work for Igalia, the open source consultancy. Does anyone here play on Linux? Okay. Wow. Cool. I hope it's been great. This talk is not very, very technical. I just collected some work that has been done by a ton of people to make gaming on Linux better on the kernel side. So as you probably know, the Linux kernel doesn't really have a roadmap, like, oh, we need 10 new file systems by next year, or anything of that kind. It is all driven by use cases. I mean, if you don't have a real use case, it will be very hard to get your code into the kernel. So it's all about new use cases. For instance, some years ago we had Android, which pushed a lot of new things into the kernel, like in DRM, and then containers, which helped us grow the cgroups things. And then the cloud, which messed a little bit with the file system stack. In the past, before Proton and this kind of stuff, playing games on Linux was not that easy. We had a lot of native things, but it was really on and off, and glibc has some APIs that are not that stable in the long term. And to play on Wine, well, Wine wasn't so stable either back then. We had some native ports along the way. The Source engine was one of these native ports, and there is one very interesting example of how hard the native version is to get right: BioShock Infinite runs very, very badly natively, but if you run the Windows version through Proton, it runs great. So it was on and off, and there wasn't a very big financial interest in gaming on Linux until things changed. Proton was announced a few years ago. It's a big project from Valve to be able to run Windows games on Linux as well as possible. Valve has been paying a lot of community developers and consultancies like Igalia to enhance Linux gaming across the whole stack, from Wine and Mesa to the kernel.
And after that things really started speeding up, and now we have the Steam Deck, and we can see what all this effort was about. Now we have the big picture of why they are pushing so hard for Linux gaming. This is from the website Boiling Steam, and it is from two years ago, so it's not really up to date, but you can see the numbers: the red one is the games reported on the Proton database, and the blue one is games that are running very nicely. So you can see that over time we are really increasing the number of games that we can run on Linux. And this is the Linux market share among Steam users, and you can see that it's really, really small, but you can see that it's getting bigger, well, it's getting bigger all the time. If this line goes on to infinity, you get all of the market one day. Okay, so now about the kernel features that have appeared just because people decided to play games on Linux. The first one is a very dramatic one. I don't know why people hate it so much, but you can now have a case-insensitive folder on your Linux file system, and people were very mad about that, but yeah, it's optional, so it doesn't matter if you don't want to use it. To achieve that, we needed to create a Unicode subsystem in the kernel, so now in the kernel we have all the fun emojis, et cetera. And this is one of the things that I like about Linux kernel development: this was developed for the Linux gaming use case, but then I think the Google people were like, hey, this is very cool, and then they added support to F2FS for Android, and so every part of the community can benefit from each other's efforts. So yeah, now we have case-insensitive Linux due to games, and this is, of course, because NTFS is a case-insensitive file system, and it's very troublesome to do that, to do the case-insensitive file path lookup, from user space.
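To make the cost of the user-space approach concrete, here is a minimal sketch of a case-insensitive path lookup done entirely in user space. It is an illustration, not code from Wine or Proton: the function name `resolve_case_insensitive` is invented, and Python's `str.casefold()` only approximates the Unicode rules a kernel implementation would apply. The point to notice is that every path component requires listing a whole directory.

```python
import os
import tempfile

def resolve_case_insensitive(root, relative_path):
    """Walk relative_path below root, matching each component without regard to case."""
    current = root
    for component in relative_path.split("/"):
        wanted = component.casefold()
        # One full directory scan per path component: this is the user-space cost.
        matches = sorted(e for e in os.listdir(current) if e.casefold() == wanted)
        if not matches:
            raise FileNotFoundError(relative_path)
        current = os.path.join(current, matches[0])
    return current

# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Data", "Textures"))
    open(os.path.join(root, "Data", "Textures", "Wall.PNG"), "w").close()
    resolved = resolve_case_insensitive(root, "data/TEXTURES/wall.png")
    found = os.path.relpath(resolved, root).split(os.sep)
```

Besides the repeated scans, a user-space version like this is also racy, since the directory can change between the listing and the open; that is part of why doing the lookup inside the file system is attractive.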
If you need to emulate the case-insensitive thing in user space, it's very hard because you need to try all sorts of combinations, but on the kernel side it's very easy to do. You kind of abstract all of that away from user space. Futex: futex is what I'm most known for, it is the work that I was involved with. Futex is something that is exposed by the kernel so user space can create mutexes, semaphores, barriers, all kinds of cool synchronization primitives. On the Windows side you have something similar, you have the sync API from the Windows kernel, and then you have this Windows function called WaitForMultipleObjects, which for some reason games really like to call, they really rely on it. On Linux it was not that easy to emulate. We tried with eventfd, but eventfd doesn't scale so well if you have so many waiters, so we moved to futex, and then after some years I finally managed to get it right, and it was merged. So nowadays you can wait on multiple futexes on Linux. It was created for gaming, but I know that some distributed systems and databases also want to have this operation. But yeah, I still need to expose that through pthreads. And the futex effort kind of created the futex2 project, because I was there on the mailing list saying, hey, hey, I need a new futex operation, and people were like, okay, but you need to solve all the other futex stuff going on. So I spent some time collecting why people were so disappointed with futex, and now we know what we need to improve in futex, and I work on the futex2 thing to have a lot of cool futex operations.
Syscall User Dispatch is a Linux kernel feature that was also created for gaming. Usually, when you are developing a Windows game and you want to call a syscall, you just use the wrapper, but some games, because of the DRM thing, used to call the syscall directly using the x86 instruction. But of course on Linux that syscall number didn't match the Windows one, and it was very hard for Wine to deal with that. So basically, nowadays you can select a memory region and say that every time a syscall is made from there, it will not go directly down the syscall path; it will call back into another handler, another backend, to see whether that syscall number really should be issued. So yeah, it calls a syscall but gets back to user space, something like that. GPU drivers: on DRM we are working hard to make AMDGPU better. In the past months, after the Steam Deck release, AMDGPU was exposed to all sorts of gamers and use cases, and this has been generating a lot of bug reports, and we are trying to fix them. And also, as I said, this is kind of pushing the limits of the driver and the hardware. We are working on new DRM features like async page flip in the atomic API, and also working on better GPU reset handling, because nowadays if your AMD GPU resets, you kind of need to press the button because it won't work again. Also we are trying to get HDR on Linux, and also to support 3D LUTs in DRM. Also in this kind of error handling area, we are trying to have nice feedback for the user when the kernel crashes, kind of like the Windows blue screen, with a link to, you know, figure out what is going on. Also we have enabled pstore and kdump on the Steam Deck, so you can have the last dmesg in a safe place to check what went wrong, and if everything goes right, you can submit that to, I don't know, the Steam servers, so they can have a look and help you figure out what is going on.
Work to enable a lot of drivers for the Steam Deck, and some work on joysticks to have a standard for how joysticks expose features to user space. And, well, there are a lot of smaller things, smaller projects, like the split lock detector handling. Basically, on x86 you have this thing called a split lock, where you do an atomic operation on memory that crosses a cache line, but it seems that you shouldn't do that, and if you do that nowadays, the kernel will penalize you and make your code run very slowly. And of course games were doing that, so we kind of added a knob to the kernel so you can turn this off and play your games. HID had a bottleneck. I mean, it was okay, but given that a lot of people started using VR, and VR has a lot of HID devices, we discovered that it had a bottleneck and then we fixed that. And also some semantics on Unix sockets, on timestamps, on the time counter, because Windows and Linux behave very differently in the timekeeping thing. And yeah, a lot of documentation that we are trying to improve across the Linux kernel. Out of tree: this is very interesting because a lot of people, in their free time, try to hack the Linux kernel to play games faster, and some people develop task schedulers. On Linux, as you may know, we have CFS, but people have cool ideas of how the task scheduler could be better for the desktop use case, to reduce latency, et cetera. Some of these projects are not very committed to getting upstream, so yeah, they use their creativity and try a lot of different ideas. Another interesting thing is that there are some projects out there, like the Zen kernel, the XanMod kernel, and the Liquorix kernel, that are basically a bunch of unofficial kernel releases made by the community to have a better Linux gaming kernel. And it's very fun, because they grab a lot of out-of-tree patches, they grab work-in-progress patches, put them all together, and release it. It's a
very experimental kernel, of course, it has some bugs, it has some problems, but I think it's cool to try them out to see if your games run better on those kernels. And, well, I always check those kernels to see what they come with, to see if there are cool ideas going on there. For the future, I think we are going to try to enhance power management, so handheld devices can have better battery life. And there are so many layers of GPU abstraction nowadays, with all the translation, and I think that eventually the bottleneck will be in DRM, and we will need to support that huge stack better. And here, at the end, I have some lists of the patches that I mentioned, so you can have a look. And I think that's it, thank you very much. Thank you. Time for questions, please raise your hand. No questions? I have a question: for the task scheduler, did you look into the upstream development that is going on right now, where you can specify schedulers through eBPF, for example? Oh yeah, I have heard about that, but I don't know if people have tried to replicate those schedulers using eBPF. But yeah, we will have a look at that. It might be interesting, yeah. Cool, thanks a lot. Oh, sorry, sorry, sorry, one question. Thank you. I had a question about how hard it is to introduce new stuff into the kernel that only you need. Like you told us, some things were just used by you, for gaming, so it's pretty new, only you use it. How hard is it? Is it easy? It depends. If you mess with a bunch of code, if you decrease the performance of something on the server side, people will not be so happy about that. But if you don't mess with things that already exist, people will be okay with that. Thank you. |
Don't blame devres - devm_kzalloc() is not harmful
Use-after-free bugs in drivers and what to do about them. |
Hi, welcome to my talk. My name is Bartosz Golaszewski. I work for Linaro. Today I wanted to talk about an issue that has been bothering me for some time, namely a certain family of use-after-free bugs in the kernel that has, for a long time, been blamed on devres, and that I decided to investigate a bit, only to find out that reality is often disappointing, everything is broken, and the problem is actually much worse than if it were devres' fault. So yeah, without further ado, let's get stuck in, because time is short. Your typical probe function in a device driver would look something like this without devres. You allocate some resource. If it fails, you bail out. You allocate a second resource. If it fails, you free the previous one, bail out, and so on and so forth. So for every next resource, if it fails, you need to free the previous ones. Alternatively, you would do something like this. You would have labels at the end of the function, and when you can't acquire one of the resources, you jump to the right label and free every previously allocated resource in reverse order. And then you need your remove function, which would also drop these resources in reverse order. But you can use devres, and in this case it looks much better. You allocate a resource. If it fails, you bail out. You allocate a second resource. If it fails, you bail out immediately. You don't have to free those resources yourself. When a driver is detached, they will be dropped automatically in reverse order, and with that you no longer need your remove function. So that is pretty sweet. But if you send enough patches to different subsystems, you will notice that certain maintainers show an aversion to devres. You would get comments like: oh no, keep devres out of my subsystem, or: don't you know, using these interfaces leads to crashes.
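The managed-resource pattern described above can be modelled outside the kernel in a few lines. This is a toy model, not kernel code: the `Device` class, `devm_acquire`, `bind` and `unbind` are invented names that only mimic the behaviour described in the talk, namely that each acquired resource is recorded on the device and everything is released in reverse order when probing fails or the driver is detached.

```python
class Device:
    """Toy stand-in for a device that tracks managed resources."""
    def __init__(self):
        self._managed = []      # resources in acquisition order
        self.log = []

    def devm_acquire(self, name, fail=False):
        if fail:
            raise RuntimeError("could not acquire " + name)
        self.log.append("acquire " + name)
        self._managed.append(name)

    def release_all(self):
        # Drop resources in reverse order of acquisition.
        while self._managed:
            self.log.append("release " + self._managed.pop())

def probe(dev, fail_on=None):
    """The driver's probe: no cleanup code and no error labels."""
    for name in ("memory", "irq", "clock"):
        dev.devm_acquire(name, fail=(name == fail_on))

def bind(dev, fail_on=None):
    """What the driver core does around probe in this model."""
    try:
        probe(dev, fail_on)
        return True
    except RuntimeError:
        dev.release_all()       # failed probe: unwind whatever was acquired
        return False

def unbind(dev):
    dev.release_all()           # driver detach: no remove function needed

failing = Device()
bind(failing, fail_on="clock")

working = Device()
bind(working)
unbind(working)
```

The model also shows where the trouble discussed later comes from: `release_all` runs at detach time regardless of whether anything else still holds a reference to what was released.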
But you would probably not get a lot of explanation of what the problem is. And so last year I stumbled upon a talk by Laurent Pinchart from last year's Linux Plumbers Conference. The link is below, so you can watch it. And yes, I watched that, and then I started browsing lore and noticed that Laurent had actually sent an email about this already in 2015, saying that if you open a device node in user space, unbind the driver that exported it, and then try to do something with that open file descriptor, call one of the system calls, you will get a nice stack trace in your kernel log. So I thought, okay, that's interesting. I read through the discussion. I found out that there had been several such discussions previously, of which I had not been aware. I don't want to go into much detail, but the gist of it is that there are certain users in the kernel, mostly device drivers, or only device drivers, that allocate memory using devres, like devm_kzalloc() in this example, memory that is dropped in remove but is actually still referenced elsewhere, which leads to use-after-free bugs. And the first thing that popped into my mind was: how is that different from dropping these resources by hand in remove? And I wasn't the only one, because many people asked about it. One of the responses went something like this: it's the same thing, but the problem is that people don't understand the lifetime rules and expect magic interfaces to fix it for them. While this is certainly true, I think that driver developers should not be expected to understand every detail of how kernel subsystems work, because they should actually care more about making the device work. And I would argue that you should actually expect magic interfaces in drivers to fix stuff for you. And if it's not straightforward, then it's a problem of the subsystem or of the given interface and not of the device driver developers.
So I decided to investigate and see if maybe something had changed. I maintain the GPIO subsystem, so this was the first one that I tried. I had a serial-to-USB converter cable plugged in that also has four GPIO pins. I opened the GPIO device, unplugged the cable, tried to read the value of a pin and, sure enough, it crashed: null pointer dereference. I was like, okay, I did not expect that, but it's certainly interesting. So I tried a different subsystem, I2C. Surprisingly, it didn't crash. When I tried to unbind the driver, it just hung, froze and did nothing. Okay, so that's interesting too. But I realized that when I unplugged the cable, I had my picocom console open, and it didn't crash. It just exited. So UART is certainly doing something right that other subsystems are not. So I decided to investigate a bit. In the GPIO subsystem, we have two types of structures. One that the drivers see: this is the GPIO chip. And a second one that the GPIO providers don't touch and don't even have access to, which is the GPIO device. This will be important in a second. I looked at the crash that I had in GPIO, and it turned out to be a null pointer dereference at a certain line, the one that you can see above. At the moment when I still had my device open and tried to call one of the system calls, it turned out that the chip that you can see above, reached by dereferencing gdev, was NULL; it was already gone. So this has nothing to do with devres. We simply had a bug in the GPIO character device code. When the driver is going away, it calls gpiochip_remove, which can be called from devres or from your remove function. There we set the chip to NULL because the driver is gone, but the struct device can still be referenced elsewhere.
So we just set it to NULL, which is the correct thing to do, but it still needs to be checked in the character device code. This is what we were not doing, and it clearly needs fixing. There is also the question of a race condition in the system call callbacks: even if we do check the pointer at the beginning, we still need some locking, because otherwise the driver can be removed while we are still executing the system call. I looked at that and thought, okay, this is easy enough to fix, and it's definitely not linked to the errors that the discussion in the email thread was about. So I went over to I2C to see what's going on there. Why can't I unbind the driver for as long as I keep the device open? It turned out that there is this strange completion, with a comment about it, which makes it so that when the driver is trying to delete the I2C adapter, it waits for as long as there are still references to the underlying struct device. Okay. Freezing when you're trying to unbind the driver is definitely not the correct way, but at least it doesn't crash, and there was a purpose to doing that. So I thought, okay, why does UART work? I looked at the UART code and figured out that UART actually does a smart thing. It has a split similar to GPIO's, where you have a structure that lives for as long as the struct device lives and a separate structure that is allocated by the driver. When you go into any of the system call callbacks in the kernel, it first checks if the driver is still there. If it is, then it locks it so that it cannot go away from under an executing system call. Okay. So this definitely makes sense.
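The "check that the driver is still there, then hold a lock for the duration of the call" pattern described here can be sketched in user space with a mutex. Everything below is an assumption made for illustration: `subsys_dev`, `drv_ops`, `subsys_register`, `subsys_unregister` and `subsys_get_value` are hypothetical names, and this is not the serial core's or the GPIO subsystem's actual code, which use kernel locking primitives.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* All names are hypothetical; this is a user-space model, not kernel code. */
struct drv_ops {
    int (*get_value)(void *drv_data);
};

/* Subsystem-owned object: outlives the driver while files are still open. */
struct subsys_dev {
    pthread_mutex_t lock;      /* protects ops and drv_data */
    const struct drv_ops *ops; /* NULL once the driver has been removed */
    void *drv_data;            /* driver-owned memory, freed after removal */
};

static void subsys_register(struct subsys_dev *sd, const struct drv_ops *ops,
                            void *drv_data)
{
    pthread_mutex_init(&sd->lock, NULL);
    sd->ops = ops;
    sd->drv_data = drv_data;
}

/* Driver removal path: once this returns, no callback can touch drv_data. */
static void subsys_unregister(struct subsys_dev *sd)
{
    pthread_mutex_lock(&sd->lock);
    sd->ops = NULL;
    sd->drv_data = NULL;
    pthread_mutex_unlock(&sd->lock);
}

/* System call path: check for the driver and keep the lock held during the call. */
static int subsys_get_value(struct subsys_dev *sd)
{
    int ret;

    pthread_mutex_lock(&sd->lock);
    if (!sd->ops)
        ret = -ENODEV; /* driver is gone: fail cleanly instead of crashing */
    else
        ret = sd->ops->get_value(sd->drv_data);
    pthread_mutex_unlock(&sd->lock);
    return ret;
}

/* A toy driver whose private data is a single int. */
static int demo_get_value(void *drv_data)
{
    return *(int *)drv_data;
}

static const struct drv_ops demo_ops = { .get_value = demo_get_value };
```

Because the removal path takes the same lock, it cannot clear the pointers while a call is in flight, and a call that starts after removal sees NULL and returns -ENODEV. Checking the pointer under a lock that is dropped before the driver is actually called would leave the race the speaker goes on to describe.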
I also noticed that spidev works fine, but upon further inspection of the code, it turns out that it also suffers from a race condition. When spidev->spi is checked, the spinlock is only taken for the duration of the check, that is, for reading the pointer, but the underlying driver can still go away later, once the lock has been released. So this is a concurrency issue. So I started thinking that there is some misconception about devres, and I decided to fix some things. I started with GPIO. I sent some patches, and they went into mainline. In this case, they did seem to fix the issues: the kernel would no longer crash when user space unbinds the driver and then uses the character device. I also sent some fixes for spidev. And then I sent a fix for I2C, and this is when things went sideways, actually. I removed that completion, I added locking, I started fiddling with the character device, and I was proud of myself, because I thought that I had fixed a problem that had existed in I2C for a long time. Wolfram, the maintainer of I2C, took that patch, reviewed it and said it passed all his stress testing, but he had a gut feeling that something was wrong. After a couple of days, he sent me an email saying: okay, I found this discussion from a couple of years ago where this was explained in detail. So what turned out to be the case with I2C? It turns out that I2C is a subsystem where drivers allocate struct i2c_adapter, which embeds struct device. And in probe they allocate that structure as part of a usually bigger structure that contains driver-specific fields.
And then, in the remove callback, they drop that memory, but it contains struct device, which is reference counted, unlike the structure that embeds it. This is why this whole waiting for completion is there in I2C: you must not free the memory containing struct device for as long as there are references to that struct device. And I noticed that this is not the only subsystem that does it. Every driver subsystem does things a bit differently. Some of them have that split, some of them don't. For instance, SPI has the same problem as I2C, but unlike I2C, it doesn't expect the driver to allocate that data as part of the driver data. It expects it to be allocated separately and handed over to the SPI subsystem, which could probably use some improvement, but at least it doesn't crash and doesn't require the same kind of synchronous waiting for all the references to struct device to be dropped. So actually this talk should be called "Don't let drivers allocate and control the lifetime of struct device", because this is basically the culprit. We have those subsystems where drivers allocate struct device, and it is still referenced elsewhere. Then we drop this memory, so the struct device no longer exists, but it's still referenced somewhere. And then the driver model tries to call, for instance, the release callback of the device, and there's nothing there. So we have those crashes. I didn't look at all the subsystems, clearly, as there are too many, but I noticed that certain parts of the kernel get it right. GPIO, now with those fixes, is fine. UART is fine. Watchdog is fine.
They have this split where the struct device is allocated and managed by the subsystem, while the driver data does not contain the struct device. I'm not talking about the struct device that is passed to probe. I'm talking about the struct device that is allocated by the drivers or the respective subsystems for those underlying devices. So, let's say, we have a platform device for a GPIO chip, and then we allocate a struct device for every bank. That's just an example, and many subsystems do that too. So there's this problem, but there are also other problems. Even those subsystems that get this part right often suffer from concurrency issues, because there is no locking in the system call callbacks in the kernel. Even if they do check whether the driver is still there, or whether the device is still attached to the driver, they often don't lock that state. So it's possible that the driver will go away while they're still executing and dereferencing that pointer. This was the case in SPI, for instance. And I think that the issue is really about the logical scope of objects. Not scope understood as the scope of a variable in the C programming language, but the logical scope of objects: if something is allocated in probe, when you first attach the device to the driver, it should only need to exist for as long as the driver is attached, and as soon as the driver goes away, it should be freed, be it with devres or in remove. So there is this problem with many subsystems that don't follow this: they let drivers allocate some data in probe and then hand it over to the subsystem, or even do it implicitly, which leads to all these errors.
I know that Laurent's area of interest is probably media and DRM, so I skimmed through the user-space device node code in those subsystems and noticed that they seem to be getting part of it right, but there are concurrency issues as well. And the problem with DRM is that the character device handling code is not centralized within the subsystem, meaning that we have many different struct file_operations with different implementations of the callbacks. Correct me afterwards if I'm wrong. I'm not an expert on DRM, but it seemed that way just from looking at it. So what about devres? Is it safe? I have found no evidence that it isn't. If something can be freed in remove, it can be freed with devres, because devres will do just that: as soon as the driver gets detached from the device, it will free all resources allocated with devres in reverse order. There are some issues. For example, devm_krealloc could use some semantic clarification, because as it is right now, it's not clear whether calling devm_krealloc changes the order in which those resources are released. In any case, it is my strong belief that devres makes code much more readable and safer, and that it should be encouraged and not discouraged, but it has a limited scope. On that point, how can we supplement it? A certain semblance of automated resource management in C, RAII if you will, would be useful in the kernel. The first thing that comes to mind would be using Rust. With Rust, the situations that I described would clearly never be allowed to happen. But for now, we're still coding in C.
So I was thinking: if you've ever coded with a user-space library like GLib, which is, I believe, a gold standard of C programming in user space, you would notice that they make a lot of use of cleanup attributes. This is something that GCC and Clang support, and I haven't seen it in the kernel, and I'm wondering why, because if we used reference counting in conjunction with cleanup, we could actually make it. If I can interject for one second: at least not in core kernel code, but Peter Zijlstra used this somewhere in the kernel source tree, at least. And I had proposed something like this, at least in person to a couple of people before, because I want to make use of this as well. So I would be very happy if this happened. Yeah. So I have a small example. If you are not familiar with the cleanup attribute, it allows you to specify a cleanup function for a variable. When the variable goes out of scope, that function is called, so it's basically like a destructor in C++. Alone, it's useful within the scope of a single code block or a single function. Well, this is just another example of how to use it. But in conjunction with reference counting, it's quite useful. You can see the foo_create and bar functions on the right. You allocate an automatically reference-counted resource. You do something with it. If you bail out, the reference count goes to zero and it's freed. But if you return it while also increasing the reference count, and then grab it in another function, bar in this case, then as soon as the foo_create function returns, the reference count is decreased, but it's already two.
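The slide itself is not part of the transcript, so the following is only a reconstruction of the idea under stated assumptions. The names `foo_create` and `bar` come from the talk; `struct foo`, `foo_new`, `foo_ref`, `foo_unref`, `foo_unrefp`, `auto_foo` and the `foo_freed` counter are hypothetical helpers invented for this sketch. It relies on the `cleanup` variable attribute supported by GCC and Clang.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object; not taken from the speaker's slide. */
struct foo {
    int refcount;
    int value;
};

static int foo_freed; /* counts frees so the behaviour can be observed */

static struct foo *foo_new(int value)
{
    struct foo *f = malloc(sizeof(*f));

    if (f) {
        f->refcount = 1;
        f->value = value;
    }
    return f;
}

static struct foo *foo_ref(struct foo *f)
{
    f->refcount++;
    return f;
}

static void foo_unref(struct foo *f)
{
    if (f && --f->refcount == 0) {
        foo_freed++;
        free(f);
    }
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void foo_unrefp(struct foo **fp)
{
    foo_unref(*fp);
}

/* Declares a struct foo pointer that is unreferenced when it leaves scope. */
#define auto_foo __attribute__((cleanup(foo_unrefp))) struct foo *

static struct foo *foo_create(int value)
{
    auto_foo f = foo_new(value);

    if (!f)
        return NULL;
    if (value < 0)
        return NULL;   /* bail out: the count drops from 1 to 0 and f is freed */
    return foo_ref(f); /* count becomes 2 here, then cleanup drops it back to 1 */
}

static int bar(int value)
{
    auto_foo f = foo_create(value);

    if (!f)
        return -1;
    return f->value; /* value is read first, then the last reference is dropped */
}
```

Neither function contains an explicit free or an error label; each exit path is covered by the scope-exit handler. Calling `bar(7)` returns 7 and frees the object once, and `bar(-1)` returns -1 after `foo_create` has already freed the object on its bail-out path.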
So it goes down to one, and without having to free those resources manually, we keep track of the references just by using reference counting and cleanup. This is what GLib does a lot, and it works quite nicely. It makes programming in C in user space much easier. And what to do about the offending subsystems? It's a case-by-case issue, because every subsystem does it differently. I spoke to Wolfram a bit about what we can do in I2C, and it would be very hard, because you would have to do a sweeping change across the entire subsystem to make drivers not allocate, not the I2C adapter, but rather the underlying struct device on their own, and instead let the I2C subsystem handle it. Yes, I just wanted to bring that up, and that's it. I'm right on time. All right, questions? Thank you. I like your second slide. I felt that you were saying I was not completely wrong with my talk, so thank you for that. I was a bit worried. A few comments. It's actually multiple problems that you're trying to solve here. One is the race condition between ioctls, or user-space access in general, and the remove function. For everything that's based on character devices, there's a patch from Dan Williams that was posted in 2021. I don't know if you've seen that one. It attempts to fix it at the cdev level, which I think is the right place to do it, instead of duplicating the same fix in all the subsystems. It has been positively reviewed but not merged. I think there was just one review comment that said that debugfs and procfs had similar constructs, so he was asked to refactor the code to reuse the same implementation instead of duplicating it. But that should be something we could upstream and solve. So that's one of them. I agree with you that there's nothing wrong with devres or the managed API in general. It was mostly devm_kzalloc in particular that I had trouble with.
Things like devm_ioremap, for instance, are perfectly fine, because you tie the lifetime management of the resource to the physical struct device, which is removed at the end of the remove function, and that is what you should do. So that's totally fine. The issues with the devm functions come when we tie, as you said, the lifetime of a resource to the wrong device. DRM has grown its own custom managed helpers based on devres. So not devm, but drmm. That does it relatively right: they tie the memory management to the cdev that's exposed to user space. But where it breaks is when you have one physical device that exposes multiple cdevs, because then you have a top-level data structure that covers all of those. So even if you allocate each of them dynamically, you need to make sure that the top-level structure is released only when nothing else can have a reference to it. So I think there will always be cases where reference counting is needed and the drivers have to handle that. But in many cases, helpers should be able to make it simpler. Also, I mentioned this on a slide but didn't say it out loud: you have some devres helpers that reach into other subsystems, which also have their own issues. So this can be dangerous as well. So I have seen, in multiple places, drivers allocating the struct device that then gets passed to the framework. The solution that they use is that, instead of freeing the device in remove, they just drop their reference to the device that has been allocated. So someone else takes ownership of that device, and you rely on its reference counting, and the struct device is freed only from the release callback, which is set by the driver itself. This is what SPI does. But I2C is a worse example, because you usually have driver data; inside it you have the I2C adapter; inside that you have struct device.
And this struct device has a release callback. But in the release callback you are in the subsystem, which cannot know what the outer structure is, where it is, or at what offset the device sits in it, so it cannot free it. So it sets the release callback of the struct device. It's another workaround, I guess. It seems to me that the right thing to do is to not let drivers allocate struct device and instead do it in the subsystem if you need to. This also hides some additional complexity from the driver developers. On the slide about the state of the current subsystems, you had a question mark after SPI. I wonder if you have doubts about whether it's fine today. Yeah, I think... no, this is the one, right? That one. This one? Yeah, there is a question mark on SPI. Yes, I put it there because I haven't investigated it in detail, so I wasn't sure if it works fine. It looks to be working fine, just from testing it. I wasn't sure if it's correctly implemented, let's say, because it sounds good in theory. All right, the last question. So for subsystems like I2C, has anybody calculated the amount of actual work needed to fix this? It looks like a lot, because I did send a proposal to Wolfram about wrapping every dereference of the I2C adapter's dev member in a helper, which would then allow us to change that struct device into a pointer instead of an embedded structure and allocate it inside the subsystem, but it's not going to be easy. This is why I dropped the patch. I told Wolfram not to pursue that, because it looks to me like it's not going to be that easy, and I don't really want to get my hands that dirty. So for somebody else, then? It's been like this for a long time, so I'm not sure if anyone's going to step up to try to fix that. Okay, thank you. All right, thanks. Thank you. |
Rethinking device support for the long-term |
All right, so that's the last session for today, from Nícolas: rethinking device support for the long term. Okay, so hi everyone, my name is Nícolas, and today I'm going to talk about how we can rethink device support for the long term. First of all, I work at Collabora, where I upstream kernel support for some Chromebooks and also improve the coverage of KernelCI by adding new tests. So first of all, why is upstream support relevant? There are many reasons, and most of you probably know them very well. For one, when you have good upstream support for a device, you can count on continuous updates, because it's easier for the OEMs that are developing the device: if they base their work on upstream, it's easier to continuously rebase and provide more frequent updates for the device. For the same reason, you have less of a vendor lock-in problem, because you don't need to rely on the downstream kernel for the device; you can just install the mainline kernel and be happy with it. On the OEM side, there's also a lower maintenance cost, so it's a benefit for them as well. In the end, you get a longer lifespan for the device, and that's what we've been working on with the upstream support for these Chromebooks. As we focus on upstream support, these devices stay on the market longer and get a longer life with more updates. But of course, we have new devices every year, so the demand isn't going to diminish. As I see it, as these devices live longer and need more updates, and as there are more devices out there, it's a problem of scale. That's why continuous integration is important: it's the way we're going to be able to automatically detect regressions and keep up with this demand.
It's also important to emphasize that enabling tests early in a device's life matters. There's a cost to enabling tests, and the earlier you do it, the better, because you then benefit from catching regressions over more of the device's lifetime. So that's why we need a CI, and in the case of the kernel, we have KernelCI. A little bit about KernelCI itself: the main instance is at linux.kernelci.org, and that one tests the upstream kernel, but there are also other instances, like the one at chromeos.kernelci.org, which runs the Tast tests, not only on the downstream ChromeOS kernel but also on the upstream kernel. The way KernelCI works is basically a pipeline. Several branches from several Git trees are monitored, and when a new revision is found, jobs are triggered that build the artifacts, meaning the kernel, the modules, the device trees, and the rootfs for the test. Once you have those, the test itself is queued to run on a device in a LAVA lab, with pointers to the artifacts. After the test runs on the device, the results are pushed back to the KernelCI dashboard, so they're available for anyone to check, and if any regressions are detected, they're reported to the KernelCI results mailing list. KernelCI is configured through several YAML files, and for most people that's the part you're most interested in. For instance, there is a build configuration, where you set the branches and trees that will be tracked, as well as the config fragments that will be used as part of the kernel build, compiler versions and whatnot. So, for instance, some maintainers have a for-kernelci branch.
They have a for-kernelci branch in their tree and register it in KernelCI, so that as they receive patches, they can merge them into that for-kernelci branch and have them run on KernelCI, validating the changes before they actually merge them into their main branches. There's also a lab configuration for the labs themselves, that is, the LAVA labs that have all these devices on racks running the tests. There are currently around 11 of them. You can also set filters there, to select which kinds of tests you do or don't want to run in your lab. There's a rootfs configuration, where you define which rootfs is going to be used for the test. You can add a custom rootfs there, which involves setting the base OS itself, so the Debian version and so on, the architecture, the packages that will be installed, and scripts, because you might want to do some extra tweaking before you run the tests, such as compiling something from source. There are also filesystem overlays, if you need some special file in your rootfs. And finally, you have the actual test configuration, where you define the tests themselves: the test plans, which rootfs the test needs, which LAVA job template to use (the actual file that gets submitted to LAVA to run the job), some parameters that might be needed, and the device types, which are the actual devices that the tests run on. There are currently around 208 device types in KernelCI. So it's simple for anyone interested in improving the coverage of KernelCI to add a new test or a new rootfs. If you have some devices of your own that you want to run tests on, you can add your own lab there. So, these are the tests that are currently available in KernelCI. The baseline tests are very interesting ones. They're simple tests, but they do a lot. Basically, they use the bootrr suite.
It has a generic part and a machine-specific part, and the point of this test is to make sure that the basics are there. For the generic part, you have things like checking that no devices are deferring probe, so that the machine is actually probing everything that it should. And for the machine-specific part, you can check whether all the devices and all the drivers that you expect to be there are actually there and up. We have lots of other tests too. There's kselftest, with multiple suites. There's LTP. You have decoder conformance tests that run Fluster, to verify that the output of a decoded frame from a hardware decoder matches what it should. There's IGT, v4l2-compliance, libcamera, one that I added for the ChromeOS embedded controller, a sleep test to exercise suspend and resume, and that kind of stuff. So that's about it for the basics of KernelCI that I wanted to share. During my work at Collabora, I was upstreaming the support for the Acer Chromebook CB514, which uses the MT8192 SoC from MediaTek and the Asurada baseboard. During the upstreaming for this machine, I obviously had to test all of its components, and if I found any issues, I would fix them and send the fixes upstream. But since the kernel is a moving target, I basically had to redo this manually on every rebase. I did detect several issues during the upstreaming process, and there are tests in KernelCI that could have caught them automatically, without me running them manually as I was doing the upstreaming. To give some examples, these are the commits that a colleague and I sent upstream to fix those issues.
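The generic "no devices are deferring probe" check mentioned above can be sketched as follows. This is not bootrr's implementation, which is not shown in the talk; it is a minimal C illustration that assumes the kernel's list of deferred devices is readable through debugfs at /sys/kernel/debug/devices_deferred, one device per line. The function names `count_deferred` and `check_deferred_probe` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count the non-blank lines in a stream; each one is assumed to name a device
 * whose probe is still deferred. A line longer than the buffer would be
 * counted more than once, which is acceptable for a sketch.
 */
static int count_deferred(FILE *fp)
{
    char line[256];
    int count = 0;

    while (fgets(line, sizeof(line), fp)) {
        /* Skip lines that contain only whitespace. */
        if (line[strspn(line, " \t\r\n")] != '\0')
            count++;
    }
    return count;
}

/*
 * Returns the number of deferred devices listed in the file at path,
 * or -1 if the file cannot be opened (for example, debugfs not mounted).
 */
static int check_deferred_probe(const char *path)
{
    FILE *fp = fopen(path, "r");
    int count;

    if (!fp)
        return -1;
    count = count_deferred(fp);
    fclose(fp);
    return count;
}
```

A baseline-style check would then pass only when `check_deferred_probe("/sys/kernel/debug/devices_deferred")` returns 0, treating both a non-empty list and an unreadable file as something to report.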
For the first two commits, the issues that they fixed were preventing the display from probing at all, on the machine that I worked on and on a few others from MediaTek. That kind of issue, the display not probing, can easily be detected by a baseline test, or by an IGT KMS test, which would fail if the display isn't probed. There was another issue where the call to disable vblank was moved to the wrong place, so that the vblank interrupt was being disabled earlier than it should, and that caused warnings during suspend. That can be detected by the sleep test, which does a suspend-resume cycle. And there was another issue where the encoders stopped probing as well, because of platform_get_resource being deprecated on this platform. That's also something that could have been detected by a baseline test. So, where are we now for this machine in KernelCI? I have worked on enabling the configs. I added a config fragment in KernelCI to get all devices probing and working on this machine, and I also upstreamed that config; it's already queued for the next release. We also enabled the baseline test that I talked about. So the baseline tests for this machine are already running in KernelCI, and I added the device probe checks for the machine itself to baseline. But there is still a lot to enable, like the ALSA kselftest, which I'm working on right now. This machine has more complex audio controls, so the UCM configuration needs to be applied before the tests are run, so that all the audio paths can actually be tested. And there are some other tests, like cros-ec, libcamera, v4l2-compliance and IGT KMS, to name a few.
And there are also some tests, like the V4L2 decoder conformance, that can't be enabled yet, because the upstream support hasn't quite landed: it depends on the device tree description of the hardware decoder, and that patch is still being reviewed on the mailing list. Same thing for the sleep test, which requires a cpufreq node that isn't there, and the GPU also isn't there for this machine yet. After those components get enabled upstream, the tests can be enabled, so we can catch those issues when they happen again. This is a screenshot of the KernelCI dashboard with the results for this machine. As you can see, there are a few tests that are failing, because when I added the baseline test, I already added the checks for all devices, including the ones that haven't made it upstream yet. The one that's failing there, for example, is the external display for this machine, and it isn't probing because the upstream support isn't enabled yet. As it gets merged, those tests will start passing. And if they ever start failing again, we can quickly notice that something broke. So, where can we grow KernelCI from here? I think more subsystems should definitely have more coverage. IIO and input, for example, are subsystems for which I didn't find any tests to run. As we increase the coverage of the subsystems, we'll start catching these issues. If we don't have the tests themselves, we can't detect when the issues happen. And while baseline tests already help with the basic support, we really need subsystem-specific tests to detect issues in the actual usage of the hardware components. Besides that, we also need more trees from maintainers. So if there are any maintainers in this room who would like to have some branch tested by KernelCI, maybe file a request in the KernelCI repository or get in touch.
That way, we can start testing more of the trees and catch issues before they get merged. Also more labs, of course. We rely on the labs and the devices in them to run the tests, so the more labs we have, and the more diverse their devices, the better. Maybe you're interested in some device that isn't already present in any lab. In that case, it would be interesting if you could set up your own LAVA lab with the devices you're interested in and hook it up to KernelCI, so that you benefit from the tests that are already there and get them run on your devices. We could definitely also have more of the kselftest and LTP tests added to KernelCI, as well as support for KUnit, which is growing in the kernel, so it would be great to be able to run those tests. Basically, I think there's still a lot to be gained from the open source model that we're all pretty familiar with on the development side of the kernel, where everybody does a little bit of the work and everybody gets to benefit from it. I think we're only starting to see that on the testing side of Linux. As we keep increasing the branches, the code base and the device coverage, everybody will start to benefit. Using KernelCI this way will let us cope with the quantity of devices that are out there and respond quickly to regressions as they happen, and because of that, give actually reliable long-term support for all these devices. And everybody will benefit from that. So that's about it for the presentation. If you have any questions... Thank you. Are there any questions? Seems everybody is eager to get out. So, thank you for the talk. Thank you all for being here.
This was the first iteration of the kernel dev room. We hope to make this a regular thing at FOSDEM, so spread the word. We could use loads more submissions than we had, although we had a lot of great talks. I really want to thank the people I organized this with. There were three of us: Stéphane Graber from Canonical, the LXD team leader; one of the eBPF upstream maintainers, Daniel Borkmann from Isovalent; and me, Christian Brauner. Thanks, everyone. |
Kotlin DevRoom Welcoming Remarks |
Okay. So, hi, everyone. Good morning. This is so awesome. Welcome to the Kotlin Dev Room at FOSDEM 2023. Well, it's been two long years since we saw us in person. I still remember when we had the first Kotlin Dev Room in 2020 and, like, not as many people as today, like, literally, I'm impressed. Like, I think we had a bigger room and way less people, which was like, I don't know how many people will actually show up. But, yeah, today we are so many and, again, super excited to get this kicked off. And as we get this kicked off, well, I want to give out, like, a little bit of stats. First, I'm so proud to share that this year we received so many proposals for talks. We actually received more proposals than the Go Dev Room, than the Rust Dev Room, than the JavaScript Dev Room, which is happening tomorrow in the same Dev Room. And, by the way, like, if you're so much into Kotlin, make sure you don't miss tomorrow's JavaScript Dev Room opening talk by Luis. I don't know. Is Luis in the room today? He's finishing his slides. But, yeah, he will open the JavaScript Dev Room, speaking out a ditched JavaScript for Kotlin. So you definitely don't want to miss that. And so, like, a couple of other interesting talks about Kotlin also in the friends of the open JDK Dev Room tomorrow. So, check the agenda. There are, like, a lot of interesting content. So, back to the people to thank. Well, first, I want to give a big shout out to the Dev Room managers. So, thank you. It's me, Martin, Sergei, which is there, Olga and Marco. And we do these in our free time. Like, we don't get paid for that. And it's just, like, it's so awesome to get in touch with the community and hear your stories, connect with you. And, well, thanks for showing up and coming. I want to say thank you to Jabrains for supporting us. So, every speaker today will get a t-shirt. Plus, we do have some stickers and stuff to give away. So, stay tuned. But, yeah, Jabrains is supporting us in organizing this Dev Room. 
And I want to say thank you to the speakers, which prepared content and, yeah, just made the Dev Room possible. So, thank you very much. And, yeah, now back to Martin. So, we have a packed agenda for today with more than 15 talks. There is no, sorry, I do not get this good. Yeah. This is the agenda. So, it's available online. You can go on the FOSDEM website. You also have an Android app. I'm not sure about an iOS app. I think some, there is, okay. So, you have apps. Make sure to check it out. We have a lot of content today. And there is no lunch break at all. So, you have to organize yourself to just decide what you want to do or just bring some sandwiches. Or you can eat some waffles, too, if you want. We have free waffles. That's it. If you want more Kotlin, as Nico said, there is a talk tomorrow in the JavaScript. I mean, you have a lot of Kotlin content today and tomorrow. If you want to discuss more and have some beers, we meet tonight at the GIST. It's in the center of Brussels. The address is not here, but you can look it up on Google. And let's meet there. Just when the dev room closes, we will go there. We can meet or just meet directly there at 8 p.m. Am I doing this one? I think you're doing this one? Or am I doing, I think I'm doing this one. All right. So, like the last years, we will have a live chat on Slack. If you are not on Slack, give me a shout. Or we will have links on here. I think, Nico, you prepared. Deeplings. Pingas there. There's the Fostum room. This is, I think, the Deepling to Slack room itself. If you are not signed up yet, for whatever reason, use this link. Ask me if you need it again and you can't take a picture right now. And this is also a good place to be because, as Nico already said, we do have a couple of giveaways. Thanks to JetBrains, we have, I think, three licenses to give away. How this works, I will let you know on the Slack and on Twitter and on Mastodon. Yeah, essentially, so keep an eye on there. 
Ask me how this works. Essentially, if you ping me there, if you tweet a post or toot a post with a picture or something, I will see it if you put on the hashtag, FOSDEM Kotlin 23 or FOSDEM 23, something like that. I will let you know. And you can win one at the end of the day. I wanted to say the end of the year, but that would be quite a long time. I think that's it from my end. If you have any questions around this, approach me. I will be somewhere here or close to the coffee. Okay, fine. Yeah, prepared as always. Thank you so much for coming. Yeah. We will, yes, we will do. Good point. Yeah, here we go. Better. Excellent. Right, so then enjoy the day, have fun, network, speak to each other, code, whatever. Enjoy. |
The State of Kotlin |
Thank you for joining us today and let's start with a brief intro, probably, of our speakers. All right, thank you. Thank you all for being here. I'm Marco and I'm Kotlin GDE based in Berlin. I'm Italian and today I'm here talking about Kotlin. Same here. I'm Sergey. I'll be talking about Kotlin as well and we were asked to get you the state of Kotlin, whatever it means. I worked for various companies in Kotlin for about seven years now, I guess, such a long time. Early years with the backends in Kotlin now mainly in Android development and infra. But let's start with important things. We live in a modern time with modern technologies available and so this presentation is powered by generative AI. It's really important to remember that the speakers today, both of us, are not anyhow affiliated with any companies that create Kotlin or sponsor it or anything like this. It will be pure, our opinion based on our experience or guesses, I don't know. We might be wrong except for the places when we are right. That's it for today. And generative AI make a thing that probably the speaking these days will be an easy job. Last year, preparing for the FOSDOM talk, I've been writing my videos about 3 a.m. before the deadline. Nicola was already saying why you're not sending me the videos and I had to do and this year I thought, all right, I'm using the modern technologies and ask chatGPT to generate the slides for me. Unfortunately, we ended up in like 5 to 10 minutes saying, oh, I have no idea what is Kotlin because I don't have data after 2021 probably. So the rest of the presentation is really not powered by the generative AI. I'm sorry for this. So it's powered by us. And we had to figure out what is Kotlin and what is the current state. So probably the first reasonable thing thinking about this is to get and try to understand any of the developer service available. 
And one of the most interesting one in this domain is the JetBrains developer survey that they run, I guess, every year from 2019 or 2020-ish. So here is the data available on the end of 2022. And we can see definitely a domination of mobile development in Kotlin and presumably it's Android mobile development, I guess so. And another leading stream is the web or backend development which is rising and like 40% of all Kotlin engineers are working with backends. I must admit that at the end of like 2022, the majority of population is still in mobile, the backend is rising and backend is trending up. I was under impression like a year ago that, hey, Android is probably the only platform for Kotlin. However, I was like reasonably biased with this and the first time after COVID when I arrived to one of the European conferences that had a few topics there, I was impressed that the majority of topics were with backends. But today let's try to avoid getting too deep into particular Kotlin platforms and focus more on like high-level language stuff, what is coming in the next years, or at least what we can probably predict to come. It's quite funny that it's possible to describe the whole 2022 and 2023 agenda in the Kotlin world with just five letters. And one is common in both of the words. It's K2, the new Kotlin compiler that's coming, and the Kotlin Multiplatform or KMP. So I'm not very surprised that in the past year we didn't see a lot of Kotlin language features and there are a few reasons for this. And we'll definitely discuss them in the next slides and further in this talk. But let's try to first understand what is K2 and what is what it's preparing for us as developers and product owners or engineers. And then we'll get to KMP later in this presentation. First of all, there are a few major problems in the whole Kotlin infrastructure. And they are the stability and performance of the ID. 
So even though IntelliJ is probably one of the greatest products on the market for developers, it's not great for Kotlin. I mean, it's fine, but as your project keeps growing and growing, and I know something about large projects in the industry, it gets unusable. The second thing is build speed. If you've used Java for your builds before, even if Java is slow, overall Kotlin is significantly slower. And this is a drawback, especially if you come from a legacy code base where you had five or 10 years of Java code. Yeah, I know, I know, I said we wouldn't get deeper into the platforms, but it is what it is. And the build setup overall is quite complicated. For example, Kotlin Multiplatform: how do you run Kotlin Multiplatform from the command line? If you can give me an answer, I will probably just take you out for dinner. Currently, K2 is being developed under two major Kotlin roadmap milestones. Whenever you're interested in what's going on in Kotlin, you can go to the Kotlin roadmap. Thanks a lot to JetBrains for publishing it. It's like magic. There are two key things. The first is getting K2 to beta. K2 is currently in alpha, and it's a whole rewrite of the compiler front end, meaning that if you previously hacked something with compiler plugins or the compiler infrastructure, or you're an owner of libraries like KSP or KAPT, you basically have to redo everything you did in the past three to five years. The second part is the IntelliJ-based plugin, and the story there is that the compiler, and the compiler front end especially, is very coupled with whatever Android Studio or IntelliJ IDEA or your IDE of choice is doing with the language. Whether it's a language server or the index you have locally, your compiler performance really impacts the experience you have in the IDE. Let's get to the other things. The unfortunate one is the deprioritization of the API for compiler plugins.
It means that for all of the engineers outside of JetBrains, it will still be very hard to create tools for the compiler and for the IDE and evolve them over time. Currently, you don't have a stable API, and you basically don't have a documented API, for the compiler. For example, if you're creating a library that changes something, then with every major or even minor release of Kotlin you need to upgrade it and maintain compatibility with that version of the compiler. Currently, K2 is in alpha, but as far as we understand, Kotlin 1.9 is about to get us K2 in beta. For the current numbers, we have an improvement of around 2x for all the builds that JetBrains showed in their benchmarks and told us about publicly. Kotlin itself has been building significantly faster. But if you're impatient, there is a way to bring at least some of the build speed improvements earlier in the pipe. You can go to Kotlin 1.7-something, and it has probably a 10 to 15% improvement in build speed. You can even experience this in large-scale code bases. There are a few plans for K2 going to beta, and mainly this is what JetBrains and Kotlin want to achieve: a fully functional compiler that works for the whole ecosystem, not only the JVM. It's probably usable for the JVM even right now; I tried it. It's not a great experience, but you can at least experiment with it. There is a need for improvement in various plugins, like the Kotlin annotation processor, serialization, KSP and others. It will take time. As of yesterday, there is working support for kotlinx.serialization and for the all-open, no-arg and Lombok plugins. It's already in the source code of the Kotlin compiler. I'm not sure which version they're targeting; I was checking the 1.9 daily build and it was fine there. Probably there is something in 1.7.21, but I'm not quite sure. Unfortunately, KAPT is still not working, so you cannot do any annotation processing yet.
KSP, Kotlin Symbol Processing, the future of annotation processing for Kotlin, also doesn't work. Let's get into some details of the annotation processor support. Luckily, in YouTrack there are tickets for almost everything that works or doesn't work and what JetBrains is planning to work on. There is a ticket for front-end IR support for KAPT. It's still in progress. It should be done in 1.9, or at least that's the current target version. I tried to understand what doesn't work right now, and it got me to quite interesting things: "KAPT currently doesn't support language version 2.0. Please use language version 1.9 or below." That's just from the source code that I saw yesterday. I tried to get deeper, and there is a Kotlin language version 2.0 introduced there. I don't want to make any judgment here about what the version will be, but there are some things that suggest we are going to have some major changes in the future. Kotlin Symbol Processing is another thing that doesn't work currently, but it's expected to arrive around 1.9. There is some work going on in this domain in the Google repository for KSP, but still nothing is done there. The testing infra doesn't work at all currently, so one of the major libraries for compile testing doesn't support K2. I don't know when the support will come there. There is a pull request adding this support, at least partially, but who knows, it will probably take another six to eight months. Finally, the IDE support for K2. As far as I understand, the plan is to create a K2-based IDE plugin that will probably start as a quite lightweight one that supports only a few things. It should hit the performance that is targeted: everything works fast, correct and, likely, stable. The IDE plugin will have a lower number of features, will be lightweight and will likely be released after the beta or stable of the compiler, so not earlier than Kotlin 1.9, probably even later.
As currently stated, it should have all the code highlighting. It should have basic code completion, nothing fancy, probably just the stuff you use on a regular basis, but not major refactorings or anything else. In the debugger, there should be breakpoints and some simple stuff, really simple stuff, as it's declared, and there should be a limited number of diagnostics. Last but not least, as you might notice, another big part is the Kotlin Multiplatform support. This new lightweight plugin is aiming to have full-featured Kotlin Multiplatform support as well. That's my part. After some K2 magic, let's check what happened this past year in Kotlin Multiplatform and what will come in the future. Let's start our journey from Kotlin 1.6.20. The first magic thing, beautiful thing, that we saw was the hierarchical project structure becoming the default. This was such a blessing, because you finally don't need to do weird magic stuff like symbolic links to cover multiple architectures, like the iOS version and the Intel version and the x64 version. You automatically get the shared stuff and you can just use an intermediate target, which was really a good thing to start with. But it wasn't the only thing in 1.6.20, because we got some improvements in Kotlin/Native, and we got all the improvements that, as an engineer, you can dream of: runtime, compile time and code size. It was an interesting update that brought many improvements to the developer experience, which was really nice. But the very big thing came later, in 1.7.20, which was finally the new memory manager for Kotlin/Native. With that version, it's now enabled by default, and it's again a huge win in terms of developer experience, because a lot changed and we got better stuff. That's because the memory manager changed from reference-counting garbage collection to tracing garbage collection, because in the past it was just made quick and dirty.
Let's put it this way: it was done just to get it out. But now that things are getting bigger and people are using it, they realized that some of it was not the best choice in hindsight. So what this means in reality is that there are no more restrictions on sharing objects between threads. There are more leak-free primitives, which means no leaks in the internals; there may be leaks in stuff that you wrote, because we can all introduce bad things, but at least the tooling is not sneaking in leaks. What does it mean at the end of the day for developers? It means no more freezing. Finally, we get all the beautiful sun because we don't need to freeze objects anymore. We are really free to use everything, and you just don't have to think about freezing stuff anymore, or check that it's not crashing, that there's no mutability and stuff. So, all magic now. This enables a bigger thing, which is Kotlin Multiplatform Mobile, the mobile part, finally reaching the beta version, which is a huge milestone. This means that the technology is basically done and it is safe to use in your project. The fact that it's in beta means that there's still some work to do, mostly on the toolchain, but in general you can use it safely. Maybe there will be some rough corners in setting up the toolchain, stuff like that, but don't worry, you can start using it and learning it in your project, and do it because it's fun. All right, now let's move closer to today. With the recent release of Kotlin 1.8, we got a lot more interesting stuff in the field of Objective-C and Swift interoperability, which is something that people always ask about because it could be better sometimes. With this release, we got some nice annotations like @ObjCName, which lets us specify a more idiomatic or more beautiful name for some function that we want to expose to Swift, so we can just change the name without changing the Kotlin object itself.
Another one is @HiddenFromObjC. As the name suggests, we can hide some function from Objective-C, because maybe we want to have a duplicate function that works better or has a better representation for the Swift world, and we want to hide the Kotlin one so we can save precious binary size and have a specific function only for the iOS part. Finally, another interesting one is the @ShouldRefineInSwift annotation, which basically tells the compiler to mark a function or a property as Swift-private. Basically, it's going to be exported to Objective-C with a double underscore, and with a double underscore this function will be invisible because autosuggestion doesn't suggest it. On iOS, sometimes stuff is invisible because autosuggestion doesn't work properly in Xcode, but that's another thing. But this way, it will be possible to hide some function and then rewrite it in Swift to have a more idiomatic Swift way of doing things. Another experimental thing that we got in Kotlin 1.8 is something that I was dreaming of, because every time I see androidTest and androidAndroidTest, my head explodes; every time I think that it's a typo or something, but it's not a typo. Now this will change, and there will be more descriptive names: instead of androidTest, androidUnitTest, and instead of androidAndroidTest, androidInstrumentedTest. So it's going to be clearer and you'll really understand what each one is. Connected to that, there will also be more clarity about where to put the manifest: not in androidMain, but you're going to have a specific folder for the debug and release versions, so it's going to be clearer and easier to understand. This is still experimental, of course. It's going to be enabled by default sometime in the future, but if you want to use it now, you have to opt in with a Gradle option.
Another thing in Kotlin 1.8 was the stabilization of Kotlin/JS, and so finally, right now, all three technologies, Kotlin for the JVM, Kotlin/Native and Kotlin/JS, are using the same backend, which means better handling, fewer bugs, and everything works better. Another interesting goodie, not specific to one version, is that from version 1.4 to 1.8 they were experimentally checking binary backward compatibility, which is always a nice thing. And from now on, there are processes set up on their side to keep binary compatibility across every release, which is going to alleviate the pain of having stuff broken. All right, another very interesting thing about Kotlin and the multiplatform world is that JetBrains also maintains a version of Compose for desktop and the web, which is a fork of the Android one from Google. And yeah, the support is ongoing, but it's really neat. It's really beautiful, because you can use something nicer to write a desktop application and not have to deal with Swing, Java, whatever. So it's really nice that we have this sort of thing. Of course, it takes time to keep up with Google's releases because they have to catch up. But for example, a couple of days ago we got the 1.3.0 release, and things are moving forward on this side too. Last year, we also got some interesting experimental stuff, which is Compose for iOS. Yes, it seems to be happening. It's still a very experimental technical preview, so absolutely not ready for production. But people in the community are playing with it. For example, the folks at Touchlab built the iOS version of the Droidcon app with Compose for iOS, which is something amazing. And yeah, this is, as I said, not ready, but it's something to keep an eye on because it's going to be wild and interesting. So what is going to come this year?
Well, there will be more improvements on the memory manager, even though it's already stable and it's the default memory manager. So folks would keep increasing and fixing bugs and increasing performance on that. Of course, there will be more improvement on compilation time on Kotlin native because it's still sometimes not the fastest thing in the world, and so it has to be improved. Another point, which is, like I was mentioning before, it's always asked from people, is better exporting to Objective-C, so have better APIs to interact with from the iOS part. And also, another thing will be after confirming that there are tools in place to have backward binary compatibility for the Kotlin native version, they will describe and add more improvements and documentation for library developers to maintain a binary compatibility as well. All of that is going to... All of these improvements are connected to have finally Kotlin native platform mobile to stable, which hopefully is going to happen this year. And in order to do that, there are a bunch of things that need to be addressed. Like I said before, it's mostly tool chain and infrastructure stuff, but the main thing is working and you can start already using it because it's in beta and stuff are kind of working right now. With that was it, the introduction of the journey into Kotlin for this year, from the past year and for this year. So, thank you very much. There will be a lot of content today, so feel free to catch it up with all of that and have fun. Thank you very much. We do have four minutes for questions, so if you have any, raise your hand. We'll bring in the microphone. Is it okay if I ask three consecutive questions, two of which are related? First of all, in terms of Kotlin JS and Kotlin native, those are not intended to be a performance alternative for existing solutions. Kotlin JS wasn't meant to be a challenger to TypeScript, and Kotlin native wasn't intended to rival things like Rust or Go, etc. 
Is there a change in that ambition? That's my first question. My second question is, does JetBrains have any plans to use Compose to port their IDEs to Compose from AWT? Okay, for the first one, I'm not sure if I heard or read some changes or stuff, but yeah, there's just another thing in the ecosystem, probably it's not gonna replace. I'm not sure what you catch or what not, so I'm gonna restart. I haven't read stuff or heard stuff that changed what you were saying. Probably there's gonna be other tooling to support and not to replace. So it's gonna be another thing to cover and other use cases, not to completely replace. But yeah, I don't have more clear guidance or evidence or opinions about that. The second one was about, there was noisy, I didn't get it fully, but... Yeah, I'm not JetBrains, so I don't know, but probably yes, if I have to bet it. I know that they have their own runtime for UI and stuff like that, but probably yes they are gonna use it, but I'm not JetBrains, so I don't know. |
Kotlin Multiplatform: From “Hello World” to the Real World |
Hello, everyone. Hello again. So we are going to resume. If you are in the back, please come in. We have seats in the front. And please make yourself comfortable for this next talk, where I have the pleasure to introduce Russell. Russell is coming all the way from the USA to be with us today. He works at Touchlab and knows everything about Kotlin Multiplatform. And today he will tell us about going from Hello World to the real world. Yeah, thanks a lot. Yeah, I'm here from Boston, where it's actually like negative 24 degrees Celsius today. So I'm all right with the rain. And yeah, my name is Russell Wolf. I'm a Google Developer Expert for Kotlin. And I work at Touchlab, where we do Kotlin Multiplatform all the time. So I'm going to talk a little bit about taking a Kotlin Multiplatform app from the basic kind of Hello World example that you might start out with to the sorts of things that you might need for a production app. As part of my work at Touchlab, I've been doing Kotlin Multiplatform things since pretty much day one. In a couple of weeks, it's going to be five years since I wrote my first Kotlin Multiplatform code, which is kind of cool. I don't know. It hopefully means I know something about what I'm talking about, but you can let me know. So let's get started. So, a quick introduction to Kotlin Multiplatform. I'm kind of breaking up the talk by the sections of the title, so we start with Kotlin Multiplatform. Kotlin grew up initially as a JVM language, but it also has backends for JavaScript and native. And all of these are actually kind of families of platforms, where JVM includes Android and native includes all sorts of different targets. And Kotlin Multiplatform adds not just the targets themselves, but the ability to share code between them. So you have platform code that runs on a particular target, and then you have common code that runs on all of them, or on combinations of them.
And what this enables is a really nice kind of flexible interop, where you can share the parts of your code that make sense to share, but you have the ability to drop into platform-specific code for things that you don't want to share. And it lets you treat your shared code as basically just another library. So you can be writing what would otherwise look like a fully native app and share just a piece of it. I'll say a couple of words on KMP versus KMM. KMP is Kotlin Multiplatform, the whole universe of all the different things that Kotlin can target. KMM is Kotlin Multiplatform Mobile, which is the mobile part of that story, and the first piece that JetBrains is stabilizing. So that's the thing that they announced is in beta. There's not really a hard technical line between them, because KMM in the end is just parts of KMP. It's working on the same technology stack. But in terms of what they're focused on for the developer experience, KMM is the piece that's coming first. And yeah, as mentioned before, it's recently moved into beta. It's planned to go stable this year. So it's a really good time to get into it and start using it, if you haven't yet. And what beta means to JetBrains can be a little bit different from what you might be used to from other projects. They're very slow about designating things as stable. They want to be absolutely sure of every little detail. But even by calling it beta, they're very strongly committed to keeping things working. They're just saying there might be some breaking changes in the future. And to break down how Kotlin Multiplatform code works, I like to use this kind of Venn diagram and focus on the mobile use case. So we're talking about Android and iOS. If you're an Android developer, you're used to this type of diagram. You have access to all the Kotlin APIs you're used to. You have access to JVM and Android APIs.
And there's a subset of that, just the pure Kotlin stuff, that in principle you can run on any platform. Which then means you can take that over to the iOS side and also add some iOS platform-specific code. So you have your shared bits and your platform bits, and the KMP toolchain brings all of that together. Essentially, each of these different colors on the diagram is just a different source directory, and the toolchain knows how to put the right parts together so that you get the right code for your platform. And again, there's more to KMP than just KMM, but the 8-way Venn diagram of everything is a lot more complicated to draw. So what does it look like when you're writing your first Hello World in Kotlin Multiplatform? One way to get that is to start with the Kotlin Multiplatform Mobile plugin for Android Studio. So you can do a lot of the stuff. I tend to use IntelliJ IDEA more than Android Studio when I'm doing my KMP development, but the new project template in Android Studio is a little bit easier to get started with. They have these templates, Kotlin Multiplatform application or Kotlin Multiplatform library. And what they give you is some code that looks kind of like this. Don't worry about every little detail of it, but this is the Hello World template that it generates for you. The common code is in the center here, Android is on the left, and iOS is on the right. There's a Platform interface in the common code that's implemented on each platform, as AndroidPlatform and IOSPlatform. There's an expect function. expect and actual are two keywords that Kotlin Multiplatform adds to the language that essentially let you declare something in your common code but implement it in your platform code, kind of like a header. Actually, they used the header keyword for it before Kotlin Multiplatform was released in 2017, 2018-ish.
So there's an expect function that gets you a default platform, and it has actual implementations on each platform. And then there's a Greeting class that brings it all together and prints the name of the platform that you're on. This gives you a little playground to start messing around with Kotlin Multiplatform code. And I actually really like the way that they use expect/actual in here. When you have this new tool, starting out with Kotlin Multiplatform, it's very easy to overuse it; you start making all these expect classes and things like that. I tend to find it's really nice to hold on to interfaces as well. When you define an interface Platform rather than an expect class Platform, you can substitute other implementations a lot more easily. And so this is a rough sense of the code structure that you get from this template. The code that I showed you is these bottom three boxes. There's common code in the middle, in orange; there are Android sources that you then compile to an Android library; and there are iOS sources that you then compile to an iOS framework file. And then, if you use one of the application templates, it'll also include the app layer that consumes that. There are multiple ways that you can configure the iOS app to consume it, which essentially comes down to the different dependency managers that you can use on iOS. There's a default that's just to manually include the framework: you add a custom build step in Xcode that will call into Kotlin and do that. There's also a plugin that's part of the Kotlin toolchain that uses CocoaPods. CocoaPods has historically been a commonly used dependency manager on iOS. These days it's starting to be replaced by Swift Package Manager, but the Kotlin toolchain doesn't have as good an integration with SPM yet.
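The point about holding on to interfaces instead of reaching for expect classes can be sketched in a single file. This is a minimal, hedged approximation of the template described in the talk, not the generated code itself: `JvmPlatform`, `FakePlatform`, and the constructor parameter on `Greeting` are assumptions added for illustration, and `getPlatform()` is written as a plain function so the sketch runs on its own, whereas in a real multiplatform project it would be an `expect fun` in common code with an `actual fun` in each platform source set.

```kotlin
// Single-file sketch. In a real Kotlin Multiplatform project these pieces
// would be split across the common and platform source sets.

// Common code: an ordinary interface rather than an expect class.
interface Platform {
    val name: String
}

// Hypothetical stand-in for a platform implementation (the talk mentions
// Android and iOS implementations living in the platform source sets).
class JvmPlatform : Platform {
    override val name: String = "JVM ${System.getProperty("java.version")}"
}

// In the template this would be `expect fun getPlatform(): Platform` with an
// `actual fun` per platform. A plain function keeps this sketch runnable.
fun getPlatform(): Platform = JvmPlatform()

// Accepting the Platform as a constructor parameter is an assumption made
// here so that callers can pass in a different implementation.
class Greeting(private val platform: Platform = getPlatform()) {
    fun greet(): String = "Hello, ${platform.name}!"
}

// A fake for tests. This substitution is only possible because Platform is
// an interface; an expect class has exactly one actual per target.
class FakePlatform(override val name: String) : Platform

fun main() {
    println(Greeting().greet())                       // uses the default platform
    println(Greeting(FakePlatform("TestOS")).greet()) // prints "Hello, TestOS!"
}
```

The design choice this illustrates: only the small factory function needs the expect/actual mechanism, while everything else depends on a normal interface that tests or other modules can implement freely.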
And then I'll also just call out that we at Touchlab have a sample called KaMPKit that can also be a nice place to start if you're playing with this stuff for the first time. It's a somewhat more complicated sample than that Hello World. It has a bit more architecture to it and shows some of our standard architecture and library practices. It also has a bunch of documentation explaining why we make some of the choices that we do. So check that out also if you're interested. So what are some common themes around these sorts of starter projects? And there are a lot more than just those two, I should say. Lots of people have put together interesting multiplatform samples that you can use when you're first learning. Something that comes up in a lot of them is that they tend to aim at maximizing shared code, which in an ideal world is really nice. In the real world, oftentimes you're starting from two separate native apps and you want to incrementally move towards more shared code, and you don't always get a good sense of what that looks like from any of the standard samples. Starter samples also tend to be monorepos. So what if I already have existing apps? They live in different places, but I want to start sharing code between them. What does that look like? A big piece of a lot of them is that there will be some step in your iOS build process where Xcode has to call into Gradle to build your Kotlin. But if you're on a larger team, you might not want to have to do that every time. Your iOS team might not even have a JDK set up if they're not used to using that. So what do you do in that case? And sample projects also tend to be single-module. But what happens when things get bigger?
So that brings us to: what does it look like when you take all of these sorts of things and start scaling them up to real-world projects? I'm going to talk about some of the ways that we tend to think about this at Touchlab, as well as some tools that we've put out into the community to help with some of these things. The first thing I want to talk about is team structure. This is something we've been talking about a lot internally at Touchlab recently, building out a sort of taxonomy of the different ways that teams approach their shared code. A core piece of that is being thoughtful about how the structure of your team impacts the way that you want to organize your code, because teams are very different. The distinction I'll highlight here works across a couple of different dimensions. I tend to think of it as small teams versus large teams, but it's also sometimes teams that work as one unit versus teams that work as multiple units. And a key question is often: is the group that is writing the shared code the same group as the people who are consuming the shared code? When you're a smaller team, or if you're one unit, you tend to have fewer worries about who owns what parts of the code. You're more unified in what your developer setup looks like. You're more likely to be in a situation where you're sharing a higher percentage of things and just wrapping a thin UI around it. And you're more likely to be doing all of your feature development at once for both platforms. On the other hand, when teams get larger, things get a little bit messier. You're more likely to have iOS specialists who don't want to deal with the Kotlin directly. And you're more likely to have an iOS app that is much larger than just the Kotlin part.
And so your Kotlin is just one more thing in a sea of other native libraries that your iOS app is using, and you tend to want to minimize the impact of your Kotlin on the rest of the iOS code. What that often means in practice is that, rather than linking your Xcode build to your Kotlin directly, you want to publish it as an external library. So compared to the diagram on the left that I showed you earlier, the way it can look in a larger team is that, rather than being consumed directly, your shared code is published to some sort of artifact repository, and then your apps pull that artifact down. There's more of a two-step process to making updates, but it lets you work in separate streams more easily. We at Touchlab put out a tool to help with this in the fall. It's called KMMBridge, and it's a Gradle plugin that can essentially manage the publishing of your iOS framework in a couple of different ways. It gives you a Gradle task to publish a new version when you've made changes. It has options around how you increment that version, options for where you want to host that binary, and the ability to plug in your own. And then there are some helpers, if you're using a package manager, for making the local development flow a little bit easier. Sometimes you want to be able to toggle between using the binary that you pulled down and building it directly when you're trying to write new code or debug it, so we have some helpers to make that flow easier. There's a bunch of little things that are still a work in progress here. If you're a team that's interested in using it, we'd love to talk to you and get some feedback, so feel free to find me and let me know if you want to learn more. Another problem that comes up at scale is modularization. When you write a Hello World, it tends to be one module.
But when you're writing bigger things, you might want to have more than one. And Kotlin/Native, it turns out, makes this a little bit complicated. When you have multiple Kotlin/Native modules and you export them to iOS, they're essentially their own separate worlds. Each of these modules has its own copy of any internal dependencies, its own copy of the standard library, and its own copy of any third module that you might have underneath them that you're trying to share between them. And they can't talk to each other very easily. This can be okay if they're doing very distinct things. Maybe one of them is making analytics calls and one of them is running your database, and they don't really need to interact with each other. Then having them separate can be okay. But often you end up wanting to write an umbrella module on top of them so that, in your Kotlin layer, they can talk to each other more easily. You then have a shared module on top that you export as your iOS framework. That lets you have a more typical modular structure while working within the Kotlin/Native limitations. There's still some messiness to this, because in your umbrella framework you can have namespace clashes, where all of your declarations are essentially in one giant global namespace. There are changes coming that should improve this, but right now it can be a little messy when you have a lot of code in there. Another thing that comes up in real-world projects is your binary size. Hello World tends to be small; real apps tend to be larger. And there are real consequences when apps are too large: things like the App Store throttling your download or forcing it onto Wi-Fi rather than mobile data if your app gets too big. This can have a significant impact on the number of downloads that you get.
And it turns out one of the biggest contributors to this is the Objective-C interface that Kotlin/Native uses to export your code to iOS. The trick here is to limit the number of public declarations that you have in your Kotlin code. That will shrink the Objective-C interface, because it only needs to be generated for public declarations. The HiddenFromObjC annotation that Marco mentioned earlier can also be a way to do that, and there are different module structures you can sometimes use. I'll quickly mention a couple of other tools that Touchlab puts out that can be helpful when you're building real-world apps. By default, the crash reporting that you get out of Kotlin/Native doesn't carry over to Swift very well. So we have this tool called CrashKiOS that will essentially symbolicate your stack traces better. We have some updates to that in flight that will clean up different pieces of that story, but I'm not going to go into detail there because I'm getting low on time. We also have an Xcode debugger plugin that lets you debug your Kotlin code from Xcode, which can be a nicer environment for your iOS developers when you're introducing them to it. That was recently updated to have a CLI-based interface, which makes it much easier to install and update. So if you've tried it out in the past, feel free to give it another look. Another thing I want to talk about is the shape of your API surface. Hello World apps tend to be small, but as your app gets bigger, you start to care more and more about how idiomatic your API is. And Swift and Kotlin tend to want different things there. So I just want to point out: don't be afraid to add a bit of a translation layer between your shared code and your platform code. We have some tooling that we're working on to make some of that easier, but it's not in the open yet, so I'm not going to go into detail there.
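To illustrate the earlier point about limiting public declarations, here is a minimal sketch with hypothetical names (GreetingService, GreetingFormatter, normalize). Only the public class would be part of the exported surface; the internal helpers stay usable inside the module without needing an entry in the generated Objective-C interface.

```kotlin
// Public: this is the only declaration meant to be visible to the iOS app.
class GreetingService {
    private val formatter = GreetingFormatter()

    fun greet(name: String): String = formatter.format(normalize(name))
}

// `internal` keeps these visible inside the Kotlin module only.
internal class GreetingFormatter {
    fun format(name: String): String = "Hello, $name!"
}

internal fun normalize(name: String): String = name.trim().ifEmpty { "world" }

fun main() {
    println(GreetingService().greet("  Kotlin "))
    println(GreetingService().greet(""))
}
```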
I'll skip the example because I'm running out of time. But the overall lesson that I want to highlight is that different teams have different structures and want slightly different things. Kotlin is all about adaptability. If you're a team that wants to share a lot of code, you can do that. If you're a team that wants to minimize the impact of the Kotlin on the rest of your team, you can do that. And you have the flexibility to choose the way that you want to approach all of that. So thanks. I think I'm probably out of time for questions, but I'm definitely happy to answer stuff in person. Feel free to tap me on the shoulder or find me later today. If I'm staring at my laptop, it's not important, because it's Saturday, so I'd love to chat. Thanks. |
A mirror without reflection for Kotlin/Multiplatform |
Hey, everyone. Thank you for joining me. I'm Salomon Brys, and this is A Mirror Without Reflection for Kotlin/Multiplatform. So, without further ado, let's go into the subject and define what reflection is. Reflection is a feature that allows an executing Java program to examine or introspect upon itself and manipulate internal properties of the program. For example, it's possible for a Java class to obtain the names of all of its members. This definition is taken from the Java documentation, and it explains that reflection basically allows a program to introspect upon itself and look at its own methods and properties. For example, in this code, I simply print every field and method that a class declares. This is possible thanks to the Class object that the Java runtime gives me, which lets me access all the fields, methods, properties, and everything else that defines this class. Now, that's not the only thing Java reflection can do. Java reflection can also provide proxies. For example, here I create a simple printing proxy. I create the proxy by saying: here's a class loader, and here's a class. It's not really a class, it's an interface. So, here's an interface. What I'm asking the runtime to do is give me an object that implements this interface and delegates every call to this lambda. So, basically, I'm creating an implementation of an interface at runtime. And this is how you can use it. As you can see, it's pretty simple. All you have to do is call this createProxy method, and you'll have an implementation of the interface created at runtime. This talk is going to involve a lot of definitions, because we are going to have multiple pieces of a puzzle. The first piece of the puzzle was, of course, reflection. Let's go into Kotlin Multiplatform.
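The two JVM features described above can be sketched in Kotlin running on the JVM. Point, Greeter, describe and createGreeterProxy are illustrative names chosen for this sketch, not the code from the slides.

```kotlin
import java.lang.reflect.Proxy

data class Point(val x: Int, val y: Int)

interface Greeter {
    fun greet(name: String): String
}

// Introspection: list the declared field and method names of a class
// through the Class object that the JVM provides.
fun describe(type: Class<*>): Pair<List<String>, List<String>> =
    type.declaredFields.map { it.name } to type.declaredMethods.map { it.name }

// Proxy: ask the JVM for a runtime-generated implementation of Greeter
// that delegates every call to the lambda.
fun createGreeterProxy(): Greeter =
    Proxy.newProxyInstance(
        Greeter::class.java.classLoader,
        arrayOf(Greeter::class.java),
    ) { _, method, args ->
        "${method.name}(${args?.joinToString().orEmpty()})"
    } as Greeter

fun main() {
    val (fields, methods) = describe(Point::class.java)
    println("fields: $fields")
    println("methods: $methods")
    println(createGreeterProxy().greet("Ada"))
}
```

Both parts rely on java.lang.reflect, which is why neither is available once the same code has to compile for JavaScript or native targets.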
Or maybe let's not go into Kotlin Multiplatform, because you've just seen an entire presentation about what it is and how it works. So I'm not going to go into details; I'm simply going to say that Kotlin Multiplatform is a way to compile Kotlin code for different targets, namely JVM and Android on one side, JavaScript on another side, and finally Kotlin/Native on the last side. Kotlin/Native encompasses iOS and also other, less interesting targets. Let's face it, Kotlin/Native exists for the sole purpose of iOS. So, while Kotlin/JVM supports reflection, Kotlin/JS and Kotlin/Native do not. What is important to understand in this sentence is that reflection is not a feature of Kotlin/JVM. It's a feature of the JVM that Kotlin uses and builds upon with its own reflection library, but it's a feature of the JVM, not of the Kotlin language. As such, it is not provided in Kotlin/JS or in Kotlin/Native. Kotlin Multiplatform, being the intersection of Kotlin/JVM, Kotlin/JS and Kotlin/Native, therefore does not support reflection. So we need to find another way of doing what we usually do with reflection. If we go back to the definition of reflection, we can single out one word: reflection is a feature that allows an executing Java program. What this means is that reflection is a runtime feature. And we all know that what we cannot do at runtime, we can do at compile time. What can go wrong, right? To do at compile time what reflection does at runtime, we need to add several other pieces to our puzzle. KotlinPoet is a Kotlin and Java API for generating Kotlin source files. What you could do is generate Kotlin source files by hand with templates, and feel the vibe of the 2000s PHP era where everything was done with templating, or you could use a typed API that will build the Kotlin file for you.
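A minimal sketch of the typed-API approach, assuming the KotlinPoet library (com.squareup:kotlinpoet) is on the classpath. The package name, file name and generateHelloSource helper are placeholders, and the exact formatting of the generated source depends on the KotlinPoet version.

```kotlin
import com.squareup.kotlinpoet.FileSpec
import com.squareup.kotlinpoet.FunSpec

// Builds the source of a file containing `fun hello(name: String)`
// without concatenating any Kotlin syntax by hand.
fun generateHelloSource(): String {
    val hello = FunSpec.builder("hello")
        .addParameter("name", String::class)
        // %S is replaced by an escaped string literal.
        .addStatement("println(%S + name)", "Hello, ")
        .build()

    return FileSpec.builder("com.example", "Hello")
        .addFunction(hello)
        .build()
        .toString()
}

fun main() {
    print(generateHelloSource())
}
```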
So, I strongly encourage you not to generate your Kotlin source files by hand and to use an API such as KotlinPoet. Here, for example, it's very simple. I create a new function called hello, I declare that it takes a name argument, and I add the statement println hello name. So it basically generates this function. Okay. That's a very important piece of our puzzle, but it's by far not the most complicated one. The next piece of our puzzle is KSP. KSP stands for Kerbal Space Program. It's a very good video game, and the goal of this video game is to build a rocket and explore space. It's an exploration game, so it's purposefully undocumented. There's no manual; discovery is the entire game. You need to build your rocket, send your Kerbals to space and see what happens. The game is heavily based on trial and error. Not all Kerbals will survive the journey: you will send them to space and not all of them will come back. But when you do build a rocket and a space station in orbit, you feel a great sense of accomplishment. And as it happens, KSP also stands for Kotlin Symbol Processing API. The goal is to build a compiler code processor, and it is very, very lightly documented. Let's be honest, there is no manual for its discovery either. You will use trial and error, and you will yell at your screen in frustration. Using KSP is a very good exercise in managing your frustration because of its light documentation. But when you finally achieve a functional Kotlin symbol processor, you will, just like in the KSP video game, feel a great sense of accomplishment. So, let's see all the pieces of our puzzle. We use KSP to inspect code at compile time. We use KotlinPoet to generate code at compile time. And we use Kotlin Multiplatform to compile everything for all the targets that Kotlin Multiplatform supports.
So, the idea here is not to allow code to introspect upon itself at runtime, but to generate the information your code needs at compile time. It is a lot more optimized, of course, because you don't have to introspect all the code at runtime and all the information you need is generated for you at compile time, but it is, of course, a lot more complicated. A mirror is a class that contains reflection information about another class. So how would you create a mirror generator? Well, creating a symbol processor in KSP is not that complicated. What you need to do is create a symbol processor class that takes a code generator and a logger as constructor inputs. You will use those to, well, generate code and log when things go right or wrong. Then you can find all symbols that are annotated with a specific annotation, see what type of symbol each one is, and continue to inspect the code from there. As you can see here, for example, I look at whether the annotated symbol is a property, or maybe just a property setter, because in Kotlin you can annotate property getters and setters, or maybe it's a function declaration, or maybe it's a class declaration, and there are a lot of other things available. What's interesting in KSP, and what I'm not showing here in code, is that you could ask KSP to give you all symbols that implement a given interface, for example. Just like with APT, you don't have to use annotations. Annotations are a very valid means of conveying that the code will be processed, but with KSP you could say: okay, give me all classes that implement this interface, for example, or that kind of thing. And then, after you have inspected the code, what you need to do is generate your Kotlin source file.
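The processor structure described above might look roughly like this. It assumes the KSP API artifact (com.google.devtools.ksp:symbol-processing-api) is available; the annotation name com.example.Mirror and the class names MirrorProcessor and MirrorProcessorProvider are hypothetical, and the branches only log where a real processor would collect information and generate code.

```kotlin
import com.google.devtools.ksp.processing.CodeGenerator
import com.google.devtools.ksp.processing.KSPLogger
import com.google.devtools.ksp.processing.Resolver
import com.google.devtools.ksp.processing.SymbolProcessor
import com.google.devtools.ksp.processing.SymbolProcessorEnvironment
import com.google.devtools.ksp.processing.SymbolProcessorProvider
import com.google.devtools.ksp.symbol.KSAnnotated
import com.google.devtools.ksp.symbol.KSClassDeclaration
import com.google.devtools.ksp.symbol.KSFunctionDeclaration
import com.google.devtools.ksp.symbol.KSPropertyDeclaration
import com.google.devtools.ksp.symbol.KSPropertySetter

class MirrorProcessor(
    private val codeGenerator: CodeGenerator, // would receive the generated files
    private val logger: KSPLogger,
) : SymbolProcessor {

    override fun process(resolver: Resolver): List<KSAnnotated> {
        // "com.example.Mirror" is a hypothetical annotation for this sketch.
        resolver.getSymbolsWithAnnotation("com.example.Mirror").forEach { symbol ->
            when (symbol) {
                is KSPropertyDeclaration ->
                    logger.info("property ${symbol.simpleName.asString()}", symbol)
                is KSPropertySetter ->
                    logger.info("property setter", symbol)
                is KSFunctionDeclaration ->
                    logger.info("function ${symbol.simpleName.asString()}", symbol)
                is KSClassDeclaration ->
                    logger.info("class ${symbol.simpleName.asString()}", symbol)
                else ->
                    logger.warn("unsupported annotated symbol", symbol)
            }
        }
        // Nothing is deferred to a later processing round.
        return emptyList()
    }
}

// KSP discovers processors through a provider.
class MirrorProcessorProvider : SymbolProcessorProvider {
    override fun create(environment: SymbolProcessorEnvironment): SymbolProcessor =
        MirrorProcessor(environment.codeGenerator, environment.logger)
}
```

The processor only runs inside a KSP-enabled compilation, so there is no main function here.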
And the good news is that KotlinPoet does support KSP, so you don't have to write a facade between the KSP code generator and KotlinPoet. So it's, as you can see, pretty easy to write your very own code generator with KSP. Then what you need to do is add your symbol processor to your Kotlin compilation toolchain with Gradle. As you can see, it's pretty simple: just apply the plugin. Now, the KSP plugin is versioned using its own version number and the Kotlin version number. For example, here, it's version 1.0.9 of KSP and version 1.8.10 of the Kotlin language. At the moment, because the Kotlin compiler plugin API keeps changing, is not stable and is not documented, KSP depends on a very specific version of the Kotlin language. So you need to upgrade KSP at the same time that you upgrade Kotlin. And that's kind of a bummer, because when a new Kotlin version comes out, you need to wait for KSP to be compatible with it, even for a minor version. If you use the wrong minor version, KSP will warn you that it is not compatible with it. Once again, that's because the Kotlin compiler plugin API isn't stable and KSP is using internal functions and features of the Kotlin compiler. Then, of course, you will probably need to add your own runtime, because when you generate code, you usually need to provide a runtime library alongside the generated code. And then you need to declare that your KSP processor will run on this code. As you can see, it is declared differently than regular Kotlin dependencies because, at the moment, KSP doesn't integrate with the Kotlin Gradle DSL. So you have to use this kspCommonMainMetadata configuration in the Gradle dependencies. So what can you do with this technology?
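As a sketch of that Gradle setup in the Kotlin DSL, using the versions mentioned in the talk. The project paths :mirror-runtime and :mirror-processor are hypothetical placeholders, and the targets are only examples.

```kotlin
// build.gradle.kts of the module whose code should be processed
plugins {
    kotlin("multiplatform") version "1.8.10"
    // KSP version = <Kotlin version>-<KSP version>
    id("com.google.devtools.ksp") version "1.8.10-1.0.9"
}

kotlin {
    jvm()
    iosArm64()

    sourceSets {
        val commonMain by getting {
            dependencies {
                // Runtime library that the generated code relies on (hypothetical).
                implementation(project(":mirror-runtime"))
            }
        }
    }
}

dependencies {
    // The processor is declared outside the kotlin { } block,
    // on the KSP configuration for common code.
    add("kspCommonMainMetadata", project(":mirror-processor"))
}
```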
Well, for the last two years, I have been developing an example, because it was needed at the company I worked with, and that was mocking. What we have here is a test class that works with Kotlin Multiplatform tests on all Kotlin targets and that uses mocks generated at compile time, because mocking with, for example, MockK or Mockito uses the proxy reflection feature of the JVM, which does not exist in Kotlin Multiplatform. For example, here we say we want a view that will be mocked. View is an interface, and its mock will be generated by the MocKMP compiler plugin. We want a fake, and a fake is a data class filled with fake values: empty strings, zeroed integers and that kind of stuff. We want a controller that uses both the fake and the mock. We want to define the behavior of our mock. For example, here I say that on my view mock, if I call view.render with any argument, it will return true. And I want to be able to verify that a mock has been called with specific data, in this instance model. All of that DSL is possible thanks to KSP and KotlinPoet and the ability to generate code at compile time. So what was previously unavailable to Kotlin Multiplatform because reflection wasn't available is now available thanks to code generation. And by the way, if you're interested in mocking for Kotlin Multiplatform, you can use MocKMP, which is a library that we built with Deezer. This testing library is used in production, meaning in test production, at Deezer: almost all the multiplatform tests at Deezer use this MocKMP library that we developed together. So, there's a problem with KSP. If we go back to the example I just gave you, this class uses this injectMocks function. And the fact is that injectMocks is precisely the function that is generated for this class.
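To show why generated code can stand in for a runtime proxy, here is a hand-written sketch of the kind of mock class a processor could emit for the View interface. This is not MocKMP's API or its actual output; ViewMock, renderCalls and renderAnswer are hypothetical names, and the stubbing and verification are done with plain properties rather than a DSL.

```kotlin
data class Model(val title: String = "", val count: Int = 0)

interface View {
    fun render(model: Model): Boolean
}

// What a code generator could produce: an ordinary class, no reflection needed.
class ViewMock : View {
    // Every call is recorded so that a test can verify it afterwards.
    val renderCalls = mutableListOf<Model>()

    // Behavior is supplied by the test; unstubbed calls fail loudly.
    var renderAnswer: (Model) -> Boolean = { error("render() was not stubbed") }

    override fun render(model: Model): Boolean {
        renderCalls += model
        return renderAnswer(model)
    }
}

class Controller(private val view: View, private val model: Model) {
    fun start(): Boolean = view.render(model)
}

fun main() {
    val view = ViewMock()
    val fake = Model() // a "fake": a data class filled with empty values

    view.renderAnswer = { true }             // define the mock's behavior
    val rendered = Controller(view, fake).start()

    check(rendered)
    check(view.renderCalls == listOf(fake))  // verify the call and its argument
    println("render was called with $fake")
}
```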
Because this test class has @Mock-annotated properties and @Fake-annotated properties, as we can see here, an injectMocks function will be generated for it by the MocKMP compiler plugin, slash symbol processor. And when you load the project, the Deezer project or any project that uses this system, well, injectMocks is an error because it hasn't been generated yet. So IDEA will show you an error saying: okay, this function just doesn't exist, I don't know what you are talking about. So you need to either build the project or tell Gradle to run KSP and generate the code. At the moment, there is no way around that. And that's because KSP has a very important limitation: it treats the source code it is inspecting as read-only. There is no way with KSP to add properties to, or otherwise modify, a symbol that you are inspecting. So with this test class of mine, there's no way with KSP that I can add a property or an annotation or anything like that. And since there is no reflection in Kotlin Multiplatform, there is no way to look up a class that exists by name: there's no Class.forName. So that means that your code needs to use the code that is generated, and that code doesn't exist until you generate it. That's a small price to pay to use KSP. So why would you use KSP as opposed to writing a full-fledged Kotlin compiler plugin? First and foremost, because KSP provides a kind of stable API. The API changes, but it follows a deprecation cycle. The API of KSP is meant to be public, so they treat it with the respect due to a public API. And also because when you use KSP, you don't have to write a compiler plugin. Writing a Kotlin compiler plugin not only means that you will have to understand the inner components of the Kotlin compiler, which are absolutely not documented. KSP is a little bit documented.
The Kotlin compiler internals are just not documented. Writing a compiler plugin also means that you will have to handle compiler integration and Gradle integration. So you will have to add your own Gradle plugin, you will have to add your own compiler plugin, and it becomes a very complicated endeavor. And finally, because for code-generating use cases, KSP remains a lot simpler than writing a compiler plugin, which once again is done completely in the dark. You won't have any support if you try to write your own compiler plugin. So KSP is still a very, very important tool in the grand open source library of Kotlin Multiplatform projects. A lot of Kotlin Multiplatform libraries use KSP now, and I encourage you to contribute to that grand library. And that's it for me. I just want to say that I represent Kodein Koders here. We are certified for our Kotlin training, so if you want Kotlin Multiplatform training, be sure to contact us. We have lots of libraries that are open source. Romain, with the next talk, is going to present another one of them, and we like to do our open source work with Kotlin Multiplatform for every target it can compile to. So whether you want to contribute to Kotlin Multiplatform libraries or learn how to use Kotlin Multiplatform, be sure to contact us. Thank you very much. Thank you again. We have time for one question. If someone has a question, raise your hand, shout it, and you'll have to repeat the question. Yes? So you've decided to write your own library. Was there any way of making Mockito work instead? Is that possible? So the question is: rather than creating a whole new library for mocking in Kotlin Multiplatform, is there a way to port Mockito to Kotlin Multiplatform? And the answer is definitely no. Mockito uses a lot of reflection, not just for proxies but for object generation and for verification, and it instruments the runtime heavily.
And since there is no JVM runtime in Kotlin Multiplatform, there is no way to port Mockito to Kotlin Multiplatform. Now, what we've tried to do with MocKMP is to emulate the API that Mockito provides, so that when you use MocKMP you feel at home, using an API that is really close, but there's no way to port Mockito itself. Thank you very much, and have a nice FOSDEM. |
Toward better Kotlin Multiplatform architecture with Dependency Injection and KSP |
Let's move on, everyone. We have to keep to the schedule and keep the pace for all our viewers online and the rest of FOSDEM. For this next talk we are welcoming Romain, who is going to talk about dependency injection. Please give Romain a warm welcome. So, thank you for having me. I'm really glad to be here with you. I'm Romain Boisselle, and today I will guide you through our journey of improving the usability of Kodein, a dependency injection library, using KSP. You've already seen the commercial part, so I'll make it really quick. I'm really happy to work with Salomon on advocating for Kotlin as well as providing services and training on it, and I'm really grateful that JetBrains has been rewarding us year after year. So let's talk about the real subject here: open source. We maintain a set of Kotlin Multiplatform tools that are compatible with every target to which we can compile Kotlin, and obviously today we're going to talk about dependency injection with Kodein. So let's team up and see what our problems are today and how we are trying to solve them with KSP. First, a little bit of context on why we use dependency injection in our applications. Let's assume that I have a view model that needs multiple instances of use cases. I will need to initialize every one of those use cases, and their dependencies, and so on. To avoid managing those initializations, we often use the dependency injection pattern. In dependency injection, the DI container has the responsibility of initializing every instance we need to make our objects work, so we can lazily retrieve them with a simple function call. Here we see that we use a generic function called instance; elsewhere we often see get, inject or whatever. In some other libraries or frameworks we often see a huge number of annotations, and that looks like magic to a lot of people. Another problem with the generic instance function is that I don't really know what is behind it.
Is it a singleton or a factory? Do I need to pass an argument? I don't really know. Another problem comes with the DI binding declaration. As a maintainer, I know what's behind all that, but it can be confusing for newcomers. And on top of that, there are no compile-time checks. This means that if you forgot some bindings, you will probably find out when your application crashes or, in most cases, from your dashboard. So did we just create a monster? Not quite. But there is still room for improvement. So let's do what we do best and refactor everything to get a better API and improve the usability of Kodein. Let's welcome KSP. It stands for Kotlin Symbol Processing. A Kotlin symbol processor is a metaprogramming tool that allows us to generate code, generally based on some existing code base. Symbol processors are generally used with annotations that mark the code that needs special treatment by the processor. In short, it's a lightweight compiler plugin, lightweight because it can only generate code, not modify existing code. A quick example: here I have a Foo interface that is annotated to be processed. A Kotlin symbol processor should be able to generate a concrete class implementing this Foo interface, overriding the check function. So, back to our initial problem. What do we need to improve the Kodein API? First, we need a readable and typed API, so that we can avoid using this generic instance function everywhere. And as we cannot easily have compile-time checks, we at least want an easy way to check the consistency of the dependency container with a few lines of unit tests. Note that the APIs you'll see today are still a work in progress; they may change a little before landing in a release. The main idea is that you should be able to declare one or more interfaces that represent the dependencies you need in your application, at least the ones that you need to retrieve.
After that, we need to annotate our interface so that the Kotlin symbol processor can generate some code that interacts with the DI container to retrieve your dependencies in a typed way. So let's introduce the @Resolve annotation for that. Once we have annotated our interface, the Kotlin symbol processor should be able to detect it and generate code for us. In this case, you can see that we need a parameter to create a BrowserService, so the processor will know that it has to interact with a factory that takes a string as a parameter to get a BrowserService instance, while the controller function just needs to return a simple instance, depending on the context. Here is what the Kotlin symbol processor will generate for us: a concrete class implementing our AppDependencies interface that takes a DI container as input. The DI container will be used to retrieve the proper instances: here, either a factory, because the processor can detect that we need a parameter, or a simple instance for the controller. Now let's see how we can use it in our application code. First we need to declare our DI bindings, which our implementation will interact with. For that, we still use our current API, Kodein's DSL. We will improve it a little bit, but we will mostly keep it as it is. Why? Because we have history with solutions like Dagger that have gone wild with annotations. That's even why Kodein was created in the first place: to avoid that forest of annotations. So we didn't want to introduce tons of annotations again and go full circle. Finally, we think our binding declaration API is good enough to express our dependencies. So here are the bindings we need to meet the AppDependencies interface contract. We need a factory that takes a string and returns a service, and a singleton that returns an instance of a controller that itself needs an instance of a service. We'll come to that later.
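To make the idea concrete without depending on Kodein itself, here is a self-contained sketch. Container is a deliberately tiny stand-in, not Kodein's API, and GeneratedAppDependencies is written by hand here to show the kind of class the processor is described as generating. Unlike the singleton in the talk, this stand-in creates a new Controller on every lookup.

```kotlin
import kotlin.reflect.KClass

class BrowserService(val name: String)
class Controller(val service: BrowserService)

// Hypothetical minimal container: one provider per type, with an optional argument.
class Container {
    private val providers = mutableMapOf<KClass<*>, (Any?) -> Any>()

    fun <T : Any> bind(type: KClass<T>, provider: (Any?) -> T) {
        providers[type] = provider
    }

    @Suppress("UNCHECKED_CAST")
    fun <T : Any> get(type: KClass<T>, arg: Any? = null): T {
        val provider = providers[type] ?: error("No binding for ${type.simpleName}")
        return provider(arg) as T
    }
}

// The contract the application declares (and would annotate).
interface AppDependencies {
    fun browserService(name: String): BrowserService
    val controller: Controller
}

// Stand-in for generated code: typed accessors delegating to the container.
class GeneratedAppDependencies(private val container: Container) : AppDependencies {
    // A parameter in the contract means a factory lookup with an argument.
    override fun browserService(name: String): BrowserService =
        container.get(BrowserService::class, name)

    // No parameter means a plain instance lookup.
    override val controller: Controller
        get() = container.get(Controller::class)
}

fun buildContainer(): Container = Container().apply {
    bind(BrowserService::class) { arg -> BrowserService(arg as String) }
    bind(Controller::class) { Controller(get(BrowserService::class, "default")) }
}

fun main() {
    val deps: AppDependencies = GeneratedAppDependencies(buildContainer())
    println(deps.browserService("web").name)
    println(deps.controller.service.name)
}
```

Call sites only ever see AppDependencies, so whether something is a factory or an instance is visible in the function signature instead of being hidden behind a generic lookup.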
So now we can instantiate the generated class with our DI container and retrieve our dependencies with a truly typed API. As we don't want our users to know how we are generating our function, or class implementation, sorry, we also introduced an extension function that instantiates the app dependencies for us. So now that we are able to retrieve our dependencies with a truly typed API, let's see how we can check their consistency. For that, we introduced a DI resolver interface that only needs to respect the contract of a check function. So now we can combine the DI resolver interface with the @Resolve annotation. In an ideal world, the @Resolve annotation would be able to add that DI resolver type to our interface itself. But as we are using a Kotlin symbol processor, we can only generate code, not modify existing code. So now that we are fully equipped with our annotation and our DI resolver interface, the Kotlin symbol processor will be able to generate the overriding check function and create a requirement for every one of the functions or accessors we have defined in our app dependencies interface. Before, with that code, you would have taken the risk of going to production without easily knowing if you forgot some bindings. No more. Now we just add a test and, as we saw before, we instantiate our app dependencies interface with a concrete class and just call the check function. That way, if you missed a binding, your test suite will warn you instantly with a proper message. So here, we see that we missed a factory binding that returns a browser service and takes a string argument as input. One more thing. Earlier, we saw that the Kodein binding DSL was impacted by the use of those instance functions. So let's see how we can improve this usage as well. Let's take this example, where I need a controller that itself needs a use case, which also needs a service. Here is the binding we would have defined to meet our architecture expectations.
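Returning to the consistency check described in this passage, the following is a small sketch of the idea under stated assumptions: a resolver interface with a single `check` function, and a hand-written class that lists every binding the contract requires and fails with a readable message when one is missing. `BindingRegistry`, `GeneratedResolver`, and the string keys are hypothetical names made up for illustration; only the "interface with a check function" shape comes from the talk.

```kotlin
// Contract from the talk: a resolver only has to offer a check function.
interface DIResolver {
    fun check()
}

// Hypothetical registry of declared bindings, keyed by a readable description.
class BindingRegistry(private val declared: Set<String>) {
    fun missing(required: List<String>): List<String> =
        required.filterNot { it in declared }
}

// Hand-written stand-in for generated code: one requirement per function
// or accessor of the dependencies contract.
class GeneratedResolver(private val registry: BindingRegistry) : DIResolver {
    private val required = listOf(
        "factory<String, BrowserService>",
        "singleton<Controller>",
    )

    override fun check() {
        val missing = registry.missing(required)
        if (missing.isNotEmpty()) {
            throw IllegalStateException("Missing bindings: " + missing.joinToString())
        }
    }
}
```

A unit test then only needs to build the resolver from the real binding declarations and call `check()`; a forgotten factory shows up as a failing test instead of a production crash.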
So you have probably seen me coming with those instance functions. In an explicit world, we would have written those functions with their targeted types, so we would see that we need a service and a use case. So why not use a typed API and get these instances directly? This is not that simple, because in the Kodein DSL, `this` is of type DI builder, so it doesn't know anything about the app dependencies API. Thus, the Kotlin symbol processor will also generate a new function for us that is called `of` followed by the name of the app dependencies contract we have. It creates a new DI builder that is aware of both the DI builder API and the app dependencies, allowing us to call straightforward functions to get our instances, as long as they are part of the app dependencies API, obviously. So as a result of typed dependencies, I can now easily check the consistency of my DI container or retrieve dependencies in my application. So I feel the tension and excitement in the room, right? No? Same question on everybody's mind: is this live? I can spoil it right away. It's still a work in progress and we still have some corner cases to crack, like how to use tags, how to manage modules, how to declare and handle scopes and contexts, and a few more. Here is a sneak peek of some ideas we have to solve those problems. An easy one is the tag API, where we can retrieve a dependency by passing a tag parameter. We could add an annotation with that tag and easily retrieve our dependencies without having to pass or know the tag that is needed behind the scenes. For module management, we could just provide two ways to interact with dependencies, either with a fully packed DI container or with a DI module, or add some parameters to the annotations, or create new annotations again. We'll see. Maybe not. A more complicated use case is how to handle scopes. With Kodein, we can define some bindings that have a life cycle depending on a scope or context and retrieve them with their own function.
So to define a scope with KSP, we could declare that a contract is scoped entirely and then be able to retrieve our dependencies with the right context. Maybe this rings a bell. It sounds like context receivers, right? But unfortunately, context receivers are not available in Kotlin Multiplatform yet, so we'll have to find another way. Still, we have plenty of ideas to make this work. This is an open source project, so obviously, if you want to contribute, you're welcome. That's it for me. Thank you for listening. If you have some questions, I think we have time for them. Thank you very much again. We do have quite a lot of time for questions now, so please raise your hand if you have any. Just shout it and you will have to repeat it. So the question is, is there some support in the IDE? As with the mocking library, you will need to build your code the first time to get the right APIs, like the new function and the `ofAppDependencies` function, for example. No. All right. I think it's clear. One question. With this dependency injection framework, are all dependencies compiled statically, like the dependency injection part, or is that still dynamic at runtime? So are the injected dependencies resolved statically or at runtime? Is this your question? They are resolved at runtime, but as we saw, we provide tools to check with your test suite that all your bindings are well defined. There is no reflection, as Salomon showed you earlier. That's how Dagger works, for example. So it's basically a middle ground between Spring and Dagger. There was another. I think it's a bit of an obvious question. The question is, how does it compete with Koin? I think we have taken a different route from Koin. I think Arnaud will present it to you this afternoon. It provides an API using annotations, more than we are doing. We prefer keeping the binding declarations as explicit as possible.
Beyond that, I think the differences are mostly in the internal implementation, so I don't think the two really compete. |
KRuMP - Kotlin-Rust-Multiplatform?!
How to write bugs once and ship them to many platforms. |
All right, folks, here we go. Now on stage we have Matthias Geisler, this guy, and he just mentioned that if you're interested in the slides, we will find a way to share them with you, and I hope to get the other slides as well. Check out our channels for the IntelliJ giveaway on Slack, on Twitter, and on Mastodon. That's your mic now. Hello, it's the first time I speak with a microphone, to be honest, so it's super new to me. Anyway, let's speak about Kotlin and Rust. Isn't that a great topic for the day? And about how we can write one bug and ship it to as many platforms as we can. I think that is a very good idea, we should do that. Before we start, just for your sanity: as Holger just said, we will provide you the slides, and I also have an example repository that I will share with you somehow as well, so you don't need to worry, it's all in there. So what I will speak about today is: why should we ever try to put Rust into Kotlin Multiplatform? By the way, are there any Rust developers here? One. Great, a unicorn. So why would we even want to have Rust in our KMP? What does that look like? What are the cons and the pitfalls? And I can tell you there are many of them. And what are the to-dos if you want to progress further with the topic? And by the way, this wonderful slide deck is provided by my employer. I have to make a short advertisement because they pay for the ride. I've done that now. Great, why Rust? Kotlin is great. And multiplatform is great. I hope we all agree on that, because we all want only one language to ship our code to all the platforms. Maybe a second language is okay, but three is already too much. Ideally, we just want to have one. So as I said, Kotlin is great. I really love Kotlin, and I have been doing it for a couple of years now, especially multiplatform. I developed a couple of SDKs, and I also have my own mocking library, so check it out.
However, what is to be said about Kotlin is that some functionality is missing which is not super easy to implement, but fortunately Rust obviously has that already. For example, crypto libraries. That's a big thing. Or currently there is this ICU package in development, which deals with date-time formatting and other good stuff that you might already have stumbled upon in Android. But a multiplatform version of that is very, very unlikely, because the crate itself is developed by the language people themselves, and I didn't see any activity in this regard for multiplatform. Also, there might be some behavior in multiplatform libraries you wouldn't expect, especially not in core libraries like datetime. If you compare the output of time zones, for example, you will be surprised. Let me put it like that. If you write some tests and you think the results should be the same, I can tell you they are not. So also here, if you run into such a case, maybe the functionality is already covered in Rust, so you don't have to reinvent the wheel, which is also great. And Kotlin/Native itself, if you ever had to write it, can really be a menace. It just doesn't look great. It's super hard to read sometimes, and for that reason it's also super hard to maintain. If you have a native language that you can write at the bottom level, like Rust, it's much better, because the code itself is much more readable. And of course, you get much better performance out of Rust, which you need, for example, if you do machine learning. That's also a good argument. So what are the soft arguments? For example, if you speak with some businesses nowadays, startups and so on, you will discover that people are more likely to adopt Rust than Kotlin, which was super surprising for me as well. But I have spoken with a lot of people over the last couple of months.
And I have no good answer why this is the case, even with KMP around, which in most cases, I think, is sufficient. Another business requirement is that the business itself might want to have the smallest team size possible, for budget reasons alone. If you can do it with maybe two people instead of four, they will go for the solution with two people, which may be backed by Rust. And one very big argument for Rust is Wasm support, which Kotlin is completely lacking right now. It is in development, I know, but it's far from the state Rust is already in. And we also have to think about ourselves as developers. If you are not aware, last year was a very great year not only for Kotlin, due to the new memory management model, but also for Rust. A lot of stuff happened in Rust, and 1Password launched their libraries, which are written in Rust, with only a very thin front-end layer done on the platforms. They also have an open source library for that. I will provide you with the link later when the slide deck goes around. So check it out. And the interesting thing here is that they took the Rust layer up to the view model. So right up to the top. And that's quite impressive with this kind of complexity. And what is maybe important for us, or at least for me: I really love the idea of multiplatform, as I already stated, and driving it further as much as possible is, I think, a good goal. And of course, because we can do it, maybe we should try it out. So what does that look like? Not to scare you, I have just a small portion of code examples. What you do on Android and the JVM is basically JNI bindings. You might have seen that. You might already know that, and I may be boring you with it. But that is what you have to deal with. The good news here is that there is already very good tooling for Rust. So for example, what you see there, can you see my mouse here? Yes.
It's coming from a crate in Rust, the jni crate, surprise, which provides you all the wrapping and transferring from the JVM into Rust and back, and which is super nice and even easy to use. So just be aware of that. So how does that look on the Kotlin side? Something like this. You have a couple of definitions. In this line, you just load the library, and if you've done it all correctly and somehow also shipped the library with your resources, that's it. Then you can use it like a normal library in Kotlin itself, which is great, I think. So here are some helpers for you. As I already mentioned, there's this jni crate. There's also an ndk crate for Android, which is also super nice to use. So check it out. Flapigen is a generator for bindings in Rust for Kotlin and Java. I don't like it, to be honest, because it takes away much of the flexibility that you might need in some cases. Handwritten stuff might be more work-intensive, but less error-prone. But that's only my opinion. This native loader is just there so that you can easily ship the library, and this plugin you will need simply to compile the entire Rust part into your Android project. So how does that look on JavaScript? That's an easy one, because we don't ship it as something we bind directly; we bind it indirectly over Wasm. As I already mentioned, Rust has super cool Wasm support and has all the crates to back it up. So that's on the Rust side, like the other function, and that's on the Kotlin side. That's a little bit more work-intensive, because you have to load it asynchronously, which you see here. And that brings a trade-off with it, because the entire library, or the entire binding, has to be asynchronous. You then work with promises and so on, and that can be super annoying, especially if you then go with React and so on. But it works, and it works sufficiently and surprisingly well.
And by the way, this is also a reason why many, many web developers are going for Rust these days. And you also have, as I mentioned, proper tooling for that. You have wasm-pack, which does all the hard work for you. npm is just a bundler, you perhaps know that already. And you need to have an understanding of promises, that's it. So now we come to native, and that's the hardest nut to crack, to be honest. What you have to do, basically, is write a C binding and a C bridge, which just means you allocate your stuff on the native side, ship it to the other side in Kotlin, and somehow get your allocated stuff deallocated. This has the most potential for memory leaks, but it works, sometimes surprisingly. You will also encounter many problems, especially with the linker when it comes to iOS, but it still works, just not well. So that's the Kotlin side. As I said, here you see the deallocation, and this is just the bridging. You copy the entire string, put it in, get the result back, copy it back, and then you deallocate. So it's not that much, you could possibly even automate it, but it's not nice yet. Maybe there will be something better in the future, who knows. Also, one key ingredient is writing a definition file. You might have seen that if you ever had to deal with Kotlin/Native, and that's also something which can take a lot of time, because it's also very error-prone in my experience. Maybe somebody else has better skills than me. I would gladly hear about it. So to do Rust with Kotlin/Native, you first of all need some Gradle skills, just to make it all work together. You need a lot of patience, a lot of time, and good nerves. Because sometimes it's really like that: one day it works, then you have an update of, let's say, Xcode, and then it doesn't work anymore, and that hit me today. Because I ran an update overnight, and I thought maybe it's a good idea, but it wasn't.
So currently my native part isn't compiling anymore, because the linker simply says no. So what are the cons and the pitfalls? First of all, debugging is tricky. You don't want to debug this at all, because you have the Kotlin stack and you have the Rust stack. So my advice here is: write tests, do TDD, that will take all the heat out of it. So far I have tried that, to be honest. And this especially applies if you do concurrency: if you do something with threads on the Kotlin side and something with threads on the Rust side, it gets even harder. So test your code and even go for TDD. It will save you a lot of time and nerves. Rust is bad at reactive programming, or let's say it doesn't have this pattern yet. Maybe there is something coming up in the future. I would love to see that in Tokio, and that's something I was talking about with somebody from 1Password, because if you go up to the view model at the top, it's a little bit different to deal with this kind of stuff: you then have to write the view model in a way that takes over the whole reactive stuff, and so on. That's also not that nice, and it can pollute your entire architecture a little bit. About the C bridge, I already spoke a little bit, so that's not fun. I spoke about this point for JavaScript, and complex data, that's also a thing. Do yourself a favor, just don't try to bind single fields or properties into Rust. Take your data structure, just make a data class or something, serialize it, and then send it to Rust as a string. It's much easier. You save a lot of time and also a lot of nerves. And here is a hint: I tried a little bit and also spoke with other people, and the best and fastest approach in the current state is JSON. The Rust JSON parser is really, really fast. And you can even use JSON Schema, sharing the schema on both platforms, so you still have one definition and it works on both sides. That's actually nice.
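The "serialize it and send it as a string" advice can be sketched on the Kotlin side as follows. This is only an illustration under assumptions: `AddRequest`, `NativeBridge`, and `addViaBridge` are hypothetical names, the bridge is a plain Kotlin interface standing in for the real native call, and the hand-rolled JSON writer (which escapes only backslashes and quotes) keeps the sketch dependency-free where a real project would more likely use a serialization library.

```kotlin
// One data class per message instead of binding individual fields.
data class AddRequest(val left: String, val right: String)

// Minimal JSON string escaping: backslash first, then double quotes.
// Control characters are deliberately not handled in this sketch.
fun jsonString(value: String): String =
    "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

fun AddRequest.toJson(): String =
    "{\"left\":" + jsonString(left) + ",\"right\":" + jsonString(right) + "}"

// Stand-in for the native entry point: one string in, one string out.
// In a real setup this would be backed by JNI, Wasm, or a C bridge.
fun interface NativeBridge {
    fun call(requestJson: String): String
}

// The whole boundary crossing is a single string argument and a single string result.
fun addViaBridge(bridge: NativeBridge, request: AddRequest): String =
    bridge.call(request.toJson())
```

The design point is that the boundary stays one function wide no matter how the data class grows, so only the serialization on each side has to change.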
So first, don't expect help or guidance or anything from anybody, because that's just an open field. I know there are some crazy people out there who are doing that, like me, I don't know why, maybe they have no friends, like me, but don't expect any help. And for your own sanity, try to at least find or speak with Rust people; they understand the pain. Kotlin people don't anymore, since the new memory model is here and you don't have freezing errors or the like. And the learning curve is quite steep, because you basically learn a new language, and if you have ever tried out Rust a little bit, you will know there is a very big valley of pain. And the last one is simply this: if you know Kotlin/Native a little bit, you know that you can't just share things if you ship different modules via XCFrameworks or Swift Package Manager. If you have one library, you can't just share it between them, you have to build a monolith, and this also applies here, because it's also statically linked, so it's not nice. And what are the to-dos? I'm currently working on a plugin to make it much more convenient, so nobody has to go through the pain I had. I try to make examples, but if you go down this path as well and you are interested, reach out to me, go to your local Rust community, write on Rust stack; there will be people who appreciate that, and you can educate others, of course. Also, the bridging layer could maybe be done in KSP, who knows, and not only in Rust; that would be nice. So just before we come to the questions, two nice projects are looking for maintainers, please take a photo. In Scandico codes, one of the maintainers is actually here to be asked, maybe raise your hand, say hello to the people. And just because they are paying for the trip: my company is hiring. So if you want an Android position and to work together with me, there's a chance.
And we have questions now, and I also prepared a high-end demo, as you see, that's all written in Compose, and it goes over Rust. I have bridged big, big numbers, so you can start with the questions right now while I'm just copy-pasting stuff. So let's hope nothing crashes. It actually crashed, I guess. Questions? I heard that JNI was being replaced by something else, I forgot what. There was JNA, something like that. That's completely new to me. I just worked with JNI because it works, sometimes. So somebody else, maybe from the Rust unicorns: would you do Kotlin also? No? Shame on you. So, okay, I hope I don't... ah, one is there. Yeah, I didn't try that. I was just asked if I used... no, damn it, I'm a little bit... I know of the library, but didn't use it so far. So maybe later I can report back. Okay, it seems I talked you down, great, that's a good skill. Then we are done. Thank you very much again, we're going to have 10, 12 minutes till the next talk starts. |
Kotlin Multiplatform for Android & iOS library developers
Tips for writing Kotlin Multiplatform Android/iOS libraries |
So, hi, everyone. Welcome back to the next talk. Today we have on stage Anna Labellarte and Paolo Rotolo from Nextome, talking about Kotlin Multiplatform for Android and iOS library developers, I guess it was. Now, yeah, they are gonna talk about it. Awesome. Thank you. Yes, I'm Paolo and it's a real pleasure to be here with you. We come from Italy, from a small company in southern Italy, and we decided to introduce Kotlin Multiplatform into our code about a year ago. We did this because for us it was easier to develop and we also wanted to share as much code as possible. We make libraries, so we didn't have a UI part to bring to multiplatform, and this is our journey in the multiplatform world. Now, we've seen during this conference that with Kotlin Multiplatform Mobile we can develop a library that targets both Android and iOS by just writing a single code base in Kotlin. Now, if we look at what happens when we distribute the JAR inside an Android library, we can say that the process is pretty straightforward. For an Android developer, the language is the same, the IDE is the same, and most of the libraries we can use are the same. So we can't tell the difference between a library that was made with Kotlin Multiplatform Mobile and one made with the native tooling. Now, things are not quite the same when we talk about the iOS part. If we distribute the framework, the process is not so straightforward. Most of the problems arise because the code is converted to Objective-C, and most of the time it will be used inside projects that have Swift as the main language. From a Swift developer's point of view, what can sometimes happen is that the API we expose just looks strange, and this is the best-case scenario. Other times, the app may just crash, and this is due to some differences between the two platforms that aren't automatically translated for us by the compiler.
So we will see during this talk what happens when we distribute the framework and what we can do to make libraries enjoyable also on the iOS side. Now, let's start with a simple example. In this case, we talk about coroutines. Inside our common code, we can use coroutines. So we can have a function like this, which is a suspend function because it performs some operations on the network and some interaction with a persistence layer. Now, on Android, we don't have many issues, because if we have a coroutine scope, we can launch the coroutine. But what happens on iOS, where we don't have the coroutine library? This function gets converted into two alternatives. The first one uses a completion handler, which is basically a callback that gets called either when we have a result, in which case the to-do variable will be populated, or when we have an error, in which case the error variable will be populated. Now, if we don't want to use the callback, because it can become cumbersome when we have different functions nested one inside the other, we can use the second alternative, which uses async and await, but we have to target at least iOS 13. Now, if we go back to Android, we launch the coroutine inside the scope, and this means that if I don't need the job anymore, I can just cancel the scope and the job gets cancelled too. By default, I don't have this power on iOS. So what we can do is try to use a library called Koru. Yes. We fixed that problem with Koru, which is actually a library inspired by a blog post from Touchlab. So thanks to Touchlab for this. How does it work? Basically, you have to include it, of course, in the common dependencies of your code. This library introduces a new annotation, `@ToNativeClass`. With that annotation, you can specify a name for a class that will be generated just in the iOS implementation of your code.
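To make the completion-handler shape mentioned in this passage concrete, here is a hand-written Kotlin sketch of the same idea: a suspend function wrapped so that a callback receives either a result or an error. It uses only the standard library's `kotlin.coroutines` primitives. `Todo`, `loadTodos`, and `loadTodosWithCallback` are hypothetical names, and this is not the code the Kotlin/Native compiler or Koru actually produces.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

data class Todo(val id: Int, val title: String)

// Hypothetical repository call; a real one would hit the network and a persistence layer.
suspend fun loadTodos(fail: Boolean): List<Todo> {
    if (fail) throw IllegalStateException("network down")
    return listOf(Todo(1, "Write slides"))
}

// Callback-style wrapper: exactly one of the two handler arguments is non-null.
fun loadTodosWithCallback(fail: Boolean, completionHandler: (List<Todo>?, Throwable?) -> Unit) {
    val block: suspend () -> List<Todo> = { loadTodos(fail) }
    // startCoroutine runs the suspend block and reports its outcome to the continuation,
    // including exceptions thrown inside the block.
    block.startCoroutine(
        Continuation<List<Todo>>(EmptyCoroutineContext) { result ->
            completionHandler(result.getOrNull(), result.exceptionOrNull())
        }
    )
}
```

Note that this wrapper offers no way to cancel the work once started, which is the gap the scope-based approach in the talk is meant to close.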
So if we have a look at the generated class, we can see that it is basically a wrapper of our original repository. Two parameters are passed: we have wrapped, which is the original repository, and we have a scope provider. We'll see what a scope provider is later. All the methods of the generated class are the same methods that you have in the original repository, but the result type, a list of to-dos, is wrapped in another object. So if we try to use this on iOS, the code that gets generated is something like this. We actually have two callbacks now, but I can see, for example, two other problems with that code. First of all, we are exposing a coroutine scope to iOS developers, and maybe iOS developers are not as familiar with the concept of a scope as Kotlin developers are. Also, that object is now an NSArray, and it is not a list of to-dos anymore. That is because we wrapped the list of to-dos in another object that accepts generics, and the Objective-C translation of the Kotlin code cannot express that for now, so it gets simplified to an NSArray. To solve this, we go back to the common code and use another feature of that library, `launchOnScope`. We can define a scope in common code and tell the library to run all our coroutines in that scope, so the scope will not be provided by iOS developers anymore. For the second problem, there is just a workaround: we can define a data class, a typealias, or something else to hide the fact that we are using a list of something. So if we generate new code for this, this is the final code, which is much more readable and usable. And, of course, now we can dismiss the job if we are not interested in the result anymore. We talked about coroutines. What about flows? This is an example of a simple flow that emits integers. Of course, again, on Android it is simple: you have a scope, you can start collecting values.
On iOS, the code that gets generated is something like this. We still have a collect function to collect the values of the flow, but we have to pass this, which is a flow collector, so we have to implement the emit function, do something with the value that gets emitted, and then, when we are done, call the completion handler so we can receive the next value. Also, notice that we don't have the type that we are collecting from the flow, we just have Any. We can improve this. First of all, we tried to make that collector generic, so we could use it in more parts of our code, and so we exposed another callback and cast the value to the type that we wanted, but we found that this is not good enough, also because the iOS developer has to do this in their code, we are not doing it in common code, so every time they use our library, they have to define this. So we decided to fix it in common code, and again, we used this, which is a common flow. It is a class actually found in the KotlinConf app, and this class wraps a flow, emits all the values of the flow, and returns a Closeable object, so you can dismiss the flow when you don't want to listen to it anymore. So again, we now return a common flow using the extension function, and on iOS we now have much more readable code that we can also cancel if we want. Now, we mentioned before that sometimes the app may crash because of the differences between the two platforms, and one great example of this is how exceptions are handled in the two languages, because Kotlin only works with unchecked exceptions, while Swift only works with checked ones, and now we will see what this means and what happens.
So we are bringing back the function from the coroutine example that we saw before. In this case, because in Kotlin I don't have to explicitly mark each throwing function, I can wrap it inside a try-catch, so if something happens, I will receive the error inside the callback. Now, if we also bring back the first alternative that we saw before with coroutines, what I expect is to have the error in the callback, but if I launch the application, it actually crashes, and this is because in Swift I have to explicitly mark each throwing function. The fix is actually quite easy, because there is an annotation that we can use, which is called `@Throws`. By adding this just in the common code, we don't have to make any changes inside the Swift implementation, and in this case I will receive the error in the callback. This also works with non-suspending functions: if I have this function and mark it with `@Throws`, once I compile the code, the generated function in Swift will be marked as throwing, so this time it will be the compiler that forces us to handle the exception. Now, another API that is not quite Swift-friendly is sealed classes.
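Before moving on, the `@Throws` fix described in this passage can be sketched in a few lines of Kotlin. The function and exception names are hypothetical; the sketch only shows where the annotation goes. On the Kotlin side nothing changes for callers, since the annotation exists to tell the Objective-C/Swift export which exceptions to surface as errors.

```kotlin
// Hypothetical domain exception used for illustration.
class TodoNotFoundException(id: Int) : Exception("No to-do with id $id")

// Declaring the exception lets the generated Swift signature be marked as throwing,
// so Swift callers are made to handle it instead of the app crashing.
@Throws(TodoNotFoundException::class)
fun findTitle(id: Int): String {
    val titles = mapOf(1 to "Write slides")
    return titles[id] ?: throw TodoNotFoundException(id)
}
```

Kotlin callers may still wrap the call in a try-catch or ignore it entirely, because Kotlin has no checked exceptions.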
Now, in Kotlin, we can restrict inheritance by using sealed classes and sealed interfaces, so when we use them inside our Android code, we can just write something like this, because we know for sure that those three, data, error, and loading in this case, are the only cases that we have to handle. But on iOS, it gets converted using plain inheritance, so when I have to handle the status, I also have to define a default case, which I know for sure will never be called. In Swift, we actually have a concept similar to sealed classes, which is the enum, so what we want is to map sealed classes to enums. To do so, we can use a library which is called, the slide is quite dark, MOKO KSwift. This library automatically detects any sealed classes and sealed interfaces and generates, in this case, UIStateKs, which just takes the status as input and is actually an enum that I can use. For a Swift developer, this is much easier to use, because I don't have to define a default case anymore. So, if you're writing code that is platform-specific, for Android, for example, you will probably need a context to access some system functionality. What happens in the library ecosystem? You may expose an API like this that gets the context from the user of the library, but of course you don't have to do this on iOS, because you don't need a context on iOS. How can you improve that, how can you unify those APIs? We tried to use Jetpack App Startup for this, because if you include App Startup in your library, you will basically be able to get a context that is injected by the operating system, and maybe save it. |
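For reference, the Kotlin side of the sealed-class point made earlier in this talk can be sketched as below. The three cases (data, error, loading) come from the talk; the type and function names are assumptions. The `when` needs no `else` branch because the compiler knows the hierarchy is closed, which is the guarantee that is lost in the plain Objective-C export.

```kotlin
// A closed hierarchy: these three are the only possible states.
sealed interface UiState {
    data class Data(val items: List<String>) : UiState
    data class Error(val message: String) : UiState
    object Loading : UiState
}

// Exhaustive `when`: adding a fourth state would make this fail to compile
// until the new case is handled.
fun render(state: UiState): String = when (state) {
    is UiState.Data -> "Showing ${state.items.size} items"
    is UiState.Error -> "Error: ${state.message}"
    UiState.Loading -> "Loading..."
}
```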
Functional fun in Kotlin
A 20 minute run through modern FP in Kotlin |
Thank you, everyone. It's so great being here. So I'm Simon and I'm a developer, a tech lead engineer at Xebia Functional. And a little bit about myself: I've been doing Kotlin since 2015. I've been doing functional programming a bit longer than that. And, well, I'm really in love with both things. So we try to improve things as much as we can. I'm also one of the lead maintainers of Arrow, which is a functional library in Kotlin. So today I want to talk a little bit about functional programming in Kotlin. There are three big topics that we often talk about, which are dependency injection, side effects, and typed errors. But I don't want to go into comparing this to different languages like Haskell or Scala, because I only have 20 minutes, so I need to, you know, pick my topics. So I want to talk about these three things. I will maybe sometimes refer to things from other languages, but if you do not follow or know them, it's not really important. So first of all, dependency injection. I'm also not going to go into why we do dependency injection, but often when we are writing programs, you might have something like this, where you have a database on which you can run some queries, and a logger with which you can log some stuff. And then you need to write a program that uses these two things to build some logic. So the most vanilla function that we can probably write in Kotlin might look something like this, where you have a fetch user function that takes in an ID and both dependencies; we run some query, we make a log statement, and we return a value, right? So this is the most vanilla, pure functional function signature that we can write in Kotlin. But it's not really that great, because if you want to write some other code using this function, we always need to wire and pass these parameters around manually.
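As the slide is not in the transcript, here is a minimal sketch of that "vanilla" version. The `Database` and `Logger` interfaces, the `User` type, and the query text are assumptions chosen to match the description: a function that takes an ID plus both dependencies, and a second function that has to thread them through by hand.

```kotlin
data class User(val id: Int, val name: String)

// Assumed shapes of the two dependencies described in the talk.
fun interface Database {
    fun query(sql: String): String
}

fun interface Logger {
    fun log(message: String)
}

// Everything the function needs is visible in its signature.
fun fetchUser(id: Int, database: Database, logger: Logger): User {
    val name = database.query("SELECT name FROM users WHERE id = $id")
    logger.log("Fetched user $id")
    return User(id, name)
}

// The cost: every caller has to accept and pass both dependencies along.
fun fetchUsers(ids: List<Int>, database: Database, logger: Logger): List<User> =
    ids.map { id -> fetchUser(id, database, logger) }
```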
So you will see in your code that after a while, you're always manually passing all these parameters around and wiring all this stuff. It causes quite some boilerplate. And typically, it's not very interesting to read, because we are really only interested in fetching the user with the ID, and all the other stuff might be a little bit sidetracking or making our code more complex. So, for the sake of mentioning all the different techniques for doing dependency injection, I wanted to include this version. It is not the one that I'm going to recommend, but I'm going to cover it anyway. So for those that are not familiar with this pattern, you can write an extension function fetch user on a generic type that we call context. You can then say that the context needs to extend both the database and the logger. And then inside of the method body, you again get access to the database methods and the logger methods. You can write the same function as before. But this is quite complex to read. It's a pattern that's probably very foreign to most people. And to actually call this method, you now have to create a type that extends both database and logger. So this is probably not an ideal pattern to use to solve this problem. The cool thing, however, is that we can now define a new function again. And as you can see, within the map function of the list, you don't have to pass the parameters around, but we again have to define an extension function on a context which is constrained to having a database and a logger as super types. So this is a bit complex. This is the solution that I do not recommend, but it works in Kotlin today. So as you might have guessed, a really neat solution that is an upcoming feature in Kotlin is called context receivers. You can now annotate or mark your function with the context keyword, and you can say that, to call this function, the context of the database and the logger needs to be available.
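As a rough sketch of the generic extension-function pattern described above (the one the speaker does not recommend), assuming the same hypothetical `Database` and `Logger` interfaces and a made-up `Env` class that exists only to satisfy both constraints:

```kotlin
// Hypothetical types for illustration only.
data class User(val id: Long, val name: String)

interface Database {
    fun query(sql: String): String
}

interface Logger {
    fun log(message: String)
}

// Extension function on a generic receiver that must be both a Database and a Logger.
fun <Ctx> Ctx.fetchUser(id: Long): User where Ctx : Database, Ctx : Logger {
    val name = query("SELECT name FROM users WHERE id = $id") // from Database
    log("fetched user $id")                                   // from Logger
    return User(id, name)
}

// Nothing is passed explicitly inside map, but the constraint has to be repeated here.
fun <Ctx> Ctx.fetchAll(ids: List<Long>): List<User> where Ctx : Database, Ctx : Logger =
    ids.map { id -> fetchUser(id) }

// To call either function, one concrete type has to implement both interfaces.
class Env : Database, Logger {
    val logged = mutableListOf<String>()
    override fun query(sql: String): String = "Ada"
    override fun log(message: String) {
        logged += message
    }
}

fun main() {
    println(Env().fetchAll(listOf(1L, 2L)))
}
```

The `where` clauses and the extra `Env` type are the "type dance" that context receivers are meant to remove.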
And you can only call this function when both of these types are available. And again, inside of the method body, that gives you access to the functions of the database and the logger, because you have constrained your function to say that it can only be called when these types are available. So we get access to the query method and the log method. And to call this method, or to make the context of the database and the logger concrete, we need to say at some point in the program: okay, I'm going to call this function with this instance of the database and with this instance of the logger. And as you can see here, this is valid Kotlin in the current 1.8.0 and even in the 1.7 releases. This is code you can write and compile today, albeit only for the JVM, and you need to explicitly opt in to the experimental context receivers feature. But this is a very neat solution, because we didn't have to do this type dance with the extension function and the where constraints. We don't have to pass the parameters of the fetch user function; this is done automatically for us by the compiler. And this allows us to do dependency injection using context receivers in a very neat way. And of course, what we love about functional programming is that we want everything to be available in the function signature. So here you can very clearly see in the function signature that the database and the logger are required for calling the fetch user function. So a different, very hot topic within functional programming is side effects. Typically, when talking about functional programming, you often say you don't want to do side effects, but we typically have to perform side effects to write useful programs: we have to log something to the system, to the console, or to a server, or we need to call the database to, you know, interact with external systems. So we need side effects to write our programs, but what do we want to do?
We want to track the effects at compile time, which means that you should not be allowed to willy-nilly call side effects anywhere in your program without knowing about it. You want side effects to be composable in a safe manner, and that allows you to reason about your code more clearly. Or rather, it allows you to reason about where side effects happen in your program. So let's again take our two dependencies, the database and the logger. Here we are actually violating that rule, because we are saying we have a regular function query and a regular function log, and they are just performing some side effects underneath. So again, as you might have guessed, we can mark these functions with suspend, which is a feature in the Kotlin language. And now these side effects can be tracked at compile time. So what does it mean for these side effects to be tracked at compile time? Now that we've marked these functions as suspend, if we again take our previous fetch user method, this function will no longer compile, right? So where the red lines are, it will fail to compile. You will see a compiler error saying that the suspend functions query and log should only be called from a coroutine or another suspend function. Since the fetch user function was not marked as suspend, we cannot call these other suspend functions in the method body, right? So that is to say that the side effects that we now mark as suspend are tracked at compile time. Our compiler is now tracking for us that these functions can only be called from a coroutine or from another function that is marked as suspend. So simply by also marking our fetch user function as suspend, we propagate that, you know, the fetch user function is side-effecting in its function body. And now anybody that calls the fetch user function also needs to state that it performs some side effects.
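A minimal sketch of that compile-time tracking, using the same hypothetical `Database` and `Logger` interfaces as before. The `runSketch` helper is not from the talk: it is an assumption added so the example runs with only the standard library, and it only works for blocks that never actually suspend. Real code would more likely use `runBlocking` from kotlinx.coroutines.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical types for illustration only.
data class User(val id: Long, val name: String)

interface Database {
    suspend fun query(sql: String): String
}

interface Logger {
    suspend fun log(message: String)
}

// Dropping `suspend` from this declaration makes the calls to query and log fail to compile:
// suspend functions may only be called from a coroutine or another suspend function.
suspend fun fetchUser(id: Long, database: Database, logger: Logger): User {
    val name = database.query("SELECT name FROM users WHERE id = $id")
    logger.log("fetched user $id")
    return User(id, name)
}

// Hypothetical bridge from regular code into suspend code (standard library only).
// It assumes the block completes without really suspending.
fun <T> runSketch(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return checkNotNull(outcome) { "block suspended without resuming" }.getOrThrow()
}

fun main() {
    val database = object : Database {
        override suspend fun query(sql: String): String = "Ada"
    }
    val logger = object : Logger {
        override suspend fun log(message: String) = println(message)
    }
    println(runSketch { fetchUser(1L, database, logger) })
}
```

The only place where the effect is "run" is `main`; everything between it and the database call has to carry `suspend` in its signature.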
In some other languages, an IO type is often used, which appears in the return type, and you can compare it to a callback. So for example, in Java we often use callbacks for these kinds of operations, right? If you call the database, you often have to provide a callback, as we also saw in the previous talk, and then the callback will say that it results in either the successful value of type T or a failure of type Throwable, right? And the suspend system does the same thing for us through a technique that is called continuation-passing style. So the compiler automatically takes care of all the heavy lifting. Just to give you an example, if you had to manually rewrite this function using callbacks and continuations, it would look like this. It's quite horrible. It's not nice to read, because these are actually the only lines that we cared about. We see that the code exploded to almost double in size. We have this nesting of callbacks. We get this tree hierarchy in our code. It's not nice to read, and it really obscures what we were actually doing in the code. So thanks to the Kotlin compiler, we get this super nice syntax, and everything is extremely optimized at runtime. So there are really no penalties that we have to pay for this nice syntax. And you can do all this awesome stuff in a very efficient way, thanks to the Kotlin compiler. So really, really neat. And it's very similar in spirit to the context receivers that we already saw for DI. What is actually really neat about suspend in Kotlin, and this is a problem that has not really been solved in any other language that I know about, is that we have this map function on our list. And the map function on our list is of course not suspending. It's a pure function. It goes over every value in the list and maps it from type A to type B. But we can call this suspend function inside, as long as our fetch all function is also marked as suspend.
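To make the last point concrete, here is a small sketch of a suspend call inside the ordinary `map` from the standard library. The types and the `runSketch` helper are assumptions for illustration (the helper only works because nothing here really suspends).

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical types for illustration only.
data class User(val id: Long, val name: String)

interface Database {
    suspend fun query(sql: String): String
}

suspend fun fetchUser(id: Long, database: Database): User =
    User(id, database.query("SELECT name FROM users WHERE id = $id"))

// List.map is a plain inline function, not a suspend one. Because it is inlined into
// fetchAll, the lambda body is compiled as part of this suspend function, so calling
// fetchUser inside it is allowed. A non-inline higher-order function taking a regular
// lambda would reject the same call.
suspend fun fetchAll(ids: List<Long>, database: Database): List<User> =
    ids.map { id -> fetchUser(id, database) }

// Hypothetical standard-library-only runner; assumes the block never really suspends.
fun <T> runSketch(block: suspend () -> T): T {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return checkNotNull(outcome) { "block suspended without resuming" }.getOrThrow()
}

fun main() {
    val database = object : Database {
        override suspend fun query(sql: String): String = "Ada"
    }
    println(runSketch { fetchAll(listOf(1L, 2L), database) })
}
```

No special `mapSuspend` or `traverse`-style function is needed; the regular collection API is enough.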
So this constraint of side effects travels through the map function on the list. The Kotlin compiler is able to track where this is valid, and I'm not going to go too deeply into how. In this case, and in most cases, it is because the map function is inline, so the compiler knows that it can replace the call with the code of the map function inside the body of fetch all, and through this mechanism the side-effecting constraint is allowed to pass through. So the compiler takes care of all of this, and we don't have to know or learn any new method names to combine suspend with any other code. So this is very neat. It allows us to combine these patterns in very elegant ways. And then another thing that we often care about in functional programming is typed errors. More specifically, we again want to track at compile time what some of the expected errors are. And we will see in a little bit what I mean by expected errors. These can be errors that you care about in your domain, things that you can deal with, that you can recover from. And you also want to track them at compile time, as I mentioned before. We want all of these things to appear in our type signature. So in the case of our fetch user function, we can very simply mark our user return type as nullable. We add a question mark at the end, and by that we basically state that this fetch user function can return no result. So in some cases a user might not be found for the ID, and in that case we can return null, and in Kotlin this is type safe. But there are many, many more errors that we can encounter besides just saying, okay, there was no value available for what you were looking for. So instead of fetching a user, let's try inserting a user, and we're going to insert the user with a name and email. And we can now have an error that says, okay, this user is already present in the database. So the user already exists, and we can now define an error which does not have to extend Throwable.
So we have a simple data class that says here is the error for this name with this email address, and in this case I've also enhanced the error with the underlying PostgreSQL error so that I don't lose any information, because we don't want to discard any important information that you might need later on. But now we need to make this type appear in our type signature. A very traditional way of doing this in functional programming is using the Either type, and you can then say the result is either the user already exists error or the valid user. But we've seen all these nice things in Kotlin. We've talked about suspend, which allows us to do callback-based programming without having to use callbacks. We have these context receivers that allow us to inject dependencies without having to manually wire them or explicitly pass them in the parameters of the function. I don't want to have this Either in my return type. I want something more elegant, something that is very similar to the nullable type or the context receiver. So how can we do that? In Arrow, a library that I'm working on, we have this type that we call Raise. And we can also put Raise into the context receiver, basically stating that this function has the capability of resulting in a user already exists error. And now we can see that in the return type we are simply returning a user. So we don't need the Either wrapper anymore. We are just stating in the context of our function that if you are calling this function, you need to be aware that at some point a user already exists error might occur and you need to deal with it. Arrow offers a bunch of very nice DSLs to, for example, wrap our query method, and we can say, okay, I want to catch the PostgreSQL exception from our query statement. And in case the PostgreSQL exception is a unique violation, which means the user already exists in the database, I want to raise this user already exists error.
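For comparison, this is roughly what the "traditional" Either-in-the-return-type style mentioned above looks like. The `Either` type below is a tiny hand-rolled stand-in written for this sketch, not Arrow's `Either`, and the in-memory `existingEmails` set is an assumption that replaces the real database and its unique-constraint exception.

```kotlin
// Hypothetical domain types for illustration only.
data class User(val name: String, val email: String)

// A domain error as a plain data class; it does not extend Throwable.
data class UserAlreadyExists(val name: String, val email: String, val cause: Exception? = null)

// Minimal hand-rolled Either, NOT Arrow's implementation.
sealed interface Either<out E, out A> {
    data class Left<out E>(val error: E) : Either<E, Nothing>
    data class Right<out A>(val value: A) : Either<Nothing, A>
}

// The expected error is visible in the signature, but so is the wrapper type.
fun insertUser(
    existingEmails: MutableSet<String>,
    name: String,
    email: String,
): Either<UserAlreadyExists, User> =
    if (existingEmails.add(email)) Either.Right(User(name, email))
    else Either.Left(UserAlreadyExists(name, email))

// Callers must unwrap the result before they can use the user.
fun describe(result: Either<UserAlreadyExists, User>): String = when (result) {
    is Either.Left -> "error: ${result.error}"
    is Either.Right -> "inserted: ${result.value}"
}

fun main() {
    val emails = mutableSetOf<String>()
    println(describe(insertUser(emails, "Ada", "ada@example.com")))
    println(describe(insertUser(emails, "Ada", "ada@example.com")))
}
```

The wrapper in the return type is exactly what the Raise-based style aims to remove.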
So we can now raise this typed error into the context, saying, okay, the user already exists, so somebody needs to deal with that at a later point. And I'm also rethrowing any other errors, basically saying that any other error is something that I cannot recover from. It's something unexpected, something that you're not going to deal with at a later point in time. What your criteria are for this is of course up to whoever writes the code, or however you want to model your code. And then if you want to call this function, you can again provide the dependencies of database and logger, and you can say, at the edge of the world or wherever you need to call this function, run the effect, right? And this provides the context of Raise that we had before. What is actually interesting here is that Kotlin is able to infer all these types, so we don't have to explicitly provide any type arguments, because it knows that the error that might occur is of the user already exists type. So here it knows that the error that we need to recover from is the user already exists error. And then we can say, okay, fold over this method and either print the error or print the inserted user. Right? So this is a very simple example of the API that is available for this, which is much, much larger. And what is also neat, what I really like about Kotlin, is that we have these special DSLs, catch and raise, and they actually show up as special functions in the IDE. So here the catch method and the raise method show up in pink, stating that they are doing some special kind of DSL functionality that belongs to the Raise capability in the context. So I mentioned Arrow in this last set of slides, with typed errors, so what is the goal of Arrow? It offers this DSL-based functional programming style of dealing with things. And the goal of that is that we want to get rid of a lot of the complexity of functional programming, things like map, flatMap, monad transformers, wrappers in the return types, etc. Right?
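Arrow's real API is not reproduced in this transcript, so the following is only a hand-rolled sketch of the idea behind a raise-and-fold DSL. The `Raise` class, `fold` function, and their signatures are written for this example and are not Arrow's implementation; an extension receiver stands in for the experimental context receiver, and an in-memory set stands in for the database. Explicit type arguments are passed to `fold` here to keep the sketch simple.

```kotlin
// Hypothetical domain types for illustration only.
data class User(val name: String, val email: String)
data class UserAlreadyExists(val name: String, val email: String)

// Minimal stand-in for a "raise" capability. NOT Arrow's Raise.
class Raise<E> {
    // Carries the raised value to the matching fold; remembers which scope raised it.
    class Raised(val scope: Raise<*>, val error: Any?) : RuntimeException()

    fun raise(error: E): Nothing = throw Raised(this, error)
}

// Runs the block with a fresh Raise scope and handles both outcomes.
fun <E, A, B> fold(block: Raise<E>.() -> A, recover: (E) -> B, transform: (A) -> B): B {
    val scope = Raise<E>()
    val value = try {
        block(scope)
    } catch (raised: Raise.Raised) {
        if (raised.scope !== scope) throw raised // belongs to an outer fold
        @Suppress("UNCHECKED_CAST")
        val error = raised.error as E
        return recover(error)
    }
    return transform(value)
}

// The return type is a plain User; the possible error lives in the receiver type.
fun Raise<UserAlreadyExists>.insertUser(
    existingEmails: MutableSet<String>,
    name: String,
    email: String,
): User {
    if (!existingEmails.add(email)) raise(UserAlreadyExists(name, email))
    return User(name, email)
}

fun describeInsert(existingEmails: MutableSet<String>, name: String, email: String): String =
    fold<UserAlreadyExists, User, String>(
        block = { insertUser(existingEmails, name, email) },
        recover = { error -> "error: $error" },
        transform = { user -> "inserted: $user" },
    )

fun main() {
    val emails = mutableSetOf<String>()
    println(describeInsert(emails, "Ada", "ada@example.com"))
    println(describeInsert(emails, "Ada", "ada@example.com"))
}
```

The design point is that `insertUser` reads like ordinary code that returns a `User`, while callers still cannot invoke it without a `Raise<UserAlreadyExists>` in scope, so the expected error stays visible to the compiler.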
And we do this so that we can provide an idiomatic Kotlin syntax for working with functional programming. And what is also really neat is that it is actually Kotlin Multiplatform ready. So for all the talks that we saw this morning, in all of those things you can also already use Arrow and this style of programming. It offers a couple more DSLs. For example, the saga scope. So for people that are working with the saga pattern on the backend, and we saw that backend is a growing industry in Kotlin: when you're working with the saga pattern, again, you don't want to have any wrappers in the return type. So we have the saga DSL that allows you to pair any action with a compensating action, meaning that if something goes wrong in the program, the compensating action will run to compensate for the action. Similarly for resource safety, we can say, okay, I'm going to install some resource, and whenever you're done using the resource, the release function with the log statement is called automatically; there is also the autoCloseable function, which offers some special syntax for JVM functionality. So what do I love about functional programming in Kotlin? We can do it in a very elegant way using DSLs. And all of these DSLs are composable. So it means that you can nest these DSLs in a safe way. They will cooperate with each other. They will do the right thing. They're all type safe. And this offers you a very low threshold for getting into functional programming in Kotlin. You don't have to learn anything about map, flatMap, special monads, or monad transformers. All of these things are not needed, because you can very elegantly nest and compose these DSLs together. It seems that I'm right on time. Just five minutes left for questions, and thank you all so much for your attention. |
Be pushy! Let's join for wider and better Kotlin support worldwide |
Hello, thanks for coming. It's my pleasure to be at FOSDEM in person for the first time. Last time was online, two years ago. So today I want to show you not a single line of Kotlin code and just inspire you, Kotlin developers. Who is already doing Kotlin here? And who is not? It will be easier to see. Okay, so it might be slightly less relevant for you, but it's interesting going forward if you come to like Kotlin and want more support for it. Also, it's no code, which means that soft skills will be required if you want to leverage what you learn or what you might see as inspiration in this short talk. So what can we do to push for better support, wider support for Kotlin? We can plant seeds. For example, when I was first introduced to algorithms with Python, that was a seed that could have led me to a career in data science, maybe, or anything related to Python. That's kind of a seed, and you have seeds everywhere: in the browser, if you right-click, you can see some CSS, HTML, JavaScript. That's also a seed that sometimes leads people to become web developers, or just developers, maybe not web developers. So you can plant seeds. You can also water plants from time to time or every day. Be careful, not too often, or you can just wait for rain. Sometimes just planting a seed is enough. You can also find and train new farmers. It's always a bit better when there are also other people that share your point of view when you want to say, hey, this thing is cool. Because if you are the only guy saying this thing is cool, at some point people are like, but why is he the only person that finds Kotlin cool? So all of that is kind of lobbying. And lobbying might be a big word, but there are multiple ways to see it. I don't see it as a bad word, but anyway. So it can boost Kotlin. I think it does actually boost Kotlin. And so I will use the word lobby just to make things a bit simpler. So where would you want to lobby? Where do you plant the seeds?
So of course online, this is a place where many, many developers find new content, learn about new stuff, and discover things. You can ask. You can ask, for example, for a new feature, or you can ask for an issue to be fixed. You can send a support ticket to an SDK vendor. For example, you can send a support ticket to Firebase: how do I do that with coroutines? I don't see the example in the documentation. That's one such example. Sometimes it's just about telling people, hey, I'm using this. And when they get a bunch of people talking about Kotlin or asking questions, then you raise the priority internally. You don't see it. It's very hard, because you don't have immediate feedback, or no feedback at all, but it does have an impact. You can suggest things. So you can comment on an issue. For example, is there something with a lot of asynchrony in a Java library? You can say, oh, if you also make a Kotlin version, or if you change it to Kotlin, you can make it a bit easier to use with coroutines, for example. But you don't have to be very pushy. You can just suggest that alternative. And then maybe other people will say, oh, yeah, that would be a good idea. Or maybe no one will. But you are planting a seed. You can also suggest collaborating on open source projects, even if they are not in Kotlin in the first place. Contributing is also a way to reach better or wider support for Kotlin, because when a library already exists in Kotlin, well, you don't have to write it. So it's easier to do whatever you want. You can build on the community's previous work that has now been done, thanks maybe to you. Blog posts: you can be a guest writer. So instead of writing blog posts in a Kotlin-only community, you can find a way to get your post into a publication that is about any programming language. And then people that are just curious also hear about Kotlin before they are very deep into the job. And of course offline. Like now, although it's kind of online as well.
You can talk to folks at work, at events, at the bar, anywhere. Or maybe you are just meeting someone who says, oh, yeah, I like developing, I found the inspector. Oh, yeah, cool. And what do you like? Making apps, maybe websites, making things that work behind the scenes. And Kotlin is not relevant for all of those. But sometimes, when it is, you can also mention it. And at least they know. And maybe one day, maybe two years from now, they will think about it again. And they will look. And then maybe they will make something that we use five years from now or 10 years from now. You don't know. And yeah, when I said talk to folks at events, I'm not saying like a speaker speaking to an audience; there are many developers today at FOSDEM, and you can just talk to anyone here. And if they don't reject you, then you can mention whatever you are doing. And they will know a bit more. Kotlin is a very young language, so that doesn't help the language. Like, 2016 was the 1.0 release. So if you want it faster, we have to be actors of that move, I would say. So then, who to lobby? Anyone starting from about eight years old, I think? I don't know about the upper bound. Maybe 80, 88 years old. Anyone has an idea? Those that are interested in computer science, of course; you will not tell all the cashiers that you meet, hey, do you know about Kotlin? Maybe they are not all interested in that. But something also very important is folks that impact other developers. So that might be developers, for example someone that is making an SDK or a library, but that might also be a manager or a CTO, project leaders, generally speaking. Also, something you can do when Kotlin doesn't work the way you want, or something around it doesn't work the way you want: you can file issues. That is something that really has a positive impact long term. For example, people have been complaining about long compilation times.
Who is affected by this? Yeah, so, I'm very sure that there is at least one issue on the Kotlin issue tracker for that. If you want to submit a new issue, some problem that you find, you can just go to kotl.in/issue, and the issue tracker has some people really triaging the issues. So there is a good chance that it helps, in the long term, to get Kotlin into better shape. And of course, it can be any other issue tracker, depending on what it is exactly, because you might have issues in Kotlin or the IDE support, but also in a library. And that might also affect developers, their experience with Kotlin overall, or making their project. So yeah, a few links. Also, you can try youtrack.jetbrains.com if you know that it's only in IntelliJ. Also, right in your IDE, there's a Help menu and then Submit Feedback. You can send an issue for Android Studio or IntelliJ, depending on which one you are using. There are also links in the AndroidX release pages, sorry, release notes pages: a link to file a new issue. So, I don't know, you find an issue in the androidx.compose compiler, then you can go there and click on the release notes for the latest version. Maybe you will see, oh, I'm not using the latest version and the bug has actually been fixed, but if it's not, there's a link right here. And more; you can probably find the right place. And so something important is going beyond the Kotlin circle. I will give you one example. It was, I think, two years ago or something. I wanted to run a Kotlin script on GitHub Actions. I knew how to install Kotlin on the GitHub Actions runner for continuous integration. So there was a one-liner that worked on Linux, but I also wanted it to run on Windows. And I had no idea how I could make it run on Windows in a one-liner. And Kotlin was not pre-installed on the machine, so I could not run a Kotlin script just like that. I could run a batch script, but I'm not familiar with batch.
And I don't think I would really want to deep dive into it to do what I wanted. So what I did is that I submitted an issue. There was a project from GitHub about the GitHub Actions runners where you could request that they install new software on all GitHub Actions runners. So I said, hey, can you install Kotlin? I think they said yes a few days later, and then a few weeks or months later, Kotlin was installed on all GitHub Actions runners. So now if you have a Kotlin script, you can just run it on GitHub Actions. So you just have to ask, and it was really outside the Kotlin community. I don't think GitHub is a very, very Kotlin-y company, even though they now have an Android app and they use Kotlin inside. So yeah, that's if we want to not always stick to our circle. And another example is for FOSDEM. So what's this? This is tomorrow morning; it's the JavaScript devroom. And the first talk, let's see it a bit bigger, is... Okay. Can you read? Okay. I don't know why they put it at the beginning of the day. But I see that it's not orange, it's green on the side. So maybe it's like they want to fade it away a bit. I don't know what the colors mean on the side. Sorry? They're random. Oh, random. So thanks, randomness, then. I really thought that they would never select my talk. So you can try that. And also, if you are, I don't know, maybe interested in web development, that might also be an opportunity to learn more, because you also get to attend the event. So maybe you will watch a few other talks. Because I think a bunch of us usually are not just about doing, I don't know, mobile development or backend development; sometimes we have a pet project and we do other stuff. So it might be relevant to go to conferences that are not about Kotlin at all. So yeah, it's important to spread the word at the right places, I believe. And keep in mind that you need to adapt your approach to the target audience or the expected audience.
Sometimes you don't really know. For example, if you come to a conference in a place you know nothing about, you might land on a lot of professionals, or maybe half of the conference is from one company that is doing only that programming language, or maybe it's all students; it really depends. So sometimes try to find out, or otherwise make it a bit generic, but don't talk like they know everything about Kotlin or they know why Kotlin is interesting. You probably know why Kotlin is a good language, or what it's not good at; that's important too. Being introductory, if you're not at a Kotlin event, I think that's a given. And also, any Kotlin-specific things need to be explained. For example, receiver types: not many programming languages have that. I learned it when I learned Kotlin. I think it exists in Scala, maybe. But yeah, sometimes if you want to show something cool, take the time to explain it, or don't show it. And also think problem first, because we know Kotlin: oh yeah, if I had to do this, I would do it in Kotlin, and if I had to do that, I would do it in Rust, and this also in Kotlin. But people are not always aware of what problem Kotlin will solve. So instead of thinking about all the solutions, think about what problems they might have encountered in their other tools. So it's kind of you putting yourself in the shoes of other people. And well, there are things that you can imagine; these are just a few things with which I'm hoping to inspire you. So which other communities might you want to target? Beginners? Because there's not a lot of ceremony, you can do cool things in Kotlin, I believe. For example, writing Hello World is two lines, and you don't have to create a class. So that's kind of classy, I think. Teenagers, amateurs, students, pros that you believe are not that pro; that might still change going forward.
And pros and experts. Personally, I think that seeing Kotlin helped me a lot to be critical about what tools I'm using, and to take things less for granted. So that might also be an opportunity for your colleagues to grow, even if they don't end up using Kotlin going forward. You can also target some specific pro communities: web developers, data scientists, iOS developers, desktop app developers, backend devs, and the corresponding programming languages, of course. With Kotlin 1.8 now, I think there's a way to have Kotlin/JS export TypeScript declarations. So that's also something interesting in terms of new audiences that you might now target, because Kotlin is evolving. So sometimes staying a bit on top of the news in Kotlin, maybe with something like a six-month delay if you want, can help you be a better advocate. So yeah, even Objective-C: there are still a bunch of very good macOS apps made mostly with Objective-C, and it works very well with Kotlin/Native. Better than with Swift, as I think some of you know. And backend development, because backend developers might switch companies, so they might target something different later on. You really never know, I think. I mean, probably you know better than I do, because you know the person. Some plants are better with beer, especially here, I guess. But not all people like beer. Personally, I don't drink a lot of beer, but I just wanted to give a shout-out to the good beer here. And so, what will you talk about? Again, try to be context-relevant, but you can share your experience. Sometimes it's not about saying, hey, Kotlin is good because of this technical fact, even though that also might be very interesting; sharing your experience is really something that helps people see in which situations it might be helpful. So that can be experience from you, but also from folks that you know, or folks online that you maybe don't know: blogs, talks, case studies. There's a case studies section on the Kotlin website somewhere, especially for KMM, I think.
And also some companies are just publishing their case studies. For example, I think DoorDash has been doing a bunch of blog posts about Kotlin-related stuff and how it helped them do a bunch of things that were actually useful for them. You can of course talk about technical facts about the language or its ecosystem. One of them is coroutines and structured concurrency, but it's also not so easy to explain. So beware if you talk about it. But that's probably my favorite thing in Kotlin, so sometimes I mention it. Language features, like extensions, are, I believe, not so hard to explain either. Also type and null safety. But beware that people may not have an idea of what a type is, because some languages do a very good job at making you not know what a type even is. And also code sharing with multiplatform is a strength of Kotlin, even though it's not always smooth, and yet it's becoming better and better. This can sometimes be the only reason why some people are using Kotlin. And then they also get the other nice things, and the less nice things as well. And also you can share news. Personally, it happens that I look at release notes from Dart or maybe Python or other programming languages just to see how they are evolving. It might be a bit particular, because I'm also kind of interested in language design sometimes, especially Dart. It's been taking an interesting path lately. But maybe just a blog post about Kotlin 1.8.20, which is coming sometime in the future, will interest some of your colleagues. Who knows? And it was a bit hard to structure these slides. I hope you will forgive me. So now it's random things we can do. You can nudge SDK vendors to put Kotlin at the forefront. That also has an impact. If I always say a mechanic can only be a man, then fewer non-men will see themselves as possible mechanics in the future. There are people that have been looking into this, and that's a fact. It unfortunately happens that way in our brains.
So you can say to SDK vendors, can you put Kotlin before Java? And then more people will see Kotlin. That also has an impact. If they are using Kotlin, they will say, oh yeah, maybe this is a good SDK. So you can tell them that it could be a good thing if they could put Kotlin first on their page. I've personally seen a bunch of SDKs where they had written Java, Objective-C, Swift, and Kotlin was very far away. You had to dig a bit. So yeah, nudge. Always be nice about it. You can also act as a leader. It's something quite different that works best, I think, when you are in the company and people are taking your feedback into account. You can suggest using Kotlin when it's the right time. For example, we have a new greenfield project and we are not sure how we will approach it. So you can say, for this thing, that could be helpful, we can consider it, and then just have a normal discussion. You can also get people to discover Kotlin. I think you already know. You can help your teammates master Kotlin. That can also be through pull requests. You can review some code, because knowing about Kotlin is only the first step. And again, report their user experience issues, please, before they affect more people. About acting as a leader, one thing is to get everyone on the same page. So again, start with problems that you would avoid with Kotlin. Otherwise, why would we use Kotlin if there are so many problems? Just because it starts with a K? And also, about others' concerns, because there probably will be some: take them into consideration. Listen to them, acknowledge them. And when possible, you can address them or put them into perspective. This is a problem; at the same time, we are avoiding those three problems that are affecting us even more. So that's a good trade-off to make, maybe. That really depends. And remember that you can never be perfect at this. I'm sure I'm not.
Putting yourself in others' shoes and taking their point of view helps you know what to tell them about Kotlin, because there is so much to tell about Kotlin and you cannot say everything. Personal experience helps. What I mean is that if you try to spread Kotlin more, you will learn things on your own. And you will find what works best for who you are as well, because this is a very social thing. Listening and acknowledging helps, because then people see that you actually want to help them. It's not just about telling them, do this, do this. And being nice, even when you have strong feelings about something, is always a good thing. Also, it can take a while. It can take, as I told you, months, years, you never know. So don't overpush. This is something I remind myself. And yeah, that's about it. If you want to talk about that later, we can, because I think we don't have a lot of time for questions. Or maybe two minutes, maybe one question. I have a question. Will you open source how to make this? I'll repeat the question. The question was, will I open source how to make the Kotlin handkerchief? So yes, I will do it right now. This is made with an actual handkerchief. Inside, there's some plastic. And with four hands, so thanks, mom. With four hands, two can hold the thing while the two other hands are sewing in the plastic thing that keeps it, you know, holding. It's only at the top. You don't need to put plastic here. And these are some LEGOs that I put behind. Then I fold it, and then it goes in the pocket. |
How we moved SDKs to Kotlin Multiplatform
and saved the world (kind of). |
Thank you very much. So, before we get started, I would like to do a couple of introductions. First, I'd like to introduce Ashley. He's going to hate me for using this photo, but he is one of the software developers on one of our engineering teams and he was heavily involved in the actual physical move of our SDKs to Kotlin Multiplatform. He's kind of the brains behind this talk. He has also been with the company for many years and has a lot of experience with the history of code share attempts and everything else. I spent many hours bugging him to get information for this talk, so I definitely need to thank him for that. And then there's me. I was an Android developer for many years. I then moved into developer relations as an Android developer advocate. And last summer, I turned to the dark side and became a manager, managing the client SDK DevRel team. You can find me over on Mastodon or anywhere else as Dev with Zachary if you'd like to. So before we get started, a couple of takeaways for this talk. Unfortunately, not pizza, and I hope I haven't made you too hungry. I first want to start by saying this isn't a code talk. We've had a lot of really great code talks today. I've certainly enjoyed them. Instead, this is a real-world example of an engineering team who have built a library in Kotlin Multiplatform. Hopefully, with some good takeaways from that, you'll be able to learn from some of our pains in doing it. We'll have a look at the history of the SDKs, which should give you a good idea of some of the previous pains that we've had, and that you probably share if you've done any code share in the past. We'll also look at the past code share attempts. Taking that, you should hopefully also see the success of where we are now.
Then there are the main changes that we have found outside of just the physical code: what's actually changed about the engineering teams, what's changed about the structure of the SDKs, and those more abstract things that have made a big difference in moving to Kotlin Multiplatform. And also where the SDKs are now. Plus, if we have time, a few extras: some surprise improvements that no one on the team really considered until they happened. Before we get into that part of the talk, I need to give you a little bit of background so you understand what the SDKs are that we've moved over. They are the Vonage Client SDKs. So what are those? Well, Vonage does a huge range of communication APIs, things like voice, video, SMS, and then services built on top of that, like two-factor authentication, all those sorts of things. And we have a range of SDKs for a range of platforms. When we say client SDKs, we mean specifically Android, iOS, and JavaScript. We also have what we call server-side SDKs, for things like PHP, Python, and all those sorts of languages. Unfortunately, those aren't written in Kotlin. But all our SDKs really are just wrappers for the APIs. All our APIs are RESTful APIs. They use WebRTC for voice and video. You can just use the APIs if you want. All the OpenAPI specs are available. That's an easy thing to do. But one thing that a lot of people want to be able to do is use those APIs in a more native-friendly way. So if you're writing an Android application, you want to be able to use a nice Kotlin library to call those APIs without having to worry about the JSON that gets returned or the different authentication headers you have to use or all that sort of stuff. And that's where I come in and our team comes in. Within developer relations, we build those SDKs. So to us, developers are our world. And that's who we are thinking about.
We try to smooth over some of those issues that might exist in legacy APIs and all those sorts of things that we all know happen when it comes to APIs. So the SDKs are a better way to do that. With that in mind, it's time for a little bit of a history lesson. A long time ago in a tech company far, far away, there was a startup called Nexmo, which some of you may have heard of. It was a European startup that did communication APIs. I'm sure you can see where this is going. They had three native SDKs for Android, iOS and JavaScript. They were completely separate. There was no code share at all. And one of the biggest gripes was that this was incredibly difficult to test. If there was a new change in one of the APIs, all three SDKs had to support that. And you had to test three times. You had to do everything three times. Any new feature was built by three different people. So even if you had the perfect API spec and the perfect design, there were always going to be small things that each of the teams did differently for their respective platforms. And purely in terms of engineering power, it was three times the work to implement anything. Fast forward a few more years, Nexmo was bought by Vonage. Nexmo became the Vonage API platform. And this was the perfect opportunity for a rewrite. It was time to take that opportunity and make something better that started to utilize code share. This was in about 2018, so a few years ago. And actually the JavaScript team thought, there's this cool new thing that's just been released. It is alpha. It is very early. But let's give it a go. Let's have a look at Kotlin Multiplatform. It was at a very early stage. The other issue was that the team really wanted to rebuild everything, the whole SDK, in Kotlin Multiplatform and leave no platform-specific code. This was their goal. It failed very quickly.
It was the combination of Kotlin Multiplatform just not being where it needed to be to support a lot of the things that they wanted to do across all three platforms, and the inherent design idea of having everything in shared code, which proved incredibly difficult, to the point where it failed. But we still really wanted code share. We still wanted to at least reduce some of the effort that was involved in supporting these SDKs. That's where C++ came in. And you can already see where the pain is coming from. It allowed for code share, which is fair. It was also obviously a much more mature option in its way. And you could basically build a base level with platform-specific code written on top. It had all the fundamentals of what you'd like to see from code share. We also had some internal resource around this. We had people that knew how to write C++, but not many of them. But this is where we went. This is what was implemented. We got code share. It was great, right? I mean, no. No, it was not great. It was actually kind of the opposite of good, to the extreme. What we ended up with is something like this diagram. And it doesn't take a mathematician to realize what happened. We started with three SDKs, a JavaScript one, an Android one and an iOS one, and we wanted to reduce the amount of code there was across the platforms. Well, actually, we ended up increasing it by adding a fourth layer, which was the C++ layer. And it didn't really reduce much of the three other layers. So now instead of having three teams supporting an SDK, we had four teams supporting SDKs. It also really wasn't accessible to everyone. I don't know how many of you have had the pleasure of building in C++, but compared to things like Kotlin and JavaScript, it's certainly miles apart. A lot of those developers just didn't have the skill set to access that code and work on it.
That meant that we were very dependent on specific people working in that C++ code share. Therefore, if they were on holiday, well, we'd have to wait for a release. It really slowed down the release cadence, which is incredibly important when you're building a library. We want to be able to have reactive SDK builds. When there's a bug, we get a new release out. When there's a new feature in the API, the SDK follows quickly. That was very difficult to do like this. So let's do a rewrite for the second time. But what were we going to do? Well, realistically, what we wanted was at least something much closer to this diagram. We wanted a decent chunk of the code base to be in the code share. And the platform-specific stuff was just whatever needed to be platform-specific. Code share, but better. We came to the conclusion that the best way to do that is to keep all the business logic in the code share and leave the lower-level platform stuff to the platforms. So that was the very rough idea of what we wanted to achieve. The next question was, what language could we achieve that with, based on the skills that we had available internally and the maturity of the options? We came up with three options within the team. The first was still C++. It was a valid suggestion to just rewrite the shared layer, use it for the business logic, and still have it written in C++. So that was presented to the team, and there was a very quick and unanimous answer: no, please, no. It just wasn't viable, because there was still going to be that dependency on specific people, and there wasn't wider support within the team. I think there was also maybe a bit of trauma around having had to support that previously, which they just didn't want to do again. So the second option actually was Rust.
It was very powerful. I mean, those that have used Rust know that and will quite happily say so. And it was also quite good for code share. We've actually had an example earlier today of how you could go about using Rust with Kotlin Multiplatform. But there were some issues that the team started to highlight. There was at the time the issue of binding between the Rust layer and the native code, which would still need to be there to some level. And within the team, the tooling and the ability to actually make this happen was quite unknown. That was a little bit scary, because we didn't want to fall into another C++ trap of being dependent on specific people and having to learn a lot. So in came Kotlin Multiplatform. This was now 2021-ish. COVID kind of skews my perception of time, but somewhere around there. It was obviously a much more mature option than when it was previously looked at. It was also very good for a shared code base, and bindings were much cleaner and easier to achieve. About a third of the engineering team already had Kotlin knowledge because they were building for Android, so they said, right, let's prototype. Let's give it a go. Let's see if we can achieve the thing that we wanted to achieve, something like this. So come December 2021. As we all know, as soon as you start getting close to the holidays, no one wants to actually do any real work. No one wants to be fixing bugs. No one wants to do anything like that. So instead, the team hid away and did something a bit fun: they built a prototype. And even in the prototype they were very careful about what was going into the shared code. This was definitely a lesson learned from the C++ mistake. Kotlin Multiplatform was only for the business logic; platform-specific code stays out. And what do I mean by platform-specific? What was platform-specific for us? Really it was the networking layer.
We actually already had very good HTTP, WebSocket and WebRTC clients for Android and iOS. And obviously a lot of that was already built into JavaScript and the browser. So that was what we wanted to keep out of the shared code. And I think we did that quite well. That stuff would just be exposed behind interfaces back to Kotlin Multiplatform, and we'd let the platforms worry about it. The prototype was worked on from the end of the year into January. And it worked. It was successful. It only covered a few of our APIs. It wasn't a full SDK, but it proved that you could do the things we needed to do in a much nicer way. Then we had drama, like all good stories. Ashley dared to have a child, which obviously put a dampener on everything. He went on paternity leave. Obviously it was a lovely thing and we were all very happy for him. But it did put the project on hold for a while, because he was a big driving force behind it and it was a big hit. We also had team members leave. It was the new year. It happens. People leave the company. What that meant is that the team that was left had to focus on fixing bugs in the current SDK. There wasn't the bandwidth to build something new. They had to keep supporting this thing that people were already using in production. It wasn't like we could just drop support and move on. But wait. By doing this, the team were reminded of the painful process of supporting the current SDK, of how painful it was to build new features, fix bugs and all that sort of stuff. Just as they were starting to really feel the pain, Ashley returned to triumphant fanfare. He had come back. He was the saviour. The team said that we really need to do something now. We really need to come up with a plan for this year. We need to build something. In the inspiring words of Ashley: "Solid, let's do it." It was time to build a new SDK. But what actually needed to happen to make that possible?
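The split described here, with business logic in shared code and networking left to each platform behind an interface, could be sketched roughly as follows. This is only an illustration in plain Kotlin: `HttpTransport`, `CallService`, `FakeTransport` and the `/calls` path are hypothetical names and are not taken from the Vonage SDK.

```kotlin
// Hypothetical names throughout; this is not the Vonage SDK's real API.

// Contract that the shared code depends on. Each platform would supply its
// own implementation backed by its native HTTP client.
interface HttpTransport {
    fun post(path: String, body: String): String
}

// Business logic that would live in the shared Kotlin Multiplatform module.
// It validates input and builds the request, but never touches the network itself.
class CallService(private val transport: HttpTransport) {
    fun startCall(to: String): String {
        require(to.isNotBlank()) { "Destination must not be blank" }
        return transport.post("/calls", """{"to":"$to"}""")
    }
}

// Stand-in for a platform implementation (Android, iOS or JavaScript).
// It only records requests, so the sketch runs without any network access.
class FakeTransport : HttpTransport {
    val requests = mutableListOf<String>()
    override fun post(path: String, body: String): String {
        requests += "$path $body"
        return "accepted"
    }
}

fun main() {
    val transport = FakeTransport()
    val service = CallService(transport)
    println(service.startCall("alice"))
    println(transport.requests)
}
```

Because the shared class only sees the interface, it can be tested once with a fake transport, which is the kind of single test suite the talk contrasts with testing three SDKs separately.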
We had a prototype, but where did we need to go from there? Well, we had previously focused on iOS and Android. We needed to build something in JavaScript as well to make sure that it was going to be functional. We needed to check that. But actually the process there was fairly smooth and that wasn't a problem. So that was really all that changed. All that happened after that was adding in the new APIs and other stuff to build out a full SDK that had feature parity with the previous one. That's when I asked the team, what were the main changes to your workflow, to the team's workflow, all those sorts of questions, as opposed to the changes to the code, which clearly were that the whole thing was rewritten. What changed in their process? Hopefully these are some points that might get you thinking about whether this is something that you can do in the future. Obviously the first one was that they had to learn Kotlin. The team at that point merged into one large team, and the Android side, well, they were off having a party for weeks because they already knew all this. But the iOS and JavaScript devs had some learning to do. They had a lot of catching up to do, and that was a big investment for the company, to make sure that they had the resources they needed and could build this. The other thing of course was that we had to go all in on a build system. We obviously went with Gradle. It made iOS developers very sad. I think there were a few people still crying out for Makefiles. It was a point of contention for a while, but everyone got on board in the end, and it's definitely a big change to the whole process. The other thing was the actual shift in tooling. Like I mentioned, we had C++ developers. We had people that were used to being able to just spin up Vim or the text editor of their choice, write some code and build it.
They didn't quite grasp just how big Android Studio is, and some of the issues when projects get large, as we've heard about earlier today as well. So there was a lot of relearning. I definitely wouldn't recommend trying to use any of those developers' laptops, because all their key bindings are completely changed and it's very strange. I've had a look at it before: because they're so used to using things like Vim, they've completely remapped their Android Studio, which is fine. So, where are we today? Well, we have released the SDK. It is actually available. We've split it into two modules. We have the voice component and the chat component as separate things. The chat component is releasing at the end of this quarter, but we have voice out there for people to actually use. And now that we are here, in the last couple of months, I asked the developers the question again: what changed? What happened? What have you learned about this process, and what would be useful for people to hear about? I think one thing that they didn't really consider at the time was that you have to keep up to date with Kotlin updates. Some of the developers are very much used to slower-moving languages like C++, where you don't have to worry about the new and exciting things because everything's already there and it's done. So there is definitely resource going into making sure that the team keeps up to date with Kotlin, so it can use the new features as they're released and do all that sort of fun stuff. It's something you have to consider. Importantly, the original pain point of consistency is completely removed. You can now just write once and it works everywhere for our SDK. All the platform code is actually doing now is exposing the functionality of the Kotlin Multiplatform code.
What that has meant is that we have more time to work on the API contract between SDK and developer, which I certainly think means that the new SDK is much nicer to use, whether as an Android developer, an iOS developer or a JavaScript developer. We have the time to fine-tune and improve that experience, whereas previously it was kind of just, here it is. The other great thing is that now everyone knows Kotlin, which, as we heard in the previous talk, I think is great in itself. Everyone should know Kotlin. But it does mean that everyone on the team can build a feature. Everyone can own a feature from its design all the way through. They have that responsibility. We're not dependent on any specific person. We're not dependent on multiple people deploying the same feature. There's more ownership within the team to take a feature through the whole process, which we wouldn't have had before. Another one, which I could give a whole separate talk on, is that we have moved to a monorepo for the SDK. In terms of keeping the whole team synchronized and everyone having visibility of and access to the whole SDK code base, it's much better. I know that we could definitely have an argument about whether monorepos are good and when they are good, but in this situation it's worked incredibly well for us. The final point: we have tests now. As you all know, tests are the key to saving developers time, effort and energy. As developers are our world, that's how we've saved the world with Kotlin Multiplatform. Honestly, it's also just really nice to be able to rely on tests that actually work. That is what we've done with Kotlin Multiplatform. If you would like to check out the SDK, please do. The QR code will take you to a GitHub community, which will give you the tutorials and show you how to get started with the SDK with some sample apps. There's also a coupon code.
If you do want to sign up for a developer account, you get 10 euros credit and you can send SMS messages and call yourself and all that sort of fun stuff. Please do check that out if you'd like to. Other than that, thank you very much. As always, my slides are available on GitHub if you'd like them. Thank you. |
Improving the Kotlin Developer Experience in Koin 3.2 |
All right, so let's continue with one of our favorite topics, which is dependency injection. We talked about it this morning. As you can see, we have a lot of topics. We have even more topics until, what time do we finish? 6.30 p.m., right? So, 6 p.m., 6.30. So, we still have a lot of stuff coming. But for now, please welcome Arnaud, who is going to talk about Koin. Thank you very much. Let's talk about the new Kotlin developer experience in Koin. Just about me, I'm Arnaud Giuliani, the lead maintainer of the Koin project and also a Kotlin GDE. Koin is a dependency injection framework, and the idea is to help you structure your application with ease and efficiency. The challenge is to provide a DSL developer experience that makes your app very, very easy to build. I'll try to explain Koin in just two minutes, to then explain what we improved on top of that. As this is a dependency injection framework, we have a couple of classes here, A and B, with a dependency between them. And we have a DSL keyword here that opens the configuration space for us. This is the module keyword, which introduces the definitions for the application. Then we have the single keyword with a lambda function to create what we want, the class that we have here, and a second one to declare the second component, and then we are done. We have declared our components inside the Koin container. The thing here is that it's working directly with your Kotlin code. That means that here, if you follow closely, we are using the constructor directly. And here it's not compiling, because class B needs a dependency on class A. This is where we need the last keyword, which is get. The challenge was that with three keywords, I can manage to write an application configuration to manage my dependencies. Koin can then create everything by constructor for you and also run any kind of Kotlin code directly behind the functions.
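The three keywords described above (`module`, `single` and `get`) can be put together in a short sketch. It assumes the `koin-core` library is on the classpath; the classes `A` and `B` follow the talk's example, and the `main` function is only there for illustration.

```kotlin
import org.koin.core.context.startKoin
import org.koin.core.context.stopKoin
import org.koin.dsl.module

class A
class B(val a: A)

// One module with two singleton definitions. B's constructor needs an A,
// so get() asks the container to resolve it.
val appModule = module {
    single { A() }
    single { B(get()) }
}

fun main() {
    // Start the container with the module, then read instances back from it.
    val koin = startKoin { modules(appModule) }.koin
    val b = koin.get<B>()
    // Both lookups should return the same singleton instance of A.
    println(b.a === koin.get<A>())
    stopKoin()
}
```

Nothing here relies on reflection: each definition is an ordinary Kotlin lambda that calls the constructor, which is why a missing `get()` shows up as a compile error rather than a runtime surprise.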
The other side of Koin is that you can inject any field easily, just by unlocking an extension with the KoinComponent interface. That means that you get access to the by inject() function, which helps you retrieve a dependency out of the Koin container. So you have both kinds of components: those that are created directly by Koin, and, when you can't have things created directly by Koin, those where you need to inject into fields. This will perhaps remind you of some Android things, like the fact that you can't create an activity yourself. You can't create an Android component yourself. You are called by a lifecycle. So you need to retrieve things from outside, from the container. And that's it. You mostly have all the tools for your dependency injection. You just need to run and start your container. But the experience is interesting, and the community raised many things, like the problem of verbosity. Sometimes in an application you tend to have dozens of dependencies. Perhaps, if you look at your code, you have a kind of potato effect, where one component tries to gather everything. But yeah, that's not great for us. This is where the story starts for us: how can we improve this for Kotlin developers? From this really simplistic example, what we would like is to spare our developers from manually writing the get() calls. The need for get() comes from the idea that Koin is made to be super efficient. We don't use reflection. We don't use any kind of meta information over your code. We just run the thing. So the configuration seems a bit manual, because you are writing things by hand for Koin. The new magic way to write this with Koin is a new keyword, singleOf, alongside single. It has the same semantics as the single keyword, but it's a new function. Instead of asking you for a lambda, we ask you for a function directly.
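The field injection described above, for components that Koin cannot construct itself, might look like the following sketch. It assumes `koin-core` is available; `Screen` is a hypothetical stand-in for something like an Android activity that is created by a lifecycle rather than by the container.

```kotlin
import org.koin.core.component.KoinComponent
import org.koin.core.component.inject
import org.koin.core.context.startKoin
import org.koin.core.context.stopKoin
import org.koin.dsl.module

class A

// Hypothetical component that is not created by Koin. Implementing
// KoinComponent unlocks the by inject() property delegate.
class Screen : KoinComponent {
    // Resolved lazily from the started container on first access.
    val a: A by inject()
}

fun main() {
    val koin = startKoin { modules(module { single { A() } }) }.koin
    val screen = Screen()
    // The injected field should be the same singleton the container holds.
    println(screen.a === koin.get<A>())
    stopKoin()
}
```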
This is why you have the double colon character. And then you don't have to write things with get(). It's still readable, still very easy to use with the Koin semantics, the sense, the meaning of the DSL. It also stays consistent with changes. For example, before, if you changed the constructor of a class, your DSL could break because you didn't update it. So for us, it's a great improvement to move to a DSL that still uses functions, but where you no longer write those functions as a lambda and instead pass them directly, with parentheses and this kind of function-pointer syntax. The other side of Koin, of course, is its interesting dynamic behavior: you can pass data directly to a definition. That means that in this class, we just want an ID, and what we want to do is pass this ID dynamically to this component. Koin offers you a way to do that directly when you inject a field, by using the parametersOf function. Then, magically, your data goes into the graph and is injected into your definition directly by this function. This is very visual and interesting, and this compact form of the DSL is also capable of handling this case, so you don't need the lambda anymore. It's quite interesting, because Koin was always meant to feel like a really compact DSL, a compact way to describe everything without invading your application. Now it avoids an annoying side effect of the DSL, and you can just write your class constructor directly like that. There are, of course, some warnings. If you have qualifiers, named definitions, for example two components that have the same type but not the same implementation, then you want to have different names for them. Here we can't guess which one you want to use. For functions and classes that have default values, we don't know either.
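A sketch of the constructor-reference style and the dynamic ID parameter described above, assuming `koin-core` 3.2 or later. `Session` and the `"user-42"` value are hypothetical; `singleOf`, `factoryOf` and `parametersOf` are the Koin functions the talk refers to.

```kotlin
import org.koin.core.module.dsl.factoryOf
import org.koin.core.module.dsl.singleOf
import org.koin.core.parameter.parametersOf
import org.koin.dsl.koinApplication
import org.koin.dsl.module

class A
class B(val a: A)

// Hypothetical class that needs a value only known at call time (the id)
// plus a dependency that lives in the container (b).
class Session(val id: String, val b: B)

val appModule = module {
    // Constructor references replace the lambdas and the manual get() calls.
    singleOf(::A)
    singleOf(::B)
    // A new Session is built on every request; its String argument is
    // expected to come from the parameters passed at resolution time.
    factoryOf(::Session)
}

fun main() {
    // koinApplication builds an isolated container without global state.
    val koin = koinApplication { modules(appModule) }.koin
    val session = koin.get<Session> { parametersOf("user-42") }
    println(session.id)
    println(session.b.a === koin.get<A>())
}
```

If the constructor of `B` later gains another dependency that the container already knows, the `singleOf(::B)` line does not need to change, which is the consistency benefit mentioned in the talk.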
We don't know whether you want to use the default values or whether you want Koin to inject something, so it's up to you to use the lambda expression again in that case, and also for any kind of complex Kotlin expression, like builders. It's still Kotlin for you. You can still use the lambda when those expressions are needed; otherwise go with the double colon expression, just use the class, and you're done. Mostly, keep going with whatever is easiest for you to write. This is also opening a door for us. We are framework makers, so we try to understand what kind of DSL and options we want to offer you. For example, when you define something in your DI, you can give it a name, a qualifier. You can say that it's created at start: when the container is starting, it gets created. And you can say that it's bound to another type. All of this is interesting, but it's not really easy to extend in terms of keywords and binding. For example, you don't know if there is another keyword after the bind, because we are already using a lambda and then a function to express something. So do we open a new lambda block after the lambda block? It's a bit weird. With this new DSL, we can open a new way to write these options. That means that we can directly open a lambda for the options. We have named, we have createdAtStart and we have bind. These are the same words, but not implemented in the same DSL style. Here they are functions called directly inside the definition. Clearly, you see it's a bit more readable. It's also clearly more maintainable for us and allows us to add more features over time. So it's really interesting to provide a new way to write things, and the feedback from the community is super interesting for us. One of the things about Koin is that it is really simple and keeps things really simple in terms of DSL. That means that you declare everything in terms of modules.
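The options block described above (a name, creation at start, and a binding to another type) could be written as in the sketch below. It assumes `koin-core` 3.2 or later; `MyRepository`, `MyRepositoryImpl` and the qualifier name `"main"` are hypothetical.

```kotlin
import org.koin.core.module.dsl.bind
import org.koin.core.module.dsl.createdAtStart
import org.koin.core.module.dsl.named
import org.koin.core.module.dsl.singleOf
import org.koin.core.qualifier.StringQualifier
import org.koin.dsl.koinApplication
import org.koin.dsl.module

interface MyRepository
class MyRepositoryImpl : MyRepository

val appModule = module {
    // The trailing lambda holds the options for this one definition.
    singleOf(::MyRepositoryImpl) {
        named("main")          // qualifier, to tell apart definitions of the same type
        bind<MyRepository>()   // also expose the instance under the interface type
        createdAtStart()       // ask for eager creation when the container starts
    }
}

fun main() {
    val koin = koinApplication { modules(appModule) }.koin
    // A named definition has to be requested with the same qualifier.
    val repository = koin.get<MyRepository>(StringQualifier("main"))
    println(repository is MyRepositoryImpl)
}
```

`StringQualifier("main")` is used for the lookup only to avoid importing a second function called `named`; it is assumed to be equivalent to the qualifier that the `named("main")` option creates.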
You can scale the way you want: by features, by layers, by anything. It's up to you to organize yourself. Really, the idea is that the tool shouldn't impact your mind, shouldn't impact the way you build your application. You should even forget that you are using Koin. It's really important for us to continue in this way. In Koin, the framework never needed anything more for gathering modules than either a list or, let's say, something more convenient, a plus operator, but it was really simplistic. The problem is that with application development scaling up, and Android development scaling up, we need really strong links between definitions, and reusability of these components, these layers and so on. So we finally introduced something that can be seen as really simple to add to a module: the includes function, which helps us understand what kind of modules you want to reuse, and then flattens everything and optimizes loading at start for you. Visually, it may look really easy to use, but at the beginning, when you only had a list of modules to play with in Koin, it was really hard to reuse modules and really hard to figure out where your configuration was built. So we are unlocking a really big door: you can begin to reuse parent modules with child modules, and you can begin to dive into really big and complex configurations. For you, we are flattening the whole graph, and loading and optimizing everything. All of these features are Kotlin Multiplatform ready, and to get those superpowers, you can grab this directly in your Gradle file. It's no longer version 3.2; time is flying, so it's 3.3. If you are making a Kotlin Multiplatform application, use the koin-core Gradle module, and if you are using Android, use the koin-android module.
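A small sketch of module reuse with `includes`, as described above, assuming `koin-core` 3.2 or later. The class and module names (`Database`, `UserRepository`, `UserFeature`, `dataModule`, `featureModule`) are hypothetical.

```kotlin
import org.koin.core.module.dsl.singleOf
import org.koin.dsl.koinApplication
import org.koin.dsl.module

class Database
class UserRepository(val db: Database)
class UserFeature(val repository: UserRepository)

// A lower-level module that several features could reuse.
val dataModule = module {
    singleOf(::Database)
    singleOf(::UserRepository)
}

// The feature module declares what it builds on, instead of relying on
// the application to remember to list dataModule separately.
val featureModule = module {
    includes(dataModule)
    singleOf(::UserFeature)
}

fun main() {
    // Only the top-level module is listed; included modules are loaded with it.
    val koin = koinApplication { modules(featureModule) }.koin
    val feature = koin.get<UserFeature>()
    println(feature.repository.db === koin.get<Database>())
}
```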
Koin has been made to make your development super easy and super simple, and this is why your Koin configuration should stay really simple for you. I'll let you meditate on this quote from Chet Haase. The transition for us is that we are trying another kind of expression in terms of framework. Until now, we were really people pushing a lot for Kotlin DSLs and so on. The idea of introducing annotations is not to reproduce what you can find in other frameworks like Dagger and others; we really want to bring value. The problem with the DSL is that we clearly have some limits there. We can't really understand what you are writing, and we can't trigger any static analysis of your code directly. That's where the magic of Kotlin compiler plugins comes in. I won't dive into details, because I believe some people have already talked about KSP and everything around it, but let's say that a Kotlin compiler plugin covers everything we can do for you before your code is compiled. We can rewrite things, we can do code generation of course, analysis, et cetera. What we will do with Google KSP is provide you a way to avoid the DSL and go really straightforwardly with annotations, while keeping all the Koin semantics and all the Koin API for you. We don't want to reinvent the wheel. What we want to do here is generate the DSL you would otherwise have to write, even though it's really a small piece of code. If we can spare you from writing things in your code, this is still a good experience, and it lets you see how far we can go. In terms of definitions, what it means is that if you have a class, you can just add an annotation directly. You see that this is the same keyword: we have @Single on it. And you see that we are implementing an interface.
The idea is that with just one annotation, we understand that this class has a constructor, and we also directly bind the type MyRepository. You don't have to type anything for this; we detect things for you. We inspect the code and then we can say: okay, Koin, write this definition for my repository type with the implementation of my repository. We have another component, which we tag with @Factory. Factory is another Koin keyword. It's the opposite of a singleton: a factory creates a new instance each time you ask for that definition. So if you want an instance of MyPresenter, you just tag it @Factory; Koin generates the DSL and then resolves the definition for you. You see that in the end you don't really care about the DSL and the complexity behind it. We can detect many things for you. For those doing Android development, we have a dedicated annotation that lets Koin understand that we will create an instance of a ViewModel. We resolve all the dependencies, we understand that this is a ViewModel, and we bind everything for you. You see that here we don't speak about the DSL; we just tag something. The idea is that we get automatic injection and binding: by default we can detect everything you need here. We can deal with nullable types. If you use the question mark on a constructor parameter, we understand that it can be null; for you it's completely transparent, and it is taken into account. Also, as you have seen, we can tag a parameter of a function or a constructor as @InjectedParam. That means it comes from the application, which sends dynamic data directly to the definition.
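The difference between a single and a factory definition can be sketched without Koin itself. The container below is a toy written for illustration only. `TinyContainer`, `Repo` and `Presenter` are hypothetical names, and nothing here is Koin's API or implementation; it only shows the two lifecycles the talk describes.

```kotlin
// Hypothetical sample types standing in for a repository and a presenter.
class Repo
class Presenter(val repo: Repo)

// Toy container: NOT Koin, just the two lifecycles side by side.
class TinyContainer {
    private val providers = HashMap<Class<*>, () -> Any>()

    // Singleton style: created lazily on first request, then always reused.
    fun <T : Any> single(type: Class<T>, create: () -> T) {
        val holder = lazy(create)
        providers[type] = { holder.value }
    }

    // Factory style: the creation lambda runs again on every request.
    fun <T : Any> factory(type: Class<T>, create: () -> T) {
        providers[type] = create
    }

    // Looks up the provider for the type and invokes it.
    fun <T : Any> get(type: Class<T>): T =
        type.cast(providers.getValue(type)())
}
```

Registering `Repo` with `single` and `Presenter` with `factory` gives every caller a fresh `Presenter` that shares the same `Repo`, which is the behaviour the annotations are meant to declare for you.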
So the natural way to do that is to tag a parameter as @InjectedParam. You see that the goal for us is really to let you write the bare minimum. With Dagger Hilt, for example, you still have lots of things to specify. The Spring Framework is kind of the opposite, because Spring scans everything for you and does all the classpath analysis for you. We are in between: we let you tag your code with just a handful of annotations, and then you are ready to go and can handle any type of injection through your constructors. The idea behind the magic is that you just use annotations and you can still use the standard Koin API, for example the by inject() or by viewModel() field-injection style here. We don't break the experience of people who are already using Koin, and people who use annotations can use those extensions as regular extensions. Now for the modules. For definitions, we just annotate classes, but for modules we can't directly tag something in the DSL. KSP works by scanning classes or functions, so it would be hard to tag things around the DSL. The proposal for now is to work directly with a module class that gives you an organizational unit. How do you do that? You declare a class and that's it: you tag it with @Module, you add @ComponentScan, and we scan for any component that has been tagged in your application, by package. The scan is really specific, which means you can filter by package, by layer, by feature, however you want to organize yourself, and you just have this annotation here. Also, if you want, you can declare things directly inside a function.
We understand that if you tag something inside your module class, it is a definition that we can bind for you directly. You see it's still very natural to use and really compact. The idea is to let you go super fast with your dependency injection and keep everything else out of your way. Of course, between two modules you can have includes of other modules, and that generates the right thing for you: it uses the includes function introduced just before on the new Koin DSL side. Then we just need to start Koin. You have your module, you have a function where you want to call startKoin, and the idea is that we just pass the module from a new instance of MyModule. The only thing we generate for you is a simple extension that produces the DSL, and this is where we want to draw the boundary. We don't want to reinvent the wheel. We don't want to generate code over code over code. We want to keep Koin as it is, something super efficient for dependency injection, while allowing you to use annotations. This is why, with such an approach, you can mix both. You don't have to start a new project with annotations; you can already use Koin annotations inside your existing project and test with them. The only thing you have to take care of is having the right import: we generate all your Koin content inside org.koin.ksp.generated. Then you can use DSL modules, annotated class modules, everything together. It's up to you to express yourself and use the tool that works for you. What matters to us, and it is why we made Koin, is that we don't want to expose you to tools that can take dozens of minutes to recompile your project. It should run really quickly for thousands of components.
The other good thing is that it's Kotlin behind the scenes: it's generated Kotlin, and it is something you can clearly debug step by step if you want. It's up to you; we don't want to replace the DSL with annotations. It's another way to express yourself. KSP is a good technology for us to help you write less: less code, fewer bugs. Then it's up to you to choose the right tools and the right solution to structure your app. To finish with Koin and this year's improvements: what's next? If you want to know more about Koin, we have many tutorials on many kinds of Kotlin applications: Kotlin, Kotlin Multiplatform and Android applications, and also Ktor if you want. This is the roadmap for 2023: Koin 3.2 is at end of track; 3.3 is the active track, the current version, which is still maintained until the next release; and then comes 3.4, where we want to focus on Compose for the JetBrains multiplatform side and bring a better experience for Kotlin/Native, and where we also have the verify API, a new verification API that lets you do compile-time verification. Of course, we are really keen on Ktor and we want to push new things for Ktor. Today at FOSDEM, and this is my first session at FOSDEM, I'm really happy to show all the people who are sharing and contributing to Koin, and I really want to thank them. Thank you to all the community for working on Koin. I believe some people here can find themselves on this board. If you want to chat with the Koin community, you can find us on Twitter, on the Kotlin Slack, and on the website, insert-koin.io, where you will find all the related resources. Also, open source is great, but you need a strong company behind it to help you and to support your projects with Koin and Kotlin technology.
This is why I founded Kotzilla last year, to work with people who are using these technologies. You can find all the resources on kotzilla.io. And right on time. Thank you. Then we have time for questions. No question there? No, sorry, we don't have time for questions. We are so strict on timing. The next talk will start in four minutes, actually. |
Shrinking in the Age of Kotlin |
So, let's get started with the next session. It seems we're going to talk about making smaller apps with James Hamilton and his talk, Shrinking in the Age of Kotlin. Please welcome him. Thank you. Okay, so we're going to talk today not just about Kotlin, but about shrinkers as well. First off, who am I? My name's James. I'm a software engineer at Guardsquare. You might know products such as ProGuard and DexGuard; we build these products. I mostly work on things like mobile security, Java bytecode, Dalvik bytecode, code analysis, obfuscation, and these kinds of things, mostly on ProGuard and DexGuard. Previously, I worked for a few years on something completely different, control systems at CERN, and before that I did a PhD in code analysis and metrics. So first, let's talk about what shrinking is. If you're an Android developer, you might produce APKs. If you're a non-mobile developer, you might produce JARs, and you would probably want to keep these as small as possible, especially on mobile because of the limited resources: the small amount of storage on the devices, or maybe the users are paying per megabyte, something like that. And to do that, we want something that can shrink them. If you are already an Android developer, you might know ProGuard, for example, or R8, Redex, or yGuard. These are all Java bytecode and Dalvik bytecode shrinkers. Just a small disclaimer: this is not a shrinker tutorial. I'm not going to teach you how to configure ProGuard, and I'm not going to fix your keep rules today. It's also not a sales pitch for a shrinker. I'm not going to sell you ProGuard, and I'm not going to tell you that you should use ProGuard over R8 or something like that. So if it's not a sales pitch and it's not a tutorial, what am I going to talk about today? I basically want to answer a few questions.
How does a shrinker process Kotlin-generated code? To help answer that one, we need to know something about the differences between Java classes and Kotlin classes. And then I want to show you a bit about how you can build tools to analyze and modify Kotlin classes. So first off, let's talk, at a very high level, about how a shrinker works. There are normally three broad categories of shrinking: tree shaking, code optimization, and name obfuscation. Tree shaking: think of your app as a tree of all the reachable code. You start at the roots of the app, for example, in Java or Kotlin, the main method, and you follow all the references that you can find. You build a graph from that, then you shake this tree and all of the unused stuff falls away. Just like apples fall when you shake an apple tree, all of your unused code falls away. This is especially useful with libraries. As an app developer, you might use a bunch of different libraries. Those libraries might use libraries, and those libraries might use libraries, and you might just want a few features, but all of that code gets pulled into your app. So you can use a shrinker to do tree shaking on that and remove unused classes, methods, and fields, for example. Another shrinking technique is code optimization. Tree shaking was all about removing the bigger entities, the classes and methods; code optimization is really about the bytecode. For example, if an optimizer can tell that some path is always going to be taken, then we can remove some of the code. And the last one I want to talk about is name obfuscation. This is about making the strings smaller. If you're an enterprise Java developer, you might have some class names like this. More characters means more bytes. So if we just rename this to a single character, it's going to take up less space.
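The tree-shaking idea above is essentially graph reachability. Here is a minimal sketch, assuming the program has already been reduced to a map from each entity to the entities it references. The function names and the string-based representation are made up for illustration, and real shrinkers such as ProGuard or R8 work on class files and keep rules rather than on plain strings.

```kotlin
// Returns every entity reachable from the roots by following references.
fun reachable(roots: Set<String>, refs: Map<String, Set<String>>): Set<String> {
    val kept = LinkedHashSet<String>()
    val work = ArrayDeque(roots)
    while (work.isNotEmpty()) {
        val node = work.removeFirst()
        // Only expand a node the first time it is seen.
        if (kept.add(node)) {
            work.addAll(refs[node].orEmpty())
        }
    }
    return kept
}

// "Shaking the tree": whatever was never reached is unused and can be removed.
fun unused(all: Set<String>, roots: Set<String>, refs: Map<String, Set<String>>): Set<String> =
    all - reachable(roots, refs)
```

Note that an entity referenced only by other unused entities is still unused: a reference from dead code does not keep anything alive, because the walk starts from the roots only.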
Just a small side note here, which could make up a whole presentation on its own: name obfuscation on its own is not security. I won't say more about that today; if you want to discuss it later, I'd be happy to, but today I want to focus on shrinking. So why am I talking about shrinkers in the Kotlin dev room? Why is the presentation called "in the age of Kotlin"? The Kotlin compiler generates Java classes just like the Java compiler. So isn't it all just Java bytecode? Why is there a difference? Let's take a look at a very simple example: Hello World in Java and Hello World in Kotlin. We will use the javap tool to print out the disassembly of the class file and see what the difference is. The exact content doesn't matter here, but right away you can see that the right side, the Kotlin side, is longer. So what do we have here? We have some header, which is basically the same, so that's not very interesting. We have a constant pool, and we already see that more constants are used in the Kotlin class. On the Java side, we have an extra constructor which doesn't appear on the Kotlin side. That's because, in this example, there is actually no class: this main is at the top level of the file. So from the Kotlin point of view, you cannot instantiate this generated Java class file. Then we have a main method. On the Kotlin side we actually have two methods, because I declared main without the args parameter, so the Kotlin compiler generates two, and one calls the other. And then at the bottom here is what most of this talk will focus on: the Kotlin metadata. Why do we need this extra metadata that we saw in the class file? Let's look at a very simple example. If you have a data class in Kotlin, well, data classes don't exist in Java. So when you compile this to a Java class file, you get a Java class.
There's no indication here that it was a data class. Another example is context receivers. If you have context receivers in Kotlin, when you compile this to Java bytecode, you get a Java method which looks something like this: your context receivers end up as the first parameters of your method. So if you're just looking at this from a Java class file point of view, how do you know that the first parameters are context receivers and not just ordinary parameters? And there are many other things encoded in the metadata, for example nullability, type aliases, and a lot more. This is a big problem for code that inspects Kotlin code, for example via reflection, the compiler, or an IDE: all of these tools need to know that a class is a data class, for example. And how is this metadata encoded? Let's have a look again at the javap output and zoom in on the metadata. If we zoom in a bit, we see that it's actually just a Java annotation. I say "just" in quotes, because what is inside that annotation is a bit more complicated and has to be decoded, but it is a Java annotation. Since it's just an annotation, we can actually see its source code; you can find it on GitHub. There are a bunch of different fields in the annotation. The first one is the kind. We saw already that in the small example I gave at the beginning, with the main function, there was no class. So that one is actually a file kind, not a class kind. There is also a version. Then there are two fields: one where the actual metadata is stored in a binary format, and one with the strings that are referenced by that metadata. And then there are some other fields with some strings and some bit flags. Okay, so that's what metadata is and why we need it. But why am I talking about shrinking? What, then, is the problem with shrinking Kotlin code?
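Because the metadata is an ordinary runtime-visible annotation, it can be looked at from Kotlin/JVM with plain Java reflection. The sketch below assumes a JVM target. `Point` and `describeMetadata` are names made up for this example, while `kotlin.Metadata` and its `kind`, `metadataVersion`, `data1` and `data2` properties are the annotation fields the talk walks through.

```kotlin
// Sample declaration. On the JVM it compiles to an ordinary Java class;
// only the metadata records that it was declared as a data class.
data class Point(val x: Int, val y: Int)

// Reads the raw @kotlin.Metadata annotation via Java reflection (JVM only).
// data1 holds the binary payload, data2 the strings it refers to.
fun describeMetadata(type: Class<*>): String {
    val meta = type.getAnnotation(Metadata::class.java)
        ?: return "${type.simpleName}: no Kotlin metadata"
    return "${type.simpleName}: kind=${meta.kind}, " +
        "version=${meta.metadataVersion.joinToString(".")}, " +
        "payload parts=${meta.data1.size}, strings=${meta.data2.size}"
}
```

Calling `describeMetadata(Point::class.java)` reports kind 1, the class kind, while a plain Java class such as `java.lang.String` has no such annotation. This only exposes the raw fields; decoding the binary payload is what a library like the JetBrains metadata library is for.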
One of the most basic problems here is that the metadata is an annotation. Typically this annotation is not used directly by the program, so when you do your tree shaking, you won't see that it's used, and it can simply be removed, unless the shrinker, or the user who configures it, says that the annotation is needed. But then it's just going to be a normal Java class again. So either your shrinker needs to know about Kotlin, or you need to configure it to keep the annotation. Another simple example is when you start renaming things in the Java classes. If you rename the class or the methods, then you see in this example that the metadata still refers to all of the old names. And if you remove methods because they're unused, well, there is also information about these functions, from the Kotlin point of view, in the metadata. So if you remove something from the Java part, it's still going to be in the Kotlin metadata unless your shrinker knows about Kotlin metadata. As I mentioned, I work on ProGuard and on DexGuard, and both of these process Kotlin metadata in the same way. So let's have a look at how that actually works, at a very high level. We have a textual representation of the metadata here. For example, there's a Java class, and it has some metadata attached. There is a function there. You'll see that there are links: the class has its metadata attached, and the function in the Kotlin metadata points to the actual Java method. The metadata doesn't contain any of the actual bytecode; the bytecode is in the Java method. So there are two basic rules here: if the Java part is renamed, rename the Kotlin part, and if the Java part is unused, remove the Kotlin part. For example, if you rename the method sum here, you should also rename the function in the metadata.
If you remove the method, you should also remove the function in the metadata. At a high level, those are the two basic rules that ProGuard follows when processing the metadata. There are a lot of details around that, but at a high level, that's what's happening. So how is this implemented? We have an open source project, separate from ProGuard, called ProGuardCore. It was born out of the ProGuard project: basically, a lot of the bytecode manipulation and analysis was extracted from ProGuard. For example, you can read and write Java class files and Kotlin files, and you can modify, generate and analyze code. And, importantly for this talk, you can inspect and modify Kotlin metadata. This is actually powered by the kotlinx-metadata library, which is developed by JetBrains. So we don't need to dive deep into the actual parsing of what's in this annotation; JetBrains does that for us. We take advantage of the library to load the data from the annotation, manipulate it, and then write it back again. There's also a big advantage with versioning: from the ProGuardCore point of view, we don't really care which version of the metadata it is or that different versions have to be parsed in different ways. That is delegated to the JetBrains library. So how can we use ProGuardCore to read and modify Kotlin metadata? Let's have a look at an example. I was thinking about doing a live demo here, but I practiced yesterday and there were IntelliJ problems and so on, so I decided to make some slides instead. Basically, you can create, for example, a new Gradle project, add a dependency on ProGuardCore, and then you'll be able to use its features to modify the metadata. So let's look at what kind of code you can write. Let's say we've created a new project in IntelliJ, we added a dependency on ProGuardCore, and we have just a main function.
We have a file called main, we have the main function, and we want to read some Java class file that was generated by the Kotlin compiler and look at its metadata. So let's try reading the metadata from the very class that we're writing. Once it's compiled, it's going to end up somewhere here in the build folder. Let's read it back in and see what metadata is there. We can use a small utility function to read in class files. It reads in the class file, initializes the Kotlin metadata, and puts the class file into a container called a program class pool. Once we've done that, we should initialize all the cross references. This is quite an important concept in ProGuardCore: for example, the references to the super classes, so you have the whole hierarchy, and the references between classes created by method calls. So the important step after you've loaded the classes is to initialize the references. Once you've done that, you have access to the Kotlin metadata. What we can do is visit all of the classes that are loaded into the class pool, visit all of their metadata, and within that metadata visit all of the functions, and then, for example, print out the function name. Note that this is not printing the name of the Java method; it is printing the function name that is in the metadata. If we run this, we will see some output. The input to this program is the program itself, so there is one function, and it prints main. If we add another function and run it again, it prints foo and main. But we can't only read metadata; we can also modify metadata, and we can also modify the Java parts of the class file. Let's say that our shrinker wants to rename a method to some other name. So let's visit all of the methods in the class and rename them.
If a method is called foo, let's rename it to newFoo; otherwise we just keep the original name. Note that we've now renamed the Java component, so the metadata is out of sync. How do we fix that? Well, we can visit the metadata, look at the reference where the metadata points to the Java method, and then set the name. But actually there is a utility in ProGuardCore which can do that for you: the class reference fixer, which fixes up all the names after you've renamed things. Once we've done that, we need to write the metadata back into the annotation. We use a Kotlin metadata writer for that. And once that's done, we can write out the class to overwrite the original file. If we now open the file in the IntelliJ decompiler, we see that the function is called newFoo. What's important here is that we've renamed both the Java component, the method where the bytecode actually lives, and the Kotlin metadata. If you want to learn more about ProGuardCore, if you want to start modifying Kotlin metadata yourself, or if you want to build tools that modify Kotlin metadata, a good place to start is the manual. If you just want to look at metadata, you can check out our Kotlin metadata printer project. It takes an APK, JAR file, or class file as input and shows you all the metadata. This is also built into the ProGuard Playground web service, so you can upload a JAR there and it will show you the Kotlin metadata. And as I mentioned before, the ProGuardCore metadata support is built on top of the Kotlin metadata library from JetBrains; you don't need ProGuardCore to use that library, so you can check that out as well. If you have any questions, I'll be happy to answer. You can also contact me via Twitter, and I'm on LinkedIn as well, if you have any questions later. Thank you. Awesome. We do have five minutes for questions from the audience.
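The two rules from the talk (a renamed Java part means a renamed Kotlin part, a removed Java part means a removed Kotlin part) can be modelled in a few lines. This is a toy model for illustration only. The classes and functions below are hypothetical and are not ProGuardCore's API; they only mirror the idea that each function entry in the metadata holds a reference to the Java method containing the bytecode.

```kotlin
// Toy model, not ProGuardCore: a metadata function entry points at the
// Java method that actually holds the bytecode.
class JavaMethod(var name: String)
class KotlinFunction(var name: String, val jvmMethod: JavaMethod)
class ClassModel(
    val methods: MutableList<JavaMethod>,
    val functions: MutableList<KotlinFunction>
)

// Rule 1: after Java methods were renamed, copy the new names into the metadata.
fun fixNames(c: ClassModel) {
    for (f in c.functions) {
        f.name = f.jvmMethod.name
    }
}

// Rule 2: drop metadata functions whose Java method is no longer in the class.
fun dropDangling(c: ClassModel) {
    c.functions.removeAll { f -> c.methods.none { it === f.jvmMethod } }
}
```

Renaming the `foo` method to `newFoo` and then calling `fixNames` brings the metadata back in sync, which is the role the talk assigns to the class reference fixer. Keeping a direct reference from the metadata entry to the method is what makes both rules a simple walk over the model.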
So, yeah, please just shout it. Most, yeah, so if you're just... Okay, so the question is: can you throw away the metadata if you're developing an app, not a library? In a lot of cases, yes, unless you're using reflection. And reflection is quite popular. So if you don't use reflection and you're not making a library, you can probably get rid of a lot of metadata. But then reflection is a big problem. Do you have an idea of the size of the metadata in a typical Kotlin JAR? How much bigger is it compared to the same results? I don't have any numbers here, but basically all of the header information for all of the functions, everything except the actual bytecode, is encoded in the metadata. It's huge. So it can be quite big. There is some sharing, because in the metadata annotation there is a strings array, and those strings are shared with other strings because they're part of the constant pool. That saves space, but it can still be a lot. And if you're developing an app which doesn't use reflection, then maybe you can just remove all of it. Yes. Can you also remove the methods, not only the classes, just from the libraries, but initially part of the classes? Yeah. So the question was: with tree shaking, can you remove methods, not just classes? Tree shaking normally removes entities in the app, for example classes, but methods can also be removed, and fields can be removed. Inlining as well? Yes. At least in ProGuard, inlining is more the optimizer's job. Some things can be inlined, and then the original method can be removed. Also, for Java class files, attributes can be removed if they're not used. And for ProGuard, dead-code removal is part of the optimizer's job. Once you remove that code, you can run the tree shaking step again and start removing methods, fields, and classes that just became unused because you optimized. How does this affect the debugging?
So the question is, how does it affect debugging? But what exactly? We manipulated the bytecode and renamed the original functions to others, while our source code remains the original, so it doesn't really match the original version any more. Okay, so when you rename everything, how does this affect debugging? How do you get a stack trace from some crash or something? ProGuard will generate a mapping file which maps from the original names to the new names. This mapping file is also used by R8; it's the same mapping file, and it is also supported by services like Crashlytics. So the mapping file will be uploaded to Crashlytics, for example, and if you see crashes from customers, it will be automated. |
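To make the role of the mapping file concrete, here is a deliberately simplified sketch of mapping a stack frame back to the original class name. It assumes class lines of the form `original.Name -> obfuscated:` with indented member lines, and it only restores class names. The function names are made up for this example, and real retracing tools also handle members, line numbers and inlined frames.

```kotlin
// Parses class lines such as "com.example.Foo -> a.b:" and returns a lookup
// from the obfuscated name back to the original. Indented member lines and
// comment lines are skipped in this simplified sketch.
fun parseClassMapping(mapping: String): Map<String, String> =
    mapping.lineSequence()
        .filter { it.isNotBlank() && !it.startsWith(" ") && !it.startsWith("#") }
        .mapNotNull { line ->
            val parts = line.removeSuffix(":").split(" -> ")
            if (parts.size == 2) parts[1] to parts[0] else null
        }
        .toMap()

// Matches frames of the form "at <class>.<method>(<location>)".
val framePattern = Regex("""(\s*at\s+)([\w.]+)\.(\w+)(\(.*\))""")

// Replaces obfuscated class names in stack frames; other lines pass through.
fun retraceClasses(stackTrace: String, classes: Map<String, String>): String =
    stackTrace.lines().joinToString("\n") { line ->
        val match = framePattern.matchEntire(line) ?: return@joinToString line
        val (prefix, cls, method, location) = match.destructured
        prefix + (classes[cls] ?: cls) + "." + method + location
    }
```

A crash reporting service that has the mapping file can apply this kind of lookup to every incoming stack trace, which is why the renamed names do not have to hurt debugging.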
20 minutes from zero to a live chatbot with Tock |
Thank you. Thank you for having us. We are really happy to be here with the Kotlin community and the open source community today. It's really, really great. Welcome to the intermission between the great talks today. I'm François Nollen, and I'm with Julien Buret. We both work at SNCF Connect & Tech, SNCF being the French railway company. At first, we wanted to introduce Tock, the open conversation kit. Tock is an open platform. Maybe we can scroll a bit. In case you don't understand me, everything is written there. It's an open platform written in Kotlin to build chatbots, voicebots, callbots, conversational agents, and natural language processing applications. We started building Tock in 2016, when we started building production-ready business-to-customer chatbots for SNCF, which means for millions of French people, or millions of people in France, I should say. It builds on open source libraries, which at the time were Apache OpenNLP and Stanford CoreNLP. To be able to build chatbots and conversational services without having only data scientists, experts and developers, we wanted not only something wrapping and encapsulating the NLP libraries with user interfaces, but also a conversational framework, a Kotlin DSL, to build the answers, and a bunch of connectors to integrate with our websites, our mobile applications, messaging platforms such as Messenger, WhatsApp, Twitter and Slack, smart speakers such as Google Home and Alexa, and more. We built the solution on open source foundations and shared it with the community on GitHub the year after, and now it's used by several companies and universities in the fields of energy, banking, transport, and even healthcare. As I said, the initial idea was to introduce Tock to you, but as you know, everything has recently changed in the field of conversational AI.
That's why we won't do the initial presentation. If you're disappointed, please go to the Tock website and this link: this presentation is the demo we initially wanted to give, and you'll see the multi-channel capabilities, the analytics, and everything that is in Tock. Today we'll try to do something a little more interesting and fun, focusing on code. Most of the time, when people come to see me about Tock, they want to know how to build chatbots without having to code, with no code, and I assume you're more interested in Kotlin. So let's stop talking about Tock. Julien, could you run a new stack and create a chatbot in Kotlin from scratch, please? We'll use OpenNLP or Stanford. It's possible to integrate other NLP models through their APIs. Initially it was OpenNLP and CoreNLP from Stanford because Java implementations are available, so it was really convenient to embed them in Kotlin. After creating an application, we add a connector. The simplest connector is the web connector. It just exposes an API, and we'll use it to have a web page and talk to the chatbot. And that's already done. You have a chatbot. We can talk to him. Okay, that's interesting. So we have a chatbot up and running, but like Jon Snow, he knows nothing. Can you do something about it, please? I'd like to be able to say hello to the newborn. So we're in the language understanding section of Tock Studio, the graphical user interface, and you've just created an intent. You create intents, and sometimes entities inside the sentences; we'll see that a bit later. So we are training a new model to understand a greetings intent, and I'd like to have an answer. As we said, we'll use only Kotlin to build the answers. Obviously, you can go into the user interface, Tock Studio, and configure answers graphically. But it's really fun to code everything. Oh, really nice. And okay, we've got a chatbot.
And yes, I like it. I had this question I wanted to ask the chatbot. Okay, so you did something. You actually did something, and the answer is always different, but I'm quite disappointed by the answer. So we could do something else. I'm sure ChatGPT could answer. Okay. So you decided to code nothing, in fact, and delegate everything to ChatGPT. You're using a Kotlin client, which is available on GitHub, just to perform the request. And since I asked you for answers to anything, you just coded something that delegates everything to ChatGPT. That's it. Exactly. Okay, great. Exactly what I was expecting. So it's a GPT-3.5 model, this one. I hope you can read it and it's not too small. Okay. Nothing really interesting in terms of Kotlin code there. It's just a client for the ChatGPT API, and you wrap it in a Tock story so that when the chatbot triggers the right intent, it runs the client and calls ChatGPT, and we have a chatbot answering anything in minutes. So let's try it. Okay. You've just tried it programmatically before trying with the chatbot interface. And we've got an answer from ChatGPT. That's Matt. Instead of defining a new story for ChatGPT and training our model to detect some intent that triggers this story, you are using the new story, the ChatGPT delegation, as a fallback story, which means that every time the chatbot doesn't detect any intent, we get the ChatGPT answer. Next time. I can't wait. Okay. That's the answer from ChatGPT. Thank you. Congrats. Now we have a chatbot capable of answering about anything, because ChatGPT does. Sorry, you have to remember that tomorrow we go back to Paris. So could you please find something to get back? Yes, the train schedules from Brussels to Paris. I'm sure ChatGPT will have the answer. Okay. Links to websites. Actually, there is our website. Okay. We've got another answer out there. Okay. So ChatGPT definitely has the answer.
But the problem is that every time it's a different answer. We've got a real timetable there. But are we sure that's for tomorrow morning? Maybe. Actually, we did a test yesterday. Two days ago, we had no answers, no departure dates. Yesterday, we started having departure dates. But we also had answers for June, for September. And it wasn't always for tomorrow morning. So as we can see, every time we ask ChatGPT, we apparently get a different answer, which could be a problem because we have to go back to Paris tomorrow. And we would like to have some predictability. You can find the real departure date maybe on our website. Obviously, it's really impressive and interesting to have a chatbot capable of answering just about any question. But sometimes, in particular when you're a big company and you provide a conversational service to answer your customers, for some use cases you would like to control the answer, to be able to guarantee it's always the same answer. It's the answer from your database or your API. So definitely, we would sometimes like to have some predictability. It depends, maybe. But there, for the train schedule, it would be useful. So what did you do? You just created a new intent, train travel. You added the notion of entities, that is, you train the model to tell it that this sentence is a train travel search, and that the terms Brussels, Paris, and tomorrow morning have to be detected as entities: the origin, the destination, and the departure date. So you've just trained a new intent. And now we should have a custom story, not ChatGPT this time, using these entities and these variables to perform a real search, using, for instance, the SNCF Open Data API, and give precise data, always the same answer, to the customer. So using the Tock DSL, it looks like this. We add a new story. I didn't mention it: when you run the program, there's a small WebSocket client which connects to the chatbot you started at the beginning.
And it adds stories to the chatbot. So there you are defining a new story. And what does it do? It takes from the original sentence the origin entity, the destination entity, and the departure date entity, and fortunately that one is not text, it's already a date-time. So it's been recognized, detected by the model we trained. And it was OpenNLP, if I remember. And then you just have to call your favorite API, get the data, do things, implement business rules or anything, and put the result in traditional conversational widgets like buttons or cards. So we'll see what you choose to answer. A carousel, yes, for the departure dates. It would be great. So it's a generic DSL and model of widgets. But you also have specific widgets. When you integrate with specific channels like WhatsApp or Messenger, you might want to use a specific widget for your answers on these channels. So you are building a carousel with cards, a card for each proposal, and you take the... Yes, you've used the entities to perform the request to the open data client. And from the returned proposals, you just have to build cards. I can't wait to see the results. There seems to be a nice image. We've got a natural language detection issue, because we had a ChatGPT answer there. As you may know, it takes time to train a model. Obviously, it's not with two, three, four sentences that you get a performant model. Okay, here is our custom story with the real proposals to get back to Paris tomorrow. That's great. And what about going by airplane? So in minutes, we've created a chatbot that mixes ChatGPT answers, with still some training to do, and several custom stories for when we want to control the results. Okay, as a railway company, I'd like to have a custom answer for these questions and point out that it's not so good for the planet to take an airplane when I can take a train. So you can configure something really quickly. Okay, that should do it. Maybe you've got a nice graph to show that.
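The custom story described above (take the detected entities, query a timetable source, turn each proposal into a card) might look roughly like this in plain Kotlin. It is a sketch under assumptions: `TravelQuery`, `Proposal`, `Card` and `TimetableClient` are hypothetical names, and the fake client stands in for whatever open data API is actually called; none of this is the Tock DSL.

```kotlin
import java.time.LocalDateTime

// Hypothetical model for illustration; not the Tock DSL or any real timetable API.
data class TravelQuery(val origin: String, val destination: String, val departure: LocalDateTime)
data class Proposal(val departure: LocalDateTime, val arrival: LocalDateTime)
data class Card(val title: String, val subtitle: String)

// Stand-in for a client of a timetable API.
fun interface TimetableClient {
    fun search(query: TravelQuery): List<Proposal>
}

// One card per proposal, keeping only departures at or after the requested date-time.
fun buildCarousel(query: TravelQuery, client: TimetableClient): List<Card> =
    client.search(query)
        .filter { !it.departure.isBefore(query.departure) }
        .sortedBy { it.departure }
        .map { proposal ->
            Card(
                title = "${query.origin} to ${query.destination}",
                subtitle = "${proposal.departure.toLocalTime()} - ${proposal.arrival.toLocalTime()}",
            )
        }

fun main() {
    val morning = LocalDateTime.of(2023, 2, 6, 8, 0)
    // Fake data so the sketch runs without any network access.
    val fakeClient = TimetableClient {
        listOf(
            Proposal(morning.plusHours(1), morning.plusHours(2).plusMinutes(22)),
            Proposal(morning.minusHours(2), morning.minusHours(1)),
        )
    }
    buildCarousel(TravelQuery("Brussels", "Paris", morning), fakeClient).forEach(::println)
}
```

Because the answer is computed from the data source rather than generated, the same question yields the same cards, which is the predictability the speakers ask for.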
So that's the part for people who don't like to code. For all the static content, it's possible to build static stories and decision trees without having to code. Okay, and maybe a last question, to ask why it's not so good. Maybe to go back to ChatGPT answers. If we have time; not so much. So that's absolutely not the demonstration we proposed to do at the beginning. And we haven't seen much of the Tock features or even the Kotlin DSL. But it's something a bit different, a bit new. And obviously, as you know, the ChatGPT progress and everything they do at OpenAI is really impressive. And every one of you knows about it. And for the moment, for companies like us, it may still be difficult to integrate question answering like ChatGPT because, depending on the use case, we might want a supervised model and control over the results. Nevertheless, it can be interesting to integrate with ChatGPT and other similar models to be able to answer many, many things with, in fact, very little code, training and effort. And that's what we wanted to show you and to demonstrate today. You can, in minutes, create a chatbot running Kotlin. For other developers, it's also possible to write stories in JavaScript or Python. But let's stick with the best language in the world. So in minutes, you can have a chatbot in Kotlin. You can integrate with very powerful solutions like ChatGPT for question answering and choose to program your own custom stories when it's required to control the results and have a kind of guarantee of predictability. Thank you. Thank you. |
Take your shot of Vitamin! |
First, let's see the Material contract, Material with Compose. We have a first kind of parameter, named slot APIs. What does it mean? It means that inside these parameters you can put anything. For example, for the title you can declare a text, the brand of your application, and when you want to put the logo of your company, you can put an image there. The other kind of parameters is used for the UI of the top app bar. Okay. The Vitamin Compose implementation is very different. First, we can see that we access the components inside VitaminTopBars. It is the case for the top bar, but it is also the case for all other components, so we have VitaminBottomBars, VitaminButtons, VitaminModals, etc., and inside these objects we have all the variants for this kind of component. So here we have the primary contract, but we also have some other implementations; for example, we have the search bar to show a text input inside the top bar. Okay. So the first thing we can see is that the title is no longer a slot API but a string. Why? Because the design specification of Vitamin says we cannot have anything other than a title, as text, in the top bar. So we have a more opinionated contract with Vitamin, because we removed the slot API and changed it to a string, and it is a mandatory parameter because it is not nullable. Okay. For the list of actions, in Material Compose it is a slot API too. Here we have a list of action items. And what is an action item? It is an icon, which is a painter, with a content description for accessibility purposes, and these two parameters are nullable, so we can put something else, in general a text button, inside the content, which is also a slot API. And finally, we have an onClick callback to be notified when the user clicks on your icon, on your action item, sorry. So the usage is pretty simple: we call the Primary composable from VitaminTopBars, we give the title, and we have a list of action items to show the menu at the end of the
top bar, sorry. If you want to change this menu dynamically in your application, you just need to extract this list somewhere else and change it, and it will dynamically show the difference on your screen, in your mobile application. The Material usage is different because it is a declarative way, so I need to declare icon buttons and text buttons and declare all the icons I want inside. Okay. Now we focus on the colors parameter, typed with TopBarColors and with a default value, the primary call from VitaminTopBarColors. First, what is VitaminTopBarColors? It is again an object class with two functions, primary and contextual, which use semantic colors from Vitamin, different according to the function, because the contextual one is with the blue background you saw before. And so if I want to change from primary to contextual and vice versa, I just need to change the call when I use my component. If I want to do the same thing with Material, I need to change all the parameters inside my component, so here backgroundColor and contentColor, and if I want to change a specific color for icons, I need to change it inside the actions parameter too, and for every action item. So with Compose you can see we have a distinction between the icon color and the content color, which is used for text buttons. Okay. Next parameter: the navigation icon. The navigation icon is a slot API, but with a scope, the Vitamin navigation icon buttons. What does it mean? Well, the scope is an object class with multiple composables, and this scope will be useful when you use the Vitamin components because you will have some autocompletion suggestions from Android Studio with all of these variants. So it is a help for developers when you use Vitamin. The implementation is pretty simple: it is just an icon button and an icon with the correct drawable. And the usage is even simpler, because you can just use the proper component inside the
object class and define the onClick callback parameter and the content description for accessibility. Of course, you can also declare your own icon button if you have a very specific case, but in general it will always be the same: the previous page, a back, etc. For Material, well, I need to do everything. I need to declare the icon button and an icon with the correct drawable and put it everywhere, or refactor it inside another component and use it everywhere. Okay. The last parameter, and for me the most interesting one, is the overflow icon. The icon will be used only when you reach the maximum number of actions declared at the top and used inside my contract. You cannot see it, but I have a max parameter inside my component. And so I have a scope too, with only one component, so here More: it is the three dots you can see everywhere, in all applications. And this parameter will use internally another component, which is internal and very useful, because this component has three interesting parameters: first, the list of actions you want to show in the overflow menu, in the drop-down; then a state to switch whether you want to show the drop-down or not; and finally the overflow icon you configure at the root level. And this component uses internally another Vitamin component, so here Vitamin menu, where we have the drop-down, and we have here a very interesting contract. I don't know if you have already used a drop-down with Material, we will see it just after, but in our opinion it is a very bad pattern for declaring a drop-down. So here we have an anchor, to know exactly which component you want to attach the drop-down to, and the children, for the list of items inside the drop-down. So the usage is very simple: I just declare the overflow icon with my More component, I use the state to expand it when I click on it, and I give the state to my component so it handles for me whether the drop-down is shown or not. The Material implementation requires you to
implement all the logic of the drop-down. So you have a box where the first declaration will be the anchor and the second one will need to be the drop-down menu, and Compose will link these two components to know where the anchor is. So the implementation is simple, but you need to take care yourself of knowing when you want to display this icon, because you can declare here all the icon buttons you want. For example, you declare six icon buttons, and the Material artifact doesn't know; it will try to display all the icon buttons. So it doesn't have any logic about that, you need to implement it yourself. Okay, what's next? Well, first, Vitamin loves accessibility. You need to know that the design implementation of Vitamin reaches 95% of the RGAA, whoops, this is a French standard for accessibility which comes from the W3C standard, and we want to have the technical implementation at the same level. For Vitamin Compose we already have a good score, because we are using Material components internally and they already have a good score, but we have some custom components and some variants of existing components from Material, so we need to work more on that to have perfect accessibility. So next, we want to have many more tests, and two kinds of tests: screenshot testing, for which we will use Paparazzi, which is a very good way to do that and doesn't require any device to run tests; and we want to have some tests with Compose test, to test the contract of the components and the accessibility of our components. But you will have a presentation about it just after me. We want to work on the tokenization of our components too. I don't know if you know, but Material 3 has tokenized components, but not Material 2, and so we want to tokenize Material 2 components first and after that tokenize Vitamin components. It will allow us to generate a lot of things after that, because if you have everything tokenized,
foundation and components, you can generate everything, so it will be very interesting. And finally, Vitamin loves Material. I made a lot of opposition between these two libraries, but you have the VitaminTheme composable, where we inject all our semantics about colors, shapes and typography, and you can see that we also use the MaterialTheme component, and inside this MaterialTheme component we provide the mapping between Vitamin semantics and Material semantics, so you can use Vitamin and Material inside your projects. So, pretty cool. I finish with some references. Everything is open source: the Figma projects and all the technical implementations for web, Android and iOS, so if you want to check them, you can. There is also the Vitamin Slack if you want to exchange with designers and developers on this project. And I have a personal link here, Conferences4Hall, which is a personal project from my GDG activity. This is a conference application for DevFest Lille, and this application uses two design systems, Material 3 and Vitamin, so if you want to check how to use Vitamin in a real-life project, you can just check the code and form your own opinion. Thank you, and if you have any questions, I'm here. Yeah, the question is whether I can explain the tokenization. Well, you can see here, on the button component, you have a token to say there is a surface, there is a padding, but there is a start padding and an end padding, top, bottom, but also a padding between the icon and the text, and you have a token for everything. So you have the structure of a component, with all the concepts inside this component. In the design system ecosystem you often have a pattern named atomic design, where you have atoms and you have molecules, etc. A token comes before the atom: it is the smallest part of a component, of a concept inside the design system. Another question? No? Great. Just before I give the microphone back, here we have a QR code to give feedback, so it's just a
link to the FOSDEM form. It's the fourth time for me in English, so please give feedback and tell me how it was. Thank you. |
How to Test Your Compose UI |
Last talk of the day. I am glad that there are still people in this room; usually people tend to go to dinner early or so. But well, let's close this dev room, and please welcome Istvan on stage. He's going to talk about how to test your Compose UI. So thank you very much. Hello, I'm Istvan from here, and thank you all for still being here, that's really encouraging. So let's get started. I'll be talking about testing Jetpack Compose UI on Android. First of all, just a little extra: there is a sample project, which I took all the samples in this talk from, so feel free to check it out on my GitHub page. Okay, so a quick recap on Android testing without Compose, so with the regular view system. With the regular view system, we have views or view groups, which are objects created by inflating XML, or of course from code, and they have rendering and behavior attached to them. And because they are actual objects, we have a grasp on them: we can get a reference to them from code, or from the view hierarchy with findViewById or so. Okay, let's see how Compose compares to that. Of course, we have a declaration of the UI and not the UI objects themselves, so we don't have a grasp on what Compose actually does, what the framework actually translates our description of the UI to. And of course, not every composable actually emits UI elements into the composition, so that might make our work harder as well. Okay, so let's see which composables we will be testing in the next few minutes. First of all, there is an entry list, which is just a simple screen with a list that will display hydration entry items. The entry list is just a wrapper around the LazyColumn that does just that: it translates hydration entries into hydration item composables. And the items themselves are just a simple row with two texts to display the data on the screen.
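As a rough picture of the composables just described, the sketch below shows a list wrapper around `LazyColumn` and a row with two texts. The names `HydrationEntry`, `EntryList` and `HydrationItem` and the fields of the data class are assumptions made for illustration; the speaker's sample project may differ.

```kotlin
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.lazy.LazyColumn
import androidx.compose.foundation.lazy.items
import androidx.compose.material.Text
import androidx.compose.runtime.Composable

// Hypothetical data model; the real sample project may use different fields.
data class HydrationEntry(val amountMl: Int, val time: String)

// Thin wrapper around LazyColumn: one HydrationItem per entry.
@Composable
fun EntryList(entries: List<HydrationEntry>) {
    LazyColumn {
        items(entries) { entry ->
            HydrationItem(entry)
        }
    }
}

// A simple row with two texts displaying the entry data.
@Composable
fun HydrationItem(entry: HydrationEntry) {
    Row {
        Text(text = "${entry.amountMl} ml")
        Text(text = entry.time)
    }
}
```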
Okay, let's see how we can set up tests in our Android project for these composables. We just have to add a few dependencies, and of course now we have a nice BOM to do that. After we add those to our Gradle files, we can already start writing Compose UI tests. And what does a Compose UI test look like? It's just a regular test class, so nothing special there; if you have written Android UI tests, this is pretty much the same. The first thing that you can spot is that there's just another rule that we have to use inside the instrumented test, and we can create that rule with the built-in createAndroidComposeRule function, which has a type parameter, which has to be an activity. For that type parameter we can set MainActivity or any other activity inside the application that we want to start, so the rule created by this function will start the activity provided in that parameter. Of course, if we want to test composables in isolation without any specific activity, we can do that as well: then we can just pass ComponentActivity to that type parameter. ComponentActivity is just a base class for many AndroidX activities, it's the foundation of all the other classes, and it can host composables, so that's that. And if we want to do this, there's actually a shortcut, which is called createComposeRule, which does just the same as you've seen before, but in a multi-platform way. So if you want to prepare for a multi-platform project, then you can call createComposeRule, and on Android it will translate to what you've seen before. Okay, with that out of the way, we can start writing UI tests. The tests themselves are just regular test functions, nothing special there.
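A minimal sketch of the rule setup described above, assuming the `androidx.compose.ui:ui-test-junit4` artifact is on the instrumented-test classpath and `androidx.compose.ui:ui-test-manifest` is added so that `ComponentActivity` is registered; the class name and the displayed text are illustrative.

```kotlin
import androidx.activity.ComponentActivity
import androidx.compose.material.Text
import androidx.compose.ui.test.assertIsDisplayed
import androidx.compose.ui.test.junit4.createAndroidComposeRule
import androidx.compose.ui.test.onNodeWithText
import org.junit.Rule
import org.junit.Test

class IsolatedComposableTest {
    // Starts the activity given as type parameter before each test.
    // ComponentActivity is enough to host composables in isolation;
    // createComposeRule() is the shortcut for the same setup.
    @get:Rule
    val composeTestRule = createAndroidComposeRule<ComponentActivity>()

    @Test
    fun hostsAComposable() {
        composeTestRule.setContent { Text("Hello") }
        composeTestRule.onNodeWithText("Hello").assertIsDisplayed()
    }
}
```

Replacing `ComponentActivity` with an application activity would instead launch that activity with whatever content it sets up itself.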
The specialty comes when we actually start to write the test, because all of the test calls have to be made on this test rule. So we can scope to it with apply, because this is just regular Kotlin code, so we can do whatever we want and whatever we know of; we can just scope to this Compose rule and call anything on it. And the Compose test rule has a setContent method, which you may have already met as an extension on the activity or as a function of ComposeView, if you do interoperability. So this is the entrance to the Compose world inside our tests. And in that content, we can call any composable at this point, but for this example, we'll call our application's theme, and then just call the entry list composable, which we've seen before, with a fake list of entries. And if you wonder where those entries can come from, they can come from basically anywhere; we're still just writing Kotlin code, so these entries could be just a fake list of entries that you provide to the test suite from anywhere. Okay, so now we know how to set up our tests, but we still have no grasp on that entry list composable or anything that it actually emits into the composition, so let's see how we can fix that problem. Enter the semantics tree. The semantics tree is another tree that is built in parallel with the composition. The composition consists of nodes that will be rendered, after some processing, of course, and the composables that we write emit nodes into the composition, but they also emit nodes into the semantics tree, which is used by the accessibility frameworks and also the testing framework. And just as with the composition, the composables that we write may or may not contribute to the semantics tree, but that behavior can also be modified, and we will see how we can do that as well. Okay, so let's simplify our example a bit.
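The setup described above, scoping to the rule with `apply` and entering Compose through `setContent`, could be sketched as follows. The test assumes the same test artifacts as any Compose UI test; `MaterialTheme` stands in for the application's own theme, and a plain `Column` of texts stands in for the entry list composable, both being substitutions made so the sketch is self-contained.

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.material.MaterialTheme
import androidx.compose.material.Text
import androidx.compose.ui.test.assertIsDisplayed
import androidx.compose.ui.test.junit4.createComposeRule
import androidx.compose.ui.test.onNodeWithText
import org.junit.Rule
import org.junit.Test

class EntryListSetupTest {
    @get:Rule
    val composeTestRule = createComposeRule()

    @Test
    fun showsFakeEntries() {
        // The fake data is plain Kotlin and can come from anywhere.
        val fakeEntries = listOf("100 ml", "250 ml")

        // Every test call goes through the rule, so scope to it once.
        composeTestRule.apply {
            setContent {
                // Stand-ins for the application theme and the entry list.
                MaterialTheme {
                    Column { fakeEntries.forEach { Text(it) } }
                }
            }
            onNodeWithText("100 ml").assertIsDisplayed()
            onNodeWithText("250 ml").assertIsDisplayed()
        }
    }
}
```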
Now we just call a simple Text composable inside the setContent of our test rule, and by doing that, the semantics tree will look like the one on the right. There will always be a root node for the semantics tree, and the Text composable by default emits its text content into a new semantics node, which you can see as a green node. And yeah, that's it. So let's change quickly to the canonical representation that you will see when you are writing tests and observing this semantics tree. There is the root node, of course, and the Text contributes a text attribute of "100 milliliters" inside that semantics node. Okay, let's add the row, because as you can see, we are building the items of the list. So let's add Row, and that doesn't change the semantics tree in any way, because the Row composable is actually a layout composable, and it doesn't emit anything into the composition. And by default, it doesn't emit anything into the semantics tree either. Let's add another Text, and of course, that will create a new semantics node, which will be a sibling of the previous one, and a child of the root node in this example. Okay, so this is a really easy example, but of course, when you're writing an application, you will be facing complex screens and complex sub-composables and whatnot. So you will be looking at more and more complex semantics trees as well, which you have to assert on. So of course, there must be a way to visualize this. One thing that we can do in our tests is call the onRoot finder method, we will be talking about those later, and call printToLog on it with a test tag, sorry, a log tag. What this will do, when you run a UI test that has this line in it, is print this structured log, which you can find by the log tag, and you will see the root node here and all the other nodes structured inside this log entry. The other way might be a bit easier.
You can configure the Layout Inspector, if you have a fully Compose screen, of course, to highlight the nodes that are contributing to the semantics tree and to inspect the attributes of those nodes. Okay, let's get back to this easy example and let's see how we can modify the behavior of how composables emit things into the semantics tree. There is a modifier called semantics, because of course, there are modifiers for everything. The semantics modifier by itself does nothing, as you can see on the semantics tree representation. So by default, adding the semantics modifier to the modifier chain won't do anything special, until we start adding some attribute values inside the semantics modifier. In this example, we add a content description to the Row. And as you can see, this will actually modify the behavior of Row: with this, Row will contribute to the semantics tree with a new semantics node, which will be the parent of the two text nodes that are the children of the row, of course. And as you can see, the content description semantics setting adds a content description attribute to that new node with the text "a list item". That will be picked up by the accessibility and testing frameworks, but we can also define a test tag, which will only be picked up by the testing framework and not the accessibility framework. And of course, we will be able to assert on this as well. As you can see, the test tag contributes a tag attribute inside the semantics node, and it will be a text as well. Okay, so with this knowledge, we can already start asserting on and exercising our UIs with the Compose UI testing framework. Let's see what APIs we can use to do so. We've already seen this first finder, onRoot, which selects the root node of the whole semantics tree. And there are a few functions that you can call on the root node.
One you have already seen: this is printToLog, which can be useful when you start writing your UI tests. There is another family of functions, which is called onNodeWith, and there are a few variants of it which can find our nodes based on multiple predefined attributes that can be present in our semantics nodes. In this example, onNodeWithTag selects the node with the tag that matches. In the next example, it matches on a content description, and in the next one, it matches on the text. It can be a test tag and so on and so forth. Of course, these finders try to find exactly one matching node. So if you don't have a matching node, or if you have multiple nodes that would match the criteria, this will fail your UI test. There is another family of functions that you can use to find nodes: they are called onAllNodesWith, they have the same predefined variants, and these will try to match one or more nodes against the criteria. If they don't find any nodes, of course, they will also fail our tests. Just as in Espresso, where you find a view, here you find a semantics node. And just as in Espresso, you can perform actions on the nodes you found, which will translate to actions on the actual composables that you are referring to through semantics nodes. Let's see this example of a button. It's just a simple button that has a text inside of it. You would expect that it is clickable, and there is a performClick on the node class that the onNodeWith functions return. If you call performClick, then the button will actually be clicked in our UI test. This is because the button, besides the many, many attributes that it contributes to the semantics tree, defines an action which is called onClick, and performClick checks for this action, and if it's there on the button's semantics node, then it deems it clickable and it will be clicked.
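The semantics modifier, the finders and the click action discussed above can be put together in one instrumented test. This is a sketch, not the speaker's sample code: the tag, the texts and the click counter are illustrative, and the test assumes the Compose UI test artifacts are available.

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.testTag
import androidx.compose.ui.semantics.contentDescription
import androidx.compose.ui.semantics.semantics
import androidx.compose.ui.test.assertCountEquals
import androidx.compose.ui.test.assertIsDisplayed
import androidx.compose.ui.test.junit4.createComposeRule
import androidx.compose.ui.test.onAllNodesWithText
import androidx.compose.ui.test.onNodeWithContentDescription
import androidx.compose.ui.test.onNodeWithTag
import androidx.compose.ui.test.onNodeWithText
import androidx.compose.ui.test.onRoot
import androidx.compose.ui.test.performClick
import androidx.compose.ui.test.printToLog
import org.junit.Rule
import org.junit.Test

class FindersAndActionsTest {
    @get:Rule
    val composeTestRule = createComposeRule()

    @Test
    fun findsNodesAndClicksButton() {
        composeTestRule.setContent {
            var clicks by remember { mutableStateOf(0) }
            Column {
                // The semantics and testTag modifiers make the Row contribute its own node.
                Row(
                    modifier = Modifier
                        .semantics { contentDescription = "A list item" }
                        .testTag("item")
                ) {
                    Text("100 ml")
                    Text("10:00")
                }
                Button(onClick = { clicks++ }) { Text("Add") }
                Text("Clicks: $clicks")
            }
        }

        // Dump the semantics tree to Logcat under the given log tag.
        composeTestRule.onRoot().printToLog("SEMANTICS")

        // onNodeWith... finders expect exactly one matching node.
        composeTestRule.onNodeWithTag("item").assertIsDisplayed()
        composeTestRule.onNodeWithContentDescription("A list item").assertIsDisplayed()

        // onAllNodesWith... finders return a collection to assert on.
        composeTestRule.onAllNodesWithText("100 ml").assertCountEquals(1)

        // The Button exposes a click action, so performClick triggers onClick.
        composeTestRule.onNodeWithText("Add").performClick()
        composeTestRule.onNodeWithText("Clicks: 1").assertIsDisplayed()
    }
}
```

Asserting on the visible counter text keeps the check inside the Compose test framework, which waits for recomposition before evaluating the assertion.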
For the rest of the APIs, because we don't have much time here, you can check the official Compose testing cheat sheet, which you can find at that link. Okay. Let's dive into the last topic for today, pretty much. It's hybrid UI testing. So what's hybrid UI in this context? Hybrid UI is when you want to include composable content inside the view hierarchy or, the other way around, you want to include your existing custom views, for whatever reason, inside a composition, so for example a full screen made with Compose. And luckily, we have support for this. Espresso and the Compose testing framework can work together to test such cases. Let's see our first example here. In this example, we will go the ComposeView way, meaning that the container here, the toolbar, and the rest of the screen, except the button, will be in XML, and we will be trying to include the button, which is written in Compose. So this is the context that we will be using. There's a constraint layout with the toolbar, with its regular attributes, and there is a ComposeView, which will be our entry point for Compose inside our layout. Okay. Here is the fluff that we need to set up this layout. We have an activity that uses view binding to set up its views. And of course, with the binding, we will be setting up the toolbar title, but the more important thing is that we set up our ComposeView, where we can call the setContent method that you can see here. Sorry. And there is a custom button composable, which is written somewhere else. It doesn't really matter. We can set a text on it, and we won't be clicking it, so we just set its click listener to nothing. All right. So how would we test this scenario? First of all, we declare our Compose test rule, starting that ComposeView demo activity that we declared previously, and then we scope to the Compose test rule and call our tests. First of all, this is just a regular Espresso call.
Of course, we are acting on a regular view hierarchy, so this isDisplayed check on the toolbar will work. There's nothing special there. But the next thing that will also work is just calling the Compose testing API on this same layout in this same activity, and that assertion will pass as well because of the interoperability between the Jetpack Compose testing APIs and Espresso. Okay. That's really nice. Let's see the other way around. In this example, the whole screen you see here, except the button, will be in Compose, and we'll be including this custom button here, which is written in the plain old view system. It will be a custom view. Okay. This is the custom button view. Nothing special here. The layout is inflated from a layout XML. There might be some fluff on it, and there is a setText method to set the text, and there is a setOnClickListener method to set the click listener. Of course, nothing special here. And this is a constraint layout, so this is like a deep custom button. Of course, there might be better examples than this, like an external SDK's custom view that's still not implemented in Compose. But for this example, we'll stick to this custom button view. Okay. And this is the composable that we will be including that button in. It's called Android View Demo, and it has an onButtonClick parameter to lift up the action handling of the button. Okay. Of course, we are using a scaffold. We are adding a top app bar there, that's the fluff here, and there is the interoperability API for including a view inside the composition, and that's called AndroidView. You can read about it in the documentation. The important part here is that there we call the constructor of the custom button view and set it up like you would with a regular view in code. Okay. So how do we test this scenario? We will just, sorry, we will just call createComposeRule, because we don't need an exact activity to test this composable in isolation.
And then we set up our test, and we do a kind of behavioral test pattern there for our Android View Demo composable. As you can see, the button click handler is just setting an external value outside. Okay. So if the button is clicked, then we would expect that button-clicked variable to be set to true, and we will be asserting on that. So let's start testing. First things first. In the Android View Demo, we are testing if the toolbar in the composable is visible, and that will pass. Then we do the Espresso testing for the button to check that it's displayed, and that will pass as well. Again, this is the power of interoperability between Espresso and Compose. Okay. So let's go forward and try to click that button that we have here as a view. We would expect that the assertEquals on the button click will pass, but unfortunately that's not the case as of now. So as of now, with the latest Compose BOM and the latest Compose version, this will not pass. This will not happen, actually. That perform click won't be clicking the button because of a bug in Espresso. The thing that we can do is to call performClick on the view that's provided to us by Espresso, but the problem with this is that the performClick won't be happening inside the context of Espresso. So there might be timing issues, and when you want to run a check after doing this, that might fail because the click is not performed, or the side effects of the click are not performed, in time. So with this, we now have a kind of flaky test, which we could circumvent by doing some more fluff around it with Espresso, but by default this is the case now. Hopefully it will be fixed soon. So we're almost done. There are more topics that you can check out on Compose testing. The best part of it is the libraries that do screenshot testing, of course, but this topic is pretty deep, so feel free to check these out.
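The second hybrid scenario, a classic view hosted inside a composition and checked with both frameworks, can be sketched as below. A platform `android.widget.Button` stands in for the speaker's custom button view, and the screen texts are illustrative; the test assumes the Compose UI test and Espresso artifacts are available. The click itself is deliberately not asserted, since the talk reports that this step was unreliable at the time.

```kotlin
import android.widget.Button
import androidx.compose.foundation.layout.Column
import androidx.compose.material.Text
import androidx.compose.ui.test.assertIsDisplayed
import androidx.compose.ui.test.junit4.createComposeRule
import androidx.compose.ui.test.onNodeWithText
import androidx.compose.ui.viewinterop.AndroidView
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withText
import org.junit.Rule
import org.junit.Test

class AndroidViewInteropTest {
    // No specific activity is needed to test the composable in isolation.
    @get:Rule
    val composeTestRule = createComposeRule()

    @Test
    fun composeAndEspressoSeeTheSameScreen() {
        // Flag set by the click listener; asserting on it after a click is the
        // part the talk describes as flaky, so this sketch stops before that.
        var buttonClicked = false

        composeTestRule.setContent {
            Column {
                Text("Android View Demo")
                // Interoperability API: hosts a classic view inside the composition.
                AndroidView(factory = { context ->
                    Button(context).apply {
                        text = "Click me"
                        setOnClickListener { buttonClicked = true }
                    }
                })
            }
        }

        // Compose testing API for the composable part of the screen.
        composeTestRule.onNodeWithText("Android View Demo").assertIsDisplayed()
        // Espresso for the classic view embedded in the composition.
        onView(withText("Click me")).check(matches(isDisplayed()))
    }
}
```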
Finally, here are some resources that I used to create this talk, and also there is a 40-minute version on my website that you can watch, and multiple instances actually. So yeah, if you're interested in this topic and some more tools to use and some more examples, check out those as well. And yeah, finally, thank you for your attention. If you have any questions, I guess we have some time, and yeah, that's it. Thank you. |
Kotlin DevRoom Closing Remarks |
Okay, everyone, so we are just here to wrap up the dev room, and we have a couple of last things to share with you first. Well, it was, yeah, awesome to have you here. And sorry, folks. Join the conversation on the FOSDEM channel on Kotlin Slack. That is where basically we announced the call for papers. We announced there that there will be a dev room. Also, we do have some possibility to shape the dev room in a way that fits the community better. Like we could do a two-day dev room or a half-day dev room, or we could do like a half-in-person and half-virtual one, or we could, I don't know, focus on a certain topic. We try to react to how the community responds. So this is the channel on Kotlin Slack. It's public. Everyone can join. You can see the whole conversation there. Join it. And yeah, we would love to know, like, even if you have feedback in general for this dev room this year to share with us, we will be more than happy to listen, and we are all there. So you will find us there. Yeah, some news on the license lottery? Yeah. Anything more? You asked. No, fine. So we have three licenses to raffle. Two of them will go out to people who were nice enough to tweet or contact us on Kotlin. And the last one will be raffled tonight at the meet-up. And I think there is a link somewhere for the... Excellent. So sign up so we know how many people to expect, and then you, Nick, or Martin, will do the raffle. I'll be there. So looking forward to having some beers with you there. That's it. I just want to say a big thank you to Nico, who has been doing all the heavy lifting, like, for several months. Actually, several years. He's the one who put Kotlin at FOSDEM. So thank you very much. Thank you, folks, for coming. It's always awesome. Like, every time I say, like, I should stop doing these. But then, like, I come to FOSDEM and I see so many people excited to talk about Kotlin. Like, it gives me so much joy. So I can't give up. That's why. Thank you again. 
And I really hope to see you at FOSDEM 2024. Thank you. |
Welcome to the Legal and Policy Issues Devroom |
Welcome to the legal and policy devroom. We're so excited you're here in person at FOSDEM again. Hooray! Especially since it's the first time that we've been together since 2020 and that all of you have ventured all the way up here to UB5.132. We really appreciate it. So just a quick introduction to your organizers today. You just heard from Karen Sandler. I'm Tom Marble, and I will let the other organizers introduce themselves. I'm Matthias Kirschner. Yeah, Alexander Sander. Nice to see you. I'm Bradley Kuhn, and I have the t-shirt, so you know I'm a devroom organizer. I don't think we have too much to tell you before we get started. Some of us will be wearing masks. Masks are very welcome. Don't feel self-conscious if you wear one. You'll see me wearing this hilarious duck mask the whole day. There's an elevator for those of you who need it. It is a little bit glitchy, but it does function. Anything else we need to announce? We have fresh air provided to you by Software Freedom Conservancy. There's an air filter there, and there's one there. So if you're worried about making sure the air is being filtered, feel free to sit close to those. As an added bonus, near the air filter, there is a power strip. So if you need power, move in near the air filter. Oh, and the windows are open too, so you can go by those. Do you have anything else? Yeah, the only other thing I wanted to mention is that on the page for the dev room, you'll find a link to join the Matrix chat for this room, and also the link for the live streaming video. So anybody that's joining us online, welcome. And with that, I think we're ready to have our first speaker, Ian Kelling. |
A Service as a Software Substitute (SaaSS) is unjust like proprietary software
Thinking carefully about services |
I just have to mention that our board has put out a recruitment call, a call for nominations. We're trying to expand our board, so if you go to FSF.org, it'll be the top news item. It's something I just want to plug before I get started. We're also hosting a conference in a month. You'll see some information about that too. My talk is about SaaSS, and that stands for (public speaking, it takes just a minute) service as a software substitute. What that means is when you use a service and you use it to substitute for your own computing. That's a little abstract, so it's best if I give an example. The one I like to give is a photo editing service. Say you go to a website, you upload a photo, you tell it you want to turn it black and white, and you click a button, it does it, and then you click download and you've got the photo. Well, that computation done on the server is something that you should control with a free photo editing program, like GIMP. When you use a service like that, you give away your software freedom in a very similar way to using a proprietary program on your computer. But in some ways it's even worse, because it's automatically spyware: the server operator has all your data, and it automatically has a backdoor to make changes at any time. So in some ways it's even worse than proprietary software. So that's sort of the basic case, but I'm going to assume you understand software freedom a bit, so I'm going to go into some other cases. So let's talk first about some basic examples of when something is not SaaSS. When somebody else is inherently involved in the activity, that's usually a good giveaway. So a communication service is an example. Say I want to send you a message across the internet. We need some intermediary to route that message. I don't have a direct connection to everybody else, so somebody has to provide a service to do that. It's necessary. Another good example is publication. 
So I give you some data, I send you an email, I say you can publish it. Well, there's nothing wrong with you doing that versus me doing that; it's just publishing information. Now some websites offer multiple services, or, well, they call it one service, but it has many different use cases. Some are SaaSS, and some aren't. So let me digress just for a second. This SaaSS has nothing to do with the SaaS with two S's. It was kind of meant to be a bad pun, and it didn't work out that great, but just know that when I'm saying SaaSS, I'm talking about the one with the three S's, and you just have to understand by context. So as I was saying, some services have multiple use cases. Some are SaaSS, some aren't. An example would be the website Flickr. It's meant to share photos: publication, not SaaSS. But then it also has photo editing features, and that is SaaSS, as I talked about. So let's move on to another example, a backup service. A backup service is something people often like to run themselves. So you think, well, is it SaaSS? Is it not? Well, if the point of the backup service is to give you back your files exactly as you gave them to it, then it's not SaaSS, because there's no computation there that you should control if you're only hoping to get back the exact same data that you gave it. The result wouldn't be any different. Now, you may want to run that backup service yourself for reasons other than SaaSS. Maybe you think it's more reliable. Maybe you think you have some privacy there. And that's one of the complications with SaaSS, and services in general: there are other concerns besides SaaSS that are often very important to people. So it makes it a little difficult to talk about. That's part of the reason I'm talking about it today. How about the case of a database as a service? For example, a SQL database service. Well, in this case, SQL queries are actually complex computation. There are huge manuals for them. Exactly how they work matters to the people using them. 
They want specific database programs. They care about what version is running. They care about how that computation is run. So yes, it would be SaaSS if it's a database service. But then think about the contrast between a backup service and a database service. Say you use a database service, but you use it like a backup service. You just dump some data into a single table, and you just retrieve that table. Well, then you're using it like a backup service, and suddenly it's not SaaSS anymore. So what we have to think about is the primary purpose of how you're using the service. And when I say primary purpose, that brings me to a secondary purpose. Well, it's not really a secondary purpose that I want to talk about. But let me bring up another example. Say you upload some files to your backup service. And then you say to the backup service, what files do I have? And the backup service returns you a list of files sorted alphabetically. Well, if you had a list of files on your computer and you wanted it sorted, that would be something you would want to do on your computer. You'd want to run the sorting algorithm, something you should control. How it's sorted definitely matters to you. That's your computation. But then when the backup service does it, well, it has to tell you those files, and it has to sort them somehow. It has to give them to you in some sorted order. So in this case, it's doing computation that is what I call incidental computing. It's not your primary purpose. Your purpose is just to find out what files are on the backup service. Incidentally, there has to be some computing done in that process, which happens to also be like computing you would do on your own computer if you were running locally. But it's not the primary purpose of using that service. So it wouldn't be SaaSS. Another way to call it is ancillary computing. 
And so when you pick apart a service into its use cases, this is one way to narrow down the issue of whether something is SaaSS or not. Now, for another, more complicated example: I've been talking about the computing of an individual person, but groups can come together for a common purpose, form an organization or a project, and then they use a server to collaborate with each other. And an example of this would be Wikipedia, which runs, well, MediaWiki, and other groups who run MediaWiki. Now, that service has features like document editing, diffing documents, conflict resolution. Those are the type of computing that, if you were working on your own, you would want to control on your computer, but because you're working collaboratively with a group, you could say it's the group's computing, and that group should be able to control its own computing by running its own server and the software on it. You could say when you join in with that group, you're a member of that group and you're collaborating and doing that group's computing, so you can use those sorts of features together. And this is a little bit hard to think about, because your computing versus a group's computing can get a little blurry sometimes, but it's important to realize and think about. So, on to another example: think about a bug tracker. This is a very common piece of software that developers use. Now, one way that a bug tracker could work, or does work for some projects, is they have a mailing list where they say, bugs go to this mailing list. So somebody sends an email to the mailing list to say, I have a bug, and somebody responds and says, yes, I agree, that's a bug, or somebody says, no, I don't think so, and they discuss it. Well, this is just a form of communication and publication, like a normal mailing list, and I wouldn't call that SaaSS. 
But consider software like Bugzilla, where it doesn't work that way. Especially with a larger project, you customize it so that it has maybe even hundreds of fields, and the people who are administering it are doing complicated queries on all of the bugs, running queries that will modify all the bugs, reassign them, and then it's starting to look like a database with a front-end, a complicated database with a front-end, and then I would call that SaaSS. And this, I think, brings up a little bit of a problem, in that a lot of software projects want sophisticated software for their project, but they don't necessarily have the means to run that software, like a Bugzilla, because complicated software is complicated to run, especially as a service, because you have data, you have backups, you have all of the details of running complicated software. So in that case, it's like, are a lot of projects out there actually using SaaSS, giving up their freedom? Well, yes, basically, but it's an area where we have a long way to go, I think. And a good way to deal with that situation is for projects to come together in sort of a larger project that is in their collective interests. So some of the software I run, or I help run, is like that, like the GNU project, where many software packages come together and share the infrastructure that's going to be in all of their interests, and that way they're sort of doing a group computation together for something like a bug tracker. But now this brings me to the example of GitHub. So when I think of GitHub, I look around at the GitHub service, and I think there are a few features that are clearly SaaSS. One obvious example would be continuous integration. That's like they kind of give you a virtual machine and say, run some code on your software. 
And well, if the code that you're running, well, is their code, then I guess that brings up another topic, of virtual machines, which are not SaaSS if you control them. And another feature of GitHub would be, like, they have a feature to tell you which functions in two repositories are different. Well, clearly that would be SaaSS, because it's parsing the language of the repositories. It's doing a complicated function-level diff on them. And then some other features I think would not be SaaSS, like simply publishing a Git repository. That's just a publication of information, not SaaSS. And then their bug tracker, for example. I think in its basic form, it works very much like a simple form, like a mailing list, where one person posts and another person replies. And I would call that just a communication service. But I'm not too familiar with all of the advanced features of GitHub, and I get the suspicion that there might be some features of the bug tracker which go into the SaaSS area when it gets more complicated, when you're an advanced user of all of the features there. So I haven't picked apart every piece of GitHub, and it takes some time to do that analysis, which I haven't done. But in general, this brings me to the next topic, which is that, well, I don't know, how many minutes do I have left? Ten minutes. Ten minutes? Great. Okay, so I think I'm done with some basic examples of analyzing whether something is SaaSS or not, and I'm happy to talk to people because there sometimes aren't bright lines. And it's a little blurrier than with determining if a piece of software is free or not, but it's not like there aren't blurry lines in that either. So I think it's just something we have to deal with. A lot of ethical issues have their gray areas, and I just happen to be highlighting them. That's all. I don't think it's an inherent problem with the concept of SaaSS at all. 
So now I want to bring up one common misconception, which is that a service which publishes some free source code that it says it's running means that the users of the service have software freedom. They don't. The users don't control that service. They can't tell it what program to run. You can only do that with a program on a server you control or on your own computer. And there may be important code that it runs besides what's published. I mean, generally, servers are not publishing the operating system or other things. The first reason I gave is basically the fundamental one, but those are some other ones. And of course, you can't ever be sure what somebody else's server is doing. You haven't installed the program. You can't be 100% sure. It's different than running your own code on a server you control. So for services that are not SaaSS, I'm going to talk about this idea of publishing source code, and think about the difference and the interaction between SaaSS and the publishing of service source code. So when we think about publishing service source code, I think about the AGPL, which says that if it's a service, you have to publish the source code to the users of it. And what we say is that the publishing of source code benefits the community so that they could use that source code. And that benefit, people being able to use that code, is so important that it's worth mandating with the AGPL. And we recommend it for all software that is intended to be run as a service. And in fact, now, not thinking about the SaaSS issue, but just in general, if I encounter a service, I think, would there be any reason somebody else besides the server operator would want to run it? If there is, if there's a plausible case of that, I think, well, then are they publishing the source code? If they aren't, why not? I mean, why do they not want to benefit the community? And another benefit that brings, I think the biggest one, is this: if the service is working well, good. 
Maybe there are some fundamental services we all rely on, like DNS services. But then if the service stops working, or it adds conditions that people don't want to agree to, or it changes, well, then the publishing of source code is sort of an insurance policy that somebody else could start up a new service and users could move over there. And that is so important that I think it's worth considering in general for any service, which is separate from the SaaSS issue. And I think if we take a service like GitHub, that's obvious. Of course people would want to run their own GitHub, so we shouldn't accept a GitHub that doesn't have the service source code published. It's just foolish to subject yourself to the whims of that service operator without having some sort of insurance to go somewhere else, even if you aren't using it in a SaaSS way. So, I'm going to move on to my next topic, which is basically that I think SaaSS has not gotten enough attention in free software advocacy. And why hasn't it? I think there are lots of reasons. I talked a little bit about the complexity. I think SaaSS was far less common in the past. Nowadays, most services tend to require non-free software as a client, usually in the form of JavaScript. So services in general have caused a lot of problems for software freedom, other problems besides SaaSS. But like I said, I think SaaSS needs to get more attention because it's becoming more prevalent. Database services have, in the past few years, become very popular. Before that, it was much more common to run your own database. A lot of people are relying on these services, which are SaaSS. And, oh, I think one interesting historical reason for the lack of focus on SaaSS is that, well, it's not part of the GPL. It's not part of any license. When the GPL was being drafted, the FSF had its lawyers try and think of a way to add in a provision against SaaSS, and they couldn't think of a way to do it. 
So they just didn't. And so when we say the GPL protects your freedom, well, there's one little hidden asterisk there: as long as you don't give it away with SaaSS. And I'd be curious to know from some of the lawyers, maybe some lawyers here today, if they still think that's the case. If there's no way to have a SaaSS provision in a license, I'd be curious why that is. I mean, it's the one case, I think, of the FSF saying, here's an important issue that we couldn't write into the license. I haven't heard many people talk about how that could be done. Maybe not covering all SaaSS, but some portion of it. I don't know. I'm not a lawyer, but I would love to hear from some. So how can we give SaaSS more attention? I don't have all the answers. It's just a couple of ideas. I think number one is just to simply call it out more. When a company has a SaaSS business model, to say, hey, their business model is SaaSS. That's taking away people's freedom. That's a very simple way. Another idea I have is that there's this term self-hosting, which basically seems to cover the idea of SaaSS plus other things. It's the idea of services you run yourself. And sometimes it's even expanded to non-services, to things that should be run on your own computer and not as a service at all, and running them yourself, and people call that self-hosting. So the idea of self-hosting kind of covers SaaSS plus other things. So I think advocating for self-hosting, and saying self-hosting overlaps very well with software freedom, is a way that we can advocate against SaaSS without having to deal with the complexity of SaaSS itself, of explaining it fully. And I'm getting to the end of my talk here. I think those are most of the ideas I want to share, and I'd be happy to talk to people afterwards. And one shout-out to a specific program, GNU Units. There is a small SaaSS that many people use: they ask Google or another search engine to convert between Celsius and Fahrenheit. 
And there's a program that lets you do that on your own computer, called GNU Units, so look it up. And that's all I've got. Thank you very much. |
Windows and Office "tax" refund
Various cases about the refund of pre-installed software, and the right to install any software on any device |
Hello. So, one quick question for the next speakers. Can you please identify yourselves to us? So, Karen, who's the next one? Sarah James. Sarah James. So, if you're around, then please come to Karen. So, our next speaker here is Luca Bonissi, and he will talk about one topic where, when I first saw what he was accomplishing there as one person, it was amazing. The rest he will talk about, but yeah, it's amazing what one person was able to do there. Hello. Welcome to FOSDEM, and welcome to this talk. This talk is about the right to install any software on your device, and consequently about what in my opinion is an unlawful practice: pre-installing software on PCs that you normally buy bundled with your device. Specifically, this software is proprietary software, namely Windows and Office. I'll briefly introduce myself. Okay, the images. Sorry. Because the images are missing. Okay, maybe we can do full screen. Sorry. Okay. And yes, display, display, how to make a single display. Okay. Very good. Okay. I'll introduce myself. My name is Luca Bonissi. I come from Italy. My main job is developing firmware for electronic devices. And I also support some schools and small offices with free software, especially one school that I call a free software school, where all teachers and students are using free software. I have some hobbies that are always related to free software. I work as a video technician, where I use Ardour, JACK, JAMin, Cinelerra and Kdenlive. I play guitar and piano. And I use Rosegarden as a MIDI player to record from the piano. I like to observe celestial objects like the ISS. This image is the ISS, photographed by me with a normal camera. And I built a 3D printer and I use it mainly for repairs and so forth. I run some free software projects and activities. One is to put chords on the lyrics of songs, since I play piano and guitar, and transpose them automatically. 
Another is that I'm the maintainer of the unofficial port of Slackware, a GNU/Linux distribution, for many platforms and CPUs. A project that I just started is to free your Android TV box. I bought a cheap Android TV box and I installed GNU/Linux so it can be used as a general-purpose device. And I am a volunteer for the FSFE as a translator. For the FSFE, I built this project, written in Perl, a language that may not be used so much, but I love it. I built this program to help the translation process in the FSFE, so the translator does not have to care about the HTML formatting and so on. I also contribute to this fantastic book. I mean, I maintain the repository of this free software book. Free software means that this book is distributed under a free license. And it has been translated into five languages so far: German, Ukrainian, Italian, Arabic and English. The Italian book is not yet published, but we hope it will be published soon. And if you like this book, you can attend the reading of the book this evening at 6 p.m., and Matthias will be the reader, I think. Bonus hobby: I make ice cream. Since the book also talks about ice cream, it is very related to my hobby. And if you are curious, we will go to Ubitur this evening. In this presentation, you will find many illustrations like the one on the right that come from the book. So I'm advertising the book a little bit. Okay, we start with the presentation. We have two different contracts. One is for the hardware and one is for the software. These contracts cannot be joined together. Nobody can force the two contracts together and force you to give back the hardware as well if you ask for the refund. Because the hardware contract is a sales and purchase agreement. You become the owner of the product, of your PC. And nobody can force you to do what you don't want to do. Instead, the software is a license agreement. 
You have a right to use the software under some conditions that, for proprietary software like Windows and Office, are limitations. Instead, for free software, they are freedoms. I talk about device neutrality because you have the right to choose which software you would like to install. Also, by installing free software on your device, you gain sovereignty over your device. With proprietary software, you don't know what it does on your device, or which sites it connects to that you don't know about. Running free software can overcome software obsolescence. Because, especially with new versions of proprietary software, you are forced to change your hardware, because the new version of the proprietary software doesn't run on old hardware. Instead, free software you can customize, like I do with the TV boxes, and run on lower-end devices as well. And in this way, you extend the hardware lifetime and you also reduce the impact of electronic waste. There is also an open letter that the Free Software Foundation Europe asks you to sign, for the right to install any software on a device. The link is this one. Maybe we'll have it available later. And the Free Software Foundation Europe asked the European Parliament and Commission, and also public administrations, to ensure that every device is independent of the software it runs, by using open standards and open interfaces. This is related to our topic because sometimes hardware manufacturers use proprietary interfaces that limit the possibility of using some features of your device. Okay, first of all we come to the technical level. Manufacturers say, and this is really what Microsoft says about a device I purchased, a Surface laptop: Microsoft says that the device would not work properly with other operating systems and that they are not supported. This is said by Microsoft. And this is true, because on my Surface laptop, the touch screen doesn't work. 
The webcam doesn't work with GNU/Linux due to the proprietary interface of this device. Also, installing GNU/Linux from scratch is not straightforward, because the keyboard and the touchpad don't work if you don't build a customized kernel. The drivers for the keyboard and the touchpad are not in the mainline. But okay, you can compile them and then it works. And the manufacturer could also lock the boot loader. For example, Microsoft, with ARM devices and Windows 8, locked Secure Boot in a way that you cannot install software other than Microsoft Windows. So the device is not really yours, because you can only run software certified by another party. When we come to the legal level, even if the license says that you must contact the manufacturer, the manufacturer often says that you have to ask the vendor or Microsoft for the refund. But this is not true, because the license agreement says that you have to contact the manufacturer. Manufacturers say that you must also return the PC. So you would get a refund for the whole thing, both the PC and the software. In their view, you can't have only the software refund. Manufacturers set a very short time span to ask for the refund, usually only 30 days. Manufacturers say that you must prove you refused the license. But it is impossible to prove you refused the license, because when you start Windows for the first time, you have only one button, accept. You cannot even power down the PC. Especially if it is a laptop, it could be a problem, because the battery is usually integrated, and you cannot power off the PC if you don't know that you have to press the power button for several seconds. Recently, a manufacturer has switched their policy and said that you must return the PC to their laboratory because they must remove the product key, that is, the information about the Windows license stored in the BIOS. Manufacturers set a price on their own, not the market price of the Windows license. 
You first have to run the PC to see the license agreement, because to refuse the license agreement, you have to see the license agreement. With the latest Windows version, it is mandatory to connect your PC to the Internet before viewing the license agreement. I think this might not be compliant with the GDPR, because your PC sends information to the manufacturer's and to Microsoft's sites without your explicit agreement. There is no refuse button, as I already said, and you cannot install a different license. Say you remove Windows, you download Microsoft Windows from the Microsoft website, and you want to install a full license, or the gratis license, because Windows is gratis; the only limitation is that you cannot customize the toolbar. You cannot do it, because when you install the version you downloaded from the Internet, it looks for the product key. So you are stuck in a vicious circle. Okay, very briefly, refund cases in Italy. There was a case in 2005 against HP. The case came to the Supreme Court of Italy, which said the refund must be given. It took about nine years. The second case is similar. It also took nine years. It's very similar. And this case is regarding this PC. I requested the refund in 2018. Lenovo categorically refused the refund for the software. They said that I had to return the PC as well. So I filed a court case by myself, without a lawyer initially, because from my side it was clear that Lenovo was wrong. But Lenovo introduced artificial complexity so that I could not continue without a lawyer. So I asked for a lawyer. I won the first-instance lawsuit, but Lenovo immediately filed the appeal, the second-instance lawsuit. And they lost the second instance, and the judge of the Monza court imposed on Lenovo punitive damages of 20,000 euros because they used legal instruments in bad faith, without any valid reason. Okay, after these cases, I also requested other refunds. 
These cases were simpler, but not so easy. Acer, for example, asked me to return the PC to remove the product key. And in this case, since I have some excessive front end, I went by bike to bring my PC to Acer. There are also other cases, one with Microsoft as I already said. One with HP. I filed a court case against HP. They lost the court case. But in this case, the judge did not award me the legal fees. So the balance was zero: I lost what I gained from the refund. Other cases, okay. The one last year with Acer is very interesting, because when I asked Acer, this time I needed the PC immediately, so I could not wait to send the PC to Acer and get it back. And Acer said that it was Microsoft that asked for the PC to be sent to the manufacturer to remove the product key. Then I filed a court case, but it was stopped by Acer before reaching the first hearing. So they gave me 129 euros. The latest case, against Lenovo, was concluded three days ago. Lenovo says again that you must return the PC to the manufacturer, and they say that they are not satisfied by the license. The license they propose, the license they themselves propose, they are not satisfied by. So it is not very coherent. Okay. Asking for the refund, I'll go very quickly. You have to power on the PC, you have to read the license, contact the manufacturer, send email and certified mail in Italy. And you have to be very patient, because you have to send many mails before getting the refund. But you will get the refund. Okay. One important thing is that manufacturers artificially increase the cost of the PC with the mandatory payment to Microsoft. I think that if you obtain the refund, you should give part of the refund to some free software organization. I gave all 20,000 euros to this association. And if you want to donate, you can go to the link below, so you can support free software and value their work. Thank you. Hello. Is this working now? 
I think the main problem with this mic is the pickup on it is substantially lower than the pickup on that. |
Fuzzy Law-gic: FOSS & the Unauthorized Practice of Law |
All right, everybody, I'd like to introduce you to Sarah Dain Whitfield. Hi, everyone. As you know now, my name is Sarah Dain Whitfield. I'm an open source advisor at Google. This is my very first FOSDEM and my very first in-person speaking engagement, so please bear with me through the nerves and the jet lag. Thank you. Thank you. Everyone I've met so far has been so nice, so gracious, so understanding. I really, really appreciate it. And I'll also say that while I'm going to nerd out quite a bit on this topic that I find super interesting, if you came here for answers about what exactly the unauthorized practice of law is, I regret to inform you that you will not be leaving with answers. You will be leaving with lots and lots of questions, and I hope that that will spur some discussion, and we can think through some of the very fuzzy lines that determine what is or isn't the practice of law. Let's see. Hopefully this works. All right, so in light of the subject matter that I'm covering today, I wanted to just clarify that this is not legal advice. I am not your lawyer. This is for general informational purposes only. Likewise, I have a few housekeeping notes. So, typical American, I know: I'm exclusively going to be focusing on U.S. law today, mostly because that's where most of the action is with regard to software at this time. It's also what I'm most familiar with when it comes to the unauthorized practice of law. I absolutely acknowledge that basically every other jurisdiction around the world regulates the practice of law, and based on the research that I did before this talk, it appears that most other countries emphasize regulating practice in a courtroom setting, where you're actually representing a client in front of a judge, far more than out-of-court practice, and that distinction is less pronounced in the U.S.
Finally, I wanted to let you all know that, in the interest of time and everything that I want to cover, I'm going to be giving you high-level summaries of various rules and the three cases that I'll cover, and as I mentioned before, I'm going to leave you with some open-ended questions to consider. To get started, by show of hands, is anyone familiar with a company called DoNotPay? Okay, just a few of you. So DoNotPay is a U.S.-based company that was launched in 2015, initially providing a chatbot service that would help users fight traffic tickets. They expanded their services to deal with other consumer disputes: if your flight was canceled and you were eligible for a refund, the chatbot would help you through that process. If you wanted to cancel your free trial before you got charged, it would help you through that process. In 2018, they relaunched their app to integrate AI. I believe that they say that they're relying on IBM Watson, and they dubbed it the world's first robot lawyer, and they included various legal actions and legal document generation as part of their services. So if any of you are interested in getting more involved in the legal tech Twitter space in the future, you'll see things like this pop up all the time. So in the past month, and this happened after I had submitted this talk, so it worked out well for me, DoNotPay announced in early January that their AI was going to actually represent someone live in court, in traffic court, to dispute a $200 traffic ticket. A few days later, they offered a million dollars to anyone who was willing to use their AI exclusively in arguing a case in front of the Supreme Court.
About a week later, they announced some additional legal products that they were releasing, which would summarize terms of service and lease agreements, highlighting the terms that were not standard, so that the consumer had a better understanding of and more insight into what they were agreeing to, whether it was standard or not, and whether they were being taken advantage of. And they then confirmed that their case in traffic court was actually supposed to take place on February 22. Everything was good to go. Three days after that announcement, they regretted to inform all of their followers that they were indefinitely postponing going live in any courtroom, and they were also going to roll back the legal services that their website offered. So what happened exactly? Lawyers. Lawyers happened. Lawyers and the regulatory bodies in the US, the state bar associations, said, ah, no, I think that's our job. This looks and smells a lot like the unauthorized practice of law, and along those lines, penalties across the board can include civil penalties and fines, and can also include jail time. So DoNotPay was not interested in pursuing that and waiting to see what might happen. And the founder said he was surprised; he had underestimated how seriously lawyers in the US would take a $200 speeding ticket. So again, I said that you're likely going to leave with more questions than answers today. I would love to give you a clear definition of what the practice of law actually is, but unfortunately there's not one unifying definition.
Each state regulates the practice of law, so each state has its own definition of what the practice of law is, and they very much like keeping it vague instead of enumerating every possible action or type of speech that may constitute the practice of law, which means that you kind of have to wait and see what happens in court, or take a very conservative approach and just leave it to the lawyers. So along those lines, I've already mentioned that there are lots of different penalties that anyone may face, but I also want to emphasize that this doesn't just apply to lawyers; this covers non-lawyers as well. So if anyone is familiar with the US TV show Suits, which is what made Meghan Markle quite famous, at least in the States, the main character is the quintessential case of the unauthorized practice of law: someone who did not have a law license, didn't go to law school, and pretended to be a lawyer. Surprise, surprise, he is brilliant, he does a great job, but there's always this looming specter that someone's going to find out and he's going to be in a lot of trouble for the unauthorized practice of law. So that's the typical case, but it also applies to lawyers. So I just wanted to clarify that. Now, I tried to distill the common elements. I read tons of cases and tons of law journal notes and articles about this topic, and I was able to divide the most common elements of the unauthorized practice of law, or UPL, into these two main buckets. And they're going to look nearly identical, but I'll talk through some of the distinctions in a moment. So the first bucket deals with legal advice that's given by a person or entity that does not have a law license in the relevant jurisdiction. Sounds like a bunch of nonsense and legalese. The second bucket is legal services for compensation given by a person or entity that does not have a law license in the relevant jurisdiction.
So the clearest delineation that anyone could draw just by looking at the two of these is in the first two bullet points. In particular, the legal services bucket tends to specify that it's for compensation, whereas legal advice does not. So does that mean that if I provide legal services for free, I'm not liable for the unauthorized practice of law? Unfortunately not, because some states may then deem your provision of legal services to also be providing legal advice. And in most cases they draw no such line for legal advice: you can give it away for free and still be held liable. So that isn't great. There are also a bunch of other nuanced distinctions, in how courts look at and distinguish between what legal advice is and what legal services are, that hopefully you'll start to see come through as I discuss the cases. Are we okay? So since I can't give you a definition for the practice of law, and we couldn't really back into it from looking at what constitutes the unauthorized practice of law, let's talk about the public policy rationale for why this exists at all. This quote is from a 1952 Missouri Supreme Court case, and despite its age I think it sums up the gist of the public policy rationale pretty well. At the end of the day it comes down to consumer protection. This theoretically is not supposed to be about protecting lawyers and their livelihoods, creating a monopoly over who can give legal advice and provide legal services. What it is really supposed to do is protect the public and make sure that if they're getting advice, if they're relying on someone else, it's actually reliable, it's competent advice, and maybe there are some repercussions if that doesn't hold water. So in light of that great public policy theory, what does that actually look like in practice?
The options for legal representation in the US come down to, well you can do it yourself, all alone, do your own research, fill out your own forms, draft your own documents, file your own things, that's what we call pro se representation, or you can hire a lawyer which tends to be expensive and if you don't already have a connection through family or friends to a lawyer, it can be really hard to track down the right person. And additionally your claim may not be worth their time since they charge by the hour. And finally, in very limited circumstances, mostly confined to criminal cases, a court may actually appoint a lawyer to you, but that's not really a viable option for the vast majority of especially civil cases that deal with consumer protection. So how's that public policy rationale looking in practice? Clearly there's a bit of a disconnect. Additionally, the US is currently suffering from what many call a crisis in terms of access to justice, access to efficient, affordable, reasonable expectation of justice and the ability to interact with the justice system. And based on this quote and all of the other relevant demographics and statistics about this issue, it largely impacts low and middle income individuals who can't afford legal services. So to get back to the unauthorized practice of law, again, I wanted to try and find a way to describe the various ways that a court may consider, since we don't have clear definitions, what may bump over into the category, what may cross that fuzzy line into the unauthorized practice. And so these are various high level categories of what may tip the scales. But again, to clarify, there's no clear set framework that anyone is following. There's no set determination that, oh, well, if they disclaimed that they're not a lawyer and it's not legal advice like I did earlier, then it's clearly outside the realm and can't be held liable. There's no clear line drawn. 
If I am giving you legal advice for free, again, as I said, it very well may not be exempt from liability. So you'll see some of those factors come up again in these cases. In the interest of time, again, I want to kick off the first case. So this is Janson versus LegalZoom. Quite some time ago, in 2011, this was a case that took place in Missouri. And LegalZoom, for anyone who's not familiar, is a website that offers legal services throughout the United States. When they initially started, it was just basic legal forms that someone could buy, print out, fill in themselves, and do whatever they wanted with. That's what the court considered to be the goods that LegalZoom offered. They expanded into building out a program, essentially decision trees, where they provided questionnaires to their customers and then would generate legal documents based on the user input. That's what the court, in this case, deemed the services that were provided for sale. Sorry, lost my place. So LegalZoom did make a lot of statements that they were not a law firm. They were not providing legal advice. This was meant to be self-help services. And no actual harm was cited in the case as being done to the public or to the actual customers of LegalZoom. Despite all of this, the court still found that the services constituted the unauthorized practice of law. So in particular, and I wanted to call this to your attention, the court specifically homed in on the fact that a human had programmed the software to generate the legal documents. And since the human who programmed it was not a Missouri-licensed attorney, but had to have relied on Missouri law in order to generate documents that would affect a Missouri resident, they clearly must have been making legal determinations.
Just like a Missouri-licensed attorney would do a client intake, ask a series of questions of the client, and then use that to inform what documents they would choose and what content made it into those documents. So the first open-ended question that I want you all to consider: this was decided just over 10 years ago now. How might the court have viewed this human programmer input if this were an AI model instead? And based on how far technology has come, and on the layperson's understanding of what technology is capable of doing, how might that affect how courts look at this? So let's fast forward 10 years to 2021. This is a Miami-based startup called TIKD. They were sued by the Florida Bar Association for allegedly violating the unauthorized practice of law regulations. TIKD had developed an app that would connect anyone who had gotten a traffic ticket with a licensed Florida attorney to resolve the case, and they did it for an upfront flat fee based on a percentage of the face value of that ticket. And this is the main business model: they built an algorithm that used court data on the likelihood of a ticket being dismissed in court, or of being charged fees or accruing points in excess of the face value of the ticket, to come up with what was actually going to be profitable, kind of like an insurance company will calculate the risk that you, as a young healthy person, are going to suffer some catastrophic loss and they're going to have to pay big. It's probably a low likelihood, so they charge you based on that. At least in the United States; forgive the example. In the US we get charged a lot of money to pay for our health insurance and usually don't see a return on that unless something really bad happens, and that's a story for another time.
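The flat-fee model described above is, at its core, an expected-value calculation, and it can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the company's actual algorithm, which the talk does not describe in detail: the `Outcome` type, the function names, the fee rate, the attorney fee, and every probability and dollar figure below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One possible result of contesting a ticket (all figures hypothetical)."""
    probability: float  # share of comparable past cases that ended this way
    cost: float         # what the service would have to cover in that case


def expected_cost(outcomes):
    """Probability-weighted average of the costs across all outcomes."""
    total = sum(o.probability for o in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(o.probability * o.cost for o in outcomes)


def flat_fee(face_value, fee_rate):
    """Upfront fee charged as a percentage of the ticket's face value."""
    return face_value * fee_rate


def is_profitable(face_value, fee_rate, attorney_fee, outcomes):
    """True when the flat fee exceeds the attorney fee plus the expected cost."""
    return flat_fee(face_value, fee_rate) > attorney_fee + expected_cost(outcomes)


# Hypothetical $200 ticket with made-up outcome probabilities.
outcomes = [
    Outcome(probability=0.6, cost=0.0),    # ticket dismissed
    Outcome(probability=0.3, cost=100.0),  # reduced fine
    Outcome(probability=0.1, cost=300.0),  # fine and fees above face value
]
print(expected_cost(outcomes))
print(is_profitable(200.0, 0.8, 50.0, outcomes))
```

Nothing in this calculation looks at the legal merits of an individual ticket; it only prices risk from historical outcomes, which is the same distinction the insurance analogy draws.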
So to get back to the TIKD case, they made it very, very clear in all of their website communications, in their terms of service, everywhere on their site, that they were not providing legal services. They were going to connect you with a licensed attorney who would take over and handle the case for you, and their sole responsibility would be determining whether or not your case was a good fit, referring you to a licensed attorney, and then negotiating that flat fee on your behalf with the attorney, and they would take care of all the payments. Once the connection was made between the customer and the attorney, TIKD removed itself from that relationship other than providing the financing to pay the attorney and any related costs. Likewise, the lawyers were not employees. I'm running out of time. So to skip ahead, no harm was done. There were two groups that filed amicus briefs, one in support of the Florida Bar, one in support of TIKD, and surprise, surprise, they were at odds. One was a group of private practice attorneys in competition with TIKD. The other was a group of consumer rights activists and legal aid services. So back to that public policy rationale. In a 4-3 decision, the Florida Supreme Court found that TIKD was engaged in the unauthorized practice of law and cited a number of potential harms that might befall their customers, but no actual harm. But this was enough justification. The majority even said that they lauded the idea that technology was lowering the barrier of entry to getting legal services and had public benefit and public convenience, but that still was not enough to knock out the unauthorized practice of law claims. The dissent, I'm not going to have time to cover it in full, but it's great if you have a chance to read it. I have citations in my slides. Essentially they say, hey, this is a hedge. This is purely financial. It has nothing to do with law.
And all you're getting is the certainty that this is the flat fee that I pay, and that's all that I'm going to pay. And it has nothing to do with legal work; the lawyers handle all of the legal stuff. So this should not have been considered the unauthorized practice of law. In re Peterson involved, and I'm going to skip through this, forgive me, a non-profit called Upsolve. Upsolve provides free legal services to low-income individuals to file for personal bankruptcy in the U.S. Again, no harm done, no fees charged. Upsolve was still found to be engaged in the unauthorized practice of law, despite plenty of disclaimers, relying on government-issued forms and instruction sheets and publicly available information, and making sure that they were only serving clients that had very basic cases that didn't require additional legal advice or input. They had a lot of controls in place, but unfortunately it was not enough. Fortunately, unlike TIKD, Upsolve was able to work with the bankruptcy court. They're currently working with the bankruptcy court to make changes to their program so that they can continue their services, whereas TIKD actually had to shut down. They are no longer active and able to conduct business in the state of Florida. So this is a handy chart. Again, forgive me for the time. We won't be able to get to everything, but I did want to note here that, of the three cases that I talked about, Upsolve is the only one that did not charge for their services, and yet they were still found to be engaged in the unauthorized practice of law. In the LegalZoom case, I didn't cover this, but they were providing some direct filings on behalf of their customers, which in the case law is considered in-court practice, which usually is a higher threshold. Everywhere else, it was out of court. There was no interaction with judges. There was no in-person representation. Finally, I wanted to distinguish the decisions. All three were found to be the unauthorized practice of law.
LegalZoom is still operating in Missouri. They settled the class action lawsuit, and they worked out a deal with the state in order to change some of their services and disclaimers and get around any future claims of unauthorized practice of law. Same with Upsolve, at least for now, but TIKD did have to discontinue their services. One thing that I left out, which I found out this morning, which is why it's at the very end: Upsolve has publicly released some of their software on their GitHub org. It is free and available under open source licenses. I don't know if this was the software that was involved or at issue in the case. It wasn't cited or specified in the case, but it's interesting to note that open source software is involved. The reason that I'm bringing this up at the very end, and I know I'm at time, is that these are some additional open questions. In light of the various facts and elements that a court will consider in terms of what is the unauthorized practice of law, and since we're at FOSDEM, I wanted to address what effects this may have on FOSS projects. Everyone knows I'm not a lawyer, but I'm going to give you my opinion about something. Licenses contain warranties and disclaimers, et cetera. I don't have time to cover all of this. Forgive me again. You've been a great audience. I really appreciate your patience with me. Please check out the slides that are already uploaded. This is CC BY 4.0. |
Is “European open source” a thing?
Debating the role of open source in building Europe’s digital sovereignty |
Okay. Good. Thank you. Well, I'm Gaël Blondelle. I work for the Eclipse Foundation. Hello. Can you hear me? Is it working fine? Yeah. Yeah. Well, I'm the other one, Alberto from OpenNebula. And the topic today was actually a proposal from Alberto: how about we talk together at FOSDEM about European open source? So I think you proposed the topic and I proposed the title, which is, is European open source a thing? Because, well, over the last, I don't know, two, three years, maybe more, I have been at lots of events where I see people who want to do more open source because they think that open source is good for Europe. But at some point, there is always somebody who starts with the idea that, well, open source is cool, but what if we did open source and tried to limit it to European companies, to European stakeholders, or stuff like that? And so that's what we want to discuss today. So we prepared a list of questions. We expect that some of you may want to change our point of view. I think that we will have different points of view on some of the topics. And, well, let's start. Do you want to define European open source? Okay. Well, I would like to explain a bit how selfish I am, because this whole session, for me, personally, is... Microphone is not on. I think it's on, but just... Try to speak as loud as you can. Louder, like that. And you hear me without the microphone. No, but you need to use the microphone because we are recorded. It's for the live stream, but everybody can hear, even though the microphone isn't on. Yes? No, in fact, you have to really project. Or share the microphone. Yeah, let's share that one. No problem. That would be a bit painful, but we're sharing stuff. Okay, so I just wanted to explain what this session means for me, personally. As Gaël has mentioned, the concept of European open source has been around for a while.
It's been in the last, yeah, three, four years that it's been discussed within this kind of new geopolitical environment in which we all live now, especially at the European level. So, for me, I'm one of the people actually pushing for this concept to be something distinct. But I can also confess that I have my doubts about how we define European open source and whether that is something that might be divisive or counterproductive. I have my position, let's say, and that's also part of my job. And I'm happy to say that, both at a personal level and a professional level, I agree with the things I defend. But I also have, as a European, some doubts about this and what this might imply for the open source sector. So, for me, this session is also about learning what all of you think about this. And that's why we thought that it was about time to have an open discussion and an open debate within the European open source community about the implications that this concept might have. Yes, so maybe just a few words about me. So, I work for the Eclipse Foundation, which moved its legal incorporation to Brussels two years ago. And it's very important for us that we are a global organization established in Brussels. Because when it comes to the answer to what European open source is, I think that, from my perspective at least, open source and free software is everything we do where you can use, study, modify, and redistribute the software. And that has nothing to do with Europe. It can be done in Europe or outside Europe. Everything that is written in Europe should be usable everywhere else, and we should be able to use what is written elsewhere. So, that's really sticking to the definition of the FSF and the open source definition. Still, I think that we can have a specific approach to open source in Europe.
That's how we help European organizations, how we help European companies do more and understand better in open source. And that's also why I wanted to start with this topic, because every time I go in a conference, I have been in conferences where people were telling, yeah, well, we get money, we invest in open source, and then we have the problem that what we do in open source could be reused by non-European companies. And thinking, oh, my God. Oh, that's so fucking wrong. Why do you do that? Why do you do that? So, that's really one part of the message I wanted to share today. I think we agree on that. Still, I think that we should be more diligent in Europe about helping European organizations, helping European companies, helping universities do open source, produce open source, publish open source, and also be more ambitious about how to use open source as a vector to project technologies and to project ideas in the rest of the world. We can come back to that. Yeah, so I think, I mean, from my perspective, coming from the European open source industry sector, let's say, as a technology provider, as a company that produces open source and makes money out of the services we provide to customers with this, I mean, I think we share that concern that, I mean, open source cannot be something that in any way limits who's going to use that or puts any barriers to who can improve the technology or contribute to that technology. From my perspective, I mean, I've also been involved at several of the European initiatives for things like data sovereignty and cloud sovereignty, these kinds of discussions, things like the European Alliance for Industrial Data, Edge and Cloud. That's something that the European Commission launched a couple of years ago. So for me and other people in the sector, I would say, I think exploring this concept of what European open source means is kind of a natural step after we discuss the problem we have in the market now with cloud providers. 
So from our perspective, we recognize, we identify that there's a market failure in Europe in terms of access to different cloud providers, or the dominance of a number of non-European hyperscalers. You all know the names. Let's fix that, because the European players are playing at a disadvantage here. So for us, the next step is, OK, so what are the technologies that European cloud providers are using? Are those led by European organizations? Are those roadmaps defined in Europe? Or are those technologies defined and implemented by non-EU companies and organizations? So for us, it was like the next step: OK, what if we apply this concept of sovereignty, of technological autonomy, let's say, not just to the proprietary software, but also to the open source software that supports all these cloud infrastructures in European cloud providers, and in other companies as well? Can we do that? I mean, can we get some understanding of what that would imply for Europe in terms of getting the industry involved in producing and maintaining the technologies that they consume, not just being mere spectators of what other companies and organizations produce, outside Europe most of the time? So that was kind of the next discussion we had. Any questions at this point, or maybe somebody wants to shout or say something? What do you think? What do you feel about the concept of European open source? It's an oxymoron. Why? Good. Can you repeat what you heard? It's an oxymoron, he said. It's an oxymoron. Why? Explain why it's an oxymoron. Yeah, I just don't understand why you put that adjective in front of this noun, right? I mean, you can do that, it makes grammatical sense, but I don't understand what the meaning would be. Mm-hmm. And I can. So for me it's, OK, this is still not working at all. Yeah, it's in the live stream. Yeah, it's working for the live stream. I think we agree. And to be honest, that's not the controversial part in it.
And the point is that, yes, open source is global, and everything that intends to limit the scope of open source is certainly not what we want. Still, it's more about not taking it as an adjective, but taking it as, OK, we have open source, and open source is global. And what we may want, as a European citizen working for a global organization, or as another European citizen leading a European company, is to have a different strategy, like an industry strategy, to sustain and develop open source better, differently, in Europe. And so I completely agree with that. Yeah. And so you should have policies and frameworks and a whole bunch of other types of things. So let's use that example, right? Water takes the shape of the container that it's put in, right? So when one talks about what water is, right, one talks about the fact that it's in a bucket or in a cup or in something, because the framework gives the structure to the water. But you don't talk about whether the water is going to be bucket water. A bucket for the water. Yeah. Does that make sense? We are not, OK. I'm French and he's Spanish, so it's possibly not the grammatically right approach to the topic in English. OK, that's what I'm going to do for the discussion, right? At least I think that we set the stage for the conversation, yeah? I think it's an accurate term, but I think there's a missing word at the end of it. It should be European open source industry. Yeah? Because for me it seems like it's very standard. So the comment is that it should be European open source industry. Yeah, and that's another aspect that I like about it. Yeah, because for me, I think it's the only concern I've heard about why this matters: it's the one about competition, so competition with US and other global companies. And that is predominantly sort of an industry concern.
I think, for example, like there might be other concerns around sort of like governance of open source projects like that are based in Europe or have predominantly sort of like European contributors and things like that. But from my experience working with those projects, none of them have sort of, and at least intentionally sort of like isolationist sort of like tendencies, they might be isolated just by fact of like who they know or who they don't know. That's the diversity of something like that. There's not, yeah, yeah, but there's not, it's less about lack of outreach rather than we want to be isolated. I think most of these projects would rather be not seen as European, but rather as part of more open source movement. Do you want to react? So if we could please just have the speakers call on anyone who wants to speak, we'll bring the mic to you, and then you have to speak really loudly into the mic, but please call on people and let us know, we'll bring the mic to them. So my take on that is that, yes, there is something that we could define as European open source industry. I think there's also something we could call the European open source technology. And from my perspective, that is open source technology whose roadmap and whose governance is controlled by European organizations, yes. It doesn't mean external or non-EU actors cannot or contributors cannot join that community or contribute to that technology. But the roadmap, the vision of that technology, the values, if you want to talk about that, the governance for this that control those technologies are under European hands. Sorry, before, I have slightly different opinion, and my opinion is here on the question seven. I think that definitely one of the advantages of open source is that we move the needle from having access to proprietary IP to having the skills to study, to understand, to master, to evolve, to develop the open source software. 
And I think that what we need to do, what's the next step? So you represent a relatively small company, and I think that we have a vibrant ecosystem of small open source companies in Europe. And I think that what we need to do to go to the next step in terms of European power in open source, technology impact in open source is to involve the larger companies. And in Europe, we have a characteristic which is that the larger companies are not software companies. They are automotive companies, they are aircraft companies, they are utilities, et cetera. So we have a lot that is being done in terms of policies in engaging the government. So we have a governmental support, et cetera. And I think that if we really want to get more leadership from Europe in this topic of global open source, that's really when we manage to involve the big companies. So we work, for example, with automotive companies, mostly Germans, but we have a Volkswagen, Mercedes, Bosch, Continental, ZF, et cetera. So that's companies that it's very interesting. Some of them have been doing software for a very long time, embedded software, et cetera. And they are slowly getting to use open source to collaborate together. And I think that's really fundamental because they are likely about to create software that would be game changing for them, but also for the rest of the industry. So I think that there is a happy battle which is, okay, what can we do with open source? And on the other side, you have the big, the GAFAM and big software vendors. We don't have this big software industry in Europe. But open source is really a tactical and potentially a strategic tool that can be used by all our industry. And one of the problems is that in this industry, all the managers, they have no clue about software. They have no clue about open source, even less. And so they think that they hear about open source like, oh, it's open, it's cool, et cetera. 
And I think that collectively, we need to propagate the message that, well, doing open source is, you know, you need to take care about your community. You need to take care. You need to have a governance. You need to have rules of code of conduct, et cetera. And so that's really something where that may be counterintuitive for them when you talk to, I don't know, the CEO of an automotive company, that may be counterintuitive for them that open source can be good for them. But I think that when they engage with open source, that will also be good for you. Because they need you, and that will really grow how we collaborate in open source. Because for the moment, and that's the reaction and what creates the fact that people want to talk about European open source, that's exactly what you describe in ways of governance. And for the moment, I can understand the frustration of some European actors, not the system integrators, because system integrators, they take the technologies, they create a solution with this technology, and wherever the technology comes from, they are happy. But for everybody else, to some extent, the fact that they have those open source technologies coming from mostly the U.S. is, okay, so what is our impact on it? Sorry. Go ahead, please, if you want to move away from IP, please do not refer to it as European open source, but put another word on it, like European open source initiative, framework, community, something that refers to all the aspects aside from the software code. Because when you're talking about European open source, you think about the code, the IP, and all this stuff. With that, please use a framework or something, so that you describe the whole approach. But my interest is leadership, and my mind is more leadership, maybe, so we put it somewhere and describe it. 
Yeah, so when you skip away from IP and from non-Europeans using open source software, and you can frame it in a better way, so please refer to it to like an initiative or framework or something that you can put multiple approaches to it, but please make a big, please, please do. Okay, just quickly, I have one here, then I saw this one next, then there was one up there, then this one, and then here, so I will try my best. Okay, so this might not only work without the mask. One short question about the European open source debate. Have you ever considered not to try to frame it as a kind of license, because we have seen from the development of the approach of the German car makers to make a German car operating system that they paid horribly amounts of pain when they forked off certain line exports and spent a triple digit amount of millions on something that they then had to scroll back, thus it might be more useful to go to open source and give them some kind of label as we are a company and we maintain it and then eat the liability that comes with using open source and can be contracted given that the Cyber Resilience Act implies companies in Europe to actually take the liabilities that come from using open source, so instead of making it a license thing, you basically go to a certain software stack that you maintain that you can still freely integrate developments from all over the world and then you basically put a stamp on it, like if you want to use it, you can contact us and pay us a service level agreement to do the maintenance for that and then we will also take the liability with regards to the liability that will come with the Cyber Resilience Act where companies have to accept the risk that come from open source and an IT security aspect. Would that be a different spin to actually fostering a money flow into European open source companies and actually create a business model? 
I don't know, but we could try, just a clarification, I mean just put something that is going to be a real question that we will have to answer in Europe in a few months time I would say. So let's say we in Europe decide that we want to develop a specific piece of technology that we think is crucial for Europe. We want that piece of technology to be open source, it's going to be funded with European fundings and all this, we managed to get the European industry on board, not just as a consumers, passive consumers of open source but as active maintainers and developers of open source around this technology and we want this technology in the long term to respond to European, to the priorities of the European Union in terms of the industry and the geo-strategic priorities of the continent. We want that technology to respond to that, how can we protect that from let's say a hyperscaler contributing with a thousand developers to that project and kind of taking over the governance structures of that project, how can we from that scenario protect that open source technology as I define that, is that a way we can do that without excluding non-European companies from the governance bodies. I'm asking, because that's going to be a question that we are going to have to respond in a few weeks time, sorry in a few months time with thinking of a specific piece of technology that will be developing in Europe by the end of this year, it's probably going to be the largest open source project in Europe. So here's next and then there was one over there, another over there, then over here and then over there. So yeah, real quick question maybe hopefully, I find it very difficult to discuss these topics in abstract terms. 
So I wonder, do you have concrete examples of important open source projects where the project management, the governance or the development happened against the interests of European consumers or contributors? And if you have such concrete cases, do you know whether this is because the Europeans didn't get involved, or because the decision was really made against them even though they were engaged? Well, in terms of open source projects that are largely controlled, let's say, by non-EU companies or vendors, there are a number of examples there. You want me to give you some names? Not exactly the projects, that I can understand, but cases where the decisions, the project management, went against European interests. Many open source projects are of course led by U.S. companies and organizations and so forth, but I'm explicitly asking for controversial decisions in project management and governance, because that's more concrete. So he's asking me for specific examples in which European companies have been sidelined in some decisions in open source projects. I don't know, it doesn't matter. It's about the risk, it's risk management. So how do we prevent things from happening, how do we prevent vendors from dropping their support for an open source project overnight and collapsing the sustainability of that project? Because in Europe we have relied for so long on non-European companies to develop these technologies and maintain them that we've lost the skills and the capabilities to take over. That's the problem. When we're talking about thousands of millions of euros invested in technology, I'll give you an example of that. There's something called the Important Project of Common European Interest on Next Generation Cloud Infrastructure and Services, IPCEI-CIS. That's mobilizing thousands of millions from European funding to create a whole alternative stack of technologies for managing the cloud and the edge in Europe.
That's only for European companies, and that is going to produce, in my opinion, the largest ever open source project in Europe. How do we protect that if we open it up to Google, AWS, Microsoft, the very companies whose dominance in the market this technology is being designed to neutralize, and to prevent them from taking over the edge computing market as well? That's a practical problem. That's a topic on which we will likely disagree, but that's fine, that's why we have a conversation. Because from my perspective, if you talk about IPCEI, if you talk about edge computing projects, there are effectively millions of euros that are spent on helping European companies, and that means companies with headquarters in Europe. I remember I was in such a project, and one of the comments we had when it was rejected was that you cannot get this specific funding because some partners are not European. So that's not just a company with a subsidiary in Europe, etc. That's really it. So we have the funding mechanism. That's already in place. And so next, the question is not how do we restrict some of the funding to help European companies. The question is how do we make it effective. And I think that we have a fundamental problem in Europe about that, which is that we spend lots of money, and at some point in research projects, I don't know if you have been involved in those research projects, but in research or innovation projects, we tell people, hey, you need to do open source. And some of them, I would say 80% of them maybe, understand that they just need to publish abandonware on GitHub to check the green box, and they think, hey, come on, we published in open source and we're done.
So I think that we also have to do our part and to understand that, yes, when you do open source, you need to build a community, you need to build a governance. And I even imagine, I dream maybe, that a group of European companies, because they identify that a project that has been created somewhere else is very important for them, takes over: sends enough developers, in a meritocratic environment, to take over the governance of the project and to have more committers, et cetera. So I know that in most countries, in most continents, we can see some protectionist approach, and I don't like it, and my organization doesn't like it either. I think that we should do exactly the opposite. We have the framework to fund some developments. We have the incentive, or we have the requirement, to make the development open source. The problem is that at some point it is lost in fine-grained open source initiatives that never reach a critical mass. So of course, when I say that, I have an interest, which is that at the Eclipse Foundation we can help people group together to reach such a critical mass. But I mean it. There are plenty of open source foundations, there is the Apache Foundation. We are here in Europe, we are incorporated in Europe, hosting code in Europe. And I want to see a more critical mass of people and companies who group together and have the ambition not to be reactive, but to really, well, just create the next technology for edge computing, for example. And the next technology for edge computing, that will be open source. Whether it's bootstrapped in Europe, or it's bootstrapped in the US, or it's bootstrapped in China, everybody at some point will converge on a few of those technologies. So, you see, we need to do better when we do open source, and I don't say that for you.
No, I agree, I mean, there's no discussion about that. So I can tell you, the next big tool for edge cloud in Europe is going to be open source. That's it. And that involves the main industrial actors in the continent. So that's a fact. It has to be approved by DG Competition and all that, but those are minor details. But if that happens and goes well, we'll have a five-year-long project developing some first-class open source in Europe. And I agree with you, we have to mobilize the whole community beyond the small technology providers, and we have to invest that funding better and more strategically as well. So as a company, for instance, I would love it if European projects, whether Horizon Europe or other research or innovation projects, hadn't been spending this money on small pieces of software to contribute to OpenStack, for instance, which is our competitor, and not to OpenNebula or some other European providers. So one of the most striking things in the work we've been doing over a couple of years is bringing the main European open source providers into the European research and innovation ecosystem, because they weren't there. People like SUSE, like MariaDB, LINBIT, I mean, these people have never, ever, ever considered even trying to get this funding or contributing to research and innovation projects in Europe. On the other hand, we find most of the US open source vendors, but also cloud providers, very well positioned in that area. So that's something we also have to work on, to mobilize all these resources to actually help European open source companies be more sustainable over time and get these contributions more fluently. So we have a question over here, afterwards I registered a question here, then one over there, then I saw one over here, and I saw one at the front.
One request, if you want to ask a question, please put your hand up and then try to get eye contact with me, then I can acknowledge that I saw you. It's else a little bit difficult in this large room, so, handing over. So what I think is that, like, back to maybe this definition part, where we have a European open source, I feel like the open source part of that is very much, very much global. So there's no, it shouldn't be localized, or it shouldn't be local to any area, really. And what, then we have in Europe, what we should maybe improve on is the, like, those two aspects, the using open source software and the creating open source software. And I can't exactly speak for those major projects, which are, like, funded by huge companies, either, well, in Europe or mostly in the US right now. But what I see is just a lot of individual projects from individual people who just contribute to open source, perhaps even their free time, or at a small company or whatever, and, right, but they still are, they are still the backbone of maybe an entire industry without that industry even realizing. So I feel like this is where we, as maybe the European Union, or as Europe in general, could dramatically improve on, is just to focus on those essential projects. I mean, there's this, I forgot the name, a German fund for exactly those projects, kind of project, like the base project, kind of small, but still really important, and, hmm? Yeah, exactly, right. So I would love to see more of these kind of initiatives. And I think that's really important, not, I mean, I don't know if, like, those governance aspects of open source are even that important. 
I see it if we are actually the main consumer of something, or we have specific goals we want to realize in open source, as Europe. But I feel like just empowering those individuals, or companies, or anyone who's contributing to important open source projects, regardless of whether they have a governance role or not, is really beneficial. Yeah. Well, I like the topic of empowering people. I think that when we fund open source, the topic is that sometimes maybe we don't have enough money going down to the developers. Because, you know, in some of the places where I hear a lot about open source, open source, open source, sometimes I hardly find the developers. So we have the problem on the two sides of the spectrum. On the policy aspect, we have policy people who are talking about open source, and that's fine; I think that we potentially need to do better to characterize what we mean by open source and to make it more consumable by the policy people. And on the other side, we have also, I think, the romantic view of open source being done by individuals, like what you described. I think that, especially here at FOSDEM, okay, we have all the individuals coming for the weekend, et cetera, all the volunteers, but I think that even in the room we have also lots of people who are doing open source as their day job. So I understand the individual aspect. I think that there are also lots of projects that are not really supported by individuals, but by some kind of organization that requires those projects. And I think that after Heartbleed and Log4Shell, et cetera, to some extent the industry is embracing the topic and trying to, how do you call it, do better, and potentially hire those individuals on a contractual basis, et cetera, to take care of it. So. Yeah.
I mean, I don't know if any of you attended the OpenForum Europe summit yesterday, but there was an interesting talk that I recommend you look at, by the CEO of GitHub, right? It's very interesting. He was arguing about the AI Act and how open source should be left outside the scope of the AI Act because of the nature of open source being run mainly by individuals. And then the contrast is that when you go to argue against the Cyber Resilience Act, the argument changes: look, I mean, we are companies, every open source project is at some point commercial. So I think we have to, again, discuss, especially in Europe as well, how we explain what we do, how we explain what the community is. And it changes across different parts of the technology stack and different projects. I mean, some projects are run by volunteers, some projects are run entirely by the communities, other projects are run by companies alone. I mean, our project, we run alone. We control the roadmap, we control the releases, and you're free to contribute, and on our side we accept that or not, and that's it. Other projects have a different governance: it's a community of companies, and they somehow benefit from that shared knowledge, and that's perfectly fine. That's collaboration among competitors, and that's perfectly okay.
I mean, there are different models, and they are, I think they are all okay, but each of them have their own particularities, so this, this concept of sovereignty in Europe, I think especially relevant for those projects in which they are, they are mainly maintained and controlled by, by, by companies, by open source companies, or other companies, so that's, that's, just to frame a bit the, this, this, the discussion, I mean, from my perspective, so those are the kind of projects that, for me personally, and for all this, create some concern in terms of, if we develop these projects in Europe, and we have this ambition of having the European industry heavily involved in maintaining these projects, how we protect these technologies and these communities from, from the very best, I mean, the very same companies we are creating this technology, not against, but, you understand me, right? I would like to come back to the European project, and I think it's really important, we want to be democratic, it means freedom, and seeing what, what's happening right now in the world, it's like woohoo, so democracy and sustainability, I think it's part of the European project, and so how can we be leaders in, and I prefer free software, because free is in freedom, not as in free beer. How can we use that for interoperability that leads to right to repair and, and that helps sustainability in the long run? So how can we put open source European software to help build the European way of doing stuff, if I can say it like that, and I think that the adversary of, of European open source software is international close software, more than international open source software, I don't know, I mean, when I see that all the better universities are sold to, to Microsoft, we have a problem, like, and how can we change that using European companies that are using open source software? Yeah, I mean, that's, that's the, that's, those are a couple of big questions, yeah. 
I mean, as we see this debate taking place, I think that in some parts of the technological world, let's say, in part of the stack, we are happily beyond the debate between open source and proprietary. I understand that in some other areas it's not that clear, it's a hard sell, I mean, you have to fight a lot with users and other companies to get them to trust open source. In some areas, I think we are beyond that. So I'll tell you a very small anecdote. We had the General Assembly of the industrial, what's the name? The European Alliance for Industrial Data, Edge and Cloud, in December, for instance. It's an industrial alliance set up by the Commission, or supported by the Commission and led by the industry, only for European companies, okay? We are defining the technological roadmap for the cloud and the edge, and, as a bit of a promotion, I have another talk at five in the Sovereign Cloud devroom about the specifics of this initiative. So we are defining the technological priorities that we need the Commission to help us co-fund in the next years around edge, cloud and data. Only for Europeans. And when we had this General Assembly here in Brussels in December, there were three working groups: the cloud and edge working group, the one for member states, with members from different governments, and a specific working group for the defence and aeronautic sector. And for me personally, I'm old enough to have seen some changes in the sector. For me, watching these five representatives from the main defence companies across Europe saying that they want to use open source. They do want to use open source. They just need help to use open source. That was surprising for me. Okay, it's the defence sector, okay?
But you know, I mean, this is people with a very specific mindset and very critical projects and very hard requirements on the contracts they get and all that stuff. They were saying, we want to use open source. We just need to see how to actually put it under the requirements we have in the sector, because that's what we want to do. And they want to contribute to these projects. We just need to tell them how exactly, with their peculiarities and their particularities. But in some areas, we're happily beyond that discussion. And we are, I think, mature, I mean, the market in that area, so the sector is mature enough to get into the nuances of what do we want to do. Into the nuances of what do we mean by open source? Who's controlling these open source? Who's developing these? What are the risks of taking, assuming, or using this technology or that other technology? I mean, obviously, I have to confess, that's a debate as a small open source European company. This is an environment that benefits the small company like us and others, because it helps us differentiate from others. But that's what it is. I mean, we have the European open source sector, you know what it is. I mean, we have a few larger companies, but most of us are smaller. We've been having for years, producing very good technologies, I think. But that's what it is, so yeah. Here's the question. So we have one last question here. I saw many others registered this, but we can't take any further questions after this one. So I'm very sorry about those who gave their sign to me, and we can't take your question. Yeah, thank you. It's not so much a question anymore, given the time, but I think, given what's on the slide right now, is European open source? I think you talk a lot about governance of it. I think it's about dependence on external actors, which are not a Europe. I think we want to be self sovereign, so as it's now the hype. But I think we should have expertise in it. 
It doesn't necessarily need to be European development. You need European expertise to judge it, so that if somebody decides to stop it, we can take over. It's not so much that we created it in the first place, but we need to build the expertise, not only in development, but also in knowledge about using it and promoting it. I think that's one of the things where we just lack education. Or even forking a project and maintaining it. Exactly. We need more education on both the development and the promotion of open source, and on using it. I think that's the thing. We need European education on this. Yeah, I think we agree on that. And in all honesty, when I put up that sentence, and again, I'm French, that may be incorrect English, my goal was to bring us to quickly conclude that the short answer is no. But I think that we must have a European industrial strategy for open source. And I think that we at least can agree on that. And we have lots of talks about open source in the European setup. And it was mentioned in one of the questions, with the CRA. And I think that, OK, on one side, for me, it's really interesting to see that the Cyber Resilience Act mentions open source as an exception. But the way it mentions open source as an exception is something that is almost not acceptable for lots of people in open source, because it creates lots of concerns. And so on one side, we have incentives, lots of incentives, to do more open source in Europe. And on the other side, we are creating regulations that could break the way we redistribute software. So that would really be a problem. And that was my conclusion. So that's your conclusion, and we are done. My conclusion is that the time's up. Thank you, everyone. |
Financing Open Source by small companies
We give Open Source projects 1% of the revenue, and you can too! |
I hope that there will be someone here who raises their hand and can come and do the talk instead of me because they know better, but unfortunately none of you. I'm not going to convince myself, I'm already convinced; I'm going to convince you that we can finance open source through small companies. We are giving one percent of our income to open source projects. I'm going to convince you that all of the small companies in open source can do that, and that we can change a lot in the open source world with that. Do you know what it can be, 23.1 million? Huh? Hard question. That's the number of small companies in the European Union. Wow. How many of them work in computing? How many of them use open source every day? I would say half, probably. But how many are contributing and are constant users of open source? Way harder to answer. Would you say at least 5,000? Imagine now that 5,000 companies, which is not that optimistic an estimate, each give 1,000 euros per year. The total budget you get is 5 million. What can we do for 5 million in Europe? Quite a lot. And 1,000 euros per company is not a high number, it's really a pessimistic estimate, so we can do way better than that. Before we get to the details, a small dictionary check. We are in Europe, and when all of us talk about taxes, incomes, and financial things we are using our own languages, so we need a dictionary check when we are switching to English. So, turnover, income, gross revenue: same thing, it's all the sales of a company over a period of time, usually a year. And profit is earnings minus all of the expenses that you have. What I'm convincing you of is that you can give 1% of the income, so of all sales in a year. Now, my personal point of view. Syslinbit is a company running multiple economic activities under one roof, and we are helping new activities and new people to start their activities.
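The arithmetic behind those figures can be written down as a small Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the 5,000 companies and 1,000 euros are the speaker's own rough estimates, and the 100,000 euro revenue figure is just a round example. It also makes the dictionary check explicit: the 1% is taken from revenue (all sales in a year), not from profit.

```python
# Illustrative sketch only: function names are hypothetical and the
# figures are rough estimates, not measured data.


def pooled_budget_eur(companies: int, donation_per_company_eur: int) -> int:
    """Total yearly budget if every company gives the same amount."""
    return companies * donation_per_company_eur


def one_percent_of_revenue_eur(yearly_revenue_eur: int) -> int:
    """1% of revenue (turnover), i.e. of all sales in a year, not of profit."""
    return yearly_revenue_eur // 100


def profit_eur(yearly_revenue_eur: int, yearly_expenses_eur: int) -> int:
    """Profit is revenue minus all expenses; shown only for contrast."""
    return yearly_revenue_eur - yearly_expenses_eur


if __name__ == "__main__":
    # 5,000 companies giving 1,000 euros each per year.
    print(pooled_budget_eur(5_000, 1_000))      # 5000000
    # A company with 100,000 euros of yearly revenue giving 1%.
    print(one_percent_of_revenue_eur(100_000))  # 1000
```

Because the percentage is computed on revenue, the amount does not depend on how profitable the year was, which keeps the commitment easy to calculate.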
It's a pretty specific type of company under French law, called CEL. We are concentrating on embedded and open source, so we are strong supporters of open source. Currently, in our model, everyone is a consultant, but we welcome production, system admins, and all other types of activities. Currently, every consultant gives 1% of their income to open source projects. As we host multiple independent activities, every person decides on their own which project they want to give to, because we have different preferences. I started that when it was a one-person show, when I started the company two years ago. A small reminder: 1% of income means 1,000 euros for every 100,000 of income. The idea of one percent is not a new one. You have probably heard about One Percent for the Planet. There have been a few initiatives for giving to open source, but I think it's good to bring the idea back, because we are, as always, missing money in open source. Now, when a company gives a donation, what does it need? An invoice! I was shouting too much, I think. When you're running a company, you need an invoice for every single expense you have, and if you pay first and then wait for the invoice, you have a risk of not getting it at all, or getting it after a long time. Last year, I gave to one of the organizations in open source that everyone here knows, and that I'm not going to name, and it took me like four or five months of email exchanges with a person who was very sorry that they could not get to the person doing the invoices, that they could not give me an invoice, that they were going to send them a message yet again. And a month later, I was sending the message again. I finally got my invoice like five months later. That is risky; I was already thinking about how to explain that to my accountant. A company can go through the method used by big companies: a purchase order that the company issues, then the recipient of the money issues the invoice, and then you pay.
The problem: it's a burden for the organization you give to, and it's a burden for the company giving the money. But that's the solution I use for bigger donations, when I cannot risk not getting the invoice. For individuals, there are other ways to donate. If a company wants to give money to an individual, that's complicated, for tax reasons and for legal reasons, so we avoid giving money directly to individuals. As for cryptocurrencies, some projects accept cryptocurrencies. I leave you the question to ask your accountant, and then you need to be sure you can run away from the place. The reaction varies, but may be drastic when you talk to an accountant about cryptocurrencies. So no, if you do not want to run into trouble, you do not donate in cryptocurrencies when you are running a company. Now, typical open source organizations supporting projects, what do they offer for companies? Either yearly sponsoring, or event sponsoring. That is something that exists for most organizations. Typically, yearly sponsoring is from 1,000 to 25,000 euros depending on the level. Event sponsoring is between 500 and around 10,000 euros. Now, we compare that with what a small company can give. In a small company, I have an alternative. Either I give 1,000 euros to one project, possibly two, or I give 100 euros to way more projects. For people who live in open source, we know that if I give to project X, someone will ask, but why are you not giving to project Y and Z, and some more, right? Because if you are giving to one project, it's fair to give to all the others. Opinions differ inside the company about who is using which project in the products. Exactly. In the end, what happens is you want to give smaller amounts to more projects. And that is exactly not in line with what the organizations supporting open source have as an offer. Our way of doing the donation is giving directly to organizations. So we establish the way of how to do the payment.
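The trade-off described here, one larger donation versus many smaller ones, can be sketched in a few lines of Python. This is a hypothetical helper, not anything the speaker's company uses: the project names are placeholders, and the 1,000 euro threshold is only the lower bound of the yearly sponsoring range quoted in the talk.

```python
# Hypothetical sketch: names and thresholds are illustrative assumptions.


def split_evenly(budget_eur: int, projects: list[str]) -> dict[str, int]:
    """Split a whole-euro budget across projects as evenly as possible."""
    if not projects:
        raise ValueError("at least one project is required")
    share, remainder = divmod(budget_eur, len(projects))
    # The first `remainder` projects receive one extra euro so the
    # amounts always add up to the full budget.
    return {
        name: share + (1 if index < remainder else 0)
        for index, name in enumerate(projects)
    }


def reaches_tier(amount_eur: int, tier_minimum_eur: int = 1_000) -> bool:
    """Check whether a donation reaches a sponsoring tier's minimum."""
    return amount_eur >= tier_minimum_eur


if __name__ == "__main__":
    projects = [f"project-{i}" for i in range(10)]  # placeholder names
    plan = split_evenly(1_000, projects)
    for name, amount in plan.items():
        print(name, amount, "reaches tier" if reaches_tier(amount) else "below tier")
```

With a 1,000 euro budget spread over ten projects, every donation is 100 euros and none of them reaches a 1,000 euro sponsoring tier, which is the mismatch the speaker points out.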
So either they have an account that you can create and then donate, or you need to contact some email address and figure out the details. Things vary. And then when we have the methods set up, we have that organization now, a database, and every transaction later, it goes the same way and it works. There are other ways using aggregators, like GitHub Sponsors, Open Source Collective, or Linux Foundation crowdfunding. And there has been a talk about the subject by Wolfgang at the Open Source Summit Europe last year. He had a pretty different approach because that was from the approach of a big company with more internal difficulties of having the donations done than we have in small companies, but still his analysis of the ways you can donate is pretty interesting. Now, is it hard to donate to an open source project by a company? Financially, the amount of money, one person, if you compare it to different expenses you have in a company, accounting, rent, internet access, electricity of whatever you have, it will be in the small things that you have on the company bill. So it's not really that hard to do. What is a little bit more complicated is the organization. And what I found out is most complex is actually deciding to which project you want to give, even in a company when everyone agrees, we want to give. A lot of discussions and a decision-making on how to do that. And then important thing, collect and document the payment procedure if you're not going through an aggregator. Important thing. Now, you would ask why? Project work as they can. And when you are already doing open source, you can say that you are already contributing and there's a kind of donation you are already doing, right? I would say that this is an addition to the work you are doing because you cannot take a maintenance in every single project you are using in your product. That is just not possible. 
And it is the maintenance work, the infrastructure, all those things that are hard to sell as new features, that need financing the most. In my opinion, that works easiest with money, because you can just pay someone to spend their time on those tasks. And then, a subject related to the presentation before. Currently, quite a few organizations I have been involved with in reality depend on donations from a small number of big companies, plus some individual donations that they receive. So, if you can break that dependency and make sure that those projects have a more stable and more diversified way of getting income, that is going to improve the stability and in the end is going to improve the quality of the code we all depend on. So, are you ready to try it out with your company? Or do you have any questions, subjects for discussion? In the meantime, we'll try to fix the microphone, so take a bit of time to think of your questions. Okay, there we go. What can open source projects do to make this easier? For example, I run a scientific open source project, and we're looking at trying to incorporate as a non-profit because we think that will be more attractive, and we're trying to figure out where to do that and deal with all of the international stuff, and it's a whole thing. So, I guess, what do you see from the small business perspective as to what projects can do to make your giving us money easier? Yeah, so hold on. Maybe we collect some questions first and then in between we can fix the mic issue. Would that be fine for you? So, maybe we collect some questions, or is it okay? Okay, the question is how a project can make it easier. For me, it's very simple. Have a legal entity that can create invoices or any legal proof of payment for the amount, and make it easy for a company to do that. 
So, make it easy in such a way that when you get a message that we want to give you 500 euros, we get a response rapidly; that increases confidence that it's going to work. And then, if you are ready to send an invoice just at the moment of payment, that helps. If you have a system that generates invoices, it's even better. You can use online invoice generation systems that exist. If you can use one, it's even easier for everyone. But it's really, really simple: fast reaction time and getting the invoice easily. If I have those two, I will be willing to donate again. Okay, so we have another question over here, two of them. So, mainly the perspective here was a company's perspective. And this topic has been discussed by the Node.js Technical Steering Committee just recently, as I'm part of that. Node.js is a part of the OpenJS Foundation, and money is actually not an issue. That's not the main problem. What we need is actually people donating their time and effort to projects. And that is way more valuable as such. So, if a company, instead of investing money, because it's really difficult, even for an organization, you know, you have to decide who's actually going to get that money. Do we pay someone? What about the others who contributed? It's a really tough question to answer if it's not a single contributor to this open source project. And therefore, if you just pay someone from your company to work on that project, that is mostly really, really valuable. Thank you. So, that was a pretty important subject: a situation where the money is not really an issue, people are an issue. And my answer here will depend on the type of development we are in. In my case, for most of the projects, we donate small amounts like 100 euros per year, 200 euros per year. That amount of money doesn't allow you to hire anyone to work on the project. 
What would be possible would be to donate to an organization that will hire that person, and have multiple companies donate to that person and then hire. Also, hiring someone for a small company, it's an important risk if you do not have funding. Hiring someone for me is more of an approach. Either you have a business model closely related with the project, if you are really doing money with that specific project, then it makes sense. Or if you are a company of like 50, then you can probably allow that. For small companies, I think it's a little bit more complicated, unfortunately. Also try to... We have another question over there. I'd like to highlight a service by a small NGO from Berlin. It's called the Center for Cultivation of Technology, which is like a back-end to open-source projects. They're registered non-profits in Berlin, and if you are a small open-source project which doesn't have a company somewhere or a legal name to it yourself, you can go there and say, hey, I'm an open-source project, and they could act as a trustee. They receive donations and they can hold that for your project, so companies can get invoices or checks that the money has been sent there, and the software devs don't have to care about all the financial stuff for that. There is a back-end to support open-source projects and to make invoicing more easier and at no cost. You want to respond to that? A small comment for that. Thank you for doing this, and if you could send me the test of the foundation, I would be interested in all initiatives that allow that aggregating payments for different projects. 
My question is, well, I thank you for mentioning fiscal sponsorship organizations, and I want to give a, yes, Software Freedom Conservancy is a fiscal sponsor organization, as are many others, and I know that several of us are very interested in making sure that the projects we accept, and therefore take donations for, have great governance, so there can be trust that the funds that are donated are used for the project and are used wisely. How do you evaluate governance for these donations, or do you not, because they're smaller donations? I think we are trying to test. It's working. Okay, the question was how we evaluate that the money is going to be well spent. I'm taking a shortcut. We are donating to the projects we know very well. So that's easy. We know the governance. We know how they work. So the check is easy. For donating to projects that we know less well, I would definitely agree that a project being in some aggregator that has policies on governance is an additional plus point for actually donating to that project, compared to a project where we just absolutely do not know how they work, what's going to happen, whether the answer to an email comes after a month, and things like that. That's definitely a bonus point for that project for getting a donation. Absolutely. Okay, so we have another question over here. So coming back to what you said earlier, at least when you're a computer company, then you can definitely let someone work at least part time on open source. That would be something really valuable again. I encourage people in my company to work on open source, and we have dedicated time that they may invest in that. Even at a small company, I was able to work on open source because we used open source and we ran into bugs, for example, while developing things. Then I discussed that with the leadership at that point in time. 
Instead of having a workaround built into the application that we are currently building as closed source, hey, why don't I fix that bug in the open source library that we are currently using? And that's what I did. You want to reply? Yeah, very good. I think you have my mic. I have my mic. Sure. I completely agree. The money part for me is an addition to all the other open source work you do. We, as an open source company, propose to our customers to upstream the code. We tell them we can upstream the patches that you have, the fixes that you have. We can help you with the open source work for your project. The money part is an addition to that. If you are already an open source citizen, that's already great. But what I found out in quite a few projects, I'm involved in Yocto, I have a lot of experience in that, is that it's the maintenance part that requires time, and convincing customers to have people working as maintainers. It's even a problem for the Linux kernel. Yeah, so it's a bigger problem, and I think that without financing people separately for that, it's not really going to happen. It has already happened with different funding for different open source projects for the maintenance part. If you can go forward with that, that's going to help. Time's up. But we're happy to discuss with all of you in the corridors. |
Open Source Initiative - Changes to License Review Process |
Please give a warm welcome to our next speaker, Pam Chestek. All right. I hear my mic working. So I am here on behalf of the Open Source Initiative, of which I am a board member, to talk about some changes that we just announced, or let me put it this way, draft proposed changes. That's why we're here: to discuss, improve, and take comments on these proposed changes to the license review process. Let's see. So just the background: the first meeting was November 2020, which was a long time ago, and we had COVID. So, you know, our timelines ran long, and our last meeting was September 23rd, and these were the participants in the working group that we had set up for this. I'm very pleased with this group of participants. I think we had a really nice array of different voices and different thoughts and different backgrounds who participated. So the announcement: this draft proposal was announced on the OSI website. I apologize for the long URLs, which are useful to no one. But if you go to the OSI website and go to the blog, you should be able to find it. It may still be the top one. I want to say that this proposal is also on the Open Source Initiative's wiki. We have a wiki page for it. And that is where all feedback on these proposed changes should be placed. So I'm here, I'm going to take questions, we'll have plenty of time to talk about it, but in order to allow everybody to see everybody's input and comments, we are asking that all of the comments be added to the wiki page. So if you have something here that you think is important, in addition to my hearing it here, I would also like you to take the time to put it on the wiki page, where other people can react to it. And the wiki page is linked from the blog post. Again, I apologize about the links. 
So, outside of the scope of this working group was the OSD itself: we did not discuss whether the OSD is appropriate or not, a conversation that comes up every so often. And we also didn't talk specifically about the tooling for the process. So at the moment we use an email listserv. We have acknowledged for years that this is maybe not the best type of technology to use for the license review process; it's what we have now. The OSI has a parallel project of trying to identify a better tooling system for taking in licenses, asking for review, giving feedback on those licenses, and eventually making a decision. So what we did here and the tooling are obviously very interrelated, because if we want changes to the process, that needs to be implemented in the tooling. So they do interact with each other, but we stayed away from tooling and just assumed that these issues could be handled. So these were the questions that were proposed for the working group to work on: evaluating the criteria for approving licenses, evaluating the process for considering licenses, evaluating the current categories for licenses, and evaluating whether there should be a process for decertifying licenses. On the last one, we ended up not getting to it, because when we got to that one, first off, we had been taking a really long time, and in looking at this last one, we realized that there were many, many, many facts about the use of licenses that might potentially be decertified that we didn't know, and it would take a really long time to find those facts. So for example, what are the licenses that one might consider decertifying? How are they being used? What are the knock-on effects of removing them? So for example, is there a business out there that's marketing itself as an open-source company, but we say, no, by the way, we don't consider your license open source anymore. 
What kind of impact is that going to have on that company? What about companies that ingest licenses based on the fact that they've been approved by the Open Source Initiative, and we say, by the way, this one's not approved. How is that going to change their software stack? So these were all things that we were completely in the dark about, so we took that off of the table for this group. I would very much like to see another working group start shortly to start working on that project, because I think it is a valid project. It just was something that we didn't have time for. And frankly, I'm looking for someone else to lead that, because I also continue to work on this piece of the project. Okay, let me try to get through what we're going to see. So these were sort of some bigger picture things that we came up with. One was we agreed that the OSI will not be providing information, recommendations, or advice on the use or adoption of licenses. This is too fact-based. Choices about use and adoption of open-source licenses are very specific to every project. It's very hard to generalize in a way that is helpful to anyone. This is an industry. There are people who do this as their business, advising on adoption of open-source licenses. So the OSI is not going to be diving into that field. What we will do in order to facilitate that is we hope to provide machine-readable tags, as well as license text. So providing API access to license text is a task that is underway. We hope to provide machine-readable tags to go along with that in order to provide sort of more digestible information about these licenses. And again, I confess, you're all going to look at this and say, there's a giant leap of faith. There are a couple of giant leaps of faith here, and this is one of the giant leaps of faith, and I'm looking to crowdsource the identification of tags. We have a preliminary list. 
I don't think that it's novel to have identified some tags, but we need to identify the tags. And also, I mean, there are more than 100 licenses, so we need bodies to go through these licenses to be able to identify what tags are appropriate for the licenses. Something like a tag might be "has a choice of venue provision". A tag might be "copyleft". A tag might be "attribution requirement". A couple of licenses don't have that, so they would be lacking those tags. But to help with, you know, these licenses are all unique. So in some sense, I know that I always say I always go back and read the license every single time, but certainly providing some sort of base-level information will be helpful. We plan to have three categories of licenses: rejected, approved, and preferred. The word choices are up for discussion. The preferred is sort of the substitute for what is now called, and someone can help me, popular, something, something, a very interesting label. But we're hoping that the preferred tag will be objectively created from data, hoping to find data on adoption metrics, as well as using these tags as a filtering mechanism. So, you know, a license may have an attribute and still be an open-source license, but it may not be sort of a preferred attribute for a license. And that's again sort of up for discussion, what those tags would be, so it would be some combination of widespread adoption and, you know, a guaranteed level of sort of what the requirements are. And we also hope to provide more information to the public on the decision-making process. It is a little bit of a mystery, particularly for people who haven't been on license review, sort of who gets to say and how it works; we'll try to make that clear to people. So those are the larger concepts. We identified two different categories of licenses. This is, again, I think not a terribly new concept: legacy licenses and new licenses. 
New licenses are, by definition, anything that's not a legacy license. And a legacy license is one, we've just sort of pinned it as one, that has been in use for at least five years by more than 20 projects maintained by different unrelated entities. Again, up for discussion: is 20 the right number? Is 100 the right number? I don't know, but we welcome feedback on sort of how do we know, the point of a legacy license being, this is a license that is widely in use, but the OSI has not yet passed judgment on this license. So we want to capture those. So for all licenses, the submission process will, and some of these are old, you know, some of them exist right now. We will ask the submitter to affirmatively state that the license complies with the OSD, including specifically affirming three, five, six and nine. That's because those are the ones that trip up people. I mean, we see licenses submitted where clearly the license submitter has never read the OSD. They just, you know, they just want to call it open source. Again, there's just a channeling function to sort of put people through their paces on, did you really think about this? We ask what projects are already using the license, and the identity of the license steward, if known; we'll try to get in touch with that person. Provide any additional information, sort of, you know, the fact that maybe Debian has accepted this license would help inform our decision. Give a unique name, and again, some proposed tags for the license. And this one: we did not go into decertifying licenses, and one of the rationales for decertifying licenses is that we get old licenses that would not now be approved kind of thrown back at us, saying, well, you know, this was an open source license and it had this attribute. And so, instead, we're just clearly stating that just because a license was approved in the past with a characteristic does not mean it will be approved again. 
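The legacy-license threshold described in this passage (in use for at least five years, by more than 20 projects maintained by different unrelated entities) can be illustrated with a minimal Python sketch. This is not an OSI tool or data format. The `LicenseUsage` record, its field names, and the choice to read "unrelated entities" as "more than 20 distinct maintaining entities" are assumptions made purely for illustration, and the speaker notes that the numbers themselves are still open for discussion.

```python
from dataclasses import dataclass, field

# Thresholds as stated in the talk; the speaker says both are up for discussion.
MIN_YEARS_IN_USE = 5
MIN_UNRELATED_PROJECTS = 20  # the talk says "more than 20 projects"


@dataclass
class LicenseUsage:
    """Hypothetical usage record for a license (not an OSI data format)."""
    name: str
    years_in_use: float
    # Maps each project using the license to the entity that maintains it.
    maintainer_by_project: dict = field(default_factory=dict)


def is_legacy(usage: LicenseUsage) -> bool:
    """Return True if the license meets the proposed legacy threshold.

    Assumption: projects count as "maintained by different unrelated
    entities" when their maintaining entities are distinct, so we count
    distinct entities rather than raw project numbers.
    """
    unrelated_entities = set(usage.maintainer_by_project.values())
    return (
        usage.years_in_use >= MIN_YEARS_IN_USE
        and len(unrelated_entities) > MIN_UNRELATED_PROJECTS
    )


def category(usage: LicenseUsage) -> str:
    """New licenses are, by definition, anything that is not a legacy license."""
    return "legacy" if is_legacy(usage) else "new"


# Example with made-up data: 21 projects, each maintained by a different entity.
example = LicenseUsage(
    name="Example-License-1.0",
    years_in_use=7,
    maintainer_by_project={f"project-{i}": f"entity-{i}" for i in range(21)},
)
print(category(example))  # prints "legacy"
```

Counting distinct entities means a license used by 30 projects from a single company would still be treated as new, which seems to match the intent of "unrelated entities", though the proposal text on the OSI wiki is the authority on that.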
So, this is for new licenses. What I had before was for all licenses, legacy or new. In addition, for new licenses, the person submitting will describe what gap is not filled by a currently existing license; compare and contrast it to the most similar licenses that already exist; and describe any legal review the license has been through. But we have a new one, a new bullet here with the arrow, which is to provide examples of others' potential use of the license, to demonstrate that it is not a license that is uniquely usable only by the license submitter. So, what we're looking for is we want licenses that are going to build community, and so if you have a license that nobody will ever use, then that will be a hurdle to overcome. Recommendations for approval of licenses. Again, I have three here that have arrows next to them that I think are new. So, we want to see a reusable license: that the license can be used by any licensor without changing the terms, and without the license achieving a different result for a different licensor. So, it needs to function the same way no matter who adopts it. The license does not have terms that structurally put the licensor in a more favored position than any licensee. So, again, this is very even-handed; we're trying to build community here, not give anybody an advantage. I mean, I will say some of these are designed for ferreting out companies that are trying to get that moniker, "we're open source", but really have no intention of building community. They have no intention of having their software shared. They just want to use this term as a marketing tool. So, that's what we're looking for. The ambiguity must not have a material effect. There are licenses that are, you know, a lot of them are not written in a way that is subject to clear legal interpretation. So that we will take into account. 
It must be grammatically and syntactically clear to a speaker of the language of the license. Every possible variation must meet the OSD. We had a recent one where it was sort of like, you know, two iterations met the OSD, but four other iterations, or four other, you know, sort of variables, did not. Next bullet: it must be possible to comply with the license on submission. And the counterexample to this was the SSPL, where I think Bradley pointed out that even MongoDB could not comply with the license itself. It was the copyright owner and therefore didn't have to comply with the license. But it was written in such a way that no one on earth could comply with the license. And then the license must fill a gap that currently existing licenses do not fill. So that's it, in a nutshell. Again, you can go to the blog post and read sort of more detail about it. Put your comments on the wiki, and I'm open for questions. Okay, so, all right there. James here. I see James's hand shot up. Interesting topic, I guess. Yeah. Hi, how often do you see projects submitting licenses, or people submitting licenses that projects are already using? That seems like the cart before the horse. How often do we see licenses being submitted? Oh, I mean, I think that's pretty often. So the OSI's practice has been that we don't review a license unless someone submits it. So there is a huge body of legacy licenses that nobody has bothered to submit to the OSI. So, you know, we have approved a little over 100 licenses. If you look at something like the Fedora project, I think they're at 400 licenses or something like that. So yeah, a lot of licenses you might expect are not there, and submit away. Happy to have those included. Over here. Yeah. Hey, thank you for your talk. 
I was wondering if the working group or a related body has, as part of its mandate, discouraging license proliferation in general, or is it purely just about how to take in the incoming licenses? So I will respond to that sort of from my personal viewpoint. I want to be clear that this is not the OSI's view. I will say that on the blog post, we will say that license, license proliferation, and I have the hardest time with that word. So I may use a different word, because I can't say it. I think I'm not terribly concerned about it. I think it was very important that the OSI did put the kibosh on that many years ago. But I do believe that people are centering around licenses, and that's why we're interested in sort of these preferred licenses. You know, when open source first started, everybody wanted to have their own license. But now I really think that people are cohering around licenses, so that we're not seeing proliferation as a problem. Except for, again, you know, the company that wants to have their own license so that they can say that they're open source. So it's not the proliferation that's the problem. It's, you know, why they are adopting these licenses, and we want to make sure that they're being adopted for the right reasons. Thank you. So, two-part question. Number one, you said reusable by any licensor, or reusable by any of a class of licensor? Any of a class, yes, understanding there are special purpose licenses, but yes, and looking for reusability as the key. No, it doesn't have to be all-purpose for all things; reusable by the appropriate audience for that. But not that the person proposing or drafting the license is the only one who will conceivably use it. And the second: structurally favored. Is the OSI going to take any position on CLAs and the uses thereof? 
I don't recall what our position has been. There are others in the room. I'm looking at others in the room who might know the answer to that. No, we have never approved CLAs or opined on CLAs. And I have not heard of anybody who thinks at the moment that that is something the OSI should be, or will be, doing. Can you take the slide back to the one that showed the comment about machine-readable tags? Let's see. This one? Yes. Okay. I'll just comment that I agree that "preferred" sounds like it means preferred by the OSI. Right. So the OSI is recommending that people should use that, as opposed to "popular", which means that other people are using it. Yeah. I don't think that's going to change that. Put it on the wiki, please. Going back to machine-readable tags: that's a notoriously hard problem. Yeah. You might need a controlled vocabulary for tagging. Yes. And yes, it will be a controlled vocabulary for tagging. And you might need to make it very clear, you might need to distinguish between a license that has not been tagged with a particular tag because no one's got around to it yet, versus a license that has not been tagged because the tag is not valid for it. So we need a tag for tagged or not tagged. I know. Or completely versus not yet fully tagged. Yeah. We only have a couple of minutes. I want to make sure James, whose hand shot up first, gets to ask his question. Okay. Yeah, I know. But I just, yeah. Okay. So, quick question: you said changes to the criteria. They look to me, the way you presented them, as additions. Are there things that you're subtracting from the existing criteria? Oh, that's a really, I will, next time I present, I will add that slide. No, it's in addition to "must meet the OSD". Okay. So, but what are the existing criteria? It's on the website. 
There's, yeah, there's an existing process on the OSI website right now. Yeah. Yeah. Thank you. Thank you for, thank you. That's why you do this, right? Okay. So the question I was going to ask is a variant about the no privileged entity one, because one of the ways of avoiding CLAs is to have a license that says this entity is empowered to change the terms of the license. It's often done because, for a new license, you don't quite know how well it will work out. And you want an escape route if you want to go to an older license. If you keep that term and sort of, I'd have to swallow it if I were putting it close to my mouth. So the question is, would there be certain privileged terms in the license that you would look favorably on if they're designed to perform a community function, like being steward of whether the license should change? Because not doing that. Yeah. I'm not. Yeah. So the question, as I understand it, is, are there terms that the OSI will look more favorably on, you mean as getting into the preferred category, or just more favorably in the license review process? Basically, allowing an entity to change the license if something went wrong. That's what people usually use CLAs for. And people have been gravitating to licenses that name the entity instead, which would run afoul of your privileged entity criterion. And to clarify that "reusable without change": I think that a name change would be fine. So for example, if, you know, the license said who the license steward is, who may come out with later versions, and someone else is made the license steward, I think something like that, just the name, just a hard-coded name in there, I think that would not be objectionable. It would just be, you know, sort of something broader than that. Let's see. You've got two minutes. Okay. So you said the ambiguity must not have a material effect. I'm not real clear on what that means. 
I suspect there's, there's more reasoning behind that. Can you give an example or explain? Yeah, I don't know if I can give an example, but materiality is a, is a, is a term that all lawyers feel pretty comfortable with. And the reason that we said without material effect is every license is imperfect. So every license will have ambiguity. And very often you will not know what that is until down the road when someone discovers that ambiguity. So material would be, I would say material, material, material change would be, it like changes the effect of the license. It's some interpretation that changes, that changes the meaning of the license so that there are two different ways that the license could be read because of this ambiguity. And that actually happens quite often. I would say every license has some kind of ambiguity. The question is whether it's so significant that it's going to create interpretation problems. And time's up. All right. Thank you very much. |
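The point raised in the Q&A above, that a controlled vocabulary of tags has to distinguish "this tag does not apply" from "nobody has reviewed this tag yet", can be sketched with a three-state status per tag. This is only an illustration under stated assumptions: the tag identifiers are hypothetical spellings of the examples mentioned in the talk (choice of venue, copyleft, attribution requirement), and `LicenseTags`, `TagStatus`, and the license name are invented for the sketch, not part of any OSI API.

```python
from enum import Enum


class TagStatus(Enum):
    APPLIES = "applies"                # reviewed: the license has this attribute
    DOES_NOT_APPLY = "does_not_apply"  # reviewed: the license lacks this attribute
    NOT_REVIEWED = "not_reviewed"      # nobody has got around to this tag yet


# Hypothetical controlled vocabulary, based on the examples given in the talk.
CONTROLLED_VOCABULARY = frozenset(
    {"choice-of-venue", "copyleft", "attribution-required"}
)


class LicenseTags:
    """Tracks the review status of every vocabulary tag for one license."""

    def __init__(self, license_name: str) -> None:
        self.license_name = license_name
        self._reviewed = {}  # tag -> TagStatus, only for tags already reviewed

    def record(self, tag: str, applies: bool) -> None:
        """Record the outcome of reviewing one tag for this license."""
        self._check(tag)
        self._reviewed[tag] = (
            TagStatus.APPLIES if applies else TagStatus.DOES_NOT_APPLY
        )

    def status(self, tag: str) -> TagStatus:
        """Absence of a recorded review is reported explicitly, never as 'no'."""
        self._check(tag)
        return self._reviewed.get(tag, TagStatus.NOT_REVIEWED)

    def fully_tagged(self) -> bool:
        """True once every tag in the vocabulary has been reviewed."""
        return all(tag in self._reviewed for tag in CONTROLLED_VOCABULARY)

    @staticmethod
    def _check(tag: str) -> None:
        if tag not in CONTROLLED_VOCABULARY:
            raise ValueError(f"unknown tag: {tag!r}")


tags = LicenseTags("Example-License-1.0")  # made-up license name
tags.record("copyleft", applies=False)
print(tags.status("copyleft").value)         # prints "does_not_apply"
print(tags.status("choice-of-venue").value)  # prints "not_reviewed"
print(tags.fully_tagged())                   # prints "False"
```

Rejecting tags outside the vocabulary keeps crowdsourced input consistent, and `fully_tagged` corresponds to the "completely versus not yet fully tagged" distinction suggested from the audience.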
Learning From the Big Failures To Improve FOSS Advocacy and Adoption
How Are Big Companies Benefiting So Much from FOSS, and Individuals So Little? |
Now, I'd like to welcome our next speaker, Bradley Kuhn. I was going to bring up my COVID test to show you I tested negative this morning, so I'm going to take my mask off for speaking, but now I don't want to get close, you know, with my mask off, to go get the test to show you, but I'll show it at the end. So I apologize for any typos on the slides. I practice just-in-time talk preparation, which is why I was in the back when people kept coming to ask me questions. I was weirdly sending them away, saying I was still working on the last three slides five minutes ago. So I apologize for typos, which I always have. So I decided to announce at this FOSDEM that I'm officially old, and I think it's an appropriate order. I'll tell you, this mouse has so many buttons I can't figure it out. I'm going to have to use the space bar. This is like a super-power mouse; I've never seen such a thing. It's very impressive, but I think it requires a tutorial I haven't had. So this is my 12th FOSDEM, and the reason it's only my 12th FOSDEM is because once upon a time, I was under a travel policy where you could only be funded to travel if you were speaking, and I could never get a talk accepted in the main track. And I'm very embarrassed to say this, but in the early 2000s, I didn't realize dev rooms had talks. I thought they were like hack sessions. So I never tried to submit a talk to a dev room until Tom proposed that we run our own dev room, which is, of course, this one. That's why that was my first FOSDEM instead of way back when. But I was counting this morning, and I stopped counting at 100, when I found it was definitely 100 different events or conferences related to free and open-source software that I've been to. And I've also given a lot of talks at places like universities that weren't conferences, and I stopped counting this morning when I got to at least 150 public talks that I've given about FOSS. 
And I've considered myself a FOSS activist probably since the mid-1990s, and I started being paid for it around 1997 or so. So I have about 25 years of being a FOSS activist, and this year, I'm turning 3-2 in hex. And so depending on which counting system you use, I'm both over 30 and over 50 in the two different systems, so I'm part of the old generation now. And I want to be really clear, we were talking about this this morning at breakfast: it's extremely important, I think, for the older generation to begin to cede the power. I am not terribly comfortable with gerontocracy. I don't mean to be ageist about it, but I will promise you that, to the extent to which I have any privilege and power in FOSS, I'm going to work over the next 20 years to completely cede that power by the end of those 20 years. But it leaves me, when I start thinking like this, trying to figure out all the things and all the decisions and various other choices we made that were wrong over the last 25 years or so. And it's very harrowing and frustrating to think about, but I figured I should put it into a talk. This quote is often misattributed to various other 60s activists, and I actually first heard it, of all places, in the movie Planet of the Apes, where the young ape says to the Charlton Heston character that he doesn't trust all the old leaders of the ape community, and he says, yeah, I don't trust anyone over 30, which was of course this watchword of 1960s activism. So I went looking for the real source of the quote, which this is. And what he's trying to say basically is that among people who were in power, no matter what kind of politics they had, there was a lot of manipulation going on from this older generation to kind of push both leftist activists and conservative people in directions just to follow in the footsteps of the older generation and the older way of thinking. 
This is of course how the movement, at least in the United States in the 1960s, gets dubbed the New Left, that they're trying to create this new thing. And of course now they're the old left, because Jack Weinberg's in his 80s and all these other activists from that era are older, and now here I am becoming that old generation. But I still think about what is so funny about peace, love, and understanding, or even communism. I think a lot about how we were terrified in the early free software movement of being called communists. Everybody was saying that the GPL was a communist license, and we worked really hard to say that it wasn't, and that it was really a capitalist license, and it was all about capitalism and great, and all this sort of stuff, but frankly I never really dabbled in communism. I dabbled in anarchy as a teenager. I was a punk rock kid, and I remember getting The Exploited's first album. It was recorded in 1981, I got it in the late 80s, and the album was titled Punks Not Dead, and I kind of think that once you're at the point where you have to say our movement's not dead, it's maybe in trouble. You have to write a song that says, no, we're totally not dead yet, we're totally still a movement. Now I found this image, and the best part about this image is that, so I'm sure this is a copyright infringement, because I found it on literally Adobe.com's clip art website, and that was just such a symbol of co-option, that to get an anarchist logo that says Punks Not Dead, the first one I found was in Adobe's clip art. So I'm kind of proudly infringing Adobe's copyright here, because that's what an anarchist would do, right?
So being a kid in the 1980s in the United States, this was when I became anti-RIAA and anti-MPAA. We were already about that, because there was all this effort to stop people from sharing music, and the punk movement was really up in arms about this, this idea that music and culture should be for everyone, and not pushed down on us by corporate America. I was a huge fan of the Dead Kennedys, and I don't have it anymore, by the way, because my mother read Tipper Gore's book and came into my room when I was at school, and took all my punk records and threw them away, and replaced them with Phil Collins, and I didn't realize until like 30 years later, because I never actually read Tipper's book myself, that her book actually literally suggested that's what you should do: go into your kids' room, take their Dead Kennedys records, and replace them with Phil Collins. I was already a professional in 2002 or so, and my mother sent me a videotape concert of Phil Collins, because it moved in her mind that I actually liked Phil Collins, Tipper Gore said that I would. But anyway, this was side two of an album that the Dead Kennedys put out, spoofing that home taping thing, and I may be one of the only people who literally, because I really couldn't afford cassette tapes, took their advice, and I think I put like a Bad Brains album or something on the other side of this cassette.
And the thing is, somewhere also around 2002 or so, Jello Biafra, who was the lead singer of the Dead Kennedys, came and was doing speaking tours. He kind of became a spoken word artist after a while, and he came to Boston where I was living at the time, I was living in Cambridge, Massachusetts, in the United States, and I went to see him. I was literally in the middle of huge anti-DRM work that I was doing at the place I used to work, and so I walk up to him, I go to the meet and greet, and he's still up on crack, because the show was supposed to start at 9 p.m. and he started at 11 p.m., so it's like now two in the morning when I'm standing in line for the meet and greet, and I just wanted to go to bed. I was already getting old at that point, and 2 a.m. was pretty late to stand there for a meet and greet. And I get up to Jello Biafra, and you know, when you meet somebody famous, you always want to say something about yourself that will kind of impress them, and I said, you know, I'm doing huge work right now against the MPAA and their aggressive use of copyright. And Jello looks at me, and if you've ever heard a Dead Kennedys song, you know the Jello whiny voice, just imagine that because I can't really imitate it, but he looks at me and says, you don't understand, I need copyright, if I don't have copyright, the ability to stop people from copying my work, I would have to tour all the time. And this image of this thing is flashing in my head, and you know, I haven't listened to a thing Jello has said since then, because it was sort of like, well, you're the guy that gave me this tape in 1987, and here in 2002, you're telling me you love copyright. Yeah, never meet your heroes. So this leads me to think, was I ever really an activist? You know, would somebody meeting me have the same reaction I had to meeting Jello?
I always felt I was doing activism all these years, but I'm not really sure, maybe we weren't doing it correctly. I don't even know if software rights and freedom were ever really a radical cause for social change, and I'm kind of the opposite of an anarchist these days. I'm an unwavering rule follower, and I encourage people to follow all the rules, particularly of copyright licenses, so I don't know if we were as disruptive and activist as I hoped we were going to be. I mean I hope so, but I'm not sure. I think about myself as a teenager, the one who had that cassette in 1986, I'm going to turn, cause I'm going to talk a lot anyway, I'm hitting the gain too high there, so I turned myself completely off, am I still good? What's that? Okay, how's that? Am I not echoing anymore? Okay, I think that's good. Okay, so I think about myself as a teenager, and what I would say to myself if I saw myself right now, and this is what I would say. That's what I'd say to everybody in the room. I'd walk in here and be like, you're a sellout, you're a conformist, look at you, you're wearing a button-down shirt, are you kidding me? I look at my haircut, I think of the fact that I shaved for the first time in 30 years and I'd be like, now you're really selling out. But teenagers are the way they are, right? They don't have a lot of nuanced ability to understand the world, so I don't think I can go back to my teenage self and figure out everything I did wrong, because I was a lot stupider as a teenager and I wouldn't trust that person to make any conclusions. But what I can think about is that we tried to build a coalition of people that would support free software, and I think we did a very poor job with the coalition we assembled, for a number of different reasons.
I don't blame us too much for that mistake, because I think it's hard to decide when you're selling out and when you're building a coalition with some for-profit company. They exist in the world and they have a lot of power, so it's pretty hard to fight them without some kind of way of interacting with them. So I don't know everywhere we crossed those lines, but when I think of the beginning of free software, I think one of the seeds of the problems we had was the rather single-minded obsession with libertarian politics and Ayn Rand style thinking in the early free software movement. There was this idea that we were going to let all the smartest people rise to the top, that free software was better because the smartest people worked on it and were able to work with the smartest people. I often think of a free software developer, who's not a free software developer anymore, named Ben Collins-Sussman. When he went to work for Google, he wrote this blog post years ago and said, I don't need free software anymore, because I now get to work with the best engineers in the world, because they're all at Google. So the goal in being in free software, from his perspective, was to be able to collaborate across many different organizations with the best minds, and once he got into the company that had the best minds, he didn't need free software anymore. And that kind of philosophy, I think, was really central to early FOSS and also led us to really kowtow to capitalism. There were two things going on. One, we were being called communists constantly. There's an article talking about GPL enforcement in Forbes in 2002 that sort of paints me as this communist. And then there was this idea that capitalism was going to be better with free software. And we wanted to convince people. I remember giving talks, going on about how you can get rich doing free software and all that sort of stuff.
And my teenage self would definitely call me a sellout for saying that, for sure. And we've always loved this word freedom, and saying we're fighting for people's freedom, their rights in free software, their ability to do things. But I think we were a little too focused on the freedom of the privileged few to really capitalize on that FOSS and do something with it. And I think we forgot about the hobbyists that really, I think, were the center of why free software existed in the first place. So this is a slide from exactly 10 years ago on this day. I just copied the slide forward from my talk 10 years ago. And I've thought about this quote so often over the last 10 years. I was completely obsessed with it when it happened. And this really put into perspective for me the difference between copyleft and non-copyleft. I don't think copyleft is the goal of free software. But the difference between those who wanted to use copyleft as a tool and those who despised it is very stark and important. Because the interest of capitalist endeavors was to have a full range of motion and to have a full range of any type of software they could do. So this idea of, well, we want to be able to choose when we open source and choose when we don't. That's what matters most to us because it gives us maximum flexibility and therefore maximizes profit. And I think we were afraid, those of us who are more hardcore free software activists, to simply say that non-copylefted software is fundamentally flawed. That's giving software to companies and saying we're not going to make any requirements on you that you act well in this community. You have no rules. You have, in essence, anarchy. And it's the precise reason I'm not an anarchist anymore, because I've observed that anarchy tends to work great for small collaborative groups of 20 to 30 people who are all equals.
But as soon as you bring power dynamics into anarchy and try to scale it, anarchy becomes capitalism, unbridled capitalism, the ability for the wealthy to have more power. It's pretty akin to the same libertarian politics that I feel over-influenced the early free software world. I used to say all the time in talks that the great thing about free software is that commercial and non-commercial actors could be equal. That somehow the way free software communities operated would make sure that a large commercial entity would have to answer to the individual developer. Now that does happen and it even still happens today, but I don't actually think it's the norm. I think the amount of resources, particularly now, that are put forward by commercial entities into open source means that there is always a power imbalance between the individual developer, the individual contributor, and the corporate commercial contributor. And in fact, this is a place where copyleft has somewhat failed us, because copyleft by itself is useful for this problem, but it is still insufficient in its ability to put non-commercial and commercial actors truly on equal footing, in part because enforcement is so hard to fund. That's what I spend a lot of my time trying to do in my day job these days, is figure out how to actually make sure that copyleft is complied with, but even if we had universal copyleft compliance, I don't think it would automatically solve this power imbalance between commercial and non-commercial actors. One of the things I've always liked to say is that I showed up in free software for the free as in price, because I was a student and I looked at Solaris x86 and it was $450 in 1994 dollars, or '93, for a student license, which was completely unaffordable for me in 1993, and I said, well, I'm going to install Linux because it's free as in price.
I wasn't really that worried yet about the license of the software, I just wanted a Unix system that was not $450 US dollars, but I stayed, of course, because I got all the source code to Linux and everything else that was on the Slackware distribution that I ran in those days. So I always say this, that I came for the free as in price and stayed for the free as in freedom, but I think we underestimated how important free as in price was. I think there was this essential egalitarian component of people being able to download software as soon as they got an internet connection and not have to pay for it that was important. I'm not saying we should change the licenses to not allow commercial activity or anything like that, I think we should, of course, continue to write our licenses that way, but I think we need to consider the equity of people around the world who may not be able to afford access to software. One of the things I do a lot in GPL enforcement is constantly raise the issue that if your source code release for your GPL work is a multi-gigabyte download, you have to give a way to get it to people who might have to pay by the hour for internet access at 100 or 10 or 5 or 10 megabits, right, because there are people around the world who go to internet cafes just to get internet access and who would have to pay to download that source release, and if you're holding it back from them, then you're creating an inequity that the copyleft licenses were meant to solve. I'm obsessed with time travel. I've had a lot of bad things happen in my life, and the only way I can get to sleep any night is I start imagining that I can go back in time and solve everything that's gone wrong. So I've thought a tremendous amount about, if I could write Marty's letter to solve the worst thing that happened, what would I write? If I think about what I'd have to write with regard at least to the worst thing that's happened in free software, I think this is the letter I would write.
I would never have imagined in 2000 what would happen with most of the software that individuals use. I'm not talking about companies, I'm not talking about business software, I'm talking about what an individual uses on their mobile device. They pay for their data plan, but they don't pay for any software. People don't like apps that even cost a dollar. They download the free as in price app that will put advertising in their face, and they live with that advertising all day long. Now I knew advertising funded things. In the 80s, I watched television. I watched television before there were VCRs. By the way, for those who don't know, the VCR was a thing we had before there were DVRs, but I predate the VCR, so I remember watching commercials. I remember the best thing you had to get through commercials was the mute button, so you could mute it when the commercial was on, so you didn't have to listen to it, but you had to watch it because you had to see when the show came back. Obviously advertising is a huge part of culture and how certain things got funded, but the idea that advertising would be the primary funding of software, that the biggest, most powerful software companies in the world would actually, in fact, be advertising companies, I would never have predicted that. On the other hand, when I think about the thought experiment of, could I go back in time and tell myself that, or write myself a letter that I could read, I can't think what I would do differently. How could we have prevented that? I'm not really sure if the free software movement could have prevented the advertising industry from taking over the software industry. What would we have done differently? I think the only thing I could think of is how we handled DRM. In 2002, it was the first time someone who used to work for EFF actually told me this, hey, the MPAA has started using this "DRM is inevitable" marketing, like this is going to happen. I don't think we did enough to fight DRM.
We really thought that technologists would find DRM so despicable that they wouldn't help create it, that they wouldn't be part of a culture that created it. We thought we were going to beat DRM. In fact, I did a lot of work in the US in the early 2000s on this thing called the broadcast flag, with which the MPAA and other rights holders' associations wanted to make it so that you couldn't record over-the-air broadcast television. We had this huge win. We were like, we won. There's no broadcast flag. I still to this day have a 100% free software DVR that can record over-the-air, but I think I'm the last person who's still recording over-the-air. It's all on the internet now and all of it is DRM. The worst thing was that we just conceded to DRM. There was this big announcement that Mozilla made about Firefox, that they were going to do it happily because, as Mitchell Baker, the CEO, put it, we want to serve our users and they want DRM, so we're going to give it to them because that's what our users want. The only reason they were able to do that is because Firefox was under a weak copyleft license that allows proprietary plugins. It's easy to pick on a wealthy CEO like Mitchell Baker, of course, but we also had the W3C, which just gave in and said we're going to do a standard for DRM. We were just like, oh, okay, we're going to let that happen. I'm not saying nobody did anything, lots of people in this room probably protested and complained in blog posts, but I'm talking as a culture. We accepted it. There was not enough of a coalition of people saying we will not accept DRM. Are we really anti-DRM? Did we really say, hey, this is not okay? We're not going to go after people for copying things, certainly not non-commercially. We weren't always willing to say that, in part because the GPL was based on copyright.
We don't want to be pro copyright infringement because then we're pro GPL violation, but on the other hand, kids copying music on the backs of tapes was not the biggest problem in the 80s. The MPAA told us it was, but it wasn't. Kids watching things without DRM today is not the biggest problem, but we've convinced ourselves that it is and we're happy to accept all that DRM. I'm very fond of this thought experiment because I think it shows how every person can make a difference. If we universally made a difference, if every software developer tomorrow woke up and said, I'm never writing a line of proprietary software, every line of code I'm going to write is going to be copylefted from now on and I just refuse to take a job writing anything but free software, everything would change overnight. FOSS would succeed completely. This is unlikely to happen. We're not going to get a universal strike among all software developers, unfortunately, because companies are willing to pay so much to get people to write proprietary software, but we have to think about how our individual actions are actually impacting us. The compromises we make about this kind of stuff are part of the problem. They are part of why we haven't succeeded in the past and they will prevent us from succeeding in the future if we just continue to make the compromises that are in front of us. I think we made a tremendous mistake aligning ourselves with companies. I got to do a main track talk a couple years ago at FOSDEM and somebody came up to me after and said, I think the biggest problem we had with free software was that it was successful, and I was like, what do you mean? And their point was a really good one, which was that we were not prepared for the power imbalance on the other side as free software became popular, and as such we were not equipped to deal with it.
The idea that companies would just flagrantly violate copyleft licenses, such that 98% of the products on the market that have Linux in them are violating the GPL and it's really hard to do anything about it, some of that has to do with how quickly Linux became popular. If we had toiled in obscurity for a lot longer we might have been more prepared when we were finally adopted, and maybe it's because we kind of made too much of a deal with unbridled capitalism that it has been able to do so much co-option. I look at things now like inner source, where companies are like, you know what we should do? We should develop all of the proprietary software using all the methodologies that free software uses, and this is a thing that people are excited about, that this is a great idea. That's co-option, that's taking what we were doing better and using it against what we were trying to accomplish, and I think we should be willing to call that out and say that's not what we were trying to do and we don't want to see proprietary software developed. The hobbyist culture was essential to the success of open source and free software. I think in the hobbyist culture of the 90s that created free software were kind of the seeds of its own destruction, because to have a hobbyist culture, particularly with technology, where just the hardware itself is expensive, you need a certain amount of personal financial stability and leisure time, which people have in industrialized and wealthy nations and don't have in other nations. So it's very difficult to create this non-commercial hobbyist activity in a culture where people have to work all the time just to make a living and they can't seem to get ahead, they can't seem to be able to afford their own home. All these things that we now face as larger social problems are expanding all over the world at this point.
So I'm not sure how we solve that as free software activists, but we at least have to acknowledge that free software depends on people having a certain amount of ability to take a breath and think, I'd rather put my time towards something else, at least my free time if I have any, towards something good. I don't have a solution to that because I'm not that kind of activist, I'm a FOSS activist, but I am worried that the larger changes in society that are negative will have a particularly bad impact on the ability for free software to succeed in the next decade or so. So anyway, that's a couple of my thoughts about this. I'd be glad to take questions. This is, I'm not forgetting pretty radical talks. I actually asked my executive director Karen Sandler if it was okay if I said the words anti-capitalist a lot and she said it was okay, but yeah, I'm sure some people will not agree with me and I'd be glad to take questions if I have any time left. Okay, great. I signed it. You're done? I said when I was seeing the assignments to have questions. Yeah, okay. At the end you said the larger changes you think are going to continue to have problems. What larger changes are you talking about? Well, I look at a lot of things that folks who are much younger than me are saying. Even in the United States, which is a very wealthy nation, I hear 20-year-olds saying I can't figure out how to afford a house. I can't figure out how to not have to work all the time. And you look at the health care system in the United States, those in the U.S. know what I'm talking about, everybody else around the world just thinks we're silly. It's those kinds of problems, where you don't have enough space to feel you can tell your employer, I'm not doing this because it's wrong. I think capitalism, particularly, has set up a system where it's very frightening for someone to lose their job.
And you're predicting this is going to continue on this path and get worse? I don't know, because I don't do that kind of activism. I don't know. I hope it doesn't, but I think it's probably going to get somewhat worse before it gets better. That's my concern. Great talk. Really enjoyed it. In particular, I liked how you... Just talk louder because that mic needs more... I'll try to shout, maybe, my mask on. I like how you portrayed a lot of the issues around coalition building throughout your career. Getting people together, but making sure that you're actually together on the same page. When thinking about the future of free software in this continued coalition building, one thing I noticed is there seems to be a significant lack of what I would call a political line within the movement. Agreeing what is the purpose of us working together? What are the political goals of our interactions and coalition? Do you see anything... I'm just curious about your thoughts about building this understanding of what our political goals are. Maybe an example of this is the four freedoms from Richard Stallman. It's focused on user autonomy, and this is really what we're trying to do. Now when we hear about open source and free software, many times people are not talking about user autonomy, but rather this is an efficient way of working together. This is just a way I can work with, quote, unquote, smart people. I'll get off my soapbox. I think it's a good point that you're raising. I think what I'm trying to say, at least this particular moment, which I'm not saying this is definitely right, it's what I'm thinking about, is that I think the coalitions we built in the past were in some ways the wrong coalitions. 
In our desperation to get for-profit companies to adopt open source software, being almost willing to do anything throughout the late 90s and the early 2000s to get them on board and excited, I think we made tons of compromises we shouldn't have made, and we weren't busy building coalitions with other social justice movements, for example. I remember distinctly I was living in New York City when Occupy Wall Street was going on, and I was going into the office of the place I was working at the time and worrying about what companies were going to do what and how we were going to basically help them do open source, and I'm like, maybe I should be down there talking to the people at Occupy Wall Street about free software instead. And so I think that the coalitions we need to build are with other people who are trying to do the kinds of things that the individual focus of free software was trying to do, people who are saying we need more egalitarian things. We should be working with people who are trying to unionize. I mean, I give a hard time to Amazon employees, because the Amazon employees who are at this conference are being well paid to do software stuff, but every time I talk to them I say, you know your company's union busting all over the US, where people just want to unionize. Like, I think we should be building coalitions with the people who want to unionize against Amazon, not trying to convince Amazon to write more free software. Thank you. So I noticed that as part of this you very much talk about developers and hobbyists, and those kind of sit between, or maybe not between, but you have corporations on one side and on the other side we have users. And you give the example of Firefox, where we gave in and accepted DRM. I'd posit that without Firefox using DRM it might have a tenth of the users it has now, and it's already a small player, because users wanted DRM.
So really what I was wondering about is your reflections on how much we should be doing more or less with users that don't care about code, and maybe how we convince them to care about the movement, or if that's just a dead end and we should focus on trying to find solidarity within the development community. I'm not sure. I think both issues are important and I'm not sure what we should spend most of our attention on. I think we probably have to try from both directions, and different people with different skills in the free software movement should take the path that works best for them. If you're better at convincing users to adopt free software, getting your family to install a free software system instead of their Windows and Mac machines, if that's the kind of thing you're good at, that's what you should be focused on. I'm not very good at that. I'm much more comfortable talking to developers myself. And this is possibly post hoc self-justification, because it's easier for me to talk to developers than non-developers, but I kind of feel like it's back to that thought experiment. If I can convince more developers to not write proprietary software, there will be less proprietary software for all these users to get stuck in out there in the world. We have to understand that as developers, we have a lot of power to make these decisions for users. The biggest decision you make is whether the software that you write is going to be proprietary or free software, and if it's proprietary software, you're just out of the gate being bad to users. That's not to say that all free software is good, I've used a lot of free software that's not very good. I know that sometimes free software is bad to users too, but it's not bad to users out of the gate in the way that proprietary is. That's why I tend to focus on developers, but I agree that we have to build much broader coalitions, which includes talking to people who aren't developers.
Those in the room, if you're good at it, please do it. I'm just not very good at it. I've spent the last 10 years or so essentially trying to do the sorts of things you're advocating people should do. When I've been working in organizations that were developing pieces of software, I've been the one advocating for using copyleft licenses, not non-copyleft free licenses and not proprietary. Outside of that, in activism, I've tried to encourage groups to use free software instead of proprietary tools. In both of these contexts, I've been kind of othered by these groups. I've been seen as an obstacle in those organizations and in some cases pushed out because I'm not willing to release my code under proprietary and non-copyleft licenses. I've been unable to participate in social justice groups because they want me to use WhatsApp. What more can I do? I think the fortunate thing is some of that's changing, mainly because of mass surveillance. I think most social justice groups basically understand now that there's a lot of danger in using corporate-controlled software like the Google apps, like using Google Calendar to plan your protests and all these sorts of things. People are starting to realize that's a big mistake. One of the things you have to be willing to do as an activist, and I've learned this now that I'm old, is that you have to be willing to revisit things you tried in the past even though it was painful. I hear that that was painful for you, to go through that and be kind of othered in that way. I've been through similar experiences, but I think you've just kind of got to get up the next morning and be like, I'm going to try again. Maybe it'll all happen the same way, but of course there are generations coming up that are new. There are so many people out there that are completely new to this and will not necessarily react the same way.
I'm actually very hopeful about younger people, because I think the people who are in their, like, 18 to 20 right now have a much different attitude, and it's a much more anti-corporate attitude. It kind of fits with the way I felt when I was their age. I feel like in my generation, I was rare in feeling that way, but it seems to be common in that age group right now. I think there might be hope to try it again, even though it might be painful to make a go at it again. Well, first of all, Bradley, let me say I really enjoyed your talk, and yes, I am also over 30 hex. You mentioned in the time machine that one of the things that maybe we made a mistake on was focusing on commercial adoption of FOSS, but there's another issue at hand you didn't really address, and that is that our classic open source and free software licenses don't read on some of the technologies that have emerged. In the early days, we had this quaint notion of a CPU and a desktop underneath our desk and a keyboard above it, and that was the machine on which we ran binaries. But fast forward, you knew about services and played an enormous part in the drafting of the AGPL license. Maybe you would have liked an LAGPL license, but what about the current state of affairs with composable network services, where none of our licenses read on the use of that kind of software? Yeah, I think we have to invent around in that case. I do think the Affero GPL helps in that case, because if you write replacements that are Affero GPL, there's still some hope there, but I think it goes more back to the advertising issue, which I don't have any solution for. It's really hard to compete with free as in price.
It's one of the things that was true of free software: we got a lot of adoption because Solaris x86 was $450 and Linux was free as in price. And until we figure out how to crack that nut of, gee, people are willing to accept advertising, I think actually we should start advocating more about how dangerous that kind of advertising is and how insidious it is and how it's bad for activists, it's bad for everybody to be bombarded with advertising all the time, because once that changes, it would switch us back to the old proprietary model. If people reject advertising and say, well, the only way I can reject advertising is I have to pay $50 a month for Google Calendar, they might well try Nextcloud in that case. I have a comment actually about your thought experiment. I can imagine that's your point of view coming from the US, but actually in the Balkans we have a different kind of problem. We have people who want to work on open source, but they can't get any funding for it. We are talking about people who are working for, let's say, 12 to 15k a year, so asking them to quit their jobs, that's like criminal. I'm pretty sure that everywhere in the world there are opportunities for smart people that aren't writing software, and if it's really true that there is a state in the world where the only job a smart person can get if they're educated, like you would have to be to write software, is to write software, I'd love to hear more about that place, because it sounds very strange to me that there aren't other choices in life that you could get as a qualified, good, intelligent person. Writing software can't be the only job available in the world. I'd be surprised, but I'd love to talk to you after if you want to tell me about what it's like in the Balkans where everybody has to write software for a living. I'm curious about your take on the government policies. 
You mentioned unions, maybe to make progress, but FSFE has a fantastic campaign, Public Money, Public Code, and there is the Sovereign Tech Fund in Germany to finance the open source ecosystem with public money. Do you have any take on this? Well, I think it's correct. I mean, I agree with the campaign. I think it's a good campaign. One of the things that I'm really concerned about, and I don't know if this has happened in Europe as much as it has happened in the U.S., is that there is a lot of pressure in the United States from corporations that the code that's funded by public funds must be under non-copyleft licenses, because corporations want government code that they can take and incorporate into proprietary products. To the extent to which public money is funding non-copylefted code, it's kind of a handout to big corporations that want to make more proprietary software. That's the only thing I'm worried about in that kind of campaign: making sure that the public money is going to code that will stay public, that will stay free software, and the only way, it's not perfect, but the best way we have to do that right now is copyleft licenses. Thanks for the talk. It was interesting. My question is also about this trend on digital sovereignty, because I found your talk interesting, but very American in view. I mean, both in the history part, because in Europe, often software was made by people that were communist and were trying to build something alternative to capitalism, but also in your reference to companies. Because in Europe now, we do have a system of small, maybe small and medium open source companies that are not just motivated by making money and taking the government away. They are really motivated by building an alternative to the dominant big tech technologies. 
And maybe to solve what they see as the real failure of the open source movement, which is failing to prevent all the software bases from being used to build walled gardens and closed systems on the cloud, because everything is now on the cloud. So there are companies that are exploiting this just to make that. So I was wondering whether you think that all this coordinated approach between the European industry and the governments and new regulations, like the Digital Markets Act, could be a solution that brings more freedom into the back of the country. Yeah. I mean, I think the EU is a much better place right now than the United States. That's definitely true. My big concern as somebody from the U.S. is the cultural imperialism that the U.S. has kind of pushed through technology. I think it's just absolutely disgusting that a couple of U.S. corporations, which are effectively thinking of themselves as software companies, have such dangerous global power. And I think it's beholden on people in the U.S. in particular to fight against that. And I find that people in my country are hard to convince to really fight against that. And it's trickling through the world, right? Because you have, I mean, I look at the EU doing these great regulations that are really advancing things. Like I love getting off a plane and seeing all the cookie pop-ups that I don't see at home. And there is regulation. It's forcing U.S. companies to comply with it. But they give you so much pushback and they have so much power to give you pushback that I just want to, you know, I want to have a revolution in my country to solve this problem, right? Because this kind of imperialism through corporate power is revolting. But I do think there's a lot more hope outside of the U.S. at this point. I was wondering if you could comment on the Cyber Resilience Act and the effects, I mean, it would have effects obviously in Europe but also in the global market. And we were talking earlier about commercial versus non-commercial. 
And I know in the text it's also talking about these issues. So I'm wondering if you have any guidance in terms of potentially coalition building around making sure that the Cyber Resilience Act is effective and good for the community? Yeah, I'm just not an expert on the Cyber Resilience Act. So it's hard for me to answer. But I think probably folks at FSF Europe will be good to talk to, maybe. But yeah, I wish I could say more and I would like to learn more. But I just don't have the expertise to line it up with that. But I hope people in the room who are more informed and smarter than I am would be able to do that. Maybe to quickly add to this. So like two hours ago there was a session, which is also recorded, from the Janson room, the European Commission. And in general it's about liability. So we also have the Product Liability Directive and the AI Act. So three files addressing this question of liability in terms of non-commercial, putting something onto the market. I suggest you just watch this session from the morning again, and then also you can shoot us an email and we can bring you into the discussion group on this particular question. Thanks for your talk. I think the one thing that I'm missing from here is a vision for how the whole system is going to work. So I work as a software developer on a copyleft piece of software. And I also kind of have a side project that I'm trying to do kind of in my free time. And the amount of throughput that I can get on the work that I do for my day job is just, you know, an order of magnitude more than I can do kind of in my free time. And so, I don't know exactly what I'm saying, but I think there's good ways to do business and bad ways to do business, right? And the best way, the only sustainable way to lift someone out of poverty permanently is to give them a good job, right? 
And so, I mean, having a thing where basically all software is written by hobbyists doesn't really seem like a sustainable kind of a future. So I'm wondering if part of what we need is not just to say, well, what's happening now is bad, but a vision for how it could be better in the future. Do you have any ideas for that? Or even, you know, things inspired by cooperatives or things from the middle ages like, you know, what is this called, guilds or things like that. Like, where there's large organizations that are not designed for profit but designed to achieve a certain goal. So if I had a comprehensive solution, I obviously would have presented it, right? I've been around long enough that I see clearly where we went wrong, but as I was saying, even if I were to go back in time, I don't know if I would be able to make the right choices to make it better. I certainly see one key component, which is that what changed from the past was people now write open source in their day job, which is just what open source companies want. They're kind of gladly saying, well, I do open source in my day job and now I'm paid for it, so I don't have to volunteer for anything. And I think that has to change. I think if people are lucky enough to have a high-paying job, they should be willing to volunteer in their spare time to do something to make a change. Okay. Thanks, Bradley. Thank you. |
Reckoning with new app store changes: Is now our chance?
Recent legal and policy developments around app stores and what they mean for free software |
Please welcome our next speaker, John Sullivan. Hi, everybody. Sorry for the delay there. I was actually trying to do something fun and do this presentation from a PinePhone, which actually has worked and I've done it before. And it is my single working free software device that has HDMI out right now. But the Bluetooth keyboard failed me. Nothing about the phone. So here we are. Thank you for the viral laptop. Quick note for people who might be watching on the stream. I did upload the slides in the FOSDEM system. So you might be able to grab them from there and follow along. So, reckoning with new app store changes. Even though I am no longer with the Free Software Foundation, I am still with the same mission, the same goal that I worked toward there for many years. And that's to work toward a world where everybody can do everything they need to do on any computer, including tiny ones, using only free software. And I still want to devote lots of my time and energy to that, because we're seeing escalating consequences of what happens when we don't control the technology to be used towards our values and goals, but the technology actually pushes us into the technology owner's values and goals. So my current free software affiliations actually are F-Droid. I'm part of an exciting effort that we're about to have some big news on to found an actual formal organizational home for F-Droid. So I think that will really help put the project in a position to take advantage of some of the opportunities that I think are being created by the current climate of various regulatory and social things going on. How many F-Droid users do we have in the room already? Yeah. But if you don't have your hands up, please give it a try. It just is a free software app store and client that lets you install free software apps on your Android device. For Debian, I'm a Debian developer and user. Of course, I'm a member of the keyring team. 
Help manage the GPG key and infrastructure stuff there. I'm also looking to get a little bit more involved in the Debian mobile work. And Alliterative Advising is the name of my new consulting company. I'm going to be working with companies and organizations to promote and advocate free software, work with free software communities, and follow best practices around those things. So why am I talking about app stores in particular here? I have a relatively long history working with free software on mobile stuff, largely through my history at the FSF, but also a lot in my spare time. Actually, with the FSF, I participated in a lot of protests and actions directed at freedom on mobile devices, including dressing up like Steve Jobs at the iPad launch in San Francisco, going to lots of protests with picket signs, writing lots of articles. But I've also, outside of that, tried to be a contributor to free software mobile projects in whatever way I can, including odd things like writing an Emacs mode that let you make phone calls and send text messages, which I discovered somebody is still using about 15 years later. So the irony here is that I am currently on a PinePhone, which is not attached to an app store, unless you consider Debian an app store, which I guess you could, but it's not the kind of app store that I'm talking about today. I'm just trying to work at the problem from different angles. That includes using a good Linux phone. It also includes working with F-Droid to help people that are using Android, and then working on the top-level issues about what app store policies can and can't be, which affect Apple users as well. So here I'm going to focus on just Apple's and Google's stores. There are many other stores, Microsoft's notably, which has had a complicated relationship with free software. 
You can look up the Conservancy posts from, well, it was late last year, about some concerning changes that Microsoft was going to make that we need to keep an eye on. I think they backed out of those, but it just showed that we need to raise an outcry anytime companies try to do that and keep an eye on what they are doing in their app store terms. And of course, there's all the various browser extension stores and other collections of apps that users download that are similar in a lot of ways, but it's really Apple's and Google's that are at the center of the contested territory right now. And I think people in this room probably already agree that mobile devices are very important for freedom, but I just wanted to kind of break it down a little bit. These devices are ubiquitous. They've replaced regular computers for so many people in the world. I think just in the U.S. recently, a survey found that 70% of people aged three or over use a smartphone on a daily basis. I would call these devices, in most cases, parasites. They collect and expose so much data constantly in ways that have definite impacts on your individual freedom. They are gatekeepers. They are increasingly a key card that we have to use to access services, rent cars, bank, check into hotel rooms, even COVID-related stuff. There's just a lot of ways in which they've become kind of our form of identification. Wow, interlopers. It was kind of late at night when I wrote that word. They sit in the middle of our communications, all of our kinds of communications, personal communications, political communications, things we're publishing and sharing with other people. They're there in the middle of all of it, which makes it especially important whether we have control over the software doing that or not. They're disruptors. This is where the app store part specifically comes in, because app stores have really changed the way that people distribute and receive software. 
The whole concept of handing another person a program on a disk that they can install, or on a USB stick or whatever, has largely just been replaced. You don't give someone a program directly anymore. You say, hey, I like this app. You might like it too. Here's where you can get it, and they get it from somewhere else, even if it's a free software app in a lot of cases. The app stores have various characteristics that make them important for us to think about, but one of the largest impacts, I think, is the way they've impacted the distribution of software as a whole. Why do we have app stores? How are they sold to us? This is kind of a list of things that app stores claim to do. They provide security because applications are reviewed in both the Google Play Store and the Apple store and rejected if they are found to have security problems. You're getting them from a single trusted source instead of downloading random files off of the internet, fair enough. But as with any security thing, it's in quotes because we have to always ask security for who, against who, and while the app store dynamic provides some security benefits and assurances to users, it also completely guts the user's security against Apple, for example, or against Google. Because particularly in the case of Apple, the user has to hand over full control of their device to Apple. And they have no choice. They cannot stop trusting Apple even if they wanted to. App stores supposedly provide quality standards. Apps are vetted for things like they don't kill your battery right away. They might fit certain interface standards. They work with a reasonably up-to-date version of the operating system. They facilitate discoverability so you can browse and find applications to do the things that you want to do. And download them without having to search the broad, wild world web. They facilitate payment to developers. So developers can get paid for the work that they do directly from users. 
And developers don't have to stand up their own system to receive credit card payments in order to benefit from that. And they keep your apps updated. Now, why do we actually have app stores? Again, in this room, I think I'm not going to work too hard to provide a lot of examples of ways in which Google and Apple have used their power in ways that have nothing to do with that list that we just showed about the ostensible benefits of app stores. But I did pull together just a few examples to be indicative of how they use these powers in a couple of different ways. They use them for their selfish interests, to advance their profit and other corporate interests. But also, because they have this power and they become essentially choke points or control points, the power that they have is then put into the service of authoritarian governments, who can then force them to use that power in certain ways, something that wouldn't be an issue if they didn't have that power to begin with. So the first kind of thing: Apple requires that all the browsers use WebKit. Why? WebKit, it's free software, implementable, it's fine. But why can you only have browsers that use WebKit? Maybe because Apple has a lot to do with WebKit. Apple only allows Apple apps to use certain features such as the iPhone's NFC chip. You can't have an independent app that has access to some of the features that Apple wants the tightest control over. And then an example of the censorship side of things: they removed HKMap.Live, which was used to track police activity in Hong Kong. Google, similarly. Now Google, of course, allows you to install apps from outside the Google Play Store. 
So in Google's case, it's sort of more of an issue of the way they use the soft power that they have, even though they don't have the hard power that Apple has over the device, the really tight control. Because of their prominence, their overall control through the Android requirements and other ways for how the device is presented to the user, they can bury the ability to install apps from outside the Play Store very far, so most people won't do it. And they can push their own apps over independent apps. So a political example: they removed Revolution of Our Times, which was a protest-themed game. Again, I believe, at the request of the Chinese government. And as an example of Google's interests, they have removed various ad blockers over the years, because advertising is Google's business. So they're not really interested in helping you find ad blockers. There's lots of ways in which app stores have become concerns for free software. There are the terms of use; the Microsoft example that I mentioned was an example of that. But in Apple's case, some of us spend a lot of time reading over the various terms that you have to agree to in order to be a developer, trying to figure out if they're compatible with free software or not. And it's kind of a headache. And I'm actually not going to get into that in this talk because it's the shorter version. And ultimately, the truth is, even though I personally had a lot of concerns about whether copyleft software in particular can legitimately be in the app store, I'm not aware of any cases where Apple has removed an app for being free software. You know, I suppose they could provide a sort of superficial reason as a pretext. But the concern here would be Apple sort of reserving the right to kick an app out when it felt like it, which is not a safe position for free software to be in. 
But at the same time, they're definitely aware that there are free programs in the app store and they allow them to be there. Other challenges: digital restrictions management, particularly in the iPhone example. How does that interact with free software? Users can't install their own programs without a developer key. They can't, you know, install other app stores without circumventing the iPhone's DRM. And what does it mean that every application that the user gets through the Apple app store is actually delivered to them wrapped in DRM, even if the developer did not upload it with DRM to begin with? Apple does that part for you. Lack of labeling and searchability is a big problem even in the Google Play Store, because none of these companies are interested in supporting the ability of users to specifically find free software. So F-Droid is easy, it's all free software. Google Play Store is not, but there is a lot of it in there. But they don't want to elevate the free programs over the non-free ones, and so they've resisted adding that search functionality. Certain content policies, kind of going back to the examples I listed before, can be just inherent problems for free software, in that you have user freedom. Some free programs would not be eligible because of what they specifically do, even if the other things were not a concern. Source code links: even if they were to label the programs as free software, Google Play doesn't provide a trail for the user to get back to the source code most of the time. You can sort of maybe have a GitHub link there or something, but in our ideal world app store I think we would want an easy way for the user to get directly from the app store entry for the program to the source code for that program. I know free software has faced a lot of challenges with fake versions in app stores. 
It's a complicated thing, since anybody can take somebody else's free program by definition, upload it to the Google Play Store and start charging money for it. Now there's nothing inherently wrong with that, but it does cause confusion and problems, especially because some of those cases are outright scams, and so we might want to ask app stores to do a little bit more in terms of transparently saying who the money is going to and how the user knows that that money is actually going to facilitate development of the program. So Google, like I said, lets you install apps in other ways outside the Play Store. There are big examples of that, Amazon and Samsung even having their own app stores for Android. Challenges to this policy, though, because of the soft power usage by Google, have been happening, and one big one that's happening now is Epic Games and the Match Group suing Google. Earlier this year a date was set for the jury trial in Northern California, which is November 6th, and this centers around the way that Google, similar to Apple, which we'll see in a second, has rules about app developers not being able to directly bill outside of the Play Store if they're trying to get users to pay them for something, and Fortnite is the wonderful program at issue here. That does create problems for us as we're thinking about whether we want to sort of come out publicly as free software activists on one side or another here. It's sort of uncomfortable to be on the side of Fortnite. It's proprietary, obviously, but also Google has sort of been able to spin this as Epic is not fighting for everybody's rights. Epic is a huge player who is benefiting tremendously from being in the app store and they just don't want to pay for it. Google is sort of able to spin it that way because we're talking about a large successful program, and that kind of complicates our relationship to this case. That's not the only thing going on. 
There's also 37 state attorneys general from both parties in the United States who have separately filed suits against Google. Google's response so far has been to say that they'll cede some ground on the commission fees that they charge. If you charge money through the Play Store you have to give a cut to Google. They're going to back off on that a little bit, but they're still kind of sticking to forcing apps to use their billing. Epic sued Apple as well, and the judge did find that Apple has to allow other payment systems, but the judge decided that the app store is not a monopoly, which I can't wrap my head around. Epic also had to pay some money because it was found that Epic violated its contract with Apple. Apple's policies appear to be changing. There's credible rumors that Apple is going to allow iOS users to install software from other sources besides the app store when iOS 17 is released, which is this year, and the reason for that is thanks to our friends here. The EU Digital Markets Act, it's just the Digital Markets Act, the Digital Services Act is a separate one related to the Digital Markets Act, is requiring Apple to probably do this. We've seen free programs respond in different ways to the difficulties with the app stores. So far we've seen free programs just refuse to be in the app stores. We've seen them have license exceptions to make it clear that a program that's copyleft can be in the Apple app store. We've seen them change their license to work in the app store, and then we have some, like F-Droid, sort of standing up their own organizations. But my kind of question here is, given these pressures to open app stores, is it an effective response for us to kind of get behind these things? 
And I think it is. Even though Bradley highlighted some problems with us supporting market ideologies essentially and the kind of capitalist viewpoint on free software over the years, I think free software does benefit in the short term from supporting these pushes in the EU and the United States for competitiveness and open and fair competition, because free software produces that as a side effect. We don't do free software in order to let big corporations compete fairly with each other, but when you have free software there is inherently free competition. So I think we can support things like the Digital Markets Act and recognize what it seems to be doing that free software may be able to benefit from. Also just three days ago the US published a report about competitiveness on behalf of the Biden administration, which everyone should read. The table of contents is amazing. It talks about the problems with app stores that are caused by Apple and Google as gatekeepers. It really went above and beyond anything I expected from the US federal government, until you get to the recommendations, and it's kind of like: we have described these problems really well; our recommendations are they should, like, do better. There is also the Open App Markets bill in the US currently waiting. It was introduced, not yet passed. Who knows what the future of it will be with our new Congress, but it's a relevant thing to keep an eye on. It's similar to the EU Digital Markets Act. Ultimately we want to get to a world where users can choose which apps they're using and which apps are default. They should be able to install. We shouldn't have a special term, sideloading, for installing your own apps. It's just called installing software, and users should always be able to do that. 
Those are kind of the requirements, but then we really also want an app store to promote the values of software freedom by directing people to source code, making it easy to find free programs even if there are also proprietary programs in there. And then support better practices that speak to those app store goals of security, like reproducible builds. One awesome thing that F-Droid does is build the binary before you get it. It's not reproducible in every case, fully, mathematically, but you have a good indication that this source code built this app successfully. That should definitely be part of our utopian picture here. I hope everybody will take this unique moment in our history. I didn't think we would get here. I thought once Apple got away with this stuff on the iPad and the iPhone, every new computing device would be treated this way in the future and our desktops and laptops would actually be moving more in that direction. But I'm actually tentatively optimistic that regulatory actions are pushing things in the other direction, even if it is the result of fights between big proprietary software companies primarily. I think we want to insert ourselves in this moment, make sure that free software is presented as a consideration and something that needs to be treated on an equal footing along with all of the other concerns being talked about here. So thank you. I will be around and happy to talk. Please contact me if you are involved in anything related to this. I would love to help any efforts in this area further. There's my info. Thank you. Thank you. |
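The reproducible-builds point from the talk above can be illustrated with a minimal sketch: if an independent rebuild of the same source yields a byte-identical artifact, its digest matches the published one. The function names and file names here are hypothetical, and this is not F-Droid's actual verification tooling, which also has to deal with signatures and package-format details that are not shown.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(published: Path, rebuilt: Path) -> bool:
    """True when both artifacts are byte-identical (hypothetical helper)."""
    return sha256_of(published) == sha256_of(rebuilt)


if __name__ == "__main__":
    # Stand-in files; a real check would compare the store's binary
    # against one rebuilt independently from the published source.
    with tempfile.TemporaryDirectory() as workdir:
        published = Path(workdir) / "published.apk"
        rebuilt = Path(workdir) / "rebuilt.apk"
        published.write_bytes(b"example artifact bytes")
        rebuilt.write_bytes(b"example artifact bytes")
        print("reproducible:", builds_match(published, rebuilt))
```

A matching digest only shows that the two files are identical; it says nothing about whether the source itself is trustworthy, which is why the talk pairs this with easy access to source code.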
How to get public administrations to use more FOSS |
to introduce our next speaker, Klaus Wicke-Hot. Yeah, thank you and welcome to my talk. Actually, today I'm here representing the Open Source Business Alliance, so the OSBA. We are a non-profit organization representing more than 200 companies in Germany trying to make money with open source software. And also the goal of the OSBA is to promote usage of open source software. So we have some heavyweight members like SUSE, Red Hat or Telekom, but also smaller companies like Linodata, whom I'm working for. And within the OSBA we have several working groups, and I'm the spokesman of the working group Beschaffung, that means procurement. And so we were busy the last year with negotiations with people working in this sector, and I want to share a bit of all the experiences, all the insights I found personally, because I think it's very interesting to see the other side of the table when we're talking with people about that. I have a disclaimer, so this is no legal advice. I have an engineer's degree, I'm more kind of a developer, and I just came a bit by accident into this world. So this is no legal advice. If you have legal issues, especially license issues, you need to go and get a lawyer. So why is all this getting more important on the public sector side? So we know that the world is changing sometimes, and in different ways we don't expect. I remember the last FOSDEM before Corona, and then just a few weeks later the streets were empty because there was the lockdown. Then we had a FOSDEM completely virtually, that was very strange because we were sitting relaxed with coffee and everything at home, but you're missing the atmosphere, the people aren't there. We had the Ever Given, which got stuck in the Suez Canal, so we were running out of supplies with IT hardware and all these things. 
Now we have a war in the eastern part of Europe, and it all leads to something that we call sovereignty, especially digital sovereignty, and this is something that the countries really now understand for themselves, so they see that they have to be able to react to things and to act of their own will. So in the current coalition agreement from the German federal government, there is something stated, it's in German, but I translated it for you. So developments are usually commissioned as open source, and the corresponding software is generally made public. And there are many more initiatives in German counties and cities that are focusing on building on open source technology, and I'm pretty sure it's the same all over Europe. We have contacts to France, to Luxembourg, where all these things also happen in a similar way. So far so good. It's quite hard to get an understanding of how the use of open source software is evolving. So the main principle in our open source software is that we share it with everybody. So I cannot check somewhere how many installations there are. You can sometimes figure out in server logs what kind of clients are connecting. You can check download numbers from GitHub, GitLab and whatever, but you don't see a real number. So at OSBA we did a survey of our member companies dealing with selling open source to the public sector, asking a bit about the issues they face and the questions that arose. And we also did monthly workshops with the ministry in Germany that's responsible for some of these contracts. And then we got some pretty interesting insight about the issues they have. And the result is that the open source licenses we use and the contract management on the public sector side are not really friends. So there are some problems that we need to solve, that we need to address, and there's really room for improvement. And I want to point out some of these things here. 
So the power of open source is something that I don't have to explain to you. These are the four freedoms defined by the Free Software Foundation, as stated in the GPL. We all probably know them by heart, and they are the basis of all our work, let's say. But this power of open source comes with a price tag. This is something I didn't know before I really looked into these issues, because you only get all these freedoms if you comply with the terms and conditions. I took this here from the GPL 2.0: you may copy and distribute verbatim copies of the program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this license and to the absence of any warranty; and give any other recipients of the program a copy of this license along with the program. These terms and conditions mainly apply in the case of distribution of software. That is not very common in a business-to-business situation. If a company buys software, they just want to keep it for themselves. They produce cars, or they are a bakery, so they are not going to distribute the software, and there is no problem. But if you go business-to-government, you most often have the situation that the government itself is a huge, complex body with several legal entities. One entity procures the software and then has to distribute it to the other parts, and then this fact of distribution applies. They have to check whether they comply with the licenses, that everything is complete and that all of these conditions are fulfilled. These are legal issues, so they need lawyers to work on that, and this costs money and resources. Another thing that we face is that the licenses we are using are mainly available in English.
So there is only an English version, which is quite hard in Germany because you need a translation into proper German, and there is no authorized German version of the GPL, for example. And they are all based on a US conception of law. For example, copyright in Germany is completely different from copyright in the US, so you need lawyers to put that into the right context. I'm not going to say the licenses are not working. We have several projects like gpl-violations.org, which showed in Germany, by going to court, that the licenses work and that people understand them. But it's quite hard to use them. In Germany, software is mainly ordered via tenders. This is regulated in procurement law. And there is something called the EVB-IT, which stands for Ergänzende Vertragsbedingungen für die Beschaffung von IT (supplementary contract terms for the procurement of IT), a huge, complex set of terms and conditions and sample contracts. One of the things stated in the EVB-IT is that the contractor must grant the right of use of the software to the client. This is impossible, because in open source the recipient gets the right of use from the license, by following the license terms. So we need special workarounds in that situation. At the OSBA we have made a kind of handout to support companies doing this, but it always needs special negotiation to solve these issues. This is also some extra work, part of the price tag I was talking about. We're here in Belgium, and Belgium is famous for its comic strips, so I found this one that I really, really like. It shows a modern application, any application, probably one from the app store from the talk before. You see it's a very shiny, complex and very detailed model, but it's made of building blocks. This is the way we work: we take open source blocks from somewhere outside and recombine them into something new. And the thing is, each of these components has its own license.
And the problem here is that we cannot say there is one final license for our whole product, because when we distribute the software, we distribute all the small packages together. At the OSBA we also examined a popular video chat solution. At the beginning of COVID we all needed that. I'm not going to name it, but the project site says it uses the LGPL, the GNU Lesser General Public License. When you install it on your hard disk and examine all the files installed there, and figure out all the different license files that were put on your hard disk, you find that it's not just LGPL, it's a mixture of different things. Also, the installer of this tool downloads fonts from Microsoft, some specific TrueType fonts. In Debian you can download an installer for those; you call the installer, then you have to agree to the EULA from Microsoft, and then the download starts. But here nothing like that happened, they just appeared on your hard disk. This is something you have to take care of when dealing with the software. And think back to the public sector: they distribute it to another legal entity within the government, so now they have to fulfill all the license conditions that come with that product. The main question, which I cannot answer here, is who is responsible for that: the project that put the installer on its project page, the programmer or developer who made that installer, perhaps the company that made the offer and did the installation work, the admin on whose server it's running, or, in the end, the final entity that is using this application. So this is a real issue. And be sure that with containers things get worse, because then you get a bunch of containers, each with its own operating system, a different bunch of software and a different bunch of licenses integrated. So we need to answer the question: when we look at a piece of software, what is inside?
Which licenses are used inside? And then we come to something like a package insert, like for medicine. The only way to deal with that is an SBOM, the Software Bill of Materials. It should list all the components, all libraries, all modules, all the container stuff, everything that's put into the product. For each piece of software in the SBOM we need the source, where it comes from, the version and the license. With the version and the source we can also answer security questions like: are we vulnerable to something? Log4j, for example: is Log4j used somewhere? Which version do we have, so that we can fix it? And for our legal view, we know which licenses are there. But when we have this bunch of licenses, we as the public sector are not really done, because now we have to check for our business case: are all these licenses compatible? Do they work together? We have something like copyleft that might have an impact, so we have to figure out the situation for a specific business case. So we still need some people dealing with license compliance. For the SBOM itself, there are different formats available. The one that seems to be in most common use so far is the SPDX format, Software Package Data Exchange. It's maintained by the Linux Foundation, and it's the only one that has an ISO certification now. So this is an ISO standard, and we can use it. There are tools available to convert an SBOM from one format to another. They are mainly text files in different formats, JSON for example, and with that we get a detailed view of all the components inside our software delivery. And for people who are interested, or a bit afraid, like "oh, I have to deliver an SBOM": tomorrow there is a whole devroom here at FOSDEM dealing just with this SBOM topic.
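To make the "package insert" idea concrete, here is a minimal sketch in Python of the kind of record the speaker describes (source, version and license per component) and the two questions asked of it: which licenses are inside, and is an outdated version of some component present. This is a simplified illustration and not the SPDX schema; the `Component` class, the function names, the example.org URLs, the sample entries and the version threshold passed by the caller are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One SBOM entry: what it is, where it came from, and its license."""
    name: str
    version: str
    source: str   # where the component was obtained (hypothetical URLs below)
    license: str  # SPDX-style license identifier


def parse_version(version: str) -> tuple:
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def licenses_in(sbom: list) -> list:
    """The legal view: every distinct license that ships with the product."""
    return sorted({component.license for component in sbom})


def older_than(sbom: list, name: str, minimum: str) -> list:
    """The security view: entries of `name` below a caller-chosen version."""
    return [
        c for c in sbom
        if c.name == name and parse_version(c.version) < parse_version(minimum)
    ]


# Illustrative data only; not taken from any real product.
SBOM = [
    Component("log4j-core", "2.14.1", "https://example.org/log4j-core", "Apache-2.0"),
    Component("libexample", "1.2.0", "https://example.org/libexample", "LGPL-2.1-only"),
    Component("tinyutil", "0.9.3", "https://example.org/tinyutil", "MIT"),
]

if __name__ == "__main__":
    print("Licenses inside:", licenses_in(SBOM))
    # The threshold "2.17.1" is an example value chosen by the caller.
    for hit in older_than(SBOM, "log4j-core", "2.17.1"):
        print("Needs an update:", hit.name, hit.version)
```

Knowing which licenses are inside is only the first step; whether they are compatible for a given business case still has to be checked by people, as the talk points out.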
But this is something we will probably face in the future: when we apply for software tenders, we will have to deliver an SBOM with our solution. When we check all the licenses that may be around in the world, we come to a number of probably 1800 different licenses that may appear in such an SBOM. So we have 1800 different legal texts that we might have to examine. OSI has certified 116 of these as open source. From my personal point of view, this list is incomplete. OSI has set up a kind of life cycle process to get rid of licenses that are not common anymore, to narrow down this number. There are probably 500 that we would accept as something like open source, and each of them is different. So when you're working as a lawyer and you have to check the license situation, you have to deal with this number of licenses. But I want to remind you that proprietary software also uses licenses and EULAs, and they are mostly much more complex than our open source licenses. They are sometimes specific to a single product, and they are changed regularly. If you use any device, sometimes it pops up: new EULA, you have to accept it. And I'm pretty sure everybody here in the room checks them carefully, finds the differences from the last version, because they are not marked in red or blue, and accepts them afterwards. With open source we have a real advantage, the unknown or hidden advantage, as I called it: our licenses are quite short. They are mostly just a couple of pages, sometimes not even that. I think MIT is just a few lines of license text. And most of them follow a specific pattern. They are derived from each other, so you can find a structure inside and get a better feeling for what the license is about. And they are pretty static. I mentioned the GPL 2.0 before. That's the license applied to the Linux kernel. And the GPL 2.0 has been unchanged since June 1991.
So for more than 30 years, the same license text. If I'm working as a lawyer, I'm pretty fine: I invest all my work in that license once, and then for the next 30 years I can work with my customers and support them. What we have also seen while talking with people in the public sector is that there are some misconceptions out there, misunderstandings that other people probably told them. The first one: some people think there's an obligation to publish everything on the Internet. But this is not written in any license. The idea of open source is that when you give somebody a binary piece of software, you have to give them the source code of that software. So if something in the recipient's environment changes, they can adapt the software to their needs. They can add new device drivers, change the language, add new functionality, whatever. But the thing is, you have to hand over the source code, with appropriate rights to work on it, to the recipient of the software, not to anybody somewhere in the world. Publishing is something most of us do because we want to spread our ideas and our software, but we don't have to. This is important to know. I think we have another talk today from the military sector, where parts of the solution probably must be hidden or must not be published. There you can easily use open source, you can adapt it, you can give it to another legal entity, and only they have to receive the source code. You don't have to put a mission control system on the Internet. But some people think: this is open source, I have to publish everything. Another thing is pushing changes back to the community. I personally really like that idea, because then I'm no longer in charge of my changes; the community is. Otherwise, with every updated version I have to apply my changes again. But in fact it's not stated in any license.
It's more like common sense that we do so. But we don't have to bring the changes back to the community. Think again of a mission control system, a missile control system, whatever: we don't have to. And then we have, especially with copyleft, the idea that we have to disclose all our source code. I think it goes back to some Microsoft advertising from 20 years ago, around the time of the Halloween documents, like "the GPL is a kind of cancer". This is still in people's heads. No, we don't have to disclose all our source code. If we accidentally mix GPL code into our product, we break the license. So we are not licensed anymore, and we cannot ship that solution. That's a legal problem we have to solve. But it does not mean that, at that moment, we have to ship all our source code. So these are the things we hear. And then you think, oh shit, we have to explain to them that this is not the case. It's just normal software that we are dealing with. And then finally my last slide: how can a developer improve the situation? As I said, I'm more of a developer guy than a legal guy. What we can do when we start writing code is this. I mean, I didn't do it any better before, but normally you just start coding. When you start working on a new project, you should check the licenses of all the components you want to use for compatibility. Do we have copyleft? Does it all work and fit together with what we are doing, and also with the planned business case of the company or the client I'm working for? This matters especially for standard components. When you think of the picture I showed you before, the comic strip, there was one small block on the right-hand side. If this breaks, it all falls apart. It seems to be a very fundamental piece of that solution, so it's probably not that easy to change. We should start building SBOMs to check what's inside our product. That's also something that the security guys are asking for.
In the US there is something like the supply chain initiative. There was a presidential order on how to secure the digital supply chain, and SBOMs are a very important part of that. So I personally assume that we will have to deliver SBOMs in public sector projects, and that developers are starting to build SBOMs. And when you think of the picture I showed you: if every community behind these building blocks made an SBOM for its product, we could iteratively combine all these SBOMs for the final product. So I think this will probably start to work automatically. But you should start writing SBOMs for your own project, and the best way is to build that into your build process, so that the SBOM is created automatically and updated with every update that you ship. That way you always have a fresh SBOM. I told you the software version is included, so you can use it for dealing with security issues. Avoid exotic licenses. There are some interesting licenses around, like, for example, the beer license. It's probably not a very good idea to pick a very exotic license that nobody else uses, because it makes the work for the lawyers much more complicated. In the end, when you apply for such a tender, and it's not a specific solution that must be built in open source but a tender for any software solution, you are just one bidder offering an open source solution against proprietary solutions. They have to check what kind of work they will have afterwards and whether it fulfills their needs, and then they make a choice. And when your offer is way too complex, and it is mainly lawyers working at that level, then they will probably choose something else. I put this one in brackets: if possible, try to reduce the pile of different licenses used. I know this is very hard, because you cannot change the licenses of all the pieces that you need. But going back to OSI, I said they treat 116 licenses as open source.
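The idea of combining the SBOMs of the individual building blocks into one SBOM for the final product, and then looking at the pile of licenses, can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary layout, the function names, the sample entries, the made-up "Exotic-1.0" identifier and the `COMMON` set are hypothetical and do not represent the OSI list or any real SBOM tool.

```python
from collections import Counter


def merge_sboms(*sboms):
    """Combine component lists, keeping one entry per (name, version)."""
    merged = {}
    for sbom in sboms:
        for component in sbom:
            key = (component["name"], component["version"])
            merged.setdefault(key, component)  # first occurrence wins
    return [merged[key] for key in sorted(merged)]


def license_histogram(sbom):
    """How many components use each license: the 'pile' a lawyer must read."""
    return Counter(component["license"] for component in sbom)


def unusual_licenses(sbom, common):
    """Licenses outside a caller-supplied set of well-known ones."""
    return sorted({component["license"] for component in sbom} - set(common))


# Hypothetical SBOMs published by two upstream building blocks.
BLOCK_A = [
    {"name": "parser", "version": "1.0.0", "license": "MIT"},
    {"name": "netlib", "version": "3.2.1", "license": "Apache-2.0"},
]
BLOCK_B = [
    {"name": "netlib", "version": "3.2.1", "license": "Apache-2.0"},
    {"name": "oddfont", "version": "0.1.0", "license": "Exotic-1.0"},
]

# Example set of licenses treated as common; an assumption, not a recommendation.
COMMON = {"MIT", "Apache-2.0", "GPL-2.0-only"}

if __name__ == "__main__":
    product = merge_sboms(BLOCK_A, BLOCK_B)
    print(len(product), "components")
    print(dict(license_histogram(product)))
    print("Needs extra legal review:", unusual_licenses(product, COMMON))
```

Running such a step inside the build process, as the speaker suggests, would regenerate the combined list with every release, so the SBOM never lags behind the shipped software.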
They have already started to narrow down the number of licenses they list on their website. And on their website they have, I think, five popular licenses that they recommend. This is something that you can probably try to follow. And the last thing is a question to you: do you know the EUPL? That's the European Union Public License. It's a copyleft license, and the cool thing about it is that it's available natively in every European language. So in Germany we can pick the German version of the EUPL and use that for licensing. Then we have a German text, and if we go to a German court, we have a German license text. In Italy it's the same, there is an Italian version, a French version, and they are all equivalent. Thank you. I think we can take one or two questions. Anyone? Who's the first one? Okay. Hi, thank you. I really appreciate that you've identified a problem in software procurement in the public sector, and I see that you're making really useful suggestions. My question is: is there demand from the public sector to do these things, or are you proactively making these suggestions? Does the public sector understand that these are challenges in procurement? What is the situation on the other side? Thank you. So, does the public sector understand the demand for open source, something like that? It depends a bit on whom you are talking to. As I said, in this coalition agreement, at the very top level, they say we want to do open source. And then it goes down. Then there are people who probably got the order: we have to order open source stuff. What is that? And then they probably make a tender. Sometimes you have a completely new thing that should be built; then they probably put the tag "open source" or "it has to be open source" on it. And if they just go for any software, like a video chat solution, then they should be open. And then it depends on the person, whether they know what open source is and whether they know it the right way. |
EU alternative to app stores |
All right, I have the pleasure of introducing our next speakers, Marcel Kolaja and Hans-Christoph Steiner. Thank you. Right, so we have 25 minutes in front of us to speak about EU alternatives to app stores. What is this about? I will lead you through a tool called a pilot project, one that I have initiated as a member of the European Parliament, and through the potential opportunities that I see here for the free software and open source community. First, a little introduction of myself. As I said, I'm a member and also a Quaestor of the European Parliament. I'm a member of the Bureau of the European Parliament, and I focus my work on policies, which I will show you on the next slide, in the Committee on the Internal Market and Consumer Protection and also in Culture and Education. I'm also working in the committee investigating the use of Pegasus and equivalent surveillance spyware. I'm sure you've heard about that scandal. And I am the first Vice President of the Czech Pirate Party. So what is my focus in policy making? The top two, I believe, are interlinked: free and open source technologies, and freedom on the internet, because I think that one cannot exist without the other. Then I also work on transparency and independence of media. Now what are we going to speak about? We're going to speak about the fact that the EU has developed, with public money of course, some applications that you can find, for instance, in Google Play. This is the list from Google Play. I'm not sure you can really read it, but you can see a couple of applications there that may help you understand your passenger rights, or you can see the EU Charter of Fundamental Rights. You can count majorities in the Council, and all sorts of other things that have been, again, developed with public money. Now what's the problem? The problem is that, first of all, I believe we should adhere to the principle "public money, public code".
What is developed with public money should also have a public result. I believe that EU institutions should lead by example, and therefore I believe it's not sufficient that public money invested into software results in public code; the applications also need to be accessible to the citizens. The current situation is that the applications developed with public money depend on proprietary app stores, pretty much only two: one for the Android platform and one for iOS on Apple. That creates a high barrier for smaller providers to enter the market, and I also believe it means less choice for consumers, because if you, for instance, have an Android phone but you don't want to have an account with Google, then you basically cannot access these applications at all. Let's keep that in mind, and now I will explain to you what a pilot project is. In EU terminology, pilot projects, together with preparatory actions, which are a follow-up to pilot projects, are tools introduced in the EU budget that aim at testing new policy initiatives and preparing the ground for the adoption of future measures. Such pilot projects and preparatory actions give members of the European Parliament the possibility to initiate innovative policies and fund them in advance of a legal basis being set. What does that mean? It means that every member of the European Parliament has the opportunity to propose a project for which they ask for funding from the EU budget. If it gets approved as part of the EU budget, which means that the Commission needs to be on board, the Parliament needs to be on board and the Council needs to be on board, then the project is started, and it's up to the Commission to deliver on the project as it was defined. The project can live for a maximum of two consecutive financial years, and then it can be extended into a so-called preparatory action. Now that sounds great. What can we do with that?
So what I have done is that I have tabled a pilot project called "De-monopolised access to EU applications". In the past I have also tabled another pilot project that is currently in the making, so results are very much possible and should be expected. Here is what the pilot project says: modern smartphone, tablet and desktop environments have established marketplaces, such as Google Play or the Apple App Store, for the installation and maintenance of apps. These marketplaces offer convenient and curated apps that come at the cost of a high barrier to entry to the market for smaller providers and less choice for consumers. Now the thing is that the European Union has already realized that reliance on large big tech corporations is a bad path forward, and that's why the Commission proposed the Digital Markets Act. I worked on the Digital Markets Act as a shadow rapporteur, and one of its provisions is that the so-called gatekeepers, a scope into which Google and Apple definitely fall, have to enable on their phone operating systems the installation of applications from sources other than the official app store, which is not the case on iOS at this moment. That provision basically enables using other repositories, which don't have to be stores but can be package repositories, if you will, to install applications on the phone and use them from there, and that decreases the dependency on big tech. But then again, as I said, I think the EU institutions should lead by example, and therefore this pilot project aims for the European Union to use other repositories, such as F-Droid, and put the applications there, so that users can install them from these, let's say, alternative app stores, to use the EU terminology.
Now the objective, from my perspective, definitely is that the EU institutions should release their applications on various repositories that aim at promoting applications released under free and open source software licenses. And of course, releasing the source code then enables people to build, study and improve the applications, so basically to use the rights they are entitled to according to free software rules, the four freedoms. So these are potential opportunities for the free software community, because who else but the free software community can help with releasing these applications into the free software domain? I'd like to remind you that the implementation is done by the European Commission, not the European Parliament. The Commission's website for funding and tenders is over here. I believe that the Commission is of course not going to do everything by itself, so it is going to open tenders to deliver on this pilot project. Here I need to say that there are some eligibility criteria and admissibility requirements, such as, and I quote from the Commission's rules, that tenderers must have the necessary technical, professional, economic and financial capacity to execute the contracts. In other words, not every entity can actually be a tenderer. But what is important is that joint tenders and subcontracting are also allowed, which I believe can enable the free software community to take part. That's all from my side. You can follow me on social networks, obligatory slide, but now I give the floor to Hans-Christoph, who can speak more from the developer's perspective.
So I guess I can start with my personal experience of this, as someone who has been working in free software for almost 30 years. The rise of the Pirate Party movement was something I felt very close to, but where I live it's not so much a thing. I work on F-Droid, and then, out of the blue, someone comes to our chat room and says: oh, did you know there's a pilot project in the EU budget that mentions F-Droid? For me it was like, okay, whoa, this is interesting. We looked, and suddenly it was a blast from the past: this is what I want to see happen, but it seemed impossible, and then here it is. This is political change coming through free software and the Pirate Party, so I just want to thank you for that. My pleasure. So now to F-Droid. We're talking about free software in the mobile sphere, and I think we can say that the F-Droid community is one of the biggest forces pushing free software, certainly in the Android ecosystem, but maybe even beyond. We have gotten there because we're a volunteer-led organization where people come because they believe in free software and they want all their computers, including their phones, to run only free software. There's a lot of kind of thankless work. Someone wrote their app and they say, I think it's free software, and then they submit it to us and we review it. And you have to say: well, actually, the Google tools don't make it that easy to publish free software. They make every step easy, like, oh, look at this lovely library, just pull it in, so easy, put this line in, and then you're tied into the proprietary ecosystem. So we're very often the people who have to say: I'm sorry, I know you tried, but this unfortunately includes proprietary software.
So we started out with much more small-scale volunteer apps. Now, after 12 years of doing this, going on 13, we have well-known names in free software like Nextcloud coming to us and relying on us to do this kind of review. And then there are other organizations that want to go through the process and say: okay, we believe that publishing free software will make our users trust us more. So we've had companies like Tutanota and Proton VPN go through this process. Yeah, it's kind of a thankless tax, but people are now willing to pay it because of the four freedoms, and because of a question which I think doesn't come up that much in free software: the question of trust. We now see that it's very easy to make software that does things, not transparently, that we really don't like. We can talk about tracking, we can talk about software that's designed to be addictive, and these kinds of things. When we have free software, we have the possibility of actually having anyone we trust review it and say: this has been checked out, this software is not doing anything that I would object to. We take the user's point of view and ask: is this something that I want to use? And if not, we want to flag it and make sure that others who use it are also aware of these things. In F-Droid terms, these are anti-features. Now, to me this is a very exciting opportunity. I live in the EU, I'm a citizen of a country in the EU, this is my government, and I would like to provide these services for it as well. And I'm a strong believer, of course, in public money, public code. So yeah, I feel very fortunate to be in this position. And this is purely my own feeling, but I think a lot of people get involved in free software because they think that other avenues, like government, don't really work, and I was kind
of like that, I guess. As an activist, it gave me an opportunity to do something alone, starting out by myself: this is wrong, but I can do something about it just by myself. But now, after doing this stuff for a long time, I see that government can play a role, and when it does, the impact you can have is huge. These are complicated processes, even just to learn about this EU pilot project, but I hope that this really opens the door to lots of other projects saying: okay, this is our government, let's get involved, let's see what we can do with it. Well, what else can I say? I think the last piece of this is that, to me, it is a learning experience. How are we going to engage with the European Commission? Are we going to just advise them in the process? Are we going to do a public tender? I'm very interested to see how it goes. I'll also try to publish a blog post or something about what I can, so others can learn from this experience. And then a simple ask of everyone: please look at these apps that our government is producing. Since we're doing a pilot project, which ones are the most important? I would love to have feedback on that. I think with that I'll say thank you, and I think we can take some questions. Okay, we also have some time for questions, so could you raise your hands, then I'll collect them and we start. We have one over there. Thank you for the presentation. I have a question about payments in the stores. As we know, currently Apple and Google don't provide alternatives for payments as we have them on the web, because on the web we have a lot of providers, a lot of alternatives, a lot of different ways to make payments, to roll back payments and so on. There are basically no, quote unquote, regulations. But given the title of the talk, I would also like to raise a question not about open
stores, but: what if the EU applications would actually require some payment? Let's say we would like to pay a fine for some unlawful behavior, or pay our debts to the government. What are you going to do then? If that happens, maybe you then want to go down the payment route. Maybe F-Droid would be able to, well, not exactly only F-Droid, but maybe alternative stores should also be able to include a payment ecosystem or something like that. And that doesn't only concern payments for EU apps: if that were possible, maybe other apps could use alternative stores and also have payments which are not subject to the oligopoly or monopoly that we currently have. From my perspective there is neither a technical nor a business reason why other stores could not be connected to payment systems and provide this functionality as well. But maybe Hans-Christoph wants to add something from a technical perspective. I don't see a reason why this shouldn't be possible. Yeah, there's definitely no technical reason; this should be totally possible. But at F-Droid we've actually decided that we're trying to stay out of that business, because it's a very different business from free software. Okay, I would also like to add that this is relevant to the Digital Markets Act that I mentioned, because one of the reasons why this provision about app stores is there is that there were past cases, basically antitrust cases under competition law, where companies were complaining that Apple is abusing its position on the market to charge them a 30% fee, like Spotify versus Apple, for instance. So this is definitely a very valid area where something needs to be done. Yeah, so do I understand correctly that you're basically... oh, I'm practically screaming into the microphone. Well, try again. So do I understand correctly that the EU is embracing F-Droid in this
case? The EU is, like, embracing the use of F-Droid, or totally releasing F-Droid, or... No, no, no. Yes, the question is whether the EU is embracing F-Droid. So no, F-Droid is basically only an example in order to understand what is behind the idea of the pilot project. Nevertheless, that's just an idea, and it should be independent of anyone, basically. So it would be up to the Commission to assess which direction they want to go, but apparently, I mean, we cannot avoid using the major free software repository for Android applications as an example. I can add real quick, maybe for the stream... Okay, all right. What I think is relevant here is that F-Droid is all the free software needed to create an app store. So, you know, by decentralizing this, maybe the EU is embracing free software in the sense that anyone can create their own app store. But yes, it's not an official EU app store. Right, very quickly, I was told. So you said the goal of the pilot is kind of to test the waters for new policy, new laws. Does that mean that your ultimate goal would be that something like this would be made mandatory, for the European Union to publish its apps in F-Droid or other open app repositories? Well, I don't think that at this moment we can put it like that. The pilot project as a tool is not necessarily for legislation; it's also for, let's say, what the institutions should do. So basically it's very clear that putting the applications not only on Google Play and the Apple App Store but also elsewhere will also cost some operational money. That's why we have this tool to actually allocate money in the budget and give it a try. Then at the end of the pilot project it will be evaluated how successful it was, and then the Commission can of course make a permanent allocation in the budget to run it. There's not necessarily a need for legislation at the end of the pilot project. How can people try to make the pilot
project succeed? How can they help make the project succeed? Right, that's a very good question, and that's basically the whole purpose of this talk. So I would recommend following the Commission's website, there was a link, and seeing what the Commission is going to open as a tender. Then basically anyone in the free software community can contribute, as Hans-Christoph said, either by running jointly in the tender, for instance, or advising the Commission in some side capacity, or being a subcontractor of the main tenderer. So that, I think, would be the main goal. And I can add the volunteer aspect: the list of apps is public, and you can go get these apps and try to review them yourself. You can reach out to us; we're happy to help people get started on what it takes to review apps. So there are easy small steps you can take as well. |
AI Discussion |
It was a discussion that we originally scheduled for an hour, but due to scheduling issues, it got collapsed to 25 minutes, which we are going to stretch to an amazing 28. But everybody has to be in this with us together. The topic is one that everybody has a lot of thoughts on. We're in the beginnings of figuring out how we feel about it as a community and as a movement. And so I want to do something a little bit different for this session. So raise your hand if you think you have like a question, a comment, or a topic that you wish would be addressed during this session. So like, okay, good, five people. I want everyone here raise their hand to come up. Or if you think that by the time those five questions get asked, you will have a comment or question, come up on this side, line up in a line here, because if we take the mic around, we will have no time. Who thinks that they might have something to say about any of those comments or topics? And just has a lot of thoughts and maybe doesn't even know what, you could be some of those people who had comments or questions. If you think you probably want to just say something in reaction to that, come a line up over here. And what we'll do is we'll go through the questions comments. Those people can come over here. We'll have two sections of the line, people who have not spoken yet, and people who have but want to say more. And we're going to speak as briefly and concisely as we can, and it's going to be awesome. All right. So I want the lines to actually come just a tiny bit closer so we can be efficient. Yeah, so is this a line? This is the line of people who want to talk? No, we need people who want to participate in the discussion. You're here because you want to talk. Okay, Van wants to talk. Van wants to talk, so over here, but no, wait, no, no, no, wait. This is not, so hold on. People who want to react to the topics over here, if you're not sure, just line up. It'll be fun. 
You don't necessarily have to answer any particular question. You can let somebody else come forward. Okay, great. And so we're going to start with the people who have the topics they want to talk about. And then we're going to let people come over here. And so it can be a comment, a question, a topic you want to talk about. And people can go back and forth. And I suspect people will come up as you want to join the discussion. It's basically a self-forming panel, okay? We're going to prioritize people who haven't spoken. And we'll see what happens. Hi, thank you, Karen. My name is Alex. And actually I have a question related to something that is on the board already. There has been a lot of conversation on the subject of AI trained on code. And what I notice is that the majority of those conversations are very U.S.-centric and are mostly around the terrain of fair use, which is kind of, I guess, a philosophical thing in a way. But in the EU, there seem to be some regulations relevant to this subject, but from the previous hype cycle, let's say, or the previous cycle of technologies that gave us a lot of interesting things. That being data mining and web search results, information retrieval. So there have been some laws and some academic papers published on this subject. And they are kind of EU-focused. And they almost never get mentioned in the discussions online. I was wondering why that is, and could it be productive to see it from another angle then? Things like that. I personally love that. And I love that you kicked this off with that question because you've just brought it to the conversation. Does anyone on this side want to answer that? So the nice thing about the EU and a lot of other places is that they've basically made a lot of the things that you're fighting about in the U.S. already de facto legal. And so that is why, for example, with the LAION database, however you pronounce it, a lot of that model building is happening in Germany.
Or that's the reason why in India there's a lot of scientific literature that is being created and put into models. And then they export those models. Basically, the EU, in my opinion, is ahead of the U.S. in this area. And what's happening is that in the U.S., this is still an open question and something that could be sort of de facto made expensive or hard. Awesome. So we're going to do it like that. Each answer will be brief. I'm going to continue with people down here. When they haven't, I'm going to go to the people over here. There are several people in the audience that I know already have opinions on this question. So come on up. Okay. Did you want to answer that or no? Okay. Anybody over here want to address that point? Or we'll move on to the next one. The University of Cambridge has a group called the Cambridge University Ethics in Mathematics Society, which runs conferences occasionally. It's trying to do two things. It's trying to make ethics training a mandatory part of mathematics teaching, just as it is in law and engineering and related fields, so that mathematicians who go into AI, for example, have some idea of the ethical implications of that work. At one of these conferences, someone who'd been part of a UK government review of AI implications came up with a list of things that reminded me of the four freedoms, but for explainable AI. I'd like to ask, what is the closest thing we have to the four freedoms in the context of AI? And also, does anyone else know of other initiatives to give mathematicians ethics training? Also an excellent question. Do you want to... And other people who want to participate in the discussion, you're here for a discussion, so please come on down. And Bea, I'm going to look out for people trying to get out. I've recently done quite a big ethics-in-AI project. And the first problem I ran into is how you define ethics and where you ground it.
And in this project, we went through the fundamental rights, but that creates a new problem because there are several definitions of fundamental rights, and then you have to choose one. And we were lucky that there was a model we were supposed to use, one adhered to by the Dutch government, and that had a list of fundamental rights. So that would be my answer. Start by looking at fundamental rights and make your own choice there. I want to make a general comment about ethics and regulation. Any ethics, any regulation, any restriction we put in place, we put on us, the good guys. It gives the bad guys the monopoly to do the unethical things. Keep that in mind. I don't really agree with that one, because, for example, a government can ask for an ethical assessment of some system, and then the good guys can tell the bad guys, well, you've been very naughty. No, generally bad guys don't listen to their governments, that's what good guys do. Bad guys ignore the laws. So one comment about the focus on ethics is that it is being approached from a very different perspective than the typical four freedoms or the OSD. The OSD and the four freedoms start with freedom zero, the ability to run the program anytime for any purpose. That is the thing that is being explicitly denied by a lot of the ethical efforts around AI, whether or not that will be successful. It's coming much more from the ethical licensing side, where they're trying to restrict it. You can't use this if you're doing climate things or if you're going to make someone be discriminated against or have all these societal effects. I think a more free-software-aligned one would start with: you can use the AI for whatever purpose you find. I don't know if that's what we'd want to say, but that's what I would say is most aligned with freedom zero, and I'm not seeing it out there. Anybody over here want to comment on this? That slightly answered the question.
Nobody's touched on the question I asked about ethics training specifically for mathematicians, who are often recruited by AI companies. Does anyone know of other efforts to do that? Anybody in the audience? I can ask. I am against training mathematicians about ethics. The response that is not on the microphone is that the audience member asserts that mathematicians don't need and should not get ethics training. They should only get training in mathematics. Can I answer this one? I think that it's not mathematicians that decide that they are going to be hired, but the companies. I can come back on both those points. It's a discussion, so I'm trying to decide if I'm going to weigh in. To the person who said mathematicians don't need ethics training, you should look at the resources compiled by the Cambridge University Ethics in Mathematics Society because they answer that point in great depth. Essentially, I think that you are mistaken. As for the question of the mathematicians being hired, mathematicians are human beings. They have agency, they are not just passive robots who have to work for companies doing evil things with AI. You have a choice about what you do in the world. It's the same as a programmer, right? At the end of the day, you are the one building it, so you are the one that can say no. Exactly. At the end of the day, they are going to blame you because you fixed Volkswagen's engines to cheat, and they're not going to take the blame. If you do want to participate, you have to come up. We can't do the shouting. So, one other thing about ethics is that the entire discussion on ethics is actually being used by companies to, well, do bad things such as, let's say, not releasing the GPT model. OpenAI actually used the argument that it will be used for unethical purposes to close down the model. I think this is a really bad approach. We should have a framework where that is not allowed.
And ethics should not be a reason to create closed models or be secret about them. Yeah, that's good. It's a good segue because that's exactly the question I had. Companies like OpenAI, which, by the way, were really tricky with their name. A lot of people think they're open and they're not. They did use some questionable practices to train their models and underpaid people in third-world countries for some very bizarre content. And probably that's what makes the models really good, actually: all this data they collected that they're secret about and all these practices they used. So my question is, if anybody knows, how can we from the open source community compete against that in a good way and get as powerful a comparable model, just like Stable Diffusion coming out? We can replicate their papers, but not all this data and these practices, which may be what takes what OpenAI is releasing to the next level. Okay, it feels like you're stuck in the middle here, but you still have something to say. Just get up right now and everyone will let you out. Just come to the front. Does anyone here who has not spoken want to respond to this? Anyone who's not spoken? On that point, in Bradley's talk, you mentioned that free software had early wins because of the free, as in free beer, part of it. And I think one of the things that's interesting about your question is, what can we do in the free software world? One of the big barriers, I think, is that my understanding, correct me if I'm wrong, is that the models behind GPT represent maybe a decade of time and billions of dollars of investment. This is a challenge, I think, for us in the community to compete against. How do we compete? Regarding AI, AI is about recognizing patterns, and proprietary AI companies are censoring models like ChatGPT from actually giving the right answer about patterns recognized in the data. And how do we compete with such closed models?
You know, there is a website, the copy of Twitter, it's called gab.com. And there is a CEO of this site, Andrew Torba. And recently he announced that he will create an AI model based in Christian values about openness and freedom of speech where the model will be trained to recognize patterns in data without the censorship that others apply to this data. I think one area where open source really can get the edge over closed source initiatives is in explainability. It's hard to explain a model, it's hard to understand the algorithm, how it works out in ethics, but you can put a layer to it that makes it explainable, that makes it understandable why the model is acting in a certain way. And that's a level, that's an area I think where we can really get the edge as open source developers. There are a few people who haven't spoken yet. Hello. Regarding the problem of closeness of open AI, there is actually one big problem. These kind of models are really powerful, it's like an atomic bomb, it's not like a gun. So if a company, it's accountable for the output of those models, they will do everything in their power to prevent swing and prevent people from asking how to dissolve the body to the AI or to kill somebody. So they have to do this, they are not doing that because they are evil. So if you go through the stable diffusion way, you end up with unstable diffusion. I think everybody here knows what unstable diffusion is, just look it up. So it's a really edgy situation in which we discover this kind of powerful weapon, but we are not ready to handle it and if those kind of companies handle it, they will try to censor it and reduce the scope in order not to get closed by the government or by something else. So I don't know what the solution is but we have to go through the open source process of training them but we don't have the resources to do that. They spent like $10 million to train GPT-3 and I don't know if the open source community can pull up the same thing. 
They are trying to do that with a couple of other models, but they are nowhere near there yet in accuracy and functionality. I think we will end up there, though. I don't know how. That's my question. There are a lot of question marks around this entire discussion, which is one of the reasons we wanted to have it as a group discussion. Because this is a short session, we are just going to touch on the topics, but we have a mailing list discussion already started and I think we should engage in a really deep conversation there following up on some of this conversation. So let's see how far we can get now. Did somebody want to respond to that? Anybody who hasn't spoken yet? Well, I don't know that the open source community doesn't have resources. Actually, they do, and even now there is this big BLOOM model which tries to replicate GPT, and they already made a system where each participant can just plug in their GPU and network and it participates in this giant resource cloud, so to speak. So we might get it. In the end, I think what Stable Diffusion shows is that AI openness always wins. Nobody cares about DALL-E anymore, it's just a random project, literally no one cares anymore. And all it took was one open source project: Stable Diffusion just appeared and everyone started training. Every single person started using it. That's how, little by little, it became great. It's way smaller, way, way smaller. The model is way, way smaller, like a hundred times smaller. Well, that's amazing. That's really amazing, yes. Yes, the Stable Diffusion models are like 12 gigabytes. I have it on my laptop. It's really nice. Anybody else want to add anything to this particular conversation? I have a question, but at the same time I will go into another topic. So we're talking about pictures and Stable Diffusion, but then I want to ask, okay, then they also trained Copilot. So they used code which they were not allowed to use, or did not pay for, even if it's open source.
So why didn't licensing work with that? For example, I heard that the EU declared open source a public good. So should the EU protect a public good? Copilot, anyone want to? It was going to be my question. I guess I can add my thoughts to it and perhaps spark some discussion. But yeah, it is a very great question. Should companies be allowed to use code that has been licensed under licenses like the GPL or AGPL or other licenses that require people to disclose the source code? Should that code be used in training these datasets and, furthermore, is the resulting output from the models considered a covered work? And perhaps this goes a little bit back to the fair use question that was raised earlier. But I think companies do have a duty to address this concern and perhaps start putting in place a framework that would allow people to opt in to having their work used for training these systems. Yeah, and I also had a second part to that question, which was: if the EU declared open source a public good, should the license be enforced from the EU side? Because if somebody uses open source not according to the core principle, then maybe the EU should step in? I think that question is... I think in legal terms what is being asked here is, should the government take the place of the licensor? If the licensor isn't enforcing, suppose you upload AGPL code to GitHub and Microsoft scrapes that into Copilot and reuses it, and you don't have the resources or the will or whatever to try to prosecute Microsoft, then should the EU or member states step in and prosecute on your behalf? Yeah, I think we're just beginning to see how this is going to shake out. There were several lawsuits filed around Copilot and I believe that there are more coming. So it'll be really interesting to see it shake out.
And what's interesting about the suits that have been filed already is that there are quite a number of legal theories that have been thrown out there. And I think actually the core licensing argument hasn't been made yet. While at the same time there's litigation happening about other data sets that have other freely licensed work. So it'll be really interesting to see what those enforcement mechanisms are. So I'm curious if you think this wording is appropriate. If an AI, not just co-pilot, takes in source code, free software, creates a model and then suggests a code snippet, would you consider that license washing? Yeah, I think it depends on the amount and it's a little bit like a fair use. If I read a book and take inspiration from the book and I end up writing five words in a sequence that are the exact same on my protected work, is that fair use? So I think it depends. But I think it depends on certain metrics such as a frequency, amount of work and likeness, things like that. So that relates to the next question I was going to ask, which is, can an AI, there is talk of AI becoming legal persons, can an AI perform sweat of the brow, could a work be the copyright of the AI itself? And I would ask the follow-on question which is sort of related to that, which we got in an email, which is like, should we have an ethical obligation to identify AI in conversation, so people know they are interacting with AI? So I wanted to answer the previous question. And yes, I think you should really step in, should really try to enforce these licenses. But the important note there is that these licenses should work towards opening the models, not restricting on what the models trained on, which is the spirit of GPL. It's not to restrict someone from accessing this code, but rather that the derivative work or derivative model is also open source. And that was the spirit of GPL. And I think it's still the same applies to any AI training. 
Even if it's, like you say, a small inspiration, we should make sure that this small inspiration still results in a more open world in the end. To follow on that question, I had a question. What would you consider an open source equivalent in terms of AI? Is it just that the model is open source, or that the output is open source? Or do you also want the data set it was trained on to be open source? And even then, how can you reproduce the different steps? Because for many people, I'm a physicist myself, and my colleagues see AI as a black box. Things happen, and the whole power of open source, I think, is that we can actually go and look and understand the algorithms in the code. And currently with AI, we have no idea. That is the perfect place for our time to be up, because there are so many questions. Let's have this discussion at lists.copyleft.org/mailman/listinfo/ai-assist. You have all been great sports. Thank you everybody for coming up. Thanks. |
The coming EU Standard-Essential Patents regulation |
Last year the Commission announced they're going to work on a new package of patent regulations. We don't yet have the text. There was a call for input and people have responded to it, so we have had some chance to give input to the Commission already, but that was at a stage where we couldn't see what is actually being proposed. We still have not seen it; it will be published in April of this year, but at this stage we are already planning because we know this is going to be a big topic. So what you're going to see right now is the early-stage planning of what a campaign that could help free software deal with software patents could look like in the coming months. And to start, it's going to be my colleague here, Panos, from End Software Patents. Yeah, so hi everyone, I am Panos Alevropoulos. I have been working on the End Software Patents campaign for a while, for about two years. Just to give an overview of the presentation today, I'm going to talk in general about what the deal is with SEPs and FRAND, give a rough overview of case law, and then Ciarán will present free software's toughest problem and five fixes we can think about. So what's the deal with SEPs and FRAND, two terms that are probably unknown to many? SEP stands for standard-essential patent and FRAND stands for fair, reasonable and non-discriminatory. Just to illustrate what's going on here: as we all know, in technology standards are very important, so if we want devices to work with each other and we want them to be compatible, we need to use a standard, right? The rules for these standards are usually set by standard-setting organizations. What happens is that for a given standard there are usually patents that are very important for the standard to work. Those patents are called standard-essential patents, and anyone who wants to participate in the standard needs to use the patent.
That person is called an implementer, and they have to get a license from the SEP holder. Now of course, as we all know, a patent is very restrictive to everyone. So here's what standard-setting organizations do: they require SEP holders to license their patents, their SEPs, under FRAND licenses. Now it sounds great, but there are issues. As we said, FRAND stands for fair, reasonable and non-discriminatory, but that's a very, very, very vague term. First of all, it can affect everyone that's working with standards, and that includes network protocols, the Internet of Things, audio-video codecs, the automotive industry and so on. And the problem is that when the implementers have to use the patents, they have to come to agreements with the SEP holders, and the SEP holders try to charge as much money as possible, and most likely the implementers are not very satisfied with that. So the real problem is that there's a potential abuse of dominant position by the SEP holders, and the high transaction costs for implementers are simply unsustainable for SMEs, for small and medium enterprises. And that in general adds to the legal uncertainty regarding software patents. So, to be more specific on the way this has been reflected in the courts: when an implementer comes into a disagreement with a patent holder, litigation happens. So they go to the courts, and what do the courts say? Well, they try to help as much as possible. I have a layout here of the court decisions that have been relevant and have basically been the landmark decisions in Europe.
First of all, it all started with the German Federal Supreme Court, and it started with a very strict and conservative stance, a very pro-SEP-holder stance, because it basically had a very strict test for a FRAND defense. The FRAND defense is basically when the implementer says: yes, you are the SEP holder, but the money that you charge for that is way too high for me to actually participate in the market, and that's an abuse of your dominant position. And the German Federal Supreme Court was more in favor of the patent holders, and that was seen in 2009. But then what happened is that we had a very important decision from the European Court of Justice that concerned competition law, and it took a more balanced approach. And then in 2020 there was another decision; actually, there was no decision. There was a German lower court which asked for more clarifications from the European Court of Justice and, unfortunately for everyone, the case was settled, so there was no decision. It was settled out of court, so the court had no chance to answer those questions. To be more specific on what the German Federal Supreme Court said back in 2009, it had a very strict test and it said the alleged infringer can rely on a competition law defense only if it does the following two things: only if it unconditionally offers to enter into a license agreement with the SEP holder and, more importantly, if it behaves as if it were an actual licensee, which would mean that it would have to act as if there's already a license and pay royalties. And it was up to the defendant to prove that the conditions above are fulfilled.
What happened, though, is that in 2014 there were two important Commission decisions about antitrust issues with Motorola and Samsung, and the European Commission stated that a patent holder abuses its dominant position when, having given a FRAND commitment over an SEP to a standard-setting organization, it seeks an injunction against a willing licensee, and that a willing licensee remains free to challenge the validity of the patent or its infringement. That was very important because the German courts didn't recognize that. They didn't expect the defendant to give up some fundamental rights like this. So in about 2015 the European Court of Justice responded to a lower German court, because the German courts recognized that, okay, there's a difference between what the German Federal Supreme Court says and what the European Commission is doing. So the European Court of Justice came in and clarified what's going on, and it said there's a very broad test here but it's helpful enough for national courts to expand further. The SEP holder must alert the implementer in writing of the infringement complained of, by noting the relevant SEP and how it's alleged to be infringed. The implementer must express a willingness to conclude a licensing agreement on FRAND terms. The SEP holder must provide a specific written offer for a license on FRAND terms. The implementer must diligently respond to that offer in accordance with recognized commercial practices and in good faith. And if the implementer does not accept the offer made to it, a counter-offer that corresponds to FRAND terms should be made. So based on that language you can probably figure out that the courts didn't really solve any problems. What they basically did is help the companies negotiate better on what FRAND really is.
And you might have guessed from the name of the campaign, End Software Patents, what we're trying to do, but it's important to follow decisions like this because it's important to understand how patents work and how FRAND can actually affect standards. And finally, just for your information, there are more cases that should be studied, but we don't have the time to present them here. There is the United Kingdom Supreme Court's Unwired Planet v Huawei in 2020, which concerns the UK. And there are two more decisions from the German Federal Supreme Court in 2020 that are also of great importance. So with that, I conclude my part of the presentation and I give the floor to Ciarán. So the first time I came to Brussels was in 2002, and it was to come to FOSDEM, and at that time Richard Stallman gave the keynote address in the Janson theatre, and it inspired me to work on free software policy topics as a priority. A few months later I got interested in software patents in particular, and a lot of people looked at the topic and saw it was a bit daunting, and after years people said to me, you've been working on this for 10 years and there's no progress, how do you keep your motivation? And then they said after 15 years, how do you keep your motivation? And finally I stuck with it, and now I can make the joke that after 20 years all of the software patents that I was worried about have now become invalid. But of course there is now a new generation of software patents. When I said I was giving this talk I was asked, oh, can you talk about the Unified Patent Court? No, I can't. Very important, but there's no time, sorry. Can you talk about software patents coming into India via the AI law? No, I'm sorry, I can't. Can you talk about the idea of AI writing software patents? Because I think patent applications are such nonsense, I think AI is the perfect thing to write the next generation of them. But no, I can't. Can you talk about the Cyber Resilience Act?
This is going to be massive, but no, it's got nothing to do with patents. So audio-video patents are the worst-case scenario for software, because first off you have to form a standard. You then have to convince all of the manufacturers of hardware that this is worth putting into their hardware, because you cannot do audio-video with just software. Audio-video software used to be optimized for decoding, because encoding would happen on some mainframe somewhere and you'd download the video and you'd watch it. But now with video chat you have to do real-time encoding, then transmission, so you need a small file size, and then decoding, and it should all be almost real-time. You convince these hardware manufacturers to put it onto their hardware and then you have to wait five years, because you have to wait for everyone, the whole world of computer users, to throw out their current hardware and buy a new generation of hardware. So after five years the standard will finally be hardware-accelerated on everyone's computer and then you can really use it. If at that point somebody comes along with a patent and says, actually, I own a bit of that software, there is very little you can do. You can no longer innovate around the problem. So software patents on standards are the worst type of patents for software, and patents on audio-video standards are the worst of the worst. This is happening already. We have an audio format, Opus, which is not very famous; however, it is a successor to the Ogg audio format. It is used in a lot of hardware devices. There is currently a patent pool being formed to threaten the Opus audio format. We have the AV1 video format, which is being developed to be freely implementable, and already there has been a request sent to the European Commission to investigate the people working on this for anti-competitive practices, because trying to help each other avoid patent problems might be a bad thing for some companies.
Before I get into it, I saw yesterday there was a paper published by Andrew Katz and four other researchers. I didn't get their names, but it is on the HEVC standard in software, so I expect that is very interesting, but I will not be talking about it. There are a lot of topics here. There was a Commission call for input last year. There were responses from OpenForum Europe, who I currently work for, and from the Open Source Initiative (Simon). However, these comments were sent without actually knowing what text is going to be presented, so they were kind of general comments on what patent policy could be like. We still haven't seen the text that is going to be presented by the European Commission, but as I said, this is early-stage planning for a campaign. We have to start thinking already about what possibilities we have to fix this and what different forms it can take, because when we are talking about policy makers, the way it works is you have to give them a list of different fixes that you are looking for, and then they will say which ones are politically possible, and then you have to start trying to maximize each of the ones that are politically possible. So I found five starting ideas, but this is a discussion document, basically. We can discuss them further in the coming weeks. The first idea would be a carve-out specific to free software. A few years ago this might have sounded a bit overly optimistic, but now in the Cyber Resilience Act and the Product Liability Directive there is this text, and when we first saw it we were amazed to see that free and open source software should not be covered. Excellent. Of course, once you start reading it, you realize, well, there is an objective limitation here. You can use this exemption if you are trying to do innovation or research.
There is a commercial activity limitation and we all know that commercial activity has never been defined and is an endless source of un-clarity for what exactly that means for software and online activities and it should be exempted. So a judge will have to interpret whether or not you are allowed to use this exemption which basically means that nobody can rely, nobody can confidently make use of this exemption so it's not very useful. However the fact that this is in a document from the Commission in two proposals for regulations tells us that inside the Commission something along these lines is acceptable. They didn't get the wording quite right this time but they are not completely opposed to this kind of thing. So this is a starting point, maybe we can do something along these lines. The second possibility would be a similar end result but from a different starting point and would be to try and carve industry into two sections. For example we might find that small and medium business in the manufacturing and pharmaceutical areas might be perfectly happy with having more patents and might want help from the Commission to apply for patents and use their patents. We happen to not like that so maybe we should have a different regime for them and one for us. There is already as part of this patent package there is a regulation that is specific to pharmaceutical patents so the Commission is not opposed to singling out sector by sector. Our argument can be that our sector is different because we have collaborative models and we have market solutions. If you look at the patent clauses of the GPL and the Apache 2.0 license we already have market solutions so maybe we can tell the European Commission don't interfere with what we are already fixing. A third fix and this is still quite early stage is the idea of competition law. I have always believed there is not enough discussion of how competition law is weakened by the existence of patents. 
The European Commission loves to work on patent law and it is a big topic in the European Commission, but it just seems there is not enough dialogue between people who are enforcing competition law and people who are working on patent policy. I think maybe if we could find a way to make it clearer to the European Commission, and the European institutions in general, how these two topics interact, maybe we could help make them understand what we are trying to achieve. The fourth one; there are only five, I'm not going to go on all night. The fourth one is something that will definitely be discussed because the Commission has mentioned this in their documents, so: FRAND. At times we think the idea of non-discriminatory must help us, because we don't want regimes that discriminate against free software. We need patent regimes that allow for royalty-free use of patents. However, the current standard meaning of non-discriminatory is that you don't discriminate between individuals. If you have a certain system for working out how much in royalties you are going to ask for, you have to apply the same system to each potential licensee. But maybe we can expand this; maybe we can work with this term and find a way to talk about non-discriminatory so that it prevents the patent regime from discriminating against free software, which cannot pay royalty fees because we can't count the copies. If that doesn't work, then of course there is the word fair. It's an even more vague word, but maybe we can do something with this word to talk about how fair should apply to free software. There is also, as part of this patent package, a regulation coming on compulsory licensing, so maybe that angle can also be used to ask for something that works with free software. That would have to be a royalty-free, sub-licensable patent license with no domain limit. And the last fix, being presented for the first time.
So the whole problem with software patents in Europe has always been that we have a patent law that says software is not patentable as such and the policy makers when we complained we don't want software patents they said but it's not patentable look software is not patentable as such. The problem was that of course the patent applications were always written as patent on a limited resource device which interacts with this and there's always a way to make software sound like it's more than just software as such. What if we ask them to give an example? What if we ask them to put this into a patent law on standard essential patents? The software which reads, transmits or puts data into a data format standard is an example of software as such. If we could do this and this is a crazy idea maybe we could find a way to get software as such to be defined in a concrete way and we could actually start using the idea that software as such is not patentable. Now when we used to campaign against software patents the commission because the software is not patentable as such language was there they said don't worry there's not going to be American style software patents it's not going to happen. I would like to try and find out the documents they used to use and I want to find any statements and things they ever said that explained what their motivation was. If they were saying that their intention was that data formats for example or compatibility interoperability which is a massive buzz word now in the European institutions if these things were not meant to be hampered by software patents then maybe we can say well you know please clarify this. Your regulation says you want to reduce the un-clarity give us some clarity on what exactly software as such is and maybe we can put this into law. Those are my five fixes. 
I don't have a favourite one. We don't know which one is going to be acceptable, we don't even know what text we have to insert any of these into, but this is something that we will be continuing to hold a dialogue on in the coming weeks and months, and I hope this topic gets lots of attention and lots of you get involved. With that, we do have a tool for coordinating work on this, and Panos will brief you. I have a microphone over there. So, as you probably guessed from the whole presentation, there is so much information to be digested, and it's extremely difficult to talk about all of this in 25 minutes, but what we are trying to do with our website, End Software Patents, is to have the central resource for everything about software patents, so that includes case law and legislation, and that includes the whole world. We want to make people not forget software patents like they forgot them for probably the last 15 years, and we try to make everything about software patents. Let's make them the headlines, let's make software patents something that everyone should be concerned about, and not make copyright the only issue that concerns free software, because patents actually concern not only free software but software in general. And with that we would like to thank you for your attention. Please try to contribute to these resources, it's extremely valuable. Yes. Okay, we have time for one question and this is going to Simon, sorry. Thank you very much, so I'll try and frame this in the form of a question if I can. One thing I'd like to point out about all of your comments about FRAND is that all of those lawsuits that you displayed included ETSI as a correspondent, because the FRAND expectation is not a part of law, it's a part of the way the specifications, the standards, are written, and as a consequence anything based on FRAND commitments is on very shifty ground, because there's no real proof that anybody is actually forced to do things on FRAND terms anyway, so I don't like number four.
Number two, your carve-out: I've written a paper on that for OFE and I think that's the one to go for, because there is actually a division between standards that are implementation-based and ones that are requirements-based, where monetization depends on patents, and so you can find a compromise that will work there. And I'd like to invite you to come with me to an ETSI meeting and meet the people who think we are all crazy, batshit crazy idiots who want to destroy the European economy, because that's what they say to me when I say anything you just said, and we're not going to get anywhere with the Commission unless we understand their position and appear to accommodate it, even if we think they're the ones that are crazy. And then that implementation paper from Andrew Katz you saw: the important thing to know about it is that it says it's impossible for an open source implementation to get any licenses to any patents in any standards, because they went and tried for the standard that's mentioned in the title. So that paper isn't a general paper about the topic; they went and tried to go and get FRAND licenses, and they found it was impossible for them to secure FRAND licenses to any of their standards. So this whole thing is based on a false premise that anyone is going to actually fairly get licenses. This is all about entrapping implementers into the universe of the companies that have rigged the standards at the regulatory-captured standards organizations. Okay, cool, so let's move to Kota Zürich. The term FRAND is in the Commission's call for input. Although it is not in European law at the moment (there is a regulation from 2012 with some reference to it, but it's not in general European law), it's in the Commission's call for input, so it may be defined, and if it is going to be defined then let's define it ourselves. Thank you. Thank you. |
The Professional's Guide To Haphazardly Picking Licenses For Standards & Specifications
Practical tips for the reckless licensor |
Welcome, our next speaker, Nate Willis. Hi. Well, I have good news. The non-font-related portions of the event are now officially over. We can get down to business. Anyway, as Tom implied, I'm Nate Willis, and I'm an impressionable young PhD student from over in the UK. I'm going to talk about a specification that I worked on a couple of years ago and how I discovered that choosing a license, or the best license, for it isn't simple. The options there are scarce, the particulars are real particular, and there isn't much in the way of best practices or guidance, especially if you want the license on a specification to embody freedom principles. And I'm also going to try and make the case along the way that even small project specifications like this matter, and therefore this warrants some consideration. Weirdly enough, I think this is the third time I've spoken in the legal and policy dev room, but this is not going to be like my first talk, in which I gave a lot of information about fonts and things like that. It's going to be a lot more like my second one, which was about photo policies, in the sense that I just rattle off a bunch of stuff, meeting discussion, and then nothing happens. But I have decided that that is science. I was watching this documentary, and there's a reference at the end, just so you can get that cleared up. It's about the research process, and two of the people interviewed brought up this interesting principle, which I'm going to put to the test, which is to say talking about it is at least good enough. And I genuinely want to know, though, what everyone else here thinks. So as we're going, please look deep into your soul. And for the lawyers, I guess, ask someone sitting near you. But I do have to talk about fonts briefly, so you have some context to orient your thinking, because not all specifications and not all projects are going to be the same. So let's go back to the beginning.
Back in 2018, I was hired to work in this repository on GitHub, which is a specification for text shaping. Text shaping is something that there's a free software library for called HarfBuzz, which is excellent, but there's no specification for how it's supposed to work. And so this company, YesLogic in Australia, who make document conversion software, were working on their own shaping engine, written in Rust. And the lack of a spec was a huge obstacle. So in our project, I went through the HarfBuzz source code, and I pestered Behdad and Khaled and the people who work on it with a lot of questions, and wrote a specification. The YesLogic team then used that to implement their own shaper, Allsorts, which is also open source, so we sort of put that to the test. And initially, the goal of writing this as a specification was that we wanted to have OpenType or some other standards-owning organization officially take it up, which didn't happen. So we kept the license out of the repo at that time just to avoid the hassle of having to re-license it, as well as to hammer home the fact that it was unofficial and still a work in progress. Like I said, most of the work was in 2018, when I wasn't doing anything else. I've continued to make some updates to it, especially when the YesLogic people find a bug or something. But this unlicensedness does need to change. People do find it useful, and they cite it even without the license, especially people who work on publishing and font support tools and things. You might be thinking, well, wait a minute, we have OpenType, we have Unicode, what exactly is the problem here that needs specifying? Well, here's the thing. This is the bit of fonts that I do have to talk about. Unicode defines how a language is encoded, but it doesn't know or say anything about what's inside of a font. And OpenType defines how fonts are structured, but it doesn't know what's in the document that you might use with that font active.
Silicon Valley doesn't want you to know this, but it's the truth. Here's an example. Unicode says every letter has one unambiguous encoding. In an open type font, every time you see that code point, you look it up and it's an A, so you take that bit sequence in the glyph table, you find it, hopefully there's an image of an A, and you put that into the document or on the screen or whatever. However, there are some languages that have exotic behavior like this. Unicode says E with things on top is a different letter than E. And in most cases, the things are also a different encoding code point. So the person who typed that, they might have had that symbol on the left on their keyboard, but they might have actually typed other two things instead. And what if the font has one of those but not the other, or vice versa, or any of those permutations? Some layer in the environment has to make that match and know what to do in that situation. But it's not just exotic languages. There are widely used writing systems like this one where the whole one image per byte sequence code point just isn't enough to get text on the screen. Like when they connect, they can be connecting in arbitrary ways. Remember Unicode says each letter just has one encoding code point. So in those where there's connected multiple images required, OpenType just says just put a bunch of them into your font with the same code point assigned, and then you can put tags on them to say this one's for initial letters, this one's for connecting the middle, and so on. But nothing describes... OpenType doesn't tell you how the software determines that it needs to grab that variant, or this one, or what order to do it in, or how to even know to look. That's what shaping is. It's not in either place, even though both of those standards exist, because it's sort of part of the environment. So that's what we're specifying. 
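As an illustration of the "E with things on top" case above, here is a minimal Python sketch using the standard `unicodedata` module. It shows that the same visible letter can arrive as one code point or as two, and how a layer sitting between the text and the font might pick whichever encoding the font can actually display. The `pick_form` helper and the two tiny "fonts" (plain sets of supported code points) are hypothetical and only for illustration; a real shaping engine such as HarfBuzz does considerably more than this.

```python
import unicodedata

def candidate_forms(text):
    """Return the distinct encodings to try: as typed, composed (NFC), decomposed (NFD)."""
    forms = []
    for form in (text,
                 unicodedata.normalize("NFC", text),
                 unicodedata.normalize("NFD", text)):
        if form not in forms:
            forms.append(form)
    return forms

def pick_form(text, supported):
    """Return the first candidate whose code points are all in `supported`, else None.

    `supported` is a hypothetical stand-in for a font's character map:
    just a set of integer code points.
    """
    for form in candidate_forms(text):
        if all(ord(ch) in supported for ch in form):
            return form
    return None

precomposed = "\u00e9"   # "é" typed as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Hypothetical fonts: one only has the base letter and the mark,
# the other only has the precomposed letter.
font_with_parts = {0x0065, 0x0301}
font_with_whole = {0x00E9}

print(pick_form(precomposed, font_with_parts) == decomposed)   # the decomposed form is chosen
print(pick_form(decomposed, font_with_whole) == precomposed)   # the composed form is chosen
```

Neither Unicode nor the font format decides which of these to try or in what order; that decision lives in the shaping layer, which is the behavior the specification sets out to describe.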
And I thought I should say that because, for assigning a license to a specification, you need to know what it is. This means we're defining a behavior. For example, it's not a format. It's not a codec. It's something functional, but it gets down to real specific detail. Like, there are regular expressions in there. But it is also a single component, so it's not back and forth like communications protocols are, where you have to worry about two implementations talking to each other. Importantly, people need it to be consistent across all implementations, though. Like, that's the point of specifying it. Because documents have unbounded lifespans, and they can travel anywhere. So people on both ends want to know: if you write an email or a paper or something, and you send it to someone else, regardless of what platform they're on, they expect it to look the same. And that kind of means that the specification needs to support proprietary implementations, in addition to free ones. Some of those are factors that would be different in other specifications or other things you could specify or write standards for. There are interoperability tests that matter when you're doing network protocols that aren't really applicable here. For this, when there's a problem, you deal with that through regression tests and diffs and things like that. So this is worth defining a specification for, because, like YesLogic, people are going to write new engines that do it. There's always a new language or a new environment, like a game engine or something, that has a different stack to it. So the spec seems to work. We want to license it. What do we choose that meets the need? Take a specification license. That'd be great if it existed, ideally. If there was an open specification license or a couple of them, we'd just use that. There's not. In the absence of that, most people seem to just opt for other licenses, which makes sense. It's understandable.
I know a little bit about documentation and licenses, so I can talk about some of those generic options. The GNU free documentation license has some pretty inconvenient things in it, like the invariant clause, which people can invoke. Even if I wouldn't assign the invariant clause to things in the specification, the downstream recipient might, and I don't want to imply that burden for other people. The Creative Commons licenses, obviously there's a bunch of those. The restricted ones would not be useful, like non-commercial, but they just address different use cases and ideas, in my opinion, because they're meant for creative and cultural work, which is also true for software licenses applied to standards, like they talk about derivative works in a way that makes sense for executable code or for performing things in the Creative Commons case. But they're always concerned with the document itself, whereas I think in a standard or specification, they want to be clear about the difference between the contents and implementing them in the document itself. Now, the other trivial solution might be just to pick the license that matches what you have to interface with. That's not always easy. I'm going to show you why from our particular case. You think open type. It says open right there in the name. It's an open standard. And if you Google it, you see, oh, there's a URL there. And that's what it looks like. We zoom in on that left column. That's where all the tables are. Enhance. But you see all the stuff in lower case letters there. That's actually from the true type specification, which is there. Let's go back one more time. Enhance. Except for the capital letters, CFF and CFF2. Those are from PostScript. And they live there in PDF form, oddly enough. Then there's also the ISO spec, which is officially what open type actually is. We don't really seem to talk about that very much. And then there's the SFNT spec at the bottom. 
I know you can't really read the URL, but that's a dead link anyway, so it doesn't matter. Is that convoluted? Yes. The billionaires don't want you to know this. But the reason is it was written by three companies who didn't really want to talk to each other. They're rivals, their implementations are secret. So as little as possible is put into it. And this is not going to go away. This is how it works. This arrangement is what ended the font wars in the 90s. This is Versailles you're looking at here. Bad news, but that's what it is. On the other hand, Unicode is a lot nicer. Unicode is very straightforward. There are a couple of different licenses applied to different things. It's sort of split up by the URL patterns there. There are a lot of components. Some of these things are under older licenses because they're from much older releases. Committee reports, national standards, things like that. The little stuff is there. The neo-historian capitalists don't want you to know this, but the older material still matters for things like archives and legacy documents and stuff like that. Again, document lifespans don't have an end to them. There's no limit. I don't want to harp on the particulars too much, but I do want to say this happens when we interface with external specifications. OpenType has that weird lateral complexity to it because of the rivals working together. On the other hand, Unicode has sort of longitudinal complexity, you might say. Those are not isolated cases. External standards don't arrive in the form that's ideal for you. You take what you get there. So without a prefab solution, the next thing I did was just sort of dig in to see what other pro-freedom or freedom-adjacent standards publishers do. Here again, the bad thing is that none of them are presenting their license as something off the shelf that the community is free or encouraged to reuse. So this is a bit fake.
There's not really a palette to choose from, but I looked anyway and I want to highlight a couple of these that I think are instructive, because there are some good takeaways. The W3C, probably the most freedom-ish of these organizations, actually has two licenses, and they're distinct from each other in some important ways. One of them is the software notice and license, the other is the document license. They're mixed up. A lot of things you'd think would be under one are actually under the other. For instance, the font-related ones are a complete blend. Progressive font enrichment is one of those and incremental font transfer is the other, even though those work together. But something that is important here is that these licenses, particularly the document license, talk about implementation, so that's already an improvement. On the other hand, the document license has a lot of W3C-specific language in it; it references pointing out the standards-track status, and a lot of things are about how W3C does its standards. It mentions that you can use code examples, which is useful for my case because of regular expressions and things, but it is hyper-specific in the way that it says that, referring to Web IDL and CSS and things that are clearly marked as examples. I don't feel like that's general enough for reuse. This is a long way from what could be a reusable snippet. The other thing is it notes that you can quote from it in your implementation, which is also really important because people do that. They put quotes in the comment blocks in the source code to explain the intent, and you want that to be non-burdensome, so there's a lightweight notice that you put in, in the W3C case. The other organization we're looking at is the IETF, whose official license has this creepy name that sounds like they meet in a vault in an IETF mansion or something. Some new points here, though.
It knows that translation is allowed, which the W3C does not, and that's pretty important, especially if you're dealing with a project like the one I did, which is all about global language support. It also, differently though, it defines the terms of quotation by the length of the document, just one-fifth of the document, which is an interesting distinction. I don't know how great that is. You want those terms to be easy to follow, particularly because the quotes might start out as a paragraph together, but then when you refactor the code, you have to split that up, and if you end up putting a big notice on every single line of that, that's not ideal. A few other things I researched which are not worth getting into. The ISO process is a monstrosity, as you would expect. The whole machinery of international global politics is involved in that. Plus, there are different terms on most of the things they do. OpenAPI, I sort of thought this project that I was on sounded like an API first. It sounds great, but apparently it's actually only about machine-discoverable HTTP application services. I think I quit reading. It's like 15,000 words long. I'm not exaggerating. I counted. There is more written about open standards from the OSI and from a lot of governments, and that actually gets pretty different in most of the talk there. It's not about the licensing of the document. It deals with patents and intellectual property things, which you heard about in the previous session, but it also deals with the process of our time, like who decides who is participating, how do you make and publish updates. That's about it. That's as much about the standards body as the specification itself, and it gets into deep questions about the difference between standard and specification. For our project, the GitHub repo participation is pretty wide open. 
We've had a loose, low-impact method of incorporating changes, but that's because we're piggybacking on the way the GitHub site works, and we're kind of limiting that by virtue of not having a license assigned. That's what we thought. I did also look at some other things, like the Xiph.org AV codecs, which use software licenses, and some things like that, like programming languages that tend to have their own unique license. There's really too much there to generalize from. So if there's no off-the-shelf answer, I guess we're cooking one. So the question is what we'd like to see. I made a list of these. Big license doesn't want you to know this, but you can do that. The first couple of things are pretty obvious. Distribute it in whole and in part. Implementation: we want to be clear that that's not the same thing as reproducing the text itself. Modifications is trickier, right? That's an open question, because it ties into the participation model stuff. Anyone can make a fork or a pull request on GitHub, that's good, but we don't want there to be a whole bunch of incompatible things floating around. That connects to stuff like trademarks and how you manage the canonicity of your document. The end there is the stuff that is, I think, maybe more interesting in a practical sense. The quotation issue, I'm split on that. W3C defines it by purpose. It's in the implementation section. It's a lot of separate green. IETF says it's about length. I'm not sure that I think one of those is better than the other, particularly with posting the notices and rearranging things. Should you do that at the file level? I don't know. Same thing with the quotations of code that people might reuse. Should there be a separate license? The Python license does that. It specifies that code snippets are, or should you bake that into the license? Thank you. Thank you. Thank you. Thank you. Okay, check one, two. And we're back on the air. Okay, thank you, Tom.
Thank you, other guy whose name I don't know yet, but we'll meet later. Yes, so, okay. I wrote things in pseudocode because I wanted people to implement it themselves. You can't do that with a regular expression. That's what makes them regular expressions. But where do you draw the line there? There are things like state tables. The representation of that is not the Markdown-ness. It's the actual table. So it's worth thinking about exactly how you specify code and functional things that people could use in their implementation. Anyway, and like I said, there are a few things not discussed here, like patent and IP grants. In my own project, we avoided that because we're sort of covered by implementing OpenType and Unicode. Other projects might not be. Let me go back one. Anyway, this is where I'm leaving it. As it is, I guess we're going to put our own license together. I don't know. That might be a whole debate in itself. If I have missed some obvious solution, please, please tell me. I'd like this to be easier in the future for future specificators. It's surprising that this is not a topic that comes up often, even though I guess I'm saying that it does happen. We just don't talk about it. There are a lot of things like on freedesktop.org. They're kind of specifications, but we don't necessarily think of them that way. We don't think that D-Bus or AppStream or something needs to have all this furniture around it, because it's sort of like flying by the seat of our pants. There's always going to be new bits of technology from the community that have this issue. Let me jump ahead there. Anyway, that's my time. Do get in touch. You can actually visit the GitHub repo, the third link down there. The license thing is a pinned issue, so you can go look at it. Here are the citations I mentioned earlier from the actual screenshots and things. That's where I'm leaving it. Thank you for your attention. Thanks for having me back for the third time. Thank you very much. Thank you. |
Panel: Hot Topics
Organizers of the Legal & Policy DevRoom discuss the issues of the day |
We've made it to the end of our Legal and Policy devroom. If you're here, I think most of you have been here for most of the day. We've had so many amazing discussions and it's really run the gamut. So what we normally do at the end is we just take a moment for us as the organizers to react to some of the talks that happened today, talk about what we think are some of the important issues that maybe weren't covered during the talks, and then to address topics that any of you may have. So where should we start? Hot topics or reactions to talks? Whatever you like. All right. Well, I will... Do I have to do the day a bit first, or then... Shall I? Yeah. Sure. Okay. It's very boring to hear us have to figure out our logistics. Yeah, a bit. I mean, maybe what... So to sum up the day, this is what we've done the last times as well. For me, it was super interesting that there are more positive vibes than we had, like, two years ago, I'd say, and this is very good to see that there is, like, a mixture of optimism, but also lobbyism, that there are legal activities, and that people are trying to get active, and that they are successful with this. And in this regard, I think we should also talk about your honorary doctorate, which you just got a few days ago from the university in Leuven, and I think that's also something we should applaud, for the amazing work you've done on free software in the last... real life, and so that's why I think we should be more positive in all of these questions, and Bradley wants to jump in, so normally the pessimistic one. Well, I think...
Daring to embarrass Karen a little bit further, I think one of the interesting connections with why she got the honorary doctorate was that the students of the university in Leuven are able to vote for who gets one of the honorary doctorates each year, and the students, some of whom were in this room, helped organize for many years to really get her on the ballot, on which she finally was elected and was given this honor, and that kind of connects up with something that I was trying to say in my talk, which is, I think I'm very hopeful... I'm the most pessimistic person in free software, generally speaking, but I'm very optimistic about this generation who are just coming of age in university right now. I think they're much more interested in building a life that's not merely just doing some tech job or just being part of some corporate machine, and my generation wasn't like that. I was the weird one who didn't want to do that, but it seems like the weird ones in your generation are the ones who actually want to get some corporate job, so that's really good to see. Yeah, I just wanted to say something that echoed that. For those of you who are Conservancy Sustainers, you will have got an email from me at the end of the year, just reflecting on how much has changed, and I really think that we are in this very, very different time. I would say as recently as five years ago, it was hard to explain to people why they should care about their software, that having control over your technology was something that nobody really considered very much, and when we gave talks, we had to talk about all of the vulnerabilities that had been exposed, things that most people hadn't heard about, and we're living in a very different world now.
We're living in a world where everybody understands that these issues are important, and I think that the world is so ready to dig deeply into the ideas of software freedom in a way that they haven't before, and that's affecting a lot of the policy work. When I spoke to officials at that university, and when I go speak generally, I find that people in positions of power are much more ready to listen. It's in part because of young people being motivated by this, and because everyday people are realizing that their phones have been surveilling them, that all of the technology that they rely on is in fact behaving in ways that they had never expected, and so there's so much opportunity now, and there's so much potential to make policy. I think it's really exciting to consider this challenge of making users more sensitive to, and care more about, free software, and the talks that we heard today about app stores, I think, were really promising: perhaps we can imagine a future where smartphone users can know about the apps that they're installing before they install them, much in the same way you might look at nutrition facts on some food at the supermarket before you actually purchase it. Does it contain healthy ingredients? Does it have fat? Does it have sugar? Well, is your software free software? Can it be reproducibly built? Is there a link to the source code? So I think that those are really interesting questions, and perhaps these alternative app stores can give an opportunity to do that, where we can take software with these features that we care about. Yeah, but while food comes with a list of ingredients, it doesn't come with the food bill of materials, and I've been told that unless something has a bill of materials, it's completely useless. So does that mean food is useless, like software is? Well, that's an interesting analogy. Maybe we don't have an exact bill of materials for food, do we? We seem to get by okay.
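The "nutrition facts" questions raised above (is it free software, can it be reproducibly built, is there a link to the source code) could be sketched as a small data structure. This is only an illustrative sketch, not something any app store discussed in the devroom is known to implement: the `AppLabel` type, the `label_lines` function, the example app, and the tiny license allow-list are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately tiny allow-list of license identifiers that this
# sketch treats as free software. A real store would need a curated list.
FREE_LICENSES = {"GPL-2.0-only", "GPL-3.0-or-later", "Apache-2.0", "MIT"}


@dataclass(frozen=True)
class AppLabel:
    """Assumed metadata an app store might hold about one app."""
    name: str
    license_id: Optional[str]   # None when no license is declared
    source_url: Optional[str]   # None when no source link is published
    reproducible_build: bool


def label_lines(app: AppLabel) -> List[str]:
    """Render the three questions from the panel as a short text label."""
    is_free = app.license_id in FREE_LICENSES
    return [
        f"App: {app.name}",
        f"Free software: {'yes' if is_free else 'no'}",
        f"Source code: {app.source_url or 'not published'}",
        f"Reproducible build: {'yes' if app.reproducible_build else 'no'}",
    ]


if __name__ == "__main__":
    example = AppLabel(
        name="ExampleNotes",                      # made-up app
        license_id="GPL-3.0-or-later",
        source_url="https://example.org/source",  # placeholder URL
        reproducible_build=True,
    )
    print("\n".join(label_lines(example)))
```

A label like this only reports what the publisher declares, so it is as coarse as a yes/no answer: it says nothing about what is actually inside the binary.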
Yeah, I don't know how we ended up with food now, but to be honest... Because everything is a free software issue. Yeah, that's fairly true. Also, beside what we discussed here in the devroom today, I also think about what we see in legislation, and unfortunately, while we had our devroom here, there was a discussion as well in the big main room on the Cyber Resilience Act and the question of liability, and liability in free software. There are some issues, and we also discussed here, is there something like commercial or non-commercial free software? And these are also topics that keep us thinking about all of this, and how can we put it in writing, and how can we explain this to decision makers? What's happening? What's the ecosystem of free software? How does all of this work? And I think we made a lot of progress here in the last months. Still, we see that there are open issues, and that we have to work on that. But decision makers are open to that. We get these meetings. They want to listen to us. They want to understand. And this is something which helps us and which we should make use of. We also discussed the question of resources. So how many resources do we as a free software community have to jump into these discussions? And I mean, the opposing side is well financed, and they send dozens of lobbyists. And in this regard, it's also very good to see that there are people here for the very first time, and that they care about topics of policy, and that FOSDEM is not only about code, but it's also about legal questions, policy questions, and that's really good to see. Going back to the food analogy, there's a class of ingredients that do not have to be declared on the label. And this is a source of controversy. They're called processing aids.
This, I would say, relates to software insofar as if you have an app store that tells you just kind of yes or no about free software, or "this is the overall license", there is still the possibility, as has been discussed, and there are talks about this on the track, that there is code washing that has happened underneath. So that really, I think, emphasizes how much code washing matters if you're just giving someone one little piece of information about the underlying software. One of the problems with the food analogy, of course, and I've always hated it with free software, is that if you wanted to make the full analogy with food, you would say, well, everything you buy at the store needs to come with a recipe and a full list of ingredients, which we've never demanded. So I don't know, somebody should probably start the free recipe movement, I suppose. Well, isn't that the scripts used to control compilation and installation? Well, yeah, that's right. I mean, that's the thing, that's the equivalent. You would need the full, like, how do you do each step? I mean, I think that's the equivalent. We're taking it too far. Yeah, a joke turned into a quasi-serious discussion that wasn't that useful. Benjamin Henrion from FFII. I networked with Karen yesterday on the Unified Patent Court. I worked on that after the software patent directive from 2005; we continued to work on it because the lobbyists went to the European Commission and asked to continue to push for a patent system for Europe. So the actual lobbyist is the guy behind the Uber Files, Mark MacGann, who worked for what is now DigitalEurope, which is basically the large software companies from Europe, but also from the US and Japan. So the push in 2005 was that the large companies basically were fed up with us, and they asked the European Parliament to reject this directive and to push for a patent court for Europe.
And their problem is, how do you do a patent court in Europe in the EU system where you have the European Court of Justice and National Courts? So we went to, we challenged the unified patent court ratification in Germany, but the thing we didn't see was that basically what the construction they have done, which is to make an international court for patents and removing the national courts and putting a link to the European Court of Justice, is actually not following what the European Court of Justice has been saying for the last 12 years and if not 50 years since the EU was created, which is, if you create a court inside the EU system that has to interpret EU law, they have to talk to the guardian of EU law, which are national courts. And that's why they have dissolved investment courts in 2017 and they have been dissolving these investment courts for big oil, called the Energy Charter, which has the same problem. They are international courts which have to be put up by EU law and they, when they look, so they look at previous design in 2011 where they tried to do this system with non-EU countries like Turkey and Iceland and Switzerland and basically they said you can't do this court system because first of all you do it with non-EU countries but also you deprive national courts from interpreting EU law in this field. And what happened with UPC is that politicians kept on trying to remove national courts. So they called that a common court and the only common court that exists in Europe is the Benelux Court for Trademarks, where national courts of Belgium, Netherlands and Luxembourg are involved. 
And at some point in the history of this court, because it's a system from the 60s, came the question whether this Benelux Court for Trademarks can also ask a question on trademark law, because there's EU law on trademarks, to the Court of Justice, and the Court of Justice said yes, you can do it, but in their decision they didn't say yes because national courts are involved. So the whole construction, and we didn't see this, but the other guy who complained to the constitutional court saw that, he had all the arguments, but the court said, ah no, we're not going to ask the European Court of Justice because it doesn't involve fundamental rights and we are not used to it, but the anti-question is there. So now we are talking to politicians, we raise the case and the guy ignores it, while strictly speaking the Commission is the guardian of the treaties, and there's like three articles, Article 19(1), where the guardian of EU law is the European Court of Justice and member states have to provide independent tribunals, which caused problems with Poland because they didn't have independent tribunals, but this article was clarifying to salary ever to say... Could you quickly skip to the point? I'd like to invite you to submit a talk for next year's devroom. I missed the deadline this year. I think it was a very long statement. Hello, I'm Russell Melba from Denmark. What's been the most interesting things you've learned today?
I loved hearing about the windows tax suits I thought that was so interesting what Luca's been doing and I think it really fits into that theme we were saying before about how people are starting to realize the impact that their technology is having and they're getting involved and I think that what we're seeing is like a lot of the approach of this is from a consumer perspective which is unlike how we've approaches in the past ties into some of the themes about the Vizio suit that we've been working on which of you we can cover later because it's still a hot topic and is progressing through the courts but I think that this like this, the way that we're thinking about leveraging legal mechanisms from a consumer perspective really connects those dots and I like seeing how that happened What did you learn today? What I learned today was from Marcel about the EU pilot project process I was not aware of that and I find it fascinating that maybe connection with your MEP and the ability to support or create such projects can really make a difference I thought that that was really interesting to learn about Maybe to add to this budget question I mean these pilot projects are kind of like commonly known in the EU We had this FOSSA project, FOSSA 2 project where we had bug bounties and hackathons for VLC for example and software used by the Commission, free software used by the Commission However I think especially the budget question is very interesting because if we amend to the budget we can allocate money to projects but also to institutions and for example if you think about the open source program office of the European Commission which was introduced like two or three years ago then if you look for this in the budget there is none and it would be a good idea to go to MEPs or Commission and Council to ask them to add budget lines I would say we want to have first of all a budget to do projects, activities but also to hire staff because what happened is that they among 
the people who already worked there gave somebody the head on top and said now you are the hospital manager and this is also something we should think about that we work on budget files and that we talk about money and that it's not about just procurement which was also a topic today so how do we procure free software but that we also make sure that there are reasonable funds that there are people working within the Commission or the institutions on these topics and not just like they are doing it half time for one person the full year or something like this so that we also think about how can you become part of the community that means you need to invest a bit and that this investment also will lead to benefits in a few years Well that makes me want to ask you both a question you mentioned budgets one of the things I think one of the great messages from Europe has been this idea of public money, public code and I'm wondering what your thoughts are about the budget for that marketing and initiative here in Europe and then for Karen how do you think that message could be crafted in the US context? 
I mean with our public money, public code can pay me pretty much ask them to introduce kind of laws to say whenever you procure or especially also when you code software by yourself then it should be released under free and open source software license and what happened since we started this campaign is that we got many many papers and most of these papers go kind of like in this direction but it's always like coming with loopholes and that's one issue so there are no concrete goals, there are no concrete numbers it's always like if it's possible then it would be good if you saw something like this however we see progress and we see that there's kind of a will to go more in this direction and now I think for us it's important to turn more like into a watchdog so we got our papers we wanted and now we need to make sure that these papers are filled with reality so this is for us important in the moment and now we have it in the US I would say that a hot topic in the US over the last year is that it seems like coming out of the pandemic one of the first things that has been happening is organized conversations that involve the US government in various levels there have been several invitation only meetings that have happened some of them have been publicized as sort of there was a White House initiative which involved a lot of corporations having part in that conversation and there have been other conversations that have been happening you know where different agencies in the US federal government and also we're seeing more interest on more local level politics again I think reflecting this general cognizance of the importance of the ethics of our software and the technology we rely on which I guess if I can't mention that without also noting that Matthias isn't here because he's reading his book which is a kids book about software freedom which my kid truly loved and like read three times in a row and was like this explains it so much better mommy so if you have kids 
that I do recommend it. Yeah, so I don't know if that answers your question, but there have been a lot of conversations, and I think we're just seeing the beginning. There's a question from the chat, which I'll read here. A question to the panel: do you think that free software in AI applications for the public administration is important? Because it's an issue related to democracy, and in my opinion citizens shall have the opportunity to scrutinize what a software does and how it works. What do you think about it? I mean, we didn't hear what Bradley learned. The main thing that I learned is the same thing I learn every year at FOSDEM, which is that if I'm speaking I have to write my talk before the day of the devroom, so I didn't learn that again. Okay, yeah, so if I got the question right, the question is about AI in government, in public bodies. Yeah, sure, definitely, and I think a very good example is from a few years ago, when in France they had an algorithm to decide who can go to university or not, and due to our work this became transparent, and then we saw that there's discrimination in it. I think this is why we need transparent software, to see if there is potentially discrimination in it, and so that we can have public discussions around it. And I think especially when we handle the data of our citizens, then it's really becoming an issue, and we have this also in law enforcement, where we try to run the data and then find the terrorists and stuff like that, and here I think it's really, really helpful to have transparency. So if we decide to go in this direction, which happens from time to time, then the red line should be that the code must be transparent, so that we can see and have a discussion and check if human rights are protected or not. Right, and that can't exist without the data set, so we can't have the algorithms in a vacuum. If we don't have the data, we're basically just as much in the dark, so we need to worry about that too. Okay, any
questions here? We already have one, so I'm going to go back. It's a really quick question: could you give the name of the children's book, and where can we get it? Ada & Zangemann. Yeah, Ada & Zangemann, and we'll put it in the Matrix chat. Yeah, hello, I'm Federico. So by the way, if you think that food labeling is easy and solved, check out the current discussion in Brussels about Nutri-Score, you will be amazed. I have a question. At a recent CopyleftConf we had this question, I think, which was whether we should do more copyleft enforcement, and of course Conservancy has done a lot in the last year about this, but I'd like to hear an updated answer. What is your current thinking? Should we see more copyleft enforcement, according of course to the copyleft enforcement guidelines, by others as well? Is that something we want to see? What is the current vibe? Do you want to answer? Do you want me to answer? There are so many questions in the question, which is also not just for us, but it's also: does everybody want to see more enforcement, and enforcement by others? I guess we'll start by giving a quick update on where we are at Software Freedom Conservancy. For those of you who are not very familiar with what has happened so far, what this person is referring to is that Software Freedom Conservancy, along with other organizations, published the principles of community-oriented enforcement, and it was a statement that said we want to enforce copyleft, but we want to do so in ways that encourage adoption and that are fair and that prioritize compliance over any kind of monetary gain. The principles are listed out; you can look on Software Freedom Conservancy's website, for one, and see what they say. We consult them regularly to make sure that we're doing the right thing and abiding by our principles. So that's the backdrop. We filed a lawsuit, no, two years ago now? Right, it was November of 2021. Yeah, it's about 15 months.
Yeah, 15 months ago, against Vizio, the television manufacturer. They're a very large US TV manufacturer; if you go into like a big box store you'll see a big logo. And so we filed a consumer-rights-based lawsuit where we filed as a purchaser of televisions. So we at Software Freedom Conservancy bought televisions that we wanted to use and replace the software on, but when we asked for the source code on old products, and after talking with Vizio for some time, they stopped communicating with us, and when we bought new TVs there was no source or offer for source on those televisions, so we had no choice but to bring a legal action. And so that is proceeding on several unique legal theories. It's a contract action, not a copyright action. We asked for specific performance: we've asked for the complete and corresponding source code, not for any money, consistent with the principles. And we filed as a third-party beneficiary, which is interesting because the licenses say that third parties have rights, and so we're exercising those, which means that if we're successful it stands for the proposition that the folks that receive the software, the consumers, the people who are buying the products, are the ones who actually know if a product is out of compliance, so the people who will actually do something with the software that they're getting. So not to rehash that whole thing, but what happened over the last year is that when the action started in the court, Vizio filed to remove the case to federal court. Basically, in the United States you have the ability to say that the action is preempted, meaning that Vizio said that actually it's not about contract law, it's about copyright law, and copyright cases belong in federal court, not state court where we filed it. And so it actually automatically got removed to federal court, and then we had to file a motion to remand it back down to state court. And then, once it was removed to federal court, Vizio filed and said, ah well, you're in federal
court now and it's copyright and oh by the way you didn't assert copyright so therefore motion to dismiss but when it was so there was a federal judge who ruled to send the case back so he won that remand basically saying that there was a appropriate cause of action for the state court and so it was a really initial stage of the case but a really exciting successful initial action for the last part of your question one of the reasons we brought this as a action under contract as a third party beneficiary into the GPL is to break what I consider a log jam of and almost a bug in the copyright system that we had which was it requires copyright holders who have chosen the GPL I mean at least if you do copyright enforcement those copyright holders have to basically be watchdogs themselves or be coordinated with an agency like Conservancy to watchdog for them to enforce their rights under the copyright rules and by looking at in terms of a third party beneficiary issue I do believe it expands the class of people that can go forward and bring actions and I personally speaking for myself not necessarily the organization would like to see more people and other jurisdictions making an effort when you see violations to see if there is some sort of consumer action that you can take in your jurisdiction to chase your rights under the GPL I wanted to do it just follow up with Karen how new or novel is this action working on behalf of users yeah I mean it's totally novel I think I'm unaware of anybody taking this legal theory before not in a GPL case but it's a pretty common theory third party beneficiary is not a novel right that's right I'm sorry but in a free software context totally novel but the legal theories are well trod law in the United States applying it to this particular instance is novel but third party beneficiary is not as a legal theory is not something that's new it's something that's been around for a long time it's quite established similarly specific 
performance nor it's not other part it's very common and specific performance under many legal systems including in the US and the UK is designed to chase things that can't be gotten that can't be replaced with money and it is our view which we think is the correct view that the specific source code and its scripts used to control compilation and installation for a given firmware generally quite unique for that firmware and therefore it's basically impossible for a third party who doesn't have that and is not being given that as the license requires to create that again with any amount of money really because they have to basically reverse engineer the product to do it which is just so cost prohibitive so and this this type of legal theory is widely used for things like family heirlooms well it's for for real property for land it's quite common and I think the whole one of the points of this suit is really to empower folks to ask for the source code and to be taken seriously by companies so I would encourage everybody to make source code requests for the products that you already own and when you buy them just ask see if you get complete and corresponding source code you probably won't unfortunately but just by asking you tell those companies that it's important and it means that when somebody does want to do something they'll be taken a lot more seriously and they'll be able to make software that we're all done from we were recently told by a large company that we're the only one who cares about these source requests so therefore they don't worry so much about them so we hope you'll care too and ask these companies maybe Bradley while you're on the way you can also do information requests towards governments and ask them for the source code of whatever they release and by that you also make it an issue and for example in the Netherlands it's quite successful so they released with these information requests folks are over there they released a couple of apps so 
this is also a way. On this case: how do we know they're using copylefted software? Yeah, I thought Bradley would enjoy answering this. I've answered it so many times in my life. So the GPL, all of the different variants of the GPL, require them to tell you, so if it's present and they didn't tell you, that's an even worse GPL violation, because they also didn't tell you. So if you look in the manual and there's an offer, you can request the source. If you can't do that, if you can extract the firmware in any sort of way, if you run binwalk and then strings on the binary, it will be very obvious. It's not as hard as you think to find out that Linux is in a lot of products, and odds are Linux is in almost all of your embedded products. Can I, while I'm getting over there, can the panel say something useful? We should say something here. Say something useful. What topics didn't you hear about today that you would like to hear more about, that relate to free software and legal issues? So, to answer the question partly, how to know if there is copylefted software in it: for example, with routers you can attach a serial console and you'll usually see Linux boot logs. You could use nmap, a network scanner, if the device is connected to the network, and it will make a guess which operating system the device runs, and if it's Android, then it's also obvious that it has the GPL'd kernel Linux in it. I have another topic that we could talk about, which is sort of echoing some of the themes from earlier but is also a topic that we haven't discussed. I haven't pre-discussed this with anybody up here, actually, but Open Collective's announcement that they were going to transition the ownership of Open Collective to the community was a very interesting development over the last year. I think it's in very preliminary stages, so I've read the materials and I don't have a clear idea of what they're thinking of yet, but that dialogue is very, very interesting and it dovetails with some of the
things that I felt was coming out of market's discussion earlier today about these governance questions right open collective is a for profit entity that is assured for the resources for quite a lot of free and open source software projects and it's millions of dollars that go through it so it's interesting to have on the table the fact that it is privately owned and what it would look like to transition to the community and how that infrastructure could exist that could make it be successful so I don't know it just raises so many fascinating questions about the you know the structures that we've put in place around these important projects and then to sort of have these fundamental shifts and question marks about how we're going to handle them going forward I want to take a question that came in from the chat how should we as the free software and open movement approach new fields like synthetic biology which has some similarities to both software and hardware I'd say I have no clue but this is an example of where we as free software people have been to Insular that we don't have connections to people like I hear that question and I don't have a name of who I should talk to and I should and that proves in some sense that we've been to Insular somebody have in this room had a name in that field that we should be talking about and I know a couple of people we should start asking about I mean I think we've actually been starting to do a much better job at that because I think that we were sort of going back to Bradley's talk when we were in our early years I guess I wasn't around for the early part of that but we were kind of this trying to prove our point we had a very specific mission and we were really focused on our four freedoms and very narrowly defining that because we felt like we needed to fight for that legitimacy or fight for that space and we made a lot of of allies maybe possibly in the wrong direction to sort of what you said these corporate allies 
rather than you know talking to the academics and the nonprofits and I do think that we're starting to develop those bridges in ways that we haven't before I think those conversations are happening I always talk about my medical device advocacy and how weird it was like I there there was a person who was just a couple years older than me with my very same heart condition who was advocating for for patient access to medical device data he had the same device as me we were both we both had interviews on the same day on major news outlets and that's how we found out about each other after we'd both been advocating for years and it was only you know like five years after I've been advocating for like access to source code on control of the source code on these devices that I met like a number of other medical device researchers that had a variety of different interests and now I know who they are and so we can work together and we're sort of I feel like we're having that same process in all of our other fields too and so like the main thing is this for us to like reach out and have these dialogues in inclusive ways in that the bridge that gap yeah and it's it's also my experience from from from lobbyism that it's these open debates not only happen in software they also happen around data they also happen around hard hardware and I figured out it's way easier to go to folks that have good experience with open data for example and then to convince them to go also in the direction of free software open source and this helps a lot and my feeling is that we are that there are few years to go until we talk about open hardware but I think we will also manage this if we manage to open data, open source, open hardware debate to follow these lines and also like again talking about resources we should also think about low hanging fruits and start with that and not start with the most focused topics maybe and losing motivation and also resources and disregard I think we should 
also have an eye on the general openness debate, whatever this means. I'll be a pessimist again. Open hardware is a great example where I think the free software community did not listen very well, early on, to the concerns the open hardware community had. We had a lot of expertise that we could have shared, but I remember many people, possibly even myself, saying, "Just use the GPL for open hardware, it will be fine." And open hardware people were like, "Wait a minute, but this thing, and that thing," and we just didn't listen. I think we have had to rebuild those connections at great pain because we were so poor in our cross-community collaboration. Counterpoint, optimism: there's so much energy around right to repair, and what is software freedom if not the right to repair? And we're seeing those collaborations already happening, like the work that Denver, who works at Software Freedom Conservancy, did with us, where he participated in a coalition of groups that were involved in the EnergyGuide labeling with the FTC. I didn't attend the whole day, so maybe I'm asking something which was covered earlier today, but I work with Kubernetes and projects of the CNCF, and I have sort of the feeling that they're completely missing in this discussion. They have really nice tiering, with sandbox and then levels up, and nice ways of organizing how you get into the next tier of quality. They don't use the GPL for most things; I think MIT and Apache. So I wonder, why are they missing from this conference? Of course Red Hat is here, but not from that perspective. Am I missing something, or is it just like two different worlds? Do you want to take this?
I don't know, I'll respond to part of it. I think a lot of what's happening with Kubernetes is that it's a very popular framework for cloud orchestration, which is really based on the emergence of containers. We've had discussions in the devroom in prior years about licensing complexity with containers. And maybe it's bad, but I'm going to make a sort of similar analogy to the AI discussion we had earlier: I think that once you put software in a container, it's really easy to ignore the licensing that's on the inside. So I think that maybe there's been such enthusiasm and excitement around cloud orchestration, and Kubernetes in particular, that the commercial forces pushing it are willfully ignoring the complexities of software licensing. More to the question of why there aren't CNCF people here, and folks from the other corporate trade associations, I think the answer is relatively simple. FOSDEM has done such a good job making this a non-commercial conference, and those entities are so focused on the commercial world, that they don't really need to come here, because they have their own conferences that they built around corporations paying large amounts of money to get a seat at the table. I think that's the real difference. You're asking why they aren't here at FOSDEM. Well, because FOSDEM is a non-commercial free software community conference, and a CNCF conference, or one from the other Linux Foundation entities, is going to be a primarily corporate conference that most of the people here could not possibly afford to attend. I mean, some are here, and some were in this room earlier today, and some are at other events as well. So, but I agree generally. But what is, just to be flat... Okay, well, what is interesting is that in the first talk, by Ian, it's still on the bottom of the board, where it says "not a problem", one of the examples he gave is: well, if they let you run a VM or container that you control, they provide
a service. So it is interesting, because these people are kind of solving the last problem by making it easy to create all these containers. Okay, you still want that container to be completely free software, but we probably should talk more to them. This was a statement, not a question. Actually, there's a very simple reason why the Kubernetes people aren't here. We're talking about copyleft and software that's written by engineers for engineers, and you also made a distinction, at least you did, Bradley, in your talk, about excluding or trying to reduce the power of corporations. Kubernetes is very much software written by corporations for corporations. It is not anything that one of us in this room would ever run on any of our devices. It's cloud orchestration software, and very few of us run scale clouds. More kind of does, that's part of that; there's minikube. Can the panel discuss something while I get the mic up the steps? Do you have anything to discuss? So, if you had Bradley's time machine, would you have done something different in the past? For example, would you have started the Vizio lawsuit earlier? The answer to that question is always, like, Bitcoin, right? I would have, but then I would have known exactly what... like, yeah. No, more seriously, I always think that too, like, what if I could go back in time? And the answer is, it's just a question you can't answer, right? You know, when I was at university I installed a Linux lab, and I thought, what a great idea, it's really too bad it's just not going to go anywhere. So I don't really have... do you have an answer to that question? Well, I guess my first answer is, let's go back and write the licenses to be network-aware. But I think that's a little naive, and the challenge even now, in the present time, is: okay, we now have composable network services, and we see the emergence of AI coming on. What is authoring software going to look like in the next
10 years, the next 20 years? AI will be significantly involved in creating software, and, you know, what will that look like? What do collaboration and sharing mean in that? I think that's some thinking we need to do right now. I think I would go into the multiple places in the multiverse where there's all of us and most of the software is free software, like 90 to 95 percent of it is free software, and I'd say: your problems are solved; here, 90 to 95 percent of the software is proprietary, so help us fix the problem. Because we need to get everybody at FOSDEM, like, duplicated from various places in the multiverse. That's what we should do. That's not the time machine answer. It depends on whose theory of time travel you're relying on. So, as it's a time machine, I'd go to the future and see what needs to be fixed today. Do we have time for one more question? That's on the handout. It has to be quick. His hand was up before, sorry. I want the source code to the time machine. I'm sorry, I'm beating it, I guess, the Kubernetes question, and my cloud native question. It's corporations, in the same way that Unix providers were corporations providing software for corporations. We don't know where this is going. So, kind of tying it together with the time machine: if we let this happen, if we just say it's corporations making software for corporations, where will we be a couple of years down the line? We're all looking at each other. Yeah, I mean, it's a good question. It's this classic balance we've always had: how can we deploy our software widely if we don't use the system that currently widely deploys software? And so we're in this interesting situation where we as activists have to balance our public actions with also trying to engage corporations on our own terms, so that we can figure out how to find that balance, where we can take the power structures that we have and use every single lever that we can in order to get the right
result of more software freedom and less proprietary software. And I don't know where we can... I'm going to just wrap it up because we're over time, is that okay? I was just going to say, it was so amazing to be in this room with all of you today, in person. I've loved it. We probably want to wrap it up. Yeah, thank you. Thank you for coming. We have another oriented conference. If there's anything in your aisle that's not supposed to be there, like bottles, stuff like that, please bring it up and put it in one of the cans for us; we'd appreciate it. And I think we have to close the room down, so if you could move out relatively quickly, because they want us out by, like, seven, I think. And thanks for being so awesome. |
Migrating to LibreOffice Technology - old and new motivations and challenges |
Okay, let's start. It's just about 3 p.m. Hello, my name is Lothar Becker. Welcome to my talk, Migrating to LibreOffice Technology: old and new motivations and challenges. Well, migrating to LibreOffice technology is a very complex task, with technical issues and non-technical issues, in both worlds: the one we come from, a proprietary office suite, and the one we go to, a free office suite like LibreOffice, with LibreOffice technology being developed further. These tasks have been done, for example by me, for over 20 years now. In the meantime the technologies in both worlds are evolving further, and it makes sense to think about old things which are deprecated, old motivations, and new ones which are coming up with new versions in both worlds. So, with respect to the time of 10 minutes, which is very, very short for such a topic, I have chosen five items, I think the most important ones, and there are some new ones among them which I want to bring to discussion if we have time to discuss, or ask me later if you want to hear more. So, the first one. Is it working? The first one is an old one and a new one, and it means the desktop client is still there and will still be there. Everybody is going to the cloud, to online and so on, but as you can imagine, in a professional environment you are sometimes in regions or areas with no internet or a bad internet connection, and you have to work there as well for your job. So most deployments go with the solution that you have the possibility to work online and have an additional office suite installation on your client, and this will last a long time, especially in Germany, where there are such regions with no or bad internet connections. So, I mentioned these new developments in both worlds, both office worlds.
So, over time there are some new functional and UI developments, some of which are positive for our world, for our LibreOffice technology world, and some of which are negative, giving negative motivations for a migration. So, let me start with a positive one. One is in the client, in the office client, and it is positive for us. If you have a look at the new Microsoft Office client, working with format templates, which is very important in professional document processing, is becoming more and more obscure for the user. So it is hidden from the user that it would be nice to work with templates, and we are in favor of working with such templates. The negative side is that if you have a look at these new versions of Microsoft Office, you can see that there are a lot of new assistants and automatisms which are partially based on artificial intelligence, and this is a kind of functionality we almost totally miss in LibreOffice. There's one example where we did it: the integration of the DeepL translation, the possible integration into your text processing to translate, but this is just one. Yes, it's a two-sided coin. Some users are totally fine with these automatisms, but the negative side is that functionality is hidden behind such assistants, and users do not know what's happening in the document and cannot work with it. So, the second one I want to highlight is also an old one. It's the task, in a migration, of having the right strategy for the document format in-house and for interoperability aspects. I think this is a well-known item, but with new versions of the document formats, in the proprietary world as well as in our world, it is becoming more and more complex. For example, Microsoft is nowadays also supporting the Open Document Format, but if you try to determine which version of the Open Document Format they support, you are, you know, a little bit lost in space.
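A minimal sketch of how one could check which ODF version a document declares. It relies on two things from the OpenDocument standard: an ODF file is a ZIP package, and the root element of its content.xml carries an office:version attribute. The helper make_sample_package is a hypothetical stand-in for a real file, used only so the example runs on its own; the attribute reports the declared version only and says nothing about which features the document actually uses.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF "office" vocabulary, as defined by the OpenDocument standard.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odf_version(source):
    """Return the office:version declared in content.xml, or None if it is absent."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("content.xml"))
    return root.get(f"{{{OFFICE_NS}}}version")


def make_sample_package(version):
    """Build a tiny in-memory stand-in for an ODF file (illustration only)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'office:version="{version}"/>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real document, pass its path instead: odf_version("letter.odt")
    print(odf_version(make_sample_package("1.2")))  # prints 1.2
```

Run over the documents exchanged with a partner, a check like this shows quickly whether both sides are writing the same declared ODF version.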
So, there are some interoperability issues for ODF in the other world, and it is not transparent. Well, it was very, very interesting in the last few migrations we did for professional deployments: we have some issues in the interoperability from LibreOffice to OpenOffice and back. OpenOffice is still on Open Document Format version 1.2. We in LibreOffice technology are on 1.3, and 1.3 Extended if you choose the standard one, and there are some issues when sending ODF documents to other people who use OpenOffice. The next one is indeed a new one. You all know this big issue, or item, of digital sovereignty. There's a huge wave of expectations coming to us, especially to the office suite, to meet all the expectations around it for free and open software. So organizations are becoming more aware of dependencies in the proprietary world, and they want to get everything in a big bang, something like one-stop shopping, as we say: just click install and you have it all integrated, even with other tools we do not have in LibreOffice technology, like chat, collaboration and so on. And this is, as I said, meant out of the box, so doing it in a big bang, and all who are doing migrations know that this expectation of one big bang is not very realistic. Yeah, okay. So, the next one is online collaboration and smart devices. Also a good side and a bad side. The good side is, if you have a look at the Microsoft Office online apps, it's a total disaster. It's a total disaster, and this is a point where we, with the solutions in LibreOffice technology, are on the positive side. Definitely on the positive side. And this is coming up more and more, that's very, very interesting. For a long time synchronous collaboration was of no value, but this is coming up more and more with this pandemic home office work. More people are working synchronously on the same document. There is other stuff, like file sharing: you have to deal with tasks for file sharing, integration with chats and so on.
And the difference between online and smart devices is not known to the user. It's the same for them whether they are doing their work online or on a smartphone. We all know that this is definitely not the same, with the apps on the smart devices and the online office. I have to hurry up. I'm sorry. The last one is also a new one. It's an interesting development. We have worked for a long time to have this NotebookBar UI, and more and more the NotebookBar is seen as the standard UI. And what we get is that we have to make more efforts in migrations for training and support in different user interfaces. And there's an additional aspect. We have to be careful what we are doing on these different devices: client UI, online UI and mobile UI, which can confuse the user a lot. So what is TDF doing for such challenges in migrations? What we are doing is certifying people in three different areas. Professional developers, who are doing supporting stuff, if you have integration tasks and so on; and we have professional migrators and professional trainers. If you want to know more about that, please visit documentfoundation.org, certification program. You can see what we are doing there. It's a very special certification, and it covers exactly the aspects of these challenges, to have the best quality in migrations. If there are questions, I'm around, so come to me and ask. |
Fun project by design – How LibreOffice development can be full of flow?
The ten funniest moments of my recent Numbertext, LibreLogo, Hunspell & LibreOffice developments |
Thanks. Hi. Can I start? Okay. My name is László Németh, and the title of the presentation is Fun Project by Design. The short summary of my presentation: I don't know if all LibreOffice development can be full of flow, but I can show the best moments of my recent developments. So, my motto is from Goethe, "mehr Licht", more light, and in the end I hope I can shed more light on LibreOffice development, or on the fun around LibreOffice development. First one: I worked on improving the spelling dictionaries, or spell checkers, too, in the last few weeks and months, and here there is a Wikipedia one-liner I use to extend the spelling dictionaries. It pushes the whole text of Wikipedia, the Hungarian one in my case, through a Unix pipeline and ranks the entries; in the case of Hungarian there were 400,000 entries, with a very large variation between the different localizations of Wikipedia. It counts their backlinks, and after that we have data about the frequency of the entries, and we can filter the unknown words for the spelling dictionary. Here is the one-liner. I was so excited to see, with the Wikipedia example downloaded from the Wikipedia dump, that this is five billion characters on a single PC, the text content of 50,000 books in this single file, and it was able to create a frequency database immediately. I called my son: check this, this is the prototype of the one-billion-dollar algorithm of Google, PageRank. Wow, so wonderful. He said that it's a little late, but no problem, because it's not late for me. I know the Unix pipeline, the history of the Unix pipeline. Why so? It started with the moon landing. I know the history of grep, too; it's related to the language analysis of the Federalist Papers, by the founders of the United States.
There was a guy, McMahon, a linguist working at Bell Laboratories. So it's history, but it's the present, too. That is why it is interesting in LibreOffice development: OpenOffice.org was the first big office suite, built on the Solaris system of Sun Microsystems. I developed this with big joy, and I used the vi editor and its modification by a Dutch FOSS developer, Bram Moolenaar, the Vim version. So it's history, but the present, too. And this is the result related to the Hungarian dictionary. It seems that humans just want to have more fun, because check the most frequent new unknown words: McLaren, or all music, or soccer, football, and car racing, et cetera. But in the first dozen of the most frequent words there was an exception, a misfit: Nauchnyi, a small Ukrainian village with 1,000 people, only 1,000 people. Why? So I checked Wikipedia, and this is the place of the Crimean Astrophysical Observatory, and there is a huge number, nearly 3,000 links in Wikipedia, for Nauchnyi, the village. It showed me that this means thousands of asteroids, stars, galaxies discovered in this observatory in the last 100 years. So it was a nice experience. I added some other words, not this one, to the Hungarian dictionary, for example Higgs, the physicist related to the gravitational waves, or Triceratops, because this extinct animal is the most popular with young people, toddlers. So, sometimes useful.
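The one-liner itself is not in the transcript, so what follows is only a minimal sketch of the general idea, with a simplification named plainly: it counts word occurrences in a text rather than Wikipedia backlinks, then keeps the frequent words that a dictionary does not know yet. The function name, the sample text and the known-word list are all made up for illustration.

```python
import re
from collections import Counter

# One or more Unicode letters (no digits, no underscore), so accented words survive.
WORD_RE = re.compile(r"[^\W\d_]+")


def unknown_word_frequencies(text, known_words, min_count=2):
    """Rank words that are missing from known_words by how often they occur."""
    counts = Counter(WORD_RE.findall(text))
    known = {word.lower() for word in known_words}
    candidates = [
        (word, count)
        for word, count in counts.items()
        if word.lower() not in known and count >= min_count
    ]
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = "McLaren nyert. A McLaren vezet, az alma piros, az alma finom."
    dictionary = ["a", "az", "alma", "piros", "finom", "nyert", "vezet"]
    print(unknown_word_frequencies(sample, dictionary))  # [('McLaren', 2)]
```

The min_count threshold plays the role of the frequency filter: a word seen only once is more likely a typo than a dictionary candidate.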
The other development, the Soros language, is a mini-language to handle the problem of converting digits to a written form, the words of a number; for example, Euro amounts in French. I'm sorry for my French, I don't want to pronounce the French version, but I'm sure there are people here who can help me later. And this is the history of the Soros language. It's related to the software patents, because there was, in Microsoft's OOXML standardization, a very special Excel function, BAHTTEXT, converting numbers to the Thai language. There was a, maybe Chinese, suggestion in OpenFormula to create a more generalized version, NUMBERTEXT, but the answer from an OpenOffice.org developer was that defining a general-purpose number-to-text function would be a nightmare. He had the experience; it's really a nightmare. A few years later I made a one-liner with Unix tools, a prototype, to create something useful, a mini-language to describe the different grammar rules of the languages. The first version, within a week or so, supported English, Thai, German, Hungarian and Italian, and a very special numbering system, Suzhou numerals, which is still used in Chinese markets: two lines, the numbers, with the magnitude of the number below. For example, this number is 2023, and the next line contains the thousand and the year sign. The best, the worst problem: this was a spin-off of my grammar checker development, supported by the FSF.hu Foundation, but at that time, with three small children and without too much money, it was a crazy idea for me to start a new development. That is why I asked for the help of the NLnet Foundation, and later that development resulted in support for 11 new languages in the software. And the interesting thing here is that when I made a C++ version of this project, which was also funded by the FSF.hu Foundation, I got help from Eike Rathke, at that time a
LibreOffice developer, and later from the Russian developer Mike Kaganski, from Collabora. And now, if you check the new features of LibreOffice, it's possible to select these formats, the number formats, in the dialog windows. So, useful. And this was a two-character fix that I wasn't able to make five years ago, related to the complexity of my development. LibreLogo: a special language, but not only for children. Check the next example: this line. I fixed the SVG export with the special font features, OpenType and Graphite, so you can use, for example, Xpaves, the special generated graphics. For books, I improved the typesetting, the paragraph line breaking algorithm, adding the hyphenation zone of the OOXML standard, and this standard helped create more DTP-like text in LibreOffice. My plan is to create something better, things for picture books. Thank you very much. This is the mascot of the LibreLogo development, which I made last night. The idea is quite new, and it solves a thousand-year-old problem: what is this, a log or a crocodile? Both of them: log, the LibreLogo crocodile. So, 400 bugs fixed in the last four and a half years by me, using special methods, mixing in the Unix programming approach, to find the freedom in these developments. So if you like bug fixing, you can see new features, a language, for example. 200 years: the famous Hungarian poet Sándor Petőfi wrote a very short poem, you can read it later. And this poem is from Goethe, and it is related to Tesla, who made his biggest innovation in Budapest, thanks to Goethe. This is the end of my presentation, and you can check the numbers and the downloads. Thank you very much. |
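To make the number-to-words problem from this talk concrete, here is a minimal sketch for English only. It is not the Soros language and does not use the Numbertext library; it is a plain recursive function written for illustration, limited to the range 0 to 999,999, and the name number_to_words is hypothetical. Each language needs its own rules of this kind, which is the reason the talk describes a dedicated mini-language.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer between 0 and 999,999 in English."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


if __name__ == "__main__":
    print(number_to_words(2023))  # two thousand twenty-three
```

Even this small case already needs special handling for the teens and for hyphenation, and other languages add gender, case and ordering rules on top.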
SmartArt Support for LibreOffice |
I'm Hossein Nourikhah and I'm the developer community architect for The Document Foundation, TDF. Today, I would like to talk about SmartArt support for LibreOffice. So here's the list of contents that I will talk about in this talk: the support status in different aspects, from import and display to saving and editing, then ODF compatibility and application support. And then I will discuss some of the remaining bugs. So SmartArt is a feature that was introduced in MS Office 2007, and it is still supported in the newer versions of Microsoft Office. It's a nice way to create complex diagrams with little effort: you type in some text and it is converted into a diagram. You can change the diagram, you can customize it, you can use the same content with different layouts, and that is very interesting, and because of that SmartArt is something that is very popular. LibreOffice had basic support inherited from the beginning, but since then SmartArt import has received many improvements from many people, including Miklos Vajna and Regina Henschel, who were the top contributors in this area. You can also see the FOSDEM 2019 presentation from Miklos to get a better understanding. So today I will discuss the details of the status of this support. On import and display, I should say the import and display are very good. It works nearly perfectly. I have tested the compatibility myself against Microsoft Office Online with many possible layouts; I have tested around 200 different layouts and configurations that you can use there, and the result was very good. I should say in most cases the result is almost the same. But when it comes to saving and editing, there are some problems. Saving back to OOXML, whether it's a Word, Excel or PowerPoint file, is okay if no modification is done. You can also convert the SmartArt into a set of shapes, but this is a one-way road and you can't go back.
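A minimal sketch of how one could check whether an OOXML file still carries SmartArt after a round trip. It assumes that SmartArt parts are declared in the package's [Content_Types].xml with DrawingML diagram content types, whose names contain "drawingml.diagram" (for example the diagramData+xml type). The helper make_package and its content type lists are hypothetical fixtures so that the example runs without a real file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: SmartArt parts use content types such as
# application/vnd.openxmlformats-officedocument.drawingml.diagramData+xml
DIAGRAM_MARKER = "drawingml.diagram"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def has_smartart(source):
    """Return True if the package declares at least one diagram (SmartArt) part."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("[Content_Types].xml"))
    return any(DIAGRAM_MARKER in el.get("ContentType", "") for el in root.iter())


def make_package(content_types):
    """Build an in-memory package holding only [Content_Types].xml (illustration only)."""
    overrides = "".join(
        f'<Override PartName="/part{i}.xml" ContentType="{ct}"/>'
        for i, ct in enumerate(content_types)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", f'<Types xmlns="{CT_NS}">{overrides}</Types>')
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    diagram = "application/vnd.openxmlformats-officedocument.drawingml.diagramData+xml"
    print(has_smartart(make_package([diagram])))  # prints True
```

Comparing the result before and after saving gives a quick signal of whether the diagram survived or was flattened into plain shapes.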
So if you open the file inside LibreOffice, and if you enable experimental features in the settings, you will sometimes see that there is a border with text on the top which shows "edit element". This lets you edit the SmartArt, but this editing is very limited, and I should say it should be considered very experimental, or experimental only. It's not stable at all, and saving back is not possible right now either. So it's actually not usable for production. But there is an ongoing effort by Armin Le Grand to improve this. The effort is stalled right now, but I hope that we will see the results in the future. When it comes to ODF compatibility, the compatibility is non-existent yet. It's not possible to save a SmartArt to ODF. You can do the saving, and you can save the file into ODP or things like that, but the saving will do a conversion to shapes, so there is no going back to SmartArt. You can move and edit the shapes, but that's not something desirable. So the development path is that we should define similar constructs in ODF, and hopefully standardize them, and then create routines for loading, saving and conversion from OOXML to ODF, back and forth. In application support, Writer is supported and Impress is supported, but you can't open SmartArt that is created inside Calc. So this is something that is still lacking. And talking about the remaining bugs, we have around 25 bugs in the meta bug dedicated to SmartArt. These are mostly minor problems with text rendering, positioning, spacing, font color, rotation, things like that. I should say that these problems do not always happen. Sometimes we see rendering defects, but only sometimes, and although the rendering is not exactly the same, pixel by pixel, as in Microsoft Office, the result is usually good.
And there are also problems with old MS Office support, prior to MS Office 2016, and also the lack of being able to show 3D layouts. So in the text rendering, as you see here, sometimes the output is not similar. For example spacing, either paragraph spacing or character spacing, or different fonts and different metrics; the colors are sometimes different, and there are also sometimes problems with rotation. And we also see minor problems in rendering. So as you can see, as the arrows show, there are little differences, and I hope that the bugs can be fixed in the near future. 3D layouts are also not supported, although the output is still meaningful. And I should say that sometimes LibreOffice even does better. For example, if you look at what the arrows show, the spacing and the internationalized text, which is bidirectional for Arabic and Persian, is better, even better in LibreOffice compared to MS Office. So are there any questions? I would be happy to answer. Thank you. |
Putting the R in LibreOffice: a Shiny dashboard for QA
Using R and the Shiny framework to help the LibreOffice QA community |
Okay. Good. Hi everyone. Welcome to this one. Hopefully it's not going to keep doing that. So today my talk is called Putting the R in LibreOffice: a Shiny Dashboard for QA. My name is Stéphane Guillou. I joined the TDF team, The Document Foundation, just recently, in November, as a QA analyst. And this is about a little project that recently started, to create something that everyone can come to and learn a bit more about QA numbers around the project, around LibreOffice, but also everything else that's hosted by The Document Foundation. So just a little bit about where I come from, because it matters in some way. Quickly: I studied plants, ecology, sustainability. That's my background. Then I went on to working in agricultural research, and then finished the last four years working in a library as a research software trainer and in research support. But throughout the last 15 years I've been a free, libre and open source software user and advocate, trying to push for these solutions, especially in teaching, learning and research environments. And because of that I've been a user of a few different tools, including the programming language R. This is what I use for this project. There are quite a few things already around QA and LibreOffice. So quality assurance numbers and graphs can be found in lots of different places. Here I've listed a few. There's a wiki that has an absolute wealth of different pages and tables and links you can use, and also stats about our QA project, our QA team and community. There's also a blog, a specific QA blog. There's not just one blog at The Document Foundation, around LibreOffice. There's the QA blog, there's a dev blog, there are a few different blogs. And I really recommend having a look at this one, where there are excellent monthly reports with a few different graphs to explore, and also updates on who has been working on what, so what new features and bug fixes are coming up. So it's a great read really.
I think you should subscribe to that one if you're interested. Obviously Bugzilla: this is where we report bugs, and where we triage them and mark them as fixed eventually, if someone's keen to take those ones. We've got QA scripts in our QA repository. Those are our Python scripts to process that Bugzilla data. We've got a crash reports website that has, for example, timeline graphs where you can see how numbers of crashes evolve across different versions. And there's an interesting website as well called devcentral, which gives you quick links to different important tools for development around LibreOffice. Just to give you a quick idea, this is the wiki page, where you can also find ways to keep in touch with us. There are options for chat rooms, but also the mailing list. So if you want to keep hearing about those things, do join that. This is the December 2022 blog post, and you'll find a few graphs in there to see how things are evolving. But at the top, there's a lot about new features and new fixes by different developers. And this is the crash report website. This is what I mentioned just before, devcentral, where you can find links directly to, for example, the wiki, Bugzilla to report a bug, the crash report website, but also other ones like Weblate for translation and Gerrit for submitting your patches. Now, the idea of the QA dashboard: it's an extra tool that's not supposed to necessarily replace other things, but complement them. I've used R and Shiny for it, so R, the R programming language, very useful for data analysis and statistics, and Shiny, the framework for creating web apps that goes on top. So, interactive visualizations and tables using two packages called Plotly and DT; there's ggplot2 in the background too. Quite a few other packages come into play there. And also, very importantly, I don't want it to be just pictures, just nice pictures to look at; it also needs to link to activities, to things that we can do.
And it shouldn't be this bottleneck or this dead end. So I'll show you what it looks like currently; this is a work in progress. We've got a few different files here that we will eventually share once we're happier with what we've got, and we want everyone to use it. Now, there are a few helper functions, a script to preprocess the data, because it is currently based on a dump from Bugzilla that's quite large, and finally the app, which has both the UI and the server functions in there. So I'll open this in the browser, maybe that's a bit better. And I'll quickly show you what we've got. So it starts with this page, where it looks at the snapshot, the most recent data we've got, depending on what export we've used. And we can see, for example, in the first graph here, the most important components of the project, and other projects, not just LibreOffice. You can see that most reports are related to Writer, Calc, LibreOffice in general, and then you've got Impress; you'll see that Base and Draw probably attract a bit less attention. Looking at the importance of reports, or how they've been classified, we've got a graph on severity (thank you) and priority, which work together. You can learn a bit more about those with the links underneath, if they don't necessarily mean that much to you, and you can see that obviously it's going to be bigger in the middle, where the default values are used: normal severity and medium priority. Now, looking at how bugs have been classified, depending on what happened with them, you can see that in general green is what has been resolved, for different reasons: what has actually been fixed by a commit, and what has been resolved with other resolutions. You can see that there are a few different categories where a bug might fall, and you can see the numbers by hovering over them with your mouse. The next one looks more precisely at just that green part that's up there, at how bugs were resolved.
So you might be interested in knowing that most of them have been fixed, but then a lot of them have been closed as duplicates because they'd already been reported before. Quite a few are closed as WORKSFORME because the problem has been resolved without necessarily knowing how it was resolved. We've got a graph on the first version that's been affected, as far as we know. You've got all the major versions of LibreOffice here, and the creation time as well as the last time of modification. And, for example, here there's a link to action: a list of bugs that haven't changed for more than five years. These are potentially bugs that we can very quickly close. You can use this date range picker to focus on one particular period. You can remove enhancement requests, and invalid reports are automatically removed as well. But I also want to work with timelines. These are often the graphs that we've got in the QA report, the monthly one. And here we can see how unconfirmed bugs have evolved over the last 13 years. If you're interested in just the last year, you can toggle those. And if you want to look good, you can remove the zero from the axis, and it looks like it's really dropping. And then, thank you. I've got a table here that looks at that specific snapshot. So that stops on the second of Feb, I think. First of Feb, maybe. There it is. There's the date. It tells you. And you can look for different bugs here. Bugzilla might be a bit daunting sometimes when you do a search. This table is a condensed table; it doesn't have all the information. But if you're interested in, for example, dark mode bugs that are unconfirmed, you can use those filters at the top. It might take a bit of time because there are thousands and thousands of them, but it should come up with a short list of those unconfirmed bugs for dark mode. There we go. We've got four of them. You can click on that link. It takes you straight to Bugzilla.
You can sort them by whatever you want, or look at only a specific version of LibreOffice. The next steps would be those ones. Thank you. And if you want to contact me or contact the QA team, there's options here. Yeah, thanks. Have a great afternoon, everyone. |
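Editorial sketch: the two filters demonstrated in this talk (open bugs untouched for more than five years, and unconfirmed bugs matching a keyword such as "dark mode") can be expressed in a few lines. The dashboard itself is written in R and Shiny; the Python below is only an illustration, and the `Bug` record, the field names, the status list, and the sample data are all assumptions rather than the dashboard's actual schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Bug:
    # Hypothetical, condensed record; a real Bugzilla export has many more fields.
    bug_id: int
    status: str
    summary: str
    last_changed: date


def stale_bugs(bugs, today, years=5):
    """Return open bugs whose last modification is more than `years` years old."""
    cutoff = today - timedelta(days=365 * years)
    closed = {"RESOLVED", "VERIFIED", "CLOSED"}  # assumed set of closed statuses
    return [b for b in bugs if b.status not in closed and b.last_changed < cutoff]


def unconfirmed_matching(bugs, keyword):
    """Return UNCONFIRMED bugs whose summary mentions `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [b for b in bugs
            if b.status == "UNCONFIRMED" and needle in b.summary.lower()]


# Made-up sample data, for illustration only.
SAMPLE = [
    Bug(1, "NEW", "Crash when pasting table", date(2015, 3, 1)),
    Bug(2, "UNCONFIRMED", "Dark mode: unreadable sidebar text", date(2023, 1, 20)),
    Bug(3, "RESOLVED", "Dark mode icons missing", date(2016, 6, 5)),
    Bug(4, "UNCONFIRMED", "Slow autosave", date(2017, 2, 2)),
]
```

Calling `stale_bugs(SAMPLE, date(2023, 2, 1))` selects the candidates for quick closing, and `unconfirmed_matching(SAMPLE, "dark mode")` mirrors the table filter shown at the end of the demo.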
Cleaning up the unittest code mess |
Hello, my name is Xisco Faulí. I work for The Document Foundation as a QA engineer, and today I'm going to explain how, in the last month, I cleaned up the unit test code that we use. So how it all started: this is a report from Jenkins, and there was one test, this CppunitTest, sc_vba_macro_test, failing on Windows. It had been failing for quite some time, every now and then, and we didn't really know what the problem was. In the end it turned out that the tearDown method disposing this mxComponent was missing, and eventually it was fixed thanks to Noel. So I started to look at other places where this kind of problem might happen in the unit tests. Basically, when we want to create a unit test class, these are the four main classes that we can use. This one is the one inheriting directly from the CppUnit library. It is the one you use to create a basic test, and it handles the setUp and tearDown of the test. Then, if you want to work with documents, to load them and so on, you can also use this class, MacrosTest. When you want to work with XML, say you want to parse XML files, you can use that class, XmlTestTools. And finally we have a special class that we use to load and save documents when we only care whether they fail or pass. Most of the classes that we have in the unit tests normally inherit from these two here, or in some special cases from these three. So the idea was to create a kind of wrapper for these two, and then another one for the other three. When I started to do it, a lot of duplicates were found, especially in the setUp and tearDown methods, so these could be removed from all the classes.
Then the same for the mxComponent: each class defined its own mxComponent, so now we only define it in the UnoApiTest class, which removed a lot of duplicate definitions. I also found that we had a special UnoApiTest class for Calc for no real reason, so this was merged into the UnoApiTest class, and I also factored out other methods, like the ones for parsing a PDF export, executing a macro, or creating a template. The Impress and Calc modules had their own methods to load and save files. They were actually copy-pasted from the FiltersTest class, and they were very complicated to set up: you had to define all the formats that you wanted to use, then create this array with all the formats, and finally the load of the document looked something like this, where you had to define the clipboard, the version, and so on. In the end it was really complicated, and we already had a method for that. So now, instead of returning a DocShell reference, we just load the document, and save and reload, which makes it easy. The problem before was that after we were done with the test we needed to close the reference, and in some tests that was missing, so that's not a problem anymore. There were also many places where this kind of chunk of code was used to save and reload the document; those are gone, and there were many of them, so now it's just one line to do that. The good thing is that now we always validate the files on export, and in case there is a problem with the validator we can also use skipValidation. There is also support for password exporting and importing, which in the past was only available in the Writer module. The same goes for parseExport: sd, sw and sc used their own ways of handling this, while we already had it in MacrosTest, so there was duplication for that too, and right now we just do it like that.
There were also operators that were duplicated, for instance this one for comparing colors, and there was also one for comparing rectangles; they were duplicated in Impress, Calc, and Writer, so all of them are gone now. There was also a problem with this unit test. The way it works, it compares the calculations with OpenCL enabled and disabled, and I found out that before, the comparison was always made with it enabled, so the test didn't actually test anything; that's fixed now. We also had some custom asserts, like this one, ASSERT_FORMULA_EQUAL, which can be done as a string comparison just using getFormula. So in the end a lot of duplicated code was dropped, and that's it. Thank you. I think we still have time for maybe a quick question, if anyone has one. Okay, thank you. |
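Editorial sketch: the core of the cleanup described in this talk is that the fixture owning the loaded document lives in one shared base class, so no individual test class can forget to dispose it in tearDown. LibreOffice's real tests are C++ and CppUnit; the Python `unittest` analogy below only illustrates the pattern, and `FakeComponent`, `DocumentTestBase`, and `MacroTest` are hypothetical names, not LibreOffice classes.

```python
import unittest


class FakeComponent:
    """Hypothetical stand-in for a loaded document component."""
    open_count = 0  # number of components that have not been disposed yet

    def __init__(self):
        FakeComponent.open_count += 1
        self.disposed = False

    def dispose(self):
        if not self.disposed:
            self.disposed = True
            FakeComponent.open_count -= 1


class DocumentTestBase(unittest.TestCase):
    """Shared fixture: owns the component and always disposes it."""

    def setUp(self):
        self.component = None

    def load(self):
        # Dispose any previously loaded document before loading another one.
        if self.component is not None:
            self.component.dispose()
        self.component = FakeComponent()
        return self.component

    def tearDown(self):
        # Runs after every test, so subclasses cannot leak a component.
        if self.component is not None:
            self.component.dispose()
            self.component = None


class MacroTest(DocumentTestBase):
    """A concrete test class needs no setUp/tearDown of its own."""

    def test_load_twice(self):
        self.load()
        self.load()
        self.assertEqual(FakeComponent.open_count, 1)


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MacroTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

After `run_suite()` returns, `FakeComponent.open_count` is back at zero even though the test body never disposes anything itself, which is the property the missing tearDown had broken.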
Crashtesting LibreOffice in the backyard |
Okay, hello, I'm Gabor Kelemen from allotropia, and I would like to talk about how we do crash testing in the backyard. So let's start. What even is this crash testing? Crash testing is a QA subproject of TDF, and it's run on the master branch around every second week. What does it test? It continuously opens files, saves them in different formats, and reopens those saved files, and makes sure that crashes don't happen during this workflow, because that's kind of bad from a user's perspective. Usually this process sends a results mail to the developers' list, and interested parties, mostly Caolán, fix those crashes, and that's good on the master branch. But what if the long-term support branches, which we maintain for customers, introduce such errors? That would be kind of bad for customers, so we wanted to avoid that in the longer run. Okay, so what are the prerequisites for this work? First you need some hardware, a beefy system with many CPU threads, because there are a lot of files to test, and of course a bunch of files, tens of thousands, which can be downloaded using scripts in the core repository. Then you need the crash testing scripts themselves. You need to download them onto the beefy system, configure them to build LibreOffice, run them on the set of files you have just downloaded, and also interpret the results. So this is how the beefy system looks in real life. It's not entirely in the backyard but on the couch, let's say. It needs some 40 CPUs or so, and a lot of disk space as well. The next, second step is downloading files.
The first script is called like this in the core repository, and it downloads user-made file attachments from public bug trackers such as the TDF and Apache OpenOffice Bugzillas, Linux distribution bug trackers, and office software bug trackers such as those of KOffice, Gnumeric, and AbiWord. It has some lovely, or less lovely, user-unfriendly properties: you need to install some extra Python modules, set an environment variable to tailor it to your hardware so that the download happens quite quickly, and run it from the download target directory; but those are all the caveats of this script. The second script is a website scraper called like this. It also needs some Python modules. You can add the target directory as a parameter, and some Microsoft Office themed forums will need registration before this can work on them, and the login data needs to be stored in an INI-format file. Next, getting the crash testing scripts themselves. They are not in the core repository but in this other one, the contrib dev-tools repository, in the test-bugzilla-files directory. So, how to make sense of that. Configuring the environment is also very important, and this is the most difficult part of this talk: before you start running the scripts, you need to configure the environment with this config file. It needs to be placed on that path. There are some defaults in the dev-tools repository, but you should overwrite those. The most important settings are the compiler and the compiler version; GCC or Clang work the same in this regard, but you need to take care that the old version of LibreOffice you want to compile actually compiles with your compiler. You need to set the parallelism, how many CPU threads you have, with the workers environment variable, and the most important thing is the paths for the scripts.
So we need the location of the files to test, which were downloaded two slides ago, and after that you need to hard-code the dev-tools repository path with this path variable. Next is the source directory for the LibreOffice core repository clone, which you also need to compile, and you need a build directory where the output of the compilation will go. In the build directory you also need to place the autogen.input file; it's also in the dev-tools repository. And of course you don't want to send the usual email and upload the results to the TDF site, because this is internal to the company, so you need to set these two other variables. Okay, next it's easy: there is a crash test data variable for the downloaded files, you need to copy your files there and execute the commands shell script, which will do all the heavy lifting, and basically that's all. The results will be in the crash test data directory under logs, and the mail.txt file will be the summary of the run. The next step is finding what went wrong and fixing the actual crashes, which is just the usual backporting and bug fixing. So what does upstream gain from this work? There are some gains: I made these scripts a lot more configurable, so you can set them up more easily at other companies. Before that, they were only able to run on the TDF server, and it was kind of a pain to transplant them to another machine. There's also a little bit of performance gain: there was a bottleneck, so upstream can also run this work more quickly. And that's all, thanks for the attention. |
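Editorial sketch: the workflow this talk describes (open each file, save it in several formats, reopen each saved file, and record where it broke) has roughly the shape below. This is not the dev-tools crash testing script: `load` and `save` are hypothetical callables supplied by the caller, and failures are modelled as Python exceptions, whereas the real setup watches for crashes of the LibreOffice process itself.

```python
def crashtest(files, load, save, formats):
    """Load each file, save it in every format, and reload the result.

    `load(path)` returns a document object and `save(doc, fmt)` returns the
    path of the exported file. Both are hypothetical callables that raise an
    exception on failure. Returns a list of (file, stage, message) tuples.
    """
    failures = []
    for path in files:
        try:
            doc = load(path)
        except Exception as exc:
            failures.append((path, "import", str(exc)))
            continue  # nothing to export if the import already failed
        for fmt in formats:
            try:
                exported = save(doc, fmt)
            except Exception as exc:
                failures.append((path, f"export:{fmt}", str(exc)))
                continue
            try:
                load(exported)  # round trip: the saved file must reopen
            except Exception as exc:
                failures.append((path, f"reimport:{fmt}", str(exc)))
    return failures


# Stub loader and saver, used only to exercise the loop without an office suite.
def demo_load(path):
    if "broken" in path:
        raise RuntimeError("crash while loading")
    return {"source": path}


def demo_save(doc, fmt):
    if fmt == "docx" and doc["source"].startswith("b"):
        raise RuntimeError("crash while saving")
    return f"{doc['source']}.{fmt}"
```

The returned list plays the role of the summary mail: one entry per file and stage that failed, which is what a developer would then triage and backport fixes for.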
LibreOffice Dark Modes
multi-platform support was surprisingly difficult |
We're off to dark modes. Just to have a quick look at the various bits that I'm talking about: on the right-hand side we've got the widget tree, that's what I'm talking about mostly, and then around the document, the application background, and the document colors themselves. I'm calling those application colors and I'm calling the rest the widget tree. That's a screenshot of dark mode in GNOME, our GTK version. As an overview of most cases, the issue is that you have to inform the platform that your application is opting into dark mode. You have to determine the application colors, the ones that you're going to use for your document background, and all the rest have to be extracted from the theme somehow. We have custom widgets in various places, and they all need to know the appropriate foreground and background colors to use. The most common obvious problem is that everything has been light mode so far. People have just decided to draw with some arbitrary color, knowing that the background is light, and that has worked perfectly fine for them until now; now they have to explicitly set the background. If you're a developer here and you're doing previews or any custom widgets, just make sure you're setting both colors, foreground and background, together, not just the foreground with the assumption that your background is suitable, because it won't be anymore, and it isn't. On to the conversion of GTK to dark mode. There are two different ways of doing things. In GTK we're using actual native GTK widgets. On the other platforms we're using our VCL widgets, and we theme them to look like the actual platform widgets, but they're not the platform widgets, they're VCL widgets. In the GTK case they actually are GTK widgets, which means that the GTK conversion to dark mode is pretty easy. We just listen for the dark mode change on the platform, on GNOME.
We find out whether it's dark mode, and we tell GTK that we want GTK to be in dark mode, and then the whole GTK widget tree looks after itself, so all of that stuff just draws itself in dark mode automatically. We've nothing to worry about. All we have to do then is extract the colors from the theme, tell our custom widgets what foreground and background colors we want, and use the application colors in the appropriate locations. That's all OK because, in the case of GTK, any of the stuff we were using to query for theme colors continues to work in dark mode. So if we put GTK into dark mode and we ask it for an application background, we get something that we can use right away, which would seem obvious, but it's not what happens on the other platforms, and that's where things get quite difficult. macOS is medium difficulty. When you opt in by telling macOS that you're dark mode enabled, and then you query the macOS themes for colors, you get results back that are meaningful, as long as you stick to the non-deprecated ones. In some of the cases on Mac we were not querying the theme for background colors, we were just using hard-coded colors, so if we update things to query what we should be using for the application background color, and so forth, we get something meaningful in dark mode as well, and it's all easy for those cases. For the widgets here, though, things become more difficult. On Mac, what we have been doing is trickier. There is a drawing interface we've been using all these years to draw onto our VCL widgets what the Mac buttons should look like. Those are the APIs there. If you go and Google for the API, you'll find no documentation for them.
There seems to be no acknowledgement on any of the Apple documentation pages that this API exists or ever existed, and the problem is that even in dark mode it draws in light mode, so any of the stuff we were using on Mac to draw things will only draw light mode. It's all out of date and all unsupported, so we have to replace it with something that works. There's a whole set of other APIs on Mac that we can use, where we basically use the real Mac widgets: we keep a few of them cached, and we ask the real Mac widgets to render themselves onto our VCL widgets without the blank text, so that we get what it looks like. On Mac, some of the things, such as buttons, will only render in fixed sizes, so if we have a VCL widget that's big and you want to render a Mac button onto it, it'll only render into a small slice of it. So you have to make sure that your buttons are the right size, or you have to change the Mac buttons into one of the other supported styles, which will render into whatever space you give them, but in some modes you can't. So it's very tricky. Some of these Mac widgets abort if you try to create them in a thread, so you have to create them in your main thread and keep them for when you do want to draw on a thread later on. You can make it work. We're doing things much more at a distance when we're working with the full Mac widgets: we can't render a small part of a scroll bar, you've got to render the whole scroll bar, basically, so working with things is a bit more difficult than it was in the past. It means that our work for tab panes and scroll bars is particularly complicated. We've got these focus rings. Our focus rings, even though we attempt to draw them the way the documentation suggests, are narrower than they should be, so they're not as prominent as I'd like them to be.
There's, again, this drawFocusRingMaskWithFrame call, which is supposed to draw these things, but when I try it with different widgets, it works with some and doesn't work with others. There's a lot of trial and error with a lot of this stuff. This feature for wallpaper tinting, where one thing behind the other shows through, I never got to work. The accent colors work out of the box, so let's have a look at a screenshot of that. On Mac it all works fine, so that's what it looks like now, using the new APIs: accent color, in my case I just picked one, application colors, and so on. It all works fine. The changing of all the drawing stuff means that light mode has changed as well, hopefully for the better, but it has had an effect there. Windows then has the highest difficulty, starting with opting in, telling Windows that you are a dark mode application. For any of the stuff that we've been using all this time, over the last 20-plus years, all those APIs, there isn't an obvious way that Windows has given us to support dark mode. And yet, if you actually use Windows in dark mode and you launch the file dialog, you'll see that it's all the old widgets, but it is using dark mode. File Explorer works in dark mode as well, so they do it for their own applications. There are various ways of hacking this with undocumented stuff, which we pulled out of that URL, and other projects are all doing it too, so there's a whole hacky way of doing this that is all based on undocumented ordinals, which is very unsatisfactory and fragile. Well, that's what we're doing.
These are the APIs that we're using there. Unlike the Mac case, the APIs still work and they will give you dark mode, but only for certain things: only if you pretend to be one of the two things that we showed on the previous slide, Explorer or the CFD, the common file dialog, will you render in dark mode, and then only for certain widgets, the ones that exist in those two applications. So you have to restrict yourself to the palette of widgets available in those two cases, which is fine, scroll bars are fine, but you have to hack them in like that. It's very unsatisfactory, but it does function. Unlike Mac, if you query for specific colors using the existing APIs, you get back the light mode colors, so you have to update that, and again you're limited to the widgets that were listed earlier, so you have to set the Explorer theme or the CFD theme and then query for what you know exists. The big gap is that there are no notebook or tab panes available in either of those applications, so in those cases we're falling back to basically being a button, which is why we have some complexities around that. Dark mode on Windows: that's what it looks like. And then there's final stuff, which is not dark mode and which is even more complicated, when it comes to the accessibility high contrast mode, so if anyone here knows about accessibility, let me know at some stage and we'll figure it out. So you've got complexities, that's why it's done the way it is. That's the end, thank you. Thank you. |
Turbocharging an elephant. Making LibreOffice faster. |
Okay, my name is Noel Grandin. I'm talking about the work that I've been doing making LibreOffice faster. We all know that LibreOffice is a bit of an elephant. It's a cute elephant, but it's still a bit of an elephant. We've got 10 million-odd lines of code, some of which is 20 years old. Practices that were perfectly acceptable 20 years ago are a little bit trickier these days. Optimizations that made perfect sense back in the day are not great anymore, and things have just changed around. For example, CPU-to-memory bandwidth has dramatically changed. From around about the era of the 486DX2, we suddenly saw a dramatic shift in the speed of main RAM versus the speed of your main CPU. Up until that point, you were looking at CPUs that could touch a location in memory at about the same speed as they could touch local cached memory, so you wrote your algorithms around that. The DX2 changed things, and it only got dramatically worse after that, to the point now where touching main memory can be anywhere up to 50 times slower than touching something in L1 cache. Cache-friendly algorithms are now the new game in town. We now have multiple cores. Everybody has multiple cores. It used to be that only people with vast sums of money had machines with multiple cores or more than one CPU. Now, everybody has one. So locking algorithms that made perfect sense because they only got touched by a handful of people suddenly get exercised by everybody, and all sorts of interesting flaws come up. An interesting note: for the locking algorithms that were written back then, nobody actually knew if they were truly solid or not, because Intel's own engineers refused to commit to a cache coherency protocol until somewhere around the Pentium era. Up until then, they ruthlessly refused to say anything about it, and if queried, they would just say that it wasn't something they had locked down yet. So you were kind of in the dark. So we end up where we end up. So, the good news.
I've been at this for a little while, and the worst and ugliest of the stuff is largely gone. I mean, there are still lots of performance issues left in LibreOffice, but the stuff that used to hang LibreOffice up for minutes at a time, I've managed to nail most of that down. There are still lots of places where it's slow, but it's not heinously slow like it used to be. The bad news is that the remaining stuff is really, really ugly. It's typically the stuff that I just couldn't get my head around the first time, and some of it I've looked at again, and I still can't get my head around it. Some of it I've been beating on for a while, and I'm not making a lot of progress. But anyhow, the point of this exercise is that you keep beating on things, and eventually either you break or it breaks. Hopefully, you get the right thing. So what worked out nicely? I'm going to talk today about what worked out well and what worked out worse, because I think it's important that we share both the successes and the failures; you learn equally from both. So what worked out nicely is reducing allocations, primarily with things like std::optional. For example, if we were allocating a temporary data structure inside a subroutine, I would switch that to declare the thing using std::optional, which does have the downside that it takes up space on the stack even if you're not using it. But on the upside, using the stack is about 100 times faster than allocating from main memory. The reason is that any time you have to allocate out of the heap, you have to take a lock, which leads to locking contention, et cetera, and you have to talk to main memory because you're doing all sorts of stuff there, whereas the stack is pretty much guaranteed to be in L1 cache, so allocating and deallocating out of there is just ridiculously fast.
So you can allocate quite large amounts of memory before you start falling out of L1 cache. We're talking up to a couple of kilobytes here. So throwing this sort of stuff at it worked well. The other thing I did was convert usages of our internal osl::Mutex class to the std::mutex class. Now, despite the naming, we're talking about two different kinds of mutexes. Our internal mutex is what's normally called a recursive mutex, std::recursive_mutex in the standard library, whereas std::mutex is a non-recursive mutex, and it is considerably faster. In the uncontended case, std::mutex is about 20 to 50 times faster than std::recursive_mutex. In the contended case, they tend to fall back to roughly the same cost. So given that we run most of the time with very, very little concurrency, except for an occasional use case here and there, std::mutex is a major win. The downside with std::mutex is that if you use it and you try to lock that mutex a second time, you get a deadlock. So you have to write your code quite carefully. Most of our usages converted very easily. I did make a couple of mistakes, so there were a couple of regressions I had to fix, and in a couple of cases the regressions were simply unfixable, so we backed the change out again. But in general, it's a win. And as a side effect, std::mutex is allocation-free. It's literally just a single word in memory. And the common uncontended case of taking a std::mutex is what's called a CAS operation, compare-and-swap. So it's really fast, and it uses even less memory than osl::Mutex. So a win all around. What didn't work out? We've got this svl::SharedStringPool. It's a really nice thing. We use it in spreadsheets primarily, because spreadsheets often have many tens of thousands of strings which are identical across all the cells, so we store references to those strings in the shared pool.
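Editorial sketch: the behavioural difference between the two mutex kinds discussed above (a second lock attempt by the same thread succeeds on a recursive mutex and fails on a non-recursive one) can be observed safely with a non-blocking acquire. Python is used here only as an analogy: `threading.RLock` stands in for a recursive mutex and `threading.Lock` for a non-recursive one. The sketch says nothing about the 20 to 50 times speed difference quoted in the talk, which is specific to the C++ implementations.

```python
import threading


def can_relock(lock):
    """Return True if the thread that already holds `lock` can take it again."""
    lock.acquire()
    try:
        # Non-blocking attempt, so a non-recursive lock reports failure
        # instead of deadlocking the calling thread.
        second = lock.acquire(blocking=False)
        if second:
            lock.release()  # undo the nested acquisition
        return second
    finally:
        lock.release()  # undo the outer acquisition
```

With a blocking second acquire, the non-recursive case would hang forever, which is the deadlock risk that made some of the conversions in the talk need careful review.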
Now, this shared pool gets hit very hard at load time because other people implemented concurrent loading of spreadsheets. So we have this nice stuff where we fire up multiple threads and load a bunch of different chunks of the spreadsheet at the same time. This is great. It speeds things up. But it was bottlenecking very hard in svl::SharedStringPool. So I thought, oh, great, this is a hash map; best case scenario, we can stick a concurrent lock-free hash map in there. I found this great cuckoo hash map written by some very bright people, much smarter than myself. I stuck it in there. Oh, it was great. It worked out brilliantly. It completely destroyed the bottleneck. Spreadsheet loading speed went up by a factor of two or three. And I thought I had a great win. And then the first bug came in, an edge case where the locking wasn't quite right, because we were doing concurrent access now. We had two different maps, and they were both talking about the same thing. So I had to fix those edge cases. I reworked it and fixed those bugs. I reworked it again and fixed more bugs. I eventually got it working. And then another bright guy, Luboš, came in and said: but wait. The string pool is indeed a hotspot, but the hotspot is not actually the hash map inside the string pool. The hotspot is the uppercase conversion. He improved the uppercase conversion, and by the time he was done, my concurrent hash map was making no difference whatsoever. So I had identified a hotspot, but I hadn't identified the hotspot. So we backed it out, because there's no point in keeping a highly complicated piece of equipment like that around. I was very sad to see it go, but I learned some stuff in the process. So no harm, no foul. And we backed it out again. And we're back to being faster than we were before.
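Editorial sketch: a shared string pool of the kind described above stores each distinct string once and hands out references to it, and the uppercase conversion that turned out to be the real hotspot only needs to happen once per distinct string. The toy class below illustrates that idea in Python; it is not the svl::SharedStringPool implementation, and the assumption that the uppercase form is itself pooled is the editor's, made for illustration.

```python
class SharedStringPool:
    """Toy interning pool: identical strings share one stored object."""

    def __init__(self):
        self._strings = {}  # string value -> the single pooled object
        self._upper = {}    # pooled string -> its pooled upper-case form

    def intern(self, text):
        stored = self._strings.setdefault(text, text)
        # The upper-case form is computed once per distinct string and
        # pooled too, so later lookups never repeat the conversion.
        if stored not in self._upper:
            upper = stored.upper()
            self._upper[stored] = self._strings.setdefault(upper, upper)
        return stored

    def upper_of(self, text):
        return self._upper[self.intern(text)]

    def __len__(self):
        return len(self._strings)
```

Interning the same cell text ten thousand times leaves the pool with two entries, the original and its uppercase form, and performs the conversion exactly once.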
Redline processing in Writer, which is our document editor, is often very, very slow, especially if there are a ton of redlines in a big document. It's slow both at load time and at runtime, because when we're doing redlines we are often doing massive amounts of adding to and deleting from data structures as we internally render and so on. So I thought I'd try to speed that up. However, this is Writer. Writer is the most cache-unfriendly program known to mankind. It just wants to chase pointers all over RAM. Once you have more than about a 10-page document, its data set is completely falling out of L1 cache, so you have no chance of getting cache-friendly algorithms. The data structures are inherently very, very complicated; human languages and documents just are. The consequence is that we do pointer chasing: we're constantly loading a pointer and then following it to some other location in memory, which is guaranteed not to be in L1 cache. Then we're following that again to something else, which is also not in L1 cache. And in the process we've now blown apart our cache, so if we need the first pointer again, that's also not in L1 cache. So we just chase our tails around in very slow processing. I did my best to fix this, and that involved trying to speed up something called a node index and a content index. And I failed. I am now three levels deep, fixing different things related to that, with no end in sight, and I'm currently bottlenecked on a piece of processing that I just don't understand. So that didn't work out. But in the process, I now know a lot more about Writer, and so I consider that a win. Thank you. |
Feature Locking and Feature Restriction
Integrator's way to unlock potential |
Hello, I'm not Pranam, but Pranam couldn't be here because of visa problems and things like that, so I will be speaking on his behalf, and I also worked with him on this feature, so I think it's okay. I'm Pedro, by the way, Pedro Silva. So what are feature restriction and feature locking? This is a feature developed by Pranam within Collabora Online, and the main goal was to add an additional lever, an additional option, for any sysadmin. Basically it's a switch, or perhaps better described as a set of switches, that allows a sysadmin to hide a set of features. Maybe you want to restrict the usage of your resources, or maybe you just want to prevent users from, I don't know, formatting a cell or something like that. There are a lot of use cases for hiding those features, which is useful. So that is always a straightforward on/off switch. But what if you do not want to hide the feature completely, you just want to lock it? Imagine, again, a sysadmin, a hoster, an integrator who can offer a higher ceiling for system resources, but that higher ceiling needs to be paid for by the user, while they still want to offer the initial hosting that they normally have, packaged with a completely gratis set of software. This is where feature locking comes in: as it says here, it basically adds this lock icon. In an imaginary hosting scenario, the user has a free hosting plan or something like that, and has a limit on system resource usage. Then, after clicking any of these locked icons, the user sees a pop-up inviting them to get a proper hosting account. So there are a lot of interesting use cases here. But how this is done, that's the interesting part, I would say. There are these two keywords, IsUserLocked and IsUserRestricted.
And they are just, again, as I said, switchers, so on, off, through, false. And this can be set in the cool WSTXML, so it's very easy, just open cool WSTXML, you can even do it locally, you can try it, you can even lock it and then call your, I don't know, brother or cousin that normally screws up all the formatting and tell him try to now screw up and he cannot because all those formatting options are here. So it's very interesting a joke to do. And how this looks like, yeah, so this is an example, you see, he cannot mark any text bold, even if he tries very hard, so it's an example or he can, but you see the italic option disappears. See the difference between restricted and locked, these cool WST options stated here. So I think it's pretty much it. Thanks for having me for the presentation. |
An Interoperability Improvement in LibreOffice Impress Tables |
Hello everyone, I'm Sarper Akdemir and I work for Collabora Productivity. This talk will be about an interoperability improvement in LibreOffice Impress tables. This is a screenshot from the bug report, so let's talk about what the problem was on the surface level. On the right you can see how it appears in PowerPoint, and on the left you can see how the PPTX file exported by PowerPoint appears when it is opened in Impress. What we need to focus on here is that the rows pointed to by the arrows are somehow shrunk when the file is opened in Impress, and the reason wasn't immediately obvious. If we look at how PowerPoint exports the table layout into PPTX files, we realize that it defines row heights, so let's focus on that part, but it doesn't really define a total table height to fit inside. While importing these row heights, we first calculate what all of the row heights add up to, because we use the total table height to lay out a table. So we calculate that by adding up the row heights, and we also assign each row its own height. But it turns out that in this specific exported file, the row heights of the empty rows are correct, there is nothing wrong there, but if you look, for instance, at the one row that says COLA, it appears to have a row height of zero, which doesn't really make sense. Something else to mention here: if we start typing anything in one of the empty rows, no text properties have actually been imported from the PPTX file, they are somehow lost in the process. When we start typing in the empty rows in PowerPoint, the text has a blue color and a different size, but if we type in Impress, it just gets the defaults, so it's black and 18 points. So before understanding the problem, we need to know a little bit about how Impress lays out the table.
The table is basically fitted into a given total height, but while doing so we also care about what each row's height was. So we need to know the total height correctly, because if it is smaller than the rows add up to, we end up trying to shrink the table. When that happens, what Impress tries to do is adjust the row heights proportionally to what they were before, and if there is text inside a row, there is a minimum height it can go to. So in this case, since it can't shrink the line that says COLA any further, it shrinks the empty lines, because it can, to try to fit into that wrong table height. Also, if we explore the PPTX output, we realize that the empty cell properties are exported in end paragraph run properties. We need to import that. There wasn't an implementation for that yet, but I will discuss this a little bit later. If you look at the problem in detail, what we need to fix is this: during import, we need to somehow take care of these problematic row heights, which don't really have a height of zero but are defined as such. For empty cells, we need to know the text sizes too, and we need to get them from the end paragraph run properties. Also, there were some previous attempts at fixes here that altered the layouting code instead of the import code, and they caused some regressions that messed up some table resizing functionality. We need to revert those changes. So basically, it turns out that PowerPoint exports desired row heights. For instance, if you try to drag a row to zero while it has text in it, it doesn't let you do it, but it actually saves it as such. So, in the end, that creates a problem for us.
So, to fix this, what we can do, and what I did, was that during import, before laying out the table into an area, which is what Impress usually does, we do a pre-layout: we just take the row sizes, we don't give the layouting code any area to lay out into, and we let the table lay itself out. So it basically just looks at the row sizes and expands them if they are smaller than possible. And that gives us a final height that we can use in successive layouting attempts. So we correct the total table height by doing that. And we also don't have the text properties for the empty rows. The problem there, it turns out, was that these text properties are actually imported into text nodes, but when new text is typed, it inherits its properties from the cell's own properties instead of the text nodes. The text nodes are just dumped and new ones are created. So, to fix that, we need to push the text properties from the end paragraph run properties into the cell's own properties to make it work correctly. Well, to finish up: with these fixes, we were able to get rid of the problematic, regression-causing code in the layouting, and we moved the conceptual fixes from there into the import code, making it work correctly. Additionally, some unit tests covering those areas were added to make sure the end paragraph run properties stay correct. Thank you. That's all from me. |
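The effect of a wrong total height, and of the pre-layout pass described in the talk, can be illustrated with a small Python sketch. This is not the Impress layout code: the function names, the sample numbers and the rule that rows shrink in proportion to their remaining slack are all assumptions made for illustration.

```python
def layout_rows(desired, minimum, total):
    """Fit rows into `total`; rows shrink in proportion to their slack,
    but never below their minimum height (assumed shrinking rule)."""
    heights = [max(d, m) for d, m in zip(desired, minimum)]
    excess = sum(heights) - total
    if excess <= 0:
        return heights  # everything fits, nothing to shrink
    slack = [h - m for h, m in zip(heights, minimum)]
    total_slack = sum(slack)
    if total_slack == 0:
        return heights  # no row can shrink; the table overflows
    cut = min(excess, total_slack)
    return [h - cut * s / total_slack for h, s in zip(heights, slack)]


def pre_layout_total(desired, minimum):
    """Pre-layout pass: no target area is given, so every row simply
    expands to at least its minimum height; the sum is the real total."""
    return sum(max(d, m) for d, m in zip(desired, minimum))


# Hypothetical table: row 1 holds text but was saved with height 0,
# the other rows are empty rows with a small minimum height.
desired = [100, 0, 100, 100]
minimum = [10, 40, 10, 10]

# Summing the saved heights gives a total that is too small, so the
# empty rows get squeezed to make room for the text row.
naive = layout_rows(desired, minimum, sum(desired))

# With the pre-layout total, every row keeps its intended height.
fixed = layout_rows(desired, minimum, pre_layout_total(desired, minimum))
```

In the naive case the text row stays at its minimum while the empty rows give up height, which matches the shrunken rows in the bug report; with the corrected total nothing needs to shrink.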
Writer Content Controls -- what happened in the past half year |
So this talk is a follow-up to the one I gave at the LibreOffice conference last September, which was about content controls in Writer in general. Some of the follow-up work was expected, some of it was more of a surprise, and a couple of incremental improvements appeared in the past half year, so it seemed like a good idea to give an overview of where we are compared to where we were in September. For those who don't know me, I'm Miklos Vajna, from Hungary. I used to be very much involved in the Writer RTF import and export, not so much these days, but I still focus on Writer, and I work for Collabora. As for content controls, for the scope of this talk we talk about rich text content controls. This is for form filling. We already had input fields in Writer, where you can provide some placeholder text and mark that this is the place in the document where you can type when you fill in a form. But one big limitation was that they were built on fields, and fields can't have formatting, so it was really just for plain text. Where Writer really shines is rich text, so we wanted something that provides rich text, and that's where rich text content controls come in.
The OOXML specification calls these structured document tags, but it's really the same thing. The Word user interface calls them content controls, so we also call them content controls. The way it's structured is that once you have a paragraph of text, you can have multiple text portions inside it. Let's say you have some normal text, then some bold text, and then normal text again: then we split up the paragraph text into three portions, the normal one, the bold one, and again the normal one. For fields the restriction was that you can't have multiple text portions inside the part that is to be filled in, and content controls support this. So you can have multiple portions, although they are limited to a single paragraph, at least the inline content controls that I'm talking about. So you can't have a content control starting at some random point in the document and ending at some random point. Perhaps you know that you can do that with bookmarks, which might start inside a table and end outside the table, and fieldmarks can provide the same thing. Content controls are intentionally limited to be inside a single paragraph. We enforce that when you create them, we enforce it when you edit them, and we enforce it during exporting to DOCX and ODT, so this is an invariant that we want to maintain. Another complexity is that it's possible to do nesting. When you look at how we write this in XML, XML elements naturally support nesting, and we call this well-formed nesting: the outer content control starts, then the inner one starts, and then it's a requirement that the inner one finishes before the outer one finishes. So you can do nesting, but the one that starts last has to finish first, and the one that starts first finishes last, similar to what you know from HTML, for example. So we want to support this setup where you can do nesting, but not in a random order, and you can include
multiple text portions, but not at random positions: the start and the end have the constraint that they are in a single paragraph. And if you have fields, then fields typically have some kind of instruction text, like a command, and then there is the field result. Content controls are more like annotations on a piece of text: you have some start and end, and then you can have a bunch of properties on top of that. We will see that you can give it a title, you can give it an alias, you can define the type, and so on. Rich text is the simplest type, where you just say that you can fill in something here. If the task is to provide your one-line comment on this FOSDEM presentation, and you say it was really bad, like really, really, then you select the "really" and mark it bold, because it was really that bad. But you can't do multiple paragraphs, so you can't write a novel on how bad it was. So that was rich text. Somehow the picture is missing its top pixels, so you miss the whole point, but I will explain what you should see there. I think the Word user interface calls this a title, but in the markup we call it an alias; it's the same thing. The point is that you have some complex form and you are supposed to fill in the date, the date, the date and finally the date. Of course what they mean is that you are registering your company, and there was a date when you created the company, a date when you filed the papers for it, a date when the first employee was hired, and so on, but the form just says "the date", so it's very confusing to fill in. What content controls can provide is that when you enter the content control, there is a small pop-up, similar to when you edit headers and you get the name of the page style and so on. So you get a tooltip explaining what exactly you are filling in, and that might be helpful. So let's say the text would normally just say that you need to enter information about the, let's say
the birth data, but that means that they want the birth location and the birth date, and then when you enter the content control they can give you this hint so that it's a bit easier. So the output, the filled-in form, can conform to what was expected, what some regulation requires, but when you try to fill it in as a mere mortal, you are actually able to do it, because you don't need to look up some 10 pages of documentation on how to fill in that form. You go to the form and you get enough helpful context information so that you can just do it. These aliases and tags were initially missing on the Writer side, and now we support them. Then one other problem was this: I mentioned that you can have multiple text portions inside a single content control. What you see in the screenshot above is that we have an X character, then a line break, so we kind of hack around it; technically it's still a single paragraph, but you see it's on multiple lines. Then we even define a tab stop and then a tab portion, so technically it's still a single paragraph, but you see that this is at least three different text portions. We used to take each and every text portion and emit a PDF widget for it, so in this case, when you exported this to PDF and wanted to fill in the PDF form, you got three different widgets, which is nonsense; this was not the intention. So now, even if the placeholder text consists of multiple portions, we take the bounding rectangle of the content control and just emit a single widget, as the user would probably expect. Then another thing: the primary use case we had in mind was that you create some editable Writer document, and at the very end you export it to PDF, and the actual form filling will probably happen in some PDF reader. But you might also have a slightly different workflow where you mark most of the document as read-only, and then you can have
the editable document handed out to users, and they actually fill in the form in Writer. Now the trouble is that if we make the document read-only, then you can't fill in the form, because you would change the content controls, and they are part of the document, and the whole document is read-only, so the content controls are also read-only. Now this was working with input fields before. They had various problems, but this bit was working: they knew that they were an exception from this general read-only rule, so it was possible to fill in input fields. Now we do the same, and we can have this setup where the whole document is read-only, but the content controls can still be edited. Another thing: if you look at what Word provides for VBA when you want to manipulate these content controls, it has an understanding of what the list of content controls in the document is. This can be very handy if you want to have some macro that automatically processes the already filled-in document. There are other ways to do that, but one way is that you write a macro that extracts all the filled-in results from the document, and for that it can just go to the first content control, the second one, or query how many content controls you have in the document. But on the Writer side this is really just formatting on a paragraph, so you would have to scan the entire document to find out if you have any content controls at all. So we didn't have this random access to content controls. Initially we ignored this VBA problem, but when Justin was trying to build a VBA compat layer on top of content controls, he found that there is no random access to content controls, so you can't do this without scanning the whole document, which can be very slow. So this is not great. And we discovered that footnotes actually already provide this: they are also a kind of formatting on some piece of paragraph text, and they have a manager that will track as
footnotes are created and deleted, and then you can quickly get a list of all the footnotes in the document. So why can't we do the same for content controls? And yes, you can do that. So now there is some StarBasic access, or actually UNO API access, which is visible from Basic, and there is also a VBA compat layer on top of that, where you can query how many content controls you have. If you fill in these aliases, you can even say that you want to jump to the birth date content control, without saying that it is the third one, so if you insert something in between, it won't break. So this manager provides what's necessary here. Another thing: initially, when I was adding drop-downs, I wanted to incrementally extend what's available in rich text content controls, so the idea was that if there are list items for a content control, then it's probably a drop-down. But there is complexity there. You can have drop-downs and you can have combo boxes, and from the list items alone you can't say which one it is; if it has list items, it might be either of them. It also turns out it's valid to have a drop-down with no items, and that's what you see there. Now that's working: we explicitly track whether it's a combo box or a drop-down, so you can have both types. We enforce whether it's only possible to choose one item from a complete list, or whether you can also have free-form text there. And if some existing document, for whatever reason, has no list items, then we don't break that, and we don't implicitly turn it into a rich text content control just because it has no list items.
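The manager idea from the talk, keeping a list up to date as content controls are created and deleted so that nobody has to scan the document, can be sketched in a few lines of Python. This is a minimal illustration, not the Writer implementation or its UNO API; the class and method names are hypothetical.

```python
class ContentControl:
    """A stand-in for one content control: an alias plus its filled-in text."""

    def __init__(self, alias, text=""):
        self.alias = alias
        self.text = text


class ContentControlManager:
    """Tracks content controls in document order as they come and go."""

    def __init__(self):
        self._controls = []

    def insert(self, index, control):
        # Called when a content control is created at a document position.
        self._controls.insert(index, control)

    def remove(self, control):
        # Called when a content control is deleted.
        self._controls.remove(control)

    def count(self):
        return len(self._controls)

    def by_index(self, index):
        return self._controls[index]

    def by_alias(self, alias):
        # Lookup by alias keeps working when other controls are inserted.
        for control in self._controls:
            if control.alias == alias:
                return control
        return None


manager = ContentControlManager()
manager.insert(0, ContentControl("name", "Jane Doe"))
manager.insert(1, ContentControl("birth date", "1990-01-01"))
# A new control is inserted in between: the index of "birth date" changes,
# but a macro that looks it up by alias is not affected.
manager.insert(1, ContentControl("birth place", "Budapest"))
```

A macro that asks for the control with the alias "birth date" keeps working after the insertion, while one that asked for "the second control" would now get the birth place.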
I think this is the last one. Hossein was doing lots of testing on content controls, and of course the first thing he tried was some Persian text, and of course it was breaking. I think it had three pieces. One was the positioning: if you have the drop-down arrow on the left-hand side for RTL text, that means that if you take the position of the whole bounding rectangle, you need to shift the button to the left to get the correct position. So that was fixing up the position based on the direction of the text frame, or rather of the paragraph. Then also what we render inside the bounding rectangle, for the arrow button and for that frame, needed fixing. And the last thing is that if you see a button, you might have the silly idea to click on that button, and you expect that something happens. We need to do hit testing to decide if you clicked on the button or not, and we need to handle this correctly for LTR versus RTL. So that's now all working. So this is it: there was some polish since the LibreOffice conference. The feature set it provides is still meant to map one-to-one to the Word feature set. We try to fully save this to ODF without any loss, and you can export it to PDF. There are these various types, you can see a few of them there, and basically more properties were added, there are some small editing improvements, and it's a little bit easier to script now. So that's what we have, thanks for listening. |
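The RTL positioning and hit testing described in the talk come down to mirroring where the drop-down button sits relative to the content control's bounding rectangle. A minimal sketch in Python, assuming the button is drawn just outside the rectangle; the `Rect` type and both function names are hypothetical and not taken from Writer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float

    def contains(self, x, y):
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def button_rect(content, button_width, rtl):
    """Place the drop-down button after the content in reading direction."""
    if rtl:
        # Right-to-left: the button goes to the left of the bounding rectangle.
        return Rect(content.left - button_width, content.top,
                    button_width, content.height)
    # Left-to-right: the button goes to the right of the bounding rectangle.
    return Rect(content.left + content.width, content.top,
                button_width, content.height)


def hit_button(content, button_width, rtl, x, y):
    """Hit test: was the click inside the button for this text direction?"""
    return button_rect(content, button_width, rtl).contains(x, y)


content = Rect(left=100, top=20, width=80, height=12)
```

The same click position therefore hits the button in one text direction and misses it in the other, which is why rendering and hit testing have to agree on the direction.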
Footnotes in multi-column sections |
This is my first conference and I am not very good at English, so forgive me if I say something wrong. I will speak about footnotes in multi-column sections. I worked on this last year, a year ago, and I thought it could be an interesting problem. Oh, I have to go back. Okay, so I got a bug ticket to fix, and at first sight it was a very simple ticket. It was just one line of text, and somehow Writer displayed it as two pages, while Word displayed just one page. Beside the one line of text there was a footnote, its section had two columns, and there was a continuous section break. First I thought it was just a problem with the section break, that somehow it got changed to a page break or something like that, but that was only a first instinct. The strange thing was why the footnote and the columns were needed. If I deleted either of them, then Writer changed to one page and it was okay, but it was strange why they were needed. So I started to debug to find where it becomes two pages, and I found the problem in the layout calculation. There was even a comment about it, stating that the footnote container causes the section to maximize its size in this case. This ToMaximize function calculates whether we have to maximize the column frame, the whole section frame, to the whole page, and it has several conditions, like "is footnote at end" and "is endnote at end". These are checkboxes in the GUI, named "Collect at end of text" and, for endnotes, "Collect at end of section". They were false in this case. The "has follow" check doesn't matter now, it was false in this case too. But there was another function, "contains footnote container". I checked this function, and it just blindly checks whether the section has a column frame child that has a footnote container frame child, and if it has one, it returns the first one it finds. So it seems the function was deliberately written this way. So the question was: why do we want to break a page in this case?
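The decision described above can be modeled in a few lines of Python. This is a simplified sketch built only from the description in the talk, not the LibreOffice source: the class names, the field names and the order of the checks are assumptions, and the endnote condition is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    # The column's footnote container (a list of footnotes), or None.
    footnote_container: Optional[list] = None


@dataclass
class Section:
    columns: List[Column] = field(default_factory=list)
    footnotes_at_end: bool = False  # "Collect at end of text" checked
    has_follow: bool = False        # the section continues in a follow frame


def first_footnote_container(section):
    """Blindly return the first footnote container found in any column."""
    for column in section.columns:
        if column.footnote_container is not None:
            return column.footnote_container
    return None


def to_maximize(section):
    """Should the section frame grow to the full page height?"""
    if section.has_follow:
        return True
    if section.footnotes_at_end:
        return False
    # Any footnote container in any column forces the maximized size.
    return first_footnote_container(section) is not None


# The bug document: one line of text, two columns, a single footnote.
bug_section = Section(columns=[Column(["footnote 1"]), Column()])
plain_section = Section(columns=[Column(), Column()])
```

In this model a single footnote in a two-column section is enough to maximize the section to the page, which is the behavior that produced the second page in the bug document.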
I even checked in git who changed it, or something like that, but it was in the very first commit I could reach, the initial import in the year 2000, and nobody had changed it since. I even wondered why this many conditions are needed, not just the first one I found, and why the "not collected at end" condition is still needed for it. For this I had to analyze the frames and how they work, so I listed the frames related to footnotes. The footnote frame is for one footnote entity; it can contain a text frame, and it can be in a footnote container frame, and that frame contains a list of footnotes. It's important that it's not a two-dimensional list, it's just a simple list that it can display, and this footnote container frame can be in a page frame or a column frame. So what is important is that the footnote container frame does not really support columns; I could say it's just one column of footnotes and that's all. But we can put this container frame in any of the column frames, even, so if we put these container frames in every column frame, then it will seem as if we supported multiple columns. So in one case we want footnotes in multiple columns, as in Word. In the other case, if we don't check "Collect at end of text", we want the footnote to be at the end of the page. Then the footnote is physically at the end of the column, but we want it displayed at the end of the page, and that can be true only if the column end is equal to the page end. That's why they maximize the size of the column to the page size.
In our case it's not a good idea, because in my test case there was only one footnote and it doesn't need to be in multiple columns. But if we didn't do what I showed on the previous slide, there could be a contradiction. I made a test document with two sections, and I checked "Collect at end of text", which means I want the footnotes to be after the section end. There are two columns in the first section and three columns in the second section. It's very good this way, but if we uncheck "Collect at end of text", meaning we want all of these footnotes at the end of the page, then how many columns do you want? It's a contradiction: we can't say two or three, because either of them could be wrong. So what we would need is to be able to have column children for the container frame, not because it should be so but because it must be so. It should be like Word, able to have a different number of columns at the end of the page than in the text. So we should have the ability for users to choose how many columns they want in the footnotes, but we should keep the ability to... that's all, okay, sorry, we are almost at the end. |
News from the ODF Toolkit
Quick overview: Intro, use cases & updates from the past months and likely future! |
I guess I'll start already, because otherwise it gets boring. Hi everyone, I'm Svante Schubert. I started in '99 with StarOffice in Hamburg and never worked on the LibreOffice core, or the StarOffice core earlier. From the beginning, although I applied as a C++ developer, I worked on Java-based web stuff; we called it a web office, right? And we had two of them. The second one, when the golden master came in 2011, was cancelled, everything. And this library I'm talking about, the ODF Toolkit, was the core of this web office and later on of a fork from Open-Xchange for another web office. I also worked on the ODF standard, which is the file format, the shock-frozen runtime that you dump and send to others. And the reason for the standard is to be interoperable with other things like Microsoft Office. Microsoft is also participating in the OASIS ODF TC. Okay, so much for not bringing your own laptop because everything goes faster this way. Anyway, it doesn't matter. By the way, of the seven members in the OASIS TC, three of us are from The Document Foundation, including Regina Henschel, who is very active, and Michael Stahl from allotropia. Many thanks to allotropia for joining. And Michael is also the co-maintainer of this ODF Toolkit. And yes, sorry, Dennis, bring my laptop, please. Okay, so, yes, okay, anyway. So the ODF Toolkit is basically Java-based, and it has two main core deliverables: one is the ODF DOM, and the other is the validator. And yes, I know, but gosh, here we go. Sorry for the inconvenience. Okay. The one thing is the core, the ODF DOM, and the online validator is a wrapper around it. It's hosted by TDF, and it's used for validating the ODF file format. And ODF DOM, you can hear it in the name, is not only an ODF implementation, but also a DOM.
And you might know this from the browsers. The browser standard, the HTML standard, demands that at runtime the browser is accessed through the DOM API, and that is the reason JavaScript not only runs in Firefox or Internet Explorer, but these macros are interoperable in all browsers. ODF doesn't have such a runtime API. You have either a LibreOffice macro or a Microsoft Office macro, and they do not work with each other. This is a downside. I don't say that we need a DOM, and there's no DOM access in LibreOffice, but it would be nice to have a feature API, a high-level API. And the reason is that we have a standard, a blueprint, the standard given by the OASIS TC, and then we have the implementations, where the ODF Toolkit is one of them, and it's hosted by TDF. And just in a nutshell, what is the standard defining? The standard blueprint defines the zip. If you have an ODT document, you can change the suffix, and then you suddenly have a zip. And there are different parts in the specification: part one, the introduction; part four, formula, which is not implemented here. The zip itself is defined in the package part, like encryption and signatures, and also this META-INF manifest, which is a table of contents, and you will find the signature XML there as well, and the whole XML part is another one. Now the high-level goal is to close the gap between the standard, the blueprint, and the implementation. We want to get away from paper, because you don't want to say, oh, here, developer, here are another 500 pages of specification, go and start. The idea is to generate as much as possible of the documentation and most of the code from the specification, and yes, to ease development by this. And how is it done? We are generating from the XML. Look, there are a lot of XML elements, and the manifest is like a table of contents, and there are signatures, also in the manifest XML, and all of this is taken away, abstracted away from the developers.
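The point that an ODF document is a zip package with a manifest as its table of contents can be tried out with Python's standard library. The sketch below builds a minimal package in memory and reads the manifest back; the `content.xml` here is only a placeholder, so the result shows the package layout but is not a valid ODF document, and the helper names are made up for this example.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"

MANIFEST = f"""<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="{MANIFEST_NS}" manifest:version="1.2">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="{ODT_MIMETYPE}"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
</manifest:manifest>"""


def build_minimal_package():
    """Create the zip layout of an ODF package in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry comes first and is stored uncompressed.
        package.writestr("mimetype", ODT_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("META-INF/manifest.xml", MANIFEST,
                         compress_type=zipfile.ZIP_DEFLATED)
        # Placeholder only: a real document needs proper ODF content here.
        package.writestr("content.xml", "<placeholder/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_manifest_paths(package_bytes):
    """Read the manifest, the package's table of contents."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("META-INF/manifest.xml"))
    return [entry.get(f"{{{MANIFEST_NS}}}full-path")
            for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry")]


package_bytes = build_minimal_package()
paths = list_manifest_paths(package_bytes)
```

The same `zipfile` calls can be pointed at any real `.odt` file to list its entries, which is the "change the suffix and you have a zip" observation from the talk.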
So we generate the DOM tree, a typed DOM tree: for each element and attribute we have a class, and an element gives you methods for what can be inserted, and the default values are extracted from the spec. Coming soonish is that you should have a constructor with all the mandatory parts of the subtree, something like this. But there's also some gap, let's say a digital gap, in the spec. There are some rules in the floating text saying, oh, when attribute A is active, then attribute B, sorry, if B is present, then A becomes active, or there's a certain value for B. I would love to read out these conditions and generate the source code from them, because I don't want to type it myself for thousands of attributes. Another thing that is not there yet, because it's a lot of things, are the puzzle pieces; I would call them features, and this should be the feature API. Everybody expects that there's an image and there's a table, and even if you don't know anything about ODF, you know them from other file formats, from HTML or Markdown, and you have a certain assumption that if there's a table, you can insert a column. And this insert column function, this change API request, has a certain pattern of XML change, and this might and should be defined in the spec, so that we can generate even this API from it, right? So in the end it's something like this: we have a semantic layer, which is currently not generatable, an XML layer and a package layer. The idea is that you might exchange the semantic layer even with other file formats; there's also a table in Markdown, where you can do an insert column on a table this way.
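What "insert column" means as a pattern of XML change can be shown on a tiny ODF-style table with Python's ElementTree. This is a sketch under strong simplifying assumptions (no repeated columns, no spanned cells, no header or row groups, at least one existing column), and `insert_column` is a hypothetical helper, not an ODF Toolkit API.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

SAMPLE = f"""<table:table xmlns:table="{TABLE_NS}" xmlns:text="{TEXT_NS}">
  <table:table-column/>
  <table:table-column/>
  <table:table-row>
    <table:table-cell><text:p>A</text:p></table:table-cell>
    <table:table-cell><text:p>B</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>C</text:p></table:table-cell>
    <table:table-cell><text:p>D</text:p></table:table-cell>
  </table:table-row>
</table:table>"""


def insert_column(table, index):
    """Insert an empty column before position `index` (simple tables only)."""
    columns = table.findall(f"{{{TABLE_NS}}}table-column")
    children = list(table)
    if index < len(columns):
        position = children.index(columns[index])
    else:
        position = children.index(columns[-1]) + 1
    # One new column declaration...
    table.insert(position, ET.Element(f"{{{TABLE_NS}}}table-column"))
    # ...and one new empty cell at the same position in every row.
    for row in table.findall(f"{{{TABLE_NS}}}table-row"):
        row.insert(index, ET.Element(f"{{{TABLE_NS}}}table-cell"))


def cell_texts(table):
    """Return the paragraph text of every cell, row by row."""
    return [[cell.findtext(f"{{{TEXT_NS}}}p") for cell in row]
            for row in table.findall(f"{{{TABLE_NS}}}table-row")]


table = ET.fromstring(SAMPLE)
insert_column(table, 1)
```

One user-level feature thus touches the column declarations and every row at once, which is the kind of change pattern the talk would like to see described in the specification so that such an API could be generated.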
So let me run through it a bit. Of course, this is just a model and not a view; the view is for LibreOffice, and also the spec is not very strict on the view. The highlights: we recently did a release, on 1.2 still, we refactored a lot of this code generation, and we also did a release after 20 years of, and took over, the Multi-Schema Validator, which is for loading and understanding the grammar. We have something called RELAX NG, which is similar to XSD, a grammar, and we traverse it and are able to generate as much as possible from it. Then 1.3 is in the upcoming release, nowish; I thought I could release it with Michael for FOSDEM, but there were some bugs. And 1.4 comes later. Why 1.4 later? That's the ODF TC update, really. You see, this was the last release, 1.2, with Sun; Sun was stopping around that time, and it took a while until we made the next release for OASIS. And now we are closing in on ODF: there's a link, you can click on it and see the query of the 66 issues which are already in, and then 23 candidates, where people can take something from it. Okay, so there's another thing. I did a project on a metadata search engine, where the text and metadata were extracted from ODF, and I realized that there's something missing in ODF, just for discussion afterwards. We don't have this ODF model in the middle, this feature model, where we say, okay, now I tell you how I do the export, how a table should look in Markdown, and you can cherry-pick the features you would like to have in your own format, like Markdown or HTML. I realized that in the whole design we came from XML, and only later did we realize that we need this feature level to abstract from the XML details, and that this is missing. So here are some sources, and I would really, really love to discuss this later over a beer or something. Thanks a lot, see you next year. |
LibreOffice graphics subsystems - SystemSpecificRenderers
Providing a working Example and report about progress/findings during development |
Hello, my name is Armin Le Grand. I have been working on the graphics stack for a long time, as many of you know, and this is actually also a kind of update from the LibreOffice conference, where I already talked about system-dependent primitive renderers and why they are important. I have been talking about them for years, as you know, and at the last conference I went so far as to promise to implement a prototype, and I did, and it's finished. I did it on Direct2D, and it renders what you expect it to render. With Direct2D, of course, it's on Windows, but there was no special reason for Direct2D; I just wanted to try out whether it fits well to a standard graphics library, and whether I could figure out how to do it myself, because I didn't know Direct2D too well, so it was some self-training too. So it's finished, it's working, and everyone now has an example of how to do a system-dependent primitive renderer if they want to. So what happens when painting? I don't want to go into detail about that; I have tried to write down everything that is happening, so you can just check it when you download the presentation. That's the history, why it happened, what happened, also some history. So what did I do? I just started with a simple primitive processor, and it's feature complete because it supports all necessary primitives: there are four primitives you have to support, and some grouping primitives you have to support. There are quite a few extra primitives supported to get some more speed into it, which is not necessary to be feature complete because, as I hope you all know by now, you can decompose a primitive and it just breaks down into simpler primitives, which can then be rendered. It's in a single source file, so no hacking is required elsewhere; inside this single source file you can do whatever you want. It's just 2000 lines, so not too much if you compare it with some backend implementations which are spread over the whole office.
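The decomposition fallback mentioned in the talk, where a renderer only has to draw a few basic primitives and asks everything else to break itself down, can be sketched as follows. This is an illustration in Python, not the drawinglayer API; every class name is hypothetical, and only one basic primitive is modeled instead of the four the talk mentions.

```python
class Primitive:
    def decompose(self):
        """Return simpler primitives that produce the same picture."""
        return []


class PolygonFill(Primitive):
    """A basic primitive: every renderer must be able to draw it directly."""

    def __init__(self, points):
        self.points = points


class Rectangle(Primitive):
    """A convenience primitive that the sketch renderer does not know."""

    def __init__(self, x, y, width, height):
        self.x, self.y, self.width, self.height = x, y, width, height

    def decompose(self):
        x, y, w, h = self.x, self.y, self.width, self.height
        return [PolygonFill([(x, y), (x + w, y), (x + w, y + h), (x, y + h)])]


class Group(Primitive):
    """A grouping primitive: it decomposes into its children."""

    def __init__(self, children):
        self.children = children

    def decompose(self):
        return list(self.children)


class Renderer:
    """Draws what it supports directly and decomposes everything else."""

    def __init__(self):
        self.drawn = []  # stands in for calls into a graphics library
        self.handlers = {PolygonFill: self.draw_polygon_fill}

    def draw_polygon_fill(self, primitive):
        self.drawn.append(("fill", primitive.points))

    def process(self, primitive):
        handler = self.handlers.get(type(primitive))
        if handler is not None:
            handler(primitive)
        else:
            for child in primitive.decompose():
                self.process(child)


renderer = Renderer()
renderer.process(Group([Rectangle(0, 0, 2, 1)]))
```

Adding a direct handler for `Rectangle` to `handlers` would skip the decomposition for that type, which corresponds to the extra primitives the talk says were supported purely for speed.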
It translates primitive data directly to Direct2D and uses the already available system-dependent buffering, which was not used in other implementations; I don't know why, because it's working and available. It does not need any adaptation of Bitmap/BitmapEx, which of course would be even better to do, so as to use the data directly without converting it, but for now it's just converted once and held in this standard system-dependent buffer. So it's quite fast, and you can try it yourself because it's in master: if you have a Windows build and you start it with the environment variable mentioned here, you will get the new renderer, and you will see the edit view of Draw/Impress, and the parts of Writer and Calc as far as they support primitive rendering, rendered directly by Direct2D without using OutputDevice at all. So that's the proof of concept, and I delivered it. Now I hope we find some people who can help. These are the ones I added to make it faster, and it's not even optimized but already pretty fast. Just think about the layers that are used: we have the model, it creates primitives, the primitives are thrown at a renderer, currently the VCL pixel renderer, that renderer passes them to OutputDevice, still to OutputDevice, and that OutputDevice sends them to a backend. So you have a five-level stack, and OutputDevice alone does a lot of work in between: old, unmaintainable, incredible stuff in between. What a system-dependent primitive renderer does is remove the last three steps and pack them into one: you go directly to the renderer you want, in this case Direct2D. So what does it look like when you let it run? It looks the same, and that was exactly the target: it looks the same, but it's fast and it does not use OutputDevice.
So what else can be done? Currently it has no support for text, so text is decomposed and rendered as anti-aliased poly-polygons. That's not too bad, but of course for production state we would need something more, as well as direct support for gradients. For Direct2D I already looked into it a little bit; it might just be done as a custom effect, which is some kind of texturing, so with some more work this could easily be extended to product quality. Let's see if we find resources to do this, this time. What also happened: Caolán started a system-dependent primitive renderer on Cairo. Thanks Caolán, you're my man. He just tested it out by copy-pasting the structure and filling it out with Cairo stuff, and it does render something. There are some caveats and it would need some more love, but it's easily maintainable and can be extended, so that's another proof of concept that this really works well. In the process I also did some upstream cleanup in master of things that were in the way; that was another reason to do this proof of concept and prototype. I can now promise that you can do exactly the same in drawinglayer and write your own system-dependent primitive renderer for any target system without having to fear that you get hung up on something in master that would be in the way, because I had to clean that up anyway.
The other point is: this is good for any view visualization, but what about the rest, which is still painting using OutputDevice? So I did two more experiments. One is to forward calls in the backends, since the backends have a fairly lean API. I just made a proof-of-concept prototype; you can find it in Gerrit with the link I gave. You can add it as a patch to the current office, compile it and see it running, and you will see that it currently forwards a single method, drawing a rectangle, as a test, and to make it visible in the office it's color coded a little bit so we can see it. That works perfectly, and the good thing is that the backends are libraries themselves: you can just link them against drawinglayer, and all functionality can be in drawinglayer and use drawinglayer stuff. The other way I tried was a kind of draw forwarder in OutputDevice itself, so every paint command calls something in a virtual structure, which is then overloaded in OutputDevice. That also works flawlessly; I used Direct2D again, and this is a proof of concept too. There is also a Gerrit link; you can just use that link, patch it into your office, compile and see it running, and you get the same color styling to see how it's running. So which way to continue? Best: convert everything to LibreOffice primitives. I always wanted to have that done, but I know it's not possible. Next: a combination of one of these solutions with system-dependent primitive renderers. Or worst: just keep it like it is, like always. I'm still very interested in doing this, but I did this prototype mostly in private, with some support from Torsten, thanks, and I cannot continue doing it with the needed intensity just privately. So without getting resources this will fail again and we will stay at VCL OutputDevice forever. That's it. |
Improvements to LibreOffice PDF accessibility
Come to see what improvements we made to PDF/UA support in LibreOffice |
Okay, the next one is me, so somebody else has to watch the time and drag me off the stage. Welcome to a bit of an update on LibreOffice accessibility, in this particular case PDF accessibility. My name is Thorsten; we're running our own company here, doing consultancy and some products around LibreOffice. This work has been funded by a customer, and it's always great to have people paying for improvements in LibreOffice. I'm very grateful for that; that's why we can present this today. Quickly, there is a difference between document and application accessibility. Accessibility can be many, many things; from Wikipedia: it makes products, devices and services usable by people with disabilities. Of course the main thing that needs to be accessible with LibreOffice is the software itself, and there has been a lot of work ongoing over many years there. But what LibreOffice produces, the documents, also needs to be accessible, so that's more on the services side, the products, the outcome. On application accessibility, as I said, lots of work has been ongoing. There have been two tenders from the foundation, and both of them went to Hypra. I think there's still a bit of work ongoing on one of them; the first one is definitely finished. That resulted in some cleanups and some checks, and also some build-time checking that all the dialogues have all the prerequisites set, so that the GUI is and remains accessible. The second tender is about enabling LibreOffice to run with the platform accessibility APIs, and not breaking that going forward, so that screen readers and other tools actually work with LibreOffice, and about making that testable and kind of locking it down.
So that was LibreOffice. Oh, and jobs corner advertising: LibreOffice is looking for an accessibility engineer. There's a job posting there, and if you're interested, apply about now; I think the deadline is sometime next week, I believe. Or tell your friends and family. So, document accessibility, that's what I'm focusing on. The relevant standards, and we're talking about PDF here: there are the Web Content Accessibility Guidelines, which are mostly about HTML, but much of that also applies to other document types; and then there's PDF/UA, the PDF Universal Accessibility standard, which mandates lots of things, mostly metadata in the document, so that suitable tools like screen readers can extract that and say, okay, I need to read this, but I don't need to read that, or, for images, that there is a textual description of what the image contains. There's a nice overview and an actionable list of checks that you can go through in the so-called Matterhorn Protocol, and if you pass that checklist, then you can be pretty sure that your document is accessible. There are validators for that: PAC and veraPDF. PAC is closed source, or source-available or something, if you ask, and veraPDF is proper open source. Okay, so what did we do? We fixed a lot of bugs. There are still some bugs left, I hear, about 14 when I last checked, so anybody who's interested in having accessibility bugs fixed, give me a call. And that's what we did.
So there was quite a chunk of things that still had to be fixed after so many years at the, let's say, core level: actually getting metadata that was already there in the document down into the PDF in the correct way. Getting the information down was the first thing, so that the PDF export could write it to disk, and then also getting it there in a proper way, at the right position in the metadata structure, not forgetting something, not mistyping some tags, et cetera. What we did there is quite across the board, and I can say that for reasonably complex documents we are now actually passing the accessibility checks, those validators. So that was actual bug fixing, and we also had to add a few features. The most important one is adding this decorative flag for flys. If you check the latest release (I think we missed 7.4, I'm not sure), or in any case a recent master, and you insert a text frame, on the options page you get a new checkbox that says decorative. So you can now say this content here is not important for the document; this is just some coat of arms for your family, or it's just a background picture, or something that is not really important for the content, it's just decoration. That was not saved before, also not saved in ODF, so that actually was a feature we had to add, so that users could say, okay, this is not important, that is not important, but this is, and then be able to save and reload that. Then there were some UX improvements on the accessibility checker dialogue, which was work from Collabora, good stuff there. And the third one is the online accessibility checker: if you enable experimental features, then you get an automatic checking option on the Extras menu, which is this one, automatic accessibility checking, and if you enable that, you get this kind of counter there, showing how many more accessibility bugs you still need to check or fix. That's part of the toolbar.
And so we made the actual dialogue a little bit nicer to use, in this case by not making it modal, because if it's modal, you can't interact with the document. You get a warning, in this case no alt text for the image, so you want to select the image and set the alt text, but for that you need to close the dialogue, and if you have more than, let's say, 10 things there, then you'll forget where you were. So it's much nicer if it's not modal. It also has some rescan buttons now, which is kind of orthogonal to this automatic online check. So that was that. As I said, it's in a decent shape now; far from done, but it starts to be usable. Future plans are about user experience. The goal should be to be on par with Word, so really not having users click four times for every single image that they need to put an alt text in, which is really annoying, but doing something that is kind of smooth, like one click, or even, perhaps optionally, something like AI-supported automatic image description, or other great ideas. I think the basics, the, let's say, technology, the engines there, that's pretty okay, and the focus now should be on usability, so making what's implemented actually usable. And I guess that's it, thank you. Thank you. |
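To make the role of the decorative flag concrete, here is a minimal sketch of an alt-text check of the kind the talk describes. It is an assumption-laden illustration, not the LibreOffice accessibility checker: the `Image` type and `missing_alt_text` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for an image or frame in a document."""
    name: str
    alt_text: str = ""
    decorative: bool = False


def missing_alt_text(images):
    """Return names of images that need alt text but have none.

    Images marked as decorative are skipped: they carry no content,
    so a screen reader does not need a description for them.
    """
    return [
        img.name
        for img in images
        if not img.decorative and not img.alt_text.strip()
    ]


document = [
    Image("logo", decorative=True),               # pure decoration, no warning
    Image("chart"),                               # content without alt text
    Image("photo", alt_text="Team at FOSDEM"),    # already described
]
issues = missing_alt_text(document)
```

Here `issues` contains only the chart; the length of that list is the kind of number a "remaining issues" counter could display.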
Supporting old proprietary graphic formats |
It's easy, you know, because my laptop doesn't really like how it's being disconnected from the laptop. Hello, my name is Paris and my talk is about supporting old proprietary graphic formats. Specifically, we're going to be talking about the WMF and EMF formats from Windows. So what is a WMF? This also applies to EMFs, but a WMF is a Windows Metafile. It is a graphics format that supports vector and raster operations, mostly vector. It was introduced in the early 1990s; in comparison, a different vector format like SVG was released in 2001. It is composed of a set of GDI drawing commands and structures. These drawing commands are played back in order to render the graphic within what is known as the playback device context. And it is not as widely supported as SVG. Essentially, this means that you can code this format into existence by writing some GDI functions. What are the difficulties in supporting this format? WMF files are application and device dependent. EMF files, which were introduced later, try to solve this issue, but WMF files are more difficult in that way. The device context that is associated with a WMF file cannot be queried. That is, an application cannot retrieve the device resolution data, font metrics, and so on. So if a WMF file was made for a specific device and you try to run it on a different device, you don't really know the device it was built for. There is a format specification for this, but a lot of things are missing, and there are some edge cases with undefined behavior. Pinpointing the root cause of a buggy file can be tricky. So how do you debug a WMF? Well, there are a lot of ways. I'm going to present the way that I usually do it. You want to get the GDI drawing commands, and there are multiple ways of doing so, more than I listed. One is mso-dumper. It is created and used by LibreOffice developers, and it dumps the drawing commands.
Another one is the Metafile Explorer. It allows for viewing and stepping through the drawing commands, so you can easily understand which command does what. And then there is the EnumMetaFile GDI function, which is defined in the GDI header and which allows you to enumerate the drawing commands in a WMF file and call a callback function for each. A similar function also exists for EMF files. This is an example of a WMF drawing command. EMF drawing commands look very similar, of course. This is the function signature; it takes some parameters, and this is what the record looks like within the file if you open it in the aforementioned debugger. So it's very similar to the function signature itself. After you obtain the drawing commands, well, you want to debug. Because WMF is such a platform-dependent graphic format, sometimes it's good to compare with other WMF reader implementations, like PowerPoint, to understand exactly what the graphic looks like in other implementations. Then you want to identify which drawing commands cause the bug: you step through the drawing commands that you obtained and try to pinpoint what exactly causes the bug. It is also important to reduce the relevant commands as much as possible. WMF files can contain thousands of drawing commands, and stepping through all of them is very tricky. A way to do this is to simplify the problematic file or make a new one that reproduces the bug in LibreOffice. And then you work around these buggy commands to find out what is wrong. Easier said than done, but yeah. Finally, you want to make sure you didn't break something. It is good to create a unit test for your fix using the minimal reproducible example you created before. You run the appropriate test suites. You probably broke something, so you go back to step one. And then you confirm that round-tripping works as expected.
And, I should mention, you also monitor the fix you made for the WMF file to make sure it doesn't break something in the future. And that is all. Thank you. |
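The "enumerate the drawing commands and call a callback" step from the talk can be sketched without Windows. The record layout used here follows the published WMF format (an 18-byte header, then records made of a 32-bit size in 16-bit words, a 16-bit function number, and 16-bit parameters), and `META_RECTANGLE` (0x041B) and `META_EOF` (0x0000) are real record types. The helper names `make_record` and `enum_records` are hypothetical, and the sketch ignores placeable headers and non-integer parameters, so treat it as a simplified illustration rather than a full parser.

```python
import struct

META_EOF = 0x0000
META_RECTANGLE = 0x041B   # parameters are stored as bottom, right, top, left
HEADER_SIZE = 18          # the standard WMF header is nine 16-bit words


def make_record(function, *params):
    """Build one WMF record: size (in 16-bit words), function, parameters."""
    size_in_words = 3 + len(params)
    head = struct.pack("<IH", size_in_words, function)
    return head + struct.pack(f"<{len(params)}h", *params)


def enum_records(data, callback):
    """Call callback(function, params) for each record, in playback order."""
    offset = HEADER_SIZE
    while offset + 6 <= len(data):
        size_in_words, function = struct.unpack_from("<IH", data, offset)
        if size_in_words < 3:
            raise ValueError(f"corrupt record size at offset {offset}")
        params = struct.unpack_from(f"<{size_in_words - 3}h", data, offset + 6)
        callback(function, params)
        if function == META_EOF:
            break
        offset += size_in_words * 2


# A tiny in-memory metafile: one rectangle followed by the end-of-file record.
records = make_record(META_RECTANGLE, 100, 200, 10, 20) + make_record(META_EOF)
header = struct.pack(
    "<HHHIHIH",
    1,                              # type: in-memory metafile
    9,                              # header size in words
    0x0300,                         # version
    (HEADER_SIZE + len(records)) // 2,  # total size in words
    0,                              # number of graphics objects
    7,                              # largest record, in words
    0,                              # unused
)

seen = []
enum_records(header + records, lambda function, params: seen.append((hex(function), params)))
```

Note how the stored rectangle parameters appear in the reverse order of the GDI call's arguments (left, top, right, bottom); details like this are easy to miss when reading a dump by hand.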
LibreOfficeKit – bridge between your application and LibreOffice |
Okay. I think we can start. Hello. My name is Szymon Kłos and I work for Collabora Productivity. Today I would like to talk about LibreOfficeKit. I named this presentation "bridge between your application and LibreOffice" because LibreOfficeKit is simply an API which makes it possible to render a preview of a document. So if you want to build an app which requires rendering of some documents, you can use that. The API provides not only rendering of documents: we can also manipulate the documents, and it gives us access to the UI components. We use LibreOfficeKit in Collabora Online, but in the LibreOffice repository there is also a sample application called gtktiledviewer where we can see it in action. It renders the document from tiles and also allows us to do some simple modifications. The APIs are in the LibreOfficeKit directory and most of the crucial implementation is in the desktop module. The most important part is the class LibreOfficeKitDocument, which we can use to access all the functionality. Also, when we get some notification from the LibreOffice core, some event, because for example the selection has changed, we get the callbacks listed in LibreOfficeKitEnums.h. In principle, rendering should be completely transparent compared to the normal desktop application, but sometimes that is not possible, so we use some conditional code for LibreOfficeKit, behind the comphelper::LibreOfficeKit::isActive() flag. For example, calling the mentioned callbacks is behind this guard. In this talk I would like to present some improvements in LibreOfficeKit, mostly in rendering the tiles. The biggest thing is the master page mode, because there was a problem. In Impress we can open slides in edit mode, but we can also design how slides look, and this is called master page mode.
There was a problem when we had multiple users in the same presentation and some of them were looking at the master page while some of them were just editing the presentation. Sometimes we got mixed tiles, because in our API there was no explicit way to say that we want to render a tile for the master page or for the normal mode, so depending on which view was selected for rendering it was completely non-deterministic. So I introduced that parameter to our API, and now we are sure that we will not mix tiles between different modes. Another problem I noticed was in Calc. Again, it was non-deterministic when we were rendering some piece of the spreadsheet. When we had two users editing content in a similar area which was covered by one tile, and one of these cells was overflowing because it had too much content to fit into the cell, the overflow was rendered only when the view selected for rendering was the actual editor. But when other users were typing, we sometimes refreshed the given fragment of the spreadsheet without it, so it was flickering: sometimes we got the right view and sometimes we got a view without the overflowing content. I fixed that, and now we always render the same tile the same way, so we get the overflowing content in both cases. As you can see here, there are two different views, and both of them have the same content. It's very important because then the client application can cache the tiles, and they are the same. Another thing I improved was rendering the slide previews, because in previews we don't want to show any selections or draft changes in text fields; we just want a plain preview of a slide. In the old code we always used one view, the first one, for rendering this, but that was not correct, because when the first user was typing something, for example in a text box, it was visible in the preview for other users as well. So now I have modified the algorithm which selects a view for drawing, and we prefer views which are not currently editing.
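Two of the ideas above lend themselves to a short sketch: making the rendering mode an explicit part of a tile's identity, and preferring a non-editing view when drawing previews. The code below is a hypothetical model only; `TileKey`, `View`, and `pick_preview_view` are invented names and do not mirror the real LibreOfficeKit signatures.

```python
from dataclasses import dataclass
from typing import NamedTuple

NORMAL_MODE = 0
MASTER_PAGE_MODE = 1  # assumed numbering, for illustration only


class TileKey(NamedTuple):
    """Everything that determines a tile's pixels, including the mode."""
    part: int
    mode: int
    x: int
    y: int
    zoom: int


@dataclass
class View:
    view_id: int
    editing: bool  # True while the user has an active text edit


def pick_preview_view(views):
    """Prefer a view that is not editing, so draft text stays out of previews."""
    for view in views:
        if not view.editing:
            return view
    return views[0]  # every view is editing: fall back to the first one


# With the mode in the key, a client-side cache cannot mix up tiles.
cache = {}
cache[TileKey(part=0, mode=NORMAL_MODE, x=0, y=0, zoom=100)] = b"slide pixels"
cache[TileKey(part=0, mode=MASTER_PAGE_MODE, x=0, y=0, zoom=100)] = b"master pixels"

chosen = pick_preview_view([View(0, editing=True), View(1, editing=False)])
```

The same reasoning applies to the Calc fix: a tile is only safe to cache if its content depends on nothing outside the key, which is why rendering had to become independent of whichever view happened to be selected.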
Another function which got improvements was renderShapeSelection, which is responsible for rendering the shape or image of the element which is currently selected in the document. We use that, for example, for showing the rotation result in real time, so when the user starts to drag or rotate something, they see the potential result. The problem here was that when we selected an image of a very, very large size, it was rendering the original image, so it sometimes took a few seconds and sent megabytes just to render a preview. I optimized that a bit by setting a maximum resolution used for previews, because they don't have to be as big as the original images. And among other things I added in LibreOfficeKit was exposing the formula bar widget which is present in Calc. Now it sends all the events, like selections or cursor movements, to the client application so they can be handled there correctly, because previously in Collabora Online we were using the tunneled, pixel-based formula bar, which is not perfect for user experience. And that's all from me, thank you. |
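The preview optimization boils down to capping the rendered size while keeping the aspect ratio. A minimal sketch follows; the function name `preview_size` and the 1024-pixel limit are assumptions for illustration, since the talk does not state the actual limit.

```python
def preview_size(width, height, max_dim=1024):
    """Scale (width, height) down so neither side exceeds max_dim.

    Images that are already small enough are returned unchanged;
    previews are never scaled up.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(1.0, max_dim / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


large = preview_size(8000, 4000)   # a very large selected image
small = preview_size(640, 480)     # already below the limit
```

For the 8000 by 4000 image this reduces the pixel count by a factor of roughly 60, which is where the savings in render time and transferred bytes come from.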
Collabora Online over lock-down
How LibreOffice technology in the browser got better |
I'll start, hopefully. 30 slides and 10 minutes. Excellent. So, there's a next button here. I will eventually get there. Good. So, a good number of things changed over lockdown and perhaps you're interested in that. Collabora Online is an amazing piece of software, and it's built on LibreOffice technology. One of the things that you'd really have noticed changed in lockdown, I guess, was that we binned this dialogue saying this is an unsupported version of whatever. We thought this was a good way to encourage people to consider supporting the project and getting something, you know, supported, and contributing back, but it turned out everyone hated it. Even I hated it. So, there you go, even though it did a good thing. So, we got rid of that, and after a large discussion which doesn't bear repeating, we then decided to use the Collabora brand to make people aware that you could get something. So, one of the biggest and most significant changes, I guess, was removing that and then providing a free version that anyone could deploy at scale without asking anyone, which is kind of cool if you like that sort of thing. And yeah, so there's a bit of a catch-22 there and we are around it. So, the things that changed, I guess: there are now public, unlimited online binaries, and we also published all of our SDK documentation and Docker image build instructions, and just got a lot more open. And Italo, wherever he is, came up with this inspired umbrella brand, LibreOffice Technology, I guess, this platform for personal productivity. So, quite a lot of end-user-facing change there. And obviously, making those two things work together nicely and be friendly is important. So, we spent lots of time putting, you know, LibreOffice Technology logos in here and there, and in Nextcloud Office as well. I guess I got these two the wrong way around, but anyway.
And everything was beautiful, except no one could see it at TDF. Oh, I need to move this way, right, out of the frame. Sorry. Because Guilhem is a genius, and he cleverly locked down the CSP rules so you couldn't actually fetch the logo from your client, because it was forbidden. So, we then had to implement a whole load of work to get the server to download and then re-serve the image. There's some kind of funky HTML proxy built into Collabora Online for this image. So, anyway, it gets out there now. And when you click on it, you get these nice credits that tell you about all the good work that everybody has done. So, hopefully, there are even pictures you might feature in if you're careful. And there's some more branding planned there. We have a welcome screen that comes up now and shows you the differences, and we'd love to get this accelerated. Sadly, we're a bit busy at the moment, which is good, all good, for LibreOffice. So, I just wanted to talk quickly about some of the things that we've done, because that's quite fun, in the technology underneath. So, there you go. Look at this lovely logo. So, one of the things then is to make the blockchain useful, which means providing money to us to do LibreOffice hacking, obviously. And so, we got rid of this really deeply annoying dialogue which was triggered by all those clever people that had accidentally typed in the far right corner of their spreadsheet once and made the size apparently larger, which was really good. Hopefully, lots of people enjoy that. Content controls. Microsoft has many strata of content controls. You know, every idea has been tried many times. These are, I guess, the latest ones, and Miklos has done a fantastic job (I think he had a talk earlier on this) implementing those so that we can do all sorts of forms and things, and then starting to add things on top. Impress color theming. Lots of beautiful interruptions. Chart data tables. Everyone likes, you know, pictures.
Well, some people like pictures. Others like numbers. So, well, you can have both on the same screen and hopefully see it in your slides, which is kind of cool. Yeah, DeepL. Yes, built-in language translation, for when you need to get your German into English or vice versa. And yeah, that's all good fun. And hopefully with formatting of some degree as well. You can imagine, perhaps, the problems of turning this into HTML, translating it, and then trying to get your styles back at the other end. And if anyone wants to help improve that, I'm sure we will be thrilled to have some help making it better. So, that's all good fun. What else? I missed the button? No? Good. And then there are whole loads of things we've done in Collabora Online. One of the things I'm excited about is LanguageTool. People ask me, what about your artificial intelligence story? And of course, you know, we need some intelligence around. That's for sure. If only we had some, you know. And a robot seems like a good way of providing it, you know, reproducible, repeatable. Anyway, the wonderful people at LanguageTool have managed to build an open source business sort of off the back of Grammarly's market positioning and price-setting skills. And so, they're making money selling things built around the open source LanguageTool and contributing loads of good code. And yeah, they're based in Potsdam. So, it's thrilling to be able to integrate with them and to get much better grammar checking. And of course, you can run LanguageTool on your premises. So, you're not sending it all to some random other server. You're sending it to your own random other server, which is good. So, that's all fun. And then of course, the Duden Korrektor has been integrated, for people who like this, you know. Actually, these are both German companies, I believe. So, you know, your genitives are safe with us.
But anyway, so, that's a similar web service, and that can be integrated. How your text is flowing, positioning beautiful boats, of course, in just the right place, you know. It's very, very important to get your images nice. Yeah, you can see the anchor. It's one of those metaphors that's probably not used much. It's like the floppy disk, the anchor, you know. I don't know how many people sail. But anyway, sailors get their images in good places. And we're putting that into the web UI. There's some chunk of accessibility work and just ease-of-use stuff around the formula entry, building on Caolán's work to make the formula entry jsdialog-able, which is all good. That came from our mobile thing. The accessibility checker: I guess we brought that to the online suite so that you can see all of the many failings that you have in your accessibility and then maybe fix them, which would be good. And then lots of work from, I guess, Szymon primarily, but Pedro and others, on bringing more JS dialogs. So, having the dialogues, again using Caolán's wxWidgets-like welding framework, we can move these and then run them on the client, which brings whole loads of prettiness and performance and accessibility improvements. What else? Deltas. The way Collabora Online works is that it has tiles. It splits your documents into 256-pixel square tiles, which, at least in the past, I was told is optimal for some GPUs on mobile phones. And it seems like a good size. So, we do that, but we now use deltas for that as well. So, as you type your hello, or H something, it will find just the H and then it will compress that. Previously we did this with PNGs. And PNGs, well, they use deflate, and deflate is, well, as you can see, really bad. Not only is it slow, it also produces big compressed data. So, Zstandard is much better and we do that now. And yeah, thanks to Facebook for their contribution to digital sovereignty and control of your data. That's fantastic. So, what else?
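The delta idea can be illustrated with a small sketch: instead of sending a whole 256 by 256 tile again, send only the rows that changed and compress those. This is not Collabora Online's actual delta format, which the talk does not describe; the row-based payload below is an invented simplification. The talk says the real implementation uses Zstandard; the sketch uses `zlib` (deflate) purely because it ships with Python, so it shows the delta step, not the compressor choice.

```python
import zlib

TILE = 256          # tile edge in pixels, as mentioned in the talk
BPP = 4             # bytes per pixel (RGBA), assumed
ROW = TILE * BPP    # bytes per tile row


def make_delta(old, new):
    """Encode only the rows of `new` that differ from `old`, compressed."""
    payload = bytearray()
    for y in range(TILE):
        row = new[y * ROW:(y + 1) * ROW]
        if row != old[y * ROW:(y + 1) * ROW]:
            payload += y.to_bytes(2, "little") + row
    return zlib.compress(bytes(payload))


def apply_delta(old, delta):
    """Rebuild the new tile from the old tile plus a delta."""
    payload = zlib.decompress(delta)
    tile = bytearray(old)
    step = 2 + ROW
    for offset in range(0, len(payload), step):
        y = int.from_bytes(payload[offset:offset + 2], "little")
        tile[y * ROW:(y + 1) * ROW] = payload[offset + 2:offset + step]
    return bytes(tile)


# Simulate typing one character: a few pixels change in a single row.
old_tile = bytes(TILE * ROW)
changed = bytearray(old_tile)
changed[10 * ROW:10 * ROW + 8] = b"\xff" * 8
new_tile = bytes(changed)

delta = make_delta(old_tile, new_tile)
restored = apply_delta(old_tile, delta)
```

The client must still hold the previous tile for this to work, which is why deterministic, cacheable tiles matter so much in this architecture.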
Ah, various new exciting dialogues, property things, innumerable PDF options built in. So, new export stuff, some great things there from the team. What else? Online embedded video playback. If you have embedded videos, you can drop them in and play them. Of course, codecs are the hidden nightmare of any kind of video. It's a disaster area. But we just use what your browser has. So, let's hope your browser is doing well. And luckily, browsers seem to have reasonably good codec support. So, what else can I say? Come hack with us. It's cool. Talk to us. Get involved. The cool kids are all using LibreOffice technology, so you should be too, if you're not. There's a LibreOffice hackfest this Monday and Tuesday. So, that's not tomorrow, but the days after, at the business and technology incubator. So, if you want to be incubating, you know, technology, you can go: Betacowork. It's a great place, and you can take a photograph and then you'll have all the details you need. And in March, we're having something in Clare College, Cambridge. Again, you're welcome. There'll be a LibreOffice hackfest there. Come make open source rock with us. And that's about it. Thank you. Any questions? Oh, God. Forty-seven seconds for questions. I lie. Forty-two. You know. Never mind. Well, thank you very much. Grab me over a beer afterwards. Thank you. Thank you. |
A Rocket Engine for LibreOffice Templates
Come to see what's in store for the recently-moved WollMux forms and templating engine extension for LibreOffice |
Okay, let's go, it's me again. Somebody just throw some, I don't know, vegetables at me when the ten minutes are over. If you came here for WebAssembly, I have to disappoint you; you'll have to wait another one or two talks until we talk about that. I repurposed the slot here to talk a little bit about something equally cool, wonderful, spiffy: a rocket engine for LibreOffice templates. Well, small disappointment, it's about WollMux, and I'm kind of serious there that perhaps we could think about a different name. So, quick intro: what is it, if you haven't heard about it? It's a Java extension to LibreOffice, and it's there for template and forms management and also for extending mail merge. Those are two views of it: the left one is the sidebar extension, where you see three sidebar panels that have been added. This is just a selection of sample templates. And the right side is the constructed, generated document that comes out of it. All of that has been developed by the city of Munich, starting around 2005 or 2006 under the LiMux project umbrella. Why? Because they wanted to migrate to OpenOffice, they wanted some template management, and there was nothing that was powerful enough. So they did what you do when it's not there and you do open source: you start doing something yourself and then open source it. That's what Munich did. It was in production, I think, since 2008, and there was a major upgrade from OpenOffice to LibreOffice, for which we had to change some UNO stuff, and another major rework migrating from Java Swing dialogues and UI to native LibreOffice UI, mostly the sidebar, which integrates really nicely, as you've seen before.
And this year, or at the end of last year, since the LiMux project is kind of sunsetting, the extension itself, which is extremely powerful, lovely, great, and also cool for QA because it triggers literally every little corner of the Writer UNO API, has been moving house to TDF. And what is it in the first place? It's predominantly for, let's say, very complex document generation workflows that you will find in larger companies, perhaps also in law firms or something, but predominantly in the public sector. The idea behind it is to assemble your letter on the fly and employ the DRY principle: only ever have one stylistic element or part of the template defined once, and then you include that, like in PHP or other template programming languages, so you piece that together. There are also loops and control flow statements that you can use to generate forms very dynamically. There's a forms generator, so you can have some user guidance, like where to fill in what, and pre-fill things, and also dynamically say, oh, when this is on, then do that, or don't do that, or disable this here. As I said, there's mail merge stuff, and this text programming, and, in quotes, "content-based directives", doing something very dynamic. Okay, so that thing has now moved to TDF at the end of last year, and we have been quite busy, first of all doing the move, and also doing some adaptations and adjustments and getting it fit for, let's say, an international open source project like LibreOffice. What happened: the Git repos moved to under the LibreOffice project on GitHub, which was an easy click, so the old URLs still work. You get a redirect, that's a pretty nice feature on GitHub, so everybody who had this cloned can still work with it, but the official location is now there. We added some translation workflow stuff: every user-visible string in the Java sources is now in PO files.
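The "define each part once and include it" idea can be sketched as a tiny fragment expander. To be clear about assumptions: the `{{include name}}` syntax and the `expand` function are invented for this illustration and are not WollMux's actual template mechanism, which works on LibreOffice documents rather than plain strings.

```python
import re

# Hypothetical include marker; WollMux itself uses a different mechanism.
INCLUDE = re.compile(r"\{\{include (\w+)\}\}")


def expand(name, fragments, _stack=()):
    """Recursively replace include markers with the named fragments."""
    if name in _stack:
        chain = " -> ".join(_stack + (name,))
        raise ValueError(f"circular include: {chain}")
    text = fragments[name]
    return INCLUDE.sub(
        lambda match: expand(match.group(1), fragments, _stack + (name,)),
        text,
    )


fragments = {
    "logo": "[logo]",
    "header": "City of Example | {{include logo}}",
    "footer": "Kind regards",
    "letter": "{{include header}}\nDear reader,\n{{include footer}}",
}

letter = expand("letter", fragments)
```

Changing the `header` fragment once updates every template that includes it, which is the DRY property the talk highlights; the circular-include check is the kind of guard any such system needs.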
We put this on TDF's Weblate so the UI can be translated. We moved the documentation from a Markdown, Git-based workflow into MediaWiki, which is also much nicer for the community to translate: the Markdown workflow essentially needed some kind of developer setup, and MediaWiki has really excellent translation support these days. We fixed a number of bugs and made sure that it actually works not just with old LibreOffice versions but with the most recent ones. We did some build fixes: some bits of the internal build system were still leaking through, so we made it build out of the box. You clone it, you build it with Maven, and it actually works. We tweaked some namespaces so that we can upload artifacts to the LibreOffice Maven repository without the old names, which were sometimes internal to the City of Munich. We set up Jenkins jobs, so we can not only run tests on pull requests but also run releases on the CI. And there were tons of smaller things: we started translating comments in the code, renamed variables, and got the build how-tos and other auxiliary documentation translated to English. It's not done; in particular the code is not done. Part of it was already in English, but quite a bit is left. Which gets us to the next steps and the possible future. The handbook is not yet fully aligned and super clean. The original handbook was German, so we translated it to English, and then we wanted to reuse the old German text as the translation, because why do that work again? Matching it up with the wiki markup is busywork, and we're still mostly busy with that. Help is appreciated, I guess, but either way we'll get it done in the very near future.
Then some renaming of the Java packages, which is blocked by possibly renaming the project, since we don't want to rename twice. We want a new landing page and some more bug fixing: there are still some not-so-nice things in the sidebar, and there was a crash somewhere. And then we want to have a release reasonably soon, so that people can actually use it. If anybody's interested, we could possibly publish some snapshots. Then again, it should be relatively easy to build, and if it's not, let's fix it. So maybe not publishing snapshots far and wide is a feature, not a bug, because it makes us fix the build system. Now, renaming. We brainstormed a little in the background, and these are three suggestions that we will probably mull over. 'LibreOffice Template Tool' or 'Template Engine' is probably what it actually is, so people would understand what it does when they see the name somewhere. 'Forms and Templates' is a bit more ambitious, because it does much more than just templates. But hey, naming is hard. If you're curious, come and help. There are some easy hacks for comment translation; that's the easiest way to get your feet wet. There is certainly more, like UI and documentation translation, which would be great. And of course, if you have cool feature ideas, go and get them implemented. One example: there's an old branch with QR code support, so you can generate a QR code and insert it during this programmed, software-controlled template generation. That would be nice to get in, along with other things that people want in templates these days. And that's it, thanks so much. |
Make Collabora Online yours
Customize and integrate it everywhere |
So, I'm Pedro, Pedro Silva, and today we are going to look at how you can make Collabora Online yours, and how those customizations can actually lead to new features, while going through some cool new features along the way. Why not? We start with the different themes and different looks that Collabora Online can have. But not only can it have these different faces, it can also have different buttons. If there is a button that you'd like to have, or maybe you are hacking on something and want a button in front of it that triggers something, you can do that. We'll show how later, but spoiler alert: first, we have an XML file that you can customize; second, we have a postMessage API that you can use to hack directly on the interface and beyond; and third, we also have some variables that are passed through an invisible input field and customize the UI. So there is a lot of stuff that you can just hack around with. And by the way: LanguageTool, as described by Michael Meeks, thanks to Athenaeus, so we got this cool integration. File properties, another improved and now native functionality. You see that on this tab there is plenty of space. You might want to add a button here. Why not? It's a good candidate. Sparklines, the accessibility check. We go through these new features, how we can create diagrams, and you will soon start to feel that, okay, this fits my workflow, but maybe I want to go a step further. By the way, we can even do this crazy stuff. Isn't it crazy? In a browser. I don't know, I think it's crazy. And while doing this, we got a lot of performance wins left and right, which is cool and all, but what about new features? What about upcoming features?
So, we work on LibreOffice technology in a group, with friends, and this means not only the people that are here but also other people, for instance, in this case, Nextcloud. Something can start in Online, then force us to go to the core, to LibreOfficeKit, and in the process we improve both sides. That's interesting. For instance, this is an example with form controls. As Miklos described, content controls, right? Repair document. Repair improvements. But how do we customize it? That's the question. As I hinted before, we can do that via XML, and there are a lot of variables that you can just turn on and off. We even saw one of those examples earlier, in the PranamStalk. You can set what the default UI mode is going to be. If you want to go a little bit further, you can hack the key colors that we see here, and you can see the input values that we can pass. Or why not hack on it directly? Maybe your native language is not English and you want to test in that language. Maybe you don't know this, but you can add a lang parameter at the end of your URL, for example for a language with a different layout direction, and test that, and maybe even help us translate. Why not? You can just go to Weblate and start contributing. And you see, with all those things you are making it yours, because it answers your problem, but at the same time you are helping everyone else. I think that's the key here. Here you are inserting a new custom button. This insert button is just one of the options that you can use with the postMessage API. If you go to sdk.collaboraonline.com, under docs, on the postMessage API page, you will see all of them listed in a table with examples, so you can just hack on it and have fun.
Even if it doesn't work out of the box, even if you are trying to hack something and it breaks, it's okay. You can always keep in touch: go to the address on the slide (spoken as 'cola slash io'), and there you'll find our IRC, our Matrix, Telegram, all these things. You can report bugs and maybe push a PR, and you may find that your PR leads you to learn more about LibreOffice technology and LibreOfficeKit, and suddenly you have two PRs, one on Collabora Online and one on Gerrit for LibreOffice. Hopefully that's what will happen. And while doing that (three minutes left, then questions, yes), you will be able to experiment with crazy new features, the bleeding edge that nobody knows about yet, because you built it. So maybe you can already see what is happening there, what new features and new integrations are coming. Hint: citations, but many more things. So I hope you take it to heart and give it a go, and join the team. Thanks. Thank you. Any questions? We've got some two and a half minutes. Thank you. Thank you. I use Collabora on my Nextcloud. Can I do all this on my Nextcloud? Yes. Why not? If you have your build environment, you can hack all these things. Now, if you have a production environment, which is probably what you are hinting at, then you can still do some of these customizations simply by editing coolwsd.xml, which should be under the /etc directory. There you can, for instance, set what the default UI will be, or whether you want to use the integrator's theme or not. For instance, Nextcloud has its own theme. So you can hack all these things. What's the default language? Maybe you have three or four languages listed there, and maybe you don't need them all.
Maybe you just use two, Portuguese and English or something, so you can remove the others, and it will maybe use fewer resources. So there are a lot of things you can hack. Will do. Thank you. 30 seconds for one very quick question and answer. If not, go to the Matrix chat. Thank you. Okay, thanks. All right, thank you. |
Marrying Collabora Online and LibreOffice WASM
Running Collabora Online in WASM |
Hi everyone, my name is Balazs Varga, I work at Allotropia GmbH, and in this presentation I would like to talk about WebAssembly and Emscripten technology, running Collabora Online in WebAssembly, and headless conversion in WebAssembly as well. So let's get started with Collabora Online in WebAssembly. It was a joint project with Collabora; we worked together with Tor and Michael Stahl. The goal was to approach offline document editing. On this slide there is a design sketch. The idea is that when the connection breaks, an application containing the Collabora Online server functionality is activated in the browser, and when the connection is restored, the document is edited on the Collabora Online server again. I think Tor will talk about it a bit more. But to make it work, we first had to build it for WebAssembly. For that we use the Emscripten compiler toolchain, so let's talk about that. Emscripten is a complete open source compiler toolchain to WebAssembly. Using Emscripten you can compile C and C++ code, or code in any other language that uses LLVM, into WebAssembly and run it on the web, in Node.js, or in other runtimes. Emscripten generates small and fast code; its default output format is WebAssembly, a highly optimizable executable format that runs almost as fast as native code. A little bit about the Emscripten toolchain. The main tool is the Emscripten compiler frontend, emcc. It is a drop-in replacement for a standard compiler like GCC or Clang. emcc uses Clang and LLVM to compile to WebAssembly. emcc also emits JavaScript that provides API support to the compiled code, and that JavaScript can be executed by Node.js or from within HTML in a browser. There is more information on the slide, which you can read, about porting code to Emscripten. Support for portable C and C++ code is fairly comprehensive: the C standard library, the C++ standard library, exceptions, and so on are supported.
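As a minimal sketch of what compiling C++ to WebAssembly looks like from the source side, the function below is ordinary C++ that is additionally marked for export when built with Emscripten. The function `count_words`, the file name, and the build command in the comment are illustrative assumptions, not taken from the Collabora Online build. `EMSCRIPTEN_KEEPALIVE` is the Emscripten macro that keeps a function from being removed as dead code.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>

#ifdef __EMSCRIPTEN__
#include <emscripten/emscripten.h>
#else
// Native build: make the export marker a no-op so the same file compiles.
#define EMSCRIPTEN_KEEPALIVE
#endif

// Possible build (assumed file name): em++ count_words.cpp -o count_words.js

extern "C" {

// Hypothetical helper: counts whitespace-separated words in a C string.
// extern "C" keeps the symbol name unmangled so JavaScript can look it up.
EMSCRIPTEN_KEEPALIVE
std::uint32_t count_words(const char* text) {
    assert(text != nullptr);
    std::uint32_t words = 0;
    bool in_word = false;
    for (const char* p = text; *p != '\0'; ++p) {
        const bool is_space = std::isspace(static_cast<unsigned char>(*p)) != 0;
        if (!is_space && !in_word) {
            ++words;  // first character of a new word
        }
        in_word = !is_space;
    }
    return words;
}

}  // extern "C"
```

The same source builds natively with GCC or Clang, which is roughly what "drop-in replacement" means in practice: only the compiler command changes.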
Support is very good, and OpenGL is supported as well. Multithreading is supported, but it depends on SharedArrayBuffer, which is still being standardized and implemented by browsers. Now let's see how the porting worked in the case of Collabora Online. Building for Wasm with Emscripten is still a bit immature, and some projects needed patching to make it work. First, we need to build LibreOffice core with Emscripten. Earlier it was best to use the feature branch, but now it works with upstream master, without the Qt5 framework. Then the Collabora Online dependencies need to be built: the zstd library and the Poco library, which required patches to make them work. Then it's necessary to build the Collabora Online code itself and link all the binaries and executables together. Linking the executable uses a lot of memory, so we build without optimizations or with the -O1 flag. It should work anywhere, but there are different flags, like -O2 in the LDFLAGS, which is the default; then some link-time optimization happens that uses a great deal of RAM and, because of that, causes a segmentation fault. So we use the -O1 flag for Online and also for LibreOffice core. A little bit about running it, though there will be more about that later. The structure is quite similar to the Collabora Office iOS and Android applications. One difference is that in the mobile apps the C++ code is what starts first and then loads the HTML page into a WebKit view, in which the JavaScript runs. In the WebAssembly application it is the other way around: the web page is naturally what is loaded first, and then the JavaScript code starts running the C++ code as WebAssembly. You can see a screenshot here: it's a document in the browser, running in WebAssembly, with the dev tools open, where you can debug. It's working but still needs some optimization work, which is in progress. If there is an image in the document it's quite slow, but we are working on that as well.
Let's talk a little bit about some other WebAssembly work at Allotropia. We have also worked on headless conversion. At the last LibreOffice conference there was a little demo of it, but it wasn't a completely headless conversion. Now I have made a video, so let's see how it works now. We are not using the Qt5 framework anymore, and we are using a single HTML file, which is generated by Emscripten when building LibreOffice core. Let's hope the video works. It's working. This is a very basic HTML page, but you could make it any kind of HTML. There is just a button: you can select multiple files and, using the convert-to argument, convert the documents to any format that LibreOffice core can handle. Then you can download the result and view it in the browser as well. There is also a command line in the browser where you can see the results. The HTML page can be edited, and it's a future plan to make it much nicer. Some more future plans at Allotropia: we would like to call UNO API functions from JavaScript, for scripting and also for UI. Emscripten provides various options for connecting normal JavaScript and compiled code, ranging from functions for calling compiled C++ from JavaScript, and vice versa, to accessing environment variables from compiled code. These options include the WebIDL Binder and Embind, which create bindings between C++ and JavaScript and allow C++ code entities to be used in a natural manner from JavaScript. Those are our future plans. I think that's all. Thank you for your attention. We'll see you in the next one, bye for now. |
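The Embind option mentioned in the talk can be sketched as follows. This is only an illustration of the binding mechanism, under stated assumptions: `converted_name` and the JavaScript-side name `convertedName` are hypothetical and have nothing to do with the actual UNO API work described above. `EMSCRIPTEN_BINDINGS` and `emscripten::function` come from Emscripten's `<emscripten/bind.h>`; the binding part is compiled only when building with Emscripten, so the file also builds natively.

```cpp
#include <cassert>
#include <string>

#ifdef __EMSCRIPTEN__
#include <emscripten/bind.h>
#endif

// Hypothetical helper: derive the output file name for a conversion,
// e.g. ("report.odt", "pdf") -> "report.pdf".
std::string converted_name(const std::string& input, const std::string& format) {
    assert(!format.empty());
    const std::string::size_type slash = input.find_last_of('/');
    const std::string::size_type dot = input.find_last_of('.');
    // Only treat the dot as an extension separator if it is in the last path component.
    const bool has_extension =
        dot != std::string::npos && (slash == std::string::npos || dot > slash);
    const std::string stem = has_extension ? input.substr(0, dot) : input;
    return stem + "." + format;
}

#ifdef __EMSCRIPTEN__
// Embind registration: exposes the C++ function to JavaScript as
// Module.convertedName(input, format). Link with -lembind.
EMSCRIPTEN_BINDINGS(conversion_demo) {
    emscripten::function("convertedName", &converted_name);
}
#endif
```

With Embind, `std::string` arguments and return values are converted to and from JavaScript strings automatically, which is the "natural manner" the binding layer aims for.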
Collabora Online and WASM
Assembling off-line Collabora Online with the Web. |
Let me start. This talk is partly about the same thing as the previous talk. So, repetition is the mother of learning, or however the saying goes. As Balazs mentioned, the problem this is supposed to solve is the following: you are editing a document in Collabora Online, for instance on your laptop, and the connection breaks because you are going into a tunnel or something else happens. Then, seamlessly, it will eventually start using the local WASM version in the browser instead, working on the same document, which has been downloaded, without you knowing, into the browser's memory, not to your file system. So that is the solution. Of course, actually implementing this will be quite hard, but we are already working on it and I assume it will be successful. And when the connection comes up again, the document will be uploaded to the Collabora Online server, and editing will continue using the online server. I'm using the name COWASM for this, or actually it was Thorsten who invented that name, I think. What we don't think is a solution, at least for some customers, is to install Collabora Office locally. There are situations where you are not allowed to install third-party software easily, and if you did that and wanted to be prepared for the connection to Collabora Online breaking, you would have to keep downloading the document yourself all the time and then start Collabora Office locally, separately. And what is WebAssembly? Well, Balazs already told you. One thing that could be added is that WebAssembly runs in the same sandbox as a web page. It has access to the same things that JavaScript has access to or, more importantly, it doesn't have access to anything that JavaScript doesn't have access to. So it's quite safe: it can't read random files on your disk, and so on.
And WebAssembly doesn't even have direct access to the DOM of the HTML page, but it can easily call JavaScript that does have access, so that's not that important. There are also some runtime environments that are not in the browser. I don't remember what they're called, but at least one such exists. And WebAssembly is supported in most current browsers: Firefox, Chrome, Safari, Edge. Are there any others? The Emscripten toolchain is a Clang-based toolchain. I'm not sure whether the C and C++ libraries are considered part of Emscripten or not. Probably yes. The C library contains much of the normal Linux or POSIX functionality, including threading using pthreads, and there is an in-memory file system that you can use to pass files from the build of the WASM application into the application at runtime. Some of the very traditional Unix functionality is oddly implemented, which can be surprising. For instance, pipes, which are about 50 years old and have always worked the same way in Unix, are suddenly non-blocking in WASM, which was a surprise to us. We use a very specific version of Emscripten. We could try to upgrade to a newer one; I think it would work as well. But selecting which version of Emscripten and which toolchain options to use has been quite complicated. There are so many versions to choose from, and they all have slightly different issues or functionality differences. So once you have something that works, you tend to stick to it. The COWASM application contains all the relevant C++ code of LibreOffice core itself, all the external libraries that LibreOffice uses, plus the C++ code of Collabora Online. And the same JavaScript code as in Collabora Online is also used, of course, and that is what provides the UI.
Compared to the typical Emscripten examples that you see when you start reading about Emscripten, the WASM code in this case doesn't do any UI itself; all UI is handled by the JavaScript layer. The application structure is quite similar to the iOS and Android apps, which are constructed in a similar way. Just like in those apps, instead of several processes there is one process with multiple threads. Here you can see how Collabora Online is constructed, for comparison: there are several processes, and they communicate using WebSockets. In COWASM, instead of several processes, we have multiple threads, but the message passing between them is more or less the same as in the web-based Collabora Online server. This is something that should eventually be improved. As you see, instead of all this message passing, we could simply call directly the function that will eventually handle the message. That should be faster and, let's say, simpler on the system. Here are some pointers on how to build this thing. You first compile LibreOffice core, then the Online dependencies, and then Online. And you need to run this through a web server, because you can't use SharedArrayBuffer if you load a page from a file URL. Here is a sample screenshot, where I disconnected the internet and it still continued working. And that's it. Thanks to Allotropia for doing the initial work and making LibreOffice work as a Wasm application. I'm not sure why I put this on this particular page, but yes, it was Allotropia who figured out what versions work and so on. So that's all. |
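The difference between the message-passing structure and the direct call suggested in the talk can be illustrated with a small sketch. Everything here is hypothetical: `DocumentSession`, `MessageQueue`, `send_direct`, and the message texts are invented for illustration and are not Collabora Online code. The real system passes messages between threads; the sketch is single-threaded to keep the contrast visible.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the code that finally acts on a message.
struct DocumentSession {
    std::vector<std::string> handled;
    void handle(const std::string& message) { handled.push_back(message); }
};

// Message-passing style: senders enqueue text messages, and a separate
// step later takes them off the queue and dispatches them.
class MessageQueue {
public:
    void post(std::string message) { pending_.push_back(std::move(message)); }

    // Deliver everything queued so far; returns the number of messages handled.
    std::size_t drain(DocumentSession& session) {
        std::size_t delivered = 0;
        while (!pending_.empty()) {
            session.handle(pending_.front());
            pending_.pop_front();
            ++delivered;
        }
        return delivered;
    }

private:
    std::deque<std::string> pending_;
};

// Direct style: the caller invokes the handler itself, with no queue in between.
inline void send_direct(DocumentSession& session, const std::string& message) {
    session.handle(message);
}
```

Both paths end in the same `handle` call; the direct path simply skips the copy into the queue and the later dispatch step, which is the saving the speaker alludes to.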
State of the Toolchain |
A very quick update on how we use C++ standards in LibreOffice, to finish off this afternoon. We're still using C++17, almost. There's one thing that keeps cropping up; I think Noel tried to sneak it in twice, and Mike once. It is the unsuspecting std::from_chars: you have a string view of characters and want to get an integer or floating-point value out of it, and there's a standard function for that. Unfortunately, it is in libstdc++ only from version 8, and for floating-point values only from version 11, which we don't have; we're still stuck at 7. So people keep trying to add such a function to the UNO libraries, and I, as the gatekeeper of those libraries, which we never want to change because then we have maintenance problems, keep batting them back: we don't need to add anything there, because we'll have C++17 functionality for it anyway. I have kept saying that for years now. At some point we'll get there, I'm pretty sure. So it's just C++17 for now, but of course we make use of whatever becomes available in the later standards. C++20 has been out for two or three years now, and C++23 is almost finished; standardization takes its time, but it's quite frozen by now. Then there are always these small things, mostly in the standard library, that are easy to approximate. We usually make use of those ideas in our own code and have one header file where we either alias the original into our o3tl namespace or implement our own approximation, which we throw out once the real thing is available. One example is std::span, similar to string view, where you just have a range of values given by start and length.
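For readers who have not used it, this is roughly what the std::from_chars call discussed above looks like for integers. It is a minimal sketch, not LibreOffice code: the wrapper name `parse_int` and the choice to reject trailing characters are assumptions made for the example. It needs C++17 and a standard library that ships `<charconv>`.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parse a whole string view as a base-10 int; returns std::nullopt on any error.
std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const std::from_chars_result result = std::from_chars(first, last, value);
    // result.ec is std::errc{} on success; result.ptr points past the parsed digits.
    if (result.ec != std::errc{} || result.ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

Unlike the older `strtol` family, std::from_chars works on a character range without requiring a terminating NUL, does not skip leading whitespace, and ignores the locale, which is why it pairs naturally with string views.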
Then there are the integer comparison functions. They are very interesting; if you don't know them, check them out. You always have a hard time comparing two integers in C and C++ if they are of different types, signed and unsigned, and you get surprising results. Finally, in C++20, they came up with functions that have ugly syntax but do the right thing whatever types you throw at them, and we have at least one place where we use them by now, for good measure. Another example is the std::unreachable magic function that they introduced; we still have a macro to approximate it. If there is a place in your code that can't be reached, say a default in a switch, and you don't want to return any nonsense from it, but the compiler would warn you that you don't have a return statement there, then just add an unreachable to tell the compiler that this point is impossible to reach anyway. Then there are bigger features, beyond library features, that we try to make use of one way or another. For example, C++20 consteval, which is similar to constexpr, for something that should be computed at compile time. constexpr means: do it at compile time if possible, otherwise do it at runtime. consteval forces it to be done at compile time. The trick is that if you have a function with an assert in it and you make that function consteval, then, if the assert does not hold, you get a compilation error instead of a runtime error later on, or no error at all if you build with asserts disabled.
We make use of that in some places. For example, we have this Color class, and I think Noel, again, at one point tried to get rid of the ambiguity about whether it has an alpha channel or not. So we now have a constructor that wants to make sure you pass in a value that doesn't have an alpha channel, with an assert in there. If we have consteval in the latest compilers, we use it, and then we get an error at compilation time already if you pass in a value that does have an alpha channel after all. That helped with changing the code from the old semantics to the new semantics. So we have an #if around that: if HAVE_CPP_CONSTEVAL, then use consteval, otherwise use constexpr. In our configure script we have lots of checks for whether we can use consteval, and unfortunately, by now, that is only the case with Clang, even with the latest compiler versions. We have five checks in there for bugs that we discovered in the consteval implementations. The latest Clang has all its bugs sorted out, but with GCC and the Microsoft compiler we still can't use it. That shows that if you have a feature sheet of what the compilers support, with ticks on it, you shouldn't trust it too much: once you actually try to use a feature, you run into all kinds of issues. And then, of course, there are issues coming up that are even bigger, where you can't use #ifdef trickery or an include file in which you approximate things. The biggest two that come to my mind: first, concepts in C++20, which would make code much more readable but are hard to do in a macro way. We have one place by now, I think, where a requires clause is again wrapped in an #ifdef for whether we have a C++20 implementation that supports it. It is a function that internally does some dangerous dynamic casting, and UNO proxies don't support dynamic casting, so I wanted to make sure we never use that function on a template type that could be a UNO proxy. So I came up with this wonderful requires clause, so that with a new enough compiler this gets sorted out at compile time, and otherwise we ignore it. The even bigger thing is modules, and I guess we will have to wait for others to come up with real implementations, or real-world usage, to see how we would make use of them.
But even if these features are still in the distant future in some cases, what we already can do is push the compilers hard: run our code through them and demonstrate to their developers what bugs they have, or what bugs are in the standard library implementations when they introduce new things. So what I do is opt in to using, with whatever compiler, the latest C++ version that compiler implements, which is typically C++23 by now. Then I have a big matrix of platforms, compilers, and standard runtime libraries to build on, and I find all kinds of interesting bugs whenever I update one of those things. Then I mail the people: mail Jonathan that he introduced something new in libstdc++ that doesn't make Clang happy, and so forth. And these people are happy that we are the guinea pigs for them. And that's it. I guess we have time for questions, if there are any. I'm not prepared for that. Who do you contact if you find a bug in the Microsoft compiler? Oh, they do have a web form now that you can fill in, and I think I even got a response once. Wow. Mike Kaganski brought that up. Okay, maybe one more question, since we're at the end of the track. What's the status of modules in compilers? Because as far as I know, only Visual Studio does it, and does it kind of, sort of. No, I think all three support them by now in their trunk versions. I think they claim it works, but I never tried it. Sorry, a module is a new way of organizing your code so that you don't have this issue of all these includes that you need to include? It's like a newer version of pre-compiled headers? Actually, yeah, that sums it up. Okay, then thanks again. Thanks. |
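Two of the patterns from this talk, using consteval when the compiler has it and aliasing a newer library function or approximating it otherwise, can be sketched with standard feature-test macros. This is an illustration under assumptions, not LibreOffice code: the macro `DEMO_CONSTEVAL`, the `demo` namespace, and the `Rgb` type are hypothetical, and the sketch keys off `__cpp_consteval` and `__cpp_lib_integer_comparison_functions` rather than configure-script checks like the ones the speaker describes. It compiles as C++17 and picks up the C++20 facilities when available.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// DEMO_CONSTEVAL is a hypothetical macro name for this sketch.
#if defined(__cpp_consteval)
#define DEMO_CONSTEVAL consteval   // evaluation is forced to compile time
#else
#define DEMO_CONSTEVAL constexpr   // fallback: compile time if possible, else runtime
#endif

namespace demo {

// Hypothetical stand-in for a colour value that must not carry an alpha channel.
struct Rgb {
    std::uint32_t value;
};

DEMO_CONSTEVAL Rgb make_rgb(std::uint32_t v) {
    // With consteval, a failing assert makes the call a compile error.
    assert((v & 0xFF000000u) == 0);
    return Rgb{v};
}

#if defined(__cpp_lib_integer_comparison_functions)
using std::cmp_less;  // C++20 standard library version
#else
// Approximation for older standard libraries: compares mathematically,
// without the usual signed-to-unsigned conversion surprise.
template <typename T, typename U>
constexpr bool cmp_less(T t, U u) noexcept {
    if constexpr (std::is_signed_v<T> == std::is_signed_v<U>) {
        return t < u;
    } else if constexpr (std::is_signed_v<T>) {
        return t < 0 || static_cast<std::make_unsigned_t<T>>(t) < u;
    } else {
        return u >= 0 && t < static_cast<std::make_unsigned_t<U>>(u);
    }
}
#endif

}  // namespace demo
```

The surprise the comparison functions avoid is that the built-in expression `-1 < 1u` is false, because `-1` is converted to a large unsigned value first, whereas `demo::cmp_less(-1, 1u)` is true.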
Demystifying compiler-rt-sanitizers for multiple architectures |
Good afternoon, everyone. I'm Mamta. I work as a software engineer at Leica Geosystems, mostly in the embedded domain. This is the outline of my talk. To keep everyone on the same page, I'll give a brief introduction to LLVM and Clang, so that the terminology I use later is clear. Then I'll talk about the compiler-rt sanitizers: what they are, how you can build them, and how exactly they work. And then my final thoughts about sanitizers. This is a typical compiler pipeline, which most of you are aware of; it is often called the textbook diagram. Whenever you write source code in any language, producing a binary executable for your machine takes multiple stages. The most important ones are the front end, the middle end, and the back end. The front end does the lexical analysis and semantic analysis, checks the syntax, and generates an intermediate representation. That is passed to the middle end, which does target-independent optimizations, and then to the back end, which generates the executable and applies more optimizations depending on the target you are building for. If there are multiple object files, the linker comes into the picture. So how does LLVM fit into this pipeline? It is basically the same: LLVM is a modular and reusable compiler framework that provides front ends, back ends, and the LLVM core, which is the LLVM optimizer. If we map it to our compilation pipeline: depending on the language you are building, you have an LLVM front end, which is Clang for the C family of languages and rustc for Rust. Then, depending on the target, it uses a back end such as x86, 32-bit or 64-bit. The part that is most reusable, from the LLVM perspective, is the LLVM optimizer, the core.
So if I have to develop a new compiler toolchain, I can easily reuse LLVM: I just have to write my own front end and, if there is new hardware, maybe a back end. In short, LLVM is like the Lego of compiler toolchains. Next, focusing a bit more on one of the LLVM front ends: Clang, which targets mostly C, C++, and C-like languages. Whenever you give it your source code, it performs lexical analysis, generates tokens for parsing, then does semantic analysis and generates an abstract syntax tree; the end goal is LLVM intermediate representation. To summarize: LLVM is a collection of modular and reusable compiler technologies, and there is much more to it, because it now also provides static analyzers, sanitizers, and more libraries, all under the umbrella project at llvm.org. Clang is a compiler front end, mostly for C, C++, and C-like languages, but the clang executable is more than a front end. When you build LLVM, you get the clang executable as well, and it acts as a compiler driver. For example, a compiler has to be told which directories to search, via the -I option, or where the standard library is for C and C++. Clang as a driver does most of this housekeeping: it tells the compilation pipeline where to look for libraries and also provides some OS-related information. So that was a brief introduction to Clang and LLVM. Now I will talk about the compiler-rt sanitizers, which are part of one of the LLVM subprojects, but before that I will say a little about the runtimes. LLVM comes with the compiler-rt runtimes, which are pretty much the equivalent of libgcc for the LLVM pipeline and provide target-specific support for low-level functionality that the hardware itself cannot perform. compiler-rt consists of three main components: builtins, sanitizer runtimes, and profilers.
The built-ins provide implementations of target-specific hooks for things the hardware itself cannot do. To simplify it a bit more: a 32-bit system, for example, cannot do 64-bit division. Here you can see a code snippet, which I compile using Clang. First we do a normal compilation on an x86 machine, which is 64-bit, and on the left you can see it directly uses divq, which means it performs the division itself. But when I use -m32, which forces it to compile for a 32-bit machine, it depends on another call: it actually calls __udivdi3, which is a built-in implemented in compiler-rt. So this is the overall picture of the compiler-rt runtimes. Now, talking about the sanitizers: sanitizers are runtime checks. It is like adding a probe to your code to verify whether there are any memory bugs, or to find any security flaws. In LLVM they are provided by compiler-rt and are called the compiler-rt sanitizers, and there are multiple kinds available. First is the address sanitizer, which you can use to detect use-after-free, buffer overflows, and memory leaks as well. Then you have the undefined behavior sanitizer, the memory sanitizer to identify uninitialized memory, and the thread sanitizer to detect race conditions and deadlocks. Here is just an example of what a sanitizer report looks like. This is very simple code where I allocate some memory on the heap, free it, and then try to access it after the free. If you build this code with the sanitizer flag on and run it, it will immediately complain that you are trying to access memory after it was freed. That is how it looks when built with the address sanitizer, and I will talk in more detail about what goes on behind the scenes. But before that: how to build the compiler-rt sanitizers.
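As a rough illustration of the use-after-free report described above, here is a toy model in Python. It is only a sketch of the idea (freed blocks stay marked as poisoned instead of being reused, so a later access can be flagged); the class `ToyHeap`, its methods, and the starting address are hypothetical names and values chosen for this example and are not part of compiler-rt.

```python
class ToyHeap:
    """Toy allocator that keeps freed blocks poisoned so later accesses are caught."""

    def __init__(self):
        self.next_addr = 0x1000   # hypothetical start of the toy heap
        self.live = {}            # base address -> size, for live blocks
        self.freed = {}           # base address -> size, for freed (poisoned) blocks

    def malloc(self, size):
        addr = self.next_addr
        # Leave a gap after each block; it stands in for a red zone.
        self.next_addr += size + 16
        self.live[addr] = size
        return addr

    def free(self, addr):
        # Do not reuse the block: move it to the freed set so it stays poisoned.
        self.freed[addr] = self.live.pop(addr)

    def check_access(self, addr):
        """Classify a one-byte access at addr."""
        for base, size in self.freed.items():
            if base <= addr < base + size:
                return "heap-use-after-free"
        for base, size in self.live.items():
            if base <= addr < base + size:
                return "ok"
        return "not-addressable"


heap = ToyHeap()
p = heap.malloc(8)
before_free = heap.check_access(p)   # access while the block is live
heap.free(p)
after_free = heap.check_access(p)    # same address, after free
print(before_free, after_free)
```

The real runtime does this bookkeeping with shadow memory rather than dictionaries, which is what the talk turns to later.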
There is a lot of documentation around, and it is very easy to follow, but sometimes it works and sometimes it does not. First, you can build compiler-rt together with LLVM. That is easy: you enable it directly with LLVM_ENABLE_PROJECTS when you are building your complete LLVM toolchain. You can also do a separate build if you have your llvm-config, using the generator of your choice. To enable the sanitizers when building along with the complete LLVM toolchain, I just have to set the flag COMPILER_RT_BUILD_SANITIZERS to ON. When you do this (here I am using Ninja as the generator) you get a configuration out of CMake, and you can see it enabling the different sanitizers: address sanitizer, leak sanitizer, memory sanitizer, thread sanitizer, and undefined behavior sanitizer. After the build and installation you get this set of libraries. You can do the same for the standalone build with a similar flag, and this is the configuration generated when I do a compiler-rt standalone build. It is also possible to cross-compile the compiler-rt sanitizers. You have to provide a lot of flags and you need to have your ARM sysroot as well. I personally do not prefer this way, but there is a talk at the end of today from Peter about building an embedded toolchain using LLVM. To make it a bit easier for embedded developers, in the Yocto Project there is an OpenEmbedded layer called meta-clang. It makes things a bit easier because it provides everything for building your toolchain; you just have to include this layer, if you are familiar with Yocto builds. Only a few configurations are needed: you have to enable the SDK, you have to use the LLVM runtimes, and then, either in your package groups or in local.conf, include compiler-rt and compiler-rt-sanitizers.
This actually generates an SDK, and it is very easy to distribute this SDK to other developers. In our case I used to send it to the application team, so that the people developing C++ code can compile their code and run sanitizers on it. It was actually while contributing to meta-clang that I became more aware of the compiler-rt sanitizers. They are now available for 32-bit ARM, 64-bit ARM, and of course x86, and you can easily test on QEMU ARM as well, just by specifying your sysroot and running your test code to see how it behaves on your actual target. That was all about what the compiler-rt sanitizers are and how we build them, but what exactly goes on behind the scenes? Here I am using the address sanitizer as the example. This is a very basic piece of code where we take some arguments and convert them to integers. You can see we are using the argument count, and its value can be very large. First, when I compile it with Clang, it compiles. It is very easy to spot here that I have defined the size of the buffer to be just 2, so if I provide more than 2 inputs it should fail. That is the case, but the crux is that it is unpredictable: on some machines it will fail with 3 inputs and on some with 4, leading to a segmentation fault, and we do not know what exactly happened behind the scenes. Second, you build it with the address sanitizer enabled, that is, linked with compiler-rt and with the address sanitizer flag. The output is actually a bit large; this is the first part of it. It easily spots that there was a stack buffer overflow, and if you build with the -g option it also points out at which line it is failing. But what exactly led to this kind of error being generated?
In very simple terms, adding a sanitizer is like adding additional code to your actual code to check when it is going to fail and to report the error. Here it looks very easy: if the input size for the buffer is more than 2, just report an error. But behind the scenes the address sanitizer does much more, and it can adopt multiple strategies to implement this. The address sanitizer uses memory mapping, and memory that should not be accessed is called poisoned memory. Behind the scenes, the check is implemented as: is this memory poisoned or not? Here poisoned means that the access refers to deallocated memory, or to already allocated memory, or that there is an overflow; if so, report the error. But there is more to how this memory is mapped. For any application that you build, the virtual address space is divided into two kinds of memory: shadow memory and application memory. For the address sanitizer it is important to implement this is-poisoned check and the error report in a very compact and fast way, which is where shadow memory and application memory come in. Application memory is the main memory of the code, and shadow memory mirrors the application memory, but 8 bytes of application memory are mapped to 1 byte of shadow memory, and when the sanitizer checks whether the memory is poisoned, it is mapped to either 0 or 1. Here is a small portion of the shadow memory for our example, and you can see that the memory which is accessible and in good condition is marked as 0.
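The 8-to-1 mapping just described can be sketched in a few lines of Python. This is a simplified model, not the compiler-rt code: the function names and the example addresses are made up for illustration, and the shadow is a dictionary rather than a reserved region of the address space. The encoding follows the published AddressSanitizer scheme as far as I am certain of it: a shadow byte of 0 means all 8 bytes are addressable, a value k from 1 to 7 means only the first k bytes are, and values such as 0xf1 and 0xf3 mark the left and right stack red zones, so the talk's "0 or 1" is a simplification of "zero or non-zero".

```python
GRANULE = 8                    # application bytes covered by one shadow byte
STACK_LEFT_REDZONE = 0xF1      # marker values as printed in ASan reports
STACK_RIGHT_REDZONE = 0xF3


def shadow_index(addr):
    """Map an application address to the index of its shadow byte (addr / 8)."""
    return addr >> 3


def mark_buffer(shadow, base, size):
    """Describe a buffer of `size` bytes at an 8-byte-aligned `base`,
    with one red-zone granule on each side."""
    first = shadow_index(base)
    shadow[first - 1] = STACK_LEFT_REDZONE
    full, rest = divmod(size, GRANULE)
    for i in range(full):
        shadow[first + i] = 0          # whole granule addressable
    end = first + full
    if rest:
        shadow[end] = rest             # only the first `rest` bytes addressable
        end += 1
    shadow[end] = STACK_RIGHT_REDZONE


def is_poisoned(shadow, addr):
    """Check a one-byte access; granules never marked count as addressable here."""
    value = shadow.get(shadow_index(addr), 0)
    if value == 0:
        return False
    if value < GRANULE:
        return (addr & (GRANULE - 1)) >= value
    return True


shadow = {}
mark_buffer(shadow, 0x1000, 8)         # e.g. two 4-byte ints
print(is_poisoned(shadow, 0x1004))     # inside the buffer
print(is_poisoned(shadow, 0x1008))     # first byte past the end
```

Because the check is a shift, a load, and a compare, the instrumentation can be placed before every memory access at modest cost, which is the "compact and fast" property the talk mentions.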
But here, where we are checking the buffer, you can see that the memory beyond the allocated buffer size, which we are trying to access, is marked as 1. In the square brackets it is saying: hello, you are not supposed to access this, it is out of the range of the buffer. That is how application memory and shadow memory are used, and you can see here the restricted areas, the red zones. Anything beyond the allocated region gets marked, and you can see f1 and f3 around it. That is how the address sanitizer works: by using shadow memory and application memory, and whenever it spots that the shadow value is 1, it says: okay, there is a problem. That is all about the address sanitizer and how it works. Here are my final thoughts about using sanitizers. They are a great tool for finding bugs and memory issues at run time in complex applications. By using sanitizers you can improve your development and spot errors very quickly. We had a very simple example, but when the code gets more complex it is more difficult to find what the problem is. Sanitizers are more a tool for checking, not something to use in production, since they increase the code size, but they are comparatively faster than Valgrind, the existing tool. Still, not all architectures are supported uniformly. For example, for 32-bit ARM we do not have the thread sanitizer completely implemented, so I hope we see a better implementation later that supports thread sanitization on 32-bit machines as well. Questions? Yes, it works. Sorry, the question is whether I can use an optimization flag when building with the address sanitizer enabled. Yes, it can be enabled up to -O3 as well, and you get a similar kind of log.
So the question is where I place the instrumentation to identify whether my buffer is overflowing, and whether to use an assertion here. I am not sure I can answer that well, but the strategy here relies more on this memory mapping. Also, since this buffer is very small, maybe an assertion might work here. This I am also not very aware of. The question is whether I can sanitize only a very specific portion of the code, right? Yeah, you can... no, I am sorry, I am not very aware of it. You can do the reverse: you can prevent specific code from being instrumented, either by annotating the code or by using a blacklist. Yeah, there is this blacklist option for the address sanitizer as well. You say that sanitizers are primarily a development tool, and yes, they mostly are, but have you ever looked into using some of them for actual production executables, for hardening? For example, UBSan has a fairly minimal overhead, so it can be used in production, especially if you make it trap or if you use the minimal runtime. There is also GWP-ASan, which is like a lightweight sampling version of the other sanitizers and can be used in production; I think Android is using it, actually. Okay, so does it support some of the embedded platforms as well, like ARM32 or ARM64? I believe it does. UBSan works pretty much everywhere, almost everywhere. GWP-ASan works, I think, almost everywhere, but it also depends on an external function for unwinding, so it might not be great in all scenarios. It really depends. I was wondering if you can use this for kernel ring-zero code as well. Yes, you can, but... Well, you don't have the runtime there. It's not in LLVM; it's actually offered by the kernel. You insert the checks, but the hooks themselves are implemented in the kernel. Well, some sanitizers don't require any runtime. So for ASan you need instrumentation. Yeah, you can use the kernel address sanitizer for it.
But like ASAN also needs like the ASAN best point like the IRF. I think it's just a level implementation of the stuff. There are other sanitizers in the kernel which are implemented this way. I think that's KCFI. Yeah. Yes. Do you know where the code currently lives that does the hooking of malloc and free, like in the C library? How does the interaction with the C library work? Because I could, for example, have musl instead of glibc as the C library, and I guess the code has to be adapted somehow, or there must be some hook infrastructure. So do you know how that works and where the code lives that hooks malloc and free, for example? It's in compiler-rt, if I'm right. So if you change the version of the library you have... Yeah, the ASan runtime has to hook malloc and free. Okay, so outside of the C library. Yeah. For ASan, yes, but for the leak sanitizer, for example, which is enabled by ASan, you need to be aware of things like the TLS layout and so on. But the runtime in LLVM does implement both. Yes. Are the sanitizers shared between GCC and LLVM, or does each have its own implementation? They are shared. Yes. I noticed that one shadow byte corresponds to eight actual bytes. However, considering that there are special values for it, does it mean that there could be situations where some bytes are not necessarily protected? Yeah, there can be some cases which you could call a false positive kind of thing, or sometimes it is checking very much at the end. It was a very small example, but it can happen that you get these square brackets at the very end and it's an overrun case. So that can happen. Yes. Could you get a false negative in some sense? In the sense that if you have two almost adjacent blocks of allocated application memory and you access a pointer from one into the other, could you get a false negative?
Because that would not be poisoned, and it would still be part of the application memory, but for example what address could you get? Yeah, it can be. It can be a false positive case, maybe. Yes. Okay. I think it was already partially answered before, but I am not sure, so I am just going to ask it. I am also working on an embedded scenario, and I think we don't use glibc or musl; we have our own set of routines for malloc and free. Yes. Would it also work in that case, or are there more requirements before you can use these sanitizers on those devices? From my experience, I have just been using this meta-clang layer, and for testing on the SDKs and the emulator it was perfectly fine. So not on the actual devices, because it is just meant for testing. All right. Thank you, Mamta. Thank you. Thank you. |
Defining a multi-architecture interface for SYCL in LLVM Clang |
Yeah, yeah, exactly. Okay, good afternoon. I'm going to be talking about compiler intrinsics in SYCL, in DPC++ specifically. This is Intel's open source SYCL implementation, and it is what I work on. Hopefully I'll be able to say something without saying too much in 10 minutes. I work for Codeplay. We had the first SYCL implementation, ComputeCpp. We were acquired by Intel, so now we work on the Intel SYCL implementation, DPC++. We have lots of partners, hardware companies and so on; whoever needs an OpenCL implementation, a SYCL implementation, and so on comes to us. SYCL is a single-source heterogeneous programming API, so you can write single-source code that can run on NVIDIA, Intel, and AMD GPUs. (Closer to the mic? Okay, voice up.) It's great for someone developing scientific applications to be able to write single-source code that runs on whatever GPU the implementation enables, through back ends such as CUDA, Level Zero for Intel, AMD GPUs, and so on. This is a really good thing. I work specifically on the NVIDIA and the HIP (AMD) back ends for DPC++. I want to talk a little bit about compiler intrinsics, how math function calls work in SYCL and DPC++ at the moment, and how we can hopefully improve them so that we're contributing upstream. So what happens to sycl::cos? Essentially, you have sycl::cos in your source code. This is redirected to __spirv_ocl_cosf. You compile to SPIR-V and make a SPIR-V module; this is a symbol within the SPIR-V module, and the implementation is provided by an OpenCL, Level Zero, or Vulkan driver. As I said, I don't work on the SPIR-V back end at all. I work on the PTX (CUDA) and AMDGPU back ends. So what do we do with these symbols so that we get to the native implementations?
We're not trying to reinvent the wheel. We're not trying to do anything that the people who make the GPUs aren't doing already. We're just trying to redirect to that. So how do we go from this to that, and then compile to our PTX module, our AMDGPU module, HSA module, and so on? How do we go from __spirv_ocl_cosf to __nv_cosf? We use a shim library. Easy peasy, that's fine. You just redirect it, you compile it to bitcode, you link it at compilation time, and you get to this native bitcode implementation. This is great. We use libclc for this. libclc is written in OpenCL. OpenCL does lots of stuff that SYCL doesn't expose as easily, like address spaces, that kind of thing. So we write in OpenCL. This is great. This makes our lives really, really easy. Before we get into this: why do we want to use a bitcode library in the first place? Why don't we use a .so? Why don't we just resolve to some symbol, so that a runtime is called and we don't care about it? On a GPU, the overhead of a function call is really high, because we lose information about, say, address spaces, that kind of thing. The GPU memory hierarchy is a bit more complex than for a CPU, so we really, really need to worry about this. We want to inline everything so we don't lose any information about our memory hierarchies. We also allow compile-time branching of code based on the architecture, based on the back end, that kind of thing. We don't want to have these checks at runtime. We want high performance. That's the name of the game for what we're doing. This gives us greater optimization opportunities as well. You can do lots of dead code elimination, lots of fun stuff in the middle end, because you're doing all these checks at the IR level. So this is just what it looks like. We just have __spirv_ocl_cosf, and we return __nv_cosf. Great. Amazing. That's so easy.
And then this is the implementation, which is provided by NVIDIA. This is in bitcode. We link this, and then this is just inlined into our IR. This is great. So we're linking this SYCL code with libclc. Then we link that with the vendor-provided bitcode library. So we're linking, linking, and we get to the implementation. It's all inlined. It's all great. We love it. This works well, but here is a bit of code from libclc. Because we're dealing in OpenCL C, we could choose something else; we could write native IR. We find that OpenCL is actually easier to use and easier to maintain than writing native IR. But we end up with some funny problems with mangling and all this kind of thing. This isn't nice. Sometimes we need manual mangling. It has to do with namespaces when they're interpreted by the OpenCL mangler, unfortunately. And sometimes OpenCL isn't as good as we want it to be, so we need to actually write native IR as well. So it's a mix of LLVM IR and libclc. It's a bit messy. It's not great. Also, we're exposing some compiler internals here. This is the NVVM reflect pass, which essentially takes your call to NVVM reflect and replaces it with a numeric value. This is done entirely at the IR level, so you can branch at the IR level: this is a newer architecture, so use this new implementation, this new built-in; this is an older architecture, so use something else. The pass is used for things like rounding modes as well. We're exposing this in source code through hacks. It's not really kosher. But it works. Who cares? But consider the new proposal to add FP accuracy attributes to math built-ins. This is where we have, say, an fpbuiltin cos, and we specify the accuracy in ULP that we want it to be computed to. This is totally lost on us. So this is what it would look like. You have this attribute here, fpbuiltin-max-error.
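Since the accuracy in that proposal is expressed in ULP (units in the last place), a small worked example may help. The Python sketch below measures how many representable single-precision values separate an approximation from a reference result. The helper names `float32_bits`, `ulp_distance`, and `cos_taylor` are hypothetical and chosen for this illustration; nothing here is how LLVM or any vendor library checks accuracy, and a real ULP error is measured against the exact result rather than a rounded one.

```python
import math
import struct


def float32_bits(x):
    """Bit pattern of x after rounding to IEEE-754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def ulp_distance(a, b):
    """Count of float32 steps between two finite values of the same sign."""
    return abs(float32_bits(a) - float32_bits(b))


def cos_taylor(x, terms):
    """Deliberately crude cosine: the first `terms` terms of the Taylor series."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(terms))


x = 0.5
reference = math.cos(x)
for terms in (2, 4, 6):
    error = ulp_distance(cos_taylor(x, terms), reference)
    print(f"{terms} terms: {error} ULP from the reference")
```

A target that promises a small maximum error could accept only the longer series, while a target with a looser bound could pick a cheaper one; that choice is the kind of information an accuracy attribute would carry down to the back end.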
This is really, really needed in SYCL, because SYCL is targeting lots and lots of different platforms, and all these platforms have different numerical accuracy guarantees. We really need this, but we don't use built-ins at all. Sorry, we don't use LLVM intrinsics at all. So we need to get to a point where we can start using this compiler infrastructure. We're not using it as much as we want to. We could do this using a libclc compiler hack, a workaround. We add another pass, and you just say: compiler precision value. If it's that, do some precise square root. If it's not, do some approximate thing. Yeah, we could do that. The problem with libclc and this stuff is that it's not upstreamable. It's a collection of hacks. It's not totally hacked, but it's a little bit messy. It's not written in one API; it's OpenCL and it's LLVM IR. It's messy. We can't upstream this, so we can't all benefit from it. So the pro of adding another hack to the bunch of passes is that it's easy to do. We can do this and keep going with our libclc implementation. It's pretty straightforward. We've been doing this the whole time. Fine. We don't need to worry about the broader LLVM concerns. However, we miss out on LLVM community collaboration, which is why we're here. And how many of these workarounds do we actually need in order to keep up with the latest developments? libclc, as bad as it may be now, just degenerates into an absolute mess, and we don't want that. So we think the answer is to try to actually have it call the compiler intrinsic. We want to use compiler intrinsics and then have some generic behavior of these intrinsics for offload targets.
This would be used by, say, OpenMP, by CUDA Clang, and so on, all these different targets. But we don't have this transformation. We're not comfortable with this connection from an intrinsic to a vendor-provided bitcode built-in. Why is that? Essentially, this needs to happen as early as possible at the IR level. So we're adding an external dependency to our LLVM pipeline. We need to link this bitcode library early on in our pipeline. We don't do this. We're not comfortable with doing this. We need to figure out a way that people will be happy with us doing this. Obviously we're used to things resolving to external symbols, but that's a runtime thing, not a compile-time thing. This needs to be inlined. We need to do lots and lots of stuff with this at the IR level. There will still be cases where we need libclc, potentially. It's not going to just disappear from our SYCL implementation, but we need to start pushing towards better resolution, better use of these intrinsics in LLVM for offload in general. So why? Shared infrastructure, staying on the cutting edge of new developments, fewer compiler hacks, and we make SYCL compilation eventually work upstream. It doesn't at the moment, but eventually we want it to, of course. We're trying to upstream as much as possible, but libclc is not upstreamable, and that's a problem. So the first step is to try to have this discussion about making the intrinsics work for offload. Okay, time's up. We need to have this link step at the IR level, early on in the IR pipeline. This is problematic for some people, but we need to talk about it. So please join in the discussion here, "NVPTX codegen for llvm.sin and friends", if you have any opinions on this. Sorry, I ran over a little bit, but yeah, any questions?
Yeah, I was wondering, would it make sense to try to get rid of the mess by going to an MLIR type of approach? What are the benefits or downsides of MLIR? So the question was: are there benefits, can we avoid this by going to MLIR? I think it's more fundamental than MLIR. I'm not an expert on MLIR, but I think we need basic resolution of intrinsics. Presumably with MLIR you'll have other MLIR intrinsics that will need the same kind of treatment, and we'll have the same questions there. So this is the first case study. This is the simplest case. We're not trying to implement the new fpbuiltins with the accuracy attribute. We're just trying to decide how to make this dependency on an external bitcode library work, and do it in a very, very confined sort of way. Yeah, thank you. Yeah. Two questions. The first one: in the tutorial on generating NVPTX from MLIR, there is a section about linking with the bitcode library from NVIDIA. So what's the difference from this? And the second question: you mentioned NVVM, which is the closed source PTX generator from NVIDIA, and there is also the LLVM NVPTX back end. Are we reaching speed parity with the closed source one? It depends on the application. So, the second question first: is there still a big performance gap between the native, say, NVCC compiler and LLVM Clang? In terms of DPC++, which is a fork of LLVM, we're attaining roughly comparable performance whether you're using SYCL or you're using CUDA with NVCC, and any improvements that we make to the compiler are shared by Clang CUDA as well. The first question again was: how is this different? Essentially, when you're linking bitcode, you're not using any LLVM intrinsics. You're just redirecting things yourself. You're not using intrinsics, so you need to do everything explicitly.
You need to either have a specific kind of driver path that will do this for you or you need to specifically say, I want to link this in at this time or whatever. And so it's more manual. It's not happening automatically. It's not happening really within the compiler. It's happening at link time, LLVM link time. All right. Thank you, Hugh. Thank you. |
How to Build your own MLIR Dialect |
Yeah, I hope you had a great FOSDEM so far. I'm happy to talk about how to build your own MLIR dialect. Just as a first question: who is aware of what MLIR actually is, who has heard of the MLIR subproject? Awesome. It's not the whole audience, so I'm going to talk a little bit more about what MLIR is. My outline is: what is MLIR, actually, but I only have a really short slide on that. I will show you the standalone example, which exists in the LLVM repository as part of the MLIR project. And I will tell you a little bit more about how you can extend it and how you can build your own dialect, because following the discussions on Discourse and Discord, it always seems like people hit the same pain points; at least we did, several times. That's why I set up this kind of tutorial, to show you some of the tricks behind it, mainly from the CMake perspective, which is kind of tricky sometimes. Besides how to build it, I will show you how you can combine it with other dialects and, last but not least, how to extend your dialect with types or attributes. Just as a side note, all code snippets are, of course, licensed under Apache 2 with LLVM exceptions. So what is MLIR? MLIR is a reusable compiler infrastructure that was introduced by Google in early 2019, and at the end of 2019 Google donated it to the LLVM Foundation. So it's officially part of the LLVM project, where it lives in the monorepo under mlir. What it allows you to do is define operations, attributes, and types, which are grouped in so-called dialects, and that lets you define your own intermediate representation. Later on in the session we will also have an update about the Flang compiler, which also uses MLIR to define its own intermediate representation.
These dialects can either be part of MLIR core, meaning they are upstream, like the func dialect, which gives you the ability to define what a function is, or the LLVM IR dialect, which mirrors what LLVM IR is but modeled in MLIR. There are tons of other dialects, like a GPU dialect, the TOSA dialect, which is the Tensor Operator Set Architecture, or EmitC, which I am one of the developers behind. And there are also many, many out-of-tree dialects: the CIRCT project is using it, or Torch-MLIR, which is modeling PyTorch in MLIR, and many more. These are considered out-of-tree. So when we look at the standalone example, which is really a brilliant starter when you want to create your own dialect, you find it as part of the LLVM monorepo, and you can just build it against an installed LLVM. You run CMake and configure it accordingly. You just need to pass where your installed MLIR is found and where the LLVM external lit, LLVM_EXTERNAL_LIT, is located. Then you can build your target, which here is check-standalone. It builds all the tools and also runs some tests. This assumes, as I have mentioned, that LLVM and MLIR have been built and then installed to a prefix. And that corresponds to out-of-tree, somehow. When I began with LLVM and MLIR, I was not a compiler developer, but I had some experience with CMake, and the way the terms in-tree and out-of-tree are used in LLVM and MLIR versus the outside world is sometimes confusing, so I want to give at least a short definition. In the LLVM world, in-tree often, or nearly every time, refers to a monolithic build. That means you build LLVM, or your LLVM subproject, plus your dialects or whatever. In-tree can also refer to the source location.
So here we have an out-of-tree dialect which is nevertheless part of the LLVM monorepo; it's considered out-of-tree because you can pull it out and you don't need to have it in the monorepo. So out-of-tree normally refers to working with a separate repository. However, there is also a mechanism you can use to build your project, the LLVM_EXTERNAL_PROJECTS mechanism. If you look into the CMake configuration of projects using this, or into other tutorials, they either call it an out-of-tree monolithic build, since it's not a component build like the one against an installed MLIR or LLVM, or they even call it in-tree. That is somewhat confusing, because in CMake terms in-tree normally just means you're building where your source code lives, which is actually bad practice. You shouldn't do this. Normally you do out-of-tree builds, which just means you create a separate directory where you set up your configuration and do your build. This can also be a subdirectory in the source tree, but it's a separate directory that is not checked into your Git. So within this talk I just call it the external project mechanism. For me it's always an out-of-tree build, regardless of what I do. Even if I build LLVM, I personally wouldn't call it in-tree, because I normally use the CMake terminology. I mention this just to make it clear, so that you don't get confused when you look into some of the projects. What we can do is extend the standalone project to support this LLVM_EXTERNAL_PROJECTS mechanism, and the question is: why should we do this? Stephen Neuendorffer gave a great tutorial about how to architect LLVM projects at the LLVM Dev Meeting 2021, which is available on YouTube. I also have the link in my references. Here we are referring to a monolithic build, and in his tutorial he says: use the component build. That is what the standalone project already gives you. But there are some benefits when you want to use LLVM_EXTERNAL_PROJECTS.
When we developed the EmitC dialect, we developed it as an out-of-tree dialect, completely independent and buildable against an installed MLIR version. EmitC is now part of MLIR core, so it's upstreamed. What's quite nice is that sometimes, when we change or extend our dialect upstream, we want to look into how it behaves together with the out-of-tree sources which we still have; all our conversions and transformations are not upstreamed yet. It is quite nice to build everything as one project, because you can easily debug into it and you don't have to keep your installation and what you're building out of source in sync. You just have a monolithic build. So there are some benefits, and we want to look into what we need to do to build with the LLVM external project mechanism. We create our build directory again, and then we have to define LLVM_TARGETS_TO_BUILD. Here you specify for which architecture you want to build LLVM; here it's just host, or X86, which is also an option. You must specify the build type: Release, Debug, MinSizeRel, RelWithDebInfo, whatever. And we need to enable the MLIR project, otherwise it's not built. In addition, as we want to build our standalone project, we set LLVM_EXTERNAL_PROJECTS to standalone-dialect, which is our project name. Furthermore, we set LLVM_EXTERNAL_STANDALONE_DIALECT_SOURCE_DIR to specify where our source is found. Those are the two additional parameters you need to pass. Here the LLVM source dir is assumed to point to the root of our checked-out monorepo. That is what we want to have later on. Right now the standalone example can't do this. What do we need to change to make this possible?
So right now it looks like the following. Looking at the main CMake configuration, what is important here is that we call find_package for MLIR, and find_package in general imports information which was exported by a project. So here find_package imports information from the installed MLIR version. Furthermore, find_package for MLIR also calls find_package for LLVM for us, so we don't need to care about this. So MLIRConfig.cmake is parsed, as well as LLVMConfig.cmake, and we can then just do our includes, which add some further code for us. For the external project mechanism build, we don't need to do this. So what we need to change is that we only call find_package if there is an installed MLIR. Otherwise there won't be one, because we're just building MLIR as part of our build process. In the standalone case, CMAKE_SOURCE_DIR is normally equal to CMAKE_CURRENT_SOURCE_DIR. If that's not the case, we have a different kind of build, so we're just adding this if-else block, and in the other branch we no longer need to load the CMake modules with include. The code we're adding just sets the variables which are not available as exported project settings ourselves, so MLIR_MAIN_SRC_DIR and MLIR_MAIN_INCLUDE_DIR, and that's actually it. Those are the few lines we need to make it buildable. However, your lit tests will fail, so there is a little bit more code that we need to modify. Here we just define the variables STANDALONE_SOURCE_DIR and STANDALONE_BINARY_DIR, which are later on also used for the include directories. And we adjust our lit.site.cfg.py.in accordingly. So here we actually need to replace CMAKE_BINARY_DIR or CMAKE_SOURCE_DIR by our newly set variables. Otherwise the location of the lit configuration is assumed to be in the wrong place. So we just fix that here. That's nearly it.
So when you now want to use a dialect with other dialects, and you have these in several repositories, or at least in several projects, you can use LLVM_EXTERNAL_PROJECTS to build multiple dialects. Torch-MLIR, for example, is doing exactly this. Another option is to use CMake's ExternalProject_Add, which is considered the cleanest way, as it really keeps the projects enclosed and doesn't transfer variables between the projects. However, what I normally do is use add_subdirectory, but in addition with EXCLUDE_FROM_ALL, so that only the build targets I really require are exported or transferred to the other project. We do this in our mlir-emitc repository, and to do this we actually have an embedded option which changes our source code a little bit. Only when we want to build it embedded do we check whether that is the case or not, because find_package is already done by the other project, so we don't need to call it. We only do the includes, which we don't need for the external project mechanism. So, getting to types. This is how the standalone dialect is currently structured, or at least most of it. There are also some tools, standalone-opt and standalone-translate, which are not considered here. And you see we have multiple files. Types could be specified in StandaloneOps.td, in our TableGen definition file. However, it's quite nice not to put it all into one file but to use separate files for it. So what we're doing is adding new files. We're adding a TableGen file, StandaloneTypes.td, we're adding a header file, and we're adding the .cpp for our implementation. And what you need to put into those is the following. Let's start with the TableGen file. First of all, we include AttrTypeBase.td and the dialect itself, because the dialect has some definitions, and then we can define our standalone type class, which is the base class for our types. And in addition to that, we can define a custom type.
Actually this is a simple copy of EmitC's opaque type. It's quite straightforward, but here we use a standard assembly format, so no custom parser and printer, and it just holds a string parameter. So nothing special, just to illustrate the example. That is what the TableGen file could look like. Getting to StandaloneOps.td, we can just replace the include of the standalone dialect by the standalone types. This works because the types file already includes the dialect's TableGen .td file, so that's fine, and that's it. Regarding the CMakeLists, we don't need to change anything. Why is that? Actually, add_mlir_dialect already calls mlir_tablegen for you with -gen-typedef-decls and -gen-typedef-defs, so that's fine. We don't need to change anything here. However, for attributes that would be different, because for attributes add_mlir_dialect doesn't call mlir_tablegen for you. You have to set LLVM_TARGET_DEFINITIONS yourself, call mlir_tablegen yourself, add a public TableGen target, and that's it. So attributes are quite similar, except that you need to adjust your CMake configuration yourself. For the header file, just include the auto-generated typedef classes in the header, that's it: add the define, add the include, nothing more to do. For our implementation, we need to make sure that the types can actually be registered by the parent dialect. So what we do is we have a define here, GET_TYPEDEF_CLASSES, then we include our generated code, generated by TableGen, and then we write a function registerTypes which actually calls the method addTypes plus some of the auto-generated code, and this needs to be called in our StandaloneDialect.cpp. So we just add the registerTypes call here, and that's nearly the whole trick. You can do the same, not with addOperations or addTypes but with addAttributes, to register your attributes.
For the CMakeLists itself, just add the new source file to your MLIRStandalone library, your add_mlir_dialect_library target. That's it, nothing more to do. For attributes, you must of course also add your source file here. But in addition, you also need a dependency on MLIRStandaloneAttributesIncGen, the target we created by hand because it's not auto-generated, just to make sure that TableGen generates the code before the MLIRStandalone target is built. You might be lucky, otherwise you might have a race condition in your build system. I experienced that several times, so try to fix it, or just keep it in mind, and that's mainly it. For the standalone dialect, here we use the default printer and parser. We just tell TableGen to generate those. And for registerTypes, we of course need a declaration. We have the implementation, but we also need the extraClassDeclaration so the declaration is generated by TableGen. Otherwise we cannot use it in our StandaloneOps.cpp. All the examples are available in my fork of the LLVM project. I couldn't manage to send these to Phabricator to be reviewed for upstream inclusion prior to my talk. But I will do so, and I will add some more documentation to this, that's at least my goal. When I planned this talk, I thought that maybe some hints could help one or the other, and hopefully it's even more helpful if you find it not only in the slides but also in the upstream example. And there are many good resources out there: the talk given by Mehdi Amini and River Riddle, the MLIR tutorial at the 2020 LLVM Dev Meeting. We have some great docs at mlir.llvm.org, such as how to create a dialect, the Toy example, how to combine it, and how to add attributes and types, if you want to get more into the details of what you can do in the TableGen world. And, last but not least, the tutorial given by Stephen Neuendorffer at the LLVM Dev Meeting 2021.
Yeah, so that's it from my side. If you have questions, please let me know and I'll try to answer them. Hey, I just recently got into learning how to develop compilers, and even more recently how to create languages. I've been focusing mostly on the front end part, like lexing and parsing. My idea was to use LLVM as the back end for a really abstracted C-like language which I'm working on, to generate machine code for x86. And my understanding was that I only had to generate an IR for LLVM, the LLVM IR. And now you mentioned that LLVM IR can be described by, or is somehow related to, MLIR, right? So I was wondering if I'm still on the correct track when I generate the parse tree, generate the standard LLVM IR and target an x86 platform, or do I need to learn something about MLIR? So to try to summarize the question: as a compiler starter you're mostly focusing on parsing an abstract C-like language, and you want to know if you can just go the ordinary LLVM IR way or if you need to switch over to MLIR to do what you want. In short: you can definitely do this. You can go the way you're going right now. MLIR actually is a little bit different. If we are talking about an abstract C language and look into clang, there is the clang AST and then we more or less directly go to LLVM IR, and that is one of the things which isn't that nice. If you look into other compilers, they introduce more intermediate representations in between, as we will see later on in this session with Flang, for example, and Swift even has two intermediate representations. So MLIR just gives you the ability to define additional intermediate representations. You can also write a front end for your language, parse it into MLIR, convert it to the LLVM IR dialect and then translate it to LLVM IR. The result would be identical.
It really depends on what you want to do and what kind of infrastructure you want to use. But you can go the way you're already going. So hopefully that at least somehow answers your question. Okay, the question is not directly related to the talk: as I'm one of the developers behind EmitC, why did we develop EmitC? Sometimes you cannot compile for your target with clang directly, or with LLVM at all. So the idea was to get something independent of the compiler, and when we actually generate C code with EmitC, you have the freedom to choose which compiler you want to use to translate for your final target. We are in the domain of compilers for machine learning, and sometimes we have some very exotic targets where clang unfortunately is not an option as a compiler. So that's the simple reason. Hi, I was coming from the opposite side of the spectrum. I was looking into doing some sort of just-in-time compilation, but I also wanted to define my own types and my own, let's say, things. My question is, would MLIR be a good fit for that, or would it possibly be better to use just C with some sort of, I don't know, C++ with templates and some sort of dynamic language or something like that? So the question is, when coming from the other side, so essentially for JIT, whether MLIR might be a good fit to define your own types and attributes. Well, I'm not an expert regarding JIT, but MLIR provides you, and I think most of the code is upstream, with the possibility to register types and attributes at runtime, I'm quite sure. So you can extend your dialect after you have compiled MLIR. So it's maybe a fit, depending on what you want to do. If you really want to modify it during runtime, that should be possible with MLIR. I'm not 100% sure, but it's at least worth a look, I think.
Well, I'm partly aware of IRDL, but I don't know. You mean how you would combine it with the CMake targets? Probably with IRDL, as far as I know, you do most of it at runtime, so you wouldn't need to build it in advance. So yeah, the CMake stuff would be somewhat obsolete, I think. All right, if we're out of questions, thank you Marius, thank you. |
Case study of creating and maintaining an analysis and instrumentation tool based on LLVM: PARCOACH |
I'm sorry for the wrong title, but you know, naming is hard, so I had to put something. Today, I want to talk to you about my experience dealing with out-of-tree plug-ins and tools for LLVM. PARCOACH is just one example, and I will give some others. So first of all, I will try to explain why and for whom I am doing this talk, so you know the audience. I'm going to talk about three different things. The first one is keeping up with LLVM, so we will see some code, CMake, C++ and stuff. The second point is usability, both from a tool developer point of view and a user point of view. And the final point will be dealing with packaging when you're actually targeting some system. So why am I doing this talk? First of all, it's to provide some feedback and maybe provide you with some stuff I wish I had known before getting into this. I'm doing this also because I've worked on a couple of out-of-tree projects, and I've faced the same issues. So maybe you've faced them too, and it will be helpful. So it's not so much about the tool PARCOACH itself, it's rather about the approach. And for whom? It's basically anyone who is involved in an out-of-tree project for LLVM. This is my own point of view on this topic. If you have ideas, comments, or improvements that you think may be helpful to me, don't hesitate. I will welcome them. So PARCOACH is a tool for HPC applications. It's basically an analysis and instrumentation tool for OpenMP and MPI applications. It basically checks that the user is using the APIs appropriately, that there are no deadlocks or data races. The developers: this is where it gets interesting, because they are not LLVM engineers, right? They are interns, students, PhD students, researchers. They have a whole job which is not LLVM. The users of the tool are scientific application developers. So you cannot ask them to compile LLVM from source. It's not going to work.
They are not going to use your tool. And the last part, which is interesting with this project: it started a long time ago, when it was LLVM 3.7, and now it's based on LLVM 15. So there has been a lot of history in the tool. And I'm working on it right now. It's my main job, so they have an LLVM engineer now, and I can do stuff. I provided the link for reference if you want to take a look. There are two other motivating projects that I can talk about. One I'm actually not going to talk about much, because it's not free. It's a commercial compiler which is based on LLVM, and basically the developers are LLVM engineers, so we have more flexibility when doing development. And the users are clients who are paying for the compiler, so we need to provide something good. The other one is student LLVM exercises. I give LLVM courses for security students, and I want them to be able to do some code transformation with LLVM. So the developers are just a friend of mine and me, and the users are students. We are expecting them to code in the project, so we need to make it easy for them to get into. We have 16 hours to do this project, and they cannot spend two hours installing LLVM. It's not going to work either. In all these projects I encountered pretty much the same issues, so I'm going to talk about them now, and the first one is keeping up with LLVM. I'm not sure if it was intentional in the schedule, but having a talk right before this one about CMake and stuff is quite helpful, because I don't have to go too deep into the details. You already know them. So let's go back maybe eight years. If you wanted to do some LLVM tools, there was no CMake integration. The approach that you had as a developer was to do stuff manually, maybe using llvm-config to get the flags and the libraries and so on. But basically, you had no easy way to integrate with LLVM. It was quite manual.
Then came CMake, and you could use the standard add_library and target_link_libraries, but you had to know what to feed these commands with. Something I've encountered in this project is that some of the people who were developing it were not comfortable with CMake, and they would make changes where they would actually do CMake integration, but with hard-coded paths in the CMake files, so it would be awkward. Now, I think, at least from the examples we have, it's way better. Using the LLVM CMake integration simplifies a lot of stuff. You basically just have to know which components of LLVM you want to use and how you want to build your library, basically whether it is static or shared. And you have dedicated macros to construct whatever you want to construct for LLVM. So let's take some example code. You don't have to understand everything, it's just to give an example of how it works. You basically say, okay, I want to find LLVM, sometimes provide a version, include the LLVM CMake helper, include some definitions, and then this is the interesting part, because you can say, okay, I want these components in my tool. Call the CMake helper with your plug-in sources, and that's it. The CMake helper will take care of saying, okay, depending on how LLVM is installed, is it just the big dylib, or are there individual libraries? It will set up the target link libraries appropriately, and you don't have to think about it. It's just automatic. If you want to do some tools or pass plug-ins, there are macros to do these too. So basically, you just have to figure out which kind of build you want, and the LLVM CMake integration will configure everything for you. There are some useful examples. For pass plug-ins, there is the Bye example, which is basically a very simple new pass plug-in. It's a kind of hello world.
And llvm-tutor has some out-of-tree passes to get you started with, and it's actually quite helpful if you are looking into this. Now, let's talk about some code. Let's say you're new to LLVM and pretty new to C++, you're a student, for instance, and you want to perform some LLVM transformation. So you go to your search engine and you look for how to iterate over the instructions of an LLVM function. And pretty much all the resources, like Stack Overflow or even some presentations, will give you the code on the left. It's fine. It works. I mean, you are iterating over all the instructions of the function. But if you know C++ a bit better, you know that you can use range-based for loops instead of raw iterators. And if you know the instruction iterators from LLVM, you know that you can use instructions(F) to just get all the instructions of F. All the code works, but arguably the code on the right is easier to read, and in the end easier to maintain, especially if you consider that there are a lot of examples like this in the code. It adds up, so simplifying stuff is nice sometimes. It's not a problem of Stack Overflow or anything. It's just that the answers on Stack Overflow or in the slides are old, like from 2015. If you updated the answer, it would just be the option on the right. Another thing I want to talk about, and that I've seen a lot in PARCOACH, is iterating over something with a predicate. I want to iterate through this stuff, but only the elements for which some predicate is true. You can do stuff like an early continue or a nested if. But if you know STLExtras from LLVM, you know that you can actually create a filtered range for any range. You pass a range, you pass a predicate, and inside the loop, you just get the objects you're looking for.
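To make the filtered-range idea concrete without depending on LLVM headers, here is a minimal sketch in standard C++17. It is not LLVM's implementation (LLVM ships its own helper for this in STLExtras); the names Inst, FilterRange, makeFilterRange and countCalls are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR instruction; not an LLVM type.
struct Inst {
  int opcode;
  bool isCall;
};

// Minimal filtered view over any range that provides begin()/end().
template <typename Range, typename Pred>
class FilterRange {
  using Iter = decltype(std::begin(std::declval<Range &>()));
  Range &range_;
  Pred pred_;

public:
  class iterator {
    Iter cur_, end_;
    const Pred *pred_;
    // Advance until the predicate holds or the end is reached.
    void skip() {
      while (cur_ != end_ && !(*pred_)(*cur_))
        ++cur_;
    }

  public:
    iterator(Iter cur, Iter end, const Pred *pred)
        : cur_(cur), end_(end), pred_(pred) {
      skip();
    }
    decltype(auto) operator*() const { return *cur_; }
    iterator &operator++() {
      ++cur_;
      skip();
      return *this;
    }
    bool operator!=(const iterator &other) const { return cur_ != other.cur_; }
  };

  FilterRange(Range &range, Pred pred) : range_(range), pred_(std::move(pred)) {}
  iterator begin() const {
    return iterator(std::begin(range_), std::end(range_), &pred_);
  }
  iterator end() const {
    return iterator(std::end(range_), std::end(range_), &pred_);
  }
};

template <typename Range, typename Pred>
FilterRange<Range, Pred> makeFilterRange(Range &range, Pred pred) {
  return FilterRange<Range, Pred>(range, std::move(pred));
}

// Count call instructions without an early `continue` or a nested `if`.
inline int countCalls(std::vector<Inst> &insts) {
  int n = 0;
  for (const Inst &inst :
       makeFilterRange(insts, [](const Inst &x) { return x.isCall; })) {
    (void)inst; // the loop body only ever sees matching elements
    ++n;
  }
  return n;
}
```

The point of the design is that the predicate lives next to the range rather than inside the loop body, so the body stays flat even when the condition grows.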
Again, it's a simple predicate, so it doesn't matter much as is, but if you add some more stuff, it starts growing, maintenance becomes a bit harder, and readability is impacted too. So this is something to consider. Now, something more critical is advanced data types. There are a lot of data types in LLVM, and if you are not familiar with LLVM, and I've seen a lot of code like this, you will just use whatever data type is available in the STL, and you will take a map, for instance, and use some helper. The actual issue starts when, for instance, you want to map an instruction to something. If you go through the IR and change an instruction, like if you delete it, or if you replace all its uses with some other value, what happens to the instruction in the map? With a raw map from the STL there is no mechanism, so nothing happens, and you end up iterating over or trying to find something which is not valid anymore. Whereas if you are aware of the data types from LLVM, you are able to use something like a ValueMap, which has specific handles to remove the value or update the value if it is changed during its lifetime. There are some other helpers that are quite nice. I mean, it's not a big deal, but for instance, instead of using std::find_if, you can use llvm::find_if and just pass a range instead of the individual iterators. In this case it's not a big deal, but it's actually quite nice. I've encountered this in a lot of code where you would be able to replace most of the occurrences with SmallVector from the LLVM ADT, the advanced data types, or with ArrayRef or StringRef. There is a lot of stuff in LLVM that you may not be aware of and that makes your code quite a bit nicer if you use it. So yeah, dealing with it. You may think, okay, this guy is just being picky with people who are writing the code. It may be true.
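As a small illustration of the range-based helper mentioned above, the following sketch shows the general shape of such a wrapper in standard C++. It only assumes the idea (forwarding begin and end to std::find_if) and is not copied from LLVM; findIf and hasNameWithPrefix are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Range-based wrapper: forwards begin/end of the range to std::find_if.
template <typename Range, typename Pred>
auto findIf(Range &&range, Pred pred) {
  return std::find_if(std::begin(range), std::end(range), pred);
}

// Hypothetical helper: does any name start with the given prefix?
inline bool hasNameWithPrefix(const std::vector<std::string> &names,
                              const std::string &prefix) {
  auto it = findIf(names, [&](const std::string &n) {
    // compare() on the first prefix.size() characters of n
    return n.compare(0, prefix.size(), prefix) == 0;
  });
  return it != names.end();
}
```

Call sites shrink from two iterator arguments to one range argument, which is the readability gain described in the talk.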
I will argue that it actually depends on who makes the contribution, because you cannot expect the same level of contribution from a student as from an LLVM engineer. Especially when you're a PhD student, you have, I don't know, a deadline, you just want a tool that does something, and you're not going to spend a lot of time on how you do stuff as long as it works. At least that's my experience dealing with that. But in my opinion, the accumulation of small details matters, and it was very explicit in the case of PARCOACH, because I came in after maybe five or six years where the accumulated work of researchers and PhD students had led to a lot of technical debt. If some advice had been given to the PhD students or the researchers, it would have been way nicer code to read or to maintain. And it's quite obvious, but doing code reviews helps a lot. Sometimes you cannot do them if there is no one able to actually provide useful feedback, like in the case where people don't know LLVM: you cannot expect them to review code and provide a lot of feedback. But what I do now is redirect people every time to the LLVM Programmer's Manual. It's not the first thing people look at; you usually just go to a search engine and search for what you want. But I will argue that reading the Programmer's Manual is actually more helpful in this specific case. And something that I know people don't want to do when they are starting with LLVM is just read the code of the passes in LLVM. There is a lot of good stuff in there. Obviously, if you're not familiar with C++ and LLVM, it's not the easiest, but I think it's still worth it. So the next topic is updating the LLVM version. So far, when I've developed out-of-tree tools, I've always set the version to one specific number, right? Let's say LLVM 9. And then when LLVM 10 comes out, you rebase your plug-in and check whether any API broke, or whether there were some changes in the IR.
Most recently, I am thinking about opaque pointers, which was quite a big change when updating the LLVM version. Something to consider when doing this is that it may be time-consuming. It may be just a day if there were no changes in the API, but it could also be very time-consuming, for instance if you have to change all your passes because the new pass manager has been out for three years, you still didn't do the migration, and now suddenly the old one is deprecated and going to be removed, so you have to migrate your passes. And in my experience, it's quite obvious too, but skipping versions makes it worse. Something else that I've seen, and I know sometimes it cannot be avoided, but in that case it was avoidable, is trying to support multiple LLVM versions at once, say LLVM 9 through 12. That is actually what was done, and yeah, don't do it. Just pick a version and stay with it, because otherwise it's just multiple #ifdefs everywhere in the code, and it's unmaintainable, I think. So now let's talk about passes. If you look for a hello world pass on the internet, you will get a hello world pass which is a transformation pass. In LLVM, you have two kinds of passes. The first kind is analyses, and basically they don't touch the IR. You just look at the IR and maybe provide some result, which is the result of the analysis and can be used by transformation passes or other analyses. And there are the transformation passes, which may or may not change the IR. And obviously, when you get your hello world pass, you want to do everything in it. I mean, I'm not talking about LLVM developers, I'm talking about students and researchers who have the pass and put everything in it.
And so it's fine when it's just a one-shot or something like that, but over time it becomes a problem. The analysis and the transformation are semantically different, and LLVM has some mechanisms to make it easy for you to have the analysis run only when it's needed, right? There is a caching mechanism: you can say, okay, I want this analysis for this object, and if the result exists, it will give it back to you. It also avoids passing structures around, because when you are in a transformation pass, you can request any analysis from basically anywhere, as long as you have the analysis manager. This is something that has cost me quite some time, just untangling the analysis code from the transformation code, and overall it improved the performance, because some analyses were requested more than once for the same object. And yes, it leads me to investigating performance issues, because that was something too. So what happens when you don't know LLVM and you want to debug your code? You put llvm::errs() everywhere and you comment them out when your code is ready, okay? It's a nightmare. I mean, it works, but you're not supposed to do it like this. Specifically for printf-like debug stuff, you have some LLVM helpers, and they're actually quite handy. You just define DEBUG_TYPE somewhere in your .cpp and wrap everything in LLVM_DEBUG, and it does everything for you. If you build without debug support, it doesn't even appear in the binary. And when you're running your pass with opt, you can say, okay, I want to show debug information for this kind of pass. It basically provides the same feature, and you don't have to comment out llvm::errs(). The other thing is timing your code, being able to tell, okay, this part of the transformation is costing me time. What I've seen was some manual attempt to do timers: basically you declare all the timers, you start them manually, and it becomes a mess really quickly.
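The caching behaviour described above (ask for an analysis result for an object, compute it only if it is missing) can be sketched in a few lines of standard C++. This is a toy model under stated assumptions, not LLVM's analysis manager: Function, SizeInfo and AnalysisCache are hypothetical types, and the "analysis" is a trivial size check.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an IR function; not an LLVM type.
struct Function {
  std::string name;
  int numInsts;
};

// Result of a toy analysis.
struct SizeInfo {
  bool isLarge;
};

// Toy analysis manager: computes a result once per function and caches it.
class AnalysisCache {
  std::map<const Function *, SizeInfo> cache_;
  int runs_ = 0; // how many times the analysis actually ran

public:
  const SizeInfo &getResult(const Function &f) {
    auto it = cache_.find(&f);
    if (it == cache_.end()) {
      ++runs_;
      it = cache_.emplace(&f, SizeInfo{f.numInsts > 100}).first;
    }
    return it->second;
  }

  // A transformation that changed `f` must drop the stale result.
  void invalidate(const Function &f) { cache_.erase(&f); }

  int runs() const { return runs_; }
};
```

Requesting the same result twice runs the analysis once; after a transformation calls invalidate, the next request recomputes it. Keeping analysis code separate from transformation code is what makes this kind of reuse possible.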
Thankfully, now we have TimeTraceScope. I think it's what's used when you pass -ftime-trace to clang. Basically it's just one line: you declare one variable, and when it's constructed, it starts a scope and a timer, and when it's destructed, it stops the timer. LLVM has a whole system for this, and it emits JSON, and if you load this JSON into speedscope, you get something like this. You can see basically everything in your code without having to do anything. You get the entry point, you get the analyses, and here it was quite obvious for us what the changes were, because this analysis, for instance, was called multiple times, but for the same object. So it would have appeared here too, for instance, but because of the caching mechanism and the untangling, it was just called once. So this is something nice that you get basically for free. So now, some conclusions on tool development. It's a fairly basic conclusion. Try to invest in maintenance. I know it's not always possible, especially in a scientific project, but it's worth it. Don't reinvent the wheel. If you want to do something in LLVM, there is likely already something in LLVM for this. And keep the diff minimal. One of the main weaknesses of PARCOACH right now is that we use some passes which already exist in LLVM. I'm thinking about MemorySSA, for instance. We use some copies of it, and from a maintenance point of view that's not very nice, so we need to migrate away from this. And if your passes can be useful to others, just try to upstream them. I mean, if you don't use them, you don't have to pay for them. Then let's talk a bit about usability, because it's quite a big deal for a tool: you want it to be usable.
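The scope-based timing described in this passage relies on a plain RAII pattern: one variable starts the timer when constructed and stops it when destroyed. A minimal sketch in standard C++ follows. It is not LLVM's TimeTraceScope and does not emit JSON; ScopedTimer, TimingEntry, timingLog and runAnalysis are hypothetical names used only to show the pattern.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// One recorded timing entry.
struct TimingEntry {
  std::string name;
  std::chrono::steady_clock::duration elapsed;
};

// Hypothetical in-memory collector for finished scopes.
inline std::vector<TimingEntry> &timingLog() {
  static std::vector<TimingEntry> log;
  return log;
}

// Starts timing on construction, records the elapsed time on destruction.
class ScopedTimer {
  std::string name_;
  std::chrono::steady_clock::time_point start_;

public:
  explicit ScopedTimer(std::string name)
      : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}
  ~ScopedTimer() {
    timingLog().push_back({name_, std::chrono::steady_clock::now() - start_});
  }
  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;
};

// Usage: a single line at the top of the scope to be measured.
inline long long runAnalysis(int n) {
  ScopedTimer timer("runAnalysis");
  long long sum = 0;
  for (int i = 0; i < n; ++i)
    sum += i;
  return sum;
}
```

Because the destructor runs on every exit path, there is no start/stop pair to keep in sync by hand, which is what made the manual timers messy.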
So first, from a developer point of view: if your developers are going to be non-LLVM folks, you don't want them to go into the LLVM install and stuff. I've had good experience with using Docker, basically providing a Docker image with LLVM compiled and installed somewhere, or just installed using the APT repositories, and having a clear CI that shows how to build your tool. Just looking at the CI should be enough to know how to build your tool from a developer point of view. The other great thing is that when you use LLVM, you get the LLVM tools with it. You get lit and FileCheck, so instead of going through some manual testing and stuff, you can just use them, and it's actually quite nice. And yes, of course, I could talk about coding standards, but basically, since you're making a plug-in or a tool for LLVM, it makes sense to follow the same standard, and you already have clang-format and clang-tidy configurations for this. Now, as a user: you obviously don't want a scientific application developer to compile your code from source. You want them to just have the plug-in and use it, or have the tool and use it. If you look at hello world passes, you see a lot of the time that you first have to get the IR. In our case, it's either from clang or from flang, and then you have to call opt, load the pass manually and call the pass manually. I would argue this is not nice enough for researchers and students, and since PARCOACH is a verification tool, we cannot expect users to call it on every single file. So we actually had to do some more tooling to create a wrapper which takes the original compiler invocation, runs the original compiler invocation, generates temporary IR and then runs the tool over it. It makes it much easier for the users to just integrate with Autotools or CMake. That makes the tool more user friendly than usual, I would say. And the other part is, how do you get the tool?
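The command-line rewriting step of such a wrapper can be sketched as a pure function in standard C++. This is only an illustration of the idea under simplifying assumptions and not PARCOACH's actual wrapper: it assumes a clang-style command line where only "-o <file>", "-c" and "-S" need to be filtered out, and it requests textual IR with the clang flags -S -emit-llvm. The names Args and makeIRCommand are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Build an IR-emitting command line from the original compiler invocation.
// Assumption: only "-o <file>", "-c" and "-S" have to be filtered out.
inline Args makeIRCommand(const Args &original, const std::string &irPath) {
  Args result;
  for (std::size_t i = 0; i < original.size(); ++i) {
    const std::string &arg = original[i];
    if (arg == "-o") {
      ++i; // also skip the output file operand
      continue;
    }
    if (arg == "-c" || arg == "-S")
      continue;
    result.push_back(arg);
  }
  // Ask the compiler for textual LLVM IR written to a temporary path.
  result.push_back("-S");
  result.push_back("-emit-llvm");
  result.push_back("-o");
  result.push_back(irPath);
  return result;
}
```

A full wrapper would first run the original command unchanged so the build still produces its object file, then run the rewritten command, and finally run the analysis tool on the generated IR file.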
So again, I've had good experience with Docker, especially for students, because it's easy for them. And obviously we also provide packages for major distributions, but you actually have to worry about how LLVM is packaged on the target system, because depending on what is available, shared libraries, the dylib and so on, it's not the same thing. And yeah, Docker is not something you can really use on shared HPC clusters. You're more looking at stuff like Guix, for instance, when targeting such platforms. So for this, you need some packaging, and packaging is my last point. We used to use a do-it-yourself approach: basically just create a shared library and hope for the best. It doesn't work, because you depend on how opt is installed and compiled, since you're dynamically loading a library into opt. If you have not used the same C++ libraries, you're going to run into issues. You also don't know for sure which pass manager is enabled by default in opt. So there is also this. So we've moved to doing proper packages for APT (.deb), for Guix, and for Red Hat too, because we have some users using a custom version of Red Hat. For this, we actually have quite an interesting issue, because we are sure that the LLVM version we need is not available in their image. So we made the choice of shipping just one single static tool. And this was actually quite easy because, as I said when I talked about CMake, you just say, okay, I want this to be linked statically or as a shared library, and the LLVM CMake integration handles it for you. It was quite a nice experience for us to package for so many distributions without having to worry too much about CMake options and stuff. So, some takeaways for the whole talk. In my opinion, the LLVM integration has evolved a lot, and in a good direction. It's way easier to integrate with LLVM now than it was 10 years ago. It may sound obvious, but it's nice to say it, because when nice stuff happens, you have to say it too.
Be prepared for maintenance. If you want to create an out-of-tree tool, you have to invest in maintenance, both for LLVM rebases and, basically, reviews. And if you are able to provide some LLVM guidance to your contributors, do it; it's worth it. Investing in CI is worth it, obviously. And LLVM documentation: I would definitely, every day, recommend going to the LLVM documentation rather than Google for understanding what is available in LLVM. And I want to encourage my students to read LLVM source code, but it's sometimes a bit hard. So if you have questions or comments, feel free, and I will be happy to answer them. Yeah. So the question is, for the wrapper we created, what do we use to create this wrapper, right? So basically, it's a very, very small LLVM tool. Maybe you are familiar with not in LLVM: there is a very small utility in LLVM which just inverts the return code of a program. It's a very small tool based on LLVM, and we use a similar approach. Basically, I created an almost empty main where I just use the LLVM Support library to get the benefit of argument parsing, the data types and so on. I just parse the command line and call, in succession, Clang with the original compiler line. Then I generate the intermediate representation for it by adding the appropriate flag and filtering out the object generation flags. And then I just run the tool over it. Yes, yes, because, for instance, with CMake you can use CMAKE_C_COMPILER_LAUNCHER, basically just like ccache works: you just change the launcher, and you can use the tool to launch the compiler. And for other tools, actually, in our project we use mpicc, but we are able to change the compiler used by mpicc and say, okay, use this instead of GCC, for instance. So the question is, when you ship your tool, do you link statically or dynamically? So actually both.
When shipping for Red Hat, because we don't have control over which packages are in their custom image, we ship statically, because we are not sure which LLVM we are going to have. So the binary is 100 megabytes, but we don't have much choice. And when shipping for systems like Ubuntu or Debian, we just ship with a dependency on the shared libraries. So the question is, when rebasing the tool from one LLVM version to the next one, do I use the changelog that developers put their love into, and if yes, is it helpful? Unfortunately the answer is no. But that's because I look at LLVM Weekly, so I kind of know what happens. This is just my way of doing stuff, so no, but if I looked into the changelog, I would find helpful information, I'm sure. So the question is, am I trying to rebase as LLVM progresses, or am I just rebasing on every version when it's released? It's only when a release comes out that I do the rebase. It's easier, because otherwise, you know, depending on what kind of target you ship for, it's hard, and it's just simpler to say, okay, we know we need to rebase on the new version, and it's fine. I think we're out of time now, thank you Philippe. |
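The wrapper described in the talk and in the Q&A (run the original compiler line, derive a second invocation that emits IR, then run the tool over that IR) can be sketched roughly as follows. This is a minimal sketch in Python rather than the speaker's actual LLVM-based tool: the tool name `my-analysis-tool`, the file names, and the set of filtered flags are assumptions made for illustration. Only `-c`, `-o`, `-S` and `-emit-llvm` are real Clang options here.

```python
import shlex

# Flags that only make sense when producing an object file.
# This set is an illustrative assumption, not the talk's actual filter.
OBJECT_ONLY_FLAGS = {"-c"}


def ir_command(original, ir_path):
    """Derive an 'emit LLVM IR' command from an original compiler invocation."""
    derived = [original[0]]
    skip_next = False
    for arg in original[1:]:
        if skip_next:            # value that belonged to a dropped "-o"
            skip_next = False
            continue
        if arg == "-o":          # drop the original output path
            skip_next = True
            continue
        if arg in OBJECT_ONLY_FLAGS:
            continue
        derived.append(arg)
    # Ask Clang for textual LLVM IR instead of an object file.
    return derived + ["-S", "-emit-llvm", "-o", ir_path]


def wrapper_plan(original, tool, ir_path):
    """Return the three commands the wrapper would run, in order."""
    return [
        list(original),                  # 1. the untouched compile step
        ir_command(original, ir_path),   # 2. temporary IR for the same input
        [tool, ir_path],                 # 3. the analysis tool over that IR
    ]


plan = wrapper_plan(
    shlex.split("clang -O2 -Iinclude -c foo.c -o foo.o"),
    tool="my-analysis-tool",     # hypothetical tool name
    ir_path="foo.ll",            # hypothetical temporary file
)
for command in plan:
    print(shlex.join(command))
```

The sketch only builds the command lines, so it can be inspected without a compiler installed. A real wrapper would execute each step, would also need to handle joined forms such as `-ofoo.o`, and would clean up the temporary IR file.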
The C2 compiler
How the C2 compiler evolved |
Welcome. I am Bas van den Berg. I have been working on a language called C2 for about 11 years now. I did two talks at FOSDEM, in 2013 and 2015, I think. So this is the third one, and it's about the compiler, so it's slightly different. So the presentation roughly has three parts. First the language introduction, because you can't talk about a compiler without knowing the language. Then the evolution of the C2 compiler, which is called C2C, because CC was already taken, of course. And then the next steps, so three parts. The first part, the language. There's a lot of info here. My reason for starting C2 was that I'm an embedded developer, so I work a lot with low level systems, kernels and that sort of thing. And everyone was getting new programming languages except embedded programmers. So we are still writing kernels in C, and no other programming languages are really making it in that race. So I thought, well, we should be able to do something about that. I love C, so let's try to keep the good parts and remove the parts that have become somewhat dated with time, since C is, of course, 50 years old. So what I did was look at the anti-patterns in C. For example, the types you define, like uint32_t, and lots of macros, lots of plumbing that everyone has to do in every project, like the array-size macro. And I tried to get those out. Also, the header files had to go, because with header files you have to type your definitions twice, and it's a hassle for a lot of stuff, a lot of tooling. Macros as well. So the macros also went out. And the goal was to get better development speed. Execution speed in C is not a problem, of course, but development speed is. Then, the speed of execution can be better than C. That's because full program optimization is easy in C2, and it's quite hard in C in a realistic program. So you can get better execution speed that way. It also has a built-in build system.
So the scope of the C2 compiler is different from that of a C compiler. With a C compiler, you feed it a single C file, it produces a .o file, and then you do that a lot of times, and the linker just collects everything and turns it into an executable. In C2, you run the C2 compiler, it takes a lot of source files and produces the binary. So it does use the linker, but the translation unit is the whole program. So it can do a lot of diagnostics that a C compiler can't do, like: this field in this struct is never written in the whole program. In C, that's not possible unless it's in the static part, inside one C file, for example. And I tried to get better tooling, like jump-to-definition that is reliable and fast, and analyzers, which in C are often hindered by macro use, and so on. What I'm not trying to do is make a completely new language, like Rust or Go, because it should be recognizable to C programmers, and because I like the things in C: the abstraction level is just do-it-yourself, and it fits the hardware we have now, so that's okay. And I also don't want to add higher level features like garbage collection or a runtime, because that's not the domain I'm trying to use it for. So let's see an example. How many people have ever looked at a piece of C2 code already? Probably a few. Well, every file... The dog did. Every file starts with a module definition, where you declare which module the file is part of, and a module is just a piece of code. Different files can belong to the same module. Then you get a bunch of definitions, like a type and two functions. The order here doesn't matter. Ordering is a job for the compiler, not for humans. There are no forward declarations. Fifty years ago, when the compiler had memory problems, it was an issue, so the programmer had to structure the program: first do the forward declaration, and then the definition. Like this part, where the element points to an element again: in C, that's not possible.
You have to do a forward declaration of the struct, and then the definition. That's a computer problem, and the compiler should be able to do that. A second thing you'll notice here is the public keyword. Types and functions can be public or not. Public means it can be used by other modules. Another one is this one. This is called a struct function. So we have a struct, and we can just add these functions to it, and, as on the next slide, you can do element.init. You can add them arbitrarily, not like in C++, where you have to add them in the class definition. Another change from C is the dot here. These are pointers. In C, you have to use the arrow for pointers and a dot for the struct itself. In C2, we make no difference. It's all a dot, because it's more readable. Okay, so how do we use this part? This is just an example, of course. It would roughly look like this. So you import the linked list module, use the element, and then do something with it. The thing to note here is that the import has no file name. In C, you do includes, and an include has a file name. It has a path or a sub-directory. That's all very unhandy when you try to reuse stuff: if it's in the utils library directory in one project and in the lib directory in another, you have to change the header files and includes. That's not the case here. You can just put the files somewhere on your file system in your project, and in your recipe file, and I'll show that later, and then it will work. But otherwise, if you look at the function body, it's pretty similar to C. So the plumbing, the rest of it, looks different, but if you have a large function with for loops, it's just the same. It really looks the same. So because the scope of the C2 compiler is the whole program, you have to tell it what to do. So you just run C2C, and you can give it arguments, but you don't have to. It will look up the recipe file, and then do whatever you want.
So you can turn it into an executable or a library. These are the files, so it will not dynamically look for files with some name. It will also not use directory structures as the module name or something, like some languages do, but it will do this. And also, this is the configuration part. There are only a few options. In CUF, I have actually given presentations about the warning flags in C. In C2, the only warnings you have are for unused things. The rest is just an error. Unused warnings can, of course, be annoying if you are refactoring. You can temporarily silence them, but other stuff you can't ignore. You have to fix it. So, what other features does the language have? Modules and imports, with the recipe file. We have a lot of stuff like elemsof, which is a built-in function like sizeof, but for array sizes. We have enum_min and enum_max, which are the first and the last, or the highest, element in an enum, and offsetof and to_container, which are usually macros in C, but they are part of the language here. Also, the base types are all in there. We have opaque data types, which are types that in C you only use by pointer. So, say you have some component and you call a create function, and you get a handle back, but you can only use the handle by pointer. You cannot actually look into it, and that's an opaque data type. That's made explicit here. Another thing you can do, because we have full program scope, is global incremental arrays, which are just global arrays, but you can define them in different files and you can have them behind configuration, a sort of ifdefs. So, if you have this feature, these elements are added to the array, and otherwise they aren't. But in the end, the compiler will put them all together and turn them into a single array. Another feature I've never seen in other languages is the build file. It's another file, which is optional.
The recipe and the code are made by the developer, but the other users of a program can be, for example, OpenEmbedded or Yocto systems, where you have to tell it how to build and where the directories are, and that is specified in a build file. So, that's not created by the developer, but by the users or the package managers or those people. Another feature, which I haven't seen in other languages or compilers, is plugins. So, the C2 compiler has a plugin system. You can write plugins for it to walk the AST and do stuff with it, either before parsing the other stuff or after. I'll come back to that later. Let me check the time. Current state. We have 900 unit tests. It does run, so it's quite okay. It gives quite nice diagnostics. Weird cases are all covered. That doesn't mean there aren't more. There probably are a lot. libc support, pthread support. I once ported the Vulkan library, the graphics library, to C2, and that worked. I've written a web server, a web socket server, an event framework. There are plugins for Vim or Neovim to jump to definitions that are really fast and correct. We can generate dependency matrices of the whole code in some format, also through plugins. This part, the embedded use, is still in progress, so that's the bare metal case. You need linker scripts for that. Inline assembly is supported already, but you need more for that. You also need bare assembly. And it's been used in some production code, web server, client and customer services, for customers I work for. I advise them to use C2. Okay. So the second part is about the evolution of the compiler itself. It started in 2012. Well, I started with Bison and Yacc, because you're doing something with parsers. I quickly found out that they're not really usable in real projects, really hard to use. So I started with a patch on Clang 3.2 in 2013, when Clang was a bit slimmer, to parse C2. After that, I created my own C++ code base using many components from Clang.
The patched preprocessor, the tokenizer, also the diagnostics engine and generic LLVM components. That went on for quite a long time. It went well. Always rebasing on the latest LLVM, like 3, 4, 5, 6, 7, 8. We're now at 11, I think, or 12. Well, the latest rebase. I added our own custom unit test framework. And last year, the plugins, which is quite nice because, well, I have another slide for that. So that's this one. Yeah, this one. So the C2 compiler shouldn't depend on a Vim plugin, of course, but there's some sort of dependency on it. So we put that in a plugin. You can create a plugin that walks the AST, gets information out and puts it in some binary file, and the Vim plugin can look at the binary file. So that works really well. And there's also other fun stuff to do, like shell commands to symbol. So in a YAML config file, which is passed to these plugins, you can specify: I want to run this command, like git version or ls, and then put the output in this C symbol. So you get it in C, and you can just use the symbol name in your scope. And yeah, you get the information. So that's quite nice. Git version works the same way, but it's just specialized for this. Also load file. So you have a load file plugin. You specify, I want to load this file, and this is the symbol name. And the symbol is just a struct with a data field and a length field. So you can just access the content. So you don't need to have macros or scripts that convert stuff to header files and import those. Other ideas are code obfuscation stuff and additional checks you can implement in companies, like every name should be 13 bytes long, whatever you want. So, continuing on the list. Last year I started on rewriting the C2 compiler in C2, because that's the domain it's also meant for. So it was a sort of graduation project. See if it's valid. It was also the biggest project. It's now, I think, 15,000 lines of code or something. So to do that, you need to bootstrap it.
So that was done this year. So we now have a bootstrap way to start with a normal C compiler and bootstrap that into a C2 compiler. That's shown in this image. So it's quite a hassle sometimes. So we start with the C++ sources for C2C and compile those with Clang++. We get a C2 compiler. Then we take the C2 sources, the native version, the C2 version, compile that with this compiler, and get a C file, because we can generate C if we want. We compile that with Clang, and then do that again for the bootstrap, because we want to use this final compiler to do another step. So then we get a C2 compiler that's the final one. And the bootstrap actually starts here. So we save this file. That's easy. It's quite a big file, but Clang can handle it. And the bootstrap is just the last part. So, well, I first had ideas to use binaries, but if you lose the one binary you have, then you cannot compile again. That's not so handy. I think the project was a success as a graduation. I found quite a lot of bugs in the old C++ version of the compiler, but it's quite nice. I was afraid that C or C2-like languages might be too low level, because Clang is, of course, written in C++, but in a compiler you can use memory pools quite a lot for the AST, and that's also faster. So memory management is not really an issue here. It now parses roughly four million lines a second. And analysis is also quite fast. That's because we do the whole program at once. So when you build a C program, your makefile kicks in. It takes the first file, puts it through GCC, gets a .o file, and does some parallelism with multi-core. But it does code generation as well there, and then the second one and the third. So if your file number 100 has an error, you only get the error after 99 files have been compiled. And in C2 you parse all the files and check all the files, so it's really fast. It takes milliseconds. So I can now announce the public repository. I open sourced it yesterday.
I had to add some legal headers and stuff, of course, and put an open source license in there. So it's now on GitHub. You can download it and try it. It's not as functional as the C++ version, so that's still the main one. So the next step will be to convert all the unit tests to the new compiler, or actually to make the compiler fit all the unit tests we have. Sometimes the diagnostics differ a bit, so I have to change the unit test. Also, when I implemented the C2 compiler in C2, I had to use a lot of vectors as data types for lists of stuff, and you have to retype them every time, you see. So I started playing around with templates. The start is now in there, but it needs to be expanded, because it's quite nice to have some form of templates. I'm trying to stay far away from the C++ hell, of course, but at least something. The recipe file format will be changed to YAML, because the build file is also in YAML and all the other files are too, so that's more consistent. Then, currently there are three backends, so the code gets converted through the backend to something else. One is the C backend, which is quite easy, and then we also generate makefiles, and it just runs them and gets a binary. Another backend is QBE, which was presented here last year. It's a small backend that has no optimization, but it works, and it's quite easy to use. Then there's also the beginning of the LLVM backend, which is quite hard, because LLVM is, like Clang, a huge dragon. It's millions of lines of code. That's more work, so this is the step up to LLVM. Last is the embedded world use, so using linker scripts and allowing bare metal. That would be nice. So that's the presentation. I tried to keep it quite short and not focus only on the language itself, so there's room for questions, if there are any. So how do you interact with C code, in particular with C headers? From a C2 project, you can generate C headers for your C2 library, a library that's written in C2.
So I meant if you want to use a library written in C. Yes, so that's one way. The other way is, if you have a C library like the Vulkan library, you have to create a sort of C2 interface file, so it's like a header file in C. It's quite straightforward, but it's manual work. There's no way to automate that currently. But for the rest, the ABI is the same, so for libc you just have to define, like, printf: this is the function. So the question was how do you interact between C and C2? So you said that with your C2 compiler, the whole program will be compiled in one step. Do you have provisions for building shared libraries or other things that cannot be compiled in one step? What if the program is so large that we cannot compile and link it in one step? This is the case with many C++ projects if you do not have enough memory. If you take a really large program, it will probably take a tenth of the size of your browser. Loading a standard web page takes so many megabytes. Even a huge program like the Linux kernel fits in a few megabytes of AST in C2. That is true, but I'm working with a lot of software packages, I'm doing packaging. And these days it's very frequent that some C++ software no longer compiles on 32 bits, because the Clang process exceeds the 4-gigabyte virtual address space. And that is because even if the AST is just a couple of megabytes, the internal data structures to represent all the things the optimizer is looking for are a hundred-fold that size, a thousand-fold maybe. So that's an interesting question, maybe for the future, when C2 programs grow very big. Yeah, I looked at that. When I was working on the Linux kernel, I created a program to see how many lines of code the compiler actually had to parse for, like, a thousand files of C, given how lots of files get included many times, and recursively again. So the factor was roughly a factor of a hundred.
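The kind of measurement described here (how many lines a C compiler has to parse once every include is expanded, versus how many lines were actually written) can be approximated with a few lines of Python. This is a toy sketch, not the speaker's program: the in-memory `FILES` project is entirely hypothetical, and include guards are modeled by expanding each header at most once per translation unit.

```python
import re

INCLUDE = re.compile(r'^\s*#include\s+"([^"]+)"')

# Hypothetical in-memory project: file name -> source text.
FILES = {
    "types.h": "typedef int id_t;\n" * 50,
    "list.h": '#include "types.h"\n' + "void list_init(void);\n" * 30,
    "net.h": '#include "types.h"\n#include "list.h"\n' + "void net_up(void);\n" * 40,
    "a.c": '#include "net.h"\nint a;\n',
    "b.c": '#include "list.h"\n#include "net.h"\nint b;\n',
}


def lines_parsed(name, seen):
    """Lines the compiler reads for `name`, expanding each header once per unit."""
    if name in seen:             # models an include guard: skip repeats
        return 0
    seen.add(name)
    total = 0
    for line in FILES[name].splitlines():
        total += 1
        match = INCLUDE.match(line)
        if match:
            total += lines_parsed(match.group(1), seen)
    return total


# Every .c file is its own translation unit and re-reads its headers.
units = [name for name in FILES if name.endswith(".c")]
written = sum(len(text.splitlines()) for text in FILES.values())
parsed = sum(lines_parsed(name, set()) for name in units)
print(f"{written} lines written, {parsed} lines parsed, factor {parsed / written:.2f}")
```

With only two translation units the factor stays small. It grows with the number of `.c` files that pull in the same headers, which is the effect the answer describes for a large code base.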
So for every thousand lines of code you create, the compiler has to parse like a hundred thousand lines, and also analyze them and do stuff with them. But I think the biggest part will probably be the representation in LLVM, and we can do that step by step, because the modules form a directed acyclic graph. So they have a structure, one at the bottom and one at the top. So we can do them one by one. In the first slide you've shown that you can define methods on structures. So the question is, do you have some kind of name mangling? Yes, you need to do some mangling, but let's see here. It's really simple. In effect, what we do is take the linked list, put an underscore at this one, and this one is turned into an underscore, so that's it. No, because it needs to be recognizable. So it's a really simple, straightforward scheme. You can go both ways, it's really easy. No, we start with the name of the module. So it's linked list underscore element underscore init. So how do you handle the case where you're linking with a C library that has that name, which is valid? It's simple. Well, that gives you an error. It's like if you define a function printf in your C program: you get an error. Yeah, so you get an error. In C you have a single namespace, and this is a two-dimensional namespace, so it's linked list, a module name, and that already solves so many problems that your namespace will be much cleaner. So he says that symbols starting with a double underscore, or an underscore and an uppercase letter, are in the implementation domain. So if you prepended a double underscore to all the symbols, then you would know that you would never... Okay, I didn't know that, so... Okay, that's easy to fix then. So double underscore and then a capital. You could also use a scheme like the Go toolchain and put a character into the linked symbols which cannot be used in C, like you can put a dot in the symbol. For example, you could name the symbol element dot init. It would never collide with anything. It's a smiley face.
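The mangling scheme from this exchange (module name first, then the struct and function names, all joined with underscores) can be sketched as below, together with the reserved-identifier rule the audience member brings up. The function names are made up for this sketch and are not part of C2C. The lowercase spelling of the struct name follows the transcript and is an assumption.

```python
def mangle(module, name):
    """Join the module name and a (possibly dotted) C2 name with underscores."""
    return module + "_" + name.replace(".", "_")


def is_reserved_in_c(symbol):
    """True for identifiers C reserves for the implementation: a leading
    double underscore, or an underscore followed by an uppercase letter."""
    return symbol.startswith("__") or (
        len(symbol) > 1 and symbol[0] == "_" and symbol[1].isupper()
    )


def collisions(c2_symbols, c_symbols):
    """Mangled C2 symbols that clash with symbols from a linked C library."""
    return sorted(set(c2_symbols) & set(c_symbols))


symbol = mangle("linked_list", "element.init")
print(symbol)
print(is_reserved_in_c(symbol), is_reserved_in_c("__" + symbol))
```

A plain underscore-joined name lives in the same namespace as ordinary C code, which is why a clash with a C library shows up as an error at link time. A prefix from the reserved range, or a character that is not valid in a C identifier, would rule out clashes with user-written C names.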
If you compile everything at the same time, can you actually parallelize the process? Like, if you have one million files and you have eight cores, do you use only one core to compile everything? No, we use many threads, so we parse everything at once. We analyze single-threaded, because we don't do the locking and... Yes? So with the modules, we have an import, like here. So this module depends on this module, so we can build a graph. We sort all the modules and analyze them bottom up. And we also generate code that way, but the generation of the code and the optimization is 90-something percent of the work. So that's done in threads. The other part is just milliseconds, really. So in the thread, when a symbol is not resolved because it's in another module, you just put in some temporary hook, so that whenever it's resolved by the other thread, you update the symbol? No, the analysis is done over the whole program first. Bottom up. So everything is resolved. So the generation of code just... it doesn't change the AST. Yes? So you were mentioning generics, templates. Do you already have plans on how to handle separate compilation for that? Well, we don't have to. I mean, in C++, every time you use a vector in a file, it has to generate the whole code for the vector, and then in the end you have like 600 implementations and the linker has to remove them. That's why C++ is so slow. So here you don't have to do that. You have one implementation per instance, per use. If you have... So you have, say, your standard library module that defines the vector of T, and then you have two modules that both use vector of T and that are compiled separately for some reason. Oh, no, you can... But they're not. They never are, yeah. It's quite easy, yeah. Can you reimplement the Linux kernel in C2 without macros? I wouldn't want to, because the Linux kernel is not a really good example. It's one of the worst pieces of open source there is. If you try to map the dependencies, you get a fully connected graph.
Everything depends on everything. It's horrible, no? Yeah. Yes? You have C2C written in C2, and you also have a C++ version. Yes? The C2 version is much smaller. But that is also because some Clang components we use are quite heavy. So that requires a lot of code. And the Clang components we use can, of course, do more than what we need. So they're a bit fat. But otherwise, yeah, the C2 code is quite slim. That's also because of what we do with the names. Here, like list, it doesn't need any prefix. If you used C, you would probably call it linked list underscore init. So you would have to type all that here, and then pass the list as an argument. Like if you have some component in C that's called foo, you always see foo underscore this, foo underscore that. And all that stuff is just reduced to a single name at the top, and that's it. So your code gets a lot smaller in column size. Time's up. I see. Alright, thank you, Bas. |
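The bottom-up analysis order from the Q&A (build a graph from the imports, sort the modules, analyze each module after everything it imports) is a topological sort. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or later) follows. The module names and the import graph are hypothetical, and this is not how C2C itself is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical modules: each maps to the set of modules it imports.
IMPORTS = {
    "main": {"linked_list", "logger"},
    "linked_list": {"stdlib"},
    "logger": {"stdio"},
    "stdlib": set(),
    "stdio": set(),
}


def analysis_order(imports):
    """Order modules so that every module comes after the modules it imports."""
    return list(TopologicalSorter(imports).static_order())


order = analysis_order(IMPORTS)
print(order)
```

Because every import is resolved before the importing module is analyzed, no placeholder symbols are needed, which matches the answer that the whole program is analyzed bottom up before code generation starts. `graphlib` raises `CycleError` if the imports form a cycle, so the sketch only works for an acyclic module graph.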
Flang progress update |
Alright, we are going to start the next talk. Next up, we have Kiran. Kiran will give us an update on Flang. Hello everyone. As Christoph introduced, I'm Kiran. I work at ARM in the Fortran compiler team, and the task for me today is to give you all an update about the progress we have made with Flang development. This slide shows the contents of my presentation today. It's fairly simple. First, I start with an overview of the Flang compiler. Then I give you a summary of the story, whatever has happened so far. Then I provide you with the status of the compiler. Finally, I identify a few of the major development efforts going on currently. This slide shows in brief the overview of the Flang compiler. Flang is a new Fortran frontend developed from scratch. It has a traditional compiler flow. It's an LLVM based Fortran frontend. It's actually the Fortran frontend of LLVM. It generates LLVM IR, but it differs from Clang, the other frontend in the LLVM project. While Clang lowers from the AST to LLVM IR, Flang uses a high level intermediate representation called Fortran IR, or FIR. That's what Flang generates. It uses the MLIR infrastructure for FIR, and MLIR interfaces with LLVM through the LLVM dialect. FIR is converted to the LLVM dialect, and then the LLVM pipeline kicks in. This is basically the diagram that I have on the left hand side. Given a Fortran program, there is some parsing and there are semantic checks that happen. Finally, you get a Flang parse tree that's fairly well defined. Then that code is lowered into Fortran IR and calls to the runtime. Then the Fortran IR is converted to the LLVM dialect. This slide summarizes the story so far with the Flang compiler. Looking at the slide, this project started in 2018. It was during EuroLLVM 2018 that news about this compiler started to come out, that there is a new Fortran frontend being written from scratch. One year later, in April 2019, the project was accepted as the Fortran frontend of LLVM.
Again, one year later, in 2020, it was merged into the LLVM project as LLVM Flang. When this happened, there was some code that was left behind. The project actually split into two repositories. The first one was in the LLVM project, where the parsing, the semantic checks and the code for the runtime were. All the code that lowers from the parse tree to Fortran IR got left behind, because it was not ready at that time. It began to take on a life of its own. People now had to sync these two repositories, sometimes commit to both repositories, and you have all the overhead of maintaining a downstream project. Fortunately, sometime in April 2022, people decided to freeze all the downstream development and push all the code upstream. Since sometime in July 2022, this whole code is in the LLVM project repository. Since then, the project has mostly followed the LLVM contribution process and all the social guidelines associated with it. When it was merged into LLVM, it was mostly a Fortran 95 compiler, but there were still a few missing pieces. The code was stabilized further, and all unimplemented features were marked with TODOs, so that if you try to compile an unsupported feature, it gives a message saying that this feature is not supported rather than crashing. At the same time, development has continued to support features from newer Fortran standards, like Fortran 2003, Fortran 2008, and things like that. Also, a lot of bug fixes went in, and people started to look at some performance work as well. This slide summarizes the current status of the compiler. The compiler is not yet ready for general use, but it is still fairly advanced in its support for various Fortran constructs. The driver is temporarily called flang-new, and executables can be generated, but you have to use an option called -flang-experimental-exec.
The feature development of Fortran 95 is mostly complete, and as I mentioned before, development of Fortran 2003 and later features is in progress. The compiler has been tested with various commercial and free test suites. It has also been verified with some HPC benchmarks like SNAP and CloverLeaf. We have also used the SPEC benchmarks to test it. We are also continuing to test it with other benchmarks, like the OpenMP version of SPEC, other open source applications like OpenRadioss, and things like that. The driver's name, flang-new, and the use of the experimental flag are currently being discussed. It is possible that those requirements will go away soon, and then people can just type flang to compile their application. There is a Discourse thread on that currently under discussion. This slide summarizes the support level of Flang for various Fortran standards. Fortran is a living language. As you can see, it has gone through a lot of revisions. There is another revision that is going to come this year, and one later in this decade. It is a living language, and it is continuing to make progress, adding new features and things like that, but it makes the job of the compiler engineers much harder, because you are always trying to catch up. The initial standard that we track is 77, and the development work is complete. Fortran 95, as I mentioned before, is complete except for a few bits here and there. Work on 2003 is in progress, particularly on polymorphic types, but the parsing, semantics and runtime mostly work. Similarly with Fortran 2008: parsing, semantics and runtime work, but some of the features are in progress. Whereas when you come to Fortran 2018, none of the lowering or codegen work has happened yet, although the parser, semantics and runtime should work fine for this code, modulo any bugs. Now, I have said that the compiler is able to compile a lot of Fortran code. What does the performance look like?
This slide gives you a summary of where this compiler stands with respect to other compilers. The benchmark I have used for this slide is the SPEC 2017 benchmark, and all the Fortran benchmarks from that, either Fortran or mixed Fortran. I have compared it with two compilers. One is gfortran, version 12, and the other one is the compiler called Classic Flang. That is a compiler that was previously open sourced by PGI, and it is actually the Fortran frontend of many of the existing commercial compilers, like AMD's, ARM's and Huawei's. So, I have compared it with both these compilers, and what I have at the bottom here is the geometric mean, if you consider the performance of all the benchmarks in this suite. So, you can see that compared to gfortran, the runtime with Flang is around 1.5 times as long, whereas compared to Classic Flang, it is around 1.38 times. So, for some of the benchmarks we are mostly on par, but for some of the other benchmarks there is still some work to be done. The comments column basically summarizes what are probably the issues that cause this performance difference. Some of them are familiar things like alias analysis, and the others are things like intrinsic inlining and function specialization. Fortran has a lot of intrinsic functions. So, generally these are all implemented in the runtime. Because you have a lot of these intrinsics, the runtime is often written in a generic fashion. So, you might not get the performance if you call the runtime function. Also, for simple arrays and such, it does not make much sense to incur the overhead, particularly if that function is being called in a loop or something like that. So, many times it is good to inline that code. So, in benchmarks like exchange2, there are a lot of calls to functions like count, sum, any, minloc and all. Those can possibly be inlined to get more performance.
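The summary figure quoted for the slide is a geometric mean of per-benchmark runtime ratios. A short sketch of how such a figure is computed is given below. The ratios used are invented for illustration and are not the numbers from the talk.

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n positive ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-benchmark ratios: runtime with the compiler under test
# divided by runtime with the reference compiler (1.0 means on par).
ratios = [1.0, 1.0, 2.0, 4.0]

summary = geometric_mean(ratios)
arithmetic = sum(ratios) / len(ratios)
print(f"geometric mean {summary:.2f}, arithmetic mean {arithmetic:.2f}")
```

The geometric mean is the usual choice for ratios because a benchmark that is twice as slow and one that is twice as fast cancel out to 1.0, which an arithmetic mean would not give. Python's standard library also offers `statistics.geometric_mean` for the same computation.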
And exchange2 happens to be one benchmark where, if you perform function specialization, you get a lot of benefit. Function specialization is a process where, if you know the arguments to the function, you can generate a specialized version of that function based on the known parameter values. There are also other issues, mostly associated with how arrays are handled in Fortran. The way the standard interprets arrays, or array expressions, in Fortran is that there is always a temporary associated with them. But when we generate code, if there are a lot of temporaries, a lot of time is consumed just copying these arrays from one version to another. In many cases these copies can be optimized away, but we do not do a good job of it, and that is what is causing this performance issue. A few engineers have been working on this for some time now. A few months back, we were around 2x, but we are closing the gap as fast as possible. So, in the following slides, I summarize some of the major development efforts. I have probably missed some of the efforts, but what I am going to summarize is this. The first one is High Level FIR, a new dialect that is being written, which sits just above FIR, the Fortran IR. I will come to the reason for that. Second, I will have one or two slides about polymorphic types and how they are handled in Flang. I will look at some of the performance work that is being done. I will briefly summarize the work that is already done and the work going on in OpenMP as well as the driver. So, when the compiler work started, the IR that we had was FIR, which represents a lot of the Fortran constructs. But what we found is that although it does model a lot of the Fortran constructs, there is still some gap between the Fortran source program and FIR.
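The function specialization described above can be sketched in C. This is only an illustration of the idea, not what Flang or LLVM emits: `window_sum` is a made-up generic routine, and `window_sum_w3` is the clone a compiler could generate once it knows every call site passes `width == 3`.

```c
/* Generic version: the window width is only known at run time, so the
   inner loop has a variable trip count. */
long window_sum(const int *a, int n, int width)
{
    long total = 0;
    for (int i = 0; i + width <= n; ++i)
        for (int k = 0; k < width; ++k)
            total += a[i + k];
    return total;
}

/* Specialized clone for width == 3: the inner loop disappears and the
   body becomes straight-line code that is easier to optimize further. */
long window_sum_w3(const int *a, int n)
{
    long total = 0;
    for (int i = 0; i + 3 <= n; ++i)
        total += a[i] + a[i + 1] + a[i + 2];
    return total;
}

/* A call site that originally read window_sum(a, n, 3) is redirected to
   the specialized clone. */
long caller(const int *a, int n)
{
    return window_sum_w3(a, n);
}
```

The trade-off is code size: each specialized clone is an extra copy of the function, so compilers usually apply this only where the known argument unlocks a worthwhile simplification.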
So, there is some information that is being lost, like what the variables in the source program were, what annotations were on those variables, and things like that, and losing that meant that we might not be able to do some performance optimizations. That was one issue. The other issue was that the lowering was also proving to be a bit difficult because of the huge difference between the source and the IR. That is the reason why a new intermediate representation was introduced between FIR and the source: HLFIR, or High Level FIR. As I mentioned before, it enables optimizations, and because it carries more information from the source, it is likely to generate better debug information. This IR basically introduces two major concepts. One is that it models expressions that are not buffered. As I mentioned before, array expressions in Fortran generally involve temporary arrays, and whenever we introduce those arrays, or the buffers associated with them, it looks like a lot of low-level code. Expressions that are not associated with these buffers are still higher-level concepts, so if you have chains of intrinsic functions that operate on arrays, it is easy to do some kind of processing there to simplify those expressions. It also introduces the concept of variables. There is an operation in High Level FIR called hlfir.declare which collects information about a variable in a single place, so that you know this is the variable and what its properties are. Some of these are modeled by attributes: if you say that a Fortran variable is a pointer or allocatable, that is marked as an attribute. It also identifies the shape of the array, if it is an array, along with the memory associated with it.
So, the initial lowering will be from the Fortran source to a mix of High Level FIR and FIR, then the High Level FIR is converted again to FIR, and then the rest of the pipeline kicks in as always. I will not go into the details, but if people are interested there is a detailed RFC inside the Flang documentation. But I will try to show you an example. So, this is the Fortran source code, and what we have is a declaration of two arrays. The first array, called Y, is a two-dimensional array; the second one is a single-dimensional array. Then we are summing all the rows or columns in the first array Y and storing the result in the array called S. I tried to put the FIR code for this on a slide, but it turned out to be too much and it would not fit on one slide or two. So, I have just left the comments here, whereas the code itself is completely gone. You can see that there is a fir.alloca which allocates that array, and the name of the variable is part of that alloca. You can also see that there is a call to the runtime for the sum function. You can see from the comments that the runtime call allocates a buffer on the heap and returns it, then that heap buffer is copied to the real variable, and the original heap buffer is deallocated. Not much to see here, but if you come to HLFIR, you can actually fit that onto a single slide because it models things at a higher level. The important difference to notice here is that there are two hlfir.declare operations that declare the variables in your program. So, you can see that the two arrays S and Y have a representation, and instead of a runtime call you have an operation called hlfir.sum that returns something called an HLFIR expression.
So, there is no buffer associated with it, the runtime is not called, and there is no memory allocated as of now. Then that expression is assigned to the original variable, the result array S. So, basically I just want to show that this is at a higher level: it has concepts called variables and it also has things called expressions. I will now move on to polymorphic types. Polymorphic types came as part of the Fortran 2003 standard. The types are only known at runtime, and Fortran has the class type for specifying such a type. If you have a class type, it can refer to either that type or any type that extends from it. Extension is the name for the inheritance concept that exists in some other languages. Again, I will not go into much detail. I have an example on the next slide, but if people are interested they can check this RFC. In the example that I have on the left-hand side, there is a type called point. We call these derived types in Fortran. Then you have a three-dimensional type that extends from point and basically just adds another field to it. We have a class type variable called P3D, and then you call this subroutine called foo, and this subroutine accepts it as a class type of the base type. Then there is a construct called select type in Fortran that can identify at runtime which type it is. So, if its type is point 3D then something is printed; if its type is point, something else is printed. The modeling mostly follows what is there in this code. We have something called the FIR type: the ones in green are the extended type, the ones in red are the base type. You can see that there is a fir.class that has a FIR type inside it, and then you have this fir.select_type construct which tries to match against the base type or the extended type and then branches off to the basic blocks that handle each case.
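The select type dispatch described above can be approximated in C. This is a sketch under assumptions, not Flang's actual lowering: the struct layouts, the `type_desc` record and the return codes are all invented for illustration. It only shows the general shape of "a class value carries a pointer to a global type descriptor, and select type compares that pointer".

```c
#include <stddef.h>

/* Hypothetical global type descriptor: one instance per derived type. */
struct type_desc {
    const char *name;
    const struct type_desc *parent; /* NULL for a base type */
};

const struct type_desc point_type = { "point", NULL };
const struct type_desc point_3d_type = { "point_3d", &point_type };

/* type :: point, and type, extends(point) :: point_3d */
struct point { float x, y; };
struct point_3d { struct point base; float z; };

/* class(point): a data pointer plus the dynamic type of the object. */
struct class_point {
    void *data;
    const struct type_desc *dyn_type;
};

/* select type (p): each "type is" guard becomes a comparison against a
   global descriptor, followed by a branch to the matching block. */
int foo(struct class_point p)
{
    if (p.dyn_type == &point_3d_type)
        return 2; /* block for: type is (point_3d) */
    if (p.dyn_type == &point_type)
        return 1; /* block for: type is (point) */
    return 0;     /* no guard matched */
}
```

A `class is` guard, which also accepts extensions of the named type, would walk the `parent` chain instead of doing a single comparison.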
It is basic block one for the extended type and basic block two for the base type. At runtime also, when you generate lower-level code like LLVM IR, there will be some comparison instructions that check whether it is that type. Types will probably be represented as global structures, so you can compare against them to know what the real type is. Next I move to alias analysis. Alias analysis is important to identify arrays that can potentially alias, as well as to say that two arrays definitely cannot alias. The rules of aliasing in Fortran are different from those in C, so we cannot directly reuse whatever is there for C. In general, and there are exceptions and a lot of special cases, arrays do not overlap unless you specify that some array is a pointer and another array is a target, and then these two can overlap. Ideally we should benefit from the restrict patches that are being worked on, but that work is not yet complete. We also have some issues with pointer escape. That is all captured in this RFC by Slava. But we still need to do some kind of alias analysis, because otherwise, as we saw in the earlier slides where I showed the performance results, performance is hampered by the lack of alias information in the LLVM optimizer. We probably have some more information at the FIR level, but much of the optimization is currently delegated to LLVM, so LLVM still needs that information to do the optimizations. As a first step, what we have done is try to distinguish between accessing the descriptor and accessing other memory. As I might have mentioned before, Fortran has arrays and it is very good at arrays. So, sometimes you cannot just pass a pointer.
You might need to pass additional information like its rank, its extents, its lower or upper bounds, or whether that array has a stride or not. All this information can be passed in the descriptor. The descriptor is generally modeled as a structure at the LLVM level, so you have to go and fetch the contents from the descriptor by loading. Now, this load can potentially alias with other arrays if you load from it directly. So, we are trying to distinguish these using alias analysis information, using TBAA. That is what we have on this slide. I do not know how clear it is, but this is in the LLVM MLIR dialect, not in the LLVM IR representation. So, we have this TBAA information that is being generated here. If you know TBAA, it is mostly trees: if one node is an ancestor of the other node then they alias, but if they are in separate subtrees they do not alias. You can see that the ones in gray are any data access, and the ones in yellow are when you are accessing a descriptor member. Now go back to the source code. What I have here in this simple program is a subroutine. A subroutine is a procedure which does not return a value in Fortran. There are two arguments, A and B: A is an array and B is a scalar, both of integer type. I am loading the value at the tenth location and putting it in the variable B. You can see that whenever we are accessing the descriptor, in this case possibly for a stride, we use the TBAA node with the yellow color, and whenever we access something related to B we use the gray one. So, you can tell that these do not alias, and sometimes, when it is in loops, we get some performance benefits. The next one is codegen of assumed-shape array arguments. As I have mentioned before, Fortran has different kinds of arrays; one is assumed-shape.
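The TBAA rule stated above (ancestor and descendant may alias, separate subtrees do not) can be sketched in a few lines of C. The tree below is a simplified assumption modelled on the talk's description of "any data access" and "descriptor member" nodes; the exact node names and shape Flang emits may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* A TBAA type node: a name and a link to its parent in the tree. */
struct tbaa_node {
    const char *name;
    const struct tbaa_node *parent;
};

/* Illustrative tree: root -> any access -> { any data access,
   descriptor member }. */
const struct tbaa_node tbaa_root = { "root", NULL };
const struct tbaa_node any_access = { "any access", &tbaa_root };
const struct tbaa_node any_data = { "any data access", &any_access };
const struct tbaa_node descriptor_member = { "descriptor member", &any_access };

/* True if a is n itself or one of n's ancestors. */
static bool is_ancestor_or_self(const struct tbaa_node *a,
                                const struct tbaa_node *n)
{
    for (; n != NULL; n = n->parent)
        if (n == a)
            return true;
    return false;
}

/* Two accesses may alias only if one tag is an ancestor of the other;
   tags in separate subtrees are known not to alias. */
bool may_alias(const struct tbaa_node *x, const struct tbaa_node *y)
{
    return is_ancestor_or_self(x, y) || is_ancestor_or_self(y, x);
}
```

With this tree, a load of a descriptor field and a store to user data carry sibling tags, so the optimizer is free to hoist the descriptor load out of a loop that writes array elements.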
What assumed-shape means is that if you have an argument, it takes the shape of the array that you pass to it. So, you can pass it an array of a known size or of an unknown size and it will accept both of them. This causes some issues, particularly because the arrays can also be strided. If you have a loop that is working on that array, you have to fetch consecutive elements from the array. If it is strided, then you have to increment by the stride, and usually you have to load from the descriptor to find the stride, and then you step by that stride to find the next element. Sometimes this can be modeled by scatter/gather loads and stores, but sometimes that is not possible. In many cases, though, the stride is actually one: you are actually passing a contiguous array. So, we can do some versioning. It is represented at the source level here: if you have some input code like this, where there is an array called x and you are looping over it, then you can create versions of it. If the stride is one, do this; otherwise do that. And then the stride-one side of the versioned loop becomes easier to optimize and vectorize. I just have two more slides, so I will run through them. We are nearing OpenMP 1.1 completion. There are still a few items to complete, like privatization, atomics, reduction and detailed testing, but a lot of other things are going on in parallel, and there is basic support for the task and SIMD constructs. We have been able to run it with some SPECspeed benchmarks and it works. Things in progress include target offloading, which has just started, task dependencies, and new loop-related constructs.
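The stride-based loop versioning described above looks roughly like this when written by hand in C. The `desc1d` struct is a hypothetical, heavily simplified rank-1 descriptor (real descriptors carry more fields and use byte strides); only the versioning pattern itself is the point.

```c
#include <stddef.h>

/* Hypothetical simplified descriptor for a rank-1 real array. */
struct desc1d {
    float *base;      /* address of the first element */
    ptrdiff_t extent; /* number of elements */
    ptrdiff_t stride; /* distance between elements, counted in elements */
};

/* x = x * factor over an assumed-shape dummy argument. */
void scale(struct desc1d x, float factor)
{
    if (x.stride == 1) {
        /* Fast version: contiguous accesses, straightforward to vectorize. */
        float *p = x.base;
        for (ptrdiff_t i = 0; i < x.extent; ++i)
            p[i] *= factor;
    } else {
        /* General version: every access is scaled by the run-time stride. */
        for (ptrdiff_t i = 0; i < x.extent; ++i)
            x.base[i * x.stride] *= factor;
    }
}
```

The run-time check costs one comparison per call, which is usually negligible next to the gain from vectorizing the contiguous case.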
We have also made a lot of progress with the driver. It can now generate executables, and what is new is that we now handle target specification, fast math, and MLIR-level optimizations. Previously only LLVM optimizations were there; now we can control MLIR optimizations as well as LLVM pass plugins. People are continuing to work on LTO, saving optimization records, and supporting something called stack arrays. This final slide just says that you are all welcome to contribute to this project, and the details are here. Thank you. So, the question was whether debug information is preserved when you traverse the various layers of MLIR and LLVM IR. As of now we do not have that. The whole concept of the HLFIR operation is to put that information somewhere in MLIR at the highest level, and then we plan to propagate it further down. The hlfir.declare has a corresponding fir.declare that it will lower to, and from the fir.declare, when we convert to LLVM, we will just pass on that debug information. But debug support is quite early in Flang. As of now only function names and location numbers are supported, and maybe not by default: it is a pass you run separately on the code to add that, and the driver support is still pending. Yes. So, the question was whether old Fortran 77 code or legacy extensions are supported. Everything that is in the standard, and some well-known extensions, are there, and a lot of dusty deck code was tested with it, but I do not know whether the specific thing that you have in mind is supported or not. You have to try it out or look at the documentation. So, I am trying to understand what needs to be done for OpenMP, because OpenMP is a completely separate project, right. Yeah. So, the question is what needs to be done separately for OpenMP. As far as OpenMP is concerned, all the OpenMP work is mainly represented at the MLIR level by a separate dialect called the OpenMP dialect.
That dialect sits in the main MLIR repository, and when we generate code from the source program we create these additional operations from the OpenMP dialect. It has regions in MLIR, so it can capture things like a parallel directive much better than LLVM can. Roughly speaking, OpenMP is a set of intrinsics, more or less. When you say intrinsic, it usually means some kind of function, but these are MLIR operations. So, if you have something called a parallel directive, there is an operation in MLIR called omp.parallel, and it might have a lot of clauses, like what the threading model is and things like that. All that information is captured at that level, along with the code that goes in that parallel region. What we actually do now is try to share code with Clang. There is some code that has been refactored out of Clang and put into something called the OpenMP IR builder. So, when we lower from this dialect to LLVM, we use that to generate the LLVM IR. Thank you. |
Open source C/C++ embedded toolchains using LLVM |
Next up, we have Peter Smith, who will talk about embedded toolchains using LLVM. Hello, my name's Peter, and thank you all very much for staying up so late. It's almost bedtime. So I'll be here to talk about embedded toolchains using LLVM. So the first thing I want to clarify is what do I actually mean by an embedded toolchain? Now, some of the people here earlier on were talking about sanitizers, and we mentioned Yocto and embedded Linux. That's way too high-level. This is basically for bare-metal embedded systems. For those of you who already know this, I'm sorry, but this is for those of you who aren't necessarily familiar with what some of the differences are. So typically when you're developing on, say, Linux or Mac or Windows, whatever, you're developing with the knowledge of an operating system. So when you implement your C library, you already know you can use system calls. You know, if you want to get some more memory, you ask the operating system, that type of thing. By contrast, on an embedded system, you don't have an operating system you can ask for memory. So you basically have to roll part of that into the C library, that type of thing. Also, on a hosted system you're typically programming on the device you're actually running the program on, whereas on embedded systems you're cross-compiling. That is one thing that is likely shared with Yocto and embedded Linux, because quite often you cross-compile for speed there. Typically, you'll be static linking only, because your RTOS probably doesn't have a dynamic linker, and your RTOS might actually just be a library that you link into your program, that type of thing. So yeah, platform: if you're on Linux, you might be using glibc, that type of thing, and that will be provided by the platform, so when you have a toolchain, you might just need to provide a compiler and everything's there for you.
On embedded systems, you have to do everything yourself. I will mention one word there: freestanding. So there is a definition in the C++ standard of what freestanding means. It's a little loose. It kind of says this is basically the minimum you have to supply, but that's practically useless unless you want to write a full C++ standard library implementation yourself. So in effect, what happens is that most embedded C libraries tend to roll half of an operating system into themselves, at least the basic minimum. So that's what we're sort of talking about by an embedded toolchain. Okay. So the thing is, we already have embedded toolchains. Why do we need LLVM at this particular point? So these are some of the reasons why you might actually want to use LLVM over, say, something like GCC. First of all, Clang is kind of a natural cross compiler. So you don't actually have to gather GCC for Arm, GCC for RISC-V, GCC for AArch64; you just have one Clang. Now that is quite useful if you're a team where you don't want to use different compilers and different installations. It's an administrative benefit more than anything, but it can be a benefit in some places. Code generation can also be more mature, and I will say, for the sake of fairness, sometimes less mature than GCC, for example. Obviously, as somebody who works for Arm, all my examples are for Arm just because that's what I know, but I'm sure there are similar sorts of things on other architectures as well. So an example here: Armv8.1-M, which is one of Arm's most recent architectures for embedded CPUs, has got a vector extension, and basically Clang has got better support for auto-vectorization for this than GCC, simply because the work was done earlier, that type of thing.
But that's just one example of why, if you've got that particular target, you might want to use Clang, whereas if you've got a different target, GCC might be better at the moment. The other thing is taking advantage of some of the tooling that Clang provides. So I'm going to go into, in the next few slides, how you might be able to use some of the sanitizers. I know that in the earlier session this morning we were talking particularly about MSan and ASan, that type of thing, and those typically have quite a large runtime component, but there are sanitizers that you can use without that, and I'll just go through a few of those here. And finally, you've got diversity of implementation. Running more compilers is almost always good. Compilers find different sets of bugs, and sometimes programs find different sets of compiler bugs. Sorry? I was working recently on a safety-critical application for trains, and you actually have to implement several processes doing different things, so having two different compilers is a good thing in that application. Yes, definitely, yes, and certainly different programs find different compiler bugs as well, that sort of thing. So yeah, okay. So, using sanitizers on an embedded system. We kind of ran through some of this earlier on today. The main restriction for sanitizers is not actually the code generation, it's the runtimes. So if you look at the runtime for ASan, it's basically using a dynamic shared object to intercept the C library. It's got all sorts of bits that are operating system dependent, but of course in embedded you don't have an operating system, so it's very hard as a toolchain vendor to provide a kind of bare-metal runtime that doesn't depend on one very specific environment. But some of the sanitizers have a very minimal runtime, and some of these you can use here. So I'm just going to go through some of these right now.
So the first one to use is the undefined behavior sanitizer. By default that does have a runtime, but all that runtime is effectively doing is pretty-printing a nice error. But you might not care about pretty-printing a nice error; you might not even have anywhere to print to. So in this particular case, you can just say, okay, well, if there's undefined behavior in my program and someone's trying to attack me, maybe that's a bad thing. So maybe I just want to abort, say, for example, if I've got something out of range at run time. This particular example is just using a very standard integer overflow detection. And if you look on there, all it's really doing is saying: check for overflow, and if I overflow, branch to an undefined instruction that just happens to cause an abort on the processor, that type of thing. So yes, crash your program. There's also a minimal runtime. There is a default implementation of the minimal runtime in compiler-rt. You can't use that directly on an embedded system, but you can basically write your own. So instead of branching to an undefined instruction, it just calls a user-defined function, and you can basically make that do whatever you want. There are ones for log and continue, and there are ones for log and terminate, that type of thing. But basically the choice is yours. Those functions have got extremely trivial implementations that you can make work on an embedded system. Okay. The next one here is the kernel control flow integrity sanitizer. It's called KCFI, and I keep calling it KFC. Actually, I think I've even got it wrong on the slide, which is embarrassing. It should actually be KCFI at that particular point. So there is a control flow sanitizer that can work with embedded systems right now. That's the sort of full-fat sanitizer, I call it. But that requires link time optimization.
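The two UBSan modes described above, trap on overflow versus call a small user-supplied handler, can be imitated by hand in C. This is a sketch of the behaviour only: it uses the GCC/Clang builtins `__builtin_add_overflow` and `__builtin_trap`, and `on_overflow` is a made-up hook name, not the symbol the real minimal runtime expects. With Clang, the real thing is enabled with options such as `-fsanitize=signed-integer-overflow` together with `-fsanitize-trap=undefined` or `-fsanitize-minimal-runtime`.

```c
#include <limits.h>

/* Hypothetical user hook standing in for a minimal-runtime handler.
   On a real device it might bump a counter, toggle a pin, or reset. */
int overflow_reports;
void on_overflow(void)
{
    ++overflow_reports;
}

/* Trap mode: check for overflow and, if it happened, execute a trapping
   instruction, which on bare metal typically ends in a fault handler. */
int checked_add_trap(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result))
        __builtin_trap();
    return result;
}

/* Handler mode ("log and continue"): call the user hook, then carry on
   with the wrapped result. */
int checked_add_report(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result))
        on_overflow();
    return result;
}
```

A "log and terminate" variant would simply not return from the hook, for example by spinning or triggering a reset.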
So the advantage of the kernel control flow integrity sanitizer is that it doesn't need LTO. If anyone has tried to use LTO on embedded systems, it works until you've got a linker script, certainly with a linker script that depends on placing things in different places. So yeah, here's just a very trivial example of something that's calling through a function pointer, and this shows some of the code that's generated. What we essentially have is that this function pointer has a type, and you can basically make that into a signature. So what happens is we prefix the top of the function with the signature. And then, when we're about to call through this arbitrary function pointer, we load its signature and check to see if it matches what we want. And if it doesn't, boom. This doesn't, as far as I know, work on C++ vtables at the moment. Obviously this was implemented for the Linux kernel, so they don't care about C++. But if you're using C with function pointers, this is a relatively low-cost way to get control flow integrity checking. Okay. So these are just some of the components of an embedded toolchain. I'm kind of jumping around here at the moment. These are the things you would expect in a GCC embedded toolchain. And as you can see, in the LLVM project we've got pretty much all that we need in one place. We're only really missing a C library at the moment. I won't go through each individual thing in those tables, but you've got, you know, Clang, the compiler; you've got LLD, the linker; you've got implementations of objdump and readelf; you've got implementations of the C++ runtime library. As I say, what we're missing is a C library. Technically, GCC doesn't have a C library either, but there are hooks in the build system to basically build newlib in multilib configurations at that point.
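The signature check described above can be emulated in portable C to show the logic. In the real scheme the compiler stores a hash of the function's type immediately before the function's entry point and emits the check at every indirect call; here a small struct stands in for "hash followed by code", and the hash value `TYPE_ID_FLOAT_FLOAT` is an arbitrary constant chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

typedef float (*unary_fn)(float);

/* Arbitrary stand-in for the hash of the type float(float). */
#define TYPE_ID_FLOAT_FLOAT 0x1234abcdu

/* Emulates "type hash placed just before the function entry". */
struct tagged_fn {
    uint32_t type_id;
    unary_fn fn;
};

static float half(float x)
{
    return x * 0.5f;
}

const struct tagged_fn half_entry = { TYPE_ID_FLOAT_FLOAT, half };

/* The check the compiler would emit before an indirect call. */
int signature_matches(const struct tagged_fn *target, uint32_t expected)
{
    return target->type_id == expected;
}

/* Indirect call guarded by the signature check: on a mismatch the real
   scheme traps; abort() plays that role here. */
float checked_call(const struct tagged_fn *target, float arg)
{
    if (!signature_matches(target, TYPE_ID_FLOAT_FLOAT))
        abort();
    return target->fn(arg);
}
```

Because the expected hash is known at each call site from the pointer's static type, no whole-program view is needed, which is why this style of check does not depend on LTO.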
LLVM is developing a C library. I would say that currently it's focused on what you would probably call desktop use cases, but they are planning to have scalable implementations. So I think the end goal is that it will be able to cope with embedded systems, but I expect that to be some years down the line at the moment. Okay. So how would you actually assemble an LLVM toolchain from the LLVM project? And the honest answer is it's not as easy as it could be. Certainly when you're building a hosted toolchain, it's fairly easy. You just go to LLVM, CMake, Ninja, done. Actually building the tools is not difficult because they're all cross-compilers. They're all part of the default build. So if you want to get all the tools, very, very simple. Building the runtimes is a bit more difficult because you've got to cross-compile the runtimes, and you've got to do them in a particular order, and not all of them build in all configurations. So one of the big problems is that if you try to compile compiler-rt, it'll fail, because if you try, it'll end up building the sanitizers. And the sanitizers obviously have got dependencies on POSIX operating systems, which of course won't work. But you can, for example, build just the builtins, which are kind of like the libgcc equivalent. So what we've done at Arm is to put together an embedded toolchain for Cortex-M, which is Arm's microcontroller range. This is essentially a set of build scripts. It's all open source. We're using picolibc at the moment as our C library. We did start with newlib, but we moved on to picolibc at that point. But you can make it work with newlib if you want to. So yeah, it's primarily just build scripts.
It hasn't got a copy of the LLVM project embedded in it. It will just go and fetch LLVM from the actual source repository. And yes, it's got a few samples for building some programs, that type of thing. As I say, it's by Arm, for Arm. But I'm sure if anybody wanted to apply it to a different microprocessor, they pretty much could, because it's essentially just a bit of CMake that you can adapt. So what's the usability of an LLVM toolchain like next to, say, the GNU embedded toolchain? One of the main things we're missing at the moment is multilib support. Now there is some multilib support for certain targets. For example, I think there are some RISC-V multilibs that are already in the bare-metal driver. But that's not the case for Arm at the moment. I'll go on to what we're doing about that in a few slides' time. Clang also doesn't have a direct equivalent of GCC specs files. Specs files are basically just fragments of command line, but they're not just raw command lines. They have got some intelligence, and they can talk to each other and override defaults. So as an example here, nano.specs and rdimon.specs: that says give me newlib-nano, which is the really small version of newlib, and rdimon is the semihosted version, which is easier to run on emulators, that type of thing. For the LLVM embedded toolchain, because we don't have the ability for one specs file to say, ah, someone has added this other specs file, so I'm going to modify my behavior, we basically have to have multiple config files that just blow up for all of the possible combinations. So as you see there, we've got an Armv6-M variant, which ideally would be handled by multilib, in the rdimon semihosted version. And yeah, there are just more configuration files than you really ought to have. And I would say there's a long tail of small incompatibilities.
You might find that LLD doesn't do orphan placement exactly the same way as GNU ld does, that type of thing. But normally with these small incompatibilities, you can kind of code around them. There's normally a way you can make it work. That's what we've found so far anyway. Okay. So this is, again, jumping around a bit, just showing you how Clang might do some of this. If any of you have played around with the Clang driver, you'll know you give it the target triple. Normally if you're using Clang on Linux, your target triple is native, I guess, or you're using the default triple that's there. But if you're doing cross compilation, you have to give it an architecture and environment. So you've got the linux-gnu there. This is the one you'd use if you were targeting something like Yocto, that type of thing. The Clang driver will then work out where all of your header files are and what your target features are, and pass them on as much lower-level, private command line options. What we find for embedded systems is that Clang added a bare-metal driver, probably a few years ago, but it's only recently getting a bit more development. In particular, the multilib support for RISC-V came in fairly recently. And that's used when you have a target that the bare-metal driver handles. So far, that's only Arm, AArch64 and RISC-V at the moment. In theory, it could be added for any other target. If you happen to be doing bare-metal development on x86 and you don't match, say, a Linux operating system or BSD or whatever, you end up getting forwarded to the generic GCC driver, which basically hands everything to GCC, which generally knows what to do. So as long as you've got a GCC toolchain, if you give an object file to GCC, GCC will say, oh, I'll just fire that at the linker, that type of thing. So it will work itself out.
Okay. So I've just basically repeated what I've just said there. It will default to the LLVM tools at that particular point. So as the last part of the talk, I just want to go through some of the ongoing work that's happening in Clang and some of the community involvement that's going on here. One of the first, and probably the major, bits of work that we're doing at the moment is what I'm going to call data-driven multilib. Currently, multilib support in Clang is hard-coded. It's basically a C++ class where you describe what the multilib will do. Now, that works pretty well if you're doing things like 32-bit or 64-bit x86 in, say, Debian or Red Hat, because the structures are well known and they are stable. Whereas there's no way every possible embedded toolchain, with every possible library variant that you might want, could get that hard-coded in upstream Clang. So typically, what you find is that every toolchain based on LLVM has its own downstream patch if it wants to support multilib. GCC allows you to set this up at configure time, and the GCC way basically maps command line options onto directories. For Clang, we can do a bit better because the Clang driver has a bit more scope to do things like use the target parser and find out more about what the CPU can do. So at the moment, we're proposing that you have a stacked tower of multilibs, almost like the layers of a Docker container file, where each layer can override the previous one, so that you can describe what your multilib configuration is and Clang will be able to read this configuration file. It will basically allow people to have multilib toolchains without having to hard-code them in downstream patches, that type of thing. This is still in active development. There's an RFC that went up probably a few weeks ago, and there are some links to the patches and that sort of thing.
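As a rough illustration of "map command line options onto directories, with later layers overriding earlier ones", here is a small C sketch. It is not Clang's or GCC's implementation and does not reflect the format proposed in the RFC; the table contents, directory names and matching rule are all assumptions made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* One multilib variant: the flags it requires and the library directory
   it provides. Up to two required flags, terminated by NULL. */
struct multilib_variant {
    const char *required_flags[3];
    const char *dir;
};

/* Hypothetical table; later entries act like higher layers and win when
   more than one variant matches. */
static const struct multilib_variant variants[] = {
    { { NULL }, "lib/default" },
    { { "-march=armv6-m", NULL }, "lib/armv6m_soft_nofp" },
    { { "-march=armv7-m", NULL }, "lib/armv7m_soft_nofp" },
    { { "-march=armv7-m", "-mfloat-abi=hard", NULL }, "lib/armv7m_hard_fp" },
};

static int has_flag(const char *const *flags, size_t n, const char *want)
{
    for (size_t i = 0; i < n; ++i)
        if (strcmp(flags[i], want) == 0)
            return 1;
    return 0;
}

/* Return the directory of the last variant whose required flags are all
   present on the command line. */
const char *select_multilib(const char *const *flags, size_t n)
{
    const char *chosen = NULL;
    for (size_t v = 0; v < sizeof variants / sizeof variants[0]; ++v) {
        int ok = 1;
        for (size_t r = 0; r < 3 && variants[v].required_flags[r] != NULL; ++r)
            if (!has_flag(flags, n, variants[v].required_flags[r]))
                ok = 0;
        if (ok)
            chosen = variants[v].dir;
    }
    return chosen;
}
```

The point of making this data-driven is that the table could live in a configuration file shipped with the toolchain rather than in compiled driver code.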
So please do, if you're interested in data-driven multilib and how it develops, please do comment on those patches and the RFC. So, future work. We'd ideally like to get some upstream build bots for some of the compiler-rt runtimes. So whilst there are build bots for AArch64 and Arm Linux, we haven't got build bots for, say, the builtins for, say, v6-M, v7-M, the sort of very low-level embedded targets. And we think that would be good to have; well, obviously the more build bots the better, I think, at that particular point. There is some work going on at TI. So that link to YouTube is to a presentation at the last LLVM developer meeting, basically adding attributes from the linker script so that you can say things like: this section must go in this place, this output section, and this one must go in this other one. Please do not cross-module inline across these boundaries, because these things might not be in memory at the same time, that type of thing. And also: I need this section to have this particular name, so please don't give it a different name or merge it, that type of thing. So that should be able to make LTO much more usable with linker scripts. And what we tend to find with Clang is that if you get it right, LTO is very aggressive at removing code that's not needed. So that's actually very good for code size if you can make it work. Certainly we've seen that for benchmarks LTO is great, but then we say to customers, hey, use LTO, and they go, ah, but we can't because of the linker script, that type of thing. The next one is not strictly embedded, but it is very important for the safety-critical industry, which often is, you know, by definition embedded because you're controlling some kind of hardware. And this is something called MC/DC code coverage.
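To make the MC/DC idea concrete: for each condition in a decision, you look for a pair of test vectors that differ only in that condition and produce different outcomes. The Python sketch below enumerates such pairs by brute force. It is an illustration of the unique-cause form of the criterion only; the function name `independence_pairs` is made up for this example and it says nothing about how the Clang coverage patch is implemented.

```python
from itertools import product


def independence_pairs(decision, n):
    """For each of n boolean conditions, list the pairs of input vectors that
    differ only in that condition and flip the outcome of the decision."""
    pairs = {}
    for i in range(n):
        found = []
        for vec in product([False, True], repeat=n):
            if vec[i]:
                continue  # visit each pair once, starting from the False side
            flipped = vec[:i] + (True,) + vec[i + 1:]
            if decision(*vec) != decision(*flipped):
                found.append((vec, flipped))
        pairs[i] = found
    return pairs


def example_decision(a, b, c):
    """A sample decision with three conditions."""
    return a and (b or c)


PAIRS = independence_pairs(example_decision, 3)
```

Choosing one pair per condition from `PAIRS` gives a test set that shows every condition independently affecting the outcome, which is typically far smaller than all 2^n input combinations.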
And that is kind of a special form of code coverage where, if you can imagine something like an if with conditions A, B, C, D and E, it's a way of sort of deriving test cases so that, well, it's not quite exhaustive, but it covers more than just did this branch go this way or that way. It's like, did it go this way because this condition held, that type of thing. Hopefully that's not going to show up too much there. And yeah, so there's a patch in for generating that in the code coverage tooling. And obviously, LLVM libc is developing, and we would like that to support embedded systems. Okay, I'll skip through this very quickly. There are some patches up for big-endian support. If anyone actually uses big-endian, I don't know. I'll be rude there, as an Arm person: almost all of Arm is little-endian. And then there are the Cortex-M Security Extensions, which are very useful if you're trying to have a secure state and a non-secure state. So support for that is in LLD. Again, if anyone wants to comment on those patches, please do. Okay, so finally, if you do want to contribute to this, and this is not just as a developer, we're perfectly happy to have contributions from users as well, or just in some ways telling us what's important. So Clang has pretty much come out of what I call the hosted community. At least right now, I would say there are a lot fewer people in the embedded systems area than there are on GCC. So, you know, there are certain features that are useful in embedded toolchains, but not necessarily in, say, hosted toolchains. So just telling the community that you need these features is often helpful, because quite often people will say, why do we need all this complexity for this thing? No one's going to use it.
And it's like, well, and you can only get people only get used to features if they're there, but then you can't get them in, you know, chicken and egg situation there. So yeah, so there is a four weekly call that goes on, unfortunately, at a time slot that's not great for Europeans, but this is the only sort of time slot you can kind of get across US and Europe together at that particular point. So that's probably about, I'd say about 20 people turn up. And that's really just about the various people who are working on embedded systems and if they want to sort of highlight patches that want to be reviewed, discuss new features. Last time we were talking about how we might improve LLDs, observability of diagnostics, that type of thing. Obviously bug reports, welcome at those links. And obviously if you attend the developer meetings, there's often a round table on embedded systems at that point. And with that, that's my last slide. So hopefully we've got a few minutes for questions. I'm trying to understand something. I know of some people who say that they're using LLVM for embedded already. Does this mean that they're using the other definition of embedded? So there's two, well, you can do it. There'll be three ways they can do it. One of them is they're kind of using an LLVM based tool chain from a vendor. So that vendor will have done all of that packaging up. Or it will be like, for example, ARM will sell you a commercial tool chain that is a derivative of Clang, that type of thing. That's one way of doing it, that sort of thing. Or they might be using embedded Linux, that type of thing. The question was, sorry, I've been holding up a picture all day saying, please repeat the question. I didn't. And the question was, some people say they're already using LLVM. Does that mean they were using a hosted system or not? Okay. Just the first, sorry, go out the back there. Yes. So one of the things I noticed is LLVM ships its own assembler. 
We've noticed for some LLVM projects that they have trouble with some of the new assembler macros. So we have to go back to the truth and just put some of the targets. Is there some plans for work on this? So with the latest LLVM, I know, sorry, sorry, repeat the question. So the question was, the LLVM has an integrated assembler. GNU has GNU AS. And there are some directives or macro support that might be in the GNU assembler, but not LLVM. So I think it's generally done on demand. So there was a big effort to get the Linux kernel compiled with Clang. And that added quite a lot of features that were basically needed for the Linux kernel. So the best thing to do is have a really important project that's a big company wants to get compiled with the integrated assembler. Yes. Yes. Yes. Yes. That is a very good way of doing it. As macros were in the Linux kernel and they asked us to support them and say, no, screw this, the kernel changed away from macros. And they changed away from macros. Yeah. But there certainly was support. There was, I think there is a directive where you can switch the GNU assembler into advanced macro mode or something like that. Or I can't remember. No, that's not that. That's the inline assembly thing. There is an F. Yes, there is a high GNU extensions option. But no, there was a patch that probably landed a few years ago. So depending on how long ago you tried, then there was some support done for more macros. But whether it's got all of it or not, I don't know. Well, I actually don't. I've got a lot of problems. It was a while ago. Right. Yes. You may find that someone has already fixed that already. Yes. Thank you. Yeah. So my question is about, like, we, for example, tried to deploy machine learning models on tiny bare metal devices. Yeah. And there we are also looking into, for example, TVM as a tool chain, but also MLIR now, or like the EV project from Google. Yeah. 
And they're basically what they do is they use this entire tool chain, and then they use the MITC dialect, for example, in EV to MITC code again. Okay. To then put it into an embedded tool chain to actually do the final compilation stuff. Do you think that there is, or what is basically needed to omit this last going back to C, or is this a good idea or not? Or... Oh, well, I mean, I suppose... I'm just trying to think how, not very familiar. So the question was about people deploying machine learning models on small devices, and they're currently outputting to a C back end, and then recompiling that C back end. And do I think this is a good idea or not? I mean, I guess the C back ends are often the, how do I get this up and running as quickly as possible? I do know that there are, I guess, machine learning compilers that have got, I guess, code generation out. I mean, I guess if you're using LLVM itself, it's probably not too difficult to just lower to LLVM and get most, and you then get the code generation for free. I guess the bit that you might not get is, have you got all of the run time and intrinsics that you might have that the C compiler might insert, but I don't, you might find someone else knows why. So maybe just in addition, it does not compile the whole machine learning models to C, so they are still a static library linked in which it's generated via LLVM. It's just some parts around, so it's not that they are as pure C in the end. That's not done in the approach. Okay. Yes, sir, yes. I was one of those unfortunate uses of big ambient arm. We were running several compilers in this safety-critical application. Everyone had a problem with one thing, and that's the linker. We're trying to generate a header. And the linker, you can't insert a text string in the linker. It's very difficult to insert static data. We want to insert information about how large is the section. That was basically possible to do with the linker. 
How does LLVM handle all these other things? With great difficulty, I think, I think there isn't really a, there isn't, I think the, yeah, I think Christoph is nailed it in that, what I would probably do myself is reserve some space in the binary, name a section out of it, and then use Obstump to poke it in at that particular point. Can you see my problem? No, yes. I mean, there are some... Putting a string in the link command file. Well, it's a bit more difficult. It's the, I think the linker needs to know the length. I mean, I suppose you could do it with horrifying things. You could use data statements in the linker script, but that sounds a bit, it's really what you would want. I want it. Yes, really, I think, yeah, because you get a number, but really, yeah, I suppose, yeah, find the extension request of a linker script. I mean, specify the size of the section to start with, then I know the size. Yeah, I think we do have the problem, we do have a problem with things like build ID, I think at that particular point where you're generating the build ID string, which needs to know everything all at once, but yeah, I get, yeah. Unfortunately, there's nothing in the LLD linker that's different. If someone should try to generate the header for a binary, contain the interest in information, then you quickly realize all the problems. Oh, yep, sure. The new assembler has the ink bin, doesn't it? Yes, the assembler does have ink bin, but I think the idea is, for the header, you want the linker to generate something based on the properties of something, but... You want to generate the information about the link time, not when you assembled it two weeks ago. You can assemble, just before the link is done, assemble something that's generated. Even.ovmile, and then link that in using the link script. I know there are workarounds, but a good workaround would have to have a good link. 
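The workaround mentioned in this exchange, reserving space in the binary and filling it in after the link, can be sketched as a small post-link step. The Python below works on a raw byte image and assumes a hypothetical 12-byte header layout (a 4-byte magic, a 4-byte size and a 4-byte CRC-32); the marker `HDR!`, the layout and the function name are inventions for this example. In a real flow the reserved section would more likely be located and rewritten with a tool such as objcopy.

```python
import struct
import zlib

MAGIC = b"HDR!"                       # hypothetical marker for the reserved header
PLACEHOLDER = MAGIC + b"\x00" * 8     # magic + empty size and checksum fields


def patch_header(image: bytes) -> bytes:
    """Fill the reserved header with the size and CRC-32 of what follows it."""
    offset = image.find(PLACEHOLDER)
    if offset < 0:
        raise ValueError("reserved header not found in image")
    payload = image[offset + len(PLACEHOLDER):]
    header = MAGIC + struct.pack("<II", len(payload), zlib.crc32(payload))
    # The header has the same length as the placeholder, so offsets are preserved.
    return image[:offset] + header + payload
```

Because this runs as a separate process after the linker, it has exactly the drawback raised in the discussion: the checksum is produced by a different tool from the one that produced the image.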
Yeah, I mean, I think, I mean, a lot of the times with linkers, it's the, because one of the, one of the perennial things you could ask is, how do I embed some kind of custom checksum that I've written at link time, you know, that type of thing? And it's just which one, and do you then have a linker Python script extension or plug-in? It just tends to build. Yeah. Check some afterwards, and that's also something that should be supported in linker. Yeah. And it would be done, just say, run this application afterwards on this section, something like that. Yeah. So it's... Do you do a partial link, and then analyze the object file, then do a final link? Yeah. I mean, I guess... More workarounds. Yeah. I mean, to paraphrase, I guess, is the tools are supposed to make users' life easier, I suppose, at that particular point. So if it's a common enough thing to do, then it should be, they should be able to find a way of doing it. And if this is security, then you don't want to generate the checksum in one process, and then use it in the other process. You want to use all in one, because I know who generated this, and it didn't come from outside. We actually have to have two different programs, and is that the case? Yeah. So, I wait, so I better go for... Yeah, so I just have a follow-up to that discussion, so I'm not an LFM developer, but I deal with a lot of built-ins. Does anyone working on stuff to make things a bit better, this sort of thing? Because essentially, like, you're talking about communicating between the file stage and the link stage, and introducing dependencies to what you have, and that you have to link with after that. Right. And now, like, it seems like you need, like, a schema, a data format, and a dependency specification for that, so the built systems could actually use it, and spare the users who have to deal with that. Nice. I mean, I think that the big... 
I say it's mostly a, what I would call, almost a coordination problem between getting the right people on board at that particular point, and it's... So it's quite... Can you repeat the question? Sorry. Yes. Okay. So the question was about, is anybody working on build systems and things that they will be able to communicate the... Of the linker to be able to communicate to the build system and automate things like the checksome sort of handling and that type of thing. I mean, I think the major difficulty is just an LVM is an open source project, and there's often, as soon as you open something like that up, it ends up in lots and lots of discussions about what the right way, and you can easily find a way that works for one, a small number of people, but completely doesn't work for someone else, so it's one of those... It first of all needs someone brave enough to actually try it rather than just implementing it downstream. So I think it's... What it really needs in this case is, because this is sort of things that... This is not... It really needs people to go on the LVM mailing list and say, yes, we really need this, because typically this sort of thing is to silent people who say, oh, this stuff's all rubbish, but we don't... As developers, we don't get to hear about it, or at least we don't get to hear it loud enough for the people who pay our wages to say go and work on it. Yeah. Right, yeah. Okay, I probably ought to hold it there to let everyone go, I think, at that point. Thank you very much for staying and I'll see you all next time. |
New Year -> New major-major version of MariaDB |
What's new in 11.0? That's basically a new optimizer release. Well, not everything is new, but at least the cost model is totally new. So I think that from the optimizer point of view this is one of the biggest milestones; the only time we did something comparable was in MariaDB 5.3. And the big change is that before, one cost unit, if you look at last query cost, was one IO, and that one IO was not really that exact. It could basically be one IO or one key read or one row read or one access to some file, and that also meant that things were not that balanced. So some costs were just picked like, oh, this sounds good, let's use that, and even MySQL has that even now. So I decided to actually use something that you can measure, and that also makes it very easy to fix the optimizer: if you see that the cost does not match the milliseconds, then something is off and you adjust things accordingly. And we decided to call this 11.0 because if you change things in the optimizer as drastically as we have done, some plans may change. Hopefully it should always be for the better, because now we actually have a proper cost, and it's really easy to change things because almost all costs are available for the user to change. So I think that this will be a good foundation for all future MariaDB releases. So with the optimizer the idea is to get better table combinations and better plans. The old optimizer was actually pretty good at deciding things for simple cases, because if it found a good index it would use it, and so on. But when it had to decide, should I use this index or this index, where this index I could use with the index and the lookup and the other one not, there things started to fall apart, and also costs between different engines were not taken into account. The only one that had some information was the heap table; all the other ones were more or less the same. So I wanted to fix that and also allow people to fine-tune the optimizer.
So what the new optimizer should be able to do, it should be able to use the different methods to access rows which is table scan, index scan, index merge and hash and be able to choose those correctly what is optimal for things. And I don't, only those who have complex queries should see a big difference. And I don't know how many user use optimizer trace that was added to MariaDB 10.4 but I couldn't have done this work without that because that shows me exactly how the planner is doing and we have, I've been able to use it to find out where the optimizer calculates things wrong and as part of 11.0 I've been, lots of things added to it so it's very easy to know look at the plan and see if the optimizer does something wrong. There's lots and lots of bug fixes related to optimizer like selectivity, you couldn't use selectivity level four at all before, sometimes the selectivity would become bigger than one which means that the optimizer would assume that you will get more rows when you have condition instead of less rows. So all that should be fixed. And we also added lots of new optimizations like if you have several indexes that you can use and one index is faster than another. But we noticed that the slow index actually will result in smaller set of rows. We actually used that as the estimated rows, something we didn't do before but that helps with a lot of different plans. When we have created the right tables before they were not using unique keys I don't really know why that decision was made but know most the right table using unique keys which are faster because the optimizer can estimate better how many rows we actually will do. Cost calculations we have, there's lots of different places where we calculate costs. I basically gone through as far as I know every single one and they show that the costs are comparable and will be close to microseconds for those. For example we didn't really have a good estimate before, what's the cost of file sort? 
Now we have. We have filters, selectivity, materialization, using index for GROUP BY, and one big difference is that all these access costs are now based on SSDs, not hard disks as before. I think that most people use SSDs with the database, but that's something you can actually change just by changing one variable. We also had a problem with costs if you have a big table and then a small lookup table. Basically, before we assumed that every read in the lookup table would have a disk access, but in practice, if the table is small, after you have read a couple of rows everything is in memory. Now we assume that. So here you can see some of the costs, and one can retrieve those for every engine; this is just to show the difference between InnoDB and Aria. So InnoDB is using a clustered index, Aria is using direct access to rows, cached. So with InnoDB the key lookup and the row lookup cost are roughly the same, which means that if you search for a key and you search for a row, both are using indexes, so it's roughly the same. With Aria, for example, the row lookup cost is notably smaller. So this is one example of why it's important to do this at a very, very low level. All costs, all engine costs and most SQL costs, are now available. So for example, optimizer disk read cost: this is the time to read a 4K block from the device, and the default is a typical SSD. If you have a hard disk you just have to change that one cost. And the disk read ratio is how often we actually have to go to disk. Is there a way on your system just to run something so you can populate these values automatically? You don't; the only one that you need to populate is basically the optimizer disk read cost. The others are basically there so that, assuming something goes really wrong, you can populate them. I don't see that. They are part of the engine behavior not part of the amount of behavior.
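To illustrate how a measurable cost model of this kind lets plans be compared, here is a small Python sketch. It is not MariaDB's actual formula: the class `CostParams`, the functions `access_cost` and `cheapest`, and all the numbers are assumptions chosen for the example. The only ideas taken from the talk are that a block read has a time cost, and that a disk read ratio scales how often a block access really goes to the device.

```python
from dataclasses import dataclass


@dataclass
class CostParams:
    disk_read_cost_us: float = 10.0   # assumed time to read one block, microseconds
    disk_read_ratio: float = 0.02     # assumed fraction of block accesses hitting disk
    row_cost_us: float = 0.05         # assumed CPU cost of handling one row


def access_cost(blocks, rows, p):
    """Estimated cost in microseconds of touching `blocks` blocks and `rows` rows."""
    io = blocks * p.disk_read_ratio * p.disk_read_cost_us
    cpu = rows * p.row_cost_us
    return io + cpu


def cheapest(plans, p):
    """plans maps a plan name to (blocks, rows); return the lowest-cost name."""
    return min(plans, key=lambda name: access_cost(*plans[name], p))


# Hypothetical access paths for a large table and for a ten-row table.
BIG_TABLE = {"table_scan": (1000, 100000), "index_lookup": (300, 100)}
SMALL_TABLE = {"table_scan": (1, 10), "index_lookup": (30, 10)}
```

With the default parameters the index lookup wins on the large table and the table scan wins on the small one. Modelling a slower device only requires raising `disk_read_cost_us`, which mirrors the point that a single variable describes the storage.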
So basically the three things that you normally would like to change is disk read cost if you have a fast SSD then the disk read ratio I plan to sometimes do that automatically based on engine statistics. I didn't want to do that at the beginning because if we do that automatically that means that you do a query and then you do the same query and then the plan changes. That confuses people. It may be better but so and the wear cost is the cost added to each row. So if you want to ensure that you get the minimal amount of row accesses you just increase that one. So how I checked all these things was that there's a part program part of the server check cost you can run it with any engine and it then produce that's a lot of different checks tables can index can key look up and so on and then it you get here you get the costs and here you have the timing milliseconds for doing that and if the if things are correct you get as far close to one and this I have a fixed for all engines. So that means that I have a way to verify that the cost is up or okay it's almost impossible to get them totally because even when you run things on a machine things actually changes from run to run but it's a millisecond yeah there were lots of things in the optimizer that was cost based sorry but still also a lot of things rule based no basically everything is everything I found is no cost base which means that it's easier for the optimizer to do choices therefore there is patches in the DB that's all still in MySQL where they recommend that MySQL prefers table scan so let's reduce all index scans to half just to force the optimizer to use indexes instead of table scans which is of course a disaster for the optimizer because then it gets wrong data and can do good decisions so all of those are removed so no inner DB gives the best optimizer it can and that helps things a lot and I spent a lot of time doing improving things from performance point of view especially is probably 50% faster 
than before more caching simplified code and had I haven't worked in the optimizer since the first version of MySQL in 95 maybe between 95 and 2000 I worked on it and then a lot of other people worked on it and they did a lot of amazing jobs in different parts of the server but nobody took the time to ensure that how things related to this one and this one and this one especially with costs so all of that's no done I also fixed small things then we also I also changed that we tried to use a longer indexes if they are there the one thing that is a problem is that especially for the test suite is that no when we actually have proper costs table scans is preferred for most queries in the test system because the tables can both in the DB and other engines is really really really fast one disk seek and you get hundred rows compared to index lookups so there's optimizers can set up cost that is is I think it's 10 milliseconds as default just to encourage the optimizer to use indexes for small tables mostly because if you don't that you can confuse both the test system and users and if they're small tables nobody like here if it takes one 10 to a milliseconds or 100 slow basically this affect tables that are less than than 20 rows and that's unfortunately most of the tests in my Maria DB have 10 rows or small yes what I see in concurrency is that it doesn't work very well because he's doing a lot of full scan at the end of the plan even if the index is there and could choose the accurate yeah but then no things are cost based and except with a small very small penalty for table scan but that's more for getting more and more chocolate so that you don't need indexes anymore because the hash algorithms are much faster hash is really really slow for you if you are going to fit a small of a small amount of rows I was in 10,000 20,000 12,000 of those that you have all in cash and also depending it depends on total on queries and and concurrency hashing takes a lot of memory no 
but that means that you get this concurrency because the CPU is just moving things from memory when it doesn't so hashing is good in some cases but it especially if you want to access a lot of rows directly or indirectly if you only need to access a few rows then hashing is really a disaster yeah and most if you look at banks and sections everything else hashing wouldn't work or any of those because usually just want to have everything from a small set of customers so here the from the user point of view those are the only variables that I think that you but may need those who create engines may need more and one thing to be aware of that from the use user variables they are in in microseconds not in milliseconds because I first had them in in milliseconds but the numbers get so small that it was very hard to look at those so there's a so when they internally they use the milliseconds but from user point the costs are in microseconds you have these variables here all these are just for memory and then you have a cost for fetching the fetching the disks fetching the blocks yeah so this is all memory yeah so optimize optimize a discrete cost that's the one that is an IO so there's like running more easy on like bare metal machine with very fast SSD like the IO is half it's one 10 of a millisecond but in the cloud IOs go through yeah network as you compare but I guess you need to tune this variable probably I haven't done that so I've been basically focused just to get this to work so everything is focused on getting the memory part of work but the disk is there and it's only two variables so it's very hard to get those totally wrong yeah if you run on a managed database in the cloud they might tune it with people like you might miss this would be interesting to see like the difference in how wrong things get if MeruDB thinks you're on fast SSDs but you're actually on network SSDs and this variable every engine has has its own variable so you can choose it changes for 
different engines if you want if you for example run on different devices so chasing cost variables is easy you just session or change global all engines are global but the wear cost and or that things are local so this is see I have a couple of minutes left so so question why does this matter to you if you ever had to go and say force index or have to try to tweak queries in any way or had to use analyze the analysis table is still useful because you get the statistics but I would say that one main thing is that much less tweaking queries these should just work and especially we use in a DBM memory engine for example or other engines things are not a little better and no even the server knows that no we have temporal tables in in in area or heap so it can take that that cost into account so state of things basically everything is done we have had QA testing this no form is one month founder some bugs most of the bugs is in the also in older releases so I in this I basically fixing everything related to optimizer in this one there's one issue left that I will push on at this week and then basically the level should be done we have a BB 11 0 that includes everything I think this is one of the most tested releases ever done internally just because I've been working so much with our QA team so I expect this be almost stable from from start or access table from start and for anybody who wants to help if you have a slave where you can put 11 0 on put it on send feedback make your entries anything related to optimizer will be fixed it immediately should it should be the same except something like row by the filters is faster so all code I don't think that anything will be slower in the optimizer but the plan should be better so the end result should be faster there are a couple of things that are not a little better row by the filter means that if you have two indexes you can use and then we then we create we take we will use the the faster one but if if it makes sense 
we take the other one fetch all primary keys and then when we read other ones we see that only those who has an existing primary key we need to consider so basically where there's a wind we don't have to fetch the row for things that we can filter out and the algorithm's name you know basically it's a lookup of all primary keys that are acceptable and we use the used we do a check against those which is actually pretty fast so state of things basically basically ready this will be released I think it's next in February yes so this month so future plans I will start working on parallel query there's still some optimized cleanups to be done I also want to enable all optimizers which is by default for example MariaD has supported has joins forever with the actually very hard to get to enable those bushy plans is something that we I would like to do because we have this you big users who would need that bushy plans basically have two big tables and then lots of tables that you have a relationship and then you have a join between those directly the big tables directly or indirectly and our optimizer currently can't do that very very efficiently so that's something I would like to do but the parallel query is the next big task that I will start was start on and I have some plans or ideas how to do that and I've been working with Sergi Petrugna who's know the leader the optimizer for doing this and then he got go got some help from the sensor Andrew so that's about 20 minutes okay thanks I don't think we have time for questions our next speaker set up but if you have questions from my team you can chat to him in the hallway we've got a stand downstairs where you can meet some other MariaD team as well it's the one without any uh any banner or anything like that because the person was supposed to bring all this work but let's say this way I'm really happy with this work and I think that for those who have complex plan which is especially when we're looking at things coming 
from Oracle customers, where we have lots of stored procedures and really big queries, queries that are thousands of lines long, and that is just one query. There we really need this optimizer. So thank you. |
An introduction to MariaDB contributions |
Alright everyone, good morning. So my name is Andrew Hutchings, I'm also known as Linux Jedi everywhere, long story behind that, another time. I'm the Chief Contributions Officer for MariaDB Foundation and I want to start out by kind of saying what all the different MariaDBs are because when someone uses the term MariaDB it can mean a lot of different things. So there's MariaDB Server which is the software kind of everyone knows, there's MariaDB Corporation which is the kind of for-profit entity that does all the support, consulting, they do SkySQL, Max Scale, etc. Then there's the Foundation which is a non-profit entity, we're funded by kind of lots of different third party companies and we are there to essentially continue the MariaDB source code for the community essentially. So bit of a weird job title, Chief Contributions Officer. So I'm going to explain it using the pillars of the MariaDB Foundation and how they apply to my roles. So this and the MariaDB Foundation are openness, adoption and continuity. And on the continuity side I try and make all the kinds of different contributions easier for the community to contribute by reducing the kind of time between opening and closing pull requests but not on the cost of quality or communications to the contributor. On the openness side we create and publish metrics, I'll be talking about those a little bit but essentially it's so you can see exactly what is going on with the community contributions. And on the adoption side we work with communities such as operating systems and also end applications such as WordPress that use MariaDB to make sure that we can integrate well with them and grow adoption that way. So there are lots of different types of contributions, when people hear the word contribution they usually think of code contributions and those are the ones I'm mostly going to be talking about today. 
But there are lots of others that are important. You know, funding for a non-profit foundation is really important; it helps us grow the community around everything. Documentation is really important: if you can't contribute code, then contributing documentation is quite useful for us. Answering questions, too. I say this because we have a Zulip, we have people asking questions on Stack Overflow and Reddit, we have a community Slack, we have mailing lists, etc., and a small foundation can't get to everyone and everywhere. So if you know how to answer a question, then replying is a contribution, and we would love you for it. And so, I grew up in England, we suck at languages, I barely speak English, it's terrible. So any help we can get with translating error messages, things like that, is really useful to us, and we're working on making that a bit easier from a workflow point of view. And then usage, bug reports, feature requests: actually using the thing and telling us what you like, what you don't like, what's broken, what's not broken, what you want in there. That's a contribution, and that's really useful to us as well, just as much as a code contribution is. So going a bit further into non-code contributions, I'm going to talk a little bit about Intel, who's a sponsor of ours. They do lots of non-code contributions, sorry, morning, and they can't really do that many code contributions due to some legal stuff that they have to go through every time they contribute code. But they are doing things like constantly benchmarking MariaDB against their current and upcoming hardware and feeding the information back to us: what's working, what's not working, which hardware combinations are working and which are not.
And when they do spot some kind of regression on some new hardware, or there's a release that's caused a regression on their hardware, they will dig deep and tell us where to look in the code and what they've spotted, and then our engineers can work on improving that. So they've worked a lot with Marko on InnoDB, I'm sure he's in here somewhere, to improve the performance, certainly in 10.6 recently. They do supply us with hardware to test against, so a lot of the Buildbot infrastructure is on Intel hardware, and they've given us financial support. And Steve Shaw from Intel is on the board of the MariaDB Foundation. So there will always be things you can do to contribute, even if you can't contribute code to MariaDB. Why are contributions important? Well, we get more diverse input from each person's life experience. So if a project is built by one team in one country, in one office, for example, you're not going to get a diverse view of not just culture, but use cases, et cetera. So I think it's really important to get contributions from a wide group of people. You get to direct a project the way the users want rather than being led by one single entity, one single corporation. So if a corporation says, OK, all the money is here, they're going to put all the resources into developing those features, and it might not be what somebody using WordPress wants, for example. You're fixing bugs and things that are important to you, and I think that's quite important. And you're building a real community around the project. So it wouldn't be a FOSDEM for me if I didn't talk about Drizzle. For those who don't know, Drizzle was a database server; it was a fork of MySQL 6 back in 2009. It started at Sun Microsystems and it was designed to be a microkernel kind of fork with loads of different plugins, optimized for web and cloud usage. It eventually died, so that's why you probably haven't heard of it.
But in 2009, we had a talk where we said we want 50% of the code contributions to come from outside of Sun Microsystems, and we kind of met this goal in a unique way. Oracle bought Sun and fired everybody. So we did meet the goal, and everyone went to Rackspace. But my point is, MariaDB Server has more external contributors than internal contributors. So in 2022 the Corporation had 36 code contributors, there were eight from the MariaDB Foundation, and there were 68 code contributors from elsewhere. Now, obviously, those contributors are not working full time on the code base, but it does mean that they fix the problems that are important to them. And it's a pretty impressive stat, I think. And we had similar stats in 2019. You know, something happened in 2020, can't think what. Of course, it was kind of implied, but yes. Also, many of the contributions we got were from China, and that was visited a lot before COVID. Yeah. So if you don't see contributors, they forget. Exactly. So as Monty said, COVID hit, and China kind of caused the stats to dip a little bit, and things like that. They started to get back up again in 2021, and 2022 has probably been our best year ever, and I think we've got some really big stuff lined up for 2023 as well. So the actual stats for 2022 are on screen right now. So the Corporation, obviously, is the biggest contributor. They pay a lot of full-time developers to work on it. We have a smaller number of full-time developers in the Foundation, and some people who work part time on the code and things like that as well. So even I contribute a little bit, but nowhere near as much as everyone else, you know, at most one day a week, so it's not a huge amount. And then other contributors outside of the MariaDB circle are pretty much on par with what the Foundation contributes, so pretty good. So we use Git DM, which is called Git Data Miner, to actually process the Git commit stream to generate this.
And I've actually open sourced the tooling that does all this, and it has all the metadata in there to generate this. So you can actually break it down by user, by the entity they work for, et cetera. And if you find that I've made a mistake in identifying someone, you can actually open a pull request on that and change the data accordingly. So it's kind of open in that respect as well, if you see what I mean. Git Data Miner was created for the Linux kernel. We tweaked it a little bit so we can count hackers and things like that, but yeah, it's essentially the same tool. We have a script to generate pull request metrics. I know this chart is going to be difficult to see on the screen, but the trend is the important part. So this scrapes GitHub for weekly pull request metrics. The X axis here is week numbers, and the Y axis is the number of open pull requests. So the bottom is 80, the top is 120. Part of my job is to help bring this down. I have been failing. I will be working on that quite a bit in 2023. So I'd run. You should also add how many are actually closed to that one, not just how many are open. I do have that, but showing that on this chart was getting very messy. So it's hard enough just showing this. We do close a hell of a lot of pull requests as well, and we don't just go in and say, no, that's rubbish, close. We tend to talk to people through the pull request, and that's why some of them stay open quite a long time. So in the metrics future, I want to break down the commit contributions by module, engine, et cetera. So we know how many contributions are coming to InnoDB, how many to the CONNECT engine, how many to RocksDB, et cetera, so that we can track that kind of usage. I want to track the average time to merge pull requests, median and mean, I guess, probably median, because we've got some that have been open a couple of years and some that only stay open a week or two, for example. But we'll track that.
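To make the median-versus-mean point concrete, here is a minimal sketch in Python. The dates are invented for illustration and are not real MariaDB pull request data; `days_to_merge` is a hypothetical helper, not part of the Foundation's metrics tooling. It shows how a single pull request that stayed open for two years drags the mean far away from what a typical contributor experiences, while the median barely moves.

```python
from datetime import datetime
from statistics import mean, median


def days_to_merge(opened: str, merged: str) -> int:
    """Return the number of whole days between opening and merging a PR."""
    fmt = "%Y-%m-%d"
    return (datetime.strptime(merged, fmt) - datetime.strptime(opened, fmt)).days


# Hypothetical (opened, merged) dates: three quick merges and one
# pull request that stayed open for two years.
pull_requests = [
    ("2022-01-03", "2022-01-10"),
    ("2022-02-01", "2022-02-15"),
    ("2022-03-07", "2022-03-17"),
    ("2020-06-01", "2022-06-01"),
]

durations = [days_to_merge(opened, merged) for opened, merged in pull_requests]
mean_days = mean(durations)
median_days = median(durations)

print(f"durations in days: {durations}")
print(f"mean:   {mean_days} days")
print(f"median: {median_days} days")
```

With this sample the mean lands at roughly half a year even though three of the four pull requests merged within two weeks, which is the reason for preferring the median when a backlog contains a few very old entries.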
We'll bring it down. Buildbot contribution metrics. So we use Buildbot for continuous integration. We do get pull requests through that, contributions through that. So we'd love to track that kind of stuff. More community-wide metrics. So we're talking Jira. We're talking Stack Overflow, Reddit metrics, et cetera, capturing those kinds of things and publishing them along with the quarterly stats that I already publish on mariadb.org. And if there are any other metrics you want to see, let us know. Contact us, because we are happy to generate them. So we'll talk about how to contribute code to MariaDB. I wrote a blog post about this on mariadb.org, but there are some basic steps you can follow. And it helps reduce the round trip time during review. And also, I don't want you to spend hours or days working on something and opening a pull request, only for us to say, sorry, this doesn't really fit with what we're doing at all, or someone else has already done this. I don't want to have to say no, because I don't want to crush people's hopes or anything like that. So if you follow these steps, it will help reduce that quite a bit. So the first step is communication, talking to us. We can guide you through every step of the way. The MariaDB team are quite approachable, preferably via Jira and Zulip, but there are other ways to talk to us as well. In particular, Vicențiu, Daniel and me at the Foundation, and there is a list of people at the Corporation I'm sure you can talk to as well. Tell us what you want to work on. And if you don't know what you want to work on, there is a beginner-friendly tag on Jira where we've tagged tickets that should be relatively easy to pick up and work on. And we can talk you through these. If there's no Jira ticket for what you want to work on yet, open one and, again, talk to us and we can figure out the best solution for it. Next step is hacking. Write some code.
If you are making a bug fix, it needs to be against the oldest affected version of MariaDB, so if it affects 10.5 upwards, then against 10.5. Is that the oldest active release? Yes, active release. Yes, this is a good point. Always check the end of life for the releases as well when you do this, because we're in this weird phase right now: we changed release cycles a couple of years ago, so some releases are on the old release cycle and some are on the new release cycle, so some in the middle are end of life but some earlier ones are not. So it's a bit funny right now, but again, you can talk to us about this and we can help point you in the right direction. New features always go into the latest development version, which currently is 11.0 for the next couple of weeks. When that hits GA, there'll be another release you can bolt things onto. Please stick to the coding standards of the surrounding code. You'll find that different engines have different coding standards because they've come from different places. The CONNECT engine was originally a MySQL contribution that came through to MariaDB, and that's got a different coding standard to, say, InnoDB and the core server code. I've put together a coding standards document which should be merged shortly. That's just for the core server, and at the moment it's descriptive rather than prescriptive, but we're going to improve on that over time. Add some test cases: we don't want you to write something, us merge it, and then us break it later. So if you have some test cases in there, A, it proves exactly what you're doing, and B, it means that it will stay like that in the future. Run the MTR test suite locally, because otherwise you might get Buildbot errors that you don't expect, and it just reduces the cycle a little bit there. If it's a new feature, help us write some documentation, or at least describe what it does in the Jira ticket so that we can put that into the knowledge base at a later date. Next up, pull requests.
When you open a pull request, a form will pop up, and filling this in will help us triage the pull request, essentially. So it asks a lot of questions about whether this is a bug or a feature, have you added a test, does this break things, stuff like that. If it's your first time doing a pull request, something called the CLA assistant will pop up. It's not 100% intuitive right now, it's something we need to improve on, but right now it will pop up and ask you to sign the CLA. You can click through that and either sign the CLA or tick to say, I want to contribute under the three-clause BSD license, or you can just literally put a comment in and say, I'm contributing this under the three-clause BSD license, and then we can take it from there. What will run on the pull request automatically is lots and lots of different builders. The most important ones will report back to GitHub and show you if anything has failed during compiling or testing on the many different platforms we support. When we actually go to review it, we'll look at the full build list, where there might be some obscure platforms that have broken in weird ways, but at least it gives you some idea of what's gone wrong, and you can click through and look at the cause. Again, if you don't understand the error that popped up, we can look at it for you and point you in the right direction. Code review process: the MariaDB engineers, both at the Foundation and the Corporation, will review and give feedback and advice. If we think the code is ready, we'll approve it and merge it. Community members are also welcome to come and look at the code that people have contributed, review it, comment on it, and it's another way you can contribute. If we are taking time to get to your pull request and we're dropping the ball or something like that, or you need advice, you can tag me, @LinuxJedi, on GitHub and I will take a look at it for you.
I'm lagging a bit behind on that because I'm at FOSDEM right now, but I will try and keep up with that. We have a large backlog right now, so it is very easy for us to miss things. That is all I have. Any questions from anyone? Yes. Do the Foundation and the Corporation have different release cycles, or is that? So the question is whether the Foundation and the Corporation have different release cycles. So no. At the moment, the engineers at the Corporation are generating the releases, if you see what I mean. The releases you get are generated by the Corporation. Built by the Foundation, yes. There are a lot of synergies between the two, which is a good thing. We want to be working closely with them, if you see what I mean. But if anything, God forbid, happened to the Corporation, the Foundation existing means that MariaDB Server will still exist, will still be developed, et cetera. All right. Thank you very much. Thank you. |
Deploying Galera Cluster in the real world |
So, I'm going to talk to you a little bit about deploying your Galera clusters in the real world. I'm actually very pleasantly pleased to note that a lot of you are using Galera Cluster already, so that's great. I work at Codership, and before this I was also with MariaDB, MySQL, Fedora, etc. Codership makes Galera Cluster, so obviously there's support, training, consulting, and so forth. The company's been around since 2007, three founders, all engineers, a fully services-based business model, and it's used in lots of places, from your gaming, aka gambling, also gaming, telecoms, banking, insurance, SaaS, PaaS, IaaS, it's literally everywhere. Since many of you already use Galera Cluster, I maybe don't actually have to introduce it to all of you, but it is highly scalable. You can add nodes on the fly, you can delete nodes on the fly. This is as simple as pointing your new node at the IP address of an existing node, and it should be able to sync via something known as SST, state snapshot transfer. And from an application point of view, it still looks like one very large database. Naturally, we'd like you to have a minimum of three nodes, even though some have two, and some have three separated across two data centers, even though they'll claim they have a very, very fast network. So if you have three databases, as we suggest, you'll have three copies of the same data as well, and so it looks like one very big database from the application point of view. You can have any number of parallel applier threads as well. So we have actually renamed things: we don't call it wsrep_slave_threads any longer, we call it applier threads. So we are following exactly what MySQL and MariaDB are doing in terms of being politically correct, I guess, though we are in Europe, so maybe this is less of a concern.
So this is a point to note, because if you are using older configuration names in your config files, when you're moving to something new you should rename them, otherwise you're going to find that Galera will fail to start with a very poor error message in your error log. And maybe that's the other pro tip: never disable the error log, ever, because Galera writes all messages to the error log. In fact, if you're using MySQL 5.7 or 8.0, which I presume most of you should be using in terms of a newer version of MySQL, the temporary password is inside your error log. So if you disabled it, how do you plan to start? So the error log is actually very crucial. With MariaDB, on the other hand, you can start with passwordless logins by default, but MySQL requires the password. So I suggest, if you're using Galera, never disable the error log, like, ever. We also, of course, have parallel replication, so it makes state transfers quick for new nodes. So you can actually consider just increasing the applier threads. Don't use a value of wsrep_applier_threads that is higher than the average of what is known as the cert deps distance, the certification dependency distance. You can actually see this in the status variables. We probably, as an improvement, should spend more time going through all the wsrep stuff that you should be paying attention to. But to some extent, we've also handed that off to GUI tools to manage. Of course, we have an optimized network protocol, so packets are only exchanged over the network at transaction commit time. We have topology-aware replication, so you can actually segment things. So each transaction is sent to a separate data center only once. We also detect and automatically evict unreliable nodes. This has improved tremendously in Galera 4, actually. So if you have a flappy network or a node failure, you don't actually need any manual intervention. We will eject.
In the very unlikely event that you've configured Galera to ignore split brains, we aim to actually recover from these problems. Again, we provide these options for you, but we don't recommend that you use them. Why would you want to cause yourself grief? MySQL and MariaDB don't allow you to sync data. So if you have a split-brain situation with two different sets of data, how do you plan to sync this data later? Well, you can, of course, turn on the binary logs, which we actually do recommend, at least for a short period of time, and then maybe do some replaying. If you have MariaDB, you can use the flashback capability. If you have MySQL, you can actually just replay the binlogs. So, if you happen to actually lose data. And, of course, traffic encryption is something we also support. It's not turned on by default in any distribution except Percona XtraDB Cluster 8. If you get Percona XtraDB Cluster 8, traffic encryption is enabled by default. And if you're using the cloud, this is kind of key, right? Because it turns out that replication happens in plain text, basically. This is true even for asynchronous replication or even semi-synchronous replication. So did I just promise you a panacea for the whole world? Of course not. It does come with trade-offs, right? You are not going to be as fast as asynchronous, because you're not just committing to the primary, you're committing to the primary and to other nodes. It's obviously not going to be as fast as semi-synchronous, because it's not committing to just one primary and one secondary. So we can't really beat the laws of physics, even when you're in the same data center, which is, again, why it's such a bad idea, and we see this a lot in production, where people say, we have a three-node Galera cluster in two data centers. If you want high availability, you really should invest the money that running high availability costs.
So we've now found out that basically none of you use Codership's Galera Cluster. And as Monty says, a lot of the features are actually pretty much inside of MariaDB Galera Cluster, but there is this one interesting feature that has not made it to any other distribution, and that's clone SST: SSTs that happen via the clone plugin as opposed to using the likes of Mariabackup or XtraBackup. And unfortunately, the clone plugin doesn't run on MariaDB. So on MariaDB, you use Mariabackup, which is also fast. On Percona XtraDB Cluster, you use XtraBackup, which is also fast, but we found that clone is actually also pretty amazing. So we do open up a new port for it. And if you use Galera Manager, by default, if you deploy MySQL 8, we actually use clone SST to provision your new nodes. So you should be able to see some good speedup there. Codership works extremely closely with MariaDB plc, I have to say not MariaDB Corporation any longer because the company has a new name, to make MariaDB Galera Cluster. In fact, it's been around inside the development tree since 10.1, which is around October 2014. So then it got released as GA in October 2015 in 10.1.8. And you get all the features of MariaDB that work with Galera, basically. This is your Oracle support. I know there's no generic MariaDB feature talk in this entire FOSDEM, but go check out the knowledge base. I have another talk on the ecosystem later, which will give you a quick overview. But you also get things like system-versioned tables, sequences, all those optimizer features that Monty talked about that are coming in MariaDB 11; you just get the benefit of them as well. And it was actually the first to include Galera 4, in 10.4. So if you're using 10.4, 10.5, 10.6, you're already getting Galera 4. And then, of course, there's Percona XtraDB Cluster, which another half of you actually use. Its base is Percona Server. It comes with ProxySQL, including an administrative tool.
It has a strict mode, which disallows MyISAM tables from being replicated via Galera. Now this is an interesting thing, because while Percona XtraDB Cluster disallows it, later versions of MariaDB actually allow this. Since 10.6, you can have Aria and MyISAM replication inside of Galera Cluster. We consider this kind of experimental, but it works. These are feature modes that are available. Of course, it disallows tables without primary keys. Again, with something like MariaDB, you can also force primary key creation. And Galera also has an option where you can sort of insert one. It ensures you're using the ROW binlog format, logging to a file, obviously not tables. So there are some interesting things: setting innodb_autoinc_lock_mode to 2, and of course, out-of-the-box encryption. So there are some feature highlights, like intelligent donor selection. So now with Galera 4, and with 3.x where we have introduced new features, you can actually prefer a donor that can do an IST. When it comes to cluster crash recovery, we've actually by default turned on pc.recovery, so all nodes will maintain the cluster information persistently, not necessarily requiring a bootstrap. We have full GTID compatibility in Galera 4 with either MariaDB or MySQL, so actually using the native GTIDs as opposed to the Galera GTIDs. Previously this was a bit of a point of contention, so this is actually a good thing. Foreign key support. When I say improved foreign key support: if you looked in your error logs and used foreign keys, you may have found that there were lots and lots of errors that were not actually errors. So cleaning up the error reporting was actually how we improved foreign key support, because lots of people were complaining like, hey, we're seeing this in the error logs, but it was running just fine. So it was just maybe a bit too verbose. And then we added a couple of new tables inside of the mysql database as well.
These are wsrep_cluster, wsrep_cluster_members, and wsrep_streaming_log. It's very important to remember that streaming replication, which I'll talk about probably in the next slide, actually does make use of chunking your data and putting it inside the mysql database, so this actually improves your ability to use long-running transactions or large transactions, so to speak. And that's why it's in the mysql database, because of permissions, right, otherwise other people are going to see what's being streamed. Next slide. So previously, you'd have to play with wsrep_max_ws_rows and wsrep_max_ws_size, limiting the transaction size to between 128 kilobytes and a gigabyte, with a maximum of two gigabytes, but now you can actually cut those large transactions up by setting fragment sizes. This can be done literally at runtime, so you can set them to around 10,000 to 20,000 rows, and your application can also obviously turn streaming replication on and off on an as-needed basis. Again, this doesn't come for free. There is naturally replication overhead, which is being improved on in every release that comes out. So if you're always looking for the latest, greatest streaming replication, naturally you'd want to take a look at what's inside of MariaDB. And then, of course, there is better error handling for poor networks, so cluster error voting is a feature that has a protocol for nodes to decide how the cluster will react to problems inside of replication. So when one or several nodes have an issue applying an incoming transaction, like a suspected inconsistency, this feature basically helps. So in a five-node cluster, if two nodes fail to apply a transaction, they get removed, and of course, now your DBA could go in to see what went wrong, fix it, and then rejoin them to the cluster. So I know we're at FOSDEM, so this is an apology slide because, unfortunately, there are enterprise features, but I will not talk about them. This is for you to go check them out yourself.
Both MariaDB and Codership have these options. The biggest hurdle to upgrades that I hear about from consulting is, we don't want to migrate to MySQL 8. I don't know why people say this a lot, but if you are using MySQL, last I checked, 5.7 is going EOL fairly soon, so it's time to upgrade. You've got eight more months to get working. And if you want Galera 4, remember, MariaDB's had it since 10.4, but again, a lot of you are not even on the 10.4 or 10.5 train yet. So upgrading from 10.2 or 10.3 is probably something that is ideal if you can find the time to do it. Okay. Common setups. Three Galera Cluster nodes in one data center. This is the highly recommended common setup. Nine Galera Cluster nodes in three data centers. Also, of course, another recommended setup. And if you are doing this, make sure your database operations are kept local by setting gmcast.segment for each data center to 0, 1, 2. And of course, the flow control is fully configurable. And we want to have a minimal latency penalty. Remember that the latency penalty only occurs at commit time, so there's actually no communication between these remote nodes outside of that, and Galera doesn't use distributed locking, so each row-level lock does not have to be communicated across data centers. So in very, very high latency situations where complete avoidance of secondary lag is required, we can also support asynchronous replication between two otherwise independent Galera clusters, each running in its own data center. Now, I put an asterisk there next to recommended, because I know Marco Tusa is not here today, but he was on Friday, and he has written a lot about why you should never run your Galera clusters over a wide area network. Or, I guess, even your Group Replication in InnoDB Cluster over a wide area network, to some extent. And he's written quite a lot of blog posts about it, including a video.
So basically, whenever you need a solution based on a tightly coupled database cluster, you obviously can't locate your nodes at a distance where the round-trip time is longer than the shortest desired commit period. Wow, five more minutes. We're going to go fast now. You should always remember that we like a minimum of three nodes, basically in terms of a quorum, because a quorum is greater than 50 percent, so if one node goes away you still have two thirds, 66.7 percent. And you always want to ensure the primary component is there, because otherwise, if it splits due to network failure, you have split brain. So this is very bad. You can fine-tune this with evs.suspect_timeout as a parameter. Very realistic common setups that we end up seeing. Two-node Galera cluster, really not recommended. Even though we documented it, and we tell you how to shoot yourself in the foot, it doesn't mean you should. Three-node Galera cluster across two data centers, also common. Three nodes across three data centers, also common. Five nodes, seven nodes. So you always have to remember the trade-offs of scalability, reliability, resilience and performance. This is a sample my.cnf that one would want to maybe pay a bit of attention to, where we actually include wsrep_provider_options, because by default we don't put in a segment, for example, but you also want more than just a segment. If you're doing wide area network stuff, you want to consider increasing the replication windows. You want to increase the timeouts above the max round-trip time. Look at the flow control, which you can actually monitor. And then pay attention to the FC limit, master-slave set to yes, the causal read timeout, and the evs settings, where you can actually set the send window to 512 and the user send window to 512. You can look at the keepalive periods. We have all this in blogs and documentation to some extent. So I'd highly recommend you pay attention to galeracluster.com slash blog. Always set your gcache size.
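As a rough illustration of the my.cnf advice above, here is a minimal Python sketch that assembles a `wsrep_provider_options` line for one data center. The option names (`gmcast.segment`, `evs.send_window`, `evs.user_send_window`, `evs.suspect_timeout`, `gcache.size`) are Galera provider options; the 512 window sizes come from the talk, while the segment number, the `PT30S` timeout and the `1G` cache size are assumptions chosen purely for illustration. The helper `build_provider_options` is hypothetical and not part of any Galera tooling.

```python
def build_provider_options(options: dict) -> str:
    """Join provider options into the semicolon-separated key=value form."""
    return ";".join(f"{key}={value}" for key, value in options.items())


# Illustrative settings for the nodes of a second data center in a WAN setup.
site_b_options = {
    "gmcast.segment": 1,             # one segment number per data center
    "evs.send_window": 512,          # value mentioned in the talk
    "evs.user_send_window": 512,     # value mentioned in the talk
    "evs.suspect_timeout": "PT30S",  # assumed; should exceed the max round-trip time
    "gcache.size": "1G",             # assumed; size it for your own write volume
}

provider_options = build_provider_options(site_b_options)
config_line = f'wsrep_provider_options="{provider_options}"'
print(config_line)
```

Generating the line from a dictionary keeps the per-data-center difference, the segment number, in one obvious place; the right timeout and cache values depend on the measured round-trip time and workload, so treat the numbers here as placeholders.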
Potentially also set your retry autocommit to something like five. You may want to certify non-primary-key stuff, but really you should use primary keys. You tell the developers to use primary keys. In fact, if you're using MariaDB, you can say innodb_force_primary_key equals one. Make it such that they can't create tables without one any longer. Make the developers suffer a little bit. Replicate MyISAM, replicate Aria, these are all things that are very MariaDB specific. I'd like to actually talk about the ring buffer file as well as the on-demand page store, which is gcache.page_size. But I don't think we have so much time, so consider this a blog post that we will write maybe next week. Another one that I should probably mention really quickly is the arbitrator. So the arbitrator is a member of your cluster that can participate in voting, but not in the actual replication. So if you want to save money and you want to have a two data center setup, three nodes each, you can actually just set up the arbitrator daemon in DigitalOcean or Linode. It doesn't have to be a powerful machine. It can just basically read traffic and act as an arbitrator. Don't use things like ignore split brain and so forth. So garbd can act as the odd node, and it can also of course help you do backups. Plenty of proxies available. There's Galera Load Balancer. There's HAProxy. We have documentation for that. There's ProxySQL, with a talk coming up later today. And of course MariaDB MaxScale. To provision new nodes, 8.0 gives you clone SST. MariaDB gives you Mariabackup. Percona XtraBackup is still the choice inside of Percona XtraDB Cluster. Very common setup and runtime issues: SELinux, firewall. You can't get an IST. Well, it turns out you've probably got a port closed. Make sure port 4444 TCP is open. If you want to avoid long-running queries, MySQL and MariaDB have, you know, max execution time, and MariaDB has enhanced KILL.
You've got DNS giving you problems, switch to IPs. A couple of functions for developers that may be useful. Again, you can tell the developers to check this out. It's well documented. It's widely adopted. So lots of people are going, you know, you could use it in Xflower, PowerDNS, lots of Kubernetes operators. All right, so the most exciting thing for me in MariaDB 11 from a Galera standpoint is that wsrep_provider_options is now a plugin. So you can actually use this in a more automated way. I presume, from a MariaDB standpoint, this helps automatically reconfigure SkySQL. So this is better for you, because otherwise you have to put it in my.cnf, some of it's dynamic, not all of it's dynamic, restarting a server, horrible. It's got more granular, you know, a few things to improve on, and it makes schema changes and upgrades easier. Lots of further reading. I know we've literally run out of time, 20 minutes is not a lot of time. So you can tweet me, you can send me an email, we're hiring, and yeah, we have lots of services. Thank you for listening. Thank you. |
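The quorum rule from the talk (a quorum is strictly more than 50 percent of the votes) and the role of the arbitrator can be checked with simple arithmetic. The following Python sketch is a simplified model under stated assumptions: every member, including the arbitrator, has exactly one vote, and a partition keeps running only if it holds a strict majority. Real Galera quorum calculation has more to it (weighted votes, nodes leaving gracefully), so this is not the actual algorithm. `has_quorum` is a hypothetical helper.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with strictly more than half of all votes."""
    return reachable_votes * 2 > total_votes


# Three nodes in total, one node lost: 2 of 3 votes remain.
three_node_survives = has_quorum(2, 3)

# Two nodes in total, one node lost: 1 of 2 is exactly half, not a majority.
two_node_survives = has_quorum(1, 2)

# Two data centers with three nodes each and the link between them cut:
# each side sees 3 of 6 votes, so neither side has quorum.
split_without_arbitrator = has_quorum(3, 6)

# Same setup plus one arbitrator in a third location (7 votes in total):
# the side that can still reach the arbitrator sees 4 of 7 and keeps running,
# while the other side sees 3 of 7 and stops accepting writes.
side_with_arbitrator = has_quorum(4, 7)
side_without_arbitrator = has_quorum(3, 7)

print(three_node_survives, two_node_survives)
print(split_without_arbitrator, side_with_arbitrator, side_without_arbitrator)
```

This is why an even split across two data centers needs an odd tie-breaking member somewhere else: without it, a cut link leaves both halves at exactly 50 percent.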
What is new in analytics for MariaDB |
Hi everybody. Let's begin. Thank you all for coming. My name is Roman Nozdrin. I'm from MariaDB Corporation, from the ColumnStore team, and today I will be sitting, just because the manipulator doesn't work. So today I will tell you about new features that are coming in analytics, and some that are maybe not so new but that I find important. For those who don't know what MariaDB ColumnStore is: it's an OLAP storage engine for MariaDB Server. It's column-oriented, it's MPP, and it has a fancy two-tier distributed storage, where the data is distributed across nodes and every node has the possibility to distribute it even more using DBRoots. DBRoots are basically Linux mounts. I will start with what is, in my opinion, the most obscure topic, and that is the MariaDB ColumnStore version numbering. Why is it obscure? Because I still don't understand the reasons the server team has for putting this or that version of ColumnStore into the community server. But anyway, first of all, ColumnStore migrated to yet another version numbering scheme, to make it simpler to understand what the actual ColumnStore version in the community server is. There are three numbers: first goes the year, then the month, and then the patch number. At the bottom of the slide, you can see the actual mapping that shows which ColumnStore versions are published with the community server. The most notable thing here is the last row, which tells when the next stable, the most featureful release, will be published with the community server. Let's first glance at the features available in the current stable that is shipped with community 10.5 and 10.6. Most notable is the filtering vectorization for x86. And guess what? SIMD processing is not the ultimate answer to life, the universe and everything. I mean, the benchmarks, when I first ran them, showed me a 10x speedup compared with the non-vectorized code. 
But when I ran the full-blown query processing pipeline, I got just 30 to 40% of speedup. So this is a great speedup, but not the ultimate answer. Anyway, the next feature is an external GROUP BY. And it is what it says: it allows a GROUP BY operation to crunch data sets that don't fit into memory. As usual, it consists of two phases. The first phase calculates the partial aggregates, and when they don't fit into memory, it offloads them and stores them on disk. The second phase sequentially goes over the stored pre-aggregates and merges them together to produce the resulting data set. As I have seen in other open source OLAP engines, the second phase takes 2x the memory of the first phase in the worst case. This feature is disabled by default, and one needs to explicitly enable it with the settings in Columnstore.xml. These settings are shown at the bottom of the slide. The next feature is LZ4 compression for data files. We had heard a lot that LZ4 beats Snappy in terms of compression speed, so we decided to give it a try. And guess what? Snappy still delivers a better compression ratio. It is about 5% better compared to LZ4-compressed data files. So there is always a trade-off between space and the compression speed, and thus the query processing speed. This is also disabled by default in the current stable. To enable it, you need to set this MariaDB client session variable. So you basically connect to MariaDB and set this variable just before you create a table. Or you can set it in the configuration file of MariaDB Server. And we decided to make this the new default starting with the new stable that is coming this April or May. What's the benefit? It means that the compression and decompression ratio are very fine. Good point. It might be obscure. The compression ratio for Snappy is still better, but the decompression speed is faster for LZ4. Not really, because ingestion speed is not the big question. 
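The two-phase external GROUP BY described above can be sketched in a few lines. This is only a toy illustration of the idea, not ColumnStore code: the function name, the `max_groups_in_memory` limit, the use of `pickle` spill files and the SUM aggregate are all assumptions made for the example. Note that this sketch keeps the final result in memory during the merge, which is where the extra memory of the second phase shows up.

```python
import os
import pickle
import tempfile
from collections import defaultdict


def external_group_by_sum(rows, max_groups_in_memory):
    """Toy external GROUP BY key, SUM(value) over (key, value) rows."""
    partial = defaultdict(int)
    spill_paths = []
    with tempfile.TemporaryDirectory() as tmp:
        # Phase 1: build partial aggregates; spill them to disk when the
        # (hypothetical) in-memory limit is reached.
        for key, value in rows:
            partial[key] += value
            if len(partial) >= max_groups_in_memory:
                path = os.path.join(tmp, f"spill_{len(spill_paths)}.bin")
                with open(path, "wb") as f:
                    pickle.dump(dict(partial), f)
                spill_paths.append(path)
                partial.clear()
        # Phase 2: go sequentially over the stored pre-aggregates and
        # merge them with whatever is still in memory.
        result = defaultdict(int, partial)
        for path in spill_paths:
            with open(path, "rb") as f:
                for key, value in pickle.load(f).items():
                    result[key] += value
    return dict(result)


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(external_group_by_sum(rows, max_groups_in_memory=2))
```

The same key can appear in several spill files, which is why the second phase has to merge pre-aggregates rather than simply concatenate them.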
I mean, everybody wants to ingest their data as fast as possible, but let's be honest, we are interested in the selects, not in inserts, in the first place. Yeah, that is true for OLTP. Okay, and now we come to the new upcoming stable of ColumnStore. There are some features that I will not discuss in detail, but there are some interesting ones that I will look into later. So let's quickly go over these ones. We now officially support the ARM64 platform, with vectorization as well, using NEON. This platform's builds produce a slightly faster runtime compared to our normal nightly CI routines. But I have seen that it consumes 5% more RAM when running a DML-heavy workload like updates or deletes. The next tool is called mcsRebuildEM. To explain why we decided to write this tool, I need to make a small detour. So how does ColumnStore store the data? There are two parts: the data itself and the metadata that describes where the data is. When you have multiple nodes, you need to tell the query coordinator where to go for the data files. So the metadata describes the location of the data, and ColumnStore tries its best to preserve this metadata intact, and it has multiple copies of the meta. So if one meta is corrupted for some reason, because life is not ideal, there are multiple copies that you can use as a backup to restore. However, if one loses the meta, there is no way to access the data itself properly, even if the data files are intact. So this tool allows one to reproduce the metadata from the data files. The caveat is you have to create all the tables starting with ColumnStore older than 6.4.4 to allow this feature to work. The next feature is distributed JSON function support. Technically speaking, there were JSON functions in ColumnStore from the very beginning, but unfortunately, to use them, you needed to fall back to a very slow mode; we call it table mode. 
That is basically MariaDB crunching the data, but it asks for a full table scan from ColumnStore. So it's not very scalable, not fast. So we implemented this as part of last year's GSoC. And now you can use all but two functions, and these are JSON_OBJECTAGG and JSON_TABLE. The last one is very, how to say, MariaDB-specific. It produces a relation. That is also true. But by specific I mean that we cannot produce the MariaDB relation, to be honest. So we decided to postpone it, because nobody is very interested in this one. But we implemented the rest. Good suggestion. Is anybody interested, and does anybody maybe know what JSON_TABLE does? Okay, that's a good statistic. So now, here are the features that I want to discuss in some more detail. First is the auxiliary column. What it does: it basically speeds up deletes from 3x up to 50x, depending on the SQL schema. What does that mean? The more columns you have in the table, the bigger the speedup will be. It is disabled. It has an additional speedup configuration option; I will explain it a bit later. But let's take a look at the bottom half of the slide. This is a very simplified data file layout to explain how the delete operation worked previously, before the patch, on the left, and after the patch, on the right. So let's concentrate on the left part. You can see the violet and blue empties. When a delete comes into the table, it replaces the actual values in the columns, because we are columnar, with special magic values that are called empty values and that are specific to a data type. And it does it in place. Moreover, the delete operation has to store the actual block for versioning before it changes the data in place. So there are two block copies to do, and there are four in-place changes that have to be flushed. So what does the auxiliary column change in this pattern? First, you can see on the right side that there is an additional column. It is basically a flag. 
So the delete operation now goes over this auxiliary column and changes the flag over there. And this auxiliary column is only one byte wide. So you need to store only one block of data, and you don't need to do in-place changes in all the columns. That's where the speedup comes from. As I said, there is an additional opportunity to speed it up even more, and this is to enable fast delete. I will not discuss it in detail, but internally it doesn't update the metadata file, so it makes the delete operation even faster. Basically, you have a new column indicating if it's deleted or not, and you just update it. Yes, that's true. We have plans to use this auxiliary column even more, but I will discuss that later, when we finally implement these features. And to be honest, this fast delete gives an opportunity to implement the append-only update. That should also be a very big boost for the update operation, but stay tuned, not today. The next feature is the extent map scalability improvement. What is the extent map? If you recall, a couple of slides before, I already mentioned the metadata that describes where the data is, and the extent map is the core of this metadata. It's basically an in-memory structure that allows one to map from a globally unique block number (a block is the minimal possible data unit in a cluster) to a tuple of OID, partition and segment, and vice versa, from the tuple to a globally unique block number. These operations are used extensively in the cluster, because you always want to know where the data is and which OID a block belongs to. Originally, the extent map was basically an array with O(n) lookup complexity, and this array was replaced with... oh, I need to mention why it is a problem. Imagine an extent map that is 200 megs. So you are going over 200 megs, and one entry is only about 100 bytes, so imagine how many entries are in these 200 megs. 
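To put a rough number on that question, here is a back-of-the-envelope calculation. It assumes that 200 megs means 200 MiB and that every entry is exactly 100 bytes; both figures are approximations quoted in the talk, so the result is an order of magnitude, not a measurement. It also compares the worst case of an O(n) scan with the number of steps of an O(log n) search over the same number of entries.

```python
import math

EXTENT_MAP_BYTES = 200 * 1024 * 1024  # assumption: "200 megs" read as 200 MiB
ENTRY_BYTES = 100                     # approximate entry size quoted in the talk

entries = EXTENT_MAP_BYTES // ENTRY_BYTES

# Worst case for a linear scan over an array: touch every entry.
linear_worst_case = entries

# Steps of a binary-search-like O(log n) lookup over the same entries.
logarithmic_steps = math.ceil(math.log2(entries))

print(f"entries:            {entries:,}")
print(f"O(n) worst case:    {linear_worst_case:,} comparisons")
print(f"O(log n) lookup:    {logarithmic_steps} comparisons")
```

Under these assumptions the map holds about two million entries, so a single linear lookup may touch millions of entries while a logarithmic one needs a couple of dozen comparisons.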
It takes a lot to look up in such an enormous array. This array was replaced with a red-black tree. This changes the complexity of the block-to-tuple mapping to O(log n). And to facilitate the other conversion, the mapping from tuple to block, we implemented the extent map index. It is basically a burger of a couple of hash maps on top and very tiny arrays at the bottom. So, this gives us a mapping operation complexity of O(1). At the bottom, you can see the results. These are excerpts from the cpimport logs. One demonstrates that the preprocessing step takes roughly 30 seconds, and after the patch, you see that it decreases to four seconds. Thirty seconds is what it was originally. So, if you have a huge extent map, it will give even faster operations. The next feature is the PrimProc and ExeMgr processes merger. If anybody here uses ColumnStore, they know there are a bunch of processes. The central ones are PrimProc and ExeMgr, where ExeMgr is the coordinator for query processing and PrimProc is the actual worker. So, there was a lot of additional headache when the local processes had to communicate with each other within the same node. Same-node communication goes over loopback, and this traffic is not compressed, compared to the communication between different nodes, which is compressed. So, combining these two runtimes, we get a four to seven percent overall speedup. And this gives us other opportunities for optimizations that I will mention later. Union pushdown. Previously, there was no way to directly run queries like the one you can see in the bottom-most line of the slide, where the UNION clause is at the top level. That is because the MariaDB processing paths for unions and for simple selects differ. And ColumnStore works using the notion of a select handler that allows MariaDB Server to push the whole query down to ColumnStore. 
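The two extent map lookups described earlier in this talk can be sketched as follows. This is a toy model under stated assumptions, not the ColumnStore data structure: the class and method names are hypothetical, a sorted list with binary search stands in for the red-black tree (both give O(log n) lookups), and nested dictionaries with small lists at the bottom stand in for the "burger" of hash maps and tiny arrays.

```python
import bisect
from collections import defaultdict


class ExtentMapSketch:
    """Toy two-way mapping between block numbers and (OID, partition, segment)."""

    def __init__(self, extents):
        # Each extent: (first_block, num_blocks, oid, partition, segment).
        self._extents = sorted(extents)
        self._starts = [extent[0] for extent in self._extents]
        # Index for the reverse direction: hash maps on top, tiny lists below.
        self._index = defaultdict(lambda: defaultdict(list))
        for first_block, _num_blocks, oid, partition, segment in self._extents:
            self._index[oid][partition].append((segment, first_block))

    def block_to_tuple(self, block):
        # O(log n): find the last extent that starts at or before the block.
        pos = bisect.bisect_right(self._starts, block) - 1
        if pos < 0:
            raise KeyError(block)
        first_block, num_blocks, oid, partition, segment = self._extents[pos]
        if block >= first_block + num_blocks:
            raise KeyError(block)
        return oid, partition, segment

    def tuple_to_first_blocks(self, oid, partition, segment):
        # Two hash lookups, then a scan of a very small list.
        small_list = self._index.get(oid, {}).get(partition, [])
        return [first for seg, first in small_list if seg == segment]


extent_map = ExtentMapSketch([
    (0, 8, 3001, 0, 1),
    (8, 8, 3001, 0, 2),
    (16, 8, 3002, 1, 1),
])
print(extent_map.block_to_tuple(9))
print(extent_map.tuple_to_first_blocks(3002, 1, 1))
```

The reverse lookup is constant time only as long as the lists at the bottom stay tiny, which is the property the talk attributes to the extent map index.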
So, when there was a union, we could not use this path, and we had to do what we call a subquery wrap. We were wrapping these into a subquery, and this allowed us to use the select handler instead of table mode, which is slower. The next big feature is full TPC-H support. Why is it big? Because it is mostly concerned with the optimizer, which, let's face it, ColumnStore lacked previously. There are two queries, with two features that ColumnStore lacked. The first one is a correlated subquery with aggregates. It's basically a scalar subquery, but with an aggregate. It returns a single row that you can use in a comparison. And guess what? To enable this feature, we just disabled the error message in the code. So, it was a very, very tiny change, and ColumnStore naturally supports such queries. However, the last query type, which is presented in the bottom half, is way more elaborate to handle. That is the common conjunction detection and rewriting. As you can see, there are two join conditions in the bottom half of the slide. They are marked with violet. These conditions are common. However, ColumnStore cannot handle the join conditions if they are combined with a disjunction, with OR basically. So, to handle this query, we need to go over all the conditions that are OR-ed, get the common one, and put it at the top. For the general case, it's very complex to find such patterns and apply them in a way that doesn't change the semantic meaning of the query itself. However, we managed this, and this feature will benefit not only the TPC-H query that didn't work previously, but all others. External DISTINCT. As you might know, ColumnStore was not great at doing DISTINCT. To be honest, it was hash map based. So, it could not do an external operation, and it was tightly coupled with ORDER BY. 
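The common conjunction rewrite mentioned above relies on the identity (A AND B) OR (A AND C) = A AND (B OR C). A minimal sketch of that step is shown here, assuming each OR branch is already flattened into a list of AND-ed condition strings; the function name and the example conditions are hypothetical, and the real optimizer works on an expression tree rather than on strings.

```python
def factor_common_conjuncts(branches):
    """Hoist conditions shared by every OR branch.

    `branches` is a non-empty list of OR branches; each branch is a list of
    AND-ed conditions. Returns (hoisted, remaining_branches) so that the
    original predicate equals: AND(hoisted) AND OR(AND(b) for b in remaining).
    An empty `remaining_branches` means the OR part is trivially true.
    """
    common = set(branches[0]).intersection(*branches[1:])
    # Keep the order of first appearance and drop duplicates.
    hoisted = list(dict.fromkeys(c for c in branches[0] if c in common))
    remaining = [[c for c in branch if c not in common] for branch in branches]
    # If one branch is fully covered by the common part, then
    # A OR (A AND B) = A, so the whole disjunction collapses.
    if any(not branch for branch in remaining):
        remaining = []
    return hoisted, remaining


branches = [
    ["p.partkey = l.partkey", "p.brand = 'X'", "l.quantity < 10"],
    ["p.partkey = l.partkey", "p.brand = 'Y'", "l.quantity < 20"],
]
hoisted, remaining = factor_common_conjuncts(branches)
print(hoisted)
print(remaining)
```

Once the shared join condition is hoisted to the top level, it is no longer under an OR, which is the form the talk says ColumnStore can handle.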
So, to allow DISTINCT to become external, we needed to uncouple it from ORDER BY, and we just applied the existing GROUP BY facility, which already has the external operation support. So, this gives us future optimization opportunities, and it also allows DISTINCT to scale now. So, it will be fast. The most notable change, maybe because I was the author, is the ORDER BY rewrite. The current implementation of the ORDER BY facility is based on a priority queue. That is great for top-K queries like the one mentioned in the second bullet, maybe the third. However, the priority queue timings are just terrifying when you run the query on a huge data set without a limit, or when the limit is relatively big. So, I replaced the priority queue algo with a different approach. It has 2.5 phases. It first calculates the sorted runs, completely in parallel. Then the new middle phase calculates non-overlapping permutation ranges for the second phase. So, the second phase efficiently merges the non-overlapping ranges in parallel. So, you have the first phase and the second phase, and these two phases are completely parallel. The ranges don't overlap each other. So, in the end, ColumnStore now has a choice between the priority queue based sorting and this new algorithm, and it will pick one by looking at the ORDER BY usage pattern of the query. And these are the results. As you can see, the most interesting are the last two bullets. There is a comparison with the previous version of the code, which is based on a priority queue, on data taken from a TPC-DS generator. The first query uses integer sorting key columns, and the second query uses character key columns. So, as you can see, for integers it is like four times faster. But I need to confess this is only scale factor 10, so it's not a big dataset anyway. For a bigger dataset, you can have like 20x or maybe even bigger. 
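The 2.5-phase sort described above can be sketched as below. This is a simplified illustration and not the ColumnStore implementation: the function name, the sampling used to pick the range boundaries and the thread pool are assumptions, and Python threads show the structure of the algorithm rather than a real parallel speedup.

```python
import bisect
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_range_sort(chunks, num_ranges):
    """Sort the union of `chunks` using sorted runs and non-overlapping ranges."""
    with ThreadPoolExecutor() as pool:
        # Phase 1: build one sorted run per chunk, independently of the others.
        runs = list(pool.map(sorted, chunks))

        # Phase 1.5: pick splitter keys from a sample, then cut every run at
        # the same keys so that range i never overlaps range j in key space.
        sample = sorted(
            key for run in runs for key in run[:: max(1, len(run) // num_ranges)]
        )
        step = max(1, len(sample) // num_ranges)
        splitters = sample[step::step][: num_ranges - 1]
        bounds = [
            [0] + [bisect.bisect_left(run, s) for s in splitters] + [len(run)]
            for run in runs
        ]

        # Phase 2: merge each key range independently of the other ranges.
        def merge_range(i):
            slices = (run[b[i]:b[i + 1]] for run, b in zip(runs, bounds))
            return list(heapq.merge(*slices))

        parts = list(pool.map(merge_range, range(len(splitters) + 1)))

    # Ranges are disjoint and ordered, so concatenation is globally sorted.
    return [key for part in parts for key in part]


print(parallel_range_sort([[5, 3, 9], [1, 8, 2], [7, 4, 6, 0]], num_ranges=3))
```

Because every range covers a disjoint slice of the key space, the merges in the second phase need no coordination, which is what makes that phase parallel.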
And to be honest, I don't compare with the other open source engines just yet, because I'm going to come up with a separate story about this sorting. So, these are the links. If you're interested in ColumnStore, the first is the code itself and the second is JIRA. So, if you find bugs, please post them. We will appreciate it. Thank you all for coming again. |
Data-in-use Encryption with MariaDB |
Okay. Yeah, perfect. And then, let's say, the newer generations moved to this VM-based scheme, where you have an entire VM isolated: you cut at the hypervisor and you create a completely isolated VM, and everything that's inside, like the firmware, the BIOS, the kernel and so forth, is part of that confidential context. That means, of course, more code that needs to be trusted, code that's running in the same kind of privilege layer. But you don't have the problem of this restricted system interface. So you can directly do any kind of syscalls and pretty much behave like any other application inside any other operating system. Those are the two worlds, and both could be an option for protecting MariaDB at runtime. As I said, regarding the system interface, in the case of the confidential VMs there's nothing really we have to change, and in the case of Intel SGX, the process-based world, we have this weird thing where we need to forward those syscalls to the host. I guess it's fair to say that the right side is probably a lot faster, but you have this bit more of a stack that you need to trust, that's running inside the same context. Yeah, so, SGX and process-based limitations. Currently, I think it's fair to say, it's an Intel-only solution. 
You're pretty much locked in with Intel at that point. Any kind of context switch is expensive; that means I/O is expensive, and any other interrupt is expensive. And InnoDB: I have to say I'm not a MariaDB expert whatsoever, but for some reason I know that InnoDB is a problem, and people use different kinds of storage backends, like RocksDB for example. So you can't just move the, I guess, off-the-shelf MariaDB into an SGX enclave. There are some things you need to fiddle with, and then you can make it work. The upside is you have a very small trusted computing base, a very small amount of code that you need to trust, that runs inside this context. On the other hand, SEV, the confidential VM: a larger TCB, but that means it's more lift and shift. And yeah, Intel currently doesn't really have a solution out there that you can use, but there's Intel TDX coming. That's more or less the same as AMD SEV. There's other stuff coming from Arm and RISC-V and so forth. Huge attack surface just means, yeah, you have the kernel, you have the firmware, the bootloader, everything inside that context, so if you have any kind of vulnerabilities there, you could potentially be attacked even though you're isolated, right? 
Apart from memory encryption, I think an important aspect of confidential computing is attestation. Remote attestation just means you get a statement from the chip, from the CPU, about what's running, what code was loaded inside your enclave or your confidential VM, and that statement is signed with a key from the CPU, and this key has a certificate chain back to the hardware vendor. So you can send such a statement, for example "this is a MariaDB enclave, a MariaDB container that was loaded inside this enclave", to a remote party and establish a secure channel, for example by exchanging a key or bootstrapping a TLS connection. Then you have a secure channel to the database, and through that you can exchange data, do your SELECT statements and so forth. With SGX, for example, you can do that by having a small step that runs before your actual code, or next to your actual code, that does that interaction with the CPU and provides you with an attestation statement, and then you can use that. Felix gave a talk last year in the MariaDB devroom about EdgelessDB. That essentially builds on MariaDB and tries to bring that confidential computing concept even closer to, or even more into, the use case. Here are slides from Felix; let me go over them quickly. What we've seen so far is essentially just lift and shift of MariaDB on top of SGX or on top of AMD SEV. What EdgelessDB does is, essentially, it uses RocksDB, for the reasons mentioned, and also because it has some neat features around encryption. The way it writes blocks makes it very good for the confidential computing attacker model. You can't switch things around, you can't do any modifications. And the interesting part, the reason why I'm showing this, is that they added a confidential computing front end. That means you not only have attestation that your MariaDB runs inside a confidential computing environment. 
They also give you an attestation about what the database is and who has access to it, essentially. When you set it up, you define a small manifest. It just gives the initial database layout, like the users, the initial tables, who can modify what, who has access to what tables, and they add that to the attestation statement. So then you can give that to a remote party for attestation, and you know this is not only MariaDB running there, but this is MariaDB with those user credentials and those tables. I think that's integrating this concept of confidentiality more with the concept of a confidential database, if you will. So, interesting. EdgelessDB is also open source, so you can check it out if that's interesting for you, but it's SGX-specific. Let's go back to the slides. Yeah, it's his slides, his emojis. But yeah, the problem with SEV and confidential VMs currently is this: you can just lift and shift MariaDB inside, and it will work. The problem with the current way hypervisors and other cloud providers offer confidential VMs is that they don't give you full access to the entire stack that runs inside the confidential VM. That means they have a firmware shim that you can't really verify, which loads your bootloader, your OS and then MariaDB. That breaks the chain of verification from the hardware, essentially; that's what the slide tries to tell you. What we'd like to have is the full chain inside this VM verifiable, from the hardware to the firmware and up to MariaDB itself. So yeah, this is a practical problem right now, but it's hopefully going to be solved soon. Currently, for example on Azure, you can start an AMD SEV machine on Hyper-V. 
They set the firmware. But there's a preview where you can define your own firmware. You can either use direct Linux boot or you define your own UEFI-based firmware. Yeah, and then you boot the image that starts MariaDB, and then you go from there. Yeah, if you want to try that: of course, there's the AMD documentation and stuff, and clouds offer confidential VMs based on SEV, so apparently there's some solution to try it out. And yeah, EdgelessDB is open source; you can also try that. I think that's the last slide. So I'm not sure if I hit those 20 minutes, but yeah. Any questions? Yeah. Mostly people that currently want to process sensitive, regulated data in the cloud, like healthcare, telecommunication, this kind of stuff. They do that on-prem with the database; they want to move that to the cloud, but then they can't, because it's not enough that the data is protected at rest and in transit. They also need to protect data at runtime or, as in the EdgelessDB case, give guarantees on who has access to the data. Yeah. To be honest, I am not the best person to answer the question, but I think it's part of the syscalls that happen when you use InnoDB. I'm not sure which syscalls are the problem, and then you have these context switches. You have a lot more context switches than when you use RocksDB, and that makes it super slow. That's at least one problem. Yeah, I think there was something on Felix's slides along those lines. Yeah, by the way, I think this is from last year. 
I think there's a recording, and Felix speaks about why RocksDB has better performance than InnoDB. Yeah, I don't know; this would be a great question for avid here. Yeah, I'm not sure if avid is in the chat, or if there's a chat, but maybe he can answer that question. Yeah, okay, so this is specific to how AMD implements that. Essentially they add a chip, the secure processor. Yeah, and this basically holds the keys, holds the information, and then you, as the guest, can establish a secure connection to that, go through the hypervisor to the SP and obtain an attestation statement, for example. So this is implicitly, or explicitly, trusted, right? The SP has a firmware; if the firmware has a bug, you could potentially exploit it, and so forth. Okay. So yeah, in the case of a confidential VM, depending on the hardware, you essentially create measurements of the entire boot chain. So it's similar to a TPM case, like a measured boot, where you have a statement of the initial memory layout, the firmware, and then a statement of all the other components in the boot chain. The statement just says: this is an isolated VM, this was the boot chain, and this is signed by AMD. This is what you get. So the VM is okay, but what about process runtime? Yeah, so for process-based, exactly: your untrusted host creates the process, loads the memory pages and then says, "okay, I'm done".
Then the secure processor that's part of the CPU will create a hash over those pages and compare that to the expected measurement that's signed by you as the author of the enclave. So when you build an enclave, you essentially build the expected memory layout and you sign that, and part of the attestation statement is always this measurement of the initial memory layout plus your signature. Yeah, I mean, that's part of why you can say this is a bit more fuzzy in terms of what the attestation statement says, right? Anything that happens after this boot, modifying the memory layout, modifying what's running there, you can only derive from the initial statement. So what people do is use a read-only file system, an immutable image, this kind of stuff, to make it more locked down. For example, if you just want MariaDB, you could bring this down to a microkernel that is just able to run a MariaDB container. Still, there are a lot of things that can happen at runtime, but you are trying to minimize the TCB, the trusted computing base. Yeah, I mean, if you could derive all the states you will end up in from the initial state, you would have perfect verification, but this is not feasible, of course. The main memory; if you're referring to caches, I'm not sure which cache. Yes. So in the confidential VM case, anything of that VM, right from the firmware layer upwards, anything that's above the hypervisor; for process-based, anything that's part of the process.
Yeah, for process-based, with the latest generation, I think it's around 10%, something like that. The bigger problem is the context switches, by far. For the right-hand side, AMD, there are measurements, and I think they are around four to eight percent in the worst case. Well, what kind of instructions do you use to switch the context, to use ring zero? What exactly are the instructions that you actually need? Yeah, so on process-based, what will happen if, let's say, you do a write syscall: the process will trap, you will have an interrupt, and it will automatically save your registers, your state, encrypted, and then clear those registers. There were some problems in the past, but it clears the registers and goes to kernel space. And yeah, for the VM, they both have additional instructions for doing those confidential computing specifics, like getting a remote attestation statement or, for the confidential VM, connecting to the secure processor. There's an instruction set addition. |
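The enclave measurement check described in the answers above, where a hash over the initially loaded memory pages is compared with the measurement the author expects, can be illustrated with a small sketch. Everything here is an assumption made for illustration: SHA-256 stands in for whatever measurement scheme the hardware uses, the function names are hypothetical, and the signature over the expected measurement and the certificate chain to the vendor are left out entirely.

```python
import hashlib
import hmac


def measure(pages):
    """Hash the initial memory pages, in load order, into one measurement."""
    digest = hashlib.sha256()
    for page in pages:
        # Include each page length so that page boundaries affect the result.
        digest.update(len(page).to_bytes(8, "big"))
        digest.update(page)
    return digest.hexdigest()


def matches_expected(pages, expected_measurement):
    """Compare the measured layout with the one the enclave author expects."""
    return hmac.compare_digest(measure(pages), expected_measurement)


# The author computes the expected measurement at build time ...
expected = measure([b"mariadbd code", b"initial data"])

# ... and the loaded pages are checked against it at launch.
print(matches_expected([b"mariadbd code", b"initial data"], expected))
print(matches_expected([b"mariadbd code", b"tampered data"], expected))
```

As the talk notes, such a measurement only covers the initial state; what happens to memory afterwards has to be inferred from what that initial code is able to do.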
InnoDB change buffer: Unsafe at any speed
The tale of some corruption bugs and how they were found |
Okay, hello. Sorry for the technical trouble. My machine didn't work with this screen, so I got help from a colleague. So, I'm Marko Mäkelä. I've been working on the InnoDB code for almost 20 years, and today I will talk about one of my pet hates in InnoDB, the change buffer, which I long suspected of causing bugs, but I couldn't prove all of them. So, I took a car analogy, because some software people like car analogies: this "Unsafe at Any Speed". It's from the 1960s. I'm a bicycle person myself. So, what was the change buffer good for? It was something developed in the times of spinning rust, the hard disks. The idea was that when you are doing sequential I/O, like page reads or page writes sequentially, the read/write head will move less on the hard disk. And if you're doing random seeks, it could take a long time to position the head on the correct track and then wait for the sector to come under the head as the disk is rotating. So, the idea of the change buffer was that instead of reading a page and then applying a change to that page, you would write a buffered change somewhere else. So, if a B-tree secondary index leaf page is not in memory, then instead of reading the page to perform an insert (originally it was only insert buffering), we would write that insert operation into a separate insert buffer tree. And then later, when some other operation needs to read the page, it would merge the changes from that buffer into the page. And these structures are persistent. So, the insert buffering could have happened years ago, and then at some point years later somebody wants to access that page, and then you'll get the trouble. This was extended in MySQL 5.5 to cover delete operations and purge operations. Deleting in InnoDB only marks the record for deletion; same for an update of a key in a secondary index, which will do a delete-marking and an insert. So, the purge is what is actually removing the record. 
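The buffering scheme described above can be modelled with a few lines of code. This is a deliberately tiny toy model and not InnoDB code: the class name, the set-based pages and the two operations are assumptions for illustration, and real concerns such as persistence, page space accounting and delete-marking are left out.

```python
from collections import defaultdict


class ToyChangeBuffer:
    """Buffer changes to pages that are not in memory; merge them on read."""

    def __init__(self, disk_pages):
        self.disk = {no: set(keys) for no, keys in disk_pages.items()}
        self.buffer_pool = {}              # pages currently in memory
        self.buffered = defaultdict(list)  # page number -> pending changes
        self.page_reads = 0                # counts the "expensive" disk reads

    def change(self, page_no, op, key):
        if page_no in self.buffer_pool:
            # Page is in memory: apply the change directly.
            self._apply(self.buffer_pool[page_no], op, key)
        else:
            # Page is not in memory: record the change instead of reading it.
            self.buffered[page_no].append((op, key))

    def read_page(self, page_no):
        if page_no not in self.buffer_pool:
            self.page_reads += 1
            page = set(self.disk[page_no])
            # The merge happens whenever somebody finally reads the page.
            for op, key in self.buffered.pop(page_no, []):
                self._apply(page, op, key)
            self.buffer_pool[page_no] = page
        return self.buffer_pool[page_no]

    @staticmethod
    def _apply(page, op, key):
        if op == "insert":
            page.add(key)
        elif op == "purge":
            page.discard(key)
        else:
            raise ValueError(f"unknown operation: {op}")


cb = ToyChangeBuffer({7: {"a", "b"}})
cb.change(7, "insert", "c")   # buffered, no disk read
cb.change(7, "purge", "a")    # buffered, no disk read
print(cb.page_reads, sorted(cb.read_page(7)), cb.page_reads)
```

Even in this toy, the merge runs at a moment chosen by whoever reads the page next, not by whoever made the change, which is the property that makes the real thing hard to test.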
Delete-marking and purge operations could be buffered, but not the rollback of an insert; that was never buffered. And this leads to lots of problems. For example, the change buffer is located in the InnoDB system tablespace, and up to this time there is no mechanism to shrink the system tablespace. If at some point you use the change buffer a lot, the system tablespace will grow by some hundreds of gigabytes, and there's no way to reclaim that space. Okay, in MariaDB we are working on something to help with that, but it's not done yet. And then there's this obvious write amplification. If you are doing an insert, okay, that's rather fine. Instead of doing just one insert, you are doing two inserts, so you are doubling the writes, plus you have to write some metadata, some index information, so that the contents of the page can be interpreted correctly, because the change buffer doesn't have access to the data dictionary. But for delete or delete-marking, if you applied the change directly to the page, you would write one byte or a couple of bytes; now you have to copy the entire record which you are going to delete or delete-mark to the change buffer, along with the metadata. And then at some point it will be merged. And then there is some overhead even if you are disabling the change buffer: you have to maintain some metadata saying how full your pages are. If somebody is going to enable change buffering or insert buffering later, this data has to be accurate, otherwise you would get a page overflow. The insert buffering must know that the page will not get too full when you are buffering the insert and merging later. And then we got lots of nice corruptions where the secondary index gets out of sync with the primary key index. And these are very hard to reproduce. So why is it hard to reproduce? Well, the first part is the same as on the previous slide. It is exactly this: you cannot easily control when the change buffer merge happens. 
It's like the Spanish Inquisition in the Monty Python sketch: nobody expects a change buffer merge. As a user, you are unlucky if you hit this, and as a tester, you are lucky if you can reproduce it. And you need lots of luck for that. In particular, the purging of history, which is what deletes records from the index, can be blocked by read views. If you have long-running transactions that hold a read view open, that will prevent purge from running. Then at some point that read view is closed and purge can start running. Then there is also the buffer pool. If a page is locked by something, it can't be written out and it can't be evicted from the buffer pool, so the change buffer can't be used. We have a debug setting that forces this: the user is asking for an operation that could use the change buffer, we see that the page is in the buffer pool, so we play evil and evict the page. But we cannot always do that, because somebody could be holding a latch on that page, or the current thread is holding a latch and the page was modified. And we cannot wait for the page write to happen, because this latch is blocking the page write. So this is really difficult to test. There was a recent fix to some hangs that were introduced in MySQL 5.7; we have that fix in the release that is coming out next week. That fix will make this debug option even more tricky. So for tests of this to be effective, they have to do some smart tricks, like abandoning some tables for a while to let them cool down, using some other tables meanwhile, and then coming back. Well, we got some nice magic tools as well. We have the Random Query Generator, which is also used at MySQL, and a grammar simplifier. We could start with a huge grammar covering all the features of SQL and let it run. If the crash was frequent enough, then we could use this grammar simplifier. But in this case, the failure is very hard to reproduce.
We cannot use the simplifier. We cannot get any simpler grammar. We just have to run it all and hope for the best. But then we got this debugger, rr, record and replay. That one is really a huge productivity boost. We started using it maybe two or three years ago. When you are able to reproduce a problem while running under rr, rr record saves a trace, a deterministic trace of an execution, interleaving the processes or threads that are being monitored by it. It saves the system calls and their results and so on. This trace can be debugged as many times as you want. You just need the same binaries, the same libraries, and a compatible processor. Then you can replay it. You can set breakpoints, you can set data watchpoints, and you can execute in the forward and backward direction. You can see what happened before the bad thing was observed. This can also be used for optimized code. You are probably familiar with cases where you are debugging an optimized executable and the debugger complains that some variable has been optimized out. Well, then you can just single-step some instructions and get the value from the registers, because you can go backwards in time. So now I am coming to describe one bug that we found last year. Actually, there was a support customer last week who hit this bug, or a consequence of it. We had a bug where a slow shutdown, which does this change buffer merge, would hang because the change buffer got corrupted. We were testing some fixes in a branch for that, and then we got this assertion failure. This assertion failure essentially means that when it tried to insert a record that was insert-buffered, it ran out of space in the page. And what was the reason? Well, there were some extra records in the page. It turned out that this is partly by design. Heikki Tuuri, the creator of InnoDB, was a friend of lazy deletion, or lazy operations.
So DROP INDEX wouldn't clear anything from the change buffer. It would leave the garbage behind. Later on, if the same page is reallocated for something else, then we would pay the price and free the space from the change buffer, delete the records. In MySQL 5.7, there was a new feature, bulk index creation, building indexes faster, and that code didn't do this adjustment correctly. It only cleared some bit, but it didn't remove the records. There was a mandatory Oracle security training I took several years ago, before switching to MariaDB. It said something like: complexity is the friend of security bugs. I found it somehow fitting here. So the immediate root cause of this failure was that this new code cleared a bit that says that there are buffered changes for this page. When somebody is going to use that page, they will see that, OK, there's nothing to do, I don't have to care about the change buffer. Then later on, something adds records to the page, and then these old records from the change buffer come to the page as part of a merge. How can we prove this using the rr tool? By the way, you can download the slides from the first page, and you can also download an attachment that has a scriptreplay recording of the rr session. I'm only showing a high-level view here, but you can download a debugger session that shows the exact commands and the output, which I'm going to present on the next slides. So the short version of how we can prove these claims in the debugger: we let it continue from the start to the crash, the assertion failure. Then we set a breakpoint on the function that was the last one to access the change buffer bitmap bits. From that function, we get the address of the bitmap bits for this page, and we can set a data watchpoint on that. I find this hardware watchpoint a very powerful tool.
It's really much easier for some things, when you don't have an idea which code is going to modify or read something. Based on this watchpoint, we get some call stacks where these change buffer bits were last changed. Then we set breakpoints on the functions that insert records into the change buffer and delete records from it. And then we observe that, okay, nothing deleted the records for this page, basically proving this claim. We are also printing the index ID and index name to add some more detail to this proof. Oh, sorry. Okay, so we were unable to reproduce this with a small grammar. We just took something, got lucky, got the trace, and debugged it. Possible consequences of this bug are wrong results. That's of course very difficult to prove; there are not many testing tools for that. Or you could get a crash on change buffer merge, like we got here. That change buffer merge could happen at any time. Even if you are running CHECK TABLE to check whether your table is okay, then before MariaDB 10.6 your server would crash because of this change buffer corruption. In our case, it was a page overflow when applying an insert. The change buffer doesn't allow any page splitting; the record must fit in the page. In the support case I mentioned from one week ago, the page split failed. It ran out of space. You are taking one page, you are trying to copy part of the records to a new page, and it ran out of space. How can that be? It turned out that the page contained records for some other index, which apparently had been dropped earlier, and that index apparently had NOT NULL columns. So the length of a variable-length field was stored in the place where the correct index would have the null bitmap. Then we would read the length of the record from the previous byte, and that's how we would get these too-long records being copied. Oh, five minutes left.
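The root cause described above (the "buffered changes exist" bit is cleared when a page is reused, but the buffered records themselves are left behind) can be replayed in a small toy model. Everything here is a hypothetical sketch: `ToyState`, `run_scenario`, the page capacity of 4 and the record names are assumptions made for illustration and do not correspond to InnoDB code.

```python
PAGE_CAPACITY = 4  # hypothetical number of records that fit in one leaf page


class ToyState:
    """Toy model of a change buffer plus its per-page bitmap bit."""

    def __init__(self):
        self.buffered = {}  # page number -> list of (index_id, record)
        self.bitmap = {}    # page number -> True if buffered changes exist
        self.pages = {}     # page number -> (index_id, list of records)

    def buffer_insert(self, page_no, index_id, record):
        self.buffered.setdefault(page_no, []).append((index_id, record))
        self.bitmap[page_no] = True

    def reuse_page(self, page_no, index_id, records, remove_buffered):
        """Reallocate a page for another index, as after DROP INDEX + ADD INDEX."""
        self.bitmap[page_no] = False             # the bit is always cleared
        if remove_buffered:                      # the step the buggy path skipped
            self.buffered.pop(page_no, None)
        self.pages[page_no] = (index_id, list(records))

    def merge(self, page_no):
        """Apply every buffered record; page splits are not allowed here."""
        _index_id, records = self.pages[page_no]
        for _buffered_index_id, record in self.buffered.pop(page_no, []):
            if len(records) >= PAGE_CAPACITY:
                raise OverflowError("buffered insert does not fit in the page")
            records.append(record)
        self.bitmap[page_no] = False


def run_scenario(remove_buffered):
    state = ToyState()
    state.buffer_insert(9, index_id=1, record="old-1")  # for the old index
    state.buffer_insert(9, index_id=1, record="old-2")
    # The old index is dropped lazily; page 9 is then reused for index 2.
    state.reuse_page(9, index_id=2, records=["x", "y", "z"],
                     remove_buffered=remove_buffered)
    state.buffer_insert(9, index_id=2, record="new-1")  # legitimate insert
    try:
        state.merge(9)
    except OverflowError:
        return "page overflow"
    return state.pages[9][1]


correct = run_scenario(remove_buffered=True)
buggy = run_scenario(remove_buffered=False)
```

With the stale records removed, the one legitimate buffered insert fits. With them left behind, the merge also applies the two records of the dropped index and runs out of space, which is the shape of the assertion failure from the talk.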
I have to hurry up. So this debugging, how it goes in detail: we continue from the start to the end of the execution. Then we reverse-continue to go back from the abort signal. Then we set a temporary breakpoint on this bitmap page access. When we get to that breakpoint, we happen to have in a register the address of the bitmap byte that we are interested in. Then we reverse-continue to the changes of that byte. The last occurrence was for an insert that was buffered after the ADD INDEX. Continuing backwards from that, we see that the previous occurrence is the ADD INDEX clearing the flag. From that we can get the page number that is affected, and we can get the index ID, the index name, and the SQL statement, which is ALTER TABLE. Then we set some more breakpoints on the insert buffer delete and insert operations, with the condition that we want them only for this page, and then we reverse-continue and get the insert that was buffered before this ADD INDEX. Apparently there was a DROP INDEX in between, but I didn't add statements to get a breakpoint there this time. The index name is different from the one after the ADD INDEX, and the index ID is different. And there was no call to the change buffer deletion. When we continue from this point to the end of the execution, we just reach the assertion failure again, without any change buffer record deletion in between. So I'm quoting this Finnish ski jumper who apparently was confusing two French phrases. He was wishing a good trip when he was starting the ski jump. And I'm wishing a good trip to anybody who is using the change buffer. And the déjà vu: yes, we have actually seen this shutdown hang earlier. There was a MariaDB 10.1 support customer case. They got this hang, and we had to do something to fix that.
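The reverse-debugging steps above boil down to searching a recorded execution backwards for the last event of interest. The following sketch imitates that on a hand-written event list. It is not rr or gdb: the `Event` type, the `reverse_continue` helper, the event names and the trace contents are all hypothetical, chosen only to mirror the sequence described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    step: int
    kind: str    # "write" (watched memory changed) or "call" (function entered)
    target: str  # watched location or function name (hypothetical names)
    detail: str


def reverse_continue(trace, from_step, predicate):
    """Walk backwards from from_step; return the first matching event or None."""
    for event in reversed(trace):
        if event.step < from_step and predicate(event):
            return event
    return None


# Hypothetical recording, ordered by execution step.
TRACE = [
    Event(1, "call", "buffer_insert", "index A, page 9"),
    Event(2, "write", "bitmap[9]", "set by buffered insert"),
    Event(3, "write", "bitmap[9]", "cleared by ADD INDEX"),
    Event(4, "call", "buffer_insert", "index B, page 9"),
    Event(5, "write", "bitmap[9]", "set by buffered insert"),
    Event(6, "call", "assertion_failure", "page overflow during merge"),
]


def is_bitmap_write(event):
    return event.kind == "write" and event.target == "bitmap[9]"


def is_buffer_delete(event):
    return event.kind == "call" and event.target == "buffer_delete"


failure_step = TRACE[-1].step
last_write = reverse_continue(TRACE, failure_step, is_bitmap_write)
previous_write = reverse_continue(TRACE, last_write.step, is_bitmap_write)
any_delete = reverse_continue(TRACE, failure_step, is_buffer_delete)
```

In this toy trace the last write to the watched byte is the buffered insert after the ADD INDEX, the one before it is the ADD INDEX clearing the flag, and no deletion from the buffer is ever found, which is the same chain of evidence as in the recorded session.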
But in MariaDB 10.5 we made another change, hopefully to reduce the chances of getting a random change buffer merge, because after this change a change buffer merge only happens when an SQL statement needs that merge to happen. No background operation. So we had to adjust that previous fix for the 10.5 code base, but it was not adjusted correctly. And we were not able to reproduce this corruption or this hang earlier. Only quite lately were we able to reproduce it, and then we were able to debug it properly. There are some other corruptions caused by the change buffer. One thing I want to note is that in MariaDB 10.6 there was recently a fix so that we should not crash on any page corruption. If there are any cases where we are still crashing, I would be interested in the details. This includes CHECK TABLE: when there is a failure during change buffer merge, it will not cause a crash. There was a mystery bug filed like 12 years ago, a MySQL bug. The customer got a crash during change buffer merge because the page became empty as a result of applying a purge operation. I can think of several bugs that have been fixed in MariaDB that could be the explanation. The last one, the one from the previous example, cannot apply, because it was only introduced in MySQL 5.7, which they didn't use. And of this list, the 30422 is a clone of a MySQL fix, which is applicable to MariaDB 10.3 and 10.4. The others I don't think have been fixed in MySQL yet. So this teaches us that it really pays off to analyze any obscure failure you get from running with rr, because there are gems in there. And I think that assertions are like lottery tickets. If you don't write assertions in your code, then you can't win these kinds of bug findings. And sometimes you can lose: you write a bogus assertion, okay, you made a mistake, you correct it and improve the assertion, and then hopefully you will get something better later. We have some mitigation for this in MariaDB.
We don't access the data file when executing DROP TABLE, so in that case we are maybe avoiding more crashes on DROP TABLE. There was a bug in InnoDB slow shutdown: the change buffer merge wouldn't check for log file overflow. So if the user got impatient and killed the server because it was taking too long, they could end up with an unrecoverable database. Some more mitigation: we disabled the change buffer by default, we deprecated the parameter, and we removed the change buffer in the 11.0 release. The upgrade code for handling it was tested, and I hope that if some corruption is noticed during the upgrade, it should still be possible to go back to the earlier version and then do something to correct it. Yes, if there are any records in the change buffer, we don't trust these bitmap bits. We go through the change buffer records, and if there are any, we apply them. Yeah, so that was basically what I wanted to say, and maybe this last slide: it's a good thing to have a nice layered design. If you optimize things by breaking the layer boundaries, then you are asking for trouble. That's basically what we can learn from this. Question: you said that you fixed a few things in MariaDB that are not yet fixed upstream. Are there bugs open upstream for fixing these things? I haven't filed any; I lost my MySQL account when I resigned from Oracle. Okay, well, I'll look into it so we can get this fixed upstream. Yeah, you are welcome to file bugs, and maybe they will be fixed. Well, especially if it's a crashing bug. Yeah, but you can't repeat them, so you have a hard time proving that. Well, there are tools now, obviously, that weren't available before. I'm very persistent. Once we know it exists in MariaDB and what it takes, somebody who is persistent enough should be able to do that. Yes. It is very bad. I mean, if you have multiple threads or processes, it's running them on a single CPU core at a time.
So that's why we are running hundreds of servers in parallel on a single machine for several hours to get these traces. Actually, also for normal debugging, there have been cases where you have lots of conditional branches in your code, like the debug library or the performance schema. Those branches are never taken, but rr is interested in conditional branches. I have seen a case where, if I compile without these things, I get a crash or problem in, let's say, three seconds. Then I was curious how long it takes if I use these stupid compilation options with these extra conditions. For that particular thing, I interrupted it after two hours. So it was 7,000 seconds versus three seconds. So don't use conditional branches or unnecessary debugging. Turn off all the code that you don't need. Conditional branches are evil for rr. Well, maybe you are not using it multi-threaded, with context switches and so on. For single-threaded, there is basically no overhead. Thank you for saving me. It was a team effort. Thank you.
MySQL 8 vs MariaDB 10.11 |
I'm going to talk today about MySQL 8 and MariaDB 10.10. The original talk title says 10.11, but I wanted to make sure we're sticking to the latest GA or stable version, so it had to go down a bit. Let me start by congratulating the MariaDB team on MariaDB Corporation going public. In particular, Monty, congrats on driving two very impactful open-source database companies to exit. That's quite an achievement; I think few people in the universe have that. Just one step and you're done. Yeah. Well, so what are we going to talk about? First, I think we need to recognize that while MariaDB and MySQL started from the same roots, they have diverged substantially. It was interesting when, in the previous talk, Jean-François was talking about upstream. I was thinking, hey, does MariaDB really consider MySQL upstream at this point? Or not quite? I think there is enough divergence that these are more like ancestors, maybe like monkeys for humans, something in that regard. Now, I am trying to be fair the best way I can, which for me always means offending everybody equally. So if Monty is not screaming at me, saying, you are a fucking moron, Peter, that is not how it is, then probably I'm not doing my job properly. No, no, no. Oh, you see? Yes, yes, yes. Of course, of course. You always do everything with love in your heart, and you don't use bad words as I do. That is wonderful. So let's talk about the development model first. Obviously, MySQL is developed by Oracle Corporation. We can see that contributions are accepted, but I wouldn't say they're encouraged in the same way as in MariaDB. And we also have what I would call drop-ship open source, right? We have those releases coming, but we do not really have a public tree where the developers' changes happen, as we can see.
That can be particularly problematic, for example, for security bugs, where it can be hard to track exactly which change fixes a particular issue. That is different from MariaDB, where the server is released by the MariaDB Foundation, though a lot of the work on actual new features is done by MariaDB Corporation. The Foundation ensures that contributions are encouraged and that development is really done in public, as I would say, as a true open source project. One thing I wanted to point out, which I think is interesting, is the changes on the Oracle side. For years, I've actually been a defender of Oracle. Despite all this talk that Oracle is looking to kill MySQL, they have actually been doing a pretty good job of releasing the majority of features as open source, and their proprietary enterprise features have been kind of well isolated, abstracted through APIs, and it was relatively easy for other companies, especially Percona, to implement the equivalent. Now things have been changing in the last couple of years, right? Everybody knows this guy? Yeah, yeah, yeah. Well, we can see that Larry actually discovered that MySQL exists in the last couple of years, right? And he only seems to care about MySQL as HeatWave, because we all know HeatWave is supposed to melt Snowflake, right? And we can see a lot of focus going to this HeatWave development, which is sort of a cloud-only and, of course, proprietary version of MySQL. So far, it is only an analytics extension, right? But I think it raises a question for us: hey, could there be some other critical features which will be proprietary only? Maybe Oracle somewhere in its belly is developing something like transparent sharding for MySQL, and maybe that is going to be proprietary first, right?
So that is, I think, the question a lot of people in the MySQL community are asking. Now, with MariaDB, I think what is interesting compared to MySQL is that there are actually two companies, well, two entities is probably the better word: MariaDB Foundation and MariaDB Corporation, right? That is the latest mission statement, which I grabbed a couple of days ago from the MariaDB Foundation site. And I think it is very good to understand the relationship between those entities. The way to think about it is that MariaDB Foundation is, by and large, focusing on serving the MariaDB community, the MariaDB ecosystem. It develops open source software around MySQL. MariaDB Corporation is now a public company, which is providing proprietary solutions and commercializing MariaDB software. That, I think, is the interesting part. Now, the relationship can sometimes be a little bit complicated, though I would say there have been some more complicated entanglements, which I mentioned in my previous talks, and some of them have been made more clear, which I think is great progress. So what is interesting is that MariaDB Foundation's responsibility is relatively narrow: the MariaDB server. And we can see that a number of other components, which are very valuable in the MySQL ecosystem, are owned by MariaDB Corporation, not by the Foundation, and a lot of the development roadmap is also driven by the Corporation. I also find it interesting that the MariaDB Knowledge Base, which is kind of built by the community, is hosted by MariaDB Corporation. I find that not very good for open source software. There is also entanglement at the website level, right? So if I am downloading MariaDB software from the .org site, I am redirected next to the MariaDB Corporation Knowledge Base, right?
And I am encouraged to fill out a lead form, which will go to MariaDB Corporation, which is not totally transparent, right? I may still be thinking, oh, I am engaging with a non-profit, while actually I am giving my contact details to somebody else. Now, I wouldn't say that this is completely unfair, because MariaDB Corporation does carry the largest weight of developing and promoting MariaDB, right? And they also get the largest rewards compared to the other sponsors of MariaDB Foundation. Now, let's look quickly at what is really open source in each of them. In MySQL, what we can see is a very clear open core platform. We have MySQL Community, right? And, you know, Router, Cluster, whatever, all of that comes in the open source edition, and there is also the Enterprise version. Plus, as I mentioned, we have an increasing focus on the cloud-only solution, HeatWave. In terms of MariaDB, there is a lot more nuance, right? Because there are certain things coming from MariaDB Foundation which are completely open source, while the things in the MariaDB Corporation space can come with a variety of licenses. Now, if you look at the... Peter, let's correct your mistakes. Okay. MaxScale, the older versions are open source. What did you say? You need to say that the older versions of MaxScale are open source. Yes, yeah. The latest is BSL. Well, okay, yes. So the latest version of MaxScale is BSL. The older versions are open source, buggy, insecure, and unsupported, right? What? Let me just... What did you say? You need to say that bugs are fixed and they are supported. You can get support contracts for them, as you can for all the other things. No difference. Well, okay, well, you see... What do you think it's like? No, no. I miss hot backup. What did you say? I miss hot backup in the... Mariabackup versus MySQL, whatever. Okay. Not open source at all. Okay. That's... Yeah, that's fair. Okay.
Let's move on, right? But I would actually check on MaxScale in terms of how many changes there are. I did check a couple of years ago, to be honest, and the old versions at that time really looked abandoned. Maybe that has changed, and they are actually being maintained beyond the BSL cutoff. Maybe not. Okay. The next thing: if you look at MySQL Enterprise, it's a superset of Community. Everything that runs on Community runs on Enterprise. With MariaDB, you can see MariaDB Enterprise is an extended subset of Community, meaning there are some things which exist in Community which have not been included. Everything we do for Enterprise is part of the Community. There are only two small features that are Enterprise only. Well, I mean, so all the storage engines which exist in Community are also supported under Enterprise agreements? Well, that's what I'm saying, right? I'm saying the first sentence. What? This is not really an extended subset. It has two features. Well, what about Xpand? It's not part of MariaDB Enterprise anymore. It's a different project. Oh, okay. Okay, okay, let me correct that. So there is a cool feature available from MariaDB Corporation, but it's not part of the MariaDB Enterprise product anymore. Okay, good. Okay, sounds good. Okay, let me move faster, or we will need much more time, if we have a wonderful and productive discussion with Monty. Okay, so now in terms of cloud native, we have a fairly new MySQL operator available from Oracle, and for both MySQL and MariaDB there is also a bunch of third-party operators available, including one for MySQL from Percona. If you look at MariaDB Corporation, there is a lot of focus on SkySQL as the way to run MariaDB in the cloud, right? There was a MariaDB Corporation operator once, right?
But that is nowhere to be found now. There is an effort by MariaDB Foundation to create their own operator, though I couldn't find out whether that's GA yet. Database as a service: obviously, there are a lot of database-as-a-service offerings available for both databases. What is also interesting is that a lot of them rely on the community versions. Oracle has MySQL Enterprise with HeatWave available on Oracle Cloud and now, increasingly, on some other clouds, and MariaDB has partnerships with another set of folks. In terms of analytics, we have ColumnStore in the MariaDB ecosystem. In the MySQL ecosystem, there is no really integrated open-source solution; we have only the cloud-only HeatWave, as I mentioned. I also think there is a significantly different focus between MariaDB and MySQL. I'm not going to read through all of that, but I think it's also interesting that the architectural approach has been substantially different. MariaDB has really had a much more incremental, iterative approach. With MySQL, you can see a very big change in MySQL 8, where a lot of things have been rewritten and made not quite compatible, and there is also a lot of focus in MySQL on making it work better in the cloud, in the way Oracle sees operating a database in their cloud. Release frequency is something which I think is very interesting, and which has changed since last year. We can see that MariaDB recently moved to even more frequent releases, with a shorter maintenance cycle, as well as LTS releases every two years. So starting with February last year, the major MariaDB releases are coming out as quickly as minor MySQL releases. I think that is a very interesting difference. And as I mentioned, there are quite a few differences here.
With MySQL 8, it kind of has this evergreen release, right, where you have a lot of features introduced in Maria releases, also a lot of bugs, right? In particular, I think in the last few releases there have been some, you know, pretty nasty corruption bugs, which people did not appreciate. And also this concept of now it's only forward compatibility, right? Once you move to the new MySQL 8 release, right, you are not going to be able to run a previous version, right? So if you really want a safe rollback, you need to dump and restore, which is not appreciated by many. You missed the major point from MariaDB, upgrade from any new version to any newer version. You don't need to go between intermediate version. That's a big change to compare to MySQL. So what you can do from, let's say, 10.5 to 10.10, right? You can go from 5.2 to 11. I see. So you can upgrade. That is a good thing to make sure that you have clean setup. That's the only thing that matters. I just made 5.7 MySQL to MariaDB 10. I think 10.10 also in one step. OK. Well, yeah, let's move quickly, right? So some of the changes in MariaDB in MySQL, which I think is worth it. One is like a protocol. MySQL hasn't pushing a lot on their new X protocol, while MariaDB has been making classical protocol better. We also have different interfaces support right there. Well, something else, Manchi? I just want to know how much do you see uses of X protocol? Personally, I would say almost non-users. Well, yeah. It's used for group replication configuration. If you manage group replication configuration, you use it. Interesting. Everything else? Sorry, I have to restart the focus. Sorry. You're going to lose slides for a minute. Oh, OK. OK. Yes. The box needs to be ready. Yeah. Well, anyway, do you guys have a good generation? Yes, OK. So, Jason, imagine Jason. Can you all imagine Jason? Yeah. So, that is very significant difference. It also exists in MySQL. 
They designed a native JSON data type and have some pretty cool things like partial updates, and also, I think, from a usability standpoint, JSON shortcuts, which make things nicer and cleaner. With MariaDB, JSON is really stored as text, though it has improved the JSON parsing speed significantly. And what is cool is that in the latest MariaDB versions, it caught up a lot with the JSON features in MySQL. Like, I think two years ago I could say, hey, MySQL is a lot further in terms of what it can do with JSON. Most of that gap has been covered. Now, imagine replication. Well, there too, things are substantially different. MySQL has built out Group Replication, which gets a lot of focus in MySQL 8, specifically with MySQL InnoDB Cluster. You now have InnoDB ClusterSet, which is how you can replicate between two clusters; there is a lot of focus on that. MariaDB has been focusing on supporting both classical replication and Galera replication. And also, even if you look at classical replication, MariaDB GTIDs and MySQL GTIDs are conceptually different, right? They have both moved to that versus binary logs alone. Okay, well, you want me to try? One minute. One minute, yeah. One minute. Minus one minute right now, okay. What? You broke it. Now it doesn't work at all. You see, you see. I think it sees something, but it doesn't show. Okay. The good news is you can't blame me. What? Good news, you can't blame me. For a change, for once in a lifetime. Can you see the external display? Well, yeah, I mean, as you see, it kind of blinks, right, when it gets the external display. Like, hello, you see, it gets it. Yeah, that's what it says, it's seeing the external display. That's lovely. Well, look, I have a couple of slides and minus two minutes, so let me just finish. And then you can troubleshoot during lunch, right? Yeah, so, okay, let's see what else. A couple of things which are different, right, significantly.
I would say, A, security is very different, right? There has been a lot of work on making security better, both in MySQL and MariaDB, but the approaches are essentially different. So if that is an area you focus on and you're migrating one way or the other, make sure you give it separate, special attention. The optimizer is another area where things have diverged, right? So, again, make sure to check the query plan, especially for complicated queries, going one way or another, right? Now, I also wanted to pick out a couple of unique MariaDB goodies in the latest releases which speak to me in particular. One is, I like the UUID data type, right? Because all those MySQL posts saying, well, you know what, you can actually do this and then you're going to store your UUID efficiently: that is not a good way. You know, just provide the user with a convenient UUID data type and functions, right? And so we don't have to deal with that shit. Lag-free ALTER TABLE for replication, I think, is also very cool, right? Paying essentially double the time for ALTER TABLE, that was, I think, a long-standing problem in MySQL. Great, that's fixed. And I also like this GRANT ... TO PUBLIC concept, which is being added in 10.11. Okay, and now, I want you to imagine mountains. Well, because this slide was supposed to show that there's a nice conference covering a whole bunch of databases called Percona Live coming in May. It will be in Denver, right? And the call for papers is open. If you have something to talk about, please submit. Also, another unique opportunity, right? Some of you are probably running MariaDB, right? Anyone? Anyone? Okay, well, this is your opportunity not just to run MariaDB, but to run for MariaDB. We will put together a relay team for the Denver Marathon, which will take place one day before Percona Live. And so if you guys want to attend and run about 8K as a part of the MariaDB team, let me know.
That's all I have to say, and you should imagine this slide, which says thanks to all of you for being such a wonderful audience and coming to listen to my talk. Thank you. |
Deep Dive into MySQL Query Performance |
Now, we already spoke here a little bit about developers, and especially front-end developers. One purpose of this talk for me is really to bridge the gap which I often see between the people who have the database at the center of at least their professional life, and people who are writing an application, for whom the database is just a thing. It's like a toilet. You do your business and you move on with your life. Something like that. For those people, the database is typically a black box. There is this black box, and what I want is to connect to the service point which is provided to me. I connect quickly, I run my queries, and that's all I care about, and all that change buffer kumbaya, never heard about it. What about queries? What would you as a developer think about queries? Well, these are actually pretty simple things. When you connect to a service point, you run your queries. You want them to complete with no errors. You want them to provide you the correct result set, because if you didn't, we could alter all our MySQL tables to BLACKHOLE and get fantastic performance. No errors, too. And also you want to make sure they complete in the response time which your application expects. I think that is a very important thing to understand. If you look at it from the individual developer's standpoint, like someone writing an application: hey, response time for my queries is all I care about. And how that internal database kitchen works is somebody else's problem. Now if you think about the response time from the database point of view, that is often seen like, well, I see that the response time for a query is so much on average, or whatever the distribution is. We'll talk about that later. But that is different from what the business cares about. If you think about the business point of view, you think about, well, do my users have an outstanding experience in terms of performance with all the application interactions?
That means search should work, and placing an order should work, and whatever. And the database is an important part of that, of course, but it is not the complete part. What is interesting in this case is that, as database engineers, we often talk about these as different kinds of events: bad performance, and downtime. And say, well, you know, no, no, we weren't down, it was just taking 15 minutes to run my very basic query. Well, from the user's standpoint, bad performance, very bad performance, is indistinguishable from downtime. Because, A, people don't have patience, and even if people are very patient, then the browser or some other timeout will happen, and nobody gives a shit about that query which may still continue to run. Another thing to understand about query performance is that you do not want to focus on the averages. I like this one saying about the man who tried to cross a river that was on average one meter deep. The same applies to queries. If your average query time is X, that means pretty much nothing. You need to understand more than that. And in this case I like to look at the percentiles, and even more, to make sure you can look at the specific distribution of your query response time. If you have that, it gives you a lot more insight. Now, one thing to understand about percentiles: you may be looking and saying, well, great, my queries have this decent 99th percentile. But that does not mean, on the business side, that 99 percent of your users have a good or acceptable experience. Why is that? Well, because guess what? A single user interaction can correspond to a lot of queries run sequentially, which all add up, and typically through their journey a user has a number of those interactions. So I would say even the 99th percentile may well, depending on your application, only correspond to something like 50 percent of user sessions.
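The point about averages and percentiles can be made concrete with a little arithmetic. The following is a minimal sketch, not something shown in the talk: the latency numbers are made up, the function names are hypothetical, and the session estimate assumes that each query independently has a 99 percent chance of being fast, which real workloads will not satisfy exactly.

```python
def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile (integer pct) using the nearest-rank method."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, so no floating point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def share_of_sessions_all_fast(fast_fraction, queries_per_session):
    """Fraction of sessions in which every query was fast.

    Assumption: queries are independent, which is a simplification.
    """
    return fast_fraction ** queries_per_session


# Made-up sample: 95 fast queries, 4 slower ones, 1 very slow outlier.
latencies_ms = [5] * 95 + [40] * 4 + [3000]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = nearest_rank_percentile(latencies_ms, 50)
p99_ms = nearest_rank_percentile(latencies_ms, 99)

# A session that issues 70 queries, each with a 99% chance of being fast.
session_share = share_of_sessions_all_fast(0.99, 70)

print(f"mean={mean_ms:.2f} ms, p50={p50_ms} ms, p99={p99_ms} ms")
print(f"sessions with only fast queries: {session_share:.0%}")
```

In this made-up sample the mean is pulled far above what a typical query sees, and with roughly 70 queries per session only about half of the sessions avoid a slow query entirely, which is in line with the 50 percent figure mentioned above.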
So if you look at really complicated, large environments, they are focused on either a relatively short SLA or rather high percentiles. Another thing that I would encourage you to pay attention to is errors. And make sure you are measuring response time for those as well, because errors can actually be of two kinds: fast errors and slow errors. In certain cases, let's say if your table doesn't exist, you may get the response straight away, and if you put all your error queries and your normal queries in the same bucket, you may say, oh my gosh, my response times are doing so well. But on the other hand, if your error is, for example, a lock wait timeout, then that is a slow error. It will actually have a higher response time than the normal cases. That is why I always suggest making sure we measure response time for normal queries and for queries with problems separately. Another thing which is very important is looking at response time over time, because traffic changes, a lot of things are going on in the system, and just saying, hey, I have a response time of X over some long period of time, is not very helpful. Also, what you would see in many cases is that you first see those small performance problems, maybe SLA violations, which, if left unfixed, turn into downtime. For example, in the MySQL world, you may say, well, I have forgotten this runaway query, and my history accumulates. It will slowly increase and increase your response time. If you measure that over time and say, well, something is not trending in the right direction, you can probably fix it before it is seen as downtime by your users. If you do not, then not so much. This is an example of what we have here. You may often see something like this as well, where all the queries have a spike in response time, which often corresponds to something external happening in the environment.
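Keeping successful queries and failed queries in separate buckets, and doing so per time window, could look roughly like the sketch below. It is only an illustration under stated assumptions: the event tuples, the `summarize` helper and the 60-second window are hypothetical, and the error codes are used as plain labels (1146 is MySQL's "table doesn't exist", 1205 is the lock wait timeout).

```python
from collections import defaultdict


def summarize(events, window_s=60):
    """Group query events by (time window, ok/error) and summarize latency.

    events: iterable of (timestamp_s, latency_ms, error_code_or_None).
    This input shape is an assumption made for the example.
    """
    buckets = defaultdict(list)
    for timestamp_s, latency_ms, error_code in events:
        window_start = int(timestamp_s // window_s) * window_s
        kind = "ok" if error_code is None else "error"
        buckets[(window_start, kind)].append(latency_ms)
    return {
        key: {"count": len(values), "max_ms": max(values)}
        for key, values in sorted(buckets.items())
    }


events = [
    (0, 10, None),       # normal query
    (5, 12, None),       # normal query
    (10, 1, 1146),       # fast error: table doesn't exist
    (70, 14, None),      # normal query, next window
    (75, 50000, 1205),   # slow error: lock wait timeout
]

summary = summarize(events)
for (window_start, kind), stats in summary.items():
    print(window_start, kind, stats)
```

Mixing the 1 ms error into the normal bucket would make the first window look faster than it is, while the 50-second lock wait timeout would hide among ordinary slow queries in the second.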
I think what is very interesting here, especially for those of us running in the cloud, is that we only have limited observability into the environment. If there is some shit going on on the Amazon backend, they're not going to tell us that. Oh, you know what, we had, let's say, three hard drives fail which back your EBS, and we had to do some rebalancing, yada, yada. The other question I would ask is where we want to measure query response time from. In my opinion, both the application view and the database view in combination are very helpful, because the application can see the real thing. If your network, for example, is adding some latency or whatever, you will see that from the application, not so much in the database, because it only sees from, hey, I got the request, to when it sent the data back. But the database view often allows you to see a lot more insight into what has been going on inside, whereas from the application side we often can just capture the query time and maybe some very basic additional parameters. So what did we say about the business view? Well, we already said that all users should have an outstanding performance experience with all the application interactions, right? Let's now try to break it down a little bit more, to what that may mean. In this case, I want to introduce this little project and flag it for you. This is the SQLCommenter project by Google, which is pretty cool in that it allows you to pass metadata which you understand as a developer all the way down to the SQL query. They implemented support for a number of frameworks, and it's also supported in their Google Cloud monitoring environment. And I would very much like to see that developed more, and for us to at least come to some sort of shared standard between the databases for how we can augment query information with tags and values which users care about.
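The idea of carrying developer-level metadata inside the query text can be sketched as follows. This is a hand-written illustration that loosely follows the comment style SQLCommenter uses (sorted, URL-encoded `key='value'` pairs in a trailing comment); it is not the SQLCommenter library itself, and the `add_sql_comment` helper and the tag names are hypothetical.

```python
from urllib.parse import quote


def add_sql_comment(sql, tags):
    """Append key='value' tags to a SQL statement as a trailing comment.

    Keys and values are URL-encoded so that characters such as quotes,
    '*' or '/' cannot terminate the comment early.
    """
    if not tags:
        return sql
    pairs = [
        f"{quote(str(key), safe='')}='{quote(str(value), safe='')}'"
        for key, value in sorted(tags.items())
    ]
    return f"{sql.rstrip()} /*{','.join(pairs)}*/"


tagged = add_sql_comment(
    "SELECT * FROM orders WHERE id = 42",
    {"route": "/checkout", "application": "shop", "team": "payments"},
)
print(tagged)
# SELECT * FROM orders WHERE id = 42 /*application='shop',route='%2Fcheckout',team='payments'*/
```

Because the tags travel with the statement, whatever collects queries on the database side can group them by application, route or owning team without a separate lookup.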
So what are possibilities which can be quite helpful in this regard? Well, finding, for example, who is our actual user tenant, who is query, corresponds, right, because we often may have, you know, different performance issues, right, finding the application, like some sort of like a subset of application functionality where many of them may be hitting the application, right, version information, maybe information about like an, their engineer of a team who is responsible. I often see DBAs or SRAs team having problem, like, oh, I see this nasty query which was shipped yesterday, I know because, shipped today because I know it wasn't very yesterday, right, but now having to find out who a hell introduced that stupid query, maybe problematic in a large environment. Now a lot of focus, and I think the core of the query-based observability may be about the query. But the query, I mean, obviously like a query with sort of like it's, which are same except different parameters, and that is very helpful because, well, obviously they have a different complexity, different expected SLA, and so on and so forth. The next way also to break things down for me would be to look at the schema or database, and why is that interesting? How? I just noticed right now what it's been cut a bit, you see, well, anyway, life is life. I'm just not going to be lucky in this room, right, yes, yeah, but, well, we can blame our windows, right, on this conference we can and should blame windows. Okay, well, why schema and database are also good because often we would separate in the multi-tenant applications different tenants by schema, right, and in that case that gives us a good profiling for performance of their different schemas, right, like we can see here in the example with PMM tool. 
Another thing what I found is very helpful to find a way to separate the data by different tables, right, in many cases you want to say, hey, you know what, how a query is hitting given table is affected, especially if it did some change which relates to the table. Hey, you know what, I changed the indexing on this table, let me see how all the queries hitting this table is impacted, very helpful because there may be some surprising differences. Database users, that is another thing which is quite helpful because that often allows us to identify the service application, right, if you're following good security practices you would not let all your applications, right, just use one username, you know, not a good idea, right, and also find human troublemakers, right, which are doing, having direct access, right, and so many times you'll find somebody, you know, running the query, right, and say, okay, well, yeah, it's slow but wherever I'll go for lunch, you know, I have time, well, you may have time but your database may not, right, so we also, like here's example how we provide that. I also mentioned database hosts and indexes in many instances, in many cases that is very helpful because even if you may think, oh, my different database instance should perform the same, well, world is a messy place and world in the cloud is even messy place, right, they may not exactly have the same performance due to, you know, some strange configuration differently, having a bad day, right, or even maybe having a different load, right, and that is a good to be able to break it down, right, when you see some of your queries are not performing very well. 
I would also look at the same stuff from a web server or application server instance because, again, if you have, like, maybe like a hundred nodes, you deploy the same application, you may think, hey, we're all going to perform the same, hitting the database, well, that is not always the case, right, they have seen changes from people saying one FM is misconfigured or for some reason cannot connect the cache, so it's, you know, hitting, you know, ten times more queries, right, on the database than it should be, or the application rollout didn't go well, where UV eliminated nasty query on 99 of application instance but not some others, right, it's a very good to actually be able to validate that because what you would see or, like, again, from a DBA standpoint, you know, developers, sysadmins, storage people, they are going to tell you shit, right, but they are going to lie, right, they are going to lie, right, maybe not intentionally, maybe because of their ignorance and limitation of their tool but as a DBA, a city or something, you want to point them out to their shit and say, look, I have evidence, right, evidence is good, right, so clients costs, custom tags is very helpful if you can extend, that is what we spoke about, the SQL commenters, something else which I find very helpful which we cannot quite easily get with MySQL but being able to separate the query by the query plans, right, often you may have a query which looks the same but it may take different execution plans, right, and often that may be correlated to its performance. In certain cases, it is totally fine, right, very different situations, sometimes MySQL optimizer may get a little bit, you know, crazy just and has that optimizer plan drift for no good reason which may not be very easy to catch, right, and will be helpful to do. 
What I also would like to highlight is that when you find a specific query and say, hey, this query has nasty performance, right, we often want to understand where that query's response time comes from, and these are some of the things it can come from. Some of them are relatively easy to find out, some are not so easy. For example, whether a query has waited for available CPU because the system was already saturated: well, you can't really see that on a per-query basis. You can only see that, well, my CPU was super packed over a period of time. Okay, here are a couple of other things to consider when you're looking at the queries. One, you really want to look separately at the bad queries versus the victims, because sometimes you will see, oh, queries are getting slower, but it's not because of them, it's because of some other nasty queries. Maybe that is your Java developer who thought, well, you know, to solve my problems, I will just launch 200 threads and make sure I am good, but everything else is slowed down, right, and that may be tricky. Another thing is that you should not forget the currently running queries. In many cases, if you look in Performance Schema at queries by digest, that gives you what happened in the past, but believe me, if you start, you know, 50 instances of some very bad query which continues to run, well, that may be the reason for your problem, not the past, right? And connected to that, I think it is less of a problem in MySQL right now if you're using query timeouts, which is a very good practice, because if you say, hey, you know what, for all my interactive queries, by default, I set a timeout of, let's say, 15 seconds, then you should not care too much about your past queries, because, well, you know what, everything gets killed after 15 minutes.
Also, 50 seconds, right, you should not ignore the stuff which is invisible from a query standpoint, right, databases do a lot of shit in the background, you may also do things or your operation teams like, well, backups or provisioning another node for cloning, right, for the clouds or wherever your VM system may need to do something in the background, it may not be directly visible, but that can impact the query performance, right, so sometimes, well, when you observe a query impact and you can't really see what is causing that, it's possible. I also would encourage to avoid what I would call like a biased something. I see people sometimes would say, hey, you know what, we will set long query time to one second and only look at the queries which are more than one second in length, well, you may be only focusing on the outliers, right, and missing the possibility to optimize other queries, right, or actually even focusing on the queries which provide, which are responsible for providing that bad experience, right, for your users. Okay, we find another thing like a last minute I have or something, I wanted to say, hey, what I would like to see from my skill to do better, who is Kenny, no Kenny? Yes, he's always hiding, right, he probably wanted to get another sandwich, damn it. Okay, so here are some things that I would like to see. One is better support of prepared statements, right, and right now it's kind of, you know, not done in the same way, right, which is, I think, is a problem, right. Now I would say consider grouping data by time in certain cases, right now you get like all the statements in one table, right, and you have a lot of statement variety, that table tends to overflow, right, which is not really helpful, right, and if you have to kind of reset your queries all the time, that is not very, you know, good practice in my opinion. 
Provide the list of tables a query touches, right? That is very helpful because, well, the MySQL parser already knows it, it knows which tables the query touches, but it's very hard to parse that out from a query, especially if you consider views. I don't know by looking at the query alone whether something is a table or a view. Information about plan ID: I would like to see for the query some sort of plan hash or something, so I know which plan the query is using, something like that. And also what I would call a top wait summary. Right now we have information about waits in MySQL Performance Schema and about queries, but I cannot look and say, oh, that query was slow because it spent XYZ amount of time waiting on something or whatever, or at least do that for some small class of queries. I don't think that's convenient. Well, with that, that's all I had to say. I hope that will help you to avoid tuning your indexes by credit card. And yes, oh, I have time for questions. You told me, like, Peter, five minutes to answer. I have time for questions, yes. Any questions? No? Oh, yeah. What's the difference or advantage of this SQLCommenter thing compared to the OpenTracing standards, where people trace the whole thing? What's the difference with SQLCommenter? Well, what I would say in this case, yes, I mean, there is obviously the OpenTracing framework which you can use. This gets specifically to the database and specifically to every query. If you look at the OpenTracing framework, I think getting every query may be a lot of volume out there. And again, I also think the good thing about SQLCommenter is that it does this automatically, if you will, right? It does not require you to do extra integration.
Okay, anybody else, yeah, I mean, it works with MariaDB as well, yes, well, there are not practices, there are no good practices, right, like you can, there is a lot of optimizer hints you can use, right, so you can actually force the query to go like this particular stuff, right, but that also prevents optimizer choosing different plan if better plan becomes available. Yeah. Never use forced index, always use ignore index, okay, well, then thank you, folks. |
Online schema change at scale in TiDB
How does schema changes work in a distributed SQL database |
I'm Mattias, I work at PingCAP. We are making a distributed SQL database called TiDB. It's MySQL compatible, so for the clients it just looks the same, and I have a short talk about online schema changes at scale in TiDB. Compared to MySQL, a distributed database is slightly different. MySQL takes a metadata lock, so it basically needs to stop the world: no transaction can get past the metadata lock while it changes the metadata. That means it's a short lock when you do any kind of DDL, but it also means that when you're doing replication, this metadata lock actually stops replication for a bit, so if it's not an instant DDL, you would start getting replication delay when the DDL goes through. So a distributed database is of course different. From the client perspective, you should just see a normal database. You should expect it to be transactional, it should be ACID compliant, and you query it with your normal SQL queries. As a user, you shouldn't see any difference, but of course underneath it's distributed over multiple nodes, et cetera. So if you take ADD INDEX as an example of a DDL, we can't do the synchronous stop-the-world scenario as with the MDL in MySQL, so that's something that we need to solve. During that, we do need to copy and create all the index entries for the new index while normal traffic comes in. In the beginning, MySQL more or less stopped the full table, copied everything over, and then released it again. Nowadays, they are much better at doing it online and only keeping the metadata lock. But that's something we need to do better in a distributed database. So the proposed solution is to version the schema, so every change you make to a schema or a table, you make as a specific version. You need to allow sessions to use either the most up-to-date version, the current version of the schema, or the previous version. So then you can do transitions between these states or versions.
And we need to guarantee that the states between the previous version and the current version are compatible. So that basically means we need to create some kind of states from the before or the start states to the public state where it's usable. And I think it's easiest to go backwards to actually see what kind of states are needed. So here the VN, it's the current version. And we start by the public, it's the end state, everyone sees the index. So selects goes there, insert updates, everything goes there. The previous version, we can actually remove the selects, but it still needs to do all the updates, insert some deletes there. And as you see, regardless if you're, so the time here can be a bit confusing. So the transactions are of course using the real time, current time, but it might see a different version of the schema because we can't require all transactions to constantly check for the new schema and stop the world for that. Let's then move and say we are in this write only state. Before state for that, then of course we cannot do selects. Do we need to do inserts before to make them compatible? Well, we don't actually serve the reads. So we do not need to do the inserts in this state. Backfill will help with it. And then of course comes the question, how would backfill handle it? So for backfill to handle this correctly, we actually need to have another state between public and the write only state. And as you see, statewide for transactions, it doesn't really change anything, but it gives time for doing this backfilling because when we enter the write reorganization state, then we know that the state before, it's the write only. So all changes will be double write. That means that updates can be a bit tricky because we say we're not doing insert, but how do we handle updates? So we need to go a bit deeper to see how that's handled. In the add index example, we do have the table data that's public. 
So everyone should be able to read directly from the table without the index. So let's say we have, at time zero, a session that sees this new write only state. And it does an insert. It inserts into the table, and it updates the index. So you can find the row through the index. Then later on, another session comes in, but that session has not yet transitioned to this write only state. So it's in the state before, and it wants to update this row. So it goes to the table and updates the row. That's public, so that's what it needs to do. But then how about the index? We don't actually need to insert into the index here because that will be handled by the backfill in the write reorganization state. But the trick here is that we actually need to remove the old entry as a part of the update. So update actually means that we need to propagate the deletes into this new index object, but we do not need to do the inserts. So we need to propagate updates as delete only. And that also makes it easy to handle deletes, since we do need to handle deletes in the new index. That also gives the state its name: the delete only state. When you're reading about this, it's inspired by a paper from Google about online asynchronous schema changes in F1, so on top of Spanner. It takes some time before you understand exactly why you need a delete only state. But this is the reason, so we are able to move through the different states. By not inserting the new row in the index, or the new entry in the index, does that not mean that nothing else in the system can use it, because you have to wait for the backfill to complete? Yeah, so you don't read from the index until it goes public. It should complete, okay, so you have to wait for it, and it doesn't mess anything up along the way. It just delays that. You could have done it at the same time, while you're doing the deletes.
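The states walked through so far for ADD INDEX can be summarized in a small model. This is a simplified sketch for illustration, not TiDB's actual implementation: the enum, the `ALLOWED_INDEX_OPS` table and the helper functions are hypothetical names, and the rule that two sessions may be at most one state apart is how the "current or previous version" requirement is modeled here.

```python
from enum import IntEnum


class IndexState(IntEnum):
    ABSENT = 0        # start state: the index does not exist yet
    DELETE_ONLY = 1   # only deletes are propagated to the new index
    WRITE_ONLY = 2    # inserts and deletes are propagated, no reads
    WRITE_REORG = 3   # same as write only, while the backfill runs
    PUBLIC = 4        # end state: reads are served from the index


# What a session that sees a given state may do to the new index.
ALLOWED_INDEX_OPS = {
    IndexState.ABSENT: frozenset(),
    IndexState.DELETE_ONLY: frozenset({"delete"}),
    IndexState.WRITE_ONLY: frozenset({"delete", "insert"}),
    IndexState.WRITE_REORG: frozenset({"delete", "insert"}),
    IndexState.PUBLIC: frozenset({"delete", "insert", "read"}),
}


def index_ops_for_update(state):
    """Index maintenance a row UPDATE performs in a given state."""
    ops = []
    if "delete" in ALLOWED_INDEX_OPS[state]:
        ops.append("delete old entry")
    if "insert" in ALLOWED_INDEX_OPS[state]:
        ops.append("insert new entry")
    return ops


def can_coexist(state_a, state_b):
    """Sessions may only be on the current or the previous schema version."""
    return abs(state_a - state_b) <= 1


print(index_ops_for_update(IndexState.DELETE_ONLY))   # ['delete old entry']
print(index_ops_for_update(IndexState.WRITE_ONLY))    # ['delete old entry', 'insert new entry']
print(can_coexist(IndexState.ABSENT, IndexState.WRITE_ONLY))  # False
```

The last check shows why the delete only state is needed: without it, a session that does not know about the index at all could run next to one that is already inserting entries, and its updates would leave stale entries behind.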
So if you would insert it, then it would more or less be overwritten by the reorg phase anyway, because the reorg needs to read from a snapshot and take all that data: a snapshot taken at some point when everything was in write only. So it would just be the overhead of doing the insert. It wouldn't actually mess up anything. It would still be correct, but it would just be unnecessary. And then if we move on from the delete only state, the previous version can actually be the start state, because as long as deletes are done, the previous version does not really need to do anything. So there we have the different states that it needs to transition through to keep transactions running without being blocked. So here we have the full path of the asynchronous DDL that's done online in a distributed database. Do you support distributed transactions, and if you do, what about transactions in the XA prepare state? So we do not support XA transactions right now, but of course if you're connected to different SQL nodes, each one looks just like a master or a primary or whatever. So full read and write, however you connect. So transactions are slightly different. You cannot have transactions spanning more than two versions. So you need to either wait, or you need to block, stop and fail transactions that are too long-running. Okay. And these versions, you have several versions associated with a single online schema change? Yes. Yes. So a single DDL goes through multiple states. And currently I'm actually working with partitioning, and for ALTER TABLE REORGANIZE PARTITION, where you turn one set of partitions into a new set, there's another thing. During the reorganize phase, when you're copying data, you select from the old set, then you go to public, so you select from the new set, which means that if someone is still in the write reorganization state, they will select from the set that's not updated by this one.
So you need to add an additional state between the write-reorganization and the public state just for moving the select. So it's a double write while moving the reads. And all this is done in TiDB, and I'm not sure how many of you are familiar with TiDB. Okay. Good. Then let's do a quick introduction. TiDB is mainly architected around three different components. You have PD, which stands for Placement Driver. It creates the timestamps for transaction handling and it knows about the data locations, so it knows on which node the data is. Then we have an SQL layer that is stateless, so it's very easy to spin up more nodes or scale in the number of nodes. Here we have re-implemented the MySQL protocol. This is actually written in Go, and all of it is under the Apache 2 license. So we do not share any code with MySQL or MariaDB. It's completely new since 2015, when the project started. And then we have a storage layer. The base storage layer is TiKV, a distributed key-value store. We even have people that run that as a distributed key-value store and don't bother about the SQL part, so that's something you can do as well. And then we also have an additional, optional way of storing the data in what we call TiFlash. That's a column store. So by connecting it here you can actually do analytics, like aggregations and so on, on the same data, even within the same transaction. And the optimizer here would choose what is the fastest way, what has the lowest cost for executing the query. So you don't have any ETL or anything like that in between. It's very easy to add. You just do ALTER TABLE and set the TiFlash replica to one or two, and if you add more than one, then you also get the MPP, the massively parallel processing, part of it. You can run Spark on it as well. And let's go slightly deeper into how we actually store the data.
So we take all this data and split it into ranges of about 100 megabytes, and each such range is stored in three copies, or, yeah, it's configurable, let's say three copies, in the TiKV storage nodes, and each such region forms a Raft group. So that's how it keeps high availability. TiKV is using RocksDB as its lower-level storage. So it's an LSM tree, similar to MyRocks in Percona or MariaDB. So it's not B-tree based. Through this Raft protocol, that's also how we connect the column store. So that's how you can run it in the same transaction, and even if you have a join, maybe it's faster to execute parts of it through an index in the row store and then do some of the table scans and aggregation in TiFlash, in the column store. And this is optional, but this is the base. You always need to have the row store, and you can have the column store as an option. There's a lot of tooling that works. First of all, I would mention Data Migration: it's easy to have a TiDB cluster read the binary logs of, or just dump, an upstream MySQL instance, or even several instances, into the same cluster, so you can combine all the data back. We have Backup & Restore, and Dumpling, which I think even works with MySQL. You have a tool for doing a diff between different instances, and change data capture that can go to either another TiDB cluster or a MySQL instance, or go through Kafka as well if you want. TiUP, that's a way of managing and deploying TiDB and all the components you want. You can even use it as a playground to start it on your laptop. It will download the binaries and start everything, including monitoring. So it's very easy to just try out. We have an operator if you want to run it in Kubernetes as well, in the cloud. And we even have it as a cloud service, so you can do anything from on-prem up to a cloud service in many different ways.
And we also have Lightning, which is an optimized import tool, and that's what I will actually use in the next slide. A year ago, we started a project because we heard about and compared the ADD INDEX performance in a TiDB cluster versus, for example, CockroachDB or Aurora. And at that time, we were basically three times slower, because we hadn't optimized it. It was stable, proven, and it worked, but it was not fast. And especially when you're doing a proof of concept or loading data, that's where it's really beneficial to speed it up. The way it worked, it would just do the data copying through small transaction batches, more or less. That creates a lot of overhead with transaction handling, et cetera, that's not actually needed when you're doing a backfill, because the backfill process, the data copying, doesn't actually need to be transactional. And it's only a single node that does this, a single TiDB node that orchestrates it. I'm not going to go deep into this. It basically just shows how you issue a command in one TiDB node and it goes into a table, and the TiDB DDL owner will pick it up, go through the different steps, and do the data migration and data copying. So what we did first was create a feature behind this feature flag. It uses the technology from Lightning, the import tool. It's completely built into the TiDB cluster, so it's not the external tool. But it reads the data and then it creates SST files for ingestion into RocksDB. So it's a very efficient load and it has very low impact on the storage side. It just moves these files into the storage, enables them, and takes them into the RocksDB levels. The result was around a three times speed-up and of course a lot less impact on the normal load in the cluster. So even if you have a highly loaded cluster, you can do this almost without impacting it.
And then we did a bit of analysis of where we could improve even more, and there were things like the scheduling, which could be improved just to shorten the time. Instead of reading directly from the key-value store, we could use the coprocessors for removing columns that are not needed, for example, for doing optimized scans, et cetera. We disconnected the read and write dependencies so they could run in parallel and asynchronously, and made a lot of other small optimizations. And that created yet another three to five times speed-up. So all in all, during the last year, we have made a 10 times improvement in speed while still only using a single TiDB node, and now we're three times faster than the baseline of the other implementations, in CockroachDB and Aurora, that we compared with. And there's a bit more to do, so we're currently looking into how we can even distribute this instead of running it on a single TiDB node, and also into being able to auto-tune the priority, so that if you have load that goes a bit up and a bit down, the DDL work can adjust to that. And if you depend on a single TiDB node and that breaks for any reason, then you are going back to the previous stage, is that it? Yeah, so we keep state, so we go back a little bit, but you don't need to redo the whole backfilling of the index or anything like that. And yeah, it's all on GitHub. If you're interested in how it actually works, I would recommend going to OSS Insight. I would say it's a demo site. It runs TiDB in the background, with a simple and quite nice web UI on top. It has all of the events from GitHub, so currently it's 5.5 billion records, and we store them in a single table. There are a few other things there as well. And you can look up your own GitHub ID or your own project, your own repository, compare it, check some different frameworks, et cetera. It's quite cool, actually.
TiUP is very useful if you want to try it on your own laptop or in your own data center. Of course, you can go to TiDB Cloud as well, but I didn't mention that here because that's our commercial offering. Something else that we have that is not directly connected is Chaos Mesh. So if you have a system on Kubernetes and you want to see how it handles different errors, you can use that for injecting errors. That's something that we used for stabilizing and testing the TiDB cluster. Then I think I'm out of time. Perfect timing, so you have time to answer questions? Yeah. Yeah? First of all, I'm very interested in how you organize the HTAP transitioning. I mean, you have both storages, and I missed the way you move the data from row into columnar format. I believe you do a double copy. You have double copies of the data itself. So we always have the copy here, and the Raft leader of the group is always here. So you have the Raft leader and Raft followers in TiKV, and then we extended the Raft protocol. So we have learner states here, and they can never become leaders. So that's how we do it. So this is a must, and this is optional. What about the optimizer model? How do you calculate the cost, in the cost-based approach, to understand which storage format to use? It's also influenced by the Volcano optimizer model, so that's how you more or less pipeline the different things and can move parts of the pipeline into an MPP framework that handles the column store. And I wonder if this model and the optimizer are dispersed across multiple parts of TiDB, or is it in a single place? So the optimizer, that's in the TiDB project, in the TiDB repository. So it's in the SQL node, and when it executes, it pushes down these coprocessor requests, and also goes through the MPP framework for pushing down query fragments or coprocessor requests, into either TiKV or TiFlash.
So for example, if you're doing a join where one part, one table, can be resolved fast by an index lookup, then it will go here for that table. And for another table, it might be a big table scan or aggregation that will be faster here. So then it can actually combine that. But your cost-based model is based on some assumptions about the cost of these operations, right? I'm not sure. I don't know the details deeply enough to answer that. And last question, how do you test the compatibility of the MySQL client protocol with your implementation? Because it's a big question. Yeah, yeah. So we don't have any connectors of our own or anything like that. We're just relying on MySQL connectors or MariaDB connectors. And that's what we're using when we're testing. So you basically have a test suite that tries different kinds of queries, and after they pass, you understand that the two are more or less equal. Yeah. And of course, there are differences. But I would say the compatibility with the MySQL dialect is very, very high. But of course, things like management commands for replication don't work, because we don't do replication that way. We have internal replication, or we use change data capture for transfer to another cluster. Thank you. Last question. What does TiFlash do when there is a high rate of single-cell updates? How does it handle this, by rewriting the column files or by keeping the changes separate? It's a derivative of ClickHouse. So it caches the changes and then it updates partially or rewrites the whole thing. Can it lag behind TiKV because it takes more time? It can. But if it's behind, then it will more or less fall back here. You have some tweaking options. You can even use optimizer hints to say that you want to use either engine, for example. Thank you. If there are more questions, I'm sure Mattias will be able to answer. Yeah, I'm here. Even Daniel is here as well. Thank you. Thank you. |
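As a recap of the index states described earlier in this talk (start state, delete-only, write-only, write-reorganization, public), the rules can be written down as a small table. This is only a sketch of the rules as stated in the talk, not TiDB's implementation: the enum, the function names, and the operation labels `delete_old` and `insert_new` are hypothetical.

```python
from enum import Enum


class IndexState(Enum):
    """States a new index moves through; names follow the talk."""
    ABSENT = 0       # start state: the session does not know the index
    DELETE_ONLY = 1
    WRITE_ONLY = 2
    WRITE_REORG = 3  # same write rules as WRITE_ONLY while the backfill runs
    PUBLIC = 4


def index_ops(state, statement):
    """Index maintenance one session performs for one DML statement."""
    if state is IndexState.ABSENT:
        return []
    if state is IndexState.DELETE_ONLY:
        # Inserts are skipped; an update only removes the old entry.
        rules = {"insert": [], "update": ["delete_old"], "delete": ["delete_old"]}
    else:
        rules = {
            "insert": ["insert_new"],
            "update": ["delete_old", "insert_new"],
            "delete": ["delete_old"],
        }
    return rules[statement]


def can_read_index(state):
    """Only a public index may be used for reads."""
    return state is IndexState.PUBLIC


def may_coexist(a, b):
    """Sessions may be at most one state apart (no more than two versions)."""
    return abs(a.value - b.value) <= 1


if __name__ == "__main__":
    for state in IndexState:
        print(state.name, index_ops(state, "update"), can_read_index(state))
```

The `may_coexist` check reflects the remark that transactions cannot span more than two versions, which is why the delete-only state is needed between the start state and write-only.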
Life of a Query in Vitess
Impersonating a monolithic database |
Hello everyone, let's, all right, thank you. So I'm Harshit Gangal, I work for PlanetScale, and I'm a maintainer of Vitess, and today I'm going to talk about the life of a query. Why? Because I work in the query serving team in Vitess. So what is Vitess, basically? Vitess is used to scale out your MySQL, and it also manages your MySQL for you. It's a CNCF graduated project, and it is open source under the Apache License 2.0. So this is the architecture of Vitess. There are many components, but today we'll just focus on a few of them. One is VTGate, which your application connects to. Then we have VTTablet: once the application connects to VTGate, the queries are sent over there. And then you have MySQL, where your data is actually stored. Let's talk about a few terms which I'll be using across my presentation. One is keyspace. A keyspace is a logical representation of a database, and it's basically a collection of your physical databases. A shard is basically a subset of your data which resides in a keyspace. And there's a term called vindex. A vindex is similar to your MySQL indexes, but it's maintained by Vitess, and there's a thing called the primary vindex, which decides in which shard a row of a particular table actually lives. So once your query comes in, while inserting, it decides which shard this row should actually go to. So let's take a query, and then we'll go further. Let's say we have two tables, a customer and an order table, and what we want to do is find the customers who have spent at least 1000 bucks in their whole order history. So we have to take two tables, do a join, do a grouping on them, and put a filter on top of it. So first things first. The client wants to send a query to you. How will they do it?
So first, the client has to connect to something called VTGate, and they can use the MySQL protocol. You have all the MySQL drivers in different languages, and you can use the already available MySQL driver to connect directly to VTGate. VTGate supports the MySQL protocol, so you don't have to do anything on that front. But it also supports gRPC, which was supported before we implemented the MySQL protocol, and it still stays here for its own benefits. The reason is that in the MySQL protocol, once you connect, your session is tied to that connection, which means that if you open a transaction, you have to commit the transaction using it. But with gRPC, the initial benefit you get is that you can connect to any VTGate and start a transaction over there, and then you can connect to any other VTGate and commit your transaction using that VTGate. So let's talk about the different phases that we have to go through. Once the query is received by the VTGate, it has to go through parsing, rewriting, planning, and execution. We'll talk in detail about each of the phases, but this is what the VTGate does once it receives the query on its end. So in the parsing phase, you have received the query. Now it will parse it to know whether the query is syntactically correct or not. And once it is syntactically correct, it constructs the abstract syntax tree of it. So here it will have the select expressions, the from clause, the group by, and the having for the query we mentioned before. Once it's parsed, it goes into the rewriting phase. And it's very important to have this rewriting phase, because we are trying to mimic a single data store, though your data is distributed across multiple shards. In this rewriting phase, what we try to do is, first, see whether there are any literals, anything which we can parameterize.
So once we are done with the planning phase, we don't have to plan a similar kind of query again and again. So what we do is we say, okay, here I see 1000, but I don't need to plan specifically for 1000. So I just make it a parameter, and then I plan for this kind of query rather than planning with the 1000. The other thing we do is replace some functions. For example, if you want to get the last insert ID, you cannot just send that to any MySQL and get a last insert ID. So we do session management at the VTGate level, where for each session we have to know what the last inserted ID was, so you cannot send it down directly. So we need a rewriting phase for these kinds of functions. There are multiple, but I'm just talking about one of them, which is last insert ID. You have to handle it at the VTGate level, so you replace that function with a value, and you know what that value is for that session. And the third is that you might have to add another SQL node after you construct the AST. For example, if on the session you said, I cannot select more than 100 rows, then in the rewriting phase we'll add a limit clause to the AST as well, so the planning happens with the limit in place. So after the AST generation and rewriting, before we go into the planning, the planner needs some additional information: it should know what keyspaces exist and what shards each keyspace maps to, and, in the keyspaces, what tables exist and what vindexes you use for those tables. This information is cached in the VTGate, and you have an event watcher which watches for changes. This information comes from something we call the topo server, which is ZooKeeper, etcd, or similar, and which is where this kind of metadata is stored. And let's see what the sharding information that we get from the topo server looks like. This is what we call the VSchema.
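The slide with the sharding information is not captured in this transcript. A VSchema for a sharded keyspace with two tables that share one vindex might look roughly like the sketch below, written as a Python dictionary and printed as JSON. The table name `orders`, the vindex name `hash`, and the two helper functions are assumptions made for illustration; the keyspace name itself is not part of this fragment.

```python
import json

# Hypothetical VSchema for the sharded "commerce" keyspace.
VSCHEMA = {
    "sharded": True,
    "vindexes": {"hash": {"type": "hash"}},
    "tables": {
        "customer": {"column_vindexes": [{"column": "cid", "name": "hash"}]},
        "orders": {"column_vindexes": [{"column": "cid", "name": "hash"}]},
    },
}


def primary_vindex(vschema, table):
    """Return (column, vindex name) of the table's first, i.e. primary, vindex."""
    first = vschema["tables"][table]["column_vindexes"][0]
    return first["column"], first["name"]


def shard_the_same_way(vschema, left, right):
    """True when both tables use the same vindex as their primary vindex."""
    return primary_vindex(vschema, left)[1] == primary_vindex(vschema, right)[1]


if __name__ == "__main__":
    print(json.dumps(VSCHEMA, indent=2))
    print(shard_the_same_way(VSCHEMA, "customer", "orders"))
```

The helper is a deliberate simplification: a real planner also has to check that the join condition uses the sharding column before it can treat the two tables as living in the same shard.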
And here it's saying that there's one keyspace called commerce, and it's sharded. This is the vindex that exists. And these are the two tables, the customer and the order table, and they use the CID column and this vindex function to shard their tables. So now the VTGate knows about everything, and it can go into the planning phase. The first thing that it does is the semantic analysis, which is basically scoping, binding, and typing. It validated your syntax before, and now it's trying to validate whether you are using the right columns in the right places, whether they are actually scoped correctly, and whether the visibility is correct for the operations you're using. Once it validates that, it goes into the binding phase, where it binds those columns to the tables they belong to, and then it also does the typing, which means it tries to understand what type the results will be once they come back. Second, once the semantic analysis has happened, your AST is converted into an operator tree, which is made of logical operators: you have joins, and then you have tables and so on, so it is converted into those logical operators. And once it has been converted into logical operators, it goes into the optimization phase. In the optimization phase, we basically decide how the plan should look. In Vitess, what we do is we get SQL, and ultimately, after the planning phase, we'll produce other SQL, because the data does not reside on the VTGate, it resides on MySQL, and we want to optimize so that we get as little information as possible back on the VTGate to process your data. So we try to push as much of the query as possible down to MySQL, so that MySQL can resolve it much faster for us than if we do it at the VTGate.
So after the optimization completes, we have something we call routes, which tell us which shards this query will go to. That's what the ultimate step becomes, and that's what we call a physical operator. And once we have the physical operators, we transform the plan into an execution plan. The execution plan is nothing but a collection of what we call engine primitives. So let's look at what the execution plan looks like. Both of the tables which I showed in the VSchema were using the same sharding function, so that's why we were able to make it a single route, which is telling you: just scatter to all the shards, gather the results at the VTGate level, and send them back to the user. But if the two tables, the customer table and the order table, were using different sharding functions, then you cannot merge it like that. Then it will look something like this. It gets too complex to show everything over here, so I'm just showing that there'll be two routes, and then the join will happen at the VTGate level, along with the sorting, the projecting, and the aggregation, and then finally the filtering on which customers have spent those thousand bucks. So everything happens on the VTGate. Let's just look at the two routes that I showed on the left. The first one is basically doing a scatter, and it's asking the order table: give me, for all the customer IDs, the spend they have done. And then on the right-hand side, we need the customer emails. So for every customer ID, I need to get their email, and because the customer ID was the sharding key, the primary vindex for that table, we are able to send the second query to only a specific shard. We don't have to go to all the shards again. So what steps happen in the execution?
So the first thing is, once we get the route, we resolve it using those vindexes, so that it can go to the specific, correct shard. Then, once we know which shard we have to go to, it will also decide that this is a query that we have to send to what we call the VTTablets. In the shard we have VTTablets and MySQL, so it's sent to the VTTablet, and once you receive the result back, you gather the information. But sometimes you also have to do transactions. If a transaction is needed, then the VTGate will also tell the VTTablets to open a transaction when they execute the query, and the transactions will be managed by the VTGate, which will know in which shard which transaction has been opened, so that when you do the commit, it knows which shards it has to send the commit to as well. And the last thing that execution handles is that if you have a select query and you want to read it from the replicas, then it also does the load balancing for you, so it doesn't overload a single replica. So now the query has been sent to the VTTablets, and the VTTablet's job is basically to get the result back to the VTGate. So what are the things that go on in the VTTablet? Very similar things go on in the VTTablet as well, but for different reasons. First, VTGate has some view of what the state of the VTTablet is, but the first thing the VTTablet does when it receives the query is to validate whether that view is still current: am I able to serve this query or not, or has anything changed after the query was sent and before it reached the VTTablet? So it validates that everything is okay, and then it goes and does the parsing.
So it will parse the query, and the reason it has to parse the query is that VTTablet has its checks and balances here, where it tries to put some limitations in place so as not to overload the MySQL resources. So it will add its own limits onto the query so that it doesn't overload your MySQL, which you can configure. Then it goes into the planning phase, and in the planning phase it checks whether you have put in any query rules for bad queries. For example, if somebody does a select star without a WHERE condition, it says, okay, this query cannot be executed because it's a bad query. Those are the kinds of query rules you can put in. Another thing that we check in the planning phase is: this is the user that sent this query, but is the user even allowed to access the tables? If not, then we throw an error back. After the planning phase, VTTablet also does another thing, and again, all of these are there just to avoid loading your MySQL: query consolidation, which checks whether the same query is already running on the MySQL. If it's already running, then it just waits for that query to return the result, and all the threads which are waiting for the same query will get the result and return it. So only one query ultimately gets executed on the MySQL, rather than every copy of the exact same query. So once it concludes that nothing more can be done and it finally has to send the query to MySQL, it will use one of the connections from the connection pool to send the query to MySQL, and once the results are returned, it will send them to VTGate. VTGate, if it has to do some operations, will do them, and finally the client will receive the result.
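The query consolidation idea described above can be sketched in a few lines: the first caller for a given SQL text runs it, and every identical query that arrives while it is still in flight waits for that one result. This is a minimal illustration, not VTTablet's code; the class names, the `execute` callback, and the `consolidated` counter are all hypothetical.

```python
import threading


class _Pending:
    """Shared slot for one in-flight query."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class Consolidator:
    """Run identical in-flight queries once and share the result."""

    def __init__(self, execute):
        self._execute = execute      # callable that actually talks to the database
        self._lock = threading.Lock()
        self._inflight = {}          # SQL text -> _Pending
        self.consolidated = 0        # how many callers reused an in-flight query

    def run(self, sql):
        with self._lock:
            pending = self._inflight.get(sql)
            leader = pending is None
            if leader:
                pending = _Pending()
                self._inflight[sql] = pending
            else:
                self.consolidated += 1
        if leader:
            try:
                pending.result = self._execute(sql)
            except Exception as exc:  # waiters should see the same failure
                pending.error = exc
            finally:
                with self._lock:
                    del self._inflight[sql]
                pending.done.set()
        else:
            pending.done.wait()
        if pending.error is not None:
            raise pending.error
        return pending.result
```

Only queries that overlap in time are merged: once the leader finishes, the entry is removed, so a later identical query is executed again and sees fresh data.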
So yeah, that's what the life of a query is, but there are some cluster operations which also affect your query path. One is when there's planned maintenance going on, like when you're promoting a replica to primary for some reason, and another is when you are splitting your data, like you have N shards and you want to go to 2N shards or N plus one shards or something like that. While those operations are running, Vitess notifies the topo that some operation is going on. VTGate understands that, so it doesn't send the queries to the VTTablet. It buffers them for a certain duration, and once everything comes back, the queries start going to the VTTablet again, or they time out, and VTGate does the timeout itself rather than sending them down. So yeah, thank you. Questions? So currently we don't do cross-shard transactions. We do them, but in a best-effort way. Currently it's on the application to know that it is doing cross-shard transactions. We allow it, but the application should know that it is going cross-shard. Yes? So the join I already showed, in how we have the two routes: from one we get the result, then we go to the other one, and this is the join. We have hash joins available, but then it will consume your memory. Otherwise, from one you get the result and then you send it to the other one, so it's just sequential. Aggregation happens at the VTGate level. It has a sorting layer, so before aggregation we have to sort. Yes, that was in the diagram as well: after the join there was a sort layer, because you have to sort. So we have an in-memory sort. Again there are multiple sorts, based on what we can do best: there's a merge sort, and then you have a complete sort as well. And after sorting, it goes to the aggregation.
So that's why we said we try to push as much as possible to MySQL. We actually push the sort to MySQL as well, if possible, for the merge sort, but if we have to do it in memory then we still sort it. So it depends on whether we can push it down or not. If we are able to push it down, we push as much as possible to MySQL. Yes, thank you. All right, I think we have a question. Thank you very much, thank you. |
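The parameterization step from the rewriting phase of this talk, where the literal 1000 is turned into a parameter so that similar queries share one plan, can be sketched as follows. This is an assumption-laden illustration only: it uses a regular expression on the SQL text, whereas the talk describes working on the AST, and the bind-variable names, the `PlanCache` class, and the column names in the example query are hypothetical.

```python
import re

# String literals first, then standalone numbers; digits inside identifiers
# such as t1 or col_2 are not matched because of the word boundaries.
_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\b\d+(?:\.\d+)?\b")


def normalize(sql):
    """Replace literals with bind variables; return (query, bind values)."""
    binds = {}

    def repl(match):
        text = match.group(0)
        name = f"v{len(binds) + 1}"
        if text.startswith("'"):
            binds[name] = text[1:-1]  # escapes are kept as written
        else:
            binds[name] = float(text) if "." in text else int(text)
        return ":" + name

    return _LITERAL.sub(repl, sql), binds


class PlanCache:
    """Plan each normalized query shape once."""

    def __init__(self, planner):
        self._planner = planner
        self._plans = {}
        self.misses = 0

    def get(self, sql):
        key, binds = normalize(sql)
        if key not in self._plans:
            self.misses += 1
            self._plans[key] = self._planner(key)
        return self._plans[key], binds


if __name__ == "__main__":
    query = ("select c.email from customer c join orders o on c.cid = o.cid "
             "group by c.email having sum(o.amount) >= 1000")
    print(normalize(query))
```

With this cache, the same query with 1000 and with 2000 would be planned only once, and only the bind values differ between the two executions.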
On the road to managed databases |
You're good to start. Okay, so hi everyone. My name is Mohamed, and together with Mykola we both work at Canonical, and today we will be talking about being on the road to managed databases, which is a Canonical effort to make open source operators, as we will see, for a large landscape of open source databases that we will check later on. So in the presentation we'll start by explaining what we are aiming for, basically the vision of our open source operators. We will go into a bit more detail with Mykola about the databases that we chose, and of course we will focus on MySQL for the story. We will explain where we stand right now, we will see a demo, and then we will finish by speaking about the roadmap, in particular for SQL databases. So, what we are aiming for at Canonical. We have our building block for operators. Juju, our operator framework, is the technology that allows us to build operators quite easily for pretty much anything. In this case, we are using it to build operators for databases. And what Juju allows us to do is to abstract what we call the backing clouds. Meaning that it can work with pretty much any of them. There is a list that you can find, hopefully later on when we share these slides on the internet, where we give the exact list of backing clouds, and all of the ones that I mention here are supported, including orchestrators on top of backing clouds. One of them is LXD, which can allow you, for example, to emulate a highly available setup on a single machine or even across machines. It's something that you can use as your backing cloud or infrastructure provider as well, if you wish. And on top of that, we are building open source operators for a list of databases, so MySQL for sure, but also Postgres, MongoDB, OpenSearch, and more to come.
And the idea is that we would like to automate a number of services or operations around databases, including deploying a highly available setup, as you will see later, scaling out and in, adding or removing nodes, scaling up and down, adding or removing resources, and of course doing backups, restores, and upgrades. So this is basically our vision, and Mykola, I think, will talk about the database. So unfortunately, Amazon hasn't killed all the databases, especially on premise, so we will try. Yeah, basically, the tech stack on which we are building our solution for MySQL is for sure not MariaDB. We are using MySQL, and we are using InnoDB Cluster. Of course it is group replication, virtually synchronous replication, a sort of synchronous replication. We are using MySQL Shell, because a lot of the knowledge of Oracle experts has been put into it. It creates a lot of painful things for us, but we suffer and continue using MySQL Shell. We are using MySQL Router, Percona XtraBackup, and for metrics, not the Percona monitoring tools, sorry, but the MySQL exporter. Yeah, so group replication, why group replication? Because, that's simple, we don't want to lose data. And we want to have good performance compared to semi-synchronous replication. Yes, it looks that simple. In fact, yes, on top of Juju it is an additional sub-layer in the system, but anyway it simplifies a lot of operations. Yeah, how it looks in reality, what we have right now, at this concrete moment: with a simple juju deploy mysql command, you can get an automatically deployed InnoDB Cluster. You can scale it. It has self-healing, meaning that if one member dies, it can be recovered on restart. If data is not lost, it can do an incremental state transfer. It can do a full state transfer if you lost all the data or the data is not recoverable. It can also self-heal after a full cluster crash.
It is not a fully automated thing in the case of MySQL. I saw news today from Poland that it is automated already in Galera 4, but not in MySQL. Full cluster crash. Yeah, of course, vertical and horizontal scaling, meaning that you can add members, and you can remove members, for example from old hardware. We have a kind of user management, meaning that the operator, which is an application, uses a set of internal system users for backups, for monitoring, for administration of the cluster. We have user management for system users, and for Juju-native applications, which doesn't say anything to you, I'm sure, we also have user management, so when a new application connects through Juju, it automatically receives a MySQL user, and the user can be rotated, and the user has a limited set of operations. Yeah, we have MySQL Router in place. We see it, of course, as a thing which should run on the application side, and in our understanding it is designed to be run on the application side. It works well if you have the same version between MySQL and MySQL Router, if you are lucky and can update your application servers when you update the database servers, of course. If that condition is true, MySQL Router is a great product. Also, encryption in transit and some basic, centralized certificate management, and, of course, several days ago we merged the backup functionality. Not restore yet, but we are working hard. That's probably it about the current state. So, how this magic works in reality, it looks like this. Juju is deploying MySQL. It runs three members. If you see something, you are lucky. Active addresses. Yeah, it is repeating. Another example, of scaling. Again, if you are lucky and see something on the screen, juju add-unit, one. You can see the status, and the new unit. It is quick because the database is empty. But, in fact, if you have a lot of data, when a new member comes, it needs to transfer the data from an old member to the new member.
Of course, this is InnoDB Cluster, so the functionality is automatic in InnoDB Cluster. No worries, although "automatically" does not always work in the case of MySQL. You sometimes need to force SST, and we are doing that functionality now. Yes, Kenny, I'm making notes. It doesn't work. I can't see anything. I can show you the code. That's probably it about live demos and Roslap. What is it saying? I'm just making this from addition, so we will talk about what we are planning to do next. I hope we are back now. Okay, that is basically what we called the entity before. We are using, let's say, the naming speed there for the control. So, the current version that we will be releasing, say, around April, will contain what you can see, of course, already. But we are working on finishing the restore; as I said, it is not yet merged, but we are working on it. So, we haven't done the documentation, but it's not so difficult, and we are working on making that available for you. Observability: it was mentioned already that we are using, let's say, an exporter. Hopefully, you will have all the stats that are provided, actually, by Juju. Not all of the work is done. I mean, some of the work is already done by Juju, let's say, and for Shaxx and Beach. And, of course, the English and English team for our architecture, including the operator, will probably be snazzy. But we might be using it for some reason. So, this is basically what we are working on. We have confidence that this should be delivered. Just to give you, let's say, an impression. Thank you. Take a look. So, that's it about our presentation. I brought some sweets as well that you can enjoy for free. And we would be even happier if you ask questions. May I ask a question about... Obviously, I think you guys raised these happy words. Group replication has a maximum number of members.
So, if somebody wants more members, more feedback, what does your system do? Yes, so, basically, that's right. Group replication has a limitation of nine members. What is the suggested way? We think that we will use ClusterSets. Very soon, we will have a design for how to make this happen in the case of Juju. So, basically, we will have several InnoDB ClusterSets, if you know what that is. Basically, it is two different group replication clusters, with asynchronous replication between them. One, two, three, five. And is this underlying CR exposed to the user? Or do they have to use the CLI? Depends on what. So, if you use the features of a managed database, I mean, you want to scale, you want to enable TLS, you want to change the password for an administrative system user, in that case you are using the Juju CLI. If you want to do something with the database, you are creating a database. But it depends. Just to explain this. What you can say is that, I mean, the operator, our code, is really about managing the database. You can get the regular secret connection and do whatever you would like. Your application does not have to be within Juju. We can configure it in a way so that your application is outside. At least, the part I want to understand is whether it is a Kubernetes operator. Can you install a CRD that exposes the CRs? No, no, no. It is, yes and no: Juju is an operator in the meaning of what an operator is, but it doesn't work with CRDs. You can think of it as an orchestrator that can work with not only, I mean, Kubernetes is one of multiple platforms. But it's a good way of thinking about it, as an orchestrator. It depends on the character. It can play a role as an intermediate layer. That is, of course, completely my reading of the slide. And at the same time, as a different user, I would guess that it plays the role of an intermediate layer between the user and the system. Because, well, I see that all the interaction was there.
So, is it true that it's actually a component which plays on both fields? Yes, let me answer. So, basically, Juju is a kind of automation, a middle-layer automation, and we built our operators on top of Juju. In that case, we don't care what the file system is. We say, hey, give me the disk there, in that directory. And we don't care about multiple field networking and so on. And it has a lot of patterns for process management, where you can say, hey, run me that process with that privilege, constantly, or run it once when I am doing some action. So, it is a kind of operator framework, but an operator framework which uses some unified level of hardware under the hood. And it works equally well on OpenStack as on Kubernetes. So, it is a bit more advanced, I would say. And in the same way, it handles the interaction with the user? Yes, absolutely. Thank you. Perfect. One more question. Thank you. |
Lower your isolation level with ProxySQL
Adapt your Galera cluster setup to your needs using ProxySQL |
First, because the last time I did a presentation I forgot to say this and people were asking afterwards: I am a systems engineer working at ProxySQL, my name is Javier, and for any more information, if you want to follow us, that's GitHub and GitLab, and that is where I usually am. But maybe a more interesting question is how I ended up here, and this is just because of this conference, because at this conference I found this a couple of years ago. Yes, it was in 2020, in a certain corner, and we are still hiring since then, and I am speeding up a little bit here. So if you are a developer and you find yourself meeting any of those conditions, so. You tell me when I can go, okay. So a brief introduction to ProxySQL, and I am going to be very brief: a high-performance, protocol-aware proxy for MySQL, and the focus is scalability, flexibility, zero downtime, and just the basic topology, right, like any other proxy offers. So these slides are normally like three or four, but in this presentation we are only going to care about the first three of those: load balancing, query routing, cluster monitoring. We are not going to see all of them, not even those three today, but yeah, there is also the SQL firewall, statistics, mirroring, and the statistics part you could develop a lot, but we are mostly going to care about load balancing and query routing. Cluster monitoring will probably also be important if you are using what we are talking about today, and we have done a lot of work there lately, but let's move on and say what Galera Cluster is, and what the cluster is that we are experimenting with today. So: a multi-primary cluster with synchronous replication, an easy-to-use, high-availability solution, so it has all the goodies of a multi-primary cluster, right?
So, in essence, it's multi-primary, not primary-replica, and it uses synchronous replication, and I put this disclaimer here because that is how they officially announce it. It is virtually synchronous: even if it is logically synchronous, the actual writing and committing happens independently on each node. That is very important, the fact that it's virtually synchronous, because what we are going to care about is the definition of the isolation levels per cluster node and per cluster itself. So, the typical isolation levels that we have are read committed and repeatable read. Galera says that all these isolation levels are available for you at cluster level, the default one being repeatable read, and, you know, serializable. Obviously serializable solves all of the problems in this talk, but with caveats that make it pointless for the purpose of the talk, mostly performance. So repeatable read offers non-dirty reads, and reads remain repeatable during transactions, but it suffers from the lost update problem.
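To make the lost update problem concrete, here is a minimal Python sketch. It is not Galera code: the `Row` class, the version counter and both function names are hypothetical, and the second function only illustrates one common protection (rejecting a commit when the row changed since it was read), under the assumption that two transactions interleave exactly as written.

```python
class Row:
    """Hypothetical in-memory row, used only to illustrate the anomaly."""

    def __init__(self, value):
        self.value = value
        self.version = 0  # bumped on every committed write


def lost_update(row):
    """Two transactions read the same value, then both write back."""
    t1_read = row.value        # T1 reads the current value
    t2_read = row.value        # T2 reads the same value
    row.value = t2_read + 50   # T2 commits its update
    row.version += 1
    row.value = t1_read - 30   # T1 commits from its old read: T2's +50 is lost
    row.version += 1
    return row.value


def first_committer_wins(row):
    """Same interleaving, but a commit fails if the row changed since the read."""
    t1_seen, t1_read = row.version, row.value
    t2_seen, t2_read = row.version, row.value
    # T2 commits first; the version still matches what it saw, so it wins.
    if row.version == t2_seen:
        row.value = t2_read + 50
        row.version += 1
    # T1 now commits against a stale snapshot and is rejected.
    if row.version != t1_seen:
        return row.value, "T1 aborted"
    row.value = t1_read - 30
    row.version += 1
    return row.value, "T1 committed"
```

Calling `lost_update(Row(100))` silently discards the second transaction's update, while `first_committer_wins(Row(100))` keeps it and forces the first transaction to retry, which is the kind of conflict an application sees as a failed or deadlocked transaction.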
I am making this distinction here because, going back to the isolation levels, we have them per cluster and per node, and repeatable read at node level has some issues that you are going to experience a lot more at cluster level. Because of that, Galera has another kind of isolation during replication, which is a form of, sorry, I am just going to get a little bit more into the lost update first. The lost update, for anyone who is not familiar, is just that during one transaction you perform a read, then another transaction performs a read and an update, and then the first transaction performs a write, and the update from the second transaction may be lost. It is something that can happen a lot more at cluster level, so in Galera they have a form of snapshot isolation that is a reinforcement of repeatable read and is essentially "first committer wins". That leads to deadlocks, but it is a protection against the lost update problem. I have said all this to say that, okay, we have a lot of consistency across the cluster nodes, and this isolation level is respected as it would be within one node, but there is something missing. What is the next question? What about the semantics: are the semantics of these isolation levels preserved at cluster level? They are not. For that we have wsrep_sync_wait, which enforces the read committed semantics at cluster level, so once you have that you don't have any potential read-after-write problem, right? But at what cost? You have now elevated the semantics to the semantics that you probably wanted from the beginning at cluster level, but what is the cost of that? So we are going to provide some numbers, and, as in every benchmarking thing, there are a lot of things that we
can discuss here. I am just going to provide some numbers that I think are representative of what we have tested, but, you know, there can always be discussion. So this is the system, and I am only saying that for the setup that you are going to see: if you have a system that is alike, you will see that the CPU is not the bottleneck, memory is not the bottleneck, and disk is not the bottleneck. What we are seeing is the performance of a cluster where there is no resource constraint on more than one resource at the same time. So the versions, these are the versions, and the infra will probably be available if someone wants to try it. This is the version that I am using, and the network is a dockerized infra with a 500 microsecond delay imposed, for a one millisecond RTT between the nodes. That is because, you know, you cannot benchmark a cluster with zero network latency, because you are killing the whole point of benchmarking. So, the cluster, let's start talking about some numbers: single-primary writes against multi-primary writes. First you benchmark like this and you see that multi-primary writes outperform single-primary writes, and you are like, what, that's wonderful, that's what I wanted, more nodes, more writes, around 70% more writes, that's incredible, right? Okay, this is not the truth, and why is it not the truth? Because there are a lot of lies that we tell ourselves when we do benchmarking, and this is pretty much the super ideal scenario for multi-primary replication, which is: transactions, only writes, I don't care about the many synchronization problems that will arise from the isolation levels that we talked about before, and all nodes are equally busy because I have decided that the load is distributed and the queries are all the same and all take the same time. Okay, of course, perfect throughput, because everything takes the same time all the time.
Same with reads, only reads, and you have a crazy amount of throughput, but it's not outperforming like before. Now again, the same idea, but now mixed read-write, and now we start seeing some more real stuff. So what happens when I mix the load? Well, obviously numbers go down very hard. These are the cluster writes and the cluster reads. We have dropped from 30,000 to around 10,000 reads, and we have dropped from, I think it was like 5,000 or 6,000 writes to what looks like between 2,000 and 3,000. So it's about half the writes, and the reads are probably one-third. This is an equally distributed load. Now we compare the cluster with just one node of the cluster. It's all the same load, this mixed load, but against one node. And now we have 15% more writes and 19% more reads, but it reverses: now the single node is outperforming the whole cluster. This makes sense, because with a mixed load like that against all the nodes you have a lot of collisions, you have a lot of problems, and that's all gone on one single node. But this is nothing that you were not suspecting, this is not a problem, this is just usual. This is how it's supposed to be. What we want is to increase the reads, because we had 30,000 reads and now we have 10,000, so let's try to improve that. So let's make a read-write split over the cluster and see what happens. So we do a read-write split, and this is HAProxy, by the way; this first part of the benchmarks is going to be HAProxy because this is the dumbest read-write split. We have a port for reads, we have a port for writes, and that's it. So the cluster reads, we see that they are more or less the same as we had before, but then we have the split reads, which go insanely up, and our writes have gone down a little bit.
So we are compromising our writes for our reads, and that also makes sense, because by doing that amount of reads in the whole cluster we are creating a lot of pressure on the other cluster nodes, because this is maximum cluster throughput, like 10%, 50% of the total cluster throughput on reads, where it is fully on reads. So you're compromising a lot, and you're losing some writes. So well, that's okay, that's okay. But we are on this journey of talking about the semantics and the synchronization, and I have just done the dumbest read-write split, in which I don't care about any read-after-write or anything, right, the semantics. So what happens if I now enable the synchronization for the readers, and I enable the full cluster read committed semantics? What happens is that we go back to basically the same performance that we were having against the whole cluster, which makes sense, because now, instead of writing against everyone, you are writing against one and reading from the others, but waiting for the replication. So we can see that the reads end up in the same range: split reads still win, but it's marginal, and split writes still win a little bit, but it's also marginal. So now we have the whole cluster having the nice semantics that we wanted, et cetera, but we haven't fixed our throughput. So what can we do in this scenario? Because it looks like we need a little bit more complicated logic; we are at square one of the problem. Reads are, what do I say, I'm sorry, reads are above the original read-write, writes are below the original read-write, and we still need protections for our critical reads.
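The cost of "reading from the others, but waiting for the replication" can be pictured with a toy model in Python. This is only a sketch under stated assumptions: the `Node` class, the sequence numbers and the `sync_wait` flag are hypothetical names invented for the illustration, and it does not reproduce how Galera's wsrep_sync_wait is implemented. It shows just the semantics: a reader that does not wait may return a stale value, and a reader that waits must first apply everything committed before the read started.

```python
from collections import deque


class Node:
    """Hypothetical replica that applies replicated writes from a queue."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # write-sets received but not yet applied
        self.applied_seqno = 0

    def receive(self, seqno, key, value):
        self.pending.append((seqno, key, value))

    def apply_one(self):
        seqno, key, value = self.pending.popleft()
        self.data[key] = value
        self.applied_seqno = seqno

    def read(self, key, cluster_seqno, sync_wait=False):
        if sync_wait:
            # Catch up with everything committed cluster-wide before the read.
            # A real node would block here; the toy applies the backlog inline.
            while self.applied_seqno < cluster_seqno:
                self.apply_one()
        return self.data.get(key)


def demo():
    writer, reader = Node(), Node()
    cluster_seqno = 1
    for node in (writer, reader):           # initial value, applied everywhere
        node.receive(cluster_seqno, "balance", 100)
        node.apply_one()
    cluster_seqno = 2                        # new write, committed on the writer
    writer.receive(cluster_seqno, "balance", 150)
    writer.apply_one()
    reader.receive(cluster_seqno, "balance", 150)  # received, not yet applied
    stale = reader.read("balance", cluster_seqno)
    fresh = reader.read("balance", cluster_seqno, sync_wait=True)
    return stale, fresh
```

In this model the extra work done inside `read` when `sync_wait` is set stands in for the waiting time that pulls the split reads back down in the benchmark.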
Okay, an alternative for avoiding this would be using a single writer for a multi-primary cluster, which, from what we have seen before, shouldn't be a very bad performance trade-off in terms of whole-cluster writes. Then we redirect the critical reads only to the primary, and we choose to redirect all the non-critical reads to the other replicas. So, critical reads to the primary, okay, and now we bring ProxySQL into the picture, because that's something that we can do with ProxySQL very easily. Contrary to the previous setup, it offers you both of the things that you need, which are redirection and throughput control. The testing scenario: we are going to have a 10/90 ratio of writes versus reads, and we are going to have 5% critical reads and 95% regular reads. Okay, I would say that changing this into a more balanced ratio wouldn't impact the performance so much. Total throughput is what you care about, okay, so the ratio is not as important as how much you are going to stress the cluster with the throughput that you want for that ratio. So now we bring ProxySQL into the picture. In this scenario, these are the non-critical reads, these are the critical reads, and these are the writes. So we have improved almost 50% on non-critical reads, we are retaining more or less the write load, and we have an extra 1,000 reads in the critical reads. So we have been able to preserve the write throughput and increase the read throughput by 50%, and this is compared against the whole-cluster read load with the synchronization enabled. And this is the most important part of it.
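The routing idea above (single writer, critical reads to the primary, other reads to the replicas, plus a cap on read throughput) can be sketched in a few lines of Python. This is an illustration only and not ProxySQL configuration: the `Router` class, the query kinds, the per-tick cap and the node names are all hypothetical, and the sketch assumes that the 5% of critical reads is taken as a share of the reads.

```python
import itertools


def workload_mix(total_queries, write_pct=10, critical_read_pct=5):
    """Split a query count into writes, critical reads and regular reads."""
    writes = total_queries * write_pct // 100
    reads = total_queries - writes
    critical = reads * critical_read_pct // 100
    return {"write": writes, "critical_read": critical, "read": reads - critical}


class Router:
    """Toy router: one writer, critical reads to the writer, other reads
    round-robin over the replicas, with a cap on replica reads per tick."""

    def __init__(self, writer, replicas, max_replica_reads_per_tick):
        self.writer = writer
        self._replicas = itertools.cycle(replicas)
        self.max_replica_reads = max_replica_reads_per_tick
        self._replica_reads = 0

    def new_tick(self):
        """Start a new time window and reset the read budget."""
        self._replica_reads = 0

    def route(self, kind):
        if kind in ("write", "critical_read"):
            return self.writer
        if kind == "read":
            if self._replica_reads >= self.max_replica_reads:
                return None  # throttled: the caller must wait for the next tick
            self._replica_reads += 1
            return next(self._replicas)
        raise ValueError(f"unknown query kind: {kind}")
```

For 1,000 queries, `workload_mix(1000)` gives 100 writes, 45 critical reads and 855 regular reads. The cap in `route` is the part that keeps non-critical reads from consuming the throughput that the writer needs.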
It is like this because I decided that it was going to be like this, because if I went back here and decided that I am not going to limit the throughput on the reads, I would go up to almost the whole cluster throughput, and I would compromise the writes again, which would go below what you were seeing before in any other benchmark. And why is that? Because the total cluster throughput is what it is. So what you need is to control what you want to do with that throughput. You cannot expect just to get more reads for free. So, the conclusions: multi-primary clusters can be of a lot of benefit for you, with the proper setup. And it's just like this. System measurement and analysis is really hard. Really, really hard. And benchmarking especially is also very hard, because most of the things that I have said here are right, but they are right in this scenario. If you change the scenario slightly, they may not be. Adapting the system to your workload is what you always want to do. A different workload will change everything. The final conclusion is that control is everything. Being able to control the throughput and being able to control what you want to do with your load is what is going to decide the performance of the cluster, more than anything else. And for that, ProxySQL is a great tool. And thank you a lot for your attention, and happy specificity. You have five minutes for the questions. Okay. How do you know the maximum throughput of a cluster? You just measure. You create an artificial environment, I would say, which is why I say that it's very hard, because it's probably not going to be the typical load that you're going to have. You try to replicate it, and then you measure, because otherwise... I would say that you are not going to know which is the limiting factor until you try. What did you use to measure the workload? sysbench.
I was using sysbench, the Lua one with all the scripts, and the old, very old, legacy one. You can do the same thing with the Lua one, you can create your own things, et cetera, but I also wanted to benchmark what happens if I am selecting from different tables than the one that I am writing to, and that kind of thing. And that kind of table selection is not something that you have in the Lua one by default. You also don't have throughput limiting by default, and it's something that you have in the old one in the options. So, for convenience, I used a mix. Which, by the way, impacts the performance, depending on what you're measuring. If you're measuring against ProxySQL or against the other proxy, it's different, depending even on the tool that you're choosing. You said that you used sysbench. I wanted to ask, because in my tests, when I was trying to write to multiple nodes, I noticed that the drop in write performance was because of write conflicts. Yes. So it's not clear to me how you're achieving better throughput for multi-writing, because you need to certify all writes. How I achieve better throughput writing to the whole cluster than to a single node? Yeah. Because from what I see, you have better performance writing to multiple nodes than to a single node. Probably because I was having very, very few conflicts, because the size of the tables that I chose was very big. I was having very, very few conflicts during that testing. It was a very, very favorable scenario for writing. That's why I say that it's super ideal. When I say that it was ideal, it was because it's super, super ideal. Yeah. I don't know. Thank you very much. Thank you. Thank you. |
Extending MySQL with component infrastructure
will MySQL be out of space soon ? |
Well, with something you want to do. Okay, I think we can start. Let's see if this works. Yes. So, yeah, who am I? I am Frédéric Descamps. I'm local. I'm from Belgium. So, this is why I'm often here. You can follow me on Twitter as lefred. So, if you have any questions, just ask me. I'm quite an old MySQL user. I started with MySQL 3.20. I was thinking it was 3.23, but it was 3.20. I found the CD at home. Yes, a CD, for people who don't know what it is. Before there was a program on it, and before that there was the floppy. I also knew the floppy. So, as I said, I live in Belgium, and I have a blog where you can find a lot of information, mostly related to MySQL. So, today I'm going to talk about the component infrastructure. I'm very sorry. Usually we are in a very dark room, and I changed my slides to be dark slides, to make them easy to see, but it seems not okay. So, we will see that if you want to modify MySQL, you have different ways. Creating a storage engine is one way to modify MySQL: I want to have my own storage engine to store the data like I want. Or I want to create a plugin. The next session is about that, right, Vinicius? Or by creating a component. We at MySQL encourage you to use this component infrastructure rather than a plugin, for example, if you want to extend MySQL. It's very, very not easy to read for you, sorry. So, what is this MySQL component infrastructure? It's a modular design for the MySQL server that allows developers to extend the server, like I said earlier, in different ways, such as adding support for new functions, new performance schema tables, new status variables, new system variables. For all that kind of stuff that you want to add to MySQL, you will be able to use the component infrastructure. And so, what does it mean? It means that the server will provide you with services that you can use to extend MySQL.
And there is, you don't see it very well here, a component service inventory page. The URL is here. Yes, you will have the slides online, so you can read them and copy the URL. And something very nice is that this component infrastructure is, how do you say that, being enhanced all the time. So, for example, in MySQL 8.0.28 we had 137 services defined that you could use, and in 8.0.32 it's already 162 services. So, if there is something you would like to do and you don't know how to do it, it's something you can ask us. I don't say it's guaranteed that you will have it in the next version, but it's something we are doing. We try to improve. And internally, it's the same. When one of the teams needs something extra, that team now creates services that we can use, to make it more modular like that. So, why do we need to use these MySQL components? It's because the component subsystem was designed to address some issues with the plugins. For example, a plugin can only talk to the server and not to other plugins. Here, with components, you can have some components talking to other components. That's something possible. The plugins have access to the server symbols and they can call them directly, so there is no real encapsulation. And, we were discussing that, it is maybe even possible to create a component, compile it, and then load it into another version of MySQL. This is not something I would recommend doing, but it's possible. And you can also create dependencies between components, which is very nice. So, we're going to create a component together. I will do it, but you can see how it works, and my plan is to show you that it's not complicated and that you will be able to create your own component. So, first thing, and this is for me one of the most difficult parts when we want to extend MySQL: oh, cool, I want to extend it, but what will I do?
And sometimes on Slack or forums, I have people saying, oh, I'm a young developer, I would like to help the MySQL community. Yes? What? Yes, I need to say yes. Yeah, but why not? But people say, oh, I am learning C++, or I want to be in the MySQL community, I want to create something, and what could I do? That's the point. You know, like Colin was telling earlier, we don't have a list of features that we ask the community to do. It's more that when the community wants to enhance something or fix something, they will do it, right? So here, the most complicated part is what will I do, how will I extend it to my needs? So this is what we're going to define. The first thing is that, to be able to use what we're going to build, I want to require a specific privilege. So not everybody will be able to use the extension we're going to write, and I decided to use the SENSITIVE_VARIABLES_OBSERVER privilege that was added in 8.0.29. You can also, if you want, create a new privilege for what you are doing. I don't know if you have seen it, but I wrote a series of 13 blog posts on creating a component, covering a lot of the things needed to do it, and one of them was to create a privilege and use that privilege. Then we're going to read the value of some predefined variables on the server and, let's say, extract the path from it, and then create a performance schema table with the paths that we extracted from these variables, and see how much free space there is and how much space has been used on the storage related to each path. So it's simple, but it can be very useful, because as a DBA you don't need to have access to the file system to check that. To be able to do this, we will of course need to use several services: to get the information about the privileges, so the security context of the thread; to have access to the performance schema to create a table; and to create the different fields, like here we're going to use a table, but we want to have some bigint and stuff like that.
We also need the log built-ins to create error messages. We need to send messages to the user, for example, oh, you don't have the privilege to access this table, so we need to subscribe to and use all these services to do that. Yes. These are services, and I will show you how to do that: you require these services, then you call them and you get the answer, and it's much easier than in a plugin, for example. Yes, so the URL I was showing you earlier is the list of all the services that you can use and how to use them. So which variables are we going to check? If you're a MySQL DBA, how many variables define a path that you need to check? Do you have an idea? One, two, three, more, less? It's a quiz. It's a lot, and more and more. So, for example, we have log_bin_basename, where we should put the binary logs. We can put them somewhere. The datadir, hopefully you know what it means: it is where the data will be. tmpdir, innodb_undo_directory, innodb_data_home_dir. So all these can be different paths, on different partitions, where we can store data, right? So I will use a vector of strings here and I will put all these variables in it. This is predefined, because I know that if there is some value in these variables, I will check it. Of course, if you want, you can extend that and create, for example, a variable where you will put the names of the variables to list. It's possible. It just depends on what you want to do. And this is, for example, what you call when you want to register a new variable. For example, this variable that I called, right? This is how you call mysql_service_component_sys_variable_register, and you register your variable. Okay? So let's go, we are in a hacker conference, right? So our component will look like this. In the MySQL server source directory, there is a components folder. And in that folder, I will create my disk_size component. So it's another folder.
I will have a CMakeLists file, and then a disk_size source file, a disk_size header, and a disk_size_pfs file where I put the stuff for the performance schema. If we have time, and I think we may have, I will just show you the content of the files. But before that, I will show you some part of it and how to use it. So, for example, if you want to write to the error log, every component needs to have a tag. So I don't know if this will work. Seems not, oops. Where is it? Yeah. Yeah, it's with the, yeah. I need to be very slow. Very slow, yeah. You see here? It's not that useful. You define LOG_COMPONENT_TAG, and it's disk_size. So every error message that I create will have this disk_size tag in it. So I will require a service. I need to use the log_builtins and log_builtins_string services because I want to send strings, right? And this is the service type I am going to use, and I'm defining it here. And then, for example, when I initialize my component, right, I will call this here and initialize the log objects. And then I just do LogComponentErr with the type of error it is, it can be a warning, here it is INFORMATION_LEVEL, then ER_LOG_PRINTF_MSG and "initializing". So every time I load the plugin, I mean the component, I will have this. So this is how we do it. It's very easy. This is some code that you will reuse all the time. And then, in your component, this is the call if you want to print an error message. So usually it's much easier than in a plugin. To check a privilege, I create here a function for the required privilege, right? And I will pass the THD. And getting the context, getting the security context of the THD, all of those are services. And then we check: okay, do we have access to this privilege? If not, I will print an error. If it's okay, we can continue. So here we check the privilege. So this is how we do it. And it's quite easy to do all that kind of stuff like this.
And for example, if you want to access a global variable, it's again the same: from this variable register service, you call get variable, put in the name of the variable, and you will get the data out of it. So this is how it works. And it's quite easy. And you can extend that. Like I said, there are 162 services to create performance schema tables and to get information for plenty of stuff. So it is very, very easy. So what does it look like when we run this component, right? The first thing we do, if you can read it, is install the component. So we do INSTALL COMPONENT and then 'file://component_disk_size'. By default, the names of the components start with component_. But you can change that if you want. You can see it's okay. It means it has been loaded, right? We can check in the error log, because remember we were printing "initializing" when we were loading the component, and in MySQL the error log is also available as a performance schema table, so you don't need to go and tail a file. We can see that, okay, component disk_size reported "initializing", and component disk_size reported that the performance schema table has been registered successfully. So all that information comes from the component and you can find it in the error log. So that's the first thing. We are happy because we were able to load the component and we see in the error log what the component has written. And then we can also see here how to really use it. So we do SELECT * FROM performance_schema.disk_size, which is the performance schema table we created. It gives us all the data: the directory or path, which variable it comes from, right, and the free size and the total size in bytes. So this is quite useful. You can check without having to, well, afterwards you can do whatever you want with this information, but the point was to show something relevant and not just a hello world to you guys. I think this is more useful.
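The logic behind the table can be sketched outside the server in a few lines of Python. This is not the component's code, which is written in C++ against the MySQL services: the function name, the column names and the way a file prefix such as log_bin_basename is reduced to its directory are assumptions made for the illustration. It only shows the calculation the talk describes: map each variable to a path, then report free and total bytes for the file system behind it.

```python
import os
import shutil


def disk_size_rows(paths_by_variable):
    """Return one row per variable with free and total bytes for its path."""
    rows = []
    for variable, path in paths_by_variable.items():
        # Some variables hold a file prefix rather than a directory
        # (assumption for this sketch), so fall back to the parent directory.
        directory = path if os.path.isdir(path) else (os.path.dirname(path) or ".")
        usage = shutil.disk_usage(directory)
        used_pct = 0.0
        if usage.total:
            used_pct = round(100 * (usage.total - usage.free) / usage.total, 2)
        rows.append({
            "variable": variable,
            "directory": directory,
            "free_bytes": usage.free,
            "total_bytes": usage.total,
            "used_pct": used_pct,
        })
    return rows


if __name__ == "__main__":
    # Placeholder paths; on a real server they would come from the variables.
    for row in disk_size_rows({"datadir": ".", "log_bin_basename": "./binlog"}):
        print(row)
```

When every path lives on the same disk, all rows report the same numbers, which matches what the speaker observes on a laptop with a single SSD.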
And for example, yeah, you can display it or play with the newer functions like FORMAT_BYTES and so on. And you can see here the free size, the total size, and what percentage of the disk is used, depending on where you want to check. What you see here is that everything points to the same SSD on my laptop, and this is why you always have the same numbers: in the end it's the same disk. But if you use different disks, you will have different values here. As for the error: like I said earlier, we created and were using a specific privilege. So here I'm using another user, which is called Restore. It was for Doc Store; we were discussing Doc Store earlier. So I want to check the Performance Schema table with that user, and you can see that the user doesn't have the privilege we want. It says, OK, this SELECT is denied. So I don't give access to the Performance Schema table to everybody, just to some specific users. Now some information about the MySQL components. You can see which components are loaded: you do SELECT * FROM mysql.component. There is a system table called component, and you can see which components are loaded. Sometimes a component can be part of a group, and then you will see if they're in the same group or not, which is not the case here. But you can see here that I have several; the query attributes feature is a component, for example. And then there are other components that I'm playing with on my system, like UUID v7 and stuff like that, which you can create if you want. Where are these components on disk? They are in the plugin directory, same as the plugins. So where you see the plugins, you also have the components, and by default the name starts with component. So the disk size one we just created is here, component disk size. And you can see there are already several components in MySQL by default, because we are also using our own infrastructure in MySQL.
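To make the arithmetic behind the free size, total size and percent-used columns concrete, here is a minimal Python sketch. It is not the component's code (the component is a server-side MySQL component): `format_bytes`, `percent_used` and `disk_row` are hypothetical helpers, and `format_bytes` only imitates the spirit of MySQL's FORMAT_BYTES function.

```python
import shutil


def format_bytes(n: float) -> str:
    """Render a byte count with binary units (hypothetical helper,
    loosely modelled on the idea of MySQL's FORMAT_BYTES)."""
    units = ["bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            if unit == "bytes":
                return f"{int(value)} bytes"
            return f"{value:.2f} {unit}"
        value /= 1024


def percent_used(free: int, total: int) -> float:
    """Share of the disk in use, given free and total sizes in bytes."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * (total - free) / total, 2)


def disk_row(path: str) -> dict:
    """Build one row similar to what the talk shows: path, free, total, used %."""
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "free": format_bytes(usage.free),
        "total": format_bytes(usage.total),
        "used_pct": percent_used(usage.free, usage.total),
    }


if __name__ == "__main__":
    print(disk_row("."))
```

As in the talk, two paths that live on the same physical disk will report the same numbers.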
So one question that I've heard about components is: when we load a component and then restart the server, what happens with that component? In fact, all the components will be loaded again. So if you install the component, it will still be installed when you restart. There is no configuration in my.cnf. You need to load it once and then it stays loaded. Now, what happens if I remove from disk the component that was installed, because I like to suffer, right? In fact, you will have a message that says this component cannot be opened and there is an issue. But that's the only thing. It won't block the server. It won't crash. It will just tell you that it is not there, and it won't be available to you. So if you are interested in seeing the full code of the component, it's on GitHub. On my GitHub, MySQL component disk size, you can find it. And now it's your turn. Please show me what you can do, what you can invent and what you can bring to MySQL, because I think the component infrastructure is something very cool that you can use to add features to MySQL. And I hope, Vinicius, that you will change your next talk to a component for next year, and not a plugin, right? So thank you very much for this. Share your love for MySQL on social media. Join us on the community Slack, where plenty of people are there to answer questions: from development, from consulting, plenty of people from around the world, not working for MySQL but part of the MySQL community, are also there to answer you. Do you have questions? |
Extended observability to agentless monitoring on MySQL using ProcFS UDF plugin |
Thanks, guys, for holding on until the last presentation. We've been through three days, for those who were at the pre-FOSDEM days. I know it's hard, a lot of information, my brain is half melted, but I hope this won't be the 20 longest minutes of your life. lefred was talking about components, which I didn't know; I will talk about plugins, but I think the good thing about FOSDEM is that you leave with some insights on what to do next, so let's see. The objective of this talk is to present the ProcFS plugin that Percona developed so that you can monitor Linux metrics without an agent on the server. I know I saw some of you already at pre-FOSDEM, but I'm Vinicius. I've worked for Percona for almost six years, I've been working with databases for plenty of time, and I'm also the co-author of the book Learning MySQL. My colleague is here, the other author, Sergei. This is our agenda for the day: what to monitor in a database, let's get some insights, and why monitor; what the difference is between agent and agentless monitoring and what the pros and cons of each option are, which we will go through in detail; then we will go through the ProcFS plugin. You will see it's very easy to install. We'll cover how to use it, natively or with Prometheus, and an example of how to integrate it with PMM from Percona. For those who don't know, PMM is a monitoring tool, a bundle of several open source projects and components glued together to provide monitoring and observability. And if you want, you can ask me questions anytime. So what to monitor in a database? Usually we like to monitor, for example, customer response time. So if we are Booking and I'm trying to search for hotels, I want my search to return in, let's say, less than 10 milliseconds. Also QPS: I want my search to return in less than 10 milliseconds, but I also want the search for 10,000 people at the same time to return in under 10 milliseconds.
Then, understanding workload behavior: what is the busiest day, what is the busiest hour of my workload, is there something different, some anomaly? This can indicate a security breach. Also the most basic infrastructure components: we need to verify that our equipment is working correctly, so whether the network has a lot of retransmissions, the files, the storage, whether my power supply is failing, all this kind of stuff that needs to work, otherwise we won't have the service. And as I said, resource utilization, not only to see what is going on now but also to predict the future, security breaches and so on. Maybe each company has more relevant metrics than the ones that I'm showing here, but this is just an overall example of what is most commonly monitored around databases. So why monitor? It helps diagnose issues that have already happened. For example, something happened over the weekend and I don't know what. You need a monitoring tool, because if you open the database now, you won't see the metrics from the time the problem was going on. We can understand issues that are happening right now, and we can also be proactive: look, my load is rising, rising, rising, I think I'm going to have an issue, maybe I need to increase the number of servers, optimize some queries and so on. For those who use the cloud, it is very important, because each CPU cycle and each byte costs money, so if you optimize a query or a table, you're saving CPU, disk storage, backup space and so on. And my favorite one: it helps you sleep at night, because you don't need to keep dealing with disasters; you can predict them or handle them in a better way. Continuing, these are some metrics from Grafana. We can get memory utilization and disk space, and we can also see an estimate that it will take me 1.86 years to run out of space, so I can plan my budget ahead, buy disks and everything, because when the database runs out of disk, it crashes, and then we need to decide who we are going to sacrifice.
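The "1.86 years to run out of space" kind of estimate can be approximated with a simple linear extrapolation. The talk does not say how Grafana or PMM computes it, so the sketch below is only an assumption: `days_until_full` is a hypothetical helper that uses the growth between the first and the last sample.

```python
from typing import List, Optional, Tuple


def days_until_full(samples: List[Tuple[float, float]],
                    capacity_bytes: float) -> Optional[float]:
    """Estimate the days left before the disk is full.

    samples: (day, used_bytes) pairs ordered by day.
    Assumption: growth is linear between the first and last sample.
    Returns None when usage is flat or shrinking.
    """
    (d0, u0), (d1, u1) = samples[0], samples[-1]
    if d1 <= d0:
        raise ValueError("samples must span a positive time range")
    growth_per_day = (u1 - u0) / (d1 - d0)
    if growth_per_day <= 0:
        return None
    return (capacity_bytes - u1) / growth_per_day


if __name__ == "__main__":
    # Hypothetical numbers: 100 -> 200 units used over 10 days, capacity 1000.
    print(days_until_full([(0, 100), (10, 200)], 1000))
```

A real dashboard would more likely fit a trend over many samples; the two-point version only shows the idea of planning the budget ahead.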
And as I said, we can understand what is going on inside the database. You can see what kind of queries there are here: inserts, selects, updates, commands. These are the command counters from MySQL, and you can follow how they fluctuate. In this case, this green line, the bigger one, is inserts, so it's a write-heavy database. And as I said, it helps to understand behavior: during normal business days I have peaks of almost 40k QPS, but on weekends I don't get more than 10k. So if it's a weekend and I have 30k, probably I'm under attack, or someone is running a promotion that I don't know about, or some important person died, whatever. There are two ways to monitor: one is using an agent, the other is letting the monitoring system connect to and monitor the database. When you have the agent, you have it installed locally, so it's gathering information and sending it to the server. You can get a lot of details, like Linux metrics; you can get all of them with the node exporter from Prometheus. Sometimes you can run custom scripts, so maybe you can embed, for example, backup or restore routines, something like that, in the agent. The cons are more organizational. Enterprise companies may suffer more, because you need authorization to install the agent on the machine. Maybe you have multiple teams, so you need to ask security for authorization, then the sysadmin needs to install it, and some other team needs to configure it. This is one of the bad things, and it happens a lot in enterprise companies. Here is an example from PMM: we have the agents running on the database server, sending information to the central server, where the data is analyzed. This is an example from Datadog, the same way: the agent runs on the same server and sends data to the back end.
And now agentless. The pros are basically the opposite of what we saw. We can reduce the overhead of administrative tasks, because we only need the monitoring server to reach the database, and the data will be fetched from there, for example from Performance Schema or the global status variables. Again, it's easier to get approved; we are talking more about bureaucracy here. The cons: you still have some work to do. For example, I can't say I'm going to monitor lefred's database if he doesn't give me a user and password; of course I won't have access. And the problem that we will try to solve with ProcFS is that you have a limited scope to analyze, because you are not on the server. For example, you may need some special Linux permission to gather certain metrics, and you won't have it. And one last thing: if you are monitoring, let's say, 1,000 hosts remotely, you are putting a lot of stress on the central server. That's natural, someone needs to do the job; if it's not the database server, it will be the monitoring server. This is an example from another company, the EGI; they work with the monitoring appliance collecting data from these hosts. And another example from PMM, in this case without an agent. We can connect to anything. The advantage is, for example, with RDS or OCI, where you don't have access to system metrics unless you are using the CloudWatch metrics or the metrics from Oracle, but you can get MySQL information simply by connecting to them. So the ProcFS plugin provides access to the Linux performance data. Basically, it's information from /proc and /sys. When you are running vmstat, it is basically collecting information from /proc, and with the plugin this information is gathered and populated into a view that is created by the plugin, the PROCFS view.
For those for whom security is a concern, there is a parameter, procfs_files_spec, where you can say exactly what you want to gather. Now, for the problems, the sad face: currently it works only on Percona Server. Maybe if we move to components, we can make it work for MySQL. If you try to copy the library over, and I did it, of course, out of curiosity, you will crash the database with signal 11, so don't try this. The other caveat, the other con, is that it only works with the same version. So even if we are talking about Percona Server only, if you copy the library from 8.0.10 and put it on 8.0.30, it will fail. It always needs to be the same version. Installing the plugin is like installing any plugin in MySQL, just one command. When you install Percona Server, the library is already in the plugin directory; as lefred showed, not only the components but also the plugins are there. You just install it, and it works. You need to use the particular privilege that is created for this specific plugin, which is ACCESS_PROCFS. If you have SUPER, which I thought was a bit weird, it doesn't work. I thought that SUPER should override it, but it's just a matter of granting the privilege. And as I said, this is the variable that controls what you can collect. I highlighted three entries, like the OS version and the load average, but this is a string, so you just remove the ones that you don't want, and it requires a restart. This is a static variable, so you can't change it at runtime. And using the plugin: this is the raw data you can get from MySQL. You run the command and say exactly what you want to collect. In this case, I was running on Ubuntu 22, which is, I don't remember, oh, it's on AWS. It's a bit weird, because there is no binary; I was using generic packages, because there is no official package for MySQL. And now here we have the raw data, so we need to keep running selects all the time.
And for example, when you get the CPU counters, you see a bunch of numbers, and it's hard to figure out what's going on that way. So we can use Prometheus. Prometheus is open source, and it works with exporters. You have the node exporter; I think someone here presented ProxySQL, and there is a ProxySQL exporter, so you have plenty of options. In this case, we have the MySQL ProcFS collector running as a Prometheus exporter. So you can integrate it with any tool. It doesn't need to be anything specific; I know that AWS, Oracle and Google have features that you can use with Prometheus. This is an example that I took from plain Grafana, so if you want, you just fill it in here and start using it. My own example will be with PMM, because I think it's easier. This part I'm going to leave more for the record, for those who want to try this at home, because it is basically code. In this case, I have a container running the node exporter. As you can see, here are my username and password, and by the way, this host is still running, so if you want to connect, please don't drop the database. And if you want to test whether the exporter is running, you can do a simple curl; it will work and you'll see the metrics. These slides are about integrating with PMM and what you basically have to do. First, you need to add the database to the monitoring server. So we add it as a MySQL service; here it will ask you for the user and password. And then, since we have this exporter running in a Docker container, we add it as an external service. This is how you can view the services that are running there. So you can see I have the database first, and this one is my container that is gathering the metrics from the OS. Those are the commands: how you list the services, how you add the external service. Don't worry, as I said, it's here more for later, if you want to try it at home, to see where the ideas come from, so you can understand what's going on.
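The "bunch of numbers" for the CPU are cumulative counters, so they only become readable once two samples are compared. As a hedged illustration, the sketch below assumes the raw text is the aggregate `cpu` line of Linux `/proc/stat` (whose first eight fields are user, nice, system, idle, iowait, irq, softirq, steal) and turns two samples into percentages. `parse_cpu_line` and `cpu_percentages` are hypothetical helpers, not part of the plugin or of any exporter.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> dict:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before: dict, after: dict) -> dict:
    """Convert two counter samples into the share of time spent in each state."""
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {k: round(100.0 * d / total, 1) for k, d in deltas.items()}


if __name__ == "__main__":
    # Made-up sample lines, only to show the calculation.
    first = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0")
    second = parse_cpu_line("cpu  130 0 60 830 80 0 0 0 0 0")
    print(cpu_percentages(first, second))
```

This is the kind of calculation a dashboard does for you, which is why the talk moves from raw selects to Prometheus and Grafana.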
So that's adding the agent, and the last slide here is Grafana. Here I was using a container, and I can see that I'm having some issues, because my user CPU is at 30% and my iowait is at 30%. So for Docker instances, or for those who are using, for example, Kubernetes, where it's hard to collect OS metrics, having the plugin installed can give you a very nice idea of what's going on. In my experience, a lot of the time people come in saying MySQL is not working, and then you go and look at the Kubernetes worker node where the service is hosted, and there are like 3,000 containers running, the worker is completely saturated and dying, and we are blaming MySQL, which is a small piece, for using the whole thing. I think I went too fast, but I had to skip some of the code. Do you guys have any questions or any curiosity about this? Yes, Muka? That is a good question. I'm not sure what the frequency of the collection is, because this is more of an experimental tool; there is probably no configurable frequency. It should be hard coded, for example five seconds or something. It was Nikolai from Percona who developed it, and I tested it: when we restart the database, the information is lost, so it's only a view that is populated over time. It's not stored; you don't have 30 days of monitoring or anything like that. So, you don't know how it went, where is it? No, no, I don't. It's when you ask for it. No, it says it's lost after restart. It's when you query the table that it checks the value. This is the case, it should be data. No, it's lost today. He said it's lost after restart. So, when you query that table, that view, it goes to /proc to get the information and returns it. Yes, my name is in the search box, because you know it is lost because you have to query the physicality in some signals or something like that. No, that's fine, because I think it's not there when I change the code. Do you think when you restart, there is nothing? I didn't see, you know.
Don't say something, Sergei. Yeah, I want to ask about the implication of there being no cache: the implication is that this plugin will actually go ahead, take data from procfs, load it into memory, parse it, and use CPU. Are there any controls on the amount of memory being used, or a hard limit on the number of nodes in procfs? There are situations where procfs expands furiously. It is documented; there is a limitation. I think there is a maximum number of lines that the plugin will try to collect, and it's kind of similar to PT's talk: when there are more than 1,000 or 10,000 tables, it will ignore them because of the overhead. Yeah, is it the same work that was done in 2018 by Nikolai, or is it something else? Probably it's a continuation, because I saw that the project on GitHub is old. What is it? Is it a new plugin written by somebody else? No, no, it's by Nikolai, yes. OK, can that plugin be compiled for Oracle MySQL? So, you can, but we saw in the source code that, because there is a particular privilege, and Marcelo helped me with this, you need to change parts of the MySQL code. Yes, yes, and talking to our engineering team, they have component to this. Yeah, they were talking, and probably they will do something, because this is more like an adventure, a college project. It's written in the documentation that it's an experimental feature. But the engineering team, because it was crashing while I prepared my presentation, I said, guys, I'm opening a Jira ticket, I want this next week. They didn't agree, but yeah, I think hopefully this can become a real part of MySQL. As lefred said, if it helps the community, why not? I want to say thanks to lefred and the whole organization of FOSDEM, and to you guys who survived until the end. It was a pleasure to be here again, and I hope to see you next year. Thanks. |
AMENDMENT Welcome to the Matrix Devroom |
Welcome to the first ever in-real-life Matrix devroom. Who would have thought we would get here when we came for the first time in 2015? Honestly, a little bit of a surprise. But hey, it is really cool that we're back in one physical place at last after these three years of being stuck in Matrix. So we have to do this bit because people might either be really lost in the room right now, or they might be viewing this online and have no idea what they're getting into. So Amandine, what is Matrix? As you can read, it's an open network for secure and decentralized real-time communications. A lot of you probably know it for chat and voice over IP, but we can do many things on Matrix, as we will see today. Of course chat and voice over IP, but also IoT, VR, 3D worlds, etc. And we should probably introduce ourselves. I'm Matthew, I'm the technical project lead and co-founder. And I'm Amandine, I'm a Matrix co-founder. Basically, the responsible person who tries to keep everything on track and lets me play with computers. So, our mission. We're slightly changing it this year, actually, because I think we're kind of converging on Matrix trying to be the real-time communication layer of the open web. That was kind of the idea all along, and it sounded a bit over the top to put it in writing when we began. But the reality is that increasingly that's where things are moving. You can say that ActivityPub is more of a real-time micro-blogging or information sharing layer, RSS on steroids, whereas with Matrix we're really trying to go as low latency as possible, as we will be talking about in a few minutes, with real-time instant messaging and whatever else on top. So the way it works: we said decentralized, we didn't lie. A bunch of servers that can talk to one another, and a bunch of clients attached to the servers, in green on the graph. But the key thing with Matrix is that it's called Matrix for a reason: it's matrixing together all the different networks out there.
When we created it, we didn't think everyone would jump on it like this. We thought that the interesting and intelligent thing to do was probably to connect all the existing things so everyone could benefit from it. The name Matrix came from a conversation where we said, hey, it would be really cool if there was a name like matrix that we could use to describe this kind of substrate in which all these different things could be embedded. Matrix comes from Latin, where it means uterus, where things grow, where you go and embed things. And it's where the words mother, matron and maternal come from. We thought that would be a pretty cool way to describe the idea of linking it all together. But no, we obviously couldn't call it Matrix because of the film. Then we realized that the film had come out 15 years earlier, and that was already, what, eight years ago. So we said, hey, we'll use it anyway. And we did. Also, matrix.org used to be a really nasty website, and so it was available. What are the stats looking like? Basically, 87 million users have been registered across the whole Matrix network to date. So it's growing quite nicely. The thing is, these are only the users we can see and know of. A whole bunch more are probably hiding in big closed networks which are not connected and never talk to us. Yeah, so just for full transparency on this graph, which we show a lot: the important thing is the shape of the graph rather than necessarily the absolute numbers. This is actually based on the phone-home stats that Synapse has in it, so there aren't any Dendrites or Constructs or Conduits flying around in here. Also, it is literally the total number of MXIDs that a server has ever seen. So it includes bridged users, it includes guests, et cetera. And the way to think of it is that literally half of these are actually bridged, and then about another half of the rest are guests.
So if you wanted the non-guest, actually registered, fully signed-up users, that is possibly reducing it to a quarter. But we prefer the bigger number. At some point it's going to be larger than the number of humans on the planet, and then it's going to start looking a little bit awkward. It still means these are people you can talk to on Matrix if they're connected, so yeah, they're actual users. So I think for one, we'll be reaching out to guest 4445442 on matrix.org from September 2013. The other small stat in the corner is that this is across at least 100,000 deployments, again, that we know of. It ranges from Raspberry Pis, which I'm sure many of you in the room have, all the way up to matrix.org with something like 13 or 14 million users, and in the middle, servers for enterprises, governments and basically anyone, all sorts of sizes. And again, a disclaimer on the stat of 100,000 servers: this is based on looking at the destinations table on matrix.org, which has about 50,000 entries at the moment, and doubling it to account for the servers which we can't see out there on the network. So who uses it? Are we seriously going to go through all of these logos? We've only got like three minutes before Jan rugby-tackles us off the stage. Well, we've just put a bunch of logos out there. So obviously, yes, I hope you've been following FOSDEM on Matrix. And hello to everyone out there who is currently streaming from Matrix. We're always very proud to be hosting FOSDEM. Waves to the camera. Hello. And then there's a whole bunch of different projects and a whole cluster of governments out there who made the right choice and went for data sovereignty. I mean, it's a bit crazy. We've left out some of the companies who we know use it. But honestly, a large number are governments, whether that's France or Germany, Germany again, the UK, NATO, Luxembourg, Sweden, Ukraine, or open source projects.
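The two back-of-the-envelope estimates described by the speakers can be written down as plain arithmetic. The sketch below follows one reading of what was said (half of all MXIDs are bridged, half of the remainder are guests, and the visible server count is doubled); the function names and default ratios are illustrative assumptions, not official Matrix statistics.

```python
def estimate_registered_users(total_mxids: int,
                              bridged_share: float = 0.5,
                              guest_share_of_rest: float = 0.5) -> int:
    """Rough count of fully registered users among all MXIDs ever seen."""
    native = total_mxids * (1 - bridged_share)          # drop bridged users
    return int(native * (1 - guest_share_of_rest))      # drop guests


def estimate_deployments(visible_destinations: int, unseen_factor: int = 2) -> int:
    """Scale the servers seen from matrix.org to account for unseen ones."""
    return visible_destinations * unseen_factor


if __name__ == "__main__":
    print(estimate_registered_users(87_000_000))  # about a quarter of the total
    print(estimate_deployments(50_000))
```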
It's not the most obvious mix, but there's a huge footprint on both sides, and also a bunch of companies, probably including people in the room using it. Apologies if your country is not on here. And if it isn't on here because you don't use Matrix, stop using Teams and get on board. In terms of vital stats, where to start? Cauldron.io. If people don't know Cauldron.io, it's really good. You basically just give it a GitHub or a GitLab repository, sorry, organization, and it goes and spiders the whole thing, puts it in Elasticsearch, and gives you the Elasticsearch credentials to go in and do whatever you want with the stats. So this is looking at it from 2014, with the number of committers, number of issues and reviews. I'm not sure what happened with our reviews in 2020, but there was a mad reviewing frenzy. COVID. Oh yeah, people got really bored and reviewed all their PRs at last. But yeah, some of the stats: we've got 4,000 committers. If you sum all of the GitHub stars over the Matrix organization, it comes to over 50,000, and we're not double counting anybody there at all. And loads of clients. It says 40; it's way more than that. I should have reviewed this. I tried to figure it out. My preferred one is this one: we have projects in over 30 different programming languages. We're almost getting kicked out, Matthew. OK, we've got two minutes. We're just going to talk. Easy, easy. So yes, 30 programming languages, from everywhere, all sorts, and that's really fun. So, today's menu: lots of talks. We're not going to go through them because we have a QR code, plus you already know it, since you're probably sitting here, or you're looking at it on the internet. We've got a URL that nobody will be able to see, but it's the same as the one on the platform there. And also, yeah, the schedule is there, but follow along on Matrix.
There are going to be people out there in the void who we should connect with physically and create a proper hybrid room. Welcome. Woo. Welcome. Hope you will have fun. Yeah. And over to Florian. Thank you. |
AMENDMENT matrixRTC | Matrix beyond Instant Messaging
Element Call, Scaling, Thirdroom |
I'm here, thanks. Yes, hello, good morning, and a nice day for FOSDEM. I hope you had a nice evening. Let's start with the talk: Matrix RTC, Matrix beyond instant messaging. So first of all, what is Matrix? Maybe we have a little bit of redundancy here, but for the sake of the recording I'll repeat some of what Matthew already showed us. Basically, the idea is an open, federated, secure and decentralized real-time communication network, and there are many use cases for it. The obvious one is of course interoperable chat, but we are also aiming for voice over IP applications or even VR applications, and you can also think of something like IoT data. Most of those use cases you're quite familiar with, and with what's well known as the event layer; for chat, for instance, you have store-and-forward semantics. However, the cool thing about Matrix is that it is constructed in such a way that it is decentralized, it is federated, and, what for me is pretty cool and important, it is replicated and end-to-end encrypted. So for instance, if you're a client on one of those hosts here, and you're attached to one homeserver and you join a room from another homeserver, at that moment the room is replicated and everything is end-to-end encrypted. It is pretty cool. It's like a git clone, something like that, with an automatic synchronization mechanism on top of it. I'm quite sure that you're very familiar with it. You could also summarize it as: Matrix is a distributed real-time database. I can recommend this talk here from my colleague Andy; it's a very nice overview of the specifics. This talk today, on the other hand, is about Matrix RTC. So what is the definition of Matrix RTC? Basically, it's the world's first decentralized, federated real-time channel or communication platform, and it's specified in MSC3401 and MSC3898. It's depicted here. Basically, the idea is that from a client perspective you have a
peer-to-peer semantics, and we don't have storage or persistence in it, but you can exchange data with low latency and low jitter from one client to the other. Another interesting thing here is that the business logic is owned by the clients. In most recent RTC platforms or video conferencing platforms you have something like an SFU and an application server; here the idea really is that the business logic is in the clients. Let's have a look at some use cases. The most obvious is of course video conferencing, right? This is what everybody would assume when you talk about RTC. But we can also have the embedded version here: let's just take a normal Matrix client and put a widget into it, and then we have audio and video conferencing. Or go to the VR world, the Matrix interpretation of the metaverse, which is called Third Room. Everything can be realized using Matrix RTC. And the cool thing here is that you can also think about a hybrid use case. For instance, imagine we have a whiteboard like the one we have here. For a fast UX we can use the Matrix RTC layer, such that when I make a stroke now, it's more or less immediately on the other whiteboards, but for persistence we can just use the room, the distributed database. So what are the base building blocks which we use to create Matrix RTC? First of all, of course, we need something like a real-time communication framework, and the obvious choice at the current point in time is WebRTC, because it has quite good adoption, it's in all the web browsers, and there are a lot of SDKs around that we can use. But going forward, maybe you can also think of something like WebTransport or WebCodecs to replace it. So we are not mandating WebRTC, but for the time being that is of course the framework to use. Then for the signaling, it's quite obvious that we use Matrix for it, right? And then, to make it scale, you might optionally also want to use a back-end component, a focus. This could either be an MCU or an SFU.
Now, as I stated, the back-end component is optional. Let's have a look at how to start with Matrix RTC without a back-end component, by looking at the connection models. The obvious and simplest one to start with is peer-to-peer, right? That's just easy: you create a peer connection, and then you have a data layer, and then you can play around with Matrix RTC. It gets a little bit more complicated if you want to have an RTC session with more than two people. Then we spin up a full mesh, and it really depends on the use case how many people can join. If you look at the use case of a video conference, you would need to distribute n-1 media uplinks and downlinks, which is then limited by your internet connection. If you have, at least in Germany, an average DSL connection, it would scale up to five to eight participants. This is something we're currently using in Element Call; Element Call at the current point in time is based on this full mesh setup, or was, as we have some news today. The vision which really scales, and would allow large-scale RTC sessions, is what we call cascaded selective forwarding units, and it's depicted here. What you would have is a selective forwarding unit alongside your homeserver. It's optional, but you can have it, and with this we would allow further scaling and can, using the cascading concept, dynamically grow to any size. We have to test it, but that's the theory so far. The signaling here is of course also carried out over Matrix, and for the specifics of the SFU we have MSC3898, which handles all the magic of the cascading and so on and so forth. From a setup perspective, if you want to scale to a large RTC network, the idea is that you place the SFUs as close as you can to your customers or clients, and this ensures a proper internet connection with low jitter and low packet loss.
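The full-mesh limit mentioned above comes from simple counting: each of n participants sends its media to the n-1 others. The Python sketch below works through that arithmetic. The bitrate figures are hypothetical, chosen only for illustration; the talk gives no concrete numbers beyond "five to eight participants" on an average DSL line.

```python
def mesh_streams_per_client(n: int) -> int:
    """Uplinks (and downlinks) each client handles in a full mesh of n peers."""
    return max(n - 1, 0)


def mesh_total_connections(n: int) -> int:
    """Number of peer connections in the whole mesh: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def max_mesh_participants(uplink_kbps: int, stream_kbps: int) -> int:
    """Largest n such that (n - 1) outgoing streams still fit the uplink."""
    if stream_kbps <= 0:
        raise ValueError("stream_kbps must be positive")
    return uplink_kbps // stream_kbps + 1


if __name__ == "__main__":
    # Hypothetical figures: 5 Mbit/s uplink, 1 Mbit/s per outgoing video stream.
    print(max_mesh_participants(5000, 1000))
    print(mesh_total_connections(6))
```

With an SFU, each client uploads its media once, which is why the cascaded design is presented as the option that scales.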
and then you have the SFUs placed in a strong network center with a good interconnection. With this setup, that's the best you can do in terms of media quality. So what we did: we started with an SFU which is capable of speaking the Matrix flavor of RTC. It's a prototype which was handed over from Sean, the inventor of the Pion stack, and it's a Golang-based WebRTC implementation. We added the Matrix bits to it, we wrote a lot of it. It's early stage, but we have support for audio channels, we have support for video channels and screen sharing, and on top of that we recently also added so-called simulcast support. Simulcast is... maybe I have to go a step back. In the full mesh mode you have literally a peer-to-peer connection to each of those clients, and by adding some signaling the receiver can tell the sender, "Oh, I'm struggling with my network connection, can you please adapt your network bandwidth from the encoding side?" This is a little bit harder if you have a central point, the SFU. So here the trick is that the sender will provide several media quality levels, and then on the SFU the client can decide which one to choose: low quality, mid quality, or high quality. This is called simulcast, and we already have basic support for simulcast in our selective forwarding unit. Matrix is known for privacy and end-to-end encryption, and in the full mesh setup of MatrixRTC that's quite easy, because you have the transport layer which ensures end-to-end encryption. But if you terminate a peer connection on an SFU, the transport layer is of course terminated, and hence we need media encryption, and this is the missing part here. So going forward, using insertable streams, you need to implement end-to-end encryption on the client side, such that the media is encrypted on the client, sent over, and then it's not an issue on the SFU. For this specific topic, MatrixRTC and cascaded foci or selective
forwarding units we have later a dedicated talk for my colleague Shimon it's the in-depth talk cascaded foci it will be in this no not in this room it will be online at 2 p.m. this day so now we have an idea of what the vision of matrix RTC is and what we can do with it we have seen so far the use cases let's come back to element call I think we demonstrated it last year right yes very early very early and after one year you could imagine something happened so first recap what's what's element call it so initially it was developed on a green field as a single page application in the cool story here it's just a matrix client right it's not for chat it's a matrix client and the implementation was using the full mesh so without a back-end component so what's new so after this year first of all in our site or not site project our partner project hydrogen we also have call support right now and it's also working interoperability so you can start with hydrogen the call and join this element call and the other way around we added the SFU bits to element call and we integrated it into element web so in element there we now have two flavors of element call we have a crew call experience we are just press the crew call button and we have dedicated video rooms which is pretty cool so the semantic of this room is that when you press or click on this room you're asked to join the conference immediately the question is how to embed element call into a matrix client in general so in theory you could just implement the msc's but that would be very expensive because then you need to implement in each platform expensive in terms of engineering bandwidth so all requirements are one implementation which fits all and the idea is yeah that we embedded element call or the embedded element call needs to share the same underlying matrix client and room in order to not waste resources or to have device proliferation so the idea is quite obvious let's use a widget for it also short recap on 
widgets. What is a widget? A widget is basically an application living in a Matrix room. It's simply an embedded iframe, and it's a small form factor web application: HTML, JavaScript. The widget is embedded within a room and can communicate with the Matrix client through the widget API, and this widget API is a defined postMessage API. Basically, a widget is able to request permissions and to perform actions on the user's behalf, something like posting into a room, receiving specific event types, and so on and so forth. To have an easier way of approaching widgets we also have a widget SDK, which is written in JavaScript and TypeScript. It's basically a web app. And now here we have an overview of Element Call in the various flavors. In the single page application mode, it's just a Matrix client using the client-server API, and in the widget mode we are going through the widget API over the underlying Matrix client and then to the homeserver. The abstraction layer, client-server API versus widget API, is really thin, so from a development perspective that was really easy to implement. So the solution is: it's just a widget. It's WebRTC in a web view, and this is the nice thing. Implementing the whole WebRTC stack on various platforms is quite painful, but if you can just use a web view where this is included for free, that's a thing. We needed to extend the widget API to add some missing bits, specifically for the to-device messages and to access the TURN server, also specced in the MSCs. This whole concept we call matryoshka embedding, where we have a web view hosting widgets in the various clients, and that could be a web client, but it could also be the native clients like the iOS and the Android clients. Hence we have the solution: one implementation that fits all. Let's have a demo, right? So what you see here basically is just the desktop application of Element, Element Nightly. On the left-hand side you
can see the various rooms with various flavors so here this is a general room and he has already a conference started so if you press it you will join it and at the top here we have a so-called video room and you're now if you join it directly prompted to to join a conference and what you can also see here is the chat so let's join it hello I raised the volume here so hi Enrico hi Simone yeah welcome to foster them what we can do here right now is that we have various flavors so we have a new layout here which we call large crits design and you can yeah change the tile sizes and I also added here some debug information and you can see that the small video on the top left receives the low quality stream whereas the large screen or a large tile on the bottom receives the mid quality stream hey Dave hello so and so and now we want to carry out an experiment I have a QR code here hopefully I find it yeah and I encourage everybody to join the call and to crash our SFU let's load test it sorry with chrome yeah yeah I will do but that's the resolution the projects are good enough we'll put the URL in the bedroom yeah so let's see so of course if it's not working then we can blame the Wi-Fi here right yeah it's slightly unhappy now no no no it's recovering it's recovering so we plan a second demo later the day with the talk from shimon so yeah but so how many do we have it depends okay let's say I leave it in the background play around with it and we go back to me talk okay so what's what's next in the matrix RTC ecosystem of course we want to implement the whiteboard at least the hybrid version of the high pot then the very important thing really to is to yeah to start with the insertable streams to also have the end-to-end encryption in case of using the SFU with respect to the selective forwarding unit we need to implement the focus selection logic so if you remember the craft where we had the other picture where we had the the home servers alongside the SFU's alongside 
with the home servers we sketched out a nice algorithm where you can automatically choose the right one and this is something we need to implement right now yeah and then obviously the cascading this has not be started the implementation but yeah we need to carry out on the similar cost layer we also have some optimizations in mind for instance I think it's only a good idea to upload your video to the SFU if on the other side someone consumes this video stream so this is an optimization we can add and I think also the the switching point where when you change the layer from high quality to low quality there's also some room for improvement on top of that we need to care a little bit more about network bandwidth rate control this is an important thing we need here such that the SFU also feeds back information back to the client such that the client is then in a good position to adapt the local yeah video audio encoders and finally we want to extend the call support in hydrogen and that the SFU bits to it and now maybe some of you are wondering a little bit why Matthew is sitting here with his classes yeah and now let's head over so what we have so far seen was the obvious use case for matrix RTC which is a video conferencing solution but I told you earlier that we also have something like the metaverse interpretation of matrix and we call it third room yeah and in a half minute yeah any questions though on the floor answer yeah it's hard Q&A part apologies for being the idiot with a VR headset on it's gonna start streaming in a second so wanted to talk about third room which is the social collaboration platform that we have built on top of matrix and I'm gonna slightly mess around with trying to get this to work at the right resolution because it's going to look crap if it's not over right whereas how do you change the resolution these days in macOS but you know how it's actually changes refresh rate that's not helpful that's not me yeah and is that your fault that 
it's like streaming it's the wrong one thank you too many death rooms you should play be using the same one and 2001 online death room no it's not the online death room normal death room yes what is it I can't spell matrix this is going well that one right thank you okay we sure that looks totally squashed that's literally why I'm trying to fix right now I'm gonna be to actually set a resolution on this thing these days how about I press that button where we're gonna go to run yeah so it's gone to four by three but it wants to do four by three everywhere but it's not four by three it's 16 by nine let's go to that well I look thank coming in on the stream all right just check this it'll be worth it don't worry on the plus side the Oculus thing is cakes in good right yeah this is looking good so let me actually bring up third room so the point of third room is that it's a tiny team it's sorry okay hi everybody welcome to my talk on third room which is a project tiny project done by three people Robert and a an AJ AJ also famous for doing Sydney as matrix client which is trying to show people that matrix is way more than chat and VoIP I know that it's called to look at 3D stuff these days and go don't like 3D honestly I think this is incredibly interesting in showing the potential of what we have to build on top of matrix today now the way it works is that you've got hydrogen SDK going and basically providing a plain old matrix client and if I jump into this room here which is hash presentation on third room.io if people want to play along at home feel free to come and jump in and heckle my presentation and you can see that this is a virtual world going and sitting in browser if I pull up the frame rate which is obviously control shift s you can see it's actually going at 60 frames a second I'm indeed you're stuck in the floor it's running at 60 frames a second in browser at well 1080p as we all just saw which is pretty impressive for a fairly complicated scene that 
we have going on here and the way that third room works is quite unusual in the it's properly multi-freaded in browser it's using an entirely new game engine that the team basically put together and I should hasten to add I've basically been encouraging people rather than actually working on this but Robertson and Sanfran and so it'll be cruel and unusual to get him to do this and talk and I've even got some slides here and it's showing the scripting that is built in that I'll talk about it in a minute now the interesting thing is that we're using shared arrayed buffers to go and share data between the main thread and a bunch of worker threads using post message between these and then the atomics API's and the browser so that you can actually have properable multiple threads in order to have the rendering thread running completely independently from the gaming thread that does physics and the main thread that does react and does hydrogen because with embedded hydrogen in a react app here as well as matrix so if I go to the next slide here are so on the main threads we've got react matrix and WebRTC happening and we have spatial audio in here so if I actually unmute myself oh I've got first for myself on my own talk that's annoying let me pause that amundini still out there somewhere you want to come over and say something to me it's anybody else I'll go over to you say something yeah okay so if we had headphones on at this point and I turn this way and you say something it's coming out the left speaker and you have to believe me and look the other way and it's coming out honestly it helps the immersive experience massively that we're going using spatial audio to go and position where things are here whilst we're wandering around here you can see that we've got at the moment generic avatars but if you walk around a bit and you can see her moon walking backwards for whatever reason I'm sure you can also go forwards there we go and fly for that matter and so the B 
button lets you fly in this so you can go and jump around like so and so you're spoiling my talk so if we go down here then on the game thread we've got a bunch of rust we have the ability to run arbitrary WebAssembly scripts which is sitting in a sandbox which allows you to basically add any arbitrary functionality into the world from a pure matrix perspective this is probably the most exciting thing here now if you remember IRC and Merck scripting the ability to run arbitrary scripts on your IRC client this is effectively allowing you to define bots and arbitrary functionality in matrix which run inside your client inside the sandbox and the actual data is stored in your room now this whole thing is a matrix room right if I go and hit enter then well you can see a bunch of users there I can say hello world and if I go back to my element client and if I literally join presentation on thirdroom.io for matrix.org then I can you can see I'm Dean saying yeah and I can say yeah to you too and hopefully hang it off well I've got traffic running one way interesting well we should be seeing messages coming into the room as well because it is there we go it's just a plain old hydrogen overlay that is being rented in react for the contents of the room now the actual geometry of the room if we start flying around some more looks like this is actually a big GLTF or a single GLTF asset this thing is just sitting in the media repository in the room it's just a file that is GLTF the transfer format for OpenGL that has been uploaded there and also any scripts in the room like the one which is executing the let me press on the buttons here again there's a bit of I think JavaScript using the quick.js engine that has gone and compiled down the JavaScript to web assembly in real-time it's pretty cool that you literally write it in JavaScript and then the engine sucks it up turns it into wasm and runs it within that sandbox so you could argue it's a little bit perverse to be taking 
JavaScript, compiling it to WebAssembly, and then running it from within a JavaScript environment, but it gives you a hell of a lot more safety than you would get if we were just, I don't know, having random blobs of JavaScript running here. On the render thread we are using WebGL2, and we're using three.js to manage the actual driving of WebGL, but the scene itself is managed using a really cool technology called bitECS that was actually created by Nate, one of the developers, before they started working on Third Room. bitECS is an entity component system where you basically track the state of the world, the objects that exist within it, their transformations, and it's done with arrays in JavaScript. It turns out that if you structure your arrays intelligently enough in JavaScript you can get as good as WebAssembly performance, and it's one of the other secrets to the crazy performance that we have here. So this isn't a scene graph API under the hood like A-Frame, if anybody ever played with A-Frame; instead it's using bitECS. Then another thing which is interesting here is that everything is triple buffered. In a kind of traditional game engine you just have one sort of buffer that you write data into, and the renderer reads it out, and you have some kind of locking system to make sure that it doesn't collide, whereas here we have lock-free data structures letting things go as rapidly as possible, with the various different bits of the engine writing into this shared triple buffer, as a SharedArrayBuffer, which is then juggled effectively between the various different threads. It means that the render thread can run at the native speed of whatever device, which is particularly useful if it's a less powerful device than my MacBook Pro here, and then the game engine that is actually rendering what's going on can go at its own speed, so you totally decouple the two and you get as high a frame rate as you can.
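The triple-buffer hand-off described here can be modelled in a few lines. This is only a single-threaded sketch of the index juggling, not Third Room's code: the class and method names are made up, and a real engine would keep the slots in a SharedArrayBuffer and perform the swaps with atomic operations so that neither thread ever waits on a lock.

```python
class TripleBuffer:
    """Toy model of a triple buffer: one slot each for writer, hand-off, reader."""

    def __init__(self, initial):
        self._slots = [initial, initial, initial]
        self._write = 0    # slot owned by the producer (e.g. the game thread)
        self._middle = 1   # shared hand-off slot
        self._read = 2     # slot owned by the consumer (e.g. the render thread)
        self._fresh = False  # True when the hand-off slot holds unread data

    def publish(self, value):
        """Producer: fill the private slot, then swap it with the hand-off slot."""
        self._slots[self._write] = value
        # In a real implementation this swap is a single atomic exchange.
        self._write, self._middle = self._middle, self._write
        self._fresh = True

    def latest(self):
        """Consumer: take the hand-off slot if it is fresh, else reuse the old one."""
        if self._fresh:
            self._read, self._middle = self._middle, self._read
            self._fresh = False
        return self._slots[self._read]


buf = TripleBuffer(initial=0)
buf.publish(1)
buf.publish(2)       # the producer ran twice before the consumer looked
print(buf.latest())  # the consumer sees only the newest state
```

The producer never blocks when the consumer is slow, and the consumer never blocks when the producer is slow; intermediate states are simply skipped, which is the decoupling the talk describes.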
and I think that oh yeah and finally lots of fun stuff going on with asset optimization pipeline particularly the textures have been highly compressed using these fun codecs I think it's called universal basis format from binomial and KTX and one of the things we've done to cheat to bootstrap third room is to build a pipeline from Unity where you can take existing Unity assets like this scene here is one that we bought off a Unity assets store and then export it as proper open standardized JLTF somewhat liberating the contents from the slightly proprietary world of Unity in order to get content in more rapidly and then compress it down and there are lots of fun things like the it has instancing support built in so if you start generating lots of objects like the physics engine here I know going create a whole bunch of objects and attack the various people who are wandering around in here I'm sure they love me for it and then this is basically just the same JLTF asset going and being created multiple times all the textures are sprighted so there's just one great big thing there's also some really interesting extensions to GL that we've contributed by the Kronos group and particularly if you look at if we grab one of these mirror balls which are mainly used for debugging purposes let's grab that one and if I run around with it while I fly around with it you should see that the reflection changes there we go like if I go between zones there it's not they need to be tuned a bit but basically rather than ray tracing which would be incredibly time-consuming instead we have lots of different probes hanging around the scene that allow you to so I'm hitting myself in the face with the ball a common problem but as you run around you can see the reflection changes it's pretty nasty if you do it rapidly but if you're doing it more slowly like this and it's a quite a subtle but nice effect and it's even better when it's on not perfect maribles if you look at say Dave if you 
walk backwards if you can hear me or go into the light or out of the light then you actually see a fairly subtle shadowing effect as it's gone and figured out where the shadows are there right guys cool to see all the people running around in here so what else can I show you one of the so we're gonna launch some tech preview 2 of this next week and this is the first time anybody has seen tech preview 2 and I have 10 minutes yet left on and some tech preview one is sort of what we've been looking at here except it didn't have scripting we've already shown some of the scripting here but one of the big things sort of being added and let's pray that this thing works is web VR so hopefully if I go to the Oculus streaming thing which I had a second ago should have possibly cleaned this up first of course it stopped working well I get this thing back in again and everything is going wrong apparently I've got to recalibrate the entire thing so I apologize for using proprietary technology but unfortunately there aren't any open source headsets which do the trick yet there we go and let me try to cast this up and unfortunately it takes ages for the screen casting to kick in for some reason but I'll go as quick as I can bump and cost and computer go right so WebXR is a really cool technology it's been there for ages now since about 2017 built into browsers like Firefox and Chrome obviously and also interestingly the browser that your Oculus and Quest like this or Quest Pro has built into it which is based on Chromium it has awful screen casting support as you can see and that I started at screen casting and something is happening in the depths of Facebook trying to figure out how to actually get this onto the screen here but hopefully it will come through assuming I've got internet connectivity here it is thank God and I can start talking and I apologize for I'm going to focus on very interesting so I'm going to focus on Flora in rather than embarrassing everybody else but 
let's just use a stationary boundary confirm right so the browser here sits there I'm not going to update this right now but here is some third room and if I continue into third room as guests you can see this is just a static boring old web browser just sitting here worth noting that third room uses OIDC entirely so this thing here is actually a skinned key cloak I'm gonna say I'm not a bot I'm not gonna bother giving myself a name and then capture failed brilliant thanks Google I'm gonna have to type in third room except caffeine and stress means my ability to use a stupid keyboard like this is gonna be fun okay about to third room here is a streaming okay okay brilliant let's go to login go about third room continuous guests this time hope that it's not gonna make me pick stupid things right good continue accept the T's and C's honestly the using a key cloak for this is really really fun and very anticlimatically we end up with eventually monthly connectivity a 2d version of third room just sitting right here so isn't it amazing by the way that just loaded from index DB local storage but the fun thing is hopefully come on you can do it you can see it's actually struggling quite a lot in this but if I press the old X button there that's I have to close the welcome to third room dialogue here it is come on and to XR thank God for that then I can see Florian hello Florian but more excitingly hopefully if I stay in the right place there we go you can see that I'm actually in the third room environment right now and this is genuinely cool this is running at 90 frames a second for me right now and if I go and press some buttons to crates and oh god crates like that massive crate let me get rid that you can see it's actually hooked up to the normal physics engine so I can go and pull that and I can confusingly throw it into the audience which is no way surreal to be going and flipping back and forth and then back in the normal world again at the moment we've just got 
basic things like the joystick and hung up hooked up to it it's got a kinematic controller and I'm I running out of oh no thank you Florian what else can I show you if you can jump we can spawn more objects I can go after that glitter globe and sorry Mirable except is running faster than I can run after it that's awkward and theoretically if I was a little bit close to the bloody thing I'd be able to grab it and pick it up etc so this is pretty cool honestly it's as good as the native non-web VR and closed stuff that Facebook or meta horizons does and the entire thing is open and they built on the surgery mentioned how am I doing for time three minutes in which case I very quickly go and sorry thanks for I will stop looking at random emissives go back into this and just look at some other things we've done in fact this one is really cool let's just flip into this one because this is a really complicated bit of wasm it's actually an audio reactive widget which is sitting here as you can see as I yell at it it goes and changes size and this is a whole bunch of C code that has been compiled down to wasm to show how you can have interactive things sitting inside the scene another example is slightly less excitingly a chat box echo service so if I go into here and say hello and then do echo hello with a slash command it says hello back to me now the echo that's happening down there is actually being done from the widget API of matrix going into web assembly talking to I think a JavaScript service and then echoes back and you can see we have a slight bug sometimes with scripting where it loads two worlds at the same time and that's pretty surreal and that's what happens if London got combined with Mars final thing here I was going to show you oh yeah is this guy which is a bit silly but fun anyway and this time I'll remember to refresh as fun as it is to have the scenes combined so you might recognize this from a certain film and if we actually look at the script for 
this particular room if I could figure out how to get out of fullscreen mode the script here is again just sitting in the media repository it's a little bit of JavaScript and to use the web scene graph API which is a new API that we've created we hope it will become a W3 standard for manipulating scene graphs it looks a lot like Dom you basically get a node by name the TV and then every frame you'd see if the TV is being pressed and if it is then you enable the matrix material and the end result is if I go here and I click on the TV then predictably enough you end up in a matrix style world and I've got it in third-person view which is also new with tech preview 2 and clay back and forth this is super early but you can imagine this is basically a platform for doing any kind of real-time collaborative app could be figma on this it could be multiplayer blender it could be a game it could be digital twins it could be I know smart cities GIS applications it's as powerful and inslexible as the web but for real-time thank you very much and we have no time for questions one question oh it will be coming in the next release after tech preview 2 I mean the hub bit of actually rigging up the engine and doing all the inverse kinematics and if you run around with the current avatars you can now trip over things which is very important but suffice to say we're focusing on the engine rather than the assets and also this is at a point where people can start contributing things so if you've got amazing assets if they support the maximum rig then they should just drop straight in cool |
AMENDMENT Clients as good as you'd expect
Sliding-Sync, Rust-SDK & WYSIWYG |
All right. Well, hello everyone. You'll notice I'm not Ben. This is Ben. There's three of us here to talk to you about different things, all about improving clients and making them as fast as you would normally expect them to be. So, first of all, my name is Kegan. I'm going to be talking about Sliding Sync. And then you've got Ben, going to talk about the Rust SDK, and Mauro about Element X. So, first of all, Sliding Sync, and a bit about myself. I'm a staff software engineer at Element. And I've worked on many different projects over the years, and more recently working on things like Dendrite and peer-to-peer and Sliding Sync. But first of all, what even is Sliding Sync? So, for context, the current sync mechanism in Matrix is really, really slow. So, if you go and open up your mobile app after a weekend away or something like that, it takes a little while to sync. It could take 30 seconds or a minute, depending on how many rooms are on your account. And this is kind of bad, right? We'd like it to sync instantly. And the whole point of Sliding Sync is trying to make that happen, trying to make it sync instantly, or virtually instantly. There was a talk last year at the online FOSDEM, if you want the deep dive on how Sliding Sync works, and there's a QR code there. But I'm not going to be covering too much detail about how Sliding Sync works other than enough to kind of fill in the gaps if you have no idea what this is. So, at a high level, Sliding Sync works by sorting and filtering. So, you can see here you've got all the rooms on the user's account, and then you can filter that set of rooms down in some way. So, for example, you could filter it based on, like, I want encrypted rooms, or I want DM rooms, things like that. And then you can apply some sort of sorting operation to them. So, you might say, sort them by the room name, or you could say, sort by recency.
So, like, the last timestamp in the room, or by notification count, the number of unread messages and messages that mention your name, and that sort of thing. And then you can request the first five rooms, 10 rooms, 20 rooms, and things like that. Also, for the rooms themselves, you can filter the room state using Sliding Sync. So, in normal sync, you will go and get all of the room state, and if there's a lot of room state, that's not great. So, in Sliding Sync, you can specify, I'm only interested in, like, the topic of the room and whether it's encrypted or not, and that's it. This is a pretty big change to how Matrix works today. So, how are we actually going to do this in practice? So, in practice, there is a Go process, which is the Sliding Sync proxy, which I've been working on for over a year now, which has a Postgres database attached, and it will go and do sync v2 requests on your behalf to an upstream server. It could be Synapse, it could be Dendrite, it could be Conduit, whatever, it doesn't really matter. And the important thing here is that this proxy exposes a Sliding Sync API. So, it exposes a new sync endpoint, and then a client can go and try the Sliding Sync API and see how it feels for them. So, they don't need a particular implementation in Synapse, and they don't need to wait for these implementations to land. You can try it on your own server if you run a proxy. At the protocol level, what this looks like is you can see here you've got, like, some lists. It's a lists object, and then you can specify the things we were talking about before. So, you've got the ranges there, so that's how many, like, the top N rooms that you want, the sort ordering that you want, as well as any filters you apply here. And here you can see we're filtering by `is_dm: true`, and that's going to be used to populate the People tab, say, on Element Web.
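As a rough illustration of the request just described, here is a sketch in Python. The field names (`lists`, `ranges`, `sort`, `filters`, `is_dm`, `required_state`, `timeline_limit`) are modelled on the draft Sliding Sync proposal and should be read as assumptions rather than a stable API; the list name `people`, the sample rooms, and the `window` helper are invented for the example. `window` mimics, on a toy in-memory room set, what the server conceptually does: filter, sort, then slice the requested range.

```python
# Request body modelled on the draft Sliding Sync proposal.
# All field names are assumptions taken from that draft, not a stable API.
request = {
    "lists": {
        "people": {  # hypothetical list name
            "ranges": [[0, 4]],  # top five rooms, inclusive indexes
            "sort": ["by_recency"],
            "filters": {"is_dm": True},
            # Only fetch the room state the room list actually needs.
            "required_state": [["m.room.topic", ""], ["m.room.encryption", ""]],
            "timeline_limit": 1,
        }
    }
}

# Toy stand-in for the rooms on a user's account (made-up data).
ROOMS = [
    {"room_id": "!a:example.org", "is_dm": True, "last_ts": 300},
    {"room_id": "!b:example.org", "is_dm": False, "last_ts": 900},
    {"room_id": "!c:example.org", "is_dm": True, "last_ts": 700},
    {"room_id": "!d:example.org", "is_dm": True, "last_ts": 100},
]


def window(rooms, list_request):
    """Filter, sort by recency, then slice: the three steps from the talk."""
    want_dm = list_request["filters"]["is_dm"]
    matching = [r for r in rooms if r["is_dm"] == want_dm]
    matching.sort(key=lambda r: r["last_ts"], reverse=True)
    first, last = list_request["ranges"][0]
    return [r["room_id"] for r in matching[first:last + 1]]


print(window(ROOMS, request["lists"]["people"]))
```

Because the client only ever asks for a window of the sorted list, the cost of the first sync depends on the window size rather than on the total number of rooms on the account.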
You also have these things called room subscriptions. They are kind of like the room lists, but this is for when you know the specific room ID. So, if you follow a permalink, which may include the room ID, or if you refresh the page and you know that this person was currently viewing this room, that room may not be in this list, right? So, you would need to subscribe to that room directly because you know the room ID. And typically, as well, the kinds of information you want here is different. So, in here, we are requesting all the state in the room and a much higher timeline limit because this is being used to populate the actual full room view. The response is very similar, as well. So, you have a lists object here, and then you get a list of room IDs that are used to populate the correct ordering here. And then you also have a top-level rooms object, and that's just a key-value map, where the keys are the room IDs, and the values are all the data that you requested. So, these all get aggregated together, which I will speak about a bit later. In terms of what's new, so if you followed Sliding Sync, then you might be like, okay, I know all this, but what's actually happened over the past year? We have clients now that run Sliding Sync. So, this is from Element Web. It's got a nice scary warning there. So, you know, it's great. It all works on web, but also it actually works on mobile devices, as well, thanks to the Rust SDK, which I'll leave for Ben to talk about. So, there's also a whole bunch of new extension MSCs. So, extension MSCs are an idea for trying to break up the complexity of Sliding Sync, because the sync API is by far one of the most, if not the most, complicated parts of the client-server API, and trying to put everything into one MSC is going to be doomed to failure.
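The response shape described earlier in this passage, an ordered list of room IDs per list plus one shared map of room data, can be sketched as follows. The keys `ops`, `op`, `range`, `room_ids`, and `count` are assumptions modelled on the draft proposal; the talk itself only says that each list carries room IDs in order and that `rooms` is keyed by room ID. All room data is made up.

```python
# Response shape modelled on the draft proposal; treat every key as an assumption.
response = {
    "lists": {
        "people": {
            "count": 3,
            "ops": [
                {"op": "SYNC", "range": [0, 1],
                 "room_ids": ["!c:example.org", "!a:example.org"]},
            ],
        }
    },
    # One shared map: rooms from lists and from direct subscriptions land here.
    "rooms": {
        "!a:example.org": {"name": "Alice", "notification_count": 0},
        "!c:example.org": {"name": "Carol", "notification_count": 2},
        "!z:example.org": {"name": "Permalinked room", "notification_count": 0},
    },
}


def render_list(resp, list_name):
    """Join the ordered room IDs of one list with the shared rooms map."""
    ordered = []
    for op in resp["lists"][list_name]["ops"]:
        if op["op"] == "SYNC":
            ordered.extend(op["room_ids"])
    return [(room_id, resp["rooms"][room_id]["name"]) for room_id in ordered]


print(render_list(response, "people"))
```

The room `!z:example.org` appears in `rooms` but in no list, which is how a directly subscribed room (for example one reached through a permalink) would show up in this model.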
So, we're trying to specify a core part of the MSC, a core part of what syncing is, which is syncing rooms and working out the sort and filter arguments, and then we're leaving all the extra stuff on top to extensions. And the idea is that you can opt into any of these things. So, if your client doesn't do receipts, then great. Don't subscribe to receipts. Don't even enable this extension. Briefly, how these extensions work. These two extensions go together because they're ultimately used to make encryption work in encrypted rooms. So, you can see, or actually you can't see at all, here is an encrypted event. You have to trust me, there's a ciphertext here with lots of gibberish, effectively, and then you need room keys to go and decrypt it into your normal text. The way that works is that you get keys via your to-device messages. That's why they go together. The other thing here is that it implements another MSC called dropping stale send-to-device messages. You can barely see it on here, but this is output from Postgres, which is trying to work out how many unread or unacknowledged to-device events there are for a given user device. And you might think that might be, say, 100, maybe 1,000 tops. It turns out this can be a lot. This is several hundred thousand unread or unacknowledged to-device events. And when I analyzed a lot of this, it was almost entirely down to room keys being requested, and then either canceled or successfully sent. So, this MSC 3944 basically says, hey, if you request a room key and then you subsequently cancel that request, those to-device messages get deleted, so they don't just keep stacking up in this way. And that obviously really helps reduce the amount of bandwidth for Sliding Sync as well. The other thing we've got is account data.
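The idea of dropping canceled key requests can be sketched in a few lines of Python. This is not the proxy's implementation. It assumes a pending queue of to-device messages as plain dicts, and uses the `m.room_key_request` content fields `action`, `request_id` and `requesting_device_id`. The exact matching rules in MSC3944 may differ.

```python
def drop_stale_key_requests(queue):
    """Remove key requests that were later canceled, plus the cancellations."""
    def key(msg):
        content = msg["content"]
        return (msg["sender"], content["requesting_device_id"], content["request_id"])

    # First pass: collect every request that has a matching cancellation.
    canceled = {
        key(m) for m in queue
        if m["type"] == "m.room_key_request"
        and m["content"].get("action") == "request_cancellation"
    }
    # Second pass: keep everything that is not part of a canceled pair.
    return [
        m for m in queue
        if not (m["type"] == "m.room_key_request" and key(m) in canceled)
    ]

def key_request(action, request_id):
    """Hypothetical helper that builds a sample to-device message."""
    return {
        "type": "m.room_key_request",
        "sender": "@alice:example.org",
        "content": {
            "action": action,
            "requesting_device_id": "DEVICE1",
            "request_id": request_id,
        },
    }

pending = [
    key_request("request", "r1"),
    key_request("request_cancellation", "r1"),  # cancels r1
    key_request("request", "r2"),               # still outstanding
    {"type": "m.room.encrypted", "sender": "@bob:example.org", "content": {}},
]
kept = drop_stale_key_requests(pending)
print(len(pending), "->", len(kept))
```

On an account where almost all of the backlog is request and cancel pairs, this kind of pruning is what stops the queue from growing into the hundreds of thousands.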
So, if you wonder what account data does: if you've ever used the breadcrumbs thing at the top here on Element Web, that's synchronized using account data. Also, account data is really, really useful for working out accurate notification counts. So, at the bottom here, you can just about see that you've got some messages. You've got a message in the timeline. This is encrypted. And it says here, notification count one. Notification counts are the gray numbers, and you've got a highlight count of zero, which is the red number. And yet, on the UI, you can see that it's a red number and it's gone to one. So, something's happened here where the client has worked out that, oh, I should use this as a highlight count, not a notification count. It's overridden what the server told it. And what's happened here is that the client has decrypted the message, and then it's checked the message to say, hey, is there any @mention or any specific keyword based on your push rules? And if that is true, then it knows, ah, okay, I need to actually make this a red highlight rather than just a normal gray unread count. That's done using push rules, and push rules are stored as account data. The final two are receipts and typing. Thank you. So, hopefully, you know what receipts and typing notifications are. The main change for Sliding Sync is that the receipts are lazily loaded. So, you might think, what does that mean exactly? Well, if you request a timeline limit of 10, then you will get those 10 events, and then you will get the receipts for those 10 events, and you won't get receipts for any other events. And you might think, shit, hasn't it always done this? Well, not really. So, here's some jq for Matrix HQ, which is this room ID, and it's just pulling out the receipt EDU and then checking roughly how many receipts there are. And, you know, Matrix HQ is quite a big room, so you might think, you know, a hundred, a thousand.
No, there's quite a lot of receipts in there. And this is not great from a bandwidth perspective, right? We don't want to be sending 53,000 read receipts, particularly for events which you are unlikely to ever view, because these could be for events that occurred, like, a year ago. So, Sliding Sync also fixes that. With all these performance optimizations, a very large account with 4,000 rooms can take less than a second to actually sync, which is down from 15 minutes on Sync v2. So, very happy with that, but it's still not really good enough. We're trying to go big or go home, so we want to make it even faster, so it is literally instant. You don't want to have to wait a couple of seconds. It should just open up, just like most other messaging clients, where you open them up and they just work. The problem is that things are going to get a lot worse here, which I will talk about in a moment. So, we've added a bunch of tracing to the proxy server. This is a runtime trace, so you can see exactly the control flow. There are some spans there, and you can see various optimizations that were done. So, this is identifying slow bits of code. Lots and lots and lots of commits. Sometimes it's just that you forgot to add an index. Sometimes you should be doing things in bulk instead of sequentially. So, lots of work has gone into this. And also, if you're aiming for 100 milliseconds, the actual amount of data you send is important, because this starts to become quite a large factor in the total time it takes. We can do simple things like deduplication and enabling gzip for large responses, which we now do. And as well as that, we can aggressively cache things in memory wherever possible, so we don't have to query the database when clients send a request.
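The lazy loading of receipts described a little earlier can be sketched as a simple filter. This Python example assumes the usual `m.receipt` EDU layout, where `content` is keyed by event ID, and keeps only the receipts that point at events in the requested timeline. It illustrates the idea and is not the proxy's code.

```python
def receipts_for_timeline(receipt_edu, timeline):
    """Keep only receipts that refer to events present in the timeline."""
    wanted = {event["event_id"] for event in timeline}
    content = {
        event_id: receipts
        for event_id, receipts in receipt_edu["content"].items()
        if event_id in wanted
    }
    return {"type": "m.receipt", "content": content}

# Hypothetical room data: two events in the window, three receipts in total.
timeline = [{"event_id": "$new1"}, {"event_id": "$new2"}]
edu = {
    "type": "m.receipt",
    "content": {
        "$old": {"m.read": {"@carol:example.org": {"ts": 1}}},   # year-old event
        "$new1": {"m.read": {"@alice:example.org": {"ts": 2}}},
        "$new2": {"m.read": {"@bob:example.org": {"ts": 3}}},
    },
}
trimmed = receipts_for_timeline(edu, timeline)
print(sorted(trimmed["content"]))
```

With a timeline limit of 10, the response carries at most the receipts for those 10 events, no matter how many thousands of receipts the room has accumulated.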
So, there's three levels of caching involved at the proxy level. There's a global cache, which contains information that doesn't change for any user, so it's a constant. Things like the number of joined users in a room: it's the same if you're Alice or if you're Bob, it's always going to be the same. Whereas the user cache has things like, what's the unread count for this room? Well, that's going to change depending on the user. And then the connection cache has things like which room subscriptions you've got, or which lists, like what your top-N rooms are, or whatever your sliding window is. An interesting thing to note here is that the room name is actually not global data. The data that's used to calculate the room name is global data and is the same for everyone, but the room name itself isn't, because of DMs. So, if you have a DM between Alice and Bob, then from Alice's point of view, the room name is Bob, but from Bob's point of view, the room name is Alice. So, lots and lots of optimizations have been done. With all of this, we're now getting less than 100 milliseconds, which is what we wanted, but it's still not good enough, because things are going to get a lot worse. Really, it's all up to the clients, because clients want offline support and they want instant access. They don't want to have to do a network request when they click on a room. They want to just see the list. They don't want to see a spinner. In the best case, you have a spinner for half a second, maybe, and then it loads, which is not great, but maybe acceptable. But then, if you're on a mobile app and you go into a tunnel, it's just going to spin forever, and then you're sad. So, users expect these things to work instantly. And Sliding Sync has ways that you can fix this.
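The point about room names not being global data can be shown with a short Python sketch. The inputs (an explicit room name and the member display names) are the same for everyone and could live in a global cache, while the computed name depends on who is looking. The fallback logic here is a deliberately simplified assumption and not the full Matrix room-name algorithm.

```python
def room_name_for(viewer, explicit_name, members):
    """Compute the display name of a room from one user's point of view."""
    if explicit_name:
        return explicit_name  # a set room name is the same for everyone
    # Otherwise name the room after everyone except the viewer.
    others = sorted(name for user_id, name in members.items() if user_id != viewer)
    return ", ".join(others) if others else "Empty room"

# Global data: identical for every user, so it is cacheable once per room.
members = {"@alice:example.org": "Alice", "@bob:example.org": "Bob"}

# Per-user result: this is what would live in a user-level cache.
print(room_name_for("@alice:example.org", None, members))  # prints Bob
print(room_name_for("@bob:example.org", None, members))    # prints Alice
```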
So, if you want to go and instantly see the room timeline, that's fine, because we can pre-cache the room timeline, right? You can use a higher timeline limit and then you can go and pre-cache that. So, you see the room list, you click through, and immediately you see all the events, at least a screen's worth of events. For the other thing, which is that you want to scroll the room list instantly and smoothly, well, you can opt out of the sliding windows entirely and just request really small stub information like, I just want the avatar, I just want the room name, and that's it. Then you'll know the position, the room name, and the avatar, and you can just request all the rooms. That will scale with the number of rooms in the user's account, but it's possible. And you can use something like this. So, you say timeline limit of zero, but there's a problem here, right? Because you have a timeline limit of 20 on the first one, then a timeline limit of zero. You kind of want a bit of both. So, it turns out what clients really want is delayed data delivery. The API wasn't originally designed for that a year ago, so we've made a lot of changes to support this idea of delayed data delivery. One of the things is timeline trickling. With timeline trickling, you can initially request a set of rooms and say, I want only the most recent message in this room. And then at a later point, you can say, okay, now I want the last 10 messages in this room, and it will go and effectively back-paginate those messages for you. Likewise, the clients want all the rooms in the account, but they want maybe more detail on the rooms that are in the viewport. So, again, you can support this by having two lists, effectively. You've got one list, which is just the visible rooms. That might have more accurate information for things like room previews.
So, you've got room previews, and you might register for typing notifications in those rooms. But then you're not really interested in typing notifications for rooms really far down the list. And then you can have a separate list, which just goes and gets all the other rooms in the background, with all the core information that you need. So, this has been a huge trade-off, right? On the one hand, you've got Sync v2, which gets everything and is super slow, but it's got fantastic offline support as a result of that. And on the other side, you've got Sliding Sync. It's super fast. You're literally only getting the data that you need, but there are compromises to be made there, because you have to do network requests all the time and things can be slower. There's only so fast you can go. There's only so much you can optimize a server. So, really, I think Element is aiming to do something like that. It's mostly Sliding Sync, but there are compromises and trade-offs being made to try to give a really good offline experience as well. So, in terms of what's next, we need to add threads, because there's no threading support at all in Sliding Sync. And threads, obviously, only reasonably recently landed and were enabled everywhere. Threads are complicated because they change the fundamental answers to questions like, is this room unread? Normally you could just ask, well, what's your read marker? What's the most recent event? Okay, it must be unread. Whereas now you could have scenarios where the most recent event in the room is in a thread. So, if you were just to click on a room and look at the timeline, those are all old messages, but in a thread from three days ago, there's actually a newer message.
So, adding support for threads is going to be quite tricky to get right, and we'll probably have to iterate on it quite a lot, but it is coming. The other thing we're going to be adding is this concept called delta tokens, which, unless you've read the MSC, you'll have no idea about. Ultimately, the proxy server has a problem at the moment: it has amnesia. It will time out your connection if you don't use it for, say, half an hour, and it will clean up all that in-memory state. All those caches and things get cleaned up. And the problem is that when you then reconnect, even though your client has stored those rooms, a lot of the timeline, and a bunch of room state, the proxy server doesn't know this. So, it's going to resend that information to you. The point of delta tokens is to say, hey, remember me? I already know about these events. And then those events aren't sent to the client again in duplicate. There are a few more API optimizations. We recently swapped to using lists as keys, which basically means that instead of representing the request and response lists as an array of lists, they're now just a big key-value map, which makes things easier because you can reference an individual list by its key name. For things like extensions, this is great, because you then have a way of expressing, I want typing notifications, but only on these named lists, whereas before that was very difficult to express. And we also really want to have comprehensive client support. It's getting reasonably stable now, and it's certainly very performant. Element Web uses Sliding Sync natively in the JS SDK, but obviously, that doesn't really work for mobile. It would be nice to have some sort of SDK that could be used for Android and iOS, and maybe even web at some point. I think there is. Yes. Let's talk about the Rust SDK.
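The move from an array of lists to lists as keys, mentioned above, can be sketched in Python as follows. The list names `visible` and `background` are hypothetical, and the `lists` field inside the typing extension is an assumption about how an extension might refer to named lists, not a confirmed part of the API.

```python
def lists_to_keyed(list_array, names):
    """Convert the old array-of-lists form into a map keyed by list name."""
    if len(list_array) != len(names):
        raise ValueError("need exactly one name per list")
    return dict(zip(names, list_array))

# Old style: lists can only be identified by their position in the array.
old_style = [
    {"ranges": [[0, 9]], "timeline_limit": 10},   # rooms in the viewport
    {"ranges": [[0, 99]], "timeline_limit": 0},   # everything else, stub data
]

request = {
    "lists": lists_to_keyed(old_style, ["visible", "background"]),
    # Assumed shape: the extension refers to a list by its key name.
    "extensions": {"typing": {"enabled": True, "lists": ["visible"]}},
}
print(sorted(request["lists"]))
```

With positional lists, "typing notifications only for the first list" would break as soon as a client reordered or removed a list. A stable key avoids that.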
So, this is, sorry, overall a very technical talk. You've already noticed that. But I'm going to lighten it up a little bit. First, about me: I'm Ben. Hi. My name is the only name in the presentation. I don't know why. These guys have done more work and shown more stuff. So, quickly. I've led the Rust SDK team at Element for the last year, and I've been working in decentralized tech for a couple of years already. I worked at Parity Tech before, where I was leading the Substrate client team, if you know blockchain. That's one of the most popular blockchain-building systems. I'm going to be working as a tech lead for Acter Global, where we're building, on top of the Rust SDK, an organizing app for NGOs and civil society. So, I've been working in this for over a decade. You might know me from an almost not at all threatening talk I gave at Jason, like, 2017. That was already about how you do decentralized, privacy-first technology. Enough about me. Let me tell you a little story. We're back in 2019, 2020, and it's about the state of the clients. For the sake of argument, I'm talking about Element clients here, because I think there are exceptions to what I'm going to tell you. But let me tell you two truths and a lie, and you can tell me if you can spot the lie. Truth number one: many clients out there don't actually implement end-to-end encryption, which is pretty sad, because it's a very fundamental part of what we're working on. That is mostly because it's hard. Even if you use the most widely used library, libolm, that is a C library that is already slightly dated. There's a lot of knowledge that has been built up that is not easy to ingrain in this existing library anymore. Number two: clients usually implement the entire HTTP layer, or at least most of the state machine around rooms, room state, and who's allowed to write, as well as the entire messaging mechanics, themselves, in their own language, in their own environment.
Therefore, because we have that, clients are super fast, totally integrated into the system they run on, and it's just a smooth experience. I don't have to ask you which one of these is a lie. The cake is a lie. At this time, enter our hero. Our hero is Damir. Damir is working as a crypto dev for Element. He's a Rust enthusiast, and he knows the crypto in and out. He's intending to rewrite a plug-in that he's using for an IRC client called WeeChat, which connects it to Matrix. Because of simple problems that are limitations of the Python implementation that WeeChat offers, he wants to rewrite it in Rust. But he doesn't really find a good base to build it on. This is not an actual representation, but we're going to use it for now. So he goes out and says, OK, let's write this. How hard could it be? He quickly realizes, OK, so the crypto side is in C, and I would like to have that in Rust. I'm going to get to the why in a second. And he pulls that out later into what is now called vodozemac. You might have heard about that. It's our crypto implementation that we're pushing forward, as libolm is deprecated. But he figures out that the stuff around that to make crypto work, not the encryption itself, but the entire thing of how do I know which messages to encrypt with what key in what room, and what if a message comes in and I don't have the encryption key? All of that state management is actually as complicated and as problematic as the actual crypto. And that is why a lot of people try to use the crypto, but then fail in doing all of that, making it a really terrible experience. And then they drop it and say, oh, let's not do encryption. That's too hard. But he continues and pushes on, because he really wants that for WeeChat, and starts out with what we know as the Matrix Rust SDK. So why did he pick Rust? I'm not talking in his name, but I'm going to give you some reasons why.
If you've heard about Rust before, you've probably heard about it because it's the most popular, most beloved language, for years running now, in the Stack Overflow survey. So who here has heard about Rust? All right. Who has used Rust? Keep your hands up. Okay. Okay. That's fairly, fairly good. And while that is definitely true to some degree, there's a lot of love for that language, it's even bigger in crypto, because building encryption, and building that safely, is really hard. At the same time, you can't really go for Python or that kind of stuff because it's, well, too inefficient. So most implementations used to use C, and Rust seemed like such a nice alternative. So inside crypto and encryption, Rust is already a big thing. That's probably the main reason he chose it, because he wanted to use it. But there's also a good number of actual reasons why Rust makes sense to build this with. This is a screenshot of the rust-lang.org website from yesterday. I'm going to break it down a little more, because we have to understand one key thing. Rust was invented by Mozilla to build a new browser. They had Firefox in 2010, 2011. They were like, there's so much C and C++ in here. It's so complicated. We barely know how we can change stuff ourselves. And there's still Netscape code in that code base, right? It's like 20 years of stuff. So they were like, let's build a new browser, and it's called the Servo project, as a research project. And through that, they realized there are certain things they'd like to have from new languages, and they started building their own language to build a browser. That project still exists. It's servo.org today. Mozilla has handed off the management to the Linux Foundation. It's still a research project. I recommend it if you want to start with Rust. That is a really good community to start with. But the key point here is that it was a language built by practitioners for practitioners.
They didn't set out to say, hey, let's make a theoretically proven language. Let's make a really beautiful-looking language. None of these ideals were the goal. They wanted a language that they could use, that made them more efficient in building a browser, which is already quite hard. If you say, I want to build a browser, that's a lot of stuff you have to do. And so they set out to build, and this is the previous claim that Rust had, a type-safe systems language, so a systems language at the level of C and C++, with zero-cost abstractions. Built by practitioners, as I said. So it's a modern language. It reached 1.0 in 2015. It is as speedy as C and C++. Sometimes it's speedier. The most famous example is ripgrep. If you go for that, it's like 10 times faster than the next comparable implementation when you grep over a lot of files. And it does all of that without any garbage collector or VM. Again, the goal is to have zero-cost abstractions. Any abstraction that Rust gives you, and a lot of the abstractions that the community gives you in their own crates, follows the idea that we can lower it down at compile time to nothing. It doesn't actually exist at runtime. Therefore, garbage collector cycles, no. VM below that, no. It should work on an embedded system. That rules out a lot of things. But it does all of that without memory safety bugs. That's probably the biggest concern for any security researcher. Buffer overflows are effectively nonexistent in Rust. Very famously, a couple of days ago, Google announced that since they have been shipping Rust in Android, I think a third of the code that they ship in Android is now Rust, their number of memory bugs has halved, or even less than that. And that is their main concern so far. Most of that happens at compile time. So at compile time, the compiler is a little more annoying, telling you, you need to tell me where this memory is going to go. Is that in this thread or in that thread?
But it also means that after it compiles, it runs. But again, because it's built by practitioners for practitioners, it's not just about the language. You need to be able to actually work with it. That means that it's very famous for its very good tooling. It has a really nice compiler. Very famously, when people come from other languages and run into their first error, they see the compiler complain and they switch immediately back to look at the code. In Rust, you don't do that. The compiler is probably going to tell you what you need to change, or at least what it thinks you need to change to make that run. That is a complete behavioral change. The compiler is your friend, telling you, look, you just need to tell me, is that in this thread or in that thread? This is what I assume you would want to do. It can be wrong, of course, because you have higher-level abstractions that you need to work with. But overall, it's pretty good. The same goes for Cargo, which is the package management and build system, and also rustup, which is the meta tool for organizing your own Rust installation. And all of that is built against the LLVM backend, which means it's more or less instantly portable. When it runs on your Mac and you don't have any architecture-specific code, it will compile for Windows as well. You basically just have to say there's another target. The way that LLVM works, it has an intermediate representation of its own in between. Rust basically compiles to that, and then everything that LLVM supports as a target, it can compile to. And that is pretty amazing. That led to Rust being the very first language that had native support for WebAssembly as a target, because it was just switching on, oh, yeah, the target for that. At the same time, sorry, my voice is still a little sick, it allows you to have a C-compatible lib interface.
And that makes it really nice to embed it into other stuff and use it as a library. All right, so that's Rust. What do we currently have in the Rust SDK now, a year later? The idea is essentially that everything you need to build a Matrix client should be there. Batteries included. That specifically means we have an async, type-safe API. The requests you do are type-safe. When they come back, we check that the JSON that comes back is what it needs to be. It has full-featured room state. So for every room that you're in, it can tell you, can you write in that room? What kind of messages can you write? Who are the other users in the room? What is their avatar? What other state do they have? All of that stuff, it manages for you. You don't have to bother too much about that. It has persistent storage layer support. So you don't have to worry about caching it or putting it somewhere locally. You can still do that on your own if you want. It is a pluggable interface, but it already comes with a native version, which is the kind of deprecated sled store that we intend to replace with SQLite. The SQLite one is already partially there for crypto, but not for the other side yet. But it also has, for example, support for the web, for IndexedDB, so you can run it in the browser. One of the examples actually is an echo bot that runs in your browser in Wasm. It's pretty awesome. And for us, almost the most important part is that it has transparent end-to-end encryption support. When you're in a room and that room is encrypted, it's going to send out the messages to get the keys that you need, and it allows you to verify with a different device. But from the point that you join with a new device, you just say room send and you give it the message, and it's going to send an encrypted message. That's it. For the most part, unless user interaction is required, you don't have to bother about that. It's going to store that information.
It's going to make sure, through the storage support, that when you start up the next time, you have all the keys there. You don't have to bother about there being an additional end-to-end encryption layer that you have to take care of. I already mentioned that it has Wasm and web support. And because of the C layer, we're also able to offer support for different bindings out there. So we have two bindings that are used in the next generation of Element apps, which you're going to see later, for Kotlin and Swift through UniFFI. But there are also custom bindings for Node.js and for JS on the web as well. I think there are Python bindings out there, but they're not maintained by us. This all allows us to go beyond what we have so far. It allows us to ingrain more of the stuff that different clients and implementations have been using, but that has barely cross-pollinated. If you had a really clever way of managing your timeline on Android, the iOS people wouldn't know. That all converges into this singular place now. And that allows us to do a lot more things a lot quicker. One of the things that we currently do is offer a new experimental timeline API that manages the state for you. Back in 2018, 2019, editing messages came around, and that fundamentally changed the idea of an event in Matrix. It's not just a stream of events anymore, but events acting upon other events. This one changes a message from a previous one. With the new timeline API, you don't have to bother. We're just going to tell you, oh, position 17, this is now this. The same is true for redactions. The same is true for reactions. All of these things, and soon threads. I don't know how we're going to do threads yet, but that is all supposed to be right there. You don't have to bother about the state machine changes that this requires. It's just going to tell you, hey, you need to render a different thing now. The other thing that was mentioned before as well is support for Sliding Sync.
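The "position 17, this is now this" idea can be illustrated with a small Python sketch. This is not the Rust SDK's timeline API. It is a simplified model that assumes edits arrive as `m.room.message` events with an `m.replace` relation and `m.new_content`, and redactions as `m.room.redaction` events with a top-level `redacts` field. The function returns which position the UI should re-render.

```python
def _find(timeline, event_id):
    """Return the index of the item with this event ID, or None."""
    for index, item in enumerate(timeline):
        if item["event_id"] == event_id:
            return index
    return None

def apply_event(timeline, event):
    """Fold one event into the timeline; return ("push" | "set", index) or None."""
    content = event.get("content", {})
    relation = content.get("m.relates_to", {})

    if event["type"] == "m.room.redaction":
        index = _find(timeline, event.get("redacts"))
        if index is None:
            return None
        timeline[index].update(body=None, redacted=True)
        return ("set", index)

    if event["type"] == "m.room.message" and relation.get("rel_type") == "m.replace":
        index = _find(timeline, relation.get("event_id"))
        if index is None:
            return None
        new_body = content.get("m.new_content", {}).get("body")
        timeline[index].update(body=new_body, edited=True)
        return ("set", index)

    if event["type"] == "m.room.message":
        timeline.append({"event_id": event["event_id"], "body": content.get("body"),
                         "edited": False, "redacted": False})
        return ("push", len(timeline) - 1)

    return None  # other event types are ignored in this sketch

timeline = []
first = apply_event(timeline, {"type": "m.room.message", "event_id": "$1",
                               "content": {"body": "helo"}})
edit = apply_event(timeline, {"type": "m.room.message", "event_id": "$2",
                              "content": {"body": "* hello",
                                          "m.new_content": {"body": "hello"},
                                          "m.relates_to": {"rel_type": "m.replace",
                                                           "event_id": "$1"}}})
print(first, edit, timeline[0]["body"])
```

The renderer only has to react to the returned position. It never needs to know that an edit is itself an event in the room.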
Both of these are still experimental. You have to actively switch them on, because these are interfaces that we're not confident are going to stick exactly the way they are, but there are implementations out there using them. All right, so does it work? Does it live up to the promise? Well, let's see. In order to build Sliding Sync, I built a small testing UI. This is Mr. Big, my test account with, I don't know how many rooms, but usually loading it on Element Web is like a minute for the initial sync. With my timeline, with Sliding Sync and this testing system, it's 200 milliseconds. It's 200 milliseconds to render the room. You can see this down here. And to pull up all the other rooms, it's like another 30 milliseconds. So yeah, it's fast. It does what it's supposed to do. But is that actually true? I'm a core developer. Of course, the thing that I'm building here is hopefully going to work, but how plausible is that as an SDK? Maybe I'm just building a lot of stuff. Okay, let's take a look at the thing itself. The implementation on top of the Rust SDK for this UI is a whopping 2,000 lines. It's pretty small. And most of that is actually tui-realm stuff, because TUIs in Rust are not that great, so you have to do a lot of state management. The actual implementation of managing the Rust SDK is less than 130 lines of code. Everything else you saw, including that it stores this on your hard drive, is totally abstracted away. I don't have to bother about it from that perspective. So I would say, yeah, definitely, it works as an SDK. But again, I'm a core developer. Hopefully it's easy for me to build this. It should be fairly okay to build something that quickly. But of course, it's supposed to be working for you. All right. All right. For that, we have also brushed up our game a little bit on documentation. And one thing I would like you to look at, let me check the time. It's all right.
So we have reorganized the repo a little bit to make it a little cleaner. You can see there's a bunch of stuff around that. There's xtask, which is our task manager, and benchmarks and testing, which should be self-explanatory. We have the bindings and the UniFFI bindgen to organize bindings. We have labs, which is also where you find the jack-in implementation, if you're curious about this. But the main stuff lives in crates, and contrib is other things built on top. We do have an examples folder exactly for this kind of stuff. So let me quickly, is that possible? Roughly. I put the slides into the dev room if you want to look at them. Let me quickly run through one, the SDK bot 101 thing. You can directly run it from the repo with that command. What you see on the first screen is just the imports that we need. You see mostly Rust SDK stuff, and some minor managing around that. If you're familiar with Rust, you know that a binary has this main function. We use Tokio here. It's an async function, right? Async API. Most of that is just parsing the command line in a very ugly way, and then handing it over to login_and_sync. This login_and_sync sets up some minor stuff. You see that we have a lot of information about this in code comments right here for you. It even sets up a sled store. You can call login_username. You can give it a name for the bot, which is the device name that you will see, and it logs you in. Going further, I don't have the time to go through the entire thing, but it explains everything right here. This bot does two things. For every room that you ask it to join, it will automatically join, which is this first event handler, and the second event handler is reacting to messages. An event handler in the client is basically just a callback where you can say, when these kinds of events come in, please tell me, and then I react to them. Those themselves can be async again, which is pretty nice.
Then it just starts syncing, and that's all it does, which means it's running the sync loop. This does not, at this point, use Sliding Sync. As I told you, it's kind of experimental. Let's look at the room message handler. In on_room_message, whenever we receive a message, and I mentioned before that it's type-safe, it's going to give us the actual room message in a typed format, so we can rely on the compiler here to make sure that things are as they should be. We make sure that we are in this room. We try to figure out if it's a text message. If it's a text message, we check, is it !party? If so, we're going to respond with a message, and that's all the thing does. In reality, it looks like this. I'm showing you this is just regular main at the moment. Then, if I run the bot, this is slightly cropped, so you can't see my password. I'm here, connected to that bot. You see that I'm in here. I had two more prints that are not in main right now, to make it a little clearer. I'm sending a message. We see that this message is ignored, but if I send !party, you can see it's reacting, it's sending this. Most importantly, this is an encrypted room. I didn't have to do anything to build a bot that is able to live in and interact with an encrypted room. That's an encrypted message. I didn't have to do anything. You saw that there was no setup. I didn't have to manage anything. The Rust SDK did all of that for me. If you want to learn more, if you want to use this, you can find all of the code, of course, at matrix-org/matrix-rust-sdk. You can join our developer room and the general room for talking about the Rust SDK. The example you just saw is inside the examples folder, under getting started. The other client you saw before, jack-in, is in labs. All of that code is there, obviously. I really recommend going for the getting started example. It has a lot of documentation.
I also want to give an honorable mention to Benjamin, who is working on Trinity, which is a bot framework, I would say, built on top of the Rust SDK. It allows you to write some very small Rust that is compiled to Wasm, which it runs in the client and which can react to messages. You can write just the message part and say, like, I have a bot that reacts to messages. This is one. Oh, yeah. Element is hiring. If you are interested in working on this full-time: element.io/careers. We're going to have time for questions later. We have to get through all of these first. Let's see what you can actually build with this. Thank you. That's a long one. So, hello, everyone. My name is Mauro. Honestly, my colleagues all had a slide where they presented themselves. I don't have such a thing. So, I have to be brief. I come from Italy, Naples. I'm a software engineer. I work at Element. I mostly work on the iOS side of things and have also been working on some Rust implementations. Today, I'm going to talk about the new client, Element X. The new client is being built with a few well-defined goals in mind. The first of them is user experience. We really wanted to improve on the user experience of the current Element implementation. The thing is that Element started as pretty much a showcase for what Matrix was capable of. So, it was a bit like an app made by engineers, for engineers. So, yeah, not everyone is into this kind of stuff. So, sometimes it's a bit hard to use for the average user, and we want to improve on this. Also, of course, we want performance to be another very important goal, actually just as important as UX. Thanks to the sliding sync implementation in Element X, we're aiming to launch the app in less than 100 milliseconds. That's pretty much the thing that we're aiming for. And, of course, also to optimize bandwidth usage.
Also, we want to build an app that is reliable right from the start. So, testing and code coverage have been, right from the start of the project, a high priority. And also, we want to build the app in a way that relies on shared components. The Matrix Rust SDK is just one of them, but, of course, we're planning to build more components that will be shared across different implementations, across different platforms, different projects. So, not even necessarily only Element X. We will be able to use them, and, of course, anyone in the open source community will be able to use them. So, why are we rewriting the Android and the iOS app? That's actually a good question, because some of these goals could also be achieved with a very big refactor. But let's go more in depth on why we want to do a rewrite. So, let's start with Element iOS. Element iOS is quite old. It started in 2015. Essentially, it was, as I said, pretty much a POC to showcase what Matrix was capable of. It started out being named the Matrix iOS Console, in fact. Then it went through a bunch of identity crises and changed name three times. I guess it was first Console, then Vector, then Riot. Now, it's Element. Let's hope it's going to stick with that. So, yeah, it was built by engineers to showcase what Matrix was capable of. But the thing is that, first of all, as I said, the user experience was not great. Second, it was made with some very old components written in Objective-C that used some very old architectural patterns, like MVC, which should stand for Model-View-Controller, but stands more for Massive View Controller, because you end up putting everything in the view controller, and it's just a huge mess, and you start looking at 60,000 lines of code in a controller, and you're like, oh, my God, why am I alive? So, yeah, you don't want to see that anymore, pretty much. We want to move to a newer architecture.
Also, even though we did a lot of refactors on the Element iOS implementation, we were essentially not able to change all the old implementations, since they were very hard to maintain, and we still relied on these components a lot. So, yeah, core components are still using these old implementations. Half of the code is still in Objective-C, and code coverage is quite low. So, we decided to experiment a bit in Q2 2022. We decided to build a minimal client using the Matrix Rust SDK and the state-of-the-art frameworks provided by Apple, like SwiftUI, but not only that, also async/await and things like that. So, yeah, we were actually able to build this new minimal client that had a room list and a timeline, and it was, let's just say, a technical success. It was super fast and amazing. So, we decided to build on top of this second POC by giving more focus to the UX, because, as I said, now we have a performant client, but we also need to have a simple client that anyone is able to use. So, Element X iOS was then born. On the Android side, things are slightly different, because technically speaking, the Android application already had a rewrite in 2019. So, we had two choices. We could essentially just take the Android SDK, put it aside, and replace it with the Rust SDK, or rewrite the app from scratch using the state-of-the-art frameworks that Android provides right now, like for example Jetpack Compose. In the end, we decided to go for the latter, for two reasons. First of all, I mean, if we're building an application on iOS that uses the latest frameworks, why wouldn't we want to do the same for Android? And second, UX; as I said, UX was a very, very important concern.
So, whether we rebuilt the app from scratch or just changed some stuff in the existing app, it would still require a huge UX overhaul, which in the end made the rewrite make even more sense. So, how is the architecture of Element X structured? Well, we have the backbone of the client. It's pretty much all sitting in the Matrix Rust SDK. It's all there. And the Matrix Rust SDK, through UniFFI, is able to expose Swift bindings and Kotlin bindings. It's interesting because, as Ben said, it's exposing objects that are reactive, so the client doesn't need to care about the events, all the events, the newer events. It just needs to know that an event has changed and that it's in that place. It doesn't need to know that it's a new event that came afterwards and so on. So, the idea is that these objects that the bindings expose are already ready to be displayed, essentially. So, you just need to render them on the UI, and that makes development way easier. And of course, sliding sync is pretty much a requirement for Element X, in the sense that it's being built with the idea that sliding sync will be the next standard for clients. And so, for now, it will only work with servers that implement this sliding sync proxy, essentially. So, this is an example of how the code is translated from Rust into Swift and Kotlin through UniFFI. As you can see, there is the timeline item. I would say it is pretty much an object that is like a view model. It's already ready to be displayed. You just need to take the presentation data from this object and render it on the UI, and that's it. This will make implementing clients with the Matrix Rust SDK way easier in the future. So, the bindings are in a separate repo.
Anyone can download them as an AAR file for Android or as a framework for Swift implementations, or you can just use a package manager, like Maven Central on Android or Swift Package Manager on, yeah, Swift implementations, essentially. I say Swift implementations because, and this is interesting, the Matrix Rust SDK is capable of running on any Apple system target. So, I really can't wait for someone crazy enough to build a client for Apple Watch or Apple TV. I'm pretty sure that the 10 people in the world with an Apple TV will be very pleased that there is a Matrix client on their Apple TV. But, of course, Element X is going to share more than just the Rust SDK. We're trying to build other components that we hope to share not just across Element X but across multiple projects. For example, we want to build an OpenID Connect component and an Element Call component. And, of course, since the two apps are pretty much the same app on different platforms, they're going to share translations, they're going to share design tokens. So, I mean, why don't we just make a component to share these elements? And we're actually also building an interesting component which is called the Rich Text Editor, which is essentially an SDK written in Rust that exposes bindings in Swift and Kotlin through UniFFI, and also in WebAssembly. And it's essentially a UI framework that you can import into your client to render rich text in a what-you-see-is-what-you-get fashion, essentially. It's something that is going to come to Element X as well. So, keep an eye out for it. But, hey, what's this? Oh, actually, it's already there. Oh, but this is not Element X. Actually, the Rich Text Editor is already in Element right now. On iOS, Android, and web, you can enable it in Labs. You can just go to Labs, enable it, and test it.
And, you know, if you're able to break it, just, you know, send us some feedback and we will try to fix it as soon as possible. It's a project that I've worked on. I'm very proud of it. I think we achieved something really great, because it's a very simple way to create rich text without needing to write Markdown and then check how it looks, which will make life easier when you need to create something like this. Because I challenge everyone to make something like this with Markdown. I mean, you'd go crazy with that. So, yeah, the cool thing is that this Rich Text Editor SDK that we built is not just for Matrix clients. I mean, technically speaking, anyone could use this. Maybe you want to make a note app. You want to make, I don't know, an app that is your diary, whatever you want. You can pretty much implement this. And, if you want to test it, you can scan the QR code. You will get to the latest main implementation on web. It's a test debug version there. The one in Labs is more stable. This one is more for playing around with. It's cool because this one allows you to see how the rich text is transformed into a DOM representation, which is in Rust, and then transformed back into HTML, which is what we are sending over Matrix, of course. So, of course, testing for reliability, another important keyword. It's something that we want to improve. And so, we built a very, yeah, very stacked test infrastructure that we hope is going to cover all these areas. It's already covering most of these areas. And, yeah, it makes the app more reliable and the project way, way safer. So, yeah, Element X actually comes with a lot of benefits. First of all, on the tech side, it's way, way faster, thanks to the Matrix Rust SDK. I mean, it's amazing. It makes things easier.
Both from a development standpoint, because you just write it once and deploy it everywhere, and because the fact that you have your models already ready to be displayed is amazing. And also, sliding sync. And, of course, the use of declarative UIs like SwiftUI and Jetpack Compose actually makes development faster. And it's also easier to test, I would say. But the UI performance has also been improved. Also, sharing components is something that will benefit not just Element X, but pretty much anyone who wants to implement a Matrix client. But actually, we hope that some of the shared components that we're building will not just benefit the Matrix community, but the overall open source community. So, yeah, but we should not focus just on the benefits that we are offering on the tech side. We actually want to focus on the benefit we're really offering to the users, because in the end, the main focus of Element X is, yeah, performance, tech, shared components, but, of course, it's making the app more usable, more accessible, easier to use. We want to make an app that, essentially, can also be used by your friends and family to chat with you, even casually, during the day. So, not just by people who, essentially, want to keep their conversations safe and secure with the Matrix protocol. Roadmap. This is the present and the future of Element X. For now, you can log in, check the room list and timeline, send messages, edit, reply, react. But there are some restrictions. First of all, of course, as I said, sliding sync is required. So, if your server doesn't run a sliding sync proxy, yeah, you pretty much can't use the client on that server. Also, it only supports authentication, and authentication is only through the Matrix protocol.
We also want to support OIDC and registration, and once we have built the OIDC component, we will support that. Device verification is there, but only with emojis, so no QR verification yet, and also no messages through the description. Yeah, this is where you can find the Element X iOS repo. There will be a public TestFlight coming soon, and actually, Matthew, you will demo this this afternoon? Yeah. Okay. That's the plan. And regarding Element X Android, it's a bit behind schedule, because, as I said, it was developed after Element X iOS, so it's more in a state of being set up. But of course, you can try to run it and check the state of the repo. You know, if you want to play around with it, this is where you can find the actual repo of Element X Android. This is pretty much the roadmap of what we plan to do. Actually, more than a plan, let's say it's not a deadline, it's more like what we imagine we're able to achieve by these dates. And I was also told to be as vague as possible, so for the release date of the public launch, I will just say that it will come sometime in the future. All right. Okay. And that should do it. So, yeah, that's all. And we can do, I think, a rapid Q&A session, right? Yes, yes. We have 10 minutes. Oh, okay. Nice. Right on schedule. Nice. Okay. Yeah, pass this around. Yeah. Please go ahead. If I remember correctly, the sliding sync option in Element Web said in the warning that you can't disable it. Why is that? So, the question is, let me repeat it for the camera: why can't you disable the sliding sync labs feature in the current version? Mostly because of end-to-end encrypted messages: you would risk being unable to decrypt your end-to-end encrypted messages in that session. So, the reason is that when you log into the proxy, it's going to be syncing on your account, right? And it's going to sync forever.
Well, until the access token gets invalidated, but it's going to be syncing on your behalf. If you toggled sliding sync on and then off, then your Element Web would be using the v2 sync as well as the proxy, because the proxy didn't know you toggled it off. So, that means you've got two sync loops for your account. And that's going to cause problems, because it causes a race condition: to-device messages, when they're acknowledged, and they get acknowledged by advancing the sync token, get deleted on the server. So, if your Element Web was super, super fast and managed to race slightly ahead of the proxy, then it would go and get all the to-device events, and the proxy would not, or vice versa. And vice versa is the problem that it's trying to warn against. So, if the proxy was ahead, then you would not get certain to-device events, and therefore you may potentially lose room keys, and therefore may potentially be unable to decrypt messages. Hopefully that's clear. Do you have any data on whether sliding sync significantly impacts the server load? So, the question is, what about server load with sliding sync? Do we have any data? I need clarification: do you mean at a proxy level, or do you mean in a general sense, for native implementations in the server? Does using sliding sync improve server performance? A native implementation, yes, it would. So, one of the reasons why the existing sync implementation is slow is just that the servers have to do an awful lot of work. And obviously, I've been developing on Dendrite, so I know exactly what things are slow there. So, a lot of the API that's exposed to the clients is basically efficient ways that you can do it. So, you only get the current state of rooms, you don't tend to need to go back in time, you don't need to remember all your sync tokens since the beginning of time. These are things that slow down the processing.
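Going back to the to-device race described earlier in this answer: it can be reproduced with a toy model. This is a deliberately simplified sketch in Python, not the behavior of any real homeserver implementation; `ToyServer`, its integer "tokens", and the key names are hypothetical. The only assumption taken from the talk is that presenting a newer sync token acknowledges to-device messages and the server then deletes them.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Simplified, hypothetical model of a server-side to-device queue."""
    queue: list = field(default_factory=list)  # (position, payload) pairs
    next_pos: int = 1

    def send_to_device(self, payload):
        self.queue.append((self.next_pos, payload))
        self.next_pos += 1

    def sync(self, since):
        # Presenting `since` acknowledges everything up to that position,
        # so the server deletes those messages for good.
        self.queue = [(pos, msg) for pos, msg in self.queue if pos > since]
        delivered = [msg for _, msg in self.queue]
        token = self.queue[-1][0] if self.queue else since
        return delivered, token


def race_demo():
    server = ToyServer()
    proxy_seen, web_seen = [], []
    proxy_token = web_token = 0

    server.send_to_device("room_key_A")
    # The proxy loop runs first and receives the key...
    msgs, proxy_token = server.sync(proxy_token)
    proxy_seen += msgs
    # ...and its next request acknowledges (deletes) it.
    server.send_to_device("room_key_B")
    msgs, proxy_token = server.sync(proxy_token)
    proxy_seen += msgs
    # The slower web loop, still on its old token, only finds what is left.
    msgs, web_token = server.sync(web_token)
    web_seen += msgs
    return proxy_seen, web_seen
```

In this run the proxy loop sees both keys while the web loop never sees `room_key_A`, which is the "proxy was ahead" case that the warning is about.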
So, yes for a native implementation, but a proxy implementation obviously is another sync loop that's going to be made, so that will increase load, right? Because that's going to be constantly syncing on your account. Okay, so that's an Element X question, I guess. Wait, let me repeat it first. So, the question is about multi-user account support in the app. It's something that we're discussing, but for now, there is no definite plan, let's say. From the Matrix SDK side, I can tell you that you can do it. That's not an issue. So, I think you were next. Saw you. So, two-part question. One is, how far out do you think sliding sync is from actually being merged and finalized in the spec? And then the second part is, are there plans to do a native implementation of those APIs in Synapse? Yes, so the question is basically how long it's going to take for sliding sync to land, and will we get native implementations in Synapse? You will get a native implementation in Synapse; I don't know when. And yes, we're going to try to merge and land it as soon as is practically possible, but, you know, there are still things we need to fix, right? Things like threading just don't work. And that's actually one of the biggest blockers at the moment that stops us from trying out defaulting Element Web to sliding sync, for compatible servers, obviously: we don't have threading support, so you wouldn't have feature parity. So, when we do have feature parity, then, you know, there could be Element Web clients which enable it by default. It won't be in labs, it will be enabled in labs by default. So, you know, we're getting there, but I can't give you a time, unfortunately. Thank you, next. So, the question is about the authentication parts in the Rust SDK. So, yes, we have login via username and password. We have implemented OIDC in general, but I don't think it's fully tested. And we have an SSO feature as well.
So, we ask the server, as the specification says, right? The server tells us what is possible, and then we allow you to use those. So, generally, yeah, if your server uses SSO, you can use the Matrix SDK with it. Jan, here. Question from the Internet. Ooh, a question from the Internet. I heard about them. Are there any plans for, or what is the status of, Matrix RTC in the Rust SDK? Yeah, the question is about RTC in the Rust SDK. If you followed the RTC talk before, you noticed that most of the RTC part is actually offloaded to WebRTC in the current implementation. So, going through a web view. For us, as Rust, that means we don't have to bother about most of that. There's only some signaling that happens on the actual Matrix protocol. So, we don't have a plan at the moment to implement actual RTC on our side. I wouldn't see what you would want that for, other than that view. So, currently, it's not on the roadmap, at least on our side. Let me talk about IoT. Yeah. So, that's a common one, as Rust is very... so the question is about IoT devices. Could you do that with Rust? I'd say you can. Yes. That is generally possible. Because of the storage systems and some other things in there, and because Matrix itself is still quite heavy as an overall protocol, we have tried to get it onto an actual embedded device, and that is not possible at the moment. We would have to improve a lot on the way that we use Rust. Rust itself provides that, but we can't do that. But you can use it, for example, on, not an Arduino, but a Raspberry Pi. We know of people that run Raspberry Pis that have signals coming in, and then they use the Rust SDK to send them over into rooms. That is definitely possible, because it's more or less just a bot. From our perspective, it's just a bot. So, that is possible, but you still need a significant amount of memory at the moment, and that would make it not possible for actual embedded devices.
Yet, if anybody wants to do that, come to me. I can show you and mentor you and help you, because it would be very exciting if we had the possibility to do that. Jan, another question from the internet? There's also a question about the Element X what-you-see-is-what-you-get editor. Is it still possible to use just Markdown if you want to just use Markdown? So, the question is about the Element X WYSIWYG editor. Can you still use Markdown if you want to use Markdown? Actually, even in the current Element implementation that is in the client, you can still use Markdown. So, there's an option that allows you to turn off the rich text and turn the simple text back on, and when the simple text is on, you can use Markdown. But will it render in what-you-see-is-what-you-get fashion? So, the question is, does the Markdown then render in the WYSIWYG? No, when you're using the simple text version, it's rendering pretty much as simple text with the Markdown. So, any plans for rendering the Markdown in the rich text? Currently not. We're trying to build the rich text editor as it is, with just the rich text and the formatting toolbar, to be as performant, good and simple as possible. But it is something that, for sure, when we have a very stable product, we will look into. Next question. Will this finally unite the Markdown syntaxes that you can use in different Element clients? Will that finally reduce the number of different Markdown syntaxes that you can use in Element clients? I'm not sure about the question, actually. Will the WYSIWYG editor in simple text mode use one unified Markdown implementation, so you don't have to remember different variants of Markdown in different clients? But are you asking whether, in the future, we are going to support Markdown inside the WYSIWYG directly, without turning off the rich text? Is this what you mean?
In simple text mode, if you enter Markdown, would it be parsed the same way on Element iOS, Android and web? So, I think there's a confusion here. You switch on the WYSIWYG editor, then you get the WYSIWYG. If you turn it off, you have a simple text mode in which you can do some Markdown, but it's not going to be rendered inside of this. So, it's just going to fall back to the existing implementation. So, therefore, yeah, to answer your question, it's falling back to the existing implementation. So, no, they will still be incompatible. That said, we might switch to using the Markdown for round-tripping, because at some point, I think this was the previous question, people are going to want to round-trip between the Markdown implementation and the WYSIWYG one. And to do that consistently, you're going to want to use the same library. You put that in the Rust layer. And finally, we get out of the nightmare of CommonMark versus GitHub Flavored Markdown versus whatever random library the different Element platforms have. I think Android is still out of sync with the others. Great. One last question. Where can we meet you today, or maybe later, if we have more questions? I think we're going to hang around here, right? Yeah, for sure. We'll be at the stand too. We have a stand in K1. I'm just going to be around here, lurking, so just talk to me. Yeah, same for me. I'm going to be here. I'm the guy with that hat. All right. Thank you very much. Thank you. Thank you. |
AMENDMENT Widgets in the "Sovereign Workplace" for the German public sector |
Welcome, everybody. Good morning, and welcome to our talk, to all the people on the stream and on video on demand, and of course here in the devroom. My name is Kim, and I'm here together with my colleague Oliver. We are from Nordeck, and we are going to talk to you about Matrix widgets, particularly those we develop for the Sovereign Workplace project in the German public sector. So, first, quickly about us: Nordeck is an IT consulting company based in Hamburg, Germany. We are about 40 IT professionals, and, among other things, we develop first Matrix integrations for productivity software in the public sector, in the so-called Sovereign Workplace project. And we are here today to give you some insights that we gained in the last couple of years. So what's the Sovereign Workplace? Let's take a step back and introduce you all to it. It's a project with the goal of providing IT services to public administration in Germany and potentially also Europe. It's funded by the German Federal Ministry of the Interior and Community. And one of the core aspects is to gain independence from US cloud services and to retain full control over your data. And there's this suite we call the Sovereign Workplace, which covers many use cases for productivity, and it's achieved by combining products from many different vendors, which you can see on the right here. But all of this is open source software, so, yeah, you could say this whole project also supports and maybe funds some of the open source projects here. We could talk a lot more about the whole Sovereign Workplace project, and that could be a whole talk by itself on another track, but of course we are here to talk about Matrix in the Matrix devroom. So, yeah, a quick overview: here we can see the Univention Corporate Server and Keycloak, which implement identity management in this stack. There's groupware, which is done by Open-Xchange. There's cloud storage by Nextcloud.
There is an office suite which integrates with Nextcloud very tightly, which is Collabora Online, and of course there's video conferencing, which is done with Jitsi right now and perhaps in the future using Matrix natively, and of course there's messaging, which is done natively on Matrix, and particularly using Element as a client. How are we involved in this? Well, together with Element we provide this real-time communications component, which is messaging and also video conferencing, so there's chat and video, and in particular we are extending this chat use case using widgets for some specific use cases, and this is the crucial part of this talk. The idea is to provide better integration with other components in this stack and build new work solutions. So, you might be familiar with the concept of a widget, and we've seen it in the previous talks, but still I want to summarize a bit. As I said, we can use widgets to extend the regular chat client functionality for specific use cases, and I brought you some examples here. Essentially, and this was also said before, it's an option to embed some kind of web app into your existing Matrix client and display more functions right there. So, for example, you can put it in the right bar, like Hookshot does here, or you can pin it to the top of the room, and it will even adjust to your theme, like this countdown. And there's more: you can add a lot of widgets at once and see all of them at once in multiple places.
Of course there's video conferencing, which you can also maximize to view all of the people in the video conference rather than the chat, and, yeah, there's also this half full screen mode where the chat moves to the right side and you can see the widget in a full-screen-ish style. And there's more: actually, if you're following FOSDEM remotely, you might find this a familiar sight, so you are actually using widgets right now to watch this. There's this schedule, there's the live stream, and also on the editor side there are widgets to support us. So, if you followed closely or you are very familiar, you might have noticed: wait a minute, this is a bit more than just static web pages. Actually, there's a thing called the widget API, and that allows your embedded page to interact with the client but also with Matrix. So, for example, from our widgets we can also send messages to the room and read messages from it. So, one way to explain widgets would be to say these are some kind of limited Matrix client for a specific use case, and you can build a lot of things, really, there are endless possibilities. And now Oliver is going to show you a bit more about how that works.
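One way to picture "a limited Matrix client for a specific use case" is a host that serves a small set of requests from the embedded page and nothing else. The Python sketch below is only an illustration under that assumption: `ToyHostClient`, the action names `send_message` and `read_messages`, and the room IDs are hypothetical and are not the actual widget API messages.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHostClient:
    """Hypothetical host client that serves requests from an embedded widget."""
    rooms: dict = field(default_factory=dict)  # room_id -> list of message bodies

    def handle_widget_request(self, widget_room_id, request):
        # The widget never names a room; it is bound to the room it is embedded in.
        timeline = self.rooms.setdefault(widget_room_id, [])
        action = request.get("action")
        if action == "send_message":
            timeline.append(request["body"])
            return {"ok": True}
        if action == "read_messages":
            limit = request.get("limit", 10)
            messages = timeline[-limit:] if limit > 0 else []
            return {"ok": True, "messages": list(messages)}
        # Anything else is outside the widget's limited surface.
        return {"ok": False, "error": "unsupported action"}
```

A widget embedded in one room can send and read there through the host, but it has no way to reach another room's messages or to call anything the host does not explicitly serve.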
Thanks, Kim. So let's take a step back. We're now back at the concept that it's a website that's embedded. So you have an iframe, we'll see it soon, and the iframe is just any static website, and that website has an API that can be used to communicate with the client, so for example, in this case, Element. And then, on the other side, the Element client is connected to your homeserver, so it's, I would say, a pass-through API which passes the commands of the widget to the homeserver via the widget API, and with this API you can do a lot of stuff. One thing that is important is that this iframe also gives you isolation, so even though you embed a website there, it doesn't have full access to what Element does; everything that you do has to be done via the widget API. From the feature perspective, you can separate the widget API into two parts. On the one side you have everything that is related to displaying the widget and interacting with the client. For example, Kim showed you that you can have these different display modes. There are even more: for example, you can have modal widgets that are then displayed in a full screen view, you can set things like always on top, where the widget is displayed in the bottom right corner and it's always there even if you switch to a different room, and you can do capability requests; I will come in a second to what that means. And then you have the second group of features, which is Matrix or room interactions, so that is everything about sending events and receiving events, state events, room events, or to-device messages, as Florian also mentioned today, and some special things like reading relations and requesting an OIDC identity token that you can use at another service to identify yourself. So if you now go back to this reading and sending of messages, you notice that actually you could use Matrix for non-chat applications where you store your data in the room. You can see it as a real-time database, we had the term today,
and store your stuff there and have all the benefits of the Matrix protocol that you would have for chat too. For example, you can store your data and have it end-to-end encrypted, you can collaborate with others via federation, and all these features are already there, so you can see it as a nice backend for building collaboration apps. So, I already mentioned this term, widget capabilities. One issue that you would have if you provided all these features to an app without any security is that it could do a lot. I could build a widget that reads all your data, like which chats you write, and posts it somewhere, or does stuff in your name, for example it could send messages in your name. Therefore there is a security mechanism, these widget capabilities, and it's actually quite similar to what Android is doing with permissions. Every administrator can embed a widget into your room, and once your web app is embedded and started, it has the chance to request specific permissions, and then the user gets a screen like this, it's quite a long list, and can explicitly allow data to be shared with the widget. That way you keep control over your data, you don't share anything that you don't want to share with a third-party site, and only then does the widget have access to it. But maybe grouping and writing better texts can help here to avoid the situation where users just click accept without thinking about what they are sharing. So now I would like to talk a little bit with you about the status of the widget API. Right now it's only supported completely in Element Web and Desktop. There is support for widgets on Android and iOS, but it's more like static web pages without the widget API, and what the widget API gives you is this interactivity that you wouldn't have with just a static web page. So why is it only supported in those two clients? I think there are at least two reasons. One reason is
that it's not part of the Matrix spec yet. There is a draft here, which is the spec, but the draft is a bit outdated. It collects some of the MSCs around it, but a lot of the MSCs are not part of the spec yet, so I guess that makes it quite hard to develop based on it. It makes it hard for a consumer of the widget API to develop against it, but it is also hard as an embedder, because you always have to go through the MSCs, look for the stuff that you need, and build it. But I think that's not the only reason why it's not yet fully supported by every client. I think it's also that it's maybe not the perfect implementation yet, and what I mean by that I want to talk about on the next slide. The widget API is not feature complete, so it doesn't support everything you have in the matrix-js-sdk; it has a subset of features. If I want to extend the widget API with a new feature right now, we have to follow this process. Let's assume that I want to support uploading content into the media repository, which would be quite nice for some use cases. Then I would create an MSC, which I think is not a problem in itself. You look into the current spec of the client-server API and look at the behavior. For example, uploading requires thinking about quotas, size limits, and stuff like that, and about how the API reports this information back to you. So you look at the spec, copy that behavior so that it's the same, and think about how you integrate it into the widget API. Then, as the next step, you think about capabilities: how can I prevent users from doing stuff that they shouldn't do with it, or how can I keep control over it? Then you have a base to start the implementation. You would probably implement it in the matrix-widget-api repository, which is the library that both the embedders of widgets and the widgets themselves use. As the next step, you would implement it in Element and,
if there were more clients that support it, in all the other clients where you want it supported. That's actually a lot of work, and just for something that was already there, right? If you didn't use widgets, you could upload files with the client-server API. And then you notice that you can upload files but not download them, and the process starts over again. So I think the problem is not the MSC process; the problem is that you have to copy stuff that is already there: behavior that is already there, specs that are already there. It would be much better if I could just use the client-server API from my widget. There is already an idea for doing that: there's this MSC that thinks about how I can share the client-server API with my widget. In this specific MSC it's done by sharing the access token with the widget, which brings new challenges. For example, you have to think about capabilities again, because if you share the whole access token, the widget can do everything. So you would have to think about how to restrict the access again. One idea there is to use scoped access tokens; there's actually an MSC for that, and I've missed it here. You use a scoped access token to create a token that can only access the stuff that the user previously allowed the widget to access. That's how you can mirror the capabilities that you previously had with the widget API in this approach. That would bring us a lot of benefits. We would directly have feature parity with the client-server API. It may also be more performant, because right now we rely on Element loading all the data and also relaying all our requests to the homeserver; instead we could talk directly to the homeserver and optimize our channel or our tools. And we would also assume that it's a lot easier to implement, because the widget embedder, so a client like Element, only has to implement the exchange of
the credentials to the widget, but not all the API calls that are available, which have to be relayed, with capabilities being checked and all that stuff. So it would make the implementation a lot easier and hopefully also bring it to more clients. There are some challenges. For example, it's actually quite good that Element does all the work, because as a widget author you don't have to think about sync and you don't have to think about E2EE, so end-to-end encryption. That makes it quite simple, and those are probably challenges we have to solve then. Also, this MSC proposes to pass the access token into the widget via the URL, which might not be the best way. Maybe OIDC, once it's there, can help us here to delegate identity and access to the widget. In the morning, Florian already talked about this matryoshka mode for Element Call. There they're using the, I think it's called RoomWidgetClient, from the matrix-js-sdk, which is quite cool because it allows you to already start using the matrix-js-sdk in your widget. It feels like the matrix-js-sdk, but it's relayed over the widget API, and it may later provide a better way to migrate to this style of API. So what would these features bring us, or what can you already do with the existing widget API? You can build really cool collaborative tools, because you have a real-time communication channel. Kim will give some examples later, so I won't give them now. You can build stuff where you would normally have to build a backend, a communication layer, and all this stuff, and think about a lot of things that Matrix already has and brings. You can use the rooms for data storage. There are actually some tricks needed to do that efficiently; that would be a talk in itself. But you get the idea that all the applications that you build could just use it. I talked about real-time
communication before. Actually, if you use Matrix, you have a kind of very slow real-time communication. It's not suitable for building more complex or quicker stuff like, for example, a game or a whiteboard. That's where, for example, MatrixRTC comes into play, where you have direct peer-to-peer connections or connections via SFUs, and having access to that in your widget would allow you to build really great stuff. And if you reach your limits with widgets, you always have the option to switch to more components, for example a bot that is invited into your room and helps you do stuff that you could not do with just the widget, which runs on the user side. Kim will now give some examples of widgets that we built and also do a quick demo. Thank you, Oliver. So here are our use cases, the widgets that we built. As you can see, there are four of them. The first one we built is the polls widget. As you might know, there are polls in Element now, and I believe they are coming to the spec; I think there's an MSC. These allow you to do some simple polls, but you might have cases where you actually need to do some fancier things, like using parties, for example. So we built this polls widget that covers many more advanced use cases, and in fact it was already open sourced at the end of last year, I believe in November or December sometime, and you can find it online under this link on our GitHub. The next one is the barcamp widget. If you are unfamiliar with a barcamp, the idea is to meet in a group, collect some topics, build a schedule right there, and then have an event based on that schedule. I'm going to show it to you in a minute. This is our second widget that is also now open source. Further, we are also developing a meetings widget, which allows you to create, well, appointments within the
widget, and it will set up rooms for you and set up the possibility to have a video call right there. This is a work in progress but will also be open sourced. And then there's also the whiteboard widget we are building, and again this is going to be open source at some point when it's finished or in a beta state. Right, so as I said, I want to show you the barcamp widget for a bit. We actually got the chance to use it productively on Friday at our Matrix community meetup, and we have prepared a quick video for you. Right here on the left-hand side you have the grid. You can, for example, add tracks, edit the track names, and choose some icons on that axis, and of course you can also modify the other axis, which holds the different time slots. You can move stuff around and change the length of stuff. Once that's set up, you can enable topic submission, and you and all the other users in the room can create these Post-it-style cards, where you enter your topic and maybe a short description. Once you send it, it will appear here on the right in the parking lot. Because this is not yet quite supported on platforms other than Element Web and Desktop, we also built a bot as a compatibility layer, which allows you to also submit topics through the chat. Right here you write a bot command, the bot converts it to the event that's read by the widget, you see a tick, and it also appears right here under the select-your-topic button. Here's the first one we created. At that point you can even edit it as a moderator, and then you just move it into the schedule. Then you can select the next topic and also review it, maybe you edit it, maybe you don't, and put it on your schedule. You also have the feature of locking and unlocking submissions for the non-moderator users in your room, and I believe that's it. All right, so thank you very much, everybody, and it's now time for Q&A. We also have this QR code: if you want
to find us on Matrix, you can use the QR code or this room alias and come talk to us. Of course, you can also find us in the devroom online. Right, do you want to answer? Yeah, okay. So I think it was kind of two questions. One question was how users find the widgets that are installed in a room, and I think the second question goes more in the direction of integration managers like Dimension, whether there's something like that available. Widgets are state events in a room. Once you add one to the room, you have an event with a URL, so it can be hosted on any web server, and the client then embeds it. Then you have the question about discoverability. There are integration managers, for example Dimension, that you can use to add widgets, but at least Dimension doesn't yet have good support for widgets that use the widget API, so you would probably need something else. I don't know of any integration managers that support them very well. So right now you have to have the URL of the widget, and then you can just use the add-widget command and add it to your room. But an integration manager would be great, so I could just click in the bottom right of Element, where there is already this integration for it. And then as a room admin or moderator you even have the ability to pin the widget to the top of the room, for example, or to maximize it, and then to save that view state for the room, and then everybody else in the room will automatically also have the widget open for them. Just as additional info: in the Sovereign Workplace we have the meetings widget, which is the starting point for creating meetings with widgets, and there we have a kind of integration manager built in. It doesn't help for other rooms, but there the user already has the option to just tick the widgets that they want for the meeting. In this virtual presentation of the devroom, the widget is showing me all the questions, where people ask
their questions and which are most upvoted, and there's a question there: how do widgets manage end-to-end encryption, so do widgets have access to encrypted rooms? So the question was how widgets manage end-to-end encryption. Right now it's actually quite transparent to widgets, because they don't know about end-to-end encryption. The client itself does everything and just returns the already decrypted events to the widget, and the other way around: as a widget I just send a message over the widget API, and Element, for example, does the heavy lifting of encrypting it and sending it to the room. Yes and no. Element Call itself is a widget too, so it also uses the widget API and implements all its stuff, also the MatrixRTC stuff, via the widget API. Currently I think the code for that is mainly in the matrix-js-sdk, and they use this RoomWidgetClient to communicate via the widget API with the client and then the room. Is there a place to discover widgets, or is there a collection of widgets where you can find them? Yeah, actually it's quite hard right now. So the question was how to discover widgets, whether there is a central place where you can find them. I would say yes and no. There are widget integration managers like Dimension, but it only has a subset of all the available widgets built in, and right now there's nothing like a store or collection where you can easily choose from all of them. I believe the matrix.org website either already collects or is going to collect a list of all the available widgets, so you can browse them there. And of course, if you create a widget or any other app yourself, you can make a pull request against the Matrix website, so please let everybody know about anything you build. Okay, thank you very much. |
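To illustrate the widget capabilities mechanism from the talk above, here is a minimal Kotlin sketch of a client that only relays widget requests the user has approved. It is not the real widget API: the capability strings, `WidgetRequest`, and `CapabilityGate` are hypothetical names chosen for illustration.

```kotlin
// Hypothetical capability strings and types; not the real widget API names.
data class WidgetRequest(val action: String, val eventType: String)

class CapabilityGate(private val approved: Set<String>) {
    // A request is allowed only if the user approved "<action>:<eventType>".
    fun isAllowed(request: WidgetRequest): Boolean =
        "${request.action}:${request.eventType}" in approved

    // Relay the request to the homeserver only when it is permitted.
    fun relay(request: WidgetRequest, sendToHomeserver: (WidgetRequest) -> Unit): Boolean {
        if (!isAllowed(request)) return false
        sendToHomeserver(request)
        return true
    }
}

fun main() {
    val gate = CapabilityGate(setOf("send.event:m.room.message"))
    val sent = mutableListOf<WidgetRequest>()
    println(gate.relay(WidgetRequest("send.event", "m.room.message")) { sent.add(it) })   // true
    println(gate.relay(WidgetRequest("receive.event", "m.room.member")) { sent.add(it) }) // false
}
```

The point of the design is that the widget never talks to the homeserver itself; every request passes through a check against what the user accepted on the permission screen.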
Trixnity
One Matrix SDK for (almost) everything written in Kotlin |
So, welcome to my talk about Trixnity, one Matrix SDK for almost everything. I added "written in Kotlin" a few days ago, so maybe there are some Kotlin fanboys here. Let me first introduce myself. I am Benedict, and my friends often see me as a killjoy when it comes to data protection and data security, but I convinced them to come to Matrix anyhow. So I have 20 users, family and friends, on my own Matrix homeserver. My first contact with Matrix was four to five years ago, and I have gained a lot of experience with it since then, so much that I founded connect2x. This is a company that is developing Timmy, which is a TI-Messenger for the medical health sector in Germany. Now let's start with Trixnity. Trixnity is a Matrix SDK, and it is for developing clients, bots, app services, and servers. It is multi-platform capable. Everyone thinks Kotlin is JVM only, but it is not: you can compile it to JS and you can compile it to native, which is important for iOS. So we have all those targets with Trixnity. It's also developed test-driven, so we have high test coverage, and it is licensed under the Apache 2 license. You may wonder: why another SDK? Back in January 2020 there were only a few multi-platform SDKs to choose from. If I remember correctly, there was the Matrix Rust SDK, but it was at a very early stage, and there was the Dart SDK, but that very likely forces you to use Flutter in the UI. So there is no really free choice of which UI framework you want to use, especially when you want to use native UI technologies, for example Compose on Android or Swift on iOS. Additionally, most SDKs didn't have very strict typing of events and REST endpoints, and the extensibility was a bit limited. And even if the next point is not that important, SDKs were really bound to their purpose: you had one SDK for a client, one for a server, one for a bot, and so on. So you had to learn a new SDK for each target, each purpose of application for Matrix. Why choose Kotlin?
I chose Kotlin because it is a statically typed language which compiles, as I mentioned, to JVM, JS, and native. You don't need bindings like in Rust: when you use JS you get JS, and you don't need to make bindings over WASM or something. On native you can just call it from your Swift or Objective-C code in Xcode and have access to Trixnity. Moreover, besides shared common code, it is possible to write platform-specific code. You just define a common interface, and depending on the platform, the actual implementation can be different. This way you have access to platform-specific APIs and libraries, which can be very helpful when implementing encryption like AES for attachments. So on each platform you can use the native encryption algorithm the platform already gives you. And last but not least, you can define your own domain-specific language. You will see later what I did with that. So let's start with the core of Trixnity. The core contains all basic data structures of the spec and their serialization algorithms. This includes events, identifiers like user IDs, event IDs, and so on, and other things like cross-signing keys and device keys. One goal of developing Trixnity was the ability to add custom events which are strictly typed. This is achieved by mapping event types to a serializer. In this example, we add a new type, m.room.pum.cat, of the Kotlin type CatEventContent. So you have access to all fields of this CatEventContent and don't have to mess around with the JSON. The next layer of Trixnity is the API layer. Each API has its model, which defines all endpoints of the API. The actual client and server implementations just use these endpoints. As a consequence, there is no need to define things twice: they are using the same Kotlin class. So a Kotlin class (not a Kotlin object, sorry) represents an endpoint on the Matrix side. The best way to show this to you is with an example. This example is the endpoint LeaveRoom.
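The mapping of event types to strictly typed content described above can be sketched roughly as follows. This is not Trixnity's actual API (which, per the talk, maps event types to serializers); `EventContentRegistry`, the parser lambdas, the fields of `CatEventContent`, and the event type `org.example.cat` are assumptions made so the example runs with only the Kotlin standard library.

```kotlin
// Sketch only: the talk describes serializers; these names are hypothetical.
interface EventContent

data class CatEventContent(val name: String, val lives: Int) : EventContent
data class UnknownEventContent(val raw: Map<String, Any?>) : EventContent

class EventContentRegistry {
    private val parsers = mutableMapOf<String, (Map<String, Any?>) -> EventContent>()

    // Register a typed parser for one event type string.
    fun register(type: String, parser: (Map<String, Any?>) -> EventContent) {
        parsers[type] = parser
    }

    // Parse raw content; unregistered types fall back to an untyped wrapper.
    fun parse(type: String, raw: Map<String, Any?>): EventContent =
        parsers[type]?.invoke(raw) ?: UnknownEventContent(raw)
}

fun main() {
    val registry = EventContentRegistry()
    registry.register("org.example.cat") { raw ->
        CatEventContent(raw["name"] as String, raw["lives"] as Int)
    }
    val content = registry.parse("org.example.cat", mapOf("name" to "Tom", "lives" to 9))
    // After the type check, the fields are available without touching raw JSON.
    if (content is CatEventContent) println("${content.name} has ${content.lives} lives")
}
```

Once the type is registered, calling code works with `CatEventContent` fields directly, which is the benefit the speaker highlights.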
You just implement MatrixEndpoint and give it the types; in this case, Unit is the response. So we don't get a JSON as the response, just an HTTP OK or an empty JSON. You can also define a request, a URL, the HTTP method, and all that. And you can use a Matrix client on the client side to call these endpoints: you create a LeaveRoom object as the request and you get the response. That's all on the client side. And it's the same thing on the server side: you define an endpoint, give it the type you expect as a request, and in the context object you have access to the request and can answer with a response. To make it a bit easier for developers, there is a bit of abstraction on top of that. So you can also just call leaveRoom, and you don't have to know which endpoints exist. You just type a dot in your IDE and see: OK, there's a leaveRoom, I can leave a room. And the same on the server side: you just need to implement an interface and see all the endpoints you have to implement to be a fully featured Matrix server API. Regardless of the API, there are Trixnity Olm and Trixnity Crypto. Trixnity Olm is just a wrapper for libolm. As mentioned, platform-independent code doesn't need to worry about the actual platform-specific implementations, so when you use Trixnity Olm, you don't need to know how libolm is accessed. On the JVM, I use JNA. On JS, this is done via WASM, and on native, it's just C interop from Kotlin. Libolm is also packaged into Trixnity Olm, so as a developer you don't need to ship the built C library, and it is just loaded automatically. You don't need to init your encryption like in other libraries, as it's just loaded. My plan is to migrate that to vodozemac, but currently UniFFI, which we heard about in another talk, does not support WASM targets. So currently I could only use vodozemac on Kotlin JVM and native, but I also want to use JavaScript.
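The idea from the API layer above, one class per endpoint shared by client and server, can be sketched like this. It is a simplified illustration, not Trixnity's real `MatrixEndpoint` interface: `Endpoint`, `FakeServer`, and `FakeClient` are hypothetical, and the "network" is just a direct function call.

```kotlin
// Hypothetical, simplified model; not Trixnity's actual MatrixEndpoint interface.
interface Endpoint<RES> {
    val method: String
    val path: String
}

// One class describes the endpoint for both sides. Unit means "no response body".
data class LeaveRoom(val roomId: String) : Endpoint<Unit> {
    override val method = "POST"
    override val path = "/_matrix/client/v3/rooms/$roomId/leave"
}

// In-memory stand-in for a server: stores which users are in which rooms.
class FakeServer(private val members: MutableMap<String, MutableSet<String>>) {
    fun handle(user: String, request: LeaveRoom) {
        members[request.roomId]?.remove(user)
    }

    fun isMember(user: String, roomId: String): Boolean =
        members[roomId]?.contains(user) == true
}

// Stand-in for a client: a real one would send `method` and `path` over HTTP.
class FakeClient(private val user: String, private val server: FakeServer) {
    fun request(endpoint: LeaveRoom) = server.handle(user, endpoint)

    // Convenience wrapper so callers do not need to know the endpoint class.
    fun leaveRoom(roomId: String) = request(LeaveRoom(roomId))
}

fun main() {
    val server = FakeServer(mutableMapOf("!room:example.org" to mutableSetOf("@alice:example.org")))
    val client = FakeClient("@alice:example.org", server)
    client.leaveRoom("!room:example.org")
    println(server.isMember("@alice:example.org", "!room:example.org")) // false
}
```

Because both sides depend on the same `LeaveRoom` class, the path, method, and types cannot drift apart between client and server.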
So this project is currently on ice. Trixnity Crypto currently implements the key management and allows you to decrypt and encrypt events. In the future it will do more, so you can reuse the complete crypto stuff, for example in app services. Trixnity Client allows you to... oh, sorry. The most abstract layers are Trixnity Client and Trixnity App Service. While Trixnity App Service is still very basic and does not have a persistence layer, Trixnity Client allows you to choose which database and which media store implementation you want to use. And on top of that, there is something that isn't released yet. We are not sure how to release it, because we have to make money with our company. It is Trixnity Messenger. This is just the view model representation of a messenger. So you only have to implement a thin UI layer: when the user clicks a button, the UI sends this to the view model, and the view model says, OK, send a message, or go to this room, or anything else. With this approach, we have implemented an iOS client in a few weeks with one person. So Trixnity Client allows you to implement a fully featured Matrix client or bot. If you were at the Matrix Rust SDK talk, you can just take their presentation and, instead of Rust, write Kotlin: everything that the Matrix Rust SDK does, Trixnity does too. Some features like sliding sync aren't there, because we want to follow the stable Matrix spec, so we don't implement any MSCs. We have all the E2E features, the exchangeable data stores and media stores, a reactive cache on top of that, notifications, thumbnail generation, all that stuff you need to implement a client. There are already some media store wrappers that we implemented for all targets: on all targets except browsers we just use the file system, and on browsers we use IndexedDB. Next I want to talk about how I accidentally created a cache.
So on the left side you see the relation between the UI, Trixnity, and the storage layer. Because reactive UIs are really common, I wanted Trixnity to give the UI access to the data in a reactive way: if anything changes, the UI should immediately know about it. But the question is how. On the one hand, there are a few databases which support listeners to react to changes in the database, but on the other hand, this would limit the number of supported databases, because finding a common interface for listeners would be hard. So I started implementing an intermediate layer based on Kotlin flows. A flow in Kotlin is a reactive data structure: you have a producer on the one side and consumers on the other side, and if the producer changes anything, the consumers immediately know about that. And what does the intermediate layer do? It talks to a very thin database layer which only knows how to save, read, and delete data. If someone wants data from this layer, it just reads it from the database, and if someone changes something in this layer, it just writes it to the database. The values are kept in this layer as long as anyone is subscribed to them. This means that if anyone else subscribes to a value, they will immediately get the current value, and no additional database call is needed, because the value is held in the intermediate layer. This goes so far that even if there are no subscribers anymore, I just keep the value a bit longer in this layer. So if someone asks for a value, for example, 10 seconds later and the value is still stored, they get the value and there is no database call needed. And you can now guess what I implemented: it's just a cache. So as you see, with this cache everything in Trixnity is reactive. These are just a few examples: you can get all users or check if a user can invite another one, and you immediately get notified if anything has changed.
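The intermediate layer described above can be sketched in a few lines. Note the swap: the talk says Trixnity builds this on Kotlin flows, while this sketch uses plain callbacks so that it runs with only the standard library. `ObservableCache` and its methods are hypothetical names, and the delayed eviction from the talk is reduced to an explicit `evictUnused()` call.

```kotlin
// Sketch with plain callbacks instead of Kotlin flows, to stay dependency-free.
class ObservableCache<K, V>(
    private val read: (K) -> V?,        // thin database layer: read
    private val write: (K, V) -> Unit,  // thin database layer: save
) {
    private class Entry<V>(var value: V?, val listeners: MutableList<(V?) -> Unit> = mutableListOf())

    private val entries = mutableMapOf<K, Entry<V>>()

    // Subscribe to a key; the current value is delivered immediately.
    // Returns a function that cancels the subscription.
    fun subscribe(key: K, listener: (V?) -> Unit): () -> Unit {
        val entry = entries.getOrPut(key) { Entry(read(key)) } // database read only on a miss
        entry.listeners.add(listener)
        listener(entry.value)
        return { entry.listeners.remove(listener) }
    }

    // Update a value: persist it, then notify every subscriber.
    fun update(key: K, value: V) {
        write(key, value)
        val entry = entries.getOrPut(key) { Entry(value) }
        entry.value = value
        entry.listeners.toList().forEach { it(value) }
    }

    // Drop entries nobody is subscribed to. The talk describes doing this after a delay.
    fun evictUnused() {
        entries.entries.removeAll { it.value.listeners.isEmpty() }
    }
}

fun main() {
    val db = mutableMapOf("name" to "Alice")
    var reads = 0
    val cache = ObservableCache<String, String>({ reads++; db[it] }, { k, v -> db[k] = v })
    val seen = mutableListOf<String?>()
    val cancel = cache.subscribe("name") { seen.add(it) }
    cache.subscribe("name") { }    // second subscriber is served from the cache
    cache.update("name", "Bob")
    cancel()
    println("$seen reads=$reads")  // [Alice, Bob] reads=1
}
```

The database only needs `read` and `write` functions here, which mirrors the argument in the talk that a thin storage interface keeps many databases pluggable.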
As mentioned, the database layer is really thin, so we implemented many database layers. There is a SQL-based one via Exposed for JVM-based targets. We implemented one with Realm, which can also be used on native targets like iOS, and for browsers we have IndexedDB. Most of the data changes when a sync is processed, so it is way more performant to make one large transaction around the sync. You don't have a transaction every time the cache writes something into the database; Trixnity just spans a large transaction around the sync, so you have thousands of writes in one transaction. So everything is fine? No. Then there was Realm. Realm is a really fast database, but Realm only allows one write transaction at a time. If another one wants to write to the database, it needs to wait until the first transaction has ended. And the problem is that while the sync is running, we may have to wait for outdated keys to be updated in order to decrypt stuff. So if the outdated-keys part of Trixnity wants to write something to the database, it needs to wait until the sync has ended, but the sync waits for the keys to be updated, so we have a deadlock there. This is one of the reasons why I introduced async transactions. The other reason was that most of the sync processing time, as I found out with some benchmarks, was wasted on writing to the database. Processing a sync takes a long time because there are so many I/O operations, and the user has to wait until all operations are done. So what does an async transaction in Trixnity mean? It means that all changes to the database are collected and processed in the background. Database operations are decoupled from the cache layer and are just written in the background. If anything fails, it is rolled back, but that's irrelevant in the normal use case. So we can process more syncs in the same time than if we waited for each transaction to finish.
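The async transactions described above amount to a write-behind queue: writes from one sync are collected into a batch, queued, and applied later. The following is a simplified, single-threaded sketch of that idea, not Trixnity's implementation. `WriteBehindStore` is a hypothetical name, the database is a plain map, and `processPending()` stands in for the background worker.

```kotlin
// Simplified, single-threaded sketch: a real implementation would run
// processPending() on a background worker. All names are hypothetical.
class WriteBehindStore(private val database: MutableMap<String, String>) {
    private val pending = ArrayDeque<Map<String, String>>()

    val queued: Int get() = pending.size

    // Collect all writes of one sync into a batch and queue it; returns immediately.
    fun asyncTransaction(block: (MutableMap<String, String>) -> Unit) {
        val batch = mutableMapOf<String, String>()
        block(batch)
        pending.addLast(batch)
    }

    // Apply queued batches in order. A failing batch is rolled back and dropped.
    fun processPending(failOn: (Map<String, String>) -> Boolean = { false }) {
        while (pending.isNotEmpty()) {
            val batch = pending.removeFirst()
            val snapshot = database.toMap()
            try {
                batch.forEach { (key, value) -> database[key] = value }
                if (failOn(batch)) error("simulated I/O failure")
            } catch (e: IllegalStateException) {
                database.clear()
                database.putAll(snapshot) // roll back this batch only
            }
        }
    }
}

fun main() {
    val db = mutableMapOf<String, String>()
    val store = WriteBehindStore(db)
    store.asyncTransaction { it["since"] = "token1"; it["room"] = "state1" }
    store.asyncTransaction { it["since"] = "token2" }
    println("queued=${store.queued} db=$db") // queued=2 db={}
    store.processPending()
    println("queued=${store.queued} db=$db") // queued=0 db={since=token2, room=state1}
}
```

Because `asyncTransaction` returns before anything is written, sync processing no longer waits on database I/O, and no long-lived write transaction is held open that another writer could block on.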
And this gave Trixnity a huge performance boost. Actually, I released it last week, and I wrote an integration test which just fails if it is not 50% faster. So far it is always green, I don't know. The next thing I did completely differently in Trixnity is timelines. Normally, syncs are sent as fragments from the server to the client. One fragment contains a few timeline events, and if there's a gap, you get a token. So you know as a client: OK, there's a gap, I need to fetch to fill that gap, and so on. These fragments are normally saved as is to the database in clients. In Trixnity, I use another approach. There, each timeline event points to its neighbors, and if there's a gap, the timeline event knows about it. This allows Trixnity to again benefit from Kotlin flows. We have a producer, which is the room starting from a timeline event, and a subscriber, who wants the next timeline event to fill its timeline. This allows us to go through the timeline really fast and build the timeline on top of that, and it makes it easier to fill the gaps, because we don't have another layer, fragments; we just have timeline events. This way, it's also very easy to connect upgraded rooms. That one I released yesterday, I think, or two days ago. The timeline event just points to another timeline in another room, so room upgrades are invisible in timelines for users of Trixnity. You just get an infinite timeline until you reach the oldest room and the first timeline event. And finally, a small example. If you want to write a bot, that's a good way to start using Trixnity, just to get a feeling for how it works. You can just call getTimelineEventsFromNowOn. What this does is subscribe to the flow that I mentioned, which builds the timeline, and wait until the timeline event is decrypted, because the timeline event itself is also a flow.
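The linked timeline described above, where each event points to its neighbor and knows about gaps, can be sketched as a simple linked structure. This is an illustration under assumptions, not Trixnity's data model: `TimelineEvent`, `TimelineStore`, and the field names are hypothetical, and the walk is a plain `Sequence` rather than a flow.

```kotlin
// Hypothetical data model, loosely following the description in the talk.
data class TimelineEvent(
    val id: String,
    val roomId: String,
    val previousId: String? = null, // link to the older neighbor, if known
    val gapBefore: String? = null,  // pagination token if older events are missing
)

class TimelineStore(events: List<TimelineEvent>) {
    private val byId = events.associateBy { it.id }

    // Walk backwards from one event by following the links. The walk stops
    // at a gap (which would have to be filled from the server) or at the start.
    fun walkBack(startId: String): Sequence<TimelineEvent> =
        generateSequence(byId[startId]) { event ->
            if (event.gapBefore != null) null else event.previousId?.let { byId[it] }
        }
}

fun main() {
    val store = TimelineStore(
        listOf(
            TimelineEvent("e1", "!old:example.org"),
            TimelineEvent("e2", "!old:example.org", previousId = "e1"),
            // First event of the upgraded room links to the last event of the old room.
            TimelineEvent("e3", "!new:example.org", previousId = "e2"),
        )
    )
    println(store.walkBack("e3").map { it.id }.toList()) // [e3, e2, e1]
}
```

Since a link may point into another room, a consumer walking the chain crosses a room upgrade without noticing, which is the "infinite timeline" effect mentioned in the talk.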
So if anything changes, if it is redacted, or there's a reaction, or a replacement, the timeline event flow changes. So getTimelineEventsFromNowOn just wraps that up, so you get a timeline event that is decrypted. And you can see we can just check what type it has, and when we have checked the type, we have access to body, and then we have sendMessage. When you call sendMessage, you don't have to worry about whether the room is encrypted or not. You can just use the DSL that I created to write text messages, image messages, and so on, and you can also form relations with it. So you can say, like here: this is a reply to the timeline event I just got. This has extensible events in mind, so if other content blocks are added in the future, we can just extend the DSL, and you can very intuitively write your content with Trixnity into an event, into an extensible event. So here are some projects that are using Trixnity and that I know about. There is a Spotify bot and a Mensa bot. Someone has created some extensions to make it easier to use for bots, and so on. And there is Trixnity Examples. That is from me; it is just a ping bot, part of which you saw here. It is E2E enabled, and you can run it in your browser, on your Linux machine, on your iOS client, or via the JVM on Android. And there is also Timmy Messenger, that's the messenger from our company, but it's not open source yet. We plan to, but we don't know how, because of licensing. Just try it out and come to me if you have questions. I'll be around for a bit. This is the Matrix room, this is my Matrix ID. And if we have a bit of time, I can show you a small demo, I think. I made a small performance comparison. It's not representative, because it just ran once on my machine, and there was no warm-up and there were no multiple runs. On the left side you see our Timmy client, but basically it's just using Trixnity. On the right side there is Element, and in the middle is FluffyChat.
And now you can place your bets on who is the fastest. Let's see: when the red zoom comes, the response from the server has reached the client. I just looked into the Synapse logs for when the response was sent. So we just wait a few seconds, and then we see who is first. You can also look at opening rooms: because we have this caching, it is very fast in our client, but I must say FluffyChat is also very fast at opening rooms. So, oh, Trixnity was the fastest. And we can open rooms, and you see opening rooms is also a lot faster than on Element. And there was FluffyChat, and FluffyChat is also very fast. I also have a desktop demo, but there Nheko is the fastest. This is Nheko, this is Timmy, and Element on the web. This comparison is a bit hard, because Element runs in the web and does not have the multi-threading the other clients have. So Nheko, just three seconds. I can just chat around. The next is Timmy on the top left side, also very fast at opening rooms and switching rooms, because the events are cached all the time. Then there was Element, and I think also FluffyChat, yeah, FluffyChat also. Okay, that was my talk, thank you. Questions? How do you prevent data loss with your async transactions? The transactions are run one after another. So if one transaction fails, the other transactions are still run, and the next one starts with the old token. What happens if, say, your battery runs out whilst a bunch of transactions are queued? If your battery runs out while all those transactions are queued, then they haven't been written to the database. Yeah, then they are gone. Your client has to do the work again, but mostly this doesn't happen. If you close your client, all transactions that are still open are written, but it depends on your platform whether it is killed hard or Trixnity has a bit of time to write the transactions back to the database.
But it's still very fast to write, so it's just a bit snappier on mobile devices, which are not that fast. Like my smartphone from 2016: I can't run Element on that, because it's too slow, and when sending messages, 10 seconds later it's like, oh, okay, yes, yes, now, now we send the message. I don't have this problem, because syncs are just faster than the slow I/O of writing to the database that we have on old smartphones, for example. Another question. It's nice to write. I like DSLs. In Kotlin we have them all over the language, and it feels very intuitive, because your IDE gives you suggestions for what methods there are, and it's a lot easier to read, I think. There's Rust, and there's Kotlin, but is there any way to reduce the amount of things that the user has to learn to use all these things? I didn't understand the question, acoustically. There's a lot of language learning to make any progress. Is there any effort to unify this, or towards Rust, maybe? To be honest, I don't like Rust. I just like a higher level of implementing stuff, so I didn't speak with the Matrix Rust team. I think we are out of time, and the last question from the audience would be whether we can open the windows and the doors a bit to get more air in. Thank you very much. Thank you very much. Yep. That's it. |
Bridging ActivityPub with Kazarma
Interoperability and "beyond-chat" Matrix |
All right. Thank you for coming. I'm Pierre, and I'm going to talk to you about Kazarma and how we tried bridging ActivityPub with Matrix. We talked a lot about interoperability even two years ago, and we found it sad that the talk was about interoperability for proprietary platforms, not with our alternative decentralized networks. So we tried doing that. Kazarma is hosted on GitLab. It's licensed under the AGPL v3, and it's written in Elixir. There were a few things that hinted at building it. There was an article in the Matrix guide saying that you could bridge two decentralized networks. There was also a Hacker News comment by Matthew saying that there could be an ActivityPub bridge. And there was a Matrix client library made by uhoreg that we wanted to use, and also an ActivityPub client library that was extracted from Pleroma, first by the MoodleNet people and then by people at Bonfire. So the idea is to bridge those networks by creating puppet users. Kazarma is both a Matrix application service, like other bridges, and an ActivityPub server. On the ActivityPub side, we create puppet actors, and they can talk to ActivityPub users. On the Matrix side, we create Matrix puppet users that can talk to other Matrix users. We wanted both to build an application service that you can deploy alongside your homeserver and to have a public bridge, so that other people on the federated Matrix network can talk to ActivityPub users even if they don't have Kazarma installed on their server. That's what I'm showing here. As I said, it can also be installed on your homeserver, and then you get nicely displayed usernames; for instance, just one character changes. We made a web front end to help people see the bridged addresses. We also show here that we can display rooms, which we'll talk about later. We started by bridging the chat that had been implemented by Pleroma.
It's only a one-to-one conversation chat, so we use Matrix direct rooms for that. We then tried to bridge direct messages like on Mastodon. Those are posts that are only sent to the people mentioned in them, so there we used private rooms in Matrix. There is no end-to-end encryption in ActivityPub, so for now Kazarma only works in unencrypted rooms. It could work in encrypted rooms, but then it would just bridge by decrypting the messages. Then we also wanted to bridge ActivityPub public activities by creating public rooms where public activities are simply bridged. That wasn't really well thought out, because we thought it was a good idea to start bridging public activities as soon as ActivityPub users are searched for. So we made a relay actor that started following people almost immediately. It turned out that the ActivityPub Fediverse had had bad experiences with follow bots, as they call them, because there were people trying to index the Fediverse. So there were angry people who started defederating our staging instance. But there were also nice people who came into our Matrix room, and we thought about ways to make it opt-in: for instance, by having the relay bot send a direct message and wait for a positive answer, or by having it wait to be followed and then follow back. Here is an example with a PeerTube video, where we can use a reply to have people post comments on the video. We also did something pretty nice with Mobilizon, where you have groups. We found that we could have people invite Matrix users, which creates a private room, and as soon as people join the room they become members of the group, and then you can use the same activity types for the discussions that happen in Mobilizon groups. So as a summary, we bridge user search and profiles, and we bridge multiple activity types: posts, chat messages, videos and events. ActivityPub rooms are still to be finished.
We also started to build Matrix user rooms, so people could ask the relay bot to make a room that they are administrator of, as a kind of outbox room, and it will then publish what they say as public ActivityPub activities, so they can be followed and appear on the Fediverse. That also needs to be thought through again, because on ActivityPub you need a way to see the activities on the instance they were sent from. So we made a web page for the activities that are sent, but then there is the matter of also showing replies there, and it's not really nice to also display activities from other instances, so we need to think about it again. As I said, there are the Mobilizon groups; private rooms carry direct messages and direct rooms carry Pleroma chat. We have replies, attachments and mentions. So there are still quite a few things to be done, but you are welcome to come and provide feedback, or maybe contribute if you would like. As this is a developer room, I've also shown some parts of the application service library that we made in Elixir. We just need some configuration, like the one in the application service configuration file, and then you can use nice Elixir features like pattern matching to select just the messages that you want to act on, and here's an example. To finish, I'd like to thank NGI0 and NLnet, who funded us. We are in the process of having yellow sponsor us some servers for our public instance, and we've got hints on security from Radically Open Security and on accessibility from Accessibility.nl. We've also started to create an organization to work on Kazarma and on other projects that are mostly built on Matrix, so feel free to come follow us and maybe support us. I think that's it. Thank you. Are there any questions? Are there any comments?
I'm not so much into social media such as the Fediverse, but I got the impression that there's much more public conversation going on there than is usual in instant messaging. So if I have a somewhat closed room type in Matrix and there's an interaction via a bridge, via Kazarma, would it mean that the whole conversation can become a public conversation on the Fediverse side? No, it wouldn't. If something is bridged as public, it's only because it's in public rooms, and the other way around too. So if we use private rooms, it's only for private messages and it's... It keeps the same type of... Yeah, absolutely. Thank you. You didn't mention the delete event from ActivityPub, so do you support it yet, or do you plan to support it in the future? Yeah, I forgot to mention it, but we already support the delete event. Deletions are bridged on both sides. Could you talk about the choice of Elixir? Did you think it was the right language for this application, or was it just a different language? It's a language that I love, but with the Phoenix framework it's also a great language for building HTTP APIs, and that's something we do on both sides of the bridge, so it worked out pretty well in the end. Yes. Another question: is it already in such a state that we can just, as you proposed, install it next to our homeserver and it will just run, or does it still have some rough edges? I think it's not yet ready, but it's really close. I'd really love to start by deploying the public bridge, so that people can start using it and provide feedback, as a public beta first. So I think it's not yet ready to be deployed on your own homeserver. Plus, the fact that we started by working on the features of the public bridge means that there are still things that could be bridged that are not supposed to be bridged on a private bridge. So what's the prospect for end-to-end encryption? It's very cool that it supports unencrypted stuff, but I'm a bit curious about the ActivityPub side.
Is there anything happening there? I don't really know much. I know that people have been talking about it, but I'm not sure what the state of it actually is right now. So in terms of bridging between encrypted and unencrypted: you said that if you have a private encrypted chat in Matrix, it would be bridged into a direct message in ActivityPub, but that's still public. If you can intercept that message or access the server database or whatever, you can still read it. Yeah, absolutely, and I think that's a choice to make. There are some benefits if we add support for encrypted rooms, because the messages would still be encrypted on the homeservers and over federation on Matrix, but it could also give a false sense of security. So that's something we really don't know about yet and that still needs to be decided. Okay, no more questions. Thank you very much. |
All your base are belong to us
A crazy ride through lots of matrix projects |
First, some disclaimers about this presentation. We will have some Babel fish effects here, because there are some things I can't really translate from my native German language. But this is not my fault. This is your Babel fish's defect. I like to wrap things up, like I did at the Matrix Community Summit in August 2022 in Berlin at c-base. This means I try to gather information and bring it all together. This has some negative effects: I only finished my presentation two minutes ago, because I had to put all the stuff in. So the presentation itself will be chaotic, and the slides as well. Then, this should be an interactive session. The last session I did at c-base was a bit more staged. I had a Matrix room, and another person was also talking in it. By default you could not write in the Matrix room, but it was possible to send reactions. So we actually had really nice interaction through Matrix reactions in the room. Since I don't have a Matrix room now, let's try to do it in the real world. So please express a thumbs up with your face, with noise, maybe with gestures. Can you just try it? Thumbs up. Okay. So if you want to express something while we are talking, if you find something awesome, if you find something shit, if you miss something, just scream it, express it, because I'm trying to collect information about the Matrix universe. I can only present a small part of it, but you have more knowledge about it. So please use this session to say where you are and what you have. It doesn't have to be consistent. It can be chaotic, scattered information. And the last disclaimer is that there is an elephant in the room. Maybe you have noticed, when you looked for Matrix on the Internet, that there's a lot of information about something called Matrix which is not the real-time protocol, and which is also not the mathematical concept of a matrix.
It's a movie, and I was heavily influenced by this movie in 1999, I think, so this talk is maybe a bit influenced by it, by my personal situation of being under that influence. Oh, I forgot to plug in, sorry about this. So I always had something of this movie and its ideas in my head alongside Matrix, the protocol. And now in 2023, we have these social networks. What are social networks, actually? I would say they are disconnected information silos. WhatsApp or Facebook or a forum or something like that: mostly they are not connected, but they are information silos of human cultural units, so-called memes. And what impact do these different silos have on our society on this planet? If you really think about it, what's happening there? Who is controlling these silos? Who is in charge of the structure of the silos? Who sets the policies? It varies a lot. When I was looking at this in 1999, the impact on the world was a lot smaller, but now I see a lot of machines, programs or artificial intelligence also controlling what happens in these social networks. And when you now think back to the idea of the Matrix movie, to the world they describe in the movie and the world we are living in, maybe keep this in mind, and also keep it in mind when you think about the solution, and maybe about your role in the solution. So I want to call out to you again: wake up. Oh, I was hoping that sound would come out. What's going on here? Yes, let's try, does it help to go here? Yes. Come on, come on, come on. I know exactly what you mean. Let me tell you why you're here. You know something. What you know, you can't explain, but you feel it. You've felt it your entire life. Something's wrong with the world. You don't know what, but it's there, like a splinter in your mind, driving you mad. It is this feeling that has brought you to me.
You know what I'm talking about? Do you want to know what it is? It is everywhere. It is all around us, even now in this very room. You can see it when you look out of your window or when you turn on your television. You can feel it when you go to church, when you pay your taxes. It is the world that has been pulled over your eyes to blind you from the truth. Is that the way it works? What truth? That you are a slave. Like everyone else, you were born into bondage, born into a prison that you cannot smell or taste or touch. A prison for your mind. No one can be told what the Matrix is. You have to see it for yourself. This is your last chance. After this, there is no turning back. You take the blue pill, the story ends. You wake up in your bed and believe whatever you want to believe. You take the red pill, you stay in Wonderland, and I show you how deep the rabbit hole goes. Okay, thank you very much. So, make your decision. If you want to leave the room now, it's a good... Or you should stay. All right. Hello, I'm Jan. I like to explore and discover worlds. I tripped over the Matrix protocol a couple of time units ago. I work for Element. I'm part of the Data Norton in Berlin. I'm also at c-base, from the hacker community. And I'm incurudubule at github.com. Yeah, thank you very much. And by the way, if someone can make it happen that I get my incurudubule at matrix.org back, this is not possible because of whatever. I would really like that, but I can... What? So you couldn't get pink on github? I can get... I can connect. I'll come to this later. So you can't remember your nickname? Yeah. Do you use another server? I use another server, but then I'm not on matrix.org, but whatever. Of course, I'm using another server. What I'm telling you here is only the truth. It's the truth from a very specific point in time, which I call now. This is, of course, only through the limited access to information I have, and maybe some of it is not true for you or for someone else.
So I share a view, and it is mine. Regarding the Matrix problem in the movie, there is this mission of unplugging people and getting them into reality, out of the virtual silo worlds they are in, where they get their money sucked out. No, it was not money, it was energy. So get them out. How to unplug them? I think this is actually something a client would do. I have looked at a lot of clients in the last years, and one which impressed me was gomuks. I'm quite sure you know a lot of clients, so I don't want to list them all; I just want to list the ones I had in mind. gomuks is a nice command-line client. It's really fast, and it's for simple things. I really like to use it, and I used it in October. In October there was the other-client October initiative: using a client other than your usual one on Matrix. It was quite fun, but I switched back to my default clients, which are actually elementary chat and Hydrogen, and on the phone also FluffyChat. But give gomuks a try. Then there's another client I also want to mention, because it has a lot of potential. It's not a polished client, but it came up in the last year. It's called Thunderbird. Does anyone know Thunderbird? Okay, so maybe one of you is one of the 20 or 25 million users, something around that, who are out there using this Matrix client, though mostly not for Matrix. But there's a lot of potential in it, especially because they are already using it for getting personal information in and out, like calendaring stuff. And email; maybe you've heard of this protocol. It's a replacement for letters. So yeah, Thunderbird is an awesome thing. Maybe you should take a look at what happens there. Let's do some interactive stuff. I count down to zero and everyone names their default Matrix client. Three, two, one. Okay, next one.
Connecting fragments. What was the idea here? Oh yes, maybe you know this little plant guy. Cactus is a nice thing. Maybe I can click here and give you a short demo. It lives in the world there, so it looks like a forum, or maybe there is an article on top or something. But here you have a small window where the people in this silo can put in information, and it already goes into reality, if we think in the movie's terms. And from there it can also go into other silos. So Cactus Comments is an interesting thing. I didn't really pick it up when I was watching the movie. If you have an idea what Cactus would be in the Matrix movie universe, just tell me. Do I have to go back here? Okay, and there's another interesting thing. I think most of you have just seen the presentation a moment ago, and in a way I think it's the same basic idea. The next one. Oh yeah, Populus Viewer. Who knows Populus Viewer? Okay, that's not a lot. I thought, okay, should I share this? I've been aware of it for ages because I know a lot about Matrix. It's a really nice thing for annotating PDFs. You have a PDF and you want to talk with people about it, but you don't want to convert it into whatever. You just upload it to a channel, and then with Populus Viewer you can discuss it. Let's see if this link works. Yeah, it works. I already uploaded a simple PDF, and then I can mark things and say, oh, okay, here, blah, blah, blah. So this is how Populus Viewer works. A PDF is not really a silo, because it's not really an interactive place, but here again you can use the idea of putting information on top of it. Do I have 10 or five minutes left? 10 minutes left. All right. Then going back here, and let's see what's next. Widgets, yeah. I have no idea what widgets are in the Matrix movie. But I discovered one really simple thing last year.
It's just a widget with Markdown on top of a room. There's an edit button, and as an admin of the room you can just edit it. It's really good for summarizing or updating the stuff in the room. So it's really simple, but really powerful. I think it's also a really good starting point. If you have people in the room and you want to enrich them with information, it's better than a description of the room, because you can add pictures and the like. So in my opinion it's a really good starting point for enriching the user experience of rooms with widgets. And there is other, richer stuff, like the calendar, BarCamp and polls. Maybe you have seen the presentation about widgets earlier on. Accessing the silos. So let's assume there is a place on this planet where people have full control over their own communication. Who knows that sentence, by the way? OK, you should read your Matrix manifesto. On matrix.org there is a Matrix manifesto, and you should read it. You should try to find out whether you can identify with it, and what that actually means for what you will do with your life. But let's imagine we have a place where people have full control over their own communication. Then you should first free yourself before you free others. I spent, I think, 20 years going to a lot of people saying, hey, there's this new thing. It's called IRC. You should go on it and then we can chat. Then there was XMPP and AOL and so on. There were a lot of social networks I jumped onto. And then I went to my friends and my real social networks, my social context, and tried to get them into this silo. So I was really a slave. Every time I thought, oh, now I'm free. Now, this is the next good network. And then I sucked them all in.
I think the last one was Telegram, because I was impressed: oh, okay, they have a GPL-licensed client and an open API, and relative to all the other stuff it was so open. And maybe you have heard that Telegram is not so open, not so easy to deal with. So I was wrong again. What I'm doing now is this: I have my Matrix setup. I have my bridges, and I take care that I am free and that I can communicate with others through my avatars, with my Telegram, WhatsApp, Signal links and whatever. So I have representations of myself, and I'm not pushing anyone to come into any network, but I stopped communicating only with people who are in my network, because this is not what I was born for. I'm not on this planet to decide that I don't want to talk to a person because he doesn't have an iMessage account or something like that. That is really a piece of shit. With Matrix, first freeing myself is for me a really good standpoint. And then I can tell others about it. This is how I'm doing it. I think our life should not be about making decisions about our communication provider. And here I got a bit carried away with puppeted bridges. I will skip this bridges stuff, but you can host a bridge yourself, you can develop one, and you can get them hosted. With bridges you can bridge a lot of different stuff. And I want to talk about Hookshot, because Hookshot is awesome. Who uses Hookshot on a daily basis? Okay, that's not enough people, because this is really an awesome tool, an awesome bridge for getting information in from the outside world. An RSS feed, for example, is easy to subscribe to. A lot of web services, these silo things, have webhooks, and you can point the webhooks at your room. Then you can write little JavaScript snippets and say, okay, there's some data coming in, and I take the topic and the status of the bug and send a message into the room saying this issue is closed, or something like that.
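The transformation step described here, taking incoming webhook data and turning it into a short room message, boils down to a small pure function. Hookshot's own transformation snippets are written in JavaScript; the sketch below only mirrors the logic in Python and is not Hookshot's API. The payload field names `title` and `status` are assumptions chosen for illustration, since every web service sends its own payload shape.

```python
def transform(payload: dict) -> str:
    """Turn a webhook payload into a one-line room message.

    Hypothetical payload shape: {"title": ..., "status": ...}.
    Missing fields fall back to placeholders instead of raising.
    """
    title = payload.get("title", "(no title)")
    status = payload.get("status", "unknown")
    return f"Issue '{title}' is now {status}"


# Example: a bug tracker reports that an issue was closed.
print(transform({"title": "Login button broken", "status": "closed"}))
```

Keeping the transformation free of side effects makes such snippets easy to test and to share, which is what the speaker asks for next.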
So it's really easy to write this transformation snippet. Please share the snippets. It's not that easy, so sometimes it's useful to use other people's snippets. Hookshot is really awesome, and it has an incredible feature for GitHub called dynamic rooms. You create a space representing your project, then you have your repository, and then maybe you have issues in your repository. For every new issue a room is created. So in the space you have a room for the issue, and then maybe you have to discuss the issue, like when you write a comment on GitHub. If Hookshot knows who you are on GitHub, then when you reply to that comment in that room, it becomes a comment on GitHub. So you don't have to go between mail notifications, the GitHub website, Git commit messages and so on. You can just do it in your Matrix client, and it speeds you up so much in discussions on GitHub issues. So if you work with GitHub or GitLab, check out Hookshot. Another really awesome thing, and this is my personal Matrix app of the year 2022: Postmoogle. It's an email-to-Matrix bridge. And yeah, okay, it's an email-to-Matrix bridge. No, it's awesome. I use it with my wife for eBay Kleinanzeigen, for example. We have the subscription going into a room and we can discuss it. If a message comes in and we reply to it in a thread, it goes back to eBay Kleinanzeigen, and eBay Kleinanzeigen passes it to the person who sent the request. So it's an easy, poor man's bridge to everything that is already bridged to email. Also, sometimes you just want an email address, and it's so easy. You invite the Postmoogle bot into your room and say new mailbox, whatever at my Matrix server, and the email address is instantly created. So if you need to use email, you can start with Postmoogle and with redirects. It's a good transition phase for yourself. Yeah, it's awesome.
As I said, I will skip this, but check out Honoroit if you want to have a simple helpdesk. Servers and services. Yeah, there's Synapse, which is being forked or cloned by the BundesMessenger in Germany at the moment. Interestingly, there's Construct, Conduit, Rocket.Chat. Have you heard about this? They also want to support Matrix somehow. We will see what they already do. Stashcat, which is a secure messenger: they also want to support Matrix for federation. Discourse, which is something like Cactus: they also want to go for Matrix. And as I said, this is only what I observe. ejabberd as well. All right. Then there are a lot of awesome providers, if you don't want to run your server yourself. Some stuff I discovered, especially for Germany: there's Hostsharing eG, which is a cooperative (Genossenschaft) for hosting. They also now offer Matrix off the shelf. And there's also, yeah, LSTIO and EDGE and Osrocks and, of course, EMS from Element, and ungleich.ch. And I think there are others. By the way, speaking of others, I discovered these two projects. One is chatcloud.net. They offer hosted Element services. I don't know about them. Is anyone here from them? Okay, no. And there's also Luxchat. So Luxembourg is going to switch to Matrix, for their whole public sector and also for communication with people. I don't know about them. Is someone from them here? No. Okay. Then, how to deploy the stuff. There are different stacks to deploy. There is the EMS installer, if you have huge setups and need professional services. There is matrix-docker-ansible-deploy, which is an awesome Docker script contributed to by, I don't know, 250 people, and it has a lot of stuff already in there. There are a lot of Helm charts for some basic components and, of course, operating system packages. Is there any deployment stack you are aware of which I haven't mentioned? There was something on NixOS. Okay, get in contact with me. Building blocks, yeah, all the SDKs.
And then what's coming up next? This year there will maybe be a Matrix Community Summit. I think it will be at the end of September, and there will be something at the Chaos Communication Camp in August. In both cases, if you want to be a part of it and want to help or volunteer, just get in contact with me. And I think with that the presentation is over and I've finished my slides. Perfect. |
Introduction to the Synapse Kubernetes Operator
A new way to deploy Synapse and its Bridges on Kubernetes |
Hello everyone, good morning, good afternoon. Today we'll talk about a Kubernetes operator for Synapse, which allows you to deploy and manage a Synapse homeserver as well as a couple of bridges on top of Kubernetes. This operator is written in Go, and if you don't know what Kubernetes operators are or what they do, don't worry, because that's the first thing we will talk about. In this talk, we will not assume any prior knowledge of Kubernetes. My name is Mathias Gohens. I'm a senior software engineer at Red Hat, and I've been working with operators daily since I started a bit less than two years ago. I was looking for a way to learn and experiment with operators, and that's how this project started, as a playground for me. So let's get started. As promised, the first thing we'll do is discuss some ground concepts of Kubernetes and introduce what Kubernetes operators are and what they do. In the second part of this talk, we'll talk more specifically about the Synapse operator and do a demo of its main capabilities. And finally, we'll talk about the future of this project and the long list of improvements and features which could be added to it. So Kubernetes, what is it? Some ground concepts first. Kubernetes is a container orchestration engine. That means that it helps with managing containers at scale. Among its main features, Kubernetes helps with load balancing, high availability, and self-healing of your containerized applications. There's this notion of pods in Kubernetes. A pod is a wrapper for your containers. A pod runs one or multiple containers, and you never run a container directly on Kubernetes. You always go through this abstraction layer, which is the pod. And finally, your resources are described in YAML. Kubernetes resources, such as a pod, can be described in YAML, and as a user you often write something called manifest files to create or update your resources.
You also have the Kubernetes CLI, called kubectl, which allows you to interact with the Kubernetes API. You can do things like kubectl get pods, or create new resources with kubectl create. Talking about resources, this is what they look like for the most part. First, you have something called the type meta. This is where you identify the type of resource that you describe. In this case, we're talking about a replica set. Other examples of Kubernetes resources are pods, which we already mentioned, deployments, jobs, service accounts, PVCs (persistent volume claims), config maps, and so on. Then you have the so-called object meta. That's the part which allows you to uniquely identify an instance of a resource type. So here, we're dealing with an instance of a replica set called frontend. And finally, you have the spec and status sections. The spec, or specification, represents the desired configuration: the desired state of this resource. The status provides the user with information about the last known state of the resource. And what is a replica set, by the way? It is a way to encapsulate a pod definition. That's the one here in the template. That's a template of a pod, right? And the pod runs containers. And it adds a replica count to it. So here, it expresses that there should be three replicas, three copies of this pod, running. If for whatever reason a pod fails, the replica set is there to ensure that a new copy of this pod is created. Let's see a replica set in action to illustrate what we just said. I'm on a Kubernetes cluster here, and I already have a replica set running on this cluster. On the bottom right, you see the list of pods currently running on the cluster. So here you have three pods, three nginx pods. And that actually makes sense, because this replica set is configured to have three copies of the nginx pod running.
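The four parts of a resource named above (type meta, object meta, spec, status) can be made concrete with the "frontend" replica set from the slide. The slide itself is not part of the transcript, so the following is a sketch under assumptions: it writes the manifest as a Python dictionary rather than YAML, and the label `app: nginx`, the container name, the image and the status values are illustrative guesses. The helper `describe` is hypothetical.

```python
replica_set = {
    # Type meta: which kind of resource this is.
    "apiVersion": "apps/v1",
    "kind": "ReplicaSet",
    # Object meta: uniquely identifies this instance.
    "metadata": {"name": "frontend"},
    # Spec: the desired state, written by the user.
    "spec": {
        "replicas": 3,  # three copies of the pod should run
        "selector": {"matchLabels": {"app": "nginx"}},
        # Template of the pod that the replica set encapsulates.
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {"containers": [{"name": "nginx", "image": "nginx"}]},
        },
    },
    # Status: the last known state, written by the controller.
    "status": {"replicas": 3, "readyReplicas": 3},
}


def describe(resource: dict) -> str:
    """Summarize desired versus observed replicas (illustrative helper)."""
    kind = resource["kind"]
    name = resource["metadata"]["name"]
    desired = resource["spec"]["replicas"]
    observed = resource["status"]["replicas"]
    return f"{kind}/{name}: {observed} of {desired} replicas running"


print(describe(replica_set))
```

The split between `spec` (what the user asks for) and `status` (what the cluster last observed) is what the controllers discussed next rely on.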
Let me take one pod and delete it: kubectl delete pod and the name of my pod. So what's going to happen? A new copy of the pod is recreated almost instantly. Well, that's the job of the replica set. That's exactly what it does: it ensures that three copies are running. All right, but where is this logic of recreating copies of my pod actually implemented? Well, that's the job of controllers. So let's think about this. What is a controller in Kubernetes? It's a process that usually runs within the cluster as a pod. It's responsible for managing a type of resource in the cluster. So there is a job controller for managing jobs. There's a deployment controller for managing the lifecycle of deployments. And there is of course a replica set controller, which we're going to see in a second. Behind the scenes, it watches the Kubernetes API. Each time there is an event, like a creation, update or deletion, related to the type of resource that it manages, it starts something called its control loop. The control loop is an idempotent function whose role is to resolve the difference between the current state of the resource and its desired state. It's also, by the way, responsible for writing the status of the resource. So let's see an example real quick. This is the replica set controller, or at least a simplified version of it. It implements a control loop, and this control loop here in the middle is triggered every time a replica set is created, updated or deleted in the cluster. The main steps of this control loop are, first, to read the desired number of replicas from the replica set. That's the desired state, right? This here. Second, to read the current state of the resource: how many replicas of the nginx pod currently exist on the cluster? That's this here. And third, it will reconcile the resource state.
That means that it calculates the difference between the desired and current number of replicas. And it creates or deletes replicas, or it can even do nothing if the current and desired number of replicas already match. And finally, its last responsibility is to write the resource's status. This provides the end user, or other controllers in the cluster, with information on the last known state of this resource. And there are similar built-in controllers, as we mentioned, running for deployments, for jobs, for PVCs, etc. For all native Kubernetes resources, there is a built-in controller for managing them. And, well, now you must wonder: wait, you keep talking about built-in controllers and native Kubernetes resources. Why is that? Does that mean that there is such a thing as external controllers and non-native resources? Well, yes, that's exactly what there is, and that's exactly what operators are all about. Operators are about reusing these concepts of Kubernetes resources and Kubernetes controllers and creating our own. So first, we're going to create new resource types. Let's say, for example, we're going to create a resource type called Synapse. And second, we're going to create a custom controller to manage our brand new resource type. So let's say we're going to create a Synapse controller responsible for all the business logic of creating and managing a Synapse homeserver. So first, a new resource type. How are we going to do that? Well, we're using something called a custom resource definition in Kubernetes. And this is a truly amazing feature of Kubernetes, because CRDs, the short name for custom resource definitions, provide a way to extend the Kubernetes API using the Kubernetes API. That means that there is a resource type natively in Kubernetes which is called custom resource definition. And we can write our CRD as a manifest file and create and query it with kubectl.
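The replica set control loop described earlier (read the desired state, read the current state, reconcile, write the status) can be sketched in a few lines of Python. This is an in-memory illustration only, not the real controller: the `ReplicaSet` class, the pod naming scheme and the plain list standing in for the cluster are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ReplicaSet:
    name: str
    desired_replicas: int      # spec: the desired state
    status_replicas: int = 0   # status: the last known state


def _unused_name(prefix, existing):
    """Return the first '<prefix>-<n>' name that is not taken yet."""
    index = 0
    while f"{prefix}-{index}" in existing:
        index += 1
    return f"{prefix}-{index}"


def reconcile(replica_set, pods):
    """Idempotent control loop: return the pod list after reconciliation."""
    pods = list(pods)                          # current state
    desired = replica_set.desired_replicas     # desired state
    while len(pods) < desired:                 # too few pods: create the missing ones
        pods.append(_unused_name(replica_set.name, pods))
    if len(pods) > desired:                    # too many pods: delete the extras
        del pods[desired:]
    replica_set.status_replicas = len(pods)    # write the status
    return pods


frontend = ReplicaSet(name="frontend", desired_replicas=3)
pods = reconcile(frontend, [])
pods.remove("frontend-1")          # simulate deleting one pod
pods = reconcile(frontend, pods)   # the missing copy is recreated
print(pods, frontend.status_replicas)
```

Running `reconcile` a second time on an already reconciled state changes nothing, which is the idempotence property mentioned in the talk.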
So the CRD manifest will contain information about the new resource, the custom resource, such as its kind and its OpenAPI v3 schema. The OpenAPI v3 schema is a set of definitions and rules that describe the structure of our Kubernetes resource. So let's have a look at custom resource definitions. As a matter of fact, on my Kubernetes cluster, there are already some CRDs installed, and specifically CRDs for Synapse. And I can dig into it. So here you see I did kubectl get customresourcedefinition, because this type is natively present on Kubernetes. And I can also do things like get, this is the command I run, get customresourcedefinition, the name of my Synapse CRD, and -o yaml, that's to have it as YAML output. And what I see here is what a CRD looks like. So there is the kind, CustomResourceDefinition, this is the type meta. Here we have the object meta in this metadata section, with the name of this CRD. And then we have the spec. And what is the spec of a CRD? Well, it is describing a new resource. We are creating a new resource type in Kubernetes, a custom resource. So what does this custom resource look like? You have things, for example, such as the new kind that you want to have created. You have information on, for example, whether this is a namespaced resource or not. You could also have cluster-wide resources. And you see here that it's available in the v1alpha1 version. And you have the schema, the OpenAPI v3 schema for this new custom resource, this new Synapse custom resource. And on the top level, you again find things like apiVersion and kind. So this is our beloved type meta, then the metadata section, the object meta, which are common to all Kubernetes resources. And then you have the spec section and the status section. We again find spec and status. And here you see the descriptions of what is contained in the spec.
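As a rough picture of what such a CRD manifest contains, here is a trimmed-down sketch as a Python dict. The API group `synapse.opdev.io` and the plural name are assumptions made for this example, and the schema is reduced to its top-level fields; the real manifest shown in the demo is far longer.

```python
# Trimmed-down sketch of a CRD manifest. The group and plural names are
# assumptions; the schema only lists the top-level fields.
GROUP = "synapse.opdev.io"   # assumed API group
PLURAL = "synapses"          # assumed plural name

synapse_crd = {
    "apiVersion": "apiextensions.k8s.io/v1",   # type meta of the CRD itself
    "kind": "CustomResourceDefinition",
    "metadata": {"name": f"{PLURAL}.{GROUP}"}, # CRD names are <plural>.<group>
    "spec": {
        "group": GROUP,
        "scope": "Namespaced",                 # the alternative is "Cluster"
        "names": {"kind": "Synapse", "plural": PLURAL, "singular": "synapse"},
        "versions": [{
            "name": "v1alpha1",
            "served": True,
            "storage": True,
            "schema": {"openAPIV3Schema": {
                "type": "object",
                "properties": {
                    "apiVersion": {"type": "string"},
                    "kind": {"type": "string"},
                    "metadata": {"type": "object"},
                    "spec": {"type": "object"},
                    "status": {"type": "object"},
                },
            }},
        }],
    },
}


def top_level_fields(crd, version_name):
    """Hypothetical helper: list the top-level fields declared for one version."""
    for version in crd["spec"]["versions"]:
        if version["name"] == version_name:
            return sorted(version["schema"]["openAPIV3Schema"]["properties"])
    raise KeyError(version_name)


print(top_level_fields(synapse_crd, "v1alpha1"))
```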
So here, for example, you have a Boolean called createNewPostgreSQL. By default, it's false. There is a section called homeserver, with some information on how to configure your Synapse instance. So this CRD is there to describe our new Synapse resource type. We're jumping a little bit ahead here. We'll come back to the Synapse CRD later. We're actually going to use it and create a Synapse instance. I just wanted to show you an example of a CRD manifest file. And because this CRD is installed in the cluster, I can now do things like kubectl get synapse. What do I get back from the cluster, from the API? No resources found in default namespace. Okay. And just to compare, if I run this here, kubectl get notexist, this is a type which does not exist, and I get a different message. This one is an error message. So the server doesn't have a resource type notexist, right? For Synapse, we have created the CRD, so the resource type Synapse is known. And now that we have talked about custom resources, we can talk about building a custom controller. That's where we need to write some code and implement the actual logic of managing a simple Synapse instance, the business logic. Fortunately, we have some SDKs available, which help us with all the boilerplate code common to all operators, such as watching the API, caching some requests, building work queues, and so on. They also help with actually creating the CRD manifest that we just saw, because as you see, creating the OpenAPI v3 schema by hand can be a little bit cumbersome. So we also have tools in the SDK to help us bootstrap, or actually generate, those CRD manifest files. So using an SDK really allows us to focus on writing the business logic and how to manage our application. And yeah, with that, we have seen the main concepts behind Kubernetes operators. We talked about Kubernetes resources and how they are structured.
We saw how we can extend the Kubernetes API with custom resource definitions. We talked about the controller pattern in Kubernetes and how controllers are responsible for reconciling the state of the resource they manage. And we saw that we can write our own controller with Operator SDK, for example. And hopefully by now, you have a good understanding of what operators are and do. Of course, there would be tons of other interesting details to mention on Operator SDKs and CRDs and so on. But we have a limited amount of time today, and this talk is actually about the Synapse operator. So let's move on to that part, finally. I made the choice of dividing this project into three CRDs: one for the Synapse homeserver and one for each bridge. Right now, this operator is able to manage two bridges: Heisenbridge, which is for IRC, and mautrix-signal, which as the name suggests is for Signal. That means that you can manage each component individually and independently. That's a model which would also scale better if and when additional components are added to the project. So let's have a look at the CRDs individually. First, the Synapse CRD. We saw this already before and, as the name suggests, it allows us to deploy a Synapse homeserver. By default, it will use the SQLite database which is shipped within the Synapse Docker image. But there are also ways to work with Postgres. We'll talk about that a little bit later. In order to deploy Synapse, we need to provide some configuration options. To do that, we need a configuration file, which is called homeserver.yaml. And as you know, if you've dealt with Synapse before, this is a very long file. There are lots of configuration options. Therefore, I made the choice of providing users of the Synapse operator with two options for configuring their Synapse instance. As a user, you can either provide the two mandatory configuration options directly in the Synapse manifest.
These two mandatory options are the name of your server and whether or not to report statistics. The Synapse operator then fills in the rest of the homeserver.yaml with default values. Actually, it uses a default homeserver.yaml template and feeds those values into it. This is especially useful if you don't have a homeserver.yaml at hand, or if you want a quick start, to quickly test the capabilities of this operator and just get a Synapse instance running. However, if you need more control over the configuration of Synapse, which is totally understandable, or if you already have a homeserver.yaml, then you want to go with the second option. And that is creating a config map containing the homeserver.yaml and feeding it to the Synapse resource. We're going to see this in a second with examples. The Synapse operator will automatically use your custom homeserver.yaml. Actually, it will make a copy of it and use that to configure Synapse. Let's see a little demo of this. So, we're back on our Kubernetes cluster. On the top right, you see the logs of the three controllers, the Synapse controller, the Heisenbridge controller and the mautrix-signal controller, which are running in the cluster. On the bottom right, you again have the list of pods running in the cluster, in the default namespace. Currently, there are none. And the left side is where we're going to issue some commands. Let's go ahead and create our first Synapse instance. You see it's of kind Synapse, with the name MySynapse. And this one is going to use values. We provide some values in the spec section of our manifest file: the name of the server, which is myserver.com, and whether or not to report stats, here true, because we are good citizens.
And we are going to go ahead and create this Synapse instance. What we see here on the right side is the Synapse controller getting to work. That's the business logic, right, which is in the Synapse controller. It compares the desired state of this resource, so please have a homeserver running with these configuration values, and the current state: what do I currently have in the cluster? Nothing. I don't have a Synapse instance running. I don't have a pod for my Synapse instance. So it then creates all the necessary objects: a deployment, a config map, a PVC, and so on. And now I can check that my Synapse is being created. So as you see, get synapse, right? This time I have one. I get this one back. I can check the status of my Synapse, and I see that some information on the homeserver configuration has been populated. Well, in this case, it's pretty straightforward. It's basically a copy of the values. And you can see that this Synapse instance is running. In fact, if you check the logs of this Synapse pod here, you see that the usual messages have been displayed, with some info that this Synapse instance is running. Now we're going to move to the second example. Now that we have seen how to create a Synapse using values, we're going to see how you can do that with a custom homeserver.yaml. Let's say you have your custom homeserver.yaml. This time, you have configured a server name, my matrix host. And you want to use this for configuring Synapse. What you do here is create a config map for this homeserver.yaml. So here I did kubectl create configmap. This is the name of my config map, my custom homeserver. I'm using from-file to say I want to use this file as an input for the data of this config map. And in fact, now I have this config map here, my custom homeserver, created 15 seconds ago. Now I have the Synapse instance, a new Synapse called my custom Synapse.
And in the configuration options, this time, I'm not using values. I'm using a config map here, and I'm giving the name of the config map containing the homeserver.yaml that I want to use for configuring this instance of Synapse. So let's go ahead and create it. And again, the Synapse controller gets to work, and a new pod is going to come up for this instance of Synapse. And if I check the status of my custom Synapse this time, I again see that some information on the homeserver configuration has been populated by the Synapse controller. In particular, the name of the server and whether or not to report stats have been fetched from my custom homeserver.yaml. Behind the scenes, if I run this command again, what it also does is create a copy of your input config map. So it makes a copy of the config map you provide, and it works on this copy, because sometimes the Synapse controller might have to do some modifications to make sure that this instance of Synapse can actually run. So it doesn't touch your input config map. It makes a copy of it and edits the copy if needed. Let's go back to the presentation and move on to the next CRD that we have, the next resource type that we have installed, which is Heisenbridge. So Heisenbridge, as the name suggests, deploys an instance of the IRC bridge called Heisenbridge. And then it also automatically adds this bridge as an application service to the Synapse homeserver.yaml. And similarly to Synapse, you can again provide your custom Heisenbridge configuration file if you want. You can also decide to use some default values. And in that case, you don't have to provide anything, because there are no mandatory configuration options for Heisenbridge. So if I go back to the terminal, I see that here I have an example manifest file for Heisenbridge, where I simply specify, in the spec section, the name of the Synapse instance that I want it to be connected to.
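The two ways of configuring Synapse shown in the demo (inline values, or a config map that the controller copies before editing) can be sketched as follows. The field names `values`, `serverName`, `reportStats` and `configMap` in the spec are assumptions about the manifest layout; `server_name` and `report_stats` are the corresponding homeserver.yaml keys. The default template here is reduced to two keys for brevity.

```python
import copy

# Stand-in for the operator's default homeserver.yaml template (assumption:
# reduced to the two mandatory options).
DEFAULT_TEMPLATE = {"server_name": None, "report_stats": None}


def resolve_homeserver(spec, config_maps):
    """Return the homeserver configuration the controller would work on.

    spec        -- the spec section of a Synapse resource (assumed layout)
    config_maps -- dict of config map name -> parsed homeserver.yaml
    """
    homeserver = spec["homeserver"]
    if "configMap" in homeserver:
        # Option 2: work on a copy, never on the user's input config map.
        return copy.deepcopy(config_maps[homeserver["configMap"]["name"]])
    # Option 1: feed the two mandatory values into the default template.
    config = copy.deepcopy(DEFAULT_TEMPLATE)
    config["server_name"] = homeserver["values"]["serverName"]
    config["report_stats"] = homeserver["values"]["reportStats"]
    return config


with_values = {"homeserver": {"values": {"serverName": "myserver.com", "reportStats": True}}}
with_config_map = {"homeserver": {"configMap": {"name": "my-custom-homeserver"}}}
user_maps = {"my-custom-homeserver": {"server_name": "my.matrix.host", "report_stats": False}}

print(resolve_homeserver(with_values, user_maps))
print(resolve_homeserver(with_config_map, user_maps))
```

Working on a deep copy is what lets the controller modify the configuration later without touching the config map the user provided.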
And that's okay, because MySynapse is an existing Synapse instance, right, we created it a few minutes ago. So I can go ahead and create this. And what's happening here now is that we have two controllers doing some work. First, the Heisenbridge controller, which runs this pod here, the Heisenbridge pod. And second, we have the Synapse controller, which you see has terminated one pod, the one which was created four minutes ago, and has started a new pod about 20 seconds ago, when we created our Heisenbridge. Why is that? Well, the business logic of the Synapse controller is such that when it sees a Heisenbridge being created for an existing Synapse instance, it reconfigures that instance. So it adds the Heisenbridge as an application service in the homeserver.yaml. And then it has to, not actually restart, but recreate the pod using this new configuration. So that's what it does. I can also check the logs of my Synapse pod and grep for Heisenbridge. And we see that Heisenbridge has indeed been added as an application service. There is a user Heisenbridge, which has been created and has logged in. And there were some initial requests made for configuring it. All right, so I now have a Heisenbridge instance, and I have two Synapse instances running. By the way, this reconfiguration of the homeserver.yaml would also work if you had provided your own homeserver.yaml with a config map, because I mentioned before that the controller makes a copy of this homeserver.yaml. And so it works on this copy and modifies it if needed. All right, and just to mention, we also have the possibility to configure Heisenbridge with a config map, which works in the same way as for Synapse. That means that we would need to create a config map first and then feed it here to the Heisenbridge resource. And in this config map, we would need to have the Heisenbridge configuration file. So, similar to what we had with Synapse.
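The reconfiguration step, adding the bridge as an application service to the controller's copy of homeserver.yaml, could look roughly like this. `app_service_config_files` is the Synapse configuration key for application service registration files; the registration file path used below is a hypothetical placeholder, and the function is a sketch rather than the operator's actual code.

```python
def add_app_service(homeserver, registration_path):
    """Return a new homeserver config with the registration file added.

    Idempotent: adding the same path twice leaves the config unchanged,
    so the control loop can safely run again.
    """
    updated = dict(homeserver)
    files = list(updated.get("app_service_config_files", []))
    if registration_path not in files:
        files.append(registration_path)
    updated["app_service_config_files"] = files
    return updated


config = {"server_name": "myserver.com", "report_stats": True}
# Hypothetical path of the Heisenbridge registration file inside the pod.
config = add_app_service(config, "/data-heisenbridge/heisenbridge.yaml")
config = add_app_service(config, "/data-heisenbridge/heisenbridge.yaml")
print(config["app_service_config_files"])
```

After a change like this, the running pod still uses the old configuration, which is why the controller recreates the Synapse pod.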
Finally, we have the mautrix-signal bridge. This works in exactly the same way: creating the bridge, creating signald, which is required for this bridge to run, reconfiguring the homeserver.yaml and adding mautrix-signal as a bridge there, as an app service. Again, you can either use a custom configuration file or work with defaults if you want a quick start. Finally, there's a way to provision a PostgreSQL instance. The first thing you could do, if you already have a PostgreSQL instance running, is provide your own homeserver.yaml and configure the database connection information there. Or you can automatically spin up a PostgreSQL instance. We saw a little bit earlier that there is a createNewPostgreSQL Boolean. By default, it's false. But if you set it to true, it will attempt to create a PostgreSQL cluster, or PostgreSQL instance, using the Postgres operator. So that's an external dependency of the Synapse operator. Let's finish very quickly with the next steps. As you know, Matrix is a very large ecosystem with tons of projects. Many are obviously missing from the Synapse operator today. First, bridges, of course. We could add more bridges. Web clients: we currently don't have any, but we could have Element or Hydrogen, for instance, as a web client. We could have some additional infrastructure components, such as a TURN server for WebRTC audio and video calls, SSL certificates, or an email server. And, of course, why not also have alternative homeservers, right? Right now, we only have the Synapse homeserver, but we could also provide the possibility to deploy, maybe, Conduit or Dendrite, and turn this Synapse operator into a Matrix operator, actually. Yeah, that would be for the long term. All right, so that's it for me today. Thanks for listening. Here you have some information on how to contact me, and the link to the GitHub.
Don't hesitate to go ahead and grab the code and try it out, provide feedback, and also write to me in the Synapse operator room. Right now, it's a very quiet room, but please just come and have a chat if you have any questions or if you're interested in working on this project. Thank you very much for your attention, and I wish you a good rest of the conference. Bye. |
Cascaded Foci (SFUs)
Selective Forwarding Units |
Hello everyone, my name is Simon and I've been working at Element as an intern for more than half a year now. Over that time I've been part of several projects, but I think the most exciting of all has been working with the VoIP team on cascaded foci and selective forwarding units. So let me introduce you to all of that. Firstly, we'll see what Element Call is and how it works. We'll also look at Waterfall, the Matrix approach to foci and SFUs. We'll have a short demo of how all of this looks in action, and see what the future holds at the end. So what is Element Call? It is the video calling app which we've been working on at Element. It has all of the features which you would expect, such as screen sharing, and also allows you to reorder all of the tiles to suit your needs, among other things. It uses two main building blocks: WebRTC for the media transmission and Matrix for the actual signaling. That means doing all of the things such as setting up a connection. We use what we call a full mesh connection model. That means that in order to create a conference, all of the clients have to connect to all of the other clients. While this has several benefits, such as not requiring a backend and being decentralized, unlike Jitsi or Zoom, or even supporting encryption out of the box, it is harder to make it stable. And its main downside is that it requires a lot of upstream bandwidth on every user's side, since for you to share your video with other users, you have to upload it to each of them individually. So at about 8 users, things might get a little choppy. So the question is, can we do better? But first, let's look at why we want to do better. What are the use cases for large conferences? The obvious one is large calls, for us to be able to compete with Jitsi, Zoom and the like. Another use case, which is also quite interesting, is webinars.
There you might only have a few presenters, but these presenters would need to publish their video to a bunch of listeners, and that would also require a lot of upstream bandwidth. The last use case, but possibly the most interesting one, is Third Room, the immersive virtual world platform that is built on Matrix. It would also greatly benefit from being able to host virtual worlds with hundreds and hundreds of users. But how do we actually do all of this? Well, the solution is to introduce a backend, or what we call in the video conferencing world a focus. This might either be an MCU or an SFU. So let's look at what these are. An MCU, or multipoint control unit, is a server which transcodes and mixes all of the users' streams into one, which it then forwards to all the users. This has the great benefit of requiring only one upstream and one downstream per user, but it has many downsides. The streams are mixed together, so the clients have no control over how they are laid out on the screen. The server also breaks end-to-end encryption, since it has to transcode and mix the streams. The server also must be powerful to do all of this, and transcoding and mixing take some time, so the server adds delay. It also has the obvious downsides of requiring a backend and a bit more signaling. That is why we prefer the selective forwarding unit, or SFU, solution. An SFU is a server which takes a user's stream and then forwards it to all of the other users. This is quite a bit more reliable than the full mesh connection model, and easier to set up. It requires much less upstream bandwidth, scales very well to large conferences, and also does not break end-to-end encryption, if done with something like insertable streams. The streams are also sent separately, so the clients are in control of laying them out on the screen. The server can also work in a federated manner, which will be important for the next slide. So what is Waterfall and why is it special?
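The bandwidth argument for the three connection models can be checked with a little arithmetic. This sketch counts streams per client only; it ignores resolution, simulcast and federation between foci, and the function name is made up for the example.

```python
def per_client_streams(participants, model):
    """Number of media streams one client sends (up) and receives (down)."""
    others = participants - 1
    if model == "full_mesh":
        # Every client uploads to, and downloads from, every other client.
        return {"up": others, "down": others}
    if model == "sfu":
        # One upload to the SFU, which forwards everyone else's stream separately.
        return {"up": 1, "down": others}
    if model == "mcu":
        # One upload, one mixed stream back.
        return {"up": 1, "down": 1}
    raise ValueError(f"unknown model: {model}")


for model in ("full_mesh", "sfu", "mcu"):
    print(model, per_client_streams(8, model))
```

With 8 participants, a full mesh client uploads its video 7 times, while an SFU client uploads it once. The number of downstreams stays the same with an SFU, which is the problem simulcast addresses later in the talk.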
It is the SFU that has been contributed to us by Sean, the creator of Pion. It uses the Matrix protocol for the actual signaling. That means that it is interoperable, so anyone implementing MSC3401 and MSC3898 can build their own SFU or connect to ours. We also support what we call cascading. That means that Waterfall, or any SFU, can connect to a bunch of other SFUs. This is very similar to what we do with Matrix homeservers, and creates federated Matrix RTC worlds. Conferences done in this manner can grow dynamically, since the load is balanced between all of the foci that are part of the conference. So if we assume that the foci are interconnected in high quality and that they are placed near participants, it means that network hiccups are minimized and we have much less delay and packet loss. Also, our foci are a little dumb and the clients are actually the smart ones, so they are in control of which foci they connect to and can subscribe to tracks on their own. So how does a client actually connect to a focus, and how does it choose one? On the left we can see an example where Alice and Bob are part of a conference which is happening on the SFU in Brussels. And Charlie wants to join this conference. While Charlie is located in London and has a London SFU, it is actually much easier for Charlie to connect directly to the Brussels one. But they would still prefer the London one. And this is all indicated in the m.call.member state event. In the m.foci.active field you can see that Charlie is actually connected to the Brussels SFU, but in the m.foci.preferred field we can see that they would prefer connecting to the London one. Then on the right we can see yet another user joining the call, Dan, who is also located in London. And so now there is a benefit for both Charlie and Dan in using the London SFU, and for the SFUs to federate with each other.
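The Brussels and London example can be turned into a small sketch. The rule implemented here (use a preferred focus only once another participant shares that preference, otherwise join the focus already hosting the call) is purely an illustrative assumption that reproduces the example on the slide; it is not the specified or implemented selection logic. The member dicts loosely mirror the active and preferred fields of the member state event, with focus names as plain strings.

```python
from collections import Counter


def choose_focus(user, members):
    """Pick the focus `user` should connect to (illustrative rule only).

    members -- dict of user -> {"preferred": [...], "active": [...]}
    """
    mine = members[user]
    others = {name: data for name, data in members.items() if name != user}
    # 1. A preferred focus is worth using once someone else prefers it too.
    for focus in mine["preferred"]:
        if any(focus in data["preferred"] for data in others.values()):
            return focus
    # 2. Otherwise join the focus most other participants are already on.
    in_use = Counter(f for data in others.values() for f in data["active"])
    if in_use:
        return in_use.most_common(1)[0][0]
    # 3. Nobody else is in the call: fall back to the first preference.
    return mine["preferred"][0] if mine["preferred"] else None


members = {
    "alice": {"preferred": ["brussels"], "active": ["brussels"]},
    "bob": {"preferred": ["brussels"], "active": ["brussels"]},
    "charlie": {"preferred": ["london"], "active": []},
}
print(choose_focus("charlie", members))   # Charlie joins the existing Brussels focus

members["charlie"]["active"] = ["brussels"]
members["dan"] = {"preferred": ["london"], "active": []}
print(choose_focus("dan", members), choose_focus("charlie", members))
```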
And this is all again indicated in Charlie's member state event, because now they are connected to the London SFU instead of the Brussels one. You can notice that both m.foci.active and m.foci.preferred are arrays. So Charlie, or anyone else, can specify multiple SFUs they prefer, but also multiple SFUs they are publishing to. If you know that a connection to a certain SFU might be unreliable and you want a backup, you can do that by publishing to multiple SFUs. Now that we have chosen an SFU, how do we actually connect to it? This is done very similarly to what we do in full mesh calls, using to-device messages over Matrix. We exchange the SDP using the m.call.invite and m.call.answer events, and the candidates using the m.call.candidates event. Once the connection is actually established, the client and the focus continue exchanging messages using the WebRTC data channel. This is quite useful, since it's super fast, and so where you need to negotiate quickly, it is much better than using the Matrix to-device messages. So now that we are connected to the conference, how do we make it happen that the users actually see something? On the left we can see a client publishing their stream using the sdp_stream_metadata field. This field is present on multiple events, such as the m.call.sdp_stream_metadata_changed event or the m.call.invite, answer or negotiate events. In this specific case, we can see that Alice is sending stream ID 1, which has two tracks: track ID 2, which is audio, and track ID 1, which is video and is listed along with its resolution. On the right we can see Bob subscribing to Alice's stream. This is done using the m.call.track_subscription event. In the subscribe field we can see the two tracks we are subscribing to: track ID 2, which is Alice's audio track, and track ID 1, which is Alice's video track. We can also notice that Bob is subscribing at a specific resolution, 640 x 360. This will be very important for the next slide.
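The publish and subscribe exchange can be pictured as a diff between what a peer publishes and what we are currently subscribed to. The dict shapes below are simplified assumptions loosely modelled on the events named in the talk, not the exact event schema; consult the MSCs for the real field layout. The helper name `build_subscription` is hypothetical.

```python
def build_subscription(published, subscribed, width, height):
    """Build the content of a track subscription message (assumed shape).

    published  -- stream id -> {"tracks": {track id: {"kind": "audio" | "video"}}}
    subscribed -- set of (stream id, track id) pairs we currently receive
    """
    available = set()
    subscribe = []
    for stream_id in sorted(published):
        tracks = published[stream_id]["tracks"]
        for track_id in sorted(tracks):
            available.add((stream_id, track_id))
            entry = {"stream_id": stream_id, "track_id": track_id}
            if tracks[track_id]["kind"] == "video":
                # Ask for the resolution the tile on screen actually needs.
                entry["width"] = width
                entry["height"] = height
            subscribe.append(entry)
    # Drop tracks that are no longer being published.
    unsubscribe = [
        {"stream_id": stream_id, "track_id": track_id}
        for stream_id, track_id in sorted(subscribed - available)
    ]
    return {"subscribe": subscribe, "unsubscribe": unsubscribe}


alice_publishes = {"1": {"tracks": {"2": {"kind": "audio"}, "1": {"kind": "video"}}}}
bob_receives = {("1", "3")}   # track 3 is no longer published
print(build_subscription(alice_publishes, bob_receives, 640, 360))
```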
We can also notice that Bob is unsubscribing from track ID 3, since it is no longer being published: it is not present on the left. While SFUs do solve the issue of upstream bandwidth, there is still the issue of a lot of streams coming down, and therefore of downstream bandwidth. So how do we actually limit that, if we can? The solution to that problem is simulcast. Simulcast means that all of the clients publish multiple versions of their streams, each at a different resolution, for example 1080p, half of that and a quarter of that. The focus, or SFU, then forwards the ideal resolution for a given client. So on the right we can see two clients, the light green one and the dark green one. The light one is looking at one stream in one tile, and it is quite large on the user's screen, and therefore the SFU is forwarding the 1080p version of that stream. But at the bottom, the dark green client is actually subscribing to three streams, and they are displayed in quite small tiles on the user's screen, and therefore we only forward the 360p version of those streams. The focus might also send a lower resolution depending on the available bandwidth. Now let's look at the demo. Hi, thanks very much Simon. So I'm here to run you through an actual demo of Element Call and just show you generally how the application works and what the features look like. You can see me here in this little box down on the bottom right, and in the rest of the boxes are some of the rest of the VoIP team. Obviously you've got Simon, who has just been speaking to you. Florian. Hello, I'm Florian. I'm the engineering manager of the VoIP team. Yeah, and we're having a lot of fun developing Element Call, right? And Enrico. Hello there. I'm Enrico. Nice to be here. Excellent. So the first thing I'll show you is the two modes. We are in freedom mode at the moment, and what freedom mode means is that I can take these tiles and throw them around the screen in a rather fun way, like that.
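The choice the focus makes for each subscriber can be sketched as picking the smallest simulcast layer that still covers the tile. The three layers follow the example in the talk (1080p, half of that, a quarter of that); the selection rule itself is an assumption for illustration and ignores the available-bandwidth check mentioned above.

```python
# Simulcast layers published by a client: full, half and quarter resolution.
LAYERS = [(1920, 1080), (960, 540), (480, 270)]


def pick_layer(tile_width, tile_height, layers=LAYERS):
    """Return the smallest layer that covers the tile, else the largest one."""
    for width, height in sorted(layers):       # smallest layer first
        if width >= tile_width and height >= tile_height:
            return (width, height)
    return max(layers)                          # tile is bigger than every layer


print(pick_layer(1600, 900))   # one large tile
print(pick_layer(400, 225))    # one of several small tiles
```

A client showing one large tile ends up with the full-resolution layer, while a client showing several small tiles gets the quarter-resolution layer for each of them, which keeps the total downstream bandwidth bounded.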
So if I want whoever on the top right, I can do that. The other thing I can do in freedom mode is make any of the tiles bigger if I want, by just double clicking them. So if I want to focus on one tile, I can do that. To show you the other mode, spotlight mode: now that has highlighted Enrico as the last person that spoke. Florian, maybe if you speak. Yeah, so now I should be in the focus. Excellent. So yeah, that's highlighting the person that's speaking at the moment. I'll put this back on freedom mode for now. And now of course, the most interesting thing about this is that it's powered by the SFU. So when I make one of these tiles larger, you should actually see that the quality will improve. I'll do this with, there we go, yeah. So I think it's quite subtle and you may not necessarily be able to see it on the video, but if you look at Simon's books in the background, you can see that you can read the spines on the books better, when you previously couldn't. What might be a better demo of this is that we've actually got some debugging tools here. I can go into developer and, this is brand new, show call feed debug info, so that we can actually tell the exact video sizes that I'm getting from everybody here. So I'm getting 640 by 360 from Simon at the moment. And if I make him larger, after a couple of seconds, there we go. He's now in beautiful 720p, 1280 by 720. So that has automatically changed resolution pretty seamlessly. If you weren't looking at the spines on his books and trying to read the covers, and trying to read that he had his Mathematica book in the background, then you probably wouldn't even notice. But yeah, and then just back down to lower quality. You will see a slight pause there. That is something that we haven't finished yet. It actually waits for a keyframe to arrive before forwarding the lower res stream. We don't need to do that.
We could continue forwarding the higher res stream and then only actually switch when there's a keyframe, and that will make it much smoother. We just haven't got around to that yet. That is basically the next thing that we will do in SFU dev. Right, the next thing I'm going to show you, while I am on debug settings, back in developer, is the other option, the call inspector. You can see this down here. It's a bit bigger. There we go. So these are the automatically generated usernames. In this case, I think it's Peach Outdoor Earthworm, somebody. So here I can select one of these people and I can see all the signaling that's going backwards and forwards between the various parties in the call. One of these calls is the SFU. Actually, I only have signaling with the SFU, because my only call is with the SFU, but you can see all the candidates and all the invites that we were talking about in the talk going backwards and forwards, which is super useful for debugging. There we go. Let's turn that off again. Yes, I think the last thing we want to demo is screen sharing, which is a really good demo of this quality switching as well. So maybe Florian would like to start sharing some interesting content. Oh, yes, I will. So this is from the talk earlier today. There we go. Yeah. So that's actually how screen sharing works. Again, you can put the call feed debug info back on. You can see that's coming through at a fairly high resolution. I was on freedom mode, and that has automatically switched into spotlight mode because screen sharing has started. That's usually what you want, but I can switch back into freedom mode, at which point you should see, yes, there we go, the quality of that adjusts down again, just the same as any other feed. Or go back into spotlight, so you can see that. There you go.
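The smoother switch described at the start of this part of the demo (keep forwarding the old layer and only switch once a keyframe arrives on the new one) can be sketched as a tiny state machine. The class and method names are hypothetical, and packets are reduced to a layer name plus a keyframe flag; this is not how the SFU's code is actually structured.

```python
class LayerSwitcher:
    """Decides, packet by packet, which simulcast layer gets forwarded."""

    def __init__(self, current):
        self.current = current   # layer being forwarded right now
        self.pending = None      # layer we want to switch to, if any

    def request(self, layer):
        """The subscriber asked for a different layer."""
        self.pending = layer if layer != self.current else None

    def on_packet(self, layer, is_keyframe):
        """Return True if this packet should be forwarded to the subscriber."""
        if self.pending is not None and layer == self.pending and is_keyframe:
            # A decoder can start cleanly from a keyframe, so switch now.
            self.current, self.pending = self.pending, None
        return layer == self.current


switcher = LayerSwitcher("high")
switcher.request("low")
print(switcher.on_packet("high", False))  # still forwarding the old layer
print(switcher.on_packet("low", False))   # not a keyframe yet, dropped
print(switcher.on_packet("low", True))    # keyframe: switch and forward
print(switcher.on_packet("high", False))  # old layer is no longer forwarded
```

Because the old layer keeps flowing until the keyframe arrives, the viewer never sees a frozen picture during the switch.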
So yeah, it works just as well with screen sharing as it does with any other video feed. The text is all nice and readable. And there you go. Yeah. Thank you very much for watching, and I think that's about everything from us. Thank you. Bye-bye. Have a nice party. Let's look at what the future holds. Firstly, we'd like to look at actually implementing the focus selection logic and cascading. This has been partially specced out, but hasn't been implemented in the real world just yet. So we'd like to do that to see how it works in action and, if necessary, amend the specification. Another thing which is quite important to us is end-to-end encryption via insertable streams, to prevent potentially malicious foci admins from listening in on conferences. And the last big thing which we have on our minds now is rapid track switching by pre-negotiating transceivers for tracks. This allows us to avoid having to do the renegotiation dance and allows us to lower the delay when actually switching between different tracks. This is useful in two main scenarios. One of them is running through a crowded area in Third Room, where you have to switch between multiple audio sources of all of the people in that area and you want to avoid any latency. Another use case is quickly switching the speaker, to avoid any delay in that situation. We need help. If you have time, you can check out Element Call, both the stable version and the develop version, or even the MSC3898 version, which supports SFUs. Also, if you have experience with Matrix, WebRTC or coding in general, you can check out the repositories, pull requests and MSCs. Thank you for listening, and I hope you have a good time at FOSDEM. So the question is about using it with other Matrix homeservers, like Dendrite or Conduit. No, it should work with any Matrix homeserver. It doesn't really matter which one you use. That's a good question. Simon, will you go first? You can go first. Okay.
Yeah, so the question is, what are you most looking forward to achieving with the SFU? So basically the idea is that the SFU is quite dumb in terms of application logic, and yeah, we also want to implement the cascading bit of it and then test and see if you really can scale conferences to virtually any size. I think that would be the pretty cool long-term goal. The other one, in the near term, is to get end-to-end encryption in, not in the SFU but on the clients, such that you have end-to-end media encryption, and to have sane control of network bandwidth and media quality. So I would definitely agree with that, and among other things I'm also looking forward to just getting rid of Jitsi, so that all the Matrix clients can just rely on something based on Matrix, because the Jitsi experience isn't exactly great in all situations. Yeah. Okay. So I'm looking forward to this. Okay. Hi. So I can try to emulate Travis and what he used to say last year. So if you have any other questions, like what is your favorite color, then you can ask. Actually, an interesting question. Can we share two things in Chrome and try that? So today we are using the new large grid layout. As you can see, it's not completely perfect for screen sharing, since you can't make it full screen, but we are going to merge it with the layout, you know, from the current version of Element Call, so as to be able to see it in the regular spotlight view you have now. Hey, cool. You have the first person joining you. Hi. Hi. Okay, so the talk seems to be ending in about one minute. So thank you for watching the thing. And I guess we'll be sent over to the other room. She gets created or the viewers will be sent over to the room in which we are. Yeah, thanks for attending and have fun with the remaining talks. Yeah, thanks everybody. Cool. Both like quite eager for the talk to end since it sent the message. Thank you. 
Yeah, that's true. So I think we can close the session right. We'll be over to element call. So I leave here. See you. Bye bye. Thank you. |
Building a social app on top of Matrix
Fighting surveillance capitalism for fun and profit |
Hey everybody, I'm Charles and I'm here to talk to you today about building a social app on top of Matrix. So a little bit about me before we get started. I'm a lead software engineer here at FUTO. I come to this world from a background of CS academic research. I was a professor for about eight or nine years and I worked in security and privacy. I was very interested in how we build real encrypted systems that can protect our data from attackers who want to get at it. At the same time, like a lot of you, I've got a family at home, and I got very interested at one point in how I could bring these two worlds together and build encrypted systems not only to protect corporate data or national security data, but to protect my data at home for my family and my kid. The company that I work for now is called FUTO. It's a relatively new company. The current incarnation was founded, I think, in the beginning of 2021 by Eron Wolf, our CEO and my boss, who was the creator of Yahoo Games back in the day. And our mission at FUTO is to empower users and help people get away from depending on the four or five or six major tech giants that are controlling more and more of our lives and mediating more and more of the interactions that we have with each other nowadays. And so it's a really nice confluence of my own personal goals and the company's goals, and I hope we're doing some cool things, starting with the project that I'm talking about today. And so my motivation when I started this was that I was a young security and privacy researcher. I had a cute new baby at home and, like all new parents, I wanted to send out these pictures to as many people as possible, right? But at the same time, I was reading constantly, in my work life and in the news, about all these crazy things that these centralized companies were doing with the data that we give them. 
And so my wife and I decided very early on, our kid is not going on these centralized services at all. And that put us in kind of a bind, because when we looked at the technology space of what we had for sharing photos with grandmas, grandpas, aunts, uncles, friends, cousins, it really looked like we had to make a choice. On the one hand, there were things that were convenient and easy for us to blast out those photos or to put them up where everybody could get them later at their leisure. On the other hand, we could prioritize security and privacy. And there was some really exciting stuff coming out at the time, like TextSecure and Signal, and then Matrix added end-to-end encryption. But at the same time, all of those products were very focused on interactive chat. And we tried using Signal with our friends and family for a while and it worked okay. But it left a lot to be desired. And so we wanted something more. That's what sent me down the path of working on this project. And I want to say this is not necessarily something that's limited only to parents of new babies. Maybe you like to travel and you take a lot of cool pictures from your trips and you want to share these with your friends and your family. And you don't want big tech spying on every last move that you make and analyzing every object in the background of every photo that you upload. Maybe you want to run a book club and have interesting discussions with people and not have every word that you say feed into some recommendation algorithm that's going to be used to sell ads to you. Or any number of other things that you might want to do to keep up with the people that matter in your life without being spied on and without being reliant on three or four companies that provide all of our digital life. For now though, we'll use the baby photos as the motivating example. And so at some point I sat down and I wrote down the goals of what I wanted. 
And it turns out they're pretty simple. First of all, I wanted most of the convenience of using something like centralized service like Facebook or Google Plus or MySpace or Friendster or any of these things. We all used 20 years ago and gave all of our data away to people that maybe now we regret giving it to. But why those services were successful is because they make life really easy. I can very easily keep up with the number of people that a human interacts with in a typical human life. One, two, maybe three, four hundred people and probably a much smaller group that I really care about. I can very easily publish some updates and have that go out to everyone that I care about. And the great thing about those sites is that it's explicitly asynchronous, right? So I can post whenever I want and it doesn't bother my friends. And I can post as many times a day as I want and it's not going to bother them. It's not going to ding their phone until they decide it's time to come and check and scroll through their timeline and see what I've posted. The second major thing or actually maybe the first major thing that I wanted being a security and privacy guy was I wanted most of the protections that you can get right now with an app like Signal, right? So with Signal or now with Matrix, you get confidentiality or secrecy. So I can send a message to my friends and only my friends who have been provided with the decryption key or they're the only people who can read what I said. Similarly, we can get what we call integrity. In a very simple terms, it just means that nobody can inject junk into my conversation, right? They can't have weird stealth ads. They can't try to steer me towards certain interests that they think I might like. What you see is what you get. And that's what we get from the cryptographic protections of the end-to-end encryption algorithms. 
And then finally, one thing that we want to do here at FUTO, and that Matrix makes really nice and easy, is to get a little bit more independence from the big tech mega-corps that are trying to take over the world. Because with Matrix, it's pretty straightforward for a user to run their own server, or if they don't want to run their own server, at least they're not locked into the same four or five providers that control everything else, right? So in a word, what I was looking for was a more secure and private social networking tool for enabling real human relationships between real human people. And this seems to be a pretty pressing need. When I've talked to other people, when I talk to grandparents, I talk to new moms, I talk to middle-aged people, I've had this feeling for a while that Facebook has kind of become a ghost town. It turns out this theory has a name. It's called the Dead Internet Theory. If you want to lose a couple hours of your life going down a rabbit hole, you can do an internet search for that and find some crazy speculations out there. I won't go into all that right now. Look it up if you're interested. But what I've found is that Facebook has become kind of a ghost town, and my own theory is that the lack of privacy has been a major driver of that ghost town status. What you see when you talk to real people is that the normals are out there stuck using tools from 20 years ago. People are still using SMS to keep in touch with their grandparents and their grandkids. People are still emailing photos to each other like it's 2003 or 1999. And that kind of sucks, because at the same time, we're in this world where we have really slick, really powerful, really cool user interfaces that have been built by these big tech companies for interacting with random internet people that we've never even met. So we're using the coolest technology to keep up with the people who matter the least to us. 
And we're using the clunkiest, oldest technology to keep up with the people who matter the most. That's backwards, right? So we wanted to change that. So I took a sabbatical from my former academic life. I watched some Apple tutorials and learned a little bit about making mobile apps. And I made a first prototype of this app that we call Circles. The user interface is designed to be very comfortable and convenient and familiar for users who have used the centralized big tech services. But at the same time, on the back end, it's all Matrix. And so Circles can talk to any standards-compliant Matrix server. You can host your own. You can connect with people on different servers from you. And we get all of the really cool benefits of Matrix almost for free, right? We get the end-to-end encryption. We get the federation. We get the open standards and the open source code. It's very easy now for anybody to write a tool that's compatible with what we've got. And we're compatible with tons of other tools in the Matrix ecosystem. So to make this a more technical talk, I'll talk a little bit about some of the challenges that we faced when working on this project. The primary one that I'm going to talk about today is just the issue of trying to figure out, OK, how do we map human social structures, which aren't really nice and neat and clean and don't fit into a nice little square box, onto the kind of limited data structure that we have? Other work that we've done on the project recently, which I won't talk too much about today, is mostly around how we balance protecting users' privacy with making it easy for them to use the app. So one thing that we're working on right now is trying to improve discoverability, right? Making it easier for users to find the friends that they have who are also in the system somewhere. 
And at the same time, we're kind of treading a fine line because we don't want to make it too easy for random strangers to find you. That's bad for privacy. We've also done some work. In fact, we built a whole matrix authentication system to make it easier for users who only have a single device who only want to remember a single password and make it easier for them to do Matrix's secure server-side storage. But again, today, let's talk mostly about the human social structures and the challenge of mapping that onto a system like Matrix that was built to do chat rooms. So let's think about what we have right now for online social networks. The big type of relationship that they have, they have different names for it, but it's a friendship-like relationship, right? On Facebook, it's called being a friend on LinkedIn. I think it's your contacts. It's basically the same thing, right? Somebody that you're connected with, and this is kind of a challenging thing with the current kind of view of Matrix as chat rooms because it's not like I can just go and easily have one Matrix room where I can talk to all the people that I care about. Most of them don't know each other. And if we were to try to cram everybody into one room, it gets kind of awkward. And if chat rooms were a solution that solved everything, then we wouldn't need to do anything new, right? I could just get all my friends to install Element or Signal or Telegram or whatever and we could just have our chat rooms and we would go from there and we would all be happy. That's really what it seems like people are doing now, but I think we can do something that's even easier and more convenient for a lot of use cases. So before we work up to how we're going to do something better with Matrix chat rooms on the back end, first let's take a look at kind of like the easy mode. So let's think about another online social network tool, which is Facebook groups or Google groups or any other branded groups. 
So what's that look like? Well with a group, you have really well-defined membership. It's really easy to know exactly who is in the group, who should see the posts that go into the group and who shouldn't. And moreover, once you're a member of the group, you're connected to every other member of the group in pretty much exactly the same way. And so again, this is like easy mode, right? So what's the difference between in terms of like data structures and how we're going to store stuff on the servers and how we're going to transfer that to the clients? How is that any different from a normal Matrix chat room? Really it's only different in terms of the UI. It's just drawn and presented differently to the user. The actual data that we store on the server can be exactly the same. And so just like when we have one interactive chat room for interactive chat, we can have one Matrix room for an asynchronous online social network type of group discussion. So we create one Matrix room. Everybody who's a member of the group gets added as a member of the room. And then when somebody posts into the room, everybody in the group can see it. That's pretty much exactly what we wanted for a Facebook group's type of feature. And so then the only difference between this and the traditional Matrix real-time chat is just that the client needs to be different and it just needs to render each post. It's not as a little chat bubble, but instead it needs to render it like a social post, right? And then the convention that we've developed in the world is that a social post has the little profile photo. It's got the little user name, the display name, and maybe the user ID. And then it has the body of the post, and it's got this nice little border around it. And that's what users have been trained to expect with Twitter and Facebook and G+. And everything else. And so that's pretty easy. That's just, we just need a slightly different Matrix client with a different UI. Cool. We can do that. 
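To make the "same data, different rendering" point concrete, here is a minimal sketch in Python. It assumes a standard Matrix `m.room.message` event (with `sender`, `origin_server_ts` and `content.body` fields) and a hypothetical `display_names` lookup table; the function name `render_post` and the text-box layout are illustrative only and are not taken from the Circles code.

```python
from datetime import datetime, timezone


def render_post(event: dict, display_names: dict) -> str:
    """Render one m.room.message event as a bordered text 'post'.

    `display_names` is a hypothetical mapping from user ID to display name;
    a real client would read this from room member state.
    """
    sender = event["sender"]
    name = display_names.get(sender, sender)
    # origin_server_ts is in milliseconds since the Unix epoch.
    when = datetime.fromtimestamp(event["origin_server_ts"] / 1000, tz=timezone.utc)
    header = f"{name} ({sender})  {when:%Y-%m-%d %H:%M} UTC"
    body_lines = event["content"].get("body", "").splitlines() or [""]

    # Pad every row to the same width so the border lines up.
    width = max(len(line) for line in [header, *body_lines])
    border = "+" + "-" * (width + 2) + "+"
    rows = [border, f"| {header.ljust(width)} |"]
    rows += [f"| {line.ljust(width)} |" for line in body_lines]
    rows.append(border)
    return "\n".join(rows)


if __name__ == "__main__":
    example = {
        "type": "m.room.message",
        "sender": "@alice:example.org",
        "origin_server_ts": 1675500000000,
        "content": {"msgtype": "m.text", "body": "First steps today!"},
    }
    print(render_post(example, {"@alice:example.org": "Alice"}))
```

A chat client would draw the same event as a bubble; only the presentation layer differs, which is the point being made above.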
Okay, now that we have that, let's think about how we can extend this to represent friend relationships. So again, the tricky thing about friend relationships is that they're asymmetric, right? If we make a really simple Venn diagram here of my friends and your friends, maybe you and I have some mutual friends, but it's really obvious that not all of your friends are also my friends. And not all of my friends are also your friends. And so the insight that I think the internet has taught us all over the last 10 or 20 years is that whatever you do, do not put everybody all into the same room. That is a recipe for drama and pain and strife. And so really, if we want to make people happy, well, your friends are here to see updates from you, right? They don't know or care about me necessarily. So let's make one room where you can post your updates. All your friends can be members of that room. And then you're good. If we're using Matrix's end-to-end encryption, then it also follows that nobody outside of the circle can read your posts at all. Cool. That's good. Then we also have these people who are my friends. Well, okay, cool. Let's make one room for me where I can post my updates, and all of my friends can be members of that room and they can get the decryption keys and they can see the stuff that I post. Cool. So then what if there's somebody who's our mutual friend, right? They're my friend and they're your friend. Okay. Well, no big deal. They just need to be a member of both rooms. And this structure generalizes, right? So for each user in the system, we give them their own room, and that's the room where they're going to post their updates. And then everybody who's their friend, or whatever the relationship is that we're capturing here, right? Their neighbors, their family members, their coworkers, whatever it is, right? Whoever it is that they've invited to follow them is a member of that room and they get to see those posts. 
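The one-room-per-user structure described above can be sketched as a small data model. This is only an illustration under stated assumptions: the `CircleRoom` class, its field names and the `rooms_followed_by` helper are hypothetical and say nothing about how Circles actually stores state on a Matrix server.

```python
from dataclasses import dataclass, field


@dataclass
class CircleRoom:
    """One room per user: the owner posts, invited followers read."""
    owner: str
    followers: set = field(default_factory=set)

    def can_read(self, user: str) -> bool:
        # With end-to-end encryption, only room members hold the keys.
        return user == self.owner or user in self.followers


def rooms_followed_by(user: str, rooms: list) -> list:
    """Owners whose rooms `user` is a member of (excluding their own room)."""
    return sorted(r.owner for r in rooms if r.owner != user and r.can_read(user))


if __name__ == "__main__":
    yours = CircleRoom("@you:example.org", {"@carol:example.org", "@erin:example.org"})
    mine = CircleRoom("@me:example.org", {"@carol:example.org", "@dave:example.org"})
    rooms = [yours, mine]
    # Carol is a mutual friend, so she is simply a member of both rooms.
    print(rooms_followed_by("@carol:example.org", rooms))
    # Dave only follows me, so he cannot read your posts.
    print(yours.can_read("@dave:example.org"))
```

The asymmetry falls out naturally: membership in my room says nothing about membership in yours, and a mutual friend needs no special handling.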
So we create one of these rooms for each user. And then now when I want to see, hey, what have my friends been up to? My client goes out and it fetches all the recent messages from all the rooms of all my friends and it collates these into one nice unified timeline. And then I can scroll through this like I'm on Twitter, like I'm on Facebook, like I'm on any other traditional centralized social network that is collecting all your data and spying on you. But again, now all of these posts came from Matrix rooms that are doing end-to-end encryption. And so now the only people that can see that content are the people who are members of those rooms. The other cool thing to point out while we're here is that you're not limited to having only one copy of the structure that I'm showing on the slide here. So right here I've got a circle of three friends. Maybe these are my college friends. Maybe these are my high school friends. I could have six other circles that I'm a member of that would give me different timelines that I could scroll through to see what that group of people is up to. So we have a prototype app built that does this. It's called FUTO Circles. We originally built the app on iOS. We had a beta. We got some really good user feedback. And then we had some decryption errors. And we're reworking some of the SDK layer right now. We're hoping to wrap that up by the time that FOSDEM happens. And so hopefully very soon after you see this talk, we'll have another beta on iOS ready for you to try out. But it's not quite there yet. On Android, we have a current beta that's up and running. You can get that from the Google Play Store or from our own F-Droid repository where we have our beta releases. I've got the information here with the URL and the QR code. I'll give that again later too. I've got some screenshots for you. It's a pretty simple app. It does what I talked about earlier. You can choose. 
You can see a list of all of the social circles that you're a member of. You can see a list of all your groups. Once you tap on either of those, you can go and scroll through the posts. And again, we render Matrix messages just like they're Twitter posts or Facebook posts or Google Plus posts. And you can scroll through your timelines. And the app takes care of all the work of figuring out which rooms are involved in which timelines, fetching the data, decrypting, and displaying it to you. And so again, the user doesn't have to worry too much about all of the underlying technology. In the screenshot on the right side, we can also do photo galleries. I didn't talk too much about that, but it's pretty straightforward to imagine how we can have a Matrix room full of image posts that represents a photo gallery. You can have those just for your own use. You can use them as an easy way to share photos with your friends and family too. So we have the apps right now. We're continuing to work on them and trying to make them better. At the moment, we're working on improving our discoverability. We have some cool work going on with profiles as Matrix spaces. This is inspired by a cool proposal from Henri, who works on MinesTRIX. We're also working on notifications. And then in the rest of 2023, we're hoping to work towards eventually having a full production release on both iPhone and Android. As the first step towards that, I think we're going to have another round of public beta tests. And then we need to make sure that we get support for MSC3917. If you're not familiar with that one, that's the fix for a lot of the security vulnerabilities that were disclosed late last year. It's the cryptographic verification of all room membership events. That'll give us a really nice security and privacy story. 
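The timeline work described above (figuring out which rooms belong to a circle, fetching their recent messages, and collating them into one feed) can be sketched as a merge of per-room event lists ordered by `origin_server_ts`. This is a simplified illustration, not the Circles implementation: the function name `unified_timeline` and the `room_events` input shape are assumptions, and fetching and decryption are left out entirely.

```python
import heapq
import itertools


def unified_timeline(room_events: dict, limit: int = 20) -> list:
    """Merge per-room event lists into one newest-first timeline.

    `room_events` maps a room ID to a list of already-decrypted events,
    each carrying the standard `origin_server_ts` field (milliseconds).
    """
    def timestamp(event: dict) -> int:
        return event["origin_server_ts"]

    # heapq.merge expects each input to be sorted already, so sort per room
    # first (newest first), then lazily merge across rooms.
    per_room = [sorted(events, key=timestamp, reverse=True)
                for events in room_events.values()]
    merged = heapq.merge(*per_room, key=timestamp, reverse=True)
    return list(itertools.islice(merged, limit))


if __name__ == "__main__":
    rooms = {
        "!alice:example.org": [
            {"sender": "@alice:example.org", "origin_server_ts": 3000,
             "content": {"body": "Back from the trip"}},
            {"sender": "@alice:example.org", "origin_server_ts": 1000,
             "content": {"body": "Packing"}},
        ],
        "!bob:example.org": [
            {"sender": "@bob:example.org", "origin_server_ts": 2000,
             "content": {"body": "New baby photos"}},
        ],
    }
    for event in unified_timeline(rooms):
        print(event["origin_server_ts"], event["sender"], event["content"]["body"])
```

Because the merge is lazy, taking only the first `limit` items avoids sorting every message from every room, which matters when a circle has many members.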
And so with that, we will work on adding subscriptions via the Google Play Store and the Apple App Store, so that if you have friends who don't have their own Matrix server or don't have a Matrix account, but they want to connect with you on the Circles app, we'll make it very easy for them to go in the app, tap a couple of buttons and sign up for a subscription, and we can host their account for hopefully very cheap, just a couple of dollars a month. And then once we get a production release, we're hoping to try to grow this thing and try to reach normal people and help provide them with a more private and secure way to stay in touch with the people that matter in their lives. So thanks for watching my talk. If you want to connect with us on Matrix, our main room for the project has the alias circles on matrix.org. You can also find the code for the iOS and the Android versions of the app. Both of these are open source. They're available on our GitHub with the URLs shown on the slide. And again, you can try the Android beta. If you have the F-Droid app, you can just scan this QR code that I'm showing on the slide right now. Once again, thanks and take care. Okay, so I think we're available now for a Q&A for the talk on building a social app. And if anybody has questions about Circles, I am around. I had a little bit of difficulty here with getting the upvoted questions widget working again. But now it seems to be working, maybe with a little lag. There we go. Hey, good. Okay. Thanks, Julian. It's a little bit hard to tell if it was working. Oh, no, I'm getting a little bit of issues with the widget. I saw some questions and then they went away. I don't know. Let's see. Can I... Cool. All right. Let's see if I can go into the dev room and then... I have too many widgets. Too many things going on. Let's see if I can move this one so I can see. Ah, let's see. There's a question about the website and the... Oh, there's a question about making a web version. It's tempting. 
It's a little hard. Coming from a security and privacy background, the idea of having a web app makes me nervous. The security story there is a little bit weaker. Originally, I was very, I guess, afraid of the idea of having all these very short-lived sessions. I don't know how much of the down-of-the-weeds details people know about Matrix's end-to-end encryption, but every time you log in with a new session, with a new device ID, you get a new key pair, and that creates a little bit more work for the E2E system to manage all these different public-private key pairs that you've had. It gets a bit messy, and I was very afraid of that as a new developer. Now with a little bit more familiarity with the way Matrix works, for example, Element handles this really well. Most of the time, you can use Element Web with these little sessions that come and go, and it's totally fine. Maybe in the future, if we can find a really great intern, we could have a prototype of circles that runs on a web page or in a desktop thing similar to Element. Maybe that would be something we could try out. I guess, personally, I'm coming around to the idea that it's probably going to be something that's going to be necessary. Because with the mobile platforms, we're just so locked in, and we're so dependent on these two companies to say yes or no, whether we can have an app at all. It's kind of the opposite of independence, right? We don't have a web version right now. We might in the future, I guess, stay tuned. I think our focus for 2023 is definitely getting the mobile apps working. Let's see if I can... Oh, and another question. If it's paid subscription and open source, does this mean that the client is tied to the FUTO server? Not at all. The client right now, I think we have probably about as many users on the beta that are using their own servers as are using ours. The goal is to have it always work with any matrix server, so you can stand up your own. The idea is to just make it... 
If you don't want to run your own server, we want to make it easy for you to get on to one. We have one more minute, and Andreas asks, do we have any plans for gaining growth? In the short term, it's going to be going around and trying to recruit expert users, right? We're going to security conferences, we're going to talk to Matrix people, and then let you recommend it to your friends. Let's see. We're continuing here in the room for the talk. If you're still with us, I think we've been booted out of the Matrix online dev room. Yeah. Let's see. I just posted the address for this Q&A room into the Matrix dev room, in case anybody wants to come in here and chat. Let's see. Juggling lots of widgets on a laptop screen here. All right. Sorry for the confusion earlier, I was having a little trouble juggling it all. Let's see. I'll stick around for another couple of minutes. If anybody has any questions right here, you can feel free to ask. Let's see. Oh, no, Andreas' question. The answer got cut off. Oh, no. That was too slow. This was about growth and monetization, I think. So the short answer is that once we get the apps working reasonably well and the bugs ironed out, I think our first step to getting adoption is to go to basically the people like you, the people in this room, the people who go to FOSDEM, the people who go to the academic security research conferences, which is where I used to hang out. So all the people who really know that this is good, right, and the people who can make an authoritative recommendation to their friends that, yeah, this thing is not a scam. Yes, the security of this is actually up to par. The protection of this is actually what you want and what you need. And if we can grow from there by getting kind of like the expert recommenders, then they can tell their friends and, by word of mouth, we can grow kind of family by family. 
Eventually, we will have families where they're both on, and then people can start to find their friends, but I think the growth strategy is probably one family at a time. And we'll use things like the family subscription options that Apple and Google provide, where one person can pay and they can get accounts for like five or six of their people. I think Apple calls it Family Sharing. I forget the name of what Google calls it. So that's kind of the plan. Let's see if I can catch up on any of the other questions that I missed. It was a little hard to see my video and the chat on the side at the same time. Maybe I need a bigger screen. It's still difficult, okay. Okay, cool. Thank you all for watching. Give the app a try and let us know if you run into any difficulties or anything that feels weird or that isn't easy or convenient, and we'll try to get it as good as we can make it this year before we do a public release. I'll hang out in the Circles room over on circles on matrix.org for a few more minutes, if anybody wants to talk there. I check that every day, so if you have questions or ideas or anything else, stop by and let us know. Thank you. Bye. |
Decentralizing moderation
Mjölnir for all |
Hello, my name is David, also known as Yoric, and I'll be your host for this session on decentralizing moderation. Most of this work was actually done by Gnuxie and Jesopo. And it's work that we have been doing for the past few years, and it started at Element. And of course, it's about the Matrix network. What we want to do is make moderation something that actually, and not just theoretically, works in a decentralized federated network. So what is moderation? Moderation is something that's defined by the fact that we are communicating, that there may be rules on this communication, and that we have people who enforce those rules. The moderation part is the enforcing of those rules. Again, we are talking about the Matrix network. The network itself has no clue about rules. You can have many different kinds of communities. Some of them are going to be family friendly. Some of them are going to be much more adult. Some might be work specific, or sport specific, or politics specific, or anything else. And each one of them can have a different set of rules. If you do not have moderation, what's going to happen? Well, this is what's going to happen. You are happily chatting with your best friends in a room, and suddenly, Marvin decides that this room must be spammed with stinky French garlic spam. Who wants stinky French garlic spam in their room? I don't. This is where the moderator needs to step in, most likely kick Marvin out, or at least have a stern talking-to with Marvin, and get rid of the spam content. Of course, I'm talking about spam, but in the real world, it can be much, much worse than spam. I leave this open to your imagination. As mentioned, we're talking about federation. In this case, it means that we have users who connect to one communication room from a number of servers. 
For instance, today, listening to this presentation, I'm going to assume that some users are connecting from fosdem.org, some from matrix.org, and presumably some from a number of other servers. It means that the servers need to agree upon who is a moderator with the authorization to kick bad users and to get rid of bad content. That is something that the Matrix network already handles. That's part of the specification. However, things can quickly become complicated. You don't have a single room on a server. You can have users who participate in multiple rooms, who can be moderators in some of them and not moderators in others. You can also have moderators who come from different homeservers and who share the moderation burden and the moderation privilege in a number of rooms. That's possible. Typically, in the Matrix network, you go one step further by introducing moderation rooms. Rather than having the moderators operate directly in the room they're moderating, they are members of a moderation room. In this moderation room, they use a bot, to whom they have delegated the moderation abilities, to perform the day-to-day actions. When Alice or Fred wish to kick Marvin, they ask Mjolnir, the name of the bot, to do this on their behalf. This has a number of advantages. One of them is that if Alice and Fred are moderators for many rooms, they only need to kick Marvin once if they need to kick him from all the rooms. They can also publish and share policy rooms. This is how we get into the even more federated part. Those policy rooms are what let rooms and communities work together to define rules, to define users who have been banned for being toxic and abusive or spammy and who should no longer set foot in any of those rooms. Generally, that is one step further in the federation of moderation itself. What you already have with Mjolnir is those moderation rooms, which can again moderate a number of individual rooms, so entire communities. 
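As a rough illustration of how a shared policy room lets moderators "kick Marvin once" across many rooms, the sketch below matches room members against ban rules shaped like Matrix `m.policy.rule.user` event content (`entity`, `recommendation`, `reason`). It is an assumption-laden simplification, not Mjolnir's code: the function names are hypothetical, and Python's `fnmatchcase` is used as a stand-in for Matrix glob matching (it also treats `[...]` specially, which Matrix globs do not).

```python
from fnmatch import fnmatchcase


def matching_ban(user_id: str, policy_rules: list):
    """Return the first ban rule whose entity glob matches user_id, else None."""
    for rule in policy_rules:
        if rule.get("recommendation") == "m.ban" and fnmatchcase(user_id, rule["entity"]):
            return rule
    return None


def plan_bans(protected_rooms: dict, policy_rules: list) -> dict:
    """Map each protected room ID to the sorted list of members to ban.

    `protected_rooms` maps a room ID to the set of user IDs joined to it.
    """
    plan = {}
    for room_id, members in protected_rooms.items():
        to_ban = sorted(u for u in members if matching_ban(u, policy_rules))
        if to_ban:
            plan[room_id] = to_ban
    return plan


if __name__ == "__main__":
    rules = [
        {"entity": "@marvin:example.org", "recommendation": "m.ban", "reason": "spam"},
        {"entity": "@*:spam.example", "recommendation": "m.ban", "reason": "spam server"},
    ]
    rooms = {
        "!chat:example.org": {"@alice:example.org", "@marvin:example.org"},
        "!dev:example.org": {"@fred:example.org", "@marvin:example.org", "@bot7:spam.example"},
        "!quiet:example.org": {"@alice:example.org"},
    }
    print(plan_bans(rooms, rules))
```

One rule published in a policy room is enough to cover every protected room, and communities that subscribe to the same policy room get the same decision without repeating the work.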
You have those moderation policies that you can publish, although you don't have to, and you can share them, again through federation, between those communities. You also have the side benefit that moderators can operate with some degree of anonymity, which is good because we have seen a number of cases in which there are reprisals against moderators when they try to apply the rules. So at this stage, hey, we have decentralized moderation. Well, thank you for coming and listening to this presentation. It was a very brief presentation, but it was a very intense one, so thank you. If you have any questions, oh, before I forget, yeah, if you want to use Mjolnir, you need to install it. It's not very hard, there are just six easy steps, I'm not going to read them out loud, and then you can use Mjolnir, sorry, I meant 10 steps, 10 easy steps, and then you can use Mjolnir, 16 easy steps. I mean, who doesn't know how to configure nginx or Pantalaimon, right? But after these 21 easy steps involving Docker Compose, these 23 steps, are we done yet? Yes, we're done. So after these 23 steps, the 24th step says that you can now start becoming a moderator for those rooms. It's great until you read step 25, which says that if you want to do this with another set of rooms, you have to repeat everything. That is a bit complicated, especially since, yes, some moderators are highly technologically literate, but some aren't, and they don't need to be, or at least they shouldn't need to be. So the main issue at this stage, when we talk of decentralizing moderation, is democratizing moderation. How do we do that? Well, that's the entire point of the project Mjolnir for all, which has been developed by Gnuxie and Jesopo. And before I proceed, I would like to continue with a public service announcement. If you know employers or potential employers for two highly passionate and high-level developers, please get in touch with them, or with me. So what would we like at this stage? 
We would like something simple, something that does not involve the 25 steps mentioned earlier. We would like to simply be able to invite Mjolnir, confirm that it's the moderator, and be done with it. So can we do that? Let's see how this works. Here we are, me and my best pal having a chat in our very own, very interesting room. And just in case, I'm going to set up decentralized moderation for this room. So let's invite the Mjolnir bot. This creates two rooms: one that is going to be useful for sharing policies, in case I want to cooperate with other moderators, for instance, and we have a common base of people we wish to kick, and a moderation room in which we can actually use Mjolnir. Let's not forget one thing. We need to make sure that our moderator can moderate. With this, the moderator is now in a position to moderate. And if someone were to, say, send some spam, oh no, spam has arrived. Well, we should do something. Okay, I'm not going to do something right now because I'm not planning to demonstrate all the features of Mjolnir, or of Mjolnir for all. But we are now in a position where we can do something about it. So we can kick, we can ban, etc. The objective here is to ship Mjolnir for all with homeservers. Let's face it, there is still a difficulty involved in setting up Mjolnir for all as the administrator. And this difficulty, I would like to consider, is part of the difficulty of setting up the homeserver. Of course, it can also be shipped standalone. But as you have seen, if the administrator of the homeserver or the administrator of Mjolnir for all does the necessary steps to set this up, moderators don't need to set up anything. It's just the two-click version. They just need to invite Mjolnir and confirm that Mjolnir is entitled to moderation rights. We even have a prototype of a user interface that does this with a button in Element. 
I'm not going to demonstrate it here because I'm going to run out of time. But that works too. At this stage, Mjolnir for all doesn't implement the entire feature set of Mjolnir. It's pretty close, but that would need to be finished. And once we have all of that, well, this can entirely transform the experience for communities. Because creating a community is just something that you can do by creating a new room, possibly creating a space, and inviting Mjolnir for all. And you're done. None of these messy 25 steps of installing Mjolnir. There is one more problem that I'd like to talk about. Whether it's in a federated world or in a non-federated world, at some point you will need to call for help. Possibly because you have just witnessed Marvin publishing some stinky French garlic spam in the room, possibly because you are a victim of abuse, bullying or harassment, or a witness of some. And you wish to attract the attention of moderators to that so that they can do something about it. In the current flow, well, in Element, but it's basically the same thing with other Matrix clients, what do you do? You click on report, you write a few words to explain what's going on, and you wait for something to happen. As you'll see, in many cases, you can wait for a long, long time. Because on the other side, how do you get access to these calls for help, also known as moderation requests? Well, the official workflow in Synapse is the following. First you need to create an admin account. It's documented but not standard. Then you need to write your own program to poll a REST API, which is also documented and also not standard. Then you're going to see the requests, assuming that your program works, and you can do something about it. But that's not really ideal, right? Because first you need to be the admin to get the moderation requests. And then you need some custom program. Surely that's not how it should work. 
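For a sense of what that "custom program" involves, here is a minimal sketch of its two pure parts: building the request and reading the response. It makes no network call. The endpoint path `/_synapse/admin/v1/event_reports`, its `limit`, `from`, `dir` and `room_id` parameters, and the `event_reports` response field are taken from the Synapse admin API documentation as I remember it, so please check them against your Synapse version; the function names and the sample payload are hypothetical.

```python
from urllib.parse import urlencode


def event_reports_url(base_url, limit=100, from_token=0, room_id=None):
    """Build the URL for listing reported events (Synapse-specific admin API)."""
    params = {"limit": limit, "from": from_token, "dir": "b"}  # "b": newest first
    if room_id is not None:
        params["room_id"] = room_id
    path = "/_synapse/admin/v1/event_reports"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


def summarize_reports(payload):
    """Reduce a decoded JSON response to (room, event, reason) tuples."""
    return [
        (report["room_id"], report["event_id"], report.get("reason") or "")
        for report in payload.get("event_reports", [])
    ]


# A made-up response body, shaped like the documented one (assumption).
sample = {
    "event_reports": [
        {"room_id": "!room:example.org", "event_id": "$abc", "reason": "spam"},
        {"room_id": "!room:example.org", "event_id": "$def", "reason": None},
    ],
    "total": 2,
}
print(event_reports_url("https://synapse.example.org/", limit=10))
print(summarize_reports(sample))
```

In a real poller the URL would be fetched with an `Authorization: Bearer <admin access token>` header, which is exactly why only a homeserver admin can see the reports in this flow.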
So for a year and something, we've had a new flow in Mjolnir to replace that. You need to install nginx, you need to set it up, you need to set up Mjolnir for that. And instead of having to write a custom tool, you're going to see the requests in the room, in the moderation room, which is infinitely better. But it's still not good enough, because you need to install nginx, for one thing, and because that's only for one homeserver. That's still in this paradigm in which you have only one community per homeserver. And as we have seen, that's not the case. It gets worse, because to receive the call for help, you'd better hope that the user is connecting from the same homeserver as an administrator who can do something about it. If, for instance, you are in this room, which I believe is a fosdem.org room, I mean the room in which you may be listening to this presentation, I'm going to assume that all the moderators and all the people who can do something have accounts on fosdem.org. This might not be true, but for the example, let's say that it is. If you are sending a call for help from an account on matrix.org, then this call for help is going to be received by the administrators of matrix.org. And if none of the administrators of matrix.org can do anything about fosdem.org, for instance because they're not members of the room and they do not have moderation rights in this room, well, in that case, they can do nothing about it. So that is still broken. On the upside, that's what we've been working on. Let me tell you about Project Aristotle, which is one brick that interacts with Mjolnir for all to implement this call for help in a decentralized world. In this case, that is my work. And the public announcement made about my colleagues is true for me, too. So what do we want? We want to take advantage of Mjolnir for all so that whenever someone calls for help, the moderator in the room for which help has been requested gets the information. 
Moderators, of course. And these are the people who can actually act upon the situation. Let's see how that works out in the current state of things. I'd love to tell you that we have already reached the stage at which this feature is fully streamlined and integrated into everything. It's not quite the case yet. In the current state, we first need to inform Mjolnir that we want to do this Aristotle decentralized moderation in our room. And the second thing we need to do is inform the Element client that we want to opt in to this new feature, which is called report to moderators. But once this is done, we can look at that spam. Oh, no, there is spam, so let's report it. This is spam. I don't want no stinky French garlic in my room. And send the report. One thing you may notice, if you have sent reports before, is that we are actually confirming that it has been sent to the moderator. And also, this report now shows up for all the moderators in the room instead of only showing up for the homeserver admin, who might or might not be in a good position to do anything about it. And in the end, the moderator has, without having to do anything very complicated, kicked Marvin from the room and removed the offending message. The reason why Aristotle is not entirely streamlined yet is that, at the moment, it uses non-standard extensions to the Matrix protocol. These extensions are pretty simple. They have been submitted for standardization, but they're not part of the standard yet. Once they are, it becomes possible to enable them by default in Element and in other clients, and to enable them by default in Mjolnir for all, which means that all the messy parts about setting up the room and setting up Element to do that will disappear. And users can simply call for help using a much more streamlined mechanism that actually gets the request for help to the person who can do something about it. Now, let's take a step back. 
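The difference between the two report flows described above can be modelled in a few lines. This is a toy model, not Aristotle's implementation: the function names and the sample accounts are hypothetical, and the threshold of 50 is assumed to be the room's kick power level (the usual Matrix default).

```python
KICK_LEVEL = 50  # assumed power level required to kick in the room


def server_of(user_id):
    """Extract the homeserver name from a Matrix user ID like '@alice:matrix.org'."""
    return user_id.split(":", 1)[1]


def homeserver_flow(reporter, admins_by_server):
    """Classic flow: the report goes to the admins of the reporter's own homeserver."""
    return set(admins_by_server.get(server_of(reporter), ()))


def room_flow(room_id, power_levels):
    """Decentralized flow: the report goes to whoever can moderate the room."""
    return {user for user, level in power_levels.get(room_id, {}).items()
            if level >= KICK_LEVEL}


def can_act(user_id, room_id, power_levels):
    """True if the user has enough power in the room to kick the offender."""
    return power_levels.get(room_id, {}).get(user_id, 0) >= KICK_LEVEL


admins_by_server = {
    "matrix.org": {"@admin:matrix.org"},
    "fosdem.org": {"@admin:fosdem.org"},
}
power_levels = {
    "!talk:fosdem.org": {
        "@admin:fosdem.org": 100,
        "@mod:fosdem.org": 50,
        "@alice:matrix.org": 0,
    }
}

room = "!talk:fosdem.org"
for flow_name, recipients in [
    ("homeserver flow", homeserver_flow("@alice:matrix.org", admins_by_server)),
    ("room flow", room_flow(room, power_levels)),
]:
    able = sorted(u for u in recipients if can_act(u, room, power_levels))
    print(flow_name, "-> recipients who can act:", able)
```

With Alice reporting from matrix.org, the classic flow reaches only an admin with no rights in the fosdem.org room, while the room-based flow reaches exactly the people who are able to kick Marvin.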
The development of Mjolnir for all and Aristotle is very advanced, but it's not entirely finished. So it still needs a little work. This is work that started its life in the former trust and safety development team at Element. Right now, it lives at the address I'm giving you on this slide. And my hope is that it's going to continue living and progressing, because I believe that it can make a whole lot of difference for users on the Matrix network and possibly even on other networks. Well, we are reaching the end of this presentation. I would like to thank you all for listening to this point. As mentioned previously, if you know potential employers for highly passionate developers who enjoy working on privacy, open source and distributed computing, please do not hesitate to get in touch. These are all our GitHub handles, but you can also find us on Matrix; that shouldn't be too hard. Thank you very much and have a good day. Thank you. |
The Microkernel Landscape in 2023
Newcomers, regulars, late bloomers, elders, oddballs and others |
Hi, everyone. I'm Razvan. We are now in the Microkernel and Component-based OS devroom. It's a pleasure to have you all here. We're going to start right away, so we have, I think, 10 talks. We're going to delve into microkernel, unikernel, and component-based OS topics. We're going to start with Martin with his talk on the state of the microkernel environment. So, Martin, please go ahead. Thank you. Good morning. Thanks for coming. Welcome. It's my pleasure and honor to open this devroom today. And it's also a great pleasure that we can continue the tradition of this devroom since 2012. I would like to thank Razvan for organizing the devroom this year, and let's get to it. So, my talk will be about the currently developed microkernels that I'm aware of. Maybe I'm missing some, but this should be an overview of what you can expect if you are interested in seriously using a microkernel or just trying one out. This first slide is about me. I won't go into it. Let me just say that I have been working with microkernels and contributing to microkernels for almost 20 years now, half of my lifetime. I assume that most people here do know what a microkernel is, or at least most people have some kind of idea. But I will still try to very briefly introduce microkernels to you. Maybe I will save a few minutes for the follow-up speakers. So, a microkernel-based operating system is a fundamental way to achieve operating system reliability and dependability by means of a purposeful architecture, driven by specific design principles. Now, every microkernel has its own design principles. This is where the different implementations differ, obviously, but I think there are three common, universal design principles: the separation of concerns, the split of mechanism and policy, and the principle of least privilege. 
So, this generally results in a system that is modular, customizable, and verifiable, potentially formally verifiable. By the way, some microkernels do have minimality as an explicit design principle, but many microkernels actually don't. So, the micro part in the whole microkernel term is a little bit of a misnomer, at least as I see it, because I think making the microkernel as small as possible is not necessarily the a priori goal. It is just the result of the other design principles, and I really think that there is no point in comparing whether one microkernel might have 20,000 lines of code and another 30,000. It's really comparing apples to oranges. These design principles also don't affect just the kernel design, but potentially also the user space design. So, therefore, you might see descriptions like microkernel multiserver operating system with fine-grained components. This means that not only is the kernel non-monolithic, maybe that would be a better term, but we are stuck with the microkernel term, but this might also suggest that in many of these systems there are no monoliths in the user space either. I have some slides about the history, but I will skip them. You can go to the slides if you are interested. Just one note. The idea of microkernels has been around almost as long as the idea of operating systems. So, if some people say that microkernels are this strange, over-engineered idea, and that proper operating systems should be monolithic because this was the way they started, et cetera, I don't think those are very valid arguments. So, let's go to the core of my talk. There is a website, microkernel.info, which is basically a condensed version of this. So, this is a very simple site that lists the current state-of-the-art open source microkernels. So, if you are interested or if you are looking around, going to this site is probably a good idea. 
By the way, this site was started by Jakub Jermář, my colleague, and I'm maintaining it right now. Of course, if you are a microkernel developer and you don't see your project on this site, just send us a pull request. It's that simple. Okay, let's start with the overview. I should say that there are surprisingly many active projects that are microkernel-based, and for a microkernel developer these are really exciting times, I would say. So, Genode by Genode Labs is perhaps the most versatile example of a microkernel-based operating system, but mind you, it's actually not an operating system in the common sense, like what you would consider Windows or a GNU/Linux distribution to be. It's actually an operating system construction kit. So, it's a way to mix and match different operating system components, including different microkernels or kernels in general, with some user space components, and to build a bespoke operating system for your specific needs. So, what is really interesting about Genode is that you can really use all these different microkernels like seL4 and Fiasco.OC, microhypervisors like NOVA, and you can even use their own custom microkernel, which is called base-hw. You can even run this infrastructure on top of Linux, for development purposes, maybe. There is a strong focus on resource accounting and management in Genode. You can read the Genode book for the details. Genode is driven by a commercial company. So, they have customers. Somebody is paying them to do that. They don't state their references publicly, as far as I know. I might know some, but I'm not at liberty to name them. And there is also this thing called Sculpt OS, which is like a pre-built distribution of Genode. So, if you would like to try something that you don't have to pre-configure in advance for your specific needs, you can go for that. This is a picture from Norman Feske, one of the co-authors of Genode, from, I think, FOSDEM 2017. 
So, maybe the image is a little bit outdated, but I still think it gives you the big picture. So, you have all these components like the different kernels, different user space runtime environments, if I can say so, this one is, for example, a Unix-like runtime environment, drivers and UI components and stuff like that, and you mix and match them. And then, this is a screenshot of Sculpt OS, so this is one instantiation of Genode, and you see that it's actually a nice desktop-oriented operating system. As some final closing remarks on Genode, I really like base-hw, the bespoke microkernel for Genode, because it's really nicely integrated with the rest of the system. For some reason that I don't understand, but there are Genode guys here, you can ask them, I don't see complete feature parity of base-hw with the other microkernels they support, so, as far as I know, there is no support for hardware virtualization. And this is not a criticism, this is just a comment. If you start playing with Genode, you need to read some documentation. There is very nice documentation available, no doubt about it, but, really, it's not as simple as just downloading an image, running it and expecting a fully blown desktop environment, at least not just by booting it; you have to do something. But I think it's definitely worth it, so there are some links you can follow. It's an open source project, by the way. Okay, now, let me talk about L4Re, which is something similar in some aspects and different in others, by my current employer, Kernkonzept. So, this is also a production-grade microkernel-based environment, a little bit more integrated, I would say, because we basically support just the one kernel, which we call the L4Re microkernel, but you all know it by the name Fiasco. We use this name currently because Fiasco is a very poor name, trust me. 
So, we strongly focus on virtualization, we strongly focus on safety and security certification currently, and we also have customers, because we are a company, somebody pays us, et cetera. I'm, again, not at liberty to name them, but I can say that if you're going to buy a new car from a German car manufacturer, there is a high chance you will be running L4Re. There will be L4Re code running in the software stack of that car. To be honest, the code base is not the most verbosely commented that I have seen, especially the kernel itself. So, again, the learning curve is a little bit steep, but at least there are some scenarios you can just build, or download as a pre-built image, and this will show you the potential to a certain degree. And here are some links. Again, it's an open source project. Now, let's talk about HelenOS, which, compared with the previous two, is a slightly different breed. So, this is like an integrated operating system. So, the purpose is to build it or download an image, boot it, and be presented with a desktop environment with a shell and some mostly familiar commands, which you can use to explore the system. So, it's not about compile-time or deployment-time configuration. It's really about configuring the system at runtime as you go, which is what you expect from a desktop-oriented OS. And, of course, I'm a little bit biased because this is my project, but I would argue that if you want to understand how a microkernel-based system works inside, this is the one to pick because of the low entry barrier. The code base is portable, self-contained, well-structured, so, for example, we know how to use directories and not only a single level of them. So, this is how we structure the system to be more understandable. The code is well-commented and this is not just my observation. If you run a tool that analyzes the sources, you will get a number around 30-35% of comments, which is not bad. And believe me, I have seen many microkernel code bases. 
I have seen the code of many operating systems in general and I can tell the difference. So, I would compare HelenOS to something like the Solaris kernel in terms of the structure and comments and stuff like that. And we also prefer to use our native components, so, no ported components or components that might use some unikernel layers, to really make the system feel coherent. Let's put it that way. So, this is how it looks when you boot the image, which you can compile or download. So, you have a user interface, a shell, et cetera. And we have some interesting features that are not present in the other microkernels. So, we are portable not only in theory but also in real life. So, we support eight different architectures, including strange beasts like Itanium. And yes, the RISC-V port is still not finished and that's on me. We are using asynchronous IPC, which transparently uses shared memory for performance. We have interrupt controller drivers in user space, unlike some other microkernels. We have a fully componentized TCP/IP stack. We support USB 3.0 and we have a sound stack, so, just a few highlights. I will go quickly through these slides. We don't have the time to go into the details, but the microkernel, while being quite small, still has a structure. So, we have a well-defined hardware abstraction layer in the kernel. This is how the user space, or the entire architecture of the system, looks. So, you might see some similarities with the Genode image, but the difference is that all of this is potentially running in the system all the time, I mean, depending on the actual configuration of your machine. And there are some device drivers which are, again, somehow structured in a tree, starting with some platform drivers, et cetera. If you want the details, please come to me. Yeah, it's a community-driven effort currently. 
So, yeah, we are not so fast regarding the development, but we still do semi-regular releases and, sadly, we don't support some of the new hardware features. If you'd like to contribute, you are more than welcome. Fuchsia by Google is a relatively new kid on the block. It's a microkernel-based system that is strongly focused on the Internet of Things. Specifically, their target is to support maintenance, remote management, and remote upgrading of a fleet of devices. So, imagine, for example, the Google Nest Hub, which is the device on which Fuchsia is currently being shipped, and they even managed to do a remote update of all those, you know, Nest Hubs from the previous Linux-based OS to Fuchsia over the air without the users even noticing. So, I think this is quite impressive. The microkernel is called Zircon and it's a capability-based, message-passing microkernel. And I have asked the developers why they don't actually stress that it's a microkernel. And it's their deliberate choice to somehow underplay, understate that it's a microkernel because of some bad press around the term. So, they don't call it a microkernel explicitly unless you ask them, but it is a microkernel, for sure. Yeah, this is how it looks on the Nest Hub, or this is the way you can tell whether your device is still running Linux or is running Fuchsia. And, yeah, the learning curve is, again, somewhat steep because this is not a desktop-oriented or server-oriented system that would be Unix-like. You have to install a non-trivial toolchain and a custom emulator, sort of like when you do Android development, and other things. But, again, what I believe is very nice about Fuchsia is that they are only using their own native core components, not ported components. And it's an open-source project. Managarm is, again, a relatively younger operating system, which is microkernel-based, at least compared to the first three. 
One of the key features is a fully asynchronous kernel design, which tries to somehow mitigate some performance problems by implementing some features in the kernel which might not be considered pure by microkernel purists, like the page cache. And Managarm tries to be compatible with Linux, so they already support the Wayland protocol and Weston and some other applications. They even have some accelerated GPU drivers, well, at least one. And it's an open-source project, and this is how it looks. Of course, you can run more than just a clock there, but, yeah, you get the idea. Redox is another interesting microkernel-based operating system that tries to be Unix-like, but this one has the primary feature of being implemented almost completely in Rust. Also, the core user-space components are written in Rust, like the libc, so they actually have a C library written in Rust. Interesting. What else to say: again, a POSIX compatibility layer, they already support some interesting end-user applications and libraries, and it's an open-source project, again. And this is how it looks when you boot it. So, again, you can run a terminal with a bash in this case and just explore the system. A little bit aside, there are also other, let's say, currently non-open-source microkernels around. I will just try to mention them here very quickly. I know we are at FOSDEM, but just to complete the picture. So, Huawei is working on something which they call HongMeng. It's actually quite buried under this HarmonyOS brand, and it's a little bit confusing because you might have heard rumors, the original ones were that HarmonyOS will be a microkernel-based system, then Huawei released something that was clearly Linux-based. So, yeah, this did not resonate well with technical folks, but the point is that this is just marketing confusion. So, HarmonyOS is a common brand for different operating systems. 
One of them is Linux-based, one of them is LiteOS-based, which is a real-time kernel by Huawei, and the most progressive one, unreleased so far, is the microkernel-based one. The microkernel was originally inspired by best practices and the state of the art in other microkernels, but it's a clean-slate implementation and design. For example, they have capability-based physical memory management in user space, so the kernel does not manage the physical memory. The design is sort of similar to seL4, but it's slightly more practical in my personal opinion. Sorry that I can't go into the details, and they also target safety and security certification, and actually this is also running in the wild as the trusted execution environment in several Huawei smartphones. Then there is this R&D project called DAQ, which is primarily being driven by my former colleagues at the Dresden Research Center, which tries to be, again, a completely clean-slate design and implementation. The primary goal was really to use state-of-the-art best practices and software engineering to achieve really the highest code quality and maintainability. For example, one of the goals was to be fully MISRA C compliant. Other goals were high-level safety and security certification and other interesting features. It's an R&D project, and honestly, I don't know what the current state is, maybe you can informally ask some of the Huawei guys here, but it's good to know that this is there. Okay, very quickly, some other systems. GNU Hurd, for 30 years the intended replacement of Linux in the GNU/Linux equation, is still alive, still kicking, still with semi-regular releases, and yeah, I mean, you can actually run 70% of the Debian packages on top of it, which is not bad, I mean, honestly. Yes, it's limited to 32-bit x86, but as I always say, if they would get one third or one fourth of the Linux contributors, they would finish it in a few months. 
Ares is a microkernel-based operating system based on the Helios microkernel, which is supposedly inspired by seL4. There will be a talk later today from the author, so I'll skip the details for now. Composite is another microkernel-based project that is focusing on predictability and component composition. The kernel itself is designed as lockless, and it has user space scheduling, and it uses thread migration IPC, so if you vaguely remember the idea from Mach 3.0 from Ford et al., this is the continuation of that. Then there is UX/RT, which is like a user space part built on top of seL4. This is still an ongoing project in the early stages of development. Let's see how it goes. And finally, let's mention a few standalone microkernels, so these are not entire operating systems, these are just the kernel building blocks. NOVA, the microhypervisor by Udo, is again alive and kicking. It has been used by BedRock Systems in their primary product, I believe. So this is one of the projects that sort of went into limbo for many years, and now they are alive again. By the way, Genode, I believe, is maintaining their fork of NOVA, or maybe NOVA with their own patches, but there is also Hedron, which is an official fork of NOVA from Cyberus, and they are also using it in their commercial product. Again, I think there might be Julian somewhere here who might tell you more. seL4, of course, the microkernel with, I would say, the most visible formal verification story, we need to mention it. We also need to say that Google has adopted seL4 recently as their foundation for secure firmware, or something like that. I'm not really sure what the targets of this CantripOS, also known as KataOS, are, but it's mostly a static configuration, so it's not a dynamic system, it's really a static configuration system. 
And in that same area, I would mention also the Muen separation kernel, which again is a separation kernel, so its primary goal is to do static partitioning, but I think it belongs to the family. And sadly, there are some microkernels that are interesting, worth looking into for inspiration, but are currently in limbo, like Escape, M³, MINIX 3, Robigalia, and RedLeaf. I hope they will be resurrected. And of course, I might continue with a list of other microkernels that are certifiably dead, and of course, those could be resurrected as well, and it's always good to know the history, right? Yeah, but I will stop here. Thank you, and if there are any questions, I would be happy to answer them. Right, thank you so much, Martin. If there are, please. Thank you. We have time. Yeah, two questions, three questions. Hello, congratulations on your excellent talk. Thank you. Among all those that you studied, which one do you think would be more compatible with the Linux end user base, like for a person to use Minix or... I mean, that is a good question. Thank you. So the question is, which of those systems would be most Linux compatible? Most of them, actually, most of the systems that I've presented do have some POSIX compatibility layers. So I would not make this the only criterion. I understand it might be important for you, but I would look also into other aspects, because most of the systems do provide some kind of Linux compatibility. But if you are looking for something that is really Linux compatible by design, or that makes it one of its primary goals, then I would probably go for Managarm. But again, this is just a first idea, a first suggestion. I would not rule out the others. Thank you. Any questions? Alex. Hi. Thank you for the talk. So what trends do you think you'll see in the next few years with microkernels? That's a tricky question, but thanks for that. So the question was about the trends. 
So I think there will be this kind of retargeting of the systems to very specific use cases, like Fuchsia is doing, so really implementing custom microkernel-based operating systems that really do fulfill the specific needs of those areas. That's one thing. The other thing that I would like to see, I'm not sure if it's going to happen soon, but I would like to see more hardware-software co-design. So basically, the Achilles heel of microkernels is the fact that most current CPUs don't really provide hardware features that would help microkernels, especially in terms of performance. And we see this vicious cycle. Microkernels are not performing greatly on the current hardware, so nobody, quote unquote, is using them. So the hardware vendors don't see a need for changing the CPUs to provide features that would help microkernels. But I think with RISC-V and the possibility, or the democratization, of hardware design, I think this might change, hopefully quite soon. And the third trend that I definitely see, which was probably also seen on the slides, is really the certifications in terms of safety, security, and hopefully more formal verification, because this is where microkernels really excel. So why not go for it? Okay. Thank you so much, Martin. Thank you. |
Device driver gardening
Transplant Linux drivers fast but gently |
Can you hear me? Okay, fine. Stefan, please. Yes. Good morning. Hello and welcome to my talk. I'm Stefan Kalkowski. I've been a Genode developer since 2009. And today I want to present to you how to transplant Linux kernel drivers into the Genode OS framework much faster than before and hopefully more precisely. So let me start with the motivation behind this. Of course, you might ask why we use monolithic kernel drivers when we talk in a microkernel devroom. Of course, there are good reasons to implement drivers from scratch, and we also have several drivers which are written from scratch, but the ever increasing complexity of modern hardware, for single devices but also for the sheer number of devices inside a system-on-chip, is not easy to handle for a small team. That's the one reason. And often the hardware is poorly documented, if at all. And we also have hardware bugs inside, and you have to find those bugs because they typically are not documented, and then you have to find workarounds for them. And all of this is mostly part of the Linux kernel, and you can reuse it because it's free and open source software. And so to sum it up, it's simply an economic decision. So if you want to enable a modern device and you have limited time, then this is the way to go. Okay, we have collected a lot of experience in the last decade in porting drivers from Linux to the Genode OS framework, and you have in general two extreme approaches, and the reality is always somewhere in between. So either you use just the pure driver code. What do I mean by this? I mean code which directly interacts with the hardware by writing to some I/O registers or by setting up a DMA transfer or something like this. And in that case, you of course have to implement each Linux kernel function that is called by this driver code. But the good thing is you don't have to implement the whole semantics which the original function implements. You only have to match this single driver's needs.
This leads to functions of lower complexity than the original ones and, in sum, to a more minimal Linux kernel driver. But of course you cannot share this emulation code between different code bases. So if you have not only one driver but several, they will have slightly different semantic needs. And so reusing the same emulation code might be a problem, and therefore the whole effort, if you port not only one driver but several, can increase if you have to implement the whole emulation code base for each driver over and over again. And of course you need to know the actual needs of that driver. So you need a deep knowledge of the driver itself. So this is the one approach. On the other hand, you can use as much as possible from the original code base. Thereby you might gain more or less the same runtime behavior as the original one. And you can of course then better share the resulting emulation code because it's already stressed by this whole bunch of code running on top. Thereby you get less manual work to do for having more than one driver. But of course the code base for the single driver increases because you have much more of the original Linux kernel. And if a problem arises, then you have to know a lot about the whole Linux kernel itself, because it might be in the timing subsystem or whatever. You name it. So in the recent past we were more on this side, on taking the pure driver approach. But the high effort for each driver was also leading to the situation that you keep your old code base, that you are not that good at maintaining the code and getting new kernel versions and driver updates. So at some point there was a need for action. For me this was at the beginning of the pandemic situation, when I was trying to enable the display engine of this device, which is the MNT Reform 2 from MNT Research, a small company from Berlin, so completely open hardware. And yeah, I tried to enable the display engine, and it includes an NXP i.MX8 SoC.
And we already had a driver for this because a colleague of mine had enabled, in three months, the HDMI-connected display on the early evaluation kit. So this was one part. And then another colleague of mine wanted to have a touch screen which is connected via a DSI connector and not via HDMI. And again he had to spend three months on this work because, on the one hand, there are more devices involved now. On the other hand, you had all this bureaucracy for device tree management, and it was all hard coded for this first use case of using HDMI for the specific board. So there was a lot of manual tweaking to do to enable the touch screen. And then I thought, yeah, I don't have to do this, someone else did it. And now we use it for the MNT Reform for the panel because it's also connected via DSI. But actually there's another device in between, an eDP bridge in between the DSI connector and the panel. So yeah, I had to do work again, and then I recognized, oh no, the code base we used for porting is a different one than the one of the MNT Reform, and it's a totally different kernel version. You have to backport stuff, you cannot correlate it. No, I give up. So that was a turning point for me to start a new way of porting. And of course it was not only me, but we had a lot of discussions formerly in the kitchen and in coffee breaks about what we want to change. And so the number one requirement for the new approach was to reduce the manual work for tailoring a driver-specific environment. And we wanted to meet the original semantics of the driver as closely as possible, so that whenever you change the context, like with this display engine, it just works. You don't have to do much more. And formerly, at some points, we all had the impression that you cannot be deterministic in knowing when you will finish your porting work, because when some problem arose, you often could not correlate it to the original runtime.
So it was somehow hard and we wanted to change this. So there should be an easy way to correlate it to the original runtime. And last but not least, we wanted to share more of this resulting emulation code, which is semantically more complete, so that we can maintain the code better. Okay, so this is the story beforehand, and now I come to the actual work. So I want to introduce you to this approach, for those of you who would like to port drivers to Genode or would like to follow the same approach somewhere else. So we typically start now by configuring a minimal executable Linux kernel. Let me just call it the tiny kernel, so to say. So you have to do some manual work here. You have to use the Linux kernel build system itself. It has a tiny config, some small configuration which is at least compilable for your architecture, but it won't drive any device. And then you just enable certain configuration options, and of course you have to find them by looking at the configurations. This might take some time, but in the end you will have something which you can correlate against later on. If you run the driver in your ported environment and you want to look at why it doesn't work, then you really have a minimal Linux kernel which just drives this device, and this is the first thing to do. And another aspect of this is that you gain a minimal kernel configuration for your code base which just calls those kernel functions that you actually need to drive that device. So you don't have to emulate that much. Okay, kernel configuration is only one part. If you take an ARM device today, then you have of course these device trees, which name what kind of devices you actually have in hardware and which also contain additional driver information. So a bit of configuration is also inside of these device trees. And this is the device tree for the MNT Reform. You see it's quite complex. So you have to identify which devices are of interest for your tiny kernel to drive.
And this is again some work to do, some manual work to do, but at least you start to know more about the dependencies of your hardware. And we have developed some tooling for it. So this is a small Tcl script which parses the device tree sources, and then you can name device nodes that should be extracted, and it will take them and their transitive closure to give you something like this. And then you can of course take that device tree with your tiny kernel and start it, and it will just drive that device. And we also take that as input for our own ported drivers. Of course you won't implement everything which is seen here. So power, reset pins, IRQ stuff, GPIO or something like this would be part of other drivers in the system, like the platform driver or some dedicated GPIO driver in the Genode OS framework. So those highlighted ones are the ones that we actually need for porting. And this is the starting point for you to identify the first set of compilation units that you need. So each of those device nodes has some compatible string, and those are used in Linux to identify the concrete driver of the Linux kernel. And so you can take those strings and grep in the Linux kernel sources, and then you get something like this. So you have your first set of compilation units, and you can put them into a makefile, into a build environment, and then we combine it with the unmodified Linux kernel headers. So we take the original include path of the Linux kernel. Formerly we always wrote all the definitions you needed by hand. So this was a lot of work to do. I would say initially the most work you had to do. And now we just take the original Linux headers, and then you can just compile those compilation units you have already seen. So this really saves a lot of work. But of course there are some exceptions. So we had to tweak some headers.
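To make the "named nodes plus their transitive closure" step concrete, here is a minimal Python sketch. It is not the Tcl tool from the talk; the node names and the dependency graph are made up for illustration, and a real device tree would express these references as phandles.

```python
from collections import deque

def closure(deps, wanted):
    """Return the wanted nodes plus everything they reference, transitively."""
    seen = set()
    queue = deque(wanted)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Nodes without an entry in deps reference nothing else.
        queue.extend(deps.get(node, ()))
    return seen

# Hypothetical, heavily simplified graph: node -> nodes it references.
deps = {
    "panel": ["edp_bridge", "backlight"],
    "edp_bridge": ["dsi", "i2c4"],
    "dsi": ["lcdif", "clk"],
    "lcdif": ["clk"],
    "backlight": ["pwm2"],
    "usb": ["clk"],
}

print(sorted(closure(deps, ["panel"])))
```

Asking for the panel pulls in the bridge, the DSI controller, the display controller and the clocks, while the unrelated USB node is left out, which is the kind of reduced device tree the tiny kernel needs.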
We shadow a few headers to prevent the system from trying to enable or disable interrupts or something like this. And especially to define init calls in the Linux kernel. So each subsystem in the kernel, including any driver, has some init call definition, and the order of the init calls is important. Even within one init call priority level there are dependencies between the different compilation units, and they are resolved by linking order. So the Linux kernel uses some weird linking magic to put them all into one order, and later, when starting the kernel, it takes that order. So we didn't want to infect our linker script with this. Therefore we have built some tooling again, which uses this tiny kernel you built in the very beginning and just extracts the order of the init calls and puts it into a header, and you can just include it in this build environment, and then you run, and our emulation code environment will just call the init calls in the correct order. So when we do all of this, then you of course get a lot of undefined references for all the functions which are not implemented yet. And this is a lot of error messages from the linker. So we made a small tool to identify those undefined symbols and to help you identify the original compilation unit which implements them, and then you can try to find a correct setup for this. And I want to show you this shortly. Okay, so I've prepared some makefile here. So here you see the compilation units we identified. There's some inclusion of the general emulation code base, and if you now use this tool, it will build the target which you name, so it will try to build the driver, and it will collect all the undefined symbols, and here it just shows you the symbols and the overall count of the undefined symbols. Typically you can also have, as I said, the compilation unit which is responsible, but I've skipped this here because on this machine it's a bit slow.
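The symbol-collecting step can be pictured with a short Python sketch. This is not the tool shown in the demo: it only assumes the usual GNU ld message wording ("undefined reference to `name'"), and the sample linker output, file names and symbols below are invented for illustration.

```python
import re
from collections import Counter

# GNU ld reports missing symbols as: undefined reference to `name'
UNDEFINED = re.compile(r"undefined reference to `([^']+)'")

def undefined_symbols(linker_output):
    """Map each undefined symbol to the number of places that reference it."""
    return Counter(UNDEFINED.findall(linker_output))

# Made-up linker output, shaped like a failed driver link.
sample = """\
ld: drm_drv.o: in function `drm_dev_init':
drm_drv.c:(.text+0x1c): undefined reference to `drm_mode_config_init'
ld: panel.o: in function `panel_probe':
panel.c:(.text+0x40): undefined reference to `drm_mode_config_init'
panel.c:(.text+0x58): undefined reference to `devm_kmalloc'
"""

symbols = undefined_symbols(sample)
for name in sorted(symbols):
    print(name)
print("total:", len(symbols))
```

Re-running such a count after adding an original compilation unit to the build shows directly how many symbols that unit resolved.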
So we can now identify, okay, there are symbols for DRM mode which we want to resolve, and we see, okay, let's try to add the original one. Oh, sorry. And yeah, you just run the tool again and it will show you in a few seconds. So on the PC this is quite quick, but this is just one gigahertz or whatever, I don't know. So it's a bit lame, and it has to recompile the driver of course in the background. Okay, and now you see it's seven symbols less. And, I think because of the time we will skip this, in the end you can generate the missing symbols with the tool, and it will give you, per function, the correct declaration of the function of course, and a body that calls a function which gives you the backtrace up to that point and just loops endlessly, so you have a non-returning function and therefore you don't have to return a valid value or something like this. So if you now take the driver, it will link, you can start to execute it, and you will always reach the point where something is not implemented yet. Okay, so let me just switch back. So, okay. So this is the overview of the APIs involved. I don't want to explain them in detail now, but what you should take from that is we have a very strict layering. There is this layer where there's only C and assembly code, which is actually the Linux kernel code and the shadow copies of the Linux kernel code. Those are the only ones which can include Linux kernel headers. And then you have this emulation code base, which is just the C abstraction for the Linux kernel code above, and then you have all this C++ stuff from us which abstracts the Genode services and the concrete driver services. And the good thing is you have those abstractions here for the device services, and you have their counterpart here, and then if there's one, let's say for an Ethernet class, you can just reuse it.
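The generated dummy functions described earlier in this passage, one per missing symbol, each trapping instead of returning, can be sketched as a tiny Python generator that emits C text. The helper name `trace_and_stop` is a placeholder chosen for this sketch, not necessarily what the Genode tooling emits, and the prototype in the usage line is only an example input.

```python
def make_stub(declaration, trap="trace_and_stop"):
    """Turn a C prototype such as 'int foo(void);' into a trapping stub.

    The trap helper is assumed to be declared noreturn on the C side:
    it prints a backtrace and loops forever, so the stub needs no
    return statement even for non-void functions.
    """
    decl = declaration.strip().rstrip(";")
    return (
        f"{decl}\n"
        "{\n"
        f"\t{trap}(__func__);\n"
        "}\n"
    )

print(make_stub("void *devm_kmalloc(struct device *dev, size_t size, gfp_t gfp);"))
```

With stubs like this the driver links immediately, and every run stops at the first function that still needs a real implementation.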
So if you already have this in our emulation code base, you just need to implement or port the concrete driver, but all this glue code which connects with the actual APIs and services is always there. Okay, so now let's see this in practice. Okay, I'll just skip this here and shut down the virtual machine monitor. Okay, so you have actually seen Genode in action the whole time, and what you see here on this MNT Reform is actually everything the device brings with it, except audio. So all other drivers are already in place, and this also is valid for the GPU, for instance. So here you can see the glmark2 demo again. Those of you who have seen Norman's presentation yesterday already know it. So I think I'm running out of time. No, you have five minutes. Okay, good. If you leave about two more minutes for questions, it's okay. Anyway, I think this is enough. As I said, more or less everything runs on this device. Yeah, so what are the results? So this is the list of drivers we have ported within one year now, besides of course doing other stuff. So we don't just port drivers all the time, but this is really a significant change. So we have taken new drivers for our whole x86 code base, for instance. And you see a lot of ARM drivers for the PinePhone and for the MNT Reform were also added, like the Mali and the Vivante GPU, for instance. And this was done by a very small team, and we also had some architecture-independent porting of WireGuard. So something which doesn't even use any device at all. Okay, so in numbers: for the initial driver port, compared to the former approach, it nowadays takes something like 15% of the time. Of course it's a bit hard to measure because we don't track all the time we spend on porting work, but approximately this is the number. And this is especially because of this tooling, which reduces the manual work. Of course you have to find semantic bugs, but here this tiny kernel correlation helps a lot.
So you can instrument the original code, then just run it on original Linux and on your ported code, and you can see the difference. Driver updating, we also did this within that year, because the first port was done for this display engine, and then there was a new version available, two kernel versions later, and we made an update. So it's significantly faster than the initial driver port, of course. And the drivers serve a more general purpose. This is what I meant with, for instance, that it took one day to enable the HDMI connector for the MNT Reform once the panel worked. So it's matching the different contexts much better. And of course there's something bad on the other hand, so the code base for a single driver explodes, like two or three times more than before. But on the other hand, the code we have to maintain ourselves decreased to about 20% of what it was before. Okay, so I think that's it. If you want to read more about this, I can reference this book, the second Genode book, about how to enable a platform. There's a lot of this stuff already written by Norman. And we also have many details in different Genodians blog articles. So thank you for your attention, and I'm open for questions. Thank you. Thanks, Stefan. Any questions? Yeah, please. Yeah, first of all, awesome work, by the way. Thank you. Linux is known for not having a stable driver API. I think there's a Linux developer from Red Hat who once said, we do not do hardware abstraction layers in Linux. You did say that the initial port is the hardest, and then it's a lot less work maintaining backports going forward, but there's still some work involved. So I was wondering, wouldn't it be less painful to, for instance, port drivers from BSD? Because, if I'm not mistaken, they have a more stable hardware abstraction layer. That might probably be the case, but actually we want to have this first argument: Linux runs on all kinds of hardware and in all kinds of different situations.
For instance, we have a BSD-ported driver for audio, but for today's Intel HD audio devices it's somehow, yeah, this device might work, but on that device the microphone doesn't work, and that device doesn't do this and this. So it's more about that we just want to have the functionality, and therefore we need to look at Linux. And of course, we didn't want to become Linux kernel experts, but now we had to do it, and, yeah, once you have dived into it, then maybe we just take that advantage. Thank you. Another question here for Malik. Hi, great talk. I wonder if I might be able to use your tooling to introduce an NVMe driver into OSv, but I also wonder if maybe a similar approach could be used to port file system drivers into an operating system. OSv is missing an ext2 driver, and I wonder if I could do something like that. I'm pretty sure you can. I mean, we already used this rump kernel approach from Antti Kantee, who was also in this devroom in past years, and a BSD port of the protocol stack, and we also used the IP stack from Linux in the past, and we will of course use this approach again to renew that version. And, as I said, we already used it for WireGuard, so something which is not at all connected to any device driver code. Yeah, it's possible. Okay, thank you. I have a question on licensing. Is it okay, because GPL, BSD? It's all under GPL only then, of course. Each driver port has to be under GPL. And there's no problem with having them linked together? I'm not sure about the license of Genode. No, it's not a problem, because this code then is only GPL code, and Genode itself is also under GPL. It's possible. Thank you very much. Okay, thank you so much, Stefan. Let's take it again. |
Using Genode as an enabler for research on modern operating systems |
So, our next talk, from Michael, is Using Genode as an enabler for research on modern operating systems. So, continuing on the Genode environment. Where are the leaflets, where are the leaflets, Stefan? Maybe we can switch now, I can give you my laptop. Do you have the slides uploaded on Pentabarf? Yes, I uploaded some, with some mistakes. Let me give you my laptop, I think that's easier. Let's do some sort of demo or something, something bad happened. Oh, oh, oh. So, you have a stick? Do you have the slides? Yes, yes, it's better because... Okay, do you want to use the keyboard? Back, forward and left, right? You don't see the slides, you want to see the slides, right? Let me duplicate the screen then. Let me do it, it's okay. Let's see if we can... A, B, yeah. Yeah, help? Please start. I'm Michael, and first a quick introduction for those who don't know me yet. I studied computer science at TU Dortmund, and since 2018 I've been a PhD student at Osnabrück University and a full-time research assistant in the MxKernel project, which is a joint project between TU Dortmund and Osnabrück University. And the focus of my research is on heterogeneous many-core systems for the data center. Now I will present to you the experiences I made with using Genode for research, and I will show how well this worked out. And first, I'm not working for Genode Labs, so even if it might sound a bit like an advertisement, it's not that I was paid for that or something. So let's start with the talk. Yeah, the operating systems we know very well today, like Linux or Windows, are basically from more than 30 years ago. For Linux, the architecture is even older. And the systems back then looked quite different.
There was just a single CPU, and only the CPU did computation work, context switches were cheap, memory was scarce, and yeah, we had the old dogma that I/O was always slower than the CPU, and that by orders of magnitude. But today things look different. Now we have many CPUs. I think octa-cores are even the default now for laptops, and quad-cores are the de facto default for small mobile devices. And as most of you will know, not just CPUs compute, but now we have GPUs, and in data centers also FPGAs, AI accelerators and processing-in-memory, and with this new amount of cores and deep memory hierarchies, context switches aren't cheap anymore. Now we have to pay synchronization costs when scheduling processes with load balancing, and we pay for the distributed memory architecture which we actually have in our systems, with distributed caches, with higher latencies for context switches. Now main memory, on the other hand, has become abundant, at least in the data center, and now we have heterogeneous memories, at least non-uniform memory access, and there's also a trend towards distributed memory, where we do not even have shared memory guaranteed anymore. And also the I/O has now become almost as fast as the CPU. So one might question whether the operating system abstractions and interfaces we are accustomed to, like POSIX, are still viable for these modern systems, and there's a lot of research which argues that they are not. For example, the blocking I/O of POSIX doesn't fit well: when I/O is as fast as the CPU, then it doesn't make sense to block threads, because the cost of unblocking a thread or a process is higher than actually doing polling or something. So we need further research on operating systems to deal with that. There's also a lot of research work investigating how to deal with such things as an FPGA, which works completely differently than a CPU does. So we need more research, but there are some hurdles on the way that put OS research at risk.
One major hurdle is non-free licensing, which prevents us from fully understanding the system, or especially drivers for hardware like accelerators or GPUs, and it makes modifying the system very difficult, and even if one might be able to modify it, it might not be publishable, which is bad for research, where what you want is that your results are reproducible by other researchers. Now furthermore, we have hardware black boxes, which make it even harder to implement drivers and also make it difficult to evaluate the hardware, because you can't quite figure out what is going on in the hardware there. And then there are also NDAs, non-disclosure agreements, which may suppress unfavorable results, so that you might have nice results for the paper but you aren't allowed to publish them because some company doesn't like the results, because they may damage its business, because your results state that the hardware is not as good as they think. And one other big problem is missing documentation, especially if this leads to reverse engineering, like it was necessary for a long time for the NVIDIA GPUs, where this nouveau open source driver had to be completely written from scratch via reverse engineering because NVIDIA didn't publish any useful documentation. Now a major problem we then face in research is the lack of manpower, which puts hard limits on what we can do, and it can also endanger the success of the project itself. In research, the success of a project is measured by the amount of publications we can do, which depends on the amount of experiments we can do, and this means we don't have so much time to implement drivers and such stuff. And also the complexity of modern hardware, as we have seen in the previous talk for that MNT Reform laptop, can be quite intimidating, making it even harder to get an operating system working. So what do OS researchers do in this scenario? They mostly write workarounds and tweaks for Linux.
Here is a short list of publications, and they are mostly from OSDI 2020, and this is just the tip of the iceberg. What's going on? In fact, look at the papers of OSDI 2020 and 2021. OSDI is one of the major scientific conferences for operating system research, and one can see that most of the OS research papers, here in grey, were actually just tweaks to the Linux kernel, and only the red part were really new operating systems with new concepts or abstractions. So now we know why they use Linux, but I think Linux isn't a good idea, because you still have a huge and complex code base to deal with. As the previous talk might have already hinted, it is still a lot of work to work in the Linux kernel and to get acquainted with it. And furthermore, the POSIX compliance of Linux and also the strict requirement that you may never ever break user space put hard limits on what we can do in research. So completely changing abstractions and interfaces in ways that break user space will never ever have the chance to get into the kernel. At least it will be very difficult, because we need to persuade Linus Torvalds to integrate them. And furthermore, Linux is a moving target. The kernel APIs are changing rapidly, and this needs a lot of maintenance work. So as we have seen before, a small research team might not be able to do this maintenance, and so extensions will break sooner or later. That's something we have experienced in our own research, where we tried to compare against other Linux extensions, and they didn't compile with newer kernels or only worked with some ancient kernels which we couldn't run on our hardware. So one might ask, isn't there something better to do OS research with, which is also able to lower the burden of writing an OS from scratch?
So something like a framework. Such an OS framework should ideally be minimal; that eases understanding and makes it easier to change kernel primitives and add new interfaces, and it can also assist debugging, because you don't need to analyze a huge code base. It should also be investigable. That's necessary to understand what's going on in the system. Ideally it has an open source code base and it provides some profiling tools. It should also be maintainable, so that it has regular updates, so that it still works on newer hardware, and not that five years later you can't use that framework anymore because it only supports very ancient hardware. Extensible is also quite obvious. It makes it easier to implement your own operating system services and abstractions, and therefore it should have separation of concerns and well-defined components, and it should also be well documented. Ideally there should be a book and documented code. And it should also be portable, to make it future-proof and enable it to be ported to other, also experimental, hardware like the Enzian computer from ETH Zurich. This would then also enable support for hardware/OS co-design. Basically what is meant here is that it should not assume a specific CPU architecture. And a nice thing to have would be composability at runtime, something like the module system Linux has, which would allow you to use, for example, different OS interfaces simultaneously to evaluate them against each other and find out which interface, for example, provides the best performance for a specific task. So now that we have that, we might ask, what is such a framework? Does something like that exist? And as the title already spoiled, there actually is: I propose the Genode OS framework as a good candidate here. As we've seen in the previous talk, Genode is an OS framework that provides different kernels and drivers.
Now also from Linux, which makes it easier to get hardware up and running. And furthermore it also includes libraries, which makes it easier to port existing benchmarks and later applications to Genode and to the special fork of Genode you use for research. So getting back to the requirements, how well does Genode fit the bill here? First, it is minimal compared to Linux. We only have about 53,000 lines of code for Genode with the NOVA kernel, for really the operating system kernel and the basic operating system abstractions and services, while if we take the same parts of the system, both x86 only, we have 911,000 for Linux 4.14.0, and this number might be even higher by now. Also it's investigable because it's under GPL, but yeah, the tracing and profiling is, as I experienced it, quite basic at the moment. It's not yet comparable to Linux perf, but that might change in the future. It's also maintainable. I've seen that there were almost quarterly updates, and I went through three updates and didn't have to change much regarding the kernel API here. So I think that is also green here. Of course Genode is a component-based system, so everything is clearly separated into single components which have an RPC interface which is basically well defined. That means that the basic foundations you work with through this RPC interface are the same for each of those components, and the requirements for adding new components are quite minimal, because if you don't need, for example, an NVMe driver, then you don't have to deal with such things in those interfaces. Basically you just need to know the core services and libraries of Genode, which are very well documented in the book Genode Foundations, which helped me a lot to understand how Genode works and what the concepts are here. Also they have an extensive change log for each release, and there's also the Genodians blog and the FOSDEM talks.
It's also portable, as we have already seen in the previous talk, and it has its component-based architecture. We saw the Leitzentrale in the previous talk, which allows you to add components at runtime or exchange them and change the configurations, and it's also possible to have multiple instances of a service at runtime. That makes Genode quite a good fit, I think. How much does it facilitate OS research, one might ask here. And before I start with that, I want to shortly present my own research operating system called EalánOS, which is an experimental implementation of the MxKernel architecture we devised in our research project and is based on the Genode OS framework. In this MxKernel architecture we have three basic concepts. For clarification, the squares in this picture represent a hardware resource like a CPU core or a part of memory. And then we have the first concept, which are organisms, and these are basically resource containers for applications that follow a specific common goal, for example a web application like a web store, which usually is comprised of a database, a web server and some implementation logic for the store itself. So we would usually have three programs running, and they have the same goal, to provide this web store experience, and they are then in EalánOS put into one organism. And the resource management is controlled by a component we call Aivot, that's Finnish for brain, for an organism. It can be application or user specific and can also provide a specific operating system interface. For example, it might provide a POSIX interface or another custom OS interface, whatever the applications need. And these organisms can also grow and shrink in the amount of resources they use. For example, if this yellow one wouldn't need these resources here, then the red one could also extend there. For this we have Hoitaja, that's the global resource management, which has the task to provide a fair resource utilization between organisms and can also
implement things like service-level agreements. Within an organism we have cells. These are basically your processes, and they also have an elastic resource container. In our system we have a strict rule that space partitioning comes before time sharing, and that makes it necessary, if we have diverging loads, that these containers may have to shrink to save resources and especially to grow if the already assigned resources don't suffice. One new abstraction we also added is that we changed the default control-flow abstraction from threads to tasks, which are closed units of work. You can think of them as a remote procedure call or a somewhat bigger method call, although a task's execution time is quite short, in the microsecond to millisecond range, compared to the lifetime of a thread. Therefore we can allow tasks to be non-preemptible with respect to each other, which then allows us to annotate them for synchronization, provide automatic prefetching, and other nice things. The architecture then looks like this: we have our application running in user space, and in kernel space we have Tukija, which is basically a fork of the NOVA microhypervisor, specifically the Genode version of it, and which fulfills the role of a resource provider. On the command of Hoitaja it will either withdraw a resource from an application or grant a resource. So how did we implement this with Genode?
First, for organisms we use the feature of service interception. Genode allows you to have several instances of its core services, so for example you can implement your own scheduler, memory allocator and such things, and then we route the service so that an organism uses a specialized OS service rather than the generic one. Cells are implemented as Genode components, and one feature Genode already has is resource trading, but only for RAM, and we will extend that so that it also works with CPUs, to implement the growing and shrinking of cells. For tasks, Genode didn't have anything when we started, but we had already developed a task-based runtime library and framework, basically it was a colleague from Dortmund, which is called MxTasking. What I did was port this to Genode, and for this I needed a standard C++ library, because MxTasking uses it for its internal data structures and to be portable; a file system for the benchmarks and for writing out the profiling results from those benchmarks; timer support, which was also needed for the profiling; and of course multi-core support, which was necessary to provide task parallelism. The last thing was NUMA support. NUMA stands for non-uniform memory access, and this is needed by MxTasking because it does NUMA-aware task scheduling and data object allocation and placement. Here comes the tricky part: if you had to do this from scratch in your own operating system, you would have to implement quite a huge amount of code, but Genode comes to the rescue because it already provides a standard C++ library, a file system, timer support and also multi-core support. What we needed to add was NUMA support, and for this we extended the NOVA microhypervisor so that it now parses the ACPI SRAT tables to find out which CPU cores belong to which NUMA region and also the memory address ranges of the NUMA regions, which are later used for a NUMA-aware allocator. This cost only 365
lines of code, where most of it was just the definition of these table structures in C++ code. Michael, are you close to finishing? Time is almost up. I can hurry up. Two more minutes, please. Then we implemented a topology service with 531 lines of code and also NUMA-aware... Sorry, you have one hour, sorry for that. My bad, sorry for that. I don't have my laptop, I didn't... Sorry, go ahead, my bad. No problem. I'm so good at whipping people. Based on this NOVA extension we then developed a topology service, which now makes it possible to query the NUMA topology, not just for the core components; user-space applications can now also ask, for example, where the thread they are currently running in sits in the NUMA topology. This can then be used, for example, for allocating memory locally or from a specific NUMA region, which is a common use case when implementing database applications and also in high-performance computing. The last part was providing the glue code between the Genode interfaces and this MxTasking runtime, and as one can see, this was about 1,500 lines of code, which is quite manageable for a single developer. We also started to implement something from scratch at the beginning of the project, which comprised mostly the hardware abstraction layer, here in grey, which was needed to get the system running at all, and the other part was this task-based interface. This thing already needed about 24,000 lines of code, while EalánOS, this Genode-based thing I started, has only about 5,500 lines of code. I have to add that this from-scratch version did not have anything like components; there was no support for memory protection, for example, and it could only run a single application, while now with Genode we can have several applications which are memory-protected and isolated from each other. We also have well-defined inter-process communication mechanisms, like the remote procedure call interface and also semaphores and such stuff, which
was still lacking in the from-scratch version. In terms of time, this is just an estimation, especially for the effort: I assume here that a single developer can write about 10 lines of code, which is an approximation that is commonly used to calculate man-months, and this comes to about 18 man-months for the implementation we did from scratch. I have to admit that I didn't do all the coding by myself; I had help from the people at TU Dortmund and could use a small operating system, not a research one but one for teaching, which already did the very basic stuff to get a system up and running but didn't include all the NUMA stuff and not many drivers. EalánOS was something I did completely by myself in about six months, which by comparison is a time saving of almost 90%. This number should be taken with a grain of salt because there is a lot of approximation here, but I think you get the picture. Using Genode I was able to really accelerate this implementation and engineering effort, which usually does not yield any scientific publications because you implement something that everyone else has already done, it's nothing new, and this helped me a lot in making progress. Now I want to show you how I used Genode's scenario concept and this component concept to do automated experiments. But first a quick recap: Genode consists of components, these red boxes here, and they are arranged in a tree. Usually you have an init component that starts all the other components, and then you can specify within a scenario, something like an XML config, how these components are related to each other; for example, that the init component shall start a GUI component and a launcher component, and the launcher component then starts an application component, and this application component uses this GUI session and also has the rights to use it, and such stuff. But now I want to show you how these XML configurations
work in a real experimental setting. I've brought you an example from the database community, a B-link tree benchmark. A B-link tree is a widespread data structure that is used for indexing database tables, and it's also very often used to implement key-value stores such as memcached. Now we would like to investigate how the throughput of this benchmark is affected when we run multiple instances on the same set of CPU cores and do time sharing, so that we have to pay the context-switch costs, and when we do the spatial partitioning I explained earlier. Let us assume our research question is: which scenario will yield the higher throughput at the respective maximum number of cores? So let's take a look at what we have to build up as a component tree. First we have, of course, init, and we want, for example, three instances of this B-link tree benchmark. They are numbered, named B-link tree 1, 2 and 3, and they all need the timer service from Genode. To define just this structure we would write the code on the right, which is just this, oh, sorry, this config tag, and then for each component you write a start tag with the name the component shall have, and then close that start tag. For the B-link tree we have one exception here: since B-link tree is a name that is shared by all three components, we specify a specific binary name here, which is called B-link tree, and name the components differently. That's just because Genode requires that each component has a unique name, which is needed for the service routing and for checking access rights. Now that we have the basic structure, we need to define that this timer component is actually an operating system service. This is done with a provides tag, adding that the service shall be named timer. Then we also have to specify where it can find the other operating system services it needs, so it's just the default route, stating that if it wants to make a connection to another service it should either ask
its parent or one of its siblings. Then we do this for the B-link tree 1 component, for example, and we have to add something else, because we also want to have this timer service. This is done by specifying that the name of the service we need is timer and that one of the siblings, that's done with this child tag here, shall be used for that, which is here timer. We could also have another timer component under a different name and then write that name, and that would use the other timer. We could also do this for the other trees. So this basically allows us to do the service interception, because here we can specify which actual implementation, or which component that provides the service, shall be used. After that we need to specify where this component shall run to realize the experiment. But first I want to mention that Genode manages CPU cores not just as a set of IDs but in a two-dimensional space called an affinity space, which looks like this. Each point in this matrix is a CPU core, and one can map components to subsets of it. We will use this mechanism now to place our B-link tree benchmark components on the cores as stated in the experiment. But first we have to specify the affinity space, that's the huge gray square. We might assume here that we have a machine with 64 cores, and to make things easier we say width 64 and height one, so we do not have to calculate coordinates. After we have done that, we can pick out these subsets and say, for example, that the B-link tree 1 component shall be mapped at position X, which corresponds to core 1, and shall use 63 cores, which is stated by this width here and the height. Furthermore we have to specify a RAM limit; because it's a database benchmark we need quite a huge amount of memory, 80 gigabytes in this example. This is then done for each instance of the B-link tree. Unfortunately this laptop didn't want to work with the projector, so I couldn't show you the final config that comes
out of it, but I already ran the benchmark beforehand, and these would be the results if we ran the experiment. This also answers the question: if we only consider inserts into the B-link tree, then it is better to use spatial partitioning, since we reach about 16 million operations per second, this is 60 million insert operations of key-value pairs into the B-link tree, while on the other side, if we have a read-only workload, meaning we just look up a key in the B-link tree, then, as we can see, time sharing outperforms strict spatial partitioning. I didn't get to analyze this more deeply, and I don't think that's within the scope of this talk. So, to conclude: hardware has changed tremendously in the last two decades and we need more OS research, but there is a high entry hurdle to overcome, which can be lowered by an OS framework. My claim here is that Genode can significantly help, as this specific example shows: in my experience it saved me about 90% of development time compared to having to implement all these things by myself. Furthermore, my research operating system also provides some contributions to Genode, which I might file as a pull request, namely this NUMA support and also some support for many-core systems. I already contributed to that by filing a bug report after finding a bug where NOVA crashed in a boot loop if you wanted to use more than 30 cores. This has now been fixed, and I've tested it: it definitely works with 128 CPU cores on a real hardware machine, and the NUMA support I implemented is also working. Last but not least, we now also have a task-parallel programming library which can be used with Genode. My focus will clearly be on research and on the data center, and my personal road map for the future of EalánOS is that next I want to implement more profiling tools in Genode, especially hardware performance counters, to actually find out why the plot looks the way it does. Then I wanted
to implement this elasticity of cells, especially the resource trading for CPU cores, and then the management strategies for these Aivot and Hoitaja components, these resource managers, and do an evaluation with a realistic scenario. We have already implemented a database based on MxTasking, which is just waiting to be ported to Genode, and then I hopefully will have a first full-featured prototype of EalánOS which can be used by the community. Thank you for your attention, and I hope you get in touch with us. Thank you, Michael. Thanks. Questions from the audience? One question I may have: you talked quite a lot about research, and it makes sense given your current work. What do you think about productization? I mean, the talks before had some actual use cases, there were business ventures. What do you think about productization? Would this approach make sense, or are people still going to go back to Linux because that's what everyone uses and it's going to be the default? That's still... I'm asking for an opinion, I don't have a crystal ball, I'm just asking what you're thinking. It's our assumption, that's why we're doing research, that it will be better or at least have benefits compared to Linux, also for production use. In particular we think that we can provide better performance and also ease the development of database systems and other highly parallel applications, like those from the high-performance computing community. Okay, thanks. Any other questions? Yeah. First, thank you for the talk. I'm amazed by it, because 15 years ago, when we started with Genode, we dreamed about such things, like research picking it up, because we came from academia. Unfortunately... My question is: have you encountered any pain points on this journey? In the last six months you have had a very intensive time with Genode. Was there anything that frustrated you about it, or where you thought this was probably not the right choice, or something that we could pick up for improvement?
Let me quickly think. One thing that comes to my mind was the documentation of the tracing service, the profiling thing. I figured out how this trace service works, but it doesn't seem to do so many things yet. It's not comparable to what you get with perf under Linux, where you can see exactly the clock cycles and cache misses down to function level. That would be nice to have. Thanks, it was a very nice talk. What specifically do you see regarding, for example, the Enzian architecture from ETH Zurich? How much would, for example, the Genode framework have to be tweaked to efficiently use such novel hardware architectures, if you can make a guess or prediction? Thank you. I think for this Enzian computer in particular it would of course need support for ARM, which I think is there, but it would have to be adjusted to the SoC they use. I'm not so deep into this research computer; I only attended a talk by Timothy Roscoe where he presented this thing and how cool it is, so I'm not quite aware of what would have to be done. I think basically the usual stuff would have to be implemented, drivers for it. Yeah. Thank you for the nice talk. I wanted to ask you about your slide with future developments, where you mentioned what you want to do. Do you have some time frames, if you're committed to doing this, and how much would Genode help you in shortening these time frames? How much easier would it make it? Could you make some sort of prediction about it? Thank you. Using Genode, I would estimate... the plan is that this all should be running in fall this year, at least up to this point here with the evaluation. I also hired a student assistant who will help me from April onwards to develop a nicer interface, so that we do not always need to code this XML stuff. With the profiling I have already begun; the basic stuff is already there, it's just that the interfaces are a bit ugly, there are no capabilities, and it's not implemented as a service yet, but I can basically use it for my benchmark. As for the elasticity, I
think these two parts will be realized by, I think, summer, I would say. If I had to do this all by hand, I would assume that I wouldn't be finished in one year, not with the manpower I have available. Have you contributed any of your work back to Genode? Not yet, but I plan to contribute this NUMA support. I think this could be a nice addition to Genode because it would enable Genode to also be usable for data center applications on big servers where NUMA support is crucial, and maybe later the performance counters when they are finished and polished. Thank you, Michael, thank you so much. |
NOVA Microhypervisor Feature Update |
All right, so we move on to our next talk. We have Udo here with the NOVA microhypervisor update. Udo, please. Thank you, Arsalan. Good morning, everybody. Welcome to my talk at FOSDEM. It's good to be back here after three years. The last time I presented at FOSDEM, I gave a talk about the NOVA microhypervisor on ARMv8, and this talk will cover the things that happened in the NOVA ecosystem since then. So just a brief overview of the agenda. For all of those who might not be familiar with NOVA, I'll give a very brief architecture overview and explain the NOVA building blocks. Then we look at all the recent innovations that happened in NOVA in the last three years. I'll talk a bit about the code unification between ARM and x86, the two architectures that we support at this point. And then I'll spend the majority of the talk going into details, into all the advanced security features, particularly on x86, that we added to NOVA recently. And towards the end, I'll talk a little bit about performance, and hopefully we'll have some time for questions. So the architecture in NOVA is similar to the microkernel-based systems that you've seen before. At the bottom, we have a kernel, which is not just a microkernel, it's actually a microhypervisor, called the NOVA microhypervisor. And on top of it, we have this component-based multi-server user-mode environment. Genode would be one instantiation of it. And Martin has explained that most microkernel-based systems have this structure. In our case, the host OS consists of all these colorful boxes. We have a master controller, which is sort of the init process, which manages all the resources that the microhypervisor does not need for itself. We have a bunch of drivers. All the device drivers run in user mode, they're deprivileged. We have a platform manager, which primarily deals with resource enumeration and power management. You can run arbitrary host applications, many of them.
And there's a bunch of multiplexers, like a UART multiplexer, so that everybody can get a serial console and you have a single interface to it, or a network multiplexer, which acts as some sort of virtual switch. And virtualization is provided by virtual machine monitors, which are also user-mode applications. And we have this special design principle that every virtual machine uses its own instance of a virtual machine monitor. They don't all have to be the same. For example, if you run a unikernel in a VM, as shown to the far right, the virtual machine monitor could be much smaller because it doesn't need to deal with all the complexity that you would find in an OS like Linux or Windows. So the entire host OS, consisting of the NOVA microhypervisor and the user-mode portion of the host OS, is what BedRock calls the Ultravisor, which is a product that we ship. And once you have a virtualization layer that is very small, very secure, and basically sits outside the guest operating system, you can build interesting features like virtual machine introspection or virtualization-assisted security, which uses features like nested paging, breakpoints, and patched civil overrides to harden the security of the guest operating systems, like protecting critical data structures, introspecting memory, and also features in the virtual switch for doing access control between the different virtual machines and the outside world as to who can send what types of packets. And all of that is another product, which is called Ultra Security. The whole stack, not just the kernel, the whole stack is undergoing rigorous formal verification. And one of the properties that this formal verification effort is proving is what we call the bare metal property.
And the bare metal property basically says that combining all these virtual machines on a single hypervisor has the same behavior as if you were running these as separate physical machines connected by a real Ethernet switch, so that whatever happens in a virtual machine could have happened on a real physical machine that was not virtualized. That's what the bare metal property says. So the building blocks of NOVA are those that you would find in an ordinary microkernel. It's basically address spaces, threads, and IPC. NOVA address spaces are called protection domains, or PDs. And threads or virtual CPUs are called execution contexts, EC for short. And for those of you who don't know NOVA very well, I've just given a very brief introductory slide for how all these mechanisms interact. So let's say you have two protection domains, PD A and PD B. Each of them has one or more threads inside. And obviously, at some point, you want to intentionally cross these protection domain boundaries because these components somehow need to communicate. And that's what IPC is for. So assume that this client thread wants to send a message to the server thread. It has a user thread control block (UTCB), which is like a message box, puts the message in there, and invokes an IPC call to the hypervisor. The call vectors through a portal, which routes that IPC to the server protection domain, and then the server receives the message in its UTCB. As part of this control and data transfer, the scheduling context, which is a time slice coupled with a priority, is donated to the other side. And as you can see on the right, that's the situation after the IPC call has gone through. So now the server is executing on the scheduling context of the client. The server computes a reply, puts it in its UTCB, issues a hypercall called IPC reply, and the reply goes back to the client, the scheduling context donation is reverted, and the client gets its time slice back.
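The call and reply flow just described can be sketched as a small user-level model. This is only an illustration of the donation idea, not NOVA's implementation or its hypercall interface: the class names, field names, and the `ipc_call` function are assumptions made up for this sketch, and the server handler runs inline instead of in a separate protection domain.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SchedulingContext:
    """A time slice coupled with a priority (field names are illustrative)."""
    owner: str
    priority: int
    time_slice_us: int


@dataclass
class ExecutionContext:
    """A thread with its UTCB (message box) and the SC it currently runs on."""
    name: str
    utcb: List[int] = field(default_factory=list)
    sc: Optional[SchedulingContext] = None


@dataclass
class Portal:
    """Entry point that routes a call to a server EC and its handler."""
    server: ExecutionContext
    handler: Callable[[ExecutionContext], List[int]]


def ipc_call(client: ExecutionContext, portal: Portal, message: List[int]) -> List[int]:
    """Model one synchronous call/reply round trip with SC donation."""
    server = portal.server
    client.utcb = list(message)             # client puts the message in its UTCB
    server.utcb = list(client.utcb)         # data transfer through the portal
    server.sc, client.sc = client.sc, None  # scheduling context is donated
    reply = portal.handler(server)          # server executes on the client's SC
    server.utcb = list(reply)               # server puts the reply in its UTCB
    client.utcb = list(server.utcb)         # "IPC reply": data goes back
    client.sc, server.sc = server.sc, None  # donation is reverted
    return client.utcb


# Usage: the server sums the message words while running on the client's SC.
client = ExecutionContext("client", sc=SchedulingContext("client", priority=5, time_slice_us=1000))
server = ExecutionContext("server")
portal = Portal(server, handler=lambda ec: [sum(ec.utcb)])
print(ipc_call(client, portal, [1, 2, 3]))
```

Because the server never gets a scheduling context of its own in this model, there is nothing for a scheduler to decide on the call path, which is the property the talk attributes to the real mechanism.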
So what you get with that is very fast synchronous IPC, this time donation, and priority inheritance. And it's very fast because there's no scheduling decision on that path. Also, NOVA is a capability-based microkernel or hypervisor, which means all operations that user components do with the kernel have capabilities as parameters. And capabilities have the nice property that they both name a resource and at the same time convey what access you have on that resource. So it's a very powerful access control primitive. So that said, let's look at all the things that happened in NOVA over the last two and a half or so years. We are now on a release cadence where we put out a new release of NOVA approximately every two months. So it's always the year and the week of the year where we do releases, and this shows what we added in NOVA in 21, 22, and what we'll add to the first release of this year at the end of this month. So we started out at the beginning of 21 by unifying the code base between x86 and ARM, making the load address flexible, adding power management like suspend-resume, then extended that support to ARM. And later in 22, when that unification was complete, we started adding a lot of, let's say, advanced security features on x86, like control flow enforcement, code patching, cache allocation technology, multiple spaces, multi-key total memory encryption. And recently, we've added some APIC virtualization. So the difference between the things that are listed in bold here and those that are not listed in bold: everything in bold I'll try to cover in this talk, which is a lot, so hopefully we'll have enough time to go through all this. First of all, the design goals that we have in NOVA. And Martin already mentioned that not all microkernels have the same design goals.
Our design goal is that we want to provide the same or at least similar functionality across all architectures, which means the API is designed in such a way that it abstracts from architectural differences as much as possible, so that you get a uniform experience. Whether you're on x86 or ARM, you can create a thread and you don't have to worry about details of the instruction set, register set, or page table format; NOVA tries to abstract all of that away. We want to have a really simple build infrastructure, and you'll see in a moment what the directory layout looks like, but suffice it to say that you can build NOVA with a very simple make command where you say make, architecture equals x86 or ARM, and in some cases board equals, I don't know, Raspberry Pi or NXP i.MX 8, whatever, and it runs for maybe five seconds and then you get a binary. We use standardized processes, like a standardized boot process and standardized resource enumeration, as much as possible because that allows for great reuse of code. So we use Multiboot version 2 or 1 and UEFI for booting, and we use ACPI for resource enumeration. You can also use the FDT, but that's more of a fallback. And for ARM, there's this interface called PSCI, for power state coordination, that's also abstracting this functionality across many different ARM boards. So we try to use these interfaces as much as possible. The code is designed in such a way that it is formally verifiable, and in our particular case, that means formally verifying highly concurrent C++ code, not C code, not assembler code, but C++ code, and even weakly ordered memory, because ARMv8 is weak memory. And obviously, we want NOVA to be modern, small, and fast, best in class in security and performance, and we'll see how we did in that. So first, let me talk about the code structure, and Martin mentioned in his talk this morning that using directories to your advantage can really help.
So on the right, you see the directory structure that we have in the unified NOVA code base. We have a generic inc directory and a generic src directory. Those are the ones listed in green. And then we have architecture-specific subdirectories for aarch64 and x86_64, and we have architecture-specific build directories. There's also a doc directory in which you will find the NOVA interface specification, and there's a single unified Makefile. And when we looked at the source code and discussed it with our formal methods engineers, we recognized that basically all the functions can be categorized into three different buckets. The first one is what we call same API and same implementation. This is totally generic code. All the system calls are totally generic code. All the memory allocators are totally generic code. Surprisingly, even page tables can be totally generic code. So these can all share the source files, the header files, and the spec files, which basically describe the interface pre- and postconditions. The second bucket is functions that have the same API, but maybe a different implementation. An example of that would be a timer, where the API could be: set a deadline for when a timer interrupt should fire. So the API for all callers is the same, so you can potentially share the header or the spec file. But the implementation might be different on each architecture, or is very likely different. And the final bucket is those functions that have a different API and implementation, and you can't share anything. So the code structure is such that architecture-specific code lives in the architecture-specific subdirectories and generic code lives in the parent directories of those. And whenever you have an architecture-specific file with the same name as a generic file, the architecture-specific file takes precedence and basically overrides or shadows the generic file.
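The shadowing rule can be illustrated with a few lines of lookup logic. This is a sketch of the rule as stated, not of how the NOVA Makefile actually implements it, which the talk does not show; the function name and the file paths below are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable


def resolve_sources(generic: Iterable[str], arch_specific: Iterable[str]) -> Dict[str, str]:
    """Map each file name to the path that gets built.

    Generic files are entered first; an architecture-specific file with
    the same name then replaces (shadows) the generic entry.
    """
    chosen = {PurePosixPath(p).name: p for p in generic}
    chosen.update({PurePosixPath(p).name: p for p in arch_specific})
    return chosen


# Hypothetical layout: timer.cpp exists in both places, syscall.cpp only generically.
generic_files = ["src/syscall.cpp", "src/timer.cpp"]
x86_files = ["src/x86_64/timer.cpp", "src/x86_64/lapic.cpp"]
for name, path in sorted(resolve_sources(generic_files, x86_files).items()):
    print(name, "->", path)
```

With a rule like this, moving a file between the generic and the architecture-specific directory changes which implementation is picked up without touching any file list.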
And that makes it very easy to move files from architecture-specific to generic and back. So the unified code base that we ended up with, and these are the numbers from the upcoming release, 23.08, which will come out at the end of this month, shows what we ended up with in terms of architecture-specific versus generic code. In the middle, the green part is the generic code that's shared between all architectures, and it's 4,300 lines today. x86 adds 7,000 and some lines of specific code, and ARM, to the right, adds some 5,600 lines. So if you sum that up, for x86 it's roughly 11,500 lines and for ARM it's less than 10,000 lines of code. It's very small, and if you look at it, ballpark 40% of the code for each architecture is generic and shareable. And that's really great, not just from a maintainability perspective, but also from a verifiability perspective, because you have to specify and verify those generic portions only once. If you compile that into binaries, then the resulting binaries are also very small, like a little less than 70K in code size, and obviously if you use a different compiler version or a different NOVA version, these numbers will differ slightly, but it gives you an idea of how small the code base and how small the binaries will be. So let's look at some interesting aspects of the architecture, because assume you've downloaded NOVA, you've built such a small binary from source code, and now you want to boot it. The typical boot procedure, both on x86 and ARM, which are converging towards using UEFI as firmware, will basically have this structure where UEFI firmware runs first and then invokes some bootloader, passing some information like an image handle and a system table, and then the bootloader runs and invokes the NOVA microhypervisor, also passing the image handle and the system table, maybe adding Multiboot information.
And at some point there will have to be a platform handover of all the hardware from firmware to the operating system, in our case NOVA. This handover point is called exit boot services. It's basically the very last function that you call, as either a bootloader or a kernel, in firmware, and that's the point where firmware stops accessing all the hardware and the ownership of the hardware basically transitions over to the kernel. And the unfortunate situation is that as you call exit boot services, firmware, which may have enabled the IOMMU or SMMU at boot time to protect against DMA attacks, drops it at this point, which sounds kind of silly, but that's what happens. And the reason, if you ask those who are familiar with UEFI, is legacy OS support, because UEFI assumes that maybe the next stage is a legacy OS which can't deal with DMA protection, so it gets turned off. That is really unfortunate, because between the point where you call exit boot services to take over the platform hardware and the point where NOVA can actually enable the IOMMU, there's this window of opportunity, shown in red here, where there's no DMA protection. It's very small, maybe a few nanoseconds or microseconds, but it's a window where an attacker could perform a DMA attack. And for that reason, NOVA takes complete control of the exit boot services flow, so it's not the bootloader that calls exit boot services; NOVA actually drives the UEFI infrastructure and disables all bus-master activity before calling exit boot services, so that we eliminate this window of opportunity. That was a very aggressive change in NOVA because it means NOVA has to comprehend UEFI. The next thing that we added was a flexible load address. So when the bootloader wants to put a binary into physical memory, it invokes it with paging being disabled, which means you have to load it at some physical address.
And you can define an arbitrary physical address, but it would be good if whatever physical address you define worked on all boards. And that is simply impossible, especially in the ARM ecosystem. On ARM, some platforms have DRAM starting at physical address zero, some have MMIO starting at address zero, so you will not find a single physical address range that works across all ARM platforms where you can say always load NOVA at two megabytes, one gigabyte, whatever. So we made the load address flexible. Also, the bootloader might want to move NOVA to a dedicated place in memory, like the very top, so that the bottom portion can be given one-to-one to a VM. So the load address is now flexible for NOVA. Not fully flexible, but you can move NOVA up or down by arbitrary multiples of two megabytes, so at superpage boundaries. And the interesting insight is that for pulling this off, there is no ELF relocation complexity required. NOVA consists of two sections. There's a very small init section which is identity mapped, which means virtual addresses equal physical addresses, and that's the code that initializes the platform up to the point where you can enable paging. And then there's a runtime section which runs paged, so it has virtual-to-physical memory mappings, and once you run with paging enabled, the physical addresses that back these virtual memory ranges simply don't matter. So paging is basically some form of relocation. You only need to deal with relocation for the init section, and you can solve that by making the init section position-independent code. It's assembler anyway, so making that position independent is not hard. We actually didn't just make the code position independent, it is also mode independent, which means no matter whether UEFI starts you in 32-bit mode or 64-bit mode, that code deals with all these situations.
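To make the "multiples of two megabytes" rule concrete, here is a minimal sketch of the address arithmetic. The function names and the example link address are hypothetical and chosen only for illustration; this is not NOVA's actual code.

```python
SUPERPAGE = 2 * 1024 * 1024  # 2 MiB superpage size


def relocated_address(link_addr, offset_superpages):
    """Physical address after moving the image up (positive) or down
    (negative) by a whole number of superpages."""
    return link_addr + offset_superpages * SUPERPAGE


def is_valid_load_address(link_addr, load_addr):
    """A load address is acceptable in this model if it differs from the
    link address by an exact multiple of 2 MiB."""
    return load_addr >= 0 and (load_addr - link_addr) % SUPERPAGE == 0
```

For example, with a hypothetical link address of 0x200000, moving the image up by three superpages gives 0x800000, while 0x201000 would be rejected because it is not superpage aligned relative to the link address.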
There's an artificial limit in that you still have to load NOVA below four gigabytes, because multiboot has been defined in such a way that you can't express addresses above four gigabytes; some of these structures are still 32-bit, and that little emoticon expresses what we think of that. So after we had figured this out, we wanted to do some power management, and this is an overview of all the power management that ACPI defines. ACPI defines a few global states like working, sleeping and off. Those aren't all that interesting; the really interesting states are the sleep states. The thing with the bold black border around it is the state the system is in when it's fully up and running, no idling, no sleeping, no nothing. It's called the S0 working state. Then there are some sleep states; you might know suspend to RAM, suspend to disk and soft off. When you're in the S0 working state you can have a bunch of idle states, and in the C0 idle state you can have a bunch of performance states, which roughly correspond to voltage and frequency scaling, so ramping the clock speed up and down. Unfortunately we don't have a lot of time to go into all the details of these sleep states, but I still want to say a few words about this. We implemented suspend/resume on both x86 and ARM, and there are two ways you can go about it: one which is, I would say, a brute force approach, and the other which is the smart approach. The brute force approach basically goes like this: you look at all the devices that lose their state during a suspend/resume transition and you save their entire register state. That's a significant amount of state that you have to manage, and it may even be impossible to manage it, because if you have devices with hidden internal state you may not be able to get at it, or if the device has a hidden internal state machine you may not know what the internal state of that device is at that point.
So it may be suitable for some generic devices; if you wanted to save the configuration space of every PCI device, that's generic enough that you could do that. But for interrupt controllers or SMMUs with internal state, that's not smart. For those you can use the second approach, which NOVA uses: you save a high-level configuration and you reinitialize the device based on that. As an example, say you had an interrupt routed to core zero in edge-triggered mode. You would save that as high-level information, and that's sufficient to reinitialize all the interrupt controllers, all the redirection entries, all the trigger modes based on just this bit of information. So there's a lot less information to maintain, saving basically becomes a no-op, and restoring can use the same code path that you used to initially bring up that particular device. That's the approach for all the interrupt controllers, all the SMMUs, all the devices managed by NOVA. The next thing I want to briefly talk about is P-states, performance states, which are these gears for ramping up the clock speed on x86, and NOVA can now deal with all these P-states. The interesting aspect is that most modern x86 processors have something called turbo mode, which allows one or more cores to exceed the nominal clock speed and turbo up higher if other cores are idle. So if other cores are not using their thermal or power headroom, a selected set of cores, maybe just one core, maybe a few, can turbo up many bins. This is shown here with active core zero, which basically gets the thermal headroom of core one, core two and core three to clock up higher.
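The high-level-configuration approach to suspend/resume described above can be illustrated with a toy model. Everything here is hypothetical (the class, the dictionary standing in for hardware registers, the field names); it only shows the idea that saving becomes a no-op and resume replays the same programming path used at bring-up.

```python
class InterruptController:
    """Toy model: 'registers' are lost on suspend, high-level config is kept."""

    def __init__(self):
        self.config = {}     # high-level state: irq -> (core, trigger mode)
        self.registers = {}  # stand-in for device registers

    def route(self, irq, core, trigger):
        """Record the high-level intent, then program the 'hardware'."""
        self.config[irq] = (core, trigger)
        self._program(irq, core, trigger)

    def _program(self, irq, core, trigger):
        # Same code path for initial bring-up and for resume.
        self.registers[irq] = {"dest": core, "edge": trigger == "edge"}

    def suspend(self):
        # Nothing to save: config already holds everything we need.
        # The device loses its register state while suspended.
        self.registers = {}

    def resume(self):
        # Reinitialize the device purely from the high-level configuration.
        for irq, (core, trigger) in self.config.items():
            self._program(irq, core, trigger)
```

Routing one interrupt to core zero in edge-triggered mode, suspending, and resuming leaves the modeled registers exactly as they were before the suspend.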
So NOVA will exploit that feature when it's available, but there are situations where you want predictable performance, where you want every core to run at its guaranteed high-frequency mode, and there's a command line parameter that you can set that clamps the maximum speed to the guaranteed frequency. You could also lower the frequency to something less than the guaranteed frequency. There's an operating point called maximum efficiency, and there are even points below that where you can clock even lower, but then it's actually less efficient than at this point. All of that is also supported. As an overview, from a feature comparison perspective of ARM versus x86: we support P-states on x86, not on ARM, because there's no generic interface on ARM yet. We support all the S-states on x86, like stop clock, suspend/resume, hibernation, power off, platform reset. On ARM there's no such concept, but we also support suspend/resume and suspend to disk if it's supported. And what does "if it's supported" mean? It means if platform firmware like PSCI implements it, and there are some features that are mandatory and some that are optional. So suspend/resume, for example, works great on the NXP i.MX 8M that Stefan had for his demo. It doesn't work so great on Raspberry Pi because the firmware simply has no support for jumping back to the operating system after a suspend. So it's not a NOVA limitation. There's a new suspend feature called low power idle, which we don't support yet because it requires way more support than just NOVA. It basically requires powering down the GPU, powering down all the devices, powering down all the links, so this is a concerted platform effort.
But from a hypercall perspective, the hypercall that you would invoke to transition the platform to a sleep state is called control hardware, and whenever you try to invoke it with something that's not supported, it returns bad feature. And for the hypercalls that assign devices or interrupts, the state that the system had when you assigned devices or interrupts to particular domains will be completely preserved across suspend/resume cycles using this save-the-high-level-state approach. Next I'll talk about a radical API change that we made, and being a microkernel and not being Linux, we don't have to remain backward compatible. This is one of these major API changes that took quite a lot of time to implement. What we had in the past was basically an interface with five kernel objects: protection domains, execution contexts, scheduling contexts, portals and semaphores, and every protection domain looked as shown on this slide. It actually had six resource spaces built into it: an object space, which hosts capabilities to all the kernel objects that you have access to; a host space, which represents the stage one page table; a guest space, which represents the stage two guest page table; the DMA space, for memory transactions that are remapped by the IOMMU; a port I/O space; and an MSR space. All of these existed as one single instance in every protection domain, and when you created a host EC, a guest EC like a virtual CPU, or a device, they were automatically bound to the PD and picked up the spaces that they needed. That worked great for us for more than 10 years, but it turned out to be suboptimal for some more advanced use cases like nested virtualization. If you run a hypervisor inside a virtual machine and that hypervisor creates multiple guests itself, then you suddenly need more than one guest space. You need one guest space per sub-guest.
So you need multiple of these yellow guest spaces. Or when you virtualize the SMMU, and the SMMU has multiple contexts and every context has its own page table, then you suddenly need more than one DMA space, so you need more of these blue boxes, and the same can be said for port I/O and MSR spaces. So how do we get more than one if the protection domain has all these as single instances? What we did, and it was quite a major API and internal reshuffling, is we separated these spaces from the protection domain. They are now new first-class objects. So NOVA just got six new kernel objects; when you create them you get individual capabilities for them, and you can manage them independently from the protection domain. The way this works is that first you create a protection domain with create PD, then you create one or more of these spaces, again with create PD, so that's a sub-function of create PD. And then you create an EC, like a host EC, and it binds to those spaces that are relevant for a host EC. A host EC, like a host thread, needs capabilities, so it needs an object space and binds to that; it needs a stage one page table, so it binds to that; and it needs access to ports, so it binds to that, on x86 only, because on ARM there's no such thing. For a host thread all these assignments are static. We could make them flexible, but we have not found a need. It gets more interesting for a guest EC, which is a virtual CPU that runs in a guest. Again the sequence is the same: you first create a protection domain, then you create one or more of these spaces, and when you create the virtual CPU it binds to those spaces that it urgently needs, which are the object space and the host space. It does not yet bind to any of the flexible spaces shown to the right. That binding is established on the startup IPC, during the IPC reply. You pass capability selectors to the spaces that you want to attach to, and then you flexibly bind to those spaces, as denoted by these dashed lines.
And that assignment can be changed on every event. Every time you take a VM exit, NOVA synthesizes an exception IPC or architectural IPC and sends it to the VMM for handling, and when the VMM replies it can set a bit in the message transfer descriptor to say "I want to change the space assignment". It passes new selectors, and then you can flexibly switch between those spaces, and that allows us to implement, for example, nested virtualization. The same holds for a device, which on x86 is represented by a bus/device/function and on ARM is represented by a stream ID. The assign dev hypercall can flexibly rebind the device to a DMA space at any time. So that took quite a while to implement, but it gives us so much more flexibility, and I heard that some of the NOVA forks have come across the same problem, so maybe that's something that could work for you too. So let's talk about page tables. I mentioned earlier that page tables are actually generic code, which is somewhat surprising. NOVA manages three page tables per architecture: the stage one, which is the host page table; the stage two, which is the guest page table; and a DMA page table, which is used by the IOMMU. These correspond to the three memory spaces that I showed on the previous slide. The way we made this page table code architecture independent is by using a template base class which is completely lockless, so it's very scalable. The reason it can be lockless is that the MMU doesn't honor any software locks anyway. If you put a lock around your page table infrastructure, the MMU wouldn't know anything about those locks, so the code has to be written in a way that it does atomic transformations anyway, so that the MMU never sees an inconsistent state. And once you have this, there's also no need to put a lock around it for any software updates, so that's completely lock free.
And that architecture-independent base class deals with all the complexities of allocating and deallocating page tables, splitting superpages into page tables or overmapping page tables with superpages. You can derive architecture-specific subclasses from it, and the subclasses inject themselves as a parameter to the base class; that's called the curiously recurring template pattern. The subclasses then do the transformation between the high-level attributes, like this page is readable, writable, user accessible, whatever, and the individual bits and encoding of the page table entries as that architecture needs it. There are also some coherency requirements on ARM and some coherency requirements for SMMUs that don't snoop the caches, so these architecture-specific subclasses deal with all that complexity, but it allows us to share the page table class and to specify and verify it only once. So let's look at page tables in a little more detail, because there's some interesting stuff you need to do on ARM. Most of you who've been in an OS class or who've written a microkernel will have come across this page table format, where an input address like a host virtual or guest physical address is split up into an offset portion into the final page, 12 bits, and then you have nine bits indexing into each of the individual levels of the page table. So when an address is translated by the MMU from a virtual address into a physical address, the MMU first uses bits 30 to 38 to index into the level two page table to find the level one, and then to find the level zero, and the walk can terminate early. You can have a leaf page at any level, so it gives you one-gigabyte or two-megabyte superpages, or 4K pages. With a page table structure like this, three levels, you can create an address space of 512 gigabytes in size, and that should be good enough, but it turns out we came across several ARM platforms which have an address space size of one terabyte.
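The three-level format just described (12 offset bits plus nine index bits per level) can be sketched as follows. The helper name is hypothetical and this is only an illustration of the bit arithmetic, not NOVA's page table code.

```python
OFFSET_BITS = 12     # offset into the final 4K page
BITS_PER_LEVEL = 9   # 512 entries per page table
LEVELS = 3           # level 2 (top), level 1, level 0


def split_address(addr):
    """Split a 39-bit input address into ([level2, level1, level0], offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    indices = []
    for level in range(LEVELS - 1, -1, -1):
        shift = OFFSET_BITS + level * BITS_PER_LEVEL  # level 2 uses bits 30..38
        indices.append((addr >> shift) & ((1 << BITS_PER_LEVEL) - 1))
    return indices, offset


# 12 + 3 * 9 = 39 bits of input address, i.e. 512 GiB of address space
ADDRESS_SPACE_SIZE = 1 << (OFFSET_BITS + LEVELS * BITS_PER_LEVEL)
```

A leaf at level 2 covers 2^30 bytes (one gigabyte), a leaf at level 1 covers 2^21 bytes (two megabytes), and a leaf at level 0 covers a 4K page, which matches the superpage sizes mentioned in the talk.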
So that's twice as much; they need one extra bit, which you can't represent with 39 bits, so you have a 40-bit address space. So what would you do if you were designing a chip? You would expect that it would just open a new level here and that you get a four-level page table, but ARM decided differently, because they said: if I just add one bit, the level three page table would have just two entries, and that's not worth building another level into it. So what they did is they came up with a concept called concatenated page tables, and it makes the level two page table twice as large by adding another bit at the top. So now suddenly the level two page table has 10 bits of indexing, and the backing page table has 1024 entries and is 8K in size. And this concept was extended: if you go to a 41-bit address space, again you get one additional bit and the page table gets larger, and this keeps going on. It can extend up to four bits, at which point the level two page table is 64K in size. And there's no way around it; the only time at which you can actually open level three is when you go beyond those four extra bits. When you get to 44 bits, you can go to a four-level page table, and it looks like this. So the functionality that we also had to add to NOVA is to comprehend this concatenated page table format, so that we can deal with arbitrary address space sizes on ARM. We actually had a device, I think it was a Xilinx ZCU102, which had something mapped above 512 gigabytes and just below one terabyte, and you can't pass that through to a guest if you don't have concatenated page tables. The generic page table class we have right now is so flexible that it can basically do what's shown on this slide, and the simple case is x86. You have three-level, four-level, or five-level page tables with a uniform structure of nine bits per level and 12 offset bits. 39 bits isn't used by the MMU but might be used by the SMMU; the MMU typically uses four levels, and high-end boxes like servers use five levels for 57 bits.
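The concatenation rule described above can be captured in a small calculation. This is a simplified model under stated assumptions (4K pages, nine bits per level, at most four concatenated bits at the top level); the function names are hypothetical, and it is a sketch of the arithmetic rather than NOVA's implementation or the full ARM rules.

```python
OFFSET_BITS = 12      # 4K pages
BITS_PER_LEVEL = 9    # 512 entries per regular table
MAX_CONCAT_BITS = 4   # up to 16 concatenated top-level tables


def page_table_shape(address_bits):
    """Return (levels, top_index_bits, concat_bits) for an address space size."""
    remaining = address_bits - OFFSET_BITS
    levels = -(-remaining // BITS_PER_LEVEL)  # ceiling division
    top_bits = remaining - (levels - 1) * BITS_PER_LEVEL
    concat_bits = 0
    if levels > 1 and top_bits <= MAX_CONCAT_BITS:
        # Too few bits to justify another level: fold them into the
        # level below by concatenating tables.
        levels -= 1
        concat_bits = top_bits
        top_bits += BITS_PER_LEVEL
    return levels, top_bits, concat_bits


def top_table_bytes(top_bits, entry_bytes=8):
    """Size of the (possibly concatenated) top-level table in bytes."""
    return (1 << top_bits) * entry_bytes
```

In this model a 40-bit address space yields three levels with a 10-bit top index (1024 entries, 8K), 43 bits yields a 13-bit top index (64K), and 44 bits switches to a regular four-level table, in line with the numbers given in the talk.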
On ARM, depending on what type of SoC you have, it has somewhere between 32 and 52 physical address bits, and the table shows the page table level split, the indexing split, that NOVA has to do. All these colored boxes are instances of concatenated page tables. So 42 would require three bits to be concatenated, here we have four, here we have one, here we have two, so we really have to exercise all of those, and we support all of those. And unlike in the past, where NOVA said a page table has so many levels with so many bits each, we have now turned this around by saying the page table covers so many bits, and we compute the number of bits per level and the concatenation at the top level automatically in the code. So that was another fairly invasive change. While we were re-architecting all the page tables, we took advantage of a new feature that Intel added to Ice Lake servers and to Alder Lake desktop platforms, which is called total memory encryption with multiple keys. What Intel did there is they repurposed certain bits of the physical address in the page table entry, the top bits shown here, as key ID bits. So it's stealing some bits from the physical address, and the key ID bits index into a key programming table, shown here, and select a slot. Let's say you have four key bits: that gives you 16 keys, two to the power of four, so your key programming table would have the opportunity to program 16 different keys. We've also come across platforms that have six bits. It's flexible; how many bits are stolen from the physical address can vary per platform depending on how many keys are supported. Those keys are used by a component called the memory encryption engine. The memory encryption engine sits at the perimeter of the package or the socket, basically at the boundary where data leaves the chip that you plug into the socket and enters the interconnect and enters RAM.
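To illustrate how key ID bits are carved out of the top of the physical address, here is a minimal sketch. The function names are hypothetical, and the physical address width used in the example (46 bits) is an assumption for illustration only; the actual width and number of key ID bits vary per platform, as the talk notes.

```python
def keyid_layout(phys_addr_bits, keyid_bits):
    """Return (shift, mask) of the key ID field, which occupies the top
    keyid_bits of the physical address."""
    shift = phys_addr_bits - keyid_bits
    mask = ((1 << keyid_bits) - 1) << shift
    return shift, mask


def tag_address(phys_addr, keyid, phys_addr_bits, keyid_bits):
    """Combine a physical address and a key ID into a page table address field."""
    shift, mask = keyid_layout(phys_addr_bits, keyid_bits)
    if not 0 <= keyid < (1 << keyid_bits):
        raise ValueError("key ID out of range")
    if phys_addr & mask:
        raise ValueError("address overlaps the key ID bits")
    return phys_addr | (keyid << shift)


def number_of_keys(keyid_bits):
    """Four key ID bits give 16 slots in the key programming table."""
    return 1 << keyid_bits
```

With an assumed 46-bit physical address and four key ID bits, the key ID lands in bits 42 to 45, and usable memory addresses shrink to 42 bits, which is the "stealing" the talk refers to.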
So inside this green area, which is inside the SoC, everything is unencrypted: in the cores, in the caches, in the internal data structures. But as data leaves the die and moves out to the interconnect, it gets encrypted automatically by the memory encryption engine with the key. This example shows a separate key being used for each virtual machine, which is a typical use case, but it's actually much more flexible than that; you can select the key on a per-page basis. So if there was a need for these two VMs to share some memory, you could even have some blue pages appear here and some yellow pages appear here; that's possible. So we added support in the page tables for encoding these key ID bits, and we added support for using the PCONFIG instruction for programming keys into the memory encryption engine. The keys can come in two forms. You can either randomly generate them, in which case NOVA will also drive the digital random number generator to generate entropy, or you can program tenant keys. So you can say "I want to use this particular AES key for encrypting the memory", and that's useful for things like VM migration, where you want to take an encrypted VM and move it from one machine to another. The reason Intel introduced this feature is confidential computing, but also because DRAM is slowly moving towards non-volatile RAM, and offline or evil maid attacks, where somebody unplugs your RAM or takes your non-volatile RAM and then looks at it in another computer, are a big problem. They can still unplug your RAM, but they would only see ciphertext. So this was more of a confidentiality improvement. The next thing we looked at is improving availability, and we added some support for dealing with noisy neighbor domains. So what are noisy neighbor domains? Let's say you have a quad-core system, as shown on this slide, and you have a bunch of virtual machines, as shown at the top.
On some cores you may overprovision and run more than one VM, like on core zero and core one. For some use cases you might want to run only a single VM on a core, like a real-time VM which is exclusively assigned to core two. But then on some cores, like the one shown on the far right, you may have a VM that's somewhat misbehaving, and somewhat misbehaving means it uses excessive amounts of memory and basically evicts everybody else from the cache. So if you look at the last-level cache portion here, the amount of cache that is assigned to the noisy VM is very disproportionate to the amount of cache given to the other VMs, simply because this one is trampling all over memory. This is very undesirable from a predictability perspective, especially if you have a VM like the green one that's real time, which may want to have most of its working set in the cache. So is there something we can do about it? Yes, there is. It's called CAT. CAT is Intel's acronym for cache allocation technology, and what they added in the hardware is a concept called class of service. You can think of class of service as a number, and again, like the key IDs, there's a limited number of classes of service available, like four or sixteen, and you can assign this class of service number to each entity that shares the cache. So you could make it a property of a protection domain or a property of a thread. For each of the classes of service you can program a capacity bit mask, which says what proportion of the cache this class of service can use. Can it use 20%, 50%, and even which portion? There are some limitations, like the bit mask must be contiguous, but bit masks can overlap for sharing. And there's a model-specific register, which is not cheap to program, where you can say "this is the active class of service on this core right now". So this is something you would have to context switch to say "I'm now using something else".
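The capacity bit mask rules mentioned above (one bit per cache way, contiguous masks, overlap allowed for sharing) can be sketched as plain bit arithmetic. The helper names are hypothetical, and the 20-way cache with a 40% and a 20% partition is only an example configuration; nothing here programs real hardware.

```python
TOTAL_WAYS = 20  # example: a cache with 20 ways, one mask bit per way


def ways_mask(first_way, num_ways):
    """Capacity bit mask covering num_ways consecutive ways from first_way."""
    return ((1 << num_ways) - 1) << first_way


def is_contiguous(mask):
    """True if the set bits of a non-zero mask form one unbroken run."""
    if mask == 0:
        return False
    lowest = mask & -mask    # lowest set bit
    filled = mask + lowest   # a single run of ones carries into one bit
    return filled & (filled - 1) == 0


def share_percent(mask, total_ways=TOTAL_WAYS):
    """Percentage of the cache that a mask grants access to."""
    return bin(mask).count("1") * 100 // total_ways


def overlaps(mask_a, mask_b):
    """Ways present in both masks are competitively shared."""
    return mask_a & mask_b != 0
```

For instance, `ways_mask(0, 8)` gives ways 0 to 7 (40% of 20 ways) and `ways_mask(8, 4)` gives the next four ways (20%); the two masks do not overlap, so those two classes of service would not interfere with each other in this cache.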
And when you use this, it improves predictability, like the worst-case execution time, quite nicely, and that's what it was originally designed for. But it turns out it also helps tremendously with dealing with cache side-channel attacks, because if you can partition your cache in such a way that your attacker doesn't allocate into the same ways as the VM you're trying to protect, then all the flush-and-reload attacks simply don't work. So here's an example of how this works. To the right I've shown an example with six classes of service and a cache which has 20 ways. You can program, and this is again just an example, the capacity bit mask for each class of service to create full isolation. So you could say class of service 0 gets 40% of the cache, ways 0 to 7, class of service 1 gets 20%, and everybody else gets 10%, and these capacity bit masks don't overlap at all, which means you get zero interference through the level 3 cache. You could also program them to overlap. There's another mode called CDP, code and data prioritization, which splits the number of classes of service in half and redefines the meaning of the bit masks to say those with an even number are for data and those with an odd number are for code. So you can even discriminate how the cache is being used between code and data, which gives you more fine-grained control. The NOVA API forces users to declare upfront whether they want to use CAT or CDP to partition their cache, and only after you've made that decision can you actually configure the capacity bit masks. So with CDP it would look like this.
You get three classes of service instead of six, each distinguished between D and C, data and code. You could, for example, say class of service 1, as shown on the right, gets 20% of the cache for data and 30% of the cache for code, so 50% of the capacity in total is exclusively assigned to anybody who has class of service 1, and the rest share capacity bit masks. Here you see an example of how the bit masks can overlap, and wherever they overlap the cache capacity is being competitively shared. So that's also a new feature that we support right now. Now the question is: class of service is something you need to assign to cache-sharing entities, so to what type of object do you assign it? You could assign it to a protection domain. You could say every box on the architecture slide gets assigned a certain class of service, and the question then is what do you assign to a server that has multiple clients? That's really unfortunate. What it also means is that if you have a protection domain that spans multiple cores and you say "I want this protection domain to use 40% of the cache", you have to program the class of service settings on all cores the same way. So it's really a loss of flexibility. That wasn't our favorite choice, and we said maybe we should assign class of service to execution contexts instead. Again the question is what class of service do you assign to a server execution context that does work on behalf of clients, and the actual killer argument was that you would need to set the class of service in this model-specific register during each context switch, which is really bad for performance. So option two is not what we went for either. Instead we made the class of service a property of the scheduling context, and that has very nice properties.
We only need to context switch it during scheduling decisions, so the cost of reprogramming that MSR is really not relevant anymore, and it extends the already existing model of time and priority donation with class of service donation. So a server does not need to have a class of service assigned to it at all. It uses the class of service of its client. So if, let's say, your server implements some file system, then the amount of cache it can use depends on whether your client can use a lot of cache or not. So it's a nice extension of an existing feature, and the additional benefit is that the classes of service can be programmed differently per core. So 8 cores times 6 classes of service gives you 48 classes of service in total instead of 6. So that was a feature for availability. We also added some features for integrity, and if you look at the history, there's a long history of features being added to paging that improve the integrity of code against injection attacks. It all started many years ago with the 64-bit architecture, where you could mark pages non-executable and enforce that pages are either writable or executable but never both. So there's no confusion between data and code. Then over the years more features were added, like supervisor mode execution prevention, where, if you use that feature, kernel code can never jump into a user page and be confused into executing some user code. And then there's another feature called supervisor mode access prevention, which even says kernel code can never read a user data page without explicitly declaring that it wants to do that. So all of these tighten the security, and naturally NOVA supports them. There's a new one called mode-based execution control, which is only relevant for guest page tables, or stage two, and which gives you two separate execution bits. So there's not a single X bit.
There's now executable for user and executable for supervisor. And that is a feature that ultra security can, for example, use, where we can say that even if the guest screws up its page tables, its stage one page tables, the stage two page tables can still say Linux kernel code can never execute Linux user application code unless it's marked as XS in the stage two page table. So it's again a feature that can tighten the security of guest operating systems from the host. But even if you have all that, there are still opportunities for code injection, and these classes of attacks basically reuse existing code snippets and chain them together in interesting ways using control flow hijacking, like ROP attacks. I'm not sure who's familiar with ROP attacks: basically you create a call stack with lots of return addresses that chain together simple code snippets, like add this register, return; multiply this register, return; jump to this function, return. By chaining them all together, you can build programs out of existing code snippets that do what the attacker wants. You don't have to inject any code. You simply find snippets in existing code that do what you want. This doesn't work so well on ARM. It still works on ARM, but on ARM the instruction length is fixed to four bytes, so you can't jump into the middle of instructions. But on x86, with a flexible instruction size, you can even jump into the middle of instructions and completely reinterpret what existing code looks like. And that's quite unfortunate. So there's a feature that tightens the security around that, and it's called control flow enforcement technology, or CET. That feature adds integrity to the control flow graph, both to the forward edge and to the backward edge, and forward edge basically means you protect jumps or calls that jump from one location forward to somewhere else.
And the way that this works is that the legitimate jump destination where you want the jump to land, this landing pad, must have a specific end branch instruction placed there. If you try to jump to a place which doesn't have an end branch landing pad, then you get a control flow violation exception. So you need the help of the compiler to put that landing pad at the beginning of every legitimate function, and luckily GCC and other compilers have had that support for quite a while, GCC since version 8, and we are now at 12. So that works for forward edges. For backward edges, there's another feature called shadow stack. That protects the return addresses on your stack, and we'll have an example later. It basically has a shadow call stack which you can't write to. It's protected by paging, and if it's writable, then it won't be usable as a shadow stack. You can independently compile NOVA with branch protection, with return address protection, or both. So let's look at indirect branch tracking. I tried to come up with a good example, and I actually found a function in NOVA which is suitable for explaining how this works. NOVA has a buddy allocator that can allocate contiguous chunks of memory, and that buddy allocator has a free function where you basically pass an address and say "free this block". The function is really as simple as shown there; it just consists of these few instructions, because it's a tail call that jumps to some coalescing function here later. You don't have to understand all the complicated assembler, but suffice it to say that there's a little test here, these two instructions, which performs a meaningful check, and you know that you can't free a null pointer. So this test checks if the address passed as the first parameter is a null pointer, and if so, it jumps out right here. So in that case the function does nothing, does no harm; it's basically a no-op.
Let's say an attacker actually wanted to compromise memory and instead of jumping to the beginning of this function, it wanted to jump past that check to this red instruction to bypass the check and then corrupt memory. Without control flow enforcement that would be possible if the attacker could gain execution, but with control flow enforcement it wouldn't work, because when you do a call or a jump you have to land on an ENDBRANCH instruction and the compiler has put that instruction there. So if an attacker managed to get control and tried to jump through a vtable or some indirect pointer to this address, you would immediately crash. So this is how indirect branch tracking works. Shadow stacks work like this. With a normal data stack you have your local variables on your stack, you have the parameters for the next function on the stack, so the green function wants to call the blue function, and then when you do the call instruction the return address gets put on your stack. Then the blue function puts its local variables on the stack, wants to call the yellow function, puts the parameters for the yellow function on the stack, calls the yellow function, so the return address for the blue function gets put on the stack. And you see that the stack grows downward, and you see that the return address always lives above the local variables. So if you allocate an array on the stack and you don't have proper bounds checking, it's possible to overwrite the return address by writing past the array, and this is a popular attack technique, buffer overflow exploits that you find in the wild. So if you have code that is potentially susceptible to these kinds of return address overwrites, then you could benefit from shadow stacks.
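The indirect branch tracking rule described above can be illustrated with a small toy model. This is only a sketch: the addresses and the instruction listing are made up for illustration and are not NOVA's actual buddy allocator code, and real CET hardware tracks more state than this.

```python
# Toy model of indirect branch tracking (IBT). Hypothetical listing,
# loosely shaped like the free() function from the talk.
ENDBR = "endbr64"

CODE = {
    0x1000: ENDBR,            # landing pad placed by the compiler
    0x1004: "test rdi, rdi",  # null-pointer check
    0x1007: "je 0x1010",      # skip the free if the pointer is null
    0x1009: "jmp coalesce",   # tail call that actually frees the block
    0x1010: "ret",
}


class ControlFlowViolation(Exception):
    """Raised when an indirect branch lands somewhere without a landing pad."""


def indirect_call(target, ibt_enabled=True):
    """Return the instruction an indirect call lands on.

    With IBT enabled, the target must be a landing pad; otherwise the
    model raises, mirroring the control flow violation exception.
    """
    insn = CODE[target]
    if ibt_enabled and insn != ENDBR:
        raise ControlFlowViolation(f"no landing pad at {target:#x}")
    return insn
```

Calling `indirect_call(0x1000)` succeeds because the function entry carries the landing pad, while `indirect_call(0x1009)` raises, which corresponds to the attacker trying to jump past the null check.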
And the way that this works is there's a separate stack, this shadow stack, which is protected by paging so you can't write to it with any ordinary memory instructions, it's basically invisible, and the only instructions that can write to it are call and ret instructions and some shadow stack management instructions. And when the green function calls the blue function, the return address will not just be put on the ordinary data stack, but will additionally be put on the shadow stack, and likewise with the blue and the yellow return address. And whenever you execute a return instruction, the hardware will compare the two return addresses that it pops off the two stacks, and if they don't match, you again get a control flow violation. So that way, you can protect the backward edge of the control flow graph also, using shadow stacks. That's a feature that NOVA uses on Tiger Lake and Alder Lake and platforms beyond that that have this feature. But there's a problem. And the problem is that using shadow stack instructions is possible on newer CPUs that have these instructions, that basically have this ISA extension, but if you have a binary containing those instructions, it would crash on older CPUs that don't comprehend that. And luckily, Intel defined the ENDBRANCH instruction to be a NOP, but some shadow stack instructions are not NOPs. So if you try to execute a CET-enabled NOVA binary on something older without other effort, it might crash. So obviously, we don't want that. So what NOVA does instead, it detects at runtime whether CET is supported, and if CET is not supported, it patches out all these CET instructions in the existing binary to turn them into NOPs. And obviously, being a microkernel, we try to generalize the mechanism. So we generalized that mechanism to be able to rewrite arbitrary assembler snippets from one version to another version.
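The shadow stack comparison on return, as described above, can be sketched as a toy model. The class and method names here are hypothetical; this models only the push-on-call and compare-on-return behaviour, not paging protection or the management instructions.

```python
class ControlFlowViolation(Exception):
    """Raised when the two popped return addresses do not match."""


class ToyCpu:
    """Toy model: a data stack plus a shadow stack of return addresses."""

    def __init__(self, shadow_stack_enabled=True):
        self.data_stack = []      # writable by ordinary code (and by attackers)
        self.shadow_stack = []    # only call/ret touch it in this model
        self.shadow_stack_enabled = shadow_stack_enabled

    def call(self, return_address):
        # A call pushes the return address on both stacks.
        self.data_stack.append(return_address)
        if self.shadow_stack_enabled:
            self.shadow_stack.append(return_address)

    def ret(self):
        # A return pops both copies and compares them.
        address = self.data_stack.pop()
        if self.shadow_stack_enabled and address != self.shadow_stack.pop():
            raise ControlFlowViolation(f"return address {address:#x} was modified")
        return address
```

A buffer overflow can be imitated by overwriting `cpu.data_stack[-1]` after a `call`; with the shadow stack enabled, the next `ret` raises instead of returning to the attacker's address.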
And there's other examples for newer instructions that do better work than older instructions, like the XSAVE feature set, which can save supervisor state or save floating point state in a compact format. And the binary, as you build it originally, always uses the most sophisticated version. So it uses the most advanced instruction that you can find. And if we run that on some CPU which doesn't support the instruction or which supports some older instruction, then we use code patching to rewrite the newer instruction into the older one. So the binary automatically adjusts to the feature set of the underlying hardware. The newer your CPU, the less patching occurs, but it works quite well. And the reason we chose this approach is because the alternatives aren't actually great. So the alternatives would have been that you put some #ifdefs in your code and you say, #ifdef CET, use the CET instructions, and otherwise don't. And then you force your customers or your community to always compile the binary the right way, and that doesn't scale. The other option could have been that you put some if-then-else, you say, if CET is supported, do this, otherwise do that. And that would be a runtime check every time. And that runtime check is prohibitive in certain code paths, like entry paths, where you simply don't have any register free for doing this check because you have to save them all. But in order to save them, you already need to know whether shadow stacks are supported or not. So doing this feature check at boot time and rewriting the binary to the suitable instruction is what we do, and that works great. So the way it works is you declare some assembler snippets, like XSAVES is the preferred version. If XSAVES is not supported, the snippet gets rewritten to XSAVE, or a shadow stack instruction gets rewritten to a NOP. We don't need to patch any high-level C++ functions because they never compile to those complicated instructions.
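The boot-time patching idea described above can be sketched at the level of mnemonics. This is a simplified illustration and not NOVA's implementation: the `PatchSite` record, the feature names and the image contents are assumptions, and a real patcher rewrites instruction bytes in place and pads the replacement to the same length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchSite:
    index: int      # position of the preferred instruction in the image
    feature: str    # CPU feature the preferred instruction requires
    fallback: str   # what to write when that feature is missing


def patch_image(image, sites, cpu_features):
    """Return a copy of the image adjusted to the detected CPU features.

    The image is built with the most advanced instruction at every site;
    only sites whose feature is missing get rewritten.
    """
    patched = list(image)
    for site in sites:
        if site.feature not in cpu_features:
            patched[site.index] = site.fallback
    return patched


# Hypothetical image and patch table.
IMAGE = ["xsaves", "setssbsy", "ret"]
SITES = [
    PatchSite(index=0, feature="xsaves", fallback="xsave"),
    PatchSite(index=1, feature="cet_ss", fallback="nop"),
]
```

On a CPU reporting both features nothing changes, so the newer the CPU, the less patching occurs. With an empty feature set, `patch_image(IMAGE, SITES, set())` yields `["xsave", "nop", "ret"]`.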
And yeah, we basically have a binary that automatically adjusts. So finally, let's take a look at performance, because IPC performance is still a relevant metric if you want to be not just small but also fast. And the blue bars here in the slide show NOVA's baseline performance on modern Intel platforms like NUC 12 with Alder Lake and NUC 11 with Tiger Lake. And you can see that if you do an IPC between two threads in the same address space, it's really in the low nanosecond range, like 200 and some cycles. If you cross address spaces, you have to switch page tables, you have to maybe switch class of service, then it takes 536 cycles. And it's comparable on other micro-architectures, but the interesting thing that I want to show with this slide is that there's overhead for control flow protection. So if you just enable indirect branch tracking, the performance overhead is some 13% to 15%. If you enable shadow stacks, the performance overhead is increased some more. And if you enable the full control flow protection, the performance overhead in the relevant case, which is the cross address space case, is up to 30%. So users can freely choose through these compile time options what level of control flow protection they are willing to trade for what decrease in performance. So the numbers are basically just ballpark figures to give people a feeling for, if I use this feature, how much IPC performance do I lose? So with that, I'm at the end of my talk. There are some links here where you can download releases, where you can find more information. And now I'll open it up for questions. Thank you so much, Udo. So we have time for some questions. Yeah. And then you're partying. Thank you. It was a really, really nice talk, to see how many new things are in NOVA. One thing I would like to ask is you mentioned that page table code is formally verified and that it's also lock free.
What tools did you use for formal verification, especially in regard to the memory model for verification? Thank you. So I must say that I'm not a formal verification expert, but I obviously have regular meetings and discussions with all the people. And the tool that we are using is the Coq theorem prover for basically doing the proofs. But for concurrent verification, there's a tool called Iris that implements separation logic. Well, the memory model that we verify depends on whether you're talking about x86 or ARM. For ARM, we're using the multi-copy atomic memory model. Also, thanks for the talk. And it's great to see such nice progress. Just a quick question. In the beginning of the talk, you said that you have this command line option to clamp the CPU frequency to disable the turbo boosting. Why can't you do that at runtime? Why can't you configure it at runtime? We could configure it at runtime too, but we haven't added an API yet because the code that would have to do that simply doesn't exist yet. But there's no technical reason for why userland couldn't control the CPU frequency at arbitrary points in time. Okay, wonderful. Thanks. I was going to ask you about the verification aspect of this. Okay, got you. Any other questions? Yeah. Can you just say, sorry, Jonathan, it's going to be a lot too. Yeah, just to clarify, on the point of the DMA attack, were you talking about protecting the guests or the host or the DMA attack? So the question was for the DMA attack that I showed in this slide here, and you'll find the slides online after the talk. This is not a DMA attack of guest versus host, this is a boot time DMA attack. So you can really think of this as a timeline: firmware starts, boot loader starts, NOVA starts. And at the time that NOVA turns on the IOMMU, both guests and hosts will be DMA protected.
But NOVA itself could be susceptible to a DMA attack if we didn't disable bus master, simply because the firmware does this legacy backward compatible shenanigans that we don't like. And I bet a lot of other microkernels are susceptible to problems like this too, and the fix would work for them as well. Thanks, Udo, for the talk. I would like to know, can you approximate how much percentage of the architecture specific code is now added because of the security measures? So most of the security measures that I talked about are x86 specific, and ARM has similar features, like they have a guarded control stack specified in ARMv9, but I don't think you can buy any hardware yet. You can take the difference between x86 and AArch64 as a rough ballpark figure, but it's really not all that much. For example, the multi-key total memory encryption, that's just a few lines of code added to the x86 specific page table class, because it was already built into the generic class to begin with. Control flow enforcement is probably 400 lines of assembler code in entry paths and the switching. I did a quick test as to how many ENDBRANCH instructions a compiler would actually inject into the code. It's like 500 or so, because you get one for every interrupt entry and then one for every function, and it also inflates the size of the binary a bit, but not much. And the performance decrease for indirect branch tracking, among other things, comes from the fact that the code gets inflated and it's not as dense anymore. Okay. Yeah, final question, please, because red is one of the, yeah. You were saying that you were able to achieve an ELF binary without relocations. Yeah. Can you elaborate a little bit on how you did that, which linker did you use? So it's the normal GNU ld, but you could also use gold or mold or any of the normal linkers.
So the reason for why no relocation is needed is, for the paged code, as long as you put the right physical address in your page table, the virtual address is always the same. So virtual memory is some form of relocation where you say no matter where I run in physical memory, the virtual memory is always the same. For the unpaged code, which doesn't know at which physical address it was actually launched, you have to use position independent code, basically say I don't care at which physical address I run, I can run at an arbitrary address because all my data structures are addressed relative or something like that. And at some point you need to know what the offset is between where you wanted to run and where you do actually run, but that's simple. It's like you call your next instruction, you pop the return address off the stack, you compute the difference and then you know. Thank you so much, Udo. Thank you. So the slides are online, the recording as well. |
Evolution of OSv: Towards Greater Modularity and Composability |
So, once again, hello everybody, welcome to my talk. This talk is going to be about OSv, evolution of OSv towards greater modularity and composability. Thanks for introducing me. So I've been contributing to OSv since 2016. Here I go, in 2015 I heard about OSv in one of the conferences, and then a couple of years later I was nominated to be one of its committers, and my greatest contributions to OSv include making OSv run on Firecracker and significantly improving the AArch64 port, among other things. So I'm not sure if you can tell it, but OSv is actually my hobby, so I'm not like a real kernel developer like many of the previous speakers are, so it's actually, you know, I work on it at night when I feel like it, and I have a day job, so I don't represent the company that I work for, so this is all my personal contribution to the project. So in today's presentation I will talk about enhancements introduced by the latest release of OSv, 0.57, with the focus on greater modularity and composability, but I will also discuss other interesting enhancements like lazy stack, novel ways to build ZFS images, and improvements to the ARM port. Finally I will also cover an interesting use case of OSv, SeaweedFS running on OSv, which is a distributed file system. So as you can see, in this talk, besides the title, modularity, I will actually try to give you like a state of the art of where OSv is, how it has changed recently, and a little bit of where it's going, hopefully. So I know there are probably many definitions of unikernels and each of them is a little bit different, right, so I'm sure most of you understand what unikernels are, but just a quick recap with emphasis on how OSv is a little bit different. So OSv is a unikernel that was designed to run a single unmodified Linux application on top of a hypervisor, whereas traditional operating systems were originally designed to run on a vast range of physical machines.
But simply speaking, OSv is an OS designed to run a single application without isolation between the application and the kernel, or it can be thought of as a way to run a highly isolated process without the ability to make system calls to the host OS. Finally, OSv can run on both 64-bit x86 and ARMv8 architectures. Now a little bit of history. So OSv, for those that don't know, OSv was started in late 2012 by the company called Cloudius Systems, and they built a pretty strong team of 10, 20 developers, I think. I wasn't one of them, but they pretty much wrote most of OSv. But at some point they basically, I guess, realized they have to make money, I'm guessing, so they basically moved on and started working on this product you may know, ScyllaDB, which is this high-performance database, but I think they took some learnings. And after that, basically, I think OSv did receive some grant from the European Union, so there was some project on that, and I think there may have been some companies also using OSv, but honestly since then it's been really maintained by volunteers, so like me. There's still some people from ScyllaDB, Nadav Har'El and others, that contribute to the project. You know, I would just single out Fotis Xenakis, who actually was the one that implemented virtio-fs, a very interesting contribution to OSv. And obviously I would like to take this opportunity to invite more people to become part of our community, because honestly, you may not realize it, but our community is very small, so it's just really me, Nadav and a couple of other people that contribute to the project, so I hope, you know, we're gonna grow as a community after this talk.
So a quick recap of a little bit of what OSv looks like, what the design is. So in this slide you can see the major components of OSv across layers, starting with glibc at the top, which is greatly based actually on musl, then the core layer in the middle, comprised of the ELF dynamic linker, VFS, the virtual file system, networking stack, thread scheduler, page cache, RCU, read-copy-update, page table management and L1, L2 pools to manage memory, and then you have a layer of device drivers where OSv implements virtio devices over both PCI transport and MMIO transport, and then Xen and VMware among others. And one more thing, so OSv can run on KVM-based hypervisors like QEMU, like Firecracker. I did test also OSv on Cloud Hypervisor, which is I think Intel's hypervisor written in Rust, and then I personally didn't really run OSv on Xen, so I know that the Xen support is a little bit dated probably and I'm not sure how much it has been tested. I did test on VMware, VBox, VirtualBox, and I think on HyperKit at some point. So I won't go into more detail about this diagram, but I will leave it with you just as a reference for later. So in the first part of this presentation, about modularity and composability, I will focus on new experimental modes to hide the non-glibc symbols and the standard C++ library. I will also discuss how ZFS code was extracted out of the kernel in the form of a dynamically linked library, and finally I will also explain another new build option to tailor the kernel to a set of specific drivers, I call them driver profiles, and another new mechanism to allow building a version of the kernel with a subset of glibc symbols needed to support a specific application, which I think is quite interesting.
So by design OSv has always been a fat unikernel, which has been some of the criticism, and by default it has provided a large subset of glibc functionality, has included the full standard C++ library and ZFS implementation, drivers for many devices, and has supported many hypervisors. So on one hand it makes running an arbitrary application on any hypervisor very easy using a single universal kernel. But on the other hand such universality comes with the price of a bloated kernel with many symbols and drivers and possibly ZFS that is unused. That's causing inefficient memory usage, longer boot time and potential security vulnerabilities. In addition, a C++ application linked against one version of libstdc++, different than the version the kernel was linked against, may simply not work. For example, that happened to me when I was testing OSv with .NET, and the only way to make it work was to hide basically the C++ standard library and use the one that was part of the .NET app. So one way to lower memory utilization of the guest is to minimize the kernel size. By default OSv comes with a universal kernel that provides quite a large spectrum of the glibc library and the full standard C++ library and exposes a total of over 17,000 symbols, and most of those are very long C++ symbols that make up the symbol table. So the question may be posed, why not have a mechanism where we can build a kernel with all non-glibc symbols hidden and all unneeded code that is unused garbage collected. So the extra benefit of fewer exported symbols is increased security, which stems from the fact that there is simply less potential code left that could be harmful. And also that way we can achieve better compatibility, as any potential symbol collisions, for example the mismatched standard C++ library which I mentioned, can be avoided. So the release 0.57 added a new build option called conf_hide_symbols to hide those non-glibc symbols and the standard C++ library symbols.
If enabled, in essence most files in the source tree of OSv, except the ones under the libc and musl directories, would be compiled with the flag -fvisibility=hidden, and only if that build flag is enabled. On the other hand, the symbols to be exposed as public, like the glibc ones, would be annotated with OSV_*_API macros that translate basically to attribute visibility default, and the standard C++ library is linked with the flag --no-whole-archive. Those OSV_*_API macros basically would be like OSV_LIBC_API or the pthreads one and so on, basically matching all of the, I think, around 10 libraries that the OSv dynamic linker exposes. Finally, the list of public symbols exported by the kernel is enforced during the build process based on the symbol list files for each advertised library, like for example libc.so.6, and is maintained under the directory exported_symbols. So these files are basically lists of symbols that are concatenated using the script called generate_version_script, which goes into a version script file and then is fed to the linker as an argument to the --version-script flag. So in order to now remove all unneeded code, basically garbage, all files would be compiled with -ffunction-sections and -fdata-sections and then they would be linked with the flag --gc-sections. Now any code that needs to stay, like for example the bootstrap start point or dynamically enabled code like the optimal memcpy implementation or tracepoint patch sites, is retained by putting relevant KEEP directives on relevant sections in the linker script. The kernel ELF file built with most symbols hidden is roughly 4.3 megabytes in size compared to 6.7, which is a reduction of around 40%. This great reduction stems from the fact that the standard C++ library is no longer linked with --whole-archive, the symbol table is way smaller, and unused code is garbage collected.
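The step of concatenating per-library symbol lists into a linker version script, as described above, can be sketched as follows. This is a minimal illustration and not OSv's actual script: the function name and the sample symbol lists are assumptions, and only the anonymous GNU ld version script layout (a `global:` list followed by `local: *;`) is standard.

```python
def make_version_script(symbol_lists):
    """Merge several symbol lists into one anonymous GNU ld version script.

    Every listed symbol stays global; everything else becomes local,
    which is what hides the remaining kernel symbols.
    """
    symbols = sorted({name for names in symbol_lists for name in names})
    lines = ["{", "  global:"]
    lines += [f"    {name};" for name in symbols]
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


# Hypothetical stand-ins for two of the per-library symbol list files.
LIBC_SYMBOLS = ["malloc", "free", "open"]
LIBPTHREAD_SYMBOLS = ["pthread_create", "open"]
```

The resulting text would be written to a file and passed to the linker with `--version-script`; duplicates across lists, such as `open` here, collapse into a single entry.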
Please note that the resulting kernel is still universal, as it exports all glibc symbols and includes all the device drivers. And as a result of this size reduction the kernel also boots a little bit faster. Well, this all sounds great, so one may ask why not hide most symbols and the standard C++ library by default. The problem is that there are around 35 unit tests and also some applications that were written in the past that rely on C++ symbols, and they basically would not run if we hide all of those symbols. And those were implemented in the past, and it was done sometimes out of convenience, sometimes basically out of necessity. So to address this specific problem we will need to expose some of those OSv C++ symbols as an API expressed in C. So we'll basically define very simple C wrapper functions that will call that C++ code. Well I can use this one. A good example of modularity improvements made in the release 0.57 is extracting ZFS code out of the kernel as a dynamically linked library, libsolaris.so, which effectively is a new module. To accomplish that we changed the main OSv makefile to build a new artifact, libsolaris.so, out of the ZFS and Solaris file sets in the makefile, which basically used to be linked into the kernel. The new library has to be linked with the bind-now flag and an OSv-specific osv-mlock note to force the OSv dynamic linker to resolve symbols eagerly and populate the mappings eagerly as well. This basically is done to prevent page faults that would lead to potential deadlocks as the library is loaded and initialized. The init function zfs_initialize, called when the library is loaded, creates necessary thread pools and registers various callbacks so that the page cache, ARC, which is the adaptive replacement cache from ZFS, and the ZFS dev driver can interact with relevant code in the ZFS library.
On the other hand, the OSv kernel needs to expose around 100 symbols that provide some internal FreeBSD-originating functionality that libsolaris.so depends on. OSv borrowed some code from FreeBSD, and actually a good chunk of this code was the implementation of ZFS, which right now is outside of the kernel. Finally, the virtual file system bootstrap code needs to dynamically load libsolaris.so from bootfs or read-only FS using dlopen before mounting the ZFS file system. There are at least three advantages of moving ZFS to a separate library. First off, ZFS can be optionally loaded from another file system, like bootfs or a read-only FS partition on the same disk or another disk, and I will actually discuss that in more detail in one of the upcoming slides later. Then also, the kernel gets smaller by around 800 kilobytes and effectively becomes 3.6 megabytes in size. Finally, there are at least 10 fewer threads that are needed to run a non-ZFS image. So for example, when you run a read-only FS image on OSv with one CPU, it only requires 25 threads. The regular Linux glibc apps should run fine on a kernel with most symbols and the standard C++ library hidden, but unfortunately many unit tests, which I mentioned, and various internal OSv apps which are written mostly in C++, so-called modules, do not, as they had been coded in the past to use those internal C++ symbols from the kernel, and we have to do something to deal with that problem. So in the release 0.57 we introduced some C wrapper APIs, which basically follow C style conventions, and then we changed those modules to use those C wrapper functions instead of C++ code. The benefit is that down the road we might have some newer apps or some newer modules that would use those C wrapper functions, and it also may make OSv more modular.
As you can see, one of the examples is osv_get_all_threads, which is basically a function that gives a thread-safe way for a caller to iterate over threads, which, for example, is used in the HTTP monitoring module to list all the threads. A good example of an OSv-specific module that uses some internal C++ symbols is HTTP server monitoring. We modified the HTTP monitoring module to stop using the internal kernel C++ API. We do it by replacing some of the calls to internal C++ symbols with this new module C-style API, the symbols which you saw on the slide before, for example, sched::with_all_threads with this new osv_get_all_threads function. In other scenarios we fall back to the standard glibc API, for example, the monitoring app used to call osv::current_mounts and right now it uses basically the getmntent function and related ones. So the release 0.57 introduced another build mechanism that allows creating a custom kernel with a specific list of drivers intended to target a given hypervisor. Obviously such a kernel benefits from even smaller size and better security, as all unneeded code, all unneeded drivers, are basically excluded during the build process. In essence we introduce a new build script and makefile parameter, drivers_profile. This new parameter is intended to specify a driver profile, which is simply a list of device drivers to be linked into the kernel and some extra functionality, like PCI or ACPI, these drivers depend on. Each profile is specified in a tiny include file with the mk extension under the conf/profiles/arch directory and included by the main makefile as requested by the drivers_profile parameter. The main makefile has a number of basically if expressions and conditionally adds a given driver object to the linked object list depending on the value, 0 or 1, of the given conf_drivers parameter specified in that include file.
The benefits of using driver profiles are most profound when you build the kernel and hide most of the symbols, as I talked about in one of the previous slides. It's also possible to enable or disable individual drivers on top of profiles, as profiles are basically lists of drivers, but there are a number of configuration parameters where you can, for example, which I'm going to be actually showing here, include a specific driver. One may ask a question, why not use something more standard like menuconfig, like for example what Unikraft does. Well, actually OSv has this specific build system and I didn't want to basically now introduce another way of doing things, so that's why basically the build script uses various parameters to, for example, hide symbols or specify a specific driver profile or a list of other parameters. So as you can see, in the first example we built the default kernel with all symbols hidden and the resulting kernel is around 3.6 megabytes. In the next example we built a kernel with the virtio over PCI profile, which is like 300 kilobytes smaller, and then in the third one we built a kernel which is intended, for example, for Firecracker, where we include only the virtio block device and networking driver over MMIO transport. And then in the fourth one, just to see how large the drivers' code in OSv is, when you basically use the driver profile base, which is basically nothing, no drivers, you can see that the drivers' code is roughly 600 kilobytes in size. And then the last option is where you use basically a driver profile and then you explicitly say which specific drivers or, you know, driver-related capabilities, like in this case ACPI, virtio-fs and virtio-net and pvpanic devices, you want to use.
Actually, with the new release of OSv 0.57 we started publishing new variations effectively of the OSv kernel that correspond to these, I thought, you know, interesting build configurations that I just mentioned. And in this example the OSv loader hidden artifacts are effectively the versions of the OSv kernel built with most symbols hidden, which would be at the top, for both ARM and x86, and then for example right here, the second and third and fourth artifacts are basically versions of the kernel built for the microVM profile, which is effectively something that you would use to run OSv on Firecracker, which only has virtio over MMIO transport. Now the release 0.57 introduced yet another build mechanism that allows creation of a custom kernel by exporting only symbols required by a specific application. Such a kernel benefits from the fact that again it's a little bit smaller and thus offers better security, as in essence all code unneeded by that specific application is removed. This new mechanism relies on two scripts that analyze the build manifest, detect application ELF files, identify symbols required from the OSv kernel and finally produce the application-specific version script under app_version_script. The generate_app_version_script iterates over the manifest files produced by list_manifest_files.py, identifies undefined symbols in the ELF files using objdump that are also exported by the OSv kernel, and finally generates basically the app version script. So please note that this functionality only works when you build the kernel with most symbols hidden. So I think what is kind of interesting and worth noting in that approach is that you basically run the build script against a given application twice. Basically the first time to identify all symbols that the application needs from the OSv kernel, and then the second time is to build the kernel for that specific app.
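The symbol selection at the heart of this mechanism, as described above, amounts to a set intersection. The sketch below assumes the undefined symbols of each ELF file have already been extracted (the talk mentions objdump for that step); the function name and the sample data are hypothetical, not the actual OSv scripts.

```python
def symbols_needed_from_kernel(undefined_by_elf, kernel_exports):
    """Return the sorted kernel symbols that the application actually needs.

    undefined_by_elf maps each ELF file of the app to its undefined symbols;
    kernel_exports is the set of symbols the universal kernel would export.
    Undefined symbols satisfied elsewhere (for example by another shared
    library in the image) are not in kernel_exports and are dropped.
    """
    wanted = set()
    for symbols in undefined_by_elf.values():
        wanted |= set(symbols)
    return sorted(wanted & set(kernel_exports))


# Hypothetical inputs for a tiny application image.
UNDEFINED = {
    "/app/hello": {"printf", "malloc", "helper_from_libfoo"},
    "/usr/lib/libfoo.so": {"malloc", "pthread_create"},
}
KERNEL_EXPORTS = {"printf", "malloc", "free", "pthread_create", "open"}
```

Here the result is `["malloc", "printf", "pthread_create"]`; symbols resolved at runtime through lookups by name would not show up as undefined symbols and so would be missed by this kind of static analysis.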
In this example we actually generate a kernel specific to running a simple Golang app on OSv, and when you actually build the kernel with only the symbols, around I think 30 symbols, needed by the golang-pie-example, the kernel is effectively around half a megabyte smaller and it's around 3.2 megabytes. So this approach has obviously some limitations. So some applications obviously use, for example, dlsym, right, to dynamically resolve symbols, and those would be missed by this technique. So in this scenario, basically for now you have to manually find those symbols and add them to the app version script file. Basically a lot of glibc functionality is still referenced in OSv: linux.cc, where all the system calls are actually implemented, still basically references all the code in some of the parts of the libc implementation, so this obviously also would not be removed. So obviously we could think of ways of finding some kind of build mechanism that could, for example, find all the usages of the syscall instruction, or SVC on ARM, and analyze and find only the code that is needed. In the future we may componentize other functional elements of the kernel, for example the DHCP lookup code could be either loaded from a separate library or compiled out depending on some build option. To improve compatibility, we're also planning to add support of statically linked executables, which would require implementing at least the clone, brk and arch_prctl syscalls. We may also introduce the ability to swap built-in versions of glibc libraries with third party ones, for example the subset of libm.so that is provided by the OSv kernel could be possibly hidden with the mechanism that was discussed, and we could use a different implementation of that library. Finally, we are considering expanding the standard procfs and sysfs and OSv-specific parts of sysfs, which would better support statically linked executables but also allow regular apps to interact with OSv.
A good example of that would be a netstat-like capability that could better expose the networking internals of OSv at runtime. In the next part of the presentation I will discuss the other interesting enhancements introduced as part of the latest 0.57 release. More specifically, I will talk about the lazy stack, new ways to build ZFS images, and finally the improvements to the AArch64 port. The lazy stack, which by the way is an idea that was floated by Nadav Har'El, who maybe is listening to this presentation, saves a substantial amount of memory if an application spawns many pthreads with large stacks, by letting the stack grow dynamically as needed instead of being pre-populated ahead of time, which is normally the case right now with OSv. On OSv right now, all kernel threads and all application threads have stacks that are automatically pre-populated, which is obviously not very memory efficient. The crux of the solution is based on the observation that the OSv page fault handler requires that both interrupts and preemption be enabled when a fault is triggered. Therefore, if the stack is dynamically mapped, we need to make sure that a stack page fault never happens in the relatively few places where kernel code executes with either interrupts or preemption disabled. We satisfy this requirement by pre-faulting the stack: we read one byte, one page down from the stack pointer, just before preemption or interrupts are disabled. A good example of such code is in the scheduler, when the OSv scheduler is trying to figure out which thread to switch to next. That code has preemption and interrupts disabled, and we obviously wouldn't want a page fault to happen at that moment. There are relatively few places where that happens, and the idea is to pre-fault the stack in those places.
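The pre-faulting idea can be illustrated with a toy model. This is a sketch under stated assumptions rather than OSv code: the class and function names are hypothetical, pages are modeled as a Python set, and the critical section is assumed to use less than one page of stack, which is what makes touching a single page below the stack pointer sufficient in this model.

```python
PAGE_SIZE = 4096


class LazyStack:
    """Toy downward-growing stack whose pages are mapped on first touch."""

    def __init__(self, top, size):
        self.top = top
        self.bottom = top - size
        self.mapped = set()   # page numbers already backed by memory
        self.faults = 0

    def touch(self, addr, faults_allowed=True):
        if not (self.bottom <= addr < self.top):
            raise ValueError("address outside the stack")
        page = addr // PAGE_SIZE
        if page not in self.mapped:
            if not faults_allowed:
                # Models a fault arriving while the handler cannot run.
                raise RuntimeError("page fault with preemption or interrupts disabled")
            self.mapped.add(page)
            self.faults += 1


def prefault(stack, sp):
    """Touch one byte one page below the stack pointer while faults are allowed."""
    stack.touch(sp - PAGE_SIZE)


def critical_section(stack, sp, frame_bytes):
    """Use frame_bytes of stack below sp with faults forbidden."""
    for addr in range(sp - frame_bytes, sp):
        stack.touch(addr, faults_allowed=False)


def run(with_prefault):
    stack = LazyStack(top=256 * PAGE_SIZE, size=64 * PAGE_SIZE)
    sp = 255 * PAGE_SIZE + 128      # close to the bottom of its current page
    stack.touch(sp)                  # the current page is already in use
    if with_prefault:
        prefault(stack, sp)
    critical_section(stack, sp, frame_bytes=512)
    return stack.faults


print(run(with_prefault=True))
```

Calling `run(with_prefault=False)` raises in this model, because the 512-byte frame crosses into a page that was never touched while faults were still allowed.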
To achieve that, we analyzed the OSv code to find all the places where irq_disable and preempt_disable are called, directly or sometimes indirectly, and we pre-fault the stack there if necessary. As we analyze all call sites, we need to follow five rules. The first one: do nothing if the call in question always executes on a kernel thread, because it has a pre-populated stack and there is no chance that a page fault is going to happen. The second one: do nothing if the call site executes on another type of pre-populated stack. Good examples of that would be the interrupt and exception stacks or the syscall stack, which are all pre-populated. Rule number three: do nothing if the call site executes when we know that either interrupts or preemption are already disabled, because somebody has probably already pre-faulted the stack. Rule number four: pre-fault unconditionally if we know that both preemption and interrupts are enabled. And otherwise, rule number five: pre-fault the stack after determining dynamically whether it is needed, by calling the preemptable and irq_enabled functions. Now, if we followed only rule number five, which is actually what I tried to do in the very first attempt to implement the lazy stack, it would be pretty inefficient. I saw a pretty significant degradation of, for example, context switching and other parts of OSv when I dynamically checked whether preemption and interrupts were disabled. So it was actually pretty painful to analyze the code, but I think it was worth it. As you remember from the modularity slides, the ZFS file system has been extracted from the kernel as a separate shared library called libsolaris.so, which can be loaded from a different file system before the ZFS file system is mounted. This allows for three ways ZFS can be mounted by OSv.
The first and original way assumes that ZFS is mounted at the root from the first partition of the first disk. The second one involves mounting ZFS from the second partition of the first disk at an arbitrary non-root mount point, for example /data. Similarly, the third way involves mounting ZFS from the first partition of the second or a higher disk, also at an arbitrary non-root mount point. Please note that the second and third options assume that the root file system is not ZFS, obviously; it could be ROFS or bootfs. This slide shows you the build command and how OSv runs when we follow the original and default method of building and mounting ZFS. For those who have done it, there's nothing really interesting here. This is a new method, the first of the two new ones, where we allow ZFS to be mounted at a non-root mount point, like /data for example, and mixed with another file system on the same disk. Please note that libsolaris.so is placed on the root file system, typically ROFS, under /usr/lib/fs, and is loaded from it automatically. The build script will automatically add the relevant mount point line to /etc/fstab. The last method is similar to the one before, but this time we allow ZFS to be mounted from a partition on the second disk or another one. With this option, I noticed that OSv would actually mount the ZFS file system around 30 to 40 milliseconds faster. Now there's another new feature. Until now, in order to build ZFS images and file systems, we would run OSv itself to do it.
With this new release there's a specialized version of the kernel called the ZFS loader, which delegates to utilities like zpool, zfs, and so on to do this. But there's also now a new script called zfs-image-on-host that can be used to mount OSv ZFS images, provided you have OpenZFS functionality on your host system. This is quite nice because you can mount an OSv disk and introspect it, you can modify it using standard Linux tools, and then you can unmount it and use it on OSv again. Here's some help on how this script can be used. I don't think I have much time left, but I will try. I will focus a little bit on the AArch64 improvements, on three things that I think are worth mentioning: the changes to dynamically map the kernel during boot from the second gigabyte of virtual memory to the 63rd gigabyte, additional enhancements to handle system calls, and handling exceptions on a dedicated stack. As for moving the virtual memory mapping to the 63rd gigabyte: I'm not sure if you realize it, but the OSv kernel is position dependent, while the kernel itself may obviously be loaded in different parts of physical memory. Before this release you would have to build different versions for Firecracker and for QEMU. In this release we changed the logic in the assembly, in the bootloader, so that OSv detects by itself where it is in physical memory and, in essence, dynamically sets up the early mapping tables to eventually bootstrap to the right place in the position-dependent code. So now you can use the same version of the kernel on any hypervisor.
Next, system calls on ARM: we had to handle the SVC instruction, and there's not really much of interest there if you know how that works. What is maybe a little bit more interesting is the change that I made to make all exceptions, including system calls, work on a dedicated stack. Before that change, all exceptions would be handled on the same stack as the application, which caused all kinds of problems; for example, it would effectively prevent implementation of the lazy stack. To support that, OSv, which runs at EL1 in kernel mode, takes advantage of the stack pointer selector register and uses both stack pointer registers, SP_EL0 and SP_EL1. Normally OSv uses the SP_EL1 register to point to the stack of each thread. With the new implementation, when an exception is taken we switch the stack pointer selector to SP_EL0, and once the exception has been handled we go back to normal, which is SP_EL1. I think I'm going to skip SeaweedFS because we have very little time left, but you can read about it on the slide. We've also added netlink support and we've made quite a few improvements to the VFS layer. Both the netlink and the VFS improvements were done to support SeaweedFS, so more gaps have been filled by trying to run this new use case.
Just briefly, as we are pretty much at the end of the presentation: in the next releases of OSv, whenever they're going to happen, I would like us to focus on supporting statically linked executables; adding proper support for spin locks, because the OSv mutex, for example, is lockless right now, but under high contention it would actually make sense to use spin locks, and we have a prototype of that on the mailing list; then supporting ASLR; refreshing Capstan, which is a build tool that hasn't really been improved for a long time because we don't have volunteers; and then even the website. There are many other interesting ones. As a last slide, I would like to use this occasion to thank the organizer, Razvan, for inviting me, and everybody else from the unikernel community. I also want to thank ScyllaDB for supporting me, and Dor Laor and Nadav Har'El for reviewing all the patches and for their other improvements, and I want to thank all the other contributors to OSv. I would also like to invite you to join us, because there are not many of us, and if you want to keep OSv alive, we definitely need you. There are some resources about OSv here, there's my P99 presentation here as well, and if you have any questions I'm happy to answer them. Thank you. Thank you, Waldemar, thank you. So, any questions for Waldemar? Yeah, please, Marta, just ask; it's going to be a bit off the mic. Okay, I have two questions. First, when you spoke about the symbols, about the glibc symbols and the C++ symbols, do I understand correctly that the problem is that the kernel might be using some glibc functions, and the application might be linked to its own glibc, and so those symbols basically clash?
Well, not really, they would use the same version. There's no problem with, for example, malloc; it's not that we don't want to expose malloc. But a good chunk of OSv is implemented in C++, and none of those symbols need to be exposed, because they inflate the symbol table a lot and they shouldn't really be visible to others. Now, if you build with that option, I think OSv exposes around 1,600 symbols instead of 17,000. So it's really about the binary size there? Yeah, basically binary size and, in the case of the C++ library, avoiding a collision when you build OSv with a different version of the C++ library than the application uses. Yeah, okay, so this is the case I'm interested in. Have you thought about maybe renaming the symbols in the kernel image at link time, maybe adding some prefix to all the symbols, so that you can have them visible but they would not clash? That's an interesting idea, I haven't thought about it, yeah. And Marty, the second question, yeah. Yeah, just a quick second question. When you spoke about the lazy stack, you said that you pre-fault the stack to avoid the problematic case when a fault happens with preemption disabled. When I think about it, you still need to have some kind of upper bound on the size of the stack, so that you know that you have pre-faulted enough of it to not get into the issue. So my question is: why not then make all the kernel stacks a fixed size? If you already need to have some upper bound, why not have a global upper bound for the whole kernel? Wouldn't it just be easier?
Well, this is for application threads only, so for application stacks; the kernel threads would still have the pre-populated, fixed-size stack. There are many applications, a good example is Java, that would start something like 200 threads, and right now all of them are pre-populated with something like one megabyte, so all of a sudden you need something like 200 megabytes. So this is just for applications. Okay, so basically my understanding was wrong. So the user stack and the kernel stack are the same stack? Well, no, they are in the same virtual memory, but when I say kernel stack, I mean that in OSv there are basically two types of threads: there are kernel threads and there are application threads. Application threads use their own stack, but when they enter the kernel, so to speak, they are still using that original stack, right? Application threads use the application stack and kernel threads use the kernel stack. And when I say kernel code: because this is a unikernel, as the code executes in an application thread it runs on the application stack, but it might execute some kernel code as well. Yeah. Thank you. Any other questions? Okay, thank you so much, let's move on. |
Introducing Helios Microkernel
A small, practical microkernel |
Thank you. So hi, my name is Drew, and I'd like to talk to you today about a new microkernel I've been working on called Helios. For context, I work at a place called SourceHut and I'm the project lead for a new programming language called Hare. I've done many other projects, but that's what's relevant for today. This is a new microkernel. It's inspired a lot by seL4, but it differs in many ways. It's written in the Hare programming language that I mentioned, and one of the main motivations for it is to find out whether we can use Hare to write microkernels, or any kind of kernel really. Presently it runs on x86_64 and ARM64, and we're thinking about RISC-V in the foreseeable future. In terms of footprint, this kernel is pretty small. The portable code is about 8,500 lines of code. Each architecture adds about another 3,000 lines of code, all Hare code, and then add on top of that the bootloaders, which are also written in Hare, and it's a pretty small footprint. We've been working on it for about nine months now and we use the GPLv3 license. So, with about nine months of progress so far, where we stand in terms of functionality is about here. We have capability-based security, and the capabilities work similarly to what seL4 does. Also similar to seL4, we have inter-process communication working using endpoints and notifications, with some notable differences. We have scheduling working, we're in user space, and we have multiprocessing, but we don't have symmetric multiprocessing; we use only one core at the moment, but we'll do SMP fairly soon. We also have all of the necessary rigging in place for drivers in user space, so we have access to I/O ports on x86, we have memory-mapped I/O support, and IRQs are rigged up. For booting the kernel, we currently support EFI on ARM and multiboot on x86. We'll be doing EFI on x86 as well in the future, and our plan is to also do EFI on RISC-V.
So we'll use EFI as the default approach for booting Helios in the future. But why should we be thinking about writing a new microkernel, or a new kernel of any sort? I imagine that for this particular dev room I don't need to give too many reasons, but for the sake of anybody who's maybe watching online, the first point is pretty obvious. It's really fun to write kernels, and that's kind of reason enough. I'm having a great time working on it and that's enough for me. But also, importantly, we've been working on this programming language, Hare, for about three years now. It's a systems programming language, and one of our goals is to be able to write things like kernels, so in order to prove that we have achieved this goal, we have to write a kernel with it, and Helios is that kernel. I'm also a big fan of seL4's design, but I have some criticisms of it, and I'm curious whether, if we do a kernel which is inspired by seL4, we can make some improvements on its design. And if we were to be particularly ambitious, could we perhaps do better than Linux? We'll see. I should also point out that this slide deck is going to cover a lot of details which may seem redundant to people who are already familiar with the design of seL4, and that could be a problem with this audience, but please bear with me if at some point I explain things that you already understand. So, the Hare programming language: this is the pitch from the website. I won't read it out to you, but essentially it's a very simple language which is very close to C in terms of design, but with the benefit of 50 years of hindsight about what could be made better about C. Compared to other C alternatives that are floating around today, like Rust and Zig and Nim and so on, I would say Hare is much, much closer to C's original ideas than any of these other attempts, but it improves in many respects, like modules, error handling, bounds checking, and some safety features.
So it improves in a number of respects. It's also very, very simple. Here we have some more line counts for people who like line counts. The Hare compiler is 18,000 lines of C11. The backend is not LLVM; we use qbe as our backend, which is another 12,000 lines of C99, and then we use binutils for the linker and assembler, and that's it. We support three targets, x86_64, aarch64, and riscv64, and it's no coincidence that those are the targets I'm working on for the microkernel. We intend to add more, but this is where the language is at. Again, I started Hare specifically to work on this kind of project, and this project exists to validate the language design for this use case, and also because it's fun and maybe it could be useful. For those who have never seen any Hare code before, I have a little snippet here so you can get a vague idea of what it looks like. I'm not going to explain this in too much detail, but if you're familiar with C, a lot of things will seem familiar to you. You can probably guess that the double colon does namespaces. You can guess what the noreturn tag does. It's a fairly straightforward programming language, and this is what it looks like. The code sample we're looking at here is the portable kernel entry point; execution starts with the bootloader entry point and then the arch-specific entry point, and this is the first line of portable code that runs when you boot the kernel. With the context out of the way, let's talk about Helios itself. We're going to go over a number of things here with respect to the design of Helios and the implementation of Helios. I'm going to talk about our approach to capabilities and memory management, some specifics of how various capabilities like processes and threads actually work, and inter-process communication, and then I'll also talk a little bit about the implementation, not just the design.
Here's the big picture in terms of the implementation. Again, those who are familiar with seL4 will find no surprises on this slide, but essentially, access to all system resources, including kernel objects, is semantically governed by user space, and we use the MMU to isolate user space processes from each other and to enforce this capability model. On system boot, the kernel enumerates all of the resources on the system, all of the memory, all of the I/O ports, and all of the IRQs, and it prepares capabilities that entitle the bearer to access these resources. Then it hands all of these off to the first process, the init process, which can then, subject to its own user space policy decisions, choose how to allocate those resources to various processes in such a way that it can provide a secure system. Here's a look at our capabilities. On the left is an example of a fake physical address space, and the right shows the kind of state that we'd be storing in it. Here we have a number of physical pages: one for a capability space, one for a virtual address space, one for a task, a bunch of memory pages, some free memory, and so on. In this physical memory we store the state you see on the right. The cspace here stores a list of capability slots, very similar to seL4, and in those capability slots is a very small amount of state. Each of them is 64 bytes, so there's not a whole lot to store there. In this case a task, which is like a thread or a process, stores a pointer to another physical memory page where the bulk of its state really lives. Here we have an example of some registers for x86_64. Access to this state is gated behind the MMU, so only the kernel itself can directly read from this kind of physical memory. But then there's user space: maybe this process that we're looking at has semantic ownership over the cspace and this vspace.
They can invoke the kernel to do operations against those things, but they can't directly access the memory. Instead, their virtual memory can only contain certain kinds of physical memory pages, so arbitrary general purpose memory or memory-mapped I/O could end up in their address space. But while they have semantic ownership over these other capabilities, the actual state behind them is not accessible to user space. So in order to work with the capabilities that user space has semantic ownership over, it uses, of course, the syscall API. Helios has a very, very small syscall API; it is a microkernel after all. We have 14 syscalls, which I have enumerated here, and 12 of these are for working with capabilities. Again, if you're familiar with seL4, there are probably no surprises here, except maybe for the poll syscall, which I'll talk about later. Here is a little example of how you might invoke a capability on x86 to make use of the microkernel's API. You're going to be making a syscall at the end, and before that we fill up registers and memory buffers with the information we want to use. This code invokes the vspace map operation, which accepts a page capability, a virtual address, and a list of mapping flags, like write or execute. Its goal is to map a page of physical memory into a slot in a virtual address space. In order to invoke this operation, the caller needs to have access to a vspace capability, which they're going to modify, and a page capability, which they're going to map. These capabilities are provided in two different ways. The object being invoked is the vspace, and it gets its own register, RDI, which is the first ABI register. The page which is being used, again similarly to seL4, is placed into the process's IPC buffer, which is done here with a fake capability address for the page.
And then we have additional arguments like the message tag, which contains the operation name, the number of capabilities, and the number of parameters, and then any additional arguments to the function. You run syscall, and the operation happens. I also want to talk a little bit about the specifics of inter-process communication. We have two approaches, and I'll first look at endpoints, which are kind of a generalized form of IPC. The way you use them is very similar to how you use kernel objects; in fact, the interface is uniform. An endpoint can send a set of registers or a set of capabilities between tasks, so one task can transfer a capability to another task. Endpoints are synchronous, so calling send or receive on an endpoint will block until the two tasks rendezvous. If there are many senders or many receivers, then the one who has been blocked the longest will wake up, so you can have many processes, maybe doing some kind of load balancing of IPC operations. Also, seL4-style call and reply is supported, so if one task does a call rather than a send, it immediately blocks waiting for the reply, which is guaranteed to go back to the same thread. I have here a more detailed example of exactly how that kind of IPC interaction looks on Helios. On the left is task one and on the right is task two, which want to communicate with each other. The text in black takes place in user space, and the text in red takes place in kernel space. Let's say task two is a daemon of some kind which wants to provide a service, so it's going to have its main I/O loop call the receive syscall and then block until somebody has work for it to do. Task one wants to be a consumer of that interface, so it will invoke the call syscall, and the kernel will notice that task two is blocked waiting for somebody to call it.
And so the kernel will perform the copy of registers, move any capabilities as necessary, unblock task two, and then block task one while it waits for task two to process the message and prepare a reply, which is what happens next over here. The syscall returns in task two, it processes the IPC request according to however it implements its service, and it calls the reply syscall, which copies the reply registers back to task one, very similar to this fourth step, and then unblocks task one. Then both of them can proceed onwards with whatever CPU time they're given. Another interesting feature we have in terms of endpoints, which is one of the things that distinguishes Helios from seL4, is support for a poll-like interface. Similar to Unix's poll on file descriptors, Helios lets you poll on capabilities. This is an example from the serial driver that I implemented for the standard x86 COM ports. It has two capabilities it mainly cares about. It has an endpoint capability that it uses to implement its API for consumers of the serial port, so if you want to request a read or a write from serial, you send a message to this endpoint. And it has an IRQ handler for when the serial port says it's ready to receive or transmit more data. You can prepare a list of capabilities you're interested in and a list of events you're interested in, and block on poll. When one of those is ready, you can call it, and it's guaranteed not to block, very similar to the Unix poll syscall. Again, I think this is, for me, one of the more notable improvements on and deviations from the seL4 model. And I mentioned this earlier, but this interface for using endpoints and for invoking kernel objects like virtual address spaces is uniform between user space endpoints and kernel objects.
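The rendezvous and poll behavior described above can be modeled in a few lines. This is a simplified sketch, not the Helios implementation or its API: the class, method, and event names are hypothetical, blocking is represented by returning a "blocked" result instead of suspending a task, and capability transfer is left out.

```python
from collections import deque


class Endpoint:
    """Toy synchronous endpoint: a transfer happens only when both sides meet."""

    def __init__(self):
        self.senders = deque()     # (task, message) pairs blocked in send
        self.receivers = deque()   # tasks blocked in receive

    def send(self, task, message):
        if self.receivers:
            # The receiver that has been blocked the longest is woken first.
            receiver = self.receivers.popleft()
            return ("delivered", receiver, message)
        self.senders.append((task, message))
        return ("blocked", task, None)

    def recv(self, task):
        if self.senders:
            sender, message = self.senders.popleft()
            return ("delivered", sender, message)
        self.receivers.append(task)
        return ("blocked", task, None)


def poll(interests):
    """Return indexes of (endpoint, event) pairs that would not block.

    A real poll would suspend the caller until something is ready;
    this model simply returns an empty list in that case.
    """
    ready = []
    for index, (endpoint, event) in enumerate(interests):
        if event == "recv" and endpoint.senders:
            ready.append(index)
        elif event == "send" and endpoint.receivers:
            ready.append(index)
    return ready


serial = Endpoint()
print(serial.recv("driver-a"))          # nobody is sending yet
print(serial.recv("driver-b"))
print(serial.send("client", "write"))   # goes to driver-a, blocked longest
print(poll([(serial, "recv"), (serial, "send")]))
```

In this model a task that polls for "send" readiness and then sends is guaranteed to find a waiting receiver, which mirrors the non-blocking guarantee mentioned in the talk.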
So it is, for example, possible for a user space process to create a set of endpoints and then use them to implement an API which is identical to the kernel API. If that process is the parent of some other process which thinks it's talking directly to the kernel, the child can be sandboxed according to whatever kind of policy you want. The kernel is using an API which is uniform with the way that user space communicates with itself, and thus user space can fill the role of the kernel in some cases. This can, for example, allow you to very easily run several different Helios systems on the same computer at once without resorting to virtualization, which is kind of interesting. Here I have a little bit more detail on capabilities in particular, and on the implementation that some of our capability objects use. On the left we have a capability space, which is, again, a little bit distinct from seL4. We don't use guarded page tables. It's more like a file descriptor table: it just goes from zero to however many slots are allocated in that capability space, and the process invokes a capability by its number, not by its address. Here we have an example of slots where a number of things are preallocated, but notably we also have some empty capability slots. Another deviation from the seL4 model is that we support capability allocation in the kernel. We do this by maintaining a free list inside of the empty capability slots. So when you invoke an endpoint or you want to allocate a capability, you can set the capability address to the maximum possible address, and the kernel will allocate one for you using the free list. You don't have to worry about that state in user space, which is, I think, a very nice convenience to have and very easy to implement as well. This is a list of the capabilities that we have implemented. On the left here is a list of all the capabilities which are available on every architecture.
We have things like memory, device memory, IPC capabilities, threads, and so on. On the right, we have a number of additional capabilities which are specific to each port. In this case, I've listed the capabilities which are used on x86_64. I'm going to look at just a few of these. First I want to talk about memory management. Very similar to how we do capability allocation for the cspace in the kernel using a free list, we also deviate from seL4 in that general purpose memory uses a free list as well, so you can allocate pages without trying to keep track of a watermark, reset your watermark, or divide it into smaller objects. We have a free list of pages, so you can just allocate pages, which is quite nice. The only reason this slide is here is to tell you how it differs from seL4. We also have address space capabilities, vspaces, which are, again, similar to seL4. In fact, they're so similar that we've cargo-culted the constraint that you can't share page tables. I don't really know why seL4 does that, but once we understand, we will probably either commit to this or change our mind. But we have virtual address space capabilities which can be used to manage processes. And then we have tasks, which can be either a thread or a process, or something else if you come up with something creative. Essentially, a task has a capability space, which is optional, so that it can do I/O and invoke capabilities; it has an address space; and it receives some CPU time when it is configured appropriately. Again, we don't have SMP support yet. We would like to do that soon. For now the scheduler is very simple. We just have a round-robin scheduler, but we would like to expand that in the future. It should be easy enough to at least add priorities or niceness, and we can probably look into some more sophisticated solutions a little bit later. Oh, and I missed one in my notes here.
A quick note to add on the topic of address spaces is that I think ours is implemented a little bit more elegantly than seL4's, but I did not write down why in my notes, so you'll have to take my word for it. OK, so that's enough about the design. I'd like to talk a little bit about the implementation. The goal is to keep the kernel very straightforward when it comes to booting. I don't really care for the never-ending nightmare which is the different ways of booting up computers. So the kernel is an ELF executable, and the bootloader's job is to do whatever crazy bullshit is required on whatever platform it's running on to just load a goddamn ELF executable into memory. So that's what we've done. These bootloaders are also implemented in Hare, by the way. We support, again, multiboot on x86 and EFI on aarch64, and we'll do EFI everywhere soon. The bootloader comes up, and it's responsible for a few things. It has to, of course, read the memory map. It also has to load any boot modules from the file system, similar to an initramfs on Linux: it pulls out the init executable for the first user space process to run, as well as maybe a tarball that the init binary wants to use to read some early drivers from. It then provides to the kernel that memory map, those boot modules, and details about the loaded kernel, like where it was placed in physical memory and so on. If we're booting with EFI, we pass along some information about the EFI runtime services. And if we have a framebuffer at this stage, thanks to GOP or multiboot, we pass that along as well, and that will eventually make its way to user space. After boot, we have system initialization. You saw a little bit of this in the earlier slide, which showed the code sample of the kernel's portable entry point. That's where the system initialization begins.
So of the three phases of the kernel runtime, we have the boot phase, the system initialization phase, and the runtime phase. During sysinit, the purpose is to do something I hinted at earlier, which is to enumerate all of the system resources, create capabilities for them, and then assign them to the init process, which is, again, just an ELF executable. So we pull that ELF executable in from the boot modules. The kernel has a simple loader, which pulls it into memory. It enumerates system resources, creates enough capabilities to host a task and a VSpace and so on for that initial executable, and hands it off. The basic problem at this stage is not messing up memory. Everybody who has written a kernel from scratch, maybe as opposed to approaching a project later, knows that the hardest thing about memory management is that you need memory to manage memory. So there's a lot of stuff in this stage to deal with that. We also tried to enumerate resources on the system at this stage, but this is actually going to change soon. The kernel, at the time of speaking, has a PCI driver for x86 or a device tree scanner for ARM, and we kind of tried to enumerate the physical address of everything, but this is not a good idea, so we're just going to take all physical memory and give it to user space and let it use policy decisions to figure out who gets what, rather than trying to enumerate everything in the kernel, just to keep the kernel smaller. And we definitely don't want to do ACPI, so please, please, if anybody here is on the RISC-V board or something, I'm begging you: no ACPI, device trees. And then finally, we jump to user space, and that concludes sysinit. Speaking of user space, I want to talk a little bit about our future plans. So here we have kind of the onion of Ares, is what it's called. Helios is the kernel at the core of this dream of a larger operating system called the Ares operating system.
And we want to wrap the kernel with various layers to add more functionality. So we have Helios as the kernel. We've also started working on Mercury, which is a framework for writing drivers. It's basically a user space interface to the kernel API which you can use for drivers, plus some useful functionality like utilities to make it easier to map memory and so on. And then this Mercury system is used by Venus, which is a collection of drivers, real-world drivers for actual hardware. At the time of this talk, Helios exists, Mercury mostly exists, and Venus was just started last week. Gaia is going to be the next thing that we're going to do on top of this. So through Mercury and Venus, we'll get this kind of abstract view of the devices on the system, presented through IPC capabilities. And then this will be consumed by Gaia and formed into a cohesive user space, which is going to essentially be Unix, but everything is not a file. Everything is a capability. So you open /dev/fb and you get a framebuffer capability rather than an I/O object like you might see on Unix. And furthermore, the design of Gaia is going to be mostly a combination of inspirations from Unix and Plan 9. And on top of this, we'll add a POSIX compatibility layer, because Gaia is a chance to leave behind the legacy of POSIX, but the legacy of POSIX is strong, so we'll have to accommodate it somehow. And we'll tie all of this up into an operating system called Ares. So we're going to go to user space and we're going to build this stuff there. One other thing I want to show off is something which is part of the Mercury system, which is our DSL for defining IPC interfaces. We were thinking about not doing a DSL, but DSLs are kind of good for this use case, so we made one. This is an example of a serial device. It has support for configuring the baud rate and stop bits and parity and so on. And it implements the I/O device interface, because it supports read and write as well.
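The slide with the DSL is not reproduced in this transcript, so the following is only a sketch of the relationship it describes: a serial device interface that adds configuration (baud rate, stop bits, parity) on top of a generic I/O device interface with read and write. It uses Python abstract classes rather than the Mercury DSL or generated Hare code, and all names here (IODevice, SerialDevice, LoopbackSerial, Parity) are invented for the illustration.

```python
from abc import ABC, abstractmethod
from enum import Enum


class Parity(Enum):
    NONE = 0
    ODD = 1
    EVEN = 2


class IODevice(ABC):
    """Generic interface: anything that can be read from and written to."""

    @abstractmethod
    def read(self, n: int) -> bytes: ...

    @abstractmethod
    def write(self, data: bytes) -> int: ...


class SerialDevice(IODevice):
    """A serial port is an I/O device plus line configuration."""

    @abstractmethod
    def configure(self, baud: int, stop_bits: int, parity: Parity) -> None: ...


class LoopbackSerial(SerialDevice):
    """Toy implementation: whatever is written can be read back."""

    def __init__(self):
        self.baud = 115200
        self.stop_bits = 1
        self.parity = Parity.NONE
        self._buf = bytearray()

    def configure(self, baud, stop_bits, parity):
        if stop_bits not in (1, 2):
            raise ValueError("stop_bits must be 1 or 2")
        self.baud, self.stop_bits, self.parity = baud, stop_bits, parity

    def read(self, n):
        out = bytes(self._buf[:n])
        del self._buf[:n]
        return out

    def write(self, data):
        self._buf.extend(data)
        return len(data)


dev = LoopbackSerial()
dev.configure(9600, 1, Parity.EVEN)
written = dev.write(b"hello")
head = dev.read(2)
rest = dev.read(10)
```

A client that only knows about IODevice can still use the serial port for read and write, which is the benefit of having the serial interface implement the generic one.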
We have a tool called ipcgen, which reads this DSL and generates Hare code for it. And this is now mostly working, but we're going to start actually writing more real drivers with it soon, so it remains to be seen if we'll like it after we use it for a while. So does Helios work? And the answer is self-evidently yes, because this slide deck is being presented from this Raspberry Pi, which is running Helios right now. Thank you. So I have no C code running on this device beyond the point of EDK2. It has EDK2 to provide UEFI, but once EDK2 hands over to our EFI bootloader, from that point forward it's 100% Hare and assembly, just a little bit of assembly. This port to ARM64 was accomplished over the past eight weeks. Actually it took exactly 42 days to port the kernel from x86_64 to aarch64. This system has a simple driver for the Raspberry Pi GPU running in user space to drive the projector. And it has a serial port driver, which I'm connected to on my laptop here, to switch between slides, because I could not write a USB driver in eight weeks. The slide deck itself is encoded as QOI (Quite OK Image) images, which are packed into a tarball and dropped in like an initramfs would be. And there really are very few hacks. I would say that this is a pretty complete port of the kernel with very few shortcuts or problems. The reason why I chose to port the kernel to ARM in 42 days is because I was originally going to give this talk from a laptop running Helios for x86_64, where I was going to drive the projector through Intel HD Graphics, and then I read the Intel HD Graphics manuals and I decided it would be much easier to port the entire kernel to ARM and write an ARM GPU driver. So that's what I did. After about two days of reading the HD Graphics manuals, I was like, I've had enough. And then I pulled down the ARM manual and tried to find a PDF reader which could handle it.
In terms of those hacks and shortcuts, there are no SoC-specific builds, so the same kernel that I wrote will boot on any ARM device with a standard EFI configuration and a device tree. It's not Raspberry Pi specific. The user space is Raspberry Pi specific. It's actually Raspberry Pi 4 specific, because that's the one I have, just because I didn't feel like doing device tree parsing in user space for the sake of a silly demo. But all of the silly demo code aside, the stuff that's necessary to make this talk work is maybe a little bit hacky and Raspberry Pi specific, but the kernel port is a genuine port which is basically feature complete. I think the only hack that's in place relates to what I said earlier: that the kernel tries to enumerate the device tree to find physical memory for devices to provide to user space through device memory capabilities, and that this was a bad idea. I was right that it was a bad idea, but there is a little bit of a hack in the kernel in that I just gave all physical memory on the Raspberry Pi to user space without really much critical thought. That's really the only hack. The full, complete, it's-done port will correct that oversight by using the EFI memory map to find memory which is less stupid to just blithely give to user space. Additionally, I will confess that I don't have support for IRQs in user space, so if I put my finger on the heat sink here, it kind of hurts, because it's just busy looping in user space while it waits for the next slide. I did get that working before FOSDEM. I just didn't incorporate it into the loadout for running the slide deck, so yeah. It's a good thing that it's not that hot in here or this would crash. In total, Helios has been developed in nine months. The ARM port was done in eight weeks, and it's sophisticated enough to run this slide deck, which is pretty cool, but what's left to do? The kernel is mostly done, and by done, I mean feature complete, but not necessarily where we want it to be.
So, by feature complete, I mean the kernel API is complete, and you can write programs against it which do everything we want them to do, and other improvements won't necessarily affect that API. It still needs to be polished in a number of places; that device tree issue that I mentioned is one case. If you git grep through the code base, you'll find about a hundred TODO comments which we need to address. One of the more challenging things that we're going to have to do is SMP support, but again, the kernel is a total of like 15,000 lines of code, so despite the boogeyman that SMP often appears to be to most kernel hackers, I imagine that it won't be that difficult for us to do, which could be famous last words, but we'll see. I also want to port it to RISC-V. I have gotten some Hare code running on RISC-V at the supervisor level, thanks to the efforts of one of the Hare contributors. We did do some basic osdev research, but we haven't actually ported Helios itself to RISC-V. We'll do that soon. I also mentioned that we're going to work on more options for the bootloaders, so we're going to try to get EFI going everywhere. The main blocker for EFI on x86_64, for example, is that our programming language, which again is Hare, which is developed almost alongside this project, doesn't have support for PIC on x86, and it would be a little bit of a nightmare to do runtime relocations of the bootloader in assembly or something of that nature, so we're going to do PIC first before we try to do any EFI bootloader for x86. But we do have PIC for ARM, so that's already working. Then I also want to improve the docs.
I spent the past few weeks, in between hacking on the ARM kernel, improving the documentation on the website, ares-os.org, which there will be a link to in a later slide, and which is probably now about 60% complete. So if you're curious about the project and maybe you want to try your hand at a little driver in user space, feel free to check that out, and wherever you find a page which is a stub, and you need to know what it should say, you can come to IRC and ask about it, and we'll fill it in. After the kernel is polished, actually alongside the kernel polish, we're going to user space, where I explained a lot of the things that we were looking at. Mercury again mostly exists, and Venus is just getting started. Prior to the introduction of Venus, we did have a number of drivers that were built for the purpose of testing the kernel and testing Mercury and so on. We obviously have a serial driver for x86, we have a PL011 serial driver for ARM, but we've also done things like e1000 networking, we did send pings with that, we did virtio block devices and a couple of other simple drivers, just as we were working to prove the design of the support code for drivers. But the support code is mostly done, so now we're going to start writing drivers for real. I need to provide some acknowledgments here to people who helped make Helios work. I mentioned earlier that there's a RISC-V kernel by Alexey Yerin; Ember Sawady also did some work on x86 for early kernel attempts. This was really useful stuff. Very little of the code that came from these efforts made it into Helios, and these projects never really developed into full systems; I don't think either of them made it to user space.
But they were still very useful for proving out some ideas about how to actually do basic kernel hacking in Hare: how do we boot up the system, how do we work from ring zero, how do we configure the MMU, how do we do yada-yada-yada, deal with interrupts and so on, how do we link it properly, all questions that had not been answered previously within the context of the Hare programming language. So this work was definitely instrumental in setting the field upon which Helios could be built, so big thanks to these guys. Also thanks to the Hare community. There's almost 80 people now, actually I think it's more than 80 now, but there's around 80 people who have been working on the programming language itself. Again, we've been working on it for about three years now, and the Helios project of course would not be possible without the programming language that it's written in, so a huge shout-out to the Hare community for everything they've done. I'm very proud of that community. I also want to thank the #osdev community on Libera Chat. Hands up if any of you are in this channel. Yeah, so #osdev on Libera Chat is the single best place to learn about kernel hacking on the entire internet. Those guys are so smart and helpful and they're very knowledgeable; they know everything about kernel hacking and drivers and anything you want to know. If you want to mess with kernels, go talk to these people. And also of course to seL4. As I'm sure you noticed, we took a whole bunch of ideas from seL4. I think seL4 has got a really cool kernel design, and I was really happy to learn about it and apply a lot of its ideas to my own kernel. So kernel hacking is fun, Hare is fun, that's all I have. Have fun, that's it. Thank you so much, Drew. Any questions from the audience? Martin. I'm going to give the microphone to Martin first and then we're going to get to you. Hi, thanks for the talk, very interesting work.
I was unable to map the standard seL4 capability derivation tree, like starting with untyped memory and then retyping that, onto your slides, so do you have it as well, or not? Yeah, we do track capability derivations in a manner very similar to seL4, and it really would have been smart for me to put in a slide about that. So thanks for the clarification. And the second, well, I have a hard time formulating it as a question, so maybe just take it as unsolicited advice. Many of the design decisions of seL4 were strictly motivated by the formal verification target. So when you spoke, for example, about not sharing the page tables, etc., just give it a thought that the reason for that might be that they did not want to make their life harder regarding the formal verification, and that might be the only reason. Yeah, I think I've noticed that for a few other implementation details from seL4 when we were studying the kernel to learn what we could do for ours. And with examples like sharing page tables, I had bigger fish to fry, so I left a comment which says seL4 doesn't support copying these and I would rather not run into a heisenbug because we did it without thinking about it, you know, and then we can really address it at some point in the future. Thanks. Hi, yeah, thanks for the talk, very interesting, quite impressive. Yesterday we were talking about Hare, and thinking about it in retrospect, it seemed to me, in my personal opinion, that the great mechanisms of language design are mostly discovered, like we have garbage collection and tagged unions and so on. Assuming you agree with that statement, now that you've written an operating system kernel, would you also say that the great mechanisms of how to write a kernel are established and well known, things like paging? Or are there still areas to experiment with new ways to do memory management, things like that?
Interesting question. I would say that there are a lot of concepts and ideas which can be applied in the field of OS development which are understood, and you can see examples of kernels and systems which apply these ideas in various different designs that you can learn from and study and maybe apply to your own kernel if they're the right choice. And you can make a complete kernel which is interesting basically only using proven ideas, which for the most part describes Helios. But at the same time there's certainly all kinds of research being done into more novel approaches. There have been talks in this room throughout the day which address some of those novel approaches and ideas. So I would say that there is certainly room to build a kernel out of understood ideas and still make it interesting, but there's also definitely an active frontier of research ongoing as well. Thank you. Thank you so much. Any other questions? Yeah, please. Yeah, thank you for the talk. You mentioned that you need position-independent code, right? Yeah. But I don't understand: if every driver is just a user space process, can't you just remap that in the MMU so that all the processes, like a normal Linux process, just have the same memory map? Yeah, actually the kernel and user space processes both use a fixed memory map. The thing where we would maybe want to look at position-independent code is specifically for the case of loading our EFI bootloader as a PE32+ executable. After that stage it's all fixed memory addresses. Yeah, then I understand. Cool. Hello, thank you for the talk. Can I ask a non-technical question? You've GPLv3'd it. How are you making decisions around the kernel? Like, is it a benevolent dictator? Are you making the decisions, or are you having massive conversations about things? How's that looking at the moment? The vast majority of the work is just done by me personally at the moment.
The project is still pretty early on, but we do have a number of other contributors. I would say that the group of people who ought to be consulted on changes is probably in the ballpark of five people in total. So we just have a fairly informal community based on trust. We try to be transparent, like I am transparent in all of my free software, so we have a public IRC channel where we have these discussions and anybody can jump in at any time, and there's a patch review process which just goes through me at the moment. In the Hare community, for example, which is a lot bigger at this stage, we have something more of a governance model, where there's less of a BDFL and more of multiple maintainers who all can do code reviews and approve patches and things like this. As the Helios project grows, I imagine it will adopt a model similar to Hare, and perhaps as Hare grows, we'll have to improve upon the model even further. But we'll see. Thank you. Nice shirt, by the way. Thank you. Can you pass it on? Yeah. Thanks. You mentioned that you don't want to deal with ACPI, but at the same time you want to make UEFI standard, so what's the plan there? Is there any way to sort of port ACPI to your system? Because I imagine that it will become mandatory, right? Yeah. I'm going to wail and gnash my teeth and hope it doesn't happen in practice, because at the moment, you know, something like EDK2... to be clear, by the way, I don't give a fuck about non-free firmware, so I'm thinking about EDK2 and things like that, or U-Boot and so on, where there's already an EFI standard UUID for passing a device tree along, and they can be configured to do that instead of ACPI, which is what I'm doing on this Raspberry Pi, for example, and what I hope to continue to do on RISC-V and so on. Our proof of concept on RISC-V took the same approach.
But there's very little in the kernel that actually needs to be concerned with ACPI versus device trees. Again, it is a microkernel, so in the long term, we might just pass along the... at least a little bit on x86, you know, because there are no device trees. But you know, I've got my fingers in my ears, not thinking about the fact that ACPI is upon us, but we'll probably have to deal with it at some point. Thank you for the presentation. Which software is running the presentation itself, and how is it compiled? This is just a custom piece of software I wrote myself. It's a single binary which is loaded by the bootloader as the init process, and then the kernel loads it into an address space and boots it up as PID 1. And there's additionally a tarball as a second boot module, which is, again, loaded into memory and then passed along to PID 1, and which is a tarball full of slides. And then there's just one statically linked executable which contains a serial driver and a GPU driver and the code to glue everything together. The code, by the way, is available on SourceHut if you're curious. As you mentioned, Helios is heavily inspired by seL4, so is there any plan for formal verification of Helios, or is that not something you're interested in? No, I'm not particularly interested in that. In the back, there's someone. Thanks for the presentation. I have a question: is it on the roadmap that something like Weston or another GUI server or other service like that could potentially be ported to Helios? I'm actually also the original author of wlroots and have a lot of experience with Wayland, and so there is a 100% chance that Wayland will be running on Helios at some point. Any other questions? As you said, for Gaia, you are inspired by Plan 9 and Unix. What do you plan for Gaia? What's the best of both of those worlds? It's a little bit hard to say.
At this point, there's less plans and more vision in that respect, because we have at least probably a year of work before we can really start serious work on Gaia. But I will say that there is a lot of stuff I admire about Plan 9. Per-process namespaces is one great idea. I'm also going to go further with the idea of there not being any kind of global file system at all. We're also going to take a look at things like using text-based protocols where possible, and, differently from Plan 9, we're going to use this IPC generation thing for places where text protocols maybe don't make sense. I also have a lot of admiration for things like ndb on Plan 9, and so I would like to organize networks perhaps in a similar fashion. Also, I would say that the bigger vision for the whole Ares system is, you can almost say, correcting a mistake that Plan 9 made, which is that Plan 9 was correct that distributed computing is the future, but it was incorrect that it would be distributed across a mainframe and a bunch of thin clients in an office building, which is how Plan 9 was designed. In fact, the group of devices which should be running a uniform operating system is all of your personal devices: my laptop, my workstation at home, my phones, they should all present as a single system. It's a very vague and lofty and long-term vision, but I would like to try and achieve that with the design of Gaia and Ares. Thank you. Any other questions? Okay, then thank you so much, Drew. Thanks a lot. So, 10 minutes break until the next talk. |
Unikraft Weather Report
The Unikraft Project During 2022 |
Okay. So, hi everyone. I'm here with the yearly, let's say, Unikraft Weather Report. It's mostly a community update of what we've been up to in the past year, and then I'm going to let Simon cover the technical side. As an intro, there's a lot of swag here, so if you want some Unikraft items, please get them from over there. So if you don't know what Unikraft is, Unikraft is a unikernel where the focus is on specialization. We call it extreme specialization: being able to customize it for your own needs. It has a build system and a configuration system that make it suitable for your needs. Now, given this is the Weather Report, I'm going to give a political statement, and I'm going to say the state of the community is strong, right? So we're doing very well. It's going really well, and just to give you some items, this is what's been happening during the past year. So you can see this is where we were last year, and now we're over here. It was 1200 about one week ago; I think we reached that limit. Some more metrics are there. I don't have the exact numbers from last year, I didn't keep good track at each FOSDEM, but this is approximately where we were last year in terms of GitHub stars and Discord users. We had five releases. We do a release roughly every two months now, with constant pressure from Michalis and the others. Michalis is here, so I have to get on here. We're going to expand it to three months. It's kind of a lot of release pressure. We are having a lot of hackathons. There were three live hackathons, and you can also see the number of committers that are going over there. As for hackathons, these are events where we evangelize Unikraft and unikernels, and get people more in tune with OS topics and open source. These have been happening during the past year. We had three, in Lyon, Aachen, and Munich, and there's also a yearly online hackathon. We call it Unikraft Summer of Code. This is for everyone in the world.
It takes about two weeks. It's four hours per day, where we use Discord and a lot of hacking challenges to get people to know more about unikernels and Unikraft. These are some pictures. I think I see some people over here. Martin is here. Martin and Stefan. This is Stefan. This is Martin. This is Stefan, with the white hair. Manuel. Manuel is here. You can see the tall, thin guy. He's over here as well. I think you're this one. This is Victor, right? This is at our university. I'm not sure if Yakov from Sartura is here. These were the Sartura guys that we met in Munich. These are all parts of what happened in the hackathons. Since we are here, we need to do the proper praising. Many thanks to Michalis. Anastasius is over here as well. Thank you, Babis. I'm not sure if George is here. I'm not sure. Thank you so much for doing this. They're taking us on a tour of the classical era with the hackathon in Athens in late March. If you want to enjoy lovely weather, good food, and of course, unikernel hacking, please join us. All the information is available on the unikraft.org website. You'll find the information, the form to enroll, and all the items like that. Join us there for two days of hacking. Also, during this time, you see those people here. Actually, even Michalis is here; he managed to climb it. He's over here. This was a bunch of us. Yeah, there were the lazy people that stayed at the hotel. This was our first Unikraft meetup. It happened in Romania in October last year. We're going to, of course, make it happen again. Because Alex and Michalis are here, and the jury is still debating who is going to win it. But we have the papanasi they loved so much. It's still up for debate who loves them more. It's a continuous battle there. We're going to see. Maybe next year we'll make a decision on that. As a final point, as we always discuss, and we're very serious about it: what happens in Sinaia stays in Sinaia. No pictures, nothing.
We know what happened, but no one is going to see those pictures. Getting a bit more on the serious side, what we've been doing to get this happening is student engagement. As I'm also part of a university and also collaborating with other people from universities, you saw Lyon, Aachen, Munich, Athens. We talk to a lot of people in universities. There are going to be hackathons in Portugal. We are trying to do something in Amsterdam; I talked to Manuel about that. There's something in Canada. So we talk to a lot of universities. We do a lot of student engagement via projects. We do mentorship. We have kind of one-on-one discussions with someone, we get them into the project, and that's also part of Google Summer of Code, but not only that, and that gets more people into the community. Also, the hackathons: I mean, Martin is basically also involved in the Unikraft community, apart from RustyHermit, because of the hackathon where we were there, and we do challenges just to get people involved. We are a fairly welcoming community. We have a lot of beginner items, so if you just want to do something, if you want to do a project, we have plenty of items, and we have a lot of mentorship bandwidth to help you get into the OS world. What we've also been doing, and this is part of maturing the project and the community, is that we started creating some sort of hierarchy. So we started creating roles; we have different roles for different topics. There's also what we call some sort of community leadership, with people working as PR manager, release manager, documentation manager, governance, CI/CD, so different people doing different things just to make sure everything is working properly. There are a lot of meetings. If you go to the calendar and if you see what's happening on Discord, a lot of things are happening. We discuss different topics, be it Rust, security, app compatibility, KraftKit. Every topic has its own periodic meeting, usually a weekly meeting.
There are some other high-level meetings. We keep a meeting summary of them. You can find the meeting summaries on Discord. Everything is kind of open, aiming to get people aware and involved and to make things transparent. During the past year, with help from Alex and Stefan, Stefan Jumarea, we've been improving the documentation greatly. There are still some things to do, but it's very good now. You get a lot of information on how you can use Unikraft. What's happened technically? These are the major features that you may have heard about from other speakers. These have been happening during the past year: SMP support, virtual memory support, an internal metrics system done by Cezar. There's a bit that requires Simon's attention. musl support was integrated in the previous release. Binary compatibility, which Simon is going to talk about, is finally in, as of today, in the release. So we've been working on that, but now it's going to be official. The app-elfloader application is out there. With the help from Michalis, you see a lot of ARM security features, and there's also an OSS roadmap, a HackMD.io document and a GitHub issue with the OSS roadmap giving all the items. Actually, this should be FOSDEM 22, right? So sorry. This is the slide for next year, but with some other items. And we are almost there. So my estimate is that the next release, happening in late April, is going to feature all of this. So the tooling upgrade for KraftKit, this is what Alex and Cezar have been working on; Rust support, with great help from Martin, who's been leading this; integration of SMP support and synchronization, making sure it's going to be really used inside Unikraft. Integration testing: there is a bit of work to be done there. We need to do a lot of testing. We need to automate it. It's going to be part of the CI/CD system. There are a lot of security features we plan to do. These were mostly led by Michalis on the ARM side.
I saw the discussion on NOVA about Intel CET and shadow stacks. We have work done over there. So the estimate is that around March we're going to integrate that. And then, yeah, ASLR and data execution prevention, now with paging support, should also be available. RISC-V support is basically implemented, but with a lead from Michalis, we are doing a platform re-architecting discussion. And after that is complete, RISC-V support, together with support for other VMMs and hypervisors, VMware, Hyper-V, Firecracker, is going to be upstreamed. Well, what I've seen, and this is kind of more on the community management side, items that we are facing as challenges in the community and that we are kind of adapting to: one is making the product easy to use. There's a bit of struggle on that. Once we fully transition to a mature KraftKit, our companion tool, that should make things a lot easier. Basically, just kind of push a button to make things happen. Now that we have musl support, and now with binary compatibility support, my expectation is that new applications are going to be easy to port and then improve. It's important to also attract experienced contributors. So we are very welcoming to newcomers. We do, however, of course, rely on experienced developers and contributors to drive the major items. So that's been mostly happening in the core team with Simon, Mark, Marco and Michalis. We want to get more people to work on that. Also, we are working on improving code quality. There is a discussion we are having, and I'm hoping that in the next month or so, we're going to finish up the coding guidelines, making sure we follow a consistent coding style throughout the source code, and establishing those good practices in the community. As a kind of call for contributions: come to the monkey side, we have a lot of items for everyone. So, of course, there's the core OS part. If you are fond of C, low-level assembly language, virtualization, there's work to be done there.
If you want to work on tooling, Alex is driving that, so Go, containers, Kubernetes, CI/CD, you can join that. Language support: as I said, with Martin we have Rust support, and there are all the others, so there is work being done. We want to expand the ecosystem of applications and languages, so any language you want, that can be done there. There was actually a discussion with someone who was working on some JSON toolkit, if I recall, so that's going to be in as well. So whatever preference you have for a library, a programming language or an application, it's going to be there. And then, of course, performance, benchmarking, doing measurements: there are plenty of items to work on there. So, we welcome you. Please join. You can look on Discord, you can look at the beginner tasks, you can join the hackathon and find out more. And finally, this is something I also discussed briefly last evening, for those of you who still remember. It was a bit late, after some beer consumption had occurred. I would like to start a unikernel discussion group to share knowledge, to share the code base, to see what we are all doing. I think that will improve the unikernel ecosystem a lot and improve each project. For example, we've been using tons of features from OSv. There's quite a bit of code that was, as you might know, borrowed from Stefan and from Pierre, at least on the back of the website and some others. So there are plenty of things that we can do if we are together, and that will also help promote unikernels more in the OSS community and in academia, and also create products from them. So I'm going to contact you guys to see how we can actually make this happen. I don't have a clear goal. I'm more of a process type of person: let's make things happen and then improve on them. But I think it's a very good time to do this now, given the cloud business ecosystem, the academic ecosystem around unikernels, and also the educational prospects that they provide.
That being said, thank you so much. As I said, the community is strong. You can find out more about Unikraft over there, and we'll wait for questions after Simon's presentation. Thank you. |
Building a Linux-compatible Unikernel
How your application runs with Unikraft |
Okay. Thank you. Thank you, Razvan. Actually, after hearing your talk, I'm considering whether I should join the Unikraft community. Sounds like fun there. It's a threshold there, Kim. Yeah, I see. Well, you don't have to test it. Okay. So, my name is Simon Kuenzer. As you just heard, I'm the lead maintainer and also the person who originally started the Unikraft project, while I was still a researcher at NEC Labs Europe. In the meantime, we spun off. We now have a startup that is also called Unikraft, the Unikraft GmbH, and I'm its CTO and co-founder. So we're building a community and a startup at the same time. First question to the room: who has used Unikraft before? I would like to know. Okay. Who has more of a theoretical background on what our key concepts in Unikraft are? Okay. So I have some background slides to bring everybody onto the same page, and then we jump directly into the binary compatibility topic, but I won't spend too much time here. Okay. This is the picture I usually start with. On the left side you see the traditional setup, where you have virtual machines and your applications running on them, so stuff that you have known for 20 years now. Then you have a setup that is more recent and more popular, which is using containers, where you basically run a host OS on your hardware and then use isolation primitives in your host kernel to separate your containers from each other. And then there are unikernels. I don't know. Is this interrupted somewhere? Okay. We think this could be a different execution environment, especially for the container setup, a kind of marriage of what you had before with virtual machines, with strong isolation and really minimal hypervisors underneath that are also much more secure, and that don't need a shared host base, which can become an attack surface.
And then you want the flexibility of containers, and this is where we think a unikernel can come in, where you build a kernel per application. The thing is, since you know the application that you run, you can also give up a lot of principles you had in standard operating systems and make simplifications, which is totally okay because it doesn't add to your attack vector. So if you say one application, you can go for a flat, single address space, because the kernel that you have underneath is just for your application and for nothing else. In Unikraft we usually build a single monolithic binary, so it's your application plus the kernel layers, and everything ends up as function calls into drivers. And you then get further benefits, first from this simple setup, but also because you know your environment: you know where you run, you know what you run, so you can specialize the kernel layers that you need underneath. You put in only the drivers that you need to run on your target hypervisor, and you build a separate image if you run that application on a different hypervisor. So floppy drivers, forget it, you don't need them. Virtio only for KVM guests; Xen netfront, for instance, only for Xen guests. And since you know the application, you know which features of the OS are needed, and that way you can also specialize the operating system from the top down to provide just what you need. This also makes us slightly different from the other unikernel projects that you may have heard of. We are for sure not the first ones, but we claim we are the ones that follow this principle the most strongly, because we built it from the beginning with specialization in mind. So nothing that we implement should ever dictate any design principles.
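As a rough illustration of the specialization idea just described (drivers chosen bottom-up by target hypervisor, OS features chosen top-down by application needs), here is a minimal Python sketch. The library names, feature names and the `select_libraries` function are hypothetical and are not Unikraft's real identifiers or build logic; the real project does this selection through its configuration and build system.

```python
# Hypothetical catalogue: driver library per platform and device class.
# Names are illustrative only, not real Unikraft library identifiers.
PLATFORM_DRIVERS = {
    "kvm": {"network": "virtio-net", "block": "virtio-blk"},
    "xen": {"network": "xen-netfront", "block": "xen-blkfront"},
}

# Hypothetical OS-level libraries an application feature pulls in.
FEATURE_LIBRARIES = {
    "network": "tcpip-stack",
    "files": "ramfs",
}


def select_libraries(platform, app_features):
    """Return the set of libraries to link for one application on one platform."""
    if platform not in PLATFORM_DRIVERS:
        raise ValueError(f"unsupported platform: {platform}")
    selected = set()
    for feature in app_features:
        # Bottom-up: only the driver for the target hypervisor, and only if used.
        driver = PLATFORM_DRIVERS[platform].get(feature)
        if driver is not None:
            selected.add(driver)
        # Top-down: only the OS features the application actually needs.
        library = FEATURE_LIBRARIES.get(feature)
        if library is not None:
            selected.add(library)
    return selected


if __name__ == "__main__":
    print(sorted(select_libraries("kvm", {"network", "files"})))
```

The point of the sketch is that the same application yields a different image per platform, and that nothing is linked unless either the platform or the application asks for it.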
The concept is: you know what you need for your application, you know what you need to run your unikernel, so we want to give you a highly customizable base where you pick, choose and configure components and specialize the kernel layers for you. That led us to this principle: everything is a micro-library. For us, even OS primitives are micro-libraries. A scheduler is a micro-library, and a specific scheduler implementation is a micro-library, so a cooperative scheduler and a scheduler that does preemptive scheduling are different libraries. Memory allocators, and also things like VFS, network stacks, the architectures, the platform support and the drivers are all libraries. And because we are also going up the stack, so are the application interfaces: everything that has to do with POSIX is split into multiple POSIX subsystem libraries. Then the Linux system call ABI, which you will see in this talk, and even language runtimes: if you run, let's say, a JavaScript unikernel, you can build it with a JS engine. The project basically consists of a Kconfig-based configuration system, a build system that is Make-based, so as not to come up with yet another build system and to make entry easy for people who are already familiar with Linux, and our library pool. To give you a rough idea of how this library pool is organized, I find this diagram nice, so let's see if this works at this point. Yeah. You don't find it this way in the repos, but we roughly divide the libraries into these categories. Here at the bottom you have the platform layer, which basically includes drivers and support for the platform you run on. Then we have this OS primitives layer. These are the libraries that implement things like a TCP/IP stack, or file systems, or something regarding scheduling, memory allocation, et cetera, et cetera.
And always keep in mind that there is, first, the opportunity for you to replace components in here, and also that we provide alternatives. So you don't need to stick with lwIP if you don't like it; you can provide your own network stack here and reuse the rest of the stack. Then we have this POSIX compatibility layer. You see things here like fdtab, which is, for instance, file descriptor handling as you know it. posix-process then covers aspects such as process IDs, process handling, et cetera, and the pthread API, of course. And then we have a libc layer, where at the moment we actually have three libcs: musl, which is becoming our main one now, and newlib, which we had in the past, to provide all the libc functionality to the application, but also to the other layers. It also provides things like memcpy, which is used all over the place. Okay. Then Linux application compatibility, which was a big topic for this release. Why do we do application compatibility? For us it's about adoption, to drive adoption, because most cloud software is developed for Linux. People are used to their software, so we don't feel comfortable asking them to use something new or to rewrite stuff from scratch. And if you provide something like Linux compatibility, you also remove obstacles to people starting to use Unikraft, because they can run their application with Unikraft. Our vision behind the project is to give seamless application support. So users tell you, I use this and that web server, and it should be like the push of a button, including some tooling that we provide, so that they can run that on Unikraft as they ran it before on Linux, and they benefit from all these nice unikernel properties, which are lower boot times, less memory consumption and also improved performance. Okay.
So now, speaking about which possibilities you have for supporting Linux compatibility: we divide compatibility into two main tracks. One track is the so-called native one, which means that we have the application sources, and we compile and link them together with the Unikraft build system. On the other side we have the binary compatibility mode, where the story is that the application is built externally and we just get binary artifacts, or even the final image. And then you can subdivide these two tracks. On the native side we have what we did quite a lot until recently, this Unikraft-driven compilation, which basically means that when you have your application, you have to port or mimic the application's original build system with the Unikraft build system, and then you compile all the sources with Unikraft. It has the benefit that you stay in one universe and don't have potential conflicts with compiler flags or things that influence your calling conventions between objects. And then there is the way that you have probably also seen, for instance, with rump kernels. They did it a lot using an instrumented approach, where you actually utilize the build system of the application with its cross-compile feature, and then you hook in, and that's your entry point for replacing the compile calls and making them fit your unikernel. And then on the binary compatibility side, let's start here, because that's easier: externally built. This basically means you have ELF files, so a shared library or an ELF application. What you need here is basically just to support loading that, getting it into your address space, and then running it.
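To make the "support loading ELF files" requirement a little more concrete, the following Python sketch shows only the very first step a loader has to take: validating the ELF header and reading the fields that say what kind of image it is and where execution starts. It is an illustration under stated assumptions, not the Unikraft ELF loader: it handles only 64-bit little-endian headers, and the function name `parse_elf_header` is hypothetical. The field offsets follow the ELF64 header layout from the ELF specification.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_EXEC, ET_DYN = 2, 3  # e_type values defined by the ELF specification


def parse_elf_header(data):
    """Read the ELF64 header fields a loader needs before mapping segments."""
    if len(data) < 64 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    ei_class, ei_data = data[4], data[5]
    if ei_class != 2 or ei_data != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # After the 16-byte e_ident: e_type, e_machine, e_version, e_entry, e_phoff.
    e_type, e_machine, _version, e_entry, e_phoff = struct.unpack_from(
        "<HHIQQ", data, 16
    )
    # Size and count of program headers (the segments to map into memory).
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    return {
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff,
        "phentsize": e_phentsize,
        "phnum": e_phnum,
        # Position-independent executables and shared objects are both ET_DYN.
        "position_independent": e_type == ET_DYN,
    }
```

A real loader would go on to walk the program headers, map each loadable segment with the right protections, and finally jump to the entry point; none of that is shown here.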
And then there's also this flavor of, let's say, build-time linking, which means that you take some build artifacts from the original application build system, like the intermediate object files before it does the final link into the application image, and you link those together with the Unikraft system. We call it binary compatible here because you interface at the ABI level, and not at the API level as in the native cases. And this is just a little remark that in the Unikraft project you will mostly find these three modes that people are working on. This one here we never tried with Unikraft, in fact, but there's some tooling and it should work too. So, as you may have noticed, native is about API compatibility, so really about the programming interface, and binary compatibility is about the application binary interface. So really the compiled artifacts and the calling conventions: which register holds which argument, what your stack layout is, et cetera. Whereas this here is at the programming language level. So, the requirements for providing a native experience are POSIX, POSIX and POSIX. Most applications are written for POSIX, so we have to do POSIX, no excuse. The libcs will mostly cover that. But yeah, it's all about POSIX. The second point is that you also need to port the libraries that your application additionally uses. Let's take NGINX as a web server. You then have tons of library dependencies, for instance for cryptographic things like setting up HTTPS tunnels, or for doing some other things. Those libraries you also need to port and add here, so that you have the application sources available during the build.
On the binary compatibility side, the requirements are that you need to understand the ELF format, shared libraries or binaries, depending on which level you're targeting. And then, since this stuff got built for Linux, you must be aware that the binary may issue system calls directly. It was built together with a libc or something like that, so it executes a syscall assembly instruction, which means that on our side we need to be able to handle those system calls as well. And if we speak about shared library support, we need to support all this library function and library symbol linking. Additionally, of course, all data that is exchanged needs to be in the same representation, because this is ABI. Now imagine you have a C struct. On the native side, it's fine to move some fields around, because if you use the same definition for your whole compilation, it's all fine. You can reorder the fields in the struct and it will all work. Here, you can't, because for your application that got built externally, the binary layout of that struct must match. Otherwise, you will obviously read different fields. And then for both modes, which is important for us as an operating system, there are of course also some things that we need to provide to the application simply because that is how it is on Linux, meaning procfs or sysfs entries, or files in /etc, or something like that. Applications sometimes do silly things just to figure out which time zone they are in, so they go to /etc and figure out what is configured, and also the locales, et cetera. So, I'm closing this high-level view, so that we have the full understanding. Let's speak a bit about the pros and cons of these two modes.
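The struct-layout point above can be shown in a few lines. This Python sketch uses the standard `struct` module with an explicit little-endian, unpadded layout to stand in for a C struct shared between an externally built application and the kernel side. The record (a 32-bit id followed by a 64-bit size) and the function names are made up for illustration; the only claim is the general one from the talk, that reading the same bytes with a reordered field layout yields different values.

```python
import struct

# Layout the externally built application was compiled with: u32 id, u64 size.
APP_LAYOUT = "<IQ"
# Same two fields in swapped order. Harmless if both sides are recompiled from
# one definition (API level), wrong if only one side changes (ABI level).
REORDERED_LAYOUT = "<QI"


def write_record(ident, size):
    """Serialize the record the way the application's binary expects it."""
    return struct.pack(APP_LAYOUT, ident, size)


def read_record(layout, raw):
    """Interpret raw bytes according to the given field layout."""
    return struct.unpack(layout, raw)


if __name__ == "__main__":
    raw = write_record(7, 4096)
    print(read_record(APP_LAYOUT, raw))        # fields as the application wrote them
    print(read_record(REORDERED_LAYOUT, raw))  # same bytes, misread fields
```

Both layouts have the same total size, so nothing fails loudly; the values are simply wrong, which is what makes ABI mismatches hard to spot.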
On the native side, what is really nice, and a really interesting pro, is that once you have everything put together, you have quite a natural way to change code in the application, to change code in the kernel, to maybe make shortcuts in the interaction between application and kernel, and you can use that to drive your specialization even further and performance-tune your unikernel for the application. The disadvantage is that you always need the source code, because we are compiling everything together here. What is also a bit difficult, let's say for newcomers, is the build system. They have an application and they say, okay, I have the source code, I just run make and it compiles. But you then need to either instrument the build system of the application, as we just saw with the instrumented build that rump kernels also did, or we must say, sorry, you can't use that build system, you now need to mimic it and write a Unikraft Makefile equivalent to build your application. This is why binary compatibility is really interesting for newcomers: Unikraft doesn't need the source code, and they can compile the application they're interested in, if they need to compile it at all, the way they usually do. They don't need to care about Unikraft at all, and normally no modifications to the application are needed either. Obviously, you can still do things here, but it's not a requirement.
The risk that we saw while doing the work, at least on the unikernel side, is that you end up needing to implement things the way Linux does them. One really stupid example, which drives me a bit nuts, is providing an implementation for netlink sockets. If you have a web application, or any application that does some networking, and that application wants to figure out which network interfaces are configured and what their IP addresses are, it will likely use the libc function getifaddrs, and that is implemented with a netlink socket. Going back to the native side, here I can just provide a getifaddrs that is highly optimized in that sense and just returns all the interfaces in that struct. But if I go binary compatible, and if I take it to the extreme, then that libc, which may be part of your binary here, opens a socket with address family netlink and starts communicating with the kernel over that socket to figure out the interface addresses, which can be a really silly thing for a unikernel to do. And then there are also maybe fewer opportunities, and it's a bit harder, to specialize and tune the kernel-application interaction, because assuming you don't have access to the source code of the application, there's nothing you can do on the application side. So, to give you a rough idea of what that means for performance, because at Unikraft the second important thing for us is always performance, performance, performance. Here we show you NGINX compiled as a native version, meaning it uses the Unikraft build system to build the NGINX sources, versus NGINX running on what we call the ELF loader, which is our Unikraft application for loading ELF binaries. And then there is a comparison with standard Linux, and here this is the same binary.
What that means for performance: in this quick test, we just have the index page served, the standard default of any NGINX installation, and these are the performance numbers. The takeaway here is that if you don't go into any special performance tuning yet, and just start by getting the thing compiled and running, you will end up with performance similar to what you get if you just take the ELF loader and run that application in binary compatibility mode. That is interesting because you don't necessarily see huge performance drops. The only thing that you lose, if you go for this mode, is the potential to optimize further. But the nice thing is that you can still see the benefits of running your application on Unikraft. And just to give you an impression, this here is a Go HTTP application where we went a bit crazy with optimizing and specializing the interaction between the Go application and Unikraft, and yes, we can get more out of this. We can really performance-tune and squeeze stuff out of it. Okay, so in the next slides I go over how we implement these modes with Unikraft, because as I said, we don't want to target just one mode, we want to target multiple modes. That also brings some implementation challenges, because as an engineer you also want to reuse code as much as possible. So we'll talk about the structure here. Okay, to give you an overview: this doesn't mean that these applications run at the same time, although that could also be possible; it's just to show you how the components get involved in our ecosystem. If you take just the left part, the native port of an application, we have now settled on musl to provide all the libc functionality that the application needs. And we have a library called syscall shim, which is actually the heart of our application compatibility.
You can imagine this as a bit of a registry that knows in which sub-library a system call handler is implemented, and it can then forward the musl calls to those places. On the binary compatibility side, you have a library called the ELF loader, which is the library that loads an ELF binary into memory. And then here's the syscall shim, taking care of handling binary system calls. Now I will go into the individual items to show you a more zoomed-in view of what's happening there. We will, of course, start with the heart, with the core, the syscall shim. So here we have macros. When you develop vfscore, which is our VFS library, actually ported from OSv, or posix-process, where you implement some process functionality like getpid or something like that, we have some macros that help you define a system call handler. And a system call handler is really just a function that is defined at that point, and you register it with the syscall shim. The shim then provides two ways in which that system call handler can be reached. One is at compile time. This is macros, macros and preprocessor: when you have native code that makes a system call, or actually it's on the musl side, the shim will replace those calls at compile time with the function of the library that implements that system call. Then it also has a runtime handler, which is provided here, which does the typical syscall trap and runs that function behind the scenes. Our aim, as I mentioned, is to reuse code as much as possible. The idea is that we implement the function for that system call just once, and the syscall shim helps us, depending on the mode, by doing a link or by providing it as binary compatible. So let's go back to the overview, and then you will see it a bit more concretely with musl, but I have probably said everything already.
So we have musl natively compiled with the Unikraft build system. Now imagine you have the application and it does a write. That goes to musl, and musl then calls uk_syscall_r_write, which is the symbol provided by the library that actually implements it. The rewriting happens, as I said, with macros at compile time in the musl library. What we did for that is replace the musl-internal syscall function with our syscall macro, which then kicks in the whole machinery to map a system call request to a direct function call. The thing is that in musl most, though not all, system call requests pass the system call number as a static first argument. So, let's say, write is a libc wrapper. Internally it prepares the arguments, maybe does some checks before going to the kernel, and then calls this syscall function with the number of the system call followed by the arguments. As long as that number is a static constant, literally written down in the code, we can do a direct mapping so that write directly makes a function call to uk_syscall_r_write. If it's not static, which really happens in only two or three places, if I remember correctly, then of course we can provide an intermediate function that does a switch-case and jumps to the actual system call handler. And since everything is configurable, I can have a build where vfscore is not part of the build, or posix-process is not part of the build. Then the syscall shim will automatically, again with all the macro magic that we do, replace calls to non-existing system call handlers with an ENOSYS stub, so that to the application it looks like a function that is not implemented. And exactly, at runtime the syscall shim is out of the game for a native port, so everything happens at compile time.
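The registry idea behind the syscall shim, as described above, can be sketched in Python. This is only an analogy: the real shim works with C macros at compile time, whereas here "resolve once, then call directly" stands in for the compile-time mapping and `dispatch` stands in for the switch-case fallback used when the number is only known at call time. The class and method names are hypothetical. The system call numbers 1 (`write`) and 39 (`getpid`) are the x86-64 Linux values.

```python
import errno

SYS_WRITE = 1    # x86-64 Linux system call number for write
SYS_GETPID = 39  # x86-64 Linux system call number for getpid


class SyscallShim:
    """Toy registry mapping system call numbers to handler functions."""

    def __init__(self):
        self._handlers = {}

    def register(self, number):
        """Decorator a library uses to register its handler for one system call."""
        def decorator(func):
            self._handlers[number] = func
            return func
        return decorator

    @staticmethod
    def _enosys(*_args):
        # Stub used when the implementing library is not part of the build.
        return -errno.ENOSYS

    def resolve(self, number):
        """Number known up front: look the handler up once, then call it directly."""
        return self._handlers.get(number, self._enosys)

    def dispatch(self, number, *args):
        """Number only known at call time: look up and call on every request."""
        return self.resolve(number)(*args)


shim = SyscallShim()


@shim.register(SYS_WRITE)
def handle_write(fd, buf):
    # A stand-in for a VFS library's write handler: report bytes "written".
    return len(buf)


if __name__ == "__main__":
    direct_write = shim.resolve(SYS_WRITE)  # analogous to the static mapping
    print(direct_write(1, b"hello"))
    print(shim.dispatch(SYS_GETPID))        # no handler registered: -ENOSYS
```

The handler is written once and reached through either path, which mirrors the code-reuse goal stated in the talk.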
For the binary compatibility side, that's unfortunately a runtime thing, and we actually have two components here. As I was mentioning, there is the ELF loader itself, which loads the ELF application. What we support today is static PIEs, so if you have a static position-independent executable compiled, you can run that. What also works is using the dynamic linker that is provided together with your libc, meaning that if you use glibc with the application, you can use its dynamic linker, ld.so, and also run dynamically linked applications with that. What it needs is posix-mmap as a library, which implements the mmap, munmap and mprotect functions, the system calls behind them. Then system calls are trapped here in the syscall shim, and as I said, when a library is not selected, its handler is replaced with ENOSYS, so the syscall shim knows which system calls are available and which are not. Then there's a bit of a specialty in handling a system call, in the system call trap handler, which we provide with the syscall shim. We don't need to do a domain switch: we still have a single address space, and a single... what's it called, I forgot the word... it's all kernel privilege, the same privilege domain, exactly. So we don't have a privilege domain switch either. Right, now we have it, good. But we are in a slightly different environment, and I will show you later in the slides exactly what this means. We have some assumptions that differ from those of the Linux system call ABI, which unfortunately requires us to do some extra steps. The first thing is that Linux does not use extended registers, or if it uses them, it guards them. Extended registers means floating point units, vector units, MMX, SSE, you know.
We do, unfortunately, so we need to save that state, because an application that was compiled for Linux does not expect that these units could be clobbered when coming back from a system call. The second thing is that there is no TLS in the Linux kernel, but unfortunately on Unikraft we have one, and unfortunately we even use the same TLS register, so we also need to save and restore that, so that the application keeps its TLS and all the Unikraft functions operate on the Unikraft TLS. Good. So I'll continue and give you some, let's say, lessons learned while implementing all these things. But first I would like to give you a short demo, and then we speak a bit about what was tricky during the implementation and which special considerations we had to make. So let's hope that this works. This is a super fresh demo, don't touch it, you will burn your fingers. Thank you to my colleague Marc for getting this to work just half an hour before the talk. Well, he's the person that no one sees, but who does all the work. Yeah, he's amazing. Okay, so in this demo I have NGINX, a web server, with a standard file system, and I'll show you some of the files around it. I have it once compiled natively and once compiled as a Linux application, which we'll run with the ELF loader, and you will see that the result is the same. So let's start with the native one. I probably need to increase the size a bit so that you can read it in the back. Is that good? Yeah. Yeah. Let's do it here too. So I hope you can also read it in the last row. Perfect. So here you have the NGINX app checked out. We have menuconfig, so you can... oh, this window is somehow wider, just one second. Now it's better, okay.
So you see, the application is here as a library, lib NGINX, and then here you have the configuration of all the HTTP modules that NGINX provides, and you can select and choose. This is really the Unikraft way of doing things. Because it takes a while to build, and my laptop is not the fastest, I built it already. Here you see the result in the build directory. You see each individual library that was pulled in because of dependencies and compiled, for instance posix-futex, posix-socket, RAMFS, which is an in-memory file system, and, where is it now, the application. Here, that's the uncompressed application image. So let's see how big it is. It's 1.1 megabytes here, and this is a full image of NGINX, including musl and including all the kernel code and drivers to run on a QEMU/KVM x86 machine. Then let's run it to see what happens. Exactly, it's already up and running. To show you, these were roughly the arguments. Because I find qemu-system sometimes a bit brutal with command line arguments, we have in the meantime a wrapper script that shortens a few things, but in the end this is running qemu-system. It attaches to this virtual bridge, takes that kernel image and loads that initrd file system, because we serve a file from that RAMFS, and here are also some parameters to set the IP address and netmask for that guest. And down there we can check. You see here "set IPv4", that's the address where the unikernel is up, and you see here, with this wget line, that I get the page served. And to prove that this is real, let us kill this. Now the guest is gone and this is dead, so no response anymore. Good. So now let's go to the ELF loader, which is also treated as an application, one that can run other applications. Also here in the build directory, let's do the same thing. It has similar dependencies, of
course, it's prepared to run NGINX, so posix-socket is there, et cetera, et cetera. Where's the... here, so here's the image. It's a bit smaller, it's now 526 kilobytes, and it provides your environment to run a Linux ELF. Of course the NGINX image is not included here anymore; that is part of the root file system. And if I run this now, I have enabled some debug output on purpose, so that you see the proof that it does system calls. If you scroll up, the initialization phase looks a bit different. It also sets the IP address, here it's extracting the initrd, and here it's starting to load the NGINX binary, the Linux binary, from the initrd. From that point on, the ELF loader jumped into the application, and you see every system call that the application was making. You can even see some stuff that is probably the first glibc initialization. Here, for instance, it's trying to open /etc/localtime and find some configuration. Of course we don't have it. We could provide one, but it's still fine, it continues booting. Affinity we don't have, but whatever, it continues booting. It's quite optimistic actually, but it works. A lot of files, if you look into procfs, getpwnam, all those items, it works, it works. Yeah, exactly, and there are tons of mmaps, and /etc/passwd, et cetera. Those files we had to provide, so you get a file descriptor returned, otherwise it would have stopped. And then configuration and so forth. And now you should see that some system calls happened when I accessed the page, and you saw it happen: the index file was opened, the file descriptor is 7, and here there should be a write to the socket, over here, this is probably socket number 4. I mean, you get an impression of what's going on, right? It's working the same way. Okay, how much time do I have left? Five minutes? Five minutes, okay, then?
Actually three minutes, just to leave some room for questions. Yeah, exactly, okay, so let's get back quickly. So we had some lessons learned. For the native mode, the thing is that we also have this model, like you heard for OSv, where we want to use just one libc in our build, meaning all the kernel implementation and everything that the application needs uses one libc. We provide multiple libc implementations, because musl might still be too thick or too big for some use cases, so we have an alternative like nolibc, and originally we had newlib. What we also want in our project is to keep the libc as vanilla as possible, as close to upstream as possible, because we want to keep the maintenance effort for updating libc versions low. But this then causes some things that you stumble on. I just list them and speak about only one of these items, which was quite interesting: this getdents64 issue that cost us some headaches. It was mainly Razvan who was fixing it, and it required a patch. I'm only fixing it. Yeah, it required a patch to musl. What happened here is that in dirent.h musl provides an alias to use the non-64 version for getdents: if it finds code using getdents64, because of this large file support thing that was happening, it maps it to getdents. On the other side, in vfscore, which is the VFS implementation where we provide the system call, we obviously need to provide both the non-64 version and the 64 version. And guess what, we include dirent.h because we need a struct definition from it. And then you can imagine, if you're familiar with the C preprocessor, and there's a little hint with this define, that of course this gets replaced, and then you have the same symbol twice, and you're like, what the hell is going on here? All right. So let's skip this because of time.
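The `getdents64` problem described above is a textual-substitution effect of the C preprocessor, and it can be imitated in a few lines of Python. The sketch below is a toy model, not a C preprocessor and not the actual musl or vfscore source: it applies an object-like define as a whole-word replacement to a made-up source snippet that defines both functions, and then looks for function names that are now defined twice. The helper names and the snippet are hypothetical.

```python
import re
from collections import Counter


def preprocess(source, defines):
    """Replace whole-word identifiers, as an object-like #define would."""
    for name, replacement in defines.items():
        source = re.sub(rf"\b{re.escape(name)}\b", replacement, source)
    return source


def defined_functions(source):
    """Collect names of definitions shaped like 'int name(...) {'."""
    return re.findall(r"\bint\s+(\w+)\s*\([^)]*\)\s*\{", source)


def duplicate_symbols(source, defines):
    """Return function names defined more than once after substitution."""
    counts = Counter(defined_functions(preprocess(source, defines)))
    return sorted(name for name, count in counts.items() if count > 1)


# Made-up stand-in for a VFS library that must implement both system calls.
VFS_SOURCE = """
int getdents(int fd) { return 0; }
int getdents64(int fd) { return 0; }
"""

if __name__ == "__main__":
    # Without the alias, both definitions coexist.
    print(duplicate_symbols(VFS_SOURCE, {}))
    # With the header's alias in effect, both definitions get the same name.
    print(duplicate_symbols(VFS_SOURCE, {"getdents64": "getdents"}))
```

In C the outcome would be a redefinition or duplicate-symbol error at build time; the talk says the fix required a patch to musl, and the sketch makes no claim about what that patch looks like.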
Upcoming features. Razvan was telling a bit already, especially on this topic of application compatibility: we will further improve it, so this will be now our first release to officially include the ELF loader and an updated musl version. We want to make that more seamless, which requires a bit more under-the-hood libraries for that support. You should also watch out for features that are coming up for a seamless integration of Unikraft into your Kubernetes deployment, and, no question, also for running Unikraft on your infrastructure provider, for instance AWS, Google Cloud, et cetera, and for automatic packaging of your applications, right? And I would love, or actually all of us, everyone within Unikraft would love to hear also your feedback and what you think about, you know, taking the cloud with unikernels to the next level. Yeah, any feedback to me, please send to Simon. Right, and these are, again, the project resources. If you're interested, you can just scan the QR code. I think that's it. Okay. Thank you, Simon. Right. So we can take a couple of questions. You can also address them to me. I mean, that's a joint talk, so any... yeah, please, first here and then in the back. Yeah, thanks a lot, both of you, for your talks. I have a question regarding dynamically linked applications in Linux. As far as I can see, you only use musl, so how does this work out if my application is linked against glibc and I want to run it with your loader? What do I have to do? Because in the Linux world, when I link against glibc and I only have musl, nothing works. Right. So I'm assuming you're speaking about the binary compatibility mode. In the end, what you just need to do is provide the musl loader, if you have compiled your application with musl, or the glibc loader, and then both work. The thing is, in that setup there are actually two libcs in memory: there's the libc on the Unikraft side, and there's the libc with your application.
So that's why it works seamlessly, actually. Okay, thank you. Just to add to that: when you build your unikernel for binary compatibility, you don't use musl. You can if you want. But the app loader doesn't use musl, because the entire libc is provided by the application, either by the application as a static binary, or by the application plus its libc inside the root file system, and it's loaded from there. There's no need to have anything like that. Yeah, please. Yeah. So the question is about the API. You spoke about the POSIX API. You also had a diagram showing a direct link to the unikernel. So the question is... is it the next diagram, perhaps? One of the next diagrams. Okay. Is it a valuable use case? Yes, there is a link directly from the native application to the unikernel. Yeah. Yeah. What this shows you is how the calls are going. It can happen because some system calls don't have a libc wrapper provided. Yeah. It's for completeness that this arrow is here. For instance, the futex call: if you use futex directly from your application, there is no wrapper function in libc. You need to do a system call directly, and you can do that by using the syscall macro, or actually, I mean, the syscall shim will then replace that with a direct function call to, actually, posix-futex. So is it valuable to have a kind of application that you develop specially for the unikernel and the native API? Yes. Yes. That for sure. This talk is just about how we get application compatibility, in case you have your application already. But if you write it from scratch anyway, I recommend: forget everything about POSIX and speak the native APIs. You get much more performance and are more directly connected to your driver layers and APIs. You know, POSIX has some implications, right? A lot of things, like read and write, imply there's a memcpy happening.
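To make the point about the copy implied by read and write concrete, here is a minimal Rust sketch. It is not Unikraft code: `RxBuffer`, `read_copy`, and `peek_zero_copy` are hypothetical names chosen for illustration. It contrasts a POSIX-style read, where the payload is copied into a caller-supplied buffer, with an interface that lets the caller borrow the driver's buffer.

```rust
/// A received packet held in a driver-owned buffer (hypothetical type).
struct RxBuffer {
    data: Vec<u8>,
}

/// POSIX-style read: the caller supplies a buffer and the payload is copied into it.
fn read_copy(rx: &RxBuffer, out: &mut [u8]) -> usize {
    let n = rx.data.len().min(out.len());
    out[..n].copy_from_slice(&rx.data[..n]); // the copy implied by read()
    n
}

/// Zero-copy style: the caller borrows the driver's buffer directly.
fn peek_zero_copy(rx: &RxBuffer) -> &[u8] {
    &rx.data
}

fn main() {
    let rx = RxBuffer { data: b"GET / HTTP/1.1".to_vec() };

    // Copying path: bytes are duplicated into user_buf.
    let mut user_buf = [0u8; 64];
    let n = read_copy(&rx, &mut user_buf);
    println!("copied {} bytes", n);

    // Borrowing path: the slice points at the driver's own memory.
    let view = peek_zero_copy(&rx);
    println!(
        "borrowed {} bytes, same memory: {}",
        view.len(),
        view.as_ptr() == rx.data.as_ptr()
    );
}
```

The borrowing variant only works when caller and driver share an address space and agree on buffer ownership, which is the situation a unikernel's native API can assume.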
And with these lower-level APIs, you can do way quicker transfers, just because you can do zero copy, for instance. Maybe even POSIX can be improved using POSIX. Yeah, sure. Of course. Of course. Yeah. Have you looked into patching the binary to remove the syscall overhead? Patching the binary to remove...? For example, now with the syscalls, do you have to emulate the syscalls? Have you looked into patching the binary itself? Yeah. Instead of doing it at runtime, handling the syscalls at runtime? Yeah. Let's say at least we thought about that, but we didn't do it. I mean, HermiTux, that is the other..., exactly, he's sitting in the front. They were doing some experiments with that. That works too, so you can patch it. But yeah, I mean, we just didn't do it. Okay. In regards to memory usage: obviously a unikernel lowers it, but what if I run multiple unikernels in multiple VMs? Do you support memory ballooning or something like that, or is it just over-provisioned? Yeah. I mean, the idea is to have memory ballooning, but it's not upstream yet. Of course. There's also a really interesting research project, maybe I should mention, that works on memory deduplication. So if you run the same unikernel, like 100 times, you can share VM memory pages on the hypervisor side, but you need hypervisor support for that. Okay. Thank you so much, Simon. Let's end it here. Yeah. We're going to ask... Yeah. Yeah. And get some stickers. Anastasios and Babis for the next talk, on vAccel, so please. So please get some stickers. Yeah. Stickers. They are free. You don't have to pay for them. For now. vAccel, 100 euro each. |
Hardware acceleration for Unikernels
A status update of vAccel |
Hi, everyone. So it's my pleasure to introduce Babis and Anastasios. They're going to give you the talk on using vAccel for hardware acceleration in your unikernels. Babis, please. So hello, everyone. I'm Babis. My actual name is Charalampos Mainas, but you can just call me Babis. So we're going to give a talk about hardware acceleration and our effort to have some support for it in unikernels, and we do that with vAccel. So, yeah. Yeah... oh, sorry. I forgot about that. Oh, okay. Oh, let's forget. Yeah, put that over here, and maybe you can just keep it here. Okay. So, yeah, we already heard from Simon, so we don't have to repeat what unikernels are. There are a lot of projects, and we know that it's a promising technology. We can have very fast boot times, a low memory footprint, and some increased security. We also know some of the use cases for unikernels, which are usually traditional applications that you might have heard of, like web servers and stuff like that. But they have also been used for NFV, and we think that they are also a good fit for serverless and, in general, microservices deployments, either in the cloud or at the edge. And we also think that they can be a good fit, especially in this case, for ML and AI applications. And that sounds a bit weird because, as we know, ML and AI workloads are quite huge and heavy. So, maybe you have heard about PyTorch, maybe you have heard about TensorFlow. We're not going to touch them, don't worry. But what we want to say here is that they are very, very heavy frameworks, and it is very difficult to add support for them. And secondly, we know that these kinds of applications are usually compute-intensive applications that can take a lot of resources. And for that exact reason, we see that there is also a shift in the hardware that exists in the data centers, and not only in the data centers, but also at the edge. We see devices that are equipped with a lot of new processing units.
Of course, we have the traditional FPGAs and GPUs, but we also have specialized processing units like TPUs and also some ASICs. And first of all, as we know, ML and AI workloads cannot be executed in unikernels, that's for sure, because there is no support for these frameworks. And secondly, there is no support for hardware acceleration, so there is not really any benefit if we run them on a CPU. So, I'm going to go through the acceleration stack and how we can virtualize it with the current approaches. In general, what we have is pretty simple. Usually, you have an application which is written against an acceleration framework: it can be OpenCL, it can be CUDA, it can be TensorFlow, PyTorch, all of these frameworks. Usually underneath that, you have the operators for the GPU, or maybe a runtime for FPGAs. And then you also have, of course, the device driver, which resides inside the kernel. So, this is what we have to virtualize. And as we know, unikernels are virtual machines, so the same techniques that we have for virtual machines we can also use in unikernels. Some of these techniques are hardware partitioning, paravirtualization, and remote API. In the case of hardware partitioning, the hardware accelerator has the capability to partition itself, we assign this small part of the accelerator to the VM, and the VM can access the hardware accelerator directly. This has very good performance. On the other hand, we need to have the entire acceleration stack inside the VM, from the device driver to the acceleration framework to the application. Also, I forgot to mention here that this is something that has to be supported by the device, and a device driver also needs to be in the VM. In the case of paravirtualization, things are getting a bit better because we can have a generic, let's say, device.
And then the hypervisor simply manages the accelerator, and the requests to the accelerator are managed by the hypervisor, so we don't need to have all these different drivers for every accelerator inside the VM. On the other hand, we still need to have the vendor runtime, the acceleration framework, and the application. In the case of remote API, we have an even lighter approach. Everything is managed by a server. This server might even be local, on the same machine, or it can be a remote server. And what happens here is that the acceleration framework intercepts the calls from the application and forwards them to the acceleration framework that resides on the server. This has some performance overhead, of course, because of the transport that happens. And it's also framework-specific, so it has to be supported; there is remote CUDA, for example, that supports it. So great, but what is best for unikernels? In the case of hardware partitioning, this means that we have to port the entire software acceleration stack and every device driver to the unikernel, which is not a good and not an easy task. Again, with paravirtualization, things are a bit better. We have to port maybe only one driver, but we still need to port the whole acceleration stack. In the case of remote API, this is something that sounds much more feasible because we can port only, let's say, remote CUDA, only one framework. But how easy is that? It's not easy because, as I said before, these kinds of frameworks are huge. They have a very, very big code base. They have dynamic linking, which is at odds with unikernels, and a lot of dependencies. So it's not going to be easy to port them to any existing unikernel framework right now. So for that, we think that vAccel is suitable for unikernels. So I will hand over to Tasos to present a bit of how vAccel works. Thank you. Thank you. So hi from my side, too.
I'm going to talk a bit about the framework that we're building. So we started working on vAccel to actually handle hardware acceleration virtualization in VMs. So it's not tailored to unikernels. We have been playing with semantically exposing hardware acceleration functionality from hardware acceleration frameworks to VMs. And the software stack is shown in the figure. We use a hardware-agnostic API, so we expose the whole function call of the hardware-accelerated operation. And we focus on portability and on interoperability, meaning that the same binary code originating from the application can be executed on many types of architectures, and it is decoupled from the hardware-specific implementation. A closer look at the software stack: we have an application. This application consumes the vAccel API, which supports specific operations. These operations are mapped through a mapping layer, through vAccelRT, to the relevant plugins, which are shown in greenish, and they are actually the glue code between the API calls and the hardware-specific implementation, which in this figure resides in the external libraries layer. And then there is the hardware, where whatever is in the external libraries executes. So digging a bit more into how vAccel works: the core library, the core component of vAccel, exposes the API to the application and maps the API calls to the relevant hardware plugins, which by the way are loaded at runtime. These plugins are actually glue code between the API calls and the hardware-specific implementation. So for example, we have an API call for doing image classification, image inference in general. The only thing that the application needs to submit to vAccel is: I want to do image classify, this is the image, this is the model, and the parameters, and blah, blah, blah. And this gets mapped to the relevant plugin implementation.
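The mapping from one API call to a plugin chosen at runtime can be sketched as follows. This is an illustrative model only, not vAccel's actual API or its Rust bindings: the `Plugin` trait, `Runtime`, `NoopPlugin`, and `CpuPlugin` are hypothetical names, and the "classification" is a stand-in for a call into a real inference library.

```rust
use std::collections::HashMap;

/// One accelerated operation exposed by the (hypothetical) API layer.
trait Plugin {
    fn name(&self) -> &'static str;
    fn image_classify(&self, image: &[u8]) -> String;
}

/// Plugin that accepts the call and does no work, useful for debugging.
struct NoopPlugin;
impl Plugin for NoopPlugin {
    fn name(&self) -> &'static str {
        "noop"
    }
    fn image_classify(&self, _image: &[u8]) -> String {
        String::from("noop: nothing done")
    }
}

/// Plugin standing in for glue code around a real library.
struct CpuPlugin;
impl Plugin for CpuPlugin {
    fn name(&self) -> &'static str {
        "cpu"
    }
    fn image_classify(&self, image: &[u8]) -> String {
        format!("cpu: classified {} bytes", image.len())
    }
}

/// Thin runtime: holds the registered plugins and maps API calls onto one of them.
struct Runtime {
    plugins: HashMap<&'static str, Box<dyn Plugin>>,
}

impl Runtime {
    fn new() -> Self {
        Runtime { plugins: HashMap::new() }
    }

    fn register(&mut self, plugin: Box<dyn Plugin>) {
        self.plugins.insert(plugin.name(), plugin);
    }

    /// The application names the operation; the backend is picked by configuration.
    fn image_classify(&self, backend: &str, image: &[u8]) -> Option<String> {
        self.plugins.get(backend).map(|p| p.image_classify(image))
    }
}

fn main() {
    let mut rt = Runtime::new();
    rt.register(Box::new(NoopPlugin));
    rt.register(Box::new(CpuPlugin));

    let image = vec![0u8; 1024];
    // Same application call, different backend selected outside the application.
    for backend in ["noop", "cpu", "gpu"] {
        match rt.image_classify(backend, &image) {
            Some(result) => println!("{}", result),
            None => println!("{}: no such plugin registered", backend),
        }
    }
}
```

The application code only ever names the operation. Which implementation runs is decided by what is registered and which backend name the configuration supplies, which is the decoupling the speaker describes.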
For instance, in this figure, we can use the jetson-inference image classification implementation, which translates these arguments and this operation to the actual jetson-inference framework provided by NVIDIA, which does the image classification operation. Apart from the hardware-specific plugins, we also have the transport layer plugins. So imagine this same operation, the image inference, could be executed in a VM using a virtual plugin. So this information, the operation, the arguments, the models, everything, will be transferred to the host machine, which will use a hardware plugin. So apart from the glue code for the hardware-specific implementation, we also have the VM plugins. Also, some of the plugins and the API operations support a subset of acceleration frameworks, such as TensorFlow or PyTorch. And as for what I mentioned earlier about the virtual plugins, essentially what happens is that the request for the operation and the arguments is forwarded to another instance of the vAccel library, either on the hypervisor layer or over a socket interface. So we currently support two modes of operation. We have a virtio driver, and currently we support Firecracker and QEMU. So we load the driver in the VM. This driver transfers the arguments and the operation to the backend, to the QEMU backend or the Firecracker backend, which in turn calls the vAccel library to do the actual operation. And the other option is using sockets. So we load a socket interface, a socket agent, on the host. We have the vsock plugin on the guest, and they communicate over simple sockets. I'm going to hand over to Babis for the unikernel stuff. So how can vAccel be used in unikernels? Actually, it's quite easy compared to any other acceleration framework that exists. The only thing that we need to do is just have that vAccelRT that you see over there. That's the only thing that we need to port. And this is a very, very thin layer of C code.
It can be easily ported to any unikernel that exists. And, of course, we need some kind of transport plugin to forward requests. So as Tasos already explained, the same application that we can run on the host or in any container or in any VM can also be used in the unikernel, the same, no changes. It simply uses a specific API of vAccel, then we simply forward the request to the host, and then we have another instance of vAccel on the host, which simply maps it to the hardware acceleration framework that implements the specific function. So this, as I said, allows us to have the same application running either on the host or in the VM without any changes. So it's easy to debug, easy to execute. And we can also access different kinds of hardware and different kinds of frameworks that exist, and we don't need to change the application. We can simply change the configuration on the host. So yes, we have yet another acceleration framework, and maybe you think that this is not going to be easy to use. But let's take an example and see how we can extend vAccel and see if it is easy or not. So let's take a typical vector addition example in OpenCL, which can be executed on the CPU or on an FPGA. The steps that usually happen are: we set up the bitstream in the FPGA and the FPGA starts the reconfiguration; of course, we transfer the data to the FPGA; then we invoke the kernel as soon as it's ready; and we also get the results back to the host. So this is what the application is already doing. So if you have this application already running on your machine, the only thing that you have to do is that somehow you need to libify the application, that is, just expose an API to do that instead. And the next thing is that you can integrate the library into vAccel as a plugin. We have a very simplistic API that you can use, and then the application will be seen as a plugin for vAccel.
Later, you can also update vAccel, just adding one more API call to vAccelRT, so the application can directly use it, with the correct parameters, of course. So I will give you a short demo of how this works, using Unikraft specifically. I will switch over a bit, so we can have maybe an image classification at first, and then we can see how a BLAS CUDA operation can be executed on the CPU and the GPU without any changes. And maybe some FPGA if we have time. Okay, this is not good. This is better. So we are in a typical working environment for Unikraft. We have created our application. We have newlib, which we are not going to use actually. And we also have Unikraft. So let's go here. So this is a repo that we have created. I will show it to you later. So this is what I want to show you. Here you can see that we only enable 9pfs, and we use it because we want to transfer the data into the Unikraft unikernel. So we are not going to use any network. We are just going to share a directory with the VM. And the only thing that you need to do is to select vAccelRT, and that's all. As you see, we don't have any libc because we don't need it for the specific example. So these are all the applications that are currently running on Unikraft. You can try them out by yourself. So we're going to use image classification. It will take some time, let me... it will take some time to build. But I will also try to show you what the application looks like as soon as it finishes. And it should finish right now, almost. Okay. And that's the application. So as you can see, yeah, we can skip the reading of the file. So this application is quite simple. We have a session that we have to create with vAccel, with the host. Then we simply call the function, which is called vAccel image classification. It has the arguments that are needed. And then we simply release the resources that we have used.
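The three steps of the demo application (create a session, issue one call for the whole operation, release the resources) can be sketched like this. It is a self-contained mock under stated assumptions, not the vAccel API: `Session`, `create`, `image_classify`, and the returned label are hypothetical placeholders, and the shared log exists only to make the order of events visible.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the lifecycle can be observed (illustration only).
type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical session handle; dropping it releases the host-side resources.
struct Session {
    id: u32,
    log: Log,
}

impl Session {
    fn create(id: u32, log: Log) -> Session {
        log.borrow_mut().push(format!("session {} created", id));
        Session { id, log }
    }

    /// One call carries the whole operation; a real backend would run inference.
    fn image_classify(&self, image: &[u8]) -> String {
        self.log
            .borrow_mut()
            .push(format!("session {}: classify {} bytes", self.id, image.len()));
        String::from("hedgehog") // placeholder label, not a real inference result
    }
}

impl Drop for Session {
    fn drop(&mut self) {
        self.log
            .borrow_mut()
            .push(format!("session {} released", self.id));
    }
}

fn run(log: Log) -> String {
    let image = vec![0u8; 1024]; // 1. read the input (faked here)
    let session = Session::create(1, log); // 2. create a session
    let label = session.image_classify(&image); // 3. one call for the operation
    label // 4. `session` goes out of scope here and is released
}

fn main() {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let label = run(Rc::clone(&log));
    println!("label: {}", label);
    for event in log.borrow().iter() {
        println!("{}", event);
    }
}
```

Tying the release step to `Drop` is a Rust idiom chosen for this sketch; in the C application shown in the talk, the release is an explicit call.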
So I will try to do an image classification for this beautiful hedgehog that we have here. And let's see what's going to happen. Okay. So all these logs that you see here are from the jetson-inference plugin. And we see that we have a hedgehog. So it was identified. And the thing here is that you can see that all of these logs are not from Unikraft. All of these logs are from the host that is running. I can also show you this small demo with some operations on arrays using CUDA. So the same here. We're going to export the backend. First, we're going to use a no-op plugin, which simply doesn't do anything. You would mostly use it only for debugging. So we have here the application, which is an SGEMM. And you can see that it doesn't do anything, because it's just a no-op plugin. It doesn't do anything special. So we can change the configuration on the host and specify that the backend that we want to use is the actual CUDA implementation, for maybe CPU. Yes. Okay. So then we will run it, and you will see that we have the... actually it's a min-max operation, it's not SGEMM. And then we will also run the same thing on a GPU. Again, we are just on the host. We can simply change the configuration, and now we start the unikernel again, and we get the result from the GPU. All these debug messages you can remove, of course. So we also have the... yes, this is also still min-max. Now we will go to SGEMM. Do we have time still? Yeah. Okay. So yeah, we can just use this. Again, no-op, nothing happens. Nothing really special. We will do the export to specify the CPU plugin again. And we will execute, and we'll see that the execution time is not very big, but just remember that number. And now we will run it on the GPU, and you can see here that the execution time is much better than before. And that's all. We can also show the FPGA, which is... okay. So this is an FPGA, right? So we need to have a bitstream.
And this is a Black-Scholes application, by the way. We will run it natively in the beginning, and then we will also run it in Unikraft. So first we just run the application natively, and you can see all of the logs and everything of the execution on the FPGA. And then we will see how this is executed in a unikernel. So this is... I forgot to show that, but I will explain later what all of these things are. Usually what we have to do is just export the vAccel backend that we want to use. That's how we configure the host to use a specific plugin. And then we have the QEMU command, which I can explain in more detail after this video. Still, this is from the unikernel now, and we access the FPGA, and we have the Black-Scholes operation running there. And we also have one more FPGA application, but I think you got the point. We have all these links for the videos and everything on our talk page at FOSDEM, so you can also see them from there. Let me talk a bit about QEMU, the QEMU plugin that we have. This is a bit more... this is just from our repo. So here we need the QEMU which has the virtio backend for vAccel. If Unikraft, for example, had support for vsock, we wouldn't have to use the virtio backend, so we wouldn't have to modify QEMU. But since we have no vsock support, we have to use virtio, and therefore we changed QEMU a bit by adding the backend, as you can see here. And these are all, as you already know from the previous talk, the configurations for Unikraft, the command line options. I will also show you our docs. We have extensive documentation here. You can find how to run a vAccel application in a VM, how to run it remotely. We also have, it doesn't show here, but we also have... Okay. Maybe more. Okay, so here we also have all the things that you need to do to try it out by yourself in Unikraft. And all of them are open source. You can check them out, and you can clone them by yourself. So let me return.
So currently vAccel has bindings for... We actually released version 0.5, and currently we have language bindings for C, C++, Python, Rust, and also for TensorFlow. And we have the plugin API that I talked about before for extending vAccel. You can also see what it looks like. These are all the things that we have tested and that we support right now. So from the hypervisor perspective, we have support for QEMU over virtio and vsock, and for these new Rust VMMs, like Firecracker, Cloud Hypervisor, and Dragonball. Regarding unikernels, it's currently working in Unikraft and in Rumprun, but we also want to port it to OSv and maybe some more unikernel frameworks. And we also have integration with Kubernetes, Kata Containers, and OpenFaaS for serverless deployments. And these are all the acceleration frameworks that we have tested to work with vAccel. So this is jetson-inference, which you saw when we did the image classification. We have TensorFlow and PyTorch support, TensorRT, OpenVINO, OpenCL, and CUDA, which you saw in the other demo. And regarding hardware, we have tested with GPUs, edge devices like Coral, and also FPGAs. So to sum up, the software stacks of hardware accelerators are huge and too complicated to be ported easily to unikernels. And we have vAccel, which is able to abstract the heterogeneity both in the hardware and in the software. And it sounds like a perfect fit for unikernels. So if you want, you can try it out by yourselves. Here are all the links that you can use to test them out. And we would like to mention that this work is partially funded by two Horizon projects, SERRANO and 5G-COMPLETE. And we would also like to invite you to the Unikraft hackathon that will take place in Athens at the end of March. And thank you for your attention. If you have any questions, we will be happy to answer them. Thank you so much, Babis.
So for the third time: we'll welcome you in Athens in late March for the hackathon. Are there any questions from the audience? Yeah, please. Thank you. Great stuff. I have a question about the potential future and the performance that we are currently possibly losing due to the use of the API and the transport. What do you think is the potential for further increases in performance, given that framework? Yeah, actually, the transport is, yes, a bottleneck, since you have all these transfers that take place. But we think that in the end we will still have very good execution times, very good performance. And it's also important to mention that we can set up the environment and everything so that you can minimize the transfers. For example, you can have your model, if you have a TensorFlow model or anything, and we are working on how this can be done, prefetched before you deploy the function, on the host, and have everything there, so you don't have to transfer from the VM to the host and vice versa and all of these things. Actually, if I may intervene, these are two issues. The first issue is all the resources, the models, the out-of-band stuff, which you can do in a separate API, in a cloud environment, in a serverless deployment. And the second thing is about the actual transfers over virtio or vsock. The thing is that since we semantically abstract the whole operation, you don't have to do a cudaMemcpy, cudaMalloc, CUDA something, set kernel, whatever, and you don't have this latency in the transfer. So it minimizes the overhead to just the part of copying the data across, so the actual data, the input data and the output. So this is really, really minimal. We have also tested remotely, but the network is not that good, so we need to do more tests there. But in the VMs that we have tested, the overhead is less than 5%, for an image classification of 32 KB to a megabyte, something like that.
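The argument about abstracting the whole operation can be illustrated with a toy model that only counts messages crossing the guest/host boundary. It is not a benchmark and not vAccel code: `Transport` is a hypothetical type, and the six forwarded steps are an assumed, simplified stand-in for the fine-grained calls the speaker lists.

```rust
/// Counts messages crossing the guest/host boundary (toy model, not a benchmark).
#[derive(Default)]
struct Transport {
    messages: usize,
    bytes: usize,
}

impl Transport {
    fn send(&mut self, payload_len: usize) {
        self.messages += 1;
        self.bytes += payload_len;
    }
}

/// API remoting: every fine-grained framework call is forwarded separately.
/// The six steps below are an assumed, simplified sequence.
fn api_remoting(t: &mut Transport, input: &[u8], output_len: usize) {
    t.send(0); // allocate device memory
    t.send(input.len()); // copy input to the device
    t.send(0); // configure the kernel
    t.send(0); // launch the kernel
    t.send(output_len); // copy the result back
    t.send(0); // free device memory
}

/// Semantic offload: the whole operation travels as one request and one reply.
fn semantic_offload(t: &mut Transport, input: &[u8], output_len: usize) {
    t.send(input.len()); // request: operation plus input data
    t.send(output_len); // reply: output data
}

fn main() {
    let input = vec![0u8; 32 * 1024];
    let mut remoting = Transport::default();
    let mut semantic = Transport::default();

    api_remoting(&mut remoting, &input, 64);
    semantic_offload(&mut semantic, &input, 64);

    println!("api remoting: {} messages, {} bytes", remoting.messages, remoting.bytes);
    println!("semantic:     {} messages, {} bytes", semantic.messages, semantic.bytes);
}
```

In this model both approaches move the same payload bytes; what the semantic interface removes is the per-call round trips, which matches the claim that the remaining overhead is the copy of the input and output data.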
So it's really, really small, the overhead of the transport layer, for both virtio and vsock. The vsock part is a bit more, because it serializes the stuff through protobufs and the stack is a bit complicated, but the virtio stuff is really super efficient. Hi, so thank you for the talk. My question would be almost on the same thing, but from the security perspective. So if we offload a lot of computation out of the unikernel to the host again, I guess security, or at least the isolation, is a thing to think about. So do you have any words on this topic? Yeah, you can take it. It's yours. Okay, we agree. Yes, there are issues with security, because essentially you run in a unikernel to be isolated, and now we push the execution to the host. So one of the things that we have thought about is that when you run this in a cloud environment, the vendor should make sure that whatever application is supported to run on the host is secure and audited. So the user doesn't have all the possibilities available. They cannot just exec something on the host. They will be able to exec specific things that are audited, in libraries in the plugin system. So one approach is this. Another response to the security implications is that at the moment you have no opportunity to run a hardware-accelerated workload from a unikernel. So if you want to be able to deploy such an application somewhere, then you can run isolated. You can use the whole hardware accelerator and have the same binary that you would deploy in a non-secure environment. So you could secure the environment, but have this compatibility and software supply mode using a unikernel, using this semantic abstraction, let's say. Any other question? Yeah. Please. So my question is similar to the first question, but I'm wondering, because you can also do GPU pass-through via KVM and just pass the GPU to a virtual machine.
So I'm wondering what the performance difference is between doing that and doing it with vAccel? Yes. Actually, we want to evaluate that, and we need to evaluate it and see how, for example, with pass-through directly, like exposing the whole GPU to the VM, this could also be one baseline for the evaluation. Currently, I don't remember if we have any measurements already. Would you consider the pass-through case the same as native? Yeah, but I mean, if we have any, like, okay. Actually, for GPU virtualization, for example, I'm not sure how many VMs can be supported on one single GPU. I'm not aware of any solution that can scale to, like, tens of VMs, even tens of VMs. I'm not sure if there is any existing solution for that. But, yes, we plan it. We want to do some extended evaluation, compared also to, let's say, virtual GPUs that exist, or even pass-through and native execution. We want to do that, and hopefully we can also publish the results on our blog. Okay. Thank you. Any other questions? Yeah. So, in response to the first security question: yeah, we are now offloading compute to the hypervisor and host. So does it imply that there is a possibility to break out of the containerization with vAccel? Well, yes, yes, code is going to be executing on the host at a privileged level. Yes. But the other option is what? So, yeah, there is a trade-off. We are actually working on it. We want to see what available sources we have there. How can we make it more secure? How can we sandbox it somehow to make it look better? But on the other hand, for example, in FPGAs there's no MMU, there's nothing. If you run two kernels, one kernel can access, if you kind of know what to do, all the memory in the whole FPGA, for example. So, on the one hand, you also need support from the hardware.
And regarding, for example, the software stack, we are looking at it to see how we can extend it and make it more... at least increase the difficulty of having any. So, for example, in the Kata Containers integration that we have, when you spawn a container, you sandbox the container in a VM, and our agent, the host part of vAccel, is running in the same sandbox, not in the VM, outside the VM. But it runs in the sandbox. So, yes, there is code executing on the host, but it's in the sandbox. So, it's kind of a trade-off. Anything else? Right? If not, thank you, Anastasios. Thank you, Babis. Yeah. |
A Rust-Based, modular Unikernel for MicroVMs
RustyHermit @ FOSDEM 2023 |
Okay, now that Pierre is here, we can get started. So it's my pleasure to invite Martin to talk about a Rust-based unikernel. So combining the two cool words here, unikernels and Rust, and security. Go ahead, Martin. Yeah, there were two words. Okay. Hi, everyone. Thanks for coming to our talk. I'm going to be talking about RustyHermit, which is our Rust-based modular unikernel for microVMs. Who are we? This is us. So there's Stefan, who initiated the project a few years ago. There's Jonathan, and there's Martin, that's me. We are from the Institute for Automation of Complex Power Systems at RWTH Aachen University. Stefan is the academic director, Jonathan is a PhD student, and I'm a master's student. I'm currently writing my master's thesis with both Stefan and Jonathan as my supervisors, and yeah, I'm happy to be able to present our project to you now. Yeah, just a remark: this project has been funded through EU projects. Okay, RustyHermit. RustyHermit is a library operating system for creating unikernel images, similar to what you've seen before with Unikraft, if you were here. It started as HermitCore, a research project, around eight years ago, started by Stefan. That project was written in C and had a focus on HPC, high-performance computing. And in 2018, it was completely rewritten in Rust, every component of it, well, and assembly, but that doesn't count. Quick recap on unikernels, very similar to a slide you've seen before, presented by Simon. On the left, we have the classical Linux VM, running on a type 2 hypervisor here. And we have a fully-fledged operating system inside the VM image, which is quite large, and it has its own distinction between kernel and user space inside the virtual machine. Docker containers run on a container runtime; they have their own user space but share the kernel with the host system, which makes them faster and more flexible. Unikernels, on the right, are very small.
They are created by linking your application against a library operating system to create a tightly integrated unikernel image, which can then run on machines, real or virtual machines in this case. It has the same isolation from the host or other guests as classical Linux VMs. And since it's just one application and one process, we have a single-address-space operating system and no distinction between user space and kernel space. This is really good for performance, because we don't need to do any privileged context switches, which are costly otherwise. And we don't have preemption and don't do interrupts in that case either. Also, it's very small, because we can just throw away everything we don't need from the binary and have a runnable hello world image at around half a megabyte. We also focus on microVMs. MicroVMs are a special type of virtual machine platform that is more bare-bones, because it doesn't need to emulate things like PCI or ACPI. This of course requires paravirtualization, so the guest image needs to be specialized and know that we don't want to talk to PCI in this case. That can make the unikernel image even smaller in some cases. And let's talk about Rust. Our unikernel is written in Rust for a number of reasons. It's productive, it's fun, and it's safe. Rust has many modern language features that are really nice to work with compared to C or other older languages. It has a strong type system and helpful compiler errors, which are really a bliss if you're coming from C++ template errors. It has a growing ecosystem. It's being adopted by several big projects. I'm sure you've heard of Linux adopting Rust, at least in some part, already upstream. Rust also has great tooling. There's a very nice package manager that virtually everyone uses to put their projects into so-called crates. And there's great tooling for formatting and linting, for example.
For our case in OS programming, it's also very cool that you can use a large part of the Rust standard library without an operating system, like, for example, a vector, which is a growable, dynamically allocated array. The biggest point, which really put Rust on the map, is the last one: Rust is a safe language. It's the first major systems programming language that guarantees memory safety. And that's pretty cool, because memory safety is hard if you do it manually. I think if you've programmed C or C++ before, you might have dereferenced a null pointer and ended up with a segfault or something. And it's very cool if you don't do that. Just don't. In big projects like Chromium and in other cases, it's been shown that more than 60% of vulnerabilities are caused by memory unsafety. And moving those projects to Rust is done in the hope that it alleviates this problem. I have an example, a proof of coolness of the Rust language. Just one example that I picked to demonstrate the modernity and elegance. It's sum types, aka tagged unions. You can see at the bottom here that there is a generic enum type Option, which is either a None or a Some, and the Some then has some data in it. In Rust, these are coupled, so the Some variant of the enumeration contains the data. And it's really nice working with that. If we have an Option as shown at the bottom, we can match on it, handle either the None or the Some variant, and then use the data directly. I've kind of lied to you before, because Rust is really two languages. There's safe Rust and unsafe Rust. What does that mean? Safe Rust is awesome because safe Rust gives us all the guarantees that we want. Things like accessing invalid pointers, which would result in use-after-free, double-free, or out-of-bounds problems, as well as data races, are classified as undefined behavior in Rust. And using only safe Rust, these problems can't happen to you.
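The Option example described on the slide can be sketched as follows. This is only an illustration, not the code from the slide: the helper names `first_even` and `describe` are made up for this sketch, and the enum definition in the comment is a simplified rendering of the standard library type.

```rust
// The standard library defines Option roughly like this:
//     enum Option<T> { None, Some(T) }
// The data lives inside the `Some` variant, so it cannot be read
// without first checking which variant we have.

// Hypothetical helper: returns the first even number, if there is one.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

// `match` must cover both variants; the payload is only in scope
// inside the `Some` arm.
fn describe(found: Option<i32>) -> String {
    match found {
        Some(v) => format!("found {v}"),
        None => String::from("found nothing"),
    }
}

fn main() {
    println!("{}", describe(first_even(&[3, 8, 5])));
    println!("{}", describe(first_even(&[1, 3])));
}
```

Forgetting the `None` arm is a compile error, which is the kind of guarantee a null pointer in C cannot give.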
These guarantees don't ensure correctness, though. Things like race conditions, which are different from data races, or logic errors can still occur, which is natural, I think. When doing OS development and other low-level stuff, we have a few additional requirements, though. We might want to do raw memory access for MMIO. We sometimes have to write assembly code for invoking special CPU instructions. These, unfortunately, cannot be checked by the compiler for safety invariants. That means this is not possible in safe Rust. This is why unsafe Rust exists. Unsafe Rust is a strict superset of safe Rust. It means you can do everything that you can do in safe Rust, plus a few things more. But you have to tell the compiler that you promise to be extra careful and not do any bad stuff. You have special superpowers then. You can access raw pointers and call unsafe functions, which is required for inline assembly, for example. At the bottom, you can see how we can access raw pointers or write inline assembly, which, if we are not careful, might really do bad stuff. And this is why we have to put this in unsafe blocks. That means, if something goes wrong, we can just grep for any unsafe code and rethink whether we did everything correctly there. When writing this unsafe code, we have to be sure not to violate Rust's fundamental soundness property, which says that no matter what, safe Rust cannot cause undefined behavior. And if we encapsulate some unsafe code in a safe function, we have to make sure that this API cannot be misused in any way. Okay. Enough about Rust. Let's talk about RustyHermit again. RustyHermit is tightly integrated with the Rust language. It's our first language of choice for applications, and RustyHermit is very specialized for it. Now I'm going to show you how you would port a Rust application that runs on Linux to RustyHermit, which is really easy, I think. But let's see. We have a few requirements. Rustup.
The first one is the Rust toolchain manager that virtually every Rust developer has already installed. We then need, of course, a hypervisor of our choice. We can either use the ubiquitous QEMU or Uhyve. Uhyve is a specialized hypervisor created by us, in Rust of course, that is tailored to the RustyHermit operating system to have a really fast interface between the two. If we are compiling with symmetric multiprocessing for Intel processors, we also need NASM, but that's not important if you don't need that. Okay. This is a bare-bones Rust project. We have a Cargo.toml, which is the manifest file for the Cargo package manager. It describes the package metadata, and it just says hello world, version, edition, something. Not very important. We then have our main source file, main.rs, which is just a main function that prints hello world. All we need to do to get RustyHermit support is first add a RustyHermit dependency. It's written in a slightly complicated way so that the dependency is only included if we actually compile for the Hermit operating system. Then we just need to add two more lines to main.rs to import this dependency. What this does is that hermit-sys transparently builds the Hermit kernel, the library operating system, in the background, and by importing it like this, we make sure we actually link against it. What we then get is a runnable unikernel image that can be run in QEMU or Uhyve. To build this, we have to pin a Rust compiler version because we have some internal things that require that, but we're working on getting rid of that. And then we just build it. We say cargo build, then specify the Hermit target, which is our target, and then we tell it to build the standard library on the fly, because we are still small and only a tier 3 target, which is why Rust does not support us natively yet, but we support Rust. That was easy. To make sure that all of you can believe me, I have prepared a small demo.
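The porting steps just described can be sketched as follows. This is a hedged reconstruction rather than a copy of the slides: the manifest fragment and build command are shown as comments, the dependency version is deliberately left as a wildcard placeholder, and the exact `-Z build-std` arguments are an assumption that may differ between RustyHermit releases. The `greeting` helper is hypothetical and exists only to keep the example testable.

```rust
// Cargo.toml fragment (assumed form; the version is a placeholder):
//
//     [target.'cfg(target_os = "hermit")'.dependencies]
//     hermit-sys = "*"
//
// Build with a nightly toolchain, rebuilding the standard library for the
// tier 3 target (flags are an assumption, check the project README):
//
//     cargo build -Z build-std=std,panic_abort --target x86_64-unknown-hermit

// Import the kernel crate only when compiling for Hermit, so that it gets
// linked into the image. On Linux this line is compiled out and the same
// file builds as an ordinary program.
#[cfg(target_os = "hermit")]
use hermit_sys as _;

// Hypothetical helper so the message can be checked in a test.
fn greeting() -> &'static str {
    "Hello World!"
}

fn main() {
    println!("{}", greeting());
}
```

The `cfg` guard is what keeps the port non-invasive: the application source stays valid for every other target.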
I have to get on this screen. Right here you can see exactly the project I talked about. It's just a hello world with the hermit-sys dependency. There's a main.rs, which does hello world. Then we can go ahead and open a terminal and do cargo build, which is really fast right now because I pre-built it. Normally it takes around one minute on this machine I'm logged into. Then we can run it on Uhyve: hello world. To make sure that we didn't cheat, I can also show you the verbose messages, which tells Uhyve to please print the kernel messages along with it. We can see RustyHermit booting, initializing all the hardware, preparing the memory and everything, and then in the end jumping into our application and printing hello world. After that, it just shuts down. Okay, back to the presentation. Yes. Okay, now a bit about our modularity story in RustyHermit. There are several modularity stories. The first one is user-facing. This is a similar dependency declaration in our Cargo manifest as before, but a little bit expanded. We added features. Features are a thing in the Cargo package manager that allows us to select and configure conditional compilation in our dependency, in this case hermit-sys. We use this to specify which features we want to be present in the unikernel image. In this case, I enabled SMP, TCP, and DHCPv4 and disabled PCI and ACPI. This means that this should be runnable in a microVM, for example, with no PCI support present. Internally, we are also quite modular, and we're working on further modularizing our kernel. At the top, you can see the libhermit kernel, which has a few dependencies. The first one is an internal hermit-entry dependency, which is shared between the kernel and anything that loads and jumps into the kernel. We then have hermit-sync, an internal collection of synchronization primitives like mutexes.
The other crates are provided by the Rust ecosystem, which is really rich. The linked-list allocator is our allocation algorithm, which we just import and then use. We can also just import and use some device drivers or architecture-specific abstractions so that we don't even have to write assembly code ourselves. Also, smoltcp is our TCP stack. We just import it and configure it. We also contribute back upstream, which is cool, but this shows the strength of the Rust ecosystem and community for Rust OS development, I think. In the end, this is a broad overview of the Hermit ecosystem as it is today. On the left, you can see a unikernel image that has been built. At the top, we have the application. It's either a Rust application or a C application, although Rust applications are our primary focus, which then uses either the Rust standard library or Newlib as the C library. Those are customized by us to invoke the special syscalls into the kernel to do the required functionality, and this altogether makes up the unikernel image. This can then be run on either our specialized virtual machine monitor, Uhyve, or a generic VMM like QEMU. For QEMU, we have rusty-loader, which chain-loads our unikernel image, and rusty-loader supports some boot protocols, as you can see here. That's been the main part. What are we working on right now? I'm working on the first three things. Further codebase oxidization, which means making it more Rusty. That means applying Rust idioms more thoroughly, because there have been a few C-isms that we've been stuck with from the original project. I'm personally also working on Miri support, also as part of my master's thesis. Miri is an interpreter for Rust, which initially sounds strange, but using Miri, we can spot a few cases of undefined behavior if we do something wrong in unsafe code. If something runs in Miri, though, that doesn't mean that it is guaranteed to be correct, but it can help us in some cases.
The third point is more modularization, and I already talked about that. It's about spinning out internal drivers, for example, into separate projects and crates. Then the TCP/IP stack overhaul is something that Stefan is currently working on, and the Uhyve network overhaul is something that Jonathan oversees. We are also generally working on Firecracker support and ARM support, both of which have working prototypes but have not really been mainlined yet. Please find us on GitHub. We are always happy to have conversations and contributions. Yeah, that's been it. Thanks for listening. Okay, any questions for Martin? Unikernels, Rust, security. All righty. There's one. Yeah, I just want to know what the primary focus of this project is. Do you have some industry which is already picking up on Hermit, or is it pure science so far? What are the plans? As far as I understand, it started as a research project, and it's pretty much still that, I think, Stefan? Yeah, it's still a research project, but we use it in two EU projects, and the partners there are mostly from the cloud area and edge computing, and we want to use it there. Thanks. Hey, thank you for your talk. I have a question. As far as I know, the original C implementation supported quite a few more targets than only Rust and C. As far as I remember, you could run Go code as well, and Fortran, and some other stuff that linked against glibc, if I'm remembering correctly. Newlib, I think. And Newlib. Is there any plan to open up your targets for the new Rust implementation as well, to support some more stuff, not only Rust and C? As long as I've been there, it's been only Rust. I'm not that old in the project. I'm not sure what the plans are on further supporting that. We currently have bare-bones support for C, and I don't think the Go implementation is currently working. It's possible to get it working, but we are not really working on that actively, I think. So, any plans for RISC-V support?
We have a pull request for RISC-V support. This was also done by two students, but we didn't have the time to analyze it. So it's there, but there's a lack of time. Okay, so a proof of concept is working, but it's not upstream yet. This question obviously has to be asked: is there async support? Is there what? Async support. Async. Rust async. Do we have a runtime, like an async runtime? I think it's not mainline yet, right? So the kernel uses it internally for networking, and I think the exposure to user space via mio or something is not merged upstream, but it's something that we are actively interested in. Anything else? If not, thank you again, Martin. Thank you all for coming. |
Loupe: Designing Application-driven Compatibility Layers in Custom Operating Systems |
For our final talk of this session, we have Pierre here. He's going to discuss Loupe, a tool that he and we have been using to measure compatibility for different OSs. Pierre, you have the floor. Thank you, Razvan, and thanks everyone for attending my talk. This is joint work with a bunch of colleagues and students, including Hugo, my PhD student. He's the key player behind this work. I'm just, you know, getting all the media coverage, but he has built all this stuff, so all the credit goes to him. In this brief talk, I want to speak a bit about application compatibility for custom operating systems. I guess most of you don't need to be convinced that we still need custom operating systems today. When I say custom, I mean both research operating systems and prototype operating systems from the industry, right? The thinking that Linux has solved everything is not true, in my opinion. We still need things like Unikraft if you want to go fast or if you want to specialize like crazy. We still need things like RustyHermit if you want security, or seL4. So we still need custom operating systems, and the thing with these operating systems is that they're only as good as the applications that they can run, right? So compatibility is key. Compatibility with existing applications is extremely important. If you want to build a community, you want your users to go to your website, compile your custom OS, and then try some of their favorite applications, or try some of the highly popular applications in a given application domain, like Nginx or Redis for cloud. If you want to attract sponsors or investors, or even if you, like me, are a scientist and want to gather some early numbers to make a publication, well, you need to do that on standard applications, right? So compatibility is important, and another argument would be, how many times did you hear the word POSIX spoken today, right?
There were some slides where POSIX was written three or four times on a single slide. So compatibility is important, and it can be achieved in a few different ways, as we have seen with Simon. But one important thing to note is that, in my opinion, porting is not sustainable. Porting is what many of us do. We build a custom operating system, and then we take Redis, and obviously it doesn't work as is with our operating system, so we modify Redis a bit, we disable some features because we know that they make our OS crash, and then we have a version of Redis customized for our operating system. This is not sustainable because you can't maintain a branch of Redis for every operating system out there, right? In the long term, it doesn't work so well. Porting also basically means that you, as an OS developer, ask the users of your operating system, the application developers, to make some effort to be compatible with your operating system. This doesn't work; nobody is ready to make that kind of effort. Maybe if you give them a 10x performance speedup, but this is unrealistic. So what you want to do, once again in my opinion, as an OS developer, is to provide compatibility as transparently as possible. This means you emulate a popular operating system, for example Linux, or a popular abstraction like POSIX or the standard C library, and then you can be compatible at three different levels. The first level is source-level or API-level compatibility. You ask the users to compile their application code against the sources of your kernel, in the case of a unikernel. So you're still asking some effort from the users, right? And in many scenarios, you don't have access to the sources. If you have proprietary binaries or pre-compiled binaries, well, you can't have source-level compatibility. So it's not perfect, and binary compatibility is generally a more, let's say, pure version of compatibility.
And there are mainly two ways to achieve it. You can do that at the level of the standard C library, like OSv. So you will dynamically link your kernel plus a standard C library against your application, compiled as a position-independent executable or as a shared library itself. This is great, but if the application is making system calls directly to the kernel without going through the standard C library, well, once again, it doesn't work, right? And as a matter of fact, I have counted more than 500 executables in the Debian repository that contain the syscall instruction, right? So they make syscalls directly to the kernel. They don't go through the C library. Go, for example, makes most of its syscalls directly to the kernel and not through the C library. So what you want is to be compatible at the level of the system calls. Your kernel needs to emulate the syscall API that Linux is providing. This is the most transparent way of achieving compatibility. Now, this is scary, right? Linux has more than 350 system calls. Do we need to implement them all? Aren't we going to reimplement Linux by doing so? And some of them are extremely scary by themselves, right? You have hundreds of ioctls, and each of them probably requires its own implementation. The Linux API even goes beyond system calls. You have things like /proc and /dev that are, you know, used by many applications. The first thing a musl binary does when it runs is to look in, I believe it's /proc or /sys, to get the size of the terminal, right? So you need to have emulation for this part of the API too. And because this seems like a big engineering effort, it hinders the development of custom operating systems. So this is inspired by the keynote by Timothy Roscoe at ATC and OSDI 2021. We looked at all the papers. So these are top-tier operating systems conferences.
And we looked, over the past 10 years and over a total of more than 1,000 papers, at how many were about proposing a new operating system as opposed to things like security or machine learning. And among them, how many were just hacking Linux versus proposing an actual operating system implemented from scratch. And the numbers are similar to what we saw earlier, right? You have just a very, very, very few papers proposing a new operating system, because it's a significant engineering effort. And part of the effort is providing compatibility for an application like Apache or Redis to get a few numbers at the end of the paper, right? So this is the problem. Now, the particular problem that I want to talk about is that I'm sure several people in this room have attempted to build some form of compatibility layer for an operating system. And we are all kind of working on the same thing in parallel, with some form of ad hoc process that may benefit from some optimization. I've just listed here a few projects that have a syscall-level binary compatibility layer, but actually there are many more. And from what I understand, it is a very organic process. First of all, it is application-driven, right? People have a small set of applications in mind that they want to support. If you are doing cloud, you want to support the usual suspects: Redis, Apache, whatever. And the process basically looks like this. You take an app, you try to run it on top of your operating system. Obviously, it fails. You investigate. You're like, oh, I'm missing the implementation for this system call. So you implement whatever operating system features are required to fix that particular issue, rinse and repeat until the app is working. And then you go to the next app. So it's a very intuitive and organic process. When I built HermiTux, this is exactly what I was doing. So something that comes to mind is, can't we have some form of generic compatibility layer that we could plug in?
Something a bit like Newlib that would provide a generic interface. I believe it's not really possible, because most of the implementation needed to support the system calls is very specific to whatever operating system you are using. So it's not clear whether a generic compatibility layer can be achieved. But can we still somehow optimize that process? Some have tried static analysis. They take the binary of the application they want to support and they look at which system calls are made by the application. This was done in the best paper at EuroSys 2016, which analyzed all the binaries from the Ubuntu repositories, I believe it was 14.04. And they concluded that every Ubuntu installation, including the smallest one, requires more than 200 system calls and 200 ioctl, fcntl, and prctl codes, and hundreds of pseudo-files. So this doesn't help. It is still quite scary. It still represents a gigantic engineering effort. But do we want full compatibility with an Ubuntu installation? In the end, especially in the early stage of the development of an operating system, you just want to get a few applications up and running. And do you even need 100% compatibility? When I write a paper, I don't really care if everything is stable. I just want to get some numbers. So isn't there a better way? Obviously, maybe you think, yeah, let's do dynamic analysis. Let's run the applications that we want to support. We send them some input that we want to support, like I'm running Nginx and I'm submitting some HTTP requests or something like that. And then we trace the system calls that are made. This is going to give us a subset of the system calls identified through static analysis, which has a tendency to overestimate. So with this trace, the engineering effort to support an application and a set of inputs is a bit lower.
But it's still not a panacea, because it's not taking into account two things that we do when we implement compatibility layers. So this is my code. Don't judge me. One thing that I did with HermiTux: at some point, there was an app that was calling mincore to check whether some page of memory was swapped out or not. Actually, you know, there is no swap in most unikernels. So it really didn't matter to implement this. And this return value, you know, means operation not supported. So stubbing a system call is just saying, yeah, we don't support it. And you cross your fingers that the application has some kind of fallback path to do something else if the system call fails. And it works in some cases. And then we can do something even more nasty. Don't judge me again. You can fake the success of a system call, right? Surprisingly, in some situations, returning a success code, even if the system call doesn't have any implementation in your operating system, is going to work. I'll tell you a bit more about why this works sometimes. So stubbing and faking let you implement even fewer system calls than what you would trace with strace. So in the end, if you want to support an app or a set of applications in your custom operating system, the amount of system calls that you actually need to implement is obviously smaller than the entire Linux syscall API. Static binary analysis on the binaries of the applications you want to support will identify a subset of that. Still pretty big. It's an overestimate. Source analysis gets you more precise results, but it is pretty hard to achieve, and it is still overestimating. strace will give you, once again, a subset. Things start to look better. And among the calls traced by strace, you actually don't need to implement everything. You can stub and fake some of these syscalls. So can we measure that? Yes, with Loupe. Loupe means magnifying glass in French.
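The difference between implementing, stubbing, and faking a system call can be sketched as a small dispatch table. This is a toy model, not code from HermiTux or Loupe: the `Policy` type, `policy_for`, `dispatch`, and `sys_write` are hypothetical names, and the choice of which call is stubbed or faked is only an example. The numbers are x86-64 Linux syscall numbers (1 write, 27 mincore, 28 madvise), and 38 is the Linux value of ENOSYS.

```rust
// Linux errno for "function not implemented".
const ENOSYS: i64 = 38;

// How a hypothetical compatibility layer treats one system call.
enum Policy {
    Implemented(fn(u64) -> i64), // real handler in the custom OS
    Stubbed,                     // fail with -ENOSYS, hope the app falls back
    Faked,                       // report success without doing anything
}

// Toy handler: pretend every requested byte was written.
fn sys_write(len: u64) -> i64 {
    len as i64
}

// Example policy table; anything unknown is stubbed by default.
fn policy_for(nr: u64) -> Policy {
    match nr {
        1 => Policy::Implemented(sys_write), // write
        27 => Policy::Stubbed,               // mincore: no swap to ask about
        28 => Policy::Faked,                 // madvise: only a hint (example)
        _ => Policy::Stubbed,
    }
}

// Entry point a syscall trap handler would call.
fn dispatch(nr: u64, arg: u64) -> i64 {
    match policy_for(nr) {
        Policy::Implemented(handler) => handler(arg),
        Policy::Stubbed => -ENOSYS,
        Policy::Faked => 0,
    }
}

fn main() {
    println!("write   -> {}", dispatch(1, 12));
    println!("mincore -> {}", dispatch(27, 0));
    println!("madvise -> {}", dispatch(28, 0));
}
```

Only the `Implemented` entries cost real engineering effort, which is why knowing what can safely sit in the other two categories matters.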
Loupe is a tool that was built by Hugo, my student, and it's some kind of super strace that measures the system calls that are required to support an application. It can also tell you which ones you can stub and which ones you can fake. We used it to build a database of measurements for a relatively large set of applications. And with Loupe, if you give me a description of your operating system, basically the list of system calls that you already support, and you give me the list of applications that you would like to support, we run them through Loupe and Loupe can derive a support plan. For this set of target applications, and given the set of system calls that you already support, the plan tells you the optimized order of system calls to implement to support as many applications as soon as possible. I will give you an example of a support plan by the end of the presentation. From the user's point of view, Loupe needs two things to perform its measurement on a given application. You give it a Dockerfile that describes how you want to build and run the application for which you want to measure the system calls needed. And optionally, you may need an input workload. Think about a web server. It's not going to call many system calls until you actually start to send requests to it. Loupe will instantiate the application, launch it on a, you know, standard Linux kernel, and analyze the system calls that are made, and with a few tricks we are able to know which ones can be faked or stubbed. The result is basically just a CSV file with, for each system call that is made by the application: can it be faked? Can it be stubbed? Or does it require a full implementation? We store that in a database, and, you know, we populate the database with as many measurements as possible.
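One way to picture a support plan is as a greedy ordering over the measured requirements. The sketch below is an assumption about how such a plan could be computed, not Loupe's actual algorithm: it repeatedly picks the application with the fewest missing system calls. The `App` and `Step` types, the `support_plan` and `demo_plan` functions, and the per-app syscall sets are all made up for illustration.

```rust
use std::collections::BTreeSet;

// Syscalls an app needs fully implemented (after removing what can be
// stubbed or faked). The numbers used below are illustrative only.
struct App {
    name: &'static str,
    required: BTreeSet<u32>,
}

// One step of the plan: implement these syscalls, gain this app.
struct Step {
    app: &'static str,
    implement: Vec<u32>,
}

// Greedy heuristic: always unlock the app that is cheapest to reach next.
fn support_plan(supported: &BTreeSet<u32>, apps: &[App]) -> Vec<Step> {
    let mut have = supported.clone();
    let mut pending: Vec<&App> = apps.iter().collect();
    let mut plan = Vec::new();
    while !pending.is_empty() {
        // Index of the pending app with the fewest missing syscalls.
        let (idx, _) = pending
            .iter()
            .enumerate()
            .min_by_key(|(_, app)| app.required.difference(&have).count())
            .unwrap();
        let app = pending.remove(idx);
        let missing: Vec<u32> = app.required.difference(&have).copied().collect();
        have.extend(missing.iter().copied());
        plan.push(Step { app: app.name, implement: missing });
    }
    plan
}

// Hypothetical input: an OS that supports read, write, and close.
fn demo_plan() -> Vec<Step> {
    let supported = BTreeSet::from([0, 1, 3]);
    let apps = [
        App { name: "app-c", required: BTreeSet::from([0, 1, 41, 49, 50]) },
        App { name: "app-a", required: BTreeSet::from([0, 1, 3, 232]) },
        App { name: "app-b", required: BTreeSet::from([0, 1, 3, 232, 290]) },
    ];
    support_plan(&supported, &apps)
}

fn main() {
    for step in demo_plan() {
        println!("implement {:?} -> {} works", step.implement, step.app);
    }
}
```

A real plan would also list which calls to stub or fake at each step; that part is omitted here to keep the sketch short.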
And this database can, given the list of syscalls that are already supported by your operating system, give you some form of optimized support plan for the applications you want to support. Okay, so how does it work? When Loupe runs the application, first it does a quick pass of strace to measure all the system calls that are made by the application. Then, for each system call that we identified, we use seccomp to hook into the execution of that system call, and rather than actually executing it through the Linux kernel, we emulate the fact that the syscall is stubbed, so we just return -ENOSYS without executing the syscall. We can also emulate the fact that the syscall is faked: we return zero. And then we check whether the application works or not following the stubbing or the faking of this particular syscall. We do that for each system call that we have identified with strace. How do we actually check for the success of the execution of the application? We identified two types of apps. Some we call run-to-completion. That would be something like fio: you start fio, it runs for one minute, and then it exits, outputting some stuff on the standard output. With run-to-completion apps, we run the app instrumented with Loupe and we check its exit code. If it's different from zero, we consider that the run was a failure; it could have been killed by a signal or things like that. We can also optionally run a script after each run of the application to check its standard output. We can grep for error values, we can grep for success messages, something like, you know, 50 requests per second have been achieved. We can check the files that may have been created by the application, and so on. And then another type of application is client-server. With client-server apps, we run the app instrumented by Loupe and in parallel we run a workload, which could be wrk, httperf, the Redis benchmark for Redis, and so on.
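The measurement loop described above can be sketched as follows. This is a simplified model and not Loupe's implementation: the real tool intercepts syscalls of a running program with seccomp, whereas here the "application plus test script" is replaced by a plain function, `toy_app`, whose behavior is invented for the example. `Override`, `Verdict`, `classify`, and `needs_implementation` are hypothetical names.

```rust
// How the syscall under test is overridden for one run.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Override {
    Stub, // return -ENOSYS instead of executing the call
    Fake, // return 0 instead of executing the call
}

// One row of the result table (the talk mentions a CSV with this content).
#[derive(Debug, PartialEq)]
struct Verdict {
    syscall: &'static str,
    can_stub: bool,
    can_fake: bool,
}

// For every traced syscall, rerun the app once stubbed and once faked and
// record whether the run still counts as a success.
fn classify<F>(traced: &[&'static str], run_ok: F) -> Vec<Verdict>
where
    F: Fn(&str, Override) -> bool,
{
    traced
        .iter()
        .map(|&name| Verdict {
            syscall: name,
            can_stub: run_ok(name, Override::Stub),
            can_fake: run_ok(name, Override::Fake),
        })
        .collect()
}

// A syscall needs a real implementation if neither trick works.
fn needs_implementation(v: &Verdict) -> bool {
    !v.can_stub && !v.can_fake
}

// Toy stand-in for "run the app with one syscall overridden, then check the
// exit code and the test script". The outcomes are invented.
fn toy_app(name: &str, how: Override) -> bool {
    match (name, how) {
        ("write", _) => false,                  // output is checked, so it must work
        ("getrlimit", Override::Stub) => true,  // app falls back to a default
        ("getrlimit", Override::Fake) => false, // app would trust garbage limits
        ("close", _) => true,                   // return value is never checked
        _ => false,
    }
}

fn main() {
    for v in classify(&["write", "getrlimit", "close"], toy_app) {
        println!("{:?} needs implementation: {}", v, needs_implementation(&v));
    }
}
```

The cost of this approach is two extra runs per traced syscall, which is why a fast, automated success check matters.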
For client-server apps, we check the success of both the server and the workload. We check that the app doesn't crash; generally servers are not supposed to exit. So we check that it doesn't crash, and we check the success of the workload. Like, you know, if the Redis benchmark returns something different from zero, probably something went wrong. And then we are able to see, okay, I'm currently trying to stub the read system call, does the application succeed or not? So, regarding the database, let me check the time, okay. We analyzed the results. These results were obtained on a relatively small database of about 12, sorry, 15 highly popular cloud applications. So this is just a subset. What you have on the y-axis is the number of system calls that are identified by static analysis, in purple on the binary and in yellow on the sources. And then dynamic analysis. For each of these applications, we run both the standard benchmarks, that would be the Redis benchmark for Redis, wrk for Nginx, and so on, and we also run the entire test suite. The key idea with the test suite is that if you measure what's going on during the entire test suite, you get a very good idea of all the possible system calls that could be made by the application. Obviously, you need to assume that the test suite has good coverage, but that is the case with these very popular applications. And what we see is, first of all, you know, static analysis overestimates. This is not very surprising. The amount of system calls identified by static analysis is relatively high compared to what we get with dynamic analysis. And something interesting, too, is that the amount of system calls that can be stubbed or faked, so the green bits on the dynamic analysis bars, is actually quite non-negligible, right?
So what this means is that if you want to support Redis with the Redis benchmark, where binary-level static analysis tells you that you should implement 100 system calls, if you just want to run the Redis benchmark to get, you know, performance numbers for your paper, you actually need to implement just 20, right? So that's, like, divided by five, right? And if you want to pass the entire test suite of Redis, you need to implement about 40. That's still less than half of what static analysis is telling you. So it's kind of a message of hope, right, for building compatibility layers and for developing custom operating systems in general. So, yes, static analysis overestimates the engineering effort to support an app by a lot. And even naive dynamic analysis measures many more syscalls than what is actually required if you know that you can stub and fake syscalls. Another view of these results can be seen here. For each of the system calls, you know, zero is read, one is write, two is open, I guess, and so on, among our dataset of about 15 apps, how many of these apps require the implementation of the system call in question, right? So you have here the result for static analysis at the binary level. At the source level. This is strace without counting which system calls you can stub or fake. And this is what is actually required. If you consider that you will not implement what you can stub or fake, this is what you actually need to implement. And as you can see, you know, it's much, much, much, much less engineering effort versus what static analysis is telling you. And why does stubbing and faking work? Here you have a code snippet from Redis. If you stub getrlimit, the C library wrapper will return minus one. And as you can see, Redis will actually fall back on some kind of safe value, you know: I'm not able to find out the maximum number of files that I can open, so I'm going to fall back on 100, sorry, 1000.
And the reason faking works is that, for quite a lot of system calls, nobody looks at the result. This shows, for each system call, the percentage of apps in our dataset that actually check its return value. For some system calls the return value is almost never checked. It kind of makes sense, right, when you see this: why check the return value of close? And this is why faking works in many cases. Another question that we asked is: when you provide binary compatibility and you don't do porting anymore, basically all the effort of supporting apps is on you, the operating system developer. And this is how it should be, in my opinion, but how much effort does that mean in the long term, right? So we had a look at versions of Redis, Nginx and Apache over the last 10 years and at which system calls actually need to be implemented, in purple. And we saw that this number does not change very much, right? So once you make an app work, you do need to keep up to date with the most recent versions of that app as they come out, but that doesn't necessarily mean a very big engineering effort either. And these are the support plans. We had a look at Unikraft and Fuchsia, which are operating systems that already have relatively good support for a good number of system calls. And we looked at Kerla. Kerla is another unikernel written in Rust. I wouldn't say it's immature, but it doesn't have support for a lot of system calls. And for the set of 15 apps that we had in the database, we derived a support plan. Unikraft, for example, in its current state already supports most of the apps in our dataset. If you want to support an additional app, what you need to do is implement system call number 290 and stub these, and then you'll get memcached.
And next, if you implement this syscall, you get H2O, and then you need to implement these two syscalls and stub that one, and you get MongoDB. Same thing for Fuchsia and Kerla. Obviously, it's a bit more interesting because this one doesn't support many applications out of the box. And I believe I have time to do a quick demo. I'm going to do it real quick. I'm going to do a test with ls, which is like the simplest test, because we don't have a lot of time. In the Dockerfile, I just copy a test that I'm going to show you, and then I call what is kind of the top-level script of Loupe with a few options that don't matter that much. And I say, okay, the binary that we are going to instrument is /bin/ls, and this is the parameter. So I'm going to do ls /, and we are going to check whether it works or not with every possible syscall that can be invoked by ls. And the test, which should be there, is the test that we are going to run after each execution of ls to see if things have worked. This shell script takes the standard output of ls as a parameter, and to make things simple, I'm just checking that ls actually outputs something, right? I'm doing ls /, so something should be output. If nothing is output, there is a problem. And keep in mind that Loupe is also checking the return value of ls itself. So, okay, I'm launching Loupe like this, so it should work. What happens under the hood is that we build the container that we've seen the Dockerfile for. We are starting two containers in parallel. Each one is running a full set of tests, trying to stub and fake all the system calls. And we use this to check for differences between the replicas in case there is a problem. Most of the time, there are no differences. So it takes a bit of time. And then, okay, it's done. So, if we go to the database, we now have many more than 12 apps.
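The support-plan idea described above can be sketched as a small greedy loop. This is not the tool's actual algorithm, only one plausible way to order the work: always pick the app that needs the fewest additional system calls. The syscall numbers other than 290 and the per-app requirement sets are made up for illustration.

```python
def support_plan(os_supported, app_needs):
    """Order apps so that each step needs as few new syscalls as possible.

    os_supported: syscall numbers the OS already implements.
    app_needs: {app name: set of syscall numbers that must really be implemented}.
    Returns a list of (app, syscalls to implement at this step).
    """
    supported = set(os_supported)
    remaining = dict(app_needs)
    plan = []
    while remaining:
        # Cheapest app first; ties are broken alphabetically for determinism.
        app = min(remaining, key=lambda a: (len(remaining[a] - supported), a))
        missing = sorted(remaining.pop(app) - supported)
        plan.append((app, missing))
        supported.update(missing)
    return plan

plan = support_plan(
    os_supported={0, 1, 2},
    app_needs={
        "memcached": {0, 1, 290},
        "h2o": {0, 1, 290, 300},   # 300 is a made-up number
        "redis": {0, 1, 2},
    },
)
```

With these invented inputs the plan says Redis already works, memcached needs syscall 290, and H2O then needs only one more.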
And if we go to ls, the most interesting result is this CSV file, which contains, for each syscall, 0 being read, 1 being write: is it called by ls or not? Can we fake it? Can we stub it? Or can we both fake and stub it? Or rather: does the application work when it's faked? Does it work when it's stubbed? And does it work when it's both faked and stubbed? And as you can see, some syscalls, like 11, I don't know which one it is, can be both stubbed and faked; same thing for 12, same thing for 16. Some syscalls, like read for example: it is called, but you can't stub or fake it, which kind of makes sense. ls wouldn't work if it can't read. And yeah, that's pretty much it. So briefly, what we are currently working on is some more fine-grained measurements. Some system calls have kind of sub-features: a lot of programs require at least MAP_ANONYMOUS for mmap, to allocate memory, but not really the ability to map a file. So we are looking at checking which flags can be stubbed or faked, and things like that. And we are also looking at the virtual file system API. That's it. So building compatibility layers is important for custom operating systems. It seems a bit scary, but actually, it's not that much engineering effort. |
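A result file like the one shown in the demo could be post-processed along these lines. The column names and the Y/N encoding below are assumptions made for this sketch, not the tool's documented format; the point is only that "must implement" means "called, and the app fails in every stub/fake variant".

```python
import csv
import io

# Hypothetical layout: one row per syscall number.
SAMPLE = """syscall,called,works_stubbed,works_faked,works_both
0,Y,N,N,N
1,Y,N,N,N
11,Y,Y,Y,Y
12,Y,Y,Y,Y
202,N,N,N,N
"""

def must_implement(csv_text):
    """Return syscall numbers that are called and can be neither stubbed nor faked."""
    needed = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        avoidable = "Y" in (row["works_stubbed"], row["works_faked"], row["works_both"])
        if row["called"] == "Y" and not avoidable:
            needed.append(int(row["syscall"]))
    return needed

needed = must_implement(SAMPLE)
```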
Monitor your databases with Open Source tools |
Hello, everyone, can you hear me well? How do you feel today, Sunday, the fourth day? Did you wake up with energy? Or are you like... We have energy, right? Okay, that's good. I'm Edith Puclla from Peru. I traveled to Europe for the first time this year. It's also my first time giving a talk in English, so if I make some mistakes, I hope you can understand. I am a technology evangelist at Percona. I started six months ago. It's a really great company focusing on databases. If you want to follow me on Twitter or connect on LinkedIn, I will be really happy. I usually publish things about open source, containers, Kubernetes. About me: I am from Peru. I am a Google Women Techmaker. I was nominated as a Docker Captain last year. And I am a container and database enthusiast. So this talk is about monitoring your databases with open source tools. We are going to focus on PMM. How many of you have heard about PMM before? Percona Monitoring and Management. Okay, so it's something new for this room. This is a beginner's perspective, so it's not something advanced. I will look at this monitoring tool from that perspective, because I am learning about databases at the company where I am working, Percona. So we are going to look at the value of monitoring databases and why it's important. We will also see PMM, Percona Monitoring and Management, and how we can effectively use the dashboards and graphs that it gives us to monitor and manage our databases. During all this time working at a database company, I asked myself this, and I realized that it is important to monitor databases, because we may not have just one database running. We can have several of them. We can have them in the cloud or on our own infrastructure. So it's very important, and that's why we should ask why we should care about databases, why we should care about monitoring these databases. And we can ask questions like: is my database performing well?
As we start to work with databases, there are several queries that we make. Maybe these queries are not executing in the time that we expected. They could be taking more time, and that creates a bottleneck. So we should care about this and know that there are metrics that help us detect performance problems in our databases. Another question we could ask ourselves is: are my databases available and accepting connections? We can have several databases, and many connections could be made, but if we don't put a limit on these connections, the database could just crash. And if we do put a limit, we should be aware of when we are reaching that limit, to increase it or maybe to stop the connections. If we pass the limit, this is going to be a problem for our databases, right? If this is an e-commerce company, it matters, because the user is going to wait just one or two seconds to go to the next page. If it takes three seconds, five seconds, the user is going to go to another site and we lose as a business. Is my system stable? We also monitor the infrastructure where our database is running, not just the databases, because they run on top of an infrastructure. We are using CPU, memory, disk, and if we are not able to provision these resources on time, we will have problems with our databases. Am I experiencing avoidable downtime? If our hardware is not enough, our application could crash, and we can have hardware failures or network outages. We should be aware of these metrics to avoid these problems. We can also have human errors or crashes, and there are metrics that let us identify these beforehand. But we don't have to wait to see the problems until we are already having them. We can also prevent these problems by asking these questions. Am I minimising performance issues that can impact my business?
Am I able to identify these issues before they happen? Because there is a way to prevent them. As in the previous example, if we are approaching the limit, we can see it before we reach it. So we can prevent the problem by provisioning more resources, or maybe by checking the query that we already saw is taking too much time to execute. So there are ways we can prevent these problems. Nowadays, there are challenges when we think about monitoring databases. For example, data volume growth. We don't talk about gigabytes now. We are talking about terabytes or maybe more, because databases hold a lot. It's a challenge to monitor these days. The complexity of modern databases. Right now, there are SQL databases and NoSQL databases. Databases have different models. They could run in different cloud providers. So the complexity just grows. Now it's different than before. Downtime and data loss. That's a risk that we should monitor in order to prevent it. Lack of visibility. If we don't have these things ready to check, if we have to do something else first, like maybe create Linux scripts or bash scripts to get these metrics, we don't have that data on time. We don't have those metrics on time. But if we have that visibility, it's going to be easy for us to detect these problems and monitor our databases. Even better if we have everything in one single dashboard where we can visualize it. Alert fatigue. We can monitor many things, and we have different databases: we can have MySQL, we can have other kinds of databases. So it depends on the business what we want to monitor, what metrics we are going to collect. But if we don't know, among all these things that we have for databases, what matters, we are going to collect a lot of things that maybe we don't need and create alerts for things that we really don't need. Then we are not focusing on the business exactly.
When we monitor, we should try to focus on the business. What are the metrics that are important for us, for our business? Integration with other tools. Nowadays we do continuous integration and continuous delivery, and that also increases the complexity of monitoring our databases, because this is not in a single place. It goes through a process that also involves DevOps. These things are being automated. Now that we know why we should care about monitoring databases, we will talk about one of the solutions that could help us monitor. This is Percona Monitoring and Management, which is PMM. It is a free and open source tool, also based on other open source tools like MySQL. It lets us monitor databases like MySQL, MariaDB, PostgreSQL and MongoDB, but not just that. As I said before, it also lets us monitor the infrastructure where our database is running. It's important to know about that. It also helps us improve the performance of our databases and simplify their management, and we can enhance security. Percona Monitoring and Management is built on top of other open source tools like Grafana. I know many of you use Grafana, right? Who uses Grafana? Okay, a lot. And also VictoriaMetrics, to store the metrics that we collect over a long time. We use ClickHouse to create reports in real time from all the metrics that we collect over time. We use PostgreSQL to store all the metadata and all these metrics for databases, all the important data that we have in PMM. Everything that we visualize is saved in this database. And we use Docker to install PMM. We can containerize the installation of PMM, the client and the server, and use it on different platforms. We also use Kubernetes operators for scaling our databases. There are three levels of depth when we talk about the PMM interface.
This is the big one, which is a dashboard that we all know, but we can go deeper and see the graphs, that is a graphical representation of the metrics in long period of time. And we have the metrics also, which is a countable number that represents some value, some important value of our infrastructure of our database. Okay, as I tell you before, what we want to monitor is going to depend of our business. We are not going to monitor the same metrics as another business. It depends a lot on that. And we also should aware of the alerts that we create that should be focused on what we do as a business and create this alertness and notifications that could be notified when we need it. So it could be integrated with Slack, with many other tools that we have in the hand to know when the problem is happening exactly. Some important metrics, some of that that we can check with PMM, is query performance, high CPU, high memory usage, and this high disk part of the infrastructure, the amount of user connections. We can know when the data grows and other kind of metrics that we can have. Could somebody tell me what other kind of metrics we can check with PMM? With some monitoring tool? A part of the infrastructure or... Okay, we'll check. If we see the long query response times, as I say, some queries could take some time, PMM has a very good dashboard, which is this, is Khan, query analytics dashboard, where we can see for a specific... We can see all the queries for our databases. In this case, I'm seeing all the 10 top queries that is running in our databases, but also we have an option to check it here if we want to just check for MySQL, for Postgres, or MongoDB, and we will see the amount of queries per second for example that we are running and how time is taking. So if we open this dashboard when we are working in databases, the first thing that we are going to see is, okay, this query is taking too much time. 
In this case, no, but if we have a query that is taking too much time, or that is running many times per second, we can see it first here, and we can start troubleshooting from that point. High CPU utilization is part of the infrastructure. It's also important to know how this is going. For example, in this dashboard in PMM, in Percona Monitoring and Management, we can look at a specific node. We can pick a node, because we can have different nodes running in our infrastructure, and in this case, for example, we are using 25% of our disk. This may not be a good example because I checked the last 12 hours, but we can check our disk usage over six months or more, and that is when we can see trends and make decisions. Let's say this is six months and we are using just 25% of our disk. That could be a problem, because we have a lot of infrastructure that we are not using, and we are wasting money on space that we haven't used in six months. We can reduce our CPU and save money. High memory usage: this is the dashboard where we can see the amount of memory that I have for my databases, and we can also see what is used for cache, what is in use, and what is ready to be freed. This is good because we can also see when we are reaching the memory limit and take action, provisioning another disk. We can say that this is very easy when we are working in the cloud, because we just click a button and say, hey, provision another disk or increase the memory, but if we are working on our own infrastructure, maybe in a private cloud, this is hard, because we have to prepare the logistics to get another disk or more memory, and that is going to take time. Having this kind of visualization helps. The amount of input and output that we do on our disk can also be checked in this graph. We can see the latency here. In this case it is stable, but we can have peaks that show us where these problems are.
User connections, as I say, it helps to monitor the number of active database connections and size it appropriately, and also put limits in our connections for our databases. We have, in this case, MySQL connections. We can see that you are for other databases too, but in this case, it's also stable. We are going to have peaks when we can see that we are working with a lot of transactions on our databases, and we can take actions with that. The maximum of connections allowed is 151. We are in 150. If this is going to be 151, this is going to alert us. You have to check this. The data grow also. We can see a dashboard where we can... Where you can see where our data is a lot when we are inserting a lot of data in our databases, or we are just removing things. In this case, it's going to show when my databases start, and it's not like too good to see it here, but if I have time, yes, I still have time. I still have time, right? To show something, to show the dashboard. I have time. If you want to try it right now, we can check this PMM demo per con graph. You can enter. Right now, we are going to check the dashboard, but what we learned now, it was some aspects that let me think what we should keep away and monitoring our databases, and also how to explore PMM, which is an open source tool, is available there. You have to double down and start to check it and explore it, and it's easy to visualize things, so we are going to check now PMM. You can also enter to this link, so it's free to experiment. Let's see if this is going to work. Yes, this is the dashboard that we have. We have several nodes that we can choose here. A lot of them, so it depends on the database. We have nodes of MongoDB, MySQL, Postgres, and MySQL proxy also. We can check the details for a specific time. 12 hours is not enough, maybe, for some things, but maybe six months, three months, we have a lot of things here. So we also can check things about the system operator. 
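The connection-limit example above (150 connections in use against a maximum of 151) boils down to a simple threshold check. The sketch below is not how PMM defines its alerts; the function name and the 90% warning threshold are assumptions chosen only to illustrate alerting before the limit is actually hit.

```python
def connection_alert(current, max_connections, warn_ratio=0.9):
    """Classify connection usage; warn_ratio is an assumed early-warning threshold."""
    if current >= max_connections:
        return "critical"   # limit reached: new connections will be refused
    if current >= warn_ratio * max_connections:
        return "warning"    # close to the limit: time to act
    return "ok"

status = connection_alert(150, 151)
```

With the numbers from the talk this yields a warning one connection before the limit, which is the "see it before we reach it" behaviour the speaker describes.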
I don't have access right now to see that, but we can also register alertings for this. In this case, we have three databases that are being monitored for Postgres. One of that is the database for PMM itself. We have nine databases in Mongo and 15 databases in MySQL. This is a good thing for PMM, because you can see everything in one single dashboard, and we can go deeper for each node or for each database as you want. Yeah, this is all. Thank you so much if you have some questions. So thanks a lot for the interesting talk. Does anyone have questions? Hi, hello, good morning. The query monitoring dashboard do have some advice on how to perform these queries, the bad queries, the slow queries. Some advice on how to perform the bad queries, the slow queries, how to rewrite them. If you go deeper into that query, you will open another dashboard where you are going to be able to see suggestions. You are doing something bad here, and you can fix it with that. Hi, thank you for the talk, first of all. One question regarding the PMM query analytics. Is it possible to filter by connection and not by database? Say it again, please. I want all queries from one connection instead of... Yeah, is it possible? Yeah, you can have just one connection. Okay, then we have to talk. Yeah, okay, thank you. |
Observability in Postgres
The Good, the Bad, and the Ugly |
The name of my talk is Postgres Observability. My intention is to show you what's great about Postgres and how it integrates well with observability, but also where some of the problems are. Obviously, in 25 minutes, it's not going to be an exhaustive presentation of all of the metrics in Postgres, but maybe I can give a bit of an introduction. So first of all, my name is Gregory Stark. I work for Aiven, in the open source programs office, contributing to Postgres. Aiven is a data infrastructure hosting company. We host Postgres, but we also host a range of other data services, including some observability services. It's all open source software, and we contribute back to the projects that we sell. So I'm sure in this room, most people have seen the clichéd three pillars of observability. In modern software, people expect their logs to be structured so they can send them to some sort of indexed, aggregated log system, something like OpenSearch. They expect a time series database to hold all their metrics, with labels and well-defined semantics. They expect distributed tracing. Postgres is still actively developed and has modern relational database features, but for things like this, Postgres is going on almost 30 years now. Our logs, our metrics, our tracing tools predate most of these modern distributed system concepts. So what these look like in Postgres is: we have very good logs, but they're meant for a human to be reading in a text file. We actually support JSON logs, but the actual error message, the actual log message, will just be a string inside that JSON struct.
All the JSON structured information, labels and so on are the metadata about the log line, things like process ID, session ID, the actual like table name being mentioned in the error, the actual, well actually current user is one of those columns, but if the error message mentions a user name or a table name or an index, it's just going to be part of the string. There's tons of metrics in Postgres, and I'll go into more detail, I'm mainly going to be talking about metrics here, but they're in SQL, they're not in like Prometheus exposition format or open metrics or anything like that. And then there's explain plans are basically a tracing tool, but it's meant for a human to be investigating on a single system, it doesn't integrate into any sort of distributed tracing tools. So I want to spend a little bit of time showing you what like the metrics in Postgres look like because it gives you, I can't show you all of them, there are hundreds and hundreds, probably thousands, but I want to give you a feel for like the kinds of in depth metrics that Postgres does provide. It does give you, there's a whole component inside Postgres whose job is to track metrics about your objects, your tables, indexes, functions, things like that. So those are mostly quantitative metrics, cumulative counters that are counting how many times events have occurred or how many seconds have elapsed while doing operations on your table. There are also other kinds of metrics that don't map so well to quantitative Prometheus-style metrics, and I'll show you, if I have time, I'll try and show a bit of why those are difficult to map to time series databases like Prometheus. The thing to understand is Postgres exposes these things through SQL. The way you access these metrics is by logging into the database and running SQL queries. 
So for example, this is pg_stat_database. I realize you probably can't read it very well, but hopefully you can see the general shape of it. I'll describe it: there's one line for each database inside the Postgres cluster. So there's a database called postgres, there's a database called template1, a database called template0, and another database with my username, stark. And each row of this table (it's actually a view, there's no storage attached to it, it's a dynamically generated table, a virtual table, say), each row represents the metrics for that table. So it shows you the number of backends that are connected to that database. I think I said table before, I meant database. It shows you the number of backends connected to that database, which is a gauge in Prometheus parlance, it can go up and down; the number of transactions that have committed on that database, I think that's since the database startup actually; the number of transactions that have rolled back; the number of blocks that have been read on that database; the number of blocks that were hit in the shared memory cache. And actually this is truncated, there's a good number of more columns as well, but the key point is there's a row for each database and there's a bunch of metrics about that database. And then you can go into more detail. There are similar views to show you metrics about your tables: the number of sequential scans that have occurred on a table, named pgbench_branches in this case, and pgbench_accounts, pgbench_tellers. Each row is a table, and this is showing the number of these various operations, like sequential scans, tuples read, index scans, for each of those tables.
So in like Prometheus or other time series database world, you would probably want to make the relation name, the table name here, a label on your metric, you probably also want the schema name as a label, you might want the ID number, which is that first column as a label, you actually have a decision to make there, do you want the time series to be tied to the ID number or the name, so if you rename a table, is that a new time series or not? So the tool in Postgres world, like that mapping, those decisions are made somewhere, where the mapping has to be made is in an agent that connects to the database, runs SQL and exposes the data in Prometheus exposition format or open metrics, so the agent, the standard agent for Prometheus is called Postgres exporter and it has built in queries for these things, it has built in ideas about what the right labels are for the metrics and how to map these data types, these are actually all 8 byte integers which need to be mapped to floating point numbers for Prometheus, so like there's all kinds of hidden assumptions that Postgres exporter has to be making to map this data to the monitoring data, the data for Prometheus or M3 or whatever time series database you're using. 
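The mapping an exporter has to perform can be illustrated with a short sketch. It does not reproduce Postgres exporter's code: the function is hypothetical, the metric name is only illustrative, and the rows are plain dictionaries standing in for the result of a query against `pg_stat_database`. It shows the two decisions mentioned above: which column becomes a label, and the conversion of an 8-byte integer counter to a floating point sample.

```python
def to_exposition(rows, column="xact_commit"):
    """Render one counter column as Prometheus text exposition lines.

    rows: dicts with a "datname" key and an integer counter under `column`.
    Label values are not escaped here; a real exporter would have to do that.
    """
    metric = "pg_stat_database_" + column   # illustrative metric name
    lines = ["# TYPE " + metric + " counter"]
    for row in rows:
        datname = row["datname"]            # design choice: database name as the label
        value = float(row[column])          # bigint -> float64 sample
        lines.append('%s{datname="%s"} %s' % (metric, datname, value))
    return lines

lines = to_exposition([
    {"datname": "postgres", "xact_commit": 1200},
    {"datname": "stark", "xact_commit": 57},
])

# One hidden assumption of the bigint -> float mapping: above 2**53,
# neighbouring integers collapse to the same float64 value.
precision_lost = float(2**53 + 1) == float(2**53)
```

Labelling by name rather than by ID number means that renaming the object starts a new time series, which is exactly the trade-off the speaker raises.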
I don't have time to go into how you would use these particular metrics to tune your database, but one point is that when these metrics were originally designed, the idea was that you would have a DBA logging into your database, querying specific rows with a WHERE clause, maybe doing calculations where you divide one column by another to find out the number of tuples each sequential scan is returning, and things like that. Obviously, in a modern observability world, what you're actually going to do, what Postgres exporter actually does, is just a SELECT * with no WHERE clause: it takes all this data and dumps it into a time series database, and then you do those same calculations in PromQL or whatever the equivalent is in your observability tool. That gives you the same kind of flexibility, but now you can look at how those metrics relate to metrics that came from other sources, so you get a more global view: you can aggregate across multiple databases, and across your Postgres databases and other systems. So a lot of the flexibility that these views are designed to give you is no longer relevant when you're just doing a simple SELECT * and dumping it all into Prometheus. There are also more complicated metrics which don't really map well to tools like Prometheus or M3, Datadog, whatever. So this is pg_stat_activity. There's one row for each session. Just to explain what you're looking at: the first result set there is the first half dozen or dozen columns, and in the second set I've elided the columns after pid and shown you the next bunch of columns, just because I wanted to make a point about one of those columns that would be way past the edge of the screen.
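The "divide one by another" calculation mentioned above can be written down as follows. Because the columns are cumulative counters, the sketch works on the difference between two snapshots, which is roughly what a rate-based query in a time series database would do. The snapshots are plain dictionaries standing in for two reads of a table's row in `pg_stat_user_tables`; the numbers are invented.

```python
def tuples_per_seq_scan(before, after):
    """Average tuples read per sequential scan between two counter snapshots.

    Returns None if no sequential scan happened in the interval
    (or if the counters were reset, making the difference negative).
    """
    scans = after["seq_scan"] - before["seq_scan"]
    tuples = after["seq_tup_read"] - before["seq_tup_read"]
    if scans <= 0:
        return None
    return tuples / scans

ratio = tuples_per_seq_scan(
    {"seq_scan": 10, "seq_tup_read": 1_000},
    {"seq_scan": 14, "seq_tup_read": 401_000},
)
```

A large value on a big table is the usual hint that queries are scanning the whole table and an index may be missing.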
So in pg_stat_activity you have one row per session on the database, and obviously that is already difficult to put into Prometheus, because you would have time series coming and going every time an application connects and disconnects. I think what Postgres exporter puts in the data is aggregates: it just puts a count of how many rows are present, and then maybe the minimum and maximum of some of these columns. But there is data in here like wait_event_type and wait_event. Those are text strings. Inside Postgres those are actually ID numbers, but they get presented to the user in a nice readable format, which, if you want to make metrics of them, you then probably turn back into numbers or put in labels. They're difficult to really make use of in a time series database. Some of them are quite important to have some idea of. There's information in pg_stat_activity that will show you if a session is idle in a transaction, and you really do want to know if there's a session that's been idle in transaction for a long period of time. So what most people do there is have an aggregate: they have one gauge for the longest time that any session has been idle in transaction. So just to be clear, what we're talking about here is Postgres exporter, which is connecting to PostgreSQL and querying pg_stat_user_tables, pg_stat_user_indexes, pg_stat_activity, all the various views that start with pg_stat. Postgres exporter is also very flexible: you can configure customized queries to query other views, since for some of the pg_stat views you might want more detail than the default queries give you.
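The "one gauge for the longest idle-in-transaction session" aggregate can be sketched like this. The rows are dictionaries standing in for `pg_stat_activity` rows, using its `state` and `state_change` columns; the function itself and the fixed timestamps are made up for the example and are not taken from any exporter.

```python
from datetime import datetime, timedelta

def max_idle_in_transaction_seconds(rows, now):
    """Collapse per-session rows into a single gauge value.

    Returns 0.0 when no session is idle in transaction, so the time
    series stays present instead of appearing and disappearing.
    """
    ages = [
        (now - row["state_change"]).total_seconds()
        for row in rows
        if row["state"] == "idle in transaction"
    ]
    return max(ages, default=0.0)

now = datetime(2023, 2, 5, 12, 0, 0)
sessions = [
    {"state": "active", "state_change": now - timedelta(hours=1)},
    {"state": "idle in transaction", "state_change": now - timedelta(seconds=90)},
    {"state": "idle in transaction", "state_change": now - timedelta(seconds=10)},
]
gauge = max_idle_in_transaction_seconds(sessions, now)
```

The aggregate keeps cardinality fixed at one series no matter how many sessions connect and disconnect, at the cost of not telling you which session is the culprit.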
So it doesn't actually include all those table statistics by default. If you have an application where your schema is fairly static and you have a reasonable number of tables, you can quite reasonably get all of those columns, put them in Prometheus and be able to do all kinds of nice graphs and visualizations, but that's not standard. If, on the other hand, you're an ISP with hundreds of customers and your customers create and drop tables without your control, then you can't really be trying to gather statistics like that, because you're taking on unbounded cardinality, with time series coming and going without you being able to control it. So the level of detail that you grab is very dependent on how you're using Postgres: whether you're a site with one key database that you want to optimize, or many, many databases that you just want to monitor at a high level; an application that you control, versus applications that you're hosting for other people. It also means that many sites add queries in Postgres exporter that query other data sources. What I've put in this diagram here is pg_stat_statements, which is an extension in Postgres that gathers statistics for your queries. The key in there is a query ID, which is like a hash of the query with the constants removed, and you can get long-lived statistics about which queries are taking a lot of time or doing a lot of I/O. But that's, again, a custom query that you would be adding. So I talked a bit about the difficulty of mapping some of these metrics to Prometheus. There are other problems with... How am I doing for time? Okay. So I do want to talk a bit about the kinds of problems that we have. Some of the metrics don't map very well to Prometheus metrics.
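The idea of "a hash of the query with the constants removed" can be imitated with a toy. This is explicitly not how Postgres does it: the real query ID is computed from the parsed query tree, whereas the sketch below just rewrites the SQL text with regular expressions and hashes the result. The function name and the choice of SHA-256 are arbitrary.

```python
import hashlib
import re

def toy_query_id(sql):
    """Toy fingerprint: strip literals, normalise case and whitespace, then hash."""
    normalized = re.sub(r"'(?:[^']|'')*'", "?", sql)                 # string literals
    normalized = re.sub(r"\b\d+(?:\.\d+)?\b", "?", normalized)       # numeric literals
    normalized = " ".join(normalized.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

same = toy_query_id("SELECT * FROM accounts WHERE id = 1") == \
       toy_query_id("select *  from accounts where id = 42")
different = toy_query_id("SELECT * FROM accounts WHERE id = 1") != \
            toy_query_id("SELECT * FROM branches WHERE id = 1")
```

Because all executions of the same statement shape share one ID, the number of distinct keys stays bounded by the number of distinct statements rather than the number of distinct parameter values.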
The fact that the metrics can be customized, and in fact kind of have to be customized because Postgres is used in different ways at different sites, means that... there is a standard dashboard in Grafana for Postgres, but it's a very high-level dashboard. I think I do have a screenshot there, yeah. There is a dashboard for Postgres, but it is not showing individual tables and individual functions and so on, because on many sites that data wouldn't even be present. You have to add custom queries for it. It also means you have to deploy the agent. You have to run this sidecar, this Go program, alongside your database everywhere you deploy your database. Or, depending on how you deploy it, you can deploy a single one for all your databases, or one for all the databases running on one host, so the mapping of which agent's metrics correspond to which actual database is entirely dependent on how you manage your deploys. I can't go into all of the problems, but (I gave names to each of these classes) the resource contention problems are that, because Postgres is exposing this information through SQL, you have to have a working SQL session in order to get the metrics. So when your system is not functioning correctly, you're very likely to also lose all the data you need to debug the problem. If you're running low on connections, or you're running into transaction wraparound, or the system is just out of memory, or getting disk errors, quite often you also lose all the metrics that would allow you to figure out which application component is using all the connections, or which table it is that needs to be vacuumed to recover from the transaction wraparound issue. I've actually run into a problem where a table was locked by the application, and the custom queries needed that same lock.
So the queries all disappeared, the metrics all disappeared, because the Postgres exporter was getting blocked on that lock. When I tried to recreate it for a demo, I actually found, oh, this is not a lock, this is, I actually caused the regression test on Postgres to fail, because one of the regression tests tries to drop a database. And the Postgres exporter keeps a connection to each database, because the metrics, like I said, you need a session, you need a connection to the database, so you need, in Postgres, each session is tied to a specific database. So if you have a dozen databases, it uses a dozen connections, and it keeps those connections, it's optional, it's to work around the problem that it might not be able to connect if you have a problem, but as a result, it has persistent connections to those databases, and the regression test failed when they tried to drop that database. And that could actually happen in production. If you try to do a deploy and roll out a new version of some data that drops a database and recreates it from scratch, if you have Postgres exporter running and it has a connection, you could run into the same kind of issue. So I'm hoping, I'm already working on something to replace Postgres exporter with a background worker inside Postgres, so you would be connecting directly to Postgres, you wouldn't have to deploy a separate program alongside it, and my goal is that that program would have standardized metrics. That program would have standardized metrics that every dashboard or visualization or alerting, so we could have mix-ins that have alert rules and visualizations, and it would be able to rely on standardized metrics that will always be present, and they would be exported directly from shared memory without going through the whole SQL infrastructure. So it would avoid depending on locks and transactions and all of the things that could interfere with or be interfered with by the application. 
It's still early days. I have a little proof of concept, but it's not going to be in the next version of Postgres; it's definitely experimental. The main difficulties are going to be sort of definitional problems: for example, the table names, like I mentioned before. Should a time series change when a table gets renamed? But in fact, I have a bigger problem, because the table names are in the catalog, the schema catalog. They're not in shared memory, and we don't really want them in shared memory; that brings in the whole risk of character encoding changes and collations. So it probably will only replace the core database metrics, and then you would still probably deploy a tool like Postgres Exporter only for your custom queries, only for more application-level metrics, not for monitoring core Postgres metrics. So my hope is that when you deploy Postgres, you can add it to your targets in Prometheus and not have to do any further operational work to get dashboards and alerts. Two more minutes. It feels like time is elastic here. So I skipped over, I mean, so this is the proof of concept. The telemetry server in the first ps listing there is a single process. It's a Postgres background worker that you can connect to and get metrics with just ID numbers for the tables. And the second example is Postgres Exporter, and you can see that with Postgres Exporter there's a database session for each database, and they're all idle. So even just reducing the number of sessions and reducing the number of processes involved is already quite a visible improvement. I think I have more information if people have questions or want to see something specific, but I tried to condense a much longer presentation to 25 minutes, so I've skipped over plenty of other information. If there are questions, that would probably be better than me just jumping around finding a slide.
Okay, so thanks a lot for the great talk, it was pretty interesting. So any questions, anyone? Hello, my name is Brian. You spoke about metrics; are there any traces, or any talk of traces in the future? I have ideas, I have plans, but they're all in my head, there's no code. Postgres does have explain plans, and explain plans are basically traces, but what we have today is you run something on the terminal and you see the plan for your query, and there's an extension that will dump the explain plans in the logs. So it's a bit pie in the sky, but I don't see any reason we shouldn't be exporting that same information to a tracing server, and that basically just involves adding support for receiving the trace IDs, the spans, and creating spans for either plan nodes or certain kinds of plan nodes. These are not well-thought-out plans. In my pie-in-the-sky dream, I want to be able to answer the question, which front-end web API endpoint is causing sequential scans on this table over here, skipping the whole stack trace in the middle without having to dig all the way up. So we have an architecture in which we have Postgres databases which are short-lived, running in Docker containers, so the entire cluster basically will live and die in a matter of possibly minutes or less. And we would like to know what the hell is going on with them. Have you got any bright ideas? I admit I don't think I've seen anybody trying to do that with Prometheus. I mean, it's not a best practice in Prometheus to have time series that keep changing, but you're kind of inevitably going to get a new bunch of time series with each database. I guess I need a better idea of what you're looking for. I don't think I have anything off the top of my head that you wouldn't have already thought about. Hi, where can we get your proof of concept from, to fiddle with it and test it? I'm sorry, I didn't hear the question.
Where can we get your proof of concept from, to test it and fiddle with it? I posted a patch to the mailing list. Postgres follows a fairly old-school patch review process where patches are mailed to the hackers mailing list. So it's easy to lose sight of patches if they were posted months ago. I can send it to you if you want. You can probably find it on the mailing list if you search. It's pretty early days though. It's not really ready to use, even for experimental production uses. With those integrated metrics, how do you expose the metrics? Do you have an HTTP endpoint that's exposed directly from Postgres? The current situation is it's a background worker, and that background worker has a configuration option to specify a second port to listen on, and it runs a very small embedded web server, so it responds to normal HTTPS requests. I would want the normal Postgres port to respond, so that your target is just the database port. I expect, well, I actually have already heard a lot of pushback on that idea. A lot of Postgres installs are sort of old school, where you probably have it firewalled and you don't want to have a new service running on the same port as the actual database; you want to have a port that you can firewall separately for your admin stack. But it makes Prometheus very difficult to manage when you have a different port to get metrics from. So you have the database running on port A and then you have metrics on port B, and you have to have your dashboards and the targets and so on all configured to understand that the target with port B is actually the database on port A, and you can add rewrite rules, but then you have to manage those rewrite rules. But I don't really expect people to accept the idea of responding on the database port.
There's also a general security principle involved: it's almost always a terrible idea for security reasons to respond to two different protocols on the same port, because a lot of security vulnerabilities have come about from finding bugs where one side of a connection thinks you're talking protocol A and the other side thinks you're talking protocol B. So there are big trade-offs to doing that. First of all, thanks a lot for the amazing talk, very insightful. Thanks for offering to modernize Postgres monitoring. You had a very good point there about standardizing on the metrics. I've been involved in the semantic conventions around OpenTelemetry and other projects, but in general, I'm curious to hear, if you personally or Ivan or anyone else can share, what kind of effort is being done to standardize on database monitoring metrics, not specifically Postgres but databases in general? I would be interested in that. I haven't heard anything on that front. That would be exciting. That would be a lot of work. I think a lot of the interesting metrics are very... that would be difficult. I don't know, I haven't seen anything like that. Okay, so thanks a lot everyone. |
Application Monitoring with Grafana and OpenTelemetry |
Okay. Hello, everyone. Welcome to the talk on OpenTelemetry with Grafana. The microphone broke, so I need to do it with this microphone now. Let's see how it goes with typing and a live demo. A few words about me: who am I, and why am I here talking about Grafana and OpenTelemetry? I work at Grafana Labs, I'm an engineering manager, I'm also the manager for our OpenTelemetry squad, and I'm also active in open source: I'm a member of the Prometheus team, where I maintain the Java metrics library. So what are we going to do in this talk in the next 25 minutes or so? It will almost exclusively be a live demo. Basically the idea is I have a little example application running on my laptop, and it is instrumented with OpenTelemetry. I will show you in a minute what it does and how I instrumented it. And I also have an open source monitoring backend running, right? It consists of three databases: one is Loki, which is an open source logs database, one is Tempo, which is an open source trace database, and one is Mimir, which is an open source metrics database. Mimir is compatible with Prometheus, so I could have shown the exact same demo using Prometheus instead of Mimir, so it doesn't really matter for now. And of course I also have Grafana, and I have those databases configured as data sources. What we are going to do is start up Grafana, have a look at metrics, have a look at traces, have a look at logs, and basically the idea is that at the end of the talk you will have seen all the signals that come out of OpenTelemetry and explored a bit what you can do with this type of data, so you should have a good overview of what open source monitoring with OpenTelemetry looks like, right? So, last slide before we jump into the live demo: this is just a quick overview of what the example application does, so that you know what we are going to look at.
It's a simple hello world REST service written in Java using Spring Boot, and so basically you can send a request to port 8080 and it will respond with hello world. In order to make it a bit more interesting, I made it a distributed hello world service, so it doesn't respond directly, but when it receives a request, it reaches out to a greeting service running on port 8081, the greeting service responds with the greeting, which is hello world, and then the response is forwarded to the client, right? And there are random errors to have some error rates as well, so basically a hello world microservice architecture or whatever, right? In order to instrument this with OpenTelemetry, I use the Java instrumentation agent that's provided by the OpenTelemetry community. That's something you can download on GitHub, and you basically attach it to the Java virtual machine at startup time with a special command line parameter. So I didn't modify any source code, I didn't use any SDK or introduce any custom stuff; all we are going to look at in this demo is just data produced by attaching the OpenTelemetry instrumentation to a standard Spring Boot application, right? Cool. So let's get started. As I said, I have my data sources configured here. Prometheus and Mimir are compatible, so it doesn't really matter which one we choose. So I want to start with metrics, and yeah, so... Yeah? Can we turn the lights down a bit? I don't know. Okay. Maybe the other way around. Okay. I will just continue, come on. So there are lots of metrics that you get from the OpenTelemetry instrumentation, JVM-related stuff like garbage collection activity and so forth, but the one I want to look at... oh, no, it's getting brighter and brighter. Yeah. Okay. Great. I think there is also a light mode in Grafana. Maybe that would have been a better choice. But no, I'm not going to use light mode.
So let's figure out how to do the demo while I have a microphone that I should hold in my hands. Let's just put it here. Okay. Thank you. Cool. So the metric that we are going to look at for the demo is a metric named HTTP server duration. This is a metric of type histogram. Histograms have a couple of different numbers attached to them, so there are histogram buckets with the distribution data and so forth, and there's also a count. The count is the simplest one, so we are going to use this in our example. I actually got it two times: once for my greeting service here and once for the hello world application. And if we just run this query, maybe with a slightly shorter time window here, then we basically see two request counters, right? One is the green line, which is counting the requests resulting in HTTP status 200, so the successful requests, and basically we see that since I started the application on my laptop, I got a little more than 400 successful requests. The yellow line is requests resulting in HTTP status 500, and we got around 50 of them, right? And obviously, raw counter values are not very useful, right? Nobody is interested in how often was my service called since I started the application, and the way metric monitoring works with Prometheus, as probably most of you know, is that you use the Prometheus query language to get some useful information out of that kind of data, right? I guess most of you have run some Prometheus queries, but I'm still going to show maybe a couple of examples for those of you who are not very familiar with that. Does this one work again? Hey, nice. It's even better. The lights work, the microphone works. Wow. Now let's hope the demo works. So I'm going to run just a couple of quick Prometheus queries, so that those of you who are not very familiar with it get an idea of what it is, right?
And the most important function in the Prometheus query language is called the rate function. And what the rate function does, it takes a counter like this and a time interval like five minutes, and then it calculates a per second rate, right? So based on a five minute time interval, we now see that we have about 0.6 requests per second resulting in HTTP status 200, and we have about 0.1 requests per second resulting in HTTP status 500. And this is already quite some useful information, right? So typically you want to know the total load on your system, not buy status code or something. So you basically want to sum these two values up, and obviously there's also a sum function to sum values up, and if you call that, you get the total load on your system, which is just one line now and it's just, you know, around 0.7 requests per second, right? And this is, yeah, this is basically how Prometheus queries work. If you're not familiar with the syntax, there's also kind of a graphical query builder where you can, you know, use a bit drag and drop and get a bit more help and so forth, right? And so eventually, you know, when you got your queries and got your metrics, so what you want to do is you create a metrics dashboard and for monitoring HTTP services, there is, there are a couple of best practices, what type of data you want to visualize on a dashboard for monitoring HTTP services. And the most simple and straightforward thing is to visualize three things. One is the request rate, so for the current load on the system, which is exactly the query that we are seeing here. The next thing you want to see is the error rate, so the percentage of calls that fail. And the third thing is duration. How long does it take, right? And I created a simple example dashboard just to show you how this looks like. So I put the name of the service as a parameter up here so we can reuse the same dashboard for both services. 
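As a rough illustration of what the rate and sum steps compute, here is a small Python sketch. It is a simplification under stated assumptions, not Prometheus's implementation: it ignores counter resets and the extrapolation Prometheus applies at window edges, and the sample values are invented so that the results land near the 0.6 and 0.1 requests per second seen in the demo.

```python
def rate(samples, window_seconds):
    """Per-second increase of a counter over the trailing window.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Simplified: no counter-reset handling, no extrapolation.
    """
    end_ts = samples[-1][0]
    in_window = [(t, v) for t, v in samples if t >= end_ts - window_seconds]
    if len(in_window) < 2:
        return 0.0
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical scrapes of the request counter, one series per HTTP status.
status_200 = [(0, 0), (150, 90), (300, 180)]
status_500 = [(0, 0), (150, 15), (300, 30)]

ok_rate = rate(status_200, 300)        # requests/second with status 200
error_rate = rate(status_500, 300)     # requests/second with status 500
total_rate = ok_rate + error_rate      # the "sum" step: total load
error_ratio = error_rate / total_rate  # share of failing calls

print(round(ok_rate, 3), round(error_rate, 3), round(total_rate, 3))
print(round(error_ratio * 100, 1), "% errors")
```

The last two values correspond to the first two panels of the dashboard described above: total request rate and the percentage of calls that fail.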
Maybe let's use a 15 minute time window, so here I started the application. The first is the request rate, that's the exact same query that we just saw. Second thing here is the error rate, so we have about, I don't know, around 10% errors in my example application. And then for duration, there are a couple of different ways how to visualize that. So what we see here is basically the raw histogram, right? The histogram buckets. And this representation is actually quite useful because it shows you the shape of the distribution. So what we see here is two spikes, one around 600 milliseconds and one around 1.8 seconds. And this is a typical shape that you would see if your application uses a cache, right? Because then you have a couple of requests that are responded quite quickly. Those are the cache hits. A couple of requests are slow that are the cache misses. And visualizing the shape of the histogram helps you understand kind of the latency behavior of your application, right? The other and most popular way to visualize durations is this one here. These are percentiles. So the green line is the 95th percentile, so it tells us 95% of the calls have been faster than 1.7 seconds and 5% slower than that. The yellow line is the 50th, so half of the calls faster than that, half of the calls slower than that. And this doesn't really tell you the shape of the distribution, but it shows you a development over time, which is useful as well. So if your service becomes slower, those lines will go up, right? And it's also a good indicator if you want to do alerting and so forth. You can define a threshold and say it's above, if it's above a certain threshold, I want to be notified and stuff like that. And there are other more, you know, experimental things like this heat map showing basically development of histograms over time and stuff like that. So it's pretty cool to play with all the different visualizations in Grafana and, you know, see what you can get. 
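The percentile lines are estimated from the histogram buckets rather than from individual requests. The Python sketch below shows the general idea, linear interpolation inside the bucket that contains the requested rank, which is the approach Prometheus's `histogram_quantile` is documented to take. The function here is a simplified stand-in, and the bucket boundaries and counts are invented for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper bound;
    the last upper bound may be math.inf. Simplified sketch only.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # cannot interpolate into +Inf
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical latency buckets in milliseconds (cumulative counts).
latency_buckets = [(500, 20), (1000, 60), (2000, 90), (5000, 100), (math.inf, 100)]

p50 = histogram_quantile(0.50, latency_buckets)
p95 = histogram_quantile(0.95, latency_buckets)
print(p50, p95)
```

Because the estimate interpolates within a bucket, its accuracy depends on how the bucket boundaries are chosen, which is one reason the raw bucket view and the percentile view complement each other.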
So this is a quick example of a so-called RED dashboard (request rate, error rate, duration) based on OpenTelemetry data. And the cool thing about it is that all that we are seeing here is just based on that single histogram metric, HTTP server duration. And the fact that this metric is there is not a coincidence. The metric HTTP server duration is actually defined in the OpenTelemetry standard as part of the semantic conventions for HTTP services. So whenever you monitor an HTTP server with OpenTelemetry, you will find a histogram named HTTP server duration. It will have the HTTP status as an attribute. It will contain the latencies in milliseconds. That's all part of the standard. So it doesn't matter what programming language your service uses, what framework, whatever. If it's being monitored with OpenTelemetry and it's compatible, you will find that metric and you can create a similar dashboard like that. And this is one of the things that make application monitoring with OpenTelemetry a lot easier than it used to be before this standardization. Cool. So that was a quick look at metrics, but of course we want to look at the other signals as well. So let's switch data sources for now and have a look at traces. For tracing, again, there's a graphical search where you can create your search criteria with drag and drop. There's also a relatively new feature, which is a query language for traces, so I'm going to use that for now. One thing you can do is to just search by labels. So I can, for example, say I'm interested in the service name greeting service, and then I could basically just open a random trace here. Let's take this as an example. I need to zoom out a little bit to be able to close the search window here. Okay. So this is what a distributed trace looks like. And if you see it for the first time, it might be a bit hard to understand, but it's actually fairly easy.
So you just need like two minutes of introduction and then you will understand traces forever. And to give you that introduction, I actually have one more slide, just to help you understand what we are seeing here. The thing is, distributed traces consist of spans, right? And spans are time spans. So a span is something that has a point in time where it starts and a point in time where it ends, right? And in OpenTelemetry, there are three different kinds of spans. One is server spans, the second is internal spans, and the third is client spans. Okay. So what happens when my Hello World application receives a request? The first thing that happens if a server receives a request is that a server span is created. That's the first line here. It's started as soon as the request is received, and it remains open until the request is responded to, right? Then, I said in the introduction that I used Spring Boot for implementing the example application. And the way Spring Boot works is that it takes the request and passes it to the corresponding Spring controller that handles the request. OpenTelemetry's Java instrumentation agent is nice for Java developers because it just creates internal spans for each Spring controller that is involved, right? And that is the second line that we are seeing here. It's opened as soon as the Spring controller takes over and remains open until the Spring controller is done handling the request, which might not seem too useful if I have just a single Spring controller anyway, but if you have a larger application with multiple controllers involved, it gives you quite some interesting insights into what's happening inside your application. You would see immediately which controller you spend most time in and so forth, right? And then eventually my Hello World application reaches out to the greeting service, and outgoing requests are represented by client spans.
So the client span is basically opened as soon as my HTTP request goes out and remains open until the response is received. And then in the greeting service, the same thing starts again, you know, request is received, which creates a server span and then I have a spring controller as well, which is an internal span and that's the end of my distributed application here. And this is exactly what we are seeing here. And each of those span types has a corresponding metadata attached to it. So if you look at one of the internal spans here, we see the name of the spring controller and the name of the controller method and a couple of JVM-related attributes, whatever. And if we look at an HTTP span, for example, we see, of course, HTTP attributes like the status code, method and so forth, right? So of course, you do not want to just look at random spans. So usually you're looking for something. There are standard attributes in open telemetry that you can use for searching. So we already had the service name greeting service, for example. But the most important or one of the most important attributes is HTTP.status, no,.status code, this one here. And if we, for example, search for spans with HTTP status code 500, then we should find an example of a request that failed. So let's close the search window again. Yes, that's an example of a failed request. You see it with the indicated by those red exclamation marks at the bottom here. So this is where the thing failed, right? So the root cause of the error is the internal span, something in my spring controller in the greeting service. If I look at the metadata attached to that, I actually see that the instrumentation attached the event that caused the error, and this even includes the stack trace. So you can basically immediately navigate to the exact line of code that is the root cause of this error, right? And this is quite cool. 
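The structure of that trace can be modeled in a few lines of Python. This is only a sketch of the idea: the `Span` class, the span IDs, and the `root_cause` helper are hypothetical and are not part of any OpenTelemetry SDK. It covers the three span kinds used in the demo (OpenTelemetry also defines producer and consumer kinds for messaging) and assumes, as in the failed request shown above, that an error propagates from the failing span up through all of its ancestors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]   # None for the first span of the trace
    service: str
    kind: str                  # "SERVER", "INTERNAL" or "CLIENT"
    name: str
    error: bool = False


def root_cause(spans: List[Span]) -> Optional[Span]:
    """Return a failed span that has no failed child (the deepest failure)."""
    parents_of_failed = {s.parent_id for s in spans if s.error}
    for span in spans:
        if span.error and span.span_id not in parents_of_failed:
            return span
    return None


# Hypothetical failed request through the two demo services.
failed_trace = [
    Span("a", None, "hello-world", "SERVER", "GET /", error=True),
    Span("b", "a", "hello-world", "INTERNAL", "HelloController", error=True),
    Span("c", "b", "hello-world", "CLIENT", "GET greeting", error=True),
    Span("d", "c", "greeting-service", "SERVER", "GET /greeting", error=True),
    Span("e", "d", "greeting-service", "INTERNAL", "GreetingController", error=True),
]

culprit = root_cause(failed_trace)
print(culprit.service, culprit.kind, culprit.name)
```

This mirrors what the red exclamation marks do visually: every span on the path is marked as failed, and the one at the bottom is where to look for the stack trace.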
So if you have a distributed application and you get an unexpected response from your Hello World application, without distributed tracing, it's pretty hard to find that actually there's an exception in the greeting service that, you know, propagated through your distributed landscape and then eventually caused the unexpected response. And with distributed tracing, finding these kind of things becomes pretty easy because you get all the related calls grouped together, you get the failed ones marked with an exclamation mark, and you can pretty easily navigate to what's the root cause of your error, okay? Cool. So that was a quick look at traces. There are a lot of interesting things about tracing. Maybe one thing I would like to show you, because I find it particularly cool, so if you have all your services instrumented with tracing in your back end, then basically those traces give you metadata about all the network calls happening in your system, and you can do something with that type of data, right? So for example, you can calculate something that we call the service graph. So it looks like this. It's maybe not too impressive if you just have two services calling each other, right? So, but if you imagine, you know, a more larger, you know, dozens or hundreds of services, so it will generate a map of all the services and indicate which service calls which other service, and this is quite useful. For example, if you intend to deploy a breaking change in your greeting service and you want to know who's using the greeting service, what would I break? Then looking at the service graph, you basically get this information right away. Traditionally, if you don't have that, you basically have a PDF with your architecture diagram, and then you look it up there, and also traditionally, there's at least one team that deployed something and forgot to update the diagram, and then you missed that, and there's a service graph that won't happen, right? This is the actual truth. 
This is based on what's actually happening in your backend, and this is pretty useful in these situations, right? And you can do other things as well, like, you know, have some statistics like the most frequently called endpoint or the endpoint with the most errors and stuff like that. So, that was a quick, quick look at traces. So we covered metrics, we covered traces. One thing I want to show you is that metrics and traces are actually related to each other, right? And so in order to show that, I'm going to go back to our dashboard, because if you, let's take a 15 minute window, then we get a bit more examples. So if you look at the latency data here, you notice these little green dots. These are called exemplars, and this is something that's provided by the auto instrumentation of open telemetry. So whenever it generates latency data, it basically attaches trace IDs of example traces to the latency data, and this is visualized by these little green dots, right? And so you see some examples of particularly fast calls, some examples of particularly slow calls and so forth. And if you, for example, take this dot up here, which is kind of slower than anything else, it's almost two seconds, right? Then you have the trace ID here, and you can navigate to tempo and have a look at the trace and start figuring out why did I have an example of such a slow call in my system, right? And in that case, you would immediately see that most of the time spent in the greeting service. So if you're looking for the performance bottleneck, then this is the most likely thing. Yeah, four minutes, that's fine. Cool. So if I have four minutes, it's high time to jump to logs, the third signal that we didn't look at yet. So let's select Loki, our open source logs database as a data source. So again, there's a query language, there's a graphical query builder and so forth. So let's just open random logs coming from the greeting service. It looks a bit like this. 
So it's even, I don't know, I didn't even log anything explicitly. I just turned on some whatever spring request logging so that I get some log data. And from time to time, I throw an exception, which is an IO exception to simulate these errors. Looks a bit broken, but that's just because of the resolution that I have here. Yeah, so what you can do, of course, you can do some full text search, for example, can say, I'm interested in these IO exception. And then you would basically get, well, if you spell it correctly, like that, then you would get the list of all IO exceptions, which in my case are just the random errors I'm throwing here. And this query language is actually quite powerful. So you can, this is kind of filtering by a label and filtering by full text search, but you can do totally different things as well. For example, you can have queries that, you know, derive metrics based on log data. There's a function pretty similar to what we have seen in the metrics demo, which is called the rate function. So the rate function, again, takes a time interval and then calculates the per second increase rate. So it basically tells you that we have almost 0.1 of these IO exceptions per second in our log data, which is also kind of useful for information to have. And the last thing to show you, because that's particularly interesting, so it is that these logs and traces and metrics are, again, not independent of each other. They are related to each other. And so if we look at an example here, just let's open a random log line. So what we see here, there's a trace ID. And this is interesting. So how does a trace ID end up in my log line? So this is actually also a feature of the Java instrumentation that's provided by the OpenTelemetry community. So the way logging in general works in Java is that there's a global thing with key value pairs called the log context. And applications can put arbitrary key value pairs into that context. 
And when you configure your log format, you can define which of those values you want to include in your log data. And if you have this OpenTelemetry agent attached, then as soon as a log line is written in the context of serving an HTTP request, then the corresponding trace ID is put into that log context. And you can configure your log format to include the trace ID in your log data. And that's what I did. And so each of my log lines actually has a trace ID. And so if I see something fancy and I want to know maybe somewhere down my distributed stack something went wrong, I can just query that in tempo, navigate to the corresponding trace, close that here, yeah, and then basically maybe get some information what happened. And then the same navigation works the other way around as well. So of course, there's a little, you know, log button here. So if I see something fancy going on in my greeting service thing here, and maybe the logs have more information, I can click on that, navigate to the logs. And then it basically just generates a query, right? I click on the greeting service with that trace ID. So it's basically just a full text search for that trace ID. And so I will find all my corresponding log lines. In that case, just one line. But if you have a bit better logging, then maybe it would give you some indication what happened there. Okay. So that was a very quick 25 minutes overview of, you know, looking a bit into metrics, looking a bit into tracing, looking a bit into logs. I hope it gave you some impression, you know, what's the type of data that you get out of open telemetry looks like. All of what we did is really, you know, without even modifying the application. I didn't, you know, even start with custom metrics, custom traces and so forth. So but it's already quite some useful data that we get out of that. 
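The log context mechanism described above is Java-specific, but the pattern is easy to sketch in Python for readers who want to see it end to end. Everything here is an analogy under assumptions: a `contextvars.ContextVar` stands in for the Java log context, a logging filter stands in for what the instrumentation agent does automatically, and the trace ID value and logger name are made up.

```python
import contextvars
import io
import logging

# Stand-in for the log context: holds the trace ID of the current request.
current_trace_id = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


stream = io.StringIO()  # in-memory log sink so the example is self-contained
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("greeting-service-demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def handle_request(trace_id):
    """Pretend to serve one request inside the given trace."""
    token = current_trace_id.set(trace_id)
    try:
        logger.info("handling greeting request")
    finally:
        current_trace_id.reset(token)


handle_request("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("log line written outside any request")

log_lines = stream.getvalue().splitlines()
# Jumping from a trace to its logs is then just a text search for the ID.
matching = [line for line in log_lines if "4bf92f3577b34da6a3ce929d0e0e4736" in line]
print(matching)
```

The final search is the same thing the logs button does when navigating from a trace: it filters the service's logs for lines containing that trace ID.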
If you like the demo, if you want to explore it a bit more, want to try it at home, I pushed it on my GitHub and there's a readme telling you how to run it. So you can do that. And yeah, next up, we have a talk that goes a bit more in detail into the tracing part of this. And then after that, we have a talk that goes a bit more into detail on how to run OpenTelemetry in Kubernetes. So stay here and thanks for listening. Please remain seated during Q&A. Otherwise, we can't do a real Q&A. So please remain seated. Are there any questions? Yes. Hi. Thank you for this. One quick question. You mentioned you just need to add some parameters to the Java virtual machine to run the telemetry. What happens to my application if, for example, the back end of the telemetry is down? Is my application failing or impacted in any way? If the monitoring back end is down. Yes. Say the monitoring is down, but I started my application with these parameters. Is it impacting the application? No. I mean, you won't see metrics, of course, if your monitoring back end is down, but the application would just continue running. So typically, in production setups, the applications wouldn't send telemetry data directly to the monitoring back end. But what you usually have is something in the middle. There are alternatives. There's the Grafana Agent that you can use for that. There's the OpenTelemetry Collector that you can use for that. And it's basically a thing that runs close to the application, takes the telemetry data off the application very quickly, and then, you know, can buffer stuff and process stuff and send it over to the monitoring back end. And that's used for decoupling that a little bit, right? And if you have such an architecture, the application shouldn't be affected at all by that. Two more. Two more. So I really like being able to link from your metrics to traces.
But what I'm actually really curious to be able to do, and as far as I know, doesn't exist, or I guess that's my question, is like, is there any thought towards doing this, is being able to go the other direction, where what I'd like to be able to answer is, here's all my trace data, and this node of the trace incremented these counters by this much. So I could ask things like how much network IO or disk IOPS did this complete request do, and where in the tree would that occur? Yeah, that's a good question. I mean, linking from traces to metrics, it's not so straightforward, because I think the things you can do to relate this is to use the service name. So if you have the service name part of your resource attributes of the metrics, and consistently you have the same service name in your trace data, then you can at least, you know, navigate to all traces coming, to all metrics coming from the same service. Maybe you have some more, you know, related attributes, like in whatever instance ID and so forth. But it's not like really a one-to-one relationship, so. That's specific. What request, how much did this request come from the IOPS? Yeah, no, I don't think that's possible. So in this example, you've shown that Grafana World and Prometheus works great with server-side applications. Have you had examples of client-side applications, mobile desktop applications that use Prometheus metrics and then ship their trace, their metrics and traces to the metric backend? Did I hear it correctly? You're asking about starting your traces on the client-side and the web browser and stuff? You have tracing on the server-side, but what about having traces and metrics on the client-side and, for example, for an embedded or mobile application so that you could actually see the trace from when the customer clicked a thing and see the full customer journey? Yeah, that's a great question. That's actually an area where there's currently a lot of research and new projects and so forth. 
So there is a group called real-user monitoring, RUM, in open telemetry that deal with client-side applications. There's also a project by Grafana. It's called Faro. It's kind of, you know, JavaScript that you can include in your front end, in your HTML page, and then it gives you traces and metrics from in the web browser coming from the web browser. And this is currently a pretty active area, so lots of, you know, movement there. And so there are things to explore. So if you like, check out Faro. It's a nice new project and standardization is also currently being discussed, but it's newer than the rest of what I showed you, right? So it's not as, so there's no, you know, clear standard yet or nothing decided yet. Cool. Okay. Thanks, everyone, again. |
Practical introduction to OpenTelemetry tracing |
Hi, everybody. Thanks to be here for this talk. That's a lot of people. I'm Nicolas Frankel. I've been a developer for a long time, and I would like to ask how many of you are developers in this room? Quite a lot. Who are ops? Just as many, and who are devops, whatever you mean by it. So this talk is intended for actually developers, because I was, or I still think I'm a developer. So if you are an ops people, and for this, for you is not that super interesting. At least you can direct your developer colleagues to the talk, so that you can understand how they can ease your work. Well, perhaps you've never seen that, but I'm old or experienced, depending on how you see it. And when I was starting my career, monitoring was like a bunch of people sitting in front of screens the whole day. And actually, I was lucky. Once in the south of France, I was told, hey, this is the biggest monitoring site of all France. And actually, it really looked like this. And of course, there were people watching it. And that was the easy way. Now, I hope that you don't have that anymore, that it has become a bit more modern. Actually, there is a lot of talk now about microservices, right? Who here is doing microservices? Yeah. Yeah, because if you don't do microservices, you are not a real developer. But even if you don't do microservices, so you are not a real developer, and I encourage you not to be a real developer, in that case, you probably are doing some kind of distributed work. It's become increasingly difficult to just handle everything locally. And the problem becomes, yeah, if something bad happens, how can you locate how it works? Or even if something works as expected, how you can understand the flow of your request across the network. I love Wikipedia. And here is the observability definition by Wikipedia, which is long and in that case, not that interesting. So I have a better one afterwards for tracing. 
So basically, tracing helps you to understand the flow of a business request across all your components. Fabian, where is Fabian? Fabian is here, so he talked a lot about the metrics and the logging. So in this talk, I will really focus on tracing because my opinion is that, well, metrics are easy. We have been doing metrics for ages, like we take the CPU, the memory, whatever. Now we are trying to get more business-related metrics, but it's still the same concept. Logging also. Now we do aggregated logging. Again, nothing mind-blowing. Tracing is, I think, the hardest part. So in the past, there were already some tracing pioneers. Perhaps you've used some of them. And well, now we are at the stage where we want to have something more standardized. So it starts with the Trace Context from the W3C. And the idea is that you start a trace and then other components will get the trace and will append their own trace to it. So it works very well in a web context. And it defines two important concepts that Fabian, thanks, already described. So now I am done. So I have the same stupid stuff. So here you have, oh, sorry. Yes. It reminds me of the story. I did the same to my colleagues. They didn't care about the presentation. They only remember that. Okay. So here you have a trace and here you have the different spans. So here the X1 is the parent one. And then the Y and the Z1 will take this X span as their parent span. And so this is a single trace. This is a single request across your service. Web stuff is good, but it's definitely not enough. And so for that we have the OpenTelemetry stuff. OpenTelemetry is just a big bag of miracles all set into a specific project. So it's basically APIs, SDKs, tools, whatever, under the OpenTelemetry label. It implements the W3C Trace Context. If you have been doing some kind of tracing before, you might know it because it's like the merging of OpenTracing and OpenCensus. The good thing is it's a CNCF project.
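The way a component "gets the trace and appends its own" can be sketched with the W3C Trace Context `traceparent` header, whose format is `version-traceid-parentid-flags` in lowercase hex. The following Python sketch uses only the standard library; the function names are made up for the example, and it handles only the four-field form of version `00` rather than every rule in the specification.

```python
import re
import secrets

# W3C Trace Context "traceparent" header: version-traceid-parentid-flags
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Returns (trace_id, parent_span_id, flags), or None if the header is invalid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid according to the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, flags


def child_traceparent(incoming):
    """Builds the header a service sends downstream: same trace ID, new span ID."""
    parsed = parse_traceparent(incoming)
    if parsed is None:
        # No valid parent: start a new trace (sampled flag set for the example).
        trace_id, flags = secrets.token_hex(16), "01"
    else:
        trace_id, _parent_span_id, flags = parsed
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
print(outgoing)
```

The trace ID stays the same across every hop, which is what makes it a single trace; each service contributes a new span ID and records the incoming one as its parent.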
So basically there is some hope that it will last for a couple of years. The architecture is pretty simple. Basically you've got sources, you've got the OpenTelemetry protocol, and as Fabian mentioned, you dump everything into a collector. The collector should be as close as possible to your sources. And then some tools are able to read data from it and to display it in the way that we expect to see it. What happens after the OpenTelemetry Collector is not a problem of OpenTelemetry. There are just collectors that are compatible, and for example you can use Jaeger or Zipkin in a way that allows you to dump your OpenTelemetry data into Jaeger or Zipkin in the OpenTelemetry format. So you can reuse, and that is very important, you can reuse your infrastructure if you're already using the tools, by just switching to OpenTelemetry. And then you are using a standard, and then you can switch your OpenTelemetry back end with fewer issues. Now comes the fun developer part. If you are a developer, you probably are lazy. I know, I'm a developer. So the idea is OpenTelemetry should make your life as a developer as easy as possible to help your ops colleagues diagnose your problems. And the easiest part is if you do auto instrumentation. Auto instrumentation is only possible in cases where you have a platform, when you have a runtime. Fabian mentioned Java. Java has a runtime, which is the JVM. Python has a runtime. Now if you have Rust, it's not as easy. So in that case, you are stuck. My advice if you are using a runtime, and probably most of you are using such runtimes, whether Java, whatever, use it. It's basically free. It's a low-hanging fruit, and there is no coupling. So basically you don't need extra dependencies as developers in your projects. So since it's called practical introduction, let's do some practice.
So here I have something a bit better than the hello world, so I have tried to model an e-commerce shop with very simple stuff. It starts just asking for products. I will go through an API gateway which will forward the request to the catalog, and the catalog doesn't know about the prices, so it will ask the prices from the pricing service, and it will ask the stocks from the stock service. The entry point is the most important thing, because it gives the parent trace. Everything will be from that. So in general, you have a reverse proxy or an API gateway, depending on your use case. I work on the Apache APISIX project. It uses the NGINX reverse proxy. On top you have OpenResty, because you want to have Lua to script and to auto reload the configuration. Then you have lots of out-of-the-box plugins. Let's see how it works. Now I have the code here. Is it big enough? Good. So I might be very old, because for me it wouldn't. Okay, here that's my architecture. I'm using Docker Compose, because I'm super lazy. I don't want to use Kubernetes, so I have Jaeger. As I mentioned, I have all-in-one. I'm using the all included, so I don't need to think about having the telemetry collector and the web to check the traces. I have only one single image. Then I have APISIX. Then I have the catalog, which I showed you. Of course I have a couple of variables to configure everything. I wanted to focus on tracing, so no metrics, no logs. I'm sending everything to Jaeger, and then I do the same for pricing, and I do the same for the stock. And normally at this point, I already started, because in general I have issues with the Java stuff. So here I'm doing a simple curl to the product. I've got the data, which is not that important. And I can check on the web app how it works. So here I will go on the Jaeger UI. I see all my services. I can find the traces. Here you can find the latest one. And here is the thing. If I click on it, it might be a bit small, right?
I cannot do much better. You can already see everything that I've shown you. So I start with the product from the API gateway. It forwards it to the catalog. Then I have the internal calls, and I will show you how it works. Then I have the GET request made from inside the application. And then I have the stock service that responds here. Same here. And here we see something that was not mentioned on the component diagram. From the catalog to the stock, I go directly. But from the catalog to the pricing, I go back to the API gateway, which is also a way to do that for whatever reason. And so this is something that was not mentioned on the PDF, but you cannot cheat with OpenTelemetry. It tells you exactly what happens and the flow. And the rest is the same. So regarding the code itself, I told you that I don't want anything to trouble the developer. So here I have nothing regarding OpenTelemetry. If I search for otel, you see nothing. If I search for telemetry, you see nothing. I have no dependency. The only thing that I have is my Dockerfile, and in my Dockerfile, I get the latest OpenTelemetry agent. So you can have your developers completely oblivious, and you just provide them with this snippet, and then when you run the Java application, you just tell them, hey, run with the Java agent. Low-hanging fruit, zero trouble. Any Java developer here? Not that many. Python? OK, so it will be Python. Just the same here. Here it's a bit different. I add dependencies, but actually I do nothing with them. So here I have no dependency on anything. Here I'm using a SQL database because, again, I'm lazy. I don't care that much. But here I have no dependency, no API call to OpenTelemetry. The only thing that I have is in the Dockerfile again. I have this. Again, I'm using a runtime. It's super easy. I let the runtime, like, intercept the calls and send everything to OpenTelemetry. And the last fun stuff is Rust. Any Rust developer? Please don't look at my code too much.
I'm not a Rust developer, so I hope it won't be too horrible. And Rust is actually, well, not that standardized. So here I don't have any runtime, so I need to make the calls by myself. The hardest part is to find which library to use, depending on which framework you use. So in this case, I found one, and perhaps there are better options. But I found this OpenTelemetry OTLP stuff. And here this is because I'm using Axum. I'm using this library. And so far, it works for me. I don't need to do a lot of stuff. I just, like, copy-pasted this stuff. Copy-paste developer. And afterwards, in my main function, I just need to say this and this. So I added two layers. So if you don't have any platform, any runtime, you actually need your developers to care about OpenTelemetry. Otherwise, it's fine. Now, we already have pretty good, like, results, but we want to do better. So we can also ask the developers, once they are more comfortable, to do manual instrumentation even in the case when there is a platform. Now, I will docker compose down. And it takes a bit of time. I will prepare this. And on the catalog side, now I can have some additional code. So this is a Spring Boot application. What I can do is add annotations. Like, I noticed there were a couple of Java developers. So it's the same with Kotlin. It's still on the JVM. So basically, I'm adding annotations. And because Spring Boot can read the annotation at runtime, it can add those calls. So I don't have to call the API explicitly. I just add some annotation, and it should be done. On the Python side, I import this trace stuff, and then I can, with the tracer, add some, again, explicit traces, so internal traces. And from the first point of view, because I already, like, did it explicitly work. And now you can see that I am in deep trouble, because it happened a lot of times. The Java application doesn't start for a demo, and that's really, really fun. So I will try to docker compose down the catalog.
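What manual instrumentation adds, a named internal span with its own attributes inside the trace that already exists, can be illustrated with a toy tracer. This is a standard-library stand-in written for this transcript, not the OpenTelemetry Python API: `start_span`, `finished_spans`, and `fetch_stock` are hypothetical names, and the returned stock data is made up.

```python
import contextvars
import secrets
from contextlib import contextmanager

# The span currently in progress, or None when no trace is active.
_current_span = contextvars.ContextVar("current_span", default=None)
finished_spans = []  # stands in for an exporter


@contextmanager
def start_span(name):
    """Toy stand-in for a tracer: opens a span as a child of the current one."""
    parent = _current_span.get()
    span = {
        "name": name,
        "trace_id": parent["trace_id"] if parent else secrets.token_hex(16),
        "span_id": secrets.token_hex(8),
        "parent_id": parent["span_id"] if parent else None,
        "attributes": {},
    }
    token = _current_span.set(span)
    try:
        yield span
    finally:
        _current_span.reset(token)
        finished_spans.append(span)


def fetch_stock(product_id):
    # Manual instrumentation: wrap the interesting call and record the ID.
    with start_span("fetch") as span:
        span["attributes"]["id"] = product_id
        return {"id": product_id, "quantity": 3}  # made-up data


# The outer span plays the role of the one created by auto instrumentation.
with start_span("GET /products") as root:
    fetch_stock(42)

for finished in finished_spans:
    print(finished["name"], finished["trace_id"], finished["parent_id"])
```

The manually added `fetch` span shares the trace ID of the request span and points to it as its parent, so it shows up nested inside the existing trace, with the product ID attached as an attribute.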
And docker compose, hey, what happens? Dash? Are you sure? No, no, no, no, no, no, no. Not with the new versions. Yes. That's fine. We are only here to learn. What? Stop. Thanks. The stress, the stress. Yeah. Honestly, if there is any, like, person here able to tell me why this Java application sometimes has issues starting, because I've added one gig at the beginning, and it's always stuck here. So I can tell you what you should see normally. If I'm lucky, I made a screenshot. Yes, here, but it's the beginning, it's the Rust one. So here, this is what you can have in Python. This is what I added explicitly. I have five minutes. Well, if the demo doesn't work, it will be much better. Then I won't have any problems with the timing. Here, you can see that this is the trace that, yeah, this is a trace that I added manually in Python. And here we can see that I filled the ID with the value. And on the Java side, again, nope, nope. I think it will be here. This is not the manual stuff that I added. Yes, it is, you have the fetch here. You have the fetch here. So this is the span that I added manually. I'm afraid that at this point, the demo just refuses to work. Yes, it's still stuck. I will stop there. I won't humiliate myself further when it's done. It's done. Perhaps, if you are interested, you can follow me on Twitter. You can follow me on Mastodon. I don't know what's the ratio. More importantly, if you are interested in the GitHub repo, to do that by yourself, perhaps with a better configuration of the Docker Compose file with the right memory, it would work. And though the talk was not about Apache APISIX, well, have a look at Apache APISIX. It's an API gateway, the Apache way. Great. Are there some questions now? I never got so much applause with a failing demo. Please remain seated so we can have a Q&A. Who had a question? Thank you. Very good talk. I have two questions. So one is about this. Let's start with the first one. Right. Yes, yes, yes.
How much overhead does this bring in Python and Java or Rust? How heavy is this instrumentation? That's a very good question. And the overhead of each request depends on your own infrastructure. But I always have an answer to that. Is it better to go fast and not know where you are going, or to go a bit slower and know where you are going? I think that whatever the cost, it's always easy to add additional resources and it doesn't cost you that much. Whereas debugging an incident across a distributed system can cost you days or even weeks in engineering costs. And you are very, very expensive, right? Okay. Thank you. And the second one is, have you encountered any funny issues with multi-threading or multi-processing? Something like when your server just now... Can you come closer to your... Your server just now was not starting. So with some software, when you have multi-threading or multi-processing, have you encountered any issues where the instrumentation causes you trouble? This is not production stuff. This is just better than the hello world. So I cannot tell you about production issues. You should find people who have these issues. As I mentioned, it's a developer-oriented talk. So it's more about pushing the developers to help ops do their job. For production issues, I must admit I have no clue. Hi. In the case of a runtime, does it always work, also with a badly written application? I mean, how bad can an application be before it stops working? I'm not sure I understood the question. So how often do you need to do it before it stops working? No, no. I mean, let's say I use deprecated libraries, bad clients, something that doesn't work as it's supposed to from the instrumentation perspective. I mean, I do requests to the network using UDP clients, something I've written myself, some custom stuff that... I'm imagining that the instrumentation sits between some layer of the network, which is going to the Internet, for example.
And so how bad can I be before it stops recognizing a request from junk? You cannot be bad. OK. Well, it's a moral issue first. But then on the platform side, the auto instrumentations work with specific frameworks and tools. It's those frameworks and tools that know how to check what happens and to send the data to OpenTelemetry. So if you don't play in this game, nothing will be sent. On the manual instrumentation side, it's an explicit call. So it depends on what you want to send. Yeah. I was thinking of auto instrumentation. So let's say I do the DNS resolution by myself and then I just throw a request to an IP. Let me show the Python stuff here. This is what I showed you in the screenshot. This is what I write. And these are the attributes that I want to have. So basically, if here you have something that is completely unrelated, it's up to you. That's why it's easier to start with auto instrumentation. And then once you get a general overview of what you have and your ops start saying, hey, perhaps we want to have more details here, then you can come with manual instrumentation. But start with the less expensive stuff. I didn't really answer the question. I understand it. But that's the best I can do regarding it. Sorry. Okay. Thanks for the talk. For the agent you use in the Dockerfile, how can you configure it, for example, for the tracing for Jaeger or other stuff? Regarding the Dockerfile, sorry? Yeah. How can you configure the agent to send the tracing to Jaeger or other stuff? The Dockerfile doesn't mention where you send it. The Dockerfile just says, hey, I will use OpenTelemetry. And it's during configuration, it's in the Docker Compose file, where I'm using, like, agreed-upon environment variables, where I'm saying you should send it here or here, or you should use logging or tracing or metrics or whatever. So it's very important to, like, separate those concerns.
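The separation described here, an image that is merely "ready for OpenTelemetry" and a deployment that decides where data goes, rests on environment variables. As a rough Python sketch: the variable names below (`OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_TRACES_EXPORTER`, `OTEL_METRICS_EXPORTER`, `OTEL_LOGS_EXPORTER`) are the standard OpenTelemetry ones, while the function name, the chosen defaults, and the example values are assumptions made for illustration, not taken from the demo repository.

```python
import os


def telemetry_settings(env=None):
    """Reads OpenTelemetry-style settings from environment variables."""
    env = os.environ if env is None else env
    settings = {
        "service_name": env.get("OTEL_SERVICE_NAME", "unknown_service"),
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"),
    }
    for signal in ("traces", "metrics", "logs"):
        # "none" switches a signal off; the default here is an assumption of this sketch.
        exporter = env.get(f"OTEL_{signal.upper()}_EXPORTER", "otlp")
        settings[signal] = None if exporter == "none" else exporter
    return settings


# Hypothetical values, as they might be set in a Docker Compose file.
compose_environment = {
    "OTEL_SERVICE_NAME": "catalog",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://jaeger:4317",
    "OTEL_METRICS_EXPORTER": "none",
    "OTEL_LOGS_EXPORTER": "none",
}
settings = telemetry_settings(compose_environment)
print(settings)
```

Nothing in the image changes between environments; only the variables passed at deployment time do, which is what makes the same image reusable.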
On one side, in the Dockerfile, in the image, you say, hey, I'm ready for OpenTelemetry. And when you actually deploy it, you say, okay, OpenTelemetry will go there for the metrics and there for the tracing, and for logging, I will disable it or whatever. Thank you for... Oh, sorry. Sorry. Go ahead. Sorry. And then you have a Docker image that can be, like, reusable. Thank you for being good FOSDEM citizens and remaining seated. Next question. Thank you for your presentation. So my question is, does OpenTelemetry support error handling like Sentry? If not, are there any plans to do that? It's really useful to catch crashes and capture the context of the crash. So that's it. Thank you. When you say crashes, do you mean crashes of OpenTelemetry itself or of the components that are, like, under watch? Yeah, of the application that's monitored, yeah. Well, Fabian showed you how you could log and, like, bind your traces and your logs. So you could have both here. My focus was just on tracing, but you can reuse the same Docker, the same GitHub repo and just, like, here, put the logs somewhere in, I don't know, Elasticsearch or whatever. No, because it's not a sponsored room. And then you can check and you introduce some errors and then you can check how the two are bound and you can, like, drill down to where it failed. Okay, thank you. |
Exploring the power of OpenTelemetry on Kubernetes |
Welcome our next speakers and give them a round of applause. Can you hear me? I guess you can. So yeah, hello, everyone, and welcome to the session about how we can use OpenTelemetry on Kubernetes to collect traces, metrics, and logs. So my name is Pavel, I'm a software engineer at Red Hat, and I'm a contributor and maintainer of the OpenTelemetry Operator and the Jaeger project. Yeah, my name is Bine, and I'm also working on the OpenTelemetry Operator and spend most of my time on OpenTelemetry. And so as I mentioned, on today's agenda there is the OpenTelemetry Operator. We will show how you can use it to deploy the collector, and how you can as well use it to instrument your workloads on Kubernetes. And after this brief introduction, we will walk you through three use cases, how you can use it to collect traces, metrics, and logs. However, I will start with the history of open source observability. I'm doing this because I believe that if we understand the history, maybe we will better understand where we as an industry are going. So on this slide, you essentially see a timeline with the open source projects. And it's divided into the upper and bottom parts. In the bottom, you see the open source projects or platforms that you can deploy, and they provide you with storage and visualization capabilities for the observability data. Most of them work with distributed traces, however, some of them, like Apache SkyWalking, Hypertrace, SigNoz, those are more like end-to-end platforms that can show traces, metrics, and logs. I would like to focus on the upper part that shows you the open source data collection kind of frameworks. And what we see there, especially with OpenCensus and OpenTelemetry, is that it's becoming more important that these frameworks kind of work with all the signals. For me, the data collection, especially for tracing, started with the Zipkin project.
It gave us a stable data model that we, as developers, could use to export traces into Zipkin, but as well to many other kinds of platforms that adopted the Zipkin project. As developers, when we wanted to use Zipkin clients, because the ecosystem hosted client libraries as well, it was a bit problematic in polyglot environments because those clients were using kind of inconsistent APIs, there was no standardization. And so this problem then was partially solved with OpenTracing. The scope of the project was a bit wider, there was a specification, there was a document that defines which data should be collected and as well how the API in those languages should look. This enabled us to build reusable instrumentation libraries. And then, even later, the OpenCensus project started with a slightly different approach. There was no specification, there was no API, but there was an SDK that everybody could use and a collector. So with OpenTracing, the approach was that developers would use the API and then at build time provide the SDK from a vendor. With OpenCensus, everybody would use the SDK and then in the collector decide where the data should be sent. Those two projects were kind of competing and then finally they merged into OpenTelemetry in 2019. So OTel adopted all the pieces from OpenTracing and OpenCensus, but kind of the biggest innovation in OTel, at least in my view, is the auto instrumentation libraries or the agents. Those agents are production ready, most of them, because they were donated by one of the observability vendors, so they are, you know, production tested. So when we kind of summarize what happened, it is that we started with some instrumentation libraries, you know, with the Zipkin project, then since we had some kind of standardization, we could build reusable instrumentation libraries and kind of create more sophisticated instrumentations for runtimes.
And now we are in an age where we have open source agents or auto instrumentation libraries available that we can just grab, put into our platforms, and we will get telemetry data almost for free. And I think, you know, so where are we going? I think we are going into an era where we, as developers, won't have to care about how the telemetry is created for us. The instrumentation will become maybe a feature of the platform where we deploy the application. So this is one way to look at it. The other way might be that observability will shift left, and since we have this data, we will start utilizing it for other use cases, probably like testing and security. So with that, I would like to move to OpenTelemetry, and it's obviously an open source project hosted in the Cloud Native Computing Foundation, and its main goal is to provide vendor-neutral telemetry data collection. It's the second most active project in the CNCF after Kubernetes, so it's quite large. And there are several independent components that we can use. There is a specification that defines what data should be collected and how the API should look, and obviously then there is the implementation of the API, the SDK, and the standard data model called OTLP or OpenTelemetry Protocol. These four pieces are meant to be used primarily by instrumentation authors or the people that work on the observability systems. And the last two components, the auto instrumentation or agent and the collector, are meant to be used by end users to kind of roll out observability in their organization. To facilitate OpenTelemetry deployment on Kubernetes, there is a Helm chart and a Kubernetes operator. What I would like to stress is that OpenTelemetry is only about how we collect and create telemetry data. It's not a platform that you can deploy, it doesn't provide any storage or query APIs. So now let's go to the main part, the Kubernetes operator.
The operator itself, it's a Golang application, it uses Kubebuilder and the Operator SDK, and it has three primary use cases. It can deploy the OpenTelemetry Collector as a Deployment, DaemonSet, or StatefulSet. It can as well inject the collector as a sidecar to your workload. The second use case is that it can instrument your workloads running on Kubernetes by using those instrumentation libraries or agents from OpenTelemetry. And last but not least, it integrates with the Prometheus ecosystem. It can read the service and pod monitors, get the scrape targets, and split them across the collector instances that you have deployed. To enable this functionality, the operator provides two CRDs, one for the collector, which is used to deploy the collector and integrate with Prometheus. And the second one is the instrumentation CRD, where you define how the applications should be instrumented. The operator itself then can be deployed through manifest files, a Helm chart, or OLM. So what we see here is the Kubernetes cluster. There are three workloads, pod one, pod two, and pod three. The first workload is instrumented with the OTel SDK directly, so when we were building this application, we pulled in the OTel dependency and we compiled against it and used those APIs directly in our business code and in the middlewares that we are using. The second pod is using the auto instrumentation libraries that were injected by the operator through the admission webhook. And the third pod is using Zipkin instrumentation and Prometheus instrumentation libraries, and it has the collector sidecar as well, injected by the operator. So essentially the operator there, it reconciles three OpenTelemetry CRs, two for the collector, and one instrumentation.
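Splitting scrape targets across collector instances can be pictured with a very small sketch. The Python below uses a plain hash-modulo scheme chosen for illustration; it is an assumption of this example and not a description of the algorithm the operator actually uses, and the target addresses and collector names are made up.

```python
import hashlib


def assign_targets(targets, collectors):
    """Splits scrape targets across collector instances by hashing the target name."""
    assignment = {collector: [] for collector in collectors}
    for target in sorted(targets):
        digest = hashlib.sha256(target.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(collectors)
        assignment[collectors[index]].append(target)
    return assignment


# Hypothetical scrape targets and collector instances.
targets = [f"10.0.0.{i}:8080" for i in range(1, 9)]
collectors = ["collector-0", "collector-1", "collector-2"]
assignment = assign_targets(targets, collectors)
for collector, assigned in assignment.items():
    print(collector, assigned)
```

Every target ends up on exactly one collector, so nothing is scraped twice and nothing is missed. A plain modulo scheme reshuffles many targets when the number of collectors changes, which is one reason a real implementation may prefer a different strategy.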
And then all these workloads, they send data to the collector deployed, probably as a DaemonSet, and then this collector does some data normalization and finally sends data into the platform of your choice, which can be Prometheus for metrics, Jaeger for traces. With that, I would like to move to the second part, explaining the CRDs in more detail. Yep. The microphone should work. Yeah, so with the CRDs for today, we wanted to show both of them, and we start with the collector one. The collector CRD is a bit loaded, so therefore we picked a few things here, which I would say are the most used or important. So as Pavel mentioned, there are different deployment modes, different use cases for the OpenTelemetry Collector, and in the specification, we can go to the mode and just specify it there. There's a handy thing, which is the sidecar, we will see it afterwards. And if we want to use it, we only go to the pod definition of our deployment and inject the annotation we see on the top right. And if we go with the deployment mode or something like this, and we want to expose it for collecting metrics, logs, and traces from a different system, for example, we can use the Ingress type. We can configure a lot more there, like also the annotations, your Ingress class. But yeah, mainly the operator takes care of everything, creating services, and is also able to balance your load there. And yeah, the last thing here is then the image section, which is also important. The OpenTelemetry Operator usually ships the core distribution of OpenTelemetry by default. So in OpenTelemetry, the collector is split into two repositories when you go and look at GitHub. So in core, you will find OTLP, a logging exporter, so some basic stuff. And in contrib, you find basically everything. So if you want to send your traces to some proprietary vendor or to Jaeger, you probably need to look there. Okay, the next thing is then the configuration.
The configuration for the OpenTelemetry collector is provided here the way it's usually done for the collector itself, so it's passed directly forward. It's split into three parts here. We see the receiving part, where we specify our OTLP receiver. Here it accepts gRPC on a specific port. We could also specify a Prometheus receiver there, which then scrapes something. Then the optional part is the processing part. We might want to save some resources, so we batch our telemetry data. And yeah, there are other useful things. In the exporter section, we use the logging exporter here, which is part of the core distribution, but you can configure whatever you like. You can also have multiple exporters for one resource. There is one more thing. On the right side, we see the extensions. It didn't fit on the slide, so it's there in this box. This is used if you have, for example, an exporter which needs some additional headers. Say you want to set a bearer token or something else. You can do it there. And then finally, we go to the service section, where we have different pipelines for each signal, and we can configure a processor, receiver, and exporter in the way we want. Then there is another CRD, which is used for the auto-instrumentation, and it looks slightly different. Here, in the specification, we also have the exporter. The exporter only exports OTLP, which means that if we want to export to some backend of our choice, we usually instrument our application and then forward the traces to a collector instance running next to it. And yeah, then we can use the power of the processors. Then we can configure some other useful things, like how the context is propagated and the sampling rate. And to use it, it's also quite easy. So we have our deployment. In this case, we can choose from this list of supported languages.
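To make the structure of that collector configuration concrete, here is a minimal sketch in Python. The dictionary mirrors the sections named in the talk (receivers, processors, exporters, and per-signal pipelines under the service section); the endpoint value and the helper `undefined_components` are assumptions made for illustration and are not part of the operator or the collector.

```python
# Simplified model of a collector configuration, as described in the talk.
# The endpoint below is an illustrative assumption, not taken from the slides.
collector_config = {
    "receivers": {"otlp": {"protocols": {"grpc": {"endpoint": "0.0.0.0:4317"}}}},
    "processors": {"batch": {}},
    "exporters": {"logging": {}},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["logging"],
            }
        }
    },
}


def undefined_components(config):
    """Hypothetical helper: list (pipeline, section, name) for every pipeline
    entry that refers to a component not defined at the top level."""
    problems = []
    for pipeline_name, pipeline in config["service"]["pipelines"].items():
        for section in ("receivers", "processors", "exporters"):
            defined = config.get(section, {})
            for name in pipeline.get(section, []):
                if name not in defined:
                    problems.append((pipeline_name, section, name))
    return problems
```

A pipeline only works if every receiver, processor, and exporter it names is also configured in the matching top-level section, which is the relationship this check makes explicit.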
We might use Java, and we only set this annotation on the pod level, and it will take care of adding the SDK and also setting and configuring the environment variables. If we use something like Rust, we can use the inject-sdk annotation to configure just the destination, because then the SDK should already be there. And if we have a setup where there is, let's say, some proxy in front, like Envoy, we can skip adding the auto-instrumentation there by configuring only the container names we want to instrument. We will see this in a minute in a bit more detail. So this is basically what we would need to do. We create this Instrumentation and we add this annotation on the left. We see the pod, and there is our application. In this gray box, you see what is added automatically. And in this example, this is then forwarded to a collector. And yeah, how does this work? The operator's admission webhook will add this init container. On the top left, we see how the container looks before. There are no environment variables. It's just a plain application. And in the command section, there is the copy, which copies the Java agent to our original container. On the right side, we see the final result. We see the JAVA_TOOL_OPTIONS where the Java agent is loaded, and then we see all the environment variables to configure our SDK. And finally, as we have also seen in the presentation from Nicholas previously, we have the Jaeger output here. So we can see the resource attributes and all the beautiful stuff that comes with it. Next, we can have a look at metrics. There is the OpenTelemetry SDK, if you want to go with OpenTelemetry metrics, but I assume a lot of people already have some Prometheus stuff in place. And the OpenTelemetry operator also helps us with this. So maybe we look first at the receiver part on the bottom.
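The injection step described above (an init container copies the Java agent, and the application container gets `JAVA_TOOL_OPTIONS` plus SDK environment variables) can be sketched as a pure function over a pod manifest. This is only an illustration of the idea, not the operator's implementation: the volume name, mount path, and agent image are made-up placeholders, and the sketch only handles the annotation value `"true"`.

```python
import copy

# Annotation used to opt a pod in to Java auto-instrumentation.
INJECT_JAVA_ANNOTATION = "instrumentation.opentelemetry.io/inject-java"

# The following three values are placeholders chosen for this sketch.
AGENT_VOLUME = "otel-agent"
AGENT_DIR = "/otel-agent"
AGENT_IMAGE = "example.invalid/java-agent:latest"


def inject_java_agent(pod, collector_endpoint):
    """Return a mutated copy of the pod, or the pod itself if not annotated."""
    annotations = pod.get("metadata", {}).get("annotations", {})
    if annotations.get(INJECT_JAVA_ANNOTATION) != "true":
        return pod

    mutated = copy.deepcopy(pod)
    spec = mutated["spec"]
    mount = {"name": AGENT_VOLUME, "mountPath": AGENT_DIR}

    # Shared volume plus an init container that copies the agent into it.
    spec.setdefault("volumes", []).append({"name": AGENT_VOLUME, "emptyDir": {}})
    spec.setdefault("initContainers", []).append({
        "name": AGENT_VOLUME,
        "image": AGENT_IMAGE,
        "command": ["cp", "/javaagent.jar", AGENT_DIR + "/javaagent.jar"],
        "volumeMounts": [dict(mount)],
    })

    # Every application container loads the agent and learns where to export.
    for container in spec["containers"]:
        container.setdefault("volumeMounts", []).append(dict(mount))
        container.setdefault("env", []).extend([
            {"name": "JAVA_TOOL_OPTIONS",
             "value": "-javaagent:" + AGENT_DIR + "/javaagent.jar"},
            {"name": "OTEL_EXPORTER_OTLP_ENDPOINT", "value": collector_endpoint},
        ])
    return mutated
```

The sketch instruments every container in the pod; restricting injection to selected container names, as mentioned for the Envoy case, would be an extra filter in the loop.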
We see there that we configure the Prometheus receiver, which has a scrape configuration, and there we can, for example, add some static targets. So assume we add three different scrape endpoints there. Afterwards, if the target allocator is enabled, it will take these scrape targets and spread them across our replicas, which are then responsible for getting the metrics. And yeah, that's basically how it works. There's also an option to enable Prometheus CRs, so we can then use those. And the target allocator, which is an extra instance created by the OpenTelemetry operator, will then get the targets from there. We see this quite well in this graphic. On the left side, we see pod one, which is using OpenTelemetry, and it's pushing the telemetry data directly to a collector. And in this gray box, we have two instances instrumented with Prometheus, and collector one and collector two are pulling the information from there. So this is all managed by the operator. We have seen the replicas, which are basically collector one and collector two, and since we enabled the target allocator, we get the targets from there, which are then coming from the PodMonitor. And finally, we send the information somewhere. So the last signal here is logs. For logs, there are different options. The first one would be to use the OpenTelemetry SDK, which we might not want right now because we need to do some work. But if we want to go ahead directly, there is the filelog receiver. We can configure it to get the information from files, and yeah, it's available in the contrib repository, and we have different parsers there which help us to move the logs into the OTLP format. We will see in a minute what this looks like.
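The idea of spreading scrape targets across collector replicas, as described for the target allocator above, can be illustrated with a few lines of Python. This is a deliberately simple round-robin sketch under assumed names (`allocate_targets`, the target addresses, and the collector names are all hypothetical); it is not the allocation strategy the operator's target allocator actually implements.

```python
def allocate_targets(targets, collectors):
    """Spread scrape targets across collector replicas in round-robin order.

    Targets are sorted first so that the assignment is deterministic.
    """
    if not collectors:
        raise ValueError("at least one collector replica is required")
    assignment = {collector: [] for collector in collectors}
    for index, target in enumerate(sorted(targets)):
        collector = collectors[index % len(collectors)]
        assignment[collector].append(target)
    return assignment


# Three scrape endpoints shared by two replicas, as in the talk's example.
example = allocate_targets(
    ["app-b:8080", "app-a:8080", "app-c:8080"],
    ["collector-1", "collector-2"],
)
```

Each replica then scrapes only its own share, so adding replicas reduces the scrape load per collector.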
And there are other options if you want to integrate with Fluent Bit. There is a forwarder, so you can use the collector as a kind of gateway. And yeah, the only thing we need to do then is configure it as a DaemonSet. We need to pass our information there, and the filelog receiver, for example, can then get all the logs. And what does this look like in the end? This is when we exported the logs to the logging output, so standard out. We see that we have the resource attributes, which are added automatically, and then we see the log information, and on the bottom the trace ID and span ID, which are not given if we just read the log from disk. But that's it. Yeah, then we are almost at the end. Yeah, thanks a lot for the interesting talk. Does anyone have questions? Any questions? Raise your hand. Question? Yeah, there we go. For the logging part, would you suggest replacing any kind of cluster logging, like Fluent Bit sending it off to Loki or something, with OpenTelemetry log scraping, or is that complementary? I'm not sure if I fully got it. So, yeah, in this case it's just another way, but the useful thing is that if you have the OpenTelemetry SDK, it will automatically add the trace ID to the log, and then you can correlate your signals. Sorry. So I'm a super newbie to this, so I failed to understand whether OpenTelemetry is trying to replace, for example, log parsers like Telegraf, which is able to generate Prometheus metrics by log scraping, or how Zipkin, which is the tracing thing, fits into the metric collection in this whole picture. So I'm trying to understand how you cobble together all these sources and how OpenTelemetry either replaces them or makes it easier to use all these technologies together. Thank you.
So maybe on this slide you see that the third pod is using Zipkin and Prometheus, and the collector can receive data in Zipkin format, it can scrape Prometheus metrics, then transform this data into OTLP or Zipkin as well, and then send it to the other collector. So the collector essentially can receive data in multiple formats, transform it into the format of your choice, and then use that format to send it to other systems. Hello and thanks for the talk. I'm just wondering, what's your strategy for filtering health check requests, for example, or the liveness probe requests that you get in the pod? Health checks, like to avoid generating traces for health checks? Sorry? To avoid generating traces for health check endpoints? Yeah. That's a very good question. So you could maybe configure the collector to drop the data, but I think the best way would be to tell the instrumentation to skip instrumenting those endpoints. To be honest, I'm not sure if this is implemented in the OTel agents, but I saw a lot of discussions around this problem, so probably there will be some solution. We have time for one last question, if there is any. No? Okay. Oh, no. And thanks a lot. |
Observe your API with API Gateway Plugins |
Welcome, Bobur. Hi, everyone. Thank you. Thank you. Hi, everyone. I am super excited to be here. Welcome to one of my sessions, about observing your APIs with API gateway plugins. Let me introduce myself first. My name is Bobur. I am a developer advocate for Apache APISIX. Sometimes it's so difficult to pronounce my name. People from different countries ask: is it like Bebo or Bobur? And then I say, okay, you can translate my first name as tiger. Bobur means tiger. My last name, Umurzokov, in an English version would be Livermore. In this case, Tiger Livermore. You can call me Tiger Livermore. It's up to you. And you can also reach out to me on these social channels if you have any questions regarding the sessions. So with that, we can get started. First things first, what I want to do now is take a selfie, because I have my TikTok account. I just started to run my blog recently. Just a moment. Maybe this is better. That was good. Thank you. I will just put some hashtags on that. So thank you Fabiano. Now I can leave. I did my job. I can go home. I did my Instagram picture, right? But I have a very interesting agenda for you today. We'll talk about what APIs and API observability are, and how we can use an API gateway as a central point for observing your APIs. And then we will break down the three pillars of API observability. We know that we have logging, tracing, and metrics, right? And we will learn how to enable these three pillars by using Apache APISIX plugins. And then I have a small demo for you. I hope you will like it. And that's good. APIs, right? API is just the three-letter acronym for application programming interface. By now, we are all familiar with this term, right? Because we are living in an increasingly API-centric world. Even ChatGPT uses APIs, because under the hood it calls some OpenAI APIs to query some language models. Now my question to you. Who doesn't know what an API is? Everybody knows. You don't know.
You are lucky, I have something for you. You will get this T-shirt because you don't know what APISIX is. What is your size? I will hand it to you here. Actually, I have enough T-shirts. If I get the first three good answers, I will give one to each of you. And we also have some medals and stickers. If you would like to get one, please feel free. It's like free stuff from the community, I would say. And we know that the success of your services, right, depends on the performance, availability, and integrity of your APIs. Here, another question arises. How do we achieve these three indicators of success for your APIs? Let's say the API should be available all the time, right? It should be easy to integrate. It should have high performance. Do you have any idea? Yes, you got one T-shirt. Yes, monitoring, exactly. And what do you say, distributing? Yeah, distributed system, monitoring? Just placing it as a service in a distributed process. Yeah, this is also a right solution. But you will get only the medal. Sorry. But one solution can be using API gateways. Because what we actually do nowadays is build multiple microservices and maybe several serverless APIs or multiple REST APIs for our service, right? And in this flow, the API gateway serves as a central point for routing all your incoming requests to the intended destinations. These destinations can be, as you can see, a database, or maybe third-party API services. Or they can also be some serverless APIs like Azure Functions or AWS Lambda, or maybe any other open source functions, right? And it means it acts as a single layer between your clients and the backend services, right? It can manage all the traffic coming to your backend services. It's a very straightforward term, right? The API gateway term.
And then it can also be the right point for your API observability, because it's uniquely positioned to see all the traffic moving from the client side to our backend service network. And instead of relying on, let's say, some other technologies, SDKs, APIs, and services to enable and improve this observability, you can easily delegate this job to the API gateway. We have a bunch of API gateways. I'm not selling anything, but we are talking now about APISIX and what kind of plugins you can use today. And next, for example, you can ask, what is Apache APISIX? Maybe you know the world's largest open source software foundation, the Apache Software Foundation. Maybe some of you are part of it, right? Who is a part of the ASF? Who is contributing to open source projects of the Apache Software Foundation? You, we have some people. And who knows Apache Kafka, Cassandra, Tomcat? Everybody knows. And Apache APISIX is also one of the fastest growing projects of the ASF nowadays. Of course, you cannot compare it with Cassandra or Tomcat, but maybe in the future. It was initiated in 2019, and we already have an open source community around the world. I am, for example, visiting from Tallinn, Estonia. We have contributors from the US and Canada. We have some from China, India, and so on. You can check it out. It's a very nice API gateway solution. And as you can see, API plugins are a very handy mechanism that can be plugged into your API gateway solution. With them, you can extend its functionality further. You can enable cross-cutting concerns like authorization, authentication, security, transformation, rate limiting, and so on. At the same time, you can enable some kind of observation, right? And when you're using APISIX, for example, you will come across multiple types of plugins broken down into several categories, or sometimes you want a custom plugin, right? To fit your needs.
And APISIX, like similar modern API gateways, provides multi-language support. You can choose your favorite language, the one you are familiar with, and you can create some custom plugins. Maybe you are a Java developer, or maybe you are a Go or Python developer. You can choose your favorite language and write the plugins. Let me show you. Or maybe you don't want to write code. There is a user-friendly dashboard you can use. You can just drag and drop existing plugins together to build new plugins. You can orchestrate one or multiple plugins in this way. You can specify some conditions, put some business flow on it, and generate new plugins. It's a very useful feature that the APISIX dashboard provides. For example, you can put together multiple observability plugins in one flow and then maybe enable all the observability backend tools in this diagram. You can maybe look at it later. Now let's jump into the main topic. What is API observability? Can anyone tell what API observability is? In one word. No? No. The other problem, like relation and latency, or just great status codes. Traces, yes, right? And I think I gave you this T-shirt, right? For your, yeah. The actual thing is doing is very important. Yes, it's also right, yeah. And it means API observability is all about how you observe your APIs, right? Instead of relying on predetermined data like metrics and monitoring and waiting for the failure, you can use API observability to explore unknown unknowns, as you can see in this diagram. Because we have traditional API monitoring and we have observability, right? Monitoring is focused on analyzing known unknowns. What does that mean? Known unknowns means you know what to measure: you know the number of requests, you know the number of errors. You know what to measure, right?
But on the other hand, API observability lets you analyze exactly what the issue was and how this issue occurred by using the three pillars, right? You know, again, logging, metrics, and tracing. API observability nowadays is a part of every API development team. Let's say, for example, who can use API observability during API development? Yes, as you can see, for example, product managers can use it to learn about consumption and usage of your APIs. Or maybe the security team can use it to detect or protect against possible API threats from outside, right? So as I said, let's have a look at these three pillars now and learn what the API gateway can provide as a solution for these pillars: tracing, logging, and metrics. If I start with logging, it's a very trivial step, right? It is the way to start your observability, because you can use logs to identify or debug what's happening. If you are a junior developer, you will first start digging into logs in order to understand the project, because the project has zero documentation. You need to have a look at the logs, you need to have a look at some API things. So here, in order to enable this logging, you can use some kind of plugin. For example, APISIX provides the http-logger plugin, kafka-logger, or others, depending on which kind of integration you have. For example, you can use the Google Cloud logger. Let's say http-logger is a basic logger. It just enables you to push all the log data from your API gateway to HTTPS or HTTP servers. It means you can further derive some useful metrics from those logs. Or let's say, what about metrics? Metrics are actually just data collected over time, right? Timestamps, plus the values collected. And you can use these metrics later in distributed systems like Elasticsearch. You can query these metrics using Elasticsearch, or maybe you can show these metrics by using a dashboard, like Prometheus, right? The Prometheus dashboard provides some metric analysis.
And for all these popular external platforms, let's say, like Grafana or Prometheus, Apache APISIX has pre-built connectors, I would say. And, for example, Zipkin. For tracing, we have the Zipkin plugin. You can enable it with just one click, and it starts to collect some metrics or traces and so on. So now, enough talking, right? I can show something for real. Because we have a bunch of plugins, as I said, for API observability, this time I decided to choose the Prometheus plugin for my demo. With respect to Fabian, I really like the Prometheus and Grafana integration, and now I will show you how you can enable this observability very fast. For that, I mean, you can have a look. That is my repo, apisix-dotnet-docker. It shows all the use cases of the API gateway in one repo. And it has a branch called API observability. If you navigate to this branch, you will see a real example of some plugins and how to enable them, and you can do some hands-on exercises. Now, let me switch to my VS Code. I'm using VS Code today for the session. But I'm talking about this repository. It has five branches. You can learn from some of the branches, like how you can enable health checks. Start from health check, and then API observability. This is the starting point for you. And then it's very fast to spin up this project, because it uses Docker Compose. And for the backend, we are using ASP.NET. You can use a Java project. It doesn't matter. You can use Python. I'm actually a Java developer, but I like to encourage myself to learn new languages. I started to learn C#, and this is a small project on .NET. So if you do docker compose up, of course, it will bring up some containers, right? All the useful containers, as you can see, like Prometheus and Grafana. We have a product API, which is built with .NET. I have a small endpoint that returns a list of products when you call it. As you can see, it's very simple here. When I do this, it returns...
Let me maybe make it bigger here. It returns a MacBook price and some other product prices. Simple. And also, I'm using Windows on my machine, and all the necessary containers are up and running with one docker compose up command. Then you can open the project code in your favorite IDE or editor, right? In my case, let's say VS Code, and you can see in the commands folder some command-line examples that I'm going to show today. First thing, how can you enable this Grafana or Prometheus, Zipkin, and so on? With the API gateway, you need to create your first upstream. If you navigate here, you can see some kind of command. Of course, you can use the dashboard to create this upstream. It's up to you. If you're a hardcore developer like me, you can just use the commands. What are we doing here? As you can see, I had a product API, right? And then I am creating an upstream. An upstream means just a set of backend API endpoints. And I have one single node. You can have multiple nodes. You can have multiple instances of the same product API. For the simple case, I'm just creating this one upstream with one node. Let's jump into the terminal. I can open a new terminal here to run this code. I'm using WSL, because on Windows it's a little bit difficult to run Linux commands. So let me open one more terminal on Ubuntu. There we go. I hope it's visible. And if I just click and press Enter, now I have said to APISIX: please create the upstream service, register my ASP.NET Web API as a backend service, and it should have this kind of configuration. Then the next step, the next easy step, is that for Prometheus I need to create a route. Because the API gateway has three very basic steps: you need to create the upstream, you need to create a route, and you need to enable some plugins. In my case, the Prometheus plugin, right? As you can see, I have some plugin configuration on the top. Only a single one, Prometheus.
And I'm giving a reference to the upstream that we created in the previous step. And that's all. I'm also specifying the URI, so that the route knows which URI this plugin should observe. I'm saying /api/products. Here we go. And then if I take this command and put it into the terminal, I enable this plugin very fast. And the APISIX admin is saying, okay, now you have a route, you have an upstream, now it's time to test. I enabled the Prometheus plugin with just this couple of steps, right? You see, five or six seconds, that's all. Compare that to how you enable Prometheus in Java projects, which may take a bit more than the same steps. With the API gateway, you are extracting a huge piece of work into a separate service. And it's highly scalable. That's one of the advantages of using an API gateway for your observability, right? And next, for example, I can generate some metrics by calling my API endpoint several times, maybe like this. As you can see, it's responding with the product list in the response. Maybe I can do it one more time, so we have some data on it. And now, if I request all the Prometheus metrics to see some results in this output, I can run another command, maybe like this. Here we go. As you can see, metrics are enabled. I can see some metrics, HTTP metrics plus some APISIX metrics, as you can see. If you look at the APISIX HTTP status, 200 was returned and it was fine. And, of course, it looks a little bit ugly, right? Now, you can even see these metrics on the dashboard. Just navigate to localhost, slash targets, because we are running on localhost. When you deploy to the cloud, you will have a domain and so on. But as you can see on Prometheus, if I refresh it, you will soon have some metrics on it. Of course, you can specify some parameters for your metrics. And let me... We need to maybe wait a little bit for these metrics to show up on the UI. Yeah.
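The three demo steps (create an upstream, create a route that references it and enables the Prometheus plugin, then read the exported metrics) can be sketched in Python as follows. The sketch only builds the request descriptions and parses metric text; it does not talk to a running gateway. The Admin API base URL, the node address `productapi:80`, and the sample metric lines in the usage note are assumptions for illustration and are not taken from the demo repository.

```python
import re

# Assumed Admin API base URL; adjust it to the actual deployment.
ADMIN_BASE = "http://127.0.0.1:9180/apisix/admin"


def upstream_request(upstream_id, nodes):
    """Describe the PUT request that registers a set of backend nodes."""
    return {
        "method": "PUT",
        "url": f"{ADMIN_BASE}/upstreams/{upstream_id}",
        "body": {"type": "roundrobin", "nodes": nodes},
    }


def route_request(route_id, uri, upstream_id, plugins):
    """Describe the PUT request that creates a route with plugins enabled."""
    return {
        "method": "PUT",
        "url": f"{ADMIN_BASE}/routes/{route_id}",
        "body": {"uri": uri, "plugins": plugins, "upstream_id": upstream_id},
    }


# Matches labelled samples of the Prometheus text format: name{labels} value
SAMPLE_LINE = re.compile(r"^(\w+)\{(.*)\}\s+([0-9.eE+-]+)$")


def status_counts(metrics_text, metric="apisix_http_status"):
    """Sum the samples of one metric, grouped by the 'code' label."""
    counts = {}
    for line in metrics_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group(1) != metric:
            continue
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group(2)))
        code = labels.get("code", "unknown")
        counts[code] = counts.get(code, 0.0) + float(match.group(3))
    return counts


# One node with weight 1, then a route observed by the Prometheus plugin.
upstream = upstream_request("1", {"productapi:80": 1})
route = route_request("1", "/api/products", "1", {"prometheus": {}})
```

Sending these descriptions with an HTTP client of your choice (plus the admin API key header the deployment requires) would correspond to the two commands run in the demo, and `status_counts` gives a quick summary of the otherwise "ugly" metrics output.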
Let me check the targets. It's unhealthy. Maybe we need to do a docker compose down and up. Every time it comes to a demo, something fails. It's unexpected. Docker compose down. Maybe I will do this trick. And then maybe an up in a moment. Yep. And here we go. If I do up, it will bring up and refresh the Docker stuff. This Docker is always kind of an issue. Now, if I do this, targets should be... No, no. It's still, yes. It's stopping. Yes. Now we are in an up state. And then let me also generate some logs once again. Or it should even be visible now. Let me go. APISIX. HTTP status, if I have any. Or you can even see any traces, any metrics here. As you can see, I can even see etcd. etcd is the storage for APISIX. What kind of data is being exchanged between APISIX and the storage? And so on. I mean, you can filter out all the metrics you want. And you can even see them in the graph. If the Prometheus dashboard is not enough, you can connect it with Grafana, right? Very easy. We have an APISIX Grafana dashboard. You can easily integrate these logs and metrics by visualizing them. So that was a very easy demo. I think we can continue. If you're interested in that, you can play with the other plugins in a straightforward way. And now maybe we can switch to my presentation. And if you have any questions, we can jump into the questions, because we are already running out of time. Here are some references I'm giving out to you. For you. Yep. Any questions? OK, thanks. Thanks a lot. So, I have one more T-shirt. Yeah, I have a question regarding the plugins. For instance, with Traefik, you have to kind of connect it to Pilot and download them as it starts. How do you develop custom plugins for APISIX and how do you package them with the API gateway? Sorry, once again, the last few sentences. The door was open. Sure. I was just wondering about plugin development, like how rich the ecosystem is for APISIX.
And if I were to develop a plugin, is it easy to package with the binary or do I have to connect to some kind of pilot and it downloads it? Yeah, very nice question. I love it. Very nice question. Yeah, it's very straightforward, because we now have support for five different languages. If you are using one of these supported programming languages, we have plugin runners. You don't have to build anything. You can just run. Of course, you need to create some binaries. For Java, let's say, a Java file. And this Java file has a connection using Unix socket files. It can communicate over Unix socket files and you can exchange some log data. I mean, the request data. I had another presentation also. Maybe I can share it with you after the session, about how to write a plugin in Java, how to create a plugin in Python, and so on. This is a T-shirt for you. So I will keep it. More questions? But I don't have a T-shirt. Hello. How do you compare APISIX to other API gateways like Gravitee, Kong, or... Which API gateways again? How do you compare APISIX to other API gateways on the market? How do I compare or... What's its main selling point? Sorry, once again. What's its main selling point? Why is it better than other API gateways? Code coverage, do you mean? Compared to the others, I think it's a little bit... I think it's a little bit hard to hear from here. I will come to... There are a lot of API gateways on the market. Yes, yes. Why is APISIX... Oh, do you mean benchmarking? Ah, okay, I got it. I'm not, of course, trying to sell APISIX, right? Even if I have a T-shirt. I'm just giving out a T-shirt. I'm an open source contributor, and one reason is that it provides hot reloading of plugins. You don't have to stop your instance. You can just run these plugins on the fly. You can switch off one plugin. You can write your custom plugin and enable it while this APISIX instance is running.
This is one of the advantages that other API gateways don't provide. And the performance is at the top now compared to the Kong API gateway. Our performance is high, twice as fast. You can check it out. And we have... APISIX is not only an API gateway, it also has an Ingress controller. You can use it as a Kubernetes Ingress controller, or you can even extend it to a service mesh for your Kubernetes services. That's one of the advantages. But these advantages also exist, of course. It is open source, too. Yeah. I hope I could answer. Sorry, it was a little bit hard to hear. Okay. Could you please elaborate a little bit on the scalability of those plugins? It would be nice. You're wondering how scalable it is, right? How much it can carry? How many observations can be done? Good question. You can enable multiple plugins. It's up to you how many plugins you would like to use. And there could be some problems when it comes to millions, maybe billions of API calls. That's why we have different deployment modes. For example, you can deploy APISIX standalone, or you can deploy multiple APISIX instances with one storage. Or one APISIX with different types of storage. Because etcd is, compared to other relational and non-relational databases, super fast at fetching data. That's why it's also easy to deploy in many places. It scales without any issue. Yeah. Does it make sense? Yeah, just a quick question. I saw there you had APISIX and NGINX. Is NGINX the underlying API gateway? And APISIX sits on top of it and expands its capabilities? Because NGINX is quite limited in that. Yeah, true. NGINX is the root of APISIX. It's built on top of NGINX. But it provides additional features, right? Yeah. We have one more question. I'll try to make it short. Can APISIX work with GraphQL APIs? Yes. GraphQL, I think, has also become very popular nowadays. Sorry. I think Nikola also had the talk about tracing, right?
He has some of the talks about this GraphQL. We are improving, we're contributing more now, massively on the GraphQL. My answer is it's possible, yes. You can try it out. Any other questions? OK. Thanks a lot, everyone. Thank you for coming. Thank you. |
Adopting continuous-profiling: Understand how your code utilizes cpu/memory
Introduction into continuous-profiling and how it can help you writing more efficient code |
Yeah. Hi, everyone. My name is Christian Simon, and I'm going to be talking about continuous profiling. We heard a lot about observability today already, and I want to introduce maybe an additional signal there. So maybe quickly about me. I'm working at Grafana Labs. I'm a software engineer there, and I worked on our databases for observability. So I worked on Cortex, now Mimir. I worked on Loki, and most recently I switched to the Phlare team. I'm 50% of the Phlare team, and we work on a continuous profiling database. There's not going to be a particular focus on Phlare today. Basically, what I want to do today is introduce how profiling is measured and what you can achieve with it, and then maybe, as I learn more, at the next FOSDEM I can go into more detail and look at very specific languages. So when we talk about observability, what are our common goals? Obviously we want to ensure that the user journeys of our users are successful, and that we can maybe even proactively find problems before a user notices them. And basically we want to be as quick as possible when those problems happen, to disrupt fewer of those user journeys. Observability provides us with an objective way of getting some insight into the state of our system in production. And even after a certain event has happened, when we found the right way, rebooted, and it's all up again, it might be able to help us understand what exactly happened when we want to figure out the root cause of something. So one of the, I guess, easiest and probably oldest observability signals is logs. I guess it starts with a kind of print hello somewhere in your code. And I guess we probably don't need a show of hands for who's using it. I guess everyone somehow uses logs, or is asleep if they don't show a hand. So basically your application doesn't need any specific SDK. It can probably just log based on the standard library of your language.
One of the challenges is that, most of the time, the format is rather varied. Even in terms of timestamps, it can be really, really hard to get a common understanding of your log lines. And when you then want to aggregate them, they are quite costly: it's bytes, so you need to convert them to ints, floats and so on. Something else that can happen is that you have so many logs that you can't find the ones you're actually interested in. A grep for "error", for example, could be helpful, but there might be just too many errors, and you lose the important ones. If you want to learn more about logs, my colleagues Owen and Kavi are going to speak about Loki, so definitely stay in the room. I'm going to move on to the next signal. Metrics: I'm also assuming pretty much everyone has come across them and is using them. In this case, you avoid the problem I mentioned before. You have the actual integers exposed, and you know that you care about those integers. To get a metric, most of the time you have to define a list of metrics you care about, and then you can collect them. So you might be having an outage and not have the metric you care about, and you need to constantly improve on the exposure of your metrics. Obviously, Prometheus is the main tool in that space. Very often we talk about web services when we think about those applications. So the RED method (get the rate of your requests, get the error rate of your requests, and get the duration, the latency, of your requests) can already cover quite a lot of cases. And as it's just integers or floats, you can aggregate them quite efficiently across a multitude of pods or a really huge setup of services.
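As a rough illustration of the RED method mentioned above, the following Python sketch computes a request rate, an error ratio and a 95th-percentile duration for one window of requests. The `Request` type, the field names and the nearest-rank percentile are assumptions made for this example; a real setup would typically export counters and histograms to Prometheus rather than compute these by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # how long the request took, in seconds
    failed: bool       # True if the request ended in an error


def red_summary(requests, window_s):
    """Rate, Errors, Duration for the requests seen in one time window."""
    if not requests:
        return {"rate": 0.0, "error_ratio": 0.0, "p95_duration_s": 0.0}
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank percentile: the smallest value with at least 95% of
    # the observations at or below it.
    rank = math.ceil(0.95 * len(durations))
    return {
        "rate": len(requests) / window_s,  # requests per second
        "error_ratio": sum(r.failed for r in requests) / len(requests),
        "p95_duration_s": durations[rank - 1],
    }


if __name__ == "__main__":
    # Hypothetical 5-second window: eight fast requests, one slower one,
    # and one slow failure.
    window = [Request(0.1, False)] * 8 + [Request(0.4, False), Request(2.0, True)]
    print(red_summary(window, window_s=5))
```

Because all three values are plain numbers, they can be summed or averaged across pods cheaply, which is the aggregation advantage the talk points out.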
And then if you get more into the microservices architecture that has evolved over the last couple of years, you will find yourself with a really complex composition of services involved in answering requests. You might even struggle to understand what's slowing you down or where the error is coming from: why do I have this timeout here? Distributed tracing can help you a lot with getting an understanding of what your service is doing. It might also be that the service is actually doing way too much and you're calculating things over and over again. So tracing is super helpful for seeing the flow of the data in your system. The challenge there might be that you have a lot of requests, and while it's somewhat cheap to get the tracing, you might not cover all of them. For example, in our production system (someone needs to correct me if I'm wrong), when we receive logs and metrics data in Grafana Cloud, we only cover 1% of those with traces, while we cover 100% of our queries. So basically you need to make a selective decision about whether it's worth the investment. Obviously log data always looks the same, it comes in every second and so on. We see more value in having all of those queries, where there's complex caching and all sorts of systems interacting, and that allows us to look a bit deeper and even find that one service that maybe is the bottleneck. So maybe looking at a bit of a real problem: I have an online shop, I'm selling socks, and a user is complaining about getting a timeout when trying to check out. That's obviously not great because I'm not selling the socks, but at least the user got a trace ID and is complaining to our customer service.
Starting from there, I figure out it's the location service that actually caused the timeout in the end. And then, looking at the metrics of the location service, I might find that 5% of the requests are actually timing out, so maybe 5% of my users are not able to buy their monthly socks or whatever. So what are the next steps? I guess scaling up is always good. Maybe the service is just overloaded. The person who wrote it left years ago, so we have no idea. So we just scale it up. Obviously, that comes with a cost, but I think the first thing should always be to stop the bleeding and make sure there are no more timeouts. So scaling up is definitely the right option here. But if you do that over years, you might suddenly find yourself with a lot of extra cost attached to that location service. That's where I think we need another signal, and I think that signal should be profiling. I guess most people have come across profiling. It basically measures your code: how long it takes to execute, for example, or how many bytes it allocates in memory. It helps you understand your program even more, or someone else's program in the location service case. And that can eventually translate into cost savings: if you find out where the problem lies, you can maybe fix it or get some ideas. You can also look at the fix afterwards and see whether things actually got worse or better. So that basically gives you a better understanding of how your code behaves. Now the question is, what is actually measured in a profile? I created a bit of a program. I hope everyone can see it. It's basically a program that has a main function that then calls out to other functions. You can see there's a do a lot and a do little function, and both of them then call prepare. And obviously, where the comments are, there's some work going on.
And that work could be allocating memory, using the CPU, something like that. Let's say we use CPU. When the program starts, we first do something within main. Let's say we spend three CPU cycles, which is not a lot, but that then gets recorded: yes, it took us three CPU cycles in main. We then go into the prepare method through do a lot, and there we spend another five CPU cycles. Those stack traces are then recorded in the profile, and going through the program, we end up with that kind of measurement of stack traces. While that works with ten lines of code, where you can maybe spot where the problem is (there's the 20 in do a lot), it definitely breaks down when you're talking about a lot of services or a large code base that happens to be hot and actively used. There are a couple of ways of visualizing profiles. I think one of the first things you would find is a top table. In that table, you can order by different values. This is an example from pprof, which is the Go tool, and you can clearly see that do a lot is the method that comes out on top. There are different ways to look at the values. You have the flat count, which is the function itself only, so you can see the 20 we had before, 20 CPU cycles. But we also have the cumulative measurement, which also includes the prepare that gets called from do a lot. And we can already see that we spend 52% of our program in do a lot. So maybe we can already stop looking at the table and just look at do a lot, because if we fix do a lot or get rid of it, we need half the CPU. And then there is the sum column, which is a running total. The sum will always change depending on how you order the table in that particular example.
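To make the flat, cumulative and sum columns concrete, here is a small Python sketch that rebuilds a top table from the stack samples of the example program described above (3 cycles in main, 20 in do a lot, 5 in do little, 5 in each prepare). The function names are adapted to identifiers, and the column layout is only an approximation of what pprof prints, not its actual implementation.

```python
from collections import defaultdict

# Call path -> CPU cycles spent in the last function of that path.
SAMPLES = {
    ("main",): 3,
    ("main", "do_a_lot"): 20,
    ("main", "do_a_lot", "prepare"): 5,
    ("main", "do_little"): 5,
    ("main", "do_little", "prepare"): 5,
}


def top_table(samples):
    """Return rows of (function, flat, flat%, sum%, cum), ordered by flat."""
    flat = defaultdict(int)  # cycles spent in the function itself
    cum = defaultdict(int)   # cycles spent in the function and its callees
    for stack, value in samples.items():
        flat[stack[-1]] += value
        for fn in set(stack):  # set() avoids double counting recursion
            cum[fn] += value
    total = sum(samples.values())
    rows, running = [], 0
    for fn in sorted(cum, key=lambda f: flat[f], reverse=True):
        running += flat[fn]  # sum% is a running total of flat%
        rows.append((fn, flat[fn], round(100 * flat[fn] / total, 1),
                     round(100 * running / total, 1), cum[fn]))
    return rows


if __name__ == "__main__":
    for row in top_table(SAMPLES):
        print(row)
```

With these numbers, do a lot has a flat value of 20 out of 38 cycles, which is the roughly 52% quoted in the talk, while its cumulative value of 25 also counts the prepare call below it.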
So in this case, we already have 100% at row number four, because we only have four functions. To get a more visual sense of what's going on, there are the so-called flame graphs. I think the most confusing thing for me about them was the coloring. Obviously, red is always not great. Should we look at main? No, we shouldn't. The coloring, I think, is random or uses some kind of hashing, and it's only meant to look like a flame. So the red here doesn't mean anything. If you're colorblind, that's perfect for flame graphs. What we actually want to look at is the leaf end, which is where the program actually spends its time. You can see the three CPU cycles of main here, where there's nothing below. Main uses 100% through the methods that are called by main, and then there's nothing beyond this part here, so here we spend something in main itself. In the same way, in do little we can see the five, in do a lot we can see the 20, quite wide, and then the prepares with five each as well. Now obviously, if you look across a really huge program, you can spot what's going on quite quickly. And it's similar with root and main: you can basically ignore them, but they help you locate which component of your program you want to look at. Maybe you're not good at naming and you call everything prepare and util, and it would still tell you roughly where it gets called and how the program gets there. So how do we get that profile? There can be quite a lot of challenges in getting it. I would say there are roughly two ways. Either your ecosystem supports profiles fairly natively, and then you instrument your application: you add maybe a library or an SDK, and the runtime within your environment exposes the information.
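A flame graph is essentially the same stack samples arranged as a tree, where each box is as wide as its total value and the part not covered by children is the function's own (self) time. The sketch below, under the same assumed sample values as the example program in the talk, builds that tree and prints it as indented text instead of drawing boxes; the node layout is an illustrative choice, not how any particular tool stores it.

```python
# Call path -> CPU cycles spent in the last function of that path.
SAMPLES = {
    ("main",): 3,
    ("main", "do_a_lot"): 20,
    ("main", "do_a_lot", "prepare"): 5,
    ("main", "do_little"): 5,
    ("main", "do_little", "prepare"): 5,
}


def new_node(name):
    return {"name": name, "self": 0, "total": 0, "children": {}}


def build_tree(samples):
    """Fold stack samples into a call tree with self and total values."""
    root = new_node("root")
    for stack, value in samples.items():
        node = root
        node["total"] += value
        for fn in stack:
            node = node["children"].setdefault(fn, new_node(fn))
            node["total"] += value  # width of the box in the flame graph
        node["self"] += value       # the uncovered "leaf end" of the box
    return root


def render(node, depth=0, lines=None):
    """Return one text line per box, indented by call depth."""
    if lines is None:
        lines = []
    lines.append(f"{'  ' * depth}{node['name']} total={node['total']} self={node['self']}")
    for child in sorted(node["children"].values(), key=lambda c: c["name"]):
        render(child, depth + 1, lines)
    return lines


if __name__ == "__main__":
    print("\n".join(render(build_tree(SAMPLES))))
```

In the output, root and main both span the full 38 cycles, which is why they can mostly be ignored, while the self values show where the time is really spent.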
That's not available for all languages. There's, I guess, a lot of work going on so that it becomes more and more available. The other approach is more through an agent, and eBPF has been quite hyped. I'm not very familiar with eBPF myself. I have used the agents but haven't written any code. But basically it uses an outside view. You wouldn't need to change the running binary; you would just look at the information you get from the Linux kernel. You hook in, I don't know, often enough when the CPU runs to find out what is currently running. There are differences between languages. For example, in a compiled language you have a lot more information. The memory addresses are the same, and you can use the information in the symbol table to figure out where your program is and what is currently running. In an interpreted language like Ruby or Python, this might be a bit harder, and that information might not be accessible to the kernel without further work. Also, when you compile, you might drop those symbol tables, so you really need to prepare your application a bit for that. I want to look at the prime example I'm most familiar with. I've mostly been a Go developer over the last couple of years. Go has quite a mature set of tools in that area. The standard library allows you to expose that information. It supports CPU and memory. Especially in garbage-collected languages, memory is quite a thing, and in non-garbage-collected languages too, so it's really important to understand the memory usage. I have a quick example of a program: you basically just expose an HTTP port where you can download the profile whenever you want, and you have the pprof tool that you can point at it. In that first-line example, you would just get a two-second-long profile.
That is a CPU profile that looks at the CPU for two seconds and records whatever is running and for how long on the CPU. Then you get the file, and pprof allows you to visualize it, through that top table for example. What I forgot to mention as well: later you can go to that URL and look at the profile that I used as an example. In the same way you can get the memory allocations, and pprof also allows you to launch an HTTP server to be a bit more interactive and select certain code paths. There is quite a lot in the Go docs on go.dev about profiling, so I'll definitely leave it there. If you are a Go developer, you can use that and play around yourself. But now I want to speak about why profiling might actually be quite difficult. In the example I had three CPU cycles, and if you think about it, that is not very much. Just recording what the program was doing in those three CPU cycles probably takes, I have no idea, thousands of CPU cycles. So you really want to be careful about what you record. If you really recorded all of that, your program would have a massive overhead, it would be slowed down by the profiling and behave completely differently, and you would also have a lot more data to store and analyze. And if you think about microservices and a replica count of 500, you might get quite a lot of data that is actually not that useful to you, because do you really care about three CPU cycles? Probably not. Because of that, to allow continuous profiling, so to do this in production across a wide set of deployments, I think Google were the first ones to do that, and they started to sample those profiles.
So instead of looking at really every piece of code that runs, Go, for example, looks 100 times a second at what is currently running on the CPU and records it. And obviously something like an integer add will not be on the CPU unless you run it all the time, so you get a really accurate representation of what is really taking your CPU time. With the way that works, you also need to be aware that the actual program might not be on the CPU at all, because it might be waiting for I/O. So when you collect a profile and the profile doesn't contain that many seconds, you really need to think: is this really what I want to optimize, or am I maybe not seeing what I actually want to see? With that statistical approach, I don't have any sources to cite, but I think you generally say that it's a two to three percent overhead added on top of your program execution, so that's a lot more reasonable than the full approach of recording everything. So what would you generally do? Obviously you first need to ship your application somewhere and run it. Then you can look at the profiles and think about them. Maybe you are the owner of that code and have a bit more understanding, and those profiles can give you a better idea of what you're actually doing there or how the system is reacting to your code. For that green box, multiple solutions exist. I'm obviously a bit biased, but I also have to say our project is fairly young and evolving. So for example there's CNCF Pixie, which is eBPF-based, there's Polar Signals' Parca (its people are in the room), there's Pyroscope, and there's our solution. I think they're all great. You can use any of them and start exploring, maybe just with your benchmarks for a start, and then as you get more familiar with it, you might discover more and more of the value.
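The sampling idea described above (checking what is on the CPU 100 times a second instead of recording everything) can be illustrated with a toy simulation. The Python sketch below walks a made-up timeline of function executions and records only what is running at each sampling instant. The function names and durations are hypothetical, and the perfectly regular, aligned sampling is a simplification that real profilers do not share.

```python
# One loop iteration of a hypothetical program, as (function, duration_ms),
# repeated five times for a total of one second of CPU time.
TIMELINE = [("parse", 31), ("tiny_add", 2), ("compress", 167)] * 5


def sample_timeline(timeline, interval_ms):
    """Count which function is on the CPU at t = 0, interval, 2*interval, ..."""
    counts = {}
    total_ms = sum(duration for _, duration in timeline)
    for t in range(0, total_ms, interval_ms):
        elapsed = 0
        for name, duration in timeline:
            elapsed += duration
            if t < elapsed:  # this function is running at time t
                counts[name] = counts.get(name, 0) + 1
                break
    return counts


if __name__ == "__main__":
    # 100 samples per second, i.e. one sample every 10 ms.
    print(sample_timeline(TIMELINE, interval_ms=10))
    # Recording every millisecond sees everything, at 10x the cost.
    print(sample_timeline(TIMELINE, interval_ms=1))
```

At 100 samples per second the short `tiny_add` calls are never observed, while `parse` and `compress` come out at about 20% and 80% of the samples against true shares of 15.5% and 83.5%. That is the trade-off the talk describes: small functions may be missed, but the picture of what dominates CPU time stays usable at a fraction of the recording cost.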
So I'm still going to use Phlare now for my quick demo. Let me just see. I guess most of you are familiar with Grafana and Explore. Why is it so huge? So basically that's the entry point you're going to see in Explore. You have the typical time range selection, so let's say we want to see the last 15 minutes, and here we can see the profile types collected. That's just a Docker Compose setup running locally on my laptop, hopefully running locally on my laptop since I started the talk. For example, we can look at the CPU here. That's the nanoseconds spent on the CPU, and you can see the flame graph from earlier, and maybe some bug. I don't know, it looks a bit bigger than it usually should. But we can see that top table, and we can see the aggregation across all of the services. I'm running five pods or something like that, in different languages. You can see, for example, that this here is a Python main module where it's computing some prime numbers. What I first want to do here is break down by label, and that's really the only functionality that we have in terms of querying. Here we look at the different instances and see the CPU time spent. There's a Rust pod, and they are both blue, so I don't know which is which, but I guess Phlare is doing more, so that might be the Phlare one. For my example now, I want to look at just a small program that I wrote to show that aspect. Here we can see the timeline. A profile gets collected, I think, every 15 seconds, and that's basically a dot. The flame graph and the top table below then just aggregate that, so there's no time component in there. That's also important to understand. And so, for looking at memory, I'm now going to switch to the allocated space. And oh no.
And here we have some label selection that you might be familiar with. For this random pod here, you can see the allocations: the amount of memory that gets allocated is around six megabytes. But then roughly every five minutes you can see a peak. If you look at the flame graph, there's already a big red box (and the colors don't matter), but basically you can see that this piece of code is doing the majority of the allocations. And you could even zoom in here if you really want to figure it out, and then it gets even bigger and you can see more of what's going on. So now, if you actually want to look at the code for this: if Phlare is maybe at version 0.6, we could even see the line of code that we should look at; for now you can't. But basically it would show us the allocations in line 21. I guess most of you can see what this program is doing: every five minutes it has a peak of allocations. And you only see that because you have the time component, which you can select and then see the flame graph aggregation for. Cool. Yeah, that was almost my talk. I have one more slide that I just quickly want to show. In the latest Go version, 1.20, there is profile-guided optimization, and I think that might be a really big topic. What it does is look at your profile, which can come from production, from benchmarks, from wherever, and try to make better decisions at compile time about what to do with your code. For example, I think the only thing it does right now is make inlining decisions. Basically, if it sees that a method is called a lot and is in the hot path, it would decide to inline the method; if it's a bit colder, it would not. A compiler can be a lot more accurate if it has that kind of data, if it knows whether a method is in the hot path or not. Okay, that was it.
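The point from the demo about the time component can be sketched as follows: each scrape produces one profile, and the flame graph or top table for a selected time range is simply the sum of the profiles inside that range. In this Python sketch the timestamps, function names and megabyte values are invented for illustration; this is not Phlare's storage or query code.

```python
from collections import Counter

# One profile per scrape: (timestamp_s, {call path: allocated MB}).
# A hypothetical service allocates ~6 MB per interval, plus one spike.
PROFILES = [
    (0,  {("main", "handle"): 6}),
    (15, {("main", "handle"): 6}),
    (30, {("main", "handle"): 6, ("main", "rebuild_cache"): 50}),
    (45, {("main", "handle"): 6}),
]


def merge_profiles(profiles, start_s, end_s):
    """Sum all profiles whose timestamp falls in [start_s, end_s)."""
    merged = Counter()
    for timestamp, samples in profiles:
        if start_s <= timestamp < end_s:
            merged.update(samples)  # adds values per call path
    return dict(merged)


if __name__ == "__main__":
    # Whole minute: the spike is visible but mixed with steady-state work.
    print(merge_profiles(PROFILES, 0, 60))
    # Selecting only the peak isolates what allocated during it.
    print(merge_profiles(PROFILES, 30, 45))
```

The merged result itself carries no timestamps, which matches the remark that the flame graph has no time component; the timeline above it is what lets you pick the interesting window.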
Thank you. Thanks a lot, that was awesome. Questions? Thank you. Thank you for the talk. I'm just wondering how the profiling would work with heavily multi-threaded code. Is there the ability to drill down to that level? Yeah, so in terms of multi-threading, obviously we only have the main method in that example. You can see root, and then main is 100%. If it's multi-threaded, you would maybe have more of those, but it's basically only the stack trace that gets recorded. You would not see the connections, where a thread is branching off and things like that. You would get the stack trace, the call stack. Have you looked into any profiling formats other than pprof for ingestion? I know OpenTelemetry has been doing some work on introducing a profiling format that people can standardize on, but I don't know if you've looked at that at all. Yes. Sorry, can you repeat that? I struggled to hear. So I was wondering if you've looked at any profiling ingestion formats other than pprof. No. Right now we use pprof with Phlare. I think there are a lot of improvements to be had over that format, and as far as I know there's some active work around OpenTelemetry to come to, I guess, a better format, in the sense of not sending symbols over and over again and reducing ingest. But no is the accurate and short answer. Okay, so thank you for the talk. My question is this: looking at the Phlare architecture, it's currently a pull model, so the Phlare agent scrapes the profiling data from the applications that it is configured to scrape. Is there an eventual plan to also add maybe a push gateway or a similar API for applications where this might be suitable?
Yeah, definitely. I can also see the push use case, maybe if you want to get your micro-benchmarks in from CI/CD. The API in theory allows it, but the tooling is missing. I definitely think push is a valid use case as well. In terms of scalability, I think pull will be better, but yeah, I agree. Thanks for the talk. I have a small question. Did you try to use this tooling at the end of CI/CD, for continuous optimization? No, we're not using it for that yet. I think it's definitely a super useful thing, because you want to see how a pull request behaves, maybe how your application allocates more or less in different parts, and whether the trade-offs are right. So I think it definitely can and should be used for that, but no, there's no tooling right now. Yeah, no, I fully agree as well. Hello, thank you. So if I understand correctly, profiles are something like traces combined with OS metrics, right? So at a specific point in time you can see how much CPU you used and so on, right? Yeah, I guess it looks a bit more at the actual line of code. I don't know, I haven't used tracing where it automatically finds the function; maybe that also tells you the line of code. But yeah, it definitely adds some metrics without you doing much, I guess, other than making sure it can read the symbol tables and the function names. Yeah, so I just had a dumb question, or a dumb idea. For example, you already have node exporter, which exposes metrics at all times, so you have OS metrics, and you have traces. Couldn't you just have some kind of integration in Grafana, or somewhere else, that combines traces with metrics?
Yeah, so I think people who have worked on continuous profiling software for longer tried to reuse Prometheus, and where you end up is just very high cardinality. It's too many lines of code, and that's where it stops. But in theory, I guess most PromQL constructs and functions are something we may need to implement on top of that in a similar way, because in the end you just get metrics out of it. So basically the problem was too many lines of code and too much changing over time, and you just get too much series churn through that. So thanks a lot. Yeah, thank you for coming. |
Loki: Logging, but make it cloud native
Get started with Loki, self dubbed "Prometheus, but for logs" |
All right, hey everyone, my name is Owen, this is Kavi, and we're going to be talking about Loki today. This is a project that's very near and dear to my heart, one I've been working on for a while, but I believe it's actually the second FOSDEM talk on the subject. The first one was right here, by Tom, in 2017 or 2018, so we'll cover some of the differences since then. Let's jump in. First of all, you're going to get a few things from this talk. One: how Loki works a little bit differently from a lot of things that came before it, what some of those trade-offs look like, and why I think that's an advantage. And then you can also learn some of the tips and tricks that we've learned building cloud-native distributed systems that need to be up all the time, and maybe you can incorporate some of that into your own designs later. We are both engineers at Grafana Labs. We work primarily on Loki, the open source project, but we also do a bunch to make sure it runs as part of our SaaS. I'm Owen, and this is Kavi again. You can find the project here; this is our GitHub repo. We are operators and implementers, so we both build and operate the software. I use Loki every day to debug Loki, and that's kind of a labor of love, but it comes about because we get paged a lot. Well, not a lot, I actually shouldn't say that, you know. But we do get paged, and so we are very empathetic to that fact. This is actually the last time I was here in Belgium. This is in Ghent, on top of the Gravensteen. This is not staged; I actually did get paged here. And this is me using Loki to figure out what was wrong with our hosted version of Loki itself. But this is my first time here at FOSDEM, and it's been a lot of fun so far, so thanks for having me. This next line was actually coined by a friend and colleague, Ed Welch: Loki is a time series database, but for strings. This is effectively how Loki works at the end of the day.
So first off, we'll jump into figuring out what exactly a time series database is. So yeah, what exactly is a time series database, right? If you think of a normal database, you always see a key and a value, right? In a time series database, surprise, surprise, you also have a timestamp. So what you see is that for every unique identifier you have an array of records, or tuples, whatever you want to call them. Each record has a value and a timestamp attached to it, right? In this example, as you see, for this identifier you have a value v0 at timestamp t0, a value v1 at timestamp t1, and so on and so forth. This is the mental model we want you to keep in mind throughout the talk, so that you understand why we made some of the decisions the way we did. To see this in action: this is how it looks in Prometheus, and this is how it looks in Loki. What serves as the identifier here is a unique combination of what we call a label set. Here it's app equal to nginx, cluster equal to us-central-0. This is the unique identifier, which has this list of records. And as you can see, the only difference between Prometheus and Loki here is the type of the value, right? In the record you see timestamp, comma, value, and in Prometheus the value is a float64, the 34.5 you see here. In Loki, it's a string. It's the log line you ingest into Loki. So when we say Loki is a time series database for strings, that's what we mean. So yeah, it's a log line there. We definitely take, or steal, or however you want to put it, a lot from Prometheus itself. And this is actually what that looks like in terms of how we index and store data. We're going to talk a lot about indexing in this talk, or particularly the lack of indexing. Loki basically just indexes metadata, and doesn't index the log contents at all, which tend to be the vast majority of all this data.
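The mental model above can be written down in a few lines of Python: a series is identified by its label set and holds a list of (timestamp, value) records, and the only difference between the two systems in this picture is the value type. The helper names, timestamps and log lines below are illustrative assumptions, not the on-disk format of either Prometheus or Loki.

```python
def series_id(**labels):
    """A label set identifies a series; label order does not matter."""
    return frozenset(labels.items())


def append(db, labels, timestamp, value):
    """Append one (timestamp, value) record to the series for `labels`."""
    db.setdefault(series_id(**labels), []).append((timestamp, value))


# Prometheus-style: the value is a float.
metrics = {}
append(metrics, {"app": "nginx", "cluster": "us-central-0"}, 1675500000, 34.5)
append(metrics, {"app": "nginx", "cluster": "us-central-0"}, 1675500015, 35.1)

# Loki-style: same identifier, same timestamps, but the value is a log line.
logs = {}
append(logs, {"cluster": "us-central-0", "app": "nginx"}, 1675500000,
       'GET /index.html 200')
append(logs, {"cluster": "us-central-0", "app": "nginx"}, 1675500015,
       'GET /missing 404')

if __name__ == "__main__":
    key = series_id(app="nginx", cluster="us-central-0")
    print(metrics[key])
    print(logs[key])
```

Only the label set and timestamps act as the lookup key here; the strings stored as values are never indexed, which is the design choice the talk goes on to discuss.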
So here we're looking at the set of Prometheus-style label values and the timestamps. All of that is controlled and used for routing queries. And then there are the log contents themselves, which, contrary to the size of the underlying graphs on the screen, the lines, are much, much larger. So that's the biggest piece here. This allows us to use our expertise about the systems that we run and use these labels, which tend to be topological, so the source of your contents: the application equals API, the cluster, the environment, that sort of thing. Then we slice down to the time period that we care about, and we can use our own expertise as operators to figure out where we should look. As you can see here, this is maybe a pretty broad selection, corresponding to about a terabyte of data over that period, but you can mix and match these things. And so this goes down to a fraction of that, about 10 gigabytes, by just looking at the set of applications that we actually care about rather than all of the applications and replicas deployed in that cluster. To give you a bit of a taste of what we mean when we say Loki is performant and Loki executes queries fast, this is what we mean. This is a metric we took from one of our internal clusters. What you're basically seeing here is that this particular Loki cell, at peak, is processing around 50 terabytes per day, right? And what you see in the UI (it's the Grafana UI, by the way) is someone running some LogQL, which is the query language we use to get visibility into the logs that you ingest into Loki. We'll talk about LogQL a bit later. So yeah, this is specifically a metric query. Basically, you are trying to figure out metrics on the fly, just from your logs, without any instrumentation, right? This particular query processed around 10 terabytes of data in 12 seconds, which is almost one terabyte per second of throughput.
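The routing step described above (pick label values, pick a time range, and only then touch log contents) can be sketched with a small index of chunk metadata. The `Chunk` type, the sizes and the label values are assumptions chosen to echo the terabyte-versus-10-gigabytes example from the talk; Loki's real index and chunk format are more involved.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    labels: dict   # indexed: where the logs came from
    start_s: int   # indexed: time range covered by the chunk
    end_s: int
    size_gb: int   # the unindexed log contents live in object storage


INDEX = [
    Chunk({"cluster": "us-central-0", "app": "api"}, 0, 3600, 10),
    Chunk({"cluster": "us-central-0", "app": "nginx"}, 0, 3600, 400),
    Chunk({"cluster": "us-central-0", "app": "batch"}, 0, 3600, 590),
    Chunk({"cluster": "us-central-0", "app": "api"}, 3600, 7200, 12),
]


def select_chunks(index, matchers, start_s, end_s):
    """Return chunks whose labels match and whose time range overlaps."""
    return [
        chunk for chunk in index
        if all(chunk.labels.get(k) == v for k, v in matchers.items())
        and chunk.start_s < end_s and chunk.end_s > start_s
    ]


def gigabytes_to_scan(chunks):
    return sum(chunk.size_gb for chunk in chunks)


if __name__ == "__main__":
    broad = select_chunks(INDEX, {"cluster": "us-central-0"}, 0, 3600)
    narrow = select_chunks(INDEX, {"cluster": "us-central-0", "app": "api"}, 0, 3600)
    print(gigabytes_to_scan(broad), gigabytes_to_scan(narrow))
```

The index never looks inside a log line; it only decides how much data a query has to read, which is where the operator's knowledge of the system pays off.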
So that's what we mean when we say, yeah, Loki is fast and performant. Yeah, and my favorite piece here is that these graphs are actually constructed from logs themselves, not from Prometheus metrics or anything. We log query metadata for every query that comes into Loki, and we're extracting the part of the log line that says how fast it processed data, accumulating it over a bunch of subqueries to get this final value, and then graphing that over time. So let's step back now. I like to think about how Loki was designed as a response to what came before it. If we look really far back, we remember using tail and grep on individual log files, and that's still in use a ton today, but it doesn't work so well when you start to need hundreds or thousands of machines, and maybe they're ephemeral, right? So we had to build new strategies for handling this, and a bunch of things have come up in the past 10 or 15 years. And that leads us into the next point, where we've accumulated all of this complexity and sometimes really miss that experience. I like this tweet because it's just incredibly emblematic of that: sometimes I really wish I did have grep on top of all of the log aggregation, and on top of the underlying complexity and scale that we've accumulated over the past couple of decades. Yeah, so broadly speaking, that's one of the goals of Loki, at least on the query side: to take the same experience you had before with just grep and tail, which you're confident with, and ask, can we bring that same experience into this modern era of cloud-native distributed systems, right, where you have your logs spitting out from different machines and different applications? Yeah, a similar setup, right?
So like I mentioned before, Loki has this language called LogQL, and that's how you query your logs back to get some visibility. It is heavily inspired by Prometheus, so people who are familiar with Prometheus may already get a quick grasp of it here. This particular query you see at the top is basically saying: give me all the logs from this particular namespace, let's say loki-dev-005, and then give me only the logs that match error in the log line, right? As you can see, the experience is like a pipe: |= "error". And you can combine multiple pipes here. So that's the kind of experience we're talking about, right? It doesn't mean you can only grep for one specific pattern. You can also grep for specific IDs; a use case could be an order ID or a trace ID you want to find in the logs. You can also do a regex match here, and, like I said, you can also put multiple pipeline stages here, right? You can do AND or OR, it doesn't matter. So it's basically this: first you choose which logs to look at, the routing Owen was talking about, and then you can do all your piping, mixing and matching. That's the idea we're talking about. This next query is a bit different, as you'll have seen, compared to the previous two examples, which we call log queries; this one is a metric query. With a log query, the output you see when your query is executed is a list of logs that match the particular pattern, right? Here, it's a metric. If you look at what this query does, it's similar to the last one, but we added two different things here, right: the rate and the sum by.
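The slides with the queries are not part of this transcript, but the two kinds of query just introduced can be mimicked in plain Python to show what each stage does. In the sketch below, the comments show LogQL of the general shape being described (stream selector, `|=` and `|~` line filters, `rate` and `sum by`); the namespace value, stream contents and helper names are assumptions for illustration, and the single fixed window is a simplification of how `rate` is evaluated over time.

```python
import re
from collections import Counter

# (labels, entries) pairs; each entry is (timestamp_s, log line).
STREAMS = [
    ({"namespace": "loki-dev-005", "container": "ingester", "pod": "ingester-0"},
     [(5, 'level=error msg="flush failed"'), (20, 'level=info msg="flushed"'),
      (50, 'level=error msg="flush failed"')]),
    ({"namespace": "loki-dev-005", "container": "querier", "pod": "querier-0"},
     [(12, 'level=error msg="timeout" trace_id=abc123')]),
    ({"namespace": "loki-dev-005", "container": "querier", "pod": "querier-1"},
     [(30, 'level=error msg="timeout" trace_id=def456'), (31, 'level=info msg="ok"')]),
    ({"namespace": "other", "container": "querier", "pod": "querier-0"},
     [(40, 'level=error msg="ignored: wrong namespace"')]),
]


def select(streams, **matchers):
    """Stream selector, like {namespace="loki-dev-005"}."""
    return [(labels, entries) for labels, entries in streams
            if all(labels.get(k) == v for k, v in matchers.items())]


def contains(streams, needle):
    """Line filter, like |= "error"."""
    return [(labels, [e for e in entries if needle in e[1]])
            for labels, entries in streams]


def matches(streams, pattern):
    """Regex line filter, like |~ "trace_id=abc.*"."""
    rx = re.compile(pattern)
    return [(labels, [e for e in entries if rx.search(e[1])])
            for labels, entries in streams]


def rate_sum_by(streams, label, window_s):
    """Like sum by (container) (rate({...} |= "error" [1m])), one window."""
    counts = Counter()
    for labels, entries in streams:
        counts[labels[label]] += len(entries)
    return {value: n / window_s for value, n in counts.items()}


if __name__ == "__main__":
    errors = contains(select(STREAMS, namespace="loki-dev-005"), "error")
    print(matches(errors, r"trace_id=abc\d+"))   # log query: returns lines
    print(rate_sum_by(errors, "container", 60))  # metric query: returns numbers
```

The log query returns matching lines, while the metric query collapses the same filtered lines into an errors-per-second number per container, with the two querier pods summed together.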
So what this means is that, without doing any instrumentation, with the logs coming in as they are, you can find the errors-per-second rate of all your applications, aggregated by container. It doesn't matter how many applications are running in this namespace; you can aggregate by container just from this metric query. So that's the experience we want with LogQL. Yeah. This is probably my favorite part about Loki. The ability to extract information at query time ultimately means that you can be reactive instead of proactive. You don't have to figure out a schema and make sure that you're recording something in a precise way before the bad thing happens, because a lot of the time it's the unknown unknowns that get us. The ability to extract this structure, and particularly to do it from logs, really allows you to figure things out when they go wrong, rather than having to re-instrument before you can get that information. So next up, we're going to talk a little bit less about the experience of querying and more about how Loki is constructed. These are the kinds of design choices that we make: being able to scale different parts of the system individually, and tuning your preference for cost versus latency on the read path. Loki is intended to run on commodity hardware, and our only real dependency is object storage. So we store everything cheaply in a generally managed solution, although you can run your own object storage or use file system backends if you prefer. So this is how the Loki ingestion path architecture looks at a high level. When you send logs to Loki, this is what they go through. The key takeaway here is, like Owen said, that the only external dependency you see is the object store. Everything else is a component of Loki.
Of course, we have different deployment models. All these components can be put together in a single binary. So if you're new to Loki and you're just starting to play with it, maybe at a small scale, you run it as a single binary, as a single process, and you send all your logs to that single process. You just put in the bucket name and you're done. All your logs are getting ingested into Loki just fine. That's what we mean when we say there are fewer dependencies in running Loki. In this case, it's the ingestion path. One of the things about not indexing the log contents themselves is that it's really easy to ingest data from a bunch of different sources. Maybe you work at an organization where one team writes in one language and another team writes in another language. They have no standardization over the formats they're using or the schemas of their logs, if they're using structured logging at all. So you can have one team logging in JSON, one team that just pulls in NGINX logs, and another team that uses something like logfmt. And that's all OK, because Loki doesn't really care. We just index the source of where the logs came from, along with the timing information. So each individual team can extract that information when they query, and choose what they actually care about in their logs and how. Yes, speaking of the query path: this is the high-level architecture on the query side. Again, these are all Loki components, except for two dependencies. One is object storage, of course. The other one is optional, which is a cache. The same pattern applies here: you can combine all these Loki components into a single binary and run it as a single process. All you need to do is point it to the persistent object store, and optionally to a cache for performance. And that's it. You're good to go on the read path.
Again, if you want to run in a highly available fashion, let's say you hit some scale and you want to tweak some things, this is how we run it in our internal SaaS. Here you have more control: you can tweak individual components, you can scale individual components. That's the idea. Again, the key thing here is the simple external dependencies. And it's really powerful to be able to develop on a single binary, running all these subcomponents that eventually run as microservices in a single process, and then being able to scale that out depending on your tolerance for scale, running HA, data replication, that sort of thing. Because the index in Loki is very minimal, it's just this routing information, the data topology really, it can be much, much smaller than you would otherwise think. I actually pulled this from one of the clusters that we run internally. It's close to 40 terabytes a day of logs and just shy of 140 megabytes of index. So it's much, much smaller. And this gives us a lot of benefits when we talk about some of the next stuff. Yeah, so if I were you, I'd probably be asking this question: okay, folks, you're talking a lot about tiny indexes, so how on earth do you make queries fast? Whether you come from an academic background or you're a practitioner building distributed systems, you've always been taught that if you want to make something faster, you index it. Here we are sharing our experience: if you index everything to make everything faster, at some point in scale your index is going to be much larger than the actual data. And in our experience, handling a huge index creates many more problems at scale. That's the whole idea here. So let's understand how Loki makes queries fast with a tiny index. And this is how.
So don't let the image scare you. Let's understand it piece by piece. At the top, you see the query that comes in. For the sake of discussion, let's say this query is asking for data over a one-hour time period. The first thing the query path you saw before does is take this huge query and split it by time into sub-queries. In this case, it splits the one-hour query into four 15-minute queries. We call each one a sub-query. And the trick is that it doesn't stop here. The Loki index is designed in such a way that Loki can look into the index and say, hey, this is how many bytes this query needs to touch. So we can dynamically decide how many workers you need to process that query. That's where the performance comes from. For example, let's say this 15-minute sub-query is touching, I don't know, 10 gigabytes of data. You can then plan accordingly how many workers to schedule this query onto. Let's think about this for a while. The key takeaway here is that you get control over cost versus performance. You start with Loki, and at some scale you're going to hit a limit. And you can increase the performance just by adding more query workers. That's the key control we want to have with Loki. So that's how we make queries fast. The next thing we're going to talk about is retention. We get asked this a lot, particularly by people wanting to store things like application logs for some period of time. We use 30 days as our standard, but it could be whatever your use case is, and then things like audit logs for much longer periods of time. This is pretty easily tunable in Loki. But the important part here goes back to the index size we were talking about earlier.
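The time-splitting and worker-planning steps described above can be sketched in a few lines. This is only a toy model of the idea, not Loki's scheduler. The function names, the one-gibibyte-per-worker budget and the dates are assumptions for illustration; the one-hour query, 15-minute split and 10 gigabytes per sub-query follow the example in the talk.

```python
import math
from datetime import datetime, timedelta


def split_by_time(start, end, step):
    """Split [start, end) into consecutive sub-ranges of at most `step`."""
    ranges = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        ranges.append((cursor, upper))
        cursor = upper
    return ranges


def workers_needed(bytes_to_scan, bytes_per_worker):
    """Number of workers so that each scans at most `bytes_per_worker`."""
    return max(1, math.ceil(bytes_to_scan / bytes_per_worker))


GIB = 1024 ** 3
start = datetime(2023, 2, 5, 10, 0)  # arbitrary example start time
subqueries = split_by_time(start, start + timedelta(hours=1), timedelta(minutes=15))

# Assume the index reports 10 GiB to scan for every 15-minute sub-query,
# and that we are willing to give each worker about 1 GiB.
plan = [workers_needed(10 * GIB, 1 * GIB) for _ in subqueries]
```

Lowering the per-worker budget raises the worker count and therefore the cost, which is the cost-versus-performance control mentioned in the talk.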
As we don't index the log contents themselves, retention is very easy and cost efficient to do, because all of our data is stored in object storage. If we extrapolate that earlier index sizing slide out to what it would look like in a year, this is roughly what we get. Again, we just use this index for routing information, which means that all of the data is effectively live. There's no rehydrating, no hot and cold storage tiers. Everything is served out of object storage, generally all the time. Now, there's some nuance there: you can put caches in a couple of places and that sort of thing. But the idea that you can use object storage as your primary backend is very powerful, especially when you consider cost over long periods of time. All right. So we've been saying Loki has been built for operators and for devs. When it comes to operations, if you've run any database or any distributed system at scale, you always want to stay on top of the latest release of whatever product you're running, to get the latest optimizations, the latest features, and so on. The other use case is that sometimes you want to migrate your data from one persistent store to another. In this case, maybe from one GCS bucket to another, or even across cloud providers. Sometimes you find that maybe S3 is better, and you want to change from GCS to S3. So with all these use cases, can we do these operations with zero downtime? That's something we can do in Loki. We have done it many times. To give you a concrete example, this is one of my favorite parts of the Loki config. We call it the period config. What you're seeing here is that we have something called a schema version.
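The period config idea introduced here amounts to a list of dated entries, where each entry says which schema version and which object store apply from its start date onward. The sketch below models that lookup in Python. It is an assumption-laden illustration rather than Loki's implementation: the `Period` class, its field names and the two example entries (v11 on GCS, then v12 on S3 from January 2023) are hypothetical and do not reproduce Loki's actual configuration keys.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Period:
    start: date        # first day this period applies to
    schema: str        # index schema version used from that day
    object_store: str  # where data written in this period lives


# Hypothetical periods combining both examples: a schema bump and a store move.
PERIODS = [
    Period(date(2022, 1, 1), "v11", "gcs"),
    Period(date(2023, 1, 1), "v12", "s3"),
]


def period_for(day, periods=PERIODS):
    """Return the latest period whose start date is on or before `day`."""
    candidates = [p for p in periods if p.start <= day]
    if not candidates:
        raise ValueError(f"no period configured for {day}")
    return max(candidates, key=lambda p: p.start)


def periods_for_range(first_day, last_day, periods=PERIODS):
    """Return every period that a query over [first_day, last_day] touches."""
    ordered = sorted(periods, key=lambda p: p.start)
    touched = []
    for i, period in enumerate(ordered):
        next_start = ordered[i + 1].start if i + 1 < len(ordered) else date.max
        if period.start <= last_day and next_start > first_day:
            touched.append(period)
    return touched
```

A query that spans the boundary touches both periods, so old data is still read from the old store with the old schema while new data goes to the new one. That is why no downtime or bulk rewrite is needed.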
Whenever we change anything in the index, or a new feature comes in that uses a different index format, we change this version. In this case, you see v11 to v12. Let's say you want to start using this new feature from January 2023. All you need to do is add an entry with version v12 and that start date, and you're done. Loki understands this and can work with both schema versions. So this is how you can upgrade your schema in production, live, without downtime. That's one example. The other one is the migration I talked about before. You may want to move your data within a cloud provider or across cloud providers. It's the same thing here: from January 2023, I need to store all my new data in S3 instead of GCS. So you go back, you change this one config, and you're done. Loki again understands this. When a query comes in, it checks which time range the query is asking for, and it goes and fetches the data accordingly. And again, without downtime. It also works really well with retention. Let's say you have 30-day retention: after 30 days you don't care about the old data anymore, and all your new data is stored in the new bucket. So yeah, this is FOSDEM, and it's all about the community. We launched Loki in 2019 as open source, and we have an active community going. These are some of the ways you can reach us. We have a public Slack and a community forum. And every month we have a Loki community call; we alternate between US and EU time zones. So come say hi. If any of the things we talked about excite you, come talk to us. We'll be more than happy to have new contributors to the project. Yeah, we should probably have asked this in the beginning, but is there anyone out there using Prometheus?
Yeah, a couple. All right, a lot. Yeah, good. What about any Loki users? Okay, not bad. Thanks. That makes me really happy. Anyone run into hard configuration problems with Loki? Yeah. Oh, no hands. That's weird. No, I'm just kidding. Yeah, we've got some work to do on that. So, things to take away from the talk. Loki is meant to largely function like a distributed grep that you can also pull metrics and analytics out of. It's low footprint, storing things in object storage and not relying on schemas. And we really target easy operations with that, as well as low cost. And then please come join the community. Come talk to us. We're going to be hanging out for probably at least 10 or 15 minutes if you have any questions. We'd love to hear from you. All right, thank you. Yeah, I can share. Any questions? Hey, great talk. I was wondering, with the amount of query sharding you're doing, how are you synchronizing the results in the end? Is it synchronized in the UI at the end? Or do you need to sort things, because that might be very expensive? All right, so I had a little difficulty hearing that, but I think the question was, how do we synchronize sharding in the query engine? Is that at the end, when you show it in the UI? By the time we show it in the UI, it has already happened a layer down, in Loki itself. So we've already merged everything. I think the real answer to that is probably a lot longer than I can give in this question slot, but talk to me. It's one of my favorite things to talk about. Thanks. Great talk. It was mentioned several times that the main power of Loki is that it basically indexes only labels and doesn't index the actual log messages. But to make log messages searchable, you want to have many labels in this case, and many labels often lead to cardinality explosions. How do you deal with that? Is there any good recommendation? How many labels should I have? What are the trade-offs here?
Yeah, so the way that I think about this is that you probably want to index where the log came from rather than what's in it. We definitely added the ability, especially in the configs for Promtail, which is the agent that we write, to add extra things in there, and people have been really clever about what they put in. Unfortunately, that can also be somewhat of a footgun at times. So I'd say index the things that correspond to your topology, where the logs come from: environment, application, cluster, that sort of thing. And less the things that have to do with the contents of the logs themselves. There's probably a somewhat longer answer there, too. We've also been doing some recent work to make some of this less of a user concern, particularly around the distribution of individual log stream throughput. If you have one application that is logging 10 megabytes a second, that can be really harmful to Loki in some ways. So people got clever about splitting that out into different streams. But in the future, we're going to be doing a lot of that automatically and transparently behind Loki's API. Just to add one thing on top of what Owen said, something in LogQL that we didn't show here: if you have something in your log line itself that you want to treat as a label, you can do it on the fly with the LogQL query language. We have parsers, such as a logfmt parser and a JSON parser, and you can use the things in your log line as labels. Hi. Thank you for the great talk. I have many questions, but I'll ask one. Do you have any tips in terms of scaling the query and caching layers? In my experience, usually they're very underutilized, but when people start querying, you get all the crashes. Do you have any golden ratios or anything? Yeah. So this is one of those things, how we expose configurations around query parallelism and controls.
I wish I had a do-over on this from a few years back, because there are a couple of different configurations. They are largely about, for every individual tenant, how many subqueries you want to allow to be processed per query at a time, and then there are also things like concurrency for each of your query components: how many goroutines should you devote to running queries independently? But yeah, I've got some work to do to make that easily digestible, if that's fair. So this year at work, I ended up giving Loki, over 10 minutes, about 20 gigabytes of logs from 500 machines. It was just a one-time experiment that they ran from time to time. And we started using Loki. We optimized it as intended, so using indexed queries was great. At a certain point, my colleagues came to me and said, we want the log stream for each individual machine in a file. I started asking Loki for this, and it took ages to extract this information. Now I hope you're going to tell me that this is not the use case for Loki, and that I would have done very well to use rsyslog and just push things into a file. But if you have another answer, I'll be glad. We didn't catch the question fully, but my understanding is that you're asking whether LogQL can find logs coming from a single machine? Am I getting it right? So you're talking about, I guess, some kind of federation, maybe? Why do you want to store it in a file on a single machine? Because they wanted to run analysis on a specific stream of logs from a specific machine. So technically, you can store logs in the file system. Instead of using object storage, you can use a file system. That's completely possible. But we encourage using object storage when it comes to scale, because that's how it works. Yeah, is the question behind the question there about changing where you actually store it, so that you can then run your own processing tools on top? Yeah? Yeah? Okay.
Yeah, that's actually a relatively common ask. It's not something that we support at the moment. We have our own format that we store in object storage or whatnot. We do have some tooling; one tool is called chunks-inspect, which allows you to iterate through all of the chunks, which could be from a particular stream or log file. But it's not incredibly batteries-included at the moment, if that makes sense. Hello, I have a use case where I sometimes store some telemetry data with my logs, like a metric, which I don't want to be indexed, but which I also don't want to encode in the actual log message because it's already structured data. Is it possible to have fields or data that are not indexed? It's funny that you ask that. Yeah, so there's a growing question around non-indexed metadata like that, which is not actually stored in the index itself but includes some degree of structure. I know we see this as a current need, particularly for OTel formats, so it's something that we're looking into right now, actually. So thanks a lot, everyone. Thank you. |
The O11y toolkit
A toolkit to improve, augment and debug your Prometheus stack |
So next up is Julien Pivotto. Hello, hello. So I am Julien Pivotto. I am one of the maintainers of Prometheus, and I have been doing monitoring for more than 10 years now. I am now working at O11y, where we are basically doing Prometheus support. We are covering the Prometheus ecosystem, so any open source tool around Prometheus, and like many other companies in the area, we are also contributing upstream to the open source Prometheus development and the ecosystem. But let's go into the toolkit. I have been working with Prometheus for about five years now. I have seen many people using it in good and less good ways. Well, anyone is free to use it the way they want. But I also know what is missing in the tools that Prometheus is offering. And I have seen some people struggling to use Prometheus or having very specific issues, and it's not always super easy to debug your setup or to get more information about it. And I know that no one wants more tools, no one wants more sidecars. Part of my work in Prometheus upstream has been to include more service discoveries so you don't have to run that many sidecars. We have also removed the need for sidecars that write to the local file system of Prometheus by enabling HTTP service discovery. So upstream we are working to reduce the number of tools that you need to run Prometheus in your own environment. But still, some things don't fit upstream. Some tools are very specific. You might only need them in one place, and you know, Prometheus is big. It's used by a lot of people, and once you put something in Prometheus itself, well, you have to maintain it forever and you cannot change it the way you like. So for some features that we have used at only one or two different customers, we wanted to open source them and make them available for everyone. If other people could use them, it would be just great to have a place to do that.
And instead of releasing each tool individually, which makes them very difficult to discover because you need to go to 10 different websites to get your tools, we branded that as the O11y toolkit. So the goal of the toolkit is to provide open source tools that you can get just like you want. You can download them directly from our website, and they will help you with your Prometheus stack. We will probably extend that to other tools in the ecosystem as we see the needs, but the goal is that you can debug your system and also enhance it in some way, so that more people can use it and you can find solutions for some problems that you might have. It's licensed under the Apache 2 license, just like Prometheus is. The principle of the toolkit is that every tool is available individually, so you don't need to download all of them. You can just get the one that you need. We have both command line tools and tools that run directly in the browser. We'll get to that later. We try to use the HTTP API rather than local files when possible, because when you run in Kubernetes or in cloud environments, well, you don't always have access to the local file system. And also, all the tools have a common look and feel. So if you provide a configuration to connect to your Prometheus server with a username and a password, you can reuse the same file for the other tools in the toolkit. So let's go to the toolkit. The toolkit is available at o11y.tools. We currently have six tools, and I will demo all of them now so you can get an idea of what they do. The first tool is called oy-csv-to-targets. I was at a customer, and the network team wanted to start using Prometheus. But keeping an up-to-date list of the switches that they had is still a very difficult challenge in infrastructure, and CMDBs and all those kinds of tools, when you work in a really big corporation, are a lot more difficult to interact with.
But somehow they can produce CSVs. So we developed a tool, oy-csv-to-targets. Basically, Prometheus can work with file service discovery. So you can have a JSON file with all your targets, and then Prometheus can use that JSON file and scrape the targets in it. But when you talk to those people, they are like, what is JSON, how do I use it? And then when you start explaining that in the file SD, if you want different labels, you need to duplicate the label section multiple times, they are completely lost. So we made that tool, and you just create a CSV file. The first column is going to be the address, so we can just leave its header empty, and then you can have labels like the data center and the rack and whatever you want. And then you can just put in all the switches that you have. I'm completely making something up: London 1, and the rack S5, I don't know. And then they can just duplicate that. And maybe they have one in Europe as well, in Paris, also changing the rack. And then you call the oy-csv-to-targets tool. You give it the gateway.csv and you have your file that's written. So you have all the targets with all the labels, which are column based. So it's quite easy to get a file that Prometheus can scrape, and then you just need to change your Prometheus config. So you add the job and file_sd_configs, files is gateway.json. Ah, targets.json, I called it. And it works the first time, so it's great. I have my Prometheus server, and on my targets page I have my gateways with the correct labels from the CSV file. And now, because Prometheus is using inotify, when you change the CSV and run the tool again, it will just pick up the new labels. So this was a very easy way for them to add and update their targets. They could even choose the labels that they wanted to use. So it's quite easy for them to add and change targets. It's still a very specific tool.
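To make the CSV-to-targets conversion concrete, here is a minimal Python sketch of the same idea. It is not the toolkit's implementation. The function name, the example addresses and the label values are made up for illustration; the output shape (a list of objects with `targets` and `labels`) is the JSON format that Prometheus file-based service discovery reads.

```python
import csv
import io
import json


def csv_to_file_sd(csv_text):
    """Convert CSV text into a list of file_sd target groups.

    The first column holds the target address (its header is ignored).
    Every other column header becomes a label name; empty cells are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    label_names = header[1:]
    groups = []
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip blank lines and rows without an address
        labels = {
            name: value.strip()
            for name, value in zip(label_names, row[1:])
            if value.strip()
        }
        groups.append({"targets": [row[0].strip()], "labels": labels})
    return groups


# Hypothetical inventory, in the spirit of the demo.
EXAMPLE = """address,datacenter,rack
192.0.2.10:9100,london1,s5
192.0.2.11:9100,paris1,s7
"""

result = csv_to_file_sd(EXAMPLE)
print(json.dumps(result, indent=2))
```

Writing the printed JSON to the file referenced under `file_sd_configs` is enough for Prometheus to pick up the targets, and rerunning the conversion after editing the CSV updates them.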
And we also plan to add HTTP support to that tool, so you don't actually need to be on the local file system of Prometheus. But those people, they know everything but Linux, they know system and expression, they know CSV, but JSON is still one step further for them to learn. So just one word about how you can get those tools. At the bottom of each page, you will see that you can download them, but we also provide deb files, RPMs, and Docker images. And you can also use Nix if you want to, except that this will probably download the world. But it works. So you can easily get the tools, each one individually. If you are on a Debian or a CentOS machine, it's easy to just get the package as well. So it's easy to get and to install. The second tool is oy-expose. That tool just takes a metrics file and exposes it for Prometheus to consume. The idea is that sometimes you have scripts and you don't want to use the Pushgateway, because then you need to secure the Pushgateway. So this is basically the textfile collector feature of the node exporter, except that the issue with the textfile collector of the node exporter is that the node exporter needs to be able to read the file. It means that you need to run it with the correct user. And if you have three different applications, each running as its own user and writing its own files, you don't want to run the node exporter as root just to read all the files. So this is just a small HTTP server to expose your metrics. Compared to just running python -m SimpleHTTPServer, which you might do to expose your metrics and which would also work, this also adds the node exporter specific metrics about the modification time of the file. So you can still monitor whether a file has changed, just like you can with the node exporter. So again, if we start it, thank you. Someone is following. Now I know that you can read the screen.
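The behaviour described here, serving a metrics file over HTTP and adding a gauge with the file's modification time, can be sketched with the Python standard library. This is an assumption-based illustration and not the toolkit's code. The function names are hypothetical, and the metric name is modelled on the node exporter's textfile collector.

```python
import os
from http.server import BaseHTTPRequestHandler


def render_metrics(path):
    """Return the file contents followed by a modification-time gauge."""
    with open(path, encoding="utf-8") as handle:
        body = handle.read()
    if body and not body.endswith("\n"):
        body += "\n"
    mtime = os.stat(path).st_mtime
    extra = (
        "# HELP node_textfile_mtime_seconds Unix time of the file's last modification.\n"
        "# TYPE node_textfile_mtime_seconds gauge\n"
        f'node_textfile_mtime_seconds{{file="{os.path.basename(path)}"}} {mtime}\n'
    )
    return body + extra


def make_handler(path):
    """Build a request handler class that serves `path` on /metrics."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            payload = render_metrics(path).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return Handler


# To serve a file, assuming it is called metrics.prom:
#   from http.server import HTTPServer
#   HTTPServer(("", 9099), make_handler("metrics.prom")).serve_forever()
```

Because the process only needs read access to one file, it can run as the same user as the script that writes the metrics, which is the motivation given in the talk.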
So I have a file with a very small metric, which I will call fosdem_talk_running 1, because my talk is running, right. And now I can just expose this using the oy-expose tool. By default it will listen on port 9099. And the file it will take is... okay, I will just... As you can see, you will see the same log messages as you see in Prometheus, because we are able to use the same configuration file as Prometheus. So if you have your TLS config file for Prometheus and you know the format, you can also just use that and protect the endpoint with a password or anything like that. So, 9099. Not found, because I need the /metrics path. So you see that I have my fosdem_talk_running 1, and then I have the Go metrics. I should also have the node exporter specific metrics, like the textfile mtime seconds and all the specific items of the node exporter. We also have the same flags as in the node exporter, so you can disable the exporter metrics. Like this, and now I just have fosdem_talk_running, the modification time, and whether or not there is an error. So you don't have the Go garbage collection metrics for those very small components. This is a feature that is also available in some other exporters. I think at least one other exporter has it, I don't remember which one, but it's becoming a thing now. So that tool was also used by a customer. And the last tool is a bit more technical. It is oy-scrape-jitter. Basically, the story is that one of the customers had a pair of HA Prometheus servers doing the exact same work; everything was the same. But on one of the servers, the blocks took twice the size, more or less. It was like, why is that server running out of disk space? It was slowly increasing until it reached its maximum at the full retention time. So it did not grow that much after that, but still we had blocks that were twice as large.
And after investigating, we noticed that basically that Prometheus server was not compressing the chunks correctly, because there was a difference. The scrapes were not aligned, which means that instead of taking the metrics every 30 seconds, it was taking the metrics every 30 seconds and 10 milliseconds. That is a big issue in Prometheus when it happens all the time, because it means that we are not compressing the data. Instead of taking a really, really negligible number of bytes to store the timestamp, you need to use a lot more storage to do that. And it was really noticeable. So we have oy-scrape-jitter. Oh, is it working? It will look at the timestamps of all the up metrics in your Prometheus server, and it will tell you whether they are aligned or not in the Prometheus TSDB. We do that by querying Prometheus directly. You don't need to have access to the chunks, but we can tell you just by running the tool whether the scrapes are aligned. So this is what it tells me now on my laptop, and my laptop has five aligned targets. So it did look at all the metrics that I have, and it is just happy with that. You have a lot of options there to better understand the output. You can decide to only log the unaligned targets, so you can see whether maybe one of your targets is more problematic than the others, and you can also plot the targets. So if I run the plot, I can see the... I picked a really good name today. File.png. So you can see that all the targets are aligned, so you don't see anything. If I show you what we had at that specific customer... That specific customer had 150,000 scrapes that were not aligned, with a delay of a few milliseconds. And using the tools... So the first thing that we did when we noticed that was to implement a feature so that by default, if you are a few milliseconds off, we will just say, okay, we will just align your scrapes so you don't lose bytes for nothing.
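The alignment check and the tolerance-based correction described above can be illustrated with plain timestamps. This is a simplified model and not the code of Prometheus or of the toolkit. The function names are hypothetical, the schedule is anchored on the first sample, and the sample timestamps are invented. The link to storage cost rests on timestamps being encoded from the difference between consecutive gaps, so constant gaps cost almost nothing to store.

```python
def deltas(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def delta_of_deltas(timestamps_ms):
    """Second-order differences; all zeros means perfectly regular scrapes."""
    return deltas(deltas(timestamps_ms))


def is_aligned(timestamps_ms, interval_ms):
    """True when every gap between consecutive samples equals the interval."""
    return all(d == interval_ms for d in deltas(timestamps_ms))


def snap(timestamps_ms, interval_ms, tolerance_ms):
    """Move each timestamp onto the expected schedule if it is within tolerance.

    The expected schedule is first_timestamp + i * interval_ms.
    Timestamps further away than the tolerance are left untouched.
    """
    if not timestamps_ms:
        return []
    first = timestamps_ms[0]
    snapped = []
    for i, ts in enumerate(timestamps_ms):
        expected = first + i * interval_ms
        snapped.append(expected if abs(ts - expected) <= tolerance_ms else ts)
    return snapped


# Invented samples for a 30 s scrape interval, each a few milliseconds late.
jittery = [0, 30_010, 60_003, 90_012, 120_001]
fixed = snap(jittery, 30_000, tolerance_ms=15)
```

With a tolerance of 15 ms every sample lands back on the schedule. With a tolerance of 5 ms the samples that are 10 ms and 12 ms late stay where they are, so the series remains irregular, which is the trade-off behind raising the tolerance.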
You can disable that. You can change the jitter tolerance. And for that customer, we actually went and increased that tolerance a bit more, because the tool told us, okay, that server is really overloaded, and if you increase the tolerance, then you can gain a lot of disk space. So by default now, Prometheus is doing that with a small number of milliseconds. If you have a bit of jitter that is not expected, as long as it's small enough, you will not see the difference; the alignment will still be done correctly. The outcome for that customer, when we implemented that, was a 30% disk usage reduction. But increasing the tolerance is not always the right solution. So if you are able to just... It very often means that your Prometheus is running out of CPU. It will look like it's working really fine. You can still run queries on your Prometheus server, but the ingestion path is getting somehow stuck and it cannot handle the scraping correctly on time. So it is very often a sign that your Prometheus server might need a bit more CPU power, even if it still works fine. So I did speak about the configuration. We support everything that you might need to configure to access your Prometheus server: basic authentication, authorization, basically every HTTP mechanism, and it is using the Prometheus library. So we did not invent a new way to write your configuration. If Prometheus can scrape it, you can just use the same configuration to access the APIs that you are using. And the file... We are following the Prometheus security guidelines, so we don't let you put usernames and passwords on the command line or in environment variables. We just follow the Prometheus way, so you have the config file for connecting to Prometheus, except for the URL, which you can just pass as a command line argument. So if we get back to oy-scrape-jitter, you can just put any URL there.
So if I go to prometheus.demo.do.prometheus.io, I think that's the one. Then I was able to run that against the demo Prometheus server, and it tells me that I have a maximum of 26 milliseconds. I have four aligned targets and six unaligned targets, so if that becomes an issue, we might have to look at the number of targets that are actually not aligned, to see if that's a small issue or a big issue that we need to figure out, because sometimes it's only some of your scrapes and it's fine, but sometimes you have only 10% of the scrapes which are efficiently stored on disk. Okay, so that was it for the initial tools that we have on the command line, but we have a few more tools that are running directly in the browser. So we made a few modifications upstream so that you can compile Prometheus for JavaScript, so it's called Wasm. So basically we don't have any API server behind the web tools, but we directly run the Prometheus engine in the browser. So the first tool is the metrics linter. It is using the client_golang metrics linter. You can also do that with Prometheus, I think, but if you are quickly developing and you don't have access to Prometheus, you can just enter your metrics in that page, and in the browser itself, it will validate your metrics. If I go again with my metric, fosdem_running 1, I can lint it and I will see, okay, that my fosdem_running has no help text. If I have a compilation issue, well, my metrics cannot be parsed and I will also see an error. So that way, if you are developing a script and you want quick feedback, you can just go to that page, hit the lint button, and it will give you some linting information about your metrics. So this is running the same code as your Prometheus server is running, so you will get the same output, and it's really nice to have that at hand. You go to o11y.tools and you can just lint your metrics.
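As a rough illustration of the "no help text" check mentioned above, here is a small Python sketch. It is not the client_golang linter that the web tool runs. It is a simplified stand-in that only understands `# HELP` comments and plain sample lines in the text exposition format, and the function name `metrics_without_help` is made up for this example.

```python
def metrics_without_help(exposition):
    """Return metric names that appear as samples but have no # HELP line."""
    helped = set()
    seen = []
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# HELP "):
            # Format: "# HELP <metric_name> <free text>"
            parts = line.split(None, 3)
            if len(parts) >= 3:
                helped.add(parts[2])
            continue
        if line.startswith("#"):
            continue  # other comments, such as "# TYPE", are ignored here
        # Sample line: the name ends at the first "{" or at the first space.
        name = line.split("{", 1)[0].split()[0]
        if name not in seen:
            seen.append(name)
    return [name for name in seen if name not in helped]


undocumented = "fosdem_running 1\n"
documented = (
    "# HELP fosdem_running Whether the conference is running.\n"
    "# TYPE fosdem_running gauge\n"
    'fosdem_running{room="observability"} 1\n'
)

print(metrics_without_help(undocumented))
print(metrics_without_help(documented))
```

The real linter applies many more rules (naming, units, suffixes), so this sketch only shows the shape of one check.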
The second tool is the password generator. If you want to secure the connection to your Prometheus server, in a small infrastructure you probably want to have a password file for Prometheus, but the issue is that this is using bcrypt, which is the hashing algorithm for the passwords, and bcrypt hashes are not always easy to generate. So we made that tool so you can enter your usernames and passwords and it will generate the bcrypt hashes. What we see in most organizations is that they will use more sophisticated SSO, so they will use another proxy in front of Prometheus, but you can still secure your Prometheus server, or secure the communication between that proxy and the Prometheus server, using TLS for example. So you have a lot of possibilities. But smaller organizations can quickly and easily add the web.yaml, also to the exporters themselves, so if you just want a password-protected node exporter, it's also possible. So if I put the username fosdem and the password demo, and I generate the file, I have the web configuration file. So I take this web.yaml, I paste it into a file, and I will now launch Prometheus with that file. If I find the option, OK, web config file, and now if I go to my Prometheus server, it asks me for a username and a password: fosdem, demo. So it is very easy to generate usernames and passwords. As it is fully open source, the deployment procedure is also open source, and if you open your browser's network tab, you will see that there is no connection to the server, because it's all generated by JavaScript and, well, Go compiled to JavaScript, in your own browser. So you are not sending us your password whatsoever, which is an issue with most of the bcrypt generators you will find online: you enter your password, it gets sent, and the hash comes back by magic, but...
And the last tool is the PromQL parser. It's the same, it's running completely in the browser and you can just enter your PromQL input. So if I take a PromQL query, let's see what we have. So we take a PromQL query and put it in the PromQL parser, and this is running the PromQL parser from Prometheus, and it also returns you the prettified PromQL expression. So if you just mess up your PromQL query, you parse it and then you will see the prettified PromQL. If you have a more complex query, you will see the prettified PromQL on multiple lines. This is actually a feature that has now also been implemented upstream, so let me show it to you because you might not know about it. You don't need that tool if you have a Prometheus server at hand, because, let me close the hamburger menu, next to the expression browser in Prometheus you now have that button which formats the expression. It will also tell you if your expression is not correctly written or very strange to read. If I click this, it does not execute the query, it just gives you a nice formatting of the query. So we implemented it in the tool just a few weeks before it was upstream, so that's why both versions exist, but still it's nice to have it in the browser. It was just fun to make that in the browser anyway.
We are working on more tools, so we are looking at a tool that can tell you which alert can affect which target. It looks at all the metrics in one alert expression and it will try to tell you that this Prometheus target can be affected by that expression, or that the expression is not affecting a single target, which is fine. It's a kind of dashboard for people who actually want a Nagios-style dashboard where they see, okay, if I type up == 0 with that selector, which targets will be affected. So that's something that we are working on, and as we see the support requests and what we are doing in the field, we might add more tools there. But as you have seen with the scrape jitter, we are also open to working directly upstream when it makes sense. So that's it. You can get the toolkit on GitHub or on o11y.tools, O-1-1-Y dot tools. So just have fun with the toolkit. If you need more tools, or if you have found some things that maybe were not accepted upstream, maybe by me, maybe it makes sense to have that in the toolkit, to play around with it and to have it widely available. Thank you. I'm wondering what the challenges were with the Wasm compilation. I think I tried something similar. I can't remember where I failed, but it didn't work out for me. So, was that a question? Yeah, what the challenges were with the Wasm compilation. I think you said you had to modify things. Oh, yeah, so the challenge is to make the TSDB compile, because you cannot run the TSDB in the browser, so we are faking some of the file APIs if you compile to Wasm. Okay, thank you. |
Inspektor Gadget: An eBPF Based Tool to Observe Containers |
Hi, everyone. So my name is Francis Laniel, and today I will present to you Inspektor Gadget, an eBPF-based tool to observe containers. So first of all, what are containers? Containers permit you to run applications isolated from each other. On the figure on the right, you can see that there are three containers with three applications, A, B, and C. To isolate and run those applications, we rely on a container engine like CRI-O or containerd. The container engine will ask the kernel of the host operating system to create containers for us. So contrary to virtual machines, where you have a guest operating system and a host operating system, all containers here share the same host operating system. So the container engine will ask the kernel to create the containers, but sadly, in the Linux kernel, there is no structure used to represent a container. While you have the task_struct to represent a task, there is no such structure for containers. Instead, a container relies on several features provided by the kernel. To have security isolation, you will rely on namespaces. For example, with the mount namespace, each container will have its own set of files, and container A will not be able to access the files of container B, except with explicit sharing. To isolate system resources, you will use cgroups. So you will be able to dedicate resources to one container. For example, with the memory cgroup, you will be able to limit the memory footprint of a container. For example, you set the limit to 2 gigabytes, and if your container allocates and tries to touch 3 gigabytes, it will be OOM-killed. So containers are really cool because they permit you to isolate different workloads. Sadly, using them poses several problems, particularly when something goes wrong and you need to debug them. First, it is harder to attach a debugger to a running application.
You can still do it, but it is not as simple as running GDB and running things locally. Also, you will need to take into account the communication between different containers. Nowadays, it is common to split your application into several microservices. For example, if you have a website, you will have maybe one container for the web server and another container for the database engine. So you will need to be sure that those two containers communicate, otherwise your website will just do nothing. To help with this, we developed Inspektor Gadget, which is a Swiss Army knife based on eBPF. It comes with two binaries: local-gadget, to debug locally running containers, and kubectl-gadget, which this time focuses on containers running in a Kubernetes cluster. On the figure, I show you the different tools offered by Inspektor Gadget and which part of the kernel they permit you to monitor. The first type of gadget we have are the tracers. A tracer will basically print events to the standard output as they occur. So for example, with trace exec, you will be able to trace the calls made to the exec syscall. With trace mount, you will be able to monitor the calls to the mount syscall, which can be pretty useful when you need to mount volumes. And with trace oomkill, you will trace when the out-of-memory killer kills an application. Then we find the profile category. Gadgets in the profile category are run for a given amount of time. With profile block-io, you will get information regarding the distribution of your input/output. Then you find the snapshot category, which gives you information on the system as it is running at time t. So for example, with snapshot process, you will be able to get all the processes which are running in your containers, or you can also get this information for your whole Kubernetes cluster. Then you will find the top category.
So the top category mimics the top command line tool, as it prints a ranking of information which is refreshed each second. So for example, with top file, you will get information regarding the files that are the most accessed. And the last category has only one gadget, and it is traceloop. traceloop can be seen as strace for containers. So you will be able to monitor all the syscalls done by your container. OK. So before going into the internal architecture of Inspektor Gadget and what eBPF is, I will show you a small demonstration to compare local-gadget trace exec, so the tool to trace exec syscalls made by containers running locally, and execsnoop, which is an already existing eBPF tool. OK. So we will first create a test container. The test container will execute sleep periodically. And then we will trace the creation of new processes using execsnoop. So execsnoop is an eBPF-based tool made by the iovisor BCC people. As you can see, there are two types of sleep. There is one sleep of 0.3 seconds, and another sleep of 0.5 seconds. But in my container, I only use 0.3 seconds, so the 0.5 is done by the host. And I'm not interested at all in processes running on my host. So instead, I will use local-gadget to trace the same types of events, but this time I will get only the events inside the container. And as you can see, we get the same information, plus the name of the container where the event occurs. OK. So before going into the internal architecture of Inspektor Gadget, what is eBPF? According to Brendan Gregg, eBPF does to Linux what JavaScript does to HTML. It permits you to run mini-programs, which are safe, in a virtual machine inside the kernel, and which will be run only on some particular events, for example, disk I/O. Sadly, the safety of eBPF programs comes at a price. You are kind of limited.
For example, you cannot have an eBPF program with an infinite loop or a loop that is not statically bounded. Also, there is no function like malloc or kmalloc, so you cannot allocate memory dynamically, but you will see that there are some possibilities to cope with this limitation. OK. Inside the kernel, you will find two software components which are related to eBPF. The first one is the verifier. It takes your eBPF program as input and ensures that it is safe. If this is the case, it hands your safe program to the just-in-time compiler. The just-in-time compiler will basically translate your eBPF bytecode to machine code and install it to be run on some event. When you want to develop an eBPF program, you will write it in a syntax which is almost that of C. You will then compile it using clang with the eBPF target to get an eBPF object file. So this eBPF object file will contain the eBPF bytecode. You will then include this eBPF object file into another program running in userland. To do so, you can use your favorite language. You can use C, you can use Golang, there are plenty of possibilities. So you will use this program, and you will also use maps, eBPF maps. eBPF maps are data structures related to eBPF. There exist plenty of different types of maps. You have one eBPF map type to represent an array, one to represent a key-value store. You have several possibilities. When you want to load your eBPF program into the kernel, you will mainly use a library related to eBPF, like libbpf in C or cilium/ebpf in Golang. So your eBPF program will be loaded into the kernel through the bpf syscall. It will be verified. If it is okay, it will be just-in-time compiled and installed to monitor some event. We will do the same with the map because, for example, we will be able to use the map either to share information between several eBPF programs or between kernel land and userland, as our eBPF programs run in the kernel.
So now let's say that I have a process which calls the exec syscall. Then our eBPF program will be executed. It will write some information into the eBPF map and, thanks to the library, I will be able to read this information from userland and deal with it there, for example print it to the standard output. Okay, regarding local-gadget, the main component is the local gadget manager. The local gadget manager maintains a container collection at all times, so it knows exactly which containers are running on the system at a given time. Indeed, we rely on runc fanotify to add containers to this container collection when they are created and to remove them when they are deleted. We are also able to start some Inspektor Gadget tracers, like the one to trace the exec system call. When we decide to start a tracer, for example the one to trace the exec syscall, we will not directly load the eBPF program. We will first create a particular eBPF map that we will use to store information regarding our containers of interest. Indeed, the eBPF program will be executed each time an event occurs, so we need to do some filtering. In the first demonstration, I was only interested in events occurring inside containers and not on the host. To do so, this eBPF map will contain the mount namespace ID of the containers which interest me. Then we install the eBPF program. Basically, we took the eBPF code of the execsnoop that I presented in the first demonstration, and we modified it to add this filtering. So with this code snippet, we get the mount namespace ID of the current task and we check whether it is present in this map or not. If it is not the case, we just do not care about this task because it is not in one of our containers.
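The filtering idea can be simulated in userland with a few lines of Python. This is only a sketch of the logic, assuming a plain set stands in for the eBPF map of mount namespace IDs. The real check runs inside the eBPF program in the kernel, and the `ExecEvent` type and the IDs below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecEvent:
    mntns_id: int  # mount namespace ID of the task that called exec
    comm: str      # command that was executed


def filter_events(events, mntns_of_interest):
    """Keep only events whose mount namespace ID is in the set of interest.

    The set plays the role of the eBPF map filled with the mount
    namespace IDs of the containers we care about.
    """
    return [e for e in events if e.mntns_id in mntns_of_interest]


# Hypothetical IDs: one container of interest and the host.
CONTAINER_MNTNS = 4026532256
HOST_MNTNS = 4026531840

events = [
    ExecEvent(CONTAINER_MNTNS, "sleep 0.3"),
    ExecEvent(HOST_MNTNS, "sleep 0.5"),
    ExecEvent(CONTAINER_MNTNS, "sleep 0.3"),
]

for event in filter_events(events, {CONTAINER_MNTNS}):
    print(event.comm)
```

Doing the lookup as early as possible means events from the host are dropped before any further work is done for them.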
If it is the case, if the mount namespace ID is inside the map, we continue the execution of our eBPF program because we care about it. Okay, so now I will show you a more realistic demonstration of local-gadget, particularly how to use it to verify a seccomp profile. So okay, we will create an nginx container with a seccomp profile installed. A seccomp profile is a feature offered by the Linux kernel to allow or disallow calls to some syscalls. I wrote by hand the seccomp profile that I give to Docker. So okay, let's create it, and now let's query the nginx server. Okay, something is wrong, maybe I forgot to add one syscall to the seccomp profile. So I will stop the nginx container. Now we will start local-gadget, and I will start it on one particular container, the nginx container. Note that it is perfectly possible to start local-gadget with a given container name even if this container does not exist at the time, because it will be added automatically thanks to the container collection and runc fanotify. Okay, I will now run an nginx container, but without any seccomp profile, and I will curl it. Now I stop my container, which will automatically stop local-gadget. Now I will just compare the two seccomp profiles, the one that I wrote and the one generated by local-gadget. Okay, I forgot the sendfile syscall, so that may explain a few bugs. So okay, let's run the nginx container again with this new seccomp profile. Okay, and now it's the moment of truth, let's curl it and yeah, everything is okay. So yeah, basically local-gadget really helps us to verify the seccomp profile that I wrote by hand, and more than that, it can generate seccomp profiles for you. Okay, so I told you about local-gadget, and when I first presented Inspektor Gadget, I told you it comes with two binaries: local-gadget, that I already presented, and kubectl-gadget.
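The comparison of the two seccomp profiles in this demonstration can be sketched in Python. The sketch assumes both profiles use the Docker-style JSON layout, with a `syscalls` list whose entries carry `names` and an `action`. The two profiles below are tiny made-up examples, not the ones from the demo.

```python
import json


def allowed_syscalls(profile_json):
    """Collect the syscall names that a Docker-style seccomp profile allows."""
    profile = json.loads(profile_json)
    allowed = set()
    for rule in profile.get("syscalls", []):
        if rule.get("action") == "SCMP_ACT_ALLOW":
            allowed.update(rule.get("names", []))
    return allowed


def missing_syscalls(handwritten_json, generated_json):
    """Syscalls the generated profile allows but the handwritten one forgot."""
    return sorted(allowed_syscalls(generated_json) - allowed_syscalls(handwritten_json))


# Hypothetical, heavily shortened profiles for illustration only.
handwritten = json.dumps({
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{"names": ["accept4", "read", "write"], "action": "SCMP_ACT_ALLOW"}],
})
generated = json.dumps({
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{"names": ["accept4", "read", "sendfile", "write"], "action": "SCMP_ACT_ALLOW"}],
})

print(missing_syscalls(handwritten, generated))
```

A set difference like this is enough to spot a forgotten syscall such as sendfile. A real profile also has architecture and argument filters that this sketch ignores.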
So kubectl-gadget is designed to monitor containers inside a Kubernetes cluster. On the figure I represented the schematic of a Kubernetes cluster: on the left we have the developer laptop, on the right we have the Kubernetes cluster, with one node for the Kubernetes control plane and one worker node. First of all, we need to deploy an Inspektor Gadget pod on each node to be able to monitor the events occurring on this node. So we create a DaemonSet, and Kubernetes will then deploy an Inspektor Gadget pod on each node of our cluster. Then we want to trace a specific event, for example the exec syscall, so we use the kubectl gadget trace exec command. We ask the control plane to create a Trace CRD. A Trace CRD is a custom resource definition which is specific to Inspektor Gadget and that we use mainly to start and stop tracers. We will have one Trace CRD per node, like we have one gadget pod per node. Then we create the eBPF program and the associated map and install them into the kernel. The eBPF program will be executed if there are calls to exec occurring on our node, and those events will be written to a perf buffer. A perf buffer is a specific type of eBPF map; sadly I do not have the time to enter into the details. We will be able to read this information from userland and the events will be published to a gRPC stream. We then use kubectl exec to read the gRPC stream, and so the information will be printed on the developer laptop. So now I will show you a more realistic example of how to use kubectl-gadget to verify container capabilities. Just before jumping into the demonstration, capabilities are another feature offered by the kernel to limit what your application can do. So again, time for the demonstration. Okay, so this time I will deploy an nginx web server with some capabilities set.
So here is the list of the capabilities. For example, you can see that there is the SYS_ADMIN capability, which is not necessarily a capability you want to grant, but it seems nginx needs it to run, so you don't have the choice. So I deployed it and sadly it seems that something is wrong, so okay, let's get some more information. Okay, operation not permitted. If I have an operation not permitted, it may be because I forgot one capability in my deployment. So at the bottom I run kubectl gadget trace capabilities. As you can see, I just want to get the capabilities which are used in the namespace demo, because it is the namespace where my nginx container is. The big difference between local-gadget and kubectl-gadget is that kubectl-gadget gives us information regarding Kubernetes. So for each event we get the node where the event occurs, the namespace, the pod and the container. It is really aware of the fact that it is running inside Kubernetes. So okay, I deleted my deployment, I will run it again, okay, we rerun the whole demonstration from the beginning. Okay, so during this time, if someone has a quick question or if there was one point that wasn't clear, I can take it quickly. Okay, everything was clear until this moment, so perfect. So okay, let's delete our previous deployment, and it can take a bit of time because it is in Kubernetes, so compared to running locally you need to take into account the communication with remote services. Okay, now I deploy my nginx deployment again, and we get the information directly. As you can see, we have the names of the capabilities and when they are used, and in this column whether it was allowed or denied by the kernel. So all the capabilities above were allowed and the CHOWN capability was denied.
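Reading such a trace to find what the deployment is missing can be sketched in Python as follows. The event layout is an assumption made for this example (a list of dictionaries with `cap` and `verdict` keys); it is not the exact output format of the gadget, and the sample events are invented.

```python
def denied_capabilities(events):
    """Return the capability names that were denied, in order of first use."""
    denied = []
    for event in events:
        if event["verdict"] == "Deny" and event["cap"] not in denied:
            denied.append(event["cap"])
    return denied


# Hypothetical trace events for one nginx container.
trace = [
    {"container": "nginx", "cap": "SYS_ADMIN", "verdict": "Allow"},
    {"container": "nginx", "cap": "NET_BIND_SERVICE", "verdict": "Allow"},
    {"container": "nginx", "cap": "CHOWN", "verdict": "Deny"},
    {"container": "nginx", "cap": "CHOWN", "verdict": "Deny"},
]

print(denied_capabilities(trace))
```

The denied names are the candidates to add to the capabilities list in the deployment file, after checking that the application really needs them.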
Yeah, I think I forgot it in my deployment file, so I will just delete my deployment again. Yeah, there is a lot of back and forth, but sadly I do not think we have a lot of choice. Again, if there is a quick question during the deletion and the redeployment of the whole thing, I can take it. So I will update my deployment file to add the capability that I missed. Okay, let's deploy it again and just cross fingers that it is the last time. Okay, let's wait for everything to be ready. It also takes a bit of time. So that should do the trick, and anyway I do not think we can wait any faster. So okay, everything seems to be ready now. We get the IP of our pod, we curl it, and now it's the moment of truth: as we can see, we get the nginx default message, so everything is fine. I just forgot to add one capability in my deployment file. So it's now time to conclude. As I showed you during this presentation, Inspektor Gadget permits you to monitor containers both running locally, with local-gadget, and running in a Kubernetes cluster, with kubectl-gadget. It is of precious help to debug these applications. I really like to use GDB and any kind of debugger, but in the cases that I showed you it would not be so helpful, particularly because if you run GDB for the seccomp profile you will just get an errno, and it will not be so helpful. And it is the same with the capability example: we will not be able to know why the syscall failed. We will just know it failed with an errno, but it is kind of hard to say it was because of the missing capability. As future work, we plan to improve the scaling of Inspektor Gadget, because I told you we use kubectl exec to read the gRPC stream, and sadly it does not scale very well. We also plan to add new gadgets, and as Inspektor Gadget is open source software, if you have any idea for a gadget or if you want to contribute one, I will be really happy to see your contribution and to review it. So you can find us on our
website, inspektor-gadget.io. We are also on GitHub, under the inspektor-gadget organization, and we also have a channel in the Kubernetes Slack, #inspektor-gadget. So yeah, if one day you use Inspektor Gadget and there is something that you do not understand, please just feel free to come to the channel and ask. We are here to help you, and it will be a real pleasure to chat with you. So I thank you a lot for your attention, and if you have any questions, feel free to ask. Thank you. Thank you, very interesting talk. I've seen that you were deploying the agents as a DaemonSet, so you were running them on all the nodes. I was wondering if you can tailor it to one single node, because you know that the workload or the pod that you want to check is there. The second question would be: I understand that this is really useful for debugging environments. Do you think that this would be ready if you had an incident or something going on that you want to investigate in a production environment? Okay, just to be sure that I understood your question correctly: when I deploy the Inspektor Gadget pod, I deploy it on each node of the Kubernetes cluster, and you want to know if it is possible to not deploy it on each node. Yes, exactly. And related to that, when you're running the commands from your computer, do they apply to all nodes at the same time, or can you target just one node or something? No, you can target one node. You have three possibilities to filter: you can filter by node, you can filter by namespace, and you can filter by pod name, even container name, and of course you can mix all of this. I was a bit quick regarding the demonstration on this, but yeah, you can do a lot of filtering. So you can deploy the Inspektor Gadget pod on each node and then filter
by node name. But if you know that there is one specific problem occurring on one particular node, you can deploy the pod on only this specific node. We have an option in kubectl gadget deploy to specify which node you want to deploy it on. Thank you. You're welcome. Thanks for the talk again. I'm just wondering if you can send the metrics to Grafana or something to do the filtering and the querying around them. Is that possible? So just to be sure, you asked if I can send the metrics to Grafana? Yeah, the traces. Okay, so we are actually working on it a lot, and we plan to add an exporter to Prometheus. All right. But yeah, it is still, I would not say work in progress, but thinking in progress. All right. Yeah, but nonetheless, if you're really interested in Prometheus, if you go to the Inspektor Gadget repository, on the right there is a "Used by" section, and there is a project which does the exporting to Prometheus, but it is not us who developed it. There are a lot of things that we want to do actually, and Prometheus is on our to-do list, among the things that we want to support. All right, thanks. Then I think we're done. Oh, one more question. Yeah, I have a question regarding this DaemonSet that should be installed on the Kubernetes nodes. Is it recommended to keep it there always, or just install it when you need to debug and then remove it? I'm sorry, can you please repeat it? This DaemonSet on the Kubernetes nodes, is it recommended to keep it there always, or just install it for debugging and then remove it, the DaemonSet, this remote component? So basically the question was, when I deploy the Inspektor Gadget pod, whether it is recommended to have it running for a long time or just shortly. No, clearly you should not have it running for a
long time. As we install eBPF programs, we need some privileges, for example CAP_SYS_ADMIN and all this sort of stuff. We cannot use the user namespaces which were added recently in Kubernetes. So no, you just deploy it, you collect the metrics, you monitor the events you want to monitor, and then you just undeploy it. Undeploying Inspektor Gadget is as simple as deploying it: it is one command line call and you are done. So yeah, just avoid having it running for a long time. It is a tool to debug, so it would be like running your application all the time with GDB attached, which would be kind of odd. So yeah, no. Is there a measurable performance impact of having the agent deployed in your cluster, since it's measuring all these things? So you are asking whether, when we monitor events, we have a decrease in performance, right? Not so much, and it is related to eBPF as a whole: as eBPF programs run in the kernel, you do not have context switches between userland and kernel land, so it is quite fast and you avoid a big decrease in performance. Okay, thank you. You're welcome. Then I think we're done. Thank you. Thank you. |
Best Practices for Operators Monitoring and Observability in Operator SDK |
Hi, everyone, and welcome to our talk about operator monitoring and how to do it correctly. My name is Shirly. I work at Red Hat. I'm João Vilaça. I also work at Red Hat, for about one year and a half. So today we're going to talk about observability for Kubernetes operators: when to start, the maturity levels of metrics, why we want to monitor, what we want to monitor, and the best practices and code examples that we created for it. So, when should we start to think about observability for operators? You can see here in the chart the life cycle of creating an operator, which starts with basic install, and the most mature step is autopilot. So when do you think we should start thinking about observability for a new operator? Anyone? When? From the start. From the start. That's correct. Deep insights talks about metrics and alerts, which is being able to monitor your operator fully. And people think maybe we should start thinking about it at full life cycle. Maybe that's the case. But you should pretty much start at the beginning, because the metrics that you add first are usually not the metrics that are for your users. They are internal. There are a few steps in the maturity of metrics. The first step is initial. You start with your operator, and you want to understand how it works and whether it works correctly. So the developers start to add ad hoc metrics. I've been working for a few years on an operator at Red Hat called KubeVirt. When I joined the project, it was already in the full life cycle phase, and a lot of metrics were already implemented in this operator. The problem was that the developers that added the metrics didn't follow best practices, and for a lot of the metrics, it was hard to understand which metrics were ours. It's important to understand that your operator is not the only one inside the Kubernetes system.
So when a user or even other developers want to understand which metrics your operator is exposing, it should be easy for them to identify your metrics. So the first step, as I said, is initial. The second step is basic monitoring. You start adding your monitoring, and you start to think about your users and what they want to understand about your operator. The third step is that you have a process for implementing new metrics, and you are focused on the health and performance of your operator. And the last step is actually autopilot: taking those metrics and doing smart actions with them in order to do stuff like auto-healing and auto-scaling for your operator. And this is the part that we are actually at with our operator. So as Shirly said, when we first start, we look very much at internal metrics for the operators themselves. At this point, we might start, for example, by looking at the health of the operator. Can it connect to the Kubernetes API, or, if it's using external resources, can it connect to those providers' APIs? Is it experiencing any errors? We can also start by looking at, for example, its behavior. How often is the operator reconciling? What actions is the operator performing? So this is the kind of stuff that, as we are developing, we are very interested in. But we should start, as Shirly said, thinking more about the future and about having these good standards, because later we will not only be tracking these; there could also be, like, resource metrics. And then, why operator observability, and what are the steps that we'll be taking? Starting from performance and health, here we want to detect the issues that come up early. We obviously try to reduce both operator and application downtime, and try to detect regressions that might happen. We can also start looking at, for example, planning and billing, to improve planning, to improve profitability, or to bill users.
At this point, we also start looking more at infrastructure metrics. For example, we want to track resource utilization. This might be CPU, memory, disk, and we can also start looking at the health of the infrastructure itself, maybe hardware failures, or trying to detect network issues. Then we also start using these metrics to create alerts, to send notifications about problems as early as possible, so we can take appropriate actions and not let them linger. After this, we go into more detail about metrics. Maybe we start looking at application metrics. What's the availability of our application? What's the response time? What are the error rates? And also its behavior. What types of requests is the application receiving? What types of responses is it sending? It's important to monitor all of these things. And when we have built up all this information, then at a certain point in time, as Shirley said, we'll be able to give this new life to the operator by adding autopilot capabilities, such as auto-scaling and auto-healing. Because at this point, if we did everything correctly, we'll be able to know almost all the states that we are in. And we also start looking at functionality metrics. Are we providing the expected functionality to users? For example, checking that application features are working correctly. We want to see if there are any performance or reliability issues by checking service levels, and that everything is working in the expected way by checking the responses, the errors, and the data that it's returning. Okay. So I hope you are convinced that observability is important. If you are in this room, I guess you are. For the past three years, we've been working on observability in our operator. What's important to understand is that our operator is considerably complex.
It has a few sub-operators that it's managing, and each sub-operator has its own dedicated team that is maintaining it. Watching those teams implement observability separately gave us a higher-level view of the pitfalls that they all share when implementing monitoring. So we decided to contribute our knowledge of how to do this correctly, so that others don't fall into the same pitfalls as us. We decided to create best practices and to share our findings with the community. We hope to shorten the onboarding time for others, to create better documentation, and to create reusable code that others can use to save time and money, of course. So we reached out to the Operator Framework SDK team to collaborate with them and to publish our best practices there. As you can see here, this is the operator observability best practices. The Operator SDK itself is the first stop when someone wants to create a new operator. It gives them tools to create it easily, to build and test the packages, and it provides best practices for all steps of the operator life cycle. So we found that this was the best place for others to also go for monitoring. I will now share with you a few examples from these best practices. They may sound simple, but simple things have a big impact, both on the users that are using the system and on the developers that are trying to work with the metrics. So for example, a naming convention for metrics. One of the things that is mentioned in the document is having a name prefix for your metrics. This is a very simple action that will help developers and users identify that the metrics are coming from a specific operator or company. In this case, you can see that all of the metrics here have a kubevirt prefix. KubeVirt, as I said, has sub-operators.
So under this prefix, we also have a sub-prefix for each individual operator: CDI, network, and so on. And this is another example, which does not have this prefix. We can see here a container CPU prefix, for example, but we can't understand where it's coming from. In this case, it's cAdvisor. But if you're a user and you're trying to understand where this metric came from, it's very hard, and you also cannot search in Grafana, for example, for all of the cAdvisor metrics together. So that's a problem. Another thing that is mentioned in the best practices is help text. Each metric has a dedicated place to add the help for this metric. And as you can see, in Grafana and other visualization tools the user will be able to see the description when hovering over the metric. It's very important, because otherwise you need to go somewhere else to search for it. This also gives you the ability to create auto-generated documentation for all of your metrics on your site. Another example is base units. Prometheus recommends using base units for metrics. For example, you can see here: for time, use seconds, not milliseconds; for temperature, Celsius, not Fahrenheit. This gives users a fluent experience when they are using the metrics, because they don't need to do conversions of the data. And they are saying that if you need millisecond precision, use a floating-point number. This removes the concern about the magnitude of the number. Grafana can handle it and will still show you the same precision, but the consistency in the UI and in how to use the metrics will stay the same. Here you can see an example of metrics that are using seconds. And here we see that etcd is not using it. So this is not as recommended, and we would actually recommend switching it, but they started with milliseconds. And now making the change will cause issues with the UI that is based on it and everything.
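The three conventions just described (an operator prefix, help text on every metric, and base units) are mechanical enough to check in code. The following is a minimal Python sketch of such a check, not the Operator SDK's own tooling. The `kubevirt_` prefix comes from the talk; the `check_metric` function, the unit list, and the example metric names are assumptions made for illustration.

```python
import re

# Hypothetical prefix and unit rules for illustration; adjust to your operator.
OPERATOR_PREFIX = "kubevirt_"
NON_BASE_UNITS = {"milliseconds", "microseconds", "minutes", "hours",
                  "kilobytes", "megabytes", "fahrenheit"}
# Character set Prometheus allows in metric names.
VALID_NAME = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")


def check_metric(name: str, help_text: str) -> list[str]:
    """Return a list of convention problems for one metric (empty if none)."""
    problems = []
    if not VALID_NAME.match(name):
        problems.append("name has characters Prometheus does not allow")
    if not name.startswith(OPERATOR_PREFIX):
        problems.append(f"name lacks the {OPERATOR_PREFIX!r} prefix")
    # Units appear as underscore-separated tokens, e.g. ..._seconds_total.
    bad_units = NON_BASE_UNITS.intersection(name.split("_"))
    if bad_units:
        problems.append(f"non-base unit in name: {sorted(bad_units)}")
    if not help_text.strip():
        problems.append("help text is empty")
    return problems


if __name__ == "__main__":
    # Example metric names are made up for this sketch.
    for metric, text in [
        ("kubevirt_vmi_migration_duration_seconds", "Duration of a migration."),
        ("container_cpu_usage_seconds_total", ""),
        ("kubevirt_request_latency_milliseconds", "Request latency."),
    ]:
        print(metric, check_metric(metric, text))
```

Running a check like this in CI before a metric ships is one way to avoid the breaking rename described next.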
So it's a problem to change the names of the metrics once they are created. When I joined the operator, we didn't have name prefixes. I tried to understand which metrics were ours and which were not, and it was very hard. So we needed to go and make breaking changes to the metrics, add those prefixes, and change the units, and this duplicated work is what we want others to be able to avoid. Additional information in the best practices is about alerts. This is an example of an alert. You can see here that we have the alert name. We have an expression which is based on a metric, and once the expression is met, the alert either starts firing or is in a pending state until the evaluation time has passed. There is a description. There is also the possibility to add a summary. This is the evaluation time. It has a severity, and a link to a runbook URL. There is other information that you can add to it, but this is the basic set. And what we're saying in the best practices is that, for the severity label, there should only be three valid options: critical, warning, and info. If you're using something else, it will be problematic. You can see here in this example, I don't know if you can see it, this is our example in the cluster. We have info, warning, and critical, and we have one with severity "none", which is the Watchdog. It's part of the Prometheus alerts. It's just making sure that the alerts are working as expected. That count should always stay at one. There should never be alerts that don't have a severity. And this is a bad example of using a severity label. In this case, they are using major instead of critical. The impact of that is that if someone sets up Alertmanager to notify the support team when something critical happens to the system, and they expect to get notified by Slack or by a pager, they will miss this alert because it doesn't follow the convention for severity values.
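The severity convention can be enforced with a small test over the alert rules. Here is a minimal Python sketch, assuming each rule has already been loaded into a dictionary shaped like a Prometheus alerting rule (`alert`, `expr`, `for`, `labels`, `annotations`). The `check_alert` function, the alert names, and the runbook URL are made up for illustration, and placing the runbook link in a `runbook_url` annotation is a common convention assumed here rather than something stated in the talk.

```python
# The three severity values the best practices allow.
VALID_SEVERITIES = {"critical", "warning", "info"}


def check_alert(rule: dict) -> list[str]:
    """Return convention problems for one alerting rule (empty if none)."""
    problems = []
    severity = rule.get("labels", {}).get("severity")
    if severity is None:
        problems.append("missing severity label")
    elif severity not in VALID_SEVERITIES:
        problems.append(f"invalid severity {severity!r}")
    # Assumption: the runbook link lives in a runbook_url annotation.
    if not rule.get("annotations", {}).get("runbook_url"):
        problems.append("missing runbook_url annotation")
    return problems


if __name__ == "__main__":
    good = {
        "alert": "ExampleOperatorDown",  # hypothetical alert
        "expr": "up == 0",
        "for": "5m",
        "labels": {"severity": "critical"},
        "annotations": {
            "summary": "The example operator is down.",
            "runbook_url": "https://example.com/runbooks/ExampleOperatorDown",
        },
    }
    bad = {
        "alert": "ExampleOperatorDegraded",  # hypothetical alert
        "expr": "up == 0",
        "labels": {"severity": "major"},
    }
    print(check_alert(good))
    print(check_alert(bad))
```

A rule using `major` fails this check at build time, instead of silently bypassing an Alertmanager route that matches on `critical`.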
So what we have at the moment in the best practices: we have a metrics naming convention. We have how to create documentation for metrics, and for alerts, information about alert labels and runbooks. By the way, runbooks are a way to provide more information about the alert. You have a link in the alert where you can send the user to go and find more details. What is it about? What's the impact? How do you diagnose it? And how do you mitigate the issue? And then there is additional information about how to test metrics and how to test alerts. We plan to enrich this information and add information about dashboards, logging, events, and tracing in the future. So Shirley gave a high-level overview of metrics and alerts. But how do we translate some of these best practices into code? One of the problems that we faced was that logic code and monitoring code were becoming very intertwined. Code like this becomes harder to maintain. It becomes more difficult to understand what the code does and to modify it. This leads to longer development times and potential bugs, and it's also more challenging to onboard new team members or to contribute to one of these projects. In this specific snippet, 16.4% of the code was monitoring code intertwined with migration logic code. So what we did was refactor this code to separate these concerns from one another. In this specific case, we used a Prometheus collector that just iterates over the existing virtual machine migrations and then pushes the metrics according to the status of the migrations, whether they were successful or not, or the counts of pending, scheduling, and running migrations. And obviously in this snippet it is much easier to understand how the monitoring is being done, and we take all of this out of the migration logic code.
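The refactoring described above can be sketched without any monitoring library. In this minimal Python sketch, the migration logic only updates state, and a separate collector derives the counts when it is asked for them. All names here (`Migration`, `MigrationStore`, `MigrationCollector`, the phase list, and the `example_` metric prefix) are assumptions for illustration, not the code shown in the talk; a real operator would implement the collector interface of its Prometheus client library instead.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical migration phases for this sketch.
PHASES = ("Pending", "Scheduling", "Running", "Succeeded", "Failed")


@dataclass
class Migration:
    name: str
    phase: str


class MigrationStore:
    """Stand-in for the state the migration logic maintains.

    Note that it contains no monitoring code at all.
    """

    def __init__(self) -> None:
        self._migrations: list[Migration] = []

    def add(self, migration: Migration) -> None:
        self._migrations.append(migration)

    def list(self) -> list[Migration]:
        return list(self._migrations)


class MigrationCollector:
    """Computes metric samples from the store only when collect() is called."""

    def __init__(self, store: MigrationStore) -> None:
        self._store = store

    def collect(self) -> dict[str, int]:
        counts = Counter(m.phase for m in self._store.list())
        return {f"example_migrations_{p.lower()}": counts[p] for p in PHASES}


if __name__ == "__main__":
    store = MigrationStore()
    for i, phase in enumerate(["Pending", "Running", "Running", "Failed"]):
        store.add(Migration(name=f"migration-{i}", phase=phase))
    print(MigrationCollector(store).collect())
```

Because the collector is the only place that knows about metrics, it can be unit-tested on its own, and the migration path does no metric work while it is running.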
And to help other developers who are starting out avoid the same mistakes that we had to solve, we are creating a monitoring example in the memcached operator. We already have an initial example that takes into account this separation between logic code and monitoring code. Our idea with this example is to make the code as clear as possible, which is especially important when we are working with large and complex code bases, and also to make it more modular. It's easier to understand both the logic code and the monitoring code when changing one does not affect the functionality of the other or of the application in general. It also makes the code more reusable, since the way we do monitoring in different operators will always be more or less the same. So if we find a common way to do this, it will be easier to reuse this code in other applications and projects, which will save them time and effort. And it will also become more performant. If we mix all the monitoring concerns with the migration code, it's clear that a migration will take longer, because we are calculating metric values and doing Prometheus operations while we are trying to calculate the state of a migration. So having this separation will also help with these questions. Our idea for the structure of the code is to create a package. For example, here we can see a migration example, with a central place where we register all metrics, and then files that separate these metrics by their types. In this example, you can see one operator metrics file, which will have all the operator-related metrics that we talked about in the beginning, and then we could have one specific file only for the migration metrics, and then register them in one place. And why do we think about this structure, and what benefits could it bring us?
The first one is to automate the metric and alert code generation. As we saw, much of the work that a developer needs to do is creating a file with a specific name, then going to the metrics.go file and registering that file there. Since this is very structured and always the same, it will be easy to automate, and it will allow developers to have a command-line tool to generate new metrics and new alerts more easily. We are also looking forward to creating a linter for metric names. As Shirley said, a lot of the concerns that come up when operators become more advanced are about looking back at the metrics and seeing everything we did wrong with their naming. And, as she said, it's a simple change, but it can have a lot of impact. So a linter that follows all these conventions will also be important. Also automated metrics documentation; we are already doing this. One thing that we faced was that a lot of metrics were very scattered in the code, so it wasn't easy to automate and find all of them. With a structure like the previous one, it will be even easier to create a full list of metrics and their descriptions, which will help developers, newcomers, and users. And lastly, an easier structure for both unit and end-to-end testing, because if we have this clear structure for where the metrics are, we can test exactly those functions and not code intertwined with logic code. And just to conclude, if you are starting to create an operator or if you already have an operator, we invite you to go and look at the Operator SDK and at the best practices, to try to avoid the pitfalls that we had. And I really hope it will help you. You should really consider that when you're creating a new operator, it starts small, but it can become really robust, and you cannot tell that in the beginning. So think ahead and try to build it correctly from the beginning. I hope it will be helpful for you. And thank you.
Thank you. Do you have any recommendations on how you would log the decision points within your operator? So if you wanted to retrospectively see why it did certain things, like how it decided which Kubernetes API calls to make, if your operator did something crazy and you wanted to look back and see why it did that, is there anything you would do in advance with the logging?
I think the summary of what we've learned is in these documents. Because as I said, the developers that started this project, for example, didn't have anywhere to go for best practices on how to name a metric. So they just named it how they thought. They did follow the Prometheus recommendations, but having a prefix for the operator has a big impact for the users. And not only the users. When we are trying to understand how to use internal metrics for our own uses, we also struggle to understand where a metric came from and where the code for it is. So the summary of everything we've learned is in those documents, and we plan to enrich it even further.
Thank you for your talk. It was very interesting. You mentioned code generation for the metrics package. My question is, do you plan on adding that to Kubebuilder and the Operator SDK?
Yeah, basically we are working on the Operator SDK right now, because we want to build all these tools, and we are thinking about them, but obviously this needs a lot of help from the community. And I'll add a personal note here, and an idea. The way I see it, on Kubebuilder and on the Operator SDK, you just go there and say that you want to generate a project with monitoring, and it creates the monitoring package. Or if the operator already exists, you have a command to generate the monitoring package. And then, just as you use Kubebuilder to create an API or a controller, you'll have a similar command to create a new metric.
And you pass the type of the metric and the help, and the same for alerts. At least that's the way I see it, and to me it makes sense. I agree. Thank you.
Hey, thank you for the talk. How much of the conventions that you talked about are aligned with OpenTelemetry? How much? What? Aligned with OpenTelemetry.
Most of them are aligned with OpenTelemetry, actually. But these are specific to operators. That's the idea. The idea is that you have a central place where you can get the information. And by the way, if someone is creating a new operator and has an insight, we encourage them to contribute to the documentation, teach others, and share the information. So yeah. Basically, I think we align with OpenTelemetry conventions, but we add more ideas to them for operators. I think that's it. Thank you. |
Lightning Talks
NetXMS | Parca | OpenSearch |
My name is Victor Kirhenshtein. I'm from the NetXMS team, and this is a brief introduction to who we are and what we're doing. NetXMS is a network monitoring system. It started in 2003 as my personal hobby project, when I was working as a network engineer at a local system integration company. Now it's a small team working full time on this project. We are based in Riga, Latvia, and you can check our website and the source code on GitHub. The design principles of our system are that we want to make it fast, flexible, and extendable by the user in different ways, when possible without changing the core source code, and suitable for large setups, so it should be able to monitor large networks and large installations. We put a lot of emphasis on network monitoring and on distributed monitoring. You can monitor pretty much anything in your infrastructure, servers, workstations, other devices, etc., but it still has some very network-specific functionality built in. So what are the major features of the system? We have automatic network topology discovery, so we can discover new devices in the network and also how they are connected together on different levels: layer 2 of the OSI model, so Ethernet, serial links, etc., the IP level, and OSPF topology. The system also provides visualization of this topology on different levels. We have topology-based event correlation and many useful topology-based lookup tools, for example to find a specific MAC address in the network, find a specific IP address, check the switch forwarding database, or check OSPF neighbors, all from the monitoring system user interface, all searchable. We have our own agents that can be installed on different operating systems, and those agents, among other things, can act as caching proxies for most of the protocols that are supported by the system.
So you can have a remote proxy that communicates with SNMP devices at the remote site and uses a secure communication channel to the monitoring server, and if you lose your connection to the remote site, it will keep collecting data and caching it locally, and then re-synchronize with the central server when your connection is back. The system is very event-centric, and we have powerful and quite flexible functions for event processing. We support data collection from different sources. It could be our own agents, SNMP, MQTT, SSH commands, EtherNet/IP, or web services, and data can be pushed to the system via our API. We have very powerful data collection templates to simplify and automate data collection from devices and servers. After data is collected, further processing is uniform, so once we get the value for the metric, it's no longer relevant how we got it or with which protocol; you process it in the same way. We have many options for data transformation and for threshold checking. We have a built-in scripting language in the system that can be used for implementing any custom complex logic for data transformation, for event processing, and for automation. We provide tools to build automation on top of the system, and we have very flexible access control, so the system can actually be used, and some users do use it, as a multi-tenant system. If you are an MSP, for example, you can give your customers access to parts of the system so they can see their network. This is an architecture overview. We have a monitoring server in the center, and it can collect data with different protocols directly or through our agent.
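The caching-proxy behavior described above (keep collecting while the link is down, then re-synchronize in order once it is back) is a store-and-forward pattern. The following is a minimal Python sketch of that pattern only; it is not NetXMS code or its protocol. The `CachingProxy` class, its `send` callable, and the cache limit are assumptions for illustration.

```python
from collections import deque
from typing import Any, Callable


class CachingProxy:
    """Buffers samples locally and forwards them in order when possible."""

    def __init__(self, send: Callable[[Any], None], max_cached: int = 10000) -> None:
        # `send` is assumed to raise ConnectionError while the link is down.
        self._send = send
        # Bounded buffer: when full, the oldest samples are dropped first.
        self._cache: deque = deque(maxlen=max_cached)

    def submit(self, sample: Any) -> None:
        """Record a newly collected sample and try to forward everything."""
        self._cache.append(sample)
        self.flush()

    def flush(self) -> int:
        """Forward cached samples oldest-first; return how many were sent."""
        sent = 0
        while self._cache:
            try:
                self._send(self._cache[0])
            except ConnectionError:
                break  # Link is down: keep the remaining samples for later.
            self._cache.popleft()  # Remove only after a successful send.
            sent += 1
        return sent

    def pending(self) -> int:
        return len(self._cache)


if __name__ == "__main__":
    received = []
    link = {"up": False}

    def send(sample):
        if not link["up"]:
            raise ConnectionError("link down")
        received.append(sample)

    proxy = CachingProxy(send)
    proxy.submit("sample-1")   # cached, link is down
    proxy.submit("sample-2")   # cached, link is down
    link["up"] = True
    proxy.submit("sample-3")   # link is back: all three are forwarded in order
    print(received, proxy.pending())
```

Removing a sample from the buffer only after a successful send is what keeps data from being lost if the link drops in the middle of a flush.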
It uses an SQL database for storing data and configuration, and among other databases we support Postgres with the TimescaleDB extension, which is what we really recommend if you have big setups. System administrators and operators can access the system via a web interface or a desktop application, and we also provide a full API, so whatever you can do from the web interface you can do from the API, which makes it really good for integration. I want to go through a few use cases of our users. Use case one is a global company with offices around the world. They have more than 12,000 network devices in their global network, and they use NetXMS to monitor all these devices, with more than 2 million metrics being collected from them. We have a link to InfluxDB, a fan-out driver for sending data, so besides being stored in the NetXMS database, the data is also sent to InfluxDB for further analysis and processing. Case number two is completely different. It's an agricultural holding in South Africa, and they use NetXMS to monitor pretty much everything they have: network devices, servers, etc., including the solar panels, the fridges, and the power generators. They use different protocols and different methods for getting this information, so MQTT for solar panels and for fridges, and for power generators they use Raspberry Pi computers with the NetXMS agent installed, with generators and fridge sensors connected via GPIO. The third case is an industrial computing consulting company in the US, and they use NetXMS in a way that was unusual for us, as a site assessment tool. Basically, their consultant comes to the customer site with a laptop with the NetXMS server installed and runs it in discovery mode for a day or for a few hours, and it finds the industrial devices, collects information from them using different protocols, and also builds the topology map for them automatically.
And the final use case is a quite big ISP that operates in the USA and Central America, and they monitor all their network devices and other equipment with NetXMS. They have more than 70,000 devices, all monitored with a single monitoring server, and they use a lot of automated discovery and automated templates. So that was a really quick overview. Take a look at our website, ask questions after the session, take stickers. Thanks a lot. That was the first ad hoc talk, thanks for doing this. |
Understanding the energy use of Firefox
With less power comes more sustainability |
Hello. Thanks everybody for coming. I will be talking today about what we know about how Firefox uses power, what we can know about it, and what we can do about it. First, what we will cover today. Why do we care about this topic? Then, how can we understand power use locally, which means: if Firefox is using too much power on your own computer, what can you do about it? And then, how can we understand it for all users in general, in the wild, so the entire population? And I will finish by explaining what we have done, what we have improved, and what we are still doing. So first, why do we care? There are really two different sets of reasons. The first one is user experience. We very frequently see users complaining that Firefox is using too many resources, too much CPU. My computer is noisy, the fans are making noise, they are going at full speed. Or the laptop is hot, if people are using a laptop. Or maybe the battery life is too short. All of those are reasons for an individual to want Firefox to use less power. There is another set of reasons, which is sustainability. Mozilla cares about sustainability and has made climate commitments about being carbon neutral and about reducing our greenhouse gas footprint. And because we are Mozilla, we do things in the open. We want to lead on this and do it openly. We want to share our tools, any material we have on this, and our methodologies. And we also want to improve our products so that they are more sustainable. The reason why we care so much about our product is that, when estimating our carbon footprint, because we have many users, the use of our product is actually 98% of our footprint. So flying less and having more efficient offices is all very great, but if we want to save a lot of power, it's really the product that we should look at. I said we will look locally first, because I think there are some people in the room who likely think Firefox uses too much power on my computer.
I want the battery to last longer. And that's a very valid use case. If we want to optimize for everybody, we first want to be able to have it running correctly on one specific machine in front of us before trying to go to scale. So I will present a few tools that we have for this. But first, I will explain how Firefox uses power. It's a desktop application, so like any other computer application, it's using CPU time and GPU time, and waking up CPUs. CPUs are actually pretty good at saving power when we ask them to do nothing. They go into deep sleep mode and use almost no power. The problem is that if we keep waking them up, they can't really sleep, and that wastes a lot of power. And then there are a few more things we do that use a little bit of power, but not as much, like transmitting network packets, for example, or writing to disk. They are not where we should focus our time if we really want to have a big impact. So this is how we use power. And next, how we waste power. It's almost the same thing. The only difference is that we do it for no user benefit. If we use CPU time and the user doesn't see a benefit in what we are doing with the CPU, it's a waste. And the same goes for all the other ways we could use resources. Some typical ways to waste power are waking up too often, playing animations even though they are completely invisible, or more generally doing things in the background that have no visible effects. Typically, when someone wants to understand local power use, there are two reasons. Either Firefox is suspected of using too much power: there's one core used at 100%, sometimes even more than one core. Or sometimes the user is looking at the operating system task manager for a completely different reason and notices that there are some Firefox processes that should be idle, because Firefox is supposed to be doing nothing, it's in the background, but it's using 1% of a CPU or 0.something.
Why? It used to be difficult to figure out answers to those questions. We now have good tooling for it. The first tool I wanted to share is the Firefox task manager. You can open it by typing about:processes in the address bar. You can also use the Shift+Esc keyboard shortcut, which will open it directly. It's very similar to a task manager you would see in an operating system. It shows you a list of processes and how much memory and CPU each one uses. But unlike the operating system, it knows exactly which tab is running in which process. So if you see that there's one process that's using a lot of CPU, and you see that it's a tab that you actually don't really care about, you can just close that tab and you are done. Very often we have people who say, oh, Firefox uses a lot of CPU, but you know, I have 50 tabs, so maybe it's because I have many tabs that Firefox uses a lot. Not really. Very often there's just one tab that's misbehaving, and quickly finding the right one and closing it is a more efficient way to fix it. Something else I want to show here is that the numbers, the CPU percentages, are very precise, a lot more than in operating system task managers. For example, on the third process here you see 0.011%; in the operating system you would just see zero in this case. And the reason why we want to show it when it's almost nothing is that almost nothing means we are still waking up the CPU to do a little bit of something. We want to be able to catch this, because we use this kind of tooling to find real bugs. This is a screenshot captured on Nightly; if you are on a release build, you won't see the thread names unless you enable them in about:config. Okay, so let's assume you have found that there's a real problem there and it's not a tab you can easily close. The next step, which is more involved, is using the Firefox Profiler.
I won't go into details on the Firefox Profiler, because the next presentation is also about the profiler. But I will say that we recently added a preset for power profiling, which configures the profiler in a way that causes very little overhead. And we also have a power profiling mode, added recently, that is able to say how many watts we used, not just how much CPU. So you can enable it quickly with this preset. And I just wanted to show an example of how precise the measurement can be. This is a profile showing Firefox doing nothing. There was just one thing that was left: the cursor in the address bar. It was blinking. All the spikes that you can see here are whenever the cursor appears or disappears. And here we could select this area and see in the tooltip exactly how much power is used to do this tiny thing. So anything we do, we can see it in the profiler, see how much power it uses, and correlate. Honestly, I never thought we could see things like this. I thought we could see bigger things like loading a page, but blinking the cursor? That was a surprise. And a good one. Another thing in the profiler that I wanted to share, and it's the last one because the next talk will be all about the profiler, is that the profiler records many markers. I especially wanted to show the Awake and Runnable markers, which make it easy to see why we were waking up a thread. Whenever the thread wakes up, there's an Awake marker. It often says which priority the thread had from the operating system's point of view and which core the thread ran on. Runnable markers only exist on Nightly, but they say exactly what we ran at that time, which is very convenient for then filing a bug if it's something we should not be running. One last thing: in the task manager, if you hover with the mouse next to the PID, a profiler icon appears. If you click it, it will profile the entire process for five seconds.
A few seconds later, you will see a tab opening that shows everything happening in that process. That's all I will say about troubleshooting local excessive power use. I hope you will make good use of the tools I was just showing. And now we'll talk about what's happening in the wild, so for all our users. Whenever we care about what's happening for all users, we think telemetry, because that's a great way to know what's happening on computers that are not in front of us. I added data collection for a few things that are related to power use over the last couple of months: most notably CPU time used, GPU time used, and the number of wake-ups that we caused. We can also break down this data by process type. By process type, I mean: is it the parent process that's showing the Firefox user interface? Is it a content process that's showing the tab in the foreground? A content process for a background tab? On the Nightly channel, we can even break it down by thread name, which is a lot more detailed. And now I will show the use we are making of this data. I said we care about sustainability and we have climate commitments. One of the use cases for having this kind of data is estimating our carbon footprint. Thanks to the telemetry, we know that on average, every day, we use between 60 and 80 million hours of CPU time and about 15 million hours of GPU time. Those are big numbers. It's hard to think about what they mean, but we can try to convert those numbers to CO2 equivalent by using the CPU specifications from CPU manufacturers, the information about which CPU model is being used, and the carbon intensity of electricity by country. We will be publishing our carbon footprint for last year in a couple of months, and it's based on this kind of data. And I just wanted to give a sense of scale, because millions of hours, that means nothing to me.
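The conversion just described (CPU time, times the power a CPU draws, times the carbon intensity of the local electricity) is simple arithmetic. Here is a minimal Python sketch of it. The function names are made up, and every number in the example is a placeholder chosen to make the arithmetic easy to follow; none of them are Mozilla's figures or real per-country data.

```python
def co2e_kg(cpu_hours: float, avg_watts: float, grid_g_per_kwh: float) -> float:
    """Estimate CO2-equivalent emissions in kilograms for one group of users.

    cpu_hours:       CPU time used, in hours
    avg_watts:       assumed average power draw of the CPU while busy, in watts
    grid_g_per_kwh:  assumed carbon intensity of electricity, in g CO2e per kWh
    """
    energy_kwh = cpu_hours * avg_watts / 1000.0
    return energy_kwh * grid_g_per_kwh / 1000.0


def total_co2e_kg(groups: list[tuple[float, float, float]]) -> float:
    """Sum the estimate over (cpu_hours, avg_watts, grid_g_per_kwh) groups,
    for example one group per CPU model and country."""
    return sum(co2e_kg(hours, watts, intensity) for hours, watts, intensity in groups)


if __name__ == "__main__":
    # Placeholder inputs only: 1000 CPU hours at 10 W is 10 kWh,
    # and 10 kWh at 500 g/kWh is 5 kg CO2e.
    print(co2e_kg(1000, 10, 500))
    print(total_co2e_kg([(1000, 10, 500), (2000, 5, 100)]))
```

Grouping by CPU model and country matters because both the power draw and the carbon intensity vary widely between groups, so a single global average would hide most of the signal.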
The amount of power that would be needed to power Firefox for all of our users, which is hundreds of millions of users, would be the equivalent of a small thermal power station. Or if you're thinking more renewable energy, we would need to cover the roofs of about 50,000 houses with photovoltaic panels. So even if we save just 1% of the power, that still means a lot compared to other things we could do in our personal lives as engineers. Another example of using telemetry data is verifying that a fix we landed actually had the impact we expected. And this is a case where we fixed something related to how timers were implemented, timers for web pages. And this chart shows how many times we wake up various threads, so it's from Nightly users. And you see that there's a change happening here, something trending down. Before, it was about 7% of the wake-ups for the timer thread, and after, it was about 5%. So we really had an impact with this fix. And before we collected this kind of telemetry, it would have been impossible to know. And last but not least, we used this telemetry to verify our ideas about how we can reduce power use. And when I started working on this project, I had the assumption, and other people too, that we use a lot of power in background tabs. And that's probably because as someone who uses Firefox and the Internet a lot, I have many background tabs. And we just collected data. So this is a breakdown per process type. We see that the biggest slice here is the foreground tab, not background. Second biggest is the GPU, so showing things on screen. Then we've got the parent process, which is the UI, whether the user is interacting or not interacting. And only then, we have background tabs. So it's between 7% and 8%. Still worth optimizing, but if we spent all of our efforts optimizing this, we would be missing the biggest part of the thing, which is foreground tabs.
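The conversion from CPU time to CO2 equivalent mentioned in the talk (CPU time, manufacturer specifications per CPU model, electricity carbon intensity per country) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Mozilla's actual methodology: the `UsageBucket` type, the choice of dividing the package TDP by the core count to get a per-core power figure, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsageBucket:
    """CPU time for one (CPU model, country) pair. Hypothetical structure."""
    cpu_hours: float          # CPU time in core-hours, as telemetry might report it
    tdp_watts: float          # package TDP taken from the manufacturer's spec sheet
    core_count: int           # used for a crude per-core power estimate
    grams_co2_per_kwh: float  # electricity carbon intensity for the country


def estimate_co2_kg(buckets):
    """Return a rough CO2-equivalent estimate in kilograms.

    Assumption: a busy core draws TDP / core_count watts. Real power draw
    depends on frequency, load and idle states, so this is only a rough guide.
    """
    total_grams = 0.0
    for b in buckets:
        watts_per_core = b.tdp_watts / b.core_count
        kwh = b.cpu_hours * watts_per_core / 1000.0
        total_grams += kwh * b.grams_co2_per_kwh
    return total_grams / 1000.0


# Made-up sample data: a 15 W quad-core laptop CPU on a carbon-heavy grid,
# and a 65 W eight-core desktop CPU on a low-carbon grid.
example = [
    UsageBucket(cpu_hours=1000, tdp_watts=15, core_count=4, grams_co2_per_kwh=400),
    UsageBucket(cpu_hours=2000, tdp_watts=65, core_count=8, grams_co2_per_kwh=50),
]
print(estimate_co2_kg(example))
```

With these sample values, the first bucket contributes 3.75 kWh and the second 16.25 kWh, which shows why the carbon intensity of the grid matters as much as the CPU time itself.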
Another idea that we tested is: maybe it's all the ads in web pages, they use a lot of power. We should do something about them. We also collected data. It turns out to be less than 2% of it all. Maybe still worth doing something about it, but again, it's not where we will have the biggest wins. And maybe it also means that tracking protection works really well in Firefox, and we are already blocking many things. And the last section of this presentation will be about improvements, what we have done to reduce power use and what we can still do. We fixed many bugs. When I wrote the slide, it was 26 bugs that we fixed specifically to reduce power use. But if I wrote the slide today, it would be 27, because one was fixed overnight. The bugs go in various categories. It's almost always the same kind of things that we find. Sometimes we have timers that really should have been stopped, but keep repeating, and they are not really useful. It's one of those bugs that we fixed last night, something that was waking up every 10 seconds, even when you do nothing with Firefox. Sometimes it's animations that are animating, but they are animating stuff that's not even on screen, maybe because it's a background window, a background tab, or something hidden for some other reason. When we can stop those animations, it's much better. And when I say bogus animations, it's that sometimes we had animations that kept running even though the window was closed. I think we fixed all of those cases, but we might still find more. And then pointless thread wake-ups. That's what I was showing before with the chart about timer threads, and edge cases where there was massive CPU use. So thanks to all the contributors who helped with this. It's the work of many people, mostly on the platform team. And I will just showcase a few examples of bug fixes we did that had a big impact. So this one is specifically about Windows 11. Windows 11 has an efficiency mode for processes.
It's not completely clear what it does, but when reading the documentation, it's mostly letting the operating system know that this process is doing nothing that's user-visible. So it could run the CPU at the lowest possible frequency. And for CPUs with efficiency and performance cores, always select efficiency cores. And thanks to the power profiling that I mentioned before, we could actually verify the impact that we had when deciding to put content processes for background tabs in efficiency mode. If you look at the slides later on, click the link. You can see it in the profile. But on my computer, when I tested it, it divided by five the power use of a tab using the CPU in the background continuously. Another thing: I said we have many bugs that are in the same category of bug. And when we can, it's nice to eliminate the entire category of bugs at once. And for animations that are broken in edge cases, it's almost impossible to write tests for all the possible edge cases. But we have a very extensive test suite. So one idea I had was, what if at the end of every automated test we run, we verify that nothing is animating anymore? Sounds very easy. We did it. The part that was not easy was fixing all the edge cases this uncovered. But that's why I'm confident it won't regress as much as it used to. Next, things we can do, because we still have many ideas of how we could do better. I mentioned background tabs. We still have lots of ideas about how background tabs could be more efficient. How we could be more aggressive about reducing the frequency of timers firing there. How we could limit CPU use there. And I keep talking about timers, but there's a lot we can do about timers. And there's actually currently one engineer working full time on improving timer APIs. The main idea there is to group timers. Because the most expensive part about timers is that they wake up the CPU.
So if, when we do wake up the CPU, we decide to run many timers at once, it would be much cheaper in terms of power. And we are working on those kinds of improvements. We still have cases where we have videos that are being decoded, but not played in a place where they are visible to users. Like background tabs and things like that. We try to stop those, but there are edge cases that we are still working on. Hidden animations. So animations that keep running when something has been completely closed. I'm confident we fixed most of those. Animations that keep running even though they are covered by something else. We still have many cases. And the biggest one is fully occluded windows. Which is when you have a window that's entirely above the browser window where we have an animation. We try to detect that. We have code to detect it, at least on Mac and on Windows. It's not working as well as it should. And I think we can do much better. So there will probably be work going in that direction. And another thing that I like to profile, also because it's testing the capabilities of the profiler, is what happens if Firefox is started and there's nothing. You open Firefox, you load about:blank, literally nothing. And then you go to a meeting or go for a walk or do something else. And then you come back a few hours later, you capture the profile and you see what happened. I would like what happened to be almost nothing. It's currently still more than I would like. And we can still improve things there. And I think it would typically help for sustainability for people who are not using laptops, which tend to go into sleep mode, but rather desktop computers, where people might start things on their computer. And then they go home for the night or for the weekend. And the computer keeps running the entire weekend. I think it might have an impact for those cases. And some more ideas that we are not ready to do something about for everybody, but that could be experimented with.
Experiments you could run individually, like if you want to test it for yourself. Or experiments we could run on a few thousand users to see what the impact is. So about the preferences that I'm giving there: when showing the chart, I said that displaying the foreground tab is what's using almost half of the entire power that we use for Firefox. By default, we display stuff, especially animations, 60 times per second, or more if you have a screen with a faster refresh rate. Do you really need that? I think most people don't. And we have a pref to limit the refresh rate. So if you want to have only half the frames, just set it to 30 and you will see what happens. I think for most use cases, except maybe fast video games, it should be fine. Another thing we would like to explore is the cost of video autoplay. We already block videos that would make sound, because it's noisy and that's annoying. But videos that are just there in a corner, typically news articles with someone waving hands and talking, but you can't hear them because you're reading the article and you don't care. They use a lot of power. We could probably stop that. And there's a pref that you can set that will block both when there's audio and when there's video. Even if there's just video, it will block it. And that's all I wanted to share for today. And I'm happy to take questions if you have any. But first, thanks for your attention. So who has any questions? In the case that you have a lot of background tabs, does Firefox suspend those processes? And if so, does the memory get freed, and would the process manager show that or not? So the question is: if you have lots of background tabs, is Firefox suspending those processes, is the memory getting freed? There are multiple answers to that question because there are different cases. By default, the answer is mostly no. We don't suspend them. One thing we do is we throttle any activity there.
So if a tab is trying to do things every 10 milliseconds, we will not allow that. And we will limit it to once per second, which saves a lot of power already. I think we can do better. I would like it if it was only once per minute, and then maybe after a few minutes or after a few hours suspended completely. There are cases where the tabs are completely unloaded. One of the cases is when we are using way too much memory and we are about to crash out of memory; then we will, as a priority, unload the tab that's abusing the memory. There's another case, which is when you session restore: we don't reload the tabs until you click them, except for the foreground tabs. So then you will have many tabs, but they don't actually use memory or power. And one more case is Firefox on Android. By the way, everything I said in this slide show was about Firefox desktop, but we are also looking at the power use of Firefox on Android. On Android, when you put a tab in the background, it's completely suspended. Nothing runs anymore. Any other question? Okay. Can you just say it? So I just tried seeing what my Firefox is doing in about:processes, which I just learned about. So the question is: I just learned about about:processes and quickly wanted to see what my Firefox was doing. And the process that's using the most CPU there is the parent process, which is using about 20% of a core. Do we have any idea of what it is doing? So the way I would figure this out is to click the profiler icon and look at a profile. And if you want to send me that profile, I can tell you exactly. But otherwise, I will just give a quick guess, which is what happens most of the time, unless you are running on Windows, but I guess you're probably on Linux or something else. So I said the GPU process is a large part. We actually only have a GPU process on Windows, to prevent graphics drivers from crashing the entire browser. So the graphics part happens in the parent process outside of Windows.
And it's very likely that you have animations that are running and causing things to be displayed. And with a profile, I could tell you which animations are running and why. We have questions here in the Matrix room. Has anybody ever compared the worldwide power usage of Firefox versus other proprietary web browsers? I missed a few ones. So the worldwide power use of Firefox compared to other browsers, have we ever compared? I would say no, because I think the other browsers don't publish worldwide power use. And I'm hoping we'll be publishing this as part of our greenhouse gas footprint report. I don't think competing browsers publish any of that. And we are actually thinking that if we start publishing that, maybe we will push the competition to also publish this kind of information. Great. Any other? So this is kind of an extension to the configuration options that you showed. But are there other tweakables that we could apply to Firefox on mobile devices, like PinePhone, Librem, and so on, where we definitely don't need 60 frames a second? I'm not sure I understand entirely, but I think you were asking: are there other things that we could do on mobile to reduce power use there? So you have a mobile device running the desktop version of Firefox, and you are wondering if there are things you could do to reduce power use there. Okay. If you generally use one tab, try to eliminate background tabs entirely and try to suspend them. Otherwise, the answer is almost always the same. Capture a profile, see what's in there. Because whenever we try to optimize by just guessing, almost all the time, we are wrong. Like I said, background tabs, it's not that bad. Ads, it's not that bad. Profiling is always the way to know exactly where you should be spending your time when you want to optimize. And if you want help with profiling, the next talk will be about the profiler, and we are very happy to help with understanding profiles. Yep. Hi. My question is regarding hardware encoding.
For example, in HTML, and depending on the browser, you can give a list of different content types or formats. Has there been any exploration into changing maybe the preferred format loaded to optimize for less power consumption, as opposed to faster or higher resolution? So what I understood of the question is: has there been any exploration into changing the kind of content types we accept, for example, for images or media, to reduce power use? I think the answer is no, but I know there's currently exploration in terms of what we can do to reduce bandwidth use. And bandwidth also uses power in some ways. And we were thinking about this mostly in the case of estimating the cost of VPNs for users. And for users who really care about privacy and really want privacy on a VPN, could we do things to reduce the cost for them to pay for a VPN that charges for bandwidth? So I'm afraid that's all the time we have. However, as you can see, Florian put his email there for questions and also shared ideas in the Matrix room. And we also have that Matrix room. Please also add questions there. And there will be staff members there and also other Mozilla volunteers to answer. Thank you very much. We need to change over for the next talk. Thank you very much, Florian. |
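The timer grouping idea from the talk (run many timers on one CPU wake-up instead of waking up once per timer) can be illustrated with a small sketch. This is not Firefox's timer implementation; the `coalesce_timers` function and the `slack_ms` tolerance are hypothetical names, and the sketch assumes each timer may fire up to `slack_ms` later than requested but never earlier.

```python
def coalesce_timers(deadlines_ms, slack_ms):
    """Group timer deadlines so that each group is served by one wake-up.

    A group starts at its earliest deadline and accepts every later deadline
    that is at most slack_ms after it. The wake-up happens at the latest
    deadline of the group, so no timer fires early and none fires more than
    slack_ms late.
    Returns a list of (wake_time_ms, [deadlines served]) tuples.
    """
    wakeups = []
    group = []
    for deadline in sorted(deadlines_ms):
        if group and deadline > group[0] + slack_ms:
            # This deadline is too far from the start of the group: flush it.
            wakeups.append((group[-1], group))
            group = []
        group.append(deadline)
    if group:
        wakeups.append((group[-1], group))
    return wakeups


# Seven timers, made-up deadlines in milliseconds, 10 ms of allowed delay.
plan = coalesce_timers([0, 3, 9, 10, 25, 40, 41], slack_ms=10)
for wake_time, served in plan:
    print(f"wake at {wake_time} ms, run timers due at {served}")
```

In this made-up example, seven timers are served with three wake-ups instead of seven. The trade-off is precision: a larger tolerance means fewer wake-ups but timers that fire later than requested.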
What's new with the Firefox Profiler
Power tracks, UI improvements, importers |
I will just do it very quickly. Nazım Can Altınova is a software engineer from Berlin, working at Mozilla, and has been working on the performance tooling team, which develops the Firefox Profiler. He is going to talk about what's new with the Firefox Profiler: power tracks, UI improvements, importers. Thanks a lot for the introduction. Hi, everyone. And yeah, I will be talking about the profiler and what's new in the Firefox Profiler more specifically. So first, I'm going to give you an introduction about profilers and what the Firefox Profiler is, and then I will continue with the importers that we have and the other tools that use the Firefox Profiler. Then I will continue with the new features and UI improvements like power profiling, source code view, inline call stacks, and many more. So first, the Firefox Profiler is located at profiler.firefox.com, so it's a website, and you can go in there and follow the instructions there. And also, the source code is located in our GitHub repository, so you can find our GitHub repository and contribute if you like. And let's start with what a profiler is first. A profiler helps developers to analyze performance issues, and it gives them insight into how their application works, and it gives them a lot of clues into the application, and then you need to do some detective work to understand the inner details, and then you solve the problems. So I'm not going to get into profilers in general or give more details about them, because there are multiple kinds of profilers, like statistical profilers or event-based profilers, but I will talk in more detail about what the Firefox Profiler is. It is a statistical profiler with additional data, and what a statistical profiler does is that the profiler pauses the execution of Firefox at a fixed interval, and then captures the backtrace of Firefox and all the threads.
And after capturing that, this call stack includes frames from both JavaScript and native code, and after that, the Firefox Profiler front-end visualizes this captured stack, and it gives you an overview of your application. And in addition to that, we have markers as well as a data source, but I will talk about them later in more detail. And this is what you see when you start to analyze a performance profile. It can be a bit intimidating at first, but don't worry about it. I will talk about it a bit later, but also the more you use it, the more you are going to get used to it. Also, my colleague Julian will be giving an introduction to the Firefox Profiler tomorrow in the JavaScript devroom. I definitely recommend you to check that out as well, because he will give you details about how to capture a profile and how to analyze a profile. So it will be tomorrow at 4:30 p.m. I definitely recommend you to check that. And let's move on to the importers that we have inside the profiler. So first, let me explain what an importer is, because, for example, there are lots of profile data formats on the web, and all the data formats require different kinds of UIs, but we have something called importers inside the Firefox Profiler, so you can import the different kinds of data formats into our analysis view, so you can see them automatically. For example, we support the Chrome trace event format, Linux perf script, ART trace and Valgrind, and it's pretty easy to just drag and drop all the things into the profiler, and it automatically shows you everything there. And I'm going to skip the video, and we also have additional importers, like the ETW importer, but you need to follow the instructions here to be able to import it. And also there are other tools, like the Java JFR profiler, that use the Firefox Profiler inside.
So the JFR profiler is an IntelliJ plugin that lets you profile your program and then see the profile in JFR format inside the Firefox Profiler, seamlessly, and it's been implemented by Johannes Bechberger from SAP, so he's also been contributing a lot to the Firefox Profiler, so thanks to him as well for implementing that. And you can find the plugin at this link, and also he will be giving an introduction talk in this devroom at 5:30 p.m. If you're curious about his journey and how he implemented his own tool using the Firefox Profiler, I recommend you to check that as well. And we have other tools, like samply from our colleague Markus as well, so it looks like time is up actually without coming to the other exciting stuff, but do we still have five minutes for questions or? Yes, I left, like we have a few minutes for questions because then we need to change the speakers, but yeah, who has questions? Well, we had more features to come, but sorry about that, because of the technical problems, we couldn't finish it, but let me quickly go over the new features. We have the power profiling, and actually Florian mentioned a little bit about power profiling. We have this setting over here that you can select, and then when you capture some profile data, we have this additional power usage per process, and you can see which process is using how much power, and this gives you a lot of information and lets you reduce the power usage of your website or Firefox, and also we include CO2 emission information there, so you can see the effect of your program, and this is with huge thanks to Chris Adams and Fershad from the Green Web Foundation, they implemented this CO2 information in our tooltip, and also Florian has a talk in the energy devroom today, you can check that out as well at 5:30. Also we have the source code view and inline call stacks, and if you look here, you can see our inline call stack there, and previously we had some
missing samples there, missing frames, because the compiler optimizes some functions by inlining them into the caller, so now we properly show you everything there, and also we have the source code view, so you can see the source code, and it lets you see a lot more inside, and you can see what type of functions are being called inside that function, and you can also use the "Show file name" context menu item to open that up, and our DevTools performance panel has also been replaced with the new Firefox Profiler, and we also improved some markers, and unfortunately I won't be able to explain that a lot, but now we changed how we visualize the markers, like now instant markers are diamond-shaped, and interval markers are rectangles, so it can give you more information, and it lets you distinguish between them, and we have the task manager that Florian mentioned, I will skip that, and you can select multiple tracks, and we also improved things so you can search tracks if you're curious, and we have lots of transforms there, and now the Firefox Profiler front-end is localized, so we have more than 15 languages currently, and more to come, so thanks to our Outreachy intern Hasna, and also all of our localizers, and you can change the locale, we have a "no periodic sampling" mode, and also our documentation has been refreshed, if you are curious definitely check that out, because there is lots of beginner information there that can let you onboard with the Firefox Profiler, and that's it, thank you, and sorry about the technical problems. Again you can find the documentation at this link, and my slides are there, if you want to learn more about them. I also have some presenter notes with more information actually, because I skipped most of them now, but also you can find our Matrix channel here, you can ask us anything in our Matrix channels, and all of our developers are there, and happy to answer your questions. |
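As described in the talk, a statistical profiler pauses execution at a fixed interval and records the call stack each time. How such samples turn into per-function numbers can be sketched as below. This is a simplified illustration, not the Firefox Profiler's actual data model: the `summarize_samples` function and the sample data are hypothetical, and each stack is assumed to be a list of function names with the root first.

```python
from collections import Counter


def summarize_samples(samples):
    """Count, per function, how many samples it appears in.

    self_counts:    samples where the function is the innermost frame,
                    i.e. the code that was actually executing.
    running_counts: samples where the function is anywhere on the stack.
    Multiplying a count by the sampling interval gives an approximate time.
    """
    self_counts = Counter()
    running_counts = Counter()
    for stack in samples:
        if not stack:
            continue
        self_counts[stack[-1]] += 1
        for func in set(stack):  # a recursive function counts once per sample
            running_counts[func] += 1
    return self_counts, running_counts


# Four made-up samples, as if taken 1 ms apart.
samples = [
    ["main", "render", "paint"],
    ["main", "render", "paint"],
    ["main", "render"],
    ["main", "gc"],
]
self_counts, running_counts = summarize_samples(samples)
print("self:", dict(self_counts))
print("running:", dict(running_counts))
```

Because the numbers come from sampling, a function that runs for less than the sampling interval may not show up at all, which is one reason the profiler also records markers as a separate data source.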
Over a decade of anti-tracking work at Mozilla |
One minute, please sit down, everyone. I'll start. Whenever you're ready. Okay, team. You started? Hello, everyone. Welcome to our third talk, where we have Vincent. Vincent Tunru is a UI engineer on Mozilla's Privacy and Security Products Team, working on tools like Firefox Relay and Firefox Monitor. And he's going to talk about over a decade of anti-tracking work at Mozilla. Yeah, thank you, Francesca. That was actually the first sentence of my presentation, just gone now. Yeah, so I work on the Privacy and Security Products Team. Yeah, and so I want to start this presentation with a bit of a personal anecdote. My open source journey started with the release of Firefox 1.0. It was my first interaction with open source software. Later I started using Linux, started contributing. First with translations, later I became a software engineer. But I only joined Mozilla as an employee about a year and a half ago. So for this presentation, where I'm going to discuss a little over a decade of anti-tracking work at Mozilla, I'm going to be leaning a lot on the experience of my coworkers, specifically Luke Crouch and Max Crawford, other engineers on the Privacy and Security Products Team. They wrote this blog post, so most of the content of this presentation is also in that blog post. So if you want to have a quick high-level recap, you can read that afterwards. It's also linked on the FOSDEM site, I think. But Luke has been at Mozilla for basically forever. Mozillian through and through, lots of institutional knowledge, so lots of the content is by him. Max did most of the illustrations, so that's credit where credit is due. So tracking. It can be beneficial. I want to do one short scary slide and then the rest will be more positive. But I want to take a bit of a moment to discuss the risk of tracking. Why are we actually trying to minimize the harmful effects of tracking? So tracking is a personal risk.
You can fall victim to phishing attacks, for example, or the people around you can fall victim to it. So if more of your data is known, or if your data leaks, it can be used to impersonate you and get, for example, people around you to transfer large sums of money or whatever. So, for example, if someone contacts my father and says, hey, it's your son, could you please wire me 5,000 euros? You know, it's a lot less believable than if they were to say, hey, it's Vincent. I just got fired from Mozilla. Hopefully, it never happens. But, you know, could you wire me 5,000 euros because I'm in money trouble or whatever? That's a lot more convincing. So data can be abused, but it's also more of a societal risk. So ransomware has been in the news a lot recently. It costs a lot of money. It can even be used to convince people not to vote if they are aligned with a certain political party, et cetera. So there are risks involved with tracking. The rest of my presentation should be more positive. I'm going to discuss what we're doing to minimize those harmful effects to allow you to confidently use the internet carefree. So there's a variety of ways to track you. Historically, a lot of attention both inside of Mozilla and outside of Mozilla has been given to tracking cookies. So I'll start my presentation with an overview of what we've been doing there. And then I'll go over these other forms as well. So cookies, and a bunch of related technologies that we'll all call cookies. They are bits of data that websites can store on your computer, which can be useful. So if you load a website and the website sees a bit of data that proves that you are you, it can decide to give you your shopping cart or your private messages or whatever and not show someone else's. So that's a good thing. But it can also be used to track you if you don't want that. So every website you visit can set cookies. But not just that website. Websites can also embed other websites.
So for example, a website could contain a YouTube video and then YouTube can set cookies as well. It can contain ads. And then those are often also served by a third-party website. It can also set cookies. Websites can even embed other websites without you seeing them. So for example, with the goal of tracking you, so those can also set cookies. So we've been clamping down on that primarily through Firefox. So I'll start my overview with Firefox Private Browsing, introduced in 2008 after Chrome was released, which had an incognito window. But a private browsing window is basically a window that, as soon as you close it, forgets everything you did in there, forgets the cookies that were stored as well. It's often jokingly referred to as porn mode, but it's definitely also an anti-tracking tool. So for example, my girlfriend used it as such. She's a kind of ridiculous Harry Potter fan, like participates-in-international-pub-quizzes-about-Harry-Potter level. But she's also a high school teacher. And so sometimes she'll need to show, I don't know, a video of something that happened in the news recently. And she'll go to YouTube, share her screen, and show a YouTube video to her class. And she doesn't necessarily want her entire private life, which is entirely Harry Potter, shown in the YouTube recommendations. So what she does is at home, when she's, I don't know, listening to Harry Potter music or ASMR or podcasts or whatever on YouTube, she'll open a private browsing window and do that thing in there. And then if she closes it, then YouTube can't correlate those two sessions. So it won't show Harry Potter recommendations when she's sharing something with her class. Another reason why I'm starting my overview with this private browsing window is that if someone uses a private browsing window, we interpret that as a signal that someone wants less tracking and is willing to accept some more breakage.
And so unfortunately, often when we're trying to block tracking, websites assume that they can track you and then build their functionality on that. So there's a risk, if you combat that, that you break the website. So whenever we want to introduce new measures to combat tracking, we'll first introduce it in private browsing, see how much it breaks there, and then later we can try to port it over to your regular browsing window. So that's why I mentioned private browsing first. In 2013, we introduced more granular cookie control, so you could make that trade-off yourself as a user. You could choose, for example, to not use cookies at all. Lots of breakage. You wouldn't even be able to log into websites anymore, but it also removes that tracking vector. You could also choose to, for example, not accept third-party cookies. So if you visit a website, then that YouTube video that's in there might not be able to set cookies. Less breakage, but still quite a bit. So it gave you that control, which is helpful, but it's also kind of out of the way. You have to know of these options. You have to understand what they do, understand their risks, and just making that trade-off between breakage and less tracking isn't a great one in the first place. So later on, we also introduced a block list, so that's basically a list of cookies that we know are just used to track you. They don't provide any functionality for you as a user, and we allowed you to block those cookies. So less breakage; it doesn't prevent all tracking, but helps there. And then not too long ago, 2021, that's really state-of-the-art anti-tracking cookie work: we introduced what we call Total Cookie Protection. So that doesn't actually block cookies, but for example, if you visit youtube.com, and then later visit a different website that also includes a YouTube video, they'll both have cookies, but they'll be in different cookie jars. So YouTube will still not be able to correlate those two sessions.
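The separate cookie jars idea can be modelled in a few lines. This is a simplified sketch of the concept only, not how Firefox implements Total Cookie Protection: the `PartitionedCookieStore` class is hypothetical, and it assumes a jar is identified by the pair of the site in the address bar and the site that sets the cookie.

```python
class PartitionedCookieStore:
    """Toy model: one cookie jar per (top-level site, embedded site) pair."""

    def __init__(self):
        # Maps (top_level_site, embedded_site) to a dict of cookie name -> value.
        self._jars = {}

    def set_cookie(self, top_level_site, embedded_site, name, value):
        jar = self._jars.setdefault((top_level_site, embedded_site), {})
        jar[name] = value

    def get_cookies(self, top_level_site, embedded_site):
        # Return a copy so callers cannot modify the jar by accident.
        return dict(self._jars.get((top_level_site, embedded_site), {}))


store = PartitionedCookieStore()

# Visiting youtube.com directly: YouTube sets an identifier.
store.set_cookie("youtube.com", "youtube.com", "visitor_id", "abc")

# Visiting another site (made-up name) that embeds a YouTube video:
# YouTube can still set a cookie, but it lands in a different jar.
store.set_cookie("news.example", "youtube.com", "visitor_id", "xyz")

print(store.get_cookies("youtube.com", "youtube.com"))
print(store.get_cookies("news.example", "youtube.com"))
```

From the embedded site's point of view, cookies keep working on every page, which is why this breaks far fewer websites than blocking third-party cookies outright. It just never sees the same identifier on two different top-level sites.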
So that helps prevent that tracking vector without actually breaking it, because from YouTube's point of view, they still have cookies. So that's the work so far. Obviously, the timeline here stops at 2021. We're in 2023 now, but there's still the future, so we're always working on more things to come with that. But that's my overview of cookies so far. With that, let's move on to IP addresses. So IP addresses are basically addresses for every device that's connected to the internet. They are relatively stable. They can change, but most of the time, most devices have the same address. And this is a pretty strong identifier. Like, for example, I'm the only one that uses this phone, and it often has the same IP address, so everything you can link to that IP address is quite sure to be me. Same goes for this laptop, for example. And they're not just a tool to link things to you, but they're also often correlated with your geographical location. So it's not super specific, but when I'm at home, you can deduce from my IP address which city I'm in. So, yeah, that's the risk of IP addresses, so what have we been doing there? Well, like cookies, oh, I forgot this. So the thing with IP addresses is that inherently, if two devices connect to each other via the internet, they can see each other's IP addresses. And so that means by virtue of connecting to a website, that website will be able to see your IP address. And again, websites can embed other websites, so those embedded websites can also see your IP address. So just like we did for cookie protection, we introduced a block list of IP addresses that are known to track you and not provide other functionality. As I mentioned, first in private browsing; later, we also gave you that option in your regular browser window. Which is good: if you never even connect to a tracking server or to an ad server or whatever, it can't track you, it can't see your IP address.
But obviously, we can only do this for connections that don't provide any functionality. We can't start blocking youtube.com, because you wouldn't be able to view that video anymore. So to battle that, in 2019, we introduced Firefox Private Network, which sits in between your browser and the website you're connecting to. And so instead of connecting directly to that website, you'll connect to Firefox Private Network, and then Firefox Private Network connects to the website you're trying to view. And then from the point of view of that website, it will see the IP address of Firefox Private Network. And so in 2020, we expanded that to your entire device with Mozilla VPN, which protects not just your browser traffic but everything on your device. And you can also use it on multiple devices. Yeah, so that's our IP protection so far. Yeah, then I'll move on to email addresses, which is the fun part, because I work specifically on Firefox Relay, a product here that I'll discuss in a second. But first, there's a point that blew my mind when Luke first mentioned it. Email addresses feel like something you can easily create anew, right? You go to gmail.com, enter a username and a password, and you've got a new email address. So they feel like they're easy to change. But in practice, you've probably changed your house address more often than you've changed your email address. I've moved a lot, I don't know, six times in the past 10 years, maybe, but all that time I've been reachable via the same email address. So if you have that on file, there's so much history about me that you know via that email address. So it's a pretty stable identifier, and it's cool if you block all your third-party tracking cookies and you hide your IP address, but if you then go on and sign in with your email address, they can just correlate all that back together.
And this happens. For example, you go to an online store and you buy clothes or whatever, and you sign in with your email address, and then have the clothes delivered. From the point of view of that store, you're a very attractive customer. Like, compared to some random person on the street, you're far more likely to make another purchase there, right? So what that store then does is they go to Instagram or whatever, and they're like, hey, here's an advertising campaign. Here's also a list of all the email addresses of our customers. Could you please show those ads to those customers? And so from the point of view of that store, that's great news. They can advertise to you, you're a high-potential customer. But from the point of view of Instagram or Facebook, Meta, whoever owns it, it's even better, because they have not just your activity on Instagram and Facebook and WhatsApp, et cetera, they also know, hey, you're a customer of that store and of any other stores that have done the same. So lots of ways to track you, and that's the voluntary part, voluntary data sharing by third parties. But in 2018, I believe, we introduced Firefox Monitor, which basically keeps track of data leaks that happen. And if you sign up for Firefox Monitor, it will warn you if your data was found in a data leak, if there was a hack or whatever, to remind you, hey, maybe you want to change your password. But what we saw is that many people use the same email address at different services. So if there are data leaks at different services with the same email address, you can link the data in the two data leaks together and know that they're about the same person. And many people, probably not all of you, but other people, also use the same password across different websites.
So even if there's not a data leak at a different website, by using your email address and your password, they're still able to extract data from those other websites. So that's obviously not great. So in 2020, we introduced Firefox Relay. And what Firefox Relay basically does is it provides you with a unique email address per service. So if you have a sign-up form where you need to leave your email address, you hit the Relay button, and it will generate a new email address. So, I don't know, zqf40@mozmail.com, for example. And it will forward all email that goes to that email address to your true email address. So the store will still be able to communicate with you and send you emails at that address, but they won't have your actual email address on file. And so if there's a data leak there, your data can't be linked to your data elsewhere, because you have two different email addresses there. So that's Firefox Relay. Yeah. Oh, actually, I'm doing well on time. You're welcome, Francesca. So then lastly, phone numbers. So kind of a similar thread, right? It was super annoying when I started to use Relay: cool, you can leave your email address, and I'm like, I've got my Firefox Relay address, you can't catch me, and then they're like, can you give your phone number, too? Bit of a shame. So what we did is, late last year, we introduced phone masking for Firefox Relay. I added this graphic from our website, but it works similarly to email masking. You get a phone number mask, so a new phone number, and all the text messages and all the phone calls that go to that phone number will be forwarded to your true number, without you having to share your true phone number. And so if you get, for example, a text message on your mask saying, hey, it's your bank, could you please go to this website and change your password, that's an additional signal where you can see, well, this did not come in via my true phone number, which is the one my bank has, so this is probably a scammer.
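The masking idea described here, one address per service with everything forwarded to the true address, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions and not Firefox Relay's actual implementation: the `MaskRegistry` class, its method names, the nine-character random local part, and the example addresses are all hypothetical. Only the `mozmail.com` domain comes from the talk.

```python
import secrets
import string

MASK_DOMAIN = "mozmail.com"  # mask domain mentioned in the talk
ALPHABET = string.ascii_lowercase + string.digits


class MaskRegistry:
    """Toy model of per-service email masks that forward to one true address."""

    def __init__(self, true_address):
        self.true_address = true_address
        self._masks = {}  # mask address -> label of the service it was given to

    def create_mask(self, service):
        # Draw random local parts until we get one that is not in use yet.
        while True:
            local_part = "".join(secrets.choice(ALPHABET) for _ in range(9))
            mask = f"{local_part}@{MASK_DOMAIN}"
            if mask not in self._masks:
                self._masks[mask] = service
                return mask

    def forward_target(self, recipient):
        # Mail to a known mask goes to the true address; anything else is dropped.
        return self.true_address if recipient in self._masks else None


registry = MaskRegistry("me@example.org")
shop_mask = registry.create_mask("clothes shop")
ticket_mask = registry.create_mask("concert tickets")

# Each service only ever sees its own mask, so a leak at one service
# cannot be joined with a leak at another via the email address.
print(shop_mask, "->", registry.forward_target(shop_mask))
print(ticket_mask, "->", registry.forward_target(ticket_mask))
```

The sketch stores only the mapping and never the forwarded messages themselves.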
This is a pretty new addition, so unfortunately for probably most of you, and also for myself, especially given that I work on it, it's not available outside the US and Canada yet, but hopefully soon. But yeah, so the point there is that lots of this work is the broadening of the scope beyond just tracking cookies. That's all quite recent work. As you saw, from 2018 onwards, we started to broaden our focus. That's also around when the security and privacy products team started. So yeah, we're not done yet, obviously. This is the overview of the timeline so far. And yeah, if you have any ideas of what we can do, do leave them on Mozilla Connect. And that's, I think, everything I had. Damn, my practice run was so much slower. Thank you very much. And so does anyone have any questions? So especially with the IP and email protection, it really seems like you are a single point of failure. What do you do to mitigate any risks there? So the question was whether, especially for Mozilla VPN and Firefox Relay, Mozilla is kind of the single point of failure there. And yeah, that's true. So the point of failure here is, for example, for Firefox Relay: imagine, and we're doing everything to prevent it, but imagine there's a data leak at Firefox Relay. What then happens is your email address still gets public, which is annoying, obviously. You're basically back at where you were before, right? But there's just one place where that could fail, i.e. Firefox Relay, rather than all those places that would otherwise store your email address. So yeah, it's annoying. Ideally, we'd have, I'm not sure, I'm sure Luke has probably a couple of ideas around minimizing that, which we might look at at some point. It looks like you have an idea. Because especially if you're like the central place where you store everything, you become a very attractive target. Yeah, that's true. So yeah, if we're a central place that has lots of email addresses, obviously we become a bigger target. That is true.
I would say that right now, Firefox Relay, and VPN as well, aren't that big; there are far bigger data vaults. So that's definitely something we're aware of and want to minimize. I think right now, at least for me personally, it's still worth it, right? The risk is smaller when using Firefox Relay. If I want to order a ticket, for example, like a concert ticket, I'll give them a Relay address. They'll send me the ticket and I'm done with them and I never need that again. If you're interacting with your bank, for example, I would use your actual email address. So yeah, there's definitely a risk calculation to make there. All these products are not a perfect solution. I think that's basically the thread in all this. It's all about trying to find that balance. Trying to block cookies, for example: it's all trying to find this balance of how much tracking do we want to allow and how much breakage do we want to allow, and there's always tracking that we can't prevent and there are always downsides of our anti-tracking measures that we can't prevent. So yeah, I wish I had a better answer, but that's it. We have a question also in Matrix. It currently doesn't work with Chalk. Ah, okay. So Danny on our Matrix is asking about Firefox Relay phone number masks not working with a bunch of tools. That is something we're aware of and working to fix, for those who are familiar with phone number masking. Yeah. So my question is, is it correct to state that at the moment none of the services have been demonstrated to actually treat our data with end-to-end encryption, so that you can't see our data? And the follow-up question to this is, do you have any plans to implement end-to-end protection? Is it even realistic in this case? So the question is, do we end-to-end encrypt the data we're handling? Obviously that's not applicable to everything, like cookies, et cetera.
For emails that's basically not possible, because we need to know your email address to actually be able to forward email. We don't store the emails we forward, so we get them, forward them to you, and forget about them. So... Damn it. VPN, actually I'm not sure. Maybe you're biased the way I am. Is that encrypted? I think your connection via VPN is encrypted, right? So, yeah. VPN works in a way that you generate like... We're using one. Sorry for putting you on the spot, but... Hi. Yeah, so the Mozilla VPN works using the WireGuard protocol. So on the client, you generate your own private key, and you only upload the public key to the Mozilla VPN network. And then during the handshake between the server and the peer, you generate the session key. So essentially, even if your public key gets leaked, we can't see anything. Yeah, so VPN is end-to-end encrypted. We can't see what you pass through there. Hey, I wonder, how does Firefox Relay prevent spam? What kind of mechanism does it use to catch spam? So there's a basic spam filter. We make sure that if... So if we forward an email and you mark it as spam in your inbox, Firefox Relay gets a signal and stops forwarding those emails to that email address. You know, it's spam. Does it use SpamAssassin or Rspamd? I wouldn't know. So the question is, what service do we use to detect spam? Come to our Matrix and I'll find out for you. Yeah, our backend engineers will know that; I'm not involved with that part of the implementation. But we do have a number of anti-spam and anti-abuse mechanisms in there, because we don't want to get blocked by other people as well. Thank you for the talk. It might sound like a stupid question. You mentioned the three or four products like Relay, VPN, and Monitor. Are they free, including the VPN? No, so the VPN is paid. I would also recommend you not to use a free VPN. They're almost all shady. Quite a few of the paid ones are also pretty shady, actually.
No, so the VPN is, I believe, if you pay annually, I think five euros a month. I see nods, so that's good. Firefox Relay has a free plan, which gives you just five email addresses. So that works, but if you want to give every service a unique email address, that's not great; for that, it costs one euro a month. And Monitor is free. Yeah, and so part of what we're doing at Mozilla is building these privacy-protecting tools. We're also trying to find ways to have those finance themselves. So yeah, we're not trying to sell ads or anything. You pay for it, you're the customer. That's the idea there. Last question. So I think it's pretty obvious that the technologies that are being used to track us now kind of almost require us to have a sort of centralized entity to resolve it, whereas cookie protection could be done in a decentralized way. And that obviously makes Mozilla as an organization kind of a target of critique, I guess, because you become a central point for some sort of tracking. Is there some sort of track or something that happens at Mozilla to mitigate the risks of being a single entity that could store too much information? For instance, in terms of Relay, is there a separate entity, or an entrenched thing, that tries to ensure that you don't store too much data, so that at least you can't try to monetize it yourself? Yeah, everyone could hear it, right? So one strategy there that we have at Mozilla is our lean data practices, and that means collect as little data as possible. That's one way to mitigate that. And that's why, with Firefox Relay, we only store your email address; we don't store the emails we forward. You can choose to store associations, like which masks did I use on which website, but you can also choose to have that stored locally if you have the extension, or just not store it at all. And yeah, in terms of a separate entity, it's definitely something we'd like.
I'm not sure of initiatives going on there, and we do... no, I've lost my train of thought, but yeah. Thank you very much. Unfortunately, that was the last question we have time for. Please put the rest in the Matrix room. Thank you. |
The Digital Services Act 101
What is it and why should you care |
Thanks, everybody, for coming. So yeah, I am not from the corporation, I'm from the Mozilla Foundation. And so besides being the non-profit behind the Firefox browser, we do a lot of other cool stuff. And so for those of you not familiar with it, I do want to go briefly over that. So the Mozilla Foundation, of course, like the corporation, is mission-driven and believes that the internet should be open, accessible, free, and fair. And so on the Foundation side, we do a lot of interesting research, a lot of it into social media, and a lot of it related to the accountability and transparency of social media, the kind of systemic problems that we see on a lot of the web. So some of you might have even participated in some of our research through, for instance, RegretsReporter: through data donations from lots of people, we were able to actually investigate YouTube's algorithm in a way that allowed us to get a view of it from the outside, right, because YouTube is not necessarily handing over that information right now. We also do policy work around legislative files that either have great potential for our mission or might affect our mission in certain ways. You might have also followed some of this work over the years. We do advocacy and campaigning. And then we publish every year an Internet Health Report. And so this year's Internet Health Report is really worth checking out, because it's actually in the form of a podcast. So you can find that and listen to those episodes. And MozFest also is taking place online, in hubs, in March. And then it will take place physically in a couple of different locations in June, and I would definitely encourage you to attend some of those sessions. Okay, but I'm going to talk to you about something very specific today. And I'm curious how many people here have actually heard of the Digital Services Act. Just raise your hand if you've heard of it. Okay, great. So it's a big deal.
I think it's a big deal. Great. And as I'm going to go through really an overview of the Digital Services Act, and particularly the obligations on a certain kind of actor, this is, yeah, this is an ongoing conversation. So I'm really curious about your thoughts on this as well. So I'm going to talk about why the EU decided to make a Digital Services Act, what is roughly in it, and what the next steps are and where we are in the process. Because while the Digital Services Act is agreed and is applied, it's in force, sorry, it's not actually been applied to the services yet. We have to wait until the end of the year and until next year, actually. So first of all, why create a Digital Services Act? So the DSA is really designed to update the e-commerce rules, which are now almost two decades old. So it really takes that over for the internet stack. And then it adds on to that, looking at the way that the web has changed today. As we know, in the last 20 years, a handful of large companies have essentially taken over the web. And so at the same time, there are many countries that are trying to address social media and web policy. And the idea is that the EU can do this in a harmonized way: rather than having one law in one country and another law in another, we make a harmonized playing field, which is then easier for providers as well as for the end user. And the DSA should really be seen also in conversation with the Digital Markets Act, which is probably really interesting to many of you here. And these really are two sides of a coin in a sense, and they're both different ways of addressing a problem of fairness. So the Digital Markets Act is really intervening in saying, OK, there are certain gatekeepers that have actually used their market power in an unfair way, and we need to update our competition law in certain ways. So that's sort of a macro-level fairness, fairness between different services.
And then you have the Digital Services Act, which is fairness in a micro sense. It's fairness between a service provider and the end user. Does that service deliver to the end user in a legible way? Is it transparent? Is it accountable for the service that it's ultimately giving them? And it tries to set a high bar when it comes to fairness towards end users. So two sides of the coin. Yes, what is it? What is the Digital Services Act? What does it do? Well, first of all, who does it apply to? As I mentioned, it's really taken over the e-commerce directive. So for the internet stack, it's essentially moving that across. So it sorts the services that it applies to into these four different categories. And the most interesting, the most important obligations are on this new category that it creates, which is very large online platforms and very large online search engines, of which there is one, maybe two. And also on online platforms themselves, and the idea of an online platform being different from a hosting service is that it's disseminating user-generated content publicly. But I'm really going to be focused on this middle blue block. So it comes up with this new term, which is a VLOP. Some people say V-LOP. I like to say VLOP. It's really up to you. But the idea is that a VLOP is a platform that is serving over 10% of the EU population, that it has a systemic role in our information ecosystem, and therefore it deserves to have some corresponding responsibilities. So that corresponds to 45 million active monthly users. You might ask, what is an active monthly user? It is a good question. And platforms are in the process of counting this up right now. And of course, there are going to be different ways that this can be counted. The commission has provided some guidelines around this. There's still some ambiguity. And also, very large online search engines don't fit so beautifully into the DSA structure.
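The arithmetic behind the threshold mentioned here can be written out as a short Python sketch. Only the 45 million figure and the roughly 10% share come from the talk; the function name, the choice to treat a count exactly at the threshold as over it, and the example services with their user counts are assumptions made for illustration.

```python
VLOP_THRESHOLD = 45_000_000  # active monthly users in the EU, as stated in the talk
EU_SHARE = 0.10              # "over 10% of the EU population", as stated in the talk

# EU population implied by the two figures above: 45 million / 10% = 450 million.
implied_eu_population = VLOP_THRESHOLD / EU_SHARE


def is_very_large(monthly_active_eu_users):
    """Return True if a single service reaches the threshold on its own count."""
    # Assumption: a count exactly at the threshold is treated as reaching it.
    return monthly_active_eu_users >= VLOP_THRESHOLD


# Hypothetical services with made-up user counts.
services = {
    "big-video-platform": 120_000_000,
    "mid-size-forum": 3_000_000,
    "hobbyist-server": 12_000,
}

for name, users in services.items():
    label = "very large" if is_very_large(users) else "below threshold"
    print(f"{name}: {users:,} monthly active EU users -> {label}")
```

How an "active monthly user" is counted is exactly the open point raised in the talk, so the input number is the hard part, not the comparison.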
This was added by the council, but the idea is that search engines that meet the same threshold should also be accountable in the same kinds of ways. So the DSA takes an asymmetric structure. That's usually not a difficult word for me. So the idea is that the larger you are, the more obligations you have. And there are key exemptions for small and micro enterprises. So this is important to, maybe, a small Fediverse actor. There are exemptions if you have only a certain number of employees or only a certain revenue. But for the other ones, there are real obligations and there is the possibility of fines. So the DSA is a thing to be taken seriously. And this gives a kind of outline of the different obligations, which I can go over more specifically for the very largest online platforms. So the DSA doesn't say what is illegal. What is illegal is defined by EU member states and by EU law. And the DSA really tries to take a systemic approach. It's not trying to play whack-a-mole and take down individual pieces of illegal content, because we know, we've been in this space for a while, we know it doesn't really work that way. I used to work on disinformation. It's not about a single piece of content. It's about the way that the service is providing narratives and amplifying certain things. It's really structural. And so the DSA is really trying to take the systemic approach, and it asks each very large online platform to go and say, what are your systemic risks? Do they relate to certain kinds of online violence that your service might be perpetuating? Do they relate to privacy? And the platform then has to address that and actually propose its own ways of addressing that systemic risk. It also does have some very clear ideas, though, about how platforms should be addressing their systemic risks.
For instance, they have to have advertising libraries where researchers can go through, actually anyone can go through, and look at the archive of what has been published as promotional content. Platforms that are over the threshold have to offer a recommendation algorithm that is not based on user profiling under the definition in the GDPR. So experts in GDPR might say, maybe we can still have a recommender system that's using some kind of engagement type of data. But the idea is that you have to have a really viable alternative. There are rules about online interfaces and design. So the idea is that you shouldn't have deceptive patterns that are tricking you into making decisions online that you wouldn't normally take. Data access for vetted researchers: it's critically important. We know this at the Mozilla Foundation, having researched very large online platforms ourselves. It's very difficult to know sometimes what's going on inside the platform. So the idea is that eventually there will be infrastructure for the largest platforms to actually share certain data sets related to their systemic risks with vetted researchers. This is a very long process of building up this infrastructure, making sure that it has really clear safeguards. But this is really important, and for those of you who read the news that Twitter is going to get rid of its free API, this is a really critical problem: how are we going to continue to hold platforms to account if we don't understand what's really happening inside of them? So there are many caveats on this data access regime: it really is about systemic risks, it really is about compliance, and it is for researchers. But this is a really, really important thing that's going to change over the next couple of years.
Codes of conduct: so like the Code of Practice on Disinformation, which now joins in the DSA, there will also be codes of conduct related to accessibility and related to other key areas where platforms and companies might be able to come together and decide for themselves, to an extent, what important measures they can take, and then that's backed up through the rest of the regulatory framework. And then crisis response, which we can maybe talk about in the Q&A. But so what are the next steps? Where are we in this legislative process, or now it's the implementation process, right? We have the regulation now. But there are a lot of things that are yet to be determined. So yeah, the draft was released in 2020, it's been a couple of years working on it with the institutions, and now we are coming up on a very important moment, which is: so it's in force, but who does it apply to? And so on February 17th, so mark your calendars, platforms that are likely very large online platforms or very large online search engines are supposed to notify the European Commission of this. But actually something kind of funny happened here, which is that it's not written in the DSA exactly how the commission will find out which platforms are in it. So platforms are told to put this number somewhere on their interface. So the commission might have to go and check a lot of platforms' interfaces. Platforms have been encouraged to actually email the commission. There's an email address that they can use. But if some of you have some free time in the next couple of weeks and want to find out if platforms are publishing their numbers on their interface, please do. That would be great. So yes, we'll see what happens on February 17th. And then for the largest online platforms, the DSA will actually apply to them already midway through this year, in July.
And then for the rest of the stack and for the smaller online platforms, it won't apply until 2024. So we have a bit of time. And this time is because there's a lot of secondary legislation, codes of conduct, voluntary standards; a lot of the real details of the DSA are actually yet to be really specified and spelled out. And that process is what's happening now. And that's why it's great that you guys are in this room and maybe you're interested in some of these details, which are in some cases really not details at all. They're very, very important. Yes, so one of these things is the designation of the very large online platforms. One of them is the data access regime. We have a secondary delegated act that will actually spell out what the data access regime is. Auditing requirements: I think I might have skipped that, but the very large online platforms and search engines have to submit themselves to an annual audit. They pay for this audit, but they have to submit to it. And this audit covers all of their due diligence obligations, a massive kind of audit. And the details of what this auditing regime will look like are actually being elaborated now in a delegated act, so that's really important. And we'll have a draft of that delegated act that you can look at and give feedback on; that should appear in the next couple of weeks, towards the end of the month. Yes, and then guidelines and various voluntary standards will come out on a lot of different elements of the DSA. And right now, regulators are really getting prepared. They have a massive amount of work to do. The European Commission for the first time is going to become a regulator, actually. This was not the case with GDPR, right? It was really the member states that were regulating. With the DSA now, for the largest online platforms, the Commission will be overseeing them.
But of course, everyone else is going to be overseen by regulators in their member states, so all EU countries are in the process of designating something called the Digital Services Coordinator, which will be responsible for overseeing compliance with the regulation in that member state. And a lot of very large online platforms are based in Ireland, right? And so that means that the eventual Irish Digital Services Coordinator will have quite an important role, probably not as important as the role Ireland had under the GDPR on data protection issues, but still a very important role. And the European Commission has recently actually opened up an entirely new institution, which is the European Centre for Algorithmic Transparency, ECAT. It has offices in Brussels, in Italy and in Spain, and they're going to be really serving as a kind of additional technical support to the Commission, to understand some of the research challenges that it has in front of it. And that's where I'm going to close. As usual, I talk fast, so I have a little bit of time, but that's okay. And if you have questions, yeah, I'm on Mastodon, I'm also on Twitter, but, and this is my, but this feels like a Mastodon crowd, so yeah. The hand went up really fast, yeah. Yeah, right there. Thank you very much. We also have a question in Matrix. A question about the DSA: where do the platforms have to publish or communicate their monthly active users? Yeah, that's a good question, right? So they're supposed to put it on their interface; I don't really know what that means. But they're supposed to put it on their interface, and they should also email the European Commission. So if that question is being asked by a very large online platform, you should email the Commission. Hello, thank you for the presentation. I was hoping to know more about the law itself.
Recently, there has been a condemnation of Facebook, and in the US there has been a lawsuit between LinkedIn and a small company about web scraping. And web scraping right now is the way for us, for developers and open source developers, to access the data on these platforms, because they are the only ones that have it; they clearly have the monopoly on it, yet the regulation actually is harsh. It could be interpreted as forbidding data scraping somehow; that is the point of these companies, saying that it's forbidden. I wanted to know if this new law pushes obligations or requirements for the companies, for these VLOPs, to have public APIs, or at least not to block applications that want to make the content free. Yeah. I'm not sure if that is clear, because I'm not a native English speaker. No, that was really clear. So the question is about web scraping and how the DSA will block or help researchers essentially study the large online platforms. I can't respond on web scraping specifically. It's a really important question, right? But as to the ability of the DSA to encourage and facilitate research, and especially through APIs: for the largest online platforms, again, it will require them to share data. It doesn't exactly say how they'll do this. It talks about interfaces, right? And this is again to be elaborated through the delegated act. So your question really is going to be answered over the next, you know, year and a half or so. But specifically, the platforms will have to share data and have to allow research. It won't necessarily be with everyone. There are clarifications about what a vetted researcher is, but, and we have this already in the text, at least in the article on data access, there are tiers.
So for a tier of data which is, quote unquote, manifestly made public, so if we think about Facebook, it's like a public page, that data should be available through an actual interface. It's essentially trying to mandate a kind of CrowdTangle. So this isn't necessarily everything that we want to see, and this also has to be elaborated further itself, but the DSA is trying to encourage this kind of research. Speaking of Mastodon, how does this definition of very large online platform work when it comes to federated services? Like if Mastodon grows to, I don't know, hundreds of millions of users, and all the European instances together hit this 10% mark in number of users, but none of them individually actually does. Is there any specification for this in the regulation? Yeah, so this should not be taken as legal advice for people who run Mastodon servers. I can tell you my understanding, and then I can say that the DSA was really not designed to try to penalize the Fediverse. The DSA is trying to encourage the Fediverse; it's trying to look at platforms that are largely profit-driven, right? But in the Fediverse, it would be treated, I'm fairly sure, according to my understanding, such that an individual server would be a platform, and so if you had one individual server that finally reached the threshold of 45 million active monthly users, which I think we're very, very far from, then there might be some obligations, but at that point you have a lot of other problems, I think, if you're running a server that's that big. Otherwise, yeah, I think we're really going to be looking at small and micro platforms there. As far as I understood, the DSA applies to organizations and companies, right? Are you aware of any initiative to do something similar when it comes to the profession of software engineering?
Because you could argue that people who build platforms also share the responsibility for the safety of the platforms. I really like that question. The question is, so the DSA is largely targeting companies, right, over platforms, but what about applying it to software engineers as such? I mean, it's challenging, and I'm speaking really my personal opinion here, but to hold a person responsible for their company, or the thing that they're providing, that's difficult. But there are some really interesting articles in the DSA where it will be interesting to see how companies interpret them and how individuals interpret them, like on the interface design. I think this is really up to designers to do a really great and transparent job on that. So yeah, I mean, it's difficult to have regulation applied to employees, right, that would be tricky, but it's an interesting question. Are there any other questions? So specifically, what it comes down to in the end are the details, right? So how can the Mozilla Foundation, and people in general, influence these details, and in the end also, like, the court cases? So I didn't hear the second part. What were the examples at the end that you mentioned? The question is, what about the details, which are really yet to come, including also court decisions? Yeah, I think there are going to be a lot of civil society organizations that are going to engage probably in strategic litigation. I think that will be a huge component of the next couple of years. As for the Mozilla Foundation, we're really interested in the implementation phase, right? There are areas where the Commission has very much asked for help on certain things, you know, they're consulting with stakeholders, there are public processes, the delegated acts, where they publish them and they want feedback. So we are really following those opportunities, and I think a lot of civil society actors are as well.
And then these codes of conduct, I think, are really interesting, and the voluntary standards where organizations and civil society actors can come together and add some details. So the research community has a huge role, I think, to play. So yeah, definitely Mozilla and definitely a lot of other actors are engaging, but there are so many opportunities to do that. So if you're interested in them, you can send me an email, actually. Is there any interaction between GDPR and DSA? There is, yeah. I mean, GDPR always applies, it's sort of a bedrock for a lot of these things, but there are a couple of really interesting articles where they have overlap. So, sorry, the question was the relationship between GDPR and DSA, and I'm not going to give you a holistic answer on that at all, you would have to ask a specialist, but I can tell you the provisions that I'm really interested in that overlap. So one of them is on the requirement for a very large online platform to have a recommender system not based on profiling. This is typically an area where they have overlap because profiling is understood under the GDPR. So it'll be really important to see how this is actually understood and how platforms will implement that and then how regulators will also see their responsibility to ensure that that's enforced. And that's one area, and there are a few other articles where there's also that kind of similar overlap. We have a question in Matrix. Do you fear any backfire action from companies to happen, like the GDPR created the pop-up hell of consent? Okay, the question is about backfire. Well, technically it wasn't GDPR, it was the ePrivacy Directive. It's a combination of the two that's responsible for cookie banners, but it is a really good question.
And this is where I think it's actually really important to be clear about what is compliance to the letter of the regulation and what is meaningful compliance for the actual end user whose experience you want to help, right? So it's possible that companies might take the regulation and, in their design of their new obligations, make something really annoying for users to get them to not want to do it. For instance, take this example again of the alternate recommender system. You could hide it behind eight clicks and you could make it change back the next morning. You know, you could make it somehow really boring. I don't know, but there are ways that you could make it unattractive while it still complies. And so I think it's important that companies decide to comply in a way that's also appealing and also fair, because I don't think that cookie banners are really complying with GDPR privacy in a way that is really fair. We are a bit out of time. Okay, just ask a quick question. So just related to that, especially in the context of dark patterns after GDPR: there is an enormous amount of dark patterns that appeared because of GDPR, because now companies are trying to make it impossible for the user to know what is consensual or not. And so it was clearly not too easy to regulate that, since everyone is doing it. So how would the DSA be different? How can the DSA then regulate it in a way that companies don't just play around? Yeah, the question is about dark patterns or deceptive design, and especially after GDPR. So there is a specific article in the DSA, Article 25, that looks at online interfaces. So it's not looking at any kind of patterns, but it's looking at online interface design and it's obliging platforms to design their interface in a way that is not deceptive and doesn't lead the user down a path they wouldn't intend.
So very much the authors of the DSA are trying to not have that happen again through this one article. However, the article isn't as strong as it could be, and so a lot of civil society advocates that were pushing for it to be stronger didn't get everything they wanted. That said, I see this as being really, really important and I think if platforms do look at this article and fully comply, we shouldn't have some of these dark pattern issues. But this, yeah, this is one of those details that is yet to be seen, and I think there's going to be a lot of pressure on this one provision and I hope it can withstand that pressure. Thank you very much. We don't have time for more questions, but feel free to pop them in the Matrix room and thank you very much. Thank you. |
Cache The World
Adventures in A11Y Performance |
And so, now we have two presenters, Benjamin De Kosnik and Morgan Reschenberger. Benjamin is a member of the Mozilla performance engineering team and Morgan is a senior software engineer working on platform accessibility, and Morgan, if you want to repeat your name, probably, with my pronunciation. No, you got it, you got it. Morgan Reschenberger, that's me. Yeah, we're going to talk to you about an accessibility project called Cache the World and the way that we're monitoring and measuring performance. So, I'm Benjamin, I am on the performance team and I'm going to be talking about the collaboration from the performance side. And I'm Morgan, I'm on the accessibility team. I'm going to talk about the accessibility side. We put the Matrix rooms for both of our teams here, so if you have topic-related questions after this, you can follow up there. So, here's the agenda. We're going to just talk a little bit about accessibility in Firefox. Morgan is going to go through an intro to the rendering and accessibility architecture and some of the changes that happened with Cache the World. I'm going to talk a little bit about how we're measuring performance and some of those questions and current problems. We're going to go through our future work plans and then we're going to open it up for questions. So, the first thing is scoping context for accessibility in Firefox. The goal is, of course, a faster accessibility engine and more performant web use for users, all users and especially users using accessible technologies. We also want to try to create a performance testing infrastructure that will be able to prove these things, and the more we change our internal infrastructures, we want to be able to make sure that we can catch problems. We also wanted to establish some accessibility metrics and we want to work in public with public dashboards that show the kind of performance that we're getting. We want to improve our documentation. We want to improve the debug experience.
And as such, we're going to talk a little bit later about the profile markers that Nazim talked about earlier, but specifically the accessibility problems, and we want to set up infrastructure for collaboration. So, the scope on this is we're going to be talking about screen readers pretty much only, and we're not going to be talking about any of these other accessibility technologies like screen magnification, contrast modes, on-screen keyboards, subtitles, any of that. That's all deferred till later in this work. So the context for Firefox and accessible technologies is not great from the free software perspective. Almost all our users are on Windows, and then you have a very small sliver of Mac and Linux, and Linux is like under a percent. We just have to know where we are, and that's where we are. In general, 5.5 percent of all Firefox page loads for the month of January had some accessible technology built in, and that's not evenly distributed across the OSs. We see a much higher use on Windows, and Linux isn't bad, Orca, yay. And then Mac is far below that. But for the most part, if we are talking about who is touching this work and who we have to care about, it's these Windows users. And then here, just for a little bit more context: in that 5.5 percent of page loads that use accessible technologies, what accessible technologies are they using? They're using mostly screen magnifiers, which is the black line, and then the purple line is speech rec in general, and then underneath that is NVDA, which is the Windows screen reader. So those are the top three that we really have to care about. Morgan? And so before we get into all the details about the performance work, I want to give you some background on how rendering works in web browsers and how it translates to the accessibility architecture that we're going to be talking about today.
So the general job of a web browser is to convert HTML and CSS written by web authors into visual navigable content, right? And we do this through a rendering engine. In Firefox, this is called Gecko. It has five different phases and stages that produce artifacts that are used in the following phases and stages. So first we parse the HTML document. This creates the DOM or document object model, which is a hierarchical view of the web page. Then we look at the CSS and figure out the style information for each node, what visual changes we need to make when we render. Then we do layout, which computes positional and size information for each of these nodes. It also constructs an artifact with that information called the frame tree, which becomes useful later. And then we do painting and compositing and rendering, which is the visual part of rendering. But this process is all extremely visual, right? And what if you do not navigate the web visually? What if you navigate it with technology like a screen reader, which turns visual content into audio? What do you do then? And how does a screen reader figure out what it should be telling you? Well, that's the job of the accessibility engine. So like we have a rendering engine, we also have an accessibility engine in Firefox. It doesn't have a fun name. So if you can come up with a fun name, you should let me know on Matrix. But what it does is it takes in those artifacts we talked about before, the DOM, the frame tree, style structs, et cetera, and it marshals them into a new kind of tree, which we call the accessibility tree, or I like to call it the a11y tree because that's more fun. But it takes all of those and computes accessibility-relevant information. So this is stuff like semantic role, name, the kinds of actions you can perform on an element, things like that.
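To make the idea of the accessibility tree concrete, here is a minimal Python sketch of deriving role, name, and actions from a DOM-like tree. It is only an illustration and not how Firefox implements it: the `DomNode` and `Accessible` classes, the `ROLES` and `ACTIONS` tables, and the `build_accessibles` function are all hypothetical names invented for this example, and the real engine also consumes the frame tree and style information.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    """Toy stand-in for a DOM node (hypothetical, not a browser API)."""
    tag: str
    text: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Accessible:
    """One node of the toy accessibility tree."""
    role: str
    name: str
    actions: list
    children: list = field(default_factory=list)


# Assumed, heavily simplified tag-to-role and role-to-action tables.
ROLES = {"body": "document", "h1": "heading", "p": "paragraph",
         "a": "link", "button": "button"}
ACTIONS = {"button": ["press"], "link": ["jump"]}


def build_accessibles(node):
    """Return the accessibles for a DOM subtree.

    The mapping is not one-to-one: hidden subtrees produce nothing, and
    nodes without a semantic role are skipped while their children are kept.
    """
    if node.attrs.get("aria-hidden") == "true":
        return []
    kids = [acc for child in node.children for acc in build_accessibles(child)]
    role = ROLES.get(node.tag)
    if role is None:
        return kids  # no accessible for this node, hoist its children
    name = node.attrs.get("aria-label") or node.text
    return [Accessible(role, name, ACTIONS.get(role, []), kids)]


dom = DomNode("body", children=[
    DomNode("div", children=[
        DomNode("h1", text="News"),
        DomNode("button", attrs={"aria-label": "Close"}),
    ]),
    DomNode("span", text="decorative", attrs={"aria-hidden": "true"}),
])
tree = build_accessibles(dom)[0]
```

In this toy example the `div` and the hidden `span` produce no accessibles, so the resulting tree has fewer nodes than the DOM it was built from.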
This is not necessarily one-to-one, like there is not a single accessible for every node in the DOM tree or a single accessible for every frame in the frame tree. We care about different things, which is why we have to build a new structure. And building the structure happens in the content process. We have one accessibility tree per web page. So let's take a look at how these queries happen from an assistive technology standpoint. So at the bottom here, I've got a couple different kinds of assistive technologies. These are ones that Benjamin mentioned on that graph from before. So we have screen readers, voice control, window managers, et cetera. These clients or ATs make requests to Firefox for web content information. So if you are navigating with a screen reader, the screen reader needs to ask what node is focused and what should I say about it to the end user. The way that those requests happen are through platform specific APIs, but they all hit the parent process in Firefox. The assistive technologies are separate applications. So they're communicating with Firefox through the parent process. Each web page lives in one or more other processes, one or more content processes, and is not reachable by the assistive technology directly. So we can't inject the screen reader into web content for a lot of reasons, security being one of them. All these calls go through the parent process. And there are some problems with this architecture that motivate what we're going to talk about next. So let's get into it. Like I said, computation of the relevant properties that the assistive technologies are requesting, that all happens using the accessibility tree in the content process. The result gets sent to the parent process from content via IPC, inter-process communication. This is slow and it's also synchronous. So if a call gets blocked or is taking a really long time in content, you can't do anything. The parent process just hangs. 
And because the parent process includes all of the browser UI as well, it just looks like Firefox is not responding, which isn't great. So what can we do about that? Well, our solution is this project we call Cache the World, which introduces a cache in the parent process that keeps track of snippets of content information that we need to compute and respond to those API calls. So we're trying to offload as much work as we can from content into parent. And this cache gets updated asynchronously based on content mutations. So we no longer have this problem of synchronous blocking IPC. Cool. So now I'm back and I'm going to talk a little bit about how we see if this stuff is working. So the first thing we did is actually not at all metric or measurement based, but it was more about helping debug in the profiler. So one of my great colleagues, Michael Comella, added some accessibility markers in the profiler to get us an idea of what's going on, where. You can see the specific calls here. And then I'm going to show you what it looks like in the profiler. So the red circle is where we start to drop into some of the accessibility calls. So watch this space because we're going to be adding more markers here. The second thing we had to do is really come up with how we test accessibility and what's going on here. There's a whole bunch of different screen readers, and they're all different, and each OS has a different strategy for dealing with this. So we have a huge complex testing matrix here. In addition, in terms of testing, we had to run a large number of variations to verify our results. We have five different variations starting with the baseline, and then we have caches on and off with the accessibility implicitly on by just plugging in a screen reader, and also with accessibility forced on with preferences.
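The parent-process cache described above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not Firefox's implementation: `ParentCache`, `ContentProcess`, and all method names are hypothetical, and a `queue.Queue` stands in for asynchronous IPC messages from a content process.

```python
import queue


class ParentCache:
    """Toy model of a cache living in the parent process (hypothetical)."""

    def __init__(self):
        self._entries = {}            # accessible id -> cached properties
        self._inbox = queue.Queue()   # stand-in for async IPC from content

    def push(self, acc_id, props):
        """Called by content; returns immediately, the parent is never blocked."""
        self._inbox.put((acc_id, props))

    def _drain(self):
        """Apply pending updates; props of None means the accessible went away."""
        while True:
            try:
                acc_id, props = self._inbox.get_nowait()
            except queue.Empty:
                return
            if props is None:
                self._entries.pop(acc_id, None)
            else:
                self._entries.setdefault(acc_id, {}).update(props)

    def query(self, acc_id, prop):
        """Answer an assistive technology request without a round trip to content."""
        self._drain()
        return self._entries.get(acc_id, {}).get(prop)


class ContentProcess:
    """Toy content process that owns the tree and pushes updates to the parent."""

    def __init__(self, cache, tree):
        self._cache = cache
        self._tree = tree

    def initial_push(self):
        # On page load everything is sent once; cost grows with page size.
        for acc_id, props in self._tree.items():
            self._cache.push(acc_id, dict(props))

    def mutate(self, acc_id, **changes):
        # After a content mutation only the changed properties are sent.
        self._tree[acc_id].update(changes)
        self._cache.push(acc_id, changes)

    def remove(self, acc_id):
        del self._tree[acc_id]
        self._cache.push(acc_id, None)


cache = ParentCache()
content = ContentProcess(cache, {1: {"role": "button", "name": "Save"}})
content.initial_push()
content.mutate(1, name="Saved")
name_after_mutation = cache.query(1, "name")
```

The design point the sketch tries to show is the direction of data flow: content pushes, and the parent only ever reads its own local copy, so a slow content process cannot hang a query.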
So we have a really large matrix of five on our task here, and then we were looking for specific problematic web content that would really trigger the worst case scenarios here. And in general, the worst case web content for this is really large static web pages. So what do we do? We added three specific sites. Actually, I think we have, like, five sites. But in general, the Wikipedia World War II page is a great test page for testing accessibility. We have some Searchfox links because we're Firefox engineers, and then the WHATWG HTML spec. So these are really large static pages, which is not necessarily how a lot of the web is built right now. But these are specific problem points that we wanted to be aware of and address. And then comes the question of, well, what are we measuring? What's important? And we have three general choices here. We have W3C Navigation Timing, kind of page load metrics, like OOG performance metrics, that segment browser page load into distinct phases: DNS, redirects, DOM parsing, and then content ready, page is loaded. We usually traditionally use visual metrics, but because of the nature of this, nope, can't do that. And then we have some internal benchmarks that are not really publicly accessible where we just try to look at specific code flows and time and measure. And that's really showing the most promise, frankly, and it's what we're going to be using more of in the future. And we're trying to work in public, and we have some public dashboards for this work, which are at the end here. Whoops. Sorry. So these are some preliminary results. This graph is a little hard to understand, and I'm sorry about that. We have the blue baseline performance. We have these dotted lines with the caches turned off. And then we have the ones with the caches turned on.
And so we're seeing, like, yeah, not great performance for these static web pages right now, at least on Linux. I think that actually varies on Windows. But we're seeing some wins and some more even performance on things like IMDb web pages, which aren't these pathological test cases. So in general, what we're going to be doing is we're going to be trying to align the profile markers that were put in to performance metrics using our internal tools at first. And we're just going to try to start measuring the actual cache creation time. And we also want to start paying attention to not just straight, classic page load, but we want to start thinking about page reload, tab switching. And one of the other leads on this project, Jamie, has a great blog post about those kinds of anecdotal performance measurements. We definitely want accessibility-first metrics. And we would like to get away from generic page load, tab metrics on this. We have a public dashboard, work in progress. It will continue to evolve as this work evolves. And then really quickly, future work. Yeah, so the accessibility team at Mozilla is responsible for a lot more than just the accessibility engine. We're also responsible for high contrast mode, zoom, Firefox front end usability and accessibility. So we've got a lot of projects apart from this that we're working on. But our main goal for this half is to ship the cache to release. We're currently in beta and we have a lot of promising results. So we're really optimistic about getting this to all of our users. We're also planning on working on optimizations based on the performance work that you're seeing here. We have a couple of optimizations in mind. Like, we know we can improve on cache granularity. But this work will inform the kind of work that we're doing next. And then the performance team is going to really try to get these Windows results in since we know it's so important.
At the same time, we want to make sure that Linux performance doesn't degrade. Also, we would like to put this into standard continuous integration test infrastructure. Kind of tune our markers, make sure we're measuring what we think we're measuring. And then things that we deem successful in a wide variety of web content, we want to try to push out to public telemetry so that we can actually measure much larger environments and users. And then, of course, all of the internal collaborations inside of Mozilla with Perftools and ETL and DevOps to try and make all the magic happen. We have time for questions. And if you have other thoughts, you can email us or, you know, Twitter. Are there any questions? All right. So complete. Yeah. We actually, on the slide deck, but not in our presentation, did have some additional resources and notes for people who are trying to work with accessibility, maybe new to it, so here are some resources for you to use. Again, Jamie's blog post, I'm going to really hype that again. Please read it. Morgan is going to put a video up that has to be redone because there is some internal stuff that can't be shown. But she has a great walk-through about how to debug CSS for accessibility. And then I have a web page on color and contrast for accessibility and how you can compute colors that work for a wide variety of people. And also I want to shamelessly plug that you can contribute to Firefox. And if you are interested in working on platform-specific bugs or front-end bugs or whatever, accessibility is a great place to get involved because we span a lot of components and we could always use your help. So if you are interested, we have an accessibility room on Matrix at the Mozilla domain and you should reach out and we are there. So. We will take a question.
You mentioned it is not safe to embed the screen reader directly into the web page because of security concerns. But now with caching, you are providing a little bit more information to this parent process. Are there any security considerations you have to look at or address doing this work? We are paying attention to the kind of information that we are caching. We don't want to give any private user information away. Largely, the information we are caching is already represented in the parent process in some form. But the way that we compute things is different than how DOM or layout or other parts of the browser compute them. We are caching really, really granular information as well. So, yeah, we are not currently concerned about security risk but that is a consideration. Maybe you already said, do you have performance tests with accessibility enabled right now? Yeah, that's what that website is. Oh, sorry. The question was do we have performance testing for accessibility? Yes, we are starting to do that. Is it just a matter of enabling accessibility and running exactly the same tests or are you doing something different for accessibility? Yeah, so the question is, what is the method there? You can contact me offline if we are running close. But we are using a standard framework for performance testing called Browsertime, which is open source. And, yes, what we are doing is we have OS specific handlers that basically start screen readers before we start running that and then stop them when we are done. So it is just RAII-style on that, yeah. And then porting that to Windows too. One of the difficulties with that approach that we are running into is that we are most interested in perceived performance. So we want to know how does the user feel about this? Like, is it perceivably faster? And that is really hard to do because screen readers are difficult to automate from that perspective. Speech rate is extremely variable.
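The start-before, stop-after handlers mentioned above map naturally onto a Python context manager. The sketch below is only an assumption about what such a handler could look like, not the team's actual harness: `screen_reader`, `STAND_IN_COMMAND`, and `run_page_load_test` are hypothetical names, and a sleeping Python subprocess stands in for a real screen reader such as Orca or NVDA so that the example runs anywhere.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def screen_reader(command):
    """Start a screen reader process before the tests and stop it afterwards."""
    proc = subprocess.Popen(command)
    try:
        yield proc
    finally:
        # Always stop the process, even if the test run raised an exception.
        proc.terminate()
        proc.wait(timeout=10)


# Stand-in for a real, OS-specific screen reader command (assumption).
STAND_IN_COMMAND = [sys.executable, "-c", "import time; time.sleep(60)"]


def run_page_load_test():
    """Placeholder for the real performance run (hypothetical)."""
    return "ok"


with screen_reader(STAND_IN_COMMAND) as reader:
    running_during_test = reader.poll() is None
    result = run_page_load_test()
stopped_after_test = reader.poll() is not None
```

An OS-specific handler would swap `STAND_IN_COMMAND` for the platform's screen reader and keep the same start and stop structure around the test run.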
You can do key presses and stuff, but it is really hard to get the kinds of measurements we want. So we are aware that the performance testing we are doing right now is a number and it is something that we can track consistently, but it isn't entirely what we would like it to be. And there are different strategies on the Windows screen readers about having to have the full page ready before we actually start in with the speech. And that is configurable and that is not the default setting on Linux, for instance. So Orca, I think, is actually pretty smart about this. And they can do partial reads and start the speech earlier. So we are not getting quite. We have a comment on that. Oh, sure. There is a question. Oh, here it is. Note that the caching in the parent moves information into a process that is not exposed to web content. There is nothing before that. It is not appearing here. Maybe. Yes, here. Oh, can you talk about how the cache is populated and invalidated? Oh, sure. How much time do we have? Two minutes. Okay. Go. Go. So the cache is populated from content. So it is a push-based cache. We aren't invalidating from parent because we can't observe content mutations from parent effectively. Each content process is responsible for monitoring its own mutations and pushing or invalidating stuff in the parent process as needed. We have an initial cache push that... Oh, no, sorry. On page load, we collect a bunch of information and push it always, so there isn't any sort of mutation that we're responding to there. That is one of our big performance concerns: the initial cache push scales by page size, and that's really costly. But... That's why you put all those big tests in there. Yes. So from initial cache push onward, we're responding to mutations in content from content. Yeah. Are there any other questions? Oh, yeah. Go into the limit. On the web app side, what may impact negatively the performance of the accessibility?
Like how could you design web content such that it's optimal for accessibility? That's a great question, and we'll come back at you later with an answer. Yeah, we're still kind of early in phase on this, but I feel it would be a great idea to do some kind of web content help to get people to know the performance choices they're making for accessibility. Yeah. Oh, yeah. Could we come up with some guidelines for performance learning and general guidelines for how to do performance accessibility? Request submitted. Thank you. So thank you very much. We are done. Thank you. |
Firefox Profiler beyond the web
Using Firefox Profiler to view Java profiling data |
Hello, everyone. And now we have a talk by Johannes Bechberger, I hope I pronounced that right. He is a JVM developer working on profilers and the underlying technology. He currently works on the GAP, modeling documentation, smaller utilities, and the Firefox Profiler. He's going to give a talk on Firefox Profiler beyond the web, using Firefox Profiler to view Java profiling data, and yes. Yeah, thanks for the kind introduction. Yes, I'm Johannes Bechberger, I'm working in the SapMachine team at SAP, we create the best JDK, so just download it, but I'm here because I worked on the Firefox Profiler in recent months, and yeah, I'm going to start now, because when I'm telling people that I'm working at SAP on the Firefox Profiler, they first ask me two questions, like first, wait, SAP does open source? Yes, SAP does open source, and quite a lot. For example, here at SapMachine, my team is working on the OpenJDK. For example, if you've ever used JDK 17, we have the JDK 17 maintainer in our team, so we do many nice projects in this field. SAP also works with the Eclipse Foundation or other parts, so yeah, we're doing quite a bit of open source, but how did I end up here talking to you in this Mozilla devroom as a JDK developer, which normally doesn't happen? So I had a problem, I had a project on debugging last year, so essentially I had on the one hand my IDE, and on the other hand the JVM, and I wanted to improve the protocol in between, so I had some test cases and some integration testing, for example, here they parsed some program and I did something with it, and I wanted to see why it was slower than I expected, and so what I wanted to do is just right-click in my IDE, tell it, hey, profile it and show me the profile, and that's all, and I wanted to do it with open source software, because I like open source and for our team open source is really key, so I didn't find anything, but I found some tools
that got into this direction, so essentially there were tools that produced some flame graphs, and Mario Fusco said a few days back: I love flame graphs, when you do something stupid it punches you in your face and it's impossible not to see it. So that's great, you can use flame graphs for a lot of things, but sometimes they are not enough. So when you regard visualizations, you have those easy tools that are easy to use, but don't have that many features, and on the other hand you have the big tool, it's called JMC, which has lots of features, but has a quite steep learning curve, so I had to write my own, because I wanted to have something with more visualizations than just flame graphs, but that was usable, also usable to the end user, and not just to the OpenJDK developer like JMC is, quite frankly, and I wanted to just right click on a method, on the test method, and just tell my IDE, hey, run the thing, so I don't end up writing so much code and boilerplate. So writing your own IntelliJ plugin, and that's the IDE I'm targeting, but it's quite the same for all the other IDEs too, like Visual Studio Code, is quite simple, you can just download some templates and work on them, so this I did in August in like half a week, but then it came to visualization options, so I looked around and thought, maybe creating my own visualizations like flame graphs and so on, but this turned out to be cumbersome, and really would take a lot of work, so I looked around for web-based visualizations, because you can just embed a web-based profiling view into your IDE, because for example in IntelliJ you have an embedded Chromium, and VS Code is essentially a browser anyway, so that's no problem. I found speedscope, which is quite nice, but the problem is it doesn't show anything other than stack traces and some profile timing information, and I wanted to show more. There is kind of an existing plugin for IntelliJ, but that only shows flame graphs, and so
we're back to step one, and then I found Firefox Profiler, and this is how I ended up here. So essentially Firefox Profiler is this large application that you've probably seen in some other talks today too, this is actually taken from a previous talk, because I was too lazy to run Firefox Profiler directly. You see here it does everything, and it's quite cool, and one of the many advantages is that it's developed by a small team that answers even stupid questions in a Matrix channel, so check it out if you have some questions, and they were open to working with me on my project, and also it's backed by a corporation, it's backed by Mozilla, which is quite cool, because it's not a one-man project like other tools are. And so what did I do to integrate it, and what might you do if you also want to use Firefox Profiler? It essentially boils down to creating a converter from my data format, the JFR file format, to the Firefox Profiler format, and then you put it in a server, because Firefox Profiler likes it when you can just say take this file from a URL, so you put it in the server, that's fine, for me I created a Javalin server, then you can just wrap it in an IntelliJ plugin or VS Code plugin or so on, and then I took the Firefox Profiler UI. You can use profiler.firefox.com, but you would typically just host it on your own, because Daniel Burke's in demos like Bolly today, when I have time, and you can control all the changes that come into your Firefox Profiler UI, and also you can use a modified version, which I did to put all my PRs that are still out there and not merged into one version, so the things you see later, that I made, will sometimes not yet be in the mainline Firefox Profiler. So shortly to the file format, so the file format is quite simple, it's just a JSON file, and this is zipped, and it has some metadata information like the name, the interval, some settings, and then you have the thread information, so for every thread you have a separate
section in the profile file, and you have a list of samples there, and this list contains essentially the time when the sample was taken, the stack, and the CPU delta, that is the CPU time that elapsed since the last sample, so it can be used to show CPU usage data. Then the stack is not the whole stack, but an index into the stack array, and this contains the list of stacks, where a stack is just a frame and a pointer to the previous stack, so the top frame, and then the previous stack, so it points back, and then the category. And of course the frame is an index into the frame array, and that contains the function and the line, so what you need for a frame. And then of course the function is not really the function, but an index into the functions array, and you get the point here, because name and file are of course not strings, but indexes into the string table. That's quite hard to debug sometimes, and I had many struggles with it, but it's quite performant, and it's easy for the front end to use. So after I explained to you how I did it, here's the plugin. I call it the Java JFR Profiler plugin. You can find it in the JetBrains Marketplace and on GitHub. It's open source, it's MIT licensed, I believe, and it works because JFR was open sourced with JDK 11, so it's all open source, just try it out. So how can you get it? Just open your IDE, go into the plugin install view, and type in Java JFR Profiler, and you get it. Then you get some nice buttons when you right click, and you can just click on Profile with JFR, and then you profile it. Here's a simple example application that just computes, in a kind of complicated way, a Fibonacci number or something like that, and you can see that it executes the program with some JFR-related options to profile it, and then it automatically opens the profile of the JFR file. You get the call tree, you can also look at how every frame is executed, whether it's in interpreted mode or JIT compiled, then you can 
double click and jump back to the IDE, so it has basic IDE integration, and you can shift double click and see the source view that was presented shortly in the previous talk. So you can see here that in this code, the recursive call is called the most, it's found the most often in the stack traces. Then you can have a function table, which lists all the methods combined, so from all the stack traces, and you see what method is used the most. That's not yet in the mainline Firefox Profiler. You can have some flame graphs, you can have some nice tooltips, and you can get some information on the profile. You can even upload it like a normal Firefox profile, so you can share these profiles and view them in the normal profiler.firefox.com, just with fewer features. And then we can also open any JFR file, and JFR is like the default, the de facto standard for profiling files in Java. You can see here that we also get some graphs that show us the CPU load and give us some summary on GC, like how much memory it requested. And what's also cool, we can not only see timing, but we can also see a call tree for other things, like for Java errors, for thread starts, where did they happen, or for object allocation. And when you go to Java errors, you see that this code creates an error every time, probably because the parsing fails. So we can also see, as the last part, the marker chart, so we can get all the events that the JVM emits. For example, we see here at the top that we had a drop in memory, in how large the heap was, and if we investigate, we can zoom in and see, okay, that's probably because a GC happened here, and this GC took like 10 milliseconds. This is quite nice, because you can start with the call tree as a simple thing, and then later drill in and deep dive into the data. And if you know async-profiler, it also supports async-profiler, and when you want to create the profile, you can decide whether you want JFR, 
built in to the JDK, or async-profiler, and that's all. But I think I still have some time, so I hope this works, because I can tell you a lot about how it might work and show you some screenshots, but here is the actual, hopefully working, prototype. So just right click on the main method, tell it Profile with JFR, and then it tells you, hey, I profiled it. Then it opens the profile, and you can just look at it, zoom around, and see that you have to select the main thread. You can jump back to the source code, you can shift double click, and everything. And the cool thing is you can even open, as I showed you, arbitrary JFR files. They shouldn't be too large, because then my program sometimes runs out of memory, it's still a prototype. So try it if you want, I would be happy to have any feedback, go to the issue pages. You find this plugin, as I said, in the JetBrains Marketplace. You find me on Mastodon, on Twitter, and on GitHub, but you also find the subprojects, so you can use parts of this project, like only the converter, or only the JFR to Firefox Profiler server. You find more information on my tool, and also background information to this talk, in two blog posts on my blog, and you can find the SapMachine team at @SweetSapMachine on Twitter, and find out about our great projects at sapmachine.io. That's it from me, thanks. Thank you very much, and we have quite a bit of time for questions, so... Thank you. Have you received any feedback from users about this tool, like some colleagues or...? Yes, good question. I received quite some feedback on Twitter, they were quite happy. I showed it internally at SAP to some colleagues, and they were quite happy. I also showed it to friends of mine, and that also went quite well. 
It still has, of course, problems, because it's a prototype, not everything might work, but yeah, I'm looking forward, so just give it a try. It's free, it's open source, open some issues on GitHub if you like. I will steal the mic. Did you have to forward some patches upstream to make it work in the first place? Yes, a lot, so for example... You mean because they had bugs, or because you needed to adjust it for the... For example, this feature here is not yet in master, but what I added was, you saw this here, the listing of the categories, this is something that I added. And I opened pull requests for the function table, which is not yet in because it's far more work. I added some resizing and searching, that's, I think, in. So I opened some pull requests that are in, but not all of them, and I hope I get the rest in in the next few months. You can give beer to the upstream developers that are here to get the patches in. Yeah, but I fixed all the bugs that they wanted to be fixed, so I think we're working more like colleagues. I will ask a follow-up question quickly. Did you have to do anything Java specific in that case? There's nothing that is really Java specific. Only the converter is Java specific, because it takes Java data and processes it, but everything else in the UI is generic. I was wondering if this could be useful to profile Java applications that run native code and the other way around, and how would that work? Speaking of, for example, Firefox for Android, I don't think we have a lot of insight into the performance of our Java code, and that would be useful to have. So, it can in some ways work with JNI or something like that. 
I think async-profiler is a quite nice profiler that I also support, and this gives you information on native frames, and I'm working on getting a profiling API into OpenJDK that improves this a lot. But if you want to know more about the struggles, come tomorrow to my talk in the Foojay room, you'll hear enough about this topic there. Hi. Yeah, thanks for the presentation, very impressive. I actually have two questions. The first one is, what is it about Firefox Profiler that made you choose it? And the second question is, do you see the potential for a tool like this to have a web-based editor? Yes, the first thing, why did I choose it? Because it's a nice project and I didn't want to write everything on my own, and it contained lots of the features that I already wanted. And I could ask questions, because the thing when you're working alone on such a prototype is that you don't get any feedback. Here I got a lot of feedback on my progress, and it was quite a good learning experience. The other thing is, as I said, the project consists only of a tiny IntelliJ wrapper around the other code, and that code can be used to integrate it into your Spring web app, to show it directly in the web. I have some ideas on this, just follow my blog or my Twitter to know more about this. So what do you think about the upload local profile feature? Is this an important feature in your opinion? Is it part of the features that made you choose Firefox Profiler? Yeah, it's quite a great feature, because in open source you could then just tell other people, I have some performance issue here, for example in issues. 
It's essentially the same thing Mozilla does, and I hope I can open it up for use in a corporate setting, for example by adding features to use Microsoft OneDrive for it, so it's safer, because currently you're uploading it to the web, and that's not that great when you're doing internal company stuff. But this feature could then really make it easy for people to just say, hey, here I have some problem, and be told, give me your stack trace, just upload it, and you're fine. So maybe the next feature could be a server, and making it configurable. The problem is, it's just me currently working on it, and it's just a prototype, so I have to see what could be implemented in the future. Any other questions? We still have three minutes, so if you have questions, we can take them. Thank you very much, do you want to add anything else? Just try it out. I would like to have some users, because I want to know what features are important to you, and also where you find problems. For example, someone found that the Windows support wasn't there, so on Windows it just crashed. So if you find any issues, please drop by, and come to the Matrix channel of the Firefox Profiler. There you can find me sitting around all day, and I can also answer questions directly there. Thank you very much. Thank you very much. |
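Editor's sketch: the index-based file layout described earlier in this talk (samples point into a stack array, stacks point to a frame and a previous stack, frames point to functions, and names and files are indexes into a string table) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions: the field names (`strings`, `functions`, `frames`, `stacks`, `samples`, `prefix`, `cpu_delta`) and the helper `resolve_stack` are hypothetical and simplified, and are not the actual Firefox Profiler schema.

```python
# Simplified, hypothetical layout. These field names are illustrative only
# and are NOT the actual Firefox Profiler schema.
profile = {
    # Every name and file is stored once, here, and referenced by index.
    "strings": ["main", "fib", "Fib.java"],
    # Each function: index of its name and of its file in the string table.
    "functions": [{"name": 0, "file": 2}, {"name": 1, "file": 2}],
    # Each frame: index into the functions array, plus a line number.
    "frames": [{"function": 0, "line": 5}, {"function": 1, "line": 12}],
    # Each stack: its top frame plus the index of the previous (caller) stack.
    "stacks": [
        {"frame": 0, "prefix": None},
        {"frame": 1, "prefix": 0},
        {"frame": 1, "prefix": 1},
    ],
    # Each sample: time in ms, index into stacks, CPU time since last sample.
    "samples": [
        {"time": 0.0, "stack": 1, "cpu_delta": 0},
        {"time": 10.0, "stack": 2, "cpu_delta": 9},
    ],
}


def resolve_stack(profile, stack_index):
    """Follow the prefix links and return readable frames, top frame first."""
    result = []
    while stack_index is not None:
        stack = profile["stacks"][stack_index]
        frame = profile["frames"][stack["frame"]]
        func = profile["functions"][frame["function"]]
        name = profile["strings"][func["name"]]
        file = profile["strings"][func["file"]]
        result.append(f"{name} ({file}:{frame['line']})")
        stack_index = stack["prefix"]
    return result


if __name__ == "__main__":
    for sample in profile["samples"]:
        print(sample["time"], resolve_stack(profile, sample["stack"]))
```

The indirection is what makes such files compact but, as the speaker notes, harder to debug: nothing is readable until every index has been followed.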
Localize your open source project with Pontoon |
So we are at our penultimate talk for this dev room, and we have Matjaž Horvat, I hope I pronounced it right, who is a staff software engineer on the Mozilla localization team, and he's going to talk about how you can localize your open source project with Pontoon. Thank you. Hello, everyone. First of all, I would like to thank you all for coming all the way to Brussels to listen to me. I really, really appreciate it. I hope you're having a good day today, and you're going to be having a good day tomorrow. As Francesca said, I've been an engineer with Mozilla for some time now, and today I wanted to talk to you about localization, specifically how we do localization at Mozilla, and hopefully how it can benefit you as well, be it within Mozilla or some Mozilla-related project or not. But first things first, I have to mention something very important. This is Inti. She's my oldest daughter, and she just turned seven today, and her dad is at some conference with geeks, spending time away from her. But can somebody take a picture of... Thank you for that. I actually brought her to Brussels, so she's here. So we spent like an hour together today, and by the time I get back home, she's going to go to bed. No, I'm kidding. But I actually wanted to make this talk shorter, because I want to spend more time with her, so sorry about that. It's going to be a pretty short talk. And then if you're going to have any questions, Eemeli, you'll answer them, okay? Is that fine? Thank you. Eemeli's my colleague over there who's going to do the last talk today. So a big round of applause for Eemeli for being the last to speak today. He really appreciates that. Okay, back to localization and to some serious business. This is actual data. There's just 13% of Firefox users that are based in the U.S. That's maybe not very surprising. What could be a little bit more surprising is that 60% of all Firefox users use a non-default locale, the default being en-US, American English. 
In case it's not obvious, what I'm trying to say is that localization matters. It's actually very important. We all, me included, often think of localization as an obstacle or something that we're going to do later or we're going to do one day. But it actually really matters, because apparently it keeps the door shut if you don't localize your software. I want to say a few things first about how localization actually works at Mozilla. It's driven by hundreds if not thousands of contributors, volunteers, who spend their free time contributing to Mozilla because they like it, or because they like the products that Mozilla develops, or they like the mission, or they care about their language. We're truly grateful that we have such an, as we call it, army of awesome people who are, as you saw earlier, basically responsible for 60% of the Firefox market share. There's not just Firefox. As you'll see later, there's many, many more projects that Mozilla localizes. The platform that we use for localization is called Pontoon. It's like a classic translation management system through which localizers interact. But it's basically, as I mentioned, just an interface. The actual strings, the actual English strings and translations, are stored in repositories. So usually that's GitHub, I think also GitLab. Sometimes there's also hg.mozilla.org. That's what we call a single source of truth. And then Pontoon is basically just an interface, because many of our localizers are, surprise, not really developers, and don't really want to work with repositories directly. So it's much easier for them to make contributions through a tool that is hopefully not much more complicated to use than, say, an email client or Facebook. As you can see from this page, this is a profile page of one of our active localizers. We really like version control systems, in particular GitHub, as you can see from a particular widget on this page. 
And the way things work is that a localizer would log in, and they start by picking their team, their locale, say they localize software to French. And they start on the French page, in this case, which has some basic stats, some basic information about the locale in general. And more importantly, it lists all the projects that this community localizes. This is a screenshot, so I can't really scroll. There's 35 projects in total that the French community localizes. I think in total we have 36, and they are being translated to over 200 different locales. For those of you who are not familiar, the difference between a language and a locale is that Spanish is one language, but then you have several variants of Spanish, for example Spanish Spanish or Argentine Spanish or Mexican Spanish, and those are locales. All specific variants. So a localizer would go to this page and pick one project, for example AMO front-end, which is not fully translated yet. And then the translate view opens up, which is again a pretty straightforward page. On the left you see the list of strings, and in the middle you have, on top, the source string, and then the text field into which you enter translations. And then in the bottom right corner you see two tabs from which translators get some inspiration. You get suggestions from several machine translation engines and from translation memory, and you can also look into how other locales might have translated the same string. There's two ways most of our teams operate. One is that some localizers submit translations directly, which means as soon as they are submitted to Pontoon they end up in the version control system and can be used in the product. The alternative and more common way is that localizers just submit suggestions, and those suggestions then need to be approved by our trusted localizers, who have worked with localization for some time and have a proven track record of submitting quality translations, and then they get into the repository. 
So here in this case, on the left we're seeing strings with corresponding suggestions, which are then approved by a reviewer. Maybe one more detail around this. Since you see the source string and the translation also in the sidebar on the left, the status boxes on the left are actually check boxes, so you can select multiple strings and approve them at the same time or reject them all at once. One last thing before I stop with the presentation of Pontoon. We're currently working on a pre-translation feature, which essentially engages machine translation and translation memory. As soon as source strings get exposed in the repository to be translated, and as soon as they are served to localizers and localizers get notifications, hey, new strings are available, these strings get pre-translated using a combination of translation memory and machine translation. So if we find a perfect match, we use translation memory. If we don't find anything usable in translation memory, we fall back to machine translation. This is a pretty controversial topic, because pre-translation can yield interesting results. Thank you. That means that we're really slowly rolling this out for particular project-locale combinations where there's an actual need, where, for example, locales are falling a little bit behind, but at the same time they have reviewers who are active enough to hop in and correct potential errors that the pre-translation produces. Pontoon is open source, it's freely available, so there's actually other users of Pontoon outside Mozilla. We're not aware of many, maybe a dozen, but we also wouldn't know in case there are more. It's relatively easy to set it up. We sadly don't offer any official support, but if you do come to our Discourse, I'm going to show the links on the last slide, or to our chat, chat.mozilla.org, we try to help. But like I said, we don't offer any official support. 
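Editor's sketch: the pre-translation fallback described above (use translation memory on a perfect match, otherwise fall back to machine translation) could look roughly like this. It is a minimal sketch, not Pontoon's implementation: `pretranslate`, the dictionary-based translation memory, and the `fake_machine_translate` stub are all hypothetical names introduced for illustration, and a "perfect match" is assumed here to mean an exact lookup of the source string.

```python
def pretranslate(source, translation_memory, machine_translate):
    """Return (translation, origin) for one source string.

    translation_memory: dict mapping source strings to stored translations.
    machine_translate: any callable taking the source string; a stand-in
    for a real machine translation engine.
    """
    exact = translation_memory.get(source)
    if exact is not None:
        # Perfect match: reuse the stored human translation.
        return exact, "translation-memory"
    # Nothing usable in translation memory: fall back to machine translation.
    return machine_translate(source), "machine-translation"


def fake_machine_translate(text):
    """Hypothetical stub so the sketch runs without any external service."""
    return f"[MT] {text}"


if __name__ == "__main__":
    memory = {"Save": "Enregistrer"}
    print(pretranslate("Save", memory, fake_machine_translate))
    print(pretranslate("Save as…", memory, fake_machine_translate))
```

Returning the origin alongside the translation is one simple way to let reviewers see where a pre-translated string came from before approving it.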
There are some requirements that need to be met in order for a project to be localized with Pontoon. Obviously, you need to use GitHub or some other VCS backend as storage for translations. Then you have two options for organizing the files: either you follow a predefined folder structure, or you use our l10n.toml specification, which is then read by Pontoon to detect where the source files are and where the translations are submitted. Obviously, you need to store your translations in one of the supported file formats. Here's some of them. You might be familiar with Fluent. This is one of the formats that Mozilla developed. It's now basically slowly, Eemeli is going to talk about it in the next talk, transitioning towards MessageFormat 2, which is the format that is being developed. That's why there's an asterisk at the end. We don't technically have full-blown support for it yet, but we're working on that. Most common file formats are also supported by Pontoon. And once your project meets those requirements, you just need to create it on your Pontoon instance, which is typically a very simple step. You need to add a project name, select target locales, and add a link to your repository, and that's basically it. You save it, you sync it, and you have strings ready. Now the tricky part here is that you need your own instance, and that's a little bit more work than filling out this form. Like I said, there is documentation on how to do that in our repository. It has, however, been on our minds for some time now. We're testing the waters on whether there's interest in us creating something like a multi-tenant Pontoon instance, where you wouldn't need to maintain your own instance. You would just come and create your own project there and use that instance. Yeah, that's pretty much it. I would like to end here. 
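Editor's sketch: the idea of a configuration that tells the tool where the source files are and where translations go can be illustrated with a path pattern containing a `{locale}` placeholder. This is a minimal sketch under assumptions: the pattern strings, the locale codes, and the helper `localized_path` are hypothetical examples, and real project configurations support more than this simple substitution.

```python
def localized_path(pattern, locale):
    """Expand a path pattern by substituting the locale code."""
    return pattern.replace("{locale}", locale)


if __name__ == "__main__":
    # Hypothetical layout: one source file, one translated file per locale.
    reference = "locales/en-US/app.ftl"
    pattern = "locales/{locale}/app.ftl"
    for locale in ["fr", "es-AR", "es-MX"]:
        print(reference, "->", localized_path(pattern, locale))
```

Note that `es-AR` and `es-MX` are separate locales of the same language, matching the language versus locale distinction made earlier in the talk.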
This is the link to the repository, obviously, and all the links to Discourse and to chat that I mentioned, and the documentation, are there. You can also find me on Matrix or Twitter as mathjazz, or you can send me an email, and I'd also be happy to answer any questions here. Thank you. Thank you very much. So we already have two questions in the Matrix room. Does it support more complex translation like full articles, for example what we can find on support.mozilla.org? Short answer, no. Pontoon is designed to be a software localization translation system, and we currently don't have any support for, yeah, I don't know how to call it, articles, longer blocks of text. We sometimes abuse that, basically, and split some of the articles or some of our web pages by paragraphs into multiple strings, but that's not really it. That's not really the same as how Wikipedia localization works or how MDN localization used to work in the past. We have had a ticket on file for that probably since the first week after the Pontoon repository was created, but there has been basically no work on that. Not only do we try to help you if you want to set up your instance, we're also very happy to take patches. This one would obviously be huge. But anything that doesn't interfere with Mozilla's needs, we would definitely be happy to support. The reason why we haven't implemented that feature is that at Mozilla there simply was no real need for it, apart from the exceptions that I mentioned earlier. I hope that answers the question. We have another question from Sylvia. I wonder, why does Pontoon exist when other FOSS translation projects like Weblate exist? Was Weblate not yet around when the project started? Were there any specific features or design decisions you were missing that didn't work with Weblate? Not to say that Pontoon shouldn't exist, I'm just wondering what its unique selling feature is. That's a great question. 
I think it's good that people have options when they go to the store, and they can choose different types of milk or different types of cars. So it's sort of the same question as, why does BMW exist if there's Mercedes? I don't know Weblate too well, I have to admit that. I was at the presentation today, and from what I heard I think it's an amazing piece of software. I know that, for example, Mozilla is very eager about supporting natural-sounding translations through Fluent and MessageFormat. We have special UI for that. Maybe that also exists in Weblate, I don't know, but I would guess not, because Fluent never really passed the borders of Mozilla very intensively. So that, and the MessageFormat support which is related to that, would be one of the things that comes to my mind. But other than that, there's probably a bunch of other tools. I don't know if Pootle is still in development. There's also closed source systems. I think it's good that people have different choices, and somebody likes that type of UI, somebody likes other types of UI. So, can we add support for Firefox Translations in addition to Google and SYSTRAN? Is it easy to do? It's very easy to do. Actually, when we started working on pre-translation support, we wanted to only use machine translation engines that could be customized and trained with our own data. And when we were evaluating several engines, obviously Firefox Translations was the first on the list. The challenge at that point, and that was maybe half a year ago, things might have changed, was that the quality was a little bit lower, at least from our experience. We were using, I think, the BLEU score, and I think the BLEU score was about five to ten percent lower for the locales that were supported by Firefox Translations. And it's killing us, because we would like to support Firefox Translations, and I'm sure that one day we will. 
The other issue was that, at least at that point, there were maybe a dozen locales that Firefox Translations supported, whereas with AutoML it's around 50, and then there's 50 additional ones supported by the generic engine of Google. So yeah, hopefully we're going to extend support to Firefox Translations soon. And it's actually a good point, since adding an engine itself is quite trivial, that we should probably just add it, not to pre-translation, but at least to that machinery tab where you could get suggestions from. Shit, why haven't we done that? Thank you. We do collect that, yes. Oh, sorry, sorry. So the suggestion was that it would be nice to also collect telemetry to see which engine is preferred by users. We actually do that already for each translation that's submitted by just copying it over from translation memory or any of the machine translation engines. We keep track of that, and we can see that, okay, this engine is more likely to be used than the other. So one thing I was wondering regarding Fluent, for example, is that other libraries, for example the Translate Toolkit, do not have support for Fluent yet, and I was wondering if Mozilla was planning to help with the development of Fluent support in the Translate Toolkit. And another, related thing is whether there is any way of doing validations, verifications, because in our project we have a lot of very beautiful translators, but many times it's the first time they translate, so they make a lot of mistakes with the HTML and Markdown syntax, and I wonder if you have any kind of validation. Okay, thank you. So maybe I can split my answer into two pieces, one piece around Fluent support in the Translate Toolkit or maybe some other libraries, and the other about whether Pontoon has any sort of quality checks. So the first question. 
I think Eemeli will have a much better answer to that in the next talk, which is going to be about the MessageFormat 2.0 standard, which I see, maybe I don't see clearly, Eemeli is going to correct me, as Fluent 2.0. It's developed under the standardization bodies, and that, I think, means that wider support in multiple tools is going to come. If you're specifically interested in Fluent and in adding Fluent support to the Translate Toolkit, then I think we should definitely talk and see if there's an opportunity for that. It's already supported, so it's not going to be a question. Okay, apparently it's already supported. So, the Translate Toolkit already supports Fluent. That's the answer to the first question. Thank you. The second question is about quality checks, and that's actually related to the Translate Toolkit. Pontoon uses three different libraries for quality checks. Actually, two are internal Mozilla libraries, and another one is the Translate Toolkit library, which also has its own checks. So yes, if there are any obvious errors that can be automatically detected, we will most likely detect them. There's probably errors that we could detect but don't, but I think most of them we do. We work on improvements to our check system through developers telling us, oh, you broke our product. Okay, apparently our checks are not good enough. So over the years, I think our check system became quite bulletproof. Thank you. We have time for one last question, if someone has one. I don't see anyone. So, thank you very much, everyone, and thank you very much. There's a cake under the seat, just check it out. Okay. Thank you very much. Thank you. |
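Editor's sketch: the kind of automatic quality check discussed in the Q&A above (catching translators' mistakes with HTML or placeholders) can be illustrated by comparing the tags and placeholders of a translation against its source string. This is a minimal sketch under assumptions and is not how Pontoon or the Translate Toolkit implement their checks: `check_translation`, the regular expressions, and the `{ $name }` placeholder style are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Matches opening and closing HTML tags, capturing the slash and the tag name.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")
# Matches placeholders such as {name} or { $name } (assumed style).
PLACEHOLDER_RE = re.compile(r"\{\s*\$?([A-Za-z_][A-Za-z0-9_-]*)\s*\}")


def _tags(text):
    """Count tags by kind, e.g. 'a' for <a ...> and '/a' for </a>."""
    return Counter(slash + name.lower() for slash, name in TAG_RE.findall(text))


def _placeholders(text):
    """Count placeholder names used in the text."""
    return Counter(PLACEHOLDER_RE.findall(text))


def check_translation(source, translation):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if _tags(source) != _tags(translation):
        problems.append("HTML tags differ from the source string")
    if _placeholders(source) != _placeholders(translation):
        problems.append("placeholders differ from the source string")
    return problems


if __name__ == "__main__":
    src = '<a href="x">Brussels</a> welcomes { $name }'
    print(check_translation(src, '<a href="x">Bruxelles</a> accueille { $name }'))
    print(check_translation(src, '<a href="x">Bruxelles accueille { $nom }'))
```

A check like this only catches structural mismatches; it says nothing about whether the translation itself is good, which is why human review remains part of the workflow described in the talk.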
The Road to Intl.MessageFormat |
So, hello everyone, and thank you very much for resisting this late in the evening, and we have our last talk, and we have Eemeli Aro. Welcome to the localization system and tool chain management, and he's going to have a talk on the road to Intl.MessageFormat. Oh, the Intl.MessageFormat, I should have checked it. It's okay. Hi. Hi. So, the last talk by Matjaž, if you were here, was a lot about where we are now, what we can already provide now in Pontoon and otherwise, how localization is now happening at Mozilla. What I'm going to be talking to you about is what's kind of coming up, what are some of the next things in localization that we're working on and that we think are really quite important. And so, yeah, I'm on the same team as Matjaž, and I'm a staff software engineer, but I've been doing this sort of stuff kind of for fun for ages, it feels like now. It turns out that when you get really into localization in JavaScript in particular, there aren't too many other people who are that into it, and then somehow you might end up hired by Mozilla to do the things you were doing for fun, for pay. So that's kind of nice. Hint, hint, you know, it's a good company. In addition to working just on code at Mozilla, I spend a lot of time in a bunch of different standards bodies, working on the standards for localization in particular. And some of the work I'm presenting here is really work that's going on elsewhere than just at Mozilla, because we fundamentally want to make the world a better place, the internet a better place for everyone, not just Firefox users, but you know, everyone's internet isn't better if they use Firefox, but you know, you're here so you might have heard this one before, but yeah. On localization, this is again covering a bit of what Matjaž was saying, that quite often localization is one of those aspects of how you really build an application or a site or anything that comes up way too late. 
You end up making some choices early on, and then you end up needing to live with those choices later, and they might not be the best ones. And the need for localization comes after you've made the choices, or you discover that, hey, oh good grief, we need to support Arabic now, that'll be interesting. And the scope of localization is interesting because there isn't necessarily one right answer, so of course we're working on a new right answer, and you know, there's an XKCD comic on that. I don't have it on these slides, don't worry, but you know the one I'm talking about. So things could definitely be better. So we're trying to make some of these improvements happen. It should be easier to localize content, and there should be a common way of doing this, so that the experience and the benefits that you get from using software and libraries in one place can map to elsewhere. Right now, there's a lot of differences in how localization ends up, depending on the formats you use and the tool chains you use and all of this, and that is not optimal. And fundamentally, when you start getting deep into it, UI and UX design actually ends up being limited to some extent by the fact that most localization work is working around strings, rather than the complex structures that HTML allows us to represent, and other aspects that make life more complicated. So we want to improve all of that. So let's start with this. This is nominally something simple. Hopefully most of you can read HTML well enough to figure out that here we have this small little span that says that Brussels is the capital of Belgium. I've lived here, I know it's more complicated than that, let's just go on. And Brussels here happens to be a link. So how do we make this localizable? No, how do we actually localize this in a way that really works in the end for everyone? 
And one way that we're trying to sort of build towards is something a little bit like this, where you could add an identifier to the element there, where you say that this is the Brussels message that we're really dealing with, and include in the HTML something like what we have for CSS now, where you say that here's a link to a resource that's necessary for figuring out what's really the content of this page. And then separately you have a message, here in Finnish, because you know, I can, and I could not pick between French and Flemish, because it gets complicated. I've lived here, I know, Brussels and Belgium. And here, the format that we're using, I'm going to get to that later, but there's a couple of interesting things here, in particular the fact that we're marking up the Brussels text there as the text of a link, so that we'll be able to map that to the link, the a href that we have in the source document there in English. And because it's, you know, of course a little bit more complicated than this, it happens to be a link to Wikipedia. So in this particular case, but not usually at all, we could allow the translator to say that, hang on, this link in Finnish should really go to the Finnish Wikipedia page on Brussels rather than the English one. And I can present this to you, you can see the screen, you can kind of get what you're looking at here, but honestly, getting this to a state where you can get a translator who's not a developer to see this and understand what they're supposed to do, and not screw it up, and provide useful content in all the languages, well, the languages that this translator is working on, gets kind of hard. So we're trying to, you know, make that a thing. And the rest of this presentation is really going to answer these three questions that I kind of would have hoped some of you would be asking, but you're not. 
These are really the questions in my head that I wish you would be asking, and you might have them as well. The theoretical guy in my head might be asking: what's the format of this thing that we just saw, is this really going to work everywhere, and how is this going to make my life better now, or do I need to start using this whole new thing, which is going to be a pain, so I don't want to do that? To tackle the first one: the answer ultimately to all of that is to standardize everything. And the first thing we're going to talk about standardizing is the message itself. One particular thing that some of you might have noticed is that it had curly braces around the text there, around the "Bryssel on Belgian pääkaupunki". And this is because it turns out that when you're building a message formatting language like this, oh good grief, all the corner cases. It is hard, like proper hard, because you're trying to write a formatting language that developers understand, and then get the developers to write content in that language that translators understand, without needing the developers to necessarily understand how translators think. So you need to find an intermediate language for the communication that explicitly limits and forces the communication to happen in a way that works. And this is one of the reasons why some parts of this work have been active in the standards body for like three years so far. One reason for those curly braces is that quite often messages get complicated, because you need to vary different parts of them depending on different variables. In English, for instance, it matters whether it is a he or a she or a they who has done the action here of sending an invite to a party. So we need to have a language, MessageFormat 2, which I'm presenting to you here, to enable this sort of communication.
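To make the idea of varying a message on its variables concrete, here is a minimal sketch in plain Python. It is not MessageFormat 2 syntax and not any library's API. The variant table, the "*" catch-all key, and the guest count as a second variable are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical variant table keyed by (host gender, guest-count category).
# "*" is a catch-all key; this is an illustration, not MessageFormat 2 syntax.
VARIANTS: Dict[Tuple[str, str], str] = {
    ("female", "zero"): "{host} invited you to her party.",
    ("female", "one"): "{host} invited you and one other person to her party.",
    ("female", "*"): "{host} invited you and {count} other people to her party.",
    ("male", "zero"): "{host} invited you to his party.",
    ("male", "one"): "{host} invited you and one other person to his party.",
    ("male", "*"): "{host} invited you and {count} other people to his party.",
    ("*", "zero"): "{host} invited you to their party.",
    ("*", "one"): "{host} invited you and one other person to their party.",
    ("*", "*"): "{host} invited you and {count} other people to their party.",
}


def count_category(count: int) -> str:
    """English-only simplification; real plural rules differ per language."""
    if count == 0:
        return "zero"
    if count == 1:
        return "one"
    return "other"


def format_invite(host: str, gender: str, count: int) -> str:
    """Pick the most specific variant, falling back to the catch-all keys."""
    category = count_category(count)
    candidates = [(gender, category), (gender, "*"), ("*", category), ("*", "*")]
    for key in candidates:
        if key in VARIANTS:
            return VARIANTS[key].format(host=host, count=count)
    raise LookupError("no catch-all variant defined")


print(format_invite("Maria", "female", 0))
print(format_invite("Alex", "unknown", 3))
```

Even with two variables the table already has nine full messages, which is the kind of growth that makes a well-defined selection mechanism and tooling support matter.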
And of course it gets more complicated than this because you can have stuff like here we have a need to include something more in the message of the relative time like say three days ago that's included here. So the language needs to allow for internal variables for this message to be definable in a way that translators can kind of see what's going on and hopefully not touch it too much because hopefully they don't need to do that but still be able to do so if they really really need to. So this is about the space of what's possible in most current no in some of the current message formatting languages. At least project fluent which we maintain and work with and maybe one or two others. But when it gets really more complicated than that this is this gets on the edges of not really even supported anywhere. When you have here what we have are multiple different variables being defined and then the matching on which of these messages really the message we're building it depends on how many people as well as the gender of the host. So this isn't even a full listing of the whole set of possible when cases that could be selected here but this is all possible. Quite often happens when you really want to formulate UX experience that is approaching natural language and this is again referring to what I mentioned earlier a lot of this stuff just isn't is the choices that people are making now regarding message formatting how do they formulate it are driven by the limitations of the technologies that we have available for us. So UX itself is being driven in certain directions because message formatting is hard and you don't end up really having messages like this in your UI if you care about localization because whoever is filtering your messages before they go to the translators the localizers is going to tell you yeah no you can't do that they're not going to ever be able to work with it. 
So please fix and then you end up even maybe building the UI differently in order to accommodate these needs. With message format too which this is I kind of hope we can get beyond that have the possibility and the options of having even richer content in everything that we're working with. But the second question there was about is this really going to work everywhere and yes and we're doing that by trying to make much of the work happen at the lowest possible appropriate level for the work so a lot of this is happening in the Unicode consortium and then we've got work going on in TC39 for JavaScript. It's being added to the ICU libraries provided by Unicode as well and eventually we're hoping to get probably in what we do discussions ongoing about the structure of the HTML stuff that I was showing you earlier because that doesn't exist either yet. And one particular part of this I'm my background is as a JavaScript developer is that this is the first time we're really adding something to the JavaScript language itself at the level of like JSON.parse where you have this string representation of a thing that's not JavaScript and you get an object or a thing out of it. I think that's really cool but we're still working on it. And the part here that makes this extra interesting is that we're not just talking about a new syntax but effectively through the work we've been doing it's looking an awful lot like everything in every single message formatting language that currently exists and is in use somewhere that is, you know, that we can know about that is not like closed and proprietary is supported in the data model that we end up with for message format too. 
So, for example, to answer the earlier talk's question about how you get support for something like Fluent into software like Translate Toolkit: one quite probable answer for the general case is that you'll be able to take messages that you have in .properties files, Fluent, gettext, XLIFF, pretty much anything, and parse that into the defined data model structure for MessageFormat 2, then work with that using tools, runtimes, whatever, and possibly from there get it out in a different format altogether that's then supported by other tooling. So a lot of this work is figuring out that, hang on, messages aren't really all that complicated as data structures in the end, or we can at least express the level of the complexity, so we should enable this. Hello again. I think I was about done with this slide, so going on. One key part here is that all of this is already real. What I showed you in HTML is not exactly what we use internally at Mozilla, but it's effectively the same as how Firefox is already translated now. We have by now literal years of experience of working with tooling like this and seeing how it empowers the UI and UX development of a relatively complicated piece of software like Firefox to improve itself, and enables easier and better communication between developers and translators. We're bringing a lot of that knowledge and experience into what we're doing in the Unicode Consortium when designing MessageFormat 2, which is, yes, taking inspiration but also learnings from Fluent and many other systems, and that honestly makes it better than Fluent currently is. That is why we're now pitching it as the really cool, sexy thing, even though, if you're interested, Fluent is currently the coolest thing around that's real. MessageFormat 2 is still in progress, so you could be interested in that.
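The idea mentioned above, parsing messages from existing formats into one data model and writing them back out in another format, can be sketched roughly as follows. This is a toy model under stated assumptions: the `Text`, `Placeholder` and `Message` classes are hypothetical and are not the MessageFormat 2 data model, and the `key = value` input with `{name}` placeholders is a simplified stand-in for a real .properties parser.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Text:
    value: str


@dataclass
class Placeholder:
    name: str


@dataclass
class Message:
    id: str
    parts: List[Union[Text, Placeholder]]


_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def parse_simple_line(line: str) -> Message:
    """Parse a hypothetical 'key = text with {name}' line into the toy model."""
    key, _, value = line.partition("=")
    value = value.strip()
    parts: List[Union[Text, Placeholder]] = []
    pos = 0
    for match in _PLACEHOLDER.finditer(value):
        if match.start() > pos:
            parts.append(Text(value[pos:match.start()]))
        parts.append(Placeholder(match.group(1)))
        pos = match.end()
    if pos < len(value):
        parts.append(Text(value[pos:]))
    return Message(key.strip(), parts)


def to_percent_style(message: Message) -> str:
    """Serialize the same model as a Python %-style format string."""
    out = []
    for part in message.parts:
        if isinstance(part, Text):
            out.append(part.value.replace("%", "%%"))  # escape literal percent signs
        else:
            out.append("%(" + part.name + ")s")
    return "".join(out)


msg = parse_simple_line("greeting = Hello, {name}!")
print(to_percent_style(msg))
```

The point of the sketch is only the shape of the pipeline: once a message is held as structured parts rather than as a string, emitting it in another syntax is a small serializer.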
As I mentioned, the syntax itself for messages is getting defined under the Unicode Common Locale Data Repository (CLDR) Technical Committee; it gets complicated in these things. There's an implementation available in ICU 72 for Java, and the JavaScript proposals, there are two of them, currently at stage one, are progressing in TC39, which is effectively the body that defines JavaScript. There's also a polyfill package for JavaScript if you want to start playing around with what MessageFormat 2 looks like and how you can work with it. All of this is of course completely public: all of the repositories and all of the standards work are being developed completely in the open. And honestly, localization is one of those weird places where we don't need to filter anyone on credentials for anything, because in terms of who wants to actually participate in the standards work, it's enough that you show up and show some level of interest, and we'll let you in to all the inside clubs, because there aren't any. It's a community where, if you're interested, you should not be afraid of someone saying no, you don't belong here, because you do. We always need more people participating. There are links to me as well, and this talk is available at the URL there at the bottom. It's also attached to the talk on Pentabarf. But yeah, that was me. Are there any questions? The question is what really makes MessageFormat 2 better than Fluent. One particular example is that when you get to complicated stuff like this, it effectively enforces that the data structure you end up with is one that contains full messages that you end up presenting to translators. Other than this, it gets into really nitty-gritty details. The other big benefit of MessageFormat 2 over Fluent is that MessageFormat 2 is becoming a Unicode standard rather than effectively a project built entirely from within Mozilla.
So the question here is about the sort of typing that you see, the colon number and the colon relative time, and actually the colon gender is the same sort of thing. What are those, and are they custom or centrally defined? The answer is kind of yes and no, and it's complicated, because what you're looking at here are effectively functions that act a little bit like types, but they're not exactly types. They're declaring, for example, that the count that we're getting should be handled as a number, but also that the value we end up assigning to count should use an offset of one. So it's an operation happening on the input argument count. And on the third line, in the match for the host's gender, we could imagine host being some complicated object that defines a whole person, and we're picking the gender information from that more complex thing. But yes, in many cases they work kind of like types. In Fluent, these are the NUMBER, DATETIME and PLATFORM functions, which can be used in this sort of way as well. Just be loud; I'll repeat your question. I feel like people are going to start saying, okay, we can put links in now, and then it sort of escalates to: actually, in this locale this whole structure of the page doesn't make any sense anymore, so we have to switch things around a lot based on locale. I'm wondering if you have any thoughts on what's out of scope here, perhaps, and whether there are any other tools in development that you know of that deal not just with message formatting of small sentences but also with restructuring pages based on different locales.
So if I've understood the question is what happens when you come from a when you have a complicated thing like a whole page that you're translating and in comparing the source locale and the target locale, the target locale ends up having very different structure that might you know be go much deeper I suppose than just the simple link that I'm showing in this example of how does this really work. The answer is it's complicated and it depends on your use case. This work in particular is trying to build tools that could enable that sort of representation within message format 2 so you could end up somewhere really complicated but you probably don't want to. You're probably in that sort of a situation needing to build more tools that are more specific to the use case that you have. When you have when you need to reformat a whole page in order to do work with a specific locale it's there is no universal answer to this. This is the closest thing but I don't know where it's really going to go. We have a question in the live stream. Translators often are not programmers they already struggle when translating strings with HTML tags and other technical terms. The message format curly braces syntax might be difficult to understand and error prone. So here we're talking about something let's take this example of if you put this in front of a translator yeah you don't. This is not really what we want to do. What we want to do is create a format that enables a like HTML a representation of something like a message in a way that is relatively readable but is not necessarily easy to edit and modify for someone who doesn't exactly know what they're dealing with. A little bit like what happens if you take JSON and put it into a Word document and then you start editing it and then you have to figure out that oh there's a curly quote somewhere that ended up screaming. This sort of thing can happen entirely well when you end up dealing with complicated messages like this. 
So the answer here is that you end up using tooling that presents this to a translator not as one thing but as, in this case, three or more different messages. You ask a translator once to translate "name invited you to her party on relative date", and a second time to translate "name invited you to his party on relative date", and so on. And in Finnish you let the translator collapse these, because in Finnish he and she translate to the same word. So in Finnish the equivalent of this message would end up being effectively just the third case without the whole matching, because the structure of the language works differently. So when working with messages of this level of complexity, you do effectively need to rely on tooling, but the wonderful thing about MessageFormat 2 is that we can transform this representation of the message into any other representation that's hopefully going to work with whatever tooling is available for the actual translation work to happen in. So XLIFF 2, for instance, or other targets that are commonly supported by software used for translation, or some really simple representation that can be mapped back to this but still allows a translator to see one simpler thing at a time rather than a really complicated thing. I think there are more questions, but are we out of time? Two minutes. The guy in front, in yellow. My question is: I can guess that you are targeting the new message format as a successor of all previous attempts at message formats. It is relatively easy to make sure that everything representable in previous message formats is representable in the new one. How are you solving the problem of really encompassing all the different languages in the world? All the examples we saw were in English; perhaps some of the others might be in French or another European language. The case here is just for female or male, but there are languages with much more complicated noun systems.
In some languages you might be writing a single message in several writing systems. How do you make sure that the new message format encompasses all these different strange cases for localization? If I understand the question right you are asking how do you make sure that this isn't really what seems to work for English in a couple of languages around English but hopefully all the languages? Or a sizeable number of languages. The short answer here is that with Fluent we are already doing exactly this using representation of messages that is very close to this. For instance, at Mozilla from this experience we can say that the simpler than this structure that we have for Fluent ends up working in all of the languages that we need to deal with through Fluent. We need to deal with Fluent which is about 100 for Firefox, 200 overall for all of the different projects that we are currently translating. Separately from this the work being done for message format too is by no means done really from an English language point of view. With the main contributors currently working on the specification my background is Finnish, there's a Polish guy, then there's a Romanian, then there's a Sri Lankan and there's a couple of others who are on the periphery of this who are from a much wider variety of backgrounds than this. So, we are bringing in ensuring that these sorts of considerations are actively being remembered to be taken care of. To some extent we are relying on the expertise that we have, to some extent we are relying on the experience we have with working with similar formats than what we are presenting here. But also we are trying to build a core specification for message formatting that is sufficiently small but modular and powerful to then enable the support later on that is required by human languages. We are trying to limit to just being able to support human languages but it might go a little bit beyond that too. 
I think we are out of time, I am very happy to have people come and ask me questions after. Thank you. |
Welcome to the Network devroom |
Good morning and welcome to the Network devroom. Today we are very packed, so we don't have a lot of time. As you can see, we have several talks and there is little time between them. We start with the first session about, let's say, pure networking, and the second part is dedicated to Kubernetes and this type of container-based technology. We'll be talking about several topics, starting from network monitoring and security, and then, as I said, moving down, we'll move to Wi-Fi, to mesh, and then to Kubernetes. We try to have talks finish five minutes before the end of the slot, so that we have five minutes left for questions. That is probably not much, but this is what we can offer today. We tried to keep everybody in the program. |
Peer-to-peer Browser Connectivity
Leveraging WebRTC and the new WebTransport protocol to connect libp2p browser nodes. |
Welcome to the first talk in the Network devroom, peer-to-peer browser connectivity. We're going to talk a bunch about WebRTC and the new shiny WebTransport protocol, and in general how to get the browser connected to a larger network. First off, before we start, I'm very grateful to be here. Thanks to all the organizers, thanks to all the volunteers making this event possible. That's wonderful. And thanks to all of you for being here and listening in. Cool. Just a quick introduction about myself. I'm Max. I'm a software engineer at Protocol Labs. I'm stewarding the libp2p project. I'll do a brief introduction of what libp2p is, so don't worry too much about that. I'm maintaining the Rust implementation of the library. In a past life, you might know me from my Prometheus time. I worked a bunch on Prometheus and its integration into Kubernetes, and I'm still a little bit active in that community. You find me anywhere on the web as mxinden, and on the website you find emails in case you want to get in touch. All right. So what is libp2p? Just a small disclaimer. The talk does mention libp2p from time to time. It is not particularly important, so if you want to build your own peer-to-peer application, all the content here is applicable for you as well. But if you want to have this pre-built, you can leverage libp2p. So what is libp2p? libp2p, as you can infer from the name, I'm guessing, is a peer-to-peer networking library. It has one specification, and that specification is implemented in many different languages, for example Go, JS, Rust, Nim, C++, and Java, but a couple of others as well. The goal of libp2p is to provide low-level features like encryption, authentication, hole punching, and things like that, and then on top of that, leverage those features to also provide higher-level protocols.
Like, for example, a DHT, a distributed hash table, or gossiping protocols, or things like that. And my big slogan always is: libp2p is all you need to build peer-to-peer applications on the internet. Okay, wonderful. One small disclaimer that's important later on and that I want to highlight here: libp2p always encrypts and always authenticates, and we'll go into what that means later on. But that's very important for me. We don't ship any traffic over the internet that is ever unencrypted or not authenticated, and in terms of authentication, I'm talking about mutual authentication. Okay, that's enough introduction for today, and now to the actual topic. What I want to convey today is how we can get from here, from the left side, to the right side. My great motivation is for browsers to be first-class citizens in networked applications. Now on the very left side, you see the typical internet application today. You have a browser. I'm using the Firefox logo here, but you can use any browser, really. It tries to interact with a networked application somewhere on the internet. Instead of interacting with the nodes directly, it acts through a server, and that server acts on behalf of the browser, right? The browser pretty much never interacts with the whole network. To put this into an example, say you have file sharing. From my laptop here, I want to share a file with all of you. I would usually upload that to the server, and then all of you would download it from that server. We would never interact directly. Now there are many reasons for that to be a good architecture, right? Browsers usually move a lot. They might be in the living room, then in a cafe, and then at a conference like FOSDEM.
And they are usually low power. But the most heard argument for this kind of architecture, in comparison to the architecture on the right, is that you cannot connect to browsers, and that browsers cannot connect to other nodes. That's oftentimes heard, right? What I want to convey here today is that you can actually nicely connect a browser to a whole network, and that the browser actually has a lot of connectivity options out there, and I want to go through these. And the next time you design a networked application, maybe you want to consider the architecture on the right versus the architecture on the left. All right, cool. When it comes to connectivity for a browser, I want to differentiate in two dimensions. The first dimension is whether my node, for example my computer here, is public or private. So is it reachable directly, or is it behind a NAT and/or a firewall? For public, you would usually think of a server, and for private, you would, for example, think of my laptop or the browser running on my laptop. Cool. Then the other dimension: when we talk about connectivity, I want to differentiate between two platforms, browser and non-browser. Why is this relevant? Well, there are a lot more platforms, I know, but usually there is the non-browser, which is very unrestricted, in terms of, for example, having access to a UDP or a TCP socket, and then there is the browser, which is very restricted, where sometimes I can't make a connection without, for example, a valid TLS certificate. Wonderful. Okay. So my goal today is that we fill this matrix with the different options that we have, and this way I convey the fact that browsers can actually be first-class citizens in networked applications. All right. So let's talk about public non-browser to public non-browser. I'm in the Network devroom, so this is the easiest one, and I'm not going to explain this much.
Reachability: both nodes are public. We can just reach them directly over IP and TCP, or UDP and the shiny new QUIC. We don't have firewalls or NATs on either side, and the platform, which is non-browser, so for example an application running on my laptop, has direct access to the TCP and UDP socket. Cool. So we have that. Then private non-browser to public non-browser, again, really easy. You do this every day with any application on your laptop going to a server. We don't have any firewalls or NATs at the receiver side, so on the server side. The left side is private, but we don't really care, as we have the direction from the left to the right. And then the platform: again, we're not running in the browser, so we're pretty unrestricted. We probably have access to a TCP or UDP socket. Wonderful. To make this a little bit more complex, what if I'm a public non-browser connecting to a private non-browser? For example, on the left that could be a server, and on the right that could be some application running on my laptop right now. What we can do here is something called connection reversal, where my laptop connects to some public node, then whoever wants to reach out to me reaches out to that public node as well and relays a message to me, to my laptop, and then my laptop dials whoever wanted to dial me initially. This is depicted here: B connects to the relay R, then A relays a message over R to B, and then B can actually connect to A, which is commonly referred to as connection reversal. In terms of platform, again, we're a non-browser, so we have access to a TCP and UDP socket, and we're all good. Cool. And then the last one I want to fill in before it becomes complicated, namely before we introduce a browser, is private non-browser to private non-browser. You see this depicted down there as A and B. Reachability really sucks. Both are probably behind NATs or firewalls, so not much luck on either side.
So what we need to employ here is a technique called hole punching. I don't have much time today in this talk, but we have another talk later on. So if you want to learn all about hole punching, or what success rates we have across different protocols or IP stacks, join that talk. I think it's at 11:45. We'll go a bunch into that there. Just a brief summary: A and B want to connect. Both are behind firewalls. Both connect to a relay R, and that R is public. They coordinate a hole punch over that relay, and then execute that hole punch through both of their firewalls. Cool. In terms of platforms, again, we're not in the browser yet, so we have access to the TCP and UDP socket. All good. Life is pretty easy. Wonderful. All right. Now comes complexity, which is the browser world. What I want to talk about first is: what if I'm a private browser? Now, private browser is somewhat of a weird term. Usually you're not at FOSDEM and you don't have a public IPv4 or IPv6 address. So browsers are usually always private, which I'm not suggesting to change. Definitely not. There are many security considerations to keep it that way. But what if I want to connect from a private browser to a public non-browser? So what if I, for example, want to connect from my browser on my laptop to some server? Now this, again, sounds pretty easy to everyone, except for one small caveat. Again, we don't have a firewall or NAT at the receiver side, right? A server is public, depending on the firewall rules, obviously, but we can easily reach it. In terms of platform, we are in the browser, so we're quite restricted in what we can do. Eventually, I want to end up with a byte stream between the two endpoints. What I'm restricted to is, first, WebSockets. Everyone knows that. So TCP, TLS, HTTP, then an upgrade, and then I have a WebSocket.
The problem with that is I need a valid TLS certificate, so I need the remote server to have a signed certificate, either for its IP address or based on a domain. So that's a bummer. What I can do as an alternative in the browser is use the shiny new WebTransport, which is basically, and I'm simplifying a lot here, WebSockets on top of QUIC or HTTP/3. WebTransport actually allows us to handle self-signed certificates. And then as a last alternative, we can use WebRTC to get a byte stream. WebRTC gives us data channels, so in the end we can run on IP, UDP, then SCTP, and then use data channels from WebRTC. Now, before you scream that this is insecure: the small disclaimer that I made at the beginning is that in case you build this yourself, you still need to figure out proper authentication, right? Best would be mutual authentication, because with self-signed certificates you're not part of the authority trust chain. But otherwise, yeah, these are your options: WebSocket, WebTransport, and WebRTC. Cool. So what if I want to connect from a public non-browser to a private browser? We had this in the past, a couple of slides back. In terms of reachability, my left side is reachable and my right side is not, so I don't need to do fancy hole punching. I can just do connection reversal over the relay, where A basically asks B to dial it back over the relay. In terms of platform, we don't have direct access to the TCP or UDP socket, given that on the right side we have a browser in the stack, so that's a bummer. We can do WebSockets in case we have a valid TLS certificate signed by some authority. If not, we can do WebTransport and WebRTC. Cool. And now comes the very hard part, or not very hard part, but a little bit more difficult part, which is private browser to private browser, or what is basically the same, private non-browser to private browser, or private browser to private non-browser: all the red boxes down there.
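The transport options for a browser talking to a publicly reachable server, as described above, can be written down as a tiny decision function. This is only a sketch of the rules stated in the talk: the function name and the string labels are hypothetical, and nothing here calls a browser or libp2p API.

```python
from typing import List


def browser_transport_options(server_has_ca_signed_cert: bool) -> List[str]:
    """List the byte-stream transports a browser can use towards a public server.

    Follows the talk: WebSocket needs a certificate signed by a trusted
    authority, while WebTransport and WebRTC can work with self-signed
    certificates as long as the application authenticates the peer itself.
    """
    options: List[str] = []
    if server_has_ca_signed_cert:
        options.append("WebSocket")
    options.extend(["WebTransport", "WebRTC"])
    return options


print(browser_transport_options(server_has_ca_signed_cert=False))
```

With a self-signed certificate, the remaining options still require application-level authentication, ideally mutual, which is the caveat raised in the talk.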
In terms of reachability, we need to leverage hole punching at this point. Both endpoints are behind firewalls or NATs. Again, we'll go into more detail on how hole punching works in the later talk. Probably a lot of you are really familiar with that. In terms of platforms, at least one of our two sides is a browser. That means we don't have access to a TCP or UDP socket directly. Why am I always saying no access to TCP and UDP? That's relevant because you don't control the ports, and this way you don't have the capability of hole punching yourself. But what the browser gives us is WebRTC. WebRTC has hole punching built in, so what we can do is leverage WebRTC and some signaling server R in the middle to do the actual hole punch. WebSockets don't work because we can't hole punch with WebSockets, and WebTransport doesn't work either because we can't hole punch with WebTransport. Okay, wonderful. That concludes the whole matrix, and what I'm pretty much showing here is that you can connect the browser to everyone out there that runs on IP, and that means your application can actually make the browser a first-class citizen within your network. Cool. That's all from my end. I'll be around the venue for quite a bit. If you want to learn more about libp2p in general, which makes all this nicely packaged for you, you can visit docs.libp2p.io. If you want to see all the nitty-gritty details about the different transports and what that means for, I don't know, for example, self-signed TLS certificates or where you can hole punch, that would be on connectivity.libp2p.io. There are various forums, there's a specification online, and all the implementations are open source, so you can just check that out on github.com/libp2p. Cool. That's all from my end. Thank you very much. |
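As a recap, the connectivity matrix from this talk can be approximated by a small lookup function. This is a sketch under assumptions: the `Node` type, the `plan` function and the string labels are hypothetical names, the rules are only those stated in the talk, and a publicly reachable browser is not modeled because the talk treats browsers as always private.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    public: bool   # directly reachable, i.e. not behind a NAT or firewall
    browser: bool  # restricted platform without raw TCP/UDP sockets


def plan(dialer: Node, listener: Node) -> Tuple[str, List[str]]:
    """Return (reachability technique, usable transports) for one matrix cell."""
    if listener.public:
        technique = "direct"
    elif dialer.public:
        technique = "connection reversal"  # listener dials back via a relay
    else:
        technique = "hole punching"        # coordinated via a relay or signaling server

    if not (dialer.browser or listener.browser):
        transports = ["TCP", "QUIC"]
    elif technique == "hole punching":
        transports = ["WebRTC"]            # the only browser transport that can hole punch
    else:
        # WebSocket additionally needs a CA-signed certificate on the server side.
        transports = ["WebSocket", "WebTransport", "WebRTC"]
    return technique, transports


server = Node(public=True, browser=False)
tab = Node(public=False, browser=True)
print(plan(tab, server))
print(plan(tab, tab))
```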
Snabbflow: a scalable IPFIX exporter
A tour of the IPFIX exporter developed at SWITCH |
Yeah, so let's go right into the topic. I'm Alex. I work for SWITCH. SWITCH is the national research and education network in Switzerland. Most countries have something like us: in Belgium it's Belnet, in Germany it's DFN, and in France it's RENATER. So we connect the Swiss universities and universities of applied sciences; we are the ISP of those institutions. So, NetFlow. I'm not sure if everyone is familiar with NetFlow, so I'll just recap the essential thing about what a network flow actually is. When you look at an IP packet, you extract the source and destination addresses, the IP protocol, and, if the protocol is UDP or TCP, also the source and destination ports, and those five numbers identify a flow. Every packet with the same values is said to belong to the same flow. Then, in the simplest possible way, you just aggregate: you count the bytes and packets of all the packets that belong to the flow, and then you export this information to a collector where you can analyze the data. This is an old thing. These days people talk about network telemetry; back in the day when this was developed, that name didn't exist yet. I'm not sure when exactly Cisco came up with this, but it must have been the early or mid 90s. It was a de facto standard for a long time, where people just figured out what Cisco did and then did the same thing, and then it finally got properly standardized with the IETF's IPFIX standard. You can do this either in sampled mode or unsampled mode. Unsampled means you look at every single packet and account for it in the flow, and with sampling you just look at every nth packet, and then you have to make certain assumptions to reconstruct the actual values. We at SWITCH have been using NetFlow as the most important means to analyze our network data since the mid 1990s.
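The flow definition and the two accounting modes described above can be sketched in a few lines of Python. This is an illustration only, not how Snabbflow is implemented: the `Packet` type and function names are hypothetical, sampling is modeled as a deterministic every-nth-packet pick, and the upscaling simply multiplies each sampled packet by the sampling interval.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple

TCP, UDP = 6, 17  # IP protocol numbers


class Packet(NamedTuple):
    src: str
    dst: str
    proto: int
    sport: Optional[int]
    dport: Optional[int]
    length: int  # bytes


FlowKey = Tuple[str, str, int, Optional[int], Optional[int]]


def flow_key(packet: Packet) -> FlowKey:
    """Five-tuple identifying a flow; ports only count for TCP and UDP."""
    if packet.proto in (TCP, UDP):
        return (packet.src, packet.dst, packet.proto, packet.sport, packet.dport)
    return (packet.src, packet.dst, packet.proto, None, None)


def aggregate(packets: Iterable[Packet], sample_every: int = 1) -> Dict[FlowKey, List[int]]:
    """Count [packets, bytes] per flow; sample_every=1 means unsampled."""
    flows: Dict[FlowKey, List[int]] = defaultdict(lambda: [0, 0])
    for index, packet in enumerate(packets):
        if index % sample_every != 0:
            continue  # packet not sampled
        counters = flows[flow_key(packet)]
        counters[0] += sample_every                  # upscaled packet count
        counters[1] += packet.length * sample_every  # upscaled byte count
    return dict(flows)


trace = [Packet("10.0.0.1", "10.0.0.2", TCP, 1234, 443, 100)] * 4
trace.append(Packet("10.0.0.3", "10.0.0.2", UDP, 5353, 53, 60))
print(aggregate(trace))
print(aggregate(trace, sample_every=2))
```

In the sampled run the single-packet UDP flow is reported as two packets, a small instance of the estimation error that sampling introduces for short flows.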
It used to be that this was provided by the routers themselves, which is reasonable: the packets pass through that device, so the device has immediate access to the packets and can construct the flow data itself. Initially that was done in software, then in hardware. It used to be basically always unsampled, but with the advent of more powerful networking gear, and especially with the arrival of 100 gig ports, it became basically unfeasible to do this on the routers themselves, typically because of software restrictions but also hardware restrictions. If you want to do this in software, you usually can't, because the routers are not very powerful in terms of CPU, and in hardware it becomes very expensive. So the vendors started to implement basically only sampled NetFlow. These days, if you buy a Cisco or a Juniper box and you do NetFlow, you get sampling. And sampling is fine, of course, if you're only interested in aggregate data anyway. For big aggregated network flows between networks, for instance, sampling is perfectly fine: you make certain assumptions about the traffic, you scale it up, and you get fairly reasonable numbers. So why would you even want to do unsampled NetFlow? Well, there are a couple of use cases where it is really useful. In terms of security, one thing sampling is fine for is detecting volumetric DDoS. That's very simple: you basically have a constant packet rate, and if you just look at every nth packet, it's easy to scale this up. But if you want to detect a bot in your network, for instance, it's more difficult. Maybe you want to do this by looking at the communication with the command and control channels. Those are short-lived flows, and if you do sampling, you're probably going to miss them. With unsampled NetFlow you see every single flow, so you can identify these things.
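To see why short-lived flows tend to disappear under sampling, here is a rough back-of-the-envelope sketch. It assumes each packet is picked independently with probability 1/n, which is a simplification of "every nth packet"; the sampling rate of 1000 is an assumed value for illustration, not a figure from the talk.

```python
def miss_probability(flow_packets: int, n: int) -> float:
    """Probability that none of a flow's packets is sampled,
    modelling 1-in-n sampling as independent random picks."""
    return (1.0 - 1.0 / n) ** flow_packets


# A 5-packet command-and-control exchange under 1-in-1000 sampling
# is missed about 99.5% of the time.
short_flow = miss_probability(5, 1000)

# A long-lived bulk transfer of 100,000 packets is almost never missed.
bulk_flow = miss_probability(100_000, 1000)
```

Under this model the volumetric case scales up cleanly, while a handful of packets to a control server will most likely never show up in the sampled data.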
And we as a network operator use this fairly often to troubleshoot network problems. If a customer complains that they can't reach a certain IP address on the internet, we can actually go look in our flows for the outgoing TCP SYN packet and see whether there's a TCP SYN coming back in. We can do this because we see every single flow, so this is extremely useful. But as I said, we can no longer do that on our big new core routers; they only give us sampled NetFlow. So we started to do this with an external box, and that's where this Snabbflow software implementation comes in. I mean, there are always ways to do that, but they might be very expensive if you have to buy dedicated hardware, for instance. Just to give an idea of what type of traffic we're dealing with: Switzerland is a small country, we are a small network, and we only do NetFlow on our borders, so on the traffic that we exchange with neighboring networks. The peak values these days are roughly 180 gigabits per second, something like that, 20 million packets per second, and roughly 350,000 flows per second, unsampled. The flow rate can actually be much higher because of the aggressive scanning that has been going on for the past couple of years: hosts perform very aggressive network scans, like plain TCP SYN scans, as fast as they can, so sometimes a single host can easily generate 100,000 flows per second. The actual IPFIX traffic that the exporter sends is on the order of 200 to 300 megabits per second; that is the flow records themselves, and this is all for unsampled flows. The average flow rate is maybe just around 200,000 per second, and the actual NetFlow data it generates is roughly 1.5 terabits per day, so the actual scaling problem is more on the collector side than on the exporter side. We have 10 gig, 100 gig, and 400 gig ports, so that's what our solution needs to support.
So historically we did this on the routers themselves until a couple of years ago. Then we moved to a commercial NetFlow generator that did it in hardware, which was pretty expensive: the whole solution for one PoP was maybe 100,000 euros, something like that. And then we finally moved to Snabbflow, in pure software. So how do we do this? On the borders, these are all fiber connections, so we have optical splitters that create a copy of all the traffic. The primary device that these taps are connected to is what we call a packet broker. It's basically a switch that aggregates all the packets and sends them out on two 100 gig ports to our actual exporter box. In NetFlow we also want to keep track of the router ports where the traffic was sent or received, and that information gets lost when you aggregate the packets, so we use VLAN tags to identify the ports. The boxes we use are white box switches based on the Tofino ASIC, the ones that Intel just decided to stop developing, unfortunately. These are very nice boxes: there's one with 32 100 gig ports for 5,000 euros, and the other one has 32 400 gig ports and costs about 20,000 euros. The thing is, you have to program them yourself. When you buy them they're just plain hardware, and you can use the P4 language to program them. I link here to another project of mine where I developed the P4 program to do that, so that's also part of this entire architecture. Then the traffic gets to the NetFlow exporter box, which is currently just a basic one-rack-unit rack mount server. We mainly use AMD EPYCs these days, with a fairly large number of cores. That's the way we scale, with the number of cores. NetFlow always scales very well with the number of cores, because you just have to make sure that you keep the packets of a flow together.
The exporter has a Mellanox two-port 100 gig card that's connected to the packet broker; that's where it receives the packets. In a picture, this is what it looks like. On the upper left is our border router, and on the upper right is the border router of our neighboring network. In the middle you have the optical splitter, which is a completely passive box, and then you have the packet broker switch in between, which aggregates all the packets and distributes them by flow on these two links, currently. So these are now two 100 gig ports between the broker and the exporter. We can easily add more ports if that's not sufficient, and on the Snabbflow exporter we can basically just add more cores to scale. So now let's hear Max talk about the actual software. Hello, hello, does this work? Good, all right. Now that we know how Snabbflow is deployed, I want to talk about how it's built, how it scales, how you configure it, how you monitor your running application, etc. Snabbflow, as the name suggests, is built using Snabb. Snabb is a toolkit for writing high performance networking applications. It is written in Lua, using the amazing LuaJIT compiler, and it does packet I/O without going through the kernel. Generally the Linux kernel networking stack is slow from an ISP perspective, so Snabb bypasses that and uses its own device drivers. This is often called kernel bypass networking, and I think nowadays it's fairly common. Snabb is open source and independent; we're not sponsored by any vendor in particular. Snabb is built with three core values in mind: we prefer simple designs over complex designs, we prefer our software to be small rather than large, and we are open. You can read the source, you can understand it, you can modify it, you can rewrite it, etc.
So here I have a snippet of code taken directly from Snabbflow, unedited, just to give you an idea of what the Lua code that powers the usual Snabb application looks like. In this particular example we read a batch of packets from an incoming link and extract some metadata that tells us which flow each packet belongs to. Then we look up a matching flow in the flow table that we maintain. If we already have a flow, we count that packet towards that flow; if not, we create a new entry in the flow table. I've got one more snippet. This function is called every now and then to actually export the flows. We walk over a section of the flow table and add flow aggregates from the flow table to the next data export record, and if it's time to export the data record, we send it off to an IPFIX collector, which is a separate program. From a bird's eye view, Snabbflow works sort of like this: we read packets from a 100 gigabit NIC, the garden hose so to speak, we process those packets in a Snabb process to extract flow information, and then we send off data records over a TUN/TAP interface to the IPFIX collector. On the right side here you have a device driver that is part of Snabb, written in Lua, which handles the actual traffic, the bulk of it. On the left side you have an interface to the Linux network stack. Since the flow export data is rather small in comparison, you can just send it over the regular Linux network stack, and that works. On the very left you have the IPFIX collector, a separate program that we send the flow data to in the end. Sadly, or I guess just obviously, a single CPU core is not enough to handle 100 gigabits of traffic. So instead we use receive-side scaling provided by the network device. This way we can process n different sets of flows in n different processes running on n different CPU cores. Every circle here is a CPU core.
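The idea behind receive-side scaling can be sketched as follows. This is only an illustration: real network cards typically compute a Toeplitz hash in hardware, whereas this sketch swaps in CRC32 for simplicity, and the function names and packet representation are assumptions for the example. The property that matters is that every packet of a flow lands on the same worker, so each worker can keep its own flow table.

```python
import zlib
from collections import defaultdict


def worker_for(flow_key, n_workers):
    """Map a flow key to a worker index in [0, n_workers)."""
    data = "|".join(str(field) for field in flow_key).encode("utf-8")
    return zlib.crc32(data) % n_workers


def distribute(packets, n_workers):
    """Group packets per worker; each packet is (flow_key, length)."""
    queues = defaultdict(list)
    for flow_key, length in packets:
        queues[worker_for(flow_key, n_workers)].append((flow_key, length))
    return dict(queues)
```

Because the mapping depends only on the flow key, no worker ever needs to look at another worker's flow table, which is why this kind of workload scales well with the number of cores.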
And we also support repeating basically the same trick in software: we can do another round of receive-side scaling after filtering the traffic by protocol. This way we can process, for example, DNS traffic on a different set of cores than non-DNS IP traffic, and segregate the server resources by the workloads that we actually care about. We might, for example, care more about having an accurate general IP flow profile to send to the collectors, and if we still have some time left we will also do some DNS analysis, but we don't want one to slow down the other. Snabb programs are organized into independent apps. An app is a logical packet processing component; it could be, for example, a device driver or an app that implements the Address Resolution Protocol. These apps are combined into applications like Snabbflow using links. Links are unidirectional, they are really just ring buffers, and any app can have any number of them to use as input or output for packet data. You use those links as shown here. That's basically the API: you call link.receive on an input link to receive a packet, and you call link.transmit on an output link to send a packet. Now, to forward packets from one CPU core to another CPU core we have this thing called lib.interlink. Interlinks are really just like regular links, except that they span process and CPU core boundaries. You use them just like any link, with the same interface, and we use them to implement the software-based receive-side scaling that I talked about earlier.
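A link as described above, a unidirectional ring buffer with a receive and a transmit operation, can be modelled in a few lines. This is a hypothetical Python stand-in to show the data structure, not the toolkit's actual Lua API; the class name, the capacity, and the choice to drop packets when the ring is full are assumptions of this sketch.

```python
class Link:
    """Unidirectional fixed-size ring buffer between two apps."""

    def __init__(self, capacity=1024):
        # One slot stays unused so that full and empty are distinguishable.
        self._slots = [None] * (capacity + 1)
        self._read = 0
        self._write = 0
        self.dropped = 0

    def _next(self, index):
        return (index + 1) % len(self._slots)

    def empty(self):
        return self._read == self._write

    def full(self):
        return self._next(self._write) == self._read

    def transmit(self, packet):
        """Enqueue a packet; count it as dropped if the ring is full."""
        if self.full():
            self.dropped += 1
            return False
        self._slots[self._write] = packet
        self._write = self._next(self._write)
        return True

    def receive(self):
        """Dequeue the oldest packet; callers check empty() first."""
        if self.empty():
            raise IndexError("receive from empty link")
        packet = self._slots[self._read]
        self._slots[self._read] = None
        self._read = self._next(self._read)
        return packet
```

An app's processing loop would then drain its input link with `while not link.empty(): pkt = link.receive()` and push results to an output link with `transmit`.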
We also have lib.ptree, which implements a very strict segregation of control plane and data plane. I think for most networking folks the concept of control plane and data plane is pretty common, but just to recap: the control plane is something that is fancy and elaborate. You expect it to be really nice; you want a nice interface to configure and monitor your application. The data plane, on the other hand, you really just want to work. It should preferably run at line rate, and it doesn't really have any time to mess around. Since these two parts of the application have very different requirements, it's nice to keep them separate, and that's what we do. We also have lib.yang. Both the configuration and the application state of Snabbflow are described by a YANG schema. So for example you can tell the control plane to load a new configuration of Snabbflow, or you can query it for some state counters while it's running, and on this slide I have some examples of how you would use the snabb command line interface to do those things. Here we have a snippet of the Snabbflow YANG schema. YANG is one of those things where at the beginning you wonder if you're really going to need it, but once you have it you are usually really happy that you do. What I specifically like about YANG is that it's very expressive. If a configuration passes the control plane, and it isn't rejected as invalid, I'm pretty confident that this configuration will do something useful in the data plane and will not just crash. For example, here we have a list of interfaces, and one of the fields is a device, which is a PCI address. The PCI address type is attached to a regular expression that makes sure the value actually looks like a PCI address, so we can't just pass any string in there and have it validated somewhere way down the line.
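The same "reject early" idea can be shown outside of YANG. The pattern below is an assumption written for this example (the schema's actual regular expression is not reproduced in the talk); it accepts the common `domain:bus:device.function` notation with an optional domain, and `validate_interface` is a hypothetical helper.

```python
import re

# Assumed pattern: optional 4-hex-digit domain, 2-hex-digit bus,
# 2-hex-digit device, and a function number from 0 to 7.
PCI_ADDRESS = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$"
)


def validate_interface(config: dict) -> dict:
    """Reject a configuration whose device does not look like a PCI address."""
    device = config.get("device", "")
    if not PCI_ADDRESS.match(device):
        raise ValueError(f"not a PCI address: {device!r}")
    return config
```

A value such as `"0000:81:00.0"` passes, while `"eth0"` is rejected at load time instead of failing later inside the data plane.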
If you don't put in something that at least looks like a PCI address, then the configuration won't even be loaded. Sadly, any piece of software has bugs, and in our case even suboptimal performance is often considered a bug. We deal with that second issue, performance, by shipping Snabb with a flight recorder. This flight recorder has minimal overhead, it's always on, you run it even in production, preferably, and it stores useful data. Part of that data is really useful for profiling your application after the fact or while it's running. To analyze the collected data we have built a little UI. It's usually running on one of our development servers where we test stuff, but you can really run it anywhere. Did I mention LuaJIT? I did, right? So we're dealing with a JIT compiler here. The UI shows you stuff that you would expect from a profiler, basically where my program spends its time, but also some JIT-related stuff, like whether the compiler had issues generating efficient code for particular parts of my program. For example, here there's a JGC column, which is when the garbage collector is invoked from JIT-compiled code, and that's something to look out for. Another part of the flight recorder is a high resolution event log. It can give you accurate latency measurements of the pieces that make up your software. You can see here on the slide that the UI shows latency histograms for individual events. Some of these events are already defined in Snabb, but you can also define new events yourself. Here, for example, I can tell that processing a batch of packets and extracting the flow data, so the main loop of the IPFIX app, takes about 35 microseconds per iteration per process. This is really useful if you want to debug tail latencies, and tail latencies basically translate to dropped packets in our world, so that's really valuable.
So to close things: if you were to write a new application based on Snabb today, you would have all these things and more at your disposal. It is also possible to purchase consultancy services, like commercial support for Snabb and for developing Snabb applications, from your friendly open source consultancy Igalia, which is my current employer. So yeah, that's all for now, thanks for your attention. On the right there are some pointers and contacts. If you have questions or inquiries about Snabb or Snabbflow, you can email us there after the conference, or ask them now. If you have any questions, please ask them. Thank you. Please come down. There are some seats available here in the middle. The next speaker is Peter Manev, who is one of the key guys of Suricata, a very popular open source IDS, and today he is going to talk about this open source platform. Please have a seat here in the middle. |
What is an IDS and Network Security Monitoring in 2023?
Monitoring, Detection, challenges and solutions while chasing APTs, CVEs and Ransomware. |
All right, thanks for having me, folks. So let's have a quick chat about Suricata, open source, IDS, network security monitoring, and all those sorts of things: what it used to look like before and what it looks like now, at least from our perspective, all the things that we need to take care of, and the things that we still have challenges with. A quick overview of the agenda: we're going to talk about what Suricata is, how it started, how it evolved, the challenges when we're monitoring traffic, and how to get involved and help out. First of all, I'll introduce Eric. He's my colleague and he was supposed to be here today, but his flight got cancelled and he couldn't make it. We did the presentation together, so apologies; he actually just touched down at the Brussels airport. Eric is the CTO of Stamus Networks, and he's part of the OISF team; that's where I met him, during Suricata development and QA on the Suricata project. He's also a member of the board of directors and a Linux kernel developer, he also maintains Scirius and SELKS, and that's his Twitter handle there. So apologies, Eric could not make it, and that's why I'm making the introduction for him. He's a great open source colleague and a friend. My name is Peter Manev. I have been with Suricata for 13, almost 14 years. I'm part of the OISF exec team, where I usually do a lot of things around Suricata QA and trainings. I'm also the chief strategy officer of Stamus Networks, and I'm one of the maintainers of the open source distribution SELKS. I really like threat hunting and lots of open source around it, as I'm guessing lots of people here at the conference do too. So what actually is Suricata? Basically, a high performance network security monitoring engine that can do active or passive monitoring and also produces a lot of application and protocol metadata.
It's open source, GPLv2, and the code is available on GitHub, of course. When you plug Suricata into the network, it basically gives you this high level situational awareness of what's happening on the network: what's going on, something you should be aware of, something you didn't know, or confirmation of things. For example, we have a lot of angles like 'trust but verify', to use the famous quote. We have zero trust architectures, and you want to confirm that they're actually implemented. It's one thing to implement, it's another thing to confirm that it's there. Suricata is used by a lot of organizations and a lot of people. It's awesome to see that, and we are really, really thankful for it and for all the feedback. Sometimes organizations use it without even knowing, because it's embedded in a lot of vendor appliances and similar. So what can Suricata do? Basically there are a few options. It's an IDS, an intrusion detection system, so it can be in passive mode. It can be inline, as an intrusion prevention system, where it actively allows or stops a connection from passing through. It can also be used purely for network security monitoring: it works without rules and can just generate all sorts of protocol data, flow data, file extraction, and similar. It can also store everything it sees on disk, so it can do full packet capture. Very often you see users of Suricata running combinations of these, for example IDS plus NSM mode with all the network protocol data, full packet capture, and similar. There is also something new coming up in Suricata 7; the Suricata 7 RC was just released this week. It's called conditional pcap capture, and the code was introduced, developed, and donated by Eric. What it allows you to do is this: if you have an alert, it will save that whole session, so not just the packet but the full session.
That way you actually have the full sessions for all alerts stored in pcap, as opposed to having to do full packet capture, which in a lot of cases might be prohibitive. So basically we use the network to defend ourselves: observe, protect, adapt. That's what we do, and that's what we try to excel at every day. This is a quick snapshot from our website of some major features. To name a few: YAML for configuration, JSON for output, multi-threading, and hardware acceleration, which becomes more and more relevant these days because the speeds are going up and up all the time. There is network metadata logging for a lot of protocols. We are not going to name all of them here, but some of the main ones are listed, with more advanced parsing and automatic detection of HTTP, DNS, SMB, SMTP, TLS, all those guys and more. One thing that Suricata does very well is file extraction over SMB, FTP, HTTP, HTTP2, SMTP, all that. The cool feature is that it automatically deduplicates what it extracts: if the same file is seen 10 times, it will only be saved or extracted to disk once. In other words it saves a lot of space, really a lot. And of course there is support for SCADA. More of that is upcoming, and we need more and more of it, but it's also very relevant these days. So why the network? Well, because everything happens on the network. Everything from social media to finance, you name it, everything is on the network. The good people are there, and the bad people are there. If you have malicious actors getting in, in 99% of the cases even getting in happens over the network. And once they get in, they need to either exfiltrate or move laterally or do something else, and that's still happening on the network. So you have to be able to observe that. It's not the only place you need to observe from a security perspective, there are others of course, but the network is one of the major ones.
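The file deduplication mentioned above comes down to content addressing: name each extracted file after a hash of its content, and a second copy of the same file maps to a path that already exists. The sketch below illustrates the idea only; the directory layout and the `store_extracted_file` helper are assumptions of this example rather than a description of Suricata's internals.

```python
import hashlib
from pathlib import Path


def store_extracted_file(content: bytes, store_dir: Path):
    """Write content under its SHA-256 name unless it is already stored.

    Returns (path, written), where written is False for a duplicate.
    """
    digest = hashlib.sha256(content).hexdigest()
    # Fan out into subdirectories by the first two hex digits so that
    # no single directory grows too large.
    target = store_dir / digest[:2] / digest
    if target.exists():
        return target, False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(content)
    return target, True
```

Seeing the same file ten times results in one write and nine cheap existence checks, which is where the disk savings come from.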
And while you're doing all that, while you're confirming whether things are configured as expected, whether all the connections are correct, whether there is some anomaly, and things like that, you have to be able to identify and stop this activity. Granted, there are differences. A university network is totally different from a bank's corporate network; those are totally different things, believe me. So, network metadata logging. We provide a lot of metadata around any alert event, or metadata just on its own; you don't need to have an alert. The default output format is JSON, JavaScript Object Notation. On the right hand side here you can see a small snippet of an event of type HTTP, for the HTTP protocol. All the different protocols have their own event type and are logged like that, and next to that you also have alerts. So, file identification and extraction. One thing to mention is that file identification is done on the fly, and it's automatic regardless of extensions and similar things; it just uses libmagic and similar tools to identify a file. So if it's an executable, or a PDF, or a PDF with a TXT extension, we'll still be able to match and, if you want to, extract that file. You can also match on SHA sums and other attributes of the file info events. As I mentioned, extraction is deduplicated, and that really saves a lot of conversations with finance about sizing. We also have some pcap capabilities. What I mean by that is, besides the fact that Suricata can store everything on disk and do full packet capture, Suricata can also read pcaps.
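Since the output described above is one JSON object per event, each tagged with its event type, a first look at a log needs very little code. The sketch below counts events per type. The `event_type` field is part of Suricata's JSON output; the sample records are made up for illustration and far smaller than real events.

```python
import json
from collections import Counter


def count_event_types(eve_lines):
    """Count events per event_type in line-delimited JSON output."""
    counts = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        counts[event.get("event_type", "unknown")] += 1
    return counts


# Illustrative records only; real events carry many more fields.
SAMPLE = [
    '{"event_type": "http", "http": {"hostname": "example.com"}}',
    '{"event_type": "dns"}',
    '{"event_type": "alert"}',
    '{"event_type": "http", "http": {"hostname": "example.org"}}',
]
```

In practice the same function can be fed an open log file, since iterating over a file yields one line at a time.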
That part is also used a lot in sandboxes. There are a few out there if you Google 'sandbox', for example, and a few even offer free public services. To name one, ANY.RUN is like that: when a sample is detonated they automatically save the network traffic as well, and that is also run through Suricata just to see what network protocol data and alerts are in there. Suricata can read a single pcap. You can also spin up a Unix socket, let's say, and feed pcaps to it every now and then. Or you can point it to a folder, and it will keep reading from that folder until it has read all the pcaps; if there is nothing there it will pause, and when you throw a pcap into the folder it will continue automatically. It's really automated in that part. And of course you need to have multiple instances, just to be on the safe side, for when you want to move that new change that you made in QA into prod. So, passive monitoring: how does it actually work? Here's an example. Basically we hook up to a tap or a mirror port on a switch and we just sniff the traffic. When you deploy that, you're probably going to need multiple sensors in different scenarios. It depends on what you have: on-prem, cloud, virtual infrastructure, and all that, so you probably have to have multiple sensors. But in general this is where it sits in passive mode, and it logs all the alerts, the protocol metadata, any file extraction events, pcaps, and all those things, based on the monitoring that it does for that network. Now, active monitoring is a little bit different. We sit inline, so we allow traffic to pass or not. This is a bit more business critical. Well, not a bit more, it's much more business critical, because it's actually able to stop traffic, so a lot more testing and due diligence in QA is needed there. But basically, how does it work?
Basically you have this setup here. You have a user, an employee, who receives a malicious document. In a lot of cases it all starts with some sort of link or attachment that is unintentionally opened or clicked by a user. So you have a network request, and usually there is some sort of signature or rule that inspects the traffic and says yes, you can pass, or no, you cannot pass. This is a very basic example here, for example a PowerShell script, and based on that there is a verdict: yes you can go, or no you cannot go. That's the inline mode, so there you are actively making decisions about what can and cannot pass. So one is much more acute, and the other one is much easier, because in the passive mode you are just monitoring traffic. So, a little bit about our history and how we got here. The first lines of code were written in 2007 by Victor Julien, the lead developer of Suricata. The first release was in 2009. I joined the project in 2010, April or May, somewhere around there. We are open source, GPL, and we have active contributors, people contributing code, tests, and donations, from 23 different countries at the moment. Well, at least that was the last statistic, but basically from all continents. Suricata is owned by a non-profit foundation. The Suricata code is open source and it's on GitHub, but it's owned by a non-profit foundation on purpose, and the purpose is that it can never be sold. That's it. And this is basically how we started; on oisf.net you'll find a little bit more info. As I mentioned, there's a little bit of a visual representation here. Our first Suricata training, I believe, was in 2013, and our first SuriCon was in 2015. Those were a lot of fun events, and we learned a lot just by talking and interacting with the community. So there was big help from the community. And how did it use to work and look back in the day?
I had to generate this one manually to look at what it produces. This is basically what an alert looked like 14, 15, even 20 years ago. Not much to see there, not much to say there, right? You have an IP and a port and a message. What are you going to do with it? Now you need to go find other tools and other protocol logs, and you need to go to different machines to figure out what this IP is, what it's for, what happened before, what happened after, whether it's a server or a laptop or whatever it is. So you needed to do a lot of work; this was just a message. But back then it was one of the few things that were there, and there was nothing more than that. So fast forward to today. This is an example of an alert, but in a graphical interface. You have the alert, you have the signature metadata, in this case you have the HTTP protocol (and a lot of things do happen over clear text, by the way, because it's not vetted that much, especially in some environments), and you also have the flow record: packets and bytes to client and to server, durations, and all those things. So you have a much better understanding when you look at an alert now. This is the status code, this is the request, this is the file, and so on, so you can judge much better. And one thing that came with time in Suricata is something called the flow ID. It was natively introduced in Suricata in 2014. What the flow ID allows you to do is correlate a specific alert with any other protocol data from the same flow and session. So if you have an alert over SMB or something like that, you have all the SMB protocol records, the extracted files, and the saved pcap if you need the pcap. You have all that in one package.
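The correlation described above amounts to a group-by on the flow identifier. The sketch below assumes line-delimited JSON events that carry a `flow_id` field alongside `event_type`, as Suricata's JSON output does; the function name and the sample records are made up for this example.

```python
import json
from collections import defaultdict


def correlate_by_flow_id(eve_lines):
    """Return {flow_id: [events]} for every flow that raised an alert."""
    by_flow = defaultdict(list)
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if "flow_id" in event:
            by_flow[event["flow_id"]].append(event)
    return {
        flow_id: events
        for flow_id, events in by_flow.items()
        if any(e.get("event_type") == "alert" for e in events)
    }


# Illustrative records only; real events carry many more fields.
SAMPLE_EVENTS = [
    '{"flow_id": 1001, "event_type": "http"}',
    '{"flow_id": 1001, "event_type": "fileinfo"}',
    '{"flow_id": 1001, "event_type": "alert"}',
    '{"flow_id": 1001, "event_type": "flow"}',
    '{"flow_id": 2002, "event_type": "dns"}',
]
```

For the sample, the result contains only flow 1001, with its HTTP, file info, alert, and flow records side by side, which is the context an analyst would otherwise have to gather from separate tools.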
So that's a much bigger evolution from what you saw in the previous screenshot. How does it work? This is a screenshot from EveBox, for example, a tool developed by Jason Ish, our colleague from the Suricata team, and here's a quick example. Every session has a flow ID. Here is an alert with a flow ID, and in this case that flow ID correlates the alert to the flow record, to the HTTP record, and to the file info, which is the file metadata for that file transaction. So you get much bigger, much better visibility, and you can make a decision much faster than if you needed to go look in other tools. This is a showcase of the flow ID in Scirius, which is an open source web interface, part of SELKS, that we help maintain. Here, highlighted at the bottom, is the file info. You have the SHA, the file magic, and everything is correlated between alerts, file info, flow, and logs with the help of that flow ID, which glues everything together. So really, really powerful. Often enough I need to explain Suricata in one slide, and this is what Suricata does today: IDS plus NSM. You have the alerts, you have the protocol data, you have the network flows, the file transactions, and the pcap recordings. You have everything in one package. It's much different from when I started, at least. So we have evolved, and here is another way to look at it. It's a little known fact that Suricata alerts are only about 5-10% of all the data that it produces. The rest of the data is all protocol data, metadata, and things like that. The alerts are a very, very small part, and now, at least, I look at alerts as just context. They give me an idea of what's happening. That's all. I don't necessarily look at it as true or false positives. It's just, okay, that's what happened, and I need the context around it.
Suricata can function without signature rules too, although that's not recommended, because rules help highlight certain events. So what challenges do we have? Well, over all those years we have needed to keep pace. We need to adapt and move forward. Signatures have evolved, at least in Suricata. Back in the day a signature used to be a pattern match, a buffer overflow, some content triggering in the payload, and it was very much bound to the IPS mindset: you have to stop it, you have to look for something specific. And that's expensive. It's CPU intensive to look for a pattern on the fly at 100 gigabits a second, for example. In IPS, for example, you need to block, stop, protect the asset, and similar. So we needed to evolve from that towards a bit more behavior analytics, including from the perspective of infrastructure. You can ask, hey, how many proxies do we have on the network? That's interesting. Do you have proxies, NGINX for example, somewhere in the network where you don't expect them to be? That sort of visibility, that kind of difference. Then TLS. Okay, TLS is encrypted, sure, but the handshake is in clear text, and during the handshake you can see a lot of things, including ciphers and things like that. Why do you care about the cipher? Of course you do. Is it secure? Is it degraded? Is it recommended? Do you have applications that are using degraded ciphers? How do you know? Network security monitoring. That's the easiest way to do it, or at least much easier than the other way, put it that way. Then you have system updates: who's updating Debian, who's updating Ubuntu, things like that. They're very visible on the network, and it's not about the actual system update, it's about whether you expect it to happen, where it happens, and when it happens.
So yeah, we have to evolve in that direction, and we have actually managed to make a huge leap there. More protocol implementations, right, so we need more and more protocols. Of course we do, but, as it says here, it's no longer a network grep, so you have different protocols. You need application layer identification regardless of the port, so it has to be automated. You need parsing, you need logging. You need to parse the protocol, you need to log it correctly. All those things require time and effort. You need specific keywords so you can hook up detection to different parts of the protocol. And we need secure protocol implementations. Suricata needs to parse everything and anything on the network, and trust me, everything and anything on the network doesn't follow the RFC, it just doesn't. So one, we need to follow the RFC, and two, we don't, because we can't crash, we need to keep inspecting and alert on problems. Not just Suricata, a lot of organizations, a lot of tools have vulnerabilities in protocol parsers. Wireshark has a lot, Suricata has a few, and similar. So one way to combat that is the combination of Rust and nom, for memory safety and thread safety. C is not safe, right? C requires manual memory management, and if it's not done correctly, segfaults and memory leaks can all occur in there. Here is an example of a Rust nom parser that we use in Suricata. So Suricata is a combination of C and Rust nom implementations, and that started only a couple of years ago, and there's more and more of it, just to help us be safer when we inspect and do things. Then there's the outside evolution, right? Network speeds increase, demand increases, there is encryption, there is less visibility, but a lot of data.
Earlier, in January, I gave a talk at a different conference, and after the talk a person approached me. He's like, do you have any recommendations on how to tune or improve Suricata performance? I'm like, yeah, sure, what's your setup? He's like, we have over 25 100-gigabit links. I'm like, okay, we need to talk. So it's very interesting. Things like that are real and happening, so it's different. You need to keep up with those evolutions, too, and make sure that these setups and people can benefit from everything we do. Challenges, there are a lot, everything from mirroring the traffic, to one-sided traffic, to encryption, to NIC offloading, because we need to inspect the traffic as the end user or the end device will see it, not as the devices in the network pass it to one another. Volume, size of logs. Depending on the link speed, of course, it's not uncommon to have one, two, three, five or more billion logs a day. What do we do with them? That's a different topic. So here comes deduplication and all the stuff that we also need to take care of. And QA, anyone, while we're at this? It's quite an effort. So when we talk about encryption, as I mentioned, depending on the protocol version you have TLS 1.2, 1.3, 1.1, all sorts of things, but during the clear-text handshake you can extract a lot of information. These are some of the things that we extract: the SNI, the subject, the issuer. You can deduce cipher suites from there as well. JA3 and JA3S fingerprinting, TLS version and similar. So we need to tell Suricata what to do when it detects that the connection is encrypted. So what do you do? Do you want to keep inspecting it?
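As an illustration of what can be derived from the clear-text handshake, here is a minimal sketch of a JA3-style client fingerprint. It assumes the ClientHello fields have already been parsed into integer lists (the parsing itself is not shown), and it follows the publicly documented JA3 recipe: five comma-separated fields, values joined by dashes, GREASE values dropped, then an MD5 digest. It is not Suricata's implementation, and the function names are made up for the example.

```python
import hashlib

def is_grease(value):
    """GREASE placeholders (RFC 8701) look like 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)

def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from already-parsed ClientHello fields."""
    fields = [
        [version],
        [c for c in ciphers if not is_grease(c)],
        [e for e in extensions if not is_grease(e)],
        [g for g in curves if not is_grease(g)],
        list(point_formats),
    ]
    # Empty fields stay empty, so the four commas are always present.
    return ",".join("-".join(str(v) for v in field) for field in fields)

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string; used as a fingerprint, not for security."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()

# Hypothetical ClientHello values, only for illustration.
print(ja3_string(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0]))
print(ja3_hash(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0]))
```

Two clients built on the same TLS stack tend to produce the same string, which is why such a fingerprint stays useful even when the payload itself is encrypted.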
It might be pointless because it's encrypted, or you may want to pass it on. Some hardware already has a built-in function, so you can bypass the flow at the hardware level as soon as Suricata detects it's encrypted, because there's nothing more we can do. And there are certain things that we can continue tracking and generating logs for, for example the flow, how big it is and things like that, but we need to be able to relay that information, because further inspection is pointless, as I mentioned, if it's encrypted. There are also a lot of decryption devices and similar things that Suricata sits next to or behind, but that's a setup, an architecture issue. So there are four major factors that impact Suricata performance: rules, Suricata version, hardware used and type of traffic. Again, university traffic is much different from regular corporate traffic, where everything is vetted. So you learn a lot from both deployments, just using these two as an example. We're software, right? So we need to run on any hardware, and that's not easy to achieve. We need a lot of persistence, queuing and all those similar things. So what actually happens? Here's an example of workers mode in Suricata. As you can see, this is a network card, and it has different queues, and each thread, sorry, each packet actually enters and goes through these points: capture, decode, stream, detect, output, especially when you write it out. So it needs to be in that specific order. This is how traffic comes from the network, how it's passed to Suricata, and what Suricata goes through, at a very high level. Most NICs are made for web and file servers at scale. They're not specifically made for IDS. Some are, but an IDS needs to see the traffic, network security monitoring needs to see the traffic, in order, as the end device will actually see it. Everything in the same flow needs to be in order so that we can inspect it properly.
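The requirement that everything in the same flow stays in order is usually met by sending both directions of a flow to the same queue and worker. The sketch below shows the idea with a direction-independent flow key and a stable hash. It is an illustration only: the function names are hypothetical, and real NICs and capture methods use their own hashing schemes, which are not modeled here.

```python
import zlib

def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a key that is identical for both directions of a flow."""
    endpoint_a = (src_ip, src_port)
    endpoint_b = (dst_ip, dst_port)
    low, high = sorted([endpoint_a, endpoint_b])  # order no longer depends on direction
    return f"{proto}|{low[0]}:{low[1]}|{high[0]}:{high[1]}"

def worker_for(packet_tuple, n_workers):
    """Map a packet's 5-tuple to a worker index with a stable hash."""
    key = flow_key(*packet_tuple).encode("utf-8")
    return zlib.crc32(key) % n_workers

# Hypothetical packets: a request and its reply.
request = ("192.0.2.10", 51512, "198.51.100.5", 443, "tcp")
reply = ("198.51.100.5", 443, "192.0.2.10", 51512, "tcp")
print(worker_for(request, 8), worker_for(reply, 8))
```

Because the request and the reply hash to the same worker, a single thread sees the whole conversation in arrival order and can reassemble the stream before detection runs.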
And a small word. Back in the day when I started, there were different capture methods, and they evolved over time. But one of the fastest things back then was PF_RING. And we received a lot of help from Luca and Alfredo. Thank you, guys. So with kernel 2.6, yeah, PF_RING was the only thing that offered speed and performance. There are others upcoming in Suricata 7, you know, DPDK, AF_XDP and similar, but we need to have different methods. I am just going to finish up here. So, QA in Suricata. We need to cover a lot of angles. We have code on GitHub, we have code in GitLab. When you submit a PR publicly, it goes through automated checks, as many as possible, before we put it in the code. There are private ones, there are unit tests, there are thousands of checks. So anytime you submit a PR or code, it automatically triggers checks. And some of them run nightly. This is a GitLab screenshot going through some checks. Some of these checks can include thousands of definitions inside. In one check, we have 22,000 files extracted, and we need to make sure all of them are there the same way they were before the code change. Things like that need to happen. And here's an example of a PR that's going through the regular checks on GitHub. That information is actually public on GitHub, and you can see all the different OS compilations that it goes through. Address sanitizers, leak sanitizers, code analysis, Cppcheck, all those things need to happen before we can say, okay, we can put this code in without it affecting us. Regressions, traffic replay and similar. I am going over time, I apologize. This is how to contribute. Any feature and code can be donated. Open a ticket on our Redmine and start a discussion. And the code reviews are done on GitHub, they're public. So in conclusion, it has to work, because you need to create a community around it.
And the power is in the community, because you have a lot of ideas, a lot of feedback, and you need to be open about it. You need to have open discussions about the roadmap and input from all the different people that are actually using it. And our point is that the ultimate power comes in numbers. Thank you very much for having me here. Much appreciated. Open to questions. Thank you, Peter. Does anybody have a question? One question here. So I'm just wondering, I don't know what the current state is, but you have NVIDIA BlueField as well, and how else does it sort of pay that increase? Have you looked? I'm sorry, I didn't hear. NVIDIA BlueField. Oh, yes, yes, yes. We have a conversation going. Yes. Yeah, it's part of the whole process, staying and keeping on. |
DDoS attack detection with open source FastNetMon Community |
Hello. Thank you for coming. I'm very happy to see lots of people here. I hope you will enjoy my presentation. So, I'm Pavel, and I will talk about DDoS detection using an open source tool. So, first of all, who I am and why I'm talking here. I'm a software engineer. I got a formal education as a software engineer, and since the beginning of my career I have worked on open source. My first programming language was quite unusual, I would say. It was Perl. Not a fortunate choice for so many people, but for me it was a way into the industry. So, I worked for a domain name registrar. I worked for a cloud compute company. I worked for an internet exchange. And finally, I got a job at a global CDN provider, and I ended up working in cyber security. What I'm doing now: I'm in charge of development of a cyber security product for network security, and this product is called FastNetMon. So, I would like to start with a brief description of what FastNetMon is. FastNetMon is an application. That's the very first thing to clarify. It's a cross-platform application, and when I say cross-platform, I mean Linux, macOS, FreeBSD, OpenBSD. It's not yet on Windows, but still. And the main purpose of FastNetMon is DDoS detection for networks. From a technical perspective, FastNetMon is implemented in modern C++. Back in time, it was quite an interesting story when FastNetMon was started. The very, very first version of it in 2013 was implemented in C++11. But because of the compilers in some not very modern distributions, we had to move back to C++98. Since then, we moved on and we support modern versions. There is no reason to maintain compatibility with very outdated stuff. And now it's really good, fancy C++. It's the kind of C++ you actually enjoy hacking on, creating, and changing if you prefer to do so. And I know this feeling. When you hear about new stuff which may be relevant for you, the first urge is: maybe I should try it now? Immediately.
Because what is the point of documentation? What's the point of hearing my presentation if you can just install it right now? It was a very long journey. And I would like to thank all our maintainers who made it possible that FastNetMon is in so many distributions. For almost every single popular distribution used for server and production environments, you may install FastNetMon with just a single command. So you can start right now if you prefer. And if for some reason your distribution has no recent version of FastNetMon, or you want to install it on a distribution that is not covered by official packages, there is an installation tool. Let's go forward. Most of the time we talk about what our tools and our products can do. I would like to start from an unusual angle. I would like to highlight what we cannot do, because it's important. There are so many tools for DDoS detection. There are so many angles of DDoS detection. There is the detection part. There is the mitigation part. It can be implemented on premise and in the cloud. And before we go into details of what we are doing, we need to highlight what we are not doing. If you have any issues with your website or your blog, I'm sorry, we cannot help you. It's not the point of FastNetMon. It may indirectly help your carrier or service provider. But for your case, it's better to use cloud services, because it's not that complicated to move a site around. Normally, if we're not talking about a really enormous size, it's quite easy to move onto a content delivery network and get coverage from DDoS. And if you have some issues with DDoS when you play on your Xbox or PlayStation, also, I'm sorry, we cannot help you. We decided to add this slide because we get too many questions. And I think it's one of the really serious problems in modern days, because you cannot play because of DDoS.
And if you use a managed service provider, it may be public cloud, it may be private cloud. When I say managed, it means that somebody is in charge of keeping your service running. In this case, it's very unlikely that you have access to your network, I mean an administrative level of access, to change policy or alter configuration. And in this case, it's better to escalate to your service provider and call for help: we have some problems, help us. And finally, how can FastNetMon help you? FastNetMon is not here to protect a specific service. FastNetMon is here to protect your network. And when I say protect the network, it has a very different meaning from protecting a specific entity. The main purpose of FastNetMon is to keep up the uptime of the whole network in general. And when I say in general, it means keeping it running for 99% of customers, eyeballs and services behind the specific network. And I'll explain how we can do it. What kind of attacks can we protect your network from? Again, there are so many types of attacks. There are so many opinions about the classification of attacks. I'm not going to go into details about the kinds of DDoS attacks. I would like to approach it from the well-described OSI model. So what can we help you with? We can help you with IPv4 and IPv6 at the same time. If you still use IPv4, please don't. Please move away from it. And in terms of layers of the OSI model, we can help you only with layer 3 and layer 4. And if you have some specific ideas, like filtering out traffic using HTTP/2 or HTTP/3 encrypted by TLS, it's better to try the tools just presented, like Suricata, because for FastNetMon it's a little bit out of scope. The main purpose of FastNetMon is to detect volumetric DDoS attacks. And when I say volumetric, it means at least hundreds of megabits, but in the general case the average size of a DDoS in modern days is around 8 gigabits.
And in some cases it's exceptionally high, maybe hundreds of gigabits. But on average, it's just a few gigabits. And the purpose of FastNetMon is to detect this kind of attack. So what is the very first step when we assume that the network is under DDoS? I say assume, because can we say for sure that it is DDoS? In so many cases, how do we actually observe a DDoS attack against our network? You check your phone, it's not working. Your website is not working. You check the laptop in your office, and for some reason something doesn't work as expected, or customers are calling you. The first step is to confirm that it is actually DDoS. Because it may not be DDoS, it may be a fire alarm in your data center. Why is it extremely important to confirm that it's actually DDoS? Because it may be something different. In case of fire, it's way more important, and it needs a very different kind of action to remedy than DDoS. And if you happen to know that some of your colleagues are working in the data center right now, and at the same time you receive a call from a customer saying something doesn't work, it's very unlikely that it's DDoS. It's maybe caused by misconfiguration, because there are so many ways we can cause downtime in our networks. And okay, we covered most of the sources which can cause network downtime other than actual DDoS. And look, even this one, it's not DDoS. This one can cause havoc. It can bring down whole cities, countries, data centers. But it's still not DDoS. So how can we say that this one is DDoS for sure? Graphs. The only way to be 100% sure is graphs. By looking at this graph, if you know, okay, my network normally generates like 100,000 packets per second, like 100 gigabits, and you can see spikes of 20 gigabits, it's very unlikely to be caused by something normal. It's very likely DDoS. So that's the first level of remediation, the first level at which FastNetMon can help you.
FastNetMon can say for sure, in this kind of dashboard, that you are under DDoS. And then you can act on it appropriately, because you are well prepared. You know what you can do. And what can we do in this case? FastNetMon provides lots of different dashboards. The main benefit of those dashboards is that they're not built on the physical level of the network. When I say physical level, I mean a port counter, load for a specific interface, load for a specific router. What FastNetMon can do is give more of a view of your network from the logical level. When I say logical level, it's more about networks, prefixes, specific services. And in this case, FastNetMon can provide the required amount of granularity. There is total traffic for your network. Here you can see total incoming traffic, and in case of any spikes you may see them almost immediately. That's one of the benefits of FastNetMon. It's not historical data. It's data which was just received from your routers. It's almost real-time data. In the same way, again from a logical perspective, instead of seeing what the load is for a specific interface on my router, you can see how much traffic you have for a specific prefix. And you will be aware what kind of service is running in that prefix, so you can understand that something is wrong with this specific prefix. And again, at the last level of granularity you may even find traffic per host, because you may know that in a specific prefix you have just two very important services running. Then you can check what the load is, for example. And you can see immediately, again in real time, what's wrong. If you can see a spike for this specific service, okay, we found the victim, sadly. And FastNetMon's graphing capabilities include complete support for InfluxDB, Graphite, and plenty of Grafana dashboards.
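The logical view described above (total traffic, then traffic per prefix, then traffic per host) can be sketched as a simple aggregation over flow records. This is only an illustration of the idea, not FastNetMon's code: the record format (destination IP, byte count), the function name and the sample prefixes are assumptions made for the example.

```python
import ipaddress
from collections import Counter

def aggregate_incoming(flow_records, prefixes):
    """Sum incoming bytes in total, per configured prefix, and per host.

    flow_records: iterable of (destination_ip, byte_count) tuples.
    prefixes: list of our own networks in CIDR notation.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    total = 0
    per_prefix = Counter()
    per_host = Counter()
    for dst_ip, byte_count in flow_records:
        address = ipaddress.ip_address(dst_ip)
        for network in networks:
            if address.version == network.version and address in network:
                total += byte_count
                per_prefix[str(network)] += byte_count
                per_host[dst_ip] += byte_count
                break  # count each record against the first matching prefix only
    return total, per_prefix, per_host

# Hypothetical records using documentation address ranges.
records = [
    ("192.0.2.10", 1500),
    ("192.0.2.10", 9000),
    ("192.0.2.20", 400),
    ("2001:db8::1", 1200),
    ("203.0.113.9", 700),  # not one of our prefixes, so it is ignored
]
total, per_prefix, per_host = aggregate_incoming(records, ["192.0.2.0/24", "2001:db8::/32"])
print(total, per_prefix.most_common(1), per_host.most_common(1))
```

Looking at `per_prefix` first and `per_host` second mirrors the drill-down in the talk: a spike in one prefix narrows the search, and the per-host counters point at the likely victim.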
I would like to thank the community for contributing so many great dashboards, because when we started this idea we implemented a few of them, quite basic ones, but the community did a really great job by making plenty of them. And actually, most of them are way better than our official dashboards. And what is the source of this data? Is it AI or something different? No. We receive this information from the routers or switches in your network. From the perspective of protocols, we support almost all protocols available on the market. Of course, the most popular ones are NetFlow, IPFIX and sFlow. And as a last resort, if you have no sFlow, NetFlow or IPFIX in your network, you can try to use a port mirror. In all cases FastNetMon can handle a really significant amount of traffic. There are plenty of confirmed deployments of FastNetMon exceeding two terabits of capacity in total. So after you've got all the information, you may check it manually. For example, again, right at this moment FastNetMon will show what your total load is, what the load is for a specific network, and what the load is for a specific host. And for small networks you may find the victim immediately. Because in the case of a small network, you know, okay, I have 12 services, and you can check them one by one. Can we do it this way for DDoS detection in general? This one is just a not very precise map of the United Kingdom. And you can see there are lots of interconnections. It's not the largest country on the planet, but the amount of interconnections is incredible. Even medium-sized internet service providers or telecom providers may cover multiple countries. And you can see the amount of towns, of regions, is incredible. And if you talk about networks covering multiple European countries, or multiple countries in, for example, Asia, it's an incredible amount of locations, an incredible amount of entities.
You cannot check it by hand. For example, we are under DDoS, we know it for sure. Let's check every single city of a million-plus people in Europe. We cannot do it manually. It's just impossible. Every single time, after the large cities, we need to move to regions. Then we need to check household by household, because this specific attack might be against a specific person playing a game like Fortnite in a specific building. You cannot do it manually, unfortunately. If we move a little bit to data centers, a data center normally, as we can see here, is a single building. Maybe a huge building, but it's still just one building. It's not scattered over a continent. It's not scattered over thousands of kilometers. Is it easy to find the victim there? No, unfortunately, because sadly in a data center you may have more entities, more potential victims of DDoS, than in large telecom networks. What can we do? Of course, as I mentioned, you can manually check every single host in the network, because we already have pretty great dashboards and real-time data coming from your routers. What is the logic? How can we actually find the victim? Again, we have data about the bandwidth for a specific network and the packet rate for a specific network. We can check every single host in our network and find out. Again, in the case of data centers and large telecom networks, it's impossible to do it manually. That's where FastNetMon can help you. FastNetMon can do it for you, and it can do it really fast. For almost all protocols supported by FastNetMon, we can offer detection time of less than five seconds. It's not about FastNetMon saying, look, you're under DDoS, because that may be clear from graphs. At this point, FastNetMon can find out which specific service in your network is under attack right now. We will have this information in five seconds. Why is it important, five seconds? Why?
Can we wait a little bit? Have a cup of tea or coffee and wait? Unfortunately, we cannot. That's the main problem. Back in time, when I started working with DDoS attacks, it was around 2008. You could wait for around half an hour while a DDoS attack started from something like 10 megabits, maybe 15 megabits. You could have a cup of coffee, wait a little bit, and it would be 20, then something like 50 megabits, 100 megabits. It's growing. Now, what we see is an attack that can escalate from 100 megabits to tens of gigabits in a few seconds. And a human being, unfortunately, I have to admit, cannot handle it so fast. We need machines, because the people who actually run DDoS attacks have lots of automation. And without having automation in place, we cannot defend. So FastNetMon provides this option for you, instead of checking every single host in your network manually. That's still an option, and you can verify: when you receive reports from FastNetMon, you can check the graphs. Is it DDoS? Does it look like DDoS? Because inside, FastNetMon uses very simple rules. Like, if a specific host in my network generates more than 5 gigabits of bandwidth, and if a specific host in my network generates more than 100,000 packets per second, it's clearly DDoS. And after detection, what can we do? The very first step, which is available from every single carrier on this planet and, fortunately, is free, is a thing called BGP blackhole. BGP blackhole needs a little bit more clarification of how it works. From the name you may guess: if you put something into a black hole, you'll never see it again. And that's the point. So how can FastNetMon rely on BGP blackhole to stop DDoS in the network? In the beginning of my presentation, I mentioned that FastNetMon is here to protect your network, not a specific service. And it's really important, because BGP blackhole can be described in many words, because it's quite a complicated abstraction.
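The simple per-host rules mentioned above (5 gigabits of bandwidth, 100,000 packets per second) can be sketched as a threshold check over per-host rates. This is a minimal illustration, not FastNetMon's implementation: the data structure and function names are hypothetical, and treating the two rules as independent (either one is enough to flag a host) is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass
class HostRate:
    host: str
    bits_per_second: float
    packets_per_second: float

def find_victims(rates, max_bits_per_second=5e9, max_packets_per_second=100_000):
    """Return (host, reasons) for every host whose rate exceeds a threshold."""
    victims = []
    for rate in rates:
        reasons = []
        if rate.bits_per_second > max_bits_per_second:
            reasons.append("bandwidth")
        if rate.packets_per_second > max_packets_per_second:
            reasons.append("packet rate")
        if reasons:  # assumption: exceeding either threshold flags the host
            victims.append((rate.host, reasons))
    return victims

# Hypothetical per-host rates for one measurement interval.
sample = [
    HostRate("192.0.2.10", 12e9, 900_000),
    HostRate("192.0.2.20", 40e6, 6_000),
    HostRate("2001:db8::5", 1e9, 250_000),
]
for host, reasons in find_victims(sample):
    print(host, reasons)
```

Running such a check over every host on each interval is exactly the work that cannot be done by hand in a large network, which is why it has to be automated.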
But I would call it a religious sacrifice made by network engineers to keep their network running. Why am I saying sacrifice? Because at this point in time we know, for example, that in our network we have 20,000 hosts. Let's imagine every single host is a residential building somewhere in Europe. And we know for sure we are receiving DDoS right now. Our service is degraded. Our customers are calling us. Our site doesn't work. Nothing works fine. Using FastNetMon, we can find out who the victim of this specific attack is. We know the specific host, IPv4 or IPv6, which is the target of this DDoS attack. And what we need to do using BGP blackhole is to stop all traffic from coming to this specific host. Which means effectively disabling and unplugging this specific customer or service from the internet. That's how BGP blackhole works. It's not like a firewall, which may block attackers. In this case, we literally, voluntarily block the target of the attack, just to save our network. And that's the purpose of FastNetMon: to stop it, and to do it automatically for you. And after you stop it, and you can see it on this diagram, we maintain the uptime of the network. Everything is kept working by sacrificing just one host on your network. And it doesn't mean that you just block it and go away, sending an email to the customer: look, we blocked your service, we cannot help you, sorry. There are so many ways you can actually keep this host running. But before you apply some actions, create a plan of what you can do. Maybe you can call some specific providers to provide defense for it. You may just, sorry. So you need to have some actions, and it's better to apply these kinds of actions in a quiet environment. Instead of having to deal with 20,000 customers calling every single minute, you may block the specific target and bring your network back up.
And when your network is back in operation, in a way quieter environment, with nobody yelling at you, nobody calling you, you can decide over a cup of coffee or tea what the options are, what you can do for this specific customer. And that's how FastNetMon can help you. Since the beginning, FastNetMon was open source, from the very first version. And a lot of the features I just explained weren't invented as part of our master plan or roadmap. They were community requests. We received them on GitHub, like: look, I have a problem and I would like to solve it. So since the beginning FastNetMon was a community-driven project. And we have lots of community channels where you can cooperate with us, share your stories, and ask questions. Please join all of them, and I will help answer your questions. Thank you. Anybody has questions? Hi, thanks a lot, quite interesting. I just wanted to ask if you ever felt the need to extend the way you collect data with other protocols, for example any flavor of the OpenConfig specifications, or eventually BMP instead of BGP? That's a great question. So the question was, is it possible to use protocols like OpenBMP or OpenConfig to feed more information to FastNetMon? In the current generation of FastNetMon detection tools, we mostly rely on traffic telemetry protocols, which actually carry part of the network packet. It may be the header of the network packet, or it may be some meta information about source port, source IP, destination port, destination IP. And we don't use data from BGP directly. The only way we actually interact with BGP is that we have an internal BGP daemon based on GoBGP, which injects information and announces routes to your network. So we have no backward integration from the network. We have no way to learn information from your network.
But because we offer different APIs, we offer different ways to automate and run callback scripts. Instead of just running BGP, you can run your own Python script, and then you can rely on information from a third-party source, combine this information, and make decisions using it. I was merely asking because, for example, with gNMI you can have a sort of reaction on the network. So according to what you're receiving using IPv6, for example, you can have an action directly on routers. Yes. This is one of the ways we can use so-called callback scripts, because when FastNetMon detects an attack, it can run a specific script. It may be a Bash script, a Python script, a Perl script. In this script, you will have access to basic information about the attack. What is the target host? What is the type of attack? What is the target prefix? And then, using any automation protocol, you can run actions on routers. For most routers, there is no standard way to inject this kind of information for every single vendor available on the market. So we decided to move these tasks to the community, to implement on your own. And if you implement it, share it with the community. One second. Can you do BGP Flow Spec, like blackhole? That's a good question. So, back in time, we had BGP Flow Spec support based on ExaBGP, but it was, like, pure PoC-level quality of implementation, because it was literally hardcoded, at least for DNS and SSDP amplification. But it worked well. Unfortunately, because of the complexity of working with the API of ExaBGP using the Flow Spec protocol, we decided to remove this capability.
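A callback script of the kind discussed in this exchange could look roughly like the sketch below. It is hedged heavily: the argument order (target IP, traffic direction, packets per second, action) and the action names `ban` and `unban` are assumptions made for this example and should be checked against the FastNetMon documentation, and the planned actions are plain descriptions rather than real router or BGP commands.

```python
import ipaddress

def parse_callback_args(argv):
    """Parse the arguments assumed to be passed to the callback script.

    Assumed order (hypothetical): target IP, direction, packets per second, action.
    """
    if len(argv) != 4:
        raise ValueError("expected: <ip> <direction> <pps> <action>")
    ip, direction, pps, action = argv
    ipaddress.ip_address(ip)  # raises ValueError when the target is not a valid address
    return {"ip": ip, "direction": direction, "pps": int(pps), "action": action}

def plan_action(event):
    """Decide what the automation should do; nothing is executed here."""
    if event["action"] == "ban":
        return f"announce blackhole route for {event['ip']}"
    if event["action"] == "unban":
        return f"withdraw blackhole route for {event['ip']}"
    return f"ignore unknown action {event['action']!r}"

def main(argv):
    """Entry point: turn callback arguments into a planned action string."""
    return plan_action(parse_callback_args(argv))

# Hypothetical invocation, as the detector might call the script.
print(main(["192.0.2.10", "incoming", "250000", "ban"]))
```

In a real script, `main(sys.argv[1:])` would be called from the entry point, and the planned action would be replaced by whatever automation the network supports, for example a GoBGP command line call or a vendor-specific API.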
And now the only way you can actually inject Flow Spec rules, because you can implement blackhole using Flow Spec, is to run the GoBGP command line from the callback scripts of FastNetMon. Okay. Thank you. Any more questions? No. Thank you very much. Thank you for your time. |
ntopng: an actionable event-driven network traffic analysis application
How ntopng can be used as a scriptable system capable of reacting to network events. |
Okay, good morning. This time it's time to talk about something different: ntopng. What is ntopng? It's an open source application, of course. We are here. And you can download the code on GitHub. We'll see the link at the end of the talk. What is ntopng doing? It is, first of all, a real-time network traffic monitoring application. It means that it displays on a web interface what is happening in your network live. Okay, no delay. Unless, of course, we are receiving flows coming from a router, which are somehow a little bit delayed because by nature they are an average. They have a certain lifetime. And it is designed for network monitoring and cybersecurity. It means that there are some behavioral checks. So we are not bound to rules. You have seen the Suricata presentation before. You see there are some rules: in case this happens, then. So this is not our case. We work based on behavior. It means that we look for a host that is misbehaving, more or less similar to what you have seen before, that suddenly starts to send too much traffic with respect to the past, or starts to, you know, fire up a new application, so it accepts connections on a certain port for TLS that was not open before. This is a typical example. So therefore the application simply starts up and learns what is happening on the network. There are some levels of learning. Sometimes it is immediate learning because, you know, you specify some sort of configuration. But usually this is not the case. The case is that the application learns what is happening, and in case something goes wrong, goes different, it fires an alert. This is the idea. And the architecture is actually divided in two parts. Okay. First of all, the packet processing part, which is based on PF_RING or libpcap. So this means that you can run on Windows, Linux, macOS, FreeBSD, whatever.
Instead, PF_RING is something that we have co-developed: a Linux technology for accelerating packet capture, but also for merging traffic from multiple adapters and for distributing traffic. So it's much more than simply RX acceleration. On top of this, there is an open source library that we still maintain at ntop called nDPI. It is the only open source library that does deep packet inspection. For us, that means we try to understand from the traffic what the application protocol is: whether it's TLS, a generic protocol, or Google Mail, a very specific protocol. Out of the traffic, we extract the metadata. For instance, we extract certificate information and we generate something we call risk. Looking at the traffic, we see if there is something wrong, such as an expired certificate, just to give you an idea, and we trigger an alert. On top of this, there is ntopng, because this first part is basically provided by the operating system. ntopng has a C++ engine that processes packets and does traffic analysis. Internally it creates a representation of the data based on the concept of network interfaces, because we can have multiple network interfaces from which traffic is received. It can be a virtual interface, such as a NetFlow collection, or a physical interface, such as eth0. Then we have something we call behavioral checks, where we check flows and hosts. Checking flows means that each independent communication, such as a TCP connection, is checked. With a host, instead, we take the host as a whole. If a host is doing a port scan, each individual communication is okay, or more or less okay. But the fact that this host is doing this in a sequence, across a network or against a host, is a problem. This is what we call behavioral checks.
On top of this, we trigger alerts that can be consumed locally or sent elsewhere. This is the fast part. On top of this, we have the Lua interface. Why Lua? Because we like C++, but C++ is not for everybody. We need to simplify, for instance, the development of the web interface. The REST API, for example, is written in Lua, sitting on top of C++. We have created an API that lets us avoid the typical problems of C++ while simplifying the way the application works. We use Lua for operations that are not critical, such as the web GUI, or for checks that are not necessarily real-time, such as SNMP. For SNMP, we fetch the data every five minutes and do the checks. Traffic ingestion, as I said, is done in multiple ways. Sometimes it is raw traffic, so packets. Sometimes it is not; it's flows. Both are handled by the C++ engine, which does it efficiently. Then we have other types of ingestion based on events: things that we don't completely control, but that are relevant for us. We have seen Suricata in the presentation some minutes ago. This is a typical example of input. Why? Because, as I said at the beginning, we don't have rules, and we don't want to have rules. We don't want to say "if the payload contains this and this and this, then...", because we don't believe that this is what we should do. Instead, there are wonderful tools such as Suricata that do that very well. The idea is to combine network monitoring and behavior analysis with this type of tool. Through tools such as Suricata, which is optional, of course (you don't have to run it with ntopng), you get this type of information, and ntopng can combine it directly. Of course, we also have firewall logs and syslog. Why is this important? Because we can look at information that is not visible from the traffic.
We always play with packets, me and Alfredo. But we understand that packets have limitations, especially with encryption. We have seen rules before saying "if you are downloading a binary application". That is fine if it's plain text, but with TLS you will never see that happen. You have to use things like rules on top of rules, but they are just guesses. Instead, through syslog or other means, we know it. For example, we see an attack on WordPress, with WordPress saying that this host is trying to guess the password of the administrator user. This is much more relevant information. From the network standpoint, everything simply looks fine. The problem is in the application. That's why we believe in the network, but at the same time we need other information injected into the application. And of course, we have historical data. We use a database called ClickHouse, so we can store a billion records and everything works very fast. This is also an open source database. For time series, we use RRD (round-robin database) or InfluxDB. As I said before, we have checks that are divided in two parts. C++ is for efficiency: the fast part, in essence, where you have to process traffic inline. When you have an incoming packet, you have to check whether this packet, belonging to a flow, is relevant. Then we have other types of checks that are not so real-time, such as a check on an SNMP interface. These need to be easy to develop, but they don't need to be fast. As I said, if we poll SNMP every five minutes, we have plenty of time for that. And of course, we send out notifications. For instance, we trigger a shell script, a webhook, syslog, Telegram, the usual things. Nothing new here. Okay, let's now start the talk after this introduction to ntopng. The problem is the following.
We have added over 150 behavioral checks on the traffic. But there is always somebody who comes and says, "I want to do something different." How can we support these people? How can we enable new programmers, say people who are used to Python or shell scripts, or who don't want to learn the internals of our application? And many times this happens when you are in a rush: there is an attack, something happening on your network that you want to check. There are two levels to the problem. First, we have to extend the behavioral checks in order to have behavioral detection in a different way. The second part, which Alfredo will describe later, is how we can use ntopng as a data lake from languages such as Python, which is very popular, so that you can use ntopng as a source of data for your own application. Of course, you have time series. As I said, we save data in InfluxDB if you want, so you can use Grafana for creating your own dashboard. But these are simple dashboards. If we want to do something more complicated, to go beyond that, how can we do it? This is the idea today. We like C++. C++ is super efficient, and we have been playing with it for many years. But we understand that it's not what everybody wants. We need something easier. We also wanted to understand how to make it possible to develop checks in minutes for people who say, "If I see this specific certificate or this specific behavior, then there is a problem." Something very peculiar to one organization, not general for everybody. For instance, how do I trigger an alert if there is TLS traffic between host A and host B? A printer, for instance, should not generate any TLS traffic, just to give an example.
If this happens, how can you trigger an alert? Other examples are the following: alert me if I have a certificate signed by a certain organization, or if I have a BitTorrent connection going above one gigabit, or notify me if there is a Zoom call with bad quality. Very peculiar, very specific checks that people want to do, maybe on one autonomous system and not on another, or on an actor, and not on another. Things that are not general, that we cannot implement for everybody. How can we do that? Let me explain how ntopng works internally. Let's have a look at a flow, that is, a communication. ntopng creates a data structure inside itself as soon as we see the first packet of the flow. We see source and destination IPs, source and destination ports, protocol, VLAN, whatever. This is the first event that is relevant for us. Then, as I said, everything sits on top of nDPI, the yellow part. So we have another event when the application protocol is detected. Actually, this one is divided in two parts. First, the main protocol is detected, such as TLS, and then we can refine this information with metadata saying: this is TLS going to Google Mail, and not Google Search or something else from Google. That is the second event, from nDPI. Then, for long-standing flows, we have periodic activities. In essence, every minute we can do something, such as trigger an action. And at the end there is the flow-end notification, as soon as the flow is over. What we wanted to do was create a Lua API that allows people to write simple checks that are efficient. Efficient enough for most people, because not everybody needs 100 gigabit; many people have one-gigabit networks or two- or five-gigabit networks.
They need some efficiency, but nothing extreme. So let's say: use Lua to prototype a check for people who need speed, or use Lua outright for people who have, say, an industrial network or a network running at one or two gigabit. In essence, we have created an API that lets Lua see properties of the flow inside ntopng: for instance, the number of bytes, multicast, layer 7 information, this type of thing. The API calls are very small. We don't want the application to be inefficient simply because we hand over to Lua the whole representation of the host or of the flow. We expose simply the methods that we are interested in. On the left side, you see the C++ code that implements this. On the right side, you see an example of the Lua code, just to give you an idea of how it works. Whenever one of the events occurs, for instance nDPI is done and the protocol has been detected, we have to check the flow. If you want to block, let's say, Google Mail, what you need to do is execute a Lua check after this has happened. In essence, in the C++ code we have put a call to the Lua VM, which executes a script; a script that can be applied to many flows, not just one. This is where it happens. And this is an example of a check. It is a simple example: if you have a flow that is either TLS or QUIC, started from a host or anything in 192.68, 178.2, 1.1, and if it's TLS, and if the protocol issue is... So, a very simple check that a friend of mine asked for, because he is monitoring IoT networks and they found a vulnerability on a specific type of rule and the client was a specific device. Something that is not general. Okay, so this is the way it works. Very simple to write. The problem is the following.
The overhead introduced, on a very slow Intel i3, just to give you an idea of the worst case, is 30 microseconds for everything, on average, whereas with C++ we can do it in one microsecond. Now, you may say this is bad. In a way, it is; I agree, because we are 30 times slower. But first of all, on one-gigabit networks this is not a problem. Also, most of these checks are asynchronous. This is one of the few that are synchronous: as soon as the protocol has been detected, we call this method. But it is not called while the packets are coming in. In essence, we have another thread that calls this while the traffic is coming, so we don't stop the execution. To make it short: if you take the overhead you have introduced, sum it over everything, and stay below certain boundaries, for example if you want to execute the flow checks on all the flows every minute, you are good. And of course, we trigger an alert. The result of the alert is a notification on the GUI that can also be sent, for instance, through Microsoft Teams, just to give you an idea. Or we can trigger a shell script, or send an alert to my friend on Telegram. This is the way it works. Okay, now I have this. Okay, so we have seen how to extend the ntopng engine with Lua scripts that access traffic information and use that information to check the traffic and trigger alerts. We have also recently released a Python package, which you can install with pip install ntopng, that you can use as a library to write Python scripts that access traffic information in ntopng. This happens through the REST API, which means that you can run your script even from a remote location. For example, you can access live data in ntopng. In this case, we are importing the Ntopng class.
We connect to ntopng using the Ntopng class. We get an instance of an interface in ntopng, for instance eth0. We use this method to get all the hosts that are active in my network, with all their metadata. There are plenty of methods in this class and in other classes of the library that let you get traffic information: alerts, flows, hosts, whatever. You can also get historical data in the same way. You connect to ntopng and get an interface. From this interface, you get an instance of the historical class, and you can run queries on the database. For instance, you can get alert statistics from this time to this time, say for the last 24 hours, and just print the statistics of the alerts that you have. Those are two examples of querying the engine to get data. We have also seen that when a check or an external event detects something, ntopng can trigger an alert, and that ntopng supports several endpoints. We can send this alert by mail or through a messaging system like Telegram or Slack. We can call a webhook. We can run a shell script, and this script can be a Python script. So let's try to put all the pieces together. We receive an event generated by an internal check or an external check. This event can call a Python script. The Python script can get information from the alert itself, or it can query the engine through the API that we created to fetch more data and augment the alert information. It can contain some logic and trigger some action, so you can write your own actions here to react to the event. To implement this in ntopng, first of all you have to enable the check that you want to use to analyze the traffic. In this case, we are using a custom check that the user creates in Lua, as Luca showed you before.
Then, if you want to write a Python script that reacts to this event, you have to write an alert handler, which is a script that you place under the ntopng shell scripts directory. In this case it is a simple script that receives the traffic information, the metadata, on standard input. For instance, if the alert type is our user script, I want to do something. Here I'm just logging the IP address of the host that triggered the alert and a message from our custom script. Then you go into ntopng, under notifications, and set that you want to send alerts to the shell script. Here you have all the options, like email, whatever, and you select your handler. Then you specify for your handler that you want to receive just critical alerts, so you specify the severity. You specify the category of alerts that you want to handle, in this case cybersecurity, and the entity; in this case, I want to handle alerts about hosts. Then we can extend our handler. We have seen how to print just the alert information, but we can again use our Python library to access more information about the host in ntopng. We receive this alert, which has been triggered on a specific host in our network. For instance, this host has been infected by malware, or it's generating unexpected traffic, whatever. We want more information about this host, to build a report, for instance. In fact, our library can also generate a report, or you can generate your own report using the API that we provide. So we build this report and send an email. This is a simple script that you can use: a few lines of code to handle alerts, generate reports and get the alert, for instance, by email on your mobile phone. So this is the big picture of the example that we're seeing right now.
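As a rough illustration of the alert handler described above, here is a minimal Python sketch. It assumes the alert arrives on standard input as a single JSON object; the field names `alert_type`, `ip` and `message`, and the type value `user_script`, are hypothetical placeholders, since the talk does not spell out the payload layout.

```python
import json
import sys
from typing import Optional

# Hypothetical value: the identifier of the custom check is not given in the talk.
CUSTOM_ALERT_TYPE = "user_script"


def handle_alert(raw: str) -> Optional[str]:
    """Return a log line for alerts raised by the custom check, else None."""
    try:
        alert = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON: ignore rather than crash the handler
    if not isinstance(alert, dict):
        return None
    # "alert_type", "ip" and "message" are assumed field names.
    if alert.get("alert_type") != CUSTOM_ALERT_TYPE:
        return None
    ip = alert.get("ip", "unknown")
    message = alert.get("message", "")
    return f"custom check fired for host {ip}: {message}"


def main() -> None:
    line = handle_alert(sys.stdin.read())
    if line is not None:
        print(line, file=sys.stderr)


# Only read stdin when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Keeping the decision logic in a pure function makes it easy to test without a running ntopng instance; the place where the log line is produced is also where a report or an email could be triggered.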
We have defined a user script that triggers an alert, or we receive events from any other source or from internal checks. We call our script, which gets more information from the engine to build a report and sends this report by email. The result is this: the system checks your traffic, builds a report when something happens, and sends you an email with the traffic report for the host, with the top alerts sorted by severity or by count, the top contacts for the host, and a chart of the traffic generated by the host. You can add more, like the top applications used by the host, et cetera. Do you want to wrap up? Okay. So we have seen that with ntopng you can collect traffic information from packets, flows and events, events from Suricata for instance, et cetera. We started with ntop, actually Luca started with ntop, then we moved to ntopng. It was mainly a traffic monitoring tool. Today it is also a cybersecurity tool able to do behavioral checks, providing not just visibility but also cybersecurity monitoring. You can now extend this engine with Lua scripts integrated in ntopng, or even with C++ plugins, let's say checks, if you need to scale in performance. Or you can use our Python library to write Python tools that run externally, even on remote boxes, access traffic information in the ntopng engine, and build, for instance, a PDF, as we have seen, reporting what's going on in your network. Of course, all the code is available on GitHub, so if you want to contribute, you are welcome. Especially now, you have no excuses: we have a lot of libraries and scripting languages for interacting with the engine. Something else to add? No.
The only thing I want to say is that, from our point of view, this is an efficient way to do network monitoring and cybersecurity, and at the same time to extract information in a way that does not interfere with the main engine. I believe that lets most of the people sitting in this room do whatever they like to create a monitoring tool tailored to their own needs, and on top of that, it is open source. That's all. Thank you very much. Any questions? Wait, wait, wait. It's just a simple question. How does it compare with CN tools? It looks like it does everything CN could do. CN tools? Yeah. I don't know those tools. No problem. I am not familiar with them. Any other questions? Can the scripts be compiled to be more performant, or is that not on your development timeline? To compile the scripts, to get more performance? The Lua scripts, yes. We saw that the C++ check takes one millisecond, but the Lua script takes 30 milliseconds. Yes, of course you can compile them, but at the moment you have to code them in C++. We used LuaJIT, the just-in-time compiler, before, but it is not available everywhere, for instance on ARM, and we want to support various platforms. So yes, it is possible, but again, these are microseconds, not milliseconds. So one million of them per second. Any questions? Anybody else? Hi, thank you. Do you have some figures about the performance you are able to achieve on a typical server, in flows per second? Some figures to share on Lua scripting, and also some examples for Python, which should be less efficient? Okay. When you process packets with ntopng itself, you are able to process a few gigabits per second. Depending on the drivers you use and how you tune ntopng to scale, you can get to 10 gigabits, for instance. More or less, we range from a few gigabits to 10 gigabits in ntopng itself.
You can use it in combination with our probes, which is nProbe, or we have other probes like Cento. In that case, you can scale up to 100 gigabits per second, but the architecture changes a bit. It's one 100 gigabit in plus. As for the checks, it depends on the checks that you enable, of course. Okay. I think we are running out of time. Many thanks for being here. Thank you. |
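The overhead figures quoted in the talk (about 30 microseconds per Lua check on a slow Intel i3, about 1 microsecond in C++) can be turned into a rough budget. The sketch below is plain arithmetic, not ntopng code; the flow count of 100,000 is a made-up example, and a single thread doing nothing but checks is assumed.

```python
LUA_COST_US = 30.0  # per-check cost quoted in the talk (worst case, slow CPU)
CPP_COST_US = 1.0   # per-check cost quoted in the talk


def checks_per_second(cost_us: float) -> float:
    """Upper bound on checks one thread can run per second."""
    return 1_000_000 / cost_us


def pass_duration_s(active_flows: int, cost_us: float) -> float:
    """Time to run one check on every active flow once."""
    return active_flows * cost_us / 1_000_000


if __name__ == "__main__":
    flows = 100_000  # hypothetical number of active flows
    for name, cost in (("Lua", LUA_COST_US), ("C++", CPP_COST_US)):
        print(
            f"{name}: {checks_per_second(cost):,.0f} checks/s, "
            f"{pass_duration_s(flows, cost):.2f} s per pass over {flows:,} flows"
        )
```

Under these assumptions a once-a-minute periodic pass over 100,000 flows would take about 3 seconds in Lua, which is consistent with the speaker's point that staying below the one-minute boundary is what matters.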
So you want to build a deterministic networking system
A gentle introduction to Time Sensitive Networking |
Hi. Welcome to my talk: So you want to build a deterministic networking system, a gentle introduction to time-sensitive networking. Just out of interest, how many of you have heard of TSN, or time-sensitive networking, so far? That's quite a few for a networking session. That's great. How many of you have already worked with it? Not so many. Okay. You will after this talk. Who am I? I'm a former systems engineer. I worked a lot with time-sensitive networking and its predecessors, and I also took part in standardization. Since last summer, I have worked as a kernel developer at Pengutronix. That's a German Linux consulting and support company. We have roughly 7,600 patches in the kernel, and we also do consulting for real-time networking, among many other things. And by the way, we're hiring, of course. Now, to what we will look into today. We will look into applications: I will give you some examples of why you would want to do real-time data transport over a network, and what the implications and requirements of these applications are. We will look into the basic building blocks, so sorry to the folks who already know about that. We will talk a bit about which Linux user space and kernel components are used in building these applications, and I will sum up the state of the union. Then, just as an announcement in advance, there are some bonus slides where I give more details and some references to open source projects already working with TSN. If you're interested in that, just download the slides from Pentabarf and check out the links. I also included an example of how to glue together a stage box, that is, a transport system for audio data over the network. That did not make it into the talk because the slot has been shortened to half an hour. The example I will focus on today is audio video bridging.
If you want to transport real-time data over a network for an application such as this talk, you want jitter buffers that are as small as possible, to reduce latency in the system. If you transport data over a traditional network, packets could get dropped, so you have to resend them, or you have to make sure that, somehow, magically, interfering traffic doesn't do you any harm. That usually involves quite large jitter buffers, up to several seconds. If I talk now and you hear me from the stage, and then you hear me from the PA four seconds later, that would be quite annoying. So you want to cut transmission latency, the overall end-to-end latency, down as low as possible. TSN started as audio video bridging, or AVB, as a standard, and then people came across the fact that this technology could also be useful for quite a few other applications. Most of our customers do machine control with it. If you have a large production line and you want to transmit data between your PLC and your servo drives or your robot arms, you also want to make sure that your control data arrives at the actuator in time and that your sensor data is read in at a certain point in time. Keeping that timing is quite important. The same holds, of course, for aerospace, automotive, railways and so on. I won't go into these applications today because, as I said, we're short on time. The first requirement of these applications is that you need to establish a common time base in the network. Measuring time in computers basically means hooking up a hardware counter to a crystal oscillator. These crystal oscillators tend to drift in frequency over time, especially with temperature. And because devices are switched on at different points in time, you also have quite large offsets. If you start one device at, say, 12 o'clock and the other at 1 p.m., they have one hour of offset.
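To put a number on the oscillator drift just described, the following sketch computes how far two free-running clocks move apart. The frequency errors of +50 and -50 parts per million and the 48 kHz sample rate are illustrative assumptions, not figures from the talk.

```python
def relative_drift_s(ppm_a: float, ppm_b: float, elapsed_s: float) -> float:
    """Time difference accumulated between two free-running clocks.

    ppm_a and ppm_b are the frequency errors of each oscillator in
    parts per million; elapsed_s is the true elapsed time in seconds.
    """
    return (ppm_a - ppm_b) * 1e-6 * elapsed_s


def sample_slip(ppm_a: float, ppm_b: float, elapsed_s: float, rate_hz: float) -> float:
    """Number of audio samples by which two unsynchronized converters diverge."""
    return relative_drift_s(ppm_a, ppm_b, elapsed_s) * rate_hz


if __name__ == "__main__":
    # Assumed values: one crystal 50 ppm fast, the other 50 ppm slow.
    print(f"after one hour: {relative_drift_s(50, -50, 3600) * 1e3:.0f} ms apart")
    print(f"per second at 48 kHz: {sample_slip(50, -50, 1, 48_000):.1f} samples")
```

Even with both devices started at the same instant, under these assumptions an ADC and a DAC would slip by several samples every second, which is why a shared time base is the first requirement.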
So you want to make sure that all your network devices have a common sense of time passing and a common sense of what time it is, because lots of scheduling decisions for network traffic may depend on timing. Also, for some applications, such as the audio example, you would like to regenerate your audio sampling clocks. In order not to introduce any additional degradation in audio quality, you want to make sure that the sampling clocks of your ADC and DAC run in lockstep. That is why you want to make sure that your time is distributed evenly. The way this is usually done in networks is shown in this old-style picture. You elect a so-called master clock, the best, or most stable, clock reference in your network. Then you compare all other clocks to that reference, and they have to adjust their local time to the reference time. It's just what those three gentlemen do in the picture. I like that comparison because you find a lot of analogies in the standards to the way this works with pocket watches. If you look into the standards, you will find that basic idea quite useful to keep in mind. The other thing we want guaranteed is, as I already said, bounded transmission latency. Let's follow the transmission of a data stream through the network. On the left is what the standard calls a talker. These are what the standard calls bridges; as we're dealing with layer 2, those are usually Ethernet switches. And on the right is what the standard calls a listener. You could also call them a source and a sink, but the standard talks about talkers and listeners. The packet goes from bridge to bridge along its path across the network. Each switch, or bridge, has an ingress queue, a switch fabric and an egress queue.
That's due to the fact that you can only transmit one packet out of a given network port at a time. If another packet arrives at another port for that destination port, you have to store it, wait until the previous transmission is done, and then transmit the next packet. This introduces what's called the residence time in each switch. So even if you have a perfect pass through the network without any interfering traffic, you add a little time at each step your payload packet travels through the network. If our audio starts here, it's a bit later when it arrives here, a bit later when it arrives there, and so on. That's fine as long as you have no interfering traffic. But you might have interfering traffic, because of course we want to run our audio on converged networks. We want to use the same network for, say, our live PA system and for our internet connection. And we want to download a large file, because we want to download a presentation recording from FOSDEM. That's where this entity comes in and creates a large amount of traffic here. This will cause the packet here to be delayed before it's sent out of the egress port, and it won't arrive in time. If we go for jitter buffers that are as small as possible, that's a problem, because we get a buffer underrun at the listener side. We get audio dropouts in the audio case, or stalling motors in the industrial control case. That's something we have to avoid under all circumstances. So what we want is quality of service. Of course, you're professional network engineers, so you don't need a picture, but the picture I like to use for that is a bus lane in the street, because the bus also runs in a more or less isochronous way.
You send those buses, or packets, down the lane, and the way to keep them from being hindered by the interfering traffic is to introduce a priority lane. That is what we also do in networks when we introduce quality of service measures. Another thing we need for at least some of these applications is link layer redundancy. Imagine there's a mixing desk right at the back and we run a network link back there, and someone trips over that link and rips out the cable, or maybe it's a fiber link and someone stomps on it. Bad things happen. If our stream is gone, we don't want that. So we want to introduce redundancy schemes there. You can think of it as a real-time capable, real-time healing, spanning-tree-ish thing with no waiting time. The standard spanning trees don't quite cut it for these kinds of applications, so we have to introduce other mechanisms. There are some other application requirements, but they're not so important, so I leave them out for now. Now, what kernel and user space components do we have to implement that? We will look into what the TSN standards are later, because that's basically just numbers and letters. For time synchronization in TSN, we use gPTP. That's a flavor of the Precision Time Protocol, the generalized Precision Time Protocol, which you can think of as standard PTP, IEEE 1588, boiled down to layer 2. We're dealing with raw Ethernet frames, so we can't use UDP for transport. It also has some other quirks, but they're not too important here. On the Linux kernel side, we have the hardware timestamping units and the PTP hardware clocks. That's basically the interface to the hardware clocks in your Ethernet MAC or PHY. And the user space component that runs all the remaining stuff is linuxptp.
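The idea of comparing a local clock against the master clock using hardware timestamps can be illustrated with the generic two-way time transfer arithmetic that PTP-style protocols build on. This is a simplified sketch, not the gPTP algorithm itself: gPTP measures the delay of each link with its own peer delay exchange, and the sketch assumes a perfectly symmetric path. The timestamp values are invented.

```python
from typing import Tuple


def offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Estimate clock offset and one-way delay from four timestamps.

    t1: master sends a message        (master clock)
    t2: local device receives it      (local clock)
    t3: local device sends a reply    (local clock)
    t4: master receives the reply     (master clock)
    Assumes the path delay is the same in both directions.
    """
    forward = t2 - t1   # delay + offset
    backward = t4 - t3  # delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay


if __name__ == "__main__":
    # Invented example in nanoseconds: local clock 500 ns ahead, 100 ns path delay.
    offset, delay = offset_and_delay(t1=1000, t2=1600, t3=2000, t4=1600)
    print(f"offset = {offset} ns, delay = {delay} ns")
```

The quality of the result depends directly on how precisely t1 to t4 are captured, which is why the timestamps are taken in hardware, close to the wire, rather than in software.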
That's basically the way it works, and it works quite well. You can achieve precision down to several nanoseconds from point to point with that. Traffic shaping is the quality of service measure we want to employ. The kernel has the tc subsystem. If you configure it manually, you usually use iproute2, or netlink if you want to do it programmatically. We will look into that in a bit more detail later. For network management, that is, when you have to reserve a data flow from a talker to a listener, it gets a bit sketchy, because that's done by user space daemons, of course, and there aren't many. Another problem is that there are several ways of doing it. The traditional, AVB-style way, the initial implementation, used the so-called Stream Reservation Protocol. The modern way, especially for pre-calculated or pre-engineered networks, uses YANG and NETCONF extensions. There are some daemons for that, but support for the TSN extensions is not too great. So if you're into that, it's quite a nice thing to work on. The real-time data packetization is mostly user space. Of course, you want to use some kernel features like the ETF qdisc and XDP to have as low overhead as possible and to make sure that your transmission is sent out as isochronously as possible, and you want to use offloading for that. Then there are some very application-specific user space components. For audio-video, you can use the GStreamer plugins, and for industrial control, I'd recommend the open62541 implementation. That's not quite finished yet, but it's a good starting point at least. And for the link layer redundancy, that's what PCR and FRER are. The standards have been finished for one or two years. There's not much hardware supporting them yet, and you really want hardware offloading for that. So you're basically down to proprietary vendor stacks at the moment.
There are efforts to put stuff mainline, but they are not quite there yet. But stuff is coming, and that's the good thing with that. So I think one slide is missing there, which is not too big a problem. Yes, one slide is missing. So basically how to put stuff together with TSN, I will summarize it without a slide. With TSN we have gPTP, that's IEEE 802.1AS for the IEEE standard fetishists here in the room. And for traffic shaping, the basic standard stuff is the credit-based shaper, but there are more time-aware shapers available right now. They are basically making more efficient use of your network, and the way that works is basically reserving bandwidth along your data flow path in your network. Network management, again, that's a bit application-specific. So the audio video and professional audio video stuff is still using the stream reservation protocols, and for the payload, as I already told you, that's really, really application-specific. And for redundancy we use PCR and FRER. Usually. There are some exceptions to that, especially for professional audio video. PCR and FRER were not yet standardized when those standards were written, so there are some, not proprietary, but other redundancy schemes where you basically send two different streams and try to separate your networks, usually by means of VLANs, and try to force different data paths through the network. Basically nowadays you want to go with PCR and FRER whenever your hardware supports that. So, state of the union: the hard stuff is already done. There are already implementations in the kernel, there are user space daemons available. That's again the stuff that's difficult to get right. So if you want to implement those standards, first of all you have to read tons of paper. I did that for an employer, and it took me two years. So that's really hard to get right. And the good thing is that it is already implemented; you just have to use it and you have to use the right knobs.
For some stuff like gPTP and traffic shaping you really want to use hardware offloading: for gPTP you have to use it, for traffic shaping you want to use it. You have to bear in mind that your network gear has to explicitly support gPTP and traffic shaping. That's about the reservation and basically making sure that your traffic shaping is applied properly. That's not true for every hardware, especially not for commodity hardware. And bear in mind that sometimes configuration, especially for traffic shaping, can be quite tricky. As I said, I have added bonus slides to the presentation. I will check that they have the right slides in there later on, or just contact me. And the point is, especially credit-based shapers can be really, really tricky to set up properly and to make sure that you reserve the bandwidth you want, because you want the remaining bandwidth to be available for best effort traffic. So the idea is that you can use, say, 70% of your link for your audio video stuff and still have 30% of your gigabit link, which is what we're usually dealing with for audio video, available for best effort traffic, network management traffic and whatsoever. So you really want to make sure your shapers are configured the Right Way (TM). And it's quite hard to tweak the right knobs in iproute2. So there are good examples, and I'd strongly recommend reading the docs on that. There's also a link to the TSN Read the Docs for Linux. It's quite a good starting point for getting into that whole topic. And yeah, basically I think that's it. Do you have any questions? Any questions here? Thanks for this. What's the highest speed Ethernet implementation of this you've seen? Have you seen anything beyond 10 GigE, for example? I have seen a 10 gig implementation of that. As far as I recall, the standards have some limitations with respect to how you communicate your bandwidth requirements, and they're a bit capped.
I'm sure, and I know, that they are working on that for future revisions of the standards, because of course faster links are becoming available more and more. Most applications for TSN, like the control stuff or the AV stuff, are still running on 100 megabit links. You want to go to gigabit links because you can achieve quite a bit lower end-to-end latencies on faster links. But I personally haven't seen faster stuff than 10 gig so far. But I'd be interested to do so. Do you have happy stories of real users that have put this in production, and can you tell more about this? Yeah, so if you want to check that out you can just Google for Milan and TSN, which is the professional audio video stuff, and shortly before Covid started they ran the Rammstein concert in Munich over a TSN system. It's a really large system with several video walls and hundreds or thousands of audio streams and pyrotechnics and light control and stuff, all converged in the same network. So that's the largest installation for live audio I know of, and I think that's quite a good story to tell. I was curious if you had the chance to play around with Synchronous Ethernet as well. I haven't looked into that too deeply yet, so I can't tell you too much about that. You mentioned XDP. Are you aware of any applications of XDP in that area? To be honest I haven't seen them, and I will start working on some of them for a customer project in just a few weeks, probably. The idea is that because it's layer 2, you don't have much network stack above the hardware layer. So if you can cut some of the Linux networking stack because you don't use it anyway, since you work on raw sockets anyway, you could just cut some of that out and try to achieve lower latencies in your Linux stack there. At the next FOSDEM I can probably give you a talk on that.
This is probably a big question, but how do you go about debugging this sort of stuff? So when setting it up, or if you think there's a problem, how do you go about finding problems? That's actually a bit of a pain point, and you have to know at least a bit what sane values for things like path delays for PTP are. One of the most useful debugging tools I've found so far is a good Ethernet switch, because it will give you output for your stream reservations, and it will give you output for your PTP or gPTP. You can also sniff traffic with wiretaps and analyze it in Wireshark or Scapy or whatever your tool of choice is. That works best, to be honest, for 100 megabit links, because you can use passive taps. It doesn't work that great for gigabit links because it violates the standard a bit. You can also use mirror ports on switches to exfiltrate traffic, but basically it's a more manual approach to debugging. If anyone is interested, I'd like to get in touch, just write me an email, to start a community-based project for automated analysis of TSN networks, because I think it's something we really, really need, especially for people who aren't that deep into the standards. We need to make sure that we can basically have a one-click check and setup, and can tell from a tool at least whether what you're doing looks okay-ish or not. But I'm not aware of any such project so far, so I'd like to start one, but I'm not too experienced in how to start such a project. So if you're experienced in that or are interested in that, just write me an email, get in touch, and maybe we can set something up. Any more questions? That's the last one. You mentioned some protocols for link redundancy. Can they also be used for node redundancy? I'm not entirely sure. I would have to look something up.
I think basically it should work, because it's about the data path, so if one node drops out, that would work as well. But it won't work for the endpoints, so for the talker or the listener of course it won't work, but for nodes in the middle of your graph that would probably work. Okay, thank you very much again for your presentation. Thank you. |
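The credit-based shaper arithmetic that the talk calls tricky can be sketched as follows. This is a minimal sketch, assuming a single highest-priority traffic class and the formulas as commonly given for the Linux `cbs` qdisc (idleslope, sendslope, hicredit, locredit); the function name `cbs_params` and the 1500-byte frame sizes are illustrative assumptions, not values from the talk.

```python
import math


def cbs_params(port_rate_kbit, idleslope_kbit, max_frame_bytes, max_interference_bytes):
    """Credit-based shaper values for a single highest-priority class.

    Rates are in kbit/s, sizes and credits in bytes. Assumes only one
    shaped class; lower-priority classes need additional terms.
    """
    if not 0 < idleslope_kbit < port_rate_kbit:
        raise ValueError("idleslope must be between 0 and the port rate")
    # sendslope: rate at which credit drains while the class is transmitting.
    sendslope_kbit = idleslope_kbit - port_rate_kbit
    # hicredit: credit that builds up while one interfering frame blocks the port.
    hicredit = math.ceil(max_interference_bytes * idleslope_kbit / port_rate_kbit)
    # locredit: credit spent by sending one maximum-size frame (a negative number).
    locredit = math.floor(max_frame_bytes * sendslope_kbit / port_rate_kbit)
    return {
        "idleslope": idleslope_kbit,
        "sendslope": sendslope_kbit,
        "hicredit": hicredit,
        "locredit": locredit,
    }


# Hypothetical example: reserve 70% of a gigabit link, leaving 30% for
# best effort traffic, with 1500-byte frames on both sides.
params = cbs_params(1_000_000, 700_000, 1500, 1500)
print(params)
```

The four resulting numbers correspond to the parameters of the same names on the `cbs` qdisc; whether they are applied in hardware still depends on the offload support the talk mentions.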
Hole punching in the wild
Learnings from running libp2p hole punching in production, measured from vantage points across the globe. |
Hello, everyone. Thanks for joining today. Welcome to our talk on hole punching in the wild. Sometimes I would say we're going to talk about the biggest hack of the internet, which I would refer to as hole punching. We want to talk a bit about learnings from doing hole punching on larger networks. Some might remember me from last year at FOSDEM, where I introduced our way of doing hole punching, and today we're coming here with a bunch of data. So, who are we? Dennis, do you want to introduce yourself? Yeah, okay. My name is Dennis. I'm working at Protocol Labs as a research engineer in a team called ProbeLab, and I'm mainly focusing on network measurements and protocol optimizations that come out of these measurements, and yeah, I was working with Max on this hole punching campaign. Very cool. And Max, again, software engineer. Yeah, you can find us anywhere there online if you want. Yeah, happy to communicate online further after the talk, and we're also around at the venue. Wonderful. Okay, what are we doing today? I want to do a very quick intro to libp2p, a peer-to-peer networking library, but then dive right into the problem of why firewalls and NATs are rather hard for peer-to-peer networking. Then the solution, which in some cases is hole punching, then how libp2p does all that. And then we have been running a large measurement campaign on the internet in the wild, collecting data on how well hole punching works out there, and we're going to present those findings, and then have takeaways of what we learned from there and where we're going from there. All right, libp2p, just a quick introduction. It's a peer-to-peer networking library. It's an open source project. There is one specification, and then there are many implementations of that specification, among others in Go, JS, Rust, Nim, C++, Java, many, many out there. Cool. It provides, I would say, two levels. On the low level, it provides all kinds of different connectivity options.
It takes care of the encryption and authentication, here being mutual authentication, and then things like hole punching, for example. Once you have these low level features of being able to connect to anyone out there in an encrypted and authenticated way, you can then build higher level protocols on top of that, which libp2p also provides, like a DHT (distributed hash table) or gossiping protocols and things like that. My big statement always about libp2p is: it's all you need to build your peer-to-peer application. All right, so to zoom out a little bit, that's libp2p. All the things that we're talking about today are implemented in libp2p, but that doesn't mean you can't implement them in any other networking library if you want to. Our great motivation for libp2p, and in general for peer-to-peer networking, is that we have full connectivity among all the nodes within the network, to the best of their capabilities, obviously. In this talk, we're going to focus on the problem of NATs and firewalls for peer-to-peer networking. Now, before all of you yell: I'm not saying let's get rid of firewalls. At least let's not do that. They have a very important purpose, but in some cases we want to get around them. Okay, cool. Yeah, I'm here in the network devroom. I'm not going to explain what NATs and firewalls are, but we will go a little bit into what that means for hole punching. In general, for hole punching, NATs and firewalls are the big ones that we have to get around. Okay, what is the problem, in some fancy pictures? A wants to send a packet to B, whether that's a TCP SYN or anything, right? And A and B are both behind their home routers. Just imagine two laptops in two different houses, and they want to communicate directly with each other. So A sends a packet to B. It crosses A's router. A's router sets a five-tuple in its routing table for that packet, and the packet makes it to B.
And obviously a very good thing is that B drops that packet, because it's a packet that it has no clue where it's coming from, probably somewhere in the wider internet, and it might be an attack, so it's dropping it. It doesn't have any five-tuple in its routing table, right? Okay, so that is the problem, and we somehow want to make A and B communicate with each other. So the solution here, in some cases, is hole punching. Again, we want A and B to connect to each other. Instead of only having A send a packet to B, we have both of them send a packet at the same time. I'm going to talk in a little bit about what "at the same time" means, but just for now, say we have some magic synchronization mechanism. So A sends a packet to B. B sends a packet to A. The packet from A punches a hole in its routing table, so adding a five-tuple for it. The packet from B punches a hole in its routing table on its side. The packets cross somewhere in the internet. Obviously they don't, but it's a nice metaphor. And at some point packet B arrives at router A. Router A checks its routing table. A little bit simplified here. It lets packet B pass, same on router B, and this way we actually exchange packets. Cool. So now the big problem is how do A and B know when to send those packets, right? It has to happen at the same time, at least for TCP. We might go a little bit into what that means for UDP, but at least for TCP, this needs to happen at the same time for a TCP simultaneous open to happen in the end. So how do we do that? This is libp2p specific. It doesn't need to be libp2p. You can use any signaling protocol on top. Let's say A and B want to connect, and they need to hole punch at the same time, right? They need to send those two packets from both sides at the same time, so one can go through the hole of the other to the other side. What do we do? We need some kind of coordination mechanism, so some kind of public server out there that is not behind a firewall or NAT.
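The five-tuple picture above can be sketched as a toy model. This is a minimal sketch, assuming a router that only admits inbound packets matching an earlier outbound packet and ignoring port translation entirely; the `Router` class, the addresses, and the port numbers are illustrative assumptions, not libp2p code.

```python
class Router:
    """Toy stateful router: inbound traffic passes only through a hole
    that an earlier outbound packet punched. No port translation."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        # Each entry is (proto, local_ip, local_port, remote_ip, remote_port).
        self.table = set()

    def send(self, proto, local_port, remote_ip, remote_port):
        # An outbound packet records its five-tuple, which punches the hole.
        self.table.add((proto, self.public_ip, local_port, remote_ip, remote_port))

    def accepts(self, proto, remote_ip, remote_port, local_port):
        # An inbound packet is admitted only if the mirrored tuple exists.
        return (proto, self.public_ip, local_port, remote_ip, remote_port) in self.table


def one_sided():
    """Only A sends: B's router has no matching entry and drops the packet."""
    a, b = Router("198.51.100.1"), Router("203.0.113.1")
    a.send("udp", 4001, b.public_ip, 4001)
    return b.accepts("udp", a.public_ip, 4001, 4001)


def hole_punch():
    """Both sides send first, so each inbound packet finds a matching hole."""
    a, b = Router("198.51.100.1"), Router("203.0.113.1")
    a.send("udp", 4001, b.public_ip, 4001)
    b.send("udp", 4001, a.public_ip, 4001)
    return b.accepts("udp", a.public_ip, 4001, 4001) and a.accepts("udp", b.public_ip, 4001, 4001)


print("one-sided send admitted:", one_sided())
print("simultaneous send admitted:", hole_punch())
```

Real NATs also rewrite source addresses and ports, which is why the peers first need to learn each other's public addresses through some third party.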
B connects to the relay. A learns B's address through the relay. A connects through the relay, so now the two, A and B, have a communication channel over the relay. B sends a message to A. You can just think of it as a time synchronization protocol. And at the same time, while sending that message, it measures the time it takes for A to send a message back. So at this point, we know the round trip time. And then once we know the round trip time, B sends another message to A and waits exactly half the round trip time. And once A receives that sync down there, you can do the math. If now both of them start, so A when it receives the packet and B after half the round trip time, they actually do the hole punch. They exchange the packets. They cross somewhere in the internet. Both of them punch the hole into their routers and ta-da. We succeeded. We have a hole punch. We have a connection established. Cool. Okay. A little bit in terms of timeline on all of this. Hole punching is nothing new. It's definitely nothing that libp2p invented, not at all. The most obvious mention I know is in RFC 5128. But again, it predates that for sure. But I think it's a nice introduction to hole punching in general, in case you like reading IETF documents. Since then, we have been implementing it around 2021-22, building on a lot of past knowledge around that. I presented this work at FOSDEM 2022 last year, remotely. And since then, we have rolled it out on a larger network, which is the IPFS network, in a two-phase way where all public nodes act as relay nodes, very limited relays. And then in a second phase, all the clients gained the hole punching capabilities. And now on this large peer-to-peer network, we actually have on the one hand the public nodes relaying for the signaling, and then the clients actually being able to do the hole punching work. Yeah.
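The half-round-trip timing described above can be sketched as a small calculation. This is a minimal sketch, assuming fixed one-way delays over the relay and an A that replies instantly; the function name `simulate_sync` and the millisecond values are illustrative assumptions, not measurements from the campaign.

```python
def simulate_sync(delay_b_to_a_ms, delay_a_to_b_ms):
    """Return when A and B start punching, on B's clock, in milliseconds.

    B measures the round trip time over the relay, then sends a sync
    message. A starts when the sync arrives; B starts after waiting
    half of the measured round trip time.
    """
    rtt_ms = delay_b_to_a_ms + delay_a_to_b_ms
    sync_sent_at = rtt_ms  # B sends the sync as soon as the RTT is known
    a_starts_at = sync_sent_at + delay_b_to_a_ms
    b_starts_at = sync_sent_at + rtt_ms / 2
    return {
        "rtt_ms": rtt_ms,
        "a_starts_at_ms": a_starts_at,
        "b_starts_at_ms": b_starts_at,
        "skew_ms": abs(a_starts_at - b_starts_at),
    }


# Symmetric path: both sides start at the same instant.
print(simulate_sync(40, 40))
# Asymmetric path: half the RTT no longer equals the one-way delay,
# so the two sides start some milliseconds apart.
print(simulate_sync(60, 20))
```

The scheme is exact only when the path is symmetric, which is the assumption that waiting half the round trip time relies on.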
And so we have this deployed now in this large network, but it's very hard to know how well it's working, especially across the internet, across all the networks, across all the different endpoints, across all the routing hardware, and so on. So that's why we launched the hole punching month, which is kind of a measurement campaign, which Dennis is now going to introduce. Sorry. Can you hear me? Yes. All right. Thanks, Max. Yeah, as Max said, the libp2p folks conceived this new DCUtR protocol and at some point deployed it to the network. And now we want to know how well it actually works. And for this, we launched, as Max said, a measurement campaign during December. I will get to this in a second. But how do we actually measure these hole punching success rates? The challenge here is that we actually don't know the clients that are DCUtR capable. So where are the clients that we want to hole punch? Because they are behind NATs, we cannot enumerate them. They don't register themselves in a central registry or so. So we conceived this three-component architecture. And the crucial thing here probably is this honeypot component, which is just a DHT server node that interacts with, as Max said, the IPFS network. And it's a very stable node. This means that it gets added to routing tables of different peers in the network. And this increases the chances that peers behind NATs, when they interact with this IPFS network, come across this honeypot. So a peer behind a NAT is, in this diagram in the top right corner, some DCUtR capable peer. This one, by chance, by interacting with the network, comes across the honeypot. And the honeypot then keeps track of those peers and writes them into a database. And then this database is interfaced by a server component that serves those identified and detected peers to a fleet of clients.
And the hole punch measurement campaign consisted of a deployment of those clients to a wide variety of different laptops of users that agreed to run these kinds of clients. And this client then queries the server for a peer to hole punch. As Max said, it connects to the other peer through a relay node and then exchanges those couple of packets and tries to establish a direct connection. And then at the end, it reports back if it worked, if it didn't work, what went wrong, and so on. And so we can probe the whole network, or many, many clients and many network configurations. So we did this measurement campaign. We made some fuss about it during November internally at Protocol Labs, and also reached out to the community. And starting from the beginning of December, we said, okay, please download these clients, run them on your machines, and let's try to gather as much data as possible during that time. And as you can see here, we collected around 6.25 million hole punch results. So this is quite a lot of data from 154 clients that participated. And we punched around 47,000 unique peers in this network. And on the right hand side, you can see the deployment of our clients, of our controlled clients. The color here is the number of contributed results. So the US was dominant here, but we have many other nodes deployed in Europe, but also Australia, New Zealand, and also South America, and also one client from the continent of Africa. And these clients interacted with these other peers that are basically all around the world. So we could measure hole punch success rates all across the globe. And I think we have a very comprehensive data set here. And so we now gathered the data. And at the beginning of, sorry, January, I said, okay, the hole punching month is over, and I started to analyze the data a little bit. And what we can see here on the X axis is that each bar is a unique client.
And on the Y axis, we can see these different outcomes. So, as I said, the clients report back these results, and each result can have a different outcome. These outcomes are at the top. So it can be successful: we actually were able to establish a direct connection through hole punching. Then connection reversed. This means I'm trying to hole punch, so I'm connecting to the other peer through the relay. And the first thing, before we do the hole punching dance, is for the peer to directly connect to us. Because if we are directly reachable, because we have a port mapping in place in the router, we don't actually need to do the hole punching exchange. This is the connection reversed. And as we can see here, it's a little hard to see, but some clients actually have a lot of these results. So this means they have a unique router configuration in place. Then failed is the obvious thing. So we tried, we exchanged these messages, but in the end weren't able to establish a connection. No stream is some internal error that's unique to our setup. So probably nothing to worry about here. And no connection means we tried to connect to the other peer through a relay, but the other peer was already gone. It's a permissionless peer-to-peer network. So it could be that, between the time the honeypot detected the peer and the client trying to establish a connection to it, the peer has already churned and left the network. But actually looking at these clients is a distorted view on the data, because we allowed everyone who participated in the campaign to freely move around. So I was running this client on my laptop and I was moving from a coffee shop Wi-Fi network to a home network to a university network and so on. And hole punching is actually dependent on those network configurations instead of just me running the client.
So the challenge here with the data analysis, and I'm also not done with that yet and happy and open for suggestions, was to detect these individual networks that the clients operated in. With each hole punch result, the client reported their listening IP addresses and so on. And I grouped them together to identify unique networks that those clients operated in. And at the end, I arrived at 342 unique client networks. And then the graph looks like this, probably not much different than before. But also there are some interesting unique network outcomes here that I will also get to in a bit. The most interesting graph is probably this one. So what's the success rate of this protocol? On the x-axis, we have the success rate, binned in just 5% bins. And on the y-axis, the number of networks that had that success rate when probing the whole other network. And the majority of networks actually had a success rate of 70%. So I think this is already, actually, I think it's amazing, because going from not being able to connect at all to having a 70% chance to establish a direct connection without an intermediary is actually pretty great. But then there are also some networks that have a very low success rate. And these are probably the most interesting ones. Then also, oops, the IP and transport dependence is also quite interesting as an angle to look at the data. Here we can see that in the top row we used IPv4 and TCP to hole punch. So when these clients exchange these connect messages, they actually exchange the publicly reachable IP addresses of those two peers that want to hole punch. And in our measurement campaign, we restricted this to only IPv4 and TCP, and for some other hole punches only to IPv6 and QUIC, which is on the bottom right. And so we can take a look at which combination is more successful than the other.
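The per-network success-rate graph described above can be sketched as a grouping step followed by 5% binning. This is a minimal sketch, assuming each result is a (network id, outcome) pair and that only "success" and "failed" outcomes count toward the rate; the function name, the outcome labels, and the toy records are illustrative assumptions, not the campaign's actual schema or data.

```python
from collections import defaultdict


def success_rate_histogram(results, bin_width=5):
    """Map each bin start (in percent) to the number of networks in that bin."""
    per_network = defaultdict(lambda: [0, 0])  # network -> [successes, attempts]
    for network, outcome in results:
        if outcome not in ("success", "failed"):
            continue  # e.g. "no_connection" says nothing about hole punching
        per_network[network][1] += 1
        if outcome == "success":
            per_network[network][0] += 1

    histogram = defaultdict(int)
    for successes, attempts in per_network.values():
        rate = 100 * successes / attempts
        # A rate of exactly 100% falls into the last bin, 95 to 100.
        bin_start = min(int(rate // bin_width) * bin_width, 100 - bin_width)
        histogram[bin_start] += 1
    return dict(histogram)


# Toy records, made up purely to exercise the function.
toy = [("net-a", "success")] * 7 + [("net-a", "failed")] * 3
toy += [("net-b", "failed"), ("net-b", "no_connection")]
print(success_rate_histogram(toy))
```

Which outcomes belong in the denominator is a modelling choice; the sketch picks the narrowest one.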
And here we can see that on IPv4, TCP and QUIC actually have, if you average the numbers, a similar success rate. But on IPv6, it's basically not working at all. And these unexpected things are actually the interesting ones for us. Either it's a measurement error, or there's some inherent property of the networking setup that prevents IPv6 from being hole punchable, basically. In the previous graph, we showed using only TCP or only QUIC. But if we allow both transports to simultaneously try to hole punch, we can see that in 81% of the cases we end up with a QUIC connection. And this is just because QUIC connection establishment is way faster than TCP connection establishment. So this is an expected result here, just to verify some of the data. And now two takeaways for us, for protocol improvements. So if clients are running in VPNs, we can see that the success rate actually drops significantly, from around 70% to less than 40%. And my hypothesis here is that the round trip time that Max showed previously is measured between A and B. But what we actually need is the round trip time between router A and router B. And if your router basically is the exit node, or the gateway that you're connected to from your VPN, this can differ by dozens of milliseconds, actually. And so the round trip time doesn't add up, and the whole synchronization is a little off. So this is potentially a protocol improvement here. And then, also interesting: Max said they are exchanging these messages during the hole punch. But actually, we try this three times. So if it doesn't work the first time, we try it again. And if it doesn't work the second time, we try it yet again. But when we look at the data, if we end up with a successful hole punch connection, it was actually successful on the first attempt in 97% or 98% of the cases. So this is also something for the next steps for us.
We should consider changing our strategy on the second and third try to increase the odds. So if we stick with the three retries, we shouldn't do the same thing over again, because as we saw from the data, it doesn't make a difference. So we should change our strategy here. And so one thing would be to reverse the client and server roles in this QUIC hole punching exchange. This would be something, and the other protocol improvement for us, as I said, would be to change the measurement of the round trip time. And for the future, the data analysis: right now, what I showed here is basically aggregates across all the data. And the interesting part is basically why a specific client or a specific network has a worse success rate than others. So these are the individual things to look into; maybe there's a common pattern that we can address with the protocol to increase the success rate. And yeah, then identify those causes. And also, at the end of all of this, we want to craft a follow-up publication to something that Max and some fellow friends, I would say, published just last year. And yeah, we want to make the data set public and so on and so forth, for others to benefit from the data and do their own analysis. Yeah, and with that, get involved, talk to us here at the venue about all of this. libp2p is a great project. Have a look at all these links. Get in touch, contribute, and join our community calls. And yeah, I think that's it. Thank you very much. What you implemented there, is it exactly the ICE/TURN stuff, or how different is it from this? So we differ in some cases; it's definitely very much motivated by ICE and TURN. So a couple of things: we don't do TURN itself, we have our own relay protocol, because nodes in the network act for the public as relay nodes.
And the problem is you don't want to relay any traffic for anyone, but you want to make this really restricted in terms of how much traffic and for how long. If you run a public node, you don't want to be the next relay node for everyone out there. And then, what we built here is very much TCP specific, but it also works well with UDP. We need the synchronization. And as far as I know, at least the WebRTC stack is very focused on UDP, where timing doesn't matter as much. So you saw the timing protocol, right? And that is very TCP specific, where we want a TCP simultaneous connect, which allows two SYNs to actually result in a single TCP connection. This is about your analysis. I guess a lot of this depends on the default configurations of the firewall. Did you find out in your research which types of firewalls or configurations stop hole punching? So, yeah. Not in its entirety, but what we did is this: people that signed up for the measurement campaign gave us information about their networks. And so, if we find something fishy in the data, we could also reach out to them and ask what the firewall setup in their specific network is. We also gather data about port mappings that are in place. So, what a libp2p host tries to do is establish a port mapping inside your router. And this is also reported back. And what we also did is try to query the login page of these routers and get some information about what kind of firewall or router actually was preventing you from connecting to someone else. So, these are the data points that we have to draw some conclusions around this. But more than this, we don't have. But I think this is already pretty conclusive for a wide variety of analyses. What I was just wondering about is, do you have any data on how many clients actually were behind a NAT? So, all these clients that the honeypot detected were clients that are behind a NAT. So, these are all libp2p hosts.
And with the default configuration of libp2p hosts, if they only announce relay addresses, this means that they must not be publicly reachable, which for us is equivalent to being behind a NAT. So, yeah, it should be. There's probably some error there. So, then all of the IPv6 hosts you were trying to connect to were also behind a NAT? On IPv6? Yes, yes. And this is the interesting thing. So, I cannot explain this yet. Maybe it's a measurement error from us. Maybe it's, as I said, some inherent property of something. Maybe it's a protocol error. I don't know. And this is the interesting stuff in these kinds of things. Thanks. I'm very curious. Yeah. I was wondering, does it also work with multiple NATs? Can you punch through two NATs? So, a friend of mine who I convinced to run these clients actually was running behind two NATs and it was working. But I'm not sure how many people actually ran behind two NATs. But in theory, yeah, maybe Max, you can explain this. Yes. So, right now, we don't really have a lot of data about two NATs. And also, we don't have data on the case, which I think is called hairpinning, where you're within the same network but you don't know that you're next to each other. And you actually want to hole punch through your own NAT, even though you can't connect to each other directly. So, there are some challenges. Do we still have time for another question? So, you said that for UDP it should work similarly. Did you do any experiments with that? Because in the past, we had a custom UDP hole punching thing and the routers were pretty bad. They forgot the mapping within 20 seconds or something. Yeah. So, we ran this measurement campaign on TCP and QUIC. And QUIC in the end is just UDP. And what we do is something similar to STUN in the ICE suite, where we continuously try to keep our mapping up. And then on NATs that do endpoint-independent mappings, that actually helps.
So, as long as we keep that up for, like, I don't know, every 10 seconds or so, then our mapping survives, even on UDP. Okay, cool. Thank you very much. |
"CNI Unleashed"
How to deal with CNI plugin chains. |
Okay. Take a picture. So, good morning. My name is Daniel Mellado. I'm here today with my colleague Miguel. We are both principal software engineers at Red Hat. I'm currently working in the Epsilon DTT team, basically on monitoring, and he's on the OpenShift Virtualization team. I guess we just wanted to speak today a little bit about CNI, which stands for Container Network Interface, which is basically all the networking in Kubernetes, but not limited to that. Because I think one of the things that this project lacks the most is documentation on how this works, how you create a plugin, what the primitives are. I think that's something that is super simple, but there's little to no documentation besides the spec. So, let's walk through that. Yeah, this is a quick intro. So, you may have noticed that we are going to be speaking about CNI plugin chains specifically. That means that we are going to be basically putting a couple of plugins in chain mode. But first of all, CNI may stand for three different things, and I want to be a little bit clear here. You have the CNI specification itself, which is a document. It's fine. You can read it. Then you've got the plugins. So, if you go to GitHub, you can see the CNI plugins. This is a set of plugins basically maintained by the community. Bridge, macvlan, you name it. There's a couple of those. And the third thing, it may be just a couple of plugins. So, those are basically the three things that you may be interested in. We are always going to be speaking about the plugins here in this session, but we'll hopefully discuss anything else after that. So, just in case, I mean, here we are going to be speaking about Kubernetes, but let me note that the CNI specification and the plugins do work totally without Kubernetes, with any runtime engine such as Rocket, and in fact, this started out with Rocket.
And in case you started out that way, you can be running this with CRI-O, rkt, whatever you want to play with. So, what is a CNI plugin specifically? CNI plugins are binaries, which are copied over to every host in your Kubernetes cluster. So, if you want to install one, you probably need a DaemonSet to deploy it, or you do that manually. Who runs these CNI plugins? You could run them yourself, but usually you would have the kubelet run them for you. Is there anybody here who doesn't know what a kubelet is? Okay. The kubelet... oh, anyway. They were trying me out. So, I'll go ahead. Basically, when the kubelet, which is a mystery thing that you don't know what it's about, runs the binary on the host, it creates a network namespace, which is tied to a veth interface. As this guy doesn't know what a veth interface is, I'm going to spell it out for him. It's a virtual point-to-point tunnel, so you can hook it up to whatever you want. Usually these are connected to OVN, Open vSwitch, or whatever virtual switch; it depends on the CNI plugin implementation. How do I configure the thing? These are configured by a configuration file. We're going to be speaking later on about how this changes. But FYI, and Miguel will be speaking more about this later: prior to CNI specification 0.3.0, you weren't able to put your plugins in chain mode. It would just break. There are a couple of CNI components that you need to put in the configuration: basically, the name of your CNI plugin and the type, and the type should match the name of your binary, or it will break afterwards.
Then you also need some env vars, which tell the plugin what it is being asked to do, and on which container and interface. If you put all this together, and again, recall that a CNI plugin is a binary, so you run it as you would run any binary, it all goes to your CNI plugin, and then you get an exit code. If you did things right, this should give you an exit code of zero, no error, and then you get your standard output, which is just JSON. Why JSON? Why not use gRPC and a daemon, like everything in Kubernetes, you may ask? Because, again, this wasn't meant to be used only with Kubernetes, and if you stay for more of the networking sessions, they'll probably be speaking at some point about using YAML for that, too. But for the sake of this session, what you need to remember is this prevResult, and I'm anticipating some stuff from my colleague here. This is the previous result of the plugin, as JSON, and if any plugin gets a previous result, it should output the next one, in a serial way. I was speaking about the specification. What the hell is that? If you're implementing your plugin, you need to make sure that it observes several primitives: ADD, DEL, CHECK, and VERSION. So, for instance, I want to add a pod network interface, or I want to delete a pod network interface. CHECK and VERSION, to be honest, even if the specification really asks you to honor and observe them, aren't really used by most of the commercial ones. For instance, Cilium, I guess, doesn't do any kind of CHECK, maybe VERSION, but those aren't really needed, and we'll be discussing that later on. So, this is what a config file usually looks like. It's no big deal. Super simple.
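To make the invocation model described here concrete, the following is a minimal Python sketch of how a runtime could call a plugin binary: parameters go in as environment variables, the configuration goes in on standard input, and the result comes back as JSON on standard output with exit code zero on success. The `CNI_*` variable names come from the CNI specification; the function names, the example paths and the container ID are assumptions made purely for illustration.

```python
import json
import os
import subprocess


def build_invocation(command, container_id, netns, ifname, cni_path, config):
    """Return (env, stdin_bytes) for a single plugin invocation."""
    env = dict(os.environ)
    env.update({
        "CNI_COMMAND": command,            # ADD, DEL, CHECK or VERSION
        "CNI_CONTAINERID": container_id,   # hypothetical ID chosen by the caller
        "CNI_NETNS": netns,                # path to the network namespace
        "CNI_IFNAME": ifname,              # interface name inside the namespace
        "CNI_PATH": cni_path,              # directory holding the plugin binaries
    })
    return env, json.dumps(config).encode()


def run_plugin(binary, env, stdin_bytes):
    """Run the plugin binary and parse the JSON it prints on stdout."""
    proc = subprocess.run([binary], input=stdin_bytes, env=env,
                          capture_output=True, check=False)
    if proc.returncode != 0:
        # A non-zero exit code signals failure; surface whatever was printed.
        raise RuntimeError(proc.stdout.decode() or proc.stderr.decode())
    return json.loads(proc.stdout)
```

Calling `build_invocation("ADD", ...)` and then `run_plugin("/opt/cni/bin/bridge", env, data)` would only work on a host where that binary exists and with sufficient privileges; the path is an assumed install location, not something stated in the talk.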
The cniVersion, again, takes you back to what we were speaking about before. This one would work with CNI plugin chains, because it's 0.3.1. If you put 0.2.0, it would fail miserably, telling you that it hates you, and it would crash. Then you've got the name here, which I don't care about, but this type should match the name of your binary later on. Then you can put in some plugin-specific things. For instance, this comes from the bridge plugin, which Miguel will be showcasing later on. So what do I need for a bridge? I need the name, isDefaultGateway, forceAddress. This is a little bit special, but I won't be discussing it because you'll see it later in the config. Okay, so you told me that CNI plugins are binaries, and I've also got JSON configurations. Where do I store these things? When you define the kubelet configuration, you tell it the CNI binary path, and the CNI binary path is just a directory where I store all the plugins. Sadly, there's little to no logging in most of the plugins, so if something breaks, you have to go and see what's going on here. One of the things that is most likely to fail is that you don't have the binary installed here, and it will fail and tell you that it hates you again. The config directory is here, and it holds just a couple of JSON files, which usually map to the names of the binaries, and that is where you put the plugin-specific configs. Plugin chains. Let's go for that. That's the big deal. Again, they are available since CNI 0.3.0 and required since version 1.0, because now everything is a chain, even if it's just a single plugin, and everything is a conflist, again, even if it's just a single plugin. So if you want to do things the old way, it will just break.
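As a rough illustration of the single-plugin config file and the two failure modes mentioned here (a `type` with no matching binary, and a version too old for chaining), here is a small Python sketch. The config values, the network and bridge names, and the helper functions are all hypothetical; only the key names `cniVersion`, `name`, `type`, `bridge`, `isDefaultGateway` and `forceAddress` are taken from the CNI spec and the bridge plugin.

```python
import json

# Hypothetical single-plugin configuration in the pre-1.0 style.
SINGLE_PLUGIN_CONFIG = """
{
  "cniVersion": "0.3.1",
  "name": "demo-net",
  "type": "bridge",
  "bridge": "demo-br0",
  "isDefaultGateway": true,
  "forceAddress": false
}
"""


def parse_version(text):
    """Turn '0.3.1' into (0, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_config(raw, available_binaries):
    """Return (config, problems) for a raw JSON plugin configuration."""
    conf = json.loads(raw)
    problems = [f"missing key: {key}"
                for key in ("cniVersion", "name", "type") if key not in conf]
    if "type" in conf and conf["type"] not in available_binaries:
        # The type must match the name of a binary in the CNI binary directory.
        problems.append(f"no binary named {conf['type']!r}")
    return conf, problems


def supports_chaining(cni_version):
    """Plugin chains exist from spec version 0.3.0 onwards."""
    return parse_version(cni_version) >= (0, 3, 0)
```

Checking the binary name up front is a design choice of this sketch; as the speaker notes, real plugins log very little, so a missing binary is otherwise only discovered when the call fails.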
I was also pointing out that, let's assume that you've got a couple of plugins here: you've got the bridge plugin, then you've got a firewall plugin, which is one of the most common use cases for this, and then, let's say, I put a couple more plugins along with the firewall because I like to, and I'm going to put another macvlan interface here. All these plugins are expected to honor that they pass the prevResult to each other, even if they don't know anything about what comes before, because the bridge plugin will know how to make a bridge, but the macvlan plugin won't. So everything goes through the plugins, all the info is passed using prevResult, and we'll be showing you that later on in the demo, if it doesn't fail. This is not a config for a CNI plugin, this is a config list. As you see, it pretty much resembles what you had in the config file for a plugin before. We've got the cniVersion, and okay, I know I'm saying this a lot, but it should be at least 0.3.0, otherwise this will just break. This is your config list, and then you get a list of plugins. Here we've got a few of those, and they work together. In this case, we've got a name, because all the plugins need a name, and the type, which matches the binary, and then we've got the specific configurations. In the demo, we'll be showing you how these plugins get linked through the prevResult and so forth. But as you see, it's just a list of plugins with name, type, and then specific configurations. And if you think, okay, I'm buying this, I want to create my own plugin, you really only need a name and a type, even if it doesn't do anything. Okay, go ahead, Miguel.
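The way a config list is walked can be simulated in a few lines of Python. This is only a sketch: the plugins are plain Python functions standing in for binaries, and `fake_bridge`, `fake_firewall` and the MAC address are invented for the example. What it tries to show is the hand-off itself: each plugin receives its own config plus the list-level `name` and `cniVersion`, and every plugin after the first also receives the previous plugin's output as `prevResult`.

```python
import copy


def run_chain(conflist, plugins):
    """Call each plugin in order, feeding it the previous plugin's result."""
    prev_result = None
    for entry in conflist["plugins"]:
        conf = copy.deepcopy(entry)
        # The list-level name and version are filled in for every plugin.
        conf["name"] = conflist["name"]
        conf["cniVersion"] = conflist["cniVersion"]
        if prev_result is not None:
            conf["prevResult"] = prev_result
        prev_result = plugins[entry["type"]](conf)
    return prev_result


def fake_bridge(conf):
    # First in the chain: builds the result from scratch.
    return {"cniVersion": conf["cniVersion"],
            "interfaces": [{"name": "eth0", "mac": "0a:58:0a:00:00:02"}]}


def fake_firewall(conf):
    # Knows nothing about bridges; it just passes the previous result along.
    return conf["prevResult"]
```

With a list such as `{"cniVersion": "0.3.1", "name": "demo-chain", "plugins": [{"type": "bridge"}, {"type": "firewall"}]}`, the final result is whatever the last plugin returned, which is why a plugin that drops `prevResult` breaks everything after it.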
Now, use cases. I guess I spoke maybe a little bit too much, even though we are late because we've got to rush to the airport, but I'm handing over to my colleague Miguel, who's going to show you around a little bit, and then, if it works, a demo. Go ahead, Miguel. Hello? Well, I'll use this. So this is a list of the plugins that are provided by the CNI maintainers and are available in the containernetworking/plugins repository. The first is the tuning CNI. It allows you to configure a list of sysctls. So if you need some sysctls to be set inside of your pod, you use this tuning CNI. The bandwidth CNI, as the name implies, throttles either the ingress or egress traffic to your pod, if you want to do that kind of thing. The firewall plugin only allows access to and from the IP addresses that are referenced in the result that the plugin got. And port mapping, as the name kind of implies, configures port forwarding from the host into the container for the set of ports that you specify in the configuration. Now, having said this, let's go to the demo, and I'm now wondering how I can do this while holding the microphone. This doesn't work, I think. Easy. Okay, so the first thing that I should mention: Daniel referenced Kubernetes a lot, but please remember that Kubernetes is just a user of CNI. You can use CNI by itself, and as such, we're going to be using something called cnitool. It's just a binary that you point at a certain CNI binary; you give it a set of environment variables and the plugin's configuration via standard in. As Daniel said before, you pass the configuration of the plugin, you give it the input parameters, which are the environment variables, and you execute the binary. And this is how we're going to run the demo.
You can follow it on this link. But the first thing I should explain is the scenario we're trying to achieve, which can be seen here in the bottom right corner. It's a very simple thing. We just want to have a bridge and two namespaces interconnected via this bridge, and we're going to be configuring static IPAM, static IP addresses, on each of the namespaces. Then we'll run iperf from the client to the server, and we're going to see how it fares. First, I also need to show you the configuration that I'm using for this. Okay, it's this one. This is the configuration that we're going to use to achieve this scenario. We're going to use a plugin of type bridge. This is the name of the binary that will be invoked on the host to create the bridge. We give it the name, we enable the ips capability so you can put a static IP on it, and we tell it that we want static IPAM. That's pretty much the configuration that you need to give it. So let's just run the example. Okay, this parameter here that you see... can you see it? Is the font big enough? It's good. Okay, thanks. You have to give this the name of the configuration. If you see "unlimited bandwidth", it's the attribute here in the upper left corner that you see under name. It's the same thing. It must match. Now this has configured the scenario and it's running the iperf session between the client and the server. As you see, we're getting a very big bit rate of around 60 gigabits per second. So this is with a straight configuration. Now let's use a different configuration, where we reuse the first configuration, but we use it in a chain, and after it we put the bandwidth plugin. What this does is throttle the input traffic into the network interface of both namespaces. So we're going to be doing the exact same thing.
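The chained configuration described here, the bridge with static IPAM followed by the bandwidth plugin, could look roughly like the structure built below. This is a guess at the shape, not the file used in the demo: the network name, the bridge name and the rate values are invented, while `capabilities.ips`, the `static` IPAM type and the `ingressRate`/`ingressBurst` keys are taken from the upstream plugin documentation.

```python
import json


def bandwidth_chain(name, bridge, rate_bits_per_sec, burst_bits):
    """Build a config list: bridge with static IPAM, then a bandwidth limiter."""
    return {
        "cniVersion": "0.3.1",
        "name": name,
        "plugins": [
            {
                "type": "bridge",
                "bridge": bridge,
                "capabilities": {"ips": True},   # lets the caller pass static IPs
                "ipam": {"type": "static"},
            },
            {
                "type": "bandwidth",
                "ingressRate": rate_bits_per_sec,
                "ingressBurst": burst_bits,
            },
        ],
    }


# Hypothetical values: limit ingress to roughly 1 Mbit/s.
print(json.dumps(bandwidth_chain("limited-bandwidth", "demo-br0",
                                 1_000_000, 1_000_000), indent=2))
```

The printed JSON is the kind of file that would be dropped into the config directory; the name passed to the tool on the command line would then have to match the `name` field, as the speaker points out.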
But this time with this different configuration and the bandwidth limiter added. And as we see, the bit rate that we're getting is a lot less, and it should map somehow to the values that we've configured here. What this shows is that you can use this bandwidth plugin in a chain in order to achieve a different use case than you had before: if you want to throttle traffic, you use this type of plugin. Let me just check the time. Yeah, we're good. We still have 10 minutes. So let's run the second demo. Okay, the scenario is a lot simpler this time. We have the same bridge as before, but we just have one network namespace connected to it. What we're going to do is show you how the chain actually works, focusing on what Daniel said before: you always need to handle the previous result, you need to account for it, and you need to pass it along the chain continuously. Okay, let's show the configuration of this plugin. This chain might look a lot more complex than the one before because it has more things in it, but it's very, very simple. First it will call the bridge plugin, which will create the bridge. Then we'll invoke the debug CNI. This CNI plugin is very, very simple. The only thing it does is print the result it got from the previous plugin. So what we're going to see here is the result of the first plugin in the chain. Afterwards, we run the tuning CNI plugin, and what we're going to do is change the MAC address that we got on the interface of this dummy namespace that we see here. So the idea is that we first run the bridge plugin, which will assign a random MAC address to the interface that is in this namespace. We'll print that, we'll run tuning to change that MAC address, and we're going to print again to see the result of the previous plugin.
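The two middle steps of this chain can be approximated in Python as pure functions over the JSON that a plugin reads from standard input. This is a simulation under stated assumptions, not the code of the real debug or tuning plugins: nothing here touches a network interface, and the function names are made up. It only shows what happens to the result: the debug step prints and echoes the previous result, and the tuning-like step rewrites the MAC of the interface that has a `sandbox` (the one inside the namespace).

```python
import copy
import json
import sys


def debug_plugin(stdin_text, log=sys.stderr):
    """Print the previous result and pass it on unchanged."""
    conf = json.loads(stdin_text)
    prev = conf.get("prevResult", {})
    print(json.dumps(prev, indent=2), file=log)
    return json.dumps(prev)


def set_mac_plugin(stdin_text):
    """Tuning-like step: update the MAC of sandboxed interfaces in the result."""
    conf = json.loads(stdin_text)
    result = copy.deepcopy(conf.get("prevResult", {}))
    for iface in result.get("interfaces", []):
        if iface.get("sandbox"):           # only the interface inside the namespace
            iface["mac"] = conf["mac"]     # "mac" is the statically configured value
    return json.dumps(result)
```

Feeding the output of `debug_plugin` into `set_mac_plugin` and then into `debug_plugin` again mirrors the bridge, debug, tuning, debug sequence of the demo: the second print shows the statically configured MAC instead of the random one.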
And that's pretty much it. Let's just run the example and show you what I actually mean. So this log here is the result of the first call of the debug CNI. In what would be the pod interface, which is identified by this attribute, sandbox, that points to the path of the namespace, we see the name of the interface that was created in that network namespace. And we see that it was assigned a random MAC address, which is shown here. We then ran the tuning plugin, which changed this MAC address, and we finally printed the previous result again. And we see that it changed to the MAC address that we specified statically in the plugin configuration, which is exactly what we wanted to show. It's a very simple demo, but I think it illustrates in a very neat way how chaining actually works, step by step. And with this, we arrive at the conclusions. I'd like to tell you again what we just told you, focusing on the things that we think are the most important on this slide. First thing: remember what these plugins do. They add more stuff to the pod, so they enable different use cases. You can prevent IP spoofing. You can throttle bandwidth, as we've seen. You can configure port forwarding from the host to different containers. You can configure different sysctls, and actually you can also create an allow list of the sysctls that you can use inside of the pod. And finally, if you have to keep one thing from this presentation, it is that a meta-plugin must always handle the result of the previous plugin. You need to account for it. First of all, because if you're a plugin, you don't know if anything will run after you in the chain. The user will configure it.
So you don't know what's going to happen afterwards, so you need to send a result. And if you're somewhere in the middle of the chain, the least you must do is grab the result you got from the previous plugin and echo it to the next one. Now finally, remember two things: plugin chains are only allowed starting from CNI 0.3, and they're the only configuration type allowed starting from CNI version 1.0. If you used the first configuration example we showed on CNI 1.0, it wouldn't work. It would explode, it would fail, and make you miserable. And the idea here is: always know your previous result, because that's probably the most information you'll get about anything that ran before you. As Daniel said, logging is clearly not the thing that's most prevalent, at least in the plugins that are maintained by the CNI maintainers. And this concludes our talk. Thanks a lot, guys. Questions? Thank you. Can you tell a little more about use cases of CNI without Kubernetes? Okay, that's a really good question, and it's probably going to eat all the time that we have. I work on KubeVirt, and what we do in it is this: there's a pod, and inside the pod you run a virtual machine. Now, what CNI does is configure the pod interface. But what you want to have is networking inside the VM, so you need some way to extend the connectivity from the pod interface into the VM. You need to have something there. We have code in the KubeVirt code base to achieve this, and one thing we could do is use CNI chains to offload that entire code to CNI plugins, which would create, for instance, the bridge that you have inside of the pod to extend that connectivity, and the tap device from which the VM would create the emulated network device. So I really think this could be modeled using CNI.
That's something we still need to see. It's a very rough idea yet, but it's an example, I think. Besides that, I guess the quickest one is, you know, he was mentioning cnitool. cnitool is just used to develop CNI plugins. It doesn't involve Kubernetes at all, and you can see that the plugins just run there. And then there's rkt: rkt was on its own. It didn't use Kubernetes at all, and it's where most of the CNI plugins were originally developed. You just put in any kind of runtime engine, and it works. No Kubernetes needed. So if I have a CNI plugin that sets up some external state, for example, a firewall that might even be a separate device, and something goes wrong, and I lose the delete, how do I recover from that? That's an amazing question. You must design everything assuming that CNI deletes will be missed. It might happen all the time. So you need to have a reconcile loop of sorts that knows about your relevant state and reconciles it somehow. I didn't see a way to even check if the state should still be active. How does my CNI plugin know that something is still there? It's clueless. You need to monitor this little kid rather than assume that he will not fall and hit his head on the corner of the table. You need to do that out of band. It's not designed to allow for that. I'm sorry. Hi. During the presentation you mentioned that some CNI plugins, like Cilium, but I guess you were also referring to other plugins, are not doing logging and are not using the entire API. How come? I guess you heard it. It's okay? So it kind of depends on your CNI plugin implementation. For instance, some CNIs do implement logging, but it's not something that is within the specification itself at all.
So you may be totally fine doing no logging at all, but then, when it comes to debugging, you have to go to the kubelet logs, you have to go to the CRI-O logs, and debug from all of those. So it depends on your implementation. For instance, my colleague, who is here, implemented some logging of his own, but that's not something that is in every plugin. For sure, it's not in any of the community-maintained ones. Okay, thanks. Any more questions? Well, thanks a lot for your time and for listening to us, and bye-bye. Enjoy your time. |
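The advice from the Q&A above, to design for missed DELs with an out-of-band reconcile loop, can be sketched as follows. Everything in this Python snippet is hypothetical: how a plugin learns which containers are still alive, and what "external state" looks like, depends entirely on the plugin. The sketch only shows the core comparison between what should exist and what actually exists.

```python
def reconcile(live_container_ids, external_state, remove):
    """Remove external entries whose container no longer exists.

    live_container_ids: IDs the runtime still knows about (source is plugin-specific).
    external_state:     mapping of container ID to whatever was set up for it.
    remove:             callback that tears down the state for one container ID.
    Returns the sorted list of IDs that were cleaned up.
    """
    stale = sorted(set(external_state) - set(live_container_ids))
    for container_id in stale:
        remove(container_id)
    return stale
```

Run periodically, a loop like this would clean up firewall rules or other external state left behind when a DEL never arrived; the period and the source of truth for live containers are design decisions the talk does not specify.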
Networking management made simple with Nmstate
Taming the internals of NetworkManager |
So, guys, there are some more seats. Please have a seat. And whenever you want, we can start. Okay. All right. So, let's start. Welcome, everyone. My name is Fernando. I'm a senior software engineer at Red Hat. I work on the networking services team, mainly focused on network management. Today I'm going to talk about how we can make network management simpler and how we can make the life of sysadmins a little bit better. I know that network management can be complex, especially because most sysadmins sometimes need to configure networking and they may not be network experts. So, the first thing is: what is NetworkManager? NetworkManager is the standard Linux network configuration tool suite. It's on almost every distribution and it configures networking. It's just like that. It takes care of setting the IP address and all the properties, and it manages routes, manages DNS, manages almost everything. It makes sure that when other tools modify the network configuration, NetworkManager is notified, it updates the status, et cetera. NetworkManager provides a D-Bus API and also has its own library to communicate with the daemon, which is libnm. This is why there are some tools built around NetworkManager. For example, and I know that some of you know them, nm-applet, nmtui, nmcli, or nm-cloud-setup. There are more, and we are building more of them. So, as I said, the NetworkManager daemon is the backend that we are going to use for the new tool, Nmstate. And Nmstate is a little bit special because it's declarative. The idea here is that, as a user, you just need to define what you want to configure, and you don't need to care about the how. You can define the state: you can define what IP address you want, you can define what properties, if it's a bond, the bond properties, if it's a bridge, whatever you want.
And Nmstate will take care of it and will resolve all the interdependencies between the interfaces. We configure the routes, we configure everything that is needed to make it work. We use NetworkManager as a backend, and we communicate with NetworkManager to apply the configuration, but we perform some operations that we are going to talk about later, and for those we needed Nispor to communicate with the kernel. We had a problem: initially we were using sysfs, and it was not working well, so we decided to create Nispor, which is another library, written in Rust, that allows you to communicate with the kernel and get real-time network configuration from it. So the first question could be: why Netlink and why not sysfs? sysfs is not an API. You need to understand that, because I know that a lot of people build their tools by parsing sysfs, writing to sysfs, using sysfs everywhere, and this can be problematic, because sysfs is not an API, and if you read the documentation, it's not stable; it can break between releases. But Netlink is an API. Netlink is stable, and it's not deprecated, whereas sysfs is deprecated, so for most of the new network options that are being added to the kernel, they are not providing a sysfs interface. Also, Netlink uses sockets, and that's great, because you don't need to open a file, read it, parse it, and then get the proper value. Using sockets, you can get the attribute, you know the type, you communicate through the Netlink socket, you get the value, and you get proper errors, so everything is better. Okay, so let's go to the important part. Nmstate handles everything. You don't need to do anything; you just need to define what you want, and ideally you apply that state, and everything is configured after some operations. Sometimes it's not like that, so we have a lot of steps in the middle.
We do, for example, validation, normalization, and verification; we are going to explain them later. It will also point out what is going wrong, so you can fix it. For example, if you are configuring a MAC address, and this MAC address is not being configured correctly, it will show you which MAC address is configured in the kernel and which one you wanted to configure, and then you need to solve that. Also, if you put in an invalid IP address, it will tell you this IP address is not valid, please change it. One great thing is that if you configure an MTU that is bigger than the one supported by the driver, it will let you know about that. Okay, so if you misconfigure something, good, we can do a rollback. Let's talk a little bit about rollbacks. This is already supported in NetworkManager, it's not new in Nmstate, but it's a little bit complex to use, and in Nmstate we simplify it. Basically, every time you do an operation, Nmstate does the verification, and if something goes wrong, it rolls back to the previous state. But maybe nothing goes wrong and you still lose connectivity, because you removed the IP address, and we, as Nmstate, cannot know whether that is what you wanted or not. So we allow you to pass an option, which is --no-commit. You can run this simple command: nmstatectl apply, the YAML file (we're going to see the format of the YAML file later), then --no-commit, and a timeout. If you don't specify a timeout, it's going to be 60 seconds by default. So what happens if it went well? It's what you wanted, you have connectivity, everything's good. Okay, then nmstatectl commit, and the configuration will be there permanently. But what happens if you notice that you messed up? All right, nmstatectl rollback, and you're going to be back on the previous state.
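The apply, check, then commit-or-rollback workflow described here could be scripted along these lines. This is a sketch, not an official recipe: the `nmstatectl` subcommands and the `--no-commit` and `--timeout` options are the ones named in the talk, while `still_reachable` and the injectable `run` parameter are hypothetical helpers added so the logic can be exercised without touching a real system.

```python
import subprocess


def apply_with_checkpoint(state_file, still_reachable, timeout=60,
                          run=subprocess.run):
    """Apply a state without committing, then commit or roll back.

    still_reachable: callable returning True if the host is still usable.
    run:             command runner, replaceable for dry runs and tests.
    """
    run(["nmstatectl", "apply", state_file,
         "--no-commit", "--timeout", str(timeout)], check=True)
    if still_reachable():
        run(["nmstatectl", "commit"], check=True)
        return "committed"
    run(["nmstatectl", "rollback"], check=True)
    return "rolled back"
```

If the script itself loses connectivity and never reaches the commit, the timeout takes care of the rollback, which is the scenario the speaker describes for remote servers.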
This is really great, because it's really tiring when you misconfigure something and then need to undo it manually. This time, you just do a rollback, and everything will be like before. And what happens when you are working remotely on a server, you lose connectivity, and you earn a free trip to the data center? With this tool and a timeout, if you lose connectivity, you are not going to be able to do the commit, so at some point it will roll back, and you will have your connectivity back, hopefully. All right. Verification is optional, but personally, I like it a lot, because what it does is this: Nmstate gets the desired state from the user, applies it, then gets the current state that is applied to the system, and compares them. If they are not equal, it fails and rolls back to the previous state automatically. This is great, because sometimes you don't know about some options, and there are requirements for setting those options on an interface. So you can just apply the state, and if it goes wrong because the kernel is not applying the option correctly, because options are incompatible, for example, it does a rollback automatically, and you don't end up with a wrongly configured interface. But you can skip this using --no-verify. Okay, so let's see some examples of YAML files. These are a little bit simple, but I think they are great examples. Here, for example, we have a bond interface, and you can just define the state, IPv4, and the link aggregation options: you can define the mode, the options of that mode, and then define the ports. One thing that is really, really useful is that we have partial editing. Imagine that you want to change only the MAC address, but you don't want to change the IP address. You don't need to define the IP address; you just define the interface and the type, and that is enough. Then the MAC address, and I apply the state.
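The verification step described here, comparing the desired state with what the system reports after applying, can be illustrated with a simplified comparison in Python. This is not Nmstate's actual algorithm, which also normalizes values before comparing; it is a minimal sketch of the idea that everything stated in the desired state must show up in the current state, while extra properties in the current state are ignored.

```python
def matches(desired, current):
    """Return True if everything stated in `desired` is present in `current`."""
    if isinstance(desired, dict):
        return (isinstance(current, dict)
                and all(key in current and matches(value, current[key])
                        for key, value in desired.items()))
    if isinstance(desired, list):
        # Every desired item must match at least one current item.
        return (isinstance(current, list)
                and all(any(matches(d, c) for c in current) for d in desired))
    return desired == current
```

For instance, a desired MTU of 9000 on a bond would fail verification if the kernel still reports 1500 afterwards, and that failure is what triggers the automatic rollback.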
And Nmstate will get the current state of the interface and merge. So you won't lose any property. All right, then we have another example in the middle, which is the VLAN interface over the eth1 interface. Another great thing here, as I said, is that Nmstate resolves interdependencies automatically. You don't need to know which state eth1 needs to be in when creating the VLAN, whether it needs to be up or down. It doesn't matter. We are going to handle it, so you don't need to worry about it. Then, for example, we have a Linux bridge with a port and some options on the port. Again, one great thing here is that you don't need to care about the state of the port. Nmstate will resolve the dependencies and will bring the port up if needed, or configure it as needed. Some more examples, because, as I said, Nmstate is not only focused on interfaces. It also covers DNS, route configuration, and some other interfaces like OVS and OVS-DPDK. For example, here we have the eth1 interface configured with a static IP address, and then we have the route. You can define the route and it will be applied to the kernel. The same goes for routing policy. It's also supported. You can define the from IP and the to IP, with their masks, and it will be applied. The same thing for DNS. It's supported. And as you can see there, the last example is an OVS interface with an OVS bridge. You can define it, and the great thing is that you don't always need to define the OVS interface. You can define only the OVS bridge and add ports to it or delete ports from it. So it's quite great. All right. Having seen these examples, I would like to do a demo. Sorry if it doesn't work. I hope it will. I have an environment, so let's try it out. All right. Is it big enough? I can make it bigger. Yeah? Okay. Right. So I'm using the main branch version, which is 2.2.6. And here we have, this is really great.
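The partial-editing behaviour mentioned at the start of this passage, where only the stated properties change and everything else is kept from the current state, amounts to a recursive merge. The Python sketch below illustrates the idea on a dictionary shaped like an Nmstate interface entry; the addresses are made-up example values, and the real tool does considerably more than this (for example, handling lists and absent states).

```python
def merge(current, desired):
    """Overlay `desired` on `current`, keeping properties that are not mentioned."""
    merged = dict(current)
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(current.get(key), dict):
            merged[key] = merge(current[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical current state of an interface, as the system might report it.
current = {
    "name": "eth1", "type": "ethernet", "state": "up",
    "mac-address": "52:54:00:AA:BB:01",
    "ipv4": {"enabled": True,
             "address": [{"ip": "192.0.2.10", "prefix-length": 24}]},
}
# The user only states the name, the type, and the property to change.
desired = {"name": "eth1", "type": "ethernet",
           "mac-address": "52:54:00:AA:BB:02"}

merged = merge(current, desired)
```

After the merge, the MAC address is the new one while the IPv4 configuration is untouched, which is the "you won't lose any property" guarantee described in the talk.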
We have an examples directory. If you are learning how to use Nmstate, this is quite good. You can go here and see different examples of how you should do it. One that is really simple is, for example, this one. Right. This one is similar to what I showed before. This is eth1, and then you have a route config for eth1. Let's first check the state that we already have. Okay. These are the IP addresses that we have, and here we have eth1. eth1 is a veth, but it's defined as an Ethernet, so as far as we are concerned, it will behave as an Ethernet. So let's apply the state. It sets up eth1 and adds the route. That's it. It's done. If we run ip a and go to eth1, we can see the IP address configured here. All right. Then if we run ip r, we are going to see the routes here: one for the IP address, and the other one is the route that we set. Also, if you want to check what happened in NetworkManager here, you can do this. Oops. Okay. Sorry. All right. You can do this. You will notice that the device is up, and we could check the connection that we generated. But let's look at a more complex example. First we need to clean up the state, so I'm going to show you how we clean up state. We have eth1 and the old routes. All right. Here, for example, we just need to define eth1 and then define the routes with the next-hop interface and state absent. This will clear all the routes that are defined for that interface, which is great because when removing something, you don't need to define everything. You just need to define the properties that you want to match and then state absent, and it will clear them. All right. So it's applied. Let's check it. Okay. And then we can... oh, exactly. Right. Because we didn't bring down the eth1 interface. That's fine. But we dropped the route. So let's create a more complex one, one that I especially like. It's this one.
This is going to be the last one. Here we are going to define this: we have two Ethernet interfaces that are connected to a bond interface, which is also a port of another bridge, and a VLAN over another bridge. So, all right. This is the state. We define the VLAN interface with its ID, the Linux bridge whose port is the VLAN, and then another Linux bridge that is up and whose port is going to be the bond. And then for the bond, we have the two Ethernet interfaces, eth1 and eth2, and we have eth1 and eth2 defined. So let's apply it. All right. This will do a little bit more. That's fine. Right. In NetworkManager, everything seems set. Then we can run ip a; plenty of things. This is the VLAN. Then we have here the bond with the others configured. No, sorry. We have the bond here and the bridge 29. And we can also remove them, as we did with the other example. Let me show you how it looks. Right. This is quite simple. You just need to define eth1 and eth2 as down, because eth1 is an Ethernet. Well, it's emulating Ethernet, but it's treated as a physical device, so you cannot remove it completely. And then you have the bond, which can be dropped, the bridge, the VLAN, and the other bridge. So that's it. If you apply it, they will be gone. Check. All right. So it should be done. As you can see, you cannot see the links anymore, nor the routes, nor the connections, because we also removed the NetworkManager connection files. So I think that's it. It was a little demo about how it works. I really encourage you to try it out. It's really simple. If you are already using NetworkManager, you basically do not need to do anything else, because installing Nmstate will be enough. Nmstate is packaged on Fedora, CentOS, and RHEL. And if you use Rust, you can also use the crate wherever you want. All right. So, that will be all.
And now some questions. Thank you. You basically use NetworkManager to do all the settings and everything, but you also use netlink? Why is that necessary? Don't you get all the information from NetworkManager? No, because NetworkManager is not getting real-time information from the kernel all the time. It's not updating it directly. If you look at the devices, the devices don't have all the properties. And in Nispor we care about all the properties that are defined in the kernel. So this is the main reason. Are there settings that you need to do through netlink as well? No, no. We don't do the settings through netlink. We just read them to compare. We found the problem that, as NetworkManager is a service and is listening to netlink events, it sometimes takes some time to update its device cache. And that's a problem, because when you do an operation, you want the result immediately. Isn't that a bug in NetworkManager? Well, it isn't really. In the end you have a very big cache and it's hard to keep everything updated, and they perform a lot of operations. Obviously it could be improved, but right now this is the solution that we thought of. Also, Nispor can apply settings to the kernel, but we don't use that in Nmstate. This is like an extra feature that we work on from time to time. Hello. I would like to ask if dummy interfaces are supported and, if not, are they planned to be supported? Dummy interfaces, yes. Dummy interfaces are supported, and you can check the documentation. I think everything is supported for dummy. Thank you. Any more questions? All right. I think we're good. Yep, we're good. Thank you very much for attending. |
prplMesh: open source Wi-Fi mesh
Solving home Wi-Fi |
Hello, everyone, and welcome to my presentation. I am Frédéric van Boogert. I work for a company called Mind, and I'm here to present to you the prplMesh project, which is an open source Wi-Fi mesh solution. So who am I? Like I said, I'm an embedded software developer, and I work for Mind. And since 2020, I've been a project manager for this prplMesh project at the prpl Foundation. Feel free to email me after the presentation if you have any more questions. So what is the prpl Foundation? The prpl Foundation is really a very big conglomeration of the telecoms industry. If you look at the logos that I have there, there are a lot of ISPs in there, like AT&T, Deutsche Telekom, Dish, Verizon, Vodafone. So there are a lot of big ISPs in the prpl Foundation, and also some hardware manufacturers like Askey, MediaTek, MaxLinear, Qualcomm. And basically what we do is we sponsor the development of router firmware. How this came about is essentially that operators, and by operators I mean internet service providers, want to provide their users with access points and routers, because not everyone can go out and buy them and configure them themselves. But all of the software development is not really their core expertise. So traditionally, they've relied on stacks developed by their hardware partners, but this is not those partners' core expertise either, and so the prpl Foundation was created to collaborate on this with various partners. The main projects that we're working on are prplOS, which is a router firmware based on OpenWrt, prplMesh, which is the subject of this presentation, and also noteworthy is Life Cycle Management, which is an attempt to create an app store infrastructure, which is really cool actually. And we are also heavily involved in talking about router security and router data models, so basically the API of a router.
So that's the overview of the prpl Foundation. prplMesh itself is an IEEE 1905 stack, so this is a layer 2.5 protocol: it sits on top of Wi-Fi and Ethernet, but below IP. And the stack itself is based on code open-sourced by Intel. We are a fully functional EasyMesh implementation with both agent and controller roles supported, and I will talk later about what that means. We have an extensive API standardization effort as well, in collaboration with the Broadband Forum, so we don't just want to write an API, we want to really think about it, and so we are collaborating on that with other industry fora. And we also have a very heavy emphasis on testing, so we have some Wi-Fi Alliance testbeds that we extensively test with. So I said prplMesh is an EasyMesh implementation. What do I mean by that? This is the EasyMesh protocol. It's a Wi-Fi Alliance standard, just as Wi-Fi 6 and Wi-Fi 7 are Wi-Fi Alliance standards. And this is all meant to simplify your Wi-Fi management. For Wi-Fi, often the problem that we have is that you want to add an access point to your network, but then you need to configure it, and that can be quite tedious. The EasyMesh protocol can do that for you: you just onboard new devices using WPS pairing, which is also extended by the EasyMesh standard, and the device will automatically join your network and have all of your settings, including passphrases, the bands it needs to operate on, the SSIDs, any guest networks that you have configured, and so on. Another thing that the EasyMesh standard does is share the configuration, so if you want to change that configuration, if you want to add SSIDs or guest networks or change any other part of your Wi-Fi configuration, you only need to do that in one place, and it's applied to all of the access points in your entire network.
And finally, it also gathers a lot of metrics about your network, which can be used to optimize how devices connect to your network. In the image there, you see various access points, and you also see various network devices. What can often happen is that one access point in your network simply gets overloaded because all of the devices try to connect to it, and your other access points don't see any use, and especially if you have a precarious backhaul, that can lead to some performance problems. So applications that use the EasyMesh standard can monitor this, they can see that there is a problem, and they can try to steer these devices to use different access points, solving performance problems in your network. So prplMesh is an EasyMesh implementation. Basically we implement all of the functionality that I've just described. It is portable to a number of different router operating systems, all based on Linux: OpenWrt, prplOS, and RDK-B in particular. It's also portable in theory to other Linux-based operating systems. The main dependency that we have is that we rely on bus software, like ubus or RBus, which is typically used in these router operating systems, but we could also support something like D-Bus for other platforms. The main problem that we encounter when trying to port prplMesh to new platforms is that we need really good Wi-Fi drivers, because something like mesh networking tests your Wi-Fi drivers like nothing else, and we find that most Wi-Fi drivers simply are not good enough to support all the functionality that we need. So one thing that we are also active in at prpl is trying to spur innovation in the Wi-Fi drivers. We collaborate, or we try to collaborate, with hardware vendors to make sure that their drivers are not just capable enough to support mesh, but that this is also done in their open source drivers.
So we do still support some proprietary drivers for some hardware, but this is very much transitional, and we hope to get various vendors to open source their wireless drivers and make sure that all the functionality that we need is supported by those open drivers based on cfg80211 and mac80211. And so yeah, why did we develop prplMesh? Basically, like I mentioned, prpl is a whole community of different service providers coming together to develop a single solution. Instead of everyone having to develop their own software, it makes sense to collaborate. Also, by developing the software themselves, the service providers get independence from system-on-chip vendors. What we've seen in the past, and what we still sometimes see, is that there are SoC vendors that try to force you to use their own proprietary software and to depend on their proprietary interfaces, and that creates vendor lock-in within the ecosystem. That is something that we are very much aware of and are trying to combat as well. Another good reason to develop prplMesh, and in fact the original reason, is as a stress test for the wireless drivers, because like I mentioned, Wi-Fi mesh taxes your Wi-Fi drivers in ways that nothing else does. The original motivation of prplMesh was to act as a test for open source Wi-Fi drivers, but it ballooned from there. One other thing that we do is try to encourage the development of a common API for EasyMesh implementations. This API is usable for remote management, for network diagnostics, and to enable others to create router apps to configure Wi-Fi, so they can plug into prplMesh and use it to smartly configure your own Wi-Fi. This is also enabled by the LCM project, which allows you to add router apps to your router, and some of those might use prplMesh to optimize your network, or to show you more information about your network. So let me check the time here. We do have time.
Okay, so yeah, EasyMesh itself, like I mentioned, is based on IEEE 1905. This is a very extensible protocol on top of Ethernet and Wi-Fi. It uses a fixed multicast address, and the main feature of it is that it works with TLVs, so type-length-value tuples, and you can add as many of these as you like. The way EasyMesh works is that it defines a set of TLVs that have a specific use: for instance, TLVs to configure access points, to report certain metrics, or to discover devices on the network, things like that. One thing that EasyMesh can also be used for is discovery of devices on the network. All IEEE 1905 devices can report all of their neighboring devices in the network, and that helps you get the topology map of all the devices that are present in your network. So it's easy to discover what's currently living in your network. This is also, of course, a vital tool to allow you to optimize the network. One other thing that we get is metrics, like I mentioned, so you can see how good the connection is between the various devices that you might have in your network, like your laptops, your smartphones, and your smart TVs: how good is the connection to their access points, and is there any possibility that we can connect them to a different access point that has maybe a better connection? So this is also very useful, and once we've determined that we would actually like a device to connect to a different access point, this is something that we can also do. The EasyMesh protocol includes messages to steer a device, and what we will do is that the controller will tell the agent: try to tell the station connected to you to disconnect. There are a number of mechanisms for that in the Wi-Fi standards, like 802.11k and 802.11v.
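To make the type-length-value idea mentioned above concrete, here is a minimal sketch of packing and unpacking a list of TLVs. It assumes the IEEE 1905.1 layout of a 1-byte type, a 2-byte big-endian length, and then the value, with a zero-length type 0 TLV marking the end of the message. The type numbers and payloads in the usage example are arbitrary sample values, and none of this is prplMesh code.

```python
import struct

END_OF_MESSAGE = 0x00  # assumed end-of-message TLV type (length 0)
HEADER = "!BH"         # 1-byte type, 2-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def encode_tlv(tlv_type, value):
    """Pack a single TLV: header followed by the raw value bytes."""
    return struct.pack(HEADER, tlv_type, len(value)) + value


def encode_tlvs(tlvs):
    """Pack a list of (type, value) pairs and terminate it with end-of-message."""
    body = b"".join(encode_tlv(t, v) for t, v in tlvs)
    return body + encode_tlv(END_OF_MESSAGE, b"")


def decode_tlvs(data):
    """Unpack TLVs until the end-of-message TLV or the end of the buffer."""
    tlvs = []
    offset = 0
    while offset + HEADER_SIZE <= len(data):
        tlv_type, length = struct.unpack_from(HEADER, data, offset)
        offset += HEADER_SIZE
        if tlv_type == END_OF_MESSAGE:
            break
        if offset + length > len(data):
            raise ValueError("truncated TLV")
        tlvs.append((tlv_type, data[offset:offset + length]))
        offset += length
    return tlvs


if __name__ == "__main__":
    # Sample type numbers and payloads, chosen only for illustration.
    message = encode_tlvs([(0x01, bytes.fromhex("020000000001")), (0x80, b"metrics")])
    print(message.hex())
    print(decode_tlvs(message))
```

Because a receiver can skip any TLV whose type it does not recognize by reading the length field, new TLV types can be added without breaking existing parsers, which is what makes this kind of format easy to extend.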
These mechanisms are not always supported by all devices. In particular, smartphones have a habit of ignoring them, so what we can then do as a final option is to just blacklist the device, not allowing it to connect to an agent anymore, so it's forced to connect to a different wireless access point. One other very crucial functionality is onboarding: if you add a new agent, it's easy for it to find the controller, and then onboard through WPS or DPP, a newer standard from the Wi-Fi Alliance. So in conclusion, Wi-Fi networks are getting more complex, and that means they also need to get smarter, and that is really what we are doing within the prplMesh project, and the prpl ecosystem in general: we are trying to make Wi-Fi smarter. We can use your help with that. One thing that's also crucial to know is that open source is very useful to get vendor independence, so no more vendor lock-in. That is really a very big deal. And also, sometimes you can find open ecosystems out there, even where you might not expect them. All of the things that I've talked about, these router operating systems, this LCM app store ecosystem, and prplMesh itself, are all open source, but right now they are still developed by the ISPs, basically, and the prpl Foundation itself. We don't get a lot of external contributions, but we welcome everyone who wants to contribute. So yeah, check us out when you can. And related to that, you can make good money developing open source code. So this is where I plug my own company, Mind. We are software consultants, especially focused on embedded software and open source, and we are hiring. I'll be around outside of the hall after the presentation, so hit me up if that sounds interesting to you. All right, then, any questions?
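As a rough illustration of the steering flow described earlier in the talk (pick a better access point from link metrics, ask the station to move, and block it on the old access point only as a last resort), here is a small sketch. Every name, the metric fields, and the 8 dB threshold are hypothetical choices for the example; this is not the prplMesh or EasyMesh API.

```python
from dataclasses import dataclass


@dataclass
class LinkReport:
    ap: str        # access point identifier (hypothetical)
    rssi_dbm: int  # signal strength this AP measures for the station
    load: int      # number of stations currently associated with this AP


def pick_target_ap(current_ap, reports, min_gain_db=8):
    """Return a clearly better AP for the station, or None to leave it alone.

    The gain threshold is an assumed hysteresis so that a station is not
    bounced between two access points with nearly equal signal.
    """
    current = next(r for r in reports if r.ap == current_ap)
    candidates = [
        r for r in reports
        if r.ap != current_ap and r.rssi_dbm >= current.rssi_dbm + min_gain_db
    ]
    if not candidates:
        return None
    # Prefer the strongest signal; break ties in favour of the less loaded AP.
    return max(candidates, key=lambda r: (r.rssi_dbm, -r.load)).ap


def steer(station, current_ap, reports, request_transition, block_station):
    """Try a cooperative move first, then fall back to blocking the station.

    `request_transition` stands in for an 802.11v-style request and returns
    True if the station moved; `block_station` stands in for the blacklist.
    """
    target = pick_target_ap(current_ap, reports)
    if target is None:
        return "stay"
    if request_transition(station, target):
        return "moved"
    block_station(station, current_ap)
    return "blocked"


if __name__ == "__main__":
    reports = [LinkReport("ap-hall", -78, 9), LinkReport("ap-office", -55, 2)]
    blocked = []
    result = steer(
        "phone", "ap-hall", reports,
        request_transition=lambda station, ap: False,  # station ignores the request
        block_station=lambda station, ap: blocked.append((station, ap)),
    )
    print(result, blocked)
```

In the usage example the station ignores the cooperative request, so the sketch falls through to the blocking step, mirroring the smartphone case mentioned in the talk.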
So the operators and ISPs have previously gotten together and made something called HomeNet, and that went very much in the same direction, and then they decided they didn't like it because it supports you getting more than one internet provider and using them at the same time, which this does not support. And this also doesn't support other things, for example, meshing well with ZigBee or home automation stuff. So I would ask you, have you actually evaluated whether this is a good standard to implement? Do you think this is a good standard to implement? And if yes, why? Yeah, an interesting question. I am not familiar with this HomeNet you're talking about, although Walter, behind you, definitely is. Maybe I can answer that, because I'm one of the operators, and I was in the HomeNet working group as well. I rejected being the chair at some point in time. HomeNet was very challenging in the sense that it decided to remove all existing protocols for communicating with the individual devices, such as: let's get rid of DHCP, let's go do something else. So from the point of view of a realistic framework, a realistic path to get there, that was a problem. There are some parts of HomeNet, such as HNCP, that have been reused by some proprietary vendors here and there, but it has all died. Now, when it comes to the lineage of EasyMesh, that's a whole different story. As for why it's based on 1905, it was essentially a few companies that got together and said: what are we going to do? Are we going to do it at Layer 2? Are we going to do it at Layer 3? And it was mostly based on internal pressures of: well, we have something that's 1905-based, it works together with our power line devices already, let's use that as a basis. I wasn't too fond of it, but it's VHS versus Betamax and V2000, right? So to me, I still see value in HomeNet and in trying to get some of its concepts in there.
Now, the other problem with HomeNet, and I had the discussion with the chair at the time as well, was this mentality that every access point had its own IP range, rather than what the reality is in a wireless network in a home, which is that it's all Layer 2, one single Layer 2. If you do this steering that was just explained, where a device switches in milliseconds from one access point to another while consistently maintaining a video call or something like that, you're not going to get a different IP address, you're not going to restart your TCP connection or whatever. It doesn't make any sense. So doing this at Layer 2 just makes a lot more sense than the HomeNet concept of multiple segments and passing on from there. So HomeNet does have the advantage that you don't need special hardware support for 4-address frames in 802.11 to get the data along. I do actually agree with most of your points. I think that the best solution is somewhere in the middle between the two, because this is so Wi-Fi centric that it can't really support things that aren't Wi-Fi, but it is better for Wi-Fi itself, which I think is to a good degree your point, especially with the mobility. So the question is, is it possible to build both into one thing, maybe? Let's continue this conversation. Yes, so right, that's the nice thing of being based on 1905. The Wi-Fi Alliance is focused purely on the wireless side. The rest of us are making sure that wired connections retain support, especially Ethernet. Now, the good thing about 1905 is that it is a protocol that supports not only Ethernet, but also other things. 1905.1a added standard support for G.hn, and there's a modeling on top of MoCA, so you can have coaxial, twisted pair, in-home fiber, et cetera. Those protocols are all supported.
We just need to keep the Wi-Fi Alliance honest from time to time, because their modeling of the network topology, using Wi-Fi Alliance Data Elements, tends to ignore everything that's not wireless. But when it comes to supporting network ports on a particular switch, et cetera, that all comes natively in 1905. So that's fully supported by EasyMesh, and we make sure that it is supported in prplMesh. Indeed. That is indeed very important. We try to go beyond Wi-Fi, that's what it is. You mentioned that most hardware is not compatible, but can you give some examples, maybe? Which hardware is? Yeah. So I think Qualcomm tends to be fairly good about their hardware drivers. Broadcom is not as good, so yeah, a message for anyone considering a new SoC: avoid Broadcom. A bit more detail: I did get Broadcom to agree to support this as well, and it is supported. There are some gaps between what the proprietary drivers support at the moment and what is supported in nl80211. So Linux wireless does need to get updated. We've identified the gaps, and we hope to work with the hostapd and Linux wireless communities to fix that. Now, me as an operator, I've given it as a hard requirement to all our silicon vendors who want to do a gateway or an access point that they must comply with this. And we do have general support for it. It's just that every time there's a new standard coming out, such as Wi-Fi 7 now, they first develop proprietary interfaces and we need to keep them on the right track. That's the challenge, essentially. I did mention before that the prpl Foundation is also involved in trying to set standards for Wi-Fi drivers and the low-level API. That's my part. This low-level API, the definition of that, I'm the chair of that, to make sure that all the proprietary silicon uses standard Linux interfaces. Some more questions? Someone over there, yeah. Hello.
In terms of vulnerability management, is there a way for the agents to be updated from the controller or in some other way without losing connectivity, or would the need to patch something cause everything to drop? So if I understand correctly, the question is whether it's possible to steer a device without causing the connection to drop, right? Steered, or, because you said that it's highly demanding on the wireless driver to do that, I'm guessing that in case you also need to patch something, you're risking the connection. So is there some handling internally to ensure that even during an update, which could be a rolling update, for example, one agent at a time or something like that, there is some provision to make sure that it's not going to cost connectivity? Or is that something that's not developed yet? I think if you reconfigure your network, like change SSIDs and so on, the connection will drop. No? Okay. Yeah. My apologies. I'm not super into all of the technical details, but apparently there are ways to manage that so that your connections are not dropping. And I should also mention the virtual BSS project, which we are also collaborating on with an organization called CableLabs. There the goal is to have a single virtual AP per device, so basically an AP that follows your device around, in a manner of speaking. It means that your phone, for instance, will always see the same AP regardless of where you go, and this means that no connections ever need to get dropped when you move around. This is also a very interesting project that we are working on at the moment. But yeah, in general, the answer to your question is no, the connection doesn't need to drop. Okay. Any more questions? Okay. So it is all very interesting. The question is, is there any off-the-shelf equipment on which this can more or less be tried out? Yes. We have some reference hardware that you can just buy on Amazon or other places.
So that's readily available for you to test prplMesh on. We have two reference devices, the GL.iNet B1300 and the Turris Omnia, that are readily available, and we will be adding more in the future as well. And I guess it is mentioned somewhere, so I'm going to check it out. If you go to our GitLab page for prplMesh, we have a lot of documentation on the wiki, including documentation about which hardware to get. So yeah, actually I had it there. In the Getting Started guides, we have a section here on device purchasing options, and as you can see, I would recommend the Turris Omnia and the GL.iNet. The Netgear RAX40 is no longer really supported because it lacks a crucial feature that is required for wireless backhaul, so I would not recommend that one. Okay, thank you. Any more questions? I think we are good. There are no more questions for you. Thank you very much for the presentation. |
Service MESH without the MESS
Latest of eBPF Powered Cilium Service Mesh |
Hello everyone, welcome to this session about Cilium Service Mesh. My name is Raymond De Jong, I'm Field CTO at Isovalent, the originators of Cilium. Today I'm going to talk a bit about eBPF and Cilium as an introduction, after which I'm going to talk about how the service mesh is evolving, after which we'll talk about the Cilium Service Mesh features, what we can do today, and what we're planning to support in the future. Then a quick highlight of some upcoming features and some current features, and if we have time I have a little demo to show how it works. Can I see some hands from you if you know eBPF? Quite a lot, good. How many of you know Cilium? Cool. OK, how many of you actually use Cilium? Not as many. OK, cool. So, for the ones who don't know what eBPF is, I'm going to do an introduction here. eBPF stands for Extended Berkeley Packet Filter. By itself that doesn't mean a lot, but what we like to compare it with is this: what JavaScript is to the browser, eBPF is to the kernel. What that means is that using eBPF we can attach programs to kernel events, and for the purpose of this session, we can attach eBPF programs to kernel events related to networking. So that's either a socket being opened or a network packet being sent on a network device. That's a kernel event, and that means that we can attach a program to it and we can get metrics from that packet, for example, or we can do load balancing and such. So Cilium is built on eBPF, but you don't need to be an eBPF developer to actually work with Cilium. Cilium abstracts this complexity and technology under the hood, so based on the configuration you set, Cilium will mount the required eBPF programs for you to run. And Cilium in short provides networking and load balancing capabilities, security capabilities, and also a lot of observability capabilities using eBPF.
So this is the 30,000-foot view of where we are today. We started with plain networking, IPv6 and IPv4, years ago, and now we are expanding the networking capabilities with BGP implementations, NAT46/64, and extended load balancing out of the box, and we're working on having the GoBGP control plane fully supported with Cilium. On top of that we have an observability layer with our Hubble technology, which is an observability tool that provides visibility into service-to-service communication for your namespaces, so you can see what components, what services are talking to which services, after which you can make informed decisions, for example about what kind of network policies you want to apply. It also exports metrics to tools like Grafana. And there is service mesh on top of that to provide authentication, Layer 7 path-based routing, and such. On the right hand side we also have Tetragon. That's not something we'll talk about today, but that's runtime security using eBPF, which is also very interesting. And we run across clouds. It doesn't matter if it's on-prem or hybrid or multi-cluster, so it's agnostic of the platform, and it is supported by multiple cloud vendors. As you may know, Google's Dataplane V2 under the hood is actually Cilium. Microsoft has recently adopted Cilium as the default CNI for AKS clusters, and all their clusters will be migrated to Cilium, and on AWS, EKS Anywhere by default is Cilium, so we see huge adoption of Cilium in the field. So let's talk about service mesh. Obviously, if we talk about service mesh we talk about observing traffic, being able to secure traffic from application to application across clusters, doing traffic management, and building resilience across applications and across clouds. Originally, if you needed those capabilities, you would program your application, either in Python or Go, to get that observability. That wasn't really practical, because you have to maintain all those libraries to get the information you need.
That's where the sidecars came in, right? They abstract that complexity away from the application, so you have a standard sidecar implementation to monitor traffic, to be able to route traffic, and to be able to extract metrics from that traffic. However, now with Cilium our goal is to move as close to the kernel as possible, as we already run in the kernel with eBPF. So we're moving from a sidecar model to the kernel, and where we can, we will support it using eBPF. The only part which is not yet there is Layer 7. All the load balancing capabilities and routing capabilities in terms of IP to IP, and the metrics, are already available with Cilium using eBPF. Layer 7 routing is not yet in eBPF for multiple reasons, one of which is that eBPF has constraints in terms of how big a program can be. Obviously it runs in kernel space, so it has constraints for a good reason, but in the future maybe we can even move complex Layer 7 routing into eBPF. However, we already provide Layer 7 visibility and observability using Cilium and eBPF: we already have the capabilities to inspect traffic using eBPF. We can already do the load balancing with the kube-proxy replacement. The only missing part is the Layer 7 routing, but the visibility of traffic, so HTTP traffic and such, is already there. So the service mesh features are extending those capabilities moving forward. So how does it work? Some of you may know that Cilium runs as an agent, as a DaemonSet on the nodes. It programs the nodes, mounting the eBPF programs for the capabilities you need, and we have an embedded Envoy running inside the Cilium agent. This is a narrowed-down Envoy proxy in the agent for networking capabilities, and we leverage this Envoy proxy at the node level to provide the service mesh capabilities, so all the things like HTTP path routing and such.
So for each namespace where you create, for example, an Ingress resource or a Gateway resource, a listener will be created in the Envoy for that specific namespace, for that specific workload. And we leverage cgroups and such to have separation as well, for security reasons, so that traffic cannot cross namespaces. So what is different with Cilium Service Mesh compared to other service mesh implementations? First of all, our goal is to reduce operational complexity by removing sidecars: reduced resource usage, better performance, and avoiding sidecar startup and shutdown race conditions. So obviously, if you're not running sidecars, at scale this makes a huge difference. You don't have all the sidecar containers running alongside your normal pods. That will save memory, that will save CPU, that will save connection tracking, et cetera, et cetera. So it's a lot more efficient. And also in terms of latency, running a sidecar has a cost. In this diagram you see that an application wants to send traffic to another application. What that means technically is that it goes through the TCP/IP stack three times with the sidecar: first from the app, then inbound into the sidecar where the sidecar does its processing, and then outbound from the sidecar to the physical network device, to hit the network and reach another node. With eBPF we are able to shortcut that connection and improve the latency, because we can detect that traffic and see whether it's destined for the physical network or whether it should be routed to the proxy. So once this app opens the socket, using eBPF we can shortcut that connection to the physical network device to be routed on the physical network. If we need Layer 7 processing, then using eBPF we can shortcut the connection at the socket layer directly to the Envoy proxy, where the Envoy proxy on the node does its HTTP routing and then forwards the traffic again onto the physical network.
So a lot fewer hops there. And it means that latency is much, much improved, because we're not going through this TCP/IP stack multiple times. In terms of throughput there's also a small difference, because we can push more packets. And in terms of pod-ready performance, this is also a consideration at scale, because when you're scaling out your applications with traditional sidecars, you always need to wait for the sidecar to be spun up as well and to be ready to serve connections for that application. Without the sidecars, with Cilium Service Mesh, it's already there. It's running on the node, it's embedded in the agent, so once you scale out your application, the proxy on that node can immediately serve connections. So in short, Cilium Service Mesh provides traffic management, observability, security, and resilience. The goal is to bring your own control plane, so we are not developing a control plane of our own. What that means is that you can already use Ingress resources with Cilium, and 1.13 will support Gateway API. We are working on SPIFFE integration, so with the 1.13 release the groundwork for mTLS and SPIFFE integration is actually already there. You are not really able to use it yet, but the goal is to support both mTLS and SPIFFE using Cilium network policies, so you can reference, for example, a SPIFFE ID as a source and destination using Cilium network policies, and then under the hood the Cilium agent will connect to a SPIRE server where that identity is tracked and confirm whether that's allowed. In terms of observability, you can leverage the already available observability with Grafana or Hubble. If you need to export events, you can use SIEM platforms such as Splunk, and OpenTelemetry is also supported.
If you are running new clusters, you have an option: you can run Cilium and you can already use Cilium Service Mesh out of the box. This is obviously the preferred method. But if you are already running an Istio-based implementation, there is still a lot of benefit to running Cilium under the hood there as well, because, for example, we already encrypt the connectivity between the sidecar of an Istio-based implementation and the destination pod. What I mean by that is that when you run sidecars, when you run mTLS between applications, that connectivity may be secure, but the connection between the sidecar and the actual destination is unencrypted on the node, so anyone with specific privileges on a node could potentially listen on that virtual interface and eavesdrop on traffic, and that's obviously not secure. Running Cilium under the hood already gives you a benefit, because we can encrypt on Layer 4, directly at the socket layer, to the destination pod. Currently we have Cilium 1.12, which has been available for, I think, seven months. We already have a production-ready Cilium Service Mesh and a conformant ingress controller which you can use for HTTP path routing, canary releases, and such. You can use Kubernetes as your service mesh control plane, and Prometheus metrics and OpenTelemetry are supported. For power users we have the CiliumEnvoyConfig and CiliumClusterwideEnvoyConfig CRDs available. These are temporary, I would say, because the goal is to replace all those capabilities with Gateway API. And we're releasing more and more extended Grafana dashboards for Layer 7 visibility, so you can actually see, service to service, what kind of metrics there are, what the latencies are, and what the return codes are, so the golden signals. So, the roadmap for 1.13. We're very close to releasing 1.13, expected somewhere this month, hopefully.
You can already try a release candidate for Cilium 1.13, which includes Gateway API support for HTTP routing, TLS termination, and HTTP traffic splitting and weighting. This allows you to do percentage-based routing or canary releases without configuring CiliumEnvoyConfig resources. There is also the capability to have multiple ingresses per load balancer. What that means is that currently, when you create a Cilium ingress, we rely under the hood on a load balancer to attract traffic and forward it to the proxy. Obviously, at scale, having a load balancer for each ingress is expensive, especially in clouds. So with an annotation we allow multiple ingresses per load balancer, so you can save cost there. So how am I doing on time? Good. Features. So today, with 1.12, we have ingress, and for services we also have support for annotations. Imagine you have received traffic from your ingress and you forward it to a service. We support annotations on a simple ClusterIP service to forward traffic, for example, to a specific endpoint. If you know what Cilium Cluster Mesh is, we can connect Cilium across clusters. With simple annotations you can have even higher availability of services across clusters. Then there is Gateway API, which I will show a bit later, and the CiliumEnvoyConfig. So this is a simple example of ingress, and this is also something I will show in a demo. You have an ingress, and from a specific path you want to forward traffic to a specific service. We also support gRPC, so you can also have specific gRPC URLs forwarded to specific services. There is TLS termination, to terminate TLS using secrets with ingress. A question I get a lot is: what about SSL pass-through? That's on the roadmap, so keep that in mind. And obviously new in 1.13 is Gateway API. What it looks like is that you configure a Gateway resource. You specify the gateway class name for Cilium, to make sure that the Gateway is created and maintained through Cilium, and then you create listeners.
So in this case, an HTTP listener on port 80. Then additionally you create one or more HTTP routes. And this one specifies, for example, a path prefix with the value /details, to be forwarded to a backend reference, a service called details. In terms of TLS termination, it's the same constructs. You can also have, for example, a host name in there, to only accept traffic for this given host name, and you reference a secret in the Gateway resource. Then in the HTTP routes you specify the host name, you reference the Gateway you want to use, and then again a path prefix, for example, to forward to a specific service. And then traffic splitting, very simple, also using HTTP routes. Again you reference your Gateway and a path prefix, and then in this case you have an Echo 1 and an Echo 2 service, where you want to slowly introduce Echo 2, and in this case 25% of the traffic will be forwarded to the Echo 2 service. And this is the example I talked about earlier. Using simple annotations you can extend service mesh capabilities by annotating services. So in this case this service will receive gRPC traffic, and we can attach load balancing modes, in this case weighted least requests, for forwarding to backend endpoints. And using multi-cluster capabilities you can extend these capabilities across two or more clusters, depending on your Cluster Mesh configuration. And canary rollouts: you can even introduce new clusters and have the new version of your application running on the new cluster, so you're absolutely sure that you have no resource contention on your original cluster, and then annotate the service to slowly forward traffic to the remote cluster before you do the flip-over.
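The 75/25 traffic split described above can be pictured as a weighted random choice between backends. The following is only a minimal Python sketch of that idea, not Cilium's or Envoy's actual implementation; the backend names `echo-1` and `echo-2`, the weights, and the helper names are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical backends with weights, mirroring the 75% / 25% split from the talk.
BACKENDS = [("echo-1", 75), ("echo-2", 25)]


def pick_backend(backends, rng):
    """Pick one backend name, with probability proportional to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(requests, seed=0):
    """Route `requests` simulated requests and count how many each backend gets."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return Counter(pick_backend(BACKENDS, rng) for _ in range(requests))


if __name__ == "__main__":
    counts = simulate(10_000)
    for name, count in sorted(counts.items()):
        print(f"{name}: {count / 10_000:.1%}")
```

Raising the weight of `echo-2` step by step is what a canary rollout amounts to in this picture: the share of requests reaching the new version grows until the old backend can be removed.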
So this concludes the presentation part. If you want to know more about Cilium, go to the Cilium community. I encourage you to join our Slack channel if you have any questions; our team is there as well to answer questions in Slack. Any issues you may have, or any roadmap or feature requests you may have, we're very interested to hear from you. You can also contribute, so obviously if you want to develop on Cilium, join the Cilium GitHub and contribute. If you want to know more about eBPF, go to ebpf.io, and if you want to know more about Isovalent, the company that originated Cilium, and for example want to work there, have a look there. We are looking for engineers as well, so feel free to have a look, and if you want to know more, ask me after the session as well. All right, let me see how I'm doing with time. In order to run Ingress and Gateway API, you need to set a certain number of flags in, for example, your Helm values file. This is an example. I've run a small demo on GKE, so this is a GKE cluster with Cilium installed. What you need to do is enable the ingress controller, and in this case I'm also enabling metrics, just because it's interesting to see what's going on. For Gateway API there's also a value, so Gateway API enabled: true. This will trigger Gateway API to be enabled. For service mesh it's important to configure the kube-proxy replacement to strict or probe. Strict is recommended, because then you have the full kube-proxy replacement capabilities in your cluster. This is also required for service mesh, and that's basically it to get started.
So for this simple demo, I've created a simple gateway with the gateway class name Cilium. This is running Cilium 1.13 Release Candidate 5, which has the Gateway API support. And then there is a simple HTTP route for the bookinfo example application, which has matches for the details and the default path prefixes. So when I go into my environment, I want to quickly show the following. If I do a kubectl get service, you can see that, for the sake of time, I already created these gateways. What I wanted to show you is that obviously a load balancer is required, so GKE provisions a load balancer for me, with a load balancer IP I can use to attract traffic. In this case I'm demoing a default HTTP gateway and a default HTTPS gateway, so I have two load balancers, each with an external IP address assigned. So this configuration is applied. If I do a kubectl get gateway (good point: obviously you also need to install the CRDs for Gateway API support in your cluster), here you can see I have my gateway and our TLS gateway, and if I do a kubectl get httproutes, I can see I have my bookinfo HTTP route installed, and this relates to this part, obviously. So with that I should be able to connect to the details. This is running the bookstore example, so I'm using that public IP, and as you can see it works. And if I go to details, I should be forwarded, using the Gateway API HTTP routes, to that specific details service, and that works as well. For HTTPS, again a simple example. I've created that TLS gateway, and I've created two listeners, so a listener for bookinfo.cilium.rocks and a listener for hipstershop.cilium.rocks. I haven't installed the hipster shop, for demo purposes. I'm also referencing two secrets, so I've used mkcert to create a simple self-signed certificate, installed it in my certificate store, and created a secret which I reference using this listener. Then again there are HTTP routes for the TLS gateway: for bookinfo.cilium.rocks it matches only the details path prefix on port 9080, and again
a part for the hipster shop. So that's what I'm going to show here. If I open the default URL, that doesn't work, because there's no HTTP route configured for it, but for details I can see I'm connected securely, and this certificate comes from the gateway resource as well. Obviously this is a self-signed certificate, but obviously you can create signed certificates as well. With that, that concludes my presentation and the demo. I'm open for questions. Any questions? Hi, thank you very much for your presentation. When you talk about there being no layer 7 support in eBPF, is that going to come or not? I'm not sure about that. HTTP routing requires quite a lot of memory, and obviously memory is limited in eBPF programs, for good reasons, so it will depend on the eBPF foundation and the roadmap there, what we can support. Technically there's no reason why we shouldn't be able to do that, but in terms of memory we have constraints, so if those are raised, we can potentially have parts, or even all parts, using eBPF. Any other questions? Hi, does it provide, or can you provide, end-to-end encryption, especially between the nodes, automatically or not? Yes, so our vision there is that you should configure, for example, IPsec or WireGuard for node-to-node encryption in transit, and if you want authentication and authorization on top of that, configure SPIFFE or mTLS between your applications. It's a multi-layered approach, so we're not doing the encryption in the mTLS part, but at the node level, if that makes sense. So mTLS, again, and SPIFFE are on the roadmap, hopefully for 1.13. Any other? No, okay, thank you. Thank you. |
MetalLB and FRR: a match made in heaven |
Okay, so thanks everyone for coming. Today I'm going to talk about two projects. One is MetalLB, which I'm currently maintaining, and the other one is FRR, which we started integrating into MetalLB more or less a year and a half ago. Is anyone using MetalLB? Awesome, okay, so if you found it less stable in the past two years, that's because of me. So the agenda today is: I'll describe what MetalLB is and why it matters in the context of Kubernetes, then I'll introduce FRR, and then I'll talk about the integration between the two. Some quick words about me. I'm Federico, I have worked for Red Hat for almost four years now, and I'm part of the networking team on the OpenShift platform that is in charge of making it suitable for running telco workloads, which we know have slightly different requirements from regular applications that run in cloud environments. Due to that, I have contributed to a variety of network-related projects. I touched our primary CNI, which is OVN-K. I wrote a simple CNI plugin for using VRFs. I put some code in Kubernetes itself, and lately my primary focus is on MetalLB, but please don't think that I'm a networking expert, because I'm not. Who doesn't know anything about Kubernetes? Okay, so I'll very briefly introduce the concept of services for those that don't know anything about Kubernetes. We have our application deployed as multiple pods that we want to spread our traffic across, and the concept that Kubernetes gives us is services, and with a service, we get two things. One is a cluster IP, which is a virtual IP accessible from inside the cluster, and the other is the balancing part. So the client tries to hit the service, and then the cluster CNI in some way balances the traffic across the different endpoints. There's more to it than this, we have IP families, we have ports, but that's the main idea. What if we want to expose our application outside the cluster?
Because yeah, there might be use cases where we don't just want to expose them inside, where it makes sense to access them from outside. And the main construct that Kubernetes gives us is a service of type LoadBalancer. This is the definition taken from the Kubernetes documentation: a service of type LoadBalancer is exposed externally using a cloud provider's load balancer. As we saw in the talk before, there is the cloud provider, Google, AWS, that gives us an IP that is accessible from outside and that drives the traffic towards our application. And again, the emphasis here is on the cloud provider. What happens when we create a service of type LoadBalancer? We get an external IP that is accessible from outside, and we get the load balancer part. So if we try to access our application, the network infrastructure of the cloud provider will drive the traffic toward all the nodes of our cluster so that the CNI can do its part. And this is important to understand how MetalLB works: once the traffic gets to the node, all the rest is handled by the cluster CNI. And in this case, this is a real network load balancer, one that we don't control and that is controlled by the provider. So just to reiterate, we get two things from a load balancer service: we get a stable IP that we can pin our DNS entries to, and we get the load balancing across all the different nodes. So now let's move to bare metal and see what happens there. The first thing that we see when we try to create a service of this type on bare metal is that we are not getting an external IP, because there is no one to give it to us. And the second part is, even if we had that IP, who is routing the traffic to the nodes, as the cloud provider's network infrastructure does? And these very same two issues are the issues that MetalLB tries to solve. MetalLB is a community project, now under the CNCF umbrella. It was originally started by David Anderson.
Then there was a handoff to one Red Hatter, Russell Bryant, and two folks working at Kinvolk, Rodrigo Campos and Johannes Lieberman. And during that phase, MetalLB went more or less into maintenance mode. They were replying to issues and stuff like that, but it wasn't evolving too much. And at some point, because things went in a different way, I basically started leading the project. One nice thing about MetalLB is this dichotomy: it's used a lot in home clusters running on Raspberry Pis, but it's also used by enterprise users in very complex scenarios. So the first and most disappointing thing about MetalLB, but please don't run away, don't leave the room, is the fact that MetalLB is not a network load balancer. Yeah, this was disappointing when I started digging into it, but let's keep in mind those two issues that we want to solve and see how MetalLB tries to solve them, I think in a very elegant way, by interacting with an existing network infrastructure. And the first part is the address advertisement, assignment, sorry. And this part is probably the most boring one. We have a Kubernetes controller that listens for services in need of an IP and tries to assign them an IP. But what IPs are we talking about? Here we are not on a cloud provider, and we don't have control over the IPs. So in this case, the cluster administrator is the entity in charge of providing a pool of IPs to MetalLB, so it can use them and give them to the various services. And those can be ranges, can be full CIDRs, we can add multiple ranges, and both IPv4 and IPv6. And then there is probably the most networking-heavy part of MetalLB, which is address advertisement. So we have the address, and we need to find a way to attract the traffic to the nodes so that the CNI itself can do its part. And MetalLB works in two modes. L2, I'll briefly describe how it works.
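The address assignment step described above, where a controller hands out IPs from administrator-provided pools to services that need one, can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not MetalLB's controller code: the class name `AddressPool`, the service names, and the documentation-range CIDRs are all hypothetical.

```python
import ipaddress


class AddressPool:
    """Toy allocator: hands out addresses from administrator-provided CIDRs."""

    def __init__(self, cidrs):
        # Pools may mix IPv4 and IPv6 ranges, as mentioned in the talk.
        self._networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
        self._assigned = {}  # service name -> assigned address

    def assign(self, service):
        """Return the service's address, allocating the first free one if needed."""
        if service in self._assigned:
            return self._assigned[service]  # keep the IP stable for the service
        in_use = set(self._assigned.values())
        for network in self._networks:
            for address in network:
                if address not in in_use:
                    self._assigned[service] = address
                    return address
        raise RuntimeError("address pool exhausted")

    def release(self, service):
        """Free the address when the service goes away."""
        self._assigned.pop(service, None)


if __name__ == "__main__":
    pool = AddressPool(["192.0.2.0/30", "2001:db8::/126"])
    for name in ["web", "api", "web"]:
        print(name, "->", pool.assign(name))
```

The point of the sketch is the stability property: asking again for the same service returns the same address, which is what lets DNS entries be pinned to it.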
It's more or less like Keepalived: it elects one node as the leader for the IP, that node replies to ARP requests and sends out gratuitous ARPs when a failover happens, and you can only have one node as the entry point for the service. And then there is the part that I'll dig more into today, which is the BGP mode. In BGP mode, we leverage the interaction with BGP-enabled routers in order to advertise the addresses. Yeah, so this is taken from the BGP RFC: the primary function of a BGP-speaking system is to exchange network reachability information with other BGP systems. So this is exactly what we need. We need to find a way to say, hey, if you want to reach this service IP, which I'm assigning to my load balancer service, then you should go through this set of nodes, because then, again, the CNI can do its part. And this is exactly how MetalLB works. Each node acts as a mini router, establishing BGP sessions with externally configured routers. You need to make MetalLB aware of the existence of those routers with some configuration. And then, when we create a service, it will start advertising the routes to the router, and this is a bit bigger for those in the back, so that the router knows that in order to reach this virtual IP, it needs to route the traffic towards these nodes. So again, the traffic gets to the node, and the CNI does the rest. And in this case, compared to the L2 mode, we get full load balancing through ECMP routes. And the scenarios can be more complex. We can have multiple routers. We have some knobs to drive which routers, which peers, we want to advertise a given service to. We have other knobs to say, hey, I want this BGP session to be established only from this set of nodes, because maybe they belong to different areas. And of course, we can have cascading routers, and this is like regular BGP. The configuration looks something like this.
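To make the ECMP point above more concrete: when several nodes advertise the same service IP, the router sees multiple equal-cost next hops and typically picks one per flow by hashing packet header fields. The sketch below is only an illustration in Python, assuming a SHA-256 hash over a five-tuple; real routers use their own, usually hardware-specific, hash functions, and the node addresses here are made up.

```python
import hashlib

# Hypothetical nodes that all advertise the same service IP over BGP.
NEXT_HOPS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for a flow by hashing its five-tuple.

    `flow` is (src_ip, src_port, dst_ip, dst_port, protocol). Packets of the
    same flow always hash to the same next hop, so a connection stays on one
    node, while different flows spread across all advertising nodes.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    for port in range(40000, 40005):
        flow = ("198.51.100.7", port, "192.0.2.10", 443, "tcp")
        print(flow, "->", ecmp_next_hop(flow, NEXT_HOPS))
```

In L2 mode, by contrast, the equivalent of `NEXT_HOPS` would contain a single elected node, which is why only BGP mode gives load balancing across nodes.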
So we still need the set of IPs to give to MetalLB in order to have them assigned to our services. And then we have this other item, which describes the properties of the BGP sessions that need to be established with the different routers. And we have a few features here. We have BFD support. We have node selectors. We support iBGP and eBGP, single-hop and multi-hop. But even if we are acting as a mini router, because MetalLB's purpose is only to advertise routes outside, we refuse any incoming routes to the node. Because again, that is not MetalLB's purpose. How does it work under the hood? The architecture is pretty simple. We have one single pod that is the controller, which is in charge of the IPAM part of MetalLB. So it's in charge of reconciling the services and the configuration with the IPs that need to be assigned to the services. And again, there is not too much networking on this side. The other part is the speaker. And the speaker is the part that is in charge of handling the networking side. We run it as a DaemonSet. It runs on each node. It runs on the host network, so it is in control of the configuration of the host networking. And it handles the announcement part, so both the L2 and the BGP one. And now I will talk a bit about the history. Originally, the BGP part was done using a native Go implementation that implemented a subset of the BGP protocol. This was before I started to maintain and contribute to the project. And at some point, there were a bunch of features that were being asked for by the users. And the people that were maintaining the project at the time had this discussion: should we start extending the Go BGP implementation to cover all these scenarios that the users are asking for? And the result was, we shouldn't. We should not reinvent the wheel. We should leverage something that is already doing that for us. And that thing was FRR.
FRR is an Internet routing protocol suite for Linux that is well established. And it implements all the stuff that we were looking to add to MetalLB. So as the result of this discussion, MetalLB was extended with an alternative mode, turned on by a configuration flag, where all the BGP part is handled by FRR. An FRR configuration looks like this. We describe our autonomous system number. We describe the properties of the neighbors. And then we describe the prefixes that we want to advertise. And we can do more. We can set rules for each neighbor. And associated with those rules, we can say, hey, if the IP belongs to this set of IPs, then we can add communities. We can set local preferences. We can block this IP for this particular neighbor. And this is all the stuff that was required in MetalLB. So now, in BGP mode, the way MetalLB works is that we are running multiple containers inside the speaker pod that we had before, and one of them is running FRR. And because all the containers share the host network namespace, what we need to do now is apply the proper configuration to FRR so that it can do its part. So this is what we have. On one side, we have all these continuously evolving configurations that we receive from the Kubernetes API. We have the services that come and go. We have the new routers that we want to configure. We have this BGP advertisement that allows us to set some properties on the advertisement itself. On the other side, what we want to achieve is the corresponding FRR configuration, so that FRR can do its part. And this is done by some code that takes all this changing stuff and continuously reconciles it into some sort of internal data. We pass it through the Go template engine. We then generate the FRR configuration that we want. But we are not finished yet, because then we need to instruct FRR to apply the new changes.
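The rendering step described above, turning the current set of peers and service IPs into an FRR configuration, can be sketched as follows. MetalLB does this with Go templates; this is a simplified Python stand-in, and the function name, the input shape, the AS numbers and the addresses are assumptions for illustration. The output is only FRR-style text for a basic IPv4 case, not the configuration MetalLB actually generates.

```python
def render_frr_config(local_asn, neighbors, prefixes):
    """Render a minimal FRR-style BGP configuration as text.

    `neighbors` is a list of {"address": ..., "asn": ...} dicts (the routers
    to peer with) and `prefixes` is an iterable of service prefixes to
    advertise. Prefixes are sorted so the same input yields the same output.
    """
    lines = [f"router bgp {local_asn}"]
    for neighbor in neighbors:
        lines.append(f" neighbor {neighbor['address']} remote-as {neighbor['asn']}")
    lines.append(" address-family ipv4 unicast")
    for neighbor in neighbors:
        lines.append(f"  neighbor {neighbor['address']} activate")
    for prefix in sorted(prefixes):
        lines.append(f"  network {prefix}")
    lines.append(" exit-address-family")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    config = render_frr_config(
        local_asn=64512,
        neighbors=[{"address": "10.0.0.1", "asn": 64513}],
        prefixes={"192.0.2.11/32", "192.0.2.10/32"},
    )
    print(config, end="")
```

Rendering deterministically matters in a reconcile loop: if nothing changed in the cluster, the rendered text is identical and no reload is needed.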
And luckily, what we can leverage is this Python script, frr-reload, that does a lot of stuff for us. It comes together with FRR, and it calculates the delta between the current configuration and the configuration that we want to achieve and applies all the resulting commands to FRR. And so again, we get to the right FRR configuration corresponding to the Kubernetes configuration, assuming that we are doing our part here. So I don't generally add memes to talks, but I thought that this one was particularly relevant. Because leveraging FRR allowed us to focus on the business logic, on what our users were asking us for, without having to worry too much about the protocol and its implementation. And that helped us a lot in adding new features to the project. We added BFD support seamlessly, and we added VRF support. IPv6 and dual stack were something that we were missing in MetalLB, and they came out naturally. But this doesn't mean that we don't have challenges. Probably the biggest challenge is the fact that on one side, we had an existing API that was already there and fit well with the MetalLB use case, where the focus was the service. So I want to apply all this logic to the service. On the other hand, FRR thinks in a slightly different way. So there is a good amount of logic doing this API contortionism in order to have one API fit the other. And again, that's because we wanted to be backward compatible. And probably the second most interesting challenge was the fact that MetalLB was known to be super stable. We came and replaced the core mechanism for the interaction with the routers, and we wanted to make sure that we weren't breaking too much. On top of that, we also started to add new features. And again, we were changing a lot in very little time. And at the time, there wasn't a proper CI mechanism that covered all the cases.
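The idea of computing a delta between the running and the desired configuration, as described above, can be shown with a heavily simplified Python sketch. This is not how the FRR reload script is implemented: the real tool understands nested configuration contexts and command ordering, whereas this sketch assumes flat, single-line statements and the convention that prefixing a statement with `no` removes it. The function name and sample statements are hypothetical.

```python
def config_delta(running, desired):
    """Return the commands that turn a flat `running` config into `desired`.

    Statements that only exist in the running config are removed with a
    `no ...` command; statements that only exist in the desired config are
    added. Statements present in both are left untouched.
    """
    running_lines = [line.strip() for line in running.splitlines() if line.strip()]
    desired_lines = [line.strip() for line in desired.splitlines() if line.strip()]
    running_set, desired_set = set(running_lines), set(desired_lines)

    removals = [f"no {line}" for line in running_lines if line not in desired_set]
    additions = [line for line in desired_lines if line not in running_set]
    return removals + additions


if __name__ == "__main__":
    running = "network 192.0.2.10/32\nnetwork 192.0.2.11/32\n"
    desired = "network 192.0.2.11/32\nnetwork 192.0.2.12/32\n"
    for command in config_delta(running, desired):
        print(command)
```

Applying only the delta, rather than restarting the daemon with a fresh file, is what keeps untouched BGP sessions and advertisements undisturbed when a single service comes or goes.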
So that was quite a challenge, because again, MetalLB users were used to having something that was stable, and we were promising that we were replacing the implementation in a compatible way. So the problem was that we want to be able to test something like this, where we have multiple servers and you have one router. You might have multi-hop, and you have all the configuration knobs that MetalLB has. And then you have node selectors. You have the BFD that we were adding. We have communities and a lot of stuff. And this was something that was keeping me from sleeping. So we started thinking, how do we set up a proper CI for this? And we use kind. Does anyone know what kind is? Of course. OK, so basically, with kind we are able to replace something like this with something like this. Each node is running inside a container. The external router is now replaced by FRR. So we use FRR both inside MetalLB to do the implementation, and also outside to validate that everything is working. And now we even have control over this kind of network, because this is the Docker bridge. So this allowed us to add a test suite where we apply the MetalLB configuration and the FRR configuration corresponding to that MetalLB configuration. And we can inspect the external router to check the advertisements, that all the sessions are up, and all that kind of stuff. And obviously, we can even access the service from outside. And we can test more complex scenarios. We can test multi-hop. We can test IPv4, IPv6, dual stack, and so on. And most importantly, this can run on our laptops, so even the development phase is now easier to move forward. And we are also able to run this in the upstream CI under GitHub Actions. So wrapping up, FRR enabled us to focus on the business logic of our application instead of having to re-implement all those protocols, to the point that sometimes writing the test suite takes more time than implementing the feature itself.
And I think this is a nice example of the interaction between two different projects. These are some resources. The first section is about MetalLB. We are trying to be as active as we can on the Kubernetes Slack channel. We always monitor upstream issues. If you want to ask questions, that's the right place to do it. Again, I'm responding on a daily basis. And I think the same can be said for the FRR community. They are super active in their Slack channel. And with that, I want to thank them, because they made our life super easy. And I'm done for today. Any questions? Hello. Is it possible to explain in a few words when I should use ARP or BGP? If you are working locally, if you have a home lab and you don't have a BGP-enabled router, that means that the only alternative that you have is L2 and ARP. Otherwise, the BGP mode requires routers but also gives you more power, because you have proper load balancing across all the different nodes. One more question here. Hi. Thanks for your talk. We have MetalLB running on our worker nodes. And our worker nodes are talking BGP to the host, routing to the host. So FRR is running on the metal machines as well. So can we configure MetalLB with FRR routing without conflicting ports? Because we already have a problem there. Yeah, correct. You can't. And that's something that we are trying to think about how to solve. One of the ideas that I have in mind, but this is at a super early stage, is having MetalLB able to use an existing FRR instance, but that comes with challenges, because it expects to rewrite the whole configuration and it might conflict with what you currently have. So the go-to solution now is to have different ports, I guess. OK, thank you. I have a user with a namespace where he uses a lot of IP addresses without control. A lot of, sorry? I have some users that are allocated separate namespaces in Kubernetes. And some of them are using a lot of IP addresses. Can you limit the IP addresses that MetalLB hands out?
For a namespace. Yeah. Yeah. So one of the things that we added over the past year, and I think it's going to go out in the next release, it's merged on main but it's not released yet, is namespace selectors and service selectors in the IP address pool. That's meant to solve the multi-tenant problem, where you want to have a given set of pools associated with a given tenant and not more. So we are trying to remove the control from the service itself, as it was before, where you had to set a fixed IP or the pool that you were pulling from inside the service, and have it as part of the cluster configuration, so that the cluster administrator can do that. And there won't be cases where applications are abusing the cluster. Well, Morio. Thanks for your talk, also. I have a question about the possibility of having MetalLB coexist with Calico, because I think in that case we have two BGP speakers on the same node, and I had issues with that in the past. Yeah, I'm not a Calico expert, so I know that the existing documentation suggests disabling the MetalLB BGP part and letting Calico advertise the services. But that's all I know, and I haven't dug into it. One more? Just to follow up on the last question, is it possible to run MetalLB on a different port for BGP? Yeah. Actually, it should be, but I don't remember, because it has to be a process parameter. I should check. If you catch me later, I'll tell you. Because for the neighbor part, yes, that is clear. For the MetalLB part, I don't remember. OK, thank you. Thank you. |
Decentralized Storage with IPFS
How does it work under the hood? |
So, great to see you all. So many people here. That's awesome. Welcome to my talk. It's called Decentralized Storage with IPFS. Maybe first of all, a quick poll. How many of you have used IPFS? Please raise your hand. Okay. Okay, nice. And how many of you have heard about IPFS? Okay, all of you. Okay, cool. So, you know all about it already, no? Yeah, so the subtitle is How Does It Work Under the Hood? So we will dive in pretty deep at some points of the talk. But first things first. My name is Dennis. I'm a research engineer at Protocol Labs. I'm working in a team called ProbeLab, and we're doing network measurements and protocol optimizations there. I'm also an industrial PhD candidate at the University of Göttingen, and you can reach me on all these handles on the internet. So, if you have any questions, you can reach out and let me know, or just find me here at the venue after the talk. So what's in it for you today? First of all, just in words and numbers, what is IPFS? Just a general overview. And after we have covered that, I will just assume we have installed a local IPFS node on your computer, and I will walk you through the different commands: we are initializing the repository, we are publishing content to the network, and so on. And I'll explain what happens in each of these steps, so that all of you hopefully get a glimpse of what's going on under the hood. So we are importing content, we connect to the network, I explain content routing, which is the very technical part, and at the end some callouts, basically. So what is IPFS? IPFS stands for the InterPlanetary File System, and generally it's a decentralized storage and delivery network which builds on peer-to-peer networking and content-based addressing.
So, peer-to-peer networking: if you have followed along, or if you were here earlier today, Max gave a great talk about libp2p, about connectivity in general in peer-to-peer networks, and IPFS is one of the main users of the libp2p library and builds on top of it. And most importantly, it's very tiny at the bottom, IPFS is not a blockchain. That's a common misconception, and I'd like to emphasize that. In numbers: these numbers are from mid last year, so probably in need of an update, but it has been in operation since 2015, and that hasn't changed. The number of requests exceeds a billion in a week, we see hundreds of terabytes of traffic, and tens of millions of active users, also weekly. But as a disclaimer, this is just from our vantage point. In a decentralized network no one has a complete view of what's going on, so these numbers could be much higher or just different in general. On ecosystem.ipfs.tech you can find some companies that build on top of this tech, and they are in all these different areas, social media and so on and so forth, so it's worth looking up. What's the value proposition of IPFS? The most important thing that it does is that it decouples the content from its host, and it does this through a concept that's called content addressing. Content addresses are permanent, verifiable links, and this allows you to request data with that content address, and anyone can serve you the content. Just from the address that you asked with, you can identify and verify that the content you got served is actually the one that you requested, and you are not dependent on the authenticity of the host, as is the case with HTTP. Because it's a decentralized network, it's also censorship resistant, and I like to put here that it alleviates backbone addiction. So what do I mean by that?
Let's imagine all of us wanted to download a 100 megabyte YouTube video here in this room. If we were 100 people, we would put pressure of about 10 gigabytes onto the backbone just to download the video into this room. Wouldn't it be better if we could just download it once and distribute it among each other, or download different parts and be a little bit more clever about that? In a similar vein, if we were working on a Google Doc here inside this room, why does it stop working if we don't have an internet connection anymore? It actually should work; it's actually ridiculous. And also, in the same category, this partition tolerance could become very important for emerging networks, or if you're just on patchy coffee shop Wi-Fi. Alright, so how can you install IPFS? I put down three different ways here. In general, you don't install IPFS; IPFS is more of a specification, and there are different implementations of this specification. The most common one is Kubo, which was formerly known as go-ipfs, so it's a Go implementation. There's a new one called Iroh, which is in Rust, and I think the newest one, in JavaScript, is called Helia. Yeah, I think that's the newest kid on the block. I will talk about Kubo here, and the easiest way to get started is to just download IPFS Desktop, which is an Electron app that bundles an IPFS node and gives you a nice UI, and you can already interact and request CIDs from the network and so on. Then there's IPFS Companion, which is a browser extension that you can install in Firefox or your browser of choice. Or you directly use Brave or Opera, which come with a bundled IPFS node already, so if you enter ipfs:// and a CID, it will resolve the content through the IPFS network.
But as I said in the beginning, in this talk we will focus on the command line, because we're at a developer conference, and I will also assume that we run Kubo, which is basically the reference implementation. So now we have downloaded Kubo from github.com/ipfs/kubo and we want to import some content. We just want to get started. We now have this ipfs command on our machine, and the first thing we do is run ipfs init. What this does is generate a public/private key pair, by default Ed25519, and it prints this random-looking string of characters, which is basically your public key. Formerly it was the hash of your public key, but now your public key is encoded directly in there. This is your peer identity, which will become important later on. It also initializes your IPFS repository, by default in your home directory under .ipfs. This is the location where it stores all the files. So if you interact with the IPFS network and request files, it stores them in this directory in a specific format, similar to how Git keeps its object store. And importantly, and I will point this out a couple of times, this is just a local operation. We haven't interacted with the network at all yet. So now we are ready to go. I have a file I want to add, so I run ipfs add and then my file name. Kubo gives you a progress bar and again prints a random-looking string of characters, which is the content identifier, the CID. This is the most fundamental ingredient here, and it is the part that decouples the content from its host. As a mental model, you can think of the CID as a hash with some metadata. It's self-describing, so the metadata is this description part. You can see the ingredients at the bottom. It's just an encoded version of some information, like a CID version.
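The mental model of a CID as a hash with some metadata can be sketched in a few lines of Python. This is a simplified illustration and not Kubo's code: it builds a version 1 CID for one raw block of bytes, using the published Multiformats codes (0x01 for CID version 1, 0x55 for the raw codec, 0x12 for sha2-256) and base32 with the b multibase prefix. The function names are made up for this sketch, and the result will generally not match what ipfs add prints for a file, because Kubo chunks the file and wraps it in additional structure first.

```python
import base64
import hashlib

CID_VERSION = 0x01   # CID version 1
RAW_CODEC = 0x55     # multicodec code: the content is raw bytes
SHA2_256 = 0x12      # multihash code: the digest was made with sha2-256
DIGEST_LEN = 0x20    # multihash length field: 32 bytes


def cid_v1_raw(data: bytes) -> str:
    """Build a CIDv1 string for a single raw block (illustrative helper)."""
    digest = hashlib.sha256(data).digest()
    # Self-describing part: which hash function and how long the digest is.
    multihash = bytes([SHA2_256, DIGEST_LEN]) + digest
    # More metadata in front: CID version and content codec.
    # All codes used here are below 0x80, so each fits in one varint byte.
    cid_bytes = bytes([CID_VERSION, RAW_CODEC]) + multihash
    # Multibase: lowercase base32 without padding, announced by the prefix "b".
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32


def verify(cid: str, data: bytes) -> bool:
    """Self-certifying check: recompute the CID from the bytes we were served."""
    return cid_v1_raw(data) == cid


cid = cid_v1_raw(b"hello FOSDEM")
print(cid)
print(verify(cid, b"hello FOSDEM"), verify(cid, b"tampered"))
```

The verify function is the reason the host does not need to be trusted: whoever serves the bytes, the requester can recompute the identifier and compare.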
So we have version zero and version one, and some other information that I won't go into right now. Then it's self-certifying. This is the point that, if you request some data from the network, you certify the data that you got served with the CID itself and not with the host that served you the content, just to reiterate this. And it's an immutable identifier. All these structures, like the CID structure at the bottom, are governed by a project called Multiformats, which is also one of Protocol Labs' projects. The talk is called what happens under the hood, so what actually happened here? IPFS saw the file, which is just this white box here, a stream of bytes, and chunked it up into different pieces, which is actually a common technique in networking. This gives us some nice properties. It allows us to do piecewise transfers, so we can request blocks from different hosts. It also allows for deduplication: if we have two blocks that consist of the same bytes, we can deduplicate them and save some storage space underneath. And if the file was a video file, it also allows for random access, so we could start in the middle of the video without needing to stream all the previous bytes. After we have chunked the file up, what IPFS does now is put it together again. We hash each individual chunk, so each chunk gets its own CID, its own content identifier. Then the combination of two CIDs gets another CID, and we do this for both pairs at the bottom. The two resulting CIDs are then put together yet again to generate the root CID, as we call it. And this is actually the CID that you see in the command line up there. So we took the chunks and put their identifiers together to arrive at the final CID at the top.
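The chunk, hash, and combine steps can be sketched roughly like this. It is a toy model under stated assumptions: plain SHA-256 digests stand in for CIDs, parents are built from pairs of children as on the slide, and the names add and block_store are made up. Kubo's real layout uses a different node format and a much wider fan-out, so the root computed here is not a real IPFS root CID.

```python
from __future__ import annotations

import hashlib


def h(data: bytes) -> bytes:
    """Stand-in for a CID: the SHA-256 digest of the bytes."""
    return hashlib.sha256(data).digest()


def chunk(data: bytes, size: int) -> list[bytes]:
    """Split a byte stream into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def add(data: bytes, chunk_size: int = 256 * 1024) -> tuple[bytes, dict[bytes, bytes]]:
    """Chunk the data, hash every chunk, and combine hashes up to one root."""
    block_store: dict[bytes, bytes] = {}
    pieces = chunk(data, chunk_size) or [b""]
    leaves = []
    for piece in pieces:
        digest = h(piece)
        block_store[digest] = piece   # identical chunks land on the same key
        leaves.append(digest)
    level = leaves
    while len(level) > 1:
        # Each parent identifier is the hash of its children's identifiers.
        level = [h(b"".join(level[i:i + 2])) for i in range(0, len(level), 2)]
    return level[0], block_store


root, store = add(b"abcd" * 4, chunk_size=4)
print(root.hex(), len(store))
```

Because the block store is keyed by the hash of each chunk, four identical chunks occupy a single entry, which is the deduplication property mentioned above. Changing any byte changes its chunk hash and therefore every hash on the path up to the root.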
This data structure is called a Merkle tree, but in IPFS land it's actually a Merkle DAG, because in a Merkle tree a node is not allowed to have more than one parent. DAG here stands for directed acyclic graph. Now let's imagine you didn't add a file but a directory. How do you encode the directory structure and not only the bytes? All of this formatting, serialization, and deserialization is governed by yet another project called IPLD, which stands for InterPlanetary Linked Data. IPLD does a lot more things as well, but for now this is what is specified in the scope of that project. So now we have imported the content. We have chunked it up, and we've got the CID. But again, we haven't interacted with the network yet. People often think that if you add something to IPFS, you upload it somewhere and someone else takes care of hosting it for you, for free, which is not the case. We added it to our local node, so it ended up in this IPFS repository somewhere on our local machine. Only now do we connect to the network and interact with it. For that we run ipfs daemon, which is a long-running process that connects to nodes in the network. We see some version information: which Go version it was compiled with and which Kubo version we actually use. We see the addresses that the Kubo node listens on, and also which ones are announced to the network, that is, under which network addresses we are reachable. Then it tells us that it started an API server, a web UI, and the gateway. The API server is just an RPC API that is used by the command line to control the IPFS node. The web UI is the thing you saw previously in the screenshot of IPFS Desktop, so your local Kubo node also serves this web UI. And then there is the gateway, which is quite interesting. It bridges the HTTP world with the IPFS world. So you can ask under this endpoint that you can see down there.
If you put /ipfs/ and your CID into the browser, or into your curl URL, the Kubo node will go ahead, resolve the CID in the network, and serve it to you over HTTP. So this is like a bridge between both worlds. Protocol Labs, Cloudflare, and others are running these gateways on the internet right now, and you can use them as a low-barrier entry to the whole thing. And then the daemon is ready. In this process, it has also connected to bootstrap nodes, which are hard-coded, to get to know other peers in the network. But you can also override them with your own bootstrap nodes. So now we are connected to the network, and we have added our file to our own machine. But now comes the interesting challenge: how do we actually find content hosts for a given CID? If I give my friend a CID, how does their node know that it needs to connect to me to request the content? I put here that the solution is simple: we keep a mapping table. We just have the CID mapped to the actual peer, and every node has this on their machine, so everyone knows everything, basically. But, as I said, the mapping table gets humongous, especially since we've split those files up into different chunks, and I think the default chunk size is 256 kilobytes. So we have a lot of entries in this table, and this doesn't scale. The solution would be to split this table, so that each participating peer in this decentralized network holds a separate part of it. But then we are back to square one: how do we know which peer holds which piece of this distributed hash table? The solution here is to use a deterministic distribution based on the Kademlia DHT. Kademlia is a specific protocol for a distributed hash table. At this point, many talks on the internet about IPFS gloss over the DHT and how it works. So when I got into this whole thing, I was lacking something.
So my experiment is to dive a little deeper into this. I will cover a bit of Kademlia here, and this is very technical, but at the end I will try to summarize everything so that every one of you gets a little bit out of this. This whole process, the resolution of a CID to the content host, is called content routing. IPFS uses an adaptation of the Kademlia DHT with a 256-bit key space. So we are hashing the CID and the peer ID yet again with SHA-256 to arrive in a common key space. The distributed hash table in IPFS is just a distributed system that maps these keys to values. The most important records here are provider records, which map a CID to a peer ID, where the peer ID is what was generated when we initialized our node, and peer records, which map the peer ID to actual network addresses, like IP addresses and ports. So looking up a host for a CID is actually a two-step process. First we need to resolve the CID to a peer ID, and then the peer ID to its network addresses. Then we can connect to each other. The distributed hash table here has two key features. The first is an XOR distance metric, which means we have some notion of closeness. If I XOR two numbers together, this operation satisfies the requirements for a metric. This means I can say that a certain peer ID is closer to a CID than some other peer ID. So in this case, peer ID X could be closer to CID 1 than peer ID Y. This allows us to basically sort CIDs and peer IDs together. The second is this tree-based routing mechanism. In this bottom-right diagram, which I got from the original paper, we have the black node. This tree-based routing is super clever. All the peers in the network can be considered as sitting in one big trie, a prefix trie, and the bubbles are subtrees of it.
And if we know only one peer in each of these bubbles, we can guarantee that we can reach any other peer in the network with O(log N) lookups, by asking for ever closer peers based on this XOR routing mechanism. So this was, abstractly, what the distributed hash table in IPFS does. How does it work concretely? We started the daemon process. What happened under the hood was that we calculated the SHA-256 of our peer ID, which just gives us a long string of bits. And we initialized a routing table, shown at the bottom. This routing table consists of different buckets, and each bucket is filled with peers that share a common prefix with the hash of our peer ID at the top. When our node started up, we asked the bootstrap peers: hey, do you know anyone whose SHA-256 of the peer ID starts with a 1? This means we have no common prefix, and we put those peers in bucket 0. Then we do the same for the prefixes 0 0 and 0 1 1, and so on. We go through the whole list until 255 and fill up these buckets. These buckets are basically the little blobs, the little circles, that you saw on the previous slide. And why did we do that? Because now we want to retrieve content. As I said, I handed the CID to my friend, and my friend enters the CID on the command line with this ipfs get command. Their node also calculates the SHA-256 of the CID and then looks in its own routing table. It sees, OK, I have a common prefix of 2, so it locates the appropriate bucket, bucket 2, and gets the list of all peers in it. Then it asks all of these peers: first of all, do you know the provider record already? Do you know the peer ID for that CID? If yes, we are done. If not, we ask: do you know anyone closer, based on this XOR metric?
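The XOR distance and the bucket choice described above can be written down compactly. This is a minimal sketch, assuming keys are the SHA-256 of an identifier interpreted as a 256-bit integer, and that the bucket number equals the count of leading bits a peer's key shares with our own key. The helper names are illustrative and not taken from any IPFS library.

```python
import hashlib

KEY_BITS = 256  # size of the common key space


def dht_key(identifier: bytes) -> int:
    """Hash a CID or peer ID into the shared 256-bit key space."""
    return int.from_bytes(hashlib.sha256(identifier).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's notion of closeness: the bitwise XOR, read as an integer."""
    return a ^ b


def bucket_index(own_key: int, other_key: int, bits: int = KEY_BITS) -> int:
    """Number of leading bits shared by the two keys, used as bucket number."""
    distance = own_key ^ other_key
    if distance == 0:
        raise ValueError("a node does not keep itself in its routing table")
    # The highest set bit of the XOR marks the first position where keys differ.
    return bits - distance.bit_length()


# Small 8-bit example: the keys differ in the very first bit, so bucket 0.
print(bucket_index(0b10110000, 0b00110000, bits=8))
# Here the first three bits agree and the fourth differs, so bucket 3.
print(bucket_index(0b10110000, 0b10100000, bits=8))

cid_key = dht_key(b"some-cid")
peers = [b"peer-x", b"peer-y"]
print(min(peers, key=lambda p: xor_distance(dht_key(p), cid_key)))
```

Sorting peers by xor_distance to a hashed CID is what makes it possible to say that one peer ID is closer to a CID than another.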
And then this peer yet again looks in its own routing table, and so we get closer and closer with this log N property that I showed you previously. For publishing content, it's basically the same. We calculate the SHA-256 of the CID, locate the appropriate bucket, get a list of all the peers from it, and then start parallel queries. But instead of asking for the provider record, we ask for ever closer peers. We terminate when the closest known peers in the query haven't replied with anyone closer to the CID than the peers we already know. Then we store the provider record with the 20 closest peers to that CID. We do it with 20 because there's peer churn. This is a permissionless network, which means peers can come and go as they wish. If we stored it with only one peer, we would risk that the provider record is not reachable when that node goes down, and in turn the content would not be reachable. So this is the very technical part, but let me summarize it, which is probably the easier way to understand all of this. First of all, we added the content to our node. So the file enters the provider, and the provider looks in its routing table, gets redirected to a peer that is closer to the CID, and keeps getting redirected until it finds the closest peer to the CID in this XOR key space. Then it stores the provider record with that peer. Then, out of band, the CID gets handed to the requester, to my friend. What I haven't told you yet is that IPFS also maintains a long list of constant connections to other peers, I don't know how many it is right now, probably a hundred or so, and opportunistically just asks them: hey, do you know the provider record for this CID? If this resolves, all good, we are done, but it's very unlikely for peers to know a random CID. So let's assume this didn't work.
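The walk towards the closest peers can be simulated on a toy network held in memory. This sketch makes several simplifying assumptions: it follows a single path instead of running parallel queries, every peer's routing table is just a Python list, and the names closest_peers and iterative_lookup are invented here. It only illustrates the termination rule (stop when nobody closer is returned) and the choice of the 20 closest peers.

```python
import hashlib

K = 20  # number of closest peers that receive the provider record


def dht_key(identifier: bytes) -> int:
    """Hash a CID or peer ID into the shared 256-bit key space."""
    return int.from_bytes(hashlib.sha256(identifier).digest(), "big")


def closest_peers(target: int, peers, k: int = K):
    """Return the k peers whose keys have the smallest XOR distance to target."""
    return sorted(peers, key=lambda p: dht_key(p) ^ target)[:k]


def iterative_lookup(target: int, start: bytes, routing: dict):
    """Hop from peer to peer, always moving to a strictly closer one."""
    current = start
    hops = [current]
    while True:
        known = routing.get(current, [])
        best = min(known, key=lambda p: dht_key(p) ^ target, default=current)
        # Terminate when the answer contains nobody closer than we already are.
        if (dht_key(best) ^ target) >= (dht_key(current) ^ target):
            return hops
        current = best
        hops.append(current)


peers = [f"peer-{i}".encode() for i in range(50)]
# Toy routing tables: every peer only knows its five neighbours in the list.
routing = {p: peers[max(0, i - 5):i + 6] for i, p in enumerate(peers)}
target = dht_key(b"some-cid")

print(iterative_lookup(target, peers[0], routing))
print(len(closest_peers(target, peers)))
```

With such sparse toy tables the walk may stop at a local optimum. In Kademlia, the bucket structure is what guarantees progress towards the globally closest peers within about log N hops.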
So the requester also looks in its own routing table, gets redirected closer and closer to the CID, and then finds the peer that stores the provider record. It fetches the provider record, then does the same hops again to find the mapping from the peer ID to the network addresses. Then we can actually connect with each other and transfer the content, and we're done. So this is the content lifecycle, and this is already it. Well, already is relative; it is quite involved, actually. With that, it's time for some callouts. Get involved: IPFS is an open source project. If you're into measurements, we have some grants open at radius.space, so if you want to get involved with some network measurements, get your applications in. All the action is in public. You can follow along with our work, especially the work of our team, at this GitHub repository. We have plenty of requests for measurements that you can dive into, and extra ideas are always welcome. In general, IPFS is, I think, a very welcoming community, at least it has been for me. And that's it. So, any questions? Is the way you describe it, using the DHT, how all nodes in the network share files with each other? That's one content routing mechanism, and there are multiple ones. The first thing I mentioned, this opportunistic request to your immediately connected peers, is also a kind of content routing, because you're resolving the location of content. Then there are some new efforts for building network indexers, which are just huge nodes that store the mappings. They are centralized nodes, but federated centralized nodes, so not as bad. I think these are the important ones, basically. So there are more ways to resolve them.
mDNS could also be one part: if you're on the same network, you're broadcasting... I know, that's just for, sorry, for the local network. Yeah, okay, true. Luckily we have a core maintainer of IPFS here. It's actually not a joke, but yeah, sorry. So I see that the provider records get replicated, but does the content actually get replicated across the network too? Only if someone else chooses to. You're publishing the provider record, so it's public somewhere, and anyone could look that up and also store the record themselves. So this is the idea: if content is popular and you care about the content staying alive in the network, you can pin the CID. This means you fetch the content from this other provider, store it yourself, and become a provider yourself. And because of the CID mechanism, which is self-certifying and so on, other peers that request the content from you don't even need to trust you, because the CID already encodes the trust chain. But this is not happening automatically. But you can have multiple providers for the same content? Definitely, yeah, that's part of it. Another question is how the concepts of identity, trust, and personas fit into IPFS. I'm thinking of metadata, ramifications about the content, and stuff like that. What do you mean exactly? For instance, the history of the content, and whether you can trust that this content is from a certain person. I would argue this would probably be some mechanism on top of this content identification. So this is more for IPLD then, perhaps. I would say, if you want to state that some content is from some specific person, then you would work with signatures, so signing the data and so on, which is something you would bolt on top of IPFS, but I think it's nothing IPLD has encoded right now.
It's partly the same question, about how it is ensured that there are no collisions in the content ID. No collisions? Yes, because if you publish some other content with the same content ID, and you said the content ID generation is happening locally, you could fake content. Yes, but then all these cryptographic hash functions would be broken, which would be very bad. If you have a hash collision, then it actually means you have the same content. That's the assumption right now. Or maybe, yes, Joe? We just use SHA-256 by default, and you can also use others like BLAKE3 or BLAKE2, but if you find a collision in SHA-256, you have bigger problems than IPFS not working. Exactly this, yeah. Following on from this, how resilient is this against malicious actors that want to prevent me from reaching the content? It's a big question, but maybe you can say something. Yes, so in peer-to-peer networks, these Sybil attacks are often the attack vector that is considered. This means you generate a lot of identities to populate some part of the key space, to block requests from reaching their final destination and so on. From my experience, this is quite hard, and I haven't seen it happening. I cannot say that it's impossible; it's probably hard to tell. Max, do you want to add something? Also, Kademlia has this mechanism where only long-living peers stay in the routing table. True, yeah. So this Sybil thing is just one attack vector, but it is the common one that is considered. There are many points in the code base where you need to think about what happens if a Sybil attack is going on, and one thing that Kademlia does is prefer long-running, stable nodes in the routing table. So if someone suddenly generates a lot of identities, they don't end up in your routing table and don't pollute or interfere with your content routing. All right, go ahead.
I'm not sure if I want to ask it, but what about removing content, deleting? We've got the GDPR, so is there any solution for that? So, yeah, it's hard. That's part of the thing: if you could delete, then it's not censorship resistant anymore. One solution, well, one alleviation maybe, is to have a blacklist of CIDs that you may or may not publish, to say, okay, don't replicate this CID and so on. But if you have such a list, then it's also very easy to just look it up and see what's inside. So deleting content is very tricky. However, I said these are permanent links, and yes, the links are permanent, but content still churns in the IPFS network. The provider records that you publish into the network expire after 24 hours, so if no one actually re-provides the content or keeps the content, the content is gone as well. But a delete operation doesn't exist, so we just need to hope that no one provides it anymore, which you could encourage with these denylists, for example. Yeah, Daniel, okay. Who is able to write into that blacklist, and is there any...? This is just one... I don't know, to be completely honest, but maybe Jeroko knows. There is no blacklist in the network right now. There are a few people that want that. But, sorry, earlier you said that we have gateways, and a gateway is just a node that is publicly reachable. Many people find some content on IPFS that they consider illegal, and instead of reporting it to the actual node that hosts the content, they just report it to the gateway, because they know HTTP and they don't know IPFS. So our gateway has a blacklist somewhere, but it's not shared by the complete network. It's just for our gateway, ipfs.io.
So Cloudflare, for example, is also running these gateways, and anyone could operate a gateway. So you could file a request saying, don't replicate this CID, it's a phishing website, for example, and then these CIDs are not served through the gateways, which are a common way to interact with the network right now. Just the gateways that follow the list, it's not a domain. Okay, we're running out of time, unless there is one more. I have a question regarding searching through the stored content. Is there any mechanism to go through or index the files that are there, to have some sort of search engine for that? Right, so there's a project called IPFS Search, and this makes use, among other things, of this immediate request for CIDs. As I said, if someone requests content, their node immediately asks its connected peers. These IPFS Search nodes are connected to a lot of peers and just sit there listening to these requests. They see, okay, someone wants this CID, so I go ahead and request that CID as well, and then index that content myself. You can then search on this IPFS Search website for something, just like with Google, and then you see CIDs popping up, and you can request those CIDs from the IPFS network. So this is one approach to index content. Okay, thank you. Thank you. |
CNI Automagic: Device discovery for semantic network attachment in Kubernetes |
Alright, thanks everybody for joining me today to talk about CNI Automagic, making use of some device discovery for semantic network attachment in Kubernetes. My name is Doug. I maintain something called Multus CNI, which is a way to attach multiple network interfaces to pods in Kubernetes, and I'm especially interested in telco use cases for Kubernetes. I am going to talk a little bit about some mappings to make this semantic, so I thought I would show you a map of where I'm from, which is Vermont in the United States. If you're not familiar with it, you might be familiar with our two most famous exports, Bernie Sanders and Ben and Jerry's ice cream. I also put in a little size reference here with the Adirondacks, which will come in as trivia later, and Belgium, because they're all sort of similar sizes. So what we're going to look at today is a problem statement for what I discovered, and a tour of a CNI plug-in that I developed to address that problem. What I made is really a proof of concept to try to address that problem, so we're also going to look at what the limitations are. But most of all, I want to show how the thought process that I used plays into what I think is the bigger picture of things that need to happen for the next version of CNI. So as we look at the problem statement, keep in mind that there's this problem and there's a solution that I've got, which is a kernel of a solution. But we also want to look at this thought process, because I think that it is more important than the solution that I've got. A lot of times in networking, we have these ideals that everything's going to be mapped out and it's going to be perfect. But once you get on the ground floor, and maybe you've had this job before, sometimes your network really looks like this in the end.
And at least for me, it's usually that we try to make everything as uniform and perfect as we can. In these diagrams, it'll look symmetrical. Everything will be by the textbook. However, once you go to implement it, you discover that not everything is the same. Sometimes you have legacy in your systems. You might need some non-sequitur type of things in your network, like a jump host, for example, or maybe you've bought some vendor equipment that just doesn't exactly match everything that was going to be part of your plan. And this is really where the problem starts for me: people come to me and say, yeah, well, this would work if it was the same on every machine, but I've got one or two or 25 out of a thousand that just aren't the same. So really what it is is, okay, hey, I'm adding a secondary network to my pods, and I've got this definition of what it's supposed to be. It references specific interfaces on my hosts, but in a non-uniform environment, it might just not match. In this particular CNI configuration, which is tiny on there, I'm sorry, we have a CNI config that wants to reference the specific interface that this is going to be created on, say, for example, eth0. And sometimes we want to know, okay, yeah, it's eth0, but what is actually the network behind that? Because as much as we can reference that interface, there's a greater world beyond it, which is how it's connected to the rest of the network. So say I have a non-uniform environment where node 1 has eth1, node 2 has ens2, and node 3 has ens4, and those are all connected to green. A network attachment definition is what Multus CNI uses for a secondary network. So when someone comes to me and says, I want to use this on this non-uniform network, I've got to tell them to make a configuration for each thing that's different.
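To make the duplication concrete, here is a rough sketch that generates one network attachment definition per node for the green network. The node and device names are the ones from the example above. The macvlan plugin type, the host-local IPAM block with its subnet, and the helper names are assumptions chosen only to produce a plausible-looking config, not the configuration shown on the slide.

```python
import json

# Which host device is plugged into the green network on each node.
GREEN_DEVICES = {"node1": "eth1", "node2": "ens2", "node3": "ens4"}


def attachment_for(node: str, device: str, network: str = "green") -> dict:
    """Build a NetworkAttachmentDefinition manifest for one node."""
    cni_config = {
        "cniVersion": "0.3.1",
        "type": "macvlan",          # assumed plugin type, for illustration
        "master": device,           # the host interface this config is tied to
        "mode": "bridge",
        "ipam": {"type": "host-local", "subnet": "10.10.0.0/16"},  # placeholder
    }
    return {
        "apiVersion": "k8s.cni.cncf.io/v1",
        "kind": "NetworkAttachmentDefinition",
        "metadata": {"name": f"{network}-{node}"},
        # The CNI config is embedded in the manifest as a JSON string.
        "spec": {"config": json.dumps(cni_config)},
    }


def all_attachments(devices: dict) -> list:
    """One manifest per node, differing only in name and master device."""
    return [attachment_for(node, dev) for node, dev in devices.items()]


for manifest in all_attachments(GREEN_DEVICES):
    print(json.dumps(manifest, indent=2))
```

Three nodes already mean three nearly identical objects that differ only in the master field, and every additional odd machine adds another one.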
And then additionally, on top of that, and again, it's too small and I apologize, I've got to tell them: make one for each node, but then make a pod that references that one and uses a node selector. That way you can get the right configuration for the right pod associated with the right node. And it's just not a very Kubernetes way of doing things. We want to express intent at a higher level and get these things attached in a way that's easier to express, and not have to baby every little thing and say, it's this stack of three things, and they have to be associated with this node, and I have to label this node. I just wasn't happy with that. So instead of all of this stuff that I'm configuring, with a CNI config, the pod, and the node selector with the labeled node, I just want to say: I want it attached to the green network. That's really what I want. I want to give some meaning to these network interfaces and make it so that I can scale in a non-uniform environment. We use Kubernetes for scale. It's a great way to deploy workloads at scale. And we use CNI for plumbing the network interfaces in our pods. So as a CNI developer and as a Kubernetes developer, this is how I approached this problem. I made something that I call Surveyor CNI. It essentially maps devices to network names using CRDs, which are custom resource definitions. A CRD is essentially a way to extend the Kubernetes API and to store data in a way that other Kubernetes applications can talk to. CRDs are sort of like a lingua franca there. Also, I have a number of projects that I've named after outdoor-related things in my area. And I was really thinking of this topographic engineer and rad adventure guy named Verplanck Colvin, who was famous in the Adirondacks for the first thing.
But the name Verplanck Colvin doesn't flow off the tongue. So I'm like, all right, I'll call it Surveyor CNI. It essentially works in two phases. When it's installed, it starts up a daemon set, and that daemon set has some Go code that goes onto each node and says, all right, give me the network interfaces that are on this particular node. Then it creates an empty CRD that we can use to create a mapping from network device name to network name. So in essence, I can have those nodes with all these different device names for the green network, and I can say: eth1 is green on node one, ens2 is green on node two, and on node three, ens4 is green. That way, I can just say, all right, we want to attach to network green. In lieu of actually having a demo of this, I will challenge you to bring up the code and run it yourself. I've got a do-it-yourself tutorial in the README, and I'll share the links with you. Otherwise, you'd just see my frustration over how to do it. So really, in a lot of ways, this is sort of a rolling chassis of a car that doesn't have an engine in it yet. It will make the custom resource for you, and it will let you fill out those associations yourself, but I don't think that really scales either. So something that could really be improved here, because it uses custom resources, is to have an engine behind it: other Kubernetes controllers that know something further about your network, so that you can program some kind of intelligence into this. So a working group that I'm part of, which we call the Kubernetes Network Plumbing Working Group, comes up with all kinds of ideas about how to plumb your networks in Kubernetes. One of our sort of, what's the word for it?
I guess it's the holy grail kind of question, which is to ask: what network am I attaching to? I think about that a lot, and it makes me think: okay, hey, if I have this messy physical network environment with cables all over the place, and I unplug a cable from one interface and plug it into another in the physical world, then everything changes. Something that keeps coming up for me is this idea: can we listen to Netlink in Linux and build some more intelligence that acts over the lifetime of a pod? IPv6 usually comes up when we talk about this, because in IPv6 you can have SLAAC, you can have router advertisements, so your IP addresses can be assigned on the fly, your routes can change, your network can be changing. These things are considered in IPv6, and I don't think they're very well covered in Kubernetes in general. I also thought, hey, if I have this high-level way to express this, could I train something that might know about this to figure it out for me, like an AI? If you haven't heard everyone talking about the leaps and bounds in AI lately, bring up social media and just see everyone talking about it. So I was like, yeah, maybe I could train ChatGPT to do this. I realize that at this point it will probably just confidently make up something about my network that's not actually true, but it seemed kind of sweet. Another question that came up when I was socializing this idea was: hey, why don't you just apply aliases to your network interfaces on the host? And I'm like, yeah, that's probably fine. In fact, I would say if you have this problem today, this is probably what you should do instead of using my demo software, because it's a really reliable way to do it. However, as a CNI developer and as a Kubernetes developer, it's just not the way that I approach the problem.
And for me, it didn't approach it in a way that exposes that data to something that can really be more of an engine behind this. So I think it's a good approach; I just think there's still something to be asked about scale here. Given that, let's think about some of the possible pitfalls. Something in my space that gets brought up a lot when we talk about this variety of network tools is: are we going to wind up creating some kind of network manager, like Neutron in OpenStack? I think that exposes a lot of problems, and I think it might also give some tunnel vision to the problems. What I really like about how we approach networking in Kubernetes with CNI is that we do it in a modular fashion. So with something like my Surveyor CNI, maybe it is a more modular way to approach it: hey, here's one tool for this specific case. If you don't encounter this problem, well then, don't use this thing, right? Or if you do, and you have a non-uniform environment, use something like Surveyor and get it to work for you, or create Ansible scripts and make the aliases for your interfaces. Another pitfall is that it doesn't cover how workloads would be scheduled to nodes that have those devices available, right? In my thing where I say network green, I really assume you're going to have a device mapping and association to the green network on every node. So if you have nodes that aren't attached to the green network, well, you need a way to approach that. The way we approach this problem today, with a resource being available on a specific node, is with something called device plugins. What device plugins do is give the Kubernetes scheduler awareness of consumable resources on a particular host.
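The mapping idea and the scheduling pitfall described here can be made concrete with a small sketch. This is not Surveyor CNI's actual code or CRD schema: the dictionary layout, the node names, the interface names and both helper functions are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical per-node associations: interface name -> network name.
# Node names, interface names and this layout are illustrative assumptions,
# not the schema of the real custom resource.
NODE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "node1": {"eth1": "green"},
    "node2": {"ens2": "green"},
    "node3": {"ens4": "green"},
    "node4": {"ens3": "blue"},  # a node that is not attached to green
}


def interface_for(node: str, network: str) -> Optional[str]:
    """Return the interface mapped to `network` on `node`, or None if unmapped."""
    for iface, net in NODE_MAPPINGS.get(node, {}).items():
        if net == network:
            return iface
    return None


def nodes_with(network: str) -> List[str]:
    """List the nodes on which a pod asking for `network` could be attached."""
    return sorted(n for n in NODE_MAPPINGS if interface_for(n, network) is not None)


if __name__ == "__main__":
    print(interface_for("node2", "green"))
    print(nodes_with("green"))
```

A pod that asks for network green resolves to a different interface on each node, and `nodes_with` shows the gap the talk points out: without scheduler awareness, nothing stops a pod from landing on a node where the lookup returns nothing.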
So if you've got, say, SR-IOV network interfaces for high performance, your device plugin can know about that and tell Kubernetes that, hey, there are 15 virtual functions on this SR-IOV card on this node. That's how we approach that. So you could definitely extend my idea by adding device plugins, which are not super intuitive to use, and I think that's an area that also needs help in this space. But it would be a way to work through that. All of that being said, I'm on a campaign to give people more awareness of what's going to happen with CNI 2.0. If you are interested in this space at all, I encourage you to keep an eye on what's going on in the next version of CNI. We've got a number of problems that we really want to address. One of those is what happens to networking during a pod's lifecycle. I was mentioning, hey, what happens if I unplug this network and plug it in somewhere else? Can I detect that? Can I do something about that? Because right now, CNI works in essentially two different operations, add and delete. When your pod is added, CNI runs. When your pod is deleted, CNI runs to clean it up. We want to have something that goes further through the lifecycle, so that with IPv6 and ever-changing things, that can be covered. Maybe you can also improve how cleanup happens, because CNI delete isn't necessarily guaranteed. But the number one thing I think should happen with CNI 2.0 is that we have more Kubernetes awareness. CNI is container-orchestration agnostic, which was an awesome start for it. But we essentially only have one container orchestration engine left, and it's Kubernetes. And I think that's important. So, that being the case, this is how I approached it, right? I took this problem and I approached it with Kubernetes and with CNI, right?
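The add and delete operations mentioned here can be sketched as a small dispatcher. In the CNI specification the runtime passes the operation in the `CNI_COMMAND` environment variable and the network configuration as JSON on standard input; the specification also defines a few other commands, such as CHECK and VERSION, which are left out here. The handler names, the `run_plugin` helper and the minimal result shape are assumptions for illustration, not the code of any real plugin.

```python
import json
from typing import Callable, Dict, Optional


def cmd_add(env: Dict[str, str], conf: dict) -> dict:
    """Handle ADD. A real plugin would create and configure the interface here."""
    return {
        "cniVersion": conf.get("cniVersion", "1.0.0"),
        "interfaces": [{"name": env.get("CNI_IFNAME", "eth0")}],
    }


def cmd_del(env: Dict[str, str], conf: dict) -> None:
    """Handle DEL. Cleanup should tolerate resources that are already gone."""
    return None


# Only the two operations discussed in the talk are wired up in this sketch.
HANDLERS: Dict[str, Callable[[Dict[str, str], dict], Optional[dict]]] = {
    "ADD": cmd_add,
    "DEL": cmd_del,
}


def run_plugin(env: Dict[str, str], stdin_text: str) -> Optional[dict]:
    """Dispatch on CNI_COMMAND the way a plugin binary would at startup."""
    command = env.get("CNI_COMMAND", "")
    handler = HANDLERS.get(command)
    if handler is None:
        raise ValueError(f"unsupported CNI_COMMAND: {command!r}")
    return handler(env, json.loads(stdin_text))


if __name__ == "__main__":
    config = '{"cniVersion": "1.0.0", "name": "green", "type": "example"}'
    print(run_plugin({"CNI_COMMAND": "ADD", "CNI_IFNAME": "net1"}, config))
```

Nothing in this shape runs between ADD and DEL, which is the gap the talk hopes CNI 2.0 will close: a cable moved mid-lifetime never triggers the plugin.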
And I needed to figure out how to interact with all of these objects again when I made a new CNI plugin. So I took a look at where I spent my time, and I spent a lot of it on the integration into Kubernetes. I'm a CNI developer, so I spent a good chunk of the time on that. I spent a half-decent chunk on design, and I spent most of the time integrating it with Kubernetes. I think that could be improved. The X factor here was that I also used ChatGPT to generate a bunch of my code, so maybe that also wasted my time, because it isn't always truthful with you. But, yeah, I want CNI 2.0 to communicate with Kubernetes. I think it's going to be a revolution in this space, and I want everyone here who's interested to get involved in this effort. I think it's going to be the next big thing for networking in Kubernetes. So, yeah, if you want, try it out. And that concludes what I've got. I can open it up for questions. Thank you.

Have you considered using Link Layer Detection Protocol? Or Discovery, I can't remember. It actually stands for... Yeah, that's the one. Love it. I love that idea. I hadn't thought of it, but I am going to put that right in line with the Netlink thing. Yeah, I was just thinking about that as well. But isn't the problem that it finds layer 2 devices, your adjacency with the layer 2 device, not layer 3? So I was wondering, would it be better to multicast from each, to have each device multicast what it found and what name it gave to this network? The question then is how you ever assign the colors if you want to completely automate it. And maybe you have ChatGPT assign the colors for you. I like it. No, I think that's really good. Also, yeah, good consideration on layer 2 versus layer 3. And, yeah, I think depending on your use case, it might end with your network.
But I know in the telco world, there's a lot of LTE networks, for sure. Yes. Yeah, I think the next device is going to block the LLDP, so you'll only see that neighbor. You won't see the other nodes on that subnet. That was more my issue. That's what you want to find out, right? Yeah. It could give you the VLAN if that's your network separation. Yeah, yeah, it could do. Yeah, I think it merits consideration for sure. Any other questions? No? Okay. Thank you very much. Thank you all. Thank you. |
Golden Signals with Cilium and Grafana |
All right. Good afternoon, everyone. Before I get started, how many of you attended the session about service mesh this afternoon? Sorry about that. If you're fed up with me, you have to stay with me for another session. I will repeat some parts of it to introduce the topic. So welcome to the session, Golden Signals with Cilium and Grafana. My name is Raymond de Jong. I'm Field CTO for EMEA at Isovalent. Isovalent is the company which originated Cilium. Today I'm going to talk about Grafana and how Cilium enables you to get golden signals out of the box. The agenda: an introduction to the technology, a little bit about eBPF and Cilium, then observability challenges and how we can tackle those challenges using observability. A bit on monitoring day 2 operations and the default dashboards we provide. And then, hopefully the demo gods are with me, a small demo to actually see how we get layer 7 metrics, see return codes, and monitor application response times, et cetera. So to start this topic, I want to introduce Cilium and eBPF. How many of you know about eBPF? Quite a lot, awesome. How many of you know about Cilium and are using Cilium? Cool. Great. So Isovalent is the company behind Cilium and eBPF. We maintain them with the community. To start with eBPF, let me explain what eBPF is and how it works. We like to say that what JavaScript is to the browser, eBPF is to the kernel. What we mean by that is that we make the kernel extensible in a dynamic way. That means we're not changing the kernel, but at kernel speed we can run programs based on kernel events. For the context of this session, what's important, considering metrics and getting useful information from your applications, is that whenever a process opens a socket or a packet gets sent on a network device on the node, we can export those metrics using eBPF.
And we can use tools like Grafana to display it in a good way, so you can do day 2 operations or see how your application or cluster is performing. Cilium runs on eBPF. The good thing about Cilium is you don't need to be an eBPF engineer to run and operate Cilium. You just set certain configuration options, and eBPF programs will be loaded on the nodes and will run when they need to run. At a high level, Cilium provides a number of capabilities: networking capabilities, observability capabilities, also service mesh, and with our Tetragon solution we can also provide runtime visibility and security observability based on process events such as opening files, process execution, et cetera. Today we'll focus on the observability part. So besides networking, we can get rich visibility into metrics on your cluster. As you may know, Google uses Cilium under the hood for Dataplane V2, and Microsoft has moved default Azure AKS clusters to Cilium, so all the cloud providers see that Cilium is a powerful technology to run clusters at scale and to get useful metrics while running them. So let's start with common challenges we see. One of the main challenges, if we talk about the performance of your application or of your clusters at scale, is that you run into the finger-pointing problem. What I mean by that is that network connectivity is layered, and when you run into issues, especially at scale, you need to quickly become aware of where a possible problem could exist. It can be in multiple layers: if you look at the OSI layers, it can be at the data link layer, the network layer, the transport layer, or the application layer. So the goal of Cilium with Grafana is to give you the tools to quickly inspect what's going on and to be more efficient in troubleshooting your cluster and/or applications. Another common issue, especially at scale, is obviously the signal-to-noise problem.
You may run in the cloud; you get data from your nodes; you see IP addresses communicating with each other, but IP addresses by themselves mean nothing in Kubernetes clusters. They come, they go, and at scale it's impossible to track and trace what's going on with service-to-service communication in your applications. Also, where are existing mechanisms falling short? First of all, network devices: think about centralized monitoring or firewall solutions, think about Splunk. They are great for getting alerts and dashboards, but again, at scale they can be very costly or they can be a bottleneck. These devices also don't have awareness of the identities of the applications running on your Kubernetes clusters. Cloud provider network logs are nice for node-to-node communication but don't provide identity either. You can monitor the host, you can see Linux host statistics, but that gives you visibility only at the node level, and again, a Linux node doesn't have any awareness of the identities of the services running in your Kubernetes cluster. You may want to try to modify application code, and this relates a bit to the service mesh session. You may want to install libraries which give you visibility into the application, but then you don't have visibility into the networking layer. Service mesh: the main goal of a service mesh is obviously visibility into network communication, to get metrics out of that, but with a sidecar-based implementation that comes with a footprint and with induced latency, plus you have the operational complexity of maintaining your service mesh solution on top of your Kubernetes clusters. So this is where Cilium comes around the corner: we provide identity-aware observability and security.
What that means is that, based on the labels you set on your workloads, we create unique identities, and we're able to attach that identity in the data plane using eBPF. Using that identity we can secure connectivity: in this example, a front end is allowed to communicate with a back end based on the network policies we set and the identities we are aware of, and these identities are a cluster-wide property. In terms of observability, this also means that we can use the identity to get rich metrics and data for it, and you can monitor it effectively. You're not looking at IP addresses anymore; you're looking at identities, so the whole set of workloads and the service-to-service communication from a front end to a back end. Hubble is our observability solution built on top of Cilium. How it works is that Cilium runs as a DaemonSet on your cluster nodes, as an agent, and Hubble can retrieve data from those agents through a CLI or UI, and we can export metrics based on your workloads. So there are three parts. First of all, the Hubble UI gives you a service dependency map, so at the namespace level you can see what's deployed, what is communicating with what, what kind of protocols they're using, and what's crossing between namespaces. For example, if there's inter-namespace communication you can identify the source namespace; if you use Cluster Mesh you can even identify the source cluster; and you can also identify egress and ingress traffic at the namespace level. The Hubble CLI is more of a power-user tool that gives you detailed flows; you can export them to JSON, and you can do a lot of filtering based on labels. Hubble metrics is the part that today's topic is mostly about: you export metrics and use, for example, Grafana to observe the performance of your cluster and applications.
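The idea of deriving an identity from a workload's label set, and then writing policy against identities instead of IP addresses, can be illustrated with a toy allocator. This is only a sketch of the concept: the class, the starting ID and the policy set are hypothetical, and Cilium's real identity allocation is a distributed, cluster-wide mechanism that this code does not reproduce.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


class IdentityAllocator:
    """Toy allocator: every distinct set of labels gets one numeric identity."""

    def __init__(self, first_id: int = 1000) -> None:  # starting value is arbitrary
        self._next = first_id
        self._ids: Dict[FrozenSet[str], int] = {}

    def identity_for(self, labels: Iterable[str]) -> int:
        key = frozenset(labels)  # label order and pod IP do not matter
        if key not in self._ids:
            self._ids[key] = self._next
            self._next += 1
        return self._ids[key]


def is_allowed(policy: Set[Tuple[int, int]], src: int, dst: int) -> bool:
    """Policy is expressed as (source identity, destination identity) pairs."""
    return (src, dst) in policy


if __name__ == "__main__":
    alloc = IdentityAllocator()
    frontend = alloc.identity_for(["app=frontend", "env=prod"])
    backend = alloc.identity_for(["app=backend", "env=prod"])
    policy = {(frontend, backend)}  # front end may talk to back end
    print(is_allowed(policy, frontend, backend))
    print(is_allowed(policy, backend, frontend))
```

Two replicas with the same labels share one identity however many IP addresses they cycle through, which is why flows and metrics can be grouped per workload rather than per address.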
This is all fueled by eBPF, so again, think about a network device sending a packet: that's a kernel event, an eBPF program is attached to it, it gets the metrics, and it's done. This is a small screenshot of the CLI. It gives you flow visibility using hubble observe commands. You can follow flows, for example, based on a label, in this case X-Wing, so we're following all the workloads labeled with X-Wing; again, no IP addresses, just labels. Highlighted in purple are the IDs we use: each unique set of labels gets a unique cluster-wide ID, and based on those IDs we can track, by label, what communication is going on. There's a lot of metadata you can filter on: things like headers, ports, and protocols, obviously labels, Kubernetes pod names, services, worker nodes, and DNS. We also have Cilium network policies which allow you to filter and observe toFQDNs rules, meaning we can inspect queries to external domains and filter based on that, and obviously Cilium-related identities such as world, ingress, egress, host and that kind of stuff, as well as policy verdict matches, things like dropped and allowed. This is the Hubble UI service map. Like I said before, this gives you a namespace-level view. In this case we have a jobs app, and I'm using this app in the demo I'm showing a bit later. So here you're looking at a namespace-level view where you can see all the service-to-service communication of the application running in that namespace.
In this case it's only intra-namespace communication, and you can see, for example, that the recruiter is talking to core API, and core API is talking to Elasticsearch. We have a ZooKeeper component, and we identify Kafka, including Kafka protocols, so there are a number of protocols we can inspect. We also see layer 7 information, in this case HTTP calls to a specific URI or URL with a specific method and return codes, and this is triggered through a simple construct, a Cilium Network Policy. If you just allow, let's say, traffic within the namespace and you are accepting HTTP, that already triggers this visibility. Now, using this data we can also export metrics to Grafana. We have been working with Grafana a lot more lately, which means we are building many more useful dashboards and also integrating with things like Tempo to get transparent tracing in Grafana, all powered by Cilium and eBPF. This allows us to see not only network-level performance metrics on the node, but also, for service-to-service communication, golden signals: things like HTTP request rate, latency, response codes and error codes, which, as an application engineer, allow you to quickly see which component of the application is not responding as it should. We can also detect transient network-layer issues, which is more network related: we may see retransmissions, we can see bytes sent and received, and we can use things like round-trip time to indicate a network-layer problem. So maybe in a data center you have a specific rack switch or top-of-rack switch not performing as it should, so nodes connected to that switch will have reduced performance and you would see latency increasing. Now, with the latest dashboards, we are also able to see problematic API requests using transparent tracing. This goes back to the integration with Grafana.
At the moment your application needs to support it, so you need to be able to inject traces. But we are working on getting this support out of the box as well, to be able to support HTTP traces as such. Then you get these exemplars, so I am pointing at a small exemplar here, after which you can inspect it with Tempo, see the spans of a specific request, and see where the problem may reside. A bit more on monitoring; this is more day 2 ops. I want to highlight that if you run Cilium and you are also using Grafana, make sure that you install the agent, Hubble and operator metrics plugins. These are out-of-the-box plugins we provide that you can download through the Grafana marketplace. This gives you visibility into the performance of your cluster. First of all, agent metrics: everything at the node level, how the node is performing, how much throughput it is processing, how much memory BPF is using, all this related stuff. Hubble metrics give you visibility across your cluster in terms of applications, layer 7 return codes, and policy verdicts, so allows versus drops, so you can monitor the performance of your applications at the cluster level. And in some cases you run an operator, so you may want to track the number of identities, how the cluster in general is behaving, API responses and such. Finally, what we released just a few days ago, thanks to Raphael, who is also here, is the Cilium policy verdict metrics dashboard, which gives you meaningful graphs on whether your workloads are actually hitting the network policies you set. What I mean by that is that when we work with customers on Cilium, they want to go to this micro-segmentation, zero-trust model, and you can obviously use Hubble to monitor service-to-service communication and see whether traffic is allowed or denied. But this dashboard is also a very useful tool to confirm whether you have ingress or egress policies which match your traffic.
So in this case we see green graphs, which means that on ingress and egress we have matching traffic. The purple represents DNS-matching traffic. But if there's some yellow, that's traffic matched by an allow-all rule, which is too broad and should trigger you to write better network policies, making sure those flows actually relate to a specific network policy so that both ingress and egress traffic are secured. If you do so, all the graphs will turn green and you can confirm for that given namespace that you have secured it. All right, I've prepared a little demo. This runs the tenant jobs application I mentioned before. I'm running this on kind, so just a simple kind cluster on my laptop. I'm showing here the components of my application, so it's a number of workloads I've shown before on the screenshot. To help me through this demo I've created a little script. All it does is update a Helm chart for this application, so it makes my workflow a lot easier: I don't have to enter commands, but we should see some things changing in a Grafana dashboard. Before I start this, let me highlight the metrics, so I need to log in. I've installed Grafana, I've installed Tempo, I've installed Prometheus, and I've configured Cilium to export those metrics. So this is currently the performance of my application running on my kind cluster on my laptop. As you can see, we have a 100% success rate, we have incoming 100%, and we also have good Grafana information for the performance of the application. Okay, so let me start the script. I mentioned before that the Hubble metrics are available as soon as you configure some kind of layer 7 Cilium network policy, because that triggers the collection of those layer 7 metrics. I'm showing this here, but I will show it a bit better in a different window.
So what I'm going to do now is increase the request volume, so I'm configuring the crawler component to generate more requests in my application. As you can see, it's redeploying the crawler component. This is something we should see in the Grafana dashboard. While this is redeploying, I can show the Helm chart I'm using. You need to be a bit patient with me because it takes about one minute before the Grafana dashboards are updated and you can actually see the impact of this new version of the application. Typically you configure Cilium through a Helm values file. In this case, on the operator component I've enabled metrics and Prometheus. On the Hubble side I've configured Hubble Relay to gather the metrics, and also Prometheus and metrics. This part is very interesting, because if you want layer 7 visibility you need to have specific metrics enabled. This will be documented on the cilium.io website once we have the new release ready. As you can see, we have the httpV2 metric, we have enabled exemplars, and we are setting labels for the context: source IP, source namespace, etc. These are important sets of labels you need to set, and on the Prometheus side we've enabled it to gather the metrics for the graphs. The Cilium network policy I mentioned before is just a simple example. We allow everything within a namespace. We have enabled DNS visibility, so we're inspecting all DNS traffic to kube-dns, which gives us visibility into the DNS queries. We've enabled ingress and egress for the purpose of the demo so we can also see that traffic. And what's important is that we have an empty, or open, HTTP rule, which allows all traffic but triggers the collection of metrics. All right, back on the demo side, it has deployed a new version of my application. Looking at my metrics, I can see incoming request volume increasing, so you already see an increase in volume.
We also see requests by source and response code increasing, so still 200s, always fine, always good, just an increase in requests per second. And also on the incoming side. Okay, all good. Let's now deploy a new configuration of our app, and this one has an error, so let's see what we can see there. I'm redeploying the core API component, and now we should be able to see the error rate increase as a result of the core API configuration changing. This will take about one minute. Here I can select the destination workload, so I can switch between core API or the loader component to see how the traffic for that destination is matching and how it's performing. Let's give it a few seconds to actually show. What I'm looking for is the incoming request success rate. Obviously this application version has an error, so the success rate will be lower than before. It's running. It always takes a bit longer than I want. There we have it. In the meantime I'll already start the next step so I don't keep you waiting. So here you see that the success rate is decreasing because of this faulty version of my application, so I can really see there's something wrong with it, and as the application developer, or the owner of this namespace, I should now be able to investigate what's going on. This also means that here, on incoming requests by source and response code, I see the resumes component showing 500 and 503 return codes, which triggers me to check that component and the communication between those components. Also on the destination side. All right. Now I've introduced a new version, and what this does is change the request duration. So again, a new version of the application, and let's see how we can monitor its performance in Grafana. Let's check the request duration increase as a result of the core API configuration changing. Okay. So let's look here. I'm monitoring HTTP request duration by source and destination.
So if the demo works well, we see an increase there. Okay. It takes a bit longer than I'm comfortable with, but it should appear any moment. Let me just... In the meantime I will deploy a new version of the application which also introduces tracing. Again, for tracing to be supported, at this moment your application needs to support it. So in this case I'll deploy a new version of the application that supports tracing, and this is using OpenTelemetry. So let's deploy that in the meantime. That's deploying. Meanwhile, I can check how the request duration is doing. Okay. This part is not working yet, but we should see a request duration increase. Oh God, yeah, thanks. That doesn't help. I clicked on something. Ah yes, thank you so much. Yeah, here you can see the request duration increasing. And I just deployed a new version of my application which supports tracing using OpenTelemetry, and you can already see these exemplars appearing. So now I can not only inspect HTTP request duration, but I can also inspect specific traces and exemplars. If you click on this little box you get this window, with valuable information about this trace point, and then you can query it with Tempo. Yep, let's leave this side. Here you can see a specific trace ID and a node graph. This is also nice: you can see what's going on between nodes, and you see highlighted in red what has a high latency. And here we can see that in this specific API call there is an error: a POST call, and it has some events, an exception, a random error. So something is wrong with my application. This enables me as an application owner to troubleshoot my application effectively. So this concludes the demo. Let me quickly move to here.
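The three panels followed in the demo (request volume, success rate and request duration) boil down to simple aggregations over HTTP flow records. The sketch below computes them directly in Python to show what the graphs represent. It is not how Hubble or Prometheus compute them; the `HttpFlow` record, the convention that any 5xx response counts as a failure, and the nearest-rank percentile are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HttpFlow:
    """Hypothetical record of one observed HTTP request/response pair."""
    source: str
    destination: str
    status: int
    duration_ms: float


def request_rate(flows: Sequence[HttpFlow], window_seconds: float) -> float:
    """Requests per second over the observation window."""
    return len(flows) / window_seconds


def success_rate(flows: Sequence[HttpFlow]) -> float:
    """Fraction of responses that are not 5xx (assumed definition of success)."""
    if not flows:
        return 1.0
    ok = sum(1 for f in flows if f.status < 500)
    return ok / len(flows)


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence."""
    if not values:
        raise ValueError("percentile of empty sequence")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    flows: List[HttpFlow] = [
        HttpFlow("crawler", "coreapi", 200, 10.0),
        HttpFlow("crawler", "coreapi", 200, 20.0),
        HttpFlow("resumes", "coreapi", 500, 30.0),
        HttpFlow("resumes", "coreapi", 503, 400.0),
    ]
    print(request_rate(flows, window_seconds=2.0))
    print(success_rate(flows))
    print(percentile([f.duration_ms for f in flows], 95))
```

Grouping the same records by `source` and `destination` before aggregating gives the per-workload breakdown shown in the dashboards, which is what lets the faulty component stand out.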
All right, so if you want to know more about how to configure Cilium to enable metrics, and how to configure Cilium with the right values for layer 7 monitoring, I recommend reading the documentation on cilium.io. If you're using Cilium or planning to use Cilium and you have questions, go to our Slack channel. We're happy to help you there. The community is out there and very helpful in answering questions. If you want to know more about eBPF, go to ebpf.io. We have also released, or are close to releasing, a lab with these kinds of dashboards, so feel free to check them out at isovalent.com. And if you want to know more about Isovalent or you may want to contribute, we also have open positions for engineering. So if you want to know more, please check us out. I'm happy to take questions. Any questions?

Hello. Thank you for your talk. Is it possible in the service graph of the Hubble UI to show transitive dependencies of services with tracing enabled? So with the Hubble UI, the open source version, you will see the service connectivity only. That tracing-related information is not integrated in Hubble as such, so you would switch between Hubble and Grafana to get that information. On the enterprise side, we have built-in dashboards for specific areas of monitoring, let's say application, node, or cluster-wide performance. We have some dashboards which should quickly highlight performance issues there. Okay. Thank you. Any other questions? Hello. Did you measure the impact of the collection of metrics on network performance? Yeah. We do have some performance-related reports on cilium.io. So yes, it comes with a price. Using eBPF, we keep it as low as possible. It's a very hard question to answer because it will depend on which flags you configure. If you have full layer 7 visibility across all workloads in your cluster, of course it will have some performance impact.
But yeah, it's a multi-dimensional question. It depends on the amount of traffic, the number of applications, how big your cluster is, et cetera. We have some performance reports you can check. So that's 500 nodes, 1,000 network policies, Hubble enabled, and you get some feel for what memory consumption and processing look like with Cilium. So feel free to check them out on the cilium.io website. But in practice, it's a multi-dimensional story. Welcome. Any other questions in the back? Hi. Thanks for the talk. A couple of questions about the integration of Cilium on AKS and GKE. Is there anything specific regarding those implementations, or do all the tools work natively on these kinds of clusters? And a second question regarding the Hubble UI: is it possible to see inter-namespace traffic flows, or is it limited to intra-namespace? OK. Good question. To answer the second question first: yes, you can see that. If there is communication from a different namespace to the namespace you are monitoring, you will see that. You will see those labels, and you will see the workloads, even across clusters if you enable Cluster Mesh. So yes, that works out of the box. On the cloud provider side, if you run AKS with Azure CNI Powered by Cilium, you have a limited set of metrics which are enabled. That's obviously for support reasons, for Microsoft to support that solution out of the box. However, you can also choose to bring your own CNI with AKS, and that also applies to GKE and EKS, and manage Cilium yourself. Right? Then you have the freedom to configure the flags I just showed and to configure Cilium as such. Keep in mind that you are then responsible for monitoring and managing Cilium, and the cloud provider will manage the cluster. Any other questions? Cool. We have to cut it here. Oh, yes. Sorry. OK. Thank you very much. Thank you. Thank you. |
Need to connect your k8s pods to multiple networks? No problem [with calico/vpp]!
Multi-legged containers running wild with calico/vpp |
Hi everyone, so I'm the last speaker of the day, and I'm going to talk about Kubernetes pods connecting to multiple networks. Doug already spoke about this; I'll take a slightly different approach. First, a few things about myself: I'm a software engineer at Cisco working on container networking things, and I'm a maintainer of Calico VPP, which is going to be the topic of this talk. This talk is also a bit particular: it's the result of a collaboration with many awesome people at Tigera, Intel mostly, and Cisco, and of a direct collaboration with Mrittika Ganguly, who is a PE at Intel. She sadly couldn't be here today because it's quite far from the US, where she lives, but I'll do my best to present her work. So first, a bit of background on this work. In the world of deploying applications, Kubernetes has really become the solution of choice when it comes to deploying large-scale services in various environments, because it provides the primitives for scalability: MetalLB, which we saw in a previous talk, services, health checks and so on. It also provides uniformity of deployment, and it's infrastructure agnostic, so you don't need to know what you're running on. But coming from CNF land, so trying to deploy a network function in this environment, the story is not the same. I'll define a bit more what I mean by CNF, because it's a bit different from the standard CNF use case, the 5G one. I'll take an example for the sake of this presentation: typically, what I mean by that is a WireGuard headend. For example, you have a customer and you want to deploy a fleet of WireGuard headends to give that user access to a resource in a company network, typically a very private printer that everybody wants to access, because people like to print.
The particularity of this use case is that it's dynamic enough to benefit from the abstractions that Kubernetes brings, and I've lost my mouse. So typically load balancing, scheduling and those kinds of things. But it has a lot of specific needs. For example, ingress has to be done in a particular way because you have WireGuard-encrypted traffic, so typically you want to see which IP it's coming from. You also have constraints on how you receive traffic, and that's where you need multiple interfaces going into your pod. And you also require high performance because it's encrypted traffic, so you want those things to run fast and you have a lot of users using them. Not for that printer, but assuming it's a bigger use case. So we tried to design a solution for that. There are lots of components at play, and I'll try to go through them quickly. At the top we have our application, here the WireGuard VPN head-end. We want to deploy it on top of Kubernetes, so we have to choose a CNI, and we went with Calico, mainly because of the cuteness of the cat, but also because it provides a really nice interface for supporting multiple data planes and a nice BGP integration that allows us to tweak the way we process packets. For carrying packets we use FD.io's VPP as a data plane, which gave us more control over how packets are processed. That allowed us to go deeper into the way networking is actually managed at a really low level. There are also other components that are going to come into play, but more on this later. I'm going to go quickly over Calico and VPP because they have been presented many times.
In short, Calico is a Kubernetes CNI providing a lot of great features: policies, BGP, support for really huge clusters. The point that's important for this presentation is that it has a very well-defined control plane / data plane interface, allowing us to plug new performance-oriented software underneath it without much hassle, and that's what we are going to leverage. We chose to slip VPP underneath Calico, first because we were originally contributors to this open-source user-space networking data plane, so it was a solution of choice. But it also has a lot of cool functionality built in, and it's extensible. So I am doing a bit of publicity for the software I'm coming from, but it was a good tool for this use case. And it's also quite fast, so it really fits the needs of this application. So how did we bind the two together? We built an agent running in a DaemonSet on every node, so deployment is the same as for a simple pod, just with more privileges. We registered these agents in Calico as a Calico data plane and used the gRPC interface and the APIs that Calico exposes to decouple control plane and data plane. That agent listens for Calico events and then programs VPP accordingly. We also built a series of custom plug-ins for handling NAT, services and so on. And we tweaked the configuration so that things behave nicely in a container-oriented environment. With all this, we have every brick needed to bring VPP into the clusters, and so to really have control over everything that happens in the Kubernetes networking. What happens exactly under the hood? We swap all the network logic that was happening in Linux over to VPP, so from this configuration to that one. The thing is, as VPP is a user-space stack, we have to do a few things a bit differently compared to what was previously done in Linux.
So in order to insert VPP between the host and the network, we grab the host interface, the uplink, and consume it in VPP with the appropriate driver. Then we restore host connectivity by creating a tun interface in the host's root network namespace. That's the tun/tap here. And we replicate everything on that interface: the addresses, the routes. So basically we insert ourselves as a bump in the wire on the uplink. It's not very network-ish, but it works pretty well in that configuration. In the same way, we restore pod connectivity as before, with tun/taps instead of a veth: we create an interface in every pod. Then everything runs normally. The Calico control plane runs normally on the host and configures the data plane functions in VPP via the agent. So now we have the green part covered, and all those components run neatly. What we achieve with that is that when we create a pod, Kubernetes calls Calico, and Calico calls VPP. And we can provide an interface that we fully handle at the network layer directly in VPP. But for this specific WireGuard head-end application, we need a bit more than that. We need multiple interfaces, and we also potentially have overlapping addresses, so we don't really manage where the IPs are going to end up. For the multiple-interface part, our go-to choice was Multus, which provides multiplexing. We also chose a dedicated IPAM that we patched, Whereabouts, because it was quite simple to patch, and we brought those two pieces in. So when I say multiple interfaces, what does that exactly mean? The typical Kubernetes deployment looks like this: each pod has a single interface, and the CNI provides pod-to-pod connectivity, typically with an encapsulation from node to node. But in our application, we want to differentiate the encrypted traffic from the clear-text traffic, so before and after the head-end. But we still want Kubernetes to operate as the SDN.
So we still want the nice things about Kubernetes: service IPs and everything. So it's not only multiple interfaces, it's really multiple interfaces wired into Kubernetes. It's more like multiple isolated networks. Conceptually, what we needed was the ability to create multiple Kubernetes networks, each network behaving a bit like a standalone cluster, stacked on top of each other. With this, we request networks that provide complete isolation from each other, meaning that traffic cannot cross from one network to another without going through the outside world. That means we have to bind Calico, VPP, our integration, and Multus together to create a model where everybody is aware of that definition of networks: have a catalog of isolated networks, specify the way they are going to communicate from node to node via VXLAN encapsulation, and have a way for pods to attach to those networks with annotations, so that in the end Kubernetes is aware of these networks and we can still maintain the SDN part of the logic. The way this works, quickly, is that Multus calls the Calico CNI once per pod interface. We in turn receive those calls in our agent, and we can map them to annotations and do our magic to provide the logic. Having the IPAM patch also allows us to support multiple IPs and to have different realms where the IPs live and get allocated from. From a user's perspective, what we expose is a network catalog where networks are defined as CRDs for now. We are starting a standardization effort to bring that into Kubernetes, but that will probably take time. So right now we kept it simple: you just specify a VNI, using VXLAN by default, and pass a range. And we also keep a network attachment definition from Multus, with a one-to-one mapping to networks, so that we don't change too many things at once.
And then we use those networks in the pod definitions. We reference them the Multus way. We can reference them in services as well, with dedicated annotations. That way we tell our agents to program VPP so that the services apply only in a specific network. Policies work the same way. That also gives pods the ability to tweak the parameters exposed on the interface a bit more: to specify the number of queues we want, the queue depth, and also to support multiple interface types. That gives a lot of flexibility, first to get the functionality, the fact that we have multiple interfaces, and also to size them so that the performance is appropriate for the use case we want to achieve. The last nice feature of this is that, as we have GoBGP support, we can peer those networks with the outside world if we have a fabric that's VXLAN and if GoBGP supports it. That part is still a bit of a work in progress and there are a lot of things to get right, but that's the end picture we want to get to. If we put everything together, we would probably get something that looks like this. When a user wants to connect to this hypothetical VPN and that hypothetical printer, the traffic gets into the cluster via GoBGP peering, so it's attracted to the green network. It hits a service IP in that network, so it gets some load balancing across the nodes. Then it's decrypted in a pod, which then encapsulates the traffic and passes it, for example, to a NAT pod running in user space. Here I put another type of interface that is more performance-oriented. And then it exits the cluster on a different VLAN peered with the outside world.
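To make the catalog-plus-annotation idea concrete, here is a minimal sketch in Python. It is not the Calico/VPP CRD schema: the catalog fields (`name`, `vni`, `range`), the network names, the VNI values, and the image name are all assumptions made up for illustration. The only thing taken from a real project is the Multus annotation key `k8s.v1.cni.cncf.io/networks`, which takes a comma-separated list of attachment names.

```python
import json

# Real annotation key read by Multus to attach extra networks to a pod.
MULTUS_NETWORKS_ANNOTATION = "k8s.v1.cni.cncf.io/networks"


def make_network(name, vni, cidr):
    """Build a hypothetical catalog entry: one isolated network = one VNI + one IP range."""
    return {"name": name, "vni": vni, "range": cidr}


def check_catalog(networks):
    """Reject catalogs where two networks share a VNI, since that would break isolation."""
    vnis = [n["vni"] for n in networks]
    if len(vnis) != len(set(vnis)):
        raise ValueError("each network needs its own VNI")
    return networks


def pod_manifest(pod_name, image, networks):
    """Return a pod manifest (as a dict) that requests one extra interface per network."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            "annotations": {
                # Multus accepts a comma-separated list of attachment names.
                MULTUS_NETWORKS_ANNOTATION: ",".join(n["name"] for n in networks),
            },
        },
        "spec": {"containers": [{"name": pod_name, "image": image}]},
    }


# Hypothetical catalog: one network for encrypted traffic, one for clear text.
catalog = check_catalog([
    make_network("encrypted", 100, "10.10.0.0/16"),
    make_network("cleartext", 200, "10.20.0.0/16"),
])
manifest = pod_manifest("wireguard-headend", "example/wireguard:latest", catalog)
print(json.dumps(manifest, indent=2))
```

The VNI uniqueness check is only there to mirror the isolation requirement from the talk; how the real agent enforces isolation is not described in this detail.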
So some parts still need to be done, but the general internal logic of the cluster is something that works, and it brings the ability for container networking functions to run unmodified, with their multiple interfaces, directly in a somewhat regular cluster. We spoke about improving the performance of the network, of the underlying interface, but we can also improve the performance with which the application in the pod consumes its own interfaces. The standard way applications consume packets within pods is via socket APIs. It's really standard, but you have to go through the kernel, and it's a code path that wasn't designed for the performance levels of modern apps. That's why GSO came up as a network stack optimization. But here, with VPP running, it would be nice to be able to bypass the network stack and pass the packets directly from VPP without touching the kernel. Fortunately, VPP exposes two different ways to consume those interfaces. We'll mostly go into the first one, which is the memory interface, memif. Basically, it's a packet-oriented interface standard relying on a shared memory segment for speed. It can be leveraged by an application via a simple library: gomemif in Go, libmemif in C, DPDK, or even another VPP. That provides a really high-speed way of consuming that extra interface in the pod. And the really nice thing about this is that it also brings the connection between the Kubernetes network, the Kubernetes SDN, and the pod into user space, meaning that now that connection lives in a regular C program, it's easier to leverage CPU optimizations and new features. And that's where the silicon re-enters the picture, with the work from Mritika from Intel and their team. They benchmarked this kind of setup and also introduced an optimization that's coming in the fourth-generation Intel Xeon, called Data Streaming Accelerator. Basically, it's a way to optimize copies between processes on some CPUs.
What they did is compare the performance that we get with the Kubernetes clusters, multiple interfaces, and a simple pod. So not bringing in the whole VPN logic, just doing an L3 patch and seeing how fast things could go between regular kernel interfaces (the tun), the memory interfaces, and the memory interfaces leveraging those optimizations in the CPU. That gives those graphs, which have a lot of numbers in them, but I'll try to sum up quickly what this gives. There are two MTUs here, 1,500 bytes and 9,000 bytes. The performance for tun interfaces is in dark blue, the regular memif is in blue, and the DSA-optimized memif is in yellow. Basically, it makes a really huge difference: throughput with DSA is 2.3 times higher than with regular memif for 1,500-byte packets, and with DSA enabled it's 23 times higher than tun/tap. With a 9,000-byte MTU, you get more than 60 times the tun/tap throughput with the memif that's optimized with DSA. The number that's really interesting is that with the 9,000-byte MTU you get a single core doing 100 Gbps, and that without too many modifications to the application. So basically, you just spin up a regular cluster and, if the CPU supports it, you use a regular library and you're able to consume packets at really huge speeds without modifying the application too much. There is another graph looking into the scaling with the number of cores, both with small MTUs and large MTUs. Basically, that shows that we can save cores: tun/tap does not scale very well, the regular memif scales from 1 to 6 cores, and DSA achieves the same results with 2 to 3 fewer cores than its regular memif counterpart. So you achieve 100 Gbps, which was the limit of the setup, with a single core in the case of large MTUs and 3 cores in the case of smaller MTUs. So that's all for the talk.
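The ratios quoted above can be cross-checked with a little arithmetic. The sketch below only uses numbers from the talk (2.3x, 23x, 100 Gbps, 1 and 3 cores, the two MTUs); the packet-rate helper assumes every packet is exactly MTU-sized and ignores Ethernet framing overhead, which is a simplification of mine and not something the benchmark states.

```python
LINK_GBPS = 100  # limit of the benchmark setup, as stated in the talk

# Ratios quoted in the talk for the 1,500-byte MTU.
dsa_over_memif = 2.3   # DSA-optimized memif vs regular memif
dsa_over_tun = 23.0    # DSA-optimized memif vs tun/tap

# Implied ratio: how much faster regular memif is than tun/tap.
memif_over_tun = dsa_over_tun / dsa_over_memif


def packets_per_second(gbps, mtu_bytes):
    """Packet rate needed to fill `gbps`, assuming MTU-sized packets and no framing overhead."""
    return gbps * 1e9 / (mtu_bytes * 8)


def per_core_gbps(total_gbps, cores):
    """Average throughput each core must sustain to reach the total."""
    return total_gbps / cores


print("regular memif vs tun/tap:", round(memif_over_tun, 1))
print("pps at 9000 B MTU:", round(packets_per_second(LINK_GBPS, 9000)))
print("pps at 1500 B MTU:", round(packets_per_second(LINK_GBPS, 1500)))
print("Gbps per core, 3 cores:", round(per_core_gbps(LINK_GBPS, 3), 1))
```

Under these assumptions the two quoted ratios imply that regular memif is already roughly ten times faster than tun/tap at the small MTU, and that the small-MTU case needs about six times the packet rate of the large-MTU case to fill the same link, which is consistent with it needing more cores.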
Sorry I went into a variety of different subjects, because that topic goes in a lot of different directions. Basically, that was to give you another view of the direction we are trying to go: bringing all those pieces together in a framework that allows us to make those CNFs run in a Kubernetes environment. This work is open source. The details of the tests that were done are in the following slides. You can find us on GitHub, and there is also a Slack channel open where you can ask questions. And we have a new release coming up in beta, aiming for GA, that's going to go out soon. So thanks a lot for listening. Here are the details, and I'm open for questions if you have any. Just one question for the sake of it: have you ever thought about some shared memory between the different pods, to eliminate the need to copy the packets over? So we thought of this, and there are different ways to do that. There is the VCL, which I haven't spoken about, which is a way of opening sockets directly in VPP. Basically, you do a listen in VPP for TCP, UDP or a given protocol, like with the socket APIs. And that supports it directly: the data never leaves VPP, and you can pass it between processes without having to copy, because everything stays in VPP in the end. For memif, we don't support that out of the box, but nothing forbids you from spawning two pods and making them share a socket. It's only shared memory, so you can do it directly without having to spin up the whole thing. You could even do that in any cluster, or directly on bare metal. Memif is really a lightweight protocol, so you can do that with just a regular socket. Okay, cool, thank you very much. |
Welcome to the Nix and NixOS devroom |
So yeah, this is quite insane if you ask me, how many people are in this very small room at this moment, but thank you all for coming. First and foremost, I have just one more slide. That's it. And then I'll hand it over to the first speaker. But of course, we are at FOSDEM, so we have to abide by their rules. We just distributed some stickers. Please don't go sticking them all over their property. Keep them for your laptops. That's what they're meant for, after all. Also be mindful of each other. This is a very well-packed room, apparently. It's going to get hot, probably. If you have to leave, please kindly ask the person next to you. And please, if somebody asks you, stand up and make some room. Just cooperate a little bit. It's going to be necessary in this situation. So with that being said, is everybody ready for an entire block of pure chaos, and fun as well, of everything to do with Nix and NixOS? Yes. Yes. Yes. Then all of it. Thank you very much. |
I am excited about NixOS, I want to tell you why! |
Those two are two half boxes, because the other half is in, you know, the vanadaire. But I guarantee those are two boxes: one for the networking and one for the computing. I bought it years ago on eBay. I think some startup went out of business, and they had many full-sized racks full of NUCs and, you know, NVIDIA boards. So I bought one and I started to play with it. So now I have ten NUCs lying around at home and five GPUs, and I don't know what to do with them. This is how everything started. So I have a single point of failure, and all the SREs and DevOps people here will hate it, but it's my single point of failure and I love it. I call it Snowflake because it's unique. It has a very minimal setup. It's the only long-living server, or NUC, if I can call it a server. I would call it a server. So it's the only long-living server I have. It runs a DHCP server, it runs PXE boot, a TFTP server, and it is the core of my home lab. It sits there and distributes the NixOS derivation that I want to netboot on the other servers when I hook them up to the network. As you know, because you are here and you are like me, I tend to automate everything I can. So this is the code that I wrote in the NixOS derivation for the Snowflake server. I start OpenSSH and I configure my keys. I didn't paste the public key there because it was too long and not great, but you know how to do it. And I start Pixiecore and configure it with a package called Blackhole, which is the distribution I use for netbooting. I'm going to get to netbooting later on. I have my dotfiles shared along with the NixOS configuration, so you can check that if you want. So my Snowflake server is used for PXE booting and netbooting, to distribute the images I want to my ephemeral labs. And Pixiecore is a tool that helps with that.
It can be used standalone. It's a Go project that you run, and it exposes a DHCP server and an HTTP server to download the kernel and the initial RAM disk. And then this is Blackhole. When you do netbooting, you need an operating system to run on the new servers, and I generate that from a NixOS derivation. It's also very lightweight, because it stays in RAM and is not persisted on disk, so it needs to be as small as it can be. So I use the netboot base profile. I changed it to minimal, but I didn't update my slides. So if the image is too big, you can use the minimal installer, which is a bit smaller. And then I install OpenSSH, because this is what I use to get into the server when it boots. But what is netbooting? From zero to hero in one minute, because that's how good I am with netbooting. Every server, and many computers, boot and look for an operating system on disk, and if there is no operating system on disk, they look for one on USB. And if there is no USB, the world becomes way bigger: they send a DHCP request to the DHCP server, saying, oh, I'm alive, what can I do? And if you have a smart enough DHCP server, it replies with a packet containing the IP address for the server and an address that can be used to download an initial RAM disk and a kernel and boot the server. This is very old-style, but still in use. All the cloud providers that I had the opportunity to work with use this, because hardware doesn't change that quickly. So PXE booting is still there, and this is the same thing I use. So it downloads those two binaries, and my server is ready to boot. For me, I take the compute modules out of my boxes, I hook them up to the network, and they power on: DHCP, kernel, and I'm ready to SSH.
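The fallback order described above (disk, then USB, then network) can be written down as a small decision function. This is only an illustrative Python sketch: the `dhcp_offer` dictionary and its `ip` and `boot_url` keys are hypothetical names of mine, not real DHCP option names, and the `/kernel` and `/initrd` paths are made up rather than what Pixiecore actually serves.

```python
def choose_boot_source(disk_has_os, usb_has_os, dhcp_offer):
    """Mimic the firmware fallback order from the talk: disk, then USB, then network."""
    if disk_has_os:
        return "disk"
    if usb_has_os:
        return "usb"
    # Only a "smart enough" DHCP server also says where to fetch a kernel from.
    if dhcp_offer and dhcp_offer.get("boot_url"):
        return "network"
    return "none"


def netboot_plan(dhcp_offer):
    """List what a netbooting machine fetches before it can start (hypothetical URL layout)."""
    base = dhcp_offer["boot_url"].rstrip("/")
    return {
        "ip": dhcp_offer["ip"],
        "kernel": base + "/kernel",
        "initrd": base + "/initrd",
    }


# A diskless machine that receives an offer pointing at a boot server.
offer = {"ip": "192.168.1.162", "boot_url": "http://192.168.1.1:8080/"}
source = choose_boot_source(disk_has_os=False, usb_has_os=False, dhcp_offer=offer)
print(source, netboot_plan(offer))
```

The same function also shows why wiping the disk sends a machine back to netbooting: with no operating system on disk or USB, the network is the only source left.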
This is the kind of log you get from Pixiecore when it detects a new server ready to go through an installation process. So the IP is assigned, 162, and it downloads the kernel and the initial RAM disk that I generated with the Blackhole Nix derivation that you just saw, and it boots. At the end of the booting process, you have an address that you can SSH into and a server to play with. Obviously, this is not persisted. So when you reboot the server, the dance starts again, and again, forever, and you get a fresh new server to use. But for my ephemeral homelab, this is what I need. I just need a quick way to prototype. So as long as you have enough RAM to be happy, and for me 8 GB of RAM is happier than ever, I can play with that. Obviously, all the NUCs can have a disk attached, and you can decide to persist the operating system on it. You can write a systemd script that does a nixos-install once the server is booted, and you don't need to touch anything: when Blackhole starts, it runs the systemd script, and you have a new persisted, freshly flashed operating system. As I said, I don't do that because I don't need it, but that's how the full life cycle of the machine can be managed. And if you persist to disk, how you reboot the server and get back to the PXE dance is all on you. In my case, the NUCs don't really have a BMC to control the server, so I can't force the boot loader to netboot. So I wipe the disk, I break the disk, and it goes back to netbooting. A bit rough, but you know, homelabs, that's how it is. So, service discovery: in my case, it's way simpler than what it should be.
You can use Consul, CoreDNS or whatever you want, Type-K even, so in Blackhole you can register a Type-K daemon and have it registered there, and you see that in the UI. I use my router. So when I see that there is a new server with a NixOS hostname, I know that there is a server that is booted and ready to be used. In the meantime, while it does the PXE dance, the hostname is pixie plus a random MAC address. So it gives me enough control and I don't need to run anything else, but you can get fancy here if you want. So what is netbooting used for? Only for ephemeral homelabs? No: as I said, cloud providers, or people managing data centers, who I think know more than me, use it too. You can do that for inventory management. When you buy a shiny new server, it doesn't do much, it doesn't know what to do, so you just hook it up to the network and the PXE dance starts, and instead of running a script that persists an operating system, you run a daemon that registers the server in an inventory management system like NetBox or others. So you know how much RAM it has, the model, whatever you can get from the system. You inspect the system, you push the blob, you register a brand new server, and you have it available in your system. So it's kind of the same process I use for mine, but way cooler because the hardware is way more expensive. You can do that for provisioning. As I told you before, I don't do provisioning because I don't want a persistent operating system on my homelab, but you can format and partition disks using systemd if you want, or any other way to run a script, and you're ready to go. For recovery, it's the same. As I said, I don't have a BMC, but you can break your disk and get back to PXE booting.
Some of the hardware has disks attached, but I don't really use the disks much because I don't need them. My operating system is small enough and I do simple stuff that doesn't require me to swap or go to disk, so I can use the entire disk: I can format it just for data, if I need to store some data. For orchestration, in Pixiecore you can declare MAC addresses and the images you want to push to them. So if you know the MAC address of your Ethernet card, you can say, oh, on this MAC address distribute this image, otherwise distribute a different image. So you can have servers running different netbooting images, and it's very convenient. Now, when it comes to electric wiring, I'm not really an expert either. What I have is a powerful enough DC power supply that is connected to a few fuses, so I don't blow up all the nodes at the same time. One at a time maybe, but not all together. That's why I have the fuses there: easy, cheap, it works. And then to control the power consumption, I just cut the power at the source. I have a Raspberry Pi connected to a board of 16 relays that I control over an API, so if I want to stop NUC number one, I just cut the power for that NUC, and goodbye NUC. Playing with the GPIO on the Raspberry Pi is very easy and convenient, so for a prototype it's good enough. I want to change from a Raspberry Pi to an ESP32 to experiment a little bit, but that's not really for NixOS, because it doesn't run an operating system. And I got two new fancy boxes that I want to replace the IKEA ones with, because I think they would play better with my, you know, ambient there. And I don't know if I'm late or early, but that's it.
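The "this MAC address gets this image, otherwise a different one" rule is essentially a table lookup with a default. The Python sketch below illustrates that logic only; it is not Pixiecore's configuration format, and the MAC addresses and the `gpu-worker` image name are invented for the example (only `blackhole` comes from the talk).

```python
# Fallback image for machines that are not listed explicitly.
DEFAULT_IMAGE = "blackhole"

# Hypothetical per-machine overrides, keyed by normalized MAC address.
IMAGES_BY_MAC = {
    "aa:bb:cc:dd:ee:01": "gpu-worker",
}


def normalize_mac(mac):
    """Lower-case the address and use ':' separators so lookups are consistent."""
    return mac.strip().lower().replace("-", ":")


def image_for(mac, table=IMAGES_BY_MAC, default=DEFAULT_IMAGE):
    """Return the netboot image for a MAC address, or the default image if it is unknown."""
    return table.get(normalize_mac(mac), default)


print(image_for("AA-BB-CC-DD-EE-01"))
print(image_for("aa:bb:cc:dd:ee:99"))
```

Normalizing the address before the lookup is a design choice of this sketch: firmware and tools print MAC addresses in different cases and with different separators, and a single canonical form avoids silent fallbacks to the default image.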
Okay, so, yeah, this is my home lab and what I'm playing with. If you have any questions or if you want to know more, my invitation is to have a look at the dotfiles repo, if I can, oh, I can go there, because it has all the code in there and you can play with it. Coming back to these slides, what I really like is the convenience. For these experiments I'm building two NixOS derivations, and the Snowflake one encapsulates the other one, so I can distribute it conveniently, and it's a single flake file that contains and runs both. So it's very easy, once you get to it. To get there, it took me 135 questions on Discourse and 36,000 messages on Matrix, but I'm there, and it's there for you, so play with it if you want. Thank you. Oh yeah, how NixOS plays with PXE booting and netbooting. There are other packages or other projects that help you package an operating system: HashiCorp Packer is one, there is InfraKit from Docker, and all that kind of stuff. But NixOS gives you full control over the distribution that you want to provide. For example, at the beginning I wasn't really looking at the size of the image, so I was pushing everything I needed in there. But then I started to realize that my image was about four gigabytes, and I didn't have enough RAM left to do the actual work. So I started to iterate and play with what I was able to remove, and to investigate how to build a minimal operating system. I used the minimal profiles, I disabled the documentation, I disabled everything that I don't need, and now the image is 400 megabytes in RAM when it's fully unpacked, and it works, so it's good enough. So I think that's really the power you get from NixOS compared
with other systems. For getting questions answered, I can also suggest the Discord server, which is also pretty great. Okay, yeah, I will look at it. Thank you. One more round of applause please. Thank you. Okay, I'll produce some more. Are we ready for the next? Great. Yes. Hi. I'm Philip. I'm here on behalf of Circle, a company that specializes in software delivery customized to the clients, you know, to what people want. This talk is about NixOS. I'm here at FOSDEM 2023, of course, and my talk is "Pitfalls of NixOS and how to avoid them". At Circle, we use NixOS for infrastructure, and we use Nix for CI/CD and for setting up dev environments. Basically, everywhere you can use Nix, we try to adopt it, just because Nix is great. You have reproducibility and everything that comes with that. Right. So this is me. All my socials are here, and my GitHub, if you want to reach out to me. Right. So Circle, as I said, is a software company focused on creating customized high-performance solutions. Yes, I can. Right. Sorry. This is mainly in the functional programming realm, Haskell and stuff like that, where people want customized solutions for certain problems. Right. So again, this talk is about Nix, and the goal of this talk is to make Nix more approachable. When we adopted Nix, I mean, I wasn't there then, developers obviously weren't using Nix. And they still, you know, they try to. So there are developers who are not that into Nix, and they had some issues, and I want to touch on that as a case study of how Nix works at Circle. Exactly. Right. Sorry.
So I am pretty new at Circle and I'm part of the SID team. So I try to push Nix too. In the often times you have developers. |
Make Anyone Use Nix
"It'll be fine"TM |
Thank you. There are too many people here, I wasn't ready for that. If they had told me, I would have finished my slides sooner. So yeah, thank you everyone for being here. It's really energizing to see so many people interested in Nix and NixOS. I work at a company called Tweag. My name is Guillaume. We love Nix at Tweag. If you don't know about us, check out our blog. We have, it's okay. So while my job at work is more data science and software in general, I'm a Nix enthusiast. I started three years ago and could not stop since then. My rabbit hole was just managing CUDA, basically. As a Nix enthusiast, I have one thing I can't stop thinking about, which is: I want everyone to use Nix. I want you to use Nix. I want you to use Nix. I want you to use Nix. So before I start, does anyone here not use Nix on a daily basis? Okay, so I want you to, okay. And to the other people in the room: guys, what are you doing? There are two people not using Nix. So as you might know, I want to be able to make anyone use Nix. But in that sentence, we have both "anyone" and "Nix". In the same sentence. Ideally, this is what the combination of these words should mean. Nix users know better. Nix is not an easy tool to use, right? In order to be completely autonomous with Nix and NixOS, you need a vast amount of knowledge to muddle your way through all kinds of problems before you get anything to run properly. However, I am a believer. I think we can, in fact, not have this, but reach this wonderful world where everyone can use it. So let me tell you a story that happened in a company that I might have heard about. In this company, we have Emma. Say hello to Emma. I didn't expect people to actually reply to that. I'm delighted that you are. Thank you. I feel less lonely here. The company being rather small, she has to handle all HR by herself, because she's the HR, right?
So she manages the employees' contracts, as any HR would, and it turns out that even though the company is small, she has processed a lot of contracts. I mean, really a lot. Oops. I spent too much time on this animation to let it go to waste. Sorry. So she has been using a piece of software for quite some time to process them. You might recognize the logo. And it has been going fine. But someday, a task comes to her that is so huge it's stupid. We all know this kind of stupid task, right? Let's say, for instance, that she has to go through all the contracts and find all the employees that have some specific terms. She now has to spend hours and hours and days going through each contract, reading, looking for the terms, et cetera, et cetera. And it is boring. It is repetitive. If only there was a way to do it without doing it manually. And as she's starting to drown in her own despair, suddenly someone appears in her Slack who says, I think I can be of help. It's an engineer, she says. Little does she know that it's not just an engineer. Anyway, our hero knows that she uses a SaaS to process the contracts, and the SaaS has an API. So it can be automated. So he says, I can do it in minutes. Is he a spy? He puts on his hacking gloves and starts typing furiously, and Emma is in awe. He says, give me access to your account, and I will get the data on my machine, I will process it for you, and I will pass it to you. But then Emma looks at him in fear. It's forbidden by law. Indeed, we now have to face a terrible foe. You should not have access to this data, or we'll be in trouble. They were this close, this close to solving the problem. But our hero knows that it ain't easy to run a custom script on a novice's computer, right? You would have to go through many challenges: making someone use a terminal, installing dependencies, sending the code and, most of all, fixing it, because it never works on the first try.
So if only there was a simple way to run a program on someone's computer without such hassle. If only there was. But fortunately for us, there is Nix. A single flat road for the user. So step one, install Nix. Step two, give a one-liner to run a program. Actually, this one. First one, you got the joke. But it's beautiful. And as you can imagine, this is sort of a true story. There really was an HR. There really was a company. And I can say it out loud: I got HR to use Nix. The comment I just put you. Yeah, I really do consider it an achievement. I put it in my titles. If only it could be displayed over my head. So when I submitted my talk, I put a subtitle, and you've seen it on the first slide: it will be fine. When I wrote it months ago, it was supposed to be sarcastic, ironic. I was supposed to roast Nix. I'm in a toxic relationship, as you can see. But when I thought about this one story, I realized that in fact, it did work. And that was what was incredible at the time. I was just writing a script, a Python script in five. And shipping Python, I've done it for some other people for the same reason, privacy. And it's always a nightmare, because people really want it to work on their machine and you can't touch it and stuff like that. So this illustrates why I love Nix. It just works. Machines are capricious. But Nix makes like a highway for us to make things run on them. It paves the way for software to not just work on my machine, but on any machine. And I think this is what makes Nix have such an impact. And if Nix can just do that, it's already quite enough, right? So is it sometimes a pain to write packages, to have to debug them, because you're using a recursive function that in the end doesn't work and you really don't get a good error message? Yes, sure, it is a pain to use. But at the end of the day, if it's the only way we have to do what I just did, then isn't it worth knowing?
So my final word, to all Nixers here: let's not make other people use Nix because it's cool tech. Let's do it because it makes someone else's life cooler. Thank you. So I have some minutes left. It was just a silly story to get you to breathe a bit between the interesting parts. But if you have any questions... oh, we do have questions. So the question is, in which company does HR not use Windows, which is a great, tricky question; I evaded that fact on purpose. Yeah, it was on macOS, as you could have seen. Some companies, I guess. Yeah? Yeah? Sorry? So the question is, why do I use Windows? Because I can run NixOS on it anyway. And yeah, some software just doesn't work yet. If you can make Ableton work on NixOS, I'm taking it. Still questions. How many minutes left? Yeah? That's a very good question. And this is, sorry? Oh, yeah. So the question is, why didn't I just build a Docker image, basically, to run it? This is what I did a few years ago when I had to ship Python. And it turns out that Docker was a pain to get working on someone's computer because there are too many buttons. Like getting Docker Desktop running on the machine of someone who doesn't know or want to know how Docker works. Sometimes you have to turn it on, you have to wait for it; it's too complex to use in this situation. You can argue against that, but in my case, it turned out to be so, and I wanted to use Nix, to be fair. In your last problem statement, you said that you need to teach the HR person the command line, a package manager, and how to execute the script. Now, Nix kind of solves three of those things. You still need the command line. Yes. Yeah. So the question is, I'm sorry, I just can't get that in my head. Thank you. So the question is, so yeah, I said there were different parts of the problem to solve, like installing dependencies and moving the scripts and getting people to use the command line to run the script, et cetera.
But there's still the fact that we need to use the terminal. I bypassed that by giving them things to just copy and paste, basically. And then the terminal on Mac isn't that hard to find, actually. It wasn't the hardest part. The hardest part was getting the API key from the SaaS, because you have to make people click through the UI of the SaaS. Yeah, I think that's about it. Thank you, everyone. |
Nixel: a nicer way to write your Nix expressions |
Thank you. Hello, FOSDEM. I'm Yann Hamdaoui. I'm working at Tweag on a project called Nickel, and Nickel is a configuration language. Is it okay? And so in this talk, I want to talk about Nixel, which is a framework to use Nickel as an alternative front-end language for Nix. My dear friend and colleague Guillaume, who just got off stage, has a strange morning routine where he stands in front of me and says, I have a question. When? When can I use Nickel for Nix? And so my primary motivation is just to be able to enjoy my morning coffee in peace. And my second motivation is to try to get you as excited as Guillaume is about Nickel, or at least 10% as excited, which would already be quite something, because he's a very enthusiastic man. So besides maybe a few people who got lost, because FOSDEM is huge, I think we all agree in this room that Nix is a powerful tool. There are a bunch of things that only Nix is capable of. My personal favorites are dev shells. You need to hack on a project, you just enter the directory, type nix develop, you have all the tools you need, then you exit the directory and everything is back to normal. NixOS, we talked about that: being able to manage your whole configuration in dotfiles, rollbacks, competing versions of the same package; it's pretty nice. And I guess each one of you has their own usage of Nix, for embedded or whatever, either personal or professional. And if that's the case, one of the main interfaces you have with Nix is the language, Nix expressions. And in fact, it's a pretty simple language. It's mostly JSON plus functions plus a few string things. And paradoxically, while the language is simple, I find it quite hard to use in practice (yes, it's "to" with only one O), for a bunch of reasons, at least for Nix. And one of the main issues is error reporting.
And I think it's a pretty fundamental problem in a language which is dynamically typed and lazy: when you make a mistake, your little mistake travels all around the code base. And only when something consumes your value does everything blow up. And the error usually points deep inside Nix code, because that's what is consuming your value. And I would like those errors to point at the place where I made the mistake originally. My favorite one is infinite recursion in the module system. I was a newbie at Nix. I tried to move my NixOS config to flakes. I made a typo in an argument to a simple module and I got an infinite recursion, but nothing was recursive. I didn't know what was happening at all. Something can also be said about discoverability, in particular when you're writing code. So I'm writing some Nix code. I would like to know: what are the standard library functions that are available? What are the list functions from Nixpkgs that I can use? What are their types? What arguments should I put there? I'm writing a flake. What is the schema of a flake? Could I have some completion, or at least some in-code information, to know what fields or attributes I'm supposed to fill in? And the last point is that Nix is simple, and usually that's a good thing in language design. You build a rock-solid core and then the rest can be done as library functions. But Nix is not a general-purpose language. It's a domain-specific language. And if users of your domain find themselves having to solve the same problem again and again and again, then maybe the domain-specific language should provide a native way to solve this problem. And one example is overriding. Something that you do a lot in Nix is taking a module or a configuration or whatever object, tweaking a parameter, and getting the new result with all the dependencies updated and so on. And it's pretty non-trivial to do in Nix.
There are a lot of different ways, a lot of different abstractions implemented by different people, and that makes for a hard experience in my opinion, especially as a newbie. And it's not just me saying that. It's actually Eelco, the creator of Nix, who wrote a gist a long time ago, which is partly one of the origins of Nickel, about the deficiencies of the Nix language. And one thing he says is that Nix is a DSL for package and configuration management, but it doesn't have any notion of package or configuration. So to recap, one of the main things is developer experience in general. Error reporting is one of the main interfaces with the language when something goes wrong. It's important. There is something to be said about Nix being too simple somehow, or too bare-bones for its own good. And so people reinvent the wheel in a lot of different ways. And I mean, sometimes it's fine to have competing libraries and so on. But for fundamental things, it's like when you want to put something in the standard library of a language: there should be only one way to do it, and it should be efficient and so on. So what can we do about it? Well, I propose to use Nickel. Nickel is a general-purpose domain-specific language, if that makes sense, for configuration. And what does Nickel have? It has sound gradual typing, it has opt-in static typing with higher-rank polymorphism, structural typing with row polymorphism, contracts... No. I mean, yes, in fact, but that's not the point. The point is, those are means to an end. And the end is that your practical experience is nice. So here is a little video demo. On the right, we have something called a contract. It's like NixOS types, something that is checked at run time by the Nickel interpreter. And you write it pretty much like a type or a schema. You say, oh, Nickel derivation, this is taken from Nixel. This contract defined by Nixel should have a name, a version, dependencies, system, and so on.
You can attach other contracts and metadata in general to those fields. You can say, oh, name must be a string, version must be a string. Dependencies should be an array of derivations. Derivation is another contract that you'll define somewhere. You can attach default values. Dependencies are empty by default. You can say that a field is optional, for example, because I think Nix doesn't strictly require, for a built-in derivation, that version is defined. And the thing is that all this metadata can be leveraged by the tooling. On the left, we are trying to write something looking like a derivation. The details don't matter at this point. But we define an output field. Field is just the Nickel name for attribute. And we apply this contract. We just import it. And let's see how it turns out. It turns out that we get completion for what we should put inside this output. Like name: okay, we have documentation. We have the type. Actually, the type is a string, named whatever the contract says. We get completion for build command. And for nested records, like, oh, what should I put in that build command? You can also leverage this information not only from the LSP, but from the CLI. Oh, no, sorry, I forgot. You get completion for the standard library and actually any library. Those functions are statically typed, but that's another subject. You can leverage this information from the CLI. Using nickel query, you can say, oh, what's inside a contract? What is the field Nickel derivation? You get documentation. And what are the available fields? You can say, oh, okay, what is build command in particular? I get documentation and fields. Now, what happens if I make a silly mistake, and build command, which should be a record of strings, I just make a string instead of an attribute set? And I try to run Nickel on that. What I get is an error message. The first blue part says what the contract is that I just broke.
You should have a record with args and so on. The second part points at where I defined the value. Now, it's in red. And it says, oh, this is wrong. This doesn't respect the contract. The third part is not really useful here, but it's giving you the evaluated value, which means that if build command was a complex expression built out of map and fold, you still get the final value that it builds. And this green thing here is telling who the hell is imposing this contract. It points to the build command field inside the Nickel derivation contract inside the Nixel library. And so this is just runtime validation. You could do it with libraries. Nickel does it. But first, I think this kind of nice structural syntax for it, as well as this advanced error reporting, is really hard if not impossible to achieve purely in library code. Because there are special things in the Nickel interpreter to handle contract application and track arguments and the stack and so on. So what Nickel is about is relevant, thorough, and early error reporting as much as possible. Discoverability: you can attach all this meta-information to fields, and it is understood by the tooling, and in particular by the LSP, giving you an interactive development process. And in the end, arguably, the language is more sophisticated than Nix, but as a user, I find it easier. Okay, great. Now I just have to migrate 80,000 packages. Not a big deal. Nope, not going to happen. Nixpkgs is a huge behemoth. It's probably the most important thing in Nix. I mean, the value of Nix is all this domain knowledge on how to build packages, encoded in code that can be acted on by the machine. And it's not going anywhere. So whatever we do, if we want to use an alternative front end, we have to be able to use Nixpkgs. So meet your first Nickel derivation. It's a dev shell. And from a distance, I want you to notice that there is no function at the top.
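The runtime contract validation described above (a record schema whose fields carry a check, documentation, a default, or an "optional" marker, and whose errors name both the offending value and the contract field that rejected it) can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: `Field`, `apply_contract`, and the `NICKEL_DERIVATION` schema are hypothetical names invented for this sketch, not Nickel or Nixel APIs, and real Nickel contracts are lazy and tracked by the interpreter rather than checked eagerly like this.

```python
import copy
from dataclasses import dataclass
from typing import Any, Callable

_MISSING = object()  # sentinel meaning "no default value was given"


@dataclass
class Field:
    """Metadata attached to one field of a record contract (hypothetical model)."""
    check: Callable[[Any], bool]
    doc: str = ""
    default: Any = _MISSING
    optional: bool = False


class ContractError(Exception):
    """Raised when a value breaks a contract."""


def is_str(value: Any) -> bool:
    return isinstance(value, str)


def is_record_of_str(value: Any) -> bool:
    return isinstance(value, dict) and all(isinstance(v, str) for v in value.values())


# Hypothetical schema, loosely modelled on the fields named in the talk.
NICKEL_DERIVATION = {
    "name": Field(is_str, "package name"),
    "version": Field(is_str, "package version", optional=True),
    "dependencies": Field(lambda v: isinstance(v, list), "array of derivations", default=[]),
    "build_command": Field(is_record_of_str, "record of strings"),
}


def apply_contract(schema: dict, value: dict, contract_name: str) -> dict:
    """Check `value` against `schema`, fill in defaults, and return the result."""
    result = dict(value)
    for key, field in schema.items():
        if key not in result:
            if field.default is not _MISSING:
                # Copy so that callers never share one mutable default.
                result[key] = copy.deepcopy(field.default)
            elif not field.optional:
                raise ContractError(f"missing field required by {contract_name}.{key}")
            continue
        if not field.check(result[key]):
            # Report which contract field was broken and the offending value.
            raise ContractError(
                f"contract {contract_name}.{key} broken "
                f"(expected {field.doc}): got {result[key]!r}"
            )
    return result
```

Applying the schema to `{"name": "hello", "build_command": "bash"}` raises a `ContractError` that names `NickelDerivation.build_command` and shows the string that was supplied, which mirrors the "what contract did I break, and where does it come from" structure of the error message in the demo.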
So usually, in Nixpkgs, the first thing you do when you define a package is to define a function. This has a number of problems. One being that before doing anything with it, like getting the name or the version, you have to apply it to some arguments. And these arguments may be packages, so you need to apply them too, and so on. So you need to evaluate the transitive dependencies before doing anything. Here it is just a flat record. Records are recursive by default in Nickel. So on line one, we import something called builders. We'll see later what it is; builders is provided by Nixel. And on lines four to six, for now, our API is that you need to declare, a bit like in a flake but at the level of the derivation, what you are going to take from the Nix world. So here I say I want to take curl from Nixpkgs. On lines nine to 12, I'm defining the actual derivation, so to speak, even if it seems way smaller. So I give a name. And then I put this input.curl in the PATH. This funny-looking string, which I won't have time to detail, is called a symbolic string; it is a way to simulate Nix string contexts. Not only that, actually, it's a pretty generic mechanism, but it gives the same niceties: input.curl is not actually a string, it's a derivation which has store paths and so on. You do that in Nix, and it's not trivial to do in a different language. But yeah, Nickel has a solution to that. And we are using this input.curl, but we haven't seen any inputs yet. The other one was called input spec. So input is defined, well, not really, on line seven. This is just a field without a definition. And in Nickel, the idea is that records are recursive, and we have something called the merge operation, which is the ampersand. It's a bit like the slash-slash of Nix, in that it combines records, but it doesn't give priority to the left-hand side or the right-hand side. It just tries to combine them, and if there is a conflict, you have to use priorities.
A bit like the NixOS module system. But it does what you naively expect when you start with Nix: it works on nested records, and it works on recursive records. That is, if you override something there, everything that depends on it recursively will be automatically overridden. So what we do on line seven is a bit like defining a function argument. We're just doing functions in a different way, so to speak, but in a way that is much nicer for Nix, because it just looks like configuration. Overriding is trivial. I just add one line, and I merge with something that redefines the value. Combining things is trivial. So for example, on lines 14 and 15, I use some predefined builders, which mostly look like this derivation, and that adds a Rust development environment and a C development environment to my shell. So I will end up with a shell that has curl, the whole Rust toolchain, and C. I won't have time to delve into the details too much, but it's a bit convoluted right now. We have a lot of back and forth between Nix and Nickel. Nix is the driver. What's important is that this part will get improved, but somehow it's a bit orthogonal to all the design on the Nickel side: what the API looks like, what the builders are, how we do overriding. It's orthogonal. This part is just how we communicate with Nixpkgs. Right now, Nix is leading, and nothing that crosses the boundary can be a function. It has to be data. So in practice, it's JSON, and so you have a bit of back and forth, like: what's your input? Oh, I will extract that from Nixpkgs. I give you the derivation as JSON. Nickel has almost everything to build a derivation, but it cannot build it, so it kind of gives the arguments back to Nix saying, please, can you call derivation for me? But that works for now, at least. That's something.
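The merge behaviour described above (symmetric, working on nested records, failing on a genuine conflict unless a priority says which side wins, and letting dependent fields pick up an overridden value) can be modelled with a small Python sketch. Everything here is an assumption made for illustration: `Default`, `merge`, and `resolve` are hypothetical helpers, computed fields are modelled as plain callables over their sibling fields, and Nickel's real merge operator has richer priorities and laziness than this.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Default:
    """Marks a value as having default (lowest) priority in a merge."""
    value: Any


class MergeConflict(Exception):
    """Raised when two definitions of equal priority disagree."""


def merge(left: dict, right: dict, path: tuple = ()) -> dict:
    """Combine two records without favouring either side."""
    out = dict(left)
    for key, rval in right.items():
        if key not in out:
            out[key] = rval
            continue
        lval = out[key]
        here = path + (key,)
        if isinstance(lval, dict) and isinstance(rval, dict):
            out[key] = merge(lval, rval, here)  # nested records merge recursively
        elif isinstance(lval, Default) and not isinstance(rval, Default):
            out[key] = rval                     # a plain definition beats a default
        elif isinstance(rval, Default) and not isinstance(lval, Default):
            pass                                # keep the plain definition on the left
        elif lval == rval:
            pass                                # identical definitions are compatible
        else:
            raise MergeConflict("conflicting definitions for " + ".".join(here))
    return out


def resolve(record: dict) -> dict:
    """Unwrap defaults, then compute callable fields from their sibling fields."""
    plain = {}
    for key, val in record.items():
        if isinstance(val, Default):
            val = val.value
        if isinstance(val, dict):
            val = resolve(val)
        plain[key] = val
    static = {k: v for k, v in plain.items() if not callable(v)}
    return {k: (v(static) if callable(v) else v) for k, v in plain.items()}


# A record whose `full_name` depends on other fields, as recursive records do.
base = {
    "name": "hello",
    "version": Default("1.0"),
    "full_name": lambda r: f"{r['name']}-{r['version']}",
}
overridden = resolve(merge(base, {"version": "2.0"}))
```

Here `overridden["full_name"]` is recomputed from the overriding version because the dependent field is only evaluated after the merge, and swapping the operands of `merge` gives the same result, which is the property that distinguishes this operation from a left- or right-biased update.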
So the limitation of this model is that you have a lot of back and forth, and the error messages at the boundary are pretty bad. If you try to import a package that doesn't exist in Nixpkgs, that's going to be ugly. And you can't override a Nix package from within Nickel. That's kind of an important limitation, because the only thing you can get is data. You can do it on the Nix side in the overarching flake, but it kind of defeats the purpose. We'd like to be able to do that from Nickel. The roadmap to solve that is to be able to import and evaluate Nix expressions directly in Nickel. It's actually not that unreasonable because Nix is simple and close to being a subset of Nickel. So we're already able to transpile most of Nix as far as the language is concerned, but we are missing all the builtins, derivation and things like that, to make it work, and I think that's the hardest part, actually. Yeah, having a Nickel built-in to build derivations would probably piggyback on Nix, but at least we wouldn't have to do the last back and forth. We have three minutes left including questions. Okay, we'll go quick. And so once we can do all those things, Nickel becomes the driver and you don't have to go through this back and forth. You can override from Nickel. For Nix, what does it mean? I hope it means an improved developer experience. A unified approach to configuration. This looks more like configuration than Nix does, I find: you just define a bunch of fields and you merge things together. And a smoother learning curve for newcomers. I didn't cover performance, but having this merging be native and not a library function also leaves more room for optimizations. And beyond that, my secret dream is that Nickel is general-purpose for configuration. So you could use the same language with the same notions, same contracts, same data model for all of your stack, Terraform, Kubernetes, Nix, and exchange things between the layers.
And something we are working on is incremental evaluation. It's a bit like incremental builds, but at the level of evaluation: I have this huge Nix-based configuration, I change one option, and I want the interpreter to only propagate the change to what actually needs to be recomputed. So to answer the initial question, Nickel for Nix, when? Well, now; you can already do this stuff. Well, next week, because we haven't merged everything. But Nixel will be releasing its 0.1. You can do derivations and basic dev shells. Nickel itself will be reaching 1.0 in the coming months. And it's still rough around the edges. You can't do everything you would like to do in Nix. But the point is that I think we did the hardest part. Arriving at the first derivation was really complex. And now everything is aligned. And somehow we just have to build on that, polish the API and so on. And there is the same for Terraform with Nickel. Thank you. Before that, I would like to know if Rodrigo and Raul are in the room. Okay, you're here. Okay, great. We have one question I think we can take, maybe two. Yes. So did I understand right that you are transpiling between Nix and Nickel? So not yet. But that's okay. Sorry, did you finish your question or? Yes. So I'm asked if we are transpiling Nix to Nickel. Right now, no, we are doing this back and forth with Nix and JSON and so on. But that's the plan in the end: to transpile Nix. You could import foo.nix from Nickel and that would give you a Nickel value, and then you can do whatever you want with it. Yes. Yes, so you mentioned. Yeah. So I've used Terraform Nix before. Could you just use that? Yeah, yeah, it's true. I guess you could. But probably the idea here is to reuse the Nickel overriding mechanism, which is, we hope, way simpler. Somehow you don't have to do anything to get your things to be easily overridable.
So I think Terraform Nix is not using a module-like system. Yeah. And there is Terranix, maybe, that is doing that. I don't remember. Yes. But yeah, you could do that. Actually, you could wrap any Nix package with a Nickel interface, somehow like an FFI, or you could redo it a bit to have a more... both are possible, I guess. One more round of applause, please. |
Playing with Nix in adverse HPC environments |
I hope you can hear me online; if not, complain in the chat. We are going to talk about HPC, high-performance computing, and Nix, and how we deal with that. My name is Rodrigo and this is my co-worker Raul. A bit about what we do: essentially we work on a parallel, concurrent, task-based runtime, similar to OpenMP, if you are familiar. We also need to work with a compiler based on LLVM to read the pragmas in the code and transform them into function calls to the runtime. In our job performance is critical, so we really need to take care, and in general we execute workloads on several hundreds or even thousands of CPUs. Here's a little example of something that we have observed. We have a program here that runs, and here you can see the CPUs, and this is the time of execution, and we are examining this little point here, because the time here is slightly longer than what is normal, and we can see that the problem is that the allocator took a bit longer. So this is just an example. In general, in HPC, a cluster is just a lot of machines connected by fiber optics. They are managed by the Slurm daemon, which allows you to request a certain number of nodes. We don't have root on any node, and in general they run very old kernels and a very old software stack, so yeah, we are stuck with that. The state of the art now is to use LD_LIBRARY_PATH to load other software and change versions. The problem with this technique is that it is not very easy to reproduce. So the question is, can we benefit from using Nix? In general, we would get up-to-date packages, configuration options for every package, no more LD_LIBRARY_PATH, and we can track everything that we use for an experiment. The problem is we don't have root, so we cannot install the Nix daemon as we would like to. So let's take a closer look at what we do and how we do it.
In general, we work in these three hats, so to say. On the development side, we take a program and we compile it several times until it actually compiles, and we need to do this cycle quickly, so we want the compilation time to be very low, so we need to reuse the already built tree when we rerun the build command. When we are finished, we switch to the experimentation side, and we run this program on the machine, and we may need to tinker with the arguments or the configuration file of the program to get the results that we want to examine. And then we also do some visualization of the results, but we are not going to cover that in this talk. So we will focus first on experimentation and later on development. So a bit of what we did. We tried individual installations of the Nix store by using user namespaces. The problem is that the number of packages grows, so we would like to share the store among several users. So we use an auxiliary machine where we actually have a Nix daemon, and we perform the build on that machine, and then use the post-build-hook to execute a script that copies the output derivation to the actual cluster. The problem is that inside the cluster, nix-store doesn't work, so we wrap the nix-store command in a shell script. When it's invoked by the auxiliary machine, it creates a namespace where it mounts the Nix store, and then it runs nix-store and receives the derivation, so we can actually copy it over SSH. We also tried to patch the Nix daemon to run inside the machine, but it's a bit complicated because we cannot even run a user daemon there. Okay, so let's focus on the experimentation cycle. The first requirement, the most important thing, assuming that you already have a program that you somehow built in a sandbox, is that we want to execute this program on the machine, and we want to make sure that this program doesn't load anything that is outside the Nix store.
In particular, LD_LIBRARY_PATH may have some path that actually has libraries for your program, so we don't want that, and the program may also use dlopen to load other libraries. So ideally we want something like the nix build sandbox, which prevents accesses to /usr or /opt, and it needs to work under Slurm too. Another requirement is for MPI, the communication mechanism, to use the process_vm_readv syscall, which only works if the processes are inside the same namespace. So we solve this by running a check that tests whether the namespace is already created; if so, we enter it, otherwise we create another one. So let's take an overview of how this works in the cluster. We have here the login node and two compute nodes that were given to us for running our program. In general, we have to wait a bit after requesting the nodes; that is fine. After this moment, we get a shell that is connected to one of the allocated nodes. These are the nodes, and each node in our case has two sockets. So we usually run one process per socket, and we talk to only one of them. Inside this process, we don't have Nix. So we first load the namespace by using our script, and then we can run other programs like srun, which is the client that will launch the workload, but which is inside the Nix store. So we can compile programs linked to this specific version of Slurm. After that, it requests the Slurm daemon to execute something in parallel, and the Slurm daemon forks, for every process, one process that will run something, but it's outside the namespace because it's not controlled by us. So we execute our script again to join the namespace if it's found, or otherwise we create another one, like in the second compute node. And here we can see that we can communicate within the same node because both processes are in the same namespace, and between nodes one and two we use fiber optic communication.
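The "join the namespace if it already exists on this node, otherwise create it" check described above is essentially a race-safe create-or-join decision. The Python sketch below models only that decision. It does not touch real Linux namespaces, and the marker-file mechanism, the `join_or_create` and `simulate` names, and the use of a shared directory are all assumptions made for illustration; the talk does not say how the actual script detects an existing namespace.

```python
import os
import tempfile


def join_or_create(marker_dir: str, node: str, pid: int) -> tuple:
    """Decide whether process `pid` on `node` creates the namespace or joins it.

    The first caller for a node atomically publishes a marker file holding its
    pid; later callers find the marker and join the owner's namespace instead.
    """
    marker = os.path.join(marker_dir, f"{node}.ns")
    tmp = f"{marker}.{pid}.tmp"
    # Write the pid to a private file first so the marker is never seen half-written.
    with open(tmp, "w") as handle:
        handle.write(str(pid))
    try:
        os.link(tmp, marker)  # atomic: raises FileExistsError if the marker exists
        return ("created", pid)
    except FileExistsError:
        with open(marker) as handle:
            return ("joined", int(handle.read()))
    finally:
        os.unlink(tmp)


def simulate(launches: list) -> list:
    """Run the decision for a sequence of (node, pid) launches in a scratch directory."""
    with tempfile.TemporaryDirectory() as marker_dir:
        return [join_or_create(marker_dir, node, pid) for node, pid in launches]
```

For example, `simulate([("node1", 100), ("node1", 101), ("node2", 200)])` has the second process on `node1` join the namespace owned by pid 100, while the first process on `node2` creates a new one, which matches the two-node picture in the talk. In a real script the "created" branch would set up the user namespace and mount the Nix store, and the "joined" branch would enter the owner's namespace.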
Another requirement is that we need custom packages. We do that with this technique where we define a callPackage function that gives priority to our attribute set. So we can change software that is provided upstream in Nixpkgs, and our version is used first, so we can hack on those packages without disturbing the whole package set. Another thing that we need is to define packages with custom compilers. In general, we use LLVM with a custom runtime, so we use wrapCCWith and inject this little environment variable so we can load our runtime without needing to recompile the compiler. We also need, unfortunately, proprietary compilers, and we use rpmextract and the autoPatchelfHook to fix the headers so we can run them on Nix too and compile derivations with them. Okay, let's move on to the development cycle. In general, the development process consists of getting an application, adding some new features to it, breaking things, testing and retesting, okay? And this interactive workflow requires frequent changes to the source and frequent compilation steps. For this reason, nix build is not good to work with, because every change in the source triggers a full copy of the source to the Nix store and a full compilation. With big repositories, this is a problem. For example, in the slide we can see how much time it takes to build LLVM on a 32-core machine with hyperthreading, so it's a big machine. And we can see that even though we use ccache, there are orders of magnitude of difference compared with simply reusing the previous build. Another alternative could be using nix-shell to get the tools to build the application, but this environment is not isolated from the system. And we can find software that includes hard-coded paths pointing directly to the system, like in this case with a CMake module file of ROCm, which is like CUDA for AMD, for those who don't know what it is.
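The custom-package technique mentioned earlier in this part of the talk, a callPackage whose lookups prefer the group's own attribute set over the upstream one, can be illustrated with a short Python analogy. This is a sketch of the idea only, not Nix code: `make_call_package`, the `ours` and `upstream` dictionaries, and the `my_app` builder are hypothetical names, and package values are plain strings here.

```python
import inspect
from collections import ChainMap


def make_call_package(ours: dict, upstream: dict):
    """Build a call_package that fills a function's arguments by name,
    looking in `ours` first and falling back to `upstream`."""
    scope = ChainMap(ours, upstream)  # earlier maps take priority on lookup

    def call_package(fn, **overrides):
        args = {}
        for name, param in inspect.signature(fn).parameters.items():
            if name in overrides:
                continue  # explicit overrides are passed through below
            if name in scope:
                args[name] = scope[name]
            elif param.default is inspect.Parameter.empty:
                raise KeyError(f"no package named {name!r} in scope")
        return fn(**args, **overrides)

    return call_package


# Hypothetical package sets: our LLVM replaces the upstream one, MPI does not.
upstream = {"llvm": "llvm-upstream", "mpi": "mpi-upstream"}
ours = {"llvm": "llvm-with-custom-runtime"}
call_package = make_call_package(ours, upstream)


def my_app(llvm, mpi):
    """Stand-in for a package definition that declares its dependencies by name."""
    return f"app built with {llvm} and {mpi}"
```

Calling `call_package(my_app)` picks the custom LLVM and the upstream MPI, and neither dictionary is modified, which is the "hack on our packages without disturbing the whole package set" property described in the talk.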
And if we take an application that uses ROCm, configure it, and check the log output, we can see that, in the end, the installation selected is the system one instead of the Nix package we want. An isolated environment protects us from this situation and avoids the need to patch the source to solve this problem. Our solution for these two requirements is to first build an isolated environment with a tool we named nixwrap. Nixwrap is a script that uses bubblewrap to enter a user namespace where the Nix store is available, but not the system directories, like /usr in this case. And in this environment, we can launch our Nix tools, for example nix build. This works because inside the namespace, nix build creates a new sandbox in a nested namespace, so the environment is not affected. And the most powerful feature is running nix-shell inside this isolated environment to get the tools to build your application, so you don't have to worry about accesses to the system. And in this case, it's the previous example, LLVM, reusing the build. And finally, if you are using Slurm, you can execute your application by running nixwrap between the process forked by the Slurm step and your application. Another requirement for us is that, since we are in an HPC environment, we want to get the best performance out of the applications, and for this reason we need to build performance-critical software with CPU optimization flags. Our solution for this is to override the flags injected by the compiler wrapper by overriding the hostPlatform attribute, specifying the architecture and other things to the compiler, in our case GCC, and finally we create the standard environment that we will use to build our software with this compiler wrapper. I will talk about conclusions. In general, we can actually benefit from using Nix, but obviously we have some drawbacks.
The cycles that I was talking about, we can still do them very fast, so yeah, it's very nice for us. And also, if we had the chance to get something like a Nix daemon without the root requirement and still be able to share the Nix store, that would be awesome. Thank you very much. We have five minutes left for questions, if there are questions. Yeah, so the question is how we manage the Nix store, where users can install things, and whether space can be an issue. In general, right now we have about 300 gigabytes of storage for our particular group. We have around 2,000 to 3,000 gigabytes of space available. In general, in HPC, people use a lot of space, but sharing the store would be the best solution, instead of every user having their own installation. And also, when someone says to us, please use less space, we run the garbage collector. Thank you. Yeah. Did you consider using, or can Nix use, RPATH instead of RUNPATH? Because that would get rid of your problems there. And the other thing you can do is, there's a talk in the HPC room about binding paths to .so files. Okay. But that's a little bigger hammer. Okay. I see your Spack T-shirt from here. Okay, so the question is about using RPATH. RPATH, yeah, because it takes precedence over all the library paths, and you don't have to worry about the user being stupid. Yeah. So the problem is that you can see programs using dlopen to load their own libraries, so they don't, sorry? dlopen respects RPATH. Ah. Okay. I didn't know that. We're doing something to be okay with that. So. You can, like, if you define a different namespace for dlopen, it won't respect that, but most programs are not using that. Okay. So dlopen is not the only problem, because we also see software trying to read a configuration file somewhere under /etc, and we also want to prevent that. Yeah.
In general, we saw that it's safer to prevent programs from accessing any system path than to try to find every single mechanism a program can use to do so. Okay. There was still one eager question. Can we find the nixwrap script online? Yeah. I think I will upload it to the FOSDEM page. Any other questions? Yeah. Not so much a question, but a bit of a shameless plug. The main blocker for having a rootless Nix daemon was merged last week or the week before. So hopefully that's going to eventually solve the third of your points. Perfect. Thank you. So what about libraries on the system? Are you envisioning that you would only install libraries like MPI through Nix? Okay. Because that's not possible on some systems. Yeah. It's a very good question. For now, we have been very lucky to only have to work with proprietary packages that can be put inside Nix. But it may happen that something proprietary doesn't work. So we don't have a solution for now. Thank you. Just switch it over again. |
Contracts for free! |
Okay, please welcome our next speaker, Ivan, to talk about contracts for free. Hi everyone. It's a really quick talk about the Akitoi project and some unpopular opinions about using runtime types. So the thing is, there is an old issue in Nix, maybe you know it, which has since been closed because it will never be implemented, and it's a bit embarrassing: it's about the fact that Nix doesn't have any static type system. I changed my mind relatively recently about that, but I will talk about it later. First, what's the issue with the lack of a static type system? It's the errors: you get them at the last moment. Where they show up is really inconsistent with where the mistakes were made, and when you read your stack trace, you have a hard time finding what the actual error was. That is really not helpful for, for example, beginners trying to work on Nix expressions, or people trying to debug Nix expressions written by another person. So I changed my mind because Nix is a really simple language, and it should be a simple language, and it invites us to build constructs in library space. If you look at the Nix builtins, for example, there are really few builtins, and a lot of things that could be builtins are actually implemented in the Nixpkgs lib. And there is the Nixpkgs lib.types, which is what you use when you define NixOS options to say the option could be this or that and so on. I also changed my mind because Nix works in two steps, you know. The first step is that you evaluate the language and produce derivations in the Nix store, and afterwards you build, you make your package. But the instantiate part is guaranteed to terminate, you know, the computation will always terminate, which is what you expect from a type system actually, and which is not the case for some type systems or most macro systems, which are not expected to terminate. So actually it's a good
guarantee, and that's why we can do a lot of things in Nix. You can see it this way: I changed my mind because I see that Nix is not really about runtime like a programming language, so we could do the checks at runtime. I wrote a thing while on vacation and realized later that other folks had already done it, namely tazjin, maybe you know him, he works on Tvix right now, and he also wrote the Nix one-pager, one of the best Nix starting guides at some point. And the thing is, it's really simple: you define the data you want just as a validator function. So you say, I want something like that, and you check, at this point of your Nix expression, that what it evaluates to respects a contract, a validator function. And that actually helps it fail early, where you want it to fail. There are differences between the implementations that I will not really talk about, but mine simply doesn't rely on Nixpkgs, and it's fully compatible with NixOS option types, so you can define a thing. And the last thing is, these compose. Going back to the slide, you can define types, you can define a validator, a struct, which becomes a validator, and you can compose really complex things, because it's really expressive. And the other nice thing with runtime types is that you can produce recoverable errors, which is not the case for the type errors that exist in Nix itself. So for example, this will be unrecoverable, but if you use the runtime type, you can recover from this situation.
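[Editor's illustration] The validator idea described in this talk can be sketched outside Nix. What follows is a minimal Python analogue, not the speaker's library and not Nix code: the names `ContractError`, `is_type`, `list_of`, `struct`, and `check` are hypothetical and chosen only for illustration. It assumes a contract is simply a predicate function, shows how small validators compose into a struct validator, and shows a failure raised at the point of the check that the caller can recover from.

```python
from typing import Any, Callable, Dict

# A validator is just a predicate: value -> bool. (All names here are hypothetical.)
Validator = Callable[[Any], bool]


class ContractError(Exception):
    """Raised at the point where a value fails its contract."""


def is_type(t: type) -> Validator:
    """Build a validator that accepts values of type t."""
    return lambda v: isinstance(v, t)


def list_of(item: Validator) -> Validator:
    """Build a validator for lists whose elements all satisfy `item`."""
    return lambda v: isinstance(v, list) and all(item(x) for x in v)


def struct(fields: Dict[str, Validator]) -> Validator:
    """Compose field validators into a validator for dicts with exactly those keys."""
    def validate(v: Any) -> bool:
        return (
            isinstance(v, dict)
            and v.keys() == fields.keys()
            and all(field_ok(v[k]) for k, field_ok in fields.items())
        )
    return validate


def check(contract: Validator, value: Any, label: str = "value") -> Any:
    """Return the value unchanged if it satisfies the contract, else fail early."""
    if not contract(value):
        raise ContractError(f"{label} does not satisfy its contract: {value!r}")
    return value


# Compose simple validators into a more complex one.
package = struct({"name": is_type(str), "deps": list_of(is_type(str))})

ok = check(package, {"name": "hello", "deps": ["glibc"]}, "package")

# The failure is an ordinary exception, so the caller can recover from it.
try:
    check(package, {"name": "hello", "deps": [1, 2]}, "package")
    recovered = False
except ContractError:
    recovered = True
```

The design choice mirrored here is that a struct is itself just another validator, so contracts nest to any depth, and the check sits exactly where the author wants evaluation to fail rather than wherever the bad value is eventually used.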
Last thing: I try to advertise this kind of technique because it helps a lot to debug your expressions and to assert that this respects that and so on. It doesn't really solve the problem, because it's not a real type system. But if you're not starting something from scratch, if you're actually working on existing Nix expressions, and Nix expressions can be really large, I personally think it helps. If you're starting a new thing from scratch, there are really cool things like CUE, Dhall, and Nickel, which generate JSON, and they are available in Nixpkgs, so you can use them really easily. I personally also like PureNix, because it generates Nix output rather than JSON. Okay, I think I said most of what I wanted to say, so don't hesitate if you have a question or want to chat. You don't have time? Thank you, it was a really good presentation. |
devenv.sh - Fast, Declarative, Reproducible, and Composable Developer Environments |
All right. Thank you all for coming. Yeah, I had to fit a 40-minute talk into 20 minutes, so let's see how it goes. Sorry. No problem. Yeah, I want to talk about developer environments, and I don't mean this room, I mean how to get the software that we need in a project together, and how to do this, obviously, with Nix, and in a simple way. First, just briefly about me. I started with Gentoo in high school. I did Google Summer of Code. I switched to NixOS in 2012. Thanks, Florian Friesdorf, for this. In 2014, I wrote a pretty viral blog post about how we can do better than the current state of software in the world. Four years later, I created Cachix, which allows you to host binary caches for open source projects. I started nix.dev, which is about 15 tutorials right now on how to get started with Nix in different ways. Last year, we kind of restructured the foundation. I became a board member, and I also started devenv in November. Very quickly: how many of you are using it, just to get an idea of who I'm talking to? All right, about one third. That's good. So, developer environments, right? A few design decisions are important. We want to support all the major operating systems, so that when you have a team of diverse developers, everyone can go and install it. We want hermeticity, or determinism, so that if you come back to a project, let's say, 10 years later, it should still work, right? It should still build. It should still do exactly what it did 10 years ago. Yeah, we can talk about that in detail later. We also want to configure garbage collection. In Nix, we just download things. We don't really remove things by default. And when we do remove them, we call this garbage collection, which is analogous to how we handle memory. So we want our developer environment to take care of that for us, because by default, in Nix, it's kind of an optional thing.
We want direnv integration. direnv is a nice little tool where you just enter a directory and it activates something for you, and you exit and it deactivates it. So you don't have to run any commands. And we want a high-level abstraction that we can use to pull things together, glue them, and abstract over them, with a nice little language, for which we use Nix here. So how does it look in practice? I don't know if you can read that from that far, but you create a devenv.nix file where you have at the top a little function that is passed a few parameters. The most common one that you would want to use is pkgs. And then you can define environment variables, packages, and some shell code that executes once you enter the environment. Bear in mind, there are no containers or anything. This is just, you know, virtualenv on steroids or whatever you want to call it. And at the bottom, you can see, okay, let's say there's no git installed. You say devenv shell. You see "hello determinism" and, you know, there's git available in this environment. And there's devenv search, with which you can, for example, search for packages. This will search the whole Nixpkgs repository that we have, you know, over 80,000 packages and so on. Those of you who are using NixOS, nix-darwin, or Home Manager will be very familiar with this. So once you are familiar with how the modules work, you can switch between different components and happily use the same idea for your operating system. On the other hand, we want to specify how these environments are created. This is the devenv.yaml file, where we can specify the inputs. As I said before, the environment is fixed according to these inputs, kind of pinned. And by default, it will use nixpkgs unstable.
But you can customize that, add new inputs and so on. Inputs can be, you know, a tarball, a GitHub repository, whatever you can fetch from somewhere. And then we want these environments to be composable. What's happening here is that we still have some inputs, but we can, for example, fetch the devenv repository and reference a devenv.nix file from this GitHub repository, but also a local one. So if you have a typical setup with a front end and a back end and so on, or you just have some general stuff you want to reuse, you can pull it in with the inputs and then say, import that developer environment. And you can compose these together. That's already a more advanced use. So, fast: what does fast mean? In Nix, we have the idea of binary caches. Anything you build, you can upload to a binary cache. And when you say devenv shell, it will be downloaded. As long as everything is already downloaded, it will usually take a second or two to activate the whole thing. If not, then possibly a bit longer. But there's also a CI command, which you should run on your CI. There's a way to declare, for example, Git pre-commit hooks and build packages and all these things, and the CI command will make sure that everything is good. So you just run this on CI and upload things to the binary cache. And the next time a developer invokes devenv shell, things should be there and things should work. So, just briefly, the different files. There's devenv.nix, as I said. There's devenv.local.nix, which is not committed to the Git repository and where you can override something. There's the YAML file. And then there's the lock file. devenv actually uses Nix flakes underneath.
And this lock file uniquely pins down your environment. You should commit it to the Git repository, because if it were regenerated every time somebody uses devenv, you would get different pinned inputs. And then there is .envrc, which is the direnv glue code that allows you to hop into different folders and activate and deactivate these environments. As I said, there's garbage collection built in. Yeah, there are many details to it. It currently works pretty slowly, but I think Théophane helped to improve that. And let's see, we still have to work on improving these bits. Right now we have 36 languages for which you can just get basic tooling. So you say languages.python.enable, languages.python.venv.enable, and it will create this for you. TypeScript, and I think most of the top 20 used languages are supported. All right, to give you a bit more advanced example, take Rust: you can say enable Rust, you can pick a version, you know, your difficulty level, nightly, latest, stable. We can define some pre-commit hooks. So for example, if you have formatters, linters and so on, you can just flip a toggle and it will configure this for you. And for example, you can specify packages. In this case, it's a bit of an advanced case where you say, add the Security SDK framework from Apple in case we are running this on macOS. And if we're not, then this package will not be added, on Linux, for example. There are quite a few people who use it for PHP nowadays. In PHP, one interesting thing is that if you enable debug, PHP gets really slow. So what you want is a PHP with debug and one without debug. And, you know, sometimes you want to use the debug version and sometimes not.
And here you can see that you can specify the ini file, which version of PHP to pull in, which is useful when you have different projects that require different versions, which extensions to enable, and so on. But that's just providing the basic tooling. What's usually useful is when we have an environment in which we want to start some processes, development processes or whatever. So of course, that's also included as a declarative interface: you can set processes, and it's basically bash code that you execute. And the implementation for processes is pluggable. By default, it uses Honcho, which is a Foreman implementation written in Rust or Python, Rust, I think. Python, thank you. But you can plug in another one. There's Overmind and Hivemind, which will spawn tmux processes that you can then enter and exit and so on. So you can pick your difficulty level there as well. And then we can go one abstraction further and say, okay, now we have these processes. What if we just create recipes that automatically configure the process for us? So if I say, for example, redis.enable, this will just start a Redis process with some default configuration. And then you can, you know, configure the Redis port or the PostgreSQL port and so on. So any software can be abstracted on top of those processes. And we support a bunch of them. I haven't even listed all of them here. How much time do I have left? Yeah, and then there are a bunch of other ideas. For example, we have a scripts entry point where you can declaratively define a script. In this case, I define my script. And, you know, once you enter your environment, you can type my script and it will execute this code. And in there, you can reference the packages that you specified before, or you can even hard-code them with the Nix, how do we call it now, string interpolation feature?
Yes, it used to be called antiquotation. Here, there's a nice command-line tool, I think it's called hostctl, with which you can manage hosts on macOS or Linux through an abstract interface. So you can configure hosts. Again, this does not use containers behind it, so it will actually manipulate your machine's hosts. And I think we have a prototype to provision certificates in this way as well. But there is no limit to it. You can take any software. For example, I don't know if you know Difftastic, which is a syntactic diffing tool that understands the language and tries to do a diff in a semantic way. You can just flip a toggle for your project and it's available for anyone who uses the project. There's also DevContainer, for example, that you can use, and so on. So any software can be turned into a toggle where you just flip that thing and your environment now has it available. All right, that's it. So I went briefly through all the features of devenv, and you can go to that URL. There's a ton of documentation. If there's anything unclear, please let me know. There's also a link to, I think we have a Discord where you can complain about the lack of documentation or if something is unclear. It's really important to me. So please let me know if you get stuck or if something is unclear, and yeah, I hope that you'll find it useful. Okay, we have five minutes. Five minutes of questions. I saw this one first. Hi, Domen, thank you very much. You haven't mentioned the flakes integration. How serious are you about supporting flakes going forward? I've submitted issues for all the things that don't work. Yeah, so flakes integration for me. Oh, to repeat the question. So the question is, there's also a flakes integration. You don't have to use the command line of devenv. You can use the experimental flakes with the devenv modules, and the question was how serious I am. I'm quite serious. I don't use it myself.
So I'm serious in the sense of, you know, let's get it in. Let's get the tests there and let's fix all the issues. I think we fixed quite a few in the last month. You know, devenv was created in October, and in November people started using it widely, and we did a lot of work in the last month or two. So the flakes integration still needs a bit of polishing, but I'm quite serious about it, because I understand that some people want to use more than just the developer environments, and that's great. But I think getting the tests in, that's the crucial bit. Is it possible to edit things from Nix? Like, how close is the relationship between nix build and what goes on in devenv? Like, could you say, I want a PHP environment, but I want to edit a dependency of PHP and have it built from that source? Right, so the question is, how does Nix interact with, if I understand correctly, patching some of the things that you're pulling in, right? Yeah, because you fetch your inputs, so you could conceivably fetch some of them, edit them on the fly, and then build devenv from that, or something like that. So the answer has two parts. One is that Nix is really good at customizing things and overriding things and patching. You can take any package and apply a patch to it, for example, and instead of getting it as a binary, it will just compile it. But I don't think there is a really good and simple interface for this. I know that in JavaScript there is patch-package, I think, which is a really nice interface. So I've been looking into how to integrate something like that, so that it's really easy to just take software from somewhere in the world, hack something on top, and override it. So my answer is that it doesn't really exist in an elegant way, but it's possible to do it right now. Yeah. You mentioned direnv, is it already integrated or is it a missing feature?
direnv is integrated as long as you, yeah, sorry. So, is direnv already integrated? Yes, it is. As long as you install it, it will, you know, just work. When you say, for example, devenv init and it creates a template, it will already use direnv, as long as you have it installed. So is it like use flake or something like that? No, it will just generate a template for you, so that's it. You don't have to do anything. All right. |
The Nix package manager development process |
Hi, everyone. So, yes, Brian introduced me, I'm Théophane. I'm a member of the NixOS Foundation board, but I'm not going to talk about that today, because I'm also a Nix maintainer. I have been a Nix maintainer for the last few months, and I'm going to talk about the Nix package manager team, officially called the Nix team, but that's very confusing, so I won't call it that, at least not on the first slide. It's a team that was formed a few months ago to expand a bit on the maintenance of Nix. The reason why we created this team is that Nix, in some ways, had a bad reputation, both among contributors and among users, with this idea that the maintainers of Nix didn't care about contributors and users, which is not great for a product and not great for an open source project. To be fair, that's not true. A slightly less wrong, amended version would be that the single Nix maintainer didn't have the bandwidth to care about all that, because it was one guy hacking on a fairly big project. For a bit of history, if you don't know, Nix was originally a PhD project that Eelco Dolstra started. Then it grew progressively and the community expanded, but until recently it was still maintained only by Eelco Dolstra, one person, which was definitely not enough for something of that size. So it's not about someone not caring, it's really more of an organizational problem, but regardless, that's not great for a number of reasons. As I said, contributors didn't really feel as appreciated as they should have been. A lot of pull requests stayed unanswered for a long time or forever. Even when they were answered, there was no guarantee that you would get an actual yes or no answer to "is this pull request something that's going to be merged eventually?" And there was also no way for contributors to know beforehand whether what they were doing would fit the Nix
style guide, because there was one, there still is one, but only one person knew it. So it was really tough to contribute to Nix because of all these little things. Another problem, which is more on the user side, is that the release process of Nix was a bit chaotic. The 2.3 release of Nix was somewhere around 2018, anyway, I don't remember the exact date, but then Nix went two years without a release. When the 2.4 release came, well, I just made some quick stats, and I'm going to show the interesting numbers. Nix was 55,000 lines of C++ at that time, only counting the implementation files, and the diff between the 2.3 and 2.4 releases was 32,000 new or changed lines, so essentially half of Nix had been rewritten between the two releases. As you can guess, that came with a number of breakages, honestly very few breakages given the size of the diff, but still more than enough to annoy users. Regardless of the breakages, new features had been introduced but not necessarily properly documented, or experimental features had been added and were officially experimental, but that was not properly communicated. So people started relying on them as if they were things you could use in production, and then they got changed, because they were just experimental, and people got angry about that. I mean, rightfully from their point of view, and wrongly from the point of view of the maintainers, for whom these were experimental and you should just not rely on them. But anyway, that led to a lot of frustration, and that had been going on for a long time. I think we could say that Nix has been a big enough project since the early 2010s, and it has kept growing ever since, so the frustration kept growing with it. So in 2018 a group of people sat together and decided that this just couldn't keep going, and they decided to create the Nix core team, whose
responsibility was exactly this: improve the maintenance of Nix, make it clear what the design decisions are, where the project is going, what contributors can expect from the team, and what the team expects from contributions, so that all of that could be smoothed out. This core team was a great idea, but at least to me at that time, and I wasn't that involved in Nix then, I was only a user and a very casual contributor, the main contribution of the Nix core team that I saw was, one year later, another announcement that the core team was disbanded because it hadn't done anything meaningful in that year. So yeah, that was unfortunately a bit of a failure. Then, in the meantime, I came to contribute more and more to Nix. It also happened that Eelco and I were colleagues for some time, which really helped smooth things a lot for my contributions, because I could just ping him and say, hey, Eelco, I have this pull request, can you please review it? That was really helpful, but unfortunately, well, maybe fortunately, I guess that depends, not everyone got to be Eelco's colleague, so that model couldn't really scale. And yeah, at some point we thought, well, we need to do something, and maybe try again to do what that Nix core team didn't manage to do. But then the big question is, how can we succeed if that core team failed? And for the record, if you look at the co-authors here, these were the biggest members of the Nix community at that time. That was not just a group of random people showing up and saying, hey, we want to maintain Nix. That was really a big community effort. So how could we make something that works, and why could we maybe make it work now? I'll expand a bit later on the how. The first thing is that the circumstances were actually very different.
So I dug a bit into the GitHub history stats. This is very blurry, but that's so you won't be tempted to read all the numbers and will just see the overall shape of the graph. These are the main committers between 2003 and 2020. That graph represents Eelco's contributions. And then there are these very, very, very tiny graphs that, yeah, I can see that one. It's not smaller than the pixels on the screen, but barely. Those are the other main contributors. So as you can see, that's really a one-person project with some contributors. If we look between 2021 and 2023, and I'm probably getting out of the camera's field, so I'm going to stay where I am, you can see that Eelco is still the main contributor. There's no question about that. But things are much better distributed after that. So that means that there are actually people who are potentially already investing enough time in Nix and could really take on maintainership. Another key ingredient that was also probably missing at the time is that between 2018 and today, the Nix community has kept growing, and we have more and more industrial players in that community and more and more people who are paid to work on Nix. Eelco has had the chance of always having been paid to work on Nix, more or less, first as a PhD student, then by getting hired by a company that gave him a lot of time to hack on it, then another, and so on. But for a long time, he was mostly the only one, at least for Nix itself. But yeah, now things are changing, we're kind of growing up, we're a real community with people actually handling money and having people who have time to do this kind of boring and painful work of reviewing pull requests of code that someone else wrote, and obviously I didn't write it myself, so it's not good and I don't want to read it. But now we can leverage that. And so that's what we did.
So we gathered a group of people, we sat together with Eelco and the other people who would become members of that team. Everyone agreed that this was a great idea, and we had a lot of disagreements on the tiny details, because obviously that's how things go. But eventually we founded that team and announced it officially, saying, hey, yeah, everyone agrees that there are problems, so we're just going to create a team to try and solve these problems. We set a simple but ambitious goal for the team, which was basically to take ownership of the Nix source code and lead it to higher standards, both for contributors and for end users. Contributors should know that their contributions would be taken into account, or know that they wouldn't, in case these weren't things that we thought should land in Nix. At least they should have clarity. And users should know that Nix was something they could rely on, that it was robust, that fixes were merged in time, and that they had clear expectations about the support they could get. And because Nix is starting to be reasonably big, we also thought that a single team of four or five people, well, that's already better than a team of one person, but that's still not really enough to manage the Nix code base. Not so much because of its size, the code base is not that big, but because of its vitality and the amount of pull requests and issues that come in. So we decided that the first means by which we would take ownership of that code was to enable more maintainers and contributors, by writing some clear contributing guides so that people know what to expect, and by trying to grow more maintainers out of the existing contributor base, so that we would not be a bottleneck for someone getting their pull request answered. And as part of that stewardship also, oh, I forgot a bit of history. So this was the main step towards that, but actually someone had already decided to bite the bullet on another topic some months earlier.
And yeah, same thing, people sat down together and decided, well, the Nix release process is not great, we need to improve it. Well, let's just have a fixed schedule, one release every six weeks. That's arbitrary. Probably half of this room would say, well, this is a bad way of doing releases, and another half would say, well, this is a bad way of doing things, just live at HEAD, and the other half might agree. But anyway, everyone agreed that having that was better than nothing. Oh, and I'm talking too much. But this release schedule was also something that was still in Eelco's hands. And doing a release is not a big deal, but you have to think about it. And if you're the only one whose task it is to do that every six weeks, well, certainly, please don't go on vacation on the date of the release, because otherwise you're going to annoy everyone. And please don't forget about it, because then you're going to have 20 people yelling at you because you're three days late. So that team was also a way to share this responsibility a bit. And well, the big question then is, did we succeed? I hope so. I really, really hope so. I don't think I gave you the date already, but we started all that last September, which means that we are six months in. And six months in, it's still a bit tough to tell. The main indicator I have for thinking that we've probably succeeded is that we've turned the Nix commit tree into something that starts to look like a plausible metro map, which is always a good thing when you're maintaining an open source project. And in addition to that, on some less tangible factors, I think that the team is going reasonably well and Nix is starting to move forward at a more sustainable and predictable pace, which makes me very, very hopeful for the future. And just taking two minutes to expand a bit on that, because this Nix team is not an isolated phenomenon.
It's part of a growing movement within the Nix community to try and structure the fundamentals of the ecosystem, starting with something that Domen briefly mentioned, which is the NixOS Foundation board, which got expanded a few months ago to try and be more proactive within the community and to expand on its previous role, which was mostly to pay for the infrastructure. This also materialized in the emergence of a lot of different teams: the marketing team, which was created a bit more than a year ago, then a team dedicated to the documentation. I think nearly every speaker here said at some point that the documentation is a problem for Nix, so everyone agrees that this is a problem. Another team got created around the maintenance of Nixpkgs, which is, correct me if I'm wrong, I think the seventh most active GitHub repo. And when you have something of that size, you'd better not have it informally maintained by whoever happens to pass by and wants to make some changes. So all of that is part of a big community movement about organizing these kinds of things, which makes me very happy and very hopeful for the future of the Nix community in general. And since I want to leave some time for questions, I'm going to stop right here. If you want to know more after that, we can always have a chat, or contact me wherever. Thank you. Thank you. We have time for one or two questions, depending on the size. Are there any questions? So, maybe not directly related, but: when, if ever, will flakes be the default? That's a big question. OK, so the question was: when, if ever, will flakes be the default, with a comment by someone I won't name who said that it was under embargo and that I couldn't answer that. And so the answer, well, one answer is: when it's going to be ready.
And actually, a huge part of the Nix team discussions was about refining the semantics of flakes, to make sure that all the little corners that we wanted to have cut before actually making it stable were cut. So I definitely wouldn't advance a date or a time. But it's progressing, which doesn't answer your question, I know. OK, so the question was how we consider projects like Tvix, which is a re-implementation of Nix, and whether we want to engage with them or whether we are already engaging with them. I think I cut that off in my screenshot here, but one of the explicit goals of the team was to engage with any third party that could be interesting to engage with, and that obviously included the Tvix people. We haven't done it yet because, well, people never do what they say, you know that, but it is definitely something that we want to do, and in the case of Tvix in particular for two reasons. One of them is that Tvix was really born out of the very frustrations we talked about at the beginning, so it would be interesting to have their feedback on whether they feel that things are moving in the right direction. And also because, well, let's be honest, having another implementation of Nix, which is kind of a competitor, is a real pain, because these people are doing what we want to do. But it's also great, because the Tvix people came every once in a while, for example opening an issue saying: there's this very, very weird piece of semantics in Nix when you trigger this specific code path. I don't know whether that's a bug or not. I'm probably going to reimplement it for Tvix, because I want to be bug-for-bug compatible. But is this really what you intended? In general the answer is no, but my point is that I think we would gain a lot from collaborating a bit more. Hopefully that's going to work. |
Runix
a type-safe Rust interface to the Nix CLI |
Everybody please welcome Janik. Janik? Yes, that's me. I'm Janik. I work at flox, as might be apparent. I am the author of the NixOS search backend. From that, and from the work we do at flox, we have seen some problems with interacting with the Nix command line as it is right now, and we have written Runix to make it a bit nicer to use from high-level languages. First of all, I see five different steps, or rather five different ways of interacting with the Nix CLI. The first one might be the most obvious: the command line interface. That's the nix program that we know and love, or not, or whatever. It's commands like nix build, nix develop, nix flake (yes, we assume flakes are a thing), nix copy; the list goes on. But a CLI in this sense is mainly made for the user. So you have completions, you have colors, you have your dialogues and so on. The main point is that everything is very manual. Every command you run is a single operation, and any context you want to share between two operations you have to manage yourself. You have to manage to paste or write the right, and the same, flake reference or installable into the two commands that you want to run together or in the same context. It's totally up to you. So this is where you start writing scripts, because you don't want to do this all the time. We automate with Nix, right? Writing shell scripts is easy and useful for simple automations, either in CI scripts on GitHub Actions or in local workflows in general. You can use the same syntax you use in the terminal: shell syntax, the same CLI.
But now the CLI has machine-readable output: the JSON output modifiers allow you to save and process output, and the Bash or shell language that you use allows you to compose commands by saving, processing, and managing the inputs to commands. Now, you have been writing stuff with Bash. Imagine you want to write an actual program. Well, it kind of looks like this. How do I know? This is basically what we wrote. It is part of the NixOS search backend that pulls data out of a flake. In general, it's great to automate things with flexibility, because you have a language around your invocations. And it is great for building programs on top of Nix by executing Nix directly. But it is very heavy on boilerplate, as you might have seen. The structure of the arguments, as well as their correctness, is entirely up to you again. And last but not least, there are strings everywhere. You might think: why is this a big problem? Well, I guess Domen might agree, and so might anyone who has ever written apps around Nix: you end up writing these Nix invocations, and managing their inputs becomes a bit of a pain. So what's the next step? For us, it was: OK, let's remove the notion, the idea, of invoking Nix as a CLI app through exec statements, at least on the language side. There is still Nix in the background, but we put a layer of abstraction on top of it because, well, everything can be solved with another layer of abstraction. And because I like Rust and we're using Rust now, it's a Rust interface. So it's typed, and we hope to leverage the type system to create correct, hopefully correct, commands and arguments that are also predictable. So, types please. Let's look at how the nix eval command looks, because the command we saw before was a nix eval invocation. So what does nix eval have?
If we look at the help or the manual of Nix, we see that nix eval has options of its own, and it has arguments that are shared between different Nix commands: common evaluation arguments, common flake arguments, and options that change the interpretation of installables, whatever that means. Trying to represent this at a high level in Rust might look like this. You have your flake arguments, your evaluation arguments, your source arguments. The names and the structure are modeled after the C++ code of Nix, to have at least some guidance as to why these are named like this. Let's look at the flake arguments. Flake arguments contain things like not writing lock files, updating inputs, and so on. But why would you care about typing these things out? Well, have you ever tried to make an input point to a Git source? It takes something like five steps to get to the point where you actually have the right format of the Git URL. And it is a weird format, because you have to write git+ something, and it doesn't work the way you would use it on the Git CLI. So there is a lot of confusion, and a lot of errors as well. And Nix doesn't really say what's wrong with it. It just says the URL is unsupported, Git failed, not a valid URL, and you're like: why, what? So being able to specify flake references in a typed way would allow you to eliminate an entire class of errors. You can still fail if the repository doesn't exist, of course, but you avoid this class of URL construction errors. The same goes for the evaluation arguments, for example things like the include argument. You have a typo in there, here it's missing an A somewhere, I don't know if somebody spots it, but the error doesn't really tell you this.
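As an illustration of the typed flake references described above, here is a minimal Rust sketch. It is not the Runix API: the `FlakeRef` enum, its variants, and its field names are hypothetical and chosen only for this example. The only parts taken from Nix itself are the `github:` and `git+https://` URL shapes of its flake reference syntax.

```rust
use std::fmt;

/// Hypothetical typed flake reference (illustration only, not the Runix type).
/// Each variant carries exactly the pieces its URL form needs, so a malformed
/// URL cannot be built by accident.
#[derive(Debug, Clone, PartialEq)]
pub enum FlakeRef {
    /// Rendered as `github:<owner>/<repo>[/<git_ref>]`.
    GitHub {
        owner: String,
        repo: String,
        git_ref: Option<String>,
    },
    /// Rendered as `git+https://<host>/<path>[?ref=<git_ref>]`.
    GitHttps {
        host: String,
        path: String,
        git_ref: Option<String>,
    },
}

impl fmt::Display for FlakeRef {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FlakeRef::GitHub { owner, repo, git_ref } => {
                write!(f, "github:{owner}/{repo}")?;
                if let Some(r) = git_ref {
                    write!(f, "/{r}")?;
                }
                Ok(())
            }
            FlakeRef::GitHttps { host, path, git_ref } => {
                write!(f, "git+https://{host}/{path}")?;
                if let Some(r) = git_ref {
                    write!(f, "?ref={r}")?;
                }
                Ok(())
            }
        }
    }
}
```

With a type like this, the string that ends up on the Nix command line is produced in one place (`to_string()`), so the "git+ something" format only has to be gotten right once instead of in every caller.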
And even though the include might not really be necessary with flakes anymore, if you do use it, the idea stays the same. Then, on the source arguments, there are arguments that require a Nix expression as an argument. It is just a string in every other context, but now we at least have a chance to enforce that it has to be parsed by a Nix parser and has to be actual, valid Nix code. The same goes for the arguments of the eval command itself. The apply argument requires you to pass a lambda, a function. The error is not telling you this. It says there's an unexpected token, and this is not what you want to see. So enforcing that it is a lambda on the syntax level is a step forward, in my opinion. So this is the typed version of the command that we saw before, and, I don't know, to me it looks clearer what's happening. But I leave that up to you. Also, excuse that this is not the exact code that you would write. There are a lot of intos or Somes missing, but it had to fit on the slide, so it got a bit trimmed. The same concept applies to Nix arguments. We saw the nix eval arguments; Nix itself has arguments as well that apply to all commands. They are a bit separate here. Before, we allowed import from derivation, or disallowed it in this case, so we'll do this as well. Now we have our Nix args and our evaluation command args, and we just run it, right? We take the struct, and there's something that takes the struct, creates a list of arguments, and executes the command in the background. So we take our things, we take a Nix command line backend reference, and we wait for the output. Why do we take a backend reference? Well, calling out to Nix is great, and it will serve our purpose for the time being. But what do we do?
We take data and put it into a struct, we take the struct and turn it into strings, and then we pass these strings to Nix. Nix takes these strings and creates a struct, evaluates stuff, does stuff, and serializes the result back into a string. We deserialize it in Rust. There is a lot of serialization back and forth going on. I think Yann talked about this earlier with his Nickel. So why do we need all this back and forth? This raises the question of the next step: wouldn't that be a truly native backend? So what's next? We might expand on Rust FFI bindings and thereby allow for more efficient commands, because no serialization is taking place. We would allow for entirely custom commands, and with custom commands you could write a wrapper around Nix. You could write, or experiment with, a new Nix front end, and maybe you could empower Nickel if we build the bindings in the right way. And in the end it is modular: the entire library is built on the idea that you implement commands rather than a backend. That is, you can implement different backends for a command, and do all of this step by step. Now, for some less ambitious ideas: what can you do with these things right now? Well, you can already build custom Nix front ends, or use Nix in the backend and do stuff with it. You ask me how I know. You can build testing or mock backends for integration testing or unit testing in confined environments. Or you could implement a dry-run function, and this is my community pitch for today: you could use something like Runix here and write nix explain. It takes all the arguments of Nix, but the output is not something that Nix does, just a description of what this command is supposed to do. Just an idea. And this is my second-to-last slide.
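The flow described above (put data into a struct, turn the struct into an argument list, hand that list to a backend) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Runix code: `EvalArgs`, `Backend`, `CommandLine`, and `DryRun` are hypothetical names introduced for this example. The flags `--json`, `--impure`, and `--apply` are existing `nix eval` options.

```rust
use std::process::Command;

/// Hypothetical, trimmed-down model of the arguments to `nix eval`.
#[derive(Debug, Clone, Default)]
pub struct EvalArgs {
    pub installable: String,
    pub apply: Option<String>,
    pub json: bool,
    pub impure: bool,
}

impl EvalArgs {
    /// Turn the typed struct into the flat list of strings the CLI expects.
    pub fn to_args(&self) -> Vec<String> {
        let mut args = vec!["eval".to_string()];
        if self.json {
            args.push("--json".to_string());
        }
        if self.impure {
            args.push("--impure".to_string());
        }
        if let Some(expr) = &self.apply {
            args.push("--apply".to_string());
            args.push(expr.clone());
        }
        args.push(self.installable.clone());
        args
    }
}

/// Hypothetical backend abstraction: anything that can "run" an argument list.
pub trait Backend {
    fn run(&self, args: &[String]) -> Result<String, String>;
}

/// Backend that shells out to a `nix` binary and returns its stdout.
pub struct CommandLine {
    pub nix_bin: String,
}

impl Backend for CommandLine {
    fn run(&self, args: &[String]) -> Result<String, String> {
        let out = Command::new(&self.nix_bin)
            .args(args)
            .output()
            .map_err(|e| e.to_string())?;
        if out.status.success() {
            Ok(String::from_utf8_lossy(&out.stdout).into_owned())
        } else {
            Err(String::from_utf8_lossy(&out.stderr).into_owned())
        }
    }
}

/// Mock backend for tests: it only describes what it would run.
pub struct DryRun;

impl Backend for DryRun {
    fn run(&self, args: &[String]) -> Result<String, String> {
        Ok(format!("would run: nix {}", args.join(" ")))
    }
}
```

Because callers only depend on the `Backend` trait in this sketch, swapping `CommandLine` for `DryRun` in a unit test needs no Nix installation, which is the kind of mock or dry-run backend mentioned in the talk. A native FFI backend would be a third implementation of the same trait.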
I talked a lot about replacing the Nix front end, or replacing anything in Nix these days. We don't intend to replace Nix in any way here. This is entirely about making the user or developer experience more convenient for people who use Nix inside their applications, by providing the proper layers and proper bindings. We might explore ideas with the Nix developers, maybe the Tvix people, to see whether there are bindings to be made and in which way. But for now, we are focused on empowering people. With this, thanks for your attention. I'm ysndr on GitHub. I have published the crate for it today. You can contribute if you like. And flox is hiring, I guess. So, we'll only be able to take one question. You have one question. Two questions. Let's go. Yeah, I love the idea of a native interface to Nix. Do you know if it would be hard to keep up with? I mean, is this interface stable in any way? How do you think you could maintain this? Currently, it's at the idea... Oh, sorry. You can answer the question, but repeat it. OK. So the question was how stable the Nix interfaces are that we would bind to, and what stability we expect from those bindings. At this point, binding to Nix natively is more at the idea stage. We have seen examples of this in, what is it, the Harmonia libstore implementation, which probably provides lessons to learn from. For now, this is something we will experiment with. We can try it with individual commands. We can try it with custom commands that we implement. But for now, it's not clear how that works. Thank you again. Thank you very much. |
P4 in Nix
Bringing hardware accelerated network to the masses! |
Please welcome our next speaker, Govah. Hi, so my name is Govah, which is not really a name, so you can just call me Govah. And I'm going to talk about this great thing called P4. Before that: I was sponsored by the NLnet Foundation, with a grant that is financed through the European Commission called NGI Zero, Next Generation Internet. So if you have ideas and just want money, go for it, it's actually a really cool thing. So now, what is P4? I guess the academic definition would be Programming Protocol-independent Packet Processors. It's a domain-specific language for network devices. It defines how data plane devices (switches, NICs, routers, filters) process packets. OK, great. But what does this actually mean? It's a language, basically, for hardware-optimized network processing. Think SIMD for networking: you have those FPGAs, DPUs, whatever, and you can make them process network packets for you at an accelerated rate compared to software. It roughly looks like C, in a way. As you can see, it has the same kind of syntax, arguments and whatnot. The thing is, there are a few oddities. You can see it has this state thing, and it doesn't have a return. You have transition select; what is this? It looks a bit weird. So let's explain this a bit. Functions in P4 are mostly replaced by things called parser, control, and package. So what is a parser? A parser is a function that parses an incoming packet according to, well, the same things as in C, which means structs, typedefs, etc. Say I define a struct, and I say: if this element of the struct is this, then I'm going to call this function. That's basically what the parser function does. Then you have the control functions.
There you modify your parsed packet. Say, for example, you have this really long packet which says: I go through hops X, Y, Z, and you want to remove hop X, because you are hop X. You're going to do this in a control function. And then you have package, which basically defines the binding logic with the hardware. Say I implement a firewall: how am I going to add new rules, how am I going to load new definitions, and whatnot. So that's basically it. There are other interesting keywords and other things that I'm not going to explain here, because that would be out of scope for the talk, and P4 is already complex enough as is. So how do you use P4 in Nix? There are a few ideas. My take on it is: let's make a transpiler. Why? A transpiler allows you to reuse P4 concepts within Nix, because P4 is a bit verbose. Say you want to parse an IP or TCP packet. Each time, you need to redefine what the IP header is: you redefine this struct, then you have it parsed in one way, and sometimes you want this struct and sometimes this other IP struct, because you don't want to parse some optional data in the header that would slow down the processing, et cetera. So it's a bit verbose; let's just use Nix to implement parts of it and have it in a nice package. So what is the transpiler? In our case, it's a Nix-to-P4 translator, which means you can define things in Nix (structs, enums, whatnot) and it generates P4 code. Then you have the P4 compiler, which actually processes the generated code, followed by the target compiler, which allows you to deploy your thing to, say, an FPGA, a DPU, et cetera; basically the thing that allows you to run your program. So what does it look like in action? Basically, this is a Nix file which allows you to define some P4 concepts.
In this case, I define a header file which contains a few constants, like max hops, standard hops, et cetera, and then a few headers. Here I'm just defining standard_t, which has two fields, source and destination, both of type bit<8>. That's basically just a way to avoid the annoying part of P4. Then you can also have includes. This is then processed by this function called runTranspiler, mkTranspiler, whatever I call it now. You just run this, and it generates code for you, and that's great. We can simplify this, obviously, because it would be annoying to type this header, that header, et cetera, each time. And the whole point of making this is to make it less verbose and to reuse code, so that you can have your infrastructure mostly in Nix with a few bits and pieces in P4, and basically put everything in the same place. In this case, you can just inherit things that I define in helpers, do the same thing, and you get your P4 source. So that's a lot simpler. We don't have to define Ethernet each time, and we don't have to define MAC addresses each time, which is a thing in P4, because if you use standard P4 you have to redefine all of this in each program you write. This is thanks to helpers that I wrote in my packages. So what does the end result look like? Because I keep saying: this is the Nix code, this is what it looks like, et cetera, yeah, it's great. OK. So this is basically the transpiler.
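The generation step described above, where a structured header definition is turned into P4 source text, can be sketched as follows. The speaker's transpiler is written in Nix; this Rust version only mirrors the idea and is not their code. The `Field` and `Header` types and the `emit_header` function are hypothetical names for this illustration; the emitted `header ... { bit<N> name; }` form is standard P4-16 syntax.

```rust
/// Hypothetical description of one header field: a name and a bit width.
pub struct Field {
    pub name: String,
    pub bits: u32,
}

/// Hypothetical description of a P4 header: a type name plus its fields.
pub struct Header {
    pub name: String,
    pub fields: Vec<Field>,
}

/// Emit P4-16 source text for one header definition.
pub fn emit_header(h: &Header) -> String {
    // `{{` is an escaped literal brace in a Rust format string.
    let mut out = format!("header {} {{\n", h.name);
    for f in &h.fields {
        out.push_str(&format!("    bit<{}> {};\n", f.bits, f.name));
    }
    out.push_str("}\n");
    out
}
```

For a header named `standard_t` with two 8-bit fields `src` and `dst`, `emit_header` returns a `header standard_t { ... }` block with one `bit<8>` line per field. Shared definitions such as an Ethernet header could then live once in a helper and be emitted into every generated program, which is the reuse the talk is after.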
It's kind of dirty Nix in a way; you can see a lot of messy indentation. But the idea is that I define a module, a NixOS module, which verifies the types of what I give it. In this case, for the header, I give it a default type; then you can have the union, the content, the oneOf, et cetera. These are then parsed by the transpiler, which is these huge nested functions that basically just map structures, like Nix attribute sets, into P4 text, and then just write it out. It's just a source-to-source transpiler. It's nothing fancy, but it's fine. And what does the end result look like after the transpiler? It looks like P4 code, and pretty clean. You have the include, the defines, the hops, max hops, standard metadata, then you have your structs, et cetera, and then you can include your own code. This whole P4 code is then processed by the target compiler, which compiles it for its own platform. It can, say, generate eBPF code, so it runs on Linux. eBPF is basically this kernel thingy that you can use to run code with more privileges; I'll explain it later if I have the time. And this is BMv2, and BMv2 is just a simple model architecture; I'll also talk about it later. So basically, it's just a source-to-source compiler that has multiple stages and allows you to specify P4 functions and reuse concepts. It's nothing fancy, but it's useful. Now that I've mentioned BMv2, I think I have time to cover it. So what is it? Glad you asked. The simple switch architecture is the de facto architecture used to test p4c, which is the main compiler. It's basically this standard test setup that you can use to see if your code works, but it's not fast, and you can't really deploy it anywhere. It's basically this slow userland thing that you can use to test your program; it is not fast, but it does mostly what you want.
It's basically just this interface for targeting the switch. It's an abstract interface, and it's also used to verify how p4c works. Now, to continue a bit more on p4c. To use P4 you need a target, and the main targets currently implemented in p4c are these. Userland, which in this case is eBPF. Some people are going to cringe at this, because eBPF is technically kernel, but what I mean by userland is that it runs on a computer. Then DPDK, and this is why I'm saying userland: it has its own userland code that can run, but DPDK is also, how do you call that, an API, I guess. It's an API that can target a bunch of devices, so it's not only userland, but it has a userland part. Then hardware, because you can also use DPDK to target hardware: say FPGAs, which, for people who don't know, are things you can use to, say, program CPUs or whatnot; you can basically reconfigure them at the electrical level, the logic gates, et cetera. And custom ASICs, which are basically custom processors and whatnot. And then emulated. I'm calling it emulated because of how slow it is. It's basically just for testing: BMv2, I told you about it already. Obviously, all of these need some kind of interface to work with. Usually you use the abstract switch interface with a few per-device changes; say, some control functions have a definition that is specific to eBPF and the like. So there are a few changes, but overall you can target basically all of these with mostly the same code, which is great. This also needs changes to the transpiler, though: you need more options so that you can target every different one, so that you can have the whole mess auto-generated by the transpiler. Say, in Nix you can just define: hey, I want target X, and it does it automatically for you without you having to change the P4 code, which is a mess.
So now, introducing FPGAs on Nix. And I'm actually not kidding. The thing is, I forgot to take the picture before going to FOSDEM, so you'll have to imagine it, which is really hard. Basically: an FPGA connected to a computer through, say, USB or whatnot, and then connected to the network through Ethernet. The idea is that we are not running Nix on FPGAs. We are using Nix to define what the FPGA runs, in this case the network stack. So we need to define the devices we have directly in Nix. Say I'm going to have something like hardware.alastor.type is my FPGA, and then I define what goes on it. Then you need some kind of auto-reload and deploy mechanism; usually most FPGA vendors just provide you a way over USB. You can also have an auto-reload of rules through the control plane in P4 directly, if you want to go this way, which basically means a no-downtime network stack, which is nice. And you can have a data plane mechanism for feeding data to the host. All of this is a work in progress for now. The transpiler is done. Basically, most of the source-to-source transpiler, including the targets, is done. The targets are packaged: you can use BMv2 today, you can use DPDK and whatnot. But the hardware definitions are currently a work in progress. I haven't had the time to work on them yet, but it's close enough. And software works. So yeah, that's basically it. Any questions? Yeah? How can you get a switch that runs P4? So the question is: how can I get a switch that runs P4? You don't technically need a switch that runs P4. You can use any, say, FPGA or DPU that you can reprogram and that is targeted by DPDK. You also have some routers with ASICs that are reprogrammable. As long as you can target it with DPDK, which is basically the API you target, you can just use it and run P4 on it.
Maybe I missed it, but what is it that makes Nix particularly handy for this compared to other things? OK, so the question is: what makes Nix particularly nice for P4 in this case? What makes Nix useful here is that it sits at the border: you have this boring code to generate per target, and you also need to actually target an infrastructure. And Nix is this great thing where you can define an infrastructure and then write automated code. So Nix has two key roles here. One is allowing you to write less code, not having to rewrite everything every time, which makes it basically a handy source tool. The other is defining your network stack in the same place where you define your standard infrastructure. Say I want to define this server; OK, cool, but now my server has 100 million requests, I don't know, and I need to offload these to some degree. So I'm just going to say: hey, hardware-definition.fpga is my FPGA, and now I have the speed-up, magically. Well, you need to pay for the FPGA and program the thing, but yeah, magically. Any other questions? Yeah? This is useful for things that are not network-based, right? You can do things that have nothing to do with networking on your target, your FPGA. Is there scope for this, or is it basically just for networking stuff? So the question is: is this going to be useful for things other than networking? And the answer is yes and no. The FPGA deployment mechanism, yes, is going to be useful, because you can reuse it outside the scope of P4. The no is that everything else is basically P4-specific. But yes, the last part is probably going to be reusable for things other than P4, first of all.
And second of all, there are probably going to be a few different FPGA targets. There are always going to be the proprietary ones with DPDK, which have their own, say, deployment binaries that are closed source and a pain, but you can always try to use, say, Yosys and whatnot, to have an open source FPGA setup. So yeah, you can use it outside of P4, basically. Yes? The P4 language is heavily used for non-network things; a lot of people are using it outside networking, like Western Digital for a cache-coherent bus and things like that. If you have any kind of stream that you can process, and you can express it in a way that it can be processed, you'll find you can process it in a way that is really domain-specific for DPUs and network processors, like you have CUDA, OpenCL, and so forth for GPUs. That's a really good point. Just for the people on the live stream: basically, what was said is that you can use P4 to process any kind of stream and modify it, and it's used by other companies to process such streams in an accelerated way, which is very true and very handy. It's not the most common use of P4, but P4 is not very common anyway. We have two minutes left. Did you want to do a demo? OK, let's do that. So, thank you. That was me. Here's my website. Here's my email. And, yeah, we have one last thing to show. Hmm? I want to take a photo of the slide, just like that. Ah, sure. It should be recorded, so. Yeah. OK. Did I expect people to actually care about me, honestly? So. The demo is a bit of a strange thing, because the next talk is about Secure Boot. Yes. So it's a demo about Secure Boot. There's one thing I forgot to mention: I'm working on P4, but I'm also working on the Secure Boot thing as well. So what I wanted to show you is this. Let me just do that, so I can show this. I started this small PR the other day, which is GRUB support for Secure Boot.
So, yeah, I think now we can actually boot with Secure Boot on NixOS using GRUB. And to add on to that, let me just show it there, because I have too many windows. There we go. Let me just, yeah, there we go. Welcome to GRUB. And I'm just going to zoom in for the live stream. OK. Yeah. There we go. And basically, this whole thing is running with Secure Boot. And as you can see, it's panicking, but it's close enough. It's close enough. So now I'm going to let the next speaker explain what Lanzaboote is. |
Towards Secure Boot for NixOS |
Awesome, thank you. So my voice was destroyed yesterday in the pub, so everyone has to be very quiet. Awesome. Good job. OK, awesome surprise with the Secure Boot demo, super awesome. I did not know about it. So, we're talking about Secure Boot on NixOS. This was a team effort by this fine gentleman and the other one over there, who were happily coding while we were sitting on a boat and I was enjoying life. All of this happened mostly at the Ocean Sprint in Lanzarote, which was also an epic event for getting stuff done in the Nix world; I can totally recommend going there. So, Secure Boot: I thought I'd give the shortest possible introduction to it that I can. And then we go on to what this Lanzaboote thing is, what status it is in, and how you can contribute. But what's Secure Boot, and what's the problem? Imagine you're here at FOSDEM, your laptop is encrypted, and then you go out to the pub, where you scream too loud and can't talk afterwards. While that is going on, your laptop sits in your hotel room alone, minding its own business, and many hours later you come back and type your password into it. Is this still the software you think it is on your laptop? Without Secure Boot, you don't really know. Secure Boot is one solution to this. It's not a complete solution, but it mitigates some of it, and the way it works, in a very tiny nutshell, is that your EFI firmware verifies what it's booting. It takes the bootloader and checks the cryptographic signature on it, and the bootloader then has to look at the operating system it boots and also check a cryptographic signature on it. So you have a chain of trust from your firmware all the way to your operating system, and if this all works, then someone else can't easily replace the operating system with something that your laptop doesn't trust. Now the question is: OK, it verifies cryptographic signatures with what?
Typically, if you buy a laptop, it will trust the Microsoft CA and some OEM CA. This is fine for Windows, obviously, and it's also fine for some of the other big distros: you can take an Ubuntu and it works on a Secure Boot laptop, you can take a Fedora and it just works. Unfortunately, for NixOS it does not. So what's the issue with Secure Boot and NixOS? I think fundamentally it's a very different model from other Linux distributions. There is a thing that centrally builds packages and you can download them from somewhere, but it's mostly a build cache, and you don't have to use it. You can build all of this locally, and it's so configurable that it's very easy to end up with an initrd or a kernel that is not cached on cache.nixos.org, and then you will obviously not get signed binaries, even if cache.nixos.org signed them, which it does not for Secure Boot. So for now we're targeting your own CA, so you can just say, fuck it, I will enroll my own keys in the firmware. This is scary, it also looks scary when you do it, but it works reasonably well. But then comes the question: every time I change the software on my laptop I have to manually sign it, and that sucks, no one wants to do that. So now we come to what this Lanzaboote thing actually is: it is the tooling that makes all of this convenient for NixOS. Lanzaboote takes care of automatically re-signing your systemd-boot and your kernel whenever you do nixos-rebuild; that is the one-line description of it. It does not take care of generating keys initially, and it does not take care of enrolling them in the firmware; this is something the user has to do once right now. We have quick start documentation for that, and I've heard that it has worked for other people.
Not bad. I have also heard that it did not work for other people, but so far I think the likelihood of trashing your system with it has been very low; Niklas was very busy fixing the onboarding. All of this revolves around unified kernel images, which is pretty recent technology out of the systemd sphere. A unified kernel image is a normal UEFI PE file that can just be booted by the firmware, with some extra bits. It's basically a tiny archive of the Linux kernel and the kernel's command line, and it also contains some meta information: there's an os-release file in there with the name of this thing and the version, and this is used to generate the entry in the menu when you select what to boot in systemd-boot. And because it's all self-contained, you just sign this one thing and you're good with Secure Boot. As you can see, GRUB support is nearing completion, and Ryan keeps telling me Linux support is also planned, but so far systemd-boot is the way to go. Now, there is a PE stub: to form a unified kernel image you have to put some PE binary at the front. There's one from systemd, called systemd-stub, and it does exactly what I just told you: you have the stub at the front, then the command line, the kernel, the initrd, and the os-release, which I ignored on this picture, and it works. The problem for NixOS is that this kernel command line contains a store path that changes for every NixOS generation, so whenever you do nixos-rebuild, or if you experiment with your system a bit, you will have a new one of these files in your /boot directory.
This is problematic because you also have the kernel and initrd in this one blob, and at least for me they are around 40 megabytes, while the typical system partition, the /boot thing, is about half a gigabyte. I've seen NixOS systems with many generations, I point at you, and running out of space on your boot partition is very uncomfortable. That's why we wrote the Lanzaboote stub. It does the same as the systemd-stub, except that you don't have to embed the content of the kernel and initrd anymore; you just embed a path to the file and a cryptographic hash of it, so you basically point somewhere and say what you expect to be at that location. On the boot partition you then also have these files, and you're good: the stub gets the file name, loads the file, checks the hash, and if everything works out, it gets booted. Since the hashes are signed, this is as secure as before. The nice thing is that this stub is now only about 100 kilobytes in size, and you can have another one that has a different command line but may use the same kernel and the same initrd, so for this new generation you just get another 100 kilobytes instead of 40 megabytes, and now Ryan can have his 600 generations in his boot folder again. Obviously, maintaining our own stub is not something that we enjoy too much, so there are discussions ongoing in the systemd bug tracker to upstream this functionality, so that you can reference files on the boot partition instead of embedding them in the systemd-stub; then systemd-stub would supersede the Lanzaboote stub and everyone is happy. No, the other, I said. Awesome, the German is strong.
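The load-and-check step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual stub, which runs as a UEFI binary: the function name `verify_and_load`, the file name `kernel.efi`, and the choice of SHA-256 are assumptions made for the example (the talk only says "cryptographic hash").

```python
import hashlib
import tempfile
from pathlib import Path


def verify_and_load(boot_dir: Path, rel_path: str, expected_sha256: str) -> bytes:
    """Load a file from the boot partition and refuse it if its hash differs.

    `rel_path` and `expected_sha256` stand in for the path and hash that the
    (signed) stub would carry; both names are hypothetical.
    """
    data = (boot_dir / rel_path).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"hash mismatch for {rel_path}: refusing to boot")
    return data


def demo() -> None:
    # Simulate a boot partition in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        boot = Path(tmp)
        kernel = b"pretend this is a kernel image"
        (boot / "kernel.efi").write_bytes(kernel)
        expected = hashlib.sha256(kernel).hexdigest()

        # The untouched file passes the check.
        assert verify_and_load(boot, "kernel.efi", expected) == kernel

        # A replaced file is rejected, because the hash in the stub is fixed.
        (boot / "kernel.efi").write_bytes(b"something else")
        try:
            verify_and_load(boot, "kernel.efi", expected)
        except ValueError as err:
            print("rejected:", err)


if __name__ == "__main__":
    demo()
```

Because only the small stub carrying the path and hash is signed, two generations that differ only in their command line can point at the same kernel and initrd files on disk.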
So this is the boot part of the whole Secure Boot toolchain, but then there's also the Nix part of it. The big thing is the Lanzaboote tool: this is what is being called when you call nixos-rebuild, and what it basically does is take all the different generations you have in your NixOS, assemble the Lanzaboote stubs, prepare the boot partition, and sign everything. This is pretty involved due to non-reproducibility of kernels, so it's a bit tricky at times and we had some issues with that, but I think we're basically down to some polishing at this point to get this right. We also depend on the bootspec RFC, which is at the moment an experimental feature in Nixpkgs and NixOS, where for each generation you get a nice JSON file describing which kernel and which initrd you want to boot, and then the bootloader tooling can just take the JSON and do whatever it needs to do. So this has also been pretty nice. Yeah, how to use this? As I said, you have to do some manual steps to onboard; we've tried to document them as user-friendly as possible given the topic. You are expected to be able to use NixOS, and you are expected to be able to restore your system from a backup if everything goes wrong, but other than that you should be able to set this up if you want to. Of course, if you want this to be an actual security feature then you may want to come back later, but if you really, really want to use it as a security feature you definitely need a BIOS password, otherwise someone can just turn Secure Boot off, and you also need full disk encryption, because otherwise someone can just read the private keys from your disk. That being said, when it all works, it's not much more than sbctl create-keys, enrolling them after going to the BIOS menu, and some very benign NixOS configuration.
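To make the bootspec idea above concrete, here is a minimal Python sketch of bootloader tooling reading such a per-generation JSON document. The field names (`org.nixos.bootspec.v1`, `kernel`, `initrd`, `init`, `kernelParams`, `label`) reflect my understanding of the bootspec v1 layout and should be treated as assumptions, as should the made-up store paths and the helper name `boot_entry_from_bootspec`.

```python
import json

# Hypothetical bootspec document for one generation; all paths are invented.
SAMPLE_BOOTSPEC = """
{
  "org.nixos.bootspec.v1": {
    "label": "NixOS example generation",
    "kernel": "/nix/store/aaaa-linux/bzImage",
    "initrd": "/nix/store/aaaa-initrd/initrd",
    "init": "/nix/store/aaaa-nixos-system/init",
    "kernelParams": ["quiet", "loglevel=4"]
  }
}
"""


def boot_entry_from_bootspec(doc: dict) -> dict:
    """Extract what a bootloader installer needs for one generation."""
    spec = doc["org.nixos.bootspec.v1"]
    # The command line names the generation's init, which is why it
    # changes with every generation even when kernel and initrd do not.
    cmdline = " ".join([f"init={spec['init']}", *spec["kernelParams"]])
    return {
        "title": spec["label"],
        "kernel": spec["kernel"],
        "initrd": spec["initrd"],
        "cmdline": cmdline,
    }


if __name__ == "__main__":
    entry = boot_entry_from_bootspec(json.loads(SAMPLE_BOOTSPEC))
    for key, value in entry.items():
        print(f"{key}: {value}")
```

The point of the sketch is the division of labour: NixOS describes each generation as data, and the tooling that assembles stubs and signs them only has to consume that data.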
Yeah, that's it. I didn't want to go into too many technical things; you can ask me about stuff. Otherwise, you can use it today. I have this running, so if I type device security, you don't see anything, then I have to exit this, and you see that for me Secure Boot is active. As far as collaboration goes (I have to find the button again), the discussion mostly happens on the GitHub repository. Open issues, we respond reasonably fast. We're currently in the process of fixing all the bugs, and I think when we're bug-free we will just call it 1.0. Afterwards there's like a million features that people want and they will all be very cool, but bugs need to be fixed first. We discuss on Matrix in the Secure Boot channel. There are too many security-related Matrix channels; you just join all of them and then it's fine. There's one about bootspec, and then there's one about NixOS and TPMs that I forgot to put on the slide. The Rust OS community has been helpful for the Rust UEFI development; there's a very nice community over there for Rust UEFI programming. You can also just ping me personally on Matrix or on Mastodon, and it should also not be hard to find my Twitter handle somewhere. So that's it, I'm happy to take questions. Oh, so many. So the question was whether I can speculate on the cool features that are about to come. Personally I find all the TPM stuff very exciting, and the problem is mostly that the tooling is completely horrific and all the documentation and terminology is annoying on purpose, so I'm really eager to make this usable for people that don't want to know all the details about TPMs. For example, you want your disk encryption to be unlocked if your system has not been tampered with.
I mean, "not tampered with" is a complicated thing to define, and the usability aspects are hard, but this is something I would really want: if my laptop is in good condition and my TPM believes everything is good, then I don't want to type in my password again. Then the question is whether we have tested it with coreboot. No, but if you deploy coreboot with a UEFI payload it should just work. I mean, it worked for buggy UEFI, and we are compliant too, I think, so if you use the TianoCore EDK2 on coreboot it should work, because this is also what we use in QEMU for testing. So, using it without an encrypted disk: first issue, if you don't put a private key on it and do the signing some other way, which we don't support right now, you could at least avoid this problem. But then there's the issue that if you don't have integrity protection for your disk, someone can just replace whatever's on your disk and boot some other kernel. So the thing is, Secure Boot is one aspect of a secure system, and whenever we start to talk about it, someone comes and wants the whole flower bouquet. So what you say is definitely possible, but somewhat out of scope for the Secure Boot effort. Thank you very much for your time. |
A success story of adopting Nix at a workplace
From reproducible CI builds to production |
Please join me in welcoming the next speaker, Roman. Thank you. Hi. Today I'll tell you a success story of adopting Nix at the workplace, all the way from reproducible CI builds to NixOS in production, and also what we have learned in the process. By the way, you can use the QR code to follow the slides on your devices if you want; I have some links there as well. So first of all, me: well, now ex-Principal Software Engineer at Profian. I've been using Nix and NixOS since 2016. Profian was a security company, and we were custodians of the Enarx open source project. We were mostly focused on developing Enarx and also providing network services around Enarx. But unfortunately, just earlier this week, we actually closed the company because we ran out of money and couldn't secure funding or an acquisition. So Enarx is a confidential computing platform providing you the ability to run WebAssembly within TEEs, so hardware secure enclaves. I'm not going to go into too much detail, but the relevant part here is that Enarx builds trust based upon a remote attestation procedure. How it works is that we, as Profian, build a portable, mostly static Enarx release binary, then we run it and measure the memory pages within a TEE, and then we add that measurement to an allow list in an attestation service. Finally, the users get the release binary and also run it in a TEE, and we, in our attestation service, can verify that they are indeed running this trusted release of Enarx within a real TEE. So there are two questions, of course, which arise: what if the release version of Enarx is compromised, or what if our build pipeline is compromised? And then we can also ask: when we say this release corresponds to this source code, how can users actually verify that it is indeed true?
If you try to do a plain cargo build with a Rust project, you'll notice that you get different binaries depending on the system on which you compile, and so the answer to this is to use reproducible builds, which of course Nix has a way to do. Here is an example: if I compile it in a Docker container and if I do it locally, I get exactly the same binary, so I get exactly the same digest. But how did we actually get here? It all, of course, started with a shell. How do you develop anything without a Nix shell in your project, right? I got positive feedback on this, and, well, we infected the project with Nix. Just a few months after that, something very surprising to me happened: one of our first outside contributions was actually an Enarx build package. I had just added a shell, and the actual build script came from Vincent from Germany. I don't know if you're here, but thank you very much if you're watching this. And so by the time we had to ship a big release of Enarx, we already had all our Linux and Mac binary builds in the flake, and we could also build OCI images in the flake. I did some changes in the meantime. So it just made sense to use Nix for this, and we quickly discovered that building for ARM64 in QEMU on x86 runners was just too slow, so that's why I implemented cross-compilation. It was tricky, but eventually it worked. Unfortunately, it also made all functionality parameterized by the package set, and it made the flakes very difficult to maintain and review, especially for people who are not familiar with Nix. I was the only Nix guy in the company, so it complicated things.
And here's an example. I don't know if you can see it, but before, it looked like just a normal kind of build script. If you did Gentoo packaging, you would kind of understand what's happening here; if you just use Rust, it's kind of clear what's happening. But when you have cross-compilation, it gets difficult, right? You have a linker, you have the CC target prefix. It's not clear; in this let-in block, there are two different package sets. Even worse, we had multiple repositories, and initially we just duplicated our flakes, because we wanted to build everything reproducibly, we wanted to have cross-compilation, and we wanted to have consistent CI. Basically each flake was branched off from a different version of the original flake, because we also kept developing and changing things. So they started to diverge in some subtle ways, but they were still largely doing exactly the same thing. Another thing is that the flake locks were managed independently, so although we could have benefited from some CI caching, we couldn't, because those potential hits were becoming misses. And the maintenance just became a burden over time. So let's take a step back and think about what we actually want to do. We just want to build some static Rust binaries, just like Cargo does, except we want to do it reproducibly. We want to have an OCI image for that binary, and ideally we also want to have a fast CI, but if we can't, all right, fine. We don't really care how any of this is done. All we care about is that if Cargo can do it, Nix should be able to do it as well: just add functionality, not remove any functionality, and not make anything harder for us. And you could say, there are templates in flakes, right? You could just write one once, propagate it across repositories, and use that.
But it's not much different from just duplicating the flake as I did before. You still have to adapt your template for the actual project, you still have to maintain it, and it's going to get out of date. There should be a better way. So that's why I built the nixify library. It's designed to be an easy-to-use, batteries-included library with opinionated but customizable defaults, and it just works well in CI. It doesn't try to cover all use cases, but it should be good enough for most. It simply plugs into your existing language tooling. It currently supports only Rust, via the Crane backend, but it could support other backends as well, and support for other languages is something it is designed for. So what does it actually do? For each default system, it provides you with flake checks for linting and testing, a development shell with your toolchain, a formatter for nix fmt, and release and debug builds for all targets, with cross-compilation, OCI images and whatnot. Here's an example. This is just a Rust hello world application that has one input, which is nixify, and the outputs are generated by this mkFlake function. The absolute minimum is one argument, which is the source. nixify will parse rust-toolchain.toml and Cargo.toml. If you're familiar with Rust, rust-toolchain.toml defines which version of the toolchain I want and which targets I want to have, and Cargo.toml is basically metadata about the package. So let's look at nix flake show. I formatted the output because it just wouldn't fit on the screen. These are the checks: I get my linting, the Rust formatter, and the testing on check. I also get a development shell with my toolchain on all the different systems.
I get nix fmt already predefined, and I also have an overlay with the toolchain and all the packages, including OCI images, and finally there are the generated packages. I have my native default build, and I also have profiles for release builds and debug builds. You'll notice that there are no Darwin builds here, because again I've formatted it. On Darwin you would also see builds for Darwin; you cannot cross-compile from Linux to Darwin, but you can do it the other way around. In fact there is an issue with the Nix package set right now where one of the Darwin systems is not able to compile for the other one; you can check for yourself. So here's an example of Enarx packaging with this tool. I can also configure, for example, some paths to ignore to improve caching, and I can configure the Clippy features, whatever I want to do there. It's pretty small, pretty simple; I can add, for example, OpenSSL to my shell, and yeah, it works. And Enarx itself is, by the way, anything but a simple project: it's using nightly Rust, it has seven crates in the workspace, and we use bindeps as well, so sometimes we build for three different targets at the same time, basically the shim and the execution layer, which are merged later into one binary. So this is what it means to build something in CI with this: we simply have a matrix of all our hosts and targets, and we simply do nix build, and this is consistent and the same everywhere. The only difference would be the name of the package, but again, that could be removed if I wanted to. Next we have testing and linting, which is pretty much exactly the same thing; again it could be a shared GitHub Actions workflow used everywhere.
So how do we actually maintain and update this? The nixify changes were propagated automatically for us: we used a GitHub Action to do the nix flake update. The changes were already reviewed by me, so I've audited the changes essentially, and then anyone in the team can (well, you can't see the bottom) review and auto-merge this, because the team trusts me. This brings me to a trust question, which is an open question I just wanted to raise. The nixify state is essentially the root of a Merkle tree, because it includes all the dependencies and the digests of those. I audit the state of the world, so the Nix package set and all the dependencies. The team trusts me, therefore they trust the world, or my audit of it, so it's a transitive trust. But then how can the team actually verify my audit? Can I sign the nixify root in any way? Can I maybe add my signature on this, so that Nix could validate my signature? Maybe it could be a parameter to delegate trust, only for these projects. So that's just an open question; if you have answers, please let me know.
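The "root of a Merkle tree" remark can be illustrated with a small Python sketch: a single root digest that commits to every pinned dependency, so that auditing (or signing) the root covers everything beneath it. This is a generic Merkle construction written for illustration; it is not how Nix or a flake lock file actually computes anything, and the dependency names and digests are invented.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a list of leaf values into one root digest."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Prefix bytes keep leaf hashes and inner-node hashes distinct.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:          # odd count: duplicate the last node
            level.append(level[-1])
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def root_of_pins(pins: dict[str, str]) -> str:
    """Root over (name, digest) pairs, sorted by name for determinism."""
    leaves = [f"{name}={digest}".encode() for name, digest in sorted(pins.items())]
    return merkle_root(leaves).hex()


if __name__ == "__main__":
    # Hypothetical pinned inputs with invented digests.
    pins = {"nixpkgs": "sha256-aaaa", "crane": "sha256-bbbb", "rust-toolchain": "sha256-cccc"}
    audited = root_of_pins(pins)

    # Changing any single dependency changes the root, so a signature
    # over the audited root would no longer match.
    pins["crane"] = "sha256-evil"
    print(audited != root_of_pins(pins))
```

Under this view, the open question raised in the talk amounts to where a signature over such a root would live and which tool would check it.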
Eventually it was time to deploy our services. For Enarx we require some out-of-tree kernel patches, the Intel-provided AESMD and PCCS services running, and some udev rules for the hardware. We had no dedicated operations engineer, and all our repositories were already nixified. We also had two cloud providers, AWS and Equinix, and both had NixOS support out of the box. Before we had Nix, we used custom OCI images for PCCS and AESMD. Those were basically compiled once into some binaries that were just put inside the image, and those images were outdated and largely unmaintained; no one knew what to do with them. We required custom provisioning steps, like podman secret create or something, and we also needed manual udev rule setup. So once again open source came to help: PCCS and AESMD turned out to be supported by NixOS already, and the Intel SGX hardware was supported as well. By the way, again it was either Vincent or someone from his team who added this, so again, thank you. So these six lines were enough for us to enable the services and all the hardware support. The next step was that I added some simple NixOS modules for our services, set up secret management with sops-nix and deploy-rs, again simple tooling, automated deployments to testing environments, and set up CI to test all PRs, so if CI fails the PR doesn't go through and we don't update anything. And of course we have the automated updates, just like for every other repository. To begin with we ended up doing this: we were tracking tags of different projects. You can see there are three groups; these are the services in different environments, so we have three different inputs, and essentially we progress: we test things in staging, then we go to production.
If I were to redo this I would actually use branches, something like Nixpkgs channels, where you essentially, rather than doing a release, merge to, say, unstable, then you test that, and eventually you promote it to stable. I think that's more work upstream, but it's easier to maintain downstream; it just makes sense. So you could ask, okay, why don't we just use OCI for this? The thing to understand with Nix is that we get source code references, not binary references, so essentially we get all the benefits of binaries without sacrificing any usability. In fact we even get things like updates, and the flake.lock is really nothing else than a software bill of materials, which is the buzzword today that everyone really cares about. With the flake.lock you can audit the whole state of the world: you can audit all the service source code, you can audit all the build dependencies, and you can also audit all the tooling that you used to deploy this stuff. And you have a consistent, simple update procedure, which is nix flake update. It's super fast to deploy once you have your modules set up and you know how to use Nix; it's just, boom, you're there, and of course you get rollbacks with NixOS, or really with any deployment tool that uses Nix. The next step was providing things like AMIs and whatever else, so I used nixos-generators for it. The way I did it, I simply imported the NixOS module, a common module from our infrastructure, and it took me less than a day to set up, and this is for SGX and AMD SEV. Again, it's extremely powerful and extremely simple to use.
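As a rough illustration of reading a flake.lock like a bill of materials, the Python sketch below lists every pinned input with its revision and content hash. The sample lock file is invented and heavily trimmed; the `nodes` / `locked` / `rev` / `narHash` field names follow the lock file layout as I understand it, and the helper name `list_pins` is hypothetical.

```python
import json

# Trimmed, invented lock file; revisions and hashes are placeholders.
SAMPLE_LOCK = """
{
  "nodes": {
    "nixpkgs": {
      "locked": {"type": "github", "owner": "NixOS", "repo": "nixpkgs",
                 "rev": "1111111", "narHash": "sha256-AAAA"}
    },
    "nixify": {
      "inputs": {"nixpkgs": "nixpkgs"},
      "locked": {"type": "github", "owner": "example", "repo": "nixify",
                 "rev": "2222222", "narHash": "sha256-BBBB"}
    },
    "root": {"inputs": {"nixify": "nixify", "nixpkgs": "nixpkgs"}}
  },
  "root": "root",
  "version": 7
}
"""


def list_pins(lock: dict) -> list[tuple[str, str, str, str]]:
    """Return (input name, source, revision, content hash) for every pin."""
    pins = []
    for name, node in sorted(lock["nodes"].items()):
        locked = node.get("locked")
        if locked is None:  # the root node only lists its inputs
            continue
        source = f"{locked.get('type')}:{locked.get('owner')}/{locked.get('repo')}"
        pins.append((name, source, locked.get("rev", "?"), locked.get("narHash", "?")))
    return pins


if __name__ == "__main__":
    for name, source, rev, nar_hash in list_pins(json.loads(SAMPLE_LOCK)):
        print(f"{name:10} {source:28} {rev} {nar_hash}")
```

Since every entry names a source revision rather than a prebuilt binary, a reviewer can go from each row straight to the code that was pinned.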
I do have to mention that we asked ourselves a question: how many enterprises would actually use NixOS to deploy Enarx? Probably not so many. So once we got a dedicated operations engineer, we eventually moved most of our services to Kubernetes. But there was a little win, because every service that required a custom kernel, custom services, or custom udev rules, things that are difficult to set up, was kept on NixOS. This brings me to the next point: Nix and NixOS, in my opinion at least, make difficult things simple. In this case, for me it was great for prototyping, for productivity, for composability, for building literally anything anywhere, and for auditability and trust. One particular thing I want to mention about productivity, and if you ask me, one killer feature of Nix, at least from my perspective, is this: we had, for example, lab machines with SGX and AMD SEV hardware, and countless times when I was developing something on a feature branch and needed to test it out, I would just SSH in and do nix run, github, colon, my repository, my branch, and run it. I don't need to use git to add a new remote, I don't need to set up my toolchain, nothing; I can just SSH in and run it. I think that's really powerful.
Another thing is that if you introduce Nix, of course it's also your responsibility to teach it, so you are the FM in RTFM. In my case I had Nix 101 classes on Fridays. The really uncommon thing about Nix, I think, for people is the functional programming part, so I ended up teaching a lot of functional programming principles to explain Nix, and also introducing the team to the ecosystem, because if you're a newcomer to Nix, it may be overwhelming: it's not clear why there's Nix, the Nix package set, and NixOS, and how they're related. I also created a Nix channel for questions, so whenever people had questions I would just answer them. Some observations from this: it's really important to give people examples of how to get things done, real examples, and it's also great to give analogies with known technologies. Okay, this is not exactly what you would do in Nix, but I implemented Fibonacci in Nix and in Rust and showed the two side by side. I have a main function here in Nix, I have a main function over here, and both print Fibonacci numbers, so it just works. You don't see it, but the output here is the same. So this is an example of how you can show people that, you know, it's not so scary after all; it's just the same function, just look at it, the same thing. One thing I also noticed is that Nix is perceived as being pretty novel. I asked people, okay, how long do you think Nix has existed? And I got answers all the way from two to ten years, where ten was like a real stretch: no, it can't be that long. But the real answer is twenty years, if you look at the git log. I don't know, maybe it was even earlier, but that's what I see from the git log.
Some frequently asked questions I've received: okay, I'm not on NixOS, how do I use Nix? Or, I'm trying to use flakes but I get an error; why are flakes experimental? Is it safe to depend on experimental features? What if they get abandoned? What about shell.nix? A few things we can note here: at least in my experience, Nix is tightly associated with NixOS, but we need to understand, or at least explain to newcomers, I believe, that Nix is not just another package manager. NixOS is just one possible output of nix build, but it's just that; Nix can build many other things, and I think we should present it as a generic build tool which is not tied to a particular OS. Sometimes, for people familiar with Docker, I even presented it as something like Docker or Podman, where you basically have a Dockerfile, you can build the Dockerfile, it's put in your store, and you can run it afterwards. It's something like that, maybe. And yeah, this thing was such a pain. So come on, guys, we need to do something about this. I mean, you hear about this Nix thing, it's so cool but it's so difficult, so out there, and your very first try with Nix is a failure. I understand that this error is descriptive; yes, you read it and you kind of understand what to do, mostly, but it's just not a good developer experience, and I don't know what to do. There could be possible solutions, but we really need to do something about this as a community. So thank you. As I said, Profian closed, and I'm looking for a job, by the way. Any questions, maybe? Yeah, good, right.
So the question is, can Nix be a generic build system like Meson or Autotools or whatever? I'm not a C developer, I never was, but I believe so. At least for Rust it was a breeze to do, and I can definitely imagine doing that for Go as well. For example, we had a collection of examples where we mostly compiled to WASI, and for C it was actually very, very simple, so I would think so. The only thing to care about, though, is that it's great for releases and not so great for development, because, for example, with Rust you have this target directory with all your cache, and you don't get that anymore. I mean, you can do things around it; you can maybe build some libraries to achieve that. Maybe you can do it and contribute. If I may add to this, there is some RFC on how you enable Nix to make replacements, but there are many more things, which are a bit complicated, about how you make it fast. Okay, okay, I'll check it out, thanks, I didn't know that. Thank you. |
GStreamer: State of the Union 2023 |
So, we start with the first presentation, by Olivier Crête, on the GStreamer State of the Union for this year. So, hi everyone. I'm Olivier Crête. I've been a GStreamer developer for 15 years now, and I'm going to tell you a bit of what has happened in the GStreamer community in the last two years, since we last met here. So, we have had two major releases, 1.20 and 1.22. Quite a lot of merge requests, as you can see. And an interesting fact: in 1.22, a third of the commits were in the Rust modules. So we've been investing a lot in Rust in the community. There's a lot of excitement there. One of the big things that we did for us as developers was to merge all of the various git repositories into one big giant repository. So now everything is together, except for Rust. The Rust modules are all in their own corner because they're released with the gtk-rs infrastructure, so for Rust it's these three. That's our big change for the developers, but it doesn't change much for the end user, because we still release all the packages as separate tarballs, just like these. Another thing that we did on the build infrastructure side is that we've improved our build system, and now we can select the elements one by one: not just the plugins, but exactly which elements you want in a plugin. And then we can also link all the plugins, all the elements, and all the dependencies into one giant library. This actually makes it easier to make a smaller build, because we can build exactly the functionality that you need for a specific application and create one big library that only has the exact functions that are being used and nothing else. We did that mostly for embedded systems, so that you can have something a little smaller. Another area where there has been quite a lot of excitement in the last couple of years is WebRTC.
So, as probably all of you know, WebRTC is a way to send video and audio in real time from browsers, and GStreamer has one of the most complete implementations outside of the Google implementation that's used by the browsers. We were missing one big bit, and that was congestion control, and that's been added in the last releases. So, now we have a module that is compatible with what's called Google Congestion Control, which is what Chrome and Firefox and Safari use. And this is in Rust. To make that work: we have a WebRTC implementation, but it did not do any of the actual encoding; that was left separate on purpose. Now we have a module in a Rust plugin that will plug together the encoder and the WebRTC and do the congestion control, so that you can adapt the bitrate of the encoder to the network, and this is all automatic if you use this plugin. We also have WHIP and WHEP, so these are also written in Rust. WHIP and WHEP are a way to replace RTMP, but based on WebRTC, so it's a way to set up a WebRTC stream with a single HTTP request. It's mostly meant to stream to and from a server, so it's really a replacement for RTMP for low-latency video transmission. Speaking of WebRTC, WebRTC is based on RTP, and this is an area where there's also been quite a bit of development. So, starting with SMPTE 2022-1 forward error correction: that's a system used mostly for legacy broadcast transmission, and we have the 2D forward error correction. So what does 2D mean? Forward error correction is basically this: you XOR multiple packets, then you generate a new one, and if you have all of these packets except one, then you can regenerate the missing one, right? That's what forward error correction is. Traditionally, you would take packets one, two, three, and four, and then you would generate a fifth packet. But losses tend to come in bursts in networks.
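As a minimal sketch of the XOR parity idea just described, assuming equal-length payloads and one repair packet per group of four: the function names and the toy payloads are made up for illustration, and this is not the GStreamer or SMPTE 2022-1 packet format, only the arithmetic behind it.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one repair packet as the XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild a group in which at most one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair only one loss")
    result = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with all surviving packets yields the lost one.
        result[missing[0]] = reduce(xor_bytes, present, parity)
    return result


# Hypothetical payloads, all the same length (real FEC pads shorter packets).
group = [b"pkt1", b"pkt2", b"pkt3", b"pkt4"]
parity = make_parity(group)
repaired = recover([b"pkt1", None, b"pkt3", b"pkt4"], parity)
```

The `ValueError` branch is the weakness mentioned in the talk: a burst that removes two neighbouring packets from the same group cannot be repaired by that group's single parity packet.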
So with 2D error correction, we have this traditional version, and also a version where you take packets one, and five, and ten, right? So if you have a burst, you can recover more of the packets. The other thing that we've added: GStreamer, for a long time, has had the API to add RTP header extensions. That's a way to add to each packet a little header with something else. So for a long time, we had libraries to actually write these, but we didn't have something in the system to easily plug in something that inserts this header in every packet without having to write application code. So this is something that we've added, and we've added a bunch of different modules. The first one is the client-to-mixer audio level. This makes it possible for a sender of audio to say, the volume that I'm sending is this, so that a mixer or some kind of service can select the person who speaks loudest without having to decode all of the audio. So it can know from this level who's speaking loudest and just forward that one. Then, color space. So this is for VP9. If you want to send HDR over VP9, we now have this RTP header extension to make it work. It's compatible with Chrome again, so this one is somewhat of an experiment. And we have an AV1 payloader and depayloader, so this is, I think, probably the first case where we've decided that the official implementation of a major feature is only available in Rust. So this is something that we're pretty happy about. Another thing: H.264 and H.265 have two kinds of timestamps, the presentation timestamp and the decoding timestamp. When you send RTP, you normally only send the presentation timestamp. You need to apply an algorithm to recover the decoding timestamp. We now have modules that generate that. We also support RFC 6051, so that's a way to synchronize streams immediately.
So traditionally with RTP, we send two streams, audio and video, with separate timestamps, and then some time later you get a packet telling you what the correspondence is. With RFC 6051, we add an RTP header extension in every packet so that we can be synchronized from the first packet. And we also improved our base class for video decoders a bit. So now it can recognize that there's a corruption and use that to request a retransmission. Previously, we kind of flagged the error, but we let the application make the decision. Now we've added something to the base class. Another big feature that was worked on is that we basically rewrote the HLS and DASH base class. The previous one was over 10 years old and had been written largely without the specs. And even when we had the specs, HLS has changed quite a lot over the last 10 years. So now we have almost a state-of-the-art implementation based on 10 years more knowledge. It's much simpler, has fewer threads, and has much better control over how we download things and over the buffering. We do a little bit of the parsing in there because, sadly, for many of these formats, you have to parse the bitstream to handle it properly. So this is all implemented as one now. We've done a few things around decoding, mostly video decoding. One thing I'm quite excited about is the subframe decoding, which has been quite a few years coming. This means that we now have infrastructure in our base classes to start decoding a frame before you receive the entire frame. Some formats, H.264 and H.265, can split the frame into slices. And from the first slice, you can actually start doing the decoding. We have two implementations of this. One is based on FFmpeg, which can do this only for H.264. And the other one is for the Xilinx hardware, because they have the hardware features to do that. So they can do super low latency decoding. We have WebM Alpha. So the WebM formats, VP8 and VP9, don't have support for transparency built in.
But there's a WebM extension where we basically store two separate video streams, one with the colors and one with the transparency. And now we have an element that will spin up two decoders and then recombine them into a single stream with an alpha channel. We have a Direct3D 11 library, to make it easier to integrate Direct3D 11 applications with GStreamer to do zero-copy encoding, for example, from a Windows application. And also speaking of Windows, our Direct3D 11 decoders are now the default. So they're becoming the choice that will get auto-plugged if you have them, and you get hardware-accelerated decoding on Windows. That works. What about MacBooks? Yes, there's also... We've got a question already. So we have a hardware decoder for Mac and iOS, which is the same idea. CUDA: some people use proprietary software and proprietary drivers. So we now also have a CUDA library, so that you can more easily insert CUDA data into GStreamer for encoding, decoding, all these things. We have some more CUDA-native elements, one that is a converter and a scaler, using CUDA itself. We have CUDA and Direct3D integration, for Windows again. And this whole thing basically gives you zero-copy encoding on NVIDIA hardware, especially when you combine it with some other CUDA-based software. But some people prefer free software. So we also have VA-API; we have a new plugin for VA-API. We've had gstreamer-vaapi for a long, long time. It was getting quite creaky. It was not based on any of the base classes that we have improved since then. So it's been completely rewritten from scratch as a new plugin that we call VA. It supports all the major codecs that we've implemented now, all the ones with VA. It supports AV1, H.264, H.265, VP8, VP9, MPEG-2. For encoding, we only have the H.264 and H.265 codecs for now, but the rest are being worked on as we speak. And using libva, we have a few more features. We have a compositor.
It's an element that will take two or more streams and composite them into the same video. We have a deinterlacer. And we have a post-processor element with scaling and color space conversion, using the video functionality instead of the GPU. And a bunch of work has happened around AV1 in the last two years, so we have quite a lot now. We have support for AV1 both in the legacy VA-API plugin and in the new VA plugin. We have it for AMD encoders. We have it for Direct3D on Windows. Using the NVIDIA APIs. For Intel, using their multiple libraries, either Quick Sync (QSV) or the new VPL SDK. So we have pretty comprehensive AV1 support, in addition to the RTP plugin that I mentioned earlier. Another new thing that we've done: this is our first official machine learning integration that is in GStreamer itself. So it's the first step. We've written a plugin that uses the ONNX Runtime from Microsoft to basically take some video frames and some model, recognize objects, and put little boxes in the metadata, and then another element that can take these boxes and draw them on the image. So this is the first step. A lot of work is happening right now to have a better video analytics framework as part of GStreamer. With all these things, sometimes you want to have a UI, and a few features were recently added there. We have a GTK4 paintable, so that's an object, I would say, that you can use with GTK4 to actually draw something on the screen. So now you can easily integrate GStreamer with GTK4 to do zero-copy playback. This one is in Rust, which is kind of cool. Qt 6 has that thing as well, so we have something that is very similar to what we have for Qt 5, which is a QML item, so that you can integrate a GStreamer sink with Qt and draw the output in your Qt application. And the last one is a niche case.
So we've had a Wayland sink for a long time, and what this Wayland sink allows you to do is basically take the video buffer and give it to the Wayland compositor directly, without going through the toolkit. So you can use the 2D hardware planes of the platform. This is mostly useful on embedded. It allows you to do things like get greater performance and not use the GPU on embedded systems, where the GPU might be too slow. Up to now, this all worked fine, but you had to write low-level Wayland code, and that's non-trivial. So we've written a GTK widget that wraps it up. So now you can write your application in GTK, just add the widget, and you get all of these performance benefits for free. Last but not least in this category, we added touch event navigation. So previously we had navigation: we could send letters, we could send mouse clicks. But now we can also send touch events, so that you can have elements in your pipeline that are controlled by the user, such as WPE, a web view, a WebKit-based source. And we have some new tracers, so these are tools for developers to know what's actually going on in a pipeline, live. We had a bunch of tracers already; four more were added. Some of them are quite useful, like we have one to record the buffer lateness and one to trace the queue levels, and these will output the information in a CSV file that you can then load to make nice graphs and understand the live performance of your pipeline, what's going on. We have one to draw pretty pipeline snapshots. For a long time we've had a feature where we could write a dot file to draw a picture of the pipeline, but this required adding some code to the application to actually trigger it. With this tracer, you can now just send a Unix signal and trigger it on the spot. The last one is the factory tracer. This relates to the very first feature that I mentioned.
So it's nice to say, oh, I'm going to build a GStreamer build specific to my application with only the elements I use. But if you use playbin, there are a lot of automated things, and you might not know exactly which plugins you've been using. So with this tracer, we can actually trace all the plugins that get loaded and all the elements that are used, and when you exit your application, print the list of what was actually used. A question about that: playbin sometimes tries to use a plugin but discards it later because it didn't work. Is it going to be listed anyway? So yes, right. Playbin sometimes tries to use something, but it doesn't work because the hardware is not there or something. In this case, the tracer will still list it. It's everything that is loaded, right? When they're tried, they're loaded; you call a function in it and it says no. So this really happens at the loading stage. Thank you. Any questions? Yes. You mentioned AV1 and the RTP support for it. Can you also handle the SVC extensions? I have no idea. Anyone know? No. So the RTP AV1 extension, the SVC extension, I don't know if it's there. We do layer selection of the highest-quality layer, but it's not there. So there is a dependency descriptor RTP extension where you can get information about the SVC layers. You encode the SVC layers into one stream, then you use this extension to carry information about what's in there and what isn't, so that you can do the selection. That's needed for the RTP and SVC stuff, including AV1, where it's basically required because there's no SVC information inside the stream. Yes, so the question was about RTP, AV1, SVC: there's an extension that can make it really useful. It's being standardized and it's not implemented yet, but it will be at some point.
I forgot to mention we have online questions, at number 43401. So if people at home want to ask questions, it's possible. But since we don't have any questions there, are there any more questions on the floor? Qt 6 does support different rendering backends, like DirectX on Windows and Vulkan on Linux or something. Does the QML item support them, so that you can directly pass the DirectX buffer to the QML item? So, does the Qt 6 element support other backends than OpenGL? And I think the answer right now is no, it only supports OpenGL. Any more questions? Yes. Can you do a statically linked binary? Can you do a statically linked binary? Yes, you can. That's kind of one of the use cases: we create this statically linked library, and then your application can link to it and only link the required bits. That's part of the trick to make it a little smaller. Yeah, my question: you said that there is congestion control in WebRTC now? Yes. Is it the same feature set as in Google's implementation? So, yeah. The question is about congestion control in WebRTC: is it the same feature set as the Google implementation? As far as I know, yes, because it's basically a copy of the implementation, in Rust. They basically re-implemented the same algorithm. So that is compatible. But it's pluggable, so you could write your own. There's a plugin mechanism; the core version is in C, but this one is in Rust, in a separate plugin. One could write a different implementation, because there's a bunch of heuristics in there. There's no one perfect answer. Thank you so much. Do you have a question? Yes. If I have an application that does WebRTC signaling over Matrix, for example, would I benefit from switching to webrtcsink, or should I stick with webrtcbin? So the question is, if you have an implementation of WebRTC that does signaling over something custom, for example over Matrix, would you benefit from using webrtcsink?
And the answer to that is yes, because it will do all of the encoding and congestion control, hooked up for you. And there's an interface that you can implement for your own signaling. So the signaling is separate from this webrtcsink. It's a module that you can implement. We have one implemented for a custom signaling mechanism, where there's a server that's implemented, but you could write your own. Can I ask the last question? No, just a comment on the question before, about the Qt 6 Direct3D integration: there is a merge request for it. Okay. So Tim says that for Qt, there's a Direct3D merge request open to integrate that. It's not merged yet, but do test it at home and complain when it doesn't work. Last question. Okay, thank you. |
PipeWire state of the union
What is and what will be |
So let's start with our next talk. Sorry, we're running a bit late because the projector was not working, so we have a brand new TV now. So the next talk is about PipeWire. Yeah, I'll give the talk. I'll start with what it is, and then we'll talk about what's going on. So a bit about what it became since last time and what it currently is. So it's kind of like a sort of... Its functionality is basically to show objects in a graph and pass data between them. There is, for example, a session manager called WirePlumber, which orchestrates the graph and all of the objects in it. Then we have a couple of services. For example, there is a PulseAudio service that converts PulseAudio clients to PipeWire. We also have a replacement library to run JACK clients. The purpose of this is to share multimedia. So we have video sharing, simple actually, but also the audio functionality of an audio server. At its core, it's about sharing data very quickly, zero copy. It was originally made to do screen sharing on Wayland. So you need to pass video from the GPU to clients without it being touched by the CPU. So memory passed using file descriptors, stuff like that, zero copy. It turned out in the end that it was also going to work for audio, passing audio around, and then I started working on that. So the audio part is very much like a JACK audio server, very simple in how it passes the data around. But on top of that you can build all the existing audio solutions. So this is the current situation of what we have for audio and video. You have the Bluetooth stack with BlueZ, which goes over D-Bus. There is also libcamera now, there will be talks about that, to interface with the kernel; originally that was Video4Linux. There is still ALSA to do all the audio stuff. And then we tie it together with udev. We have the session manager and PipeWire there. And then there are the applications, which go through a replacement PulseAudio server, or are native, not so many, or are JACK clients, which go through a really good translation
layer. So originally it was implemented for Wayland screen sharing. Mutter, which is the compositor, would expose inside the PipeWire graph what's called a node, which is something like a stream; it produces a video feed. And then you can connect with clients and consume that data. So this is all out of process: different processes connect to the PipeWire daemon. They grab a piece of the graph and consume or produce. So you can produce data: the compositor produces, and clients consume. You can branch out, you can mix streams, and PipeWire makes sure that the data flows around. For audio support, it's a pro audio model like JACK. Everything is floating point in the graph, the audio feed. All channels are split into separate channels. There is one buffer size for the whole graph, just like JACK, but it is dynamic, it can change. There is automatic format conversion to or from sources and sinks, with all the fancy stuff. And it has a couple of other things, the clocks, and stuff like that. So with that core layer, you can run JACK clients with a very good translation. And with a little server, you can run PulseAudio clients as well. So it basically copies stuff that was already there, partially from JACK, but also from PulseAudio; the timer model is also used. It uses a copy of all the PulseAudio stuff for managing the cards and the mixers. You plug in the headphones and it switches things, and it remembers volumes and controls the volume sliders and all of that. That was just copied directly, because that piece has evolved a lot over the years. So you can run all the JACK clients like that, it's pretty cool. And PulseAudio clients will just show up there, and you can run them together like that. PulseAudio clients also run as normal. So it was originally started for video, the screen sharing, then there was a little detour to audio, but we're actually back to video now. And now we have work to do on the video side. So now we're
basically working on the camera capturing stuff. So traditionally, browsers, for WebRTC, go directly to Video4Linux, using ioctls and stuff like that. So it's very difficult to put anything in between there. This all needs to be rewritten, and this is in progress. But also, newer cameras need more stuff than the normal devices. There is a new library called libcamera that orchestrates all of this. You'll hear more about that in the libcamera talk. But you can't just go directly to Video4Linux anymore, so a layer is needed. So we need to rewrite the applications anyway, and then it's probably better to rewrite them for PipeWire, so that you can route the video more flexibly. So basically, in PipeWire we provide cameras, and you can get them from clients. This also means that you can have multiple apps going to the same camera. So the status currently: it's been in Fedora for a while now, almost two years. It is API and ABI stable. So far we can make this work. Yeah, it was made the default, against all expectations. I was a bit afraid that it wasn't going to work, but it works better than expected, so it became the default. For the moment, JACK and PulseAudio feature parity is the target. Everything should work as it worked before. You might have noticed there are a couple of things that are not implemented, like netjack. There are alternatives. This is to connect multiple JACK servers on the network. It's a very specific implementation, and a bit finicky maybe. There are other alternatives. Yes, we have an RTP-based solution that is more compatible with actual hardware. And there's a whole bunch of stuff that maybe nobody uses. So right now most of the distros are switching as well, or have switched. I think Debian is getting there. Ubuntu switched for 22.10. They have no more PulseAudio. You have to note, though, about "no more PulseAudio": the only thing that is changed there is the server part of PulseAudio. The client part of PulseAudio is still in use. But it
talks to a different server. It's a re-implementation of the server part. So now, what are we doing? Bug fixing. So, bug fixing... Because it mixes only on the right. If you're on a laptop speaker, you only get it in the right channel. No, the right channel is his lapel. Okay. It's not true, but okay. It's easier. So the camera elements, libcamera, that's what we work on, to get them all integrated. We do have a BoF about this on Sunday. So there's a whole bunch of things that need to be done. Like for example, for the camera, there's also a portal. So we try to hide everything now behind a D-Bus service called the portal, which will actually manage the permissions, like: can this application access the camera, yes or no. It's sort of like on other systems, where you actually give an application access to do this and this. So that goes through the portal. We try to hide everything behind that, but it requires lots of applications to be adapted. So for example, OBS Studio is a bit of a test case there, for the camera and screen integration. There are merge requests ready to be merged, but it's ongoing. Also Firefox and Chrome, for their WebRTC implementation, have patches that were, I heard, merged two days ago. Some of them. There are some more things needed, but it's ongoing. So the end result is that if you do a call in one of these browsers in a future version, it will go through PipeWire to access the camera. And then you can do lots of fun stuff. You can add filters to cameras, or, for example, in OBS Studio you can make a virtual camera and then have that camera used as an input for the browser. You can do all kinds of fun things once that's in place. Yeah, this is all I had to say. Do you have any questions? Yes, yeah. So, JACK also has network functionality that allows you to send audio over the network. There was an experimental patch that allows you to apply compression to that. Do you have anything like that?
Repeat the question. Yes, so JACK had a network transport... Can you repeat the question for the stream? Yes, so the question was: JACK had a network transport, and it also allowed compression, I think it was Opus. We don't have an alternative for that. Are there any plans for that? No concrete plans, but it sounds like a useful feature. Yeah. Like for example now, for the network things we tried to use AES67, which is RTP based. It's uncompressed, but with very small packets. Okay, I'm just curious what kind of product actually uses that stuff. It might... Following up on that: there is a Roc plugin. Apart from that, would that be any help? Yeah, so the suggestion is that there is a Roc plugin. The Roc Toolkit is built on RTP and also provides network transport. I don't know, does it do compression? I don't know. But it's a bit more generic streaming. I don't know if it's suited for doing extra low latency communication. It's more like generic streaming. Another question: are there any plans for introducing signal processing, echo cancellation, noise reduction? Yeah, so the question is, are there plans to add echo cancellation, noise reduction, other plugins? Echo cancellation we have. We have a module where you can load echo cancellation, using the WebRTC echo cancellation as a loadable module. As for signal processing, there is also a whole bunch of things. We can of course load JACK tools to do a full set of things. We also have a native module that can load LADSPA and LV2 filters, all those kinds of filters, which you can use to construct filter chains. There's also EasyEffects, which is an app to manage the filters for you. So there are quite a few options. We have an online question: is there a way to visually show PipeWire graphs?
Yes. Yeah, so for visualizing the PipeWire graphs, you can of course use the JACK tools. There's also a native tool to do this. Then you can look at your graph and see how your signals are routed. Can you change it? If you use a native tool, you can also route the video. So you said you had AES output. Are you referring to AES67 here? And if you are, how are you dealing with the clocking elements of AES67? Because you need a PTP grandmaster. Are you just ignoring it? No. So the question is: for AES67, you need a synchronized clock, synchronized with PTP. So you have to set that up for the machine. But what happens if you don't? Do people just send garbage timestamps by default? Because this is a big problem for receivers, to look at the timestamps and go, I need to do something with it. The question is, if you don't have a PTP clock, can it still work? Yes, it can still work, but with reduced synchronization, of course. What should a third-party receiver do? Because if they have two different clocks, what does it do with the timestamps? Just like this, I guess. So the timestamp in the receiver... The question is what the receiver does to synchronize itself to the stream. Basically, follow the rate at which you receive data and the rate at which you consume it, and then try to adjust the consumption rate by resampling. Next question. You talked a lot about portals for video. Is there also something similar for audio? Like for example, I want to capture an application using video and audio. The question is: we talked about the portal for video; is there also a portal for audio? The answer is we are thinking about it, but we don't have anything concrete. Is there anything for networked video? The question is, is there anything to use for networked video? No, absolutely not. Last?
Not sure. There are a lot of things you can do. Go in fast. Go away. Just try and work. There are a lot of things you can do. I don't know what to do. Let's start from there. (applause) Thank you |
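The answer to the AES67 clocking question, follow the rate at which data arrives and adjust the consumption rate by resampling, can be illustrated with a small sketch. This is a plain proportional controller on the receive-buffer fill level, not PipeWire's actual algorithm; the function name, the gain, the clamp and the buffer sizes are all assumptions chosen for the illustration.

```python
def resample_ratio(fill: float, target: float,
                   gain: float = 0.05, max_dev: float = 0.005) -> float:
    """Return how fast to consume input relative to the local clock.

    A buffer fuller than the target means the sender's clock runs fast,
    so the receiver consumes slightly more input per output sample
    (ratio > 1.0). The correction is clamped to keep pitch changes small.
    """
    error = (fill - target) / target
    correction = max(-max_dev, min(max_dev, gain * error))
    return 1.0 + correction


# Simulated receiver: the sender's clock is 0.1% fast (hypothetical figure).
QUANTUM = 1024            # samples consumed per cycle at ratio 1.0
TARGET = 4096.0           # desired buffer fill, in samples
SENDER_SPEED = 1.001      # samples arriving per local sample

fill = TARGET
ratio = 1.0
for _ in range(5000):
    ratio = resample_ratio(fill, TARGET)
    fill += QUANTUM * SENDER_SPEED   # data arriving from the network
    fill -= QUANTUM * ratio          # data consumed through the resampler
```

After the loop, `ratio` has settled on the sender's speed and the buffer stops growing, which is the behaviour the answer describes: without a shared PTP clock the receiver can still play the stream, it just tracks the sender instead of being synchronized to it.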
Modern Camera Handling in Chromium
Implementing Camera Access with xdg-desktop-portal and PipeWire in Chromium |
So we continue with our next talk, sorry for all the technical problems we have today. Our next talk is by Michael Olbrich about modern camera handling in Chromium. Thank you. So I'll talk a bit about camera handling in Chromium. I think Wim already teased some of the things at the end of his talk, so you'll notice similarities or recurring things that he already mentioned. But I'm going to dive deeper into how this actually works and the implementation, how we got to where we are now, and what's still missing. Okay, so let's start with a short bit about me. I'm Michael Olbrich, and I work for Pengutronix. I'm an embedded Linux developer, mostly doing graphics kinds of things. So what's in Chromium right now is basically Video4Linux. Is there anyone here in the room who doesn't know what Video4Linux is? A few people. Okay, just a few words. Video4Linux is the kernel API to talk to cameras, or other devices too, but the interesting part here is cameras. It's a relatively simple API: just set the format and give me a frame. And if you look at the Chromium code, it's, I'd say, well, there's little development, but it's basically done. We'll get back to that a bit later, because that's why I'm mentioning it. I mean, the API is stable, it hasn't changed a lot, and the camera is working in the browser, so there's not much to do to change that. And then I started with a project where we wanted to do more with cameras on a Linux system. It was a more or less embedded device, and the first thing was that we wanted to choose whether an application can use a camera, and which camera it's supposed to use or allowed to use. And that's currently not possible, because Chromium talks directly to the kernel and just picks the camera, right? And then there were web pages that didn't use the new API to do screen sharing, so we wanted to basically pipe in the screen as a camera into this web page.
That was mostly in the earlier days of the video conferencing kind of things, and more the commercial applications that had their products ready and done and didn't want to implement new kinds of things. It's getting less, but some of them are still out there. And then there were some cameras that only do H.264, and that's not supported; I don't think any browser on Linux can actually handle H.264-only cameras. And then I wanted to attach IP cameras. So I have the camera maybe back there in the room, and I want to use that as a camera in a video conference call, right? If you have a bigger room and you don't want to use a long USB cable, then that would be ideal, but I can't attach an IP camera to Video4Linux, right? So that's what I was starting with, but these are not exactly common use cases. I was looking for something I could argue with: what can I do to bring this to Chromium mainline? So if I want to implement something there, I need something more, yeah, good, an excuse exactly, to say this is useful for other people too. So I started looking, and the first thing that I came across was, because I had also used the screen sharing part, which works with the portal kind of things and in containers, that was my first use case. And that's what Wim mentioned earlier as well: we want to say who's allowed to use the camera. That was one use case I had in mind to say, hey, we need a new high-level API, we need an API where we can say outside of the browser whether the browser is allowed to use the camera. And then, while I was already implementing these kinds of things, the libcamera stuff actually came up.
Wim also mentioned in his talk that we have new ways to talk to cameras, and there was no backend in Chromium for that, and so the idea was to promote this and say, hey, let's use this shared high-level API for the browsers and just plug in libcamera at the backend, which by the way is already implemented because, yeah, I'll get to that in a second. So we want to do the authorization kind of thing, and we want a high-level API so we can put something else in behind it. And the solution was already there. As soon as I really started looking, it was obvious, because the xdg-desktop-portal kind of thing, that's the portal stuff that Wim mentioned, already had an API for cameras. It was basically not used, I think. I didn't find any real-life implementations that used it in a common case, but it was already there. And so I said, okay, let's do that. I was already involved with PipeWire and had used it for the screen sharing kind of thing, so it was nothing new to me. And I said, okay, let's implement that. And I started to implement. But before that, I want to say a bit about how it works. So the browser needs to talk to the portal and say, hey, access camera, that's basically the API call to the portal, and say, hey, I want to access the camera, am I allowed to do that? Okay, yeah, you're allowed to do that. Okay, great. Then, well, yeah, I need this PipeWire remote to basically get the connection to PipeWire, and then the portal says to PipeWire, okay, connect, we want to talk to you. And what the portal is doing then is restricting access. So the browser won't have access to everything that's available on PipeWire, but actually only to the cameras, because we only allowed access to the cameras. There are a lot more details there in the background, but on the Chromium side, for the implementation, we don't need to care about that.
So then the portal hands over the file descriptor to Chromium, and Chromium can talk to PipeWire directly and access the camera. Then Chromium starts sending messages to PipeWire and says, hey, I need the objects, and then the objects show up, and then, I need the properties for all the objects, and it basically builds up a list of cameras. But it takes several round trips between Chromium and PipeWire to get to that list. That will be important again later, because we're talking about multiple processes here, so it can take a bit of time. And once we're there, the user selects the camera and says, okay, we want to start streaming; then we start streaming and PipeWire sends video frames over to Chromium. That's basically the easy part at the end, although there is some format handling involved as well. So, okay, first try. I was a bit naive at that point. I started implementing this basically two years ago, and I had, more or less, a pull request in Gerrit, a change set, or, I think, a change list, CL, whatever. And, well, I posted it, I had it ready for review, and there was not a lot of review going on. I'm not trying to blame anyone here, but as I said at the beginning, the camera stuff in Chromium was basically done. So there weren't a lot of people working on that. Also, they maybe knew the Chromium camera API and Video4Linux, but I had this new PipeWire, portal and D-Bus code in there, and none of the developers were actually familiar with that kind of code. So it was slow going. There was mostly some high-level review of how the integration worked, and then there were some issues: okay, this is not just Chromium, this is also Google Chrome, and that's a binary build that's deployed on multiple platforms, so we cannot just link to PipeWire, because it may not be there, so we need to load it dynamically.
But WebRTC, and when I talk about WebRTC in this talk, I'm talking about the implementation used by the browsers, not the specification. So, in WebRTC, PipeWire is also loaded dynamically for the screen sharing. That should have been a red flag for me, but I didn't know this at the time. So, basically, WebRTC tried to load PipeWire, I tried to load PipeWire in Chromium, and that clashed. There were just architectural problems with doing both at the same time. And that's when we were stuck for a bit, thinking about how to work around that issue, and the merge request basically stalled a bit. And then someone came onto the scene, Jan Grulich, for example, and he was actually someone who knew about the PipeWire stuff, about the portal stuff, and said, hey, well, we shouldn't do that in Chromium, we should do that in WebRTC. And I said, okay, great. I actually sat down with Jan and Mark Foltz in a video conference, and we talked a bit about the architecture side and said, okay, right, WebRTC is the right place, because if I put it in WebRTC, WebRTC is also used by Firefox. So Firefox can also use my implementation to do cameras. So, second try: do it in WebRTC. There is already a camera API there that's used by Firefox, actually, and on Linux it just implements Video4Linux. So the idea was, I put a backend for PipeWire in WebRTC, and I put a backend for WebRTC in Chromium, basically just an indirection. Yeah, there was an API, but there were some issues. I mean, it was an API designed for APIs like Video4Linux. And if you enumerate devices in Video4Linux, you can do that basically instantaneously. Just open a few file descriptors, send a few ioctls, and you're done. So there was a synchronous call: give me all the cameras. And, well, as I showed before, we need to talk to the portal. The portal may actually ask the user if the application is allowed to access the camera.
And then we need to iterate over the devices found, or the cameras found, in PipeWire. That's not something we can do in a synchronous function call in a browser. So I needed a new asynchronous API to actually do this enumeration. And then, for opening a camera, the API was also just a static function: give it a string as an argument to select the camera, and open this camera for me. Same issue here, or a similar, related issue. I have this open connection to PipeWire that I need to keep open to access the camera. Otherwise we would need to talk to the portal again, which doesn't make sense. So, basically, we need some state carried over from the enumeration of the cameras to actually accessing the cameras. That wasn't there either, so I needed a new API. And on the other end of the API, the frames that came out of the stack were already converted to I420, a pixel format. And this included a copy. Basically, WebRTC took the raw frame from the camera, converted or copied it to the new format, and then handed it to the browser. But Chromium basically assumed that the camera would give it the raw frame, and it would convert or copy it itself. So if I used that API, I would basically have two copies. I wanted to avoid that, so I needed a new API to access the raw frame from the browser. So, suddenly, my single merge request to just add camera support to Chromium grew a bit; it was a little bit more than that. So, right. And at the end of the day, I wanted to do this without causing any regressions, right? So the first step is that we add a feature flag in Chromium. Once it's merged, it will be disabled by default at first. So if you want to try it, you need to enable this feature flag, and that says, we want to use PipeWire for cameras.
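The two API problems described above (enumeration cannot be synchronous, and opening a camera needs state left over from enumeration) can be illustrated with a small Python sketch. It is only a model under stated assumptions: `CameraSession`, the camera names and the node ids are hypothetical, and nothing here is the actual WebRTC or Chromium API.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class CameraSession:
    """Hypothetical session object standing in for the open PipeWire connection."""
    devices: dict = field(default_factory=dict)
    connected: bool = False

    async def enumerate(self) -> list:
        # The portal may prompt the user, so this step cannot block the caller.
        await asyncio.sleep(0)
        self.connected = True
        # Made-up cameras; each query models one round trip to another process.
        for node_id, name in [(42, "Front camera"), (43, "Room camera")]:
            await asyncio.sleep(0)
            self.devices[name] = node_id
        return list(self.devices)

    def open(self, name: str) -> int:
        # Reuses the connection made during enumeration: no second portal request.
        if not self.connected:
            raise RuntimeError("enumerate() must complete before open()")
        return self.devices[name]


async def main() -> int:
    session = CameraSession()
    names = await session.enumerate()
    return session.open(names[0])


node_id = asyncio.run(main())
print(node_id)
```

The point of the design is that the session object outlives the enumeration call, which a static "open camera by string" function cannot express.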
Okay, if we set that, then we go to WebRTC, and there is a software switch there, basically an option that's set by the implementation in Chromium, that says, okay, we want to use PipeWire. So that's why we're going here. There's also a Video4Linux implementation in WebRTC. That's not used by Chromium, because Chromium has its own Video4Linux implementation. So this path, where we say PipeWire is not enabled, is actually implemented for Firefox, because we may want to disable the PipeWire stuff in WebRTC from the Firefox side of things, and then just go this route to the Video4Linux implementation in WebRTC, but that's not used in Chromium. So we say, okay, PipeWire is enabled, so we go here. Then the next question: is the portal actually available? We have a build of Chromium where, at some point, we maybe always enable it, but the Linux system underneath may still not have a portal implementation running. So if that fails, we need to go back and say, okay, PipeWire is not usable, and then we fall back to the Video4Linux implementation in Chromium. But if it works, then we ask the user; that answer may be cached, and often is. So you won't see that more than once, typically. But if the user says yes, Chromium is allowed to use the camera, then we actually get access to the camera, and here is where the PipeWire stuff actually starts. So you need to set the switch and then hopefully have a working portal setup, and then you can use the camera this way. Okay, where are we? I mean, part of this talk is to raise a bit of awareness, maybe to find someone who can review more things or maybe add more features later on. So let's talk about what's there. The first commit was to split out generic portal PipeWire code; that was code used in the screen sharing for the portal that could also be used for camera sharing with the portal. So we put that in a place where it can be shared. Then the next part was the raw frame. I mentioned that earlier.
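The decision path walked through above can be summarised as a small function. This is a sketch of the logic as described in the talk, not Chromium code: the names are hypothetical, and the behaviour when the user denies access (no camera at all) is an assumption, since the talk only covers the "yes" case.

```python
from enum import Enum


class Backend(Enum):
    PIPEWIRE = "pipewire"
    V4L2 = "v4l2"
    NONE = "none"


def select_backend(flag_enabled: bool, portal_available: bool, user_allows: bool) -> Backend:
    """Pick the camera backend the way the talk describes the fallback chain."""
    if not flag_enabled:
        # Feature flag off (the default at first): existing Video4Linux backend.
        return Backend.V4L2
    if not portal_available:
        # No portal implementation running on this system: fall back.
        return Backend.V4L2
    if not user_allows:
        # Assumption: a denied request means no camera access at all.
        return Backend.NONE
    return Backend.PIPEWIRE


print(select_backend(True, True, True))
```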
We don't want to do the double copy. That was basically adding a bit of new API. And then, merged just a few days ago, I think two days ago, is the actual PipeWire portal capture support within WebRTC. So, great. So WebRTC is mostly done. I'm not sure what Firefox will do. Basically, it's ready for the Firefox people to implement things in Firefox. I don't know if someone is working on it. If not, maybe someone here is interested in that, so they can do that part. There's basically just a small merge request left, because all this camera code in WebRTC is not actually built right now if we're building WebRTC for Chromium. So we just need to do a bit of infrastructure and build these files as well. That's all on the WebRTC side. Then we come to Chromium. In Chromium, there are two commits pending. Neither of them has been merged yet. The first is basically, well, there is this Linux camera backend right now in Chromium, which is the Video4Linux backend. So, basically, rename it from Linux to Video4Linux and place a switch in front of it. That's where this feature flag comes in. We say, okay, can we do portal or not? So we make space to put a separate backend there next to it. The second commit is the actual implementation. Once that's merged, then we actually have the full support. Hopefully that happens in the next few... well, weeks is probably a bit optimistic, but months. Okay. What's next? For me, I probably won't work on these things in the near future, because this was done for a customer project and it's taken a long time, and they've been very supportive about this. So, really, there was no complaining, but I do need to get on with other things. I haven't done this full time, right? It was always: I do a bit, then wait for review, then get some review. So, these kinds of things, but I need to get this finished up. But there are a lot more features to come.
There is a new xdg-desktop-portal device API; it was mentioned on one of my merge requests that there's some API stuff coming that's still not merged. There's a request, but once that's in, the idea would be to support that to access cameras. And then there are some more features that Chromium supports for cameras, like rotation, that are not yet supported by the new backend. I'm not sure if the whole stack below that already supports all of it. I'm pretty sure that PipeWire does. I'm not quite certain, but I think so. There is an API in WebRTC for rotation. So I think it mostly needs to be hooked up, so we can rotate the camera image by 90, 180 or 270 degrees. And then there are also features, if you know Video4Linux, they have the controls to do panning, tilting, zooming, focus handling, whatever. A few of those things are integrated in Chromium as, basically, image properties, I think. And they are hooked up in the Video4Linux backend, but I don't think there is an API in PipeWire to access that. I'm not quite sure. So maybe we need to add it to the whole stack. So there's some work to be done there as well. But my hope is that at some point that gets added, and then we can really switch the PipeWire camera on as the default in Chromium, because I don't think it will be accepted as a default unless we have more or less basic feature parity there. So, still work to be done. Hopefully someone got interested and will help out there. Okay. A few thanks. WolfVision, that's my customer who sponsored this work, so I put them here. Jan Grulich and Mark Foltz, they actually got the review started, because once they got involved and noticed my patches, they found people who could actually press the necessary buttons on the review and say, okay, this is accepted. And they did a lot of review there. They talked to me about the architecture, so they helped me get on with it.
And Ilya, I don't know, and Alex Cooper, they did a lot of review as well. And Kieran Bingham, he's doing a lot of libcamera work. He mentioned my work last year at ELCE, actually. But he's done a lot of testing, because he is trying out his libcamera stuff in combination with my Chromium patches, so a lot of testing and review there as well. And probably a few more reviewers that I missed; I just went over the list of people who said something on the requests and picked the ones who seemed to have the most comments. Okay. I'm almost done, so I'll come to your question in a second. So, yeah, like everybody, we're hiring; all of our companies need good people. Okay. So, thanks, and questions. I'll start with you. Yeah, just a comment that Kieran is going to give a talk on libcamera in one hour in the salon room. Yeah. Okay. Kieran is giving a talk about libcamera in the embedded and automotive room. So maybe I'll be there. Exactly. So, any other questions? Okay. Here. Is there a common place to follow the discussion for all the bits? Where do we need to keep an eye on it? Is there a page for all the bits? I don't have any web page or anything for that. But if you basically subscribe to the last change, that's the one where it gets interesting, right? Okay. Any other questions? Okay. Then we're done. |
Advanced programmable use of Liquidsoap with FFmpeg
Explore how the liquidsoap language can be used in new, safe ways for building media pipelines and leverage FFmpeg functionalities |
Okay, so we start again. Our next talk is about Liquidsoap. Please welcome Romain. Hello and good morning. I'm really happy to be here after not two but three years, actually. We gave a talk here before the whole thing happened that I don't want to talk about. A lot of things happened, and I want to go back over them and talk about what we did, both in the code and with the community, and the kind of things that happened. So, yeah, first of all, for those who weren't there at the first talk, what is Liquidsoap? It's a programming language. Technically it's a scripting language. It's statically typed, with inferred types. So if you're familiar with TypeScript, it's like TypeScript, but you don't have to write the types; everything here is known to be a string, an integer, something. What does the language do? It allows you to create online streams, and it's not a low-level tool. We delegate that, and I'll talk a lot about it. What we want is to empower the user to reason about the tool. That's what the language really does well: programming, logic, business logic. I want to play that song. I want to switch to a live source. So that's an example of a full script that we can use to run two outputs to Icecast based on playlists, a file, a list of requests, an input from HTTP, all sorts of stuff. Yeah. So what has happened with Liquidsoap since that first time in 2020? We worked with Radio France. That's the reason we came in 2020, and it was the starting point of a lot of new work, because it really created a new cycle of work and interest. We had a lot of community growth, which I'll reflect back on in the first part of this talk. Then we did a lot of new features because, guess what, we also had a lot of time during that time, and we worked a lot on it. And I want to finish the talk by talking about future development and the challenges that we foresee for the future. So, first of all, what happened with the community?
So I started looking back at the stats over the past three years, and I looked at the GitHub stars. It's nice, but it was growing pretty steadily, except for that little bump here when we did the 2.0 release. And I looked at the Slack channel, which I want to move off of this platform. But anyway, it was also growing pretty linearly. I was like, it doesn't seem that anything really happened over the past three years. Then I looked at the issues. And that's where I was like, yeah, I remember that. So what happened is, when the whole shutdown happened, because we are a project that enables people to communicate online, well, it was one of the places that people wanted to go to communicate with people and to try to reach out, create links, maintain links. So we suddenly had a lot of people who were like, I want to do an online stream. I want to put on music for my friends to listen to. I want to do all these things. And the second effect: we had a lot of people with time. We're kind of one of those projects where you're like, oh, I like it, I'd like to look at it at some point, but I'm too busy, and suddenly I have time. So people started using it, submitting a lot of issues, and, well, we got busy. One of the things we did, because we didn't have that kind of in-person meeting: Samuel, who's the other core developer of the project, decided that we should do an online workshop, which was the thing to do. And it was really good, because what it really allowed us to do is get to meet our community. And I think one of the things that's really nice with this project is that it's a technical project, but a lot of our users are actually not technical people. They're like people who run Burning Man Radio at the Burning Man thing. Some of the cute projects we had: the top one was a network of community radios in Barcelona, where they literally have trucks that they bring to different neighborhoods to do online reporting of what happens in the city.
And the lower one was another one, a Hungarian community radio. So it was really good to get to meet them. And I think we valued that a lot. Of course, we have industrial users, but this is also one of the core motivations for the project for us. And then, eventually, another thing we did, again a byproduct of having a lot of time, and it was mainly Samuel working on it, was to write a programming book specialized in audio streams, media streams, and how to use Liquidsoap for them. And it was very useful on many levels. One of them is that it forced us to rethink our API and reorganize it. So you say, can we do that? What's a good example to do that? Well, actually, it's not clear. Let's make a nice module that can do that easily, and document it. And also, it helped the users a lot to be more confident because, as I said, most of our users are not programmers. And so they come to this, and they're like, whoa, I've never touched a line of code. What can I do? So that book was a really good starting point to get people interested and more confident with the project. So what did we do for the new features, though? That's the gist of the work we did. First of all, we did a lot of language changes, because for a long time our focus was more on features. We only did a lot of outputs, a lot of inputs, a lot of different interesting operators. But when you start to want to implement things that are more complex, you also need powerful language extensions or a toolbox. And also, we did a lot of FFmpeg integration. So those are the two things that I'm going to talk about next. And first, the language changes. So, yeah, more expressivity. You want to enable your users to write code that is nice, readable, that they can understand, that is powerful, but that doesn't need a million lines to do a simple thing. So, simple things like that. These are like spreads, you know, to split out a list.
You have a list, and you want to get the first element, the last element, and whatever else is left as a remainder. We could do that before, but we had to use functions and it was complicated. This really helps users just visualize the code and be like, all right, I understand what this is doing, and they have far fewer lines to write. We implemented a lot of types that were missing. So, for instance, a null variable. A lot of that will be reminiscent, of course, of dynamic scripting languages like JavaScript or others. So we didn't have a null value. We added a null value, and we added a nice operator that says: if X is not null, it's a string, then you get X. If it's null, you get the string "foo". Stuff like that. But when you start writing actual usable code, you need these things, and they're very useful. Yeah, you can write a function with an optional argument and you can say exactly that the user didn't pass an argument. That's just basic programming language features. The other thing we did that was very useful is a new module syntax. That basically means that any value in the language can have methods or attributes attached to it. So here you have a string, and that string has two methods. Well, attributes here: duration, a floating-point number, and BPM. It's basically an object-oriented idea, but not really an object. More like a JavaScript hash, I guess. And that means that you can query song.duration and get the floating-point duration, or you can print the song title as a string using the underlying value. That was really useful, because when you want to create something that's easy to use for the user, you need to structure your language. You need to have modules and you need to have functions in those modules. Also, another typical case is that now you can specialize things. So you have sources that have different functions. So all sources have a skip function.
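For readers who do not know Liquidsoap, the three language features described above have rough analogues in Python. The sketch below is only an analogy, not Liquidsoap syntax; the track names, the `Song` class and the helper name are made up for illustration.

```python
from typing import Optional

# 1. List destructuring: first element, last element, and the remainder.
tracks = ["intro", "song1", "song2", "outro"]
first, *rest, last = tracks


# 2. A default for a possibly-null value (the talk's "foo" example).
def title_or_default(title: Optional[str]) -> str:
    return title if title is not None else "foo"


# 3. A value with attributes attached: still usable as a plain string.
class Song(str):
    def __new__(cls, title: str, duration: float, bpm: int):
        obj = super().__new__(cls, title)
        obj.duration = duration
        obj.bpm = bpm
        return obj


song = Song("My title", duration=183.5, bpm=120)
print(f"{song} lasts {song.duration} seconds")
```

In Liquidsoap these checks happen at type-checking time, before the script runs; the Python version only mirrors the shape of the code.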
It means that you can get to the next track in your stream, but you may have sources that have specific functions. So, for instance, you can have a source that can insert metadata, which is created with this operator. It will have a new function added that's called insert metadata, and now you can use it. So instead of having a billion general calls that are API-based, you can really start attaching specific uses to specific variables, and it makes things more compact, more specialized, and it's very useful for cleaning up your API. Another thing we did after that with the modules is that now we were able to describe high-level things. One of those things that's really painful in statically typed languages is parsing JSON. Why? Because JSON can be anything, and we really want to know that name is a string. We really want to know that version is a string. So in a language like OCaml, you have to parse an object, iterate through the keys, validate that each is a string, and in every branch of that you need to think about what you're doing. And our users, again, are not programmers. So what do we do? We try to find a primitive that's readable and easy to use. So what are you reading here? You're reading a parsing statement that says, I want to parse, and I want to get a module that's going to have a name, a version, and another scripts attribute that is itself a module that has a test entry. It's what you get in a package.json for Node modules. And this is a type annotation that says I want the name to be a string, the version to be a string, and the scripts to be a module with a test that's a string. And at runtime we're going to take all this information, we're going to try to parse this, and one of two things happens. If we have what we need in the JSON, you can go on with your script. If we don't, we raise an error; you can catch the error and reason with it.
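The behaviour of that typed JSON parse (succeed if the document matches the annotation, otherwise raise an error the script can catch) can be modelled in a few lines of Python. This is a sketch of the idea only, not how Liquidsoap implements it; `parse_typed`, `ParseError` and the shape notation are hypothetical, and the key is spelled `scripts` as in a real package.json.

```python
import json


class ParseError(Exception):
    """Raised when the JSON text does not match the requested shape."""


def _check(value, shape, path):
    # A dict shape describes an object; any other shape is a Python type.
    if isinstance(shape, dict):
        if not isinstance(value, dict):
            raise ParseError(f"{path}: expected an object")
        result = {}
        for key, sub_shape in shape.items():
            if key not in value:
                raise ParseError(f"{path}.{key}: missing")
            result[key] = _check(value[key], sub_shape, f"{path}.{key}")
        return result
    if not isinstance(value, shape):
        raise ParseError(f"{path}: expected {shape.__name__}")
    return value


def parse_typed(text, shape):
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ParseError(str(exc)) from exc
    return _check(data, shape, "$")


PACKAGE_SHAPE = {"name": str, "version": str, "scripts": {"test": str}}

good = '{"name": "demo", "version": "1.0.0", "scripts": {"test": "run-tests"}}'
package = parse_typed(good, PACKAGE_SHAPE)
print(package["scripts"]["test"])
```

A script would wrap the call in `try`/`except ParseError` to handle a malformed document, which corresponds to catching the error as described above.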
But that really makes parsing JSON easier for the user, which is important because, again, when you want to connect a lot of interconnected systems for streaming, you want to be able to talk to JSON APIs. So, pretty useful. And we did the same for YAML recently. So, yeah, you can do settings now. Yeah, those are the new features. There's more, but I don't want to spend too much time on it, because the other part that I want to talk about, which is exciting, is the FFmpeg integration. And that really started after Radio France, because they had a strong need for it, and we started looking at the API from FFmpeg. And the thing with the API from FFmpeg is that it's really good. It's amazingly good. It's all very low-level, in C. So for us, because, I didn't mention it, but Liquidsoap is implemented in OCaml, we need to speak to a low-level implementation to really be efficient. We can't talk to Perl or to whatever, Python. And with C, it's really simple for us. But it's also extremely abstract. And that really helps us because, again, we want to do what we do well, which is the programming language side, the logic, the typing, the functions. But we're not specialists in multimedia implementation. We don't want to do that. We want to find people who do it better than us and interface with them. So that's the API for an FFmpeg packet, which is a little tiny bit of encoded data. It contains, I don't know, an MP3 frame, a video frame, I or B, all the abstract things I don't want to know about. But it's going to tell me two things. It's going to tell me, this is your data, and this is your presentation timestamp. And that thing here tells me, this is the time at which you want to insert this packet in your stream. This is the data. That's what I need to know. Because then we can build a stream with it. We can pass the packet around to our different operators, not even knowing what they contain.
The other thing that FFmpeg really, yeah, so that's what we started doing: we started implementing a new encoder that was basically reflecting everything that you see as a parameter in the FFmpeg command line. We think that we support it as an option in those encoders. And the reason we think that is because another thing that FFmpeg does really well is describing their API. So, I'm sorry, that's not very readable. But that's a C structure that has all the parameter names here for H.264 encoders, a description, a type somewhere, and minimum and maximum values, everything we need to basically write an interface to it. It also does that for filters. Again, not very readable. Sorry about that. But it's basically a programmatic interface to everything you need to know about parameters for FFmpeg. That's also not readable. Great. So then what we can do is, this is an FFmpeg filter exposed in Liquidsoap. And if you could read it well, you would see that every parameter of the filter, like speed, is a floating point, optional. It has no default value. Some have a default value, sometimes they don't. We get this information from the FFmpeg C API, and we can plug into it immediately and be very confident that we're using it well. And so one of the things it allows us to do is to actually have scripted manipulation of FFmpeg primitives like filters. So we take a source and we want to define a filter that is a flanger followed by a high pass, and then output it. So you need a graph. That's just part of the C API. You need a source. You create an audio input from it. That's the FFmpeg side. That's what they call an audio input. You pass it to the filter with the parameters. Everything is typed here, so we can check that it's the right value. And then you create an output. Run that. You have a high-level description of your filter. I don't know if you all have manipulated FFmpeg filters, but when you want to do complex filters, they have a graph description syntax that's pretty hard to read.
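To make the contrast concrete, here is a small Python sketch that checks parameters against an option table and then renders the flanger-then-highpass chain in FFmpeg's textual filtergraph syntax (`name=key=value:key=value`, filters chained with commas). The `OPTIONS` table is a hand-written stand-in for what the talk says is read from the FFmpeg C API; it lists only a few options, and the helper names are hypothetical.

```python
# Hand-written stand-in for FFmpeg's self-described option tables.
# Only a few numeric options are listed; the real tables are much larger.
OPTIONS = {
    "flanger": {"delay": float, "depth": float, "speed": float},
    "highpass": {"f": float},
}


def render_filter(name, **params):
    """Validate parameters against the table, then render one filter."""
    known = OPTIONS[name]
    for key, value in params.items():
        if key not in known:
            raise TypeError(f"{name}: unknown option {key!r}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name}.{key}: expected a number")
    if not params:
        return name
    args = ":".join(f"{key}={value}" for key, value in params.items())
    return f"{name}={args}"


def render_chain(*filters):
    """Join filters into a linear chain, as in an FFmpeg -af argument."""
    return ",".join(filters)


chain = render_chain(
    render_filter("flanger", speed=0.5),
    render_filter("highpass", f=200),
)
print(chain)  # flanger=speed=0.5,highpass=f=200
```

A misspelled option such as `sped=0.5` is rejected before any string reaches FFmpeg, which is the kind of early check the typed interface provides.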
I'm used to that, so I'm biased. I can read these things more easily, but also it's kind of more descriptive. It's typed, too. So this is another filter. It takes an audio source and splits it into two sources. One of them is going to go through a flanger. The other one goes through a high pass. This is some conversion that was required. I don't know why. And then you merge them back. So we're now describing a graph that branches out, does two filterings and comes back together. This is a simple one, but you could use that to do, I don't know, a multi-band compressor, for instance. You can do that in FFmpeg. It's just a little bit more structured and readable, and also type safe. So next, I want to talk about how we implemented that. Yeah, I still have some time. This is the timeline for us, infinite time. We start here, and we go all the way here. If you use your imagination, all these little horizontal dots are audio packets that came from FFmpeg. The vertical ones are video frames. This is your stream of data that you're sending to an Icecast server, anything. What we do is find the lowest common denominator between the video rate and the audio rate. We need to find a little chunk of time that will contain the same duration of audio and video data. Most of the time, with the parameters we have internally, it would be 0.04 seconds. That's what we call our frame. And then the idea of the streaming loop, once you've parsed and prepared all your outputs, is to just recreate that frame every 0.04 seconds, infinitely many times. That just creates your stream. And so here's an example. We have a simple script. It has two outputs. You want to save to a file and send to an Icecast server. Fallback is an operator that will take the first source available. So the first one is an input, input.harbor. It's one of the operators we have that can receive Icecast clients. So let's say you want a DJ to connect to your radio. You can direct them to this input, and it starts streaming.
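The 0.04-second figure can be reproduced with a short calculation: the smallest duration holding a whole number of audio samples and a whole number of video frames is one over the greatest common divisor of the two rates. The rates used below (44100 Hz audio, 25 fps video) are assumptions chosen because they give 0.04 s; the talk does not state which rates Liquidsoap uses internally, and the function name is made up.

```python
from fractions import Fraction
from math import gcd


def frame_duration(audio_rate: int, video_rate: int) -> Fraction:
    """Smallest duration (in seconds) containing whole samples and whole frames."""
    return Fraction(1, gcd(audio_rate, video_rate))


duration = frame_duration(44100, 25)
samples_per_frame = duration * 44100
images_per_frame = duration * 25

print(float(duration), samples_per_frame, images_per_frame)
```

With these assumed rates each frame carries 1764 audio samples and exactly one video image.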
The fallback will stream that data immediately. If it's not available, we have a queue of requests that you can push to. So let's say you want to send a track to be played immediately after the current one. Every now and then, you can send a request here. Otherwise, we have a playlist of just files in a directory. And if that is not available, we have some kind of fallback, a single file. Just in case everything fails, something is going to say, I don't know, we're having technical issues, please come back. So now we're going to run the streaming algorithm and see how we do that. So, output.file starts. It always goes from the output down to the inputs. And the reason is that all of that is dynamic. So there's no reason to start asking these operators here to prepare data. They might not be used because, up in the graph, the fallback might choose to use just one of them. So we have to start from the output and bring it down. So I have an empty frame, at, say, streaming cycle 23. And I want to fill it up with 0.04 seconds of data. I go to the fallback and say, hey, can you fill up this frame? Fallback is like, sure, let me ask first. input.harbor? Not available. But request.queue has something in the queue, actually. Let me pass it down to this operator. request.queue partially fills the frame. It added a little bit of audio and one video frame. The way you can think about it is that it was just finishing a track. Remember, request.queue takes files when you want to play them. So, I don't know, it's just done playing the jingle or the commercial that you wanted to play. That's it. I'm done. So it's a partial fill of the frame, in which case it comes back to the fallback, which says, I need more. Playlist is not available. For some reason, the directory is empty. So we go to the single and say, hey, single, can you fill this frame?
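The pull-based fill described above, including the partial fill by the request queue, can be modelled with a toy Python sketch. It is not Liquidsoap's implementation: a frame is just a list with four slots standing in for 0.04 seconds, and the `Source` and `Fallback` classes are hypothetical.

```python
FRAME_SIZE = 4  # slots per frame; a stand-in for 0.04 seconds of media


class Source:
    """Toy source holding a fixed amount of remaining data."""

    def __init__(self, name, units):
        self.name = name
        self.remaining = units

    def is_ready(self):
        return self.remaining > 0

    def fill(self, frame):
        # Add as much as we have, up to the space left in the frame.
        count = min(FRAME_SIZE - len(frame), self.remaining)
        frame.extend([self.name] * count)
        self.remaining -= count


class Fallback:
    """Fills the frame from the first ready child, moving on after a partial fill."""

    def __init__(self, *children):
        self.children = children

    def fill(self, frame):
        while len(frame) < FRAME_SIZE:
            child = next((c for c in self.children if c.is_ready()), None)
            if child is None:
                break
            child.fill(frame)


harbor = Source("harbor", 0)         # no DJ connected
queue = Source("queue", 1)           # one slot left of the current request
playlist = Source("playlist", 0)     # empty directory
single = Source("single", 1_000_000) # always-available safety file

frame = []
Fallback(harbor, queue, playlist, single).fill(frame)
print(frame)  # ['queue', 'single', 'single', 'single']
```

The request starts at the output side and walks down, so sources that are never chosen are never asked to prepare data.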
One of the things we do when we start the script is that we actually double check that we have at least one operator here that's always going to be available. So that's what happens here. The fallback is being used. And single is like, sure, I've got a file. I prepared it. I can decode it. Boom. It finishes filling up the frame. And then it goes all the way up the tree. We're ready to encode that and save it in the file. Now comes the second one, which is the Icecast output. Same thing. It's like, hey, I need to send data to my Icecast server. It goes back to the fallback. But then what we do here, of course, is that we cache stuff. So we know that the fallback has already produced a whole frame. We have saved it. We can just fill it up here, send it back to the Icecast output, again encode it if needed, and send it out. So the thing that's really nice with this algorithm is that, again, we don't really care about what's in the frame. We just care that we know that we can fill it up and we know how much is filled. And then we can pass it down. So these things can actually be FFmpeg encoded packets. No problem. They can be raw PCM. Then we have to encode them. So, again, we are just looking at things from a high-level perspective. So if the time for that whole cycle was t, we have two possibilities. If we generated the 0.04 seconds in less than 0.04 seconds of computer time, it means that we run faster than real time. We can sleep a little bit, because we're generating things in real time. If not, we have a problem. We need to catch up. So we need to run the loop again as fast as possible. Basically, if you're in the red, you have a problem. It means that, I don't know, encoding takes too long. Something happened in your script. You cannot produce in real time. What we want is to be in the green all the time, so that we know we can deliver the content at a real-time rate all the time. So that's how it works.
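The sleep-or-catch-up rule at the end of each cycle can be written as a small pure function. This is a sketch under assumptions: times are integer milliseconds to keep the arithmetic exact, the carried-over `backlog` is one plausible way to model "catching up", and the names are hypothetical rather than taken from Liquidsoap.

```python
FRAME_MS = 40  # one frame is 0.04 seconds


def after_cycle(elapsed_ms: int, backlog_ms: int = 0):
    """Return (sleep_ms, new_backlog_ms) once a frame has been produced.

    elapsed_ms is how long producing the frame took; backlog_ms is how far
    behind real time the loop already was.
    """
    slack = FRAME_MS - elapsed_ms - backlog_ms
    if slack >= 0:
        # "In the green": faster than real time, so sleep off the difference.
        return slack, 0
    # "In the red": no sleep, run the next cycle immediately to catch up.
    return 0, -slack


print(after_cycle(10))      # (30, 0)
print(after_cycle(55))      # (0, 15)
print(after_cycle(30, 15))  # (0, 5)
```

A backlog that keeps growing cycle after cycle means the script cannot produce content in real time.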
Now, because, as I told you, we can now use encoded content, we're having a few problems that I won't have time to describe here. But essentially, sometimes we need a lower-level understanding of what's happening in the bitstream. So, basically, in FFmpeg, you have something that's called extradata. So, if you take an MP4 media, everything that is needed to decode, like Huffman tables, is in the header. It's communicated first, and then all the packets, the frames after that, don't have it. But if you take MPEG-TS, because MPEG-TS doesn't have a global header, every frame is going to have that data. And that was a problem because now we do something that FFmpeg doesn't do, which is a live switch between two different bitstreams. This is an RTMP input, this is a file, MP4. And when we live switch, if we started with this one, the muxer from FFmpeg will say, I know that all the frames are going to have the private data that I need, I can go on with it. And then we live switch to that one, and suddenly we start receiving frames that don't have it. And the muxer is like, whoops, I cannot do it. So we have to insert those filters here to make sure that the data is always present. Which is a problem for us because it means that the user has to think about low-level stuff. And that's part of one of the questions: I was wondering if FFmpeg might find one of the beautiful abstractions they have to alleviate that kind of problem. All right, almost done. I'm going to finish real quick. We have a 2.x release with tracks, so you can manipulate and remix different tracks. Let's say you do an MKV with French, English, and Italian soundtracks and encode them in different formats. We want to rewrite the whole clocking system and streaming because OCaml 5.0 does concurrency. It's pretty exciting. We want to think about JavaScript and Wasm because we can do it and why not.
And I'm very interested to see what VLC does, because that's the next talk, and why they do it. Long-term development, though, is what we are wondering about, because we have grown a large community, but most of our users are not programmers and our programming language is OCaml, which is still a niche language. And we're two developers, Samuel and myself. So that's one of the questions. You know, the project is 20 years old. We're 40 years old. At some point, we need to think about moving on. So what we have done so far is a lot of automation, which is very powerful. It allows us to focus on the code and have less to think about. Like, we have automated releases. We do CI to run all the tests every time. We have increased the number of automated tests we have so that we catch everything very quickly instead of relying on a lot of manual testing. But it's still a challenge to think about how we can bring in more developers, because people who like OCaml and like radio are a pretty small intersection of two niche things. And that's it for me. Thank you very much for your time. And maybe there are questions. So we have time for a couple of questions, yes? Yeah, so once you go from the demuxer to the decoder, a certain format is expected. Then the same thing on the muxer side. So if the expected output format is something else, it will actually tell you that you're expected to insert a bitstream filter. Yeah, it was actually an answer. There are APIs in FFmpeg to inform the user of the C API whether or not they need to insert those filters. And I see that there is automatic insertion of those filters in the code. I guess I have to have another pass at making sure I understand when and how. Regretfully, for this presentation I didn't dive in again, and I remember that sometimes it becomes a little bit intricate. But thank you very much. I will definitely have a good look.
Is it possible to control the running script using external automation, especially things like editing running playlists? Yeah, absolutely. So the question is, are there any tools to control a running script from outside? We have a lot of different options. The traditional one, the old one, was a Telnet connection. But we have a fully featured HTTP server, so you can write your own API endpoints. And basically, because every source now has its own methods attached to it, and you can run scripts, you can take a source and say, okay, there's a function that inserts metadata, and plug that into everything else that is running. We're a programming language. So yeah, we have a lot of different options for that for sure. Can you use Liquidsoap to do static operations, not with the goal of doing streaming, just to apply operations on many... Yes. The answer is yes, because the clock doesn't have to be... Oh yeah. Can you use Liquidsoap to do offline processing faster than real time and not just real time? The answer is yes, because you can run a clock that says, I don't want you to be real time, I want you to run as fast as possible. But the sub-answer is that it's not a common case, so it's not commonly tested. And that's part of the thing that I want to work on more when we rewrite the streaming. Because, for instance, a request needs to resolve a file, which needs to make a network request, download the file, and test that you can decode it. Most of the time we expect that to take a while, but it's not been tested a lot when you want to run very, very fast. And you run into race conditions that are very different. Okay. Thank you very much. Thank you. |
Dual presentation: FFmpeg 6 and VLC.js |
Yeah, so I'm going to do two small presentations because those are short talks and I didn't want to take too much time today. So we're going to speak about FFmpeg, mostly FFmpeg 6.0, and then I will speak about a new project called VLC.js, but it's a lie, it's not really VLC.js. So who am I? My name is Jean-Baptiste, some of you know me, some don't. I'm president of VideoLAN. I've been working on VLC for, okay, I'm close to 40, so 17 years. I've been involved in x264, which is a VideoLAN project, dav1d, which is an AV1 decoder, and lately a bit on FFmpeg, mostly on the community management, which is a funny topic. I shouldn't be the one giving this presentation, but the people who should give it are maybe in this room and don't want to present, so that's why I'm presenting. Jokes aside, if you look at the FOSDEM open media room, there is almost no FFmpeg talk, which is completely insane. VLC is better, thanks to Kieran and Rémi, but it's ridiculous. If you look also in the archives, there's almost no general FFmpeg talk. What? Everything in multimedia, in the open source world and outside of the open source world, is actually based on FFmpeg. And when I say everything, I mean everything you see online. You go to those big trade shows and they all have amazing cloud encoding and so on, and it's just a very nice wrapper around FFmpeg. But of course, when I say FFmpeg, please understand, this is FFmpeg plus libx264 plus libx265 plus libvpx plus dav1d plus all the other libraries that I forget. And even on, you know, that fashion thing that we have which is called Hacker News, when there is a release of FFmpeg, it's not even the top of Hacker News, right? So that means that we are doing something wrong, which means we don't communicate enough on FFmpeg. So here I am. So the community is healthy.
We've had some fights in the past, to be honest. The forks are long gone. And among the people working now on FFmpeg, there are lots of new people who were not there at the time of the fork, but also people from both forks are still working on FFmpeg. That's pretty cool, especially since we've not seen that many open source communities being able to work together after those kinds of events. So here I'm just going to speak quickly about FFmpeg 5.0, which was almost exactly one year ago. It was very important because we tried to match the new release schedule that I'm going to talk about. But it was probably the biggest API breakage ever in FFmpeg. I think just the train of commits removing deprecated symbols was around 130 commits. And the diff was huge. Some APIs had been deprecated in 2013 and were removed in 5.0. So this is probably going to impact a lot of you, because a lot of distributions are still on 4.4. But 5.0 is a big change of APIs. And one big thing is that it's one API to decode both audio and video, and not avcodec_decode_video 4, 5, 6, and so on. All those APIs. It's not doing subtitles yet, but I was promised that someone will do it this year. Where is Anton? Yeah. And yeah, we did a lot of new things. An AVFrame-based API in swscale, new bitstream filters, a lot of things to clean avformat and avcodec. It's disentangling those two libraries, working on the decoder contexts, et cetera. You should look at the release notes on that. There are some people who are doing amazing work, mostly Andreas and James, who are basically removing all the cruft in FFmpeg. So one day, the whole of FFmpeg will be thread safe. We believe that, right? And avcodec and avformat will be completely split. Yeah, OK, maybe not. But there is a lot of work to be done, and that's very important for security reasons. Michael, who's still probably the oldest FFmpeg contributor, is still fuzzing FFmpeg every day. Slice threading in swscale.
IMF demuxing, which is good because so many professionals are using the IMF format, and they usually do weird things on FFmpeg, or above FFmpeg, and then we have to deal with their shit, because it's wrongly marked. So now we're actually getting that directly into FFmpeg. Dolby Vision, I'm not sure exactly which part of Dolby Vision, because there are, as many of you know, five or six profiles. But I think at least profile five was there. And of course, a lot of other things, and one of the cool things was the integration of libplacebo, which used to be the mpv video filtering framework, mostly GPU accelerated, and that is now in FFmpeg. And you can use that without a GPU, easily, with emulation. So, you know the old APIs, and now what's interesting is that it's more async-based, so you don't need to do those horrible waits. 5.1, so that was about six months ago, in July. This one is important for you, because it's an LTS. 5.0 is not an LTS, so we're going to try to fix at least the security bugs in 5.1 for a couple of years. A lot of features were added, but one of the major APIs that was merged was the change of the audio channel layout API, which was supposed to come in 5.0, but we missed it, and then we said, well, it's going to take too much time, so we did that in 5.1. A lot of optimization on ARM in that release, mostly on HEVC decoding, and a lot of things on AV1 decoding in hardware, because there are still 25 different APIs to do hardware acceleration. But soon there will be a new one that is going to replace all of them, which is Vulkan video decoding, and we'll have 14 standards. JPEG XL decoding, and a lot of things on SVT-AV1. So yeah, the channel layout API was developed in 2013, I think, by Vittorio. I'm not sure he's around. Yeah, Vittorio. That was done during the fork, and it was quite complex.
But this is good, because it's ready to do what we call, well, what marketing calls NGA, which is next generation audio. What marketing also calls Dolby Atmos, that kind of object-based audio. The new channel layout API allows us to be a lot more flexible for custom layouts and weird things without us having to do everything directly inside FFmpeg. And I've still not started on my main topic, which is FFmpeg 6.0. I hoped, when I was submitting the talk, that this would have been tagged, and that's important. I think this is even bigger in terms of number of commits, and mostly in terms of contributors, because in the last six months, there have been around 191 contributors. That's huge, and that's a lot bigger than the previous release. What is important? There are not that many important API breakages and changes, but there are new APIs. And also, it's a major bump, so we are going to remove more things that were deprecated in the last few years. And there were APIs that were too new, so we didn't remove them in 5.0, but we're going to remove them soon. Soon. One of the major changes, and one of the most difficult things that we've seen, is multithreading the FFmpeg CLI. So all those big guys at YouTube and Vimeo and Facebook, and all those providers of nice UIs on top of FFmpeg, one of the things they complain about is the lack of multithreading in FFmpeg. So they invent a lot of weird frameworks to do that, and there is a lot of work to do that directly inside FFmpeg. It's going to go on for the whole year, I think, for all of 2023, but that means that a lot of things will be better for you to use. And of course, when you do that, you need to actually care about thread safety and cleanups, so that's a lot of cleanups. What was done for 6.0 is that the muxers are now in their own threads, and there will be more things. There is now RISC-V optimization, or at least the framework to do that, inside FFmpeg.
One of the things that is important is that you've probably seen that all the big guys building GPUs have now shipped AV1 encoders. So in 6.0, we've got Intel, Nvidia, and AMD. So you can actually encode AV1 in hardware, and that's actually very fast. You can reach 30 FPS in 1080p without any problem with those cards, and it's actually decent quality. It's not as good as SVT-AV1, of course, but it's pretty good. There was a lot of work on the FFT code by Lynne. She's over there, she can tell you about that. I don't know how much faster it is, but it's a lot faster for all the audio codecs and all the audio filters that require the FFT, and sometimes it is better than the external FFT libraries that everyone is using. A new API for reconstructed frames from encoders, API breakage for deprecations. We have of course what I hate, lots of new YUV formats and pixel formats, because there is always a good reason to add them. And when I'm downstream as VLC, I hate that, but anyway. Lots of things on channel layouts and H.274, mostly about external filters. One of the big parts of those features is everything related to hardware. So I talked about AV1 hardware encoding, and a lot of pixel formats, especially for hardware. There is finally the Android MediaCodec support using the NDK directly, and not through the Java crap, that is directly integrated into FFmpeg. I think that requires Android API 23, but I'm not exactly sure. And we also have encoding, and not just decoding, based on MediaCodec. The Intel folks have done a lot of things to have 10-bit, 12-bit, 4:2:2, 4:4:4 VP9 decoding directly inside FFmpeg. That's one of the reasons why we have new pixel formats. In terms of actual features, there are of course lots of new codecs and lots of new filters. The ones I prefer are FTR, which is from an annoying company that doesn't want us to reverse engineer that.
Bonk, APAC; there is an SSIM 360 filter, and some very cool bitstream filters, like the DTS to PTS one. Look at that one. It's quite useful. Yeah. So FFmpeg CLI multithreading, as I said. This is partly done in 6.0. It will be continued in 6.1 and 7.0. It is difficult, and it's long. But this is going to improve all your lives, especially if you want to do multiple HLS/DASH outputs, multiple transcodes, multiple resolutions, and do that directly without using third parties. FFmpeg releases: this is a slide I took exactly from a previous talk. And we never talked about that during FOSDEM, so that's why I'm talking about it. The problem was that with FFmpeg releases, well, before, there were none. So we all took the git HEAD, and hoped it was great. And then we were seeing what MPlayer was doing, then VLC was copying. And well, if MPlayer and VLC agreed, then everyone was using that. Then we started having releases done by Michael, and sometimes they were not very predictable. So one of the ideas is to move to a more predictable fashion, which is one major release with API and ABI breaks every year around December or January. So we're in February, and we fucked up this year. But that's the idea. So one major release where we allow API and ABI breakage. We remove APIs. When something is deprecated, it must be there for two years before we remove it. But we will bump the SO numbers. And then one or two releases during the year, depending on security and what we need, so 5.0, 5.1. And every two years, one of the .1 releases will be an LTS, and we'll maintain that for two or three years. So the plan was to do FFmpeg 6.0 in January. I hope it's going to come next week. We'll see. Yeah, this was not on the schedule, so I'm adding a shorter talk in the middle of my two talks. dav1d 1.0 was released last year. It is insane. 200,000 lines of handwritten assembly. I don't think there is any open source project that has that.
I'm not sure there is even a non-open source project that has that much assembly. And yes, handwritten assembly is faster than using whatever version of whatever compiler and activating whatever amazing feature that is going to auto-vectorize something. We still do five, eight, 16 times faster than C, so don't bring that up. It is insane, yet it's necessary. So when you decode AV1, well, dav1d is now on all your iOS devices, all your Android devices, in all your applications that decode AV1. It's on macOS, it's on Windows, it's of course in Chrome, it's of course in VLC, mpv, and all the other things. So it's literally everywhere. A lot of work was done in dav1d 1.0 on frame threading. Please see the talks from Ronald from a few years ago. Wow. OK, thank you. They are about the different threading models that are inside dav1d, and dav1d 1.0 has everything in a simpler way. It's extremely fast, very fast. The dav1d 1.1 release is coming soon, soonish, with a lot of fixes, especially because there were a lot of conformance tests that we were not passing. And for some reason, they got out. And there is, of course, lots of new assembly, especially for AVX-512 and NEON. Cool. I'm going to do a demo, which is VLC.js, which is actually not in JS. So what are we talking about? Yeah. So this is Chrome. And this is why I'm on macOS and not my usual Linux, for the people who wonder. This is VLC, FFmpeg, and all the dependencies compiled to WebAssembly. And what you cannot see is that this is doing hardware decoding through WebCodecs, right? So what happens here is that what you're seeing is actually decoded on the hardware through WebCodecs. And then you take the output frame directly into WebGL, well, OpenGL ES 2, which is compiled to WebGL, and display that. And this is a 4K H.264 MP4, blah, blah, blah. OK. That's boring, JB. I can do 4K H.264 everywhere. Sure. Sure you can.
So let's do something a bit more complex. So this is the same, probably a DivX, except it's MKV. The MKV part is done in Wasm, right? It's a normal VLC demuxer. There is no JavaScript involved, right? I'm not demuxing MKV and remuxing as MP4, like hls.js. It has, of course, chapter support, because, well, what's the use of that? But also, if I find my mouse again, no worries. Yeah, you also have subtitles, which are not WebVTT, right? Normal DVB subtitles. OK, so that's not too amazing, right? So let's do something more complex. OK. 4K VP9 in software decoding directly inside the web browser. OK, that's pretty much better, right? WebM on macOS, right? So, well, yeah, but professionals, they use, like, actual formats, like MPEG-TS. Let's do that. So that's something that is ATSC over the air, right? So that's HEVC, AC-3, TS, right? All the stack that is not in the web browser. It's decoded and displayed directly in your web browser. And that's where you realize that US TV is really dumb. OK, so was that hardware accelerated or not, because it's HEVC? As you can guess, right? I can either force WebCodecs or avcodec. So now I'm going to force software decoding. And I'm going to show you something a bit more complex, which is Korean TV, which is interlaced. And the deinterlacing is happening as a WebGL shader directly inside your web browser, right? So we're decoding H.264 interlaced. Of course, we cannot do that in hardware, because, of course, the API doesn't support interlaced codecs. So we decode fully in software. And then we display directly and do sharpening and deinterlacing of those very old multicast formats without any change. OK, and I guess I got no... Yeah, I got one minute more. So this is DNxHD. Of course, the output is not 4:2:0, it's 4:2:2. And that's actually interlaced and demuxed as MXF directly. All those professional formats, you can play them directly inside the web browser.
So of course, if we can do 4:2:2 and downsampling to 4:2:0, not in software but on the GPU, I can also do 4:4:4. So this is YUV 4:4:4, 10-bit, right? Of course, BBB, right? But these are things that are absolutely not possible with any of the browser APIs. I probably also can show you that we can do other filters. And this is a 360 movie, which we support in VLC. And of course, all the projection mapping is done on the GPU. Of course, that means that everything that we do with libplacebo should, in theory, get in. And I'm out of time. Thank you. Thank you. Do we have any questions from the room? Yeah? So, a naive question. You said you have 200,000 lines of specialized code. So perhaps there is no slowdown when you flip or rotate the image or do such transforms because you have a specialized version for that. Is it so? You mean on dav1d? Or here? Oh, sorry. I cannot differentiate. So on dav1d, it's really the decoding part. dav1d is 200,000 lines of assembly just to do the decoding. It's around 35,000 lines per architecture. And we do lots of architectures. And then it gives you a decoding surface. And then we use it in VLC, in mpv, in FFmpeg, and we do things. And here, we compile it directly to WebAssembly and run that in the web browser. And all the transformations are then done in WebGL. So that doesn't change anything. Just to know whether there is a slowdown of any amount just because of those common transforms, you say? No, there shouldn't be. Another question? Can you compile assembly to WebAssembly? Like, could you compile dav1d's assembly to WebAssembly? So one of the things that we tried with Lynne, again, was what we call an assembly transpiler, where you actually take the ARM handwritten assembly and try to transpile that to WebAssembly SIMD, right?
So that you could use NEON and do exactly the opposite of what the web browsers are actually doing, where they take the Wasm assembly and compile that just in time for NEON and so on. It's a very insane project that I had the idea for a few years ago. It's really not sure whether we are going to be able to do that, because you're transpiling assembly. Like, what the fuck are you talking about? But yes, I think that's the right solution instead of rewriting again all the assembly from FFmpeg and dav1d. If you have time, please come and help us. I might actually also find some money for that, if people care. Please ask questions. I don't eat people. Yes? Last time I checked, when compiling straight to WebAssembly, everything that was threads, POSIX, everything was pretty much not yet finalized. What is the process for compiling? It works fine. But that's also why you see that I'm on macOS, right? And I'm on Chrome and not showing my usual Firefox and Linux, because in order to have threads, you need to have what we call SharedArrayBuffer, which is basically shared memory. Because in the web world, what they call web workers normally require serialization and deserialization to move data. Now this is almost done. It works mostly on Chrome. It now works on Safari. It works on Firefox, but it's buggy. And also one of the things that is annoying is the OffscreenCanvas, because you want to be able to render directly to the back buffer before displaying it, and that doesn't work correctly anywhere. And finally, the hardware decoder only works in Chrome for now. But maybe Firefox will do something, won't it? Oh, sorry. What's the size of the payload that the page has to download to get that? 25 megabytes? So the idea is that VLC has a modular approach. So there is a very small core and 400, 500, 600 now, maybe, modules. And so what I want is to actually be able to load them on demand. And that's almost working, in theory, so you can load a shared object.
So you would only stream the core, and then the core will know which modules to go and download. Yes, Steve? You mentioned that there were big patches for FFmpeg to reorganize the code. Is it easier to review them by email, or when you put them on GitLab? I'm not answering that question. Thank you. So the question was, when is the FFmpeg community moving to a sane thing, which is GitLab? Not my choice, right? Like, you know my opinion, right? VideoLAN, VLC, all the iOS and Android ports, x264, and so on, even dav1d, are on GitLab, and we like it. The FFmpeg community prefers to stay on email. I think it's a mistake, because we are losing too many patches since it's very difficult to track them, but that's a community opinion, and the community is the majority. Last question. So if you render in OpenGL in VLC.js, you're bypassing the media engine, right? So how do you do the audio/video sync? Well, of course we are. So the question is, how are you doing audio/video synchronization? Like VLC, right? Like the core of VLC. VLC was built by this guy and other guys to actually do live TV, right? So the core of VLC is a clock, and the clock is basically doing the synchronization of audio and video, resampling the audio when it's too late, and so on. So here we are doing exactly that, right? So we output audio with Web Audio. Well, no. A small part of Web Audio called audio worklets. And so we know how much is actually being played back, and then we control the vout through the clock, which is basically the core of VLC, to synchronize audio and video. And we're using that there. But I'm not using any type of Media Source Extensions or any other open media, blah, blah, blah. It's really mostly like a video game. OK. |
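As a rough illustration of the audio-driven synchronization described in that last answer, here is a minimal Python sketch. It is not VLC's actual clock: the function names, the 48 kHz sample rate, and the 40 ms lateness tolerance are assumptions made for the example. The only idea taken from the talk is that the audio output reports how much has really been played, and the video output is scheduled against that clock.

```python
def audio_clock_ms(samples_played, sample_rate=48000):
    """Playback position derived from how many audio samples were really output.

    In the browser case, the sample count would be reported by the audio
    worklet; here it is just a number passed in (hypothetical interface).
    """
    return samples_played * 1000 // sample_rate


def schedule_frame(pts_ms, clock_ms, late_tolerance_ms=40):
    """Decide what the video output should do with a decoded frame.

    Returns ("wait", delay_ms) if the frame is early, ("display", 0) if it
    is on time or only slightly late, and ("drop", 0) if it is too late.
    """
    delta = pts_ms - clock_ms
    if delta > 0:
        return ("wait", delta)
    if -delta > late_tolerance_ms:
        return ("drop", 0)
    return ("display", 0)


# One second of audio has been played; a frame stamped 1040 ms is 40 ms early.
now = audio_clock_ms(48000)
decision = schedule_frame(1040, now)
```

A real player would also correct drift, for example by resampling the audio as mentioned in the answer; that part is left out of this sketch.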
4K HDR video with AV1 : A Reality Check |
So, our next talk is about 4K video, HDR video with AV1. Please welcome Vibhoothi. Thanks. Hi everybody, so this is my first FOSDEM talk. The main idea for this talk is that I just want to give a brief introduction to 4K HDR with AV1 and to find out how it actually works. To give a brief introduction about myself, I am currently a PhD student in my second year, working on video coding, trying to learn how video works, and I have also been involved in open source multimedia since 2018, starting with VideoLAN, Xiph, and AOMedia. So what are we going to do today? Okay, this picture was captured when we were trying to play multiple 4K AV1 HDR streams at the same time, so this was a snapshot of that. The main idea is that we want to talk about the main technical challenges when we want to play back AV1 streams which are in HDR and retain the same colors as such. So, before I get into that, I just want to give an introduction to HDR. We have all heard about it, and there are multiple aspects of HDR, so let's break it down into multiple parts. The first aspect is that it has brighter pixels. Here is a snapshot of an image which is tone mapped down to a BT.709 display, so this is in SDR even though the picture was in HDR. What happens here is that you could see, I don't know if you can see the cursor, somewhere here it's super dark at about 1 nit, and somewhere here it is super bright at 1000 nits. In theory the picture is at about 4000 nits, but our display is only capable of showing 1000 nits, so it is capped at 1000. In comparison, normal standard dynamic range images are usually at 100 to 200 nits, and modern flat panels go to about 500 nits or something like that. In theory HDR can go up to 10,000 nits, but most displays can't do that. The second aspect of HDR is that it has more
bits. So this is a representation of the same brightness in 8 bits in SDR and in HDR, and you can see that there is a lot of quantization happening when it is in SDR. If you want to show or code something like 100 or 200 nits, you can get away with something like 8 bits, and when you want to go to something like 1000 nits, 8 bits is not enough; we need more than 8 bits. So that's one aspect of HDR: we need more bits. Then we need a mechanism to combine both the nits and the bits, and that's how the transfer function was born. What happens is that the human eye is quite sensitive to dark and mid-gray areas, and we are okay with bright spots. Keeping this in mind, different standards bodies and organizations created something called a transfer function, where they try to spend more bits in the dark and low mid-gray areas and have a coarser quantization for bright areas. That's how the transfer function was born, and in HDR one of the common ones is perceptual quantization, PQ, which is based upon human perception of banding. I'm not getting into details. The main idea of an HDR transfer function is this, and there are multiple transfer functions, but the core idea is to spend more bits on darker and mid-gray areas. Then comes the color. Color is complex, I don't know what color is. To keep it simple and short, display technology has evolved quite a lot since the standardization of SDR, which was back in the 90s or early 2000s, and now display technologies are capable of displaying more brightness and more colors and things like that. If you see this diagram here, to keep it simple, this shaded region is the standard dynamic range, the SDR range, and what happens is that in the HDR domain, or rather not in HDR but in the wide color gamut domain, it is expanded to be something like this, so
it can have wider colors. To give a quick background, you remember the picture which we were talking about earlier. In this picture, we actually checked how the color distribution is spread out; this was manually measured with the help of a color library. You can see that the trees and the green areas in the picture are actually beyond BT.709. In this plot, the red triangle is the standard range used in SDR and the green one is BT.2020, and you can see that the greens are in the wider region. The main idea, to keep it simple, is that reds and greens have a wider range and blues do not change much. That is one of the main reasons why most of the HDR pictures and videos which you see might have vibrant colors with reds and greens. Now comes the question, where do you find these HDR sequences? Again, these are tone mapped sequences just for representation, so in practice this may not be how they look. There are a bunch of sequences. Netflix has released some sequences as open content which is publicly available, and some other broadcasters have also published some of them, which is pretty good. And lately, maybe early last year, the Academy Software Foundation, partnered with American cinematographers, also produced proper cinema-grade material with different color spaces, which is also available. I have not tested that; I found it last week. Right, now let us come back to AV1. So I have given a brief idea about HDR. Now, talking about AV1, JB has given a brief idea about the current storyline of AV1 decoding. To have a quick refresher for people who do not know, AV1 is a current video coding standard developed by the Alliance for Open Media around 2018, and it claims to be around 30 to 50% more efficient than its predecessors. And that is around 200K lines, so it was an old number which I wrote,
so for video decoding and playback dav1d is there; dav1d, from VideoLAN, is quite popular and has been released, and even major browsers and operating systems support AV1, including Apple. I do not know how to use AV1 on Apple, and I do not know if anyone was actually able to use it there even though Apple ships it. So that is the storyline, and it sounds good. What, then, is the actual problem when it comes to HDR and AV1? We see that playback works. The problem is that playing back HDR videos with correct metadata and signaling them correctly on a display is not actually easy. Breaking it down into three: first, on macOS, we know that the display support is there and the latest macOS supports signaling, which is good, but the video output drivers and native playback on macOS are a bit problematic. Most players try to do tone mapping, and others have support that is very limited, so you cannot actually visualize the videos as you want. On Linux, I believe we cannot do it because the protocols are still work in progress. On Windows, there is display and OS-level support, and players support it with the help of direct drivers, so we could drive HDR playback on Windows and it works fine. The main problem I noticed there is the display transition: when you play an HDR video, Windows tries to flip from SDR to HDR, which takes a few seconds and sometimes gives a black screen. So we took a different approach. It is probably not familiar to most people in this room, but it is the most common approach used in broadcast and in standardization bodies, in that we are just using playback
cards to play the video streams. We are using cards from Blackmagic, something called the DeckLink 8K Pro, which can send these signals with SDI as the output, and to drive the playback we are using FFmpeg and GStreamer, so this can work on any operating system for which there is SDK support. So we found a solution for playback. Now comes the question of how you display that. First, we need to display this HDR signal with little to no changes. With the playback card we can send the signal, which is good; then we need to make sure that the TV is not doing any funky things during playback, because most TVs tend to do some sort of tone mapping or play with the brightness based upon the ambient light. So we need to make sure that there is no tone mapping, that the TV is not altering the signal, and that it is calibrated as per the standard. It should also have a minimum of 1000 nits of brightness. Those are the main requirements we have for HDR testing. So we went with a reference monitor, something called Sony's reference monitor. That's a 32-inch OLED panel that is strictly calibrated as per the standard; it ticks all the boxes we want, and we can even force the signal metadata on playback, which is good even though it's an expensive guy. The main idea with the reference monitor is that once we establish HDR playback on it, we have a source of truth, a ground truth, for how the content should actually look, and then we can extend to normal consumer displays which claim to have HDR support. Now, how do we say that the signal and video we play is actually HDR? We need to confirm that
thing. First, we need to confirm the brightness part. For that, we are using a chart released by the European Broadcasting Union called the EOTF chart, which goes from 0 to 1000 nits; you won't be able to see that here anyway. The idea is that if you play this chart back in your stream, you can see the peak brightness available on your display. That's the first part of the confirmation for the streams. The second part we need to test is the viewing area over which you can see that brightness. Some consumer TVs say that they have something like 4500 nits, but in practice that 4500 nits might be only for about 2% of the screen area for a few seconds or minutes, and it goes back to something like 2000 or 1500 nits at other times. So we need to be sure of the actual area over which we can show that. For this, the European broadcasters have again released a bunch of test patterns with which you can confirm the maximum viewing area. Next, we need to confirm whether the signal you are sending to the screen is okay. For that we are just using a cross converter, which checks the existence of the signal: you pass the signal through it and it tells you whether the signal you are sending is okay. This is just to make sure that the cables are correct, because sometimes cables can mess with you. Then comes the color. This is the most important part, because it can play a lot of tricks on you and you don't know what the ground truth is. If you have the privilege of a reference monitor, then the reference monitor has a setting called the gamut marker, which essentially gives something like a zebra line
on top of the screen if it is above 709; that's the main idea. But we may not have a reference monitor all the time, so we need some other mechanism to validate this. We tried something called a spectroradiometer, a device used to measure radiance from the screen. With this device we can measure the color volume, that is, the color space, and also the brightness. You just point it at the screen that is playing the stream, at an area that we know might be in HDR, and measure; then you know whether this is in HDR space, whether the pixel is above 709 or not, and what the actual brightness is. The time it takes to capture varies with the brightness. The next important part is making sure the whole pipeline you are using is in 10-bit. This is very important and might be a bit tricky to see in real life. The main idea is that we make a custom coded 10-bit image having 1024 levels of brightness. If you send that and the whole pipeline is in 10-bit, you won't see banding and you will have a smooth gradient; if it is not in 10-bit, you may see some ramps, like this, though it's not visible here. In practice, when you send a signal to consumer displays, you may even get the color volume and the brightness right, but it can still be in 8 bits: what I'm talking about is that it can be 8-bit PQ. It's a real thing; I didn't know that before starting this. With this test you can be sure. The reason for saying this is that if you have some noise in your signal and due to TV's debanding and
dithering algorithms, and all the other things the TV is doing that I don't know about, the bands can become invisible and the image can look as smooth as 10-bit even though the video might be in 8 bits. So the main point is that, for testing HDR, we need to do at least one check for each of these things: brightness, signal, color and bit depth. The next question is whether we can extend this to consumer displays. Yes, we can, because we have now established the ground truth and it works. We can use an SDI-to-HDMI converter and play the video on a normal consumer display. But that raises the question of whether the consumer TV can actually signal the metadata, or whether the playback card or the FFmpeg drivers you use can actually transport the metadata. Modern consumer TVs can force the HDR metadata, which is okay, and if you don't have that, there is a device which we found recently, although it has been in the industry for many years, which is called Dr. 
HDMI. You plug in HDMI and it inserts the HDR metadata however we want through a web server; it's a magic device, I would say. Fun fact: it can even do Dolby Vision with 8K 60 fps, and it can even fake Dolby Vision metadata to the TV so that the TV is happy. This device costs less than 200 euros, and it actually works for consumer TVs and for old TVs that just have HDR. Most of the time, many people who do HDR demos at NAB or IBC have this in the background. So that's one part. Now comes the question of whether this is okay for HDR. With HDR, the viewing environment is crazy, and based upon the viewing environment you have a different perception of colors. The main thing is that you need to be sure of the display panel technology you are using; the tests I mentioned previously give you some confidence in that. Then comes the ambient lighting: perception varies depending on whether your room is dark, so you need to know the lighting in your room when you're watching HDR videos. The video material also plays an important role. Lastly, there is the perception of color: different people have different tolerance to color. The last important fact I need to mention is that, depending on the viewing environment, certain individuals can experience fatigue and dizziness when watching HDR. That's a true thing, and because of the super bright and super vivid colors you need to be careful when you watch HDR videos for a long time. The ITU has laid out a methodology for scientific and subjective testing environments; it's not strictly written for HDR, but if you follow it, it works. As the big picture, just keep it under 30 minutes when you watch HDR videos continuously. This is a snapshot of how we set up the scientific
testing environment, and this is how it looked when we had a person viewing it. So the main things I was talking about here are that HDR signaling and HDR metadata are different things, and that the two main aspects of HDR are the wide range of brightness, due to a different quantization scheme, and the wide color gamut. These were entirely different concepts; current HDR video has clubbed them together, and that's what the current standard says, so that it enhances the colors. The last thing is that setting up a whole subjective or scientific testing environment is non-trivial and comes with high cost, even though this was standardized about 10 years back, and the viewing environment plays an important role. Currently we are doing a set of subjective tests for HDR viewing. I have put up some references if you would like to read more, and that's it. Thanks. I would like to hear questions, and I'm still learning these things, so I could be wrong; please correct me if anything is wrong. Yeah, thank you. Do you have the slides also? On the website, all the slides, yeah, and I have also put some additional resources on how you can do these things, with some more information which I skipped here. Yes? I only have a little background in video codecs; could you explain simply what the pipeline you were mentioning is, and what it does? Which pipeline? Okay, this one, right. What I'm trying to convey here is that HDR video, like I mentioned earlier, is coded in 10 bits or more. To boil it down, TVs or some devices can decimate two bits and make it 8-bit when you play it, so you may see HDR in 8 bits. And in 8 bits you can't represent a thousand nits of brightness,
so HDR has quite a wide range of brightness, and when you have fewer bits you can't actually view that. That's the one thing which I mentioned here. Yes? HDR is 10 bits right now, but there are plans to make it 12 bits in the future, so how are you going to extend your calibration and scientific testing equipment to cope with the extra two bits? I don't know. So, he was asking: in the future HDR might become 12 bits, so how am I going to extend this whole setup for 12 bits? I think that's maybe six years away. Probably the capture card can do 12 bits, and I would probably do something similar. I need to dig into how we can actually differentiate between 10 bits and 12 bits; I actually don't know. It would be a bit tricky, but we'll find some way for sure. Yes? All right, he was suggesting we could use something like a waveform monitor to monitor the signal. That is part of something like this, and a more advanced device can have that. This thing is a cheap device, even though it's one grand, and it can show some waveforms. So yes, we can use more advanced tools to show waveforms. If you are repeating this for other setups, you just need to make sure that test patterns are used to check the peak brightness and the maximum viewing area; then you can use the spectroradiometer to make sure that the signal is in HDR. I think this is repeatable in other places even if you don't have a reference monitor, because we tried this on a normal consumer device with the same setup and it worked. Yes? Yes, I would say so. Sorry, again? So, he was asking why AV1 is better suited for HDR. I don't know if AV1 is best suited for HDR, but
it's one of the good codecs which can do better compression. Natively it can handle a lot of content with better quantization, and things are a bit better compared to predecessor codecs, so due to that nature it's okay for HDR content. But there is still a lot of work to do in AV1 to be better compliant with HDR, because other codecs have way better handling for HDR. Most codecs right now, even though they support HDR, don't treat HDR signals or HDR videos differently, so treating HDR differently is still a research and development process in codec development, and AV1 is slowly trying to do that. For example, HEVC has some extensions which can handle HDR in a different way, with the help of different quantization for colors and things like that. AV1 is slowly adding this; the current AV1 master has that feature, I believe, so it's slowly in development. All right, yes? The question is whether the power consumption is higher in HDR or not. I didn't actually measure it, but I know that when I play HDR and keep my hand on top of the screen, I can feel the heat. The OLED is really burning: even at this distance I can feel the heat on my hand, so it's really bad, I would say. Yes? He was asking why it is that much darker. I was expecting this question. I was doing a subjective test on the same thing, with a dark video and a bunch of people. I didn't include it because it requires more background and explanation. This was a dark image, and this is in HDR. In the actual setup and a proper environment you could see it clearly, but here the blacks are purely washed out. Why is it like that? I don't know; in HDR you can spend more bits, so people tend to emphasize dark videos and things like that, so I
think it's just that you can exploit the wide range of luminance and the quantization scheme of HDR when the content is dark. What I am trying to show here is a bit complex to explain: the perception of this is very hard, and under a different lighting condition the blacks and darks would look entirely different. I think it's just the nature of the content, and they just want to utilize HDR for that; I am not exactly sure how to answer. What I am trying to show here is that this is a dark video, and when people viewed it in this environment with the worst compression and the best compression, which have different kinds of noise, people thought that they were the same, even in the proper environment. That's a whole different storyline, so it's a bit hard to answer why it is like that and how we can deal with it. Probably another time I could explain this more clearly. Thank you Vibhuti. Thanks. |
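As a footnote to the transfer-function and bit-depth discussion in the talk above, here is a minimal sketch of the perceptual quantizer (PQ) curve, using the constants published in SMPTE ST 2084. The function names and the full-range integer coding are assumptions made for illustration (real video normally uses narrow-range code values), and this is not the speaker's tooling. It estimates how large one code step is, in nits, near a given brightness for 8-bit and 10-bit signals.

```python
# PQ constants as published in SMPTE ST 2084 / ITU-R BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # PQ is defined up to 10000 nits


def pq_encode(nits):
    """Map absolute luminance in nits to a PQ signal value in [0, 1]."""
    y = max(nits, 0.0) / PEAK_NITS
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2


def pq_decode(signal):
    """Map a PQ signal value in [0, 1] back to luminance in nits."""
    ep = signal ** (1.0 / M2)
    return PEAK_NITS * (max(ep - C1, 0.0) / (C2 - C3 * ep)) ** (1.0 / M1)


def step_in_nits(nits, bits):
    """Luminance jump between two neighbouring code values near `nits`.

    Assumes full-range coding (codes 0 .. 2**bits - 1), a simplification.
    """
    levels = 2 ** bits - 1
    code = min(round(pq_encode(nits) * levels), levels - 1)
    return pq_decode((code + 1) / levels) - pq_decode(code / levels)


if __name__ == "__main__":
    for target in (1.0, 100.0, 1000.0):
        print(target, step_in_nits(target, 8), step_in_nits(target, 10))
```

Running it shows that the step between neighbouring codes is much smaller in dark regions than in bright ones, and that an 8-bit signal has roughly four times coarser steps than a 10-bit one at the same brightness, which is why an 8-bit PQ pipeline tends to show banding on a smooth ramp.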
VVenC & VVdeC: Open source video encoding and playback for VVC
H.264/AVC – x264, H.265/HEVC – x265, H.266/VVC – VVenC? History, current state, and ecosystem around open source VVC implementations. |
So, we continue with our next talk, again about codecs, this time about VVC, with two projects for encoding and decoding VVC. Please welcome Adam. Hi, everyone. Today I want to introduce you to VVenC and VVdeC, which are open source implementations of VVC. To bring everyone up to speed: VVC is the new codec that was finalized just over two years ago, and if you want to know one thing about it, VVC allows you to have the same quality as HEVC at half the bitrate. On top of that, it was developed by JVET, the Joint Video Experts Team, and it's called versatile because it's applicable in versatile scenarios. We have support for screen content, HDR, as we heard in the previous talk, immersive, 8K, and we can do some fancy stuff like adaptive streaming with open GOP. All right, now let me talk you through a little bit of the background of our projects, VVdeC and VVenC. Of course, those are both team efforts. They're developed by a whole team of researchers at Fraunhofer HHI, mostly in the Video Coding Systems group. If you don't know Fraunhofer HHI: modern video coding probably wouldn't be what it is if it wasn't for HHI, and HHI is part of the biggest European research organization, the Fraunhofer Society, which is a big German non-profit. About me: I'm Adam Wieckowski, I've been at HHI since 2016, I've been leading the project since 2019, and for about a year I've also been the co-head of the Video Coding Systems group. So why did we even start the software project? There was HEVC, for which the test model was HM, and HEVC uses square blocks. HM had this method of indexing blocks within the frames using the Z-index method, which is really amazing for square blocks.
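To illustrate why this kind of indexing suits square blocks, here is a minimal sketch, assuming the indexing referred to is Z-order (Morton) addressing of minimum-size units. The function names are hypothetical and this is not code from HM. An aligned square block covers one contiguous index range, so a start index plus a count is enough to describe it; a tall rectangular block does not have that property.

```python
def z_index(x, y, bits=16):
    """Interleave the bits of x and y (in minimum-size units) into a Z-order index."""
    index = 0
    for b in range(bits):
        index |= ((x >> b) & 1) << (2 * b)       # x bits go to even positions
        index |= ((y >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return index


def covers_contiguous_range(x0, y0, width, height):
    """True if every unit of the block falls in one gap-free Z-order index range."""
    indices = sorted(
        z_index(x, y)
        for y in range(y0, y0 + height)
        for x in range(x0, x0 + width)
    )
    return indices[-1] - indices[0] + 1 == len(indices)


if __name__ == "__main__":
    print(covers_contiguous_range(2, 2, 2, 2))  # aligned square block
    print(covers_contiguous_range(0, 0, 2, 4))  # tall rectangular block
```

The square 2x2 block at (2, 2) maps to one unbroken range of indices, while the 2x4 block starting at the origin is split into two separate ranges, which hints at the kind of workaround code needed once rectangular blocks are allowed.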
And then they were exploring VVC based on the exploration model, which was still based on HM, except that VVC supports rectangular blocks, so more than only square blocks, and there was really a lot of code for working around this Z-index thing. At HHI, we wanted to do even more than that. We started work on our own partitioner, and we decided that this was not going to work. We basically had to write our own software to deal with it, which we very creatively named the NextSoftware, and which later became VTM 1.0. The biggest difference is that we had one big map that mapped a position within the frame to an object describing the current coding block. So the NextSoftware became the VTM, which is the reference software for the VVC standard. You can see in the graph how the VTM developed over time with regard to the gains over HM, the encoding time and the decoding time. Here you can also see how we started our implementation projects from VTM. From VTM 3.0, super early, we already started work on VVdeC, and then from VTM 6.0 we started work on VVenC. Then in early 2020, Benjamin Bross became my boss. We basically started the VCS group, and he brought up the idea that maybe we could make the projects open source, which we initially did at the end of 2020 under a somewhat shaky license. With some back and forth with the headquarters, we were able to change it to a modified BSD 3-Clause license, and after some more back and forth we now have an unmodified, standard open source license, the BSD 3-Clause Clear. All right, let's talk some more about the projects, some hard facts. As I already mentioned, they are based off VTM. They are both written fully in C++, but we do have a pure C interface, so you can integrate them into frameworks or just use them from pure C code.
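Returning briefly to the data-structure change described above, the "one big map" from frame position to coding-block object can be sketched as follows. This is only an illustration under stated assumptions: the class names, the 4-sample granularity and the `mode` field are hypothetical and are not taken from VTM, whose actual structures are written in C++ and are considerably richer.

```python
from dataclasses import dataclass

UNIT = 4  # assumed minimum block size in samples (hypothetical granularity)


@dataclass
class CodingBlock:
    x: int
    y: int
    width: int
    height: int
    mode: str = "intra"  # placeholder for whatever describes the block


class PositionMap:
    """Maps any sample position in the frame to the block covering it."""

    def __init__(self, frame_width, frame_height):
        self.cols = (frame_width + UNIT - 1) // UNIT
        self.rows = (frame_height + UNIT - 1) // UNIT
        self.cells = [None] * (self.cols * self.rows)

    def add(self, block):
        # Assumes block position and size are multiples of UNIT.
        for uy in range(block.y // UNIT, (block.y + block.height) // UNIT):
            for ux in range(block.x // UNIT, (block.x + block.width) // UNIT):
                self.cells[uy * self.cols + ux] = block

    def block_at(self, x, y):
        return self.cells[(y // UNIT) * self.cols + (x // UNIT)]


if __name__ == "__main__":
    pmap = PositionMap(64, 64)
    pmap.add(CodingBlock(0, 0, 8, 32))   # tall rectangular block
    pmap.add(CodingBlock(8, 0, 32, 8))   # wide rectangular block
    print(pmap.block_at(4, 28), pmap.block_at(39, 7))
```

With a map like this, a neighbour lookup is a single array access regardless of block shape, at the cost of storing one reference per minimum-size unit.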
Those are C++ projects, so they are object-based, but it's kept very simple. We try not to hide anything, with no getters and no setters, and to have full control over what is happening in memory. Contrary to some other projects, we do not do assembler at all. We do vectorization only using intrinsics, which has the advantage that we get things like ARM support for free through the amazing SIMD Everywhere library. We also support Wasm cross-compilation, so WebAssembly. And we try to make those projects as simple as possible: we only expose options that are use-case relevant, while for the coding options, everything connected to efficiency, we try to define them for the user so that the experience is as simple as possible. They're both available on GitHub under the BSD 3-Clause Clear license, as I already mentioned. All right, so how do we do the development? The development is done internally within HHI. We have our own main Git repo, from which we push squashed updates to the GitHub repo. Why is it internal? I know many people might have issues with that, but here on the right you can see, or maybe you cannot see, a typical merge request that I would do on the internal repo, and I think this is just not something that is ready to be released to the public. I would rather hide those kinds of hiccups, and it just takes too much time to make things nice for being public. I also think not everyone at HHI might be comfortable being a public developer.
But all of the stuff that we have internally eventually goes to the public repo, either for new releases or to fix bugs or issues; if we develop a big new feature, we push it, and if someone were to make a large contribution, we would of course push to rebase the code. All right, VVdeC, the decoder project. The highlights: it's fully compliant with the Main 10 profile of VVC, give or take a few bugs. It was actually supposed to be fully compliant since version 1.0, but stuff happens; we find bugs, we fix the bugs. Basically there is no feature that is really missing. It's multi-platform, so it works on all the major OSs, and it also works on different architectures thanks to SIMD Everywhere, as mentioned. We have x86 support and ARM, and I also saw people doing builds for RISC-V, which is very nice. One thing I'm kind of proud of is that from the first version we have had a unified thread pool. This was already talked about: all the tasks are collected within this one thread pool, which just balances itself. There are no frame threads or slice threads, which also means our multi-threading is independent of bitstream features. We don't need tiles or slices to be able to parallelize, and I think this is a really nice feature. All right, let's look a little bit into the development history. On the left is the performance graph for different resolutions and different coding conditions, like random access or all intra. The major milestones: 1.0, with full compliance with the standard; in version 1.3 we did a three times memory reduction, and I was asking myself how we could ever have released any earlier version, but at least we managed to make it better.
In 1.4 we got a major performance boost based on external contributions on GitHub, which was really nice to see. In between, we had a lot of small improvements, and as you can see in the graphs, it does add up. All right, about VVenC. VVenC is the encoder project. Of course, it's way more complex and way more interesting. It has way more degrees of freedom: the decoder just does one thing and has zero degrees of freedom, since the standard tells us exactly what to do, whereas the encoder has all of those choices to make. Anyway, what I like to say is that it's basically the best open source encoder out there. As you can see here on the right, this is runtime versus efficiency: for a given runtime, we have the best efficiency, and for any of our working points, we get that efficiency the fastest. Of course, it's not the best encoder if you want to encode UHD live; we're not quite there yet. But at those slower working points, it really doesn't get better than that, at least not that I am aware of. All right, we have five presets on the encoder, from faster to slower. This is a single-threaded graph, and you can see more or less how those presets compare to x265 in orange. We have very efficient multi-threading, at least up to eight threads, or up to 16 threads with some additional options; I'm going to talk about this in two slides. And we have a very good optimization for the human visual system based on the XPSNR metric, which I'm also going to mention a little bit later. We have really excellent rate control, with bitrate deviations rarely more than 2% and almost never more than 5%. As I mentioned, it has a simple interface, and it can be used for academic, amateur and commercial purposes; the license really allows it all. All right, so we have those five presets.
How do we derive those presets? From an academic point of view, this is actually done in a very interesting way. We take all options of the encoder that are not use-case related and do a large-scale optimization. We start with a very simple, very fast working point, and then we try to derive which option should be changed next. This is always the option which gives, incrementally, the next best working point. It is a huge optimization that takes a lot of compute time, so we don't do it often, maybe every two or three versions; or, if we know one tool got implemented much better, we can try to optimize only for this tool within the option space from the last optimization. We target HD and UHD natural content, but we do sanity checks for other resolutions and for screen content or HDR. The one issue that we still have is that at the beginning here you can see that our curve gets a little bit steeper. We cannot go too fast, because at some point, for every two times speed-up, we're just losing too much efficiency. This is because we started from the reference software, which was designed for readability, and the efficiency is still work in progress. All right, about the multi-threading. Our multi-threading is, I would say, done differently than in many other encoders. We do the multi-threading over CTUs, the modern macroblocks, and CTU lines; we simulate wavefront parallel processing without using the syntax, and we parallelize independent frames: if two frames are independent of each other and their references are already done, we can process them in parallel. This of course means that how much we can parallelize depends on the number of CTUs available, which is always larger at high resolutions or with smaller CTU sizes.
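The dependence of parallelism on the CTU count can be made concrete with a small scheduling sketch. It is an idealized model under loudly stated assumptions, not the VVenC scheduler: every CTU takes one time step, a CTU may start once its left neighbour and its above-right neighbour are done (a classic two-CTU wavefront lag), and the CTU sizes 128 and 64 are picked purely for illustration.

```python
import math


def ctu_grid(width, height, ctu_size):
    """Number of CTU rows and columns needed to cover the picture."""
    return math.ceil(height / ctu_size), math.ceil(width / ctu_size)


def wavefront_steps(rows, cols):
    """Time steps to process all CTUs with unlimited threads.

    Assumed dependencies: left neighbour and above-right neighbour
    (the last column depends on the CTU directly above instead).
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:
                above_right = min(c + 1, cols - 1)
                ready = max(ready, start[r - 1][above_right] + 1)
            start[r][c] = ready
    return start[rows - 1][cols - 1] + 1


def average_parallelism(width, height, ctu_size):
    """CTUs processed per time step on average: an upper bound on useful threads."""
    rows, cols = ctu_grid(width, height, ctu_size)
    return rows * cols / wavefront_steps(rows, cols)


if __name__ == "__main__":
    for size in (128, 64):
        print(size, ctu_grid(1920, 1080, size), average_parallelism(1920, 1080, size))
```

In this model a full HD picture keeps roughly twice as many threads busy with the smaller CTU size as with the larger one, and a single frame on its own cannot occupy many threads, which is consistent with combining the wavefront with frame-level parallelism.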
That's why the faster and fast presets, which have smaller CTUs, parallelize a little bit better, which you can see on the top right in the full HD parallelization efficiency plot. After eight threads it doesn't exactly saturate, but there might be better ways to utilize the resources than to just enable more threads. How can we improve the scaling? We can improve it by enabling normative features of VVC: either tiles, that is, independent regions within the picture, or the normative wavefront, which allows us to kill one dependency within the encoding. This is on the bottom right: if we enable those additional features, the multi-threading scaling gets much better, but it costs between three and five percent of bitrate overhead. Still, even with those features, the encoder is not ready yet for more than, let's say, 32 threads, so if you have a really big server, the encoder will not be able to utilize all the cores. About our optimization for the human visual system: it's based on XPSNR, a new metric that a colleague at HHI developed. It has a really high correlation with MOS on some public data sets. There are publications you can look up, mentioned on the bottom left. It has been contributed to FFmpeg as a filter, and I think it is somewhere in the FFmpeg backlog waiting to be looked at. On the right you can see a lot of graphs. At the JVET meeting before the last one, there was a verification test with actual human subjects, where VTM was tested with the new compression technology, and we submitted VVenC to be tested alongside it in the slow preset. The slower preset has around the efficiency of VTM.
The slow preset is objectively five percent behind, and as you can see in the graphs, VVenC in orange matches or outperforms VTM, which means that our visual optimization is able to at least close this five percent gap, if not add more in terms of subjective visual quality. So that's really nice. All right, VVenC in practice. Everyone asks in the end what kind of FPS you can achieve. We did some encodes on mobile-workstation kind of computers, using all defaults, which means eight threads. For HD, relative to live, faster takes around four times live, so around 15 FPS, and medium around 30 times live, so around 2 FPS; I'm taking HD at 60 FPS as live. For UHD, faster takes about 15 times live, fast 30, and medium, well, let's just say medium would only be of interest for large-scale VOD encodings. But FPS also depends on many other factors, like bitrate, content and your actual CPU, so these are more like ballpark numbers. Excuse me, is this HDR content? It is not HDR content, but it's 10-bit, so it should be roughly the same for HDR; we usually only test with 10-bit. It works for 8-bit, but it's kind of 10-bit native. All right, also some version history for the VVenC project. Our first major milestone was 0.3, where we added frame threading, and you can see on this multi-threaded efficiency versus speed graph that it was a huge leap for us. In 1.0 we added the pure C interface, allowing integration into pure C frameworks; in 1.4, scene cut detection; in 1.5, arbitrary intra period, so it doesn't have to be aligned to anything.
We added a fast decode preset, and in the newest version, the thing that I really like about it is that we added the ARM support, and, you know, every version, we had improved ray control, things also to, you know, great community feedback, and, you know, from one version to another, your encode might be 10% faster, but if you had a ray control problem and it got fixed, it's going to be, you know, like, way, way better. So this is, like, this hard-to-quantify improvements are actually one of the most important ones. And you know, you can see how the curve behaves, extrapolating, I'm sure we're going to get even faster and better in the future. All right, about the ecosystem and the community, of course, you know, this is raw video encoding and decoding, this is really only of academic interest, right, like, it doesn't really bring you anything. That's why we have been looking into FFMPEG support for a long time. There was an open access paper over one and a half years ago that described how to do it for the decoder. There are some patches in the pipeline with FFMPEG. We also put in our wiki how to apply them manually, if you, you know, if you want to build it, you know, I've talked about this a lot, but the thing is, you know, if it's not in FFMPEG, it doesn't exist. That's why, you know, we put up a, like, how-to for you on how to do it. All right, about playback, once you get this FFMPEG with VBC included, you can just link whatever player uses FFMPEG as its back-end and it's going to work. As far as I know, for VLC, you might force it to use the FFMPEG as the demuxer. Not sure about it. I didn't test it myself. It comes, like, from community feedback. I know there are some exo-player integrations going around, which allow you to, you know, to use it in Android apps more easily. We have a toy web browser example, but it's nothing like the VLC.js. It's really a toy example. 
And for maxing and demaxing, you can just use GPEG, sorry, for maxing, dashing, you can just use GPEG since the version 2.0. And I think it also needs to be linked to an FFMPEG with VBC support. All right. For the community, we do our open to external contributions and we wish to get some. That's also why I'm talking to you, to get some interest. So we try to, you know, make this more easy by, you know, stating that the authors of VVNC are also retained at copyright. We don't have, like, a contributors agreement, so this is, like, the only way we can make it happen. We are interested in all kinds of contributions, you know, efficiency, speed-up, and, you know, we're going to, throughout the review this, test, generate, result, whatever is needed, give proper feedback, march, so, yeah. Please do if you're interested. To conclude, you know, if you just entered the room, I talked to you about our open source implementations of the VVC standard. And I'm looking forward to, you know, contributions, also results, tests, if you want to try it out, and general feedback. Thanks a lot. And a question in the room, yeah. What confused me, because I know a little bit about the backstory, so H265, right, was the super proprietary and the royalties, and the Alliance for Media Launch, the AV-1 company. Yes. And now your colleague is open source and free, right? So to recap. Why is it H266, is the free? To recap the question, the question was H265 was a proprietary codec with licensing, AV-1 is a non-properary codec, and then I'm talking about VVC, which is the successor of HEVC, so H265. And there is a small confusion, so I'm not talking about the codec as being open source, it's about the implementations of it, right, so you have to differentiate between implementations and the technology itself. There will be licensing for the technology itself, but, you know, there is still open source implementations also of HEVC, so that's kind of the way I would like to see it, you know. 
So it won't be like with MP3, but also developed, it will be completely free to use, forever? No, it won't. I mean, the software is free to use, but, you know, if you build your streaming service based on the technology, independent of which software you use, you have to pay royalties for the technology. Because you have patents, but patent issues? Yes, I mean, there is, you know, this technology cost stuff to develop, and people want to get their investment back, you know. Okay, disclaimer, I'm doing, yes, in the optimizations for video codec in general. My focus is on, and my comment to you is using a library like Cindy Everywhere, it works. It's going to give you the initial results that you want, but you will never get the optimal performance out of your hardware. We had some really, really good examples of code, particular algorithms like more instructions in forensics, a move mask type of in forensics, that are very common in popular in Intel. If you try to port them with Cindy Everywhere or some kind of abstraction layer to ARM, you're going to have to emulate this behavior, so ideally, you have to provide some way to provide optimized functions for ARM or any other future architecture. Otherwise, you're just going to have your Intel layer transferred, translated to ARM. It will work, but you will never optimize your software. So to recap, to recap the comment, there was a comment that Cindy Everywhere works, gives a nice initial, you know, initial deliverable, but will never match hand optimized assembler. We are... How do you know assembler? See, see the corresponding ARM intrinsics. I mean, when I say assembler, I mean, intrinsics, so, you know, we are aware of it, and we are looking a little bit of it, you know, there are different kinds of intrinsic kernels that are implemented there. You know, sometimes you have like two pieces of memory that needs to be added up and stored somewhere else. 
There, Cindy Everywhere works really nice, but when it's like lookup tables, shuffles, it works worse. We are aware. We are looking into, you know, identifying the kernels where there's the biggest potential for improvement and doing those manually. We are looking at HDR, implementation of a ARM of VBX, okay, so I'm doing the ARM and VMT optimizations for that, and the way to do the transpose, for example, and that is completely different to the way it is, especially with, because for AVX2, you have 256 bit-wide registers. You don't have that with VM, but you have other instructions to help you try to tackle the outcome. Yeah. So my point is that if you are open to contributions with specific optimizations that you might find people to help with that, but if you restrict yourself to only using the library and only use Intel intrinsics and then translate that one using the library, you might lose some performance. A lot. We are not restricting ourselves, you know, to only using this. It's just because, you know, we only have so many resources, this was the fastest way to do it. We can play it out on, you know, like actually this thing can play out VBC videos with our software, but yeah, totally. I think there is, there would need to be some changes to the build process, maybe to that like structure of the project to enable this, but yeah, this is something really very interesting and I think there will be a lot of research going on in that direction. Thanks. One last question, yeah? How does it compare, what are the advantages of AV1 in terms of compression or computation? Well, yeah, so the question is, what are the advantages of VBC or VVNG? VBC over AV1. All right. So what are the advantages of VBC over AV1? So VBC is a successor to HVBC, right? It was done by people who were really knowledgeable on how to make a standard work. So you know, it's the one thing, it just provides the additional bitrate savings, right? 
So here you still see like 20% additional bitrate savings over like the best case of AV1 and there isn't so many of these initial hiccups, right? So the HDR support just works, you know, immersive stuff just works a little, like a lot of those things, they just work. And then we can do stuff on top of that, like doing open-gop adaptive streaming, which allows you to reduce the bitrate by like another 10% on top of that, right? Like with all the other standards, I think the adaptive streaming can only be done with close-cops or with a prediction break. I would say more mature, but you know, I know there are different views of this, more compression efficiency and really versatile mature usability. Thank you, Adam. Thanks. Thank you. |
The FFV1 ecosystem
A lossless video coding format. IETF standardization, FFmpeg, MediaConch, RAWcooked |
Okay, let's start. So our next talk is about the FFV1 ecosystem, given by someone who knows it very well. Please welcome Jérôme. Hello. A bit earlier Jean-Baptiste said that we don't speak enough about FFmpeg, so we'll speak about FFmpeg a bit more, but not about a very fancy codec. The world is about lossy compression, H.265, VVC, or AV1, but we will speak about something a bit more boring. It is lossless compression. Some people need to do some broadcasting, and just broadcast and forget the content; people can watch it with a lot of compression artefacts. If it is visually lossless, or if the level of artefacts is not so high, it is fine. But for some other people, having a lossless video coding format is very important. And also something open, so no patents and so on. So FFV1 fits a lot of those needs. It is a very old technology, around 20 years old, so no patents for sure. There is also a good reference encoder and decoder. It is FFmpeg. Thank you, FFmpeg. It is widespread, so a lot of people can use it. We can embed FFV1 in a lot of different containers. The most used are AVI for all people, MP4, Matroska, MXF also. And FFV1 has a good balance between compression and speed, and also a bit about cost. There is some competition, some open formats, also in FFmpeg; they can have good compression but be very slow, or the compression is not so good. And outside of FFmpeg, we have JPEG 2000 for example, but encoders are pretty expensive sometimes and so on. But FFV1 is natively in FFmpeg. The compression is good. The speed, for a lossless format, is not so bad. So we use it a lot in some configurations, especially because with an open format and an open source implementation, it is easy to expand depending on the need we have. So with FFmpeg, we can have black and white only, or YUV, or RGB, and we can also add an alpha channel. We can expand it up to 16 bits per component.
So from 8 to 16, and also if we need 12 or 10, we can adapt a lot of FFV1 formats for being able to support every input we need. With the latest development, a few years ago, we have also a good multi-feed system in FFV1. We have checksum, so we are sure that when we store the content, we can detect if there is a problem during the storage. A lot of work was done by Michael Niedermeyer a few years ago, so he is the main developer of FFV1. So thank you to him about that. It helped a lot. So we are aware that a lossless compression is a very niche market because not so many people need it compared to the broadcasting of a lossless compression like H264 or AV1 and so on. And these people who need that are not rich, but it is pretty important for them to have a lossless compression. And also, actually, it is also for us important because a lot of users who need lossless compression is working in archives, so for our heritage, in 101,000 years, we need to have the content. And for that also, it is important to keep the content as best possible. So no compression artifacts. It is the reason these people want no compression, no lossy compression. And in this niche market from archives, some entities discovered that paying for FFV1 development inside FFMPEG, for example, is a lot less costly than buying products on the shelves. Even if the product are there, when you buy a product, it is pretty expensive. It is a package. You need to buy a package. And actually, if we take this money and don't buy the products on the shelves and we develop a product inside FFMPEG, it is less costly. And then after that, other entities discovered that, oh, it is in FFMPEG and it is open, it is free of charge, no patent, the software is also free, open source and free, and there is a lower cost for them too. So it is very good for these people. They are not rich. They need to have this lossless compression and FFMPEG is perfect for that. 
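As an illustration of the multi-threading and checksum features mentioned above, here is a minimal sketch that builds an FFmpeg command line for FFV1 version 3 in Matroska. The function name, file names and default values are hypothetical choices for this example; the option names (`-level`, `-g`, `-slices`, `-slicecrc`) are FFmpeg's FFV1 encoder options. The sketch only builds the argument list and does not run FFmpeg.

```python
def ffv1_command(src, dst, slices=16, slice_crc=True, audio_codec="flac"):
    """Build an ffmpeg argument list for lossless FFV1 video in Matroska.

    Illustrative helper only: names and defaults are assumptions, not a
    recommendation from the talk.
    """
    if slices < 1:
        raise ValueError("slices must be positive")
    return [
        "ffmpeg", "-i", src,
        "-c:v", "ffv1",
        "-level", "3",                           # FFV1 version 3
        "-g", "1",                               # every frame is a keyframe
        "-slices", str(slices),                  # independent slices enable multi-threading
        "-slicecrc", "1" if slice_crc else "0",  # per-slice checksum to detect storage damage
        "-c:a", audio_codec,                     # lossless audio next to the video
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, printed as a shell-like command line.
    print(" ".join(ffv1_command("input.mov", "output.mkv")))
```

Building the list separately from running it makes it easy to share the exact settings as a script, which is the kind of exchange between archives the talk describes later.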
But yeah, they need to trust that this choice is sustainable, and FFV1 is in FFmpeg, but it was not a standard. So archives wanted to have more trust in the format; it is not only about code, we also need to have a standard, for example. And there was work sponsored by the European Union, a project called PREFORMA, to help archives to have checks on their files. For the documents in archives, it was easy: PDF, it is an ISO standard, fine. For images, when they do scans, it is also not so bad: they have TIFF, it is an ISO standard, it is good. But now they have a lot of AV content, and for that there was no open and standardized format. So they decided to sponsor some work on Matroska and FFV1, and PCM for the audio, so no format for that. So we worked with PREFORMA and we helped to create an IETF working group. And why IETF and not ISO or SMPTE is mainly because IETF has no paywall for the standards. So it is very useful for people who are not rich; a lot of archives are very small. They don't have the money to pay for a lot of different standards and so on. And also, IETF is very welcoming to new people, helping them to create a new standard. So thank you to IETF for that. On our side, we focused on different formats: Matroska for the container part, FFV1 for the video part, and also FLAC for lossless audio compression. So now with IETF, we are working well. It is a bit slow because we are mostly volunteers now, but we are still working inside the CELLAR working group. So Matroska is still a work in progress. One part, EBML, the base of Matroska, is already published. It is an RFC, so it is good. Now we are working on the core of Matroska. It is on the way, and then we will work on the codecs, on the tags part and so on. A lot of work is done by Steve Lhomme, thank you to him. And thank you Steve, you are there. And also for FFV1, we worked on having a standard.
So now versions 0, 1 and 3 are published; it is an RFC, so people are more confident in that. And we want to have also FFV1 version 4, but we still need more development on that. About the audio part, it is still on the way; there is an IETF last call. So the specification is nearly ready, but there is still some review by IETF to do. So Martijn is working a lot on that and we hope to have the RFC maybe next month. But for a video format, it is not only the IETF world, it is not only Matroska. We also need to have it accepted a bit more everywhere. We asked SMPTE to accept FFV1 in MXF. It is not something easy, because with SMPTE you need to register and you need to pay a lot and so on. But thanks to the Library of Congress, who wanted that in order to replace JPEG 2000 by FFV1, they worked with SMPTE and now FFV1 is officially supported by MXF. So it is not a standard, but it is an RDD, a Registered Disclosure Document, a bit like ProRes in MXF. We have a universal label for MXF available for FFV1; it is not a hack, it is registered at SMPTE. So FFV1 is not only in FFmpeg and not only in Matroska, it is also in other containers, and not only the ones used on the web. It is also used in MXF, so more for broadcasting. FFmpeg has support for FFV1 in MXF for demuxing for the moment. And we sent a patch for the muxer to ffmpeg-devel, so it is on the way to have good support of FFV1 in MXF inside FFmpeg directly. For the archives, it is not only about storing the files, it is also about being sure that the files are perfect compared to the specifications. So we work with an ecosystem; it is not only the code, where we create a file and then we put it in storage. We want to be sure that later it will be readable, because it is not just for immediate usage, it is for the future. So a lot of people need to be sure that the file is fine.
So we created a conformance shaker called Media Conch and based on the specification, from Matroska and FFV1 for example, we can say that the file is conformed or not to the specification. And before putting the file in the storage, we are sure that the file is conformed to specification so that it will be easy to read it later. But sometimes in the archive we have different inputs with very different formats and with some proprietary formats, sometimes about lossless storage. So for the archive it is very difficult to be sure that in 100 years they will be able to put to play a file with a codec only available for Windows 95 for example and it is only 32 bits and it is only for Intel CPU but maybe in 100 years there will be only ARM. So it is better to convert and for that a lot of people are using FFMpeg. So thank you FFMpeg about that. And archives, the PIM Institute and so on, they use FFMpeg, they don't develop FFMpeg but the community needs also to do some scripts. So they publish scripts about how to use FFMpeg. So we have two examples there about how archives are using FFMpeg for doing the transcoding. Another practical usage in FIM archives is that usually they receive from a scanner a dpx file. So they don't have only one file, they have one file per image. So it is totally crazy for the file system and it is difficult to store, it is very huge, there is no compression and so on. So we can help with open source format. So with Matroska, FFV1 and Flak, why not? It's still huge because it is lossless compression but dividing by two the cost of storage is good to take when you have petabytes of content. The issue is that not all workflow accepts FFV1 and Matroska. So we need something for doing the conversion between dpx and FFV1. There is also some legal commitments, it is a bit crazy but we need to have the exact same file. So if a dpx file comes in, the legal requirement is to provide the dpx file when it is requested by the state for example. 
So for the compression it is not so good, because FFV1 compresses the video but not the DPX header, for example. So we need a bigger ecosystem than FFV1 alone, because FFV1 is about the video compression but not everything beside the video content. So we created a tool called RAWcooked, and this tool fills the gap between what exists, FFmpeg, and what is needed to save the DPX header, for example. So with RAWcooked, we take the DPX header and we store it in an attachment in Matroska. So it is also useful to have an open container format, because we can expand it and so on very easily. And we send the video content to FFmpeg, FFmpeg does the compression, and then when we need the DPX again, we decode with FFmpeg, we inject the DPX header again, and we create from that the exact source file. So we have this ecosystem around FFV1 and we want to expand it, to see what we could do with that. FFV1 is good, the speed is good, but we could maybe be better. We want to extend the decoder and encoder, maybe with SIMD or GPU acceleration, but we need to work on that. And then, if we have other needs, we will create FFV1 version 4, inside the IETF system. So as a summary, code is important; for example, FFmpeg has a very good FFV1 encoder and decoder. But it is not enough, because we need to make users communicate and share their scripts. We need to have the formats reviewed by a standardization body to be sure that the format is fine. At the IETF, when we worked on having a standard for FFV1, for example, we found some bugs in the reference encoder. So it was good to have reviewers for that, and we created for that a big community, not only about the code, the FFV1 code, but also the users. And with that we show that open source can also help with niche needs and not only broadcasting and the big things like YouTube and so on. This is finished, so if you have questions.
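The round trip described in the talk (keep the header verbatim, compress only the picture data losslessly, then rebuild the exact source file) can be sketched as follows. This is a toy model and not RAWcooked's actual implementation: `zlib` stands in for FFV1, a dictionary stands in for the Matroska container, and the fixed `HEADER_SIZE` is an assumption made for the example.

```python
import hashlib
import zlib

HEADER_SIZE = 2048  # assumed fixed header length for this toy example


def pack(frame_bytes):
    """Split one image file into a verbatim header and compressed picture data."""
    header = frame_bytes[:HEADER_SIZE]
    pixels = frame_bytes[HEADER_SIZE:]
    return {
        "attachment": header,                # stored as-is, like a container attachment
        "video": zlib.compress(pixels, 9),   # lossless stand-in for the video codec
        "sha256": hashlib.sha256(frame_bytes).hexdigest(),
    }


def unpack(package):
    """Rebuild the source file and verify it is bit-identical."""
    restored = package["attachment"] + zlib.decompress(package["video"])
    if hashlib.sha256(restored).hexdigest() != package["sha256"]:
        raise ValueError("restored file differs from the source")
    return restored


if __name__ == "__main__":
    source = b"H" * HEADER_SIZE + bytes(1024 * 64)  # fake header + black picture
    package = pack(source)
    assert unpack(package) == source
    print(len(source), "bytes in,", len(package["video"]), "bytes of compressed picture data")
```

Storing a checksum of the whole source file next to the compressed data is what lets the reverse step prove that the file handed back is the exact one that came in.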
So the main crisis was about speed encoding but I had that question since before so maybe the next version could be comparable to JPEG XS? You mean maybe JPEG XL? There is an XS, yes exactly. How is it comparable to JPEG XS? There was some discussion between us and JPEG XS developers. So JPEG XS is actually a bit base of FFV1. The developer took part of JPEG 2000 and also part of FFV1 for creating JPEG 2 XS. So there was a path between, there is something from FFV1 actually in JPEG XS. JPEG XS is also open but the specification is not open also. For FFV1 the specification at IETF varies a copyright on it and you cannot modify it. But the version in our GitHub is totally open source also. It is a creative command license. So also a big difference between FFV1 and JPEG XS is that FFV1 is easier to understand about the compression mechanism and also that the specification license is open also. And for us it is very important that everything is open. Not only the compression and the decompression but also the specification. But we need to do more performance comparison between speed or compression efficiency with JPEG XS for sure. Just to continue on JPEG XS there is also the issue that JPEG XS is not royalty free. So that also is a consideration I think that is important. You need to repeat what you said. There is an issue about JPEG XS about royalty. It is not completely sure about there is some risk about patents but the JPEG XS developers wanted that it is more or less free. So there is a discussion about that. But it is based on JPEG 2000 and there was some risk of patents with JPEG 2000 about the lossless part of JPEG 2000. So there is exactly some risk about patents in the compression system. With FFV1 we are completely sure that there is no patent because we created nothing about that. I didn't really have a question but I just want to thank the FFV1 project because it was... I made a free lossless image format and it was quite inspired by FFV1. 
The sense of the entropy going and the context modeling that is going on there was inspired by FFV1. And now these things have moved from FLIF to JPEG XS, which is the other JPEG standard, where some of the context modeling of FFV1 actually is used in JPEG XS as well. So the remark was about FFV1 helped to create FLIF and then JPEG XS. So it is part... FFV1 was... maybe I misunderstood. You talk about JPEG XS and I mix up with JPEG XS. I wanted to say about JPEG XS. So sorry about the confusion. So the names are confusing. So FFV1 was the base of JPEG XL and now we need to do some comparison between what is FFV1 now and if JPEG XL can help more than FFV1. But for the moment, it is pretty important for us to have open specifications and not behind the paywall. So with IETF for us, it is very important to have that. I agree that IETF is much better in that regard than ISO, which puts all the specifications behind the paywall, which is not helpful. Yeah, one other issue with JPEG XL is the paywall. Another question? In comparison to raw video, what kind of compression ratio can someone expect? The compression ratio is between 1 to 3, between half. The average is half. Sometimes it is a third, sometimes a bit more. But the average compression ratio is divided by two, the storage cost. Is the algorithm itself extensible? In the future, let's say you have a new compression algorithm that gets better ratios, can you switch that? We could switch. FFV1 is based on the range code for the moment. But if we find something better, we could switch. There is a discussion for FFV1 version 4 about what to use instead. Is there a theoretical limit on how much you can divide the file based on the information of, for example, the video? Is there a limit about the compression ratio? It is more that the content itself, we need to avoid to lose a pixel. So if you have a black frame, it is very, very small. So the range code, you can store in two bytes. 
If it is only black, it will be repeated and no bits are consumed for that. But this is not a use case we have in reality. How does the output file size compare with other codecs that use high-quality parameters? Compared to lossy compression, you mean? A lot. The files are, if you have... |
AVX512 in FFmpeg |
We'll talk about AVX-512 in FFmpeg. He's also the co-organiser of this dev room. Please welcome Kieran. So yes, I'm going to be talking about AVX-512 in FFmpeg. What is AVX-512? AVX stands for Advanced Vector Extensions. There will be a lot of acronyms and jargon, unfortunately, in this one, but I will try and explain all of them. So AVX-512 is a relatively new single instruction, multiple data (SIMD) instruction set, on Intel CPUs from about 2017 and, more recently, in the last six months or so, on AMD CPUs. In particular, it has a larger 512-bit register size. Many new instructions, which we'll talk about in a minute. Comparisons, which are quite new, and also lots of other things that are not so interesting in multimedia: cryptography, neural networks, and I'm sure there are other people at FOSDEM who could talk a lot more about these kinds of things. As I mentioned, lots of fancy words, but the thing to bear in mind is that in FFmpeg, high schoolers have gone and written assembly. This is heavily jargon-centric. It sounds complicated, but actually quite a reasonable chunk of the assembly in FFmpeg has been written by people who are in high school. Why is this relevant now? I've mentioned AVX-512 has been around since 2017, so why talk about it in 2023? Well, Skylake was the first CPU generation from Intel to have AVX-512 support, but it had very large performance throttling when you used these instructions, so your effective CPU speed went down quite dramatically. This was fine if you were doing high-performance computing in academia, for example fluid dynamics, where you were using these instructions 100% of the time. But multimedia is a mixture of assembly and C code, where you're not necessarily always using these instructions. So AVX-512 has relatively remained sort of unused for the last couple of years. You could still use these new instructions, though, with the smaller register sizes, and I'll show an example of this later.
But the first Intel CPUs not to have throttling were the Ice Lake 10th and 11th gen Intel CPUs. They were the first to have no throttling, and this meant these ZMM-based instructions could be first-class citizens. How to get started? One of the tricky things in the last few years has been actually getting access to devices that have this, and unfortunately Intel have not made it easy. From their 12th generation, they have actually removed support in consumer CPUs. It's still available on AMD Zen 4 CPUs, though. And if using the cloud is your kind of thing, it's also available from many cloud providers in the server CPU range, such as AWS or others. Personally, I think the easiest way is to buy an 11th generation Intel NUC. That's what I did for FFmpeg. I bought two of them for the project and host them. It's only a few hundred euros. It's quiet, it fits under your desk. And that's the easiest way to get started; you get a full AVX-512 stack. So let's look at some of the existing work in multimedia that's using AVX-512. And probably most importantly, we had the introduction from JB earlier today, the dav1d project, which is an AV1 decoder. This added AVX-512 support, I think, a year or two ago. It's particularly beneficial in AV1 because AV1 has large block sizes in comparison to more traditional codecs like H.264 and others, which are smaller. So AVX-512 in dav1d gave, I think, 10 to 20% overall. So not just the functions themselves; the overall decode performance was improved. And it's actually been a running topic today, which is quite interesting: in FFmpeg, and dav1d, we use this classic FFmpeg, x264 approach to assembly, which is no intrinsics, no inline assembly, no special SIMD libraries to make life easier. It's raw assembly language, and I'll show some examples of that. And we also don't compile them in and force you to have a particular CPU generation.
And I know this is quite controversial. I think it's MongoDB, for example: they at one point required a particular CPU generation, and this was super controversial because not everybody had that. So what we do in FFmpeg is we detect CPU capabilities, and I'll show you the function in a minute. And then we use function pointers, so we set them once at the beginning, and therefore the overhead of doing that detection is paid once, and then it's the function pointers that are executed after that. And unfortunately, on Intel, there's a very messy Venn diagram of capabilities. But in practice, so far, and they may change their mind, we really care about these two things. So these are the CPU flags you get in FFmpeg. There are others, but the AVX-512-related ones are, broadly speaking, legacy Skylake, and the newer ICL, which I've put in bold, for Ice Lake. But you can see there are actually a lot of different subcategories in there. In practice, it's at the moment one or the other, but as I mentioned, Intel are very keen on adding and removing features, and possibly even charging you a subscription for certain features is one of their new ideas. So it could be that newer additions to this are subscription-based, or you buy and pay for it later, or something much more complicated. So who knows? So I guess, unfortunately, there's some sort of dependency: I can't explain a few of the topics and some of the benefits without explaining some of the backstory. So historically, in old AVX, you had the 256-bit registers, and these were split in practice into lanes. So in practice, you've got two 128-bit lanes, and instructions, broadly speaking, operated within these lanes. So if you ran an instruction, it worked on the data in its lane, and it was actually quite difficult, possible but difficult, to move data between these lanes.
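The run-time dispatch described in this passage (detect the CPU capabilities once, then call through function pointers) can be sketched as below. This is only a model of the idea, not FFmpeg's actual code: the function names, the table layout and the flag strings `"avx2"` and `"avx512icl"` are illustrative assumptions, and all three implementations are plain Python stand-ins that return the same result.

```python
def add_bytes_c(a, b):
    """Portable fallback, always available."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def add_bytes_avx2(a, b):
    """Stand-in for a hand-written AVX2 version (same result, faster in reality)."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def add_bytes_avx512icl(a, b):
    """Stand-in for a hand-written AVX-512 (Ice Lake class) version."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def init_dsp(cpu_flags):
    """Choose implementations once; callers only ever use the table afterwards."""
    dsp = {"add_bytes": add_bytes_c}
    if "avx2" in cpu_flags:
        dsp["add_bytes"] = add_bytes_avx2
    if "avx512icl" in cpu_flags:          # later checks override earlier ones
        dsp["add_bytes"] = add_bytes_avx512icl
    return dsp


if __name__ == "__main__":
    dsp = init_dsp({"avx2", "avx512icl"})  # flags would come from CPU detection
    print(dsp["add_bytes"].__name__, dsp["add_bytes"]([250, 1], [10, 2]))
```

Because the choice is made once at initialisation, the same binary runs on any CPU, and the cost of the capability check is not paid on every call.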
And one of the historical limitations of the existing AVX and AVX2 code that we have in FFmpeg is lane crossing, and all sorts of trickery that essentially costs CPU cycles, that takes time, to compensate for the lanes. I have to talk a bit about k-mask registers as well. So AVX-512 has this new set of registers called k-masks, k0 to k7, and these allow elements of a destination register to remain unchanged. So, for example, underneath you could have an addition. It's a simple case, and obviously you could just add zero and it's unchanged, but you could actually use the k-mask to say: actually, I don't want the addition to be applied to these elements. I want this to be a pure pass-through. Or you could even force some of the elements to zero if you wanted to. There's a specific, I think it's a flag, that lets you do that. And there's a whole set of new instructions to go and manipulate these k-mask registers, and certainly dav1d, in particular, makes good use of k-masks. So now that I've explained some of the backstory, I think it's fair to say one of the most important instructions in multimedia, if not the most important instruction, is the shuffle. Also known as permutes, and there might be a technical difference between a shuffle and a permute. Someone might be able to correct me. There might be some mathematical difference, but these are the most important, or one of the most important, instructions in multimedia. And as you can see on the right, basically shuffles let you take various bits of data and rearrange them in any way that you want. Duplicate them, as you can see, or even set individual elements to zero. And famously, one use case of this is the zigzag scan in FFmpeg, which groups larger coefficients in a block together. But the way that that's done is via a zigzag scan. The thing about vpermb, which is the new AVX-512 instruction, is that it lets you cross a lane.
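To make the two ideas above concrete, here is a small model in Python of k-masked addition (merge and zeroing behaviour) and of the difference between a shuffle confined to 128-bit lanes and a permute that can cross lanes. Registers are modelled as lists of bytes; the function names are made up for this sketch and real code would of course be assembly.

```python
def masked_add(dst, a, b, kmask, zeroing=False):
    """Byte-wise add where bit i of kmask decides whether element i is written.

    Unselected elements keep the old destination value (merge masking)
    or become zero (zeroing masking).
    """
    out = []
    for i, (d, x, y) in enumerate(zip(dst, a, b)):
        if (kmask >> i) & 1:
            out.append((x + y) & 0xFF)
        else:
            out.append(0 if zeroing else d)
    return out


def shuffle_in_lane(src, idx, lane=16):
    """pshufb-style shuffle: an index can only pick bytes from its own 16-byte lane."""
    out = []
    for i, sel in enumerate(idx):
        base = (i // lane) * lane          # start of the lane this output byte lives in
        out.append(src[base + (sel & 0x0F)])
    return out


def permute_cross_lane(src, idx):
    """vpermb-style permute: an index can pick any byte of the whole register."""
    n = len(src)
    return [src[sel % n] for sel in idx]


if __name__ == "__main__":
    reg = list(range(32))                  # a 256-bit register as 32 bytes
    want_last = [31] * 32                  # "broadcast the last byte everywhere"
    print(shuffle_in_lane(reg, want_last)) # lane 0 can only reach byte 15
    print(permute_cross_lane(reg, want_last))
```

With the in-lane shuffle, the first 16 outputs can never contain byte 31, which is exactly the kind of limitation that needed extra instructions to work around before lane-crossing permutes were available.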
This wasn't something that was possible before. And as I'll show you later, this makes things substantially faster in many cases. pshufb is probably one of the most commonly used instructions in all of open source multimedia. If you do a git grep for pshufb, your screen will be full of pshufb. It's used everywhere in open source multimedia. pshufb had a useful property: if you set the index to minus one, it would automatically do the zeroing out. With vpermb, this isn't the case. You have to actually use k-masks to do that, so that makes things slightly more complicated. There are all sorts of other interesting permutes that AVX-512 offers. I think dav1d, again, makes good use of this vperm2b, so you don't just have one set of data, you can actually permute from two different registers. So you could have i, j, k, et cetera, in a different register, and your output could be a mixture of both of those. So that's kind of interesting. Variable shifts. You now have variable shifts. I've given the example of vpsrlvw, variable logical right shift, and vpsllvw, variable logical left shift. Big letter soup, quite confusing. In fact, when writing this slide, I misspelt the word shift. You can have a think about how that may have been spelt. Thankfully, the rehearsals picked this up. But this word soup is exceptionally confusing, both when writing slides and when writing code, it seems. So historically, to do variable shifts, that is, just to step back, to take each element and shift it by a different amount, was quite complicated. There were various bits of trickery, various idioms that people used to try and emulate that, but they had limitations. I think, for example, shifting by zero possibly wasn't allowed in one of the various bits of trickery.
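To make the difference between the two shuffles concrete, here is a scalar Python model of both, written as a sketch rather than a bit-exact emulator. `vpshufb` models the AVX2-style behaviour, where each 16-byte lane is shuffled independently and an index with the top bit set (such as -1, i.e. 0xFF) zeroes the output byte. `vpermb` models the AVX-512 behaviour, where an index can reach any byte of the register and zeroing has to come from a separate k-mask. The optional `kmask` argument and the list-of-bytes representation are assumptions made for this example.

```python
from typing import List, Optional


def vpshufb(src: List[int], idx: List[int], lane: int = 16) -> List[int]:
    """Per-lane byte shuffle: indices only select within their own lane."""
    assert len(src) == len(idx) and len(src) % lane == 0
    out = []
    for i, sel in enumerate(idx):
        base = (i // lane) * lane          # start of the lane holding byte i
        if sel & 0x80:                     # top bit set (e.g. 0xFF) -> zero
            out.append(0)
        else:
            out.append(src[base + (sel & (lane - 1))])
    return out


def vpermb(src: List[int], idx: List[int],
           kmask: Optional[int] = None) -> List[int]:
    """Full-width byte permute: indices can cross lanes, no built-in zeroing.

    len(src) must be a power of two (64 for a ZMM register). If kmask is
    given, bytes whose mask bit is 0 are zeroed, as with zeroing-masking.
    """
    n = len(src)
    assert n & (n - 1) == 0 and len(idx) == n
    out = []
    for i, sel in enumerate(idx):
        active = kmask is None or (kmask >> i) & 1
        out.append(src[sel & (n - 1)] if active else 0)
    return out
```

Reversing 32 bytes shows the lane restriction: `vpermb` produces a true reversal, while `vpshufb` with the same indices only reverses each 16-byte half in place.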
And so if you needed a zero shift, you had to do it a different way, et cetera, et cetera. But now you have this variable shift, and it's all usable. Equally, on the left shift, the naive way of emulating a variable left shift is just to multiply, but these instructions are actually faster than the multiply, so there's still some benefit. vpternlogd: I think no presentation about AVX-512 could fail to mention vpternlogd. This instruction is literally a kitchen sink. It's quite remarkable what it can actually do. You can literally program a truth table within an individual instruction, and it could, in theory, replace up to eight different instructions. You could do a whole presentation on vpternlogd, so I thought it would be best to pick one of the simplest uses, which is a ternary operation. This is a bitwise equivalent of the C ternary operator. Each bit in each register is iterated through, and you can see, for example, the ternary operation: if that bit is set, choose this one, otherwise choose that one, and you can see the output of that. So, essentially, it's a bitwise operation of zmm = zmm0 ? zmm1 : zmm2, but on a bitwise level. And there are all sorts of other interesting things you can do, and this article is very good. It shows all sorts of interesting things you can do: bit selects, all sorts of different operations on multiple XORs, for example. So, yeah, also very interesting. So let's look at a real-world example. I don't know how well you can see that. I was hoping the dark mode would make life easier, but maybe it's made things worse. Is it the mouse? Because the mouse on the Mac is dark. But anyway, this is v210enc. It's probably one of the simplest assembly functions in FFmpeg. What it does is take three 8-bit samples from different memory locations.
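The truth-table idea can be shown with a small scalar model. In the sketch below, written in Python for illustration, each output bit is looked up in the 8-bit immediate using the three corresponding input bits as an index, with the first operand (the destination) supplying the most significant index bit. The function name `vpternlog` and the 32-bit default width are choices made for this example. With that indexing, the immediate 0xCA encodes the bitwise ternary `a ? b : c`.

```python
def vpternlog(a: int, b: int, c: int, imm8: int, width: int = 32) -> int:
    """Scalar model of a three-input bitwise truth-table instruction.

    For every bit position, the bits of a, b and c form a 3-bit index
    (a is the most significant bit) and the result bit is imm8[index].
    """
    out = 0
    for bit in range(width):
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        out |= ((imm8 >> index) & 1) << bit
    return out


# 0xCA is the truth table for "a ? b : c" on each bit:
# where a has a 1, take the bit from b, otherwise take it from c.
SELECT = 0xCA
```

For instance, with `a = 0xFFFF0000` the upper half of the result comes from `b` and the lower half from `c`.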
As part of its work, it extends them to 10 bits and then packs those three 10-bit words into 32 bits. What's interesting in this function is that we're already starting to do lane crossing that wasn't possible before. We load the Y samples, the luma samples, into the lower 256 bits. We load the U section of the chroma into the third portion of the register, or the second if zero-indexed, and then equally the same for V. And then one single vpermb can rearrange all of that in one go. This was a lot more complicated back in the olden days. pmaddubsw is some trickery that unfortunately there's not going to be enough time to explain, but essentially it's a multiply and add, and we use that to emulate a shift. And then for the second of the three elements, we need to do a dword shift, because it actually spans the middle. So then we have conflicting bits in each register. So how do we do a bit selection? I think this took two or three different instructions in the previous code. It can now be done in a single vpternlogd, so essentially c ? b : a: if bit c is set, choose the bit from b, otherwise choose it from a. And you'll see in a second that this actually provides quite a big, well, certainly a measurable speed improvement. So these are the benchmarks. I wanted to show a bit about how you can get benefits from AVX-512 even on the older hardware with the shorter existing registers. These are not scientific benchmarks; I just ran them yesterday. When you do benchmarking you should run them tens or hundreds of times, average them, do standard deviations, et cetera. But just for the simple case, you can see that the AVX2 code is around 10 times faster than the C code.
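The packing step described above (three 8-bit samples, widened to 10 bits, stored in one 32-bit word) can be written out in scalar form. This Python sketch assumes that the widening is a left shift by two and that the three 10-bit fields occupy bits 0-9, 10-19 and 20-29 of the word; the helper name `pack_v210_word` and the generic sample names are made up for the example, and the ordering of luma and chroma samples across words is deliberately left out.

```python
def pack_v210_word(s0: int, s1: int, s2: int) -> int:
    """Pack three 8-bit samples into one 32-bit word as 10-bit fields.

    Assumptions for this sketch: 8-bit input is widened to 10 bits by
    shifting left by two, and the fields sit at bits 0-9, 10-19, 20-29.
    The top two bits of the word stay zero.
    """
    for s in (s0, s1, s2):
        if not 0 <= s <= 255:
            raise ValueError("samples must be 8-bit values")
    w0, w1, w2 = s0 << 2, s1 << 2, s2 << 2   # 8-bit -> 10-bit
    return w0 | (w1 << 10) | (w2 << 20)
```

The middle field is the awkward one in a SIMD implementation, because it straddles the boundary between the two 16-bit halves of the word; that is why the talk mentions a dword shift followed by a bit select.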
And you can see that just by replacing, I think, a set of two or three different pands or various boolean functions, you can get a measurable increase with one instruction replacing three, even on the older YMM registers. But where the big gains come is on Ice Lake: you can see the C code versus the AVX-512 ICL, there's a huge difference. By using vpermb and the ZMM registers, you can already make it twice as fast as the legacy AVX-512. And if something was 10 times faster, that now becomes 20 times faster. And I often have to say that's not a multiply, that's a times. So it's a massive improvement. This was code that could, if you have a large-resolution file, take up an entire CPU core, and now it takes essentially 5% of a core. It's really tiny. What AVX-512 code is next? Anything, really, that's line-based or frame-based, such as filtering or scaling. I think the next thing we're working on is deinterlacing. Anything involving comparisons: I haven't really talked about comparisons, but there are bits of code that often need to do comparisons, and that's going to be an obvious place for AVX-512. Lots of places that do triple booleans, multiple XORs or XORs combined with ANDs, and I think it's almost always possible to replace those with a vpternlogd. Likewise, in the code base there are various idioms and trickery to emulate variable left and right shifts: multiplies for the left shifts and trickery for the right shifts. Those could be replaced with the letter soup instructions. Intel provides an official manual for all of this. It's very verbose, which is great in many cases because it gives really precise detail of how the instructions work, but unfortunately it's not at all approachable. There are a few websites that try to simplify things.
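For the "triple boolean" replacements mentioned above, the only thing that has to be worked out is the 8-bit immediate. One way to do that, sketched here in Python, is to evaluate the boolean expression for all eight combinations of three input bits and collect the results into a byte. `ternlog_imm8` is a hypothetical helper name; the bit ordering assumed is that the first operand supplies the most significant bit of the table index.

```python
from typing import Callable


def ternlog_imm8(f: Callable[[int, int, int], int]) -> int:
    """Build the 8-bit truth table for a three-input bitwise function.

    f is called with single bits (0 or 1). Bit `index` of the result is
    f(a, b, c) where index = (a << 2) | (b << 1) | c.
    """
    imm = 0
    for index in range(8):
        a, b, c = (index >> 2) & 1, (index >> 1) & 1, index & 1
        imm |= (f(a, b, c) & 1) << index
    return imm
```

A three-way XOR, for example, comes out as 0x96, and the bit select `a ? b : c` as 0xCA, so each of those chains collapses into one table lookup per bit.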
I think this website, officedaytime.com, is some kind of Japanese website, in English, that tries to group all the instructions in some kind of logical ordering, and that makes it a lot simpler to understand. Any questions? Hopefully I'll be able to answer them, but thankfully at FOSDEM there's always somebody in the room with more knowledge than you. I can't see where they are, but I did see them at one point. Thanks. Thank you. Any questions in the room? Regarding writing AVX-512 directly in assembly: there are about 7,000 instructions in AVX-512. Why? If you choose direct assembly, then you essentially might miss out on potential instruction scheduling differences between architectures. Compilers might schedule better, if you want to get a performance benefit in the future. But then you have to ship a binary for each version. Sorry, repeat the question. You would have to write in intrinsics, that's what I'm saying, in order to compile... The question is the classic question: can the compiler do a better job than a human? In dav1d, certainly, the register allocation has not been very good in compilers historically. dav1d has shown this quite dramatically, because it has its own custom ABI internally, and you wouldn't be able to do that with a compiler, come up with your own internal ABI between functions. So there's certainly 10% plus of speed gain on an individual function versus doing it in intrinsics. Some instructions are not available in intrinsics. Like always, it's a compromise. Overall, it's been the way in FFmpeg and x264 for the last 10 years, and I think all intrinsics and inline assembly are banned; there are only one or two bits left, and there's a very good reason why they need to be there. I have mixed experience about this. I agree that, ideally, assembly is better, but we had some code in intrinsics, we compiled it with the latest Clang, 15, and we saw a 15 to 20% speed increase. But did you try writing it to begin with in...
Yes, it was in intrinsics. Write it originally in assembly and compare, but it's... So, for example, some of this... Sorry, you've gone to... Some of the bit-twiddling in there, for example, a compiler would never really have the understanding to do. In fact, I did try ChatGPT, and ChatGPT at least sort of understood a few of the concepts. It's interesting because, not quite as part of the day job, I did ask ChatGPT to write this function, just to see what would happen, and it did have some vague idea of what was going on. It didn't need to sort of be helped, which is quite interesting. Yep. Is there any collaboration between the multimedia people who write the codecs and the guys writing the compilers, to tell them, look, perhaps you could target certain patterns? The question is whether there is collaboration between the people writing the compilers and the multimedia community. Yes, on ARM in particular. Is Martin here? No, Martin is not here, but Martin spends a lot of time talking to the compiler community and the linker community; miscompilations are mostly his thing. And I think there is also some sharing, mostly around the C code, if the C code is badly miscompiled or the compiler takes the wrong approach, because you can see that some versions of the compiler will really do a bad job on the C, and the assembly can be 40 times faster. I don't know if that's something you can really trust, if one day you change compiler version and a function that you thought was immeasurable is now 40 times slower than it was. And then the question from the internet is: did you have the occasion to look at RVV and SVE vector instructions for FFmpeg? Wow, that's a surprise for this person, because the next speaker is going to be talking about this entire topic. Where is the next speaker? He's over there. The next speaker, Remi, will be talking about this entire topic. Another question? Yeah, I was wondering.
So, obviously, the runtime CPU capability detection and dispatching of the right functions is desirable, but I don't think it's necessarily contradictory to having some amount of abstraction. Have you, for instance, looked into the Highway library, which is being used in some places and which tries to provide some kind of abstraction while still allowing runtime dispatch? So, the question was: have you looked into some of the abstraction libraries like Highway that try to do a sort of compromise between runtime dispatch and abstraction? I think this question was already answered, I think, two presentations ago. Not with Highway, but with a different SIMD library. There have been various approaches: liboil, is it SIMDe? Various different approaches. And again, the result, certainly in FFmpeg and x264, has been that it's written by hand. It's written once, and you know almost certainly that it's going to be usable for a long time. I didn't really talk about it, but there is a lightweight abstraction layer in x264 and FFmpeg, basically to handle 32-bit and 64-bit, and to handle other things like the different ABIs. The abstraction layer handles some of the future-proofing in that respect. There's a blog post online from Ronald, if he's here, but he's not here, that explains some of this. It's another presentation in itself, unfortunately. For your benchmark, do you know which optimization level the C code was compiled with? The question was: for the benchmark, what optimizations was the C code compiled with? GCC -O3, with varying versions of GCC. In the FFmpeg test suite, there's all sorts. For GCC there's a whole range, depending on the build OS, from 4 to 12, I think, and maybe some people test nightly. I think Martin certainly tests nightly for ARM. I don't know if anyone tests nightly on x86. Some use LLVM as well.
But again, I would be very surprised if a compiler were able to come up with what a human wrote, because this involves bit properties of the actual packing. The trick with pmaddubsw is a kind of trick to do a multiply and a zeroing at the same time, and the compiler probably doesn't have the level of thinking to understand the bit patterns internally. Something like ChatGPT might one day, which would be quite interesting, but I don't think the compiler does. The last question. I'm just going to follow up on what you said. If you have a small algorithm, a small function of 10 or 100 lines, maybe, writing it in assembly might be easy, but if you have a huge function, like a filter, a variance filter, or something, a DCT, writing it directly in assembly might take a long time. That's why we originally write it in C, and then we try to write it in intrinsics. So the question is: a longer function might take a longer time to write in assembly compared to C or intrinsics. Yes, but there are DCTs in FFmpeg, and they're macroed, right? The steps have macros to help with that. Again, the abstraction layer also adds, I think, macros on top of what the normal assembler does in terms of macros, as the blog post explains. SWAP is kind of interesting: it lets you swap registers but then continue with them, and the layer just handles all of that internally. There are also macros for things like clipping. I think it was in the example; clip is an example. So CLIPUB is a macro, and on the right target instruction set, it will use the right clipping instructions if they're available, for example. And there's a bunch of these, I think. There's a few others like that. Thank you, Kieran. |
Scalable vector multimedia optimisations
RVV and SVE2 extension intro |
We continue with our next speaker, which is going to be kind of a follow-up of the previous one, because it's approximately the same topic, but this time about RISC-V and ARM. So please welcome Remi. Hi, good afternoon, everyone. I hope you are done with the digestion. So, yeah, this pretty much follows up on, or complements, Kieran's previous talk. But before I go into the details, obviously, I work for a big company, so I have to put up this disclaimer. And if I speak too fast or if I don't articulate properly, please stop me. With that said, who am I? I don't think it matters much, but this is my 16th time at FOSDEM, and it's only my first presentation, so bear with me. Having said that, I don't work in this field at all, so it's just a fancy thing for me to do. So, some history. Has anybody ever seen this outside a computer museum? Right. Yeah, so that's the Cray-1. It's the first vector processor. It's from the second half of the 70s. I wasn't even born back then. The point being, it's the first vector processor, and now, finally, after almost 50 years, we are maybe coming back to this approach to calculations in computers. But for people in my generation, this is more what we associate with SIMD for multimedia. This is POD, the first video game that actually used MMX, MMX being the first SIMD extension in the consumer space, let's say. So, as I said, MMX came in 1997, and that was 64-bit vectors, so you could compute over 64 bits at a time. Mind you, back then computers were pretty much only 32-bit. Two years later came SSE, and many, many versions of SSE. SSE2, which is more popular in multimedia use cases, came in 2000. I'm not going to go through all the details of SSE because there are like a billion versions. And AVX1 came in 2008.
AVX2, which Kieran mentioned, came in 2011. That was the first to have 256-bit vectors. Then AVX-512, which was the topic of the previous presentation, officially came in 2013. But as Kieran mentioned, the only real, proper CPUs were only out in 2017. On the ARM side, the first SIMD was actually 32-bit only, on ARMv6, in 2002. That doesn't really seem to make sense, but that's because it's basically calculating on 4 times 8 bits or 2 times 16 bits at a time. Then 128-bit came. There was no 64-bit SIMD on ARM. 128-bit came with ARMv7, so Cortex-A8, usually called NEON, officially called Advanced SIMD, in 2005. And on ARMv8, it's pretty much the same. Now, it's not actually compatible between 32-bit and 64-bit, but it came with ARMv8 in 2012. It's also officially called Advanced SIMD, and it's also colloquially known as NEON. As for RISC-V, RISC-V is much more recent. There is no SIMD. The problem, and this is only a short summary, there are way more extensions, especially on the x86 side, is that every damn time you have to rewrite your assembly, and as the questions and answers in the previous talk, and even the talk before that, covered, this is kind of time-consuming. So, with that said, this was all fixed-size SIMD. What about variable-length SIMD, which is what we will be talking about today? How would you go about doing it? Well, the simple way is to just ask the CPU what its vector size is, and on RISC-V, this is how you do it. It's a control register read operation; the register is called vlenb, for vector length in bytes, and it will store in, in this case, t0, whatever, it's one register, the number of bytes in a vector, and with that you could then iterate. If you want to know the number of elements, you have to do a right shift: if you want 32-bit elements, you divide by 4, so shift by 2 bits.
You could do it like that. You would take your C loop, you would convert it into assembler to operate on however many elements at a time, then you would probably unroll, if you have space in your vector register bank, to try to hide the latency a little bit, because usually, if you operate on only one data set, you will have inter-instruction latencies which are going to hurt your performance. So you typically, in multimedia, unroll twice, so you work over two sets of vectors in parallel, and when you have done all of that, you will be working on, say, 32 elements at a time, so you have to deal with edges, because you might not have a multiple of 32 elements to deal with. And that's fine, and that's one way to do it. In fact, last I checked, that's how Clang and LLVM do the auto-vectorization on RISC-V, if you have enabled auto-vectorization and you have enabled the RISC-V vectors: it literally starts by reading the vector length, and then it deals with edges and unrolls twice. But that's not really how you want to do it. But before we go on to how to actually do it, what vector lengths are we dealing with here? As mentioned earlier, common values are 128 to 512 bits. Both ARM and RISC-V guarantee that even with a variable vector length, it's going to be at least 128 bits, and it's also going to be a power of two, which is convenient for the calculations. As far as I've seen, there are announcements for real hardware with 256 and 512 bits that you should be able to buy at some point in the near future. More crazy stuff: I've actually seen designs being announced with 1,024 bits.
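The "ask the CPU for the vector size, then loop and handle the edges" approach described above can be sketched in scalar Python. This is only a model: `vlenb` is passed in as a plain number instead of being read from a control register, the function names are invented, and the "vector" work is a simple multiplication. It shows the shape the talk argues against: a main loop over whole, unrolled vectors plus a separate edge loop.

```python
from typing import List


def elements_per_vector(vlenb: int, elem_bytes: int) -> int:
    """Number of elements in one vector, given its size in bytes.

    For power-of-two element sizes this is a right shift, e.g. by 2
    for 4-byte (32-bit) elements.
    """
    return vlenb // elem_bytes


def scale_fixed_width(data: List[int], factor: int, vlenb: int = 16,
                      elem_bytes: int = 4, unroll: int = 2) -> List[int]:
    """Multiply every element by factor, the fixed-width SIMD way."""
    step = elements_per_vector(vlenb, elem_bytes) * unroll
    out = list(data)
    i = 0
    # Main loop: only runs while a full group of unrolled vectors remains.
    while i + step <= len(data):
        out[i:i + step] = [x * factor for x in data[i:i + step]]
        i += step
    # Edge handling: the leftover elements need their own code path.
    for j in range(i, len(data)):
        out[j] = data[j] * factor
    return out
```

With 16-byte vectors, 4-byte elements and an unroll factor of two, 11 input elements are handled as one pass of 8 in the main loop and 3 in the edge loop.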
I don't know if they're going to store all those bits in the physical register bank, but it would be interesting if it happens. And I have seen theoretical designs at 4,096 bits, and by theoretical I mean that there are actual schematics of how you could build the chip, and they even have simulations of the performance the chip would get on certain algorithms. As to whether it's actually practically implementable in an existing industrial process, I don't know. I'm not a specialist in electronics, but that sounds a little bit questionable. In theory, at the instruction encoding level, you can have up to 2 to the power of 16 bits, at least on RISC-V. I'm not sure about that, actually. So, how are you supposed to do variable-vector-length SIMD, or vector processing, as it's often called? Practically, vector and SIMD are synonyms. Well, first you have to use predication, which is highly prevalent in variable-vector-length scenarios. Now, it's not a completely new concept. Kieran mentioned the k-masks in AVX-512, so AVX-512 also has predication, but with variable vector lengths it's really essential, because the programming model for loops on variable vector lengths is essentially built on predication. And that's true for both ARM and RISC-V. So, a predicate is a vector of booleans. It's like the k-mask on x86; it's called the p vector on ARM, and it's the mask vector on RISC-V. And as Kieran said, I'm kind of repeating, it just specifies which of the elements in your vector you will be loading or modifying or storing in a given instruction. So, if it's a load instruction, which values you load from memory and overwrite in the register; if it's a store instruction, it's the other way around: which values in memory are going to be overwritten versus which ones are going to be left as they are.
And if it's a calculation instruction, vector to vector, then it's going to affect which results are actually stored into the register versus which ones are just discarded. So, on ARMv9 or SVE, one way you would typically do your SVE loop, instead of, say, your NEON loop, is that you would start by initializing, say, X10 to zero, and then... You have to imagine here that you have your actual NEON or SVE loop. You have this funny instruction, which is actually called WHILELT or WHILELO, and you initialize the p0 vector in this case, which is one of the predicate registers, to say, essentially, how many elements you still have in your input data. Here, we imagine that X0 is the number of elements that has been given to this function, and X10 is the count of how far we've got, so it's our iterator. Essentially, what we say is: as long as the number of elements we still have left is larger than the size of the vector that the CPU can handle, we just set the predicate to all ones, so we use the full size of the vector for our processing. And once the number of elements left is more than zero, but strictly less than the vector size that the CPU can handle, then we start ignoring the values at the end of the vector, so we'll have a bunch of ones, and then at the end, a bunch of zeros. And this is how you abstract away the complexity of dealing with the edge in your loop. By doing this, you don't actually need, at any point, to know how many elements you're dealing with in any iteration of your loop, because it's all hidden away: the size of the vector and the size of the predicate are matched, so you don't actually care.
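The predicated loop described above can be modelled in a few lines of scalar Python. This is a sketch of the idea, not SVE code: `whilelt` here is a plain function returning a list of booleans, `lanes` stands for however many elements the hardware vector holds, and the function names are invented for the example. The point is that the loop body never asks how wide the vector is, and there is no separate edge loop.

```python
from typing import List


def whilelt(index: int, count: int, lanes: int) -> List[bool]:
    """Model of a 'while less than' predicate: lane k is active
    when index + k is still below count."""
    return [index + k < count for k in range(lanes)]


def predicated_scale(data: List[int], factor: int, lanes: int) -> List[int]:
    """Multiply every element by factor using one predicated loop."""
    count = len(data)
    out = list(data)
    index = 0                              # the iterator (X10 in the talk)
    pred = whilelt(index, count, lanes)
    while any(pred):                       # loop while any lane is active
        for k, active in enumerate(pred):
            if active:                     # inactive lanes touch nothing
                out[index + k] = data[index + k] * factor
        index += lanes                     # advance by one full vector
        pred = whilelt(index, count, lanes)
    return out
```

Running it with `lanes` set to 4, 8 or 16 gives the same output for the same input, which is what makes the loop independent of the vector length.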
And you also don't need to deal with edges, because even if there are one or two or three or four elements left over at the end, you can just deal with them in the last iteration, which, of course, will be a little bit less efficient than using the full size of the vector, but it's still much faster than having a separate edge case, if only because you will not be stressing the instruction cache of the CPU. So that's predication. Now, unrolling. If you think about it, all that I just said about predication doesn't really work with unrolling, because now you've set your predicate vector to count down how many elements you still have in your total count. You can't unroll, because you've said, oh, I have 10 elements left, I'm going to use 10 elements in my vector, but if you had, say, one and a half vectors left, you would want one predicate with all the bits set and another predicate with half of the bits set. This doesn't really work very well. And, yes, now, it's a bit of a hot take. Maybe I will never again be allowed to write FFmpeg code after this, but just don't unroll if you use variable vector lengths. Now, there may be cases where you can unroll because you naturally have some kind of parallelism in your algorithm, but the idea of vector processing is that we have higher latency and larger vectors, which, in the end, result in higher throughput, and we shouldn't need to unroll. I'm sure you will find actual designs, real hardware, real processors, where it will be faster if you do unroll, and how much you need to unroll will depend on the design. And, of course, if you are trying to squeeze the very last MIPS out of a given specific piece of hardware, then maybe you should unroll, but I think, generally speaking, at least you shouldn't start by unrolling.
And another interesting thing to keep in mind, which I kind of already mentioned on the previous slide, is that you don't have alignment issues. One common problem with SIMD instruction sets is that the load and store instructions require over-aligned data, typically aligned to the size of the vector, which is very inconvenient when you're operating from C or C++ code, because usually the C or C++ allocator will only align to whatever the ABI specifies, which on RBA, it would be 16 bytes for the stack and 8 bytes for the heap. With both SVE and RISC-V vectors, at least, the alignment needed is only the alignment of the element, not the size of the vector. So if you are operating on, say, 4-byte data elements, then you only need your vectors to be aligned to 4 bytes, which is a very nice property, especially for the edge cases. And you also don't have to deal with the case where one input is perfectly aligned and the output is not, where you end up with this weird mismatch and have to deal with different edge cases; it's really a mess. With vector processing, you don't have that, so you don't actually have to worry about it. So, with that, we've covered the generalities. How does it look on the ARM side, and then we'll see the RISC-V side. I thought about putting everything together, but then it becomes a huge mess. So it's going to be a bit repetitive, because, of course, there are a lot of similarities. SVE came about five years ago, a little bit more than five years ago; I think it was announced in late 2016, if I recall correctly. It was pretty much useless for multimedia. It was explicitly meant for other things, like scientific applications, or engineering modeling and that kind of stuff, well, HPC, and so nobody used it. At least nobody in this room used it.
This was fixed with SVE2, which is sometimes called ARMv9, because it kind of comes with ARMv9, but it's really called SVE2. It fixed that issue, with the realisation that, actually, this is a good idea: this programming model is also interesting for multimedia and for crypto, which was also missing from SVE1. And so what they did is they looked at which mnemonics were missing and added those, and it's pretty much the same mnemonics; you just add the predicate register. So this is, of course, a little bit more complicated, but as I mentioned, you just use a while instruction, which will then set up your predicate, and you have to pick the element size so that, of course, this adds up correctly, and then you have a new set of branch conditions: first element, last element, and so on and so forth. So the remaining elements will be determined by the predicate register, and the predicate register will set the condition flag, and the while instruction will also subtract. There is a count, the number of processed elements from your output register. And yeah, at this point, stop pretending that I'm at risk. How do you detect this stuff? There's a preprocessor macro; otherwise, as usual on ARMv8, you have a bunch of privileged registers for the OS to look at, and on Linux you also have a bunch of flags in the auxiliary vector, so it's all classic. On another OS, you're out of luck. Availability: as we said, SVE came in 2016, but it didn't really work for us. SVE2 was specified in 2019, but the real hardware came early last year, so Cortex-X2 and all the other cores for the DynamIQ-110. So the Samsung Exynos 2200, with the Cortex-X2 and all the others, does have SVE. Unfortunately, it's only 128-bit vectors, and it's pretty damn expensive, but if you want to do it, you can find the hardware. So, RISC-V: it's a different model. Can I add? Yeah. There's also the Alibaba one, the Yitian. Yeah, maybe. It's possible, yes.
It's only available in China, but it's available. So on RISC-V, the predication is a little bit different: they have a separation between the element count and the actual predicate. And so in practice in multimedia, maybe not in dav1d, but usually, you don't use the predicate at all; we will instead just count the elements. This is the instruction you always find at the beginning of the loop, which configures the vectors. In this case, what we say is that we have a certain number of input elements, and as output we want to get the number of elements the CPU will deal with in this iteration. We then have to say the size of the element in bits, in this case, for instance, 16 bits. Then the group size, which is kind of free unrolling: if you set it to 2 and you say you want to use vector 8, it will use vector 8 and vector 9 at the same time, for instance. And tail mode: we always set it to agnostic because we don't really care about the tail, and mask mode we also always set to agnostic. There might be use cases where you need to do something else, which might be a little bit slower, but usually you don't. This is about how to deal with the elements that are masked off or that are at the end of the vector, which we don't care about. Usually you don't care about them, so you just tell the CPU you don't care about them. One cool thing about RISC-V: the floating-point registers are separate from the vector registers, unlike on ARM, so you have more registers available if you have hybrid calculations between the scalar and vector sides. But do mind the floating-point calling convention when this happens, otherwise you will screw up your register state and confuse your CPU. The interesting stuff also about RISC-V: they have segmented loads and stores, which are similar to structured loads and stores on ARM, but they can do up to 8 structures, whereas ARM only does up to 4.
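The element-counting loop described above can be modelled as follows. This Python sketch makes a simplifying assumption: the configuration step returns the smaller of the requested element count and the maximum the hardware can hold, which is vector length divided by element size, times the group size. Real hardware is allowed to pick a different value in some cases when the request is only slightly larger than that maximum, so treat `vsetvli` here as an illustrative model with an invented signature, not a specification.

```python
from typing import List


def vsetvli(avl: int, vlen_bits: int, sew_bits: int, lmul: int = 1) -> int:
    """Model of the vector configuration step.

    avl       : number of elements the application still wants to process
    vlen_bits : hardware vector register size in bits
    sew_bits  : selected element width in bits (e.g. 16)
    lmul      : group size; 2 means two registers act as one vector
    Returns the number of elements handled in this iteration.
    """
    vlmax = (vlen_bits // sew_bits) * lmul
    return min(avl, vlmax)


def rvv_scale(data: List[int], factor: int, vlen_bits: int = 128,
              sew_bits: int = 16, lmul: int = 1) -> List[int]:
    """Multiply every element by factor with an element-counting loop."""
    out: List[int] = []
    pos = 0
    remaining = len(data)
    while remaining > 0:
        vl = vsetvli(remaining, vlen_bits, sew_bits, lmul)
        out.extend(x * factor for x in data[pos:pos + vl])  # "vector" work
        pos += vl
        remaining -= vl
    return out
```

With 128-bit vectors and 16-bit elements each iteration covers 8 elements, or 16 with a group size of 2, and the last iteration simply gets a smaller count instead of a separate edge path.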
What is much more interesting perhaps is strided loads and store where you can say, I have this register X which contains a value and that's going to be my stride. So for instance with that you can put the width of your video inside one register and you can load all the pixels in a column in an instruction without having to do weird shuffling and whatever. Does that actually perform a practice? I think that's going to depend on the design, but normally it should be in the data cache which should be okay. So I'll come to that. Yes, on the downside you don't have transposer or zipping instructions, which should be annoying, which is kind of the same, so you have to replace it with strides. So it's fine if you want to take every second element from one vector and so on. Feature detection, they have very, very detailed pre-processor feature flags. I mean you can download the slides if you're interested. On the other hand, on runtime detection it's pretty crappy. You have to trust the device tree node. So you have to trust the boot loader to actually tell it to your OS correctly in the device tree data structure and otherwise there is a flag in there. So the V, the Vth bit, so the 21, because V is the 22nd later in the alphabet is a vector flag in the auxiliary vector for hardware capabilities on Linux. Availability, unfortunately at this time there is no hardware. Ali Baba, sorry, T-Head has made hardware available but it's implementing version 0.71 from about 18 months before the standardised specification which is implemented by Clang and GCC. So you can kind of work with that and it gives you some idea of the performance but you're going to have to rewrite stuff because it's not completely bit compatible so it's kind of annoying. I don't know when the stuff is going to happen. I'm pretty sure it's going to happen but I would guess by the end of this year we are going to see hardware available. 
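The bit arithmetic behind that vector flag is easy to check. The sketch below assumes the convention described in the talk, one bit per single-letter extension counted from 'A'; the helper names are hypothetical, and reading the actual hardware-capabilities word from the auxiliary vector is left out because it is platform-specific.

```python
def extension_bit(letter):
    """Bit mask for a single-letter ISA extension: 'A' is bit 0, 'V' is bit 21."""
    return 1 << (ord(letter.upper()) - ord("A"))


def has_vector(hwcap):
    """True if the hardware-capabilities word advertises the V extension."""
    return bool(hwcap & extension_bit("V"))
```

For example, a capabilities word with only the I, M, A, F, D and C bits set would make `has_vector` return False, while the same word with bit 21 added would return True.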
Also, I think, kind of not answering or dodging the previous question, but because we have so many different vendors on RISC-V, and I think there are more than I listed. I only listed three here but I think there are others. There might be big differences between the performance characteristics of the different vendors. These are our references. Yes, I have just a few questions. Have you heard of the SVP64 project from Libre-SOC yet, which is a kind of similar vector approach for PowerPC? No, I haven't looked at PowerPC at all. Another question that I had: with my own SIMD programming work, we often have applications that are inherently horizontal. For example, let's say you are writing a vectorized string search operation, or you're doing something like decoding JPEGs where you have these 8×8 blocks where you want to do some sort of cosine transform on them, and they have this fixed size, and depending on the vector size you want to break them up, or you maybe have to process multiple of them at the same time. Is there an intelligent way to solve this? I've had this case. The question is, when you have a naturally fixed-size input kind of block that you want to process at a time, how do you do this? Because then you actually want to have a fixed-size vector in effect, paraphrasing the question. I've had this case with the SVP64 a couple of times. One way is to just check that the vector size of the CPU is big enough and just do one at a time. If you can, try to do several at a time, because it's always going to be a power of 2, so you should be able relatively easily to parallelize. Obviously the ideal situation is to parallelize. Where you will have a problem is if your dataset is larger than the vector; then it's going to become complicated for you. On RISC-V you can deal with this with the group multiplier, which will allow you to use multiple vectors as a single vector. And the last question I have is how do you realistically test vectorized routines? 
When the hardware you have only supports one vector length at most, so you probably have to use some sort of emulation to set up for this? Most of the loops will not depend on it. So the question is how do you test a different vector size, for validation I guess. Most of the loops don't really care about the vector size, because if you have a simple case where you follow the simple pattern, it doesn't really care what the vector size is, except for benchmarking, of course, where you have a problem. Otherwise QEMU and Spike, at least for RISC-V, support any vector size you give them, as long as it's a valid one from the specification's point of view. Do you realistically really test for that? Or do you just say it's simply not going to be a problem? I mean, personally, when I've had the situation where I had a fixed-size input and I had to test with different vector sizes, I tested with different vector sizes; in most cases you don't actually care. I mean, then it's a matter of choice of how you do your testing and how strict you want to be with the validation, I think. Thank you. Do we have one more question now? Firstly, a disclaimer: I'm related to the Libre-SOC project with SVP64. It's similar to RISC-V vectors but not exactly the same, but they share a lot of the common ideas. You mentioned a very good point, that SIMD is not vector processing. We had to try to port some code from NEON to SVE2 and it was less than optimal, let's say. We had to revert back to the original C algorithm. |
Using the FIM (Fbi IMproved) Universal Image Viewer
A scriptable and highly configurable, yet minimalistic image viewer for X, the Linux framebuffer, and Ascii Art, for command line users |
We can start if you're ready. So we go on with our next session and our next speaker, Michele, who will talk to us about the FBI Improved image viewer. Please welcome Michele. Thanks. So welcome to my talk about using the FIM image viewer. Around 2006, I was a user of FBI, the image viewer for the Linux framebuffer. And I was fond of it, but at some point I thought, I really need to have the Vim arrow keys in FBI. And I made a patch to FBI. At some point, I started realizing that I needed something more, like some shortcuts, bindings, or a small command line, commands, at some point a parser, auto-completion. So, hack after hack, a fork came out of FBI, which I called FIM. So something which takes inspiration from the Vim text editor, the Mutt mail user agent, and shell languages. So what is FIM nowadays? It's a UNIX tool. A UNIX tool for one task. One task, which is viewing images. It's not editing images. Many people confuse this. It's for command line users, people who like using the keyboard. It has a configuration file because it's nice to configure custom commands. It uses regular expressions, standard input, standard output, and it plays nice in scripts. So it's highly interoperable. The motto is, like in Perl, that there is more than one way to do it. I think FIM plays well with old hardware. There are functionalities for caching files or images from files, to load them in advance via prefetching, so as to spare a bit of IO. And the user interface is quite minimal, so there are no menus, no buttons, sorry, at the moment. And it works at the moment with four graphical output styles. So pixels with X11, pixels with the Linux framebuffer, and ASCII art with and without color. And it plays a bit nice under SSH or screen in different situations. So on this picture here you see a pixel mode, a character mode, and another character mode without the colors. 
So you select this when you start the program, or you just let it be auto-detected from the environment variables. The basic invocation of FIM is more or less what you expect from most programs. So you specify the files you want to open, and in the case of graphical files, the magic number will determine which decoder to use, not the file extension. However, if you want to open a directory, or recursively a tree of directories, or perhaps even with a background load function, then filtering on file names will occur. Again, it's quite intuitive what the plus, minus, page up, page down keys do. So it's what you expect, and this is good. And of course the binding is dynamic, so you can configure FIM to do different things. The defaults are for plus to call the magnify command, the internal magnify command, and for minus the reduce command. Or, apart from commands, you also have small actions which can be longer; they can be like a concatenation of, let's say, command and argument, or even a small control flow expression. And yes, so it's quite rich what you can assign to single keys. So in general, this language, which I show here in these red boxes, lives in the command line. The command line which hosts this language you can also access with the colon key, just like in Vim. And just like in Vim and other software, with the tab key you might get some auto-completion of, I don't know, commands, variable names. It's not science fiction, so it can be helpful. Yeah, and this is the same language that you also use in the configuration files and scripts. So that is the FIM language. The language elements of FIM are commands; aliases, which you can customize; variables, built-in or customizable; if and while blocks, so as to have a bit of control; and some special syntax like shortcut expressions or shortcut statements for some other precise things. How do I use FIM? I don't spend much time programming it or programming the usage of FIM. 
Most of the time, I use it interactively like any other image viewer, especially to organize picture collections, like I will show later. Occasionally, I use the special functionality, so what is really unique to FIM, or the command line. It's quite rare that I come up with some nice workflow which I like. Then, yes, I change the configuration file, or I even make an alias in the shell to reuse some special way of calling FIM which is customized for me. So now we will continue with this talk. I just wanted to mention that another talk, which has been recorded, will go into language-specific topics, so that is a bit more nerdy than this. This talk here is about the interactive usage of FIM. This is not really a tutorial. It's not documentation. It's a bit of a showcase, what I will be showing here. So I said FIM is programmable, but you don't want to program it here in what I'm showing you here; still, you want to use a bit of automation. And the base level of automation is perhaps to simulate a key press, right? So when you invoke FIM and specify -k and the name of a character or of a key press, that will happen. So you have pressed that key. So R will rotate. I mean, this is what will happen just after startup. Afterwards, you are in control. So R will rotate. Delete will delete the first image from the list. C-h, control H, will make the help pop up, and so on. You can go further with minus uppercase K, so with key combos. So if you specify -K ra, rotate and autoscale, that will happen as the first thing once FIM starts. So afterwards, you're in control. In Vim, I appreciate that when you are about to press a key and you prepend it with a digit or more digits, the number you have specified is also the repetition count of what is about to be done. So you have this also here. Of course, now I'm showing you here the command line, but this is the interactive usage. So if you do it interactively, this is what happens. 
It's the same interpreter who processes this. Yeah, but there's also the dot modifier in VIM and also here that instead of repeating twice a particular command, you can add a dot after what you have just done and it will just repeat the last action. So plus dot, it's like plus plus. Now, you can combine this with number syntax. So if you prepend a number to the dot, the dot will repeat the last command, that number the amount of times. This can spare you a bit of typing interactively, but also in this special mode here. Of course, this just applies to the last command, not to the last combo or last series of things. For more complicated things, you can use another mechanism, which is that of simply configuring your VIM RC file and there you perhaps bind a special, a particular keyboard key to a special command, and then yes, you can use a repetition on that combo which you like, which is what you use, what is useful to you, and that's the way to go. So not over-complicate unnecessary things. Now I will show random functionality which I like in VIM but I didn't bother looking in other image viewers. So with double apostrophe, I have the so-called shadow directory load, let's say. So my observation is that nowadays cameras have a very high resolution. I don't need that resolution. Mostly the pictures which come from those cameras are too heavy for my purposes. So what I do is that I have a directory with reductions which fit more or less my screen and I have another directory with heavy original pictures. But with VIM, I just say, hey VIM, in that directory are the heavy originals. So be aware of this. And then VIM offers me the double apostrophe, the double quote, key, which does something which I forgot what it is but you can just check it up with the help. And that will substitute the content of the images, of the current images with the high resolution or low resolution or whatever you have set it up. So it's a way to substitute it. 
Probably you can use it to create funny games or whatever but for me it's just the purpose of substituting the low resolution image with the high resolution image because I like using used computers. I think there is too much garbage on this earth and therefore sometimes I don't need that extra heavy processing in my everyday usage. And I think this can have many uses. Another thing which is I think perhaps unique maybe, it's a simple key to jump between the last view and the current view. Why? Because sometimes I watch 100 pictures, I do a selection of the few pictures I really like and I jump between them because I want to see certain detail from one side, from the other side. So I like to jump a lot of times between two pictures, perhaps to catch some detail and therefore I have this key which most of the times I would say retains the position you were and the scaling. So it's really for comparing things. I find it useful, especially in combination as I said with the short listing functionality which allows you to make selections to shorten the selection of pictures. Now another random functionality is the one of conversion pipelines. Sometimes you want to load things which are not properly pixel images like SVG files or PDFs or Postcript files. There are a few built-in defaults in FIM which will invoke that external program to convert it in something that FIM can view. So this enlarges the set of pictures formats which you can watch under FIM. Extending this idea, perhaps sometimes you want to view all of the images which you were about to load with one specific filter pipeline. Here I have shown convert with charcoal filter and put a label on the bottom. Yeah, you can specify that to FIM when you start it. And all of the pictures which you will see in that session will be filtered according that way. I don't know what this is useful for. Previews, making fun, you choose it. But the point is if you don't screw up this expression, you will not write to any file. 
So just temporary files will be modified. You can interact with different programs in different situations. Sometimes you can use the exclamation point syntax to call an external program and then with that external program that gets its output. OK, it's not that useful. Still, if you got the danger, the dangerous way, you are not afraid. You create an alias which maybe calls in the end XIF tool and you say XIF tool. Please remove the XIF data from the file which I'm just watching because you can specify that file as an internal variable. Yes, you will modify the current file which is dangerous. You should not do it. But if you really want and you are automating some nice useful workflow, you can do it. I have warned you. OK, did you know that with FIM you can even load files from a file name list? OK, this sounds boring. It sounds boring, but maybe it's nicer if you learn that with FIM you can write files with the file name and the description. OK, maybe it could be even useful in a few situations. In my case, I find it useful or I like it because FIM has a few captions in different parts of the displayed window, let's say. And there are a few variables with expando codes, so like percentage and something. So you can customize them. You can view, I don't know, the comment, the description which I said before, or other information. So you can customize it a bit the way you want it. More, you can have in this description file internal variables. So just for the purpose of giving attributes to the files you are about to have in the list. So not only the descriptions, but also attributes. So the bill will be in the category of businessmen. Aron as best, Abram and Richard will be in the category of activists. So they will inherit this, those attributes. Furthermore, there are some shortcut syntaxes which prepend text to this description or allow referring to specific variables in those descriptions for the purpose of making them shorter. 
What you can use it for apart from the caption? Well, you can use them also for searching the picture in the file list, in the list. So with the go to command or you use a special slash or question mark syntax, not command line, but search line to search and to jump directly to a file which has a description that way. So if you manage your pictures collection nicely, it can be useful. For me, it's nice because my picture is targeted that way and I'm happy with that. Or I have custom collections of pictures that way for my own amusement. Yes. So you have this go to command which you can use also in other ways like jumping or controlling it to jump according to the values of those property variables. Or you can use this go to jump between to the next directory, for instance. If you load 1,000 files in different directories where you don't know exactly where they are, you can use go to and something very specific which you find in the manual. And this will jump to the next directory inside what is being loaded there. So there are many shortcuts, let's say, for doing very specific jumps according to your workflow because this is to adapt the way you wish to organize your stuff. And of course, if you have different specifications to the go to command, the first one which matches the jump will do the jump. Before the session is over, I wish to say the limit functionality which I talk from math. I find it also useful because you can shrink the collection. For instance, I have around 20,000 pictures in my collection, in my photograph collections. But I can limit them, for instance, to city equal browsers or something like this which is useful to me. In this case, you see that we have shortened the pictures list from the file, total five to four which matches category equal activist. Further, the limit command can shrink the list according to duplicate the base file names or the date of the files or the size of the files. 
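The idea of descriptions plus per-file attributes, searched and narrowed down with a limit, can be sketched in a few lines. This is not FIM's implementation and does not parse FIM's description file format; the `Entry` type, the `limit` and `search` helpers and the file names are all hypothetical, chosen only to mirror the category example from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One file in the list, with a free-text description and attributes."""
    filename: str
    description: str = ""
    attrs: dict = field(default_factory=dict)


def limit(entries, key, value):
    """Shrink the list to the entries whose attribute `key` equals `value`."""
    return [e for e in entries if e.attrs.get(key) == value]


def search(entries, text):
    """Index of the first entry whose description contains `text`, or None."""
    needle = text.lower()
    for index, entry in enumerate(entries):
        if needle in entry.description.lower():
            return index
    return None


# Hypothetical collection, loosely following the example in the talk.
collection = [
    Entry("bill.jpg", "Bill at a conference", {"category": "businessman"}),
    Entry("richard.jpg", "Richard giving a speech", {"category": "activist"}),
    Entry("abram.jpg", "Abram portrait", {"category": "activist"}),
]
```

Calling `limit(collection, "category", "activist")` keeps two of the three entries, and `search(collection, "speech")` jumps to the second one, which is the same kind of narrowing and jumping the viewer offers interactively.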
The base idea of FIM is that you use it interactively. You have a few aliases which are perhaps sometimes customized and assigned to specific keys which you like to have. Yeah, and you write this in the configuration file and you perhaps share it with others. And you just remember by heart the commands which you use every day. And that's all. So FIM at the moment will be releasing 0.6 after 15 years, the 0.6 version. In a few days; the tarball is out there. I have to do some promotion, and especially I have to give the next version to the Debian guys. So the version on Debian and everywhere else is old, but we'll update it soon. The manual has everything. So everything is written there. And I hope you enjoy FIM and perhaps watch the other recording with more nerdy language aspects. That's all. Thank you for your attention. Thank you. So we have some time for questions. Are there any questions on the floor? Yeah, so the collection stuff sounds particularly interesting. Can you update the collection from FIM itself? So while you're watching images, can you update your collection from within FIM somehow? Perhaps, at the moment I don't have this. You write a text file in your... The question was whether FIM is, as I have written, an organizer, a picture organizer; did I write it correctly? No, what I wrote was a mistake. FIM is not an organizer. You have to organize the files by yourself with a text editor. We have an online question here, if you can read it. Thank you for the talk. Is there a way, or a plan to have a way, for FIM to script a small step-by-step animation of the actions? Maybe some sort of sleep between the actions. Yes, there is a functionality which is called recording out or something like this. So after you exit FIM, on the standard output or in a specific file, actions and commands, sorry, and timings will be spat out. 
So there is a sleep command which says sleep, I don't know, three quarters of a second, something like this. Yes, the answer is yes. Question on the floor? Yeah, good question. What about the descriptions? Are they stored in the same image files? Are they stored like metadata in other files? Can they be read by EXIF tools? What I have shown here was just the plain things you write in a text file. Apart from this, there are the EXIF tags from JPEGs, or, I don't know, I think in other places also you get EXIF data, but at least from JPEGs, and they become internal variables inside FIM, because I really like to have some particular JPEG EXIF data displayed in the caption, and that occurs there. Actually, also PNGs and also JPEGs without EXIF have comments. These also arrive here. Yeah, so there are different... I don't think I'm covering everything, everything, everything, but as soon as I learn of some extra metadata, I integrate it in the internal variables associated with each file. I have one extra slide. Any other question, maybe? Okay. Maybe I didn't understand so much, but the result of modifying this file is... Modifying which file? The original file. Is it then stored in another file? We don't modify any file with FIM. If you do it, this is a mistake. Oh, okay. There was one example where I was saying you can write programs to modify the file, but most of the time you don't want this. Okay. But if you really want it, you can. If the file is changed on disk, will FIM reload it automatically, or will it... I think there is such a functionality. The question is, if you are stuck on a picture and the picture changes, will FIM reload it? There is some functionality to detect this. I'm not sure if at the moment it's on by default. In principle, it's like two lines of code. It's easy to implement this. I think this is for picture frame situations, where many people use FIM for picture frames. I'm not happy with that. 
I wish people to use it interactively, but yes, that's possible. If you want, I have one extra slide. So with FIM, you can even play the little steganographer or the little forensic investigator by using the offset switch or the seek magic internal variable, which do nothing else than saying, hey, please, don't seek for the image at byte zero. Seek between here and here. The picture is there in the file. So you can use this for looking for the signatures within the file, which maybe is broken. Maybe there are a few files which are concatenated for some reasons. Maybe it's a TAR archive, which actually... Sorry? I have an example of this. So if you look at your Chrome cache, Chrome browser, so the cache, you have binary files, which are a concatenation of HTTP headers and image files. And right now, I'm using exit two to find the byte offset and then using image everywhere to use the file. But I would try it being... Yeah, so FIM is... It's the same thing. So the question is... The observation is actually... apart from maybe seeking into broken file systems that way, you can even just look into the certain configuration files like the cache files, like the one from the Chrome browser, because there, actually, there are some special custom file formats where a proper file is pushed down into another file. Yeah, there are encodings, let's say, which simply you have a picture, but it cannot be immediately seen. But with this, you can... Functionality, which jumps or seeks, or a file, perhaps, with a signature, can locate it. Yeah. Okay, last question. We don't have any online. Question on the floor? Yeah? Okay. Okay, thank you. Thank you. Yeah, I need to get back. Bye, man. |
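Looking for an image at some offset inside a larger file, as in the browser cache example from the questions, comes down to scanning for known signatures. The sketch below is a minimal illustration of that idea in Python, not what FIM does internally: the function name and the choice of only two formats are assumptions. The JPEG and PNG magic numbers themselves are the standard ones.

```python
# Well-known leading bytes of two image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def find_image_offsets(data, start=0, end=None):
    """Return sorted (offset, format) pairs for every signature found in
    data[start:end]. Each offset is a candidate start of an embedded image."""
    hits = []
    for name, magic in SIGNATURES.items():
        position = data.find(magic, start, end)
        while position != -1:
            hits.append((position, name))
            position = data.find(magic, position + 1, end)
    return sorted(hits)


# A made-up blob: HTTP-style headers followed by two concatenated images.
header = b"HTTP/1.1 200 OK\r\n\r\n"
blob = header + SIGNATURES["png"] + b"xx" + SIGNATURES["jpeg"] + b"\xe0"
```

Here `find_image_offsets(blob)` reports a PNG starting right after the header and a JPEG ten bytes later. A signature match is only a hint, since the same bytes can occur by chance inside other data, so a real tool would still try to decode from each candidate offset.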
Merging Two Worlds - Broadcast and WebRTC |
So let's start. Our next talk is about two very interesting worlds, broadcast and WebRTC. Please welcome Dan. Thanks everyone. So yes, merging two worlds, broadcast and WebRTC. So yes, I'm Dan Jenkins, I've been doing stuff with WebRTC for probably just over ten years now, and I guess very recently I've been more involved in, air quote, broadcast, and there are the ways that you can talk to me: email, Twitter, that thing that's about to die, and Mastodon. So merging two worlds, broadcast and WebRTC. So I guess to talk about WebRTC and broadcast, we need a few definitions of some of the things we're going to talk about. So WebRTC, what is WebRTC? How many in the room know about how WebRTC really works? Hands up. Okay, about 25% of you, so that's good. So WebRTC is encrypted by default, sub-second glass-to-glass, open source completely, two-way communications, no defined signaling, which is kind of a good thing, and it's got a load of required codecs that you have to implement to be compliant. I use the word compliant; I don't think anyone's actually going, you're not compliant, but hopefully they will one day. It's embedded in every single browser, so that's the key thing here: every single phone in every single pocket can do WebRTC without having to download anything, and that is the key thing. You can use your own codecs, but you can't use them from within a browser today. You could build something natively and use your own codec, but that takes away some of the magic. Delivery is over UDP, and it's got a load of NAT busting stuff with what we call ICE, and there's one main thing called libwebrtc that everyone talks to everyone else about as though it's WebRTC. It's not. There's libwebrtc provided by Google, but they don't really provide it anymore as pre-built stuff, but then there are loads of other open source independent versions available in many different languages. 
No signaling defined in the spec, so that's a big thing here, it was a good thing when it got made, when WebRTC first kind of became a thing, we'll come back to that later. So then there's SRT, SRT is a secure reliable transport, how many people in the room use SRT or know how SRT works? Probably about 25%, but it was a different 25% than before mostly. So again, it's open source, it's used heavily in the broadcasting industry, it is UDP based, it really requires native apps, no browsers involved at all. It can be encrypted optionally, most people use it encrypted, but you can use it without and it is completely and utterly codec agnostic, but again there are usually pre-defined that people use. It can be sub-second, but usually it's not, and it can be used within, across the internet or within a LAN, NDI, how many people in the room have used NDI and know how it works? About 25%, some of the same crowd again, so network device interface, I mean who came up with the stupid name like that, by network device interface, so what am I actually doing? I hate the, I always forget what it's actually referring to, it is not open source and it comes in multiple forms, so there's pure NDI which gives you a huge amount of data, then there's HX, HX2 and HX3. Designed to work only within a LAN, yes you can make NDI work across the internet now, but it's not really NDI, it's actually using WebRTC to do some of the magic. So ultimately it's designed to work within a LAN and it's not open source, but it is free to use, but the licensing is a little bit confusing, some of the times. So yeah, it's a bit of a weird one, but again it is hugely popular, and yes it uses UDP as well. All the good things with media use UDP, right? Then there's RIST, how many people have actually used RIST, okay, how many people understand how RIST actually works? So for the recording, I don't know, 1% of the room, 2% of the room. 
So RIST is actually really quite interesting, it is reliable internet stream transport, and to be honest I've never used it, but I had to learn how it actually works to be able to confidently talk about it in front of you guys. So again it's open source, it's UDP based, it is encrypted, it's RTP based, and it's relatively new in the grand scheme of things, and it can work within a WAN or a LAN. The other forms of media transport are not worth talking about right now, because they're not really real time, and I know some of you are going to look at me and go, hmm, but bear with me. So merging two worlds, WebRTC and Broadcast, and Broadcast and WebRTC, they're two worlds that have not really come together in the past 10 years. I know lots of people from the Broadcast industry look at WebRTC as a dirty thing, it doesn't do this and it doesn't do that, but hopefully that's all changing for the better. So they can now live in harmony, hopefully, maybe. So because of something called WIP and WEP, so let's take a look at WIP and WEP. WIP stands for the WebRTC HTTP ingestion protocol, and this is how it works. I'm not going to bore you going through that, but I mean ultimately there's an HTTP request, request response, and then media flows. It should be as simple as that, right? Then there's WEP, which is the WebRTC HTTP egress protocol, kind of looks similar, right? So that's really great. And then there's this third one called, whoa, WebRTC HTTP offer answer protocol. I wish it did exist, I think it would be really cool, but it doesn't exist. I messaged the author of both WIP and WEP this morning and went, we can just get rid of WIP and WEP and just have, whoa, but no. So yeah, these strangely look like signaling protocols to me, right? And I said it was really great that WebRTC didn't have a signaling protocol, didn't I? Well it was great back in the day. It drove innovation. 
It drove many, many different applications to do things in their own way that made sense to them. They didn't have to use XMPP, they didn't have to use SIP. They could if they wanted to, and many did. Or you could go build something yourself using a JSON API, GraphQL, whatever suited you the most, right? There was no defined way of going, here's an offer and here's an answer. It was great until it wasn't. I say it wasn't because no enforced signaling protocol meant that there was a lack of industry support. So we had all of these islands that didn't really know how to talk to one another. One of those problems that Matrix is trying to solve. But ultimately we had the likes of Jitsi and Teams and Google, Google me and whatever else, all doing great WebRTC things. They're quite great, but none of them could like interrupt with one another and that was a real shame and it meant that when we came to the broadcast industry, no one wanted to say implement, oh, how do I talk to Milikast? How do I talk to Dolby? How do I talk to Flowcast, whatever, right? They were never going to implement these 10 SDKs, 20 SDKs to be able to talk to specific companies, services. So yeah, how do you use WebRTC to deliver media while implementing a different API for everyone? And the simple answer is you didn't, right? The broadcast industry as a whole didn't enjoy WebRTC for many reasons, not just lack of signaling, but many, this signaling was one of them. So you used a different protocol that would solve everything, right, whether or not it was RTMP, whether or not it was RIST or SRT or whatever. So whether or not you're a fan of WebRTC or you're not, it does have its uses. So and up until recently, you'd have to use an SDK or whatever and interrupt was really, really difficult. WIP and WEP. So yep, WIP and WEP, I'm actually going to try and get Sergio to change ingestion over to ingress or ingest, just so that it kind of flows nicely with one another. And they're both drafts in the ITF. 
Drafts are nothing to be scared of. We all know that. Some businesses don't. Some businesses look at it and go, huh, it's still a draft, haha, we're not going to do anything with that. But we all know that drafts can be a really good thing. So what actually are WHIP and WHEP? Why have you come here today to find out what they are? So WHIP. You do an HTTP POST up to a server. WHIP and WHEP are really designed around getting media to or from a server. Not another peer that's behind a firewall or NAT or anything like that. It's designed for: here's a client, here's a server, I want to put media here or I want to go grab media from there. So WHIP: you do an HTTP POST with an SDP offer, and then within the response, you get an SDP answer. And then you're done. That's pretty much it, like we can all go home now. WHEP, pretty much exactly the same. You do an HTTP POST with an SDP offer and you get an SDP answer in the response. And you're done. They are not quite identical to one another, but they pretty much are. You do one HTTP request. Well, you can also, like, do trickle ICE using additional requests and whatever else, but in its most basic form, you can do one request and one response and get media flowing. So what does that really get us that we didn't have before? Hardware encoders. For me, being able to bake WHIP and WHEP support into hardware encoders and software is the biggest thing. And yeah, the Talon hardware encoders support WHIP today, with others, I know, about to support it as well. Software support. OBS. Historically, there's been a version of OBS that was WebRTC-ified by the CoSMo team. It was very much designed around offering up to Millicast mostly, but they supported it and you could do stuff with it. But today, there's a pull request open on OBS to add WebRTC support into OBS using webrtc-rs. We've heard a lot about Rust in this room today. And that's absolutely fantastic.
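The one-request exchange described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the endpoint URL is hypothetical, the offer is a truncated placeholder, and the function names `build_whip_request` and `publish` are invented for this example. It assumes the server replies with the SDP answer in the body and a `Location` header naming the session resource, as the drafts describe; trickle ICE and teardown are left out.

```python
import urllib.request


def build_whip_request(endpoint_url, sdp_offer, bearer_token=None):
    """Build (but do not send) the single HTTP POST that carries an SDP offer."""
    request = urllib.request.Request(
        endpoint_url,
        data=sdp_offer.encode("utf-8"),
        method="POST",
    )
    # The body is raw SDP text, not JSON.
    request.add_header("Content-Type", "application/sdp")
    if bearer_token is not None:
        request.add_header("Authorization", f"Bearer {bearer_token}")
    return request


def publish(endpoint_url, sdp_offer, bearer_token=None):
    """Send the offer and return (sdp_answer, resource_url)."""
    request = build_whip_request(endpoint_url, sdp_offer, bearer_token)
    with urllib.request.urlopen(request) as response:
        sdp_answer = response.read().decode("utf-8")
        # Assumption: the server points at the session resource in Location.
        resource_url = response.headers.get("Location")
    return sdp_answer, resource_url


# Hypothetical endpoint and placeholder offer, for illustration only.
example = build_whip_request("https://media.example.com/whip/live", "v=0\r\n")
print(example.get_method(), example.get_header("Content-type"))
```

Only `build_whip_request` runs here; `publish` would need a real endpoint and a real offer produced by a WebRTC stack.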
So now you'll be able to publish to a WHIP endpoint and be able to just do it with a URL. And that's really quite cool. And then once that's been merged in, the plan is to add WHEP support, as far as I understand, and then continue adding extras around RTP headers and other specific things. But at the moment, it's a very basic pull request. I say it's a basic pull request. It's a very complicated pull request from what I understand. But you've got to start somewhere, without all the bells and whistles. There's more support. GStreamer now supports WHIP and WHEP as sinks and sources. And this is absolutely huge again. It got properly released in 1.22. It was released earlier, but obviously 1.22 is not a development release. And so you can go and use WHIP and WHEP from GStreamer today. And there's loads of platform support out there: Dolby/Millicast support it, Cloudflare support it in Cloudflare Stream, and then Broadcast Bridge, my product, also supports WHIP and WHEP. So yay, this is great, right? So using WebRTC for ingress and egress just got a whole load easier. There's now a standard. It's a fairly easy standard. And we can just kind of get on with our lives, right? So simulcast and SVC are both supported. They're called out in the draft. You can do this. Because ultimately, it's just an HTTP request transferring SDP, right? Anything you could do within your SDP, you can do within WHIP and WHEP, basically. Yes, SDP still remains. If you don't know what SDP is, it's a block of text that magically says, here's how to set up media, and it's probably at worst, I don't know, 300 lines. I haven't looked recently. It can be huge because it can tell the other side, these are all of the codecs that I support, and then the answer comes in and goes, yeah, we're going to negotiate this. So SDP is a mess, but it does its job really, really well at the end of the day.
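To make the "block of text that lists all of the codecs" idea concrete, here is a small Python sketch that reads the codec names out of an SDP blob. The sample offer is hand-written and heavily shortened for illustration (a real browser offer carries many more attributes), and `list_codecs` is a name invented for this example.

```python
def list_codecs(sdp):
    """Return {media_kind: [codec names]} from the a=rtpmap lines of an SDP blob."""
    codecs = {}
    current_media = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # e.g. "m=audio 9 UDP/TLS/RTP/SAVPF 111" starts an audio section
            current_media = line[2:].split(" ", 1)[0]
            codecs.setdefault(current_media, [])
        elif line.startswith("a=rtpmap:") and current_media is not None:
            # e.g. "a=rtpmap:111 opus/48000/2" names the codec for payload 111
            _, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codecs[current_media].append(encoding.split("/", 1)[0])
    return codecs


# Hand-written, shortened sample; not a complete WebRTC offer.
SAMPLE_OFFER = "\r\n".join([
    "v=0",
    "o=- 0 0 IN IP4 127.0.0.1",
    "s=-",
    "t=0 0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111",
    "a=rtpmap:111 opus/48000/2",
    "m=video 9 UDP/TLS/RTP/SAVPF 96 98",
    "a=rtpmap:96 VP8/90000",
    "a=rtpmap:98 VP9/90000",
])

print(list_codecs(SAMPLE_OFFER))
```

The answer side would carry the same kind of lines, reduced to the codecs both ends agreed on.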
And because it's just SDP, the same SDP that we're using with WebRTC everywhere, it gives us the freedom to be able to do whatever we want. So that's extra codecs. That's being able to just say, oh, I'm going to make a random codec that does something very, very specific to my use case. It's not going to work in a browser, but it does work for my use case. And if both ends know how to talk about that codec, then you can do that. Opus RED, for example. Opus RED hopefully will become standard in the browser. I don't think it is today. I think it's still behind a flag, but being able to put in redundant packets of audio is really quite cool. So hopefully that's going to turn up in a browser soon. But yeah, it also allows you to do RTP header extensions like DTX, et cetera. So yeah, I mean, you're all looking at me like, this is a bit boring, right? And yeah, WHIP and WHEP are a bit boring. It's not groundbreaking at all. And like, sorry Sergio, he's going to watch this later. He's the author of both, or co-author of both. Yeah, it's not groundbreaking at all, right? It's just an HTTP offer and answer. It should be called woe, right? No traction for that in the room. OK, there is some state handling in there as well, obviously. You have to keep track of things and whatever else. But I mean, it's really great. It gives everyone these two common protocols for send and receive. And that leads to open innovation and open source projects. So there are already projects out there that do SRT to WHIP. You can take in the SRT that you've been using for the past however many years, and then you can make it WHIP because your media server only understands WebRTC, right? There are projects out there, GStreamer being one of them. There's a WHIP server from the Meetecho team. There's a WHEP player from the Meetecho team. And from Eyevinn in Sweden, a WHIP server, again, and a WHIP client.
These are all browser-based tools, or command line tools, or server tools that talk to Janus for you, to make Janus understand WHIP and WHEP. So it's a great time to start looking at WebRTC if you haven't already done so in the past. I've got the same slide twice somehow. Side note, GStreamer. GStreamer is really cool. It allows you to pipe all of these things to all of these things, right? And it's really quite cool, and it even supports RTMP. You don't have to love WebRTC, and you don't have to love everything about it. I mean, I certainly don't love WebRTC every day. There are times that I literally want to throw it against the wall, but it does its job really, really well, and it is incredibly useful, and it is just another tool in the toolbox. Sometimes SRT isn't right. Sometimes RIST isn't going to be right. And sometimes you are just going to need WebRTC. So WHIP and WHEP open up all of those possibilities, and I haven't talked in detail about any of them, but hopefully you've got enough information to get going. So thank you very much, and oh, one more thing: CommCon 2023 is a conference that I run in the UK, and that's happening this year for the first time in person since before the pandemic, so expect an announcement on dates and venue soon. And that's me. Thank you very, very much. Oh, and we're hiring just like everyone else, so if you want to do software engineering in the UK around these kinds of technologies, then go to jobs.everycastlabs.uk. Any questions on the floor? [Audience question, mostly inaudible, about AWS CDI.] I'll comment. So I didn't catch all of the detail, but no, I haven't used AWS's product called CDI. I try not to use anything AWS, to be perfectly honest, because, you know, when's it going to disappear? But no, I haven't. So AWS's product does broadcast uncompressed stuff. Okay.
Is there any Matrix effort with WHIP and WHEP? So, is there any Matrix effort around WHIP and WHEP? I honestly don't know. I can't imagine so, because Matrix is designed to kind of bridge between media servers, and so they already have ties into those media servers. So they wouldn't need WHIP and WHEP, I imagine. But that would be a question for the Matrix team. I don't imagine it turning up. You had a question. You mentioned simulcast and SVC are supported. I haven't read the latest draft. Do they have, like, a way to do layer selection on the client? So the question is, simulcast and SVC are supported in the draft; does it tell you how to do layer selection? No. There's a note that says, oh, there's nothing stopping you from doing simulcast or SVC. But as far as I could see this morning, when I was finishing off my slides, there's no further detail on that. But yes, we shall see. They're still drafts. There's still time to add detail and stop things from happening that we don't necessarily agree with. Does that hardware you showed support SVC or simulcast? I don't think so. From memory, I've not actually used the hardware myself. I've just read the press releases and talked to people that are using it. As far as I'm aware, they don't. They are just sending one stream with one layer. But yes, SVC is kind of the future of having options, right? And so it needs to be a first-class citizen in whatever we're doing in the future. VP9 SVC is now released. And I think AV1 SVC is now in a Chrome release that's going to go into stable very, very soon. So we're going to have all of these options in the browser natively really soon. Any more questions? Thank you, Dan. Thank you very much. Thank you. |
The open source stack for animation movie pipelines
The tools needed to cover every step of the animation movie creation process |
Our next speaker is a specialist in the animation industry, please welcome Frank Rousseau. So hello first of all, my name is Frank Rousseau and I'm going to introduce you to animation movie pipelines. Before going further into the presentation, I'm going to introduce myself a little bit. I've been building web applications for 25 years and doing free software as a professional activity for 10 years, and I'm also the founder of a company named CGWire. We are a bootstrapped company, we are a team of five, and we make a product that is aimed at providing project management services for animation studios. This product is licensed under the AGPL. So now let's talk about pipelines. What you have to understand first is that when you make an animation movie, you have to follow a very industrial process, but all the people working on it are creative people. Their main focus is to make beautiful pictures and they don't really care about the rest. So you have to deal with those two contradictory aspects. The first thing is to describe roughly the main steps needed to build a movie. Here I'm going to take a simple example that covers the main aspects of a movie. Of course there are subtleties for every movie, but the main idea is that in every animation production you have some elements named assets that will be displayed all along the movie. Then you have the shots: every time the camera framing changes there is a new shot, so the movie is divided into many shots, and the assets will be used in the shots. When you build an asset, for example a character, you have several steps. The first one is the concept phase. It's a 2D step where you draw the idea of how the character looks. The next step consists of sculpting the character, making it in 3D. Then you have to paint it, so you have to apply textures and physical materials.
We name that step shading. The other important step is the rigging: basically you add bones to your mesh, to your 3D model, and you make a puppet of it. Then, once you have all your elements that can be animated, you are going to build your shots. First you need to have the animatic. The animatic is basically an animated storyboard, so it's a super rough version of the shot. Then you go to a 3D version of the shot, still very rough, which is named the layout. It's more of a director thing, to check that everything is well positioned. The goal of the third step is to animate the shot. Here you have the very precise animations. At this point your shot is almost done, but you have to render it, so you add some lights, you bring all the elements you need for the render into the same scene, and then you run the rendering. Once you have the rendering, you add some 2D effects on top of it. That's the compositing step, and after this step you have the final picture. So far I talked about 3D animation, but animation can be done in 2D; it's a very popular style too. Here the steps are a little bit different. We still have the concept, but then we have the design part, where basically we draw exactly the elements we need. The two main elements used in 2D are the characters and the backgrounds, because almost every shot has a different background. After that we have another step where, for every character, we describe all the poses that the character will have, so it will be easier for the animation that happens afterwards. And when we do the shots we still have the animatic step, because, if you remember what I said on the previous slide, it's already a 2D operation. Then we do a kind of rough layout and then we do the animation. The animation happens only on the line art of the elements that are moving.
Because, you know, when you see a 2D movie you always have a beautiful background and simpler elements that are moving. So we animate the line art, then we color it, and then there is a compositing step which assembles everything, backgrounds, line art and colors, and gives you the final image. So that gives you the main steps needed to build a movie, but now the question is how we build the movie. Of course we use software, and in the animation industry there is a strong culture of IP, secrecy and proprietary software. So we come from very far on this aspect, because they are really used to thinking that the images should not be stolen, that it's better if no one sees the images before the movie is released. So basically they are not really friendly with everything which is open source, but things tend to change because of several phenomena. The first one is that Python is replacing every proprietary scripting language. I didn't mention it on the slide, but FFmpeg is widely used to do video operations, and the Academy Software Foundation is maintaining many projects around pivot (interchange) file formats, which makes it easier for studios to collaborate. And of course we have Blender, which is getting more and more adoption, which is getting more and more popular and which performs better all the time. All these elements combined push the industry to change its mindset about collaboration and open software. Now there is another interesting actor, which is Krita. Krita currently is not widely used in the industry, but we can feel the same phenomenon as with Blender. There are tons of tutorials. The software works pretty well and it keeps on improving because there is a good community. So we can guess that at some point some interesting stuff will come from it. Krita is mainly used for digital painting, so it's most interesting for handling everything related to backgrounds or for doing textures for 3D.
So if we go back to our building steps, we can match our software with every step. For the concept we can use Krita, for the modeling Blender. As you will see, Blender is widely used to manage most of the steps. The shading is a combination of both, because Blender will manage the materials and can handle the texture drawing too, but Krita is a little bit better for that. For the rigging, Blender can be used. It's not awesome software for that, but they want to change everything in their upcoming releases, and if they are able to achieve what they want it's going to be really awesome. For the shots, as we will see right after, it can be used for the animatic, for the layout and, of course, for animation, where it performs very well. But there are still some steps where it's not yet the right software. For everything related to FX there is still a piece of software named Houdini that is widely used by the industry, and people won't change it for Blender right now. For the rendering step, Blender has a very nice rendering engine, but when I talk with people from the industry who do really complex images, they tell me that it's not powerful enough for the moment, especially because it doesn't have what we call scene assembly features. The idea of a scene assembly tool is to bring in everything that is needed to build a scene, add some lights, and have the capability to handle a lot of vertices and animation keys. For that there is no competing open source software; the proprietary software is much better. And we have the same issue with compositing, where there is Natron, which is a very interesting piece of software, but they have governance issues, so I don't know if it's still maintained, and the software currently is not as powerful as the proprietary ones.
About 2D there is something new: Blender recently introduced Grease Pencil, which allows you to do 2D inside Blender. It's really a huge change, because prior to that there was no efficient software to do 2D animation in the open source world. So now it allows you to use Blender on the whole pipeline of a 2D production, the same way as we use it on a 3D production. There are still some limitations. For everything vector based, people still prefer Illustrator to Inkscape. I use Inkscape personally for some design stuff and it works pretty well, but it seems that Illustrator is still better. Krita has many good things, but people told me it's still too limited to manage big images; I'm not sure, but this is what some people from the industry told me. Like I told you, Natron is nice but not as good as the competition, and there is no good software for FX and no efficient scene assembly tool. So now let's talk about the pipeline, because I talked a lot about the building steps, but there is another important part: the glue between all these steps. First, let's look at this very cute rabbit, because this presentation is way too serious. I wanted to add this. It's not made with Midjourney or similar, it's a real picture, and I apologize to the author: I totally forgot to mention them and I forgot the name, so I'm really sorry for that, but the rabbit is super cute. So now we can go back to the pipeline. The first brick that is needed to build an efficient pipeline is the production tracker. The name is very misleading because it's not about tracking, it's about collaborating: people use it to dispatch tasks, to ship deliveries and then to talk around them. They are mainly single-page applications, and they have a strong API to allow every tool to connect to them. That's very important, because there is tons of interesting data for every user in the studio and it's important to be able to access all of it.
So basically, here is what a production tracker allows. It's a very simplified version, but production managers dispatch tasks, artists do the work, directors review deliveries and send feedback to artists, they iterate together, and tools grab data from it and post new data into it. The other main element is the asset manager. Sorry, I'll explain it the other way around: you can see an animation production as a big graph. All the elements are tied together, especially because when, for instance, you do an animation, you will need several models to run the animation, and you will need specific versions of every model, and you have to represent that in one way or another. Every node can be considered a delivery, and, depending on how you want to represent things, the edges can be either the operations or the links between the elements. And of course all the elements are stored on the file system. Object storage could be used, but in reality I've never seen any studio using object storage, so everything is saved on a shared file system. Still, it can quickly get messy, so an abstraction is needed to manage all these files.
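The "production as a graph" idea can be sketched as follows. This is a minimal illustration, not how any particular asset manager stores its data: the `Delivery` class, the entity and step names, and the version numbers are all invented for the example. Nodes are versioned deliveries, and each edge records which exact upstream versions a delivery was built from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One node of the production graph: a named output at a given version."""
    entity: str   # an asset or a shot
    step: str     # e.g. "modeling", "rigging", "animation"
    version: int


# Hypothetical deliveries for one shot.
rabbit_model = Delivery("rabbit", "modeling", 3)
rabbit_rig = Delivery("rabbit", "rigging", 2)
forest_model = Delivery("forest", "modeling", 1)
shot_anim = Delivery("sh010", "animation", 5)

# Edges: each delivery lists the exact deliveries it was built from.
DEPENDS_ON = {
    rabbit_model: [],
    forest_model: [],
    rabbit_rig: [rabbit_model],
    shot_anim: [rabbit_rig, forest_model],
}


def required_deliveries(target, depends_on):
    """Return every upstream delivery needed for `target`, dependencies first."""
    ordered, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in depends_on.get(node, []):
            visit(parent)
        ordered.append(node)

    visit(target)
    return ordered[:-1]  # drop the target itself


for d in required_deliveries(shot_anim, DEPENDS_ON):
    print(f"{d.entity}/{d.step}/v{d.version:03d}")
```

Because the versions are part of the node, the same walk answers "which files, at which versions, does this animation scene need?" without anyone browsing the shared file system by hand.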
So here is an example of a graph. I'm going fast because I thought that I had 25 minutes for the presentation and I only have 20, so I'm going to go a little bit faster. Here are some examples of enforced file paths and file names. What you have to keep in mind is that this should be configurable, because every studio has a different way to store stuff, so for an asset manager it's a real challenge to be able to manage everything. The third of the main bricks of a pipeline is the render manager. In an animation production you have a ton of very long jobs to process, and for that you need a render farm that will manage everything properly, and you need a tool to be able to follow each task. And then you have a suite of small tools that are used in productions to work more efficiently. So now, because the goal of this talk is to build an open source stack for animation production: for the production tracker we use Kitsu. This is the software we develop at CGWire. It's a web application. We chose very common, very standard libraries and databases to build it. We rely a lot on FFmpeg to normalize all the videos that pass through Kitsu. Currently we don't do crazy stuff with it; the only cool thing we do is that we push every FFmpeg job into a HashiCorp Nomad cluster. But I hope that at future FOSDEM editions we will be able to talk more about how we manage everything and show you that we do super cool stuff with it. Currently we use it in a very simple way, and we chose a very standard stack to ensure that everyone can contribute and can deploy the application very easily if they don't want to use our services. And it's licensed under the AGPL. So this is how it looks: we have here the list of elements to build, and here we have the steps needed to build the elements. I won't go further on this because it would require a full presentation. Then for the asset manager there is a new tool that is emerging. Its name is AYON;
formerly it was OpenPype. Basically it allows you to do what I talked about before: it allows you to manage all the files, but in an abstract way, so artists can just manage versions and not think about how things are stored. It comes with many tools to push things to other tools. It looks a little bit like that. It's a little bit less sexy, but it's efficient too. On the left you have the file hierarchy. When you select a building step, you have all the versions of all the files, and then you have all the operations you can do on the files, which really saves a lot of time for everyone in the studio. What you have to understand is that artists are really focused on making beautiful pictures, but most of them, if they are not seniors, don't think about whether their production is usable by the next step. So these kinds of tools help them to do cleaner work that is reusable by their colleagues. There is another interesting piece of software, which is Libreflow, or Kabaret. I like it especially because it's really a community thing. It was done by a technical director in a studio named Supamonks, and another studio named Les Fées Spéciales used it, and together they built that software. It's fully open source and it's community led. The two previous tools are company led and this one is community led, so it's very cool to see that alternative emerging. For the render manager, we have Flamenco by the Blender Foundation, which is dedicated to small teams, and we have OpenCue, which is managed by the Academy Software Foundation and which allows bigger studios to rely on free solutions too. Here is an example of a full open source pipeline, well, an almost fully open source pipeline, used by Les Fées Spéciales on a feature film named Siren, which will soon be in theaters. And here, it's almost the conclusion, but here is the free and open source stack for animation. Now we have it. It's very new
for the industry, so I'm super happy to be able to show it to you today. So to conclude: now we have an almost working stack, an open source stack that covers almost all the use cases needed to build an animation movie. Of course it's mainly led by the Blender Foundation, and there are not many alternatives to Blender in the open source world, but it's great that it exists, and the work they do is really amazing. The pipeline stack is covered by what I described. People don't see that aspect and we tend to forget it. And there are still some challenges in effects, vector drawing and scene assembly, and compositing of course, but soon it will be fixed, I hope. So thank you for your attention, and if you have any questions I will be glad to answer. [Audience question, partly inaudible, about whether Blender will at some point be able to compete.] So, I don't know. I know they make an effort on that aspect. On a regular basis they add some features about this. The community is very active, so at some point it could change, but I think their focus is not on it right now. But I cannot talk for them, because I am not part of the Blender Foundation, although we talk a lot with them. Yeah, is there any other question? Yeah. I mentioned Krita, you mean, and not GIMP? Right, I mentioned Krita but not GIMP. So GIMP does not have a good reputation in the animation industry. I don't understand why, because it's very nice software, but I think that people doing very complex stuff can't find what they want in it. But from my experience of GIMP, they keep on improving it. They really improved the UI, so maybe at some point we will see more and more usage of GIMP. I hope so, because I really love this software too. Do we have similar stuff for sound? I don't know, we are only focused on images. |
Melrōse, a music programming environment
new language to program MIDI sequences |
Now let's have some music with Melrose: program and play music, by Ernest. Yes. Yeah, thank you all for waiting, it took me a bit more time to set things up, but I'm glad you're here at the end of the day, full room, and I'd like to share with you the result of an open source project I've been working on for a couple of years now. My name is Ernest Micklei. I'm a Go developer, but my day job is actually being a manager, so this is how I program in my spare time. In this presentation I want to show you what it is and what you can do with it, and if we have time I can also show you demos. So why did I start this project? I'm also a piano player, and I noticed, as you probably do, that music is really a structured language, right? We have all kinds of patterns, and Bach is one of the famous examples. I wanted to explore those expressions in some kind of programming language, and I took up the challenge to find the right functions and abstractions to actually program music. Now, I told you I'm a Go programmer, so initially I thought, okay, I'm just going to write a library and create all these function abstractions. But then I realized, okay, each time I create something like a piece of music or a pattern, I have to compile it, restart it, listen to it, no, it's not good, change it, compile it, restart it. So I quickly dropped that whole idea. I realized I needed to offer quick feedback, and so I started thinking about a different language, and then using Go as an interpreter of that language. So what is Melrose, then? It's actually a combination of a language, my abstraction of how to write music, and a runtime to play those things. Well, for instance, if I go here, where is it, yeah, that works, okay.
So yeah, this is just a glimpse, and I will talk you through what it actually means. I put this presentation in four parts: we'll talk about the language, a bit about the tool, a very short part about Go, perhaps, and of course the part where you all want to hear something, right? So about the language. I decided this would be my note representation. A note consists of several components. Just a simple thing here: you can see the C, and you can put length information in front of it, you can make it sharp or flat, change the octave by adding a number at the end, and then put in some dynamics, and I chose to use minus, minus and plus, plus to make it softer or louder. So this is how I represent a note in the language. And these are just examples of how to express them, but I hope you already see that this is not really a good or easy way to compose music, because then I have to type a lot of notes, right, and put them together. So very quickly, of course, I needed another abstraction. I call it a sequence, and a sequence is typically a collection of notes. I also added a notation for grouping, so we can have chords, right? And this is just an example of a sequence: this is a rest, this is an eighth note, and so on. I was hoping I could actually play it here. I think I can. It doesn't work, of course it doesn't work, why not? And there it is, ah, it doesn't know where to find it, I have to set the output device, you see. Sorry about that, give me a minute, I'll just restart the tool. Let's see what kind of MIDI devices there are. I think there it is, so one. Still not, why not? This is the demo effect, guys, I'm sorry, it happens. I think it is missing this one, so, yep. Are you still with me? Something is wrong with my setup, guys, I don't know what. Maybe I should do this. It doesn't see my synthesizer, and of course I can play it myself, but that's not really what I'm trying to sell to you, right?
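The note components described above (length prefix, letter, accidental, octave, dynamics) can be illustrated with a toy parser. This Python sketch only follows the spoken description; it is not Melrose's actual grammar or implementation (which is written in Go), and the exact symbols, defaults and the `parse_note` name are assumptions made for the example. The one fixed convention used is the common MIDI numbering where middle C (C4) is note 60.

```python
import re

# Toy grammar, loosely following the talk: optional length prefix, note letter,
# optional accidental (# or b), optional octave digit, optional dynamics.
NOTE_PATTERN = re.compile(
    r"^(?P<length>\d+)?(?P<letter>[A-G])(?P<accidental>[#b])?"
    r"(?P<octave>\d)?(?P<dynamic>\++|-+)?$"
)
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_note(text, default_octave=4, default_length=4):
    """Parse one note string into a pitch number, a length and a loudness step."""
    match = NOTE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a note: {text!r}")
    semitone = SEMITONES[match["letter"]]
    if match["accidental"] == "#":
        semitone += 1
    elif match["accidental"] == "b":
        semitone -= 1
    octave = int(match["octave"]) if match["octave"] else default_octave
    dynamic = match["dynamic"] or ""
    return {
        # Convention assumed here: middle C (C4) is MIDI note number 60.
        "midi": 12 * (octave + 1) + semitone,
        # Assumed meaning: 4 = quarter note, 8 = eighth note.
        "length": int(match["length"]) if match["length"] else default_length,
        # +1 per "+", -1 per "-": a relative step, not a MIDI velocity.
        "loudness_step": dynamic.count("+") - dynamic.count("-"),
    }


for text in ["C", "8F#5++", "Bb3-"]:
    print(text, parse_note(text))
```

Typing every note this way gets tedious quickly, which is the motivation the talk gives for sequences and the higher-level functions that follow.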
So maybe I will continue with, ah, there it is, I see it, sorry about that. Thank you, thank you, thank you. Any questions? No? So, yeah, this is just one of them, and soon I realized I needed more abstractions, because with the sequence I still have to write all those notes, right? So I also have progressions, and this is a progression; yeah, this is a chord sequence. The real power in the language, I think, comes from composition. I managed to find all kinds of functions that take groups and transform them into something else. You need to read from right to left. We start with a chord, which is the same as the sequence C E G, right, and then I ungroup it, so I get separate notes, then I reverse it to get this, then I transpose it and get that, and I repeat it four times. Yes, so I can do this as well: it takes the sequence, and it changes the order of the notes by this pattern, I just came up with that pattern, and makes the notes shorter, fraction, an eighth, so shorter, and I repeat and join. And the nice thing is, you can also change it; I will come later to how that works. And then I realized, ah, so I can have sounds, but I really also want to express rhythms, and then I came up with a pattern consisting of dots and bangs, where every bang represents a hit, a kick for instance, and every dot a rest. And if it's sent to channel 10, which is the standard for drums, I should play it again, if that works. Now it doesn't, why not? Maybe it's this channel. No, that doesn't work, for some reason, and then this doesn't work either, that's too bad. Let's see if we can make that channel 10. No, you have to bear with me, that should work, maybe I will figure it out later. Let me explain how this works. I made some variables; I extended the language to have variables, because you can reference them, that's easy, as in normal programming languages. And then you see a note map here, where I map the kick on all the bangs, and the snare, and then there's a function called merge, that squeezes everything
into one sequence, but because every note has a different instrument, they won't go all together, they each will send to the right instrument, and I'm using BPM, I can change the beat per minute, obviously, and I send the whole thing to channel 10, which for some reason doesn't work now, I don't know, okay, and there's lots more out there, so I showed you fraction, merge, and repeat, and I realized when trying to compose music like that, oh I need something that has a transpose map, and I need something random, and I need something pitch, and so the language grew and grew, like you see now, so a bit about the tool, so Melrose, the program, it doesn't produce any sound, it just uses MIDI, and MIDI, there's lots of MIDI devices out there, and so that's why I brought my synthesizer here, it's just one of those devices that we can talk to and send MIDI to, and it's very, yeah, I think most of you know MIDI, so I won't go into detail of that, and so Melrose comes actually with a binary and a web version, so the one, the binary version is running currently with my machine, just a little interpreter you just saw for changing the devices, and it actually has an HTTP interface such that each tool can talk to Melrose using HTTP, so that's the reason why my presentation sent the data to my Melrose program from my web page, right, and also have a web version that uses Wasm, web assembly, so you can play it in the browser, if your machine has some local MIDI device you can try it in the playground, and this is the whole setup, so there's this Melrose program, and I could also create it with Visual Studio Extension, so if you load your script.mel file in Visual Studio Code, you can actually have some shortcuts to talk to Melrose, and this is how I actually compose music using the tool, and because I'm actually using the MIDI, Melrose can talk to any MIDI device, it can be hardware, software such as this one, and so you can also combine multiple devices. 
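The right-to-left composition described earlier in the talk (start with a chord, ungroup it, reverse it, transpose it, repeat it four times) can be sketched in plain Python. This is not Melrose syntax and not how Melrose is implemented; the function names only mirror the words used in the talk, notes are represented as MIDI note numbers, and the transpose amount of 2 semitones is an arbitrary assumption.

```python
from functools import reduce

# A sequence is a list of groups; a group is a tuple of MIDI note numbers
# that sound together. This representation is an assumption for the sketch.

def chord(root):
    """Major triad on root, as a single group of simultaneous notes."""
    return [(root, root + 4, root + 7)]

def ungroup(seq):
    """Turn every group into separate, consecutive single notes."""
    return [(n,) for group in seq for n in group]

def reverse(seq):
    """Play the sequence backwards."""
    return list(reversed(seq))

def transpose(semitones):
    """Return a function that shifts every note by the given interval."""
    return lambda seq: [tuple(n + semitones for n in g) for g in seq]

def repeat(times):
    """Return a function that repeats the sequence the given number of times."""
    return lambda seq: seq * times

def compose(*fns):
    """Right-to-left composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), reversed(fns), x)

# Read from right to left, as in the talk: chord, ungroup, reverse,
# transpose, repeat four times. 60 is middle C.
phrase = compose(repeat(4), transpose(2), reverse, ungroup, chord)(60)
print(phrase[:3])   # the first three notes of the transformed phrase
```

Because every step takes a sequence and returns a sequence, any step can be swapped or removed without touching the others, which is the property the talk credits for the language growing function by function.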
Playing music, so you already saw BPM, and of course we have a loop, so the loop can take a sequence and just loop it, and it's very nice, I thought, because once you have a kind of a sequence and you start looping, then using a MIDI device you can find the right instrument that matches the loop, so that's how I interact with it. A bit of a behind the scenes, so if you write a program in Melrose, and you ask it to evaluate it, to play it, it will actually parse, of course, the language, so you get a music object tree, and it will translate it into a bunch of note events, and it needs to schedule those events, because each note should start at a certain moment in time, so you create a timeline, and then you play the timeline by sending all the MIDI notes to a device, and another picture that demonstrates the same thing is this, this is the timeline, it's actually implemented as a linked list, for the programmers that know what that is, and so each time I say this is a sequence, I create this timeline and send it, and this is done in parallel, so while the music is playing, I can send more sequences to the same timeline, so I can start doing composition in real time as well. Okay, already mentioned that one, so this is where you can find my code, and where you can download the binary, I have a Mac version and a Windows version, I created a documentation site on Melrose, and just a little note, so I chose to use the O with a dash on top of it, because it looks like a note, but it also makes the, sorry, it also makes the whole name unique on the web, so if you type in this name, you'll find exactly my site, it's nice, so now let me try to come up with another example, so let's app it, E-Piano, Taser, Taser, now we already had that one, right? So just kill it, control K, this is fine from the bells, how am I doing on time? Oh thank you, so I'm just going to show you, no I'm not showing you, I'll let you hear stuff, right?
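The timeline described above (note events kept in time order in a linked list, with new sequences added while earlier ones are still pending) can be sketched as follows. This is a minimal illustration in Python, not Melrose's actual Go implementation; the class and function names are hypothetical, and real playback would wait until each event's time and send it to a MIDI device instead of collecting tuples.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    when: float                      # seconds from the start of the timeline
    note: int                        # MIDI note number
    on: bool                         # True = note on, False = note off
    next: Optional["Event"] = None   # link to the next event in time order

class Timeline:
    """Singly linked list of events, always kept sorted by time."""

    def __init__(self):
        self.head = None

    def schedule(self, when, note, on):
        ev = Event(when, note, on)
        # New earliest event: it becomes the head of the list.
        if self.head is None or when < self.head.when:
            ev.next, self.head = self.head, ev
            return
        # Otherwise walk to the last event that is not later than `when`.
        cur = self.head
        while cur.next is not None and cur.next.when <= when:
            cur = cur.next
        ev.next, cur.next = cur.next, ev

    def drain(self):
        """Pop all events in time order (stand-in for sending them as MIDI)."""
        out = []
        while self.head is not None:
            ev, self.head = self.head, self.head.next
            out.append((ev.when, ev.note, ev.on))
        return out

def schedule_sequence(timeline, notes, start, bpm):
    """Schedule one note per beat, each with a matching note-off."""
    beat = 60.0 / bpm
    for i, note in enumerate(notes):
        timeline.schedule(start + i * beat, note, True)
        timeline.schedule(start + (i + 1) * beat, note, False)

timeline = Timeline()
schedule_sequence(timeline, [60, 64], start=0.0, bpm=120)
# A second sequence arrives later and lands on the same timeline.
schedule_sequence(timeline, [72], start=0.25, bpm=120)
played = timeline.drain()
```

Inserting into a sorted linked list keeps the player simple: it only ever looks at the head, and a late-arriving sequence is interleaved with what is already scheduled.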
So I changed the instrument here, and this is Visual Studio Code using my extension, try to read what's happening, so there are four sequences, I joined them, but I can change of course the speed, I have to wait for the loop to end it, you see, and there's also, so I can also change, I won't do it because it will screw up the sound, I can also change the characters, and I, thank you, and I evaluate again and it will pick up the changes, and that's what I mean by the audio feedback, so without compiling the whole thing, I can just let it loop and then change, and then find another instrument, and this is how I killed the loop, okay, let me jump to this one, piano, four, and this one, yeah, thank you, oh this is a heavy one, I'm not going to do that one, so can you read it by the way, I hope so, so, ah, drum is working, you see, I already mentioned it, so it's actually doing this, and it loops the drums, as you can see, and I start another loop, I hope, yeah, there it is, yeah, and it uses the tap notation, one of the additions I did recently, so yeah, so now our two loops are running at the same time, but I can add more, and send one loop to this device, and the other one to the other device, and change it, and I'll change the text, and so on, one more, okay, okay, okay, and now ends, it's doing program, thank you, yes, yes, that's possible, the question is, can you do other MIDI things like changing program, right, yeah, so pedal support as well, and changing programs, yeah, the only thing I cannot do is change the beats per minute while it's playing, that's design decision, yeah, sorry, yeah, yes, and no, so I'm working on that to read MIDI and try to find the expression that matches the MIDI, and then start from there, but that's difficult, maybe chat GPT knows, but yeah, can you consider also using open sound control, OSC, OSC, can't remember any, I think it's using, I started out with port MIDI, when I thought about that, and then I changed it to something 
else, I have to look at the library, can't remember, so I didn't look at it, but no, yeah, yeah, can you get MIDI CC in? MIDI CC, oh, as an input, yeah, I skipped that whole part, but you can actually listen to MIDI events, and then trigger it by keys as well, but also make changes to it, yeah, yeah, that would be a nice demo for next time, yeah, yeah, yes, it runs on the triplet, yeah, so Go can compile to multiple platforms, so like I said, I have pre-compiled binaries, so for Windows and for Mac, but you can just grab the Go code and compile it yourself, and if you want to know the library, yeah, have a look at the source code, yes, yes, there's a function called export, and then your whole MIDI is exported, yeah, does the debugger work with breakpoints? yeah, I stole the feature to trigger the loop, so yeah, that's a nice part of the Visual Studio Code extension, sorry, good, yeah, what about the other tools, do you consider things like Sonic Pi? no, not at all, because, oh, so the question is, did I consider using Sonic Pi, but that already exists, and it, yeah, but that actually produces sounds, right, and not, I really wanted to use MIDI and focus on a composition language, yeah, have you looked at LilyPond? yes, I did, what do you think about it, so that's LilyPond? yeah, I know, yeah, I personally found it too complex, and also part of me thought, hey, let's try something else, to be honest, yes, is there a, what if the binary is running, is there like an input for Melrose, like, that you can programmatically send, like Melrose expressions?
by HTTP, by sending HTTP, you set the source to Melrose, and then it gets evaluated immediately, that's how the tools interact, so you could write a program that generates Melrose, yes, yes, you could, send me the submission, yes, okay, so it's interesting that there's a crazy guy that must have the same idea you had, okay, represent music in a way that they equal handle code, so you could do diffs and stuff like that, and it's called earmuff, and the guy's called crazy, you might get along, oh, yeah, so there's another guy, just as crazy as he is doing, earmuff, oh, okay, code to MIDI, oh, I would like to see that, yeah, yeah, it's more bar-oriented, oh, okay, I would like to have a look at that, thank you, thank you, yes, one more, how suitable do you think it is for live coding where you compose as you're doing, yeah, so the question is how suitable is it for live coding, try out, I don't know, I did, as a performer, I didn't come too far that yet, but I would love to try it, really, yeah, and also because, yeah, it gives you direct audio feedback, right, so I can start multiple loops, and at the same time do all the other things, and then start the loop, or kill the loop, and, yeah, sounds interesting experiments, yeah, yeah, very good, yeah, okay, more questions, oh, yeah, thank you, you should start your show with this, I invite you, more questions, yes, melody, I thought about it, and then, yeah, nice rose, and then, and I knew about this trick from someone else that uses it, and then, ah, I'm going to register this name, that's the third thing you need to do, right, is it free, and then, no, rose, it's stuck, last question, last question, okay, you mentioned that it will generate a music object tree, yeah, is this, will you be planning on allowing other software to capture that tree so that, oh, yeah, yeah, so the question was, in my, in my, one of my slides, I showed you the music tree object, and the question was, could that be, yeah, accessible from, for 
other programs, yeah, yes, so can it be used for other programs, yeah, I would love to have this, this interface, yeah, I already have some kind of a visualization, so if you, you can show in the browser the whole pattern of the sequence, sometimes you need visual feedback, so sometimes I miss Ableton, yes, and, ah, well, it's another way, yeah, thank you, thank you all for joining. |
Become a rockstar using FOSS!
Or at least use FOSS to write and share music for fun! |
Okay, then we continue with music with the next talk which is become a rock star using free and open source software. Please welcome Lorenzo. Thank you. Thank you very much. And it's going to be a hard job following up a guided broad, a synthesizer and an amplifier. I didn't bring anything to the table, so you'll just have to endure me talking about it. I hope it's fine anyway. And first of all, this is my first talk in open media, and I already feel like a fraud because clickbait alert, I'm not a rock star at all. I mean, this is an email that I got from Spotify last month. I got two listeners last month. One of those was me. So I think on average, you need a bit more than that. We actually call the rock star, but it doesn't really matter. I had a lot of fun in the past few years just playing with music and open source, I mean, and I had to use this for something rather than just show them around the home. And for a living, of course, I don't do any music at all. I'm just a hobbyist musician, and that's what I'm going to be talking about today. In the real world, I am a WebRTC developer, for instance, involved in some of the things that Dan mentioned in this previous talk. I love a lot hard rock, metal, orchestral music, symphonic music, when they work together as well. And this is something that I've been trying to do in my own music as well. And here you can find some links if you want to get in touch. For instance, you can get in touch on mastodon, a couple of links to my music as well, and so on and so forth. And this is just a very basic and completely out of order table of content. So I talk about a few different things, and I will not follow this order because I will mostly follow the order by which I learned how to do things with music and Linux in the first place. So how I tried to dip my toes into a few different things, and how I eventually learned how to do some more complex things in the process. 
And of course, it will be a very, if you pass the term, a very dumb presentation because I will only scratch the surface. I'll try to introduce several different concepts, and really just to tickle your interest enough so that maybe you have your own guitar getting dusted. You don't know how to get it started with, for instance, using your laptop to do some music, and maybe this presentation will tickle your interest and you start doing something. And besides, there is a very good chance that I'll say something dumb as well, or maybe something incorrect. So if I do, please bear with me. It's really a high-level presentation and something that's really meant as an introduction, not really something that goes very much in detail. And when I first started learning about all these, I was really surprised by how mature, for instance, Linux and the audio ecosystems was to actually do music production on those machines because you always assumed the world around was Linux is not good enough to do real-time music. You have to use Windows or MacOS or whatever, and I disagree with that because especially when I started with Jack at the time, I found a very interesting ecosystem to do things. And especially, I really loved the port-based approach that allowed you not to use monolithic applications to do things, but you have different applications that you can just connect arbitrarily any way you want, possibly use the same source to connect to multiple applications, implement very complex workflows, and all in real-time and very low latency, which was really amazing. And the fact that you can have all these different applications talking to each other means that you also often have a lot of different applications implementing more or less the same requirements. So you will have different synthesizers or different ways of implementing effects for your guitars and so on. And often, they don't really need to be a substitution for one or the other. 
Maybe for one genre, it's better to use some applications, for some others, it's better using others. It's really up to your preference and how you like to work with music, and some tools may make sense more than others in that sense. And it's probably useless to make this distinction right now because we just had a very interesting presentation by Ernest, who explained a bit what MIDI signals are. So when we talk about music production, especially on Linux and Jack, you do have to know basically that you have audio signals, so a sound that has already been processed, recorded, or a waveform of some sort. And MIDI signals that just carry information that is then used to produce sounds. And so of course, these two can go through very different workflows. Different applications can handle just one or both or maybe none at all. What's really important though is, again, how you can actually have different applications that you connect arbitrarily in your own way. That was at the very basis of how Jack was conceived at the very beginning. And Jack a few years ago was really the way to do very low latency audio on Linux systems. The downside of that was that it was a bit complex to set up and manage. And luckily, and we've seen a presentation by Wim this morning, PipeWire has made this so much simpler. So I was a bit skeptical at the beginning. I just jumped on the bandwagon a couple of weeks ago. And basically, PipeWire comes with an implementation of Jack that basically hides all the complexity. I mean, the applications think they're still using Jack, but in practice you're using PipeWire instead, which means that you can start using also applications that were not specifically conceived for Jack purposes and work on them together while you work on music production of some sort. And all of these small boxes that you see over there are basically different processes.
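To make the audio versus MIDI distinction above concrete: a MIDI note-on is a three-byte instruction (status byte with the channel, note number, velocity), while audio is the sampled sound itself. The Python sketch below builds raw MIDI bytes by hand; the helper names are made up for this example, and the 48 kHz, 16-bit mono figure is just one common audio format chosen for comparison.

```python
def note_on(channel, note, velocity):
    """Raw 3-byte MIDI note-on. Channel is 1..16, as musicians count it."""
    if not (1 <= channel <= 16 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    # Status byte: 0x90 means note-on, low nibble is the zero-based channel.
    return bytes([0x90 | (channel - 1), note, velocity])

def note_off(channel, note):
    """Raw 3-byte MIDI note-off (status 0x80) with release velocity 0."""
    if not (1 <= channel <= 16 and 0 <= note <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x80 | (channel - 1), note, 0])

# Channel 10 is the conventional drum channel; note 36 is the
# General MIDI bass drum.
kick = note_on(10, 36, 100)
print(kick.hex())

# For comparison: one second of uncompressed audio at 48 kHz, 16-bit mono.
audio_bytes_per_second = 48000 * 2
print(len(kick), "bytes of MIDI vs", audio_bytes_per_second, "bytes of audio")
```

The MIDI message says what to play and leaves the actual sound to whatever synthesizer or sampler receives it, which is why the two kinds of signal travel through different workflows.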
And you see that some of them have inputs, some of them have outputs, either one or both, these sort of things. And this is what allows you to basically just arbitrarily connect different applications to each other to, again, create more complex workflows than what you can get out of what a single application can do. And I'll show a couple of practical examples in a minute. So let's assume that you have that old guitar gathering dust at home and now you want to make some noise. You want to connect it to your laptop and do something. So what you don't do is, of course, just plug it in the microphone slot because that will cause problems. What you need is some sort of an audio interface instead. So something like an external sound card that has some inputs that do accept your instrument instead. And often these interfaces come with USB interfaces and so are very easy to plug. Your operating system will very likely recognize them out of the box. And they will be available as a system capture. And so as one of these boxes that we saw over there. So something that you can connect to something else. And the one that I have at home, and spoiler alert, it didn't come with the cat. The cat came with something else. I personally bought this Focusrite Scarlett Solo because it's quite inexpensive. It's very common among hobbyists because it already provides decent enough quality for recording at home. It's very flexible and I really like it a lot. And basically the one that I bought comes with two separate inputs. So it has a USB interface, which is recognized as an external USB sound card by the operating system. And mine in particular, I think later versions changed this a bit. But it comes with one input that is XLR, the typical cable that you use for microphones for instance. And another one is the typical guitar jack slot.
And since it's two different inputs, when you connect it to the operating system, the box that you see, which may have this name or an entirely different name depending on what you're using, shows two different channels, which means that depending on where you're actually plugging what you want to plug, it will come out of one of those two different channels for what you want to do, which opens the door to a lot of different things that you can do. Because for instance, I could plug my guitar directly into this external sound card. In this case, I'm plugging it into the jack slot. That's capture number two, which means that I can then use this capture number two to do something. And the best thing that I can do is just connect it to the playback system so that I hear what I'm playing just unencoded. So I don't hear any effects, it's just the raw sound of the guitar, but it's something that I can do. Of course, I can do something more interesting and we'll show an example later. Or maybe you have a very good amplifier at home and a very good microphone. You put the microphone in front of the amplifier, you connect it to the first slot. And what you get when you're on your laptop is an already distorted, for instance, sound of your guitar out of your amplifier. Or maybe you can use them both at the same time, which is what I do often for classical and acoustic guitars, for instance, where I attach both the pickup of the guitar, whether it's integrated or added. And I put the microphone in front of the guitar just so that I capture different frequencies, different sounds. Together, they give me a more full sound than what they would give me individually. I mean, again, it's just very simple examples that show you that before you couldn't do anything with an external sound card like this, now you have ways to put your instrument and get it part of a workflow in your own laptop and do something cool. 
And one cool thing that you could do, for instance, is just launch Guitarix. Guitarix is a very complex and effective guitar amplifier simulator, basically. So it has different bits that you can, it's very configurable, so you can create your own configuration, you can choose the different bits, what you want, how you want your amplifier to look like. I'm really stupid in that sense, so I never really tried to do it myself. I work a lot with presets shared by the community. But if you are savvy enough, you can just do things on your own to create, to really shape your own sound so that the guitar sounds exactly how you want it to sound. And when you launch Guitarix, it basically spawns two different boxes, as far as Jack or PipeWire is concerned. Again, when I'm saying Jack, you can assume I'm also just implying PipeWire usage as a consequence. Basically it comes with two different boxes, one as an amplifier and one for effects. And then it means that since the jack input on my Focusrite Scarlett was capture number two, I connect that to the amplifier, I connect the amplifier to the effects, I connect the effects to whatever I want, playback, something that records it, it's really that simple. So you have already created a workflow out of that bit that you managed to capture thanks to the external audio interface. Another application that I love a lot as a guitar player is Rakarrack. I'm not sure if I'm pronouncing it correctly or not, which is not a guitar amplifier simulator as Guitarix is, it's basically a pedal board simulator instead. So it has a lot of different effects that you can use and combine. It also comes with a lot of different presets. I particularly love the clean sounds that you can get out of Rakarrack. And again, similar approach. You connect whatever capture you have your guitar on to Rakarrack and then the output of that, so the processed sound of the guitar, it's something that you can end up using.
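The chain just described (capture number two into the amplifier box, the amplifier into the effects box, the effects into playback or a recorder) is a small directed graph. The Python sketch below only models that graph; it does not talk to Jack or PipeWire, and the box names are descriptive labels chosen for this example, not the port names those applications actually register.

```python
from collections import defaultdict

class PortGraph:
    """Toy model of a patchbay: boxes connected by directed links."""

    def __init__(self):
        self.links = defaultdict(list)

    def connect(self, src, dst):
        """Add a link from src to dst; one source may feed many sinks."""
        if dst not in self.links[src]:
            self.links[src].append(dst)

    def reachable(self, src):
        """Every box that the signal from src eventually arrives at."""
        seen, stack = [], [src]
        while stack:
            box = stack.pop()
            for nxt in self.links.get(box, []):
                if nxt not in seen:
                    seen.append(nxt)
                    stack.append(nxt)
        return seen

graph = PortGraph()
graph.connect("capture_2", "amp")    # guitar plugged into the jack input
graph.connect("amp", "fx")           # amplifier box feeds the effects box
graph.connect("fx", "playback")      # hear the processed sound
graph.connect("fx", "recorder")      # and record it at the same time

print(graph.reachable("capture_2"))
```

The last two lines of wiring show the fan-out that the port-based approach allows: the same processed signal goes to the speakers and to a recorder without either application knowing about the other.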
You can do something more complex or, in some cases, also dumb, by just possibly using both Guitarix and Rakarrack at the same time, so putting them one after the other. This is a very simple example and probably doesn't make sense to have the effects box in between there, but a similar approach is what I use, for instance, myself in one of the songs because I had an effect that I like in Guitarix, but I also needed a sustainer effect in Rakarrack as well. So I basically just chain them in my workflow. I plug my guitar in the sound card, then Guitarix distorts it and then Rakarrack does some more things with the sound before I actually use it for something. And again, these are just very simple examples that are meant to show you how easy it is to create a workflow using different applications out of sounds that you have access to, to do some interesting and really cool things. And so let's assume that basically now we managed to get a decent sound. Now we just want to record something because we want to either write a song or whatever. And of course, if you want to do something very simple, so just record the sounds and then use them somewhere else, you can use any tool that is actually capable of consuming these sources. And so Audacity or GStreamer come to mind, but there's so many more. If you want to do something more complex, maybe work to write a song no matter how complex it is, you'll want to work within some sort of a project instead. So maybe in an application that is capable of handling multiple tracks at the same time and that maybe can add different filters to all these tracks that you're having. So because you need a compressor on one or maybe reverb on some tracks or you need equalization or something like this. And this is the kind of application that you use a digital audio workstation for. And DAW is a short term for that. And mostly because these kind of applications are specifically conceived to do exactly that.
So possibly record things in real time or use existing assets, edit and produce all these different audio files in different ways, and they often support MIDI as well. And especially most of them have a modular nature that allows you to use existing modules that are part of the open source ecosystem to add different effects to any of those tracks, either as a whole, for instance, a filter that applies to multiple tracks at the same time or just one of them and so on and so forth. Because you may want equalization, compression, I mean whatever is part of the usual audio editing process in a regular music studio, if you want, it's something that a digital audio workstation can provide for you. So if you've ever heard, for instance, of Pro Tools and stuff like this, this is exactly what a digital audio workstation can do for you. And the one that I personally use is called Ardour, which is a very powerful component. I personally use this because it was the first one I stumbled upon. I fell in love with it at the time and I just kept on learning. But again, there are more that you can use out there. There's Qtractor, there's Reaper, which is not open source, but it's also used a lot in the open source community as well. And one thing that you'll notice when you start using an application like this is that the boxes in that graph that I showed, for instance, on Jack or PipeWire, are going to explode because a digital audio workstation is going to handle a lot of tracks and those tracks are going to be connected to a lot of things. And so you'll see a huge amount of connections out there. And luckily for you, you don't really have to create those connections on your own because otherwise you will go crazy. Often it's the digital audio workstation that does this for you and there are easier ways to change those connections if you need to from the user interfaces of all those applications.
And most importantly, this shows that no matter how monolithic now this application can look like, it's still able to communicate with all those external applications that we mentioned. So you can still have, for instance, an Ardour session open, a Guitarix session open, you connect your guitar to Guitarix and then you connect Guitarix to Ardour in order for it to record it, or maybe you have Guitarix as a plug-in so that you just record the clean sound and then you have it processed in different ways any time that you need these sort of things. And so let's assume that we have now bass and guitars. I am assuming that bass and guitars, you can process them pretty much the same way. I'm sure that there are bass players that will disagree with me, but the concept is like this. Let's say that you now need drums and let's assume that you're like me, I'm a WebRTC developer, I have no friends, so I don't have any drummer friend either. So I have to create a virtual one instead, so something that plays drums for me because I am at home doing nothing. And which means that this is the very first good example of a virtual instrument. So I need to write the drum parts somehow and then I need to sequence them somehow, which means that the drum parts will be the instructions, so what I want to be played, and then something will actually translate them to a kick sound, a snare sound, these sort of things. And of course you can just play, write the MIDI manually or use something like Melrose as we've seen. What I found out is that for drums, it's much easier to work with a pattern-based approach instead, mostly because of the rhythmic nature of the instrument and the fact that you can often do some repetitions, maybe some variations and then just play a bit with those instead.
And personally I like Hydrogen a lot in that sense, because it allows you to create multiple patterns for instance, it has all the different parts of a drum, you can say within the context of a measure, play this, this, this, this at this point, a kick here and here and here, you create different patterns, you specify the sequence of those patterns or even some patterns in parallel if you use some patterns just to create variations, these sort of things. And then basically Hydrogen plays drums for you out of what you just wrote. And while Hydrogen comes with its own sounds, which is really cool, I personally just use Hydrogen to write the parts but then use DrumGizmo to actually render them, mostly because DrumGizmo is probably the most advanced drum renderer that is out there because it's basically a lot of drum kits that were captured and recorded by professional drummers, they created samples and then using DrumGizmo you can replay them also using, I mean, I'm not even going to try and explain how DrumGizmo works because it's very complex but it suffices to say that it has so many channels that it's the best way to actually get drum sounds out there and also work with them within the mixing process, which means that from a Jack perspective, I just use Hydrogen to generate the drum parts and then I connect the MIDI output of Hydrogen to DrumGizmo and then whatever DrumGizmo generates I use within the context of my own application. That's basically how it works. And now that we dipped our toe in the MIDI world, what can we do with other instruments because maybe we want a keyboard background, a pad or a piano solo or even a full orchestra behind our music. And again, this is what MIDI is for because basically we don't have access to a whole orchestra because I don't have 30,000 euros to pay a lot of players.
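The pattern-based approach described above (one-measure patterns, a song as a sequence of those patterns, MIDI notes sent on to a drum renderer) can be sketched in a few lines of Python. This is an illustration of the idea only, not Hydrogen's file format or API; the grid notation, the pattern names and the `render` function are assumptions, while the note numbers are the General MIDI percussion assignments.

```python
KICK, SNARE, HIHAT = 36, 38, 42   # General MIDI percussion note numbers

# A pattern is one measure: each drum gets a 16-step grid, "x" = hit.
BASIC = {
    KICK:  "x...x...x...x...",
    SNARE: "....x.......x...",
    HIHAT: "x.x.x.x.x.x.x.x.",
}
# A variation used as a fill at the end of a phrase.
FILL = {
    KICK:  "x...x...x...x...",
    SNARE: "....x.......xxxx",
}

def render(song, bpm, steps_per_beat=4):
    """Expand a list of patterns into sorted (time_in_seconds, note) hits."""
    step = 60.0 / bpm / steps_per_beat
    events = []
    for bar, pattern in enumerate(song):
        for note, grid in pattern.items():
            for i, cell in enumerate(grid):
                if cell == "x":
                    when = (bar * len(grid) + i) * step
                    events.append((round(when, 6), note))
    return sorted(events)

# Three bars of the basic beat, then the fill.
events = render([BASIC, BASIC, BASIC, FILL], bpm=120)
print(len(events), "hits, first:", events[0], "last:", events[-1])
```

Writing the part once as a pattern and reusing it with small variations is what makes this easier than entering every drum hit by hand; the resulting note events are the kind of data that would travel over the MIDI connection to a renderer such as DrumGizmo.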
So what I do is I just sketch and write the notes for all the different instruments that are involved and then those notes, that information will be translated to actual sounds. So something will be played by something that simulates strings, something else will simulate a trumpet, these sort of things. And of course, these notes can come from different places. It can come from a hardware keyboard, as we saw in the previous presentation with Melrose, or it can come from something that you wrote, for instance, using Melrose or other approaches. And I am not a keyboard player, but I did buy this small tiny thing over here that again is just plug and play. You plug it in there, it becomes a MIDI input that you can use for different things. And once you have it, you can have a lot of different ways of rendering MIDI sounds. SoundFonts are historically the oldest and easiest way to do that. And Qsynth, thanks to FluidSynth, is one of the most popular ways to actually play them. So you download a SoundFont from somewhere that contains all the sounds that are associated to different instruments. And then, for instance, you just connect your keyboard to FluidSynth using the MIDI port and that eventually gets translated to actual sounds that you can then use for something else. If you don't want to use SoundFonts, maybe you want to use a synthesizer instead. There's plenty of those as well. I use Yoshimi a lot, but there's also ZynAddSubFX, so a very complex name, but they basically share a lot of code because Yoshimi was actually a fork of that. There is Surge, which is also another excellent synthesizer. So in that case, there is no sound bank they start from. They actually generate the sound depending on what you want to do. And again, I'm dumb, so I never aim at creating my own synthesized sounds, but if you know more about that, it's something that you can do.
And again, it works pretty much the same way. You can connect your keyboard to Yoshimi, and from Yoshimi you can connect the sound to whatever you want as well. And in this example, I also wanted to highlight the fact that, again, you can connect the same source to multiple things. So in this example, I'm connecting my keyboard to both Yoshimi and that SoundFont application that I showed before. So that whatever I play sounds both like a synthesizer and a piano at the same time. Oh yeah, I'm, okay, I'll just fly over the last slide. So sound files are also very interesting to do the same thing. You can use Windows VSTs over there as well via LinVst, for instance. If you want to write music, you can use LilyPond, Melrose, MuseScore, which I personally like a lot. If you don't know music notation, you can use piano rolls. And if you want to then publish this, there's a lot of places where you can publish your music. But what I really encourage you to do, however you publish your music, make sure that you engage the community in order to exchange information. So for instance, I use LinuxMusicians a lot, and the main author of LinuxMusicians is there as well. He asked me to tell you there's a lot of stickers that you can get from here if you want. It's an excellent place to get in touch with other musicians working, you know, in the open space so that you exchange ideas. You publish a piece, they give you advice. It's an excellent place to learn. You may want to add video like I did with my Viking Metal cover of All I Want for Christmas Is You by Mariah Carey. And that's a fun watch if you want to see it. I talked also a bit about WebRTC used for musicians as well, in a previous presentation that you may want to look at. And then that's basically it. These are again my contacts if you want to have a look at that. Unfortunately, I didn't speak fast enough, so. I, I. Oh, thank you. Yeah. Yeah.
Basically, I personally use... oh, sorry. Yeah. The question was, does PipeWire also have a way of showing those different boxes that you can connect in order to create a workflow? I personally still use QjackCtl, which was a front end specifically conceived for JACK, mostly because PipeWire implements JACK as well. And so QjackCtl allows you to create the same connections from the JACK perspective, but there are some applications that are specifically conceived for PipeWire as well. It really works the same way. You have these different boxes and you can connect them any way that you want. No, no, you can script it if you want, but you can definitely use a GUI. I always use a GUI because otherwise you go crazy. Any other question? I'm sorry. Yeah. The one that I use, since I work a lot with metal, the one that I prefer is called MuldjordKit, which, yeah, I personally love a lot, but there are a lot of excellent tools out there. So thank you. Which one? Sorry. Yabridge. No, for Windows VSTs, the ones that I use the most are some free VSTs by Spitfire Audio, the Spitfire LABS. They have a lot of different free VSTs with very interesting and experimental sounds. And those are the ones that I like the most. Yeah, there are different ways. LinVst is the one that I found to be the easiest, but dssi-vst, for instance, is another way to do that. There are different approaches. Personally, the one that worked consistently for me was LinVst. That's just why I use that. Sorry. If I install PipeWire, can I get rid of PulseAudio? Yeah, because PipeWire is basically a replacement for both PulseAudio and JACK, while remaining compatible with both of them. So you can just... Yeah, exactly. While you record your guitar, which is something that you couldn't do before. And besides, PipeWire also does another excellent thing.
For instance, with just plain JACK, I couldn't attach two different audio interfaces to my computer. I had to choose one or use Zita to add another. Instead, with PipeWire, I can plug in as many as I want, and they all appear, which is great. I think I have to... Thank you. Do you have time for a question afterwards? Yeah, absolutely. Just... Yeah, I'll come outside in a second, sure. Nice to meet you, Lorenzo. Nice to meet you, too. Yeah, I'll just come outside in a second, because I think I have to wait for him. Sorry. |
Distributing multicast channels to 3rd parties: a case study with OSS and virtualization/SR-IOV |
Last but not least, it's Christophe Massiot, but first of all, I would like to thank him for his work organizing the dev room all day. Thank you. Yes, FOSDEM is a volunteer event. And Christophe will be talking about distributing multicast channels to third parties, a case study with open source software and virtualization slash SR-IOV. Thank you, Kieran, and also thank you, Kieran, for co-organizing the dev room with me. And if next year there are more volunteers, I think we would be happy to have that. So I'm Christophe Massiot. I've been in the broadcast business for quite some time now and I run a company called EasyTools. One of the purposes of this company is to distribute linear channels. So either we create them, or we get them from satellites or from abroad, or we get them in data centers, and we deliver them somewhere in the world or to network operators and so on. So one of the critical parts of doing that is knowing how to distribute multicast to other people. Usually, how does it work? First I must clarify what a linear channel is, from my point of view at least. It's basically a transport stream. So it's MPEG-TS over UDP or RTP, depending on your religion, both exist, and generally multicast. Usually you put seven TS packets inside one UDP frame and you're done. And it's a continuous stream all the time. How do you exchange these kinds of streams? Usually you take a rack in a well-known point of exchange. In Paris there is one called Telehouse, it's very popular. Everybody knows that in Paris. And so you have your rack, you put in your switch, and you buy cross-connects from your data center to your peers. Your peers can be your sources, people who provide your streams, or they could be where you distribute them, so the distributors, the operators and so on. For safety reasons you will want to have each source or each operator in a different VLAN so that they don't see each other, for confidentiality, and also because some content is a little bit critical.
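The framing mentioned above, seven 188-byte TS packets per UDP datagram, can be sketched as follows. This is a minimal illustration and not code from the talk: the function names are hypothetical, and the stream is made of null packets purely so the example is self-contained. Seven is the usual count because 7 × 188 = 1316 bytes fits in a standard 1500-byte Ethernet MTU, while eight packets (1504 bytes) would not.

```python
TS_PACKET_SIZE = 188      # fixed MPEG-TS packet size in bytes
TS_SYNC_BYTE = 0x47       # every TS packet starts with this byte
TS_PER_DATAGRAM = 7       # 7 * 188 = 1316 bytes, fits a 1500-byte MTU


def make_null_packet() -> bytes:
    """Build a TS null packet (PID 0x1FFF), used here only as filler data."""
    header = bytes([TS_SYNC_BYTE, 0x1F, 0xFF, 0x10])
    return header + b"\xff" * (TS_PACKET_SIZE - len(header))


def pack_datagrams(ts_stream: bytes) -> list[bytes]:
    """Split a TS byte stream into UDP payloads of seven TS packets each."""
    if len(ts_stream) % TS_PACKET_SIZE:
        raise ValueError("stream is not a whole number of TS packets")
    chunk = TS_PACKET_SIZE * TS_PER_DATAGRAM
    return [ts_stream[i:i + chunk] for i in range(0, len(ts_stream), chunk)]


if __name__ == "__main__":
    stream = make_null_packet() * 14
    for payload in pack_datagrams(stream):
        print(len(payload), "bytes per UDP payload")
```

Each returned payload would then be handed to a UDP socket (or wrapped in an RTP header first) and sent to the multicast group.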
So the question will be, how do you copy a multicast channel from one source VLAN to a destination VLAN? It sounds stupid and perfectly easy to answer, but it's not so easy. There is a pure network solution. With some switches you have a feature called multicast service reflection, that's what Cisco provides. I'm pretty sure there is the equivalent on Juniper and that kind of thing. Basically you can type a command there to say that you copy this multicast address to this destination, and so you copy from one VLAN to another VLAN. But it's not widely available: only a small range of switches handle it, and there are good chances that your switches do not support it. And also you cannot handle complex use cases. Some operators want RTP, some don't want RTP, and you cannot remove RTP with that. And some operators also want you to have a specific PID, a specific service ID, a different service name and so on, which you cannot do, of course. So generally you will end up using devices on top of your switch. You will have to plug in a device that will do that kind of job. The broadcast solution, the most popular one, is what we call the DCM, it's a very popular brand. Initially it was by Cisco, now it's a company called Synamedia. It's a piece of hardware with electronics inside and it does all this work of switching. It can even transcode, actually, with some cards. It does a lot of things, but it's also very expensive, of course. But we have an open source alternative. It's actually something I wrote maybe 15 years ago, it's called DVBlast. I've been doing a lot of talks about it, but in this use case it also helps. Originally it was written as a DVB demux, so you have a satellite card or DTT card and you want to get a transponder and split each channel into a multicast address. That was the original goal of DVBlast. But actually, with the -D option you can also read from a multicast channel.
And in the arguments you can also give the IP address of the interface you want to read it from, which basically means which VLAN you will read it from. So that reads a multicast stream from a specific VLAN. There is a configuration file associated with DVBlast, and you can put in as many lines as you want, so as many distributors as you want, and for each line you just put the multicast address you want to send to. You can also optionally give the address of the interface you want to send from. The VLANs you have on your switch will appear as network interfaces on your server, so you can decide to which VLAN you want to send and to which multicast address. You have a number of options to turn RTP on or off, you can remap the PID, SID, and service name, and you can even spoof the source address, which is very useful in case your peer wants to do IGMPv3. In that case they ask you to put a specific source address, and that's the easiest way to spoof it. So problem solved, end of talk, thank you very much. Now I wanted to add a little something. You may want to run hundreds of DVBlast instances on one server, but for some reasons I wanted to do some virtualization. Why virtualize? Because I have different customers and each of the customers has different channels. Some of them are doing adult content, some of them are doing children's content, and maybe I don't want to mix that on the same server. And I don't have the money to have multiple servers at Telehouse because it's expensive. So client isolation is required. Also, some of my clients have direct access to the VM, because that's the service I sell. So again, I don't want them to potentially see the streams of their competitors. So I used Proxmox, a very nice distribution based on Debian, and it's a big front end over KVM, very useful.
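To make the "turn RTP on or off" option mentioned above concrete, here is a minimal sketch of what adding or removing RTP around a TS payload involves. It is not DVBlast's code; the function names are hypothetical. It relies only on the fixed RTP header layout from RFC 3550 (12 bytes, plus optional CSRC entries, header extension, and padding) and on payload type 33, which RFC 3551 assigns to MPEG-2 transport streams.

```python
import struct

RTP_HEADER_SIZE = 12
RTP_PAYLOAD_TYPE_MP2T = 33  # static payload type for MPEG-2 TS (RFC 3551)


def add_rtp(payload: bytes, seq: int, timestamp: int, ssrc: int) -> bytes:
    """Prefix a TS payload with a minimal RTP header (version 2, no options)."""
    header = struct.pack(
        "!BBHII",
        0x80,                        # version 2, no padding/extension/CSRC
        RTP_PAYLOAD_TYPE_MP2T,       # marker bit clear
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def strip_rtp(datagram: bytes) -> bytes:
    """Return the payload of an RTP datagram, skipping CSRCs and extension."""
    if len(datagram) < RTP_HEADER_SIZE:
        raise ValueError("datagram shorter than an RTP header")
    first = datagram[0]
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = RTP_HEADER_SIZE + 4 * (first & 0x0F)      # CSRC list
    if first & 0x10:                                   # header extension
        if len(datagram) < offset + 4:
            raise ValueError("truncated RTP extension header")
        ext_words = int.from_bytes(datagram[offset + 2:offset + 4], "big")
        offset += 4 + 4 * ext_words
    payload = datagram[offset:]
    if first & 0x20 and payload:                       # padding bit
        pad = datagram[-1]
        if pad > len(payload):
            raise ValueError("invalid RTP padding length")
        payload = payload[:len(payload) - pad]
    return payload
```

The sequence number in that header is what lets a receiver put packets back in order, which matters later in the talk: a plain UDP transport stream carries no such counter at the datagram level.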
And in Proxmox, what you will do is bridge each of the VLANs you get from the switch, and on your guest virtual machine it will appear as a network interface with a virtio driver. The virtio driver is the most optimized one. You can also emulate a network card, but it's much slower. So that's why everybody uses the virtio driver. So everything works fine, end of talk, and thank you very much for coming. There is just one little problem. One morning you get called by one of your big customers and he says, I have discontinuities on what you send me. Discontinuities, that's what everybody in broadcast fears. And you start another DVBlast on the IP address that you send to, and you see nothing. Everything is fine. And they insist, and another customer complains. So what you do is you rack another server. That's a real case, by the way, that's why I'm telling you my life. So you rack another server and you listen to what the other server is sending. And then you see a lot of discontinuities indeed. And you dig a little and you see that the virtio driver, I don't know if it's guest side or host side, reorders some packets. I mean, probably there are several queues inside the driver. And if a packet does not have any luck and goes into a queue that for some reason doesn't run for some time, it will be pushed out later. So I hear now the network guys telling me that, well, there is no guarantee on the order of UDP packets, that's the specification. That's true. But we are in an industry that relies on it. Because if you don't have RTP, there is absolutely no way you can reorder your UDP transport stream. So I had to find a solution. I cannot tell my customer, no, you have to use RTP and I don't care. I cannot. So the first workaround I found is to use another driver. It's also a driver designed for a virtualization system, VMware. But it's supported by Proxmox, KVM and so on.
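The discontinuities described above are typically detected through the 4-bit continuity counter carried in every TS packet header, which increments per PID on each packet that has a payload. The sketch below is a simplified checker, not the tool used in the talk: the function names are hypothetical, and it deliberately ignores legal duplicate packets and the discontinuity indicator in the adaptation field.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47
NULL_PID = 0x1FFF


def make_packet(pid: int, cc: int) -> bytes:
    """Build a payload-only TS packet for a PID with a given continuity counter."""
    header = bytes([TS_SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - len(header))


def count_discontinuities(ts: bytes) -> int:
    """Count continuity counter jumps per PID in a TS byte stream."""
    last_cc: dict[int, int] = {}
    errors = 0
    for off in range(0, len(ts) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts[off:off + TS_PACKET_SIZE]
        if pkt[0] != TS_SYNC_BYTE:
            raise ValueError(f"lost sync at offset {off}")
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        has_payload = bool(pkt[3] & 0x10)
        cc = pkt[3] & 0x0F
        if pid == NULL_PID or not has_payload:
            continue  # the counter is not meaningful for these packets
        if pid in last_cc and cc != (last_cc[pid] + 1) & 0x0F:
            errors += 1
        last_cc[pid] = cc
    return errors


if __name__ == "__main__":
    in_order = b"".join(make_packet(0x100, cc) for cc in (0, 1, 2, 3))
    swapped = b"".join(make_packet(0x100, cc) for cc in (0, 2, 1, 3))
    print("in order:", count_discontinuities(in_order))
    print("two packets swapped:", count_discontinuities(swapped))
```

Note that when datagrams carry seven TS packets each, a single reordered datagram shows up as several counter jumps at once, which is why even rare reordering is very visible to the receiver.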
So it's called vmxnet3, and probably this one only has one queue, and it solves the problem. The only problem is that it uses 30% more CPU, and this server is only doing that. And I already have 64-core AMD EPYC servers that are not full, but more than 50% used already. So there is some kind of need to optimize. That's why I started to look at other alternatives and at some clever ideas that I found on the net. And one of them, which you have probably heard of, is called SR-IOV. So what is SR-IOV? It's a feature of some network cards. Not all network cards, only some network cards have that feature. In a normal installation, the network card is owned by the host. You have some kind of software switch that is handled by the virtualizer. And here you have virtual interfaces to each of the virtual machines. In an SR-IOV setup, what happens is that the network card will create new PCI devices. So each one is a different PCI device. And there is a feature called VT-d on Intel, and you have the equivalent on AMD, that will allow you to dedicate a PCI device to a virtual machine. And doing that, the virtual machine directly talks to the PCI device without anything going through the host. So that looks quite interesting. These new devices are called virtual functions, VFs. And so on the VM, you just need to have a VF driver, which is included in Linux anyway. In my use case, I used Intel cards. They're not the only ones doing SR-IOV, but it may be a little bit different if you use different cards. Not all cards have the same features. Using SR-IOV is a little bit tricky. First, it requires support from the motherboard, the CPU, the BIOS itself, and of course the network card, as I was just saying. You have to enable a number of features in the BIOS: IOMMU, but also ACS and ARI, which is some kind of PCI Express routing protocol, lots of things. And basically, we spent a couple of hours, I think, just setting it up.
My personal advice would be to upgrade the drivers, the Intel drivers, to the latest version, and the card firmware, because there is a firmware running on the card, to the latest version supported by the driver. I did not do that when I started my tests, and I ended up having half of my VLANs working while the other half didn't work, and there is absolutely no way to know what's going on. I just had to reboot. And it was in production, so you can imagine what happened. Creating the virtual functions is actually quite easy, it's just an echo into a /sys device file, and with my Intel card I can create up to 64 virtual functions, so that's typically up to 64 VMs. The passthrough is very easy in Proxmox, it's just a menu. And potentially, each VM has access to all VLANs. That may also be a drawback, because it can send packets on any VLAN, so you have to trust your clients a little bit, and it may not be adapted to all situations. But it's quite useful, because you don't have to create a bridge every time you want to create a new VLAN. You just create a network interface, an 802.1Q network interface, in your VM, and that's all, so it's actually easier to manage. So everything looks perfectly fine, and that's the end of the talk. Not yet. I talked about sending to VLANs, but how to receive multicast from a VLAN, that's another problem. Initially, it looks like it works, so you put it in production, maybe a bit early. So how does a network card work with multicast? First I should maybe remind you how it works. You have your multicast IP address, and you translate that to a multicast MAC address, so the end is the same and the beginning is just dropped. And then you tell the card, every packet that arrives on this MAC address, I want it.
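The IP-to-MAC translation described above follows the standard IPv4 rule: the MAC address is the fixed prefix 01:00:5e followed by the low 23 bits of the group address, so the top bits of the IP address are dropped. A minimal sketch using only the Python standard library (the function name is hypothetical):

```python
import ipaddress


def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF            # keep only the low 23 bits
    mac = (0x01005E << 24) | low23          # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


if __name__ == "__main__":
    for group in ("239.255.0.1", "224.127.0.1"):
        print(group, "->", multicast_mac(group))
```

Because five bits of the group address are lost, 32 different groups share each MAC address (the two groups in the usage example map to the same one), so a MAC filter on the card is only a first-stage filter and the IP stack still has to check the destination address.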
So that's how it works on any normal PC. It's called a MAC filter, and it's quite a standard feature. The trick is, on these cards, the number of MAC filters is limited, and the way they limit it is a little bit stupid. They take the whole buffer of the card, the whole number of MAC filters they have, and they divide it by the number of VFs you have, so 64. And in the end, according to my calculation, your limit is around 100 multicast addresses. So 100 may be a lot, but I have hundreds of multicast streams in my network. So you may reach it, and you may think about segmenting your virtual machines so as not to go above the threshold. But it's still a dangerous game, because there is a feature of the Intel driver: if you reach that limit, it fails silently, of course. Well, you have a message in dmesg, but nobody reads that. So you will try again, and try again, and try again, and after just a few trials, like five, the kernel decides that your VM is crazy, and it won't talk to it anymore. You will still receive your multicast, but if you have any other command to send to the card, like creating a new VLAN, which could happen when you have a new distributor, you can't. You have to reboot your VM, fortunately not the host. So it's not that practical, and to be honest, I have a patch in all my installations that disables that feature, and disables using the MAC filter at all, actually, because I have found it not practical in reality. Yeah. Is there a need for the MAC filtering at all, given that with modern switches you're not going to get any multicast traffic unless you do an IGMP join? Yeah, so the question is, do we need the MAC filter at all? The thing is, you will receive approximately two or three gigabits of traffic, and maybe your VM does not need all of that. So you have to decide which multicast you want. Actually, that's my next slide.
My first idea for a workaround was to put everything in promiscuous mode. Promiscuous mode means what it means: your VM will receive all the traffic that is received by the network card. It looks like a good idea, but it dramatically increases your CPU usage, because out of maybe two gigabits per second of data, you only need 200 megabits per second, and the rest, the kernel will have to filter out. So your kernel will do a lot of work. And from what I've calculated, basically, the gain you had from going from virtio to SR-IOV, you lose it right there. That's the first problem. Another problem is that, imagine you have two gigabits per second on your network, and you have 20 virtual machines. The network card will send 20 times two gigabits per second to your virtual machines. And that means 40 gigabits per second. And 40 is the limit of the card. At that point, you will start losing packets randomly. Again, silently, you will not know what happens. And of course, you only find that out in production, because when you first started with one, two, three VMs, it worked perfectly. So you say, yes, I have my solution. And then you put all your load on it, and then one day, it just stops working. Activating promiscuous mode is actually quite easy. Again, it's an echo into a /sys file. So I have found a second workaround, which is a little bit better. I'm using it in production. Maybe it's only on Intel cards, I don't know if it exists on other brands, but it's a feature called VLAN mirror. Basically, it tells the card to send all the traffic belonging to a VLAN to a particular virtual function, to a particular VM. So it's that kind of promiscuous feature, but only for one VLAN, which kind of forces a good practice. I think most people who have ever done multicast, when they have a backbone, put all of their multicast addresses in the same VLAN, maybe with different address ranges, or maybe not.
And you expect that, at the other end, the receiver will pick up which multicast addresses it wants. This approach forces you to have different VLANs per customer. So it's actually not a bad idea, but there is one drawback. One specific VLAN can only be sent to one VM. So if you have, let's say, a big broadcaster that is sending you channels and several VMs need those channels, you cannot use that solution, because only one will be able to read from that VLAN. So I have a third workaround, which is just to go back to the good old virtio. After all, why not? The packet inversions were only on TX, not on RX. So it works. Actually, it also has some additional features, because the bridge in Proxmox, most of the time, has IGMP snooping. So you will only receive the multicast addresses that you subscribe to. It's actually a good solution, but then it means you have interfaces to read the packets from and other interfaces to send the packets to, which is a bit of a mess, but still it's a good compromise. That's my compromise currently: VLAN mirror or this one, depending on the nature of the VLAN I have to read from. So all is for the best in the best of worlds. End of the talk, thank you very much. Not quite. There is another topic I haven't mentioned yet. What if I want to read a multicast stream coming from another VM on the same server? That doesn't work, because when you write through SR-IOV, it just outputs to the switch. As far as I know, and maybe some of you have a solution, there is no way to get the traffic back to the network card and use it in another virtual machine. You could do that with different VLANs, but if you don't want to have different VLANs, if you want to read from the same VLAN, then you cannot read. Okay. Well, there is another solution with the Intel card again. It's called egress mirror. You can make it so that everything that's output on virtual function number one will be mirrored to virtual function number seven.
So virtual function number seven will be on your receiving side and virtual function number one will be on the transmitting side. That actually works, and I also use that in production. So, conclusions. Well, multicast in a virtualized environment is no picnic, actually, and I'm surprised that you don't find many papers about it. I've struggled literally for years with this, with a number of problems in production, because you only see the problems in production, because you only see them under load. And so this has been a little bit tiring. Thank you very much for listening to this. And if you have any questions. I'm guessing you tried that, but in virtio there are ways to actually ask it to be dumber by not doing some of the things it does. I'm not sure, but probably you tried that. You could probably have asked it to just keep the packets in order. You say in virtio there are ways to ask it to be dumber. Yeah, I'm not sure I've actually tested it. I'm pretty sure it would have the same effect as using vmxnet3, and probably you would see an increase in CPU consumption. Anyway, I wanted to go to SR-IOV for other reasons, because I wanted to have my VM talk directly to the network card. I thought it was better practice than virtio. Yeah, James. Could you go back to that slide, please? Yeah. Okay, there was no question. Yeah? AWS provides EC2 instances with SR-IOV. With SR-IOV? Yeah. Okay. So he says AWS provides instances with SR-IOV. I guess in the cloud you don't have that kind of problem, because usually you do SRT. Usually. And SRT doesn't care if those packets get reordered, because it will... Yeah. Did you consider either disabling multi-queue on the host or forcing each VM to a different queue, so that by definition you will not be reordering? Forcing each VM to a different queue, I'm not sure it works. Disabling queuing, probably it will work, but with additional CPU usage. Yeah.
As I said, I have 64-core EPYC servers, which are pretty big, and they're not full, but... It should be possible to pin it. If you use xps_cpus, you can pin a process. In the end, I don't know if you can do that. I'm not sure if the inversion happens on the host side or on the guest side. Disable multi-queue in your host. Yeah. And in your guest. Yeah. But yes. Mm-mm. eBay? Why do you want to use SR-IOV? What are the other use cases you're thinking about with SR-IOV? The question is, why did I want to use SR-IOV? Speed. Speed is the first argument, actually, but also some kind of clean design. Maybe I'm a little bit of a purist, and I wanted my VM to have direct access, physical access, I mean, to the network. Also, if you really want the list of arguments, the traditional way of doing VLAN bridging works, but, probably as in other environments, you create a bridge for each VLAN. And so you have a network interface for each VLAN. Maybe there is a way to avoid that, but I didn't look too much into it. That means you have a limit, because I think you're limited to 32 network interfaces. The limit is low enough. Otherwise you have to switch to a VLAN switch, but considering the amount of problems I had without it, I thought maybe I would not go into that. I haven't tried, to be honest. You probably don't see papers about it because most people who use it do HTTP. Exactly. As you say, SR-IOV is mainly used by people who do HTTP or something TCP-based. And using SR-IOV for non-networking things? Ah, you ask if I'm considering using SR-IOV for non-networking things. I suppose you imply GPU, maybe? Maybe. But my question is, do you envision other places where multimedia could use it? Sure, GPU is almost an obvious one, but have you ever thought of using that to do other things? GPU, we've actually tried to. I'm not sure the ones we had supported SR-IOV, but that...
On the really expensive ones. Yeah, they have features like VTG, I think. But in our experience, it didn't work well. But we would like to have the ability to share a GPU among several instances, so that you could decode or encode on different VMs and so on. These virtual machines that we have at Telehouse, they don't do that, because Telehouse is not the place to do transcoding, because it's too expensive. Fundamentally, yes, I would be interested in having that for GPUs as well. At least GPUs. If you have any other ideas, tell me. That's my whole thought behind the question. Just mentioning, because of Proxmox, I mean, there is sometimes support for SR-IOV for decode as well, and those cards are expensive. And it's just landing in Proxmox and QEMU, so you can actually play with that if you have to. Could you use that for SDI or NDI passthrough, to get direct processing? Yeah, you don't need SR-IOV for that. The question was, can you use that for SDI or NDI passthrough? Well, NDI is network. But SDI, I think we tried it, and you can pass through a Blackmagic device, for instance. Yeah, you pass it through with VT-d. VT-d is one thing. With VT-d, I already passed through DVB tuners. You can probably pass through DVB-ASI, probably SDI, even though we don't do that on a regular basis. But have you tried a different connector on a different VM? Yes, you have four inputs. A different connector on a different VM, that would not be possible with the Blackmagic driver, because it's seen as one device, one PCI device, so it's not possible. If you design your own card, maybe. Siwan, you may want to answer that question. No? Yes, we could do it, yes. There's no technical reason. Yes, Gaff? Any container technology? We could use containers. The thing is, as you know, we have our own software. And our own software is actually delivered as a disk image, so it can only run as a virtual machine. The idea to use containers is not a bad idea.
There is less isolation, though, than what you have with a virtual machine. But this is something I would like to try in the medium term, yes. Because there have been a lot of improvements in the past years regarding network namespaces. And you can probably have some direct macvlan interface, so some kind of physical interface that you can have directly. Yes, so he says we can probably have a direct macvlan in the container and get a direct physical interface. I think we are running out of time. Wow, oh, so... |
Relativitization: an interstellar social simulation framework and a turn-based strategy game |
So, good morning everyone. I am Adrian, a PhD candidate from Leiden University in the Netherlands, and today I'm going to talk about my project, my software, called Relativitization, which is a weird thing: an interstellar social simulation framework and a turn-based strategy game as well. And I hope this can be an interesting start to today's schedule. Before I go into the details of my thingy, I would like to do a bit of self-introduction, because it relates to my overall open source development experience. I was educated in physics. I did my bachelor's in physics, and I did some master's research on so-called gravitational waves under the LIGO collaboration. And I must say, before my master's research, I was kind of a mathematical student. I didn't really like computers, I liked pen and paper more. But my research forced me to appreciate computers, because I was forced to use things like supercomputers, Linux, and terminals, and to write in programming languages like C or Python. So I started to appreciate software, or open source software in general. That's why I got into this world. And after my master's, I decided to kind of switch my field to do something in social science. That's why I came to the university to do something called quantitative science studies. I'm supposed to work on social modeling, simulation, or data analysis on some kind of social data, more specifically for the academic system. And as a hobby, I kept contributing to open source software, because I think it's fun to work on. I mean, you just use the software, and if you find something wrong, then you just submit a PR or create an issue for that. And I learned a lot from that kind of process.
And then, because I started my PhD right before the pandemic, as you can imagine, things went quite wrong. I partially blame the pandemic, because, for example, the data that I'm supposed to use is still not ready, even now. And the whole collaboration became a mess, the infrastructure is a mess. I partially also blame the engineers in my collaboration, because they are so annoying to work with. So I have to rescue my PhD, right? I can't wait for those things to happen. I need something that I can work out by myself. And because of the pandemic, I had to work from home. I was also kind of traumatized by the collaboration that I experienced. And I know something about physics because of my background. I know something about social modeling. And I'm also kind of familiar with software engineering. I'm not trained in that field, but I learned a lot from my hobby. So it is natural for me to try to combine these things together to create a weird thing, interstellar simulation. Perhaps some of you are fans of science fiction movies or games. And you may ask these kinds of big questions, like, will human civilization become interstellar one day in the future? Or, do aliens exist or not? And if they exist, or if we become interstellar, we will be curious about the form of society of those interstellar civilizations, right? And of course, it's very hard to study those kinds of things rigorously in academic settings. But perhaps one way, at least I argue, to try to explore this is to assume that some social theories we know nowadays can still be a good approximation for that kind of society, for example if we expect them to also be utility seeking, or if there are some collective actions there.
And of course, we can't study that rigorously, because we don't have the data, we don't have observations, but we can try to explore this kind of domain by simulation. In case you're not familiar with social modeling: nowadays, when we talk about social simulation, we're mostly talking about agent-based modeling. Social scientists design some behavior rules based on their understanding of how humans work, or based on some experiments, to design computational rules for agents. They put the agents into a world, just like machine learning people do, to see how they interact with each other, and they interpret the overall simulation outcome from that. There are many existing frameworks that can help us do this kind of simulation, but I had to create my own framework, because existing social simulation frameworks don't support the physics well. We're going to work on interstellar society, and it is at a scale that is not on the Earth, so it's not easy to deal with. It's possible, but it's not trivial. And there are also some physics simulation frameworks out there, but they are not built for social scientists, so they are also a bit hard to work with. So that's why I had to create my own framework. I think it's interesting and useful if I want to create something meaningful for this. And when I talk about the physics for interstellar society, I'm talking about special relativity, which is simple undergraduate-level physics, because it is sufficient when we consider a scale like 10 light years or 100 light years. I think it's a good starting point for trying to imagine this kind of thing.
And special relativity tells us that there's an intrinsic time delay out there, so nothing can travel faster than the speed of light. Special relativity also tells us that there's a time dilation effect, so if you move very fast, close to the speed of light, your clock is slower, so you essentially experience slower time. In the setting of social simulation, we have to deal with problems like what an agent can see, how an agent interacts with other agents given the time delay, and how we change the state of the agent. I will go into these one by one in the following slides. Because of the time delay, information travels at the speed of light, so we are not seeing each other instantaneously. We are seeing the past of each other because of the delay. So here the red person sees the past of the blue people, because there is a time delay, and it depends on the distance. Technically, they see the past light cone of the universe at that point. The second problem is how the agents interact with each other. If you're a programmer, you're probably familiar with the command design pattern. It is used when you have something you want to execute, but you have a delay, so you have to create a command, wait for the delay, and send it to a target. This is what I used in this framework, so that if some agents need to interact with each other, they can't just do that directly. They have to send a command, and the command travels at the speed of light in the universe. And lastly, the last problem is how we change the state of the agent. We define a term here called mechanism, which is essentially the dynamics of the model.
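The command pattern with a light-speed delay described above can be sketched as follows. This is an illustration only, not the framework's implementation (which is written in Kotlin): all names are hypothetical, positions are assumed static, distances are measured in units where light travels one unit per turn, and the delay is rounded up to whole turns.

```python
import math
from dataclasses import dataclass

SPEED_OF_LIGHT = 1.0  # assumed units: distance covered per turn


@dataclass(frozen=True)
class Command:
    sender: str
    target: str
    action: str
    sent_turn: int
    arrival_turn: int


def send(sender: str, target: str, action: str,
         positions: dict[str, tuple[float, float, float]], turn: int) -> Command:
    """Create a command that reaches its target only after the light delay."""
    dist = math.dist(positions[sender], positions[target])
    delay = math.ceil(dist / SPEED_OF_LIGHT)
    return Command(sender, target, action, turn, turn + delay)


def deliver(pending: list[Command], turn: int) -> tuple[list[Command], list[Command]]:
    """Split pending commands into those due this turn and those still travelling."""
    due = [c for c in pending if c.arrival_turn <= turn]
    waiting = [c for c in pending if c.arrival_turn > turn]
    return due, waiting


if __name__ == "__main__":
    positions = {"red": (0.0, 0.0, 0.0), "blue": (3.0, 4.0, 0.0)}
    cmd = send("red", "blue", "offer_trade", positions, turn=10)
    print(f"sent at turn {cmd.sent_turn}, arrives at turn {cmd.arrival_turn}")
```

The same delay rule applies to observation: an agent at distance d from another sees the state the other had roughly d turns earlier, which is the past light cone mentioned above.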
It changes the state of the agents and sends commands out, so it defines how the model actually works scientifically. And recall that clocks tick slower for moving agents, so I divide this into two types of mechanism: a regular mechanism, processed once per turn, and a dilated mechanism, processed once per n turns, where n depends on the speed. If the agent is moving faster, time is slower there, so the execution is slower there. For convenience, if you base your model on these two types of mechanism, then it's easier to follow the physics. As an example, a regular mechanism could be something like observing things and updating your information. It's not something that actually takes time, so it can be executed once per turn. A dilated mechanism can be something like manufacturing: it actually takes time to produce a product, make it, and send it out. So if you have a slower time, then you manufacture more slowly. To deal with this kind of problem, as I said, I created a framework. The framework is written in Kotlin. There are many reasons behind that, which I'm not going to go into. Basically, the framework enforces a specific way to build your social model, so that it can automatically account for the physics. It also provides some functionality to help you develop the so-called interstellar social model.
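The delayed commands and the two kinds of mechanism described above can be illustrated with a minimal Python sketch. This is not the framework's Kotlin code. The names `lorentz_factor`, `arrival_turn` and `Agent` are hypothetical, the units assume the speed of light is one light year per turn, and the dilated mechanism is approximated by letting the agent's own clock advance by 1/gamma per turn.

```python
import math
from dataclasses import dataclass

C = 1.0  # assumed units: light years per turn


def lorentz_factor(speed, c=C):
    """gamma = 1 / sqrt(1 - v^2 / c^2) for a speed below c."""
    if not 0 <= speed < c:
        raise ValueError("speed must satisfy 0 <= speed < c")
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)


def arrival_turn(sent_turn, distance, c=C):
    """First whole turn at which a command sent at light speed can be applied."""
    return sent_turn + math.ceil(distance / c)


@dataclass
class Agent:
    speed: float = 0.0
    proper_time: float = 0.0  # unspent time on the agent's own clock
    observations: int = 0     # counts runs of the regular mechanism
    products: int = 0         # counts runs of the dilated mechanism

    def step(self):
        # Regular mechanism: runs once per turn, whatever the speed.
        self.observations += 1
        # Dilated mechanism: runs once per unit of the agent's own time,
        # which advances by only 1 / gamma per turn.
        self.proper_time += 1.0 / lorentz_factor(self.speed)
        while self.proper_time >= 1.0:
            self.proper_time -= 1.0
            self.products += 1


if __name__ == "__main__":
    ship = Agent(speed=0.6)
    for _ in range(9):
        ship.step()
    print(ship.observations, ship.products, arrival_turn(0, 2.5))
```

In this sketch a fast agent still observes every turn but manufactures less often, and a command sent to a target 2.5 light years away cannot be applied before three turns have passed.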
And it deals with some technical subtleties, like parallelization of the simulation, how you serialize things into data, or how you make a deterministic simulation under parallel processing. This is the core of the framework. For the algorithmic and mathematical details, you can check my arXiv preprint, which is actually going to be in the proceedings of another conference. You want to take a picture? You can find it in the last slide, but anyway. I wasn't an experienced programmer when I started this project, so I made some decisions that perhaps I would change if I started from now. For example, I made a decision here on how this framework should be used. Okay, here it works. I have some technical reasons behind this decision. I think it's always easier to provide a library for people to use, but I try to keep it a pure Kotlin project, and it's not quite possible to do things like turning strings into objects, because the reflection library of Kotlin is not very powerful. Anyway, this is the decision I made. I may change it in the future, but it's how it works now. To use this, you have to create a template from the source code and build a model based on the template. I have created a few examples of how to use this framework, and I will give a brief illustration of the results, because it's very interesting to see how things actually work out. The first example is the flocking model, which is kind of the standard 101 of agent-based modeling. It's a very simple model.
The birds follow very simple alignment rules and align with each other, and this creates some interesting macroscopic patterns, as shown in the figure. Here we are interested in interstellar flocks. It's a very hypothetical scenario, but you can interpret it as a group of spaceships trying to align with each other in the universe, or as some kind of mysterious creatures living in space. For the sake of physical realism, I assume they are propelled by photon rockets, and we can measure how well the flocks align. This is the simulation result. I'm not going into the details, but you can see some interesting things. You can adjust the speed of light, because this is a simulation. If the speed of light is slower, it takes longer to communicate and the flocks are less well-ordered. You can also tune the perturbation on the flock, so that it's harder to align with each other and takes more energy to do so. And you can see in the figures that it takes half the mass of the population to align with each other for one year, for one turn. So it's not a very environmentally friendly way of traveling, I would say. The second example is a kind of knowledge dynamics, which is closer to what I'm supposed to do in my PhD. I try to model how research or knowledge is generated in the interstellar setting. Some assumptions are made, like cooperation takes time: we need cooperation to stimulate innovation, but cooperation takes time to process. You have to invite people to cooperate with you and wait for the response, and if they are very far away, then it takes a long time to work with people. So this is affected by the time delay.
There are some other factors. If there is a small time delay, you have a more rapid information exchange, and that essentially creates a more competitive environment: when I do something and you want to compete with me, and you update your information very frequently, it creates a kind of pressure to get your research out. Also, research takes time, so it is a dilated mechanism. It is affected by time dilation if you move. There is a cost if you want to move closer to each other to do research, just like there is a cost if you want to move from your home country to this place to work with each other. So these are some simulation results, which I would say are kind of interesting. We can again tune the speed of light. As you see in the figure on the left-hand side, the blue curve is lower at the beginning and becomes higher eventually. This is the curve for very slow information travel. Because cooperation starts very slowly at the beginning, the curve rises slowly, but ultimately, because the environment is less competitive, the overall research outcome becomes better. I think it is an interesting result. The figure on the right-hand side shows two kinds of strategies. You can stay in your home country and do everything remotely, or you can move physically to work with each other. The orange curve represents the moving strategy and the blue curve represents the remote strategy. There is a cost imposed by the velocity, because there is time dilation, so the moving strategy starts out lower and eventually pays off. I think it is an interesting result to show here. This is kind of interesting, but you can always criticize this kind of model because it is too hypothetical. We are studying something that is not happening in the real world, and perhaps we are getting nothing out of that.
I kind of agree with this criticism, but I am just trying to get something out of the framework. If people are not interested in this kind of model, or if I also feel it is not interesting enough, then perhaps I can make a game out of it to make it more interesting. So that's the decision I made. The reason why I decided to make a game is also that I like simulation games, and simulation games are similar to agent-based modeling in a sense. Also, I have many things that I want to test out. The models I have presented are simple ones. There are more complicated mechanisms that you can implement, but you can't study them rigorously, because they are so complex that you can't get anything adequate or interpretable from them. A game provides an environment for me to test things out and to explore what I can do in this interstellar model. Perhaps this is also one of the sensible ways to get something actually useful from the framework: if it is not very academically interesting, at least I have a game. A game can be very complex, and I try to address many different things in it and to ask good questions. For example, what should the economy look like in an interstellar society? Does a credit-based economy work? Nowadays we rely on currency and credit cards, but those rely on fundamental building blocks of society that might not work in an interstellar setting. With the time delay, how do you exchange money, and how do you define the exchange rate? Perhaps a simple barter economy, as we used historically, is a better choice for that kind of society, and perhaps we can use some kind of fuel or energy as money instead of credit. We can also ask how the political system works. Does democracy work there, if you have to wait 100 years to receive the voting results from another planet?
We can also ask how we can optimize scientific research in this kind of system. I have my own ideas on how these things work, and I put them all into the game, including population dynamics, politics, diplomacy, economics, warfare, and science and technology. Some of them are typical elements in strategy games and some are not that typical, but I still get inspiration from other games. I try to get this to work on PC and Android, which is one of the reasons why I chose Kotlin to implement it, and I hope this is going to be a multiplayer game in the future. I implemented a kind of server-client architecture. For a game, you have to limit how a client can act: you can't allow them to do everything, or they are going to cheat. What they can do is essentially receive a view of the universe and send commands to perform actions. I also created a dual utility-based AI for the game. This is the basic architecture that is built on top of the framework. This is a screenshot of the game. I know this is not graphically very attractive, but this is all I can do, and... where is it? I'm losing my slide. This is how the game actually looks. You can do some things here and add some commands. This is a complex interface, so I'm not going to show everything here, but something is working here. What's the scale? The spatial scale. For this, it's like a three times three light-year cube. This is a very small universe. It can be larger. It can at least handle a thousand light-year cube game without stressing my computer. Thanks to the open-source ecosystem, I can rely on all kinds of open-source libraries to build my game. I use libGDX for the graphical interface and coroutines for the parallelization. Thanks also to the open art community, because I'm not an artist and I can't create all the art by myself.
I can rely on Creative Commons-licensed assets, and I can use open-source software like Inkscape if I want to draw something. This is almost the end of my presentation, but I'd like to reflect a bit on my project. I think I've learned a lot by contributing to open-source software, and the ecosystem gave me plenty of building blocks for my own project. Because of this, it's natural for me to also open-source my project, and at least I've created something interesting. There are also some problems that I faced during development. For example, there's a lack of open-source culture in my workplace. Some people there do use open-source software, but most of them do not care that much, so most of the time I'm by myself. As a one-man project, it's a bit too ambitious, I would say. I don't have time to polish everything, like the graphical interface and the documentation, I can't ensure that things are fun to play, and I don't have time to work on the translation. I did try, but I don't have time to work on all of these things, because I'm also doing my PhD and I have to write papers and give presentations. As I decided to go fully open-source, I chose those libraries to depend on, but there are industry-standard solutions. If you're creating a game, people just use Unity for that, and perhaps it would have been better for the project, so that I'd have more time for other things. Perhaps there's a price to pay if you decide to go open-source. Actually, I did try a bit to commercialize my project, because it's possible to sell an open-source game, right? I know it's possible, but it's also very hard to do. It takes time to advertise it, so I gave up eventually. I didn't have time to push it forward. This is the summary of my presentation. I have created a framework for interstellar social models, and I have created some models based on the framework.
I have also created a game on top of that. Perhaps it's lacking some immediate practical value, unless an interstellar society existed tomorrow, right? But I think that's not going to happen for a while. So perhaps there's no immediate practical value, but I still believe this is a kind of meaningful and educational exploration. If the game is eventually fun to play, then you can ask your kids to play it, and perhaps they can learn some physics from the game, right? I don't know. And at the very least, and most importantly, I feel like I've achieved something in my PhD and I didn't waste my time during the pandemic. So that's all for my presentation. This is the link to the project, and if you'd like to give it a star, just check it out. I think you can find the things that I showed in the readme file on this page. So thanks a lot for your attention. We'll take a few questions while the next speaker comes up and... Thank you. Okay, Hafer. In your simulations, when you said that the simulations then cluster, how many years were simulated before you got to the result? Yeah, I'm talking like... Can you repeat the question for the recording? So the question was how many years I simulated for the models that I presented here. I typically simulate for 1,000 or 2,000 years. Because the spatial scale is like 10 times 10 times 10 light years, there is sufficient time for the spaceships to move towards each other and form clusters. I assume that they move at close to the speed of light. And you have a question? Yes, I guess it's more technical. So your framework is written in Kotlin. Does it support any form of extension in another JVM language? I mean, in principle you can use Kotlin with other JVM languages. For instance, for customized agents, or maybe even the physics model, say I want to implement faster-than-light travel or something like that.
So the question was whether you can use other JVM languages to extend it. I think in principle it is possible, because the way to use this is to create something from the source code, and you can always work with other JVM languages in the source code. But I'm not entirely sure, because I use some Kotlin features like inline functions to simplify the code, and I'm not sure whether that kind of thing works nicely with other JVM languages. I know you didn't go into that, but why Kotlin? Why not, like, Scala or Java? I mean, because I want to actually work on Android, and, you know, Android's Java support is not great. Okay, sure. Thank you. |
MuPhyN - MultiPhysical Nexus
An academic simulation tools based on Python toolboxes |
Hello everyone, I'm Dylan Fiverr and I'm a research engineer at CERF Technique, a research center in Mons in Belgium. Today I will present the software MuPhyN, which is a graphical tool for modeling and simulating multi-physics systems. The project started with a research project in collaboration with Thales Alenia Space, funded by Wallonia, so by public funds in Belgium. The goal of the project is the development of a simulation method to assist in the design of mechatronic chains with a complete multi-physics model. In other terms, what are the goals of the project? There exist different kinds of multi-physics simulation software. There is the global variable simulation type, like OpenModelica, which simulates a system item by item, like a motor, etc. Then there is local variable simulation, like finite element simulation of a multi-physics system. The two types of simulation are not compatible with each other, so the research project tries to find another way to simulate a multi-physics model, using electrical analogous modeling. The method we want to develop is a simulation in three steps. The first step is to convert the physical system to an electrical analogous system. The second step is to convert this electrical analogous model to a numerical model. The last step is to reduce the numerical model by evaluating the influence of each part of the system on the rest of the system. Then a second point, a more important point for our research, is the time-scale adaptive model reduction. How to explain that? In a system we have low-inertia parts and high-inertia parts. Low-inertia parts are parts of the system that evolve quickly in time, and high-inertia parts evolve slowly in time. So when we simulate each one, we don't require the same step size. When we make a physical simulation, we have to find an optimal step size.
If the step size is too large, the accuracy will decrease. If we have a smaller step size, we will have sufficient accuracy. But if the step size is too small, the time to execute the simulation will increase for little gain in accuracy. So we have parts with different inertia in a multi-physics system, and we want to find a way to optimize the simulation by using different time scales in the system. For the project, we had some requirements when choosing the simulation software. We need multi-physics simulation software. It must be adaptable, so it must be able to adapt the model as a function of the simulation conditions, like the time, et cetera. And last but not least, the application must be user-friendly. Simulink with MATLAB or Xcos with Scilab would be very good applications for the project. But Simulink is expensive and not exactly adapted to our requirements, and it's closed-source software, so we couldn't modify it like we wanted. As for Xcos, it's not user-friendly and, like MATLAB, it doesn't exactly fit the project requirements. So we decided to develop our own software that leverages existing powerful Python libraries. Here is an overview of our application. It is being used on a regulation system, and here we have an example of results for this regulation system. So the software is a graphical tool for modeling and simulating multi-physics systems. Here is a quick demonstration of how to use the application. To build the simulation graph, it's just drag-and-drop from a box library to a scheme drawing, and once the simulation graph is done, you can easily launch the simulation. We can work on multiple projects at the same time, save and load projects, etc. So what is the application? The application is a simulation graph builder.
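The step-size trade-off mentioned in the talk can be illustrated with a small Python sketch. It integrates a first-order decay dx/dt = -x/tau with the explicit Euler method, once for a slow (high-inertia) part and once for a fast (low-inertia) part. The explicit Euler method, the time constants and the function names are assumptions chosen for illustration. Nothing here is taken from MuPhyN's solvers.

```python
import math


def euler_decay(tau, dt, t_end, x0=1.0):
    """Integrate dx/dt = -x / tau with explicit Euler and return x(t_end)."""
    steps = round(t_end / dt)
    x = x0
    for _ in range(steps):
        x += dt * (-x / tau)
    return x


def abs_error(tau, dt, t_end=1.0):
    """Absolute error against the exact solution exp(-t_end / tau)."""
    return abs(euler_decay(tau, dt, t_end) - math.exp(-t_end / tau))


if __name__ == "__main__":
    slow_tau, fast_tau = 1.0, 0.01  # high-inertia and low-inertia parts
    for dt in (0.1, 0.01, 0.001):
        print(dt, abs_error(slow_tau, dt), abs_error(fast_tau, dt))
```

With these assumed values, a step of 0.1 is already reasonable for the slow part but makes the fast part diverge, while a step of 0.001 handles the fast part and costs the slow part a hundred times more steps than it needs. That is the motivation for giving each part its own time scale.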
We provide some default libraries with the application if you install it on your computer, but you can easily add your own process boxes and solve all the simulations you build in the application. We have implemented some default simulation solvers, but you can also add your custom simulation solvers. Here is a list of default boxes in the application. We have math operations like addition, multiplication, derivative, etc., signal processing with transfer functions, etc., sources, and others like a graph displayer and an OpenModelica simulation embedder. Now suppose you install the application on your PC and want to create your own library of boxes. If we want to create our own box, we have to know that a box is defined by two files. The first one is a description file written in YAML. The second one is a code file written in Python. And there is one condition to make them a box: both files must share the same base file name. So how to create a box? I will take an example. I will create a box that displays data in a graph. The goal of the box is to display data in a two-axis plot, with the time on the X-axis and the value on the Y-axis. The box must be able to display data from one or more sources, and the graph must be displayed at the end of the simulation. This is how the box must behave during the simulation and at the end of the simulation. So how to create the box? The first step is to define the metadata of the box: the box name, the library name, version, author, and creation date. Then we define the characteristics of the box. What I mean by characteristics is all the inputs, all the outputs, and the parameters of the box. For the graph box, we need three parameters: the title, the label for the Y-axis, and the label for the X-axis. We must define the type of each parameter, and we can define a default value. As for outputs, the box doesn't require any.
And for the inputs, we have three conditions. First, we define the default count of inputs as one, so the box will have one input by default. Second, the number of inputs must be unlimited, so we must set the "is infinite" parameter to true. Third, the number of inputs must not be limited to a range. We can define a range for the number of inputs, for example from one to five, but in this case we won't define any limit. The next step is to define the behavior. To define the behavior of the box, we write a Python file, and in it we can use all of the available Python libraries. We have to define three functions. The first one is an init function. In our case, this function will initialize the plot. The second one defines what the box does during the simulation, the process inside the box. In our case, it's saving data. The third one defines what the box does after the simulation, and in our case, it plots the data. The next step is to connect the description file and the Python code file. We only have to write the names of the associated functions. Then we have created a box. Now we want to add it to a library. A library is just a collection of boxes, and in practice a library is just a folder containing YAML files and Python files. For now, a library is a folder that doesn't accept subdirectories. At the moment, we can't put our files in subdirectories: they won't be recognized. So we have our folder, we add our first box with its two files, a second one, et cetera, and that's all it takes to make a library. Another feature we added to MuPhyN is IPython interaction. MuPhyN can be run in an IPython session, which means that MuPhyN can access all the variables declared in the IPython environment. We can use all these variables as simulation parameters. And the access to variables is dynamic.
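The three-function behavior of the graph box described earlier (initialize, save data during the simulation, plot at the end) can be sketched in plain Python. This is only an illustration of the life cycle and not MuPhyN's actual box API: the function names, the dictionary layout and the small `run` driver are all hypothetical, and the end function returns the data a plotting library would draw instead of opening a window.

```python
def init_graph(box):
    """Called once before the simulation: prepare empty storage for the plot."""
    box["times"] = []
    box["series"] = [[] for _ in range(box["input_count"])]


def record_graph(box, t, inputs):
    """Called at every simulation step: save the current input values."""
    box["times"].append(t)
    for series, value in zip(box["series"], inputs):
        series.append(value)


def end_graph(box):
    """Called once after the simulation: return what a plot would display."""
    return {
        "title": box["params"]["title"],
        "xlabel": box["params"]["label_x"],
        "ylabel": box["params"]["label_y"],
        "x": box["times"],
        "ys": box["series"],
    }


def run(box, sources, step, n_steps):
    """Hypothetical driver: init once, record every step, finish once."""
    init_graph(box)
    for k in range(n_steps):
        t = k * step
        record_graph(box, t, [source(t) for source in sources])
    return end_graph(box)


if __name__ == "__main__":
    box = {
        "input_count": 2,  # any number of inputs is allowed
        "params": {"title": "Signals", "label_x": "time", "label_y": "value"},
    }
    print(run(box, [lambda t: t, lambda t: 2 * t], step=0.5, n_steps=3))
```

Keeping the plotting in the end function matches the requirement that the graph appears only once the simulation is finished.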
It means that we can launch the application at the beginning of the session, declare variables in IPython, and use them dynamically. So if I launch a first simulation with some parameters and update them afterwards, a second simulation will take the new values into account. How to use it in an IPython session? The first step is to load the MuPhyN extension with the load_ext command, and then just run the MuPhyN command. Here is a demonstration of how it works. We have first declared our parameters in a notebook. We will create a simulation with two sine signals, and then we will configure all the parameters. We have implemented some feedback for the user: when they write a wrong variable name, the application tells them. So here we have both signals, the first one at 10 hertz with an amplitude of 10, and the second one at 5 hertz with an amplitude of 7. I will launch the demonstration. Then we update the amplitude of the second signal only, and we have a signal with the new values. We have implemented all the usual control features, like saving and loading projects, working with multiple projects at the same time, copying, pasting, cutting, and deleting boxes, undoing actions, etc. So in conclusion, we have developed our own multi-physics simulation software. The advantage of this application is that it is entirely coded in Python, so we have access to many powerful libraries like NumPy, SciPy, etc., for the application and for the custom process boxes and simulation solvers. The application is very adaptable. What I mean by that is that you can easily implement your own custom boxes and custom simulation solvers. And the application can be associated with an IPython environment. So MuPhyN offers a Python alternative to MATLAB plus Simulink.
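The dynamic access to session variables described in the talk can be pictured as a name lookup that happens each time a simulation is launched. The sketch below is a guess at the idea, not MuPhyN's implementation: `resolve_parameters` is a hypothetical helper, and a plain dictionary stands in for the IPython namespace.

```python
def resolve_parameters(raw_params, namespace):
    """Replace values that name a session variable by that variable's current value.

    Returns the resolved parameters and the list of names that were not found,
    so the caller can warn the user about a wrong variable name.
    """
    resolved, unknown = {}, []
    for key, raw in raw_params.items():
        if isinstance(raw, str) and raw.isidentifier():
            if raw in namespace:
                resolved[key] = namespace[raw]
            else:
                unknown.append(raw)
        else:
            resolved[key] = raw  # literal value, used as-is
    return resolved, unknown


if __name__ == "__main__":
    session = {"amp2": 7}  # stands in for the IPython variables
    box_params = {"amplitude": "amp2", "frequency": 5}
    print(resolve_parameters(box_params, session))
    session["amp2"] = 3    # the user updates the variable between two runs
    print(resolve_parameters(box_params, session))
```

Because the lookup is repeated at every launch, the second run picks up the updated amplitude without touching the box configuration.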
As for the disadvantages of the application: it's written in a scripting language, so we have higher solving times. It's a young application at the moment, so we need more time to offer a lot of features, and we need more users to improve the experience. So what's next? We will have to work on compatibility. We want to make the transition from PyQt5 to PyQt6, and we want to add more interactivity to improve the user-friendliness of the application. Here is a list of the contributors to the project, and here are the links to the project. If you want to read the source code, you have a link to the GitLab, and if you want to try the application, we have a package available on PyPI. If you want to install it directly in your Python, you can use pip install MuPhyN. So thank you for your attention. If you have some questions. You said you go through electrical analogies to make the calculation. You use the electrical equations, right? Sorry? Can you repeat? I will repeat after you. At the beginning of the presentation, you said you come from the physical system? Yes, we want to make a mechatronics simulation, and we want to convert all the parts of the system to electrical analogies. At the moment we don't use any software like, sorry, I'm a bit stressed. Electrical simulation software, I don't remember the name. But the drawback of that software is that the model is not dynamic. We can't try a lot of sets easily. The aim of our application is, with time, to make boxes that create an interface to other multi-physics applications. We don't want to replace those applications. We just want to create an interface to them to use in our research project. If I can build on that, because, as part of the project, the idea is that a lot of processes can be simulated with impedances, like printing, like thermistors, mechanical stuff. You already have an analogy between the physical aspect and the electrical impedance.
My question was, do you use electrical equations because they are simpler to solve than physical equations? Sorry guys, there is a stream, and they can't hear you, so you have to go through the mic. Maybe you can come up and use the mic if you want to answer that question. Oh, you can repeat. Yeah, that would be great, if you could at least summarize. The goal of the project is to use electrical equations. The goal of the application is to improve the comprehension of the system. Do you plan to make it real-time, or to accelerate the trial process? Sorry, I didn't hear the... Can we in the future change the values of the inputs, like in the boxes, and directly see the effect with a play button? Sorry, I didn't hear the question. I'm a bit... Can you do a real-time simulation? No, the simulation... Can we do a real-time simulation? For now, it's not possible. We want to implement that in the future. For now, the application just runs the initialization function, then the simulation and the ending function, and all the functions are run in one go. So at the moment, we haven't implemented a real-time overview of the simulation, but we plan to add this feature in the future. In the beginning, you talked about the slow parts, the high-inertia and low-inertia parts of your simulations. And you said that your software was designed to deal with local variable simulations and global variable simulations. How does that translate into what you've shown us? Where is the global variable part? So I repeat the question. I said that we have low-inertia parts and high-inertia parts in the system, and also local variable simulation software and global variable simulation software. How does this translate into what you've shown us? Is it implemented in the application? Yeah, where is it in the application? How does this concept translate into the application you've shown us? At the moment, it's not implemented. It's the background of the project.
But it leads to scheduler problems. So the idea is that we want to have two schedulers. And since then at this time, you might forget all those things. The discussion can go on in the corridor. It's a great part of the first day, so go on with the discussion. We should switch speakers now. Thank you. Thank you. |
Guix, toward practical transparent, verifiable and long-term reproducible research |
So, sorry for the mess. It's a bit impressive, all these people and so on. I'm Simon. I'm working as a research engineer at the University of Paris, and I'm here to present Guix for doing reproducible research. There is a group, Guix-HPC, which tries to apply Guix tooling in scientific contexts. Currently we are in a replication and reproducibility crisis. More than 70% of researchers are unable to reproduce the results of their peers, and more than half are unable to reproduce their own results. So we have a big issue. There are many causes of this replication crisis, and maybe one solution is open science. So what does open science mean? What does science mean? Science means being a transparent and collective activity. And what is a scientific result? A scientific result is some experiment, so producing experimental data, and then some numerical processing. To do that today, we have different ways. We need to communicate, so we need to write up results, so we need open articles to be able to read the results. We need to share the data, so we have open data. We need to share the source code. But there is something that we never discuss, which is that all of that needs to be glued together, because there is numerical processing. So we need another one: we need a computational environment. And one of the issues is that if this is not open, the rest of the stack fails. So that is the topic of today: how do we manage this computational environment? Again, a result is a paper, some data, and an analysis. There are some parts that are possible to audit. For example, a paper, you can read it. For data, you can read the protocol that generated the data. For the analysis, you can read the script. But there are some parts that are opaque. For example, the instrument, a telescope, a microscope. This is opaque. We don't know how it works.
But there is something that depends on our collective practice as researchers, and this is something we can act on to do better research. So the question is how to eliminate at least this dependency and turn it into an auditable task, to be really transparent. From my point of view, a computation is just similar to an instrument, so we should apply the same strategy that experimental people apply to any instrument. Computing is just an experiment, in fact. So, the challenges of reproducible science. From my point of view, there are two kinds. The first one is controlling the source of variation: what is different between this computational environment and that one? As with a telescope, we want to know what is different between this telescope and that telescope, to be sure that what we are observing is correct. So from the viewpoint of the scientific method, we need the computational environment to be transparent. And from the viewpoint of scientific knowledge, what we are building together needs to be independent of the observer: what I observe, you should observe too. This observation should also be sustainable as time passes. We should be able to observe the same thing later; otherwise, maybe we missed something. So the big question today, in this context, is: how do we redo later and elsewhere? I did something on my machine, and you have to do the same thing on your machine, six months, one year or five years later, on your computer. This is a big issue, and from my point of view it is part of the reproducibility crisis in science. So what is a computational environment? It implies various questions. For example, what is the source code? Say I use Python and this script. We have the source code of Python, which is in C, and we have the source code of this Python script.
But the Python interpreter requires a C compiler, so we need tools for building. And my script, for example, needs some Python libraries, so we also need tools at run time. And each tool raises the same questions: what is its source code, and what are the tools for building it? So this is really recursive, and it is a big issue. Answering all these questions is controlling the source of variation. So how do we capture the answers to all these questions? The question is not new. We already have tools: package managers, modulefiles, containers. With a package manager like APT for Debian, you can control this computational environment. But there are some issues. For example, how do you have several versions of OpenBLAS on the same machine? It does not work easily with Debian, with yum, and so on. There are fixes, but in practice it is sometimes difficult. To fix this issue you have environment managers like Conda, pip, modulefiles, and so on. But this is really difficult for transparency in science because, in Conda for example, how do you know how a package was built? What is inside what you install? As for modulefiles, who uses modulefiles on a laptop? I think no one. And containers, Docker, Singularity or whatever, are a strategy generally based on the previous solutions, so in fact you have exactly the same problems. A container just helps to move stuff from one place to another, but it does not help to get the right thing in the first place. Guix, in fact, is all three of these solutions glued together. It tries to fix the annoyances of each one to get something that works. So Guix is a package manager, like APT, yum, etc. It is transactional and declarative, which means that you can roll back, you can have concurrent versions, and so on.
You can produce a pack, for example a Docker image. You can produce virtual machines, or deploy to machines the way Ansible does. You can build a complete distribution. And it is also a Scheme library, so you can extend Guix. OK, the talk is 25 minutes, so it's just a kind of aperitif before lunch. I won't speak about all of that, because it's a little too much. I will just speak about how Guix helps open research, from my point of view. I think it's really easy to try. There is just a script; have a look at it before installing. It's just a Bash script, but check it. You can install Guix on any recent distribution. If you are running Debian, you can try Guix without installing the complete distribution; you can use Guix on top of any distribution. Give it a try. Now, Guix is just another package manager, so you have the same commands as in any package manager, for searching, showing, installing and removing packages, and so on. But you have some more functionality, like transactions. You can do two actions at the same time, for example removing and installing in the same transaction. Or you can roll back: you install something, and then you roll back to uninstall it without breaking anything. OK, so this is another package manager. But is it really just another package manager? It's a command-line tool. You install and remove without special privileges, which is nice. It's transactional, so there is no broken state. There are binary substitutes, so we don't have to wait hours and hours for our binaries. That is nice. But what is really, really nice is declarative management. It means that everything is a configuration file written in Scheme, and you can declare everything.
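The transactional behaviour described here (several actions applied together, no broken state, roll-back) can be pictured with a small toy model. This is only a sketch in Python, not how Guix is implemented: the `Profile` class and its method names are hypothetical, and the model simplifies by discarding any later generations when a new transaction is applied after a roll-back.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Toy model of a package profile made of immutable generations."""

    # Generation 0 is the empty profile; each transaction appends a new one.
    generations: list = field(default_factory=lambda: [frozenset()])
    current: int = 0

    def installed(self) -> frozenset:
        return self.generations[self.current]

    def transaction(self, install=(), remove=()) -> None:
        """Apply removals and installations together as one new generation."""
        missing = set(remove) - self.installed()
        if missing:
            # Nothing has been modified yet, so a failed transaction
            # leaves the profile exactly as it was (no broken state).
            raise ValueError(f"not installed: {sorted(missing)}")
        new = (self.installed() - set(remove)) | set(install)
        # Simplification: drop generations "ahead" of the current one.
        self.generations = self.generations[: self.current + 1] + [frozenset(new)]
        self.current += 1

    def roll_back(self) -> None:
        """Switch back to the previous generation, if there is one."""
        if self.current > 0:
            self.current -= 1


profile = Profile()
profile.transaction(install=["python", "emacs"])
profile.transaction(install=["vim"], remove=["emacs"])  # two actions, one transaction
print(sorted(profile.installed()))  # ['python', 'vim']
profile.roll_back()
print(sorted(profile.installed()))  # ['emacs', 'python']
```

The design point is that a generation is never modified in place: a transaction builds a new set, and rolling back only moves a pointer.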
And you can produce isolated environments on the fly, which is really helpful. You can also see Guix as a factory for Docker images, for example. OK, these are all interesting features. But why is Guix reproducible? And what does it mean to be reproducible? For reproducibility, we need to talk about what a version is. Alice says, for example, "I use GCC at version 11". OK, nice. But what does that mean concretely? It means that you need GCC, the compiler. But you also need ld, the linker, which comes from Binutils, and the C library. And the compiler GCC needs, for example, MPC, which is a package that does, I don't know exactly what. Anyway. You also need MPFR, and so on. So you have this kind of graph. And we can ask the question: is it the same GCC at version 11 if we replace MPFR at version 4.1 by MPFR at version 4.0? Is it the same GCC or not? Maybe not. And if it is not the same, maybe you will see a difference. How can we be sure that we are using the exact same GCC? This is just an extract of the graph, because the graph goes down to its roots and can be really large. We could also talk about what the roots of this graph are, but that is another talk. So I need a version. What is my version in Guix? guix describe reports the state of Guix, so in fact guix describe gives the version of Guix. What it does is pin the complete collection of all the packages, and Guix itself. Because of that, we are able to freeze the complete graph, and we can move this graph from one place to another. Each node in this graph specifies a recipe, and this recipe defines the source code, the build-time tools and the dependencies. And this graph can be really, really large.
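The question "is it the same GCC 11 if MPFR changes from 4.1 to 4.0?" can be made concrete with a minimal sketch. It assumes a made-up, heavily simplified recipe table (the package list, the placeholder versions other than GCC 11 and MPFR 4.1/4.0, and the `node_id` function are all hypothetical) and gives each node an identifier computed from its own recipe plus the identifiers of its dependencies. This illustrates the idea of pinning a whole graph; it is not Guix's actual derivation format.

```python
import hashlib

# Hypothetical recipes: name -> (version, source label, dependencies).
# Only "gcc 11" and "mpfr 4.1 / 4.0" come from the talk; the rest are placeholders.
RECIPES_A = {
    "gcc": ("11", "gcc-source", ["binutils", "glibc", "mpc", "mpfr"]),
    "binutils": ("x", "binutils-source", []),
    "glibc": ("x", "glibc-source", []),
    "mpc": ("x", "mpc-source", ["mpfr"]),
    "mpfr": ("4.1", "mpfr-4.1-source", []),
}
# Same table, except that MPFR is replaced by version 4.0.
RECIPES_B = dict(RECIPES_A, mpfr=("4.0", "mpfr-4.0-source", []))


def node_id(name, recipes, _cache=None):
    """Identifier of a node: hash of its recipe and of all its dependencies' identifiers."""
    cache = {} if _cache is None else _cache
    if name not in cache:
        version, source, deps = recipes[name]
        h = hashlib.sha256()
        h.update(f"{name}@{version}|{source}".encode())
        for dep in sorted(deps):
            # Recursion: a change anywhere below propagates up to this node.
            h.update(node_id(dep, recipes, cache).encode())
        cache[name] = h.hexdigest()
    return cache[name]


print(node_id("gcc", RECIPES_A) == node_id("gcc", RECIPES_B))            # False
print(node_id("binutils", RECIPES_A) == node_id("binutils", RECIPES_B))  # True
```

Under this model the label "GCC 11" is the same in both tables, but the two identifiers differ because one dependency differs, while nodes that do not depend on MPFR are unaffected.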
For SciPy, for example, which is a scientific Python library, there are more than 1,000 nodes. So when I say GCC at version 11, it means one fixed graph. Providing the state, which guix describe reports, captures this complete graph, and I can reproduce this complete graph on another machine. So this is collaboration in action. Alice describes the list of tools in a manifest, in a declarative way. She generates the environment with guix shell, which provides the tools: this creates an environment containing the tools listed in the manifest file. OK, this is nice. Now she describes the revision of Guix: she runs guix describe, and this fixes Alice's state. So Alice is working on her laptop. But collaboration means sharing this computational environment, so it is about sharing the state. To share this state, you need to share one specific graph, and to share this graph, you only need to share these two files. If Blake has these two files, Blake can create the exact same computational environment as Alice. For that you have guix time-machine: you specify Alice's state and the tools that Alice used, and Blake and Alice are running the exact same computational environment. And if Carol also has these two files, she can reproduce the exact same environment as Alice and Blake. So you only need two files, and with them you can reproduce everything from one place to another. In fact, you have this kind of picture: Alice, Blake and Carol are in different time frames, but they can virtually jump from their own time frame to the same place. Their machines are in different states, but they can temporarily go to another state to create the computational environment. To make this work as time passes, you need to preserve all the source code, and this is not straightforward.
It is not trivial to preserve all the source code. You also need some backward compatibility of the Linux kernel and some compatibility of the hardware. When these three conditions are satisfied, you have reproducibility. But what is the size of the time window in which these three conditions are satisfied? From my point of view, this is unknown. And Guix is, to my knowledge, a unique case for experimenting with this, because we have the tooling to do all that. Now we can find out over what time span we are able to reproduce the past in the future. So what is Software Heritage? Software Heritage is an archive. It collects and preserves software in source code form for the very long term. Guix is able to save there the source code of a package and the recipe of the package itself, and Guix itself is also saved in Software Heritage. Guix is able to use the Software Heritage archive as a fallback if upstream disappears. Say you have a postdoc working on some GitLab instance, and the account is closed because the postdoc moves to another place. Now you have a paper with the URL of this GitLab repository, and you say, "Oh no, it doesn't work, because the account is closed." If you are using Guix, it transparently checks whether the source code is on Software Heritage. This raises really good questions about how to cite software. Do you cite only the source? What about the dependencies, the build-time options and so on? And how do you cite it: with an intrinsic identifier like a checksum, or with an extrinsic identifier like a version label? This is not easy. So, in summary, there are three commands. I'm almost done, right? Yeah. These three commands, guix shell, guix time-machine and guix describe, help you to have a computational environment that you can inspect and collectively share.
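The difference between an intrinsic identifier (computed from the content itself) and an extrinsic one (a URL or a version label) can be sketched in a few lines. The example below is an illustration under stated assumptions, not Guix's or Software Heritage's actual code: `content_id` uses the hash Git computes for a file blob, which is also the hash behind Software Heritage content identifiers, while `fetch_source` and the two dictionaries standing in for "upstream" and "the archive" are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Intrinsic identifier: the SHA-1 Git computes for a blob with this content."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def fetch_source(url, expected_id, upstream, archive):
    """Try the upstream URL first; fall back to a content-addressed archive.

    `upstream` maps URLs to bytes and `archive` maps identifiers to bytes;
    both are plain dictionaries standing in for network services.
    """
    data = upstream.get(url)
    if data is not None and content_id(data) == expected_id:
        return data, "upstream"
    # The URL is dead, or serves something else: look the content up by what it *is*.
    data = archive.get(expected_id)
    if data is None:
        raise LookupError(f"source {expected_id} not found upstream or in the archive")
    return data, "archive"


source = b"print('analysis')\n"
identifier = content_id(source)
url = "https://gitlab.example.org/postdoc/analysis.py"  # hypothetical URL

archive = {identifier: source}
print(fetch_source(url, identifier, {url: source}, archive)[1])  # upstream
print(fetch_source(url, identifier, {}, archive)[1])             # archive (account closed)
```

The extrinsic URL stops working when the account is closed, but the intrinsic identifier still designates exactly the same bytes, wherever a copy is kept.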
So if you have these commands and two files, the manifest file and the channels file, you are reproducible over time. OK, I hope I convinced you that this is cool, so here are some resources to read offline. Guix-HPC is a group of people trying to apply this Guix tooling to scientific research. We are organizing Café Guix, where we drink coffee and speak about Guix. There is an article trying to explain this vision of what Guix could provide for open research. And for French speakers, there is a one-hour tutorial. Guix is now ten years old, so it's kind of ready. We organized the Ten Years of Guix event, where there is some really nice material about Guix. And Guix is not new at FOSDEM. All the numbers here are linked to the previous presentations; as you can see, there have been 31 presentations about Guix at FOSDEM, so you have a lot of material about what Guix can do for your job, for your task. Guix runs in production on big clusters, but also on a lot of laptops and desktops. And here, for example, are two papers in medical and biomedical research using Guix as tooling, with guix shell, guix time-machine and so on, as I presented. So, open science means being able to trace and being transparent, because it is about being able to collectively study, bug for bug, what is different from one thing to another. This is the scientific method, and we have to apply the scientific method to the computational environment. This is my opinion, and the message that I would like you to bring back home. With Guix, we can do that by controlling the environment and comparing two different environments to know what is different. So this is what we are trying to do with the Guix project. Thank you, and I'm ready for your questions.
So, we have five minutes for questions and switching speakers. Please take questions and do repeat them for the stream. Thanks. I will try to do my best. Yeah. Ah, lobbying. So the question is: I don't have root privileges to install Guix on the cluster. Once Guix is installed on a cluster, you can run it without privileges, but the first time, to install Guix, you need root privileges. And the system administrator of my cluster doesn't want to; I need to convince him. So maybe the answer is to say that other people are already doing that, to reduce the fear of providing a new tool. This is what I would try: to say, OK, these people are doing it, so maybe it's not so scary. I think you were next, yeah. You mentioned that you're not sure how big the time window is. If you look today, how far can you go back and still reproduce? Five years, ten years? So the question is: what is the size of the window, and can we go back five years from now into the past? The issue is that the mechanism to travel in time in Guix was introduced in 2019. So with Guix we don't have the tooling to go back earlier. The zero for Guix is version 1.0, so 2019. Yeah. A lot of scientists are using macOS and not Linux. Is it possible to use all this stuff even though Guix can't really run on macOS? So, Guix cannot run on macOS. But we can ask the question: is it transparent if we are running on macOS? Are we applying the scientific method if we are running on macOS? I don't have the answer. It's a collective decision. Yeah. My name is Alain.
As far as I understand, Guix provides the same approach as Nix. I've never used Guix before, but I have some experience with Nix. Is there any crucial difference? So, from my point of view... oops. Ah, sorry. Anyway. In the slides there is an appendix with extra slides, and one extra slide tries to explain what, from my point of view, the difference with Nix is. So the question is: what is the difference between Nix and Guix? Because Guix uses exactly the same functional package management strategy. From my point of view, the difference is that in Guix you have a continuum in the language. The packages are written in Scheme, the code of Guix itself is also written in Scheme, and the configuration files are written in Scheme. So you have one big continuum across everything. Because of that, you can extend Guix for your own needs. For example, you can write a package transformation on the fly, using Guix as a library. You cannot do that with Nix, because you have a lot of different tooling in C++ and so on. From my point of view, the difference is this unity, this continuum of the language. Yeah. Yeah, but Scheme allows you to write a kind of domain-specific language. It's a good language for writing domain-specific languages. So in fact you have the best of both worlds, from my point of view. Thank you. Oh yeah, sorry. Last question.
So, when you are running Guix on top of Debian, how do we manage the graph, and can we cut the graph to reuse a part of the graph from Debian? So the question is whether we can reuse part of the graph from Debian. Maybe it could be helpful for some packages. But when you do that, you are no longer able to manage the computational environment. If I cut the graph on Debian, I have a state in Debian with some packages, and I cut the graph at some place to use these Debian packages. If I do that, how can my collaborator cut the graph at the same place, with the same Debian packages? So this is an issue of replicability. From a practical point of view, it could be nice because, for example, Debian has some machine learning packages that are not yet in Guix, so maybe we could reuse some part. But from a replicability point of view, you lose the ability to move from one place to another. |
The under-equipped social scientist ?
Why do we need more dedicated, flexible and documented Python libraries for social sciences. |
Thank you for being here and thank you for the invitation. I'm going to speak in a maybe less technical, more reflexive way about what I have been trying to do for the last year. I'm Emilien Schultz, I'm a postdoctoral researcher in sociology of science and health in France, both at the médialab and at SESSTIM in Marseille. I'm going to say something about the way we do scientific programming in Python in the social sciences, and the way we can improve it. I gave this presentation a kind of provocative title, which is a way to speak about the specificities of the social sciences and how we can improve this whole environment to connect computing and data analysis directly to open science. So yes, it's a kind of humble presentation. If I want to summarize it in one sentence, my point is that social sciences need more scientific programming. I have three points for that. Scientific programming has, I think, the right flexibility to equip the very diverse practices that exist right now in the social sciences. For the moment we have a landscape, especially in France, mainly based on the R language, and Python could benefit from some impulse, and I think it needs it. And the gateway will be to develop very specific disciplinary packages that are still missing for Python in the social sciences, which I would call a disciplinary API for the language, and to move on toward open source practices. Just a quick disclosure: I was trained in physics, but I moved to sociology, and I'm speaking here as a sociologist, a social scientist. I'm trying to give some feedback on what we are using in our community, and to answer two questions in one, which are not very well delineated. First, how to improve Python in the social sciences? And more generally, what are the different uses of scientific programming right now? Because we don't have a very clear view of what is going on in all the different ways scientific programming can be used.
So it's a work in progress, and it may need an exchange with you. To be clear about my title, I'm not saying that we are under-equipped in a pejorative way. Social sciences have well-established open source software platforms, some of which are going to be presented today, they usually give a warm welcome to new strategies for data analysis, and their own fieldwork is expanding to digital and software data. So we are both using and studying all those software and open source tools. But, as a general point of view from a sociologist, we have on the whole a low-tech practice, and we use software applications and programming for very discreet, meaning occasional or unseen, operations. If you want to have a look, there is a very nice article by Caroline Muller and Frédéric Clavert about how historians are changing their practices and putting some more digital analysis into their work, while still conserving their overall way of working. So we need flexibility to adapt to individualized practices or topics, which are very personal to researchers. I need to say a word about the specificity of the social sciences, because I don't think we are very numerous here today. There is a variety of disciplines with very different ways of dealing with data and analyzing archives and interviews. Within each discipline there is a huge variety of methodologies, schools of thought and theoretical approaches. From an organizational point of view, there is a very weak functional dependency between researchers in our different fields. Moreover, there are very important national specificities. And each research trend is very conceptually laden, meaning that great importance is given to the frame each researcher uses to collect and analyze data. There are no global rules for how to do it. So one-size-fits-all instruments hit a hard limit in our disciplines.
Besides, there are very harsh criticisms of standardization and normalization, because they are seen as a way to erase plurality, which is a kind of foundation of our social sciences. So there is a big fight between individualization and standardization of practices. Nevertheless, shared instruments are important, especially software instruments. Science studies, which is a field of the sociology and anthropology of science, has shown the crucial role of instruments in the functioning of scientific communities. They are very important for conceptual change, because they allow us to look at things at different scales, with different topologies, the way microscopes changed how we do biology. They are very important for disciplinary identity: the way we present ourselves in our research communities and how we define our activity. And they are very important for coordination between specialties, for standardizing practices beyond a small group of researchers, and for transferring theoretical agendas and methodological changes. There are a lot of studies about how electron microscopes changed biophysics, or how high-throughput sequencers changed medicine and the way we deal with data. But there are very few studies on the way software changes and standardizes practices, especially in the social sciences. As I told you, social sciences are rather divided regarding standardization, especially imposed standardization, because it usually reflects some kind of power relationship, or one specific school of thought trying to impose its way of doing things on others, especially in sociology or political science. So there is a need to define the right scale at which to create software instruments for the social sciences.
Scientific programming, and this is what I want to say today, is a solution that both favors new scientific instruments from within the specialties and enables some kind of second-order generalization, through using the same way of doing scientific programming. It is a good entry point for new open source practices that do not exist, or barely exist, within the social sciences, like those linked to open science and reproducibility, as we have just seen, and for collaboration across the different disciplines that make up the social sciences, and for importing new things coming from computer science, like AI and machine learning. Nevertheless, for the moment, scientific programming as a global frame is not very common in the social sciences. Of course there are always the cool kids, so there are people doing it, and computational social science is a thing and is expanding very quickly in our fields, but for ordinary researchers it is not very developed. We have a lot of users of R, which has an intermediate status between a programming language and a statistical language, a kind of dual status. What I mean by scientific programming, very quickly, is a diversity of practices. They have in common that they are interactive, exploratory and based on packages, and priority is given to usefulness for researchers exploring the questions they want to address. The questions we have just seen are not the primary aim of researchers who use scientific programming: stability, design and other very important questions of software development are not at the forefront for them. And when you look at what researchers are doing, they work at different scales, and not all of them are developing and creating new packages. So the diversity of practices is very real.
If you look at what software researchers in the social sciences currently use, this is a kind of exploratory mapping, because it comes from a non-representative survey conducted by Maier-Niglou Baichek in France, in a study called State of Open Science Practices in France. We asked researchers in different fields what kinds of tools they use in their research, and they could give different answers for producing data and for analyzing data. So this is just a small network of all the tools that researchers use in the social sciences, and the humanities as well. Looking at this huge diversity, the main results are these. There is a diversity of software and of researcher profiles, nothing new about that. You find the centrality of standard office software like Word, Calc, LibreOffice, etc. The main scientific programming language is R, used by 20% of respondents, then the geographic software QGIS, 10% for SPSS, which is statistical software, and only 6% using Python in this broad social science community. And if you look at the quotes, the reality of the work is usually just using some piece of software at some point; there is no overall glue of open source tools in the researchers' workflow. Even when you look at the small part of social scientists who do quantitative analysis, so a subset, there is also a huge diversity of tools. So we are looking at very, very fragmented communities, and there is a need to create some glue between them. So 20% of researchers in this survey said they use R as a statistical and scientific programming language, and that observation is important, because R developed in the social sciences for good reasons. It developed because there is an elective affinity between the diversity of existing practices and the flexibility that R allows, which makes it possible to develop very specific packages that carry on the work of small communities.
You probably all know R; it's a script-oriented language. You can very quickly build small packages with data about a specific research project, and there is a lot of support from the French community; there are a lot of packages in French, for instance. But this leads to limits. It creates a huge diversity of types of packages. Some of them are not very easy to understand, some are not very well documented, and there are functions that exist only in R, which increases the diversity of tools that the social science community is using. Depending on which packages you look at, there is very little documentation and very little standardization of the code, and there is still this ambiguity between what is a statistical language and what is a programming language. Now, the main topic of what I want to say: what is the state of Python use in the social sciences? Let's say there are not a lot of people using it. More and more young researchers are interested in leveraging machine learning in their research, so they come to Python to try to understand how they can use it. But for the moment it is difficult to carry out in Python the basic steps of social science work, meaning all the tools we use on a daily basis, like running a logistic regression with a clear presentation of the results. And what I want to say is that we need dedicated community packages as a middle ground, for researchers to access scientific programming and then progressively become aware of all the open source and open science practices they can add to their research. For that, we need to go beyond application development itself, and beyond any one specific package; it is a whole process to implement.
What are the expected benefits of broad Python adoption? Just a short list. It will enhance scientific programming practices, especially by importing the whole Python ecosystem, notebooks in particular. It has the potential flexibility. And it will make it possible to create a common language with other communities, especially computer science. The question remains: in the social sciences we already have R as the main language, so what is the future of the collaboration between those two languages? Reject the idea of developing Python, advocate polyglotism, fuel a rivalry between R and Python, or start a transition to Python? In France, for the moment, it has been decided to teach Python as a first language to high school students, so maybe there is a change going on in what kind of languages students are going to practice in the future, and that creates a shared language. It's a leap of faith to decide what kind of tools we will need. Mine here is Python, but I'm here to speak about the place of scientific programming, so both languages are going to coexist. I just want to say a word about what I'm trying to do in France for the social sciences, especially in sociology, to enable this practice of Python, with PySHS, a package for the social sciences and humanities. I need to satisfy a double constraint: first, to achieve some standardization, because we need shared tools to be able to work together, especially to train students and to create collaborative projects; but we cannot sacrifice our disciplinary and sub-disciplinary specificities. So we need to find the right level of flexibility.
It's a four-step process, nothing very new, and I guess very common in any development of packages and tools. The first step is to identify what kinds of practices can be called quasi-standard, even if they are not completely standard. The second step is to build easy-to-use packages that can find a place in a specific workflow. The third is to prove that it can be useful, because there is no point in creating something if there is no proof that it brings some advantage to research processes. And the fourth step is to train colleagues and develop practices. Step one, then: uncover standard practices. Speaking about social science, we need to understand better what is common sense in the daily job. For instance, not all social scientists do machine learning or statistics, but a lot of them still do a bit, especially basic statistics. So there are some quasi-standard operations, for instance for analyzing surveys with questionnaires and samples. We need more things like descriptive statistics for surveys, tools for converting file formats, for generating and modifying tables, for creating intermediate documents, and for producing visualizations the way we use them in our work. For instance, in France we use a lot of the work of Pierre Bourdieu in sociology, and he has a very specific way of presenting the results of factorial analysis, and you can't find it in the Python universe. So you need to start from those existing workflows to build more adoption.
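As an illustration of what a quasi-standard operation for surveys looks like, here is a minimal sketch of a cross table with row percentages, written with the Python standard library only. It is not the API of the package discussed in the talk: the function name `cross_table`, the variable names and the six respondents are invented for the example.

```python
from collections import Counter


def cross_table(rows, row_var, col_var):
    """Return {row_level: {col_level: (count, row_percent)}} for two categorical variables."""
    counts = Counter((row[row_var], row[col_var]) for row in rows)
    row_levels = sorted({r for r, _ in counts})
    col_levels = sorted({c for _, c in counts})
    table = {}
    for r in row_levels:
        total = sum(counts[(r, c)] for c in col_levels)
        table[r] = {
            c: (counts[(r, c)], round(100 * counts[(r, c)] / total, 1))
            for c in col_levels
        }
    return table


# Invented toy answers, only to show the shape of the output.
rows = [
    {"discipline": "sociology", "uses_python": "yes"},
    {"discipline": "sociology", "uses_python": "no"},
    {"discipline": "sociology", "uses_python": "no"},
    {"discipline": "sociology", "uses_python": "no"},
    {"discipline": "history", "uses_python": "no"},
    {"discipline": "history", "uses_python": "yes"},
]

for level, cells in cross_table(rows, "discipline", "uses_python").items():
    line = ", ".join(f"{answer}: {n} ({pct}%)" for answer, (n, pct) in cells.items())
    print(f"{level}: {line}")
# history: no: 1 (50.0%), yes: 1 (50.0%)
# sociology: no: 3 (75.0%), yes: 1 (25.0%)
```

The point of a disciplinary package would be to make this a single call that prints the table the way survey researchers expect to read it, with counts and percentages side by side.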
The second step is to facilitate disciplinary use. My attempt is a small package, which is in French, and that is a choice, to stay close to students and researchers who don't usually use Python. It is meant to provide one-liners, which is the first step: easy-to-use tools to get to results more quickly, close to common sense in the way tables are organized, and designed to facilitate the complete workflow, from the data to results that can be published or presented to students. And it's based on the basic packages of the Python community, like pandas, so it allows you to move swiftly from one specific disciplinary package to more general practices. The third step is to show the usefulness of both Python and this specific package, and for that there is a need for public demonstration in context. No research tool will be used if there is no direct advantage in using it for doing research. So notebooks are a kind of perfect vector to prove and display that. I can see the figures in this; yes, it can be prioritizing, but it's a good way to present a complete research process. In a collaboration between Huma-Num, which is a platform for software and data analysis in France, and the cooperative Datactivist, we developed five notebooks, with machine learning as a starting point, to show how Python can be used from beginning to end to analyze a survey, namely the survey we discussed just before about the state of open source science practices in France.
And the fourth step, a very important one, is to train colleagues and students. You can't just put something out there in the world: the tools need to find a place in the research workflow, so there has to be some kind of transition to the tools and to the practices. We are doing it in different steps: writing books and academic examples to stabilize a shared practice in France; visiting laboratories to show how useful it is to use some Python, even if my colleagues don't usually do scientific programming; training new students, especially PhD candidates, in those new approaches and in the whole world around Python, like Git and GitLab; and creating spaces to discuss our specific practices, which for the moment don't exist. I will conclude with a few quick, concrete ideas. My point today is that scientific programming, especially in Python, although this is not specific to Python, sits between the use of applications, which is the daily basis of a lot of social sciences, especially in sociology, and not using code at all, which is also the daily basis of a lot of quantitative research in the social sciences. Scientific programming will help promote replicability and open source practices, because it is the gate to all the practices of the open source communities, and it will promote interdisciplinary collaboration with colleagues outside the scope of sociology, for instance.
Nevertheless, there is a need for facilitators; this was my whole point. We need to excavate and carry out this reflexive process of understanding which of our practices can be standardized at some point, to find early users and to build up core developers who can take part in this work of reflexivity, and to demonstrate further the concrete efficiency of those tools. There are limits: this focus on disciplinary specificities also has drawbacks, because it can increase the dispersion across laboratories, and that is something real. And maybe there are better languages to promote than Python, or other ways to do that, but I started with Python. So thank you for listening; that's my message. Thank you.
If you could take questions, please do repeat and reformulate them for the stream. Thanks.
So in computer science we also have the problem of reproducibility. Most of the time we write papers, and sometimes the results in the papers don't come with the software. In recent years, a lot of conferences have added a special track for tools, and one thing they do is to request a kind of stamp on the paper saying that the paper is reusable, so it comes with software with which you can really run the experiments. Do you think it would be helpful to propose this for the social sciences as well?
The question is that there are a lot of similar problems in computer science; I'm repeating for the audience. But I think you are way ahead of what is going on in the social sciences, and even the possibility of making a reproducible paper is not there right now in the social science communities, because the logic of programming, the logic of automating steps and not using Excel directly to do the analysis, is not yet part of the basic practices of our communities. Some of my colleagues are doing it, but they are very few, usually the youngest ones.
So what I'm saying, though it was not so clear in what I said, is that we need to make more use of what has been developed in computer science and try to find a way, a gate, to import it at some point into our practices. To understand what is going on in computer science, we also need a better culture of what all those tools are about, and it's still not here. Thanks.
The question is that there is a division between people who want to be trained in those tools and those who don't. The fact is that we are working more and more in projects, and a basic culture of what's going on and what's possible to do is something we need to share with everyone. Otherwise, huge divisions are going to appear and reproduce themselves. That's why I put this kind of very small diagram first. Scientific programming can start with reading a script, understanding a language, not producing it by yourself. And it is those steps that I think are useful for everyone: a global computer literacy for social scientists, even if they don't want to move to other, more advanced tools. This is useful, for instance, when working on a project with statisticians and people who are scraping data, to know what is possible to do. So thanks.
I have a point, maybe a suggestion, if it's not in place already. We have an institution named the Carpentries. They teach professional skills and data science to researchers. And it would be wonderful to see a workshop including Python and your library among the workshops available there. Because once we have a workshop there, we gain the potential to have more than 3,000 official instructors around the world to teach this content. It would be amazing.
So you're speaking about Software Carpentry, yes?
Yes.
Of course. The point is that there are already very important initiatives in the Carpentries, especially in different disciplines. From what I know in France, they are not very visible in the social sciences.
So I use the content they are creating. But from what I can see, it is very far from the daily questions social scientists have in their daily analysis. And there is still, you know, a transition to make. I agree completely with what you said. There is a need to make this junction. But it's a work in progress, and I will try to find someone who wants to do it with me at some point. I was trained at some point as a Software Carpentry instructor, but I never completed my training. But yes, I am aware of that. Thank you. Anyway, thanks.
I have no data about why. Sorry, I'll repeat the question. The question is: how much do we know about why people use specific software tools in the social sciences? My answer, and it comes from my heart as a social scientist interested in science, is that I know very little work studying how researchers use software and tools. Some work is starting to be developed, but I have no answer. For instance, for R: R is there because statisticians use it a lot, and so it has been taught in sociology courses. Then students became researchers, and now they are using it in France. So there is a historical path dependency on a specific kind of user. But I'm quite sure it's a huge avenue for research. Thank you.
I am quite sure you are right. The question is how people change from one piece of software to another, and this is an unresolved question. So thank you. Thanks. |
Preliminary analysis of crowdsourced sound data with FOSS |
Okay. Thank you. Thank you for coming to this presentation. I'm Nicolas Roelandt from Gustave Eiffel University, and I will be presenting some research we did on crowdsourced sound data, some analysis we did with free and open source software. I will be presenting work done by myself, Pierre Aumond and Ludovic Moisan from Gustave Eiffel University. So traffic noise is a major health concern. In western Europe, the World Health Organization estimates that we lose 1 million healthy life years each year. In France, the social cost, so the cost to the community, has been estimated at 147 billion euros per year. So it has a monetary cost, but also a cost for people and their health. So the big question is: how can we find where noise is problematic? Of course, we can't have a direct measurement everywhere. We can't put microphones everywhere. It would be a cost nightmare, a logistics nightmare, and a privacy nightmare. It's not possible. So the traditional way is to simulate the noise from traffic counts. We put counters on the road, count the vehicles and estimate the traffic. We do that for trains, we do that for planes, for air traffic, and we simulate that traffic and produce this kind of map with, for example, NoiseModelling, an application developed with the UMRAE laboratory that can compute noise maps from these counts. And this is a legal requirement from the European Commission. Another way taken by UMRAE, which is a lab working on environmental acoustics, is not to simulate but to get actual data, real data, from contributors using a smartphone application you can install. It works on Android smartphones. It's available on F-Droid, so it's also free and open source software. And it measures several things, like your position and the sound spectrum. Or rather not the full spectrum, just the third-octave bands.
So you can't understand what people are saying if someone is speaking, but you can detect that someone is speaking. You also have the sound level and some other information. It's part of a bigger project, the Noise-Planet project. So we have this NoiseModelling application that generates noise maps from open geodata, mostly French geodata and OpenStreetMap. So when you use StreetComplete to say, okay, this is grass and this is macadam, we use that data to generate more precise sound maps. Then there is NoiseCapture, to measure and share sound environments. And all of this data is served by a spatial data infrastructure called OnoMap. There are also some community maps made by the users. This is a map of all the recordings we have collected in nearly five years. You can see it's worldwide, not only France or Europe. So the question was: what can we do with all this data we collect? There was an extraction in 2021 of the first three years of data collection. The collection is still ongoing, but this extract contains 260,000 tracks. A track is a recording, from anywhere in the world, with the sound spectrum, like I said, the GPS location and also some tags the contributor can provide. It's under the Open Database License, so it's free to use. So the question is: how can we characterize the sound environment of the user at the moment of the recording with the collected data? We think of two possibilities. One is from the sound spectrum we record. That is an ongoing analysis. It's the hardest way to do it, because we have to find patterns in the recordings, and we have to use machine learning to detect these patterns in all of this data. So it's still ongoing, but there is also the easiest way, and this is the way I use: using the tags that are provided by the contributors. So in the subset of, like I said, 260,000 tracks, half of them have tags. So we can use just half of it.
50,000 are outdoors and are not tests. We want to work on the outdoor sound environment, so we discard tracks tagged as indoor or test. We also remove the very, very short ones, less than five seconds, so we remove tracks that might be accidental. And for this preliminary work we also focus on France, because we are French and it's easier for us to understand what's happening. That leaves nearly 12,000 tracks. And like I said, road noise is a major concern, and it appears directly in our data, because the most frequent tag is road. In maybe a third of our subset there is road noise. The second one is chatting. And we also have things like wind, animal sounds, works. So there are 12 different tags the user can provide. So we use quite a simple toolkit to analyze the data. First is a PostgreSQL and PostGIS database, because the data is provided as a PostgreSQL dump, so in order to access it you have to rebuild the database. And the other tool we use is R, because in the team we are mostly R users. We also have Python, but we are more familiar with R. So two tools, simple, yes? Actually not really, because in R we also use a lot of packages, like the tidyverse, the sf package for geospatial data, geojson, stats and so on. And all of these packages use dependencies like pandoc, markdown, reveal.js. This presentation is actually made with R and reveal.js. We also use geospatial libraries, like PROJ, GEOS, GDAL. And those are dependencies that are not handled by R directly; you just call them. So what did we find in this dataset? Let's talk about results. We got some interesting things. The first thing we looked at was the animal tags, because we know that bird songs can be heard mostly in the hour before dawn. This is a well-known dynamic in ornithology, and in the sound environment we can hear it.
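The selection steps described above (keep tagged tracks, drop indoor and test tracks, drop tracks shorter than five seconds, keep France only) can be sketched as a simple filter. This is an illustrative Python sketch, not the team's SQL or R code: the `Track` fields and the tag spellings `"indoor"` and `"test"` are assumptions, and the sample tracks are invented.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Track:
    """Hypothetical, simplified view of one recording."""
    track_id: int
    country: str
    duration_s: float
    tags: set = field(default_factory=set)


EXCLUDED_TAGS = {"indoor", "test"}  # assumed tag spellings
MIN_DURATION_S = 5.0                # threshold quoted in the talk


def keep_track(track, country="France"):
    """Apply the four selection rules to one track."""
    if not track.tags:                      # untagged tracks cannot be used
        return False
    if track.tags & EXCLUDED_TAGS:          # indoor or test recordings
        return False
    if track.duration_s < MIN_DURATION_S:   # probably accidental
        return False
    return track.country == country


# Invented sample data covering each rejection rule.
tracks = [
    Track(1, "France", 42.0, {"road", "chatting"}),
    Track(2, "France", 3.0, {"road"}),
    Track(3, "France", 60.0, {"indoor", "chatting"}),
    Track(4, "Germany", 30.0, {"road"}),
    Track(5, "France", 15.0),
    Track(6, "France", 20.0, {"animals", "road"}),
]
selected = [t for t in tracks if keep_track(t)]
tag_counts = Counter(tag for t in selected for tag in t.tags)
print([t.track_id for t in selected], tag_counts.most_common(1))
```

Counting tags on the selected tracks is then enough to see which tag is the most frequent.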
And we actually find it too. In this graph, the left part is the time before sunrise on the day of the recording. So we find this actual dynamic of birds singing one hour before dawn. That was a good sign. We also find peaks of road noise between 8 and 10 a.m. and between 6 and 8 p.m. It looks very much like commuter behavior, but we can't directly link the two. We can only say it's very similar. So we looked at physical events in the environment of the contributor. And we find a very good correlation between wind force and the presence of wind tags in the dataset. It works very well. We also did that with rainfall, and the correlation is not so strong, not strong enough. It might be a user bias: maybe if the rainfall is too light, the user doesn't hear the rain or doesn't think of adding the tag. It might also be a spatial issue, because the mean distance to the nearest weather station is 16 kilometers, so the local conditions may differ between the weather station and the user at the moment of the recording. So it's not so strong, but we actually find it in the data. I'm not the first one to speak about reproducible science here, and it's a real issue. For this study, we have some good points. The data is already available. The source code, we made it available, so all the SQL scripts to rebuild the database and the tables we used are available. The R notebooks we made are also available. The setup, broadly, is available. But there are also bad things to acknowledge. Some notebooks were very wide, and we went very deep into the analysis and the exploratory phase. In the end, it was very hard to reproduce, even within our team. We were actually able to do it, but for someone coming from outside, it might be difficult to get into. So we need some code refactoring and a little bit more commenting, more explanation.
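The dawn-chorus check described above amounts to expressing each recording time relative to that day's sunrise and then computing, per hourly bin, the share of tracks carrying a given tag. Below is a minimal Python sketch of that idea. It assumes the sunrise time for each recording is already known, and the record layout and sample values are invented for illustration.

```python
import math
from collections import defaultdict
from datetime import datetime


def hours_from_sunrise(recorded_at, sunrise):
    """Signed offset in hours; negative values are before sunrise."""
    return (recorded_at - sunrise).total_seconds() / 3600


def tag_share_by_hour(records, tag):
    """records: iterable of (recorded_at, sunrise, tags) tuples.

    Returns {hour_bin: share of tracks in that bin carrying `tag`},
    where bin -1 covers the hour just before sunrise.
    """
    totals = defaultdict(int)
    tagged = defaultdict(int)
    for recorded_at, sunrise, tags in records:
        hour_bin = math.floor(hours_from_sunrise(recorded_at, sunrise))
        totals[hour_bin] += 1
        if tag in tags:
            tagged[hour_bin] += 1
    return {b: tagged[b] / totals[b] for b in sorted(totals)}


# Invented sample: five recordings on a day with sunrise at 06:30.
sunrise = datetime(2021, 5, 1, 6, 30)
records = [
    (datetime(2021, 5, 1, 5, 45), sunrise, {"animals"}),
    (datetime(2021, 5, 1, 5, 50), sunrise, {"animals", "road"}),
    (datetime(2021, 5, 1, 6, 10), sunrise, {"road"}),
    (datetime(2021, 5, 1, 8, 40), sunrise, {"road"}),
    (datetime(2021, 5, 1, 9, 0), sunrise, {"animals"}),
]
shares = tag_share_by_hour(records, "animals")
print(shares)
```

With real data, plotting these shares against the hour bins gives the kind of curve shown in the talk; the same binning works for clock hours when looking for commuting peaks in the road tag.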
And there is also a lack of information on the software environment, which makes it very hard to reuse and reproduce. So what could we have used to have better tooling? Since we use R, you can use renv, which is an R package for reproducibility. It's like a virtual environment. It works well, but only for R, and we use other software like PostGIS; we use GEOS, PROJ, GDAL. So it's not perfect. Docker might be helpful, but, like Simon said before, it's not perfect for reproducibility. And I'll just say that Guix has been on my mind for one year, actually, telling myself, okay, I need to work on that. I think it would be a good solution. I won't say too much, because there was a talk by Simon Tournier just two talks before; go watch it. I think it might be a very good solution. In conclusion: we can use crowdsourced data. Oh, perfect. We can use crowdsourced data for science. Even for something quirky like the sound environment, we can use it for science. This particular dataset is usable, so you can access it and find new things. We have not asked every question that can be answered with this dataset. So it's quite fun to play with it and find things like, oh, we can find birds. I do believe that free and open source software is key for reproducible science. We can't do reproducible science with proprietary software. It's not possible. Reproducible science is hard to achieve. You have to think about it as early as possible, before starting your project, because when you are too far along, you have to refactor things and it can be very tricky. This is more of a sound- and physics-related study, but sometimes I work with economists, and I work with geographers, and they are often not very keen on technology and computers in general. So sometimes you need someone, maybe an engineer, someone in the team who can handle this reproducibility part. And so you need to get the skills.
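To make the point about missing software-environment information concrete, here is a minimal Python sketch that records the interpreter, the platform and the versions of a few named packages in a small manifest. It is only an illustration of language-level capture: like the language-level tooling discussed above, it says nothing about system libraries such as PROJ, GEOS or GDAL, and it is not a substitute for a tool like Guix. The function name and the package names passed to it are arbitrary examples.

```python
import json
import platform
from importlib import metadata


def environment_manifest(packages):
    """Collect interpreter, platform and package versions into a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
    }


# Arbitrary example names; the second one is deliberately not installed.
manifest = environment_manifest(["pip", "a-package-that-is-not-installed"])
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Saving such a manifest next to the notebooks at least tells a later reader which versions produced the results, even if it cannot rebuild the environment.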
So either you acquire them yourself, or you have to bring in someone in the team who can do that for you. And notebooks are not enough. Notebooks are great to communicate and explore things, but they are not good enough for reproducible science. So here is a link to the actual dataset. Please go check noise-planet.org. You can navigate on the map, see actual tracks and click on things to see what is recorded. Thank you for your attention. You can reach me by email or on Mastodon. This presentation is available here, and everything is accessible on GitHub. Thank you very much.
Thanks. That leaves us a bit of time for questions. So please feel free to take them, repeat them, and then answer them.
In the graph with the birds, you had a sort of dip at zero.
Yeah. So the question was about this particular graph: why is there a low point at zero, with the peak just above zero? It's because it's smoothed a little bit. You can see there is a peak just before, the line is moving and there is a little shift. But why is there a low point? I don't know. I'm not sure. Yeah, please.
Because we are crowdsourcing data, it's obviously influenced by the users who are collecting the data for us. How do you factor in, or how do you eliminate, this source of variance, where it could be the underlying behavior of humans that is affecting the results? For example, sunrise time: people who get awakened by birds before sunrise will be very annoyed, and they record more. People who wake up at normal times are too busy to even make a recording.
Okay. So the question was about the fact that this is crowdsourced data, so it is data provided by people willing to provide it. And there is a bias, of course, because you may be angry at birds waking you up in the morning, and you may be angry at traffic noise. And actually, we don't assess the data.
We take it as it is. I'm not part of this part of the project, but maybe there will be some work on it. And we hope that there is so much data that it will smooth out the bias. But of course, it's biased, like OpenStreetMap data. There is someone making a decision to say, okay, I will record this, for a good or a bad reason, or to prove a point: okay, where I live, it's too noisy, so I make a recording. And that's okay. But it's very hard to assess this kind of information. We don't know why people record tracks. Maybe it's a pleasant environment and they want to share it, maybe it's not so good. And that's okay for us. I hope that answers your question. Yeah, please.
So I just wanted to ask, because I think wind is pretty hard to incorporate. When somebody records, they probably record without a pop filter, which makes the wind sound really loud even though it is not so loud, because somebody comes up with the phone and records, and the wind blows straight into it and it's really, really loud.
Okay, the question was about the wind recordings and the fact that a smartphone doesn't have a pop shield to protect the microphone from the wind. Actually, I'm not an acoustician. I'm more of a GIS engineer, so I don't have the exact answer to that. But I do believe that when you are using your microphone, when you're talking, smartphones nowadays can protect you a little bit from the noise. But I'm not sure about that. Yeah, please.
My question is kind of connected to the bias one. When you were building the data capture... Did you build the data capture tool where people are inputting data, right? Or how was that built? And how did you make sure that people were comfortable using it in the situations that you needed recordings for?
So the question was... Can you simplify it?
So I'm interested in what choices you made in order to have the thing look and function the way it did to capture the data, and, again, in the bias. If people are not able to use it or don't like using it, does that also bias the data?
So you're not speaking... So the question was about how we built the analysis and how we built the tool. If you are not able to use R to build the... Sorry. Actually, we had to make a choice, and we are more comfortable with R. So there is a bias, of course. We also have some libraries like suncalc, for example, that make life much simpler for us: you give it a date and a position, and it gives you the sunrise and sunset times, for example. So it makes life easier for us. But, of course, there is a bias. Even in how the application was built there is, of course, a bias. But I wasn't part of the team that built this application. It's more focused on what we want to get, but it's available to everyone, so do whatever you want with it. Thank you. We have more time, maybe?
On your first slide, you had a really big number for the social costs, in France alone. It seems quite egregiously big. Do you know anything about what's included in the social costs? What are the costs that are incorporated into this number?
It comes from ADEME; it's a huge report. ADEME is a French environmental agency, so it works on noise pollution, but also air pollution and things like that. Sorry, I didn't repeat the question: the question is about the social cost, the amount and how it is constructed. I just read the report quickly. The social cost is mostly about health issues, lack of sleep and stress related to noise, and things like that: how it affects people and their health, and how poorer health is a cost for society, because you have more anxiety. Thank you very much. |
Tackling disinformation using opensource software
The case of Qactus |
Good morning, everyone. My name is Hervé and I'm French. Sorry, my English is not perfect; it's not my mother tongue. I work for an NGO called OpenFacto, which is a French NGO, and I'm going to talk about it right now. It's a self-funded NGO based in France. We created it in 2019. Our initial goal was to try to federate the Francophone OSINT, open source intelligence, scene, as we noticed that the Anglophone one was pretty active and brilliant, with Bellingcat, for example, and people like that. So we decided to create this NGO, and we wanted to assist newsrooms and activists on OSINT investigations. We also wanted to train journalists in what we knew, and to promote young journalists based in France, to help them get some OSINT skills, which is a kind of pragmatic way to find a job in France right now. As we are a self-funded NGO, we also wanted to set up some philanthropic projects. We trained investigative journalists from Syria who have relocated to Europe. We also ran some programs in West Africa, in the Francophone area. The NGO today has about 260 members, and counting. About us, some of our work first: we participated in some pretty cool investigations, such as Green Blood, for a media organization called Forbidden Stories. We also published and worked on things for the BBC, and for French television on the Wagner Group, for example. We also won a prize with Swiss TV for a documentary on war crimes in Syria, in Ukraine. Recently, we published two reports I can recommend on InfoRos. I don't know if you've ever heard about that. It's a Russian company backed by Russian intelligence. We worked on that and tried to dismantle their disinformation network inside their border. About me: I'm not a coder, obviously, as you can probably see. I'm the co-founder of the NGO. I'm a former judicial investigator. I'm an open source software enthusiast. I have been compiling Linux kernels since 1999 or something like that.
I'm an open source software enthusiast and evangelist. I'm a techie, but not a coder. I consider myself rather like a Swiss army knife. I'm very lazy, and believe me, if you want to solve a problem, hire a lazy guy or a lazy woman. I'm very curious. What about disinformation? This talk is about disinformation. Disinformation is, as you probably know, false information that is deliberately spread by an actor in order to deceive people. There is a fantastic researcher called Ben Nimmo. He tried to summarize this concept using the 4Ds acronym. The first one is dismiss. Another one is distract: distract the people, of course. Then distort the truth, and also dismay. So that's a very good way to define disinformation. It's an art practiced by several kinds of actors. You can compare CTI, cyber threat intelligence, and disinformation; there are some overlaps between these actors. All the links are in the presentation. There is a really cool investigation called Doppelganger by the EU DisinfoLab. I strongly recommend this report about media clones serving Russian propaganda. I'm going to switch windows from time to time during the presentation. Disinformation is about war, but also health, politics, the economy. It covers almost every topic, actually. Of course, the internet and the web are probably the greatest echo chamber that these actors use. In France, we have such a problem: we have a problem with QAnon. There is a website in France called Qactus. It's 'actus' for news, and cactus, the plant. QAnon is an American political conspiracy theory and political movement. I'm pretty sure you've heard about it. It's mostly far-right based. It appeared on the net in 2017. This French website is, of course, conspiracist. It's all about anti-vax, and it's also very antisemitic. It's a problem. It's present on several platforms: the web, of course, but also Telegram, Twitter, Gab, etc., all the networks. It has some kind of translated content.
As soon as you arrive on a web page, depending on your language, you can get the news automatically in French or in English. This is via Tor, the Tor Browser. It's basically in English, but you can also see it in French, for example. Most of the readers are French or Francophone. It publishes five to seven articles a day, mostly crap. It's always crap. It has more than two million visitors per month, so it's not nothing. It's a lot of people. I have a personal blog and we have an OpenFacto blog, and we don't get two million visitors a month, I can assure you. Since it opened, it claims more than 80 million visitors. So it's kind of a big thing. It's one of the most productive QAnon websites. In order to carry out the investigation, we needed a methodology for disinformation. There is a very good framework called ABCDE, very easy to remember. It's a good start. A stands for actor: who is running the disinformation campaign? B is the behavior: how? C is the content: what is the content about? D is the degree, meaning the scale of the disinformation campaign. And the last one is E. E was added later, and it means the effect that the actor is looking for. So if we base our research at OpenFacto on this for Qactus, we would ask something like: what is Qactus? What is its audience? Its environment, its ecosystem, its influence, and, above all, its motivation? And last but not least, if we can identify who is behind Qactus, that's really cool for us as journalists. So we decided to approach this by trying to qualify the environment. In order to do that, we used three different tools that I call a sweet combo. The first one is Hyphe. Most of these tools were made by the médialab at Sciences Po, and Hyphe is one of the greatest ones.
It's a tool that you can download on your computer. It mostly scrapes the web, starting from a URL, and you can build a web corpus of all the websites that are connected to this website through the links inside the articles. So it was very cool to use with Qactus, because the most important activity of Qactus is writing articles, as I mentioned, but inside they put links to other articles, sometimes legit ones, sometimes links that lead you to other conspiracy websites. It's kind of a long process, because at the time of the investigation Qactus had about 8,000 articles on the website. So it took me a long time to scrape everything, and I had to go to level two in order to scrape the links from the websites that were cited by Qactus, if I'm being clear. So I had to go to a depth of two. It works pretty well, with a caveat. There are some limitations with Hyphe, especially one that really bothers us. It's Cloudflare, sorry for the typo, because as soon as you arrive on a Cloudflare-protected website, and a lot of websites are protected by Cloudflare, you cannot scrape it with Hyphe. But it works pretty well. Actually, not on my computer: I tried at the hotel last night and it didn't work, but believe me, it works. So I scraped the links. I'm going to show you the result after that. The tool that I combined with Hyphe was, of course, Gephi. I've been a Gephi enthusiast since probably 2011 or something like that. It's a very interesting tool to explore and analyze graphs, especially if you use the modularity algorithm to detect subcommunities, and it's a brilliant tool that I'm sure you already know about. And the last one was important for us as journalists. We try to illustrate our articles with images and graphs, et cetera, and it's very difficult to render graphs on websites. This tool is brand new, and one of its authors is in this room, actually, and I'm super happy to know that.
And it's Retina. Retina offers you the opportunity to embed a graph in your website, to dive into it and analyze it, or at least to help people try to understand it, and that's really cool. So that's what we got, and I'm going to show you the rendering in the article. So this is Hyphe, and based on these three tools, what I tried to do is to illustrate the environment of the website. The first one, at the top level, is Qactus. All these websites were mentioned at least once in the articles, and then these websites also cite one another. Here, for example, you can reach the American disinformation network. You've got the Francophone one here and the Canadian one here. So it's very interesting to see where Qactus sits in the middle of everything. I tried to illustrate the graph like this, and with these three tools it gives you a very good idea of the environment of the website. What you can also see inside are some legitimate websites, such as Mediapart, for example, or YouTube, et cetera, and that's one of their techniques to drive activity to the website. They link to regular websites and also to conspiracy websites, in order to improve their SEO, their ranking on Google, for example, so it's very important for them to link both legitimate and disinformation websites. So that was the first part of the investigation, made purely with open source software. The second one was the influence, and it's rather usual to do that, so we needed a good scraper for Twitter. Our idea was to understand how the disinformation could escape from its traditional networks, such as Gab or Parler, for example, and move to mainstream platforms. Twitter used to be a very good platform to analyze. I don't know what's going on there, and I don't know what it's going to be within a few days, but snscrape is a cool tool to do that. I don't know if you've heard about it. It's a scraper for Twitter.
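The depth-two link crawl behind such an environment map can be sketched as a breadth-first traversal that stops expanding pages at a given depth and records site-to-site links as directed edges. The Python sketch below is only an illustration of that idea and is not how Hyphe is implemented: `get_links` is a hypothetical callable, and the `.example` URLs form an invented in-memory web so that no network access is needed.

```python
from collections import deque
from urllib.parse import urlparse


def domain(url):
    """Reduce a URL to its host name, used here as the graph node."""
    return urlparse(url).netloc.lower()


def crawl(start_url, get_links, max_depth=2):
    """Breadth-first crawl returning directed (source_site, target_site) edges.

    Pages at depth `max_depth` are discovered but not expanded.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    edges = set()
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target in get_links(url):
            src, dst = domain(url), domain(target)
            if src != dst:  # ignore links inside the same site
                edges.add((src, dst))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


# Invented in-memory web standing in for real pages.
FAKE_WEB = {
    "https://start.example/a": ["https://news.example/x", "https://other.example/y"],
    "https://news.example/x": ["https://third.example/z"],
    "https://other.example/y": ["https://start.example/a"],
    "https://third.example/z": ["https://deep.example/w"],
}
edges = crawl("https://start.example/a", lambda u: FAKE_WEB.get(u, []))
for src, dst in sorted(edges):
    print(f"{src} -> {dst}")
```

An edge list like this is the kind of input a graph tool can load to lay out the network and look for communities.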
So what we did basically was to extract all the tweets, since the site's creation, that had a link referring to qactus.fr, which was the website. We wanted to know about its activity, so you basically just open a terminal and copy-paste this. I'm not going to export it first, but look at what you receive, I hope my connection is okay. Yes, so you get all the tweets. You can go and grab a coffee, you don't have to scroll, and when you come back, you have all the tweets. There are approximately 124,000 tweets that talk about it. Once again, that's not nothing. And you can export that as a JSON file. Here. So it's going to keep working, and I'm going to stop it right now. Once you have these tweets, you have to clean them up and wrangle your data. For that I used another tool called OpenRefine. You probably know about OpenRefine. It's a powerful tool for cleaning data. And there is a wow effect with this tool that I like, for journalists especially: it's how it deals with JSON files. So this is maybe going to be possible. You open the text like this and it knows how to render a JSON file as CSV, which is a format that we all like when we do data journalism. So now you can analyze all the data that you have. That's why I love OpenRefine, and also when you want to do some clustering. And the last one was, of course, R and the tidyverse. I'm not going to dive into R and the tidyverse. You have had several presentations on that. But these tools are really easy, because R is an easy language for statistics, with an easy syntax, very readable even when you don't know how to code. And this is really important for newsrooms. Most of the time, few journalists know how to code, so having code that is readable is important. You get a quick overview. So we've got 10 minutes, yes, thank you. It's very convenient for prototyping and, also important, for reproducibility. But if you want, you can use Python. I don't want to enter the war between R and Python.
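The JSON-to-CSV step described above can also be sketched in a few lines of Python. This is only an illustration using the standard library, not what OpenRefine does internally; the field names (`id`, `user.name`, `text`, `lang`) and the sample records are invented for the example, and the input is assumed to be one JSON object per line.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def jsonl_to_csv(jsonl_text):
    """Convert JSON-lines text (one object per line) into CSV text."""
    rows = [flatten(json.loads(line))
            for line in jsonl_text.splitlines() if line.strip()]
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


# Hypothetical sample records, standing in for exported tweets.
sample = "\n".join([
    json.dumps({"id": 1, "user": {"name": "alice"}, "text": "hello"}),
    json.dumps({"id": 2, "user": {"name": "bob"}, "text": "world", "lang": "fr"}),
])
csv_text = jsonl_to_csv(sample)
print(csv_text)
```

Nested objects become dotted column names, which is roughly the tabular view a spreadsheet-oriented workflow expects.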
The motivation. For the motivation, we use Firefox and we dive into the source code. And this one was pretty easy, because Qactus has a Google ad program inside. So every click that you make on this web page gives him some money. So this is the main motivation, apart from the ideology. And about the identity, all the tools we used come from GitHub: Holehe, GHunt and Sherlock. These are command-line tools that are now packaged by a French company called Epieos. But you can use them in your terminal, and they will help you dig into nickname and email combinations on websites. We also used some leaks, and we found out who Patrick, the person behind this website, was. So let's wrap this up. First of all, I want to talk about the myth of a single tool in OSINT. Most of the time when I say to people, well, I'm a journalist or I work on OSINT, they say, oh, which tool do you use? There are several tools. There is no single tool in OSINT. You've got to mix everything. About this investigation: this guy, Patrick, is a retired IT technician in a lost town in the middle of France. He's a conspiracist, but as we saw, he's in the middle of a bubble of conspiracist influence. He earns approximately $4,000 a month just by writing crap. So it's very important for us and for the audience to know him and to out him on the website. Patrick, we know who you are. That's my thing. Why do we use open source software? Because it is powerful and adaptable. It respects your privacy and supports reproducibility. I never know how to pronounce this word. Most of the time, these tools are free, which is really important. And I insist on that. Journalists, we don't have money. And when you go to Africa and train NGOs, for example, or journalists, it's very complicated to buy software for them. They don't even have regular machines. So having open source software is really important. So thank you, all of you. If you code for open source software, it's really important for civil society.
Thank you. The last one, collaboration. I didn't talk about collaboration. With you in the room today, it's important for us to mention that collaboration in our investigations is the key. First of all, through software, but also in order to share information and to share indicators of compromise, for example. So data capitalization is important. And there is this project called OpenCTI that I hope we will manage to establish in Europe as a tool. It's an open source tool. And we'll try to establish it as a standard together with the DISARM matrix. It's called DISARM. All the links are in the presentation, in order to get some cooperation. Thank you very much. Thank you. Let's spare a thought for Patrick, and on with the questions. Yes. You mentioned you used some leaks. Is that leaks where friends gave you information? Or is it that you found things that had been put inadvertently in the public space? Second one. What we tried to do when we do something. Sorry. I mentioned the fact that we use leaks. And the question was, is this information that some friends give me or information that is publicly available on the internet? That was the question. The answer is, as OpenFacto is an NGO, we don't have friends. So we use public leaks. We dive into Telegram, the darknet, et cetera. And we basically use keywords, passwords, and emails in order to make connections between nicknames and try to identify people like this. That's what we do. Thank you. I'm also a former investigative journalist and investigator, so what I try to apply is a sort of judicial methodology. We try to open up everything in our investigations, and we mention all the information that we get. If we get a leak from somewhere on the internet, we explain where we got the leak every time. Transparency is important. Another question here? Yeah. Are you Jewish? No, I'm not. OK. The question was, am I Jewish? No, I'm not. No, no, no, no, no. I'm not Jewish. Other question? Yes.
I mean, you mentioned one particular person that you identified, but I guess there are more people. What is the scope of the success against this disinformation? So the question is that we identified one person behind this website, and there are probably more people and more websites like this. We are just a bunch of cool people working behind their computers. So we try to do our best, but at least we have one website which is really influential. We start with the most productive website that we want to investigate, but of course there are millions, well, not millions, but thousands and thousands of people doing this kind of thing. We do what we can, and we try to inspire other journalists to make their own investigations and to out these kinds of people. Sorry. How big is the environment of NGOs working on this kind of subject? Are you the only one? Are there a lot of NGOs and friends working on disinformation? The question is, are there a lot of people, for example, in France working on disinformation? There are some newspapers and newsrooms that are starting to investigate these kinds of topics. It's not very popular, not very well known, but I think that over the last three or four years disinformation has become a major issue, especially in Europe, and we see plenty of NGOs such as EU DisinfoLab, for example. They are friends of ours, and we really like them, and we try to collaborate with them. So, yeah, I hope that we'll get more and more people to work on that. Other question? How can you stop automatic disinformation? I'm thinking about ChatGPT and bots and stuff like that. Can you distinguish whether it's auto-generated or human-generated? So the question is, how can we stop automatic disinformation made, for example, by ChatGPT? ChatGPT is going to be a very big issue for influence, which is something pretty obvious. When I say that, I'm saying an obvious thing.
But, yeah, it's going to be a problem because websites, especially search engines, use SEO techniques, as I mentioned, in order to rank websites. And one of the biggest issues, for example, for Google, will be to figure out whether content is not only duplicated, which was pretty easy to do, but now automatically generated. So I don't have any solution for the moment. We haven't faced this problem so far. ChatGPT is quite recent, and its use is not widespread right now, but that's going to change within the next six months. I'm sure about that. As an answer to this gentleman's question, OpenAI recently launched a site where you can give it a text and it will give you a confidence level on whether it's human-generated or computer-generated. Yeah. Sorry, sorry. Yep, sorry. This one. Yeah, I was curious about the graph that you made, about the nodes. Could you explain more about the colors on the page? The level? Yes. So the level is basically the title of the website? Yeah, sorry, the coloring. Ah, the coloring. We use the betweenness centrality on this graph. It's usually what I do on this kind of website in order to figure out the importance of a node as a distributor of the connections between other sub-communities. So, yeah. Because the American cluster is, I'm just curious about your clusters. Ah, okay. Oh, the distribution, you mean, for example, of the links? It's basically the standard Gephi one. No, no, no, no, the spatialization. What? ForceAtlas. ForceAtlas. ForceAtlas. So it's this one. Yeah, this one. So it's ForceAtlas, so every node that is closely linked to another one will be packed together with it as a group, and a cluster that is not very linked to another one will be far away from it. There is a bias in the representation because I wanted Qactus to be at the top. So that part is my fault. But for the rest, it's basically the extract and the export from Gephi.
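The betweenness centrality used for the node coloring above can be illustrated with a small standard-library sketch. It uses Brandes' algorithm on an unweighted, undirected graph; it is not Gephi's implementation, and the toy graph (two triangles joined through a node called `hub`) is invented for the example.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised betweenness (Brandes' algorithm), undirected, unweighted.

    adjacency maps every node to an iterable of its neighbours.
    """
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        order = []                                   # nodes in BFS visiting order
        predecessors = {node: [] for node in adjacency}
        sigma = {node: 0 for node in adjacency}      # number of shortest paths
        distance = {node: -1 for node in adjacency}
        sigma[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:                  # first time we reach w
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = {node: 0.0 for node in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in centrality.items()}


# Toy graph: two triangles (a, b, c) and (d, e, f) joined through "hub".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "hub"),
         ("hub", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
graph = {}
for left, right in edges:
    graph.setdefault(left, set()).add(right)
    graph.setdefault(right, set()).add(left)

scores = betweenness_centrality(graph)
print(sorted(scores.items(), key=lambda item: -item[1]))
```

The node that bridges the two groups gets the highest score, which is the "distributor of connections between sub-communities" role described in the answer.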
Yeah, I was just wondering, in the beginning, you said that disinformation is deliberate. Yes. Incorrect information. Yes. How can you judge if it's deliberate? I mean, this guy behind the website, is he like, I'm going to push this lie, or is he actually believing it himself? Is there a method to... So the question is how we can make the distinction between misinformation and disinformation. Basically, when you start an investigation, this is the first question that you ask yourself. Is it going to be disinformation or misinformation? Misinformation is something that is false, but the person thinks it's true. Okay? But when you see the content of the website, you know that he is always talking about the same topics, always distributing the same kind of information, so you know that he does this deliberately. But the motivation is important, and that's why, in this case, it was important to identify the Google Ads tag, because his motivation is probably ideological, probably, if you see his Facebook wall, for example, but there is also the money, and the ads, and Google and the platforms are fueling the disinformation with the ad system. So this is really important to identify. Okay, last question over there. Yes. Is there a strategy or an approach to tackling disinformation in authoritarian states? I didn't get the question, sorry. I mean, is there a strategy or an approach to tackling disinformation in authoritarian states and with partnerships? I mean, my information is far too beautiful. So the question is, is there a strategy to tackle disinformation from authoritarian states? I think so. The problem is that everyone has their own strategy. So there are millions of strategies, and this is the big issue, in my view. Europe, for example, is trying to organize this through different programs.
EU DisinfoLab is the leader in trying to federate the disinformation-fighting structures and activists in order to make wider and bigger investigations like this. So, I think, yes, there is a strategy. The problem is that for the moment, there are multiple strategies. But it's going to change. Thank you. Thanks. |
PIMMI
Python IMage MIning |
Hi, everyone. Well, I'm very impressed to have such a large audience for such a small tool. I'm Béatrice. I work at the French médialab. And today I'm going to present PIMMI, which is a tool to study image propagation. The médialab is a lab where social scientists try to, among other things, study the traces that people leave online. For now, they are quite well equipped with tools to study text. But when they ask me, OK, how can I study meme propagation, I'm still struggling to give them answers. So what does it mean to study meme propagation? It means being able to recognize that some parts of an image are copied or partially copied. So what this tool does is very simple. It's able to create clusters of images and group together images that are total or partial copies of each other. It's able to deal with image transformations, so if the image is cropped or zoomed. And it's able to adapt to corpus characteristics. So it will try to make the best of your data set, depending on the number of images you have or the type of images you have. What PIMMI is not able to do is cluster semantically similar images. So it's not the tool that you are going to use if you want to create clusters of cats and clusters of dogs or, I don't know, find images of violence versus images of peace. And it's not able to do face recognition. So again, you will not be able to make clusters of pictures of Elizabeth II versus clusters of pictures of Emmanuel Macron. What you could imagine doing, and we could also imagine working together if you are a researcher working on those subjects, is to study the propagation of memes on social networks, as I was saying. But you could also study the usage of press agency photos in a press corpus, or stock photos as well. You could also study the dissemination of fake news based on image montage. Or you could study the editorial choices of different media, depending on whether they use the same images or not.
So let me do a quick demo of how it looks for now. It's not on the screen. Okay, let's forget about that. I'm very sorry. Well, I'll try to make it work. Okay, well, it's still not showing all the clusters. So we create clusters of images. This is a data set that was created by the French institute Inria and that presents some degradations of images. They take an original picture and they apply some filters or they crop the image, to see if they are able to group the images together. We can see that we have pretty correct results on that data set. And these are our results on some images that we collected ourselves on Twitter using Elon Musk as a query. So we try to cluster those images. As you can see, we have images of Elon Musk. We are able to group together some images that are crops of others. So this is probably the source image of the montage that has been done here. But we can also see that we have some problems with the tool. For example, here we have a cluster with two images that have been assembled together, and so we create one cluster out of what are actually two images. But well, that's the state of the tool for now. And now I'll try to come back to my slides. Okay, so how does it work? For people who work in computer vision, I'm probably going to say some things that are quite basic, but I'll try to make it clear for people who do not do computer vision. So it is not based on colors at all. It uses the grayscale version of images. It tries to detect points of interest in a picture. Then it uses these local key points as vectors. And then those vectors are indexed in a database that is able to perform some very quick similarity search. [...] of the tool. As I said, there is that problem of parts of images that create clusters that are bigger than they should be. So our plan is to be able to detect images that are actually those links between two clusters.
So to be able to detect that this image actually contains two images, and to be able to deal with parts of images. And also what we would like to do is to show images in their context, to be able to show the tweets that contain those images, or Instagram posts, et cetera. Or at least to show additional metadata to the users. And also we would like to show you the graph of image similarities, so that the clusters resulting from that graph are more interpretable. And to improve our tool, we need your use cases, because for now we have those two or three data sets. But we would be very glad to set up some partnerships with other researchers to improve the tool. Thank you very much for your attention. If you want to look at the slides, we have the references to all the images used and to the papers on the algorithms used by PIMMI. I'm open for questions. We had a bit of trouble with the sound stream, but it's back on now. So yeah, you should repeat the question. Okay, I'll do. Yes. So thank you very much for that. We'll try to find similarities. Oh, sorry. I have to repeat the question. So the question was, if I understand well, how to reproduce that use case, not on images, but on other types of documents that would have, I guess, some features. 3D counterparts. And I'd say, well, as long as you can represent your data in the shape of vectors, then you're ready to use Faiss to do some nearest-neighbor search in your database. And then you can go for the whole pipeline: create some graphs, find communities in the graph, and go for it. I'm not sure PIMMI is your tool, but the architecture of PIMMI could, of course, be a model. Yes. Is there any current or ongoing project at the médialab that has used it, or is it still largely in development? It is largely in development. Sorry, I repeat the question. So are there some projects at the médialab that are currently using PIMMI? And the answer is no. Yes.
Sorry, can you consider any other ways that can be considered? Yes. Have you considered other ways of presenting picture similarity or using picture similarity, or other types of image similarity, if I understand well? Well, I'd say that that was what I was saying in my second slide. There are other types of image similarity, for example, semantic similarity. And, well, maybe in a few months, if we have a robust architecture, we could include some other types of vectorization of images. But for now, well, there are already tools that do that. There is something called Clip Server that helps you find similar images from CLIP vectors, which are semantic vectors. So you could use that tool. It's great. Yes. Yes. So the question is, is the tool really able to distinguish the thing that is of interest to us, the fact that we are talking about a dog? The tool is only able to find partial copies in an image. So the tool would probably be able to say that all those images contain the same part of the face of a dog, and it would probably be able to group all those images together. The problem is that if there are other images in the database that contain the rest of the images, then they would probably also be grouped in the same cluster. That's why what we are currently doing about parts of images would let us improve the cluster so that it's purified from the rest of the images. And we could have a cluster of the face of that specific dog and then that taco in a second cluster. Yes. What kind of clusterization do you use on the graph? Well, for now, excuse me, I repeat: what kind of clusterization do you use on the graph? For now, we have our best results using pure connected components. Actually, the sparsification we do on the graph to reduce the number of links between images is enough to obtain separate connected components in the graph. And so we take each connected component and it's our cluster.
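The clustering step just described (sparsify the similarity graph, then take each connected component as a cluster) can be sketched as follows. This is not PIMMI's code: the edge weights are assumed here to be a count of matched key points between two images, the threshold `min_weight` is a hypothetical parameter, and the image names are invented.

```python
def cluster_images(nodes, edges, min_weight):
    """Drop weak links, then return connected components as clusters.

    edges is a list of (image_a, image_b, weight) tuples; links with a
    weight below min_weight are ignored (a simple form of sparsification).
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for image_a, image_b, weight in edges:
        if weight >= min_weight:
            root_a, root_b = find(image_a), find(image_b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), []).append(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical similarity graph: img3 and img4 share only a few key points.
images = ["img1", "img2", "img3", "img4", "img5"]
links = [
    ("img1", "img2", 40),
    ("img2", "img3", 35),
    ("img3", "img4", 3),   # weak link, removed by the threshold
    ("img4", "img5", 50),
]
clusters = cluster_images(images, links, min_weight=10)
print(clusters)
```

With the weak link removed, the graph splits into two components, which illustrates why a single spurious link (for example a montage containing two pictures) can otherwise merge two clusters.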
What we would like to do is to try to combine that with some Louvain community detection, but actually for now, it's not the thing that works best. Yes. I'm not sure I understand the question. Can you try to rephrase it? Okay. What things are you looking at to improve the model? Well, there are many things we are looking at. For now, mainly, we look at techniques to do a better graph sparsification in order to find more coherent clusters. We are not working so much on the local descriptors part of the tool for now. Have you considered using the direct link to the Twitter images or social media images online? Did I repeat everything? Well, yes. We would like people to be able to see images in their context because, actually, they won't understand what's happening if they just have images. They need to see, okay, why was this image published? Who answered, et cetera. This would probably mean that we need to add at least the links to the posts, or maybe some kind of visualization of them. We have a bit of time here. Any more questions? We can take one or two. If not, we can switch quietly. All the next questions. Thank you. |
AMENDMENT Better engineer-researcher collaborations through data control |
All right, so I'll start a little bit earlier than the actual 2 p.m. slot. Just one thing: I'm wearing this t-shirt because I'm one of the devroom managers here in this room, and I'm actually taking over a talk slot that has been cancelled. We were supposed to hear a talk by María Arias de Reyna, who couldn't make it today, unfortunately. She was supposed to talk about data flowing in the right way, which is a talk about a tool called Kaoto, which implements data workflows with a low-code, no-code approach. This is what it looks like. Of course, I can't talk about this tool because I don't know it. It actually looks pretty cool. So we are very sorry for María not being here, and we hope we can host her next time. So I will speak about a research project called RICardo, in the digital humanities, which I've done with, oops, sorry, how are we going? Yes, I worked with Béatrice Dedinger from Sciences Po, Centre d'Histoire, so she is a historian, and I am Paul Girard. I am from a company called OuestWare, a small agency specialized in developing data-oriented web applications, and we work a lot with researchers. Today I'm here to talk about how a collaboration between a researcher, Béatrice, and a data engineer, myself, can be fostered by using data control systems. By data control systems, I mean making sure we care about the data we are using in the research, through documentation, version control, and also quality control. The research is about the history of trade. Together with Béatrice, we built a database of trade flows between nations, well, between geopolitical entities in the world, in the 19th century, which means that for the main data, we know how much money, in different currencies, has been exchanged between different geopolitical entities in the 19th century.
We know imports and exports, and we know this with a bilateral view, which means that we know the trade from France to the UK, for instance, and the reverse too, from two different sources, which makes it quite a nightmare to deal with, but still. So these are basically the main publications we have already achieved. We started by releasing, in 2016-17, a web exploration tool, which I'll show you, and also a paper about how we built this database. And then in 2021, we released a new database called GeoPolHist, which is basically a data set that tries to track geopolitical entities. I'm using this weird word because we have countries, of course, but we also have entities that are parts of countries, and we also have trade with entities that were colonies at that time, and all those kinds of weird political statuses. Because of that, we built this GeoPolHist database, where we tried to track which geopolitical entities were controlled by which other ones over time. And recently, we released a new version of the database, adding 230,000 new trade flows. This release of new data came about because Béatrice discovered new archives, new archival books about trade. This massive update needed a tool to make sure we can actually release data that are clean and structured the way we want, without having to deal with all those kinds of issues manually. I will speak about that a little bit later. So this is what the main website looks like. It's a web application where you can explore basically the trade of the world in the 19th century. We have different kinds of visualizations. I will not go through all of them because I don't want to focus too much on this today. Well, if you have questions about this, we can go back to that later. We also have this website, GeoPolHist, that allows you to explore the political evolution of the sovereignty links between the different entities.
I'll show you a little bit what it's like just afterwards, I think. So just to be totally honest, this slide deck is actually something I already presented at another conference. I wanted to speak about the visual data exploration tool we made, and about frictionless data integration. That is the main point I want to speak about today, point two. And the third point was how we can actually analyze heterogeneous data in the long run, like one century of data. I will try to convince you that data integration is a very nice and important tool to foster this long-lasting collaboration between Béatrice, the historian, and me, the data engineer. About collaboration, I just put a link to a conference talk I gave a few years ago on this specific subject. So, visual data exploration. I will go really quickly over this part to focus more on the second part. Our main objective here is to propose a set of interactive data visualizations on the web that allows researchers, or basically anyone exploring this, to change points of view on the data, looking at, for instance, the total trade of the world, then focusing on one specific country, then on one specific currency, and to be able to have all these different ways of looking at the data in the same tool. We also like to offer visual documentation: visualization is a very nice and important tool to spot issues, surprises or errors in the data, and to unfold the complexity of the whole phenomenon. So this is, for instance, the world view. We are able to retrace world trade over a century, but as you can see, there is more than one curve. We have different ways to calculate that from the data. We can, for instance, take the work of researchers who did re-estimations of this total trade by correcting sources and all this kind of stuff. So that's one way to do it.
That's the green curve in this visualization, but we can also sum what we have in our data. So here, the yellow one, we are summing all the flows we have, and for the red one, we are summing only the totals that were in the archive books, and it's not the same thing. If you sum what we have, or if you take the sum that was done at the time, you don't get the same results. Welcome to the nightmare of dealing with archive data. In this visualization, for instance, we are focusing on one country, Denmark, and we can follow the trade of this specific country over the long run. And here, this is Germany/Zollverein, we can also depict not only the total trade, but also the different bilateral trades between Germany and its trade partners over time. Okay, so this is an objective. So in GeoPolHist here, for instance, we see what we are talking about when we talk about Germany/Zollverein. We are talking about a geopolitical entity that had a different set of statuses over time, as you can see here, and then you can see on the bottom line all the other geopolitical entities that were actually part of Germany through time. Because sometimes we have trade with only Saxony or Waldeck, and we want to know eventually if those entities are part of another one. So, frictionless data integration. We are using Data Package from Frictionless Data here, from the Open Knowledge Foundation. Actually, there is a talk by Evgeny from the Frictionless team later today in the online part of our room. He'll talk about the new tool built around Data Package, and I'm very interested in that. But I will talk about what I've done myself. So about this project, the main thing we do is versioning the data. We put the data as CSVs into a version control system like Git, simply. Here it's on GitHub.
And you can see that you can track, just the same way we do with code, who changed which data, when and why. Here, for instance, it's Béatrice, who corrected a typo in a flow number, adding the comma at the right place. And we have the commit comment here. This is very important to keep track of what's going on, because we have hundreds and even thousands of files like that. So it's very important to have that, also to know, if we have issues, where they come from. So, Data Package. Data Package is a formalism, basically a JSON file, where you describe the data you have, adding documentation. So the first interest of using a data package is to document your data set, to make it easier for other people to understand what you wanted to do. And it's very important for publication at the end, for open science. So here we have the title of the project. We have the license and the contributors. That's also very important to have. And then we describe resources. A resource can be seen as a data table. Here, for instance, we have RICentities. And for each resource, which is a CSV here, we describe the fields we have in the table. So we know that the RICname field is a string, it's unique, and it is required. So it's very much like a relational database schema. It's the same spirit, but in a JSON format. The first reason to do that, as I said, is documenting. The second reason is to control your data, so driving data validation. If you have a data package described like that, you can then use a Python library, it's called frictionless, which will check every data line you have against the schema you wrote.
And if a line does not respect the schema, it will provide you with reports, with errors like, for instance, here, a foreign key error, because the modified currency year is not known in the table that is supposed to have this data. So it's a very nice way to work: we get new data, and then we check, okay, where do we stand with respect to what we want to achieve at the end, which is to respect the data package formalism we wrote. So that's very nice. It's very cool for data engineers. But as I said, our goal with Béatrice is that we work together. And the thing is, when she enters new data in the system, she has, as a historian, to take decisions on how to interpret the data that were in the archive. I can't. That's not my job. That's not my responsibility. I don't have the skills to do it. So we need something to allow her to correct and update the data that comes in, to bring it into the data package format. And she can't use command-line tools, Python scripts and that kind of stuff. So we were missing a tool here to let humanities researchers, in this case, but people in general, interact with the data flow through something other than an interface that is too technical. So we developed a specific web application that helps Béatrice integrate new data by using the data package as a validation system. All of this is done in JavaScript. There is also a JavaScript library for Data Package. So basically these are the steps. The idea is that Béatrice will upload a data spreadsheet, so a new data file, the transcription of one new archival book she found. The tool will first check the spreadsheet format, saying, do we have all the columns we want and everything. If it's correct, then it will go through all the data points, checking all the errors and grouping them in order to propose a correction interface for her to correct all those issues through a form.
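The validate, group and correct steps described above can be sketched with the standard library. This is a simplified illustration, not the frictionless library nor the actual application: the schema borrows the `fields`, `type` and `constraints.required` keys from Table Schema, while the field names, the `KNOWN_PARTNERS` vocabulary (standing in for a foreign-key check) and the sample rows are invented for the example.

```python
from collections import defaultdict

# Hypothetical schema in the spirit of a Table Schema descriptor.
SCHEMA = {
    "fields": [
        {"name": "partner", "type": "string", "constraints": {"required": True}},
        {"name": "year", "type": "integer", "constraints": {"required": True}},
        {"name": "flow", "type": "number"},
    ]
}
# Stand-in for the controlled vocabulary a foreign key would point to.
KNOWN_PARTNERS = {"Sri Lanka (Ceylon)", "France"}
CASTERS = {"string": str, "integer": int, "number": float}


def validate(rows, schema, known_partners):
    """Return a list of (row_number, field, value, message) errors."""
    errors = []
    for row_number, row in enumerate(rows, start=1):
        for field in schema["fields"]:
            name = field["name"]
            value = row.get(name, "")
            required = field.get("constraints", {}).get("required", False)
            if value == "":
                if required:
                    errors.append((row_number, name, value, "required"))
                continue
            try:
                CASTERS[field["type"]](value)
            except ValueError:
                errors.append((row_number, name, value, "type"))
                continue
            if name == "partner" and value not in known_partners:
                errors.append((row_number, name, value, "unknown partner"))
    return errors


def group_errors(errors):
    """Group errors by (field, invalid value) so one edit fixes many rows."""
    groups = defaultdict(list)
    for row_number, name, value, _message in errors:
        groups[(name, value)].append(row_number)
    return dict(groups)


def apply_fix(rows, field, bad_value, good_value):
    """Replace one invalid value everywhere it occurs in a column."""
    for row in rows:
        if row.get(field) == bad_value:
            row[field] = good_value


rows = [
    {"partner": "Ile de Ceylan", "year": "1850", "flow": "1200.5"},
    {"partner": "Ile de Ceylan", "year": "1851", "flow": "980"},
    {"partner": "France", "year": "185O", "flow": "10"},  # letter O, not zero
]
groups = group_errors(validate(rows, SCHEMA, KNOWN_PARTNERS))
apply_fix(rows, "partner", "Ile de Ceylan", "Sri Lanka (Ceylon)")
remaining = validate(rows, SCHEMA, KNOWN_PARTNERS)
print(groups)
print(remaining)
```

Grouping by invalid value is what lets a single decision by the historian correct every affected row at once, while the decision itself stays with her rather than with the engineer.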
And we tried to develop something that makes this process, which is long and tedious, as easy and as fast as we could for Beatrice to go through. At the very end, the tool commits the data into a Git repository and pushes it to the GitHub repository. All of that is done in a serverless web application, which means I didn't have to introduce the Git command line to Beatrice either. The tool does that for her. So this is what it looks like. It's a React web application. Here we have the schema validation summary, where we see the different fields for which we have errors and the kind of error we have. And at the end, we have the error overview, which says how many rows have an issue. For instance here, in the source column, we have two different invalid values that impact 169 rows. The idea here is to correct a whole group of rows, whether it has 1 or 169 of them, with only one edit. So once we have all those errors, the workflow with this tool is to go through the error groups one by one. The web application generates a form with inputs to help Beatrice decide. In this example, we have a value for a partner. A partner is a trade partner, so a geopolitical entity. Here it's in French: it's Île de Ceylan. We use English-based vocabularies to translate the partner, so we need to decide what Île de Ceylan is in our vocabulary, in the rest of the data. And this is where we have a search input where Beatrice can search for Ceylon, which in our vocabulary is called Sri Lanka, with Ceylon in parentheses. Once she has chosen that, the tool will correct this column and put the data in the right place to make sure we translate Île de Ceylan to Sri Lanka (Ceylon). At the end, once she has gone through the whole process, we have a summary here listing all the corrections she made.
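The idea of fixing a whole group of rows with one edit can be sketched as follows. The actual tool is a React application written in JavaScript; this Python version only illustrates the grouping logic, and the `group_errors` and `apply_fix` helpers and the sample values are hypothetical.

```python
from collections import defaultdict


def group_errors(errors):
    """Group (row_index, field, value) errors by (field, value)."""
    groups = defaultdict(list)
    for row_index, field, value in errors:
        groups[(field, value)].append(row_index)
    return dict(groups)


def apply_fix(rows, groups, field, bad_value, corrected):
    """Apply one correction to every row of one error group.

    Returns the number of rows that were changed.
    """
    row_indexes = groups.get((field, bad_value), [])
    for row_index in row_indexes:
        rows[row_index][field] = corrected
    return len(row_indexes)


# Made-up rows: the same untranslated partner name appears twice.
rows = [
    {"partner": "Île de Ceylan"},
    {"partner": "Île de Ceylan"},
    {"partner": "France"},
]
errors = [(0, "partner", "Île de Ceylan"), (1, "partner", "Île de Ceylan")]

groups = group_errors(errors)
changed = apply_fix(rows, groups, "partner", "Île de Ceylan", "Sri Lanka (Ceylon)")
print(changed, rows)
```

Grouping by the pair of field and invalid value is what lets a single decision by the historian propagate to every affected row.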
So in the first line here, for instance, a year was misspelled. All that kind of thing; we changed the source name, and everything is fixed. Once all the errors have been corrected through the form I just showed you, she can move on to the last step, which is to publish this new data, which we know is valid because we can check it against the data package, into the GitHub repository. And this is basically how the React web application prepares the data (I could go into the details of what that means later) and makes it possible for Beatrice to take the right decisions to adapt the raw data to the data package we work with. So I have a little bit more time. So this is the analysis. Maybe I can try to demo the tool live a little bit. Okay. So the very important thing is that it's a serverless web application. Here it's on localhost on my laptop, but actually it's hosted on GitHub directly. A lot of this work was done at the Sciences Po médialab, my previous employer. So congrats to them, because they contributed a lot to that work too. So this is the tool. It's hosted on GitHub.io because it doesn't need any server. The whole login process with GitHub goes through a personal token, which is a specific key that you have to generate once in your GitHub account. Then you use that as a login mechanism. So this is what it looks like. Once I am logged in, I can fetch the data from GitHub to make sure I have the latest version of the data before adding new things. Then here I can prepare this little file, which should normally have some errors. The first thing you see is this green message at the bottom, which says that this CSV file is valid structure-wise, meaning the headers of the columns are good, which is a good first step. And then these are all the errors I have in my file.
This is a nice step because you want an overview of what kind of mess you are getting into before starting the process, because if there are too many errors, maybe you want to do that later or ask for help. So once you've seen that, you can start. So this is basically everything we have to do. This is the first one. I can move to the next error or go back, even if I haven't corrected it yet. And here it says, okay, the value, I don't know, whatever, this character is not a valid unit because the unit should be an integer. Yeah, it's true. So it's better with a one. And I can confirm this fix. And now we're good. Unit is one. Now I move to the next one. You see here I am at two out of nine. The whole process is designed to be as smooth as we can make it. Here the year is written out in French, it's 1938, but we want that to be an integer again. I don't want the letter version of the year. So we write it as a number. As soon as I confirm the fix, I move to the next page, so that the process is as seamless as possible. So here I have a source. This is the foreign key. In the data table, the source is a key that refers to the source table. And it says here, basically, foreign key source violation. It means that this source doesn't exist in our system. So here I have two choices. Or I was supposed to, okay, normally I should, oh yeah, sorry. So normally, okay, whatever. Either I can search for it and find it, in which case the edit will only replace the key. Or I can create a new item. And here you can see that I'm creating a new source, because sometimes the source doesn't exist yet. You see this form is much, much longer, because here I'm creating a new line in the source table. I will not do that because it's too long. I will just give me something, please.
And that will make it, okay. And so on and so forth. Again, we have an issue with the, sorry. This example is a little bit messed up. Okay. Here it's Trinidad and Tobago. It's a geographical line time. I don't know, because it's an A. Trinidad and Tobago, not A. Okay. And we're good. Australia with a lot of E's at the end is not correct. It's Australia. Yep. Sorry, it's very long. Yeah, whatever. A dollar. And let's say it's a scrap. Don't do that, right? But, okay. Okay. So you see, the important point is that we are based on the data package, so we are using the foreign keys specification and so on. But we had to add specific forms for our case. So the application here is not generic. You can't just put in a new data package of your own with your data; it will not work, because for UX and UI reasons we had to build specific cases like that, where the forms are not exactly as the data package describes them. It was too complex to make it fully generic. But with more work, that could maybe be achieved at some point. And the talk from Evgeny this afternoon will cover a little bit of that kind of stuff. So here we are. I'll stop the demo here. Just to finish, why do we do all of this? Because we want to analyze trade in the long run. We have lots of trade values, as you can see. A lot of trading entities, far too many. And then this is a visual documentation where we depict the different kinds of sources we use in the database. And at the end, we try to do something like this, where we have the trade of the world in 1890. Each node here, each circle, is a geopolitical entity, and the links between them are the trade of that year. It's total trade. We could choose import or export here; I just summed them up. The important part is that the color is based on the type of geopolitical entity we have.
So in this orange or kind of yellow thing, it's what we call the GPH entity. It's entities that geopolitical entity we know, mainly countries but not always. In green, those are colonial areas. So it's not a colony. It's not a country which is a colony. It's like French Africa. It's like we know it's colony. We don't know which one exactly. Like here, for instance, we have European Russia, which is a city part of. It's from Russia, but it's the European part of it. And this is what we find in the archive books. So we can't really decide what that means exactly. And we're trying to analyze this kind of, so we have this very, this gap between very heterogeneous data, very difficult to interpret, but still try to do a quantification and analysis like this network on top of this very complex and rich data set. I think I'm good with what I wanted to share with you today. We can move to question if you have. Yes. There was a slide. The slide had a title which was development of a specific web application to integrate new data. That one. Yes. You might not have time to tell me, but please tell me if you have time at some point. What was the conversation with your historian? How did this happen? What did you actually do to plan? Yeah. So the question is like how Beatrice and I ended up deciding to do that. So basically, the whole point is very like the collaboration, because we worked without that for a very long time. And the process was we had to meet in the same room. I was doing the script, checking the data, editing the data, because editing the data in a spreadsheet that doesn't mess up your numbers and everything is not easy, actually. And we were working together on that. It was necessary and actually very nice to do, because we had to exchange. So she was explaining me why she was taking this decision or not, and she was taking this, and I was just putting data. 
But at some point, we ended up with so much more to add that this process couldn't scale, basically. So we had to find something else so that she could do this process on her own. And I would intervene once the data is in the GitHub repository, checking it myself with quantification and scripts and everything again, because you always need to check everything many times. And that makes the whole process much smoother. Yes? I have a question about this slide, too. Would there be any benefit to you committing the data set into GitHub at the very top of this process, after you upload it, so that you can compare? Yeah. So the question is, would it be beneficial or possible to commit the data before checking it and put it into GitHub? So yes and no. The reason why we don't do that is, first, because I need Beatrice to take the decisions on documenting the raw data to make it compatible with all the nice visualizations I showed you. She needs to take those decisions. She needs to do it. So that's why we put the data into GitHub after she has done this work of data curation. We could host the data as a raw file first and do that later on. Still, we need a web interface that lets Beatrice, the historian, take the decisions. So no. Any more? Yeah? Yes. So this tool I'm using here is actually brand new. It's Gephi, but on the web. We are working on this with my company, OuestWare, and we are very close to releasing it. It's basically the same thing as Gephi, but a lighter version and a web version. It's already there, but you shouldn't go because it's not live yet. |
CorTexT Manager, a growing online platform in open research for social sciences |
So, thank you Paul, welcome everybody. So we will today present to you the Cortex platform, which is an open platform for research in social sciences. And it's a collective presentation where all the team is here, but we have three of us will be presenting today out of the team that you see on the picture. So what is the Cortex? It's an online platform built on top of an architecture and its main goal is to help the social scientists with a research question to actually find the benefit of a computational method to fit a specific question. So it has been founded in 2009 and it's driven by social science research and is supported by several French national institutes at the beginning, in particular INRA, and then later on some funding, some research funding and some European project called RIS, European project of infrastructure. We are now at our second version of this online platform called Cortex Manager. So this is the part that you can actually go online. It has been designed from the very beginning, as I said, for the social scientists. So we took the point of departure was that the social scientist doesn't have any IT capabilities, doesn't have any, on average, doesn't have any resource on the computer, especially 15 years ago. So one of the main benefits of the platform would have to be empowering him with the power of computing and of methods. It's a collaborative platform, so a lot of projects can be shared between scientists. And there is, as a web interface connected to a workspace online, you don't have to, of course, have it on your computer, which was, again, quite new at the time. And it's very important to note that it's very permissive. So it's a bunch of methods that you can apply on all kinds of data that we will detail later. And this was one of the main foundation of our platform in the beginning, that all the methods are runnable on all the viable that you have at your disposal. So how is it going from ten, ten, ten, twelve years after that? 
So we have over 250 peer-reviewed publications that we have counted so far, and of them, 75 documents in 2022. So it's growing, and it's growing quite fast the last few years. 50,000 analyses, so 50,000 jobs, were run on the platform by scientists last year, by over 1,000 active users, in 400 institutions and 50 countries. So it's pretty worldwide. The team we have now, in 2022-23, consists of eight technical people, mainly engineers, two researchers, one trainee, and two close collaborators who are in companies or independent. Very important: it is located at University Gustave Eiffel, in greater Paris, near Marne-la-Vallée. So you have Disneyland not far away. And this infrastructure is built as a classic web-oriented application and infrastructure. So obviously we have the web interface that you can go to as a regular user, which wraps a lot of web services: the main interface, but also an API that you can call from outside. This allows us to open all our back-end services to external applications, and specifically to external projects that we are part of. We are built on an infrastructure of servers that are hosted at Marne-la-Vallée, so in our university. So everything is local. We don't have anything hosted online. It is 300 CPUs, 3.5 terabytes of RAM, and 40 terabytes of storage, which is big and not big at the same time nowadays, but it's enough for our needs. We have two main services that we provide. First is the storage of big databases that we collect and curate, for example the European patents. We have several other databases that we make available inside some projects, so we can help specific projects with the data we have stored in the infrastructure. And the second part is the processing that we offer, so the methods, the scientific methods that I will talk about later, which are run inside projects on the platform directly by the researchers.
So the researchers launch those methods directly from the web interface, with their parameters and their data. And of course we monitor the whole infrastructure so that it stays online. So once you have an infrastructure, it has to do something for scientists, and that's the goal of the platform. So how does it do that, and what does it do for science? This is the right part of the previous diagram. We have a bunch of heterogeneous methods: scientometrics and bibliometrics, so basically the study of scientific publications and tracks; natural language processing, so term extraction, named-entity recognition and things like that; social network analysis, so the study of all the interactions inside publications and texts between actors, keywords, et cetera; stochastic block models, which a colleague somewhere in the room will talk to you about later; knowledge dynamics, so the study of knowledge through time; and spatial analysis, which is a big part of our infrastructure now, because we have geocoding, geolocating inside publications, and a geo-mapping framework. Okay, next I will hand over the mic to talk to you about how to cite Cortex, because this is one of the big, I will try to note. Then, as Cortex is research software used mainly by researchers, it is supposed to be cited. Even though there is not yet a strong culture of citing software properly in academic works, it is expected to be done.
Accordingly, in 2022 we documented how Cortex must be cited in academic works. Here we have just an example: if I am writing a paper and I cite Cortex, this is how Cortex will appear at the end of the paper, in the reference section. With this, we can give credit to the developers, like, for example, Philippe, Ale, and many others, who have been contributing to this software for a long time and must be recognized academically for this work. This is important, so that's what we did. And nowadays we are lucky, because a few years ago we were missing the infrastructure to define, document, and cite software, and now we have, for example, CITATION.cff, the Citation File Format, which has been widely adopted, and I think it will be the standard way to do it. On top of it, we have many tools. Here is one example, the cffconvert tool, with which we can convert the CFF file, for example, to BibTeX, or to APA format, and many other formats. So we can keep the metadata about the software that matters for citation in one single file, and from this file we can derive many others, and it's a really easy way to do it. So in Cortex, that's a solved problem, but we still have an issue with how to identify this object permanently. I say object as in digital object; okay, it's software, but it's a digital object from the point of view of science. How can we identify it permanently? For papers, for example, we have the DOI, which is very well known and works very well for papers, for PDFs. But for software we have a problem: we may need to cite software at many different levels of granularity. We can cite just the software name, or a specific version of the software, or a specific line of a specific file inside the source code of the software.
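As an illustration of keeping citation metadata in one place and deriving other formats from it, the sketch below turns a small dictionary using Citation File Format key names (`title`, `authors`, `family-names`, `given-names`, `version`, `date-released`, `url`) into a BibTeX entry. This is not how cffconvert is implemented, the `cff_to_bibtex` helper is hypothetical, and the metadata values are placeholders rather than the real Cortex citation.

```python
def cff_to_bibtex(meta, key):
    """Render CFF-style metadata as a BibTeX @misc entry (illustrative only)."""
    authors = " and ".join(
        f"{author['family-names']}, {author['given-names']}"
        for author in meta["authors"]
    )
    fields = {
        "author": authors,
        "title": meta["title"],
        "year": str(meta["date-released"])[:4],
        "version": meta.get("version", ""),
        "url": meta.get("url", ""),
    }
    # Skip empty fields so optional metadata does not produce blank entries.
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items() if value)
    return f"@misc{{{key},\n{body}\n}}"


# Placeholder metadata, standing in for the content of a CITATION.cff file.
metadata = {
    "title": "Example Research Software",
    "authors": [
        {"family-names": "Doe", "given-names": "Jane"},
        {"family-names": "Roe", "given-names": "Richard"},
    ],
    "version": "1.0.0",
    "date-released": "2022-06-01",
    "url": "https://example.org/software",
}

print(cff_to_bibtex(metadata, "example2022"))
```

In practice the dictionary would be loaded from the YAML file, and a dedicated converter would handle the many optional fields this sketch ignores.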
Again, this is still an open question. We have proposals from the community to solve it, but there is no single kind of permanent ID that covers all the levels of granularity. For example, we have the proposal from the Software Heritage project, the SWHID, the Software Heritage ID, which we can use to cite software anywhere from a very specific code fragment up to a snapshot of the full source code. But we can't use the Software Heritage ID to cite software at a high level, like a software version or only the project name. For that, we should use another kind of ID. It's an open question that we should work on this year or next year, and that's it. Now it's the time for you, Juan. You? I don't know. So one question we've been asking ourselves is how to open the platform fully. What have we done toward that until now, and what should we do in the future? There is no web platform without free and open source software today, and we are already using a lot of it; I just put some of it on this page. You can see here that at every layer of the infrastructure, from the virtual machines, the containers, the scripts, the operating systems, et cetera, up to the actual methods and interface that you present to the user, we are using free open source software, or at least open source software. I don't want to start a debate here, but that's the general idea. Now, it is not Cortex producing that. Cortex wraps around this open source software, and it's open access for now. So it's free and open for everybody to use as an interface, but the code is not entirely free. We've been trying to do a little bit more of that over the last few years, and hopefully even more in the next few years. And Ale will talk to you about how to do that, and what the challenges are in actually producing good open source software and methods. Hello, everybody, I'm Ale. I work with these guys at Cortex.
So this is not really a talk about Cortex, I'm not going to do a demo or something. Because we thought of this talk as a moment to discuss what's going on with us, what is going on in the platform, because up to now, like Philippe just explained, everybody can come and use, but most of the software in Cortex has been developed without, well, with open source in mind, but not in practice, for practical reasons. So are we missing one slide? No, yeah, so everything from the methods, the interfaces, the back ends, everything has, so even our documentation is on Wordpress or Q&A also. So everything is based on open source, but up to now, we didn't have real open source practices. So how are we managing this transition from a project that is very aligned with open source community, but that doesn't manage to do open source, to start doing open source? So there was really a big part was just getting even more open source into the project. So we're starting to use project management tools that we were using, like some of them, but not as much, not as systematically. So this is a real common problem with research software, where researchers, they don't always have the reflex to go into GitLab or GitHub or whatever and use issues and use dashboards and project management, sometimes not even a thing in their head. And that's kind of the background that we were facing, like years and years of research and developing stuff, using open source software, but using version control because, well, it was still the infrastructure guys really needed that to work with, but it was really like the bare minimum. So how do you start to get people more involved? So we just, the first step was really pushing everybody to start using the usual tools, and sometimes first moving some projects and moving the discussion into issues on GitLab and all that. 
And then, okay, so, and then, and also adopting for our discussions more open source software, like Matrix, so it's like, I don't know if anybody knows Matrix here, maybe today everybody already knows Matrix, it wasn't really a given last year. And then, once we had like adopted the general workflow, the projects were still, most of the projects were still private, but they had the workflow of an open source project, so it helped get people into a better mindset and the people who would arrive and start working with us would like more easily recognize the pattern of working with open source. Then because we had this system that runs the jobs and everything, and usually in the old scripts, they would just use the structure that was in the system directly, and it was pretty hard to adapt and evolve them. So then we developed some kind of intermediate libraries so that it gets easier to, so making it easier to develop new methods in the platform. So we had to develop this intermediate, I won't have time to go into show the details, but just getting the idea that we have to get people into the workflow, develop libraries and interfaces and intermediate layers that help and automate some of the processes, give people more freedom to work, actually we also started integrating containers in the job manager and everything so that we could end with this intermediate layer so that people could have more freedom to make it easier for them to keep doing their non-standard, non-advisable, non-best practices stuff, but that works for them because their researchers are not coders, they're not always coders. And finally, and even for the people in the team that are more skilled programmers, of course, they want to have the possibility to do all this stuff. 
And finally, one thing is very important is at the management level, because this is like a kind of already intermediate kind of big-ish thing where we have hundreds of users everywhere, so we had to discuss this in the strategic planning so that our institutional partners could get it. So basically, just get it everywhere as soon as possible and then start, once you can show some of the benefits, also work at the institutional level. So these are some of the challenges. Most part of the team didn't have the know-how, part of the researchers we work with. People have limited resources, they have priorities, and I'm just reading out of there, but that's it. It's not very surprising. There's this learning curve of open source and everything. And there's always doubts about where is this going to lead us. There's licensing doubts, so there's part of, like, next thing I'm going to quickly show is that we're starting to open up stuff, and there's lots of stuff that hasn't been open yet simply because somewhere, we know that somewhere in some file, somebody used an extract from a software library that is not clearly documented, where it has been used, and we need to find it and check if the license is compatible. So we're always like, okay, where are we going? But getting things to as close as possible on every dimension is what's keeping us moving. It's keeping the feeling that something is moving, even if we're not seeing everything. But then we, well, through this effort, eventually, we managed to open source already a good part of specifically the science part, so the infrastructure, the dashboard, the interfaces, the front-end of the web application, and the job-managing part, or this part is not necessarily, it's taking longer because it's also another part of the infrastructure. But the main, most important, short-term part is to let the scientific part of it be available. 
So here's just some examples, I'm not going to, but yeah, so there's two parsers, an importer for a big database of scientific publications, a kind of generic parser that we use for, we're in a sociology of science lab, so even though the tools are more like general humanities and social sciences, we have a lot of interesting things that come from PubMed, data that comes from PubMed, and similar databases, so this is like also a parser for that, and here's one example of a project that's in progress because even though we reworked the project, parts of it still use some libraries that we're not sure, so we have to check, take the time to check if we need to change anything or replace. Just quickly show what some one of these things look like, well, this is just the repository for one of the projects, you have the source code, some documentation, it makes pretty graphs, and you have to use it like this, like this. So just getting things into what we all know about, what you know as an open source format, how do I go back to the presentation, F, no, no, there you go. What else there's to say, that's it, so thank you, I think, I mean, if you, if you, if you have any of you work in a similar institution where open source is not an evidence but it's something that you have to struggle with, we're very happy to exchange after, during or after the conference, did you skip the, no, no, you have no more time, thank you. Thank you. Any questions? Yeah, thank you a lot, I like your time because there is a lot of nice to choose, and you just wanted to share with us, and I'm very happy to hear that you are bringing up the gauging. Yeah, so the question was, what was preventing us to opening it from the beginning, and to make open source software from the beginning. 
So as Ale was saying, the first thing I think is just the lack of know-how, especially 15 years ago when we started, the second thing I would say is that we, we basically didn't have a precise idea of what we were building, we just went basically script by script project by project, and this, this is how we built slowly the, the platform, then it became a platform, a proper platform, I mean, with the, with the resources and everything, and then, then it was already used, and, and we trained people on those methods already, so the problem of opening, of doing open source software is you have to think about it from the beginning, or at least refactor enough the, the code, so it's big, it's, it has no problem of, of license, and stuff like that, so this is, this is just a, you know, an ongoing streak. Yeah, speak loud, sorry, I can't, I can't hear you. Yeah, it's about, yeah, some, some people upload software to Zenodo or upload a documentation or, that's, that creates one kind of identifier, which you could also make a, some people just make a publication related to the software, so people can cite a DOI, Zenodo is going to give a DOI, so it's the same, pretty much the same thing. The problem is that, I think like Jenner was explaining is that sometimes you want to cite a specific version of the software or a specific file in a specific version, or you want to cite the software without mentioning a specific version, and, and you don't want to upload one version to Zenodo for every possible thing. Yeah, yeah, yeah, I, I don't know exactly deep what Zenodo offers, but I know that Zenodo offers a way to manage many different kind of digital objects, data, images, software, etc., and not only Zenodo. 
In our case, what is missing is to study all the options and see which fits best. It's more about understanding, because in fact, inside the software engineering community, exactly how to do this is still an open question. And when we search on this topic, it's quite difficult to find tutorials or someone teaching you what kind of permanent identifier you should use. The Software Heritage ID, for example, is a type of persistent identifier called an intrinsic persistent identifier. That means it doesn't depend on a registry service where you need to request a new ID in order to use it. It's like a hash: the hash ID of a commit is this kind of intrinsic persistent identifier, which we can generate from the source code itself. Another option is to use Zenodo, or maybe something else, to generate a DOI. For that, we ask the registry, the DOI server, to generate a DOI for us. So we have many options that we don't know in depth yet, and maybe Zenodo could be really interesting. I would be happy to learn from your experience, because there are a lot of things to learn. Sorry. Okay, thanks, we have to stop here. |
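A short sketch of what "intrinsic" means in the answer above: the identifier is computed from the bytes themselves, with no registry involved. The code below computes a Git-style blob hash and prefixes it in the form used for Software Heritage content identifiers (`swh:1:cnt:`). The function names are hypothetical, and identifiers for directories, revisions or snapshots are computed differently and are not covered here.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way Git hashes a blob object."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def content_swhid(content: bytes) -> str:
    """Build a content-level identifier string from the bytes alone."""
    return "swh:1:cnt:" + git_blob_sha1(content)


source = b"print('hello')\n"
# The same bytes always give the same identifier, on any machine.
print(content_swhid(source))
print(content_swhid(source) == content_swhid(bytes(source)))
```

A DOI, by contrast, has to be requested from a registry, which is the extrinsic case described in the same answer.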
Interactive network visualizations as "guided close reading" devices for the social sciences
Development of the twitter-explorer |
Hi everyone, it's a big pleasure to be here. My name is Armin Pournaki, I'm a PhD candidate in applied mathematics, and I work on building tools for discourse analysis based on methods from graph theory, network science and natural language processing. Today I want to present a tool called the Twitter Explorer, which is already a bit older, but which was built at the Max Planck Institute for Mathematics in the Sciences, in my previous group. The idea was to build a tool that allows researchers who don't necessarily have programming skills to collect Twitter data, visualize it using graphs, explore the data and maybe generate hypotheses in their pipelines. This kind of tool building and research happens in the field called computational social science. So when I was preparing my slides two days ago, I thought it would be good to give a little overview of computational social science, then say why we built the Twitter Explorer and where we saw the need for a new tool, then of course introduce the features of the tool and, because it's kind of a talk on programming, the architecture, and maybe give some insights on the usage. But when I sat down to make the slides two days ago, I was confronted with this. And of course, since the tool is essentially an entry point into the free API, and there is also a part of it that uses the research API, this led us directly to the question: what happens to the research API? It's also not entirely clear, right? So instead of giving the talk the way I was planning to, well, I will still give it, but I wanted to ask a few questions first that we might then take up in the discussion. I think there is even something planned later, some kind of panel discussion. So I'm just going to throw some questions out there that I think are really pressing now, especially in the research field.
How serious is this? By this I don't mean the implications, because I know a few people whose thesis is now in jeopardy because they can't collect data anymore. I mean how serious is it in the sense of: will it actually happen, or is it some scare tactic? I think this is hard to predict. Then there are questions that I think we can also discuss here: is there a way for us as users, and not necessarily only as researchers, to claim our data, the digital traces that we leave on these platforms, and how can things like the Digital Services Act play a role in this? The last question is very broad: how do we move on, in the sense of how can we see this as some kind of wake-up call, and how can we use this new development on the one hand to move to different platforms, but on the other hand also to think about how we do computational social science in the future? With these questions, which we're going to discuss later, I'm still going to give my original talk. In computational social science, a typical pipeline for a project is: you have a research question, then you collect data related to it, in this case maybe data from online social platforms, then you analyze it, and ideally you generate some more insights on the research question you had in the beginning. Sometimes the exploration and analysis of the data can also help you refine the questions you had in the beginning, so it's a kind of loop. The tool that I'm going to present, the Twitter Explorer, is made precisely for this second part, for facilitating both the collection and the exploration of such data. The pipeline is that we start with text, in our case tweets, that is annotated with some kind of metadata. On Twitter we have different types of interactions: you can mention someone, you can reply to someone, or retweet. We choose one type of metadata
and cast it into an interaction network. Then we want to find, for instance, the most significant clusters or correlations in this data by using 2D spatializations. Typically these are done using force layouts, but today in the graph room, for instance, there were also some talks about new methods of node embedding, so this is also something we can discuss in the question section. One reason why I think force layouts are good, especially if you work with social science researchers who don't necessarily know a lot about the latest machine learning algorithms, is that they are quite straightforward to explain: you have a spring system, and nodes that are strongly connected tend to attract each other. Especially if you look at interaction networks on Twitter, since retweeting can be considered endorsement, clusters in such 2D spatializations can correspond to something like opinion clusters, and there's a lot of research being done in that direction. But one question that we always had when looking at these networks is: how do we actually go back to the data that generated them? This is something that we try to tackle by building these tools. So why did we build it? Firstly, to provide an interface for researchers without programming skills to collect and visualize the data, because we were working a lot with social scientists who did not have these programming skills but had a lot of hypotheses about the data that they could not test. Then, of course, to facilitate the exploration of controversial issues on social media. And, this is the point I was making before, to add a layer of interpretability to these 2D spatializations by providing access, from within the interface, to the actual data that created these node positions. Finally, we see it in the context of a larger scientific scope of using the network paradigm as something like a sampling mechanism
for the data. If you're confronted with a large number of tweets, of course everyone knows that you can't read all of them manually, so you need some way to get to the tweets that are relevant for you to read, and this is essentially what we use the network for. When we look at retweet networks, we can immediately identify, for instance, the most influential actors in the debate and then read precisely those tweets that they made, which may have influenced other actors. We call this guided close reading: if you do only close reading, you have to read all the text; if you do distant reading, you look only at the network on a structural level; and this is something in between. So what can the tool do? It collects tweets. I think we have about one week left for the v2 and the v1 API; so far the v2 academic API is safe, but we don't really know. You can search for tweets from the past seven days using the API. In the second part, the visualizer, you can display a simple time series of the tweets to see whether there's some special activity on a given day. You can build these interaction networks and build co-hashtag networks, so we divide them into two types of networks, which we call semantic networks and interaction networks. Then you can compute the typical measures people compute on networks, and especially compute clusters using modularity-based algorithms. All this happens in an interactive interface using JavaScript and D3.js, and this is essentially the part where it gets interesting, because everything else you can do with a lot of other tools, especially Gephi, where I think you can even collect tweets with some plugins. So all of that is not new. I think this is the time for a quick demo. I don't know how much... okay, I have plenty of time, I think I talk too fast. Okay, so I have prepared some Python environments that already have
the Twitter Explorer installed, but usually you would do it like this. Then all you need to do to fire up this interactive interface is type Twitter Explorer collector, and this will open a browser window in which you can choose your API access, choose the path to which the tweets will be downloaded, and insert your search query, maybe adding some advanced settings and saving options. Now a question to the audience: what should we search for? That one is easy, and, you're looking into the future here, I already have this network prepared for the last slide, sorry. We could, but what would we look for then? API? Is there maybe a hashtag like API shutdown? Maybe we need to go to Twitter itself. API, something like this. Ideally we would find some kind of hashtag. No... okay, let's just use this as a search query. Okay, now it's collecting in the background. Then we can open another browser window here and fire up the visualizer. Now we see that, while this is still collecting, we can already access... oh, there were only 400 tweets. So we can look at a time series of tweets, and then we can choose different types of networks to create. We can filter them by language if we want; this is the language that the Twitter API returns, so there's no language detection going on here. We can apply some network reduction methods, like taking only the largest connected component of the graph. Then we have this option here to remove the metadata of nodes that are not what we call public figures; if you want to publish explorable networks, it is advisable to do so. As far as I know there is no very distinctive or clear rule about the point after which one is considered a public figure, but within our consortium we decided that it's 5,000 followers. This is also something we could discuss, but since Twitter is public by default, in a way anything you post can potentially be used in displays somewhere. Then you can export the graph
to all sorts of formats. Then you can aggregate nodes. This means, for instance, removing them based on how many retweets they received or how many retweets they made themselves, and removing, for instance, nodes that only retweeted one person. Is there a chalk somewhere? So if you have a graph and there are some nodes that only retweet this one person, I don't know if everyone can see that, they tend to clutter the force-directed layout, and structurally they do not necessarily add anything to the network. So if you have very, very large graphs, it makes sense to remove these and fold them into a super node. Then you can do traditional community detection, and the result will be saved as an HTML file that you can then open. We see here, again, that in a retweet network every node is a user and a link is drawn from A to B if A retweets B. Now we can look at this user, t-chambers, and look at the actual tweets that made them end up in this part of the visualization. Okay, the data we collected is kind of sparse, so this network doesn't look that interesting, but I have prepared a fallback option. What we did in a case study a few months ago was to look at the repercussions of some discussions in the US about red flag laws. Red flag laws are specific kinds of gun control laws that allow state-level judges to temporarily confiscate guns from people who are deemed to be a threat to themselves or to the public. These laws had very big repercussions, especially on social media and especially in the conservative camp. This is a typical example where people analyze on Twitter whether there is something like echo chambers, or whether people only retweet others from similar camps, and then people draw conclusions very fast. What we want to do with this tool is to show that maybe things are not as simple as they seem. So I have prepared these networks, but I think I will make
it a bit smaller. This is now a bit bigger than what we had before: we have roughly 25,000 nodes and 90,000 links. This is already one limitation of the tool that I would also like to discuss at the end: you can't display immensely large graphs, so approximately 100,000 links is kind of the limit, and I think this is where integrating it with other tools such as Sigma or Gephi might actually make a lot of sense. Now I can color the nodes by Louvain community, we can also turn off the light, and now we can wonder what these two communities are. Right now the node size is proportional to the in-degree, meaning how often a given node was retweeted, so the large nodes may be considered something like the opinion leaders of the given camps. If we go here, we see for instance on this side Donald Trump Jr., and we can then look exactly at the tweets that led the visualization to put him where he is. Okay, we don't need to go into the details of what he said, but you see the point. We can also change the node size to the number of followers, and then we get an immediate view of who the main actors are that are also influential on Twitter in general, so we have the New York Times here and the Wall Street Journal. So we can see that we have something like a more liberal versus a more conservative camp. If we look only at the retweet behavior, we might think that these are separated echo chambers and people do not talk to each other. But what is interesting is to look at other types of networks in this example. So we can look at the replies, I think I will make it a bit smaller, and all of a sudden we don't see the strongly segregated clustering that we saw here anymore; maybe it's easier if I put it in. We see something more like a hairball layout, and when we look at the nodes we see that indeed the path going, for instance, from Donald Trump to Hillary Clinton or the New York Times, between those
people who were very far apart in the retweet network is maybe not that long in the reply network, meaning that these opposing camps maybe do actually talk to each other. It might be more interesting to see how they talk to each other and what they say, and this is something you can do when you use this interface and look at the tweets and at the actual replies. So it allows you to go to the parts of the platform that generate this data and that then generate these networks. Finally, as a small example of the semantic networks, we can look at the hashtags that are used, and you see that, for instance, there is one kind of hateful conservative hashtag cluster. Okay, maybe I should have said that in the hashtag networks every node is a hashtag, and two hashtags are connected if they appear together in the same tweet. So this is a very low-level way of seeing what is going on in the data: you don't need to do topic modeling or complicated techniques; just by looking at the hashtags you can literally already get a hint of how the different camps speak about the same topic. If you go here, in this area, this is about gun confiscation laws. Marxism in this case is also a good example: right now we don't really know how it is used, it can be used either by conservatives or by liberals, and it's important to look at it in the context of the data. So then we would have to... okay, five minutes left, good, I will go back to the slides. Okay, so under the hood the whole backend of the collector and visualizer is written in Python, and it uses the Streamlit Python library to serve it on a local front end. This is actually a very convenient library, and I guess a lot of people know it: you write your code in Python and it essentially serves it in interfaces that look like this. The explorer is written in HTML and JavaScript, and it uses D3
and draws the graph on canvas, which is also why it's probably not as fast as Sigma is, but it has some other nice features that are especially due to this force graph library. If anyone has questions I'm going to go into the details in the questions. So this is how we install it; it's fairly simple if you have a running Python newer than 3.7. There's also an API: especially here, people will probably not be so interested in using the Streamlit interface, but you may want to include it in some existing code pipeline that you have, and this is essentially the API for semantic networks and interaction networks. So I invite you to try it out yourself while you still can; you have five days. Of course, if you have the research API you might be able to use it for a bit longer, but otherwise go to these websites fast. I will end the talk with some questions. Actually, I came here with more questions than answers, and I'm really hoping for a lively discussion now. I'm not originally a developer, so I kind of wrote this on my own, and I wonder if this integration of Python and JavaScript is actually a good idea, because in theory it would probably also be possible to do everything in JavaScript on the client side, so you wouldn't have to install all these libraries. Then, okay, maybe one thing that I would like to show is that I experimented with temporal networks. Of course, doing temporal force layouts is a non-trivial task, but we can look a little bit at the temporality of these networks by at least displaying only the links that are active during a given day. So this is also kind of nice, I think, but I would like to discuss other visualization paradigms for this kind of network. Then one thing that would be really interesting, I think, is to dig deeper into a visualization paradigm for the hierarchical structure of communities, meaning that in theory I can either run
stochastic block models or Louvain community detection and stop them at a certain level and then have some kind of hierarchical node structure, but how to visualize that is another question. I think it would be very interesting, especially for very large graphs. Another question is force layouts: should we still use them now that everyone is doing node2vec and all these other things? I think yes, but maybe there are good arguments against it. On a deeper conceptual level, and this first one is a question for people who already have much more experience in building tools for the social sciences: how do you further integrate these kinds of methods into existing, maybe more qualitative, social science pipelines? So yes, it's kind of an open question. And how can we devise something like a research protocol for these kinds of interactive network visualizations? As you saw in my demo, we look at the big nodes, we look at the tweets they made, and it gives us some intuition of what's going on in the debate, but how can we formalize such kinds of visual network analysis? There are people in the audience who actually work on this, so it will be very interesting for me to talk about it. Finally, to end on a bit of a nicer note, there is the network of FOSDEM, as was already said in the beginning, on this website. It is updated every 15 minutes thanks to a data collection done by my colleague Beatrice, thank you very much. If you go to this website you can see the retweet network of FOSDEM, and if you tweet, you can find yourself in the network too. So, what do we have here in the middle? Okay, FOSDEM itself, then there was Ubuntu, Debian somewhere. Okay, time's up, thank you. Yes, so the question is: I mentioned that you can only collect tweets from the last seven days with the free API. This is a limitation, but the tool itself just writes into
an existing CSV; it appends, basically. So if you do the same keyword search multiple times, it will just append to the CSV. Yes, I mean, this is the question right now. It depends, because the question is what happens on Mastodon. I don't know; if you want to look at political controversies and such discussions, I don't know if Mastodon is mature enough yet, or adopted enough yet. I think if you want to look at the FOSDEM community, it's great, so for this, yes. But yeah, this kind of profit for the city, and so, well, actually for me that's more aggressive, to see the difference, the stricter of this sort of this diagram that we are going to use about the social of what we are talking about, let's see the conference of citizens with human people, same in collection. Yeah, I don't know what to think about that. Well, I don't know which point exactly I should address, because you raised a lot of points. Are you okay if I rephrase? So you are also concerned about this kind of research, because it can be used to track users across political camps, right? Yes, okay, I see. So I think it's more about the representativity of Twitter data for the wider population, on which of course you're totally right: it is a subset of highly politicized, maybe also a bit more educated than average people. But this is not what we're trying to do either; we're not trying to infer, I don't know, election results based on Twitter data. So yeah, I don't know if I addressed the point now; maybe we can talk more about it. |
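The retweet network described in this talk (a link from A to B if A retweets B, node size by in-degree, users who only retweeted one account folded into a super node, and metadata kept only for accounts above the 5,000-follower threshold) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the record fields `user` and `retweeted_user`, the `followers` dictionary and all function names are hypothetical and are not the Twitter Explorer's own API or data format.

```python
from collections import Counter, defaultdict

# Hypothetical minimal records; a real export carries many more fields.
tweets = [
    {"user": "alice", "retweeted_user": "bob"},
    {"user": "carol", "retweeted_user": "bob"},
    {"user": "dave", "retweeted_user": "bob"},
    {"user": "bob", "retweeted_user": "erin"},
    {"user": "alice", "retweeted_user": "erin"},
    {"user": "frank", "retweeted_user": None},  # original tweet, no edge
]
# Hypothetical follower counts per account.
followers = {"alice": 120, "bob": 48000, "carol": 30,
             "dave": 75, "erin": 9000, "frank": 10}

# Threshold mentioned in the talk for treating an account as a public figure.
PUBLIC_FIGURE_MIN_FOLLOWERS = 5000


def build_retweet_edges(records):
    """Return weighted directed edges (A, B), meaning A retweeted B."""
    edges = Counter()
    for record in records:
        if record["retweeted_user"] is not None:
            edges[(record["user"], record["retweeted_user"])] += 1
    return edges


def in_degree(edges):
    """Count how many distinct users retweeted each account."""
    degree = Counter()
    for _, target in edges:
        degree[target] += 1
    return degree


def collapse_single_retweeters(edges):
    """Fold users who retweeted exactly one account, and were never
    retweeted themselves, into one super node per retweeted account."""
    targets = defaultdict(set)
    for source, target in edges:
        targets[source].add(target)
    retweeted = {target for _, target in edges}
    collapsed = Counter()
    for (source, target), weight in edges.items():
        if len(targets[source]) == 1 and source not in retweeted:
            collapsed[(f"super:{target}", target)] += weight
        else:
            collapsed[(source, target)] += weight
    return collapsed


def public_label(user, follower_counts, threshold=PUBLIC_FIGURE_MIN_FOLLOWERS):
    """Keep the screen name only for accounts above the threshold."""
    return user if follower_counts.get(user, 0) >= threshold else None


edges = build_retweet_edges(tweets)
degrees = in_degree(edges)
collapsed = collapse_single_retweeters(edges)
```

In this toy data, `carol` and `dave` only retweeted `bob`, so they are merged into a single `super:bob` node, while `alice` stays because she retweeted two accounts.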
Webmapping and massive statistical data, a democratization story |
So, hello everyone. I'm glad to be here. I am Etienne Côme, a researcher at Gustave Eiffel University, and I will talk about a side project that I have had for quite some time, and try to tell the story of this side project, which concerns web mapping and statistical data, and how the OpenStreetMap stack has changed the way we show and disseminate statistical data. Okay, so this is some background that I will skip, and I will go back in time eight years. I was on a project, working on a massive gridded statistical data set called données carroyées in French. These data are derived from tax records and are given at a very detailed spatial scale, pixels of 200 meters by 200 meters over all of the French territory. You have several variables, like population structure, number of people under 10, above 17 and so on, and you also have information about people's income. This data set reaches the limit of statistical confidentiality: some of the pixels contain fewer than 11 households, so the data are aggregated in order to stay above this threshold, which is imposed by law. Okay, so it is the most precise data set that we may have on the French population. The distribution of this data set at the time was like this. The first step was to go to the INSEE website and find this page, which was already quite hard because it was in a very deep part of the website. Then you had to download two files in very specific file formats, DBF and MIF/MID, which come from MapInfo, a GIS software from the 80s, and you had to deal with that. Okay, so not very easy. There was already a way to read that in QGIS, so there was already open source software to read the data, but the exploration was not very easy or friendly. This is the help page for reading the data, loading them in QGIS and so on, but if you were not a GIS specialist it was quite difficult to deal with that.
Okay, and I was working with my student on this data set, and at the end of the thesis we had the idea to try to improve that a little bit, and we saw an opportunity. The context I have already given: heavy files, a tricky file format and also a projection problem in the data set at the time, usable only with a lot of pain and with dedicated software like MapInfo, ArcGIS or QGIS. The opportunity we saw at the time was the web mapping stack that was being developed around the OSM project. OpenLayers was already there, Leaflet also, and the Mapbox tools were on the rise, with at this time TileMill, a platform that allowed you to build custom web maps. We had the sense that this would allow us to renew the approaches to data dissemination and visualization. We were not alone: Oliver O'Brien in the UK made a quite similar proposal in 2015 with DataShine. This is a screenshot of DataShine at the time, which was built using OpenLayers, and he had to build one stack of tiles for each feature he wanted to be visible on the map. In this interface you may choose between several features, more than 50, less than 10, the income and so on, and he had to build all these tiles, so a lot of work. At the same time there was a technology coming up called vector tiles: the tiles used to build the map are no longer images but use a vector file format.
So it was the very beginning of this approach, and I have recovered some of the first tech notes from the open source community about vector tile solutions. For statistical data this was a massive advantage, because you can put all the data in the tile and adapt the visualization on the front end, and you do not have to produce one tile set for every feature. So it was very advantageous and very interesting, and we tried to build something around that. The tools did not completely exist at the time, so the first toolchain was something like: some R scripts to process the data and export vector tiles in GeoJSON format, a Leaflet map to draw the map, and a D3 hook to render the vector tiles on the canvas and animate all this. Plus some tricks to get interactivity, because if you draw something on canvas it's not so easy to know which part of the map you are on. We tried to put some love into the details: the color scales, some background on the labels. The data produced by the French administration were squares, and there is one well-known problem with spatial statistical data, called the modifiable areal unit problem (MAUP), which says that when you change the aggregation level you will see different things and may find different patterns. We saw that this could in fact be an opportunity: we would aggregate the raw data at different scales and link these scales to the zoom levels. This solves two problems. We need to keep the vector tiles fairly small, because we are on the web and we didn't want people to download 10 megabytes and so on, and this is solved by the aggregation. And because of the MAUP we want to explore several aggregation scales, so this was an interesting possibility. This first solution is still usable 8 years later, and it's already cool: you have a map, you can zoom, you can search for a place and so on, and you can find a precise
location. You can switch the features that you want to look at, and you don't have to download the big file and open a GIS software, so it's much easier. Okay, so this was the first project, and we got some feedback: from INSEE, so we were speaking with the producer of the data; also from journalists who were interested in specific topics and used the tool, for example, to study school segregation or districting; urban planners also used it for territorial diagnosis; and transportation researchers and curious people. So it was cool. Since this first version the technology has matured quite a bit, and there is now a very standard way to deliver vector tiles, the MVT format, Mapbox Vector Tiles, with a specification, and PostGIS on the database side can produce this type of vector tile. There are also new front-end solutions like Mapbox GL, which is no longer open source, but the open source project is continued by MapLibre now. So we have a new toolchain and some successors, and if you have questions or want details about some of these successors, it will be a pleasure. It's how you split. So the question was about vector tiles, about the meaning of vector tiles. Web mapping is based on a pyramid of tiles. A tile is a tuile in French, yes. In web mapping, to draw the map you have to build tiles that are arranged in a pyramid, and when you zoom you download only the part of the zoom level that is concerned by the current view. No, it's just to optimize the usage of the bandwidth and to send only the relevant part of the data to the front end during the exploration of the map. From INSEE? So, for INSEE, they have been trying to build something similar for some time now, and it's still not online. Okay, so I have been working on a prototype for them. Okay, so the next version of
the solution was built with INSEE, and they didn't have enough people inside to deal with this type of project, I think, so it's still not there. But if you look at DataShine, almost the same project with similar tools in the UK, DataShine is now available for the UK census; it's a new version of DataShine that is used. So I have some hope that at some point we will have an official portal at INSEE that will look like that, but not yet. Yes, but not in the web mapping exploration tool for everybody; I had one version for urban planners, for example, where they can upload their own shapefiles to add more layers of information. Yes, there is a European initiative, I have forgotten the name, but it's all derived from the INSPIRE directive, which gives some guidelines for delivering spatial statistical data sets. There is also a library from Eurostat which is called GridViz. I don't have the link with me, but you can find it: there is now a JavaScript library developed by Eurostat to show gridded statistical data. |
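The idea from this talk of aggregating the 200 m grid at several scales and tying each scale to a zoom level can be sketched as follows. This is a minimal illustration, not the speaker's toolchain: the zoom-to-cell-size table, the field names (`x`, `y`, `population`, `households`) and the function names are all assumptions made for the example, and only count variables are summed (a mean such as income would need a weighted average instead).

```python
from collections import defaultdict

BASE_CELL_M = 200  # resolution of the source grid, as stated in the talk

# Hypothetical mapping from web-map zoom level to grid cell size in metres.
ZOOM_TO_CELL_M = {9: 3200, 10: 1600, 11: 800, 12: 400, 13: BASE_CELL_M}


def cell_size_for_zoom(zoom):
    """Clamp the zoom to the table's range and return the cell size."""
    clamped = max(min(zoom, max(ZOOM_TO_CELL_M)), min(ZOOM_TO_CELL_M))
    return ZOOM_TO_CELL_M[clamped]


def aggregate(cells, cell_m):
    """Sum count variables of fine cells into square cells of side cell_m.

    Each input cell is a dict with 'x' and 'y' (lower-left corner in
    metres, in a projected coordinate system) plus count fields.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        # Snap the corner down to the coarser grid.
        key = (cell["x"] // cell_m * cell_m, cell["y"] // cell_m * cell_m)
        for name, value in cell.items():
            if name not in ("x", "y"):
                totals[key][name] += value
    return {key: dict(values) for key, values in totals.items()}


# Hypothetical 200 m cells.
cells = [
    {"x": 0, "y": 0, "population": 30, "households": 12},
    {"x": 200, "y": 0, "population": 25, "households": 11},
    {"x": 0, "y": 200, "population": 40, "households": 15},
    {"x": 400, "y": 0, "population": 18, "households": 11},
]

coarse = aggregate(cells, cell_size_for_zoom(12))  # 400 m cells
```

Coarser cells at low zoom keep each tile small, and switching zoom levels lets the reader see how patterns change with the aggregation scale, which is the MAUP point made in the talk.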
Executable papers in the Humanities, or how did we land to the Journal of Digital History |
Hello. Thank you very much for having us here. I'm Daniele Guido, together with my colleague Elisabeth Guérard. We come from the University of Luxembourg, from the Centre for Contemporary and Digital History, where we're running this new journal, this new idea of a journal, together with a publisher that is well known in open access publishing, De Gruyter. So the journal is at journalofdigitalhistory.org, and the idea is how to bring reproducible papers to the humanities, and to digital history in our specific case. That's why we joined forces with them. It's a joint venture with them directly, so the team is relatively small compared to other projects, and we have two perspectives that we decided to put together. On our side, we understand that academic publishing is a bit too traditional, especially in history, and our researchers currently work with Jupyter Notebook to run their own experiments and so on. So the idea was: can we go from experiments in Jupyter Notebook to actual publication, also in our domain? On the other side, they wanted to test this hypothesis, because they really want to engage with new publication practices, and this joint venture was just a good match. Then, well, reproducible papers in digital history means a lot of things, because first of all we now have a massive digitization process of primary sources and secondary literature, and also new digital material, like the Twitter archive that we saw before. On one side, the details of the code are crucial. Sharing the data set is one thing, but the other thing is really how this data has been created, so what the conditions of production of this data were. This is very important for us as historians, well, not for me, for my colleague. And then interpretation: how the data set has been built, what its limits were. All these questions need to be addressed in a different way.
And then at the same time we have, of course, new standards, not only in digital history, such as the famous FAIR principles: findable, accessible, interoperable and reusable data. We need to meet these criteria with our journal as well. And there is this idea from Mullen, that of a braided narrative. It advocates bringing together two things: one is the narrative, the argumentation of the publication; the other is the interpretation of data, and it says that both can be done in a narrative way. This is where we put these so-called multiple layers together. Every article published in our journal has a fingerprint, a sort of identity, where these levels, the narrative, the hermeneutic level and the data layer, are together. So this is the representation of one Jupyter notebook, which is normally linear, cell by cell; we just distort it and put it in a circle, and here you can test it out. It was, and still is, also a tool for our authors, to whom we owe a lot, because they are our primary testers, and it is still an experimental journal. You can tweak the data, you can change the content, and you see how the fingerprint changes. This was just an experiment at the beginning, but then it really became integrated, it is down there, into the main interface of the journal. And we saw that the fingerprints were indeed very different, and we can also see the code style of every Jupyter notebook, how the author decided to narrate the arguments. So I will go quickly, sorry. Then this is the basic layer, the narrative layer, which looks like an nbviewer on steroids, in the sense that we have figures, we have tables, we have a bibliography with Zotero and cite2c. Above all, it is a very thin layer on top of the Jupyter notebook, because we use the usual output of the notebook, so this is, yeah, an augmented nbviewer.
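One way to picture the three layers on top of a single linear notebook is to sort cells by a tag in their metadata. The sketch below is an assumption-laden illustration, not the journal's implementation: it relies only on the fact that a notebook file is JSON and that cells may carry a `tags` list under `metadata`; the tag names `narrative`, `hermeneutics` and `data`, the default layer and the function names are hypothetical.

```python
import json

# Hypothetical tag names standing in for the three layers from the talk.
LAYER_TAGS = ("narrative", "hermeneutics", "data")


def cell_layer(cell, default="narrative"):
    """Return the first layer tag found on a cell, or the default."""
    tags = cell.get("metadata", {}).get("tags", [])
    for layer in LAYER_TAGS:
        if layer in tags:
            return layer
    return default


def split_layers(notebook):
    """Map each layer name to the indices of the cells that belong to it."""
    layers = {name: [] for name in LAYER_TAGS}
    for index, cell in enumerate(notebook.get("cells", [])):
        layers[cell_layer(cell)].append(index)
    return layers


# A tiny hand-written notebook in the nbformat 4 JSON layout.
notebook_json = """
{
  "nbformat": 4, "nbformat_minor": 5, "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {"tags": ["narrative"]},
     "source": ["Our argument starts here."]},
    {"cell_type": "code", "metadata": {"tags": ["hermeneutics"]},
     "source": ["df.describe()"], "outputs": [], "execution_count": null},
    {"cell_type": "code", "metadata": {"tags": ["data"]},
     "source": ["df = load()"], "outputs": [], "execution_count": null},
    {"cell_type": "markdown", "metadata": {},
     "source": ["Conclusion."]}
  ]
}
"""

notebook = json.loads(notebook_json)
layers = split_layers(notebook)
```

Because the cell order is kept as indices, the same notebook can still be read top to bottom, while an interface is free to show or hide each layer separately.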
And then, as it is a braided narrative, we decided to use this metaphor of layers, one on top of the other. So this is a sort of animation: on the left you see the full hermeneutic layer, and on the other side you can see how it slides behind the narrative layer. And the data layer is, for the moment, the part at the top right, where we use MyBinder, a fantastic service for publishing your notebooks online. And we wanted the article to be not only a showcase of the data set, but also a small history lab, so that people could just click on the button, get to the data, and understand how the data has been composed. The good thing is that we decided to keep MyBinder as the source of truth. So the article that you see published is exactly the same copy, with just a different way of interacting with the different layers. So this is how it looks on MyBinder, it's a classic Jupyter notebook, and for every notebook we have a GitHub repo where we store all the requirements and all the images and the data set. We still have to put together the FAIR metadata, so that is under construction. Then, what does it mean to use Jupyter notebooks for publishing? We see that in the literature there is a lot of criticism saying you shouldn't use Jupyter notebooks, because it's too complex, it's impossible to replicate, and so on and so forth. But for us, it was really the simplest solution. At the same time, to publish with Jupyter, we had to make our pipeline a bit more complex than usual. So we have a first review directly on the abstract, where we start communicating with the authors, understanding their needs, and creating a writing environment for them that can be replicated with Docker containers for Python and R. And then there's the first technical review, my colleague is in charge of the first technical review, which is the most complicated one because there are a lot of checks. We saw some projects already, we needed to have checks.
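The talk does not list which checks the first technical review runs. As a purely illustrative sketch, assuming one of them is "was this notebook executed top to bottom in a fresh kernel, without stored errors", such a check can be written against the standard notebook JSON fields (`execution_count`, `outputs`, `output_type`, `ename`). The function name and the messages are hypothetical.

```python
def check_notebook(notebook):
    """Return a list of human-readable problems found in a parsed .ipynb file."""
    problems = []
    counts = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        if not source.strip():
            continue  # empty code cells are ignored
        count = cell.get("execution_count")
        if count is None:
            problems.append(f"cell {index}: code cell was never executed")
        else:
            counts.append(count)
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                name = output.get("ename", "unknown")
                problems.append(f"cell {index}: stored output is an error ({name})")
    # A clean top-to-bottom run in a fresh kernel numbers the cells 1, 2, 3, ...
    if counts != list(range(1, len(counts) + 1)):
        problems.append("code cells were not executed in order from a fresh kernel")
    return problems
```

An empty list means the notebook passed this particular check; a real review pipeline would presumably also look at the requirements files and the data, which this sketch does not attempt.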
And then we have a lot of other open source software that enters this pipeline, like, for the preview of the notebook, the GitHub app and nbviewer, and we have MyBinder, and this is just for the first technical review, because then the article is sent to the reviewers for the double-blind review. So before even reviewing, we have to do this huge job, because they also have to review the data and the pertinence of the data set. And then finally, there is one important thing: English editing. So how do you edit something that has already been run, without running it again? This could be a tool for translators, a tool for copy editors who are not into the Jupyter world. So how to do that? We have Jupytext, and we're still testing some plug-ins to see if this could work without touching the final output. And then the final technical review: after all this has been shipped, we have a DOI. So the article is now published and needs to be indexed, and there is the problem of long-term archiving, which is a big problem for many reasons. First of all, libraries that get deprecated, and also APIs that disappear. So how do we really reproduce this in the future? And then finally, the data set needs to be deposited somewhere; we have Dataverse, but we're looking at Zenodo in order to match the FAIR metadata. And time is up. I have a question for you, of course. Thank you very much, first of all. And if you want to contact us, just to collaborate or work together on Jupyter publication: JDH admin at uni.lu. And then the question is, how can we actually collaborate on something like a notebook, which requires quite a threshold of expertise, not only from the researcher but from the people around them, and how do we maintain all this and keep this history lab living for more than one year? Thank you. Yeah, well, I'll repeat the question. So he asked me about the double-blind review: how can we keep it actually a real double blind?
So she anonymizes the data on GitHub. We have specific repositories that are created after the communication with the authors, where we only have the code without the names, but then you still have the bibliography, so it's easy to guess; it's a very small world, that of digital history. But this is the way to maintain the double blind. And then we send the reviewers both the MyBinder and the version of the article on our website, with a hidden URL. So this is the only thing that we can do. For sure, with the double blind we have the problem that we cannot really use pull requests directly on the GitHub repository. So in fact there is some duplication between GitHub repositories. Sometimes, after the peer review, there is a request that comes back to a technical review, because there is a revision. So there is this question of how we re-synchronize the notebooks. Some authors are good enough with GitHub, but it is hard to review a notebook with the outputs and the metadata and to track what has been changed. That's why, for the questions that you have, we are testing ReviewNB, and maybe also using Markdown or just Python scripts to produce several outputs, so as not to touch the metadata that is inside the notebook. And there was another question, but I don't know if we have time. Yes. Yes. Please. Last one. How do you assess the FAIRness of your data sets? Sorry, how to assess? The FAIRness of your data sets. Yeah, that's the very big, big, big question. So the idea behind the braided narrative is that you tell the story around the data on one side, and on the other side you keep the data coherent, like with the Zenodo metadata, or probably with what Paul showed us before with Ricardo. So having something like an external check on the metadata and on the data set itself.
At the same time, the first technical review is the one where we actually assess the data, so whether the data sets are complete and coherent. We don't judge them, because we know that there are conditions of production, and we try to make these as explicit as possible. Yes, exactly. And that's why there is the question of long-term maintenance. So now we only have nine articles, but we have 28 in the pipeline for the coming year. So now it's really getting up to speed, and we have more and more interaction with others, which makes things more complicated. Thank you. |
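Coming back to the English-editing problem raised in the talk (editing the text of a notebook that has already been run, without running it again and without touching its outputs), the general idea can be sketched with the standard library alone. This is not Jupytext and not the journal's workflow; the function names are hypothetical, and the sketch only shows that markdown cells can be round-tripped while code cells, execution counts and outputs stay byte-for-byte identical.

```python
import copy


def export_markdown(notebook):
    """Return {cell_index: text} for every markdown cell, ready for a copy editor."""
    texts = {}
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] == "markdown":
            source = cell["source"]
            texts[index] = "".join(source) if isinstance(source, list) else source
    return texts


def apply_edits(notebook, edited):
    """Return a copy of the notebook with edited markdown; code cells are untouched."""
    result = copy.deepcopy(notebook)
    for index, text in edited.items():
        cell = result["cells"][index]
        if cell["cell_type"] != "markdown":
            raise ValueError(f"cell {index} is not a markdown cell")
        cell["source"] = text
    return result
```

A copy editor would receive only the output of `export_markdown`, and the returned corrections would go through `apply_edits`; because code cells are never rewritten, a diff of the resulting file shows only the prose changes.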
The Turing Way: Changing research culture through open collaboration |
Hello everyone, my name is Ann and I'll be talking to you about changing research culture through open collaboration. There was a little bit of a last-minute tweak to this talk and I think you'll find out why in a second. I am the community manager of the project, but I do want to flag here that I'm a member of a core team of 28 people who talk about the Turing Way in different contexts in their own spaces. I'm also really excited to see a lot of Frictionless Data plugging over the course of all of the talks today, because as an ex-fellow I'm happy to see representation of the project. So first, the most obvious question is: what is the Turing Way? We are an open source guide on data science and open research. We are a documentation-first project and a community, which means that we involve and support a diverse group of folks in order to make data science reproducible, ethical, collaborative, and inclusive for everyone. So while you may have seen the book around, and you may have seen it cited in different spaces, we're very much a global community of folks. We draw upon a lot of open source tools in order to make our guides, to talk about them, to write about them, and to bring them to different spaces, and we're also very much a culture of collaboration, and I will try and keep my volume up. We are hosted at the Alan Turing Institute, which is the UK's national institute for data science and artificial intelligence, but I do want to flag here that while we're hosted there, many of the folks who maintain the project, who have written chapters, who have been a part of the project, have very much come from all around the world and have raised many different and challenging questions about the state of the discipline itself.
We've grown a lot over the course of the past four years: over 250 pages, and, I updated these stats just late last night, over 1.5K people who have starred us on GitHub. And if there's anything that I want you to get out of this talk, it's that all of the illustrations that you'll see throughout are things that you can download and use for your own work. As of last night, over 19,000 people have used those illustrations and over 16,000 people have downloaded them, so please, please feel free to use them for your own work. They illustrate many different aspects of the research process, about working with data, about working with each other. Just a couple of small things about its impact beyond the data science world: our aim is to reach folks in policy and folks in other public-facing spaces, including education, and we've seen the citation of the Turing Way project really grow, particularly over the course of the past couple of years. But to take us back to the beginning and to the foundations of the Turing Way itself, we were really started as a project in 2018 by Dr. Kirstie Whitaker, in response to the crisis of reproducibility in science, which you are all well aware of. As you know, many scientific studies are difficult or impossible to reproduce. She, as a neuroscientist, was especially aware of this, and from this notion was really asking: is there a place where best practices or ways of working could be gathered together and collectively made, again drawing upon open source methods? And again, reproducibility is: when you use the same tools and the same data, can you create the same result? That is different or separate from the notions of replicability, generalizability, or robustness more broadly. So from there, the guide for reproducible research began in 2018 and 2019, and it has many different resources related to version control, licensing, and different tools that you can use.
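The distinction drawn in the talk between reproducibility and the neighbouring notions of replicability, robustness and generalisability is commonly presented as a two-by-two grid over "same or different data" and "same or different analysis". A minimal sketch of that grid follows; the function names are hypothetical, and comparing results by hash is only one assumed way of checking that two runs produced the same output.

```python
import hashlib


def classify(same_data: bool, same_analysis: bool) -> str:
    """Name the property being tested when a study is rerun."""
    if same_data and same_analysis:
        return "reproducible"   # same data, same analysis
    if same_analysis:
        return "replicable"     # different data, same analysis
    if same_data:
        return "robust"         # same data, different analysis
    return "generalisable"      # different data, different analysis


def results_match(original: bytes, rerun: bytes) -> bool:
    """Assumed check: two result files are identical if their SHA-256 digests agree."""
    return hashlib.sha256(original).digest() == hashlib.sha256(rerun).digest()
```

In this picture a reproducibility check is the case `classify(True, True)` combined with `results_match` returning true; exact byte equality is a strict criterion, and real analyses with floating-point or random components would usually need a tolerance instead.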
We encourage you to check it out and use it for your own work, but we also say that you should not read it from back to front or from front to back, because it is over 250 pages long. So really, it's very much meant to be a kind of buffet, where you're able to take different parts that might be useful for your work and for your research. But we soon realized, and many people within the community soon realized, that in order to talk about reproducibility in science, you have to talk about the scientific process itself. You have to talk about the design of your project, about how you communicate your work, about the collaborative process, about embedding ethics throughout every stage. And so that one guide grew into five different guides related to all the different parts of scientific research, in particular with the addition of Dr. Malvika Sharan to the project in 2019. And we also document a lot of our practices as a community in what we call the community handbook, so that other open source groups can use those practices as well. So it's very much meant to be open all the way down and in many different contexts. So again, you see this trend of transformation from best practices for reproducibility to best practices for open research more broadly. But going even further than that, as I entered the project as community manager over the course of the past year, we've really seen these conversations shift. We've very much seen conversations around maintenance and community care come to the forefront for folks within the Turing Way community and, in many ways, within the community of open scientists more broadly. We also have folks within the project who started, on their own, the process of translation and localization of not only the Turing Way but open science principles more broadly.
That means in many cases, in the case of, for example, Arabic, one of the ongoing translations of the project that's happening right now, you are actually inventing a language for open science that doesn't necessarily exist currently. And we have ongoing translations at the moment in, I believe, Arabic, French, Spanish, Turkish, Japanese, and I might be missing a couple, but again it speaks to what happens when you adopt open source methods, practices, and ways of working and allow people to make and adopt this resource in contexts that make sense for them. Of course, very much at the core of all of this is supporting open infrastructure and governance as well. The entire book is hosted on Jupyter Book and, again, hosted openly on GitHub, which includes a lot of our project management work as well. So we again try to be as open as possible. Part of this, of course, is mentoring researchers at every stage, incentivizing PIs as well as early career researchers to adopt open practices and really set the tone and the culture within their labs, as well as increasingly talking about accessibility in different contexts. And within the community, we've seen accessibility take on the notion of not only how can we make the Turing Way accessible in a web kind of way, but also how can we make data science more accessible for the public? How can we make cross-disciplinary conversations more possible? How can we be accessible to as many people as possible? And that includes bridging any tech barriers that may come from working especially with non-computational researchers. And finally, all of this is very much under the umbrella of advocating for a change in research culture as we know it. So all of this said, a big part of that is how we go about the process of acknowledgment.
We have this really funky bot called the All Contributors bot that we have installed within our repository, which is all about trying to acknowledge all the different types of work that happen not only within an open source project, but within a community. So we try to add a lot of those acknowledgments there and, again, to recognize that it's not just the researcher; oftentimes it's the entire infrastructure, human, social, and technical, that surrounds a project, that surrounds research, that is really important to recognize and acknowledge all of the time. And again, we also host lots of fun events, both online and in person, throughout the year: collaboration cafes, and what are called book dashes, kind of like, but not really like, a week-long hackathon. So, to end with the little obligatory pitches: connect with us online, read the book, but also join us in any of these different spaces. I'm really trying to pitch you all on the Turing Way, of course. Join us at any one of these events if you'd like. And I think I just want to end here on the note that in spaces like FOSDEM, but in any environment in which we find ourselves as a community, it's really about meeting people where they are. We've had conversations with folks that can range from extremely technical, you know, open infrastructure conversations and the difficulty of implementing that, to "I don't have any computational resources in my lab and I want to learn these skills to work with data". How can we have those two people in the same room and have a resource that is useful for both of them? So again, this is me just representing a bunch of other folks who have made the Turing Way what it is, and really thanking them for all the work that they do. And I think that's all. Thanks everyone. Yeah, of course. So the question is how can we join different teams or a different project within the project?
Because the Turing Way is kind of a community of communities, everyone uses different tools, as you all may be familiar with in the open source world, of course. The translation team specifically has a channel on Slack and weekly meetings that they host. And as for getting involved, it's honestly just looking at the GitHub repository. We have folks that just join and look at issues for new chapters that are being written, or pitch ideas. We had one recently, I think, on an open infrastructure chapter that's ongoing, and a bunch of other things like that. So the translation team, the French team in particular, is very, very cool. They're all wonderful. [Audience question, partly inaudible, about the community around the project and about scaling it:] If the community grows a lot, will the things you do now still be okay afterwards? Have you imagined a way to make it work with a very, very big community and a lot of researchers all around the world? So the question is how do you negotiate, or navigate, the scaling of the community and of the project? Given all of the network researchers that are here today, we ourselves are trying to understand what a network of the Turing Way project would look like, because we sometimes find it cited in some government document months down the line, which means that there's a long tail of folks that might use templates from the project or might cite it in different spaces that we can't necessarily keep track of. There are maybe two answers to that question. One is that as the community manager, I think a lot about maintenance and sustainability, and about how maintainers are often such a small percentage of those who write or those who create. And my priority is really trying to ensure that there's sustainability there as that inevitable kind of growth happens, perhaps. But the other side of it is, I think, a process of translation.
We recently received a grant to work with practitioners on implementing Turing Way principles, or things that are cited in the book, in industry or other research environments. And I think a lot of that work is going to be about how we can build bridges, in a very non-trivial way, not as the trope that "building bridges" sounds like, but rather translating the work of open research principles into something that an industry person would be able to apply within their work. And I think it will be difficult, but that can't be at the expense of the maintainers. [Audience question, partly inaudible, about very large research teams:] Have you seen this kind of self-organising of communities feeding through into the decisions that are made? Would you be able to repeat that last part? You said, have you, with large research teams? That's a very interesting question, which was: if you want to change research culture, and with very large research teams, have you seen principles like what is in the Turing Way applied within these groups? Am I reading you right there? If there's anything that the folks in the community have observed, which is why a kind of collective documentation project has emerged, it is that everyone's looking to reinvent the wheel. And something like the Turing Way emerged out of this need to go: hey, we're both on massive research teams. We're both kind of trying to do the same thing. Why don't we just template it? Or why don't we talk about it? Why don't we create a set of best practices that, of course, isn't "the best", but rather something that both of us can use? And so I think if there's anything that the project is aiming for, it is to help with that templating process, so that we're not constantly reinventing the wheel when we are in massive research projects or in different groups. Because that, as you all are probably aware, happens all the time.
Yeah, I hope that answers your question. Yeah, thank you. |
Open Research Open Panel
Open discussion among the open research tools and technologies community |
Okay, so basically we had this extra slot of time in our physical session, and we want to use that time to engage in a discussion with you, the audience, about this open research track here at FOSDEM. We are opening the stage for questions and discussions. I just wanted to very briefly say a few words about how we ended up creating this space. So basically, a few colleagues and I at Sciences Po Paris, for instance, had been giving a few talks here at FOSDEM in other devrooms, using the FOSDEM conference as a venue to talk about the technical things we were building in the lab. And it was a very good opportunity for us, because it's not easy to publish scientifically about technical things, especially in the social sciences. So after a few years of doing that, we thought that we could try to go a step further by organizing this track, which is specifically about open source software and research. And, as you've seen, we understand research in a very broad way. Any kind of inquiry is, we think, interesting to discuss, because the chair, a journalist investigator, or we had an open factor, for instance, or a social scientist, or a natural scientist, all share kind of the same engagement with data and with inquiry. So we created this room four years ago; at the time we were actually merged with another initiative from two people from neuroscience, and it was nice also to have these different disciplines. Of course, then COVID came, and we succeeded in keeping the room going online. And now we are a crew of five or six people. We have two people who are online, Yo Yehudi and Célya Gruson-Daniel, who will monitor the online session from home, which will start in 20 minutes. And basically we are a bunch of people working in academia, or not, or at the Open Knowledge Foundation, and I am no longer in academia myself.
And so the first thing I would like to say is that volunteers to help us organize this space are more than welcome. Basically the work starts in the autumn, when we submit a proposal to FOSDEM to organize this room again. So each year we do that, we wait for the answer, and once we have it, we have to do a call for proposals, something very, very classic in organizing a conference. So reviewing papers, and actually getting the papers in, so we contact people to ask them to come to our conference. And then there is what you've seen us doing today, minding the door, putting signs up and that kind of stuff, which is fun. And yeah, so I will finish with that, and maybe some other people want to say other things. I will finish with one thing: it's important for us to try to increase the diversity of our group. Gender, but also disciplines; we would like to open up to as many disciplines as we can. So yeah. Sorry? Less diversity? So there is a question about... yes, yes, also fewer French people. We tried that, we succeeded a little bit, but not that much. So yes, please. Mathieu. Yeah, I just wanted to say a few things, but basically the reason we need you is also to increase this diversity in the papers submitted. Because, for instance, if you look at the gender balance, the committee is basically gender balanced, but the submissions are completely dominated by males. And also, the devroom in the beginning had one leg firmly in the hard sciences, but then we've kind of drifted towards the social sciences, and it's not just us, it's also the demand. So, you know, we try to promote something, but it's kind of an equity balance game to try to have this variety in the proposals in the first place, because of course, if we don't have any proposals, for instance in hard science, we are not going to have any talks.
So you, the audience: if you submit papers, or if you come into the committee and then become a bridge to contact other people to submit papers, that's how we can ensure that it remains diverse. It's not a lot of work to participate, also because we are many and we are probably going to get even more people, and it's not a problem to be many, because the real job beyond managing the day is, as you said, to contact the people and then to review, but mostly to contact them. So please come in and join us for that. But we could also have this discussion about what the scope of that should be, because I think that the figure of the tool maker was kind of at the center of what we had in mind. But as we see, it's not just about tools, it can also be about infrastructure in ways that are not tools, like institutions. You know, I don't know, I'm thinking of the Journal of Open Source Software, which we had a few years back; I think that infrastructure is not just tools. But now it's clear, it's not just the tool makers, it's, let's say, the infrastructure makers, and whatever else it could be. So come and shape the profile of the room with us. Can I, just to emphasize what you said, can I just share my little experience of this? Go ahead, take the mic. Just to emphasize what Mathieu said, this was my first time at FOSDEM and, as I told you, I'm not a coder, so I was pretty scared about what I was going to say and share, etc. So please do it, because I came here to share knowledge, of course, but also to take back some experience that is different from what I do every day and from the people I work with every day. So thank you for this, and please submit papers. Yeah. Do you want the mic? Yeah, yeah. I think one challenge for diversity, aside from many other factors, is that there's also the implicit assumption that there's sort of a good way to use open source software.
There were some talks, for example, that kind of, not in a bad way, but sort of said that R is kind of tricky, because, I think, it's not really dev mainstream, and I understand that: when I want to develop software, I would use something else. But in statistics I use a lot of R, and I also use a lot of Excel for help, and then people are like, oh no, that's a bad tool. But a lot of people in their practice, particularly if they're not developers, combine a lot of different tools. And I think they get the vibe that "hey, we also use Excel" is sort of a bad thing for their practice and that people look at it strangely, or even if it's LibreOffice, which at least is open source, it's still not the dev way. And I guess that vibe keeps away a lot of people who would be very interesting for this community, people who could show how it's done in the real world, very often in a very messy way. If we could get to a place where this would be legit and people would be curious and interested in those non-perfect, chaotic, and not 100% open source ways, I think that would increase the diversity. So just for the stream, the idea is that a challenge to diversity could be that there are seen to be good ways and bad ways to do open source, and I have to say that I forgot the last part of what you said. The Excel and LibreOffice part is important. It's going to come back later. I'm sorry. I'm so tired. It was for the stream. Yeah, that's what the point was, but I mean, you are very right that it's up to us to cultivate that. Yeah, it comes back to me now. It's really important to show how it's being done in the real world. That's something that I think we all appreciate very much about FOSDEM. You don't see people talking from a distance about how it's supposed to be.
You have real people who are actually doing stuff, which also means that sometimes they are not really good at communicating, and that's something that I really appreciate, because you hear their voices. I think FOSDEM is somehow a safe space for people who, contrary to academics for instance, are not primarily good at communicating, but who are really doing the stuff, and we can hear them, and we can identify with them and realize that anyone can get involved in open science and open source design. We want very much to keep cultivating that. That's also why we try to have a blend of people showing off the tools they've made and people giving a different kind of talk. I don't know what the right balance is, but we want to preserve this voice. [Audience:] I was a researcher at some point, and during my PhD I was part of a training workshop led by PhD students, who were training ourselves with best practices in mind, as a grassroots initiative, which I think is a good example of what worked in a particular field. Would this kind of story of what worked in a particular domain be interesting for this devroom? So: are stories of what worked in a particular domain interesting for the devroom? Well, for me, I think they totally are. So the answer is yes, and I think we've somehow done that, but sometimes how it looks once it's in the slides and here is different. But that is very much the case. I think we do not have a clear typology of the kinds of talks we accept. One very clear kind is the tool talk: I've done something, and I show it to you. Another kind would be experience feedback. So I'm not going to show my tool, but, for instance, I have engaged with that open source stack and I've met issues, and I'm going to talk about that. This discussion is a little bit of that kind somehow. Do you think it would be pertinent to add a track dedicated to very short talks, like lightning talks, but dedicated to more? Repeat the question.
So the question is: would it be a good idea to organize very short lightning talks about open research? So actually we try to do that a little bit; the last three talks were lightning talks, okay, lightning talks of 15 minutes, so it's not quite that, but I think it's, yeah. So we try. Actually, whether the format is lightning or not depends on the time we have and on what the speaker wants. So it basically is a tool. If you don't know how it works, people say in their proposal whether they want a lightning talk or a normal talk, and then we can also ask them to expand or reduce it, if we think it fits better, so we try to negotiate. But it's kind of unpredictable how many lightning talks we're going to have. So the proposals decide if there's enough for a track or not. From the last two questions, there is one thing emerging that's very important: the form of things should be questioned. There is a sort of standardization associated with a conference. As we receive talks, we perceive diversity even in the form: experience feedback, tools, or maybe other talks, we do have a classification. But seeing things here, it's sort of standardized. And maybe we should work together, so you're in this with us, on the form of things, on maybe other forms of presentation. We're not that constrained by FOSDEM. So that's one way we could move forward. But there are many things we could do. And I am tired. I do have a remark. I am part of the geospatial community, and on the two sides of the space. And we have the same problems. We have lots of developers who want to speak about their new things and the new stuff they want to do, but we have a problem getting users who speak about old things, about how they use a tool, and it's very difficult, and it's pretty hard to keep in touch with them. So I have a colleague, I asked her to come and present her tool, and she said, I don't fit.
I'm not a developer, it's not the kind of meeting for me. So maybe we can speak a little bit about that and see if it's really open to everyone. So the comment was that Nicola says that in the geospatial community they have this issue getting users to come, because they don't feel welcome or they don't feel like they fit. So I want to give the mic to Maya, because before she joined the committee for the room, Maya first presented a talk here, two years ago I think, in 2020, as a user. So now you're in. So what advice would you give to those people? Don't be scared, they're really nice. I understand your colleague perfectly, especially coming from academia, which is also my case. Academia is extremely segmented and we are taught to fear each other deeply, and so it can be really scary to push those boundaries a little bit. It can also be very rewarding. So you said she did come in 2020, or she didn't come? Okay. Okay, so despite trying, there was no success yet. Maybe if you show them this replay, it will change everything. I've got a question for you. One of our objectives with this was also to bring folks from academia here to FOSDEM, so that they can actually benefit from the rest of the conference and all the devrooms. So here's my question: do you go to see talks in other devrooms as well as this one? Can you raise your hand if you do? Yeah. It's good? Yes. Yeah. Yeah, it's difficult to see all the talks that we want to see, because the offer is so large. But conversely, who is here now but didn't come to FOSDEM specifically for this room? Okay. Raise your hands. So it also works the other way around. Good. Thank you. Is it time? No, not yet. So we still have 10 minutes, and then we'll watch the stream here, if you want to stay. Quick question. What is the acceptance rate here? That's a very good question. So the question is, what is the acceptance rate? I don't know. We haven't calculated it, but I think...
I would say that there is a part of the papers that are not good, I mean, they're out of focus for the devroom. And what we can do, actually, is re-route a proposal to other devrooms, so it doesn't matter much. But also some people just submit the same talk to multiple devrooms, right? And it's okay, right? The boundaries are not super open, but there are kind of buggy proposals. What we would consider not acceptable is something that is clearly for someone else, where we don't know why it's been submitted here, or something that has just a title. When that happens, we send an email, and sometimes we don't even get an answer; we don't know what happened. So if you remove those, the ones that we actually refuse are, I think, about one-third, one-quarter maybe, so not too many. So we are always... It has always worked for us. We need to have enough to fill the day if we want to have a full day, but not so many that we have to refuse a lot of good talks. But to be honest, we've been refusing good talks every year, just because there's not enough room. So basically, we decide about the balance when that is the case. When we have too many or not enough tool talks, we try to adjust this way. One kind of talk that we tend to reject is a talk about a tool that has been developed somewhere in research, which is very specific, in a very specific domain, without any abstract, without any discussion of whether the open source side of the thing actually changes anything in the story. So a paper that is just about a tool used in research depends a lot on the quality of the abstract; the tool alone is not enough, in a way. The abstract to submit is only a few paragraphs long, so it's really easy to submit. Please do. Yes? [Audience question, largely unintelligible in the recording; the speaker mentions, for disclosure, having been a member of an Italian student association.] Okay.
It's a good idea. So the question is, have we tried PhD associations, associations of PhD students, to email them and propose that they submit a talk? No, we haven't. No, we haven't, and we are not very good at communication. Well, I'm bad, but they are good. Yeah? Yeah, I have to say, as an academic, well, as an academic and a developer, I feel like there is a tendency for academic labs to have sort of a rotation of conferences that they send people to, and it typically is difficult to get added to that as a new conference, especially if it's something that's sort of outside of the norm. I think most of the time that's not intentional, it's just that there's so much else going on and people don't really think about it. So I would second the suggestion of talking to PhD student associations and reaching out to universities, especially universities that might have those, to say, hey, this is an opportunity for interdisciplinary work, an opportunity to talk to people who are more technical, if you're less technical, or people who are in other disciplines, because that sort of stuff is in vogue right now as well. So I think that once you're on the radar, it's going to be easier to get people; it's just getting on the radar. So the comment was that it's not easy for lab members to add one more conference to an already existing list of standard publication venues, above all when the venue, like this one, is really not standard, academically speaking. But the interdisciplinarity of this room would actually be a good way to attract not only PhD students, and to make it clear to academia what this space basically is. Okay, so that's a very good question. The question is, who in this room would define him or herself as a UX designer? Can you raise your hand? There's one there. So not many, not many.
We're doing research into how usable research and science is; we are trying to figure out usability and user design. So the comment was, are you aware that there is actually an open design devroom, and so the answer is yes, actually you will co-organize it. And actually this room is here tomorrow morning, I think. Afternoon, sorry. And my personal comment on this is that, in my career and in my experience, UX is a real cornerstone for going beyond the big, big issue of mixing qualitative and quantitative approaches to data, as my talk tried to explain. And yes, we need designers, and we have one here who talked about it, and in our previous lab that was very important; the media lab is a place where they try to do this mix. We should try not to overlap. It's good this year, but the years before we did, and it's annoying, because I think we have the same public, or partially. So if we could, I don't know if we can do that, we should at least ask. It's better if we can keep them separate. Yeah, at the same time, yeah, we can do that, actually we could talk about what we do. Right, one last comment: if you want, after the online session, we can have drinks together at a place called Tavernier. We are going to write that on the board. Tavernier is close to here. And I'm going to set up the stream here, so you can stay here and watch it with us if you want. So it will be on me. We are going to stream the video, and for the interaction, if you want to ask questions to the speakers, you have to go through the chatroom on the FOSDEM website. Thank you. |
The Software Sustainability Institute Community and Events
How the SSI supports research software through community-building and events |
Hi, my name is Rachel Ainsworth and I'm the Research Software Community Manager for the Software Sustainability Institute and I'm based at the University of Manchester in the UK. Today I'm going to talk to you about how the Software Sustainability Institute supports research software through community building and events. The Software Sustainability Institute, or SSI for short, is a national facility in the UK that has been promoting the advancement of software in research since 2010 by cultivating better, more sustainable research software to enable world-class research. And this is more succinctly stated in our motto, better software, better research. The Institute is a collaboration between the universities of Edinburgh, Manchester, Oxford and Southampton and we are very proudly supported by all seven UK research councils. Research software, which encompasses code, processes and community, reaches boundaries in its development cycle that prevent improvement, growth and adoption. And the Institute provides the expertise and services needed to negotiate to the next stage. We advocate for all things research software through programs, events, policy and tools to support the community developing and using research software. The Institute is comprised of five teams. There's a software team which helps the community to develop software that meets the needs of reliable, reproducible and reusable research. There's a policy and research team which collects evidence on and promotes the place of software in research and shares this information with stakeholders. There's a training team which delivers essential software skills to researchers and partners with institutions, doctoral schools and the community. There's a community team which develops communities of practice by supporting the right people to understand and address topical issues. And finally, we have a communications and outreach team which exploits our platform to enable engagement, delivery and uptake.
The teams are involved in many different activities, some of which are listed here on this slide. I don't have time to go through all of these today and my talk will focus on the community events, activities and resources and how you can benefit from them and get involved. From the community angle, in order to facilitate the advancement of software in research, the Institute has engaged researchers, research software engineers and developers, instructors and trainers that deliver research software training, research policy makers and groups that provide services that support research software development. We have a dedicated community team but we collaborate closely with all teams within the Institute in order to more effectively engage the research software community. Our programming ranges from blog posts, news items, guides and social media that you can consume to workshops and events where we facilitate collaboration and co-creation amongst our community. We also run a fellowship program which engages across all of our activities. SSI Fellows champion the Institute's mission through organizing events and growing their own communities of practice. The fellowship program engages with natural ambassadors of better software practice from the research community, empowering those working to improve software practices in their domains and areas of work with funding and visibility to run their own workshops, training, events and other activities and nurture their communities. In return, Fellows help the Institute discover important information about software in different research domains and guide our training, policies, community work and consultancy engagements. Here are some examples of our Fellows' activities that we have supported. At the top of this list is Carpentries Offline, which was started by a number of our Fellows at our annual event, which I'll talk about in a few more slides.
The project is all about supporting Software Carpentry training in areas with unreliable internet access through the use of Raspberry Pis. There are also a number of coding workshops such as, you know, R workshops for archaeologists and R packages for ornithologists, so we have research domains that are covered by specific training and topics. There are activities related to mental health within research software, and some of our Fellows have also piloted training developed by the SSI, such as the Intermediate Research Software Development in Python course. And these are just a few of the activities. You can read a lot more about the various activities that our Fellows get up to on the Software Sustainability Institute website and blog. Collaborations Workshop is the Institute's premier annual conference and it brings together stakeholders across the entire research software community such as researchers, software developers, managers, funders, policy makers and more to explore important ideas in software and research and to plant the seeds of interdisciplinary collaborations. Collaborations Workshop 2023 will take place as a hybrid event in Manchester, UK from the 2nd to the 4th of May 2023. And the theme of this year's workshop is sustainable career development for those in the research software community: looking after your software, your career and yourself. The theme encompasses technical development, which can include topics such as software sustainability, software products and digital tools, infrastructure and documentation, software development skills and training. It encompasses career development, which includes topics such as career pathways related to research software, how to get credit for your work, mentorship and inclusive leadership to support teamwork. And it also encompasses personal development, which includes topics such as sustaining your mental health, well-being and finding community.
Registration is open and so is the call for submissions. Collaborations Workshop is an unconference, which means that it is not only comprised of presentations but also involves a lot of interactive sessions led by participants. The keynotes and lightning talks inform and inspire. Lightning talks also provide the perfect opportunity for participants and sponsors to introduce themselves and their work at the workshop. Panels are informative but also allow discussion and exploration of a topic or theme to showcase different perspectives. Mini-workshops and demo sessions are contributed by participants and they demonstrate a particular research software product, digital tool, project, approach or standard, deliver specific training or interactive tutorials, conduct information gathering or explore a topic. And each of these session types feeds into the more interactive sessions that follow. For example, the discussion session allows groups of people to discuss a topic that interests them in a way that furthers our knowledge of that topic. The groups also co-author blog posts summarizing their discussion, which are then published on the SSI website to disseminate to the wider research software community. The collaborative ideas session follows this and it's used to get people talking about their work. Groups identify problems within research software and work together to come up with a solution, and this facilitates more focused creation. And the final day of the workshop is the hack day, where teams form to work on projects generated during the collaborative ideas session and other ideas pitched during the course of the event. And this facilitates co-creation among participants and establishes collaborations that last beyond the workshop. Throughout the workshop, we also facilitate opportunities for networking amongst participants to support collaboration.
Outcomes of Collaborations Workshop include the blog posts from the discussion groups and the collaborative ideas proposed, which feed into the hack day projects. Participants start one or two collaborations on average based on their discussions at the workshop and often carry on working on the hack day project after the event ends. Carpentries Offline is one of the projects that was born out of the Collaborations Workshop hack day and has been sustained by our SSI Fellows, as I've mentioned before. And Coding Confessions is another example project, which aims to normalize failure in research software to create an inclusive space for sharing and learning from experiences. The Institute has published an event organization guide based on how we organize and project manage events, with Collaborations Workshop being the main example throughout. The guide takes an experiential approach where it matches what we have actually done and our lessons learned, and we provide the templates that we use for the event roadmap, venue requirements, managing the event budget, a duties roster, event roles and risk management. There is an in practice section which includes detailed write-ups of how we organized the most recent Collaborations Workshops from feasibility to closing. The guide has lots of tips for online and interactive workshops, from technical setup to the program, and our next steps are looking at hybrid considerations for Collaborations Workshop 2023. If you use any aspects of the guide you can contribute an in practice chapter based on your experience or suggest updates to the text. The final activity that I want to mention is the Research Software Camps, which are led by our communications and outreach team. They are free online events which take place over two weeks twice a year, and each camp focuses on introducing and exploring a topic around research software and starting discussions among various research communities.
The camps that we have had so far have been themed around supporting mental health, next steps in coding, research software beyond the spreadsheet and research accessibility. Sessions include panels, training, workshops, guides, blog posts and social media discussions and there are many opportunities to get involved, either via the organizing committee or as presenters, mentors and experts. To give more context on how connected and embedded the Institute is in the better software landscape, you can view some of our collaborators and partners here. We are very practiced in fruitful collaborations and we are always interested in hearing about new initiatives, projects, activities and events from community members, so please do not hesitate to get in touch with us. Here are some of our contact details. Please sign up to our mailing list, and you can follow us on social media. I look forward to answering any of your questions. Thank you so much for having me. Thank you Rachel for that fantastic talk. We've had a few questions come in, but if anyone else has questions that they'd like to ask, either in the open research room or in the Q&A for the specific room for this talk, please do ask them. So the first one I think I might ask is this: Rachel, this was really, really interesting and very UK oriented, but I understand that there are also international fellows. Since this is quite an international audience, maybe you could tell us just a little bit about how that works and what you look for. Yeah, great. Thanks for that question, Jo. Last year, well, I guess it's not last year anymore, but for the 2022 fellows we started piloting an international fellowship program, and this is because we want to explore how best to scale up this program internationally for the future.
So we started out with, I think, four fellows in 2022 who are international, and we were exploring with them the bottlenecks in administration and particularly, you know, issues surrounding finance and things like that, because the research software community is quite global. We collaborate with a lot of folks internationally just as an institute, and so we wanted to begin reflecting that within our fellowship program. And so we had four international fellows in 2022, and this year for 2023 we have six international fellows, and we are collaborating of course with Open Life Science, of which you are a director, and we're very, very grateful that Open Life Science are helping us to pilot this program and to help us scale it a bit more in the future as well. Super, we don't have long left, but we have a question from Celia asking whether you're raising awareness and training about open source and licenses as part of the SSI's work. So yeah, the SSI does a lot of training. We collaborate heavily with the Carpentries community, so we teach a lot of Software Carpentry and Data Carpentry and we also do instructor training. There have been workshops in the past led by SSI around licensing in particular, I believe, and we have guides on the website, but I do believe licensing is potentially also covered in Software Carpentry. Actually, that's a really good question, but we do have guides on our website as well around that. Yes. Well timed. I think we're about to hop over to the next session, so I'm mentally sending a fantastic round of applause. Thank you so much. Yeah, I think it, yeah, it's just finished. Nice. Thank you. It really takes away some of the stress, having it pre-recorded; it's just stress beforehand rather than now. Yeah, I was thinking that. Pre-recording a talk always takes longer than you think it's going to take, but then today I'm just like, oh, I'm really glad I don't have to give a talk today.
Yeah. All right. I would love to hang in chat but I need to go and prep for the next one. Thanks so much, Jo. All right. Bye. Bye. Bye. Bye. Bye. |
Establishing the Research Software Engineering (RSE) Asia Association with the Open Life Science programme |
Hello, everyone. My name is Saranjeet and I'll be presenting a talk on establishing the Research Software Engineering Asia Association with the Open Life Science programme here at FOSDEM 2023. A bit about myself. I'm based in India, a country in Asia. I'm the lead and co-founder of the RSE Asia Association and an international fellow of the Software Sustainability Institute's 2023 inauguration. The RSE Asia Association was started based on open science principles of inclusion, fairness, equity and sharing. These principles and values have had a huge impact on how we structured the association and continue to grow in the process. A huge support in this journey was provided by the Open Life Science programme, also called the OLS programme. It is a 16-week-long personal mentorship and cohort-based training for participants interested in applying open science principles in their work and becoming open science ambassadors in their communities. Through the 16 weeks, the organizers, hosts, mentors and project leads or mentees share their expertise and gain knowledge essential to create, lead and sustain an open science project, connect with members across different projects, communities, backgrounds and identities and empower each other to become effective open science ambassadors in their communities. Participants join this programme with a project, not limited to life science, that they are either already working on or want to develop during the programme, individually or in teams. As of February 2023, Open Life Science is on the way to hosting its seventh cohort. Check their website openlifesci.org to learn more about this fantastic programme. We started the RSE Asia Association as a community building project during cohort four of the Open Life Science programme. During this cohort, we developed infrastructure like the website, the social media handle and the GitHub organization of the RSE Asia Association. Finally, we officially launched the RSE Asia Association in October 2021.
We continued the journey of building RSE Asia with cohort five of the Open Life Science programme. During this programme, we created a working group for RSE Asia as a pathway for onboarding. There was also further research done on adopting a code of conduct. We established the RSE Asia Association with a mission to promote and build the research software engineering community and profession in the Asian region while also fostering global collaborations. We continue to conduct various activities at RSE Asia. At our launch event, we encouraged our community to participate in the Hacktoberfest 2021 programme. We organized a workshop on how to write alt text for scientific diagrams in July 2022. In September 2022, we organized the first RSE Asia Australia Unconference in collaboration with RSE Australia. This unconference was well received and got positive feedback from the community members. We participated in Hacktoberfest 2022 and held co-working sessions for our community during the month of October 2022. The RSE Asia Association has grown since its launch and we have community members from different parts of the globe. Community membership of the RSE Asia Association is free and open to all. People from all regions of the globe are welcome to join our community irrespective of whether they belong to or are located in Asia. To become a community member of RSE Asia, please fill in the membership form. We are also planning to host community calls at RSE Asia. Please share your suggestions, ideas and thoughts for these with us. Thank you for joining and watching my talk. You can follow the RSE Asia Twitter handle, @RSE_Asia, to know more about the community activities. Thank you again. |
FAIRPoints |
Hi, my name is Sara El-Gebali. Today I'll be presenting FAIRPoints, the event series highlighting pragmatic measures developed by the community towards the implementation of the FAIR data principles. FAIR stands for Findable, Accessible, Interoperable and Reusable. These principles were first coined in 2016 in a landmark paper summarizing the fundamental concepts to improve the infrastructure, to enable the reuse of data and to provide guidance to enhance reusability. When I say principles, they are an effort to define the best practices for data to facilitate discovery, access and reuse by humans and machines. So in essence, FAIR is not a set of rules, it's not a standard, it's not a how-to guide; it's an evolving process and a vision. This also makes it very difficult to understand, so how do these principles translate to real life? How do they translate to solutions? And this is the question that inspired us to explore a different approach to understanding what the FAIR principles mean to the community and how they're applied in reality. In addition to that, the applications and the developments in the realm of FAIR have been evolving at an extremely rapid pace and expanding to include more aspects beyond just data. So we realized that in order to find those solutions, we need to really pool our knowledge together and learn from each other and bring in a diverse community from all over the place. And this is really where FAIRPoints comes into the picture. We offer a platform for those conversations to happen. We offer a platform to understand what the realistic and pragmatic FAIR implementations are, and our main goal is to bring together the research community, the ultimate users and producers of the data, as well as the policy and decision makers who shape research practices, and the broader research support community, the people that are going to help in the development of those solutions.
So to that end, we offer a framework for these conversations in different formats. We have community discussions where our community members come together, share experiences and identify solutions, but we also produce an output to build up a resource to disseminate this knowledge further. So essentially, we want to capture the conversation summary. But we also have keynote events where we tap into the expertise of specific researchers or folks in the field of FAIR implementations. And we highlight the solutions from those diverse fields. And again, we provide conversation summaries in the form of bite-sized reusable content that includes some practical advice and how-tos, and that can hopefully be easily adopted by others in a cross-disciplinary fashion. Most recently, we've also launched a series of Ask Me Anything events. These events are meant to bring together speakers from the Research Data Alliance and speakers from a related European Open Science Cloud service. And the aim here is to have the respective speakers describe how their work aligns, how it connects with the research community, and how it benefits the research community in a way that is directly related to the implementation of FAIR principles and open research practices. The main point here is that these events are beginner-friendly, they're laid-back and easy to follow, and they're moderator and audience driven, meaning that moderators from our community, as well as you, the audience, drive the discussion in the direction that you want through your questions. So if you're interested in joining us, please sign up to the whole series or to the specific events you're interested in. Oh, I have to go back, because we have five different themes: identifiers, FAIR software, regulatory processes, machine actionability, as well as equitable and transparent access to information and knowledge.
So please sign up to any one of those or the full series, and also to receive the recordings. And please feel free to send us your questions if you want answers. Besides these events, we also have ongoing projects such as the FAIR Open Science Forum. This is something we see as a community hub. People can come together, share information and experiences, find answers to some questions, find topics that are related to their interests, and share and present their work as well. So we want to expand the different channels for disseminating knowledge about FAIR practices. We also have an ongoing collaboration with the Machine-Centric Science podcast, where we offer the FAIRPoints Choices Challenge. This is where we ask the community a question about one of the FAIR implementation principles. And what we want to hear from you is either a challenge you faced in implementing that principle, or maybe something you've tried that worked out for you, how that choice affected your work and how it is going now. So this is really a two-minute free recording. And you can do it multiple times. You can send it in on the SpeakPipe link I'm showing here. And if you want to delve into the topic deeper, you can be our guest on one of the Machine-Centric Science podcast episodes, hosted by Donny Winston, our co-founder. So join us, and let us know how it's going for you. One thing that is really important in all of this is to include diverse voices. We need to make FAIR accessible to a broader audience. And as it develops, we need to develop it together with a global community, where we connect and collect heterogeneous input from a global perspective, supporting equitable access, and working towards advancing FAIR everywhere. We also came to learn from experience that FAIR might look different in real life for researchers in different places.
And that's why we're really keen to include relatable examples from research practices in different regions of the world. Not only do we want to extend beyond fields and disciplines, but also beyond geographical boundaries, and learn how FAIR translates into practice and what it means to our global community. What does it look like for you? So join our conversations, sign up to the community discussions and our event series, come and join us in our Slack and interact in different ways. We'd love to hear from you. And big thanks to my whole team, Chris Erdmann, Donny Winston, Nabil Ksibi and Julianne Schneider, as well as all the organizations that are supporting this work, including SciLifeLab, GO FAIR US, the San Diego Supercomputer Center, and recently the Research Data Alliance, EOSC Future, and FAIR Digital Objects. And thank you all so much for listening. I hope to see you in different formats. And if you have any questions, please feel free to ask them. Thank you. Bye. Excellent. Thank you so much, Sara, for that fantastic talk. It's great to have you here today. So we have a few questions that have come in from the audience. I think one of the first ones that we have, from Celia, is: could you explain more about what machine actionability means? Maybe give us some examples. Did we just lose Sara? I'm sure she'll be back in a moment. I'm going to try and gently fill the silence until Sara returns, because I noticed that Celia had another question saying: depending on the discipline, the way of considering data is very different, so how do you work with diverse research communities? And I can actually answer a little bit from my own personal experience, as I used to work in data integration. We found that when you are a primary data source, you have a lot more power to make data FAIR than when you work on integrating multiple different data sources.
So I think it's probably fair to say that different domains definitely treat and can address FAIR differently, and that you need to assess each one on its own merits, depending on what capabilities they have, recognizing that some people or some data sources may have more practical limits around FAIR than others. I notice also that Saranjeet asks, are there any workshops or sessions where a researcher or individual can learn more about FAIR? I will happily suggest going to the FAIRPoints website, because FAIRPoints really is all about making everything practical and approachable for people who may be applying it for the first time. And I've just noticed that Sara is still in the main channel, but can't get access as a speaker, which is a real shame. Folks, I'm going to recommend, if you can, the Open Research Tools and Technology online devroom. Please post your questions there in a written fashion so that Sara can answer, and she'll catch up as she has time. I now have two minutes of dead air to fill. Thank you for bearing with us. I'm going to read out what Sara has also said about machine actionability: what is needed to achieve machine actionability, things like semantics and metadata, and why automation is required in research. And as a researcher, I think this is a really great point: how you can create something that's machine actionable. I'll maybe rephrase that myself as the ability to allow a computer to read and manage data, rather than it just being something that a human can read and explain or process or work with. We also have a few really good links here, such as fairpoints.org. These are in the chat at the moment. You can sign up to the event series, and there's also a Slack and a newsletter where you can go and learn a bit more about FAIRPoints and how to apply FAIR. Okay, one minute left. I'm going to keep on rambling about things. Thanks for bearing with us.
One thing that I've always wondered about, or noted, is that a lot of people talk about open research and FAIR, but don't necessarily recognize that FAIR is about being findable, which doesn't require open; and accessible: you have to be able to access it, but that still doesn't mean it's open. It might be that you have to ask someone or do it some other way. Interoperable and reusable, again, don't require open. So there are times when it's appropriate for something, let's say medical data, to be private, and it can still be FAIR. Thank you, Sarah. Okay, we have another question and we still have a few seconds to read it out. Depending on disciplines, okay: we try to find commonalities between different disciplines and FAIR, Sarah says. Hi. Yay, you got back just in time for the eight seconds left. Thank you so much for doing this. Folks, please get in touch. Where'd you go back in? I'm so sorry, I missed all your Q&A, Sarah. Yeah, please feel free to send me any questions and stay in touch. I've posted some of the links and I hope I managed to answer some of your questions. All right, thank you. All right, I'm going to hop off to the next talk. Bye. |
Frictionless Application (IDE for CSV) |
Hi everyone, my name is Evgeny and I am the Tech Lead of the Frictionless Data project at Open Knowledge Foundation. Today I'd like to present our new development, called Frictionless Application. Frictionless Application is an IDE for CSV and other tabular or general data formats. This tool hasn't been published yet, so today will be more like a feature preview, but you can access the codebase on GitHub. The main purpose of Frictionless Application is data analysis, data validation, data publishing and many other aspects of working with data. Let's start with an overview of the Frictionless Data project. The Frictionless Data project helps people to publish and consume data. It's built on top of open data standards such as Table Schema, Data Resource and Data Package, which you might have heard about, and on top of these standards we're working on different software. For example, we have developed Frictionless Framework for Python, so people can describe, validate, extract or transform their data in Python or on the command line. We've developed Frictionless Repository, which is similar to Frictionless Framework, but runs on GitHub infrastructure using GitHub Actions, and provides continuous data validation. Last year we also presented Livemark, our data visualization and publishing tool. And today we'll talk about Frictionless Application, but first of all, I'll describe Frictionless Framework and Frictionless Repository to show why we need Frictionless Application. So Frictionless Framework is written in Python, and it provides a command line interface and a Python interface to describe, extract, validate and transform, including publishing, tabular data. This project has been here for a while, and we have a rich community and a lot of people using it. But it requires you to work with a command line interface or to know how to code in Python. So for many people it's just not possible to use Frictionless Framework. To solve this, we developed Frictionless Repository. It's a GitHub Action, so it works like a continuous data validation service.
If you have data published on GitHub, or you store some data-related project on GitHub, you can add the Frictionless Repository GitHub Action, and on every push to your repository it will validate your data. This solves the problem for people who are not able to use Frictionless Framework because it requires programming and knowing the command line interface. But still, it's kind of for tech people who know what GitHub is and how to create GitHub Actions. So having said all of this, I'd like to introduce Frictionless Application. It's a fully visual tool that can be published as a web application, more like a demo, and, more importantly, published as a desktop application that you can install on your computer. It's fully visual, it's for non-programmers, and our goal is to make it really easy to use, so it will be just like Excel, with a file manager like the one in Jupyter Notebook. The core feature is that for any CSV file on your computer you'll be able to see it as a table. After this you can see the metadata, what fields and columns it has, what types; you can edit it, you can add validation checks and see how it validates. In general, this project just does everything visually that our lower-level projects do for programmers and command line interface users. It makes sense to add that it's not only for tables; for example, you can edit your metadata, like Table Schemas or Data Packages. And here I'm going to show you how you can validate your data table and fix the errors. When you upload or open a data file in Frictionless Application, it provides you with a tabular view and the errors. A lot of data tables in CSV have errors; they are shown in red, there is a description of what each error is, and you can just edit the cells like in Excel, clean your data and save it.
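To make the idea of validating a table against a schema concrete, here is a minimal sketch using only the Python standard library. It is not how Frictionless Framework or Frictionless Application is implemented; the schema dictionary only loosely follows the shape of a Table Schema, and the field names, sample data and `validate_csv` helper are all hypothetical.

```python
import csv
import io

# Hypothetical schema, loosely shaped like a Table Schema descriptor.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "city", "type": "string"},
        {"name": "population", "type": "integer"},
    ]
}


def _is_valid(value, field_type):
    """Check one cell; only the 'integer' type is actually checked in this sketch."""
    if field_type == "integer":
        try:
            int(value)
        except ValueError:
            return False
    return True


def validate_csv(text, schema):
    """Return a list of (row_number, field_name, message) tuples, one per error."""
    errors = []
    fields = schema["fields"]
    expected = [field["name"] for field in fields]
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != expected:
        errors.append((1, None, f"header {header} does not match {expected}"))
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(fields):
            errors.append((row_number, None, "wrong number of cells"))
            continue
        for value, field in zip(row, fields):
            if not _is_valid(value, field["type"]):
                message = f"{value!r} is not of type {field['type']}"
                errors.append((row_number, field["name"], message))
    return errors


# Made-up sample data: row 3 has a bad integer, row 4 is missing a cell.
SAMPLE = "id,city,population\n1,London,8908081\n2,Paris,n/a\n3,Rome\n"
ERRORS = validate_csv(SAMPLE, SCHEMA)
for error in ERRORS:
    print(error)
```

Each error carries a row number and a field name, which is the kind of information a visual tool needs in order to highlight a specific cell in red and show a description next to it.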
Okay, so the goal for our beta release is also to support creating charts from data tables, and here we use our beloved Vega notation, or Vega-Lite. The idea is that it will be possible to make a lot of the Vega-supported charts using data from your tables, so you'd be able to set what column is used for the x-axis, what the transform will be, what the type of this axis is, etc. Currently it's under development, but the goal is to have it. Of course, it would be an incomplete data-related application if it weren't for SQL features. So yes, in Frictionless Application it's possible to query your table or data. Whatever format it had, whether it was CSV or Excel or whatever, it will be indexed in a SQLite database, so you will be able to query your data, and not only individual tables but also with more complex queries joining different data tables. So basically it will be just a SQL interface to the data files on your computer. Okay, let's say we're done with validating, editing and analyzing our data tables. What we can do in Frictionless Application is pack everything as a data package. Data Package is an open standard for describing a collection of files, and it was and is a cornerstone of the Frictionless Data project. So in Frictionless Application you can create a data package and fill it with your files, and then there will be a feature for publishing the data. We are currently working on GitHub, Zenodo and CKAN targets for publishing a data package as a dataset; for example, a set of files, a collection of files, will become a Zenodo dataset. These features are already implemented in Frictionless Framework, and here we're just working on the visual interface for them. I think it's very important to provide this visual way to publish data for non-coding people.
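The SQL feature mentioned above, where tabular files are indexed into SQLite so that they can be queried and joined, can be sketched with the Python standard library alone. This is only an illustration of the general technique, not the application's actual indexing code; the table names, columns and `load_csv` helper are hypothetical.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, text):
    """Create a table named after the file and insert every CSV row as text."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


# Two made-up "files" standing in for CSV tables on disk.
CITIES = "id,name,country_id\n1,London,1\n2,Paris,2\n3,Lyon,2\n"
COUNTRIES = "id,name\n1,United Kingdom\n2,France\n"

conn = sqlite3.connect(":memory:")
load_csv(conn, "cities", CITIES)
load_csv(conn, "countries", COUNTRIES)

# A query joining the two tables, as one would across two data files.
RESULT = conn.execute(
    "SELECT cities.name, countries.name FROM cities "
    "JOIN countries ON cities.country_id = countries.id "
    "ORDER BY cities.name"
).fetchall()
for city, country in RESULT:
    print(city, "is in", country)
```

All values are stored as text here for simplicity; a real indexer would presumably use the field types from the table's schema when creating the columns.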
Okay, I think that's it for my presentation. As I mentioned, Frictionless Application is approaching its beta release in a few months, so please stay tuned. I think we will also be presenting it at csv,conf in Argentina this year, and when it's released, any feedback, ideas and suggestions will be really amazing for us and will be helpful. Also, just a side thing: don't forget about helping the people of Ukraine; I'm just leaving a link here. And that's it. Thanks for your time, for your attention, and have a great FOSDEM. Thank you. Thank you, Evgeny, for this presentation. Now it's time for the Q&A, and it's live. We had some questions; I know that a lot of people were waiting for the new app and the presentation. The first question is from Paul: can you describe the design process for the user interface and user experience of the application, please? Yeah, first of all, thanks for your words, and I'll probably disable my camera because it sometimes sounds weird, so sorry for that. So yes, let me put it this way. Frictionless Data is a project with a history already, but we have always been limited regarding resources, like many open source projects funded by grants and helped by the community. So our goal is just to keep things really simple, and for the application we use a really standard layout and standard ways to show the information, as I mentioned, like a file manager and Excel-type table views. We also used a really high-level JavaScript library, Material UI, to eliminate all the low-level design decisions. So it just looks like, I don't know, Gmail or Google Drive; everything is just based on the Material UI style. I'm sorry, you muted me. Thank you for your answer, Evgeny. Another question, one that we discussed before the conference: she really likes this new version and finds it really useful, and when, for you, would you use Frictionless, and when might you choose, for example, to use OpenRefine?
Thanks for the question. I think you would use Frictionless when your work is, first of all, tied to the Frictionless standards, and I hope, and at least partially know, that more and more people are starting to use the Frictionless standards in their work, Data Package and Table Schema, and Frictionless Application is just the most natural way to work with these standards. Regarding OpenRefine, of course, if you are looking for an already established and well-supported solution with a lot of plugins etc., for now you will of course choose OpenRefine. But I'll just suggest trying Frictionless Application at some point and seeing if it's more modern and provides more features, maybe unique features. It's not fair to ask me this, because, yeah. And another question, a really precise one, from Paul: the table view is planned to scale to how many rows? So regarding the visual interface, it will be just a data frame to the database, with pagination, and it can work for whatever size your local database created by Frictionless Application can scale to. We are currently working on the web version of the application; the next step will be publishing it as a desktop application, and from then the limit will only be local memory. I'll say that Frictionless Data works best for small and middle-sized data sets, and for middle-sized data sets, say a million rows, it will be good, because it's just a data frame, so it's not like, for example, Excel, where you are limited in the number of rows. Okay, perfect. Another question, from Oleg: the publication to Zenodo, GitHub and CKAN looks great; any thoughts about plugging in other targets?
Thanks for the question and for the kudos. And yes, as the application just creates a visual interface on top of Frictionless Framework, for Frictionless Framework we have feature requests for other data targets, and I lately did some research to add others, try it and you can, etc. So we'll be happy to add new targets, and it's only a question of resourcing, once we have an established version of the application. Currently, having three of them, CKAN, GitHub and Zenodo, gives us a chance to check the whole thing; it's used like a proof of concept, and once it's tested and it works, it will be really easy to add others. Okay, so that is for the future, perfect. From Paul again: can you describe the software stack for the Frictionless Application? Thanks, Paul. Yes, so again, as mentioned, like any open source project we have limited resources, and Frictionless Application is basically a wrapper around Frictionless Framework. I'm not sure what exactly the question is, because for such deep software, from the visual interface to the low-level stuff, we use a lot. Regarding the big parts, Frictionless Application is a wrapper around Frictionless Framework, just a kind of client, with Frictionless Framework as the server. We will also publish a Frictionless API at some point, so people will be able to use in Python the server we use in the application, which may be useful for creating in-house solutions for data validation. Aside from the Frictionless parts, it's React, Material UI, and Zustand for state management. But sorry, I'm not sure if that's really the information wanted or if it's too low level. I think people can also watch your live stream later and may be able to get this kind of precise answer, so that's perfect. But I have another question, about the table editor check: does it check all errors from data validation, including the foreign key constraints? Currently, honestly, it doesn't, because it's
just incomplete, but yeah, the goal is to be 100 percent aligned with the standards, and if the Data Package standard says that foreign keys should be validated, they will be validated in Frictionless Application. So for the beta release, that is definitely the goal. Okay, I'm checking if there is any other question. No. I have a question that is more about the development of Frictionless and the story. You have been working on this project for a long time. Can you maybe tell us about the challenges that you faced at the different steps of the development? And, since this is especially about open source research, what is for you the advantage of open source, but also maybe the pros and cons of doing this open source development? Sorry, I didn't hear the first part. Yes, it's about your experience: you have been on this project for a while, so what are the challenges that you faced? Yeah, so it's a good question for the six minutes we have; maybe it's one for the next few hours as well. So I'll try to say something less usual, maybe something that will be more interesting than saying that resources are limited, etc. For us, this might be interesting, although it's also probably kind of a usual thing: from the beginning of the project, the initial idea, a great idea created by Rufus Pollock, of these Frictionless standards, was of course too general, because it was an idea. And during the life of the project, I would say, at least for the technical part, because initially I was leading only the software and now I'm leading the project in general, I see it as trying to pick the good parts and remove the bad parts, just trying to figure out what the really useful thing from Frictionless is that we can provide, and what's just this, you know, 80/20 percent thing. So yeah, I would maybe suggest that in your open source project you start as early as possible with this elimination of things that
are not on your critical path to being useful. Because, yeah, I can say a lot about the usual things, about documentation, contributions, pull requests not being synchronized among contributors, etc., but it's the same for every open source project. And maybe I have a question related to this topic. For open source, we are discussing the sustainability of projects a lot. So for you, what are the main elements that help to have sustainability for your project, and in the next steps? Frictionless Data was supported for a long time by the Sloan Foundation, which worked really well, thank you, and currently it's a core Open Knowledge project supported by the Open Knowledge Foundation. But still, there is an ongoing discussion of sustainability. A project like this is not widely used enough, unlike, for example, webpack, which is now fully funded by, I think, Open Collective and just outside contributors. Only a few projects like that can live just on donations. So in our domain, I think a project like this still needs to build collaborations with, hopefully, some NGOs and also high-level data projects, and start providing some customizations and tailored versions of the software, to provide the resources for core development. That's just what I think. Perfect. I'm checking if there is any other question for now. We are at the end of a full day of conference in Brussels, so I think there are fewer people connected right now. But do you have a last comment about the next steps for your project that you explained in your presentation, or a take-home message, last words to conclude our live stream Q&A? Yeah, thank you. So the next step will be publishing the beta release, as I said, and we're planning to do so in a few months, in March or April, and we're targeting csv,conf
in Argentina, and it will be great if some of our listeners can join us there in Argentina. I hope we'll be doing a more live version of the presentation there, with all the features already working. So that's the short version. Evgeny, thank you so much for your time and your presentation. Yeah, thanks a lot. Thank you. |
Papis: a simple, powerful and extendable command-line bibliography manager |
Hello, FOSDEM. My name is Alejandro and today I am going to talk about Papis, a simple, powerful and extendable command line bibliography manager that I have been developing during the last 7 years. I will be explaining some of the main considerations of the project and demoing some of its basic use cases. First of all, let me introduce myself. I currently work as a physicist at the Technical University of Vienna in Austria. We develop massively parallel algorithms in order to calculate properties of molecules and solids from a theoretical point of view. You can find me on Mastodon or around the web. Don't hesitate to contact me. So, what is Papis? Papis started as a simple bibliography manager built around the command line. It should make it possible to manage papers or books at scale or for small curated libraries. It is therefore important to implement a simple data model and use an approachable programming language, such as Python, so that users can interact easily with Papis' many features. In addition, Python also encourages contributions from researchers in the academic world, since nowadays many researchers are exposed to this language. Papis strives to be and build a community, and various plugins have appeared thanks to the community. There are plugins for the major text editors, such as Neovim and Emacs, and partial support exists for VS Code and Vim. Additionally, lately we have been working on the web application for Papis, and I will be showing some of its features in this talk. But you are asking yourself, why Papis? We think that it should be possible and simple to perform complex tasks on a whole library. This is made possible through a rich command line interface. You can add papers from a DOI or from a variety of websites supported by Papis. You can explore sources like Crossref from the command line, or download information about the citations of a publication, or check which publications cite the current publication.
You can take notes that play well with tools like Vim or Emacs Org mode. You can version control your documents and export to the most common formats. You can spend countless hours curating and improving your library's notes, metadata, and PDF documents without fearing losing your data to an API change or the end-of-life of Papis, since your data is stored in a very simple but flexible format. I want to emphasize the fact that one of the main goals of Papis is enabling the user to be independent of Papis itself. A researcher, academic or not, spends an enormous amount of time searching, reading and annotating publications. For us Papis maintainers, it is important that a person comfortable with any scripting language should be able to retrieve the totality of their Papis data by writing a script in an afternoon. In order to accomplish this, an extremely simple library structure was chosen. The library structure relies on having one folder per library document. This means, for instance, in the case of the shown publication of Turing, that the folder includes a YAML file containing the metadata information of the publication and an additional PDF file with the publication itself. In this example library, we would have an additional document under the folder 1-document, where we find two PDF files in this case. A document in a Papis library is any folder containing a YAML file entitled info.yaml. The contents of the YAML file are in principle up to the user to determine. However, in practice, there are some conventions used in Papis. Inside the info.yaml file, the key files contains a list of related files in the document's directory. These files might be PDF files or any other kind of files relevant to the document. In the case of the Turing publication, files therefore lists a single PDF document, paper.pdf.
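Because a document is just a folder containing an info.yaml file, the "script in an afternoon" idea can be illustrated with nothing but the Python standard library. The sketch below is not part of Papis; the folder names, the placeholder file contents and the `find_documents` helper are hypothetical, and it only locates document folders without parsing the YAML (that would need a YAML library).

```python
import tempfile
from pathlib import Path


def find_documents(library_dir):
    """Return the sorted folders under library_dir that contain an info.yaml file."""
    return sorted(path.parent for path in Path(library_dir).rglob("info.yaml"))


def build_demo_library(root):
    """Create a throwaway library with one document folder and one unrelated folder."""
    turing = Path(root) / "turing"
    turing.mkdir()
    # Placeholder metadata; real info.yaml files hold whatever keys the user chooses.
    (turing / "info.yaml").write_text(
        "title: Example title\nfiles:\n- paper.pdf\n", encoding="utf-8"
    )
    (turing / "paper.pdf").write_bytes(b"%PDF-placeholder")
    # A folder without info.yaml is not a document.
    (Path(root) / "not-a-document").mkdir()


with tempfile.TemporaryDirectory() as root:
    build_demo_library(root)
    FOUND = [path.name for path in find_documents(root)]

print(FOUND)
```

The same walk works in any scripting language, which is the point of the layout: the data stays reachable even without Papis installed.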
The key ref is used for exporting BibTeX files and is the reference of the document when using bibliographic tools outside of Papis. The YAML key type is also used for BibTeX exporting and is the type of document, whether a book, an article, a monograph, etc. There is also built-in support for tags, which may be added as a list of space-separated keywords. We chose the YAML format due to its ease of writing and reading, and because most programming languages are provided with libraries that can read these files. Of course, given the simplicity of the library model, it is possible to write a crude finder with just the Unix grep and find commands. All functionalities in Papis can be customized through a configuration file in the INI format. Papis can define multiple libraries through the configuration file, and all Papis settings can be independently configured for each library. You can define default settings under the settings section, which will be common to all libraries. A library is simply defined as a section with a dir key, which contains the path to the library directory containing all documents. You can then customize this library, in this case a library named Papis, and set the default opener tool to the PDF viewer Evince. If you happen to want an additional library of books holding mostly EPUB-formatted books, you could define the opener to be Calibre instead. You can read about all the configuration settings in the documentation page, where you will see a description of their function and their default values. With this introduction, let us now take a look at a common workflow to add an article from a journal page. Here is a common view of an article in a browser. We can see lots of information, and the easiest way of adding this article to Papis will be by locating the DOI of the article in the page. In this case, we locate the DOI in the URL of the article, and we copy it to our clipboard to paste it in the terminal.
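As an aside on the INI configuration described earlier in this part of the talk, here is a hedged sketch of what such a file could look like and how a script might read it with Python's standard `configparser`. The section layout (a settings section plus one section per library with a dir key) follows the talk; the exact key names `default-library` and `opentool`, the paths, and the `load_libraries` helper are assumptions for illustration and should be checked against the Papis documentation.

```python
import configparser

# Illustrative configuration; key names other than "dir" are assumptions.
CONFIG_TEXT = """
[settings]
default-library = papis

[papis]
dir = ~/Documents/papis
opentool = evince

[books]
dir = ~/Documents/books
opentool = calibre
"""


def load_libraries(text):
    """Return {library_name: options} for every section that defines a dir key."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if parser.has_option(name, "dir")
    }


LIBRARIES = load_libraries(CONFIG_TEXT)
for name, options in LIBRARIES.items():
    print(name, "->", options["dir"], "opened with", options["opentool"])
```

The settings section is skipped by the helper because it has no dir key, which mirrors the rule from the talk that a library is any section that defines one.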
The command for adding a paper is papis add, and papis add comes with quite a lot of options. In general, when adding a document, Papis will try to download metadata from various sources and, if possible, download PDF documents, if they are freely and legally available. Here, we see that I am using the edit flag. This flag instructs the papis add command to open the editor with the info.yaml file before adding the document to the library. Similarly, the open flag instructs the command to open the attached files, if any, before adding the document to the library. We are also telling the command, through the from flag, to retrieve information exclusively from the DOI. We can also preset some metadata through the command line. In this case, we are adding the tags classics and DFT. Let's go ahead and run the command. Papis will now try to download metadata and a PDF file from online sources. In the current configuration, we are greeted with an interactive prompt to add, split or reject the metadata retrieved from Crossref. We choose to accept the metadata. The interactive session now shows us a retrieved PDF document and asks us if this is the document that belongs to the publication. At this point, we can inspect the document, and we realize that we indeed want this PDF file, so we press Y. Now all the information is in place, and we can see a preliminary version of the info file, since we passed the edit flag. We can see that a lot of information could be retrieved, detailed author list information, volume and pages, among others, and our tags have found their way into the YAML file correctly. A confirmation prompt subsequently appears, since we passed the confirm flag to the command. We agree to it and, therefore, the document gets added to the library. We can now fetch information about the publications cited in this article.
The command for this is citations, and we pass to it the fetch citations flag, which first checks for information in our library and then heads to Crossref to retrieve relevant information about the references appearing in our newly added document. If we now open the directory where the document has been stored, we see that the PDF file has been correctly stored alongside the info.yaml file and the newly generated citations.yaml file. If we inspect the citations file, we see that it is in the format of a list of YAML documents, where every element, separated by three dashes, represents bibliographic information about one of the citations. This can be used for scripting, for browsing the citations, or for easily visualizing them through the web application. This demo will show how to leverage the Papis API in Python to write one of the simplest scripts you can write. You can find more information in the documentation, together with other, more complex example scripts. First of all, let us add a bigger library to our demo. For this, we need to edit the configuration file and add an additional library. After adding the library, we can list the directories with the list command, which shows us the interactive interface to select documents. Most Papis commands accept a query argument as an input. In this case, we can query for documents whose author includes the string Einstein. We can also use the all flag to apply a Papis action to all documents matching the query, in this case listing the full paths of the folders. Other commands like open, edit, or update work in a similar fashion. Next, we will write a simple Python script to scan all the documents in the library and add a tag to a document whenever the substring this appears in the title of the document. To do this, we can use the Papis API submodule, and we can obtain all documents in the current library with the function get_all_documents_in_lib.
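A minimal sketch of the kind of script being described here follows. The tagging logic is written as a plain function over dictionary-like documents so that it can be tried without Papis installed; the tag name, the sample documents and both helper functions are hypothetical. The Papis part, `papis.api.get_all_documents_in_lib()` and the document `save()` method, is what the talk refers to, but treat the exact signatures as an assumption to verify against the Papis documentation.

```python
def add_tag_if_title_matches(doc, substring, tag):
    """Add tag to a dict-like document if substring is in its title.

    Returns True when the document was modified. Tags may be stored either as
    a list or as a space-separated string, so both forms are accepted.
    """
    title = str(doc.get("title", ""))
    if substring not in title:
        return False
    tags = doc.get("tags", [])
    if isinstance(tags, str):
        tags = tags.split()
    if tag in tags:
        return False
    doc["tags"] = list(tags) + [tag]
    return True


def tag_library(substring="this", tag="demo-tag"):
    """Apply the tagging to the current Papis library (requires Papis)."""
    import papis.api  # assumption: provides get_all_documents_in_lib()

    for doc in papis.api.get_all_documents_in_lib():
        if add_tag_if_title_matches(doc, substring, tag):
            print("would tag:", doc["title"])
            # doc.save()  # left commented out so the library is not modified


# Stand-in documents so the logic can be exercised without a library.
DOCS = [
    {"title": "What is this thing called science?"},
    {"title": "Relativity", "tags": "physics"},
]
CHANGED = [add_tag_if_title_matches(doc, "this", "demo-tag") for doc in DOCS]
print(CHANGED)
```

Calling `tag_library()` is the step that would touch a real library; it is deliberately not invoked in the sketch, and the save call stays commented out.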
Next, we loop over all documents, and we deal with each document as if it were a Python dictionary. The method save saves the document. I will comment out the save call since I don't want it to overwrite the library. Let's run the script and see that it works. And indeed, it works. The last demonstration will concern the web application. The web application is quite useful if you would like to self-host Papis or access it from a portable device. We can run the web application using the serve command, to which we can pass a port, 8888. Directing our browser to the URL localhost colon 8888, we see the starting page of the web application, where we are presented with a simple query prompt. Other pages include listing all the documents in the library, listing all the tags, and browsing a different library. Let us again enter the author Einstein query into the prompt. The result page includes a handy timeline with the results of the query and a simple multi-line list of the results. In this timeline, we can see, for instance, directly the annus mirabilis of Einstein, together with a couple of other publications further to the right. We could click on a title in the timeline and go to the respective document page. In the results for a document, we see a left block with some basic information and the PDF links. On the right-hand side, we see the citations, references, and several external links for the document. Let us look for the first paper we added at the beginning of this presentation. It is worth noting that we can click on the tags of the documents to get the results for the given tags. If we click on the arrow, we will navigate into the document page. The red notifications advise us of small problems with the data in our document. However, I will not fix those now. The document page is a multi-tab page, where the first tab presents most of the information of the document in the fashion of an HTML form.
Additionally, we have access to the raw info file, where we can modify and overwrite its contents. We have added a BibTeX tab for LaTeX users. This document has a single file attached, and we can preview it in the browser thanks to the library PDF.js. We can also download the document or open the document in a new window. In the next tab, we can visualize the citations file that we generated previously. This tab also has a timeline like the search results, and the documents with the green reference indicate that these documents exist in our library and we can open them. Let us open this article page. For this article, we have also generated citations, but we can also use the Harvard ADS service. In the case of articles citing the current article, we have not generated this file, and therefore we get an embedded page from ADS by default. In the last tab, we can edit the notes from the browser. Furthermore, clicking on the tags and library pages, we can see what these interfaces look like. Thank you very much for your attention. For further information, visit the project's page over at GitHub. Of course, Papis is only alive because of its community. I would like to thank all the users and contributors over the years. I would especially like to thank the co-maintainers of Papis, Alex Fikl and Julian Hauser, for their hard work in the last year. I hope you enjoyed the presentation, and I'll be answering your questions shortly. Fantastic. Thank you so much for that really quite interesting talk, Alejandro. I really felt quite inspired, thinking, wow, can I run this with my own publications as a way of collating stuff and also sharing it with the world? It's always nice when you watch a talk and you immediately think, yes, I'm going to use this as well. So I have a few questions here. I think my first one, perhaps, might be: I have historically used Zotero. I had to think very carefully not to say no to that.
Is it easy for me to migrate, if I was inclined to migrate, from Zotero or other plugins? Over the years, quite a lot of people have developed some plugins for interfacing Zotero and Papis. You can simply export and create Papis libraries, but I'm also aware of some people that actually use both. So they have a workflow to export this dynamically the whole time. So it is in principle compatible, and there are a couple of projects that do this. This is coming from the community. Thanks. That's actually really appealing to me, because whilst I was watching the way you added with the DOI, I thought that was really cool, but there are still like seven command line flags, and I'd really like the button that says add this to Zotero. Yeah. So the thing is, yeah, I should maybe have given some more examples of adding documents. In principle, it's also possible to add a document just by the URL. There is some automatic recognition in Papis, so most URLs are recognized, and it could in this case even be an arXiv URL, or within the HTML page. I noticed that there is a tool to use. Sorry. Your connection has gone just robot enough. Testing, testing. Do we still have you now? Yes, that is much better. Could you maybe just repeat your last two or three sentences, because it was just a bit hard to hear. Sorry. So, Zotero has also implemented quite a lot of metadata fetchers for many sources, and there is also a project that tries to reuse these metadata fetchers from Zotero for use in Papis, and maybe also in the web application of Papis. So this might also happen in the future. But in general, it's much easier to add documents than what I showed in the video. Cool. Thanks. It's also really nice to hear how interoperable y'all are. So we have a couple more questions in the chat. Paul asks, does the YAML format follow bibliographic standards of any type?
So we try to use most of the BibTeX keywords when they are applicable, and in general the YAML format is really free for the user to use. So you might want to use a particular convention in your YAML files, but the keywords are mostly motivated by BibTeX. That's the only one. It still sounds like there's some decent interoperability there, which is really nice. Celia asks, who are the main users of Papis in terms of discipline, students, researchers, etc? Well, that's a good question. A lot of biophysics and biosciences, so bioinformatics, I know quite a lot of people that use it. Physics, mathematics and computer science, I would say. These are the ones. But for instance, Julian is one co-maintainer and he's a philosopher. So it really helps. Of course, you have to be a little bit acquainted with the command line. Hopefully through the web application this will change in the future. But it's in general people that really care about their libraries, the metadata that they have in their libraries, and they really want to have a very clear representation of their data. They don't want some upstream database stored somewhere. So they really want everything in plain text. Yes, I think you demonstrated so beautifully how accessible your own data is, and it's surprisingly rare. Do you have any trainings for Papis so that people who maybe are a bit less confident could learn more about it? Sadly, not right now. Maybe that's something, if enough people are interested, that we could certainly look into. But we have the discussions on GitHub. So quite a lot of people ask questions there. So there are also frequently asked questions there. And we also have a Zulip chat. We are also on Libera, but right now not so many people are there. And yeah, so just drop by and ask whatever you want. Super, thank you. So we have about 45 seconds left. That's time to squeeze in one last question. So we have one here from Paul.
And we've got some love for the timeline, which I agree with. I was like, oh, I want that. Do you plan any other visualizations, like maybe publication networks from citation data? Yes, actually, yes, because I realized I really like these visualizations. I plan some with the citations, some trees and stuff like this. But I would like to have more feedback from users to really know what's really sensible and useful. Thanks. That's a great point. So we have three seconds left. Thank you so much. Thank you. Thank you, all of you. All right, I think we're off the live stream. I'm so going to be going back and, like, setting my own Papis up with a web server. Thank you so much. Yeah, thank you. Thank you. Okay, I'm going to hop to the next talk. |
Research at the service of free knowledge: Building open tools to support research on Wikimedia projects |
Hello everyone, I'm Martin Gerlach, I'm a senior research scientist in the research team at the Wikimedia Foundation. First of all, I want to thank the organizers for the opportunity to present here today. I'm very excited to share some of our recent work around building open tools to support research around Wikimedia projects. Before going into the details, I want to provide some background around what is Wikimedia and its research team. I want to start with something that most of you are probably familiar with, Wikipedia, which is by now the largest encyclopedia in the history of humankind. Wikipedia, together with its sister projects like Wikimedia Commons or Wiktionary, are operated by the Wikimedia Foundation. The Wikimedia Foundation is a nonprofit organization and has a staff of around 600 employees. It provides support to the communities and the projects in different ways, but it's important to know that it does not create or modify the content and it does not define or enforce policies on the projects. One of the teams at the Wikimedia Foundation is the research team, and we are a small team of eight scientists, engineers, and community officers, and we work with collaborators from different universities to do research around Wikimedia projects. These activities can be grouped in roughly three main areas. The first one is to address knowledge gap, so what content is missing or underrepresented? One example of this is the gender gap. Second is to improve knowledge integrity, that is making sure the content on the projects is accurate, can think of vandalism or misinformation or disinformation. The third aspect is growing the research community, that is empowering others to do research around the projects. Today I want to focus on the activities in this last area, specifically I want to present four facets in which we have been contributing towards this goal, that is around data sets, tools for data processing and building machine learning APIs. 
Finally, I want to conclude with how developers or interested researchers can contribute to these three areas. So let's go. The Wikimedia Foundation already provides many different data sets, most notably the Wikimedia dumps of the content, but also data sets containing information about edits and page views of articles. This is public and openly available, and it's used by many researchers as well as developers to build dashboards or tools for editors. However, working with this data might still prove very challenging for people who might not identify as Wikimedia researchers, or for someone lacking the expertise about database schemas, which data is where, or how to filter it. Therefore, we try to release clean and pre-processed data sets to facilitate that. One such example is the Wikipedia image caption data set. This is a clean and processed data set of millions of examples of images from Wikimedia Commons with their captions, extracted from more than 100 language versions of Wikipedia. The background is that many articles on Wikipedia are still lacking visual content, which we know is crucial for learning. Creating text for these images increases accessibility and enables better search. So with the release of this data, we hope to enable other researchers to build better machine learning models to assist editors in writing image captions. In this case, we did not just release the data, but provided it in a more structured form as part of a competition with a very specific task. The idea was to also attract new contributors through this structure: researchers could find examples of the types of tools that could be useful for the community, and experienced researchers outside of Wikimedia could easily contribute their expertise. And for new researchers it is an easy way to become familiar with Wikimedia data.
The outcome of this was a Kaggle competition with more than 100 participants and many open source solutions for how to approach this problem. This was just one example of the data sets that we release, and I just want to highlight that there are other cleaned and processed data sets we are releasing around quality scores of Wikipedia articles and around readability of Wikipedia articles, and there are also upcoming releases around geography of readers using differential privacy. In the next part, I want to present how to work with all this data. We always aim to make as much of the data as possible publicly available. However, that doesn't necessarily mean it is accessible, because it might still require a lot of technical expertise to effectively work with this data. Therefore, we try to build tools to lower the technical barriers. And here I want to present one such example related to the HTML dump data set. What is this? This is a new dump data set available since October 2021, and it's now published and updated at regular intervals. It contains the HTML version of all articles of Wikipedia. Why is this so exciting? When we are using the traditional dumps, the content of the articles is only available in the wikitext markup. This is what you see when you edit the source of an article. However, what you see as a reader when browsing is not the wikitext markup; the wikitext gets parsed into HTML. The problem is that the wikitext does not explicitly contain all the elements that are visible in the HTML. This comes mainly from the parsing of templates or infoboxes. This becomes an issue for researchers studying the content of articles, because they will miss many of the elements when only looking at the wikitext. One example of this is when looking for hyperlinks in articles. One study by Mitrevski looked at counting the number of links in articles and found that the wikitext contains less than half of the links that are visible in the HTML version seen by the reader.
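The gap between wikitext and rendered HTML described above can be illustrated with a small sketch. This is not the study's method and not the Wikimedia tooling; it uses only the Python standard library, and the two sample strings are hypothetical stand-ins for the same article, once as wikitext (where the infobox is a single template call) and once as rendered HTML (where the template has been expanded into a table containing links).

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: the same short article as wikitext and as rendered HTML.
WIKITEXT = "{{Infobox person|name=Ada}} '''Ada''' was a [[mathematician]]."
HTML = (
    '<table class="infobox"><tr>'
    '<td><a href="./London">London</a></td>'
    '<td><a href="./Charles_Babbage">Charles Babbage</a></td>'
    "</tr></table>"
    '<p><b>Ada</b> was a <a href="./Mathematician">mathematician</a>.</p>'
)


def count_wikitext_links(text):
    """Count explicit [[...]] links; links produced by templates are invisible here."""
    return len(re.findall(r"\[\[[^\[\]]+\]\]", text))


class LinkCollector(HTMLParser):
    """Collect the href of every <a> element in the rendered page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def count_html_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return len(collector.links)


print(count_wikitext_links(WIKITEXT), count_html_links(HTML))
```

In this toy case the wikitext exposes one link while the HTML exposes three, because two of them only appear once the template is expanded.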
So we can conclude that researchers should use the HTML dumps because they capture the content of the article more accurately. However, the challenge is how to parse the articles in the HTML dumps. This is not just about knowing HTML; it also requires very specific knowledge about how the MediaWiki software translates different wiki elements and how they will appear in the HTML version. Packages exist for wikitext, but not for HTML. Therefore, this is a very high barrier for practitioners to switch their existing pipelines to use this new dataset. Our solution was to build a Python library to make working with these dumps very easy. We called it mwparserfromhtml, and it parses HTML and extracts elements of an article such as links, references, templates, or the plain text without the user having to know anything about HTML and the way wiki elements appear in it. We recently released the first version of this. This is work in progress. There are tons of open issues. So if you're interested, contributions from anyone are very, very welcome to improve this in the future. Check out the repo on GitLab for more information. As a third step, I want to present how we use these datasets in practice. I want to show one example in the context of knowledge integrity, in order to ensure the quality of articles in Wikipedia. There are many, many editors who try to review the edits that are made to articles in Wikipedia and try to check whether these edits are okay or whether they're not okay and should be reverted. The problem is there are a lot of edits happening. So just in English Wikipedia, there are around 100,000 edits per day to work through. And the aim is, can we build a tool to support editors in dealing with the large volume of edits? Can we help them identify the very bad edits more easily? And this is what we do with a so-called revert risk model. What is this?
So we look at an edit by comparing the old version of an article with its new version. And we would like to make a prediction whether the change is good or whether it is a very bad edit and should be reverted. How we do this is we extract different features from this edit. So which text was changed, were there links that were removed, were there images that were removed, and so on. And then we build a model by looking into the history of all Wikipedia edits and extracting those edits which have been reverted by editors, and we use that as a ground truth of bad edits for our model. And the resulting output is that, for each of these edits, we can calculate a so-called revert risk. A very bad edit will have a very high probability, a very high risk of being reverted. And this is what our model will output. And our model performs fairly well. It has an accuracy between 70 and 80%. And I want to mention that we consider this OK. It does not need to be perfect. The way our model is used is that these scores are surfaced to help editors identify which edits they should take a closer look at. Similar models for annotating content of articles exist. We have been developing these types of models. In addition to knowledge integrity, which I presented, we have been trying to build models for easily finding similar articles, for automatically identifying the topic of an article, for assessing its readability or geography, for identifying related images, et cetera. I only want to briefly highlight that the development of these models is rooted in some core principles to which we are committed. And this can create additional challenges in developing these models. Specifically, in this context I want to highlight the multilingual aspect: we always try to prefer language-agnostic approaches in order to support as many as possible of the 300 different language versions of Wikipedia.
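The revert risk idea described above (compare old and new revision, extract features, learn from edits that were historically reverted) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Wikimedia model: the two features, the tiny `HISTORY` list and all function names are hypothetical, and a plain logistic regression trained by gradient descent stands in for whatever the production model uses.

```python
import math
import re

WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")


def edit_features(old, new):
    """Features of one edit: bias term, fraction of text removed, links removed."""
    removed_links = max(len(WIKILINK.findall(old)) - len(WIKILINK.findall(new)), 0)
    shrink = max(len(old) - len(new), 0) / max(len(old), 1)
    return [1.0, shrink, float(removed_links)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(history, epochs=1000, lr=0.5):
    """Fit logistic regression weights on (old, new, was_reverted) triples."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for old, new, reverted in history:
            x = edit_features(old, new)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - reverted
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights


def revert_risk(weights, old, new):
    """Probability-like score in [0, 1] that an edit would be reverted."""
    x = edit_features(old, new)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical ground truth: 1 means editors reverted the edit, 0 means it stayed.
HISTORY = [
    ("A [[cat]] is an [[animal]].", "A [[cat]] is a small [[animal]].", 0),
    ("Paris is the capital of [[France]].",
     "Paris is the capital and largest city of [[France]].", 0),
    ("A long sentence with some redundant redundant words.",
     "A long sentence with some redundant words.", 0),
    ("The [[Moon]] orbits [[Earth]].", "lol", 1),
    ("[[Water]] boils at 100 degrees.", "boils", 1),
]

weights = train(HISTORY)
risk_blanking = revert_risk(weights, "[[Rome]] is the capital of [[Italy]].", "")
risk_addition = revert_risk(weights, "Rome is old.", "Rome is a very old city.")
print(round(risk_blanking, 2), round(risk_addition, 2))
```

The score is only meant to rank edits for human attention, which matches the point made in the talk that the model does not need to be perfect to be useful.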
I want to conclude with potential ways in which to contribute in any of these three areas that I mentioned previously. Generally, one can contribute as a developer to MediaWiki or other aspects of the Wikimedia ecosystem. And there, the place to get started is the so-called developer portal, which is a centralized entry point for finding technical documentation and community resources. Not going into more detail here, I want to give a shout out and refer to the talk by my colleague Slavina Stefanova from the developer advocacy team. But specifically in the area of research, I want to highlight a few entry points depending on your interest. In case you would like to build a specific tool, there is the Wikimedia Foundation's Toolforge infrastructure, a hosting environment that allows you to run bots or different APIs in case you would like to provide that tool to the public. If you want to work with us on improving tools or algorithms, you can check out the different packages that we have been releasing in the past months. These are all work in progress. There are many open issues and we're happy about any contributions on improving, fixing existing issues or even finding new bugs. So please check out our repository too. If you are interested in getting funding, there are different opportunities. There is an existing program to fund research around Wikimedia projects. This covers many different disciplines, humanities, social science, computer science, education, law, et cetera, and is around work that has potential for direct positive impact on the local communities. In addition, I want to mention that, coming in the future, there are plans for a similar program to improve Wikimedia's technology and tools.
If you want to learn about the projects we are working on, I want to mention that we publish a research report, a summary of our ongoing research projects, every six months, and there you can find more details about some of the projects that I have mentioned. Finally, if you would like to engage with the research community, you can join us at Wiki Workshop. This is the primary meeting venue of the Wikimedia research community. This year will be the 10th edition of Wiki Workshop and it is expected to be held in May. You can submit your works there. I invite you to submit. We highly encourage ongoing or preliminary works submitted as extended abstracts. In this edition, there will also be a novel track for Wikimedia developers. If you are a developer of a tool or a system or an algorithm that could be of interest to research on Wikimedia, please check it out and make a submission. Even if you do not plan to make a submission, you are welcome to participate. As done in the last three editions, Wiki Workshop will be fully virtual and attendance will be free. With this, I want to conclude. I want to thank you very much for your attention. I am looking forward to your questions in the Q&A. If you want to stay in touch, feel free to reach out to me personally by email or on any of the other channels that I am listing here, through office hours, mailing lists, IRC, et cetera, and with this, thank you very much. Thank you. |
Penpot official launch!
We made it! We're ready for our breaking moment! |
Last Wednesday, when we did the unboxing event: since we do hope that everyone will be welcomed to a Penpot project, we needed to make sure that the invite system could cope with all the different use cases that you get with that, link expiration, who got the invite, do I need to resend it, etc., etc. Otherwise, the onboarding experience would be tough and people would not be joining the design project. And this is very important for us, because this is not only a tool for designers to just express their creativity, but also for developers to join the design process, to be welcomed there. So this is very important for us. But of course, we had many improvements in the interactive prototype feature. This is just one example, the overlay advancement. So here you can see how a design asset is being self-referred and you get a very nice pixel-perfect overlay on a button and then an animation that comes with a mouse click. Right? Do you like this? I think you do. Yeah, I think you do. This is happening. This is... This is happening. This is... This is amazing. So this is just a token. I mean, do go and enjoy Penpot, it has many, many features on advanced prototyping, but this is one thing that actually came with this official release. As did nested boards, so here you can see in this animation an independent board being dragged to another board and having this nesting effect. This actually was a bit tricky because, as you know, Penpot uses SVG natively. So when we say open standards, we do really mean open standards. And it was tough to have this root element holding everything, so to have different independent components being designed, we had to do this nested board trick, but that trick actually gave us a lot of potential, because when you have nested boards, you can do much more advanced compositing in general, and so we have the best of both worlds. We have SVG, but also these neat tricks for compositing.
Also let me remind you, for those of you that don't know much about Penpot, that we have some innovations in terms of where to find stuff. When you do engineering, you think about scale, like I need to scale up. I need to automate. I need to have everything in tiny bits so I can combine them any way I want. With design, with big design architectures, you have that same challenge, otherwise it gets just manual, design is just manual, and it can only go as far as your brain can cope with stuff. So having your components, your libraries and assets readily available here, whether they are fonts or colors or design tokens or whatever, is really cool. Also a whole lot of libraries, and you drag and drop, or you create new ones, and they are there on your left pane. So this actually comes from the Penpot design system, which is the first example that you can one-click import. If you go to Penpot, you do the importing, it's very nice. By the way, there is a nice bottle challenge that you can try and do. Anyone here did the bottle challenge? Some of you did. It's a nice little game where you learn by having fun, you learn the Penpot basics using this bottle challenge. But when you actually go into more advanced topics, this is what you really need. So go and check the Penpot design system because it's kind of neatly organized. And then we get something so cool that no one had, which is Flex Layout. So every design and prototyping system out there would have this killer feature where you can smartly apply rules to your design. Say, I want my design to flow this way, and then you would change the content within that design, and those rules would apply automatically. So for designers, I'm seeing people saying yes, yes, I know what we're talking about here. Some designers would feel a massive productivity boost thanks to that. You can call it auto layout, smart layout, fancy layout, whatever. They could not go without this once they were exposed to it.
Not having this once you know about this intelligent layout system is like going really back in time and having to do everything manually. So the problem with this, as we thought at Penpot, was that unfortunately these layout systems were design-centric, which is not bad per se, but was not developer-friendly. Basically it was not team-friendly, meaning designers would do their smart layout system, and then all the vocabulary and the ideas, the abstractions, were just for designers to understand and to enjoy, and then developers would go there and say, I need to translate this into code. Let's hope there are no issues. Not only technical issues, but also friction issues, like back and forth. Am I in control? No, I'm not. Neither the designer nor the developer. They would feel like they would have to constantly double check, double check. And you don't scale up your design or your code if you have to do that. Penpot came with a completely different approach and said, what if we have a code-first design layout? What if the design already is the code? You can do a ton of stuff with Flex CSS. It's actually quite advanced. So what if we say, let's do a developer-first design layout system? And we came out with Flex Layout, and please do applaud here, because it's the first ever tool to have this, and it's open source. And the great thing about this is that this is not a gift from engineers to designers saying, I made this for you, please use it, I hope you're excited about this. Actually, designers wanted this. So no compromises here. Everyone wins. We got 50 to 60 volunteers for the beta testing of this. I mean, I'm not saying you could not test it using our development branch, but this was the proper environment set up for people to go and do some simple tests like this one and give us feedback. And the feedback was phenomenal. I think 80% came from designers, and 20% from engineers. So that was nice, that was nice.
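The "design is already the code" idea above can be sketched as a layout description whose fields map one-to-one onto CSS flexbox properties, so nothing has to be translated by hand. This is only an illustration under assumptions: `FlexBoard` and `to_css` are hypothetical names, not Penpot's data model or API. The CSS property names themselves are standard flexbox.

```python
from dataclasses import dataclass


@dataclass
class FlexBoard:
    """Hypothetical layout settings a designer might set on a board."""
    direction: str = "row"        # maps to flex-direction
    justify: str = "flex-start"   # maps to justify-content
    align: str = "stretch"        # maps to align-items
    gap_px: int = 0               # maps to gap
    wrap: bool = False            # maps to flex-wrap


def to_css(selector, board):
    """Render the board's layout as a CSS rule using standard flexbox properties."""
    declarations = {
        "display": "flex",
        "flex-direction": board.direction,
        "justify-content": board.justify,
        "align-items": board.align,
        "gap": f"{board.gap_px}px",
        "flex-wrap": "wrap" if board.wrap else "nowrap",
    }
    body = "\n".join(f"  {name}: {value};" for name, value in declarations.items())
    return f"{selector} {{\n{body}\n}}"


css = to_css(".toolbar", FlexBoard(direction="column", gap_px=8))
print(css)
```

Because the designer's settings and the developer's CSS share the same vocabulary, changing `direction` or `gap_px` changes exactly one declaration in the output.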
Given the fact that in the world, so that you know it, statistically, there's one designer per ten developers, roughly. We don't think that's a good ratio. At Kaleidos, we have one to two, so pretty different. Somewhere in between, perhaps one to four, would be great. But of course, to beta test this, we couldn't have the one to ten ratio in our pool. We needed to reverse that. But we didn't have to work hard to achieve that, because designers were actually asking to beta test this in the first place. So if you're familiar with Flex CSS, you can see the properties there. And then, of course, here I'm simulating real-time collaboration, because we do have real-time collaboration, of course. And this is like two browsers. It doesn't look like that, because I made it pixel-perfect in a way. But that code pane there, the coding spec, is actually a second browser just beneath, right? Nothing fancy. But you can see how that is changing in real time. So a designer is changing the design using Flex Layout. And as the designer does that, the developer can see that code being changed. It's just a different tab selection there. That's all it takes. So I'm not saying that a developer should be quickly copying and pasting when it changes, but it's nice for the demo. You see that it's real, there's no export concept or anything. And really, that's it. Thank you. It happened. I do hope we have some time for questions, because I planned for that. That is great. Thank you for the point. Okay. I'd rather not pick if you could store that. You had done a really, really quick test. Thank you for the talk. My company is providing a design system, and they provide it for Figma. And is there any kind of process I can follow to use the assets in Penpot, or do we have something that exists to guide me? So the question was about, perhaps, migrating the design system from...
No, we already have a company design system, so I have to respect this library with the assets and the design itself. I just want to use it in Penpot. So if you're using a design system on Figma right now, and you also want to use Penpot, there's a ton of overlap. So it might feel weird for you to use the tool if you're not migrating. The answer is, I'm not sure it makes sense, because this is about team dynamics. This is about people collaborating. So unless you want to export these Figma assets and use them privately for you, with no going back, if that is... I mean, you're smiling, so I guess that's what you want. I want to live in my private open-source world. We're going to pay for licenses for the whole team, so anyone... Okay. Do we have it for 10 people? And that seems to me, I get that you can save some money doing that, but honestly, it's not going to save you much time. So you have to keep a balance there, because it's going to be back and forth, not within Penpot, but between two tools that compete for the same use case. But I mean, go ahead and tell me how it looks. Perhaps I'm mistaken, but I think it's going to have a bit of overhead for you. If you're willing to check that, you have a sticker. I'll give you a sticker, like the shiny one, you know? By the way, may I ask a question? Who was here three years ago? Can you raise your hand? Okay, yeah. Okay. We have stickers, like super premium stickers for you, all right? Next question. Yeah. My question is, you showed us there was a live update of the output code when you were in the designer. Yes. Is the reverse also possible? Can I, as a developer, give a piece of code to the designer and can he copy and paste it into some... So the question is, we saw the real-time transformation between the design and the code. Is it full duplex? Can I change the code? Not for now. Not for now. Not for now.
But of course, it feels like you would like to do it the other way around, right? I'll actually go on with that question theme and say, what if you connect that with a Git repository? And what you change in the Git repository comes back and changes the design. If, as a designer, you change something and save, that also goes to the Git repository and triggers the CI/CD process. Ah. Okay. So we thought about that because it's kind of what comes next, obviously. But yeah, so good question. We've got to cover it. Okay. I should just run over and save this person. Thank you. Thank you. So resources for helping teams migrating from other tools to Penpot? Yeah. Specifically Figma for me. Yeah. Yeah. I mean, Figma, what is that? Figma owns 80 percent of the market for designers, 80 percent. Adobe now owns 87 percent of the market, thanks to that. So obviously, yeah, I get that. We have a plugin, an export plugin. We hope that the community can help us improve it so that you can basically export all that from. We actually don't know how much time we've got for that plugin to be okay for Figma. But it's still there. So that is one way you can quickly test that. It's not by any means complete. So it will give you the basics, but it might be enough. And then I think exporting stuff and just re-importing it is really quick for teams to have the design system re-uploaded here somehow. I don't think it's going to take much time. If you have the motivation, it's just, you know, I mean, depending on the complexity of the prototypes and design, this is for prototyping, and this is also for design. So depending on the quality for each topic, it might get. But yeah, the export plugin, we really hope that we get more help from the community. So if anyone here is interested in those serialization challenges for the SVG and the JSON, because that's all it has, really, that would be great.
So the abstraction model that we have from Figma to actually proper, you know, open standards, fine. So yeah, those, good. I actually realized that I can figure it out myself. Okay. So that answers your question. So there was somebody over here that had, and then again. And congrats on the launch. Thank you. It's really great to see you. I'm curious about the business side of things. You mentioned Figma as an obvious competitor. Mm-hmm. How do you plan to compete with Figma and be successful? At the same time, drive your business forward? So Kaleidos is a company. Actually the original title is Kaleidos Open Source SL, because we're a Spanish company. So the idea for us is to make sure that we follow this: we think there's going to be kind of an enterprise edition that will give companies things that the power user does not want or doesn't need. We call that privately, publicly, tax the controller. Okay. So that is something that we don't really need right now, because to be honest, there's still a ton of things that we have to do for the product, but basically perhaps in two years, if we are relevant, we hope we are relevant, we'll see. See you in two years, FOSDEM, you know. But if we are really relevant, then we will look for those tax the controller features that big companies are fine. They won't contribute with code. They won't contribute with content. They will contribute with money, right? So we don't have the details, but it's going to be like, and this has been publicly said many times, the power user gets open source forever. It's the companies wanting to have this control mechanism that will have to pay for something that is exclusive to them, whether it's self-hosted or SaaS. We don't care. Actually, by the way, there is also a desktop app by the community that encapsulates Penpot with Electron, so you can also have the pure desktop native experience. So yeah, it's there actually in the community space, community.penpot.app.
We elaborate a bit more on that. The only way this will be sustainable as a company is that we are really relevant. Surprisingly, the design space in our view is quite immature. You have one tool, winner takes all, then the next tool, then the next, subsequent winners. We hope that we add fragmentation to that, but still be relevant. So we have like one minute left, if you can. Yeah, I answer your question. You answered the second part. And this is the first part of the hug actually, you intend to compete with Figma. You intend to maybe drive business away from Figma toward Penpot. Well, I think the playbook that we have is quite unique and unprecedented. This is an open source design and prototyping tool. So we hope that we have a bottom-up distribution model. People are really saying this is actually much better. And we are hoping, this is part of our, yeah, this is hope, that, you remember that ratio, one designer per 10 developers, that developers actually loving open source do bring Penpot to the forefront. Without the help from open source developers, I'm sure this is not going to happen. Not because designers don't like open source, just because the economics of that will take much longer. So yeah, that one to 10 ratio, this is the only time I'm happy with that ratio, in this particular case. So we are counting on open source developers to be. Okay, if you want to have a chat afterwards, I'm sure Pablo is very, very happy to answer a question. So we have five minutes for people to leave and people to come in. So I'm going to say thank you, Pablo. Thank you. |
Value driven design
A case study on a successful privacy by design project where we did everything wrong |
So, hello. Welcome, everybody. I'm announced as Winfried Tilanus, but I should also announce Emily Tromp. We did this project together, and unfortunately she couldn't make it today because of problems with public transportation. So, she's now at home watching this presentation and I hope I can make her proud. I'm Winfried Tilanus. I'm just a privacy expert, usually making totally dull, badly designed presentations. Emily is a designer, really working with design thinking, who really likes to reframe things. And she's also the one who is responsible for this really great presentation. I'm totally happy with it. So, value-driven design. And this is about the case study. And it has been a really successful privacy by design project. But we did a lot wrong and we made a lot of mistakes. Fortunately, we learned from it and we went on. And so, this presentation will be the story about that. But before I really dive deep into it, I want to ask you a question. When you think about privacy, which concept do you think is most related? Is it about the requirements the technicians need to build the system in a private way? Is it about the trust of the end-users? Or is it about agency, about the ability of end-users to make their own choices? So, what do you think? Let's make a little poll of it. I hope it works. This is the first time I use Nextcloud for a poll. It goes to a domain totally owned by me, so you have to trust me. So you already answered the question. It's also IPv6, that might be a problem. Are you connected to the dual stack? Yeah, let's... Wait. Yeah. So, who thinks it's about requirements? I see one hand. Who thinks it's about trust? Alright. A little bit more than half of you. Who thinks it's about agency? Quite close to each other. But requirements are down the list, I believe. Alright. This is what I will be talking about. The design challenge, the thing we had to design. Then, three perspectives on privacy. The approach we took in the design project.
And what we learned from it. So, the design challenge was promoting agency with an open-source ecosystem for e-health. One button. There are some things that are really important here. First of all, it had to be about a digital open-source exchange standard. What was envisioned were all different kinds of e-health apps and all kinds of health professionals, all working together in unpredictable ways and unpredictable combinations. And they had to cooperate. So we had to find a way to make a standard to have them communicate. The other starting point of the project was user agency. The idea was that when people have control over their own path in the health care system, or in the care system, they make sane choices. They choose what's good for them. And that makes the health care system better and more affordable. So agency was not only there because we like to be ourselves and to have the possibility to be ourselves. It was really at the heart of the project, of the question. Another part of it, as I already mentioned a bit, is that it's really about the ecosystem of care. I see these are the wrong way around. An ecosystem of care applications blending with health applications. So more informal care, more social work, compared with really hard medical care. These have different security levels, and they need different levels of trust. Trusting your neighbor is something else than trusting your doctor. So we had to design for all of this. Well, then we started, and we discovered that there were three perspectives on privacy that were really causing havoc in the project. Confusion, misunderstandings, everything. The first of the three perspectives was really a theoretical perspective. As I told you, I'm the privacy expert. We like to make dull presentations with lots of text. And I really like to think about what privacy means and how it relates to agency. And, talking with peers, I came up with a relational view on privacy.
And that relational view really focuses on the choices we make. I'm standing here in certain clothes, revealing certain things about myself and not revealing other things about myself. Those choices about what we reveal shape how I relate to you and how I appear to you. And that also creates a relation. We all make these choices. Just look around and see how we are. There's no process. We do this all the time. And by these choices we create agency. By these choices we can determine who we are, and who we are in certain contexts. So that's really the theoretical thought we had about it. And then we had the end user perspective. We really went into neighborhood centers, talked with users, gave them an app, and asked: how do you experience this? And they were somewhere totally different. What's this? This is complicated. I don't know how to handle it. And I have to enter this data. Why? And where does it go? And I don't trust this thing. And then we were sitting at the table with developers from the companies making the e-health and the care apps. And they were asking: well, tell us what to do. What are the requirements? So they were all talking different languages. So we had to find a way to get on, to get this big misunderstanding solved and to get something that brought everything together. And we had to test it with users. That might be a problem, because we were creating an abstract exchange standard and not an app or something like that. So we had to bridge a gap. And we took a step back. We were thinking: what are the common values? What are they talking about? What are the things that they share and that they all want to have in the system? And that's where trust and agency came in. That's when we noticed that users really wanted to experience that they could trust the system and experience that they could make choices. And the developers, well, they're not bad guys. They don't want to violate everybody's privacy. They're working on health apps.
So they know that it's important. So when you talk with them about what's needed in such an ecosystem, they understand very well that trust and agency matter there too. It's really important to give people choices and to make sure that you are trusted. So we translated these common values into design principles. I'll tell more about them later. And, of course, I was there too, and I helped to create them and gave my own input to the design principles. So that's how we came to a set of design principles that bridged this gap. And then we came to the next problem, which I already mentioned: how to user test. Yeah, how do you user test a standard, an abstract thing that can have totally different faces in all different kinds of apps? And, you know, we literally went to neighborhoods and talked about trust and agency, and everybody was like: what are you talking about? So we came up with a privacy game. It was a small game, a small simulation. And we used that to envision abstract scenarios. We made a scenario like: you're in this situation and you want to share this with somebody, so who do you want to share it with? Or: you're in this situation, you're talking to this person, what do you share and what don't you share? Then we put a twist in the game. You turned the page around and there was a new part to it. For example, somebody else coming in, or something changing about your information, which might change what you want to share. And based on that, we got into discussions with the users about how they made their choices, and we could see whether we could validate the principles we had formulated. So by creating and playing this game, we really made it possible to make visible how the abstract scenarios and the abstract design principles worked out. So what did we learn? What are the things we got out of this? First of all, trust is an iterative process. Even when you go to the doctor, it would be totally strange to have a sign at the door of the doctor:
"Please undress yourself in the waiting room." You know, you first want to make contact, you want to see: well, is this a doctor I like to see? And then slowly you start to trust the doctor, and then slowly you say: well, now it's appropriate to undress. And the same happens between a human and a machine, when you interact with an app. There are many apps where on the first page you see "create an account" and you have to tell everything about yourself. And you're looking at it thinking: well, why should I trust this app with all this information? Maybe the app should first show that it's useful and have some nice interaction with you. The next thing is reciprocity of information. I share, you share. It's not one way around. It's not that someone wants to know everything about you and doesn't give you anything back, doesn't show what it does with it. Maybe you say: I want to donate information about myself because I can help somebody else. But it has to be made visible. It has to be a two-way thing. So really: I share, you share. People are willing to trust if there's balance. And then the last one, and that one was really important for the designers: we told them that these principles must be backed by your technical solution. The first thing they said was: well, to identify people in the system, let's use the email address, because it's easy and everybody has one and then you can use it everywhere. But then people lose control over where the data goes, because somebody can link all the email addresses across all applications. So if you really want to give people agency, really the possibility to make choices, on a technical level also, then you have to say to the developers: well, for each link between two apps, you use a unique identifier. And if you want to share a piece of information across several apps, you have to create a star and make a deliberate decision to share it, and really back that up in your technical architecture.
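To make the point about identifiers concrete, here is a minimal sketch of what "a unique identifier for each link between two apps" could look like. It is not the standard that came out of the project; the function name, the app names, and the choice of an HMAC over a per-user secret are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def pairwise_identifier(user_secret: bytes, app_a: str, app_b: str) -> str:
    """Derive an identifier that is specific to one link between two apps.

    Hypothetical helper: the talk only says that each link should get its
    own identifier instead of a shared one such as an email address.
    """
    # Sort the app names so the link A<->B yields the same identifier
    # no matter which side asks for it.
    link = "|".join(sorted([app_a, app_b])).encode("utf-8")
    return hmac.new(user_secret, link, hashlib.sha256).hexdigest()[:32]


# Assumed setup: one user, three made-up apps.
user_secret = secrets.token_bytes(32)  # stays with the user, never shared
id_diary_coach = pairwise_identifier(user_secret, "diary-app", "coach-app")
id_diary_gp = pairwise_identifier(user_secret, "diary-app", "gp-portal")
```

Because the two identifiers differ and cannot be recomputed without the user's secret, the coach app and the GP portal cannot match their records behind the user's back, which is exactly what a shared email address would allow.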
So we really could tell the developers: well, this is what you should do. And it was really nice, because we had had many meetings with the developers, different developers, and they were really like: wow, we don't understand this. And then we came with these principles. And they said: oh, then we can do this and this. And then it all pieced together for them. And then, in several days, they really drafted a first version of a standard. So these are the principles we found. Well, I have another poll, but let's do it manually. I have five minutes left in my talk and five minutes for questions, and I want to lengthen this. I'm a bit curious, so let's just shout out, raise hands, whatever. What do you take away from this presentation? What did you learn? Because I told you what we learned, but I'm wondering what you learned. The back and forth of information and how we should build trust. I immediately identify with that. When you get an app and you have to fill in everything at first, before you know what it does or what it's going to do with it, then you're way more apprehensive about doing that. I'll repeat it a bit, because otherwise it's not on the video. So it's the interactive building of trust. I might be projecting here a little bit, but some of the struggles you mentioned remind me of when I worked in the aid sector, where many people would try to apply design principles like human-centered design to develop a solution for certain people. But it's very difficult to ignore the larger economic context. So when you talk about privacy and people not being engaged with it, maybe they aren't politically engaged with the topic, and it's about educating them and figuring out how to mediate that. It reminded me of working in the aid sector. So you have experience in a similar sector, and you really ran into the commercial barriers of parties who didn't think this way and didn't want to implement it.
It's interesting, because we were with lots of commercial organizations around the table, and they were willing to cooperate and they were willing to change something. They had some idealism there too. But it is a problem in the care sector, because it is a huge economic sector with lots of money and lots of people who try to get a bit of the money. So I feel your problem. Any other takeaways? A quite abstract question. How did you come up with the user testing as a game? And was it actually a game, or do you just mean a game as a kind of information exchange? It was not a game you can win, but it might be played a bit as a party game, and it might be nice. There are lots of games, card games, like this. It was a paper game, and we literally went to neighborhood centers, sat around the table with the people visiting there, and played the game with them. And we interviewed them afterwards, or even before and after, about how it influenced them. I work as a UX researcher and, sorry, I was just curious. I find it really interesting how open source projects make decisions about which methodology or which tool to use. So I was just curious how you made the decision to use the game, because it's a tool that I've never heard of before and I'm very open to learning. So the question is: how did you make the choice for the game for testing? And well, we didn't make a choice. We tried lots of things, and we had many sessions in neighborhood centers that totally failed, and then we had a feeling: well, we need a way to discuss more. And then we came up with the game, and we noticed that it worked. And now, when we're looking back, we can make a nice linear story about it: well, we had the game and it worked. But I left out about a dozen failures there. I can give you more information about the game. I've got a paper on it, and the game is available too. Thank you. I never considered including privacy in the design thinking process.
My question would be: how do you include the GDPR in the process? Because it's part of privacy, and saying to the end users "this is the law, and this is how it can help you and protect you" gives you room for trust. So the question is: how do you include the GDPR in the design? And well, I laugh. I was once introduced at a conference as a person who laughs about the GDPR. The point is, when you work like this, well, let's move on. We are now here. When you work like this, you have a far better way of handling the privacy of your users than the GDPR asks for. For example, part of the iterative pattern is that you only ask for information when it's needed. And then you make clear why you need it. That is already awfully close to an official privacy question you may have to ask according to the GDPR. Maybe add a link to some additional information with all the legal stuff in it. But then you are already there. Then you have asked a meaningful consent question. And that's just the logical thing to do when you're designing this way. So just design for privacy, and then the GDPR should be included already. Yeah, good to know you're the dark side. Well, can you give a clear example of how you do reciprocity in information exchange? Yeah, for example, there are two levels. Between humans, you can deliberately support it. We also had some design patterns for that, but those were UI design patterns that couldn't be forced on all apps. But you can really support two-way communication before telling somebody something. And you can really support things like: well, I'm asking you this, and I want to ask you this because when I know it I can do this or this for you. So you can support patterns like that in your design of the chat, the interaction, the messages. And you can also do that in human-to-machine things.
The app says: well, if you like, we can arrange that when you come back I remember what you did. But to do so, I need an account. Exactly. Exactly. And don't ask before it is already clear what you can give back for it. Start, for example, with anonymous accounts. Then people can see what the app is doing. And, well, at a certain point you can say: maybe it's interesting if I remember this. Shall I? Would you like to make an account? So don't ask for the account up front. And then for the account, ask only what's needed for the account. Don't ask anything more. Yeah? One final question. We're just coming close to time. What kind of failures? One final question, you said. We tried, for example, to make a real taxonomy and architecture that would support all fine-grained types of privacy patterns. And that became a huge picture with lots of drawings and lines, and a data model that was really exploding on all sides. And nobody understood it anymore. And it was bonkers to even think about implementing it. We spent some days on doing all the boxes. No, that's not the way to go. To give one example. So the answer was constant feedback, and expressing the whys and not just the whats. Exactly. Yeah. Yeah, I think we're just over time now. So thank you so much for your time. |
Donation Page Design
Helping your users help you |
We started, and we went through a number of stages, like a lot of projects do. We started with small donations, and we were trying to build up into a system where we could be self-sustaining, because we wanted to have the ability to move faster. We wanted to build a better project for our community. The way we wanted to do that was to bring full-time developers into the project. As of 2022, we've been successful. We have two full-time developers working on the KiCad project. We have eight other core developers who are all contributing, and hundreds of people around the world who we are able to successfully fund to help work on the project. Designers, documentation, librarians: all of the people that it takes to make a large open-source project work are sustained by our donations, by our community. I want to tell you a little bit about how this came to be, because it wasn't obvious, it wasn't immediate, and it wasn't just putting up a single page. Initially, our first donation page looked like this, and you can see I've kind of squished it down so you can see the whole page. If you looked at it in your web browser, you would see the little picture at the top, and you had to go all the way down to get to your donations. Quick, I would like someone in the room to donate 120 Swiss francs to the KiCad project. The question for each user who went in there and didn't happen to use Swiss francs as their base currency was: how much is going to show up on my credit card bill if I decide to support the project? How do I know that this is actually going to support the project? All of these initial pain points in our first generation of a donation page were there because we picked what was available from the organizations that were supporting us. We were initially working through the CERN & Society Foundation, and they provided this, but they worked in Swiss francs because they're CERN. Our users, maybe not so much.
We were initially seeing very few donations from this site. I'm going to tell you how we fixed it, in six steps. The first thing we need to do is identify who our users are, because this drives every single choice we make in design: who is the person we're talking to? Then, never interrupt the experience that your community has, that your donors have, because the moment you interrupt their experience is the moment they have better things to do. You always want to respect their time. Then shorten that click path. Make sure that you're not setting yourself up for failure by putting in intermediate steps that you don't have to have. Every time someone has to click on another link, you are going to lose more and more of your community before they get to that final point where they can actually come in to the larger community. Bring it in-house. There are a lot of places that will set up donations for you. I'm going to tell you, you need to do that yourself. Then test every single step. Finally, build your community, because the basic raison d'être that we have as open source developers is that we are a community. We're stronger as a community, and this should be the primary reason behind each element in our donation page. First, who is your customer? Who is the user base that you are trying to support? For KiCad, we were a bunch of engineers. We were a bunch of designers, circuit designers. The professionals who were building those circuits were our audience. We needed to set up the system to reflect that. Your designers, your customers, are the people in this space who you're focusing on. It's not just what you do for a living. It's where you are in the world. What language do you speak natively? What are you most comfortable in? Once you can make your audience, your community, more comfortable, that is when your community becomes stronger and becomes more tightly bound together.
For us, we need to identify where we are. This is actual data. For KiCad, where do we see people coming into the project? We gather anonymized data that is sorted by country. On the map here, our top five regions are the United States, Germany, France, Japan, and the UK. Notice that Switzerland is not in the top five. We were doing this wrong in the beginning. We were setting up a barrier in the beginning. We should be taking donations in euro. We should be taking donations in US dollars. We should be taking donations in yen and pounds, but the Swiss franc is maybe not the currency we should have chosen. We need to address this. The first step is that we try to address this. We joined in 2019. The Linux Foundation started a crowdfunding project. This was a way of bringing donations into open source projects. We joined this at the outset. Even though the Linux Foundation site is slow, even though they still have multiple clicks and they require way too much data before they allow you to donate, even with all of that, just changing from Swiss franc to US dollar donations met more of our community's needs. Our donations increased from about 10 per month to about 30 per month, which is good. It's a small increase, but it means we start to bring more people into our community. We need to do more, so we want to say: all right, what are the pain points still? Well, we're still slow. We're still requiring multiple clicks. I'm going to say: don't interrupt. Don't interrupt your donor's experience. I love GitHub Sponsors. It's a wonderful program.
If you're doing that, that's great, but you will never grow your community by putting the GitHub Sponsors button up there, because the only people who are going to go up to the top of that page and find that button and click on it are people who are already looking for it. No one goes to this page unless they want to download your software, and the download is down here, and the sponsor button is all the way up there, so you have to go out of your way to find it. For us, this is the KiCad website. We ask: why do you go to the KiCad website? You go to the KiCad website to get KiCad, to download KiCad. So if we're respecting our community's time, we don't make them go out of their way to the donate button in the top right-hand corner. We start by making sure that they get to do what they came here to do first, which is download the software. Once you click on the download, instead of just showing a static page, we start the download, and while they're waiting, we give them the opportunity to donate. This says: you are already accomplishing what you wanted to do, and we're going to give you this awesome software. Your software is awesome too, so that then becomes an opportunity while people are waiting. Maybe you also want to donate. And this right here gave us an enormous increase, because people were now willing to say: that's a good idea, because I want this software to be better, I want to support this thing that I use. Our donations increased to 80 per month at this point. Now, part and parcel of this is shortening your click path, because if you're respecting your community's time, you don't make them jump through hoops. Our initial page had nine to 12 clicks over four pages, and it went through postmates, and it was a mess.
Every time we shortened it, we saw more people coming in, and our current page is right now at one to two clicks over two pages. We'd like to get it even smaller, but that's where we're at right now, because we have built this sub-site that we host on our own. You can't shorten the click path if you don't own your own project's donation page, because then you're at the mercy of whatever intermediary sits between your community and you. We built this ourselves, and we put it behind a CDN, a content delivery network, because we want this to be fast. We want this to be as close to the user as we can get it. Right now, when you see this, all you need to do, if you like the suggested donation, is click. It's one click. If you don't like the suggested donation, you can change it to whatever you want and then click through, so maybe two clicks if you want to change it. We go through Stripe, for the reason that we can pick how people are allowed to donate. Within China, for instance, where a large number of our users are, we can link into Alipay through Stripe. This allows our Chinese user base to integrate into our community better than they would be able to if we required credit cards, which are not common in China. Similarly, we link in Giropay in Europe, and SEPA payments. All of these things are accessible through Stripe, which is one of the reasons why we chose it, but there are other options as well. Then we also added PayPal. People have different feelings about PayPal, and that's fine, but what I'm going to tell you is that if your community uses PayPal, then you should use PayPal, because that is meeting them where they're at right now. So, bringing your donations in-house.
Internal is where you control the information, and external is where you need to get the information from someone else. I'm going to tell you that internal is where you want to be, because you get to make your own donation page, you get to make it as fast as you possibly can, and you get to control the individual steps. Every time you make your community wait to donate, every time you increase that load time by 100 milliseconds, you've lost 20% of people. They will click the back button, they'll go somewhere else, because they're not sure if it's loading or not. You also get information when you bring it in-house. Bring this into your own organization, because it allows you to connect with your community. Every single donation should get a thank you from your project. You should never leave people hanging, saying: oh, I just sent this money into the ether, I hope it does something good. Send them a thank you. Send them a thank you because it lets them know that the community has received it and appreciates it, and include in there a link to your forum, to your bug tracker, to the other ways that the community connects. This allows you to build these communities that bind all of us together more closely. Finally, test. Real quick: A/B testing, piece of cake. Never change everything outright. Never make a new donation page and get rid of your old one. You have donate 41, you want to change something, you make donate 42, and the only thing that you need to add is some simple little JavaScript. If Math.random() is less than 0.5, go to this one, otherwise go to that one. Two different pages, and each of them should have an embedded code in there, an embedded number that you get to send to Stripe.
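The coin flip described above happens in a few lines of JavaScript in the browser. As a rough illustration of the same idea, here is a small Python sketch of the split and of the per-page measurement. The page names, paths, and tracking codes are made up for the example, and the talk does not say how KiCad actually stores or evaluates these numbers.

```python
import random
from collections import Counter

# Hypothetical page names and tracking codes; the talk only says each page
# carries an embedded number that is passed along to the payment processor.
VARIANTS = {
    "donate41": {"path": "/donate41/", "tracking_code": "41"},
    "donate42": {"path": "/donate42/", "tracking_code": "42"},
}


def choose_variant(rng: random.Random) -> str:
    """Send roughly half of the visitors to each page."""
    return "donate41" if rng.random() < 0.5 else "donate42"


def conversion_rates(visits: Counter, donations: Counter) -> dict:
    """Donations per visit for each variant (0.0 if a page had no visits)."""
    return {
        name: (donations[name] / visits[name]) if visits[name] else 0.0
        for name in VARIANTS
    }


rng = random.Random(2023)  # seeded only to make the example repeatable
visits = Counter(choose_variant(rng) for _ in range(10_000))
```

Comparing `conversion_rates(visits, donations)` for the two pages, with donations counted by the tracking code that came back from the payment processor, is the measurement step: only one thing differs between the pages, so a difference in the rates can be attributed to that change.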
Because you're controlling your own donation pipeline now, you get to send that to Stripe, you get to send that to PayPal, so that when you make a change, you get to look at your metrics and you get to measure what the change's effect is. Whether you are trying to get people to join your mailing list because you want to announce releases, or you want to get them to support the project, or you want new users coming in, whatever it is you are trying to change, you won't know if you've achieved it unless you're measuring it. So measure it in a way that respects people's privacy, and measure it in a way that allows you to be the most effective you can be in building your community. And finally, every donation is an opportunity to build your community, because your donors are the most invested customers in your project, and you should treat them that way. You should bring them in, because they saw fit to support you financially. They deserve to have that community wrap around them in ways that are supportive and give them the opportunity to be part of the larger project. Every donor gets a thank you email. Then give them ways to deepen that connection. Offer them things that are straightforward, like t-shirts for donation levels or badges on your user forum. All of these are ways to build that connection between your community and your project. And directed donations give you more information about what your users are interested in. Finally, support your upstream, because all of us stand on each other's shoulders. So whether you are a houseplant open source project just doing this for yourself or you're a big, large project, if you're receiving donations, pass that up, because the things that allow you to build your project and allow all of us to do better are the things that we should be supporting ourselves. So, a last couple of tips. Be transparent.
Show people how much the processing charges are, and allow them to donate a specific amount to your project. You can add a small check box to round up the donation amount. And always, always, always, if you're running your own donation page, watch for fraud, because there are a lot of bad actors who will automatically try to use your donation page to verify stolen credit card numbers. So when you do this, you watch out for this, because it's not just an ethical responsibility, it's a financial responsibility to make sure that people are not being charged for things that they don't know about. In this case, you can see this was all within one minute. This is an actual attack on the KiCad donation website, something like 30 donations per minute. They were just cycling through a bunch of credit cards. We found the one that worked, where they succeeded, and we reported it as fraud, and they stop after you do that for a little while. So thank you very much. I really appreciate it. If there are any questions, I'm happy to take them. Not actually a question, but I don't remember now which open source project it was, it might even have been KiCad, but I remember it was very nice to have a donation page which then listed a bunch of new features that came up recently and just said that these were made possible thanks to your help. Yes. I think that was an excellent way to think about it, because it's true. Yeah. So the statement was that you can show what has been done with the donations. One of the things that KiCad does is that after you donate, it shows you a thank you page that has a little graphic of all the things that we've done in the past year, and we update that about once a year or so. So yeah. Yeah. There we go. Yes. How and why did you understand that these changes were needed? So the question was: how and why do we know that we need to make changes? We weren't where we needed to be.
So we wanted to support full-time developers, and we did not have the funding level to be able to do that. We wanted to increase that to be able to develop the project faster and better. And we knew because we'd periodically dogfood it ourselves. The developers donate not just our time, but we'll also practice going through the donation page, and it was painful. We'd click and we'd be like: okay, come on, come on, come on, come on. And then we'd get there and I had to look up how much 120 Swiss francs is in euro. Now I know, but intuitively I didn't know that. So dogfooding what you build is an important step, and it costs you a five dollar donation, which is, you know, you run it through there and you're supporting a good project when you do this. Yes. How do you handle one-time donations versus recurring donations? So the question is: how do you handle one-time donations versus recurring donations? We have right here a highlighted one-time and monthly choice. So we have this option, and you can set up the back end to do that. Right now we only support that through PayPal, because we don't want to be responsible for keeping track of user accounts, and PayPal allows you to set up subscriptions without the user account. So right now we do that through PayPal, but the thing to watch out for is to make sure in your design that you're very, very clear about whether the user is getting a subscription or making a one-time donation. So when you click on this, instead of saying donate, donate via PayPal, or donate via credit card or e-wallet, we change that to subscribe. So we do a couple of things to make it really, really obvious that if you move this slider over here and click on this button, you're signing up for a monthly subscription. We still have people making the mistake, though. So we're still working it out, we're still doing A/B testing to see if we can get that better. Yes.
The suggestion is to have a separate button for subscribe, yes. I'm going to A/B test that and we'll see what comes out. Yes. You absolutely can, and I skipped by a couple of the options here. Here we go. So Open Collective, Patreon, Ko-fi, Linux Foundation: if that's where your community is, do that. Absolutely. That's fantastic. If your community is on Ko-fi, you should have Ko-fi on there. But there's a balance to be struck. Open Collective is nice, but it's one, two, three clicks deep, and we've never seen a way to shorten that, and we can't cache it. So it does add a couple of steps, but I do know a lot of projects that use it successfully and that like it a lot. So definitely, you know, test it out. Try replacing one of the buttons with Open Collective in an A/B test and see if that's where your community is. The only way you know that is by checking. Thank you. |
Creative Freedom Summit Retrospective |
So we have Emma and Jess talking about what they learned from the Creative Freedom Summit. So give a hand for them. Hi everybody and welcome to our talk. We're very excited to be here at FOSDEM. It's our first time here, so we're very excited and nervous. Had a great time yesterday. So today we're just going to go through a summit that we hosted two weeks ago. We're going to explain what it is for people who weren't aware of it, and then we're also going to see what went well and what could be improved. So just a small introduction first. My name is Emma Kidney. I'm an associate software engineer at Red Hat in their Community Platform Engineering team. Over the last year I've slowly been doing more design work and less dev work. I'm part of the Fedora design team, I utilize FLOSS in every aspect of my work, and I'm an active member of the Fedora community. So I'll just pass it over to Jess. So I'm Jess, and I joined the Red Hat team in 2022 as a Fedora community designer intern for the Community Platform Engineering group. I use all open source software, Inkscape and Penpot of course, and loads of other ones, and I've been creating numerous logos for the Fedora project as well that you might have seen around. So yeah, a little bit about the Creative Freedom Summit. It was a summit that we held, was it two weeks ago? Yeah, two weeks ago. It was all run over Jitsi and PeerTube, and it's about promoting loads of creative software that people might not know about. So what is the Creative Freedom Summit? It's a virtual event focused on promoting open source tools, spreading knowledge of how to use them, and connecting creatives across the FOSS ecosystem. The summit's accomplishments and shortcomings will be examined in light of the first year. So the primary goals for this were, of course, to promote open source creative software.
And also spreading knowledge of how to use it. That's very important, because there's so much free software that people don't know about and it's great to give it that recognition. And also connecting creatives across the FOSS ecosystem, so people who use this open source software can connect with each other and share ideas and all that sort of stuff. And then our secondary goals were to promote the Fedora design team as a welcoming group. So, you know, especially if you don't have a platform to show off what you've done with open source software, you can always join the Fedora design team. We also wanted to create a potential onboarding path for new design contributors, and to spread info on how to use creative open source tools in communities. So again, same as the last one: showing how the Fedora design team uses tools to make community stuff happen, having sessions on how the Fedora design team does this, that and the other, and hopefully making it easier for people to get involved. So what topics did we cover? Again, Penpot. If you don't know what it is, if you missed the first talk, it's a web-based design and prototyping platform. Inkscape as well, which I love. It's a powerful vector-based graphics editor. Blender, Krita and Kdenlive; I assume you might have gone to the KDE stand that was around. And GIMP, and also accessibility, which is very important in the design aspect of things, like color blindness palettes and stuff like that. And then Free Soft Wear was actually free patterns for different crafts, so more of a hands-on design thing rather than technical. And creative professionals using open source software: we had some people, and of course us as well, using free open source software in a professional capacity. And many others. So we had an amazing line-up of speakers, Máirín Duffy and David Revoy to name a few.
And we had social hours that included the audience and helped build a sense of community. Some of these included a Hack and Craft, which is kind of like Free Soft Wear. It's by the same person, Morgan Lemmer-Webber. For the Hack and Craft, we were all knitting and sewing things, but also talking about open source. And there was Pictionary and Gartic Phone. So it was really fun. And then, of course, it wouldn't be possible without our sponsors. Fedora and Red Hat were the main sponsors. And then the platform we used to chat was Element, so people could interact in the chat. Jitsi was for the live streaming part. Oh, sorry, Jitsi was the video part. So we all went into a call, and then PeerTube was the live streaming part of it. So, yeah, I'm going to hand you over to Emma for the retrospective. Nice. Okay. So just some details then as well. When did it take place? As Jess said, around two weeks ago, January 17th to the 19th. What platform? We used Jitsi, streamed with PeerTube, and then Element and Matrix were used for the audience to interact with each other and the speakers. What if you didn't make it? The recordings are available on the Creative Freedom Summit's PeerTube, and will soon be uploaded to the Fedora YouTube when editing is complete. We have some great contributors at the minute converting the raw video files into properly cut and published talks. And was there a badge? The important question when an event is run by Fedora. Yes, there was a badge. It was designed by Jess, again. It could be claimed by anyone with a Fedora Badges account, and it was claimed around 48 times, I believe. And if you're wondering why we had a badge, the summit was all about promoting open source design, and obviously, as the Fedora design team, there can be a lot of overlap between the two communities. Here's just an example of what the setup looked like if you were to attend the summit.
So at the top there, you have a little widget for the live stream. And we have another little notepad feature there where you can type out any questions you might have, plus some information like frequently asked questions. And then we have the chat there so people can interact with each other. And also the background channel for the talk. Oh, yeah, because if you wanted to just watch the talk and not be involved in the chat room, you could just watch directly with the PeerTube link as well. So the event went very well. We had 18 speakers take part with an average of 50 viewers per session, which is very good for our first time round. We had more than 700 unique viewers spread out across 54 different countries. And in the Element chat room, we came up to around 250 members. We were aiming for 100 in the Element chat room, so the fact that we more than doubled that is great. So what went well? Viewers loved the setup we had. They were able to watch videos soon after they were live as well, due to the system that we were using. The chat room was very friendly and there was a real strong sense of community there. The audience had a lot of back and forth with the speakers as well. There was a lot of interaction at the social events and a lot of interest. Also no code of conduct issues, which is always nice. So we got to see some amazing work. We had some speed painting sessions there with Grease Pencil in Blender, speed painting with Krita, and an Inkscape tutorial. There were a good few things about Blender. So what can we improve for next year? What didn't work? We want to try and create a guide for next year with guidance on how to use the platform. Because from a newbie perspective, if you're not familiar with Matrix, Element, or that kind of thing, it can be a bit foreign, a bit hard to get into. We also need to look into auto captions as well.
Since we used Jitsi and PeerTube, we did not have live captions during the summit. We were able to add them in post-production, but it would have been nice to have them while the event was happening. So we're going to look into some open source tools that can be used alongside Jitsi. Hopefully we can stick with that arrangement and, if not, maybe look into some alternatives. So what's next then? A call for proposals. Depending on speaker capacity and interest, we'll probably do a call for proposals around two to three months prior to the event. There's a lot of interest in the audience to get involved themselves, present their own talks, and share their knowledge. We want to keep the group and the interaction growing throughout the year. This can bring more people into the community and maintain their interest, so maybe we'll have socials, art challenges, little mini events. Spread the word, so participate in talks like this one, and get involved in more of the open source design forums and all the different communities. Planning is a major part, so we're going to give ourselves some more time to implement improvements, try some things out, and deal with any errors that come up. So yeah, I'm going to pass you over to Jess, just in case you're interested in participating in Fedora. So a little bit of a segue, I suppose, since the summit is very heavily supported by the Fedora design team, in case you would like to be involved in the next one. If you would like to, there's a Fedora account that you can create. You can create it at accounts.fedoraproject.org, and you can use it to sign in to Element as well. It's the main account to use, especially with a summit like this. If you want to learn more about Fedora, it's all on the Fedora docs. And also if you go to join.fedoraproject.org, it gives you all the links that you can use as well. It shows our mission, our foundations, and all the user documentation.
And you can look us up and ping us if you need anything. And then also, yeah, you can hang out with us. In Element, we have a Fedora design chat where people share what they've been contributing. If anyone needs feedback from the more senior people, or just in general, it's a really good place to be. There's Fedora marketing as well, and many more channels, so even if you're not really interested in design, there's infrastructure and loads more. And yeah, where to find us. So the Matrix channel there is #creativefreedom:fedora.im. And then our website is creativefreedom.com, where you can get all the information. When we were running it, we had a live schedule going on, and you can click on it and it brings you to the streams. And then again, the Fedora design team Matrix channel is there. And then that's us. So if you have any questions. Yeah, so we're planning on making it an annual event, so it'll hopefully happen around January next year as well. Feel free to enter the Matrix channel and ping myself or Jess. I'm more than happy to point you in the right direction if you're not sure how to use Matrix or Element, anything like that. So yeah, I think that's our talk. Thank you. Does anyone have any questions? No? Yes. Do you plan to keep that format for the future? Yeah, because I suppose it's making open source design tools more accessible. So we'd like to keep it a virtual event, I think, for the foreseeable future. Maybe there might be some mini events, things like that. We haven't really discussed it, but we do like the online aspect, that anyone can join. So yeah, just making it more accessible for people. Does anyone else have any questions? Yeah. What would you like to see more of in the next one? So I'll just repeat the question.
So we were just asked, what would we like to see more of in the next summit? So I suppose, just getting the numbers up. Well, you know, it's good to make it more approachable, well, it is approachable, I suppose. But I don't know. That's a tough question. I'd like to see more of the craft side of things and the open source design aspect, because I thought that was really interesting when Morgan came in and was talking about having open source designs, say, of sewing patterns, things like that, making clothing more accessible to people. It was an aspect of design I didn't really think of, even though I'm into crocheting and knitting and stuff myself. So it was mad just to see a hobby and work come together. I thought that was quite interesting. Yeah, I'd like to see more of that. Anyone else? Okay. Yeah, we also have a stand down in building H as well, so we'll probably be there after this. Thank you. |
Accessibility & Open Source
How open source is key to building a more inclusive world. |
Well, thank you all for coming. This is great. Now, my notes, they're not on the screen. It's never good to give a talk where I cannot see what's on the screen behind me, so let's see if I can change that, and there we go. So my name is Mike Gifford. I'm with CivicActions as a Senior Strategist. What I do there is a lot of work on accessibility and sustainability. I'm also a Drupal Core Accessibility Maintainer, and I like to do a lot of talks on open source and accessibility and why it matters, because I think that there's something unique about our community, changes that we can make that have a huge impact. Also, just as a disclaimer, I'm not a designer. I'm a PHP developer who got pulled into accessibility and then suddenly had to deal with a lot of front-end stuff that I was not familiar with prior to getting involved. So I definitely appreciate having the attention of the designers, and I've learned a lot from the design community I've worked with. A couple of things: I assume that everyone here knows about open source. There's a great range of open source tools. Here we're going to be talking about only a few of them, but there are lots of really good open source accessibility tools out there, and I want to look at those. As you know, there are a lot of open source tools that are essentially dead in the water. Somebody released the project, and it doesn't have a community around it. I think focusing on those pieces of accessibility software that actually have a group that is actively engaged really does matter, especially if you're starting to get involved with them. I'm also going to use the short term OSS for open source software. As far as accessibility goes, I just want to remind people that it's not a small portion of the population. We're not just talking about people who are blind.
We're talking about 25% of the population, and if you look at permanent, temporary, and situational disabilities you're talking about 100% of the population. Right now you can probably all see the slides here, but if the sun were in a slightly different position there'd be a point where you may not be able to see them, even in this presentation, just because we've got these gorgeous open windows. So we have to think about how accessibility can affect all of us. We also have to keep in mind that there are both visible and invisible disabilities. How many here know that they work with people with disabilities? Okay, there are a lot of people who don't know, because you can't tell that somebody is colorblind, you can't tell that somebody is dyslexic, you can't tell that somebody has low vision. There are a lot of disabilities you cannot tell by looking at somebody from the outside. Even with people who are legally blind, you may not know that they're legally blind because they've learned how to interact with you. I'm also going to use the short term a11y, which is a numeronym: there are 11 letters between the A and the Y, and accessibility is what it stands for. So why open source and accessibility? Bugs are a big one. In most proprietary projects you don't actually get to see a list of the issues that are there. You don't get an opportunity to see what problems have already been reported. But in open source communities basically every project has an issue queue, and often those issues are identified as either bugs or feature requests, which is an important distinction. Accessibility issues should not be feature requests. They should be bugs, and they should be tagged for accessibility, which many of them are, so that people can find them more quickly. With the Drupal community we've actually started the process of trying to tag them for specific WCAG success criteria.
The advantage of that is that you can actually start to understand who is affected by the accessibility challenges and have a more fine-grained understanding of what these issues are, who they affect, and how serious a concern each one is. So we're looking at trying to do that with the Drupal community. Open source communities are also really good because they tend to focus on a merit-based approach to arguments. You're not looking for who's the most popular in the community; you want to find the best-case argument for why this is the best pattern for your project. So it's a really interesting experience from that perspective. It's also neat because anyone can submit a bug. You often just need an account on GitHub or Mantis or wherever else, and that's a really useful feature as well because you can be inclusive. Does anyone here know where to submit issues to Office 365 when you run into a bug in Office 365? Yeah, good luck. So the other thing is community interest. It's really interesting being involved in the open source community because people want to have others use their code base. It's part of the whole feeling of this. You don't release the software so that other people can say, wow, that's really great software. You release it so that people can use it and say, this is neat, I've learned from this, this is something that I can use to benefit me. So there's an inherent interest among the developers in giving access to the code and engaging with the population. I think bringing people into the community is something that's not easy to do, but it's something that many successful open source communities are able to accomplish. The Drupal phrase is, come for the code, stay for the community. And not that Drupal is the only content management system or piece of software out there. It's one you should all use, of course, but it's not the only one out there.
But that idea is something that's mirrored by many other effective open source tools as well. Also, I think that it takes a village to maintain any piece of software. The internet changes very quickly. If you want to adjust in time and be able to make modifications over time, you kind of need to have a large group of people who you can engage with on them. Some of the accessibility issues are really very complicated, including the kinds of issues the Drupal community has dealt with, even the simple ones. Does everyone know about CSS display: none? It's a really simple piece of CSS that is a huge accessibility problem, because designers and developers use it to hide certain pieces of content in the browser so you don't see it. It also hides it from screen reader users. Trying to find a way to hide content visually but not hide it from people who need it through assistive technology is a really big issue still. Technically, that's not that complicated, but I'm sure that somebody here could write a PhD on how we dealt with CSS and the challenges of display: none in the Drupal community. Learning and sharing is important, trying to be able to engage with others. What we learn in Drupal, we try and share with WordPress and Joomla and TYPO3 and others. That's something that we can benefit from that is unique to the open source community. Codes of conduct. Accessibility is part of the DEIA framework, the idea of diversity, equity, inclusion, and accessibility. How do you make sure that we are bringing in people, and have a structure to see that there is more diversity represented and you're not just having a bunch of white guys speaking at events? It's really important. Having that structure is, I think, really quite useful for accessibility.
You don't necessarily need to have this for a piece of commercial software, but if you're trying to engage a community, you're probably going to eventually, if you grow big enough, have a code of conduct that deals with feedback mechanisms, deals with bad actors, deals with people who act like people do sometimes. We don't always act in the most dignified, organized, and respectful manner, even in our issue queues, so having forums to try and help moderate and manage that is important. The in-person events are also quite useful. Again, you get to meet people. I don't know if there's anyone here who is blind or low vision, but there certainly have been other people at this conference, who I have seen, who are either in a wheelchair or using a guide dog. There have been some people here who have permanent disabilities that we would typically identify as being more severely disabled. A lot of times community events like this will think about that. There will be an accessibility page. People will ask questions about the accessibility of the website. That also really helps. There's a low cost of entry. This may be a surprise to you, but people who are disabled in our economy are generally the least well-off. They're the poorest of the poor because they often cannot get jobs. They often get a subsidy from the government, and that subsidy is often below a living wage. But just a quick joke: who here knows who the richest blind user in the world is? It's not Stevie Wonder. It's Google. Google's a blind user. Can't see. Also, students are another low-income population. If you want to attract students and engage students, open source is a great way to do that. There's a focus on training materials, trying to bring people in and to document things.
If you can document an accessible way to do things, if you can provide a best practice that addresses accessibility issues, it's much more likely that other people will follow that practice going ahead. It's also really useful for a lot of open source communities to have a community where you can reach out to an expert and get feedback if you get stuck. The number of stupid questions I've asked where I've been able to get somebody who knows a lot more than me to help move me ahead, so I can learn a little bit more about the project. It's really quite important. Interoperability and standards. When I first started getting involved in accessibility in Drupal, I thought, oh, because Drupal cares about standards, I can probably go out and fix up the accessibility issues in Drupal in, I don't know, a year or two. It shouldn't be a big deal because we care about open standards. Here we are more than a decade after I became deeply involved in accessibility issues, and we're not there yet. We're further ahead. We're one of the most accessible open source projects in the world, but it's a complex issue. The more you dig, the more you know, the more there is to do. It's an ongoing process, and the goal of accessibility really is to be more accessible today than you were yesterday. So that's the goal. And let's see. Yeah, if you're looking at an open source project and you're trying to figure out how to get acceptance around that, just talk about open standards and the value of interoperability: if you build to standards that meet accessibility, it'll also be useful for future use cases as well. Like, I don't know, how many people here are designing for voice interfaces? Okay, you're not doing it now, but probably by the next FOSDEM or the one afterwards, you're going to be designing for voice interfaces.
If you build for accessibility, it'll be much easier to interact with your voice interface, because the semantics are built in there and that's what really is important for driving voice interfaces. Yeah, open standards and just having communities that care about this stuff. This is something that takes a while to build up and maintain, and it's hard to sustain that sort of caring about it when you've got product deadlines and client deadlines, all that sort of stuff. So open accessibility reporting, I wanted to talk a little bit about this. In the US, VPATs are a much bigger deal. VPAT is the Voluntary Product Accessibility Template. This is not something that is very common in Europe, because VPATs were a good effort 20 years ago, but they don't really cut the mustard in the rest of the world. They're not good enough for Europe, and they shouldn't be, because they're largely a sales document at this point. But the W3C produced an evaluation methodology, WCAG-EM, that's really quite useful and was developed in Holland, and I was able to meet with Hidde de Vries, who developed the tool that is behind WCAG-EM. There's an interface that guides you through the process of doing an accessibility report. And one of the neat things about this is that it has the ability to write a machine-readable document that describes the website you're evaluating. So if you're looking at an institutional website and you have a machine-readable report, you can compare all of the instances of all the software in one place, and you can get a sense of how accessible your website is today versus how accessible it was two years ago. Just having that ability to manage that is really quite useful.
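As a rough illustration of why a machine-readable report is useful for this kind of then-versus-now comparison, here is a minimal Python sketch. The JSON layout (a `results` map from WCAG success criterion number to `passed` or `failed`) and the function names are assumptions made up for this example; they are not the actual output format of the W3C report tool or of OpenACR.

```python
import json

# Hypothetical, simplified report layout, invented for illustration only.
# Real WCAG-EM or OpenACR reports have a richer structure.
REPORT_TWO_YEARS_AGO = json.loads("""
{"results": {"1.1.1": "failed", "1.4.3": "failed", "2.4.7": "passed"}}
""")
REPORT_TODAY = json.loads("""
{"results": {"1.1.1": "passed", "1.4.3": "failed", "2.4.7": "failed"}}
""")


def failed_criteria(report):
    """Return the set of success criteria marked as failed in a report."""
    return {
        criterion
        for criterion, outcome in report["results"].items()
        if outcome == "failed"
    }


def compare_reports(old, new):
    """Summarize what changed between an older and a newer report."""
    old_failed = failed_criteria(old)
    new_failed = failed_criteria(new)
    return {
        "fixed": sorted(old_failed - new_failed),
        "regressed": sorted(new_failed - old_failed),
        "still_failing": sorted(old_failed & new_failed),
    }


if __name__ == "__main__":
    summary = compare_reports(REPORT_TWO_YEARS_AGO, REPORT_TODAY)
    for label, criteria in summary.items():
        print(f"{label}: {', '.join(criteria) or 'none'}")
```

The same comparison could be run across many sites or products at once, which is the kind of institutional overview described above; doing that by hand from PDF reports would be much harder.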
And CivicActions liked the idea of having something for procurement that could satisfy the VPAT requirements but followed the modern model that WCAG-EM implemented. And so we forked the W3C's WCAG-EM report tool, and we created OpenACR, which is an open source platform for creating Accessibility Conformance Reports. So if you're looking at providing reporting on different products, OpenACR is a good format to do that. It was developed and paid for by the U.S. government, but it's available for everyone. So cooperation versus competition. The other neat thing about open source is that, yeah, there's tension, but we also collaborate with each other, and it's really quite nice. I'm on the Drupal Slack channel, of course, but I'm also on the WordPress Slack channel. So you can catch me in either of those places, even though they're often conflicting areas. An area we worked on recently was working with the European Commission on the We4Authors Cluster, and this was an effort to look at the authoring tools and ask how we make authoring tools better so that it's easier for authors to create accessible content. Right now, most accessibility errors are caused by authors who use WYSIWYG editors to create the content, and there isn't necessarily enough guidance to steer them into creating good accessible content.
There's something called ATAG, which is the Authoring Tool Accessibility Guidelines, and ATAG Part B is all about trying to make it easier for people to do the right thing with accessibility when they're creating content. And generally the authoring interfaces are not given enough attention. This is the only study that I know of, certainly that the Drupal community has done, to try and say, how do we engage with authors to find ways to help them create better content? That's not something that is done often enough. And I also want to mention the Drupal phrase, proudly found elsewhere. This is the idea that you don't have to build it all yourself. There are times where you can pick another project. Like, we don't have our own WYSIWYG editor. We use CKEditor. So when we have an accessibility problem with CKEditor and Drupal, we push it back upstream. We're involved in engaging upstream with other open source projects. So I want to say it's really important, I know I'm running out of time, but the idea of engaging with people who have lived experience. People with disabilities have additional ideas to share and to contribute to your open source projects. So don't rely only on the automated tools. Don't rely only on your manual testing. Don't rely only on your third-party accessibility auditor. I mean, those things are all good. They're all useful. But if you actually have people with lived experience of disability testing your interface, you will have a much different experience. You will learn a lot more. And you can't necessarily assume that everyone with a disability has the same experience. Not every blind screen reader user is going to have the same experience.
You know, just like regular people, they're going to have a different experience with the interface, and they're going to use things differently and navigate differently to manage the interface. Yeah. So the more diverse the people you engage with, the more robust and structured a thing you're going to have. So it's definitely useful there. Talent is another big one. Open source projects need talent, and there's a real need to build that up and to involve people with a range of different skills and requirements. But also, people with disabilities are often very qualified but are restricted from working even for tech projects. Whether that's because they don't want to go into the office, or because it's too difficult or onerous for them to work with somebody else's IT infrastructure, or because the interfaces that they work with are not built for accessibility. There's also often not a culture of addressing and including people with disabilities within your organization. So as much as possible, remove the stigma, and see disability not as something that you have to make special adjustments for but as a point of innovation. It's a learning point for everyone to build better products if you have people who interact differently with the web. So think about that talent aspect when you're looking at your organization. Innovation: we've got Deque's axe, and we've got tools like Luxembourg's simplA11yPDFCrawler. My slides are all available, so all this stuff is going to be up online. There's a tool called A11yWatch; we're using that right now, and it's doing some incredibly fast scanning of websites.
We're looking at being able to do government-wide accessibility scans. We're also seeing tools like Sa11y and Editoria11y, which help support the authoring environment by giving feedback to authors on accessibility. I'm a big fan of Microsoft's Accessibility Insights. I've got to say, when I went to their office I was shocked at how amazing their team was and how they were doing so much stuff the right way. I did have to tell them that I was in the belly of the evil empire, but I was really impressed. It was a mind-blowing experience. Also, NVDA is an excellent screen reader developed by two blind developers in Australia. Think about the money that's gone into tools like JAWS and VoiceOver from corporations and governments. And yet you've got two people who believed in open source who were able to build a really strong screen reader. And yeah, there's other stuff that's next, but you're going to have to read the slides later and get into that. But yeah, there are a lot of enterprise tools that are built on open source tools, and there's the ability to scale, and engaging with accessibility in your CI/CD pipeline is definitely something that is more and more accessible. So it's much easier for all organizations to deal with. And now, if you have any questions, please let me know. Is anyone using any accessibility tools? We ran some accessibility tests and, you know, for me, doing accessibility tests for the first time, I was totally mind-blown at how little information was available on how to conduct these tests. Right. Really, like, you know, we had to go to YouTube and that kind of thing. You know, really.
And, and you mentioned that one of the best ways of testing applications and interfaces is to actually get, you know, people who have disabilities, ranging, and, you know, there's also disabilities to actually come in and support. Do you, do you know of any resources where we can learn how to do, moderate, to do accessibility tests on, you know, people with accessibility issues? The, the, so the question is, how do we, how do we learn how to go off and do accessibility testing? And how do we learn to do usability testing with, or usability research with people with disabilities? So, in terms of accessibility tests, I mean, there, there are some tools out there, like you can look at, if you look at Microsoft's accessibility insights, there are a series of videos associated with that. As with the Wave Toolbar, they, there are ones that can walk you through that, but they're, they may not be as prominent, they may not be a, what's your first step to get started? So, but they do exist. And there, we have a lot of accessibility documentation as well, which you can find on our, our sub site there that I think points to some of that. But, but it is a bit of an issue. As far as the, the, the usability research, I was just reaching out to Fable, the Fable Tech Labs folks, they do some really amazing work with people with disabilities. And the thing is, is that people often expect to have, to have people with disabilities do this for free. But they, they, they don't have a lot of money to go off and sit around and, and to, to do usability testing. We really should be assuming that we're paying people with, with disabilities to do these tests. And, and so by engaging with Fable, you will be paying people to go off, people with disabilities to go off and to do these tests and get live feedback from people. And I think you can learn a lot from that process. Not that you always need to use Fable, but it's a, it will help you get some insights on how to do that. 
Any other questions? One more question? Okay. Yeah. You mentioned this before, and I was thinking that we have a range of people with, I don't want to rank people in a way, but you have a scale of different life experiences, let's say. And then you also have a different scale of feedback and of how they experience the software. So when you do accessibility testing for blind people, you know that there are these difficulties. And I think it's also a very big thing to have a lot of neurodiverse people and that kind of thing. Yeah. Also getting feedback from the people who are usually maybe more quiet in the crowd, and having their experience too. So it will give a much broader picture of how you make the software better. For sure. I mean, I didn't raise neurodiversity. So the question was about neurodiversity and how we try to make sure that we encourage more support, testing, and engagement with people of different neurodiverse backgrounds. And absolutely. It's everything from cognitive disabilities to PTSD to people who have issues with dyslexia. There's a range of different disabilities people have. One of the things that WCAG has really failed on is that it has sort of allowed us to think that you could have one sort of website to solve all of your issues. And I'm baffled at how many websites don't provide support for even dark mode. All of our browsers and all of our system technologies provide support for dark mode. We've got SVGs and other tools that can provide excellent support for dark mode. Why do design systems not have support for dark mode? Generally they don't. Why aren't we trying to go out and build for that as part of our expectations?
This is something that would help a lot of people, even if they're people who just are spending too much time in front of the screen and want to be able to have a little bit of a less intense experience, especially in the evening. We should be building for that. And it's something that I hope we do more. And I think that that will help a lot of people with neurodiversity issues. But also just realizing that there's a tool that was built by the IDRC, which is the Inclusive Design Research Center in Canada, that's called the Preferences Editor. And it allows you to go off and swap your fonts, your colors. You can go from two columns to one column. You can have a table of contents that explodes. And my time is up, so that's all I have to say on the subject. You can always contact Mike on all these things that he put up and also just talk to him after. So thank you for your questions for that. |
A11y: EAA, WCAG, WAI, ARIA, WTF? – it’s for the people stupid!
The web is already accessible – it's us as developers who are including barriers. Let's make the web accessible together. |
We have Danny and Maximilian talking more about accessibility, so take it away. Thank you. Hello, everyone. We have this talk with lots of nice acronyms and we hope you enjoy it. We will talk about accessibility today and about how to make your web applications more accessible with some easy tricks, and how to make them accessible to everyone. This is Maximilian and I'm Danny. We are from Deutsche Bahn, the German railway company, and we are both working in frontend, on web applications. So we want to start with a quote from Bill Clinton. It says: congratulations on the launch of the Web Accessibility Initiative, an effort to ensure that people with disabilities have access to the Internet's World Wide Web. Can any of you guess which year this quote is from? I will just give it away. Okay, it's from 1997, so accessibility is not a very new topic, not a very new thing. It has been there all along. And yeah, we want to start a little bit with awareness of accessibility and why it matters. Then I will say something about implementation, and then about how you can ensure your application is accessible. Great. Thanks a lot, Danny. So first of all, accessibility. We heard the talk by Mike, and it was really amazing to see so much, also regarding Drupal and all of the open source CMSs, for example, and the development over there. But when we talk about accessibility, it's quite easy to say that it's only about a specific group who have disabilities. Mike already said something about it: it's really about including all the people. This is the main paradigm in the end. But it's also us as developers who are bringing in the barriers, who are bringing in a non-responsive layout, who are bringing in something which doesn't work under different circumstances. So this is something that we want to dig into in a little more detail today and show you some things regarding those aspects.
So, again, you might think of blindness first of all, but it's about so much more. It's about permanent disabilities, it's about temporary disabilities, and also situational accessibility issues. You might be in a situation where, for example, you have your child on your arm, or where you have only one arm for controlling your device. It might be a light condition. There are so many more things than just the one thing that we might think about quite easily in the very beginning. Another reason why it's even more important is that it's a legal topic. There is the European Accessibility Act, which was enacted in 2019 already and needs to be transferred into law in each country in the European Union. In Germany, for example, it has already been law for one year now, since 2021, two years, sorry, 2021, but they will only start to penalize this from the beginning of the year 2025. But it's similar to GDPR. We waited for a long time and didn't think it through, and so many companies only started to think about and to act on this topic when it was quite near to the date in 2019, I'm sorry, with GDPR. To clarify things a little bit about the structure and how it's being managed, especially for the web, and we're mainly talking about the web, although there's obviously also accessibility for applications and native development: in general, it's the W3C, and they have a Web Accessibility Initiative, WAI, which formulates the Web Content Accessibility Guidelines. So these are some words that we're also using in our presentation. So all of these rules, all of these principles, divide into four categories in the end.
There are, I think, 60 or 70 criteria, which are divided into those four fields: perceivable, operable, understandable, and robust. These are criteria that you could test your website against, or that you could use to ensure the compliance of your web pages, your website, or your web application. And those also split up into three conformance levels. We've listed them over here. A is really the basis, and it should be ensured in all circumstances, like, for example, providing alternative texts for your images. This is probably quite a simple one. The other two levels are about dividing between, for example, governments, or governmental services, who need to ensure that the AAA criteria are followed, and private companies, for whom it's most likely AA that they need to follow. And all of those criteria are structured into those conformance levels. Why is it important in open source projects? You most likely all know this image. It's always used in the context of security, for example, but it's similar here. I mean, accessibility is also some kind of a quality attribute in the end, an underlying attribute, so the image is something that you could also use for accessibility. So if you have an inaccessible UI library, for example, or if there's something that is inaccessible in this UI library, it might lead to problems in so many more projects. And the general paradigm about this is that if there's even only one accessibility issue on the website, it's inaccessible. It's not about the amount. It is then inaccessible, and that's very important in the end. You need to test it to ensure that it is accessible. And that's why it's so important to test your stuff, but also to contribute to those libraries if you find something on the Internet.
This might be groundwork for many other websites in the end. So just one simple commercial about us, about Deutsche Bahn. We also have a design system at Deutsche Bahn. And obviously, in the very beginning, we started as inner source, because it was kind of a small initiative. Obviously you only have some developers, you have some designers over there. But then we thought quite quickly about going open source, because, from my point of view, and I'm also privately developing some polyfills and that stuff, I think open source is one of the paradigms. It's about learning in public, isn't it? I mean, it's going out, it's talking to each other, it's about really getting feedback and that stuff. So we decided quite early to go open source and get feedback from the public. So I'll hand over to Danny then for the implementation part. Thank you. So now we want to focus on what's important when implementing an application or your UI library. What's the most important thing? The most important thing, really the most important thing, is to use semantic HTML. I can't say it often enough. If you only use divs, then you make it much harder to bring in accessibility later. Making your application accessible afterwards is way harder. It's easier to change the style of a button than to bring accessibility to a div element. And a very big thing is that you use landmarks. Landmarks are used to structure your web page into big regions, like a header, a navigation, a site container, a main part, and the landmarks are very important for screen readers to identify the structure of the site. So where is your navigation, where can I switch pages, where is my main content? And this makes it much easier for users of screen readers to find the content and to navigate between those regions. And another thing is the headlines.
So when you use headlines, be sure they are in the right order. Don't leave out some levels of the headlines or mix the levels. So just don't make an H1 and after that an H5. The headlines are like in a book. If you open a book, you have the index, so your web page should be structured like a book. And another thing is buttons and links. Often people just use buttons with on-click handlers, and the on-click handler will navigate to another page, which isn't a good thing. So please use links to navigate to things and use buttons to do things, to trigger actions. And please don't use this. So, never. Forms are also a great example where accessibility matters a lot. You have separate elements in the structure of HTML for the label and for the input field, but you should link them together using the id attribute and the for attribute. And you can also use ARIA attributes to give additional hints. This is probably the sun, yeah, a great example of accessibility. So yeah, you can use descriptions for more hints, which are read out by a screen reader so that the users know, okay, this description is linked to this input field of my form. Yeah, and the alt attribute is also very important. It's not an alt tag, by the way. It's an alt attribute, and you should always use an alt attribute to describe what's visible in an image. It's very important that people who can't see an image know what it shows. The only time you should use an empty alt attribute is when you have just a decorative image. Say you have a disk icon for a submit button and the submit button already shows the text "submit". Then it doesn't make sense to have more content. So yeah, but otherwise describe your images.
So modal dialogs are also a great example. There are lots of UI libraries on the web which are not accessible, and lots of UI libraries implement modals from times when the browser didn't support the native dialog element. These libraries have a big issue, because when a modal appears, sometimes a screen reader will not notice that there's a modal, so it will read out the content behind it, and that doesn't make it accessible. It's not good. So the native dialog element is a really good way to make your application, and especially modals, accessible. And there are other cool things which are not only good for accessibility. As you can see, this is a details and summary element in HTML, and it also helps because if you use Command+F, or Control+F on Windows, and search for a term, it will automatically expand and show where the content is inside the details. So yeah, that's also a good example of semantic HTML elements. And there are other things, like auto-completion with datalist. So there are a lot of things that got more and more standardized and are available in HTML5 now, and they can be used by developers to make an application more accessible without using ARIA attributes. ARIA attributes are a fallback: if you don't find a semantic HTML element, then you may have to use a div, and then you can think about whether you need some ARIA attribute. There are some ARIA attributes which help to make things accessible that aren't natively accessible. For example, an aria-live attribute helps to read out notifications on your site. When you get notified from a server or something and a message appears, then you have to find a way to tell the user that a message has appeared. So for that, aria-live can be used to read it out for screen reader users.
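The structural points above (a main landmark, heading levels that do not skip, labels linked to their fields, and an alt attribute on every image) can be checked mechanically. What follows is only a minimal sketch using Python's standard html.parser. The names StructureAudit and audit and the sample markup are assumptions made for illustration, not something shown in the talk, and the check is far less thorough than a real accessibility tool.

```python
from html.parser import HTMLParser

LANDMARKS = {"header", "nav", "main", "footer"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
CONTROLS = {"input", "select", "textarea"}
UNLABELLED_TYPES = {"hidden", "submit", "button", "reset", "image"}


class StructureAudit(HTMLParser):
    """Hypothetical helper: collects a few structural problems from HTML."""

    def __init__(self):
        super().__init__()
        self.problems = []          # findings discovered while parsing
        self.landmarks = set()      # landmark elements that were seen
        self.last_heading = 0       # previous heading level, 0 = none yet
        self.label_targets = set()  # ids referenced by <label for="...">
        self.control_ids = []       # id (or None) of each form control

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in LANDMARKS:
            self.landmarks.add(tag)
        elif tag in HEADINGS:
            level = int(tag[1])
            if level > self.last_heading + 1:
                prev = f"h{self.last_heading}" if self.last_heading else "no heading"
                self.problems.append(f"heading level skipped: h{level} follows {prev}")
            self.last_heading = level
        elif tag == "img" and "alt" not in attrs:
            # alt="" is fine (decorative image); a missing attribute is not.
            self.problems.append(f"img without alt attribute: {attrs.get('src', '?')}")
        elif tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag in CONTROLS and attrs.get("type") not in UNLABELLED_TYPES:
            self.control_ids.append(attrs.get("id"))


def audit(html):
    """Return a list of human-readable findings for an HTML string."""
    parser = StructureAudit()
    parser.feed(html)
    parser.close()
    problems = list(parser.problems)
    if "main" not in parser.landmarks:
        problems.append("no <main> landmark")
    # Simplification: only <label for="id"> counts; controls wrapped in a
    # label or named via aria-label would be reported as false positives.
    for control_id in parser.control_ids:
        if control_id is None or control_id not in parser.label_targets:
            problems.append(f"form control without a matching label: {control_id}")
    return problems


SAMPLE = """
<main>
  <h1>Tickets</h1>
  <h4>Search</h4>
  <label for="from">From</label><input id="from">
  <input id="to">
  <img src="train.png">
  <img src="divider.png" alt="">
</main>
"""

findings = audit(SAMPLE)
for finding in findings:
    print(finding)
```

On the sample, the sketch reports the jump from h1 to h4, the image without an alt attribute, and the second input that no label points to. The decorative image with an empty alt attribute is accepted, which matches the advice above.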
And now you may be asking, how can I easily test that my application is accessible? And I will hand it over to Maximilian. Yeah, thanks. Thanks, Danny. So one thing that I'd like to quickly add to what you said: think of those HTML tags like little micro frontends that the browser delivers to the user without the need for JavaScript. It's there already. We have polyfills for most of these features already. So use them, and by stabilizing this stuff through the browser vendors, we might get better features in the future. For example, on mobile devices you might get a different UI, you might get a different keyboard, or also what we've shown regarding details and summary. It might be something that the browser vendors could then innovate on and, for example, bring in this cool feature regarding in-page search. So, QA and testing. We had this question previously regarding testing. I think one of the easiest things is really this: you have the keyboard in front of you most of the time. So you could use the keyboard, use the Tab key to try to navigate in the application that you're responsible for, that you're working on, and try to reach all of the interactive elements in there. Try to control the navigation, for example, and try to see whether the focus is visible. There are so many things which are related to keyboard controls on a page where we assume that the users are using a mouse, but this is also something that is beneficial for screen readers. It's obviously not the only part. For screen readers it's also what Danny talked about regarding semantic HTML. But if you see that it's not controllable by keyboard, it's most likely also giving screen reader users a hard time in the end. And this is the easiest one.
Mike also talked about some other tools. We'd like to highlight at least the Chrome developer tools. They have some really nice stuff, especially for contrast ratio, for example, which we wanted to highlight over here. They have Google Lighthouse, where you could obviously measure so much more stuff, but it has an accessibility section in there that gives you great tips. And they also provide you insights into the accessibility tree, which is the basis for what gets passed on to the screen reader later on. There was a great talk yesterday by Mozilla on optimizing this regarding performance and accessibility performance. I would really suggest you have a look at the recording for this one. Another part is the axe DevTools. They have something for the CLI as well. They have plugins for the browsers. Why have we used the FOSDEM website over here? No hard feelings. I'm sorry about it. Total issues: 35. Yeah, let's skip that. Next slide. Still, it's about the same message. It's about the humans: test with humans in the end. Keyboard is something which gives you some insight quite quickly. It's something that is really low level in the end. But talk to colleagues in your company, in your field. I'm also working with several colleagues with disabilities, and it was really inspiring to see how all of this works and to go through this stuff. This is something that is really important. Talk to the users in the end who are using this. One of the last things before the questions that we wanted to bring up, because we had a quick conversation about it: we know CVEs, for example, as a global database for security issues. Why don't we also do this for accessibility vulnerabilities? Because it's also something that is becoming, or that should become, more important, and that also becomes a legal matter.
So why not do something similar for accessibility? We leave you with this. Thank you very much for your attention, and let's go over to your questions now. Yes, please. I could repeat it. There are some elements, for example the date picker, that are not accessible by default, so you end up having to use a UI library. Do you know of some reference documentation that you can refer to? It depends on the, I would say, sorry, I should repeat it. Yeah, sorry. The question was: there are some elements, like for example the date picker, which are currently already identified as not being accessible even in the native browser implementation. And the question was whether I know libraries which I could recommend. I think in the end it depends on the context that you're working in. So for example, if there's a great Angular plugin, you couldn't use it in a React context. So sadly you always need to do the research and search for a good plugin. And I talked about the standards previously. I think it's so important for you to support those standards, because if we standardize all of this stuff, then the browser vendors have the chance to make things better in the future. This is the main point. So I couldn't necessarily give a perfect suggestion at the moment, but I do know about those issues at least, with the date pickers. It's quite hard. There's another initiative trying to come up with new solutions, because most designers think that we need to overlay these, let's say, ugly standard elements in the browser for design reasons. And there's another initiative.
I don't remember the name, but they are already coming up with something, for example, to build the select menu anew. But again, I'm sorry, we could add it to the slides afterwards and then add a link on the FOSDEM website. Yep. Thank you. Any more questions? Yes, please. Yep. It's you. Pablo. My impression here is that by the time a developer has a chance to do that, the design is already committed. At Deutsche Bahn, how do you approach this from a design perspective? In the UX/UI, but in general, when designing the product, before a developer even has the chance to say, actually, I know how to do this in a way that will pass a test. Yeah. Because that is the beginning of the funnel. But if we don't do that properly... so I'm curious about your experience, instead of focusing too much on when it's already there and the developer has to do some AAA checks. Yeah. So I'll say the question again: how do we approach accessibility not at the end, not at testing, but already, for example, in the concept phase and the design process? Exactly. So I think it's mainly about having this scale in an organization. The first and most important thing is awareness, from my point of view. So already in our onboarding sessions for new employees, we have talks, and we educate on accessibility. So I think the main thing is awareness for all the people involved, for product owners, for designers, for developers, for testers. I think it's mainly about awareness, I would say, and only then could you scale. And I'd like to thank you for your product, because we're really looking forward to your product. You talked about... thank you. Yeah. So another question. A follow-up. I'm afraid it's not only about awareness. I think it's also about the right skill set. We are really struggling to find talent, accessibility talent, in the design space.
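Pressing Tab through the page is the real test, but part of it can be approximated statically. The sketch below uses Python's standard html.parser to flag elements that carry an inline onclick handler but would never receive keyboard focus, plus positive tabindex values that reorder the tab sequence. The class name KeyboardAudit and the sample markup are illustrative assumptions. Handlers attached from JavaScript are invisible to this check, so it complements manual keyboard testing rather than replacing it.

```python
from html.parser import HTMLParser

# Elements that browsers place in the tab order by default.
NATIVELY_FOCUSABLE = {"button", "input", "select", "textarea", "summary"}


def parse_tabindex(value):
    """Return tabindex as an int, or None if it is absent or not a number."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


class KeyboardAudit(HTMLParser):
    """Hypothetical helper: flags markup that is hard to reach with Tab."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        tabindex = parse_tabindex(attrs.get("tabindex"))
        focusable = tag in NATIVELY_FOCUSABLE or (tag == "a" and "href" in attrs)
        # An explicit tabindex wins: >= 0 adds the element to the tab order,
        # a negative value removes it even if it is natively focusable.
        reachable = tabindex >= 0 if tabindex is not None else focusable
        if "onclick" in attrs and not reachable:
            self.findings.append(
                f"<{tag}> has an inline click handler but cannot be reached with the Tab key"
            )
        if tabindex is not None and tabindex > 0:
            self.findings.append(
                f"<{tag}> uses tabindex={tabindex}, which overrides the natural tab order"
            )


SAMPLE = """
<a href="/search">Search</a>
<div onclick="openMenu()">Menu</div>
<span onclick="save()" tabindex="0">Save</span>
<button onclick="buy()">Buy</button>
<input tabindex="3">
"""

parser = KeyboardAudit()
parser.feed(SAMPLE)
parser.close()
findings = parser.findings
for finding in findings:
    print(finding)
```

In the sample only the div and the input with tabindex 3 are reported. The span with tabindex 0 passes this particular check, although a real button element would still be the better choice because it also brings the role and the Enter and Space key behaviour for free.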
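The contrast ratio that the browser developer tools report is a small calculation defined in WCAG 2.x: each sRGB channel is converted to linear light, the channels are combined into a relative luminance, and the ratio is (lighter + 0.05) / (darker + 0.05). A minimal sketch of that formula follows. The function names are illustrative, and the input is assumed to be a six-digit hex color.

```python
def channel_to_linear(value):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#rrggbb' or 'rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * channel_to_linear(r)
            + 0.7152 * channel_to_linear(g)
            + 0.0722 * channel_to_linear(b))


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """Level AA asks for at least 4.5:1, or 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#ffffff"), 2))
print(passes_aa("#767676", "#ffffff"))
print(passes_aa("#777777", "#ffffff"))
```

Black on white gives the maximum ratio of 21:1. The two grays show how narrow the margin can be: #767676 on white just clears the 4.5:1 threshold for normal text, while the barely lighter #777777 falls just short of it.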
Awareness is fine, but it's also about experiencing really challenging projects. Totally. Totally. It wasn't about only aware. I know. You haven't said it. So I repeat it quite quickly. It's not about only awareness, it's even about the skill sets. It's about experiences. But you need to start somewhere. I mean, you could build up experiences by starting and then having all the people getting those experiences in the end. I think I couldn't have a better answer at least. We do have several teams who are specifically working on this, even also education, but even also testing from the very beginning, even already educating concept and designers and all of these colleagues. But in the end, I think it's, yeah, it's about skill set awareness and all of this stuff, education, talking to each other, all of that stuff. This is what I could tell you for the moment. But we could follow up. Okay. Yeah. We are still here. We are still here. If you do have anything else to follow up, just come to us, please. Thank you. |
Building a UX Research toolkit
How a UX Research Toolkit is being built for the Open Source Ecosystem |
Bitcoin is sometimes referred to as magic internet money. It's currently the 15th largest currency in the world. We feel that Bitcoin embodies a lot of the open source ethics, such as open collaboration and decentralization. In the last two years, 2021 and 2022, two of my colleagues have also been here speaking at FOSDEM. So I'm very grateful to be here waving the Bitcoin design community flag. Speaking of the Bitcoin design community, it is pretty young. It's just two and a half years old. The mission of the community is to improve the global adoption of Bitcoin. And that's only possible if we all openly collaborate and work together. The community basically acts as an online space. When I say online space, I'm talking about a Slack workspace, simple as that. It's an online community. There's a whole hub of different projects in that community. There are even open source projects who just have their own Slack channel. So they choose to be in that Slack channel and work totally out in the open. They ask questions. It's a hive of very busy, active projects all working synchronously and asynchronously. I will share a few stats about the community. So we have just over 3,000 Slack members. We have a couple of thousand Twitter followers. There are a lot of devs in this room, so when I say closed pull requests, you know exactly what I'm talking about. We complete a lot of calls. As I say, a big goal of the community is to help other open source projects. So we jump on calls with projects who would like to improve their user experience, and we assist them. So that's why there are that many calls. As you can see, not all of the calls are recorded, because we have 174 completed calls and 103 YouTube videos. Then we have a newsletter, and there are a lot of project collaborations that we work with. So we've worked with Stratum. I don't know if some of you have heard of Bitcoin Core, as well as wallets Trutony and Blixt.
Any of you heard of any of these projects? Cool. Bitcoin Core, nice. So I'll share a little bit about my background. Many many years ago, I worked as a marketing consultant. And the reason I worked in marketing was I had a passion for business and strategy and those kinds of things. But the area in marketing that interested me the most was actually the research part. And why it actually interested me is because I was always interested in figuring out why things work the way they did. I was always interested in getting down to the nitty-gritty truth about things. And I have a feeling that there's a couple of people in this room that kind of share that insight, just raise of hands. Yeah, there's a natural researcher in a couple of us. So I specialized as a marketing researcher. And during that time, I pretty much worked as a consultant. So I worked as a consultant for basically like small and medium-sized businesses, helping them to devise strategies based on the research that I conducted. How did I stumble upon UX design? I was always fascinated with tech and I was losing the challenge in working as a marketing consultant. It was just the challenge that was missing. So I decided just to go ahead, there's that accessibility. Yeah, that would look cool. No, no, it's good. So I decided to start studying something else. And I came across UX design. And UX design for me was an intersection of three things that I was really passionate about. One was the intersection of design, psychology, and research all in one. These were things that I already enjoyed and I was already doing. So it was a perfect opportunity to combine all of them. Then I stumbled upon the Bitcoin design community. Here is a screenshot. This was just taken last night. So the community is basically, as I mentioned, it's this hub, it's very free, very open communication, very relaxed. And in the community, there are all these different projects that are working out in the open. 
They're all open source projects. Bitcoin Core is one of them. So I started basically volunteering at this community, as in jobless volunteering, you know, because I was passionate about working in Bitcoin. So I started volunteering, and I was just having a whole bunch of fun, jumping on all these different calls. And then one day, someone came knocking at my door and said, hey, Mo, you know, you're doing some pretty good work here in the community. Why don't you be a UX researcher? And I was like, me? You know, imposter syndrome came in from all sorts of different directions. I know everyone here has dealt with imposter syndrome at one time, but it came in from all sorts of different directions. The first challenge was the technical challenge, because particularly in the industry that I work in, the technical know-how required is extremely high because of all the terminology. You know, I work in Bitcoin and I work in Lightning. So there's a lot of what I would call gibberish, which now to me doesn't sound like gibberish anymore. It makes sense. So then I started to find some direction. I met Christoph Ono, who works in the Bitcoin design community. And he basically said, Mo, why don't you be an evangelist for UX research in the Bitcoin design community? Why don't you wave the research flag? So I said, okay, you know, why not? Let's give it a shot. And he then suggested that I create a UX research toolkit for the entire ecosystem to use, so that if they didn't have a UX researcher on their team, they could dive into the toolkit, grab a tool that they felt would be useful, and start using it. And so I started building. So my goal was to develop this UX research toolkit, which would help projects in open source to feed insights, data, and knowledge back into their projects so that they could develop more human-centric products. But of course, there were some challenges along the way.
There were two very big challenges that I faced. Within the industry that I work in, UX research is still very much in its infancy. So while building the kit, a big part of my role is getting people to understand the benefits of UX research. So that was the first big challenge. The second big challenge was that no such toolkit actually existed. There are UX research toolkits on the market. If you go into the Figma community and you type in UX research, there are toolkits that exist for UX research, but not specifically for the industry that I work in. So there I was, building this kit. And if I look to the left, hey, can you help me with some advice? There's no one there. If I look to the right, there's also no one there. So I started building this kit purely based on the projects that I was working with within the community. And so I needed to get very hands-on. It's nice to see that we have someone from Bitcoin Core here in the room, because one of the projects that I work on is Bitcoin Core. I've also worked with the Galoy team, and I've also worked on the Blixt Wallet. So what I actually did was I basically just waved my hands and sent out a Slack message to these different people in the community. And I said, hey, can I come jump in and talk to you about UX research? So I started jumping on these calls, and working with these projects I started to really get a good feel for what they need, what kind of insights they need to be able to build more human-centric products. So I started to build the kit from a place of real hands-on experience with different projects. But of course, I was working in a silo. So this is showing the messiness of it. I first started designing different resources and different tools, working in Google documents. There were files everywhere. You know, I call it, I don't even know what you call it. You're smiling, so you probably recognize this. There were files everywhere on different platforms.
And I was working in a silo, as I said, because while I had my feet in these projects, I was still working alone as a UX researcher. So the first iteration of the UX research toolkit was just like, hey, these are all the different types of UX research tools and methods that are available. Here they are; I'll design tools and resources for every single thing that's available. So I went in with what I would call naivety. And then something really interesting happened. I started to really embody the ethos of working in public. I was just speaking with another UX researcher about that. The ethos of working in public is basically sharing all of your work out in public, even when it's messy. Yeah, you do the same, I guess. So when I share my research, or what I'm doing in my research, I don't share the final presentation. I share the messy parts. I share the Google document with the UX research strategy that's half finished. I'm still building the strategy. Have a look, everyone. Do you have anything to contribute? So this is what I started doing, and as a result of sharing my work on Twitter and on LinkedIn, and being very public, open, and transparent, people started reaching out to me. Fellow UX researchers started reaching out and saying, hey, you're building something, it looks interesting, could you tell us more about it? So I started inviting them onto my calls, and I actually invited them to build together with me. The picture's not here. I added a picture to one of the slides, maybe it comes up later on, of a whole bunch of us on a call together. But basically, different UX researchers from different industries, not only the industry that I work in, started to jump on these calls, and we actually started to brainstorm and ideate together.
We were like, okay, there's a whole spectrum of UX research tools that are available, but in your experience, which tools have worked the best for the projects that you've worked on? So we did it with pure democracy, pure voting. We just jammed together and we voted on which tools we felt would be the best. And that's how the tools in this toolkit were chosen. Right now we're still building out this first half, and next week Tuesday we have a meeting to discuss what will be in the other half of the research toolkit, because version one's pretty much been decided on. It's been split into two halves. One half is scope and build out. This is if you have no product and you're looking to build from scratch. You have nothing. You start from zero. And the other part is shipping the product to market. So you already have a product, but you need to iterate on it because you've been getting a lot of user feedback. So in our opinion, and this is just guidance and advice, we would say, hey, these might be the tools you can use if you already have a product, and these might be the tools you can use if you're starting out with a new product. But these are not hard and fast rules. They're totally open to you and your team and the resources and the time that you have available. And of course, because we're researchers, we wouldn't be doing it right if we didn't collect feedback. So we ran a UX research test on our UX research toolkit. That was super fun to tweet about. We had so much fun with that. And we got feedback about our kit. So what I'm busy doing at the moment is this: we've collected all of the UX research and the feedback on the kit, and we're now iterating on it and making changes. Lessons learned, very quickly. One: pivot.
Pivot and turn if you think that is better for the user experience. So basically, I was developing this kit, working in the silo. And as a result of realizing that the way I was delivering my product was not going to be usable for the people I intended it for, I literally did a 180-degree pivot within a really short space of time. And I did it in public. I was like, hey, guys, no, this is not going to deliver the right user experience. So I pivoted. The second thing is building in a silo. It's nice to build in a silo if it suits your personality type. But if you build as a team and you invite the entire ecosystem to build with you, you get the insights, the knowledge, and the intelligence of everyone else. And you get to learn from everyone else as well. And the last insight is that UX research is as much about communication as it is about data. If we would like the developers and the UX designers to take the feedback from our research on board, we need to learn how to communicate it in a good way and in a visual way. One of the things that I do when I feed my UX research insights back to the team is that I tend not to feed back solutions. Instead, I place the trust and the intelligence in the hands of the team. I say to them, listen, guys, this is an insight from a user. They said they don't understand what this button means. And then I ask the developers and the UX designers, how can we solve this problem? So it invites a very open conversation for everyone to come up with a solution together, as opposed to one person coming up with a solution. That's it from me. You can find me on Twitter: Mogashni Naidoo. Are there any questions? Shoot. Thank you. How do your research methods differ across user types: contributors, node operators, end users? Could you repeat that, please? How do your research methods differ across user types, like node operators, contributors, and end users? Node operators, users? End users? Yeah.
And people who contribute to the protocol. Can you repeat the question? Yeah, it's just... Am I really quiet? Okay. So you know there are different personas that participate in a network. There are people who might contribute to the network and, like, open a PR, then there are people who are end users and adopt the product, and then there are people who have to run the client. How do you use different research methods to get feedback from all of these people? Are you referring more to the fact that the user feedback is kind of spread across sources, from GitHub issues to Twitter? Is it that the data is spread across different sources? It's probably more about classifying your users into different groups based on their tasks or scenarios. Then this would be more about, and there's a slide on that there, jobs to be done. So I'll speak about jobs to be done very briefly. Blockstream has popped into the community every now and then. We've also had some Lightning node management platforms jumping into the community as well. And they're really moving towards trying to understand the jobs or the tasks that the interface needs to be able to do to satisfy the particular user groups and make them happy. How they start this process is that phase one asks, who are the different personas that we're designing for? You could do a persona exercise and identify these different personas by doing ethnographic research and user interviews. But you could also do jobs to be done, which is basically having a rough idea of who you think these user groups are, and then having a very umbrella-like conversation with them and asking them what they need from the interface. Does that answer your question? Yeah. So, great talk. Really interesting. Thank you for giving it. I'm not a designer, but I'm a social scientist.
And I really like this idea of toolkits, because methodologically, in the social sciences we oftentimes do a lot of the things that research-focused designers do. And in my experience, when it comes to finding the correct tools to do things within the social sciences, it's like, you've got to read a book. So-and-so published a book 20 years ago and it's got some ideas in there. I was wondering if you could talk a little bit more about how you got this community together to decide what tools go into the toolkit. How did you do that outreach? I know you spoke a little bit about voting on various things, but I was wondering if you could give a little bit more detail. Yeah, absolutely. So as I mentioned, and feel free to stop me when the time is up, when I started, I started really umbrella, and I was just thinking, oh, I'll just start creating tools for all the different UX research tools that I know about. Basically, in the community, we have calls. The calls are all open for everyone to watch on YouTube. If you go on YouTube and type in UX research calls, Bitcoin Design community, you'll probably find the call where we voted. We basically had four UX researchers, everyone from different backgrounds. First, we opened with a conversation: hey, how do we split the toolkit? What would be the most sensible way of splitting the toolkit? That was question number one. Then we decided on the two areas. And question number two was, in your experience, what has worked best for understanding and gaining insights at this phase of the product? And then we just jammed. We made a few sticky notes and then we voted, literally with smiley faces, and there were no names attached to the smiley faces. So it was really open and transparent.
And yeah, if there are any UX researchers in this room, I openly invite you. And I know there's already one in this room. I openly invite you to come work on the UX research toolkit, because it's completely open to everyone. I want more minds and more insights on it, because I believe we can only build better products if we build together. That's the selection. Yeah, you can find me on Twitter. If you reach out to me on Twitter, I'll direct you to the Slack channel and we can, yeah. Thanks for the great presentation. I'm really curious: now that you're working on this research toolkit and you want to take it somewhere, what is the next step? How do you make sure that the people you want to reach with it can actually start using it? I already have an idea about that. I love that question. I think I have to stop now, so I'll wrap up very quickly. My next goal, once the toolkit's built out, is to reach out to all the open source projects. I'm going to go knocking on everyone's door and say, hey, can you come help me test out this toolkit? Can we run it through? Can we do a bit of a sprint in your project? And then I'd like to get feedback from them, see how they experience it, build case studies out of that feedback, and then show other projects: hey, Bitcoin Core used the toolkit. Look, this is how it worked for them. This is the case study. And then use that as a way to move forward. Cool. Thank you, everyone. |
Practical UX at OpenProject
Musing after 1½ years of working on the UX of open source software |
So, this is the second talk related to OpenProject. I don't know if anyone went to the talk about integrations yesterday, anybody? All right, designers, I get it. So, this is Practical UX at OpenProject. I've been working there for a year and a half, and these are musings about the things I've learned and the things we've gone through. It will all become clearer in a second. I'm Parimal, and that's a very elaborate slide to tell you that my name is Parimal and I'm a UX designer at OpenProject. So, I think we can all agree that the UX in Open Source could use a little bit of work. I think it's fair enough to say that, right? We know the reasons, but very quickly, here are some of the challenges we've had. Dated design. When you have dated design in software, and Open Source tends to have this sometimes, it is not necessarily intuitive, which makes it less attractive to new users, because they're comparing it with a lot of other, newer tools that tend to have all this shiny stuff, right? There also tends sometimes to be inconsistent behavior. The good thing about having all sorts of contributors over the years is that it's open: anyone can do things, can add new features. But if you lack a consistent, coherent design language, then you do have the problem of inconsistency over time. There's also a lack of investment in designers. Now, I think in Open Source there's a lack of resources in general, right? Not always, but there tends to be this problem, and design also tends not to be the priority for a lot of projects. There's also this tendency of Open Source projects to be engineering-driven. That's not necessarily a problem, because it can generally mean very fully functional, very feature-rich software, let's say, but that does not necessarily translate to beginner-friendly software, so again we don't manage to attract new users.
UX and accessibility tend not to be immediate concerns for developers if they're not topics that concern them directly. This also means that they tend not to be prioritized, for that same reason. However, there are some positives to UX in Open Source. UX has also taken this dark turn through the Facebooks and the Microsofts and the Googles, where there are a lot of dark patterns, and UX designers at a lot of companies are encouraged to produce these dark patterns. We tend not to have this in Open Source, because we tend to want to create things that people like, and we're sort of bound by these values. I think the fact that we're at FOSDEM means we have these values, and there's a spirit of sharing, which is why we're all here. And things are getting better. I think that some of the talks at FOSDEM are showing us this. So that's the point where we're starting from. But then we joined, Mark and I. He's not here, but hi Mark. We joined OpenProject in August 2021, and we were quite impressed, because OpenProject, for quite a small company, decided to invest in two UX designers, which was quite a big deal for an Open Source project of about 30 developers. We have a product team of the two UX designers and the CEO. We take product decisions, but we also try to work on the user experience. The reason I joined OpenProject, because I come from a background of working in agencies, communication agencies, a different world, was all the values that you guys share, that we all share, and seeing that a company was willing to put the resources in. And the goal was to improve the user experience. What does this mean for OpenProject? To tell you this, I'd like to tell you a little bit about OpenProject, because otherwise you won't know what I'm talking about.
OpenProject is an Open Source project management and collaboration software. I'll show you the screenshots in a bit, but I thought I'd first tell you how we're organized, so you get a sense of how we compare to other Open Source projects. I said we're a team of about 30: about 15 devs, full stack, front, back, all that. We've got three product people, like I just said. We've got one QA, which I'll connect to the user experience a little bit, because if we can't test, it doesn't quite work. Business, marketing, some HR, and then sales and support. That's it. That's our entire ship. We've got two versions: the free community version that you can just download and install anywhere, and the enterprise version with support, quite a classic model. You can install it on premises or in the cloud. We also offer hosting within the EU, if you don't want your data sent anywhere else. You can see some of our major clients, including Deutsche Bahn, who spoke earlier. The values that are very important to us are privacy, data security, and digital sovereignty. These are things that we're very, very passionate about, as are a lot of other projects who are speaking today. It looks like that. A list of work packages, tasks. This is actually what we're using to work on the new version that we're going to release in a couple of, well, we'll see when it gets released. It'll be released this month. We work publicly using OpenProject, obviously. You can see a bunch of features, some epics we're working on. You could filter by priority, by status. I think you get the idea. You could go into a bug report that I submitted. This is all public, by the way. If you find a bug, you can also submit one. You can see the description there, and the activity, where you can really exchange with other users. You can tag them. That's very important to us as well. You can connect to other files. We talked about integrations. We have an integration with Nextcloud.
If you upload files to Nextcloud, you can link them to tasks and work packages in OpenProject. I think this is a very common view, a Kanban view, if you want to drag things around. We have that too. We have a team planner. If you want to organize your team, see who's doing what, and reassign tasks, you can do that. That gives you a sense of what the product is. Let's go to UX at OpenProject. Some things have happened since Mark and I joined. We now have a design team. Previously, we didn't have one. It wasn't a priority, which was quite normal as well. Now, we have two UX designers and a QA person, Ivana. We used to do mockups in PowerPoint. It did the job. Now, we're working in Figma. Hopefully, we can move to Penpot at some point. We had a rudimentary style guide of HTML elements. Now, we have a growing design system with reusable components that we're going to look at in a bit. It was quite engineering-driven. You can see that in the design as well. Now, we're working alongside devs to get some early feedback. We're not at loggerheads. It's not devs versus designers. We're trying to work together from scratch and get early feedback. These are things that we are changing in our way of working. We all know the benefits of better UX, but for us, we realize that even for clients that have used our software for a very, very long time, if we can make optimizations, even minor ones, that's worth the time and effort. Of course, attracting new users is very important to us. We're not trying to only be the open source project management tool. We're also trying to compete as an alternative to the proprietary tools. The challenge, if you read the abstract, was something I promised, and I will try to deliver. We designers have a temptation to want to, say, be helicoptered into a project and design everything from scratch. I see that. I'll put something on Figma. Let's do this. It's not that easy.
I think you guys know this, because when you're working with an open source project, there are certain contextual constraints, let's put it that way. These tend to be quite large applications, so you can't just change one thing and propagate it throughout the whole application. For example, the same element, say a drop-down or a button, can appear very differently in different parts of the application, and not just visually. It might be implemented differently in all those places as well. Because again: multiple contributors over the years, no coherent design language. Sometimes you can even have different code layers. In our case, we have Angular and Ruby, which means the views aren't rendered quite the same way everywhere. Then there's time. If you come up with a new design, it then has to be developed, and that's time and effort that's competing with new feature development, with maintenance, with bug fixing, and with general dev time. You can't just say, I've designed this, implement it, and we'll move on. You've got to balance it with all these other things. Of course, we realize that when you design a new feature, the requirements are very different between new users and pro users, who are used to certain ways of working. You say, oh, no, we're going to change something completely. The risk of that is, of course, the classic XKCD, I don't know if you know this one. There are 14 competing standards and you're like, oh, we can't have that. We need to develop one universal standard that works for everybody. And then you end up with 15 standards. We don't want that. Of course, data analytics often helps. It helps you get a sense of what users are doing, what they want, whether they're getting stuck. You could use Google Analytics, or you could not use Google Analytics, like we do, and not have the data, because it's very important for us not to track our users.
That makes our work slightly complicated, because we don't quite know what people are doing, but it's very important to us that we don't do this, especially because, again, the data would not necessarily be within the EU. Having said that, we managed to ship our first big feature after we joined, Mark and I. It was a notification center. We were very proud of it because we didn't have any process, no UX process, and we still managed to do it. Before this, all notifications in OpenProject were email-based. If you had a new work package, a feature, and someone answered it or added a new requirement, you'd get an email alert. You couldn't follow it within the application. We designed this. Extremely proud of it. It did ruffle some feathers, because we said, all right, now the emails are only going to be digests, not one for every single notification, but some people were used to that. We are extremely proud of this feature, but we also learned that you've got to ease users into these big changes. Perhaps we need to start small. Small means sort of manageable, developable, maintainable, testable, don't-scare-existing-users-able. That's what we mean by small. Reasonable, in other words. So when you want to change these things, there are a couple of different ways to go about it. You can change one thing everywhere in the application. Let's say a button. The button that we had doesn't quite work. We want to change it. We want to make it better. You could make that button the same, the new version, everywhere. But then individual pages would not be consistent, right? Or you could change everything on one page. Let's take the settings page. We're going to update it with new components. We're going to make it standard. But then that standard, across pages, is not very standard, is it? You've just shifted the problem a little bit. So the approach we've gone with is that we want to standardize the elements first.
Remember how I told you that the same elements could be implemented in different ways in different places? Well, what if we first standardize the implementation, the HTML, the actual code of that element, and then modify it once we've come up with some sort of consistent way of doing it? That'll also help us understand how these elements are being used and the different variations we'll need. I'll give you an example. For a design presentation, this has been a lot of words, hasn't it? More images, please. This is the date picker that we used to have, and that's the date picker we have now. We changed it because we introduced a new feature, which was taking the work week into account, so being able to say that Saturday and Sunday are holidays, or it could be Friday, depending on your company. So when we did this, we took the opportunity to upgrade it, and beyond the aesthetic change, you'll notice that it was an opportunity for us to work on all these elements, to try to better define what a button should look like, what an action bar should look like, and what the blue should look like when you tab across the application and change the focus, because that wasn't very consistent either. So we took this opportunity, new feature development, to create these components, and not just things that are visible, but also things that are invisible, like this, which we didn't quite have in the past. That does mean that, for now, you won't find this all across the application. There are parts of the application that don't respect some of the things we've defined, but we're going iteratively, one step at a time. And to do that, we need a design system. We call it the single point of truth. So it's a SPOT, and if there's a component that has not been included in the design system, we say it's not been SPOT-ified. They can't really sue us. So what is a design system?
For us, it's a consistent, dependable, reusable, and documented set of things. These things could be styles: colors, spacing, the things I showed you. They could be components like buttons or text boxes, any of the little components that bigger interfaces are built out of. And then patterns, which are combinations of these things. And the last bit is very important. You've got to document it and tell people how to use it, when to use it, what to do, what not to do. And of course, being an open source company, we publish all of it on our website, so anybody can go and look at it. So now when we say spacing-2, everybody, the developers, designers, and testers, knows what that spacing should look like. Same thing for action bars, with the different variants. We go to our website and try to explain those variants and how to use them. And this is public on our website. It's not complete yet. We're still working on our design system, so not all elements are there. But we are working on it. How's the speed? Am I speaking too fast for anyone? It's all right? Okay. Well, I'll take a bit of a break and drink some water. You can appreciate our design system. So in some startups, they say build fast and break things. We can't afford to do that. That's not our mantra. Our mantra is to build slow, iterate, and test things. I say test things. We've not quite done that yet. That's our goal for this year. So we want to balance UX optimizations with feature dev. But like I showed you with the date picker, perhaps if we spend 40% of our time on feature development, we can take that as an opportunity to optimize as well, and maybe choose five core elements, fold them into our design system, and try to make those consistent. That means there will be imperfection, and there will be inconsistencies within the application. Our goal is to accept that imperfection and to keep moving.
Because again, developers don't like it when things are inconsistent, and some developers want it to be pixel perfect. Well, sometimes it's not going to be. And for those of you who speak French, the better is sometimes the enemy of the good. Sometimes you've got to accept the good enough to move forward to where you want to be. Very philosophical, I know. So, the learnings from the past year and a half. I'm very happy to be able to put that half symbol there. I thought it looked quite cool. The first thing is that you can't just join a project, be helicoptered in, magic-wand the whole thing, and say, great, we want to change everything. Let's do it. Upgrade it to 2024. You need incremental upgrades, because you need everybody in the company to believe in what you're doing and be convinced it's worth the effort. Luckily, we have that sort of project. But it's not an easy thing to do, because there are competing requirements and competing priorities for how the company should use its time. So that's important to accept: incremental. Standardizing things now means a lot of time spent now. But down the road, it gives us more time to do more relevant work. It takes less time, because things are already standardized. If we change a color, if we change the design, it will change all across the application. We need to be pragmatic and try to make something a little better with every release, not go for everything in one release. But sometimes we do want to go for those big changes, like we did with the notification center. It might ruffle some feathers, but those changes are required, because we have a vision for where the product needs to be. However, we've learned, perhaps the hard way, that we need to ease in our users who are used to the old system and help them understand why we're doing it. In that regard, we also plan on communicating more from the product team, perhaps publishing the why and the how of how we work and why we take certain decisions.
And finally, we could learn from other open source projects. As designers, I mean, because developers do it quite a fair bit. I don't think we quite have that culture amongst designers. We're currently working on theming, and I saw a presentation by Matthew, Matthew Brea from Proton, who was in Paris last year. He was explaining how his company worked on theming, all the things that he did wrong, and what he learned from them. We were able to use that and learn from it, and that helped save us time and get a better sense of what we were doing. So thanks to Matthew, who's not here. I think that sort of sharing is very important. We should do more of it. And we really hope that this presentation can maybe one day help someone as well. So what are our plans for 2023, then? More UX. Shocking, I know. So more UX means more work on the design system, or rather, continuing to work on it. More testing, because we've not really done any testing so far. We haven't had the resources, but that's one of our plans. Even a basic hallway test, where we ask five people what they think and whether they understand the feature, is some input. We want to integrate more with other tools. Part of doing that is to take the design system and make it easy for someone creating a plugin or trying to integrate with OpenProject to use those design elements as well. So that's one of the reasons we're working on the design system. And of course, better theming, like I just mentioned. Accessibility is something that we have in mind, but we could do a lot more. We could be a lot better at that. So that's one of our goals. And of course, we've got the usual suspects: the ongoing optimizations and new features along the way. The goal, of course, is to have more users who say they picked OpenProject for the UX, not just because it's open source. Of course, we want them to pick OpenProject because it's open source too.
But if they say, we tried a bunch of tools and we think you guys have really good UX, that's our goal. But why stop there? Let's go further. A broader goal is to build an alternative to Microsoft 365 together with all the other actors, some of whom are here at FOSDEM, right? So Nextcloud, Open-Xchange, Element, XWiki, Collabora. With all these other players, let's give users an alternative that's respectful of their privacy, that is open source, and that respects data sovereignty. And if you're passionate about these things, I'm certainly not telling you that these companies are hiring, but they might be. Just saying. Thank you very much for giving me some time. If you have any questions, I'll take them now. Thank you. Yes. A lot of what you said reminds me of when I was at the Open Food Network, dealing with some of the things that we were built on, like Spree, and it was very component-messy: change it here, but don't change it there. My question is, how beneficial or not is having your CEO on your product team? That's a great question, actually, because designers can also go too far. We want to change things very, very quickly, but there are some very pragmatic questions that are development-related, and also, let's say, related to the product roadmap. We think it's quite helpful, because then we can have the discussion internally and align ourselves rather than be at loggerheads again. Just like you can have designer versus developer, you can also have designer versus product. We try to avoid that. So we think it works well. Yeah. Yes. Sorry. So the organization I work for, my university, has been a paying enterprise user of OpenProject for probably a year and a half, two years. Glad to hear it. I use it, like, every day. I really like it.
Universally, 100% of the negative feedback I've received from my colleagues on OpenProject has been explicitly related to the UX. Yeah. And then pretty much all the positive feedback I've gotten from my colleagues has been related specifically to the UX improvements that you showed in the last, you know, 10 minutes. So these improvements are really important to us as an organization, but I realize that, especially in open source products that are oftentimes very privacy focused, from my perspective it seems like usually it's the loud, first adopters. Yeah. Who get priority among all of the customers, regardless, because there's just not an easy way to measure. Do you have any suggestions for an organization that's paying for the product, that really appreciates this work and wants it to be done, on how to tell, I don't know, your boss or someone, that this work is important to us and matters? Okay. So I'll try to repeat the question. Yeah. So you said that one of the biggest complaints with OpenProject was the UX, and some of the positives in recent times have been related to the UX. And the question is, sometimes the early adopters tend to be the loudest and tend to get heard the most. So how do we get better at listening to the community in general? Yeah. Okay. Customers of OpenProject. So what recommendations would we have for them? Well, we have two things we can say. First of all, our community discussions are open. So if you feel very passionate about a feature or a certain optimization, we really welcome that feedback and we will engage with you. It can sometimes be a bit tough to be on all fronts, so it's true that feedback can sometimes get a bit lost. We tend to track it in OpenProject. But our biggest recommendation, perhaps, would be to engage with the community directly via, yeah, via the community. And also, if you have an idea, submit a pull request. That's also very, but that's not fair for a lot of people because, yeah. 
Well, thank you for using our project. Yeah. Yes, we will test more. We'll start small with how we test. And this quarter, actually, we have quarterly goals, and so our goal is to test more. And there's a question back there. Perhaps it's the last one. Hi. Do you expect designers to come to your project, being open source, in the UX? How? Oh, other designers contributing. Oh, we've not had that yet. That's a good question. We might have to talk about it. So we have our design system that's quite open and everything. I mean, you could submit a pull request for pretty much anything, including our design system, but we've not quite had external designers. We are obviously interested, so we can talk. Yeah. Thank you. Do we have time for more questions or are we? One more question if there is. Oh, hi. Thank you. Thank you. I'm curious, since you mentioned you haven't done much user testing so far, how have you been able to prioritize, or how have you decided on which things you want to release first, for example notifications? That's, yeah, that's a good question. When we first arrived, there was a question of what the priorities should be. But there are a lot of issues with OpenProject that we already know about through feedback we've got from our users. We already have a backlog that's, I wouldn't say considerable, but we do have a backlog of known issues. So we thought we'd work on those first, because customers have told us that certain improvements were needed. And the notification center was actually a passion project initially within the company. But it was also because we got some feedback about improvements that we could make. And it was also a feature that I think was quite required. It's quite a basic feature as well. So we were very happy to work on it. Thank you, everybody. And if you have any questions, we can get them. |
Open Source Firmware status on AMD platforms 2023 - 4th edition
OSF on AMD 4th edition |
Hello, and welcome to my presentation on the open source firmware status on AMD platforms. My name is Michał Żygowski, I'm a firmware engineer from 3mdeb, we are based in Poland. I'm a coreboot guy, mainly developing support for various platforms, mainly Intel platforms. I also maintain the Braswell SoC and the PC Engines platforms, and I'm also the OpenPOWER System Software Technical Workgroup chair. And I've basically been playing with open source firmware since 2017. For those who do not know 3mdeb, we are in various places: we're coreboot licensed service providers, we're UEFI adopters, we also provide consultation services for fwupd, the firmware update project, and we also do some Yocto stuff. So, a little bit of information about the microarchitectures that I will use throughout the presentation. In previous years I have been talking about various older platforms as well, like the PC Engines apu1, KGPE-D16, and apu2, and those are the old microarchitectures called Puma, Bulldozer, Piledriver, and so on. I have listed some sample platforms so that you can sort of imagine what I'm talking about if I mention some platforms. And there are also the newer mobile SoCs from the Ryzen series, which are supported by coreboot. These are the Picasso and Cezanne microarchitectures, the newer APU series based on the Zen core architecture. And we'll also be talking about the newest developments like the Mendocino, Phoenix, and Glinda SoCs. Note that they could previously be known under different names, because the original target names were actually embargoed. So you may have heard earlier of the Sabrina or Morgana SoCs that were being developed for coreboot. Those have changed their names to Mendocino and Phoenix. So, to review last year's status: we had Picasso and Cezanne development quite complete, but there was no FSP yet for Cezanne. This has actually changed: the FSP was published in September last year. 
There was also development of the PSP firmware A/B partition recovery support in the amdfwtool in coreboot. One of the patches was merged, and the second one that I linked above is probably duplicating the functionality. So the whole A/B scheme is used for recovery in case of corruption of the first partition, or if the memory configuration after some changes in the setup application would make the system unbootable. Then the PSP can, for example, switch to the B partition and still be able to initialize the memory and boot the platform. There was also an effort to implement PSP firmware extraction with the same tool, but it's still not merged. It seems that the activity there has pretty much stopped, and there's no further work on this feature. Yeah, and I mentioned last year that there were new patches for the Sabrina SoC, so Sabrina has changed its name to Mendocino, I believe. And yeah, I said that in 2021 there was a rather negative attitude towards the old AMD platforms that were entirely open, with full source code for the CPU initialization. If you'd like to see more details, please refer to the coreboot leadership meeting minutes or the mailing list. I'll also be talking about it later in the presentation. So in 2022 we also had many new features and option deprecations with each release. Those deprecations sometimes cause platforms to drop out of the master branch, and they are moved to a stable branch. These new releases tend to move the coreboot drivers and code quality to a higher level by replacing the old interfaces that were, for example, buggy or had some kind of assumptions that let some old platforms boot, for example. But they were hiding some small errors because of those assumptions. 
So it is necessary to bring these new interfaces to the old platforms as well, but sometimes there's not enough time, resources, or hardware to test everything, so some platforms will naturally be moved to the old branches. But thanks to companies like PC Engines and a few coreboot developers, certain platforms get the required support and the new interfaces are implemented. So platforms like the PC Engines apu2 are still in the master branch, but some other boards were unfortunately dropped. So another year came with new coreboot releases, as I said, and many old AMD platforms were actually dropped. Among those platforms we can count the PC Engines apu1, the MSI MS-7721, the Lenovo G505s, et cetera. That is a little bit unfortunate to see, but there was actually a lack of any testing or maintenance that could bring those platforms up to the newest code, despite our sincere efforts. We tried our best to implement the new interfaces and we sent patches quite early, it was in May 2021, but the problem with reviewing the old code is that there is not enough understanding of how those old AMD platforms work, because the initial code that landed in coreboot was of maybe proof-of-concept quality and depended on the AGESA code that was published by AMD, and that was not of the best quality. So if something didn't work there, it was hard to actually locate the bug on the coreboot side, and there were also promises that the code would be maintained and cleaned up, but it didn't really happen. So this burden of maintaining those old platforms was shifted to coreboot developers. Through all those years, the code didn't improve that much, so in such circumstances the platforms will slowly, naturally die, and from the coreboot master branch they will be shifted to the branches created during each release. We tried to save one of the old platforms, which is the KGPE-D16. 
We have sent over 50 patches that implement simply the first booting stage of coreboot, so this is just the beginning, but already the mass of code is enormous, and we lacked the reviewers to actually process all those patches. So yeah, it is pretty hard to maintain those old AMD platforms, because like I said, there is very little understanding of how all those platforms really work, despite all the documentation. Intel platforms are more straightforward, I would say, at least for other developers. That's how it is. Okay, about the new SoCs. So the Cezanne development has been stabilized, there are many Chromebooks running on it, and we have even seen two non-Chromebook platforms that have been sent for review by Star Labs. These two laptops, unfortunately, cannot be built without the blobs, or rather they can be built, but the build will not produce a functional image, because certain PSP blobs that are required, for example, for the memory initialization, cannot be there due to the NDA stuff and so on. So this is something that also needs to be solved. But all other components are actually in place, and you can track those patches and see their progress. As I said, AMD Mendocino and Phoenix are quite new architectures that are being developed in upstream coreboot. Mendocino is in a more advanced state than Phoenix, Phoenix is the newer one, and the FSP is not published for those microarchitectures yet. You can also notice some PSP blobs that are present in the AMD blobs repository. AMD Glinda is the newest microarchitecture that has seen the sunlight. There is practically no information about it in public. It may also be an embargoed code name, so it may change in the future, and I do not have much else to say about that yet. 
But regarding the blob situation: Star Labs could build the firmware for their laptops, and probably boot coreboot, but they cannot publish the necessary blob that is provided by the board code to initialize the memory, because each design may be different and requires a different configuration describing what the memory topology looks like on the given platform. So you have to provide that in the board code, and this becomes the problem, because Chromebooks have only soldered-down memory, and there is no support, for example, to build the blob with a memory configuration for a platform that has SO-DIMM modules. I believe this is the main reason why this cannot be built into a functional coreboot image. On the PSP: all the PSP blobs and the FSP are present in the AMD blobs repository. You can check the license in the repository. It is quite permissive. It allows redistribution, so it is similar to the Intel FSP and microcode licenses. And of course, you cannot decompile, disassemble, et cetera. We also took a quick look at the FSP release intervals, and these are quite delayed, I would say. For example, between the FSP for Picasso and for Cezanne, there is one and a half years of distance between the initial commits in the AMD blobs repository. And, for example, the Cezanne FSP was released one and a quarter years after the launch of the processors themselves. So as you can see, the delay is quite huge. For Intel, I would say the FSP is available a few months or half a year after the initial release of the CPUs. So it is a much more stabilized ecosystem for Intel, but FSP for AMD is quite a new invention. I would say it's maybe three or at most four years old, while Intel's FSP has been on the open source firmware market for over seven years or so. Regarding the server platform status for AMD, nothing much has happened here since last year. 
Like, we had those initiatives, for example from Google's Ron Minnich, who showed the AMD Rome and Milan proof-of-concept LinuxBoot or oreboot stuff, but nothing else happened. At the recent OSFC 2022, which was in September in Sweden, there was also something new, similar to what Ron Minnich showed. Bryan Cantrill from Oxide showed a proof of concept of their firmware on AMD Milan, which was also based on LinuxBoot stuff, but they had implemented some bare initialization of PCI Express so that storage and other I/O could also work. It was all very, very basic. So nothing close to what you would normally see in UEFI from independent BIOS vendors. So to sum up, what is actually the future of the AMD platforms in coreboot? For sure, the Chromebooks will be gaining support. They are backed by Google's cooperation efforts, they have partners, they have coreboot developers working at AMD, and their partnership will make the project successful, for sure. But for such old platforms like the KGPE-D16 or all the models that I mentioned that were dropped, it's pretty difficult to actually keep them in the main tree, and it requires significant effort to actually maintain a board. So either there must be a community that is willing to test coreboot images on their devices, but on the other hand, who is brave enough to flash their daily laptop and possibly break it because something didn't work? So it is also not for everyone. Also, the initial code that was published for those old platforms was just getting older and older, and the actual cost and effort that would be required to bring it to the quality we would want is getting bigger and bigger. So while, for example, the model of dropping in the silicon initialization vendor code is good for the initial launch of a platform, because you have everything modular and you just tune it, in the long term it is not maintainable. It can speed up the platform shipment by maybe 50%. 
But in the long term, it is a burden for the project that is supporting this vendor code, and then we end up in a situation where the boards are simply dropping out of the tree. So in such conditions, all the old AMD boards will naturally die and will be moved to the branches. So we may expect further platforms being dropped, and I think the next one in the crosshairs is probably the PC Engines apu2. For now, it supports all the recent interfaces, but I have no idea how long that will last. We also think that the lack of a strategy for long-term support by the coreboot leadership is largely decreasing the value of the project itself, because many people don't like their boards being dropped. If they clone coreboot, they clone the code from the master branch, and if they issue make menuconfig and want to choose the board, they see it's no longer there. So basically they think, oh, it is not supported anymore. So the value is decreased, because people will see that something is no longer supported, and they lose their hope in the project. So a new strategy would be required to actually keep those AMD platforms alive in the tree. What we think can save those platforms is the Dasharo firmware. We are open to any AMD outcasts from coreboot, and we're working hard to prepare a strategy that will make the support for these platforms more sustainable. We have also often disagreed with the current coreboot strategy at many coreboot leadership meetings, and we offered various ideas like crowdfunding and other things that could potentially solve the problems that we were noticing, but unfortunately those were rejected. If you'd like, you can look at the coreboot leadership meeting minutes for more details. I also have a short announcement for you. Unfortunately, the official PC Engines open source firmware support has been ended by the PC Engines company, and v4.19.0.1 was the last version that we released in the format we have been following for the past few years, monthly. 
But do not worry, our commitment is still strong. We want to further improve the PC Engines firmware, but this time we would like to release it under the Dasharo brand with new features like a UEFI interface, DMA protection, a setup application, a BIOS setup password, and stuff like that. But it will be only a community-driven project. We will not have any funding for that. So your support is actually crucial in determining the roadmap and what the pace of the development will be. So I encourage you to take our survey. Your input is very valuable to us. If you have any insight or want to support us, please do. So to sum up, we want to make Dasharo a paradise for old AMD platforms and outcasts. Of course, like I said, it will be community-driven, so we want to make it a success and have the community involved as much as possible. So what we offer with Dasharo is ready-to-go, tested binary releases with transparent validation. We publish all the test cases on our documentation page. You can use our hardware and software tools, which we have validated and use daily during development, and which can help you in case of any recovery or even deployment of the firmware. We also provide high-quality documentation, which explains all the caveats and procedures for the updates, deployment, and recovery of a platform. So if you want to know more, please feel free to sign up for the Dasharo newsletter to get up-to-date information about new features and our releases. You may also find the new initiatives and new projects that we plan on the Dasharo documentation page, so I encourage you to visit it. Of course, I will gladly talk more about the AMD platforms, because I'm pretty short on time here and cannot explain every small detail. So if anyone is interested, we can go later in the evening for some beers and talk a little bit more. We can sync on the FOSDEM Matrix, or you can join the Dasharo Matrix as well, where we are more responsive. 
We can just come up with some good place, I think, and talk a little bit more. 3mdeb is also planning to hold two events in March. The first will be the Dasharo User Group, which is a forum for Dasharo users. This will be a small event with a more structured schedule, where we will discuss different topics related to Dasharo. And right after that, we will hold the Dasharo Developers vPub, which will be a more relaxed event where we will discuss any topic the community wants and is interested in, and you are all invited. You can see more details in the blog post I link in the slides. So that will be everything from my side, and I'm open to your questions. So I have a question, actually, regarding PC Engines. After like two and a half years of waiting, I just got my board before Christmas. And I got it because it has support for coreboot, but I thought that it was the guys from PC Engines themselves who were actually developing coreboot for their boards. But from what you're saying, somebody else was sponsored to build coreboot, right? Okay, so the question was who was actually developing coreboot for the PC Engines platforms. So the PC Engines company was responsible only for producing the hardware and distributing it to the users, and the initial support was developed by a company called Eltan. But then they shifted the coreboot work to 3mdeb, so to us, and we have been maintaining, improving, and releasing the coreboot binaries each month since 2016. So all those monthly releases were our effort. Okay, Thierry? I understand that coreboot is not building the ROMs directly to test the platforms. So the burden is on the users to build stuff. Do you have any insights for facilitating testing in the future to prevent the boards from being dropped? Yeah, so the question was whether we have a strategy for the boards being dropped, because it is the users' burden to build and test the ROMs. 
So like it was written on the slides, we are still working on the strategy for long-term sustainable support for old AMD platforms. But we want to take the burden of building the ROMs off the users and instead provide tested images, at least tested as far as we can test in our environment, in our laboratory. And if someone still has some issues or has some feature requests, we encourage them to create issues on our GitHub, and then we consider what the problem is, what the cause of the problem actually is, and how to solve it, and we try to narrow it down and possibly include the fix in the next release. But of course, as I said, since it is a community-driven effort, the pace will depend on the level of support. So if somebody is able to contribute in any way, for example fix the code themselves, or let's say provide logs or flash our testing binaries to check or gather more information, because we may not necessarily have the exact hardware configuration, then it may be the right step to make users less burdened with all those troubles with flashing and breaking stuff. Okay. Any more questions? Okay. Maybe. Okay. Let's change a little bit. Yeah. So the question is, what is the state of the patches? And why is there actually a problem getting the old AMD boards upstream? So like I said, this code is very old and it was committed initially in very bad quality, with promises of maintainership from AMD, which didn't happen. So with each year, the code was just getting older and older, and the people who actually knew about it are no longer a part of coreboot, or they are simply retired or out of the business. And basically, there are very few people who can actually understand what the given code does and give some meaningful review for those patches. So that is why it is very difficult to get something upstream for such an old AMD platform. Okay. Thierry. You said that all your platforms can be in a different branch. 
The branching model, is it model-based or is it coreboot-version-based? Let's say that some patches were made for the KGPE-D16 to fix what is missing for it to be upstream. Would that happen in the branch that is, for example, for 4.11, until it reached the point of being back in master? How is it working right now? Okay, so the question is about the coreboot branching model. How does it work and how do you apply patches to the branches? So whenever coreboot issues a new release, a tag is placed on the repository, and the current revision that is tagged is cloned into a branch called, for example, the 4.19 branch, and these branches are by no means stale. They can also be contributed to via the Gerrit review system, but you have to target the specific branch. So even if you hook onto some revision on the branch, for example, and you want to patch it, it doesn't mean that in half a year the branch will have the same tree state. Some backported patches could land on the branch, and then the patches that you kept, for example, as a file, will no longer apply. So it is not the case that the platforms are just lying there with no option to improve. That's not true. You can always send patches to the previous branches as well. So it works like the master branch, but you just have to target the older branch with the patches that you send. So the development might still be active, but it no longer benefits from the new features that are landing on the master branch. Okay, any more questions? Yes, please. There's a point again for the KGPE-D16, which is based on 4.11. We know that Dasharo coreboot is moving for that platform. So can we hope that the 4.11 branch evolves to the point where, well, that was more of my question. What can we see happening in coreboot 4.11, not on the master branch, but on the branch that is still active for that board? Okay, so the question is, what can you expect on the 4.11 branch, for example, with the KGPE-D16? 
Well, that depends on the community. If they were to backport some patches that should land on the 4.11 branch, that is actually possible. But probably the older the branch, the less activity there will be on it. And if somebody would like to backport some features, then it is probably more efficient to do it on newer branches, because the gap is only getting bigger and bigger. If anything, it would be better to work on the fork that we already have, where we rebased all the code and rewrote most of the parts that were in very, very bad shape. So basically, it is not impossible to have something being developed on 4.11, but it is rather unlikely to happen right now. Okay, the time is up, so I can answer any questions later, face to face. And yeah, the next presentation is in eight minutes, from Thierry, about the Heads project update. So stay with us, thank you. |
Heads - status update! |
Hello everyone, I'm kind of nervous because it's been three years of no conferences on my side, so I'm really happy to be here today because, wow. Just a quick screencap of what it was like last time. Heads was never really presented just as a project, because the talk I did at the last FOSDEM that was in person was more of a call for collaboration. I still need collaboration, but today's presentation is more about what happened since that moment, which is 2016, and this is what it looks like right now. People using it are aware, but this is it. So this is the plan for the presentation today: who I am, what is Heads, why Heads, what's new, and what's next. I saw that most of the people here already know what Heads is, so I will try to go faster there so that we have more time to talk at the end. Who I am: I'm Thierry Laurion, I come from Canada, I started Insurgo Open Technologies in 2017, this is me, and I'm currently the main Heads maintainer. I certified the PrivacyBeast, and years are passing so fast, in 2019, and since that moment I've been trying my best to get funding and be paid to actually move things forward in the mission of facilitating access to security and confidentiality for most people. This is a challenge because everybody needs confidentiality and privacy, but they're not necessarily tech savvy, so the goal of Heads right now is more to make it accessible to everyone, reducing the friction, and that's it. What is Heads? 
Heads can be seen as two different things. It's basically a build system, I would say, and the outcome is a runtime environment. The runtime environment is based on coreboot as of now, but it could live inside of UEFI and the DXE. What I'm testing is more a coreboot payload, so Heads right now is more a coreboot payload, and as of now we will call it a runtime environment. So it's coreboot with Linux as a payload: when you configure coreboot, you can configure it to have Linux as its payload. And basically, like I said before, Heads is the build system preparing the kernel and the initrd. You have all the tools needed and the scripts, which are boot policies, which actually create the user interface and everything that is behind it to make it usable. So coreboot here: Heads does an abstraction outside of the coreboot configuration, sorry, and basically, Heads has a coreboot configuration that is inside of the GitHub tree, and it also defines the Linux configuration that will be used, so that only the needed kernel modules are compiled in, and if outside modules are needed, let's say for the wired network card or anything else, they are loaded on demand. So the most important part for us right now is to define Heads. This is the only thing I want to say about coreboot: unfortunately, the openness, ownership, and auditability of the coreboot part of the code will depend on how open coreboot is. As we heard in the previous talk that was given by Michał, we have a problem because fewer and fewer platforms are actually completely open source without blobs, so unfortunately we shift the responsibility to the user to define his own threat model and decide which platform he is going to use to attain the privacy and confidentiality that is needed. 
The idea here is that if you don't have native initialization of the hardware, if you depend on FSP and memory initialization blobs, you have to have faith in those blobs to actually do what is expected. On the models that we have there, like Ivy Bridge, again, this laptop, the X230, we have it, and soon the T440p will be supported as well with native RAM initialization, it's coming. Other than that, we have the KGPE-D16, which Michał again talked about, which was dropped from mainstream, but Dasharo coreboot I think is based on 4.16, I think so. So a lot of other features that were there until 4.16 are merged in. coreboot changed the release cycle, so now it's every three months. Releases are going faster, but it doesn't mean that so much happened in between releases, it's just that the releases are going faster. POWER9 is now supported by coreboot, and it's going to be supported inside of Heads. I came and fuzz them, now I will have the TPM modules to be able to test this at home, and it will land inside of Heads later on. As opposed to open source firmware, this is what you would have inside of a closed source supply chain environment. So this is complicated, I'm just showing that as a reference. Normally you have to have, like I said, the FSP, then TianoCore to provide the UEFI implementation, and after that all the companies are doing a little part, and the OEM is actually responsible for 10% of the code that is there. This is true for all computers except Apple, which is responsible for actually developing the firmware completely, and this gives a good posture because they support the hardware longer. For other vendors, the firmware is not supported for that long, and we saw it with the X230, which is not maintained anymore: there are no microcode updates. This kind of stuff is happening at the firmware level. 
So on the other side, with Heads as open source firmware, we have complete auditability for the platforms that are completely open source. The coreboot source code is auditable, the Linux kernel is auditable, and you can see the source code for that version plus the configuration that is pushed inside of Heads. We depend on Linux tools only to do the magic of Heads, so basically you have module dependencies, and you will see links later on to go directly into the source code to see where the modules are and what they do. This permits us to have a really streamlined policy, which right now is gui-init inside of Heads, and gui-init is responsible for taking over from Linux as init and then pointing directly to the policy that you want. The reason why Trammell did this is that anybody here who would want the policy, which is basically a bash script, to do something else could do it and customize it super easily. Bash is easy to read and understand, as opposed to compiling stuff. Once the tools are there, basically your policy is just saying, okay, I want to show this, and when you click that, it does that. So for developers it's easy to review and see and collaborate as well. The small note that I put there is that the reason why Linux is really interesting inside of firmware is that you can reduce the size of the kernel to only do what you want it to do. So in Heads it's a really small kernel, only containing what is needed to initialize the platform beyond coreboot, and since we are also able to build modules, we can support, for example, USB storage as a module, and then when we need storage from USB we call it on demand. So we measure the modules and after that we load them. And the same thing for a USB keyboard: if you don't need a USB keyboard it should not be there, so that's what happens for laptops. 
That removes a whole class of attacks, because if you don't have a USB keyboard and you don't have HID as a driver, then you cannot have a Rubber Ducky attack, simply because the kernel drivers are not there. So you can reduce the stack, and the attack surface, as needed. So here we are going to talk about Linux as a bootloader. What is a bootloader, real quick? It's what sits between the firmware and the OS to parse the configuration and boot into it. Sorry, my mouth is dry, I'm not used to conferences anymore. So yeah, if you use the kernel as a bootloader, then you're able to completely bypass SeaBIOS or GRUB, which is what we see normally. If you have Linux and you have scripts, you're able to parse the GRUB configuration, display it, and boot into it. The Linux tools that are there are the basic ones that are needed: we have BusyBox, cryptsetup, LVM, the TPM tool stack, and whiptail or fbwhiptail, which permits us to have something like that. That's once you have an operating system installed. If you don't have an operating system installed, Heads will detect that nothing is on the hard drive and will propose that you boot from a USB disk. That USB disk could hold just a plain ISO and the detached signature, if the distribution provides one, and you can verify the ISO prior to booting into it. Again, it's scripts, we can do whatever we want, so that's what we can do. Here in action is the main policy. Like I said, everything inside of Heads right now is based on one policy, which is gui-init. It's not really clear on the slide. Basically, what you see up here is that the normal boot policy is gui-init. We're mounting /boot read-only. We're verifying here: since we have open source, Linux, and the tools, we rely on the GPG tool stack, so that when you first take ownership of the platform you inject your public key.
If you don't have a public and private key pair, you need a USB GPG smart card, and Heads will say, okay, please inject your public key, and if you don't have one it will help you take ownership of your USB smart card. Once this is done, the public key is injected inside of the firmware, and that permits us to sign the content of /boot. So what you see here is that it was signed with my key, and all the content of /boot is verified. The default boot configuration is also signed, so it's verified prior to jumping into it. All you see there is the verification of the content prior to trusting it, and once this is done it asks us for the TPM disk unlock key. I think I will cover that later, but this is an example of what can happen. A recent change inside of Heads is that instead of trying to craft a crypttab entry, we extract the initrd to see what the operating system is trying to do. There are some operating systems that, for some reason, implement two crypttabs. So instead of trying to craft one, we copy the format that the operating system expects and just inject into it the place where the secret is supposed to be. The TPM will unseal the secret if all the measurements are good, so the firmware is good. After that, it checks that the drivers loaded are the same as when we sealed the secret, and it also verifies that the LUKS header is exactly the same. If all of those things are as expected, an additional initrd is created, including the crypttab in question and the secret, and this is passed along with the initrd of the operating system. So we're able to boot into an operating system without typing your passphrase at a boot prompt that you don't know whether you should trust or not. Here it is trustworthy, because everything is measured and has been attested to you, and then you can type your passphrase there because you trust it. So that's basically what we can see.
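The /boot check described above can be sketched roughly as follows. This is a minimal illustration and not the firmware's actual scripts: the function names and the manifest layout are assumptions, and the signature check on the manifest itself (done with GPG in the talk) is assumed to have happened before the comparison.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(boot_dir: Path) -> dict[str, str]:
    """Map every file under boot_dir (by relative path) to its digest."""
    return {
        str(path.relative_to(boot_dir)): hash_file(path)
        for path in sorted(boot_dir.rglob("*"))
        if path.is_file()
    }


def verify_boot(boot_dir: Path, trusted_manifest: dict[str, str]) -> list[str]:
    """Return the paths that were added, removed, or modified.

    An empty list means the directory matches the manifest. The manifest
    is assumed to have been signature-checked before this call.
    """
    current = build_manifest(boot_dir)
    changed = {
        name
        for name in current.keys() | trusted_manifest.keys()
        if current.get(name) != trusted_manifest.get(name)
    }
    return sorted(changed)
```

The design point is that only the manifest needs a signature: once it is trusted, every file under /boot is covered by a plain hash comparison, and any added or modified file shows up in the returned list.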
I have laptops here if you want to talk later on. The demo that is there real quick is something that I can launch on my laptop. I acted last week to be able to present it right now but you can have like a flat partition with ESO inside of it. Now you can just say okay media scan you point it like to the partition containing your ESO in question. It will ask you like what ESO you want to boot. It provides like detach signature so basically at that point what you see is again as verifying the ESO with the detach signature that guarantees like the authenticity of the ISO and the integrity. Once that is done it provides you a list of the grub entries that are there. You select it and at that moment like you see the boot options that were passed inside of the grub configuration and at that point a minute later you're inside of tells. So it permits you to actually sign ESO that you did yourself you created yourself any live ESO that you have once you verify it like the integrity of it. You sign it with your key because you already have it and at that point you can guarantee that like a USB drive that you have with multiple ESO will be bootable whenever you want. And in that case it permits you to boot inside of tells and tells guarantees you that micro indemnization will be done. It's in the memory so if you're in an emergency situation or you're in a coffee that you don't trust you remove your battery you work as you want to and at last minute you just unplug your power cable and there's nothing in memory anymore you left no trace you did what you had to do and that's it. That's really important for a lot of customers and this is one of the most important features of ESO. It permits you to guarantee that what you're going to boot inside is going to be as expected. So I covered that already Linux kernel and it contains the standard Linux tools that we need and the policy as well is responsible to fixate the user experience. 
As a system most of your people are developers. This is a slide pointing exactly in what you should be looking at if you're interested into contributing or customizing it for yourself. One of the good thing that we have right now is that we produce ROMs for the platforms that we support so every time that there's a commit made inside of it Circle CI is actually watching the repository and it builds all the ROMs for all the platform with different cache system and everything. I won't cover that today but this is what is cool like if there is just a change in the policies the ROMs will be created inside of like five minutes if we have cache of what was built before and that's it. It contains the hash as well sorry it's cut at the bottom but each ROM that is produced contains hash of everything that was built and it's supposed to be reproducible. There's funding to make sure that Ed's is backed as being reproducible. Tramil Hudson is helping me on that. It will come in the next year as well. Why Ed's? I think I covered it like already. The main problem that we have like with a proprietary firmware is that there is duplication of code in multiple parts. So we're trying to recreate the wheel and the idea behind Ed's is that we should not do that if we have already a kernel that permits us to drive the hardware that we want. So instead of duplicating at that part here like the drivers to be able to drive the graphics to be able to control your keyboard, the bus and everything the kernel is able to do it and does it like really efficiently. Another reason why we love dealing with the kernel instead is that we can control the IOMMU directly at the moment that the kernel is launched from Coreboot. So at that moment if we specify that we want like to have like the graphic card not having the IOMMU because we're going like to boot inside of Cubes OS for example it can be defined there and it's going to be properly set up for us. Okay, 10 minutes. 
So again, links here if you want to click. The policy that we use, as I said, is gui-init right now. I will cover that in more detail later on. One of the important changes, which I will cover in the what's new slide coming next, is that we have maximized ROMs. A maximized ROM is a valid, complete ROM. It includes Intel ME, it includes an unlocked IFD, and it includes coreboot and the packed kernel. So the images that are there inside of CircleCI as artifacts are externally flashable. We create top and bottom images. For example, when you open up the X230, there are two SPI chips that you need to reprogram externally. Once that is done, you can update internally: flashrom is inside of Heads as well, so you have control and you're able to upgrade as you want. Why Heads? There is extensive TPM usage. The reason why we like Heads is that there are two sealed secrets inside of the TPM. The first one happens the first time that you flash your firmware, and it seals all the measurements that are created by coreboot. Coreboot is configured in measured boot mode and it extends TPM PCR 2. Everybody knows what PCRs are in a TPM, basically. So all the measurements from the coreboot part go into PCR 2. After that, when Heads starts, that is, when the payload that was measured into PCR 2 is actually loaded, Heads extends a couple of other PCRs, which are then used to make sure that the integrity of the runtime environment is the same as expected. So for the first sealed secret: after you flash your firmware, the next time you reboot, Heads will either say that you don't own the TPM, so you have to take ownership of it, or it will say, okay, we have a couple of measurements that were provided by coreboot in measured boot mode, so we have to seal them.
When you seal them, it will create a QR code which you can scan on your phone, or, if you have a Nitrokey Pro or a Librem Key and you use the HOTP version of the firmware, it will ask you to seal that secret inside of your USB security dongle as well. So at that point you have the measurements of PCR 4, 5... no, it's another slide, sorry, no, it's not another slide. So basically the measurements of PCRs 0, 1, 2, 3 and 4 will be inside of the first sealed secret, which guarantees the attestation of the firmware. So you boot your machine, you plug in your USB key, and it will flash green if the measurements from the firmware alone are good. After that, if you set a default boot configuration, you're using all the other PCR measurements to seal the secret, plus the LUKS header, which is extracted because we have cryptsetup, and you seal it with a passphrase that you define at the moment of sealing the secret. So when you're booting, and this is what I showed you before on the other screen, when it asks you to type your passphrase, it's making sure that all the PCR measurements are exactly the same in an attempt to unseal the secret inside of the TPM. At that point, if either your passphrase is bad or the measurements are bad, Heads will just refuse to boot your system and will tell you PCR mismatch, and at that point you should be worried and investigate. So that's it for that. Why is it important? Because the secrets that are sealed inside of the TPM make sure that you're in the state that you expect prior to typing a passphrase. Compared with typing a passphrase in a weird environment, this is an environment that is trusted. So at that point anything that you do inside of Heads, and that opens the door to adding more features, will work as defined. I will go faster because I have only five minutes.
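The measure-then-seal behaviour described here can be illustrated with a small simulation. This is a sketch under stated assumptions, not TPM code: a real TPM performs the extend and the unseal in hardware, the SHA-256 bank is chosen only for illustration, and `SealedSecret` is a hypothetical stand-in for a sealed blob.

```python
import hashlib

PCR_SIZE = 32  # size of one register in a SHA-256 bank (assumed for this sketch)


def extend(pcr: bytes, data: bytes) -> bytes:
    """TPM-style extend: new value = H(old value || H(data))."""
    return hashlib.sha256(pcr + hashlib.sha256(data).digest()).digest()


def measure_all(components: list[bytes]) -> bytes:
    """Start from an all-zero register and extend it with each component."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


class SealedSecret:
    """Toy stand-in for a sealed blob: released only for one PCR value."""

    def __init__(self, secret: bytes, expected_pcr: bytes) -> None:
        self._secret = secret
        self._expected = expected_pcr

    def unseal(self, current_pcr: bytes) -> bytes:
        # Any change in any measured component changes the final value.
        if current_pcr != self._expected:
            raise ValueError("PCR mismatch")
        return self._secret
```

Because each extend folds the previous value into the next hash, the final register depends on every measured component and on the order in which they were measured, which is why a single changed module is enough to block the unseal.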
The reason why maximize boards are important is actually that on an OEM version of the firmware this is what we have. The Intel IFD is locked, the IFD is basically the description of the region inside of the firmware as defined there and ME is also locked. So if you want to take advantage of all the firmware space that is there to add tools you have to unlock the IFD and you have to unlock ME. So the reason why the maximized, this is a configuration of the maximized board. This is the extraction that we have here of the important part. So flash run here like on the legacy board or if you're coming from Ivy Rain for those who are interested in that project is that you can only flash the bias region. And the bias region is only having eight megabytes by default on the legacy or the OEM version. So if we externally flash and we nerter ME we're getting more than four megabytes additional. So if we add that inside of the bias region we're having like 11.5 or nearly 12 megabytes of usable space where we can add more features, we could even add Python there if we wanted which is something that I want for a really long time. So this is what it looks when it's modified. Actually you see that the bias region is equivalent like of XBE for FFF so basically it's near of 12 megabytes. Yeah, I covered that. Actually the reason why we love it for the redistribution, the issue that we have on blood redistribution is that inside of the edge repo we owe the scripts. The scripts are being called by circle CEI to actually download the images from Lenovo. We extract them directly on the CEI. We're able to extract ME from there. We're able to nerter ME from there. Then circle CEI is responsible to build Corgut, use the module that we're extracted, validated and everything and from that moment we're able like from a simple script like that to build a complete run image and we just dodge the legal issues. 
So whatever circle CEI is building like that, I gave it as an example here, provides like artifacts. This is a build from circle CEI. This is like the output of Corgut which was able to stitch everything together and at the end of the build and you have the image I was talking about. So you have the 12 megabyte full image that is internally flashable and you have two image for each commit which are there provided by the circle CEI for 30 days after the moment of the build. That permits you to externally flash and then it permits you to internally flash forever after. The OEM way is like that, yeah sorry. The WipTel support is for either servers and BMC. So it would be usable on the KGP D16. This is what it would look like from a remote serial connection. So if you're connected on the BMC remotely, this is exactly what you would see. The HOTP card dongle could be in your server and the TOTP is what you could check remotely if the time is good and then you have the same option but console based. This is what we see if we're having graphical frame buffer on laptops and desktop. The OEM factory reset, OEM factory reset with ownership wizard was upstream. So it looks like that asking you like to what you want to do, giving you options that are really important if it's not you who have installed your operating system. This is what I pushed forward because if you don't install your operating system yourself then the LUX encryption key could be intercepted. So you want to re-include the content of your drive and once again, since we have Linux, we can do that directly inside of it. You will have like to look at that for yourself. I won't have time to cover that but this is what it is and the main thing that is really nice for developers is that we now have QMU and QVM boards with software TPM support. So all the images, mostly all of the images that I provided were taken from the QMU version. This is what it looks like when you build it. 
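The HOTP and TOTP checks mentioned above follow the standard one-time password construction from RFC 4226 and RFC 6238. The sketch below shows that construction only; how the firmware stores the shared secret and which parameters it uses are not stated in the talk, so the 30-second step and the digit count here are the RFC defaults, not confirmed settings.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP with the counter derived from the clock."""
    return hotp(secret, int(unix_time) // step, digits)
```

If the code shown by the machine matches the one computed on the phone or dongle from the same secret, the secret was released, which in this design only happens when the firmware measurements were as expected. For TOTP this also requires the machine's clock to be correct, as noted in the talk.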
So you can inject your key and then after that you can run it. That's how it looks and then you can add your recovery shell directly from the terminal and you will see the output of WipTel and you can develop and test as you go. And this is what's next. I'm sorry I won't be able to cover it directly but I gave link on everything that the work is happening right now. TPM 2 support is coming so new boards will be able to have it. Cleanroom GPT key generation is the next thing I'm working on to be able to remove the need of having your USB security dongle because people want to test it but don't necessarily want to buy any Trukey which was a problem before. Write protection is coming so if you want to look at this and poke me there or any question, SPI write protection was developed by 3MDeb. It is now in the PR that is supposed to be merged in the next couple of days when I will have a time. International keyboard support will come because right now it's just US keyboard which is a problem for a lot of people and randomization of the mic address will be possible directly in the GBE partition inside of ELS and being reflashed. And this is it folks, reference. I really recommend to watch the original talk on ELS if you have like more in-depth questions because the background didn't change and that's it, thank you. Do we have time for questions? Okay. Yes. You mentioned support for TALUS board PC systems, how do you support them since they don't have TKM? You said the TALUS 2 platform supported? The Raptor, yeah. The question is does the TALUS 2 have a TPM chip? There was a connector that was a really big adventure, I will try to make it really short. The port was done like with 3MDeb here and actually there is a TPM connector but it was not working properly. 
So we had like to do research to be able to actually make it work on the panel but yes, there's no TPM inside because it's against their mentality but you can connect one but it will be the one that you buy from 3MDeb because you have to add a TPM chip. You have to add a TPM chip but it's available from their shop, so yeah. Another question, yeah. Do you do remote attestation if you don't or are you interested in it? It's something that I don't have, oh yeah, do you do remote attestation? Actually VaultBoot did it, VaultBoot is actually a fork of ads. I'm looking into upstreaming the code but I would say that any person that is interested into that kind of feature, I would guide you to see what they did because they have a proof of concept that is really interesting on the Raspberry Pi for remote attestation and there was a board config for Coreboot there as well and everything is there. It's just a trying to upstream but I'm not against it but ads as of right now is more targeting users to actually own and being able to verify for themself but remote attestation. So the question here would be like is remote attestation possible, yes, someone is suggesting to do it like remotely on the phone, it would be totally possible, there's no, please contact me, okay, cool. Another question maybe, no? Sorry for this dense presentation, next presentation I will be better. Thank you everybody. Thank you very much. |
Overview of Secure Boot state in the ARM-based SoCs
2nd edition |
Okay, so we'll start now. The topic of this presentation is an overview of the secure boot state in ARM-based SoCs, and this is the second edition of this presentation. No mic here. It's only for the video. Okay, so the first edition took place two years ago, and now we want to present some updates to the research that we did back then. Maybe you can speak up a little bit. Okay. Okay. Sorry for some... A little technical issue. Okay. Sorry for that. Okay. Okay, so this is the agenda for this presentation. First, I will tell you who I am and where I work, and say a couple of words about our company. Next, I will briefly present what I mean by secure boot in the case of ARM SoCs and how the typical implementation and workflow works. Later, we will show the results from the first edition, so from 2021. Then we will discuss the two cases that we checked for this edition, the MediaTek and the Rockchip cases. Next, we will summarize the whole presentation, looking at what is different between the two editions, and finish with a Q&A session. So this is me. I'm Tomasz Żyjewski and I'm an Embedded Systems Team Leader at 3mdeb. I have worked there for over three years now. Mostly, I work on embedded systems built with the Yocto Project, so I integrate the update system and the OS creation for embedded devices. But because I work with embedded devices, I try to touch different areas of the whole life of the devices. One of the things I work with is system security, and this is the topic of this presentation. So here are a couple of words about our company. We are from Poland, based in Gdańsk. We have been coreboot licensed service providers since 2016 and UEFI Adopters since 2018. Yocto Project Participants, which is the area I work in. From 2019, also consultants for the fwupd project and IBM OpenPOWER Foundation members.
Okay, so first let's explain what we mean by secure boot in the context of this presentation. Here we are focused on the ARM context, where it is one of the features that the boot ROM has. Maybe we should rather call it verified boot, because the point here is that when we start the firmware, each subsequently loaded image is verified: its signature is verified by the previous part of the firmware. So we need to use some private key to sign the binary that we put into our machine. Then we also need to take the public key that corresponds to that private key and put it there too. So when we start the device, we will be able to verify the signature and decide whether the image should be loaded or not. In the ARM context, we assume that the boot ROM is our root of trust. We need to make that assumption because most boot ROMs are closed source. It would probably be better if they were open source, but it is what it is. And basically, the meaning of secure boot can be different for any given architecture, so if we were talking about x86, that would be a different scenario. Okay, so this is the typical implementation. We have the public key, which needs to be written inside our SoC. Different vendors have different ways to achieve that. For example, we can fuse it using electrical fuses, so we write it one time in our SoC and later use it every time we start our firmware. There is also the possibility of OTP registers. Those are one-time programmable, so, as the name says, they can also be written only once. Either of those two possibilities allows us to make our boot ROM a root of trust, which we later extend into a chain of trust. The next components can use different keys. In the case of secure boot, we talk about the step between the boot ROM and the bootloader.
Here we have one private key that was used to sign the binary and the matching public key that is used to verify it. But the later steps of starting our machine can use different keys. And yeah, that would be it. So typically we have, let's say, our host machine, which should be in some secure location, and our target device. On our host machine we generate the private keys, build the binaries, and use a specific tool to sign them. Later we need to take the public part of those keys and the signed binary, put them on the target device, and then try to verify the binary. If the verification is successful, we just boot the next step. If not, then different things can happen depending on the vendor that we are using. Let's also say a couple of words about what it means to sign a binary. What is probably common between all the vendors is that signing a binary means that we take the original binary and add a header on top of it. This header contains the digital signature and also some specific format at the start, which can differ across vendors, or even within one vendor when we are using different signing tools. So if we sign our binary, we can always dump the first couple of bytes of our signed image and check that everything is as expected. This is a quick recap from the last edition. We looked at 11 cases. Five of them, NXP i.MX and Layerscape, ST STM32, Xilinx, and NVIDIA, looked fully open, let's say, and there were no NDA problems with them, so everyone could get the documentation and the signing tools needed and try to implement secure boot on them. We have a couple of cases where there is information that secure boot can be enabled on a given machine, but it is under NDA.
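The idea of a signed image being the original binary with a header on top, and of dumping the first bytes to sanity-check it, can be sketched like this. The layout is entirely hypothetical: the `SIMG` magic, the field order, and the use of a plain SHA-256 digest in place of a real signature are assumptions for illustration, and every vendor tool defines its own format.

```python
import hashlib
import struct

MAGIC = b"SIMG"  # hypothetical magic value, not any vendor's real format
HEADER = struct.Struct("<4sI32s")  # magic, payload length, digest (stand-in for a signature)


def wrap(payload: bytes) -> bytes:
    """Prepend the illustrative header to the original binary."""
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(MAGIC, len(payload), digest) + payload


def inspect(image: bytes) -> dict:
    """Parse the header and report whether each field looks as expected."""
    if len(image) < HEADER.size:
        raise ValueError("image is shorter than the header")
    magic, length, digest = HEADER.unpack_from(image)
    payload = image[HEADER.size:]
    return {
        "magic_ok": magic == MAGIC,
        "length_ok": length == len(payload),
        "digest_ok": digest == hashlib.sha256(payload).digest(),
        "first_bytes": image[:16].hex(" "),  # what a hex dump of the start shows
    }
```

Checking the first bytes this way catches the common mistake of flashing an unsigned binary, or one wrapped by the wrong tool, before any fuses are burned.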
Here we're talking about Marvell Armada, Texas Instruments Sitara, Qualcomm, and Microchip. And we also talked about two SoCs from Chinese vendors, Rockchip and Allwinner, where some documentation was there and some was missing. There was also some information about tools, but we were not able to find them, or maybe to use them correctly. Okay, so now I will go through three or four of the vendors listed in the last edition and talk about what has changed over those two years. Let's start with NXP. I think this is the easiest way to start with secure boot on ARM SoCs, because the full documentation is publicly available, so everyone can use it. We have HABv4, the High Assurance Boot mechanism, on the NXP i.MX50, 53, 6, 7 and 8M platforms. The application note is here. The only difference compared with the last edition is that those application notes now require a free registration on the site. The same holds for the i.MX8 and i.MX8X, which use the AHAB mechanism, Advanced High Assurance Boot. The same for QorIQ, and as you see here, the signing tools are available after the free registration. For Marvell Armada we can look at the manuals that are available on their site. We saw that for the 38x and 39x families there is information that an NDA is needed. For other families, I believe the 8K, there is only information that secure boot is available, but nothing else. In the last edition we showed that in the U-Boot repository we could find some information about how to implement secure boot on Marvell Armada, but now it is not there. It is only in the older releases. So this is another difficult step that needs to be taken to find that information.
But if we use the 2018 branch, we will see the information on how to implement secure boot. Still, this is only theoretical knowledge and there is no step-by-step solution for how to achieve it, so there is probably room for mistakes. NVIDIA Tegra is another quick update. Last time we saw that some documentation and tools were available. We checked that within the last year and the documentation looks somewhat uncertain, because in one place it says that secure boot is available, and they also provide the flashing tool, a script called flash.sh, but in another place they just say that secure boot will be available in the future. Nobody knows exactly when the future will be, but yeah, that would be it. For fusing there is an additional script, but its documentation is also somewhat outdated. There is also one thread on the NVIDIA forum where someone tried to use it to fuse keys on their platform and it broke the platform, because it looks like not every board can be fused, and before we execute the script we need to check the serial numbers, because some of them are supported and some are not. Also an update on Allwinner. We still think that there is no official documentation about the Allwinner SoCs, and also no official documentation about secure boot itself. But we found an interesting case on the forum where someone was able to use it on the NanoPi NEO with the Allwinner H3. There he provides all the useful links, 10 or 15 of them, and also the whole list of steps of the verification process, with a link to the sunxi-tools repository, which contains the tools needed to sign the binaries and fuse the SoC. The one vulnerability there is that in any case when we... Oh, sorry. One vulnerability here is when we try to start firmware that is signed and the verification fails.
The platform always goes into FEL mode, which is something like a debug mode, and that mode can be accessed via the USB port. So if the verification fails, someone could always plug in USB, start some tool, and then read everything from the fuses or maybe even wipe them. The solution for that was just to destroy the USB data lines on that port, so even if the platform goes into FEL mode no one will be able to read anything from it. Okay, so now the MediaTek case. Documentation is provided on GitLab pages. It is based on a Yocto-like project, so the steps to sign the images and implement secure boot need to be done inside the Yocto Project build system. But basically, as in other ARM SoCs, the boot ROM is the root of trust, and later we extend it using other mechanisms to get the whole chain of trust. So after secure boot we have TF-A Trusted Board Boot, and then we can use U-Boot FIT verified boot to load our kernel image. If we have all those steps, then we have the whole boot process verified. It was also shown in a couple of reports that the MediaTek boot ROM has a vulnerability: if we power it in some special scenario, the boot ROM may skip the verification of the image and still load the firmware. So it looks like even if we have the public key in the fuses, if we provide an unsigned image and use this vulnerability we will still be able to boot our platform. This is a short recap of how secure boot looks on MediaTek. We have the BL1 step, which is the boot ROM. It loads the hash of the root of trust public key, which is the public key that we put in the fuses, and also calculates the SHA of the signature of BL2, which is the next step loaded.
After that we compare those two values, and if everything is okay we go on. Then we load the signature from BL2 and also calculate the SHA of it, make the comparison, and if everything is once again okay, we boot the next steps. And here is the process of enabling secure boot. It is not clear from the documentation on which SoCs secure boot can be enabled. The documentation mentions the MT-83-65 and MT-83-695 ones. Different SoCs may have different fuse indexes, so we really need to check those before we try to burn the fuses in those SoCs, but unfortunately that information is provided under NDA. In the process of enabling secure boot we need to create two private keys and provide them to our build system, which signs the BL2 firmware and also something called the download agent, which is later used by additional proprietary tools to flash the image onto our platform. That is also described here: we have the fuse writer tool, another tool provided only under NDA, which can be used to check the secure boot state on our platform and check whether the download agent authentication bits are set. If they are not set, we need to set them and then provide the public key, which will be fused into the key hash zero field, and after that we sign our firmware and use this public key to verify it. Okay, so now let's go to the Rockchip case. The public key here can be stored in eFuses or in OTP, depending on which SoC we are using. If the verification of the loaded binary is successful, then we extend our root of trust, the boot ROM as in the other ARM SoCs, to the chain of trust. Later, in the case of Rockchip, you can use the FIT verified boot mechanism from the SPL to U-Boot and from U-Boot to the kernel to provide the whole chain of trust.
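The two-step boot ROM check described for these vendors (compare a fused key hash, then verify the next stage's signature) can be sketched as below. Everything here is a toy under loud assumptions: the RSA key is built from two small Mersenne primes and uses no padding, so it is far too weak for real use; the exact fields each vendor hashes and compares are not public in the talk, so treating the fused value as a SHA-256 of the public key is one plausible reading, not a confirmed layout.

```python
import hashlib

# Toy RSA key from two Mersenne primes; for illustration only, not secure.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, stays on the host machine
KEY_LEN = 27  # N fits in 216 bits


def digest_int(data: bytes, modulus: int) -> int:
    """SHA-256 of the data, reduced into the toy key's range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def sign(image: bytes) -> int:
    """Host side: sign the image digest with the private exponent."""
    return pow(digest_int(image, N), D, N)


def pubkey_bytes() -> bytes:
    """Serialized public key that ships next to the signed image."""
    return N.to_bytes(KEY_LEN, "big") + E.to_bytes(4, "big")


def boot_rom_check(fused_hash: bytes, pubkey: bytes, image: bytes, signature: int) -> bool:
    """Target side: key hash must match the fuses, then the signature must verify."""
    if hashlib.sha256(pubkey).digest() != fused_hash:
        return False  # the key shipped with the image is not the fused one
    n = int.from_bytes(pubkey[:KEY_LEN], "big")
    e = int.from_bytes(pubkey[KEY_LEN:], "big")
    return pow(signature, e, n) == digest_int(image, n)
```

Storing only a hash in the fuses keeps the one-time programmable area small, while the full public key travels with the image and is checked against that hash on every boot.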
So basically, to establish it we need to once again generate a private and public key pair and burn the public key into the fuses or OTP registers, depending on which SoC we are using. Then sign our firmware, called the idbloader image in the case of Rockchip, which is the U-Boot TPL plus SPL merged into one file. Then configure verified boot in SPL and U-Boot, which means that we will use FIT images to verify from the SPL to U-Boot and from U-Boot to the kernel, and just flash our signed images. Documentation for Rockchip and for Allwinner is hard to find, and if we find any, it will probably be outdated, or really short, or just not that useful at first. So here we have a diagram of enabling the secure boot, but basically it is what I just described on the previous slide, and sorry for that, but because of time we will need to skip the description of it. And now we can talk about signing code on Rockchip. Code can be signed using one of two tools: the rk_sign_tool, which runs on Linux, or the Secure Boot Tool, which runs on Windows. The Linux tools can be found in the rkbin repository. The Windows tools were, some time ago, in the repository called Kools tools, but now we are not able to find it. For some reason the Rockchip-based repositories are maintained in such a way that some things may go missing after some time. But if you use the rk_sign_tool, we can just generate the signing keys, and those keys can be used later with the Linux or Windows tools. The rkbin repository also provides the .ini files, which are the files that can be used to create the MiniLoader used later to fuse our keys. And basically any given firmware that you want to sign and use on our SoCs can also be signed with those two tools. There is another tool just to burn the eFuses, the eFuse tool, which is only for Windows machines.
It turns out that when we burn the eFuses on the Rockchip SoCs we also need to provide voltage to one of the pins of the SoC. It is not visible here, but there should be a VCC eFuse pin which needs to be powered up when we want to fuse the keys in our SoC. So for that we need to find that pin and provide the power ourselves, or maybe our platform has some special circuit just to enable that. This information also comes from another piece of documentation that is hard to find. This is a summary of enabling the secure boot. We need to create the loader using the boot_merger script that is in the rkbin repository. Next we create the keys with the rk_sign_tool. Next we need to sign the loader with the Secure Boot Tool, because from what we know now, only binaries signed with that tool can later be used on our SoCs. We tried to sign with the rk_sign_tool and it doesn't work, and now it looks like this can be hard to achieve, because from what we know now there is no way to download the Secure Boot Tool. Then we need to use the eFuse tool to fuse our public key into the SoC, and use another tool called rkdeveloptool, which is in the rkbin repository, just to load the signed loader. So we put the loader onto our platform and secure boot is enabled. Here is a link to the blog post which describes all of what we have done with the Rockchip platforms. So this is the summary of where we are after the second edition. The changes are here: for the NXP platforms it looks like we now need to register to get the SoC reference documentation. For the NVIDIA Tegra it looks like the documentation is not really certain and may be outdated. For Rockchip we now know that it can be achieved, but still the documentation is not of the best quality. The same goes for Allwinner, and for MediaTek we see that an NDA is needed to achieve the secure boot. So this is the summary of the presentation.
It looks like our knowledge has expanded over the last two years. For example, we now know how to enable the secure boot on Rockchip. Still, the general principles are common to all the vendors: we want to authenticate the image before we load it, we have some private key to sign the firmware, we need to fuse the public key inside the SoC, and the boot ROM is still treated as the root of trust. All cases use SHA-256 as the hash function for digital signatures, and we see that in more and more cases the documentation is under NDA or its quality is really not the best. Here is how you can contact us, and thank you for attending the presentation. I think we are a little over time, or maybe one question if there is any. Yes. The question is: in NXP secure boot you have the possibility to call the ROM from your bootloader to check further blocks, so you can use the NXP secure boot for more components, like FIT and U-Boot and so on. Do you know whether any of the other chips or platforms you have analyzed have a similar thing, where you can call the ROM and use it as a root of trust for everything? Okay, so the question was that NXP provides the possibility to use the boot ROM to verify other parts of the firmware loaded in the whole process, and whether other vendors also provide such things. From what we know now, this is only the NXP case, and yeah, basically that's the answer. And yeah, maybe this one. Given that you analyzed so many vendors, how is the support among the vendors for different types of keys with different trust boundaries? For example, a delegation key for, say, the production in Asia, and then you would revoke that key using rollback levels, and use cases like this. Okay, so the question was, I believe, whether there is any process for changing the key used to sign the firmware, yes? To have multiple keys, like you would hand out one key.
Okay, to have multiple keys, one for one person, another for another team, maybe even another company, and then later revoke them. I believe that this will depend on whether we have some mechanism in our SoC to provide more than one public key, yeah. But as we said, those are flashed one time only in the SoC, so if there is only one place, then I think there will be only one private key used later for the verification. Okay, thank you. |
Trustworthy Platform Module
An attempt to create open-source firmware for TPM |
So today we will talk about the Trustworthy Platform Module project. So, hello, I am Matej. I am currently an engineering manager at 3mdeb. I have been an open source contributor for several years. I'm interested in various stuff, build systems; I enjoy build reliability especially. I like embedded systems in general. I have worked at several layers of embedded Linux projects. Now I'm also working on some stuff with coreboot. Security aspects are also what I'm interested in. You have some contact information on the slide if you want to reach me. Some of you have already heard that we are 3mdeb, a Poland-based company, over seven years in the market. We work mostly in the open source firmware and embedded Linux areas. We are part of various organizations and various open source initiatives, like coreboot licensed service providers or Yocto participants. So, to the agenda. Let's start with explaining what the TwPM project is and why we decided to start it. We'll talk a bit about TPM modules. Then we'll explain how we started such a project, what challenges we expect, what roadmap we have, and what the current state of the project is. So, Trustworthy versus Trusted. You probably know Trusted Platform Modules, so we came up with the name Trustworthy Platform Module, to indicate that the goal would be to make it a bit more trustworthy than it is today by providing open source firmware for it. The goal would also be to be compliant with the TCG PC Client specification, which might in fact be quite difficult or maybe even impossible. We will see, we will also discuss that later. But, yeah, that's the goal. The project is funded by NLnet, by the NGI Assure fund. So, why we came up with the idea: the traditional TPMs are dedicated microcontrollers and they, not typically, they always have proprietary firmware, which cannot be easily audited, or at least not by regular users, maybe by some governments, maybe.
If there are bugs, they might not be fixed, depending on what the vendor is planning for the given TPM module, the TPM chip. There are also capabilities which might be limited in some cases, and if there is no firmware update from the vendor, they cannot be modified by a user. There are also several different interfaces: LPC, which is present mostly on older motherboards but is still commonly used; SPI, which is typically present on newer PCs; and I2C, which is present mostly on mobile devices or also on Talos. Another problem is that if you, at some point, wanted to buy a TPM module, you probably know that there are a bunch of different types of connectors, and you need to get one which is compatible with your mainboard. Even if they look the same, they are really not, and if you plug in an incompatible one, you may even break your module or even your mainboard. For example, we can see LPC connectors for MSI and ASUS. They look the same, the empty pin is in the same place, but the ground and power are swapped. So some smoke may happen if you plug an ASUS one into MSI or the other way around. Similar stuff for SPI: the connectors look the same at first glance, but they are very, very different for different motherboards. Even for the same vendor you can have different types of connectors, and there are many, many more. So, at first, we also wanted to address the hardware problem, but it looks too complex. We want to focus on just the firmware for a start. The variety of connectors is huge, bigger than anticipated. So, at the early stages of the project, we focus purely on the firmware and connect it to the motherboard even with some jumper wires, just to have some proof of concept in the firmware; then we can worry about the other stuff. So, how can we start the project? We need some code to process our TPM2 commands. There are a few command processors out there.
At least, we know of two of them. There is the Microsoft implementation, which provides simulator code for various OSs. There are also some interesting samples in that repository: there is the fTPM trusted application for ARM TrustZone, and there are also some samples for the STM32 Nucleo, which is what we will be interested in the most. Then we also have a simulator from IBM. Maybe there are some more, but I think these two are the most popular ones. So, we took a look at the Nucleo sample, which was mentioned just before, from the Microsoft TPM reference stack. It was contributed about four years ago, and it is neither maintained nor tested. So there is no single person who knows if it ever worked, basically. It looks like it was developed in Atollic TrueSTUDIO, which no longer exists; it was replaced by another STM32 integrated development environment. We also asked about the status of the sample code in general. The answer is that these were contributed at some point in time and may not work currently. So, we took a closer look at that piece of code. We converted the project to the newer tools. We were able to build it, at least to a certain point, to at least run it on the Nucleo board. We noticed there is some VCOM application, which was presumably used for testing the solution. The application is only for Windows, and we haven't even bothered to try it, as we know what a pain it might be to build some old Visual Studio stuff. But we looked at the code. We saw that the STM32 code can accept TPM commands via USB CDC, it can process them, and it can return some response, or at least some simple command response was there. But what was important is that some custom protocol was involved here. So there was no interoperability with existing tools, such as the TPM2 tools stack, or with existing TPM interfaces, because it was a very custom one, and interoperability is a major goal of this project as well.
Also, the resources on the MCU were quite low. So, at this point, we had a rough idea of what we have to face. We also want to show how different the flow is for the TPM2 simulator versus the actual TPM2 hardware. The block on the top is, we can say, what the Microsoft or IBM reference implementation provides. It just provides TPM2 command processing. For the simulator, that's fine: it uses sockets and communicates with the TPM2 software stack in the OS directly. But in the case of hardware, you need to plug the actual module into the motherboard, so we need all of the plumbing. There are also dedicated specifications for that in the TCG. Those are the orange blocks that need to be implemented in the firmware. So we need all of the plumbing to pass the commands from the microcontroller through the motherboard connectors to the OS drivers, and then at last to the TPM2 software stack in the operating system. What are the challenges, current and expected? One of the first was the global chip shortage, which happened in the meantime. So even if you wanted to use the STM32L4, it was no longer available to source. Other microcontrollers were also quite difficult to get. The project in the samples was developed using the hardware abstraction layer from ST. If you want to switch to other hardware, you would need to rewrite that, at least to some extent, for another hardware abstraction layer, and then those chips become unavailable, and that will go on forever. Also, we did not have the requirements clarified at that point. So we believe that what must be done is to have some OS handling the TPM stack. We are looking into Zephyr RTOS to have better hardware coverage and just implement the TPM stack in Zephyr, so we can switch between boards more easily. Other challenges are different types of timing problems. The TCG specification requires certain timings at the hardware level.
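To make the "TPM2 command processing" block a bit more concrete: every TPM 2.0 command starts with a 10-byte big-endian header (tag, total size, command code), and whatever plumbing sits below has to deliver exactly these bytes to the command processor. The sketch below builds and parses a `TPM2_Startup(TPM_SU_CLEAR)` command; the constants come from the TPM 2.0 specification, while the helper names `build_startup` and `parse_header` are hypothetical and not taken from the project.

```python
import struct

# Constants from the TPM 2.0 specification.
TPM_ST_NO_SESSIONS = 0x8001
TPM_CC_STARTUP = 0x00000144
TPM_SU_CLEAR = 0x0000

# Command header: tag (2 bytes), commandSize (4 bytes), commandCode (4 bytes).
HEADER = struct.Struct(">HII")


def build_startup(startup_type: int = TPM_SU_CLEAR) -> bytes:
    """Serialize a TPM2_Startup command as the host would send it."""
    body = struct.pack(">H", startup_type)
    total = HEADER.size + len(body)
    return HEADER.pack(TPM_ST_NO_SESSIONS, total, TPM_CC_STARTUP) + body


def parse_header(buf: bytes) -> tuple[int, int, int]:
    """Split a received command into (tag, size, command code)."""
    if len(buf) < HEADER.size:
        raise ValueError("command shorter than the TPM2 header")
    tag, size, code = HEADER.unpack_from(buf)
    if size != len(buf):
        raise ValueError("commandSize does not match the received length")
    return tag, size, code


if __name__ == "__main__":
    cmd = build_startup()
    print(cmd.hex(), parse_header(cmd))
```

A simulator receives such a buffer over a socket; on real hardware the same bytes arrive through the LPC, SPI or I2C interface before they reach the command processor.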
For example, some registers must respond within some milliseconds and so on. It might be difficult to achieve such responsiveness on just a microcontroller. Maybe we will need to fall back to an FPGA in some cases. We definitely need to fall back to an FPGA for the LPC interface, which is non-existent on general-purpose microcontrollers, so we are developing a hardware block in FPGA for that. There are also some environment improvements which can be made. As I said before, full compliance might be quite difficult or might even be impossible to achieve, for example in the case of the strict initialization time and power requirements, because the power requirements are quite low. I don't remember the exact value right now, but with a typical microcontroller plus FPGA, the power and also the boot time might be challenging here. It's a lot of moving parts. We are still looking into that. So, the roadmap. The first step in the project plan was a public site with documentation. That one is already live, although there is not much content there yet. We are just starting, we can say. But once it is there, we already have some structure, and the roadmap is also public. Then the hardware: as I mentioned before, we started with the Nucleo, as it was supported by the Microsoft samples. We're still exploring that, but that's probably not enough hardware to accomplish the goals of this project. Another one we are exploring right now is based on the EOS S3 SoC. It combines a Cortex-M4 and an FPGA in a single chip. It might be interesting to have a Cortex-M4 for the TPM2 stack and some FPGA for fast response and LPC communication. The third point, which we are working on right now, is the LPC implementation in FPGA. That is required to target the many mainboards currently on the market which have only the LPC interface. Then we also need to implement all of the plumbing I showed before on the slides. So, TPM registers as defined in the specification.
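As a rough illustration of what "TPM registers as defined in the specification" means for the firmware, the sketch below decodes a memory-mapped TPM address into a locality and a register name. The offsets are a small subset of the TIS-style register map as commonly used by host drivers (quoted from memory, so worth checking against the TCG PC Client specification), and the `decode` helper is hypothetical.

```python
# Memory-mapped base address of the TPM interface on PC platforms.
TIS_BASE = 0xFED40000
LOCALITY_STRIDE = 0x1000  # each locality owns one 4 KiB window
NUM_LOCALITIES = 5

# Subset of register offsets inside one locality window.
REGISTERS = {
    0x000: "TPM_ACCESS",
    0x018: "TPM_STS",
    0x024: "TPM_DATA_FIFO",
    0xF00: "TPM_DID_VID",
}


def decode(addr: int) -> tuple[int, str]:
    """Map a host address to (locality, register name)."""
    offset = addr - TIS_BASE
    if not 0 <= offset < NUM_LOCALITIES * LOCALITY_STRIDE:
        raise ValueError("address outside the TPM window")
    locality, reg = divmod(offset, LOCALITY_STRIDE)
    return locality, REGISTERS.get(reg, f"unknown(0x{reg:03x})")


if __name__ == "__main__":
    print(decode(0xFED40018))  # status register, locality 0
    print(decode(0xFED42024))  # data FIFO, locality 2
```

In the project this decoding would sit in the LPC or SPI front end, which then has to answer each register access within the timing the specification allows.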
Also, there is a specialized FIFO protocol to pass the commands between the TPM and the host. So, that's a screenshot of some testing of the LPC module. We are currently using Verilog with Icarus, I believe, for creating the data, and GTKWave for drawing it out. Then some testing is also necessary. We are already preparing some test cases. We can start preparing them based on regular TPM modules, and we expect that our module will meet the same criteria. Of course, we'll add some specific test cases later on as well. The SPI protocol is also tricky, even though it should be simpler in theory, because typical MCUs have SPI, but typically they are tested, let's say, in master mode, not in slave mode. That's the most common case, and also the frequency is limited. The frequency of SPI on motherboards can be 40 MHz or more, but a typical SPI, for example on the STM32, is advertised up to 24 MHz, and in reality we have not been able to achieve even that so far. So that also might be difficult, and maybe an FPGA can also help to achieve the timing requirements we need. The eighth point is exactly exploring the usage of simpler hardware platforms. We were hoping that maybe we can achieve at least an SPI connection on a board without an FPGA. That would be simpler and would give more users the ability to test it out, even on a Raspberry Pi or with a motherboard, so they can push that forward and have some TPM module even on regular development boards. I already mentioned some TPM stack improvements, sorry, flash driver improvements for the TPM stack. For example, more test suites; there are some problems we have already seen with what was there. There is no redundancy at least, and maybe also encryption, because the SPI flash could be read out from the board. We may leave the encryption for later, but at least we would like to have the redundancy, because otherwise it would be quite easy to brick the TPM board, which is not what we want.
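The SPI figures above translate into a very small time budget, which a quick calculation makes visible. The sketch below uses the two bus frequencies mentioned in the talk; the 80 MHz core clock is an assumption for illustration and is not a number given by the speaker.

```python
NS_PER_S = 1_000_000_000


def byte_time_ns(spi_hz: int) -> float:
    """Time to shift one byte (8 bits) over SPI, in nanoseconds."""
    return 8 * NS_PER_S / spi_hz


def cycles_per_byte(spi_hz: int, core_hz: int) -> float:
    """CPU clock cycles that elapse while one byte is on the wire."""
    return 8 * core_hz / spi_hz


if __name__ == "__main__":
    core_hz = 80_000_000  # assumed microcontroller core clock
    for spi_hz in (24_000_000, 40_000_000):
        print(
            f"{spi_hz // 1_000_000} MHz SPI: "
            f"{byte_time_ns(spi_hz):.1f} ns per byte, "
            f"{cycles_per_byte(spi_hz, core_hz):.1f} core cycles per byte"
        )
```

At 40 MHz a byte lasts 200 ns, so under the assumed core clock a slave-mode microcontroller has only a handful of cycles to react per byte, which is why an FPGA front end is being considered.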
We also need some unique identification and some source of entropy. That would be required to have a unique ID for each device, and we need some entropy for cryptographic operations. This is left for later; we will see what kind of hardware we end up with. We might use some hardware-specific features, and if not, maybe we can finally use the FPGA. Manufacturing: as you might know, each TPM must contain a unique key, the endorsement key, and also a certificate related to it. So we might provide a way for a user to generate the endorsement key for their TPM, just providing some scripts and procedures for how they can provision the TPM device. Then we also imagine that it would be nice to have a nice build system, so we can configure what kind of interface we want to target. I have a motherboard with SPI, so I just flip the SPI switch and rebuild the project; I check the hash algorithm I'm interested in, I choose what kind of hardware entropy I'm interested in. The goal here is to make the transition between boards easier: I can unplug my TPM, flash another firmware, and use it on my new motherboard, which now supports SPI where previously I used LPC. What's currently in progress? Yes, the LPC module I showed a screenshot from. Next, the TPM registers will be implemented. We want to pursue as much as we can in the FPGA, also in simulation, before we stick to certain hardware, and we will also determine how big an FPGA would be required here. In parallel, we are also exploring the path of not using an FPGA and using SPI on the STM or another microcontroller. Maybe even if we could reduce the frequency and use it with the Raspberry Pi as a proof of concept, that would already be quite an achievement, I believe. Even if it wouldn't work with regular mainboards, maybe after some time it could be improved, and it could be a nice step forward.
Docs: we are working on the docs, and they will be progressively filled with the results of our current work. Yes, if you are interested in the progress and maybe want to contribute, or want to discuss the project and related stuff, you can join our communication channels. On the other hand, if you want to join the team or work daily on similar open source projects, you might approach me directly or use the contact information from the next slide. So that would be it, and we might have some questions. Yes. You mentioned the timing problems. Have you looked into how the system behaves if you delay? Will things not work if you are too late answering some things? I'm not sure if I followed exactly; you're asking about some timing issues we had? There are some timings defined by the TCG specification. For example, they say that a given TPM2 command must respond within a given time. So that's one. If you don't respond in that time, what happens, and did you look into that? We are not even at that stage currently. We are at the lower layer, so we are trying to get the plumbing done so we can even send one command. But at the lower level, yes, there are also problems with timing. At least on the SPI part, for example, we had a problem that the STM32 we've been using doesn't give us the flexibility that we need to follow the spec in some cases. There is an SPI IP block, which can just send and receive some data over the SPI. But if you want to do something more specific at the start of the transmission, pull that line down or up, that might be more tricky to do as well. So maybe the FPGA would also help here. Yes. I have a couple of questions about the FPGA. Have you selected an FPGA? So the question was whether we have selected an FPGA. No. As I said, we are just exploring the hardware on the right, the EOS S3 SoC. That one has a Cortex-M4 and an FPGA integrated into one SoC, and that's the one we are analyzing right now.
Maybe it would be a fit for the project. If you wanted to use an FPGA only: what is also important here is that we need a CPU to process the TPM stack commands. So either we can use a hard CPU combined with an FPGA like that one, or we need to use a full-blown FPGA with a soft core and so on, and that's another complexity. We might also not be that experienced with that. We need to run the TPM stack on the CPU. When you select an FPGA, do you need one with internal flash, so that it can't be tampered with? Yes. So the question is whether we need to use internal flash so that the flash is not tampered with. Yes, that would be preferable of course, but at this point we are flexible enough to just have anything working, to prove that it's even feasible with some limitations, even if the flash is separate and can be read out. We might not care right now. We focus on the most important parts to prove whether it's feasible or not, and other stuff can be handled later, once we prove whether we can do it or not. The problem is that you're trying to receive data over SPI, essentially; your device tries to be an SPI slave, but the SPI is not flexible enough. Is that right? So yes, it's right that we want to be an SPI slave, because the mainboard is the master, yeah? So are you aware of the chip line-up from Cypress, the PSoC? It's like a Cortex-M plus a little bit of, not really FPGA, but programmable glue, which you can configure to some degree. Probably. Okay, so there is a suggestion from the audience about a Cypress series, the PSoC, yeah? So I don't think we have considered that so far, so thank you for that. Of the hardware with a hard CPU plus FPGA, we have found so far the EOS S3, which we're looking into; it has some Zephyr support, which is a plus, of course. There is, I think, also one board from Gowin; Tang Nano was the name, I think.
Yeah, so thanks, we'll check that out after the Cypress PSoC. Yeah? Did you consider trying to get an emulator-based development environment working, maybe using something like QEMU, so that you are sort of decoupling your hardware debugging from your software debugging? I know that in QEMU you can implement virtual LPC and SPI devices, and QEMU also has a Cortex-M emulator, so perhaps that might be a possibility. Okay, so the question is whether we have considered some emulated development environments. As I showed, at least for the FPGA part we start with simulation: we develop the LPC slave and host, and we can test that out in simulation. Simulating the whole Cortex-M we have not done so far; maybe that's also something to consider. Any more questions? We can have one more, I believe. Okay, sec, oh, okay. Isn't it a big risk to, let's say, move away from the hardware part first? Because a big part of using a TPM is it actually being a trusted device, so if you leave it somewhere and come back, you will know it's still what you want. By now putting a lot of focus on the software and just kind of ignoring the hardware part, what would happen if this software worked really nicely? Let's say an STM32 is in the field, and people are probably testing that. Wouldn't that weaken everything a lot more than just trusting these CC-evaluated TPMs? So the question is, I believe, if we have, let's say, firmware running on the STM32, how we can make sure that the software was not tampered with, for example, yeah? Yeah, tampered with or maybe extracted or whatever. Okay, so we know there are existing mechanisms; for example, different vendors already have different types of secure boot in their microcontrollers and so on, so we can verify whether it's our firmware and whether it's signed. The user can sign their own firmware, and otherwise the firmware wouldn't boot, for example.
I believe once we have the functionality, there are already existing mechanisms that can prove that the firmware was not tampered with. That's my understanding, yeah. Okay, so we are out of time, thank you. Thank you. |
Semihosting U-Boot
Look, ma, no serial! |
Hi, I'm Sean. Today I'm going to talk about semihosting in the context of U-Boot: what it is, how it works, and maybe why you might want to use it. So, first I want to ask: how do you bootstrap a system? You might do this for two reasons. One, you have a new board right from the factory and it has nothing on it at all and you have to get something on it, and the other one is maybe you bricked it, and this happens to me sometimes. It actually happens quite a lot, especially when I'm working on U-Boot, and the board will no longer boot. So, there are usually two basic steps. The first one is you want to get something running on your board, and the second one is you want to then write something to storage so you don't have to do the process again. There's a variety of protocols you can use. USB, of course. I like UMS. It's very nice. It makes your device look like a USB flash drive, which is very, very convenient. There's also a bunch of Ethernet stuff. The classic TFTP. Fastboot makes an appearance twice because it can do both. If you have an SD card, bootstrapping is super easy. You just pop out the SD card, put whatever you want on it, and put the SD card back in, but a lot of boards don't have SD cards, so this is not always an option. There's serial. This is usually kind of slow, so you might only want to use it for the code execution part, but it's definitely there. Some boards have it built into the boot loader; you can just flash something over serial. And there's also JTAG, and JTAG is kind of a classic one. Also slow. You probably wouldn't want to flash your whole root file system over it, but it's pretty reliable and a lot of boards have it. What if you only have JTAG and you don't have any of these other nice protocols? So, I'd like to take a little bit of a different approach to the problem, and let's talk about something totally different, which is the NXP QorIQ line of communications processors.
These are the newest iterations of a very old line which stretches back to the M68K, and there's a very long lineage of PowerPC stuff in there. They tend to have lots of Ethernet, some PCIe, some USB, but not any display interfaces, so they're not really media SoCs, and they often have hardware-accelerated networking, so you can do some stuff in hardware which you would normally do in software. This is kind of the main selling point for why you might want to use these. So, all of these have something they call a reset configuration word, or RCW, and this started back in the PowerPC days as just basic initialization: what endianness your SoC is going to be, maybe what dividers you're going to have on some clocks, how wide your boot bus is, what you are going to do with your debug pins. This is a small amount of data, so they stuck it on some pull-ups and pull-downs on some of the pins, and this is a very standard thing you'll see on a lot of different SoCs. Then they wanted some pin muxing, because when they originally started with this all the pins were fixed-function, and you can sell more chips if you can change the function of some of the pins, so that you can use, like, USB on one chip and maybe Ethernet on another. So they added some pin muxing, and they added it to the RCW, and then they added a lot more pin muxing, because the more pin muxing you have, the more applications your chip can fit into. And so they started running out of pins, because they started getting maybe 128, 256, 512 bits of stuff that they needed to configure, and so they decided they were going to put the RCW on the boot device. So the first thing the SoC does when it boots up is it reads this RCW and configures all the pins, and then it continues with the boot. This is kind of convenient, but it creates a chicken-and-egg problem, where in order for your SoC to boot up there has to be something on your initial device, and if you're in a situation where you have to
bootstrap it, there's nothing there, so the SoC won't boot up. So what they did is they created a hard-coded reset configuration word. This is for maximum compatibility: they would disable all the peripherals and you would just have your boot device, so you could always boot into this and be safe and not break your board. But this is not so great, because they never added runtime pin muxing. On this chip you select a function for your pins and you can't change it. There are a few pins where you can change it, but for most of them you're stuck. So when you have this maximum-compatibility RCW with everything disabled, you have no Ethernet, you have no USB, you have no serial even, and all you get is JTAG and your boot device. NXP knew they had a problem, and they decided to solve it by introducing an override: you would boot via the hard-coded reset configuration word, then you would program via JTAG the values that you actually wanted, which would enable all the peripherals for your board, and then you would do a partial reset and it would come up and load everything like it was supposed to. But there are a couple of problems with this. The main one is that they never documented this stuff, so in order to use it you have to use their JTAG probe, which is, like most JTAG probes, kind of a gouge, because they know you're buying the chip so you've got to have the JTAG probe, and you have to use their IDE, which is a yearly subscription, and they're not cheap. So this is not a great situation, and if you didn't think this was great, here's a glowing review I found on the forums: our manufacturer uses a single PC to perform the initial programming. On this PC they have an evaluation copy of CodeWarrior, which is their IDE. Every time that evaluation copy expires, they erase the hard drive of the PC, install the OS again, and load another evaluation copy. So this is not ideal, and I thought about how I might address this and make it better, and I remembered something that I
learned about a couple of months ago. It's called semihosting, and the basic idea of semihosting is that you attach a debugger, in my case over JTAG, and your code executes a special breakpoint instruction. When your debugger sees this, it will read out an opcode in r0 and an argument in r1, and it will do something for you, and then it will give you a return code back in r0. This is very similar to how syscalls work: your program executes a special instruction, the operating system reads out your registers, does something for you, and gives you a return code. So what do you get? Well, the thing that I wanted most is serial, because I didn't have any. So first I looked at SYS_WRITEC, and SYS_WRITEC is basically putchar. So we can implement puts here: we're going to take in a string and loop over all the characters in the string, and for each character we're going to trap, that is, execute our breakpoint instruction, and we're going to pass WRITEC as our opcode, and we're also going to pass a pointer to the character. You may know that putchar actually just takes the character, so this has an unfortunate performance implication, because we have one breakpoint and one memory access per character in the string, and for JTAG this is not very performant. If you've ever used a 300 baud modem, you know that's very slow; this is even slower. So this is really not useful if you actually want to use your serial output. We can do better, though. They also have something called SYS_WRITE0, which is basically puts. So our puts implementation gets very simple: we just trap with WRITE0, and now we get one breakpoint per string, but we still have to do one memory access per character. The problem is that we don't want to read off the end of the string; we have to make sure that we don't go past the null terminator. So the debugger has to read a character
and then see whether it was the null terminator, and if it's not, you read another character, and you keep doing this. We really don't want to go off the end, but the problem is that for JTAG, setting up a read is a pretty intensive process. There's a lot of overhead, and it can still be pretty slow. So this is faster, about 10 times as fast, but it's still slow, really not usable. But we can do even better. We're going to use SYS_WRITE, which is basically the write system call. The previous ones only had one parameter, so it just goes in the argument register, but for this one we have multiple parameters, so we're going to fill in our arguments inside a struct. It takes the file descriptor, the buffer, and the length of the buffer, and we fill this in with standard out, our string, and the length of our string. Then we trap and pass a pointer to our struct. This is generally how we pass multiple arguments to semihosting: because there's only one argument register, it takes a pointer to a struct. So now we get one breakpoint per string and two memory accesses per string, and this is reasonably fast. We can do stuff with this, and it's not glacially slow. So this is the implementation I ended up using. If you've been paying attention, you'll note that SYS_WRITE kind of implies the existence of SYS_OPEN, and you can open any file on your host system, which is pretty convenient, and you can do all the standard stuff like seeking, reading, and closing it. We don't get stat, but we do get the file length, which is mostly what we want, because usually we just want to open a file, find out how long it is, and then read the whole thing. So in U-Boot, classically, if you want to load your Linux and then boot it, you would do something like this: you load from mmc 0 at a particular address, and then you give your file name, and then
you'll boot it. We can replace this with load hostfs, which is the file system on your debugger host, and that Linux image will get read from the directory that you're running your debugger from. It's the same structure, because under the hood it's using the same API, and there's a dash because there's only one hostfs and we don't need to support multiple debuggers. There's also a special file called :tt, which I think stands for teletype, and this is your standard in and standard out. Almost everybody uses this, except QEMU, because QEMU doesn't have this huge overhead for memory accesses, so they don't actually care whether you can use your console with read and write, and so you just use WRITE0 with them and it works. One classic problem with booting over JTAG: your regular boot process is going to look something like load SPL, SPL initializes DRAM, and then SPL loads regular U-Boot into DRAM and executes it. When you do this with JTAG instead, you have to load SPL over JTAG, SPL is going to run and initialize DRAM, and at some point you have to load U-Boot into DRAM over JTAG, but we don't really know when. A really classic way to do this is that you just pick a time, wait that long, and then load U-Boot. But this is kind of awful, because if you have any variance in how long DRAM initialization takes, especially if you're doing other hardware initialization, you have to wait a lot longer, and in the average case you're going to be doing nothing. This can really drive you nuts as a developer, because you might be waiting 20 seconds because sometimes it takes 20 seconds, but most of the time it doesn't. You can also reimplement DRAM initialization in Tcl, and this is a really common thing for vendors to do, because it's very simple for them: they just write all the
registers, and it happens over JTAG. This avoids the whole timing problem, because we know exactly when DRAM has been initialized. But it's a totally different process from normal: you have to specify your parameters in a different format, in a different language, it's not going to be tested as much, and it probably won't initialize things in the same way, so it can be more buggy. It's kind of worrisome, especially when your regular U-Boot works fine and maybe this doesn't work so well. But semihosting makes this really simple, because SPL can be loaded over JTAG and initialize DRAM, and then it says to your host, "please load U-Boot at this address", and your host will do that, and then it continues on its way. It's extremely simple to use, and it solves this whole timing problem, which can be very annoying. So what else do you get? Well, we get some error handling. SYS_ERRNO is practically essential to find out why something failed. SYS_ISERROR is not. The idea of SYS_ISERROR is that you pass in a return code and it tells you whether it's an error or not. The problem is that some of these semihosting commands have different semantics for the return code, and most of the time the semantic is that negative numbers are errors, so effectively you're doing this whole big semihosting call just to compare to zero. So I don't really know why this is in here, and there are actually several functions that are kind of like that. For example, SYS_TIME will get you the real time, which can be helpful if your device doesn't have an RTC or you don't want to initialize it. But SYS_ELAPSED will get the number of ticks that your program has been running, so maybe you would use this for benchmarking, but the overhead of doing semihosting is a lot larger than the amount of precision that you're going to get, so I'm not really sure why you'd use that one either. There's some libc emulation: you can pass in a
command line, but we don't really need this, because we have the environment and we have the device tree, and those are the classic ways to pass in parameters. But if you're not using U-Boot and you don't have this sort of system set up, you can get command-line parameters pretty easily. There's also SYS_HEAPINFO, which is where you tell the device where the heap is and where it should malloc stuff, but usually you know this when you compile: you say this address range is where I'm going to stick my heap. So I'm also not really sure why that's in there. And as you may have noticed, you can write files, so of course you can mess things up, especially on Unix, where you can open up a lot of files that aren't really files and do some fun stuff with them. But you can also just run arbitrary commands, and you can remove files too. So you have to really trust the stuff that you're going to run, because as far as I know, no one does sandboxing; they just implement all this stuff. Maybe they shouldn't, but that's how it is. If you've ever used semihosting before, you may be familiar with this problem: breakpoints are actually invalid instructions, and your program will crash unless there is a debugger attached. With a debugger attached, the debugger will handle it for you and you won't end up executing it. So typically you would have to have two programs, one with semihosting enabled and one without, and the one with semihosting enabled you'd have to run with a debugger. But we can get around this using a pretty simple trick. This one is from Tom Verbeure, and the idea is that in your synchronous abort handler, you first check to make sure that we have an invalid instruction, and otherwise you panic, which probably involves printing out the registers or complaining loudly on the serial, which you might not have. Then we need to check to make sure that our instruction (which is held in ELR)
is the semihosting ARM64 halt instruction, which is the special breakpoint. The lower bits of the PC are actually not the PC on ARM, because they hold stuff like whether you're in Thumb mode or not, so we need to mask those off; you could probably just AND with ~3. If we find out that it actually was supposed to be a semihosting instruction, we disable semihosting, which on your processor can do whatever it wants, but in U-Boot it just sets a global variable that says we don't have semihosting, don't try it again. Then we pretend that we got a failure; negative one is almost always a failure. And then we advance the PC by four bytes. So if you want to use semihosting in U-Boot, you can enable these configs. The first one enables semihosting of any kind, and also enables this command. The second one, CONFIG_SEMIHOSTING_SERIAL, will get you some serial input and output, and you'll probably want CONFIG_SERIAL_PUTS, because normally U-Boot prints a character at a time, and puts will group those characters into strings and print them all at once. If you want to have this fallback, you will need to enable CONFIG_SEMIHOSTING_FALLBACK, and if you want to use an SPL, then you can enable the SPL versions. There's no SPL serial version, because U-Boot always enables in SPL the serial device that it's using in regular U-Boot. These are the things that I worked on adding, and I also worked on CONFIG_SEMIHOSTING a lot, but the basic support was already there. There's also RISC-V support from Kautuk Consul, and this was added pretty recently, so it's either in the January release or maybe the March release, I'm not sure. If you want to know more about how to enable this, we have a documentation link. Of course, you're also going to need a debugger. I like to use OpenOCD, maybe because I'm a masochist. OpenOCD is a debug server for JTAG, so the idea is you launch OpenOCD and it connects to your
debug probe, and then you can tell the debug probe to do things like start or stop your processor, and you can also attach GDB to it like it's a running process. This is pretty simple for OpenOCD: you just halt the processor, enable semihosting, and then resume it. Typically, in between enabling semihosting and resuming, you would load your program and then resume at a particular address, and you could stick this in a script and automate the whole thing. There are a couple of downsides to OpenOCD. You can think of this as a wish list, or things that annoy me, but not enough that I fixed them. One of them is that it uses the same terminal for regular logging messages, like "I attached a debugger" and that sort of thing, as for semihosting output, so they can get intermixed, and you have to watch out for that. The serial is cooked, which means that when you type something, nothing happens until you hit enter, and then everything happens. This is kind of okay, because if you're editing a command line, it's generally really slow if you hit backspace and it has to go to U-Boot, and U-Boot interprets the backspace and echoes it back, and then it gets displayed on your terminal. So cooked is nice here. The problem is that OpenOCD is single-threaded, so while it's waiting for your input, it's not doing anything else. If you unplug the device or hit Ctrl-C in your debugger, it won't notice until you hit enter. This can be kind of fun, especially because even if you know about it, you might forget. This single-threaded thing also ties into the lack of sandboxing. Ideally you would do something like fork off another process and maybe unshare some stuff or put it in a chroot, and that would be where you would run all your semihosting stuff: it would open the files, and you could limit it to just a few files, but
there's no sandboxing, so your whole system is there. Once again, you have to trust your stuff. So should you use semihosting? I would say not unless you have to, especially not the serial stuff, but it's good to have if you have to use it. Sometimes it's convenient: if you're doing emulation, it can be really simple, because you don't have to emulate an MMC device and you don't have to write a driver for an MMC device; you just call your semihosting instruction and you can load the file right into where you want it, and you don't have to do any hardware. And if you're already using JTAG boot, this can be really nice to solve some of your sequencing problems. But I wouldn't recommend it in general. I'd like to thank a couple of people. Tom Verbeure wrote a blog post on this stuff that got me thinking about the whole thing. Andre Przywara did the initial semihosting work, and he also worked with me when I was upstreaming my stuff, so I'm grateful for that. And of course Tom Rini and Simon Glass, who reviewed and merged all of this code and a lot of other patches over the years. And of course Marek, who put me up to this talk, and SECO, who employed me while I was writing the code. If you're interested in this, there's the blog post I was talking about; there is the RISC-V semihosting spec, which is just the ARM semihosting spec but with a different instruction and different registers; and of course the ARM semihosting spec. That link may die, because ARM has a tendency to rearrange things, but for now it works. Thank you. Anyone have a question? Questions? Yeah, I do. Can you actually use semihosting for a serial console in Linux? Yes, but only for debug prints, and I haven't looked into it that closely. I think the whole idea of stopping Linux to do a breakpoint is kind of invasive, and Linux tends not to like that, because your interrupts for that core will just not happen while it's stuck on the debugger, and you can kind of break
your devices that expect there to be an interrupt that gets handled in a reasonable manner. So typically when you stop the processor in Linux, your eMMC will just break. So generally I've only seen it used for debug prints, and usually only if you can't get to the real serial console. Yeah, okay, since we have a couple of minutes, I have one more slide. Normally when U-Boot prints something, this is what it gets: it'll get "hello\n", and it'll normally print this as h, e, l, l, o, \r, \n. It inserts the \r, and it does it one character at a time, but as we established earlier, this is glacially slow on semihosted hardware. So what I initially did was this: I printed out "hello\n" and then I added the \r. But this will actually break things, because they expect it to be \r\n and not \n\r, even though functionally they're the same. So I ended up having to do it the other way. So if you implement this stuff, be aware of that. Although if you are doing this on a microcontroller, you can probably just put "hello\r\n" in your strings, and maybe that's better. |
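Two points from the semihosting talk above can be sketched in a few lines of Python. This is a host-side model only, not U-Boot code: `to_crlf` and `host_round_trips` are hypothetical names, and the counts simply restate the numbers given in the talk (one extra read for the null terminator in the SYS_WRITE0 case is an assumption).

```python
def to_crlf(text: str) -> str:
    """Put a carriage return in front of every line feed, in that order."""
    out = []
    for ch in text:
        if ch == "\n":
            out.append("\r")  # CR must come before LF, never after it
        out.append(ch)
    return "".join(out)


def host_round_trips(call: str, text: str) -> tuple[int, int]:
    """Return (breakpoints, debugger memory reads) needed to print text."""
    n = len(text)
    if call == "SYS_WRITEC":
        return n, n          # one trap and one read per character
    if call == "SYS_WRITE0":
        return 1, n + 1      # one trap; assumed one read per char plus the terminator
    if call == "SYS_WRITE":
        return 1, 2          # one trap; read the argument struct, then the buffer
    raise ValueError(f"unknown semihosting call: {call}")


if __name__ == "__main__":
    line = to_crlf("hello\n")
    for call in ("SYS_WRITEC", "SYS_WRITE0", "SYS_WRITE"):
        print(call, host_round_trips(call, line))
```

Converting the whole string before a single SYS_WRITE keeps the cost at one trap per string, which is the reason the talk groups characters rather than inserting the carriage return one character at a time.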
Opening Railways and Open Transport devroom |
Yeah. It's great to have a packed room here. Really good to see you all. Welcome again. Welcome to the Railways and Open Transport devroom. We want to show you today what we are working on, what you are working on, what we all are working on. The co-organizers of this devroom are: me, I am Cornelius Schumacher. I work for Deutsche Bahn. I have been in the open source community for a long time and I am really happy to expand my horizon here. Then Max Mehl, who is also in the picture here; he can't be here with us. Some of you might know him. But Mahalia is here. So, please introduce yourself. Thank you, Cornelius. My name is Mahalia Stefan. I am from SBB, the Swiss Federal Railways, and I am responsible for digitalization and for the business-driven open source projects we are just going to explore. Hi. My name is Simon Clavier. I am from the French railway SNCF. I have been trying to spread open source in the group for 10 years now. There is a good result so far. I am very happy to be here and I am very happy to meet you. Yeah. So, just a few introductory words, introductory slides, to set the stage for what we are doing today. As we already showed you, we come from train companies, and we met basically in an initiative to create an Open Rail Foundation: a space for railway companies, or companies and organizations in the railway sector, to collaborate via open source, to work on open source. That is still a missing piece. There is nothing like that yet and we are building it. So, we are really happy to do that. It is not done yet; we are in the process of creating it, but it will be there soon. There is a website. You see the URL on the slide. There is some more information. The idea is to really spread open source in the railway sector, and that is what we all personally stand behind and what we want to spread, and hopefully some of you who are related to this topic will join us in this effort.
While we are doing that, of course, the core of that is the openness, being open, doing open source software. That is why we are all here. That is why an event like FOSDEM is happening. For us, it is a little bit more than just the open source license. It is also about an open collaboration model, so, having open governance, and for the Open Rail Foundation specifically, of course, the goal is to enable open collaboration in the railway sector. But we do not want to stop at railways. That is also because transport is more than just going on a train. We also have to get to the train and everything. So, open transport expands the domain. Also, topic-wise, the main topic today is open source software. But of course, open data and open APIs are very important in this space as well. So, this all comes together in a bigger picture. We do want to talk about more than just railways. We also want to cover public transport, basically, but want to be open. Because in the end, it is about the future of mobility. How do we move around? And how do we do that in a way which does not destroy the planet? I think that is one of the underlying motivations for many of us. It is about sustainable transport. It is about creating the means of transportation which will serve us for the next decades and millennia and whatever. We cannot do that alone, of course. That is why we do it together and we want to create things together. That is why we are building the Open Rail Foundation. That is why we do something like this devroom. I hope we can get into contact with many people, that we can build bridges, that maybe we can connect some of the bubbles we are all in. And we also would like to follow up with you later. So, if you need anything, if you want to continue a discussion, we are happy to try to facilitate that. We do not have a channel for that yet, but we will create one if there is interest and also support that.
So, we want to create together, do open source together, bring the openness into the railway, into the transport sector, continue with what is happening there, support the people who are there. And that is why we are here. And we are not alone. We have a great lineup of speakers and a lot of interesting presentations, and the program is pretty packed. So, I am really excited to see what is happening today and that we will see a lot of interesting content. Thanks for being here. I think we will dive right into the content and start with the first presentation. So, please join me here on the stage, and we will just switch over the laptops and then we can start with the first presentation. |
Automated short-term train planning in OSRD |
Hello, everyone, and welcome to this presentation, where we talk about automated short-term train planning, what it means, and how we handle it in OSRD. So what is the problem? Let's say a train wants to go from station foo to station bar. We could easily just find a path, but the problem is that there are many trains that have already been scheduled, and we need to find a path that doesn't wreak havoc on the timetable. We can be completely realistic in our simulation; we assume everything is on time. We know where every train is going to be located at any time. So there are a few rules we have to follow to make our blue train go to station bar. We can't add a train that will be delayed by other trains. In these examples, I use a pretty simple signal system, where a signal is red if there's a train behind it, and a signal is yellow, meaning slow down, if the next signal is red. What I mean here is that our train cannot ever see a yellow signal. We can add delay before it reaches a signal, but once the blue train sees a yellow signal, it's game over: the solution isn't valid. The opposite is of course true. We cannot cause delay to other scheduled trains, meaning that by being there, our blue train cannot cause another train to see a yellow signal, or a red one, of course. This means that we need to handle all the weird behaviours of the signal systems, which can become pretty chaotic quite quickly. In this example, there's one track with signals going both ways, and what happens here is that the signals change around the train, but what really matters is that, on the right, the other train cannot actually enter the main track at all, even if it's really far away, because otherwise the trains would go face-to-face. So the last signal would be red, the signal behind that would be yellow, and as soon as we see it, it's over. There are some other weird behaviours: sometimes we even have to know in advance where we will go next to know whether we would be delayed.
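The two rules above (the new train never sees a restrictive signal, and it never causes one for a scheduled train) can be written down for the simple signal system used in the examples. This is a minimal sketch on a one-directional line cut into numbered blocks, with one signal protecting the entrance of each block; the function names and the block model are assumptions for illustration, not OSRD code.

```python
from enum import Enum


class Aspect(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def aspects(occupied: set[int], n_blocks: int) -> list[Aspect]:
    """Signal i is red if block i is occupied, yellow if signal i+1 is red."""
    result = []
    for i in range(n_blocks):
        if i in occupied:
            result.append(Aspect.RED)
        elif i + 1 in occupied:
            result.append(Aspect.YELLOW)
        else:
            result.append(Aspect.GREEN)
    return result


def seen_aspect(train_block: int, others: set[int], n_blocks: int) -> Aspect:
    """Aspect of the next signal ahead of a train standing in train_block."""
    nxt = train_block + 1
    if nxt >= n_blocks:
        return Aspect.GREEN  # end of the modelled line: nothing ahead
    return aspects(others, n_blocks)[nxt]


def placement_is_valid(new_block: int, scheduled: set[int], n_blocks: int) -> bool:
    """True if the new train sees green and changes nothing for scheduled trains."""
    if seen_aspect(new_block, scheduled, n_blocks) is not Aspect.GREEN:
        return False  # the new train would be delayed
    for other in scheduled:
        rest = scheduled - {other}
        before = seen_aspect(other, rest, n_blocks)
        after = seen_aspect(other, rest | {new_block}, n_blocks)
        if after is not before and after is not Aspect.GREEN:
            return False  # the new train would delay a scheduled one
    return True
```

With a scheduled train in block 5 of a 10-block line, `placement_is_valid(3, {5}, 10)` is false because the new train would see yellow, and `placement_is_valid(7, {5}, 10)` is false because the scheduled train would.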
So in this example, if the train continues straight ahead, it would see a green signal, but if it were to turn to the right, towards the other train, it would see a yellow signal before it even reaches the point where we need to take a decision. This may seem pretty minor here, but in some signal systems, we need to know, like, kilometres in advance. But there are some things we can do. The train can take detours to avoid busy areas, and we can also sometimes not go at maximum speed: if we need to fit between two trains that go slower than our train, we can just slow down. What this means is actually not a good thing for us, because we can't just find the shortest path and then find the departure time; we need to actually consider all the possibilities that we have. So that's the problem, and in OSRD, we are currently working on a solution to this problem. OSRD, meaning Open Source Railway Designer, is an open source tool that can be used to edit railway infrastructure and run all kinds of simulations on it. Keep in mind that on these specific features, we've come a long way, but it's still very much a work in progress, so not everything is properly handled for now, and we're still working on it. So how do we deal with this? The main problem is that the solution space has a lot of dimensions. There's of course position, because we do need to find a path that goes from origin to destination. There's also time, because the constraints caused by other trains apply only during the time intervals when the other train is there. And the tricky one is speed, because unlike cars and bikes and most means of transportation, a train cannot just speed up really fast; it can take dozens of kilometers to speed up or slow down.
So if we find, for example, a solution that does reach our destination while avoiding all other trains, but where we reach the destination at 300 kilometers per hour, this is not a good solution; it's not even a good approximation of a solution. So we do need to keep track of the speed that can be reached by the train. The way we do that is that we represent the search space as a graph that considers all those dimensions as well as all the constraints, because once we have a graph like that, we can just find a path, and at this step it becomes pretty simple. So the main challenge is defining the problem itself as a graph. In this case, a node has a position, a time, and a speed, to consider those three dimensions, and most of it is defined implicitly. To get the speeds and times, we run train simulations, which we already know how to do in other parts of the project, so we consider everything we need to, like slopes, curves, rolling stock data, and so on. And yes, we actually compute the speed to get the position and the time. I have a small graphical representation to explain what I mean when we add time to our solution. Let's say we start from a simple graph that represents the physical infrastructure; in this case that would be, for example, track sections. What we do is, in a way, duplicate all nodes of the graph at different times, meaning that the point A always represents a specific point in space, but there's one node for A at t equals zero, and another node for A at t equals one, and so on. And then we link them in a way that actually reflects the travel time, meaning that we start at A at t equals zero, and we can reach C a certain time after. And we can't, for example, go from A to F at the same time, because we can't just teleport there.
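The node duplication described above can be sketched as follows. It is a toy version under stated assumptions: travel times are fixed integers rather than the output of a train simulation, speed is left out, and the whole graph is built eagerly up to a time horizon, whereas the talk says the real graph is only defined implicitly and explored lazily. The names `expand_in_time` and `track_graph` are hypothetical.

```python
def expand_in_time(track_graph: dict, horizon: int) -> dict:
    """Duplicate every physical node at each time step and link the copies.

    track_graph maps a place to a list of (neighbour, travel_time) pairs.
    The result maps (place, t) to the (neighbour, t + travel_time) nodes
    that can be reached from it, so an edge always moves forward in time.
    """
    edges = {}
    for t in range(horizon):
        for place, neighbours in track_graph.items():
            for neighbour, travel_time in neighbours:
                arrival = t + travel_time
                if arrival <= horizon:
                    edges.setdefault((place, t), []).append((neighbour, arrival))
    return edges


if __name__ == "__main__":
    # A -> C takes 2 time units, C -> F takes 3.
    track = {"A": [("C", 2)], "C": [("F", 3)], "F": []}
    graph = expand_in_time(track, horizon=5)
    print(graph[("A", 0)])
    print(graph[("C", 2)])
```

Because every edge adds a strictly positive travel time, there is no edge from A at t = 0 to F at t = 0, which is the "no teleporting" property mentioned above.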
And so this graph is constructed as we explore it. It would be too expensive to build the whole graph for the whole country up front, so it's all implicitly defined at first, and we actually run simulations as we move forward in the graph. It's also discretized in time, but only when we evaluate visited locations. What I mean by that is that when we run simulations, we keep track of the time at full accuracy, but once we reach a point that has already been visited, if we've visited it at a time that is too close, we consider that it's already visited. Once we have that graph, we just run an A* on it. A* means we have two functions to define: a cost function and an optimization heuristic. In this case, the cost function is the travel time of the train from the start to the current point, and the optimization heuristic is based on geographical data. And because our heuristic doesn't overestimate the remaining cost, we are guaranteed to find the optimal solution, so we will find the path that takes the least amount of time. I've talked about how we add time to the graph, but I haven't really talked about speed yet. The default behavior is that we always go at full speed unless we need to do otherwise. By full speed, I don't just mean the maximum allowed speed: the train speeds up as much as it can and then stays at maximum speed. In this slide, we have a space-time chart, so we have time on the horizontal axis and distance along a given path on the vertical axis, and there's an area that is unavailable, meaning there's another train, for example, in this specific area at a given time. And the arrows represent edges of the graph. So in this case, if we speed up as much as we can, we can go before that other train, but we could also go after that train, which would lead to different solutions.
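The search described above can be sketched as a small A* over (place, time) states. This is an illustration under simplifying assumptions, not the OSRD implementation: travel times are fixed numbers instead of simulation results, speed is ignored, other trains are modelled as time intervals during which a place cannot be entered, and the heuristic is a table of lower bounds that stands in for the geography-based estimate. All names are hypothetical.

```python
import heapq


def a_star(graph, heuristic, start, goal, blocked=None, t0=0.0, time_step=60.0):
    """Return (arrival_time, path) for the earliest arrival, or None.

    graph:     place -> list of (neighbour, travel_time)
    heuristic: place -> lower bound on the remaining travel time
    blocked:   place -> list of (from_time, to_time) when it is unavailable
    time_step: two visits of a place within the same step count as one
    """
    blocked = blocked or {}
    queue = [(t0 + heuristic[start], t0, start, [start])]
    visited = set()
    while queue:
        _, t, place, path = heapq.heappop(queue)
        if place == goal:
            return t, path
        # Times stay exact; they are only bucketed to detect repeat visits.
        key = (place, int(t // time_step))
        if key in visited:
            continue
        visited.add(key)
        for neighbour, travel_time in graph.get(place, []):
            arrival = t + travel_time
            if any(lo <= arrival < hi for lo, hi in blocked.get(neighbour, [])):
                continue  # another train occupies this place at that time
            estimate = arrival + heuristic[neighbour]
            heapq.heappush(queue, (estimate, arrival, neighbour, path + [neighbour]))
    return None


if __name__ == "__main__":
    graph = {"A": [("B", 100), ("C", 150)], "B": [("D", 100)],
             "C": [("D", 100)], "D": []}
    lower_bound = {"A": 200, "B": 100, "C": 100, "D": 0}
    print(a_star(graph, lower_bound, "A", "D"))
    print(a_star(graph, lower_bound, "A", "D", blocked={"B": [(50, 150)]}))
```

In the second call the direct route through B is occupied when the train would arrive, so the search falls back to the detour through C. Because the lower bounds never exceed the true remaining time, the first goal state popped from the queue has the earliest arrival time.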
So in this case, we actually create several edges that are all considered valid paths. In a way, you can see it as a decision tree, except that we can actually reach the same point through different paths. So I've talked about the general principle of the solution. Now I'll talk about a few problems we faced and how we handled them. A lot of them concerned speed, because it's actually a pain to manage. As I said, we run simulations to get the speed of the train, but we do that only one edge at a time as we explore the graph. What that means is that we don't know what comes after. So when we reach our destination, we only know it when we explore the edge of the graph that contains that destination, but that doesn't always leave enough distance to properly brake. In this example, we have speed plotted against distance. We start in the first stage by going at full speed, and then we see that we need to stop there. We start braking, and there's a discontinuity. This is not a valid solution. In the next slide, it's mostly the same situation, but represented differently. Here we have edges of the graph: in red, edges where we speed up, and in blue, edges where we try to slow down, and we have the same discontinuity here. To stop at the end of section 4, we need to enter that section at 10 km per hour, but because we've been speeding up, we were at 50 km per hour. The way we handle this case is that we go back in the graph: we backtrack to backpropagate the constraints. We see that there's a discontinuity there, and what we actually do is go over the previous section and create a new graph edge, but this time slowing down. We know that we need to enter the last section at 10 km per hour, so we create an edge where we end at 10 km per hour.
We notice that to do that, we need to enter that section at 20 km per hour, which is still not the same as at the end of the previous edge, so we keep going, and we continue creating new edges going over the previous sections until we have something that looks like this. We have a valid path that actually stops there. The two different paths still exist in the graph, because if we go in another direction or something like that, we can still find paths that would take the top path. Then there's another problem. I've talked previously about adding delay to go after another train, but I haven't explained how we do that. As long as we can, we shift the departure time, meaning that the train, for example, needs to leave not just at 10 am, but between 10 and 12 or something like that. So if we notice that the train reaches the final station, say, 15 minutes too early and the other train is still there, we just make the new train leave 15 minutes later, and this fixes the problem. But it is not always possible. In this example, if we try to shift the departure time to avoid the problems on section 3, we would cause new problems on section 1. So we actually need to add delay between two specific points of the path without affecting the rest, and the way we handle this is actually the same as for the other problem, meaning that we go back: we backtrack on the graph to propagate the delay. So we still have the old edges that go at maximum speed, but we have new edges going from section 1 to 3 that have what we call an engineering allowance. I can't go too much into detail on how it's computed, but basically the idea is that we can do precisely what we need to: we add delay between two points of the path by slowing the train down, without affecting the rest of the path. So this edge here isn't changed, this one is actually the same, but this one is slowed down. So we're nearing the end of the presentation.
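The backward propagation of the braking constraint described above can be illustrated with basic kinematics. This sketch assumes a constant deceleration, so the highest speed at which a section of length L can be entered while still leaving it at speed v_out is sqrt(v_out² + 2·a·L). The real tool runs a full train simulation and adds new edges next to the old ones instead of overwriting speeds; the function name and parameters here are hypothetical.

```python
import math


def backpropagate_braking(section_lengths, entry_speeds, decel, final_speed=0.0):
    """Lower section entry speeds, last section first, until braking is feasible.

    section_lengths: length of each section along the path, in metres
    entry_speeds:    speed at the entry of each section in the full-speed run, m/s
    decel:           assumed constant braking deceleration, m/s^2
    final_speed:     required speed at the end of the last section (0 = stop)
    """
    adjusted = list(entry_speeds)
    required_exit = final_speed
    for i in range(len(section_lengths) - 1, -1, -1):
        max_entry = math.sqrt(required_exit ** 2 + 2.0 * decel * section_lengths[i])
        if adjusted[i] <= max_entry:
            break  # braking fits inside this section; earlier ones are untouched
        adjusted[i] = max_entry      # new, slower edge for the previous section
        required_exit = max_entry    # which becomes the constraint one step back
    return adjusted


if __name__ == "__main__":
    lengths = [1000.0, 1000.0, 100.0, 100.0]
    full_speed_entries = [0.0, 20.0, 25.0, 28.0]
    print(backpropagate_braking(lengths, full_speed_entries, decel=0.5))
```

In this example the last two sections are too short to stop from the full-speed run, so their entry speeds are lowered, while the first two sections are left as they were.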
So, to conclude with what we can do: we can find paths that avoid delaying any train, the one we add and any other. We can take detours, we can slow down, we have all kinds of ways to avoid any scheduled train. There are some features that I haven't really talked about because I didn't have the time. For example, the user can input an allowance parameter, which means that trains generally go a bit slower than their fastest possible speed so that they can catch up if they are delayed. As far as performance goes, it takes up to about five seconds, so it's not instant, but that's not really a problem for now; this is good enough. And there are some features that we still need to work on. For example, the signaling systems: for now we only support the simplest ones. The reason is that we are currently refactoring the signaling engine in OSRD, which is actually really amazing, and we would have loved to talk about it today. It's almost finished, and when it is done, we need to plug the two systems together. There are some more minor features. For now, the user can set the departure time and leave the arrival time unspecified; we also need to do the opposite, meaning we need to arrive at a given time and we don't know when we leave. And we also need the user to be able to say we want to stop there, there and there on the path. So I've been faster than I thought, so I'll show a small video demonstration of the project. It is a few months old, but it shows generally what we can do with this tool. We are in the Brittany region of France, and we add trains that go from Lorient to Brest. We just set the schedule; we have several trains going there, which we can see here. I won't go too much into detail on what the boxes are, but generally, if these boxes overlap, a train is slowed down.
So now we ask for a last-minute train from Rennes to Brest, and we do find a path. I'll explain a bit, I do have time. What we see here: the horizontal axis is time, the vertical axis is distance, and the trains we already added are toward the end of the path, so we see them at the top, while the new train goes over the whole path. Okay, now we add some other trains. No, we don't add other trains; we move one of the trains so that it blocks the path we took at first. So if we ask for another train, what we'll see is that it will be shifted to avoid the previous one. And we notice that it leaves around 7:20, something like that. So what we'll do is add another train, this time going to Quiberon, called the Tire-Bouchon, and we'll make it leave around that time. We add a few of them. And what we see here is that the train starts before all those trains that have been added. Actually, I'll explain a bit more: the trains we have added diverge here; from here they move away from the path we use, toward Quiberon. So we only see them up to here, and the train we add starts before them, then slows down to enter this section here. And we can see the speed of the trains, so... anyway, I'll move on to the questions. I also have some links: a website for the project, an email, that kind of stuff. Yes? Is this kind of solution used to create schedules in advance? It's not used to create the schedules, actually; it's used once the schedule is set, yeah. Like, at the last minute, you need to add a train on a given date. It's something I wanted to talk about but didn't really have time for in this presentation: the railway manager offers some paths for trains, and at a given time those paths are assigned to train operators who want their trains on those paths.
And once this is set, there's still room for more trains, and this is what we do here: we find the room for new trains. Yes? Five seconds response time, yes, but for how many nodes and trains? Not a lot of trains. We don't have the tools yet to import what we call SR, that is, the whole set of trains on a line. Generally we test with a few trains, like the kind of thing that I showed, and paths over a few hundred kilometers. We do know that it doesn't scale that well with the number of trains, and we'll work on that kind of question when we have something actually working and finished. You go, for example, to a fridge somewhere in the country that can have some stuff... I actually didn't really hear your question that well. Sorry. If I got it right, you asked about the troubles we can find along the way. Mostly, we assume at this step that everything is on time and works as expected. When people work on the tracks or something like that, we know in advance that the track is unavailable. Yeah. It's not real time. Not real time. Yeah, it's not real time. It's actually not exactly last minute. It's generally a few days before the trains actually run. So yeah, it's a fair assumption to just... Yeah. There was one question on the chat that this problem might be a good candidate for an artificial intelligence planner or so. Have you considered that? And please repeat the question. Someone asked on the chat if artificial intelligence has been considered for this problem. We have considered it in the project in general, but not specifically in this context. I personally don't think it would help that much. I mean, it would be a good heuristic to know which path to evaluate before another, but to still find a good path in the end, we do need to explore all kinds of solutions. The place where we thought about using artificial intelligence is a decision like which train goes before another.
The context where we really thought about this is not this case, but when trains are actually running late: which one do we favor over the other? I think it would be a good heuristic in this case, but it's not really that important. There was another question. What are the biggest remaining challenges to be solved? Definitely the signaling interface, plugging the things together, because as I showed in this slide, this problem is a pain. We do have some leads, some intuitions that we could do things in a certain way, but I won't go too much into detail because we don't know yet whether the solution we are thinking about is valid or not. But we'll work on that in the next few months anyway. I'm working in an international organization that is handling aviation, so the same kind of problems but with an additional dimension. And I'm asking how you, or your organization, managed to say: we will do this as open source, and we will make this type of solution available to others. I guess you are working for SNCF, so you get some money from SNCF and make open source. How did you get agreement on that? So the question is how we managed to make the project open source at SNCF. I'm not actually the one taking those decisions or even negotiating them. But the general idea, I think, I mean that's my vision of it, is that we don't have any competition or anything like that. We own the infrastructure for France, and I think no one else will. So maybe the other countries nearby have the same kind of problem, and maybe they could use our solution and maybe contribute to these tools, and generally it makes more sense to contribute than to compete in this context. Thank you. Yeah, cool. Thank you for this question. So the presentation has arrived on time. We are starting in a few minutes with the next talk. Thank you. |
Using open source software to boost measurement data in railways |
So, we are ready for the second talk and, yeah, I'm happy to have Joy here from SBB, talking about rail condition monitoring and how to do that with open source software. Hello everybody, happy to see that it's this crowded. I wasn't expecting that, but I'm pleasantly surprised. So, I'm going to talk to you today from a very niche department of SBB, which is measurement and diagnostics. We're basically part of the infrastructure, like Eloy before; we're trying to maintain and run the infrastructure in Switzerland. It's not such a big country, but it has a very dense train network, so the problem we face is basically that we have short train passing times and a lot of maintenance to do, and we need to be able to do this maintenance at the right moment. The measurement and diagnostics department, which I am part of, basically has two major goals. One is to maintain the safety of the system, and the other is to gather data on the infrastructure assets in order to be able to do predictive maintenance, or maintenance at the right moment, and to make sure that the money is well spent so that the assets can live as long as possible without breaking down. Here you can see two of our measurement trains. The measurements are typically sensor based, typically optical, and we'll go a little bit deeper into that afterwards. Normally, I was planning to be here with one of my colleagues, Jean Chédéric. Unfortunately he got sick, he caught a cold, so I'm here alone, but I thought I would still have his picture on a slide, so in case somebody is interested in what we're doing later on, you will know him by face at least. He's the guy who's doing a lot of the technical implementation, solving a lot of the technical problems, and I'm more on the strategic part of the project, and we'll leave it at that. The way from sensor to information is quite long, so such a measurement system is a very complex thing.
I've drawn here a couple of steps from beginning to end for track geometry measurement. As I said before, most measurements are optical by now, so what you actually do is: you have a laser plane that shines on your track, and then a camera that takes a photo. The first picture is basically a digital photo of the rail. Then you extract the contrast change, you run a lot of software, and you get the second picture, which is a half profile of your rail. Then you do the same thing for the second half of your rail and you get a full profile, and you start measuring on it. It may be a bit difficult to see, but there are a lot of little crosses on the profile, and those are the points that you're interested in, for instance the top of your rail, or a point 14 mm below the top of your rail. From this, if you have both rails, you can deduce your track geometry, for instance the gauge and the... louder? Okay, I will talk louder. You can deduce the gauge, or the elevation, or the longitudinal levels of your tracks, and you can then plot them in the form of the fourth picture, as 1D plots. Afterwards, once you've gathered this information, you have to decide what to do with it.
So in SBB and I think in a lot of other countries in Europe now too, you try to do analysis over time to see how the track geometry or other parameters in your railway system change over time and find critical passes to where you know in some time in the future that asset will break and then you can do maintenance in the right moment and assure that it won't break or that it stays safe a little bit longer or you can deduce, oh, I'm too late, I have to change that and RCM, our software suit that we built ourselves is basically an acronym for rail condition monitoring saying this whole first part of the value chain here up to the fourth picture and we try to automate that and to make it a bit more generic and we're currently trying to do that with the now shown architecture so we have the measurement platform on the left side, it gets from the administrative tools for instance topology like to know what kind of tracks exist in Switzerland. We've talked yesterday with Infrabel, it's quite similar and I've also talked to people from SNCF, I've also talked to people from Network Rail at the end, you need a topology to put your measurement data to an asset to a certain location in the physical world and while before I was not sure, before we started with this project, I was not sure if this will be a big problem if the topologies will be vastly different from country to country but at the end since the requirements are very similar from country to country, I came to observe that the solutions that came up are also very similar and I do believe that it's possible to have a generic topology description between countries and this will be, we'll see later on, a bit crucial for this project to work in different countries. 
Once we get the measurement from the measurement platform, we have this automated data cleansing and quality control processing platform. The first step is the positioning, which, as I said before, ties the measurement to a physical location on your topology. The second step is a conversion: whatever comes from one of those measurement platforms, one of those measurement trains, we convert into our open data format, and this is the first thing I would like to talk to you about; we'll go into a bit more detail afterwards. Once we have it in an open format, we do a standardization, because the different measurement systems provide the same measurement in different flavors. Let's say the gauge: one system delivers it as an absolute number, 1.435 meters, a second system shows it as a deviation from zero, one shows it in meters, one shows it in millimeters. You have to standardize that a little bit in order to be able to compare it afterwards. We do a consistency check and then we persist the data. Once we persist that data, we have, at least in Switzerland, regulations to follow, especially on the retention duration and on the ability to read the data again in the future. In Switzerland this is about 15 years, so if we do a measurement, we have to guarantee that for the next 15 years we are able not only to show the measurement but also to open it and read the data, and this again is something that is much easier to do if you have an open data format which is not proprietary, which is not tied to a specific software or a specific software version that you have to maintain too. Then it goes to the presentation layer, and the presentation layer typically can be various different programs.
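Editor's note: the standardization step described above (the same gauge value arriving as an absolute number or as a deviation, in meters or in millimeters) can be sketched as follows. This is only an illustration under stated assumptions, not the RCM pipeline: the function name `standardize_gauge`, the `unit` and `kind` labels, and the choice of "deviation from a 1435 mm nominal gauge, in millimeters" as the common representation are all hypothetical.

```python
NOMINAL_GAUGE_MM = 1435.0  # standard gauge, assumed as the reference value

UNIT_TO_MM = {"m": 1000.0, "mm": 1.0}


def standardize_gauge(value, unit, kind):
    """Convert one gauge reading to a deviation from nominal, in millimeters.

    unit is "m" or "mm"; kind is "absolute" (full gauge width) or
    "deviation" (already relative to the nominal gauge).
    """
    if unit not in UNIT_TO_MM:
        raise ValueError(f"unknown unit: {unit!r}")
    value_mm = value * UNIT_TO_MM[unit]
    if kind == "absolute":
        return value_mm - NOMINAL_GAUGE_MM
    if kind == "deviation":
        return value_mm
    raise ValueError(f"unknown kind: {kind!r}")


# Three hypothetical systems reporting the same physical gauge of 1437 mm.
readings = [
    (1.437, "m", "absolute"),
    (1437.0, "mm", "absolute"),
    (2.0, "mm", "deviation"),
]
standardized = [standardize_gauge(v, u, k) for v, u, k in readings]
print(standardized)
```

Once every reading is in the same representation, values from different measurement systems can be compared or overlaid directly, which is the purpose the speaker gives for this step.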
In our case in Switzerland we use IRISSYS from Erdmann in Germany; I think there's a couple of other countries in Europe that do that too. But we also use our own viewing software, which we call RCM Viewer, which is the second part of the open source project I would like to talk about, and which can obviously show the open data format that we use. As for how this data is gathered: typically now, if you buy a system on the market, there are different enterprises that sell such systems, for instance MERMEC in southern Italy or Plasser & Theurer from Austria, which are two of the biggest players; there's also ENSCO in the United States. You get a proprietary format from them. You may or may not be able to read it, depending on what your contract says. I know SNCF had some problems with that in the past: the contract did not state that you are allowed to actually know the format your data comes in. Once you have figured that out and are able to read it, you free yourself from the software that they impose on you when they sell you the measurement system. We've done that in Switzerland over the last years, we've had the same issues, don't worry, and we are now at a point where all our measurement systems either directly deliver an open data format, or at least we have the full specification of the binary data format that is delivered to us, and we're now trying to convert that into our open format. For that, with the last measurement train that we acquired, we specified such a format based on an open standard, which is HDF5, Hierarchical Data Format 5. You can find it online; it has existed for, I think, 20 years, it's widely used in the academic world, it's also the base of MATLAB's .mat files, for instance, and it has the advantage that most programming languages and most measurement data programs, like R or DIAdem or others that are out there, already have stubs or libraries to read it.
HDF5 itself says its key features and advantages are metadata with data, fast I/O, big data, and the other things that you can read here. What we try to do is write a specification on top of HDF5 that is specific to railway data, yet generic enough that it can handle all types of measurement systems. Basically, you have the advantages of HDF5 and, at the same time, if you follow this specification that we wrote, which you can find on GitHub, a software like the one we built can read the data whatever your measurement systems are: whether you buy them from MERMEC or from Plasser & Theurer, whether it's track geometry or, I don't know, ultrasonic measurements from Sperry or whatever else is out there. As long as you follow the specification, you can use the software to display it, you can use the software to overlay it, and so on. We named it RCMDX, Rail Condition Monitoring Data Exchange, and its key features are that it includes the metadata, like the configuration, but also the topology, which makes it completely self-contained. So in 15 years, you take one file, and you do not need anything other than the file itself and the specification to be able to reproduce a view of the data on the track, in the system as it was at the moment the data was gathered, and I think this helps a lot. And as I said before, it's accessible through the standard HDF5 tools, so you can download an HDF5 library from their website and directly access the data. You can also build your own tools in your programming language; Python has a stub, and so do Java, C++ and C#. And on the other side, you still have the genericity to use whatever measurement system you want. And let's see one of the benefits that you would get if you decide to use such a thing.
For us, one of the main benefits is that you could, if it gets adopted by more than one country, let's say, you get the advantage that in an open tender where you have to specify what you want to buy, you can handle the whole data part with just one single phrase. Please deliver the data in this specification, you can find it on GitHub. As soon as more than one tender writes it inside, I think it's basically close to a standard because this market is very small. The companies that I told you before are basically covering over 80% of the whole market. So as soon as they start to see that it is necessary to deliver their data in this format, you can get it for free. I think it's quite a big advantage that could be resulting out of using that. And of course, then as a result, you have a complete open data. We will see now, I'm switching from the data format to the viewing software. And this also is available already as freeware and will be in future as open source. And once you have your data in the RCMDX format, as soon as you follow the specification, you can basically use the software to display it. It's highly generic in the case that everything that you see now here is a workspace that can be easily configured by the user. So you can drag and drop every little window inside that workspace, put it somewhere else. You can configure it to show whatever data you want. You can change the parameters that are shown. You can change the boundaries that are shown. You can change the limits that you want to display. And you can show in parallel and synchrony in time, in space, and also on your topology different measurement systems. So you can show data from the track geometry system together with video data, together with optical data, together with ultrasonic data, whatever you think might benefit you the most. 
You can easily make it very complicated, or you can try to build new views that are less generic for an end user, for instance for a maintenance manager who has to do a specific job. In our case, in the lower right part, the part with optical data, he has to find surface defects, so you can put a specific mask on the viewing software in order to help him do that more efficiently. And yeah, you've seen it play; it's a video that I made before. This viewing software is currently written in C#, and we have been trying to make it open source for, I don't know, a bit more than a year. The main problem we are facing is that we began building it about seven years ago, together with a new measurement coach, and it has some proprietary libraries in it. The main ones are SciChart and Telerik for C#, and we're now trying to find a way to go open source while keeping those libraries inside, because it's very difficult to throw them out. Especially for SciChart, there's no good alternative at the moment if you want a performant viewing software. That's basically where we are right now. You can already download it on GitHub; I will show you, here's the link, there's also an overview later on. We also have a couple of measurement files from Switzerland that you can download too, and a couple of workspaces, and like that you can see how the data format works, how the viewing software works, how they interact together, and, if you want, implement your own version of it.
And of course, the benefit of the viewing software would then be that you are able to do massively complex views with Synchron, I don't know how you say that in English, display data at the same time, and at the same position, or at the same point in the physical world, so basically the viewing software allows you to include the topology also if it's not exactly the same, so we have in Switzerland twice a year an update on the topology, and obviously some things changed due to maintenance work, due to new building of the tracks and so on, and viewing software can handle that in order if you have different states and different files, because as I said before, every file includes your topology too, so then you can display the data where it is taken on tracks that did not have changes and it will show you where it had changes and do not display the data in the same way if the track changes were applied, and then you can in the viewing software change from one view, from one topology view to another one, and you have an easy and powerful presentation layer I think, if you do so, and of course, we talked yesterday about it, if you take a measurement run and you run from eight o'clock in the morning till noon and you do a hundred kilometers and you do a circle, then if you show it in time you will have a straight 1D chart of your measurement data, if you show it in a distance you will have the same thing, a straight chart with a distance based on an x-axis instead of time, and if you show it in topology then you have a much shorter bit showing an overlay of how many circles you have run through in that morning, so this is quite an interesting thing to do and it is basically what most easily accessible viewing software lacks, so this possibility to tie it to your physical location like that. 
Then we have the licensing, we worked a lot together with Cornelius, together with Mahalia and also with Christian, which is sitting back there on the licensing that we will be using and we decided to use an Eclipse public license after some discussions on what would make sense, mainly because it is a weak copy left license and I can imagine a future where a measurement system company uses this software directly, say your startup company, you built your own measurement system, you need a presentation layer, you can take the RCMDX data format and the viewing software and create your own possibly commercial version of it by adding your own flavor to it and still be able to sell it as long as the core functionality if you improve it, you send it back upstream and that would be like the ideal version that we could attend in the future and that is basically one of the reasons why I am here to hope that all of you will go home and use it and spread it and bring your own companies to use it too. 
Let's see an overview, so right now the most important link is the first one, SBB has since one week now a project website for this and if you click on it, maybe something happens, I don't know, right, you will directly get to a website where you can find the, it does not show, thank you, why is it so big, okay at least you can read it, so you can see the website and there is a couple of links somewhere, so you have sample data that you can download directly, I think it is more or less one kilometer of data but it gives a good impression how the whole system can work and if you are really interested in having more data you can still contact me or Jean-Frederick and we can give you access to it and there is this more on topic, normally if it wouldn't be so zoomed out it would be a bit nicer and on the top right of the site but you can directly download the installer for the viewing software and you can directly go to the Github page with the format description and yeah you also have the workspaces that you have to load into the viewing software, you can also create your own ones, that is just the help if you want to play around with it, that is basically most of what I wanted to say, thank you very much for listening and I think we have a couple of minutes for questions too. Excellent, see, thank you very much. Please, that is the need point, you don't need to have the corresponding software if you don't want to, as soon as your data is described properly it is self-contained so you can build at any moment a new software to access it, you can of course keep your software too but you don't need to, you have the specification, you can build a tool to read it within a couple of days, time's up, okay, thank you very much. |
Introducing MOTIS Project
An Open Source Door-to-Door Routing Platform |
So, we start with three presentations, it's like a routing topic now from three different countries, we present you three different routing topics and I think all of us can relate to this topic at least as a customer and it's a pleasure to me to introduce Felix Gündling from Germany, he's introducing MOTIS as a door-to-door platform, what is going open source in 2020 and is already used by some companies for internal use cases and we are really pleased to hear what is MOTIS about. Thank you very much for introducing me, today I will talk about MOTIS project and so this is a very rough overview of MOTIS, I think I could talk hours about all the details but I try to make it short so to give an overview MOTIS is a mobility platform and it's modular so it has different modules for different purposes and you can mix and match all those modules for your use case but I think the main functionality of MOTIS is the door-to-door routing that involves all kinds of means of transportation. So this includes walking, trains, buses, flights or we have also experiments for ride sharing or on-demand integration, basically you name it, everything that brings you from A to B we can use it in MOTIS and to display the data like the connections we have also our own tile server so this is a very easy way to get your own tile server also if you are only interested in displaying connections or anything on the map and of course for the user to be able to enter their wish from where to where they want to go we can also autocomplete places and yeah it's open source. 
So a short history of MOTIS or a long history basically it started in 1996 when I was still in primary school and so I cannot tell you much about it but there were first experiments about timetable data models and also the first routing algorithms and in 2003 it became multi-criteria optimization algorithm and I will go into some details later in 2007 MOTIS was the first platform that already had real-time information and could find connections on this real-time information so basically if you had delays or cancellations or reroutings this was the first time that you could actually find alternatives that would work in real-time. In 2013 we started to work on the door-to-door routing and this is actually my main topic and there we have all kinds of special topics for example we had one guy working on reliability so to find especially reliable connections where you also consider the reliability of all the alternatives so if something breaks you find still an alternative that brings you to your destination before your deadline with a high percentage of reliability considering the data from the past regarding the delays. Currently one of our main topics is accessibility so we are aiming at finding connections not just for everybody who has no problems but for all people who have some disability and there are different profiles so this is a profile-based approach where you can give us the information about profile you have and MOTIS will find those connections regarding your profile. 
Additionally we are working on park and ride so if you don't want to just go from A to B but you want to go back and you have your car parked somewhere then you have dependencies between the outward trip and the return trip and you cannot plan those journeys easily with the algorithms that are available on the market because your outward trip and return trip would not necessarily contain the same parking place and MOTIS has an integrated optimization algorithm that optimizes the park and ride problem. Additionally we are working on integrating ride hailing and ride sharing so different kinds of mobility and we are mixing everything together and find optimal journeys from door to door using all of them. Basically in 2020 then came the time that we made it open source and the open source version doesn't contain all of our experiments because those experiments were like branched from the master branch and it's a lot of work to combine everything into one version and so currently we are working on making all those experiments also open source and to maintain them all in one consistent version where you can also mix all the features together. Since we made it open source we gained some interest also by other companies so our primary partner for all the time since 1996 was Deutsche Bahn, German Railway and since then since we made it open source also other companies became interested in using our software and some of them are already using MOTIS in production and that's the path and we will be like I'm happy to see even more usage of MOTIS in production. 
You might also be interested who is developing MOTIS, this is mainly the Technical University of Darmstadt and currently we have three researchers that are working on MOTIS that's me and to others and all the time we have a lot of students doing their thesis projects or lab topics or some paid students working on MOTIS so we have a lot of churn in developers it's not always the same team but that's the research topic and as mentioned before I want to talk a little bit about multi-criteria optimization because that's one the main difference between rooting on the street and rooting with all kinds of transportation means is that usually it's not sufficient to only optimize the travel time or the distance traveled but you have also other criteria like the number of transfers or some people like want to travel cheap or some want to travel sustainable so they don't want to have a lot of CO2 produced and so there are different criteria and we actually don't know exactly what mix of criteria the user currently using our software has so this is a difficult problem to solve and since we don't know this we give them all the optimal solutions regarding the criteria so basically we do a multi-criteria optimization and find the Pareto set of all optimal solutions that form an optimal trade-off and currently the main version of MOTIS has the three criteria departure time arrival time and number of transfers so it's better to depart later arrive earlier and we want also to have the number of transfers so in this example we have three connections and basically for example the nine o'clock connection that goes to 10.15 has two transfers so it but it takes longer and then nine to ten connection has three transfers but it's faster and since we don't know exactly who wants which connection we just show all of the connections so if we look at those criteria we would show the optimal connection for like the fastest connection for all number of transfers and yeah basically the approach 
can be adapted to optimize any set of criteria that you can measure numerically, that you can count in some way. So how does it work, how do we do the door-to-door routing? Basically we have two steps. One step is to compute the connections from the actual address where you start to all the stations that could connect you to public transport, or in general to timetable-based means of transportation. We do the same on the destination side: we look at all the stations in proximity to the destination and we route from all of those to the destination. We convert these options to go from the start to a station, or from a station to the destination, into a set of edges, and all these edges are then inserted temporarily into the data model of our main routing algorithm. For the main routing we support a variety of algorithms that optimize the overall door-to-door connection. Here we have, for example, the RAPTOR routing algorithm, the trip-based routing algorithm, or a graph-based solution that is Dijkstra-based. So this is the optimization approach, and it's a nice approach because it guarantees optimality, and all the algorithms named here produce exactly the same results. This is basically our quality assurance: we want to make sure that all the algorithms in our system produce exactly the same results, because they have the same definition. The department at the Technical University of Darmstadt that is working on MOTIS is the algorithms department, so our focus is basically on the algorithms. But to be able to show the platform we have some front ends: we have an Android app, we can show the timetable data on a live map, and we also have an interface to search for connections and actually use the routing. Those front ends are currently not capable of displaying all the functionality that the back end has, so our focus is more on the back end and
algorithms. I also want to talk a little bit about our roadmap. We are currently working on making accessible door-to-door routing possible, that's our main focus. Therefore we are working on a new data model that is also capable of storing the timetable data in a compressed way. We don't have one object per connection. Instead, if a connection takes place on several days of the year at the same time, we store the connection once with a bit field that has a one if the connection takes place on that day and a zero if not. This is a very efficient way of encoding the timetable. Additionally, for the street routing we currently use OSRM, which is very memory intensive, but there are alternatives and we are trying to integrate those too. And of course, as I said before, we are working on bringing more and more of the research functionality that we have in different branches into mainline MOTIS, so you can use the features together and bring the research into production. That's it about MOTIS. If there are questions, I'm happy to answer, or I could also give a short demo. So maybe questions first. Yeah?
This is to the next question: how do you decide on that?
Currently we ask the user to give us the maximum time they would like to spend on the first and last part of the journey for each means of transport. So they would say, I would like to travel a maximum of 20 minutes by car, and that's basically our limit. But of course this is only one way to do it, and this is the backend functionality, so if you used this as a user in an app or on a website, this doesn't need to be the way the user interacts with the system.
I didn't look... I think their algorithm is not open source, so I couldn't look into the details. But from what I've seen, I think they are not doing their computations in one data model like we do. We have one data model where we do the overall optimization, so we can guarantee optimality on this data model because it's all included. From what I've seen, they have different routings and they try to combine the routing results in a post-processing step, and I think this doesn't guarantee optimality but probably produces reasonable results for the end user.
Thank you. How are you able to take live conditions into account, like a traffic jam? How do you...
Yeah, we have the option to use GTFS-RT real-time data for the timetable-based means of transportation. For the street routing we are currently using OSRM, and we are planning to support Valhalla, and those can or cannot take this into account depending on how you configure them. So this is not our main focus. It basically depends on which algorithm you use to compute those first and last edges.
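The multi-criteria idea described earlier in the talk (prefer a later departure, an earlier arrival and fewer transfers, and return every journey that is not beaten on all three) can be illustrated with a small Pareto filter. This is only a sketch under simplifying assumptions: the `Journey` type, the function names and the sample times are made up for illustration, and MOTIS itself computes the Pareto set inside its routing algorithms rather than by filtering a finished list.

```python
from typing import NamedTuple


class Journey(NamedTuple):
    departure: int  # minutes after midnight; later is better
    arrival: int    # minutes after midnight; earlier is better
    transfers: int  # fewer is better


def dominates(a: Journey, b: Journey) -> bool:
    """True if a is at least as good as b on every criterion and not identical."""
    return (
        a.departure >= b.departure
        and a.arrival <= b.arrival
        and a.transfers <= b.transfers
        and a != b
    )


def pareto_set(journeys):
    """Keep only the journeys that no other journey dominates."""
    unique = set(journeys)
    return sorted(j for j in unique if not any(dominates(o, j) for o in unique))


# Illustrative data: two trade-off journeys plus one that is worse on everything.
candidates = [
    Journey(departure=540, arrival=615, transfers=2),  # 9:00 -> 10:15, 2 transfers
    Journey(departure=540, arrival=600, transfers=3),  # 9:00 -> 10:00, 3 transfers
    Journey(departure=530, arrival=620, transfers=3),  # 8:50 -> 10:20, 3 transfers
]
optimal = pareto_set(candidates)
```

In this example the 8:50 journey is dropped because the 9:00 to 10:15 journey leaves later, arrives earlier and has fewer transfers, while the other two journeys both stay because neither beats the other on every criterion.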
So basically what you do is download all the data, put it all in one folder on your server and give that folder to MOTIS. It has all the timetables in the GTFS format or the HAFAS raw data format, and you give it the OpenStreetMap data. For the real-time data you can give it API endpoints from which MOTIS will pull regularly, at a one-minute or 30-second interval. This is basically how all the data comes together in one data model: MOTIS loads all the static timetable data at boot time and possibly does some preprocessing on it, and then at run time it pulls real-time data from the sources that you have configured.
Transmodel?
Sorry, I haven't heard about it.
Okay. It's a CEN standard concerning data modeling for multimodal transport. This is why I thought there might be an association, but it's independent.
Okay, yeah, it's independent.
Thank you. We don't have more time for questions, but I think Felix is here and you can ask more questions afterwards. Thank you, Felix. |
Transit network planning for everyone
optimise your network, reduce transit time for users! |
So, hello everyone, we continue with the next project, the next routing project, and I'm really pleased that Yannick Brosseau came all the way here from Montreal, Canada, and is going to present Transit Network Planning for Everyone. It's a great project to show how a research project goes into real life, and the project just went open source last autumn, so we're really excited to see what your project is about. Thank you. Bonjour, hi everyone. I always find a good excuse to come to FOSDEM. As Melia said, I'm from Montreal. I live in a borough called Verdun, which is still on the island of Montreal but a little bit far from the subway, so I have to rely on buses to get everywhere on the network. Luckily Montreal is a grid-based city, so it's really easy to run bus lines just about everywhere. But the problem I have is that every time I try to get the bus, they all seem to be in sync: even though I have three or four lines, I look at the schedule and there's no bus available right now. And I asked myself, can I find a way to prove that and tell the city council or the transit agency that there's a problem there? Actually, a couple of years back I started helping a couple of friends at Polytechnique Montréal on a software project called Transition, which is a transit planning tool. It's aimed at transit planners at first, but as much as we want to make a really good tool for transit planners, people working in transit agencies, our ultimate goal is to make the tool easy to use for every citizen, so that everybody can actually understand what the transit problems are and come up with solutions. We are really far from that goal, but we want to make it as easy as your favorite city simulation game. So as I said, we are a research group based at Polytechnique Montréal called Chaire Mobilité.
It's a mix of transit engineers who study transit problems, mobility problems, but we also have other people: we have an economist, and we have a few software folks to help with the software development. Being an engineering school, we tend to work on applied solutions, so we have developed a few tools over the years. We have Evolution, which is a travel survey platform that I'll talk about a bit more later. We have a tool for congestion dashboarding, tax dashboarding, and the one I'm talking about today, the tool for transit planning. If you're a transit professional, your main day job is to actually draw new lines for your city. There's more analysis to it than that, but this is the main thing the tool provides. You have the map of the city and you can add new stops, add lines and draw where they go, or you can import a GTFS file for your network and export it as you go. This is Brussels: I just imported the whole city and can go and work on that and see where the problems are or what we can improve. We have the concept of a variant or scenario: basically you want to do some studies and maybe exclude some part of a line or exclude some mode of transportation. Some people only like buses and just want to take the bus, so you might study your network that way. We have a schedule editor or viewer: you can see the whole schedule and edit it. We have a simple editing mode where you can say, I have five buses for this line, and it will generate your schedule considering the time to travel the route and the dwell time at every stop. We are not aiming to make a tool to operate the network, like OSRD presented before, which is more tailored to the actual scheduling of your trains in real life. We want to keep it more high level: what's the idea of the plan?
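The simple schedule generation mentioned above (give a number of buses, get a schedule from the running time and the dwell time at each stop) can be sketched with basic headway arithmetic. This is a minimal sketch and not Transition's actual code: the function names, the layover parameter and the assumption of a symmetric out-and-back route with evenly spaced departures are all hypothetical.

```python
import math


def cycle_time(one_way_travel_min, dwell_min_per_stop, num_stops, layover_min):
    """Round-trip time of one vehicle: out and back, with dwell and layover."""
    one_way = one_way_travel_min + dwell_min_per_stop * num_stops
    return 2 * (one_way + layover_min)


def generate_departures(num_vehicles, cycle_min, start_min, end_min):
    """Evenly spaced departures from the first stop (minutes after midnight)."""
    headway = cycle_min / num_vehicles
    departures = []
    k = 0
    while start_min + k * headway <= end_min:
        departures.append(start_min + k * headway)
        k += 1
    return headway, departures


def vehicles_needed(cycle_min, headway_min):
    """Inverse question: how many vehicles does a given headway require?"""
    return math.ceil(cycle_min / headway_min)


# Illustrative numbers: 40 min running time, 20 stops with 30 s dwell each,
# 10 min layover at each terminal, 5 buses, service from 6:00 to 8:00.
cycle = cycle_time(40, 0.5, 20, 10)
headway, departures = generate_departures(5, cycle, 360, 480)
```

With these assumed numbers the round trip takes 120 minutes, so five buses give a departure every 24 minutes. The `vehicles_needed` helper answers the reverse question of how many buses a given schedule requires.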
If you are a small city or a small transit agency, you can use it for your operations. It's simple enough and it will give you some information: if you enter the schedule it will tell you how many buses you need, so it can give a quick estimate. But it will not consider things like staff scheduling, maintenance or other issues, so we want to keep it more high level, at least at this point in time. The first kind of analysis we do, like the MOTIS project presented before, is routing. It's basically the same simple problem: you have an origin and a destination, and we allow you to specify a bunch of parameters to represent what Felix was talking about, different kinds of needs for different kinds of people. Do you accept multiple connections or not? Are they older people, who want to walk less or walk slower? That kind of thing. We can calculate the best route or we can show you all the different alternatives. So if you wanted to go from the Grand Place to here at 9 a.m. this morning to get here in time, the best option it gives you is the 95 bus, but there were 17 other alternatives that were in range to get here in time. We show you transit, we show you walking, and we are working on more modes, like adding cycling or park and ride. There are a lot of complicated problems in combining networks like cycling and park and ride, a lot of difficulty in there, and I am really looking at what you do, to see if it can save us some time there. I was talking about variants. I am sometimes a bus snob: I take the bus a lot at home, so especially on this side of the Atlantic I try to stick to rail. So I excluded all the buses, and this is the best route if I want to stay on rails. The second kind of analysis that is really used in transit planning is the accessibility map. The idea there is to specify a point in the region, and it will show you the area you can reach in a specific amount of time.
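The core of an accessibility map, finding everything reachable from one point within a time budget, can be sketched as a time-bounded shortest-path search. This is a simplified illustration and not how Transition computes it: the graph, the place names and the fixed travel times are invented, timetables and waiting times are ignored, and the result is a set of reachable places rather than an area in square kilometers.

```python
import heapq


def reachable_within(graph, origin, budget_min):
    """Dijkstra search that stops expanding once the time budget is exceeded.

    graph maps a place to a list of (neighbour, travel_minutes) pairs.
    Returns a dict of reachable place -> fastest travel time in minutes.
    """
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        elapsed, place = heapq.heappop(queue)
        if elapsed > best.get(place, float("inf")):
            continue  # stale queue entry, a faster way was already found
        for neighbour, minutes in graph.get(place, []):
            total = elapsed + minutes
            if total <= budget_min and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# Hypothetical network with travel times in minutes.
network = {
    "home": [("bus_stop", 5)],
    "bus_stop": [("market", 8), ("metro", 12)],
    "metro": [("downtown", 10), ("campus", 25)],
}
within_15 = reachable_within(network, "home", 15)
within_30 = reachable_within(network, "home", 30)
within_45 = reachable_within(network, "home", 45)
```

Running the same search for several budgets gives the nested 15, 30 and 45 minute zones, and running it on a network with some lines removed gives a comparison between variants.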
In this example I show you, from this place, what you can reach in 15, 30 and 45 minutes. The main interest of the tool is this: in the interface you can do one point at a time, but if you want to do a real analysis you will want to do a lot of them, and that is what the tool provides. As for the routing part, you can upload a file of origins for the accessibility map, or of origin-destination pairs for the routing, and then you can do a real-life analysis of all the movements in a day and see what's best in your network. Every time you will get a bunch of statistics. For the accessibility map the interesting one is how many square kilometers you can reach in a specific time. If we use the same variant excluding the bus network, you see you reach a little bit less if you stick to rail: about 10 square kilometers less. So that's an interesting thing to analyze: if you have some new lines, you do the analysis with the new lines and without them and compare the results. We are currently working on new views to do this diff, so that instead of having to look at the map yourself, we show you the difference between the two analyses you did, and the difference in statistics between the two solutions you are exploring for your network. A third analysis we do is a simulation and optimization algorithm. It's based on a genetic algorithm. The idea is that you provide your existing network, add a bunch of new lines or create some random lines in your network, and, using simulated trips or actual trips if you have the data, it will try to find the best lines to put in your network to reduce the overall transit time for your population.
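To make the genetic algorithm idea concrete, here is a toy version: pick a subset of candidate lines that minimises total travel time for a set of trips without exceeding a budget. Everything in it is an assumption made for illustration (the cost model, the per-trip time tables, the selection, crossover and mutation settings); the talk only states that the optimization is based on a genetic algorithm, not how it is implemented.

```python
import random


def total_travel_time(selection, walk_times, line_times):
    """Each trip uses the fastest option among walking and the selected lines."""
    total = 0
    for i, walk in enumerate(walk_times):
        options = [walk] + [line_times[l][i] for l, on in enumerate(selection) if on]
        total += min(options)
    return total


def fitness(selection, walk_times, line_times, costs, budget):
    """Lower is better; selections over budget are rejected outright."""
    if sum(c for c, on in zip(costs, selection) if on) > budget:
        return float("inf")
    return total_travel_time(selection, walk_times, line_times)


def optimise_network(walk_times, line_times, costs, budget,
                     generations=50, pop_size=20, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(costs)

    def score(s):
        return fitness(s, walk_times, line_times, costs, budget)

    # Start from the empty network (always within budget) plus random candidates.
    population = [tuple([0] * n)]
    population += [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score)
        next_gen = population[:2]  # elitism: the two best survive unchanged
        while len(next_gen) < pop_size:
            a = min(rng.sample(population, 3), key=score)  # tournament selection
            b = min(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1)                    # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)                # bit-flip mutation
            next_gen.append(child)
        population = next_gen
    return min(population, key=score)


# Hypothetical data: 4 trips, 3 candidate lines; 999 means the line does not serve the trip.
walk_times = [60, 45, 50, 30]
line_times = [[20, 25, 999, 999], [999, 20, 25, 999], [999, 999, 15, 10]]
costs = [5, 4, 6]
budget = 10
best = optimise_network(walk_times, line_times, costs, budget)
```

Because the best candidates are carried over unchanged each generation, the result is never worse than the starting network. Keeping the budget fixed corresponds to the "keep the cost constant" setting, and raising it corresponds to allowing some extra spending.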
You can constrain other parameters, like: I want to keep the cost constant at what I have right now, so I will not add new trains or new buses. Or you can tell it, okay, I have some extra budget, I can spend 10% more. For some of the cities we studied, we showed that by using that kind of algorithm we can reduce the overall transit time of users by about 15%. Some users will get worse times, but most users will have better times. And we did that for real for some cities. This is Drummondville. It's a town of about 80,000 people sitting roughly midway between Montreal and Quebec City. They came to us with their actual network and said, it doesn't serve our users well, we want something better. So we ran the algorithm and gave them this. If you compare quickly, you see the same overall idea with the lines: we started from what they had, but we explored some space around that. In this case we drew the candidate lines ourselves. We have a couple of students working on algorithms to generate new lines automatically instead of having you try to add new ideas by hand. This is actually being implemented right now as the new network for that city. Another real study we did: there's an idea to add a new subway line that would go diagonally across Montreal, a project called the Pink Line. It might not go ahead, there are a lot of ideas floating around, but the city and the government came to us and asked, can you show us what the impact of adding that line to the city would be? So, using the batch routing capability of the tool and actual travel survey data from a previous study, we were able to simulate all the movements of people with the line and without the line. We saw some interesting results: overall, adding that line would reduce transit time by about 5% for the whole population. Interestingly, we showed that the people who would benefit the most from that line are people who actually use a car right now.
So hopefully adding it would shift some trips from the car to a better transit solution. This is the overall map of the city with that project and all the projects currently under construction in Montreal, and in blue we show where most people would get at least a two-minute improvement in their transit time on average with the new line. So how is the tool built? Basically it's a web application that runs on Node.js. We converted all the code to TypeScript to have at least something that is not too awful to maintain. Some components in the back end are in C++ and Rust, and we are trying to move more and more of the components to Rust to improve maintainability. It all runs on top of PostGIS and PostgreSQL. For the basic routing, like most people, we use OpenStreetMap data, and we use OSRM to do the basic routing on roads and pedestrian paths. We did modify the basic profiles to get more accurate results, especially for buses in cities. You want to be more accurate because a bus is not a car: a bus cannot necessarily go on every street and cannot turn as easily as a car, so we have a specific mode for that. Same as MOTIS, we are also looking at Valhalla. Maybe it could give us some improvement in our operation. For the transit part we developed a tool called trRouting, a standalone component that is also available. We implemented the Connection Scan Algorithm, and the reference to the paper, from people at the Karlsruhe Institute of Technology, is there. It's a really efficient algorithm. We had a trip-based algorithm before, but we now use that one. It's mostly in C++ at this point. I wanted to convert it to Rust, but maybe I can just use MOTIS for that part and not have to worry about it. But if we want to do planning, we need some data. If you're a transit agency or a city government, you probably have that already, but if you're a citizen you need to find that data somewhere.
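For readers unfamiliar with the Connection Scan Algorithm mentioned above, its basic earliest-arrival form is short enough to sketch. This is a textbook-style illustration with invented stop names and times, not code from trRouting. It assumes every connection has a positive duration and leaves out footpaths, minimum transfer times and journey reconstruction.

```python
from typing import NamedTuple, Optional


class Connection(NamedTuple):
    dep_stop: str
    arr_stop: str
    dep_time: int  # minutes after midnight
    arr_time: int  # minutes after midnight


def earliest_arrival(connections, source, target, start_time) -> Optional[int]:
    """Scan all connections once in departure order, relaxing arrival times."""
    earliest = {source: start_time}
    for c in sorted(connections, key=lambda c: c.dep_time):
        can_board = earliest.get(c.dep_stop, float("inf")) <= c.dep_time
        improves = c.arr_time < earliest.get(c.arr_stop, float("inf"))
        if can_board and improves:
            earliest[c.arr_stop] = c.arr_time
    return earliest.get(target)


# Hypothetical timetable.
timetable = [
    Connection("A", "B", 480, 490),  # 8:00 -> 8:10
    Connection("A", "C", 485, 525),  # 8:05 -> 8:45, direct but slow
    Connection("B", "C", 485, 500),  # 8:05 -> 8:20, leaves before we reach B
    Connection("B", "C", 495, 510),  # 8:15 -> 8:30
]
arrival = earliest_arrival(timetable, "A", "C", 475)
```

The whole timetable is one flat list sorted by departure time, so a query is a single linear pass with no priority queue, which is a large part of why the algorithm is fast in practice.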
Luckily, most of what we need to do the planning is available in OpenStreetMap. The basic road network, all the pedestrian paths, rail lines, it's mostly all in there. So you can go ahead, use OSRM and do some quick planning, and you have everything you need. If you want to do population and trip-based analysis, you will need to simulate a population. Even there OpenStreetMap can help us. If you have all the buildings drawn on the map, you can extrapolate your population a little: if it's a residential building, you know some people probably live there, so you can see how the population is distributed in the city. And on the other side, if you have the POIs, the points of interest, all the other buildings, the shops, the schools, the industries, you know where people usually go during the day. So in our genetic algorithm we actually use that kind of information to create trips for the analysis. OpenStreetMap is pretty good, but it's not always good enough for what we need. We estimate that we have to spend some time validating the data to make sure we get accurate results. You can use it as is, but you might get errors. For a dense urban environment like Montreal or Brussels, we spend about 25 hours per square kilometer doing the validation. For a suburban environment, maybe 10 hours per square kilometer, and on the rural side maybe a couple of hours. The good thing is that once it's done, it's done: you just have to make sure new edits don't break everything, but the information will be there. What do we look at when we do the validation? What are the important things we need? The first part is to make sure we have all the links between the pedestrian network, the cycling network and the rest of the road network. Take something like the tram stop near the ULB: there are a lot of pedestrian paths going in and out.
If we didn't have connection well between all those time, the routing algorithm would not be able to just get you from the tram network to the campus or we'll find a route that goes around. It's really important to make sure you have all the correct access tag so we have to go look at everything in that and even add maybe some connection between street to go around in there. We'll usually split the sidewalk and the cycling path. I know it's a little bit a debate in the open stream map community, should you split the sidewalk or just use the sidewalk tag? But at least for us, we get a really more accurate result if we split the sidewalk. For big street, having a few meters difference between the middle of the road and the sidewalk that can be more easy to go around. We also know the quality of the road there. We don't use that information yet in our routing. We want to get there and be able to tell people, yeah, this route, if you walk on that street, there's a large, large, large wall on the street. It's more fun to walk there versus some of the road. Whenever I plan from my place to the subway, I always take the road that the tool proposed me because it's more fun to go on that other road because there's shops, there's more life, so it's more interesting. The other thing, we will add doors to big buildings, like big building, like university like that. If we use the center, the centroid of the building, we might route you on the completely other side of the building. If it's a big one, maybe that will end up being a different bus completely. We need to make sure to know where are the main entrance. We're going to go and add that, like big shopping center, things like that, industry big warehouses. We're going to spend time just mixing the street, our well-aligned, the one-way out there that will be useful to ride cars and buses. Make sure we have the speed limit. We can consider that in the transit time of a vehicle. 
Since we have OpenStreetMap open anyway, we try to add all the points of interest that we can find, with at least the name and the type, which will help the algorithm. We can go further when we have time: we can add reserved lanes for buses. We would like to automate a bit more of that using MapRoulette or the Tasking Manager from the OpenStreetMap community. We are not there yet, but it's getting there. For the population, I've talked a bit about the OSM buildings. What we tend to use a lot is the land use register. A lot of governments provide that for free, or at least you can request the information. Basically it's the information about all the buildings in a specific area. You know the building type, so you can work out the population there or the points of interest, where people want to go. You might want to use the census information that most governments provide. The difficulty there is that the areas are generally really big. They will often cover multiple blocks in the city, so they will cover multiple bus lines or stops, and it will be hard to find exactly where people would come from. We are working on some ideas, some algorithms to spread the population in a sensible way so we can simulate based on that, but it's still a work in progress. Lastly, there are travel surveys. If you're not familiar with the idea, most agencies will spend time once in a while to go and ask users where they are coming from, where they are going and why they are traveling. They do that by phone, by mail, or by stopping people on the network and asking them where they are going. That's really useful information. You can take it and do an exact simulation of a specific day, but that data is not generally available. If you're a researcher you can get access to it, but a general citizen will not have that. That's why we try to take the population and simulate that kind of movement.
It would be interesting at some point, if we can anonymize the information, to use the same idea, spread the population somehow, and make that privacy-safe. I talked about the Evolution tool earlier. It's a web-based survey platform that can be used for that. It has all the plug-ins to actually ask people where they're going and where they're coming from, so it's more specific than a Framadate or SurveyMonkey kind of thing. As I was saying, we are also a research group, so there are a couple of challenges in bringing stuff to the real world. How do we make sure the things we do are actually useful? The first thing is that the code is open source. As with most research, I strongly believe everything should be open source. Getting the students to work in the open would be a good first step. We're not there yet. A lot of people want to go as fast as they can, finish their thesis and give us the code at the end, and then we, as research professionals, have to do a lot of cleanup and make sure it works for everybody. But it would be a good way to go. The other thing we do is make sure we partner with actual transit agencies. In Quebec, we work with almost all the main transit agencies of the major cities. We actually have them use the tool and give us feedback. We sit with them and ask what kind of studies they do, and sometimes they say, we do a batch routing, then we take the CSV file, then we do some Excel, then we run some Python script, and we say, okay, what if we just gave you a button that does all that and a quick plug-in to open it in QGIS? We try out those ideas. As I said, we are far from done. We currently have 450 open issues. Not all are bug reports, though there are some. It's also stuff like improving the UI, implementing new algorithms, trying to integrate other tools. It's a work in progress, and we are always open to reviewing pull requests.
As for my original problem, I ran a batch accessibility map: from my place, every five minutes, how far could I get in the city within 60 minutes? I got this really big oscillation, which showed me there's some kind of problem with the schedule, especially around the 10:30 mark, where I drop by almost 15 square kilometers. I remember that every time I tried to leave the house at that time, and 10:30 is a good time to start working, there was no bus available. So I think there's a problem there. It's not highly scientific so far, I'm just a computer engineer trying to fake my way into transit engineering, but I think there's a problem there. So that's it, thank you. If you have any questions...
Do you think it would make sense to use the same tool at a different scale, for example drawing train lines at the scale of a country? Would it make sense to use Transition at that scale?
So the question is, does it make sense to work at a larger scale, like country level? I think so. Right now the performance of the tool might not be there yet. When I work on Montreal as a whole, things start getting slow and we are hitting limits, so at this point we are trying to improve scalability. We definitely want to get to that level. It depends on the size of the country: Canada might be a bit too big, but at least at the province level we definitely want to be able to do that.
I think that the first for me, there are two main problems in Brussels, is the transportation of people, and I think it's important to inform the people, which go with all mode of transportation, with indication when is the next bus, when is the next trolley, and when is the next train, and so on, also for bicycle users to pay in the train, because it's expensive, it will have to go with the train, it's more bicycle, not for me, for computer, it should be good to have an announcement in every place, in the motorways, so in the plane, in the trains, and so it can be so computerized. Yeah, it's interesting comment, and yeah, that's where we want to make the tool for people, but I know a lot of politicians, and I want to make the tool as good for them to actually be able to try it by themselves and understand the problem. We often have people come to Catherine, which is the head of the research lab, and like they asked them for studies, but they are not always implemented, but if people can just see for themselves what would be the improvement, that could be interesting in there. Maybe one last question, it's short, very short. Yeah, very short one. That's the comment, it doesn't make sense to actually sort of like adapting the network to the city, how can we adapt the city to the network, and where can we build new development and new residential plan, new residential area. That's a really interesting idea, we never thought of that, but maybe there's something we can find a way to implement and see where there's potential to redevelop the city maybe and increase density in some places. Thank you for your time. |
Digitransit
An open-source journey planning project |
So, to have enough time for our last routing specialist, now from Finland, we will start on time. Yes? Please... no, it's not working, because it's for the camera. I'm shouting because we don't have audio here. Yeah, it's not a microphone. But let's welcome our last speaker, from Finland. It's Joel Lappalainen, I hope I pronounce your name well. We are really glad to have you with us, introducing Digitransit to us. From Finland we expect a lot, because you are already known for great mobility solutions. So let's give him the chance to explain Digitransit to us and highlight the really interesting components. Hello, hopefully you can hear me well. My name is Joel Lappalainen. I'm from Finland, from HSL, and I work as a developer on the OpenTripPlanner and Digitransit projects. Today I'm here to talk about the Digitransit project. So what is Digitransit? It's an open source project that does multimodal public transportation journey planning. It was founded in Finland around 2014 or 2015. I have personally been working on the project since 2017. The goal of the project was to replace existing legacy proprietary journey planners, and the project is funded by three transportation authorities, or authority-like organizations, in Finland. The components of the project have since been used elsewhere, for example in Germany. The project consists of user interfaces, which I'm going to talk about more in this presentation, backend services and open APIs. First a bit about the backend services. For routing and much more, we are using OpenTripPlanner. Hannes will have a presentation about OpenTripPlanner next, so I'm not going to go too much into it now. For geocoding we are using Pelias, which we have further improved so that it deals with GTFS data and so on. And for maps we are using a project called HSL Map Server, which uses tilelive-gl and OpenStreetMap.
We also use OpenStreetMap for the geocoding. So now for the user interfaces. First, this is the Digitransit user interface. It's written in JavaScript and uses React. It's responsive, as most of our users are on mobile devices. It's browser-based, and it's also meant to be accessible, so you can use it with screen readers and so on. It's configurable: you can have different themes. My screenshots are from different sites, so they have different configurations and themes. This is the front page on desktop. Here you can start the itinerary search. You can enter the origin and destination, which can be addresses or stops or whatever, and you can add favorites or use favorites as locations. Then you can enter the "stops near you" pages, and I will show later what this means. And then you have this search box for stops, routes, bike rental stations, bike and car parks and so on. I won't show those views now because time is a bit limited. On the map, if I had zoomed in a bit more, more would have been shown. Now it's just showing some stations. Okay, so these are the itinerary views on mobile. The view on the left shows the summary of the results found. It shows multiple itineraries, with just a summary of each. The length of each leg in the bar is based on the duration of the leg. We support real-time data, so the green numbers are real-time estimates. And this is multimodal, so in these suggestions there are buses, trains, metros, walking, and there could be bike rental. Then we also have the multimodal options split into separate suggestions at the top: there's cycling, there could be just walking, combining cycling with public transportation, or combining cars with public transportation. On the right is the view shown when you click open a suggestion. It shows more details about the suggestion and it shows the map.
If the trip was happening right now, we could display the relevant vehicle on the map for the suggestion, and yeah, there are more details here. This view is the "stops near you" page that I mentioned earlier. We did some design work and discussed it with real users, and they found this view to be useful. So we have split this into different modes. It shows the next departures for buses in this view, but there's another view that shows, for example, trains or metro or whatever is available in the city. And there's also one view that shows the departures for your favorite stops. And these stops are the stops that are near you or near a selected location. And on the map it will show the vehicles that travel through the stops and also the bus lines. So here's an example of how the Digitransit UI can be integrated into different websites. This is the front page of the Helsinki Regional Transport Authority. On the left side, we have the same components we had on the front page of the Digitransit UI. This is done through shared components. So this only acts as an entry point. Once the user selects locations here, he or she will be redirected to the Digitransit user interface. And here's another example of how the Digitransit UI can be integrated. This is done through an iframe widget. There's a view which can be used to generate these iframe snippets. And this one is for general routing, but it's also possible to have it only for walking or cycling, so the user will be redirected to the correct view. This is another project, another user interface. Previously we called it virtual monitor, but now it's called stop monitor, which describes it better. This stop monitor is meant for businesses or private persons or public transportation authorities, so they can have a screen somewhere which displays the departures for the stops near that location. I should also mention that this is written in TypeScript and also uses React.
This view is for generating the stop monitor view. In this you can have multiple stops on the same view, and you can have multiple views that rotate based on some time frame. So, for example, every five seconds it will change which view is shown. And there are many options for layouts here. Here is just an example end result. You can have this split view where there are departures on the left, and on the right, this one is for the airport, and on the bottom there are relevant alerts for that stop. And this can be modified later on, so once you have created it, it's not really final; you can still make modifications. So what's next in the Digitransit project? We started the project so early on that OpenTripPlanner only had version one out. So we are currently working on improving the support for using OpenTripPlanner version two. It's already working, but there are some optimizations that can be done. The Digitransit UI that I showed is version two, and we have version three, which is better optimized for using OpenTripPlanner 2. And then we will have the possibility to have a map view on the stop monitor, so you can display the vehicles that are going to the stop, and the user can see: okay, this vehicle is coming here, maybe I should start going to the stop. And then we will improve the support for first and last mile services. We have already been supporting bike rental stations, and in Germany they have added support for scooter rentals, free-floating vehicles, and we will also focus more on flexible public transportation. There are many types of services like that, and this can include taxis or things like that. For more information you can visit digitransit.fi, which contains some documentation relevant to the project, and the repositories are hosted under the HSL development community GitHub account. There are many projects there; some are not Digitransit projects, but if you search for Digitransit you will find the relevant ones.
And we have this email address, digitransit@hsl.fi; if you send a message there, I will get it. So thank you, that was the presentation. Now there's time for questions if you have any. Yes, for me it's interesting, coming from your country, that I would like to have this kind of information for Europe. Yeah. In the community first, and it would be interesting to have the best route to reach a place everywhere in Europe, and it's not easy at the moment, because you have only national lines. I remember, 20 years ago, it was cheaper with regional trains in Germany, but they were slow and there were problems with that. I also know that to go from Brussels to Freiburg im Breisgau in Germany it was cheaper by train in Germany, and so on. So, with all this kind of information, do you plan to extend your project to the European level? So we might expand it to include Estonia for this Finnish project, but there are still some issues with scaling this, because if we are using multi-criteria search it doesn't scale to continental scale yet on the machines we are running on. So there are some optimizations needed to get it to work. This might be included in the first and last mile services. I'm not quite sure about the details yet; we are still in the design phase on what we will support, but that is an idea that we have discussed previously. |
OpenTripPlanner
past, present and the future |
So, welcome everybody. My name is Hannes Junnila. I've worked with OpenTripPlanner, which is an open source journey planner quite similar to the MOTIS project, for almost 10 years now, a bit on and off, with other projects as well, and for all kinds of different organisations, both in Finland and now at the moment in Norway. But before we talk about the project itself, I would like to say a couple of words about Entur. I don't think Entur is as famous a company as some of the other ones that we have heard from today, but it used to be a part of the Norwegian state railway company, and it was carved out when competition was introduced in the Norwegian railway system, and nowadays it takes care of all ticketing and all public transit timetable information in the whole country of Norway. And we use almost exclusively open source software, at least in our team, which handles the incoming transit data, all the journey planning, and all the APIs that we provide for all of the railway companies and for all of the hobbyists that want to have a journey planner API or a departure screen at their home. Any kind of API, we can provide that. And here in the middle is OpenTripPlanner, which is quite the key component. So, first, I will talk a bit about the past. How did we get here? What was OpenTripPlanner originally? How has it evolved over the years? What were some pain points in its history? Then I will spend most of the time on OpenTripPlanner 2, which we released two years ago and have since continued to develop. And finally, I will have a couple of minutes to discuss some topics that we will be implementing in the future. But before I start, I would like to ask the audience: how many of you have heard about the project before? Can I have some hands? So, about a half, I would say. How many of you have actually tested it yourself?
So, almost as many, which is a good sign, because I think it's very important for an open source project that it's easy to get going with it and easy to take into use. And yeah, as I mentioned already in the beginning, OpenTripPlanner, or OTP, as I will probably refer to it, is quite a similar project to the MOTIS project that was presented earlier. It's a passenger information system for transit data and other multimodal services, and it handles journey planning, but it also has APIs for just querying the data. Here we can see how the number of commits to the repository has developed over the years. It started in 2009 in Portland, Oregon, in the US, where they wanted to build a journey planner that was open source. So they took people that had been contributing to other libraries or projects that involved journey planning, transit data, and other kinds of projects, put them into a room and said, okay, we'll give you money, just build us the software. And then, about a year later, it was put into production, and in 2013 the people involved started a company called Conveyal, which took over the maintenance. But over the years, that company has gone more in the direction of transit analysis, so making software for different cities and different transit authorities, similar to what we have already seen today. But in 2013, a really important step was taken. There was a big project with the Dutch government for improving multimodal transit data and real-time data. And at the same time, there was a second project, included in the bigger Dutch project, called R4, which implemented the Raptor algorithm in open source. It was later rewritten as R5, which is a piece of software by Conveyal. In 2016, there were new deployments in Washington, DC, New York State, Oslo, and Helsinki, so a bit all over the place, and there were some trials with nationwide scaling.
But then the project kind of stagnated until 2018 or 2019, when Entur and Ruter, two Norwegian companies, took over the responsibility and ported the Raptor algorithm from the R5 project into OpenTripPlanner. And then, in 2020, we released OpenTripPlanner 2. And here, if I overlay the commits for the R5 project, you can see that it was the project that took the focus in the meantime. So, what were the main pain points with OpenTripPlanner 1? The core issue was the journey planning algorithm that was selected. It was a time-dependent A-star search with trip banning that was run over the whole network. So the street network, the transit network, the multimodal stuff, everything was in the same algorithm. And how it worked was that you run a search, you get a result, you ban all the trips that were used in it, and then you rerun the search. So it didn't scale that well when you wanted to have more results or a larger graph. Also, it was more focused on research capabilities, but these were later taken over by R5, and we removed quite a lot of this from OpenTripPlanner 2, focusing on the journey planning part. There was also a lack of vision of where it should go, how it should be structured, everything. And there was development happening in many different repositories, and it was not really well coordinated. For example, HSL, who own the Digitransit project, have their own fork of the project that has quite a lot of added development. But that has now been mostly ported back to OpenTripPlanner 2. So, let's take a quick look at how it works. I think this is the main difference between OTP 1 and OTP 2. In OTP 1, as I said, all of the routing was done inside one single search. In OTP 2, we first get a request from the API. Then we run a street search.
It's basically an A-star search, but just with a single mode, or with different modes if you want to have rental bicycles, rental scooters, parking your car, or anything like that. Then we enrich this with flexible transit results for the first and last mile. Then we do a transit search using the Raptor algorithm. And finally, we have two post-processing steps that I will talk a bit about later on. And then we send the result out through the API. The street search is actually pretty much the same as it was in OpenTripPlanner 1. There are some improvements. There are, for example, free-floating vehicles and new types of vehicles, such as scooters, and real-time parking information, so that if you do a park-and-ride search, you will only be suggested to park at a place where we know that there probably will be space. There is also an improved focus on the quality of the routing, so there is a walk safety score, so that we prefer, for example, walking through nice gardens or along other paths that are not next to a big motorway. Yeah, so I mentioned flexible routes. Most of you would think about this when you think about transit timetables: you have a fixed set of stops and a fixed set of times. But there are many other ways that transit can be structured. You can have, for example, hail-and-ride sections where you don't have any stops, but you can just flag the bus down anywhere you want. Or you can have areas, just one, two, or as many as you like, where the bus will deviate from its pre-assigned route and drop you at home or pick you up from your door. Or it can be a group of stops, some kind of feeder service that takes you to the nearest railway station if there is no regular transit in the village. Or it can be as complex a service as we like, with any kind of combination.
So, after we have added the flex results, we do a Raptor search where we insert all the street routing results and all the flexible results, and then we operate only on trips with fixed schedules and fixed stops. There are actually multiple levels of Raptor. Raptor is an algorithm that was introduced in 2012 by researchers at Microsoft Research. The basic Raptor has two criteria, arrival time and number of transfers, and it departs at a fixed time. Basically, what you do is that you find all of the stops that you can walk to, or reach with street routing, then you board all the vehicles from those stops, then you alight at every possible stop and find where you have arrived. Then you find all of the transfers from those stops to all other stops, and then you continue this in new rounds. So this finds your results for the two criteria: the one with the earliest arrival time and the one with the fewest transfers. Then you can run Range Raptor, which is basically just running the Raptor algorithm, but you start at the last minute and then you run it backwards, adding new results to the beginning as you go. And then you get this three-criteria result that has the departure time, arrival time, and number of transfers. That creates a set where the departure time, arrival time, and number of transfers are all optimal. Then you can add one or more criteria. Say you want to optimize for CO2 usage, or wheelchair accessibility, or comfort, so that you only want to use transit that is not too crowded: any kind of numerical value can be added as a criterion, but that unfortunately has the downside that it quite heavily affects the performance. Yeah, so this is basically how we do it. This is the result of one round of Raptor search. This is the result of two, and of three.
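The basic round-based Raptor described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not OpenTripPlanner's implementation: the function name `raptor`, the data layout (each trip as a list of `(stop, minute)` pairs, transfers as a dictionary of walking times) and the example network are all made up for the sketch, and arrival and departure at a stop are treated as the same instant.

```python
from math import inf


def raptor(trips, transfers, source, departure, max_rounds=5):
    """Toy Raptor: return, for each round, the earliest arrival per stop.

    Round k allows at most k trips, i.e. at most k - 1 transfers.
    trips:     list of trips, each a list of (stop, minute) in travel order.
    transfers: dict mapping a stop to a list of (other_stop, walk_minutes).
    """
    best = {source: departure}
    rounds = [dict(best)]
    marked = {source}  # stops whose arrival time improved in the last round
    for _ in range(max_rounds):
        improved = {}
        # Board every trip reachable from a marked stop, then alight
        # at every later stop of that trip.
        for trip in trips:
            boarded = False
            for stop, time in trip:
                known = min(best.get(stop, inf), improved.get(stop, inf))
                if boarded and time < known:
                    improved[stop] = time
                if not boarded and stop in marked and best[stop] <= time:
                    boarded = True
        # Relax single-hop transfers from the stops reached by a trip.
        for stop, time in list(improved.items()):
            for other, walk in transfers.get(stop, []):
                arrival = time + walk
                known = min(best.get(other, inf), improved.get(other, inf))
                if arrival < known:
                    improved[other] = arrival
        if not improved:
            break
        best.update(improved)
        marked = set(improved)
        rounds.append(dict(best))
    return rounds


# Hypothetical network: a slow direct trip A -> D and a faster
# connection that requires changing at B.
TRIPS = [
    [("A", 10), ("B", 20), ("D", 60)],
    [("B", 25), ("D", 40)],
]
TRANSFERS = {"D": [("E", 5)]}
rounds = raptor(TRIPS, TRANSFERS, source="A", departure=5)
```

Comparing `rounds[1]` and `rounds[2]` shows the two-criteria trade-off from the talk: with one trip, D is reached at minute 60, and with one transfer it is reached at minute 40.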
You only add a few other leaves, and then in the last step you basically just prune the results, so that there are some cases where it's actually optimal not to use a trip with just one transfer, but you need to transfer many, many times. Then we do a process called transfer optimization. This matters especially on the railways, where you can transfer between two trips at multiple stations. The end result will be the same, independently of where you transfer. You will get home at the same time, but it might be nice to transfer somewhere where you have a roof over you, or, if you have one hour of time, you may want to spend it somewhere where you have access to a cafe, or you may want to make sure that you have a high probability of making your transfer, so you take the one that has the longest waiting time. And a specific case is back travel, which you want to avoid: if you go from the blue dot to the red dot, you don't want to ride all the way to the last stop and then all the way back, because that is most probably quite expensive. Then we do itinerary filtering and decorating. The aim with Raptor is to get as many results as possible. And then here we can make more intricate comparisons between different results, and we can prune results that might be optimal but are really bad compared to other ones. So, for example, you might have a result where you have no transfers, but you are expected to walk 45 minutes. And that's probably not what you want to do if you have a possibility to do it in 50 minutes less. Then a big thing in OpenTripPlanner 2 was the inclusion of a separate data model. Previously, in OTP 1, the data model was based on the GTFS vocabulary, and all the data was imported as GTFS objects. But in OTP 2 we have an internal data model that is built so that we can import transit data from both NeTEx and GTFS. So it's independent of the data source, and we can easily add new data sources.
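The itinerary filtering step described above, where results that are technically optimal but clearly worse than the alternatives get pruned, can be illustrated with a small Python sketch. The cost formula, the weights `walk_reluctance` and `transfer_penalty`, the `max_ratio` threshold and the example itineraries are all hypothetical choices for this illustration; the talk does not specify which rules OpenTripPlanner applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Itinerary:
    name: str
    duration_min: int  # total travel time, walking included
    walk_min: int      # minutes spent walking
    transfers: int


def cost(it, walk_reluctance=2.0, transfer_penalty=5.0):
    # Walking is already part of the duration; the reluctance factor
    # adds the extra weight on top (hypothetical weights).
    extra_walk = (walk_reluctance - 1.0) * it.walk_min
    return it.duration_min + extra_walk + transfer_penalty * it.transfers


def filter_itineraries(itineraries, max_ratio=1.5):
    # Keep only results whose cost is within max_ratio of the best one.
    best = min(cost(it) for it in itineraries)
    return [it for it in itineraries if cost(it) <= max_ratio * best]


candidates = [
    Itinerary("direct, long walk", duration_min=60, walk_min=45, transfers=0),
    Itinerary("one transfer", duration_min=50, walk_min=5, transfers=1),
    Itinerary("two transfers", duration_min=55, walk_min=8, transfers=2),
]
kept = [it.name for it in filter_itineraries(candidates)]
```

With these made-up numbers, the direct itinerary with a 45-minute walk is dropped even though it has the fewest transfers, while the two alternatives with short walks are kept.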
So there was, for example, a Swiss data set that somebody was looking at importing into OpenTripPlanner. Then something that became really popular is the sandbox extensions. We introduced a mechanism for plugging an optional plugin into OpenTripPlanner 2, and this has been really successful, with currently 22 different extensions existing. So you can provide new APIs, new data formats, or new functionality that is maybe not ready to be taken into the core of OpenTripPlanner but is something that is in the process of development. Or you might want to have some functionality that is custom for your deployment, but you want to make sure that it stays up to date and that it's maintained, and that if we make some changes in the data model or in any core code that we have, we will keep those extensions updated. So one extension is that we now have two GraphQL APIs, one that is based on the GTFS vocabulary and one that is based on the Transmodel vocabulary. And you can mix the imported data. For example, in Skåne in the south of Sweden, they import Swedish timetables in NeTEx and Danish timetables, which they use for trips to Copenhagen, in GTFS. And then you can use this unified API, where you provide the data in a standard format, so that if you want to have the results in GTFS vocabulary you can get that, but you can also get them in Transmodel vocabulary if you want. Also, GraphQL is really useful for journey planning purposes, because usually you want to fetch a little information about very many objects. If you do this in the traditional, pure REST way, you end up having hundreds of queries that you need to do, but with GraphQL you can fetch exactly the things that you need and only those, and that also saves quite a lot of bandwidth for mobile apps, where you want to limit the number of downloaded bytes.
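To make the GraphQL point concrete, here is a minimal Python sketch that builds a single request asking only for the duration of each itinerary and the mode and times of each leg. The field names (`plan`, `itineraries`, `legs`, `numItineraries` and so on) are modelled on the GTFS-vocabulary API but should be treated as assumptions to check against the schema of the deployment you use; the helper name `build_plan_request` and the coordinates are made up, and the sketch only builds the request body without sending it anywhere.

```python
import json

# Field names below are assumptions; verify them against the actual schema.
PLAN_QUERY = """
query Plan($fromLat: Float!, $fromLon: Float!,
           $toLat: Float!, $toLon: Float!, $n: Int!) {
  plan(from: {lat: $fromLat, lon: $fromLon},
       to: {lat: $toLat, lon: $toLon},
       numItineraries: $n) {
    itineraries {
      duration
      legs { mode startTime endTime }
    }
  }
}
"""


def build_plan_request(from_lat, from_lon, to_lat, to_lon, n=3):
    """Return the JSON body for one GraphQL POST request."""
    variables = {
        "fromLat": from_lat,
        "fromLon": from_lon,
        "toLat": to_lat,
        "toLon": to_lon,
        "n": n,
    }
    return json.dumps({"query": PLAN_QUERY, "variables": variables})


# Example coordinates, roughly central Helsinki to a point further north.
body = build_plan_request(60.1699, 24.9384, 60.2055, 24.6559)
payload = json.loads(body)
```

The design point is that one POST with this body replaces the many separate requests a pure REST client would need for itineraries, legs and their times, and the response contains only the requested fields.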
Then we have vector tiles, so you can query all the spatial information that we have: all the stops, stations, rental stations, even individual rental vehicles, car and bike parking, and so on. And you can add real-time information for those, so that you can easily show a map where you have the live availability of all the rental systems. And then there is one feature that we actually brought back from OTP 1, which had been removed, as a sandbox extension, and that is the travel time analysis. This is a feature that sits somewhere between pure journey planning and pure research applications, because it is really useful for some applications where, for example, you are looking to buy a house, and on the web page of the seller they can show a map that says: okay, this is the area that you can actually get to with public transit in 15 minutes, or by car in 15 minutes, or by rental bike in 15 minutes. All of this is multimodal, and you can export either GeoJSON, so the borders of the areas, or you can get a GeoTIFF with actual values of how many seconds it takes to get to each pixel on the map. One other big improvement that we made with OpenTripPlanner 2 is that we simplified the operations. OpenTripPlanner 1 expected that you always have a local file system and that you always have all the files on the local file system. But we abstracted this into a data source that can be input, or output, or both. Currently we support the local file system, so you can still have everything on the local file system if you wish. You can fetch files over HTTPS, and you can load and save data to cloud storage services, or at least all the major ones, and it's really easy to create your own if you want. Yeah, there is also improved monitoring support, so you can get all the timing data of the internal algorithms really easily.
We improved the documentation quite a lot, and you can find the link over there. And actually, with the new configuration structure, this is all that you need in order to fetch all the data and build the graph for the whole of Belgium. You have four different operators, and you just say that these are the paths to their feeds. If you had, for example, the German NeTEx feed, you could have it there as well, along with multiple OpenStreetMap files, each with a different tag mapping. Different countries have different rules and regulations about whether you are allowed, for example, to walk on a bike path, or whether you are allowed to cycle on the street if there is a bike path next to it, and so on, and that's why we have this customizable tag mapping, so that each country or even city can have its own rules about how these things should be mapped. Yeah, then a bit about the future and what we are planning to do next. We of course continue with feature development all the time. During the winter we have built via search, which was not part of the initial OTP 2 release, so that you can say: I want to go from here to my hotel, but actually I want to visit this bar in between. So you get all the connections from here to the bar, then you say that you want to spend about two hours in the bar, and then you get the results from the bar to your hotel about two hours later. The second one is GBFS geofencing zones, so that we can take into account where you can actually use your scooter. If there are some places where you can't park it, you will be instructed to park it just before the zone starts, or if there are zones with speed limits, it might be beneficial to drive around those.
Then performance is something that we think is very important. We have been focusing more and more on it, and we actually started measuring it, so we run a set of queries after each commit, store the results in a Postgres database, and then we have a nice dashboard where you can go and see how we have been doing over time. And we have multiple data sets, so that if somebody is using this in production we can add their data set and they can see how it is performing, because different data sets have quite different performance characteristics: you might have a very large network, or you might have a network with very strange timing conditions, or other things that might affect it. We are also working on a new internal data model for the timetable data that is better suited for Raptor, that uses a virtual trip for heuristic calculations so that you don't actually need to do any timetable lookups, and that is more memory efficient. For example, we noticed that most trips in the timetables actually only operate on full minutes and take less than four hours and a quarter, so you can store them in just one datetime and an array of bytes. Then one thing that touches on the earlier question about international deployment: currently in Norway we run a Nordic graph that contains data for all the Nordic countries, but we are working on a segmented or tiered model where we separate the transit networks into local and long distance, then split those into smaller parts, and only take in the routing data that you need for your search. Then there is competition neutrality. In Norway we have commercial bus operators, and we have their data, and unfortunately they do some awful tricks. For example, they might schedule a trip that starts one minute later than the competitor's and arrives one minute earlier, and that way you actually drop the other operator completely from the results.
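The compact timetable storage mentioned above, where a trip is kept as one datetime plus an array of bytes, works because four hours and a quarter is 255 minutes, the largest value a single unsigned byte can hold. A minimal Python sketch of that idea follows; the function names `encode_trip` and `decode_trip` and the example times are made up for illustration, and this is not the actual OpenTripPlanner data model.

```python
from datetime import datetime, timedelta


def encode_trip(times):
    """Store a trip as its first stop time plus one byte per stop.

    Each byte is the offset in whole minutes from the first stop time,
    so every offset must be a full minute in the range 0..255.
    """
    start = times[0]
    offsets = bytearray()
    for t in times:
        minutes, seconds = divmod(int((t - start).total_seconds()), 60)
        if seconds or not 0 <= minutes <= 255:
            raise ValueError(f"{t} cannot be stored as a one-byte minute offset")
        offsets.append(minutes)
    return start, bytes(offsets)


def decode_trip(start, offsets):
    """Rebuild the full list of stop times from the compact form."""
    return [start + timedelta(minutes=m) for m in offsets]


# Hypothetical trip: four stops, the last one exactly 255 minutes in.
stop_times = [
    datetime(2023, 2, 4, 8, 0),
    datetime(2023, 2, 4, 8, 7),
    datetime(2023, 2, 4, 8, 30),
    datetime(2023, 2, 4, 12, 15),
]
start, offsets = encode_trip(stop_times)
restored = decode_trip(start, offsets)
```

Trips that run longer than 255 minutes or use sub-minute times are rejected by this sketch; a real implementation would need a fallback representation for those.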
So what we are planning is to add a Raptor criterion where we have a bit set of the commercial operators used, and that makes it so that we will always suggest all the available options, even though they are not optimal, as long as they use a different carrier. And we are also planning a unified GraphQL API where we would have one data model and one structure but two dialects, one for GTFS and one for Transmodel, in order to lower the overhead. So that was it, thank you very much. I think we might have time for one or two questions, if there is anybody in the audience. Yeah. Thank you for the presentation, that's very impressive, regarding pedestrian routing. Yeah. So we take the data from OpenStreetMap, and some areas have it better mapped, some have it worse. And if you have it in your GTFS, we also use the Pathways extension, so you can model the inside of the railway station, and then we link not the innards of the railway station but the entrances, and the inner model comes from the Pathways. There is also something similar available in NeTEx, but that we don't import at the moment. Yeah. Do I understand correctly that you have demand-responsive transit? Yes. Yes. So that was the part about the flex search, which is the second part, before the timetable search. At the moment we only use it for first and last mile. But do you interface with external APIs to do their calculations? No, we don't check any availability, so we just say that it might be possible to take this, but you need to plan. That is something that you can implement in a filter, so that the filter actually does the query of, okay, is this possible, and then you can enhance the result with that data. Yeah. Very short question, because there is no time. Yeah. So, from an international point of view, a different aspect of optimization. Yeah. With fast transport, with a high-speed train between Brussels and Berlin, you pay a lot. Yeah.
So when you go with the train and on foot you have the cheaper solution. Yeah. The bicycle is the optimal solution for me. Yeah. But I can just answer quickly that price is something that you can use as a Pareto criterion, but the big problem is that there are so many dynamic fares that exist today that it's impossible to use it because there would be the. |
Developing open transport toolbox and community for ten years
From open data, via Navitia, to Open transport meetups, looking in the rear view mirror |
We welcome Bertrand. He will explain to us how he has been in the open transport community for 10 years and what his journey has been. Thanks a lot. So do you hear me? It's okay. So I'm not a technical guy, so I won't talk a lot about technical stuff. I was working first at a digital subsidiary called Canal TP, then Kisio Digital and then Hove, and it was a digital subsidiary of Keolis, and Keolis is a subsidiary of the SNCF Group. So I'm part of this great organization, and we made an open source software called Navitia. Maybe some people know it. Who knows Navitia? Okay, fewer than OpenTripPlanner. Hi Volker. In the next presentation, Volker will talk about Navitia, because he's using Navitia in KDE Itinerary. Well, now I'm working more on open data, but what is important for me is to connect with the vision and the ambition of the CEO of SNCF, Jean-Pierre Farandou, because he wants to double the rail share in the mobility mix over the next 10 or 15 years, to decarbonize our trips and ensure a more sustainable and inclusive mobility. And for me, open source, open data, and commons are very important to do this job more quickly and more easily, because if data and digital resources are more open, they are more useful, usable, and used. So 10 years ago, I didn't know how it worked. It began with a little hack when I was at Hove, at Canal TP, with Navitia. It had been, well, proprietary software for 10 years, and some developers decided to make an API, because they attended a lot of hackathons, challenges, and events, with OpenStreetMap also, and they wanted to share what they did with other developers. And so they made this API, but it was a hack, because the directors of the firm didn't know about it. And we saw that a lot of people used it, because it was useful and it saved them the time of inventing other things.
And it began like this, with some guys, with some developers, and then we decided to open the software, because there was the open API, there were the open data sets, and so we decided to open the software too, because we are part of a public transport group and we are also paid with public money. So we thought that it was normal to have public code, and for the general interest, for digital sovereignty and empowerment, it was very important to do that. And so we opened the source code of Navitia, along with the API. And we did a little work on what the benefits of the open source project were. We contributed to a book called Open Models about all these ideas that were good for us and for the community. And then we decided to build some, well, there was the software with the documentation, but there was also the API and what we call the open data bar, because we needed a lot of open data sets from different countries, from France, from Germany, mostly in Europe, and we did data cleansing on these data sets so that it was easier for developers to reuse the data. And with this API, we built a lot of different tools. For example, this is the repository on GitHub. And we also wrote, for example, documentation with examples and tips. We also set up a forum and a chat for developers, because it was very important to connect with the community. We also made the Navitia Playground, which is an API console, also to save time for developers, and then we also made a mobile SDK for mobile applications to simplify the work for developers. But one part of my job that was very important was to organize different events.
The first one was what we called the open routing workshop, with 10 or 15 guys, a guy from OTP for example, and we worked on the topics all together. Then I initiated the Navitia meetup to talk with re-users of Navitia, and I decided to change the Navitia meetup and call it the open transport meetup, which was more general, with more different people and more different topics, not only Navitia, because when you talk about an ecosystem, it's better to have a generic word. This word was used by Pieter Colpaert in Belgium some years ago; there was a mailing list, the open transport mailing list, and I decided to use it for the meetup as well. And I decided to organize different meetups, in Paris first and then in other French cities, and even in Belgium and in England, so in Brussels and London. And that's very interesting, because now in Germany they organize their own open transport meetup as well. So for me it's important to share all these ideas in most countries with different people, so if you want to have the same initiative in Finland or other countries, don't hesitate to talk to me. So thanks a lot. If you have any questions, it was very short. So in your experience, what has been a successful way of making private companies cooperate when providing open data? We know state-operated companies usually end up publishing information in the open, but then you have companies like Lyft or Bird or any of the free-floating services. What is the strategy that you follow? For example, we worked with one of the GAFAM and they wanted to use data sets in Paris, and we decided to use the open data sets that were under the ODbL license. So we convinced them that the work we would do for them would also be released as open data, so that everybody could reuse the same data quality as them. So for me that's very important, because it means you can work with private companies and you can have bargaining power even if you are a small firm with maybe fewer than 100 guys.
But if you explain the things and if the benefits are important for them, they can change the mind about open data or not and so that's very important. It's difficult because sometimes these companies are most based in the US because in France most of the times there are only sales but not the guys who decide these kind of things. It's good for me this talk because I have a friend who has a very bad experience with SNCF and new bicycle so to go to the west coast of the sea in France. So because they had no the right information partly in the information of the public, like the French. So it has a bad connection information between national train information and they have only the information for the original train and how to go to another region with another plane, I speak plane, I teach in B and so on. So I think it's important also that the public information is also good for the people working there. They work on national coverage because it would be easier from this kind of problem you could have in France because you have different regions with different information systems and so sometimes it's complicated if you go from one region to another one. The information is not so easy sometimes. It's very short. What could you do in the situation where a government, one entity doesn't want to publish a book later because this is such a very common occurrence in Romania. In Romania? They are not in Europe? Yes. I think it's complicated because to comply with regulations they publish it once a year. It's very out there but it's still complied. There is a similar thing in Brussels. To comply and publish the type-tapers, the real time that is not published against the company has appeared to lose contact with its customers. No, they have a direct contact because they have the app and they say if we publish everything people won't use our app because it may be not the best. Our customers adore and they don't want that. Can you convince them? 
What I saw is that when technical guys are talking with technical guys it's better than when top managers are talking together. The best is when technical guys work together. Even if the top managers maybe sometimes don't know it. It's a struggle. Thank you. Thank you. |
Public Transport Data in KDE Itinerary
Querying realtime journey data and dissecting ticket barcodes |
Okay, please have a seat, we have to begin. All right, hi everybody again. We welcome Volker Krause, who will explain to us what he's doing at KDE. He comes from Germany and we're very pleased to welcome him. Go on. Thank you. Okay, so yeah, I'll talk a bit about how we use public transport information in KDE Itinerary. So what is this? KDE is a big open source community, so not a transport operator for once here. We do all kinds of stuff; you can find us in the K building on the second level to look at a few things we do, and one of the things we do is a transport assistance app called Itinerary. In that you can import any kind of travel-related things like flights, train trips, bus trips, hotel reservations, event tickets, etc., and that is then grouped together and put into a timeline so you have all the relevant information at hand when you need it, and we augment that with whatever might be helpful along the way, like the weather forecast as the obvious example. Since we don't really have a lot of time, I'll have to dive right into what we do with public transport data; you'll see some of the features along the way. So the first problem is we need to actually understand where you want to go, ideally without you having to enter that manually but by reusing documents or material you already have. In the best case scenario, that material has machine-readable annotations about your trip. That's something that Gmail has been promoting, but outside of airlines, in Europe at least, I think we have only seen that for FlixBus and FlixTrain, so none of the major railway operators, for example, have that. But there is a second best thing, and that is the ticket barcodes. Most, not all of them but luckily most, contain some information about the trip, and especially in international use they are somewhat standardized, so we actually have a chance to understand what's in them. 
Probably the most well-known one is the one from airline boarding passes. That is a single standard that works globally, so that is the absolute best case scenario, only one thing we have to implement. For railways we don't have that luxury, but the European Railway Agency has at least defined a few standards that are in use in Europe for international travel and in some countries also domestically. The complexity of those standards varies greatly. The airline boarding pass, for example, is a simple ASCII string that is almost human-readable; that's as easy as it gets. The latest iteration from the European Railway Agency for international tickets, the flexible content barcode, is 2,000 lines of ASN.1 specification defining 300 or so mostly optional fields, using the unaligned packed encoding rules, so awesome to debug but extremely powerful. That's the other extreme of complexity. Just because it is standardized doesn't automatically mean it is also all openly available. Again, the European Railway Agency is the good example here: they have that on the website. If something is missing you ask them, they put it on the website, perfect. Some of the other organizations ask you for unreasonable amounts of money to get a PDF, or require you to be a member, and for that you need to be an airline or railway agency, which we are not. Some of those systems have cryptographic signatures, which we usually don't care about because we only care where you travel, not whether the ticket is actually valid, but in one case, the VDV eTicket used in some areas in Germany and in Luxembourg, the signature and the content are somewhat intermixed, so we actually need to decode that, and just because something is called a public key doesn't mean it's actually public on the website. In this case we got lucky. 
An extensive internet search found a hundred-page PDF in a location where it probably shouldn't have been, containing a screenshot in which we found a URL pointing to an LDAP server, on which we found the keys, so it can be quite messy to work with this stuff. Most of the standards have operator-specific extensions; those of course are not documented. For the final point, is there anyone from Trenitalia here? Too bad, I have questions for them. Then of course there's also a set of proprietary codes where our only option is reverse engineering. For that we rely on donations of sample tickets, because everything we do is very much focused on privacy, so it all runs on your own device and we never get your actual tickets; we need them donated to work with them. For the ones listed here we have more or less understanding: for some we get enough out of them to work already, for some we can barely prove that there is actually travel-relevant data in there but we have no way of decoding it. For me the most frustrating one is SBB, because that is a fairly comprehensive format. We understand most of it apart from the date/time fields, and without those it is pretty much useless, so if there's anyone here from SBB who has hints or information on how those tickets work, I would be very interested. Then once we actually know where you're going and we have that in the timeline, we augment that with real-time public transport information. The most obvious example is delays and disruptions, cancellations, platform changes, that kind of stuff, so we notify you about that. Another thing we do is filling gaps in the itinerary. To get here I booked a train from Berlin to Brussels, but I actually need to go from my home to the station, then take the train, and then in Brussels somehow get from the station to my hotel using the respective local public transport, and that is something we can fill in automatically. 
And then the third thing is when you miss a connection: we offer to find alternatives for getting to the same destination. In order to implement that kind of stuff we need to get to that data, and there's unfortunately not a single global service that gives that to us, so we need to query many different sources, depending on where we currently are and which backend can actually provide us this information. So we have a bit of an abstraction layer over all those sources, which offers three basic operations: searching for locations by name or coordinate, searching for arrivals and departures at a specific stop, and searching for journeys from A to B. And on top of that we then build the higher-level features. In terms of supported backends there are basically three different categories. The fully open source ones are the easiest to work with, like Navitia and OpenTripPlanner. MOTIS is still missing on that list simply because there is currently no production deployment we have access to; as soon as there is one we'll add that as well. The second category is things where the protocol is at least documented, like the Open Journey Planner used in Switzerland, and the third one, the most annoying ones to work with, is the proprietary legacy backends. But just having the protocols of course is not enough; we also need to know where exactly the respective services are. For that there is the Transport APIs repository. That's a collaboration with others having that same problem, like Janice, and it is basically a collection of machine-readable information about those services: where exactly do I need to connect, which protocol do they use, which specific parameters do I need to use, but also information like the coverage area, because for most of those services that is kind of implied. If I have the Belgian transport app, the scope of that is implicit. 
Navitia is the exception that actually has an API for querying this, but if I want to pick the right backend I of course need that information. A very similar problem: all of what you see here is what a journey query would describe as metro line one, but the signage is very, very different depending on where you are, and the signage is something that is very prominent locally, so I should show the right thing in the app in order to help the user find the right thing. But this isn't really unique, so finding the right logo is somewhat tricky. What we do there is we get the logo and the colors and all of that information from Wikidata. The Wikidata entry is linked to an OpenStreetMap route relation, from that we get the geographic bounding box, and the combination of geographic bounding area, name and mode of transport is mostly unique, and that is then good enough to find the right logos. Okay, then a few more things we integrate. One is available rental vehicles, so rental bikes, electric kick scooters, that kind of stuff. What you can maybe see in the screenshot here is a few available kick scooters, some shown in green, some shown in yellow; the yellow ones are those with a remaining range of less than five kilometers. All of this is coming from GBFS, which is a nicely developing open standard for that kind of information, and it is very actively evolving. Just one or two years ago we wouldn't have had that level of detail available, so that's a very nice example of open standards and open sourcing in that field. Coverage for that is somewhat biased towards Europe and North America though. I know that those systems exist in Asia as well, but I have no idea if they use GBFS too or if there's some other system, so again that is something where I would be interested in information. 
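The green and yellow scooter markers described above can be sketched roughly as follows. This is only an illustration in Python, not how the Itinerary app implements it: the function name, the sample feed and the choice to treat vehicles without a published range as green are assumptions. The field names (`bikes`, `bike_id`, `is_reserved`, `is_disabled`, `current_range_meters`, and the GBFS 3.x renames `vehicles` / `vehicle_id`) follow the GBFS specification.

```python
import json

# Threshold mentioned in the talk: less than five kilometres of remaining range.
LOW_RANGE_METERS = 5_000


def classify_vehicles(feed, threshold=LOW_RANGE_METERS):
    """Map each available vehicle id to 'green' or 'yellow' (hypothetical helper)."""
    data = feed.get("data", {})
    # GBFS 2.x calls the list "bikes"; GBFS 3.x renamed it to "vehicles".
    vehicles = data.get("vehicles") or data.get("bikes") or []
    colors = {}
    for vehicle in vehicles:
        # Reserved or disabled vehicles are not available for rent, so skip them.
        if vehicle.get("is_reserved") or vehicle.get("is_disabled"):
            continue
        vehicle_id = vehicle.get("vehicle_id") or vehicle.get("bike_id")
        remaining = vehicle.get("current_range_meters")
        # The range is optional in the feed; unknown range is shown as green
        # here, which is an assumption of this sketch.
        low = remaining is not None and remaining < threshold
        colors[vehicle_id] = "yellow" if low else "green"
    return colors


# Made-up sample in the shape of a GBFS 2.x free_bike_status.json document.
SAMPLE_FEED = json.loads("""
{
  "last_updated": 1700000000,
  "ttl": 60,
  "version": "2.3",
  "data": {
    "bikes": [
      {"bike_id": "scooter-a", "lat": 50.8466, "lon": 4.3528,
       "is_reserved": false, "is_disabled": false, "current_range_meters": 12000},
      {"bike_id": "scooter-b", "lat": 50.8470, "lon": 4.3531,
       "is_reserved": false, "is_disabled": false, "current_range_meters": 3200},
      {"bike_id": "scooter-c", "lat": 50.8472, "lon": 4.3540,
       "is_reserved": true, "is_disabled": false, "current_range_meters": 9000},
      {"bike_id": "scooter-d", "lat": 50.8480, "lon": 4.3550,
       "is_reserved": false, "is_disabled": false}
    ]
  }
}
""")

if __name__ == "__main__":
    for vehicle_id, color in classify_vehicles(SAMPLE_FEED).items():
        print(vehicle_id, color)
```

In a real client the feed would be fetched from the URL that the operator's GBFS discovery file advertises, and refreshed according to its `ttl`.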
Another thing we integrate on the train station maps is the real-time status of elevators and escalators. I think in this case they're all shown in green, so they are actually functional. This is of course something very relevant if you're traveling, say, with heavy luggage, a stroller or in a wheelchair. The data source for that is accessibility.cloud, which is the backend behind wheelmap.org. That's also free software, and they aggregate this kind of information from many different sources. There is a bit of a coverage bias towards Germany, so similar data from other countries would be more than welcome. Another thing where we have a coverage problem is train coach layouts. I think there are currently two or three countries where we are getting this. Still, these use wildly different data models, so it's not quite clear yet how we best abstract that. It is mostly relevant on the long-distance trains, which can get up to 400 meters long, so you want to know where exactly you need to go on a platform, especially if you're in a hurry. One challenge there is that, especially in the countries where we have that, OpenStreetMap doesn't contain much of the platform section information, and that is the key to matching those two data sets together to have the proper train layout displayed correctly on the actual station map. If you think further, towards say indoor navigation in a train station, that is kind of relevant. Pushing this topic even further would be to also show the insides of the train. At least Deutsche Bahn has very detailed PDFs of the interior for human consumption, but there is currently, to my knowledge, no machine-readable format, say like OSM for trains, and that is again relevant for accessibility, for example: I need to know which parts I can go to and which parts I can't go to. And then the last part, which is very, very recent, a lot of work on that happened just yesterday, is using the onboard APIs on trains. 
So if you connect to the onboard Wi-Fi there is often some kind of portal page showing you information about the current trip. That's powered by some API that we can use as well. Typically this gives you current position, speed and heading, and information about the journey with delays at each stop. Just showing that is of course the easiest way to integrate it, but the real value comes when we use that for higher-level features again. For example, checking if you're on the right train might seem obvious, but if you're traveling in a country where you don't speak the local language, or in case of a multi-set train that splits up along the way, Zugteilung in Hamm as we say in German, it's quite helpful if the software double-checks that. Same for detecting if we have arrived yet: that is something very easy to realize for the human, but it's actually surprisingly tricky for the software to know. Yeah, so all of these things I've shown you are not tied to the app specifically but are available as reusable libraries. For example, Nextcloud is using the ticket data extraction in their email client, so you can automatically add calendar entries for your tickets when you get them by email, and I think there is much more that can be built on top of all this. I mean, the Itinerary app is basically for the irregular, explicitly booked kind of travel, but doesn't touch the commute use case at all. If you happen to know about any kind of relevant APIs or data sets, or have the documentation for those or for ticket formats, we would be very much interested. Same if you have travel documents, past, present or future, that you are willing to donate to develop the extractor on; we are happy to take those as well. Yeah, thank you. You're talking about getting live train data from the train. Can you do it from the location of the phone? The position information we get on the train is essentially GPS, just with a GPS receiver on the train. 
In theory you could do that from the phone, and the problem is that reception inside a metal train is somewhat limited, so you usually get better results by using the API for that, but it is essentially GPS data you get there, so it's the same. Yeah, that is an annoyingly complicated topic. The modality is awfully undefined. I mean, there's neither a technical nor a product-level definition of what is a subway or a metro or a tram, and it can be all kinds of hybrid things. That is one of the pieces of metadata we carry from Wikidata alongside the logos and so on, so if in doubt we use that, but even there, there is some loss. I mean, there are some cities where you have trams that go on long-distance railway outside of the city, and yeah, we will never be able to capture these extreme special cases that a region-specific or operator-specific app can capture. So that is the price we pay for that abstraction and the one-app-that-works-everywhere approach. One question: you showed the data about the scooters, and you said it's getting better and the coverage is better. What is driving this improvement? Is that regulation, or why is it getting better? That is a good question, I don't know for sure. I know that in some cities it is regulation, so if you want a permit to operate your rental system in that city, you are required to publish your feeds as GBFS, and we then happily consume that. I think another part is that this started very early, with some US cities requiring it in order to give out the permits for those systems, and then that kind of became the standard mode of operation for those services. So if you get in very early, that works. I don't think there is national or EU-wide regulation, so this is usually something that differs from city to city. 
Regarding on-demand traffic, some of the routing engines have that in their results, so we can show it, but we currently have nothing regarding actively booking things, on demand or otherwise, because that is something where there is practically no API available for external users. I don't think the railway operators, or especially the private operators, give that to smaller users like us, if they give it to anyone at all. You mentioned commuting. What about the case when I don't yet have a ticket but I want to make the journey? For instance, in Germany I had a BahnCard 100. Is there a possibility already to enter that somehow? We also have a general route search, so you just specify where you want to go and it offers you, depending on where you start, the options from Deutsche Bahn or SNCF or wherever you are, and then you can add that to the timeline as well. So there is the ability to do manual entry for that scenario, but that would be quite cumbersome to do every day for your commute. There you would want something like: I usually go to the office between 8 and 9 in the morning, so inform me if there's any deviation on my usual route, but don't necessarily make me enter this. You mean the checks for delays? Yeah, that is polling; none of the services we use has a push service that we could use. Okay. |
OpenStreetMap, one geographic database to rule them all?
Mapping the railway network for the public, with the public |
Thank you for coming to my talk. I'm very glad that the room is so packed, so I hope that this will be of interest to you. So my talk is named OpenStreetMap, one geographic database to rule them all: mapping the railway network for the public, with the public. And I will focus on OpenStreetMap and open-data-related topics for OSRD, which is an open source project developed by SNCF, the French railway company, and which is part of the Open Rail Foundation. There is a lot of information about this here on the panel. Just a few reminders about why a railway company should invest in open data. I think you are all convinced that open data is the way to go for all projects, but inside the railway companies it's not always that obvious. So we want long-distance trains across Europe, so that we can construct together the transport network of the future on rails. We want European cooperation, because we have railway infrastructure managers in all European countries that have the same needs, and yet we are still paying different software providers for the same tools and the same data. And of course, we want free competition, to prove that all of the train operators we work with are treated the same. If we share the same source code and the same data, we can ensure that. I will dive into the specific needs of OSRD, which is our project. Of course, you may have different data needs, so I will focus on these. If anyone in the room has experience with other types of data, I will be very happy to discuss it with you. So in OSRD, we have four main features. Pathfinding, or route compatibility check, is to find a train path in the European railway network. Running time calculation is to calculate the time that the train will take to go from point A to point B. Conflict detection is to ensure that the train will not run into another train during its route, and short-term train planning is to add a new train into the timetable at the last minute. 
Maybe you were lucky to hear my colleague Elwa talk about this topic this morning. So to do these four features, we need a lot of data: track geometry and topology at track level and not line level; signals, switches, routes and detectors, which are kind of technical objects; electrification of the tracks, loading gauge, speed limits, slopes, curves, real-time position of trains; and stations can be useful for display. So I've detailed the needs for each of the features, but what you should remember is that we need a lot of data, which is all geographic and of high quality. So the goal of this study, and what I will show you today, is that we want to find and compare European-level open data to choose the best source for our needs at OSRD, but also maybe for your needs if you're working with the same data needs in your projects. I've compared four data sources. The RINF, or Register of Infrastructure, is a data source provided by the European Union Agency for Railways. Inspire is a European directive that ensures geographic data is shared across Europe. Then we can find open data platforms of infrastructure managers, but there is one data platform for each company, so it can be quite confusing to find the right data, and of course they all use different formats. And finally, OpenStreetMap, which is, as you all know, I hope, a collaborative database of free geographic data, and it fits all of our needs. We want open data, and we want a data model which is consistent across Europe so that we don't have to change the parameters of our tool in each country. We want a data model that can evolve if we want to add a new feature. Of course, we need English documentation, easy data access, and a wide data perimeter. Let's try to access some data. So here I am on the Inspire website, and I find a broken link in mixed languages. Another example of Inspire data, which is supposed to have good metadata. 
Here you can see the link to access the data, which is in the middle of the page, so very easy to find. And finally, another example, I could go on and on about this, but this is a page in, I think, Swedish, and it can be neither translated nor copied and pasted into any translator. So you have to click and download the data, hoping for the best. This is not to blame the people that have created these pages, but just to share that finding open data can be very time-consuming and very difficult, especially if you, like me, don't speak all the European languages. Then once you have downloaded the data, you can try to assess data quality. For example, this is the railway network in Italy that I've downloaded from the Inspire dataset. And as you can see, there's supposedly a railway tunnel that links Civitavecchia and Sardinia. So I was very surprised by that. I checked on the official RFI website, which is the infrastructure manager for Italy, and on the official website we cannot find this underwater tunnel. So of course, I was not allowed to travel across all of Europe to check the quality of all the data that I've downloaded, so, yes? In some places it is true, but there it is not. So the first question we want to ask, for all the open data sources that I've found, is whether they are compatible with OpenStreetMap. In many cases they are, but unfortunately, for the Creative Commons license, we must ask the provider if the attribution in OpenStreetMap is good enough. So this can take more time, and it's not as easy as with other types of licenses. So if you publish open data, it's important to check if the license is compatible with OSM. And as you can see, unfortunately, there are still many European countries where I have found no open data source at all. So maybe it's because I don't speak the language, but still, it's problematic. Then I've done a little quantitative comparison of the data I've found. So this is a comparison of total track length, by country and by source. 
As you can see, I have found data on OpenStreetMap for all of the European countries, but not an open data source other than OSM for all countries. And even more, what we can see on the graph is that in every country, the OpenStreetMap data shows more tracks than the other open data. So even if there is open data available, it seems that the OpenStreetMap data is more complete. Then I tried to design an indicator to see if all the useful data was available for OSRD needs. So you can see the same data needs that I've presented before, and I have classified them by necessity. We require tracks and signals to make OSRD run, and then the other data are optional, which means if we have them, this is good, and we will have a better result, but if we don't have them, we can still run our tool and have partial results. So I've designed an indicator, which is good if we have the two required data and two optional data or more, then an OK indicator if we have part of the required data. The required indicator can be one and a half if we have partial data. It's quite complicated, but I have shared the full methodology on the blog, and I will send you the link after, so don't worry. What you have to remember is that this indicator will give you an overview of whether the available data can be used for OSRD needs. So what are the results of this study? First, what we can do with open data. Unfortunately, as you can see, the map is not so green. So there are a few countries where you can do an OK or poor implementation of OSRD using open data, excluding OpenStreetMap. And then we can see the map for the OpenStreetMap data. It's better. It's not that much better, but it's better. So there are many countries that were red in the first map that are now green, and there are many countries that were gray that are now red. So it's not that good, but it's better. 
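One possible reading of the indicator described above can be sketched as follows, in Python. The talk only states that tracks and signals are required, that a required layer can count as half when the data is partial, and that "good" needs both required layers plus at least two optional ones. Everything else here is an assumption for illustration: the layer names, the `full` / `partial` labels, and the exact thresholds for "ok", "poor" and "no data". The full methodology is on the blog mentioned in the talk and may differ.

```python
# Layers the talk names as required for OSRD to run at all.
REQUIRED_LAYERS = ("tracks", "signals")
# Illustrative subset of the optional layers listed earlier in the talk.
OPTIONAL_LAYERS = (
    "electrification", "loading_gauge", "speed_limits",
    "slopes", "curves", "stations",
)
# Assumed scoring: a fully available layer counts 1, a partial one counts 0.5.
SCORES = {"full": 1.0, "partial": 0.5}


def required_score(availability):
    """Sum of scores over the required layers, between 0.0 and 2.0."""
    return sum(SCORES.get(availability.get(layer), 0.0) for layer in REQUIRED_LAYERS)


def optional_count(availability):
    """Number of optional layers with at least partial data."""
    return sum(1 for layer in OPTIONAL_LAYERS if availability.get(layer) in SCORES)


def classify(availability):
    """Return 'good', 'ok', 'poor' or 'no data' for one country and one source."""
    required = required_score(availability)
    optional = optional_count(availability)
    if required == 2.0 and optional >= 2:
        return "good"      # rule stated in the talk
    if required > 0:
        return "ok"        # assumption: any part of the required data
    if optional > 0:
        return "poor"      # assumption: only optional data found
    return "no data"       # assumption: corresponds to gray on the maps


if __name__ == "__main__":
    example = {"tracks": "full", "signals": "partial", "electrification": "full"}
    print(required_score(example), optional_count(example), classify(example))
```

With this reading, a country with full tracks and partial signals has a required score of one and a half and lands in the "ok" class, whatever optional data it has.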
What we can see is that OpenStreetMap is the database we should use and improve because it's currently the best standard across Europe. So as I've said, you can look at the full data and methodology on our blog. So there is the detailed analysis for each country, as well as the sources for each open data set that I found. So if you're interested in one country, specifically, you can check out this. I'd like to thank the people that have done the icons for this presentation, and also a special thanks for the QGIS community that have allowed me to make the maps and most of the analysis. So maybe if there are QGIS developers there, thank you so much for your work. And finally, if you want to contact us, there are emails. You can learn more about the OSRD project on our website. You can chat with us, and if you're a railway company, you might be interested in joining the Open Rail Foundation, so let us know. Thank you for listening. Thank you. |
Closing Railways and Open Transport devroom |
Thanks to all, so that's the end, and hopefully it's not the last one we'll do, something like this. We've had a great success, so thanks to all, and you have contacts on everything to keep in touch in the future, so thank you. |
Tour de Data Types: VARCHAR2 or CHAR(255)? |
So, hello everyone, thanks for your patience. We're going to start with a slight delay, but I think it will be all right in the end. Can I please welcome to the stage Andreas Scherbaum, who's going to talk about Tour de Data Types. Okay, thank you, good morning. I hope you had a good time yesterday in Brussels. How was the restaurant? Well, we got some food. This is not switching. So, I have the pleasure to talk to you today about data types in Postgres. I actually have two talks about this; this is, I would say, a basic talk about the regular data types we have. I also have an advanced data types talk, but let's focus on this one. So my name is Andreas. I have worked with Postgres since about 1997, 1998. I'm one of the founding members of Postgres Europe. We run conferences like FOSDEM PGDay on Friday here, Postgres Europe, which is in Prague this year in December, PostgresConf Germany, Belgium, PostgresConf Malta in April, so if you're interested, there are a couple more. Currently, I work for a company called Adjust in Berlin. We have about 1000 Postgres databases running, everything from a couple of gigabytes up to like 30 terabytes in Postgres. Okay, if you already know everything about data types in Postgres, if you're regularly reading the hackers mailing list, this is maybe not the right talk for you. I don't want to occupy your time, but otherwise have a seat and we start. So quick poll, how many data types do we have in Postgres? Any idea? 40. 40? Is this 14 or 40? 40. 40. Anyone else? I see you're still sleeping. And we talk about each and every one today. No. So first of all, we have to exclude a number of data types, because every time you create a table in Postgres, you also create a corresponding data type, which matches the structure of the table. So we don't want this. If you look at regular data types, you have about 80, depending on which Postgres version you are on. It's still more than you expected. 
There are things like Boolean, you know this one, or text, or timestamps, and other things like trigger and void and cardinal_number, which you never heard about and you never need. And we focus mostly on the ones I highlighted here. At any time, you can also go and check the Postgres documentation on postgresql.org, and it has a very long list of all the data types in Postgres. The basic ones, the advanced ones, all the ones I listed in these slides before, it's all there. OK. How many different data types are you using in your application? Three. Which ones? What about numbers? So everything is varchar. OK. Anyone else? I'm a doctor. And you're sure it's J. J, isn't it? Yeah. I mean, yeah, this fits you. Sorry. You need to turn the volume off a bit. OK. No, that's too much. Better? Good. We are going over these basic types, so numeric, so when you come back, you can rewrite your application and finally start using numbers. Text types. Maybe XML, if any of you is still using it, JSON, Booleans, and a couple more. So text types. We basically have one text type in Postgres. Under the hood, it's all the same. So we have varchar and char and text. Text types in Postgres are case-sensitive by default. So if you want to compare two texts, it's always case-sensitive. If you want to compare case-insensitively, you have to use something like lower or upper to make a string lowercase or uppercase. A string in Postgres can hold up to a gigabyte of text, roughly. It cannot hold binary data. So if you want to store images in text, that's not going to work. If you specify a length, like this n here, Postgres stores up to that number of characters in the text data type, excluding white spaces at the end. A char by default is only one character. That's most likely not what you want. So for char, you always want to specify the length. And then, of course, if you say text, it's just a text. You cannot specify a length here. 
How do char and varchar differ? Mostly in how they handle white spaces. So we have a varchar one and five here and ten here. And I cast five white spaces to both varchar one, five and ten, and char one, five and ten. And as you can see, we get different lengths here. So our five white spaces here, if we cast them to varchar ten, we only get five at the end, because we only have five white spaces in it. What char is doing, char will fill up the entire string to the length we specified. If you say char ten, we get a string back with ten characters in it. That's mostly the difference between char and varchar. How it's handled: if you specify a length in char, it will actually give you that many bytes back, including white spaces. If you use a varchar, it will only give you the string back, excluding white spaces at the end. So it will cut off white spaces. We have a structure in Postgres which is called a page. By default, it's eight kilobytes. That's where we store all the data, so we have one page header. Then we store rows in it, and a row can be anything from just one column up to 1,000, 1,500 columns. As I said, by default, it's eight kilobytes. You can increase it, but almost no one is doing this. How does Postgres store a text? A text can be up to one gigabyte. So we cannot store one gigabyte in eight kilobytes. I mean, you can if you compress, but usually it doesn't work. Any data type in Postgres which has variable length is only a pointer into what we call a TOAST table. In our regular table, we have a four-byte pointer, which is a pointer into the TOAST table. And in the TOAST table, we have as many rows as we need to store this gigabyte of text. So it's four bytes here, and as many rows as we need here. Which brings me to my one question. Why are so many people using char(255)? Just two weeks ago, I saw a customer who does this all over the place: street names, customer names, everything is char(255). Does it make any sense?
Well, there are certain databases where this might make sense, but it doesn't make any sense in Postgres. In Postgres, it doesn't make a difference if you use 10 bytes, 200 bytes, one kilobyte. You always end up, almost always end up, with the text pointer into your TOAST table. There are other databases where it makes sense. If you look at one of the competitors in the market, they can only have like 255 bytes in an index. Or rather 255 characters, which, if you use UTF-8, is up to 700-something bytes. That's what they can use in the index, so there it might make sense to use 255 as a char. But every time you see this construct in Postgres where someone says char(255), go and question: what is the reasoning behind this? Why did someone say 255, not 200, not one kilobyte? Technically, it doesn't make sense. Numeric types. Pay attention, please. So, we have integers, we have floating point numbers, we have numeric, and we have sequences. Integers: we have smallint storing two bytes, we have regular integers, by default four bytes, anything from minus 2 billion to plus 2 billion. And then we have bigint using 8 bytes, this is 9, no idea how much it is. You might get some small problems if you store something like smallint, integer, smallint, integer, because the compiler will actually go and say, okay, we need to have some space because we need to align the next integer on a 32-bit boundary or 64-bit boundary, depending on your operating system. So, what you really want to do is have all of your big data types first in your table, things like text and bigint, and then continue with your regular 4-byte types and then the 2-byte types at the end, so you can compress the table a little bit more. It doesn't really bring you much if you only have a small table, a couple million rows. But if you're talking about billions of rows, it's actually quite a big saving you get if you can compress and save like 2 bytes, 4 bytes per row.
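The alignment padding described above can be illustrated outside the database. What follows is only a sketch in Python, not Postgres's actual on-disk row format: it uses the standard `struct` module in native mode, which applies the platform's C alignment rules, and the two field orders are illustrative assumptions rather than anything from the speaker's slides.

```python
import struct

# Native mode ("@") inserts the same alignment padding a C compiler would.
# h = 2-byte integer (think smallint), q = 8-byte integer (think bigint).
interleaved = struct.calcsize("@hqhq")  # small, big, small, big
big_first = struct.calcsize("@qqhh")    # big types first, small types last

# Interleaving forces padding before each 8-byte field; ordering big-to-small
# needs none between fields.
print("interleaved:", interleaved, "bytes; big types first:", big_first, "bytes")
```

On a typical 64-bit platform the interleaved layout is noticeably larger than the big-first layout, which is the per-row saving the speaker refers to when multiplied over billions of rows.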
Then we have floating point numbers as real and as double precision, with 4 and 8 bytes. Keep in mind, a floating point number, even though Postgres might show you the accurate number you have, is always rounded internally. It's always a base and an exponent. I'll give you an example here. We have 6 digits here, 100,001, and you see if I return this, Postgres still shows me the 100,001, even though internally it's already rounded. If I expand this to 7 digits, you see I get a rounded number here. The 4 bytes can no longer store the precision required to store this one at the end, so you only get 1 million here. The same is true if I have my 100,001.5 as a 4-byte floating point number: Postgres starts rounding it internally. So if you want to store anything like money or data where you really need precision, please do not use floating point numbers. Same example for floating point with double precision, so 8 bytes. If I have 15 digits here, it still looks okay, but once you start expanding the number a little bit, or add decimal digits here, you see it starts rounding the number. So if you really want to store decimal numbers, but you cannot use integers, what else can we use? This one comes later. You can use numeric. There's a data type called numeric, which stores up to 1000 digits of precision in Postgres. So, basically any number you need to store, you can store in numeric. Keep in mind there is no real hardware support. If you add two integers, in the end the CPU will load one integer into one register, and the other integer into another register, and just use one operation internally; everything is fast. That's not how it works with numeric, so it will be a little bit slower. Not much, but if you have to use it all the time, you might see it. There's also a type called money in Postgres. Don't use it. Never. Internally it's a bigint, so we have the same precision here, 8 bytes. However, you only have one currency.
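The rounding of 4-byte and 8-byte floats described above is a property of IEEE 754 rather than of Postgres, so it can be sketched in plain Python. This is an illustration under assumptions: the helper `as_float32` and the sample values are hypothetical and not taken from the slides, and the exact point at which Postgres displays a rounded value also depends on how it prints floats. `Decimal` stands in here for an exact type such as numeric.

```python
import struct
from decimal import Decimal


def as_float32(x: float) -> float:
    """Round-trip a Python float (8 bytes) through a 4-byte IEEE 754 float."""
    return struct.unpack("f", struct.pack("f", x))[0]


small = as_float32(100001.0)    # still fits in the 24-bit significand
large = as_float32(16777217.0)  # 2**24 + 1 no longer fits and gets rounded

double_sum = 0.1 + 0.2                           # 8-byte float arithmetic
exact_sum = Decimal("0.1") + Decimal("0.2")      # exact decimal arithmetic

print(small, large)
print(double_sum == 0.3, exact_sum == Decimal("0.3"))
```

The design point is the same one the speaker makes: binary floats trade exactness for speed, so amounts that must be exact belong in an exact decimal type.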
Whatever you assign as lc_monetary in your environment, this is the currency Postgres is using to show you this number. So you cannot say I want to store two currencies in my database, you cannot store something like an exchange rate. All of this is not working. Money was deprecated, I think, two times, three times, something like this, and there's always one user who comes back, oh, I really need it, so it's still around, but please don't use it. Is anyone from India here? Ever met him? You know the name? I would be surprised if you know the name. Anyone know this game? The story is that he did something for his king and the king asked, okay, what can I give you in return? And he said, okay, give me some wheat grains, one on the first chess field and then double the number on each following one. How many wheat grains are we talking about? Yeah, it's hidden. So we can actually use numeric for this. So it's 2 to the power of 64 minus 1. We have 64 fields, we start with 1, so it's minus 1. If we use floating point for this, double precision here, you see, we don't get an exact number; it's way too big for floating point. If we use numeric, it's just 20 digits. We have 980 left, so we can store much more in numeric than just this number. I did the math at some point, and it's like 1,000 times the wheat we are currently producing on Earth per year. So 1,000 years of wheat production on Earth will suffice to solve this problem. Okay, we also have sequences in Postgres. Internally, these are just integers, which are used as a default value in a table. Smallserial, which gives you 2 bytes, 32k plus minus, and we have the regular serial data type, which is 4 bytes, from 1 to 2 billion. And then we have bigserial for everything which needs larger numbers. If you create a table with a data type serial, what Postgres will do internally, it will create this data type in the table. So if you look into the table, it's actually an integer, with 4 or 8 bytes.
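The chessboard arithmetic mentioned above (one grain on the first field, doubling on each of the 64 fields) can be checked in a few lines. This is a sketch in Python, whose integers are arbitrary precision and so play the role that numeric plays in the talk; the variable names are illustrative.

```python
# 1 + 2 + 4 + ... + 2**63 grains over 64 fields equals 2**64 - 1.
grains = sum(2**field for field in range(64))

# The same value squeezed into an 8-byte double loses the final digits.
as_double = float(grains)

print(grains, "has", len(str(grains)), "digits")
print("error after converting to double:", int(as_double) - grains)
```

The exact total has 20 digits, while the double-precision value is already off by one grain, which is why the speaker reaches for numeric here.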
Postgres will also create a sequence for you, and it will make this sequence the default value for this column. So every time you insert something into this table, and you don't specify this column, Postgres will use the next value from the sequence. If you specify something for this column, it will not use the sequence. So if this is your primary key, and you're mixing values you're inserting and values from a sequence, at some point you will have collisions. Please don't do this. Also, sequences are not transactional: if you roll back a transaction, a sequence will not roll back. Sequences are just meant to provide you a unique number. That's all. So we can ask for the current value of the sequence: SELECT currval, with my sequence name here. This will only work if you already used the sequence in the current session, like you did an insert into a table. This will tell you the last value this current session has inserted into this table. So if you have five different sessions inserting data, it will always show you what you, the current session, inserted. You can also ask for the next value by using nextval. Keep in mind, it will not roll back if you roll back the transaction. It will only ever move forward. You can also set a sequence by just saying setval and then specifying the new value you want to have. If you do this on a table where you use the sequence as the primary key, and you set it to a previous value, you may run into collisions again, because it will reuse the same values again. Okay, a sequence internally in Postgres is just another object, another table. So you can say SELECT * FROM my sequence. It will show you all the data about the sequence. So what's the sequence name? What's the last value we used here? What's the minimum and maximum value? Will the sequence cycle around? So when it comes to an end, will it start again? Will it wrap over or not?
So by default, it will not wrap over, because otherwise you will end up with the same numbers again. Just another object. Good. Any questions about numeric types? What is this? South Pole. Come again? South Pole. South Pole, sounds good. How did you figure it out? It looks like an apartment. That's one way. Yeah, it tells you here, 90 degrees south latitude. This is the South Pole. What time is it there right now? That's one valid answer, all the times. Let's see if we can answer this question. So we have a couple of date and time types in Postgres. We have timestamp without time zone and timestamp with time zone. Depending on your use case, make a good choice which one you use. Any time you just want to store date and time, you want to use the timestamp without time zone. If you work with multiple time zones, you will use timestamp with time zone. Internally Postgres will store the time as UTC time, but it will make sure to handle the transformation for you. We will see a couple of examples. We also have time without time zone and time with time zone. And we have date and interval. Postgres will not know about any kind of leap seconds. So occasionally we have a year which is a second longer, with the leap seconds. I think they're planning one because it's going slower so we use it. I don't know how this will go. But anyway, Postgres doesn't know about leap seconds because the time zone database doesn't know about them. So we have something which looks like a date. I cast it to a timestamp and we see that Postgres makes it this date at midnight. So if you don't specify a time, it's always midnight. We can also say, okay, January 5th, in the format the Americans are using, month first. We also see it's midnight here. We can also specify a time zone. So I have here 3:25 in the afternoon in the UTC time zone. I cast this to a timestamp. Why does it say 5:25? Any idea? Postgres will always return times in my current time zone, which is set on my system.
So many companies use servers which are set to UTC. My laptop, which I used for this example, is set to Berlin time, which in summer is 2 hours ahead of UTC. So we see I specify the time zone as UTC here, and in August I get 5:25 back. So Postgres transforms the time I specify to my local time. Same in winter. This is December here, 10:23 UTC, and I get 11:23 back as my time. This can be very convenient, but also very inconvenient, depending on what you're working on. We can also specify any time zone your computer knows about. So we have a time zone database in computers and we can use any name from there to specify a time zone. We can also say just plus or minus a number for the time zone. Obviously, if you specify a time zone as a name, Postgres knows about summer time. If you just say plus 4, it never knows about summer time, because you just said plus 4. The example I'm using here is because at some point Russia just said, okay, we are no longer doing this dance with winter and summer time. They just stopped at some point and said, okay, all year round it's one time zone. In Postgres, time is stopped if you start a transaction. So we start a transaction here. If you say select now multiple times, you always get the same time back. That's the transaction time. So I'm using this here. Now I'm setting my time zone to Europe/Moscow. What changed? My output changed. This one is my Berlin time and this one is my Moscow time, two time zones difference. What I can also do is return any timestamp I have in Postgres at a specific time zone. So previously everything here was always returned in my own time zone, which is set on my computer. I could say change everything to this specific time zone, or I can just say format one specific timestamp I have at a given time zone. And then of course you can say, okay, select now at Berlin, comma, now at New York, comma, now at Buenos Aires in one single query. So we can return as many timestamps as you have at as many different time zones as you have.
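The conversions described above (a UTC timestamp displayed in Berlin time in summer and in winter, and Moscow no longer switching to summer time) come from the same time zone database that Python's standard `zoneinfo` module reads. The sketch below is an illustration only: the year 2019 and the days of the month are assumptions, since the talk gives only the months and the clock times, and it needs Python 3.9 or later with time zone data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")
moscow = ZoneInfo("Europe/Moscow")

# Assumed dates: the talk only says "August, 3:25 pm UTC" and "December, 10:23 UTC".
summer_utc = datetime(2019, 8, 15, 15, 25, tzinfo=timezone.utc)
winter_utc = datetime(2019, 12, 15, 10, 23, tzinfo=timezone.utc)

# Same instants, shown in the local time of a laptop set to Berlin.
summer_berlin = summer_utc.astimezone(berlin)
winter_berlin = winter_utc.astimezone(berlin)

# A named zone knows about summer time; Moscow simply has none any more.
summer_moscow = summer_utc.astimezone(moscow)
winter_moscow = winter_utc.astimezone(moscow)

print(summer_berlin.strftime("%H:%M"), winter_berlin.strftime("%H:%M"))
print(summer_moscow.utcoffset(), winter_moscow.utcoffset())
```

The stored instant never changes in any of these conversions; only the presentation does, which matches the speaker's description of Postgres keeping UTC internally and converting on output.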
A couple more examples. So Postgres does not know about leap seconds, but it knows about leap years. So we have the 5th of January 2000 minus the 1st of January 2000. This is four days difference; that's an interval now of four days. We have the 1st of January 2000 minus the 4th of January, it's minus three days. We get an interval back, so if we have timestamps, Postgres will do all the calculation for us between the two timestamps, including years, dates, including leap days, everything. And we can use this to figure out if a specific year is a leap year. The 28th of February 2000 plus a one day interval gives me the 29th of February 2000. Of course, if I do this in 2001, I only get 365 days back; in 2000 I get 366 days back, it's a leap year. It takes all of this into account, which brings me back to my initial question: what time is it at the South Pole? This station at the South Pole is operated by the Americans. It's called the Amundsen-Scott station, which is however supplied from New Zealand. That's the closest airport they have for all the planes they're operating, and by now we know how to figure out the time in New Zealand, right? Select now at time zone New Zealand. That would be cheating, because the operating system actually knows about Antarctica. So every station which humans have on Antarctica got its own time zone, which is usually aligned to the country operating the station. Because this one is, well, operated by Americans but supplied from New Zealand, conveniently they're using the same time zone as New Zealand. Select now at time zone Antarctica/South_Pole gives you the time at the South Pole. Anyone here using XML? Fine, let's. What's your use case? Come again? Some data feeds. Well, there is some basic support in Postgres for XML. You can set the encoding. You cannot search directly in XML. So Postgres stores the XML as it is, but you cannot really search in it.
There's no support for this. What you could do is cast it to text and then try to make some sense out of it. But that's about it. We have two different types. We can say it's a document type here. So I specify a text and I tell Postgres, okay, parse this as a document in XML. It will fail if it's not proper XML and tell me, okay, this doesn't work. So I get back an XML document here with all the formatting and the entire tree. I can also say I don't want to have an entire document, I just want to have a piece of XML content. That's working with XMLPARSE and CONTENT. But then again, I cannot search in it. There's no support for it. I can serialize this and deserialize this if you want to store some larger XML documents in Postgres. That's working. So this one is a text now, no longer an XML document. So if you really want to search in something and say, okay, does this specific text appear in my XML document, go and serialize it and then maybe apply some LIKE or regexps on it. But then again, if you search for name, you will find plenty of this. Reality is, if you want to store something like XML, go for JSON. We have two different JSON data types in Postgres. The older one is called JSON. JSON stores the data as it comes in. So it basically takes the entire JSON blob and stores it as it is. And then later on, if you work on the data type, then it does all the parsing. So at insertion time it doesn't really know if your JSON object is valid or not. It only figures out when you try to operate on it. And then at some point a better JSON type came around. It's called JSONB. I think one of the big mistakes this project made was not to deprecate JSON and say, okay, the new one is the JSON type. So they called it JSONB and now we have to live with it. The new one is better in almost every way, so it does parse the JSON when you insert into the database. So when you create this type, you can create an index on it. It already tells you on creation time if it's valid or not.
And then you have a decomposed object as a JSON object in your database which you can work on. All of this is using regular transactions, unlike some other databases using JSON, where you don't really get transaction support. In Postgres it's one more data type we use, supporting transactions, supporting replication, everything. How do we use this? This is a regular text here. I need to quote my text in JSON with double quotes to make it a JSON text. Then I have my single quotes for Postgres, telling it this is a string. I cast this string to JSON and I get my JSON text back. We can use arrays and lists and hashes in JSON. So here we have an array. As you can see, we have several JSON text types in the array. Then we create this array and then we make this a Postgres string. That's a JSON type, and we see, okay, I still have my JSON text types here, my JSON array, and for Postgres all of this is one string parsed as a JSON type. The same works with key-value pairs. So we have key and value here as JSON, make it a hash and then make it a string in Postgres, and cast this string to JSONB. Of course, since we parsed the JSON array, Postgres knows what's in there. We can say, okay, I only want to have this key number two. Yeah, my text is DEF, so I can access whatever is in my JSON value. I can ask if the right element is in the list on the left. GHI is actually in there, so I get a true back. It's a yes/no question in Postgres. Same for the keys. So here's the value. Is this value in this list? Is this key in this list? So any of the questions you usually have from an application using JSON, like is the specific value there, does the specific field exist, all of this you can answer directly in Postgres. You don't have to extract the entire data type, transfer it into your application and then try to make sense of it. All of this can be answered directly in Postgres in one query. On top of it, because Postgres already knows what is in the JSON, we can have index support on this.
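The questions listed above (fetch one element, is this value in the array, does this key exist) are shown here in Python purely to make the semantics concrete. The sample array and object are assumptions modelled on the values the speaker reads out, and the point of the talk is precisely that Postgres can answer these inside the database, with jsonb operators such as `->` and `?`, instead of shipping the document to application code like this.

```python
import json

# Assumed sample data; the slides are not part of the transcript.
arr = json.loads('["abc", "def", "ghi"]')
obj = json.loads('{"key": "value", "name": "example"}')

second = arr[1]                         # element at zero-based index 1
has_ghi = "ghi" in arr                  # is this value in the array?
has_key = "key" in obj                  # does this key exist?
has_value = "value" in obj.values()     # does this value exist?

# Parsing up front means invalid input is rejected immediately.
try:
    json.loads('{"broken": ')
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

print(second, has_ghi, has_key, has_value, parsed_ok)
```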
This only works on JSONB, by the way, not on the old JSON type. So what I'm doing here is I have an index on my table. I have to use the GIN type. And I only create this index on one specific field in my JSON object. So this is not indexing the entire JSON field. It's just one field. My object has a name field. And then I can use this index to answer queries. So if you have this typical web application where you are storing 20, 50, 100 JSON values in one object, you don't want to index all of them. You maybe just want to have an index on one or two of the fields. This is possible in Postgres. Booleans. We have a real Boolean type in Postgres. So we can say true, false. We have a couple of alternatives you can specify. So for true, you can say it's 1 or t or yes. For false, you can say it's false or no or n. Can you please be a little bit silent back there? In the end, what Postgres does, it transforms all of these values into the true or false value for the Boolean. We have a couple of other databases in the market which say, yeah, we have a Boolean. Under the hood, it's maybe just an integer. And then you can insert not only zero and one, but also five, eight, fifteen, and then try to make sense out of this. Here, we only get true and false back. So, true cast as Boolean, we see we get a t. And false, we get an f back. Again, we can use this in queries. So I have one table here. Let's say that's the table where you store log file messages. And yeah, we have some content here, and we have one field which tells me, okay, this log entry is an error. Usually, we don't have a lot of errors, but these are the entries in our table we are mostly interested in. So maybe one, two percent of the entries in this table will have this flag set. What I'm doing here, I create a million rows, and about two percent of them have this flag set, just to have some basic test data here.
And when I go and say, okay, I only want to see all the entries in my table where the error is true, because that's what I'm interested in, you see, by default, Postgres has to scan the entire table. That's quite expensive. What I can do is create an index, and make this a conditional index, so only every row where error is true goes into this index. The 98 percent of the table where there is no error I'm not interested in, because the cardinality is not high enough. I will never see this data in my index. Only the two percent here go into my index, and now suddenly the cost of the query came down from 16,000-something to 68. Very fast now. That's one use case for a Boolean index. As for the cost of this, I have another index now on the entire column, and you can see the entire index needs 2,700 pages, and my Boolean index only needs 57 pages. So much, much faster. We have a bit type in Postgres, so not only Boolean, we can also say we want to store bits. Now you can specify how many bits you want to have. And the interesting part is you can tell Postgres, by using the b in front, okay, this is a binary value, and it does all the transformation for you. From this, which looks like a string with bits in it, to... Sorry. Postgres will do the transformation of the bits for us, so we don't have to do the mental calculation of which bit is which value. We can also do bit operations on these values. So I have some data in my table, and I want to only select where this bit is set. I specify that's a binary value here, a bit value here, and that's a logical AND, or a logical OR. And then of course, because OR gives me any value where the bit is set on the left or the right side, I get everything back here. We can have exclusive OR on bits. We can also do mathematical operations on bits, like shifting left and right for multiplying by 2 or dividing by 2. All of this works. We can also search in bits.
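The bit operations listed above (AND, OR, exclusive OR, shifting, and testing whether a particular bit is set) behave the same way on plain integers, so they can be sketched in Python. The concrete bit patterns are assumptions chosen for illustration, since the values on the slides are not in the transcript.

```python
flags = int("0101", 2)   # parse a string of bits, like a b'0101' literal

and_result = flags & int("0100", 2)   # keeps only bits set on both sides
or_result = flags | int("0010", 2)    # bits set on either side
xor_result = flags ^ int("0110", 2)   # bits set on exactly one side

doubled = flags << 1     # shifting left multiplies by 2
halved = flags >> 1      # shifting right divides by 2, dropping the last bit

# "Searching" for a bit: AND with a mask, then compare the result with zero.
third_bit_set = (flags & int("0100", 2)) > 0
second_bit_set = (flags & int("0010", 2)) > 0

print(format(and_result, "04b"), format(or_result, "04b"), format(xor_result, "04b"))
print(doubled, halved, third_bit_set, second_bit_set)
```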
Searching in bits looks a bit complicated, because we need to make this a logical operation. So I want to search everything where this bit is set, and because it gives me a value back, I need to say, OK, if this is greater than zero, then I get the result. If I search for everything where the second bit from the right is set, there's no value with it, so I don't get anything back from my database. I can cast bits to any value. Like, that's an integer. I can cast it to bits. I get my bit value back, and the other way around. I can cast from bits to integers, to any other value. Good. Any one of you storing binary data in a database? Good for you. We have a bytea type, which can do this. And all I want to say about this: please use the functions of your programming language to transfer the data in and out of the database. Please don't write your own code to try and transform this data. Every language we have which supports Postgres has functions for transferring binary data in and out of the database. We have two different formats. One is the old escape format, which is hard to parse. And the new one is the hex format. So this will always store hex values for binary data in a database. We also have network types in Postgres. So inet can store any IPv4 or IPv6 address. The network type cidr can store the networks. And we can store MAC addresses in Postgres. So any time you work with network addresses, please don't store them as a text. Use the proper data type for it, because you can go and have an index on it, or you can go and ask Postgres, is this IP address in this network? I created a table with a couple of IP addresses in here. And then I can ask Postgres, give me every IP address which is in this network. So you don't have to do all the manual handling of IP addresses and text matching and whatever people do to find their IP addresses. And on top of that, it supports an index. It gets very, very fast. So any kind of address and IP address parsing can be very fast in Postgres.
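The "is this IP address in this network" question described above is the kind of check that is painful with text matching and trivial with a proper address type. As a sketch, Python's standard `ipaddress` module gives the same flavour; the network and the list of addresses are made-up sample values, and in Postgres the equivalent comparison would happen inside the query on inet and cidr columns.

```python
import ipaddress

# Hypothetical sample data standing in for the table on the slide.
network = ipaddress.ip_network("192.168.1.0/24")
addresses = ["192.168.1.10", "192.168.2.20", "10.0.0.1", "192.168.1.200"]

# Typed containment check instead of string prefix matching.
inside = [a for a in addresses if ipaddress.ip_address(a) in network]

# The same type handles IPv4 and IPv6 addresses.
versions = [ipaddress.ip_address(a).version for a in ("192.168.1.10", "::1")]

print(inside, versions)
```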
I'm almost running out of time because we started late. You can create your own data types in Postgres. So we have enums, and a couple more we have in Postgres as extensions. This is part of another talk I have, not here, not today. So this leaves time for any questions you have. Jimmy, do we have any questions? During the part about the timestamp or timestamptz data type, you said that the database handles the conversion from UTC time. Does it store the UTC time and also the offset of the time zone? Or does it always convert to the current time zone of the database, I guess? Whatever you insert into the data type, if you don't specify a time zone, it will assume your local time zone; if you specify a time zone, it uses this time zone. It always converts it to UTC internally. And then when you select it, it returns it in the time zone you specify. Thank you. With JSON, would you say I can use Postgres for my non-relational purposes? For example, something like MongoDB. I think you're in the wrong room. The answer is yes. I mean, almost everything you can use MongoDB for, you can also do in Postgres. Hi, is there ever any hope of getting an unsigned integer type in PostgreSQL? I think there's an extension which can do that, but it's not there by default. If you check the PostgreSQL Extension Network, PGXN, there are a couple more data types you can use for this. I didn't get it quite right. You recommended the jsonb data type over json? Basically, everything you can do in json, you can also do in jsonb, but it gives you more functionality in jsonb. It usually is, yes. Thank you. Thanks, Jimmy. |
How to Give Your Postgres Blog Posts an Outsize Impact |
So, welcome to the PostgreSQL Dev Room. If you just got here, you missed all the early morning excitement. Can I please ask you, because this is a very loud room, take care with the seats, because they're very loud, and please silence your phones. Now, can we please welcome Claire Giordano, who's going to speak about PostgreSQL blog posts and how to give them an outsize impact. I have to get my phone so that I can take a selfie at some point in this talk. All right, so how many of you, is it your first day of FOSDEM or your second? How many of you are here for your first day of FOSDEM? Okay, most of you were here yesterday. Yeah, and I love that it's not freezing cold this year. It's actually comfortable. All right, well, I thank you for coming. This is my third FOSDEM, my second time giving a talk here in the PostgreSQL Dev Room. Boris, I promise to stay in the proper range now. Okay, am I good? All right, so I'm here to talk to you about how to give your PostgreSQL blog posts an outsize impact. And I'm really happy to talk about it, because sharing information about PostgreSQL and our projects, new features, new capabilities is just super important to growing the community and having more people learn about it, use it, become developers on it. So before I dive in, a bit about my background. I started my career as an engineer in the developer tools group at Sun Microsystems. I worked on a predecessor to a predecessor to modern day GitHub and Git. And then I spent many years as an engineering manager in the kernel group at Sun, before moving into product management and product marketing roles. I think of myself as a writer. That's why I care about blogging. And about six years ago, I joined a small startup in San Francisco, well, not so small, but a startup nonetheless, called Citus Data. Have any of you heard of Citus? Okay, so some of you didn't raise your hands, but I know you've heard of Citus. And then after a couple years, we were acquired by Microsoft.
So I work at Microsoft. And this is a screenshot of the GitHub repo for the Citus database extension, which is an open source extension that I spend most of my days working on. And I put the screenshot there with the stars over on the right, because I'm on a mission to, you know, make sure people who know it, use it, like it, are supporters of it, actually do go and give it a star so I can bump that number up and get some validation. So the reason I created this talk is I ran across this quote somewhere where somebody said about Postgres that it's not just open source, it's open engineering. And I'm looking at Joe Conway. Did that quote come from you? Or was it Peter Eisentraut? I'm not sure who it was, but it's a fabulous quote. And so I thought about, well, why not share the techniques and the best practices that we use, not just in engineering, but in engineering the blog posts and the content that we create? Now, before I dive in, I have to give you a warning. There's a lot of dog illustrations in this talk. And that's because I'm here in Brussels and back home in California is my chocolate Labrador named Zuka. So I don't know, I had to inspire myself to think of her as I was working on it. Now, the talk is going to have two parts. The first is what you write, the actual kind of drafting of the blog post. And the second is, well, where you send it, who you send it to. And in marketing speak, that means we're going to talk about content. And then we're going to talk about distribution. Now let's dive into content first. And in this area, I want to start with you and your frame of mind when you're creating these blog posts. And then we'll talk a bit about the edit cycle after that, which is also super important. And I think my first tip for you when you're creating a blog is to write with a specific reader in mind. Now, think of who this person is that you want to read it. What kind of skills do they have? What kind of problems? See, those are those that, sorry.
The seats are really loud in here, aren't they? Yeah, welcome. I'm glad you're here. Who do you want to read it? What are their skills? What are their interests? And write to that individual. This may seem obvious, but for a lot of us, when we try to write to this vague, big, amorphous world out there, the blog post kind of ends up like hotel food. And I mean bad, boring hotel food, where it's trying to appeal to everybody and doesn't dare have any strong flavors or strong spices and ends up being very bland. So write to that specific person as you're creating it. And then the second tip is also obvious. Pick good topics. But I put it up there because in many, well, there've been many conversations I've had with people where they've said, you know, I want to start blogging. I want to start sharing my expertise, but I don't know what to write about. And so I just want to tell you that the world, the internet is full of inspiration of what to write about. And every time you send an internal email answering a question or somebody comes, if you're working in an office or with a team of people in person, someone comes in and asks you a question. Maybe your answer is good fodder for a post. Maybe more than that person has that question and it should be shared more broadly. So whether it's Stack Overflow as well, or the Postgres Slack, the IRC channel, Reddit, there's a lot of, there's different subreddits. There's one for Postgres, but there's other ones that deal with databases and distributed systems as well. There's questions that come up there that might make a good blog topic too. And then just make sure those blogs are interesting. Like that, again, not bland, not boring. Okay. And the third tip when you think about your frame of mind is to have empathy for your reader. That's probably the most important takeaway from this talk is to have empathy. And so what does empathy mean? It does not mean sympathy. I love this cartoon from a human cloud of gaping void. 
And on the left, he describes empathy as, I feel your pain. I physically feel how you feel. I really understand and have empathy for your situation. Whereas sympathy is just, oh, I'm sorry that you're in pain, which is kind of just words. It doesn't have the same impact. So the goal is to have empathy be your true north. And that means you're going to care about the reader more than you care about yourself. It means you're going to put in some extra effort in creating the blog post and recognize that they're busy. They're in a hurry. So you should write and create this post so that it's easy for them to quickly get answers to their questions. And just remember that they don't know what you know. So when you work with a team of people who have similar knowledge, expertise and skills, we use a lot of shortcuts. Like we use acronyms and we don't have to define or explain concepts. We all understand. Anybody who works with Citus understands what sharding is, for example. But the reader doesn't necessarily know what you know. So taking a few moments to define a term or explain a concept or link to a background blog is super helpful to them. All right. So let's dive into specific tactics that you can use to apply empathy and really care about your readers. And the first is to make your blog posts scannable. I grabbed these two big, chunky, dense paragraphs from the internet as an example of what you should not do. Like I can't even bring myself to read them. They're in my talk and I've not read those paragraphs because it's just blah, blah, blah. Like my brain cannot embrace whatever they're trying to tell me. So because people are in a hurry, you got to make it easy for them to find the part of the post that will help them. And one mistake a lot of people make is they, it's okay. It's totally okay. They assume someone's going to read the whole post from the beginning to the end, but they're not. They're going to jump around. Maybe they'll start in the middle. 
And so when I think about how to make things scannable, there's a bunch of techniques that I just wanted to point out that are super useful. The first is to have section headlines that break up the post, H2s, if you will. But they should not be subject titles. And we'll talk about that in a moment. They should actually be headlines, like headlines you'd see in a newspaper. Use bullets. Bullets are fabulous for scanning. If the bullets are multi-line bullets, put a bold font prefix at the beginning of each one, just like two, three words that tells them what that bullet is about. And then most people will just read those bold font prefixes. And they may not read the rest of it if it's not what they're looking for. Short sentences and short paragraphs are your friend. White space is also a big friend. And then tables are super helpful too. I was reviewing a blog post with someone once, and they were sharing just three or four pieces of performance benchmark data, but they were doing it in sentences, like in the copy. And when we took those numbers out and we put them in a super small, like two by three table, all of a sudden the conclusion just popped, whereas before it was hidden in the paragraphs. So I said we'd talk about the section headlines, the H2s, a few seconds ago. So here's a couple of examples. And these are actual examples from actual blogs that I worked on with people. And in this first example, literally the H2 said example, which is nice. It tells you, okay, we're going to look at something that will help show me what we're talking about. But in fact it's much better if it were to say a real world example, a multi-tenant to-do application. It gives more specifics about what's going to be in that section. Similarly I was working on a blog post with, I don't even remember who it was with, and the H2 just said PG Bouncer, which is great. It tells you that section is about PG Bouncer. That's useful. 
But it's even more useful if we tell you in the H2 that this is about how to set up PG Bouncer. And oh, by the way, it's our preferred connection pooler. So just being more specific can help the reader, and it primes their brain as to what they're going to get, or tells them whether to even read that section. Okay. Now, not everybody is a reader. So I already said people aren't going to read the whole thing. They're going to bounce around. But people's brains process information differently. So how many of you think of yourselves as being visual in how you process information? Okay. And how many of you, when you're explaining something to a colleague, the first thing you do is jump up to a whiteboard and start sketching diagrams? Yeah. Okay. So to make your blog post more accessible to more people, it shouldn't just be paragraphs and words and sentences, but obviously a picture speaks a thousand words. So including a diagram in there and taking the time to figure out what should that diagram be can make a world of difference. I gave you the example of tables before. Those are super helpful when you're sharing numeric information. For those of us who work on cloud services, sometimes having that screenshot of whatever is going on in the UI can be really helpful. And then code blocks as well. Particularly when you're talking about some new capability or trying to show someone how to use something, that code block is huge. So the variety of formats can help. I took this screenshot, don't worry about it, from a blog post that David Rowley wrote last year about speeding up sort performance in Postgres 15. He's one of the Postgres contributors that works on my team at Microsoft. And without this chart, the blog post would have been nice. It would have been good, but it was so much more powerful when you show visually like what the results were and how sort performance was sped up. 
And when you do include diagrams or charts, it's important to remember to put alt text in there. Not just because we care about people who are visually impaired and are using screen readers, but also because it influences how your blog post will get ranked on search engines. Alt text is one of the factors that the search engines look for. Using H2s can also help screen readers. So when you have those section headlines we talked about earlier, don't just put in a bold font and make it a bigger font size. Actually change the paragraph format to H2 or section header or whatever it's called in your platform. And that will help those screen readers too. The other thing I'll say about this chart is it looked totally different at first. We iterated on it to try to figure out how to make the story pop. We ended up pivoting it by 90 degrees, changing the title, changing how we labeled things, and doing our best to make it drop-dead easy to understand. Alright, the third technique in the empathy area that I want to point out is readability. And I always recommend to people that you write at the third or fifth grade level. Maybe sixth grade is okay. But a lot of us in engineering are trained to write in very formal, academic, dry tones of voice, even at the 14th or 16th grade level. And that's hard for people to read, especially when they're in a hurry and trying to get something done. So if you simplify your writing, that can make a big difference. This is a screenshot of a tool that I use called the Hemingway app. If you go and put copy from your blog post in there, it'll tell you what grade level it's written at and will give you some specific suggestions about how you can simplify your writing. And that assumes you're writing in English. I don't know if it has support for other languages. Other people are big fans of Grammarly. And I'm glad you're here. And so, absolutely recommend this.
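As a rough illustration of what a grade-level check does, here is a minimal sketch in Python. It uses the Automated Readability Index, a standard published formula, and it is an assumption that this resembles what any particular tool computes; the function name and the sample strings are made up for this example.

```python
import re


def readability_grade(text: str) -> float:
    """Approximate US grade level using the Automated Readability Index.

    ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    This is a rough heuristic for English text, not a replacement for an editor.
    """
    # Treat runs of ., ! or ? as sentence boundaries (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        return 0.0
    characters = sum(len(word) for word in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


# Hypothetical sample copy: short sentences versus one dense sentence.
plain = "Short sentences help. Busy readers skim. Keep it simple."
dense = (
    "Notwithstanding the aforementioned considerations, practitioners "
    "frequently underestimate the architectural ramifications of "
    "horizontally partitioning relational workloads."
)

print(f"plain copy grade: {readability_grade(plain):.1f}")
print(f"dense copy grade: {readability_grade(dense):.1f}")
```

The score only looks at word length and sentence length, so it is best used to compare two drafts of the same paragraph rather than as an absolute target.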
The other aspect of readability that I want to point out, I always recommend people write in a conversational tone of voice. Again, going back to how we were taught to write, it was very formal, pedantic even, official, academic. But actually when people are in a hurry, their brains are wired for conversation. So if you write in that easy, accessible conversational tone of voice, you're going to have more people recommending your blog post. You're going to have more people actually reading it versus just bouncing and leaving, deciding oh, this is too much work or oh, the worst is I'll read this later, which of course most people don't ever do. So part of writing in a conversational tone of voice, like how do you do it? I always tell people to imagine that they're sitting over coffee and trying to explain to another engineer or developer how this thing works or whatever the topic is. And maybe that person is their best friend and they're going for a job interview and they really need to understand. How are you going to explain it to them so that it makes sense so that they get it? That's the kind of conversational tone of voice that you want to employ. And then I also recommend people write in the second person you. And that's because your reader doesn't care about yourself. They don't care about the author. They care about themselves. So when you speak to you, when you write about you and in the screenshot here I put like little, I don't know if you can see them. Oh yes, you can. The yellow boxes around all the instances of the word you. And a lot of times in the first draft, the word you won't even show up. People will just say, I, I, we, we, I'm doing this. This is all about me. Kind of like in the beginning of this talk when I introduced myself. Probably a lot of you didn't care. You just wanted to get to the meat of the topic. So speak to you and write in that conversational tone of voice. So I'm a big fan of checklists. 
I don't know how many of you have read this wonderful book by Atul Gawande called The Checklist Manifesto. Anybody? Oh, cool. Strong recommend. It's a really wonderful book and it talks about using checklists not just in his field, which is medicine and surgery, but also in other fields, like airline pilots and people in charge of big, massive skyscraper construction projects. Anyway, so a big fan of checklists. So here's a checklist that summarizes the three things we just talked about regarding scannability, variety of formats, and readability. All right. And now we're going to pivot away from you and your frame of mind and empathy and talk about the edit cycle. So I think the first thing I tell people who are starting off blogging, or even experts who are looking to get better, is to just embrace the edit cycle and the iteration. You have to do the work. Now, I suppose that's not true for everybody. I do know some people who are kind of famous and they can write really crappy blog posts and people are going to give it a plus one and promote it, and they'll get a lot of traffic even though it's not actually very helpful. It's not actually very good, but they're famous. And so they get that kind of benefit. But for most of us mere humans, we've got to do the work to make something usable by people. I'm a fan of this writer named David Perell and I grabbed this screenshot of a tweet that he did. You don't really need to read the words, but the visual kind of tells the story: when you are creating a blog, and probably the same is true for coding, you end up going down a lot of dead ends, and that's what all those gray lines are meant to represent. They're things that you thought were going to be part of the blog, ways of explaining that you thought were going to be useful, but they weren't and you had to throw them away. And, these are his words, good writing is only obvious in retrospect.
And so that purple line could not have been found without all those gray dead ends. And yeah, so you've got to edit. And the other thing that is important is that it's okay for your first draft to suck. Especially with people who are creating blogs for the first time, they'll get a little bit stuck because they're not sure how to say it. And literally, their hands are frozen over the keyboard. You just have to recognize that dirty little secret that everybody's first drafts suck. And that's okay. So once you give yourself permission to write something bad, then you're well on your way to having a good end result. The other quote people often use is perfect is the enemy of the good. These images of these artists are kind of interesting, though, because when you think about artists, their initial sketches are typically not shared with a team, so they feel free. They don't have anybody judging them and they can go scribble and draw and experiment and create some really amazing things by exploring in ways that they wouldn't be able to if they felt like they had an audience right there. So the technique that some people use is they just go, okay, I'm not going to show my first draft to anybody, I'm going to wait until it's a second draft and it's a little more refined. And maybe that's what you have to tell yourself. I just write crappy first drafts and I'm not embarrassed by it because I know the end result is going to be really fricking good. All right. Feedback is a gift. So if you find good reviewers, they're gold, and say thank you and have gratitude for it. I've worked with some brilliant bloggers in my career who, if you ask them to review your post, you'll get, oh yeah, it looks fine, looks pretty good. You got some typo in the fourth paragraph, but other than that, it looks good. And that's generally not complete feedback. And maybe they didn't even read the thing.
So hopefully you can find people who will look at what's missing or who will give you specific actionable ideas for how you could improve it or flag the things that are confusing. Now, one of the things I do, particularly when I'm working with folks who are super crazy busy and who I know are really carving out part of their weekend time or their evening time to give me feedback, is I will ask them to focus on something specific. So like if you're giving some important post out to three engineers, give them each a different part, a different assignment. Hey, Joe, I want you to look at the conclusion and make sure I didn't miss anything. Bertron, can you look at the diagrams in the middle and make sure that they're clear? I'm just calling you guys out because you're on the back row, so hello. So giving people those areas of focus can help. I don't know who to attribute this screenshot here, but I have used it with people and it's super effective. I'll tell them to flag anything that's confusing, flag anything where their mind starts to wander, because when their mind wanders, then that means they're probably bored and I need to go cut or fix it or do something to flag maybe the 10% that they love that I better not throw away in my next edit cycle and then scribble any ideas for what might be missing. And people have loved that technique and have said thank you to me for making it easy for them to help me. Okay, fifth tip in the edit cycle is just be willing to get your scissors out and to cut and to cut. Are any of you Stephen King fans? One, two, three, four, half, four and a half, okay. So Stephen King wrote, I'm actually not a big fan of his horror books, but over the years I've found some of his books that I absolutely love and the one I love the most is called On Writing. And we all know if you work in engineering and you work with software that a big part of your job is communicating, right? It is explaining things to other people. 
So if you've ever wanted to get better at writing, that is the book to read. It's part autobiographical, and he wrote it as he was recovering from an accident where he'd been hit by a car, but it's also part about the craft of writing. And in it he has a quote that says, kill your darlings, kill your darlings even when it breaks your egocentric little scribbler's heart. Kill your darlings. And I just love that quote. A lot of people misquote it as kill your babies, which I think is rather grotesque. But the idea is you've got to be willing to throw stuff away. And if you can't bring yourself to throw stuff away and delete it, one of the things I do is I just put a section at the bottom of the document called the cutting room floor and I just move stuff down there. And then I somehow feel better about it. Like, well, I'll be able to go back and get it if I decide it really should be reinstated. That never happens. It never gets reinstated. But psychologically, putting it on the cutting room floor is better for me than killing it. But you've just got to be willing to cut. Otherwise people will get bored and they will leave your blog, because it will be full of noise and your signal-to-noise ratio will be too low. Alright, number six. Forking your blog post into several shorter blogs. So, Andres Freund, who some of you probably saw, was here at FOSDEM PGDay on Friday and he's probably around right now. No, he's probably on his way to the airport. I worked with him on some blog posts that he wrote over a year ago about connection scalability in Postgres. And when he shared the first draft with me, it was really freaking long. And I didn't even have to tell him that it was long. He started off with, yeah, maybe I should split this into two different posts. Well, he ended up splitting it into three different posts. And I think all three of them hit the front page of Hacker News, and they all got a ton of traffic and they were all useful.
And I would argue that having three shorter posts made the information more accessible to people. People tend to have shorter amounts of time available to read or to learn. So this is a technique that can work wonders for you in the edit cycle: cut things apart. And then just keeping it simple, which is really hard to do, especially when you're trying to explain a concept that's complicated. But you can keep it simple in a lot of different ways. One of the things you can do is just move certain details into footnotes. So they're not cluttering your main points, but they're still there. You can also use analogies to help people understand a difficult concept, to make it easier for them to put that concept that's new to them into a mental model that they already have. And then of course, writing at that third or fifth grade level will help keep things simple as well. And then number eight, reading it out loud. So, it's interesting. When I read something silently in my head, and this is true for many people, if there's a mistake in it, my brain will self-correct it. And I won't even notice the mistake. I'll be oblivious to it. But if I read it out loud, I hear it immediately. It's like nails on a chalkboard. So reading out loud is a great way to find errors in what you've written, or to find that it's confusing or long-winded. You almost run out of breath when you're reading it and you're like, oh, maybe I should simplify that or shorten it. It's also a good way to assess if you've written in that conversational tone of voice. If, when you're reading it out loud, you're like, oh, I would never say that, well, then it's time to get your red pen out and go fix it to make it more conversational. All right. So then the ninth tip is to optimize for SEO.
So most of you know SEO stands for search engine optimization, and improving some basic SEO attributes of your post will make it more likely that your post will show up on that first page of search results. And you do not need to be an SEO expert at all. You just need a few very, very basic skills that you can learn today. And as long as you retain them, you're good to go. But I have to tell you first that these few basic SEO tips are not worth your time if you don't have good content, right? If it's not scannable, if it's not readable, if you haven't written in a conversational tone of voice, then the SEO stuff's not going to help you at all. So starting with good content, there are a few things you can do. And the first is just to spend a few extra minutes, maybe 10 minutes, 15 minutes, on the title. The title is what's going to cause people to click or not click if they discover or see your blog post somewhere in search results. And also the keywords in the title are going to affect which search results your blog is going to show up in. Not just the keywords in the title, obviously; the search engines use a lot of factors. But it's very, very important. So what I've found is I have to create at least 15 bad titles to come up with one good title. And it's interesting, because a lot of times when people on my team create blogs, they'll literally just create one title. They think they're at the finish point, but it's just the starting point. And then we riff from there to come up with other alternatives. So this is a screenshot of 15 different titles that Marco Slot and I were experimenting with and fiddling with for a blog last October about the Citus 11.1 open source release. And it looks like noise. I probably should have displayed these in categories, but there really is a method to the madness here.
Like some of these titles use Postgres as well as Citus in the title, for people who don't know the connection. It's an important connection. So, you know, maybe the title had to include both. We were fiddling with different ways to talk about it, like without blocking writes, or without interruption, or, what were the other ways, using create_distributed_table_concurrently. So using different terms to express the same concept. 15 minutes. Huh? I better talk faster, huh? Okay. I'll get going. So the bottom line is don't be afraid to generate a whole bunch of titles and fiddle and experiment to try to come up with what the right answer is. Now, there's three techniques for assessing once you have a list. The first is, are the most important terms in the first 60 characters? On a Google search results page, they will truncate the title after 60 characters. So don't take the most important word and put it later in the title, because then people are never going to see it and it's not going to cause them to click. Also just, is it boring versus is it interesting? And I'll often use incognito browser searches to figure out what other things are popping up already for those particular search keywords and terms. And I definitely don't want to use the title that somebody else has already used. That would be a bad thing. So that's one reason to compare it to what's out there in the world. All right. And then a few more bullets here on the right-hand side of things to pay attention to. I will often advise people to avoid general terms like solution. I can't stand that one. Like why say solution when you could just say database or Postgres or Citus? I really pay attention to pronouns like it and this. A lot of us like to be efficient in our writing.
And so using it makes things sound a little bit less repetitive, and I get it, but almost always that pronoun introduces an ambiguity, because it could be referring to this noun or maybe to that noun. And for somebody who's in a hurry, it stops them dead in their tracks to try to figure out what you're referring to, and they're not quite sure what you meant. So that's another tip. And then when you create hyperlinks, it's a really bad idea to ever use the word here, or anything generic, as the anchor text for a hyperlink. You want the anchor text, the text that has the underline under it and links to wherever you're taking them off site, to have very specific keywords that are relevant to wherever you're taking them, not vague and general ones. And it's actually something that matters to the search engines. So those are some SEO tips, but I'm going to move on as quickly as I can because there's more to talk about. Alt text for graphics. I mentioned that earlier. And then Google actually also cares about this thing called E-A-T for author bios. That stands for expertise, authoritativeness, and trustworthiness. And so on your blog platform, wherever you publish, make sure that there's a bio description for you. And it's nice to make it fun. In my case, I would say that I love chocolate Labradors or I love sailing in Greece, but it's even more important to show your expertise and your trustworthiness. You know, the examples, let's see, do I have two examples here? Yeah. So in Marco's case, the second one down, I showed that he has a PhD in distributed systems and that he's the lead engineer on the Citus database team. And also what conferences he's spoken at. Apparently the E-A-T team of reviewers care a lot about conference talks. That's one of the ways they decide somebody is credible in a field.
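Two of the SEO checks described in this part of the talk, key terms inside the first 60 characters of the title and no generic anchor text on links, are mechanical enough to sketch in Python. This is only an illustration using the standard library: the 60-character limit comes from the talk, while the function names, the list of generic phrases, and the sample title and HTML are assumptions made up for the example.

```python
from html.parser import HTMLParser

TITLE_LIMIT = 60  # truncation point for search result titles, as stated in the talk
# Hypothetical list of anchor texts considered too generic; extend as needed.
GENERIC_ANCHORS = {"here", "click here", "this", "link", "read more"}


def key_terms_visible(title: str, terms: list[str], limit: int = TITLE_LIMIT) -> bool:
    """Return True if every key term appears within the first `limit` characters."""
    shown = title[:limit].lower()
    return all(term.lower() in shown for term in terms)


class AnchorTextCollector(HTMLParser):
    """Collect the visible text of every <a> element in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self._in_link = False
        self._buffer: list[str] = []
        self.anchors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.anchors.append("".join(self._buffer).strip())
            self._in_link = False


def generic_anchors(html: str) -> list[str]:
    """Return the anchor texts that are on the generic list."""
    collector = AnchorTextCollector()
    collector.feed(html)
    return [text for text in collector.anchors if text.lower() in GENERIC_ANCHORS]


# Hypothetical draft title and post body.
title = "Distribute Postgres tables with Citus without blocking writes"
body = (
    '<p>The docs are <a href="https://example.com/docs">here</a>. See also '
    '<a href="https://example.com/pgbouncer">how to set up PgBouncer</a>.</p>'
)

print(key_terms_visible(title, ["Postgres", "Citus"]))
print(generic_anchors(body))
```

Neither check says anything about whether a title is interesting, so they complement the "15 bad titles" exercise rather than replace it.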
For Thomas Munro, who's another Postgres engineer on our team, again, I list what conferences he's spoken at and the fact that he's a core developer, et cetera, et cetera. Okay. So again, another checklist. I will be publishing my slides afterwards, so you'll be able to grab these if you find them useful. I think I've got all of the editing tips here as well as the SEO tips. Cool. So now we're done with content, both you as well as the edit cycle. And I want to talk for just a few minutes about distribution, which is kind of how do you publish? So first I want to ask a question. Somebody said this to me once: we're going to know which blog posts are good by how much traffic they got. True? Show of hands. False? Show of hands. Okay. So it's false. Typically, how much traffic something gets is more a factor of how much it was shared. How much was it promoted? Did you actually distribute it and get it out into the world? Distribution trumps content almost all of the time. So there's two aspects of distribution. There's the publishing platform itself, and you need to be smart about that, but we're not going to talk about that today. We'll focus instead on how to promote your blog posts. So one pro tip I always share with people is, as soon as you publish, but before you or anybody else starts sharing it, double check what I call the unfurl of that URL. That might be a Slack term, so I'm sure there's another term for it, but basically when you share your URL on different platforms, the HTML metadata will get displayed, right? The HTML title and the HTML description, as is shown here, as well as the OG image, right? Or the social graphic or the teaser graphic; different blog platforms call it different things. And sometimes people have screwed up and they have the wrong graphic or they didn't fix their description.
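The unfurl check described above can be partly automated before sharing. Here is a minimal sketch in Python that parses a page's HTML and reports which of the three pieces mentioned in the talk (title, description, OG image) are missing. The list of required fields, the class and function names, and the sample HTML are assumptions for illustration; the sketch reads an HTML string rather than fetching a live URL, and it cannot tell whether the graphic is the right one.

```python
from html.parser import HTMLParser

# Fields the talk mentions: HTML title, HTML description, and the OG image.
REQUIRED_FIELDS = ("title", "description", "og:image")


class UnfurlParser(HTMLParser):
    """Collect the <title> text and <meta> name/property values from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.found: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use property="og:..."; the description uses name="description".
            key = attributes.get("property") or attributes.get("name")
            content = (attributes.get("content") or "").strip()
            if key and content:
                self.found[key.lower()] = content

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.found["title"] = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def missing_unfurl_fields(html: str) -> list[str]:
    """Return the required unfurl fields that are absent or empty in the HTML."""
    parser = UnfurlParser()
    parser.feed(html)
    return [field for field in REQUIRED_FIELDS if field not in parser.found]


# Hypothetical page head with a title and description but no social graphic.
draft_head = """
<head>
  <title>How to set up PgBouncer for Postgres</title>
  <meta name="description" content="A walkthrough of connection pooling.">
</head>
"""

print(missing_unfurl_fields(draft_head))
```

A human still has to look at the actual preview, since a present but wrong graphic passes this check.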
And then some of these social platforms cache the graphic in particular. So if you notice the mistake seconds after you tweet it out the first time, Twitter is going to cache that graphic for 24 to 48 hours, and you can fix it, but people are still going to see the wrong one. So it's just worth double checking that before you start sharing. And then I always put a plug in for Planet Postgres. If your blog is about Postgres and it complies with all of the Planet Postgres policies, syndicating it there is a great way to get it into people's RSS feeds. And also you'll see on the right here that there's a Planet Postgres Twitter account, and on Mastodon there's an unofficial Planet Postgres bot account too. People follow that account on Mastodon or on Twitter, and it's another way to reach people. So that's a fabulous way to reach people. And the bottom line is just go to where your readers are. What that means for me and our team is that we end up sharing our blog posts on pretty much all of these different social platforms, because there's no one social platform that everybody subscribes to. And did I add Mastodon to this list? Oh, I need to update this and add Mastodon. I created this slide back in October. Things were a little bit different back then. But actually on most of these slides my Mastodon account is now listed. So Twitter is just [unclear], but I'm on hachyderm.io, which I probably mispronounced. Okay. Another tip: rinse, lather, repeat. And that just means if you're going to share it, don't be afraid to share it a few times, at different times of day. The Postgres audience is global, and something that you tweet in the morning in Europe is not necessarily going to reach people in California unless they're super night owls. And vice versa. So I know a bunch of engineers who are like, oh, I already tweeted it once.
I don't want to pollute my, you know, my profile with it more than once. But if you want to reach your friends, you're going to have to hit different time zones. Um, and just keep in mind that only like one statistically only one to 2% of your Twitter followers would see a given tweet. I don't actually know what that statistic is on mastodon. Yeah. Devrim, do you want me to go faster? Is that? Yeah. Yeah. The number may be lower now. Things are definitely changing in the Twitter world. Um, an example of rinse, lather, repeat too. Um, uh, some of my colleagues, my boss actually was interviewed on Azure Friday, which is this wonderful video show put on by Scott Hanselman, who I think is a fabulous human being. And, um, the person in charge of the Azure Friday show tweeted about it multiple times, but with different graphics each time and different words each time. And I actually think that's a good technique because the different graphics are bound to appeal to different people. I mean, I suppose it's not unlike Netflix having, um, different graphics that they use to promote a movie and something that appeals to Boris is not necessarily going to be what appeals to me. So I bet we see different, um, graphics for these various shows. All right. Um, and you can keep your Twitter momentum by recycling tweets. Um, the same is true on mastodon as well. So just, you know, you can quote tweet on Twitter and not on mastodon yet. Um, your mileage may vary, but it's super, um, effective. And also having other people on the team share it. So for example, um, let's see, I think Devrim took something that I tweeted recently and then he quote tweeted it and it reached a ton more people as a result. So, um, just keep recycling it and eventually you will hit more people. Um, on Reddit, it's super important to be there for the Q and A. Like most of the moderators will say that one way promotion is considered advertising. They don't like that. 
And then you can end up getting blocked by the mods explicitly or by the bots, you know, sometimes by mistake. But if you go in there and you make a comment and you say, hey, we're here to answer questions, let's discuss this, and you facilitate a conversation, that's quite welcome. But then you have to be willing to participate. There were a ton of questions on this particular topic when we open sourced Citus last year. And so Jelte had a lot of work to do to get answers to those questions. Five minutes left. Number seven is that karma is a two-way street. So if you want your Postgres colleagues or people who work on the same team as you to help spread the word on some blog post that you've written, turn the table around and help them when they've got something new that they're publishing and trying to get out there. And that's something that I think we all should do more of. The more we help amplify each other's work on Postgres and on Citus, the more people will discover what's going on, learn about it, become expert in it, and join the community. Again, another set of tips. And here for promotion, your mileage is going to vary. So I didn't give an explicit checklist, because what you want to do is going to be different based on whatever part of the database you're working on. So you'll have to create your own promotion checklist, but hopefully these tips will be helpful. And I put it all together here because the devil is in the details and there is a lot to do. And if there's one thing you're going to remember from this talk today, it should be to have empathy for your readers. If you do that, you're more likely to write in a conversational tone of voice, speak to you, make things scannable, readable, simple, and take the time to explain things. Prioritize and care about your reader more than you care about yourself.
There were a whole bunch of people who helped me and inspired me in writing this talk. So I had to throw that slide out there to thank them. And there is a feedback link for all of the talks here in the Postgres devroom. So if you go to fosdem.org/2023/schedule and you go to the Postgres devroom track, you can see all of our talks listed, and you can click on any one of them. And in the bottom left, there's a submit feedback link. And I don't know how many of you generally give feedback on your FOSDEM talks. Show of hands. Okay. Let's try to increase that number this year, because it's super helpful. People pour a lot of energy into creating these talks and organizing the devroom, and we would love to get your feedback. I put my addresses up there on Mastodon and on Twitter. If any of you want to follow me, I will be posting a link to the slides on both of those accounts. And I put the Citus GitHub repo on there too, so that if it is something that you appreciate, you can be sure to drop it a star as well. And then I know that we need to move on to the next talk. I think I'm out of time. So I don't know. Do we have any time for questions? Okay, so we have a few minutes for questions. I also wanted to say that I have some Postgres and some Citus database socks up here. I have 40 pairs. I also have a whole bunch of stickers of the Citus Elicorn open source mascot too. So feel free to come and get them afterwards and wear them in good health. But first, questions. Any? Hey, so great talk. So in tech, many folks have strong opinions about certain things, to the point that some people take certain stuff religiously. And when you're in the edit cycle, would you eliminate such pieces of information that will spark a reaction from such folks? Or would you let the disagreement be and see it, knowing that disagreement is information? 
So what should one do in such a situation when they want to have such a piece of information in their blog post? I think that's a really good question. So where's the line between having an opinion on something, right, an opinion that might be unique and your own, and being offensive, perhaps, to other groups, or hurtful, or overly critical? And I kind of use the codes of conduct as my guide. Like, how far have I stepped from expressing my opinion into trashing somebody else's opinion? I think I fall back on that. Other questions? There's one behind you, Jimmy. It's a small question, but what different tips can you give about presenting something in your team, to your colleagues, whatever? It's not blogging something, but still, a lot of the things you mentioned here apply. Maybe you have some tips there as well. I do, but there's no way I could answer that question succinctly in the time that we have. But I will say that a lot of these same tips about communicating in the format of a blog also make a lot of sense in the format of a presentation. For example, the white space that I talked about in your blog, that breathing room, it's the pause when you're giving a talk. It's taking a moment to speak slowly when you're giving a talk, for example. I'm also a big fan of spreading things out on multiple slides, so that any one slide only says one thing, versus having a lot of noise up there. So the signal-to-noise ratio lesson is the same in both. Okay. Another question? Oh, and just be careful with the seats, people. When you sit down, it's very loud. Go for it. So I'm a personal blogger and, let's say, I don't have all the resources to promote on all the possible websites: Reddit, Hacker News, posting to Twitter. So if I can choose, let's say, two or three major promotion resources, what's your advice for choosing them? Which social platforms to pick if you only have time to publish on a few of them? Is that the question? 
Yeah, like, shall I choose Reddit over Hacker News? I can't prescribe that for you. What I would recommend is that you experiment, right? Because it depends on where your friends are, who your customers and users are, where they are. That advice about going to where your audience is: that's where you need to go. And so you might try these two first and then try these two another time. I also don't think I would give a particular social platform just one chance, because it might be the right platform for you, but you posted at the wrong time of day or on the wrong day of the week. So I think you're just going to have to experiment a little bit there. Thank you. Any other questions, or are we out of time, Jimmy? No, we've got three, two minutes. Okay, Karen has a question. I'll repeat the question if you want. Hey, thank you very much. I didn't manage to catch your talk last time you did it, so I'm really glad I was here today. Lots of us in IT, from speaking to many people, don't find it natural to promote ourselves or to put ourselves out there and write blog posts and, you know, promote our tweets and things. Have you got specific advice for that? That is a really good question. And I struggle with it, because it's back to why I made that karma point too. Sometimes people also have trouble promoting their teammates' work, because that might be seen as too self-serving also. But the fact is, if we want the Postgres community to grow, if we want the information that we're sharing to get out there, then we kind of do have to promote it. And I don't know how to really address that concern, except to suggest that you just have to let it go, right? We just have to do it. You have to see it as part of the job. It does become more comfortable over time. It's like anything that gets easier with practice. And you know what will happen when you promote something and you get feedback from someone: Pavlo sends you a note and says, hey, that was really useful. 
I especially like the bit about this. And then your brain starts to connect the dots, like, oh, Pavlo wouldn't have even known about it if I hadn't shared it. Okay, maybe sharing is a good thing. I don't know. So you get the positive validation, and that might help. Any other questions? How to cope with imposter syndrome is the question. And it's a really good question. I think a lot more people feel imposter syndrome than we realize. I bet there are a lot of speakers here at FOSDEM who are experts in their field, but maybe felt like imposters as they were walking into the building today, wondering why they're the ones giving the talk. I'm not a psychologist, so I don't think I have a good answer, except to say that you're not alone. And everybody has a story to tell. And everybody has a unique point of view and lessons. And just... |
When it all GOes right |
We are starting, please be quiet. A couple of things. First of all, welcome. Here I am to introduce our next speaker, Pavlo, who is going to talk about Go. Just a few practical things. When you exit, please exit from the back. And if you want to go out in the middle of the talk, please be quiet, because this diaphragm is very loud and these chairs are very squeaky. So, yes. That being said, enjoy. Hello people. How are you today? It's very fun, because the first two talks were not really crowded, but mine is good already. My name is Pavlo Golub. I'm working for Cybertec. So, I'm a senior consultant in the body of a young developer. So, today, a couple of words about my company. Cybertec is purely a PostgreSQL company. We work with clients only if they have PostgreSQL. If they don't, we install it and they can work with us from that point. We have several branches, not all over the world, but around it. Some of them are in South America, most of them are in Europe. Not now. I wanted to restart. So, some of our customers, so choose your fighter. Some of our products. So, we are not only consulting people, but we do products. Yeah. Why PostgreSQL? You know why. It's cool. Absolutely. So, what am I talking about today? First of all, I want to introduce you to the Go language. How many of you have an idea of what the Go language is? Okay. So, I don't need to start from the beginning, explaining what a compiler is. So, yeah. Okay. Then, I will say a couple of words about the IDEs and editors we are using. Then, I will describe drivers for PostgreSQL specifically. Some useful extensions. How we do testing, how we do releases, how we do continuous integration, development, etc. And then, probably, I hope, we will have a question session. Okay. So, why Go? First of all, when I started my first project, I liked that Go produces one native binary for every platform. I said, wow. Wow. 
I can just build everything from the same command line for every operating system and architecture I want to. And I don't need virtual machines and other crap. Just Go. It's simple enough. It has a good community. It now already has very comprehensive tool support from GitHub, GitLab, etc. Yeah. So, cross-platform is somehow connected with the native binaries for every architecture. So, this is the latest developer survey for Go, asking what kind of things people are using the Go language for. So, speaking about PostgreSQL, I would say that data processing is somehow connected. Before that, we had these answers, where you can see that databases are involved in about half of the projects people were using Go for. Probably you know, but if not, the top products written in Go are Kubernetes and Docker, OpenShift, then Hugo, the fastest framework for building static sites, etc. Postgres-related products written in Go are CockroachDB, both Postgres operators, from Zalando and from Crunchy, WAL-G, pgCenter, pgwatch, pg_timetable, and so on. So, if you want to find more projects, more applications written in Go, please go to this site, to this repo. There are a lot of them. A lot of them, yeah. Okay, so what about tooling? When I started, I used Sublime Text. There was no proper IDE to work with, no debugger, none of these kinds of things. Right now, according to last year's developer survey, the most used IDE is VS Code, then GoLand by JetBrains, and then Vim, and Sublime Text is at 1%. I was saying at an earlier conference talk that I would try GoLand and tell you how it is different from VS Code. No, I still didn't try it, but I think it's good. So these are the answers from the previous years. As you can see, the tendency is the same. So, VS Code on top, and Go language, et cetera, et cetera. GoLand, sorry. What am I using? I'm using VS Code with the official plugin installed. 
Then I'm using the tparse command line utility to make these fancy tables out of test output. A linter, of course. Tabnine. I tried GitHub Copilot. It's not bad, but I don't want to spend my money on this. I have my own head, you know. GoReleaser to produce the packages and binaries in one go. Then PostgreSQL, of course. And last, but not least, Gitpod.io, which is pretty much the same thing as a GitHub workspace, but it's free for open source developers. So it's good if you want to try something new and you don't want to install everything on your machine or set up a virtual machine. You can go there, run it in your browser, drink a beer on the beach, and try something new. It's okay. When you're done, you just close your browser, close your tab, and it's gone. Now, about drivers, about the PostgreSQL part of this talk. The whole idea of this talk started after I tried to find good tutorials about how to start working with PostgreSQL. A lot of them, not all of them, but a lot of them say, okay, just use an ORM and you're fine. Well, why? If I need to create a utility that uses three commands, why should I use an ORM? And yeah, don't get me wrong. ORMs are fine if you know what you are doing. They solve problems, but you don't need to put one everywhere and you don't need to start with one, because otherwise you will be learning the ORM but not learning PostgreSQL itself, right? Yeah. We have SQL for that. So, during this talk, I will not be explaining ORMs or how to work with them. I will explain the basic drivers. Anyway, ORMs are using drivers on the low level to speak to PostgreSQL, right? So, we should know how to use them. So, the thing is, in Go, that we have these database/sql interfaces. An interface is just like an agreement on what methods are available from, I don't know, an object, a structure, or whatever. And for each database, there should be a special driver, an implementation of these interfaces, right? 
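To make the "interfaces plus a driver per database" idea concrete, here is a minimal sketch that uses only the Go standard library. The driver name `toy` and every `toy*` type are hypothetical and exist only for this illustration; a real PostgreSQL driver does far more, but it plugs into `database/sql` through the same `sql.Register` call and the same `driver` interfaces.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
	"io"
)

// toyDriver is a hypothetical in-memory driver. It only shows how
// database/sql finds an implementation by its registered name.
type toyDriver struct{}

func (toyDriver) Open(name string) (driver.Conn, error) { return &toyConn{}, nil }

type toyConn struct{}

func (c *toyConn) Prepare(query string) (driver.Stmt, error) { return &toyStmt{}, nil }
func (c *toyConn) Close() error                              { return nil }
func (c *toyConn) Begin() (driver.Tx, error) {
	return nil, fmt.Errorf("transactions are not supported by the toy driver")
}

type toyStmt struct{}

func (s *toyStmt) Close() error  { return nil }
func (s *toyStmt) NumInput() int { return -1 } // -1 disables argument-count checks
func (s *toyStmt) Exec(args []driver.Value) (driver.Result, error) {
	return driver.RowsAffected(1), nil
}
func (s *toyStmt) Query(args []driver.Value) (driver.Rows, error) {
	return &toyRows{}, nil
}

// toyRows yields exactly one row with one column, whatever the query was.
type toyRows struct{ done bool }

func (r *toyRows) Columns() []string { return []string{"greeting"} }
func (r *toyRows) Close() error      { return nil }
func (r *toyRows) Next(dest []driver.Value) error {
	if r.done {
		return io.EOF
	}
	r.done = true
	dest[0] = "hello from the toy driver"
	return nil
}

// Registration is what a driver package normally does in its init function.
func init() { sql.Register("toy", toyDriver{}) }

// greet uses only the generic database/sql API; it does not know
// which driver is behind the name "toy".
func greet() (string, error) {
	db, err := sql.Open("toy", "ignored-connection-string")
	if err != nil {
		return "", err
	}
	defer db.Close()
	var s string
	err = db.QueryRow("SELECT greeting").Scan(&s)
	return s, err
}

func main() {
	s, err := greet()
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The application code in `greet` depends only on `database/sql`, which is why swapping one driver for another usually comes down to changing an import and the driver name.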
So, the first official implementation for PostgreSQL in the Go world was lib/pq. It's good. It's proven. It has been on the market for a long time, but it has been in maintenance mode for two years now, probably. It's not bad. It's okay. You can be sure that its functionality is solid, but if you want more, and if you are starting a new project, jackc/pgx is the way to go, because you can use it with whatever you want. I will show you later. So, yeah, we are scientists, and we do graphs. Unlike in the NPM, JavaScript world, we don't have statistics on how many times a particular package was downloaded, used, whatever. So, the funniest way is to use GitHub stars and, yeah, to build this chart. As you can see, pgx started one year later, but now it's becoming very popular. So, in 2019-20, there was an announcement that lib/pq was going into maintenance mode, and pgx started to grow. So, what if your project is already using database/sql, or you want to follow tutorials or techniques and use these database/sql, de facto standard interfaces? For that purpose, pgx has a special wrapper. So, you can still use the database/sql interface, but underneath, on the low level, you will use the pgx package. So, you should use pgx directly if you are starting a new project and your project is targeting only PostgreSQL. Then you don't need database/sql. But if you depend on an ORM, or on another package which wants you to use database/sql, you go for the wrapper. In that case, the dependency will use this wrapper to speak database/sql, and you will use the power of pgx. So, there are a lot of unique features that are not implemented in the standard library, and they cannot be implemented, because this interface is the same for every possible database. So, for example, you cannot add methods with COPY support, because only PostgreSQL has COPY support, right? 
So, the coolest feature is that pgx supports the binary transfer protocol, and it natively supports all built-in PostgreSQL types. And if you create user types, it will support them out of the box unless they are very complicated. But even in that case, you can create a special interface or a special structure that will tell pgx how to encode and decode the values of your types. So, as I said, yeah, the COPY protocol, logging, yeah, and for connection pooling, you have this after-connect hook. So, the idea in the Go language is that the database object that you have is pooled by default. So, it can create additional connections, and you never know how many active... Well, you can know, but you can never tell how many connections you have right now. And this after-connect hook, for example, helps you to prepare your session, to prepare your newly opened connection for something, like adding some identifier, logging, or whatever. Yeah, LISTEN/NOTIFY is implemented natively. What else? Yeah, different JSON, hstore, large objects. So, everything you need is already there. A nice thing about pgx is that in December, probably, the new major version 5 was released. And it was a huge step forward, especially in terms of dependencies. Version 4 was good, but it had so many external dependencies that, for example, Magnus Hagander said, no, we will not rewrite our internal tool to use pgx because it has too many dependencies. That's not the case anymore, and it's very cool. So, yeah, hello world. You probably all know how to write it with the database/sql interface, so you're just using that import package, but instead of using lib/pq, you are specifying the pgx stdlib package, which is a wrapper for the standard library. And then all things are the same. No difference. So, if you want to update your project, you just change the import part, the import of lib/pq to the pgx stdlib, and you are fine. If you want to use pgx directly, you are fine to do that. 
The thing here is that the pgx Connect method returns one connection only. So, if you don't need a pool of connections, or if you want to be sure that only one connection is live, that only one connection is used, you go with pgx.Connect, right? But please remember that you cannot use this structure, this connection, in parallel, so you need to make sure that at any point in time, only one thread or one goroutine talks to the database. Otherwise, you're good. But if you want a pool of connections, you go with pgxpool. And I don't know what I can add here. It's obvious. You can pass it to the goroutines, and it will create additional connections as you go, and you can limit the number of connections, etc., etc. It's very, very, very flexible. Okay, about useful extensions. For my first project, I started with lib/pq as well. It's the way to go. That's how we grow. And later, I understood that I wanted this COPY functionality badly. I needed that. So I started to look at pgx, at how to switch to it, and I didn't want to lose these sqlx things, when you're encoding and decoding your structures, slices, arrays, whatever, right from rows or to rows, right? That's what most people think an ORM does. It just translates the rows into the structures. But, yeah, it's very useful. So, if you are working with plain database/sql or lib/pq, you import this sqlx thing, and you get a lot of new methods: you can scan the row into a structure, or you can scan a scalar into a variable, or you can create a slice from your rows, etc., etc. It's very cool. pgx, at that time, didn't provide that. But with the latest version 5, everything is already there. Batteries included. Probably you can find cases where you want more control over decoding and encoding, but after this guy was introduced, RowToStructByName, written by me, by the way, yeah, everything became very easy. So, before that, we had only row-to-struct by position, right? 
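As a rough illustration of what mapping a row to a struct by name (rather than by position) means, here is a small standard-library sketch. It is not how pgx implements it; the `db` struct tag, the `user` type, and the `scanByName` helper are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// user is a hypothetical target struct. Columns are matched against the
// `db` tags; the unexported field is never touched by the mapper.
type user struct {
	ID    int    `db:"id"`
	Name  string `db:"name"`
	notes string
}

// scanByName copies one row's values into the struct pointed to by dst,
// matching column names to `db` tags instead of relying on field order.
func scanByName(columns []string, values []any, dst any) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Pointer || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("dst must be a pointer to a struct")
	}
	v = v.Elem()
	t := v.Type()

	// Build an index from tag name to field position, skipping private fields.
	byName := make(map[string]int)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		if tag := f.Tag.Get("db"); tag != "" {
			byName[tag] = i
		}
	}

	for i, col := range columns {
		idx, ok := byName[col]
		if !ok {
			return fmt.Errorf("no struct field for column %q", col)
		}
		val := reflect.ValueOf(values[i])
		field := v.Field(idx)
		if !val.IsValid() || !val.Type().AssignableTo(field.Type()) {
			return fmt.Errorf("column %q: cannot assign %T", col, values[i])
		}
		field.Set(val)
	}
	return nil
}

func main() {
	// The columns arrive in a different order than the struct fields.
	var u user
	err := scanByName([]string{"name", "id"}, []any{"alice", 7}, &u)
	fmt.Println(u.ID, u.Name, err)
}
```

With a position-based mapper, the same row would only work if the result set listed `id` before `name` and the struct had no extra fields in between.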
So, if your structure fields are in the same position as the fields in your row result set, you're fine. You're just going back and forth. But if you want to skip some fields, or, for example, if you have some non-public fields in the structure but still want to use this functionality to decode and encode from the result set, this is the way to go. Yeah. Now about testing. We all know it's very essential, right? And I heard this statement a lot, that you need to write your tests first, and only then the implementation. I never did. Maybe one of us tried it. Oh, go, go, go. I'm too lazy, because I never know where I will end up with my code. I start like, okay, I will implement this thing that will return this integer, and then I'm like, oh, no, let's do this, a CTE with a lot of things, yeah. And if I write a test first, I need to follow it, right? No, it's not fun. How do we do testing? I would say there are three main approaches. The first one, obviously, is to start a real PostgreSQL server. You can have your local installation in the test environment, or you can download it during the test run, install it, initialize it, et cetera, then cache it, but it's still the same. It's a real PostgreSQL server. The second approach would be mocking libraries. For database/sql, that would be DATA-DOG/go-sqlmock. And for pgx, I created pgxmock, which is the brother of sqlmock, but, yeah, works with pgx. I hope you know what mocking is and how these things work, right? We are pretending that we are a PostgreSQL server, and our application, our tests, think that they are speaking to the real server, but in fact, we just throw the needed answers at the application. So, do we need rows? Okay, this is a row, this is the answer. Oh, no, that one is an error. Let's see how you will react to that, et cetera, et cetera. 
But if you want to test on the protocol level, there are also some very low-level libraries, like pgmock, which is real low-level mocking of the protocol. CockroachDB has its own test server: you just import the CockroachDB test server and use it in your tests. And another library is copyist. Let's try to maybe... It's not very useful. No, let's get back. Can it work? Okay, so how to create a test, how to use this pgxmock thing. So, if you read the README in the repository, you will see that no change is required to your application. That's a lie. You need to provide an interface, because what pgx returns is a structure, a connection or a pool. We cannot mock structures. We can mock interfaces. So, I am defining a pgx interface here and saying to my method, to my function, that I will use this interface. Please use that. And my function doesn't care whether it is a real connection or a mock or anything else. It just knows that this object has this method, and that's enough for it, okay? So, yeah, we write some code, okay? We are trying to commit, we are trying to roll back, et cetera, et cetera. How do we test it? I will always start with the successful test cases. I am a very positive person. First things first, though: I am creating the mock object, pgxmock.NewPool. Then I will tell my mock object what my session should look like, right? So, I am saying I am expecting that we will start a transaction. I am expecting a begin. Then I am expecting that the code will try to execute an update statement, right? And when this happens, please return to this code the result that the update was successful and we updated one row, right? After that, I expect that the code will try to insert something. And I expect that the arguments for this statement will be two and three. If that is the case, please tell it that everything is good. We inserted one row. And after all that, I expect that the code will commit the transaction, right? 
That is what I am expecting from the code. Then I am calling my function recordStats and, instead of the pgx connection, I am passing the mock object, mock. And two and three as arguments, right? And if anything goes wrong, the test case has failed, right? But another thing I want to check is that every expectation I set was met. For example, after the commit, my code might want to write a log to the database or do other things. I don't expect that from it, and this ExpectationsWereMet check will fail if something else happens inside this function which is not expected, right? For failure it is pretty much the same. We are telling it that we expect to start a transaction, we expect an update statement, but let's pretend we want to test how our code will behave if the insert statement fails. So we are telling it: when the insert statement comes with the arguments two and three, let's pretend that an error happened. Return an error to our code, an error with some error text. Very beautiful. We are calling our function, but in that case, we know that it should fail. That's why we are checking our error to be not nil. We are waiting for an error, right? And the same for ExpectationsWereMet. So for example, if we failed and our code tries to do more than we are expecting, we say, no, please don't. Please don't. Yeah, so then you are just using go test with the tparse thingy. I just love how the tables look in this output. So in this case, we have one package and we have two test cases, right? But in a real application, you might have hundreds and hundreds of test cases and dozens of packages. They will all be listed, and you can see the coverage for every package and, probably, for every test case, how many passes, how many fails, et cetera, et cetera. Also, you probably want to investigate what the coverage is. For that, you are using the built-in go tool cover, which will produce temporary HTML files and open them in your browser. 
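The mocking pattern walked through above (the code depends on an interface, the test scripts the expected session, and a final check confirms that nothing unexpected happened) can be sketched with the standard library alone. This is a hand-rolled stand-in, not pgxmock: the `execer` interface, the `mock` type, and the simplified `recordStats` body are assumptions made for the illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// execer is a hypothetical, simplified interface that the application
// would define over a real connection or pool so that it can be mocked.
type execer interface {
	Begin() error
	Exec(sql string, args ...any) error
	Commit() error
	Rollback() error
}

// recordStats is the code under test: two statements in one transaction.
func recordStats(db execer, userID, productID int) error {
	if err := db.Begin(); err != nil {
		return err
	}
	if err := db.Exec("UPDATE products SET views = views + 1"); err != nil {
		db.Rollback()
		return err
	}
	if err := db.Exec("INSERT INTO product_viewers (user_id, product_id) VALUES ($1, $2)", userID, productID); err != nil {
		db.Rollback()
		return err
	}
	return db.Commit()
}

// step is one expected call plus the error the mock should answer with.
type step struct {
	call string
	err  error
}

// mock replays a scripted session and rejects any call that deviates from it.
type mock struct {
	steps []step
	pos   int
}

func (m *mock) expect(call string, err error) { m.steps = append(m.steps, step{call, err}) }

func (m *mock) next(call string) error {
	if m.pos >= len(m.steps) {
		return fmt.Errorf("unexpected call %q", call)
	}
	s := m.steps[m.pos]
	if s.call != call {
		return fmt.Errorf("expected %q, got %q", s.call, call)
	}
	m.pos++
	return s.err
}

func (m *mock) Begin() error    { return m.next("begin") }
func (m *mock) Commit() error   { return m.next("commit") }
func (m *mock) Rollback() error { return m.next("rollback") }
func (m *mock) Exec(sql string, args ...any) error {
	verb, _, _ := strings.Cut(sql, " ") // match on the statement verb and its arguments
	return m.next(fmt.Sprintf("%s %v", verb, args))
}

// expectationsWereMet reports scripted calls that never happened.
func (m *mock) expectationsWereMet() error {
	if m.pos != len(m.steps) {
		return fmt.Errorf("%d expectation(s) were not met", len(m.steps)-m.pos)
	}
	return nil
}

var errInsert = errors.New("insert failed")

func main() {
	// Success path: begin, update, insert with arguments 2 and 3, commit.
	ok := &mock{}
	ok.expect("begin", nil)
	ok.expect("UPDATE []", nil)
	ok.expect("INSERT [2 3]", nil)
	ok.expect("commit", nil)
	fmt.Println("success path:", recordStats(ok, 2, 3), ok.expectationsWereMet())

	// Failure path: the insert fails, so a rollback is expected instead.
	bad := &mock{}
	bad.expect("begin", nil)
	bad.expect("UPDATE []", nil)
	bad.expect("INSERT [2 3]", errInsert)
	bad.expect("rollback", nil)
	fmt.Println("failure path:", recordStats(bad, 2, 3), bad.expectationsWereMet())
}
```

In a real project the same two scenarios would live in `_test.go` files and use a mocking library rather than this toy; the point here is only that the function under test never learns whether it is talking to a server or to a script.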
So you see a combo box with the list of files in your application and you just go through all your code. The red parts are not covered by our test cases. The green parts are covered by our test cases. For example, in this case, our main function is not tested at all, because it's tricky to test the main function. That's why you should keep everything else out of it, make it two or three lines, and only call your packages inside. Okay, so time for continuous integration and continuous delivery. As a company, we are working on GitHub, but I'm pretty sure that the same functionality is available on GitLab and Bitbucket and everywhere. So for every project of mine, I want to have at least five actions, GitHub Actions. The first one is Dependabot. I want my repository to be constantly checked for whether any package I depend on has been updated. So it will notice, okay, we have a new minor or whatever version of this package, and it will automatically create a pull request. I will check the output. I will check if the tests are fine, if everything is okay. Okay, very good. I like this because you can do three pull requests per day and you are super productive. Then what I also always want to have is CodeQL, which will build your sources and investigate possible security vulnerabilities. Build and test: I'm using that for pull requests only, because if you fire them on every push and the tests are heavy, it's not fine. Release will produce the binaries and packages when a new release is created. And Docker is the same as release: it will produce a special tag and an image and push it. But for every commit, it will also produce a special Docker image which you can just try immediately, right? Like a nightly build or whatever. Dependabot is very simple. For example, for almost all my repositories, I first want to check the Go code itself, so package ecosystem gomod. And I want to use the latest GitHub Actions as well. That's important. 
So I check them daily, and that's usually enough. For CodeQL, nothing special. When you create these actions from the GitHub interface, it will fill in all the fields for you. I never changed anything important there, only removed some comments, and we are fine. Build and test: I hope you can see what is going on there. No. Sorry, I don't want to switch to the editor because I cannot work like that. Yeah, so I will tell you. I usually run all my tests on three different platforms: Windows, macOS and Linux. The good thing is that all the workers already have PostgreSQL installed on them, but the thing is that it's not started. So for you, the essential thing is to start PostgreSQL and then run your tests, and you are fine. Usually the version of Postgres is two or three minor versions behind, but it's okay. If you want just the latest one, you can go with the Docker images instead. Yeah. So there in the build action we have a linter, so, no, without the linter we cannot accept any changes or pull requests. And yeah, then we are generating the coverage report to put it everywhere. See, 99% of the code is covered. Yeah, let's lie. Okay, so release is a little bit simpler. As I said, we are using GoReleaser. It's an absolutely fabulous piece of software that can produce everything for everything. So the GitHub Action code is simple, because everything is stored in the YAML configuration file where you set up the name, the architectures, the OSes, everything. And then it's just, okay, let's check out our code and let's release it. And the cool thing about it is that GoReleaser will create a changelog automatically for you based on your pull requests. So when I'm releasing, I just copy-paste it, sort it by what's added, what's fixed, and whatever. And I'm done. The release is very simple for me. Absolutely. Before that, I might spend two days on each release to produce all these binaries, et cetera, et cetera. 
Okay, Docker. GoReleaser can produce Docker images. I'm too lazy to rewrite this. And yeah, I'm using another special GitHub Action to build the Docker images. You can build them for every possible platform. And this Apple M1, M2 silicon thing, whatever, it's just working. Okay, takeaways. Go is popular. Devrim doesn't like Go. Yes? So maybe this is the last time you see this talk, "When it all GOes right". He said that I need to switch to Rust. So maybe next time I will come out with "When the Rust goes right". So no, no. Should I stick to Go? Okay, we can do both. Yeah, so a lot of developers are using databases with Go. Of course. You can use whatever editor or whatever operating system you want. By the way, a lot of people are using Windows, which is not common in the Postgres community, for example. But it's fine if your system can produce whatever you want. You don't care. You just work on what you have, right? So, Kubernetes: you can use whatever you want with Postgres. If you want to use an ORM, please do. But remember, you're responsible for that. Use it wisely. Otherwise, you can use the lib/pq package or the newer pgx. And you can use whichever of GitHub, GitLab, Bitbucket you want. And the most amazing thing about Go is the backward compatibility. You can build whatever in the future. Yeah, in the future. Whatever code you want. And it will still be compatible with the oldest something written ages before. Even now, when we have generics, when we have these cool things in Go, they are still compatible with those old things. Okay, so yeah, don't be a stranger, check my GitHub account, check our blog. Yeah, some of our projects. And if you have a question, or maybe you have a question for Devrim about why he hates Go. No, I don't hate Go. It's just that I don't package it. I'm not sure what the problem is. I provide everything you need. Take my binaries and put them in your packaging, whatever. 
The distributions don't like Go, because either it should be a binary or the whole internet. Not the internet. No, no, not the internet. There are only five dependencies for a newly created application to connect to PostgreSQL. There is only one direct dependency, pgx, and there are only four indirect dependencies. Two of them are libraries from Google. One of them is a library from... I don't remember. And one of them is by the same author, because he's using this package in other ways. Yeah, so, questions? Hi. A couple of slides back, you showed a certain functionality where you do a select and then you pass in a structure in order to get the results. Yeah, a couple back. I have a question on that one. Yeah, that's the one. Exactly. So you're doing a select star, and then some stuff, and then you iterate over the rows. Regarding the actual connection: if you're running this on one machine and you're running Postgres on another machine, there is a network, of course, in between. There's data going back and forth. Is all the data being sent from the PostgreSQL server to your Go program or not? The reason I'm asking is that, let's say, there's a fourth field, which is a huge JSON field or a huge binary field, which contains something like a megabyte of data per record. If I'm asking for 100 records, a thousand records, is it then sending a gigabyte over the network, or is it only sending those three fields? Okay, I got you. Yeah, thank you. Very good question. So in this particular case, there shouldn't be a star, first of all. We should always list the columns we want to have. I'm just too lazy, and this is not my code. But, you know, it shows. Yeah. So in this case, yes, everything. Can you do it automatically? Can we add columns from the database automatically? Yes, we can. But for that, we should use the sqlc package, which is exactly for that. So it works with pre-build hooks or whatever. 
So when you say go build, you have special SQL files. Like, I want this field from that and that, and it will go through them, and it will automatically build the appropriate structures for Go, and then you can use them. Yeah, it's just a lot of information. I cannot put it all into one talk. Yeah, but yeah. About whether we are loading everything at once, yes, we do in this case. But the Postgres protocol itself supports row-by-row functionality, and it's possible to use that functionality with this package. So you can, like, yeah, say that. Hello. Thank you for the talk. I have one question about this driver. Actually, it's kind of the only way to work with Postgres, in my opinion. And I'm a little bit worried that this driver is not supported by the Postgres community, let's say. It's supported by someone. And what is the life cycle of this software? Maybe it will die. What about new features for it? All these questions arise when you work with it, because if your management says, okay, let's use Java, in Java this Postgres driver is kind of stable, and you know that you will always have a new version. What about this software? Yeah, thank you for the question. So versioning and ownership, who is the owner, and that kind of stuff. So as the Postgres community, we support only the C library, which is libpq, and Java, which is JDBC, right? That's all. All others? Yeah. Who uses C++? All other libraries are maintained by someone. By the way, lib/pq is not a standard library in terms of being made by the Go team. It's also maintained by one person. So what we do in this case is we just fork it and use it if we want something new. If I did everything better than the maintainer, the owner of pgx will accept my proposals and we are synced, right? If not, I will beat them and I will be popular. I have a question regarding testing strategies in CI/CD. So you have shown that it's possible to mock PostgreSQL, right?
But sometimes you are relying on some feature of PostgreSQL, or possibly you are relying on some Postgres extension. Do you want to test with a real PostgreSQL? Yes. What can you recommend? I have seen that in your CI/CD you are executing some psql commands. Do you have a dedicated instance for running tests? No. So my GitHub Actions are using the pre-installed PostgreSQL on the GitHub workers. They already have Postgres 15.1, probably, nowadays. So I'm okay if they are behind by several versions. I don't need to test for a specific feature or bug or whatever. But if I want to, in my GitHub Action I may specify the Docker image against which I want to test. So for example, if I want to test the latest master from the PostgreSQL community, I will build my own Docker image and will run my tests against it. And I'm fine. Okay, thank you. Excellent from the back. Yes. |
The Human Factor: Why Database Teams Need Crew Resource Management |
So, hello, everyone. Thanks for joining us today in the Postgres devroom. Our next speaker is Chris Travers, who flew all the way from Indonesia. Thank you. And he's going to talk about why database teams need human factors training. Thank you. Thank you very much for coming to this talk. I think it's certainly one of the topics I'm most excited about when it comes to database-related topics, actually, even though I'm very, very much into Postgres. This topic really excites me. So, just introducing myself a bit for those who don't know me. I have a bit over 24 years of experience in Postgres, so almost 25 years. I've built accounting software on it. I have worked as a database administrator on very large databases, built database teams, managed infrastructure, a lot of that sort of thing. So, I have a wide range of experience. I have submitted several patches to Postgres, of which one has been accepted, and I'll probably be submitting more patches at some point in the future. So, I absolutely love Postgres for its extensibility. And with that, of course, comes some complexity, some difficulties in maintaining our mental models about how things are actually working. And especially if you're working on things at scale, it's really easy for your mental model not to match what's actually happening, and then sometimes we make mistakes and things happen. So, this talk is basically going to be about two things. The first thing is something our industry doesn't do very well, and that is how we look at human error and how we can possibly do that better. I also want to talk a little bit about how we can improve, and what benefits we can expect from some of the early steps that we can take as an industry. So, this is very much a talk about database people. It's a talk about us. It's much less a talk about a specific technology. But a lot of the same technical approaches apply.
So, I want to give a few thanks. First of all, Timescale, for paying me to come and do this. It wouldn't really be feasible for me to fly from Indonesia without them. But I also really want to thank two of my prior employers. I want to thank Adjust, where we were actually able to bring in aviation training on these human factors. So, we brought in a company that did training for pilots, as well as doctors. And a lot of the training was really eye-opening, and it allowed us to do some things that we couldn't do before. This was really a grand experiment, and it had a number of really big, tangible benefits. And then, of course, I also want to thank Delivery Hero, where I worked after that, where I was able to work with people and evaluate both the successes and the shortcomings of what we had done at Adjust, and further develop some of these ideas. So, these are important areas, and I would also say that I'm involved in trying to help implement some of these things at Timescale as well. So, introduction. This is a completely rhetorical question. You don't have to raise your hand if you don't feel comfortable doing so. But how many of us have been on a team where somebody has been working on a production database while they're drunk? Yes, I see. I mean, as we go through our careers, almost every single one of us will probably have that experience. And yet, how many times does it cause a major problem? Almost never. At least, I've never seen it cause a major problem. Now, part of that may be the context in which it happens. Like, you know, the subject matter expert has been out partying, was not really on call, and has now been called in on an escalation. Somebody else may be handling a lot of the wider incident strategy stuff, where maybe alcohol might be a bigger problem. But at least in these contexts, alcohol doesn't seem to be a big factor in the further disruption of things once stuff's going on. But let me ask another question.
How many people here have seen a case where a major incident or outage happened because somebody made a mistake because that person was tired? See? So, we valorize the thing that causes us problems. Well, we demonize something that probably does cause some problems, no doubt. And maybe the demonization helps prevent more problems, but we valorize something that causes a lot more problems. Okay? Why is it that we do that? How is it that we can stop doing that and actually rethink our priorities? Now, on one side, this is a good example of human error, right? We can talk about all the factors that go into that prioritization. But on the other side, it's also partly because we don't understand human error in our field, right? When we do a post-mortem, if somebody made a mistake, we just say, oh, human error, and that's it. I'm going to come back to that point in a few minutes. So, drunkenness versus fatigue. Now, say one group of people each drinks a bottle of wine, and the other group of people has their sleep disrupted, so they're only sleeping four hours, and then they get up a few hours later. Give both groups complex tasks to perform. Who's going to perform worse? Sleep deprivation causes heavier cognitive deficiencies. Four hours of missing sleep is worse than four drinks. Now, obviously, there are some tasks where that's not the case, like driving a car or something, because you also have coordination problems induced by the alcohol. But from a pure information-processing standpoint, having only four hours of sleep is worse than drinking a bottle of wine, and it's going to last at least through the next day. So, it's totally, totally worth thinking about that. So, now that I've talked about one aspect of human error, one thing that can induce a lot of human error, I want to give a brief history of why this field became really big in aviation.
So, back in the 1950s and 1960s, 80% of aircraft accidents or incidents were blamed on pilot error. Notice I didn't say human error, I said pilot error. I'm going to come back to that distinction in a moment. In fact, I think the number might have been closer to 90%. Today, our incident and accident rates in airlines are well over 100 times lower than they were at that point. So, if you think about it, improvements in the technology and maintenance of the airplanes could only account for maybe 10% of that improvement. All of the rest is due to a much better understanding of the question of human error. There's been a shift from focusing on pilot error to focusing on human error. When the aviation industry talks about human error, they don't mean somebody made a mistake and leave it at that. They have a rich taxonomy for understanding kinds of human error, the causes of each of these particular types of errors, and practices to try to mitigate them. The way I would actually describe this difference is this: suppose you're debugging software that connects to the database, and every time you have an error in your query, or there's something where the database can't fulfill your request, it just says something went wrong. You're not going to be able to debug that software at all, and you're probably going to have a lot of trouble. That's kind of what we do currently when we say human error. We just simply say the person made a mistake, and that's usually as far as we look. The aircraft industry has actually come up with a much richer understanding of this topic and a richer system of almost like error codes that they use when they talk about these issues. The reason is that it's a very unforgiving environment. If you make a mistake, you might or might not be able to recover, so you have a lot of training on this, and now the chance of a massive disaster is down to probably one air disaster per billion takeoffs, which is really impressive.
We'd love to have that. So they've made this shift. They've also made a shift that we've already made, and it's worth pointing this out, and that's that they've made a shift from individual responsibility to collective responsibility. In our terms, we call that blameless culture. Somebody makes a mistake. We don't blame them. We don't go, hey, stop making mistakes. We try to find some way to keep that mistake from happening, but because we don't have a clear understanding of this topic, we try to solve this in ways that maybe aren't as effective as they could be. I want to give one really good example of a watershed moment. Actually, before I talk about that, let me just discuss David Beaty's contribution quickly. Beaty was a cognitive psychologist and pilot in the U.K., and in 1969 he wrote a seminal book called The Human Factor in Aircraft Accidents, where he basically looked at the kinds of mistakes that happen and the kinds of circumstances that lead to those mistakes. There are newer versions of that book out now. It's actually worth reading, but probably not best to read if you're a nervous flyer. As a good description of how we break down error, it was the industry's starting point. Ten years after that came what I think was the watershed moment in how this became really big within aviation, and that was the Tenerife disaster. The Tenerife disaster is still the most deadly aircraft accident in history. It happened on the ground in Tenerife due to a variety of factors. I'm not sure how much detail I should go into in this talk, but the end result was basically that one 747 tried to take off down a runway with limited visibility without a proper takeoff clearance, and they hit another 747 on the ground. A clear case of human error, and the Spanish report on this more or less blamed it on pilot error. This guy tried to take off. He didn't have the clearance. It was his fault.
The Dutch report, which is often criticized in some documentaries that I've seen on this, was very, very different. What they actually did was ask, why did he try to take off without clearance? What was going through his mind? How did that mistake happen? The thing was, he was an extremely experienced pilot. He was their chief pilot. He actually didn't fly airplanes that much. He was mostly sitting in simulators. The thing was, at that time, when you were the senior pilot in the simulator, you were giving the clearances. Visibility is low, there's pressure to take off, it's a stressful situation. He goes back to what he's used to doing, which is assuming he has the clearance, because he's used to giving them to himself. Airlines don't do that anymore in their simulators, for obvious reasons. The Dutch report actually became the basis for how the aviation industry has looked at human error ever since. As a result, what we've seen is this massive, massive improvement in safety. Every pilot in every airline gets this sort of training, and it has made our flights much, much, much, much safer. The question is, can we benefit from the same thing? The answer is yes. We actually can get a lot more benefits from it than just reducing incidents and recovering from them. In fact, if you look at the standard definition that people give of crew resource management, it's the use of all available resources to ensure the safe and efficient operation of the aircraft. If we can use all of our resources to improve both safety and efficiency, that's going to make our jobs better, we're going to be more productive, we're going to be happier. This is actually a really, really important field that I think we need to improve on. Now I'm going to talk about how we look at human error in the industry. Typically, in DevOps and SRE, we have one answer to human error. And that is automation.
If somebody made a mistake, we're going to automate that away. We're just going to add more automation. We're going to add more automation. It seems like a great idea. Computers are infallible, we're fallible, so we're just going to use the computers to prevent the mistake. The problem with this is that the IEEE has done a bunch of research on something they call the automation paradox. The automation paradox is that the more reliable the automation is, the fewer opportunities humans have to contribute to the overall success of it. I think I'm going to take a little bit of time here to talk about why that is the case, and then it'll get reinforced in the next section when we talk about why we make mistakes. But to start with a basic summary, obviously we need automation, because there are certain kinds of tasks that we're actually very bad at following, and there are certain kinds of requirements where automation can really save us a lot of safety considerations. So steps that have to be done together really should be automated so that they happen together. But the problem with automation that is just done reflexively, at least according to a lot of the research that's come out of the IEEE as well as many of the aviation study groups on this, is that simply throwing automation at a problem can actually make human error more common and more severe, and then when things are out of whack, you have no possibility at all of preventing a major incident. Part of the reason here is that we process all of what we see through a mental model, and so when we add more complexity, when we add more automation around a lot of this, we make it really, really, really hard for us to keep that mental model reasonably in sync with reality. And then when something goes wrong, we can spend a lot of time and effort struggling to understand what's going on, or we may reflexively react in ways which actually make the problem worse.
So automation isn't the answer, it is part of an answer. And reflexive automation, oh, we had a problem, let's automate it away, is not the answer. Now, I mentioned just a moment ago this issue of mental models. We humans operate in a world that's very different from the way computers operate. Computers are basically systems that mathematically process inputs and produce outputs, right? And therefore, computer programs basically operate in a closed world. We humans don't operate in closed worlds, right? We all operate in open worlds. We have the situation where we know we don't know everything. We know we don't know some things. And then there are other things that we don't know we don't know, right? In some cases we know we don't know what we don't know, right? But in order to function, we have to maintain these mental models, and those mental models are necessarily a simplification of reality. And so when something's going wrong, we have to dig into how we think the system works and we have to kind of go through that. And the more complexity we throw into our automation, the harder that process becomes, right? So automation, as I say, is an important tool. I'm going to talk in a few moments about good automation versus bad automation, but it's something that we can't rely on to solve the human error problem. So I mentioned that I would talk about good automation versus bad automation. I think this is really, really, really important here. So oftentimes what I've seen happen is that you end up with large automated systems, whether they're something like Ansible or Rax or Kubernetes or whatever. And oftentimes there isn't a clear understanding of how these things work underneath. Now, if people do understand all of that, and they've built in a lot of these things, then a lot of that's going to be a lot easier, right? So good automation is basically going to be a deliberate and engineered process, right?
Rather than something that's thrown together in the course of the messy world of operations, it is a deliberate process which is designed around two factors, or three factors, actually. The first factor is the system, right? The second factor is the people, and we usually forget that one. And then the last one is that we actually need to be thinking about the human-machine interaction, right? So good automation takes the people into account. Good automation is something which has built-in decision points where the person can actually sit there and say, hmm, this isn't going right. We're not going to proceed, right? And good automation is then a well-understood process, right? So the other thing that is really important as we look at automation is this issue of feedback, right? Because the more we automate, typically the more we insulate the individual from the feedback of the individual steps, right? So it's really super important to sit down and think about what's the person going to see? What's the human going to see? How's the human going to be able to interpret this? How much feedback do we want to send? Do we want to send everything that we got? Do we want to send some summary of it? And those are going to be decisions that have to be made deliberately, based upon the context of what we're doing, as well as a clear understanding of what the failure cases of the automation are. And then of course people actually need to be trained on what the automation is actually doing under the hood so that they understand it, rather than just simply saying, oh, push button, okay, everything good. So the way I always look at it is, a lot of people think checklists are a step towards automation. I think that automation should be a step towards a checklist, okay? The relationship should actually be the other way around, so that you're thinking about how do I want the human to interact with this?
How do I want the human to perform these? Where do I want the human to be able to say, this isn't going well, we are stopping? And those are the sorts of questions and designs that we have to think about when we're dealing with especially these sorts of critical systems like databases, where if the database is down, you know, the business may be down. So now I want to talk a little bit about why we make mistakes. Now, as I mentioned before, computers operate in a closed world, right? They get inputs from us. They do processing. They give us outputs, right? We live in an open world. We experience things, we perceive things. What we perceive is not a complete model, or even a complete aspect, of what our mental models are. We make inferences based on incomplete data, okay? And in order to function in this world, we have had to adapt and develop certain kinds of cognitive biases, okay? And a lot of times people look at this and they go, oh, it's not good to be biased. Bias is a bad word. We don't like biases. But the fact of the matter is that if you could get rid of all of your cognitive biases, you would be unable to function, okay? Confirmation bias, of course, is one thing that we tend to be aware of. But here's another one, continuation bias. Continuation bias is the tendency to continue to follow a plan you've put in motion, even when you're starting to get good indications that that's not a good idea, okay? If you didn't have continuation bias, you might have to sit down and rethink your plan continuously, over and over and over again, right? That wouldn't be very helpful. So continuation bias, just like confirmation bias, actually helps us function in the real world. The problem is, it can also lead us into situations where we do the wrong thing. And so understanding these biases and understanding their implications is a very important step towards being able to notice when they're causing problems and start to trap those sorts of problems.
So rather than trying to eliminate our biases, which is, I think, the way I typically see people trying to do this, it is better to think about what kinds of problems the biases can cause and how we can detect and trap those problems, right? And there are a large number of these biases, right? Expectation bias. Expectation bias is related to confirmation bias. It's the tendency to filter out perceptions that don't match your expectations, right? This happens today in a lot of environments. It happens in our industry. It obviously still happens in aviation, fortunately usually without serious problems. The most common problem it causes there is that the plane comes up to the gate, and the pilot says, disarm doors and cross-check. Somebody misses the door that's going to be opened. The other person cross-checks, and expectation bias kicks in, and they don't notice that the door is still armed. They go to open the door, and guess what happens. The emergency slide deploys. It doesn't harm anybody on the airplane, but it's going to make a bunch of people unhappy, because the next leg of the airplane's flight is going to get canceled. And that's usually the worst that happens. But these are important things, and we have to recognize that these sorts of biases are going to happen, and that our ability to maintain situation awareness in the face of these biases is very much tied to how aware we are of the kinds of problems that they can cause, right? Because, you know, we form this mental model. We're going to interpret things according to that mental model. We're going to continue our existing plans and things like that. And when somebody says, hey, wait, maybe this isn't right, then that's suddenly an opportunity to go, hey, my bias is maybe leading me astray. Let's sit down and figure out what's going on and verify. And human factors training actually tends to include exercises or training specifically aimed at doing that.
So, the second major issue is reversion to prior behavior under stress. It's something that happens to all of us. When we're under stress, our focus narrows, right? We start filtering things out and we start resorting to habit. What this also means is that in a database team, when there's an outage, if we're not careful, we will resort to the things that we're used to doing, even if we have decided that they're maybe not the best ways forward. And, you know, I've watched cases where incidents happen and, even if a company has been really trying to move towards a more collaborative approach to incidents, suddenly when the incident happens, people are getting stressed out and they're going back to this hyper-individualistic cowboy incident response. And a lot of that is just simply due to stress. It's a very well-documented part of the stress response. One thing that we got at Adjust with the human factors training was a strong understanding of that problem, as well as a good understanding of how to measure the stress so that we could actually keep an eye on it. Another major point that causes problems, and I've alluded to this before, is fatigue. How often do we see people who have a rough on-call night come back in the next day and start working on stuff? How often are we willing to say to that person, no, go home, get some rest, I don't want you working on this stuff right now? Right? How often have we seen people who are on call for an extended time period and a rough shift make mistakes after several days of continuous sleep interruptions? You know, do we start to think about the question of whether, when this happens, we should be switching these people out more frequently? In the airlines, before any flight happens, the flight crew get together and they check how each other are doing, right?
And there is an expectation that there is a standby flight crew, so that if you're not feeling your best, you can say, hey, I didn't sleep well last night, I don't want to fly. And that's another thing which has really helped increase safety, and something we should probably think about doing. You're getting tired from the on call, time to switch you out. Do we? I have never worked anywhere that did. So a final major point on how and why we make mistakes has to do with a term in human factors lingo called workload. Now, I don't like this term in this context, because when we say workload in here, everybody is thinking, oh, I have so many things I need to get done this month. But on the human factors side, workload doesn't mean over the next month or over the next week, although planning that can be helpful. What it really means is how many tasks you are having to pay attention to right now. How many people here can actually listen to and understand two conversations at the same time? Nobody? Maybe it's possible for some people to train that. But there are certain kinds of things that our brains can't parallelize very well. Understanding where those boundaries are. Switching and flipping between tasks. How much can we reduce that workload? That's actually really important, because one of the things I've seen happen is you have your standard runbook, and the way most people write their runbooks is: step, explanation, discussion of output, next step. What happens at three in the morning, if you've never done this particular process, is: step. Okay? Yes, it did what I expected it to. Where's the next step? It becomes really, really, really easy to miss the next step in your checklist, or to miss critical details that are kind of obscured by the fact that now you're having to read through paragraphs at three in the morning while troubleshooting a broken system.
One of the things that I did while I was at Adjust is I started writing some of our, I guess I would call them unusual procedure checklists. Non-normal procedure checklists. So things that you do when something goes wrong. Things that you might have to do at three in the morning without having done them in any of the previous three months. And what I ended up doing in this case, and this is a good opportunity to talk about some of the main benefits of this sort of training, is that we basically used the following format. It's a bullet point: here's what you can ideally copy and paste into the terminal. Then expected output and warning signs, all in indented bullet points, and then, unindented again, the next bullet point. So they're hierarchical, it's easy to scan, and your main points are all really, really, really short. And then all of the major description that would be in those paragraphs would be moved into footnotes. And those would all be hyperlinked. So you run a step. Something doesn't look quite right, you want to see the longer description, you click that hyperlink, you come down to the footnote, you read the whole thing, and then decide if you want to proceed or not. And what this allowed us to do was to take the standard platform team people who are on call and actually have them do Aerospike maintenance at three in the morning on, as I say, the super critical high-speed database system. And before that, every time there was an Aerospike issue, it was an automatic escalation. And it was an automatic escalation because we didn't trust that they would be able to do it or make proper decisions around it.
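The checklist layout described above could be sketched as a small data structure. This is only an illustration in Go: the type and field names, the example command, the host name, and the footnote marker are all hypothetical, not the format actually used at Adjust.

```go
package main

import (
	"fmt"
	"strings"
)

// Step is one short, scannable checklist item. The long explanation
// lives in a numbered footnote so that the main line stays brief.
type Step struct {
	Action   string   // what to copy and paste into the terminal
	Expected string   // what the output should look like
	Warnings []string // signs that mean: stop and read the footnote
	Footnote int      // number of the footnote holding the long description
}

// Render prints each step as a top-level bullet with indented sub-bullets,
// then returns to the top level for the next step.
func Render(steps []Step) string {
	var b strings.Builder
	for _, s := range steps {
		fmt.Fprintf(&b, "* %s [%d]\n", s.Action, s.Footnote)
		fmt.Fprintf(&b, "    * Expected: %s\n", s.Expected)
		for _, w := range s.Warnings {
			fmt.Fprintf(&b, "    * Warning sign: %s\n", w)
		}
	}
	return b.String()
}

func main() {
	steps := []Step{{
		Action:   "pg_isready -h db1", // hypothetical host name
		Expected: "accepting connections",
		Warnings: []string{"no response", "rejecting connections"},
		Footnote: 1,
	}}
	fmt.Print(Render(steps))
}
```

Keeping the long description out of the Step itself mirrors the point made in the talk: at three in the morning the reader scans the short bullets and only follows the footnote when something looks wrong.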
But since we formalized it into checklists and we offered some training on them, and we tried to make sure that people understood the overall considerations of the processes, they could do some basic stuff and then call us if there were questions that weren't obviously answered by the documentation. A very, very good tangible benefit: it meant that instead of several people waking up in the middle of the night, it could be done by the on-call engineer. So that's a really good example of the benefits that come out of paying attention to that workload issue and the sensory overload that happens, which is much more serious at three in the morning than at three in the afternoon. So it's really important to recognize that at this point, we're no longer really talking about human error being somebody made a mistake. Instead, we're talking about the need to be able to debug the person and why they made the mistake. And this is something which very often we don't even try to do in our industry, but we should. This requires that we have a really good taxonomy of types of mistakes. So that we can say, okay, situation awareness lapse because of sensory overload from too many monitoring alerts going off. A very common one that happens in our industry. It's also something that's caused airplane issues. So if we understand that, we know they lost their situation awareness. They couldn't understand where the problem was. This happened because they had too many alerts they were trying to focus on. Now the question is, are we actually throwing too many alerts? Do we need to think about maybe prioritizing things differently? Do we need to rethink how we do alerting? And suddenly, we have a dimension for looking at these problems that we currently don't have. Instead, currently, what happens at most places I've worked is, okay, something went wrong. We didn't spot it. Therefore, let's add another alert over this.
But when I was at the delivery here, we actually had a major incident where somebody, again, missed a problem relating to a database, relating to a Postgres instance, I believe, if I remember right. Despite the fact that it was well alerted. I was talking to somebody afterwards and he says, do you know what the false positivity rate of our alerts are? And I'm like, no, it's like 99.8%. How do you expect somebody to spot the problem when almost all the time our alerts don't mean there's a real problem? Okay. Now what he meant by false positivity isn't what I would mean by it. I mean, there were problems that the alerts were alerting about, but they weren't like customer facing problems, right? So the second thing is we need a really good understanding of our cognitive biases and the functions that they provide to us and also the problems that they can lead us into, right? So one of the good examples is, hey, look, you know, I know you're about to do this. I'm not sure that's what the problem is. Can we think about this first, right? And as soon as somebody says that, that means that they're saying, my mental model is not the same as your mental model. One of us is wrong. We should probably figure that out before we proceed. Figuring out how to do that's really, really important, especially when we talk about social factors involved, right? It's one thing to do that with your peer when you're on an incident call and there are two of you there. Something very different to do when the person typing the words is very senior and you're very junior and there's somebody C-level popping into the call to ask for an update. I've been there. I've done that and yes, no, I have not raised the issue and I should have, right? You know, figuring out how to make these sorts of interventions and how to understand the intervention and how to respond to it, those are things that we actually need training on, right? We also need training on the social factors. 
We need to understand how power distance affects these. What happens when there's, you know, the C-level person in the call? How does that change your social interactions? How does that change your interactions in terms of debugging, right? Those are important things and that's one thing that we can get some really big improvements on relating to this. Finally, it's really important for us to be able to get to the point where we can contextualize the person. In other words, since we operate as humans in a relatively heuristic manner, right? We need to understand what the situation the human was in when the mistake happened and that's another thing that these sorts of trainings can help with. So, I've talked a little bit about social factors here. Power distance is what it sounds like, you know, how big the difference is between the most powerful person in the interaction and the least powerful person in the interaction, where we want it to be kind of, you know, not quite equal but much closer instead of like this, maybe more like this and, you know, figuring out how to structure things so that power distance doesn't cause a problem. That also means giving people good training on how to intervene when they see somebody much more senior about to make a mistake, you know, and you want to intervene in a way which is not threatening and in the event where there's somebody even higher in the call isn't going to be perceived as humiliating, right? Having good training on this and how to communicate in those cases is really, really important. And a lot of this ends up playing out into trying to create a work relationship between the people on the team which is very heavily mutually supportive and also kind of helps prevent or checks and traps the kinds of mistakes that each of us can make. So let's talk a little bit about the ideal role of humans in database operations. Now, we kind of need to understand this well. Okay, ten? Okay. Who's checking? Five? Okay, perfect. 
We kind of need to understand this. Humans need to be in control. We need to be the decision makers. We need to be the people who can say, this is what I think is going on. Let's go ahead and try this process. And halfway through that process goes, this is not going well. Let's back off, rethink, and make another decision, right? Partly because we're also operating heuristically, we can do things that computers can't, right? We need to maintain really good situation awareness. This means we need to have transparency in our complex automation. We need the automation to be built around helping us, not replacing us. And to do this, we need to be well rested. We need to be a clear peak capability ideally when we're in the middle of an incident. Now, we may not be able to completely manage that last part, but if we can take steps towards it and we can try to improve, we can do better, right? So a lot of this training is, at least what I've gotten out of it, is really important. What I think is really important about this, I'll just talk quickly about how to go about doing it, is that if we can get the organizational leverage behind the training, then we can actually turn the promise of the training into the reality. Sometimes you can't just teach people something and then have the punishment abandoned them. That doesn't work. So as an industry, we treat human error the way pilot error was treated in the 1950s. We have a whole lot to learn from aviation. Those lessons are already being played out in medicine and many other fields today. We need to do what we can to learn from it also. And it's really important to recognize that we can get really good improvements in reliability, efficiency, speed of development, all these sorts of things if we can better work with the human side of things. And I'm not talking about human managers rating performance. I'm talking about people on the team understanding performance for themselves and others. 
I just want to say that the three pieces of this are: getting trainers in who have experience; an organizational commitment to make it happen; and then internally building your own programs, your own recurring trainings, and your own training for new people. So that internally you have a big culture around it and you have experts who can think about it when it comes to a post-mortem. So that's what I have. Any questions? Thank you. That was an amazing talk. Do you have any recommendations for further reading if you can't bring in experts? So this is a field in which aviation has a massive textbook industry. I think probably the best, sort of the most accessible book I would recommend starting with is the more recent versions of David Beaty's The Human Factor in Aircraft Accidents. I think the most recent version is called The Naked Pilot: The Human Factor in Aircraft Accidents. The title is just referring to exposing the inner workings of the human piece of the aircraft. But again, if you're a nervous flyer, probably look for a crew resource management textbook instead because it may be less nerve-wracking, it may be less intimidating, but it will have the information there too. Do you have any recommendations for testing or drilling your processes like those checklists? Yes, I do. One thing that I think we really should figure out how to do as an industry, and I completely believe in this: obviously, the Chaos Monkey idea from Netflix could be exploited to do this if you can also build war games around it. But the thing is it's really important to have drills, which means oftentimes you've actually got to simulate or create some sort of a potential incident that you have to come together and resolve. Now, ideally, you need to figure out how to do this without threatening your customer-oriented services. In some cases, maybe the cloud's a really good opportunity for that.
But having those sorts of drills, maybe once a quarter or twice a year or something, can really give you an opportunity to spot problems, figure out improvements and actually go figure out what to do about those. Just kind of building on that last point is how do you justify the expense in time or money? Given that if this is successful, then nothing goes wrong. So it can sometimes be the outcome of success is that you're spending a lot of effort on apparently doing nothing. I don't believe that, but that's a reasonable thing that gets asked. How do you go about justifying the time or the money on this after it's successful? What I've usually done in the past is I make my points about, yes, we're going to improve our incident response. This will reduce our time recovery. It'll improve our reliability, et cetera. Maybe it'll improve our throughput organizationally. But then usually, people don't listen. And then usually, there are more incidents. And then you can come in and say, you know, these are specific problems that we had here where this training would help. And I usually find that after two or three of those, people start listening and go, oh, really? Maybe there is something to this. Thank you. |
Bulk Inserts With PostgreSQL: Four Methods For Efficient Data Loading |
Next speaker is Ryan. He's going to talk about bulk loading data to Postgres. Hello. All right. Can you hear me? There we go. All right. Thank you so much for coming. It is a pleasure to see you all. This is my first time in Belgium and in Brussels. It's been a great couple of days. Specifically with FOSDEM, it was really interesting coming into the event. My very first tech event many, many years ago (I hope I look younger than I am) was at a small university very much set up like this. And it just brought back a lot of memories of having packed rooms and the stadium seating and the wooden tables. So it's been really fun to be here and I appreciate the opportunity. This is just briefly a little bit about me. I currently work at a company called Redgate. I've been there a few months. You might know them by a tool they acquired over the last few years called Flyway. It is a database migration tool. And they've been very well known within the SQL Server space and Microsoft .NET for many, many years. And they're bringing some of that technology into the open source database platforms as well, for migrations and a way to generate scripts and things of that nature. There are some of my details, blog and so forth. Very quickly about me, I thought this was relevant. So my wife and I have six children. So I know a little bit about bulk loading things like vans and cars. And so I felt that was somewhat relevant. If you ever run into me again or you want to talk to me yet today and get me talking about something other than Postgres and databases: family for sure, because I have a lot of it. Music. I am an amateur beekeeper. I've been doing that for about five or six years now and I love it. I actually do live streams sometimes on YouTube when I'm checking out the hives. And I roast my own coffee. I didn't drink coffee until about seven years ago and now I can't go without it and I love it. So if you want to talk about any of those, I'll chew your ear off for a couple of hours.
You can get this presentation. I actually uploaded it before this talk. It might be a first for me in a long time. So the newest version is up there of these slides and the scripts that you're going to see in a minute. And I'll put this up at the end again in case you miss it. Four sections. Number one, death by a thousand inserts. Then we're going to actually talk about four plus methods of bulk loading. I say plus because there's definitely more than four and lots of little nuances. And as I've given this talk, some people have asked questions and so I've tweaked some things and so I just didn't feel like continuing to change that number. I'm going to give you a couple quick demos and then we'll talk, that is not the fourth point. Well, I did everything else but change at one point. I went over this many times. Just a couple parting thoughts. Death by a thousand inserts. What do I mean? So what I used to find, so I worked at timescale. I didn't put my history up here, but I worked at timescale for a couple of years. Partially open source extension into Postgres for time series data. And we would see a lot of folks complaining, frustrated about the rate of insert of their data. And so we would try to help them and recognize that this is something we're all doing all the time, all day long. And yet we often are underperforming, not getting the results that we'd want. It could be any kind of data. It might be binary data in some way. It might be large objects. It might be JSON that you're parsing and you're doing something with. It could just be a simple CSV file with some of your data science work that you're doing. But you need to get a lot of it and you need to get a lot of it into Postgres. And what we would find is we would see work like this often because it just makes sense. We're so used to typing, insert into a table and some columns and insert some data. 
And we would see stuff like this, and I have to admit that I've certainly done this many times myself, particularly when you're rushing to figure out how to get something done, get something in, how to parse that data packet you just got. And you don't realize what you just did: you created a loop of thousands or millions of individual insert statements. And that's really, really slow, right? And it's slow for a couple of reasons. These are just a few to kind of put out there. But the reality is it's so easy to forget when we're using our tooling that every one of those insert statements has some kind of overhead, some of which is listed. Now, there are certainly other factors here. But this is the one part that most developers, particularly newer developers, forget about. That insert statement doesn't just start here. It has to go across the wire. The database server has to do something. Depending on your commit settings, you might have to get a response back. Then your program has to do something else. There's latency. There are indexes and constraints. And we forget about all those pieces. And now you say, I want to insert a million rows a second, and you wonder why it's not working when you're doing it with individual statements. The analogy to me is, and I almost put my children's faces on here, if I wanted to shuttle my children somewhere (for those of you who just came in and don't know, I have a lot of kids), if I shuttled them one at a time with a bicycle, it would just take a long time, right? If we do that data packet one at a time, insert after insert after insert, it just ends up being slow. Now, if we could take more of those things, maybe we take something like, you know, in the analogy, this pickup truck. And we can at least get a few more packets in there. We have to take fewer trips, and we get the same amount of data. The ultimate goal is we want to try and figure out that ability. What's the largest payload?
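To put rough numbers on the bicycle versus pickup truck picture, here is a back-of-the-envelope sketch. It only counts time spent waiting on network round trips, and the 1 ms round trip and the batch sizes are assumed values for illustration, not measurements from the talk.

```python
import math


def estimated_wire_time_seconds(total_rows, rows_per_statement, round_trip_ms):
    """Rough lower bound on time spent waiting on round trips alone.

    Ignores parsing, index maintenance, constraints and commit cost,
    so real runs will be slower than this estimate.
    """
    statements = math.ceil(total_rows / rows_per_statement)
    return statements * round_trip_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical: one million rows over a link with a 1 ms round trip.
    for batch in (1, 1_000, 10_000):
        secs = estimated_wire_time_seconds(1_000_000, batch, round_trip_ms=1.0)
        print(f"{batch:>6} rows/statement -> about {secs:g} s of round-trip waiting")
```

Under those assumptions, one row per statement spends on the order of a thousand seconds just waiting on the wire, while batches of a thousand rows cut that to about a second.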
What's the largest amount of data we can pull across the wire at one time so that the database server just has to do its work one time per bulk set of statements? All right? And so the goal, and a lot of things we're going to talk about today, are how can we take the data, the massive amounts of data, these files, the streams we're taking, and get them into Postgres as quickly as possible? All right? We're going to tend towards larger payloads, things on your right, and not the things on your left. All right? Now, we aren't going to be able to talk about today, but then there are also things like, because I actually was trying to get a demo as I was sitting in the hotel the other night. It's like, maybe I can do this. It was a little bit too complicated and based on where we're going to be, I won't be able to do it today. But I will tell you about at least one or two tools. If you could take multiple connections and somehow split your data up, you'll find that you also get even more. So larger payloads, again, depending on your server, your configuration, your abilities, there's going to be a sweet spot. But if you're willing to do the work, you can find it, whether it's two, whether it's four. There are ways that we used to do testing at least with timescale. We've done it with Postgres native partition as well, and many threads taking large payloads up to a certain size, depending on your configuration, can really improve your ingest performance. So things to consider. So let's look at these four methods, and then I'll do my best to demo each of them. The first, and believe it or not, sometimes often forgot about, is a simple multi-valued insert. So what does a multi-valued insert look like? This is what it looks like in something like just plain SQL. And again, for folks that are coming from, we heard talk earlier about Go and using ORMs, don't know SQL well, it's easy to forget that you can do this. 
You simply say insert into table, you have your parentheses with your values of data, you can just say comma and add another set of parentheses, and you can do that for a long time, and you will get a better performance. I'll show you some of that. Now on code, the reason I bring this one up is for a lot of people, and I'm using Python, it doesn't really matter what the language is, a lot of people do that individual insert statement, and honestly, if you don't have time right now, you want to improve your performance, but you don't have a lot of time, you can simply just put a little loop in there that says, hey, just append to this string for so many iterations, and then send the query. And you'll be surprised how quickly that improves your performance. So a really small change can make a big difference. Multi-valued inserts requires a little bit of extra programming work depending on what you're doing. And we'll talk briefly about ORMs at the end, how many of them, or at least some of them, can help you do some things like this. Multi-valued inserts is part of SQL standard. It's supported regardless. If you can create a string, and you can append these things together, you can send it to the server, and the server will handle it correctly. It's usually at least moderately faster. I'll show you an example in a minute. And it just really requires that you have some kind of batching in your application. So you go back to this Python example. I just had to put something that says, hey, once you reach some limit, go ahead and send the statement, and then let's start over again. So it's not that hard to do. It's pretty quick to iterate in your application now to get a lot better performance. The second one is a little bit interesting, and it's really unique to Postgres. And I bring this up for one specific reason. It's called using arrays to do your inserts. 
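Going back to the multi-valued insert for a moment, the "append until you reach some limit, then send" loop can be sketched roughly like this. It is a minimal sketch, not the code from the slides: it assumes a DB-API style cursor that uses `%s` placeholders (as psycopg does), the function names are made up for illustration, and the table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_multi_insert(table, columns, n_rows):
    """Return one INSERT with n_rows parenthesised placeholder groups."""
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([group] * n_rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


def insert_in_batches(cursor, table, columns, rows, batch_size=500):
    """Send one multi-valued INSERT per batch; return the statement count."""
    statements = 0
    for chunk in batched(rows, batch_size):
        sql = build_multi_insert(table, columns, len(chunk))
        # Flatten [(a1, b1), (a2, b2)] into [a1, b1, a2, b2] to match the placeholders.
        params = [value for row in chunk for value in row]
        cursor.execute(sql, params)
        statements += 1
    return statements
```

With a batch size of 500, a million rows become 2,000 statements instead of a million. The right batch size depends on row width and the server, so it is worth trying a few values.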
Now, if you are new to Postgres and you don't know this, Postgres is maybe the only database, or one of the few, at least that I know of, that supports an actual array as a data type. And so you can actually put together a bundle of values, multiple values, as an array. Sorry, I had a box to highlight it. And that then is treated as one value element, which can then be, with what we call unnest, split back apart and put into the database. Now, the reason I bring this up is, at Timescale, we had an application that, using an older version of a Go library, was doing a lot of inserts. And we ran into a problem, and I'll talk about this in just a second. In Postgres currently, when a SQL statement is parameterized, we'll say something like insert into table, and then it puts in an internal parameter, which is a question mark. And you can only have about 65,000 of those question marks. And so if you're trying to insert lots of data, you have to make sure that you don't have more than 65,000-some-odd parameters. Tracking that can be difficult, and we realized that we could bundle up all of these values into just arrays, and then we're technically only sending, in this case, three values, even if there are 10,000 elements in each array. And so that was a way to overcome that, and that's one of the reasons I bring this forward. Now, in some cases, it can actually be faster than something like a multi-valued insert. Now, again, it depends on the tooling. It turns out that even in Python, I just got a new version of psycopg, and this one is not as fast as it was before, and I'm not sure why. I haven't had a chance to look at it. This is what it might look like in Python. Again, we'll show you that in just a minute. Probably the biggest reason to consider using something like this depends on your ORM.
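The array approach can be sketched along these lines. This is an illustrative sketch rather than the slide code: the helper names are hypothetical, the statement assumes a driver that uses `%s` placeholders and adapts Python lists to Postgres arrays (psycopg does), and 65,535 is the exact bind-parameter ceiling behind the "65,000-some-odd" figure.

```python
# Upper bound on bind parameters in one Postgres statement
# (the wire protocol carries the parameter count as a 16-bit integer).
PG_MAX_BIND_PARAMS = 65535


def max_rows_per_multi_insert(n_columns):
    """How many rows fit in one parameterized multi-valued INSERT."""
    return PG_MAX_BIND_PARAMS // n_columns


def rows_to_column_arrays(rows):
    """Transpose [(a1, b1), (a2, b2)] into [[a1, a2], [b1, b2]]."""
    return [list(column) for column in zip(*rows)]


def build_unnest_insert(table, columns, pg_types):
    """One placeholder per column: each receives a whole array of values.

    Table, column and type names are interpolated directly, so they must
    come from trusted code, not from user input.
    """
    cols = ", ".join(columns)
    arrays = ", ".join(f"%s::{pg_type}[]" for pg_type in pg_types)
    return f"INSERT INTO {table} ({cols}) SELECT * FROM unnest({arrays})"


if __name__ == "__main__":
    rows = [(1, "a", 1.5), (2, "b", 2.5)]
    sql = build_unnest_insert("readings", ["id", "name", "value"],
                              ["int", "text", "float8"])
    params = rows_to_column_arrays(rows)
    print(sql)
    print(params)  # three parameters, however many rows there are
    # With a real cursor this would be: cursor.execute(sql, params)
```

A three-column table tops out at 21,845 rows per parameterized multi-valued insert, while the `unnest` form always sends exactly three parameters.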
Now, again, we specifically ran into this because of an older version of a Go library that we were using, and we were hitting that parameter limit, and then we found that this actually, in some cases, performed better. And so you basically take your elements, you put them into a list (again, this is Python; depending on what your language is, your support might be slightly different), and then you can put that into your statement, and you're essentially sending the string again. Depending on what the ORM or the driver does, how it sends it is independent. But once it gets to the server, PostgreSQL simply unnests that and goes to town on it. It can be faster in some instances. I'll show you in a second. It avoids the parameter limit, as I said. The one interesting thing here is that, in this case, it doesn't specifically handle custom types. So if you have a custom type, that might not work for you. The third is something that hopefully everyone here knows about. It's called COPY. Now, COPY is one of the oldest, kind of long-standing commands in Postgres. It actually existed prior to Postgres, in the earlier form before what we know as Postgres today. It is the preferred and optimized tool, and it's often used in a lot of forks of Postgres. I know, for instance, I used to use Redshift, and one of the primary methods for getting data into Redshift is to use COPY. It can read from files or from standard input. And then the one thing to recognize here is, if you are used to Postgres and you use psql, which you'll see me use in just a little bit, this is not the same thing as the \copy command. So essentially, what psql's \copy is doing is taking a file that's... and this is the difference. When you say SQL COPY, and I think I said it here, it's not part of the SQL standard, so it is only Postgres. When you say COPY, it's looking on the local server, wherever that is.
So if you're using Docker, the file has to be on your Docker image. And so that means you can't use your local one. That's what the psql \copy command is for. It will allow you to use your local file, and it streams it into standard input on the server, and then uses COPY there. So they go hand in hand, but they're not exactly the same thing. It does have a couple of limitations, however. So it's designed to be very fast, and it's been doing this for a long time, and basically the way it opens up a stream and writes the records is very, very efficient. It is, however, only single-threaded, in a single transaction. You can't say, start multiple transactions for it, multiple threads, and do inserts in parallel. It doesn't work that way. Until Postgres 14, there was no way to see the progress. So if you had a 10-billion-line file that you were doing COPY on, you just had to wait and see if it was going to finish. And how do I know? Because you can't see it, right? Because it's not committed yet. So how do you know the progress? And so in Postgres 14 and above, we do now have a view. Again, I'll show you that as part of the demo. Very helpful. When you have large files, it's good to know about that. Minimal format configuration. Now, there's some. I mean, again, COPY is a great tool, but it was designed a long time ago and has a very specific set of use cases. There is some newer tooling that is kind of building on COPY, putting a superset of tooling above it so that COPY can be more efficient. And there is no failure tolerance. This is one thing that I actually didn't know until recently. When you insert data with COPY, if it fails, so let's say you have a format error. If you've used COPY, you've surely run into this, where you forget that you have a header line or you forget there's something partway down the file and the format's just wrong, and it has actually copied maybe, you know, millions of lines and then errors.
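For the client-side path just described, where data is streamed from the application into standard input on the server, a minimal sketch could look like this. The helper names are hypothetical, it uses only the Python standard library to build the stream, and the hand-off to the database is left to the driver.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialize rows into an in-memory CSV stream suitable for COPY.

    None and "" both come out as an empty field, which COPY in csv format
    reads as NULL by default.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf


def build_copy_statement(table, columns):
    """COPY ... FROM STDIN reads the stream sent by the client.

    Table and column names are interpolated directly, so they must come
    from trusted code, not from user input.
    """
    return f"COPY {table} ({', '.join(columns)}) FROM STDIN WITH (FORMAT csv)"


if __name__ == "__main__":
    buf = rows_to_csv_buffer([(1, "a,b", None), (2, "x", 3.5)])
    print(build_copy_statement("readings", ["id", "name", "value"]))
    print(buf.read(), end="")
```

With psycopg2, for example, the statement and buffer could be handed to `cursor.copy_expert(sql, buf)`. On Postgres 14 and above, a second session can watch a long load by querying the `pg_stat_progress_copy` view.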
The transaction's done, it stops, but under the covers, those records are bypassing a little bit of the typical work of Postgres. They are actually taking space in your table until the next vacuum. They're part of a transaction that is essentially partially complete. It's actually stated in the documentation. I think a lot of folks don't actually see that. And so if you're doing a lot of really large file processing, this is important to know about. If it fails, you really need to run a vacuum after the fact, and make sure that vacuum is happening more regularly in those instances. Again, the point is, it is intended to do one job, two jobs really, really well: import data quickly and export data easily. So you can use COPY as well to get data out into something like a CSV format, and it does it really well. There are two tools that I know of that you can use, and I'm sure there are more, but I've used both of these tools. So one is pgloader, at pgloader.io. If you have never used it or don't know about it and you're doing lots of this work, it uses COPY under the covers. It was written by Dimitri Fontaine, well known in the Postgres community. It's a CLI application, and basically he designed it in a way that it will take your format, and people have contributed to this so it can actually even just do conversions. So you might have a file, you might have a CSV file, or you might have a database that is a different database, like SQL Server or Redshift or MySQL. People have contributed converters so that you can basically say, there's my database, and it will go through the database, get the schema, create the schema, and then use COPY as the background tooling to make the work much more efficient. So it's a really interesting and neat tool, and then it does other stuff which is really great. It does error checking. You can actually put rule sets in for error checking. It can cast data for you.
This kind of value in MySQL does not exist in this way over here in Postgres. Here's how we like it to be cast. So it's a really neat tool. If you've never heard of it, please go check it out, and there's a lot more that it can do, continuous migrations, things like that. The other one is Timescale Parallel Copy. Now, again, I mentioned this because I used to work at timescales and time series database. We're used to people having millions and millions and hundreds and millions and billions of rows in a file that they're trying to insert and ingest. And time series data is really interesting because it's typically in rough time order. And so when you're doing things like partitioning, you could have multiple threads essentially inserting into multiple tables under the covers behind the scenes. Again, copy itself is not multi-threaded, but this is another tool. It's a Go program that can take time series data, splits it up into batches for you, and then will start up multiple threads to do those copies in parallel. This can just be really useful on a high latency system. So, again, when I actually first started at Timescale, I was running, at first day, I was running their demo, and it had 10 million rows. I live in the country. I don't have the best internet connection in the world. And after 20 minutes, I said, are we sure this thing is working correctly? It just turns out using just a plain copy over a very latent connection to a data center on the other side of the country was not terribly efficient. Using parallel copy, it was done in a minute, right? So, a lot of ways to go about it. And then the last thing to consider if you're doing a lot of data insert is unlogged tables. Now, again, in Postgres, a lot of people that I've run into, myself included, forget about this often. But if you need to insert a lot of data and you can deal with some of the ramifications, it's a great option for just giving you that one other little edge. 
Unlogged tables simply mean that your work is not logged to what we call the write-ahead log. Now, that means it's not fail-safe, right? So, if there's a crash, that data is gone. You can't recover it. But this is really good for things like ETL jobs, right? So, if you're getting lots and lots of data that you have to process maybe every night or every hour or whatever, you might find that you can get a 20%, 30% improvement in your ingest speed by using an unlogged table, maybe even more, depending on what it is. And if it fails, it's fine. You still have the file. Try again, right? So, that's a really useful tool to know about. You can take any table and turn it to unlogged. Obviously, you want to put it back to logged when you're done, unless it's just a throwaway table of some sort. So, again, it's really great for ETL processes. It's really good for intermittent but repeatable work, right? So, again, any kind of batch jobs you're handling, maybe you're rerunning a data processing simulation over and over again. It's a great way to do it, because who cares if it's not in the log. And it also means your WAL is not increasing for this stuff that you're just iterating over and over again, which can be helpful. I did forget to specifically say back here, obviously, this data will not be accessible on any replication servers. It's not in the log. The way that Postgres does replication is with the WAL. And so, if it isn't in there, it won't replicate. Again, that might actually be really beneficial for that kind of transient data that you're working with. All right. I'll pause there to quickly see first if you have any questions and then I'm going to flip over to demos. Any questions? Yes, one in the back. And I can shout it out. Can you turn on unlogged for just one partition? That's a great question that I'm going to say probably yes to, because it's just a table, technically, under the covers. And I'm getting a nod from down here, too.
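As a small illustration of the unlogged workflow described above, flipping a table to unlogged, loading, then flipping it back, here is a sketch that only builds the statement sequence. The function name and the placeholder comment line are made up for illustration; `ALTER TABLE ... SET UNLOGGED` and `SET LOGGED` are standard Postgres commands.

```python
def unlogged_load_plan(table, keep_unlogged=False):
    """Return the SQL steps for a bulk load that skips the write-ahead log.

    The table name is interpolated directly, so it must come from trusted
    code. Data in an unlogged table is lost after a crash and is not
    replicated, so this only suits loads that can be rerun from the source.
    """
    steps = [
        f"ALTER TABLE {table} SET UNLOGGED",
        f"-- bulk load into {table} here (COPY or batched INSERT)",
    ]
    if not keep_unlogged:
        # Switching back rewrites the table and writes it to the WAL.
        steps.append(f"ALTER TABLE {table} SET LOGGED")
    return steps


if __name__ == "__main__":
    for statement in unlogged_load_plan("staging_readings"):
        print(statement)
```

For a throwaway staging table, passing `keep_unlogged=True` skips the final step, which matches the "unless it's just a throwaway table" case.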
So good. Good question. Oh, yes. I apologize. I meant to repeat the question. I didn't. The question was, can you set just one partition, one table in the partition set, to unlogged? And so the answer there would be yes, because again, with the way that we do partitioning, technically, they're just tables. We're going to talk about that in a minute. One thing I want to say: that slowness in psycopg 3 was fixed in December. So now, possibly, it should be as fast as psycopg 2. And I want to ask you if you also have tested the new copy support in psycopg 3, where there is direct copy from Python objects to the copy protocol. So it's a great question. I'm going to show you a little bit of Python. And so at the end, I actually was going to talk about that. I've been starting to play with psycopg 3. The demo, I actually converted to 3. And there are one or two things that it can't do that 2 could. I wanted to demo that. But yeah, I will talk briefly about a couple of those situations. Okay. I'd be happy to have a chat about that because I wrote it. You look familiar, so I was assuming that was the case. Thank you. All right, so let me just show you what we're going to do. I have a couple of different tables, two different demos. So I'm going to first show you all of this in SQL. This is not necessarily how you would do it day to day. You're probably not, in most of your SQL jobs, concatenating strings and then using that to insert into another table. This is all contained within one database. It's truly just for demo purposes, to try and show you one or two examples of how some of this can help and work. I'm not going to show you the row-by-row insert, because honestly it takes about two minutes, and it's not worth it, to be really frank. I didn't feel like sitting here. Here's what the sample data looks like. I have a script, it is in the demo repo, that creates some tables: a table with a bunch of data columns, different data types.
I just have a couple of functions I wrote to create random data, both numbers and text, so that I just get different sizes, and then I just inserted just over a million rows. I could have done 10 million, but I didn't want to make everyone sit here. Are these exponential? Yes and no. It depends on data types and a bunch of other factors, which we're going to see in just a minute. I created that data. It is sitting in one table, and I'm simply going to do a couple of these things by taking it and copying it into another table beside it. Every time I'll truncate it, and then we'll see what happens from there. Like I said, I think there are just over a million rows of data here. The first one, as I said, is basically taking, and I'm going to pull this down here. Is that readable by everyone? Make it a little bit bigger? Do that. Same process. Start with an insert statement. I'm simply concatenating onto that string up to some point. In this case, I believe it is 500, and I was playing with this last night, so now I actually don't know if it really is 500 or not. Basically, every 500 or every thousand, I'm simply going to then execute that statement, so we have a string with lots of data. This takes a few seconds. It is running. Then you'll see I have two things here. I basically timed the creation of the string, and I timed the actual execution. The string itself took about three seconds. This is the one thing I could not get DBeaver to make bigger today, so I apologize. I will read it for you. The string generation took about three seconds, and the actual execution into the database was just shy of eight seconds. That was a million rows, one database table to another database table, using multi-valued insert statements. Now I'm going to take the same data, and we're going to do that array trick. I'm basically taking data from table one. I'm aggregating each of those columns into an array, working through the whole table with an offset. I'm basically doing 10,000 at a time.
You can play with different numbers. You can take results. I'm basically taking 10,000 values of column one, 10,000 values of column two, and so forth, into an aggregated value. Now, because it's being done in Postgres, you'll see that this is actually kind of interesting in how this works. You don't necessarily get the exact same thing. It is running. Again, I tend to use DBeaver in demos because it's just easier for me to color code some things and walk through comments and so forth. That's just why I do it this way. This takes just shy of 30 seconds, maybe actually a little bit less than that, depending on how the image wants to perform today. The really interesting thing you'll see here is that, and again, I'll read the values out for you, the string took about 24 and a half seconds to generate, so it went through a million rows, 10,000 at a time, I don't know, maybe 400 iterations. I'm not sure my math is right, maybe close to 500, but anyway. And so the string generation took almost 25 seconds, but the actual execution of all of those individual statements took 1.5 seconds. Now again, it's inside of Postgres. It has access to where each of those things are. It is a lot more performant in Postgres, but it's interesting because there might be some things you're doing even internally in some of your processes, functions, stored procedures. It might be worth trying. Now it's complex. This is not the most exciting way to write a procedure and create all of these values and make sure they get populated and so forth, but it's interesting nonetheless. We thought it helped us mostly because of the parameter limit thing we were finding with the Go package. Again, that is older, but we also found some neat things like this, that in some cases it could actually work more effectively. So then I took that same data. Now again, I'm currently using a Docker image. It's just a generic Postgres Docker image. Just kind of spun it up.
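For readers who want to try the array trick from application code, here is a minimal Python sketch. It is an assumption about how the idea can be expressed client-side, not the SQL shown in the demo: each slice of rows is transposed into one list per column and handed to Postgres's multi-argument `unnest`, so the number of parameters per statement equals the number of columns no matter how many rows are in the slice. The table and column names (`bulk_test`, `col1`, `col2`) and the helper names are hypothetical, and the `%s` placeholder style assumes a Psycopg-like driver that adapts Python lists to arrays.

```python
from typing import Iterator, List, Sequence, Tuple


def array_insert(table: str, columns: Sequence[str], pg_types: Sequence[str],
                 rows: Sequence[Sequence]) -> Tuple[str, List[list]]:
    """Build INSERT ... SELECT FROM unnest(...) with one array parameter per column.

    `table`, `columns` and `pg_types` are interpolated directly, so they must
    come from trusted code, never from user input.
    """
    # Transpose row tuples into one list per column.
    column_arrays = [list(values) for values in zip(*rows)]
    placeholders = ", ".join(f"%s::{pg_type}[]" for pg_type in pg_types)
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) "
           f"SELECT * FROM unnest({placeholders})")
    return sql, column_arrays


def slices(rows: Sequence[Sequence], size: int) -> Iterator[Sequence[Sequence]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Hypothetical sample data: 25,000 two-column rows, sent 10,000 at a time.
rows = [(i, f"text-{i}") for i in range(25_000)]
statements = [array_insert("bulk_test", ("col1", "col2"), ("int", "text"), chunk)
              for chunk in slices(rows, 10_000)]
# With a real driver, each pair would go to cursor.execute(sql, params).
```

Because every statement carries only two parameters here, this shape sidesteps any per-statement parameter limit a driver enforces, which is the motivation mentioned above.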
There's no special configuration. I didn't try to give it more RAM. I could have made this a lot more performant, but I chose not to. I simply then copied it out so that this file is on the Docker image, on the server, because remember copy is local to the Postgres server. So I took those million rows and it's a lot easier to do this, right? We have the copy command. It generated the CSV file. You could provide the other parameters here if you needed to, but you'll see that it ends up being overall significantly faster. So by far the fastest in total time here just to generate and work, right? So we took a million line file, imported it in under two seconds with copy. So it's a great option if you don't know about it. Now, the other thing is to show unlogged. So for the question earlier, I'm going to use copy just because it's quick and we have a time limit in the demo. And so I'm going to set this table, bulk test, to unlogged, all right? And I'm going to go ahead and do that copy again. So it was 1.7 seconds or so to ingest that. Again, I truncate it at the beginning. And you'll see that it ends up taking 1.3. So, you know, it can be depending, right? That was, someone do the math for me really quickly. What is that, about 10 or 15 percent? Sometimes it can be larger depending on the kind of table it is, the data you're ingesting, and then some other things we'll talk about in just a minute, all right? So it's easiest to show that in SQL because it's quick and easy to iterate on. Again, the scripts themselves are in the repo, which I'll show at the end. What about in Python? And so, yes, there's a new version of Psycopg. So this demo was originally created about two years ago. I've iterated on it over time. And so what I did, I actually have, and I think I actually have it up here. So I did actually convert many of these, or all of this, to Psycopg 3. It doesn't matter what the language is, right?
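The unlogged copy run described above can be written down as a short sequence of statements. The Python sketch below only builds that sequence as strings so the order is easy to see; the file path is a hypothetical placeholder, the table name `bulk_test` follows the spoken "bulk test", and the helper names are made up for illustration. The small `percent_faster` helper just does the arithmetic on the two timings quoted in the demo.

```python
from typing import List


def unlogged_copy_plan(table: str, csv_path: str) -> List[str]:
    """Return the statements for a server-side COPY into a temporarily unlogged table.

    COPY ... FROM '<file>' reads a file on the Postgres server itself,
    so `csv_path` must be a path the server process can read.
    """
    return [
        f"ALTER TABLE {table} SET UNLOGGED",   # stop writing this table's changes to the WAL
        f"TRUNCATE {table}",                   # start from an empty table, as in the demo
        f"COPY {table} FROM '{csv_path}' WITH (FORMAT csv)",
        f"ALTER TABLE {table} SET LOGGED",     # make the table crash-safe and replicated again
    ]


def percent_faster(before_seconds: float, after_seconds: float) -> float:
    """Percentage reduction in elapsed time between two runs."""
    return (before_seconds - after_seconds) / before_seconds * 100


plan = unlogged_copy_plan("bulk_test", "/tmp/bulk_test.csv")  # hypothetical path
saving = percent_faster(1.7, 1.3)  # timings quoted in the demo
```

Taking the quoted 1.7 and 1.3 seconds at face value, the saving comes out a little under a quarter. Note that switching a table back to logged is not free, since Postgres has to write the table's contents to the WAL at that point, so the net gain depends on the workload.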
A lot of these principles are going to pertain, regardless of what the language is. I happen to be using Python and Psycopg. Earlier, Pavlo was talking about ORMs. Now, again, at the very end, I'm going to mention one or two things to look at for ORMs. Psycopg is not an ORM. It's a driver, right, to help us do this work. But there are components of what your driver or your ORM might do for you, which is also valuable to think about. That's the reason I'm using Psycopg 2, honestly. It turns out that, yes, Psycopg 3 got much faster, even with the individual inserts. It's specifically because of a new feature in Postgres 14 and above in libpq. So it allows us to do... The word is all of a sudden escaping me. Pipeline. Thank you. The pipeline feature in Postgres 14 and above. And so with that, particularly, we do get better performance in some of these things. I think the question becomes, again, if you have control, if you can get a better payload, a bigger payload over to Postgres and let it do its work, depending on the architecture of your system, you still benefit from some of these principles. So I did the single insert before we started. So I simply took, again, this is using Psycopg 2, and I can't speak for the language because I haven't specifically done it, and this is not even about Psycopg. This is just about simply getting something into the server. If I iterate that file, this is... I meant to show you that too. This is a slightly different file, and the reason is I didn't feel like doing 10 or 12 columns. I just wanted to demo the options. It's a much simpler file, a little bit less than a million rows. And so I did the single insert previously, and you'll see that it took 180 seconds to do 750,000 lines, and it's only three columns, and there are no indexes. So there's a lot of work, right? Just going and iterating back and forth and back and forth. We see that often.
So if you were doing that kind of work in your application, seriously consider doing something different, like... So the way this file is set up, at the very bottom, I have each of these functions, and so I'm simply going to comment out the single insert. Now I'm simply going to do the multi-valued. I did exactly as shown in the slide. I took that single insert, simply wrapped it in a batch. After so many iterations of appending to that string, send that. And when you do that, you'll see that it ends up being significantly faster, and it's really simply to show the value of doing a very minimal amount of work, all right? In six seconds, right? So one line at a time versus many lines at a time. I think there's maybe 5,000 parentheses after that, right? So it really can make a big, big difference. I did then want to try the... So this is what I wanted to show, and we can talk about it later because I don't know the reasoning, and it's fine. I did not realize that in previous versions of this specific tool, and again, I know other tools have something similar, there are some functions; execute_values is the one I'm using here. What that essentially did for you is did the batching for you. And so I just had to do a little bit less work, and that's the only reason I stuck with this, to show you that your tool may have something similar if it can't take advantage of some of the other features that Postgres and others are now providing to do some of the pipelining and things like that. So just to recognize it, this simply does the same thing. It's a multi-valued insert. It's simply helping you do that batch rather than you having to write it. So it's a really convenient method for doing so. Now the arrays turn out to be pretty easy in something like Python, and again, in most other languages now. I have that batch. I have a file, I'm reading it in, I'm creating a list out of it, and then I'm sending slices of that list over and over again.
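The "wrap the single insert in a batch" step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code from the demo repo: the table and column names are hypothetical, the `%s` placeholder style assumes a Psycopg-like driver, and the batch size of 5 is only there to keep the example small.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple


def batched(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def multi_valued_insert(table: str, columns: Sequence[str],
                        rows: Sequence[Sequence]) -> Tuple[str, list]:
    """Build one INSERT with a VALUES group per row, plus a flat parameter list.

    `table` and `columns` are interpolated directly and must be trusted;
    the row values travel separately as parameters.
    """
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
           + ", ".join([group] * len(rows)))
    params = [value for row in rows for value in row]
    return sql, params


# Hypothetical three-column rows, matching the shape of the simpler demo file.
rows = [(i, f"name-{i}", i * 1.5) for i in range(12)]
statements = [multi_valued_insert("bulk_test", ("id", "name", "score"), chunk)
              for chunk in batched(rows, 5)]
# With a real driver, each pair would be passed to cursor.execute(sql, params).
```

The parameter count grows with rows times columns, so the batch size has to stay under whatever parameter limit the driver or protocol imposes; helpers such as `execute_values` in Psycopg 2 do this batching for you.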
And when I do that, you'll see that it has really similar performance. And this is where, specifically because of improvements in Psycopg, in this case, this actually ends up not being any faster in Psycopg 3 and above. So I wouldn't necessarily benefit from this in the application, especially if I didn't need to worry about the number of parameters, right? So again, one of the reasons for this could be a parameter issue if you wanted to have many, many things you're sending in. And last but not least, and that's why I bring it up, is to talk about copy. So if your tool, so I appreciate it, Pavlo. You shared earlier that in Go, right? The ability to use copy, it is a framework that, you know, any language could take on binary copy if they support it. It's one of the things you really want to look for, because you'll see that it is tremendously valuable if you can use it, right? Less than a second, about half a second, to take the exact same file and simply use copy, right? And again, Psycopg 3 does a little bit better on some of this because they're taking the ability now, one of the things you could not do, I believe is true, and maybe I'm wrong in this, you couldn't stream into copy with Psycopg 2, but you can do that now with Psycopg 3. That's really nice if you're forming your own string and you want to use copy, right? That's a really valuable tool. So make sure that your application, your tooling has something like that. So there are two examples of how to do this kind of work in either of them. A couple of parting thoughts to take with you. Number one, indexes and constraints. We never talked about this. Now, in case those among you have been wondering what's going on here, I have no indexes or constraints on any of these tables yet. So someone asked earlier, like, what does that do? I was like, that's a really good question. Let's quickly check that out. And so it's just something to think about.
Now, in a very large active system, it is really hard to just get rid of your indexes and constraints because you want to make your data go faster coming in. But currently, Postgres does not have a way to disable indexes and constraints. So you'd have to drop them and then recreate them. But it can have a significant impact. Again, dropping them before insert can significantly improve performance, but use at your own risk. How big of a difference can it make? Well, let's see really quickly. So I'm going to take that same table and I'm going to create a couple of indexes. I don't think I have any on there now. I have no indexes. So I'm going to create three just on various columns, B-trees, all of them. And I'm going to run, let me set that back to logged. There we go. I'm going to run that copy again. So remember the copy was a little bit over a second. Unlogged, it was just over a second. So just adding three indexes to this table, doing the exact same thing, makes that go at least twice as slow. It is now eight seconds. So that is not just a few percent, it's multiple times slower. What about the type of index? It's a great question. I want to see, like, does it matter? So what if I create a trigram index? Now again, this is random text. I would never do this normally. But what if I create a trigram index on this text? The text itself is no more than 50 characters long. I should have truncated the table before I did that. It will take just a minute. You will see that I get to show you the other trick now that I did that. Within a few seconds, you start to realize that just changing that index type has a major impact on how this is working. But now I get to show you the view. So now I can actually see how quickly it's going through, even though I have that index on there. It's just a recognition that knowing your data will really impact what you're able to do and what is safe in your specific environment. All right, last but not least. Oh, that was the bonus demo.
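Since Postgres has no switch to disable an index, the drop-then-recreate approach mentioned above comes down to ordering statements around the load. A minimal Python sketch of that ordering follows; the index names, column names and file path are hypothetical, the trigram definition assumes the `pg_trgm` extension is installed, and on a busy production system the caveats from the talk still apply.

```python
from typing import Dict, List


def load_without_indexes(index_ddl: Dict[str, str], load_sql: str) -> List[str]:
    """Return statements that drop the given indexes, run the load, then rebuild them.

    `index_ddl` maps each index name to the CREATE INDEX statement that rebuilds it.
    """
    drops = [f"DROP INDEX IF EXISTS {name}" for name in index_ddl]
    rebuilds = list(index_ddl.values())
    return drops + [load_sql] + rebuilds


# Hypothetical index definitions: two B-trees and one trigram index.
indexes = {
    "bulk_test_col1_idx": "CREATE INDEX bulk_test_col1_idx ON bulk_test (col1)",
    "bulk_test_col2_idx": "CREATE INDEX bulk_test_col2_idx ON bulk_test (col2)",
    "bulk_test_col3_trgm_idx": (
        "CREATE INDEX bulk_test_col3_trgm_idx ON bulk_test "
        "USING gin (col3 gin_trgm_ops)"
    ),
}
plan = load_without_indexes(
    indexes, "COPY bulk_test FROM '/tmp/bulk_test.csv' WITH (FORMAT csv)"
)
```

Running the whole plan inside one transaction keeps other sessions from seeing the table without its indexes, at the cost of holding locks for the duration of the load.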
Partitioning. Consider it. I'm surprised how few people consider partitioning for their data. It can really improve ingest, particularly when you have disparate data, like time series data that might come in late. It means that you have smaller chunks of your table. And because Postgres works in memory, if you're only pulling in smaller portions of that table, you can often get better throughput. Consider it. Give it a look. And the really cool thing here is indexes are kept on each individual table. So even with that trigram index, let's extrapolate to a billion rows. If that's spread over many, many tables, the indexes themselves are smaller, and you'll probably get better ingest performance. And that just means you are able to take a whole bunch of data and put it into each of these individual tables. If you had multiple threads going and the data was hitting different partitions, you really can see a speed up. Last but not least, last slide. What to look for in your SDKs? So things like, does it support copy? Does it support binary copy? What about multi-valued or batching kind of functions? Do you have to do that work or will your tooling do the work for you or help you do that work? How is autocommit handled? Parameterized queries. Does it allow you to do anything with those parameterized queries? How does it handle them? Does it warn you if you're going to exceed a limit? And the one thing I meant to put in here and I didn't, and I apologize, is whether it, very much like Psycopg 3, has now taken advantage of a feature in Postgres 14 called pipelining, which essentially says, for every query I send, I don't have to wait for the response, I can just start sending the next query right away. That's really beneficial for inserts. And so it's a really easy and effective way to get more performance out of what you're doing. Turns out, so I have five minutes until questions. Yes. All right. Let me show you really quickly, because I think I can. Oh, there it's done.
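As a rough illustration of the pipelining point, here is how the two options look with Psycopg 3.1 or later, where `Connection.pipeline()` opens an explicit pipeline block and `Cursor.executemany()` uses pipeline mode on its own. This is a sketch, not the demo code: the table, the column names and the two function names are hypothetical, and `conn` is assumed to be an open Psycopg 3 connection supplied by the caller.

```python
# Hypothetical three-column target table.
INSERT_SQL = "INSERT INTO bulk_test (id, name, score) VALUES (%s, %s, %s)"


def insert_pipelined(conn, rows):
    """Send one INSERT per row inside an explicit pipeline block.

    Inside the block the driver queues statements without waiting for
    each individual result before sending the next one.
    """
    with conn.pipeline():
        with conn.cursor() as cur:
            for row in rows:
                cur.execute(INSERT_SQL, row)
    conn.commit()


def insert_executemany(conn, rows):
    """Let executemany() batch the rows; no explicit pipeline block is needed."""
    with conn.cursor() as cur:
        cur.executemany(INSERT_SQL, rows)
    conn.commit()
```

Either function would be called as `insert_pipelined(conn, rows)` with `rows` being an iterable of three-value tuples. Pipeline mode relies on a client library (libpq) from Postgres 14 or newer.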
So then when it stops, you'll get no values back. But it did finally finish. How long did that take? Let's see. Doing that with that trigram index took 65 seconds, right? So your indexes matter, the kinds of indexes and what you're doing. A trigram index is an inefficient index, but it's a very powerful one too. If you're going to use it, you understand that and that's okay. So I have, I think, so remember I told you earlier when I did the single insert, and I'm going to give this a go, and it might not be set up, and this is no comment whatsoever on Psycopg 3, because I was like, oh, that's actually a lot faster than I thought now. No, I think, yeah, I think this is it. So let's just see. This may or may not work. Basically because of the pipelining, even though it does 750,000 insert statements, because it's doing one at a time, the time came down very near to what the multi-valued was in this case because of the pipelining effect. So if it works, great. If it doesn't, it just means that I changed something from the last time I did this. There's no comment whatsoever. Let's see what happens. And I might not have turned on pipelining. Now I do know, oh, you know what? This is not executemany, my apologies. Yeah, I don't have it set up. I thought I did. So there's a function called executemany, and what that does is it automatically sets up pipelining for you. You don't have to enable it in the code itself. Otherwise, at least as far as the documentation shows, you have to at least initialize the pipelining to get that impact with it. So it was a great surprise. Thanks for doing all that work. I know there's been a lot there. So that is what I have to offer for today. It's been really a joy to speak with you, and I'd love to take any questions you have in the few minutes that we do have. I'm going to start by talking about bees and coffee. Hi, I've got a few questions, but I'll restrict myself to a couple.
The first one is if you take that example where you've got a bunch of indexes and maybe even some constraints or even a trigger or two, all of which are contributing to making it slow, how do you tell what's making it slow, either through tooling, logs, et cetera, without having to basically turn it on and turn it back off again on every one? Really good question. A lot of it would be, and probably something I actually was going to try and talk about, but I just didn't have time, using the views in Postgres to figure out, number one, first off, what statements are actually happening. You can tell which indexes are being touched, what kinds of transactions are happening, the updates themselves within the index, and that might help you get a sense for at least which ones you have to worry about to start to think through. That'd be my first go, and I will maybe try and update some of those examples with a couple of those views so that you can start to do that. Thank you. Yeah. Question over here. Behind you. Hello. Yeah. So what if you want to normalize your data into multiple tables? Why did I not normalize the data into? Into several tables. Okay. Partitioning? No. No. Like tables with unique constraints and foreign key relations. Oh, why didn't I? That's a great question. Demo and time. What is the strategy to insert data if you want to normalize it into several tables with unique constraints and foreign key relations? Then you cannot copy, and you cannot just do a multi-valued insert. You have to process the data while you insert it. Absolutely right. And so that's where I think Postgres doesn't get enough attention on this specific thing. So the question is, when you have a very normalized database, this data actually gets split up into three or four tables because it's not just one file. How do you do that most effectively? Now, a lot of people would have some kind of external ETL tool. I think that Postgres does a really good job internally.
Specifically, what I tend to do is I create an unlogged temp table of sorts. I get the data in as quickly as I can with copy. I use the power of Postgres to move that data around. Tricks like the array trick, honestly, in some instances, end up being really fast then in processing. That's the best I can offer you when it comes to copy because you're right. That's the limitation of copy. It doesn't follow the parameters down. So hi, thank you. So I would have expected you to also touch on prepared statements, which many drivers have. So did you purposely not do that? Yes, I purposely didn't, mostly, again, partially because of time. So prepared statements, one of the big issues with prepared statements that a lot of people run into at least, we were just talking about this on the walk the other night, is that prepared statements, if you don't need them anymore, need to be released effectively. And so I just didn't have a good demo to be honest with you. And so I apologize when it comes to insert. Now some of that, again, Psycopg 3 does now help with prepared statements. I tested it briefly to see if it would have any impact. On these simple insert statements, that was part of my problem to demo that. It's just a little bit more difficult. So definitely something worth checking. So basically, do the work ahead of time so that the statement itself doesn't have to be prepared every single time. And I think Psycopg says there's a threshold. Once the statement has been run so many times, then it will basically turn it into a prepared statement to save that effort on each insert. Okay, and I just want to remark that it's actually possible, I believe, to disable constraints while importing. So, okay, yeah, good comment. So while importing, absolutely. If you can tolerate it, if your application can tolerate it, you can drop them ahead of time. Yeah, I do. I don't mean drop. You can disable triggers temporarily and, importantly, turn them on again. I'm sorry. So I've read.
So apparently constraints are implemented by triggers in Postgres, and if you temporarily disable triggers, then you also disable checking the constraints. Yeah, that's another option. You can defer it to the end of the import process, which might help for performance. I haven't checked. You can also disable them, but then you actually disable them so they're not checked. So that's something to be wary about. I hadn't thought about the trigger option, I haven't tested it. So, yeah, good feedback. Disable, please. Yeah, exactly. There are so many ramifications of it. The reality is simply saying there's a lot that you have to think about. If you want to do it, it is an option, and you'll see that guidance out there in a lot of places, right? Then you have to worry about what happens when you re-enable all this stuff. Like, that's a whole different discussion, I'm just simply putting it out there. One more, maybe? Have you considered using foreign data wrappers to load data, and do you know about the performance there? Yeah, so the question is about foreign data wrappers. Postgres has a really great ecosystem of foreign data wrappers that allows you to basically say, that thing over there, I'm going to treat it like a table. It can be a file. It can be another system. It can be another database system, Mongo, Redshift, whatever. It really all depends on... A lot of these principles are going to follow. Number one, try to get pushdown. Like, what can actually be pushed down to the queries for the data you're bringing back? The data itself is almost never going to be as quick, just because of the overhead. Even with local files, it's not going to be the same thing. But it's a really useful tool. As a source, it could be a great way to take that and get it into something, and then move on from there. One last question. This is a follow-up comment on the question that this gentleman had about constraints.
If you use the multi-value insert, sorry, the array insert trick, you can actually just drag the arrays into a CTE and then run an update. So you can still do it in one statement. You don't need to necessarily put it in a temporary table. Absolutely, yeah, again, just example-wise. One question. Hello. Binary copy versus text copy. What is faster, and how much is the difference? Binary or text? Binary is generally faster. It's a great question, and I mostly just mentioned it to mention it. In every experience I've had with tooling that supports it, it ends up being a lot faster. I don't have numbers for you at the moment, and it is part of the demo I want to get in there so that we can show that. If anyone else has specific input, you're welcome to shout it out, but it's faster. Look for tooling that supports it. Again, Psycopg 3 and others support it now by default. You'll see some great boosts. Thank you so much. It's been a pleasure. |
DBA Evolution (the Changing Role of the Database Administrator) |
Thanks very much. I leave you in the capable hands of Karen Jex. Hi, thanks, Jimmy. Hi, so I'm Karen Jex and I'm going to talk to you today about DBA evolution or the changing role of the database administrator. So you can see that my title at the moment isn't database administrator, but I was a DBA for over 20 years. I'm not going to talk through the agenda. I've just left it there so that you can keep track of where we're up to and when I might stop talking. So I promised that this would be a talk for everyone that's pondered life's important questions. What does a DBA actually do? Why are DBAs always so grumpy? How's the role of the DBA changed over the last couple of decades? What's the DBA of the future going to do? And will autonomous databases finally have put us all out of work? So this year marks the 25th anniversary of the start of my career as a DBA, so I thought that I'd allow myself a bit of a self-indulgent retrospective. So first, the evolution of this particular DBA, and I'm going to apologize to Claire at this point because this is going to be a bit of a me, me, me section of my talk. So I'm just going to have to get through that bit. Being a DBA definitely wasn't an ambition of mine at school. I mean, did anybody here say database administrator when they were asked what they wanted to do when they grew up? I mean, I didn't even know I was going to go into IT until about a year before I started my career, although with hindsight it's fairly obvious that that's what I was destined to do. My dad taught me to count in binary before I could even properly count in decimal. He built me my first computer, an Acorn Atom, which he later replaced with a BBC Model B, and obviously I used the computer for playing Chuckie Egg and other games, but I also learned to code in BBC BASIC. And I actually looked up the spec of the BBC Model B in case anyone's interested, so it had a 2MHz processor and 64K of RAM.
I then went on to do a maths degree because I still had absolutely no idea what I wanted to do, and maths was my favourite subject. And after that, I figured I probably ought to get some real-world skills, so I did a masters in software development. Towards the end of that, one of the graduate schemes I applied to did some aptitude tests, and they came back to say that they thought I'd probably be good as a DBA. Once I'd asked Jeeves what one of those was, I thought that sounded like a fairly good fit, and that I'd give it a go. Fortunately, it turns out that Jeeves was right, and so was the aptitude test, and this is what my career's looked like so far, so I really haven't strayed very far from my DBA role. Obviously, the things that I do in my job have changed over time and I've gradually moved into more senior roles, and I've gradually changed the focus of what I do, but I've tried not to let that influence what I talk about too much. I'm trying to talk more about the DBA role in general. So, let's set the scene for the start of my career as a DBA in 1998. The 90s brought us Grunge and Britpop. They also brought us the first truly portable and affordable mobile phones, for example, the Nokia 3210. They brought us the first text message, the Palm Pilot, and also for better or worse, Amazon and Google. But what does a DBA actually do? Just so that we're all on the same page, I've taken highlights of definitions of a DBA from various places. I looked at careers services, DBMS vendors, and job sites. And the general consensus is that a DBA manages and secures computer systems that store data using specialist software. So, they're all pretty much agreed, but that doesn't really tell us anything about what a DBA actually does. So, first of all, one thing to note is that there are lots of different types of DBA. And the different types of DBA will be different from organisation to organisation.
So, at ACME, for example, we might have the DBA role split into many different DBA roles. Whereas at Wonka Industries, we might have DBAs who are in one pool who are all responsible for all DBA tasks. A traditional split is between the production DBA and the development DBA. So, the production DBA will be responsible for keeping all of the databases in the production environment up and running. So, looking at things like availability, performance, security. The development DBA would work more closely with the developers. So, they've got a focus on building and maintaining the database environment that's supporting the application development lifecycle. Just wait for the photos. In some organisations, you might see a split of system versus application DBA. So, here the application DBA will be looking more at the logical, application-related aspects of the database. Whereas the system DBA is going to be responsible for the underlying software and the physical infrastructure. And the split's not clear cut. The roles often overlap. They may or they may not. And you might see lots of different types of DBA as well. So, you might have a data warehouse DBA who will look after the data warehouses, particularly ETL from various data sources into the organisation's data warehouse. A cloud DBA, I think I've got in there as well, who will be looking after cloud hosted databases including liaising with the cloud provider. Also, you might see database architects. I know it doesn't have DBA in the title but it's a very closely related role and often overlaps. It's a role that's existed for at least a couple of decades but it's gradually become more popular and seems in a lot of cases to have split out from the traditional DBA role. And it's not an exhaustive list. You could see all sorts of different types of DBA. You might also get DBAs that look after just one particular task such as backup and recovery or replication.
And any or all of those might be split into junior DBA and senior DBA, with the junior DBA doing a lot of the day-to-day tasks and the senior DBA having more of a strategic focus. Okay, so we saw the definition of a DBA role but as I said it didn't really tell us much about what a DBA actually does. So what does a DBA do in 2023? I looked at the different definitions that I showed you earlier and there were lists of responsibilities that went with those, and I've taken information from different job adverts and tried to pull together the list. So most people seem to agree that a DBA will do some or all of: designing, implementing and managing backup and recovery policies; designing and implementing security policies and managing database access, although in a lot of places the security role has been split out into a separate security team these days; implementing monitoring and ongoing monitoring of the databases; design and development of the database, including data modelling or maybe just reviewing data models that have been created by the development team; support and troubleshooting, including often being on a 24-7 on-call rota; and database software install and upgrades, listed in a lot of them as a specific task, but in a lot of places that's automated, or the installation side of things is automated, so the DBA would either be responsible for implementing that automation or another team would do the automation and the DBA wouldn't actually have anything to do there. Upgrades, I would say, are more of a strategic role in working with the application team to make sure that upgrades are tested correctly so that they're successful. And there's a whole other page, a whole other list of responsibilities: providing database expertise to other teams; performance tuning and enhancement, often troubleshooting; capacity planning; creating databases, including for development and test environments; database maintenance; and data protection and GDPR considerations.
So that was a massive, oh sorry, before I move on: that is a massive, long list of responsibilities, so clearly in most cases one person can't do all of those things. This chart shows some of the most requested skills for a DBA in the UK. I've taken this from IT Jobs Watch, so it's for the six months up to January 2023. We can see that SQL is the first requested skill, and I went up to number 23 because there we've got PostgreSQL listed as well. On the next page I've summarised those different responsibilities, sorry, the different skills that are being asked for. So, consolidating the list of skills, it boils down to: SQL, so writing SQL, making it more efficient, potentially knowing procedural languages, integration and everyone's favourite, ORMs. One or more database management systems, usually still asking for relational. Knowledge of operating systems, so this will be traditional operating systems but also various different cloud environments. Performance tuning. Database migration seems to be quite a popular thing; people are always moving from one database to another. Disaster recovery. Social skills is listed in most of the places as a separate thing; I don't know what they're trying to say there. HA, replication, clustering type skills. And then, not listed in any of the things I found as specific skills, but most DBAs I know are actually expected to know about different DevOps tools and methodologies, and they're expected to know how to use automation tools like Ansible or Puppet to do things. So how does all that compare to what a DBA did in 1998? On paper it actually looks very similar, so I've just reproduced the same list of responsibilities as I had for what a DBA does in 2023, although I've taken off GDPR because it didn't exist yet. Does that mean nothing's changed?
Well, obviously, although the responsibilities are pretty much the same, the skills that a DBA needs and the tools that a DBA is expected to be able to use have changed massively, because a lot of the tasks that they do day to day are done very differently. And obviously that's not limited to DBA tasks or even IT tasks; we do loads of day-to-day tasks differently as well. So I've used "send my sister a birthday card" as an example. That's the flowchart for 1998: I would have gone to the shop, I would have browsed the racks, chosen a card I liked, I would have paid for that card and probably bought a stamp at the same time, taken it home, written a message in it, put my sister's address on it, stuck the stamp on and then walked it to my local post box to be collected by the post delivery. I can still use that method in 2023 if I want to, but this is what I'm more likely to do: I'm going to open the app of my favourite online greetings card retailer, I'm going to upload a photo from my phone, type in a message and my sister's address, pay using my disposable online credit card and just get them to send it directly to my sister as soon as it's printed.
So if we go back to DBA tasks, I've taken "install database software" as an example. I've not put this there to be read; this is an extract from the Oracle 7.3 installation guide, which was the first database installation that I did. I can't imagine that many DBAs in 2023 need to first walk to the machine room and put the CD into the server before they start their installation. It used to take me, I think, two days to prepare and perform a single database install, and we didn't have a repeatable process. There was a silent mode that was supposed to allow us to script it, but because none of the servers were installed using any kind of automation, they had their own particularities, so it didn't work and we didn't bother with it. In 2023 a DBA would be expected to know how to use an automation tool such as Ansible to do a database software install, and they'd probably not just install a standalone database; they can install the whole database infrastructure, including high availability, monitoring, backup and recovery, all in one command. So what kind of skills did a DBA need in 1998? I can't reproduce the list of skills quite as easily, because that has changed, but a lot of the general skills were the same. So you'd be expected to know SQL and probably a procedural language, you'd be expected to know one relational database management system, you'd probably be expected to know how to do basic Unix admin and shell scripting, database performance tuning, and disaster recovery, which was probably calling someone to bring the tapes back on site. So that's what a DBA was doing in 1998 and is doing today. What will a DBA do in another 25 years' time?
Well, I'm going to hedge that question for now. So before we start looking at what a DBA might do in 25 years' time, everyone likes a bit of buzzword bingo, so I thought we'd have a look at the buzzwords, trends and hypes from the last few decades, just to give us an idea of what's changed around our vintage DBA and what's still changing. So, the 1990s. The 90s brought us the World Wide Web, first with painfully slow dial-up access and finally broadband internet, which meant that remote work was finally a possibility for some DBAs, even if it was just to avoid having to trek into the office when doing 24/7 on-call. The concept of big data was introduced, data warehouses became popular, and we needed massively parallel processing, MPP, to analyse all of this data, so Teradata became popular, and Netezza didn't come along until the 2000s. Scrum started to creep into our normal everyday work process. Object-oriented programming, and from there object-oriented databases, were the next big thing in programming methodology. Everybody was talking about AI and machine learning, even though they weren't new concepts, with Deep Blue beating Grandmaster Kasparov at chess. We were blessed with PostgreSQL and Linux. SSDs started to replace spinning-disk hard drives, but that's still a gradual, ongoing move. And a lot of us missed out on New Year's celebrations because we were on call to make sure that planes didn't fall out of the sky and fortunes weren't wiped out due to the Y2K bug. The 2000s: we were introduced to blockchain. Cloud was, and still is, one of the biggest buzzwords, with a move away from the traditional data centre. Everything started to be provided as a service; we had software as a service and platform as a service. We were all expected to be agile, to embrace DevOps methodologies and to implement continuous integration and continuous deployment. The Internet of Things and social media contributed to a massive growth in data volumes. We got Hadoop, Exadata, NoSQL and JSON as new ways to
store, process, analyse and describe data. BASE was proposed as an alternative to ACID, favouring availability over consistency for distributed systems. Access to data on the move started to be possible with Wi-Fi, and containers were introduced, although they weren't at this point quite ready for databases. I'm going to take a deep breath and drink some water. OK, the 2010s. Infrastructure joined the as-a-service party. NFTs, non-fungible tokens, were introduced. We started to store our data in data lakes, which sometimes became data swamps. Columnar databases, even though they were actually a concept from the 1970s, became popular and joined the other NoSQL databases. We started to avoid putting all of our eggs in one basket with multi-cloud strategies. Oracle announced their autonomous databases, which meant there was not going to be any need for DBAs anymore. Cloud native was the next big thing, with microservices, Docker and Kubernetes. And the EU caused DBAs all across Europe massive headaches by introducing the General Data Protection Regulation, GDPR. The 2020s haven't all been a bed of roses in tech so far, with a global pandemic and massive layoffs. On the other hand, remote work has started to become the norm for a lot of us, access to data on the move is getting faster with the introduction of 5G, Zoom became a household name, everyone's now heard of Mastodon, and apparently we're all going to be living in the metaverse and using ChatGPT to write all of our reports and presentations. And that's just the first few years, so who knows what's going to have happened by the end of the 2020s. So, we can look at a few ways that databases have changed in response to all of those things that have been going on. This is a massively simplified view of the history of database management systems; please don't quote me on this. So in around 1980 the relational database pretty much started to replace the old navigational databases, and it's been going strong since then and still is. In about 2000,
various post-relational or NoSQL databases became popular and are still popular. And of course everything goes in cycles: the NoSQL databases were around before the relational databases, just that they weren't called NoSQL databases at that point. It's easy to think that relational databases are going away when we hear about all the different NoSQL databases there are and how popular they are. This is a chart from db-engines.com from January this year, and it shows that actually relational databases are more than twice as popular as all of the other types of database management system put together. Their figures also show that the popularity of relational database management systems has stayed fairly static over the last 10 years. We've got more variety of databases. So in the 90s there was a handful of database management systems, and organisations would usually use one of those for all of their databases, or maybe two. There are now over 400 different types of database management system, and most organisations don't restrict themselves to just one or two of those, so you'll get multiple database types not just within one organisation but also within one application. That means a DBA might be expected to know not just several relational database management systems but also various different NoSQL types. They might be expected to know about spatial databases, time series, graph databases or document stores. Databases have got bigger. The terms big data and very large database were introduced in the 90s, but they don't actually tell us very much about the specific volume of data. The definition in most places I found of a very large database is a database that stores so much data that its maintenance and architecture need specialist tools and methodologies, so it's basically just a database that's so big that we haven't yet figured out how to manage it. In the 90s most people I spoke to seemed to think that a multi-gigabyte database was fairly big; these days people have
multi-terabyte databases and they're seen as fairly normal. Once you start to get into multiple tens of terabytes you might start needing some specialist methodologies. In 2018 IDC said that the global datasphere was predicted to grow to 175 zettabytes by 2025. I haven't managed to find any later figures, so I don't know if that's still on course to be the case. The global datasphere is apparently all of the data that's created, captured and replicated in core, edge and endpoint locations, and one zettabyte is a trillion gigabytes. So for anyone like me who has difficulty visualising 175 trillion gigabytes, apparently if you burnt that onto CDs you'd have a stack of CDs that either took you to the moon 23 times or around the earth 222 times. So it doesn't look likely that there's going to be any shortage of data to manage any time soon. Databases have got more complex. I suspect we've all seen an application architecture diagram that looks approximately like that. We've got multiple database types, not just within one organisation but also often for a single application, we've got multi-cloud strategies, we've got containerised databases, we've got distributed systems and we've got many interrelated components. So how has the DBA adapted in response to all of these changes?
The DBA has had to develop a cloud mindset. I was really reluctant to embrace cloud. I can't remember when I first heard the word, but I thought it was just a hype that was all going to blow over. Whether we like it or not, cloud is here and it's growing, so we have to work with it. With the advent of cloud computing, many DBAs are no longer actually responsible for managing on-premises databases; the hardware and software is often managed by somebody else. So the cloud DBA will need to have knowledge of probably multiple different cloud platforms, and even organisations that keep their databases on-premises are often moving to a cloud-like architecture, a cloud-like model with self-service databases. The DBA's had to learn to be flexible. As I've said, they would often be expected to know multiple different types of database, not just relational but also NoSQL. They need to be able to work in lots of different database environments, not only the traditional operating systems but also cloud and things like Kubernetes. They're expected to know how to use DevOps and automation tools and methodologies, and they've got to keep up to date with all the changes in all of these different types of tool and methodology. The other side of the coin is that actually a lot of roles have been separated out into different teams, and you might now have different teams that are responsible for infrastructure, for data, security and the application. We already looked at the different types of DBA, but actually there are a whole host of different roles related to the database and the data in it, and this is just a few of the ones that I've seen by scanning job sites. All of these changes mean that the DBA has to collaborate with other teams, so not just with other database experts but also with teams that support the entire application stack. I often say to people that I've been forced out of my little database bubble. And the DBA's had to develop a strategic focus, so the DBA is more likely to be looking at
the overall architecture of the database, analytics and data processing, improving automated processes, and collaborating with other teams to make sure they've implemented business solutions, rather than looking at the day-to-day technical details of managing the database. So it's a reminder that change isn't all bad. I asked colleagues and contacts to tell me some of the DBA tasks that they used to have to do that they are glad to be rid of, and these are just a few of the answers: vacuum cron jobs; data file reorganizations; commuting to work, a controversial one, some people probably do still have to do that; writing complicated backup scripts; and installing PostGIS via scripts rather than CREATE EXTENSION. So you've probably all got your own tasks that you are glad to be rid of. So, the million dollar question. A variation of the question "is the DBA role obsolete?" is asked constantly all over the internet and of course at database conferences, and probably more so since Oracle announced its autonomous database in 2017. Is the DBA the clock winder of the future, or the lamp lighter, the switchboard operator, the video rental clerk, or probably more likely the COBOL developer? Because even if things change so much that modern databases don't require a DBA, how many decades is it going to take before all databases are migrated over to this new DBA-less technology? Fortunately, data from the US Bureau of Labor Statistics says that there isn't a shortage of DBA roles. We can see this goes from 2003; 2003 is the first year in which those stats listed database administrator as a separate role. Before that I think it was computer scientists, or I can't remember the exact role, but there wasn't a specific DBA role. From 2020 it's listed as database administrators and architects, so that supports what we saw before about the two roles splitting out from the same thing. So this is the total number of DBAs in the US, in thousands, across the left there, and we can see that it's going up slightly. It's not
growing massively, but it's definitely a slight upward trend, and the Bureau actually projects a 9% growth from 2021 to 2031, which is higher than the average of all other occupations, so that's good. I tried to find similar stats for the UK, but unfortunately the lists of job roles are more generic; they don't have database administrator listed. So instead I've gone to IT Jobs Watch and I've found the numbers of job adverts that are for DBA or database administrator, and those numbers are looking pretty healthy for 2022 and 2023. Well, how about automation? Surely once everything's automated we don't need DBAs. One of my friends is a pilot, and people like to ask him, "Well, the plane just flies itself, doesn't it? You just put it in autopilot; you don't really need to be there as a pilot." And he says, well, yeah, if everything's going well that's fine, you could just have someone with a lot less experience sitting there watching the dials, but as soon as something goes wrong, as soon as there's any kind of emergency situation, you want someone there that knows what they're doing. Okay, it's not usually such a life or death situation if a database has an emergency, but you want someone on hand that knows what they're doing to fix that and get things up and running again. So you still need someone there to make the decisions. And actually, if each database takes less time to manage, that means that one DBA can manage more databases. Well, we've seen how much more data there is and a similar number of DBAs, so that seems to probably work well. What about the junior DBAs? I've not been able to find any figures, but I wonder whether there are fewer junior DBA roles. If automation is replacing a lot of the day-to-day tasks that junior DBAs used to do, do we still have roles for them? And if that's the case, how can somebody become a senior DBA if they haven't been a junior DBA? Does that mean that the career path is changing?
And back to autonomous databases. I mean, if we believe the headlines when Oracle announced autonomous databases back in 2017, DBAs were soon going to be a thing of the past; there was absolutely no need for them. But even if all databases were moved over to autonomous databases, and the day-to-day maintenance and patching is automated, you still need somebody to design the databases, to do the data modelling, to make sure that there's efficient code accessing the database. And actually, if you look at some of the reviews, these are just extracts from a couple of recent ones, apparently it's an extremely complicated thing to put in place, so you still need somebody who's an expert to be able to put that infrastructure in place in the first place. So fortunately the answer to all of that was no, the DBA role isn't obsolete. So what conclusions can we draw from all of this? The DBA role has changed since 1998, and it will continue to change in response to changes in database technology and methodologies. The DBA of 2023 has similar responsibilities to the DBA of 1998 but does them in a very different way, and so they're expected to be able to do very different things. Relational databases are still going strong; they're still the majority and we're not expecting them to go anywhere, but DBAs will be expected to know about other databases as well. The amount of data is growing exponentially, being managed by a similar number of DBAs, so that suggests that we really do still need those DBAs. And things are moving either to the cloud or to a cloud-like infrastructure on premises. So the main thing for me is: don't panic. As we've seen, the quantity of data is growing, a lot of that data is going to be stored in databases, and that's going to have to be managed somehow. Even if things are automated, you want someone on hand in an emergency, you want somebody to design those databases, you want somebody to write and improve the automation code, you want someone to fix the interesting queries that have
been created by ORMs. So it's a role that's evolving, like any other role in IT, like any other role anywhere else, but it's not going away. If you're happy to embrace automation and to try new things, you've got a long and happy career in front of you as a DBA. And even if you're a stick-in-the-mud, old-fashioned DBA and you don't want to change, well, there are going to be enough legacy systems to keep you in work for a long time to come, so you'll be fine until you retire. That's it. I got through that faster than I expected, so I've got time for questions. So, just a note from my side: what I discovered, and it is very interesting for me, is the operator concept in Kubernetes. Instead of using a managed PostgreSQL database, which is expensive, I just use an operator, and the operator itself doesn't just install a database automatically for me, it also does PgBouncer and it does switching from left to right. And I think this will be the future, so a lot of knowledge we have right now will be moved into an operator. It's more central and it will be much better for us. I agree. If you're going to have a containerized environment, operators are definitely the way to go. The problem that I've seen, and I guess because of the role I do I see when there are problems, is that people think "that does everything, we don't need DBAs". So you still need that DBA knowledge behind it, definitely. So, working in smaller companies, the DBA is basically, if you're lucky, the DevOps team and, if you're unlucky, the developer team, which means a lot of people don't have the knowledge or the interest or the time, and things are done in a sort of not proper way. So, as somebody who is interested but has a day job as a developer, how can you foster an environment where developers will start to understand if they're making inefficient queries or if they're building the database wrong, and get them to both know and be interested, and also get the company to kind of care about those things rather than keeping them until it's an
emergency, like you said, when the plane is crashing down? That's a really good question, yeah. I wish I had all of the answers to that, because I've been in lots of organizations as a DBA or helping, and we've had exactly that situation. So I would say one thing is: make friends. Make sure that all teams are actually talking to each other. As a DBA I was always kind of trying to integrate with the other teams and actually be useful. I mean, obviously sometimes I was that typical grumpy DBA: no, you can't do that, no, you can't have all these permissions. I was enforcing rules, but I tried really hard to be there and helpful and show that I could actually help to make this query run faster or help to make that better. And I did training courses for our developers as well, to teach them about the basics of databases. That's great, because I found that most developers are really open to that, because they want to learn. As far as people higher up in the organization go, I'm not sure, I don't know what the answer is there, but probably just showing them how it benefits the organization is the only way to go. Hello, thank you for your talk. I agree completely with your talk. I'm a DBA as well. I have a couple of questions. The first thing is, the role of the DBA is still an old male role, so what can we do to make it more attractive for women?
Because it's something that is very stressful. Another thing, another million dollar question: it's something that has not changed yet. It's still old male, and I don't think it's a good idea to continue this way. I know it's a horrible job, but how can we make it more attractive? And the other thing is, a lot of companies, a lot of my customers, struggle to find a DBA. How can they make it more attractive to get a permanent DBA? Because I give the advice of hiring a DBA, because it's important to have one in-house, but they struggle to find one. Thank you. The first question: I wish I knew the answer to that. I have no idea how to get more women into being DBAs. That's something that's a problem across the whole of the IT community, and something that's not changed since I started 25 years ago. The only thing that I do is I'm part of an organisation that does coding workshops for girls, to try and get people in early. I mean, I was so lucky because my dad was a self-taught coder. He loved IT, he built me my first computer and showed me how to play with it. That's what people need; they need to see that it's fun. I mean, well, we think it's fun. I know it's not everybody's idea of fun. My husband still doesn't understand, and I say, no, I know, but it's my hobby as well. So I think that's it, and that's probably the same answer as for that first question: we need to make this look like fun to people, and maybe we need to try and get rid of that image of the grumpy DBA, I don't know. We need to clean up our ugly aspect. Exactly, more flashing badges. Hello, thank you very much for the talk. I wanted to ask you: I'm a data engineer myself, and I feel that it's complicated to get across what we do every day, because we don't do any very visual things. We're not web developers, who can show really nice pages or nice graphics. Do you have any advice to give for somebody who's in the field of databases and behind the scenes to make their work more
noticeable? Yeah, that's a very good question. I don't know; I suppose it depends which audience you're looking at. Are you looking internally within an organisation, or outside to the general public? Internally, but for people who don't know much about it, people who do other things, who manage it. Yeah, I think some of it is showing the value that they bring. The problem with a lot of things, and I can speak for being a DBA, is that when everything's going right you're not noticed, because, what's the DBA doing? Nothing's going on. We all need to start putting little things in the code to cause problems every so often. So maybe just try and bring attention to the value that you bring. Claire is probably a very good person to talk to later, because, I don't know if you saw Claire's talk earlier, about how to write things in blogs so that you get your point across clearly and that kind of thing. Claire has some very good ideas on how to communicate. |
Deep Dive Into Query Performance |
So, welcome to the PostgreSQL Devroom. If you weren't here before, can I please ask you to silence your phones and extend a very warm welcome to Peter Zaitsev. Okay, well, thank you. We are going to talk about query performance today. But before that, let me understand a little bit who we have here. Now, which of you would mostly see yourself as a DBA, SRE, sysadmin, kind of on the operations side? Just, you know, can we have a... Okay. Now, which of you are developers? Ooh, lots of developers. Okay. Now, in terms of developers, if you do it again, but now for sort of front-end developers, right, something that is not, you know, database kernel or other complicated stuff, but something more simple. Front-end developers. Any? Yeah! Hello! Okay. Well, anyway, one of the goals of this talk for me is really to try to bridge the gap I see a lot between how the operations people, the people who are deeply invested in databases and their development, think about them, versus people who just happen to use those databases for their application, right? And often their relationship to the database is, well, really quite different. As database kernel developers, we often deeply care about all those kinds of internal algorithms, and have, you know, discussions about what is the best way to implement this and that case. But for many developers writing applications, well, you think about databases as you think about plumbing, right? It's just got to work, you don't want to think about it, and if it doesn't, then that becomes a problem. They think about the database in many cases as a black box. And I think that is increasingly happening now, especially when we have so many databases which are deployed in the cloud as a database as a service, right? Because in this case, especially, well, you just have a database and somebody else focuses on the other stuff.
So what does that database mean from a developer standpoint in many cases? Well, that means you get some sort of service point you should use in your application. You can connect to that service point, and do that quickly, with no problem, right? And then you run the queries you need to run against the database, or maybe even whatever your ORM framework generates. Now what do you want from those queries? Well, as a selfish developer, you want those queries to run with no errors. You want to make sure they get you correct results, right? And you want to make sure they run within a response time which is appropriate for your application and for the kind of query. And I think what is very important to understand here is that if I am looking, as a developer, at a database from a performance standpoint, I understand that as how quickly that database responds to my queries, right? Now if you think about software design in general, and I think especially maybe not developers but architects, we often have to care about a whole bunch of other things beyond just the performance. For example, we often have to care about security, right? And typically security costs stuff. It comes with overhead, both in terms of performance overhead and, you know, organizational overhead and so on and so forth. That two-factor authentication always takes another couple of seconds, right? But that makes us more secure. Availability is also important, as well as things like costs. I think that is especially important, again, in the modern age when we have a lot of cloud, which is elastic, right? But that elasticity also comes with spend. You often can say, hey, you know what, if I just need my queries to run faster, I can blow up my instance size, or something else. But, well, guess what, that also will be expensive, right, if you're not doing it efficiently.
And there is, let's say, a bunch of other things you want to consider, right? So I don't want to oversimplify that, let's say, to where everything is only about query performance, but that is what I am going to focus on in my talk. Now when you think about response time from the database standpoint, we often think about that in a query context, right? Well, I see my database responds to the queries in XYZ, you know, on average or something. You think about that on a per-query basis. But if you really look at it from a business standpoint, how your boss, or your boss's boss's boss, thinks about that, it's mostly about the users who are using your applications, right? And I typically would define it this way: what folks are really after is that the users of your applications, and all users, have an outstanding experience in terms of performance for all their interactions. Because in an application you often may have different interactions, and I want to make sure I have a search which is fast, and placing an order which is fast, and whatever other things, all working quickly. Now as database engineers, we often want to talk about performance and availability as different things, right, like saying, well, no, no, no, the database was up, it just was overloaded, so that query took 15 seconds, or 15 minutes, or something like that. But the reality is that, for the user, very bad performance is really indistinguishable from downtime, because, well, A, people have limited patience, right, and if something is taking too long, they're just going to close the page. And even if you have someone with unlimited patience, there is going to be a whole bunch of timeouts, including your browser timeouts, which will, you know, show you that the page cannot load well before 15 minutes, right?
So I think that is another important thing, which I find also important when talking to, maybe, business people about why to spend resources on performance, query performance optimization, and so on and so forth, right? Because, well, you know what, if it doesn't perform, it is down. Another thing I would point out is that in many cases you see people talking about averages, right: well, the query performance was so many, you know, milliseconds or something on average. And while it may be helpful from a comparison standpoint, compared to yesterday, really, it is not very helpful, because behind the average you're looking at there may be way too many queries which are too slow, just balanced by the queries which are fast. And as I wrote here, I really like the saying about the man who tried to cross a river that was, on average, one meter deep. So in this regard, I think it's very helpful to look at things like percentile response time at the very least, right? If you want to look at one number, because you're looking for simplicity, the 99th percentile of query response time is much better than the average response time. What is even better, of course, is to look at some sort of distribution, you know, a query histogram distribution, and how it changes over time. That often can give you a lot of insight.
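To make the point about averages concrete, here is a minimal sketch in Python. The sample is made up for illustration (980 queries at 5 ms and 20 at 2000 ms), and the `percentile` and `histogram` helpers are hypothetical names, not part of any particular monitoring tool; the percentile uses the simple nearest-rank method.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    # Integer ceiling of len * pct / 100, avoiding floating point.
    rank = max(1, -(-len(ordered) * pct // 100))
    return ordered[rank - 1]


def histogram(values, upper_bounds):
    """Count samples per bucket; the last bucket catches everything
    above the largest upper bound."""
    counts = [0] * (len(upper_bounds) + 1)
    for v in values:
        for i, upper in enumerate(upper_bounds):
            if v <= upper:
                counts[i] += 1
                break
        else:
            counts[-1] += 1
    return counts


# Hypothetical sample: most queries are fast, 2% are very slow.
response_ms = [5] * 980 + [2000] * 20

mean_ms = statistics.mean(response_ms)   # pulled up by the slow tail
p50_ms = percentile(response_ms, 50)     # the typical query
p99_ms = percentile(response_ms, 99)     # what the unlucky requests see
buckets = histogram(response_ms, [10, 100, 1000])

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
print("<=10ms, <=100ms, <=1000ms, >1000ms:", buckets)
```

In this made-up sample no query actually takes anything close to the average: the typical query is far faster and the slow tail is far slower, which is what the 99th percentile and the histogram make visible.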
The thing with percentiles, though, is that it's interesting how they work as you go from the query to the user experience we spoke about. Think about this: typically a single user interaction, such as a page view, may require multiple sequential queries, or maybe some queries run in parallel, which all need to be fast in order for the user to get the outcome they're looking for. And then typically a user will have multiple page views through their session. So an excellent 99th percentile may only translate to half the users having that kind of outstanding experience through the whole session. That is why, if you look at companies which have a large number of users, they either have some very high percentiles as a goal, like 99.9th percentile response time, or have those tolerances rather high, so there is additional accommodation for there being many, many queries. Another thing to consider when you measure query performance is how you relate to errors. In certain cases I've seen people saying, well, we are going to measure response time only for successful queries, or we're going to put successful queries and queries which completed with errors in the same bucket, which can really change the picture for you a lot. Why is that?
Well, because if you think about errors, there can be both fast errors and slow errors. Imagine, for example, that a table was dropped for some reason. Then all the queries hitting that table will return an error very, very quickly, because there's nothing they can do. On the other hand, if some data is locked and some timeouts happen, it may take quite a while before the error is returned. You'd better not mix those with the rest of your successful queries, but be able to look at them separately. You also want to look at query performance not just as an overall number but at how it changes over time, with a reasonably high resolution. Why is that important? One thing is that in many cases you would see query performance slowly degrade before it gets so bad that it looks like downtime or a real incident, for all kinds of reasons. Maybe you have some application which has a bad query, and you had one instance of that query running, then two, three, four, five, and now you have a hundred instances of that bad query running, saturating all the system resources. Guess what, the performance of all the other queries is going down. If you are able to notice that some queries are out of bounds, and maybe alert on it, you are able to take action before the small problem basically becomes downtime. The other reason, of course, is that there is shit that is always going on: there is something the database does in the background, and if you are in the cloud there are all sorts of other things happening which you may not know anything about.
Like, for example, elastic block storage and similar stuff. Guess what, it doesn't always have uniform performance. Sometimes something is happening at the Amazon back end, but you don't really know anything about that. They don't tell you each time they have to replace a hard drive somewhere, or rebalance the load for some reason, but those things can pop up. Often you may see something like, oh, I have a spike in query response time which I can see for all queries on all my instances, and wow, that is very likely something environmental. Now, when you look at query instrumentation, one of the questions I see people asking is where you want to instrument the query. We can instrument the query on the application side, where the application issues that query, and we often have tools like New Relic Insights which do just that: hey, that query took this amount of time. This is very good data because it includes the real response time as the application observed it. If there was, let's say, some network delay, for whatever reason, that is included in that response time, whereas if you just measure from the time the database received the query to the time it pushed the result onto the network, that is not included. But measuring on the database gives you a lot of other valuable stuff: you can get a lot more insight into what was going on on the database side while the query was executed. Typically, when you get the query result on the application side, you get the response time and maybe a little additional information, like this query returned so many rows, so many bytes, but not specifically how much CPU it used and all the other important things you may want to know. Okay.
So let's go back to our definition of response time from a business point of view. We can say that what we are looking for is for all users to have an outstanding experience with all of their applications. Great. Now, how do we translate that to the database, and bridge that gap between what the boss wants and what the DBA is able to answer? I think there is some great work in this regard done by Google, which has been working on the SQLcommenter project, which allows you to pass a lot of metadata from your application all the way down to your query. The cool thing they've done is also to integrate that directly with some of the frameworks, so it's kind of, hey, you need to do nothing and you just get that information automatically. What could be valuable query metadata, if you ask me? Well, here is a bunch: the actual user and tenant, and the application or functionality. Often a single database is used by a lot of applications, and we want to know where the query comes from. I see a lot of DBAs, especially at large companies, say, well, you know what, here is this nasty query that came in. It was not here yesterday, but it's very hard to figure out who is responsible for introducing it and how you can come and hit his head with something heavy. That may be hard without proper instrumentation.
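To make the idea of passing metadata down with the query concrete, here is a minimal Python sketch that appends key='value' pairs as a trailing SQL comment, in the general style SQLcommenter uses. The function `tag_query` and the tag names are hypothetical and are not the SQLcommenter library's API; real integrations hook into the framework or driver instead.

```python
from urllib.parse import quote


def tag_query(sql, **tags):
    """Append URL-encoded key='value' metadata as a trailing SQL comment (illustrative only)."""
    if not tags:
        return sql
    pairs = ",".join(
        f"{quote(key, safe='')}='{quote(str(value), safe='')}'"
        for key, value in sorted(tags.items())
    )
    # Drop a trailing semicolon so the comment stays attached to the statement.
    return f"{sql.rstrip().rstrip(';')} /*{pairs}*/"


# Hypothetical tags: which application, tenant and route issued the query.
tagged = tag_query(
    "SELECT * FROM orders WHERE id = 42;",
    application="checkout",
    tenant="acme",
    route="/orders/show",
)
print(tagged)
```

Because the values are URL-encoded, characters such as `*`, `/` and quotes cannot close the comment early. A monitoring tool on the database side can then parse the comment and break response time down by application, tenant or route.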
As a primary breakdown, you also want to look at the query, and by query in this case I mean the query with all parameters normalized, because often you would see that different queries are responsible for different functions and therefore have different response time tolerances. For some very quick lookup queries, you often want them to complete in a fraction of a millisecond to be acceptable, while some of your search queries or some reports may take a few seconds, and that will be quite acceptable. It's good not to mix those all together. In many cases, when you have SaaS applications, you have multiple users, what are often called multiple tenants, and one of the ways to split them is to have different schemas or different databases for each of them. I find it very helpful to be able to separate by that, so you can see, oh, this query is not slow for everybody, but when we drill down, we can see that only that particular tenant is slow. And why is he slow? Because, unlike the others, he has five million images in his album, if you think about some photo hosting application. That's just an example. Another thing we find very helpful is being able to go from a query to the tables it touches and, in reverse, to find all the queries which touch a specific table. Why is that helpful?
Well, in many cases our database operations are table-specific. You may think, hey, I'm dropping this index because I don't need it, or maybe I add an index, I add a column, or maybe I partition the table. You can do a lot of things with a table in scope, and then it would be very interesting to understand how all the queries which touch that table have been affected, because they are much more likely to be affected by that change than everything else. I find that a pretty cool feature. Database user is another one. If you do not have something like SQLcommenter enabled, you often do not see very well which application a given query comes from. But one practice you may follow at least is having different applications touching the same database use different user names, different users with different privileges. If nothing else, that is a very good security practice, and it is what makes filtering and breakdown by user possible. In a large sharded environment, we also want to make sure we aggregate the data from many database hosts and can compare them with each other. Typically, when you have a sharded application, you want the load, and hence the response time, on different database hosts to be kind of similar.
But often it is not. It's hard to achieve a perfect balance between the nodes, which is one cause of the differences, but things may also just happen. You may have settings which drift apart on different nodes, and you may have differences in performance, especially in the cloud, which appear virtually from nowhere. I know a lot of people work in the cloud, and you know, sometimes you just get a lemon, a bad node which for some reason doesn't perform as well as its peers, and you just want to toss it and get another, better one. But to do that, you'd better know that it is not performing particularly well. The same also applies to the application server or web server. If you deploy an application on, let's say, 100 application servers or web nodes, you may say, well, they all should be the same, I have my automation which takes care of that. But again, things are not always as they should be. In many cases something doesn't work out. I have seen so many cases where people say, well, you know what, I already fixed that nasty query and deployed the fix, and when you look at it, it was actually not deployed to all the instances, for whatever reason. Or you may say, I'm using caching to reduce the query load on the database, but that caching is misconfigured or otherwise inaccessible on some of the web nodes. A lot of stuff can happen. Or maybe you're lucky and one of your web nodes was actually hacked and is also sending some additional queries to download your data and send it to someone. So I find that making sure you can look at query patterns separated by the different client hosts is something very valuable.
I already mentioned SQLcommenter, which allows you to pass some additional metadata, and I think that can be quite cool. You can find uses for custom tags in many cases. I've seen people, for example, tagging different instance types, saying, well, this new generation of instance looks good, so let me put some of them in production and be able to compare. Is it actually working better? Sometimes yes, sometimes no. Or the database version: maybe you're rolling out a new minor Postgres release, and you want to do it on some subset of the nodes to make sure there are no regressions. I think it's always good in this case to practice trust but verify, because sometimes you do run into unexpected changes. You can validate configuration changes this way, and so on and so forth. Query plan is another area which I think is quite interesting. In many cases, you'll find the same query, depending on the parameters or some other circumstances, will have different plans. Those different plans may have different performance, and it is very helpful if you can break down the performance by the different plans a query has, so you can understand whether it is a plan issue or not. Otherwise, you may be looking at the query and saying, well, sometimes it's fast, sometimes it's slow, and why is not very clear. The plans give us very good information. Now, when you find a query, see that it is problematic, and need to make it go fast, it's very good to understand where that response time developers care so much about is coming from. And there are quite a few possibilities here. Some of them are instrumented better than others.
For example, if you're looking at data crunching and disk I/O, those are typically pretty well instrumented: you can find out how much CPU or disk I/O a query consumes. Contention is typically more problematic: saying exactly which internal synchronization objects a query had to wait on is more tricky. Waits on CPU availability are even more tricky. What I mean by this is that if you have a system which has many more runnable threads or processes than available CPUs, they will spend a lot of time waiting for an available CPU, and that impact on query response time is very hard to see. You typically can see it from the general node stats, like, hey, my CPU is pegged and I have a ton of runnable processes. In recent kernels you can also see information about run queue latency, which is very cool: that tells you how long processes had to wait to be scheduled on a CPU after they were ready to run. So there's a whole bunch of stuff here. Some of it is easy, and for some of it work still remains.
Now, from our standpoint, with this kind of approach to query monitoring in mind, we have been working on an extension for MySQL, oh, for Postgres, sorry, called pg_stat_monitor. And look, we specifically built it for Postgres, not for MySQL, even though we had a lot more experience with MySQL, because the PostgreSQL extension interface is awesome and much more powerful than MySQL's. You can read a little bit about it here. It is an extension which gives a lot more insight and the kind of slicing and dicing I mentioned. If you think about the traditional PostgreSQL extension, pg_stat_statements, it really aggregates all the data from the start, which is very helpful if it is used directly. We look instead at a modern observability system, where we expect to have many PostgreSQL instances anyway, and some system constantly collecting that data and consolidating it. That means we capture a lot of information but keep it only for a relatively short time in the PostgreSQL instance, and that allows us to get much more granular information without requiring the huge amount of resources which would be needed if you kept it for a long time. You can read more about what it does on the web pages. Now, some folks have asked me, why do you work on a separate extension, pg_stat_monitor? My answer is that we really wanted to experiment with different approaches, to find what works, what doesn't, and what users like, and that is always easier to do in a separate extension. Then, if something is liked by the community, we can see how we can get it into the official list of extensions. So feedback is very valuable.
Also, while we are providing pg_stat_statements compatibility, so you can get that view from the same extension instead of running two extensions with additional overhead, pg_stat_monitor has different ways to aggregate and present the data, which you cannot get in the same view. Okay, now that I have spoken about query performance, I wanted to highlight a couple of other things which are quite interesting to consider when you are looking at queries, where I see a number of issues. One is what I would call bad queries versus victims. In many cases, you may see even your otherwise good queries, like, hey, this is just a lookup by primary key, starting to be a lot slower than usual, not because something changed in relation to that query, but because some other bad queries have been running in parallel. Imagine that you oversaturate your node with hundreds of bad queries running at the same time. Then everything will become slow. I think it's important to understand that if you are seeing some query being slow, you cannot just think about it as that query's problem; it may be something else entirely. The next thing to consider is currently running queries. That is also rather interesting, because they may not be reflected in the log or anything which says, oh, that query completed and it had a five-minute response time, or 15 seconds, whatever. But running queries can be a problem, and in many cases that is actually how things start to snowball: some application, or even a user, starts a lot of bad queries, say they forgot a WHERE clause on a join or something like that, and they just run for a long time. So you want to make sure you're paying attention to that as well.
The next thing to consider is that not all activity is directly visible from a query standpoint. The database often tends to do a bunch of background activities. Additionally, you may have something else going on, like taking a snapshot or taking a backup some other way, which also uses system resources that are not seen from the query standpoint but are just as important. You also have a lot of things which can be happening at the cloud level, which again can be completely invisible to us. Whenever you are looking at query performance, it's important to consider that maybe something is going on in addition to what those queries tell you. The last thing I would mention is sampling. In certain cases I see people saying, well, let us only capture queries over X time. A lot of APM frameworks, for example New Relic and such, may be very focused on that, saying, hey, we are going to give you some examples of the queries which take more than one second, or whatever execution time, so focus on those. And yes, looking at those queries may make sense: if they take a long time, that may be a problem. But it is often your medium-performance queries that create a whole bunch of load on your system and contribute the most response time to the user application, and ignoring those can be problematic.
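The sampling point can be shown with a small Python sketch that ranks queries by total time spent rather than by individual duration. The query texts, counts and timings are invented for illustration.

```python
from collections import defaultdict

# Hypothetical (normalized query, duration in ms) samples.
samples = (
    [("SELECT * FROM report_big", 4000.0)] * 2
    + [("SELECT * FROM products WHERE category = ?", 80.0)] * 500
    + [("SELECT * FROM users WHERE id = ?", 0.5)] * 10000
)


def total_time_by_query(rows):
    """Sum durations per normalized query and sort by total time, largest first."""
    totals = defaultdict(float)
    for query, duration_ms in rows:
        totals[query] += duration_ms
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = total_time_by_query(samples)

# What a "capture only queries slower than one second" rule would see.
slow_only = {query for query, duration_ms in samples if duration_ms >= 1000.0}

for query, total_ms in ranking:
    print(f"{total_ms:>10.1f} ms  {query}")
```

Here the 80 ms query never crosses the one-second threshold, yet it accounts for 40,000 ms of total time, five times more than the slow report that the threshold does capture.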
Well, that is the main overview. I hope that was helpful. My main goal here was to give you some thinking tools. As you noticed, this is not a particularly technical talk which tells you exactly how to find out which indexes to create or something, but hopefully you got some tools for how to start and how to approach this, which can prevent you from tuning by credit card, scaling the instances to inappropriate sizes. Because, hey, that is good for your wallet as well as for the environment; we do not need those servers generating more heat than absolutely needed. Well, with that, that is all I have, and I would be happy to take some questions. Hey, thank you very much for your talk. My question is about when you have to increase the box. As a developer, you are facing a situation where you need to decide between optimizing or asking the CEO to just pay more, because you have a time constraint. Do you have a rule of thumb, where faced with a problem you would say, okay, better to optimize, or better to increase the box? How can you decide? The question is whether it is better to increase the box size or optimize the query. Well, I think it is interesting that it is often not an either/or question. Time in this case is often of the essence, and in many cases I have seen people who have a problem and absolutely need to get the application up scale the box, then concurrently work on the query optimization, and then bring it back and scale down. I think that is a very reasonable approach, because it gives you more breathing room.
What is important in this case, as in many things in life, is not to be lazy. You don't want to just scale the box and forget about it; you want to scale the box, optimize the queries, and so on. Now, when you look at the queries, you can often see which of them are low-hanging fruit and which are already optimized pretty well. If you see that the majority of the workload is driven by lookups by primary key on a table which is already in memory, you can say, well, there is very little I can do to optimize this. If you see a query which does a massive join with no indexes, that's a totally different story: you may be able to make it run a thousand times faster with a relatively easy index addition. Any other question? Hi. As part of your slice-and-dice approach to monitoring queries, would you advise that queries on the application side are never written as dynamic queries or as anonymous prepared statements and only use, say, named prepared statements, so that you know you have a fixed set of queries that are always the same? Well, I would say that's kind of putting the cart before the horse. From my standpoint, you can of course talk about those kinds of practices, but developers like to do what keeps them productive, and in many cases saying, well, you know what, don't use ORM frameworks, or advising something like that, is complicated. Now, even if you're using dynamic queries, typically they're still going to come down to a limited number of variations, and especially a limited number of the most important variations, and you will still see that from the query type.
In many cases, if you look at the whole set of queries, you would find that this application has, let's say, 10,000 distinct queries, but the top 20 are responsible for 90 or 99 percent of the response time. That of course can change, but focusing on those first, as well as maybe taking care of outliers, is a good practice for how you deal with the information you have. Makes sense? Any other question? Hello, thank you for the talk. What is the overhead of collecting these statistics? Because if you have very, very many... That is a good question. I would say it varies. Typically there is more overhead if you have very simple, fast queries; if you have long queries which take many seconds, it's less. Our design goal, which we are able to meet, is to be similar to pg_stat_statements, a couple of percent or so. In my opinion, many people, when they think about observability, tend to obsess about the overhead, but really, having those insights often allows you to get so many things optimized where they matter that the benefits far outweigh it. Do you have any advice for catching bad queries before they reach production, and kind of guarding against these things? Oh yeah, absolutely. Like missing indexes or whatever, before they even.
That is a very good question. I didn't talk about this, but it's also a question of where you deploy your monitoring, and I think that is where an open source solution is very helpful, because you can really deploy it everywhere, including your CI/CD environment. What I often hear people say is, well, you know what, Datadog is expensive, so it's only in production. What you want to do is make sure you have solutions in development so you can catch bad queries before they hit production, but also assume you're not going to catch all the bad queries; some queries will only misbehave in production. The other good practice which goes with that is to make sure your test environment is good, so you can test a variety of queries relevant to your application and you have a good data set for that. In this regard, I think there are some cool features coming out from Neon, for example, like branching, where you can get a full copy of the production database, mess with it, and run tests on a full-size data set instead of testing on a table with 100 rows, which is kind of useless. Cool. Any other question? Okay, thank you very much. Okay, thank you. Thank you very much. |
Don't Do This |
But what you should not do in Postgres, so please welcome Jimmy Angelacos. Thanks very much. I'm a Senior Solutions Architect at EDB and I am grateful to EDB for allowing me to make Postgres my day job because it is an excellent database, it is an excellent community and thank you all for attending a talk with such a clickbaity title. And thank you to the guys at home for clicking. So why this title? I didn't come up with it. So this title is the title of a Postgres Wiki page that's called Don't Do This. And I got all the content from there. So that's the end of the talk. But no, anyway, so this talk is not all inclusive, right? I'm not going to tell you all the mistakes you can make with Postgres. Who can? I mean, there is literally nothing that you cannot mess up with no matter which database you use. You can always find a way to mess up. But these are some of the things that we've noticed that people are doing wrong in general with Postgres. So some of them are misconceptions. Like I believe this thing works this way, but it doesn't. Some things are confusing because of the way they're implemented in Postgres, especially things that are not part of the SQL standard, but Postgres extensions to the SQL standard. So to be fair, Postgres is the most SQL standard compliant database. It just has some things on top of it. Other databases implement a subset of the SQL standard and also confusing things. So we're a bit better from that respect. And some common mistakes that people make that usually have a significant impact in production environments. So we'll be looking at some bad examples of SQL that you can write in Postgres. We'll be looking at some improper data types for storing certain things. Andreas had a good talk this morning about this, covering many of the same topics. We will be looking at wrong ways to use Postgres features. And also some things that affect your performance and affect the security of the server that you need to be aware of. 
So let's start off with some bad SQL. First and foremost, NOT IN. As in the Boolean NOT IN, right? It doesn't work the way you expect it to when you're writing select something where something else is not in this subquery. You have to keep in mind that SQL, and Postgres by extension, is not Python and it's not Ruby, so it doesn't behave the way you expect if you're used to writing not-in Booleans in programming languages. So select a from table one where a is not in a list of 1, a constant, and null, returns nothing, because if you perform a NOT IN and there's even one null, the result is null. Not false, null. Equally, a more real-world scenario: select a from table one where a is not in select b from table two. If even one b is null, then the whole result is null. So it's not doing what you're expecting it to. Now let's say that table two has no null b's; b is NOT NULL. Why is this still bad, and why should you not use it? Because it doesn't optimize well in the Postgres query planner. Instead of performing what is known as an anti-join, which is the complete opposite of a join, show me the rows you cannot join from this table, the Postgres query planner chooses a sub-plan. If that's a hashed sub-plan, that's kind of okay. If it's a simple sub-plan, then the performance of this thing is disastrous. So even if you don't have nulls, you don't want to use it. What should you use instead? You should use an anti-join, as we just said, which looks something like this: select column from table one where not exists, which is a better way to write NOT IN. So where not exists select column from table two where the table one column equals the table two column. You want the rows in table one that table two can't match up to. That's an anti-join. Or another way you could write this is select column from table one and use a left join.
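The NULL behaviour of NOT IN described above can be reproduced in a small Python sketch. It uses SQLite through the standard library purely as a stand-in, on the assumption that its three-valued logic for NOT IN matches the SQL standard behaviour described for Postgres; the planner and sub-plan remarks are Postgres-specific and are not demonstrated here. The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER);
    CREATE TABLE table2 (b INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (NULL);
""")

# NOT IN: for a = 2 and a = 3 the comparison against NULL yields NULL, not true,
# so those rows are filtered out and nothing comes back.
not_in_rows = conn.execute(
    "SELECT a FROM table1 WHERE a NOT IN (SELECT b FROM table2) ORDER BY a"
).fetchall()

# NOT EXISTS: the anti-join form simply finds no matching row for 2 and 3.
not_exists_rows = conn.execute(
    "SELECT a FROM table1 t1 WHERE NOT EXISTS "
    "(SELECT 1 FROM table2 t2 WHERE t2.b = t1.a) ORDER BY a"
).fetchall()

print("NOT IN:    ", not_in_rows)
print("NOT EXISTS:", not_exists_rows)
```

With a single NULL in table2, the NOT IN query returns no rows at all, while the NOT EXISTS form returns the two rows that have no match.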
So left join table two using the column, using this Postgres shorthand for join on column equals column, and in this case I'm using it because the column has the same name in both tables. So left join where table two dot col is null. What does that do? If it cannot find matches for the left-hand side on the right-hand side, then the right-hand side, the result from table two, is a null. And that's how you get your anti-join. To be fair, NOT IN is okay if you know that there are no nulls, though you cannot really know that for a table and, as we said, it has performance implications. But when you're excluding constants, that's fine, because if you have an index and you're able to tell that none of these are in the index, then you're fine to use NOT IN. But generally speaking, try to prefer NOT EXISTS or anti-joins. Another thing that we've seen people use the wrong way without knowing is BETWEEN, especially when you write a query with a WHERE clause that specifies between timestamp one and timestamp two. Why is that? Because between A and B is inclusive. It's a closed interval. So when you're saying between one and 100, you're saying include one and also include 100 in the results. When is this bad? This is bad when you're a bank, let's say, and you want to sum up the amounts from all transactions for the day. And your DBA has written the following query: select sum of the amounts from transactions where transaction timestamp is between the end of the previous day and the end of the current day. So it should be fine. No, it's not. Because if a transaction has happened exactly at midnight, you'll get it twice. When you run that query tomorrow, it's going to return the same row, because you've included midnight in both queries. So that's a bad thing.
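The double counting at midnight can be reproduced with a short Python sketch, again using SQLite as a stand-in and storing timestamps as ISO-formatted text so that they compare correctly as strings. The table name, column names and amounts are illustrative assumptions. The sketch contrasts the closed BETWEEN interval with a half-open interval that includes the start and excludes the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (ts TEXT, amount INTEGER);
    INSERT INTO transactions VALUES
        ('2023-02-04 09:30:00', 100),
        ('2023-02-05 00:00:00', 50),   -- exactly at midnight
        ('2023-02-05 14:00:00', 25);
""")


def daily_sum(where_clause, start, end):
    """Sum amounts for one day using the given WHERE clause with two placeholders."""
    sql = f"SELECT COALESCE(SUM(amount), 0) FROM transactions WHERE {where_clause}"
    return conn.execute(sql, (start, end)).fetchone()[0]


days = [
    ("2023-02-04 00:00:00", "2023-02-05 00:00:00"),
    ("2023-02-05 00:00:00", "2023-02-06 00:00:00"),
]

closed = [daily_sum("ts BETWEEN ? AND ?", s, e) for s, e in days]
half_open = [daily_sum("ts >= ? AND ts < ?", s, e) for s, e in days]

print("BETWEEN daily sums:  ", closed, "total", sum(closed))
print("half-open daily sums:", half_open, "total", sum(half_open))
```

The real total is 175. The BETWEEN version reports 150 and 75, which adds up to 225 because the midnight transaction lands in both days, while the half-open version reports 100 and 75.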
So it's better to be explicit instead and use SELECT sum(amount) FROM transactions WHERE the transaction timestamp is greater than or equal to the start AND the transaction timestamp is less than the end, excluding the equality with midnight, right? So that is very, very safe. And there's no way to read it wrong. It's very explicit, very clear. Another thing: using uppercase in identifiers. Many people like to do this because it looks very professional, because they're used to some database that was out there in the 80s that could only support uppercase table names. And that database can now use lowercase, but the habit is still there. Now why is that a bad thing in Postgres? If you use table or column names that are all capitals or mixed case, Postgres will just ignore you and make everything lowercase unless you use double quotes around the names. So CREATE TABLE Plurp and CREATE TABLE "Quacks". What are the consequences of issuing these two DDLs? It creates a table named plurp, lowercase, and a table named Quacks with a capital Q. Why is that a problem? So TABLE here is shorthand for SELECT * FROM plurp. TABLE Plurp works because it's not quoted, so Postgres ignores the case. TABLE "Plurp", quoted, even if it's exactly the same way we specified it when we were creating the table, will fail, and it will say there's no such table. Equally, TABLE Quacks fails because there's no lowercase table quacks. TABLE "Quacks" in double quotes works fine. So you can see how you can mess up your schema with this. If you give your schema to a developer and they're not aware that there's a difference between double-quoted and unquoted table names, then you get in trouble. I think .NET by default, even if you don't do anything, double quotes everything. So if you make the mistake of including capitals there, then they're not going to work in Postgres. Unless you create the tables from within .NET, that is. So the same goes for column names.
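The double counting with BETWEEN, and the half-open fix, can be sketched as follows. SQLite (via Python's sqlite3) stands in for Postgres here, timestamps are stored as ISO-8601 text so that string comparison matches chronological order, and the table contents and the two daily windows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (ts TEXT, amount INTEGER);
    INSERT INTO transactions VALUES
        ('2023-02-03 18:30:00', 10),
        ('2023-02-04 00:00:00', 100),  -- exactly midnight
        ('2023-02-04 09:15:00', 20);
""")

def total(sql, start, end):
    """Sum of amounts for one reporting window."""
    return conn.execute(sql, (start, end)).fetchone()[0]

# Closed interval: both endpoints are included.
closed = ("SELECT COALESCE(SUM(amount), 0) FROM transactions "
          "WHERE ts BETWEEN ? AND ?")
# Half-open interval: start included, end excluded.
half_open = ("SELECT COALESCE(SUM(amount), 0) FROM transactions "
             "WHERE ts >= ? AND ts < ?")

# Two consecutive daily reports that share midnight as a boundary.
days = [("2023-02-03 00:00:00", "2023-02-04 00:00:00"),
        ("2023-02-04 00:00:00", "2023-02-05 00:00:00")]

closed_total = sum(total(closed, s, e) for s, e in days)
half_open_total = sum(total(half_open, s, e) for s, e in days)

print(closed_total, half_open_total)
```

The three transactions add up to 130. The half-open reports agree with that, while the BETWEEN reports count the midnight transaction in both days.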
If you want pretty column names in your output and your reports, then just use SELECT col AS "Pretty Name". Double quote the pretty name. It can have spaces, it can have emoji, whatever you want. And Postgres will just return exactly that name, and you don't have to change your column name on your table to make accounting happy. Now moving on from SQL, let's look at the wrong use of some of Postgres' built-in data types. Again, timestamps. If you create a column that is type timestamp, that means timestamp without time zone. So these are naive timestamps, and they represent a local time somewhere. But you don't know where. It stores a date and a time with no time zone information. There's no way to retrieve the time zone where this row was inserted. And why is that a bad thing? Because arithmetic breaks down totally. You cannot add and subtract dates and intervals and anything else, because you can't make computations on what the time would be, because of things such as time zone changes and daylight saving time. So it's meaningless, it will give you the wrong results. So instead please use timestamptz ("tee-zee", or "tee-zed" if you're British). Timestamp with time zone is the equivalent. timestamptz is the shorthand, and that stores a moment in time. A moment in time means the number of seconds that have passed from midnight at the beginning of the first of January 2000. So it's absolute, it's definite, and you know exactly which moment in time you're specifying. The arithmetic works correctly, as you would expect. And this by default displays in your time zone, but you can also choose to display it AT TIME ZONE. So if you've inserted something which is midnight UTC and you want it in Eastern time, if you said AT TIME ZONE Eastern it would automatically convert it to minus five hours, or minus six hours if there's a DST difference between the two time zones. So you don't have to worry about the conversions.
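Why arithmetic on naive timestamps misleads can be illustrated outside the database with Python's datetime module. This is a rough analogy rather than Postgres itself: the two fixed UTC offsets below are hypothetical stand-ins for two sites, and fixed offsets deliberately ignore DST rules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets standing in for two sites (no DST handling).
BRUSSELS = timezone(timedelta(hours=1))
NEW_YORK = timezone(timedelta(hours=-5))

# The same wall-clock reading, recorded at two different sites.
wall = datetime(2023, 2, 5, 9, 0)

# Naive values: the zone is lost, so the two events look simultaneous.
naive_gap = wall - wall

# Zone-aware values: each one pins down an absolute moment in time.
aware_a = wall.replace(tzinfo=BRUSSELS)
aware_b = wall.replace(tzinfo=NEW_YORK)
aware_gap = aware_b - aware_a

# An aware value can be redisplayed in any zone without guessing.
as_utc = aware_b.astimezone(timezone.utc)

print(naive_gap, aware_gap, as_utc.isoformat())
```

The naive subtraction reports no gap at all, while the aware values show that the two events are six hours apart.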
Just use timestamp with time zone and you won't have to worry about it. Even if you don't need time zone calculations, and all of your operations and all of your queries are coming from within the same time zone, it's better to use this. Because then when you have to export your data and give it to someone else, they know exactly what this means even if they don't know your time zone. Also, if you've decided to only use UTC throughout your organization, then don't use timestamp to store UTC, because Postgres doesn't know it is UTC. It just sees a local time and doesn't know where it is, so it can't convert it. Now something less frequently used is the type timetz, or time with time zone. That is a quirk of SQL. It is there because the standard specifies it, and that's the only reason why Postgres has implemented it. So time with time zone has questionable usefulness, because time zones in the real world have little meaning without dates. It can be the middle of the day in Australia and the previous day here. So it will be a time in some time zone, but the date is different and you don't know it. The offset can vary with daylight saving time, and that's a bad thing, because timetz has a fixed offset, and that makes it impossible to do date calculations across daylight saving time boundaries. So just use timestamptz instead. There's also a space saving. For some reason this thing is 12 bytes. I don't know why. A timestamp is 8 bytes. So just use timestamptz or timestamp with time zone instead. current_time is another favorite. current_time is timetz, and we just said don't use timetz. Just use current_timestamp or the function now() to get the current time with the time zone, and localtimestamp, which returns a timestamp, if you just want to know what time it is here in your local time zone. Equally you can use current_date for the date and localtime for the local time. These are not timestamps.
These are dates and times. This is one of my favorites. This morning Andres showed that many people, when they want to store a string, just create char(255). That should take care of it. What is the problem with that? It's that this is padded with white space up to n. So if you create a char(255) and you insert a single character to store, then that inserts 254 blank spaces after it in the database for no reason. The padding spaces are useless because they are ignored when comparing, but equally they create a problem because they don't work for LIKE expressions and they don't work for regular expressions, because a regex will see the spaces. So it's inconsistent. So just don't use it. And anyway you're not gaining anything by specifying a limit on the number of characters, because it's not even stored as a fixed-width field in Postgres. The storage is exactly the same. You're just wasting space by adding white space. Performance-wise it's even worse, because Postgres is spending extra time discarding those trailing spaces when you're requesting a result where it's supposed to ignore them. Another consequence of char(n) is that an index created for a character of n length may not work with a query that accepts a text parameter or a varchar parameter with no limit. The index is created for a different data type, therefore it does not apply to that query. Also, limits are bad, always. Limits on strings are bad. If you create a company name column and you think, "50 characters are enough, I don't know any company name that is more than 50 characters," then you get a customer that's called Petersons and Sons and Friends Bits and Parts Limited, which is 54. And then you have to go and change the column width in the database, and your DBA starts swearing even though they selected the character length themselves because they were told to. Also, it's useless for restricting length. It throws an error, okay, but it doesn't make sure that the length is exactly what you want.
So if you want a four-digit PIN and you declare it as char(4), that is not enforced if someone enters a three-digit PIN. You need an extra check, so it doesn't guarantee anything. So to restrict length and make sure that the length of what everyone enters is consistent, use a check constraint and enforce it. So the bottom line is: just use text. Text is the same as the confusingly named varchar with no parentheses. So, text. Money: get away from the type money, because it's useless. It's fixed point, which means that it doesn't handle fractions of a cent. So for finance that's very bad, because you usually have subdivisions of the lowest denomination of currency, whether it's a cent or a penny or whatever else. So the rounding may be off, and that is a bad thing in finance. Another bad thing is that it doesn't know which currency it's storing the values for. It assumes that the currency is what you specified in lc_monetary. And if you don't know what lc_monetary is, it's just going to assume whatever it finds in your UNIX or Linux configuration. Even worse, it accepts garbage input. So if you select that thing and convert it to money, it casts it to whatever it believes is right. And because my laptop was set up for UK pounds, it assumed that that's UK pounds. So just use numeric and store the currency in another column for that row, with a foreign key, so you know which currency that is. Serial: how many people here use serial and like it? So I will explain why you shouldn't like it. It used to be useful shorthand, it is still useful shorthand, but it's now less useful than it used to be, because it's non-SQL-standard and it messes up the permissions when you use it. Permissions for sequences created automatically using the serial keyword when creating a table need to be managed separately from the table.
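The point about fractions of a cent can be made concrete with Python's decimal module. This is a sketch under stated assumptions: a value rounded to two decimal places models a fixed-point money-style column in a locale with two fractional digits, an unrounded Decimal models numeric, and the 0.5% fee on a 0.99 payment is an invented example.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")

# Hypothetical per-transaction fee: 0.5% of a 0.99 payment.
fee = Decimal("0.99") * Decimal("0.005")  # exactly 0.00495

# Fixed-point storage: every fee is rounded to whole cents before it is stored.
rounded_each = sum(
    (fee.quantize(CENT, rounding=ROUND_HALF_EVEN) for _ in range(1000)),
    Decimal("0"),
)

# Arbitrary-precision storage: keep the fractions, round once when reporting.
exact_total = (fee * 1000).quantize(CENT, rounding=ROUND_HALF_EVEN)

print(rounded_each, exact_total)
```

Over a thousand transactions, rounding each fee to cents loses the whole amount, while keeping full precision and rounding once at the end gives 4.95.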
So a consequence of this disconnect is that CREATE TABLE ... LIKE another table, with a table that uses serial, will use the same sequence from the other table. And you don't want that, usually. So instead we've come up with identity columns, which are more verbose but much clearer in what they do, because they're attached to the table that created them. So CREATE TABLE with id bigint GENERATED BY DEFAULT AS IDENTITY and also PRIMARY KEY. With an identity column you don't need to know the name of the sequence. So when you ALTER TABLE tab ALTER COLUMN id RESTART WITH 1000, you don't need to know what the sequence is called. It's attached to the table, so it will just restart the sequence from a thousand. A side note here: if your application is depending on a serial sequence to generate things like receipt IDs or receipt numbers, that is something you should generally generate in your application to make sure that there are no gaps, because there are no guarantees whatsoever that a sequence in Postgres will have no gaps. If you try to insert something and there's an error and you roll back, you've skipped over that sequence number. It never goes back. Cool. So now let's look at improper usage of Postgres features. Character encoding SQL_ASCII. It is not a database encoding that you should be using unless you know exactly what you're doing. So things like storing text from the 1960s, when there were no character sets other than ASCII. When you specify that your database encoding is SQL_ASCII, you are skipping all encoding conversion and all encoding validation. So it will accept just anything, and it will assume that if your character has a byte value from 0 to 127 it's ASCII, and if it's over 127, up to 255, then it will not even try. It will just store it and not interpret it as anything. So it doesn't behave the same way as a character set setting, and it's very bad that this is the default.
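What skipping encoding validation leads to can be shown with plain byte strings. This is only an illustration in Python, not Postgres: the list stored plays the role of a hypothetical text column in an SQL_ASCII database that received the same word from two clients using different encodings.

```python
# The same word, sent by two clients with different client encodings.
word = "caf\u00e9"  # "café"
utf8_bytes = word.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = word.encode("latin-1")  # b'caf\xe9'

# Hypothetical column contents: raw bytes accepted without any validation.
stored = [utf8_bytes, latin1_bytes]

def try_decode(raw, encoding):
    """Decode raw bytes, returning None when they are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None

# Reading everything back as UTF-8 fails on the Latin-1 row.
as_utf8 = [try_decode(r, "utf-8") for r in stored]

# Reading everything back as Latin-1 never fails, but garbles the UTF-8 row.
as_latin1 = [try_decode(r, "latin-1") for r in stored]

print(as_utf8, as_latin1)
```

No single encoding recovers both rows, and nothing in the stored bytes records which encoding each row came from.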
Fortunately, most distributions, the packages that Devrim makes for distributions, have UTF-8 as the default. So that's a safer choice. Also, when you use SQL_ASCII, you can end up storing a mixture of encodings, because it doesn't check and validate anything. So once you've done that, there's no going back. There's no way to recover the original strings, because you don't know which encoding they came from. Rules. Rules are a thing that predates SQL in Postgres, from when it was just Postgres, not PostgreSQL. It's a very old thing that has its specific purpose, and its purpose is not to work like a trigger. Rules do not apply conditional logic. They rewrite your queries to modify them or add extra queries on top of them. So any rule that's non-trivial, so any rule that's not like a select or an update into a view, is going to have unintended consequences, because it's going to execute the original query if it's an insert, and then apply the rule, and then potentially generate another row or change the value of the row you inserted. Also, as we said, it's older than SQL in Postgres and it's non-SQL-standard. So unless you're using rules to create views that you can write to, use a trigger instead. That's what you want to use. There's an exhaustive blog post by depesz that you can read. You will find the link in the slides afterwards. Table inheritance. Table inheritance is a relic of the time of object-oriented databases. If you remember, up on our website, we used to say that Postgres is an object-relational database. Maybe we still do. Okay. But everything in Postgres is an object. Fine. That doesn't mean that inheritance should apply to tables. It seemed like a good idea, before ORMs, that you would have some sort of inheritance from a table type to another table type. And the way you would write that was CREATE TABLE events, let's say, with an ID and some columns, and then create a table meetings. Meetings are events, right?
And they have a scheduled time, but all the other characteristics of an event. So why not CREATE TABLE ... INHERITS the other table? It was also used to implement partitioning in Postgres before Postgres 10, but it is incompatible with the new way of partitioning since Postgres 10. So you cannot inherit from a partitioned table, and you cannot add inheritance to a table that's partitioned. So if you've got it in your database, there is a way to undo it, and I will just skim over it. You can replace it with a foreign key relationship between the two tables. And it works exactly the same way. So CREATE TABLE new_meetings (LIKE meetings). Table inheritance is scary. I apologize. It's not for young guys. So CREATE TABLE new_meetings (LIKE meetings) creates it in exactly the same way. ALTER TABLE to add another column to store the foreign key relationship. So that should have been event_id, excuse me. Anyway. So you copy the data from the old table into the new table. So INSERT INTO new_meetings, SELECT everything FROM meetings, including the ID. You create the required constraints, triggers, et cetera, everything you need for the table new_meetings. And if you have a very large table, you can apply a very dirty hack that says that because I know that the data in the other table is valid, I don't need to validate it again. So I add the constraint, the foreign key constraint, as NOT VALID. If you're doing this in a live system that needs to be online while you're making this change, create a trigger so that changes coming into meetings can go into new_meetings as well. And the dirtiness of the hack comes in the fact that you should really not be touching pg_catalog at all, but if you do know that your constraint is valid because the data in your existing table is valid, you just go ahead and UPDATE pg_constraint SET convalidated = true for that foreign key constraint we just created.
And then finally, in order not to do lengthy locking when you're doing this, begin a transaction in a code block, an anonymous code block. You ALTER TABLE meetings RENAME TO old_meetings. Then you take new_meetings, which now has exactly the same content with an additional column, and you rename it to meetings. You drop the old table, and then you commit. Be careful: also create a trigger to insert, update and delete items in events as they get changed in meetings. And that's about it. You've gotten rid of your table inheritance. Another very confusing thing: if you look at the Postgres documentation, it explains very well how to do this, but this is probably not what you want to do. So partitioning by multiple keys is not partitioning on multiple levels, right? So let's say we create a table transactions, and it has a location code and a timestamp among other columns. And I want to partition it by timestamp and also location code, because I want a separate table for each time period for each location code, right? So I create table transactions 202302A for values from timestamp 2023, so the first of February to the first of March, and for location codes AAA to BAA. Then I create the second partition, and 202302B is a partition of transactions for values from the same time period but different locations, okay? So I'm using locations BAA to BZZ. Error: partition transactions 202302B would overlap. Why is that? Because you're specifying limits for the keys within each partition. So it will accept values that satisfy those keys, but this is not subpartitioning. What you do want is subpartitioning. You want to partition by one key, and then partition those tables by another key. That is the way to do it correctly. So you create table transactions, location type, et cetera, et cetera, PARTITION BY RANGE of timestamp first, okay? Because we want the first level of partitioning to be timestamp based.
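The overlap error in the multi-key example above is easier to see once the two bounds are treated as tuples. The sketch below assumes, in line with the PostgreSQL documentation on range partition bounds, that multi-column bounds are compared row-wise, which is how Python compares tuples; the helper overlaps and the sample row are illustrative and not part of any Postgres API.

```python
from datetime import date

def overlaps(lo1, hi1, lo2, hi2):
    """Half-open ranges [lo, hi) overlap when each starts before the other ends."""
    return lo1 < hi2 and lo2 < hi1

# Bounds from the talk: (timestamp, location code) FROM ... TO ...
part_a = ((date(2023, 2, 1), "AAA"), (date(2023, 3, 1), "BAA"))
part_b = ((date(2023, 2, 1), "BAA"), (date(2023, 3, 1), "BZZ"))

# Compared as tuples, the second partition starts inside the first one.
multi_key_overlap = overlaps(*part_a, *part_b)

# A mid-February row with location 'ZZZ' still falls inside partition A,
# because the location code only matters when the dates are equal.
row = (date(2023, 2, 15), "ZZZ")
row_in_a = part_a[0] <= row < part_a[1]

print(multi_key_overlap, row_in_a)
```

The location bounds only act as tie-breakers on the boundary dates, which is why one partition per period per location range needs a second level of partitioning.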
Then you create the table partitions as, excuse me, as a partition of transactions for values from the first of February to the first of March, and we choose hash partitioning within those partitions for the location code. And all that means over there is that when I create the first partition, it's for values WITH (MODULUS 4, REMAINDER 0), which just means divide it into four equal parts. And that creates a table that is partitioned by both things: subpartitions. Now let's talk a little bit about performance. One thing we see people doing all the time is using many more connections than they should be, accepting many more connections into their Postgres server than they should be. The default is very sensible, it's at 100 connections. We see things like 5,000 connections in production. And a server with 32 CPUs, there's no way on Earth it's going to do more than 32 things at the same time, right? It's common sense, okay? You may accept up to 100 things with 32 CPUs and interleave and overlap, that's fine. Or one of the connections may be idle and you take advantage of that to serve the other connections, but 5,000 is excessive, and we'll see why. Because Postgres is process-based, and for every new client connection it spawns a new process. And a new process comes with inter-process communication through semaphores and shared memory, and that has an overhead. So every process you add to the system adds to that overhead, and you run the risk of your CPU spending most of its time doing context switching between one process and the other. Also, accessing the same objects from multiple connections may cause many lightweight locks to appear, what are called latches in other databases. And if you're trying to access the same objects from many client connections, then that lock, even if it's not explicit, becomes heavily contended, and the connections trying to access that object will slow each other down.
So instead of opening one connection that does 400 times the work, you open 400 connections that do one 400th of the work, and that doesn't perform the same. That performs worse, because it's making your data hotter for no reason, because they compete for access to that data. And also there's no fair queuing, it's more or less random. Lightweight locks don't have queuing, so you don't know who will get priority, and there's no guaranteed quality of service. Now, mitigation strategies. Also, you need to be aware that before Postgres 13 there's the issue of snapshot contention. Each transaction keeps an MVCC snapshot even if it's idle, and so you can end up using server resources even for idle connections and slow everything else down. So this is contention that is caused by too much concurrency. So instead of opening 5,000 connections, just put PgBouncer or another connection pooler in front of your database, and just allow fewer connections into the database while accepting the client connections in the connection pooler. That way you throttle, or you introduce latency on the application side, but that's not always bad, because in some cases it can protect your server's performance, which is more important than making, let's say, a non-interactive client wait for a few milliseconds more. It sounds counterintuitive, but it leads to higher performance overall. A high transaction rate is also a problem, when you're burning through transactions very quickly. There's a lot of detail here about the way transaction IDs work in Postgres, but the bottom line is that there are 4.2 billion transaction IDs. The future for you is 2.1 billion transactions ahead, and the past is another 2.1 billion transactions.
So if you are writing at a huge data rate, with, let's say, an OLTP workload that can go through 2.1 billion transactions in a week, that will overrun the last transaction, and you will no longer know whether that transaction is in the past or in the future, and that's a problem. Postgres won't let you do that, it will shut down to avoid doing that, and the solution that we came up with is called freezing, where you go through the table and you mark each row that you know to be old as frozen, and you know that that row is always in the past even if it has a transaction ID from another time. So the problem is you need to make sure that Postgres has the chance to freeze those rows before the wraparound. So what can you do? You can reduce the number of transactions, you can use batching. Instead of committing 100 things or 1,000 things one by one, just batch them, and that automatically gives you one thousandth of the transaction rate that you would have, and that helps. Also it helps to bump up the effectiveness of autovacuum, and that takes care of freezing. Another favorite is people that turn off autovacuum, so the thing that actually makes multiversion concurrency control work. So don't turn it off. Its work is removing dead tuples and freezing things, among other things. It does have overhead, because it scans tables and indexes and acquires locks and gives them up voluntarily, and that's why it has limited capacity by default. But the defaults are not suitable for a production workload. So if you're concerned about the overhead of autovacuum, then turning it off is not the solution, because the alternative is worse. You risk shutting down your database or accumulating bloat, because there's no way to avoid vacuum in Postgres yet. And when you outrun vacuum by writing faster than your database can autovacuum, then you may end up with a bloat runaway that requires a VACUUM FULL, and that takes a total lock on the table and nobody can use it.
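The "2.1 billion in the past, 2.1 billion in the future" picture follows from comparing 32-bit counters on a circle. The sketch below is a simplified model: the function names are made up, and Postgres reserves a few special transaction IDs that are ignored here.

```python
XID_SPACE = 2**32  # total number of 32-bit transaction IDs (about 4.29 billion)
HALF = 2**31       # how far the past and the future each reach

def precedes(a, b):
    """True if transaction ID a counts as 'in the past' relative to b."""
    diff = (a - b) % XID_SPACE
    if diff >= HALF:  # reinterpret the difference as a signed 32-bit value
        diff -= XID_SPACE
    return diff < 0

def xids_used(rows, batch_size):
    """Transactions needed to write `rows` rows at `batch_size` rows per commit."""
    return -(-rows // batch_size)  # ceiling division

old_row = 100

# With the current transaction a million IDs ahead, the old row is in the past.
seen_recently = precedes(old_row, 1_000_000)

# More than 2**31 transactions later, the same unfrozen row appears to be in
# the future, which is the situation freezing exists to prevent.
seen_after_wrap = precedes(old_row, old_row + HALF + 1)

# Batching: one commit per 1,000 rows consumes a thousandth of the IDs.
unbatched = xids_used(1_000_000, 1)
batched = xids_used(1_000_000, 1_000)

print(seen_recently, seen_after_wrap, unbatched, batched)
```

Batching slows the counter down, and freezing removes old rows from the comparison before the counter catches up with them.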
So instead of turning off autovacuum, actually make it work harder, and you can find in the Postgres documentation how to make it work harder in order to avoid bloat and transaction ID wraparound. There's some standard stuff here about explicit locking. If your application needs to lock things to make sure that concurrency, oops, out of power, can I use something else? I have a copy. Okay, so we're only like two or three slides from the end. If you're really interested in knowing them you can talk to Jimmy afterwards, but you can now ask questions about what he already talked about. So we have like five minutes for questions. If you have a question, please raise your hand and we are going to bring the microphone to you so you can ask questions. There is a question there. Good. Thanks, it's on the website. Is there any difference in how varchar and varchar(n) are stored on the disk? Sorry, I didn't hear your question. Is there any difference in how varchar and varchar(n) and text are stored on the disk? No, varchar is exactly the same as text. It's the same type. Okay. So it doesn't matter, like, also for indexes, like I know in MySQL. No, no, it doesn't make a difference, but varchar with a limit is a different type. Got it. Thank you. Thanks. Another question? Just one of the browser. Questions, questions. Jimmy, I have a question. So you were talking about money, and why is money actually implemented? Is it SQL standard or? Connected. Sorry, what was the question? If money is so bad as a data type, why is it implemented in Postgres? Because it was actually deprecated because of those bad things that we talked about. Twice. Twice. As Andreas pointed out this morning, people requested it, so we reinstated it twice. Oops. There you go. So people wanted it. Okay. People want money. So. Different kind of money, exactly. Any other questions? Okay, we have another question here. Quick question about table inheritance.
So I know, I've read the Postgres documentation about all its flaws and why you shouldn't use it, especially now that there's partitioning. But overall I think the idea of having tables that have some common columns but then diverge on some others is an interesting idea. There are other ways to solve it. Like, in previous jobs I've implemented one table that had all the common columns and then one separate table for each variation. But are there other solutions that you implement for those types of? ORMs. Why not use ORMs to make the data model as complicated as you like, but not store the complexity as inheritance relationships in the database? But doesn't that create larger tables that you'll have to read no matter if the data is sparse? No, all you need to link them is a foreign key relationship. So you're just storing an extra identifier, I guess. Yeah. Thank you. Okay. Here. Never mind. So anyway, the last thing I wanted to tell you before, right, it was the security slides. They're important. Never use trust over TCP/IP in your pg_hba.conf. That was the most important thing I had to say in the remainder of the slides. Do not trust anything coming from TCP/IP. Always use password, md5, certificate, SCRAM authentication. That was the last thing. Sorry. I'll take your question now. Thanks. I'm curious as to why outer left join isn't implemented, it's just left join. Of course it's the same thing as using the anti-join you used earlier. I'm just curious why it isn't implemented. It's the same thing. Outer left join is the same thing as left join in Postgres. Yeah, I know. But outer left join should be, according to my old books of SQL-89 or something, just the anti-join left side. So you do not take the center part where the rings meet. You remove the intersection, just take the left part. Right. So yeah, the way Postgres implements it is it just enters NULL for the things that don't exist, that don't correspond.
And the right join would put the NULLs on the other side. That's the difference. There was another question here before. So you mentioned the date and the time handling. Is there any way in Postgres that doesn't involve an awful lot of hackery to deal with partial dates? For example, if I said I'm going to take the train tomorrow morning, or I'm going on holiday in August. So you want to store, like, August? Well, August 24. Right. So you can use a date with no context. You can use a date that says August 24. Well no, not as in August 24, as in August 2024. Okay, so you can just use extract from that date, or truncate, and lose all of the other context with that date and only store August 2024. Thank you. We have time for the very last question here, somebody who is ready. Hi. When you write V2 of this presentation, what are the other don'ts that you would add? Other don'ts that involve, like, I don't know, foreign data wrappers, or, I guess, the more exotic parts of Postgres. Yeah, as I said, this talk couldn't be all-inclusive. It was the top things that we see people doing wrong every day. Fair enough. Right. So thanks everybody for staying until the very last talk. Excellent. And remember, you can now get out here at the front because there are no more talks. You can pick up your stickers here. And once again, thank you, Jimmy, for your presentation. Cheers. |
AMENDMENT Intro to public code and Digital Public Goods |
Good morning, everyone. We're starting without slides because the projector isn't connected yet, and we're going to keep trying to troubleshoot that. So thank you so much for coming. My name is Elena Findley-de Regt. I'm here with the Foundation for Public Code. Welcome to the Public Code and Digital Public Goods Dev Room. I'm going to start with two really quick housekeeping things and then give a really fast pitch for the Foundation for Public Code, so you know why we're here. And then Vipul, who is my co-host from the Digital Public Goods Alliance, will give an intro to the Digital Public Goods Alliance. So as housekeeping: one, we're following FOSDEM's Code of Conduct here. Please be respectful while you're in the space. And two, I've opened some windows so that we have ventilation, so that everyone can be comfortable being here. Please leave them open. I don't think people closing them is a big issue today. There's not very many of you. But yeah, we'll leave them open throughout the session. So thank you for coming. We're delighted that you're here at our second Public Code Dev Room. Our first one was last year. It was online. This year, we're hosting it jointly with the Digital Public Goods Alliance, because the Foundation for Public Code is proud that we're a Digital Public Goods Alliance member. And again, you'll hear more about what that means in a sec. And very importantly, because we believe that public code has the potential to make a significant contribution to meeting the Sustainable Development Goals. So first, what is public code? Public code is open source code that implements public policy. It's used for the public good, and it's used by public organizations like governments, public administrations, and state corporations. This morning, you'll hear more about two super interesting COVID-related public code projects. You'll hear about best practice for digital public goods and public code, and about the policy environment in Europe.
And so quite quickly, just an intro to who the Foundation for Public Code is. We're a non-profit founded in 2019 in Amsterdam. My slide would say that we provide tools and processes that bring people and institutions together to collaboratively build and maintain software as public infrastructure. My speaker notes, which would have complemented that slide, say that we exist because there are plenty of governments that are able to build public code codebases, but then they don't necessarily have the mandate to make the communities around those codebases thrive, to build actual public collaborations that ensure that the codebases are being reused, that they're growing, that they're living codebases. And so the Foundation for Public Code exists to help public codebases grow real, sustainable communities, so that the codebases are used in more than one place and public organizations can actually benefit from the scaling potential of open source. Quickly, what do we do at the Foundation for Public Code? The most important thing we do is codebase stewardship, and our stewards take the role of a coach. They help to bring the most out of a community by giving the codebase the best possible conditions to succeed. We do this by checking code quality; by organizing community events and helping the community thrive, for example with community calls; by helping to develop the codebase as a mature product, for example by assisting with branding and communications around it and developing training materials; and by helping people who are implementing the codebase, like vendors, so that they can attract more people and help more organizations use the codebase as well. The next thing we do is work on the Standard for Public Code. There are copies there, and this is a set of criteria that supports public organizations in developing and maintaining software and policy together.
It includes best practice guidance for policymakers, government administrators, developers, and vendors, and code bases that meet the standard are easy to collaborate on and to reuse. It's both fundamental for our own work, and it's a recognized digital public good. We recommend it as a resource for other digital public goods. And then finally, we do capacity building work for public organizations, like workshops and training, and an example of that is our governance game, which is an actual game, it's fun to play, but it lets people explore what's needed to govern a code base, what the various roles and complexities are in keeping code base governance balanced and flexible enough to be usable. And it surfaces issues worth considering during setup, and in the game context that includes things like Godzilla, but it also might include real things, so there's a rogue developer card, so what happens when you have somebody who doesn't follow the agreed governance. And so it's useful for visualizing how the current governance model is set up or how it could be changed as well. If you work with an ambitious public code base that might benefit from more support to grow internationally, or if any of our tools or approaches sound intriguing, come chat with me, or Kehinde, at one of the breaks. And that's us from the Foundation for Public Code. So on to Vipul from the Digital Public Goods Alliance. Of course, yeah. There we go. Thank you. So technical issues, people getting sick, there are all kinds of things going on right now. So sorry for the delays and sorry for all the things that are not as expected, like the projector not working for now, but I think it should be fixed. I just wanted to quickly run through what a digital public good is and what the Digital Public Goods Alliance is. 
But before that, a quick short story. When COVID started, there was this tool called DHIS2, which was already a digital public good back then, and it was used in 76 countries, or 73 countries, and it's a digital health tool for collecting health metrics information. And when the first COVID outbreak in Sri Lanka was recognized on January 27th of 2020, within two days, the developer community there was able to modify it to start tracking COVID patients coming from different highly contagious areas. And because of such fast work, within the next few days, it was deployed and started being used. That's not even the most impressive part. Within the next few weeks, it was already being adopted in 73-plus countries. And up till now, it has located and counted and tracked 3.8 billion people, which is 40% of the world population. And all of it was possible because it has open licenses. It had a community around it, and it was targeting a critical need that the world had. In the Digital Public Goods Alliance, we call these things digital public goods, obviously. These are open source technologies or content, software, data models, it could be anything, which adhere to relevant best practices, do no harm, and advance the UN's Sustainable Development Goals. So these three are the heart of what we call a digital public good. Later on, when expanded, it has nine indicators, from proper licensing and documentation to not collecting PII, and it must be platform independent, so you should not be dependent on a proprietary solution that can prevent you from further scaling up. And all of these nine indicators help us nominate and decide on digital public goods. It's very easy to nominate a piece of software. There is a link, submissions.digitalpublicgood.net, you go there, you nominate your project, there will be a technical review of it, and once that's done, it's recognized on the registry. 
Today, we have recognized 127 public goods, and the number is continuously increasing at a faster rate than ever. So the DPGA was started by UNICEF, the government of Norway, the government of Sierra Leone, and iSPIRT. At this point, we have a lot more organizations together with it, and if you attended the keynote yesterday, the fresh announcement is that the Open Source Initiative is also in the DPGA, in the Alliance now. We have the Bill & Melinda Gates Foundation, GitHub, UNDP, USAID, and a lot more. It's a multi-stakeholder alliance founded by UNICEF and others, but today the DPGA has expanded a lot. And the advantage of being certified is that you have discoverability. If your tool is there, it's easy for a lot of people to discover it, well, not just for contribution, but to adapt it for their critical need. There is also support for usability and deployment capacity, because there are pathfinder countries who will work with you to adapt that solution and use it. Within UNICEF, we also have programs, DPGA accelerator programs, where we'll take a digital public good solution and try to pilot it in countries where it can actually be used for good. I just wanted to keep it short, because we are running late, and again at 10:40 or 10:50, I have my talk where we'll refresh this in maybe more detail. But you can go to digitalpublicgood.net, read more about us and what we do, and you can submit your solutions at submissions.digitalpublicgood.net. That's me, and thank you for coming today. And I think we have the projector fixed, so for the next talk, we can start setting up. |
AMENDMENT Covid Exposure Notification Out in the Open
Developing an Open Implementation of the Google/Apple Exposure Notification Protocol |
All right, thanks everyone for coming so early. Kudos for getting here so early. I'm very impressed that you got up this early in the morning. I'm David Llewellyn-Jones. I work for a company called Jolla in Finland. We make an operating system called Sailfish OS for mobile phones like this one. And today, I'm going to talk to you about COVID exposure notification. It was a personal project rather than something related to the company. But it was something that relates to Sailfish OS. So hopefully, it all comes together in some kind of sensible way. OK, so exposure notification. It's almost certainly something that you've all come across individually. It essentially is the apps that people ran on their phones that did contact tracing. They would give you an alert if you had been in contact with someone who had a COVID diagnosis. And I guess they were a big deal during COVID. I guess probably they're less of a big deal now. Although, I mean, I track the numbers on my phone while I'm at FOSDEM, and there's still a lot of people at FOSDEM using them. I can tell that. And the idea of this talk is that I want to talk about a very particular implementation of it. And that's the Google Apple exposure notification protocol, which was released, I guess, in April 2020. And it was an implementation that went out to Android devices and iPhones in order to provide the services needed to allow countries or medical organizations to build apps on top of it to perform this contact tracing process. And initially, when it was released, it was called, as you can see here, the Contact Tracing API. Later on, I guess that name was a little bit ambiguous, a little bit confusing. They changed it to the Google and Apple Exposure Notification API. I guess when people think about contact tracing, they think about maybe people phoning them up and asking them who they've been close to. 
But it is all the same thing. And this was the original specification. Like I say, the original one was released in April 2020. And it's made up of these three documents. It was initially this was version 1.1. I think they're up to version 1.4 or later now. There's been several versions that have come out since. So they've refined the protocol. And I guess it's worth putting it in a little bit of context. Back then, when this was released, there was no vaccine. So at that point in time, people thought that the World Health Organization predicted that vaccine was about 16 months away, but no one was entirely sure. In practice, the first vaccine was actually approved about nine months after this. But you can imagine that at this point in time, people were very worried about COVID. They were concerned about how it was going to be dealt with. And this was seen as kind of like a kind of one really quite exciting way to address the problem, given that there were so few tools at that time to deal with things. So let me just talk about these three documents. The first one on the left here is the Bluetooth specification. So it's made up of three parts. The Bluetooth specification, I'll talk about how it works in practice in a second or two. But it's sending beacons using Bluetooth from your phone. So the Bluetooth specification tells you essentially what how to pack the data into a Bluetooth beacon. And also other details like, for example, for privacy reasons, the information in the beacon changes periodically. And you have to change the MAC address of your phone in sequence with that in order to avoid privacy problems coming up. So that's that part. There's a cryptography specification, because the data that is sent in the beacons is cryptographically, I guess, hidden for privacy reasons. And then finally, there's the framework documentation, which is the API, which is really describing how organizations can make use of these tools. 
And I guess another important point about it is that there are two parts to the process. So Google and Apple released this API for all phones, but that's actually only half of it. That sends out the beacons. But as those beacons go around, eventually, you have to do something with that data that's being sent. And for that, you needed an app that was going to be written by a government-level organization. Like it could be a health organization. And Google and Apple had nothing to do with that. They said, you've got to write that part. We'll just give you this piece here. And the project that I'm essentially talking about is a re-implementation of this for Sailfish OS. And so, of course, we had to also deal with those two parts separately. So this is only one part of it. All right, so one of the nice things about this protocol is it's actually heavily privacy-focused. So you can imagine also at that time, there was a lot of concern about the fact that you're sending beacons out on your mobile phone. There are privacy concerns about that. And the protocol tries to address that in a very careful way. And that's actually rather a nice feature of it. All right, so when it was released, this is actually a slide from Google and Apple. I did not draw these pictures. And this basically describes how it works. So very briefly, Alice and Bob are wandering around the world. They have the app installed on their devices, sending out beacons. When they get into close contact with each other, their phones are communicating via Bluetooth. Well, they're not exactly communicating. They're listening for these beacons. And they're storing the beacons that they hear. OK, so that's step one. So when they get close to each other, Bob's beacons get stored on Alice's phone. Later on, Bob turns out to have COVID. And he gets a positive diagnosis. And he basically goes to his doctor. He gets an official diagnosis. And he puts that information into the app. 
And he voluntarily uploads data about his beacons to some cloud server that's run by a government or health authority. And he uploads 14 days' worth of his beacons. So in practice, these beacons are actually generated using a cryptographic process. So you don't have to store 14 days' worth on your phone. You just store essentially the seed that generates them. But they change frequently. So they change every 10 to 20 minutes. It's actually a random interval somewhere between those two. In order to prevent people from tracking you, they are unlinkable. So when they change, you can't link one to the other. And that means that there's a limit to how far you can track people. But Bob can regenerate those keys to upload to the server. Alice just gets essentially random numbers. She can't tell what those numbers are, essentially, or what they mean. So then later on, so I guess the next day, or later on that day, the keys get uploaded to the server. And the server then dumps all of the keys that it receives back to every single device out there. The keys are downloaded to everyone's phones. So they are downloaded to Alice's phone. Alice's phone then matches them up with the beacons that she captured along the way. And she then knows that she was in contact with someone who has a positive diagnosis for COVID. She gets an alert on her app. And then her life is on hold for two weeks. That's basically the process. She has to isolate and go through all of the steps that you have to go through. We don't know whether Alice has COVID or not. The idea is she might have. So at the time, it was kind of a big deal, like I said, because it was seen as a kind of white knight to get us out of the problem of COVID. 
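The generate-from-a-seed and match-on-the-phone steps described above can be sketched roughly as follows. This is a simplified illustration, not the derivation in the Google/Apple cryptography specification: the HMAC construction, the `sketch-rpi` label, the fixed 10-minute slots, and the names `rolling_id`, `ids_for_day` and `exposed` are all assumptions made for the example.

```python
import hashlib
import hmac
import os

ID_LEN = 16              # bytes per broadcast identifier (assumed for this sketch)
INTERVALS_PER_DAY = 144  # assumes one identifier per fixed 10-minute slot


def rolling_id(daily_key: bytes, interval: int) -> bytes:
    """Derive the identifier broadcast during one interval of a day.

    Without the daily key, two identifiers cannot be linked to each other.
    """
    msg = b"sketch-rpi" + interval.to_bytes(2, "big")
    return hmac.new(daily_key, msg, hashlib.sha256).digest()[:ID_LEN]


def ids_for_day(daily_key: bytes) -> set:
    """Regenerate every identifier a phone would have sent with this key."""
    return {rolling_id(daily_key, i) for i in range(INTERVALS_PER_DAY)}


def exposed(heard: set, published_keys: list) -> bool:
    """Check locally whether any heard beacon belongs to a published key."""
    return any(heard & ids_for_day(key) for key in published_keys)


if __name__ == "__main__":
    bob_key = os.urandom(16)       # the seed Bob uploads after his diagnosis
    stranger_key = os.urandom(16)  # someone Alice never met
    # Alice's phone stored one of Bob's beacons plus unrelated noise.
    alice_heard = {rolling_id(bob_key, 57), os.urandom(ID_LEN)}
    print(exposed(alice_heard, [stranger_key]))
    print(exposed(alice_heard, [stranger_key, bob_key]))
```

The point of the design is that the matching happens on Alice's phone: the server only ever sees the keys that diagnosed users chose to upload, never the beacons that anyone heard.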
At that time, everyone was focused on this thing called the R number, the idea being that the R number was the number of people you essentially go on to infect after you have COVID. And the objective was to keep that down. If the R number was above one, there was exponential growth in the number of people that would have COVID. If it was below one, then the numbers would be falling. And the aim of all of this process was to keep people isolated if they might have COVID in order to reduce that R number and prevent it from spreading. Google and Apple came out with this protocol because they immediately identified that there were going to be serious risks involved in tracking people with their mobile phones to do this stuff. So they knew, for example, that there were privacy risks associated with it. They knew that if they let governments go ahead and just do it, the concern was that they would just collect everyone's data. Or some of them might. Maybe not all of them, but some of them might. They also knew that sending out Bluetooth beacons. Sorry, you have a question. Sorry. If Alice receives keys from infected people, can she not match that with the time at which she detected the beacon? So she knows that the key is the one that she found two days ago, and two days ago was the time she met up. So is there a problem? So I guess I should make clear. So I mean, there are privacy controls in this, but they're not complete, right? And the idea is that essentially, at the point when you upload your keys, when you've got your diagnosis, that's essentially when some of your privacy is lost. But you voluntarily choose to do that. That doesn't happen automatically. Does that make sense? OK. But it's a good point. So yes, Google and Apple were concerned about the privacy implications because it reflects badly on their systems, right, if their phones are doing this stuff. 
Also, the power consumption is a problem, so they didn't want government organizations to deploy apps that would cause problems in terms of battery on their phones. And they also wanted it to be effective, because if it's not effective, that also reflects badly on them. So they identified that there were all these risks that they wanted to avoid. And even at that time, the APIs that Google and Apple provide prevented people from doing this kind of stuff for precisely those reasons. I mean, it was nothing to do with COVID. They've always just prevented you from sending out beacons and collecting beacons like this. Primarily for power consumption reasons, right? They don't want people draining the battery of the phone to do that. So my concern as a Jolla employee was that, well, Google and Apple have released this stuff. It's probably going to get quite popular. And I didn't want Sailfish users who are using an alternative operating system, which is not Android or iOS, to essentially get sidelined by this process and be left in a situation where they can't participate in this scheme. So the objective then was to create an open source implementation that could be used on other sorts of phones. So this is Sailfish. Sailfish OS is a Linux distribution that has a Qt UI layer on top of it. And it is sold by, well, it's released by Jolla. And then there is also a paid-for license that you can get for it to get additions on top of that. So yeah, the idea was to write an application that essentially mimicked that functionality. And so this was the app that myself and Oscar Rosler, I don't know if Oscar is here. Is Oscar in there? I guess not. He said he might turn up. But I guess he's not here at the moment. So we developed this together as a kind of personal project. And you can see here's the process. This was like the first time that we actually got beacons appearing on the device. So that was kind of an exciting moment. 
And there are, I guess, several steps. There's collecting beacons. There's sending beacons. And there's also uploading your data if you get a positive diagnosis. So we implemented all those parts. And we tried to implement it in the same way that Google and Apple did in terms of structure. So you have a service underneath that sends out these beacons. And then we had a separate app, which was the equivalent of the app that had to be developed by governments. And in theory, other people could have developed other apps on top of the service. I don't think that ever happened. We only implemented one version. And the version we implemented was the Corona-Warn-App, which was Germany's implementation. And the reason we did that was because it was very early in the process in terms of being deployed. But it was also incredibly open. It was an open source deployment. And they were really committed to doing it as an open source product, I mean, an open source project. Not just an open source chunk of code, but the development model was also supposed to be open. It was actually developed by SAP and Deutsche Telekom. My understanding is that SAP did all the code development. Deutsche Telekom provided the infrastructure for it, the back-end service infrastructure. So like I say, we did this independently of Jolla. For those interested, it was Qt-based. It used BlueZ for the Bluetooth stack and OpenSSL for the crypto stuff. And this was the timeline, essentially, that we worked to. Or not that we worked to, but that we ended up with, I guess. So on the 10th of April, that's when the Google and Apple specs first came out. We released our first implementation of those specs on the 19th. So that was quite soon afterwards. Then the Corona-Warn-App from Germany came out on the 16th of June. And then on the 30th of June, so a couple of weeks later, we got our first app that had equivalent functionality. So we were pretty happy with that. 
We thought that could be considered a success. But it was only possible because the specifications that Google and Apple rolled out were very good. They were clear specifications that you could re-implement. And the Corona-Warn-App team also did a lot of good work in releasing source code very early. They had a huge amount of documentation that was all on GitHub. They even had, I guess, a CCC evaluation of the project in terms of its privacy. So they were really quite committed to being open with their stuff. In practice, what we found was, so they released some reference versions of the servers. And we found that really helpful. That was really important. They also released the source code for their app that we re-implemented. We actually ultimately found that that was not so useful. It was useful for some edge cases, but really it was the server deployment that was really helpful for us. So I guess the real reason that I wanted to talk today was to talk about our experience working with these companies as small developers, as independent developers. And I guess it was very much as you might expect in terms of Google and Apple. I mean, they're phenomenal at producing good specifications and good APIs. That's what they do. That's their bread and butter. They're absolutely phenomenal at it. But zero engagement. So we tried to contact them and talk to them about their stuff to try and get information or to get help. Nothing at all. Which is, I guess, not surprising, right? They're massive companies. There's no particular reason why they would. They promised test data to test stuff against, and that never materialized. So there were also some bumps in the road for us. And they released the source code for a reference app quite early, but they didn't release the source code for the underlying system, the Bluetooth system, until somewhat later, until I guess late in July. 
So for us, that was already too late. So working with them, like I say, there were good points, but they were not really having an open source development model. For SAP, working with the SAP code was a little bit different. So this is essentially the SAP implementation that they developed on top of the Google Apple notification protocol. So this is the app, and this is the server back-end on the left-hand side. And the main parts of it are this verification server, and from our point of view, and the Corona WARN app server. So essentially what they do, you have to have a download server to download all of these keys that are going to Alice's phone and to everybody's phone. That happened in the Corona WARN app server. You have to have a verification server so that when you get a positive diagnosis, you can upload it in a secure way with your identity, essentially proving that, not with your identity, but proving that you've got a positive diagnosis from your doctor. And then you need an upload server to upload your keys to. And that happened also in the Corona WARN app server. So in practice, this was actually two servers, not one. So those were the two bits we were interesting in, and we had to deploy them to test against. We couldn't deploy the whole lot. That was going to be impossible. That was not really a plausible thing to do. But they did have a reference implementation of these two servers. And so we tried to deploy, we tried to deploy, well, so we did end up deploying those to an AWS server and testing our code against those. And what we found was that the download server implementation was a little bit broken. We had to fix it, but it was otherwise pretty good. There were just some glitches. Whereas the verification server and the upload server were pretty minimal. So we actually essentially had to re-implement parts of those based on the specification in order to do our testing. But they did release a lot of code with this stuff. 
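To make the three server roles described above concrete (verification, upload, download), here is a minimal in-memory sketch. It is not the SAP reference implementation: the class names, the one-time token scheme and the method names are hypothetical and chosen purely for illustration.

```python
import secrets
from typing import List, Optional, Set


class VerificationServer:
    """Hands out one-time upload tokens for confirmed positive diagnoses."""

    def __init__(self) -> None:
        self._valid_tokens: Set[str] = set()

    def issue_token(self, diagnosis_confirmed: bool) -> Optional[str]:
        # The token proves a diagnosis without carrying the user's identity.
        if not diagnosis_confirmed:
            return None
        token = secrets.token_hex(8)
        self._valid_tokens.add(token)
        return token

    def redeem(self, token: str) -> bool:
        # Each token is accepted exactly once.
        if token in self._valid_tokens:
            self._valid_tokens.remove(token)
            return True
        return False


class KeyServer:
    """Combines the upload and download roles in one object."""

    def __init__(self, verifier: VerificationServer) -> None:
        self._verifier = verifier
        self._published: List[bytes] = []

    def upload(self, token: str, keys: List[bytes]) -> bool:
        # Keys are only accepted alongside a token the verifier recognises.
        if not self._verifier.redeem(token):
            return False
        self._published.extend(keys)
        return True

    def download(self) -> List[bytes]:
        # Every phone receives the same full list of published keys.
        return list(self._published)


if __name__ == "__main__":
    verifier = VerificationServer()
    server = KeyServer(verifier)
    token = verifier.issue_token(diagnosis_confirmed=True)
    print(server.upload(token, [b"bob-day-1", b"bob-day-2"]))
    print(server.upload("made-up-token", [b"fake-key"]))
    print(server.download())
```

Splitting verification from key storage means the server holding the keys never needs to learn who was diagnosed, only that some valid token was presented.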
And the fact that they had these reference servers was absolutely key in allowing us to re-implement the protocol. So our overall experience was that they produced lots of code and lots of documentation, which was really good. They were one of the earliest teams to get this stuff out into the wild, which was in terms of source code, which was also phenomenally good. They worked through GitHub issues. So when we found problems, we would submit pull requests to try and fix the code. And it was clear that they were trying to engage with that process. But in practice, they were just taking way too long. So by the time, so for example, we would fix something in one of the reference servers. And people would be using our PR code to deploy elsewhere. But by the time they got to actually thinking about integrating it, it was already too stale. It was no longer useful. So they were clearly trying to do that. But I guess it was just a question of time. So as a consequence, some of the code was left broken for quite a long time, even though there were fixes that they could have just integrated. Some of the reference implementations differed quite significantly from reality in terms of the servers. But there was something to work with, and that was really helpful. And so our overall impression was that they had this real commitment to openness. They were genuinely committed to doing this well. They had an open development process. But the team were just a bit overwhelmed with the amount of effort involved in doing it. That was the impression we got from the outside. So we were very impressed by this. But obviously, there's obviously scope for learning there in terms of how to do it better. So my overall reflections on how things went were that at the time, I think again, you have to put this into context, there was huge pressure on governments all over the world to get this stuff deployed. So the first deployment was in South Korea, was I think the COVID 100 meter app. 
And that went out really early. And that used GPS positioning, GNSS, so it uploaded people's locations to the cloud, I mean, essentially zero privacy in that situation. But as soon as that was out, there was a lot of pressure on all other governments to do something similar. There was a huge rush to deploy this stuff. And so there was a lot of pressure, and there were a lot of competing requirements. The requirement for openness was just one requirement. There were also requirements for efficacy, requirements for speed, requirements in terms of privacy. All of these things combined are actually quite challenging to get right. And both Google and Apple and the CWA team, the SAP team, the impression I got was they really got that. They really understood these competing requirements, but they also understood the need for openness. They understood that, not just for them to do it well, but for people to trust the apps that they were running, they also needed to be open about it, and that's one of the reasons they did it. And in that sense, I think they were pretty successful. Like I say, there was scope for improvement. As I said, with Google and Apple, there was zero inward flow of information. With both teams, there was a lot of outward flow of information, and that was excellent. With SAP, there was some inward flow, but not as much as perhaps we as independent developers would have liked. And I guess more generally, what we kind of felt was that there was this commitment to openness, but not necessarily a full appreciation of how much effort was involved in that process, in getting that process to work really well and effectively for the companies involved. So, for example, and I guess I should say that Jolla also works with a lot of open source code, so we also have an open development process for a lot of our stuff. And we have the same challenges, right? 
So, people provide PRs for us to integrate into the system, and knowing what is good to integrate and what is bad to integrate takes a lot of time and effort, right? It sounds like you're getting free implementation, which I guess in some sense you are, but there's still a lot of effort involved in that process. And like I said, I guess with this process that we as independent developers experienced from the outside, it felt like perhaps these other competing things were impacting on the effort that they could put into that process. So yeah, so overall, given that these organizations had absolutely no duty towards us as independent developers, we were very happy that we could work with the things that they'd given us and that they were actually high quality. So we're very pleased with how it went, but like I say, we felt that there was definitely some room for improvement. OK, so that's my summary of how things went. Thanks for taking the time to listen. There's some links in case you're interested in more. We are actually on the Linux on Mobile stand in Building K at the top of the stairs, so if you want to talk more about it later, please come and visit us and we'd be happy to do that. Thanks very much. Is there time for questions? Yeah, there's five minutes for questions. We don't have a microphone. Yeah, you'll have to speak up and repeat the questions. OK. Especially for the live stream. Right, understood. What's your experience with the, like, they used Bluetooth for the proximity, but there was some discussion that either you would get too many notifications, contact notifications, because it was too wide, or you would get too few, because it was too restricted. Right, so your question is, what's my experience with the Bluetooth parts, the Bluetooth notifications? Because as you were saying, at the time, there was a lot of discussion about whether it was going to give enough accuracy in terms of proximity detection. 
So the key thing with the COVID notifications, of course, is that there was this two meter, there was always this, like, or some distance that you were supposed to stay away from people. And so in terms of the Bluetooth, you wanted to try and mimic that process. You wanted to accurately distinguish that someone was within two meters or some distance or outside that distance. That's essentially what you're talking about in terms of proximity. So it's interesting, actually. So when we started out the process, the protocol didn't specify greatly how to tackle that. So what happened, what actually Google and Apple did, is they set a bunch of risk parameters in their implementation that you could tweak, that you could change. And they came from the download server. They were set by the government organizations, not by Google and Apple. And they basically said, they gave weightings. How much risk do you associate with a power level of this for the Bluetooth? And how much risk association do you put with how long that association went on for? Like, is it five minutes or 10 minutes that they're within range of this distance? So you could tweak those parameters. And later on in the process, Google actually released a lot of information that provided essentially weightings for some of the parameters for almost every phone on the market. So they had a process where they would basically put two phones on stands next to each other and measure the power consumption that came through. And then based on which phone was communicating with which other phone, they would then tweak the parameters to try and make that proximity more effective. So we use those parameters. But I have to say, we were not in a position to check whether or not they were working effectively. Because what's happening behind the scene is quite opaque. We're sending data up to a server. And the whole point is that you then know nothing about what happens after that because of the privacy aspects. 
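The tweakable weightings described above can be illustrated with a small scoring sketch. The bucket boundaries, weights and threshold below are invented for the example and are not values used by any real deployment; `attenuation_db` (how much the signal weakened between the two phones) stands in for the "power level" mentioned in the talk, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RiskConfig:
    """Parameters a health authority could publish via the download server."""
    # (upper bound in dB, weight): a weaker signal means further away.
    attenuation_buckets: Tuple[Tuple[float, float], ...] = ((55.0, 1.0), (70.0, 0.5))
    # (lower bound in minutes, weight): longer contact means more risk.
    duration_buckets: Tuple[Tuple[float, float], ...] = ((15.0, 1.0), (5.0, 0.5))
    alert_threshold: float = 0.5


def attenuation_weight(cfg: RiskConfig, attenuation_db: float) -> float:
    for upper, weight in cfg.attenuation_buckets:
        if attenuation_db <= upper:
            return weight
    return 0.0  # too weak a signal to count as close contact


def duration_weight(cfg: RiskConfig, minutes: float) -> float:
    for lower, weight in cfg.duration_buckets:
        if minutes >= lower:
            return weight
    return 0.0  # too brief to count


def risk_score(cfg: RiskConfig, attenuation_db: float, minutes: float) -> float:
    return attenuation_weight(cfg, attenuation_db) * duration_weight(cfg, minutes)


def should_alert(cfg: RiskConfig, attenuation_db: float, minutes: float) -> bool:
    return risk_score(cfg, attenuation_db, minutes) >= cfg.alert_threshold


if __name__ == "__main__":
    cfg = RiskConfig()
    print(should_alert(cfg, attenuation_db=50, minutes=20))  # close and long
    print(should_alert(cfg, attenuation_db=65, minutes=7))   # further and shorter
    print(should_alert(cfg, attenuation_db=80, minutes=30))  # long but far away
```

Because the parameters live in a configuration object rather than in the code, they can be changed centrally without shipping a new app, which matches the description of the values coming from the download server.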
So it's actually very hard for us to judge whether or not it was working well or working badly. But it was clear that there was a lot of work going on to get it to work well in the background. My more general experience with Bluetooth is it's not great at this stuff. If there's a person between you and another person, then water is a big problem for signal propagation. So stuff that's in between causes a lot of dampening of the signal. So it's quite tough with the impression I got. But I'm not an expert in that, so. I'm neither. OK. OK. I'm curious, a lot of governments implement solutions over proprietary solutions. Or do you have an implemented open source solution? So I'm just going to come a bit closer so I can hear from you. A few governments that implemented open source solutions, but not many. Most have been implemented proprietary solutions. But I don't know of any government that re-implemented another government's open source product. Was there any collaboration between governments you're aware of, or even between projects of government? You see that one of the projects that have never been between you and that physical failure in the resource can't repeat the process. And share it with the government. And it's distributed to governments of limited value. Right. OK. So the question was, was there any evidence that all of this open source release of code resulted in reuse of code by other governments? That's essentially your question, right? And I think there was. So I can't think of specific instances. But my understanding was that people did take the COVID WARN app code and use it in their implementations. Or perhaps more generally, I think people definitely took the Google reference implementation and used that in their code, which was open source. And of course, was a yes, sorry, which I guess was an example of open source being reused. 
Perhaps more specifically, with the Corona-Warn-App, because the Google and Apple notification protocol was the foundation for it, it was used really quite widely across Europe. Not all countries used it, but a lot of countries did. And at this point in time, I think there are 11 countries that all use the same back end. So they're all uploading their keys to servers. So now if you use the Corona-Warn-App, or contract, our version of it, you actually get keys from countries across Europe, half of Europe, essentially. It's quite a large area. So I mean, you could argue that's more a consequence of having a common API than of it being open source. But there definitely was benefit in sharing of information, sharing of code. It did happen, I think. But perhaps it's not as clear that it happened as it might otherwise seem. All right, I'm being told my time's up, so. |
AMENDMENT Global Open Source Quality Assurance of Emergency Supplies |
So, happily, my talk is going to build a little bit on the talk that you just saw, the difference being that I'm making proposals and they actually built things, so it's a little different. My name is Robert Read, I am the founder of Public Invention, but what I'm presenting today is not a Public Invention project. This is joint work with two other people, Victoria Jaqua and Christina Cole of Open Source Medical Supplies. Open Source Medical Supplies and Public Invention are both US 501(c)(3) organizations. So what I'd like to talk about is global open source quality assurance of emergency supplies. We call this GOSQAS, or the Global Open Source Quality Assurance System, and I'm making a proposal today for this. Now, open source manufacture has rapidly responded in a number of important cases to things that have happened, just as open source software responded to contact tracing in the previous talk. In particular, 3D printers can represent sort of an army for good that can immediately do things to help in a man-made or a natural disaster. We're working with some people to make tourniquets for the crisis in Ukraine right now, and of course, if you saw my other talk, we've also made human ventilation products and other things. But when you do this, you have a fundamental problem: you have a widely distributed supply chain of people attempting to make useful products, but how do you trust them? And the trust can be broken down into two issues. How do you trust that the design itself is useful? And then, even if the design is a good design, how do you trust that the manufacturer is in fact a good manufacturer? Because of course, we all know, for example, that 3D printing requires tuning and so forth. Well, imagine using a tourniquet, which is a simple physical device but can easily be mis-manufactured, especially if it's 3D printed. You're using it in a life-saving situation where you're trying to stop bleeding. If it breaks, you have a really serious problem.
And so even though a tourniquet only costs $20 and it's a relatively simple device, ensuring its quality is very important. It's almost better not to have a tourniquet than to have a faulty tourniquet. Now I am a humanitarian engineer, and I consider humanitarian engineering the space that I work in. Most of the people I know of who worked on this were not making money from it. They didn't have a financial incentive to try to sell products to address these things. But nonetheless, engineers have a psychological problem, right? Nobody wants their baby to be called ugly. All of us wanted to be heroes, and we wanted to save the world and save lives. And for that reason, engineers cannot be trusted to evaluate their own work, okay? But of course, this is a problem that the open source software community has dealt with already, and I'll come to that. So in October of 2022, just four or five months ago, many non-governmental organizations in the humanitarian engineering space got together for three hours, and we had a really surprisingly unanimous agreement that we needed quality assurance for rapidly manufactured open source devices. And we needed an alliance of NGOs to try to address this. And so Christina and Victoria and I formed a new informal organization, we haven't incorporated, that we call GOSQAS, or the Global Open Source Quality Assurance System. So the open source software movement knows how to do testing, okay? Of course, it's easier to test software than to test hardware devices. With software, you normally have automated tests that anyone is empowered to run. You download the repository, you run the tests, and you have an independent verification of the quality of the code. So in a sense, what we want to do for hardware devices is what's already been done for software systems. Fundamental to this for hardware devices is to show the data. So you want a test procedure that's sort of a named standard test procedure.
And then you want to record a test result. You want to say what was done, when was it done, how was it done, and who did it. And you may obviously have an analysis: either you pass the test or you fail, and if you fail, in what way do you fail? And finally, you want a discoverable publication of those tests for the particular device. Now there are examples of testing organizations, like Underwriters Laboratories and ASTM and others. Often what happens is that an industry begins its own testing procedures, and then later they become adopted into governmental regulation. So it's actually the case that many industries are sort of self-policing, and then they become part of a governmental structure later. So what we propose is asset provenance tracking as the fundamental way that we can improve the quality of rapidly manufactured devices. When I say provenance, what I mean is the history of the device, in the same way that an art object has a provenance, right: who owned it, what happened to it, where was it physically throughout time. Now this is a way to fight counterfeiting, which is a serious problem for medical devices, particularly in low and middle income countries, but even in other situations. It's also a way to organize documentation on behalf of makers. So it's not necessarily that you're doing anything that couldn't be done some other way, but you could be relieving the makers themselves of the burden of having to do all of the documentation, and distributing the documentation across a number of parties. So this would allow third party quality assurance testing, it is relatively simple to implement, and it can use minimal, well understood cryptography. I'm going to talk about that in a minute. Now of course people will say, well, there exist asset tracking systems. There is an open source asset tracking system called Snipe-IT. It's possible that this should be a fork of Snipe-IT. There are some ways in which what I'm proposing is different from Snipe-IT.
I don't have time in this talk to discuss that issue, but this is what we would like to produce. So you can imagine a box of tourniquets having a GOSQAS seal on it, literally a sticker that is put on it. And the person who manufactures the tourniquets gets a unique key for this box of tourniquets, which either they generate or we generate for them. We describe the product, which is actually more important than you might think. And then we can give certain certifications, if they have actually occurred for the object, so that anyone who holds the box in their hands can get some useful information about what's in the box. But more importantly, every box will have a key that you can use to look up, in a public open access online database, information about the particular object, okay? Now it's kind of easy to understand how this would work. Imagine that it's made in Prague and it gets a private key. Someone else in Prague does a third party test on it, and that goes into the database. It's then purchased by a middle man in Egypt, and the person in Egypt transfers it to Tanzania. In Tanzania, someone verifies that it's in inventory, and a potential buyer in Kigoma then takes the box in their hand, points their phone at the key and says, this claims to be a box of masks or tourniquets or electronics or whatever. And they look up on the website the complete history of the device. Now, just as with intellectual property and art objects, if you can see the complete history of the device, it's very difficult to fake that. Not impossible, but it's quite, quite difficult to fake a chronologically accurate history for a device. And so in this way it provides great confidence to the person in Kigoma that this product is what it says it is. Thank you sir. Okay, now I assume most of the people in this room are computer programmers, and they have probably already imagined how this would be implemented.
From a programmatic point of view it's very simple. You just have a database, you assign keys, and you use one-way encryption, which is much easier than the sort of public key encryption and the other kinds of things that are necessary today in the cryptocurrency world. You just do a simple one-way encryption of the key so that you allow public access where anyone can write into the database, okay. Now there are a number of things that you would think are security flaws in this. We don't have time in this talk to go over them, but I hold that the following principle is simple enough and good enough. It's not perfect, but it's good enough to build a workable system: if you have the device in your hand, you have a right to see the provenance. Now there are ways in which that differs from our norm today. For example, in the United States, if I have a box of something in my hand, I do not have a legal right to see where it was physically located before I got it. And if I have a box in my hand, I do not have a right to see the provenance in the future. Nonetheless, seeing those things is not particularly harmful. You can imagine that being a right, and it wouldn't really hurt anything if that were true. And so I consider this to be a great simplifying assumption. If you have the physical device, you have the right to see the provenance. And that simplifies an enormous number of things. Now what you're not allowed to do, even though the database is in a sense public, is scrape it and see the history of all of the devices which are in the database. But you won't be able to do that unless you have the keys, because it's encrypted. Okay, therefore the database can be made a public database. This is very, very simple, but I claim it's going to be good enough for us to really provide quality assurance.
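The scheme just described, a per-box key, a one-way transformation of that key, and a database anyone holding the key can append to, can be sketched in a few lines. This is only one possible reading of the proposal, not the project's actual design: the class and function names (`ProvenanceDB`, `new_device_key`, `key_digest`), the choice of SHA-256 and the entry fields are all assumptions. Note also that this sketch only hashes the lookup key and stores the entries themselves in the clear, which is weaker than the encrypted database the speaker describes.

```python
import hashlib
import secrets
from datetime import datetime, timezone
from typing import Dict, List, Optional


def new_device_key() -> str:
    """Generate a random key to print on the box (hypothetical format)."""
    return secrets.token_urlsafe(32)


def key_digest(device_key: str) -> str:
    """One-way transformation of the key; the database stores only this."""
    return hashlib.sha256(device_key.encode("utf-8")).hexdigest()


class ProvenanceDB:
    """Append-only provenance log, indexed by the digest of each device key."""

    def __init__(self) -> None:
        self._log: Dict[str, List[dict]] = {}

    def append(self, device_key: str, who: str, what: str,
               when: Optional[str] = None) -> dict:
        """Record an event; anyone holding the key may write, as in the talk."""
        entry = {
            "who": who,
            "what": what,
            "when": when or datetime.now(timezone.utc).isoformat(),
        }
        self._log.setdefault(key_digest(device_key), []).append(entry)
        return entry

    def history(self, device_key: str) -> List[dict]:
        """Return the chronological history for the box you hold."""
        return list(self._log.get(key_digest(device_key), []))

    def public_index(self) -> List[str]:
        """What a scraper sees: digests only, from which keys are not recoverable."""
        return list(self._log)
```

A usage example following the Prague-to-Kigoma story: the maker calls `new_device_key()` and prints the key on the seal, each party calls `append(key, who, what)`, for instance to record a third-party test result or a transfer, and the buyer calls `history(key)` after scanning the box. The keys must be long and random for the one-way step to protect anything, which is why the sketch generates them rather than accepting short serial numbers.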
So if you imagine this system existing, and you have a GOSQAS seal that can be put on objects, you can ask: does it apply to medical devices or does it apply to non-medical devices? Does it interact with the CE mark used in Europe to authorize medical devices, or with the US FDA? And the answer is that it can overlap all of those in a complicated way. It really doesn't require the approval of a government. It can be a completely open provenance tracking system which is used or not used as people see fit, in a voluntary way. Now open source devices are a threat to monopolies, but they're not a threat to large firms. There's no reason large firms could not use open source designs and use the same provenance tracking system that we are suggesting here in order to give buyers confidence in their products. Today, very large firms have their own internal provenance tracking systems. They have asset tracking systems that they use for their own inventory purposes, but they do not expose those and make them public, and would consider them a trade secret. There's no reason why they couldn't use open source provenance tracking to add confidence to their products. So I claim that there's no reason anyone ought to particularly oppose this system. Now we have started writing technical papers about this. These are very much in draft form. They're not super great, but they're publicly available and we invite comment on them. We are actively trying to build this system. And so today, in this very small room, I'd like to publicly launch the free global asset provenance tracking idea. I would like to be the technical lead of the new open source project to build a website to provide this technology, but I can't do it completely by myself. For one thing, I run Public Invention, which is a nonprofit that takes up a lot of my time.
So I'd like to call for volunteers, both computer programmers and non-computer programmers who can handle business and communications and other things that we need to make this a reality. There's going to be a lot of work convincing people to voluntarily use this system until it becomes respected enough that people start to demand it. Thank you. So that ends my talk. Thank you. And I'm happy to take questions. If anybody has a question, I'll repeat it into the microphone. Yes, sir. If you notice that something was touched in some previous step of the system, what happens? So the question is, if you notice that something was previously touched. Yes, so for example, this middle man in Egypt took a few out or screwed it up, and then after five steps, the guy in Tanzania noticed that something was wrong. What happens? Well, there's no guarantee that the entries in the database are completely accurate, okay? But it is the case that you can make an entry saying, it looks to me as if the device was tampered with, okay? Now the people downstream in the provenance can decide what to do with that information. They can ignore it, or they can say, well, so and so says the box was tampered with, I'm going to begin a legal proceeding with someone earlier in the provenance chain. Or I'm going to believe that it was entered for some nefarious purpose, to sabotage my system. Or I will use it to repair the device and inspect it and make sure that it's good. It's already the case that the US FDA requires market surveillance of objects for the purpose of doing recalls as well as for other safety purposes. So in a sense, the fact that you have that potential information is a positive thing about the provenance tracking, not a negative thing. Yes, sir?
So you suggest that anybody can just add information to this whole database. How does that build trust in suppliers, or how do you recognize who actually supplied this information? Yes, the question is, can anybody add information to the record for a device, and the answer is yes, if you have the key. Okay, so a bad actor can't pollute the entire database, but if I broke into your warehouse and took a photocopy of a box, I could create a record for that. So anyone can claim that they have this device if they have the key for the device, and they can make a false claim about it or an accurate claim. But just as with art objects and other kinds of things, I think false claims will be relatively easy to sort out in the system. And so the great simplicity of this is that it's a completely open database that doesn't require any security beyond maintaining the individual keys. And if a key for an individual object is compromised, for example, suppose I took a photo of your box and published it on the internet, well, bad actors could likely disrupt the provenance of that box, but they could not disrupt the provenance of the rest of your inventory. So I claim this is the correct balance between simplicity and security, and we don't have to go overboard on it. All right, thank you, I think that's all the questions we have time for. Okay, thank you very much. |
AMENDMENT Public Money? Public Code! in Europe
A policy brief of the state of play of Free Software in the European Union |
Today, I'm going to be talking about public money, public code in Europe, and basically try to give a very brief overview of what has happened in terms of public money, public code in the EU institutions. Because of time, and also because of the topic, I just want to focus on public money, public code. I know that at the moment there is a lot of legislation going on that is worrying a lot of the community, but I'm going to exclude that. If you want to talk about it later, we'll mainly be at the booth, or you can just shoot me an email and we can exchange some ideas. So I always like to start with the basics and put everybody on the same page. Maybe this is very familiar to you, but I think it's super important, now more than ever, to always remember that free software guarantees the four freedoms to use, study, share and improve the software, and whenever one of those freedoms is excluded, then we're talking about a non-free software project, however you want to name it. We are the Free Software Foundation Europe. We empower users to control technology, and we do this with free software. Among the different activities that we have, we have the public money, public code campaign. Maybe this is also very familiar to you, especially in this devroom, but let me also briefly give an overview of our campaign. We started this four, five years ago, in 2017, and we're basically demanding legislation that requires that software procured for the public sector be public code. And for this we have different reasons. We use different arguments whenever we are talking with public administrations, with decision makers. One of those is definitely tax savings, because public money should be spent in the most efficient way. So there is no point in spending public money on proprietary licenses if you can reuse free software.
Then the collaboration part is also super important, because we all know free software enhances interoperability and collaboration. So administrations can collaborate with each other, because it's open, because it's there, and there is no need to reinvent the wheel again. And then it's also serving the public, because with public money, given by the public, the people will know what the money is being spent on, and I think we all agree that if the money is spent in a good way, we like that. And of course, to foster innovation, because then we don't have to start from scratch again, but we can just reuse the solutions that already exist. So we have an open letter that you can sign as individuals and organizations, but also as public administrations. We have more than 10,000 signatures, and at the moment we have seven public administrations. There are some from Germany, from Sweden, from Spain, I think three of them. And then recently one from Luxembourg has also signed our open letter. So it is not so many, but it is nice to see some administrations supporting our campaign. But yeah, so that was a very brief overview of the public money, public code campaign. And within this we have different activities. And we also actively try to advocate at the EU level in this regard. So today I want to talk about two EU institutions. I'm going to first start with the European Commission, and what has happened there over the last three, four years. And I will also talk a little bit about the European Parliament, more specifically about the AI resolution, and a little bit about the ongoing AI Act. And then I will also talk very briefly about the Declaration of Digital Rights, because these are some of the files or legal documents where we have been active. So let's start with the Commission. I think in order to talk about the Commission we have to talk about these two pilot projects, EU-FOSSA and EU-FOSSA 2.
So basically these were projects that were given to the European Commission by the European Parliament. Basically the European Parliament told the Commission to get those pilot projects ready, because we need to improve the security of the free software tools that have been or are used in the European institutions. So they did that, they did their best. Within those two pilot projects they did 15 bug bounties and three hackathons to actually audit the code of these free software tools within the institutions. However, there was no budget allocated to these projects, and therefore they could not run any longer. So basically the European Parliament told them what to do and how to do it, but there was no budget allocated. And I think through my talk you will see that, unfortunately, this is a pattern. There is a lot of nice wording and a lot of nice initiatives, but unfortunately there is no budget. So they stopped doing this project, and in 2020 the European Commission released the open source strategy. And this strategy is super interesting because, as you can read there, it's a communication from the Commission to the Commission. So this is not a legally binding document. It's basically the Commission telling themselves how they should work with open source, what they should do. And again, nothing legally binding. So in this regard, we still have to say that we acknowledge that the Commission has the will, has the initiative to set up this kind of thing, and they are already realizing that open source has been used in the EU institutions and that they need to do something about this. However, as a strategy by itself, it's rather weak, because it doesn't have any real indicators or anything with which you can actually follow up on what's happening and see the progress of such plans so far.
So if you go to the text, you can see wordings such as: wherever it makes sense to do so, the Commission will share the source code. And again, here it's like, wherever it makes sense or whenever it makes sense, what does that even mean? It is not clear when the Commission should share the source code. And then there is also a section that says that the Commission has the freedom to choose a non-open source tool if there are good reasons to do so. And again, what is a good reason, and what is not? So all of this wording is a little bit, not biased, but ambiguous, so to say. So it is very unclear, and therefore our analysis of this strategy is: okay, nice, you want to do something, but we're not quite happy with the wording and the way it was done. However, in 2021 the European Commission also realized this, and then they adopted a decision. So with this decision we saw a transition from a project, a strategy, a communication, to more of a legally binding paper. In this paper, they want to define the conditions under which the European Commission is going to share open source. And within this decision, we can see that they are trying now to implement all that is already happening. There is the European Commission's Open Source Programme Office, so basically this is the office that will be in charge of taking care of all these plans and decisions from the past documents I already talked about. So this is a step. I mean, now there is an Open Source Programme Office that is actually trying to act as a facilitator. They also do some bug bounties, they also do hackathons, and I have to say that they're really trying to do something about this. They're trying to implement all these projects and all these plans and all these strategies, but again there is no budget allocated to make this possible.
So for them it's really hard to do what they have to do, because there is no human capacity, because basically there is no budget, so it's really difficult. So here we find ourselves again with very nice wording. There is the will and the initiative to do something about this, but there is no budget allocated to these initiatives. Within this decision, in Article 6, a public repository is also included, and this is definitely something good. I mean, we have been advocating for a public repository for all the open source tools that the European institutions use, and this is now publicly available. So this is again another step. Here we can see that not only are the European institutions sharing what they're using, but this is also trying to include member states, so they're also trying to build this interoperable network among member states. So in terms of public money, public code this is again a step, and I feel like this is going on the right path, but I cannot say often enough that there are some things that need to be worked on. The wording has to be clearer, and again there should be more budget allocated to free software in general, and this is not happening at the moment. So this is basically what is happening in the European Commission. I know that at the moment the European Commission is proposing a lot of legislation and initiatives towards open source. They are realizing that open source or free software needs special regulation, that it is important. From my point of view they are noticing that. But within the European Commission this is what is happening: we can see there is will, the OSPO is trying to do whatever they can, but there is no budget at all allocated to this. So now let's talk a little bit about the European Parliament. As I mentioned very briefly, I want to talk mainly about the EU AI resolution.
This was a resolution that was led by the special committee that was created in the European Parliament to work on this resolution. And this resolution is again not legally binding. It is just an opinion, like a guideline for the ongoing AI Act. The AI Act is going to be a regulation, so it is going to be legally binding, but the European Parliament decided to create this committee to exchange views, to talk with experts and with stakeholders, and they came up with this resolution. So of course we also tried to advocate there, although we knew it was not legally binding, because again these are guidelines, and with decision makers this is a good argument for us. You know, if there are guidelines, we can always point decision makers to these guidelines, because that's what they are for, right? They need to be used, and you always need to go back and look at them. So in this regard it's hard to say if it was completely successful or not, but there was a huge step. There is a recital that talks about public procurement of AI. Again we see this pattern from the EU institutions of very ambiguous wording, wherever it's appropriate, whenever it makes sense, wherever there are good reasons, and in this recital we see it again, as you can see, I just quote this recital. So nice, I mean this is a step, and I guess in this regard we can always use it to benefit the community, but it's still super ambiguous. This specific recital was voted on separately, and the good thing is that it found a huge majority within the European Parliament, so that tells us something as well. It tells us about the will that the European Parliament has. But the downside, as I already mentioned, is that it's not legally binding, it's just a guideline. But at least we have something, right? So in this regard we would say that decision makers understood the importance of open source in AI.
So to briefly talk a little bit about our FSFE demands for the AI legislation, it's very straightforward. We say that AI should be fair, transparent and accessible, and this is only possible if it's open source. And then of course we have an argument on public research and public AI: whenever there is public money invested in research on AI, then it should also be public AI. At the moment the AI Act is still being discussed at the European Parliament, so nobody knows what the final text will look like. We don't know if the European Parliament is going to go back and look at the guidelines from the AI resolution. It is not clear, but I feel like we are stepping in wherever we can, and we are still going to try to monitor what's happening there. But so far it's really difficult to see how that's going to develop from now on. I think this is going to be voted on by the end of the year in the last plenary, I don't know. And finally I want to talk a little bit about the Declaration of Digital Rights. This was also an initiative from the Commission. They want to have these guidelines as a reference point for the digital transformation of Europe. So we decided to also step in, because, as with the Berlin Declaration and the Tallinn Declaration, these are always guidelines that we use to talk to decision makers and to public administrations, because they are there and they talk about public procurement and free software. So we said, let's try to also influence the wording in this declaration. Again, this is just a guideline document for ongoing legislation, so a lot of people didn't really see the point of working on this paper, but personally I saw that it was the baseline for discussing further legislation, so we went for it.
And this was super interesting, because we approached decision makers in the European Parliament, and then they had an opinion as the European Parliament. In this opinion, before going into the inter-institutional negotiations, the European Parliament's position was that open source or free software was included for AI systems. So there was a nice article there in which open source was included. However, once the three institutions sat down to discuss, this wording was completely gone. The final text that was signed by the three institutions completely removed the part on open source in AI, and we just have a reference to promoting interoperability, open technologies, and standards. That's the final text, and it's super unclear. I mean, is it open standards, is it only standards? For us, it was not the ideal outcome, because we were quite happy with the opinion from the European Parliament, but then this was completely changed, and that's what happens most of the time whenever the three institutions sit down to discuss. However, again, let's look at the bright side as well. We saw that the European Parliament in its position included open source or free software. So again, this shows that there is will, or there is an understanding from the European Parliament as well, of the importance of open source or free software in AI or in public procurement. So, yeah, unfortunately, this is how this ended up. Not the best outcome, but this is what you get sometimes when you try to advocate with these European institutions. So just to talk a little bit about what's ahead of us, and I think you already got it: we have very ambiguous wording in the documents that we have so far, so we're really trying to advocate for clear and consistent wording about free software in ongoing legislation.
So we cannot change what's done already, but now that this is being included, we want to make sure that the wording is clearer and also consistent. So again, the European Commission doesn't have to come up with a new wording, with a new inclusion, with something different. They can just reuse what is already there. And we want to make sure that if we get to this point, this wording is clear and of course that it benefits the free software ecosystem in general. And then, of course, there is a problem with implementation, because, yeah, we have nicely worded documents, but in practice it's a little bit different. So we want to keep monitoring how many of these legally binding documents are being properly implemented. So basically we just have to keep advocating for public money, public code, and trying to make sure that there is a proper implementation of such documents. And last but not least, and I think that's one of the most important ones, we really want to keep demanding that there is governmental budget allocated to free software. Because as we can see, there is will, there is some text, but if there is no budget, then that becomes very difficult. So that's what we have ahead of us. It's not quite easy, but at least we have seen the transition and the whole process. We have already pinpointed what we need to focus on, and we're just going to try to do our best to do so. And for this, we also need our community. It is important to talk to decision makers, but it's also important that the free software ecosystem, the free software community, also approaches administrations and raises awareness of this matter as well. So you can convince your local administration, and for this, if you are interested, I also invite you to see the talk from one of my colleagues in the Community devroom.
He's going to explain more about how you can actually get active in the framework of our Public Money, Public Code campaign, because sometimes people don't really see the power that you have to reach out to your local administrations. And, you know, we're not talking about the European Parliament or the European Commission; we're talking about the library of your town, which is also a public administration. So I invite you, if you are interested, to check out that talk. You can sign our open letter, of course, as individuals or organizations, and if you also want to convince your local administration to sign the open letter, well, that's pretty nice. And of course, donations are always welcome. We're a charity, and we are really trying to work as much as we can to come up with legislation that benefits the whole free software community. And of course, spreading the word. I know that this Public Money, Public Code campaign is quite well known, but, you know, there are always people who don't know it, or people who don't really know what free software is, so all these things are super important. We have a brochure on Public Money, Public Code, but we also have a brochure that we have prepared on AI, which we also use to reach out to decision makers. It's on our website, so if you want to take a look at this position paper and distribute it, feel free to do so. And yeah, with this, just to close up: I don't want you to leave this room feeling a little bit upset or, you know, sad. Personally, I feel quite positive about what's happening at the moment with other files as well. 
It is just a matter of, you know... that's why I really like these spaces and these events where we can talk to each other, we can discuss, and then we can actually bring all these positions and all these concerns to decision makers, because there is a gap between the community and decision makers, and we're trying to close that gap or build the bridge, so that the future legislation that is coming really benefits free software. So yeah, thank you very much, and now I'm very happy to take questions. Okay, yeah. Yeah, I'll go. I wonder why I don't hear anything about the changes compared to five years ago. Five years ago, open source did not exist in any enterprise, at the European Commission, or in any organization. Nowadays, anyone can use mainly, 90% or 100%, open source software, and I hear you only complain that it's not free software. The software that is graduated by the CNCF is standard, it is supported by many organizations, there's a rigorous process to get it through and to get a new release, and I don't hear anything about what happened, how much of a change we have had in the last five years, and how much better the world has become since five years ago. We had only Windows and VMware and IBM, and now we have large, large organizations supporting a huge amount of code, and I hear you complain. I don't get it. It's a big difference. Yeah, I mean... Can you please repeat the question for the live stream? Well. Okay: why am I complaining that it's not free software, although we have seen some changes in the inclusion of open source in the EU institutions over the last five years, right? That's basically it. Well, but I mean, the open source solutions that have been used in the European Commission were not publicly available until they just released this public repository. Before that, they were not publicly available. So they were using InnerSource, they were using open source within the institutions, but that was not open to the public. 
And that's what we are demanding. You have Kubernetes, Thanos, the whole lot; all the software is open source. But is it available to the public? It's available to anyone in the world, and it's used in the Commission. Yeah. It's used in the Commission. Yeah, but that was... They're building software in the Commission. Yeah, exactly. And that's not available to the public. And that's our demand. I mean, now they're using open source, but is it available to us? No. Until now. I mean, I'm sorry, but I was not really trying to complain at all. Or not only complaining; I was also highlighting the will that these EU institutions have, because I can see it as well. There has been a huge shift and a huge change, and there is will now. Now we have a public repository of the free software used in the EU institutions. That's something. But I'm sorry, this is not good enough yet. One short comment and one short question. The comment: maybe in the future you can advocate for "don't change what you just wrote", so that they can't remove what we like. The second thing is, when you say that AI must be accessible, what do you mean by that? Yeah, with AI, it's a little bit tricky. So, what do we mean by AI being accessible? I also have to say I'm not an AI expert, but AI needs a lot of data to be trained, and if you're using open data, then the results of that AI research, whatever is built with it, should at least be available to people. I'm not saying all AI has to be open. That is another discussion. But if there is public money involved in the research, then it should be open, and people should be able to see how these AI systems are being trained and what kind of data has been used. Yeah. Don't you think the upcoming CRA is a very big threat to open source software? Sorry? Isn't the upcoming CRA one of the biggest threats to open source? Yeah. 
Yeah, this is a question regarding the CRA, and whether it may be a huge threat for the free software community. At the beginning of this talk, I mentioned that I wanted to focus on Public Money, Public Code, so basically on public procurement and open source. I would be very happy to chat with you about this, because we have also been meeting decision makers on this file, and so far it is still a little bit unclear how we're going to move forward from our side, and also from their side. This is just starting, so I cannot really tell you more. For this topic, I would prefer to keep it out of this talk, and I would be very happy to stay with you afterwards so we can chat a little bit. I think people should be aware of what they're doing now. Yeah. I mean, yesterday at 11, and you can see the recordings, the European Commission was here, also with Red Hat. They did a panel on the CRA. It was super interesting, so if you couldn't attend, I invite you to watch the recording of this panel. Do you feel like this movement is collaborating with the movement in the scientific field, where if it's public research, the paper should also be public? Yeah. This is my time, so I'm going to reply to this. Yeah, so the question is whether this is in line with the research community as well. This is something we have noticed, and we are really trying to focus more on research now, because I feel like this is a community that we have left a little bit aside and behind. With AI, we noticed the importance of including all these research communities. This is definitely something we want to keep working on and try to do in a better way, because it is definitely a community that can contribute a lot to us. I think they're quite open to being open, to free software and so on. Yeah, this is on our agenda. Yeah, thank you very much. |
The “Full-Stack DPGs”
Build open, build early, build right. |
Thank you everyone for coming. Today, Justin and I will be talking about full stack digital public goods. At the start of this dev room, we went very briefly through what digital public goods and the Digital Public Goods Alliance are, but we will go into depth again for those who are not aware. We want to start with our introduction, then a recap of what a digital public good is and the context around it, what we mean by a full stack digital public good, and also the levels of scale of digital public goods: how can they be full stack, and where does the community participate in that? So who are we? Your first speaker, me, Vipul Siddharth. I work at UNICEF as an open source technical advisor and have been contributing to the Fedora Project for almost one third of my life. And I love making coffee, maybe not drinking it so much because of the anxiety. And then we have Justin. Hey everyone, my name is Justin. Today I'm working with Red Hat as a community architect for the Fedora Project. But in a past life, I was also at UNICEF working in the Office of Innovation, not directly with the Digital Public Goods Alliance, which we'll tell you a little bit about in a minute, but around the DPG ecosystem of projects and the many different organizations that are all engaged and working around this big topic of digital public goods. So even though I'm no longer with UNICEF today, neither of us is really directly representing our employers. We're coming more as fellow free software activists and open source fans, sharing some of our thoughts on how, when we have a standard around these kinds of important things around public goods, maybe there's still a missing ingredient. And that's what we'll tell you a little bit more about. And yes, I'm still buying CDs and ripping them with all free software tools, a little old school there. But I'll pass it over to Vipul for a quick recap on DPGs for folks who weren't here. 
Yeah, so we are going to do a quick refresher on three things today. What is the digital public goods standard? What is the Digital Public Goods Alliance? And how does the UNICEF Venture Fund act as a digital public goods pipeline? So let's start quickly. How is a DPG defined? We'll look at the DPG standard, which is maintained by the Digital Public Goods Alliance. At the heart of digital public goods, there are three main components. A digital public good comes with certain promises: promises of open source licensing, platform independence, and transparency in decision making and the roadmap. But it also aims to advance the Sustainable Development Goals of the United Nations, which means addressing social or environmental issues and striving to ensure accessibility, affordability, and equitable distribution of the technologies through Sustainable Development Goal relevance. And the tools do no harm. There are different relevant best practices that tools have to follow. They can be open source software, content, models, or even documentation. As long as they adhere to relevant best practices, do no harm, and push the SDGs ahead, they are eligible to be digital public goods. We believe such goods, such tools, need to be identified, recognized, and supported because of their capacity to help achieve our goals. So those three components expand into the nine different indicators that we have. It starts with relevance to the Sustainable Development Goals, licensing, and platform independence, but a DPG should also have good documentation, data that is accessible, a commitment to standards, best practices for privacy and security, and demonstrate that its components do no harm. In order to become a digital public good, it's quite simple. You nominate your software, content, or model. There's a technical review against all those nine indicators that we spoke about. 
And if it all matches, then you are listed on the registry, which is an application that lists all digital public good solutions so they can be looked into further. Now we want to look at what the Digital Public Goods Alliance is. It's a multi-stakeholder alliance founded by UNICEF, the government of Norway, the government of Sierra Leone, and iSPIRT. And it has a mission, as we discussed, to accelerate attainment of the Sustainable Development Goals, especially in low and middle income countries, by facilitating the discovery, development, use of, and investment in DPGs. And one of the core functions of the DPGA, which is the alliance, is to maintain the DPG standard: how do we recognize a tool as a DPG? Today, the Digital Public Goods Alliance has grown a lot. It was started by four, but now we have many more. And if you attended the keynote yesterday, the latest addition to the alliance is the Open Source Initiative. Finally, we are going to look into how the UNICEF Venture Fund acts as one of the pipelines to funnel more DPGs into it. Now, it may be weird to think that UNICEF, or a small portion of it, acts as a venture capital fund, but we invest in early-stage companies, solutions that have a working prototype with a lot of promising results. They match our goals of doing good for children and women and advancing the UN Sustainable Development Goals. Our main idea is that we take risks in emerging technology that may work, while also identifying what does not work. Early investment can accelerate our finding of good uses of technology that we have not found yet. So far, we have invested in 128 companies across 75 countries, and this number is growing because we just invested in four more last month. And I think another country is added to it. We care a lot about diversity: female founders of the companies, how many countries we are reaching, and their levels of development. 
And we invest in technologies like drones, blockchain, AI, data science, VR, a lot of emerging technology whose impact we have yet to see directly. Some of them are already having impact. That's where they lie in this graph, but our goal is to identify what's not been proven yet. And last, where does the open source part come in? Because at the heart of digital public goods, there's open source. There are three ways in which the UNICEF Venture Fund is different. First of all, the investment is equity free. We do not take any equity from the companies. So it's almost like a grant, but we have certain contracts, certain expectations, milestones to achieve. And the core of it is open source. Any solution that UNICEF invests in must be open source, ideally from the start, or by the end of the investment round it must be an open source solution. And not just with an open source license; it must have all those indicators that we just discussed: good licenses, a public roadmap, a place where you can open and report issues. The other thing that's very unique is that UNICEF provides mentors who work with the companies to help them develop. We have business mentors, we have software development mentors, and the place where I come in is open source mentorship, making sure that we are following the right practices. So far, out of 127 or 129, as of yesterday when I last looked, at least 16 of those companies come from UNICEF, where we have guided them and helped them achieve a certain standard, and now they are digital public goods. And I'm working with at least 15 companies that are planning their nominations within the next few months. And now I'll pass it to Justin to talk about what we mean by full stack digital public goods. So earlier, like Vipul talked about, there's that nine-point standard, which goes a long way in building on this, you know, open source base. We talk a lot about licenses in the open source world. 
But if you look around us, there are still all these other parts that are very important and often don't get factored in with as much emphasis as we maybe put on things like copyleft or permissive. So the digital public goods standard takes good steps forward in trying to provide more of a framework around how we can look at a digital solution and its impact on, again, the Sustainable Development Goals, but also how sustainable it is as an open source product or open source project. So what does this full stack piece mean? First, understand that this isn't something that's in the standard. This isn't something that's been published by UNICEF or Red Hat. This is just something that Vipul and I have thought about as we've been working with these companies and teams from all over the world, like you saw on that map, people who are often approaching these open source business models and the challenges of building a community for the very first time. So we've noticed some things that come up a lot across the mentoring that we were running with these companies and teams. So this concept of a full stack DPG is something that we made up for this talk to help identify a possible gap in today's DPGs and one possible way that we can address it. So let's look at four challenges with the current DPG standard and then identify what this concept of a full stack DPG is. First, the DPG standard is indeed a standard that's been reviewed and has a change process, but the actual review process itself is not quite standardized. There are indicators in the standard that give some guidelines around what you should look for in documentation, and what licenses are acceptable, thanks to the Open Source Initiative. 
And we have other kinds of things like, you know, adhering to data privacy and best practices, making sure that wherever the product is based or, if it's a company, wherever they're registered, they're in line with any local laws and requirements that apply to them. But while the indicators provide guidelines for whether they're satisfied, again, that review process itself remains without a common standard. We take kind of a best effort approach to dig into a product when they come and submit their nomination and understand: are they meeting these indicators? Have they taken the right efforts to address these parts of the standard? But as it is today, only the DPG Alliance can truly reproduce a review. So if a different organization attempted a review, the outcome might differ on some specific indicators, like the ones that are more loosely defined, like maybe documentation, possibly even the ones around do no harm. That's a worthwhile thing to try to standardize, but it's also really hard to think about all the different ways that something could be used. I mean, we can even look at the last 20 or 30 years of open source history and the ways that it has spun off in different directions. So where I'm going with this: it may depend on the level of consultation that's sought out and what backgrounds people are bringing to the table when they're going through that review process. And also how strictly you're following the legal side, what level of adherence you're paying attention to when you do that review. So while the standard does provide guidelines around what we're looking for in a sustainable, healthy open source project and community, not every indicator provides an easy policy decision. There's no binary here, like yes, this does no harm; yes, you are in line with the Sustainable Development Goals. We have indicators, but there's no fixed standard. It's not binary. 
So it doesn't always provide an easy policy decision or firm rules for proving an indicator is satisfied. And of course, that can also change over time. So will a technology stand the test of time? This is true not just of digital public goods, but truly of any open source project. You'll see a lot of conversations around supply chain in the open source space today, which are also relevant to digital public goods. So will something stand the test of time? Will these newer ideas last into the decades ahead of us as we go into this new era of free software in the 2020s? The answer varies from DPG to DPG. Some DPGs are accomplishing their goals with ubiquitous frameworks. They're used by several projects and they have a very solid base that's maintained by a wide community. Others take a new approach to solving a problem in an innovative, but perhaps unproven, way. So meeting the DPG standard only applies to the submitted product itself; it does not necessarily mean looking upstream at the building blocks that are chosen by that specific project. Sometimes this is actually really beneficial, because we might have new ideas emerging in the field that take a new approach to an old problem, and that ties into the innovation piece. But sometimes it also means that unforeseen risk will emerge. So this question is simply something that Vipul and I believe cannot be firmly answered by the DPG standard alone, at least in its current version. Next, on this whole part of funding and financial sustainability, a big piece here: DPGs don't necessarily come with a promise about their funding or their financial stability. It's not always clear when a DPG is very well funded and supported or when it's still early on. Maybe you've seen that innovation bell curve, with the early adopters and the laggards. It's not always clear where a product might be on that innovation curve. 
So this is perhaps by design, but without better guidance for the downstream consumers of the DPG registry that we talked about earlier, where you can find a huge list of digital public goods, it's hard to know as a consumer whether you're choosing an enterprise grade open source platform or going with an emerging, cutting edge platform that may come with some adoption risk. The point is not to say that every DPG needs to have some endowment and guaranteed funding from the start, but it should be easier for consumers of DPGs to know what grade of DPG they're working with when they're shopping the market for an open source platform. Next: change of ownership, change of direction. As with any private sector company, a change of ownership can mean a significant change of direction. Now the DPG standard does have, in the third indicator, a requirement that you demonstrate clear ownership of the tool or the product: that copyright is managed, and that the person submitting the DPG can actually represent the material, the intellectual property, that is being nominated. So there are steps to make sure that you're not using bits and pieces from other people that you don't have the legal rights to, which again comes into the supply chain piece a little bit. But if you look at the very recent history of the tech acquisition world, it may support this notion that sometimes things change and might put what was once a very stable platform or product in a very different direction. If this happens to a DPG, what will it mean for downstream consumers who are trusting in the DPG standard to find a reliable thing that they can build some infrastructure or some tool on? So while it might be standard practice in the private sector to pivot when ownership changes hands, it doesn't have to be this way for DPGs. 
So we would say that sustainability for the DPG standard should include greater protections if there is an unexpected change of hands and a DPG pivots in a way that goes against the standard. This could result in a new form of vendor lock-in for customers using an open source platform that might later be forked to a proprietary model. We don't want that. And in the worst case, the merit of having more IP under open source licenses provides greater assurance that maintenance of an open source platform is feasible. So where do we go from here? Do we need some kind of indicator around sustainability? How do we measure that? Sadly, this talk is not coming with some magic solution for how we patch sustainability. We can definitely have some conversations. But we have a pathway. We have some ideas, and we hope to inspire curiosity about how sustainability should be measured in the context of a digital public good. So we wonder: what does an indicator for sustainability mean? What might it look like? So this is what we propose would make a DPG into a full stack DPG. This very draft version of the concept is that a full stack DPG is a DPG with enhanced visibility into its operational stability and its applicability for downstream consumers in various parts of the innovation life cycle. Going back to what we said earlier, sometimes you're really looking for that very stable platform, something that you can guarantee has a strong base, a strong foundation, and good support. And sometimes you're looking to take some risk and try a new idea to solve a problem that the existing solutions with that proven base may not be able to address. And this is where I'll pass to Vipul to talk a little bit. Well, we'll be going back and forth on this one a bit to think about how we could try to promote this idea of sustainability in digital public goods. 
What are the other things we can do, perhaps around the standard, to enable this idea of sustainability and help people know what they're getting into when they find a DPG? So I'll pass. Thank you. So we believe that to build early, to build sustainable DPGs, early stage investment, community, and resources are key: an inclusive and diverse community, collaboration, transparency about how projects evolve, occasional revision of the project charter, and implementing a governance model that includes the community. So now we are going to look at three different levels of scale, and each example demonstrates some or many of these qualities. First, early stage. If you remember, the UNICEF Venture Fund invests in very early stage companies, and we provide structured mentoring to achieve high impact payoffs. I discussed open source mentoring, where we work very early on which licenses are right to choose and how to draft the project charter in a way that reflects our mission, vision, and community statements, slowly building a knowledge base within the company about how to be a model DPG or open source project, and moving on from there. And actually I might just go back to this one and add on to Vipul's point here too, putting on my former UNICEF hat. When these companies have questions and they're trying to figure this stuff out, it helps to have that mentor, someone who can kind of be a seer, someone who's been down the open source road before and can help answer some of these common questions and sticking points when you're just getting started. This is one of the really key values of that mentoring: you get someone to talk to when you're trying to figure all this stuff out at the beginning, someone who can either help answer your questions directly or help connect you to somewhere else in the ecosystem. 
So again, that early stage piece can be really helpful for thinking through these challenges from the start. I can think of a few projects where maybe it would have helped them if they had thought about some of these challenges from day one. So basically, around data and privacy, there are mentors who can come and help you audit what kind of data you are collecting, which can otherwise backfire later. So all of this mentorship really impacts future alignment. So I did mention I changed hats recently, so now I'm at Red Hat. I've been in the Fedora Linux community, which is the upstream for one of Red Hat's core products, for eight years, but I'm new to Red Hat. So I'm kind of still learning my way around, and one of the things I think is really cool, as I've joined the Open Source Program Office, is that we have this team called Vertical Community Architecture. This team at Red Hat helps establish and continuously refine upstream and community strategy for vertical markets, particularly trying to understand this upstream business model. So if I look at my Fedora piece, we're always talking about upstream first and not just trying to keep all of our patches downstream. We're always trying to go and help all these dependencies that are part of the building blocks for our products. So this team is helping other people who might not be as far along. You know, Fedora Linux is going on 20 years old now. We've been doing it for a while, but our movement is growing, more people are coming into the fold, and so this team at Red Hat in the Open Source Program Office is trying to help bridge that gap and help people understand these things. So investing early has long term payoffs, but the value add is not always packaged so well, and that's where I feel like this team tries to help with that packaging for the partners and people that we work with. I should give you an idea of how we talk about this team inside Red Hat. 
So this team supports vertically focused projects and initiatives inside and outside the company; creates potential new project engagements, places where we want to be involved that might help our upstreams or our own products; and helps map and cultivate a range of upstream options to these vertical growth initiative goals and gaps. They track upstream engagements on an ongoing basis, making sure we're sticking true to that upstream first ideal, and they also help develop metrics and processes for successful vertical engagements. They also have a part of their charter around community leadership support. This is actually providing very close strategic support to Red Hat led projects from inception, when they're just getting started, until they reach cruising altitude after they launch. It also helps to plan Red Hat's engagement focus and strategy and opportunities for leadership in these emerging and growing communities, and provides community expert consultation for strategic vertical accounts seeking open source strategy engagement. How does this tie in to what I was just talking about with DPGs? This is part of these early stage conversations when you're getting into these topics. Especially with DPGs, we're not really looking at the short term view here. We're really trying to look at the long arc of where we're going with these products, and even if they're in that very early stage, innovative part of the curve, we want to make sure they're thinking about the long term, so that eventually you can progress through that innovation curve instead of having an abrupt drop. So this is one really nice example that I found inside Red Hat of how this team is engaging with our upstreams and our dependencies to help people who are coming into the fold think through some of these challenges and problems. Back here. This is just me. No, this is me. No, I think this is the middle stage. 
Yes, sorry about the confusion. The lapel mic. Little dance here. So, coming from the UNICEF perspective, what happens once the investment round is complete and a company is at a stage where it can operate in the way that we set out for it? We also have a further level of funding, but this also ties into a general community perspective. It's about finding a framework to identify the right community for you. Where exactly do you go and talk to people? How do you form a community? How do you attract contributors? So there's discussion around that, but also around creating evidence of impact. What metrics should you observe? How is the conversation going around the tools? Focusing on that and, again, coming back to setting a governance model and revisiting the project charter: all of these things are equally important when thinking about community around early to middle stage companies. I want to give examples of two companies that graduated and later got more funding. The first company is Thinking Machines. It uses AI to support data-driven decisions. Along with UNICEF's East Asia and Pacific country offices, they have scaled into eight countries, seven in-country by now. And the second is Pixframe, which just secured 1.5 million licenses with the Guatemalan government, which is the first deal of such scale with a government. Pixframe is a game-based learning tool for children. In this part, I can talk a little more freely, because this is where I fall in the Red Hat Open Source Program Office. I won't try to bore you with reading all of this. That's kind of our public-facing part. You might find it interesting, but in a general sense, this is how this team inside Red Hat's Open Source Program Office ties into the middle stage piece. 
So as a project is growing and scaling and we're upping our engagement with the community, Red Hat's open source program office will actually assign a community architect for a project or an ecosystem of projects, depending on what the scope is. This is where we get very hands-on and are actively engaged in the project, trying to keep an eye on these issues and key areas of focus around sustainability, and trying to make sure that we're practicing active listening to the challenges in our communities. Because we're not just working with Red Hatters, we're working with people across a whole spectrum: it could be companies, it could be, in Fedora's case, many volunteer contributors, or other partners or customers of Red Hat. That was the first part. The second piece, after the engagement, is trying to make sure that we're living our values when we're working with these communities, that we're actually putting our money where our mouth is, working with our communities and supporting them as they continue to grow and scale. So I'm working in Fedora, which is an older community, but I also have many colleagues who are working with newer projects that are slowly approaching that cruising altitude I mentioned. This is a way that we help provide a more sustainable focus for these projects, to make sure that they're getting their voices heard, that active listening piece. And then we're actually taking steps and actions to make sure that we're supporting them and helping them grow in a sustainable way. So this is a small slide. Our point is: middle stage or early stage, investing in community really supercharges you, and it matters a lot to start as early as you can.
And I think also, to build on this, part of the whole theme of this talk around full-stack DPGs, and what we're getting at with the gap in the DPG standard that we talked about, is that this whole community piece isn't really defined, not just for DPGs but, I would say, probably across the board in the open source space. But when it comes down to it, you really have to take the steps early on. And if you do that, when you get to this mature stage, it makes it so much easier to avoid the challenges that will crop up. You'll still have challenges, you can't solve everything in advance, but at least you can try to avoid some of the bigger, existential ones, which might be governance or, in some cases, a license change. Oh, sorry. So the idea with this slide was a sudden change in community. We were talking about what a change in community looks like at a larger-scale company that has been doing this for a while and then suddenly wants to invest more in community, or where a community already exists but changes in certain directions. And we were discussing community backlash and how it can backfire, because suddenly some people, your partners, the industries you are working with, may not be aligned with you. So at a larger stage it can be challenging. You may have more resources to invest, you may have more people to work on building a community and on looking at how to do it the right way. But it can also hurt a lot of people who are already using your tools and suddenly face something they were not expecting. Hopefully the change is for the good, but communication can matter a lot. Even when changes are made for the right reasons, if they are not communicated properly they can be taken the wrong way and it can hurt a lot. For example, and this is a personal example, CentOS Stream.
If you know how the communication around CentOS Stream went: it was a long time coming, and in certain respects it was actually done for good reasons, but when communication is not done right at a larger scale, it can create a lot of chaos in the community. And part of this is that when you get into this mature stage, you're not quite that young early-stage project you once were, where you can make decisions fast and there are fewer people to consult and fewer stakeholders. When you reach that mature stage, again, having built community in from the start, it comes down to building consensus and acting with consent. You might not always have 100% consensus, but when you're reaching this mature stage as a DPG, it's really important to make sure you're reaching the right stakeholders and getting that consensus. You're engaging with your key partners or people who have a clear stake in the project. And ultimately it comes down to making sure you're acting with the consent of your community. So to wrap it up, and to leave some time for Q&A, our point is that it's not just the code that matters. There are multiple technologies, multiple angles to it, and your IP is not just about code anymore. It's all the things around it, all those different indicators that we talked about, and, almost more importantly, something that's missing: how do we identify a clear path forward in future? There are multiple angles to it and we are ready to discuss that with you. The value of a DPG lies in what is created around the code, not just in the code itself or the model itself. So let's invest in full-stack DPGs, which is also a call to action. How can we decide what full-stack DPG means, and how can we ensure its sustainability? That's us, and we are ready to discuss or take questions. We have a mic. No, we can repeat the question.
On the do-no-harm part: who decides, how does it start? There must be some political agenda somewhere that decides who is going to decide what's harm and what isn't. I can give examples of really bad things, in my opinion, like the Pegasus software that spies on people; some people would say other software can present more harm. The question is who decides what is no harm. To this, that's where we discussed how the Digital Public Goods Alliance makes the digital public good standard, and in that standard is how we define what do no harm is. Collection of PII, right? That's a harm we don't want to have: when it's a digital public good and there is software collecting your information in a way that doesn't follow the relevant best practices. If it is collecting only what it most needs, is it following the right security and deployment practices? I'll add maybe three prongs to this. The first one, I think, in your question was who gets to define this. Part of that goes back to the slide from the Digital Public Goods Alliance and all those key stakeholders that are there. In a sense, those are probably the biggest stakeholders. That doesn't mean they're always talking about the standard, or that every member organization is voting on what is do no harm or not, but this is the body where some of the voices are. But I'd also say, at the same time, all of these things are open and have open processes on GitHub, where you can open an issue and discuss with, I guess it would be, the program management and the co-secretariat. Anyone in the public can come and flag these issues and open a discussion around, hey, what about this part? All right, so we're at time. I'll just add one more piece on that question.
On the open source side, this protection from harassment piece is, I think, a nice example in the open source context, because in that UNICEF open source mentoring we help people with codes of conduct and help them think through how to handle a case of harassment, whether it involves users who are interacting on the platform or contributors who are engaging in the community. Not a perfect answer, but I would also encourage you to check out the DPG standard on GitHub: github.com/DPGAlliance/standard. Maybe we should put that link on the slide next time. But I think we are all out of time. Thank you all for joining us, and we hope to see you around FOSDEM. |
AMENDMENT The New EU Interoperable Europe Act and the Reuse of Software in Public Administration
Implications for OSS in Public Administrations |
Okay. So, I'm working on a project called the Open Source Observatory, which is managed by the European Commission, and it is... that should not be blue. Oh well. So, I'm working on OSOR. The general goal of OSOR is to help the free software community, in particular in public administrations, so that they can use free software. So, we have a bunch of projects going at the same time. We have a website, the Open Source Observatory, OSOR, where we publish free software and open source news. We have a knowledge centre where we're gathering case studies of countries that are using free software. We have intelligence reports on who in each country is involved in different parts of free software, and we have summaries of all the policies that they have in place. We organise community events, so this year we will organise three workshops and three webinars, and from each of these workshops and webinars we will be producing a report. And the latest project we've taken on is that we're going to compile these reports into a handbook at the end of the year, and this is going to be the best practices on how to use free software in public administration. So, we'll be looking at the different policies that people have in place and whether they work. The focus will be on finding experts who have already gotten free software into use on desktops, servers and services in public administration. We're going to look at all the projects and the problems that they've encountered, and we're going to try and find the solutions that they've found for all these different problems. But, as part of the OSOR work, we also try to get information out to the free software community, to help people interact with how the European Commission is working on all these regulations that are going to affect the free software community.
So, there is a procedure for gathering input and doing calls for evidence, but the Commission also works through people like myself to find the free software community where you are, and then bring the information, collect feedback, and try to get people to contribute to the process of making these new regulations. So, in particular, today I'm going to talk about the Interoperable Europe Act. Now, because we had a change in the schedule, I've got a bit of extra time, so what I want to do is make this more of an interactive session. Not the first half, the first half is going to be me reading slides, but the second half is where I would like to get input from everyone here. So I'm going to present the Interoperable Europe Act, and then I'd like to discuss how something like an Interoperable Europe Act can help the free software and open source community, and in what way the community, the Commission and the various bodies that are created by this Act can all collaborate and work together. So, in general, interoperability. The idea is to help governments and computer systems to interoperate and share data. This often happens at a national level already. It's already difficult at that stage, but the Interoperable Europe Act is going to work on this at the European level. And so, the types of problems it's trying to solve are, for example, very simple problems like reserving a parking space for your car when you travel into the next country across the border and you have to type in the registration plate of your car. If the registration plate isn't in the standard format, then that system may not allow you to reserve a parking space. A little bit more complicated is something like hospital beds. So, the idea is that a Belgian hospital phones a French hospital and asks, do you have a bed free? And France says, yes, we have one bed.
Then a Dutch hospital phones the same French hospital, and they say, do you have a bed free? Well, what is the answer? Is it one or a zero, or is it somewhere between the two? And so, the whole idea is that the idea of data sharing and trying to get all these hospital systems to work together is generally more complicated than expected. You have to have either a reservation system or a priority system, which is understandable to all the other 27 member states. So, these things get pretty complicated. The member states generally manage to get this to work on their own systems because there's a government at the top that can tell the hospitals, you know, please have a system that works with all these other systems or, you know, make sure that you can interoperate. The European Union can't do that because the European Union doesn't have authority over the hospitals of Belgium and the hospitals of France and the hospitals of the Netherlands. So, the European Union can't tell the hospitals what to do. So, the Interoperable Europe Act is about setting a standard that will apply to the member states for how they will interact with their hospitals and systems for reserving car parking spaces and other things to make all these systems work across borders as well as they are working within the borders already. So, that's an interoperability. And then there's also the idea of data sharing. And so, this is the third example of when a city wants to introduce some kind of traffic management system. They may find that other cities across Europe have already done something similar and they might find a good example. But will that data be available to them or will the system that has been created by that other member state, will that be available to this city so they can use the same framework and so they can learn from the same information that the successful city learned from. So, the current situation is that we have the European Interoperability Framework. 
This is the main piece of legislation. This has been updated a few times. The most recent version was updated in 2017. The main difference between this and what is being proposed at the moment is that the 2017 text is non-binding. So, this works on goodwill and cooperation between member states, but member states are not required to do anything by this system. Then there are also other informal ways that cooperation has been helped. There is the Digital Europe programme and the Joinup platform. So, the Joinup platform is a website where, for example, the Open Source Observatory is situated. And this is a place where a lot of different projects from the European Commission are available online, and they can share solutions and code repositories and general information or articles. And this is another way that interoperability has been helped. But again, it is a voluntary idea, because it is making information available in the hope that people will use it if they would like to. And then the third way that interoperability has been worked on already is through this informal network of CIOs from the member states and the expert group on interoperability of European public services. So, this is the human side: getting individuals to talk to each other and trying to get interoperability to happen that way. So, this is useful, but it could be a lot more useful in terms of making the cooperation between all these cities and member states more efficient. It could be good to lay down a set of minimum interoperability specifications, ways of sharing data: not just formats, but the technical specifications for how data will be shared and communicated. And then the third option is the idea of a binding requirement to have interoperability by default as an approach when designing new systems, and also to promote designing systems that are interoperable.
So, the European Commission has been gathering information, and this is the standard procedure when making a new regulation. They do a call for evidence, a call for input. They work on impact assessments. They talk to their own research centres, which is the JRC, the Joint Research Centre, and part of that is CEPS, the Centre for European Policy Studies. They've gone through these systems. They did online consultations and they got 134 replies for the impact assessment and 112 replies for the European Interoperability Framework. They also then performed interviews with specialists or stakeholders in all these domains. And presentations like this are the little annex on this, because from all this they've talked to a load of experts, but has anyone in the room been involved in any of these procedures? One, okay, that's two, fantastic. So, for everyone else in the room, this presentation is a way that the European Commission is trying to get information to the public, and to the free software community in particular, so that people are aware of what can be participated in, and when. So, the Interoperable Europe Act is not the only piece of legislation that is currently in force or being worked on. We already have the Single Digital Gateway, where Member States have an automated cross-border exchange of documents. There is the Open Data Directive, which is for the European Interoperability Framework for designing technical solutions for the reuse of documents. The European Strategy for Data, which is about interoperability of data. The Data Governance Act, which is about cross-sector data, supporting the Interoperable Europe framework principles and the core vocabularies of these. The Digitalisation of Justice, which is a general trend to make the IT tools used by the justice systems interoperable with each other.
And then there are two pieces of legislation that are currently still going through the legislative process. So, these are things that you can still get involved in. One is the Data Act, which is on interoperability of data among different sectors within Europe. And then we also have the regulation on digital identity. This is where Member States have to have digital identity wallets, and it is to create a standard so that these can interoperate between the different Member States. So, the Interoperable Europe Act will sit beside, or in some cases on top of, these existing pieces of legislation. So, the general goals are to create a consistent approach to European interoperability and to policy making across the Member States, and to establish an interoperability governance structure. I'll discuss a little bit later how exactly that will work and how people here can be involved in it. But this is the second thing: besides the technical issues of how interoperability should happen, there's also the governance of all these different structures that will be created, and how that governance should interact with users and experts. Then the last part is the idea of creating an ecosystem, a general environment for the EU's public sector, so that people can contribute to interoperable solutions, people are aware of what interoperable solutions exist already, and this innovation can happen in a collaborative way so that there is more reuse of code. Because we've seen quite a lot of countries now get to the stage of accepting free software as a good idea, and they publish a lot of source code for the things they're doing. But we don't yet see a lot of instances where a city from one Member State will copy a solution from another Member State and even start contributing back to that solution.
So the next step in terms of collaboration and making good use of free software is the idea of an ecosystem and getting the collaborative innovation to happen. So inside the Interoperable Europe Act there are these six chapters. I'm going to focus on three of them. The general idea is that it is mandatory interoperability that comes with assessments and support, with solutions that already exist, and with encouraging people to create new solutions. These pieces of text are from the actual Interoperable Europe Act. So at the moment what we have is that the European Commission has published their proposal. This then gets handed to the European Parliament, which is the directly elected members. These are often the easiest to interact with for people who are not too involved in the legislative process. It is also handed to the Council, which is the ministers or the governments of the Member States. So this is a second group that will be looking at the text from the Commission, and they will think about whether to accept it as is or whether they should propose amendments. So this means the legislative process for creating the Interoperable Europe Act is still ongoing. It's actually in the fairly early stages even, and it's still something where people can get involved, and that's what I'm hoping to facilitate here. So this is one piece of text from the current proposal for an Interoperable Europe Act. This is the share and reuse requirement, and it is one of the parts that will definitely directly affect the free and open source software community.
So what is probably interesting in this is the idea that they've created a piece of text to encourage the sharing of solutions, and this is what we are quite used to in the free software community. But the way they describe this is that solutions, which will generally mean software, shall be made available to any such entity that requests it and shall include documented source code. So that's their current definition of what it should look like to have Member States working together on solutions such as software. So this is what the Parliament and the Council will now look at, and we can interact with the Parliament and the Council to decide: can this be improved, should it be adopted as is? Or if other people want to change it, we can give our opinion on whether it should be kept or whether other changes should be included. So this is one part that I'd like to get some input on at the end of this session. If you have any comments on whether this definition is ideal or not, that would be interesting. The second part is in chapter two, which is quite a long chapter, but I've only included a small amount here. Maybe the most interesting point is that there will be the creation of the Interoperable Europe portal, which is a place where it will be possible to share solutions and data, to create some kind of interaction between people who are interested in this topic, and to increase transparency of how work is being done on this. One interesting part of article eight is that there is somewhat of a definition of what a free software or open source license is. So it says here, and I'll highlight some bits: an open source license means a license whereby the re-use of software is permitted for all specified uses in a unilateral declaration by the right holder, and where the source codes of the software are made available for users.
So this is the first time I've seen a free software or open source license being defined in a European regulation. Once again, this is something that we will have to think about and accept input on, and decide whether this is the way it should be defined or whether there are improvements that could be made. The second most interesting part in article three is that the EUPL, the European Union Public License, is suggested as the default license for public administrations publishing source code solutions. The EUPL, I'll just present it briefly, is a valid free software license. It's approved by the Free Software Foundation and the Open Source Initiative as well. It is a somewhat copyleft license. So the EUPL itself includes a copyleft clause: if you're using the EUPL and you publish software and somebody else modifies it, they also have to publish their source code, so it's quite a traditional copyleft in that sense. But it has a compatibility clause, and the compatibility clause says that if you're merging EUPL source code with another project under a license that's listed in the annex, you can change the EUPL license to the license of that project. So this means, for example, that the GPL v3 is one of the licenses in the annex, so if you have a GPL v3 project and an EUPL project, you can merge the two and distribute the whole thing under GPL v3. This is a very positive, very useful clause that they've inserted to work on license compatibility, which between copyleft licenses is always difficult. On the other hand, there are also licenses in the annex of the EUPL which are not copyleft licenses, or not strong copyleft licenses, and so it's possible that EUPL software would be merged with a software project which is under a non-strong copyleft license, and then the whole would be distributed without being under strong copyleft. So in one sense it's a copyleft license, but there's also no guarantee that downstream uses of that code will remain copyleft.
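The compatibility clause as described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the talk, not legal advice and not the text of the license. The only annex entry named in the talk is GPL v3; the "weaker" annex license and the unlisted license below are hypothetical placeholders, and marking the EUPL itself as strong copyleft is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    strong_copyleft: bool


# The EUPL's own classification here is an assumption for illustration.
EUPL = License("EUPL", strong_copyleft=True)
# GPL v3 is the one annex entry named in the talk.
GPL_V3 = License("GPL v3", strong_copyleft=True)
# Hypothetical placeholders, not real license names.
WEAKER_ANNEX_LICENSE = License("hypothetical weaker annex license", strong_copyleft=False)
UNLISTED_LICENSE = License("hypothetical unlisted license", strong_copyleft=False)

# Illustrative stand-in for the annex; the real list is in the license text.
COMPATIBILITY_ANNEX = frozenset({GPL_V3, WEAKER_ANNEX_LICENSE})


def license_of_merged_work(other: License) -> License:
    """Return the license a merged work may be distributed under when
    EUPL code is combined with a project under `other`."""
    if other == EUPL:
        return EUPL
    if other in COMPATIBILITY_ANNEX:
        # Compatibility clause: the merged work may take the other license.
        return other
    raise ValueError(f"{other.name} is not listed in the compatibility annex")


if __name__ == "__main__":
    for other in (GPL_V3, WEAKER_ANNEX_LICENSE):
        merged = license_of_merged_work(other)
        print(f"EUPL + {other.name} -> {merged.name} "
              f"(strong copyleft: {merged.strong_copyleft})")
```

The second case is the one the talk flags: merging with a listed license that is not strong copyleft yields a combined work that is no longer under strong copyleft.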
So that is the most unique part of the EUPL, I believe. A second aspect of the EUPL that is quite interesting is that it has been translated into all the official languages of the European Union, which is quite useful because some member states or cities have requirements in their law saying that they can only sign contracts that are in their own official languages. So if there's any difficulty with getting software put under, for example, GPL v3, because the GPL v3 is in English, there's always the option of getting the software published under the EUPL, and it can be kept under the EUPL or can later be converted to GPL v3. So that is the EUPL. It's getting more and more used at the moment. Here's just a map with a few examples of member states that are currently using the EUPL, but once the Interoperable Europe Act comes into force, if the current wording is kept, the EUPL will presumably be used by a lot more member states and local governments, and so it is a license that we're going to be seeing more and more use of. So chapter one and chapter two are the technical aspects of the Act; then chapter four contains the governance structure. This part is how interaction is going to be done with experts and with member states, and so the Interoperable Europe Act proposes making an Interoperable Europe Board and then an Interoperable Europe Community. These two will work together, the Board having a more executive function, the Community having more of an input role. They will both interact with national authorities, and national authorities will be required to create national assessment bodies, and there will be interoperability coordinators. The details of how all this will work might look a little complicated, might be a bit complicated as well, but are also not completely defined at the moment.
So there have been statements at conferences to say that the free software community will be included in the Interoperable Europe Community, and the fact that this has been explicitly mentioned by high-ranking people in the Commission means that it should be quite a real role. Which means that, as a community, we now have to think about how we want to be involved in an Interoperable Europe Community. If we are to have representatives, how do we choose them? If we are to work with the Interoperable Europe Board, what exactly would we be requesting, or what would we be suggesting? And so this is where I will soon come to the interactive part, because it will be up to us to think about how an Interoperable Europe framework can help the free and open source software community. How can it help the software itself in terms of data formats? What can we do to make sure the data formats and the systems that use them can be used by existing free software solutions? Or, if we are developing new solutions, what do we need to make sure that these things can all be made use of by the free software and open source community? So that is the general idea of the Interoperable Europe Act. It will have all these positive outcomes: legal certainty, of course, more innovation, more agility for public administrations, and connected digital public services. But the reason I am giving this talk here today is because I would like to have your suggestions. I was about to suggest, possibly from the people who have already been involved in these various procedures. But do you have a microphone for questions, or how do we do this? So you could repeat any questions you get, so that they are audible on the live stream, because you have the only mic, on your shirt. Yeah, and people aren't going to come over here and start. Okay, first question. Yes.
Yes, sir. The definition of open standards, or the definition from the open knowledge side. I am not sure, because a lot of the questions. So the question was: while drafting the text of the Interoperable Europe Act, has there been work to coordinate with the definition of open standards and the open knowledge centres' work on defining open standards? Is that correct? Yes. So when the European Commission is writing their initial proposal for legislation, a lot of the actual drafting happens internally in the Commission. They will have consulted with certain experts, and when they did the call for input it would have been possible for an open knowledge centre or any of these organisations to participate during that stage. A lot of the time these things get missed by people who should be interested in these topics, and that is partly just because there are so many different pieces of regulation going through the system that we can't be aware of everything that is happening. So it is possible that they have not been involved during the elaboration of the text up until now. But at this stage of the legislative process, where the text gets handed over to the Parliament and to the Council and is visible to the general public, there is now a chance for everyone involved to comment on the text, and that's what I'm here for today: to make sure the people who weren't aware of it until now are aware and can start participating. We had one other person. It is a regulation. I didn't have the word on the slide, but the text is publicly available. As far as I remember it is a regulation, and so it would be directly enforceable, whereas a directive has to be implemented by the member states, so there is one step of implementation there. But it will be a regulation, so it will be binding on all the member states. Yes, sorry, I forgot to repeat the question.
So the question is what the general feeling is on whether the text will be changed now or what the member states and the Parliament think about and what their capacity is or the capability is for dealing with topics such as this. So we have not yet gotten much feedback from the member states. There are a lot of pieces of legislation going through the system at the moment partly because there will be a new set of policy makers elected next year and so there is somewhat of a rush to get some piece of legislation through. Technical topics like this are always a little bit difficult for policy makers to work on and there is always the worry that will our issue get the attention it deserves but what I have generally found is there are always a few people in the Parliament who do understand this topic they understand whatever topic is given to them. There are also in the Council each member state has their own set of experts and they will have somebody who is an expert on it could be standard setting or it could be digital solutions but they will have people who do understand these topics so it is not so much that the MEPs don't work directly on this themselves but each MEP in the Parliament they will have a team of assistants and each group in the Parliament will also have an external team of policy experts and so if you are looking at the MEPs and you are wondering will this person really understand the text you have to take into account that it is not the MEP on their own who works on the text it is their office who will work on the text and so within each MEP's office there will be well each MEP that is going to work on this directly there should be somebody who is who understands this and that is part of the job of anyone who wants to get involved in it is to identify the people in the Parliament in the Council who are the experts and who do understand this and then try and work with them to improve the text and then within their groups and within the Parliament 
they will generally be seen as an expert and then they would have a larger influence on how the text should be changed. I think it is something that is shared when I hear the questions, and I think it is really a big political game to get to the table and to get the right people at the table. For example, as a member of a foundation [unclear], I just made the budget for what it would cost our foundation to properly react to the Digital Markets Act, and to also participate in community input would be a really big stress for our organization. So I think we could have lots to contribute there, lots of useful input, because I think the Digital Markets Act could have been much better if they had consulted [unclear]. But they're out of resources to do it, and still it's a big game of who's there and who should be there. The comment is that it is very important who is at the table and who will be involved in these negotiations and in giving input, and it's always a difficult decision to make on how many resources a community organization or even a company can put into participating in this particular legislative procedure when there are also so many others that are also very important to work on. And yeah, it's a very valid comment, and that's what we're trying to address through this presentation, through coming to FOSDEM. I'm trying to be part of making it easier to participate in this. At least people will be aware of it. And then, whatever can be put into it will hopefully be beneficial. Everyone does their best. I'm wondering how you think this will benefit the general open source community? These acts, they say that agencies must publish as open source but they don't say they will use it. What we often see is that they just hire a bunch of consultants from Atos or Capgemini or whatever, but they don't hire actual open source developers. 
They just hire a bunch of people using a consulting framework. Then they just download the free software for free and they say, oh look how much open source we need, but that doesn't benefit us as a community. I do think this will improve that. Because I don't see any things that like what the previous commenter said about funding of the community and supporting the community really. It just says you must make open source but it doesn't say support the real makers. So the question is how will this really benefit free software open source developers if we already have a lot of government agencies who already download free and open source software and they use it but this doesn't actually lead to contributions to the software or it doesn't help other people who are writing projects. And so I would say the way this could benefit free software developers is by making public how interoperability will happen, it will make it easier for a project to develop the software necessary to interact with these services in that way. Because at the moment if we have 27 different ways of reserving a hospital bed somewhere and if all these, if the specifications are not public, well then it's very difficult for a free software developer to come along and say well I'm going to write a hospital bed reservation system. But if the specifications and the data formats and the procedures for performing these tasks were transparent and public and then that's already one step that would be useful. And then if France develops a system and they publish the source code for theirs and Belgium develops a system and they publish the source code for theirs then this also gives us reference implementations for how to interact within these frameworks. And so I think I would have thought that was the problem you asked but would you like to do a follow up comment? 
Yeah, I don't think that that's because I think then it's just the consultancy firms that develop these open source services and I mean as an open source developer you probably don't invest up front a lot of money and time to do something. I don't know how you see that. I couldn't then still be developed by big companies. Okay, is this the same comment? Yeah, actually the add-on to this. The thing that I see happening is of course I'm from the States so I have a whole different environment I have to deal with. But they sat there and go oh here's billions of dollars for open source. Okay, great. What's the procurement path into that money being distributed? Because all that's going to happen is you're going to have these huge consultancy firms come in. If we're lucky they dump 8,000 lines of code on us which doesn't go through our process, doesn't go through any of those things properly, they don't behave correctly. There's all of these different problems that happen and there's nothing to actually address that piece of the fact that there's not a good procurement process. I know in the States it's horrible. There's no way an individual developer could ever go through the process to ever see any of that money. You know, it's impossible. I don't know what the processes are like here in the different countries either. But I know that that procurement pathway ends up being the thing that's really devastating to actually helping the open source structures that exist. Because most of them actually are non-profits too. I mean how many of them are set up as actual non-profit structures and do we have a clean way to get that government money? No, you have to go through. You have to be a business. You have to be a business of a certain size. Okay, so the question then is... 
The question is how free software developers get access to the funding for developing free software in public administrations, and how you avoid all this money being given to big consultancy firms who dump code without participating in community procedures, while the procurement rules are of no use to free software companies who would be able to do proper collaboration with the community. Procurement is a national-level competence, so the European Commission is not going to be telling Member States how to work on procurement. What will happen with interoperability is that the formats and the procedures and the frameworks will become transparent, and there will be reference implementations at least available. So in that sense, where it was very difficult to interoperate with all these public administrations, with open specifications and with source code and implementations it should be a lot easier at the very least. Who ends up getting paid to develop the software? That is something that will be dealt with outside of this legislation, and it will have to be worked on in some other texts, probably at the national level. Two for you. Quickly. Is there willingness in the Commission, or recognition, that maybe we shouldn't just dump all this on GitHub and should have a platform within the European Union that is used to host this code? The follow-up on that is: are there any AI training clauses in this license? So the two questions are: should the European Commission have their own code hosting website instead of using GitHub, and are there any AI clauses? AI is not in this, but there is also an AI Act which is currently going through the legislative process, and so that's something that is still halfway through. If people want to get involved in that, that would be very useful. On GitHub: the European Commission's code sharing site is code.europa.eu. 
So this is something they put online I think last October, and so they have their own code hosting site, and this is one of the ways they hope to encourage people and projects from the European Union to collaborate and work together, and so yes, it won't be on GitHub. This is only for European institutions and not for local governments? At the moment I believe the projects that are there are only from European institutions, but the goal is for it to be broadened out, and I expect that it will also be for local governments, but it has only been online for the last three or four months, so that's the start. Sorry, the question there was what goes on the code.europa.eu portal. Do you have one more question? We have time for two questions. Two, okay. I have a question: is there a consideration of how much complexity exists within the member states? We built a proof of concept and it kind of died in the drawer because there was no confidence in getting consensus among the 16 federal states of Germany, so the standards within each member state of the European Union are also really different, and getting consensus there is hard. The question is whether there is any consideration for the complexity within a member state in terms of implementing all these different IT systems. The Interoperable Europe Act will focus on making sure that interoperability is possible. How complex it is for a country which has 16 regions, or in Belgium where you've got seven governments, and how difficult it is for these things to work together, is still something that the member states have to work on themselves. Each member state can define their own number of regions and provinces and communes and communities, and that is something they all have to deal with. So those technical problems won't be solved by the Interoperable Europe Act, but by making specifications, file formats and procedures transparent it might have the knock-on effect of also making it easier to work on those problems. 
You mentioned the European Commission doesn't want to interfere in the procurement practices of national states. Why not? It could be harmonized in all of the EU. So the question is why the European Commission wouldn't get involved in procurement policies in member states. The answer is mostly that for certain member states certain things are very sensitive, and there are some things that some people wish were at a European level, and for other people it is completely unacceptable that this would be at the European level. And so even within public procurement, if you're going to have a national policy then you also have to think about regional governments and local governments and communities. So it would be very difficult to get agreement across all these different institutions, for everyone to say yes, the European Commission is the one who should define how the region of south-west Dublin should procure software. So it's just part of the complexity of the European situation. There was one question over here, and I'll give a chance to people who haven't spoken yet. Thank you. I see that the legislation is a good step forward, but on the other hand open source success depends on the right mindset, on community building and exchange. Is there any initiative ongoing to spread the mindset for using and reusing this open material? I mean, it's not done by just publishing; there must be a mindset to take that up and reuse it. So the question is whether there is any philosophy or mindset or strategy within the European Commission to encourage participation in open source development and collaboration with the community. So the Interoperable Europe Act is one thing, it's about interoperability. There is an open source strategy, and there is an open source programme office, which participated in a session yesterday on the main stage in Janson. So there are different activities happening within the European Commission to work on this from different angles. 
This is one that hopefully will help or if you think it doesn't help enough please get involved in the legislative process. It's the open part. But there are other initiatives also. It is something that's slowly expanding within the European Commission. My time is up so I have to thank you for the input. I'm going to hang around if people have more questions I will be available so thank you all very much. |
An introduction to async programming
Writing a Telegram Antispam Bot in Python |
Hello, everybody. Welcome to the Python devroom of FOSDEM. I'm really happy to welcome all of you here and to welcome Marc-André for his talk, an introduction to async programming. Thanks to everyone for coming so early for this talk, which I'm really looking forward to seeing with you. Thank you very much, and thank you for the introduction. It's really nice to see so many people here on a Sunday morning at 9 o'clock. I'm really happy about this. I wouldn't have expected so many people. I hope this is going to be interesting for you. There is a lot of text on these slides. I uploaded the slides to the website so you don't have to, you know, write down everything and read everything. There's a lot to talk about in async. So what I wanted to do is give a short introduction to async, and I wanted to frame everything around writing a Telegram bot, because that was, you know, the motivation behind the talk and how I came to doing this. So a few words about myself. I'm Marc-André Lemburg. I'm from Düsseldorf in Germany. I've been with Python for a very long time, since 1994. I'm a core developer. I've done lots of work in the various organizations around Python. So I was EuroPython chair, for example. I was on the board of the PSF. I'm a PSF fellow, and I've done lots of work in that area. In my day job, I am a consulting CTO or senior architect. I also do a bit of coaching. So if you have a need in that area, just ping me. So the motivation for the talk is writing a Telegram antispam bot. Now, why did we have to do that? We have a user group in Germany, the Python Meeting Düsseldorf, and we're using a Telegram group to communicate. And early last year, we started seeing lots of signups to that group, because it's a public group. Anyone can just sign up to that group. We started seeing lots of signups from strange people. And the people then usually started to, you know, send spam, send crypto links, you know, link spam. 
Many of those people were actually not people, but bots, and they scraped the contact info and started sending DMs to the various members. So it was, you know, getting to a point where it was not possible for us as admins to handle this anymore, because most of these people or bots were actually signing up during the night. So there was no one there to handle these. And so the idea was to write a bot which basically tries to, you know, check whether people are human, check whether the signups are actually from people who know Python. And that's what I did last year. So the idea was to have a scalable bot, because it needs to run 24/7. It also needs to be very stable because, you know, at night, no one is there to restart it. We needed something that is low on resources, because we wanted to have it on one of the VMs that we already run. And so we decided to look out for, you know, a library that we could use. And there is a very nice library called Pyrogram, which you can use for creating these bots. It's LGPLv3. It's fairly new. It's well documented, and it's actually maintained. So basically all the checks are there. And we started to use that, and we had great success with it. It is an async library. So what is this asynchronous programming? I'm going to go through the three different models that you have for execution in Python. And let's start with synchronous execution. So what is synchronous execution? Basically you write your program from top to bottom. The Python interpreter then takes all the different steps that you have in your program and runs them one by one, one after the other. So there's no parallel processing going on. Everything happens one thing after the other. If you have to wait for IO, for example, then the code just sits there and doesn't do anything. And of course, waiting is not really very efficient. So what can you do about that, if you want to scale up? One way to scale up in Python is to use threads. 
And probably many of you know about the GIL. I'm going to talk about that in a bit. But let's just mention what threaded programming is. Threaded programming is where the operating system basically assigns time slices to your application. Each thread can run for a certain amount of time, and then the operating system switches to the next slice and the next thread. So everything is controlled by the OS. This is a very nice and very elegant way to scale up. You can use all the cores that you have in your CPU. You know, in the past you usually had multiple processors in the servers, and you could use those multiple processors as well. There's one catch though with threaded programming, because it's controlled by the OS and not by the application, so not by Python. It is possible for two threads to try to access the same object, let's say, or the same memory area in your application, and do things on those memory areas. For example, you know, take a list, append to it, delete from it and so on. And if two threads start doing that at the same time, you can have clashes. And in order to prevent that, because you don't want to have data loss, for example, you have to put locks around these things to make everything work. So there is a bit of extra work to be done there. You have to consider how things work in the threaded environment, and you have to put locks around areas that can be shared between the different threads that you have. It is an efficient use of resources. So this is actually something that people try to get working. With Python, it's a bit harder. And because it's a bit harder, some years ago, async support was added to Python. So what is asynchronous? Asynchronous is basically focusing on a single thread, on a single core. 
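To make the locking point from the threading part concrete, here is a minimal sketch using only the standard library. The names `append_many` and `run_threads` are made up for illustration and are not from the talk; the sketch only shows the pattern of guarding a shared list with a `threading.Lock`.

```python
import threading


def append_many(shared, lock, count):
    """Append `count` items to a list that other threads also modify."""
    for i in range(count):
        # The lock makes the check-and-append below one indivisible step,
        # so no other thread can interleave between the two operations.
        with lock:
            if len(shared) < 1_000_000:
                shared.append(i)


def run_threads(num_threads=4, count=1000):
    """Start several threads that all write to the same list."""
    shared = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=append_many, args=(shared, lock, count))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every thread has finished
    return shared
```

Calling `run_threads(4, 1000)` returns a list with 4000 entries. The order of the entries depends on how the operating system schedules the threads, which is exactly the part the application does not control.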
It looks very much like a synchronous program, except that whenever you do IO, the application, Python in that case, can then say, okay, I'm going to run this function until I hit a spot where I have IO, for example, or I have to wait for something. And then I give back control to something called an event loop. And that event loop is then going to go through the list of everything that is scheduled to be executed and just run the next thing that's on that list. And so whenever you wait for IO, you can tell the program, okay, I'm done with this part of the application, and now I'm going to switch to a different part and run that part. So that's a way to work around the threading issues I just talked about. It's also a way to write code that scales up very neatly, very fast. It's a bit limited in terms of focusing on just one core. So, for example, you cannot use multiple cores that way, not that easily. If you want to use multiple cores, you can push work that is done in parts of the application that are not running Python onto these other cores; that's certainly possible. But if you want to scale up and use all the cores, then you basically have to run multiple of these applications in different processes, and then use up all the cores that you have. There's one catch with this. I mean, there's no free lunch, right? So all the parts in your code have to collaborate. Because it's application driven, all the parts that you have need to have certain places where they say, okay, I'm going to give control back to the event loop at this point because I'm waiting for something. Because Python cannot know that you're trying to wait for something. And so you have to tell Python that this is a good place to give up control. Now, why do we need this? This slide is about the global interpreter lock. How many of you know the global interpreter lock? Just a few. That's interesting. 
So what Python does is it keeps one big lock around the Python virtual machine that executes the Python bytecode. And it does this because it wants to support multiple threads. But at the time when this was added, threading was actually very new. Python is more than 30 years old now. So there's been a lot of development going on since Python started. And because Guido wanted to start supporting threading right from the beginning, he added this global interpreter lock to make sure that the logic that's inside Python is only used by one thread at any point in time. So what happens is that Python starts running Python code in one thread, then reaches a certain point and gives control back to the OS and says, okay, you can switch to a different thread now. But it does this under the control of this global interpreter lock. So it makes sure that no other thread is running Python at the moment. When it gives up control to a different thread, that thread will have been waiting for the global interpreter lock, gets the lock, and then starts executing. And this goes on for all the threads that you have in your application. So in a multi-threaded program that's running Python, you can just have one thread execute Python code at any point in time. And this is something that, of course, is not very scalable. It's also not as big an issue as some may tell you. Because if you're clever and you put all the logic that needs to run on multiple cores or multiple threads into parts of the program that don't need Python, for example, if you're running machine learning and you want to train a model, then you can just easily push off everything into C code, which doesn't need Python. And that can very well run next to Python in another thread. So that's certainly possible. But, of course, sometimes you don't have a chance to do that. And then you need to look for other things. And this is where async becomes very nice. 
So let's have a look at how threaded code executes in Python. The image on the right basically explains how Python works. So you have three threads. The orange is Python running. Then the yellow is basically the Python interpreter in those threads waiting for the GIL. And then you have some waiting for IO that happens in between. So if you look closely, you will see that it's not a very efficient use here, because there's lots of waiting, lots of yellow in there waiting for the GIL, lots of blue waiting for IO. Let's have a closer look at this. So this again is the picture that I had on the other slide. And I separated out all the waiting and all the execution. And if you move all the execution together, you will see that only one thread is running at any point in time. So the other threads are basically just sitting there doing nothing. Now, how can you work around this? You can use async programming for this. And async programming has the neat feature that you can actually saturate a single core very efficiently without doing too much work. So again, you have the execution here. You don't have three threads. This is just one thread that you have for one core, but you have three tasks running in that one thread. And the different tasks share the execution. And again, you have the orange here executing Python. You have some waiting for IO in here, or it could also be waiting for calculations to happen. And if you move everything together, you will see that the thread, the core, is really saturated. So everything is working out nicely. And it's very efficient. So how does this work? How many of you know coroutines? Okay, about one third. So a coroutine basically is very much like a normal function, except that it's possible to have certain spots in the coroutine where it says, okay, at this point, you can give control back to the caller of that function. And this is essentially how async programming works. 
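The idea of a function that hands control back at chosen spots can be sketched with plain generators and a tiny round-robin scheduler. This is a toy illustration only, not how `asyncio` is implemented; `task` and `run_round_robin` are hypothetical names, and `yield` stands in for the spots where a real coroutine would use `await`.

```python
from collections import deque


def task(name, steps, log):
    """A generator that does a little work, then hands control back."""
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield  # the spot where this task gives control back to the scheduler


def run_round_robin(tasks):
    """Toy scheduler: resume each task in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)  # run the task until its next yield
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(current)  # task paused, put it at the back of the queue
```

Running `run_round_robin([task("a", 2, log), task("b", 1, log)])` with an empty list `log` interleaves the work of both tasks in a single thread: `a` runs one step, then `b`, then `a` again.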
You have something called an event loop. The event loop calls these coroutines. The coroutine executes until it hits one of these spots where it can give up control. The coroutine gives control back to the event loop at that point. And then the event loop can execute something else in your application. And then at a later point, it comes back to that coroutine and continues executing where it left off. In order to make that easy to define and easy to use in Python, we have new keywords. We have async def, which is a way to define these coroutines. And we have await expressions in Python, which are basically places where the coroutine says, okay, you can pass control back to the event loop because I'm waiting for, let's say, IO or for a longer running calculation. And everything around this, all the support for this, is bundled in the package called asyncio, which is part of the standard library. So let's have a look at how this works in Python, to compare synchronous code and async code. So on the left, you have a very simple function. You have a time.sleep in there, which is like a simulation for the IO. So the application needs to wait for something. And then you call that function. And if you run the simple example, then you see that, you know, it starts executing, it starts working. Then it sleeps for two seconds, and then it's done. And then it's the end of that function. Now, in the async case, it works a bit differently. So what you do is you put async in front of the function definition. So you have to turn it into a coroutine. And then inside that function, we use await to say, okay, at this point, I can give control back to the event loop. And so what happens here is that you have a special function called asyncio.sleep, which is a way to, you know, wait for a certain amount of time in async. 
But while waiting for these two seconds, the event loop can actually go and execute something else. It's not possible to use await with time.sleep for this, because time.sleep is actually a blocking function, right? It doesn't give back control. So you have to make sure that whatever you use with await is actually a coroutine, so that it can pass control back through the coroutine that's calling it. And this is what I meant with everything has to collaborate. If you have places in your application that are not compatible with async, you have to be careful and you have to use certain workarounds to make it happen. So the next thing is that, you know, now you have a coroutine. Calling the coroutine will do nothing. Basically, all that happens is you get back a coroutine object. So it doesn't run. So in order to run it, you have to actually start the coroutine inside the event loop. And this is what asyncio.run does at the very bottom. And if you look at it, it defines two tasks, so basically two instances of that coroutine, and puts them into this tuple. The tuple is passed to asyncio.gather, which is a special function I'm going to come to in one of the next slides. It basically just takes the coroutines, creates task objects, and then executes them until all of them are done and then passes back control. So this is how you would run an async application. I already went through these, so I can basically just skip them. So what are the, you know, things in the asyncio package or module? A very important function is asyncio.run. This is basically the function that needs to be called in order to set up the event loop and run everything in that event loop. You typically just have one of these calls in your application, basically starting the event loop and then running anything that's being scheduled. Then you have this gather function. 
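The slide code itself is not in the transcript, so the following is a reconstruction of the kind of example described, under assumptions: the coroutine name `work`, the task labels, and the `run_demo` timing helper are made up here. It shows `async def`, `await asyncio.sleep(...)`, two coroutine instances passed to `asyncio.gather`, and `asyncio.run` starting the event loop.

```python
import asyncio
import time


async def work(name, delay=2.0):
    """A coroutine that simulates waiting for IO."""
    print(f"{name}: start working")
    # asyncio.sleep hands control back to the event loop while waiting;
    # time.sleep here would block the whole loop instead.
    await asyncio.sleep(delay)
    print(f"{name}: done")
    return name


async def main(delay=2.0):
    # Calling work(...) only creates coroutine objects; nothing runs yet.
    tasks = (work("task 1", delay), work("task 2", delay))
    # gather wraps them in tasks and waits until all of them are done.
    return await asyncio.gather(*tasks)


def run_demo(delay=2.0):
    """Start the event loop once and measure the total wall-clock time."""
    start = time.perf_counter()
    results = asyncio.run(main(delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(run_demo())
```

Because both coroutines wait at the same time, the total run time is close to one delay rather than two, and `gather` returns the results in the order the coroutines were passed in.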
Gather is, like I already mentioned, a function where you can pass in coroutines or tasks. And then it runs all these tasks until completion and then it returns. asyncio.sleep I already mentioned. You also have a couple of functions down here for waiting for certain things. So sometimes in an application you need to synchronize between various different parts. So there are some handy functions for this as well. Now what is this task object I keep mentioning? Task objects are basically just coroutine calls that are being scheduled. And it's a way for the event loop to manage everything that happens in the event loop. So whenever something is scheduled to be run, a task object is created. And this is done behind the scenes for you. You don't have to create these objects yourself. In fact, you should not create these objects yourself. You should always use one of the functions for this, like this create_task that you have down here. And then these task objects go into the event loop and everything happens for you. There are also some query functions down here if you're interested in what's currently scheduled on the event loop. You can have a look at the documentation for those. So how does this event loop work? It's basically just a way to do the same kind of management as the OS is doing for threads, except that it's done in Python. And the asyncio package manages one of these event loops. Now event loops can actually be defined by multiple different libraries. But what's important is that there should only be one event loop per thread. So you can have multiple threads, of course, each running an event loop. Then again, you hit the same kind of roadblock as you've seen with the GIL. But there might be ways in your application to make use of that. So that would be possible as well. Typically, what you have in an async program is you just have a single thread. And so you just call this run function once. Now, I mentioned blocking code. 
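As a small sketch of the task functions mentioned for the `asyncio` module, assuming Python 3.8 or later for the task names: `asyncio.create_task` schedules a coroutine, `asyncio.all_tasks` is one of the query functions, and `asyncio.wait_for` is one of the waiting helpers. The coroutine `fetch` and the task names are hypothetical.

```python
import asyncio


async def fetch(n):
    """Stand-in for some IO-bound work."""
    await asyncio.sleep(0.01)
    return n * 2


async def main():
    # create_task wraps the coroutine in a Task and schedules it on the
    # running event loop right away.
    first = asyncio.create_task(fetch(1), name="fetch-1")
    second = asyncio.create_task(fetch(2), name="fetch-2")

    # Query what is currently scheduled, leaving out the task running main().
    scheduled = sorted(
        t.get_name()
        for t in asyncio.all_tasks()
        if t is not asyncio.current_task()
    )

    # wait_for waits for a result, but gives up after the timeout.
    result_1 = await asyncio.wait_for(first, timeout=1.0)
    result_2 = await second
    return scheduled, [result_1, result_2]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both tasks are already scheduled before `main` awaits either of them, which is why they show up in `asyncio.all_tasks()`.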
So blocking code basically is code that doesn't collaborate with this async logic. And you have that quite often in Python. For example, let's say you're using one of the database modules. Not one of the async ones, but the regular ones. Those will all be synchronous. So you call, let's say, an execute to run some SQL. And that will actually wait until the database comes back with results. So in order to run this kind of code in an async application, you have to use special functions. There's a very nice function called asyncio.to_thread, which was added in Python 3.9, which makes this easy. So what these functions do is they spin up a thread in your async application, run the code inside that thread, and then pass back control via the threading logic to your event loop. So you can still run synchronous code, because the synchronous code is most likely going to give up the GIL. For example, if you have a good database module, then when you execute something, typically what these database modules do is they give control to other threads running Python code, because they're just running C code at that time. So this is a way to make everything work together. And of course, there's lots more. I'm not going to talk about these things because I don't have enough time for that. In fact, I'm already almost out of time. So I have to speed up a bit. Let's just do a very quick overview of what's in the async ecosystem. So of course, we have the asyncio standard library package. We have event loops inside the asyncio package. If you want fast loops, then you can use uvloop, which is a faster implementation and speeds up your async code by almost four times. You can also have a look at other stacks that implement event loops, like Trio, for example, where you can use the package anyio, which abstracts these things. So you can use basically Trio or asyncio loops underneath in your application when using anyio, and it abstracts away all the details. 
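The workaround for blocking code can be sketched as follows, assuming Python 3.9 or later for `asyncio.to_thread`. The function `blocking_query` is a hypothetical stand-in for a synchronous database call, and `heartbeat` exists only to show that the event loop keeps running while the blocking call sits in a worker thread.

```python
import asyncio
import time


def blocking_query(delay):
    """Stand-in for a synchronous call, e.g. cursor.execute() in a DB module."""
    time.sleep(delay)  # blocks the thread it runs in
    return "rows"


async def heartbeat(ticks, interval):
    """Runs on the event loop; only makes progress if the loop is not blocked."""
    count = 0
    for _ in range(ticks):
        await asyncio.sleep(interval)
        count += 1
    return count


async def main(delay=0.4):
    # to_thread runs the blocking function in a worker thread and returns
    # an awaitable, so the event loop stays free for other coroutines.
    result, ticks = await asyncio.gather(
        asyncio.to_thread(blocking_query, delay),
        heartbeat(3, delay / 4),
    )
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_query` directly inside a coroutine would stall every other task for the whole delay. Moving it to a thread keeps the loop responsive, at the price of the thread-safety concerns from the threading part of the talk.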
Now, there's a rather large ecosystem of modules and packages around the async world in Python. Many of these are grouped under aio-libs. So if you go to GitHub to that URL, then you will find lots of examples there. There are database packages there. There are things for doing HTTP, DNS, and so on. Something to watch out for is that the database modules typically don't support transactions, which can be a bummer sometimes. At the higher level, you have, of course, web frameworks again because, you know, everyone loves web frameworks. And, of course, you have API frameworks. The most popular one right now is FastAPI for doing REST APIs, and then Strawberry is coming up very strongly as a GraphQL server. Both operate async. Even Django is starting to do async right now. It's not fully there yet. If you're using Flask for synchronous code, then you might want to look at Quart, which is like an async implementation using a similar API. And the most famous one probably in the async space is Tornado, which some of you may know. It's very fast. Right. So let's go back to the bot. If you want to see the bot code, it's on GitHub. Just search for eGenix Telegram, and then you'll find it. How does it work? Very easy. You just subclass the client of that package. You do some configuration. You do some observability, so logging for things. I'm lazy, so I just, you know, put all the admin messages into another Telegram chat that I can manage, so I can see what's happening without having to go to the server. Because we actually want to catch all the messages of these people signing up to the Telegram group, and not just people who want to run bot commands, you know, slash something, we have to use the catch-all in here, and that's also why we need something that's very scalable, because the bot literally sees all the messages that go into that group, and it has to handle all these messages. 
And then what you do is basically you just do these awaits whenever you have to do I/O. And if you look at it, it looks very much like synchronous code, except that you have these awaits in front of certain things. Wherever something happens where you need to do some I/O, you put the await in front of it, and then everything else looks very natural, looks very much like a synchronous program. So what are the results of doing this? Writing this bot has actually helped us a lot. We've, you know, had almost 800 spam signups since April 2022, when we started to use it. And this is just one part, this is just the admin part that saved us a lot of work, but of course, you know, every single signup would have caused spam messages. And so that was very successful. So the time spent on actually writing things was well invested and basically mission accomplished. So what's the main takeaway of the talk? I think it's great. And give it a try if you can. Thank you for your attention. Thank you, Marc-André. Thanks everyone for attending this talk. Don't hesitate to reach out to Marc-André if you have any questions or want to discuss this further. Thanks a lot. Thank you. Thank you very much for coming. Let me just do a picture. Excellent. |
Accelerating object serialization by using constraints
How we achieved 3x-100x faster data serialization to a binary format or to JSON using low-level Cython and Python C API. |
Hello, everybody, and welcome to the second talk of the Python Dev Room, where Vadim Markovtsev will be talking to you about accelerating object serialization using constraints. I'm really happy to welcome him here, and I hope you will enjoy this talk. Thank you. So let's start with a short bio. I'm a backend developer. I'm also a machine learning engineer. I used to be a Google Developer Expert in machine learning before COVID. I have coded in quite a few languages in the last 12 months, mostly in Python and Go, a bit of C/C++. And I work at Athenian. This is a startup. It does a software-as-a-service product for engineering leaders. So the talk will be divided into two parts. The first will be about some custom binary serialization that I had to do in my daily job. And the second will be about speeding up JSON dumps. So the first is about how I coded something that was working faster than pickle by many, many times. And why did I have to do that? Well, apparently because I had a use case that didn't really fit well into regular pickling, into regular serialization of pandas data frames. And it's not necessarily a best practice to materialize a huge pandas data frame while you're serving a request. But nevertheless, I had to do that. And it was really big, a few megabytes at least. And I noticed that according to the traces, pickling it just took too much time. And we really needed to put it into a distributed cache. Otherwise, you would spend extra time recomputing it in every request. And that data frame was quite special. I mean, it's not a usual bunch of integers and floating points. No, it was quite complex. It contained strings, it contained datetime columns, which could be NumPy datetimes or just regular Python objects, it contained nested lists and dicts, and even nested NumPy arrays. So it was quite complex. And this was mostly the reason why regular serialization, Arrow and everything, worked too slowly in serializing it. So I came up with a custom binary format. 
And although it's not really open, it is public and everyone can study the source code. It's not universal. Really, it only supports the types inside the pandas data frame that we had to serialize at work. It's not backward compatible. It's not portable at all. It doesn't fit well into different CPU architectures, it always assumes Linux, and it always assumes CPython 3.11. And this is quite important. I'll talk about it a little bit later. So this is generally a really, really bad idea. And you only do that when you don't have any other options and you need to squeeze a lot of performance out of the backend. However, on the bright side, it was quite compact code. Cython, because performance. It's a single-pass serializer, really, really, really fast. And also it releases the GIL. This is something that regular pickle or Arrow cannot do because of the constraint that they have to be universal and pickle everything, any object. As soon as you have an assumption about what kind of object you can serialize, you can code it in Cython in such a way that you release the GIL and it works even faster. And it also allows other code to execute in parallel. And releasing the GIL is probably the hardest feature in this serializer because it was more labor intensive than I initially thought. You see, Cython offers C API wrappers for a lot of APIs inside the CPython API. And they are great. However, as soon as you release the GIL, you cannot use them. Cython just doesn't compile it. And this is a good thing because to use these wrappers without the GIL, you need to be absolutely sure that you are doing the right thing, that you are not using the heap, that you're not writing to any Python objects, and also that those Python objects are not mutated somewhere else. This is all true for our backend, but it can be different in other use cases. So this is what happens. And what if there is no C API at all or some C API wrapper is missing? You have to re-implement it from scratch in Cython. And this is what we had to do quite a lot. 
Final note is that, for example, PyPy has a garbage collector that can relocate objects from time to time. So the pointer to the object can change. And this is why this serializer just doesn't work with PyPy even in theory, because it always assumes that as soon as you have a pointer to a Python object, it never changes. Anyway, this is how the high-level serialization of pandas works. And this is not really specific to my serializer or to pickling, this is just how it works. So every pandas data frame currently in 1.x has a block manager that maps column names to different blocks. Every block has the same data type. And this is a two-dimensional NumPy array underneath. This 2D NumPy array doesn't know anything about what the columns are or what they are named. It's just the internal storage, let's say. And whenever you reach a column in the data frame, you use the block manager to access it and do some operations. It works fast, yes, and it's also memory efficient. Every operation on a single column executes on a NumPy array underneath, so it's really efficient. And it also allows us to read this pandas data frame and serialize it as a bunch of NumPy arrays without the GIL. So this is great for us. One thing that we had to rethink properly is how we serialize the object data type in NumPy arrays. And we, again, don't support all use cases, only those that we know exist in our backend. This is some Cython pseudocode that shows the general idea of how we do this serialization. There is a function called PyArray_DATA that returns you a raw pointer to the underlying data in the NumPy array. Then you iterate by length. We only support one-dimensional arrays so far, like every column, and we serialize each PyObject independently. And this PyObject can in theory be arbitrary, but, again, in our backend, there is a predefined list of things that can be stored, like a string, a datetime, an integer, and so on. 
When we have to iterate standard library containers, such as a list or dictionary, it turned out, again, to be quite simple. You use PyList_GET_SIZE or PyList_GET_ITEM. For dict, you use the PyDict_Next iterator. This is so great. If you didn't have to release the GIL, you would just use the list or dict type in Cython. And Cython would generate somewhat similar code, also efficient. But, again, since we release the GIL, we have to do the, let's say, heavy lifting from scratch. For serializing integers and floats, you just convert the PyObject to a C type, like double or long. If it's a NumPy scalar, and a NumPy scalar is usually a structure, you use a special NumPy C API for extracting the scalar value, and you just memcpy it to your output stream. So, this is why we are not portable across CPUs, for example, because the order of bytes in the integer, this endianness, can be whatever. For Intel, it's little endian. For ARM, it can be little or big. So, since our backend always works on the same kind of CPU, and we are really sure about that, we can do these things without caring about endianness and converting to some other order, network order, and so on. And that's also the reason why it works so fast. To serialize a string, it's also nothing special. I have to say a few words about how strings are stored in CPython internally. It's quite smart, and I'm honestly really impressed by how CPython does it. It stores strings in three different ways: in a one-byte encoding, in a two-byte encoding, and in a four-byte encoding. And it dynamically chooses the encoding based on the maximum code point of the characters that have to be stored in the string. So, let's say if you only have ASCII characters, you choose the one-byte encoding and you're super memory efficient. If you have some crazy emojis, then you choose the four-byte encoding. And yes, if there are other characters, it's not so memory efficient. 
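The three storage widths mentioned above can be illustrated in pure Python. This is only a sketch of the selection rule (the thresholds follow PEP 393); the function names storage_width and payload_size are invented for the example, and the real serializer reads this information from the string object in C rather than recomputing it.

```python
def storage_width(text: str) -> int:
    """Bytes per character CPython would pick for this string (PEP 393)."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1   # Latin-1 range, which includes ASCII
    if highest < 0x10000:
        return 2   # Basic Multilingual Plane
    return 4       # everything above, e.g. most emoji


def payload_size(text: str) -> int:
    # Size of the raw character buffer: width times number of characters.
    return storage_width(text) * len(text)


for sample in ("abc", "\u017c\u00f3\u0142w", "hi\U0001F642"):
    print(repr(sample), storage_width(sample), payload_size(sample))
```

Because every character in one string has the same width, indexing stays a constant-time pointer offset, and the buffer size is a single multiplication.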
At the same time, when you address a string by an index, it works super fast because every character is guaranteed to have the same number of bytes underneath. Anyway, to copy a string to the binary output format, you just take the raw pointer to the string contents. And for the size, you get the number of bytes per character and multiply it by the length of the string. So, again, nothing special in theory, but you have to be aware of the internals. To serialize datetime and timedelta, CPython provides a special C API as well. It's worth mentioning that the internal representation of datetime and timedelta in Python is not a timestamp. It's not a single integer like in other programming languages. It's really a struct that contains the year, the month, the day, the seconds, and so on. So, you just use these getters and you serialize each integer to the output. Works fast. The same for timedelta, you get days and seconds. All of that allowed us to speed up pandas data frame serialization by roughly 20x. And, of course, the speedup can be really different depending on the nature of the data frame. If it only contains integers and floating points, the speedup will be marginal. Pickle is actually working quite fast there, so it works well for strongly typed NumPy arrays. So, does it mean that we kind of managed to beat pickle? Well, yes and no, because pickle is universal and it can do all things at once. And it's way harder to be fast at all possible use cases than to be super fast at one particular use case. So, I don't know how to say. Maybe yes, maybe no. But, however, there is also an elephant in the room. Can you see it? So far, I only talked about how fast we were at serializing objects, serializing data frames. But supposedly, this is not the only thing that you do. You also need to deserialize them back. 
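The field-by-field approach to datetime and timedelta can be sketched with the struct module. This is not the Cython code from the talk: the byte layouts below are hypothetical, chosen only for the example, and the sketch handles naive datetimes only. Native byte order ("=") mirrors the talk's decision not to care about endianness.

```python
import struct
from datetime import datetime, timedelta

# Hypothetical layouts: native byte order, standard sizes, no padding.
DATETIME_FMT = "=HBBBBBI"   # year, month, day, hour, minute, second, microsecond
TIMEDELTA_FMT = "=iiI"      # days, seconds, microseconds


def pack_datetime(dt: datetime) -> bytes:
    # Write each field as a fixed-width integer instead of one timestamp.
    return struct.pack(DATETIME_FMT, dt.year, dt.month, dt.day,
                       dt.hour, dt.minute, dt.second, dt.microsecond)


def unpack_datetime(data: bytes) -> datetime:
    return datetime(*struct.unpack(DATETIME_FMT, data))


def pack_timedelta(td: timedelta) -> bytes:
    return struct.pack(TIMEDELTA_FMT, td.days, td.seconds, td.microseconds)


def unpack_timedelta(data: bytes) -> timedelta:
    days, seconds, microseconds = struct.unpack(TIMEDELTA_FMT, data)
    return timedelta(days=days, seconds=seconds, microseconds=microseconds)


stamp = datetime(2023, 2, 5, 10, 30, 15, 250)
span = timedelta(days=-3, seconds=42)
print(len(pack_datetime(stamp)), len(pack_timedelta(span)))
```

The round trip through unpack_datetime and unpack_timedelta is the deserialization half: a format is only useful if both directions agree on the layout.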
Otherwise, it would be like that joke about a compressor that compresses everything to one byte but just cannot decompress it back. And decompressing back from this format is also complex, but since I don't have much time to talk about it, I'd rather leave it for next time. But in brief, this is the place where I had to do some extra work to stay performant. So, yeah. The second part of the talk is how we solved a similar problem, but for JSON serialization. In the backend, this is what we had: a lot of models like a data class with some fields inside, strings, floating points, nested lists, nested models. It could be several levels deep. If you work with FastAPI, that can seem familiar. If I'm not mistaken, Pydantic can offer a similar structure. So, it's not really about using data classes or something else. A lot of us have seen that. And the problem was we had thousands of such objects. And we needed to cache them and serve pagination. Again, this is probably an anti-pattern in a way, because you would ideally push down all the filters in the backend so that you don't have to materialize the whole list of objects to return. But in our case, we really had to do that. So, we loaded all the models from the cache. Then we deserialized the bytes to the actual Python objects, these data classes. We took only those that corresponded to the pagination and we converted them to a JSON string. So, two things went wrong here. Deserializing everything was super slow and converting to JSON was also slow. Our first attempt to fix it was actually lightweight. Let's just pre-convert all data classes to atomic objects, like dictionaries and lists. CPython is really, really, really fast at working with dicts and lists. So, we assumed that that would help. Then you just store in the cache those atomic basic objects, and you return those that are requested in the pagination. So, that was for the first call, where you create the cache. 
For the next call, you just serve those objects that correspond to the pagination. However, it was still slow. It was slow because conversion from data classes to basic objects through dataclasses.asdict was really slow, painfully slow. And serializing basic objects to JSON was also kind of slow. And for the subsequent calls, we had problems with deserializing all the objects. It was not great. Materializing a lot of Python objects requires you to do a lot of refcount increments and it just cannot work fast. There is no way it can work fast. So, we had to be inventive and add this extra to-JSON function that pre-serializes all the objects to JSON and also produces the index where each object begins. And this table of contents can be used in the future. When you have pagination calls, you just select the part of this huge JSON blob corresponding to the pagination and you return it. And since you only work with strings, this works fast. However, it really depends on the performance of this function that converts data classes to this huge JSON string with a table of contents. And, yeah, this can be skipped really. This function had to be implemented from scratch, unfortunately, because json.dumps doesn't produce the table of contents and it cannot do that. Also, to convert data classes to JSON, you still need to convert them to basic objects first using asdict. json.dumps cannot convert data classes directly. And the only way for us to move on was to really write our own serializer. And this serializer works using a so-called specification of the serialization. The thing is, these data classes are typed and you can scan these types to build a plan for how you should perform this serialization, how you should iterate the objects and how you should write them to JSON. Apart from that, we had to implement iterating lists and dicts. Well, we kind of already covered that. Converting integers and floating points to a string. This is really basic. 
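The table-of-contents idea described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the talk's Cython implementation: the function names to_json_with_index and page are invented, and json.dumps is used per element here purely to keep the example short.

```python
import json


def to_json_with_index(items: list) -> tuple[bytes, list[int]]:
    """Serialize items as one JSON array and record where each element starts."""
    parts = [json.dumps(item).encode("utf-8") for item in items]
    offsets, position = [], 1          # position 1 skips the opening "["
    for part in parts:
        offsets.append(position)
        position += len(part) + 1      # +1 for the "," (or the closing "]")
    blob = b"[" + b",".join(parts) + b"]"
    return blob, offsets


def page(blob: bytes, offsets: list[int], start: int, stop: int) -> bytes:
    """Cut elements [start, stop) out of the blob without parsing it."""
    stop = min(stop, len(offsets))
    if start >= stop:
        return b"[]"
    begin = offsets[start]
    # The slice ends just before the "," or "]" that follows the last element.
    end = offsets[stop] - 1 if stop < len(offsets) else len(blob) - 1
    return b"[" + blob[begin:end] + b"]"


records = [{"id": i, "name": f"item-{i}"} for i in range(5)]
blob, offsets = to_json_with_index(records)
print(page(blob, offsets, 1, 3).decode("utf-8"))
```

Serving a page is then a byte slice plus two brackets, so no Python objects are materialized on the pagination path.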
Int to str, float to str and others. We just don't think about it when we work with them in Python. You just convert them to a string, problem solved. However, since it works with the heap and it touches the internals of Python, you cannot use it without the GIL and you have to reimplement it from scratch. The same goes for datetime. Finally, for strings, you need to escape them. If you have a double quote inside a string, you need to escape it, and the same goes for a new line. And since JSON is UTF-8 and the internal binary encoding of Python strings can be one byte, two bytes, or four bytes, we also need to do this conversion. So this is quite interesting. So the main trick of the serializer is that if you have slots in your data class, these slots are stored as pointers inside the PyObject structure. You take the type of the data class, you get the slot members through the C API, and each member contains the byte offset where the pointer lives. You just read these pointers and you serialize the objects. Therefore, the serialization spec is recursive. You have the data type that you need to convert to JSON and you have a list of nested specifications with some slot offset. We only support predefined types like lists, floating points, strings, boolean values, times, not everything. We are not universal, we have constraints, but that allows us to perform faster. Oh, the code listing didn't load for some reason. Anyway, I don't have time for that. This serializer turned out to be 100x faster, really, really, really fast, so fast that we just don't see it in our traces and profiles anymore. So the goal was achieved. And it can be improved in the future by pre-computing the serialization spec inside the data class member. Final advice: doing less is doing more. I think it's also mentioned in the Zen of Python, probably. Know your tools. It's always a great idea to know your tools and learn the internals, how they work, because it will allow you to write more performant programs. 
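Since the code listing did not load, here is a rough pure-Python sketch of the recursive serialization spec idea. It is an assumption-laden stand-in, not the talk's code: build_spec, write, to_json and the two data classes are invented names, getattr replaces the slot-offset pointer reads, and json.dumps is borrowed for scalars where the real serializer reimplements number formatting and string escaping.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args, get_origin, get_type_hints


def build_spec(tp):
    """Scan a type once and return a nested plan of (kind, detail) tuples."""
    if get_origin(tp) is list:
        return ("list", build_spec(get_args(tp)[0]))
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        return ("object", [(f.name, build_spec(hints[f.name])) for f in fields(tp)])
    if tp in (int, float, bool, str):
        return ("scalar", tp)
    raise TypeError(f"unsupported type: {tp!r}")   # constraints, not universality


def write(value, spec, out: list) -> None:
    """Append JSON fragments for value to out, following the plan."""
    kind, detail = spec
    if kind == "scalar":
        out.append(json.dumps(value))              # escaping, true/false, numbers
    elif kind == "list":
        out.append("[")
        for i, item in enumerate(value):
            if i:
                out.append(",")
            write(item, detail, out)
        out.append("]")
    else:
        out.append("{")
        for i, (name, sub) in enumerate(detail):
            if i:
                out.append(",")
            out.append(json.dumps(name) + ":")
            write(getattr(value, name), sub, out)  # stand-in for a slot read
        out.append("}")


def to_json(obj, spec) -> str:
    out: list = []
    write(obj, spec, out)
    return "".join(out)


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Track:
    name: str
    points: list[Point]


TRACK_SPEC = build_spec(Track)   # computed once, reused for every object
text = to_json(Track('a "quoted" name', [Point(1.0, 2.5), Point(3.0, 4.0)]), TRACK_SPEC)
print(text)
```

The point of the spec is that type inspection happens once per class, so the per-object loop only follows a precomputed plan.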
Going low level once you learn the internals is your super weapon that allows you to do many things. You can break the rules and you can do magic. However, you should only do that if you've got your architecture right. Otherwise, it's just stupid. You optimize the place that shouldn't be optimized in the first place. And also, with great power come consequences, because when we release the GIL, we cannot use almost any of the standard library and we have to implement crazy things like UTF conversions. This is annoying and you need to be prepared for that. Thanks. |
pip install malware |
Can you hear me okay? Awesome. So Hugo is just here. He has just warned me that if I step here it will no longer pick me up on the camera, but I like to walk, so I'm going to do a little bit just to get it out of my system if that's okay. So I want to talk to you today about malware on the Python Package Index, but more importantly than that, I'm actually going to run some malware on my own machine to show you what happens. So I think if I do this, you don't have to, and you'll know what the dangers are, right? So you don't have to worry. So I'm going to go here just so the people, if there's a live stream, and if I'm popular enough for anyone to look at it, then they'll see me. Hello, everybody. I think we might be getting a double microphone here. How do I turn it off? I don't know. I'm just going to hide it. I'll put it there. It's all good. Right. So my name's Max, and I'm a developer advocate at Vonage, this company here, up there on the left, hearing something. Okay. So what I want to do, first of all, is just explain, because these guys have paid for me to come here. They've paid for my flight. I've come from the UK, so I'm going to do the bit where I tell you what they do, right? So what we do is we make communications APIs. So things like SMS, like voice calls, like video chats, two-factor authentication as a service on demand. That's what we do, basically. So what I actually do is I manage the Python tooling for that. So in my role, I've done quite a lot of work to actually understand how I can not get myself screwed over with malware, right? Which is where this kind of talk comes from. Unfortunately, my research on malware has started to annoy some people. So this is a colleague I actually work with who no longer trusts anything I send him, because he knows I'm researching malware and he gets annoyed now. He's like, hey, don't send me that weird stuff. So, unfortunately, he's like, come on, man, don't do this to me. 
But he said this is not going to be a slide in your talk, and I was like, yes, it is. This is going to be a slide in my talk. That's what's going to happen. Okay. So he didn't like it because I'm going into the Vonage Python SDK, right? But I made a version called not the Vonage Python SDK that I uploaded to PyPI. I'm going to show you what happens if you install it. Please do not install it unless you want to have a bad time, okay? But it is live and you could literally do it right now. Please don't, again, but you could. Maybe you should. It's up to you, right? So that's where we are with this, right? But before we get there, that's the foreshadowing bit where I say, look, this is Chekhov's malware. You know, we set it up at the start. We hit it at the end. That's what we do. I've just also hit that. That's a literal punch. Okay. So first question for you. Does anybody remember this website, this old font? Remember that? Quick question, yes? Anyone remember that? Who thought that said Google.com? It didn't. It didn't say Google.com. You're wrong. You're crazy. It says Goggle.com. And actually, this was a real website that in the mid-2000s, you might accidentally visit when you were typing Google.com into your browser. And if you did, I can see some nodding here. I can see some people might know about this website. But what would happen is it would do a drive-by download of malware onto your machine and it would basically screw over your machine. So this was like one of the really prominent examples in the early 2000s of typosquatting, where just making one typo would absolutely destroy you. So actually, what's really nice is I managed to get some archive footage of a machine being infected with the Goggle.com malware. And I'll show it to you now. So here's the machine, and here it is after malware has actually infected it. I'm just going to drink some water because I'm a millennial. So what are we going to talk about today? 
Well, hopefully we're not going to run on because I see panicked looks over there from Hugo. Thank you very much. So in this talk, we're going to talk about malware, as you might expect at this point, given the very obvious foreshadowing. We're also going to talk about how it gets onto your machine from PyPI. We're going to talk about how it gets made to look legitimate. We're going to talk about how it works, and we're going to talk about how we can protect ourselves from malware. As you can see now, because I'm reading my presenter view, not the actual view, and that's the issue there. So does that sound good to everybody who's here? Open question. I'm seeing, hey, more of that, more energy. I love it. So I feel like there have been some really very smart people giving talks, and I'm not that. So I'm just hyping you up. That's my job. I'm like that guy in the back. He's like, come on, guys, let's get going. Right. That's me. Okay. So right. Quick disclaimer. First of all, I'm a freaking idiot, as you've now learned. More importantly, malware evolves and it changes. A lot of stuff happens here. A lot of stuff goes on. And what I'm going to show you today is kind of what I'm currently seeing. I'm reading a lot of research blogs and stuff, you know, from really smart people. But the malware I'll show you works in the way that it's kind of currently working, and that will not be the same in a year's time or two years' time. This stuff is going to evolve and it's going to change as bad actors get better at dealing with malware or hiding the malware. Also, I am not a security professional. I'm just a guy. I just walked up here. They just let me. I haven't even registered for this thing. I just showed up and it was like, yeah, sure, we got a slot. Right. 
So essentially, you know, what I'm saying is the stuff that I found through my own research, but please don't, you know, please don't necessarily take what I say as gospel because I'm learning and sharing what I know. Is that cool? Is that cool with you? I love it. Okay. If we're cool with that, then I'd like to show you an image I generated with DALL-E because it's cute as heck. That's the only reason I included it. It's just really cute. Look at that little face. Look at those eyes. You got a 50% hit rate on eyes, which is why it's only showing one on there. But let's see. So the cost of malware, this is important because this is big business, right? Malware is big business. So these stats I'm about to show you, they came from a research study that was done by the Ponemon Institute last year. They studied 550 organizations. I've got to walk. I'm gonna walk. They studied 550 organizations that have been infected with malware and how they dealt with that. And they basically shared their findings. Right. So the question here, first of all, this is a genuine question. Shout out. What do you think the average cost of a data breach was for those organizations? Anybody? Shout at me. 100,000. Keep going. 300,000. Keep going. 500,000. Keep going. Say that again. A million. Keep going. I had 5 million, sold to the gentleman in the red shirt. So 4.35 million was the average cost of a data breach on PyPI. So that was not true. It's just a breach in those organizations. I just did a spin and distracted myself. It's the blood flow. So another question here. This is slightly more relevant to our talk today: what percentage of breaches were caused by compromised credentials? And I don't mean phishing scams. I mean some malware scraping your credentials and actually then using those credentials to infect a network. What percentage? 70. A little lower. 95. A little lower. 3. 
Slightly high. Okay. You know what? We can play this all day. I'll show you. It's 19%. So about one in five breaches were caused by this exact thing. I love your enthusiasm as well. Dig it. Right. And back here so the people, if anybody is watching this, can see me. But the question really I'd like us to think about is what does this mean? Thanks. That was my water break. And I'll tell you what I think it means is that developers are now a target for this type of malware, right? Developers are a real target. And actually there are two reasons that a malware actor might want to target a developer. Hello, new people. Welcome. There are two reasons. The first one is obviously because developers are installing something, you know, they're going to have stuff on their machine to exploit, but also because those developers might make software for end users and we might be able to screw them as well, right? That's awesome. For the malware. It's not good for us. I'm just enthusiastic. I don't love crime. So first of all, let's talk about remote code execution. So this is kind of the gold standard. If a bad actor gets onto your machine and they execute some code on there, what can they do? Well, we've talked about credential stealing, but also ransomware, also things like crypto mining and actually also crypto diversion. I saw a piece of malware recently that would actually siphon off payments that are supposed to go to your Bitcoin or Ethereum wallet and would actually just put them to a different wallet address, to the attacker's wallet. So there are some quite interesting use cases for this, unfortunately. Again, I don't love crime. But what's important there is that you can be a target as a developer, but also your end users can be targeted. So if you make software that's important that someone's using, you might download a dependency that behaves as expected, except for the fact that it includes some vulnerabilities. 
And that means that your users could be vulnerable, but alternatively, it could use outdated versions of dependencies itself or an outdated version of the package. That means that essentially your users will be vulnerable as well because it hasn't, for example, received updates, or if there's a CVE that's come out that says, look, here's a threat advisory, we need to patch this, you won't get that patch. So these kinds of things can be done to actually get your end users as well. Does that make sense? Good, because this is the most dramatic slide I put in, or one of them, which is this: Python developers beware. I'd like to set the scene for you here. It's a stormy night, it's a castle in Romania, a man in ragged clothes is running, he's running away from the castle, bats chasing after him, he hears the howling of wolves in the distance, and then lightning flashes, and he says, stay away, he screams with wild eyes, stay away from the Python Package Index. I didn't rehearse that, I'm kind of proud, actually. That went pretty well. Okay, so right, PyPI is what we're talking about today, because it's awesome, and I don't really think you should stay away, it's an awesome way to get our dependencies, right? I couldn't live without this thing, well, I could literally live, but I couldn't do my work without this thing. So I want to talk to you about why maybe it's not the safest place in the world in certain contexts. So the first thing is typosquatting, that's if you mistype something, you misremember something, you put it in wrong. This is the case, for example, with goggle.com, where a user would type it in wrong and get screwed over, they'd get that malware on their machine. But the same exists with PyPI, because if you type in pip install and then a package, if you type that wrong, it doesn't check, it doesn't say, oh, did you mean requests? It says, okay, yeah, I'll install ASDFJ. You know, that's fine too. 
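The missing "did you mean" check described above can be approximated locally. This is a sketch of one possible safeguard, not something pip does: the allowlist KNOWN_PACKAGES, the function check_name and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib

# Hypothetical allowlist: names a team has already vetted.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "beautifulsoup4", "pycurl"]


def check_name(name: str, known=KNOWN_PACKAGES, cutoff: float = 0.8):
    """Return ('ok' | 'suspicious' | 'unknown', closest known name or None)."""
    name = name.lower()
    if name in known:
        return "ok", name
    # A near miss of a vetted name is exactly what a typosquat looks like.
    close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    if close:
        return "suspicious", close[0]
    return "unknown", None


for candidate in ("requests", "reqeusts", "asdfj"):
    print(candidate, check_name(candidate))
```

A wrapper script could run a check like this before handing the name to pip install and stop on "suspicious" to ask for confirmation.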
So as long as that package exists, you will get it. And that can be concerning. So quick question here, what percentage of PyPI packages are estimated to actually potentially be using a typosquatting technique? Five, that's a good number. Any other guesses? 40, 20. 42, very specific, I love it. 41.9. About 3%. So a lot closer this time. You redeemed yourself, I respect it. So yeah, basically, a little smaller. But still, there are millions of packages on there. So this is a big ass number. This is a big old number. Right. So the next question, this is maybe relevant to us now, is what percentage of PyPI downloads are estimated to be of typosquatting packages? So in percent, how many times does someone download something they maybe didn't mean to? 10%. Okay. 2%. 4%. Sold. So 0.5 is about the right answer. So again, we're getting some people who, I just thought, yes, and I respect that as well. I love the enthusiasm that we're generating here. That's really the content. You guys are the product, right? So yeah, 0.5 is not much, but it's 1 in 200. And if you think about how many millions of things are downloaded from PyPI every day, that's a big deal. That's a really big number. Easy photo, good. Let's talk about types of squats. In fact, this is a typosquat of typosquat, which is a pun, and I'm so proud of it, I didn't sleep all night. Types of squats that you can get. So misspelling, pretty obvious. You mean to type something, you hit the wrong button, something goes wrong. For example, these are all real typosquats of the requests module. We all know the requests module, I hope, I assume. You send HTTP requests, awesome. Well, you might type one misspelling of requests, or another, or equests, any of those, and those are real actual pieces of malware that were found on PyPI. So if you had mistyped, they would be on your machine, which isn't ideal for you. 
There's another type, which is confusion typosquats, where the user misremembers the name of the package, and there may be some separator confusion or ordering confusion. For example, easy install, maybe there's an underscore, maybe there's a hyphen, maybe there's nothing, maybe it's install easy, who knows. So in this case, you might end up with some malware, but there's also version confusion, where basically you think, oh, this is a certain version, or maybe Beautiful Soup 4, bs4, that's the kind of thing where, bs4, I never scraped before, sure. So there are some examples where, you know, you'd want to basically consider these different versions. So for example, this is a real piece of malware that I saw on PyPI, request three. You might think, oh, a beta version of version three of requests, yeah, yeah, okay, I'll get that, I'll get that, that's good stuff. Do not do that, do not do that, get the requests module. Right, let's play a little game, because apparently all we do now is audience participation. So which of these is the malware? Choose now, top one or bottom one, hands up for the top one, hands up for the bottom one, heck yeah, damn right. Okay, another one here, libcurl or pycurl, which is the malware? This is a Python package, which is the Python package? Top one? Bottom one? Top one, a real split on that one. So I think the confusion there is because libcurl is the actual package that pycurl calls, that's the system package, but the actual Python package is called pycurl, and that's why if you guessed wrong, it's really sensible that you might guess libcurl, but it's actually malware. So once you've got some malware, how does it look legit? 
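The separator and ordering confusion described above can be enumerated mechanically, which is one way a maintainer might list names worth watching or registering defensively. This is a sketch under assumptions: the function confusable_names and the four-word limit are invented for the example. Note that PyPI normalizes runs of hyphens, underscores and dots in project names (PEP 503), so the variants without a separator or with reordered words are the ones that map to distinct projects.

```python
import re
from itertools import permutations


def confusable_names(name: str) -> set[str]:
    """Names a user might misremember: reordered words, different separators."""
    original = name.lower()
    words = [w for w in re.split(r"[-_.]+", original) if w]
    # Guard against a combinatorial blow-up for very long names.
    orders = permutations(words) if len(words) <= 4 else [tuple(words)]
    variants = set()
    for order in orders:
        for separator in ("-", "_", ""):
            variants.add(separator.join(order))
    variants.discard(original)
    return variants


print(sorted(confusable_names("easy_install")))
```

For a two-word name this yields five alternatives, each of which a typosquatter could claim if it is still free.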
So one way is that you can have dependencies, where basically you have a package that is itself innocent, doesn't do anything bad, but it includes a package as a dependency that is malicious. In this situation you might have the original package behaving as normal, but that's just to avoid suspicion, and the second package is the actual malware. We'll see an example of that very soon, and we'll see a live example depending on how good the Wi-Fi is in here. There's also malicious commits over time, and this is another real attack vector. This is not so much based on typosquatting, this is based on other elements of trust and abuse. So first of all, the project might be safe, nothing wrong with it, but then it builds up a user base, it's a useful package, people start to use it, and eventually maybe malware gets added. This was the case with a package called FastAPI Toolkit. So if we've heard of FastAPI, there was a Toolkit package which was adding some useful stuff, and the Toolkit package eventually, in version 0.27, actually added some malware, which has now been rolled back, but as I'll show you later, that doesn't mean you don't get that malware. I heard it, oh, yeah. It got me, it shot me to my core, it shot me to the very depths of my soul. I've gone very Manchester all of a sudden. Right, the other way this thing can work, come in, come in, join us, join us, move that way, the shift left principle. Okay, so the other way this can work is that a repo might get a new maintainer, so somebody who starts to contribute to the repo in a useful way, and they say, hey, can I just get some admin access, I want to maintain this repo, I want to take something over. Okay, awesome, sure. Oh, no, this person added malware, what a surprise, what a coincidence. And this is a genuine and real thing that does happen. Now, I want to drink some more water, so I'm going to show you a pretty slide with a cute snake on it.
It's nudge hacking, look at the cute snake, be hypnotized by it while I drink this water. Sick, thank you. Right, have we heard of starjacking, is that something that we're familiar with as a term? Shaking heads, okay, perfect. 10 minutes left, really? Oh, no, okay, we're going to be hauling, okay, right. Starjacking, what is it? I'll tell you what it is. On PyPI, they don't verify the URL that you give as the project URL, and that means you can exploit that. So, quick one here, I'll tell you the answer, we don't have time. Requests toolbelt or toolbelt requests: one of these is an actual tooling package for requests. It's the top one, the bottom one is malware. How do I know? Because I made it, and it's on PyPI, that's how I know. But we can see here, if we look at the page on PyPI, look at these stars, look at them, I've got 900 stars, I only did it yesterday, I'm that popular. And you can see this is a real thing, so if you go and check out a package, you can get screwed this way. So think about it, when you look at this, right, it's a real problem. Okay, right, we are pushed for time, so I'll describe a typical chain of events. The user installs a dodgy package that's been typosquatted or is otherwise confused, or it depends on a malicious one. That package runs, it decodes some base64-encoded code, which then downloads the true malware from a remote server. Okay, the upshot of it is a very sad snake.
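The chain of events just described (decode an encoded payload, then run it) leaves a recognisable trace in source code, so one modest defence is to scan a package's files for calls worth a manual look before installing. The sketch below uses only the standard library `ast` module; the set of call names and the sample string are assumptions chosen for illustration, and a determined attacker can evade a check this simple.

```python
import ast

# Call names that often appear in the "decode, then execute" pattern.
SUSPICIOUS_CALLS = {"exec", "eval", "b64decode"}


def find_suspicious_calls(source):
    """Return sorted (line, name) pairs for calls that deserve a manual look.

    The source is only parsed, never executed.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Covers bare names (exec) and attribute access (base64.b64decode).
        name = getattr(func, "id", None) or getattr(func, "attr", None)
        if name in SUSPICIOUS_CALLS:
            findings.append((node.lineno, name))
    return sorted(findings)


# Harmless stand-in for the kind of line hidden in a malicious __init__.py.
SAMPLE = "import base64\nexec(base64.b64decode('cHJpbnQoMSk='))\n"

if __name__ == "__main__":
    print(find_suspicious_calls(SAMPLE))
```

Legitimate packages also use these calls, so a hit is a prompt to read the code, not proof of malware.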
So, let's show an example, that's what we're here for. We want to see my machine get screwed, that's why we're here, that's why we're all in this talk, that's why I've got a full room, right? Yeah, yeah, yeah, come on boys, yeah, everybody. So, right, I said I worked for this company, right, and I maintain this particular SDK. You can look at it if you want, I just put the QR code in again because they paid for my flight, right, that's why I did that, you can scan it if you want. But the reason my colleague didn't trust me earlier is because I made a version of this called not the Vonage Python SDK. I didn't want to make it too obviously typo-able because I actually don't want someone downloading this, but this is a real package I uploaded. You can see it looks similar, it has the same number of stars as the actual package, so again, it looks pretty legit. But I want to show you what happens if you download that, so that's the plan. Just to summarize, we've got these two packages here that I've uploaded to PyPI, as of yesterday. I had to make a fake account because they deleted my first set, that's not a joke, I didn't get to go to this conference because I was redoing my malware. So anyway, the point is, I've got these two packages and what I want to do is show you what happens when you run them. Can you, nope, I've got a, oh dear, we're going to have a problem. I'm going to drag you over there, which I think, does it think I'm here? It does, great, there's a thing. All righty, so we can see here, this is going to be very difficult to manage. Essentially this is my Python SDK, and we can see here with the setup.py, it's normal, except for the fact that it includes toolbelt requests, which is a new dependency that wasn't there before. And we can see in the client class here, I've got request util, this is a random function I've imported. It doesn't do anything, but what's important there is I've imported toolbelt requests. Can we all see that, is that big
enough at the back? Great stuff, okay, awesome. So if I actually go and, where's my mouse, I'll show you this one on here. This is going to be quite a lot of dragging and clicking. This is my malware package, this is toolbelt requests, and we can see here that this setup.py looks normal, request util looks fine, nothing happens in here. But if we go to __init__.py, again, looks normal. Where's the malware? Well, let's scroll over a little bit, shall we? And I say this: this is a legitimate technique that is used by bad actors, this is real. So, if we scroll over, what do we, oh look, a base64-encoded payload. What could this be, who knows? Right, so this is what we end up with, and what happens is this command will decode that, and then it will run it. And because in the first package we had this import statement here, it will run as soon as we import the first package. The reason for that is that in the __init__.py file we've got this import of everything from client. So basically when we import a package we believe is called Vonage, and the module, the package itself, is still called Vonage, so a user may not know that they've downloaded the wrong package at this point, you will download and install and then use and activate that malware. So, what's left to do? Check out the malware, let's get some malware going. Right, so I've got this up here, and I've used a blank one because I'm hoping we can download it live, so let's do that. Can I type this while not looking? Not the Vonage Python SDK. Okay, Wi-Fi is working, nice. Okay, so this is a blank venv that I've got here. We can now see, if I do a pip list, that I've got toolbelt requests in here, which is my malware, and we can see here that I've got not the Vonage Python SDK. Okay, so now if I open up a Python shell, cool, and if I actually import that, what's going to happen? Import Vonage, what are we going to do?
Oh wait a minute, it's done it on my other screen. This demo would have been so cool if I'd just mirrored my screens. Where's my mouse? I'll show you what it did, I'll show you what it did. Can I do that? No, I can't, I need to do it this way. I'll show you right now. All righty, so that's what I want to show you for now with that, but because we don't have much time, and I've spent most of my time telling you what screws you up, I'm now going to tell you how you can protect yourself a little bit. So thank you, Mr. Astley, once again, for your dedicated service. It actually won't play because the Wi-Fi is not fast enough, but you can imagine it in your head if you want. So right, now we're back here, let's talk about protecting ourselves. Right, again, adorable snake, that's my water break, snake for the break. Okay, so as maintainers, if you maintain a package and you don't want your users to get the wrong thing like me, what can you do? Well, your dependencies might become compromised, right, so you need to think about how you can look for those compromises and deal with them. So this is the case where maybe, you know, a package like FastAPI Toolkit becomes malicious over time, or there's some vulnerability that's discovered. In that case, automated scanning tools can help you. Right, five minutes left, okay, no worries, we've got this, we're a team, I appreciate you giving me a little more, I like that. Okay, so, right, first of all, you know, there are things to use. We use this, we give these guys an awful lot of money, so if you want to do the same, please feel free, I guess, I'm not sponsored by them. There's also Dependabot, and there are other services that will scan your repos and just check to see if there's any malware or any CVEs that might, you know, introduce vulnerabilities. Also as maintainers, another cute snake, you might want to consider defensively typosquatting, which is where you preemptively typosquat your own
stuff, like I did there with not the Vonage Python SDK. You might do that yourself with your own package: you might register similar typos or confusions, you might want to make those yourself, and, oh, now it plays, huh, okay. By the way, I actually turned off, I actually don't know what that's up, wow, okay, wow, I did not think it would go like this, this is my second proper talk. Right, so when we are preemptively typosquatting, that means you can protect the packages that you're dealing with. So there was a person, William Bengtson, who decided to take some of this into his own hands. He typosquatted a thousand packages, and in two years they got over 500,000 downloads, which just shows how useful that technique was in stopping those 500,000 downloads potentially going to malware. Right, as package users, we've got other things we can do. The obvious one here: you know, if you type something in, you're at risk, so you might want to look at how you do that. What is maybe a good practice is to install from a file, so you can check what you've written. Vet your dependencies if you can, and again, you might want to use those automated scanning tools. Now, if you're using a mirror, you might want to check the latest safe version, and this is really important, because if you're using a mirror, what can actually happen?
I'll give you the FastAPI example again, but basically, mirror sites. So for example, if you're in an enterprise and you download all your stuff, it's cached at a mirror site. You can end up in a situation where, for example, 0.27 is malware and 0.26 is not. When that was discovered, the whole thing on PyPI went back to 0.26, but mirrors are often configured so that they only take the latest version, and so in that case you'd actually not get that safe version, it would keep the malware on your mirror. So just consider that as well, if that is how you install stuff. So because I've got like two minutes left, I'm just going to sum up really quickly. First of all, typosquatting on PyPI is a real attack vector, and so are benign packages that become malicious, as we've just said. So what you should do is vet your dependencies really carefully, and if you can, use automated vulnerability scanning tools, and if you really have to, you might want to consider defensive typosquatting, because that's a way to protect your users. But also, be careful when you're using mirrors, okay? So the final thing I just want to say, first of all, is never give security up, it will never let you down, run around, or even, and I quote, desert you. Thank you very much. Thank you. So just quickly, if you want to see the slides or any resources, they are there, and this is just the summary, because the other one wasn't super useful. If you want to tweet me and ask me questions, feel free, I'm an idiot, tweet me, it's fine, okay? That's me. If there is anything else, any questions, shout them at me, and I'll attempt to answer them. Thank you, Max. I really want to install your SDK. I haven't seen that video enough. |
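As a recap of the defensive typosquatting idea from this talk, a maintainer could start by enumerating the separator and word-order confusions of their own package name and deciding which ones to register. The sketch below is hypothetical (the function names are invented for the example). It also assumes the name normalization described in PEP 503, under which hyphens, underscores and dots are treated as the same separator, so only the "no separator" and reordered forms are listed as distinct names.

```python
import re
from itertools import permutations


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def confusion_variants(name):
    """Return distinct word-order and missing-separator variants of `name`.

    The number of permutations grows quickly, so this is only sensible
    for names made of a few words.
    """
    canonical = normalize(name)
    words = canonical.split("-")
    variants = set()
    for order in permutations(words):
        for separator in ("-", ""):
            variants.add(separator.join(order))
    variants.discard(canonical)
    return sorted(variants)


if __name__ == "__main__":
    print(confusion_variants("easy_install"))
```

Misspellings and version-style suffixes are not covered here and would need their own rules.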
Building a Semantic Search Application in Python, Using Haystack |
All right, everyone, can we get a big welcome to Tuana? Can you hear me? Great. So if anyone was here for the talk before, just a disclaimer, I'm not as good a public speaker. I think I enjoyed the malware so much, but it's all downhill from there, so just FYI. All right, so I'm going to be talking about building a semantic search application in Python, and specifically we're going to be using an open source framework called Haystack, and that's why I'm here. So a bit about me: I'm a developer advocate at deepset, and we maintain Haystack. And yeah, so this is some information about me, but let's just dive right into it. So the agenda I'm going to follow: I'm going to try to keep the NLP stuff quite high level and focus on the how-to-build bit, but I do have to give a bit of a high-level explanation, so I'm going to do a brief history on what we mean by semantic search. Please do not judge me for this example, Kardashian sisters. So let's assume we have a bunch of documents and let's see what would happen if we do some keyword search on it, and let's say we've got the query Kardashian sisters. You might get something a bit like this, which is great, and you can see that there's some clever stuff going on here, sisters maybe associated with siblings and family as well. Keyword search is still very widely used, and this is the type of result you might get from a corpus of documents you might have. But what if that's just not enough? What if I want to be able to ask something like, who is the richest Kardashian sister? How do I make this system understand what I'm trying to get to? So for that, let's have a look at this. There might be some names you've already seen here, especially the last one there. I think everyone and their grandparents have heard of this by now, ChatGPT. So these are language models. I'm going to briefly walk through where they get such impressive functionality from. So most of them are based on what we call transformers.
What those are doing is what I try to depict at the top here. So imagine that thing in the middle as the language model. And very, very simply put, obviously every model does something a bit different or for slightly different use cases, let's say, given a piece of text, they will produce some sort of vector representation of that text. They're trained on very vast amounts of text data, and then this is what we get at the end of the day. And this is cool because it's enabled us to do many different things. We can use those vectors to compare them to each other, like dog might be close to cat but far away from teapot, for example. And that's enabled us to do a lot of different things like question answering, summarization, what we call retrieval, so document retrieval. And it's all thanks to these transformers. And a lot of these use cases are often grouped under the term search, because actually what's happening in the background is a very clever search algorithm. So question answering and retrieval specifically can be grouped under search. All right, how does this work? I'm very briefly going to go through what these different types of models do and how they do what they do, and I'm going to talk about the evolution from extractive models to now generative models like ChatGPT, for example. The very simple one, and we're going to build our first semantic search application with this type of model, is often referred to as the reader model, simply a question answering model, very specifically an extractive question answering model. The way these work is, given a piece of context and a query, they're very good at looking through that context and finding, extracting the answer from that context, but it does need that context. Obviously, there are some limitations to these models because they're limited by input length. I can't give it just infinite amounts of data.
But we have come up with ways to make that a bit more efficient, and we've introduced models that we often refer to as retriever models, or embedding models. These don't necessarily have to be language models. I'm going to be looking at language models, but it could also be based on the keyword search that we saw before. What they do is they act as a sort of filter. So let's say you've got a bunch of documents, let's say you've got thousands and thousands of documents, and the retriever can basically say, hey, I've got this query, and these are the top five, ten most relevant documents that you should look at, and then that means that the reader doesn't have to look through everything. So we actually gain a lot of speed out of this. All right, finally, this is all the hype today, and one thing you should notice is that the document context, anything like that, I've chopped it off, it's just a query. So these new language models, they don't actually need context. You can give it context, but it doesn't require context. And this is very cool, because they produce human-like answers. What they're trained to do, the task, is not extracting answers, it's generating answers. And I just want to point out there are two things here. It doesn't necessarily have to be answers. So I'm going to be looking at an answer generator, but you can just, you know, prompt it to produce some content, it doesn't necessarily have to be an answer to a question. So we've been seeing this, maybe you've seen some of these scenes lately. So this is ChatGPT again, on the theme: who is the tallest Kardashian sister? It hasn't just extracted Kendall for me, it said, the tallest Kardashian sister is Kendall Jenner. Perfect. But let's see what happens if it's not a question. This is not my creativity, by the way, but I think it's amazing. Write a poem about FOSDEM in the style of a Markdown changelog, and that's what you get. There you go.
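The retriever idea from this passage, comparing vectors and keeping only the closest few documents for a later step, can be shown with a toy example. The three-dimensional "embeddings" below are made up by hand so that dog and cat point in a similar direction and teapot does not; a real embedding model would produce much longer vectors, and none of these names come from Haystack.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors, not the output of any real model.
EMBEDDINGS = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "teapot": [0.1, 0.2, 0.9],
}


def retrieve(query_vector, embeddings, top_k=2):
    """Return the names of the `top_k` entries most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine(query_vector, embeddings[name]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(retrieve(EMBEDDINGS["dog"], EMBEDDINGS))
```

The speed gain mentioned in the talk comes from the second stage only ever seeing the `top_k` results rather than the whole collection.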
All right, so these language models are readily available. You might have already heard these names, OpenAI, Cohere. They provide these increasingly large language models. There is a difference when we say language model and large language model, but leave that aside for now, let's not talk about that. There are also many, many, many open source models on Hugging Face, and if you don't know what Hugging Face is, I think, very simply put, I like to refer to it as sort of like the GitHub of machine learning. So you can host your open source models and other developers can use them in their projects or even contribute to them. And what's really cool about them, like I said, is your search results stop being just simple search results, they are human-like answers. So now let's look at how we use these language models for various use cases. For that, I want to talk about Haystack, this is why I'm here. So Haystack is an open source NLP framework built in Python, and what it achieves is basically what this picture is trying to show you. You're free to build your own end-to-end NLP application, and each of those green boxes is a high-level component in Haystack. There are retrievers that we looked at, there are readers that we looked at, we'll look at some different ones as well, and each of these is basically the main class, and you might have different types of readers, different types of retrievers. For example, there could be a reader that is good at looking at paragraphs and extracting answers, but there might be a reader type called table reader that's good at looking at tables and retrieving answers from those. There are integrations with Hugging Face, so that means you can just download a model off of Hugging Face, but also OpenAI, Cohere. Obviously you need to provide an API key, but you are free to use those as well.
Building an NLP application isn't just about the search component. You presumably have lots of documents somewhere, maybe they're PDFs, maybe they're TXTs, so there are components for you to build what we call your indexing pipeline, so that you can write your data somewhere in a way that can be used by these language models. Some of those components: we already talked briefly about the reader and the retriever, we're going to be using those. There could be an answer generator, a question generator, we're not going to look at that today, but that's really cool because then you can use those questions to train another model, for example. Summarizer, prompt node, we're going to very briefly look into that, but you get the idea. There's a bunch of components and each of them might have types under them. You can use data connectors, file converters as mentioned, pre-processing your documents in a way that's going to be a bit more useful to the language model, for example, and of course, you need to keep your data somewhere, so you might decide you want to use Elasticsearch or OpenSearch, or you might want to use something a bit more vector optimized, and these are all available in the Haystack framework. I talked about the nodes, but the idea behind building with these nodes is to build your own pipeline. This is just an example. You really don't have to pay attention to the actual names of these components, but to give you an idea: you are free to decide what path your application should take based on a decision. For example, here we have what we call the query classifier. So let's say a user enters a keyword, there's no point in doing fancy embedding search, maybe, so you might route it to keyword search. If the user enters something that's more like a human-formed question, you might say, okay, do some what we call dense retrieval or embedding retrieval. That's just an example.
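The routing decision described above can be pictured with a very small heuristic: queries that look like natural-language questions go to embedding retrieval, and bare keywords go to keyword search. This is a hypothetical sketch of the idea only. Haystack's own query classifier is not shown in the talk, and the rule and the branch names below are assumptions.

```python
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}


def route_query(query):
    """Pick a retrieval branch for a query using a simple heuristic."""
    text = query.strip()
    words = text.lower().split()
    starts_like_question = bool(words) and words[0] in QUESTION_WORDS
    if text.endswith("?") or starts_like_question:
        return "embedding_retrieval"
    return "keyword_retrieval"


if __name__ == "__main__":
    for query in ["Kardashian sisters", "Who is the richest Kardashian sister?"]:
        print(query, "->", route_query(query))
```

A production classifier would more likely be a trained model than a rule like this, but the pipeline shape is the same: one decision node, two outgoing paths.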
Finally, I'm not going to get into this today at all, but let's say you have a running application, you can just provide it through a REST API, and then you're free to query it, upload more files, index them, and so on. All right, so let's look at how that might look. First thing you do is install farm-haystack. If you're curious as to why there is farm at the beginning there, we can talk about this later. It's a bit about the history of the company. Then we just simply initialize two things. First, the retriever. Here we specifically have the embedding retriever, and notice that I'm giving it the document store, so the retriever already knows where to look for these documents, and then we define an embedding model. I mentioned that these retrievers could be keyword retrieval, or it could be retrieval based on some embedding representation. Here we're basically saying use this, some model name, so it's just a model, to create the vector representations. Then I'm initializing a reader, and this is a very commonly used, let's say, extractive question answering model. Again, some other model, and these are both off of Hugging Face, let's imagine. We've got this retriever, and it's connected to a document store, and we've got a reader. How would we build our pipeline? We would first initialize a pipeline, and then the first thing we add is the first node, and we're saying retriever. I'm first adding the retriever, and that input you see, inputs query, is actually a special input in Haystack, and it's usually indicating that this is the entry point. This is the first thing that gets the query, so okay, we've told it, you've got the query. I could leave it here, and this pipeline, if I run it, what it's doing is, given a query, it's just dumping out documents for me. That's what the retriever does, it's just going to return to me the most relevant documents.
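The slides are not part of the transcript, so the following is not the Haystack code from the talk. It is a small standard-library stand-in, written under that assumption, that mimics the shape just described: named nodes are added in order, the special input "Query" marks the entry point, and a pipeline with only a retriever simply returns the most relevant documents. `ToyPipeline`, `keyword_retriever` and the two sample documents are all hypothetical.

```python
DOCUMENTS = [
    "Arya Stark is the daughter of Eddard Stark.",
    "Tea is brewed in a teapot.",
]


def keyword_retriever(query, top_k=1):
    """Rank documents by the number of words they share with the query."""
    terms = set(query.lower().strip("?").split())
    scored = []
    for document in DOCUMENTS:
        words = set(document.lower().strip(".").split())
        scored.append((len(terms & words), document))
    ranked = sorted(scored, reverse=True)
    return [document for score, document in ranked if score > 0][:top_k]


class ToyPipeline:
    """Hypothetical stand-in for a node pipeline, not Haystack's own class."""

    def __init__(self):
        self.nodes = []  # (name, component, inputs), kept in insertion order

    def add_node(self, component, name, inputs):
        self.nodes.append((name, component, inputs))

    def run(self, query):
        outputs = {"Query": query}
        for name, component, inputs in self.nodes:
            # Every node gets the query; later nodes also get upstream outputs.
            upstream = [outputs[i] for i in inputs if i != "Query"]
            outputs[name] = component(query, *upstream)
        return outputs[self.nodes[-1][0]]


pipeline = ToyPipeline()
pipeline.add_node(component=keyword_retriever, name="Retriever", inputs=["Query"])

if __name__ == "__main__":
    print(pipeline.run("Who is the father of Arya Stark?"))
```

In this toy version a second node, such as a reader, would be added with `inputs=["Retriever"]` and would receive the retrieved documents as its second argument.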
I want to build a question answering pipeline, so I would maybe add a second node, and I would say now this is the question answering model node, and anything that's the output from the retriever is an input to this node. That's simply it. You could do this, but you could also just use pre-made pipelines. This is a very common one, so we do have a pre-made pipeline for it, and it's just simply called an extractive QA pipeline, and you just tell it what retriever and what reader to use. But the pipeline I built before, that's just a lot more flexible. I'm free to add any more nodes to this, I'm free to take any nodes out of this, so it's just a better way to build your own pipeline. Then simply what I do is I run what now looks like a very random question, but we'll get to it. Then hopefully you have a working system, and you've got an answer. Great. I'm going to build an actual example, so I want to set the scene, and I was very lazy. This is actually the exact example we have in our first tutorial on our website, but let's assume we have a document store somewhere, and it has a bunch of documents, TXT files about Game of Thrones. I'm going to make this document store a FAISS document store. This is one of the options, so let's assume I've got a FAISS document store, and of course I want to do question answering, and I want this to be efficient, so we're going to build exactly that pipeline we just saw before, a retriever followed by a reader. Specifically, I'm going to use an embedding retriever, so these are the ones that can actually look at vector representations and extract the most similar ones, and then we are going to have a reader, simply a question answering node, at the end. How would that look? I first initialize my document store.
This is basically, I'm not going through the indexing one just now, we'll look at that in a bit, but let's assume the files are already indexed, and they're in that FAISS document store. Then I've got a retriever, I'm telling it where to look, to look at my document store, and I'm using this very specific embedding model off of Hugging Face. I then tell the retriever to update all of the embeddings in my document store, so it's basically using that model to create vector representations of all of my TXT files, and then I'm initializing a reader. Same thing that we did before, I'm just using a specific model off of Hugging Face, this one is trained by the company I work for too. Then I do the exact same thing I did before. I'm just creating the pipeline, adding the nodes, and then I run maybe who is the father of Arya Stark, and this is what I might get back as an answer. The thing to notice here: the answers are very Eddard, Ned, and that's because it's not generating answers, it's extracting the answer that's already in the context. If you see the first answer below, you'll notice that there's Eddard in there, and this pipeline and this model have decided this is the most relevant answer for you. I could have printed out scores, you can get scores, I just haven't here, and then I said give me the top five. The first two, three, I think, are correct, so we've got something working. But what if I want to generate human-sounding answers? Eddard is pretty okay, I've got the answer, but maybe I want a system, maybe I want to create a chatbot that talks to me. Let's look at how we might do that. This is going to be a bit of a special example, because I'm not going to build a pipeline. The reason for that is, as mentioned before, these generative models don't need context, so I should be able to just use them. We've got this node called the prompt node, and this is actually a special node, because you can modify it based on what you want it to do.
You might have heard recently this whole terminology around prompt engineering, and that's basically used with models that are able to consume some instruction and act accordingly. By default, our prompt node is basically told, you know, just answer the question, that's all it does, but you could maybe define a template for it, what we call a prompt template. So I could have maybe said, you know, answer the question as a yes or no answer, and it would give me a yes or no answer, but obviously I need to ask it a yes or no question for it to make sense. Anyway, so I'm just using it like this, in the pure form, and I'm using a model from OpenAI. Obviously I need to provide an API key, and I'm using this particular one, text-davinci-003. I actually ran these yesterday, so these are the replies I got, and this particular one I ran a few times. So the first time I ran, when is Milosh flying to Frankfurt? By the way, spoiler alert, Milosh is our CEO. So I know who Milosh is, and I know when he's flying to Frankfurt, or when he flew to Frankfurt. And I get an answer: Milosh's flight to Frankfurt is scheduled for August 7th, 2020. This is really convincing sounding, fine, okay. But this one was actually quite impressive. Again, when I ran the same exact query with this model, I got: it's not possible to answer this question without more information. This is actually really cool, because clearly this model sometimes can infer that, hey, maybe I need more information to give you an answer, instead of what we now refer to as hallucination. Maybe you've heard of that term: these models can also hallucinate. They're tasked to generate answers. They're not tasked to generate, you know, actual answers for you that are truthful. Anyway, let's say, when is Milosh travelling somewhere? I love this answer: when he has the time and money available to do so. And then, I guess, I don't know which one is my favourite, this one, or the next one. Who is Milosh? A Greek island.
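The difference between the default behaviour and a custom prompt template can be shown without calling any model: the template is just the instruction text wrapped around the user's query before it is sent. The wording of both templates and the function name below are assumptions made for illustration. They are not Haystack's built-in templates.

```python
from string import Template

# Hypothetical default: just ask for an answer.
DEFAULT_TEMPLATE = Template("Answer the question.\nQuestion: $query\nAnswer:")

# Hypothetical custom template restricting the form of the answer.
YES_NO_TEMPLATE = Template(
    "Answer the question with yes or no only.\nQuestion: $query\nAnswer:"
)


def build_prompt(query, template=DEFAULT_TEMPLATE):
    """Fill the query into a prompt template and return the final prompt text."""
    return template.substitute(query=query)


if __name__ == "__main__":
    print(build_prompt("Is Milosh flying to Frankfurt?", YES_NO_TEMPLATE))
```

As the talk points out, the template only constrains the form of the reply. Without any supporting context the model can still produce a confident but invented answer.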
Lovely, okay, but the problem here is, you know, I could believe this, these answers are very realistic. So we're going to look at how we can use these large language models for our use cases, and what we're going to do is basically exactly what we did for the extractive QA one, and we're going to use a component that is quite clever, because it's been prompted to say, generate answers based off of these retrieved documents and nothing else. It can sometimes not work well, but there are ways to make it work well, and we won't get into all the creativity behind it, so I'll show you the most basic solution you might get. But this is going to be what we do: it's the same exact pipeline as before, the reader has been replaced by the generator. So I actually have Milosh's ticket to Frankfurt. It was the 14th of November, and as a bonus, I thought I'd try, this is my ticket, my Eurostar ticket, from Amsterdam to London and back. So I've got these, and they are PDFs. And so now I'm going to start defining my new components. So I've got the same FAISS document store, embedding dimensions is not something you should worry about for now, and I'm defining an embedding retriever here. What I'm doing is, again, I'm using a model by OpenAI, so I'm using an API key. So this is the model I'm going to use to create vector representations and then compare them to queries. And this time, I'm not using the prompt node, I'm using that clever node there, called the OpenAI answer generator. And you might notice it is the exact same model as the one before. We're going to briefly look at indexing, so we've got the PDF-to-text converter and pre-processor. And let's go to the next slide. As mentioned before, there are pre-made pipelines, so I could have just defined a generative QA pipeline and told it what generator and retriever to use, but let's look at what it might look like if I were to build it from scratch. And first, you see the indexing pipeline.
So if you follow it, you'll notice that it's getting the PDF file and then writing that to a document store, given some pre-processing steps. And I then write my and Milosh's tickets in there. And the querying pipeline is the exact same as the extractive QA pipeline you saw before. The only difference is, the last bit is the answer generator, not the reader. This time, though, it does have some context and it does have some documents. What did I get when I ran the same two questions? I got, who is Milosh? He's not a Greek island. He is the passenger whose travel data is on the passenger itinerary receipt. Now, this is the only information this model knows, so it can't tell me he's my CEO because I haven't uploaded any information about my company. So don't make something up, just tell me what you know. If I run, when is Milosh flying to Frankfurt? I get Milosh is flying to Frankfurt on the correct date and time. And then I had that bonus in there, who is traveling to London? I would get Tuana Celik is traveling to London. Now, if I were to run, let's say, when is Alfred traveling to Frankfurt? What I haven't shown you here, because I think it goes a bit too deep into building these types of pipelines, is that for the OpenAI answer generator, I could actually provide examples and example documents. Just in case I'm worried that it's going to make up that this Alfred, who doesn't exist, is traveling to Frankfurt at some time, I can give it some examples saying, hey, if you encounter something like this, just say I don't have the context for it. So I could have just run query pipeline.run, when is Alfred traveling to Frankfurt, and it would have told me I have no context for this, so I'm not going to give you the answer. This model that we saw does do that sometimes. In the first example we saw, it did say I don't have enough context for this, but not all the time.
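The mitigation described here, answering only from retrieved documents and declining otherwise, can be sketched as a prompt builder plus a guard. Everything below is an assumption for illustration: the instruction wording, the fallback sentence, and the `generate` argument, which stands in for whatever function actually calls the language model. It is not how the OpenAI answer generator node is implemented.

```python
NO_CONTEXT_ANSWER = "I don't have the context to answer this."


def build_grounded_prompt(query, documents):
    """Return a prompt restricted to `documents`, or None if there are none."""
    if not documents:
        return None
    context = "\n".join(f"- {document}" for document in documents)
    return (
        "Answer using only the documents below. "
        f"If they do not contain the answer, reply: {NO_CONTEXT_ANSWER}\n"
        f"Documents:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query, documents, generate):
    """Answer from retrieved documents, declining when nothing was retrieved.

    `generate` is a hypothetical callable that sends a prompt to a model.
    """
    prompt = build_grounded_prompt(query, documents)
    if prompt is None:
        return NO_CONTEXT_ANSWER
    return generate(prompt)


if __name__ == "__main__":
    def echo_model(prompt):
        return "(model reply for a prompt of %d characters)" % len(prompt)

    print(answer("When is Alfred traveling to Frankfurt?", [], echo_model))
```

The guard covers the case where retrieval returns nothing. When documents are retrieved but do not mention Alfred, the behaviour still depends on the model following the instruction, which the talk notes does not happen every time.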
So this is how you might use large language models for your own use cases, and how you might mitigate their hallucinating. So to conclude: extractive question answering models and pipelines are great at retrieving knowledge that already exists in context. Generative models, however, are really cool because they can generate human-like answers, and combining them with a retrieval-augmented step means that you can use them very specifically for your own use cases. Haystack, as I mentioned, is fully open source, it's built in Python, and contributions are literally welcome; I would say every release has a community contribution in there. Thank you very much, and this QR code is our first tutorial. Bear in mind it is an extractive one, it's the non-cool one, but it is a good way to start. Thank you very much. Thank you, Tuana. We have a few minutes for questions. If you have questions for Tuana, we have three minutes for questions, and you can also find her afterwards. |
How to build an event-driven application in Python
A practical tutorial for building an event-driven, distributed food delivery app using microservices, Kubernetes, MongoDB, and a message broker in Python. |
Hi, everyone. I would like you to give a big welcome to Yaniv, who is coming from Israel to speak about building event-driven applications. Thank you. The floor is yours. Thank you very much for this warm hospitality and for the clapping. First time in Amsterdam, first time in Belgium, so I'm not sure how many locals are here, but these are some of the most beautiful places I've ever seen. I'm really happy that I got the chance to visit both Amsterdam and Belgium. My name is Yaniv. I'm coming from Israel and from Memphis.dev. I prepared a short and sweet tutorial about how to build an event-driven application and its main building blocks. I also prepared a short code tutorial; I'm not sure how much time we will have for that part of the talk, but we will try to get there. So just a quick intro about Memphis.dev. Basically: simple as Rabbit, robust as Kafka, and perfect for busy developers. It's a next generation of message brokers and message queues, and that's what we try to build. A bit about our open source. We are an open source project under the Apache 2.0 license, with 2.2K stars on GitHub, 500 community members, contributors, and production users. And if anyone is looking for an open source project to participate in and maintain, we are always open and welcome new members. And that's it. What is an event? I'll probably go over some very basic stuff that you all know about, but I find event-driven architecture a bit frightening. I mean, the terminology itself can be a bit frightening, but most of us are probably already participating in or building some event-driven application. So just to create a baseline about what an event is: it's a short update about something. It can be an order that took place. It can be a Kubernetes resource issue. An unfamiliar login. We see a lot of use cases around cybersecurity regarding cloud events and things that are not supposed to happen in our environment or premises or VPC.
So we do a lot of event-driven applications and event-driven workflows around that area. A network endpoint that got disconnected. All of those are basically events. And an event would most probably have that structure of date, description, and status. It's important to save the status because, as we will see in the tutorial overview, we want to keep the state of the event: usually that event will go through several consumers, several processors, and we want to make sure that we are completely aware of the different stages of that event as it goes along through the pipeline. Event-driven architecture, EDA. In the past two years, I think, it has sort of turned into a buzzword. We see a lot of EDA articles and tutorials and features and companies that revolve around EDA, so let's create a baseline on that part too. Event-driven architecture is basically this: you have an event, something happened in our environment, and we trigger an action based on that. And when I say event, I don't mean something like "change the background to red" and then all of a sudden the background turns red. It's more like "order received", and then we need to prepare the order. If, for example, we're talking about Deliveroo or DoorDash or Uber, a chain of reactions happens based on that event. So it's not just creating a trigger that does something on that event. It's basically an entire pipeline, an entire flow that gets triggered because of that event. So we act on events in real time, and there is usually more than one action per event. It's not changing the background to red. It will usually be preparing the order, a notification for the customer, for the client, BI, delivery. Again, it depends on the use case, but more than one action per event. I thought this one would be funny. Anyway, I managed to get some. So, the main building blocks of EDA. First of all, defining the events, right? We need to understand which events require certain actions.
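As a minimal sketch of the two ideas above, an event carrying a date, a description, and a status, and more than one action fired per event, something like the following could work. The `kind` field, the handler names, and the `ACTIONS` table are assumptions added for illustration; the talk only names date, description, and status.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Event:
    kind: str          # hypothetical field used to look up the actions
    description: str   # what happened
    status: str        # current stage of the event in the pipeline
    date: datetime     # when it happened


def prepare_order(event):
    return f"preparing: {event.description}"


def notify_customer(event):
    return f"notifying customer about: {event.description}"


# One event kind maps to several actions, not just one.
ACTIONS = {"order_received": [prepare_order, notify_customer]}


def dispatch(event):
    """Run every action registered for this kind of event, in order."""
    return [action(event) for action in ACTIONS.get(event.kind, [])]


event = Event("order_received", "one hamburger", "new", datetime.now(timezone.utc))
results = dispatch(event)
```

In a real system the actions would run in separate services rather than in one process; the table only shows the "defining events, then defining actions" mapping.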
So we're defining the events and then we're defining the actions, right? Preparing the order, pushing it to delivery, reporting to some executive about the BI or a change in revenue or something like that. Updating databases. If it's, for example, the Kubernetes example that I gave, we need to trigger more resources. We need to create more EBSs, more EC2s, auto scaling of resources. So those would be the actions per event. We usually start with mapping the events that are important to our use case, to our organization, to our platform, and then define the actions. And then we go for coding. And the building blocks, or the infrastructure that supports that architecture, are basically microservices and a queue. We'll go in a second into why microservices and why a queue. I know there is a long debate about microservices. I thought that we got over it about two years ago, and that microservices won the big debate of monolith versus microservices. But apparently not. I mean, there are many, many debates on social media about it. I'm completely with microservices. Doing a monolith would definitely get you faster to where you want to get. But regarding scale, decoupling, and event-driven architecture, microservices is definitely your answer. And when we have microservices, we need a queue to create the asynchronous or buffering layer between the microservices, which will be responsible for the async transportation between the different services. So, decoupling services: a queue will basically enable us to decouple different services. Service A is not really aware whether there is something on the other side waiting to consume the data. It just pushes it and then acknowledges back to the client or the front end or whatever that the event started to be handled. Thread pool: if we have, for example, the need for scale. Say our restaurant or Uber app or food delivery app all of a sudden had to scale from one order per day to one million orders per day.
So we need to scale the different microservices, and a queue is perfect for that part. And real-time pipelines. Queues started as a buffering layer between services, between applications. But in the past three years, I would say, they also started to enable different things. You have probably read about them in event streaming or event processing with Kafka. But these days, you can also use queues as a temporary database for real-time pipelines, for example. What do I mean by temporary database? Basically you have your queue in the middle, or actually a message broker, because a queue most of the time implements a one-to-one pattern and a message broker is one-to-many. So, for example, we have a queue or a broker in the middle, and we have a chain of reactions that needs to take place and process the data as it flows between services. We basically push the data or the raw event into the queue, and then Microservice B will take it, process the data, and throw it to Microservice C, which will process the data and at the end will preserve it in some database or do a finalizing action to end the workflow. It's all enabled by the queue in the middle, including the entire transportation and orchestration of the data between the different microservices. So yeah, tutorial. How much time do we have left? 10 minutes, something like that? 15 minutes, okay, awesome, thank you. So quickly about the tutorial that I've prepared: nothing fancy, just to explain the real-time pipeline and how it is implemented using a queue or a broker. We have the customer on the left side; it's sort of like an Uber Eats application. The customer creates an order through the front end, and the front end preserves the data in MongoDB and pushes the events of that order into the Memphis queue "orders", and then we have another service that does the processing.
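The pipeline shape described here, a front end pushing into an orders queue, a processing service feeding a deliveries queue, and a delivery service finishing the workflow, can be imitated in one process with the standard library. This is a sketch only: `queue.Queue` stands in for the broker, threads stand in for microservices, and the `STOP` sentinel and the `delivered` list are assumptions for the example, not part of the talk's code.

```python
import queue
import threading

orders = queue.Queue()      # buffer between front end and processing
deliveries = queue.Queue()  # buffer between processing and delivery
delivered = []              # stands in for the database at the end of the pipeline
STOP = object()             # sentinel telling a worker to shut down


def processing_service():
    """Consume new orders, mark them ready, and hand them to delivery."""
    while True:
        order = orders.get()
        if order is STOP:
            deliveries.put(STOP)  # pass the shutdown signal downstream
            break
        order["status"] = "ready for delivery"
        deliveries.put(order)


def delivery_service():
    """Consume ready orders and mark them delivered."""
    while True:
        order = deliveries.get()
        if order is STOP:
            break
        order["status"] = "delivered"
        delivered.append(order)


workers = [
    threading.Thread(target=processing_service),
    threading.Thread(target=delivery_service),
]
for worker in workers:
    worker.start()

# The "front end": it only pushes events and never talks to the other services directly.
for order_id in range(3):
    orders.put({"id": order_id, "status": "new"})
orders.put(STOP)

for worker in workers:
    worker.join()
```

Because neither service calls the other, either one could be replaced or run in more copies without the producer noticing, which is the decoupling point made above.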
Again, preserving the state. If you remember, at the first slide I talked about preserving the state as the event goes along through the different services, because if, for example, all of a sudden something happens to the processing service or the delivery service, we know when or from where to process or reprocess that order. We don't want to create duplicate orders because processing or delivery all of a sudden crashed or something like that, which happens a lot in real-time pipelines. So that's the general idea: the order goes through processing, saving or updating the state in MongoDB again, and is pushed to the Memphis queue "deliveries", where we have some delivery person that takes that order and makes a delivery to the happy customer on the left. So let's see some code and how it looks in reality. Again, nothing fancy. This is what it looks like if we had a Flask API, using the Flask framework to create the API endpoints. It starts with an index which gives the rendered front end, and then we have the order food API endpoint, and that would be like the MVP to create the boilerplate. Sorry. Oh, and to increase the zoom. Okay, thanks. One second. Is it big enough? Yes. Okay, thanks. So, sorry. One more. One more? One more. Okay. Big enough. Thanks. Okay. So that would be our boilerplate, but if we want to take it to the next level, let me just remove that part, if we want to take it to the next level and add a queue into it, let's quickly go over the different parts. Let's start from the main. So we basically have the order food endpoint.
So a trigger will go to the order food endpoint, and then we basically print some output for that new order received. We'll print it, then we'll create a current time and stamp the order with it: the order date gets that current time. Status means the state of it, so it's a new order. We generate an ID, and then create a new document inside our NoSQL database, and then push the order towards processing, meaning the part where the event goes through Memphis for further processing; that would be the part that handles that. Let's see, for one second, it goes all the way up: we basically connect to Memphis, connect to the cluster itself, create a producer, and then produce that event to the station. We'll see it in a minute in a more visual way. And then afterwards comes part three. I mean, part two, actually, the processing itself. In the processing itself, basically there is a component that listens to events coming from Memphis. Memphis implements the paradigm of produce and consume, so it's not pushing the events into the processing part; rather, the processing service consumes that event and starts acting based on that. So it's the event-driven architecture that we described. It's idle until something happens or some order gets into the queue. So: new order received, a short parsing into JSON, also creating a current time for that part, preparing the dish. It's a pretty fast restaurant. Order is ready for delivery. And we're changing the state to delivery and acking the message. Acknowledging a message in every queue or broker means: I received the message, I did my processing, my handling, and it's okay to send me more messages. So that's the asynchronous part. We do some filtering and set new values, according to what MongoDB asks us to do, and we update that document, the state of it, so it goes from new order to delivery, meaning it's ready for delivery, and then we send the order to the delivery part.
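The two steps walked through above, stamping and storing a new order and then processing it and acknowledging only at the end, might look roughly like this when the database and the broker are replaced by in-memory stand-ins. The `database` dict, the `orders_queue` list, and the field names are hypothetical; this is not the speaker's repository code and it does not use the Memphis or MongoDB client libraries.

```python
import json
import uuid
from datetime import datetime, timezone

database = {}      # hypothetical stand-in for the MongoDB collection
orders_queue = []  # hypothetical stand-in for the broker's orders queue


def now():
    return datetime.now(timezone.utc).isoformat()


def order_food(items):
    """Front end: stamp the order, store it as 'new order', push it for processing."""
    order = {
        "id": str(uuid.uuid4()),
        "items": items,
        "status": "new order",
        "order_date": now(),
    }
    database[order["id"]] = dict(order)      # persist the initial state
    orders_queue.append(json.dumps(order))   # produce the event
    return order["id"]


def process_next_order():
    """Processing service: handle one order and acknowledge it afterwards."""
    raw = orders_queue[0]                    # peek: the message stays queued until acked
    order = json.loads(raw)                  # short parsing into JSON
    stored = database[order["id"]]
    stored["status"] = "delivery"            # state moves from new order to delivery
    stored["ready_date"] = now()
    orders_queue.pop(0)                      # ack only after the work is done
    return order["id"]


order_id = order_food(["hamburger"])
status_after_ordering = database[order_id]["status"]
processed_id = process_next_order()
```

Keeping the status in the stored document is what lets a restarted service see how far an order got instead of creating a duplicate.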
And that's, I would say, the main idea of doing asynchronous movement, because at the moment we are a young startup, we have only one delivery person, and all of a sudden we need to scale. We need more delivery people. We don't want to add more power to that compute; we want to scale out our resources and add more threads, add more workers. So we would scale the delivery part: instead of using just one service for delivery, all of a sudden we can just scale out to 10 or 100 different delivery persons, that is, delivery services. So we send the delivery event into the delivery queue, whose consumer looks pretty much the same as the processing one: we start with consuming, and then acking the message, on that part right at the beginning. I just wanted to show two different ways of doing so. In streaming, or in data streaming, basically we want to acknowledge the message really, really fast, because streaming is made for scale. When we get to the point where we want a queue, or a message broker, it will probably be handling data at large scale, so we want to quickly acknowledge the message to free up our buffer and receive more messages. At the previous service, the processing service, we acknowledged the message only after the preparing, which is perfectly fine. But in this part, again, if all of a sudden we have massive scale that we need to handle, in order to avoid back pressure on our broker, we absorb that back pressure by doing a fast acknowledgement on our client. It's just another way of doing things, and it depends on your use case and how sensitive your use case is to losing, for example, a message or an event. Thanks.
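The difference between the two acknowledgement styles can be made concrete with a toy broker that keeps a message until it is acknowledged. Everything here (`Broker`, `consume`, `crashing_handler`) is an illustrative assumption rather than a real broker client; the point is only what remains in the broker when the consumer fails.

```python
class Broker:
    """Toy broker: a message stays pending until it is acknowledged."""

    def __init__(self, messages):
        self.pending = list(messages)

    def receive(self):
        return self.pending[0] if self.pending else None

    def ack(self, message):
        self.pending.remove(message)


def consume(broker, handler, ack_first):
    """Handle one message, acknowledging either before or after the work."""
    message = broker.receive()
    if ack_first:
        broker.ack(message)   # frees the broker's buffer immediately
        handler(message)
    else:
        handler(message)
        broker.ack(message)   # only reached if the handler succeeded


def crashing_handler(message):
    raise RuntimeError("service went down while handling " + message)


left_in_broker = {}
for ack_first in (True, False):
    broker = Broker(["order-1"])
    try:
        consume(broker, crashing_handler, ack_first)
    except RuntimeError:
        pass  # simulate the service dying mid-processing
    left_in_broker[ack_first] = list(broker.pending)
```

With the early acknowledgement nothing is left to redeliver after the crash, while the late acknowledgement leaves the order pending; that is the price paid for relieving back pressure on the broker.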
A message or event, because if, for example, we acknowledge the message and all of a sudden, at that point, our service goes down for some reason, we will basically lose that event and we will not deliver it, and the customer will not be so happy and we will need to re-trigger the entire workflow. So it's just another way of doing stuff. When we work at massive scale like Uber, like Netflix, like Deliveroo or something like that, it definitely needs to be done that way, and here there may come another action, or the use of a cache database like Redis that will preserve that event just for that time, just for the processing time. So again, as we did in the previous services, we print some output, we update the database with the status delivered, that's that part, and we basically print everything to our own logging and auditing, and everything will also be updated inside the database, so we will be able to observe that order and the entire life cycle of ordering, processing, and delivery. So I will just run it really, really quickly. And by the way, everything is written up on Medium and you have a GitHub repo if you want to check out the code a bit and play with it; you have a Docker Compose file, you just need to run it. It's a live demo, so I hope that it will be all right right now. Okay, so we started everything, all the services are up. Let's see, we have the front end, the processing and the delivery are up, and we have a local community Mongo that we used just for the preparation, but we could use Atlas or any other Mongo that we wanted. Let's just shoot an event, or order some food, and see what happens. So I have the Memphis sandbox open here, which shows the queues within it, so if I go to orders new, I should see our new order here. Let's see, it's a bit small, it's small here as well.
So we have the status of the order, we see the order itself, and we see on the other side the processing service that takes the data, does something, and pushes it towards the delivery queue, and we should see the same event here, so ready date, order date, exactly. So we'll go to MongoDB and refresh it a bit. We should see our order here as well, so we can see the entire state as it is updated through the different services. We have the order date, and three seconds after, the dish is ready, and four seconds after, it's already sent to the customer. So that's the main event, and it happened entirely asynchronously, and it's a great hamburger, and yeah. So that's what I wanted to show. And again, as we talked about the debate of microservices versus monoliths, I wrote a lot of articles about it. I started my journey with data many years ago, four years ago, something like that, especially with data streaming. We built a data-streaming-based AI project that basically analyzed Twitter and Facebook using LDA to get the general conversation of the public, and the need for scale and agility really shows up very, very, very fast. So I really recommend that everyone, even if you don't start building with microservices and queues and asynchronous patterns, at least try to think about the refactoring you will need to do if you decide to go with a monolith and not an event-driven architecture. So that's it. Thank you very much for this opportunity. Really happy to be here. |
An introduction to MicroPython |
Please give a big welcome to Wouter, who is coming from the Netherlands for an introduction to MicroPython. Well, hello everyone, take your seats, a few more people. I cringe at every extra person who walks in, because it's an extra hundred watts of heating. It's already quite hot here. OK, we can start, I think. OK, that's me, and that's the Python subject for today. I welcome questions. If anyone wants to ask something while I'm talking, please raise your hand and we'll get a microphone. And if that takes too much time, I'll just skip subjects. Interaction is always nice. OK, I'm Wouter van Ooijen. I studied informatics, the technical version, back when it had only just come into existence. I worked in industry for quite a long time. I still have a web shop and sell things like this stuff. I worked at Hogeschool Utrecht and for a year at Avans to teach technical informatics, of course, and now I'm employed by a company that does robotics in industry. But this is about MicroPython. First, context. I guess you all know what a microprocessor is. Way back, they first made a single processor on one chip. It was a very feeble, small thing, and since then we have progressed a lot. Now we can pack a lot of power in one chip, but that's still a processor. It cannot do anything on its own. It needs external stuff like memory, I/O, a hard disk to store things permanently, and probably a lot of cooling, like we all do here. That's one way to use what we can put in a chip. The other route is: don't necessarily make the processor much more powerful, but put more things on a single chip. More RAM, flash ROM to store code, peripherals to do things with the outside world, and then you have a microcontroller. Everything is in one chip and, well, that's a nice thing not to run your Windows on, but to do things in the real world, with motors, LEDs, relays, and all kinds of stuff. That is, if you can still buy them. Maybe that will be solved in the near future when ASML produces more chip machines. Compilers and interpreters.
You know, an interpreter looks at source code, reads it line by line, sees what's there, does it, and then it looks at the next line and does it again. That's not very quick, but there are no extra steps involved, so when you change your source code, it immediately has effect. And when you interact, you interact in a way with that interpreter. A compiler takes source code and translates it to native machine code, and that is the thing that runs. When you're interacting with a program, you interact with the running native code. All clear? Yeah, I used to be a lecturer, so I'd ask, "Everyone clear?" And then, of course, no one raised a finger to say "It's not clear to me." Things cannot be that simple, so there are combos of compiler and interpreter. That's actually how Python generally works. You have source code. You have a compiler that translates that source code to an intermediate representation, and then you have an interpreter that interprets that intermediate code. Now, you have either the best of both worlds or the worst. I don't know. The compile step takes some time, but less than real compilation, and the interpreting is slower than really running native code, but it's still somewhat faster than directly interpreting the source code. Still clear? Yeah. Okay. Python typically runs interpreted, which has benefits, and there is some fun with this, but we'll come to that. What's Python's place in the larger world of things? There are programming languages, I think you can recognize the symbols, that make it easy to change your code, to run it quickly, to tinker with it, to try things. That's the one end of the spectrum. There's the other side of the spectrum. When you write high-performance code like operating systems, graphical tools, games, and things like that, those languages are much less forgiving in what you do. It takes a lot of time to get it to compile correctly, but then they filter out a lot more errors than the first type of languages do, and some are especially notorious for that.
And somewhere in between are the compromises. Those are generally used for not too high-performance, but reasonably performant things: user interaction, websites, simple graphical applications. That's the whole scheme of things. And traditionally, microcontrollers, these kinds of things, were programmed with the languages on the right, because they simply didn't have the resources, the speed of the processor, the size of the RAM, the size of the flash, to run the languages on the left. What really distinguishes the languages is whether they are runtime-typed. In Python, when you have a variable, the runtime system has no idea whether it's a string or an integer or a list or whatever. It will find out at run time. In statically typed languages, that's fixed. One is flexible for you; the other is easier on the processor. Left-side languages compile to some intermediate language, and some of them are even really source-interpreted. Right-side languages are compiled mostly to machine language. And memory management, an important one. In Python, you rarely wonder about where your memory is or what memory is used; the runtime system solves it. In, well, C and C++, you have to worry a lot about which memory you claimed and what you released: don't release it too early, or you'll get a nice crash or something else. And in between, well, you take some care of it, maybe give some hints to the runtime system, but in general, you don't bother too much with it. So left side, ease of programming; right side, quick running. This is how traditional CPython works. You have your source. You translate it to an intermediate language representation. That is what runs. And that's all done by the Python system you install. That's both compiler and interpreter. Wow, this is a lot more complex. This is MicroPython. You still have a Python system, but now it no longer runs on your fast desktop. It runs on a small microcontroller.
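The "compile to an intermediate representation, then interpret that" step can be observed directly in desktop CPython with the built-in `compile` and `exec`. This small sketch is an illustration added here, not something shown in the talk; the variable names are arbitrary.

```python
# Step 1: the compiler turns source text into a code object holding bytecode.
source = "total = 2 + 3\n"
code_object = compile(source, "<example>", "exec")

# The bytecode itself is just bytes; its exact contents vary between Python versions.
bytecode = code_object.co_code

# Step 2: the interpreter executes that bytecode in a namespace.
namespace = {}
exec(code_object, namespace)
total = namespace["total"]

# Runtime typing: the same name can hold an int and later a str.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__
```

On a microcontroller both steps happen on the chip itself, which is part of why loading plain source files costs time and RAM there.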
Both the compiler and interpreter run on the microcontroller. And when your Python application runs, you interact with that microcontroller. You still have your laptop because you do the editing there. I don't have an editor running on this thing, so that's still the desktop. I use Thonny; it's a simple Python application, but it's quite useful for interacting with MicroPython. So I interact with Thonny. I edit my code there. It's sent to, stored, or kept on the microcontroller itself. There it is compiled and runs. And then it runs here and hopefully does nice things with the outside world. Still clear? Nothing new, nice. On the microcontroller there's the standard read, evaluate, what is it, print loop, so you can do things just like you did at the prompt of the Python on your normal computer. It evaluates what you do, prints it, and it can do more interesting things like blinking LEDs or reading files that are stored on the chip. I think we should just show it, because it's too... Okay, here I have a Raspberry Pi Pico, the RP2040 chip. When I plug it in and I press the boot button, it goes into load mode. You can hear a very faint bleep from my laptop. And when I now look at, well, my explorer, I see an extra drive here; that's the drive it created, and I can copy... Here's the FOSDEM image. I copy my MicroPython image to that drive. It copies it. When it's done, it reboots. Now the extra drive is gone, and I can see... Oh, I have to put it first. Right. There's an extra communication port here. That's what the chip made. I always rename my communication ports to 42, because... Okay. Just replug it to be sure. Now let's start Thonny. Okay. Now here I have a prompt that is running on that small chip. And just like a good Python, it can print hello. It doesn't seem like much, but that's happening here. It's not my laptop that's doing it. And if you're a bit lazy and you don't want to fiddle with files, you can also boot again. Here in Thonny, you can select the interpreter under options.
I want to install MicroPython, and for the Raspberry Pi Pico, it just knows where to find it. Internet is working. It just grabbed the latest version and installed it. Close. And I have it running. Yep, it runs. Nice. Okay. Now I must do something. Let's take some code. I can put code simply in the editor and say run this. No. Are you able to make it larger, please? I think that's the wrong direction. Large enough for you? And, well, it didn't show that the LED was not on, but now it's on, so it worked. No? Well, maybe I can convince you if I put a zero here and then run it again. Ah, it's off. You can all see that? Well, just on is not that interesting. Let's blink. Blink is simple. Define the pin. I import the libraries that are standard for this chip, which allow me to access a pin. Pin 25 happens to be the onboard LED. I say it's an output pin. Then I make it high, sleep a little time, make it low, sleep a little time, and repeat it forever. Okay. That should blink. Ah, it blinks. Oh, nice. Don't applaud for every demo, otherwise your hands will get sore. Okay, but now when I stop it, it stops, and this thing doesn't do anything. It should stop; it doesn't. Well, there's always the reboot for that. It's supposed to be interruptible by that stop button, but since a week ago it doesn't do anything. I don't know why. If I want the chip to do something on its own, I have to store this file on the chip. So I save blink.py on the chip. Now it's here, and I can run it from there. Yeah, it still blinks. That's not very interesting. Now, if I stop it and put it on a power bank, it doesn't do anything. That's a pity. When it starts, it looks for a main.py and executes that file. So if I want it to start on its own, I have to make a file: import blink, save that as main.py. Ah, it's a mess. Well, this is irritating: there is a small file system here. I can make a new file, I can delete, but I cannot rename a file. So let's do the poor man's way of doing that.
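The blink program described above (define pin 25 as an output, make it high, sleep, make it low, sleep, repeat) could be written roughly as below. This is a sketch, not the speaker's file: the `blink` helper and its parameters are made up, and the fallback `Pin` class is a hypothetical stand-in added only so the same code also runs on a desktop Python where the `machine` module does not exist.

```python
import time

try:
    from machine import Pin          # available on MicroPython boards
except ImportError:
    class Pin:                       # hypothetical desktop stand-in, records what was written
        OUT = 1

        def __init__(self, number, mode=None):
            self.number = number
            self.history = []

        def value(self, level):
            self.history.append(level)


def blink(pin, times, delay_s=0.5):
    """Switch the pin high and low `times` times, pausing after each change."""
    for _ in range(times):
        pin.value(1)
        time.sleep(delay_s)
        pin.value(0)
        time.sleep(delay_s)


# Pin 25 is the onboard LED on the board used in the talk.
led = Pin(25, Pin.OUT)
```

On the board, an endless `while True: blink(led, 1)` gives the repeat-forever behaviour, and a `main.py` that contains only `import blink` would start it at power-up if that loop lives in `blink.py`.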
Save as main, not neon. Now I can delete that one. Yeah, nice. Main, yeah? I deleted the correct one. And that's the important thing. So now, if I put it on a power bank, it starts blinking on its own. But now, at least that's what happened yesterday, if I want to talk to it again, it doesn't want to. I cannot get into it anymore. It's bricked. But okay, I can still reload. But if I reload the MicroPython interpreter again, it keeps the file system, so it keeps the main.py, so that doesn't work. So some clever guy made a flash nuke. I'm going to put that there. It reboots with that code. Now there's nothing on it. And if I now reboot it, I can put the normal code on it again. Where is it? FOSDEM, the Pico images. Okay, put that one over there. It's just a USB transfer, it's quite quick. And now I have a normally behaving one. Yeah. Okay, that behaves again. Nice. Rename the serial port. You've seen Blinky. You've seen the startup. Demo of the startup, okay. I was going to demonstrate blinking an LED on the breadboard, but it's too feeble to bring here, so let's blink more LEDs. More is better. Okay. Now you know why I put all the things on the table, so I have the correct order of driving everything. What's on here? I did it. I want that one. I can blink a single LED. Oh, it should. Now I use a library I wrote, and it's loading all those library files. It takes some time. After that, yeah, it really blinks. I'll come back to that later. And this one, if I'm a bit insistent, it will probably stop. Yeah, there it is again. We can also have more LEDs. I've made a system where, whatever target chip I use, it has eight pins that are connected here, so we can swap it for another board. That's a whole different processor, and it still has the same eight pins. That's the edge library, and edge.port is those eight pins gathered together in one port, and then I give it to the kit demo, and then it should first load everything. And then it kits. Okay, kits on eight LEDs.
If that's not enough, you take an I2C extender chip, and the chip... Does it work? Yes, it works. And it has more output pins. Load, load, load, load. And then we have a kit with 16 ports. And if that's too slow, I think it's boringly slow, maybe 10 milliseconds is nice. Load, load, load, load, load. Yeah, that's a bit quicker. Okay, back to the sheets. More LEDs; we have seen more LEDs. Okay, you saw that it spent some time at the beginning loading a few libraries. It showed that because I coded it that way, because I wanted to know how long it takes to load the libraries and how much RAM it uses. In general, you have three ways of putting library code on MicroPython. You can have the source files, the simple .py files, and then when you load them, they're first compiled and then loaded. You can also pre-compile those files on your laptop and then load the compiled version; those are the .mpy files. You can also take the files and build them into the image you load anyway. You saw me copying that image onto the chip. You can put your files in there. And that affects how much time it takes. When I demoed it, it was irritatingly long. My files took about 12 seconds to get loaded from the flash, compiled, and ready to run. When I pre-compile, it's a little bit faster. And when I freeze it, that is, I build an image of MicroPython with those files included, it gets really fast. And then it also uses less RAM. RAM, memory on those chips, is always a very scarce resource. This chip has about, I think, 130K left after loading this library. It looks like a lot, but you can eat it quickly, especially if you make screen buffers for LCDs and things like that. A 240 by 320 LCD with full color, that's 2 bytes per pixel. I think that's more than is left there. So you still couldn't use it. So it's very important to conserve RAM. When I develop things, I simply take a very fast target. A Teensy is maybe 10 times faster than a Pi Pico, and it has, I think, one megabyte of memory. So that's ample.
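The screen buffer estimate is easy to check with a few lines of arithmetic. The figures are the ones quoted in the talk (a 240 by 320 display, 2 bytes per pixel, roughly 130K of RAM left); treating "130K" as 130 × 1024 bytes is an assumption made here.

```python
width, height = 240, 320
bytes_per_pixel = 2                  # 16-bit color

frame_buffer_bytes = width * height * bytes_per_pixel   # 153600 bytes
free_ram_bytes = 130 * 1024                              # "about 130K", assumed to mean KiB

fits_in_ram = frame_buffer_bytes <= free_ram_bytes
shortfall_bytes = frame_buffer_bytes - free_ram_bytes    # how much RAM is missing
```

A single full-screen buffer needs 153,600 bytes, about 20,000 bytes more than the free RAM estimated above, which matches the remark that such a buffer would not fit.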
But there's always the but, and it comes later. Okay, embedding your code into a MicroPython image is not that complex. I had it running on my Linux machine in maybe an hour. Then I tried the same recipe on Windows and it failed. So when that happens, what do you do? You make a Docker container. So I made a Docker container on the Linux machine that does exactly the same thing; it worked. Then I ran it on Windows; it works too. It's fine. And I can show it if I take this chip here. There it is. If I import my library, it's... Okay, you can do it. There's a license statement, a lot of text. Okay, here I have that license statement in the source. Let's add "this is FOSDEM". FOSDEM? Whatever. Okay, start the Docker container. The internet works, otherwise it doesn't do it. It takes some 40 to 400 seconds. I cannot predict how long. Of course, it's Docker, so all the preparations, loading the Linux image, loading all the development tools, that's cached. But after that, it takes my library code and then it compiles the MicroPython system with that library code added, and that takes some time. Okay, it's running now. Well, that didn't take that long. So what do I do now? I reboot my chip again and now take the new code. This is the new code. Copy that to the boot drive. Okay, let's see what we have here. Now you no longer see the library files there, because they're not present on the internal file system. They're built into the MicroPython interpreter. But I can still... Oops, it's a bit faster now than it used to be. Now the license text is expanded, so I really have put the changed version of the library in there. Okay, let's go on. I showed freezing code. I still want more LEDs. Not enough. Where are more LEDs? I think this is enough. Okay, that needs more power. I cannot power it off the USB port of my laptop, so I need external power for that. All right, now I take... Oh, it's a microphone. Okay, let's see what's on this chip. Let's try that code. Now I loaded the code from the chip.
I prepared it on the chip there. I don't need to load it from my PC. And it runs over there. And it still loads the library. So, enough LEDs for you. Okay, put it over there. With so many LEDs, you can do nicer things than just put them in a long line. Put them in a square and make displays with them. So, take that one off. Does it stop? Yeah, it stops. And let's run that one. And we are at FOSDEM, so it prints an F on it. The code takes a long line of 64 LEDs, and it folds it after every 8 LEDs to make a square. That's what we call a canvas, and I can write text on that. You can also print all the letters after each other. Sorry? You can also show all the letters after each other. Timed, you mean? Yeah. I'm not going to do them all, but... FOSDEM, let's say. Wait a little bit. What did I do wrong? Ah, sleep. Won't work. And then... Because we can't get enough of it. While true. Yeah, I need a flush, and I need to wait after that. And I need this flush over here. Oh, this should work. I didn't prepare this one, so let's see. Something like that. What did I... Ah! Yeah, yeah, yeah, yeah. Okay, demo. What am I missing? Yeah. A flush and sleep. I have a sleep here. Clear, write, flush, sleep. Oh, you guys are wonderful. It gets even more hot now, so... Oh. But I guess two letters is not enough, is it? Where is more stuff here? Ah, here are my flexies. They also sell these things on flexible PCBs. It's nice. And I put this one on again. Like this one. And now it's garbage. Does anyone have any idea why it's garbage? It's another scheme. Yeah. The first one, they go tuk, tuk, tuk, tuk, and then tuk, tuk, tuk. And this one goes tuk, tuk, tuk, tuk, tuk. Instead of going the same direction, it's zigzag. Okay. So we must fold it in... in the zigzag version. Yeah. If you know a better term for it, feel free. Oh, I hit a bug. Oh, fortunately I know where to correct this.
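The folding of a single LED strip into a square, with or without zigzag wiring, comes down to one index calculation. The sketch below is not the speaker's canvas code; the function name and the `zigzag` flag are made up for illustration, and it assumes rows follow each other along the strip.

```python
def xy_to_index(x, y, width, zigzag=False):
    """Map canvas coordinates to a position on the LED strip.

    Rows are laid out one after another along the strip. With zigzag=True,
    every odd row runs right-to-left, as on panels wired in a serpentine.
    """
    if zigzag and y % 2 == 1:
        # The last column is width - 1, not width: forgetting the "- 1"
        # is the classic off-by-one error in this kind of mapping.
        x = width - 1 - x
    return y * width + x


# An 8 x 8 panel built from a strip of 64 LEDs.
print(xy_to_index(0, 1, 8))               # start of the second row, straight wiring
print(xy_to_index(0, 1, 8, zigzag=True))  # same pixel on a zigzag panel
```

Using the straight mapping on a zigzag panel mirrors every second row, which is one way to end up with the garbage pattern seen in the demo.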
Go to the source. It's in canvas. And I made an error there in the zigzagging. Rotate it. Yeah, somewhere like here. It's a fold one here. It's, of course, an off-by-one error. I have to subtract one more here because size is one beyond. So... Ah, plus one, yeah. Okay, save it. And run it again. And just a hint, this won't work. Why doesn't this work? I made a stupid mistake. And it's not in the code. No, that was a different chip. It's hot anyway here, but... What did I just edit? I edited the source code on my laptop. That's the upper screen here. This one is the source code on my laptop. This one is the code on the chip. I have made that mistake a gazillion times. So, do the same thing, but now on the code here. That one... Now, you see it's a bit slower because it must first retrieve that code from the chip to the laptop. Then I can edit it. It's the same code, but still I have to edit it. Ah, that's quick. Okay, thanks. I'm not... 25, okay, near. Ah... I'm doing something stupid here. Control Z is your friend. And now I've changed it on the... Okay, there. And I'm still at saving. Now I run that one again. It should compile... That's not the baked-in version. It's the from-source version. And now it's... Now it works. Well... Okay. I prepared all kinds of versions. They do all kinds of things with the ordering of the LEDs on the screens. Ah... Not again. Okay. Shouldn't it be just X inverted? Sorry? Shouldn't it be just X inverted, not both X and Y? Okay, so what I do is the X, Y inverted. And that should be in here somewhere. Where is that X, Y inverted? You asked me to enlarge my letters, so I'm scrolling slowly. Swapped? Ah, it's called swapped. What's this one? Okay. But do reflect on what I was about to do. That is, to edit it in the code stored on the chip. That's another problem. But then I decouple the chip and start working on another chip.
And I just lost my code. Or I don't know where it is on one of the 20 chips I have. And now it's FO again. So that works. So I frequently don't know where I have put my last version of the code. And Thonny doesn't support GitHub via the microcontroller. That, in fact, would be very nice. Okay, never enough. Let's take more. Now it doesn't know it has more LEDs, so let's tell it. Where is that code? I'll just stop it first. That one, run that one. Sorry? Oh, that takes a lot of time. You might do that for consistency. Downloading the lib takes one minute. So not for... That's true. But then you lose a bit of the essence of using Python. The essence is you can quickly make a modification. If I can afford one or two minutes, I might as well use a really compiled language and spend that much time. So, yeah, you could, but there's no need to do it. Okay, so I could add one more display, but that's more of the same. Let's not do that. Go back to these things. Yeah, you saw again that loading such a library from source is slow. You can also use a faster chip. If I take my fast chip, this one, and take that connection from it, and use that one. Thonny, stop again and recognize a new chip, please. Yes, it does. And this one even has a separate flash card, so I can store more files on it if I want. And I can blink on it. This loads the library quite a bit faster. It doesn't have an onboard LED, so I blink on an LCD screen, which works too. But sadly, I cannot use this target with the NeoPixels because it doesn't have the NeoPixel library built into the MicroPython version for this chip. Sometimes you're just screwed. But I can... Oh, it doesn't want that. Sometimes when the communication goes wrong, Thonny is very confused about what's happening. You just have to restart the chip and restart Thonny or whatever you want. You can do this one. Well, I'll leave that. I'll go to the sheets again. Okay.
I've compared a few target chips that you could use for this kind of work. The Teensy is what I just showed, a very fast one, but then it costs 36. It's quite expensive. If you want to have lots of fun with things that are expendable, the ESP32 and the Pi Pico are very good competitors. Less than 10 euros and really available everywhere. And for ESP32s, there are a lot of funny versions. There's a watch with an ESP32 in it. I run MicroPython on it. On a previous occasion, I did a demo with it, but it uses Wi-Fi, and it has the habit of working perfectly before you all walk in, over tens of meters, and then when you're all present with all your phones, the reach is reduced to five centimeters. That breaks half of our conference demos. These are other things. A doorbell camera has a built-in ESP32. It can run MicroPython, and then you can transmit the images over Wi-Fi, at least if you're not all present. And then... Do you all know these ones? The micro:bit... Well, they can run MicroPython, sort of. They cannot store files on the flash system, and they can run about 50 lines or so of Python. So it's nice for a kindergarten Python course, but not for real work. And this one, I sure hope I can get it to work sometime. It's a Kendryte. It's down here. It's quite powerful, and it also has neural network hardware on it. But it uses some Chinese version of MicroPython, and Thonny doesn't recognize it, so maybe it works, but I don't know how. Ten minutes left. Thank you. If you have questions, those are more important than the rest of my sheets, so interrupt me. A few of those more interesting chips we saw are, in fact, not a pure microcontroller or a pure microprocessor. They have a microcontroller core, but an external ROM added, and in some cases even external RAM. That is overall a cheaper configuration, because the optimal, let's say, process parameters for making microcontrollers, flash and RAM are quite different.
So you can make a RAM chip more cheaply and more effectively when you have only RAM cells on it. So, for instance, the ESP32 doesn't have that much RAM on board, but it can use an externally coupled RAM chip as a buffer; it caches things, and then you suddenly have a few megabytes of RAM, but it's a bit slower. Not that much, but a bit slower. Excuse me, can you make the slides full screen, please? I will, if I get my cursor back. Where's my cursor? Yeah. Oh, they'll publish these slides, so don't worry. And then you have the hybrid version in between. Okay. When you work with these things, RAM is very scarce. So the more RAM you have, the better, especially if you work with screen buffers or images or things from a camera or so. The thing is, when files are loaded by MicroPython, it first loads the source, then from that builds an intermediate representation. That intermediate representation is smaller than the source, but the peak memory use is still the full source. So I cannot load my library as one file on, let's say, an ESP32, because it doesn't have enough RAM to load that peak amount. It can load it in compiled form, but not in source form. So what I do is split it into a lot of small files. I have a main file which imports all sub files, and for, let's say, basic classes you must do that, but for utility things like a driver for an LCD, I use a trampoline function. It pretends to be the class, but it's a function, and when it's called, it loads the real submodule, so that is loaded only when it's needed, and then it calls the real class with all the arguments. Such a trampoline takes maybe 50 bytes, while the full file could take a few kilobytes, so it saves a lot of RAM and loading time. Can you get a version of MicroPython which only takes pre-compiled code, to save space on smaller machines? Yeah, you sure can.
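The trampoline trick described above could look roughly like this. It is a sketch, not the speaker's library code: a standard-library class stands in for the LCD driver so the example runs on a desktop Python, and on a board the import would name the real driver submodule instead.

```python
def Fraction(*args, **kwargs):
    """Trampoline: looks like the class, but is only a small function.

    The real module is imported on the first call, so it costs no RAM
    or load time until somebody actually needs it.
    """
    # Stand-in for something like a (hypothetical) LCD driver submodule.
    from fractions import Fraction as RealFraction
    return RealFraction(*args, **kwargs)   # forward all the arguments


half = Fraction(1, 2)        # triggers the import, returns a real instance
print(half + Fraction(1, 2))
```

One consequence of this design is that the trampoline is not a type, so `isinstance(x, Fraction)` and subclassing do not work against it; that is presumably why the speaker keeps plain imports for basic classes.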
So when you really develop something, I think you should take a larger machine, and when it's in production, take a smaller machine and use a pre-compiled version, yes, of course. Oh, sorry, he asked whether using a pre-compiled version saves RAM; yes, it does. Well, whether you could have a smaller MicroPython that only takes pre-compiled code, to save space? Yeah, but you still want to tweak your own main code if you're developing. If you have a finished product, then all things are different, and I think the main power of MicroPython is that you can tinker with it at a late stage. If you have a finished product, well, indeed you can freeze it, but if you want that, how do you develop it? Presumably on a different processor with more RAM. Let's tinker with it. Limitations. Should we pre-compile large libraries and put them on the chip, and then we can tinker with our own code, but have to pre-compile the libraries? Is that something to do? Yeah, indeed, but there are a lot of pre... Oh, is it a good idea to have pre-compiled libraries incorporated in the MicroPython interpreter and then tinker with your own code? Yeah, I think that's a good approach if these libraries are stable, because if you must change something in those libraries, you don't have the source code, at least not on your target chip, and you're stuck in a much slower development cycle. So, well, if things are indeed stable, freeze them into the MicroPython interpreter. If not, take a large chip and develop on that. Five minutes left. Well, limitations. Time-wise, if you're in the domain of, let's say, single clock cycles, if it must be that fast, then even C or C++ won't do it. You need hardware or the Raspberry Pi Pico PIOs. If you're in the tens or hundreds of cycles, you can use a compiled language. If you're in the thousands of cycles, you can use an interpreted language like Python. To be concrete: driving those NeoPixels requires microsecond timing. You cannot do that in Python code.
In the MicroPython interpreter, they do it in C code, and that's built-in. Driving a servo motor, a hobby servo, back and forth, that's millisecond timing. You can easily do that in Python, in the Python source code. Oh, there's a list of deviations from CPython. You can read it. They didn't bother me much, except that it doesn't support type hints, and I like type hints for my testing, so I must do the testing on my laptop. You can test a lot of basic things not on the small machine but on your desktop. Question. When it's compiled, is it the same bytecode as CPython? No. When it's compiled, the MicroPython intermediate code is different from the CPython code. It has its own pre-compiler, which you can download, and you can pre-compile with it, but the result is different from standard CPython. Question. Does MicroPython have a garbage collector? Does MicroPython have a garbage collector? You can bet it has, yes. And when you want to do really time-critical things, call the garbage collector first and do them after that. Then you can be reasonably sure that it won't be triggered in the middle. But if you want to do really time-critical, life-critical things, don't use Python. Sorry. Okay, I'll skip this one. What is it good for? Well, maybe not for volume production. For that you use a compiled language, but for anything that requires tinkering, exploration, lab setups, experiments, education, it's a really nice tool. Okay, what are the problems? I'm always confused about where the latest version of my source code is. I changed something on that chip. Maybe it's on my PC or on GitHub, I don't know. Remember, a USB port cannot supply a large current, so for things like these pixels you want a separate power supply. The terminology of Thonny with upload and download is reversed from what I think is normal, but you have to use it. No problem. If you want to have your code safe from prying eyes, I think, forget it. The intermediate code can be decompiled, so anyone can see into your code.
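The advice about the garbage collector can be sketched as follows: collect explicitly, then run the time-sensitive part. This reduces the chance of a collection pause in the middle but does not guarantee anything. The helper name is made up; `time.perf_counter` is used so the sketch runs on a desktop Python, whereas on MicroPython `time.ticks_us()` and `time.ticks_diff()` would be the usual timing calls.

```python
import gc
import time


def run_time_critical(action):
    """Collect garbage first, then run the action and time it."""
    gc.collect()                       # pay the collection cost up front
    start = time.perf_counter()
    result = action()
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = run_time_critical(lambda: sum(range(1000)))
print(result, elapsed)
```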
If you have a large vested interest in your code, maybe the Python route is not the way. Then you probably want to make a large amount of products, so it's not the way anyway. Okay, I did cover some things. Takeaway. Well, start tinkering. It's always a good idea. Get the same hardware as you would use for an Arduino. Get a MicroPython-capable board, load MicroPython on it. Five minutes of work, and start doing things. Okay, well, if I did that demo, then those pixels would show "questions", but I think our time is up anyway. One minute. One minute. One last question. Yeah. In the beginning, you showed the code that you called, like, machine or pin. Yeah. Where does it come from? That's... Sorry. He asked, where does this code come from? The Pin and the machine. MicroPython is common parts and then a specific part for specific chips. And accessing a GPIO pin of a chip differs a lot from chip to chip. But that's a very shallow part. Once you have that, you can build a lot of things on top of it. So with the MicroPython distribution that I used here to compile, I must specify for which target chip I compile, and then it grabs the correct code. Okay. Thank you. Thank you. Thanks a lot, Walter. Will you be available outside for questions? I'll be around here. So Walter will be around for questions; if you have questions, don't hesitate. Python may have lost the smartphone wars, but for the smartwatches, we still have hope. |
AMENDMENT Code reloading techniques in Python
Cold and hot code reloading, the different options, how they work and when to use them. |
Hello FOSDEM 2021 and welcome to this talk on code reloading techniques in Python. In this talk we'll be reviewing a few techniques that allow you to reload code and apply the new changes you made to your program. We'll look at a few different techniques to do so and we'll have an in-depth look at the inner workings of how they work precisely. A few words about myself: my name is Hugo Herter, I'm a software engineer consultant. I discovered Python and Linux in 2003 and have been using them a lot since then. My first FOSDEM was in 2004 and I think I attended almost every edition since. As I'm passionate about free software, I'm really happy this year to be able to attend all the dev rooms without missing any. When I was starting to learn Python, I was quite amazed by how easy it is to play with all the internals of the language and the constructs. One thing that I found pretty interesting is this exec function that allows you to execute any Python code found in a string that might come from anywhere. As you can see in the example on the right, that just executes Python code from the network and gives you a remote Python shell on any machine that runs this script. It's basically the same idea you have in Jupyter notebook, which has more security and more advanced features. When I was learning Python, I also started to write my own web framework. This was back in the times when Flask and Django didn't even exist. We only had Zope and a few frameworks that do not exist anymore. In that web framework I used a function called execfile, which is the same as exec but for a file: it would execute all the Python code in a file, and I used this to make my changes appear immediately when I was changing some files in some web pages. Which brought me to this idea of code reloading. It's something that I have been playing with for a long time and I wanted to share this with you because there are many interesting techniques here and you might not know about all of them.
What I call code reloading here is the process of replacing part of a program with a new version, part or all of it. I'm focused here on the source code because it's the term that's used mostly for interpreted languages. When you are using compiled languages, there are other terms that might mean similar things but they are slightly different. I talk about cold code reloading, and what I mean by cold is that you take the process and you stop it and then you restart it. And hot code reloading means that you keep the process running and you patch it with new code without stopping it. So as an illustration on the left, we have some kind of cold code reloading on a racetrack. You stop the car, you have access to all the internals, you can change everything you want, but the car is out of the circuit, it's not running anymore. The driver is out of the car as well. On the right side you have hot code reloading, where the driver is still in the car. You may not want to change everything, you don't have access to the chassis, changing the engine might be a complicated task here, but you have access to quite a few pieces of the car already, and if you just want to change the color or the type of the wheels, it's pretty easy to do. You don't need to stop the car and it goes much faster. It's going to be the same idea we have in programming with cold code reloading and hot code reloading. So with cold code reloading, you stop the process and then you restart it again. It's easy, it's reliable, you've all done it if you did some Python code or any kind of programming, it's the default way of doing it. The issue you have with it is that you lose the state, and getting that state back might take time.
If you are programming a video game for example and you have reached a special place, and it took you some time to get there with certain enemies, and you want to tweak the behavior of the enemies, in that case restarting the entire game every time might be pretty annoying and you would be interested in something that keeps the state of the whole program. The easiest way on Linux to do cold code reloading is Control-C, up arrow, Enter, to just run the same command again, and it's a super easy way to do it because we all know this shortcut by reflex, as we use it all the time in programming. Some web frameworks do code reloading, and they use this cold approach of restarting everything, but they do it in an automated way. Let's have a look at how they do it precisely. The entry point is this function here, run_with_reloader, and you pass it a function. It will run this main function and enable the reloader on the side, and stop that function if the code has changed, to restart it. The first thing we see is that it's calling here signal.signal(signal.SIGTERM, lambda *args: sys.exit(0)), which means that if the process receives the SIGTERM signal to terminate from the system, then it will exit, to make sure that it doesn't hang if it receives this signal. This is a way to behave properly even in multi-threaded environments. Sometimes when you have multiple threads the signals are not received by all threads and you have to press Control-C a few times to stop it, for example when using some frameworks. Then we see that it's initializing a reloader here using this function get_reloader, which is defined a bit higher. There are two reloader classes in Django. One is the Watchman reloader, which will watch for files on the file system, and the other one is the stat reloader, which will just check every second whether the properties of the files have changed and in that case say, well, the properties have changed, so we should trigger a reload.
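The idea behind a stat-based reloader, as described above, can be sketched with `os.stat`: remember each file's modification time and compare on every tick. This is an illustration of the principle only, not Django's implementation; the function names are made up.

```python
import os
import tempfile


def snapshot(paths):
    """Return {path: mtime in ns} for the files that currently exist."""
    mtimes = {}
    for path in paths:
        try:
            mtimes[path] = os.stat(path).st_mtime_ns
        except OSError:
            pass  # a vanished file makes the snapshots differ, too
    return mtimes


def has_changed(previous, paths):
    """Compare a fresh snapshot with the previous one."""
    current = snapshot(paths)
    return current != previous, current


# Demo on a temporary file: bump its modification time and detect it.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "app.py")
    with open(path, "w") as handle:
        handle.write("print('v1')\n")
    seen = snapshot([path])
    unchanged, seen = has_changed(seen, [path])
    later = seen[path] + 10**9                      # one second later
    os.utime(path, ns=(later, later))
    changed, seen = has_changed(seen, [path])

print(unchanged, changed)
```

A reloader built on this would call `has_changed` in a loop with a one-second sleep and restart the program when it returns true.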
The Watchman reloader is faster and more powerful, but the stat reloader works as a fallback to this. And then it will pass this reloader as well as the main function here to this start_django function, which is right here. So that function basically starts our main function here in a thread: it creates a thread to run this main function, it sets it as a daemon, and it starts the thread, so now our main function is running, but not in the main thread, rather in a side thread which is controlled by this function. Then it starts the reloader and it passes the thread to that reloader class, which will be in charge of stopping it and restarting it if something has changed, and it will run this in a loop as long as the reloader should not stop. So this is the Django approach: start the main function in a thread and then look for changes on the file system; when they happen, have this reloader class just stop the thread and restart it. When looking at how Flask handles this reloading, it's a bit more complicated, because it's not within Flask, it's within Werkzeug, which is a web framework library used by Flask. But we can find something similar: we have this run_with_reloader function that takes a main function as an argument, it registers for the same signal as Django, and then it starts a thread here with the main function and launches it. If we look at the reloader, it comes from this reloader_loops, and when looking at it we can see that it's also using something similar, a stat reloader and a watchdog reloader, and these are very similar to those used in Django, so we can assume that the behavior is identical even if the codebase is different here. So both Django and Flask use these Watchman or watchdog reloaders under the hood, but how do these work?
Well, there is something called inotify on Linux, and there are similar APIs on other platforms, that allow your processes to watch for file system events and receive a notification without having to constantly look if something has changed. On Linux it's inotify, which you can use directly from the library pyinotify if you're using Python, or there is this library called watchdog that you can use on all main platforms. The interface to use watchdog looks like this: you create an observer that is in charge of receiving these signals and will run in a thread in the background, and you can then schedule some handlers on it and say, well, I want to register, for example recursively or not if you're looking at a folder, and then just start it, and it will work using a callback-based approach. Because it runs in the background, I added this input at the end just to make the program block and to be able to see something before Python exits. Let's now look at hot code reloading. In this case we want to keep the process running and we want to replace the code in memory. We hope it won't crash the program; this might happen if we have inconsistencies and the new code is not compatible with the existing one or the existing state. And we want to take advantage of the fact that it keeps the state and it's really fast to do this. There are two challenges in this case. One is that we need to find and load the new code, because if you're just reloading everything you might as well restart the entire program. The other is that we need to replace the references: in Python you can pass a lot of objects as variables and you can have references to these objects in many places, and we need to find all these places to be able to make them use the new code instead of the old one. There are other languages that also allow this kind of hot code reloading.
In Java for example you have this functionality called HotSwap, which basically allows you, via the debugger, to specify a class and ask the virtual machine to replace the class with the newly compiled code from a class file. In C and C++ you have DLL code reloading that allows you to reload a dynamically linked library or shared library. In this case as well they need to share the same interface: they need to expose the same functions and classes and methods that the previous version did. There are some changes that you're not allowed to make because they would break the compatibility. In Python there are three ways of loading code. One is eval, which allows you to evaluate an expression, and that one is not very useful in our use case. However the other two methods do work and they both have their advantages and disadvantages. The first one is the import mechanism that you are using when you are importing a library; in a way similar to the DLL libraries, you can reload a module that has already been loaded and have the new version replace the old one. And exec allows you to execute just any Python code from a string, which can also be used in some cases. What you see here is on the left a text editor with some Python code and on the right a Python console. So the standard way to load Python code is using import. I will import my module and then I can call module.say_hello and it will just run it. If I change the source code, say_hello will still be at the old version, as expected. There is however a library we can use within Python to reload this module: from importlib we can call reload on the module. And now if we call our function again we can see that we have the new version. However, if we have a reference to this function then it's a bit trickier, because that reference will not be updated. Say we have say = module.say_hello; when we call say we will still be calling this function.
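The reload behaviour and the stale reference described here can be reproduced in a self-contained script. The module name `demo_module` and the helper `write_version` are made up for this sketch, and the module is written to a temporary folder so the example does not depend on any existing files.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # avoid a stale .pyc masking the quick edit

folder = tempfile.mkdtemp()
source_path = os.path.join(folder, "demo_module.py")   # hypothetical module


def write_version(text):
    """(Re)write the module source with a function returning `text`."""
    with open(source_path, "w") as handle:
        handle.write("def say_hello():\n    return %r\n" % text)


write_version("hello v1")
sys.path.insert(0, folder)
importlib.invalidate_caches()
import demo_module

say = demo_module.say_hello          # a reference taken before the reload

write_version("hello v2, edited")    # "edit" the source file
importlib.reload(demo_module)

print(demo_module.say_hello())       # new code, looked up through the module
print(say())                         # old function object, still in memory
```

The second print shows the point made in the talk: the old version no longer exists on disk, but anything holding a direct reference keeps calling it.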
Now let's update the source code and reload the code: module.say_hello has the new version, however say is still using the old version. So now we have one version that doesn't exist on the file system anymore, the previous version, but it's still in memory, it's still loaded, and this is something we have to be careful about when we're doing hot code reloading. In practice however, sometimes we are facing code that's not optimal for the reloading that we were just seeing. For example here we have a connection to a database that's initialized as a singleton, and this is a pattern we see in a lot of code, where that connection is created just within the module and so it's executed every time we import it. So the first time we import our module it will take some time to initialize it. We can now call our function say_hello. Let's say we want to update the code of our function: we need to reload it, so we can use importlib.reload on module2, and again we have to pay the cost of the time to establish this connection, and now we have the new value working. So in this case this use of importlib.reload can be quite painful. Here we are just facing a delay, but sometimes there are also thresholds, limits, there are ports that still appear as busy, there are other issues that we can face that make this approach more difficult. If you can use it, it's really nice, because it's really simple, it's all built into Python and it's just one line, so go for it, but it's not always optimal in some kinds of complex software. There is however another approach we can use to reload this function say_hello without going through the reloading of the database, and the idea here is to get the new value of the source code and then to execute that using the function exec.
So for the first step we need to get access to the source code of a function, and there is a tool for that in Python in the library inspect, so I will just import inspect and then I can use inspect.getsource on module2.say_hello. I have the source code of the function, and the really cool thing here is that if I add some code, for example I add a pass here and then I do a return here, and let's do some more changes, and call it again, we can see here that this function getsource gets the new value of the code and not the old one, so it's quite smart there, which is really handy in our use case. So we now have the source code in a string and we could just try to execute it. However, if we want to replace it within the module so that other functions that depend on it would be able to access it, we need to also give it access to things inside this module, for example db. If we just execute this source here, it will not have access to the db variable from that module. So what we need to do is use inspect to get access to the module of that function. So let's say module = inspect.getmodule of module2.say_hello. In this case we know which module it is, it's module2, so it will just get the same thing, but sometimes we just have the function around and we don't know directly which module it is from, so we can use this. And then we want to be able to extract the new value of the function, so we'll create a local dictionary, I'll call it locals_, a dictionary, sorry, not a directory, and then I can just execute the source code, so just copy the code here, with the module's __dict__, which is a dictionary representing the namespace of what's within that module, and my local dict here that I created just above.
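The steps being walked through here can be put together in one runnable sketch. The module `demo_db_module`, its `db` object and the helper `write_version` are stand-ins invented for the example; the `db` dictionary plays the role of the slow database connection that should not be re-created.

```python
import importlib
import inspect
import os
import sys
import tempfile

sys.dont_write_bytecode = True

folder = tempfile.mkdtemp()
source_path = os.path.join(folder, "demo_db_module.py")   # hypothetical module


def write_version(tag):
    """(Re)write the module; only the function body differs per version."""
    with open(source_path, "w") as handle:
        handle.write(
            "db = {'greeting': 'hello'}   # stands in for a slow connection\n"
            "def say_hello():\n"
            "    return db['greeting'] + ' " + tag + "'\n"
        )


write_version("v1")
sys.path.insert(0, folder)
importlib.invalidate_caches()
import demo_db_module

original_db = demo_db_module.db
write_version("v2 edited")            # "edit" the file without reloading

# 1. Read the function's current source from disk.
source = inspect.getsource(demo_db_module.say_hello)
# 2. Find the module the function lives in.
module = inspect.getmodule(demo_db_module.say_hello)
# 3. Execute the source with the module namespace as globals, so the new
#    function can see `db`, and collect the result in a separate dict.
locals_ = {}
exec(source, module.__dict__, locals_)
# 4. Swap the new function into the module.
module.say_hello = locals_["say_hello"]

print(module.say_hello())
print(module.db is original_db)       # the "connection" was not re-created
```

Note that `inspect.getsource` locates the function by its original line number, so this sketch keeps the function at the same position in the file; edits that move the function may need more care.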
And now let's look at locals_: we have say_hello, and if we compare it to module2.say_hello you can see that they seem to have the same identifier here. If I call one and call the other, this is the old one and this is the new one. The identifiers look similar, but if we look closely we can see they're not exactly the same, so let's not be misled by the fact that they look really close. Now we can finally update this function if we want, so we can do module2.say_hello = locals_['say_hello'], and finally call module2.say_hello, and there we have it: we just reloaded the function without reloading the connection to the database. If you want to use hot code reloading in your projects and you don't want to write the methods yourself using the tools we have seen previously, you can also use this library called reloadr: from reloadr import autoreload. In this case you just decorate your functions with this autoreload decorator and it will automatically replace them, using the proxy method we've seen above and the instance references to the class methods, with the new code when the code changes, by watching the file system. So this is a wrap-up of all the methods we've seen previously. You can also manually specify when the code should be reloaded by changing the decorator. In this case you can manually reload the class, or you can start a timer that will just reload it every second, or again look at the file system and, as the file system changes, trigger the reloading of this class. This is on PyPI, so you can just install it using pip install reloadr and start using it. The source code is pretty simple, it just fits in one file, and then you have a directory with a few examples. Thanks for watching, and join me in the question and answer Matrix room if you want to discuss anything. You can find all the examples we've seen on this GitHub repository. Thank you. Thanks Hugo for your talk, so I think we can now start with the questions.
First question: how does reloading using exec behave in terms of compiling to intermediate forms like .pyc files and so on? So Python internally is using bytecode, so exec is a two-step process. In the first step it compiles the source to bytecode, it just does not store that bytecode on disk, and then it executes that bytecode like the rest of the bytecode. And are there examples of applications that use hot code reloading? Usually it's a process that, at least for me, is used for development. It's not used that much in production, because there it can cause a lot of issues, but hot code reloading in general is used a lot by game developers, because they're tweaking the dynamics of the game while playing it, and restarting the entire game every time you make a change to some logic doesn't make sense in that case. And how do you deal with side effects, things like shared resources and so on? So the idea with hot code reloading, the way I presented it, is that you keep these resources open: you keep the state, you keep these resources. Of course, if something changes outside of the scope of your changes, then you may have compatibility issues, and then you just have to accept it and restart the whole process. Okay, any further questions from the chat? What are the dangers that remain? Could you fix them? Well, I think Python itself is not designed for hot code reloading, and other languages have allowed this in a safer way, so in order to make hot code reloading easier in Python I think some big changes within Python would be required. If you take the example of Erlang, that's a language that's designed to allow hot code reloading. It's used a lot in network equipment, and it's a feature built into the language and the tooling. If you take the example of Java, there is a rule: you can reload a class as long as its interface does not change, and your IDE, your tooling, will check that. 
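The two-step behaviour of exec described in the first answer above can be made explicit with the built-in compile function. This is only an illustrative sketch; the source string and the "<hot-reload>" filename label are made up for the example.

```python
import types

source = "def say_hello():\n    return 'hello'\n"

# Step 1: compile the text to a code object. The bytecode lives in memory
# only; nothing is written to a .pyc file.
code = compile(source, "<hot-reload>", "exec")

# Step 2: execute that code object in a namespace of our choosing.
namespace = {}
exec(code, namespace)
result = namespace["say_hello"]()
```

Passing a string straight to exec performs the same compilation implicitly.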
In Python there are no such checks, so at the moment there are no guarantees that the new version of the code will work. Okay, thank you. So, do you add decorators for reloading in your code base? Is there a best practice to ignore them when running your code in production? I think it aligns with the other point, that adding decorators just for the sake of reloading for half an hour doesn't make sense. It's a trade-off. I use decorators because that allows me to know exactly what is being hot reloaded and what is not, and also as a way to work with the references. Another strategy I thought about was to try to replace in memory all the references to the function with the new one within Python, and that requires much lower-level access to the internals of Python. |
Realtime 3D Graphics on a MicroPython ESP32
Hacking the EMFCamp Conference Badge |
I'm here to talk to you about graphics programming in Python on an embedded microcontroller, which is hilarious because I'm none of these things: I'm not a graphics programmer, I'm not a Python programmer and I'm not an embedded programmer, so we'll see how this goes. It's for that reason that I can't emphasise enough this part of the talk description: this is not an instructional talk, this is just what I did. So here's some background. EMF Camp is a weekend camping festival for hackers and makers, in a similar vein to the Chaos Communication Camp and the Dutch hacker festivals. You know, there's robots and lasers and geodesic domes and things. It's great fun; if you get the opportunity to go, I highly recommend it. And it's a bit of a tradition at this style of event to give the attendees electronic event badges, and the aim of these is to give attendees the opportunity to play with some hardware that they might not have come across before. These are the two most recent badges from EMF Camp. The one on the left here: if I told you they put a SIM card and a GSM modem on it, and then they set up an on-site cell phone network, you'll understand why it's made to look like a Nokia N-Gage. But it's got all of the usual peripherals and sensors on there as well, like accelerometers and humidity and temperature sensors, and because it runs MicroPython it allows people to easily get started with experimenting with that kind of hardware. The one on the right there is the newest one. These photographs aren't to scale, by the way; let me just hold them up for comparison. The newest one is much smaller. The reason for that, you might guess, is the silicon shortage that's been caused by fire, flood and plague, as you might expect, but it's still a lovely device. 
The one on the left here, you might recognise this as a version of The Settlers of Catan. I spent a lot of time trying to isolate small parts of the screen to redraw, because the update speed on that screen was so slow that it's almost unusable for anything in real time. So when I got my hands on the new one this year, I obviously wanted to see what this one could do. And so the first thing I wanted to do was to just try and blit a full screen of pixels to the device using the display driver directly, and that took about 70 milliseconds, which is already orders of magnitude faster than the old badge. If I draw to an off-screen buffer instead, that's way faster, but if you're doing that then you have to get into the business of implementing your own drawing functions for primitives, and I didn't really want to do that. That is ominous foreshadowing, by the way. But I did discover that MicroPython has this framebuf module, which provides you with an off-screen frame buffer and also some drawing functions, which is great. So 41 milliseconds: I thought that was a fair compromise, that's a good start. Now I've got a baseline for how fast I can draw to the screen. So obviously what this is about is drawing 3D things to the screen of this device, and in case you don't know, this is basically 3D rasterization 101: this is the minimum we have to do in order to get 3D points onto the screen. We start with our vertex coordinates, and they are multiplied by the model matrix to get into world space, and then you multiply that by the view matrix to get into view space, and then by the projection matrix to get into clip space, and clip space allows you to see which vertices will be clipped by the edges of the screen or not. 
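The chain of multiplications just described can be sketched in plain Python. This is not the speaker's implementation: the column-vector convention, the OpenGL-style projection matrix and all function names are assumptions made for illustration.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def translation(tx, ty, tz):
    """Model matrix that moves an object by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection matrix (an assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]


def to_clip_space(vertex, model, view, projection):
    """vertex -> world space -> view space -> clip space."""
    world = mat_vec(model, (vertex[0], vertex[1], vertex[2], 1.0))
    eye = mat_vec(view, world)
    return mat_vec(projection, eye)


def inside_clip_volume(clip):
    """A vertex survives clipping when -w <= x, y, z <= w."""
    x, y, z, w = clip
    return all(-w <= value <= w for value in (x, y, z))


model = translation(0.0, 0.0, -5.0)   # push the object 5 units in front of the camera
view = IDENTITY                       # camera at the origin looking down -z
projection = perspective(90.0, 1.0, 1.0, 10.0)

clip_centre = to_clip_space((0.0, 0.0, 0.0), model, view, projection)
clip_far_right = to_clip_space((100.0, 0.0, 0.0), model, view, projection)
```

Here the vertex at the object's origin lands inside the clip volume, while the one 100 units to the right falls outside it and would be clipped.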
So then once we've got the list of vertices we want to render, we can do the perspective division to bring that into normalized device coordinate space, or NDC space. The perspective division is just the part that makes the further-away points closer together, so it gives you that illusion of 3D. And then we've got to convert the normalized device coordinates, which are between minus one and one, to screen space, which is our pixel coordinates. And so when I was doing this to render these eight points of a cube on the screen, it wasn't too bad, 53 milliseconds, and then if you join those up to create your cube wireframe it's not that much slower. There's 12 triangles there, obviously. The next step is to then start filling in these triangles; you want to draw solid shapes, after all. Annoyingly, there's no method or function for doing that in the framebuf module for MicroPython. There is in the display driver, but as I mentioned, using the display driver directly is much slower because we're making many more calls to hardware, and we're setting pins high and low and so on every time we want to draw something, and we just want to do that once, when we blit the whole thing to the screen. And so framebuf doesn't provide a polygon or polygon fill method, and so I do have to get into the business of writing these sorts of functions myself after all. The display driver itself does have these methods, so obviously that's the first place I looked for implementation clues. They have a polygon and a fill polygon method, only obviously there are problems with it and it's a little bit rubbish. The figure on the left there is just using the outline polygon method, and the second one here is where I've tried to draw a filled polygon over the top of the wireframe polygon, and you can see it just doesn't quite match up. 
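The perspective division and the conversion from NDC space to pixel coordinates described at the start of this passage can be sketched as follows. The 240 by 240 screen size and the function names are assumptions for illustration, not details from the talk.

```python
def perspective_divide(clip):
    """Clip space (x, y, z, w) -> normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def ndc_to_screen(ndc, width, height):
    """Map x, y in [-1, 1] to integer pixel coordinates, y pointing down."""
    x, y, _ = ndc
    sx = int((x + 1.0) * 0.5 * (width - 1))
    sy = int((1.0 - y) * 0.5 * (height - 1))
    return sx, sy


# Two points with the same sideways offset; the second is further away,
# so it has a larger w and ends up closer to the centre of the screen.
near_point = perspective_divide((1.0, 0.0, 1.0, 2.0))
far_point = perspective_divide((1.0, 0.0, 3.0, 4.0))

WIDTH, HEIGHT = 240, 240  # hypothetical screen size
top_left = ndc_to_screen((-1.0, 1.0, 0.0), WIDTH, HEIGHT)
bottom_right = ndc_to_screen((1.0, -1.0, 0.0), WIDTH, HEIGHT)
```

Scaling by width - 1 and height - 1 keeps the extreme NDC values on the last valid pixel rather than one past it.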
And so reading the code there, it seems to be implementing quite a well-known, well-documented fill polygon method, and there's a link to the website where this algorithm is described, and that also supplies a reference implementation. So I was able to copy the reference implementation to see if the display driver's implementation was different, and it isn't, it's exactly the same. It looks like the display driver has inherited the same problems that were in the reference implementation. And you'll notice that it's not only incorrect on this side: the left edge here is completely different to this edge here, so it's overdrawing on this side and not drawing enough on that side. A lot of the problems with it were rounding errors and floating-point-to-integer truncation and that sort of thing, which I managed to mostly fix, except for this really annoying pixel down here that I just couldn't get. I wanted to submit this enhancement to the framebuf module upstream to the MicroPython project, and when I submitted it we spent a few days scratching our heads over this to try and figure out what we could do. Initially we proposed just drawing the outline again on top of the filled polygon, just to sweep it under the rug, but eventually we managed to figure out a much better way of doing it: we just try to detect when these stray pixels are going to happen and then fill them in explicitly instead of letting the algorithm do it. Oh yeah, it's pretty obvious that the algorithm, I think, was developed by a physicist or a mathematician, because the article that describes the algorithm says, and I'm quoting here, that detecting points on the polygon edge "will deliver unpredictable results", but that is "not generally a problem" because "the edge of the polygon is infinitely thin". Now, my polygons have an edge of one pixel, so this is obviously why 
we had to fix the problems with it. Anyway, now we can draw arbitrary polygons to the screen, so let's see what that looks like. This is the cube here again, which is basically the hello world of 3D graphics programming, and it seems to work pretty well, 66 milliseconds there. But on the left-hand screenshot there, it looks like you're looking at the inside of the cube, but it's just because we are drawing the back face of the back of the cube on top of the front face of the front of the cube. So as part of this 3D rasterization process you've now got to do back-face culling, which is more maths added on to that pipeline. You've got to calculate the normal vector of the face, which is the direction the face is facing, and then compute the dot product of that with the direction you're looking, so that you can know if the triangle is facing you or not, and then just don't bother drawing the ones that aren't facing you. But yeah, it's just more maths, so it adds more time. And oh yeah, I get the occasional really long frame, and that coincides with a garbage collection, I guess. We'll talk a bit more about that in a bit. So there's some really low-hanging fruit, things we can do to improve the performance initially, which basically amounts to being smarter about the algorithms we use. We pre-calculate the normals instead of calculating them every frame, which for a static model like this makes total sense, and we avoid doing the perspective division if we can help it, because I'd implemented it as part of the matrix multiplication process, and usually it's a no-op unless you're multiplying by the perspective matrix, and only then is it doing something. So we can just avoid doing those divisions at all on every vertex in every face in every frame. That's quite a lot of time saved, but it does mean 
I can add more things to it and make it do extra work, like add a rudimentary lighting model and make the cube nice-looking by adding shading and whatnot. What I'm trying to do, basically, is to keep the rendering time below 100 milliseconds as well, because that seems like a good target to have: if I can do that, then I get a reasonable performance of 10 frames per second. And so although this works well, and it's within that target, it's close to that target, I wanted to try something a bit more complex. So I downloaded a model of the industry-standard teapot to try and render that. This is about 240 faces, 240 triangles, and this obviously completely destroyed my 100 millisecond time limit, so I had to think of more ways to make this faster. And the obvious way is to rewrite all the hottest maths functions in C as a MicroPython native module. The two that are called the most often are the matrix-vector multiplication method and the dot product method, and you can see that more than cuts the time in half. And with the success of that, it's pretty clear I should rewrite all of the maths in C, because if I've got the bonnet up I might as well, and that brings the time right down to a glorious, glorious six frames per second. But as a general strategy, if you find yourself calling a method 1,200 times a frame, it's probably a good target to be pushed down into the native layer. So, a note on writing native code for MicroPython: there's really two ways of doing it. There's what are called external C modules, which is basically C code that you write as a module exposed to the Python runtime. Those are compiled directly into the firmware, which is a bit suboptimal, because it would be nice if I didn't require other people who have these devices to reflash the firmware every time I changed this program. So the other way of doing it is 
to write what they call a native module, which allows your application to supply native code as an .mpy file, and then that can be dynamically loaded by your application at runtime, which is a much nicer way of doing it. So obviously that's what I wanted to do, but I did come across problems when I tried to build the native code, because I'd used a floating-point division in there for the perspective division step of the pipeline. I got this problem, which is a linker error from the Espressif toolchain for the ESP32. I'd love to know why this happens, and if anyone from Espressif is here I'd love to know if it's fixed in a newer version as well, but it seems like it can't link this software implementation of floating-point division. So obviously what I did was I downloaded the source for their toolchain and found the assembly implementation of this function to add into my project, which also didn't work: the MicroPython build system wasn't prepared to accept that. But that was an easy fix, and that was actually the first change I got accepted into MicroPython. In my experience they're very good at accepting patches. And then once I got that building, it just caused my application to crash. I'm not sure why this happens, but there seems to be a reference to the native code that gets collected erroneously by the garbage collector, and I spent a lot of time trying to reduce my object allocations in the frames, but all that did was push the crash out further into the future. So I had to settle for compiling my maths functions directly into the firmware. There's some other things I did to try and make it faster. The big one is trying to reduce object instantiations. It's super costly in Python, so wherever you can, pre-allocate lists and arrays and things and then just reuse them. I initially wanted a lot of my classes to be totally immutable, good programmer that I am, but that just 
totally wasn't feasible, so you just have to mutate: when you do calculations on your vertices, just mutate one of the operands and send it back that way. The other thing I found that saved some time was reducing the number of times that we cross from Python into native code and back again. I found I was doing lots of the same operation on vertices and matrices, so if I could just send them all as one batch in a single function call into the native side, then that made it perform a lot quicker. I think there's a lot of function and stack manipulation overhead there that you save. And also, pass arrays and not lists into the native functions, especially for this kind of stuff where we know that the data that we're passing are floats or whatever. You know ahead of time what type is in your array, which means you can make some assumptions that MicroPython can't make, and when you manipulate the data objects on the native side you can skip a bunch of type-safety checks: you can just write directly to the data structure, which is useful. And also, this one surprised me as well. Well, I don't know if it's surprising, maybe it's obvious to people who are veteran Pythonistas, but I didn't expect the native libc qsort function to be so much faster than the sort function in Python. If you look at this picture here, you can see that some parts of the teapot that should be occluded are drawn on top of the body of the teapot. So what I had to do was Z-sort the faces so that we draw the faces from back to front, and that's what I was using the list sort method for here, but just implementing this face sorting as a native function as well was, as it says, 100 times faster. And the other thing that made a measurable difference was locally caching object references in your functions as well, so 
if you're using an object value more than once, instead of doing self.foo, self.foo, self.foo, just create a local reference, a local variable in the function, and use that instead. So there's some dereferencing overhead there that is quite significant that we're saving. And so after applying all of this sort of stuff, this is the final result, or the result so far. I'm pretty happy with it: getting the teapot model down to under 100 milliseconds per frame was really pleasing, and I'm pretty happy with the performance. So what can this be used for? Honestly, this was just a fun way to spend a few weekends after the festival had happened, but it seems to be performant enough that you could do some kind of small 3D game, like a lunar lander or something like that, or make yourself a Jurassic Park-style 3D user interface for your home automation. But really the chief lesson for me, I think, was that the best way to get involved with a project like MicroPython is to just start using it, and eventually you come across some kind of limitation that you're probably best placed to overcome, because you're the one who's trying to solve the problem, you've got the vested interest in it, and all of the information is currently paged into your brain. And then the MicroPython people were extremely helpful in helping me whip my contributions into shape, so thanks to them for helping me get involved in MicroPython, and thanks to you for listening. I can try and answer questions, but I'm not super expert on anything I've been talking about. Hi, and thanks for your talk. I had a question about the ESP32 that you were implementing this on: did you ever look at using the dual-core setup to try to accelerate any of the maths? That is a good question, and someone has mentioned this to me before, but when I was writing this I was actually unaware that it had more 
than one core, so I haven't yet, but it's a great idea. Thanks very much for your talk. If you're interested in MicroPython, in the building there is a stand about MicroPython and also a stand by Pine64, who make a smartwatch that can run MicroPython and so on. |
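For readers following the talk in text form, the back-face culling test and the back-to-front face ordering it describes can be sketched in plain Python. This is a hedged illustration rather than the speaker's code: the counter-clockwise winding convention, the camera looking down the negative z axis and all function names are assumptions.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def face_normal(triangle):
    """Direction the face points, assuming counter-clockwise winding."""
    a, b, c = triangle
    return cross(sub(b, a), sub(c, a))


def is_front_facing(triangle, view_dir=(0.0, 0.0, -1.0)):
    """A face is visible when its normal points against the view direction."""
    return dot(face_normal(triangle), view_dir) < 0.0


def depth(triangle):
    """Average z of the three vertices; more negative means further away."""
    return sum(vertex[2] for vertex in triangle) / 3.0


def visible_back_to_front(triangles, view_dir=(0.0, 0.0, -1.0)):
    """Drop back faces, then order the rest so nearer faces are drawn last."""
    visible = [t for t in triangles if is_front_facing(t, view_dir)]
    return sorted(visible, key=depth)


near = ((0.0, 0.0, -1.0), (1.0, 0.0, -1.0), (0.0, 1.0, -1.0))
far = ((0.0, 0.0, -5.0), (1.0, 0.0, -5.0), (0.0, 1.0, -5.0))
back = ((0.0, 0.0, -3.0), (0.0, 1.0, -3.0), (1.0, 0.0, -3.0))  # reversed winding

draw_order = visible_back_to_front([near, back, far])
```

The talk reports two optimisations on top of this idea: pre-calculating the normals for a static model instead of recomputing them every frame, and moving the sort into native code.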
Simple, open, music recommendations with Python |
And welcome to Sam for his talk on music recommendations in Python. Welcome. Thank you. Can you hear me? Yes. Good. Okay. So I'm a systems software developer, actually, and this is a hobby: I like to make music playlists and play around with Python. I'm also a musician and a music fan. And I also used to be a teacher. I think that's relevant; that's going to come into play later. Thanks to my employer, Codethink, for sponsoring the travel and allowing me to be here. So as a music fan, I used to make a lot of playlists. I still do. And I'm quite old, so when I first started making playlists, they looked like this. Very convenient to share: just give someone else the piece of plastic, and they have a machine that plays it. But quite difficult to make because, you remember, you have to line up all the songs just right, press record, press play. The 2000s came and we all moved to digital music. If you were cool, you had a Winamp skin. If you were really cool, you had XMMS. And these playlists were much easier to make, you just drag and drop. But they were more difficult to share, because nobody else had the same MP3s as you. So you couldn't give the playlist to your friend quite so easily anymore. But if you ask someone now to make a playlist, probably they're going to think of this. They're going to make you a playlist on Spotify or YouTube and send you a link. And that's even better, right? It's super easy to make: you drag and drop. It's easy to share: you send someone the link. And it even recommends you songs to put on the list. So what's not to like? Honestly, I don't actually want that song on the list, so the recommendations aren't always helpful. Spotify is fine. You can use it. It has a great team of researchers. There are some negative things about the company. I mean, it's a private company. Their duty to their investors is to minimize the amount that they pay out to musicians and pay that to investors instead. 
And they've been steadily doing that and reducing the rates they pay to musicians. And they kind of focus on passive listening, right? So you put on an album, it finishes, and they put on more songs for you. People now actually adapt their music to fit the Spotify algorithm. The first 10 or 20 seconds are very important, so songs don't have long intros anymore. That's been done to please the Spotify algorithm. So I started to think, what would the opposite look like? And I came up with this: it would have to be something DIY, something that doesn't have a profit motive behind it. It would focus on having local music and going to artist websites, buying music from Bandcamp, paying artists on Patreon. It would also involve working with open data. When I say open data, I don't necessarily mean public data that everyone can see, but data that's hosted by you or by an entity you trust, where you can choose if it's open or private, and you can download it, export it, et cetera. So I have no idea really what I'm doing, but back in 2016 I started experimenting with some ideas. And I was inspired by one or two other projects. So has anyone heard of Dynamicland? That's a shame. So note that down if you have a notebook. It's something very interesting to research, but out of scope for this talk. It's a project where the whole room is a computer. Each of these pieces of paper has a program on it or some data, and you can interact with them by moving them around physically. Now, I can't create that myself, but I like the idea of having a program that fits on a sheet of A4 paper. The philosophy is that if your program doesn't fit on the paper, then it's too big and it needs to become smaller. And I like that as a philosophy. I feel like the playlist generators that I want to write should also fit on a piece of A4 paper or on a slide. And it should be a process that people can participate in. Okay, another thing that really inspired me was Git. 
That might seem counterintuitive, but Linus Torvalds recently said he's actually better known for Git than for Linux, despite having basically created Git in a month. Quite an achievement, right? So there were a few key ideas. Git's data model is really well defined. It's simple. You have refs and commits, and you work with those directly. And then your commits are made of trees, and your trees have blobs. And you work with this directly; you get your hands dirty. Git is also a multi-call binary, which has the really nice advantage that you can write one part of it in Perl, and then another part of it in Tcl, and then another part of it in C. So you don't have to keep rewriting. You can have different people working on small components. And it had this idea of having the user interface commands, which they call the porcelain, and the innards, the plumbing. But it's all available, right? So if you have Git on your laptop, you can build a commit using the lowest-level commands if you want. And that's a huge advantage in getting people involved. Git is a real DIY project. It's not some shiny thing that just magically works, where you push a button and have a nice day. It's something that you really have to get involved with. It'll break. You have to learn how it works. And that's the secret to its success, I think. And of course, the interface to Git is the command line, right? So you can build a website around it in Ruby. You can build a website around it in Python. You can build extensions. Very inspiring. I set out to build a similar tool, but for playlists. And the first thing I thought about was the data model. And I realized that actually everything is a playlist. You know, a music collection is just a playlist where the order doesn't really matter. Metadata can be stored as metadata in the playlist. So everything is a playlist. I wanted to write a multi-call binary. This is called cpe. The tool I wrote is called Calliope, by the way. 
I'm not really here to show off about the tool, actually. You can look at it, and it's fun, but the ideas are the thing I'm more excited about. I'd like people to re-implement this in other languages and go forth with the ideas and do stuff I never thought of, or contribute to the project itself. So it has a multi-call binary. Currently everything's written in Python. That could change if somebody decides to write a new tool in Haskell or whatever. The main interface is the command line, so you can create a recommendation pipeline as a shell pipeline. Or you can do stuff in Python directly for greater control. And it's optimized for ease of maintenance, right? Because I'm lazy. I have one hour a weekend to spend on this, so it has to be easy to maintain. Okay, so the data model is as simple as possible. Here's a playlist item. It's a Python dictionary, which we can represent as JSON, and it has key-value pairs. And then a playlist is a list of playlist items. One quite key decision: notice I haven't represented this as a JSON list. It's a JSON Lines document, so that's JSON objects separated by newlines. And this is really cool, because you can process it with shell pipeline tools. You can process it with JSON tools, but you can also process it with line-based processing tools. Think about it: if we had a JSON list, and this playlist was 10,000 items long, and we stream it, then you have to wait for the closing bracket before the next process can read it. But this way, the processes can read a line at a time, and you can have an infinite-length playlist and start processing the beginning of it before it's finished. Okay, so that's the data model. Those key-value pairs, creator and title, aren't arbitrary; they come from an existing playlist format called XSPF, which has been around since 2006 and is almost perfect. Like, they got the design almost perfect. One of the flaws was choosing XML, which was a good idea in 2006. 
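The line-at-a-time property of the JSON Lines data model described above can be sketched with the standard library alone. This is not Calliope's code: the helper names and the made-up songs are assumptions, and only the creator and title keys come from the talk.

```python
import io
import json
import random


def read_playlist(stream):
    """Yield one playlist item per line, without loading the whole playlist."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


def write_playlist(items, stream):
    """Write each item as one JSON object followed by a newline."""
    for item in items:
        stream.write(json.dumps(item) + "\n")


# Made-up songs, one JSON object per line.
PLAYLIST_TEXT = (
    '{"creator": "Artist A", "title": "Song 1"}\n'
    '{"creator": "Artist B", "title": "Song 2"}\n'
    '{"creator": "Artist C", "title": "Song 3"}\n'
)

# The first item is available as soon as the first line has been read.
first_item = next(read_playlist(io.StringIO(PLAYLIST_TEXT)))

# Shuffling needs the whole list in memory, but reading and writing stream.
items = list(read_playlist(io.StringIO(PLAYLIST_TEXT)))
random.Random(0).shuffle(items)
output = io.StringIO()
write_playlist(items, output)
shuffled_text = output.getvalue()
```

Because every line is a complete JSON document, the same text can also be fed to line-based tools without a JSON parser.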
And the other tweak I made was representing it as JSON Lines, but the data model is effectively the same as XSPF. So we can already do some fun stuff with this playlist, right? Let me quickly show you what you can do. Here's a playlist. These songs aren't real, obviously. We can shuffle it. I have to give it a file name, and the file name is standard input. Okay, so now it's shuffled. I can export it to a different playlist format. So now I've converted it into an actual XSPF playlist, so you can put it into Rhythmbox. But we don't even need to use Calliope tools, right? I could use head to get the first item. I could shuffle it using sort. Okay. And I can use data-oriented tools as well. So this is actually Nushell, which is a data-oriented shell. So I can also load it into Nushell, and now I have JSON, and now I can sort it by the artist's name or by the title. So just by defining a data format, you get all this stuff for free. Like, I didn't even have to write any code yet, and we can already shuffle a playlist. So what's next? Well, these aren't even real songs, right? You can't play them. There's no content. So the next step is to get some content so we can actually listen to the playlist. The developers of the XSPF format thought of this, and they designed XSPF to be portable: when you go to play the music, you resolve it at that moment. So you search based on the metadata, like creator and title, and then you find a URL where you can actually play it. So I implemented that, and I can demo that as well. Okay, so here's three. These are real songs now, and if I pipe it to the Spotify subcommand, they get resolved to actual tracks on Spotify. So over here somewhere is a URL, and you can click it and listen to the track. This is all done using the Spotify API, so you need a Spotify API key to do that. You can get it for free, but it's a little bit of an effort. And it works by searching based on creator and title, and ranking the results. 
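The idea of resolving an item by searching on creator and title and ranking the results can be illustrated with a toy resolver. Everything here is hypothetical: the in-memory catalogue, the similarity score, the threshold and the use of a location key are assumptions for the sketch, and it does not reflect how Calliope or the Spotify API actually rank matches.

```python
from difflib import SequenceMatcher

# Hypothetical catalogue standing in for a search backend.
CATALOGUE = [
    {"creator": "The Examples", "title": "First Song",
     "location": "file:///music/the-examples/first-song.mp3"},
    {"creator": "Sample Band", "title": "Second Song",
     "location": "file:///music/sample-band/second-song.mp3"},
]


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(item, catalogue, threshold=0.8):
    """Return a copy of item with a location set if a good match exists."""
    def score(candidate):
        return (similarity(item["creator"], candidate["creator"])
                + similarity(item["title"], candidate["title"])) / 2.0

    best = max(catalogue, key=score, default=None)
    if best is None or score(best) < threshold:
        return dict(item)                      # unresolved: left unchanged
    return {**item, "location": best["location"]}


resolved = resolve({"creator": "the examples", "title": "first song"}, CATALOGUE)
unresolved = resolve({"creator": "Nobody", "title": "Nothing Here"}, CATALOGUE)
```

Leaving unmatched items unchanged keeps the playlist portable: a later resolver can still try to find them somewhere else.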
Or I can resolve it to tracks on my local machine. I'm a GNOME developer, so I have the Tracker search engine installed, and Tracker can match against my local music collection and return the URL. Let me make that pretty. Okay, so it's resolved to URLs on my local machine. This one, I seem to have deleted the Madonna album, but the other two are here. And then you see here I exported it as an M3U playlist as well, now that we have URLs. So these are the basics of how you can make playlists in Python, right? What's next? So I promised music recommendations, right, and we haven't actually done any recommendations yet. So the next thing I'm going to talk about is a program I made that generates me a playlist every day. And that's as far as I've got with this, because actually I quite like the playlists it generates, so I haven't needed to make any other recommenders yet. I'm still happy with this one. Soon I shall look at some more. But a recommendation algorithm is basically this. You have a very big playlist on the left, which is all the possible music you could listen to, and then some sort of algorithm happens, and on the right you have a shorter playlist, which is hopefully better, and that's the one you listen to. So the algorithm I came up with, I called it the Special Mix, and its goal is to create a one-hour playlist of music that I already know, and there's three ingredients for that. All of these are Python libraries. One is pylistenbrainz, which is an interface to the ListenBrainz database. One is the beets music organiser, which is a great tool for maintaining a local music collection. And one is the Python SimpleAI module, which gives you really basic AI algorithms that let you do constraint solving. So I'll go through those one at a time. I'll have a little drink first. So if you want to do music recommendations, it's a good idea to save the history of what you listen to. 
Spotify already does that, although they make it a little difficult for you to then get at the data. Last.fm does that, and so does ListenBrainz, which is the solution I recommend because it's open. It's an open source platform. It's open data. So you can get a browser extension, or phone apps and music players that will save everything you listen to into the ListenBrainz database, and then ListenBrainz gives you charts and graphs to show what great taste you have. And pylistenbrainz and the Calliope ListenBrainz command let you access the data. So... I would run the ListenBrainz history command, put my username, and fetch all the listens. This does something kind of dumb. It just syncs all of the listens into a local SQLite database. And then I've dumped the first one here to show the kind of metadata you get. So you get a timestamp, you get an ID for the track, and then you get the creator and the title and the album. And in this case, the URL of where I listened to it, because it came from the Web Scrobbler. So that's useful. And then because it's saved in a local SQLite database, we can do things like calculate a histogram of which years I actually listened to music. We can select tracks based on, okay, it was first listened to in 2019, or it was first listened to in 2020. So that's what I did. And now I have a playlist, right? So the first thing the Special Mix does is it chooses one year out of this histogram, so a year where I did actually listen to music, and then it selects all the tracks that I listened to where the first listen is in that year. So songs I discovered in 2019 or discovered in 2021, for example. And now we have a playlist, but it's very long. So the next step is... The next step would be to select tracks from it. However, we want to know a bit more about the songs first. So actually the next step here is to resolve the tracks to local content. This is where I keep my music collection. It's a hi-tech home server.
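The "histogram of years, then tracks first heard in the chosen year" step can be sketched with the standard `sqlite3` module. This is only an illustration under assumptions: the `listens` table layout, the column names and the sample rows are made up for this example and are not the schema that Calliope or ListenBrainz actually use.

```python
import random
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp (UTC) for a given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


con = sqlite3.connect(":memory:")
# Hypothetical schema: one row per listen.
con.execute("CREATE TABLE listens (listened_at INTEGER, creator TEXT, title TEXT)")
con.executemany(
    "INSERT INTO listens VALUES (?, ?, ?)",
    [
        (ts(2019, 3, 1), "Artist A", "Song 1"),
        (ts(2019, 6, 1), "Artist B", "Song 2"),
        (ts(2020, 1, 5), "Artist A", "Song 1"),  # repeat listen of an older track
        (ts(2020, 2, 9), "Artist C", "Song 3"),
        (ts(2021, 7, 7), "Artist B", "Song 2"),
    ],
)

# Histogram: how many listens happened in each year.
histogram = con.execute(
    """
    SELECT strftime('%Y', listened_at, 'unixepoch') AS year, COUNT(*)
    FROM listens
    GROUP BY year
    ORDER BY year
    """
).fetchall()


def first_heard_in(year):
    """Tracks whose earliest listen falls in the given year (as a string)."""
    return con.execute(
        """
        SELECT creator, title
        FROM listens
        GROUP BY creator, title
        HAVING strftime('%Y', MIN(listened_at), 'unixepoch') = ?
        ORDER BY creator, title
        """,
        (year,),
    ).fetchall()


# Pick one year in which there were listens, then select the tracks
# discovered in that year.
year = random.choice([row[0] for row in histogram])
print(histogram)
print(year, first_heard_in(year))
```

Grouping by track and filtering on `MIN(listened_at)` is what separates "discovered in that year" from "merely played in that year".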
And I manage it with a Python program called beets. This is a command line tool that lets you take music that you've got from Bandcamp or wherever and import it into a database, and it fixes the tags using MusicBrainz. So it's always correct. It has nice apostrophe characters. You can use any content resolver in theory. You can generate playlists against Spotify and upload them to Spotify, but that's not my goal here. So using beets, I can resolve the tracks in the playlist to actual MP3 files on my music server. That's actually a lie. I haven't implemented that yet and I use Tracker. But because beets is written in Python, we'll pretend that I use beets. Either way, now I have a playlist where the track location is set to a file and we also know the duration of every song. So we have a bit more metadata. And now we can select which tracks go in the playlist. Okay, so here's the fun part. All the parts are fun, but maybe this is the most fun part. The approach I took was to try and do constraint solving. Now, this is a pretty old area of AI, right? People have been looking at constraint solving for years and years, and the fashion at the moment is to solve everything with machine learning and lots and lots of GPUs. And that works. It produces nice results, but it's hard to reproduce in an hour on the weekend on your old laptop, whereas the constraint solving approach is pretty simple. You can run it on, you know, a Raspberry Pi with no issues. This was inspired by a paper that I found, which, again, is from 2008. It's nothing too futuristic. And in this paper, they define a constraint model. So this looks kind of academic if you're not a mathematician. But these are ways of making constraints on a playlist, such as I want the playlist, for whatever reason, to be 20% or more Stevie Wonder songs. And they can express that as a function. And the key is that you can apply this function to a playlist. So let's say I have a playlist and it has 100 Stevie Wonder songs.
This constraint function would return a score of one for that playlist because every song is Stevie Wonder, so the constraint is completely satisfied. Now let's say I have another playlist, which is entirely death metal. This constraint function would return zero because none of the tracks are by Stevie Wonder. And if we had a playlist that was 10% Stevie Wonder songs, then we would assume this function would return 0.5 because the playlist kind of half satisfies the constraint. So the first step in constraint solving like this is to define a constraint function that can score any playlist. And then we use a local search algorithm to find a playlist that best matches the constraints. So local search is a kind of try, try, try, try again until you get bored and then pick the best solution that you found. You set a fixed number of iterations like 10,000 and you kind of go, OK, well, this works, this one's a bit better, this one's worse, and choose the best one that you found. So I'm going to do a quick worked example of this with two constraints. And the constraints I'm going to put are that each song must be two to four minutes long and the playlist as a whole must be 10 minutes. And here's a demo of solving that constraint problem. As promised, it fits on a sheet of A4 paper. This is the whole program. So here I've defined two constraints. One of them is an item duration constraint saying that the duration of each item should be between two and four minutes. And here's a playlist duration constraint saying that the overall duration should be between 10 and 10 minutes. I know exactly what I want. And then here is the input, which is a playlist made up of four fake songs. I haven't used real songs here because it's too complicated. Notice they have vastly different lengths, so it's quite hard to solve this problem. There's not an obvious solution. And we're going to use the Calliope select module, which internally uses simpleai.
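To make the idea of scoring constraints and then searching locally more concrete, here is a small pure-Python sketch. It deliberately does not use Calliope's select module or simpleai; the four songs, the scoring formulas (share of songs inside the range, linear falloff from the target duration, a simple average of the two) and the steepest-ascent hill climbing loop are all assumptions chosen for this example, so the scores will not match the numbers shown in the talk's demo.

```python
# Hypothetical songs: ID -> duration in seconds.
SONGS = {
    "amazing": 240,  # 4:00
    "jingle": 30,    # 0:30
    "ambient": 420,  # 7:00
    "pop": 180,      # 3:00
}


def artist_fraction_score(creators, artist, target=0.2):
    """Score 'at least `target` of the songs are by `artist`' between 0 and 1."""
    if not creators:
        return 0.0
    fraction = sum(1 for c in creators if c == artist) / len(creators)
    return min(1.0, fraction / target)


def item_duration_score(playlist, low=120, high=240):
    """Share of songs whose duration lies between `low` and `high` seconds."""
    if not playlist:
        return 0.0
    ok = sum(1 for song in playlist if low <= SONGS[song] <= high)
    return ok / len(playlist)


def playlist_duration_score(playlist, target=600):
    """1.0 when the total is exactly `target` seconds, falling off linearly."""
    total = sum(SONGS[song] for song in playlist)
    return max(0.0, 1.0 - abs(total - target) / target)


def score(playlist):
    """Overall score: plain average of the two duration constraints."""
    return (item_duration_score(playlist) + playlist_duration_score(playlist)) / 2


def neighbours(playlist):
    """Every playlist reachable by adding or removing exactly one song."""
    for song in SONGS:
        yield playlist - {song} if song in playlist else playlist | {song}


def hill_climb(start=frozenset(), max_steps=10_000):
    """Move to the best neighbour until no neighbour improves the score."""
    current = start
    for _ in range(max_steps):
        best = max(neighbours(current), key=score)
        if score(best) <= score(current):
            break
        current = best
    return current


best = hill_climb()
print(sorted(best), round(score(best), 3))

# The Stevie Wonder example: 10% of songs against a 20% target scores 0.5.
creators = ["Stevie Wonder"] + ["Someone Else"] * 9
print(artist_fraction_score(creators, "Stevie Wonder"))
```

With these made-up songs the search stops at a two-song playlist that satisfies the per-song constraint fully but is three minutes short of the ten-minute target, which shows how local search settles for the best-scoring playlist it can reach rather than a perfect one.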
So the only thing that's required in this playlist, the only required piece of metadata is an ID. In this case, I've put emojis because they're pretty, but normally you'd have an integer or something. And these constraints will look at the duration field. So the duration field is also required because otherwise we can't calculate the score, right? Because we need to know the duration of each item. So if I've got time, how are we doing for time? Okay, I think I have time to show you a live demo of solving this constraint problem. And the good news is, Simple AI, the Simple AI module, has a web viewer that can give us a kind of graphical view of what's going on here. So with luck, yeah, with luck this will load, and it's going to step through each step of the local search algorithm to find the best playlist. So in the beginning we have an empty playlist and the score is zero because it doesn't satisfy either of the constraints. We take a step. Each step it can choose to do different actions that will change the playlist. So here it's chosen actions of adding a song because that's all it can do. So we can add this song which was quite long and the score is now 0.4 because one of the songs is too long. Or if we have the really short song, the score is lower. Or if we add this seven-minute ambient tune, then, ah, it's very difficult to get the scale right. There we go. So this one has the highest score, no, this one has the highest score actually. So probably it's going to choose this one. Yeah, so the next step we have a playlist which is just this song. What song was that? Amazing tune. So we have amazing tune in the list and now the score is 0.6. Let's take the next step. Okay, so now one of the possible actions is removing the song again, but we won't do that because the score is back down to zero. Or we can add one of these other songs and this one seems to be the playlist that best matches the constraint. Okay, so now we've got a playlist of two items. 
Now we can take some more actions. We can add another song or remove either of those songs. And we've added another song. Probably at this point it's going to say, okay, it can't find any actions that increase the score, so the algorithm has said, right, it's done. That's the best playlist you're going to get. And that is how you can create a playlist in a page full of Python. So, pretty short on time. We can export this playlist. And I have a Jellyfin music player set up and that's how I listen to it. That's a recap of what we've just seen. So what's next? I don't know, really. We have a couple of minutes for questions. Maybe you can answer the question of what's next. APPLAUSE Thank you, Sam. We have two minutes for questions. I will repeat the questions then. Yeah, so if I want to actually use this project, how much time would it take me to set up the project and find constraints that actually match, given that I'm fairly familiar with Python and setting up projects? So how much time would it take to set up the whole project and find the constraints and so on? Yeah, interesting question, actually. So, I mean, setting up the project is fairly easy. You pip install. After that... I don't know, it wouldn't be quick. I'll tell you that. You would have to enjoy getting your hands dirty a bit at this stage. My general goal is to create a folder of example recommenders. And you can actually run the examples as modules as well. So hopefully in the future you'd be able to kind of run an existing example and get started fairly quickly and just tweak a few values. One more over there. Or one over here. So depending on the number of times you have used this recommendation system, how often has it repeated the same set of music, the same set of songs or very similar-sounding songs? Yeah, how often has it repeated? It's never come up with two playlists the same, actually.
What it does sometimes do, though, is it'll take an album and kind of give me four or five songs from the same album in one playlist. So maybe I need to tweak the constraints a bit there. But there are infinite possibilities, really. Yeah, I haven't got bored of it so far. If your input is very short, then it will get repetitive. One more question? Thanks for the very interesting talk. Just a quick question. How easy would it be to implement? So say you wanted to search for different performances or different interpretations of a particular piece of music, so if you had classical music, some symphony with a bunch of different recordings, how easy would it be to implement that in the current tool? How familiar are you with MusicBrainz? Not at all. So the tool you would use would be MusicBrainz. Yeah, talk to this guy. He'll bring you up to speed. Thank you again, Sam. Thank you. |
DuckDB: Bringing analytical SQL directly to your Python shell. |
Let's welcome Pedro Holanda for his talk on DuckDB and a magnificent snake duck. Yeah, you guys would be surprised at what you can find as a rubber duck these days, you know. All right, so I'm Pedro Holanda. I am one of the main contributors of the DuckDB project, which is an open source database system, and also I'm the COO of DuckDB Labs. And today I'm going to be talking a little bit about how DuckDB can bring analytical SQL power directly into your Python shell. So to give you a little bit of an idea of what the talk looks like, I'm going to start with what DuckDB is. So I'm here talking about one more database system. I want to convince you guys that we actually needed to build one more database system. The other ones didn't solve the problems we had. And then I'm going to go over the main characteristics of DuckDB, so what actually makes it special. Then I'm going to go over DuckDB in Python land, so how DuckDB integrates into the Python ecosystem. I'm going to do a little demo. The basic idea is that we're going to use the infamous New York City taxi data set, and we're going to try to do some estimation of fare costs, and we're going to use DuckDB, pandas and PySpark just to see a couple of the differences in the things I'm going to be talking about. And then a summary of the talk. So what is DuckDB? Well, DuckDB was actually born at CWI, which is the research center for mathematics and computer science in the Netherlands. And what we actually had there is that a lot of the projects, the PhD student projects, the master projects, are very data sciencey. So usually you have a data science problem, and you want to throw a database management system at the data science problem because you're handling data. So initially we were like, OK, we can probably use a database server, use a database connection, and then just transfer the data from the relational database to your Python terminal, for example, where your analytical tools are.
And it turns out that's quite a bad idea, because you are transferring a lot of data. So that's pretty costly. And then you're like, OK, this is really not solving our problem, can we draw inspiration from somewhere else? And then, of course, there is SQLite, the most famous database out there, or at least the most used one. And it has quite a nice property, which is being an embedded database system. Being an embedded database system means it can run inside your Python process. So you can eliminate this data transfer cost. SQLite comes with one design decision, which is that it's a transactional database, so it's actually super optimized for small updates, but it's not really optimized for analytics. So we kind of wanted to be like SQLite in terms of being easy to use and eliminating this data transfer cost, but focusing on analytical queries. And that's kind of how DuckDB was born, and that's also why we frame it as the SQLite for analytics. It also has a very simple installation. So if you think about Python, you just do a pip install, and you're good. It is embedded, so there's no server management. So let's say you just want to, I don't know, query a Parquet file: two lines of code and you can query it. Like, there's no starting of a server, there's no schema creation, the schema is inferred from the object, so it's very easy, very fast. And we also really focus on this fast transfer between analytical languages and their tools, like in Python and R, and DuckDB. DuckDB is currently in pre-release. I think the last version we released was 0.6, and 0.7 is coming up soon. On the web page there are a few more details about all the things that are in each release. All right, so I'm going to go over some of the main characteristics of DuckDB, particularly the columnar data storage, and a little bit about compression. I'm going to talk about vectorized execution, so these are all core database stuff.
Actually, talking about the vectorized execution engine is going to be difficult because Professor Boncz is here and he actually created that, so I'll try to do it correctly. Then a little bit about end-to-end query optimization, parallelism and beyond-memory execution. So columnar data storage. Well, there are basically two ways that you can do it. One is a row store, the other is a column store. As an example of a row store, we have SQLite, and the whole idea is that they're storing your data consecutively in memory per row. So that basically means that if you want to fetch an individual row, that's very cheap because it's contiguous in memory. However, you always have to fetch all the columns. In analytical queries, you usually have very wide tables, but you just want to get a couple of these columns. So what if you only want to use one field? So in this example, what if you are just interested in the price of a product, but not the stores it's sold in, right? In a column store, you actually have a layout where the data of a column is stored consecutively in memory. So if you want to access just a couple of columns, you can actually have immense savings on disk I/O and memory bandwidth. So that's why this type of format is really optimized for analytics. So to give you a more concrete example, let's say that we have a one terabyte table with 100 columns. For simplicity, let's say all the columns have the same size, and we just require five columns of the table in our analytical query. So in a row store, like SQLite, reading this whole table, if you have a disk with around 100 megabytes per second, will take you three hours. If you were using a column store model, which is what pandas and DuckDB do, for example, reading these five columns from disk takes you eight minutes. So there's a huge improvement from just choosing the correct storage format for your workload. Compression.
Well, I'm not going to go into a lot of detail about the compression algorithms that we implement in DuckDB, but what I can tell you is that, because of having a column store, you're going to have the data from your column contiguous in memory, which gives you a very good advantage for compressing it, because usually the data from the same column is somewhat similar. So you can apply cool things like RLE, Chimp for floating-point numbers, FSST for strings. So you can start applying all these algorithms and really decrease the size of your data. So in this table here, we actually have, I think this is from one year ago, one year and a half, version 0.2.8 of DuckDB. We had no compression at that point, and then a year and a half later, we actually managed to implement all these things, which got us five times better compression on lineitem, for example, and 3.18 times better compression on the taxi data set that I'm going to be using later. And why is compression so important? Well, if we go back again to the same example, where we were reading our five columns, and it was costing us eight minutes to read them from disk because of the storage format, if we compress these columns, we suddenly don't have to read 50 gigabytes anymore, right? You read less. And then of course, if you apply the best case from what I showed you in the last table, five times, that decreases the cost to one minute and 40 seconds. So, execution. Well, there are three ways of doing query execution. There's actually one more, but it's not in the slides. So SQLite uses tuple-at-a-time processing, which means that you process one row at a time. pandas uses column-at-a-time processing, which means that you process one column at a time. And DuckDB uses kind of a mix of both, which is a technique developed by Peter, the vectorized processing, where you process batches of a column at a time.
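The read-time figures quoted for the one-terabyte table can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 100 MB/s = 10^8 bytes per second), equally sized columns and the best-case fivefold compression mentioned in the talk; the exact results land close to the rounded numbers the speaker uses.

```python
TABLE_BYTES = 10**12       # 1 TB table (decimal units assumed)
N_COLUMNS = 100            # all columns assumed to have the same size
NEEDED_COLUMNS = 5         # columns the analytical query actually touches
DISK_BYTES_PER_SECOND = 100 * 10**6  # roughly 100 MB/s
COMPRESSION_RATIO = 5      # best case quoted in the talk


def human(seconds):
    """Format a duration in seconds as hours, minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Row store: every column of every row has to be read.
row_store_seconds = TABLE_BYTES / DISK_BYTES_PER_SECOND

# Column store: only the five needed columns (50 GB) are read.
column_store_seconds = row_store_seconds * NEEDED_COLUMNS / N_COLUMNS

# Column store with 5x compression: only about 10 GB are read.
compressed_seconds = column_store_seconds / COMPRESSION_RATIO

print("row store:         ", human(row_store_seconds))     # just under three hours
print("column store:      ", human(column_store_seconds))  # a bit over eight minutes
print("column + compress.:", human(compressed_seconds))    # one minute and 40 seconds
```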
So basically, the tuple-at-a-time model from SQLite was optimized for a time when computers didn't have a lot of memory. It needs very little memory because you only really need to keep one row in memory throughout your whole query plan. Memory was expensive, so that's what you could do, but this comes with a high CPU overhead per tuple because you're constantly resetting your caches; you don't have any cache-conscious algorithm carrying that piece of data up to the production of your query result. If you go to column-at-a-time, which is what pandas uses, this already brings better CPU utilization, and it allows for SIMD. But it comes at the cost of materializing large intermediates in memory. It basically means that you need the whole column in memory at that point to process it for that operator. And of course, the intermediates can be gigabytes each, so that's pretty problematic when the data sizes are large. And that's why you see, for example, with pandas, if your data doesn't fit in memory, what happens? It crashes. And then if you go to vectorized query processing, it's actually optimized for CPU cache locality, you can do SIMD instructions and pipelining, and the whole idea is that your intermediates are actually going to fit in the L1 cache. So basically you're going to be paying a latency of one nanosecond to access your data throughout your query plan instead of paying the latency of main memory, which is what you pay in a column-at-a-time system, and which is 100 nanoseconds. It seems like a small difference, but when you're constantly executing this, it really becomes a bottleneck. End-to-end query optimization is of course something that we have in DuckDB, so we have stuff like expression rewriting, join ordering, subquery flattening, and filter and projection pushdown, which is a bit more simple, but it's extremely important and makes a huge difference in the cost of the query.
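The three processing models can be contrasted with a toy query, "sum of price times quantity where quantity is greater than two". This is only a structural sketch in pure Python with made-up data: it shows how large the intermediates get in each model, not the CPU cache or SIMD effects, which an interpreted loop cannot demonstrate. The batch size of 1024 is an arbitrary choice for the example.

```python
# Two made-up columns of equal length.
prices = [float(i % 50) for i in range(10_000)]
quantities = [i % 7 for i in range(10_000)]


def tuple_at_a_time(prices, quantities):
    """Process one row at a time: no intermediates, per-row overhead."""
    total = 0.0
    for price, quantity in zip(prices, quantities):
        if quantity > 2:
            total += price * quantity
    return total


def column_at_a_time(prices, quantities):
    """Process whole columns: each operator materializes a full-size result."""
    mask = [q > 2 for q in quantities]                       # full column
    products = [p * q for p, q in zip(prices, quantities)]   # full column
    selected = [x for x, keep in zip(products, mask) if keep]
    return sum(selected), len(products)


def vectorized(prices, quantities, vector_size=1024):
    """Process fixed-size batches: intermediates never exceed one batch."""
    total = 0.0
    largest_intermediate = 0
    for start in range(0, len(prices), vector_size):
        price_batch = prices[start:start + vector_size]
        quantity_batch = quantities[start:start + vector_size]
        products = [p * q for p, q in zip(price_batch, quantity_batch) if q > 2]
        largest_intermediate = max(largest_intermediate, len(price_batch))
        total += sum(products)
    return total, largest_intermediate


row_total = tuple_at_a_time(prices, quantities)
column_total, column_intermediate = column_at_a_time(prices, quantities)
vector_total, vector_intermediate = vectorized(prices, quantities)

print(row_total, column_total, vector_total)
print("largest intermediate:", column_intermediate, "vs", vector_intermediate)
```

All three functions compute the same answer; what changes is how much data each step has to hold at once, which is the property the vectorized model exploits to keep intermediates small enough for the CPU cache.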
So here's an example of a projection pushdown. Say you have a table with five columns, A, B, C, D, E, and you want to run a query. It's pretty small, but the query is like: select the minimum of column A, where there's a filter on column A saying that column A is bigger than zero, and you group by B. So the whole point of this query is that you're only using two columns of the table, right? So what the DuckDB optimizer will do is say, okay, in this scanner, I know I don't need all the columns, I just need A and B, and you just don't have to read the other ones. If you do the same in pandas, for example, you can apply your filter, and then you have the filter, the group by and the aggregate, but at the time you're doing this filter, you're still filtering all the other columns you're not going to be using in your query. Of course, you can manually make this optimization, but it's pretty nice that the database system can do that for you. Of course, DuckDB also has automatic parallelism and beyond-memory execution. DuckDB has parallel versions of most of its operators. I think all of our scanners, including with insertion order preservation, are parallelized now, as are aggregations and joins. pandas, for example, only supports single-threaded execution. We all have pretty good laptops these days, right? So it's a shame if you cannot really take advantage of parallelism. And DuckDB, again, supports the execution of queries over data that does not fit in memory. It's kind of the never give up, never surrender approach. It's like, we're going to execute this query. We also try to always have graceful degradation, so that it doesn't just suddenly crash in performance. And the whole goal is really to never crash and always execute the query. All right, so a little bit about DuckDB in Python land.
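The effect of projection pushdown on the example query (minimum of A, filtered on A greater than zero, grouped by B) can be mimicked with a tiny column-oriented table that records which columns a scan touches. This is a conceptual sketch, not how DuckDB's optimizer is implemented; the `ColumnTable` class, its `scan` method and the sample data are made up for the illustration.

```python
class ColumnTable:
    """Toy column store that remembers which columns were read."""

    def __init__(self, columns):
        self._columns = columns
        self.column_names = list(columns)
        self.columns_read = []

    def scan(self, names):
        """Return only the requested columns and record the access."""
        self.columns_read.extend(names)
        return {name: self._columns[name] for name in names}


def group_min(a_values, b_values):
    """min(a) grouped by b."""
    result = {}
    for a, b in zip(a_values, b_values):
        result[b] = min(a, result.get(b, a))
    return result


def query_with_pushdown(table):
    """Scan only A and B, filter on A, then aggregate."""
    cols = table.scan(["a", "b"])
    keep = [i for i, a in enumerate(cols["a"]) if a > 0]
    return group_min([cols["a"][i] for i in keep], [cols["b"][i] for i in keep])


def query_without_pushdown(table):
    """Scan and filter every column, even though only A and B are used."""
    cols = table.scan(table.column_names)
    keep = [i for i, a in enumerate(cols["a"]) if a > 0]
    filtered = {name: [values[i] for i in keep] for name, values in cols.items()}
    return group_min(filtered["a"], filtered["b"])


def make_table():
    return ColumnTable({
        "a": [3, -1, 5, 2, 0, 7],
        "b": ["x", "x", "y", "y", "x", "y"],
        "c": [10, 20, 30, 40, 50, 60],
        "d": ["p", "q", "r", "s", "t", "u"],
        "e": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    })


pushed, naive = make_table(), make_table()
result_pushed = query_with_pushdown(pushed)
result_naive = query_without_pushdown(naive)

print(result_pushed, pushed.columns_read)
print(result_naive, naive.columns_read)
```

Both plans return the same groups, but the pushed-down plan reads two columns instead of five, which on a wide table is most of the I/O.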
Basically we have an API that is DB-API 2.0 compliant, so pretty much what SQLite has, for example: you can create a connection and start executing queries. But we also wanted to have something similar to the data frame API, so that people who come from pandas, for example, could still have something familiar to work with. So here in this example, you can also create a connection. You can create this relation, which kind of looks like a data frame; you just point it to a table, you can do a show to inspect what is inside the table, and you can apply, for example, these chaining operators, right, like a filter, a projection. So in the end, this is all lazily executed, and this also allows you to take advantage of the optimizer of DuckDB, even if you do the chaining operations. Of course, I talked to you about memory transfer, so we were very careful as well about being very integrated with the very common libraries in Python. So with pandas and PyArrow, for example, what we actually do is this: in the end, for pandas, the columns are usually NumPy arrays, which turn out to be essentially C vectors, which turns out to be also kind of what we use. So with a little bit of makeup in the metadata, we can just directly read them, and they're all in the same process, right? So we have access to that piece of memory, which in the end means that you can actually access the data from pandas in DuckDB without paying any transfer costs, or at least only a constant transfer cost just for doing the metadata makeup, let's say. And it's the same thing with PyArrow. We also have what we call zero copy, so we can read Arrow objects and output Arrow objects without any extra costs. We also support NumPy and SQLAlchemy, and in Ibis we're actually the default backend, I think, since six months ago. A little bit about usage, so as you can see, this is our PyPI download count. The Python library is actually our most downloaded API.
We have APIs for all sorts of languages, and you can see that in the last month, we had like 900,000 downloads, so there are a lot of people out there trying out and using DuckDB in their Python scripts. So now it's demo time. Let me get this. All right, I think you can see this. So this is just installing DuckDB and PySpark and getting our yellow trip data dataset, and this is just importing the stuff we're going to be using. And here is just getting a connection from DuckDB and creating a relation from a Parquet file, since DuckDB can read Parquet files, and then you can just print it to inspect what's in there, right? So if we run this, we can see, okay, these are the columns we have: we have vendor ID, we have pickup date and time, passenger count, and you have the types of the columns. You can also have a little result preview, just to have an idea of what it looks like. I think this dataset has about 20 columns, maybe, and it's just information about the taxi rides in New York in 2016. And then you can also, for example, run a simple query. Here, I'm just doing a count to know how many tuples there are, and we have about 10 million tuples in this dataset.
All right, so this function here is just to do a little bit of benchmarking. Coming from academia, we do have to do something that's kind of fair, I guess, so I run everything five times and take the median time. And then this is actually where we start. So we start off with a data frame, since pandas can also read Parquet files, and the whole thing about DuckDB again is that it's not here as a replacement for pandas, but as something that can work together with pandas. So the cool thing is that we can, again, read and output data frames without any extra cost. So let's say that in the query here, we're just getting the passenger count and the average tip amount of trips that had a short distance, right, and we group by the number of passengers. So what we want to know is, for short trips, does the tip amount vary with the number of passengers in that ride? And what you can see here is that you can, again, read from the data frame, that's what we're doing, and we just have to use the data frame name in our SQL query, and if you call .df() on the query result, you also output a data frame. And it's pretty cool because data frames have plotting capabilities, like bar plots, that DuckDB doesn't have, and you can easily get a very nice chart. So you see here, apparently, there's some dirty data, because people are giving tips when there's no one in the ride, and I'm not sure what that is. But apparently if you have more people, seven to nine, maybe in the more expensive cars, you get a higher tip. And you can do the same thing in pandas, of course, right? In pandas you don't have SQL, so you're going to have to use their own language to do the group by and the average, and you can directly use the plots. And the whole point here is to show the different execution times. Now we're waiting, okay, so pandas took a second, and DuckDB took 0.2, so this is like 5x, right, or 0.25,
so like 4x, and you also have to consider that we're not using a very beefy machine, right, this is a Colab machine. Imagine if you had more cores, this difference would be even bigger. And then I added Spark for fun. Spark can actually also read data frames, but it crashes out of memory on my Colab machine, so I had to give up on this and read directly from Parquet files, but it does output a data frame. I think we're going to have to wait a little bit, but it does its best. So of course Spark is not designed for small data sets, but it turns out there are a lot of use cases where you use these small data sets. As you can see, it's warming up a little bit, it's good for the winter, it produces some energy, I think. All right, okay, so it took two seconds, 2.2 seconds, for the actual execution, and that's already, what, more than two times what pandas took. So, yeah, anyway, for the demo, of course, I showed you something that's fairly simple, but you can actually do very complicated things, maybe not very complicated, but more complicated. So here I'm not really going to go over the query, but the whole idea is that we can, for example, use DuckDB to run a linear regression. So can we predict, can we estimate the fare from the trip distance? And it turns out you can calculate the alpha and beta with not such a crazy query, and then you can again export the result to pandas, and you have a very nice figure there. So you can really combine these two to get the best out of both. All right, that was the demo. Summary. Oh, that's my last slide, good. So yeah, DuckDB is an embedded database system. Again, it's completely open source, it's under the MIT license. Since it came from academia, this is something that we always care about, giving it back to everyone, because it was funded by taxpayers' money. So everyone can use it, 100% of what we do is actually open source, there's nothing that's closed source, it's designed for analytical queries,
so data analysis, data science. It has bindings for many languages. Of course I'm in the Python dev room, so I'm talking about Python, but we have R and Java, and it turns out that Java is one of our most downloaded APIs, so I guess that's an interesting sign, as well as JavaScript and a bunch of others. It has very tight integration with the Python ecosystem; again, the whole idea is that you eliminate transfer costs. It implements the DB-API and the relational API, and the relational API again is the more data-frame-like one. And it has full SQL support, so anything you can imagine, like window functions or whatnot, you can just express using DuckDB. And that's it, thank you very much for paying attention, happy to answer questions. Thank you, Pedro. So we have five minutes for your questions. Thanks for the great presentation. You mentioned beyond-memory execution, and that it tries not to degrade as much. Can you shine a little bit more light on what happens under the hood, and how much degradation happens? Of course. I think at the moment there's only the ordering operator that actually does that. We have Laurens, who's doing his PhD, so there are a lot of operators that still need research to be developed. That's more of a goal than something that actually happens now, but the whole goal is that you really don't have this sudden spike. There's research going on, and in the future there will be more to be shared for sure. Thank you very much for the talk, and it's very exciting to see such a powerful tool. I usually work with data warehouses, and I saw on the website that you do not recommend using this with data warehouses. I would like to know why.
So of course, there's no one solution for all problems. There are cases where the warehouses are very good fits. It turns out that for data science, for example, which is kind of what we preach the most, they're usually not good, because then you fall back to sending your data outside your database system. You're not really going to be running your Python code inside the system. You can do that with UDFs, for example, but they are messy, they're a bit nasty, so you really want to have it embedded in your Python process, so you completely eliminate data transfer costs. Because usually what you do is like, okay, I have a table with 10 columns, I'm going over 4 columns, but I'm really reading huge chunks of it, so that's a bottleneck we try to eliminate. How do you handle updates? Although we are an analytical database system, we do do updates. So Mark, I don't know where he is, but he's there, he developed an MVCC algorithm for OLAP, so we have the same ACID transactional capabilities that you would expect from a transactional database. Of course, if you have a transactional workload, you should still go for Postgres or SQLite or a database that handles this type of transaction, but Mark developed a full-on algorithm to handle updates completely, yeah. How do you compare to Vertica? How do we compare to Vertica? That's a good question. I think in terms of analytical queries, TPC-H, probably similar performance, but then again, the whole point is that if you go again to the Python process, the data transfer costs will take most of the time there, and DuckDB is really catered for this type of scenario, the embedded scenario. We have one minute left for one more question. Yeah, I actually have a repo somewhere with a bunch of examples as well, I'm very happy to share it. I don't know where I'll post it. Ah, the FOSDEM thing, I guess. All right. All right. Thank you a lot, Pedro. Thanks a lot. Thank you very much. Thank you very much. |
Continuous Documentation for Your Code |
So, hello everyone. My name is Anastasia, and I can say a few words about myself before we start today's talk. I came here from Berlin. I have lived in Berlin, Germany, for seven years already, but I'm originally from Ukraine, and I work as an associate director of engineering at SoundWide in Berlin. Also, I'm one of the organizers of the PyBerlin Meetup, and I have 11 years of experience in software development, seven of them in Python. You can see how happy I am with Python, and I will share my long road to programming, actually with documentation, and you will learn from me what I learned through all the years of software development. Let's start today's talk with a question. Do you document your code? So, raise a hand if you do write documentation. Wow, a lot. Nice. Okay. That's very nice. So, around 10 years ago, when I just started, I remember myself starting with a programming language. I took a programming book, and I just went by the book. I wrote a Hello World program. It was very nice. It looked so perfect. I did my first pull request into our code and I thought it was perfect. It looked so great. It was approved, but it didn't feel right. I didn't even know what to check, it just didn't feel right, and I was not sure whether this code would still be alive in some years, or maybe in six months somebody would just change the code and then it would not be good anymore. And even when I passed the code review and merged the code, I was still not sure what to check in the code and when to consider the code good enough. And of course, I didn't know about documentation. I was not told that I should write documentation in my code. And today, we're going to look at the 10 years after I started. So, after all of this time, I realized that experience comes together with confidence. You are more confident in the code that you're writing because you're writing a lot, and then you learn from other people. 
You go to conferences. You listen to all the talks, all the experiences. You talk to other developers in the community. If you're an open source developer and there is no documentation, it's a bit tricky. So, what if me from 10 years ago could travel into the future, listen to this talk, and know what to do from the very start? I would maybe be happier in my life. I would not have made as many mistakes as I did through all the years. And that's why we're here. I'm going to show you things that I learned over the years, and we are going to have a bit of magic happening today, following the 10 years into the future. So, let the future begin now. Let's start with the story of one little sad code which was lost in its lifetime, and no one wanted to play with this code. This is the code that I'm talking about. The code was just sitting there, very sad, very little, wondering: why am I so sad, why is no one talking to me? It had no clue how to deal with other services or other developers and had so many questions in its lifetime. And only the sad cat was sitting there and supporting that sad code. So, the sad code was actually wondering what was wrong with it and how to change to find new friends. Let's take a look at the code. So, the first function is pretty clear. It's just the hello world, the basic stuff. But if you look at the second one, this is the sad part. There is a function with some arguments, returning something, but the outcome of the function is not very clear: what does it do, what is its purpose? Also, the names of the parameters were lost at some point. So, you remember the story: there was a sad code, a very sad code, and just the cat sitting next to the sad code. And the code went to sleep. And something truly magical happened. It met someone. It met its future self, from right 10 years after. 
And the future self said, I will give you four pieces of advice which will help you to improve your communication skills. And in the end, I will give you a riddle to solve. So, follow my advice to reclaim the ancient knowledge of programming superpowers and you will shine for many, many years. So, let's start. Then listen carefully. Yes, the sad code said, please continue. So, here is my first piece of advice to you. This is how to use the goal-oriented approach to show the world how to solve a problem, if there are any in the future. This approach is about writing how-to guides. Those are basically directions for the reader, and you can read a bit more in the links which I listed over there. This is the most used kind of guide, the most searched and the most read of basically everything written about the code. It is like a recipe, or directions for solving a very specific problem. So, if you are trying to write a how-to guide, it should be about something very specific, not about everything. Something like how to create a web server would not be a how-to guide, but how to create one for a very specific reason would be. And those are actually the very specific guidelines that we are usually looking for, because we all have specific issues, and we search for a specific piece of information as the solution to a specific problem. So, with the first piece of advice, the sad code was thinking, okay, I can improve by just writing a simple setup guide, how to set myself up, to be open for others. So, that's very specific: what to install, where to install it. And it added a README file with that guide to the code. And then, after following that advice, the sad code noticed that it got a friend. Can you imagine? A friend, a real friend. How do I know that? Well, just look at that. There is a star shining. Well, that was just the start, the first one. 
So, the second piece of advice showed the sad code how to use the learning-oriented approach to show the world what this code actually does. Those are the tutorials. They are a bit different from the how-to guides in that this is more like learning by doing. You are not reading all the documentation. You are just doing an exercise, like a teacher showing you in school what to do to learn a specific subject. So, if we are talking about code, writing a tutorial means helping the reader achieve some goal by doing, but it doesn't have to be about solving a very specific problem. So, following the advice, the sad code was thinking, okay, maybe I can add a bit more. Let me write a tutorial about the basic things: the setup, how to write it, how to build it. And then something happened again. There was another friend. Look at that. The second star is shining. Already good. Good path. The third piece of advice showed the sad code how to use the understanding-oriented approach to explain more about what the code is doing. Those are the explanations, discussions of very specific topics which broaden the main subject. So, if we already have the basic documentation, the how-to guides and the learning-by-doing exercises, the explanations are very, very specific. If you compare it to cooking, this is not a recipe book. This is more of a reference on the different ingredients: how to get an ingredient, for example, or which ingredients are particular to a very specific dish. This is everything that could tell the sad little code how to deal with different packages or different services, how to integrate with others. And, of course, the sad code did implement something like this. It wrote a bit more explanation on why we need documentation in general. And, again, it was very helpful, very happy. So, the sad code was thinking, okay, more friends are coming. 
That's great. Let's go to the last piece of advice. So, the last but not least piece of advice was about the information-oriented approach, about references. In software development, reference guides usually describe the functions, their inputs, outputs, arguments and results, the key classes, the different APIs. So, everything that is about the code, about its different attributes. It doesn't have to be every single function, but it should cover all of the main key classes and key functions that are used. Then, afterwards, 10 years later, you can recap and look at what was done before. And, just like that, the sad code followed the fourth piece of advice and decided to write a code reference, added it, and also uploaded it somewhere for others to see. And, after those four wonderful pieces of advice from the future self, something really magical happened. The code woke up, not very sad anymore, not mad, and it felt very comfortable to go out and talk to others, to connect to other developers. And then, after some time, the sad code remembered that there was a riddle. The future self said there would be a riddle, and then you will gain the superpowers, right? Where is the riddle? So, here it is. I'm someone who can teach you a lesson, but not a teacher. I'm someone who can guide you to a goal, but not a tour guide. I'm someone who can tell you everything about the technical aspects of your functions, but not an encyclopedia. I'm someone who can explain a particular topic to help you understand, but not Google. Can you guess what that is? Okay. So, the code was very happy because it knew the answer already and asked the future self: so, well, look, was it all about you? All about you, my future. So, you are documentation. So, you are really my future. And the future self said, yes, those four pieces of advice were all about me. I'm your future. I'm your future: fantastic documentation with tutorials, how-to guides, explanations, and references. 
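As an illustration of the reference-style documentation described here (inputs, outputs, arguments, results), the following is a minimal sketch of a documented function. The function `shipping_cost` and its pricing formula are invented for this example and do not come from the talk; the point is only the shape of the docstring, which tools such as Sphinx or MkDocs can later collect into a reference guide.

```python
import inspect


def shipping_cost(weight_kg: float, distance_km: float, express: bool = False) -> float:
    """Return the shipping cost in euros for a single parcel.

    Args:
        weight_kg: Parcel weight in kilograms. Must be greater than zero.
        distance_km: Delivery distance in kilometres. Must not be negative.
        express: If True, apply a 50% surcharge for express delivery.

    Returns:
        The total cost in euros.

    Raises:
        ValueError: If the weight or the distance is out of range.
    """
    if weight_kg <= 0 or distance_km < 0:
        raise ValueError("weight must be positive and distance non-negative")
    # Hypothetical pricing: base fee + per-kilogram fee + per-kilometre fee.
    cost = 2.0 + 0.5 * weight_kg + 0.01 * distance_km
    return cost * 1.5 if express else cost


# The docstring is available at runtime, which is what documentation
# generators and the built-in help() rely on.
print(inspect.getdoc(shipping_cost))
```

Compared with an undocumented function that takes unnamed arguments and returns "something", a reader can now tell what each parameter means, what comes back, and what can go wrong without opening the implementation.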
And, in other words, we all need to understand that documentation is not a single piece. It's more than that. It consists of four different types of documentation, and it's very important that we add all of them, or at least have some sections for each, so that our future selves are not in trouble. So, just to recap, the future of that little code is now about code references, tutorials, explanations and how-to guides, and that would make your code happy, and the cat, of course. And it would get many friends. Especially in open source, it's very important to add documentation and to explain what the code and all its functions are actually doing. And that would be a really bright future. So, I have another question for all of you. Would you document your code after all you have heard from the future? If this example didn't convince you, I have a few more pieces of advice from myself, from my own experience, on why we need to document code. Because people forget things. Even if you wrote the code, you will forget about it in a week, in a month, in a day, or maybe something happens, and then you return to the code and you don't remember why it was written that way. People leave the code alone, and this is also important. If you are working on open source, you can give it to somebody else, you leave it alone, then you come back, you need to review, and then, oh my God, I don't remember that function. Then, new people come to contribute. This is also very important. They need to know how to contribute, what to contribute, what is important for this project. So, how to start? First, you need to choose the main source tool for documentation, and then, of course, you need to make sure that it's up to date. So, continuous documentation is a culture. It's not a thing that you can force people into. You need to make it a culture for yourself and also for other people. So, I would recommend you read a few things. 
This was the original document where those four types of documentation are actually described. I recommend you go through it and read more about the different types of documentation, then also about documentation for legacy code, and Real Python has a very nice tutorial about documentation with MkDocs and also GitHub Pages. So, how to start? You need to start as simply as possible. This is the best solution. Just start with some basics. Then, you can go to version-controlled documentation, and that will be like the next level. So, you have Python code, you need to document something. What do you do first? I would say you start documenting. For that, you add some docstrings to your code, and then you take them and make a reference guide, which is already nice to have. It's out of the box. It can be auto-generated and hosted somewhere, so beautiful and easy. And for that, you can use Sphinx or MkDocs to prepare the documentation, and also add some other files. And, of course, you can try to host it somewhere like Read the Docs, which is free for open source projects, or GitHub Pages. Well, there are other tools, but those are the ones I used. And, of course, add more documentation. So, if you want to see those pieces of documentation that I showed, here is the link where they are. And you can also see here how Read the Docs would look if you used it for a simple project. And this is how the Sphinx documentation that I showed you with the sad code would look. And then just to remind you that you cannot force people to read documentation. You can force them only to write documentation. So, if you want to make sure that your documentation is up to date, you need to make it a culture for yourself, start somewhere, start writing documentation, and then it will somehow become a habit. So, thank you so much. This is all. You can all join the PyBerlin Meetup in Berlin. You're all welcome to come. 
We have meetings every month. And also, you can read some more articles about documentation over there. Thank you, Anastasia. We have five minutes for questions. If you have questions, just raise your hand. Yeah, I'll pass the microphone so we can record the questions. Can you maybe pass it over? Thanks for the talk. Did you experience an increase in your documentation quality when using such tools? I mean, the overall problem is that you actually need to write documentation and you need an incentive for it, right? Did you experience that using the tools that you presented helps? Well, it did help. So, we enforced a certain documentation level. You can also see the tools that I used in the CI of the project that I shared. They did help to force people to write it. But actually, it was as soon as we made this a must-have for every code review that it helped. So, in every code review, every developer was looking: okay, you didn't write this documentation, this is important. And then they would ask, can you please write a few lines about this? This is important to remember. Then it did work. Thank you for the talk. I wanted to ask about your opinion on using ChatGPT to maybe ease the work of documentation and save your time, if you've got any opinion on it. Because from my experimenting, it can document code quite decently. So, maybe it could even be used for the tutorial-based approach that you mentioned. Oh, actually, I didn't try it. So, I can try and let you know next time. Thanks. Any other question? Any other question? Hello. Thank you for your talk. Do you use reStructuredText or Markdown? Do you think Python coders should use one or the other? Does it matter? Can you talk a bit louder? RST or MD? Which do you like? Well, it doesn't matter if you don't write documentation. So, it's important to actually write. And it's actually up to the developers. Sometimes they like one, sometimes the other. And I don't mind either. 
I just want to make sure that there is up-to-date documentation. Do you actually have tools to help you keep documentation up to date? Oh, that's a tricky question. Well, how to keep documentation up to date is more of a human factor. You can test examples. You can even run the examples to see if they work. So, you can automate that. You can check how many lines of documentation are in every pull request. So, this is automated. But it doesn't automate the things which are outside of the code. You cannot really test those. If there is a piece of code, you can go through it, you can try running it. So, unfortunately, for the things outside of the very specific code, this is, yeah, basically not possible. Thank you. Okay, we have two more minutes for questions. I see a question in the back. Perfect. I needed to do some sport today. Thank you for your talk. Is writing documentation something that you do alone, or something that you collaborate on with someone else? And if so, are there non-technical people who collaborate with you in the moment of writing? Or how can you be sure to also address people with different levels of technical knowledge? Thank you. Thank you for the question. Actually, those are different types of documentation. There is technical documentation which could be written by a technical writer. But the documentation that my team is writing, for example, should be written by the developer who is writing the code. So, you're a developer, you wrote a piece of code, a function, something: you have to document it. And this is actually written in our ways of working for the team. It is part of the definition of done for features. So, as soon as a feature is done, it has to be documented. Unless it's documented, we don't close the ticket. So, it has to happen. And this is for everyone in the team. There's no specific person; the developer who did the feature does it. Do we have any other question? Okay. 
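The answer above about testing examples can be sketched with Python's standard `doctest` module, which runs the examples embedded in a docstring and reports any whose output no longer matches. This is one possible way to automate it, not necessarily the tooling the speaker used; the `slugify` function is a made-up example.

```python
import doctest


def slugify(title: str) -> str:
    """Turn a page title into a URL-friendly slug.

    >>> slugify("Continuous Documentation for Your Code")
    'continuous-documentation-for-your-code'
    >>> slugify("  Hello,   World!  ")
    'hello-world'
    """
    # Replace every non-alphanumeric character with a space, then join the words.
    words = "".join(ch if ch.isalnum() else " " for ch in title.lower()).split()
    return "-".join(words)


# Parse the examples out of the docstring and run them against the real function.
parser = doctest.DocTestParser()
test = parser.get_doctest(slugify.__doc__, {"slugify": slugify}, "slugify", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(f"{results.attempted} examples run, {results.failed} failed")
```

Running a check like this in CI means a documented example fails the build as soon as the code's behaviour drifts away from it. As the answer notes, this only covers documentation that contains runnable code; prose outside the code still needs human review.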
Regarding ChatGPT, I played a bit with it. I have the feeling that it's a great tool for beginners to learn. But if ChatGPT can document it, then why should you? Shouldn't you write down what ChatGPT cannot guess from the code? All the business logic, the reasons you did it. It's just a small thought. Thanks a lot, Anastasia. Thanks again. Thank you. |
Talk to DBus from a Python application
An introduction to the dasbus library |
And it's time for our next talk. I would like you to welcome Vendula. The stage is yours. Yes, hopefully. Oh, I can't talk. Okay. So, hello. I'm Vendy. I'm a software engineer. I have worked at Red Hat for six years now, and I'm part of the system installer team. And today, welcome to my talk about communicating with D-Bus from a Python application. So, first of all, I would like to clarify that I'm in no way an expert on D-Bus. I'm just a very lazy programmer who wrote a library to make her job easier. So, what is D-Bus? It's short for Desktop Bus. Basically, it's a system for inter-process communication. It consists of two parts, the protocol and the bus daemon. On a typical Linux distribution, you can usually find two bus daemons, the session bus and the system bus. So, for example, this is a screenshot from my laptop. This is a visual representation of the system buses and services that you can find. For demonstration purposes, I've created the example chat service. You can see on the right side that this service provides four objects, and these objects implement some interfaces. So, how do we talk to this thing from Python? We will use the dasbus library for that. It's a library that I wrote some years ago. Basically, it's an abstraction level above bindings for the D-Bus library. Okay, so let's jump in. Let's start with the client part. So, we know that there is a D-Bus service that we want to talk to, and how do we do it from a Python application? First of all, we need to establish a connection to the message bus. In this case, we know it's the session bus, so we will use the session message bus. The other thing that we have to create is a proxy of the remote object we want to talk to. For that, we need to know two things. The first one is the name of the D-Bus service. The second one is the object path of the remote object. 
After that, we can use the proxy like any other Python object, so we can get and set properties. We can call methods. And another thing we can do is watch D-Bus signals. A D-Bus signal is something you can connect to. You create a callback, connect this callback to the signal, and every time the service emits this signal, your callback will be called. So, this is how you do it. This room proxy has one signal called the message received signal, and you can connect a callback that will just print the messages that you receive. And that's it. This requires one more step, and that's to run an event loop. It's a little complicated to explain. Basically, it's just a black box that runs forever and handles any asynchronous events that can come up, like the emission of a signal. So, the thing runs forever unless you stop the loop or kill the application. Yeah, so, let's actually do a demonstration. I think demonstrations are doomed to fail, but let's try. So, here I need to start my server. Let's not dive into that yet. I just need to have it running. And we can actually check it in D-Feet. On the session bus, you can look for my example chat, and you can see what you saw on the slides. Sorry, it's so small. And there's my room three, and there are some interfaces, and this is the interface I'm interested in. So, let's do some stuff with it. First, we will ask for the name of the room, and that's it. So, here I can just run the first one, and it just prints three, because we asked room number three for its name. The second thing that we can do is send a message to the room. So, this is number two. This doesn't print anything, but as you can see here, the server received a message for this room, and it printed it. 
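The client steps described so far (connect to the session bus, create a proxy, read a property, call a method, connect a callback to a signal, run the event loop) might look roughly like the sketch below with dasbus. The service name, object path and the member names `Name`, `SendMessage` and `MessageReceived` are assumptions modelled on the talk's description of the example chat; the slides are not part of this transcript. The dasbus imports are placed inside `main()` only so that the snippet can be loaded on a machine without dasbus and PyGObject installed.

```python
# Assumed identifiers for the example chat service (not confirmed by the transcript).
SERVICE_NAME = "org.example.Chat"
ROOM_PATH = "/org/example/Chat/Rooms/3"


def on_message(message):
    """Callback that is run every time the room emits its signal."""
    print("Received message:", message)


def main():
    # Third-party imports: dasbus (which itself needs PyGObject).
    from dasbus.connection import SessionMessageBus
    from dasbus.loop import EventLoop

    # 1. Establish a connection to the session message bus.
    bus = SessionMessageBus()

    # 2. Create a proxy from the service name and the object path.
    room = bus.get_proxy(SERVICE_NAME, ROOM_PATH)

    # 3. Use the proxy like any other Python object.
    print(room.Name)                        # read a D-Bus property
    room.SendMessage("Hello from Python!")  # call a D-Bus method

    # 4. Connect a callback to a D-Bus signal.
    room.MessageReceived.connect(on_message)

    # 5. Run the event loop; it blocks until the loop is stopped
    #    or the application is killed.
    EventLoop().run()


# Call main() once the example chat service is running on the session bus.
```

Without step 5 the property read and the method call would still work, but the callback would never fire, because nothing would be processing the incoming signal.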
So, if I call it several times, it prints this stuff, and if I change it to something else, I can actually talk to another object, and it will just send a different message to a different room. Yeah, it's a very primitive chat. Don't try to find something clever about it. And the last thing I want to show you is the signal processing. So, this is how it looks. The callback will just print the received message with some additional stuff, and this is where I connect the callback to the signal, and then I just start the event loop. So, for that, I actually have to... Yeah, so this thing is listening. It's waiting in the event loop for any events. So, if I, again, send some messages to the... Oh, and something... Yeah, because I changed it. Yeah, that's why you shouldn't change your code. Okay, it should work now. Yes. Okay, so because it's listening on room 3, if I send a message to room 3, it's printed here as well, and the service behind it is still running and receiving the messages. So, okay. That was the client side of things. Let's be the service. How do we do this? First of all, we need to register the D-Bus name of the service. Basically, you announce the name of your process, so other applications can find you, reach you and talk to you. The other thing that you have to do is publish some objects, so other applications can actually use the API that you provide and do some stuff with your service. And the last thing, again, is to start the event loop. The event loop handles the incoming D-Bus calls, calls the relevant handlers and callbacks, and sends the return values to the callers. So, the last part that is missing is how we create this D-Bus object. Every D-Bus object needs to provide something called the XML specification. 
This is a declaration of all the interfaces, methods, properties and signals that this D-Bus object implements and that you can call. And when I saw this for the first time, I thought, oh, my God, my colleagues will make typos in these things, and we will have so many bugs. So, the first mission I had was actually to get rid of this and generate it automatically from the code, because I knew that we were going to make a lot of changes all the time, because it was a huge project, and there was no way we would keep this in sync with the code. So, let's look at how it can be done with dasbus. All you have to do is use the D-Bus interface decorator and provide the name of your interface. This decorator looks at the members of the decorated class, and for every member, it generates a piece of the specification, and it creates the whole specification. Sometimes it collects more interfaces, so it's certainly more complicated, but in the end, you don't have to write any XML, but you can access it and use it to publish your object. So, we will start with this decorator, and then you can just define a D-Bus method. This is the definition and at the same time the implementation of the method, so you can see it prints the message. One thing that you have to do is provide type hints for the arguments and the return values, because D-Bus needs to know the types of the arguments. And another thing, yeah, everything is in camel case. I'm sorry about that. It's just a standard for D-Bus, and it didn't make a lot of sense to try some mapping from traditional Python naming to this camel case. So, it's easier to just use camel case. So, this is the method, and this is how to define a D-Bus property. It's just a Python property with a type hint again. And last but not least, for a D-Bus signal, you need to use a special decorator. 
And if this signal emits some additional arguments, you need to specify them as arguments of the method, kind of, yeah. This method is never called, it's just used for the definition of the signal. So, that's it. Let's have a look at how it looks when you put it all together. So, this is the class Room. There is the decorator, and as you can see, there are just all the definitions and the implementation that this class needs, and the XML is generated by the decorator. You don't need to care about anything. The chat also has a D-Bus interface, but it doesn't contain anything. We can see later how the XML looks. Here, you create a connection to the message bus, you register the name of the service, you publish some objects, and you start the event loop, and that's all. In this case, it's good to disconnect from the bus, because that will unregister everything that you did here. There's an open pull request to use the session bus as a context manager because it's nicer, but it's not merged yet. So, that's a good thing to do. And here we can see that when the server was started, it printed the interfaces that were generated. The first one is empty, there's nothing there, but the one for the room contains everything that was inside of that class. So, yeah, this was completely generated. You don't have to care about it. You don't have to figure out what type S is. In this case, it's simple, but sometimes it gets a little more... not so pretty. So, yeah, you don't have to care about it, which is, I think, great. Features. So, dasbus has a lot of features because the project I was working on was big, so we wanted to do as many simplifications as we could. So, one thing that we did... Okay, I want to mention one thing. I decided I will focus on the end result of these features, because it's a little difficult to explain the definitions and all the steps you need to do before that. 
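To make the idea of generating the XML specification from a decorated class more concrete, here is a toy sketch written with the standard library only. It is not dasbus's implementation and covers only methods with simple types; `toy_interface`, `SIGNATURES` and the `CountWords` method are names invented for this illustration. It shows the mechanism described above: the decorator walks the class members and reads their type hints to emit one XML element per method and argument.

```python
import inspect
import typing
import xml.etree.ElementTree as ET

# Toy mapping from Python types to D-Bus type signatures ("s" is a string).
SIGNATURES = {str: "s", int: "i", bool: "b", float: "d"}


def toy_interface(interface_name):
    """Class decorator that stores generated introspection XML in `xml_spec`."""
    def decorate(cls):
        interface = ET.Element("interface", name=interface_name)
        for member_name, member in vars(cls).items():
            # Only public plain functions are treated as D-Bus methods here.
            if member_name.startswith("_") or not inspect.isfunction(member):
                continue
            method = ET.SubElement(interface, "method", name=member_name)
            for arg_name, arg_type in typing.get_type_hints(member).items():
                is_return = arg_name == "return"
                ET.SubElement(
                    method, "arg",
                    name="result" if is_return else arg_name,
                    type=SIGNATURES[arg_type],
                    direction="out" if is_return else "in",
                )
        cls.xml_spec = ET.tostring(interface, encoding="unicode")
        return cls
    return decorate


@toy_interface("org.example.Chat.Room")
class Room:
    def SendMessage(self, msg: str):
        print("Received:", msg)

    def CountWords(self, text: str) -> int:
        return len(text.split())


print(Room.xml_spec)
```

Because the specification is derived from the same definitions that implement the methods, renaming a method or changing a type hint changes the XML automatically, which is the "keep it in sync with the code" property the talk is after.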
So, I will just show you what you will end up with in your code. The first thing I want to mention is the management of D-Bus names and paths, because you could see that there are a lot of strings you need to handle. And it's very easy to make typos in these again. So, I can... Okay, yeah. So, basically, dasbus provides a system that allows you to define namespaces and the objects that are published in these namespaces. In the end, you will have these very simple objects called chat and Room 3, and you can just use these objects to create proxies, or ask for the interface name, or ask for the service name, or get the object path. And you don't have to care about the strings behind them because they are created from what you defined earlier. Yeah, I can get to that later if we have time. So, another thing that dasbus provides is management of a group of publishable objects. Let's say that the chat is not static. It doesn't have only three rooms; you can ask the chat to create a new room, and you want to get the D-Bus path of that room so you can connect to it. So, yeah, you can implement it manually on your own and make sure that every room has a unique path, and if someone wants to do something with that room, figure out again which room it was. Or you can just use the room container. It's very easy. You just provide the namespace that the container can use, and you specify the message bus that will be used to publish these objects. The whole purpose of this container is to give it a Python object and get a D-Bus path. And it works the same way backwards: if you receive an object path, you can get a room. So, you can deal with this mapping very early, and in your code, you only have the objects. You don't have to care about the D-Bus implementation behind it. So, yeah, it's a little difficult to explain, but it can simplify your life a lot. 
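The container idea (hand in a Python object, get back a unique object path, and the reverse) can be illustrated with a small stand-alone sketch. This is plain Python for illustration only: `ToyContainer` is a hypothetical class, it does not publish anything on a bus, and its method names describe the idea rather than documenting the library's actual API.

```python
import itertools


class Room:
    """A stand-in for a publishable chat room."""

    def __init__(self, name):
        self.name = name


class ToyContainer:
    """Maps Python objects to unique object paths and back."""

    def __init__(self, namespace):
        # ("org", "example", "Chat", "Rooms") -> "/org/example/Chat/Rooms"
        self._base_path = "/" + "/".join(namespace)
        self._counter = itertools.count(1)
        self._paths = {}    # id(object) -> object path
        self._objects = {}  # object path -> object (also keeps the object alive)

    def to_object_path(self, obj):
        """Return the path of the object, assigning a new one on first use."""
        key = id(obj)
        if key not in self._paths:
            path = f"{self._base_path}/{next(self._counter)}"
            self._paths[key] = path
            self._objects[path] = obj
        return self._paths[key]

    def from_object_path(self, path):
        """Return the object that was assigned the given path."""
        return self._objects[path]


container = ToyContainer(("org", "example", "Chat", "Rooms"))
lobby = Room("lobby")
path = container.to_object_path(lobby)
print(path, container.from_object_path(path).name)
```

With the mapping kept in one place, the rest of the service code can pass `Room` objects around and convert to or from paths only at the D-Bus boundary.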
Yeah, another thing I want to talk about is how to handle D-Bus errors. So, dasbus raises a default exception by default, but sometimes you want to handle a specific D-Bus error, or maybe you want to raise a specific D-Bus error from your service. It's a very easy thing to do. There's a special decorator that you can use for your exceptions. In the decorator, you specify the D-Bus name of this error. And that's all you have to do to be able to use this exception in your code. So, once you decorate it, you can raise it in a service and you can catch it in the client, and you don't have to care about the magic in between. Yeah, that's also a cool thing to do. D-Bus structures. So, this is very... Yeah, this is a funny thing. D-Bus doesn't have native support for structures. So, what everyone does is send dictionaries that map attribute names to attribute values. And since these values can be of different types, you have to wrap them inside variants. A variant is a special data type that basically couples the data and the type together. So, when you send it from your service, the client is able to interpret the data even though it didn't know the type of this data before. This is a pretty horrible thing to work with, especially when you need to receive the structure, change something, and send it somewhere else. Because creating these variants is not easy, and variants are not changeable, so you always have to create new ones, and yeah, it's not pretty. So, with dasbus, you can actually describe the structure using data classes, and these classes just have some properties. And there's a lot of automation that allows you to basically give it the dictionary. It will transform the dictionary into a Python object, then you can just play with the Python object, and when you need the structure again, you just call to_structure to get the structure that you can send over D-Bus. Yeah, so that's another thing you can do. 
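The round trip described for structures (dictionary of variants in, Python object to work with, dictionary of variants out) can be sketched with standard-library dataclasses. This is an illustration of the concept, not dasbus code: a `(signature, value)` tuple stands in for a real variant, and `UserData`, `to_structure` and `from_structure` here are hypothetical helpers written for this example.

```python
import typing
from dataclasses import dataclass, fields

# Toy mapping from Python types to D-Bus type signatures.
SIGNATURES = {str: "s", int: "i", bool: "b"}


@dataclass
class UserData:
    name: str = ""
    age: int = 0


def to_structure(data):
    """Dataclass -> {attribute name: (type signature, value)}."""
    hints = typing.get_type_hints(type(data))
    return {
        field.name: (SIGNATURES[hints[field.name]], getattr(data, field.name))
        for field in fields(data)
    }


def from_structure(cls, structure):
    """{attribute name: (type signature, value)} -> dataclass instance."""
    return cls(**{name: value for name, (_signature, value) in structure.items()})


# Receive a structure, change something, and produce a structure again.
received = {"name": ("s", "Alice"), "age": ("i", 30)}
user = from_structure(UserData, received)
user.age += 1
print(to_structure(user))
```

The awkward part, building a fresh immutable variant for every changed value, is confined to the two conversion functions, while the code in between works with an ordinary mutable object.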
Lastly, this is a new feature that I was working on with some people last year. I would like to use this moment to thank everyone who was involved in this, because it was a bigger issue than I expected. Basically, you can send Unix file descriptors through D-Bus. It works only on Unix systems, obviously, and it's an optional feature that is disabled by default, because there's some overhead and I didn't want it to slow down everyone's services. So all you have to do is, when you create the proxy, specify a slightly different client library, or the server library that will be used to process the incoming calls or requests, and it basically means that if... Yeah, it's a little complicated because D-Bus has special support for Unix file descriptors, but it's very messy when you have to deal with it specifically, and with this extension, you don't have to care about it. Basically, if you want to send a file descriptor, you just send a file descriptor and receive a file descriptor on the other side. Yeah, so these features can be very hard to understand, and I get it. So I want to mention that they are optional. They're not something that you have to use if you want to use dasbus, and I would suggest that if you don't need them, don't use them. Keep it simple, do whatever is the easiest, because there can be a lot of additional code that can be hard to understand. Another thing I want to mention is that I acknowledge that every project is a little different and has very different needs. Sometimes you can make a lot of assumptions about your service, and you might not need half of the D-Bus support that there is, so you can simplify some stuff a lot, and that's great, but that's often not generic enough to be implemented in a library like dasbus. 
So what you can do, actually, is to take any piece of dasbus, reimplement it to fit your needs, and propagate it to the right places, so dasbus will use it instead of the original implementation. I want to mention this because we were in this position at the beginning of our project, and we had a lot of trouble with the library that we used back then, and basically we had to patch the whole library because we were not able to get the stuff that we needed, and it wasn't easy to just change it, so we had to patch it. So this is the link to the library. There's an open discussion section, there's an issue tracker, so if you have any suggestions or questions, you can find me there, reach me there, don't hesitate to ask. There are some examples. You should be able to find examples there that are similar to the one that I showed you. I think I will also post the demo stuff there because it's easier to understand. There's also documentation that might help you because maybe this talk didn't help you so much. So that's all from me, thank you so much for coming. Does anyone have any questions? APPLAUSE Hi, I just wanted to ask, where do you find people are using this? Is it in chat message applications, or what are the applications of this for most people? What are the applications for providing the D-Bus API that you can... Okay, so on a Linux system, it's basically... Any... like there are printers, or you can control the media player, or you can set up your firewall, or on the session bus. I think this chat actually has their own D-Bus API. So it's more like the applications that are running on your desktop often provide this D-Bus API, so you can write some scripts to tweak them and to control them. So, yeah, that's it. Does dasbus support properties and annotations? Properties, yes, annotations. Do you mean like properties changed annotations, or stuff like that? Like emits changed signals and deprecated and stuff like that. I'm not sure about it.
I think it's not like needed, it's just like a recommendation for the documentation, but it's not something that... Otherwise, the client can't see that the server may support an API call, but it's marked as deprecated. Okay, yeah, I think it doesn't support custom annotations, but that's definitely something I can look at if it's like... Just wanted to know if you can add them, that's fine. Yeah, yeah, so I don't think that this is supported, but it's definitely supportable. Hi, why would someone want to use D-Bus... I mean, dasbus versus some other D-Bus library that's out there for Python? Yeah, okay, so this library was actually inspired by pydbus, which is also very popular. It's just that we hit some issues, and it's complicated, like you need to think a lot for us. So at some point, I just got so frustrated, I decided to rewrite it and create dasbus, but yeah, there are a lot of interesting libraries, and sometimes they are a little simpler, and it might be enough, so you don't have to use this one. This is much easier if you... If you have a lot of D-Bus API, because with our project, there are several D-Bus services, and it has a lot of objects, a lot of interfaces, and it would be very difficult to deal with it with a library that operates on a lower level. So we needed a lot of abstraction to make sure that the code is okay. We have time for one last question. Who was first? Maybe two questions. Hello. Thank you for the library. I've been trying it and it's great. I wanted to ask a question regarding the event loop. Lately, I've been doing some work with D-Bus, and I find it very painful that most libraries rely on the GLib main loop, rather than the default event loop coming from asyncio. I saw that in the code base, there is an abstract event loop that could become something else, but do you have any plans about that?
Yeah, so right now, I don't know about the demand for additional support for other event loops, like backends, but the code is implemented in a way that it should be possible to do it. So if there is enough people who would be interested in this, that's definitely something I would like to look at. It's just, yeah, it's no demand right now. Thanks a lot. Thank you again. You probably have a great audience for people interested in Linux desktop here. Thank you. |
Python Logging Like Your Job Depends on It
A fast track to understanding logging in Python |
This is the largest room I've ever given a presentation to, so thank you for showing up. Also, let's see. How many of you are using logging right now in your job? Wow. Okay, that's a lot of you. This is the wrong talk. No, I'm just kidding. How many of you would say you're doing Python logging well? Yeah, that's what I thought. Well, it's okay. You're in the right conversation because we're going to tell you all about how you get started with Python logging and all the wonderful things that, hold on, that's my boss, hold on. Hey, yeah, I'm in the middle of a presentation right now. What's up? He deployed on a Saturday? Are you kidding me? What do you mean, production's broken? Yeah, tell me you check the logs. He's not logging. Oh my goodness. Well, please don't fire him. Tell him to tune into my talk. We're going to go over the basics of logging and then we're going to get all the way down into some more advanced logging configurations. Okay? Yeah. Ciao. So, nobody wants to be in that situation. Honestly, that went over way better than I thought it would. Nobody wants to be in that situation. So before we get too deep though, I'm going to just say hi. I'm David. I'm a developer advocate for OpenSearch. Thanks to the OpenSearch group for letting me come here. OpenSearch is commonly used as a log store, so it fits in nice with Python logging. Previous to this, I'd worked as a data engineer, network automation engineer, DevOps engineer. I've done a lot of engineering and all of them had one thing in common and it was I needed a lot of logs to really understand what was going on at any point in my application. So you might want to get your phones out at this point. I have lots of QR codes. Just kidding. There are only three. And we're going to go to code right after this one. So that's my link tree if you want to connect with me and hear more about OpenSearch because I'll talk your ear off, I promise. So with that, you can follow along. 
There's a gist for this whole presentation. It can all run for the most part. I was working on it last night till very late. And we're going to go ahead and get started. So as with anything, we're going to import logging and we're off to the races. And now you can check off your JIRA card and put it on the backlog. Just kidding. We all know that logging is a lot more than just importing logging. So we're going to start with the logger. The logger is kind of the core concept of Python logging, and all loggers start with a name. So we're going to do underscore underscore name. Does anyone know what underscore underscore name actually will resolve to in this instance? Any guesses? It's underscore underscore main. Come on, guys. So we create a logger. We give it a name. And fun fact of the day, that logger is global. I can import Python logging and do get logger, of course, from anywhere and use this logger. Is it too small? Yes. There we go. Try not to delete code while I'm at it. There we go. We're going to make this smaller. We're going to make this up here. So as we know with logging, there are several different levels you can set. These levels are a filtering mechanism for Python logging. Fun fact of the day, though, most people don't know this, is these levels actually resolve to numbers. And those numbers are used to do filtering. And fun fact of the day, you can actually make your own log levels, as if five or six or seven log levels isn't enough. You could add more potentially if you needed that. So we're going to go ahead and send this to the terminal. And we're going to take a look at all those log levels. Here we are. As we can see, they resolve to 10, 20, 30, 30, 40, and 50. So warn and warning both resolve to the same log level. So they're both there in case you need two log levels that have the same level. And these levels, again, are used for filtering.
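As a small sketch of the points above, using only the standard library: the snippet below shows that `getLogger` returns the same global logger for the same name, prints the numeric value of each built-in level, and registers one custom level. The `TRACE` name and its value of 5 are assumptions chosen for illustration; they are not part of the standard library.

```python
import logging

# Loggers are global: asking for the same name returns the same object.
logger = logging.getLogger(__name__)
same_logger = logging.getLogger(__name__)
print(logger is same_logger)

# The built-in level names are just integers used for filtering.
for level_name in ("DEBUG", "INFO", "WARN", "WARNING", "ERROR", "CRITICAL"):
    print(level_name, getattr(logging, level_name))

# Hypothetical custom level below DEBUG (the name and value are made up).
TRACE = 5
logging.addLevelName(TRACE, "TRACE")
logger.setLevel(TRACE)
print(logging.getLevelName(TRACE), logger.isEnabledFor(TRACE))
```

A custom level is used through `logger.log(TRACE, "message")`, since only the built-in levels get convenience methods such as `logger.info`.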
So as you are sending out logs, they're used to filter out logs that maybe are less important. So the lower the number, the lower the importance of the log. So let's talk about emitting a log. Here we're going to go ahead and send a log with logger.info. So that's going to emit a log from the logger with the info level attached to it. Lowercase info is for sending logs, uppercase INFO is the level. And nothing actually happens. It just gets sent to the terminal and nothing happens. And the reason why nothing happens is because we've not told it where it needs to go. We need log handlers. So that is what we do. We're going to build a simple syslog handler, sorry, stream handler for sending to standard out now. So handlers receive logs from the logger. It's pretty straightforward. I'm going to go ahead and create this real quick. We're going to set the level of our handler. Remember, our levels are our wonderful filtering mechanism. Set the level to warning. And the top one will not show up, but the bottom one should because it is greater than warning. And there it is. This will. That's pretty dull. Nobody really wants a log that says this will. That tells us nothing. You have no clue what's going on in your application. What time did this will? What time will it won't? So we need to add some context. And the way we get context is, of course, with log formatters. And I've jumped ahead too far in my presentation. We're just going to take a quick step back. We're going to talk about some of the other handlers that are built in. So there's the rotating file handler. This one's particularly useful if you have too many logs and your file gets too large. It can actually automatically rotate that file every X amount of size into a new file. And then you can specify for it to delete logs after a certain amount of time or a certain amount of files. We have the syslog handler. Syslog is a standard for logging. The HTTP handler lets you send logs to arbitrary HTTP endpoints.
The timed rotating file handler, believe it or not, is a timed rotation of your log files. Whether it be every day, every minute, or please God forbid every second, do not do that. You will end up with thousands of files. I did actually every hour once, and that was a huge mistake. I ended up with like 15,000 files after a little bit. And the SMTP handler, if you are masochistic. Now we'll talk about formatters. OK. So let's go ahead. We're going to set our console handler to the info level. We're going to create a formatter. The formatter is going to include the date and time, and the name of the logger. Which again, if you're using underscore underscore name, that's going to be the name of your module, whether that's like search dot util dot whatever. And then the level name that it was emitted at, and a message. Pretty straightforward. Then we set the formatter. Formatters get attached to handlers again. Logger dot info. Look at my pretty log. Well, just kidding. There's no log there. Is it because I have a bug? Because I wrote this way too late at night? Probably. Or more likely because there was something I actually tricked you on earlier. The truth is, the reason our first log wasn't emitted wasn't because there wasn't a handler attached to it, but it's because Python's logging library by default leaves the root logger at the warning level right out of the gate, every new logger inherits that until you set its own level, and we emitted at the info level. So fun fact, you actually need to set the level on the logger as well as on your handlers. So we'll go ahead and do that. We'll set the logger level, and we're off to the races. Let's look at this pretty log. Man, that is so beautiful. Now I know exactly when I had a pretty log. I know exactly where it happened, in the main. I know what level it was at. It was an informational log. So we've talked about a lot so far. We've talked about loggers, handlers. So let's do a top to bottom just real quick to make sure we're all talking about the same thing. So we have loggers.
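The handler, formatter, and level steps walked through above can be sketched roughly as follows. This is an illustration rather than the code from the talk's gist: the logger name `demo.formatting` is made up, and an in-memory `io.StringIO` stream stands in for standard out so the result can be inspected.

```python
import io
import logging

logger = logging.getLogger("demo.formatting")  # hypothetical logger name
logger.propagate = False  # keep the demo output away from the root logger

# A stream handler normally writes to the console; StringIO is a stand-in.
stream = io.StringIO()
console = logging.StreamHandler(stream)
console.setLevel(logging.INFO)
console.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(console)

# Dropped: the logger has no level of its own yet, so it inherits the
# root logger's default of WARNING and never passes INFO to the handler.
logger.info("dropped")

# Once the logger itself lets INFO through, the handler receives it.
logger.setLevel(logging.INFO)
logger.info("Look at my pretty log")

print(stream.getvalue(), end="")
```

Only the second message should reach the stream, prefixed with the timestamp, the logger name, and the level name supplied by the formatter.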
Loggers emit a log at some level. They also filter out logs at some level. So if your logger is set to let through only warning logs and above and you send it an info, it's never going to get to the handler. Handlers receive logs and then send them to some specified output, whatever that may be. And then formatters attach to handlers, and they enrich the output. So they add context. And there's a bunch of other context that's not mentioned here. It's in the Python logging library. I guess at a certain time in the presentation, it just decides it wants to do that. So these are all wonderful. Wait a minute. I'm getting there. Logging. Logging. Yes. All right. Here we go. We're just going to scroll on. Oh, yes. Here we go. So setting up logs in each individual module is a pain. So you can actually pre-create loggers ahead of time with a dictionary config or a YAML config file, because I know all of us just love creating YAML configs and having them everywhere. So with this, you can create as many loggers as you want and specify them with a dictionary config. One other really important thing to mention, and I probably should have hit it when I was talking about loggers at the get-go, there's a specific reason why you can set the level at the logger level and at the handler level. So handlers give you very fine-tuned control over what you're looking at. So where it's going, which output, you know. So a lot of times people will specify certain levels for handlers and then they will use their global debug level to set their loggers. So say, for example, you're debugging an application locally, you're going to set that to debug, of course, but when you push it to production, you don't want that. So a lot of people will have, you know, a production logging config and a development environment logging config. So with that, there is actually one other slight challenge with loggers, and that is they're blocking operations.
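Before moving on to the blocking problem, here is a minimal sketch of the dictionary configuration mentioned above, using the standard library's `logging.config.dictConfig`. The logger name `myapp`, the formatter name `standard`, and the chosen levels are assumptions for illustration; a production config and a development config would typically differ only in values like these.

```python
import logging
import logging.config

# Hypothetical development configuration; the names and levels are made up.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "%(asctime)s %(name)s %(levelname)s %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "WARNING",  # the handler filters what reaches stdout
            "formatter": "standard",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        "myapp": {
            "level": "DEBUG",  # the logger lets everything through
            "handlers": ["console"],
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING_CONFIG)

# Child loggers such as "myapp.db" inherit the "myapp" configuration.
log = logging.getLogger("myapp.db")
log.debug("passes the logger, then is filtered out by the handler")
log.warning("passes both the logger and the handler")
```

Keeping the logger permissive and the handlers selective is one way to get the fine-grained control described above: each output decides for itself how much it wants to see.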
So like I said earlier, if you're a masochist and you like the SMTP log handler, you could be in a real pinch. So say, for example, I'm on your application, you have this nice web server, and all of a sudden it hits a critical error, it sends a message to your SMTP server, and your SMTP server is slow, it's chugging, so it's taking five to ten seconds for it to register that and send a response. Do you think I'm going to stay on your web page for five to ten seconds while it sends an error log? Heck, no, I'm closing out and I'm going somewhere else, I don't know, amazon.com to buy whatever I needed. So we have to handle this. We have to understand that, hey, this could potentially block, so how do we unblock our applications? Well, it's by making our applications simpler, obviously, and using multi-threading. That was a joke, you can laugh. So with this, we can actually import queues, and what happens is there's a queue handler and a queue listener. So the queue is the shared memory space that can be accessed by both the handler and the listener. You create the handler, the handler receives all the logs, and then puts them on the queue. The queue listener starts up on its own independent thread, and it's going to listen to the log queue, and then distribute the logs to any of the handlers that you specified it should. So let's go through that end to end again. We've got queue handlers, which receive the logs and place them on a queue, the queue hands them over to the queue listener, which then hands them on to your other handlers. Now your application is unblocked. It drops that log on the queue, and then it's off to the races. You can use smtplib if you really wanted to. So let's talk about pulling this all together now, right? So we have these logs, they sit on our local machines, and that's fine, but if you are a large organization, you might have hundreds of servers. Take a second to breathe.
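The queue handler and queue listener flow described above can be sketched with the standard library alone. In this illustration the `ListHandler` class is a made-up stand-in for a slow destination such as an SMTP handler, and the logger name `demo.queue` is also an assumption.

```python
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue shared by handler and listener


class ListHandler(logging.Handler):
    """Hypothetical stand-in for a slow handler (for example SMTP)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


slow_handler = ListHandler()

# The application logger only talks to the queue, which returns quickly.
logger = logging.getLogger("demo.queue")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))

# The listener drains the queue on its own thread and calls the real handler.
listener = logging.handlers.QueueListener(log_queue, slow_handler)
listener.start()

logger.error("critical failure")

# stop() processes what is still queued and then joins the listener thread.
listener.stop()
print(slow_handler.messages)
```

In a long-running service the listener would usually be started once at startup and stopped at shutdown, so that records still sitting in the queue are not lost when the process exits.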
You might have hundreds of servers, hundreds of network devices or whatever, and I'll give a real example of when I could have used this. So when I was a network automation engineer, we had a particular log that my boss found on some of the servers, or routers, switches, et cetera, and I spent the next three hours logging into each individual one. We had thousands of network devices. So I logged into enough of them, wrote down the logs, and then correlated, and I said, oh, look, every Thursday and Saturday at the exact same time, this log happens. One email later, we find out the security team is pen testing against us, and we don't need to worry about it. Again, three hours, just trying to correlate what log was happening where and when. So this is exactly what you can avoid by using something like OpenSearch, Elasticsearch, or Loki to aggregate your logs. And again, if you do want to follow along with this later, this gist has a Docker Compose file that will let you spin up some sample containers with OpenSearch. So we'll import logging and our OpenSearch library, create an OpenSearch client, and this is where I'm going to break for just a moment and talk about custom handlers. So we mentioned handlers are where you send your logs. You can implement custom handlers, believe it or not. All you need to do is, I was going to say, inherit from, there we go, that's probably the right word, logging.Handler. Then you create an emit definition, and that needs to have self and record, and that will send the record wherever you specify. So in our case, it's going to take the record and it's going to format the created time. Also I did not use a formatter, because I wanted to send it as a dictionary to OpenSearch, just because that's what it works with. You can also use something like Fluentd, Fluent Bit, or Logstash to parse out your logs later, and we'll talk about that just briefly. So we've got our created time.
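As a rough sketch of the custom handler idea described above: subclass `logging.Handler`, implement `emit(self, record)`, and send a dictionary built from the record to the search client. To keep the example self-contained, `FakeSearchClient` is a made-up stand-in whose `index(index=..., body=...)` method only stores documents in a list; the handler name, the document fields, and the index naming scheme are likewise assumptions, not the code from the talk.

```python
import logging
from datetime import datetime, timezone


class FakeSearchClient:
    """Hypothetical stand-in for a search client; it only stores documents."""

    def __init__(self):
        self.documents = []

    def index(self, index, body):
        self.documents.append((index, body))


class SearchHandler(logging.Handler):
    """Custom handler that sends each record to the client as a dictionary."""

    def __init__(self, client):
        super().__init__()
        self.client = client

    def emit(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        document = {
            "created": created.isoformat(),
            "name": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # One index per logger name and day, so old days can be rolled off.
        index_name = f"{record.name}-{created:%Y.%m.%d}"
        try:
            self.client.index(index=index_name, body=document)
        except Exception:
            self.handleError(record)  # never let logging crash the app


client = FakeSearchClient()
logger = logging.getLogger("log")
logger.setLevel(logging.INFO)
logger.propagate = False

handler = SearchHandler(client)
handler.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Look ma, I'm logging")
print(client.documents)
```

Swapping the stand-in for a real client would mean constructing that client with connection details and keeping the `emit` body otherwise the same, assuming the real client exposes a comparable indexing call.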
We've created this wonderful record, and OpenSearchClient.index will send that log to OpenSearch. There we go. And then we'll set it up. So we're going to create our logger named log. We're going to set the logger's level to info so that we get the info records. We're going to create the OpenSearchHandler, set its level to logging.INFO, and add our handler. And we're off to the races. Boom. Well, that was kind of anticlimactic. You can't actually see it going into OpenSearch. But I promise you, it chugged along and it went into OpenSearch and, huzzah, let's go into OpenSearch. And this is actually OpenSearch Dashboards. So again, if you do this on your local machine, password's admin, username's admin. Very secure. We're actually looking to change that, but that is coming soon. So we go here and we're going to go into stack management, hang with me for just a second. So we created a custom index with this, and that index is named based off of the logger that sent it. So we've got the logger name, which was log, and then the date time, so that we can roll those off after a certain amount of days. We're going to create an index pattern. And we are going to say, we're going to say absolutely nothing because this is the problem with doing things like, oh, just kidding, I just don't know how to use OpenSearch. Here we go. It's only my job. Don't worry. Boss, I swear, if you're watching this, no, I'm just kidding. So there we go. So we have these two indexes that have been created because I was testing yesterday and today. So log star just says, hey, any index that looks like log with anything after it, group them together. It asks for a time field, which is our created field, and create index pattern. There we go. OpenSearch auto-detects all of these different fields, types, et cetera. You can actually specify mappings, and I actually really recommend that because different visualizations require different types of mappings, but that is not what we're talking about today.
We're talking about Python logs, darn it. So we're going to go over to Discover and go into our logs, and we are going to make sure we're looking at today's logs. Actually, well, let's look at this week. Here we go. And we have three hits. There we are. These are our logs. Look ma, I'm logging. And you can actually see the exact point in time when I switched from this is my log to look ma, I'm logging. So that was, yeah, at 2:37 p.m., so anyhow, with this, you can actually go ahead and then visualize spikes, peaks, valleys, when this is happening. You can enrich it with device information. So you can say, hey, when was this log sent? Is there a particular device that's having a lot of issues? So now instead of logging into all of your servers, you can go and bounce the correct ones. And if you saw Dotan's presentation yesterday, you'll know you can use this for monitoring anything, whether it's CI/CD pipelines, Python logs, network logs, et cetera. So let's see. Look at that. You're doing more logging now than 99% of the population. So congratulations. Clap your hands for yourselves. So I'm going to talk real quick about just a very simple, common logging architecture for capturing distributed logs and why you would want to do that. So more often than not, you're probably actually going to want to put your logs locally on the file system. And the reason why is because your file system is not going to go down, and God forbid if it goes down, there's no hope for that log getting out anyways. Your remote service could disconnect, whether you're doing an upgrade or something along those lines, so the file acts as a little bit of a caching mechanism. So you'll normally have it log to a file, and then you could use something like Fluent Bit or Beats, and those will ship your logs out.
And the wonderful thing about this architecture is, again, if OpenSearch was to go down because you're doing an update, or if something critical was to happen with your Python app, it can quickly write that log out, and you can also do enrichment. So I talked about getting which server sent that log. So Fluentd, Data Prepper, and Logstash all can take and enrich your logs with the context information that came with them. So say, for example, it says, I received this log from X, Y, and Z server, 10, 20, 90, 32, 83, or something like that. Then it can go and do a reverse DNS lookup and say, hey, who has that? Which service is assigned to that? And then it can add that information in and then push it into OpenSearch. So now you've all of a sudden gone from having a log that says hello world, or got here (please, please, do not), to having it pushed into OpenSearch, knowing what service is causing the issues, and visualizing it on dashboards. With that, I'm finished. Please scan, look at OpenSearch if you're curious. It is open source, Apache 2 licensed. All of our features are being developed in the open. And with that, I want to ask, does anyone have any questions? Yeah. Thank you. Hello? Okay. How would you... I'm mostly familiar with Sentry, and I'm very curious how you would compare this to that, because as far as my friends told me, they're pretty different products, but they do similar stuff. Yeah. So Sentry is more of an APM, isn't it? Is that word familiar? Okay. So OpenSearch has some APM capabilities, which is, I don't remember the word for it, but it's for app monitoring and specific to the application. So this has some APM capabilities. It might not go as deep as some of the auto configurations of other APM tools, though, but it can ingest APM logs. So that's a good question. Thank you. You had two async questions from the chat. The first one is, what about f-strings and logging? Because I wrote this presentation in like a couple hours, and yeah.
So I'm trying to modernize, but again, old habits die hard, so I'm still using f-strings. Please don't get on me. It was the next one, sorry. Yeah. The question is, what about structured logging, and in particular, structlog? I'm not actually familiar with structlog. So I'm actually moving myself more towards using OpenTelemetry instead of logging, or alongside logging, we'll say. So OpenTelemetry gives you a trace, actually, which can tell you the full stack of what happened during your application. So everything down from which function was called to which load balancer sent the information over. So you get an end-to-end trace of what happened, which in my opinion, I think, is a little bit more handy than just logs. I think we have one more question in the middle. Just a quick question from a discussion I'm having with a coworker. So we're thinking about moving our logs to JSON format, because it's easy to understand for non-Python people and searchable. If we were to switch to OpenSearch, and I really liked the presentation, do you think it's still worthwhile to make the logs searchable in and of themselves, or is OpenSearch so stable and usable that, from your experience, there's no need to search the logs directly? Yeah. No, that's a great question. You do need to search your logs. OpenSearch is a search engine at its core, and that is why it is as good as it is with logs. As for JSON versus other formats, I think there's no particular preference, but OpenSearch is certainly stable. We have 150 million downloads, so we are here to stay. It's been adopted by a lot of companies such as Oracle, Aiven, Instaclustr, Opster, Amazon Web Services, of course, because I work there, and many others. So I would say it is very stable, production-ready to use, and yeah, it's a really great way to search your logs. In fact, I have a lightning talk at 16:55 in another room, so I think it's a Kubernetes room.
So if you want to talk more about searching your logs, I'm going to be talking about search relevance for logs. So thank you. Thank you, online people. Thanks a lot. |
Will PyScript replace Django?
What PyScript is and is not |
Hello. So yeah, I'm talking about PyScript. So I was trying to get a catchy title, so it's kind of like clickbait. So, will PyScript replace Django? So maybe it will really spark a lot of people's interest because I mentioned Django, which is a very popular framework for the web. But today, I'm not going to talk too much about Django, I'm mainly going to talk about PyScript. And I will have some crazy demo at the end, which is very interesting. So you may not want to miss it. But the most important thing is that this is the link to the slides. I should have used a QR code, but I was kind of lazy. So it's just the link where you can find the slide deck. It is also uploaded to the FOSDEM website. You can find it to follow along, and if you can't see it very well or anything happens, then you can look at the slides. And all the links are there as well. So I hope that since you're here, you have heard of PyScript, but how much do you know about PyScript? Like, except that it's something to do with Python? So, yeah, silence. So, yeah. So I think most people that I've met and talked to about PyScript said that they have heard about PyScript. They may have read a blog or two about PyScript. It's something that's relatively new. So that's why I think it's important to put information together so people have something to look at when they want to know more about PyScript. So, by the way, I'm Cheuk. Why am I talking about PyScript? Why does it sound like I know a lot about PyScript? Because I work at Anaconda. So PyScript is developed by a team at Anaconda. So I work with them a lot. So I'm very close to the source. So I kind of have some information about what's the newest thing about PyScript. And I love open source projects. I have been involved in open source projects in the past. I just wanted to put more pictures there. So that's why I put it here.
And I organize a lot of events. So let's jump into it. What is PyScript? PyScript is actually a framework. Some people think that PyScript is a new language, but it's not. It's actually Python. But you can write Python in your HTML file. That's what it is. And also, it's a framework. Why? Later you'll see why. We say PyScript is a framework because of how you can change the backend and other stuff. So it lets you run Python applications in the browser. Basically, it just means that you can write Python script just like you write JavaScript in the HTML file. And then the browser would understand what you want it to do, and it would do something. But it's not trying to replace JavaScript. You can actually use it with JavaScript. For example, I'm using it with the D3 library, which is crazy. Who knows about D3? Yes. Okay, good. Okay. I'm not, like, speaking to other people. Yeah, it's good. So I will show something more later. So basically, you can actually pass the objects back and forth. So you can change a Python object into a JavaScript object, and then your JavaScript library, like D3, will understand it, or the other way around. So it is something that is quite new. So also, all these things that I talk about, that you can change the Python object into a JavaScript object, all this stuff wouldn't be happening if we didn't have the Pyodide project, which, by the way, for those of you who haven't heard, started as a Mozilla project called Iodide. They tried to have a lot of things that run in Wasm, WebAssembly, so the browser can run them. And then within all those projects, there's a Python project that is actually converting Python into WebAssembly. So that's the Pyodide project. So that's what actually allows you to run the Python. So without it, actually, PyScript won't work.
So, well, PyScript will still work. You can change the backend, but it started with Pyodide. Sorry, I'm trying to be correct in what I'm saying. But the main thing is that you need to compile Python into WebAssembly. So Pyodide is one of the most popular ones, where more or less the whole thing that your standard Python offers is actually compiled into WebAssembly, so you can run Python in the browser. But there are also other things that the team is trying now. For example, they are trying to compile MicroPython, which is a lighter version of Python, into WebAssembly. So it more or less works the same, but you have some Python functionality that is not available in MicroPython. So you can actually choose which one you want to use. And also, there's one thing about the Pyodide project that explains why it is the first one that PyScript adopted and why it is so popular. Because the project itself also provides a lot of Python packages. For example, those that we use a lot, that scientists and data scientists use a lot, like NumPy, SciPy, scikit-learn, all those are actually quite difficult to run in the browser because they are not pure Python. So for pure Python, if you have a Python interpreter that is compiled to Wasm, of course you can do it because they are just Python, but something like NumPy, SciPy, scikit-learn, they are not pure Python, so it's a bit tricky, but the Pyodide project also provides that. So now we can also run those very complicated Python packages in the browser, which is cool. So I will show you a little bit of the PyScript basics and then I'll show you the demo and then all the questions will come at the end. Is it too small? I don't know how to zoom in though. That's why the link at the beginning is important.
So I can explain, but the content of this code is not the most important thing. It's just how a typical PyScript page will look. I'm just talking about this section here. For the first two lines, you don't need to read them, but I'll tell you what they are. It's just like when you have JavaScript code: you probably have a CSS file, the style sheet, which is the style of how your website looks, and then you have a JavaScript file that you pull in so you can run all the JavaScript functions you have. So this is a path to a pyscript.js that's hosted on pyscript.net. This is what allows you to write Python in your HTML file. You may ask, why is it .js? That's just how Wasm works. We have to follow the standard. So yes, it's .js, but that's not important. You're not writing JavaScript, so don't worry about it. Next there is a section where you can actually write Python code. Here it is just some NumPy code that plots some NumPy random numbers. Sorry, I wasn't expecting the room to be this big. If you can see it, the first line is basically an HTML tag, py-script, and it also has a few settings, like output equals plot and things like that. And then it is just a Python script: import NumPy, import Matplotlib, and all that. It's Python code. You can copy and paste your Python code there; it's more or less the same thing. The other thing you may have when you are using PyScript is a py-config. Multiple formats are supported; in this example it's in JSON format. It's just JSON saying the packages are NumPy and Matplotlib, because we are using them in the Python code.
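As a rough illustration of the page layout just described (style sheet, pyscript.js, a py-config listing packages, and a py-script block), here is a small Python sketch that assembles such a page as a string. The exact CDN paths, the type="json" attribute, and the helper name build_page are assumptions for illustration, not taken from the slides; check the PyScript documentation for the release you use.

```python
import json

# Assumption: the "latest" CDN layout on pyscript.net mentioned in the talk.
PYSCRIPT_CSS = "https://pyscript.net/latest/pyscript.css"
PYSCRIPT_JS = "https://pyscript.net/latest/pyscript.js"


def build_page(packages, python_code):
    """Return a minimal HTML page embedding Python code for PyScript.

    build_page is a hypothetical helper, not part of PyScript itself.
    """
    # The package list plays the role of "pip install" inside the browser.
    config = json.dumps({"packages": packages})
    return f"""<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="{PYSCRIPT_CSS}" />
    <script defer src="{PYSCRIPT_JS}"></script>
  </head>
  <body>
    <py-config type="json">{config}</py-config>
    <py-script>
{python_code}
    </py-script>
  </body>
</html>
"""


# Plain Python, exactly as it would be written in a normal script.
code = "import numpy as np\nprint(np.random.randn(5))"
page = build_page(["numpy", "matplotlib"], code)
print(page)
```

Saving the printed output as an .html file and opening it in a browser is the intended use; nothing here runs Python in the browser by itself.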
That's why we have to put it there, to say that we are using those packages, kind of like your pip install. It's like putting them in the environment within your browser. That's what it does. So this is the code; that's typically how it looks. One catch is that the first two lines there are using the latest version. If you don't want to break your code, you can pin a specific version with "releases" and then the version number. We are tagging releases by year and month, I think, and then the version. PyScript is still changing a lot, so if you don't want your code to stop working next month, you may want to pin the version until you look at the code again and update it manually. You can do that, so don't worry about it. I know people will have questions about it. Or you may think: you're a web developer and you don't like the CDN, pulling things in from another website that you have no control over. If that website gets attacked or something, it won't work, and it may be very dangerous. You can host it yourself. You can download those two things, the style sheet and the JavaScript code, and the other things that come with them as well. There are a few other files, but they're all downloadable. You can download them and host them yourself if you want to. It's getting more complicated because PyScript is getting more developed and now there are more things you can configure about how PyScript works. The py-config tag is where you can change all of those. First of all, within the tag itself there are multiple formats you can use. There's the TOML format, which is the default.
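The choice between "latest" and a pinned release can be sketched as a small URL helper. This is only an illustration: the releases/<version> path layout and the example version string are assumptions based on the description above, and pyscript_assets is a hypothetical name.

```python
def pyscript_assets(version=None):
    """Return the CSS and JS URLs for PyScript.

    version=None means "latest"; otherwise a release string such as
    "2022.12.1" (year.month.patch, used here only as an example).
    """
    base = "https://pyscript.net"  # assumption: the CDN named in the talk
    channel = "latest" if version is None else f"releases/{version}"
    return {
        "css": f"{base}/{channel}/pyscript.css",
        "js": f"{base}/{channel}/pyscript.js",
    }


# Floating version: picks up changes automatically, may break later.
print(pyscript_assets()["js"])
# Pinned version: stays the same until you update it by hand.
print(pyscript_assets("2022.12.1")["js"])
```

For self-hosting, the same two files would simply be served from your own domain instead of the CDN.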
So if you don't tell this tag what format you're writing, it will just assume you're using TOML. In this demo here it says packages and paths. Again, it's like the pip install thing: there's an environment within the browser that runs the PyScript code, so you first have to say what packages you are using to make them usable. The JSON format is just a different format; it's more or less the same thing. Or you can use another source. You can write your config in a separate file, either a JSON file or a TOML file, and point to it, because you don't want everything to be in your HTML file, right? And the py-config tag is not just used for listing your packages. There are other things you can set in py-config. You can host the wheels of the packages yourself; it doesn't need to be the ones provided by Pyodide. Also, in Python you can import your local scripts, and you can do that here too, with the fetch configuration. I'm not showing it here, but if you're interested you can look at the PyScript documentation. It will show you how to use fetch to load in your other Python scripts. You can also change the runtime setting. Like I said before, Pyodide was the default when we first started PyScript. Now, and I think this is still in development, you can actually change the backend, which runtime you want to use.
If, for example, you just want to do some demonstration with Python, or you are using it as a tool to teach kids how to code, you may not want to use Pyodide, because it's slow and heavy. It's powerful, and it's useful for other things. You can use MicroPython, which takes no time to load and provides the basic Python that you could write in the HTML file. So you can quickly teach someone Python without installing Python, just running it in the HTML file. You can also add some metadata. For example, if you want to add the author who wrote the script and the license, you can do that. No problem. Another thing you may put in your HTML file while using PyScript is py-repl, which is something like Jupyter Notebook. If you have used Jupyter Notebook, you know it's a very nice REPL where you put in Python code, press shift-enter, and it executes and gives you the result. You can embed that in your HTML page when you are using PyScript. You just need to do the same thing, have those two lines for the style sheet and pyscript.js, and then put in the py-repl tag, and you will have a Jupyter-like REPL that you can use on your site. So why is it useful? It's a new thing, it's exciting, but can I really use it? Or is it just a fun thing to do? Why do it on the front-end? You can already have applications like Django, and people love Django. So why do we want PyScript? Because sometimes things just need to run on the front-end. Sometimes we can't really rely on an application like Django or another Python application to handle all this Python code. For example, if you don't want to use up all your resources, right?
If you have a back-end, and the back-end is hosted by you or by a cloud service that you pay for, and there are a lot of users, and every single user makes heavy use of your resources, then the bills can be expensive. You may not want that. You may want to give the load back to the users who are using it, right? So you could push things to the front-end. Or, I've heard maintainers say that they want people to try out their code, so you can build a sandbox that lets people run it. For example, a lot of the data science projects, like NumPy and SciPy, used to have the Binder thing, which loads another application with a back-end where people can run some example code. But when these services are provided for free, usually they're quite slow, or they have limits. If you want to provide a sandbox for users and it can run on their machine, you also don't have to worry about people abusing it. For example, whoever runs whatever on your sandbox could do crypto mining, and that's not a good thing. If it's on the front-end, it's using the user's resources, not your resources, so if they want to mine Bitcoin, it's fine. It's on their machine, not your machine, and you don't have to pay the bill. Also, sometimes we have applications that deal with, for example, research data or medical data, which is very sensitive. There are rules that the data can't leave the machine. So you can't send it to a back-end somewhere to process it.
Then maybe you can provide the code so that someone can run it in the browser. So it runs on their machine, instead of you building the application and them having to send the data over to your app, or whatever the back-end is, to run it. It's also easier to set up. Otherwise, you may have to provide a separate secure environment with the whole setup of the back-end and the front-end together. If you just have the front-end, it's much easier. You don't have to worry about it. So, will PyScript replace Django? I know that you already know the answer is no, but actually it's a lot of fun if you use them together. I will show you a few things that I like, done either by me or by some of my friends. There are quite cool things that use both Django and PyScript. So this is something that I've done, using PyScript with Django. I will show you what it is first and then I will explain. So here, this is the thing. This one I can zoom in on, cool. This is a recommender system, right? I have all these movies that I downloaded from the movie dataset on Kaggle, and there are a bunch of ratings. With this recommender, if you put in a movie that you like, it tries to find all the other movies you might like. For example, I always like putting in Iron Man because I know it works. And then: give me five recommendations. If I like Iron Man, what else would you recommend me? And if I click recommend, I get five of them, right? Most of them are sci-fi movies, which is quite cool.
There's also The Dark Knight, which I think is the Batman movie, and I like that too. So yeah, that's nice. But this thing, right? Usually you think, oh, it's a machine learning thing, it's a recommender system. Can I run it on the front end? Yes, you can, as long as you have your model already trained. In my example here there is a link that shows you how I set it up, so you can play around with it yourself. What I did is, of course, download the data; you have to have the data to make it work, right? And after that, you just run some of these scripts. You can do it in a more beautiful way. I'm just using the command line to run the scripts that load in the data, train the model, and so on. You can do it in other ways. For example, you have an admin user interface, right? Someone uploads new data, and it automatically retrains the model when there's new data. You can set that up as well, but this is just a demo, so it's like this. After that, there will be a model that's already trained. It gets deployed to the front end, and that's where PyScript comes in. That's how it works. So this is just a trained model. It's very lightweight. You don't have to host all the data there; the training can be done somewhere else. The user has a trained model, they just put in the input, and it gives them some results. So that's a machine learning model deployed on the front end. Another thing that my friend has done, which is quite cool, is front end as a back end. It's running Django in the browser. Yes, this is so small, but yeah, someone has done it.
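The split described for the recommender demo (train offline on the back end, ship only a small trained model, answer queries in the browser) can be sketched with a tiny item-item similarity model in pure Python. This is not the speaker's model: the ratings, user names, and function names are hypothetical, and cosine similarity is just one simple technique chosen for illustration.

```python
import json
import math

# Hypothetical ratings (user -> {movie: rating}); the demo used a Kaggle dataset.
RATINGS = {
    "ann": {"Iron Man": 5, "The Dark Knight": 5, "Inception": 4},
    "bob": {"Iron Man": 4, "The Dark Knight": 5, "Notting Hill": 1},
    "cat": {"Iron Man": 1, "Notting Hill": 5, "Amelie": 4},
    "dan": {"Inception": 5, "The Dark Knight": 4, "Amelie": 2},
}


def cosine(a, b):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train(ratings):
    """Offline step (back end): precompute movie-to-movie similarities."""
    users = sorted(ratings)
    movies = sorted({m for r in ratings.values() for m in r})
    vectors = {m: [ratings[u].get(m, 0) for u in users] for m in movies}
    return {
        m: {o: cosine(vectors[m], vectors[o]) for o in movies if o != m}
        for m in movies
    }


def recommend(model, liked, n=5):
    """Online step (front end): a lookup and a sort, no raw data needed."""
    scores = model[liked]
    return sorted(scores, key=scores.get, reverse=True)[:n]


# The trained model is plain JSON, so it can be served as a static file.
payload = json.dumps(train(RATINGS))
model = json.loads(payload)
print(recommend(model, "Iron Man", n=2))
```

Only the recommend function and the JSON payload would need to reach the browser; the ratings themselves stay on the server.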
This is not my doing, so I don't have any responsibility for it, but have a look. Basically you have two browser windows, two HTML pages. One page is the server, one page is the front end. So you have a back end and a front end on the same screen. Yay. So yeah, you can do that. I was like, oh, this is fun, but is it useful? But my friend here, Hugo, told me that you can actually use it to test things, because now you can run an application in the browser, and everybody has a browser. So you can run an application on it, which is very cool. Check that out. Other things that I use PyScript for are not with Django but with other tools, for example together with D3. I have this example here. I already preloaded it because, I'm not going to fool you, it actually takes quite a while to load. That's because I'm using the whole package, right? Here I'm using NetworkX. I have put in a network graph, and I have all this network analysis, which is super cool. But I don't like the NetworkX visualization because it's kind of basic. That's why I use D3 for the visualization. I can do this, right? Very cool animation. I can click on things and things change. Now I see all the neighbors are color-coded. So you can combine the cool stuff that JavaScript provides, for example D3, with the cool stuff that Python provides, which is all the data science tooling. There are other plots as well. Before, you couldn't easily have these interactive graphs on your website. And also a map as well.
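For the Python-to-D3 handoff described above, the usual bridge is plain JSON: Python computes the graph and its metrics, and D3 draws from a nodes/links structure. The sketch below uses only the standard library rather than NetworkX so it stays self-contained; the edge list and function names are hypothetical, and the nodes/links shape is the common convention for D3 force layouts rather than something shown in the talk.

```python
import json

# Hypothetical edge list standing in for a NetworkX graph.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]


def neighbors(edges, node):
    """Nodes directly connected to `node` (what the demo color-codes on click)."""
    found = set()
    for source, target in edges:
        if source == node:
            found.add(target)
        elif target == node:
            found.add(source)
    return found


def to_d3(edges):
    """Build the {"nodes": [...], "links": [...]} dict a D3 force layout can read."""
    names = sorted({n for edge in edges for n in edge})
    return {
        # The degree is computed in Python and can drive node size in D3.
        "nodes": [{"id": n, "degree": len(neighbors(edges, n))} for n in names],
        "links": [{"source": s, "target": t} for s, t in edges],
    }


graph = to_d3(EDGES)
# The JSON string is what gets handed to the JavaScript side.
print(json.dumps(graph, indent=2))
```

On the JavaScript side, a force simulation would typically resolve the source and target strings against each node's id field.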
This is, again, not my demo, but you can now have this map thing where you use Folium, which is, again, a Python library. Now you can use it in the browser. Super cool. I have also started building a PyScript tutorial. Again, this is a work in progress because things keep changing and I can't keep up, but if you're interested and want to try it, that may be a place to look. All of these links are in the slides, so download my slides if you need to. I think I don't have much time left. I would like to answer your questions, but there are some common questions that I will ask and answer myself. For example, people ask me: can you pull in a Python script? Yes, you can. You can use fetch now; look at the documentation if you want to. What Python version are you using? It depends on the runtime, which is, again, one of the py-config settings you can look at. You can choose which Pyodide version you want to use yourself. And why can't we do it like JavaScript, with a script tag whose type is Python? Because this is so new that browsers don't support it yet, so we have to make a custom tag, which is py-script. Okay. Why don't you just use Pyodide? I think PyScript is just easier to use. Pyodide is very nice, but it sometimes gets quite complicated, especially for beginners. Also, you can change the runtime with PyScript. So it's not just Pyodide; my colleague is now working on a compiled version of MicroPython. It will be much faster and much more lightweight, so try that. A bunch of other things. Can you pin a version of the library, of the packages that you use?
Well, if you want that, then you'd better host the wheel yourself. Then you know which version you're using, and it's there, frozen and unchanged forever. Also, do you know Brython? Maybe some of you have heard about Brython. Brython is a project that translates Python into JavaScript and then runs it in the browser. The difference is that PyScript is not using JavaScript, it's using Wasm. Python is compiled to Wasm, so more packages are available. And again, you can change the runtime, the backend, to whichever one you want to use. BeeWare: someone mentioned BeeWare in the previous talk. Yes, I would love to see more support for BeeWare, but I can't speak for the company, so I'm not promising that. So that's the end of my talk. I know I have a few minutes left for Q and A, and if you didn't get a sticker at the beginning, come and talk to me and I'll give you one. Thank you. We have a few minutes for questions, and before the questions, I want to thank everyone for joining the Python Dev Room. I also want to thank Eric Gasoni, my friend who organized all of the planning of the day and everything upstream. He did great work on the selection of the speakers and planning everything. He couldn't make it here today, but I really want to thank him as well. Thanks, everybody. And thanks to Arnaud also for joining me today to help with the Dev Room. Questions, yeah. Hi, thank you for the wonderful talk. PyScript is very exciting. I know people are using WebAssembly to run untrusted code in the browser. You can use it like a sandbox. So it's very exciting. But my question is, you are importing packages. Is that easy to do? Do the packages have to be on the machine already? Yeah.
So, if you have an internet connection and you simply put packages equals something in the py-config, those will be pulled from whatever Pyodide provides. So they need to be loaded from online. You'll see that if you have a web page like that, it will take quite a while to load. But you also have an alternative: you can download the wheel of the package, as long as that wheel is purely written in Python. For something like NumPy or SciPy, because they have extensions and are not purely written in Python, you don't have that option. You have to use the ones that Pyodide provides. But otherwise, if it's another library that is purely written in Python, you can download the wheel and it can run locally. Yeah. My question is about Matplotlib. It's okay, but how about Basemap or Cartopy? Can they be used also? Matplotlib is for graphics, but if you have maps, Basemap, can such a tool also be used in the browser? Yeah. Most libraries that are available in Python will also be available. So it must be pure Python, or... I'm not sure. I don't know. Try it. You have to check the library that you want to import: is it purely in Python or not? I'm a bit confused, because what we loaded in the browser is a JavaScript module. So is there some WebAssembly somewhere? Yeah. You see that the script that was imported is .js, right? But actually that's WebAssembly. It's just that when my team releases the thing, which is in WebAssembly, the extension is .js. It's a bit confusing, but actually it's in WebAssembly. It's not in JavaScript. It's just how the WebAssembly standard works.
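On the earlier question of which packages can be loaded from a self-hosted wheel: one quick check is the wheel's file name, which by the wheel naming convention ends in python tag, ABI tag, and platform tag. Pure-Python wheels conventionally carry the ABI tag "none" and the platform tag "any". The sketch below applies that rule; the helper name and the example file names are illustrative, and whether a given package actually works in the browser still has to be tested.

```python
def is_pure_python_wheel(filename):
    """Guess from the file name whether a wheel is pure Python.

    Wheel names look like:
    {name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    abi_tag, platform_tag = parts[-2], parts[-1]
    # "none-any" means no compiled extension and no platform dependency.
    return abi_tag == "none" and platform_tag == "any"


# Example file names, for illustration only.
print(is_pure_python_wheel("folium-0.14.0-py2.py3-none-any.whl"))
print(is_pure_python_wheel(
    "numpy-1.24.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
))
```

A wheel that fails this check contains compiled code for a specific platform, which is the case where the answer above says you have to rely on the builds Pyodide provides.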
Somehow you end up with a JS file after the release, the build, and the source. It becomes a .js, I guess. We have time for one more question, if you have one. Yeah, there. Thanks for your talk. Does PyScript use any web workers? And if so, can you control them? The web what? Sorry. Does PyScript use any web workers? Like, does your code run in parallel? It's purely in the browser. If you host everything locally, it's just in the browser. So there's no web worker that communicates with other applications. Yeah. Does it work the same in different browsers, Firefox, Chrome? Because I guess it depends on your browser supporting that. Yeah. It definitely works on Chrome now. Firefox, maybe others; I can't guarantee that. Yeah. But the project is very young, so later there will be more support. |
W3C RTC Working Group Update |
Yeah, yeah, and just shout if I'm running out of time. Yeah, great. Hello everyone, I'm going to start. Thank you for being here. I'm glad to open this dev room. It's my first time at FOSDEM, so I'm very happy to be here in front of you, and I ask you to be kind because I'm not used to presenting that much. All right, and I'm open to feedback afterward. Just come and see me and tell me how to improve. I would be very happy to hear from you. Okay, my talk is about the W3C. Do you know the W3C? Who knows it, raise your hand? Okay. That was an easy question for everybody to raise their hand, right? It's a worldwide organization. Everybody knows it and its different standards, and it's quite far from us, right? You don't know exactly what happens in that organization. There are big names and serious people around the table discussing things, and eventually they agree on something that applies to the market. They've got some working groups, and there is one working group which is about WebRTC. Okay, who works with WebRTC in the room? Okay, that's great. That was also an easy question in the real-time communication room, right? So there is one working group that defines the standards for us on the market. I thought it was quite interesting to understand how it works, what the topics are, and maybe how to get involved in it, so that the future is built by us, right? We are part of the community, and the W3C is about getting the community to reach consensus, so it makes sense to be around the table. In fact, first I was interested to know more about that organization, and I think it could be interesting for you, too. By the way, I'm Romain. I work for a company, like everybody, right? And I'm co-organizer of the DevFest in my city. Sorry. Yeah, it's asleep. All right.
Good, it's back. And I'm also co-organizer of a meetup that will soon revive, which is called the WebRTC Paris meetup. Soon revive, because it's been asleep for a few months, a few years in fact. So connect with me if you want to talk about those topics, of course. You can also reach me on Twitch every Thursday. Right, so I said there has been a standard that enables the whole industry to agree on a protocol, let's call it the WebRTC standard, and it came only in 2021. So it's very late, in fact. If you have been in the WebRTC industry for a while, you have used that protocol for quite a long time. 2021 is not the beginning; it's kind of the end. That's what the W3C does: making sure that we agree on something stable. From that moment you can build your applications using WebRTC on solid foundations, and that's why it's quite interesting. What is also interesting is that it's a joint action across the W3C and the IETF. The W3C is more about the web, the usage of it, and the other organization is more about how you structure the networking protocol that enables things to flow between the clients. So it's quite interesting that they join forces on that particular topic. They need to agree together: the working group at the W3C needs to agree with the working group at the IETF. I don't know how to pronounce that, anyway. So yeah, happy that we've got a standard. But in fact, there are a lot of use cases that are already on the market. So what is the working group doing? That's the W3C working group.
They accept any individual who wants to contribute, to participate. They also encourage any company or any organization to become a member. All right. As an organization, you might be an end user, I mean, sorry, a company that builds a solution based on that protocol, or you can also be a browser vendor, for example. You can propose new standards. That's quite powerful: if you are a member, you can propose new standards, and as it's based on consensus, if your proposal meets a need on the market, you will have a community that gathers around you and finally you can reach that consensus. Otherwise, if your proposal is not needed or is not appropriate, you will get rejection from the community, or just nobody around you. And you can't reach consensus if there is nobody around you, right? So that's how it works. It's kind of democratic, in fact. It's people gathering and trying to get better together. Each working group has leaders. At the WebRTC working group, we've got leaders from Microsoft and Mozilla and, I can't remember the third one, one of the big players, maybe Google, maybe not, I can't remember. Anyway, the big players are the leaders, and being a leader is a role that requires you to push the group forward without imposing a direction. You need to provide energy to the group. You need to keep the whole process going, make sure that there is a real consensus around the table, and in the end try to realize the mission of the working group. So the WebRTC working group has some topics. They worked on different topics in the past. There are some mature topics that are already deployed in the different solutions and different browsers. You can recognize those APIs that you might use every day, and you can see that those APIs are widely accepted. They are solid.
They are fundamental, in fact, so you can build on them. And there are also some topics that are still in discussion, more or less mature, but still in discussion, and those are still moving things, right? You can see quite interesting topics in there, especially topics that are optimizations for having better quality, call quality, video quality, in your application. So there are more technical things, and also things that are more related to the UX: how you make the UX better when you handle communication through the WebRTC API implemented in the browsers. For example, you've got this one, the capture handle. This is something quite interesting. It's about making sure that when you are in a tab in a browser and you share another tab in the browser, the two tabs are aware of each other. So you can know that you are sharing a tab on a specific domain, for example, and you can adapt the behavior of your WebRTC application depending on what is shared. And the reverse also: when a tab is shared, the tab knows that it is actually streaming its content somewhere else. So you might also adapt the content of the tab depending on that. That's quite interesting, right? We haven't thought about that very often. Those topics have been raised by members of the working group, so if you want to raise a topic, you can do it. The WebRTC working group has a statement that defines its mission, and in that fundamental statement they try to list the use cases they want to reach. That also gives a glance at what the WebRTC API could be in the future, or what could be done through that API. So for example, okay, file sharing, that's a great use case.
You've got, sorry, I don't know how to say it, the so-called funny hats use case, which brings a lot of technical challenges, and also means bringing resources from the browser to get involved in the WebRTC protocol and the rendering of the videos. And you've got those things that are trendy: all these things about analysis of the voice and also the image. They call that face detection, but we can maybe take a broader perspective and say, well, how do we do the analysis? How do we bring value to what goes through the channel offered by WebRTC? How do we add some intelligence on top of that? So that's in the roadmap. You might already do that in your application, trying to make sure that it's quite stable using the APIs provided by your browsers, like Chrome, Chromium, or Firefox, or, I don't know, Safari, and trying to make sure that it's quite reliable. I mean, you try to cover the gaps. So you know that if it's still in discussion as a standard, it might move in the future. All right, we've got other use cases like low-latency broadcast with P2P relays. That was quite a topic, I think, and it's a very big technical challenge. But you know that it's something that might be addressed by the working group in the near future. So how do you get involved, how do you engage with the working group? That's kind of the API of the working group. They've got a website. There are plenty of things on it. It's an institution, right? It's an organization for standardization, so there is quite a massive amount of text. But there is also a monthly discussion where you can just drop by. It's broadcast on YouTube, so you can drop by and listen to the topics that are currently discussed. They are quite technical topics, right?
But it's very interesting to see how it works, the process of getting to a consensus. You've got some samples that you can try for the future implementations. You've got those topics that are still in discussion, and toward the end of the process they try to release some samples of what the standard should look like when you use it in the browser. Or not only in the browser, but most of the samples are for browsers. And finally, there is a massive number of working group repositories. They are mainly repositories for handling the modifications of documents: as you want to progress a standard, you need to agree on a formal documentation of the standard. In fact those repos are mostly used for making sure that everyone who has an issue can raise a point, and then the point can be discussed and a proposal can also be put down through those repos. If you are only a participant, not a member, you can go into that discussion. Okay, you can propose things. Your voice is not as powerful as if you were a member, but you can drop your ideas and have them discussed. They're quite open to different points of view. And especially when you are a user of those technologies, they are keen to have some feedback on what they're proposing. How to engage: there is a massive event from the W3C that is happening in Sevilla, which is quite near. I mean, it's quite the place. It will be in September next year. And there is also a mailing list and an IRC channel. How to contribute: they encourage you to use GitHub for any suggestion. They want to track everything.
It's a distributed, decentralized working group, so they want things to be written down so that they are precise and can be discussed afterwards in a meeting. You can propose new topics if you want. For example, I talked about starting something about energy consumption in WebRTC. It might be good to understand how we impact the energy consumption of all the client devices, and maybe also of the networks in between, to understand what our impact is when developing new features, and to try to get that into the balance when you choose to add new features. And if you want to get more info... Yeah, that's the last slide. Sorry, so I'm not from the W3C, okay, just for you to know. But you can reach out to those people; there are dedicated staff for the WebRTC working group. They are very friendly. They are open for discussion. They can also bring you on board. So just reach out to them, and you can find these slides in my GitHub account. Thank you. Hey, just in time. Okay. Join me every Thursday at the WebRTC Wildcard show. It's on Twitch. Thank you. Any questions? Me myself? No, nothing. In fact, as far as I know, I went through a lot of reports of what is happening in the working group and I've not seen much about decentralized messaging. But if you want to start something about that, they are okay with getting some input. Any questions? Thank you. Thank you. |
Media Streaming Mesh
Real-Time Media in Kubernetes |
My name is Giles Herron. I work for Cisco, but let's try and forget about that for a moment, because this is open source; this is not your regular, traditional Cisco stuff, hence pitching it here. So the project's called Media Streaming Mesh; I'll give links to it at the end. I've been running it for a while, but it's kind of a skunkworks. Sorry, it's really not loud enough? Oh yeah, it slipped right down, hasn't it? I'll put it here, and apologies if I'm a bit super loud on the recording. Yeah, so kind of skunkworks: it's really me and a couple of developers now. And really where it comes from was that we were looking at Kubernetes. In the group I work in, everything we do really is cloud native, and so when I took a look at Kubernetes, my main impression was that here was something very much based on supporting web applications. Even if you look at liveness checks in Kubernetes, they're TCP or HTTP. And so I was like, well, okay, so how would we do real-time media in that? One way I look at it is to say, well, you can divide the world into two-by-two matrices, whatever you're looking at. If you look at apps, you've got real-time and non-real-time, and you've got interactive apps and streaming apps, which are sort of pub-sub stuff. And really, back to Kubernetes, it's very much in that top-left corner: it's all about web apps. So how do we do real-time apps? That was where the project started, initially looking at anything real-time, so including online games, or it could have been stock trading or whatever. But then in terms of focus, we decided: let's focus on real-time media, and by that really anything RTP-based, hence being here. So that could be WebRTC, it could be SIP, but equally it can be RTSP, it can be live video, that sort of thing. And so then we said, well, how would we support this in Kubernetes today?
So one of the big things at the moment in Kubernetes is service meshes. So who here has played with service meshes at all? Anyone? One or two? Yes. So what the service mesh does is it terminates your TCP or HTTP connections at each pod, and so rather than routing across your Kubernetes cluster, you kind of go hop by hop through web proxies. So you get security, you get good stats, that sort of thing, at the price of complexity. The challenge is that these service meshes work for TCP apps, and they're great for HTTP apps because you can go in and do URL routing, that kind of stuff, but they don't support UDP and they certainly don't support real-time media apps. So the next thing to say is, well, what if we don't bother with a service mesh and we just use regular kube-proxy and NodePort? So this is the standard NAT that Kubernetes uses. What Kubernetes typically does is give a service a persistent IP address. So rather than relying on DNS, where obviously things can get cached, you put a persistent IP address on a service and then you have ephemeral IP addresses for the individual pods that support that service, and the way you get from one to the other is NATting, which, for those of us who are network people, makes us throw up a little in our mouths. These do work, again, for the basic services. I put that in yellow because there's a challenge, which is that typically when you expose things externally from your cluster, you use these high port numbers in NodePort, which is pretty messy. You want to use well-known ones, of course, or you'd prefer to. But they don't support real-time media. And I guess this is no surprise to anyone in this room, but if you're doing things like SIP and RTSP, what you'll typically see is that there'll be a control channel. It can be TCP, it can be UDP, but it will negotiate the media ports.
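To make the port negotiation concrete, here is a minimal sketch of reading the media ports out of an RTSP `Transport` header, the place where RTSP carries them during `SETUP`. The header value and the helper names (`parse_transport`, `port_range`) are made up for illustration and are not taken from the project; the point is only that the ports travel inside the control channel's payload, where a plain NAT never looks.

```python
def parse_transport(value: str) -> dict:
    """Split an RTSP Transport header value into protocol, flags and key=value params."""
    fields = [f.strip() for f in value.split(";") if f.strip()]
    parsed = {"protocol": fields[0], "flags": [], "params": {}}
    for field in fields[1:]:
        if "=" in field:
            key, val = field.split("=", 1)
            parsed["params"][key] = val
        else:
            parsed["flags"].append(field)
    return parsed


def port_range(text: str) -> tuple:
    """Turn '8000-8001' into (8000, 8001); a single port 'p' becomes (p, p)."""
    low, _, high = text.partition("-")
    return int(low), int(high or low)


# Hypothetical header value a server might return in its SETUP response.
reply = "RTP/AVP;unicast;client_port=8000-8001;server_port=9000-9001"
transport = parse_transport(reply)
client_ports = port_range(transport["params"]["client_port"])
server_ports = port_range(transport["params"]["server_port"])
print("client RTP/RTCP ports:", client_ports)
print("server RTP/RTCP ports:", server_ports)
```

A NAT that only rewrites the addresses and ports of the control connection itself has no way of knowing that these additional UDP ports were agreed on, so it has no mapping for the media flows.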
So the challenge there is that those media port negotiations are completely invisible to the NAT, and so that will break if you try and deploy it this way. So what I've seen some people in the industry do, people deploying conferencing solutions onto Kubernetes, is use host networking. And that just works, right? You put everything in the host namespace, everything's good, there's no NAT, we're all happy. Okay, so why not do that? The big issue is, oops, you can only put one media pod on each node, because if you want to have multiple media pods all exposing the same port, well, you've only got one instance of that port on the host. So that's okay if you're going to run on VMs: you size your VMs down to the right size for one media pod, and there's the cost of having many more nodes in your clusters. But if you're deploying onto bare metal, that's going to suck. So what we came up with in Media Streaming Mesh was to say: let's focus on this real-time media, let's support multiple media pods per node. Let's also maybe support the other services, either directly or in combination with a service mesh. But yeah, let's really make this our focus. And what was I trying to do? Really take everything that we get from service meshes in Kubernetes, so that's really good observability and security: you can encrypt stuff when it goes out of the node, you can check all your performance metrics and that kind of stuff. That's great for debugging, troubleshooting, securing your network. But what we wanted to do was provide the lower latency of actually proxying at the UDP or RTP layer, not TCP, so not terminating TCP, and we wanted it to be really light.
So we looked at how we decompose it so that we can put this right out at the edge of the network, because some of our use cases are people saying, well, we've got a camera in a coffee shop, and we want to stream that somewhere, and we want to run it on a little node running K3s, like a little Raspberry Pi or something. So can we get this down to a small enough footprint? So in terms of use cases, obviously real-time collaboration. We haven't yet done WebRTC, that would be my disclaimer. If anyone wants to help us port it into this, that would be great; you could take something like Pion and port it into this framework. We're working on contribution video, which is very high-bandwidth video, typically in TV production studios, that kind of stuff, and what those people want to do is fairly nuts: you can have something like 4K uncompressed video, which I think is about 12 gigabits per second, and you can't drop a packet and you can't jitter. So probably we have to do something special on the network there, so we're working on things like zero copy from one pod to another, and then using Intel's DPDK out to the network. There are some challenges there, I guess, in that normally in Kubernetes what you hand between pods is IP packets. In this case, we'd be handing around raw video frames, so it's a bit different. And then as well as that contribution side, there's the distribution. If I'm watching football, I don't want it to lag. How do I get the live video feed at least out to the CDN caches, to minimize the lag, but possibly even going RTP right out to the user? I guess some of the challenges there are what protocols we use. Do we do RTP over QUIC, or do we look at the other stuff that's going on? I don't know if anyone here has seen that the IETF is at the moment working on Media over QUIC. We had an interim a few days ago. That won't necessarily use RTP, because QUIC gives you a lot of what RTP gives you anyway. So that might be the solution there.
But as I mentioned earlier, there's this sort of retail or industrial edge, where you've got large numbers of cameras in one site. I was in Las Vegas for Cisco Live earlier this year, and you walk into a casino, and if you've ever seen how many cameras there are in a casino, it's just nuts; they're watching you from every possible angle. Or it could be a coffee shop with one camera, but you've got a thousand coffee shops, whatever. As I say, we've kind of dropped the non-RTP stuff, but there is obviously scope to address that in the future. So in live video, the contribution stuff, one of the challenges is that a lot of this stuff is actually hardware, not software. A video camera is a real hardware thing, and a mixing desk might be a real hardware thing. So how do we integrate that? Interconnecting coders in the cloud, that seems like a really obvious use case. Then the distribution of live streams, but also potentially distributing right to the client. But as I say, video surveillance seems pretty easy and tractable, and that's what we're demoing now. So in our initial implementation, we have RTSP, and we can stream stuff from cameras and replicate it to multiple places. So the classic use case: you might be saying, well, okay, cameras are cheap, humans are expensive. If I'm in a casino and I've got 100,000 cameras, to pick a crazy number, I don't have 10,000 people who each watch 10 screens, because that's going to be too expensive. So maybe I only have 10 people, but then we need to have a way that if some kind of machine learning algorithm spots that there's something that shouldn't be happening, or thinks there might be, then at that point a human can start looking at it. And the other great thing, of course, with going through proxies is that the proxies do a lot of the replication for you. If you have one proxy per node, that can be our replication point. And one of the other challenges with Kubernetes is that it really doesn't do multicast.
And so the multicast solution isn't really tractable. And in fact, in this environment, you probably don't want to do multicast. I know that today that's what people mostly deploy, but it's a very odd multicast setup. Imagine you're an airport and you've got 10,000 cameras, but each camera is only being watched by, like, one app at all times, and maybe one or two humans, whatever it might be. That's a very odd multicast deployment, to have 10,000 multicast groups where each one's only got a couple of subscribers. So maybe proxies are an easier way to do it. So how have we built it? So, the software architecture: we have a whole bunch of components, and what we try to do is decompose it into what we put where in Kubernetes. Services run sort of one per cluster, DaemonSets run one per node, and then potentially you can have stuff running in the application pod, but we try and keep the footprint there very low. So the initial thing is, how do we put anything in the pod? How do we make sure we intercept traffic? What we have there is an admission webhook and a CNI plug-in. When a pod gets created, we have an annotation that we put in the YAML file for the pod that says, OK, this is one of our pods. The admission webhook will intercept that, and when the pod gets instantiated, it will have our stub in the pod as well as the app. But we also then have a chained CNI plug-in, and actually in the network devroom, where I was just now, somebody was literally talking about chained CNI plug-ins. The idea is that you run whatever normal network you want for Kubernetes, and all this little plug-in does is add in the iptables or eBPF rules that we're going to use to redirect the control plane traffic into our stub, and in some cases to redirect the actual data plane traffic. And of course, once it starts, the traffic gets redirected and everything's good. So for our control plane, we wanted to build it as one per cluster to minimize footprint.
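As a rough illustration of the admission webhook step, the sketch below builds the `AdmissionReview` response a mutating webhook could return: if the pod carries an opt-in annotation, it attaches a JSON Patch that appends a stub container. The annotation key, container name and image are hypothetical placeholders rather than the project's real values; only the overall `admission.k8s.io/v1` response shape follows the Kubernetes API.

```python
import base64
import json

# Hypothetical values, for illustration only.
ANNOTATION = "example.com/media-mesh-inject"
STUB_CONTAINER = {"name": "media-stub", "image": "example.com/media-stub:latest"}


def mutate(review: dict) -> dict:
    """Build an AdmissionReview response, injecting the stub into annotated pods."""
    request = review["request"]
    pod = request["object"]
    annotations = pod.get("metadata", {}).get("annotations") or {}
    response = {"uid": request["uid"], "allowed": True}
    if annotations.get(ANNOTATION) == "true":
        # JSON Patch: append one more container to the pod spec.
        patch = [{"op": "add", "path": "/spec/containers/-", "value": STUB_CONTAINER}]
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


# Example request, trimmed to the fields the sketch reads.
incoming = {
    "request": {
        "uid": "1234",
        "object": {
            "metadata": {"annotations": {ANNOTATION: "true"}},
            "spec": {"containers": [{"name": "camera-app", "image": "example.com/app"}]},
        },
    }
}
outgoing = mutate(incoming)
decoded_patch = json.loads(base64.b64decode(outgoing["response"]["patch"]))
print(decoded_patch)
```

Pods without the annotation are simply allowed through unchanged, which keeps the footprint off every workload that has not opted in.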
Today it basically calls out to the Kubernetes API and CoreDNS, so if you're connecting in and you're saying, OK, I'm going to this URL, we want to figure out which active running pods support that URL, and that's what that piece does. So effectively, your RTSP session starts, say, from your app, and the stub intercepts that TCP connection. It uses gRPC to pump the data messages, the actual payloads of the RTSP control plane, into our control plane, and then we use gRPC again to program the proxies. We've written it in Golang. Initially we've got RTSP, and we used gortsplib, because again, why build it from scratch? We took the Go RTSP library, and I'm guessing for any other plug-in we'll do the same: we'll just look for existing libraries in Golang that we can plug in. What we'd like to do, though, and this is up for discussion, so it'd be great if people had feedback at the end or hit me up offline: what I feel is that there are actually two different things we're doing here. On one side we've got this dynamic protocol, whatever it is, RTSP or SIP, but on the other side we've got the question of how we map this stuff into Kubernetes. If you think about it, the handoff from one to the other is a logical graph saying we've got this sender for a stream, and here are our receivers. What we want to do is decompose that and say, how does that map onto Kubernetes? Which receivers are on which nodes? How do we build that tree where we're doing the kind of application-layer multicast? And so I think ideally we'd want to separate those two. The control plane will have the plug-in that we put in for RTSP or SIP or whatever, but then the controller will take care of mapping that onto Kubernetes. And the nice thing then is that we could use that controller to support multiple control planes, but we could also use it in non-Kubernetes environments. So firstly, we've externalized any Kubernetes dependencies using xDS, which is the Envoy protocol for configuring Envoy.
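The mapping from a logical sender/receiver graph onto nodes can be sketched in a few lines. This is an illustrative assumption about how such a controller might work, not the project's code: receivers are grouped by the node they run on, the sender's node forwards one copy of the stream to each remote node, and the per-node proxy fans it out locally. Function, pod and node names are invented.

```python
from collections import defaultdict


def build_replication_tree(sender_node: str, receivers: dict):
    """Map a logical 'one sender, many receivers' graph onto nodes.

    receivers maps receiver pod name -> node name.
    Returns (remote_nodes, local_fanout): the nodes that need one copy of the
    stream from the sender's node, and the pods each node's proxy replicates to.
    """
    fanout = defaultdict(list)
    for pod, node in receivers.items():
        fanout[node].append(pod)
    remote_nodes = sorted(node for node in fanout if node != sender_node)
    local_fanout = {node: sorted(pods) for node, pods in fanout.items()}
    return remote_nodes, local_fanout


# Hypothetical placement: a camera feed on node-a, four receivers on three nodes.
receivers = {
    "ml-detector-1": "node-a",
    "ml-detector-2": "node-b",
    "operator-ui-1": "node-b",
    "recorder-1": "node-c",
}
remote_nodes, local_fanout = build_replication_tree("node-a", receivers)
print("copies sent between nodes:", len(remote_nodes))
print("per-node fan-out:", local_fanout)
```

With this layout only two copies cross the network even though there are four receivers, which is the application-layer multicast effect the per-node proxies are meant to provide.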
But also, if you think about it, why does it have to be Kubernetes? If you have remote edge proxies, so you've got a global network with edge proxies as your media proxies, why couldn't you control those from a more centralized place? The stub, as I say, is a stub. So why do we call it a stub? Because it's small. We wrote this in Rust, using Tokio and Tonic and all that stuff. That keeps the memory footprint low and it avoids any latency spikes, because we figured that if garbage collection kicks in, that's going to be a problem, but I didn't want to be writing in C in 2023. It does intercept the control plane, but there are some cases where it intercepts the data plane as well. RTSP is an example: there's an option to stream everything over one TCP socket. So what we can do is capture all that traffic and then send it over UDP through our network, so we can do all the replication and everything. But there might be other cases where, for example, you want to monitor right at the pod, or, again for TV-type stuff, you want to do your live video replication right from the edge because you don't want any shared paths, so that you don't risk having dropouts. And I think that's about it on that one. So, the RTP proxy now. I guess this is pretty straightforward. The one we have at the moment is written in Golang; it's just a prototype. I intend to throw that one away and again go to asynchronous Rust. The big thing here is, back to the control plane, I was talking about having plug-ins for the protocols. My thesis is that for this project to succeed, we need to make it easy for people to contribute. You shouldn't have to read all of my codebase to contribute. What you should be able to do is say, okay, I've just got this one plug-in I want to put in for my control plane protocol, and here's a well-defined API that I can plug it into.
But when it comes to the data plane, what we're thinking is to use Wasm for our plug-ins, so that if you've got a plug-in that does, whether it's encryption or whether it's validating the RTP header fields, whatever it is, you should be able to just plug that in, again with a very simple API, into a filter chain that's built dynamically in the proxy. Now obviously, modulo the issue of performance, again, back to zero copy and all that stuff: do you really want to be copying each packet as you pass it down a filter chain? So that's something we'll probably have to think about. But as I say, I think really the key to success would be to drive a filter ecosystem where people can contribute their own filters. I did a bit of looking around and came across Quilkin, which is a Google for Games project. It basically does proxying for games. And with that, the only way to modify it was to really get into the code. I don't want people to have to get into the code here. I want to have a really simple API. So how it would ultimately work in this case is that we'd have this framework of the proxy, and we'd have the filter chain. We strip off whatever the headers are; it could be typical RTP over UDP, or it could be QUIC, or it could be raw RTP within the node. So we strip off any headers and then we just pass it through a filter chain where each part of the filter chain does its role. As I say, the key challenge there is going to be how we make that perform at scale. So I think that's about it. And yeah, the goal here is that we can really deploy real-time media apps in Kubernetes and make it work, which doesn't seem to work so well today. It's very much a work in progress. It's open source, it's all there. I don't think I need to stick a new video on the website, but the GitHub's there. And really, anyone who wants to contribute, you know where it is, because as I said, I think more people is going to help us get there faster.
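The filter-chain idea can be illustrated with a small sketch. It is written in plain Python rather than Wasm, and the filter API (a function that takes a packet and returns it, or `None` to drop it) is an assumption made for this example, not the project's actual interface. The RTP checks themselves follow the fixed RTP header layout: a 12-byte minimum header whose top two bits carry version 2, and a 7-bit payload type in the second byte.

```python
import struct
from typing import Callable, List, Optional

# Assumed filter API for this sketch: return the packet to pass it on, None to drop it.
Filter = Callable[[bytes], Optional[bytes]]


def validate_rtp_header(packet: bytes) -> Optional[bytes]:
    """Drop packets shorter than the fixed RTP header or with a version other than 2."""
    if len(packet) < 12:
        return None
    if packet[0] >> 6 != 2:
        return None
    return packet


def make_payload_type_filter(allowed: set) -> Filter:
    """Build a filter that only passes the given RTP payload types."""
    def payload_type_filter(packet: bytes) -> Optional[bytes]:
        payload_type = packet[1] & 0x7F
        return packet if payload_type in allowed else None
    return payload_type_filter


def run_chain(filters: List[Filter], packet: bytes) -> Optional[bytes]:
    """Pass a packet through each filter in order, stopping at the first drop."""
    current: Optional[bytes] = packet
    for apply_filter in filters:
        current = apply_filter(current)
        if current is None:
            return None
    return current


def make_rtp(version: int, payload_type: int) -> bytes:
    """Build a minimal RTP packet for the demo (sequence 1, timestamp 0, made-up SSRC)."""
    first_byte = (version & 0x03) << 6
    return struct.pack("!BBHII", first_byte, payload_type, 1, 0, 0x1234) + b"payload"


chain = [validate_rtp_header, make_payload_type_filter({96})]
good = make_rtp(version=2, payload_type=96)
print(run_chain(chain, good) is not None)             # valid packet passes
print(run_chain(chain, make_rtp(1, 96)) is not None)  # wrong version is dropped
```

Note that this sketch passes the same `bytes` object along the chain without copying; whether a Wasm-based chain can avoid per-filter copies is exactly the performance question raised in the talk.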
I think if we firstly get the architecture right, and then make it easy for people to contribute, then hopefully we can scale this. So do please join in. So that's that. Thank you. And yeah, do ask any questions while the next person's coming up. Can you repeat the question? Can you say something about the quality of the project? Yeah, so at this point, very much beta. I'm doing some integration work at the moment with another team within Cisco that wants to use this for something. So I guess that's where we'll start to really shake the bugs out. But that again is RTSP, so, you know, we'd really be interested in other people contributing other protocols. The integration with other open source projects, like the Melio.cs, do you think it will change for these? Yeah, with other open source projects. I think in terms of how it would integrate, I guess it's down to that project and how it fits, because we've very much separated the control plane and the data plane. So if the other projects have also done that, then I guess that control plane could plug into what we're doing. Again, today it would have to be Golang, because that's what our control plane is written in. But even for that, we could look at other models to plug it in, or at least, if the APIs to our data plane are clean enough, then equally you could just contribute the whole thing as a blob if it wasn't Golang. |
Modernizing Authentication and Authorization in XMPP
It's time to forget your password... |
Okay, so yeah, my name is Matthew Wilde and I'm going to talk about... it's hopefully not too technical a talk. The topics are technical, but I'm trying to keep it general. So who am I? I founded the Prosody XMPP server. XMPP is an open chat protocol, so the idea is that you can choose the software that you use to chat with and you can choose your service provider, and the providers federate using a server-to-server protocol. So this is like some other open federated networks. There's email, which we're all familiar with, where you can choose your provider and choose your software; there's the phone network, which kind of works; and many of you here have probably heard of Matrix, which has very similar goals to XMPP, where we have an open protocol, and we're doing federation, and we're bridging to proprietary networks. So Prosody is an XMPP server that you can self-host; it's all open source. Snikket is a newer thing, which is kind of an all-in-one XMPP setup. It's more like a self-hosted WhatsApp. I actually created it for my family, because they were still using WhatsApp, even though I'd been working on XMPP for a long time. And so yeah, Snikket has apps and stuff that's all just working out of the box, with voice and video calls and things. As part of this, I worked on Modern XMPP, which is a set of guidelines, UI guidelines, because at the XMPP Standards Foundation, of which I'm one of the directors, we publish protocols, and we say, this is how you send a file, this is how you send a chat message or make a call, but we don't say, this is how you should structure the UI. So I wanted to bring some consistency and some good guidelines and help developers with that. So yeah, I'm also part of the XMPP Standards Foundation. I'm the executive director, I work on the board, but I've also been on the technical council, and so yeah, I'm involved in a lot of XMPP things.
So this talk is focusing on something that I had a grant for from NGI Assure via NLnet, and it was to work on modernizing XMPP authentication and authorization. So, authentication: you start out connecting to the server, and you then have to prove your identity to the server. You can't just say, hey, I'm Matthew, because every TCP connection has to be authenticated somehow. So how do we do that? Traditionally, you make the connection, you send a username and your password, the server tells you if it's correct, and then you can proceed to do authenticated stuff. This is actually how the web works, pretty much. You have this HTML form, you put in your username, you put in your password, your password gets sent to the web server, and the server verifies it. Usually, I mean, if it's a good place, the password is hashed on the server side, so they hash the incoming password and compare it with the hash that they have stored. So XMPP uses a standard authentication protocol called SASL. It's actually used by a bunch of different protocols, and there's currently work to try and implement it in HTTP as well. SASL defines a bunch of mechanisms, and the mechanism says what you send. The simplest one is probably PLAIN, and this is exactly what we just saw with the, you know, "Hi, I'm Matthew, my password is...", and the web is very similar. You're just sending a username and your password. And so sending passwords across the wire is absolutely fine because of all these reasons: nobody ever reuses passwords, and they are, you know, frequently rotated and updated, and they never contain personal information, so if they're leaked, then, you know, there are no bad consequences.
And, yeah, they're also never reused across services, which means this is just great, because if passwords ever do get leaked, and those hashes are maybe, you know, brute-forced, then no one gets access to any service other than the compromised one, which was already compromised anyway. Okay. Yeah, that was just a joke. So in XMPP, we don't really use PLAIN. We use another SASL mechanism that someone defined, called SCRAM. It's not just, hey, my password is...; it's a challenge-response thing, so there's a bit of magic going on with hashing, and it has some really nice features. It does involve multiple round trips, so, yeah, you're going backwards and forwards, but this buys you that the client and the server can both store only hashes. Previously, we couldn't have the client store a hash, because it has to send the raw password for the server to hash. If you only send a hash, then the hash becomes your password, which is kind of weird. So SCRAM has multiple iterations of hashing. It allows the client to store a hash. It allows the server to still store a hash, and only hashes are exchanged over the wire.
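The hashing "magic" in SCRAM can be sketched with the Python standard library. The key derivation below follows the SCRAM-SHA-256 scheme (PBKDF2 for the salted password, then HMAC-derived client, stored and server keys); the `auth_message` is a stand-in for the concatenated handshake messages that real SCRAM signs, and the password and parameters are made up. It is a sketch of the idea, not a complete SASL implementation.

```python
import hashlib
import hmac
import os


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def derive_keys(password: str, salt: bytes, iterations: int):
    """Derive the SCRAM keys. The server only ever needs stored_key and server_key."""
    salted_password = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    client_key = hmac_sha256(salted_password, b"Client Key")
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac_sha256(salted_password, b"Server Key")
    return client_key, stored_key, server_key


# Registration: the server stores salt, iteration count, stored_key and server_key.
salt, iterations = os.urandom(16), 4096
client_key, stored_key, server_key = derive_keys("correct horse", salt, iterations)

# Placeholder for the handshake transcript (nonces, salt, etc.) that both sides sign.
auth_message = b"n=matthew,r=clientnonce,r=clientnonce-servernonce,..."

# Client side: prove knowledge of client_key without revealing it.
client_proof = xor_bytes(client_key, hmac_sha256(stored_key, auth_message))

# Server side: recover the client key from the proof and compare its hash.
recovered = xor_bytes(client_proof, hmac_sha256(stored_key, auth_message))
client_ok = hmac.compare_digest(hashlib.sha256(recovered).digest(), stored_key)

# Mutual authentication: the server signs too, and the client checks the signature.
server_signature = hmac_sha256(server_key, auth_message)
server_ok = hmac.compare_digest(server_signature, hmac_sha256(server_key, auth_message))

print("client authenticated:", client_ok)
print("server authenticated:", server_ok)
```

Neither the password nor the salted password crosses the wire, and a client can keep only its derived keys, which is what lets both ends store hashes.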
It's pretty magic. The mutual authentication part means that at the end of the authentication exchange, the server has authenticated the client and proven, yes, this person originally had the password, and they are who they say they are. But importantly, it also allows the client to verify that the server knows the original password. In the past, with the PLAIN mechanism and on the web, the server can just lie and say, yeah, I have your password, carry on, send me more sensitive information. So we have this mutual authentication: when you connect over XMPP and you use SCRAM, you have this verification that the server you're connecting to is the right one. And yes, we do have this with TLS, obviously, but there are certain cases where TLS isn't always reliable, and that's where channel binding comes in, which is a bit more magic. This binds your authentication, that mutual authentication stuff, to your TLS channel. So if you reach the end and the mutual authentication checks out, but you find a little mismatch, this TLS magic can tell you that there is actually someone listening in on your TLS connection. That can be because, for example, your certificate authority was compromised or whatever, or someone installed a different trust root on your system without you knowing, and we can actually detect this, and it's pretty smart. All this security comes at a cost. Obviously we just talked about why it's necessary, but it's also still password-based. So what can we do?
So there's been a lot of interesting development in the web ecosystem in recent years. They've tried fixing stuff, and it's basically hard. Users are always going to be users; they're always going to choose memorable passwords. There has been some progress: there are password managers and so on, but although they're best practice, they're not widely used, I mean amongst normal people. Probably everyone here, I hope, has a password manager. So WebAuthn and FIDO2 are basically a combination of things that allow the browser to do some special stuff and help with the authentication. You can do that with an external hardware token, but these days browsers are also supporting TPM chips inside the hardware, which allows you to link that authentication securely to a single device. And passkeys are like Apple's thing that they're really pushing, which is based on all this, and allows you to basically create an account without a password and authenticate using this special key that is only on your device, except it's also synchronized via iCloud, so you can access your account from all your devices without ever needing a password, that is, as long as you can access your iCloud account. Now, that's just one implementation; there are other things, WebAuthn and FIDO2, and it's all based on open standards. But XMPP uses SASL, which is focused on passwords. So what can we do? I've been working on this new mechanism in XMPP, which is token-based, and it builds on some earlier work which introduces a new SASL mechanism, or a family of mechanisms, which allow you to exchange a hash of the token over the wire. So we're not sending the raw token, so it's a bit SCRAM-like in that sense. It still provides mutual authentication, and it still supports channel binding, so you still have all those nice features of SCRAM. It is a single round trip, so there's no back and forth like with SCRAM. The things that we are weakening in that sense don't matter, because the tokens are not passwords.
And so although there is a slightly reduced level of security around the token compared to SCRAM, the tokens are temporary, so if they get leaked, you can easily revoke them and rotate them, and they are unique to that service. I would hope that if a service is compromised, you know, they're obviously going to revoke all their tokens straight away; it's harder to get users to reset all their passwords straight away. So there are many benefits to using tokens, and we still get all the nice features of SCRAM, but users aren't going to generate tokens and enter them themselves. This opens the door to two-factor authentication in XMPP as well. Previously we've had this problem where you can kind of do two-factor authentication, but every time you drive through a tunnel, your XMPP app is re-authenticating on the other side, because it's reconnecting to the server and has to re-prove who it is. If it uses the password, then the server is going to say, well, you know, you have the password, but the whole point of two-factor authentication is to make the password not enough, because of all the weaknesses that passwords entail. So if you authenticate with a token instead, then the server knows it issued this token once, it issued it to that device, and it knows who you are, and there's a higher security guarantee around that. So by using the new SASL mechanism, servers will see that you're authenticating with a secure token, and they won't send the two-factor authentication prompts that they usually send. This is basically how two-factor auth on the web already works.
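A minimal sketch of what "exchanging a hash of the token" with channel binding can look like is shown below. It is loosely modelled on the hashed-token idea described above: each side sends an HMAC keyed with the token over a role label plus the channel-binding data from its own TLS connection. The labels, function names and byte values here are illustrative assumptions; the real wire format is defined by the SASL mechanism specification, not by this example.

```python
import hashlib
import hmac
import secrets


def initiator_proof(token: bytes, channel_binding: bytes) -> bytes:
    """What the client sends: a keyed hash over its view of the TLS channel."""
    return hmac.new(token, b"Initiator" + channel_binding, hashlib.sha256).digest()


def responder_proof(token: bytes, channel_binding: bytes) -> bytes:
    """What the server sends back, proving it also knows the token."""
    return hmac.new(token, b"Responder" + channel_binding, hashlib.sha256).digest()


token = secrets.token_bytes(32)             # issued by the server after a full login
tls_binding = b"channel-binding-data"       # stand-in for data taken from the TLS session

# Normal case: client and server see the same TLS channel.
sent = initiator_proof(token, tls_binding)
accepted = hmac.compare_digest(sent, initiator_proof(token, tls_binding))
reply = responder_proof(token, tls_binding)
server_verified = hmac.compare_digest(reply, responder_proof(token, tls_binding))

# Intercepted case: the client's TLS session actually ends at an attacker,
# so the binding data the client hashed differs from what the real server sees.
relayed = initiator_proof(token, b"attacker-facing-channel")
relay_accepted = hmac.compare_digest(relayed, initiator_proof(token, tls_binding))

print(accepted, server_verified, relay_accepted)
```

The raw token never appears on the wire, one message in each direction is enough, and a proof computed over one TLS channel is useless on another.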
You fill in that web form or whatever, maybe you're using passkeys, but once you do that initial authentication step, the web service is going to send back a cookie that gets stored in your browser in plain text, and with every request, yes, it's going over HTTPS, but it's still sending that, you know, plain-text string, and it doesn't have all the protections of the channel binding and the mutual authentication that the FAST and SASL mechanisms support. So in this sense, using FAST over, for example, the new HTTP SASL stuff would be an interesting security improvement for many secure web applications. And so the other thing is that it opens the door to having passwordless accounts. Instead of exchanging your password for a token, you could exchange your password plus a second factor for a token, or you could do something entirely different. Something came up just now at the real-time stand downstairs: someone wants to do SMS authentication, so they verify via SMS, kind of like how WhatsApp or Signal work, and then you would just be given a FAST token, and then you can reconnect to the server using that. And that will last for as long as you keep your device active. If you have an inactive device, then that token will stop being refreshed, it will eventually expire, and you will have to reauthenticate using SMS or maybe some recovery mechanism. And once you've bootstrapped this passwordless account, then obviously you can add other recovery mechanisms as a backup if you need to. And yeah, that was kind of the summary of my talk. I hope there's still time for many questions if you are interested. So this talk is kind of a complement to a blog post that I wrote on the Prosody blog about all this stuff, but the blog post focused mostly on the performance optimizations, because that matters to people: they want to be reconnected to the server as quickly as possible, because of responsiveness and all this.
And so the blog post focuses on the optimization aspects of this, and today's talk focuses on the security aspects. There are some more XMPP talks coming up later on, I am also downstairs in the real-time lounge, which is just around the corner, and you can reach me on XMPP or email. Happy to answer any questions. Can you tell us where they overlap and where they differ? Can FAST be used in scenarios where JSON Web Tokens already exist, as something better, or are the two more divergent than that? It's pretty different. Yeah, sorry, so the question is ultimately: do JWTs, JSON Web Tokens, overlap with FAST tokens? FAST tokens are essentially opaque random strings of a good length, for security reasons. JSON Web Tokens also embed data inside the token. A server could do something similar and, when it issues the token, use a JWT instead. There's not really much benefit to that. JSON Web Tokens are still useful for some cases, definitely, but they have a bad reputation with regards to security. Yeah, it's complicated, but there's not really much overlap. They can kind of be used in the same situation, but not entirely. If you were doing a distributed network where you didn't necessarily want to have backend communication, could you authenticate a FAST token against one service, where the token contains information that could authenticate with a trusted system that's not sharing a backend? Yeah, absolutely, in any way that lets the server verify the token is valid. Sorry, the question is: if you were working on a decentralized system where the authentication system is separate from the place where the user is logging in, can you use a JWT in that situation? The answer is yes, you could use it. Two questions: are you attempting to standardize FAST within a standards body, and second, by what mechanism do disused tokens decay? The first question was, are we attempting to standardize FAST? 
Yes, so the SASL mechanism that it is based on is already a draft at the IETF; it's been going for a while. We had a meeting with the SASL working group at the IETF just last month, and they agreed that this is interesting and something they want to move forward with, because it is also useful for other protocols: the email ecosystem and many others. We are the XMPP layer of this, the whole FAST stuff. That is being standardized at the XMPP Standards Foundation, so if another protocol wanted to use that layer, they would have to define their own, because FAST specifically is XMPP-specific. They can copy how we have done it, but it has to be translated to a different protocol. The second question was, how do disused tokens decay? That is basically up to the server. There is an algorithm in the FAST specification, which is linked from the blog post, that tells you how to implement the server in a way that securely rotates tokens without having to check every possible token on the server, because we don't necessarily know the user's identity until we have verified the token. So it can be a bit complex, but essentially the server knows the expiry time of a token and when the token was last seen. Some interesting stuff came up with how to refresh tokens. If the client authenticates and you provide it with a new token and immediately expire the old one, which is one way of doing the rotation, there are cases where the client reconnected, used the old token, did not receive the new token, got disconnected, and then it gets logged out, basically, because it can no longer get access. So the server has to store the last token that the client used, and also the new replacement token that it's expecting the client to use next. And if the client never uses that token, then the server will eventually issue a new one and it works out. And then there's the moment you see it authenticate with the new token. 
That's when you expire the old one completely, and obviously there is a time limit to that, because otherwise someone could carry on using the old one indefinitely, and we don't want that either. So there are kind of two timeouts built in. Okay, excellent. Thank you. |
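The rotation scheme described in that last answer can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the algorithm from the FAST specification: the class and method names (`TokenStore`, `issue`, `authenticate`), the timeout values, the choice to start a rotation on every successful authentication, and the in-memory single-device storage are all invented for the example.

```python
import secrets
import time


class TokenStore:
    """Hypothetical per-device state: the current token plus a pending replacement."""

    def __init__(self, ttl=14 * 24 * 3600, grace=3600, now=time.time):
        self.ttl = ttl        # first timeout: lifetime of an unused token (assumed value)
        self.grace = grace    # second timeout: how long the old token survives a rotation
        self.now = now
        self.current = None   # (token, expires_at)
        self.pending = None   # (replacement_token, deadline_for_old_token)

    def issue(self):
        """Issue the very first token, e.g. after a password or SMS login."""
        token = secrets.token_urlsafe(32)
        self.current = (token, self.now() + self.ttl)
        self.pending = None
        return token

    def authenticate(self, token):
        """Return (ok, replacement): replacement is a token to hand to the client, or None."""
        t = self.now()
        if self.pending and secrets.compare_digest(token, self.pending[0]):
            # The client proved it received the replacement: expire the old token completely.
            self.current = (token, t + self.ttl)
            self.pending = None
            return True, None
        if self.current and secrets.compare_digest(token, self.current[0]):
            if t >= self.current[1]:
                return False, None              # token expired through inactivity
            if self.pending:
                if t >= self.pending[1]:
                    return False, None          # grace period for the old token is over
                return True, self.pending[0]    # client never got it: offer it again
            replacement = secrets.token_urlsafe(32)
            self.pending = (replacement, t + self.grace)
            return True, replacement
        return False, None
```

Keeping both the last-used token and the pending replacement is what prevents a client from being logged out when it disconnects before receiving the new token, while the grace deadline stops the old token from being usable indefinitely.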
OpenSIPS 3.3 – Messaging in the IMS and UC ecosystems |
So, hi everyone again, Liviu from OpenSIPS here, and I'm going to be quickly going through what has been happening in the last year and what is new in the latest OpenSIPS 3.3 release, which is pretty much focused on IM and RCS, plus a few extensions on both the IMS side and the UC, call center side. The reason why we went this route with this iteration, because we have one major release each year, is that we've talked to the community and also read some papers, like this one from Juniper saying that RCS is growing, it's getting more and more adoption, and the subscriber population is forecast to grow to at least two billion devices within the next few years, so why not make a few additions on this part. So here you can see, hopefully the font is not too small, kind of the entire ecosystem that we have in mind, where in OpenSIPS we make all these microservices catering to various components of the platform. There is the MSRP protocol, which I'm going to quickly go through, so we are seeing a relay there. There is also some gatewaying necessity for clients which don't support MSRP yet; they still need to be integrated with the platform. There is gatewaying to the IMS side, using RCS capabilities, and also the contact center. With IM, this is what we had: the simple MESSAGE SIP method, just a request and a reply, nothing much past that. But then there is also this MSRP protocol, which is not that new, it's 15 years old, but so far we've just treated it as something you could use other software for; there was also the MSRP relay project that you could use. Now there is the need to have this closer to OpenSIPS and gain more access, more control over the sessions, and that's why we went and implemented the protocol as well. 
So for example, this enables features like the RCS I was talking about. You could see RCS as a turbocharged SMS, so it's pretty much SMS with all of these nice capabilities like read and write receipts, file transfer, and photo sharing natively in the phone. You don't need an OTT app like WhatsApp or iMessage to give you all of that. That is the idea with RCS, and it's why Google is pushing it forward and why Apple is pretty much reluctant or neutral towards it, because they already have iMessage, so why also work on something that conflicts with your own application, right? So MSRP, there's not much time to go into it, but maybe we can take a look at a couple of requests and replies. Here you set up the session like a regular call, and if we look at the SDP, we get the source port that each side is advertising, and from then on they exchange these MSRP messages. They look kind of similar to SIP messages, and in the payload we see the messages, the chat part. And this is how the stack looks in OpenSIPS. It's kind of a three-layer architecture. Right at the bottom there is the protocol, and if you're familiar with OpenSIPS you know the other proto_ modules, your TCP, your TLS, your WebSocket and all of those; now there is MSRP as well. The first module we built on top of it is the relay, where, as the name suggests, there is not that much going on, but it also solves the problem of authentication, which in the MSRP protocol is the AUTH method, and it gives you some useful callbacks to supply the credentials, right? 
Also on the egress side it gives you the possibility to select the destination and that's pretty much about the relay, the user agent is a bit more interesting in the sense that you can interact with it in various ways, also first from the OpenSIPs configuration script you have this, I'm going to show you some call flows so you can get an idea of how it works but also through the management interface right which is HTTP based, so you get this, you can implement it from your web applications for example and initiate and control sessions that way, for example, okay so there are no animations, so here if we take them from top to bottom we can see an example of a web app that is obviously not MSRP enabled, talking to an MSRP enabled application like a web phone and on the app side you use MI, right, like HTTP invocations, like start session, send message or end session which get nicely converted on the outbound leg to SIP calls, right, it's a SIP call that then only has these MSRP mid-dialogue messages and on the other direction the SIP calls initiated by the MSRP phone get converted into events for the application, you can subscribe to it via whatever channel you want to receive them on, JSONRPC or stuff like that, so the next one was the gateway, right, the MSRP gateway that helps us include the classic SIP clients which are only capable of SIP message, right, they don't know MSRP but it does this nice conversion of like these simple request reply messages, it actually converts them to a session with the MSRP enabled phone, so these sessions are kept by open SIPs transparently and they are even closed, for example, since there is no way of knowing when it decides to stop a chat, for example, right, we just close them based on inactivity, we just time them out, whereas, so this is when the simple, one minute left, okay, okay, so that is pretty much on the gateway, the call center, some additions here, right, because now we got some more types of workloads or 
capabilities for the call center agents: they can handle both chat messages and voice. A cool thing about chats is that now they can do parallel work, right? You can't do two calls at the same time, but you can have four chat windows open. So there is this problem of balancing the incoming requests in the queue, right? You have calls and chats coming from various parts of the platform, and now there is this problem of correctly balancing them across your available agents based on their capabilities, right? Some of them may have chat enabled in their application, some of them not, and there are a few modes to control that. On the IMS side, of course, there is the possibility to build custom Diameter requests, and that's pretty much it on the UC side and on IMS and RCS. A couple more additions are on status and reporting, so now OpenSIPS is a lot more verbose with regards to how its internal state looks, and there are some additions to the TCP engine, also giving you control of TCP connections. That's about all I have here on OpenSIPS 3.3, and I would like to welcome everybody to the newly announced OpenSIPS conference this year, where we will be unveiling what's been going on with 3.4 and what we are working on on that side. With that, I'm not sure if there are any more questions, but thank you for your attention. |
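The balancing problem mentioned at the end of the talk, mixed calls and chats distributed to agents according to their capabilities, can be illustrated with a small sketch. It is not how the OpenSIPS call center module is implemented; the `Agent` class, the `assign` function, the least-loaded selection rule, and the limits of one call and four chats per agent are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    skills: set  # which kinds of work this agent has enabled, e.g. {"call", "chat"}
    max_sessions: dict = field(default_factory=lambda: {"call": 1, "chat": 4})
    active: dict = field(default_factory=lambda: {"call": 0, "chat": 0})

    def can_take(self, kind):
        """True if the agent supports this kind of work and has a free slot for it."""
        return kind in self.skills and self.active[kind] < self.max_sessions[kind]


def assign(agents, kind):
    """Give an incoming call or chat to the least-loaded capable agent, or return None."""
    candidates = [a for a in agents if a.can_take(kind)]
    if not candidates:
        return None  # nobody available: the request stays in the queue
    chosen = min(candidates, key=lambda a: sum(a.active.values()))
    chosen.active[kind] += 1
    return chosen
```

A real dispatcher would also release slots when sessions end and would probably stop offering chats to an agent who is on a call; both are left out to keep the sketch short.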
Build your own Real Time Billing using CGRateS |
Okay, so hello everyone. My name is Dan. I'm from the CGRateS project. Thank you for showing up. I will be pretty fast, so if you have any questions, please ask later, and if you don't understand something, the slides will be available later. So I'll just go through the slides in digest form. The company sitting behind the project is located in Germany, with some back offices in Romania and Albania. We did both wholesale and real-time retail business, so we understand by now what a system outage means. CGRateS is a real-time enterprise billing suite. It's designed to be pluggable into existing infrastructures. You can easily accommodate new services and new ideas. So it's not built only for telecommunication. You can extend it to new industries: IoT, electricity. We are going towards energy as well. So you can build anything you like. If you want to sell cars, you can just do it. And it should be non-intrusive in existing setups, so it should not make you change the way you are doing things. We are sharing information with your switch, your router, whatever infrastructure you are using over there. It's all open-source software. It was born in 2010 and we published the first sources in 2012. The sources are available on GitHub. It's 100% written in Go; we were one of the early adopters of Go. And we have nothing in private repositories. Of course, we appreciate community. It's performance-oriented, with three branches, all three supported. Our customers, like everyone in telecommunication, tend to be a bit conservative with upgrading. There's test-driven development, again because billing and data are very sensitive, and a modular architecture. It's quite feature-rich. You can find all this information on the Internet, so I don't have to market it to you. This slide is a bit complex, but I wanted to show it to you because it relates to the subject of my talk: how to integrate with your existing infrastructure. 
So on the left side here, you see quite a number of agents which we support. These are mostly developed by us. There are also other agents, like the OpenSIPS module, which is built into their software. So you can very easily build your own and replace any of our agents. What you will do in the end is send your API calls, because CGRateS is all about APIs. You will send them directly to our session module, which you can also see as the central point of entry. After that, you will reach our other modules or subsystems. They are also standalone API servers on their own, but you will be using them through our sessions module, where we implement easier integration for your stuff. So how do you do that? First, you have to load the data. This is data-specific, so you have to follow our format when loading your rating data and your accounting data, in case you are doing prepaid and postpaid. We also have data for some extra subsystems, but you will be mostly focusing on rating and accounting. After you are done with building your data, you have to understand how we support sessions. You can choose all of these steps or only one, which is the last one and the most important: the session CDR. So you can do billing in real time by sending us various messages, various APIs, or you can directly send us the CDR for billing it. So, for example, with session authorization, you have the opportunity to extract from the billing engine the maximum session duration, resource authorization, and various session properties; even the password can be retrieved from the engine side. You can also do session routing, because we also support LCR on our side. Then there is session start: when your sessions start, you tell us to start billing in real time, or to start debiting in increments. You can choose the way. For example, the mobile networks are using session updates via Diameter, so you can implement your own triggers for incremental debits. 
Or you can do what we are doing with open-source software; we support the likes of FreeSWITCH, Kamailio, and OpenSIPS: send us session start and session stop, and we will do the magic behind. And then there will be the session CDR, which can be standalone or can correct the session information from real time, so both will work. And these are some examples of APIs. If you want to implement this in your own application, like your own switching software or your own, I don't know, WebRTC application, all you have to do is send us these JSON-RPC blobs and we reply to you. For example, this one is replying with the maximum usage of a session. By the way, we use nanoseconds; you can also get back seconds if you want, but we want to be very verbose. And the same with initiation: you send us the information in your events. This is fully configurable and flexible, so you can add any number of fields inside. Same for session update and terminate, and in the end the CDR sample blob, same story, all API-driven. So this was fast. Thank you. |
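To make the JSON-RPC exchange a little more concrete, here is a hedged sketch of building an authorization request and reading back a maximum usage reported in nanoseconds. The slides are not part of this transcript, so everything specific is an assumption: the method name `SessionSv1.AuthorizeEvent`, the `GetMaxUsage` flag, the event fields, and the `MaxUsage` reply field should all be checked against the CGRateS documentation for the version in use. The envelope follows the JSON-RPC 1.0 style (a one-element `params` list) used by Go's standard codec.

```python
import json
from itertools import count

_request_ids = count(1)


def build_request(method, params):
    """Serialize a JSON-RPC 1.0 style request carrying a single params object."""
    return json.dumps({"method": method, "params": [params], "id": next(_request_ids)})


def max_usage_seconds(reply_json):
    """Extract the (assumed) MaxUsage field and convert nanoseconds to seconds."""
    reply = json.loads(reply_json)
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]["MaxUsage"] / 1e9


# Illustrative payload only: method and field names are assumptions, not taken from the talk.
request = build_request(
    "SessionSv1.AuthorizeEvent",
    {
        "GetMaxUsage": True,
        "Tenant": "example.org",
        "ID": "demo-auth-1",
        "Event": {"Account": "1001", "Destination": "1002", "ToR": "*voice"},
    },
)
```

The request string would be written to the engine's JSON-RPC listener; the transport is omitted here because the listener address and port depend on the deployment.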
Performance optimization for VoIP services |
Thank you, thank you for joining the talk, welcome to my lightning talk. I want to talk today about performance optimization for voice over IP services. If you want to go out, please do it quietly, thank you very much. Quickly to the agenda: just one example of how not to achieve great performance. This is a real-life customer example, and you will probably spot immediately what the problem is. Then some guidelines on how to approach performance problems, a few areas where you might want to look, and some general examples of tools that are interesting to use. Of course, in ten minutes it's not possible to go into an in-depth analysis of VoIP performance topics, but nevertheless I hope it will be useful for you. My name is Henning. I started a company some time ago; we provide services for real-time communication and work mostly with Kamailio. We also do a lot of other stuff, but as I said, mostly Kamailio. If you're interested in the new stuff that's going on for the upcoming Kamailio release, please have a look at the website kamailio.org. I didn't include it in this talk because there's not much time. 
To the example of how not to achieve great performance. This is a real-life customer example. We were called to debug it during Covid; of course a lot of communication platforms broke down during that time because of the increased demand. The customer needed to make a routing decision in a SIP proxy, in Kamailio, and what he did was basically use the exec module. The exec module is generally a bad idea. You can use it to execute code or scripts on the system, and here it was used to start a Perl script. The Perl script was then using a database layer in Perl to access a remote database. The database result would be reported back to Kamailio, Kamailio would parse it with some JSON operations and process the message. Of course this works if you don't have a large load, but as soon as you get a higher rate of concurrent calls on the system, it breaks down for obvious reasons, because for every call you start a Perl script. This is not going to work, this is not going to scale, and if you have latency on the database, all these Perl script invocations will take a long time and of course it will completely break down. Generally, how to address performance problems: if you are an experienced operator or experienced admin, this is probably no news for you. Nevertheless, most performance issues are not as obvious as in this example. You should formulate a goal: okay, I want to achieve that many concurrent calls, I need to support that many REGISTER messages on the platform, that many devices, I want to have, I don't know, 50,000 or 100,000 concurrently connected user agents over TLS, WebRTC, whatever protocol you are using. In the best case, of course, you have some statistics from production load; later we will see some presentations about statistics projects. Maybe you have incidents where the system broke down, or in the best case you have some performance test results. 
Generally speaking, if you have performance issues, we can cluster them into several performance-related areas, mostly related to machines or virtual machines. On the one hand you have the CPU. Kamailio in particular is really performant, so normally you don't have performance issues there. Asterisk is another story, FreeSWITCH as well. One frequent issue you might encounter is too large an overcommitment on your virtual infrastructure. Just keep in mind that a virtual core is not a physical core, of course. Sometimes you have issues with other services running on the system, such as configuration management, or maybe some VoIP monitoring or whatever you're also using on the system, which cause a lot of CPU congestion. Maybe you should adapt the Kamailio worker configuration; the defaults are usually fine, but nevertheless sometimes you need to adapt it. Related to memory: if you install Kamailio with the default installation, you definitely should increase the memory pool; the defaults are not really meant for production use. If you have a database, the normal tuning guidelines of course apply here. You should give the database plenty of memory; memory is cheap nowadays. If you have an HTTP API service, maybe written in Java or whatever language, you should give it a lot of memory as well, so it performs correctly. In really special cases it might also be a good idea to look at the Kamailio memory manager. By default it uses a memory manager which has some debugging support built in. There's another memory manager without this debugging support, but in 99% of all infrastructures and scenarios you never need to change it. In some cases, though, it might be beneficial to look into that. 
Most problems are usually related to IO, IO performance. For voice over IP, SIP is the protocol, and it relies on DNS like most of the protocols out there. If the DNS is slow, then your server will be slow as well. Kamailio uses an internal DNS cache. If you use Asterisk there is no cache, unfortunately, so you should use dnsmasq or something similar, or keep a local DNS server in your data center, in your infrastructure. For SIP, for real-time communication, you usually need to write user registrations. This is something you can of course optimize; you can cache it. For Asterisk there's something called qualify, if you use the realtime infrastructure. It makes sense to tune this, or maybe to deactivate it, because it basically scales with the number of your users, and the write load will scale with it as well. Logging, of course, you need to look at: do you really need to log everything, or can you tune it to adapt to your scenario? It makes sense to restrict it, not only with Kamailio but with Asterisk or other servers as well. If you have a lot of read operations, they can usually be cached quite well in Kamailio. There's the htable module for Kamailio which you can use for caching the data. You can also use something like read-only replication, Redis, memcached, whatever, to scale that. The same goes for remote HTTP API requests; this is also something you can cache, of course. CDR writing: we just saw the talk about CGRateS, a great project that offers these CDR capabilities. Kamailio can also write CDRs internally, but for highly loaded platforms it might make sense to move this to another process or another system, to have some asynchronous process doing the CDRs so that it does not affect the server operation. And of course, as we just saw in the beginning, you should not fork processes if you rely on performance. 
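The caching advice above, for database reads, DNS answers, and remote HTTP API results, boils down to a read-through cache with an expiry time. The sketch below shows the idea in Python; it is not Kamailio configuration and not the htable module's API, and the `TTLCache` class, its `loader` callback, and the 60-second default are assumptions made for illustration.

```python
import time


class TTLCache:
    """Tiny read-through cache: a value is reloaded once it is older than `ttl` seconds."""

    def __init__(self, loader, ttl=60.0, now=time.monotonic):
        self._loader = loader  # the slow lookup, e.g. a database or HTTP query (hypothetical)
        self._ttl = ttl
        self._now = now
        self._data = {}        # key -> (value, expires_at)

    def get(self, key):
        t = self._now()
        entry = self._data.get(key)
        if entry is not None and entry[1] > t:
            return entry[0]                # still fresh: no remote round trip
        value = self._loader(key)          # miss or stale: pay the latency once
        self._data[key] = (value, t + self._ttl)
        return value
```

With a cache like this in front of a routing lookup, a latency spike on the database affects only the requests that miss, instead of every call on the platform.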
What could you use for performance tests? One thing which is still used a lot is the old classic SIPp. There's PJSUA, which you can script a lot with Python or other bindings. There are dedicated performance test frameworks, but usually they are homegrown or closed source, unfortunately; that's the stuff you can pay for, or you can of course build it yourself. If it is a database or HTTP service which is actually the bottleneck, you can of course use custom tools to test the database or to test the HTTP API. Then, for a start, there are of course the common system tools to get insight into the CPU, the IO, and the network situation. They can give you a lot of information on whether there's some pressure on the sockets, for UDP in particular. There are tools like Homer, we will see a talk about that later as well, VoIPmonitor is another tool, and of course the classical Icinga, Grafana, whatever statistics you have in house. Kamailio also offers a benchmark module, and you can also adapt the logging a lot to your requirements. Okay, that's all from my side, thank you very much. Just a quick pointer: we're doing Kamailio World this year in person again, and I'm really happy about that. It will happen at the beginning of June in Berlin. The call for papers is open, so if you're interested in presenting something interesting there, go ahead; we are looking forward to your contributions as well. Thank you very much. |
Social audio applications with Janus
Using WebRTC broadcasting for more than just video |
Okay, I'll start because the 10 minutes apply to me as well, even though I wear this nice blue shirt. So please sit down and I'll start right away. I'll be talking about social audio applications that you may want to re-implement with Janus if you want. Quick slides about me. Nobody cares. But what is social audio? It's basically whenever you have something that uses primarily audio, and not strictly video, as its form of communication. So whether it is messages or podcasts or virtual audio rooms or stuff like that; you may have heard about things like Clubhouse, Twitter Spaces, Reddit Talk, they are all examples of social audio. So people talking with each other, maybe they take turns, and then they broadcast to a very large audience. And of course it does seem like a very good fit for WebRTC, especially for the real-time kind of participation. You didn't hear this from me, because I don't know if there are any secrets about that, but actually Twitter Spaces uses Janus for the live part and then they distribute it some other way. And how do they usually work? As I said, they are typically live conversations, so we have a limited number of people that talk to each other, exchange ideas, and take turns, so it's not always the same people talking for two hours like in a podcast, for instance. And then you may have possibly thousands of attendees; for instance, any time Elon Musk speaks in a Twitter Space, there are a million people listening, let's say, things like that. And there are of course different challenges to tackle, because the live conversation part of course needs to be real-time; it needs to happen as fast as possible. For the distribution to the audience, a bit of latency may be okay, and this is why, for instance, they take advantage of CDNs or stuff like that most of the time. 
But of course there's a problem: the more latency you have for the audience, the more delay there may be when somebody from the audience needs to come into the conversation. So that's something that needs to be taken into account. You may want to use WebRTC for everything, but there are scalability issues at play there. And so I wanted to check whether or not Janus, which is the WebRTC server that I work on for a living, could be used for the job. I came up with a few potential ideas, and one of those relies on the AudioBridge plugin. The AudioBridge is basically an audio mixer that lives within Janus. So you have multiple people connected to the AudioBridge plugin. They each create a single PeerConnection, and the AudioBridge mixes all the audio streams, so that you send one stream and you receive one stream that contains the audio of everybody involved except you. Which is really nice because, for instance, it's easy to bring SIP endpoints in if you want, using the plain RTP functionality. You can play jingles; for instance, you have your own show, your own context, and you want to play something in there, or maybe a snippet from another conversation. If you do stereo mixing, which is supported, you can use spatial positioning of participants to make the conversation easier for people to follow. And of course this takes care of the live conversation, but we want to make it available to other people as well, so to a wider audience. What you can do is take advantage of RTP forwarders, which are basically an easy way by which the AudioBridge plugin sends a plain RTP stream, containing the mix that is being produced there, towards an address that you specify. And a nice feature in the AudioBridge plugin is that you can also tag participants, so that you may say: don't send me a mix of all participants, but only of the ones that I tag in a specific group. 
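The two kinds of mix described here can be illustrated with a toy sketch: the per-participant mix of everybody except the listener, and a single forwarded mix restricted to one tagged group. This is not how the AudioBridge plugin is implemented (Janus is written in C and works on decoded audio frames); the function names, the group labels, and the use of plain Python lists of 16-bit samples are assumptions made for the example.

```python
def _clamp(value, low=-32768, high=32767):
    """Keep a mixed sample inside the signed 16-bit range."""
    return max(low, min(high, value))


def mix_minus_one(frames):
    """frames maps participant -> list of samples (equal lengths).

    Returns, for each participant, the mix of everybody else: sum everything once,
    then subtract the listener's own samples.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    total = [sum(frame[i] for frame in frames.values()) for i in range(length)]
    return {
        name: [_clamp(total[i] - frame[i]) for i in range(length)]
        for name, frame in frames.items()
    }


def group_mix(frames, groups, wanted):
    """Single mix of only the participants tagged with `wanted`, as a forwarder might send."""
    selected = [frame for name, frame in frames.items() if groups.get(name) == wanted]
    if not selected:
        return []
    return [_clamp(sum(samples)) for samples in zip(*selected)]
```

Summing once and subtracting each participant's own contribution gives every listener a dedicated mix without mixing from scratch for each of them, while the forwarded group mix is computed a single time no matter how large the audience is.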
For instance this one may be a technician so those two need to hear the technician who gives tips but all the attendees only need to hear those two. That's basically the main idea. And of course whatever happens in here is basically handling a mixed stream so there may be a script here that sends these mix to IceCast to make a very simple example or to YouTube Live for Audio or to whatever platform you want to use as a CDN for distributing the Audio if it's not WebRTC. If you want to use WebRTC you can use something like this. So you have your active participants connected to the AudioBridge they are talking to each other. You RTP forward to the streaming plugin which is the plugin in Janus that takes care of broadcasting RTP to a wider audience. And then the streaming plugin is what distributes the Audio which is the greatest advantage that you don't have to perform specific mixing for these participants. They are already receiving a mixed stream. All people connected to the AudioBridge instead have a dedicated context for mixing because they need to receive everybody except them so it's not the same Audio for all of them. And whenever you want somebody from the listeners to join in the conversation they mute the streaming parts, they join the AudioBridge temporarily, they become active participants that everybody else can listen to because they are now mixed in the AudioBridge. And of course for scalability purposes you can just RTP forward to multiple streaming plugin instances on multiple different instances of Janus how you distribute it is entirely up to you. You can use a tree based distribution wherever you want and you can also take advantage maybe of Multicast because of course if it's just a plain RTP stream that you are forwarding if you forward it on a Multicast group then multiple Janus instances can all pull from that Multicast group that same mixed Audio and can distribute it more efficiently. 
And one other benefit is that, using this approach, you can also do something like interpreter services if you want. You have two different AudioBridge rooms for different languages, you have the speaker join the room of their language and an interpreter in the other room, and then you distribute those two streams separately. Then you allow the audience to listen to, say, the English channel or the French channel, and depending on the language being spoken, they will hear the interpreter or the actual speaker on either one. That maybe makes little sense for an actual social audio application; it's more for a conference scenario, but it's still a good side effect of the approach. If instead you don't want to mix in Janus, for a few reasons, because you don't want to terminate audio there, because mixing is more CPU-intensive, or whatever, you may want to use the SFU approach instead. This means that participants in the conversation still need to establish maybe one single PeerConnection, not necessarily more than one, but they are exchanging multiple audio streams. So they are sending their own and they are receiving as many as there are other participants in the room. You can still externalize this conversation via RTP forwarders as before, but now audio is not mixed, so you have different audio streams for each of the participants there. With three participants in the conversation, each of them is sending one and receiving two, and you have a separate component that is receiving the three different audio streams from the different participants. If you want to distribute something via a regular CDN, that requires a single audio stream, and so the component receiving the RTP forwarders needs to act a bit like a live mixer, basically. 
And once this happens, once you have a mix there, everything is pretty much like the example I showed before: you have a mixed stream, and you can distribute it via CDN or via Janus, as we said before. If you don't want to mix for the attendees as well, because you want something closer to a regular webinar, you can still do that, but then you have to use the approach I was talking about of forwarding to the Streaming plugin for each of the different participants. So the presenters contribute audio to the VideoRoom, this becomes an audio broadcast for that specific presenter in the Streaming plugin, and people listen to that participant over there. You can again involve multiple Streaming plugin instances if needed, so that you can widen the audience. But if you have multiple participants speaking, you have to do the same for each of them, because since the audio is not mixed you would otherwise only hear one single participant. That means the audience needs to create subscriptions for more than one participant at any given time, and of course you have to make this dynamic in case presenters come and go, which is what is expected in a social audio kind of application. So it's probably easier to do something like this, where you keep the audio conversation on an SFU for the WebRTC participants, because it maybe gives better audio quality between them, but for distributing the conversation it's okay to mix it, even for WebRTC usage, so that you distribute a single audio stream instead. That makes sense, but again, if you want to do that, that works; for instance, this is what we do for meetings on our virtual event platform, so that definitely works. And again, you can also do this sort of multicast distribution if you want to take advantage of a wider distribution of the media.
And if I spoke too fast, which is very likely, I did write a blog post about this, which goes a bit more into detail and explains things a bit more precisely than I did right now. I think I managed to stay on time, and these are some references. You can find me on Mastodon mainly; I'm still on Twitter, but who knows for how long. And that's the blog post that I was mentioning before. So that's all, thank you. Okay, there's time maybe for one or two questions if anybody is curious. So, I don't know if you have any. Not specifically in the AudioBridge, but this is something that you can enforce at the application level if you want. For instance, you may decide that some users always need to be there, or you may have the concept of the actual presenters and of panellists that come and go. This is more of an application-level concept than a mixing concept; as far as mixing is concerned, you just know... yeah, exactly. So, any other question, or can we move on? Okay then, thank you very much for that one question. |
P10K: getting 10000 participants into a Jitsi meeting
How we leveraged XMPP and the tricks we are using to get to 10000 participants |
Well, all right. I'll get going since we're here. My name is Saul and today I'd like to talk to you about our little project P10K, or how to get 10,000 participants into a Jitsi meeting. No, it doesn't go on the loudspeakers, it's just for the recording. It is what it is. Sorry, I lost my voice. I'll try. I suppose most of you know what Jitsi is, but for those who don't, it's a WebRTC-compatible video conferencing application. I like to say that you can think of it in three ways. It's a set of open source projects that allow you to either deploy it or piecemeal it and build something with it. It's also a set of APIs and mobile SDKs, so you can embed it into your existing application. And it's fully open source, Apache 2 licensed, and we have a pretty vibrant community that helps us build some stuff. I talked about scaling Jitsi Meet a couple of years ago here at FOSDEM, with what we did during the pandemic, and also at CommCon about how we reached 500 participants. Then of course somebody will ask, how do you do more? So that's what I'm going to go into today. A quick TL;DR on what the trick to scaling up is: mostly to cheat, because it turns out that you never see 10,000 participants at the same time. So you need to paginate, and not show or load all of them at the same time. Also on the back end, you don't want to be taking care of 10,000 things at once. You want to be really careful to avoid re-renders on the React side of things, so on your front end you definitely don't want to have 10,000 things. And very importantly, reduce signaling; this is kind of the crux of the thing. With all of those things, we ended up getting 500 participants in a single meeting. All of them are fully functional, bidirectional audio/video participants. They will never all have video on, so that's sort of fine. I'm going to do a quick run through our architecture, so that when we dive into XMPP, we know what we're talking about.
XMPP is our core signaling protocol. You heard about it from Matt, for chat. All the participants join an XMPP MUC, a group chat. Then our focus, Jicofo, negotiates a session with each participant, and they all end up in the JVB, which is where we allocate the media. This is a back-of-a-napkin level design, but it's pretty accurate. Prosody is our XMPP server of choice, and Jicofo is the one that will allocate sessions here and then establish sessions with the users, so they all end up having this connection. Now, how do you go about solving 10,000 participants? Well, first of all, we did some research. What we knew is that presence stanzas, XMPP presence, were our Achilles heel, so we needed to sort that out. Intuitively, when you need to support many of something, you think: I'll partition it into smaller chunks, and maybe that's how I do it. There is federated MUC for that, so we thought maybe that's where it goes. It turns out the military had researched this problem as well, and there is this cool white paper called "Federated Multi-User Chat for Military Deployments". One of the things they cover there is how to avoid this presence flooding, and they do that with the visitor role. That's where we got the idea from. So the idea is that we're going to have two types of users, active users and passive users. We don't need to know about all these passive users, all this audience; we just need to know the number. We don't need to draw a tile for them. They don't need to appear as if they're participating in the meeting. They're just viewers, right? This is what the visitor role means in XMPP MUC lingo. A passive participant can then become an active participant by switching role, because we're not building live streaming: what we want to build is a way to actually be able to participate actively.
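To make the presence-flooding argument concrete, here is a minimal back-of-the-envelope sketch. It assumes a deliberately simplified model that is not taken from the talk: in a classic MUC every occupant's presence is delivered to every occupant, while in the visitor model only the active participants' presence is fanned out and visitors are represented by a count. The function name and the 100/9,900 split are illustrative assumptions.

```python
def presence_deliveries(active: int, visitors: int, visitors_broadcast: bool) -> int:
    """Count presence stanzas delivered once everyone has joined (simplified model)."""
    total = active + visitors
    if visitors_broadcast:
        # Classic MUC: each occupant's presence goes to every occupant.
        return total * total
    # Visitor model (assumption): only active participants' presence is fanned out;
    # visitors are known to the room only as a number.
    return active * total


if __name__ == "__main__":
    # Hypothetical split: 100 active participants, 9,900 visitors.
    print(presence_deliveries(100, 9_900, visitors_broadcast=True))
    print(presence_deliveries(100, 9_900, visitors_broadcast=False))
```

Under these assumptions the fan-out shrinks by a factor of total/active, which is why the number of active participants, not the audience size, becomes the quantity to keep small.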
Any of those 10,000 participants should be able to take the mic at any time. Scenarios for this: earnings calls of publicly traded companies, just because we can, you name it. So, step two: how do we test it? Because if we build it, we need to be able to know we have reached our goal. And in order to test 10,000 participants, you need, well, 10,000 participants. So we use a big Selenium grid, and we created some lightweight clients so that we could have a lot of them join the call. They've got no UI. We spawn multiple browser windows with multiple tabs, with multiple of these clients. And a recent trick is that we use insertable streams to drop all media. One thing you can do with them is modify media; another thing you can do is drop it, so nothing is sent. That makes the clients a lot more lightweight in our Selenium grid. Otherwise, it would take millions just to test that what you're doing is right. There's actually a PR by Philipp Hancke so that Chrome would send black frames, very tiny ones, so maybe that's where we go in the future as well. We also delay track creation: if you join muted, we don't need to go through creating a video track that is useless, and things like this. The next thing is that we scale the signaling. The way we do it is that we ended up having multiple Prosody servers. This is one node, but it could be spread over multiple nodes. We have a main Prosody server, which is where the active participants join the meeting. Then we have up to five extra nodes, which we call visitor nodes, where people join in this visitor role, so their presence is not broadcast. Jicofo will decide which one you join, usually depending on the capacity. And the trick to actually become an active participant is to just join the main one afterwards. We can do that very fast because you don't need to recreate the XMPP connection.
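The capacity-based choice of a visitor node can be sketched as a simple "most free capacity" selection. This is only an illustration of the idea, not Jicofo's actual algorithm; the `VisitorNode` type, its fields and `pick_visitor_node` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VisitorNode:
    """Hypothetical view of one visitor Prosody node."""
    name: str
    occupants: int   # visitors currently connected
    capacity: int    # assumed per-node limit


def pick_visitor_node(nodes: Sequence[VisitorNode]) -> Optional[VisitorNode]:
    """Return the node with the most free capacity, or None if every node is full."""
    candidates = [n for n in nodes if n.occupants < n.capacity]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.capacity - n.occupants)


if __name__ == "__main__":
    nodes = [
        VisitorNode("visitor-1", occupants=1900, capacity=2000),
        VisitorNode("visitor-2", occupants=500, capacity=2000),
    ]
    chosen = pick_visitor_node(nodes)
    print(chosen.name if chosen else "no capacity left")
```

A real selector would likely also weigh node health and locality; free capacity is just the one criterion mentioned in the talk.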
So now, in order to establish this sort of mesh, we ended up using federation, even though it's within a single server, but still. So there are server-to-server bidirectional connections, to avoid having duplicated connections. Then custom modules: that's where Prosody shines, because it allows us to do all these customizations, for example to mirror chat messages that have been typed in a visitor node to the main node and back, so as to kind of fake it, since they are actually in separate instances. And as I said, becoming active is fast because you don't need to recreate the XMPP connection; you just need to join a different MUC. Our step number four is to have an improved topology for media routing. Currently we have Octo, which allows us to spread the load across multiple bridges, but this doesn't work very well for such a large load. You need a tree-style topology where some people are just receiving, and a full mesh for those who are actively participating, so both loads can be spread. And last, we need to fix up the UI, let's say. We don't need to render the visitors; we just need to know that there are 100 people and then 9,000 visitors, and that's it. So we want to refine the UI a little bit. We're thinking of using the raised-hand functionality to become an active participant: you raise your hand, you are approved, and then you become active. That's how we're thinking about it. Now, some of it is in the present, some of it is in the future. So how is it going? We got there: with 51 bridges, we got 10,009 participants. So it worked out. There's still some work to do. The UI is not yet final; we're polishing it up a little bit. We're still going to add some more modules to mirror all the data we want, like the polls and other stuff. And we're thinking that maybe we don't really need to support 500 active participants, because that's a weird conference, really. So that number could actually be lower, or pretty much configurable.
So you can say, I want this many active participants, and the rest will be visitors. And that's that. And of course, we need to make it easy to deploy for everyone. Right now, this is a bit held together with duct tape. Before I go, I'd like to give a shout-out to the heroes that worked on the guts of this. You may know their names from our community: Boris, Damencho and Jonathan, incredible characters. I'm just relaying the message. I know the words, but they did the work. And I'd like to share the love we have for Prosody. We wouldn't have been able to do it, I think, without such a flexible piece of software. They help us, we help them. It's a very nice relationship we have with the project. We love Matt and team, so shout-out to them. And since that's all I've got, you can follow the progress there. We actually have documentation on how to deploy the existing way of doing things. Again, early stages, but it's there. And if you have any questions, well, I'm around here, or find me online. Thank you very much. |
Edge observability for RTC apps
introducing qryn, the polyglot monitoring and observability stack |
Hello. Like I said, this will be a digest, because this topic is a little bit, let's say, complex on one side, but on the other side it will help you save a lot of time and, of course, money. My name is Alexandr Dubovikov. I am CTO of QXIP; together with Lorenzo Mangani we built QXIP BV. I'm an open source enthusiast, involved in many open source projects: Kamailio, FreeSWITCH, OpenSIPS, Asterisk, and many, many others. QXIP BV is a company based in Amsterdam; it's the company behind HOMER and HEP and a lot of other projects which help you monitor systems. Of course, we like to make open source projects, and a lot of our projects are open source with a good license. So what's the problem we have? Some of you know about HOMER, and HOMER is a great tool for monitoring: it stores all the data, SIP messages, also some log information. But sometimes you also need to store some metrics, some statistics, and for that you use Prometheus or InfluxDB for time series, you use Elasticsearch to store logs and CDRs, and in the end it's a mess, because there is so much stuff that confuses you, that you have to maintain, that you have to spend a lot of time on. So in our company we asked what we can do here, how we can help you guys make your life easier. We stepped back, looked at this problem from a different angle, and decided to make a new application. It's called qryn. What is qryn about? qryn is basically a collector. You already have Grafana, you already have Prometheus, you already have Telegraf, which send data in their own formats, and we created this application, which can read all these formats and store the data for you. Of course you can ask, what about HOMER? HOMER can also send some information, some SIP information, to qryn, and this can also be stored in the database.
But it's not only about HOMER; it's also about the different statistics that you can receive from your agents, from Prometheus agents, from InfluxDB, etc. Everything is sent to qryn; we created this engine and it stores everything in the database. What did we do better than in HOMER? We wrote great documentation. Trust me, guys, this was the number one point: once we started the project, we wrote great documentation. You can go to our website and you will see all the steps, how to install and how to configure it without headaches. So, in the end, you know, SIP is basically just an event for us. And what about WebRTC? It's also an event, because we have different platforms, it can be Janus, it can even be FreeSWITCH and so on, and they generate their own events in JSON format, and the question is how we can collect them. In the end we decided: we have only metrics, we have logs, and we have traces. All the information we generate in voice-over-IP stacks falls into just these three categories. And of course, it can be generated from different sides. If we're talking about Prometheus or Elasticsearch, there are already existing agents, which you guys probably already use, and you can generate this data and send it to qryn. How can you read it? If you use Grafana, and probably everybody does, Grafana already has native plugins for Loki, Prometheus (PromQL) and Tempo, and we support those APIs as well. So you don't have to install any additional plugins; it works out of the box. We have very cool query capabilities with which you can extract any data from qryn; I'll show them. And of course, you can use any agents that already exist: this can be Grafana Agent, Logstash, Vector, Telegraf, and so on. Also, to make your life easy, we developed our own data explorer, which is already integrated into qryn, and you can use it similarly to Grafana.
And what is very important: we already have a lot of deployments, among them some big gaming providers and enterprise solutions, and you can use qryn for IoT as well; it scales very, very well. Now, working samples. Like I said already, you have these agents; I don't have too much time. These are industry standards that you can use: the Prometheus API, InfluxDB insertions, the Tempo API, etc. We insert this data into the stack, which reads from API endpoints; it can be OpenTelemetry, it can be Loki, Elasticsearch, etc. It goes into different buckets and we insert it into the database. As the back end, as the database, we use ClickHouse. You have probably heard about ClickHouse; it is a very, very performant database, and it scales very well, linearly and horizontally. You can use a lot of features like UDFs, etc. You can also use S3 storage if you would like to save money and you use AWS or similar. And how can you read data? You can read data using these APIs: LogQL and PromQL. Do you guys know what LogQL and PromQL are? No? Okay. LogQL and PromQL are special query languages from the Loki and Prometheus world, and they help you build complex statistics and complex searches over logs and other information. Before, we all used SELECT ... FROM blah, blah, blah, but that's very, very limited. So the Prometheus people developed this PromQL language, which is very, very flexible, and you don't have any limits on the queries you can do. I will show you how it works. For example, you store data in qryn and you set labels when you insert the data; say the label is FreeSWITCH. It generates fingerprints, and in the query you just say, show me everything that is related to FreeSWITCH, and it displays your data here very, very fast, lightning fast.
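As an illustration of "show me everything that is related to FreeSWITCH", the sketch below builds a request URL for a Loki-compatible `query_range` endpoint with a LogQL label selector. The base URL `http://localhost:3100`, the label name `job` and the helper name `build_query_url` are assumptions made for the example, not details from the talk; only the URL is built, no request is sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_query_url(base_url: str, logql: str, start_ns: int, end_ns: int, limit: int = 100) -> str:
    """Build a Loki-style query_range URL for a LogQL expression."""
    params = {
        "query": logql,          # LogQL expression, e.g. a label selector
        "start": str(start_ns),  # start of the time range, Unix epoch in nanoseconds
        "end": str(end_ns),      # end of the time range, Unix epoch in nanoseconds
        "limit": str(limit),     # maximum number of log lines to return
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical label: every stream inserted with job="freeswitch".
    url = build_query_url("http://localhost:3100", '{job="freeswitch"}', 0, 10_000_000_000)
    print(url)
    print(parse_qs(urlparse(url).query)["query"][0])
```

Because the selector only matches on labels, the server can resolve it through the label index before touching any log lines, which is presumably what makes this kind of lookup fast.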
Second one: what do you do if you store SIP messages? You can also pipe the results and extract any type of field from the SIP messages: it can be the port, it can be the Call-ID, whatever you want. And Janus, for example: Janus generates a lot of events, which we can also store in qryn, and we can extract the RTT into labels from these Janus events. But that's not all. What can we do with RTT events? You can just unwrap the value and put this information into buckets of 10 seconds, and immediately from this event information you generate charts; it's converted automatically. You can do exactly the same for Elasticsearch input, or for RTT, the round-trip time, the information that your switch generates: you can convert any information that you have already stored in the database into charts. What about heplify and HOMER? You can also say, give me all messages with method INVITE, put them into buckets of one minute, and display how it looks as a time series. What about RTCP? You can also send RTCP information and display MOS, packet loss and so on. You can send any HEP statistics and display them automatically in qryn. You can do the same with PromQL and heatmaps. Now, about OpenTelemetry. OpenTelemetry is the de facto standard, which the guys from the next room developed, and it's already used in many applications. OpenTelemetry is basically internal tracing: you use special libraries, you connect them to your application, and it will trace all your functions, execution times and IDs, and you send these traces to qryn, where you can display how many seconds or microseconds your function execution took, which plugin it was, how much time it took, plugin usage, etc. This is how you can see, for example for this Janus offer, how many microseconds it takes, how long ICE takes, etc. It's not only about Janus: you can also enable the OpenTelemetry stuff and see how many microseconds or milliseconds your query takes.
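The "unwrap and put into buckets of 10 seconds" step can be pictured with a small pure-Python analogue: take a numeric field out of each JSON-like event, group events into fixed time windows and average per window. This is a sketch of the idea only, not how qryn or LogQL implement it; the event shape (`ts` in seconds, an `rtt` field) and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def avg_per_bucket(events: Iterable[Mapping], field: str, bucket_seconds: int = 10) -> Dict[int, float]:
    """Average a numeric field per fixed-size time bucket, keyed by bucket start."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [running total, sample count]
    for ev in events:
        if field not in ev:
            continue  # events without the field contribute no sample
        start = int(ev["ts"] // bucket_seconds) * bucket_seconds
        sums[start][0] += float(ev[field])
        sums[start][1] += 1
    return {start: total / count for start, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    # Hypothetical Janus-like events: timestamp in seconds, RTT in milliseconds.
    events = [
        {"ts": 1, "rtt": 20},
        {"ts": 9, "rtt": 40},
        {"ts": 12, "rtt": 10},
        {"ts": 15},  # an event that carries no RTT
    ]
    print(avg_per_bucket(events, "rtt"))
```

Each resulting (bucket start, average) pair is one point of the time series that ends up on the chart.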
In the service graph you can also get an automatic display: it's generated automatically, you can check the rate, and a query is generated automatically and displays charts for each node. If you don't like OpenTelemetry, or you don't know how to, let's say, connect an external library, you can use eBPF. eBPF is a special facility in the kernel: you can compile a special module which will trace all your functions, generate all the traces and send them to our collector. About eBPF, we will present in Berlin, for Kamailio, how you can use OpenTelemetry and do performance optimization. So, Janus: for Janus we created a special application, which is called JAWS. It's a WebSocket collector which collects all the information from Janus, and we convert all the data to OpenTelemetry. And we can display this data, like media... okay, next, next, next. This is an ICE failure for Janus; exactly the same information you can display here. You can do aggregations in the Janus telemetry, see which nodes Janus processed, etc. Also very important: you can set an alert on any metric that you send to qryn, using Alertmanager. You can even use it for fraud detection if you want. So, last topic: Open5GS. Probably you guys saw that yesterday we did a hack. Open5GS is a stack which does IMS and all this stuff; it includes Kamailio and RTPengine. In five minutes we installed heplify agents and sent the information to qryn, using exactly the same stuff. This is how the device looks; it's bad quality, but in the end it's a small mini-computer which has Docker, which starts the whole IMS stack, and using qryn we trace everything. Giovanni posted this on Facebook; we did it just yesterday, and it works very, very well. When we did some tests, he connected to this antenna and went to another room, because this antenna has only 0.2 watts, I think, and when he went into the next room it was immediately visible in the packets shown here, because his signal dropped.
So if you support open source and you like this project, star us. It costs nothing, but for us it's like a cookie. And of course, sponsor open source projects, because without open source our life would be more difficult. We have this blog for qryn; our team wrote a lot of nice documentation on how to use qryn, how you can integrate your traces, everything. All the examples, all the good stuff is inside. Yeah, that's it. Like I said, sorry guys, this topic is very, very complex, and I have only 15 minutes. So if you have any questions, go ahead. Yeah. Is there something like JAWS for FreeSWITCH? Yeah, JAWS... Okay, JAWS for FreeSWITCH: theoretically, because it's open source, you can adapt the events, but this is what I said, each application generates its own format. In the end it's a JSON format, so it's JSON events. You can of course adapt and convert them and send them as OpenTelemetry. Yeah, you can. Okay, so just to repeat: of course you can send all data to qryn, and you can correlate logs and SIP information by using the OpenTelemetry stuff. What we are working on, with Kamailio for example, is to send all this OpenTelemetry information out using HEP encapsulation, so you can collect all logs and information with your dialog ID, for example for SIP, and store it. So you can select, for example. Yeah, you can select it: exactly, in qryn you can click and you will immediately jump to the HOMER website, where you have the HOMER view in which you can display this as a chart, in the SIP call view or in a table. And the other way around? Yeah, it works as well; it's a round-trip integration. Okay, thank you. |
Quantitative Analysis of Open Source WebRTC Developer Trends |
This is actually my first time at FOSDEM. I've been meaning to come here for many, many years, and Saul and Lorenzo have been bothering me that I should come here and speak. So I'm glad I finally made it. I run a blog, webrtcHacks, for developers. I've got a lot of guest authors and, I hope, readers in the audience here. You might also recognize me if you watch YouTube videos about WebRTC stuff; I do an event series and videos at Kranky Geek. But today I'm here to talk about some trends I found by analyzing GitHub and similar repositories. I'll talk about that in a second. But before we start, what are some things that would be nice to know about what's going on in the industry? I'm an analyst; I like to watch this stuff and write about it. So one, is the community still growing? What are some of the interesting projects? What are some of the new trends? Are people using things like WebCodecs yet or not? So how do you go about doing that? Well, I have a methodology. It's largely based on BigQuery; a bunch of providers give data to BigQuery or make their datasets publicly available there, so you can grab that. GitHub is definitely the best one: basically, for any public repo, any time you do any kind of commit, pull request or issue, that all gets sent and stored in a BigQuery database. There are some other cool datasets in there too that I've used in the past, and I actually use Stack Overflow in this one. I often cross-reference that with other sources; Chrome Platform Status is a good way to see which APIs are actually being used, at least in the Chrome world. So you get all that data. I like to use Colab, which is Google's hosted Jupyter-notebook-type ecosystem, to do the analysis and follow things. And then when I get frustrated about coding and doing stuff like making charts with Python, I fall back to Excel for quick analysis.
So, some of the hard things about this, and important to keep in mind about the analysis: you can't see it there, but all this is really based on regular expressions, pulling out keywords to identify a repo as WebRTC, or something as one term or another. That obviously has some flaws, because there will be false hits in there and you have to be careful with your selection. So a lot of the time I spend is actually just going through the data to make sure it's not junk. I found in the past that there are a lot of bots, so you've got to remove those, and the datasets themselves aren't perfect; there's always missing data or junk data. So keep that in mind. I've been doing this occasionally, more frequently now, since 2015, and I've been tweaking the methodology along the way. It's held up so far, but I'm always open to feedback. So let's just dive in. How are we doing here in WebRTC? This just looks at the period since 2019. I don't think it's a big surprise to anyone, but there was a peak during the pandemic; you can see here, it's actually April of that year. This particular graph shows distinct users, so anyone on GitHub, based on unique GitHub IDs per period, and that was over 10,000 in April. But if you look here, we're actually not doing so bad. This is bad data; I was missing that month. We're not doing so bad here, but we're only at about 60% of the past peak. So it's still pretty far above where we were before the pandemic. Another thing I find very interesting is that, because you can look at these events, you can see who's actually contributing. You can look at the blog; I'll have some links to more on the methodology. But essentially, people doing pull requests, pull request reviews, that sort of thing counts as a contributor. And contributors, if you believe me, are actually up more than ever. So first, one interesting point you can see here.
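The methodology described above (keyword regular expressions, bot removal, distinct users per period) can be sketched in a few lines. The keyword list, the event fields (`actor`, `repo`, `description`, `month`) and the bot list below are illustrative assumptions, not the speaker's actual patterns or schema.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Mapping

# Hypothetical keyword pattern; a real analysis would need a carefully curated list.
WEBRTC_PATTERN = re.compile(r"webrtc|getusermedia|rtcpeerconnection", re.IGNORECASE)


def distinct_users_per_month(
    events: Iterable[Mapping[str, str]],
    pattern: re.Pattern = WEBRTC_PATTERN,
    bots: frozenset = frozenset(),
) -> Dict[str, int]:
    """Count unique non-bot actors per month among events whose repo text matches."""
    users = defaultdict(set)
    for ev in events:
        if ev["actor"] in bots:
            continue  # drop known bot accounts before counting
        text = f'{ev["repo"]} {ev.get("description", "")}'
        if pattern.search(text):
            users[ev["month"]].add(ev["actor"])
    return {month: len(actors) for month, actors in sorted(users.items())}


if __name__ == "__main__":
    events = [
        {"actor": "alice", "repo": "org/webrtc-demo", "month": "2020-04"},
        {"actor": "alice", "repo": "org/webrtc-demo", "month": "2020-04"},
        {"actor": "bob", "repo": "org/chat", "description": "Uses RTCPeerConnection", "month": "2020-04"},
        {"actor": "ci-bot", "repo": "org/webrtc-demo", "month": "2020-04"},
        {"actor": "carol", "repo": "org/unrelated", "month": "2020-05"},
    ]
    print(distinct_users_per_month(events, bots=frozenset({"ci-bot"})))
```

The false-hit problem mentioned in the talk shows up directly here: any repo whose name or description happens to contain a keyword is counted, so the pattern and the bot list need manual review.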
There was a peak in the number of users in April, but the peak in contributions was actually in May. And maybe that makes sense: people during the pandemic got interested in WebRTC, started looking at it, maybe starred a repo and got involved, but then it takes them a few weeks before they actually contribute and start adding things to that repo. But as I said, our most recent peak here was actually in August, and we're not too far off for the rest of the year. So I wanted to look into what was going on and compare these two peaks. This is from the August peak; I dug in to see which repos had the most activity. One of them, a very popular WebRTC project, is coturn, the open source STUN and TURN server. One of my co-workers, Gustavo, took over that project, so I asked him what was going on here, why we had such a big spike in this, and wrote a whole blog post about it. Essentially, that project was kind of stale, without a lot of activity for a while; he and Pavel took it over and really started to get the community involved, and there was a huge spike. So then I wondered: are the other peaks in August here because of spikes, or is there a lot of regular activity? You can see here, interestingly, that things like AdGuard are always high in there. People actually want to block WebRTC and its usage, and they have a lot of activity around that every month. You see some commonality, but some differences here. But when you actually look at the distributions and the change between April of 2020 and August of 2022, the distributions across the top 10, top 20, top 100 are not a whole lot different. So what does this all mean?
It means WebRTC development is not really getting a lot more concentrated. You can look at a given period of time; obviously some projects are more popular and doing well, and have more activity than others. But overall, it's not like we're consolidating down to a few projects. It's the same, more equal distribution that's existed for at least several years now. Another dataset, and this is a new one I hadn't looked at before, is Stack Overflow. So I zoomed in to take a look at that, to see if it follows a similar pattern. Bear in mind that, compared to the previous charts, this goes back all the way to 2012, so it's a much longer period. You can see here comments on Stack Overflow questions and the questions themselves. Unfortunately you can't see the font too well for the answers on Stack Overflow, but it essentially looks very similar to the questions side of things. And you can see a very similar peak in April of 2020. Also, unlike the GitHub analysis, this shows a peak here in April of 2022. I didn't have a chance to dig into what was driving that particular peak. But overall, although it's a little bit harder to see compared to the other one, we're still generally up compared to before the pandemic in terms of questions and answers. It's a pretty good sign that there's a lot of activity there. I also took a look at what percentage of all the questions on Stack Overflow had at least something that mentioned WebRTC or one of these terms. Very surprisingly, it's actually very high. During the pandemic, it was something like one out of every 1,400 questions on Stack Overflow that mentioned WebRTC, which seems like quite a bit, because I still consider WebRTC kind of a very niche sort of thing.
And even if you look today, the last data point in this one is November; I'm sorry, it was one in 900 during the peak of April of 2020, and it's still one in 1,400 today, which is still very high. You can interpret this two ways. One, WebRTC is very confusing and people have a lot of questions, so they need to comment on it; or you can also see that there are a lot of people involved. I think both are good, right? It's also very interesting to ask whether we can look at this dataset to understand development trends, what's going on in the market. One of the things I always like to look at is language trends, the programming languages that people are using. This is a jumble and hard to see, so let's zoom in. One of the things I've been trying to track for a while is JavaScript versus TypeScript. I've been delaying converting to TypeScript, and I'm kind of wondering: is it time for me to really switch over, or can I wait a little bit longer? You can see here that TypeScript has obviously been getting more popular. Just in December we reached the 50-50 point, where TypeScript is half of all these repos. So I think I'm probably behind and need to switch. There's also, at this conference, a bunch of exciting new languages that are coming up, so I wanted to zoom in and take a look at what's going on with them: Go, Kotlin and Rust in particular. I will say, one of the challenges is that I didn't get a chance to filter this out, but this Go jump from November to December is some bots; that's just bot activity. I originally thought maybe it's just Christmas and Go developers don't have anything better to do, so over the holidays they're programming a lot and starting a lot of new repos. That wasn't it; it was actually some bots.
But you can see here, steadily increasing. It's not a huge spike for these other two, but it is going up. For a new language that's getting popular, I guess you'd expect to see more of that. In addition to languages, there's also a bunch of new APIs, some that were referenced earlier today. Insertable Streams is one such API, and that's actually two sub-APIs, a MediaStreamTrackProcessor and a MediaStreamTrackGenerator. I first took a look at Chrome Status, and credit to Fippo, Philipp Hancke: he built a custom viewer of the Chrome Status information where you can compare them both at the same time. You can see that these are peaking quite a bit, going up towards the end of the year. So I was curious: can we see open source repos actually using these, or is it somebody else? Looking at it, there's a big spike here, but otherwise it doesn't look like much. So what's going on there? Zoom in a little bit more, and again, apologies, it's really small: that initial spike was mostly pure standards-related activity in the W3C repos, WebKit and others that were adopting those APIs in the first place. So you see a big jump. After that, it's really kind of hit or miss. I love working with Insertable Streams, you can do a lot of fun stuff, so I was hoping to see more, but it's just spotty. So, going back to Chrome Status, what does that mean? Well, the people that are using it are probably someone like Google Meet, the sort of thing that doesn't have public repos. Or there's something else outside of the public GitHub dataset that's driving that usage. Another one is WebCodecs. This one doesn't have quite the same peak.
WebCodecs is not quite as far along, but it's still trending up. I wanted to see if there's something going on here too. And again, you see a gradual increase, not a ton, except for this one spike. And this one spike, again, was largely related to the initial standards release in WebKit and W3C-type repos and related ones deploying that. So we see some uptick, but nothing all that significant yet. For the last section, I was also wondering: is WebRTC winning? We don't talk a whole lot about WebRTC having competition anymore, at least I haven't. But in the early days, it was always WebRTC versus SIP. Should those SIP-type developers, that ecosystem, shift over to WebRTC? We haven't seen that a whole lot, but I think in reality there still is competition. And during the pandemic, well, it's Zoom, right? I actually presented this a couple of years ago at Dan's conference, an interesting fact: there was a month in time where Zoom was worth more than the seven largest airlines put together, at least by market capitalization, which is still insane when you think about it. But that was the reality. So I did want to check to see if that's still the case, and it's not. Using the same data points and just extending them out a little bit further, Zoom's down, near 80% of where they were back in February of 2020. The airlines, though, aren't actually doing all that much better. So, relatively, Zoom's not doing so bad compared to the airlines, at least those same seven. Anyway, Zoom is not quite what it was, but they still really are competition. And particularly because Zoom now has released a Zoom SDK, and they have a Web version of this SDK. So as a developer, you do have a choice.
Hey, I want to go build a real-time communications application. You can choose to use WebRTC and all the vendors in the ecosystem, or you can choose Zoom for this. And I was curious, because in Zoom's marketing it's a lot. I'll probably have a post on webrtcHacks with Fippo, hopefully in a few weeks, on how Zoom's been promoting the benefits of this, and I'll go into why that's not completely true in the post. But I wanted to see: are developers actually choosing Zoom over, or at least mentioning the Zoom SDK over, WebRTC? It was going to take me a while to dig into all this in the GitHub analysis. It wasn't clear, so I didn't include that part yet. But on Stack Overflow, it's actually pretty clear. There's a distinct Zoom SDK tag or label that they have there. And you can see here that, at least for now, WebRTC is still way more popular than the Zoom SDK. Two minutes, okay. And actually, I am done. So I guess what we've learned here, I mean, part of it is: what are your expectations? I didn't necessarily go in with any expectations, other than that I was interested to see what some of the trends are and whether we can learn things about new APIs or new repos. And going through the list, it's actually interesting to see new projects. I didn't have time to fit all that stuff in here, but again, you can reference some of the blog posts on this. Overall, my impression is that WebRTC is still doing pretty well. Obviously, it's not pandemic well. But given the circumstances, it's better than it was before the pandemic. We have more developers involved, and it seems that the developers that are involved are, by a lot of measures, more mature. They're better developers, right? They're contributing more than in the past. And I think that's a great thing. |
Secure payments over VoIP calls in the cloud
How to architect an oncall live payment system in the cloud using Kamailio & RTP Engine. |
Okay, my name is Nuno. I've been around VoIP and RTC for the last, I think, 15 years. The last 10 were more involved with open source and, well, right now I'm working at Talkdesk, which is a cloud-driven contact center company. We are adopting more and more open source technology in the company and, well, that's also what drives me. So my talk will be about secure payments over VoIP calls in the cloud. It's something that all contact centers probably need, and in order to do this we have to follow some rules and specifications, so let's dive into that. Okay, so I'll be telling you what PCI DSS is, what the existing problem was and what the solution was for our own implementation, the actual architecture of the solution and how to do it. Yes, the Payment Card Industry Data Security Standard. It's a standard that was put together by a bunch of companies like American Express, Discover, Mastercard, Visa, etc. They created a bunch of mandatory rules for the industry; this is called the PCI Data Security Standard. In PCI, all businesses that process credit cards are referred to as merchants, which is our case at Talkdesk, and whether we are big or small we have to follow these rules for credit card security. Basically there are different levels for merchants. The ones that process something like six million transactions are level one; contact centers, if they're not that big, usually end up in levels three or four. Then, SAQs: SAQs are Self-Assessment Questionnaires, so we have to answer some questions to see where we fit. Basically, since we are an electronic processing company, we fit under SAQ A: no face-to-face merchants, everything is done with e-commerce or telephone. So this is where we are at.
So, the existing problem and solution: why we did something different. In order to do this kind of payment we were using some proprietary equipment like SBCs, etc., and in the end everything sometimes had some failures. There's a lot of DTMF usage in this. At least for typing the credit card number we need to use DTMF, but besides the credit card number itself we also have to use DTMF to signal the partner (I will be speaking more about the partner later). The partner basically needs to know when you want to have your channel secured, so that the actual customer is able to start typing the credit card information. That was being done by the use of a PIN on the agent side: the agent would use the dialer on his screen, most of the time, to type a PIN in order to engage with the PCI partner, and if all went well the customer would hear a voice telling him that the call is now in a secure state and he can go ahead and type the credit card. Most of the time this works, but there are a lot of times that the DTMF digits are not well recognized. Sometimes the SBC failed completely, because (I will come back to that later) there was a need to transcode the DTMF from RTP to SIP INFO in order to tell the partner about the actual DTMF digit. We also had a very hard time integrating the SBC world, the proprietary world, with modern APIs in order to have more fluent scenarios. The alternative, the solution: basically we decided to use Kamailio and RTPEngine for the new solution. It is easier to adapt and evolve, it's easy to integrate with modern APIs, everything can be done with SIP, and we will get into the actual architecture next.
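The DTMF transcoding step mentioned above (digits arriving as RTP events and being forwarded as SIP INFO) can be sketched in a few lines. This is only an illustration and not the code from the talk, where the conversion was done by the SBC and later by the Kamailio and RTPEngine stack. It assumes the RTP side carries RFC 4733 telephone-event payloads and that the partner accepts the commonly used `application/dtmf-relay` body; the function names are made up for this example.

```python
import struct

# RFC 4733 event codes 0-15 map to these DTMF symbols.
DTMF_SYMBOLS = "0123456789*#ABCD"


def parse_telephone_event(payload: bytes) -> dict:
    """Decode a single 4-byte RFC 4733 telephone-event payload."""
    if len(payload) != 4:
        raise ValueError("expected exactly one 4-byte telephone-event")
    # event (8 bits), E/R/volume (8 bits), duration (16 bits), network order
    event, flags, duration = struct.unpack("!BBH", payload)
    if event not in range(len(DTMF_SYMBOLS)):
        raise ValueError(f"not a DTMF event code: {event}")
    return {
        "digit": DTMF_SYMBOLS[event],
        "end": bool(flags & 0x80),   # E bit: set on the final packet(s)
        "volume": flags & 0x3F,      # tone power, expressed in -dBm0
        "duration": duration,        # in RTP timestamp units
    }


def to_dtmf_relay_body(event: dict, clock_rate: int = 8000) -> str:
    """Build a SIP INFO body (Content-Type: application/dtmf-relay)."""
    milliseconds = event["duration"] * 1000 // clock_rate
    return f"Signal={event['digit']}\r\nDuration={milliseconds}\r\n"


# Digit 5, end bit set, volume 10, duration 1600 timestamp units (200 ms at 8 kHz)
sample = bytes([5, 0x8A, 0x06, 0x40])
body = to_dtmf_relay_body(parse_telephone_event(sample))
```

Only the packet with the end bit set would normally trigger an INFO, since the same event is repeated across several RTP packets while the key is held.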
So, about architecture: the PCI environment itself needs to be contained, away from whatever you have in your VoIP setup for normal calls. It has to be separate and contained. So the phone number that the customer calls to make a payment goes through a contained environment. In our case we are all in AWS, so we basically have a separate account just for this, and for our new implementation we did everything with infrastructure as code. By doing CI/CD we were able to deploy without the need to log into a machine or have SSH access or that kind of stuff. So, about the actual architecture for this: we have the carrier and we have this CPaaS. It's our Asterisk; it can be a cloud provider like Twilio or Nexmo or SignalWire or whatever. This secure router is what we call the new solution that we developed; it's basically several possible instances of Kamailio and RTPEngine. And at the top there is the PCI partner. This is the actual partner that processes the payments, so we don't process and we don't touch the actual credit card data. The way this works is that the call comes in from the carrier side, if it's an inbound one, and it goes through like this. This is the media for the normal call, before it is under the secure channel for payments. So the call comes in and ends up at the agent on this side. The signaling, in this case this blue arrow here, always goes through the PCI partner. The PCI partner injects a new header in the SIP with a token, so the token comes back down here, it's extracted and it is sent to a small helper microservice that we have. This microservice then uses an internal API to tell the CPaaS that the call coming in has that token, for starting the securing process in case a payment needs to be done.
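The token hand-off described above (the partner adds a header, the secure router extracts it and passes it to a helper service) could look roughly like the following. This is a sketch under assumptions: the talk does not name the header, so `X-PCI-Token` is hypothetical, as is the payload shape for the helper microservice. In the real deployment the extraction happens inside the Kamailio routing script, not in Python.

```python
import json


def extract_header(sip_message: str, name: str):
    """Return the value of the first header called `name`, or None.

    Simplified: folded (multi-line) headers and compact forms are ignored.
    """
    head = sip_message.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:          # skip the request line
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None


def build_token_notification(sip_message: str, token_header: str = "X-PCI-Token"):
    """Build the JSON a helper microservice might receive (hypothetical shape)."""
    token = extract_header(sip_message, token_header)
    if token is None:
        return None
    return json.dumps(
        {"call_id": extract_header(sip_message, "Call-ID"), "pci_token": token},
        sort_keys=True,
    )


invite = (
    "INVITE sip:agent@example.invalid SIP/2.0\r\n"
    "Call-ID: abc123@example.invalid\r\n"
    "X-PCI-Token: tok-42\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
notification = build_token_notification(invite)
```

Keying the notification on the Call-ID is one way for the contact center layer to match the token to the right call; the talk does not say which identifier was actually used.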
So the signaling always goes through the PCI partner, and the media goes from the carrier to the CPaaS side without leaving. Whenever the agent on the CPaaS wants to start the secure process for a payment, he uses the token that was passed into his UI, and that triggers re-INVITEs from the PCI partner down to our secure routers. The media then starts flowing through the partner for the duration of the payment processing. So this is basically how this works. This yellow square here means that we are in a closed environment just for PCI; we have to follow that because the rules require it. And now some snippets of the infrastructure as code for this. We are using a bunch of Terraform with Ansible in order to deploy everything automatically. Sorry, it's not very readable here, but in this part it's the dispatcher that we are populating for Kamailio. There's also some information about how the Kamailio process is started, the certificates that are used for TLS, a lot of description for the services. So yeah, the one before, okay. Here is a small snippet of the Kamailio config file itself. Sorry about it, it's unreadable, but it's where the actual engagement with RTPEngine happens. We have a bunch of ciphers here that need to be used. By using this together with containerization (Kamailio is put inside a container, and the container is then launched mapping an external volume with some defaults), we basically don't need a human to touch this deployment.
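To give an idea of what "populating the dispatcher" from infrastructure as code can mean, here is a minimal sketch that renders a dispatcher destination list from a list of instance addresses. It assumes the plain-text list layout of Kamailio's dispatcher module (set id, destination URI, flags, priority per line) as I understand it; the addresses and the function name are invented, and the talk's real templates (Terraform plus Ansible) were not legible on the slides.

```python
def render_dispatcher_list(destinations, set_id=1, flags=0, priority=0):
    """Render one dispatcher set as text: 'setid destination flags priority'."""
    lines = ["# setid destination flags priority"]
    for uri in destinations:
        if not uri.startswith(("sip:", "sips:")):
            raise ValueError(f"not a SIP URI: {uri!r}")
        lines.append(f"{set_id} {uri} {flags} {priority}")
    return "\n".join(lines) + "\n"


# Hypothetical private addresses of two instances in different subnets.
text = render_dispatcher_list(["sip:10.0.1.10:5060", "sip:10.0.2.10:5060"])
```

Generating the file at deploy time and mounting it into the container as a volume matches the "no human touches the deployment" goal described above.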
Okay, another important part: we rely a lot on DNS. We use NAPTR records with SRVs to describe where our instances are. The PCI partner uses DNS to speak to us, the carrier the same thing, the CPaaS the same thing. By doing this, and without needing to be stateful for anything, we can launch the number of instances that we want. Since we are using AWS and we need this kind of deployment to happen in several regions of the world, we can also do that very easily with this approach. And as I say here, hard-coding is usually not a great thing, but in this case it gives the rigidity that we need to fulfill the requirements and the standards, and it proved to work well. So, how it actually works: this is a screenshot of the UI that the agent sees. These are what we call context cards for the agent. One of the context cards shows the PCI token that comes in the SIP, is extracted in Kamailio and is sent to our internal API for the contact center layer, and it appears there. So whenever the agent needs to start the payment, he just needs to press a button that will use that token, and the call will be in what is known as a secure state for payment. Okay, wrapping up. Something about the certification process and the audit: every entity that wants to process payments needs to go through this every year. The certification needs to be renewed every year.
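The DNS-based discovery described above leaves the choice of instance to whoever resolves the records. A minimal sketch of how a peer might pick a target from SRV records follows. It is a simplification of the RFC 2782 selection rule (lowest priority first, then weighted random choice), the host names are placeholders, and a real client would obtain the records from a resolver instead of a hard-coded list.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def pick_target(records, rng=random):
    """Pick one record: lowest priority wins, weights split load among equals."""
    if not records:
        raise ValueError("no SRV records to choose from")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        return rng.choice(candidates)             # all weights zero: uniform
    return rng.choices(candidates, weights=weights, k=1)[0]


records = [
    SrvRecord(10, 60, 5061, "sr1.eu.example.invalid"),
    SrvRecord(10, 40, 5061, "sr2.eu.example.invalid"),
    SrvRecord(20, 100, 5061, "sr1.us.example.invalid"),  # only used as fallback
]
chosen = pick_target(records, rng=random.Random(1))
```

Because selection is stateless, adding an instance is a matter of adding a record, which is the property the talk relies on for scaling across regions.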
The most recent one that we did was last November, and the year before it was still on the old proprietary stuff. There were several small issues that were kept open from last year to this year, and by using the new implementation, we basically passed everything. The solution is put to the test with pen tests; it's something that is in the runbook of the audit. So everything passed. I didn't mention it, but the infrastructure as code also configures the firewalls in AWS, so we just have the ports that we need to work with open, and we also have ACLs for the PCI partners that we are using to process payments; for the carriers, the same thing. So that's the way we did it, and in the end the audit identified some strengths in this. They thought there was excellent management and procedures, we adopted a risk treatment approach, and we always used certified service providers for carriers and for the partners that do the payments; they also need to be PCI certified. About the carriers: this architecture basically lets you use any carrier you want, but you have to decide if you really trust the carriers that you put into this, because if you integrate this with a PBX or so from a client, let's say, a call can always be intercepted on his side, and that would not be good. So yeah, that was it, and that's what I had. Thank you. Okay, I guess, any questions? There's one over there. Yeah, there are small differences when you switch the PCI partner that processes the payments, right? There are small differences between them, but we worked with three or four partners for this, and they must use SIP. If they use only SIP, that's the best option. The one that I presented here is the most complex one because it involves re-INVITEs sending media to them during the actual payment process, but it will depend; sometimes it's more direct and easy.
The case that I spoke about was the most complex one, which involves media redirects and that kind of stuff. Can you recommend any specifically? Any partner? Not really. I really don't want to name names here, but if you look for the most relevant ones on the Internet, they're the ones that will work with this. |
Interoperable Chat, Dutch Healthcare and the Digital Markets Act
About the pitfalls of interoperable chat |
And it's FOSDEM, so everybody will be tired and wasted, and then I'm here with a talk with slides full of text and some really interesting policy stuff combined with some technical questions. So that's a nice way to get you all asleep and not leaving FOSDEM anymore. I'm Winfried Tilanus. I'm a privacy consultant, I'm also a member of the XMPP Standards Foundation, and for a long time, and more and more, I've been on the border between technology and policy. So I tend to look at acts from the European Union and things like that, and this act caught my eye. You may have heard about it, because it also got some news coverage: it's the Digital Markets Act. It has some interesting things, because the European Commission can designate big companies as gatekeepers. You have to think about Meta with WhatsApp and Facebook chat; you can think about Apple's chat, Google's chat, Instagram, maybe Telegram because it also has quite a big user base. Companies like that might be designated as gatekeepers. There's a designation process that will start in the second half of this year. After you've been designated, you have to open up your chat within three months. Two years later, you must also open up your group chat, and four years later, you must also have your video and voice calling opened up. Right now, at the start, it's okay to open it up by just publishing an API. But the European Commission, in all its wisdom, may decide to draft a standard for interoperability, and if they do so, then the company's own API won't suffice anymore; then you have to follow that standard. They also said that the standard has to be drafted by a European standards organization, so it has to be CEN or ETSI.
Also interesting, there are some more nice things about it, because that API or that standard must have the same security features as the company's own product, and it must be neutral. So it must not be crippled in such a way that nobody wants to use it, or have conditions attached that really set it back compared to the company's own product. Well, I read this and I had a déjà vu. Because some years ago, the Dutch Ministry of Health had a problem with interoperable and secure chat. So they asked for a standard about chat to make it interoperable and secure. They gave the assignment to the NEN, that's the standardization institute, and I happened to be chairing the working group drafting that standard. So I ran into all kinds of questions and things that happened there. So I thought, why not combine that experience with what the European Commission is aiming at, and let's have a look at what we're dealing with. Now, when I think about the European Commission, it's about the question: what do you mean by secure? Vendor A says: we are secure because we offer the best end-to-end encryption in the world. Nobody can eavesdrop. This is the most secure solution you can have. Vendor B says: nah, we monitor all communications and filter out all harmful content. Everybody is safe on this platform. Well, tell me, let's have a show of hands: who thinks A is the most secure one? Who thinks B is the most secure one? Nobody. But even the European Commission is sometimes aiming at security like B. So there's a big misunderstanding about what you mean by secure. And it's kind of difficult to add A to B, isn't it? And end-to-end encryption does have its drawbacks, because both clients must support the same end-to-end protocol. You can't have one doing end-to-end encryption based on Matrix or OpenPGP and the other one on Open Whisper or something like that. It must be the same, and there's all the key exchange and everything.
And spam detection, harmful content detection: we may like it or we may not like it, but it is a thing, and it is an important part of the security model of some platforms. There's also one interesting thing that I saw. Say a nice end-to-end encrypted platform opens up with a nice API, so I can connect to that API, and that API is also end-to-end encrypted. Before it was open, they had their own clients that sent and received the messages. So they knew what was injected into the network, and it was done in such a way that nothing went wrong when it was received by the client. But now I can write my own client, maybe even a fuzzing client or something like that, and inject whatever I want into the network. So they may even get a security risk, because they can't monitor, can't filter out all the nice attacks I inject into the network. So that might be quite interesting. End-to-end encryption may also have intellectual property issues. When there's some proprietary encryption protocol that a vendor bought a nice license for, like WhatsApp did, then they may say: well, here's the API, this is the protocol, and if you want to use it, buy your own license. Well, that's an interesting business case for Moxie Marlinspike. So there are quite a few issues already with just being secure. Oh, that was going too fast. What's really important to understand here is that this is a political decision. The European Commission must say what they mean by secure and how end-to-end encryption would fit in that picture. If they don't, and try to push it to a standardization organization, then it will be mayhem. I can tell from my own experience, and I can also reveal that the standardization project in the Netherlands failed because they didn't define what they meant by secure. But it did come up with a standard in the Netherlands. It was for health care, so it is quite a specific use case. We standardized on secured servers.
So the premise was that when a health care organization sets up a certain server or buys a certain SaaS server offering, that server was meant for this job and secured at a proper level. No end-to-end encryption, because some of them had a security model that relied on securing the data on the server and not keeping anything on the client. And the whole network between the health care providers would be authorized; it wouldn't be an open network. And exactly that last point was where the project was killed, because some vendors wanted to go in another direction there. And then the next question is: when you say interoperable, interoperable with whom? When you publish an API, is it sufficient to publish an API so a client can connect? Nobody knows; the European Commission didn't say anything here. Maybe when you read the law quite closely, this is not exactly what they meant, but they're not really clear about it. It would also mean that when you write a chat client, to be interoperable with all vendors it needs to implement all those APIs. Or is it, like we did in health care, server-to-server with limited parties? So a walled garden of some systems that are interoperable with each other. Or is it really open federation, server-to-server, for whoever sets up a server and connects to it? No idea. But it's quite hard to start a standardization process if you don't have any idea about what's happening here. And, oh, I'm good on time. Then there is the big question about functionalities, because chat isn't just chat. When you look at whatever chat clients you use, you have all kinds of nifty features, starting with /me, which then repeats your own nickname, up to all kinds of fancy stickers being sent around, or emoji you can add to a message, and things like that. There are lots of functionalities that a chat can have.
And certainly with end-to-end encryption, you can't have the server in between translate, or say, well, I handle this in this way; that must then all be done by the clients. And that means that for all of these functionalities, you must make a choice about how you're going to handle it. Is this an obligatory functionality that must be part of the protocol and must then be implemented by all the clients? Or is it optional? Making a functionality optional means that you need to have something like service discovery. Or do you say: no, this is not part of the standard, and you don't have to be interoperable for this functionality. Well, this is the kind of functionality we're talking about. I did leave things out, for example service discovery, certain ways of connecting, how exactly you standardize the connection. I guess that's in the upper layer and not in the functionality layer where you have to choose. But this is the kind of thing you're talking about when you start standardizing this. And I have some time to discuss some of them. When you look at attaching an emoji to a message, well, you think: emoji, that's part of Unicode, so you can just say, this is the message, and then attach this emoji to it. But some clients just don't support all kinds of emoji, either in sending or receiving. They have only a limited set of emoji you can attach to a message. And to get something like that done, you maybe need some service discovery to check what emoji are available at the other client, and then you can tell your client: these are the emoji you can send. Or maybe you just send an emoji and notice it doesn't work. Not really the best way of interoperability. And then there's also something like: oh, no, that was the wrong one, I need to tap another one. So you need to correct that one or delete it. It was inappropriate to do this, so let's get it out.
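The three-way choice above (obligatory, optional with discovery, or out of scope) and the emoji example can be made concrete with a small sketch. Nothing here comes from an actual standard: the feature names, the `Level` enum and the idea that each client advertises a set of supported features and reaction emoji are assumptions chosen to illustrate why optional features force some form of service discovery.

```python
from enum import Enum


class Level(Enum):
    MANDATORY = "mandatory"        # every client must implement it
    OPTIONAL = "optional"          # usable only if both sides advertise it
    OUT_OF_SCOPE = "out-of-scope"  # not covered by the interoperability standard


def usable_features(policy, local, remote):
    """Features two clients can rely on, given what each side advertises."""
    usable = set()
    for feature, level in policy.items():
        if level is Level.MANDATORY:
            usable.add(feature)
        elif level is Level.OPTIONAL and feature in local and feature in remote:
            usable.add(feature)
    return usable


def allowed_reactions(local_emoji, remote_emoji):
    """Reactions the sender may offer: only those the receiver can display."""
    return [e for e in local_emoji if e in remote_emoji]


policy = {
    "text": Level.MANDATORY,
    "reactions": Level.OPTIONAL,
    "message-correction": Level.OPTIONAL,
    "stickers": Level.OUT_OF_SCOPE,
}
mine = {"text", "reactions", "message-correction", "stickers"}
theirs = {"text", "reactions", "stickers"}
features = usable_features(policy, mine, theirs)
reactions = allowed_reactions(["👍", "❤️", "🎉"], {"👍", "🎉", "😂"})
```

In this example message correction drops out because only one side supports it, and stickers drop out even though both sides have them, since the standard would say nothing about how to exchange them.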
So when you start looking at attaching an emoji to a message, it looks like quite a straightforward functionality. When you start thinking it through in an interoperability context, it becomes quite interesting. I also like the move notification: when you send a message to somebody on platform A, you can get a response back saying, no, I'm now somewhere else, on a different account on the same platform or maybe on a different account on a totally different platform. It's a nice one to standardize, maybe. I don't know if all platforms would like that, and whether they'd like to cripple that one or not. Well, I'll just, one minute. Yeah, great. Is there one other one maybe I should talk about, or should we just go straight on to the questions? So thank you. So, first in the back, that was the first, and that was it. The question is: when is a party designated as a gatekeeper? There's a whole set of rules, and it mainly means that you have a big market share and a big turnover on it one way or the other, either from advertising or from paying customers. It really looks like the list I mentioned at the beginning will be the list that gets designated, and others not. But you really have to be very large and be one of the biggest players in the field to get that nice status. You mentioned that the Dutch initiative failed because what was meant by security was not defined, but why didn't somebody in the group say, okay, we're going to define security like this and go ahead? Well, I did. Oh, yeah, I should repeat. So the question is why somebody in the Dutch group didn't define security and just go ahead. Well, I did. And that was exactly the moment that it failed. Yeah, that is a lesson indeed. The issue was really from a commercial perspective: there were some vendors at the table that had different reasons to say, well, I don't want this definition of security. And they made a big mess.
I wasn't really prepared for that meeting, and I really was totally thwarted by one vendor. There was all kinds of confusion raised in the group. And then we needed to do some new cycles on how to get out of this. And that was the moment that the government said: well, this takes too long, we're gone. Well, the start of the problem was that they didn't define what security is. So, lesson learned. And I think it's a very good question, because we should be careful about the same issue and maybe be clear to the European Commission: tell us what you mean by secure. We can't standardize if you don't tell us. Yeah, I saw two fingers there. So my question was, did somebody ask? And he said the problem with asking is that they might actually tell you. At least if they don't tell you, we can dream a little bit that it might be A. The point is, in most standardization situations, you have a bunch of vendors who are intrinsically motivated to cooperate and to work together. And then they can discuss the good questions nicely. Standardization like this is with vendors who don't want to. And that's also a big issue when you push this to a standardization organization, because they're not up to it. If you're sitting there with vendors who don't want to, they will blow up the process. So the only way to avoid that is to make sure that somebody sets the limits and makes clear that you have to do it this way, and otherwise you can leave the table but you will still have a standard that you have to adhere to. |
Real-time audio/video conferences in Linphone thanks to a modern SFU server |
Okay, we can start. Thank you for attending this meeting, which is about Linphone and video conferencing. My name is Jehan Monnier. I've been involved in the Linphone project since 2010, and I'm also part of the company which has been backing the Linphone project for almost ten years. So first, I'm going to provide you with a quick introduction to Linphone, and then have a couple of words about video conferencing with SIP, followed by an introduction to the selective forwarding unit, which is the heart of modern video conferencing systems. Later I'll talk about what is required on the SIP client side to be able to interoperate with this kind of video conferencing system. And finally, the conclusion. Okay, so just a couple of words about Linphone. Linphone is a voice-over-IP client implementing the SIP protocol, which started in the early 2000s. It's available on Linux, Android, iOS, Windows and Mac. It uses SIP as the base standard for almost everything, including audio and video calls, instant messaging and presence: everything which is required for real-time communication. It also provides end-to-end encryption for messaging, based on the Signal protocol, more or less. The Linphone team develops the Linphone software but also Flexisip, which is basically a SIP proxy. And if you want to use SIP accounts, it's possible to create them on our website, mainly for testing purposes. Okay, video conferencing with SIP in a couple of words. It's built around a couple of standards. The first one is basic SIP, with INVITE, REFER and BYE, to create a conference, join the conference and invite other participants to the conference. It's mostly based on RFC 4579, which defines how to create a conference and how to join a conference. There is also another interesting standard, RFC 5366, which is about defining the list of participants of the conference. So that's for the establishment of the conference.
The next important standard is what we call the conferencing event package, which is based on the SUBSCRIBE/NOTIFY RFC. The idea is that when a participant joins the conference, it sends a SIP SUBSCRIBE to the server. The server then notifies every participant of the state of the conference: who is joining, whether they have audio and video, everything which is related to the status of the conference. On the media part, it's regular RTP. For this video conferencing project, we added support for two important RFCs, which are about bundling all the media streams into the same socket in order to avoid having too many RTP sockets, too many RTP streams, per SIP client. Regarding security, it's regular SIP over TLS for the signaling. For the media path, it's either SDES, where the symmetric key is set in the SDP, or ZRTP, or even DTLS-SRTP. And for the SRTP itself, it's standard AES. Okay, now let's introduce the selective forwarding unit. I'm going to start with a description of what we used to have for a conferencing server. In the past, the media mixer received the video from every user, decoded the video streams, performed the mixing (it could be a mosaic or any layout), and then re-encoded the streams to be sent to every participant. This kind of software has existed for 30 years, maybe 20. Here, I just want to show you a diagram from RFC 7667, which is about the RTP topology of a legacy conferencing system. For each client, A, B, C (here it's audio, but it would be the same for video), you have one RTP stream going to the media server and one RTP stream which comes from the media server to each client. It's on the server side that everything is decoded, mixed and sent back to the client. The advantage of this approach was that it was very simple from the client side, as calling the conferencing server was almost the same as calling a regular user agent.
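For the option mentioned above where the symmetric key is carried in the SDP (standardized as SDES, RFC 4568), the key arrives in an `a=crypto` attribute. The sketch below parses such a line. It is illustrative only and not Linphone code: it handles just the `inline:` key method, ignores optional lifetime and MKI parameters, and assumes the 16-byte key plus 14-byte salt layout of the `AES_CM_128_HMAC_SHA1_*` suites.

```python
import base64


def parse_crypto_attribute(line: str) -> dict:
    """Parse 'a=crypto:<tag> <suite> inline:<base64 key||salt>[|...]'."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not an a=crypto attribute")
    # fields[:3] raises ValueError on unpacking if something is missing
    tag, suite, key_params = line[len(prefix):].split()[:3]
    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError(f"unsupported key method: {method}")
    raw = base64.b64decode(key_info.split("|")[0])  # drop lifetime / MKI parts
    # Layout below is specific to the AES_CM_128 suites: 16-byte key, 14-byte salt
    return {"tag": int(tag), "suite": suite, "key": raw[:16], "salt": raw[16:]}


# Build a demo line from dummy key material (never reuse such a key).
material = bytes(range(30))
line = "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:" + base64.b64encode(material).decode()
parsed = parse_crypto_attribute(line)
```

Because the key travels in the SDP, this option is only as confidential as the signaling path, which is why it is paired with SIP over TLS.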
The drawbacks of this approach were that the video layout was defined server side, so you could have one or two different layouts. It required a lot of CPU resources server side, as every video stream had to be decoded and then re-encoded. And end-to-end encryption was not possible, due to the fact that the video was decoded. Now, if we go to the selective forwarding unit, the idea is that the media server is no longer decoding and then encoding every video stream; instead it is switching the video coming from every device to every other device. This can be done according to several policies, like active speaker or mosaic. For that, we also need some information coming from each client, like the volume of the audio stream, in order to be able to know who is talking without having to decode the audio stream as well. If I go to the same kind of diagram, still from RFC 7667, you can see that from the RTP standpoint, you still have one RTP stream from each client going to the media server. But now you also have one incoming video stream per participant of the conference. So if we follow the video stream from client A, you can see that it is copied to client B, but also to client F. So it's no longer a media mixer, but a switching matrix, more or less. The advantages of this architecture are that the video layout is no longer defined on the server side; the client can decide where to display every participant of the conference. It's an application choice, no longer a server choice. It scales very well, as no resources are used on the server side to decode or encode the video streams. And finally, it opens the door to end-to-end encryption, as the media server no longer has to know the content of a video stream. The drawback of this approach is that it requires the SIP client to be able to manage multiple streams, which was not the case for a standard one-to-one call.
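The "switching matrix" idea above can be written down as a tiny forwarding table: for each sender, the list of receivers that get a copy of its stream. This is a conceptual sketch with made-up participant names, not how Flexisip is implemented, and it models only the mosaic case where every stream goes to everybody else.

```python
def forwarding_table(participants):
    """Map each sender to the receivers that get a copy of its stream."""
    return {src: [dst for dst in participants if dst != src] for src in participants}


def streams_per_client(participant_count):
    """Compare stream counts per client for a mixer and for an SFU (mosaic)."""
    others = max(participant_count - 1, 0)
    return {
        "mixer": {"sent": 1, "received": 1},      # one mixed stream comes back
        "sfu": {"sent": 1, "received": others},   # one stream per other party
    }


table = forwarding_table(["A", "B", "F"])
counts = streams_per_client(3)
```

The table involves no decoding at all, which is where the scaling and end-to-end encryption benefits come from; the cost moves to the client, which now receives and decodes one stream per other participant.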
So for the SIP user agent, what we had to change is mainly the following. It's mainly about the multi-stream requirement. In the past, the SIP client was able to send one audio stream plus one video stream. Now, it requires the client to be able to send one, but most of the time two video streams, one for high-resolution video and a second one for a thumbnail, as well as being able to receive one video stream per participant of the video conference. Just an example of the SDP to show what is involved. So bundle mode, as I said, which is required, and rtcp-mux as well; it's to limit the number of sockets used for the media. This extension is related to the audio level, in order to be able to display who is talking and also for the server to be able to select which participant is talking. It still uses ICE to limit the usage of media relays. And for the video part, what you can see is that there are two video streams in receive-only, one for the high resolution of the camera and another for the thumbnail. So it means that we encode the video twice. It could be replaced by a video encoder like H.264 SVC, which supports multi-layer functionality. But if you want to be able to do that with a simple VP8, it's better to encode the video twice. And for the receiving side, there is one video stream because in this example, there is only one participant in the video conference. But this part would be multiplied by the number of participants of the conference. Okay. So this is what we have done in the Linphone project for this feature. It can be tested with the Flexisip server, which is currently running on our infrastructure. So you can create a video conference thanks to this conference factory SIP URI. And using a Linphone client with a version above 5.0, it's possible to join a video conference. Okay. Thank you. Conclusion. Okay.
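The active speaker policy described above, where the server uses the reported audio level to decide which video stream to forward, can be sketched as follows. This is a minimal sketch, assuming the audio level follows the convention of the RTP audio level header extension (0 is the loudest, 127 is silence). The function names and the participant names are hypothetical, not taken from the Linphone or Flexisip code.

```python
def active_speaker(levels: dict[str, int]) -> str:
    # Assumption: levels are expressed as -dBov, so the smallest value
    # is the loudest participant.
    return min(levels, key=levels.get)


def forwarding_plan(levels: dict[str, int], receiver: str) -> dict[str, str]:
    # For one receiving client, pick which of each sender's two video
    # streams to forward: high resolution for the active speaker,
    # thumbnail for everyone else. The receiver's own video is not sent back.
    speaker = active_speaker(levels)
    return {
        sender: ("high" if sender == speaker else "thumbnail")
        for sender in levels
        if sender != receiver
    }


if __name__ == "__main__":
    levels = {"alice": 30, "bob": 127, "carol": 90}
    print(forwarding_plan(levels, "bob"))
```

The server never decodes audio or video here; it only compares the levels the clients already report and switches packets accordingly.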
So now Linphone is capable of joining video conferences in two modes, mosaic and active speaker, using a selective forwarding unit, which allows it to scale. A possible evolution that we have in mind is to implement the XCON conferencing server in order to be able to manage conferences from a website or to have something more advanced. We are also thinking about adding end-to-end encryption to this video conferencing server and, why not, providing compatibility with WebRTC, knowing that the media protocols that we use are very close to those of WebRTC. Useful links. If you want more information about this work, you can go to the Linphone website and have a look at our GitHub. Okay. That's it. If you have a question. Thank you. Are you aware of any other SIP client that implements multi-party video with someone? Not yet, because the work to move from a regular SIP phone supporting only one audio stream and one video stream to supporting this multi-stream is very significant, and I'm not aware of any work in progress so far. So if you want to use it, you have to go with Linphone. Even if it's fully standardized, as we are following standards. Thank you. Not yet. Not yet, but we are quite confident that it's going to scale, as we have removed all the need for audio or video encoding on the server side. So it's really about switching packets. Maybe the question might be about the network on the client side. About the network on the client side: we are sending two resolutions from the client, a high resolution and a low resolution. And in the case of active speaker, we only send back to every client the high resolution of the person who is currently talking and the low resolution of every other participant. So it greatly limits the need for bandwidth. Yes. On the client side, you now decode more than one stream. Correct. It's almost the same answer.
We decode one high resolution and many low resolutions, and the CPU resources depend on the resolution of the video that you have to decode. Just one question about the SDP that you showed before. So there were two receive-only streams for the client? Was that from the client? It was from the server. Okay, because that was my question. Because it looked like the client. The server received two videos from the client, one in high resolution and one in low resolution, and sent one video to this client. There is only one participant in this conference. From the client perspective, when you switch from high resolution to low resolution, you still use the same m-line that you have to send to the client. It's, yes, exactly. Okay, thank you very much. |
Scaling Open Source Realtime Messaging System for Millions |
Sorry for the delay, I had a few technical difficulties. My name is Floris van Geel. Since 2010 I've been an open source enthusiast, and in 2014 I joined my first FOSDEM as a speaker. After that I became an open source, cross open source fanatic. So I really, really love and appreciate it when different open source projects come together and strengthen each other's power. And for Rocket Chat I'm a community liaison, so I help to engage with the community, with support and so forth. So let's cut to the chase, while we're here: scaling open source real-time messaging systems for millions. It starts off with what is Rocket Chat? Who here knows what Rocket Chat is? We see a few hands there, like half the room already knows it, so that's great. Lots of engagement. In general people know about this side of the story, which is team chat, like you know from Slack or Teams if you use that stuff. There are different other variants, also open source. The cool thing about Rocket Chat is that it is not the master of chat, that wants to control all the chat. No, it wants to include as many different chat services as possible. Thus, if you look up there, there is omnichannel. Imagine you have a company with support or sales offices, and those have clients, and clients don't want to install yet another thing on their phone. No, what they have is email, they have SMS, they have WhatsApp, they have WeChat, they have Telegram, to name a few, so I'm not commercializing one of those, and it will connect via an app to the omnichannel, meaning that the people in the back office who have to process that can directly route it and solve the issues. On top of that you can add bots with Botpress or Rasa, and those help to automate the tasks of the people doing those great chores that make business. Since version five there's the option to federate with Matrix, not just with another Rocket Chat, but also with other homeservers.
So it's no longer like we're sitting on an island saying please come to us and chat with us. No, it's actively federating and working and collaborating together, all of us. And then as the extra icing on the cake, Rocket Chat has a very extensive API, not just the normal service calls that you know for creating users and so forth. No, it also has a real-time API, which means that inside your app or game you can directly engage with the message flows and with the chat. Then when it comes down to voice calling, Rocket Chat is agnostic; it's not part of the core. We mainly support the chat, so for voice calling you can make your own choices. You can use Jitsi out of the box, you can add BigBlueButton or, if needed for corporate reasons, Teams or Google or a few others. So that's about Rocket Chat. Now Rocket Chat is built, like many other software architectures, as a monolith, meaning that it is one service that's supposed to do everything, and that is pretty nice if you have a medium-sized organization. Except when things start to scale, you run into issues. It's based upon MongoDB, that's important to know. Originally it was built in Meteor because at the time it was the best, fastest and most efficient route to making a real-time chat service. This monolith you can scale horizontally by adding and combining more monoliths, which obviously has a limit, and you cannot reach beyond a certain number of users. So what Rocket Chat did after version 5, going into 6, is re-architect this monolith into a microservices architecture. You still have the same MongoDB cluster, but on top of that there are different services, like authentication and presence and the actual chats. Those get divided, and they can fail individually and be restarted individually. So you don't have a dependency where, if one little thing breaks, your whole system is down. And on top of that it has the ability to keep on scaling this way.
In order to change this architecture, a new library was chosen. In this case it's called Moleculer, and there are alternatives on the market, but due to functionality and exchange of libraries and code, this module was chosen. It's MIT licensed, extensible, and its most important part is this part here. We wanted to use NATS, but it can also work great with MQTT and other messaging brokers like Kafka. Furthermore, there are many adapters, caching and extra great features to build upon. So in this re-architecting, Moleculer was the primary choice. Why is that? There are options which are actually faster. You see here there's one, Cote, which is faster, but the difference is between having your remote actions, remote calls, and your internal calls. Oh no. Nextcloud is acting up. Sorry for that. Right. And then go back to this one. Right. Still solid. The cool thing about Moleculer is that you can change between these remote and local actions and you can switch them within a proxy. And this flexibility is the reason why this was chosen as the primary driver for the architecture of microservices. And then NATS is pretty straightforward. I imagine that most people in the room have heard about NATS as a standard. It's an open source Apache-licensed tool, very modern, fast. There are many different implementation examples. And the main downside is that it doesn't support queues, and that is solved by the Mongo queue runner, which takes over the division of those queue tasks. And using these libraries also has great advantages. It is possible to make interactive extensions for developers. And that's also something that we've been seeing, coming from Meteor to React and TypeScript for the apps: it's much easier for modern-day developers to adopt the software, and thus the community has more impulse for growth. Perfect. Two minutes left. So in results, if we look down, this one is the monolith.
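The idea of switching between local and remote actions behind one calling interface can be sketched with a toy broker. This is not Moleculer's API (Moleculer is a JavaScript library) and not Rocket Chat's code; every class, method and action name below is hypothetical, and the in-memory transport only stands in for a message broker such as NATS.

```python
from typing import Any, Callable

Handler = Callable[[dict], Any]


class InMemoryTransport:
    """Hypothetical stand-in for a message broker connecting several nodes."""

    def __init__(self) -> None:
        self.brokers: list["Broker"] = []

    def request(self, action: str, params: dict, caller: "Broker") -> Any:
        # Deliver the call to the first other node that offers the action.
        for broker in self.brokers:
            if broker is not caller and action in broker.local_actions:
                return broker.local_actions[action](params)
        raise LookupError(f"no node provides action {action!r}")


class Broker:
    """One node: calls local actions directly, remote ones via the transport."""

    def __init__(self, node_id: str, transport: InMemoryTransport) -> None:
        self.node_id = node_id
        self.local_actions: dict[str, Handler] = {}
        self.transport = transport
        transport.brokers.append(self)

    def register(self, action: str, handler: Handler) -> None:
        self.local_actions[action] = handler

    def call(self, action: str, params: dict) -> Any:
        # The caller does not need to know where the action lives.
        if action in self.local_actions:
            return self.local_actions[action](params)
        return self.transport.request(action, params, caller=self)


if __name__ == "__main__":
    bus = InMemoryTransport()
    chat, presence = Broker("chat", bus), Broker("presence", bus)
    presence.register("presence.status", lambda p: f"{p['user']} is online")
    print(chat.call("presence.status", {"user": "floris"}))
```

Because `call` has the same shape in both cases, a service can be moved from the monolith process to its own node without changing the calling code, which is the flexibility the talk attributes to the chosen library.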
It was tested with 4K concurrent users, due to the fact that that was the maximum it could hold and serve. And doing so, it used 12 gigabytes of RAM as well as 15 virtual CPUs on the Amazon instance used to test it. And the new architecture, the microservices, could hit over 50K with ease. But to make the tests equal, the test was made with 4K concurrent users. And then you see that the load on the CPUs and the usage of memory is significantly reduced. It's only using three CPUs and five gigabytes of memory to serve the same volume of users who are actively chatting. So if you want to learn more about Rocket Chat, this is our main place where we communicate within the company. It's open.rocket.chat. And it's open for community. It's open for support. Everything happens in one place. Sometimes there's a little bit of an earthquake due to some updates or a new version which is deployed. But within 30 minutes, that's all okay. That means that we take our own pain and thus don't leave the pain with our clients. And the project itself you can find on GitHub. And if you go one level up, there are many different implementations, add-ons and libraries that help you to integrate Rocket Chat within your applications, as well as connect with these APIs and its services. I still have 30 seconds left for questions. But my question is, I seem to recall that not that long ago, you guys announced that you're using Matrix as a protocol. What would you say if someone was thinking about choosing one, like what does Rocket Chat add on top of what Matrix provides? Yes, I've had this question many times at the booth and it's pretty easy to answer. So what is the added value of Rocket Chat compared to Matrix, because Matrix is already really, really great, if I translate it correctly. And that is actually about tech savviness.
All of us here at FOSDEM, we're all so-called nerds or geeks or very technical people that can take Matrix and make it work. And specifically if you look at marketing or sales or other roles that are not so tech-savvy, it's much easier to use Rocket Chat, specifically with the omnichannel and engaging with the customers. The UI looks a bit like Slack, however, it's more slick and it's faster. So it's easier to adopt, and that would be the incentive to say, okay, go for Rocket Chat. If you have a tech-savvy organization and you want to deep dive and include WebRTC and have all these components under your own control, go for Matrix. Both have on-prem possibilities, data interoperability, security, and end-to-end encryption. So that's not the real reason for making this decision. Any other questions? Okay, then I thank you very much for attending this late, in the last slot. And if any questions pop up, feel free to join Open Rocket Chat, contact me, contact Duda, contact Gabriel, or anyone that is in there. Thank you. |
Self-Hosting (Almost) All The Way Down
A FPGA-based Fedora-capable computer that can rebuild its own bitstream |
Good morning everyone, thank you for being here. My name is Gabriel Somlo, I'm going to try to get the introductions over with quickly. I work for Carnegie Mellon's CERT, which is sort of the OG CERT the US government started back in the 80s after the Morris worm, because they suddenly realized computers were going to be a thing they were going to have to care about. The cool thing about that is I get to indulge in my paranoia and OCD in a professional capacity, which is much, much better than it sounds, probably the way I make it sound. I'm probably going to sit down during this presentation every once in a while when I need to work the demo a little bit, so don't think that's weird. With all of that out of the way, we're going to talk about self-hosting and why that's important and how it impacts things like hardware and the ability to trust it, and then further into the distinctions between ASICs, application-specific integrated circuits, dedicated silicon, versus programmable FPGAs, and what the threat models are and the trade-offs, and how much you can trust each one of those, and what you're gaining and losing when you're switching between them. And then next will be a demo of what is probably the slowest, most memory-constrained computer capable of running Fedora Linux that you've seen recently. It will be on a 50 megahertz Rocket Chip CPU, a soft-core CPU running on an FPGA. It's going to have 512 megs of RAM in this particular incarnation. It is using, like I said, Rocket Chip and LiteX on an FPGA, with free open toolchains, Yosys, Trellis, and nextpnr, being used to build a bitstream for the FPGA. And then this computer, when it runs Fedora, you can install Yosys, Trellis, and nextpnr on the computer that was built using those tools and run those tools on the computer to rebuild the bitstream for its own motherboard. So it's basically a self-contained, self-hosting thing, which is really exciting. So let's start with this whole idea of self-hosting.
Most of you are probably familiar with what that means. The joke is, well, no, it's not me hosting my own content on Google Drive or somewhere in the cloud, but rather it's a term of art in the field of compiler design, and it means a compiler is written in its own language and it can compile its own sources. And then there's a related concept of bootstrapping, which is basically kind of, well, you have a self-hosting compiler that built its own sources, well, chicken and egg, which one was there first. There had to be a third-party trusted compiler that was originally used to build the first binary of our own compiler before we could rebuild it, and at some point we reached stability where the next iteration of the binary we build out of the sources isn't significantly different from the one we already used, and that basically means we've achieved self-hosting and the process for that is called bootstrapping. And one interesting thing about self-hosting compilers is that they suffer from this attack that Ken Thompson pointed out, Ken Thompson being one of the designers of the Unix operating system among many other glorious achievements. He pointed out that a compromised binary of a self-hosting compiler could be created that attacks clean otherwise sort of benign trustworthy source code and builds malicious binaries, one being of like the scenario he described was the attack against the login program, which if you build a login program, it'll have a backdoor root password that will allow somebody to log in without knowing the actual system root password. The other thing this malicious behavior does is it inserts itself into any subsequent iterations when it detects that the compiler's own sources are being built using it. 
So it's a self-perpetuating attack that isn't actually present in the source code, and the only way to get rid of it would be to re-bootstrap the entire compiler, because presumably we do trust the sources and the sources are clean and there's no malicious behavior specified in the code itself. And one way, not necessarily to get rid of the problem, but to point out or to test whether we have been subjected to one of these attacks, is David A. Wheeler's PhD dissertation on Diverse Double Compilation. In the example here we'll be using CC as our suspect compiler and TC as the third-party compiler, and it's not necessarily T for trusted, it's T for third-party. The heuristic here is that we pick the third-party compiler in a way that gives us a high degree of confidence that it is not in collusion with the suspect compiler. The people who put it out aren't the same group: think maybe GCC on one hand and MSVC, Microsoft, on the other, or something very diverse; that's where the diversity comes from. The way this works is that if we compile the sources of CC with both our own suspect binary and with the third-party binary, if everyone's innocent and no one's trying to screw us over, what should happen is that we obtain binaries reflecting the sources of CC that are functionally identical. Because these are diverse, different compilers, they would produce different code, the code generation would be different, so the binaries aren't bit-by-bit identical, but they should be doing the same thing because they're implementing the same source code.
And then if that is true then the next move would be to take the sources to CC again and rebuild them with our two intermediate compilers that we obtained and if you control for the initial conditions, if you have the same initial conditions, same random number generator seed and everything and identical input pumped into functionally identical binaries the result should be bit by bit identical and if that's true then we can breathe a sigh of relief and say okay we are very unlikely to be subject to a trusting trust attack and that degree of confidence is sort of equivalent to our heuristic ability to pick a third-party compiler that isn't in collusion with our suspect compiler and by the way the highlighted box on the bottom here is basically the process of bootstrapping CC using the third-party CC compiler. So back to self-hosting, if you have a self-hosting compiler and source code to everything, the binary of the compiler when it operates, when it runs, it runs on top of I don't know a C library and the kernel and basically a software stack, it's an application on top of that but it's an application that can compile all of the things it needs to run itself. And if you have source to everything and you've compiled everything from sources that you otherwise trust then you have a self-hosting software stack built around your compiler and sort of the applications are bonus, you know, all the stuff you actually want to use the computer for. If you build that from source but the stack of software, you know, with the C compiler at the top, systems, libraries, kernel and whatever you have underneath that for the software is a self-hosting software stack and examples of that we have in the wild, there's the Linux ecosystem, there's the BSD ecosystem, those are all sort of compliant to this with this idea. 
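The diverse double compilation check described above can be sketched with a toy model. This is only an illustration, not Wheeler's actual procedure or tooling: compilers are modelled as small records, a "binary" is just a hash, and the trojan is a boolean flag that propagates itself when the compiler's own sources are being built. All names (`Binary`, `compile_with`, `ddc_check`, `cc.c`) are hypothetical.

```python
import hashlib
from dataclasses import dataclass

# Code generation style specified by each compiler's source code (assumed).
SOURCE_CODEGEN = {"cc.c": "cc-style", "tc.c": "tc-style"}


@dataclass(frozen=True)
class Binary:
    implements: str       # which source this binary was built from
    codegen: str          # code generation style it applies when it compiles
    bits: str             # stand-in for the binary's bytes
    trojan: bool = False  # hidden self-propagating behaviour, not in any source


def compile_with(compiler: Binary, source: str) -> Binary:
    # A trojaned compiler re-inserts itself when it builds the compiler sources.
    trojan = compiler.trojan and source == "cc.c"
    # The emitted bits depend on the source and on how the *running* compiler
    # generates code, plus whatever the trojan adds.
    payload = f"{source}|emitted-with:{compiler.codegen}|trojan:{trojan}"
    bits = hashlib.sha256(payload.encode()).hexdigest()
    return Binary(source, SOURCE_CODEGEN[source], bits, trojan)


def ddc_check(suspect_cc: Binary, third_party: Binary) -> bool:
    # Stage 1: build CC's sources with both compilers (bits may differ).
    stage1_cc = compile_with(suspect_cc, "cc.c")
    stage1_tc = compile_with(third_party, "cc.c")
    # Stage 2: rebuild CC's sources with both stage-1 results.
    stage2_cc = compile_with(stage1_cc, "cc.c")
    stage2_tc = compile_with(stage1_tc, "cc.c")
    # With identical inputs and conditions these should match bit for bit.
    return stage2_cc.bits == stage2_tc.bits


if __name__ == "__main__":
    tc = Binary("tc.c", "tc-style", "tc-bits")
    clean_cc = Binary("cc.c", "cc-style", "cc-bits")
    evil_cc = Binary("cc.c", "cc-style", "cc-bits", trojan=True)
    print(ddc_check(clean_cc, tc), ddc_check(evil_cc, tc))
```

In this model the stage-1 binaries differ in their bits but behave the same, and only the stage-2 comparison reveals the trojaned compiler, which mirrors the argument in the talk.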
Now there's kind of a holy war going on about whether hardware will respect your freedom or not. Some people are claiming that hardware should be completely immutable and never upgradeable with firmware or anything like that in order for it to be completely respecting of your freedom, with no binary blobs, and different people say, well, you may actually be able to put free firmware blobs on your proprietary firmware-blob-enabled hardware of today if you just reverse engineer it, and so on. Anyway, the idea is that in order to trust the computer, it's not enough to just have a self-hosting software stack; we need to understand what the hardware does. And hardware, as we've learned in recent years, isn't really hard at all, it's very, very mushy, very complicated. It does all sorts of things that scare us and we need to take a closer look at it. Software talks to an instruction set architecture and a bunch of registers that are mapped somehow, and that's basically where software talks to the hardware, that's the demarcation line here. And then there are all sorts of layers underneath, microarchitecture, whatever, and it all ends up at this register transfer level, which is combinational and sequential logic, basically a bunch of gates, a bunch of flip-flops, a clock and so on. And it's not my word for it, it's just a word I picked up in the wild, I don't know exactly who to attribute this to, but these layers of the hardware stack are typically referred to as gateware, and it's the stuff you write in something like Verilog or VHDL or Migen or Chisel and so on. And then obviously all of this has to run on actual physical hardware, which could be dedicated circuits, application-specific integrated circuits or optimized silicon, or programmable FPGAs.
And so, if we have free software toolchains for HDL compilers, for making gateware out of sources, which we do thanks to the group who put out Yosys, Claire Wolf, and gatecat, who made Trellis and the nextpnr place and route software, so anyway, if we have those things, those are software that can be built by the self-hosting C compiler at the center, which can compile the software stack. Now this thing can take source code, HDL sources, and build all the layers of gateware, which then support all the operation of the software stack, so you have a self-hosting software plus gateware stack. Unfortunately, that for now leaves out the actual physical layer, the silicon versus the FPGA, so this is as far down the layers of abstraction as we can go with self-hosting that I'm personally currently aware of. And so, being a relative latecomer to developing hardware, I'm a software person, have been my entire career, I took a couple of classes at the university where I work, learned Verilog, learned a bit of digital design, and it surprised me to realize that essentially designing gateware is sitting down in front of a development environment and writing a program in some kind of functional slash declarative syntax like Verilog and VHDL. You basically write a program and then hit the compile button, and it compiles your code into an ever more elaborate graph, a netlist of building blocks and eventually gates. And then you have a choice of building a binary blob which is a bitstream for the FPGA, and it's basically a binary blob just like the binary blob that comes out of an actual program you write for software. The difference is that software will tell some CPU a sequence of steps of what to do, whereas a bitstream will tell an FPGA what to be; it sits there and it acts out the configuration that has been compiled into the binary blob. But other than that it kind of looks like software development to me, and I'm probably pissing off a bunch of people by saying that.
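The distinction between software telling a CPU what to do and a bitstream telling an FPGA what to be can be illustrated with a lookup table, the basic configurable block in many FPGAs. This is a simplified sketch, assuming a LUT is nothing more than a truth table selected by its inputs; `make_lut` is a hypothetical helper, and real bitstreams also configure routing, flip-flops and I/O.

```python
from typing import Callable


def make_lut(config_bits: int, n_inputs: int) -> Callable[..., int]:
    """Return a LUT whose behaviour is defined entirely by config_bits.

    Bit i of config_bits is the output for the input combination whose
    binary encoding is i (first input is the least significant bit).
    """

    def lut(*inputs: int) -> int:
        if len(inputs) != n_inputs:
            raise ValueError(f"expected {n_inputs} inputs")
        index = sum(bit << position for position, bit in enumerate(inputs))
        return (config_bits >> index) & 1

    return lut


# The same "hardware" becomes a different gate depending on its configuration.
and_gate = make_lut(0b1000, 2)  # output 1 only for inputs (1, 1)
xor_gate = make_lut(0b0110, 2)  # output 1 for inputs (1, 0) and (0, 1)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and_gate(a, b), xor_gate(a, b))
```

Nothing is executed step by step here: the configuration bits simply decide which function the block is, which is the sense in which a bitstream is "what to be".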
Now the interesting thing is, if you don't want the FPGA bitstream but rather would like optimized silicon, then you're compiling, further elaborating your gates and your RTL into a very complicated graph of transistors, which then get laid out and made into masks, and there is an entire very, very expensive, very, very involved process of actually etching this, carving it into stone so to speak. We have the saying: is it the dog that wags the tail or is it the tail that wags the dog? Well, in those terms, making actual silicon is one stage in a compilation pipeline, like a software development compilation pipeline, so it's like a five-megaton tail wagging a tiny little chihuahua. But if you look at it from a software guy's perspective, it's just one stage of the compilation pipeline. I just kind of figured I'd share that with you. So now we have the option of doing a CPU, and this slide is specifically from the perspective of: we're going to make a CPU, and the choices are putting a CPU in dedicated silicon versus putting a CPU in an FPGA. With the dedicated silicon, obviously, you have high performance, lower area, high clock speeds.
The problem with that is, from the perspective of the hardware attack surface, one thing we don't control is the foundry, the chip foundry where we're sending those masks to be made, right? And there are documented attacks that have been done. The University of Michigan group had this A2 Trojan at IEEE Security and Privacy, like three, four years ago, and what they did was this: if you have access to these masks, then you can tell where things are. These things may have billions of transistors, but if you carefully understand how this whole thing works, you can put in 20 transistors and a capacitor, and the transistors are wired such that when the CPU, because this is a CPU, remember, is executing a sequence of unprivileged instructions, depending on how you wired those transistors in, they incrementally charge the capacitor a little bit at a time, until at the end of the sequence the charged capacitor will flip a bit in a register. And if that register is your CPU privilege flag, as in ring or whatever, your kernel mode versus user mode, then you have a baked-into-silicon privilege escalation vulnerability that relies not at all on any vulnerabilities in software. So if you theoretically had perfect software, you'd still be able to do, I mean not a buffer overflow, a privilege escalation attack on a CPU that's been compromised like that.
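The charge-accumulation idea described above can be sketched as a small discrete model. This is a toy illustration only, not the circuit from the published attack: the integer charge units, the threshold, the leak rate and the class name are all assumptions chosen to show why a dense burst of trigger instructions fires the trojan while sparse, ordinary use does not.

```python
class ChargePumpTrojan:
    """Toy model: a capacitor charged by trigger events, leaking otherwise."""

    def __init__(self, threshold: int = 10, charge_step: int = 1, leak: int = 1) -> None:
        self.threshold = threshold      # charge needed to flip the target bit
        self.charge_step = charge_step  # charge added per trigger instruction
        self.leak = leak                # charge lost on any other cycle
        self.charge = 0

    def step(self, trigger: bool) -> bool:
        # Each trigger instruction adds a little charge; anything else lets
        # the capacitor leak, so only a dense burst reaches the threshold.
        if trigger:
            self.charge += self.charge_step
        else:
            self.charge = max(0, self.charge - self.leak)
        return self.charge >= self.threshold


def run(sequence: list[bool]) -> bool:
    """Return True if the privilege bit gets flipped during the sequence."""
    trojan = ChargePumpTrojan()
    privileged = False
    for trigger in sequence:
        if trojan.step(trigger):
            privileged = True  # the flipped bit stays set once it happens
    return privileged


if __name__ == "__main__":
    print(run([True] * 10))         # dense burst of trigger instructions
    print(run([True, False] * 50))  # the same instruction in sparse use
```

The point of the model is that the trigger is an unprivileged instruction pattern, so no software bug is needed for the escalation.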
As opposed to FPGAs, where you're asking the foundry, the manufacturing facility, to make you a regular grid of basic configurable blocks. It kind of looks like Snap Circuits for grown-up engineers, you know. And most importantly, the foundry has absolutely no idea what this FPGA will be used for, and if it's ever going to be used for a CPU, where on this regular grid of identical blocks the register will be that holds the crown jewels, like the privilege ring flags or anything like that. So pre-gaming an attack in this scenario is qualitatively harder for the hardware manufacturing facility, because they don't know what you're going to be using it for and where your things are going to be put on it by the place and route software. So the price you pay for not letting them know where your privilege flag is going to be, by using soft CPUs, is basically performance, a huge performance loss, but that's essentially the trade-off. So if we've decided to use FPGAs because we're paranoid and we're trying to deny the silicon foundry knowledge of what we're going to be doing, the rest of the attack surface is, you know, if we don't trust our HDL toolchain, but we do, because it's part of the self-hosting stack and we have source code to it. And then there could be design defects, like bugs in the sources to the CPU, kind of like, I don't know, Spectre and Meltdown, and you'll never know whether those are intentional or just somebody getting away with trying to optimize things, plausible deniability all the way down. But if you have source code to everything, you can always just edit the source code and rebuild things, and you have a self-hosting environment which will allow you to rebuild every part of it as necessary. Which is what brings me to this slide: freedom and independence from any sort of black-box, closed, non-free dependencies, and you can trust the computer that runs as a self-hosting gateware plus software stack to the same extent you can trust
the cumulative set of source code. Now a lot of people are going to say, well, no one ever reads that much source code and it's impossible to understand. I agree, I don't want to read any of those sources myself, but the cool thing about it is, if I ever down the road have a question about, hey, this computer did something weird, I could do a vertical dive into the software layers, the RTL, the source code to the gateware, the source code to whatever it is that does something weird. I can actually have enough brain power to do one sort of debugging session through it, but in order for me to be able to do that I need to have source code to everything, with the knowledge that I'm not going to read most of it. So that's kind of my perspective on this, my ability to trust my own computer. I hope I'm doing okay with time, are we talking about 15 minutes here? Perfect. All right, so I am going to now show you a Fedora-capable computer built on this LambdaConcept board. If you download the PDF from the conference site, the links are clickable, so it'll take you to the place where I ordered it from. It's a commercially available board, hopefully they'll make more because it was sold out the last time I checked. It uses LiteX and the Rocket Chip CPU, it uses Yosys, Trellis and nextpnr for the toolchain, OpenSBI for the firmware, and then I downloaded the latest incarnation, based on Rawhide 37, of Fedora's RISC-V 64-bit port. Thank you, David Abdurachmanov, he's the guy, you know, the one-man show behind building most of the stuff, and it's really, really appreciated. If you have LiteX and all its dependencies installed, and there's going to be a link in the slide deck to more detailed build instructions for this, but it's pretty much a stock LiteX build, you install LiteX according to all the recipes that are available online and then you run this command line, which says we're going to build it with the Rocket Chip, can't highlight, with the Rocket Chip CPU,
50 megahertz, I want Ethernet, I want SD card support, I want to use the flow3 optimization for the Yosys component of the toolchain, I want strict timing, and I want the register map saved to a CSV file. Now this is all a little bit clunky still at this point, because you're going to have to kind of manually build the device tree table for it. LiteX doesn't build a device tree table for Rocket Chip based designs automatically, and it's one of the things on my to-do list to teach it how to do that. But once you have the generic sort of register map and you know what the addresses are for all the devices, we have to add a chosen bootargs line, which contains the kernel command line for booting Fedora, and the black font is sort of the standard copy and paste from what Fedora already uses, modulo this root, which is going to be on the SD card. The other thing we need to do is set enforcing to zero, because once we have rsynced stuff from one image to the SD card, the labels are all wrong and SELinux is going to scream at us, so we set enforcing to zero. And then the default is to boot into graphical mode, so we have to tell it to use the run level three equivalent, which is the multi-user target in systemd. And then last but not least, systemd is really, really impatient, because it's used to running on, I don't know, five gigahertz, 20-core systems. This thing's 50 megahertz, so systemd will give up on starting services way before the thing actually has a chance to start all that stuff, so we need to increase the systemd timeout. Now, enforcing we could get rid of: it takes about a day to relabel the whole SD card on the 50 megahertz system, but then you can get rid of this part of the command line because it'll actually work properly with SELinux. And you can set this as the default, so then you can get rid of the multi-user target, but this should stay because it
affects both the initrd version of systemd and the one that actually boots from the real root. Now, once we have a device tree blob ready to go, we make a binary out of it, and then we build OpenSBI. That's another sort of stock thing: you get OpenSBI to just build itself using a built-in DTB right now. So the other thing that LiteX should eventually be made to do is to build a DTB into the actual bitstream, and then have OpenSBI just kind of take that, like it does for most normal computers. For now, we just kind of have to build the binary DTB into the OpenSBI blob. And then the first partition of the SD card has to be VFAT, and there has to be a boot.json file which lists the OpenSBI blob and its load address, which is the very first address in memory, and then the Linux kernel image and the initrd image. The Linux kernel image and the initrd image just come from Fedora, or would normally, but I had to do some customizations. The stock Fedora kernel has two problems that I'm dealing with right now. One of them is that it lacks UART IRQ support for the LiteX UART, and that stuff is kind of making its way upstream right now. It's somewhere in Greg KH's tty-next tree, and it's been accepted for upstream but hasn't made it into mainline yet.
The other thing is, between the RISC-V Fedora port based on Fedora 33, which was the previous kind of major release of the RISC-V port of Fedora, and the current one, a bunch of config flags have been additionally turned on in the stock Fedora kernel configuration. I found two, but I'm working on finding a third one, which, if enabled, will cause the kernel to crash when it boots on this computer. Either David will tell me, well, we can get away with not enabling this one because it was enabled by mistake, or, if it has to be enabled, then I either have to find a percolating patch for some kernel bug that has already been found, or I have to find it and submit a patch myself. But anyway, that's kind of work in progress. Right now I'm building a custom kernel, and I'm doing that on a RISC-V Fedora machine running on QEMU, for reasons of speed, and, you know, I need something to actually build the kernel before I can boot this machine for the first time. And then down here at the bottom there's a URL, a clickable link, with all of this but in much more detail, which you can actually reproduce. All right, well, perfect, because now I'm going to sit down and actually try to work this demo for you guys. I recorded an asciicast of my terminal. Let me try to maximize this on my screen. I'm sending the bitstream with OpenOCD, so this is the ECP5 bitstream that I built, and I'm sending it to the ECPIX-5 LambdaConcept board. This is LiteX; this is basically where I type sdcardboot. I'm going to try to zoom in so you can actually read the screen and see what it's doing. So, you know, it's starting to boot: it loaded boot.json, it's loading the RAM disk, and if I fast forward, it's going to load the actual kernel image, and then it's going to start booting. And this is what it looks like. It takes a very long time: this whole video, if you have time to watch it at normal speed, is four hours long. Well, it's a 50-megahertz computer, what
do you want? So anyway, let's see, if I fast forward through this creatively, you'll see systemd actually booting here and a bunch of "OK" services being started. Let's see, at some point we failed to mount, what, /var/lib/nfs whatever, but we don't care about NFS on this computer. Oh, it also failed to start firewalld, but other than that it seems to be pretty happy, and at some point it starts the console. Let me pause this again, and let's zoom in properly so you guys can see what it looks like. So this is a login prompt for Fedora in text mode. If you don't have IRQ support in your UART, it would trip over itself: getty would basically just kind of interrupt itself before it has serviced its own soft interrupt, so it needs actual IRQ support in order to not crash when it starts. The other cool thing about this is that it actually does work on the network. If I nmap from my normal machine for Fedora, it'll actually find this: 192.168.2.229 on my home network was where this Fedora machine actually grabbed the DHCP lease and talked to dnsmasq and everything. My attempt to log into this machine took about 20 to 30 minutes, because here's the cool thing, right: you type your login, and it starts, you know, the login program, and then it starts bash, right? So in order for all of that to work, it needs to be loaded into RAM and linked against glibc and all that stuff, and that takes a little longer than the timeout the first couple of times, until it's actually managed to pull enough of it into RAM and actually let you log in. So there's a couple of attempts, and I'm trying to log in both at the console in this window and over SSH. And see, here I actually just succeeded, because it says, hey, "Last login" something something. That means I'm actually going to get a shell eventually. And once I do get a shell, I can start, you know, exploring: cat /proc/cpuinfo looks like this, /proc/interrupts looks like that. I have the UART, I have Ethernet zero, I have my SD card, and this is part of the CPU. The /boot/
boot.json: this is the file that told the LiteX BIOS what to load into memory. What else do I have here? This is the actual source, I mean, I just copied it over to the SD card, so this is the source to the device tree file. There's my bootargs, console, all the stuff we talked about on the previous slides. This is the CPU node. Okay, let's see what else is going on here: my devices. But all of this I had to edit by hand, and I promise I'm going to teach LiteX, I'm going to submit a patch to make it generate this programmatically, so that I don't have to modify the device tree file every time I rebuild my thing. So, long story short, I'm going to fast forward over a lot of this stuff. Once I'm able to log in from everywhere, the next thing is that systemd-networkd, or rather systemd-resolved, is not enjoying itself on this machine, so I had to disable it and stop it and add 8.8.8.8 and 8.8.4.4 to an actual hard-coded /etc/resolv.conf. At that point my DNS resolution started working. chrony also started working, because it could resolve the Fedora NTP whatever, the alias thing it has in its config file. And once I have all of this ready to go, I type dnf -y install python3-migen, yosys, trellis, nextpnr, and it's doing it really slowly, but if you have patience, or can fast forward, which is really cool, it's a cool feature of asciinema. How do you pronounce that? A-S-C-I-I-N-E-M-A, "ASCII cinema"? I don't know, but you guys know what I'm talking about, right? Like, you can record your... anyway, this is basically what I used here. And so, I don't know, we're like a hundred and forty-two minutes into this entire thing, and it's installing RPMs. Well, we'll get to that; we are going to address that elephant in the room, definitely. So here's the thing, right: basically it takes about an hour plus to install all the RPMs. But then, let me pause this thing at some point. What I did to demonstrate the fact that it can actually self-host is I had a very simple Verilog blinky, which just makes
a counter out of the ECP5 board's LEDs. If I zoom in here, essentially this is what it does: it has a counter, and, you know, LED 0 red, LED 1 green, LED 2 blue, and LED 3 red, that's basically bits 27, 26, 25, and 24 of the counter, and it kind of goes at, you know, a couple of seconds, so you can actually see it blink. And that's the Verilog. And I'm running, actually, here we go, the build of this. Here's the shell: I have a shell script, but I am running it manually. So Yosys is the first thing, which creates a JSON file; nextpnr will do the place and route; and then ecppack will actually take what nextpnr produces and spit out an SVF file, which I can shove at the actual board on which this computer currently runs. And so I did that. In this other window I have top running, and so Yosys is using, oh, I don't know, let's see if I keep going. Percent CPU, where is percent CPU? Right here. So it uses about 80 percent of the CPU. If, I don't know, man-db, whatever, cron starts some process, that drops down to 50 percent, because now it's splitting it with whatever that thing is, so I had to kill those while this was running, just to keep it on task and making progress. Run Yosys, run nextpnr, and this is pretty much, if you've run nextpnr ever before, you'll recognize the output. It succeeds, then we do ecppack to generate the SVF file, and once that is over, I did an md5sum of the top SVF file, so that when we run the following demo, when I show you pushing this thing to the actual board, when it starts to blink, here's the checksum of the SVF file: BAE0, yada yada yada, 618. All right, so at this point the job is done. It took, I don't know, 50 minutes to build the bitstream. And if I pause this, perfect, if I pause this creatively, here I am doing an md5sum of the top SVF file, and you'll see BAE0D, yada yada, 618. That is actually the thing I built on Fedora running on this board. And if I let it run and it goes,
then you'll see, right now it's like, kind of, okay, so here it started blinking, and it blinks exactly like the Verilog I showed you earlier says it should. So I was capable of building a bitstream for this board on Fedora running on this board. And with that, we're going back to the tail end of the slide deck, and we're talking about, right, so building the blinky on my Intel laptop takes, what, 10 seconds or less. So 10 seconds, dot dot dot, 90 minutes. Building the bitstream for the actual RISC-V Rocket Chip LiteX takes half an hour, dot dot dot, whatever that translates into. You'd be here a very, very, very long time if you waited for this thing to really self-host itself and rebuild its own bitstream. It can do it, we've established that. The qualitative leap has been made; it's just a quantitative problem now to make this thing faster, right? And so the immediate thing is to figure out the Linux config stuff, teach LiteX how to be more, you know, civilized about booting and generating device trees, and working maybe with U-Boot or something, and actually have a standardized boot process. LiteSATA: they have a SATA core which works on some FPGAs, but currently not yet on the ECP5. In the medium term, right, in order to make this thing a little faster: on my VC707 board I can get eight cores running at 100 to 150 megahertz, so basically twice or three times as fast, and eight times as many cores as I can fit on the Lattice chip. The problem is, if I do that, it's not self-hosting, because I need Vivado to pull that off; I need the Xilinx proprietary tools. So whatever I can do to encourage, or join, or whatever, in the future, the effort to target the Xilinx, large Xilinx, not just any Xilinx chips, large Xilinx chips, with completely free toolchains, count me in. Let me know, tell me what I need to do. You know, I don't have a lot of money, but I have a lot of determination. I'm a very stubborn individual. All right, well, thank you, I'll take that as
a compliment. No, for real, I get that. We could put in fancier IP blocks if it's a larger FPGA. Maybe we can get away with some kind of video-card-like thing, or maybe be a PCI master, so we can plug video cards or other cards into this computer. And then in the long, science-fiction term, I mean, what I'm doing right now is I'm taking a class, or a sequence of classes, that culminate in taping out an actual ASIC at Carnegie Mellon, which I'm doing in my spare time. I want to understand how ASIC fabrication works, because I want to have something useful that I can say about it. Right now it's just all high level: oh yeah, you know, you can't trust the fab. But I have no idea what goes on in one of those things, and I want to know what goes on in one of those things. And then there's been a kid who probably just graduated from their electrical and computer engineering department, Sam Zeloof was his name, but he was famous before he joined CMU, because he did some silicon transistor stuff on integrated circuits in his own garage, probably with, like, 70s technology or whatnot. But it's a start, right? And then maybe in the future it would be really cool if I lived long enough to see some kind of, you know, nano-assembler, kind of like a 3D printer, in my house, that maybe costs as much as or less than, sort of, the average American single-family detached home, or something. Because right now, the way chip fabrication works, you can count on one hand how many actual places there are that make these things, right? And then obviously they have the attention of important people and nation-state actors and all this stuff. It would be nice if we could democratize that a little bit more. So that's kind of, you know, if I live long enough to see that, I've either lived a very long life, or something cool happened in my lifetime. Either way, I win. And with that, thank you. Time for questions, or do we do that off... okay, awesome.
That's a good talk. Just, okay, used the entire 40 minutes. Sweet, thank you. |
QtRVSim—Education from Assembly to Pipeline, Cache Performance, and C Level Programming |
So, good morning, ladies and gentlemen. I am Pavel Pisa from Czech Technical University. I have taught computer architectures for something like 20 years. Since the year 2000, we have followed the famous book about computer architectures by Professors Hennessy and Patterson. And we have used the original MipsIt simulator, which was distributed with the book, but it has got dated. So we decided that we needed to do something for our students to have a better experience. And we started first with a MIPS simulator, and then we switched to RISC-V. And this is the work of Jakub Dupak, our student. So I hope that everything is working, and I pass him the mic and the stuff. OK? So, good morning. Over the past decades, computer engineers have been creating faster and faster computers. Meanwhile, many software engineers kept writing slower and slower software. And in many areas, this is fine, but in other ones, we need to process a pretty much insane amount of data, often in real time. And to do that, we need really efficient software that can exploit the capabilities of the hardware. And only software engineers who understand the principles of the hardware can do that. So to ensure we will have such software engineers, we need to teach our undergraduate students at least the basics of computer architectures. And because nobody wants to learn with pen and pencil anymore, we started using a graphical simulator. So, and it is more. Yes. So we started with the MipsIt simulator, as was mentioned, which was shipped with the Hennessy and Patterson books. However, it had pretty limited features, and it only worked on Windows. So in 2019, Karel Koci, who is sitting over there, created QtMips, a graphical simulator of the MIPS microarchitecture. And it was continued with a lot of work by Dr. Pisa here. And in 2022, myself, with my colleague Max Hollmann, and again, a lot of work from Dr.
Pisa, we have switched the simulator to RISC-V so that we could switch all the undergraduate computer architecture teaching to RISC-V. So we are there. The simulator is licensed under GPL-3, and it's a native simulator running under Qt 5 and 6. It's developed and available on GitHub. And to better collaborate on the materials, we have joined forces with our sister faculty. We are from the Faculty of Electrical Engineering, and they are the Faculty of Information Technology, and we have created what we call the Computer Architectures Education project. And you can find there all the materials that we have: recorded lectures, slides. And furthermore, to collaborate with as many people as possible, the university has lately joined the RISC-V foundation. So the simulator is called QtRvSim, and it is currently used by our university, the Technical University in Graz, the University of Colorado at Colorado Springs, and the University of Porto. The previous MIPS version is still used by Charles University in Prague and the National and Kapodistrian University of Athens. And this talk will focus on the teaching and user perspective. If you are interested in how it works inside, at least a little bit of it, you can look up the talk that we gave at the RISC-V International Academic and Training Special Interest Group. So before I dive in, as Dr. Pisa mentioned, you can just open this link and follow the presentation with the simulator running in your browser. It is in chat. It is in chat. Great. So let's dive in. When we get students who have never heard anything about computers before, we need to start really simple. So what we do is that we start with a very simple, single-cycle microarchitecture, and we show them how basic instructions are processed there. So this is the simulator. This is how it looks when you open it.
So we will hit No Pipeline and No Cache to get started, and hit Start Empty, because we will use one of the examples provided, so File, Examples, and simple load and store word. As you can see, the program is really simple. It just loads from one part of memory, stores into another, and then goes into a loop using a branch instruction. So this is the basic view that we start with. And on the right side, you can see the detail of the program memory, with the address of each instruction, the hexadecimal code of the instruction, and the disassembled instruction itself, where the last two columns can be edited directly in this view by double-clicking and writing the instruction or pasting the code. And furthermore, the leftmost column is a place where you can set breakpoints, so you can use it like a debugger, or at least the simplest part of that. And now we can move to the heart of the simulator, and that is the visualization of the pipeline. So if you look closer, on the left, we start with the program counter, the program memory, and circuits to update the program counter. We load the instruction, and here you can see the hexadecimal value of the instruction shown on the bus, and it gets to the control unit, where we send all the control signals that we need. So we are showing a load word instruction, so we need to send the data from memory to registers. We need to read the memory, and we will need an immediate value for the arithmetic and logical unit. Right here, we can see the mechanisms for resolving branches, but this is a load word instruction, so nothing is happening there. And here is our arithmetic and logical unit, where one input is either from registers or from the PC, and the other one is either a value from registers or from immediate decode. Here we get the requested operation of the ALU, and here we get the signal whether the result was zero, for branching. And finally, here we have the data memory, with its address and the data to be written.
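As an illustration of the control signals described above, here is a minimal Python sketch of what a control unit might emit for the three instructions in the example program. The signal names (`reg_write`, `mem_to_reg`, `alu_src_pc`, and so on) are hypothetical labels chosen for this sketch, not the names used in the simulator, and the table only covers load word, store word, and branch-if-equal as they are described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """Hypothetical control signals for a single-cycle datapath."""
    reg_write: bool = False    # write a result into the register file
    mem_to_reg: bool = False   # the written value comes from data memory
    mem_read: bool = False     # read the data memory
    mem_write: bool = False    # write the data memory
    alu_src_pc: bool = False   # first ALU input is the PC, not a register
    alu_src_imm: bool = False  # second ALU input is the decoded immediate
    branch: bool = False       # the instruction may redirect the PC


# One entry per instruction of the example program.
CONTROL = {
    # lw: address = register + immediate, memory value goes to a register
    "lw": Control(reg_write=True, mem_to_reg=True, mem_read=True,
                  alu_src_imm=True),
    # sw: address = register + immediate, register value goes to memory
    "sw": Control(mem_write=True, alu_src_imm=True),
    # beq: the ALU adds the immediate to the PC to form the branch target
    "beq": Control(branch=True, alu_src_pc=True, alu_src_imm=True),
}

for name, signals in CONTROL.items():
    active = [k for k, v in vars(signals).items() if v]
    print(f"{name}: {', '.join(active)}")
```

Listing only the active signals per instruction mirrors what the visualization highlights: for a load word, the memory read, the immediate input of the ALU, and the memory-to-register path.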
So we start executing, and we see that the load word instruction was highlighted, and in the register view that appeared up there, we see that the value 12345678 was written into the register bank, which is highlighted by the red color. If we continue, we move to the store word instruction, and now the value needed to be read. So it's blue now. And if we go back to the detail of the view, we see that we read that value from the register, and we are sending it by this wire to the data memory to be stored. We also read the address from the immediate decode, and we are sending that to the arithmetic and logical unit to be added to a register, which is in fact zero. So we are just sending the value to the memory, and now the memory can write. So let's see what happened in the memory. Here we can see the address 404 with the value written. So it corresponds to the address we have here, and to the value we have here. And right above it, we can see the place where we read the value from. So speaking of the memory view, right now we are working with words, but we can work with other sizes. So you can switch the view for the unit to be either word, half word, or byte, which will work with respect to both little and big endianness. And once we add a cache, which will be in a moment, we can switch the memory view between a direct view of the memory and a view where we are looking through the cache. So we will also see the cached data. Now we move to the last instruction of the program. That is the branch-if-equal instruction. And we will see that we take the destination from the immediate. It's negative, so we're jumping back. We add it to the program counter, and here we send it back to the program counter. In the control unit now, the signal for branching, for the most generic one, is active. And we see that the branch-resolving mechanism determined that the branch should be taken.
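The switch between word, half-word, and byte views in either endianness can be sketched in a few lines of Python. This is only an illustration of the idea, not the simulator's code; the function name `view_memory` is made up for this sketch, and the sample value is the one written by the example program.

```python
def view_memory(raw: bytes, unit_size: int, byteorder: str) -> list[str]:
    """Render raw memory bytes as hex strings of unit_size bytes each.

    unit_size is 1 (byte), 2 (half word), or 4 (word);
    byteorder is "little" or "big".
    """
    if unit_size not in (1, 2, 4):
        raise ValueError("unit_size must be 1, 2, or 4")
    if len(raw) % unit_size != 0:
        raise ValueError("memory length must be a multiple of unit_size")
    width = unit_size * 2  # two hex digits per byte
    return [
        format(int.from_bytes(raw[i:i + unit_size], byteorder), f"0{width}x")
        for i in range(0, len(raw), unit_size)
    ]


# The word 0x12345678 as it sits in a little-endian memory.
memory = (0x12345678).to_bytes(4, "little")

print(view_memory(memory, 4, "little"))  # ['12345678']
print(view_memory(memory, 2, "little"))  # ['5678', '1234']
print(view_memory(memory, 1, "little"))  # ['78', '56', '34', '12']
print(view_memory(memory, 4, "big"))     # ['78563412']
```

The same four bytes give a different word value when they are interpreted as big endian, which is the effect students can observe by switching the view.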
So we go to the program counter, and instead of taking the program counter plus four, we are now instructed to switch this multiplexer and take the value computed by the arithmetic and logical unit. So here we have the detail of the program counter and the register itself in the pipeline. And as you can see, we continue to the load word instruction, so that we can keep running in a loop in this example. So this is a very basic example just for this presentation, but it gives you an idea of how we start with the simple processor. And when students have an idea of what the wires are doing and what it all means, we can start speaking about how to make the CPU faster. And, well, what we find out is that the memory is slow, so we want to cache it. However, that's not that simple. There are many ways to cache it. We can cache data. We can cache instructions. We can choose different sizes of the caches. And if you do it wrong, it will be worse than doing nothing. So we really need to think about that. So we switch to the second option in the configuration and again hit Start Empty. But this time, we will hit the button up there, which will open an integrated editor with syntax highlighting and an integrated assembler. The other options are to open a file, to save a file, to close a file. This is the important one: this is to run the assembler and upload the result to memory. And if you have some more complex project, you can even invoke an external makefile. Except for WebAssembly, where there is no make. So we insert a simple selection sort. I put it in three columns so that we can see the program. It is just a simple selection sort. What is important here is that we have data that will be put into the ELF file and later stored into memory. And that will be the data that we want to sort. So we save the file so that we don't lose it, and we move to the memory view. Right here, you can see the data that we have inputted. And on the right side, a new window opens. And it's the detail of the data cache.
It has two parts: statistics of the cache performance, and a very detailed view of the internal structure of the cache and the data that are there right now. So if we start running, the cache was empty. So of course, we need to add the value, and we will miss. And we will continue adding values. And right now, the cache is full. So the first decision that the students need to understand is, what shall we evict? So right now, there is random choice. So it evicts the first one, and we continue running. And right now, the selection sort will start placing values where they should be. So it moves the value number one to the first position. And you see the yellow highlight shows us that the value is now written in the cache and the cache is dirty. So right now, the cache is using the write-back policy, just to show you the highlight. And we can continue. And now, we switch one with five. And it was a simple example, so the array is now sorted. But we still need to go over it, because the CPU doesn't see that; it needs to check it. And because we use the write-back policy, we also need to run a fence instruction to make sure that everything is stored back to the memory. So you see there is no highlight anymore and the cache is empty. And this is the time when we should look at the performance data, and we will see that we have quite an improvement. Of course, it depends on how fast a memory and how fast a cache we have. So in the configuration, you can set the penalties for hits and misses so that this data will look according to what you need. So this is the configuration of the data cache. You can see that we can select different shapes of the cache, which means the number of sets, the block size, and the degree of associativity. And then we have two policies. The replacement policy, which is one of random, least recently used, or least frequently used. And the write policy, where you can see the write-back and write-through options there.
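To make the effect of the cache shape concrete, here is a minimal Python sketch of a set-associative cache that only counts hits and misses. It is not the simulator's implementation: it assumes 4-byte words, models only the least-recently-used replacement policy, ignores the write policy, and the class name `Cache` and helper `run_trace` are invented for this sketch.

```python
from collections import OrderedDict


class Cache:
    """Hit/miss counter for a set-associative cache with LRU replacement."""

    def __init__(self, num_sets: int, block_words: int, ways: int):
        self.num_sets = num_sets
        self.block_words = block_words
        self.ways = ways
        # One ordered dict of tags per set; the oldest entry is the LRU one.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Access a byte address; return True on a hit."""
        block = address // (4 * self.block_words)  # assumes 4-byte words
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)     # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[tag] = None
        return False


def run_trace(cache: Cache, trace: list[int]) -> tuple[int, int]:
    """Replay a list of addresses and return (hits, misses)."""
    for address in trace:
        cache.access(address)
    return cache.hits, cache.misses


# Two addresses that map to the same set in the direct-mapped shape.
trace = [0x400, 0x410] * 4

direct_mapped = Cache(num_sets=4, block_words=1, ways=1)
fully_associative = Cache(num_sets=1, block_words=1, ways=4)

print(run_trace(direct_mapped, trace))      # (0, 8)
print(run_trace(fully_associative, trace))  # (6, 2)
```

Both shapes hold four words, yet on this trace the direct-mapped one misses every time because the two addresses keep evicting each other, while the associative one misses only on the first touch of each address.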
So here is how the detail of the shape of the cache looks for two different configurations, which in fact hold the same amount of data. But they will probably have quite different performance. And also, the right one is way more complicated. So the students can examine what this will do to the particular programs and data loads that they will have. Of course, we also have a program cache. And I have added this very stupid design where we have just one cell. And, well, in a program, you never go to the same address twice. So you can see that we always missed. And it's actually worse than doing nothing. So just an example of that. And I mentioned that the windows can just appear. So if we prepare some example code for students, we can insert special pragmas to make the windows appear, so that the students who have not seen the simulator before can just start with all the tools they need. So we can show any part, we can focus an address in memory, and so on. So now we have improved the speed of the memory. And we look at the CPU again. And we find out that most of the silicon is just lying there doing nothing most of the time. So we want to utilize it better. So we see that most of the parts are somewhat independent. So we split it into five parts. So let's go to the third option. We will choose pipeline without cache. And you can see that the image got somewhat more complicated. However, if you remember the previous one, it's all the same. The students already invested some effort into understanding what the image was supposed to mean, so we don't want to move things around. We are just adding the minimum amount of things that we need so that we can continue with the more complex stuff. So you can see we display the instructions in each stage of the pipeline. We are adding the interstage registers. And the colors are not random. We are also using them to show which instructions correspond to each stage of the pipeline in the program view.
So you can see all the stalls and branches nicely as they progress through the pipeline. And of course, we needed to add animated windows to each stage of the pipeline for each wire, for both control and data that we have. And now we find out that this is not all we need to do. Because up until now, we assumed that, OK, we process one instruction, the instruction is done, we continue. But right now, we start processing an instruction, but it will take, what, four more cycles before we get the results. So we have something that's called data hazards. And that means that if we do nothing, we will get old data, because the data from the instruction that we depend on is not ready yet. So we can have a hazard spanning two cycles in the next instruction and a hazard spanning one cycle in the instruction after. Technically, there is also a hazard in the instruction that is after that. But this is something that we can solve inside the register file without breaking the abstraction of the pipeline. So we will ignore the last case. And so what can we do about that? Well, we need to detect those problems. And we need to counteract them in some way. So we are adding a hazard unit to the CPU. And it has two options. The first option is, OK, we will wait for the data. But we said we want to build a fast CPU. And waiting is, well, not fast. So we also have an option to forward the data as we need it. So yeah, it's in the core tab. And this is the option. So this is the CPU with added forwarding paths. So what we have here is we are adding a path from the writeback stage to get the data directly into the execute stage. The multiplexers got bigger. And we are adding a second path to forward the data from the memory stage. So here is the difference. There is some extra wiring. The CPU without it looks, well, simpler, but it will wait. With forwarding it will not wait, and that's what we want, but it will cost us more to build it, because it needs more components. So let's see some actual examples. We are again running the selection sort I showed you before.
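The choice that the forwarding multiplexer makes for one ALU operand can be sketched as a small Python function. This is a textbook-style illustration under stated assumptions, not the simulator's logic: register numbers are plain integers, `None` means the instruction in that stage writes no register, and the function name `forward_select` and the returned labels are invented for this sketch.

```python
from typing import Optional


def forward_select(src: int,
                   mem_rd: Optional[int],
                   wb_rd: Optional[int]) -> str:
    """Choose where the execute stage should take one source operand from.

    src    -- source register number needed in the execute stage
    mem_rd -- destination register of the instruction in the memory stage
    wb_rd  -- destination register of the instruction in the writeback stage
    """
    if src == 0:
        # x0 is hard-wired to zero in RISC-V, so it is never forwarded.
        return "register file"
    if mem_rd == src:
        # The memory stage holds the most recent producer, so it wins.
        return "memory stage"
    if wb_rd == src:
        return "writeback stage"
    return "register file"


# Producer directly before the consumer: take the value from the memory stage.
print(forward_select(src=10, mem_rd=10, wb_rd=None))  # memory stage
# Producer two instructions before: take it from the writeback stage.
print(forward_select(src=10, mem_rd=12, wb_rd=10))    # writeback stage
# No producer in flight: the register file value is already correct.
print(forward_select(src=10, mem_rd=12, wb_rd=11))    # register file
```

Checking the memory stage first matters when both in-flight instructions write the same register, because the younger result is the one the consumer must see.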
So we start running the instructions. And right now, we have an AUIPC instruction that is producing its value to register 10. And the hazard unit started screaming. It doesn't like it. So it tells us, it's not very, very visible, but the text in red is "forward". So this is the hazard that we have here. And what happens is that this multiplexer gets a value telling it to use the forwarding wire from the memory stage. And instead of using the actual value from registers, we are taking the value from the red path. And we get here, for the arithmetic logic unit, the value hex 200, which is what we want. Well, you don't know that yet. But do remember that value; it will be important later. OK, now say we don't want to add the extra hardware. So we will stall instead. So this hazard unit is screaming again. But this time, it cannot forward. So it's screaming to stall. But you can see that the instructions are right next to each other. So that was the first case. And we need to actually stall two cycles. So we will run the next cycle. Another nop is inserted. And again, the hazard unit is screaming. So right now, we have two nops. But what we have now is that the instructions are far enough apart. And we can continue. So we get the value 200 hex again. And we are happy. Well, and then the simulator, of course, supports the most simple option: we do nothing. And what happens here? We have the hazard. And we get the value 0, because the registers were initially empty. The purpose of this setting is that we will task the students to play the hazard unit and play the compiler. So they will have to rearrange the instructions and insert as few nops as they need to make sure that the result will be correct. This will typically be some kind of homework. So we don't want to check that manually. So we have a command line interface, which can look something like that and will give you output like that.
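The "play the hazard unit" exercise can be sketched in Python for the simplest strategy, which only inserts nops and never reorders instructions. The sketch assumes the rule from the talk: with no hazard unit, a consumer must not be one of the two instructions following its producer. The instruction encoding as `(rd, sources)` tuples, the `NOP` marker, and the function name `insert_nops` are hypothetical choices for this illustration.

```python
# An instruction is (rd, sources): the destination register number (or None)
# and a tuple of source register numbers. This encoding is made up for the
# sketch; it is not an input format of the simulator.
NOP = (None, ())


def insert_nops(program):
    """Insert nops so that no instruction reads a register written by
    either of the two instructions immediately before it."""
    out = []
    for rd, sources in program:
        # Keep padding while one of the last two emitted slots produces
        # a register that this instruction reads (x0 never counts).
        while any(prev_rd is not None and prev_rd != 0 and prev_rd in sources
                  for prev_rd, _ in out[-2:]):
            out.append(NOP)
        out.append((rd, sources))
    return out


# Producer of x10 directly followed by a consumer of x10: two nops needed.
adjacent = [(10, ()), (11, (10,))]
print(insert_nops(adjacent))
# [(10, ()), (None, ()), (None, ()), (11, (10,))]

# One independent instruction in between: a single nop is enough.
spaced = [(10, ()), (12, ()), (11, (10,))]
print(insert_nops(spaced))
# [(10, ()), (12, ()), (None, ()), (11, (10,))]
```

The second example hints at why rearranging instructions pays off: every independent instruction moved between a producer and its consumer replaces one nop.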
So the capabilities that we have here are to assemble a file, to set the configuration, to trace the instruction in each stage of the pipeline, to trace changes to memory and registers, and also, at the end, to dump the memory and registers, which is the most useful part, because we want to make sure that the result is the same as it should be, for example, when the hazard unit is not available. And finally, when we want to plug in some data, for example, when the students are supposed to sort, they should not know the data, so we have a special option to load data into the memory. And right now, we have a CPU that, for undergraduate students, is quite fast, but it's quite simple. So now, instead of making it fast, we will make it more complicated. And we are adding memory-mapped peripherals and some very simple operating system emulation. So if we open the templates again, you can see that there is a template for the operating system. And it really does nothing special. All it does is prepare for calling a system call, which is ecall in RISC-V. And we will print "Hello world" to a file. The file is connected to a terminal. So what we actually do is, after running this, now we get the ecall detected; you can see that there is something written there that says a syscall was detected. So we will stop fetching new instructions, so it's more clear for the students. And once we get to memory, here is our "Hello world" in the simulated terminal. And you can see that the pipeline is empty, because we are looking at this from the perspective of userland. So the operating system is emulated, and we don't see its instructions in the pipeline. We just see that the pipeline was flushed before control was returned to us. So these are the system calls that we support. It's not much, but it's enough to show the basics to the students. And these are all the peripherals that we can play with, mainly using the system calls. So this is not a peripheral, but we have control and status registers, some basic support for them.
We have an LCD display. The terminal I've already shown you, and there are some general-purpose I/O peripherals: two LEDs and three knobs with buttons. This might seem a little random to you. It's not, because we also have this board that has exactly the same peripherals. And the simulator is set up in a way that you can take the same code, well, not assembly code, because unfortunately the board is ARM, but you can take the same C code and run it both on the simulator and on the real board, and you can move back and forth. So I said C code, so we can't use the integrated assembler anymore, and we did not implement a C compiler. You can use Clang or GCC to compile the file into an ELF, which, of course, has to be statically linked. And what we do now is, instead of hitting Start Empty, we insert the ELF executable here and hit Load Machine. So this is an example of a program that writes to the LCD display. I will not show you the code here, because this is one of the tasks that we give our students. But I can show you this code. It is available on our GitLab; we provide a link here. It is a simple test suite for malloc from newlib. So you can link your code against newlib and use it in our simulator, or at least some basic parts of it, and we have tested that with malloc. We can run this, and it's connected to the terminal, so we can see that some dynamic allocation takes place, some checks take place, and the tests are running successfully. So now you can use actual dynamic allocation inside the simulator. And now some conclusions. So, frequently asked questions. Is the simulator cycle accurate? That's a pretty important one. Yes, kind of. The thing is, we always assume that the memory has enough time to finish; we will not make the CPU wait or insert stalls or anything. The memory will finish, but that's the only exception. Otherwise, inside it's written in a style quite similar to Verilog or SystemVerilog.
So is this compliant with the official RISC-V tests? Yes. In the previous version we added that, and it's integrated into our CI, so all new changes are tested against that. The graphical part supports RISC-V I with multiplication, and also the control and status register instructions. The command line also supports 64 bits. However, I still need to find out how to fit the 64-bit values into the visualization, because it's already quite full. We don't have virtual memory yet, but somewhere here in another room there is a student who has already promised to take care of that in the next year. So in a year he might be presenting the change here; we'll see. So what do we plan for the future? We are very close to adding interrupt support. We would like to add compressed instruction set support, because that's quite a key part of RISC-V. The only trouble is that I'm not sure how to fit it into the program view, because it needs to be presented sequentially, and I'm not sure how to show it so that the students see that it's really half the size. So that's the plan. Also, the instruction encoding in RISC-V is somewhat special, especially around immediates. So I would like a visualization of each component, the blocks that the instruction is composed of, so that they can be seen for each instruction, inspected, and edited. We want to be able to run a very minimal RISC-V Linux target. I mentioned the 64-bit visualization and the problem with that, and also the memory unit. Another nice thing would be to visualize the utilization of the pipeline, the picture that is typically in every book, where you have squares for each instruction and you see the spaces where no instructions were executed. That would be nice to add. And also, even when I made these slides, I would really have loved to have an option to step back, because I went one instruction too far.
And it would be really hard to do with all the memory and everything, but at least for the visualization of the pipeline, I would really like that. So if you are a teacher or represent an educational institution and you want to use the simulator, please do. If you have some problems, contact us; we'll be happy to cooperate on that. If you are a student or developer, this is an open-source project and we accept pull requests. Usually the way it works is that students do their final thesis on this project, so feel free. And if you are a distribution maintainer, you could help me with getting it into the official packages. You can get the source on GitHub, and we also provide executables for Windows, Linux, and Mac. We have packages built using the Open Build Service and Launchpad for Ubuntu, SUSE, Fedora, and Debian. And there are also packages in Nixpkgs, because that is what I use. And as we mentioned, we have the online version as well. If you would like to read more, we have some publications: the thesis and our paper from the Embedded World Conference last year. Those are available. And we have this course, and you can find a lot of the materials at Comparch, including videos of some of the materials. Some are in Czech, many are in English. And that's all from me, so thank you for your attention and for so many people coming. Thank you. Please. How tightly coupled are the simulation of the processor and the visualization? I was thinking, would it be possible to somehow connect something like ModelSim with an HDL model? Or would that be very, very difficult? Please repeat the question for the stream. Sure, so the question was how tightly coupled the visualization and the simulation are. It's a separate project that is linked together and is only connected by some data passing and Qt signals. We already have the visualization and the command line that are completely separate, and they just connect to the same signals.
So it's quite well separated, but it's not at all stable. So, any other questions? OK, if you have no other questions: we do not work only on the simulator, we also design peripherals. For example, we have open-source CAN stuff, and we work on an open-source replacement for MATLAB and Simulink. OK, it is a toy, but we work on such stuff. So if you are interested, there are links to our other projects. We have experience with motion control. I have about 30 years of experience with embedded system design, including infusion systems for medical applications. I have contributed to the RTEMS project, which is used by the European Space Agency, and so on. We do VHDL design for experiments with space-grade FPGAs, and so on. So this is the work we do to help our students. It is a lot of work, something like eight man-years of work in the simulator, and that is only for our students, but we also do serious stuff for the world. For example, in SocketCAN, in mainline Linux kernels, there are our open contributions, drivers, and so on. I have a question. OK. Sir? Yeah? So what do you use for the actual visualization of the pipeline? Because I remember there's something like this in, say, Altera Quartus, which does schematic visualization. Well, is this yours? No, no. Something you could pull in from some other project? No. At point point, which will take three hours. Yes, the visualization that you see is actually an SVG file that has special annotations which connect it to the core. Previously, it was done with generated Qt objects, but that was not easy to work with. So I switched to the SVG version, where you can design it all in a graphical editor and then just connect it to the simulator. So that's completely ours. |
Porting RISC-V to GNU Guix
A year in review |
Okay, so, hello everybody. I am Efraim Flashner. I work on porting RISC-V to GNU Guix, or GNU Guix to RISC-V, depending on which way you look at things. I've been involved with Guix as a packager, and with Guix in general, since about 2015. Over the years I've ended up touching everything in the code base, just by accident. I guess my first big project was porting Guix to aarch64, which was similar but different in a lot of pieces. This slide I meant to fill in a bit, but I'm not very good with drawing on the computer. So, some quick stuff about Guix. At the heart of it, it's a transactional package manager. That means that anything you do, you can undo and roll back. In terms of porting things, you get to build things and see how they break, and then build again and see how else it breaks. And in the end, when it works, you don't have to worry about all the broken stuff polluting your environment. You're using just the pieces that work, and just the pieces you're actually putting in, every single time. Everything is built in a container, so again, everything is self-enclosed when you're building it. And everything is built natively. There is support for cross-building just about everything, but when we're actually installing programs, everything is built on native hardware. This picture is a little small and actually a little old. Let me slide this down a little bit. This one is from the actual bootstrap; it's a little dated, it's changed a little bit. You start from a couple of statically compiled binaries, and in this picture we actually download the GCC 4.7 source code and compile it with the GCC bootstrap. So everything is a DAG; a DAG is a directed acyclic graph.
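To make the DAG idea concrete, here is a minimal Python sketch using the standard-library `graphlib` module. The package names and edges are a simplified, made-up slice of a bootstrap graph and are not taken from the actual Guix package definitions; the point is only that a build order and the full set of inputs fall out of the graph.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# package -> inputs it needs before it can be built (illustrative only)
INPUTS: Dict[str, Set[str]] = {
    "bootstrap-binaries": set(),
    "glibc-bootstrap": {"bootstrap-binaries"},
    "gcc-bootstrap": {"bootstrap-binaries", "glibc-bootstrap"},
    "gcc-final": {"gcc-bootstrap", "glibc-bootstrap"},
    "hello": {"gcc-final"},
}


def build_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every package comes after all of its inputs."""
    return list(TopologicalSorter(graph).static_order())


def closure(graph: Dict[str, Set[str]], package: str) -> Set[str]:
    """Return every direct and transitive input of a package."""
    seen: Set[str] = set()
    stack = [package]
    while stack:
        for dep in graph[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


print(build_order(INPUTS))
print(sorted(closure(INPUTS, "hello")))
```

If the graph contained a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which is the property "acyclic" rules out.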
Everything: you can follow the inputs from one package to the next and see exactly how everything is built. So, going back to the first talk, which I suggest everyone watch, I really enjoyed it: our self-hosting comes from a couple of cross-compiled static binaries from another architecture running Guix, which we then use to bootstrap everything up from the start in that direction. So it's the same glibc bootstrap here and GCC bootstrap here; then, for whatever reason, we've flipped the graph around, and now we're going from bottom to top, with glibc bootstrap at the bottom and aiming up. So we go and actually bootstrap everything. I haven't actually used Gentoo, but from what I understand of stage 1 and stage 2, you start from just about nothing and it builds itself, the way Linux From Scratch does, until you reach fully functional and optimized binaries, or as optimized as you want, built with -O2, that you actually use to build everything else in the system. So, coming at it from the distribution side: RISC-V, okay, what's it similar to? What sorts of problems did I run into with the actual porting? Let's see. With the aarch64 port, the boards, at least initially, were a little easier to get. ARM already had partners building boards, and you could get somewhat expensive ARMv8 boards in different places. The RISC-V boards that I've picked up have mostly been through Kickstarter. One way that I guess I've found it radically different from ARM is that anyone can put together a chip and create it and sell it, and there are just so many more options available for the actual chips and for the actual boards. On one hand, looking at online discussion, sometimes it seems like RISC-V will save the world, and it's a unicorn farting out rainbows, and it's just the end-all, be-all.
From the pure packaging side, it's another architecture that not a lot of people are using yet. But it really is somewhere in the middle, as all things seem to be. My impression so far is that it's gotten much faster adoption than aarch64. I bought my first aarch64 board in 2017, I think. It was ARMv8, or 8.0 specifically. Go ahead and buy an ARMv8 board today: it's still the 8.0 architecture, other than the Mac, the M1s, which are almost 8.5. ARM just finalized the ARMv9 architecture, and everyone's still selling ARMv8 boards. So RISC-V really is exploding everywhere, and as a project Guix aims to make it available as another architecture that Guix runs on. So, other comparisons that I've run into: a lot of software doesn't have any sort of just-in-time compilation for it. A lot of the JIT seems to be hand-coded assembly, or added after the fact, and a lot of times when building software you end up with something just saying RISC-V isn't supported for just-in-time, your architecture is not supported, pass the no-JIT flag. Then PowerPC64: in the past, and I guess you still have the Apple G5s, it was big-endian PowerPC64, and now we've switched to little-endian, and you end up in a similar situation where no one's written that for it yet. You also come across a whole bunch of packages where you do use autoconf and configure, make, make install to install the package, and things just end up being too old. I apologize, this one's a little small again. You go and run configure, and it has a timestamp; it was last updated in 2014. That was the last time they grabbed the, let's say, stock file; it's a regularly used file for plenty of packages. And this just says: yes, I recognize that your uname comes back as riscv64, I don't know what to do with that. So as part of the packaging of everything, we do have to go through and update these packages.
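The stale-helper-script problem described above can be spotted mechanically. The following Python sketch is an illustration under stated assumptions, not how Guix handles it: it assumes the bundled script carries a `timestamp='YYYY-MM-DD'` line, as the autoconf helper scripts do, and it uses a plain substring search for the architecture name as a rough heuristic. The two script excerpts are made up for the example.

```python
import re
from datetime import date
from typing import Optional

# Made-up excerpts standing in for an old and a refreshed helper script.
OLD_SCRIPT = """#! /bin/sh
timestamp='2014-03-23'
# ... x86_64 | aarch64 | powerpc64 ...
"""
NEW_SCRIPT = """#! /bin/sh
timestamp='2022-01-03'
# ... x86_64 | aarch64 | powerpc64 | riscv64 ...
"""


def script_timestamp(text: str) -> Optional[date]:
    """Extract the timestamp='YYYY-MM-DD' line, if the script has one."""
    match = re.search(r"^timestamp='(\d{4})-(\d{2})-(\d{2})'", text, re.MULTILINE)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def needs_refresh(text: str, arch: str = "riscv") -> bool:
    """Heuristic: a script that never mentions the architecture cannot know it."""
    return arch not in text


for name, text in (("old", OLD_SCRIPT), ("new", NEW_SCRIPT)):
    print(name, script_timestamp(text), needs_refresh(text))
```

A packager could run a check like this over an unpacked source tree to decide which packages need their helper scripts replaced before configure is run.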
So in terms of where I'm starting from with some of this: there's been a lot of work by the other distributions, by many people, to actually get RISC-V support into all of the toolchains and into the major programming languages, and to really get it to the point where you just go and create some bootstrap binaries and things just start building. So we've been very lucky that it's been such a big effort and everyone's so excited for it: we built the bootstrap binaries and everything more or less just started working. I mean, we weren't using these versions, these ones were a little too old, but as things built up, things pretty much just kept on going and working correctly. So, one thing. I mentioned that in Guix everything is built containerized. When it's installed, it's not installed into the FHS, the Filesystem Hierarchy Standard. We don't use the regular prefixes, so we don't have a /usr/lib, we don't have a /lib. I assume as part of a multilib change which happened years ago in most Linux distros, a lot of packages end up looking in, say, /usr/lib, then the architecture and the ABI, and then falling back to: oh, and otherwise it's just here in /lib or /usr/lib. All of our packages get installed into their own hashed prefix; for example, this one is from GCC. So instead of looking in /usr/lib, GCC, I forget the actual path, but anyway, we're not going to find libgcc.so, which is used for linking pretty much everything, there. I actually had to comment out the startfile prefix spec so that GCC would correctly link to itself after it had been built.
So it was an odd situation: most things built, and then occasionally things would just fail and say, I can't find libgcc. And I'd say, I just built everything else, where is it? Then I'd add the library specifically, and suddenly it could find it. Eventually it turned out that, as part of taking everything from its own special prefix and putting it together in the build environment, when I added libgcc specifically it got added to the library path and everything found it. Without that, it's supposed to be brought in by GCC itself, by the binary, which says: oh yeah, and here's libgcc if you need it. So this was a little gotcha that got me for a couple of months, and I actually started adding ugly hacks around the Guix code base to say: when you're building on RISC-V, add in libgcc here, and add it in over there. I ended up with a paragraph-sized note. After bootstrapping everything and getting to the final GCC used to build everything, we actually had a special one just for RISC-V that said: in addition to everything that's normally there, we also need libgcc. Looking back at it, it feels silly adding all of that, but some of it is also, I guess, live and learn, and get things to work.
I found upstream, and this one is a little dark, maybe a little hard to read, has mostly been happy to accept patches for RISC-V support. Some of the patches that I've added to make things work I've just grabbed from upstream, from newer versions or from pull requests, or modified from other ones, saying: it worked here, it'll work here too, or this is about the same. So, as part of bootstrapping everything from source: we support Rust, as we aim to support pretty much every programming language. The current installation instructions for Rust are: download Rust and use rustup to manage Rust. In the distributions, at some point it's in, and then you use one version to build the next. As far as actually putting Rust in the first time, coming from source only, our options really were to follow the OCaml path back from the very early days and build a thousand copies of it until we got to something more modern.
Luckily for us, there was an effort called mrustc, which aims to implement enough of the rustc binary itself to rebuild rustc and the rest, in this case the rest of Rust 1.54. And this was the pull request that I had for: hey, it also works on RISC-V. Everything just worked. After I split up the build instructions and built everything, everything just went, and after 56 hours I had Rust 1.54. On my x86 machine it took four or five hours, maybe; here it took 56 hours. The machines are getting faster, and luckily there was no hand compiling, there was no interacting with it in the middle. Guix made it easy to just say: here are the build instructions, take the pieces and go. I was always curious, so I added time in front of everything, and it said, 56 hours later: here you go, it just works. So, let me skip that one. Inside of Guix, it was: add the patch, and then just tag supported systems, saying yes, it also works on RISC-V. And going back to the previous slides about similarities with aarch64 and with PowerPC, that certain things just aren't fully tested or fully supported: I ended up making the same mistake here. I changed it from 'it only works on these architectures' to 'it works; it's not expected to be super efficient or fast, but it gets the job done.' And I left in 'it may support i686 soon', and I completely left out PowerPC. No mention that it's coming, or it's in progress, or it probably works, or even that it just works on all the 64-bit machines that we have. It's completely forgotten. So I fall into the same traps too sometimes. As far as other language support, for Go, for most
languages, well, for Go, for most architectures, we start with the Go 1.4 release from Google and then use that to build newer versions. There's an issue with gccgo, I forget the interaction, that got fixed later, so we're using gccgo 10 to build Go 1.16 to build newer Go versions. I've done it on x86_64, which for this type of thing is where I normally end up doing most of the testing; not to cross-compile, but just to use gccgo for x86_64 to build Go 1.16 to build newer Gos, and see whether the process works. Then I said, okay, let's do it on actual RISC-V, and there's some issue with the test suite that I need to fix up. The actual building, before the testing, takes 12 to 20 hours. I'm not sure why that part takes so long; it's just one of those things. For Node, we fell back into our bootstrapping trap, some sort of circular dependency between llhttp and Node itself in later versions. This one I'm actively working on. I think Node officially got support for RISC-V in 16 or 18, partway through the cycle, so I have to backport it to 14, which we currently have packaged, and then again to 10, so that we can move forward with it. Java. Java's a problem. I'm not sure we're ever actually going to get Java support; it's one of those things that's going to take a really long time. Someone can correct me: I believe Java support was officially added upstream in Java 18. We build Java using the previous version, going back version by version. Initially, through GCC 5, there was a Java compiler as part of GCC, after which it got removed. We used that for a while, but it turned out that that Java compiler also needed a Java compiler to compile itself. So after some software archaeology, luckily not by me, someone else worked on this one, they managed to package early versions of GNU Classpath from, I'm going to say, the year 2000 or so, and
use that to build IcedTea, the then-free version of Java, with Java 1.6, 1.7, 1.8, and use that to build all the OpenJDKs. So we'd have to backport RISC-V support all the way back through everything, and we actually have a hard enough time keeping aarch64 working on some of the earlier versions. I'm hesitant to touch that one. Haskell is one of the ones where we've looked at it, and it comes up every six to eight months: can we do better on this one? I've spent a fair amount of time looking at the Haskell download page, with binaries going back years and years; they have 0.29 listed as an option for downloading. Every version of GHC needs an earlier version of GHC, all the way back to the beginning. There are alternate implementations from the early days, which can't actually get to building GHC itself. So for other architectures we actually do just say: okay, we've chosen a point, we grab the official binary released by GHC and use that to build all future versions. There's nothing for RISC-V currently from upstream Haskell. We could add support. Currently our Haskell build system doesn't support cross-building; it's more that it hasn't come up, no one's done it yet. So we could cross-build from x86_64 to RISC-V for that, but then you're left with: if you want to build anything using Haskell, you need a second computer to build it. So I looked briefly into whether we can cross-build the other way, from foreign to native, and other than being an interesting thought experiment, where now you have to build an entire second architecture to build the one you're actually using, we haven't made any progress on that one. It seemed more like an interesting thought experiment than something we thought we would end up going forward on. So I guess going back to actually talking about RISC-V itself, one of the
another thing that Guix has: like all Linux distributions, we target a base architecture, which is great for distributing binaries but not great for actually running optimized binaries. So we have a flag called --tune. Obviously, for hello, which just prints hello world, it's not useful, but for plenty of other programs it is. When starting the RISC-V port, I followed the process that Debian and Fedora, and I assume others, took, and I targeted the GC extension combination. As time goes on and we get more extensions, we'll be able to use the tune flag to say: yes, I'm happy using the baseline, that's what I have; or, actually, I have these other extensions, rebuild some of the software so that I can make use of the more advanced hardware that I have with the extra extensions. We've had good use of the tune flag in high-performance computing, with people with newer machines being able to target x86-64-v4 as a sub-architecture. We're still going through and trying to find programs that are good for tagging, saying yes, this will actually run faster enough to be worth it, but certainly plenty of math applications do very well with that. So everything's in place to add more sub-architecture support for RISC-V as it comes. We just have to actually have access to it and, hopefully, test it before just saying, yeah, it'll work fine. It'll probably just work fine. So it's there, and it works fairly well for everyone that's actually using it. Let's see. I was wondering if there are any questions, any comments. I seem to have an extended Q&A period here at the end of my talk. I'm happy to talk more about how Guix interacts with other architectures, with anything, or other fun stories with
porting software to work on RISC-V. Yes? Okay, so the question was: GCC is gaining the Rust front end, and have I looked into it for the bootstrapping part? I haven't looked into it yet for the bootstrapping part. My understanding is that it's similar to mrustc: it just aims to implement, I guess, the rustc binary itself, more or less. Currently, when we're building Rust programs, upstream says cargo build this, so we run cargo build, and then cargo itself says rustc, library, this part is here and that part is there. We can do the same thing either with rustc itself or with the Rust front end from GCC; that should also be possible. I'm not sure, for bootstrapping Rust, whether it's something that we would slot in. At first thought, I could see slotting it in instead of mrustc itself, and saying: we'll take the mrustc infrastructure to say here are the pieces we need to build to rebuild rustc from upstream, but we're going to use GCC's Rust to actually build everything faster and more efficiently, which would also solve some of the other architecture problems. And as Rust gets into the Linux kernel, we're definitely planning on using the GCC Rust front end for that. Yes? Okay, so the question was: has there been progress on building desktop environments, and not just languages? Let's see. I'm thinking through all the source code bits. I personally run Enlightenment on my laptop. That one doesn't work yet, but that's because we use an older version of LuaJIT as an input, and I just need to actually tell it: no, for RISC-V use Lua itself instead of LuaJIT, or, no really, I'm not giving it to you, don't look for it, and don't error if it's not there. Other than that, Enlightenment
should work, which I know has many people using it. As far as major desktops: Xfce just builds fine, MATE just builds fine, GNOME... I'm sorry, say that again? Yeah, running it is another thing. I have not actually tested running it on the hardware itself yet. So far I've been focused on just getting everything to build first. As far as actually running Guix on the hardware: Guix runs in two ways, as the entire operating system or on top of a foreign operating system. So far I've been running Guix on top of whatever Linux happens to come from the vendor, and I've been using that as the base. As for actually testing it, it should not be too hard to test with QEMU, either from RISC-V or from an x86_64 machine using the QEMU emulation. As far as building an actual image for one of the boards, I ran into some issues with different boards needing specific offsets. It was just the GNU Parted version of fdisk, which helps create partitions in a scriptable way, and assigning magic partition codes and things like that. So it's very doable in Guix. This one was actually a problem I ran into: I burned through all of my spare SD cards. I was building everything on the RISC-V boards, and then I needed a new one to actually create an installable image and install to that SD card, and I had run out of them. But I've finally picked up more, and I'm at the point now where I can go on. So the first plan was to say: here's an upstream image, either from a vendor or from another Linux distribution; I'm going to flash that onto the SD card, which gets me all the partitions that I need. Then I say guix system init onto the SD card that's already partitioned correctly, and it'll install U-Boot over the U-Boot
and the OS over the entire root partition, and then it should either work or not. If it doesn't, then we're into the why not, but assuming everything just works, everything should come up and work, and I'll be able to better test the graphical applications. Okay, let's see. Currently I am using the SiFive Unmatched board. After a year-long wait or so, it was a long lead time with Mouser, it did finally arrive, and it shocked the local computer guy when I showed up and said: I need a case and a CPU, and I have this. He's like, oh, what else do you need? Graphics card, peripherals? And I'm like, nope, I have a RISC-V board. And he's like, you're killing me again. He's the one I've gone to in the past when it's been: okay, I need a graphics card, and it has to be 10 years old and work with the Linux-libre kernel. Sometimes it just becomes: work with the pieces that I have. So that's the main machine that I've been doing a lot of the work on. I also have the first VisionFive board, which has been helping a lot with the building, and I recently got the VisionFive 2, which so far seems to be at least as powerful as the HiFive Unmatched: the one with eight gigs of RAM, compared to the 16 of the HiFive Unmatched, and the CPU goes up to 1.5 gigahertz instead of 1.2-ish. We had some Guix meetup days before FOSDEM, and I actually hooked it up to the projector and just started building from nothing up to hello. Things were building, and people would say: oh, what are we building now, where are we up to now? GCC itself takes six to eight hours to build. From the bare bootstrap binaries, from this one here, up to hello, which is the first quick-to-build package after building everything, you
know, GCC and binutils and everything you actually build everything with, on RISC-V it was somewhere between 16 and 20 hours of building. On x86_64 the entire process, I think, was four to six hours, maybe a little faster. It's taken a little longer now that the bootstrap bit has been extended. One of the other architectures that I worked on for fun was 32-bit PowerPC, and that one was really more because I had it and not because anyone uses it. With that one, GCC 10 actually takes more than 24 hours to build. So RISC-V is actually useful hardware, unlike the 32-bit PowerPC. May I ask: is there some cross-compilation for Guix? I'm familiar with Nix, so I would expect that the initial approach would be to use cross-compilation and not native, but does Guix not have cross-compilation, or is it just the decision to build natively? Because of course that would be way faster. Yeah, it would be way faster. So the question was: you are familiar with Nix, which does have cross-compilation, and you're wondering if Guix also has cross-compilation, which would be much faster than compiling natively. So Guix does have cross-compilation. You can cross-compile binaries and then use, basically, a Guix send command to send them to the board and run them there, and everything just works. As far as actually building from them, though, everything gets built from the native binaries. So the initial bootstrap binaries were cross-compiled, and they were then used as the native binaries to go and build everything else. But I've used cross-compiling. Where was it? For Node, that backporting was basically two rounds. The plan for backporting Node was to backport it to Node 14, then cross-compile from x86 to RISC-V and see whether Node 14 works with the patch that I've written, and once it does, to go and backport it the
rest of the way to Node 10. I haven't decided yet whether it would be faster to cross-build Node 10 as a test or to start from native as the test. So yes, Guix does have cross-compilation, and we do use it for creating images or binaries for other architectures. But as far as actually building from it, or using it as a stepping stone, it's really only in the very initial stage, when the architecture is first added. Okay, I think that's it. Thank you, everyone. |
Linux on RISC-V
Status and progress of RISC-V support in Gentoo Linux and other Linux distributions |
Hi everyone, my name is Jakov, and today I'll be presenting the topic Linux on RISC-V. I'll try to speak up, so if you have any problems hearing me, just let me know and I'll try to be louder. So what are we going to talk about? What? Louder, okay. So what are we going to talk about today? The talk is titled Linux on RISC-V, but I'm actually going to talk mostly about Gentoo support for RISC-V. We're also going to mention other Linux distributions and how they handle RISC-V, how other mainstream Linux applications support RISC-V, and what you can actually do using Linux systems on RISC-V. A little bit about myself: my name is Jakov, and I come from Croatia, where I work as a firmware engineer. Most of my work revolves around embedded Linux development and integration. In my main job I do not work with RISC-V, but I've had the privilege of contributing to open source as a Gentoo Linux developer since 2021, and Gentoo is actually where I got in touch with RISC-V, so I've been involved in the Gentoo RISC-V team. Yeah, I'll try to. I'll try not to repeat too many details; I'm sure most of you are familiar with the architecture itself. So what is RISC-V? It's an open-source instruction set architecture. It was designed at UC Berkeley in the USA, and it was the fifth RISC architecture designed there, hence the name RISC-V. How is it designed? It is designed to be a stable and modular architecture: we have base integer instruction sets that provide a stable base, on top of which additional instruction extensions can be built. Nowadays it's led by the RISC-V Foundation, an organization that was founded to maintain the intellectual property and legal matters related to RISC-V, and so on. Speaking about the RISC-V instruction set architecture, we have three main base instruction sets: 32, 64 and 128 bits.
All of them are designed so that they are independent of each other, so there's no running 32-bit binaries on 64-bit systems. There's also the 128-bit architecture, but it's not frozen yet. For us the most important one is going to be 64-bit RISC-V, because that's the one that most Linux distributions are going to target. Now, we mentioned extensions, so how are they designed? They're designed so that they can coexist with each other. They do not conflict with each other, and they can be built on top of any of the base instruction sets we've mentioned. We've got some extensions listed here, such as M for integer multiplication and division, A for atomic operations, and F, D and Q for single-, double- and quad-precision floating point. So how does the naming convention work for RISC-V? The first part of the name is the base integer instruction set, so for 64-bit RISC-V it's usually going to be RV64I. Then come the additional extensions that are built on top of the base, so for example you could have MAFD; any extension comes after the base integer set. Nowadays, to avoid having to write so many letters, we use the letter G, which stands for general purpose and is just the combination IMAFD, because chips are usually designed to support all of these extensions. Then we have C, which stands for compressed instructions. When we combine G and C, we get RV64GC, which expands to RV64IMAFDC, which can then expand to even more letters depending on the versions of the extensions that are implemented. Most Linux distributions target the RV64GC instruction set. Now let's talk a little bit about support in Gentoo. Do we have any Gentoo users in here? All right, okay. I'm sure you've heard about Gentoo Linux, so what is it?
It's a source-based distribution, and this is actually the key feature that separates Gentoo from most other mainstream distributions. Usually, when you want to install a package on your system, your package manager downloads a precompiled binary package, extracts it and installs it onto your system. In Gentoo we decided we're not going to do any of this; we're just going to download the sources and compile everything ourselves. The heart of Gentoo is its package manager, called Portage. Portage is what allows you to have this really fine-grained control over your system, so you can choose many different components when you build your system: you can choose the toolchain components, you can choose your init system, you can even change the libc if you want, and so on. When you download Gentoo, you just download what's called a stage 3 archive, a minimal set of programs and tools that you can later customize to your own needs. I've mentioned profiles and USE flags; these are just some features of Portage that allow you to more easily customize and configure your system. Profiles are basically sets of configuration files. You typically select a profile when you do a Gentoo installation, and the profile determines what packages you will want to install and what features you will have in your system.
USE flags: maybe the name is not so intuitive, but you can think of them as feature flags or just configuration flags. A USE flag is a flag that the package manager uses to determine whether you want to, for example, build a package with a certain feature or not. Let's say you're installing Gentoo on a server or a headless system, so obviously you're not going to need support for a graphical interface. What you can do is look up the USE flags that are available in Gentoo, find the flag that controls whether your packages are built with or without support for a graphical interface, and simply turn it off. Then you're not even going to pull in any of this stuff like X11 or Wayland. So you're going to be able to have a system that is, I'm not going to say bloat-free, but customized to your own needs. That's the point.
What architectures does Gentoo support? I've listed probably about 15, if I counted correctly. We actually have three levels of architecture support in Gentoo: stable, unstable and experimental. As we can see, RISC-V is currently in the unstable category. What's the difference between stable and unstable? The main difference is that the architectures that are stable receive much more continuous testing, while unstable ones may have more bugs and do not receive as much testing. Why is RISC-V still unstable? Primarily because there's no hardware powerful enough to let us compile these packages quickly enough. If we tried to turn RISC-V into a stable architecture, it would probably be too time-consuming and it would fail at some point. But generally, RISC-V support nowadays is quite good; it's pretty much polished, and you can have a mostly bug-free experience in Gentoo. Okay, so let's talk a little bit about the history of the RISC-V port. I joined Gentoo in 2021, but the porting was done before my time. It was around 2019, by our colleague Andreas; I'm not sure, maybe he's here with us today. He was the one who added the initial profiles for RISC-V. The two main targets that Gentoo supported, and still supports, are RV64GC and RV64IMAC, but RV64GC is the main one we are going to focus on, and it will get the highest level of support in Gentoo. Yeah, yeah. No, they're just for comparison. So, 8,000 packages are supported on RISC-V, and that's just to compare numbers with ARM64. When we test our packages, we always make sure that we run the tests, and we want to make sure that all tests are passing, as much as is possible to a reasonable extent. Probably not every package will be able to pass its tests, but we do our best to, you
know, either the package is working correctly and its tests are passing, or, if we find an issue, we try to fix it, or at least report it upstream if we're unable to fix it. In these few years we've been able to get almost to the same level as ARM64 support in Gentoo, which I believe is quite good for such a new architecture. Maybe you're asking yourself why you would choose Gentoo. You have a RISC-V system, a small board, or you bought yourself your first, let's say, RISC-V toy. Why would you want to use Gentoo? It may not be the first obvious choice, but Gentoo gives you a high degree of freedom and flexibility, so you have control over pretty much any component of your system. You will also have the latest software available if you use Gentoo. We try to stay on top of the most recent software versions, packages, compilers and so on, so you're going to have the latest and greatest of what's available. It can be a really good platform for development. You can develop either natively, or you can develop on your laptop, which is going to be much quicker. If you don't want to spend money on a board, you can set up a cross-compilation build environment. In Gentoo we have a tool called crossdev, which is a wrapper script that enables you to set up a cross-compilation environment for any architecture you want. After you build your cross toolchain, you can compile all the other components that you need, like libraries and binaries, and then you can either boot this image in QEMU and work there, or create your own bootable image, put it on your SD card and, if you have a board, boot that. Stage archives: I think I mentioned that a stage is basically just a simple, it's a
tarball that, when you extract it, gives you a minimal Gentoo system. It contains the libc, the toolchain and the tools you need to download and build your own programs or other packages. If you go to gentoo.org/downloads, you're going to get an overview of every architecture and every stage you can download for it. I've listed some of the available archives for RISC-V here. We've got LP64D and LP64 ABI images, but the main and most stable ones are going to be LP64D. You have variants using systemd or OpenRC, depending on which init system you want to use. Then we also have glibc and musl images, so if you want to try out something other than glibc, we try to offer you the choice. We also have LP64 images. There is also a, let's say, multilib image, but things are more likely to break if you use those, although they are still available if you want to try them, tinker around or just boot one up in QEMU. Gentoo attempted to have multilib support on RISC-V, but when we say multilib, we don't mean running 32-bit binaries on 64-bit, because we cannot do that; 32 and 64 bit are independent. Instead, we tried to have the two ABIs, LP64D and LP64, working together. The way this was attempted was with a separate libdir for each ABI, so LP64D would be in lib64/lp64d, while the other one would be in lib64/lp64. Naturally, this did not go without issues. One problem this introduced was that we would sometimes have partially broken build systems, for example CMake. What would happen is that CMake would search for some files in, let's say, /usr/lib64, but then on
your system you don't have this; you have /usr/lib64/lp64d. Then of course the whole thing crashes, and you're unhappy because your package doesn't build. Another thing that caused problems for us was that some important packages, like Rust, only supported targeting RV64GC with the LP64D ABI, and nowadays Rust is very important if you want to have a complete Linux distribution. Over time this caused a lot of headaches for developers, which led us to change this and drop it from our support. So what this file determines is your CHOST, your ABI and your compilation flags, the things that are necessary for your compiler to be able to correctly compile packages for the architecture and ABI you are using. This is the old profile, 17.0, and this is the latest one, 20.0. I'm not sure about the naming; maybe it's related to the year that these profiles were made, because I know that we're going to have 23.0 profiles, so it's probably related to the year they were written. This is what we had before, when I was talking about the old profiles: the LP64D libdir there was lib64/lp64d. We tried to have both of these ABIs coexist at the same time, but it proved to be a little too difficult to maintain. I think we're still building stages with this configuration, but it's not really tested or supported, so you can try it out, but you're probably going to run into some issues. What? Everything except GCC and glibc built properly, and it's a little bit different. Yeah, yeah. By the way, this is Andreas, whom I was talking about. He's the main guy behind the RISC-V port in Gentoo. Feel free to interrupt me and add something if I miss anything. Thank you. All right.
So, repositories. The main repository contains about 19,000 packages, and we said we support around 8,000 on RISC-V. That doesn't mean that the other 11,000 don't build; it just means that they're not officially tested. We try to minimize the number of packages because it creates additional burden for developers: we have to constantly maintain these packages, and if there's a new version of a package that pulls in different dependencies, we also have to test many dependencies, which can create a huge amount of work. That's why we try to minimize, for now, the number of packages we are supporting. We also have a RISC-V overlay, which is a less official, more experimental repository. It contains packages that are not yet ready to be in the main tree, or whose upstream porting is a work in progress, like Valgrind, Qt WebEngine, Thunderbird and so on. We also have an unofficial binary repository that is based on Calculate Linux. Calculate is a Gentoo-based distribution designed to be backwards compatible with Gentoo and with Portage. Essentially it offers you repositories of binary packages, with which you'll skip the longest part of installing Gentoo. So we have an unofficial repository for RISC-V packages, and there's also an image for the HiFive Unmatched board. The last one, I think, was built in May last year. It's not official, but if you're interested in using it, or if you need a newer build, you can just let us know and we'll do our best to update it. Now, what is there to be done in the future? We said that we currently support RISC-V as an unstable architecture. We'd like to support it as a stable architecture, but we'll see; it depends mostly on whether there will be powerful enough hardware. I mean, there will be at some point; we're just going to
have to wait for more powerful hardware to be available to us. We've also thought about providing bootable images. So far we do not have official bootable images; we just have this minimal stage 3, which you use to install your distribution. But if you're using Gentoo on RISC-V and you would like to have something like this, feel free to let us know, or send us your comments and thoughts about what you would like to have implemented, and we'll try to do it. 32-bit RISC-V support: 32-bit RISC-V is not really supported across Linux distributions. Maybe Debian has something, but right now it's mostly usable for booting in QEMU. It actually took a bit of time for upstream toolchain components to gain support for it; for example, glibc gained support for 32-bit RISC-V around 2021. I believe nowadays distributions are busy with other problems, and adding support for 32-bit RISC-V is probably not a high priority for them. Speaking of 32-bit systems, there's an interesting topic going around now: the infamous year 2038 problem across 32-bit systems. There's actually going to be a talk about this today in the Distributions devroom at 3 p.m.,
so anybody who's interested can come and check it out. I think a developer from Debian is going to present, and it's supposed to be a discussion session between distribution developers about how this should be handled. Our toolchain team in Gentoo has been working hard to develop the best solution for this, so we'll see what other distributions have and how we can maybe work together to fix this in the least painful way. What are the supported platforms, or where can you run Gentoo if you're using RISC-V? I've mentioned just a few boards here. You're probably most familiar with the SiFive boards; the SiFive HiFive Unmatched is maybe the most commonly used so far. I believe both of these SiFive boards have been discontinued, but from what I've read, a couple of weeks ago they announced a new board which should be released this summer, so we're going to be excited to try Gentoo on that board when it comes out. What do you think is the future? Are there any new ASICs coming up that are more performant and on par with, say... Yeah, well, I've heard that there are some companies in China that are developing more server-grade equipment. I believe there's also a RISC-V laptop that's supposed to come out sometime, maybe this year, but we'll see. I have hopes that we're going to have powerful hardware; it's just a matter of time at this point, I think. But maybe the problem will be that when it comes out, it's probably going to be a bit pricey. I've read that these laptops are going to be priced around $1,500 or so, so maybe not the most accessible for your regular Linux users. But I believe that as the market develops, we should have these boards priced more reasonably and more accessible to a broader audience. Now that we've talked about support in Gentoo, we can see what
other distributions have done in the past few years. Debian right now offers full support for the RISC-V architecture, which is really good. The fascinating thing is that Debian actually supports 95%, almost 100%, of its packages on RISC-V. Like most other distributions, they are targeting the RV64GC instruction set. I believe they offer a few bootable images for some of the boards I mentioned here. There was also a very good talk four years ago at FOSDEM by Karsten Merker, titled Porting Debian to RISC-V. It's a really great talk that tells the story of what it actually takes to port a distribution, or to add support for an architecture, across the different Linux toolchain components, GCC, glibc and so on, and then port it to the actual architecture. So I highly recommend his talk to anybody who wants to know more. Fedora is another distribution that has extensive support for RISC-V. They had numerous bootstrap phases; I believe the last one was in 2018, so they have officially supported it since 2018. They have a build system designed to produce images for RISC-V, and they offer prebuilt images for booting in QEMU and for physical targets. There was also a talk in 2019 by David Abdurachmanov titled Fedora on RISC-V, in which he spoke in more detail about how Fedora was bootstrapped and what it's doing right now to support RISC-V. FreeBSD is not actually Linux, but it's still good to mention, because from what I've read they had their first working port in January 2016, which probably makes it the first operating system that had bootable support for RISC-V. Nowadays they offer support for RISC-V as a
Tier 2 architecture, and they support many targets, either virtual or physical devices. openSUSE and Ubuntu are another two distros that offer support for RISC-V. So you see, if you want to use Linux on RISC-V, you really have a lot of choice, depending on what you want to do, whether you want to use Ubuntu, Debian, Gentoo, Fedora or something else. Desktop environments: I've basically just dumped a list of desktop environments that are all available on RISC-V. By the way, all of these are available in Gentoo as well, and you can use them on RISC-V. Whether it's GNOME, KDE, Xfce, Enlightenment or whatever, it's there for you to use. Here are some images of Gentoo in action. This is a Gentoo system running the KDE desktop environment. Oh, this is GNOME, from what I see in the picture, and Enlightenment. Now, speaking about other mainstream applications: in the past few years there has been really big progress in terms of porting these widely used applications to RISC-V. I've mentioned just a few of them here: Firefox, OpenJDK and so on. LibreOffice also got support for RISC-V, I believe a couple of months ago. Speaking of Gentoo: OpenJDK and Node.js, yes, we do have them with RISC-V support. LibreOffice not yet; we're waiting for the next release, and when it happens we're going to package it with RISC-V support. Projects that are yet to be ported were actually quite difficult to find. I was looking for software whose port is still ongoing or a work in progress, and it was difficult to find any, because most of the stuff has already been ported. But there are some interesting projects left, like LuaJIT, Valgrind or Mono. Interestingly, LuaJIT and Valgrind have, I believe, also been covered at FOSDEM, maybe in 2020 or 2019, and their port
is still ongoing. So sometimes it takes even a couple of years to port a project to RISC-V. It's really a great success how RISC-V has been able to grow in just a few years to achieve the level of support that we have nowadays. That's pretty much what I wanted to tell you. I hope I've got you interested in maybe trying out Gentoo on RISC-V. If you've got any questions or suggestions... The biggest challenges? Well, the biggest challenge would probably be working with this multilib concept, which we tried to support for some time. As I explained, it didn't really work, so we had to drop it and focus on just supporting RV64GC with this single ABI, which is what other distributions are also doing, so we decided to do the same. Yeah? What are your experiences with supporting GPUs on RISC-V? Some of the boards, I believe, have PCI Express, like the Star64 that Pine64 demoed. Did you get any 3D acceleration to work in combination with any RISC-V platform already? Well, I'm not sure. I haven't tried; I actually don't have a physical device with PCIe or with a graphics card at home, so I haven't tried myself. But if you want, I can let you know. Yeah, give me a contact. Yeah, I have. What doesn't work is anything there the navy so they've gone to you write that patch mode. Okay. I think someone hacked it; there's a video on Reddit or YouTube, but I don't think the missing patch is in the mainline kernel. And which SoCs are those that work with it? Is it just one SoC that works, or is it multiple? I don't know. Yeah, there was a question in the back. Yeah, okay, there are some of these really there. Yeah? Why do you think it's worth supporting IMAC? Because it doesn't necessarily support memory protection, so it's essentially I'm completely on top of the charge. Which one? IMAC, RV64IMAC. Mostly, I believe, we mostly targeted to
support this mostly for experimentation, like booting in a virtual machine, but now we focus more on GC. I've left some notes here: there's our project page and the Gentoo wiki pages, and you also have my email and the RISC-V project email, so if you need anything, a package or whatever, or you want something done, you can just let us know. Do you need help? Do you have a call for action? Well, yeah, we always need help, not only for RISC-V but in general, to fight with various build issues, patching, upstreaming patches, writing documentation. I mentioned some things like bootable images. So help is always welcome: if anybody's interested in contributing to either RISC-V or Gentoo in general, you can just let us know and we'll do our best to help you experience what it is to contribute actively to... Ah yeah, that one as well hangs. We did recently find another futex issue in RV32; it may be connected, and we'll get to the bottom of it soon. So RV32 is still broken. Yeah, everything's still rugged right now; we just don't know how to fix it. Help is always welcome on all fronts. Yeah, anybody in the room? I know that in the ARM space there are two projects, I forgot the names, that are really working on optimized x86 emulation and have achieved quite impressive results; you can actually play some recent games on 64-bit ARM. Those projects don't seem to show any interest in porting this to RISC-V, because a lot would have to be done at a low level. I was wondering, is there any similar project ongoing right now to have high-performance x86 emulation on top of RISC-V? Because as we actually migrate to the architecture, that might be compelling. I think maybe that has to wait until we have high-performance RISC-V
processors. Yeah, but we want to start early, right? Yeah, there is a hardware dependency, because the memory model in RISC-V is weaker. In order to emulate x86 we need this TSO extension; it's the same thing that Apple implemented in their hardware, right? For RISC-V, if we're emulating x86, we need this. Well, then I believe that someone is working on that, on having this extension, so once we have hardware that is capable of emulating the memory model of x86, I guess it will get a little bit easier. And do you know if that will be an official extension or just a proprietary one? Thank you. I believe we have maybe one more minute. Thank you. |
How to add a GCC builtin to the RISC-V compiler |
Hello, FOSDEM. I am Nandni Jamnadas. I am a software toolchain engineer at Embecosm, where I lead the CORE-V GNU toolchain project. I am also a UK electronics scholar with UKESF. UKESF encourages young students to study electronics and pursue a career in the sector, and it connects top UK universities with leading employers. In this talk, I will be giving you a tutorial on how to add a GCC built-in to the RISC-V compiler. Okay, so what is a built-in? In C and C++ there are two types of functions: user-defined functions and built-in functions. User-defined functions are functions that the programmer has defined within their own code. Built-in functions are functions that are already implemented in the compiler, so the programmer doesn't need to write specific code for them and can use them directly. Many low-level architectures in GCC use built-ins. Built-ins look superficially like any C function, but they are intrinsic to the compiler and are implemented directly within it. These built-ins have specific patterns to be matched in the machine description file, and they have access to unique, individual machine functionality. Because they are integrated within GCC, they are more efficient than simple inline assembly. For RISC-V, this presents an excellent opportunity to expose the ISA extensions to C and C++ programmers. This is an example of a simple built-in in GCC, which takes the square root of a float. There are tons and tons of GCC built-ins, but, I don't know if you know, there are probably only about two for RISC-V. That is why I'm giving you a tutorial about it, so we can add more. It is important to say that, yes, we call it a built-in function, but it's not really a function: there aren't any corresponding entry or exit points, and its address cannot be obtained. Here is the square-root float built-in that is implemented in GCC.
If you want to find it in GCC, it is in builtins.def. All of the source code will be linked at the end, so don't worry, I will give that to you. And if you want to make a RISC-V-specific built-in, you would go to the path below, which is riscv-builtins.cc. Yes, I'm talking a lot about built-ins when we could simply use inline assembly. But this is why we shouldn't be using inline assembly. With inline assembly, you annoyingly have to specify the pattern every single time you use it, and sometimes you can get it wrong. GCC does not know about the instruction, so there's a huge risk of data-flow information being lost. And again, because GCC does not know about the instruction that you're using in inline assembly, it cannot optimize around it. With built-in functions, on the other hand, all of your data-flow information is retained. Patterns can be recognized and used elsewhere by GCC. You only need to specify the pattern once, in the machine description file. And then, voilà, the programmer just needs to call the built-in and put in the arguments. And again, built-in functions are implemented directly in the compiler, so GCC knows about them and can apply its optimizations. What am I talking about when I say optimization? Well, GCC has a bunch of optimization flags. Here are two that I'm using as an example. The first one is the flag -O0. That's the basic level of optimization; in fact, I don't think that's any optimization at all. This is just the raw assembly generated for cv.elw, which I'll explain later. When you use the optimization flag -O2, performance increases, at the cost of somewhat longer compilation: GCC optimizes away the assembly instructions that it knows are not needed. You might have noticed that I'm using cv.elw, and you're probably wondering what the hell that is.
Well, cv.elw is part of the CV32E40P ISA extensions, also known as the CORE-V ISA extensions. cv.elw is part of the event load extension. We are currently implementing version 2 of these in the OpenHW Group's CORE-V GCC and Binutils. The first version has the first five extensions, and then version 2 adds event load, SIMD and bit manipulation. I would like to emphasize that all of these extensions and instructions are already in Binutils, the assembler and the linker. But it's time to add built-ins in GCC. I am going to be using event load for this tutorial. This is because event load only has one instruction, so it's a very beginner-friendly task. That instruction is cv.elw, which will load a word and cause the CV32E40P processor to go into a sleep state. This is an instruction that GCC will not know about because it's very machine-specific. Thus, we need a built-in. Before we get into all of this, it is very important to call out the naming conventions of these built-ins. The general naming convention for a built-in in GCC is just __builtin_ and then the instruction name. But if you want to make it a RISC-V-specific built-in, it will be __builtin_riscv_, then the vendor and the name. For a CORE-V-specific one, it will be __builtin_riscv_cv_ (cv for CORE-V), then the extension name and then the instruction name. Yes, I understand it's a bit long-winded, but it is very important to emphasise which architecture, which vendor, which extension and which instruction you want to use. It just makes it a lot easier for the programmer to know which instructions they want to use. So my built-in, if you want to use it, will be called __builtin_riscv_cv_elw_elw. Because there's only one instruction, I just call it the same thing as the extension. So this is an example of how to use this built-in. This built-in will take a void pointer.
It will load from a specific memory address into a general-purpose register, which is an unsigned 32-bit integer. From this example, yes, the only thing you have to do is put in the pointer and it will return your unsigned 32-bit integer value. Can you speak a little louder, please? Oh, okay, sorry. Now that I've spoken about what event load is, it's time to add the extension to GCC. Most of the implementation for adding an extension will be in riscv-common.cc. We've called our extension xcv, which will be the main extension, and then you'll have the sub-extension, which will be xcvelw. There isn't any ISA spec class, so I'll just use the "none" macro, and this will be the first version of it. Because I am implementing a sub-extension, we have to imply it here by putting the sub-extension first and then the main or parent extension. Next, we add the corresponding masks and targets. Before we do all of this, we need to go into riscv.opt to add the target variable and the corresponding CORE-V flags. This file is very sensitive. Even though it's two lines, if you mess it up, then you've got GCC crashing everywhere, so you have to be very careful in this file. You then use that flag for your corresponding target, but you also use it when you specify your GCC options. So I've done that in riscv-common.cc, which is here. Now it gets into the interesting stuff, actually defining the built-in. RISC-V has a function already made for us so we can make these built-ins. That is in riscv-builtins.cc. It takes in five arguments, and I'll be going through all of these in the following slides. They are the instruction name, the built-in name, the built-in type, the function type, and the availability predicate. Using this function, I have created my own file, which is called corev.def, and this is where all the CORE-V-related built-ins will be.
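To make the idea of a five-field entry in a .def file more concrete, here is a small stand-alone sketch of the general "X-macro" technique that such files rely on. It is not GCC's code: every DEMO_ and demo_ name is hypothetical, and the field values are placeholders that only loosely follow the talk. What it shows is how one entry can be expanded into a table row, with the user-visible name built by prefixing __builtin_riscv_.

```c
#include <string.h>

/* Hypothetical stand-in for one line of a .def file. The five fields follow
   the order given in the talk: instruction name, built-in name, built-in
   type, function type, availability predicate. */
#define DEMO_BUILTINS(X) \
    X(cv_elw_si, "cv_elw_elw", DIRECT, USI_FTYPE_VOID_PTR, cvelw)

struct demo_builtin {
    const char *insn;      /* instruction pattern name */
    const char *name;      /* user-visible name, after prefixing */
    const char *kind;      /* built-in type */
    const char *prototype; /* function type */
    const char *avail;     /* availability predicate */
};

/* Expands one entry into an initializer. Adjacent string literals are
   concatenated, which is how the prefix gets attached to the name. */
#define DEMO_ENTRY(INSN, NAME, KIND, FTYPE, AVAIL) \
    { #INSN, "__builtin_riscv_" NAME, #KIND, #FTYPE, #AVAIL },

static const struct demo_builtin demo_builtins[] = {
    DEMO_BUILTINS(DEMO_ENTRY)
};

/* Linear search by user-visible name; returns NULL when nothing matches. */
static const struct demo_builtin *demo_lookup(const char *name)
{
    size_t count = sizeof demo_builtins / sizeof demo_builtins[0];
    for (size_t i = 0; i < count; i++)
        if (strcmp(demo_builtins[i].name, name) == 0)
            return &demo_builtins[i];
    return NULL;
}
```

In this sketch, demo_lookup("__builtin_riscv_cv_elw_elw") finds the single entry. The point of the design is that adding another built-in means adding one more X(...) line, and the table is regenerated from the same list.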
My first built-in will be in corev.def. The instruction name will be cv_elw_si, si for single integer. The name of the built-in that programmers will be using will be cv_elw_elw, but that will be expanded with the __builtin_riscv_ prefix. Then you've got the corresponding built-in type, function type and availability predicate, and I'll go into those more. So, the instruction patterns. This is probably the most difficult part of the whole built-in implementation. The insn name is the name of the associated instruction pattern in the machine description file. The pattern takes in five operands. The last operand is optional, but I recommend putting it in if you can. You've got the name, the RTL template, the condition, the output template, and the insn attributes. That would all be in riscv.md, but I will be creating my own .md file for CORE-V, so we don't merge it into riscv.md. This is an example of an RTL template, RTL being register transfer language. It is very, very similar to the intermediate representation that GCC uses. It's a template that GCC will take and fill in with the corresponding registers or operands that it needs. So this is the instruction pattern that I will be using for this built-in. The name will be riscv_cv_ and so on, as we've previously defined it. I am using the set pattern, and this will take a destination register and a source register. This will be the destination register, the first operand, and I've used the match_operand pattern, which takes m as the machine mode, the index of this operand, the predicate and the constraint. The machine mode for this will be SI, which is a single integer. It's 32 bits. The index of this operand is zero. We usually start with zero as the indexing.
The predicate for this will be a register operand, as we'll be loading into a general-purpose register, and the constraint will be "=r": r for register, and the equals sign meaning it's going to be written to. The next part of this is the source, which will be the specific memory address. We're using mem to specify the size of the object being referenced, SI being single integer, 32 bits. Again, we're using match_operand to match the register or the pointer to the specific address. The index number will be one because that's the next number. I am using an address operand, and then p to specify a pointer. I am using an unspec_volatile for this instruction because it's a volatile operation. It's very machine-specific. It can get difficult, and there are times where it could trap. We are referencing state that is fragile and vulnerable, so that is why I've been using an unspec_volatile. Now that I've talked about the RTL pattern, we talk about the condition. The condition is important to add so that the instruction can only be generated under these conditions. You can only generate this pattern if the target has XCVELW and it's not a 64-bit target. Next we talk about the orange bit, which is the output template. The output template will be what you see in the assembly. You define it with the instruction name, so cv.elw, and then \t for tab. This is where you use those index numbers to reference which operands you want to use. I will be referencing %0 and then %a1. %0 will be the destination register and %1 will be the source register. I am using %a to substitute it as a memory reference. Lastly we talk about the optional operand, but again, this is something you should try to put in if you are going to add a built-in. We want to tell GCC that this is a load type of instruction and that the mode is SI throughout the whole built-in.
The reason I have added this optional operand is that the instruction could still be generated without it, but GCC can now optimise it knowing that it is a load and knowing that it is in machine mode SI. That is the big part of the built-in done. We have discussed the insn name and the template. Now we come to the built-in types. In RISC-V there are currently only two built-in types, and they can be found in riscv-builtins.cc. They are RISCV_BUILTIN_DIRECT and RISCV_BUILTIN_DIRECT_NO_TARGET. RISCV_BUILTIN_DIRECT corresponds directly to a machine pattern like the one we have just created, whereas RISCV_BUILTIN_DIRECT_NO_TARGET does the same thing but the return type will be void. We are returning a general register operand, a 32-bit unsigned integer, so we will be using RISCV_BUILTIN_DIRECT. Next come the function types. Again, everything is in riscv-builtins.cc, and currently there are only two types of prototypes for RISC-V. You can only have a return type and one argument. In coming presentations I will be talking about this a bit more, because I only have 45 minutes for this presentation. Which return types and argument types we are using is defined in riscv-ftypes.def. The comment says that it will expand to RISCV_, then an unsigned integer, and then a void pointer, because that's what I will be using for my built-in type. Lastly we have the availability predicate. This is very similar to the condition we had in the RTL template. We use the AVAIL function that has been declared in riscv-builtins.cc. It takes the name of your availability predicate and then the corresponding conditions. As you can see, it's very similar to the condition we had in the RTL template, which is a target reference plus the requirement that it's not a 64-bit target. Now that we've added the extension, the instruction and the built-in, it's time to test it.
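The availability predicate described above is, in essence, a named condition that is turned into a small function. The following is a stand-alone sketch of that idea only, under the assumption that the predicate is generated by a macro taking a name and a condition. The DEMO_AVAIL macro and the two demo_target_ flags are hypothetical stand-ins, not the compiler's own definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's target flags. */
static bool demo_target_xcvelw;
static bool demo_target_64bit;

/* Turns a name and a condition into a predicate function. The condition is
   evaluated each time the predicate is called, not when it is defined. */
#define DEMO_AVAIL(NAME, COND) \
    static unsigned int demo_builtin_avail_##NAME(void) { return (COND); }

/* Same shape as the condition in the talk: the extension must be enabled
   and the target must not be 64-bit. */
DEMO_AVAIL(cvelw, demo_target_xcvelw && !demo_target_64bit)
```

With demo_target_xcvelw set and demo_target_64bit clear, demo_builtin_avail_cvelw() returns 1; in every other combination it returns 0, which mirrors the condition used in the instruction pattern.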
And this is a very simple test just to make sure that it works. It's a compilation test. It takes in a void pointer with an offset and returns an unsigned 32-bit value. You can see there are comments on the side. These are DejaGnu comments. We are using DejaGnu because the tests can be run on a simulator or on microcontrollers. It's the testing framework that we use for our test scripts. The first comment says whether it is an execution or a compilation test. This will be a compilation test because we haven't got an executable target yet. The second line tells you the options for this built-in. If you don't specify the options then this test won't run, because this instruction only works with XCVELW. And then the last comment is for checking that our instruction has been generated in the assembly, and it should be generated once. There are backslashes to escape characters. It's very sensitive because it's a regular-expression type of framework. We've got a run script for this. It's very important to build GCC first, because I've been running tests without building GCC and wondering why they don't work. It wasn't until our GCC experts told us, no, you've got to run the build: you have to build GCC and then run the tests. So this shows the results from our run-test scripts. Although it's just one test, there are 18 passes. That is because it goes through nine optimization levels, and each optimization level goes through a scan-assembler test and then a compilation test. Like I promised, I've put up the slide showing where all of this can be found. It is on GitHub, in the OpenHW Group's CORE-V Binutils and CORE-V GCC repositories. This is all part of the OpenHW Group. We are still looking for volunteers and people to contribute to this project. And it's very important to also mention the GCC internals manual. It's probably the guru of GCC. That's what I rely on the most now. Thank you for listening to my presentation. Do you have any questions?
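A test of the kind described above might look roughly like the following sketch. It is not the test from the slides. The dg-do, dg-options and dg-final directives and scan-assembler-times are real DejaGnu/GCC testsuite features, but the option string and the regular expression are assumptions, as is the function name test_elw. So that the file also compiles on a compiler that does not have the CORE-V built-in, the call is guarded with __has_builtin and falls back to an ordinary load; that fallback is only there for this illustration.

```c
/* { dg-do compile } */
/* { dg-options "-march=rv32i_xcvelw -mabi=ilp32" } */

#include <stdint.h>

/* Older compilers may not provide __has_builtin at all. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Takes a void pointer, applies an offset, returns an unsigned 32-bit value. */
uint32_t test_elw(void *base)
{
    void *addr = (char *)base + 8;
#if __has_builtin(__builtin_riscv_cv_elw_elw)
    return __builtin_riscv_cv_elw_elw(addr);
#else
    /* Fallback for this sketch only: a plain load without the event wait. */
    return *(volatile uint32_t *)addr;
#endif
}

/* The instruction should appear exactly once in the generated assembly. */
/* { dg-final { scan-assembler-times "cv\\.elw\t" 1 } } */
```

On a host compiler the directives are just comments and the function behaves like a load from eight bytes past the pointer. On a CORE-V toolchain the intent is that the testsuite compiles the file with the given options and counts occurrences of the instruction.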
Yes? I have a question. So I know that these built-in functions are used by the PULP people, which I think is what came before the CORE-V project, right? I think they use them for various mathematical functions to speed them up. What I'm interested in, what I'm working on, is using higher-level compilers to compile into automatically generated kernels. What's not clear to me right now is that if I use a built-in, then I would need to compile to C code, right? Is there any way that you can still reuse part of this work without having to use C code, or would you always need to go to C code? For now, I've just been using C code, so I'm not really sure. I don't know. If you've got a fault, I'm fine. There's the C API, so you can sort of wire it into it. This is in the compiler, so you just need to find your own code to reach to the client. So in this case, you could also use these things in Fortran code? You could, yeah. I have an amazing that myself, I've been working with the staff, so there's no reason for this not to work. It's expressed in terms of C code, so it has to be expressed somehow. I was a bit confused about the built-in concept in general, because, I mean, usually people use C code so as not to be machine-specific, but if you use a built-in, then you become machine-specific, right? Yeah. Oh, yeah. It depends on the built-in. GCC has built-ins that are sort of general. I mean, all the maths functions, for example, like a body of maths, are not machine-specific. They are, obviously, compiler-specific. It's not that specific in this case, yeah, but because you can have other kinds of mathematics. Yeah. Okay, at least architecture-specific, right? Well, actually it is not architecture-specific. It's general. Yeah, but even for mathematics built-in functions, you have, not always, but mostly, something kind of architecture-specific. Oh, yeah, there can be stuff like the encoding of numbers or suchlike. Yeah.
It's a sort of, you know, just because. So it should work, yeah. Actually, that's one way to avoid these architecture-specific things. Rather than encoding a NaN pattern into your code, by using a constant or bit pattern and then casting it to the proper floating-point type, you can use __builtin_nan. It's a built-in function that produces the correct encoding of a NaN for your target. Okay, thank you for listening to my presentation. Thank you. For me. Thank you. |
Bringing up the OpenHW Group RISC-V tool chains |
Thank you all very much. Thank you for coming along. My name's Jeremy Bennett. I'm Chief Executive of Embecosm. Embecosm is an engineering-heavy company. We only have one full-time non-engineer in the whole company, and it's not me, so I actually am a working engineer as well as running the company. We develop open-source software, mostly compiler tool chains, but we also do open-source AI, and we also have some open-source operating system stuff, and because most of what we do is pre-silicon, we do an awful lot of open-source silicon chip modelling. But I'm also here with another hat on, which is that I am Chair of the OpenHW Group's Software Task Group, so I'm responsible for all the software developed for the OpenHW Group, and I'll talk a bit more about them in a bit. This talk is part technical, but it's partly about the actual practical side of how we go about developing complex software for an open architecture like RISC-V. So let me tell you a bit about the OpenHW Group. It's a not-for-profit, member-driven collaboration, and it's a mixture. There's industry, companies like mine and some big companies that you will recognise. NXP is a member, for example. It has academics, so quite a few universities are members. And it also has individual members. You can contribute as an individual, and some of the work I'm going to talk about has been done by people who are individual members. The goal is high-quality open-source hardware development, and that high quality is the key thing: the sort of open-source hardware that you can put in a commercial chip and be confident you can send off to be fabricated. It's collaborative and it's an open development model, so all these things are open to all. Now, the organisation is the OpenHW Group, but its cores are known as CORE-V, which you can say as "core five" or "core vee".
So we have a huge family of processors, and I'll talk about them a bit, everything from the smallest 32-bit RISC-V to the biggest 64-bit RISC-V designs. These are standard RISC-V cores, but with some custom ISA extensions for RISC-V. The chief executive is Rick O'Connor, and those of you who've been around for a few years will remember Rick because he was the first chief executive of RISC-V International, and he's moved from the open-source specification world to actually delivering real in-silicon IP. Let's look at the ancestry. The OpenHW Group grows out of an academic and industry project called the Parallel Ultra-Low-Power Processor project, PULP, and that was a collaboration originally between ETH Zurich, the University of Bologna, and STMicroelectronics. STMicroelectronics is no longer active in it. It predates RISC-V. The first part of PULP was done with OpenRISC, but for many years now it's been a RISC-V project, and the idea is to get very, very low-power multi-core systems. So that's where we come from. The cores started off as academic research cores, and the point about academic research cores is they are designed to push the forefront of knowledge forward. They're not designed to be exhaustively tested for manufacturing and use in a commercial deployment. That's not the purpose of a university. So the natural transition is that the OpenHW Group takes that as an outstanding technical base and then takes it to a robust standard. We have loads and loads of members. I've copied this off the website, and it's the wrong format. I really want a wider and flatter one, but you'll probably see some logos there. You'll see Embecosm's logo there. The astute among you will notice that Amazon Web Services appears to be both a member and a partner. I think they transitioned from a partner to a member, and the slide wasn't properly updated. So we have lots and lots of members. I think it's up to about 70 now.
And it might be worth just saying that when you become a member, yes, if you're a corporate, you have to pay a membership fee, but that's not the big thing. You can only become a member if you commit resource in terms of what you're going to contribute, and that dwarfs any membership fee. You cannot be a member unless you're going to do something. We don't have sleeping members. You've got to be active members. In terms of organisation, it's very lightweight, and this is one of the big contrasts with RISC-V International. At a technical level, we only have five committees. We have an overarching Technical Working Group, which has the overall responsibility for the engineering direction. It has co-chairs, Jérôme Quévremont from Thales and David Lynch, and I can't remember where David Lynch comes from actually, but they are joint chairs from two companies, and it meets every month as the final arbiter of technical stuff. Then all the work of this organisation is handled by just four task groups. The Cores Task Group is headed up by Arjan Bink from Silicon Labs, and that is responsible for tracking the development of the cores. The work is done by the member companies, but these are the groups that have oversight and make sure the quality criteria are maintained. There's a Verification Task Group, and verification is actually separated from core development, although the two are desperately tightly tied together. That's led by Simon Davidmann of Imperas. There's a Hardware Task Group, and this is a slightly strange name perhaps, but the point of the Hardware Task Group is that, though this is fundamentally a group developing silicon IP, we do have to have reference implementations. The first reference implementation should be coming out later this year, which is the CORE-V MCU, and that has one of the 32-bit cores in it.
And lastly, the Software Task Group, which I lead, assisted by Yunhai Shang from Alibaba T-Head, and that's responsible for all the software projects. Again, it's oversight. It's not doing the work, because to be a member you have to do the work. We do have a bit of a problem there, because we have mostly hardware members and not enough software members. In terms of the roadmap, we've got a flagship application core, the CVA6. That's a 64-bit full-blown RISC-V core designed to run Linux, and it comes out of the top-end PULP core under development. There is a smaller 32-bit core, the CV32A5, which is designed to try and do a small Linux system, though of course we've got the issue of getting Linux running on 32-bit anyway. You probably can only just make it out, but each of these projects has a target technology readiness level. Are people familiar with technology readiness levels? A little bit. Most of these are aimed at technology readiness level 5, which is proven in the environment. The CVA5 is a bit different. It's aimed at TRL 4, which is proven in the lab, and we've got other projects which are at different levels, but mostly we're aiming at TRL 5. The first project was the CV32E40P, which is a microcontroller-class 32-bit RISC-V implementation. The first version of that is complete, the second version is under development, and actually the work Nandni was talking about with built-ins is primarily focused on CV32E40P v2. You'll see sitting there something called the CV32E41P, and that's a bit unusual because it's actually only going to TRL 3, which is proof of concept. It's being developed by Huawei under this group in order to test out the Zc* extension, the new compressed extension, and Zfinx, where you have a shared register bank for floating point and integer. It's only a proof-of-concept chip to verify that those work. Then we have a couple of more forward-looking ones.
The CV32E40S is a version of the original CV32E40P aimed at security applications, and the really exciting one is the CV32E40X. That's a bit further out, because the X is a generic extension interface at the hardware level. It is designed so that you can take a core and it's really easy to add in a wide range of extensions, to the extent that you can even do the floating point extension through the CV32E40X interface. So that's the roadmap we're sitting on. So what about the software projects? Well, I'm going to focus on the tool chains, so the LLVM tool chain and the GNU tool chain, and because we haven't yet got silicon, I'm going to focus on a couple of software models, QEMU and the Verilator model. We have other projects, so I'm responsible for eight projects in total. There's the SDK, the software development kit, which is actually joint with the hardware group. There's the hardware abstraction layer, so that we can make our software more easily portable as we do more and more of these chips. There's FreeRTOS: we're microcontroller, and if you're going to go for an RTOS, the obvious one to start with is FreeRTOS. And there's Linux, which is really aimed at the CVA6. So we've got eight projects under our belt. We have a rigorous engineering process. Those of you who work for big engineering companies are familiar with gate-based processes, okay? The way we manage all of these projects is through a gate-based process. You have a project concept gate. That's where you propose a project. You want to work on this project, so you explain what it is and why we need to do it. And the project will only go ahead if it's voted for by enough members. And members can't just vote and say, hey, that's a good idea, I'll say yes to everything. If you vote for it, you're going to commit resource to it. The critical thing is this is a doing organization. So, okay, you then go and explore it, and then we get to a project launch gate.
At that stage, not only do you know what and why, but now you know how you're going to do it, okay? What do I need to do to get this done? And then the big one, the last one, is plan approved, okay? Plan approved means you've resourced it, so you know when you're going to deliver it. And that's quite a big hurdle for some projects, okay? And then away the project work goes. I mean, this is a bit simplified, we all know work starts a bit in advance, but away the project goes and eventually you get to project freeze, okay, where it's all done. Now, this is brilliant for hardware, okay? It's a very hardware-centric view of the world, because a freeze means something. It's when your chip's gone off to be fabbed, okay? It doesn't work quite so well for software, so we've modified it for software, okay? The first two stages are quite generic, because typically we're working with a big block of common software. You know, we're not writing a compiler from scratch. We're taking the GNU infrastructure, GCC, Binutils and all that, millions of lines of it. Most of that is not changed and we're changing a bit. So the what, the why and the how are quite generic, probably to all CORE-V chips, certainly to all the 32-bit ones. And then for each specific chip, we have multiple plan approved gates, which is where we work out when we're going to deliver it for CV32E40P v1, for v2, for S, for X and so forth. So we have a whole set of them, but it's still the same process: you need to know it's properly resourced and so forth, okay? We'll see that a little bit in action, but that whole engineering focus pervades everything, okay? So let's put this in context. What's a compiler tool chain? It's not just the compiler. It's the assembler, it's the low-level utilities, it's the debugger, it's the libraries, it's the emulation libraries, it's the standard C libraries, the C++ libraries.
And in the ultimate world, if you look at GCC, and this is GCC, it's many, many languages. It's Ada, it's the C and C++ family, it's Fortran, it's Java, it's Go. These days, it's Rust, it's Modula-2. And of course, we've got things that sit at the high level like OpenMP and OpenACC. That's a huge lot of stuff. I hasten to say, we're a long way off having all of that for CORE-V. But it's a lot of code. It's north of 12 million lines of code, and it's a year or so since I last measured those figures. So, it's big. And we're trying to get that all to work seamlessly on CORE-V. Now, we're not doing it from scratch. Of course, we're starting from RISC-V and then we're adding stuff to it. And you can say LLVM has the same components, but they've got different names and a different set of languages. So let's look at the ISA extensions for CORE-V. There are nine CORE-V ISA extensions we're concerned with. Eight of those come from the PULP project. So, extra addressing modes, post-incrementing load and store. Hardware loops, more ALU operations, some special-case branching operations. We've got MAC instructions. We've heard about the event load. That's a multi-core feature. And then we've got the PULP bit manipulation and the PULP SIMD. Those are not the standard bit manipulation and SIMD. Those are different ones, but there's been years of development. The reason they're different is they predate the RISC-V bit manipulation and they predate RISC-V SIMD. And you can see the PULP SIMD is big. The reason that Nandni knows so much about built-ins is she's done the built-ins to support those 220 SIMD instructions. She's nearly finished, and then I think she's going away for a long holiday where she never looks at another built-in again. And we've got Zc*. We had the first GCC implementation supporting Zc*. Of course, these are standard RISC-V compilers. You can still use them for RISC-V.
And the hot news is that CORE-V GCC has a pull request to add support for Zc* 1.0.1, which is the freeze candidate. Once that's been reviewed, that will go in there. So we've got a lot. The toolchain work is all about supporting these PULP extensions. In terms of the built-ins, you've heard all about them, but it's a lot of functions. It's not just Nandni. Nandni has a team working on this. And we've got this naming convention. The convention is that built-ins for RISC-V are __builtin_riscv_, then the vendor, then the name. We've got so many that we're actually splitting the name up into ISA extension and name. There is a rule, though. If what you're doing corresponds to a standard built-in, then you must use the standard name. So for example, in our arithmetic, we have abs. So we've got __builtin_abs, which is a standard GCC built-in. The built-ins actually have the same name for 32-bit or 64-bit. That is not overloading in the C++ sense, because you're compiling either for a 32-bit target or for a 64-bit target, so you have one or the other there. That's not an error. It's also the case that built-ins are not just another way of doing inline assembler. There's not a one-to-one mapping. So for example, for the SIMD add scalar, there are actually two different ways of adding scalars in the CORE-V SIMD, one where the scalar's in a register, the other where it's a small integer, and there's actually an add immediate instruction for that. We don't have two built-ins. There's a single built-in, and if the second argument is a small constant that fits there, it will generate the immediate version of the instruction, otherwise it will load it into a register. That's part two of Nandni's talk for the future, because that's quite a lot harder to do in a built-in. There is a specification. It's big. If you generate the PDF, it's 57 pages long. That's one of the things about built-ins.
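The choice between the immediate and the register form described above can be pictured as a small decision function. This is only a model of the decision, not compiler code: the demo_ names are invented for the sketch, and the width of the immediate field is passed in as a parameter because the talk does not say how wide it is.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_form { DEMO_FORM_IMMEDIATE, DEMO_FORM_REGISTER };

/* True when value fits in a two's-complement field of `bits` bits.
   Only widths from 1 to 31 are handled in this sketch. */
static bool demo_fits_signed(int32_t value, unsigned bits)
{
    if (bits == 0 || bits > 31)
        return false;
    int32_t min = -(INT32_C(1) << (bits - 1));
    int32_t max = (INT32_C(1) << (bits - 1)) - 1;
    return value >= min && value <= max;
}

/* One built-in, two possible instructions: use the immediate form only when
   the operand is a compile-time constant that fits the immediate field. */
static enum demo_form demo_pick_form(bool is_constant, int32_t value,
                                     unsigned imm_bits)
{
    if (is_constant && demo_fits_signed(value, imm_bits))
        return DEMO_FORM_IMMEDIATE;
    return DEMO_FORM_REGISTER;
}
```

For example, assuming a six-bit field purely for illustration, demo_pick_form(true, 5, 6) selects the immediate form, while a constant of 100 or a value only known at run time selects the register form.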
Built-ins are not quite as easy as you think. There are things you can get wrong there, and you do genuinely have to think and review. The specification is under review at the moment. It's not finalised. So, testing. Now, if we're going to do full testing of a tool chain, we need a target which has all these ISA extensions. They're not standard RISC-V. I can't just take standard RISC-V QEMU or whatever. You can do some testing. The standard GNU assembler tests, for example, don't need an executable target. They are pattern matching: have you generated something that looks right? You can do the same with built-ins. You saw it in Nandni's compile-time-only test, where you do a scan-assembler to see whether that built-in has generated just one of the instruction I want. But more generally, you need to be able to execute your tests. So we have two things. One is QEMU for CORE-V. That's a project being led by Weiwei Li in the Programming Languages and Compiler Technology (PLCT) team at the Chinese Academy of Sciences in Beijing. That's a work in progress. We're expecting that to become available later in 2023. And secondly, we're using Verilator models. How many people here are familiar with Verilator? Hands up. Okay, most of you. For those who aren't, it's a tool that takes a hardware design in Verilog or SystemVerilog and generates a C++ model from it, a cycle-accurate C++ model. Verilator models are really useful because, A, they're easy to integrate into tool chain testing and, B, they are what's called implementation models. Most models come from the specification of the chip. These are the actual implementation, so we know that what you're testing is what is physically going on the chip. And it's not enough just to have a model. You've got to be able to hook up to it. So typically you've got to wrap around it some form of debug server so you can connect GDB or LLDB. And that's the work in progress.
That's due to be completed in the next few weeks. And then you'll be able to actually run on a model of the actual hardware. Testing policy: the LLVM project uses the LLVM integration tester. That is a set of several tens of thousands of tests from source code down to LLVM IR. Very comprehensive, and we use that. But it isn't a set of execution tests. Now, LLVM does have an executable test suite, but it is a set of applications to run under an operating system, and for a small microcontroller they're not suitable. They need an operating system there. So we can't use the LLVM test suite. Instead we use a subset of the GNU regression tests to test LLVM compilers. And that's widespread. That's not something we've invented. That's been done for years. It's only a subset because there is no point in running the GNU tests of the internal representations inside the GNU compiler. But things like the torture tests are absolutely fine whether they're on LLVM or GCC. They're compiler-agnostic. The GNU tools just use the GNU regression tests. Something that we're very hot on is exhaustive testing. It's not just, oh, I tried one thing and it seemed to work. It's, let's make sure we've not missed things. So starting at the assembler, we have both positive and negative testing. By positive testing, I mean testing that the compiler does what you want. By negative testing, I mean testing it doesn't do things when it shouldn't. So for example, if we're looking at an instruction that takes a six-bit signed constant, we will test it with values of minus 33, which is too small, minus 32, the most negative six-bit number, zero, because you should always test zero because it's a special case, probably minus seven and plus five, and then we'll test 31 and 32, which is too big. We will check those bounds. And actually, we added those tests for Zc* and found a whole load of bugs in the Zc* spec as a consequence. So we do that sort of thing.
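The boundary values listed above for a six-bit signed constant can be written down as a small table-driven check. This is a sketch of the testing idea only, with made-up demo_ names, not part of any real assembler test suite: each value is paired with whether it should be accepted, so the positive and the negative cases are covered by the same loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A six-bit signed field holds values from -32 to 31 inclusive. */
static bool demo_simm6_ok(int32_t value)
{
    return value >= -32 && value <= 31;
}

/* The values named in the talk, each with the expected verdict. */
static const int32_t demo_simm6_cases[] = { -33, -32, -7, 0, 5, 31, 32 };
static const bool demo_simm6_expected[] = {
    false, true, true, true, true, true, false
};

/* Counts the cases where the check disagrees with the expectation. */
static int demo_simm6_mismatches(void)
{
    int mismatches = 0;
    size_t count = sizeof demo_simm6_cases / sizeof demo_simm6_cases[0];
    for (size_t i = 0; i < count; i++)
        if (demo_simm6_ok(demo_simm6_cases[i]) != demo_simm6_expected[i])
            mismatches++;
    return mismatches;
}
```

A result of zero from demo_simm6_mismatches() means the range check agrees with every expectation, including the two out-of-range values on either side of the bounds.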
And we also test the extensions. Say I've got the ELW instruction. I test that the ELW instruction is handled by the assembler when I specify xelw. I also test that it doesn't get recognized when I don't specify xelw. So that's really important. And one thing I would observe, and I've seen this for a long time, is that RISC-V is incredibly weak on its assembly-level testing. There are about 10 times as many tests for ARM and x86 as there are for RISC-V. So, you know, it's important to do that. And we do CORE-V-specific, we're a vendor, CORE-V-specific LD testing because we're adding some relocations. Do they work? At the moment, we've got compilation-only tests of built-ins, scanning for assembler instructions. But when we've got those models running, the QEMU and the Verilator model, we'll be adding execution tests. So not only do I generate the right assembly instruction, but it does what I expect. Not only do I generate the built-in, but it does what I expect. So we'll be adding those in as well. And all that testing ties into the difference that this is a commercial-grade chip, commercial-grade core, sorry, and its associated tool chains. So, resourcing. Resourcing is an issue because these days on a chip you'll spend twice as much on the software as you do on the hardware. But the OpenHW Group's members are inherently, mostly, hardware companies. Okay? We've got plenty of software members, but we're still in a minority. So it is a challenge to get enough resourcing. My company, Embecosm, makes a lot of contribution; that's part of our membership. The PLCT Lab in China, they make a big contribution. So that's coming there. And we actually double up. We use this very much as part of our graduate training program as well. So it allows us to train a new generation of compiler engineers.
But we're also seeing companies like Silicon Labs and Dolphin Design, and I'll call them out, because they help to fund software companies to do the work. So thank you to both of those. And we do need more of that to come on, and I expect it will come along. Okay? Second is we are not going to maintain out-of-tree forks of GCC and LLVM. It's a thankless task. It takes up a lot of time. The goal is to get upstream as vendor extensions. This is not a new thing. This has been part of GCC for a long time, and part of LLVM. Okay? And when you have that triple that says what your target is, that thing that in your standard compiler says riscv32-unknown-elf-gcc or whatever, that "unknown" is the vendor field. And we should be using that. So if you get these tool chains, you'll find they build as riscv32-corev-elf-gcc, where corev is the vendor. Okay? And that's absolutely standard. It's been around forever. You can see in the GCC build, for example, there are variants of SPARC and RISC-V for different manufacturers, for people who make, you know, radiation-hardened versions for space and so forth. Okay? That all works fine. RISC-V is designed for this. It's extensible. That's the whole point. There is a missing piece of the jigsaw, which is that in RV32, in the ABI specification, you have relocations. Okay? There are 256 possible relocation values you can have. The top 64 of those are reserved for vendors. Okay? That's probably enough for any one vendor, but it's not enough for all vendors, and it would require a centralized way of controlling them. We know how to solve this problem: every time you need a relocation to tell you that this bit of assembler needs its memory offset adjusting when you link it, you put down two relocations. One is to say which vendor you are. That's just a new relocation with 32 bits, so we can have 4 billion vendors. Okay?
And then the second one says which of those 64 relocations it is, which means there's a full set of vendor relocations for every vendor. So we know the concept. One of my team, Pietra Ferreira, who's sitting somewhere in the audience, is doing the proof of concept to demonstrate that it works. It turns out the GNU linker is rather running at its limits with the complexity of RISC-V, so it's not a completely trivial task, but we need that before we can fully upstream all this. The rest of it is all ready to go and it's all done to upstream standards. Another thing we found: you noticed I showed you there are two versions of CV32E40P and they've got different instruction encodings. We thought it would be good to actually be able to support both instruction encodings. When you specify an architecture, you're allowed to specify, say, rv32imac_xelw, and you'd then expect to be able to append 1p2 to say I want version 1.2. Okay? That's all part of the standard way you name an architecture, but it turns out it's not supported in the GNU assembler, and furthermore, the GNU assembler is not written in such a way that it's ever going to be easy to support. So we gave up on that and, in fact, we're only going to support the latest version, and that probably ties into the way that RISC-V International is going. So those, if you like, are the key issues we're addressing. On the upstreaming, we're almost certainly going to upstream the ISA extensions that don't need vendor-specific relocations, and we'll put the others up once the vendor-specific relocations are ratified by the psABI group. So, there. Get involved. The projects are all on GitHub. The OpenHW Group has its own repository, and if you don't like building from source, you can go to the Embecosm website and download pre-built tool chains for GCC and LLVM, for CORE-V, for every operating system under the sun, all flavours of Linux, Mac, Windows, whatever.
So, get involved. Each of these projects has a project lead. Charlie Keaney leads the LLVM project. Chunyu Liao from PLCT, remember I said how you have a different plan approved for the different variants? She's in charge of the specific project for CV32E40Pv2. Nandni Jamnadas, who you heard from just now, leads the GNU Tools project and is also responsible for the CV32E40Pv2. Weiwei Li from PLCT runs the QEMU project, and I'm responsible for the Verilator modelling because I'm a Verilator guy. Part of this is about bringing on a new generation. We actually help bring a new generation on and train them. So, there is a half-hour call. I'm sorry about the time if you live in America, because most of the people involved are either in China or in Europe, so the calls are on Friday mornings. There's a half-hour call on LLVM run by Charlie and there's a half-hour call on GNU run by Nandni. And the idea is that we'll review people's work collectively. We'll review their pull requests, and it's as much a training and learning thing as anything. So, if you want to get into this stuff, it's actually quite a good way to get a bit of free training. And that's it. So, that's me. That's Embecosm. That's the OpenHW Group. Thank you very much. So, we've got a few minutes for questions. I'm happy to take any questions. Yes? Yes. So, I'm working in a hardware research group at the university. We do a lot of tape-outs. Previously, we've always used the PULP cores from ETH Zurich or the University of Bologna. But sometimes that gives us some trouble because, for example, I'm doing compiler development right now, and last week I discovered that there was a bug in GDB, and nobody is working on GDB anymore for this specific version that we tape out. So, I was just wondering, do you maybe have a time frame for the upstreaming of these extensions?
And if tomorrow we do a tape-out, should I tell my colleagues to use an OpenHW core, or should I tell them to stay with the PULP cores in general? Okay. So, the question, for the recording, was: if you're working on the ETH PULP cores, which are still there as fantastic research cores, should you use the old PULP compiler or should you use the CORE-V compiler? I think there's not a black and white answer on that. The PULP compiler is a fork of GCC from 2017. So, it's quite a long way out of date, and that means it hasn't got the latest RISC-V stuff in there. When we started on the GCC for this, we actually looked at whether we could roll that forward, and it wasn't a sensible starting point. We started from scratch from the latest GCC. In terms of which core you use, I believe ETH Zurich is slowly moving over to using the CORE-V cores more, because you may as well use these hardened cores. In that case, the obvious thing is to use the CORE-V tool chains, and though they're not yet upstream, they're all in the public and there are pre-compiled ones you can pull down. There is a problem if you're using the old PULP cores, because remember I talked about version 1 and version 2. The old PULP cores are so old they predate the finalization of the RISC-V encoding space, and the instruction encodings actually trample on future encoding spaces for RISC-V. Version 2 fixes all that, and all the version 2 instruction encodings are now RISC-V compliant. They sit in the custom-0, 1, 2 and 3 blocks. What that means is you can't use this compiler to compile for the old PULP encodings, because, given the versioning issue I talked about, we haven't got the version 1 stuff. So, that might be a factor you have to bear in mind there. But the old compiler, I've looked at the old compiler, and what it comes down to is that it's a research compiler.
It wasn't designed to be tested; it was designed to prove concepts, and I've always felt very strongly that that's the job of universities, not to do the exhaustive testing we do. It's for a different purpose. So, it's a different type of compiler, but it does mean that occasionally you get weird behavior. Yeah, so I haven't really answered the question, but I've given you the decision points to look at. I'd love you to use CORE-V because then you'd be tempted to join in and help here. Any more questions? Yes, right at the back. Absolutely. Yes, I should have said, yeah. So, we have a lot of projects under there, and in that roadmap I showed, if you look closely you'll see the dates are all wrong because some of these have moved out, and we've got a load of projects, like the TRISTAN project that we heard of earlier, which are under the OpenHW Group. And those of you who use David and Sarah Harris's textbook for design, okay, the Wally processor is being re-implemented as a RISC-V processor and that is being done under the OpenHW Group. So, your next generation of textbook will have an OpenHW Group Wally processor in it. So, yeah, there's more than just those cores I listed there. And if you are working on a core and you think you might want to put it in this framework, come and talk to one of us. You can talk to the director, Rick O'Connor; if you don't know him, come to me and I will introduce you. Yes? I only have a stupid question. So, I work mostly on applications actually, and in our development we're starting to converge: developers and testers are sort of converging into one team. Now, you were saying that you actually have this split where some people do the cores and others do the verification. Would it also be possible for that to converge at some point? So, the question, if I can paraphrase, is why do we have a separate cores task group and verification task group? They do work very closely together.
This is specifically about hardware verification. It's not about software verification. I think the argument there is completely different: for software, the verification and development are closely integrated. I think because hardware verification is so formally structured, there is actually a case to be made for keeping them separate and having the design team and the verification team distinct. So, it sort of makes sense. I'm really a software guy. I'm not an expert on hardware. But it does sort of make sense. The two teams work very closely together, but it allows one team to focus on the UVM-based test and verification flow and another to work on the actual implementation of the chip. Any more questions? Okay. Thank you all very much. That brings the RISC-V dev room to an end, and I hope you enjoyed it. Thank you. Thank you. |
Building an actor library for Quickwit's indexing pipeline. |
Let's dive right in, we don't have much time. My name is Paul Masurel. I go by the name of fulmicoton on Mastodon and on Twitter. I've been a Rust developer since 2016. I spent most of my career as a search developer, so it was only natural for me, as my first pet project, to develop a library to build search engines. So if you're familiar with Lucene, that's kind of like Lucene, but for the Rust world. That was my first pet project. It grew. And two years ago, I co-founded a startup called Quickwit, which is about building a search engine that is specialized for logs. Just a word about Quickwit. I'm not here to do an advertisement. It's not a commercial, but it's related to the talk, so I need to explain to you the problem that we are trying to solve. Quickwit is a Rust project. It's under the AGPL license. It's open source. You can find the source code on GitHub. And so we specialize in log search. What's specific about log search compared to, let's say, an e-commerce search engine is that the data that we get is more or less immutable. We assume that people will want to ingest documents into our search engine and won't modify them too much. After ingestion, a document stays there until it goes out of its retention period, at which point we will delete it. Or you might want to delete it if you have a request to comply with GDPR. We handle that kind of stuff, but you cannot modify it like you would for an e-commerce website. And one of the big differences in terms of efficiency is that when you deal with log search, the volume that you have to deal with has no limits. The largest amount that we've seen so far is people indexing 100 terabytes a day. So that's the volume of data. Imagine if it was actually generated not by machines but by humans. You would need a lot of people typing very fast to reach that kind of volume. So that's something that you will only get if you're doing log search.
And compared to an e-commerce website, most of the CPU is actually spent indexing and not searching, because you have comparatively far fewer searches and far more documents. So indexing is actually crucial to our problem, and that's very different from usual search engines. Indexing, what does it look like? That's the problem that we are trying to solve. Super oversimplified: we get, as an input, a stream of documents. To give you an idea, for one indexing pipeline we have to deal with around 40 megabytes per second. And as an output, every 30 seconds, we write a file. We put it somewhere, and usually we register it in some metadata backend. And at this point, the file is searchable. The rules of the game here: we want the highest possible throughput, and we want to keep what we call time to search as low as possible. Time to search is this: at the moment a JSON document enters the system, we start the clock, and we measure how long it takes for it to come out of the system in the form of one of those files, at which point it is searchable. We need that to be as low as possible, and it's very important to keep it very stable. We don't want to have a period of time where it goes through the roof. So that's the whole game. And in that black box, we do a lot of stuff. I kept it deliberately simple. I won't go through all of the stuff that we do, but the important part here is that every single one of those steps uses different resources. The time is spent on different stuff. For instance, when we index things, when we build our in-memory index, we are spending CPU. When we write the file, we use IO. When we upload, we use the network. And sometimes we are waiting for something that is outside of the system: we spend no resources at all, we just wait. So you might think that the implementation is obvious. We have one function for each of those steps, and we call them sequentially.
But if you do that, you are wasting the resources of your system, of course. For instance, when you are uploading, you are spending your network resource, but your CPU is not doing anything, so you're wasting money. The solution to this problem is relatively simple to state, but it's not that simple to implement. You want to streamline your pipeline. What I mean by streamlining, in a very concrete way, is this: you might have two steps, like indexing, spending CPU, and upload, spending network. They go sequentially, but what you want is for indexing to work on building the first file, and when it has finished, you start uploading, of course. But as you upload, you want to start working on the second file, so that you are spending CPU while you are doing your network stuff. That's the kind of behavior that we want. And of course, this example is a little bit too simple because the second step here is shorter than the first one. It's a nice case, but if it was the other way around, we would need some kind of mechanism to have back pressure. And in my experience, a lot of very good engineers are not familiar with the concept of back pressure, so let me explain what it is about. If you already know what it is, bear with me and enjoy the fine artwork that we have here. So the idea of back pressure is, imagine you are doing the dishes with a friend. One of you is washing the plates, and the other one is wiping them dry. And the person wiping the dishes dry is a little bit too slow compared to the person washing them. What's going to happen is that your plates will pile up forever. In computer systems, it's a very common problem, and that's how you get out-of-memory errors. So the solution is rather simple. You need back pressure, which means you need some way to signal to the person who is washing too fast that they should slow down.
And the simplest mechanism to do this is to have some kind of limit on your stack of plates, or your work queue, or whatever you are using in your system. And then they stop once they reach the limit. That's the simplest way you could have back pressure. So with all this said, the game here is: how would you implement that? What would be your go-to implementation if you had to build such a system in one hour? Being Rust developers, I think that most of the people in this room would come up with the following solution. The upload part is not CPU heavy, it's just dealing with the network. So it's very natural to think, OK, I'm going to do that in a Tokio task. And back pressure, that's easy. I already know what I'm going to use: a channel with some capacity. Once it reaches capacity, of course, people sending work to my task will have to wait. It's going to be very nice and natural. And then on the indexing part, we will have the same mechanism. Only this time, we actually have to do a lot of CPU-heavy work. So maybe we won't run that in a Tokio task and we will spawn our own thread, or maybe we will use a thread pool to do that work. That would be better, right? And we use the same mechanism. We will have a channel to receive the work. The capacity here is much larger because the type of stuff that we put in the channel is very different. For the uploader, we are getting files, which can be large, like 100 megabytes, so it makes sense to keep the capacity as small as possible. You probably won't have a capacity larger than three. For the indexer, it's documents, and for many documents you will emit one file. And yeah, everything is fine and dandy. It's quite natural. So we just reinvented actors. That's basically what actors are. Actors are a programming paradigm that was invented in the 70s by a researcher called Carl Hewitt. It has been popularized more recently by Erlang.
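The bounded-channel form of back pressure described in this passage can be sketched with the standard library alone. This is only an illustration under stated assumptions: the talk describes a Tokio task with a bounded channel, whereas this sketch swaps in a plain thread and `std::sync::mpsc::sync_channel` to stay self-contained, and `run_pipeline` and `queue_is_full_after` are hypothetical names, not Quickwit code.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;
use std::time::Duration;

/// Pushes `num_files` fake files through a bounded queue to a slow "uploader"
/// thread and returns how many files the uploader processed.
fn run_pipeline(num_files: usize, capacity: usize) -> usize {
    // The capacity is the limit on the stack of plates.
    let (tx, rx) = sync_channel::<Vec<u8>>(capacity);
    let uploader = thread::spawn(move || {
        let mut uploaded = 0;
        // The loop ends once every sender has been dropped.
        for _file in rx {
            thread::sleep(Duration::from_millis(5)); // pretend to upload
            uploaded += 1;
        }
        uploaded
    });
    for i in 0..num_files {
        // `send` blocks while the queue is full: this is the back pressure
        // that slows the producer down to the pace of the uploader.
        tx.send(vec![i as u8; 16]).expect("uploader hung up");
    }
    drop(tx); // signal that no more work is coming
    uploader.join().expect("uploader panicked")
}

/// Fills a queue without consuming from it and reports whether one more
/// non-blocking send is refused because the queue is full.
fn queue_is_full_after(capacity: usize) -> bool {
    let (tx, _rx) = sync_channel::<u32>(capacity);
    for i in 0..capacity {
        tx.try_send(i as u32).expect("queue should still have room");
    }
    matches!(tx.try_send(0), Err(TrySendError::Full(_)))
}
```

With a capacity of three, the producer can run at most three files ahead of the uploader, so memory use stays bounded however slow the upload is.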
And the actual formal definition is here. It's from Carl Hewitt himself. And I'm going to read it, even if it's a little bit weird to read slides out loud. This one is important. So an actor is a computational entity that, in response to a message it receives, can concurrently: A, send a finite number of messages to other actors. We've done that. B, create a finite number of new actors. We haven't been spawning any actors in our example yet, but we do that in Quickwit; it's something that we do especially for supervision, or spawning a pipeline, or stuff like that. So we do that. C, designate the behavior to be used for the next message it receives. That one is a little bit fuzzy. Do we do this? Strictly speaking, no, we don't. But if you water it down and you squint a little: the actors actually have a state, and the whole point of having an actor running on a specific thread is that it is possible to handle a message and mutate that state. And mutating our state is a bit like designating the behavior that will be used for the next message. So we ended up building our own actor framework. To be honest, I'm not trying to advertise this framework. It's under the AGPL license, so you're free to use it. If you want to fork it and make it better, I'd be happy for you to do so. If you want to use it as is and you would like to have it under an MIT license, I'm perfectly happy to put it under the MIT license as well, actually. But right now, it's not really designed to be reused by other people. I think I can tell you about our journey, though, and maybe it could inspire other people to write their own framework. There are actually a lot of actor frameworks in Rust. One is Actix, and there are many others. So with our actor framework, what does an actor look like compared to our original snippets? Yeah, it looks like this. So I implemented the uploader there.
So you first have to implement a trait called Actor, where you define a bunch of small properties about your actor, especially the capacity that we have seen before. It is the capacity of the channel that we described before. And then you have to implement a Handler trait for each type of message that you deal with. So, contrary to our example, now we can deal with several types of messages. The same actor can receive different kinds of requests, if you will. Another difference: most of the time you want to communicate with an actor in an asynchronous way. That's how the actor pattern usually works. But sometimes it's handy to actually get some kind of reply when you make a request. It's a bummer, because if you really want a reply, that means that you will be waiting for the actor to process its entire queue, execute your command, and then return the result. But you don't have to use it. Most of our messages don't use it, but when we need it, it's handy. And our indexer is about the same. The one thing that I want to point out is this thing on the Actor trait where we specialize what is returned by the method RuntimeHandle. You remember that in our example, we said that an indexer spends a lot of CPU, so we wanted to run it on a dedicated thread. We don't do that here. What we do is we target a specific Tokio runtime, which is weird. That's an implementation shortcut that we used instead of running stuff on a thread pool. We have several Tokio runtimes, and one of them is dedicated to acting as a thread pool. The benefit is that this implementation shortcut lets us write exactly the same code for an async actor or a sync actor, which is neat. And by the way, we're not the only ones to use that trick. InfluxDB actually wrote a blog post about this a little while ago. Now that we have a framework, you might have noticed that the code was not even shorter.
We have seen a couple of features that we got from the framework, but what more can we get? Within Quickwit, we have 25 different actors. What's cool when you have a framework is that you code stuff once and the benefit is multiplied by 25. What could be the benefit? Hopefully, we get better structure and code that is a little bit more readable. People open up files and they know what to expect. I want to know the capacity associated with the queue of this actor: where should I look? That's something that you get from having a framework. We will talk a little bit about that, but we get supervision from the framework. We get a neat solution to deal with time, which is probably the main reason why we don't use Actix today. And then we have a bunch of recipes to write unit tests, which is very important. We also have some stuff to be able to see what our actors are doing, and we have a solution for discoverability too, but we won't talk about those in this talk. So, supervision. Of course, I would like to tell you our code is perfect, and perfect code, especially when written in Rust, never fails. And in a sense, we don't experience panics or stuff like that in our own code, but we have to run third-party code and user-defined code: users can write a VRL script to transform their documents in the pipeline, and that runs in our indexing pipeline. We have to do IO. We have to interact with different systems. For instance, we get our documents from a source, we run the pipeline, and we send them to a storage. We have different storages that are implemented. That's a lot of components, any one of them can fail, and we run for a very long time. So, one solution to this, and it's not that it works all of the time, is to just try turning it off and on again. It feels a little bit stupid, but you just restart everything from a blank slate, and sometimes things work fine that way.
So, the supervision works as follows. We have an actor that is in charge of supervising all of the actors in our pipeline. What it does is poll the actors, and when it detects that one has failed, it kills everyone and restarts everyone. The definition of failure is a little bit sophisticated in our case: it could be an actor that has returned an error from a handler, or one that panicked, or, since we have a system to detect it, an actor that has not been making progress for three seconds. So, we have a notion of progress in our framework. That's an original way to do stuff. And we use one-for-all supervision, which means that if one actor fails, we restart everyone. Okay, that was supervision. Now, about handling time, which is probably the most interesting part of our framework. We need to be able to deal with the idea that, for instance, in our indexer, we want to emit a file after 30 seconds have passed. So, we have a condition like this: 30 seconds after the first document was added to that batch. And we cannot call time::sleep in the handler because it would block the entire processing of documents. The solution for this is rather simple. We have a method so that actors can ask the framework to send a message back to themselves after a given amount of time, 30 seconds here. It seems like a very simple solution, but it has a problem. Here I show how it works. This is what is happening under the hood: the actor sends a message to an actor that is run by the framework, called the scheduler actor. And 30 seconds later, the scheduler puts a message into the queue of the actor. The trouble is, imagine that you already had a lot of messages in the queue of the actor. Then where are your 30 seconds? Maybe we have one minute's worth of messages in that queue. And then our entire contract of, I want to emit a file every 30 seconds, is broken. We cannot do that.
So, the solution we went for is that mailboxes are actually a tiny bit more complicated. They have two queues. One is a low-priority queue, the usual one, and the other is a high-priority queue. The scheduler puts its message in the high-priority queue. So, as soon as the actor has finished dealing with the message that it was processing at the time, it will jump to this scheduled message, and we get our nice 30-second callback. Testability. Let me check the time to know if I'm good or not. 20 minutes, perfect. Testability: we have a bunch of solutions to write tests. Let's go through some actual Rust code to see how we can implement complicated real-life stuff and unit test it. The code that we will look at is a batch builder that mimics pretty much what we do in indexing. We have two possible conditions to emit a file. Either we have enough data to cut a file, so let's say if you have 100 messages, it's not 100 in reality, but if you have 100 messages, then you emit a file. Or 30 seconds have elapsed since the reception of the first document of the batch. So, let's start slow and easy. We have our actor here. This is the state of the actor, but it is the actor as well. Something that is obvious is that it needs some mailbox to push the batches that it produces to. It has a document batch, which is a Vec of String; a document is just a String. It appends documents to that. When the batch is big enough, it flushes it and sends it to the mailbox of the consumer. And one thing that is new here is that we added some counters, and in the unit test we will be able to do some asserts on this internal state. I didn't talk about it, but the Actor trait actually has an associated type which is called the observable state.
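The two-queue mailbox described at the start of this passage can be sketched as follows. This is a single-threaded illustration only, assuming hypothetical names (`Mailbox`, `send`, `send_high_priority`, `recv`, `Msg`); the real framework's mailbox is a concurrent channel, and its API is not shown in the talk.

```rust
use std::collections::VecDeque;

/// Hypothetical mailbox with two queues. Messages in the high-priority queue
/// are always delivered before anything waiting in the low-priority queue.
struct Mailbox<M> {
    high: VecDeque<M>,
    low: VecDeque<M>,
}

impl<M> Mailbox<M> {
    fn new() -> Self {
        Mailbox { high: VecDeque::new(), low: VecDeque::new() }
    }

    /// Regular messages, such as documents, go to the low-priority queue.
    fn send(&mut self, msg: M) {
        self.low.push_back(msg);
    }

    /// Scheduled callbacks go to the high-priority queue.
    fn send_high_priority(&mut self, msg: M) {
        self.high.push_back(msg);
    }

    /// Next message to process: high-priority first, then arrival order.
    fn recv(&mut self) -> Option<M> {
        self.high.pop_front().or_else(|| self.low.pop_front())
    }
}

/// Example message type for the sketch.
#[derive(Debug, PartialEq)]
enum Msg {
    Doc(&'static str),
    Timeout,
}
```

Even if a minute's worth of `Msg::Doc` messages is already queued, a `Msg::Timeout` sent through `send_high_priority` is the next message returned by `recv`, which is what keeps the 30-second contract.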
Of course, the whole idea of actors is to encapsulate your state in your thread or your Tokio task, and you're not supposed to be able to mutate it or even read it from the outside world. But we have something that makes it possible to ask the actor from outside, what is your observable state? It sends a message to the actor, and the actor sends back the result of the observable state method here, which is nifty for observability and for unit tests. And then there is our handler. We have two messages. One message is receiving a document. Here it's just a String, as I told you. I wanted to keep stuff simple. And we do several things. First, if this is the first document in the batch, we need to register our callback message using the schedule self message method. We append this document to our batch, and then we check whether we have enough documents in the batch to actually emit it, which is our count-based batch emission condition. And in that case, we call emit batch. I didn't put up the code of emit batch because it was too easy. Not very interesting. And I didn't put up the handler of the timeout message either, but you can guess that it basically emits the batch. And then when we want to unit test that stuff, we write things like this. We have a universe in our unit test. It's a very important thing. We want to isolate our unit tests from each other, and the universe is in charge of this. All of the actors of your program have to belong to the same universe. Otherwise, they're not supposed to communicate with each other. And we will see that this isolation makes it possible to do something really cool in the next slide. This universe makes it possible to create a fake mailbox for the consumer side of things. We can create our batch builder in isolation and send messages to it. That's what we do there.
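Since the slides are not part of the transcript, here is a framework-free sketch of the batch builder logic described above. It is an assumption-laden illustration, not the code from the talk: `BatchBuilder`, `Message`, `Effect`, and `handle` are hypothetical names, and instead of calling into a framework the handler returns the effects it would request (scheduling a timeout, emitting a batch).

```rust
/// Messages handled by the sketch.
enum Message {
    Document(String),
    Timeout,
}

/// What the handler would ask the surrounding framework to do.
#[derive(Debug, PartialEq)]
enum Effect {
    /// Ask to be sent a `Message::Timeout` later (30 seconds in the talk).
    ScheduleTimeout,
    /// Send this batch to the consumer's mailbox.
    EmitBatch(Vec<String>),
}

struct BatchBuilder {
    batch: Vec<String>,
    max_docs: usize,
    // Counters playing the role of the observable state.
    num_docs_received: usize,
    num_batches_emitted: usize,
}

impl BatchBuilder {
    fn new(max_docs: usize) -> Self {
        BatchBuilder { batch: Vec::new(), max_docs, num_docs_received: 0, num_batches_emitted: 0 }
    }

    fn handle(&mut self, msg: Message) -> Vec<Effect> {
        let mut effects = Vec::new();
        match msg {
            Message::Document(doc) => {
                // First document of a batch: register the timeout callback.
                if self.batch.is_empty() {
                    effects.push(Effect::ScheduleTimeout);
                }
                self.batch.push(doc);
                self.num_docs_received += 1;
                // Count-based emission condition.
                if self.batch.len() >= self.max_docs {
                    effects.push(self.emit_batch());
                }
            }
            // Time-based emission condition: flush whatever is pending.
            Message::Timeout => {
                if !self.batch.is_empty() {
                    effects.push(self.emit_batch());
                }
            }
        }
        effects
    }

    fn emit_batch(&mut self) -> Effect {
        self.num_batches_emitted += 1;
        Effect::EmitBatch(std::mem::take(&mut self.batch))
    }
}
```

One simplification to be aware of: a timeout scheduled for an earlier batch that was already emitted by count can fire during a later batch and flush it early. A fuller version would tag each timeout with a batch identifier and ignore stale ones; whether the real implementation does this is not stated in the talk.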
So, yeah, I usually like to jump and point at the screen, but I've been told that I cannot cross the white line. So, yeah, what we do when we want to write an assert is call this function called process pending and observe, which just means that we wait for all of the messages that are currently in the queue of the actor to be processed, and then we call observe and we get a snapshot of the observable state of the actor. Here, the observable state is a counter, so we check that it's equal to the number of documents that we sent. And then we check that the consumer mailbox does contain our two batches, because 250 documents make two full batches of 100, with 50 left over. And we also want to be able to check that the timeout is working well. Here, what is interesting is that we created our universe with the method with accelerated time, so we are mocking time at this point. You won't have to wait 30 seconds for your unit test to run. It will do magic, and the result will be exactly the same as if you were not accelerating time, but it will just be faster. And I will explain a little bit how it works. In the unit test, obviously, we have to call something to wait, and we call universe.sleep to do that. It's important to use universe.sleep and not time::sleep because we are mocking stuff, obviously. We cannot use the mocking facilities that we have in Tokio because we use several runtimes. What we do is similar in semantics to pause and auto-advance, if you're familiar with it, except we never freeze time. We keep time flowing, but what we do is a tiny bit different. You can imagine that if you were not accelerating time, your actor execution would look like this.
So, actors are processing stuff, and sometimes you don't have any message in any queue, all actors are idle, and the only thing that will resume the processing is some timeout happening and the scheduler saying, okay, I have a message for you; you asked for a self-scheduled message, and it's happening now. So, our framework detects that we are in a phase where no one is working and everyone is waiting for the scheduler, and in that case, and only in that case, we accelerate time. That's why we get a result that is exactly the same as if we didn't accelerate time, just faster. So, we compress our execution: before, after. That's how it works. I wanted to show you the actual indexing pipeline in its full complexity. I said 25 actors; it's not 25 actors here, but we have other actors in other parts of the code because the pattern got quite popular. It's quite complex, but it makes me feel extremely good to be able to show that when we have to explain to a new developer what the indexing is doing. We can point at things. Every single one of these boxes is doing one very simple thing. It has its own file, it has its own unit test. It makes me happy to have this very simple figure that we can discuss around. One thing that I need to talk to you about is a problem with actors: if you have cycles in your actor network, you might experience deadlocks. And that kind of deadlock is a pretty terrible thing because it can happen at any time in production; it can work for one week and then you experience a deadlock, and it's a scary thing. So there's a sufficient condition to not have deadlocks: you don't have any cycles, right? And usually that's the case when you are writing a pipeline. In the graph before, there was actually a cycle. We will have a look at it in a second. There is another, nicer condition to not have deadlocks.
It's this: if the graph of your actors, once you have removed all of the queues with infinite capacity, is a DAG, then you won't have deadlocks. And that's what we are doing here. So the loop, the cycle that we had, was due to the fact that we have an auxiliary pipeline that is merging the files together, and there is an arrow over there. Sorry, I'm going to cross the line. I did it, I apologize. If you remove this arrow, then it's a DAG, and that's a sufficient condition to never experience deadlocks. It helps me sleep at night. And yeah, we have a bunch of other features. The most important one I want to tell you about is that we measure back pressure. The framework automatically measures back pressure and exposes it as a Prometheus counter. That's really neat. Very useful for us. And let's use the rest of the time for questions. So... APPLAUSE So are there any questions in the room? Yes. On the last slide, or the previous slide, it said you didn't need parallel actors. Didn't you need something like fan-in and fan-out for doing any parallel work? Oh, yes. So there is something that I didn't read out, but Sylvain was very fast and noticed that we don't have anything to have several actors work on the same queue, or work concurrently to process stuff faster. So yeah, we haven't needed that strongly enough to actually implement it. I wrote an implementation and never merged it because we didn't really need it. So for indexing, we just spawn several pipelines on the same machine, not too many. So that part is not parallelized. But yeah, we just never need... We haven't needed it yet. I can't really tell why. Yes, exactly. So the parallel behavior. Sorry, you want me to repeat. So Sylvain was saying, we use more than one core just because within the pipeline we do the streamlining thing.
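Returning briefly to the deadlock condition stated earlier (drop the queues with infinite capacity, then check that what remains is a DAG), that check can be sketched in a few lines of standard-library Rust. The `Edge` type, the `pipeline` helper, and the function name are hypothetical; the acyclicity test is a plain Kahn-style topological sort, not code from the framework.

```rust
/// A queue between two actors; `capacity: None` stands for an unbounded queue.
struct Edge {
    from: usize,
    to: usize,
    capacity: Option<usize>,
}

/// Returns true if the actor graph, restricted to bounded queues, has no cycle.
fn bounded_graph_is_dag(num_actors: usize, edges: &[Edge]) -> bool {
    let mut indegree = vec![0usize; num_actors];
    let mut successors = vec![Vec::new(); num_actors];
    // Only bounded queues can block a sender, so unbounded ones are dropped.
    for edge in edges.iter().filter(|e| e.capacity.is_some()) {
        successors[edge.from].push(edge.to);
        indegree[edge.to] += 1;
    }
    // Kahn's algorithm: repeatedly remove actors with no incoming bounded queue.
    let mut ready: Vec<usize> = (0..num_actors).filter(|&n| indegree[n] == 0).collect();
    let mut removed = 0;
    while let Some(node) = ready.pop() {
        removed += 1;
        for &next in &successors[node] {
            indegree[next] -= 1;
            if indegree[next] == 0 {
                ready.push(next);
            }
        }
    }
    // If some actors could never be removed, they sit on a cycle.
    removed == num_actors
}

/// A three-actor pipeline 0 -> 1 -> 2 with a feedback arrow 2 -> 0.
fn pipeline(feedback_capacity: Option<usize>) -> Vec<Edge> {
    vec![
        Edge { from: 0, to: 1, capacity: Some(1) },
        Edge { from: 1, to: 2, capacity: Some(3) },
        Edge { from: 2, to: 0, capacity: feedback_capacity },
    ]
}

fn main() {
    println!("bounded feedback: DAG = {}", bounded_graph_is_dag(3, &pipeline(Some(1))));
    println!("unbounded feedback: DAG = {}", bounded_graph_is_dag(3, &pipeline(None)));
}
```

A check like this could run in a unit test over the pipeline topology, so that adding a new bounded arrow that closes a cycle is caught before production.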
So we may have different actors that are consuming CPU at the same time, but we don't have one actor going, oh, there are actually five instances of the actor and we are doing the work five times faster. So we didn't need that. Any more questions? Yes? Do you have a fairness system so that one actor doesn't hog the processing and starve the other actors in the process? So no, we don't have that. Actually, we have the opposite problem. We don't want fairness. If you look at our pipeline, the stuff that is taking a lot of CPU would be the indexer. It would also take a lot of IO, and we want to give it priority because it's the thing that we want to use as much CPU as it can. So we want to give it one core, and we want it to use as much IO as it needs. And we would like to give it priority. The way we do that is that we run it on a specific runtime, and over there it has its dedicated core. For IO, the framework is actually not really helping. What we do is that we have some IO throttling that makes it so that the other actors are not able to write to disk faster than a given rate; you can configure that, and there's some back-of-the-envelope computation to work out what you should set. But yeah, other actors will not be able to write to disk faster than, let's say, 80 megabytes per second. And the merge that you have below, it's okay if it lags a little bit. That's the part that we want to be low priority, and the part on the top we want to be high priority. So we don't have any fairness and we don't want any fairness. Yes? So I guess you're supervising that, because otherwise the timeouts may also be delayed. So the supervisor is very fast; it doesn't do much. It's running on a Tokio runtime that has a dedicated core and runs one bazillion actors, but they're all very light. So it doesn't matter at all. Okay, yeah. Yeah.
Because, I mean, if your actors are very heavy, then of course the supervisor will catch you. Yeah. At some point, because otherwise your timeouts will still be delayed. Yeah, absolutely. You're absolutely right, but the thing is it's running on its specific runtime and it's not CPU-heavy, so there's plenty of core to work with. Yes? When you accelerate time in the testing universe, do you have to specify the steps in time you take? You mean the number of steps? I assume that when you accelerate time, basically when nothing is happening, you take a time step and then see if something would have happened at that time point. Does that mean that you have to specify, say, that we take steps of 100 milliseconds and then test every time whether something would be happening now? No. We don't need to say how many steps we take. We don't need to say what the resolution of the steps is. The only thing that we do is that the scheduler, when it detects that it needs to accelerate time, has some kind of heap that says, OK, the next event is actually in 70 milliseconds. So let's jump 70 milliseconds into the future, and it triggers that event. And then the execution of the actor that was supposed to receive this message will go on, and if no actor is working anymore, then we accelerate again. So there are no steps, no resolution, nothing. Yes? How about reliability? If you want to be sure that a log line will make it to the index, do you count them, or how do you know they made it through? So, yeah, it should be the subject of another talk, because that's a super interesting question. So for the pipeline, you want to have an idea of what kind of delivery semantics you want to have.
We actually offer exactly-once semantics, and the way we deal with that is, so we didn't talk about that, we have the documents that are coming from a source actor, and when we spawn the source actor, we tell it, OK, you need to stream messages from these specific checkpoints. And when we publish stuff downstream, we publish by running a transaction on our metadata store backend, and that transaction updates the checkpoints of the stuff that we have published and publishes the split as well. So when we restart everything, we can check in the metadata store what the last checkpoint is, and we start from there. And if there is an error, the metadata store will just yell at us and return an error. It will say, OK, no, something weird has been happening; maybe you had two pipelines working at the same time and your checkpoints are overlapping, and you have a problem. That's the way we deal with this problem. Yes? The universe, is that a special crate, or is it in the standard library? No, the universe thing is something within our framework, and that's what we use to isolate different programs, or different unit tests, or different systems. Yeah, it's within our actor framework. It is. Do we still have time? I think we have one more question or one more minute. I think there was a... Yes. As I understand this graph, the numbers on the arrows indicate the capacity of the queue, right? The capacity of the channel between the actors, right? The numbers, yes. Yes. So we have a lot of tuning points in this system, right? Yes. In relation to the back pressure. So the question is, from your experience, how sensitive is the performance of the system to the tuning of back pressure on the channels? And maybe you have some kind of advice or a rule of thumb on what to choose for the best performance. Yes.
So the question was: on this slide, all of the little numbers that we have on the arrows are the capacities of the different queues between actors, and that's a lot of parameters to tune. They probably have an impact on performance. Is there a cool recipe to... So the first question was, how much do they impact performance? And the second one is, do we have a nice recipe to tune them, maybe automatically? I'll go fast because there is no more time. They don't impact performance all that much as long as you get them roughly right. You usually need to identify the queues that should be at one, and then you put them at one, and then decide where you want a little bit of capacity. It should be quite obvious if you know your system. And I'm sure that there is a nice recipe to auto-detect that. I haven't found it. So if you have ideas, I'd love to... Usually that kind of question comes from someone who is thinking about something. So please come to me after the talk. I'd love to hear your thoughts. Thank you, everyone. Time is up. Thank you very much. |
Building a distributed search engine with tantivy
How lnx is solving the challenges of building a distributed search engine in Rust |
I'll just yell. But yeah, so this is effectively my talk. It started out as a really big thing, and then I realised 40 minutes wasn't actually that much time, so we had to compress it down into a slightly smaller talk, but hopefully one covering the most interesting points, in my opinion. So a bit about me. I'm Harrison. I come from London. I live in London and I work for Quickwit where, as Paul has said, we build basically a distributed search engine for logs. I am the creator of lnx, which is a slightly different design of search engine, probably more akin to something like Elasticsearch or Algolia, for all your lovely e-commerce websites. And you can contact me at harrison at quickwit.io. A little bit about lnx, since this is basically the origin story of this talk. It's a search engine built on top of Tantivy. It's akin to Elasticsearch or Algolia, as I've said. It's aimed at user-facing search. That's things like your e-commerce websites, your Netflix streaming platforms, things like that. It's not aimed to be your cost-effective log search engine. It doesn't really handle those hundreds-of-terabytes-a-day type workloads, but it will handle thousands of queries a second per core. It's very easily configurable. It's designed to be really fast out of the box because it uses Tantivy, and it has an indexing throughput of about 30 to 60 megabytes a second on reasonable hardware, with high availability coming soon, which is the premise of this talk. So what is user-facing search? I've stolen Crunchyroll's website and I've typed some bad spelling in there, and you see that a lot of the top results actually account for the fact that I can't spell. That's basically the biggest principle with these user-facing search engines: you have this concept of typo tolerance. This is a really good thing for users because users can't spell.
The downside of this is that it costs a lot of CPU time when we're checking those additional words, and it makes things a lot more complicated; often documents are mutable, and a lot of other things. But also, when you have these nice search experiences and you want low latency, something called search-as-you-type has become more popular, and that means the number of searches you're doing for a single user increases several times over, because now every keystroke is a search, versus typing it all in one go and hitting enter, where the user gets a bunch of results back and goes, oh no, I've spelt something wrong, or I can't see what I want on here, so I'm going to type it again. And so that is effectively the principle of these search engines. You see we have Algolia at the bottom, which is a very common one that I think most people know, very popular for document searching. But, you know, we decided, hey, we don't want to use one of these pre-built systems. We don't want to use Elasticsearch: that's big, that's scary, I don't like it. We don't want to use Algolia because I don't have that much money; I'm just a lowly paid software developer, I can't be spending thousands of pounds on that. And we look at some of the others, but we go, nah, we're just going to write it ourselves. And that's where we have a little look around, because we hear something about Tantivy, we hear something about Rust, that being blazingly fast as all things must be, and so we go, okay, I like this, I like what it says. It says Apache Lucene, I think I've heard that before somewhere; written in Rust, I think I've definitely heard that before. And so we take a little look at what it is, and it is effectively akin to Lucene, which, if you don't know what that is, is a full-text search engine, as it's called.
Tantivy in particular supports things like BM25 scoring, which is just a fancy way of saying which words are relevant to this query. It supports something called incremental indexing, which basically just means you don't have to re-index all of your documents every time you change one thing. You have faceted search, you have range queries, and we have things like JSON fields, which allow for schemaless indexing as such. You can do aggregations, which have some limitations, in particular around JSON fields being a little bit limited. But the biggest thing is it has a cheesy logo with a horse, which I believe Paul drew himself, so I think that needs a clap on its own. But there are more features which I couldn't fit on this slide, and time is of the essence. So you might be wondering what a basic implementation with Tantivy looks like, and because it's a library, it's actually really quite simple to do. We have a couple of core things. Starting at the top, we define what's called a schema. Since Tantivy was originally a schema-based system, and still is, we need some way of telling Tantivy what the structure of our documents is and defining what properties they have. We can use something like a JSON field to give the impression of a schemaless index, but, you know, schemas are good, we should use them. They come with lots of nice bells and whistles. So in this case we've created a schema with the title field, and you can see there we've added the text and stored flags, which all that really says is: I'm going to tokenize this field, and then I'm going to store it so we can retrieve it later on once we've done the search. The second thing we do once we've done that is we create our index writer, and in this case we're just letting Tantivy select the number of threads. So by default, sorry, when you create this index writer and give it a memory buffer, in this case about 50 megabytes,
Tantivy will allocate n threads, I think up to eight depending on what your system has, so you don't really have to put much thought into the multi-threaded indexing. And then we're just adding a document, really. So we've created our document, we've added the text field, we've given it in this case The Old Man and the Sea, and we're going to pass it to our indexer, which is essentially just adding it to a queue for the threads to pull off, process, and spit out onto disk. And then if we want to actually have that be visible to our users for searching and things like that, we need to commit the index. In Tantivy you can either commit or you can roll back, and if you have a power failure midway through indexing, when you reload from disk it will be at the point of that last commit, which is very, very useful, so you aren't left with partial state and all those nasty things. And then once we've done that, we can actually search. In this case you can either build queries using traits, which are very nice, and you can mash them all together with lots of boxing and things, or you can use the query parser, which basically parses a nice little query language. In this case we've got a very simple phrase query, as it's called; the parser takes that and spits out a query for us. We then pass that into our search executor, which in this case is executing the query, and we're also passing what are called collectors, and they are effectively just a simple thing to process the documents which are matched. So in this case I believe we've got the count collector and the top docs collector. The count collector does, well, it counts, big surprise there, and we have the top docs collector, which collects the top k documents up to a given limit. In this case we've selected 10. We only have one document to match, so this doesn't matter that much, but if you have more you can limit your results, you can adjust how things are scored, etc.
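To make the add, commit, roll back, and search flow above concrete without depending on any crate, here is a toy model written with only the Rust standard library. It is emphatically not the Tantivy API: `ToyIndex`, its methods, and the naive scan-based matching are assumptions made for illustration, and a real engine uses an inverted index and scoring instead.

```rust
/// A toy stand-in for an index with commit semantics (not the Tantivy API).
#[derive(Default)]
struct ToyIndex {
    committed: Vec<String>, // documents visible to searches
    pending: Vec<String>,   // documents added since the last commit
}

/// Lowercases and splits on whitespace; a real tokenizer does much more.
fn tokenize(text: &str) -> Vec<String> {
    text.split_whitespace().map(|word| word.to_lowercase()).collect()
}

impl ToyIndex {
    /// Queues a document; it is not searchable until `commit` is called.
    fn add_document(&mut self, title: &str) {
        self.pending.push(title.to_string());
    }

    /// Makes every pending document visible to searches.
    fn commit(&mut self) {
        self.committed.append(&mut self.pending);
    }

    /// Discards everything added since the last commit.
    fn rollback(&mut self) {
        self.pending.clear();
    }

    /// Returns the total match count and the first `limit` matching titles,
    /// loosely mirroring a count collector and a top-k collector.
    fn search(&self, query: &str, limit: usize) -> (usize, Vec<&str>) {
        let terms = tokenize(query);
        let hits: Vec<&str> = self
            .committed
            .iter()
            .filter(|doc| {
                let tokens = tokenize(doc);
                terms.iter().all(|term| tokens.contains(term))
            })
            .map(|doc| doc.as_str())
            .collect();
        let count = hits.len();
        (count, hits.into_iter().take(limit).collect())
    }
}

fn main() {
    let mut index = ToyIndex::default();
    index.add_document("The Old Man and the Sea");
    println!("before commit: {:?}", index.search("sea", 10));
    index.commit();
    println!("after commit:  {:?}", index.search("sea", 10));
}
```

The point of the sketch is only the visibility rule: searches see the state as of the last commit, and a rollback (or a crash before commit) leaves that state untouched.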
Now that's all well and good in this example, but this doesn't actually account for spelling, and as we discussed earlier, users aren't very good at spelling, or at least I'm not. So maybe we want a bit of typo tolerance, and in this case Tantivy does provide us with a way of doing this in the form of the fuzzy term query. It uses something called Levenshtein distance. It's a very common way of effectively working out how much modification you need to do to a word in order to actually get it to match, and we call that the edit distance. Typically you're between one and two edits, so you're swapping a letter around, you're removing one, you're adding a new one; a bit of magic there, really. And as you can see at the bottom, if we use just the regular full-text search, well, if we enter the term hello we'll only match the word hello, and if we go with the term hell we'll only match the word hell. If we use a fuzzy term query here, we can actually match hell and hello, which is very useful, especially for prefix search. This is built upon Tantivy's inverted index, which uses something called an FST, which is effectively a fancy way of saying we threw state machines at it and then made them return results. That's as much as I can describe how they work. The person who originally wrote the FST library in Rust, BurntSushi, has a blog post on this that goes into a lot of depth, really useful for that sort of thing, but I can't elaborate any more on that. But all of this additional walking through our index and matching these additional words does come at the cost of some additional CPU. And once we've got that, what we're left with is this nice block of data on our disks, really. We have some metadata files here, in particular meta.json, which contains your schema along with a couple of other things, and we have our core files, which, if they look very similar to Lucene's, that's because they are. In
particular, we have our field norms, our terms, our store, which is effectively a row-level store log file, our positions, our IDs, and our fast fields. And fast fields are effectively fast, a somewhat simple and equally vague name. But now that we've got all this stuff on disk, if we wrap it up in an API, we've mostly got everything. In this case we've got a demo of lnx working here, and we've got about, I think, 27 million documents, and we're searching it with about millisecond latency. I think in total it's about 20 gigabytes on disk, compressed, which is pretty nice. But there's a bit of an issue here, which is: if we deploy this to production and our site is very nice, we get lots of traffic, things increase, and we go, hmm, search traffic has increased, our server is not coping, let's just scale up the server. And we can repeat this for quite a while; in fact things like AWS allow you a stupid number of cores and things like that, which you can scale up very easily. But you keep going along with this, and eventually something happens, and in this case your data center has burnt down. If anyone remembers this, it happened in 2021: OVH basically caught fire, and that was the end of sleep for, I think, a lot of people. And so yeah, your data center's on fire, search isn't able to do anything, you're losing money, no one's buying anything, management's breathing down your neck for a fix, you're having to load from a backup. What are you going to do? And, well, you think, ah, I should have made some replicas, I should have done something called high availability. What this means is that instead of having one node on one server ready to burn down, we have three nodes available to burn down at any point in time. And in this case we hope that we put them in different what are called availability zones, which means, hey, if one data center burns down, there's a very small likelihood, or at least it's less likely, for another data center to burn down in the
meantime, and this allows us to effectively operate even though one server is currently on fire, or lost to the ether, or, I don't know, the network has torn itself to pieces. This does also mean we can upgrade: if we want to tear a server down and restart it with some newer hardware, we can do that without interrupting our existing system. But this is a hard thing to do, because now we've got to work out a way of getting the same documents across all of our nodes. In this case it's a shared-nothing architecture; this is done by Elasticsearch and basically most systems. So we're just replicating the documents; we're not replicating all of that processed data we've just produced. We need to apply them to each node, and doing this approach makes it a bit simpler. In reality, lnx and Quickwit do something a little bit different, but this is easier. I say this is easier because the initial solution would be, you know, just spin up more nodes, add some RPC in there, what can go wrong? And then deep down you work out, it's like, oh, what do you mean networks aren't reliable? What's a Raft? And things like that. And so at that point you go, okay, this is harder than I thought, and you realize the world is in fact a scary place outside your happy little data center, and you need some way of organizing state independent of things catching on fire. And this is a hard problem to solve. So you have a little look around and you go, well, Rust is quite a young ecosystem, we're quite limited, so we can't necessarily pick a Paxos implementation off the shelf. We maybe have something called Raft. That's a leader-based approach, which means we elect a leader and we say, okay, leader, tell us what to do, and it will say, okay, you handle these documents, go do things with them. It's a very well-known algorithm, very easy to understand, and it's probably the only algorithm which is really implemented widely in Rust. So there's two
implementations, one of them by PingCAP, called raft-rs, and the other by Datafuse Labs, called openraft, with varying levels of completion or being pre-made. So in this case you think, okay, I don't really know what I'm doing here, so maybe I shouldn't be managing my own Raft cluster. And you hear something about eventual consistency, and you hear, oh, it's leaderless: any node can handle the writes and then ship them off to the other nodes, as long as the operations are idempotent. That's a very key point, which means you can basically ship the same document over and over again and they're not going to duplicate themselves, or at least they don't act like they're duplicated. And this gives us, realistically, a bit more freedom: if we want to change, we can change. And so we decide, let's go with eventual consistency, because, yeah, I like an easy life and it seemed easier. Yes, people laughing will agree that things that seem easier probably aren't. And so our diagram looks something like this, and I'm scared to cross the white line, so I'll try and point. We have step one: a client sends the documents to any node; it doesn't really care which one. That node then goes, okay, I'm going to send it to some of my peers and then wait for them to tell me that they've got the document and it's safe. And then once we've got the majority, which is a very common approach in these systems, we can tell the client, okay, your document is safe; even if OVH burns down again, we're probably going to be okay. It doesn't need to wait for all of the nodes to respond, because otherwise you're not really highly available: if one node goes down, you can't progress. And so this system is pretty good. There's just one small problem, which is: how in God's name do you do this? Many questions need to be answered. How do you test this? Who's going to have the time to do this? And, well, luckily someone, aka me, spent the better part of six months of their free time dealing with this.
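The write path in that diagram (any node accepts the document, forwards it to its peers, and acknowledges the client once a majority holds it) can be sketched as follows. This is a single-process simulation using only the standard library; `Peer`, `write_with_quorum`, and the `reachable` flag are hypothetical stand-ins for real RPC calls, and a real node would contact its peers concurrently and return as soon as the majority is reached.

```rust
/// A simulated peer node; `reachable: false` models a node that is down
/// or cut off by the network.
struct Peer {
    reachable: bool,
    docs: Vec<String>,
}

impl Peer {
    /// Stands in for a replication RPC; returns true if the peer acknowledged.
    fn replicate(&mut self, doc: &str) -> bool {
        if self.reachable {
            self.docs.push(doc.to_string());
        }
        self.reachable
    }
}

/// Stores the document locally, forwards it to the peers, and reports
/// `Ok(acks)` if a majority of the cluster holds it, `Err(acks)` otherwise.
fn write_with_quorum(
    local: &mut Vec<String>,
    peers: &mut [Peer],
    doc: &str,
) -> Result<usize, usize> {
    let cluster_size = peers.len() + 1; // peers plus the receiving node
    let majority = cluster_size / 2 + 1;

    local.push(doc.to_string());
    let mut acks = 1; // the receiving node counts as one acknowledgement
    for peer in peers.iter_mut() {
        if peer.replicate(doc) {
            acks += 1;
        }
    }
    if acks >= majority { Ok(acks) } else { Err(acks) }
}

fn main() {
    let mut local = Vec::new();
    let mut peers = vec![
        Peer { reachable: true, docs: Vec::new() },
        Peer { reachable: false, docs: Vec::new() }, // this one is on fire
    ];
    match write_with_quorum(&mut local, &mut peers, "doc-1") {
        Ok(acks) => println!("document is safe on {} of 3 nodes", acks),
        Err(acks) => println!("only {} of 3 nodes have it, not safe yet", acks),
    }
}
```

Note that in the failure case the receiving node has still stored the document locally; in this sketch that is harmless only under the idempotency assumption described above, since a retry simply ships the same document again.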
And so I made a library, and in this case it's called Datacake. Whoo, yes. I originally was going to call it data lake, but unfortunately that already exists, so we added cake at the end and called it a day. It is effectively tooling to create your own distributed systems. It doesn't have to be eventually consistent, but it's designed to make your life a lot easier, and it only took about six rewrites to get it to the stage that it is, because, yeah, things are hard, and trying to work out what you want to do with something like that is awkward. Some of the features it includes: it includes a zero-copy RPC framework, and this is built upon the popular rkyv framework, which is really useful if you're shipping a lot of data, because you don't actually have to deserialize and allocate everything all over again; you can just treat an initial buffer as if it's the data, which, if that sounds wildly unsafe, it is, but there are a lot of tests and I didn't write it, so you're safe. We also add membership and failure detection, and this is done using chitchat, which is a library we made at Quickwit. It uses the same algorithm as something like Cassandra or DynamoDB, and this allows the system to essentially work out which nodes are actually its friends and what it can do. And in this case we've also implemented an eventually consistent store in the form of a key-value system, which only requires one trait to implement. The reason why I went with this is because if you have to implement anything more than one trait, people seem to turn off, and frankly I did when I looked at the Raft implementations. So we went with one storage trait; that's all you need to get this to work.
We also have some pre-built implementations. I particularly like abusing SQLite, so there is an SQLite implementation and a memory version. It also gives you some CRDTs, which are conflict-free replicated data types, I should say, and also something called a hybrid logical clock, which is a clock which you can have across your cluster, where the nodes will stabilize themselves, and which saves you from effectively having to deal with this concept of causality. And causality is definitely the biggest issue you will ever run into with distributed systems, because time is suddenly not reliable. And so we go back to our original thing of, well, first we actually need a cluster, and in this case it's really simple to do. All we need to do is create our node builder; we tell Datacake, okay, your address is this, your peers are these, or you can start with one peer and they'll discover for themselves who their neighbors are, and you give the node an ID. They're integers, they're not strings, and the reason for that is because there's a lot of bit packing of certain data types going on, and strings do not do well. Here we can also effectively wait for nodes to come onto the system, so our cluster is stable and ready to go before we actually do anything else. And by the time we get to this point, our RPC systems are working, nodes are communicating, your clocks have synchronized themselves, mostly, and you can actually start adding something called extensions. Now, extensions essentially allow you to extend your existing cluster. You can do this at runtime: they can be added and they can be unloaded, all at runtime, with state cleanup and everything else, which makes life a lot easier, especially for testing.
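For readers who have not met a hybrid logical clock before, the idea mentioned above can be sketched in a few lines of standard-library Rust. This follows the commonly described HLC update rules and is not Datacake's implementation; the `Timestamp` and `HybridClock` types are hypothetical, and the physical clock reading is passed in as a parameter so the behaviour is deterministic.

```rust
/// A hybrid timestamp: wall-clock milliseconds plus a logical counter.
/// Deriving `Ord` compares `wall_ms` first, then `counter`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Timestamp {
    wall_ms: u64,
    counter: u32,
}

/// Per-node clock state (a sketch, not Datacake's implementation).
struct HybridClock {
    last: Timestamp,
}

impl HybridClock {
    fn new() -> Self {
        HybridClock { last: Timestamp { wall_ms: 0, counter: 0 } }
    }

    /// Timestamp for a local event or an outgoing message.
    /// `physical_ms` is this node's own (possibly skewed) wall clock.
    fn tick(&mut self, physical_ms: u64) -> Timestamp {
        if physical_ms > self.last.wall_ms {
            self.last = Timestamp { wall_ms: physical_ms, counter: 0 };
        } else {
            // Wall clock did not move forward: order events with the counter.
            self.last.counter += 1;
        }
        self.last
    }

    /// Merges a timestamp received from another node, so that anything
    /// this node does afterwards is ordered after the received message.
    fn observe(&mut self, remote: Timestamp, physical_ms: u64) -> Timestamp {
        let wall_ms = physical_ms.max(self.last.wall_ms).max(remote.wall_ms);
        let counter = if wall_ms == self.last.wall_ms && wall_ms == remote.wall_ms {
            self.last.counter.max(remote.counter) + 1
        } else if wall_ms == self.last.wall_ms {
            self.last.counter + 1
        } else if wall_ms == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        self.last = Timestamp { wall_ms, counter };
        self.last
    }
}

fn main() {
    let mut node_a = HybridClock::new();
    let mut node_b = HybridClock::new();
    // Node B's wall clock lags 10 ms behind node A's.
    let sent = node_a.tick(100);
    let received = node_b.observe(sent, 90);
    println!("sent {:?}, received {:?}", sent, received);
    println!("receive is ordered after send: {}", received > sent);
}
```

Even though node B's physical clock is behind, the timestamp it assigns to the receive event still sorts after the send, which is the causality property the clock exists to preserve.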
They have access to the running node on the local system, which allows you to access things like the cluster clock and the RPC network, as it's called, which is the pre-established RPC connections, and you can essentially make this as simple or as complex as you like, which is essentially what I've done here. I've created this nice little extension which does absolutely nothing other than print what the current time is, which realistically I could do without, but nonetheless I went with it. And this is what the eventual consistency store actually is under the hood: it's just an extension. Here we can see that we're passing in, I can't point that far, but we pass in a mem store, which implements our storage trait; we create our eventual consistency extension using this, and we pass it to the Datacake node and say, okay, go add this extension, give me the result back when you're ready. In this case our eventual consistency cluster actually returns us a storage handle, which allows us to do basically all of our lovely key-value operations should we wish, including delete, put, and get. That's about all there is on the key-value store, but there are also some bulk operations which allow for much more efficient replication of data.
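To show why put and delete can be replicated safely when operations are idempotent, here is a small last-write-wins sketch in standard-library Rust. It is an assumption-laden illustration, not the Datacake storage handle: `Replica`, `Entry`, and `apply` are hypothetical names, timestamps are plain `u64` values assumed to be unique (for example produced by a cluster-wide clock), and deletes are kept as tombstones.

```rust
use std::collections::HashMap;

/// A stored value with the timestamp of the write that produced it.
/// `value: None` is a tombstone left behind by a delete.
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    ts: u64,
    value: Option<String>,
}

/// One node's copy of the key-value data (hypothetical, in-memory only).
#[derive(Default)]
struct Replica {
    entries: HashMap<String, Entry>,
}

impl Replica {
    /// Applies a put (`Some`) or a delete (`None`). Only a strictly newer
    /// timestamp wins, so replays and out-of-order delivery are no-ops.
    fn apply(&mut self, key: &str, ts: u64, value: Option<&str>) {
        let is_newer = self.entries.get(key).map_or(true, |entry| ts > entry.ts);
        if is_newer {
            let entry = Entry { ts, value: value.map(String::from) };
            self.entries.insert(key.to_string(), entry);
        }
    }

    /// Returns the live value, hiding tombstones.
    fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).and_then(|entry| entry.value.as_deref())
    }
}

fn main() {
    let ops = [
        ("title", 1, Some("draft")),
        ("title", 2, Some("final")),
        ("temp", 3, Some("scratch")),
        ("temp", 4, None), // delete
    ];

    let mut node_a = Replica::default();
    let mut node_b = Replica::default();
    // Node A sees the operations in order, exactly once.
    for (key, ts, value) in ops.iter() {
        node_a.apply(key, *ts, *value);
    }
    // Node B sees them in reverse order, each one delivered twice.
    for (key, ts, value) in ops.iter().rev().chain(ops.iter().rev()) {
        node_b.apply(key, *ts, *value);
    }
    println!("replicas converged: {}", node_a.entries == node_b.entries);
    println!("title = {:?}, temp = {:?}", node_a.get("title"), node_a.get("temp"));
}
```

Keeping the tombstone instead of removing the key is what stops a late, older put from resurrecting a deleted document.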
The only problem with this approach is it's not suitable for billion-scale databases, so if you're trying to make the next Cassandra or Scylla, don't use this particular extension, because it keeps the keys in memory, which it uses to work out which keys have and have not been processed. The reason for this is effectively that I didn't really trust users to implement this correctly on the storage side, which turned out to be a good choice, because the number of unit tests that this failed initially was a lot. And so now that we've got this ability to replicate our key-values, our life is a lot easier. In particular, we can go as far as essentially saying, okay, we've established our data connection and our key-values, let's just use Tantivy as our persistence store. This is effectively the simplest way to do it, and I've made a little demo here, which you can find at that link. I basically abused and slightly ignored certain things, in particular correctness, but this will replicate your data. You may end up with duplicate documents because I didn't handle de-duping, but in this case we can fetch, we can delete, and we can index documents with Tantivy, and that's our persistence store. Here you can see we're doing about 20,000 documents in 400 milliseconds in the local cluster. Yes, and that is effectively the end. So are there any questions? How long do we have left? 15 minutes. So in there, do you have a way to provide from outside, to the Tantivy transaction or lnx transaction, an external ID that I can use to integrate with the standard storage? Or, to put the question an easier way, do you have a way to say which level of data has been indexed? Yes. In this case I've glossed over it a little bit, because in reality it's a little bit more complicated when you implement it. So in reality, when you actually implement this, you would probably essentially use the replication to
replicate the initial documents, and then you would have a checkpoint to work out which documents have and have not been indexed yet, or you would add some additional step like a write-ahead log. That way you know that, as long as the documents are there, you can make sure your commit point is always updated to the latest thing. In lnx it's actually a little bit different again, because the way it creates indexes is per checkpoint, so a new index is created on every commit, effectively. But you don't have to do that, and in this method I didn't. It doesn't do it here, but you can add a write-ahead log, and you can basically do anything as long as the trait is implemented. Hello, hello. Hi. All right, so congratulations on the presentation. Sorry, I think I can see you. Yes, hello. So let me see if I got that question right: was that about extending Tantivy? So if you want to go beyond something like BM25 or Levenshtein distance and things like that, things like vector search or word-embedding search are still quite far away, and we'd need quite a big push to do that with Tantivy specifically. But if you want to add additional queries or additional functionality, it's quite easy to add with Tantivy; it's actually just a query trait. One of the things that lnx does is it has another query mode called fast fuzzy, which uses another algorithm that pre-computes dictionaries in order to do the edit distance lookup, and that basically just involves creating another query. You can customize effectively all of your query logic, all of your collecting logic and things like that, so provided you're within the scope of the API, Tantivy will allow you to implement it yourself. Otherwise, things like word embeddings, which are a little more complicated and require a bit more on the storage side, would need an issue and a very motivated
individual to probably implement that, which currently we don't really have. So, just a little question: in all your sketches, the network was fully connected. Is that important? Let me see if I can find which one that was. Was it this one, or was it this one? Well, on this one it does not look fully connected, but I'm not sure if these diagrams depict connectivity or just which messages have actually been dispatched. So I'm going to cross the forbidden white line here, because we're doing questions. Effectively, these are just indicating sending responses and getting things back. In a real system you could have a network partition here, and your node one can no longer talk to node three; it's effectively lost to the ether, and maybe node two can't reach it either. In this case it doesn't really matter. All you need to do is achieve what's called a consistency level, which means that if you want to progress you have to reach that level, otherwise things are counted as not happening. So in this case, if node three is down or can't be contacted, as long as node one can contact node two and node two acknowledges the messages, things can still progress. This is the same with Raft as well. Raft operates on what's called a quorum, so effectively any one node can go down in a three-node group and the other two nodes can still progress, provided they have what's called a majority. So I understand that full connection of the network is not an important factor here. Well, it's nice to know, thank you. Thank you for your talk. I see here that there is basically a consistency mechanism for indexing. Do you check for that on other nodes when there is a search request as well? Say that again, sorry, I didn't quite pick that up. Do you check the data on other nodes when there is a search request, not an indexing request? In this case we have relaxed reads, essentially, so
we're not searching across several nodes and getting the most up-to-date version from that, which is part of the trade-off you make with eventual consistency. You will have that with Raft as well: effectively, unless you contact the leader, you won't have the most up-to-date data when searching. But one of the things you do have to do if you go with the eventual consistency approach, like we do here, is handle the idea that you may have duplicate documents because something's been resent in the meantime. So you'll need to be able to deduplicate that when you're searching, or have some other method of handling it and deleting it from the index. So that means that effectively every node must have a copy of the data? Like, I cannot have five nodes and, like, a three-replica system or something? Yeah, so if you've got a five-node cluster and three nodes respond, and those three nodes have got the data, they can immediately be searched from if you want. But the other nodes may take a little bit of time to catch up, which is the principle with eventual consistency: they'll eventually align themselves, but they're not all immediately able to reflect changes. Hello, just a simple one: in hindsight, would you take the Raft path? In hindsight, probably still not. The reason is that the current state of the Rust ecosystem around it means there are a lot of black holes, effectively. You're either going with an implementation which is very, very stripped down, just the state machine part, or going with an implementation which is very trait-heavy and a little bit opaque about what you need to test, what you don't need to test, and how it behaves under failure. So I like this approach more, because it allows me to implement things like network simulation, which the RPC framework supports, so we can actually
simulate networks failing locally in tests and things like that, which makes me feel a little more confident than trying to just have the state machine and implement everything and all the handling correctly. But I think in future, yeah, you could use it; it's just not quite at that state. So I'm not sure I quite got whether the engine actually does any data sharding or there's a hatchery. Yeah, so in this approach, for simplicity and time really, we're not actually doing any data sharding. Servers are really quite big nowadays, so even for your e-commerce website you can get a pretty huge server, and the biggest issue tends to be replication and high availability. Data sharding is something that Quickwit would be concerned about, because you've got so much data that you need to spread it across machines when you're searching. But in e-commerce, at the point at which you're searching across multiple machines, you're probably going to be looking at higher latencies, so you'd be better off dedicating one machine per search rather than several machines per search, really. |
Aurae: Distributed Runtime
A new node init system written in Rust |
Check, one, two, hello. Hello. Hello. Hi. Where's Malte? Hi. Hi. Nice to meet you. Okay. Sorry. That's one of my hacker friends who has been working with me on the project. I've actually never met him in person, so nice to meet you. Anyway, today we're going to be talking about Aurae, or Aura, however you want to pronounce it is fine, which we're temporarily calling a distributed systems runtime, and that's the name that has caused the least amount of friction over the past few months. Okay. So, my least favorite slide, my slide about me. I'm an engineer, I work at GitHub, I help keep github.com online. Sorry about the SHA thing last week. Yeah. So, I keep a lot of systems online, some of you may or may not use them, all of you hopefully have good opinions of them, and if you want to follow me on the Fediverse, there's where you can follow me. So I'll do overview and context. If you want to go to the GitHub repo, you can grab a photo of this or just remember it. The link to the slides is there right now; I just force-pushed to main like two seconds ago, so you can go and see the slides, and there are links to everything there that I'll be going over today. So if you want to grab that, go ahead and grab that. Okay. So, we're going to start off with a little bit of context, I'll answer the question of what Aurae is and what it does, and then we'll spend the last two thirds of the presentation talking about Rust, why we decided to use Rust for the project, some reports about how it's going so far, and some of my experience as well. Okay. So, just a show of hands, who here has heard of Aurae before? Oh, God. Okay. Well, thank you for following my project, that makes me very happy but also a little terrified. So anyway, Aurae is an open source Rust project aimed at simplifying node management at scale. When I talk about it, I usually say it's basically a generic execution engine for containers, VMs, and processes.
The really quick pitch that I'll give on Aurae is that all of these things, containers, VMs, hypervisors, and basic process management, are all that I do at GitHub and all that I have done in my career for the past 10 years. I have used a plethora of tools to do this, I was tired of learning and managing all these different tools, and so I hope that this will be the last tool I ever have to work on in my career. So I wrote a thesis about the project, and I'm trying hard to continually reevaluate this thesis. Basically it says that by bringing some deliberate runtime controls to a node, we can unlock a new generation of higher order distributed systems. What I mean by that is, in my experience, a lot of the things we do on a node are organic and grew over the past 30 years or so. This is more of a deliberate set of what we need in the enterprise and what we need at a bare minimum on the node. And I think that if we get that right, we're actually going to have much more interesting conversations in the coming decades. I also believe that simplifying the execution stack will foster secure and observable systems while reducing complexity and risk. And complexity, if you have ever run Kubernetes, is the name of the game. Cool. So I'll be talking about these things called nodes today. Node is a keyword. When I say node, pretty much always in life, but very specifically in this talk, what I mean is a single compute unit in a set. So this would be one or more computers that we're trying to group together and manage as a set of computers. When we do one thing to a node, the assumption here is you want to go and do this twice or three times or 10,000 times or so on. So when we say node, I want you to think of a set of computers or an array of computers. OK. So what does Aurae do? The thesis here is that this is going to be a central control for every runtime process on a node.
So whether you're running PID 1 or a container or a virtual machine, the hope is that all of this can be funneled through the Aurae binary at runtime, and Aurae will have the ability to not only manage it, but also observe it and control it and start it and stop it. And who knows? Maybe even one day debug it if I'm very lucky. It runs as a minimal init system. So this is important. A lot of folks want to compare Aurae to systemd. And the more I think about it, the more I really believe Aurae and systemd have different goals. Aurae doesn't really want to become a desktop manager. In fact, it kind of wants to be the opposite of that. It wants to be as lightweight and as minimal as possible. In a perfect world, there would be no user space on an Aurae system, because we wouldn't actually want users touching a single computer. Remember, we're managing sets of computers. And so the hope here is that we can make this as lightweight as possible. Additionally, we want this thing to have a remote API. So the idea of a single person sitting at a desk and operating on a single node is kind of irrelevant here. Everything that we do on the node, whether it's scheduling another process like a bash shell or scheduling a container, should come through this remote API. And we're going to learn more about this API in Rust specifically later on in the talk. Also, it runs on Linux. Right now it's tightly coupled to the Linux kernel. So what doesn't it do? It doesn't do generic desktop support. That's just completely out of scope. I don't want to deal with your Bluetooth drivers. I don't want to deal with your sound drivers. I don't want to manage your desktop interface. I don't care. In a perfect world, this hooks up to a network, and that's about the most advanced user interface we're going to have to one of these nodes in a set. Additionally, higher order scheduling is out of scope.
So when we talk about enterprise management, whether it's some sort of orchestration system like Kubernetes or not, a lot of those discussions very quickly go into the scheduling discussion. There was a really good article, I think it was yesterday or the day before on Hacker News, that came out of fly.io about their orchestrator experience with Nomad. I see somebody nodding their head. Yeah, you read the article. It was a great article. And maybe we can find a link to it and put it in the video or something for folks. But that conversation was very much about how we make scheduling decisions with the resources available today. And that is pretty much all I do at my day job at GitHub, and that's all I've been doing managing Kubernetes for the past five or six years. So while I'm very interested in having that conversation, my hope is that by simplifying the node, we can make those scheduling conversations easier in the future. What I mean by that is that we will have less to say about what we actually do on a node, and we can effectively make nodes boring. So it doesn't run on Darwin and it doesn't run on Windows. Like I said, we're tightly coupled to the Linux kernel, which, if you haven't pieced it together yet, is why Rust is very exciting for the project. Okay, so again in summary, where did Aurae come from? It came from challenges with complexity at scale, so we just want the node to be boring. And there was this desire to simplify and secure the stack. I do deeply believe that with simple systems come secure systems. Every hack that I've been a part of in the industry has usually started with some sort of disparate, unknown, fragmented attack surface that somebody has been able to exploit to do some sort of lateral movement once they're in the system. So if we can simplify that, and we can make the conversation involve fewer moving pieces, my hope is that we can actually secure the stack. I also want there to be a stronger node API.
So who here has ever debugged the kubelet API before? Who here even knows what this is? Okay, so we have a handful of people. So the kubelet is the Kubernetes version of "we're going to go run an agent on a node". It does have an API; last I checked it was undocumented and tightly coupled with the Kubernetes control plane. We hope to break that. We hope to just have a generic API that you could use to run a single process remotely, or to schedule millions of processes remotely, and we want that to be a very strong and thoughtful API. One of the big lessons of running large distributed systems at scale is that the bigger you get, the less trust you can have in the people working on your systems. I've seen this whether it's my small Mastodon server that's grown into a medium-sized Mastodon server or dealing with thousands of nodes at scale. One of the lessons that I've noticed is that all workloads tend toward this untrusted banality. So the bigger you get, the less you can trust a single workload. And even if these workloads are on the same team as you, you really want to start looking at them as an isolation zone that you don't want to trust too much from the centralized control plane perspective. So we started off Aurae with a few guiding principles. Number one, I want it to be boring. So we're targeting a single binary. We want this binary to be polymorphic in nature. Who here is familiar with BusyBox? Great. Yeah, BusyBox. It's a good binary in my opinion. I really like what it does. There's a switch on argv[0], and it basically behaves according to however you call it. So we're trying to get some similar functionality into the Aurae binary as well. And we also want this thing to be lightweight, have a very strong scope, and be as low risk as possible. Additionally, we wanted this thing to be attainable. We wanted it to play nice with others. So I knew that I wanted this to fit in neatly with Kubernetes. I knew I wanted this to fit in neatly with Linux.
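The BusyBox-style switch on argv[0] mentioned above can be sketched in a few lines of Rust. This is only an illustration of the general technique: the `Personality` enum and the names `noded` and `nodectl` are made up for the example and are not the names or dispatch logic the Aurae binary actually uses.

```rust
use std::env;
use std::path::Path;

/// Hypothetical personalities a single binary could take on.
#[derive(Debug, PartialEq)]
enum Personality {
    Daemon,
    Client,
    Unknown(String),
}

/// Pick a personality from the name the binary was invoked as,
/// ignoring any leading directory components.
fn personality_from_argv0(argv0: &str) -> Personality {
    let name = Path::new(argv0)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or(argv0);

    match name {
        "noded" => Personality::Daemon,   // assumed name
        "nodectl" => Personality::Client, // assumed name
        other => Personality::Unknown(other.to_string()),
    }
}

fn main() {
    // argv[0] is the first item yielded by `env::args()`.
    let argv0 = env::args().next().unwrap_or_default();
    match personality_from_argv0(&argv0) {
        Personality::Daemon => println!("starting as the node daemon"),
        Personality::Client => println!("starting as the client"),
        Personality::Unknown(name) => println!("no personality registered for `{name}`"),
    }
}
```

With a layout like this, symlinking one compiled binary under several names is enough to expose several tools, which is how BusyBox ships many utilities in a single executable.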
And I knew I wanted pretty much everyone in this room to feel realistically like they could be running this thing on their laptops one day as the project grows. In order to do that, the API was going to be the majority of what we talked about as we began developing the project. And ultimately, I wanted it to be functional. I don't want it to serve the needs of a corporation. I don't want it to serve the needs of a higher order control plane. I literally just want a standard library for executing processes in containers and VMs at scale. What we do with that is out of scope. I just want it to work first and foremost. So ultimately, I want boring systems. And if you look at the background here, there are all of these subtle distributed system propaganda notes that you can go look at if you want to look at the slides later. So ultimately, I wanted the thing to be safe. When we're looking at tenant security, one of the questions I ask is, how do we make it easy to do the right thing? And I think that comes from the underlying infrastructure. In our case, Aurae is the underlying infrastructure. We intended to build a very strong project here that would unlock this sort of safe paradigm, where we could give a team a binary and they would be able to run their applications on top of it, and we wouldn't really have to worry about anybody sneaking out of their container or accessing any parts of the system we didn't want them to access. So tenant security is a strong motivator for this as well. OK. So about six months ago on Twitch (I do a Twitch stream, you should maybe follow me if you want to learn more about the project) I started to write this paper. It was mostly because some bro in chat was like, yo, why don't you just go rebuild systemd? And I was just like, maybe I will. And so anyway, I ended up writing this paper. And so, well, here we are. And so the paper really grew.
And it started to answer a bunch of questions, like: we should go write it in Go. No, no, no. We should go write it in C, because C is going to be the most common language that will interface neatly with the kernel, and we can do eBPF probes and so on. No, no, no, no. We should go write it in Rust. You can go look, there's a Google doc, and it's just got all these comments from people all over the internet, all over the industry, arguing about what we should do. And eventually, we settled on: we want a lightweight node daemon, and thus became the Aurae runtime project. OK. So this is where we shift from the conceptual (what is Aurae, how did we get here, what problems does it solve) and we start to get a little deeper into the code. When we originally started the project, we started writing it in Go, the Go programming language. And there are two predecessor projects that later turned into Aurae, which is written in Rust. The first one is what we call Aurae Legacy. Up until about five, well, I guess 15 minutes ago now, right before I walked into the room, this was a private GitHub repo, and I've gone ahead and actually opened it up. So if you want to go see the original code in Go, there are some really interesting things in there. We did some libp2p BitTorrent-style routing between nodes, where you can build a mesh of nodes and things. But you can really see where this runtime daemon started and some of the original concepts that we were tinkering around with. Ultimately, though, we ran into a lot of the same problems that I ran into in Kubernetes, which was that I needed to start recreating these objects, and I needed to start reading some config, whether that be YAML, JSON, or something similar, then marshal that onto a struct in memory, and then go and do arbitrary things with that, in our case, schedule a pod. And one of the things that was kind of outstanding in the back of my mind was, what about access to libc?
I knew as soon as we started scheduling containers and VMs, we absolutely were going to need native access to libc. Additionally, there's this project called NAML, which is basically Turing-complete Kubernetes config. It's written in Go, and it just uses the Go SDK, and that was yet another way of validating this idea that we need to start making our systems stronger and building stronger interfaces for teams to manage different parts of the stack. So those are the two precursors to the Aurae runtime as it exists today. Of course, writing it in Go came with some challenges. The big one here is obviously native access to libc. We were going to be creating cgroups against the Linux kernel. We definitely wanted to use the clone3 system call, and the container runtimes of today had some assumptions about how we were going to be executing the clone3 system call that, of course, I had to disagree with because, hi, have you met me? I have to disagree with everything. And we also wanted to implement some ptrace functionality as well. So obviously, Go was going to give us some challenges here when it came to using cgo, so Rust became very exciting and definitely got a lot of attention very quickly as we were writing the Go side of things. We also wanted eBPF for networking. I personally want it for security and maybe for some other interesting service mesh ideas, but I do think that having eBPF for networking is a non-negotiable. We're definitely going to want to simplify what Kubernetes refers to as kube-proxy, so that we can now invent our own name and hopefully simplify that layer, but I digress. We also wanted some access to native virtualization libraries, and all the KVM stuff is written in C. If you go look at the Firecracker code base, that is also written in Rust and vendors the KVM bindings. And so we knew we would want to access these three components, and all three of these are going to be problematic with Go.
Update as of about an hour ago: I went to the State of Go room across the hall here. Did anybody else go to the Go talk this morning? Yeah, we got three or four hands up here, so this kind of pissed me off. Go has Unwrap now as of 1.20, and they also freaking have .Clone. And I was just like, bro, get off our keywords, this is our thing. So anyway, it's really exciting to see Go taking these concepts a little more seriously. And if you've ever written Rust before, who here has written unwrap in Rust? Put your hands down, we're not supposed to do that. I don't know what we're supposed to use now, I just get so much shit on my Twitch stream every time I write unwrap. But yes, we do have unwrap and clone in Go now, which is just a strong indicator that we're likely doing something right with Rust. So anyway, I made the decision to move to Rust, and I didn't know very much about Rust when I made the decision. I literally just started out with the main function and said, we'll figure it out as we go, and I ordered the Rust book and just jumped in and started to write code with the hope of accessing kernel constructs and cgroups and eBPF probes. So what could possibly go wrong here? Okay, so how are we doing on time, by the way? We're 15 minutes in, okay, cool. So Rust helped us solve the YAML problem. I suspect we're all familiar with feeding YAML to machines; we've all done this before at some point in our lifetime, okay. So this is a thing I do a lot working in large distributed systems, and I work with people who do this a lot, and since we do it so much, we've tried to get really good at doing it, and I think that's an interesting discussion. So in my opinion (warning, Kris Nóva opinions here), all config ultimately is going to drift towards Turing completeness.
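On the unwrap aside earlier in this passage: two common alternatives in Rust are propagating the error to the caller with the `?` operator, and supplying a fallback with `unwrap_or`. The sketch below shows both; the function names and the CPU-count scenario are made up for illustration and do not come from the Aurae code base.

```rust
use std::num::ParseIntError;

/// Propagate a parse failure to the caller with `?` instead of panicking.
fn parse_cpu_count(raw: &str) -> Result<u32, ParseIntError> {
    let count: u32 = raw.trim().parse()?;
    Ok(count)
}

/// Fall back to a default when the input is missing or invalid.
fn cpu_count_or_default(raw: Option<&str>, default: u32) -> u32 {
    raw.and_then(|s| parse_cpu_count(s).ok()).unwrap_or(default)
}

fn main() {
    // The caller decides what to do with the error.
    match parse_cpu_count("8") {
        Ok(count) => println!("cpus: {count}"),
        Err(err) => eprintln!("bad input: {err}"),
    }

    // Invalid input quietly becomes the default rather than a panic.
    println!("fallback: {}", cpu_count_or_default(Some("oops"), 1));
}
```

Whether a panic, a propagated error or a default is appropriate depends on the call site; `unwrap` is still reasonable in tests and quick prototypes.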
So I see this with C++ templates, anybody, anybody, C++ templates, okay, Helm charts, Kustomize in Kubernetes, any of the templating and rendering languages that you see in web dev and front-end work. There are all kinds of interesting Python libraries that will allow you to interpolate your config and so on. In my opinion, a good balance is something like bash that is Turing complete, but that just comes with some strong guarantees. And so I knew very quickly that I didn't want to be feeding YAML to Aurae. I definitely didn't want to recreate this idea that we're going to have to manage a thousand pieces of YAML because we have a thousand different nodes. So I wanted to explore what options we have here, so we're not just feeding YAML to machines anymore. So thus became this really interesting project of mine, we'll see if this pans out, which is this binary called AuraeScript. AuraeScript is a Rust binary, we have it compiling with musl today, and it embeds all of the connection logic for a single machine. We'll talk more about the semantics of AuraeScript in a second. But ultimately what you need to understand to get the initial motivation here is that this aims to be an alternative to managing YAML at scale. So I found this really fascinating TypeScript runtime called Deno. Have folks heard of Deno before? Can I swear in here? I effing love Deno. I'm sorry, I really like this project. If you want a good example of a really successful Rust project that has a really strong community, I would encourage you to just go look at the Deno project. I think their code is beautiful, I think what it does is beautiful, I think the way that they manage the project is beautiful. It's just a really good quality project, and it solves a problem for us with Aurae. And so Deno is basically a runtime for TypeScript, and it's written in Rust.
And the way the project is set up, you can go and add your own custom interpreted logic, you can build fancy things into the binary, and you can do things with the TypeScript interpretation at runtime, which is precisely what we needed to do with Aurae. So here is the model now. Instead of feeding YAML to a single node, we now have this higher order set of libraries that we can statically compile into a binary, and we can interpret it directly on a machine. So in order for you to interface with an Aurae node or a set of nodes, all you need is one binary, an mTLS config, and then whatever TypeScript you want to write. This is an alternative to any of the Nomad command line tools, the Mesos command line tools, or the Kubernetes kubectl command line tool. And now you can just write it all directly in TypeScript. So this is actually a concrete example of what systemd would call a unit file, what Kubernetes would call a manifest, and what Aurae just calls a freaking TypeScript file, because we don't have fancy names for our stuff yet. You can see here at the top, we basically contact the Aurae standard library. We get a new client, and then we can allocate this thing called a cell. A cell is basically an abstraction for a cgroup. We cordon off a section of the system and we say we want to use a certain percentage of the available CPUs on a node, and I want it to only let processes run, in this case, for 0.4 seconds, and then we'll use the kernel to just kill the process if it runs longer than that. So the first thing we would do is allocate that, which is an improvement over Kubernetes as it exists today, because we can allocate resources before we actually start anything in that area, and then we can go ahead and actually start whatever we want. And you can see I simplified the example just for today, but it's just remote command injection as a service.
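As a rough illustration of the cell idea, the sketch below models a CPU budget and renders it in the `quota period` form that the cgroup v2 `cpu.max` file accepts, with both values in microseconds. `CellSpec`, its fields, and the one-second period are assumptions made for this example, not Aurae's actual API; the sketch only formats the value and never touches `/sys/fs/cgroup`. Note that with `cpu.max` the kernel throttles the group once its quota is used up within a period.

```rust
use std::time::Duration;

/// Hypothetical description of a cell's CPU budget (not Aurae's real type).
struct CellSpec {
    name: String,
    /// CPU time the cell may consume within each period.
    cpu_quota: Duration,
    /// Length of the accounting period (assumed to be one second here).
    cpu_period: Duration,
}

impl CellSpec {
    /// Render the budget as "<quota_us> <period_us>", the format
    /// expected by the cgroup v2 `cpu.max` interface file.
    fn cpu_max(&self) -> String {
        format!(
            "{} {}",
            self.cpu_quota.as_micros(),
            self.cpu_period.as_micros()
        )
    }

    /// Share of one CPU that this budget corresponds to.
    fn cpu_fraction(&self) -> f64 {
        self.cpu_quota.as_secs_f64() / self.cpu_period.as_secs_f64()
    }
}

fn main() {
    let cell = CellSpec {
        name: "demo-cell".to_string(),
        cpu_quota: Duration::from_millis(400), // the 0.4 seconds from the talk
        cpu_period: Duration::from_secs(1),    // assumption
    };

    println!("cell {} -> cpu.max = \"{}\"", cell.name, cell.cpu_max());
    println!("share of one CPU: {:.0}%", cell.cpu_fraction() * 100.0);
}
```

Separating "describe the budget" from "start something inside it" mirrors the allocate-then-start order described above: the limits exist before any process is scheduled into the cell.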
So this whole talk was just basically how to go and run a bash command on a server. And so now you can express your commands and similar primitives that you would see in other runtimes directly in TypeScript. The interesting thing here is that TypeScript is just natively more expressive than a lot of the YAML things that we see today. In this case we can actually do math, but I'm sure you can imagine you can do other things as well. You can access logic, loops, if statements, branching and so on. And so we were able to actually solve some of these templating and rendering style problems by just doing things natively in a well-known and easy to understand language such as TypeScript. So patterns started to emerge. Rust gave us the ability to generate the TypeScript binary with all of the magic behind-the-scenes mTLS security config that we wanted. And so now the conversation was a little more like this, which is: how do I manage a small set of TypeScript? It's much more flexible, and you can start to express things the way that we used to and just express things statically, and then you can have all of your Turing-complete logical components below, and you can mix and match these however you want. So in addition to addressing the YAML problem with Deno and TypeScript, Rust also helped us to solve the sidecar problem, and by us, I mean this is our hope as we operate our Mastodon servers and our various other ridiculous side projects that we operate both in my basement and in a colo in Germany. So, talking about sidecars, who here knows what a sidecar is? Show of hands. Okay, most folks do. Okay, so: a sidecar that is always available with the same features as a host. This is going to sound a little bit weird, and the slide is going to look a little bit weird, but just bear with me as we unpack what's actually going on here. What we want, that I don't think we're talking about, is that sentence.
I actually think what we want is a sidecar to sit alongside our applications that does literally the exact same things we have to do on a given host whenever we're managing these workloads at scale. As I began looking into writing sidecars at the host level, I began drilling deeper and deeper into the C programming language as I was writing this in Rust, and just made the connection that memory safety was going to be key, because we're going to be running these daemons right alongside your workload. And so unpacking the need to do this really helps you understand why we shifted over to Rust. So again, another Kris Nóva opinion: any sufficiently mature infrastructure service will evolve into a sidecar. If you have done any sort of structured logging, in my opinion, if you continue to build structured logging and continue to ship logs, that will eventually turn into a sidecar that you're going to want to run beside your app so you have this transparent logging experience. You can rinse and repeat that paradigm for pretty much anything: secrets, authentication data, and so on. And so I started to see these patterns surface. Very specifically, I started to look at how I would solve these with Rust. As it turns out, the Rust ecosystem had a plethora of pleasant surprises for me as I started to explore what putting some of these features into a binary would look like. Logging was boring because we could just use Tokio streams. Authn and authz were boring because all I had to do was use the Rust derive primitives to start applying authz to each of our units in the source code. Identity was boring because I didn't even get to fight with OpenSSL anymore. We just had to use rustls, and that was easy. And the network was also easy because we had native access to Linux and libc, so we could just very boringly schedule a Linux device, and we got a Linux device, and it was pretty straightforward.
So we were able to create this at the node, and now my question was: how do we bring this to the workload level at scale? And I think this is where, in most conversations, you start talking about things like Istio and service meshes and structured logging and so forth. And I actually think that we can simplify that conversation too. And so what we were able to do with Aurae is we just spawned the root daemon and used that as the new PID 1 in any of our nested isolation zones. And when I say spawn, I very directly mean that we literally read the byte code from the kernel and we build an image at runtime with, byte for byte, the same byte code that's running on the host, and then we can just go and execute whatever we want against the same API as the original host runs, and all of this is memory safe. So I can put this right next to your application in the same namespaces, running in a container or running in a virtual machine, and there's a relatively low risk of any sort of binary exploitation at scale. So here's a model of what that looks like. On the big left side here we have the Aurae host daemon, and on the right we have the three types of isolation zones that you can run with the daemon. You have a cell sandbox, which is effectively a cgroup; a pod sandbox, which is a group of containers running in unique Linux namespaces; and a virtual machine, which is effectively a container with a kernel and some virtualization technology. All of this is possible with Rust natively, and all of this was made possible by spawning the binary and creating these nested isolation zones at runtime. Additionally, Rust was able to help solve the untrusted workload problem because of the memory safety that Rust offers and because of this really interesting model that we have right here.
So this is a zoomed-in model that might look familiar if you've ever done any container escapes before. In this model, basically what we're saying is we're replacing any sort of pause or initialization sequence in an isolation zone with the same daemon we run on the host. I think the Rust binary for Aurae right now is about 40 megabytes, and we can just copy that into a container and run that alongside your application. So it's a relatively small runtime that will sit right alongside your app. So, managing memory for mTLS and auraed. As I'm writing Rust, one of the things I notice is that I start paying attention to memory management more. Every time I try to clone something or the freaking borrow checker yells at me, that is a small, grim reminder of my roots as a C developer. This is an interesting takeaway: the only memory that we need to share, that multiple parts of the system have access to in this entire model, whether we're creating containers or VMs, is the shared mTLS config. So this is the only bit of shared memory that we really have to manage, and Rust very clearly called that out. And to be candid, I don't think I would be as comfortable with this model if I was doing this in something like Go. So Rust was able to help us solve the maintainability problem. So, did somebody say Rust macros? We have a really brilliant guy, future-highway, who helps us work on the project, and future-highway is our resident macro guy. Does everybody here have a macro guy on your team? Because you should. He has made things a lot simpler for us. One of the things we struggled with in Go, in Kubernetes specifically, was how to generate objects with unique logic. Rust macros were a solution to this for us.
So if you've ever looked at the Kubernetes code base, you can see we've created these things called CRDs, which started out as third-party resources, and we've built this entire bespoke API machinery system that basically is a glorified macro system that allows us to generate Go in the project. So we're allowed to use Rust macros now, and it's a very simple model in the code base. We basically have a combinatorics problem where we're able to map the different primitives to the different logical systems that are unique to us, and we can generate our source code as needed. And so our source code ends up looking like this, which I think means we've successfully achieved boring for a low-level runtime. This is a fairly straightforward call, and then we can be confident that the code it generates is unique to the project and encapsulates all of our concerns as maintainers. So really the whole conversation now is just the proto conversation. Everything can be generated by Rust macros. The whole project really is pretty much on autogen at this point. You can just go introduce a new field in the API and then you can spit out a new client, it'll plumb itself into the runtime, it'll plumb itself into the AuraeScript library, and everything is given to us for free just because of macros in Rust. And so this is our code path and the way that we're able to take advantage of macros. We do a lot of manual work, we fight with the borrow checker, we make some improvements, and then when we're done we encapsulate it into a macro, and we can simplify our code path by just replacing all of that with a macro. And so this is the Aurae project as it exists today, which again I'm very stoked to say is a very boring exercise. So a quick update and then I'll be done with my talk here. There are a few components, all of which are written in Rust. Number one, the auraed daemon is the main static binary that's written in Rust and compiled with musl.
So we can ship that, without any of the shared objects on the host, directly into an isolation zone. aer is a client completely generated from the proto. So this is exciting: we can actually call a gRPC API directly from the client, and we don't have to do any of the runtime plumbing. If we add a bool to the proto file, we get --bool directly in the compiled client for free, without typing a single line of code. So this is a very exciting primitive for us, because we can just begin to have API conversations and not necessarily care about the internals of the program anymore. AuraeScript is completely generated, and we have this exciting project down here, which is ae, an alternative command-line client written in Go. So ultimately the lesson here is that Rust was able to help us solve the boring problem. We have a very complicated, very obscure piece of technology that you don't really have to do much to work on anymore. Most of it's on autopilot at this point, and most of the conversations are very philosophical in nature and not necessarily about how to implement things in the software. So, takeaways about the project: Aurae is completely stateless, so you can restart a node and it's basically empty until you push config to it, which means all of our systems are declarative, like NixOS, and you can just pass things like TypeScript or JSON to them, and it makes it easy to manage things like containers. Next, we have some to-dos for the project, and I would encourage you all to get involved. If you want to see a demo of all this, I'll be out here in the hallway after the talk, and you can come and track me down and I'm happy to give you a demo. So anyway, I think we have about five minutes for questions, so I'll take questions. If you want to get involved, here's how to get involved, and I'm Kris Nóva, please clap. You mentioned the size of the binary being, does it work?
You mentioned the size of the binary being 40 megabytes. Is that with size optimization or no? Sorry, say that again? Is the size of the binary at 40 megabytes with size optimization applied already or no? No, that's completely unoptimized, that is just straight out of the compiler without any aftermarket tuning. Amazing talk, quick question. If I want to have just enough Linux to PXE boot into this thing, do you have any templates? Because it feels like a shame to run it on something like RHEL; I just need enough of Linux to PXE boot into that. Yeah, so the question is basically, can we PXE boot this, and then you mentioned RHEL. Where we're going, we don't need Red Hat. So I guess what I would say is, in theory, all you need to run is a static Linux kernel and Aurae and a network connection and some mTLS config, and everything else at that point, all of your packages, your services, your daemons, is passed to it via the API. Hi, you mentioned that you use a lot of macros. I've also run into problems where you have a combinatorial explosion of templates, in C++ speak, or something like that. What are your thoughts on generics for generating some of this rather than macros, in order to be a bit more type safe, I suppose? Personally, I got a little drunk on generics, I'm not going to lie, when I first moved over from Go, because I was just so excited about them. The reason I like macros is that we can add logic to them. To give you an example, we have containers and we have VMs. So we'll have a section of the macro dedicated just to VMs that manages the kernel. And that's irrelevant to the container systems in the project, because containers run on the host kernel. And so we can embed those small branches directly into the macro code so that macros generate slightly different outputs based on the inputs that are given to them.
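The idea of a macro that emits slightly different code per input can be sketched with a small declarative macro. This is only an illustration under assumptions: `SandboxSpec`, the `sandbox!` macro, the generated function names and the kernel path are all hypothetical and are not taken from the Aurae code base, which may well use procedural macros instead.

```rust
/// Hypothetical description of an isolation zone; not Aurae's real types.
#[derive(Debug, PartialEq)]
pub struct SandboxSpec {
    pub name: &'static str,
    pub kind: &'static str,
    /// Only virtual machines carry their own kernel image.
    pub kernel: Option<&'static str>,
}

/// Generates one constructor function per invocation.
/// The `vm` arm adds kernel handling; the `container` arm leaves it out,
/// because containers run on the host kernel.
macro_rules! sandbox {
    (vm $fn_name:ident, kernel = $kernel:expr) => {
        pub fn $fn_name(name: &'static str) -> SandboxSpec {
            SandboxSpec { name, kind: "vm", kernel: Some($kernel) }
        }
    };
    (container $fn_name:ident) => {
        pub fn $fn_name(name: &'static str) -> SandboxSpec {
            SandboxSpec { name, kind: "container", kernel: None }
        }
    };
}

// Two expansions of the same macro, each producing a different function body.
sandbox!(vm new_vm, kernel = "/boot/vmlinuz");
sandbox!(container new_container);
```

Calling `new_vm("db")` yields a spec with a kernel set, while `new_container("web")` yields one without. The branch is chosen at compile time by the first token of the invocation, which is the kind of small per-input variation that is awkward to express with generics alone.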
So for Aurae, when you're dealing with similar systems of code that have small nuances, like we are, macros really, in my opinion, are the way to go. Did I answer your question? Looks like it. A simple question: can I actually give the configuration just in Rust, instead of AuraeScript or TypeScript? Yeah, of course. So we have this Rust client here, it's basically a Rust SDK. And then we have a tool called aer, which takes it a step further, and it's automatically generated with macros. It's a compiled binary that you can just use from the command line. So you can just type commands directly into it and it will run against the server on the back end. Do you think code is Rust? Yeah, there's also an SDK. So you could write your own Rust code, and it's gRPC, so you could write it in Go, and you could write it in Python or Ruby or realistically anything, any client you want. Hi. I was wondering, when you talk about the remote API, have you considered a future direction to make this a unikernel? A unikernel. Yeah. Yeah. I have a slide for this. I added a bunch of FAQ slides to the end because I knew that we were going to get all these good questions. The answer is it depends. Hold on, let's see if I can find it. You guys get to see. There it is. It depends. What does unikernel mean to you? I think the most minimal system we could do would be a Linux kernel as it exists today, good old-fashioned stock Linux, giant Makefile, the whole nine yards. And then the auraed daemon, and that would be the minimal system. Anything else you would need to pass to it at runtime. I think we have time for about one more question. So you said it doesn't do any higher-order scheduling. I guess I'm kind of curious: if you want to do things like resilience or steering, or if the job dies, bring something back up, what are people typically using with Aurae? So Aurae is still very new.
I think that my hope for the project is kind of the same hope I had with my book: solve the lower layer first, and then that is going to open the door for higher-order conversations in the future. My hope is that there's a whole ecosystem of schedulers. You change your scheduler like you change your socks, well, maybe not that often, but the point would be that it's very specific to the needs of the current organization that's working on it. And I would hope that we can still use the Kubernetes scheduler or the Nomad scheduler to schedule jobs on Aurae. I know there are also some machine learning folks who have some data resiliency problems, who are interested in Aurae right now and plan on using some weird global mesh that will do a peer-to-peer network around the world, kind of like BitTorrent, and they intend to use Aurae for that. So I think there are some opportunities there. The project itself won't ever have an opinion on a scheduler. Maybe I personally will start another project to do that in the future or something, but this is the scope for now. So that's all the time we have. Okay. Can we hear it again? |
Presentation of BastionLab, a Rust open-source privacy framework for confidential data science collaboration
Why Rust is the most appropriate language for our project |
Hello everyone, I'm Mehdi Bessa, CTO of Mithril Security, and today I'm going to present BastionLab, a secure, privacy-friendly data framework written in Rust. Is it working? Yeah. It's better this way. So when making this project, we came across one big problem. Let's say, for example, you are a hospital and you want to share critical data, such as ECG data, heart rate, respiration rate, and so on, with, for example, a startup that is working on a deep learning algorithm to detect anomalies in that data. The most usual way today is to use a Jupyter Notebook that you can isolate from the network and so on. Unfortunately, this is not the appropriate way, because Jupyter Notebooks allow arbitrary code execution, and in some ways you can extract the data without the data owner even seeing that you did it, which is a big problem, especially with sensitive data. Our solution tries to fix this issue. For example, you will not have direct access to the data. You will only have limited operations allowed, so really only what you need: for example, aggregate the data, extract only what you need, do some averages and calculations on a subset of the data. But most importantly, only sanitized and authorized output is allowed. Meaning, for example, if I don't want the startup or any other actor that works on my dataset to see the names of my patients, or some critical data such as whether they have hypertension and so on, I can just set that up in the policy, and they will not be able to access it unless I explicitly authorize it, and even then, nothing forces me to accept. I'm going to present our API to you very quickly. Don't get mad at me: the API is in Python, but the server is in Rust, so don't get mad yet. Okay. It doesn't look as bad as I thought, actually it doesn't. Yeah. Sorry about that. Is that working? No, I think the resolution is not there. That's okay. That's okay. I will go on with just an explanation. Okay.
That was fun. No, that's fun. Ah, thanks. Sorry about that, so no Python, so good for you in a way! Thanks. From our experience with Rust in building BastionLab, we had several reasons to choose Rust to make our project, of which the biggest is memory safety. I think you all know very well what I'm talking about here. Then there is the very paranoid way Rust handles multithreading: no mutable statics unless you use lazy_static or some other technique. That was a pain to work around, but we did manage. And the minimal codebase size, thanks to Rust being a low-level programming language. It's ideal for the trusted execution environments that we are working with, such as, for example, AMD SEV, Intel SGX, and so on. The smaller the codebase, the easier it is to audit. Now for the performance reason. As I said, Rust is a low-level programming language, very close to C in terms of execution speed, but the biggest reason is Polars, because our API heavily relies on it, except that we implemented a network stack so that nobody is ever allowed to access the data directly. Polars also offers some of the best performance for working with datasets: joins, aggregations and so on. It was the easiest way to do it, plus it's in Rust. There are no bindings and so on. Thanks for that. So you can see here the performance, the benchmark we made. We use pandas as a reference, which is, as you can see here, more than terrible compared to Polars. We compared against Polars lazy, our solution being lazy by default. Lazy means I'm only executing a query when I strictly need it, unlike pandas, which is eager and does it all the time. That makes a big difference, plus I'm working only on the data that I need and not on the whole dataset if I don't need to. That's another benchmark on a bigger dataset. You can see that pandas is still through the roof compared to the other ones. Now though, how did we do that?
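The lazy versus eager distinction can be sketched with nothing but Rust's standard iterators. This is not the Polars or pandas API, only an analogy under assumptions: the function names are made up for illustration, and the `work` counter exists purely to make the amount of processing visible.

```rust
use std::cell::Cell;

/// Eager style: transform every row up front, then keep the first `limit`.
/// `work` counts how many rows were actually processed.
pub fn eager_first(rows: &[u32], limit: usize, work: &Cell<usize>) -> Vec<u32> {
    let all: Vec<u32> = rows
        .iter()
        .map(|r| {
            work.set(work.get() + 1);
            r * 2
        })
        .collect(); // the whole dataset is materialized here
    all.into_iter().take(limit).collect()
}

/// Lazy style: describe the pipeline first; rows are only transformed
/// when the final `collect` pulls them, and only as many as are needed.
pub fn lazy_first(rows: &[u32], limit: usize, work: &Cell<usize>) -> Vec<u32> {
    rows.iter()
        .map(|r| {
            work.set(work.get() + 1);
            r * 2
        })
        .take(limit)
        .collect()
}
```

Both functions return the same rows, but on a 1000-row input with `limit = 3` the eager version touches all 1000 rows while the lazy one touches only 3. A lazy query engine applies the same principle at a larger scale, since it sees the whole query before running any of it.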
We chose to use the best crates that are available for doing that. We wanted to use Tonic and Tokio, because Tonic offers gRPC, which will allow us to make clients that are not in Python if we need to, thanks to protobuf, which is implemented in many languages, and the gRPC protocol as well. Polars, as I mentioned already. And ring, because in addition to setting up a policy, for example if I don't want people to access specific names or whatever, there is authentication. We use the ring implementation directly to verify keys: I need to provide my public key to access the server, and we use ring to check that the key matches and that the key is real. And Tokio, because we are using multithreading, async move and so on heavily, like mad. For example, when a request on a dataset needs to be accepted, we spawn a new thread that will send a request to the data owner saying: do you want to accept this request that is about to leak sensitive data? It's not written that way, but that is the idea. Instead of blocking the whole process, I will have a thread that will time out after a while; if I don't answer, the request is rejected. But I can say yes or no, and other requests, such as simple or non-sensitive ones, will be accepted. And so async move plus Tonic plus Tokio work very well together and allow as many connections as we want. This is the best we could dream of. As I was about to show, this was supposed to be in Google Colab, but it's an easier representation here. For simplicity reasons we have, sorry, Python code, but only a few lines. The data owner part uploads the dataset and sets up a policy, and for example, I can reject access to my dataset, or I can allow a sensitive request but require it to be logged. Oh, shit. Thanks, everyone. Thank you. |
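The approval flow described in the talk (ask the data owner without blocking everything else, and reject if no answer arrives in time) can be sketched with standard-library threads and channels. This is an analogy under assumptions rather than BastionLab's code: the real server is described as using Tokio and Tonic, and `Decision`, `ask_owner` and `handle_request` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision {
    Approved,
    Rejected,
}

/// Asks the data owner on a separate thread.
/// If no answer arrives within `timeout`, the request counts as rejected.
pub fn ask_owner<F>(owner: F, timeout: Duration) -> Decision
where
    F: FnOnce() -> Decision + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the caller has already timed out.
        let _ = tx.send(owner());
    });
    rx.recv_timeout(timeout).unwrap_or(Decision::Rejected)
}

/// Non-sensitive requests pass straight through; sensitive ones need the owner.
pub fn handle_request<F>(sensitive: bool, owner: F, timeout: Duration) -> Decision
where
    F: FnOnce() -> Decision + Send + 'static,
{
    if sensitive {
        ask_owner(owner, timeout)
    } else {
        Decision::Approved
    }
}
```

An owner who answers in time gets their decision honoured, while a slow or absent owner results in `Decision::Rejected`, so silence never leaks data. In an async runtime the same shape would typically be a spawned task wrapped in a timeout, which avoids dedicating an OS thread to each pending approval.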
Neovim and rust-analyzer are best friends |
Andrei is going to talk about Neovim. I went through this deck, it's actually really cool. Before you go on, I need to give you this. Yeah. No, we need this one for the projection, so here we go. One, one, one. Can you hear me? Can you hear me? Yes. Cool. I can... I think I can fix it. I can use it, so let's start, since we don't have much time. Okay. Today I will talk about Rust and Neovim. If you see something inaccurate, I'm new to Rust, so please don't judge me too much. A little bit about me. I'm Andrei, I'm from Ukraine, now living in Vienna, in Austria. I do a lot of Go and Python, I love gymnastics and other stuff, I love speaking, and I just did a talk in another room, so I'll just move forward. Cool. Let's start. So Neovim is the most loved editor in 2021 and in 2022, and Rust is the most loved language in 2022, so I feel it's a win-win, it's the best combination you can ever see. Can you raise your hand very quickly, who uses Neovim or something like it? Oh, it's my audience. Nice, nice. Nice. So why is this for you? It's for you if, like me, you love to spend all day and weekends configuring your development environment. Please don't tell my boss. And since all of you are using Neovim, I think you know how to quit Vim, right? No need to explain, cool? So, just joking. Cool. So first things first: Neovim is a fork of Vim which is focused on extensibility and usability. One example is that they added the ability to write plugins in Lua, which is great. And since 0.5, as you know, Neovim supports an LSP client framework. All good, all good. This means Neovim can work as a client to LSP servers like rust-analyzer and other tools. To prove my words: if you open Neovim and type :help lsp, it's right there inside the editor, which is nice, and you can read what you can do. And for those of you who don't know what LSP is, very quickly: LSP is like a bridge.
So now developers of language servers can focus on developing language servers, and developers of editors can focus on editors. And it's a win-win again. Because previously it was like this: you needed to write a tool which formats your code or whatever, it parses the syntax tree, and depending on the language, you could integrate it. And jumping between languages is hard, which is why it's nice to have LSP. So, quick start. I assume you know how to install Neovim. If you have never done this before, I highly recommend starting with this one-file config, which is here, and putting it in your config location. It's very small, and it has a few important plugins which help you try it out. I highly recommend using it, because it's easier to start with something small and simple rather than starting with, okay, I need lots of files, I need a repo for all my configs, et cetera. And as soon as you know more, you can refactor it. This file includes the new Mason plugin, which helps you install language servers, which is handy, because you can do it directly from Neovim. It installs binaries directly to the Neovim standard path, and when you start Neovim, it adds this location, so Neovim and the tooling can communicate with these binaries. And it supports not only LSP servers, but also debuggers, linters and formatters. Sorry, next. So when you're done, you can see this autocomplete, which is great. Another interesting feature is auto-imports. For example, if you use HashMap but forget to import it, which I usually do, it's very nice. If you bind this code action, it will show you different options depending on which part of the code you are in. If there are any suggestions, for example importing HashMap, you can select it and it is imported automatically, and that's it. It's very cool. Same with rename.
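What travels over that bridge can be made concrete with a minimal sketch. In the LSP base protocol, every message is a JSON-RPC payload preceded by a `Content-Length` header, and a rename is a `textDocument/rename` request. The helper names below are illustrative, the file URI is made up, and no JSON escaping is performed, so the inputs are assumed to be plain ASCII without quotes.

```rust
/// Frames a JSON-RPC payload the way the LSP base protocol expects:
/// a Content-Length header (in bytes), a blank line, then the body.
pub fn frame(body: &str) -> String {
    format!("Content-Length: {}\r\n\r\n{}", body.len(), body)
}

/// Builds a `textDocument/rename` request for the symbol at (line, character).
/// Illustrative only: inputs are inserted verbatim, without JSON escaping.
pub fn rename_request(id: u32, uri: &str, line: u32, character: u32, new_name: &str) -> String {
    let body = format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"textDocument/rename","params":{{"textDocument":{{"uri":"{}"}},"position":{{"line":{},"character":{}}},"newName":"{}"}}}}"#,
        id, uri, line, character, new_name
    );
    frame(&body)
}
```

For example, `rename_request(1, "file:///src/main.rs", 3, 8, "total")` produces a message an editor could write to a language server's standard input. The server answers with a workspace edit covering every affected file, which is why a rename through LSP follows the code's structure rather than a text search.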
So now all the haters of Vim can't say that you're renaming using grep: you can rename using the syntax tree, in a smart way, across the entire code base. And that's it. Yeah, I was supposed to do a demo, but since we decided to shorten the talk, I just have a picture. If, like me, you forget your keymaps, because you have tons of them, you can just use this nice plugin, Telescope keymaps, and remind yourself: oh, shit, just fuzzy find this. So, very handy. For example, I forget LSP references, so I can find how to run it. And again, this LSP references feature is more or less language agnostic. It's an extension of Telescope, which gets the results and shows them to you. You can do hover documentation, you can do, I don't know, signature help, and many more. There are other features which I unfortunately can't cover in this short talk, but I highly recommend you go to this page and see how smart rust-analyzer is. It can do very nice refactorings, like, I don't know, apply De Morgan's law to your boolean logic, replace some patterns, et cetera. I'm personally using a big config. If you want, you can check it out. It's more organized because it's really big. It's not one file. And if you need to refresh your knowledge of Vim, there are two really good books, the best I have ever read about editors, and you can play a little bit with VimGolf. If you have never tried it, I highly recommend it. Yeah, and a few references to people who inspired me to use Neovim a lot. And thank you. Questions? Yeah, thanks. |
A Rusty CHERI - The path to hardware capabilities in Rust
A status report on ongoing efforts to support CHERI architectures in Rust |
Yeah, so my talk is about modifying the Rust compiler to support CHERI's hardware capabilities. I'm going to start off with a brief introduction. My name is Lewis Revill, and I work for a company called Embecosm. I work on many things, but I'd say I specialize in developing LLVM backends for constrained or unusual architectures. Embecosm itself is a software services company. We operate on the boundary between hardware and software, particularly in the embedded space, where you can find many unusual, difficult and interesting problems, like writing compilers. So what is CHERI? It's an acronym: Capability Hardware Enhanced RISC Instructions. It's best described as an instruction set extension, which can be adapted and applied to different architectures. The main feature of CHERI is that you can encode access constraints on memory addresses using things called capabilities. Capabilities essentially have metadata alongside memory addresses that allows you to specify these access constraints. Capabilities can only be operated on using capability operations, which replace the normal pointer operations, and these operations use the metadata to enforce the access constraints. It's worth pointing out that there are two modes of operation for CHERI. There's pure-cap mode, where all pointers are capabilities, and hybrid mode, where pointers are by default normal pointers, but capabilities are annotated as such in the source code. So capabilities together with capability operations allow you to enforce spatial, referential and temporal safety in the hardware at runtime. Spatial safety is to do with disallowing accesses outside the bounds of an original allocation. Referential safety is disallowing accesses without valid provenance, and temporal safety means that if the lifetime of an object is over, you can no longer access it through a capability. So what about integrating CHERI and Rust? Well, we're working on this as part of a project which is led by our customer CyberHive.
They're funded in turn by Digital Security by Design, which is a UK government initiative. CyberHive want to use CHERI hardware to enhance secure network protocols that are written in Rust. So the goal for us is to produce a Rust compiler that's capable of targeting CHERI-based architectures, with the long-term goal of a stable compiler that can produce production-ready code for security purposes. We know that we're initially going to be targeting Arm's Morello platform. So other than being able to compile existing Rust code for CHERI, what's the motivation behind integrating CHERI and Rust? Essentially it boils down to another layer of protection. We know that Rust is good at identifying and enforcing access constraints at compile time, but with CHERI you can identify constraints at compile time and enforce them in hardware at runtime. A good example is that Rust code annotated with unsafe is often a necessity in many real-world projects, which means that it could behave badly, but we don't know until runtime. With CHERI you can prevent this bad behavior in hardware when it occurs at runtime. There are some other small side benefits, such as replacing slow software bounds checks with hardware bounds checking, and replacing pointer-plus-length types with CHERI capabilities. To make things clearer, I have a motivating example. Say we want to add a dynamic offset to a pointer and then load from that pointer. This needs to be done in an unsafe block, because we don't know until runtime if it's going to do something bad. Without CHERI you could end up accessing memory outside the range of your original allocated array, but with CHERI that access will not occur at runtime, and the hardware will either panic or give you a default value. So now that we know that we want these benefits, how do we go about modifying Rust to get them?
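The motivating example can be written out roughly as follows. This is a sketch under assumptions, not the code from the slide: the function names are illustrative, and the comments about CHERI restate the talk's claim rather than anything this code checks.

```rust
/// Adds a dynamic offset to a pointer and loads through it.
///
/// # Safety
/// The caller must guarantee `offset < data.len()`. On conventional hardware
/// nothing verifies this at run time, so a bad offset is undefined behaviour.
/// The talk's point is that on CHERI the capability derived from `data`
/// would carry its bounds, so the hardware could refuse the access instead.
pub unsafe fn load_unchecked(data: &[u32], offset: usize) -> u32 {
    unsafe { *data.as_ptr().add(offset) }
}

/// The safe alternative available today: a software bounds check on every access.
pub fn load_checked(data: &[u32], offset: usize) -> Option<u32> {
    data.get(offset).copied()
}
```

With `let data = [10, 20, 30];`, both `unsafe { load_unchecked(&data, 1) }` and `load_checked(&data, 1)` read the value 20, but only `load_checked(&data, 7)` is safe to call, returning `None`. The unchecked variant must never be called with an out-of-range offset on ordinary hardware.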
The main problem is that we need to account for capability sizes correctly. That is, we need to stop assuming that the pointer type size is equal to the addressable range of the pointer; because capabilities have metadata, this isn't the case. Also, in the CHERI LLVM fork, capabilities are pointers in address space 200, whereas in Rust it seems like we assume that all pointers to data are in address space zero. Also, if we want to support hybrid mode, we need to be able to specify different pointer type sizes for different address spaces, so address space zero will have different sizes from address space 200. One thing I hope doesn't require many changes is that we need provenance and bounds to be propagated through the compiler, because they need to be attached to capabilities. And of course, if we want the optional bonus stuff, we need to implement that as well. Progress so far: the data layout changes are completed, which means that we can correctly specify capability sizes, both the type size and the addressable range, for both pure-cap and hybrid mode. I have modified the APIs which produce pointer types to get rid of the assumption that pointers are in address space zero, and now these APIs require an explicit address space parameter. And the biggest change is that for APIs where we report a size for a type, this is replaced with a total type size and a size of the value that you can represent. And yeah, this means that, like I said before, we can support CHERI capabilities. Also, in the strict provenance API there is an explicitly unsafe method of producing pointers with no provenance from a usize. For CHERI we need to use CHERI operations to set the address of a null capability to achieve the same result. What I'm currently doing is trawling through the assertion failures that come up when building the core libraries with this modified compiler.
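The assumption being removed can be observed from ordinary Rust. The sketch below only reports sizes; the function names are illustrative, and the figures in the comments about CHERI describe the commonly cited layout of a 64-bit address inside a 128-bit capability, which this code does not verify.

```rust
use std::mem::size_of;

/// Returns (size of a thin data pointer, size of `usize`) in bytes.
/// On conventional targets the two are equal, which is the assumption the
/// talk says must go: a capability is wider than the address it holds
/// (for example, a 128-bit capability around a 64-bit address).
pub fn pointer_vs_address_size() -> (usize, usize) {
    (size_of::<*const u8>(), size_of::<usize>())
}

/// A slice reference is a pointer plus a length: the "pointer plus length"
/// kind of type that the talk suggests capabilities could eventually replace,
/// since a capability already carries its own bounds.
pub fn slice_ref_size() -> usize {
    size_of::<&[u8]>()
}
```

On a typical 64-bit machine `pointer_vs_address_size()` returns `(8, 8)` and `slice_ref_size()` returns 16. Code that uses `size_of::<usize>()` as a stand-in for the size of a pointer is exactly what breaks once the two values differ.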
What still needs to be done? Well, there are almost definitely going to be modifications to the libraries to remove any assumptions that break for CHERI. There's also the question of how we specify capability types in hybrid mode. Because I don't think that Rust annotations are the right tool to mark a specific pointer as being a capability, I think this requires a library solution. For APIs where I have replaced a size with a type size and added a size of the value that you can represent, we need to go through all of the uses of the type size and see if they should really be using the size of the value that you can represent, because this is the main cause of the errors that I'm seeing when building the libraries. And of course, a lot of testing and polishing is going to be required. Before I finish this talk, I do need to mention that there's ongoing and past work in this same area. There was a master's thesis from the University of Cambridge, and there's another government-funded project from the University of Kent. And well, thank you for listening. Please feel free to check out the code on GitHub or ask me any questions outside. |
Slint: Are we GUI yet? |
So now we have Olivier, who is going to talk about Slint. Hello. Yeah. So about me: I started in open source by contributing to the KDE project, which is a project made with Qt. That led to my first job, at Trolltech, which was the company making Qt, later bought by Nokia. But about 10 years ago, I left to create my own company, doing software services. And I was still a bit in the Qt ecosystem. I was talking with Simon Hausmann, also from Qt, and we were looking at the sad state of current desktop UI: can we do better? What would happen if we created a new UI toolkit from scratch? And then in 2020, we created Slint. It is not quite 100% Rust, but most of it is implemented in Rust. It's a native toolkit, native as opposed to running in a browser, so it really runs natively, and it's aiming at desktop and embedded at first. It uses its own domain-specific language. It's used through a macro, and you might say: wait, I wanted to develop in Rust, and now you're saying I need to learn a new language to do UI? And yes, but fortunately learning this language is not more difficult than learning the API of any other library. So it's just like learning an API. And Rust is not really meant for UI. There are a lot of ways in which Rust is a bit too explicit, whereas for UI you just want to describe the user interface in a way that our language is much better at. And this language is only there to describe the user interface; all the logic, of course, is written in a programming language, for example Rust, but we also have bindings to various languages like C++ or JavaScript, and we intend to add more. So let's try to do a short demo. I cannot do the demo live because it's a lightning talk, but I took some screenshots.
So let's just create a new project and add Slint as a dependency. There is an extension, so this is Visual Studio Code, where we can install an extension: we search for Slint there, and it's a one-click install. If you don't have Visual Studio Code, it's okay, because this is just a wrapper around a Language Server Protocol server, just an LSP, so that works with most editors. And if you don't want to use it, it's all optional. But let's go back to main.rs and add some code. Here we add our little macro; it shows a small window with a text and a button. While typing that, of course, I had the full power of this extension, so that includes auto-completion, go to symbol and everything else. We even have this little property editor there that we added. But the coolest thing here is that we have this code lens, "Show preview". Let's click on this (it can be a code action as well in other editors), and a window appears. So the LSP server behind the scenes opens a new window, and this is the preview of what you just typed, and if you type, it updates live. This is really interesting, because when you do UI, you really want to see what happens as you go. You don't want to spend a long time compiling and so on. Let's add a callback here, btn_clicked, and in the Rust code we instantiate our main window that we created from this macro, and connect to it with the generated on_btn_clicked. So this is generated by the macro, so that some Rust code can be called. If we click and run the code, that's it, we have the thing. Here we see that the two windows on the screenshot have different styles; that's because it's styleable, so we have, for example, the Fluent style, or we also have here a native style, because we really want to be a native toolkit. Let's add a property that we can set: now in the callback we say set_count(get_count() + 1), so we added this property that we use in the text. 
These are reactive, meaning that when you change them, everything that depends on them automatically changes, and Slint knows what to refresh. So what can we do? Here is a little demo. Okay. Yeah, that works. So this is apparently not working really well in this presentation, but the idea here is that you will see the demo running on WebAssembly in the browser. It's meant to be a desktop framework, but for demos it also runs in the browser. Okay, so this doesn't look good with this projector, but again, this is a gallery which shows a few controls. So what about the performance? How lightweight is it? Here I have with me this microcontroller; this is a Raspberry Pi Pico. It has less than 300 kilobytes of RAM. I said kilobytes, not megabytes. And yes, it's working: we have scrolling and some animations, so that shows what we can do. The project is open source and it's entirely developed on GitHub. We accept pull requests, and we also accept bug reports, of course; please open GitHub issues. The license is GPL for open source projects, and we're also a company, so we want to make money out of it, and that's why we have multiple licenses. So GPL for open source projects, and we also have an ambassador license, as we call it: it's a free license which you can use for proprietary software. You just have to say that you're using Slint. And there is also a commercial license with support and so on. As for the future, after three years of development we are now almost ready to release version 1.0, so if all goes well, it should be released this month, end of February. And the other thing we're working on is to improve our little preview there so that you could drag and drop widgets and actually have a design tool, so that even a designer could do the design without touching the Slint language. So that's our hope for the future. 
So that's the end of my presentation, I hope that you got, that it made you want to try Slint, and please do contact me if you have any questions or if you're wondering if you can use Slint, I'll be around, but please ask questions. |
Rust API Design Learnings
Lessons learned from building Rust libraries |
So, hello everyone, I'm happy to share some experience of mine with building Rust libraries. There's a lot that I went over to come up with the lessons that I actually feel are worth sharing, and it is much more than can fit into a talk, so I tried to get it down to the things that are most important to me. If we have some space at the end, we can dive deeper into some of the things that I will only cover very briefly. So who am I? My name is Armin Ronacher and I have been creating open source libraries for quite a while. I originally wrote a lot of Python code (I wrote the Flask microframework for Python) and I have been doing open source libraries probably since I was 17 or something, so this is many years of experience in some sense. I started using Rust in one form or another almost exactly 10 years ago, so I have played with this language for really long at this point, and I went through some iterations of what works and what doesn't work. Commercially, I now work for a company called Sentry. We are doing application monitoring and crash reporting, and we use Rust there, so I'm going to share the learnings from this in two ways. One is open source libraries that are there for other people to use, and the other one is libraries that are to be used within the context of one specific company, because for some of the things that we create, the only users really are ourselves. In terms of open source libraries that feed into this talk, I'm taking a little bit of code from my insta snapshot testing library, my MiniJinja template engine and a few others like my console library. Okay, so this is a quote that I read a couple of months ago, and it's not so important, but it really bothered me. 
It was a comment to an open source library written in Rust that more or less made a statement that defaults shouldn't exist and people should think really, really hard to spend the mental capacity of understanding what they should be doing, picking defaults blindly is a mistake and I don't subscribe to this at all because I believe that it's the responsibility of a library author to make sure that the defaults are the best that you can possibly take and the way I would think of this is you can think of creating a library or an API as if you're building a business. You have some sort of success criteria and you can measure yourself against this success criteria and I don't think the success criteria should ever be to confuse your user as much as possible. It should be that you measure how successful your customers are in using the API as you designed it, so if they do something that is not ideal, there's something you can fix and the second thing is then they will typically use your API to build a product themselves, to solve a problem themselves and the quality of the output that the user then produces in some way is also something that you have an impact on. So you can kind of think of it like the percentage of users that are making the wrong or right choices is the dial that you can sort of try to dial. The problem obviously is how the hell do you measure this? And it's really, really hard because most of the things that you can actually measure, they don't really show up in useful things, so you can define the success criteria and I think a lot of people do this in some way subconsciously, but they absolutely have no way to measure it. I don't have a way to measure most of those things. So the only thing you can do in the beginning is measure yourself, so you treat yourself as the first user, horrible sample size, but you can sort of figure out like as you are using your own stuff, does it do what you think it does? 
And then every time you sort of hate the experience, at least write down like it didn't like this. So it's a good start, but the problem is you're really flying blind, there's no good way to measure this and this is not unique to Rust, this is the fact most of the time it works like this. But in terms of API design, I think we have learned in other environments that there's actually a lot of stuff to measure, so if you were to create an HTTP library, a lot of companies are trying to figure out how often do the users that hit the APIs do the things that they don't want them to do, for instance, because they create bad load patterns or because they just generally hit the API in ways that is not efficient and they're trying to figure out like how do we optimize so that the user actually does the thing that you want. And this works if you run a service because you can see what everybody is doing. You can't really see anything that the developer is doing in your Rust library. The only numbers that we have for download statistics which are really pointless because they're heavily skewed towards libraries that are used by other libraries, so you can build the most amazing library ever to build an application, but if there's only ever application developers that are going to download this library, you're never going to accumulate a lot of download statistics because most of the download statistics come from either CI and very few people actually download the code to run on a desktop. So if you have 100 developers, they're going to pull this once each usually and then it's cached on the machine. The download numbers come from CI, so it's only this dependencies of dependencies that are actually ending up driving those numbers, and so they can be demotivating in some sense and they're definitely not that useful, so you can't really track anything sensible in terms of adoption just by the download numbers. 
So the first feedback you're probably going to get is some form of frustration, so it's usually bug reports or it's someone internally telling you that this thing is really inconvenient to use. In some ways, you have to prompt them to actually tell you that. You can use surveys and interviews to figure out whether people like this, but you usually don't get people to reply to surveys. So one other way to solicit feedback out of people is that you take the issue request, ignore it almost entirely, and go back to why they submitted it in the first place, and you try to ask that question, because quite often when they submit a report, they're already down a really weird path anyway. Maybe you can take them to a higher level and figure out why they even tried to do this. I mean, this applies when they file a feature request; for a bug report, less so. So as mentioned, it's really hard to measure, and because it's really hard to measure, a second thing that you can do is try to figure out: okay, if you think this is worth measuring, what is it that this number actually represents? This is typically some sort of values, and so these are the ones that I find are important to me. They're probably not necessarily important for everybody else, but I felt like over the years I can get behind those. So the first one is that I think the hello world of your library should be relatively concise, so it should be easy enough to get started. To some degree this is simplifying onboarding, but it's also that the less stuff there is, the easier it is to maintain the whole thing overall; I'll get into this a little bit. This also leads towards good defaults: ideally you don't have to copy-paste five lines of code if a single line of code does. The smaller the surface area of a library is at the end of the day, the easier you have it in terms of maintenance and also in the ability to make modifications over time. 
Because the more API you expose, the higher the chance that you're going to break something, and I don't like breaking things. I think backwards compatibility is really important. I hate the idea of unnecessary churn. I've been part of the Python 2 to Python 3 migration thing, and it was horrible, and it was particularly horrible because a lot of people that really liked the language, myself included, got stuck on Python 2 for a really long time, because we used it in a commercial context, we used it for large projects, and those were the ones that moved the slowest. You had a lot of power users stuck on an old version that actually used it in quite an extensive context, and it took us a couple of years to move. While we couldn't move, or could only play with it in small experiments, the language kept innovating, and all those innovations could not be used by the people that previously were really engaged. I don't like the idea of leaving people behind, so the easier the migration path forward is, the fewer people are going to be left behind on old versions. In a way, my goal is to keep people on the golden path. What's the golden path? It's the idea that you have an idea of how people should be building stuff. Typically, this is a thing that you do in system design: you plot a path for someone to execute along, this is the blessed way of doing things, and you try to get as many people as possible on this golden path. But libraries have the same thing. If you want to build a security library, for instance, like an encryption standard library, there are many ways to build the wrong one, where you give people so many choices that they have no sensible way of being sure that they are on the one that actually does the right thing. But the problem with the golden path is that it will change over time. What is correct today might not necessarily be correct in two years. 
To go back to the security thing, at one point we recommended MD5 to everybody, and then that was not a good idea anymore, so at the very least it should be SHA-1 hashes, and now obviously we don't want that anymore. The problem with this is that a lot of things change over time, and so if you create, for instance, a library that wants to make a choice for a user, but it's designed in a way that that choice can never be changed for fear of breaking code, then you did something wrong in the design already. Some change requires adjustment by users, and that's really hard to do, so if you have some sort of golden path, you can figure out how many people are on that versus the thing that you don't want them to do anymore. Again, measuring is almost impossible, but one way you can do that, if you are building a library that you think other libraries are going to use, is to use something like GitHub code search to figure out how many people still use the deprecated pattern. You can also leverage Dependabot quite a bit to figure out how many of the people that are getting a Dependabot update are going through with the update versus not. Okay, so defaults matter. As mentioned earlier, I'm a strong believer that there should be good defaults, and I think defaults really come in two kinds. One is the absolute default. You're never going to change the default of a u32 integer to be anything other than zero in Rust. There wouldn't even be a discussion about changing it to one, right? It's an obvious default, and so much depends on it that you will just never be able to change it. But then there are defaults that are intentionally designed so that they can change over time. For me, a good example here is that the hash table in Rust, or any hashing function in Rust, doesn't say what hashing algorithm it uses. 
It defines some sort of properties around it, within the boundaries of which it probably doesn't change, but the Rust developers could come in from one day to another and switch from SipHash to FxHash, and most of us wouldn't notice, right? So to enable these sorts of defaults, you have to design the system around it so that these can be changed, and you have to communicate to users how you're going to change them. There have to be some rules and expectations around what stability means here, but you should try to aim for being able to actually make this change, because otherwise you end up in a situation where you have created an API and now you need to make a second one because the old one just doesn't work anymore, and you either have to take it away, which is going to be frustrating for users, or you have to lie about what it does. There are a lot of liars out there that say, hey, my function does SipHash, and then in reality it just doesn't do that anymore, because they wanted people to move up but were afraid of breaking things. So why does this work in Rust, for instance, with the hasher? Because they said, okay, this is never going to be portable, and I think this is a really interesting part, because it also enforces that, without saying what it is or how good it is. What I mean by this is that the hasher in Rust randomizes all the time. If you even try to use it in a portable way, you're going to quickly figure out that every time you re-run your program, the hash is different, so you cannot even get into this mode of trying to use it for something that's portable. And there's a really good analogy for this: every once in a while people build programming languages, and I think Go went down this very same path, where they said, we're going to build a garbage collector, and it doesn't compact right away, but it's eventually going to be a compacting garbage collector. 
And the problem is if you don't start out with actually writing a compacting garbage collector. A compacting garbage collector takes an object behind a pointer and moves it somewhere else, so as to compress down the heap space. But if you never compact to begin with, people get used to this idea that the pointer is really stable, and they will stash the pointer somewhere, and they rely on the fact that it never moves. And so at a later point, three years in, someone says, oh, now we built this awesome compacting garbage collector, but they can never turn it on, because people took advantage of the fact that there was a convenience in the API that wasn't supposed to be there, but it wasn't enforced. And so what the Rust hasher does in this sense is that it always randomizes, so it already makes it so uncomfortable to use for the wrong thing that you wouldn't use it. And if you were, for instance, to build another language with a desire to eventually build a compacting garbage collector into it, you would probably try to make it so uncomfortable to stash these pointers away that it already wouldn't work even in the absence of one. So I think it's really important, if you want to be able to change the defaults, that you start from the beginning thinking: OK, how do I build it so that I actually keep this freedom? And why do I even want to change this? Well, because you have this problem of cargo-culting. I have created things in the past myself where I discovered that people copy-paste the first example from the documentation over and over and over again. And then you have this first example from the documentation in millions of repositories. And for me, the weirdest one is that one of the frameworks that I built actually ended up in a university course as Programming 101. 
And universities are sort of an amazing catalyst for this, because you end up with all of those students putting the repositories of their first courses on GitHub. And then you see the madness you have created, multiplied through every single student that onboards. And it's a real lesson. So I respond really negatively to the idea behind this code example, which I know is hard to read, but it basically would be the equivalent of not picking a default hasher. It would be like: well, we think that Rust is a hard programming language. It's supposed to be like C++, you should think. So you pick the hasher yourself, and then you just put it in there. And then I swear to you, this would be the first code example, even if the documentation says you should pick FxHasher for this type of hash map and you should use SipHasher for this other hash map. Nobody would read it. They take the first code example and copy-paste it in, right? But now we discover that maybe SipHasher actually is better than FxHasher, but we cannot change everybody's code all at once, because everybody has copy-pasted the wrong thing. And what I know happens is that every once in a while, someone comes along and says: okay, we know that FxHasher is really popular, so we're going to change it behind the scenes to actually be a different hasher, because everybody copy-pasted this thing everywhere. It's incredibly common that you find an API that says: I'm named really misleadingly, because this used to be my implementation, but now I'm something else entirely, right? So I hate this cargo-culting, and I don't think it's good, but I understand why the explicit choice is really appealing, because it's flexibility. 
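To make the contrast concrete, here is a minimal sketch using only the standard library. The function and type names (hash_with, word_counts_default, word_counts_pinned, PinnedState) are made up for this illustration and are not from the talk's slide. The first map relies on the default hasher and never names an algorithm; the second spells out a hasher in its type, which is exactly the kind of choice that gets copy-pasted and can then no longer be changed centrally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hash one string with whatever hasher the given builder produces.
pub fn hash_with<S: BuildHasher>(state: &S, value: &str) -> u64 {
    let mut hasher = state.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Relies on the default: the caller never names an algorithm, so the
/// standard library keeps the freedom to swap it. The default state is
/// randomly seeded, so hash values are not stable across program runs.
pub fn word_counts_default(words: &[&str]) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in words {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical "pick it yourself" variant: the hasher is now part of the
/// map's type, and of every signature that mentions this map.
pub type PinnedState = BuildHasherDefault<DefaultHasher>;

pub fn word_counts_pinned(words: &[&str]) -> HashMap<String, usize, PinnedState> {
    let mut counts = HashMap::with_hasher(PinnedState::default());
    for word in words {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}
```

Both functions count words the same way; the only difference is how much of the hashing decision leaks into the caller's code.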
I actually made the wrong code example here on the slide, and I was too lazy to change it afterwards, but just imagine for a second that this code example wouldn't return a string, but actually returns an enum with a variant called Sha256. The idea here is more or less this: if you create a library that does something for which there might be a different implementation coming along in the future, what you probably want to do is not just return the output, but annotate the output with what kind of thing it is. So if you have a library that is supposed to validate files on the file system, and it does that by calculating a checksum, it's much better for that library not to return just the bytes of the checksum, but to return the bytes of the checksum and the information "by the way, I'm SHA-256". Why? Mostly because it forces the developer to recognize that this thing might change, right? It might not be SHA-256 forever. And even if you never plan on actually changing this at all, being explicit still forces something in the mind of the developer to recognize that this is probably going to change. And I know of a version control system that very famously picked one SHA hash and is completely incapable of changing it now, not because it's technically impossible (they've changed everything), but because there's an ecosystem of stuff that assumed it would never change, so this change cannot be pushed through, right? They have the one version and the other version, and they are incompatible with each other, as an example of this. I know this is very security-heavy right now, but the idea is the same: every once in a while there's a reason for you to move up to a different algorithm or something of that nature, and if you haven't created an API in the beginning that hammers down this point that it might change, people will not necessarily think about this. 
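As a rough sketch of the idea of annotating the output with its algorithm: the enum below tags the digest with the algorithm that produced it. To stay within the standard library it uses a toy 64-bit FNV-1a hash as a stand-in for SHA-256, and all names (Checksum, checksum, verify) are hypothetical, not taken from the slide.

```rust
/// A digest tagged with the algorithm that produced it.
/// `#[non_exhaustive]` makes code in other crates keep a wildcard match arm,
/// so adding a variant later is not a breaking change. (Inside the defining
/// crate the attribute has no effect.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum Checksum {
    /// Toy 64-bit FNV-1a digest; an assumption of this sketch, standing in
    /// for a real cryptographic hash such as SHA-256.
    Fnv1a64(u64),
}

/// Compute the checksum using whatever the library currently prefers.
pub fn checksum(data: &[u8]) -> Checksum {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    Checksum::Fnv1a64(hash)
}

/// Verify data against a stored checksum, dispatching on the recorded
/// algorithm rather than assuming it is the current default.
pub fn verify(data: &[u8], expected: Checksum) -> bool {
    match expected {
        Checksum::Fnv1a64(_) => checksum(data) == expected,
    }
}
```

Because the caller has to handle an enum rather than raw bytes, the possibility of a second algorithm is visible in the type from day one.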
Less is more. I think it's pretty obvious: the larger the API surface, the more stuff you will have to make sure doesn't break if you change internals. Internal APIs in particular are incredibly leaky, accidentally or intentionally, and whenever you want to change one, you have to ask yourself whether someone is actually using this. A lot of people start out building libraries where every single module, every single function is just pub, and every version breaks everything, because you usually don't have the time and the capabilities to figure out if someone actually uses all of those functions, so you just break them all. Or you spend all this extra time on making sure that they never break, which is impossible. So the goal is a small API surface. And if you go for a small API surface, there are some tricks in Rust you can use to reduce the API surface, but not really. For instance, we have some really common abstractions like Into<T> or AsRef<T>, which can be used to be generic over something. It saves you from having to build a function four times. You might want to open a file by string or you might want to open a file by path. Just write one function taking AsRef<Path>, and it's possible to pass it any of those compatible types. Now, it's kind of cheating, because it's really still four different APIs, but they're hidden behind a standard construct that the developer understands. So this is an example from MiniJinja. Both name and value are generic. One is converting into a copy-on-write string (a Cow), so you can either pass it an owned String or you can pass a static string reference. Both of those will be fine. So from an API point of view, it behaves as you would expect. The first thing is a string argument; you cannot get it wrong. And the second thing is whatever converts into my internal value type, for which there are a bunch of conversions. 
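The shape of such a function can be sketched as follows. This is not MiniJinja's actual code: Environment, Value, add_global and get_global are simplified stand-ins written for this illustration, with only a handful of conversions.

```rust
use std::borrow::Cow;
use std::collections::BTreeMap;

/// Simplified stand-in for a template engine's internal value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Int(i64),
    Str(String),
}

impl From<bool> for Value {
    fn from(v: bool) -> Self { Value::Bool(v) }
}
impl From<i64> for Value {
    fn from(v: i64) -> Self { Value::Int(v) }
}
impl From<&str> for Value {
    fn from(v: &str) -> Self { Value::Str(v.to_string()) }
}
impl From<String> for Value {
    fn from(v: String) -> Self { Value::Str(v) }
}

#[derive(Default)]
pub struct Environment {
    globals: BTreeMap<Cow<'static, str>, Value>,
}

impl Environment {
    /// One function instead of one per argument type: the name may be a
    /// `&'static str` or an owned `String`, the value anything that
    /// converts into `Value`.
    pub fn add_global<N, V>(&mut self, name: N, value: V)
    where
        N: Into<Cow<'static, str>>,
        V: Into<Value>,
    {
        self.globals.insert(name.into(), value.into());
    }

    pub fn get_global(&self, name: &str) -> Option<&Value> {
        self.globals.get(name)
    }
}
```

A caller can then write env.add_global("debug", true) or env.add_global(format!("user_{}", 1), "alice") against the same single method.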
I could also have made seven different add_whatever functions, one for every single data type that exists, but this is a simpler way to do it. And I could also have said, hey, you have to pass the value directly already, but since I need to provide a way for the user to convert into this value type anyway, I might as well use this. So the API now very consistently uses Into<Value> throughout. There's only one function, and it hides all of those different variations of actual API that still exist behind this Into thing. AsRef is sort of the same. I actually hate AsRef because it makes signatures really unreadable. Instead of your function taking a path by reference, it takes a P, and then the P is hidden behind AsRef<Path>. And then before using it, very often I call as_ref().to_path_buf(), which is just weird, but this is a really common pattern. So I don't like it, but it's so standardized in the ecosystem that people have come to expect it. And because we all have been willing to live with it, I guess this is enough of a reason to keep doing it. If you have this sort of API that uses a lot of generics to do type conversions, you will immediately blow up your compile times. This is the standard trick to avoid that: you keep the generic part in a tiny, tiny function and call the non-generic part from it. In this case, the conversion goes to this value type. I know it's really hard to see on this projector, but basically the function render takes an object that can be serialized with Serde, and then it calls Value::from_serializable, which takes this object and converts it into a Value, which is the non-generic type that I use all over the place. And then the functions _render and _eval will not be generic. So they have one instance, whereas the other one will have one instance per type actually passed. What does it mean? 
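The generic-shim trick described here can be sketched with the standard library alone. The function names below (extension_upper and its private worker _extension_upper) are invented for this example; the point is only the shape: the public function is generic and does nothing but convert, and the real work lives in a single non-generic function.

```rust
use std::path::Path;

/// Public, generic entry point. The compiler emits one copy of this per
/// caller type (&str, String, PathBuf, ...), but each copy is only a
/// conversion followed by a call.
pub fn extension_upper<P: AsRef<Path>>(path: P) -> Option<String> {
    _extension_upper(path.as_ref())
}

/// Private, non-generic worker: compiled exactly once, no matter how many
/// different argument types callers use with the shim above.
fn _extension_upper(path: &Path) -> Option<String> {
    let ext = path.extension()?.to_str()?;
    Some(ext.to_uppercase())
}
```

In a real library the worker would be the large function (the template evaluation in the talk's case), which is where avoiding duplicate instantiations pays off.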
If you don't do this, if this was all one function, then last time I checked, the MiniJinja compile times explode to more than double or three times, just because of this one function. Why? Because it's very common that you pass different types to it, and the Rust compiler thinks that inlining is awesome, and it inlines this VM eval function there, which is, I think, like 15 kilobytes or something, and so it makes a duplicate of those 16 kilobytes for every single type that you put in. In this case, I only have the 16 kilobytes once, right? So it's a serious amount of code that you save with this, and it's probably the best way to save compile times for a lot of code. There are actually some tools that you can use these days to figure out why these functions blow up that much. But if you use generics, particularly with AsRef and Into, it's worth looking for. So I mentioned earlier that I like to keep the API surface as small as possible. That doesn't mean that I don't actually build an onion of APIs behind the scenes. I like this onion concept: APIs are layered. But the user only gets the outermost layer; the inner layers are for my own enjoyment. You keep these inner layers, and you retain the flexibility to change them, but it's still kind of clean. And then over time, if that library becomes the most stable thing ever, you can actually start exposing the next layers of the onion. A good example for this, I think, is Rust compiler plugins. The Rust compiler, like many other systems, wants you to be able to write a custom plugin for it, like the proc macros. But the problem is that this exposes a lot of the internal machinery of the compiler, which the compiler authors don't want to be stable. 
So what they did is they actually took the internals of that thing and exposed it through a secondary library, which I think was syn, and I think syn is the one on top, but maybe it was proc-macro2; in any case, it was exposed as a secondary library. It actually used the same code, but it had different stability guarantees, and then they used a conversion system, which was basically serialize to string and deserialize from string, to bridge between the two. So if you feel like, okay, I actually want to eventually expose this other layer, and I want this innermost functionality to be available to customers, you can think of exposing it as a completely independent library for experimentation, and then eventually, when it stabilizes, you can give the customer another onion layer. But again, once you have exposed that, your stability guarantees are going to be much harder to uphold. So just a simple example: internally in my templating engine, I have a lot of abstractions like the compiled template, which is something that I know would be awesome to use externally, but I don't want to give it to you, and then there are the code generator and the parser, which people have already been asking how to get access to. You're not going to get it for now. But I keep the abstraction internally because it's much more fun this way. As I mentioned, I like to have as little API surface as possible, and by the way, I fail at this. You will notice that when I start a library, it usually has way more API than after three iterations, when I have taken things away. But I basically try to keep it flat, so I expose all of the things I want you to use at the top level, and then my internal crate structure is however I feel like laying those things out. 
If I do have so much code in the library that I feel like there need to be sub-modules, I make them up on the spot. In the insta snapshot testing library that I wrote, I have a lot of internal types that you actually want to use as a user every once in a while, and they come from all different locations in the code base. So what I did is that in my lib.rs, I have a pub mod internals, which is documented (those are internals you're allowed to use), but I just pull them from wherever they're defined. So this module doesn't exist as an internals.rs file. It's purely a thing that sits in my lib.rs to organize the structure. And that way, I have the freedom to move these things around as I want. Related: every once in a while, you have to export an API that has to be public for reasons that mostly sit in the restrictions of the language. But I really don't want you to use it, so I just hide it. So #[doc(hidden)]: you will find a lot of Rust libraries which have a lot more public API, but you're not supposed to use it. It's usually prefixed with a double underscore, and it's just hidden. Why does it exist? On the one hand, because macros need to access functionality. The other reason is that you might have dependencies between crates. So insta, for instance, has to expose internals out of itself to the test runner called cargo-insta. And the way I do this is that I have a hidden feature flag which says: if you're cargo-insta, you can use this feature flag. Everybody else, just please don't. But if you do still use that feature flag, you're still not going to see all of this API, because it's kind of hidden away. This also helps with docs.rs, for instance, which is an awesome tool to document your crates. A lot of people run docs.rs with all feature flags so that everything is fully documented. If I didn't hide all of those things, then this internal feature flag that turns on this API would accidentally document it. I don't want that. 
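A minimal single-file sketch of that layout is shown below. The module and item names (settings, content, Settings, normalize, __macro_support) are invented for the illustration and are not insta's real ones. The private modules can be reorganized freely; only the inline internals facade and the hidden helper are reachable from outside.

```rust
// Private implementation modules: free to be renamed, split or merged.
mod settings {
    #[derive(Debug, Default, Clone, PartialEq)]
    pub struct Settings {
        pub sort_keys: bool,
    }
}

mod content {
    /// Normalize line endings and strip trailing whitespace.
    pub fn normalize(text: &str) -> String {
        text.trim_end().replace("\r\n", "\n")
    }
}

/// Internals you are allowed to use.
///
/// This module is declared inline and only re-exports items, so there is
/// no internals.rs file behind it.
pub mod internals {
    pub use super::content::normalize;
    pub use super::settings::Settings;
}

/// Public only because macros or a companion tool need to reach it.
/// Hidden from the rendered docs and not covered by stability promises.
#[doc(hidden)]
pub fn __macro_support(value: &str) -> String {
    internals::normalize(value)
}
```

Users import from internals; if normalize later moves to a different private module, only the re-export line changes.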
So even the APIs that are sort of there for me, I will still hide from documentation. Cool, traits. I have a really complicated relationship with them, because I think they're awesome, but I also kind of don't want them to be in APIs. I think they fall into two categories. Open traits are the most common ones: AsRef, Into, everybody can implement them. And then there are a lot of cases where you might have traits that only exist as an internal abstraction. Let's call them sealed traits. They come from a crate, they're only implemented in that crate, and nobody else should ever implement them. Why do they exist? Because, for instance, you want to box something up, or because you want to have really nice documentation where you can say, my crate accepts these types of things. I wrote a Rust Redis library a couple of years ago where I made a conversion trait so that you can pass any Rust type that is compatible with Redis into all of those functions. And I didn't seal it off. And now it's public interface, and people have been implementing it for their own types. So now it's just there. You can't take it away anymore. Which is fine for that library, but I didn't want to do it in other cases, because to me that trait mostly only existed for documentation purposes. So you can seal them in two ways. One is sort of the soft seal, which is to just hide the functions that you're supposed to implement. Then it's unclear how you're actually going to implement it. In this case, it's soft sealed, so you can still call it and you can actually implement it. I'm not sure if I like it or not, but there are some use cases for why you might want to do this. Mostly you have to do it with code generation.
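A sketch of the two sealing styles, with hypothetical trait and type names. The first trait is the soft seal described above (the required method is merely hidden from the docs). The second shows a stricter variant, assuming the common technique of a method that takes a type living in a private module, which other crates cannot name.

```rust
// Hypothetical names throughout; a sketch, not any particular library's API.

/// Soft seal: the required method is hidden from the docs, so it is unclear
/// how to implement the trait, but nothing stops a determined user.
pub trait ToArg {
    #[doc(hidden)]
    fn __write_arg(&self, out: &mut Vec<String>);
}

impl ToArg for i64 {
    fn __write_arg(&self, out: &mut Vec<String>) {
        out.push(self.to_string());
    }
}

impl ToArg for &str {
    fn __write_arg(&self, out: &mut Vec<String>) {
        out.push((*self).to_owned());
    }
}

mod private {
    /// Zero-sized token. The module is private, so code in other crates
    /// has no path by which to name this type.
    pub struct Token;
}

/// Stricter seal: the method takes a type that outside code cannot name,
/// so it can neither be called nor implemented from another crate.
pub trait Key {
    #[doc(hidden)]
    fn __key(&self, _: private::Token) -> String;
}

impl Key for u32 {
    fn __key(&self, _: private::Token) -> String {
        format!("k{}", self)
    }
}

/// Public entry points; only this crate can construct the token.
pub fn lookup<K: Key>(key: K) -> String {
    key.__key(private::Token)
}

pub fn collect_args(args: &[&dyn ToArg]) -> Vec<String> {
    let mut out = Vec::new();
    for arg in args {
        arg.__write_arg(&mut out);
    }
    out
}
```

Both traits still show up in the documentation, so a signature such as `lookup<K: Key>` tells the reader which types are accepted without inviting new implementations.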
But there's also the way of doing a full seal, where you put a private zero-sized type into an argument, and because outside code cannot name the type, since it's never exposed, it can absolutely never call this. And the funny thing is that the Rust standard library actually depends on this pattern a lot. If you use the Any type and the related downcasting feature, it uses an internal system called type_id to return a unique number for a type. If you were able to override this and lie about your own type, you could cause unsafe memory access, right? And so they have a seal marker in there that you can't implement, and the default implementation is the only valid implementation. So it's a pattern. It's just not a pattern that has syntax support. It's just used that way. For now; maybe it will eventually get one. So why do I not like traits that much? Because I find them incredibly hard to discover. It's very annoying if you import a type from a library, you want to call a method on it, and the method is not there until you bring the trait into scope. And I know that a lot of people write these awesome preludes, where it's like, import this from my library, and now, by the way, you have 20 types sitting around in your scope. It works, but it has a whole bunch of problems. One of them is that it makes your error messages a lot more confusing, because there are traits in the standard library where, if they are in scope, types with inherent implementations will no longer function the same. There's a trait called, I think, Borrow. If you bring Borrow into your scope, you can no longer borrow from, I think, a RefCell. It will error because the trait shadows the inherent implementation. So that's not great, and there's a bug report from like seven years ago, I think, but how do you change it now?
It's like, that's the behavior. But that's the problem you have with preludes: resolution is just based on name, and the compiler will say, well, there is no implementation by that name, or, it doesn't help you that there's an inherent method on this thing, I'm not going to let you call it. You have to disambiguate your call. So there are some problems that come with the trait stuff. They're really useful, but you maybe shouldn't overdo them, I think, to some degree. It's really nice if the type already has a neat little behavior out of the box. And if you want to abstract over multiple things, you can add the traits later anyway. There are, however, traits you should probably implement all the time. My favorite one is Debug. It should be on all your public types. If you feel like it's blowing up your code generation times too much, you can hide it behind an extra feature flag that you turn on on demand. I'll give you a very good reason why Debug should be on all your types. Every once in a while, you have to be generic over something, and then you will usually say, I'm generic over Serialize, or I'm generic over my really cool thing that you should be calling. And the problem is that when you're generic over Serialize only, and you're deep down in some nested generic-over-Serialize code and you want to debug print, you can no longer do that, because Serialize does not imply Debug. So even if a type is capable of debug printing, because Debug is not a trait bound on the thing, you cannot call it there, which is very, very annoying in a lot of code setups. And so what I do these days is make Debug an implied bound, so a supertrait, on a lot of the things that I expect to be very deep in call chains, so that I can actually debug the whole thing. And the problem is that this requires the ecosystem to embrace the idea that almost everything is debuggable. So just make it Debug.
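A small sketch of the supertrait idea, using a hypothetical `Value` trait in place of Serialize. Because `Debug` is a supertrait, generic code that only knows `T: Value` can still debug-print its argument.

```rust
use std::fmt::Debug;

/// Hypothetical trait. `Debug` as a supertrait means every implementor
/// is guaranteed to be debug-printable.
pub trait Value: Debug {
    fn render(&self) -> String;
}

#[derive(Debug)]
pub struct User {
    pub name: String,
}

impl Value for User {
    fn render(&self) -> String {
        self.name.clone()
    }
}

/// Deep in generic code we only know `T: Value`, yet `{:?}` still works
/// because the supertrait bound is implied.
pub fn describe<T: Value>(value: &T) -> String {
    format!("render={} debug={:?}", value.render(), value)
}
```

Without the `: Debug` supertrait, `describe` would need an extra `T: Debug` bound, and so would every generic function above it in the call chain.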
And if you really think it's blowing up compile times, just put a feature flag in that you can turn on when you need it. But not being able to debug very deep down in generic code land is just an unnecessary frustration you can avoid. Display is almost like Debug; it just converts the thing to a string. It's obviously useful for a bunch of cases, but you can definitely overdo it. I would much rather have the compiler yell at me that I should tell it how to format the thing than have the type itself just be formattable. But there are obviously a lot of types for which it's necessary. If you have a UUID, you would expect that it stringifies. If you have an error, errors have to be stringified, so you can't really opt out of it anyway. But I feel that for a lot of types, the actual way to convert them into a string is a really hard-to-discover API. Like yes, of course, a to_string method then appears, but you don't know if this is a thing that you're supposed to look at or if it's just a debug format with a different output. I've seen both. There are some types where you can call to_string, but it's really not the real string form, you have to pick a different one, and it's just an alias for debug print; and sometimes it's the real deal. So this is a very confusing trait, I think. Copy and Clone are obviously the most common traits that you have to deal with. Once granted, they're impossible to take away, I would say, because people start expecting that they can clone the thing. And Copy is a really tricky one, because you can only uphold it for as long as the type has a certain kind of structure. Once you put something in there that's not Copy, there is no way to fake the copy. So if you make your type Copy and then later discover that that was a mistake, and you now need to hold an Arc, it's a breaking change, and I don't like breaking changes. So about Copy, you have to think: will you uphold this forever?
Clone, I think, is actually relatively fine to retrofit. If you discover over time that you really can't clone the thing anymore, you just put an Arc around it. A very common case I've noticed in the past: I have a type, it has Clone, and now it can no longer be cloned because it holds a boxed dyn function. It doesn't actually matter. Instead of a boxed dyn function, make it an Arc'd dyn function, and it works again. So for a lot of these classic cases of, hey, I can't clone this anymore because of some reason, you just Arc it and it's fine. And then the problem with Copy is also the inverse: you don't put Copy on something, and then people get really angry because they expect it to be Copy. The classic case is the range type in Rust. It isn't Copy, and nobody understands why. There's a good reason why it isn't Copy, but it's really annoying, and I think it's one of the most common frustrations in Rust that the range type is not Copy and you have to clone it all the time, judging by the responses on GitHub, I think. This next one I'll only briefly cover: I don't have recommendations on whether something should be Send or Sync, but I do think that it's totally fine to have objects that are not thread safe. And the reason for that is that you can often create an API where the user doesn't really have to share them. They can just keep them local and they're done. Why? Because, for instance, one of the things that I love to do is create this sort of session abstraction. The session abstraction means I want to deal with a bunch of objects for the scope of one function only.
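The "just Arc it" advice for callbacks, mentioned earlier in this passage, can be sketched as follows. `Filter` is a hypothetical type; the point is only that `Arc<dyn Fn>` is `Clone` where `Box<dyn Fn>` is not, so the derive keeps working.

```rust
use std::sync::Arc;

/// Hypothetical type holding a callback. With `Box<dyn Fn(..)>` the
/// `#[derive(Clone)]` below would not compile; `Arc` clones by bumping
/// a reference count and sharing the same closure.
#[derive(Clone)]
pub struct Filter {
    name: String,
    func: Arc<dyn Fn(&str) -> String + Send + Sync>,
}

impl Filter {
    pub fn new<F>(name: &str, func: F) -> Self
    where
        F: Fn(&str) -> String + Send + Sync + 'static,
    {
        Filter {
            name: name.to_owned(),
            func: Arc::new(func),
        }
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn apply(&self, input: &str) -> String {
        (self.func)(input)
    }
}
```

Cloning a `Filter` shares the closure rather than duplicating it, which is usually what callers expect from a callback holder.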
So I load everything into some sort of environment where I do some stuff with it. I have an object called the session; it tracks a whole bunch of data modifications, it gives me access to something, it's the only thing that ever needs to hold this data, and when I'm done with it, I tear down the session and my lifetime problem goes away. So for as long as you don't have to pass this over really complicated boundaries, just make a session abstraction, hold all the data in there that borrows temporarily, and then discard it. This example is from our symbolication library: when you want to deal with a debug file, you open up a debug session, you do a bunch of stuff, and you tear it down. And we have never, I think, had a case where we needed to hold on to it, where the lifetime was actually a problem. But you will notice there's a self_cell in there, which I think is probably one of the best things you can have. You create an API, and then you discover, I need to borrow into myself, and Rust doesn't do that, but self_cell lets you do that. There are a lot of crates that have been created over the years that let you borrow into yourself; this is my favorite. It basically gives you a type that says, here I hold some data, and by the way, I also hold a reference into myself, and it does so in a really convenient way. So if you have coded yourself into a hole where you feel like lifetimes are your enemy, this might dig you out of it. Yeah, last part: errors. I think they're important; they'd be a talk by themselves. You have to consider whether you panic or error.
My strong recommendation is to not panic, but if you do panic, use track_caller. You probably know what this is, but it basically removes this one function from the call stack, and then the panic message says, hey, your problem was where you called, in this case, pop on an empty stack, and not the completely pointless unwrap in there, because you want to know where you fucked up and not where the unwrap is, obviously. Errors are really important; again, they would be a talk by themselves. But my ask to everybody is: please implement std::error::Error on those thingies, because then you can sort of compose them. A lot of people still don't, and it's very, very annoying if they don't. Thank you. So otherwise, we can also do it in the hallway, but we might have time for one or two questions. Thank you very much for the talk. |
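A sketch of both closing recommendations from the talk above, using a hypothetical `Stack` type: a `#[track_caller]` method whose panic location points at the caller, and an error type that implements `std::error::Error` so it composes with `Box<dyn Error>` and the `?` operator.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for the stack example.
#[derive(Debug, PartialEq)]
pub struct EmptyStack;

impl fmt::Display for EmptyStack {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("pop on an empty stack")
    }
}

// Implementing `Error` is what lets this compose with other error types.
impl Error for EmptyStack {}

#[derive(Default)]
pub struct Stack {
    items: Vec<i32>,
}

impl Stack {
    pub fn push(&mut self, value: i32) {
        self.items.push(value);
    }

    /// Preferred: report the failure as a value.
    pub fn try_pop(&mut self) -> Result<i32, EmptyStack> {
        self.items.pop().ok_or(EmptyStack)
    }

    /// If you do panic: with `#[track_caller]` the reported location is
    /// the line that called `pop`, not the `expect` inside this method.
    #[track_caller]
    pub fn pop(&mut self) -> i32 {
        self.items.pop().expect("pop on an empty stack")
    }
}

/// Because `EmptyStack: Error`, `?` converts it into `Box<dyn Error>`.
pub fn pop_boxed(stack: &mut Stack) -> Result<i32, Box<dyn Error>> {
    Ok(stack.try_pop()?)
}
```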
A deep dive inside the Rust frontend for GCC |
Okay. Hello, everyone. Can you hear me okay? Good. Okay. My name is Arthur Cohen. I'm a compiler engineer at Embecosm, top left, and today I'm going to talk to you a little bit about Rust GCC. So, first of all, a little summary. We're going to get into what GCC is, because this is a Rust dev room. I assume at least some of you have never used GCC, which is good for you. It's good for your health. Then, what is Rust GCC? Why do we make it? Why are we working on it? Who was stupid enough to even think about re-implementing a Rust compiler from scratch? Then, how do we do that? I'm going to get into some of the steps of our compiler: our parser, our intermediate representations, and all of the extra fun Rust stuff that we have to handle, because it's a really complex language. Then I'd like to get into our workflow and the community. So, all of the contributors, how we work together, our merging process, GitHub, and all the interesting stuff that comes with it. And finally, some questions about the future. What are we going to do? What are our goals? When are we going to stop? So, what is GCC? GCC stands for the GNU Compiler Collection. It's a very, very big program that contains multiple compilers for multiple languages that all share the same back end, so the same assembly emission, optimizers, and so on. One fun thing about GCC is that it's very old. It's 30 years old, maybe more. It's written in C++11, so that's great. As I say, it's multiple languages in one. You've got a C compiler, a C++ compiler, a Fortran compiler, and so on, and we're trying to add Rust to it. And if you know a little bit about how rustc works, rustc is what's called a front end. It does its thing and then talks to LLVM to generate code. What's good about LLVM is that you can use it as a library. You cannot do that with GCC.
So, you have libgccjit, which is sort of an attempt at having a library for GCC, and which is quite recent, or you can do what Rust GCC does, which is to create the compiler in-tree. If you've been following the Rust-in-GCC story, you'll know that rustc_codegen_gcc, the project by Antoyo, actually uses libgccjit, and that's a way better idea than Rust GCC, but let's keep going. So, what is Rust GCC? It's a full implementation of Rust on top of the GNU toolchain. As I said earlier, this means that we're actually re-implementing the compiler from scratch. We started from sort of nothing and kept adding and adding stuff until, well, until today. The project was originally started in 2014. Just one quick bit: I think at the time libgccjit did not exist, so adding the compiler in-tree, to the GCC tree, was not as bad an idea as it sounds. And in 2014, if you know a bit about the history of Rust, there wasn't a stable version yet. Rust 1.0 was released in 2015. That meant that in 2014 there was a lot of churn within the language. If some of you were here at the beginning, you maybe remember the tilde pointer, the @ symbol that was used for a lot of stuff, the garbage collector, and so on. So eventually the project had to be dropped because, even though he was very, very into it, one developer just could not keep up. It was revived in 2019, thanks to multiple people: first of all Open Source Security and then Embecosm, which were the two companies sponsoring the project. It also receives contributions from many GCC and non-GCC developers. I'm going to talk about that a bit later, but we do have some people who have been working on GCC for a very long time helping us, and I'd like to thank them, but more on that later. So, why do we do that? The goal is to upstream it into mainline GCC. That means that whenever you boot your favorite Linux distribution and install GCC, you're going to have GCC Rust with it.
It's an alternative implementation of Rust. We hope that it helps draw out and drive the specification effort, and that we can help the rustc team figure out some places where the language isn't as clear as they'd like it to be. It reuses the GNU toolchain, so GNU ld, GNU as, GDB, but it does reuse the official Rust standard library, so libcore, libstd, and so on. And because of the way GCC is architected, once you get to that common GCC back end and common GCC intermediate representation, you can basically reuse all of the plugins that have ever been written for GCC. And that means a lot of plugins: security plugins, stuff like the static analyzer, which you might have heard about, LTO, which is not really a plugin but which we can make use of, CFI security plugins, and so on. We also hope that because we're writing it in C++, we can backport it to previous versions of GCC. And hopefully that will help some systems get Rust. And then, because GCC, as I said, is much older than LLVM, it has support for more architectures and more targets than LLVM. I mean, it had. Now you guys have the M1 Mac, and we're still far behind on that. So technically, thanks to gccrs, you will now be able to run Rust on your favorite Soviet satellite and so on. There's a link for that. The slides are on the talk's page, and there's a lot of frequently asked questions. So that's sort of the milestone table that we put together in each and every one of our weekly and monthly reports. And the takeaway here is that the effort has been ongoing since 2020, and even a little bit before, and we've put in a lot of effort and made a lot of progress. Right now, we're around there. So we have upstreamed the first version of GCC Rust into GCC. So next time, when you install GCC 13, so sorry for the people on Ubuntu, that's in like 10 years, but next time you update GCC, you'll have gccrs in it.
You can use it, you can start hacking on it, and please report issues when it inevitably crashes and dies horribly. And yeah, we're sending more and more patches upstream and getting more and more of our compiler, whose development happens on GitHub, into GCC. So currently, what we're working on: we have a base for const generics. I'm not going to get into details on that; it's just a cool feature of Rust that's not present in a lot of languages except C++, and we're getting it working. We're working hard on intrinsics. Those are functions declared in the standard library but implemented by the compiler. They are very LLVM dependent, and we're running into some issues doing the translation. One big thing we're doing is some work towards running the rustc test suite. Because we want gccrs to be an actual Rust compiler, and not a toy project or something that compiles a language that looks like Rust but isn't Rust, we're trying really hard to get that test suite working, and we're almost, I think, done with compiling an earlier version of libcore, 1.49, which was released a few years ago. So, a quick overview of our pipeline. If you don't know anything about compilers, that's fine. What a Rust compiler does first is a parsing step. You take the Rust code and you turn it into a data structure, which is sort of a tree, called an abstract syntax tree, or AST. Then we run an expansion on that: any time we see a macro, we expand it and replace it with its expansion. Then name resolution, which basically means taking any use of a name and linking it to its definition, and so on. We do some more transformations on that AST and then finally type check it.
And then we can do a lot of error verification and linting, so stuff like the warnings you get when you have an unused value, which you can prefix with an underscore, for example. Finally, when that's done, we lower it to the GCC intermediate representation. That's similar to the last step of rustc, where the code gets lowered to LLVM IR. So as I said, we have an AST. We also have an HIR. The advantage of having these two high-level data structures to represent Rust code is that we can desugar the AST, so remove the syntactic sugar that you have in Rust source code, to get a simpler representation within the compiler. One example is the difference, as you know, between methods and function calls: you've got something like self.method. But within the compiler, it doesn't make any difference. A method is just a function called with an extra argument. So that's how we represent them in the HIR, and we do other transformations, such as removing macros, because at this point they've already been expanded and we don't care about them anymore. And finally, as I said, the last intermediate representation is called GENERIC. And it's not generic at all. It's just the name, and it's the GCC intermediate representation. So one thing I'd like to get into is macro expansion. And the reason I want to get into that is because, I mean, I wrote most of it in gccrs. So I'm the one you have to blame if it stops working when you try gccrs. As you know, macros in Rust are typed. You can have expressions, statements, paths, and so on. And someone has to do that checking. That's part of the macro expansion step. And as I said, macros are sort of like function calls. You just expand them, you paste in the AST that was generated, and you're done. Except that in Rust you've got repetitions in your macros. And that's extremely annoying to take care of.
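The remark above that a method is just a function called with an extra argument can be seen directly in surface Rust. The sketch below uses a hypothetical `Counter` type and only illustrates the language-level equivalence; it does not show how gccrs represents calls in its HIR.

```rust
pub struct Counter {
    value: i32,
}

impl Counter {
    pub fn new(value: i32) -> Self {
        Counter { value }
    }

    pub fn add(&self, n: i32) -> i32 {
        self.value + n
    }
}

/// Method-call syntax: the receiver is written before the dot.
pub fn method_syntax(c: &Counter) -> i32 {
    c.add(2)
}

/// The same call written as a plain function call, with the receiver
/// passed explicitly as the first argument.
pub fn function_syntax(c: &Counter) -> i32 {
    Counter::add(c, 2)
}
```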
So repetitions: if you've ever written them, they're unreadable, but they're very useful. You have these operators, the Kleene star, the question mark, and the plus sign, which allow you to specify that you want between zero and infinitely many of something, at most one of something, or one or more of something. And because Rust is a very well thought out language, it has ambiguity restrictions to make sure that no matter how the language evolves, your macro is not suddenly going to become ambiguous. And so again, someone has to do that checking and make sure that your macro is not ambiguous. So that's me. So here, this is probably a very basic macro that you've maybe written or used. It's a macro that does an addition and takes any number of arguments. You can see in green I've highlighted the repetition operator marker thingy. And yeah, this basically expands to e plus the addition of the rest of the expressions. Okay. So this one is a macro to make tuples. You give it a list of arguments on the left and a list of arguments on the right, and it makes a list of tuples. The thing I'd like to point out here is that whenever you don't have the same number of arguments, if you're merging repetitions together, well, it's going to go bad, and you have to check for that. And again, on really complex macros, making sure that your merged fragments actually have the same number of repetitions and so on gets very hard and very tedious. Rust macros are sort of a language within the language that needs to be taken care of. And here's just one last example of how fun Rust macros are, for the ambiguity restrictions. For example, you can't have a keyword after an expression, because that keyword might become a reserved keyword or might start another expression; there are good reasons why it's an ambiguity.
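The slides are not part of this transcript, so the following is only a guess at what the two macros described above might look like: a variadic `add!` that uses a `+` repetition, and a `tuples!` macro that merges two `*` repetitions. The wrapper functions are hypothetical and exist just to exercise the macros.

```rust
/// Adds any number of comma-separated expressions (at least one).
macro_rules! add {
    ($e:expr) => { $e };
    // `$(...),+` matches one or more further expressions.
    ($e:expr, $($rest:expr),+) => { $e + add!($($rest),+) };
}

/// Zips a left list and a right list into an array of tuples.
/// `$l` and `$r` are expanded inside the same repetition, so both lists
/// must have the same length or expansion fails at compile time.
macro_rules! tuples {
    ($($l:expr),* ; $($r:expr),*) => { [ $(($l, $r)),* ] };
}

pub fn sum_of_three() -> i32 {
    add!(1, 2, 3)
}

pub fn single() -> i32 {
    add!(7)
}

pub fn pairs() -> [(i32, char); 2] {
    tuples!(1, 2; 'a', 'b')
}
```

Calling `tuples!(1, 2; 'a')` is rejected by the compiler because the two repetitions have different lengths, which is the check the speaker says the expander has to perform.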
And the thing here is, if you look at the second matcher in that macro, you can see that the operator means it's going to appear between zero and one times. The third matcher is going to appear between zero and plus-infinity times, and the same goes for the fourth matcher. So the macro checker has to move forward and make sure that in the case where two doesn't appear, three doesn't appear, and four doesn't appear, the thing after that is allowed by the set of restrictions. In that case it's not, because, well, it's the same as above, so we have to error out. And it gets really annoying. And there are more checks that are Rust specific and that we can't really copy-paste from the other languages in GCC. For example, you've got privacy in Rust. You know how you mark your functions as public or just leave them private. But you've got fun privacy. You can have a function that's public in a path, so in one module, but not in another one. You can have a function that's public for your parent module, but not any further up. You can have a function that's public for the entire crate, but not for users of that crate. And yeah, lots of stuff. Same thing: you've probably come across unsafe. Unsafe is a keyword that sort of unlocks superpowers and segfaults. And basically, at the language level, it's just a keyword. Whether we're dereferencing a raw pointer or an actual safe pointer like Box doesn't matter to the parser or the AST. But we have to go afterwards over the HIR, on that type-checked representation, and make sure that if we're dereferencing something of type raw pointer, it only happens in an unsafe context. Finally, macros are lazy. If you're from Haskell, you know what that means. It basically means you're going to expand them as you go, before expanding the arguments given to them.
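The privacy levels and the raw-pointer rule mentioned above, sketched with hypothetical module and function names. Each visibility modifier corresponds to one of the cases listed: a path, the parent module, and the whole crate.

```rust
pub mod outer {
    pub mod inner {
        /// Visible everywhere in this crate, but not to users of the crate.
        pub(crate) fn crate_wide() -> u32 {
            1
        }

        /// Visible only to the parent module `outer`.
        pub(super) fn parent_only() -> u32 {
            2
        }

        /// Visible only inside the given path (here, `outer` and below).
        pub(in crate::outer) fn within_outer() -> u32 {
            3
        }

        /// Plain private: visible only inside `inner`.
        fn private() -> u32 {
            4
        }

        pub fn public() -> u32 {
            private() + 10
        }
    }

    pub fn sum() -> u32 {
        inner::parent_only() + inner::within_outer() + inner::crate_wide()
    }
}

/// Dereferencing a raw pointer is only allowed in an unsafe context.
pub fn read_raw(value: &u32) -> u32 {
    let ptr: *const u32 = value;
    // Sound here because `ptr` was just created from a valid reference.
    unsafe { *ptr }
}

/// Dereferencing a `Box` uses the same `*` syntax but needs no `unsafe`.
pub fn read_box(value: Box<u32>) -> u32 {
    *value
}
```

Both `read_raw` and `read_box` parse to the same kind of dereference expression, so only a pass that knows the operand's type can tell which one requires `unsafe`.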
The fact is, macros are not always lazy, because you've got some built-in macros that need to be expanded eagerly. And so when you've just spent like three months rewriting the expansion system to make sure that they're expanded lazily, and you realize that built-in macros need to be expanded eagerly, well, it gets really annoying. Finally, code sharing between crates. If you've had the misfortune of writing C or C++, you know you have to write headers, basically declaring your generic functions, your public functions, and so on. How do you do that in Rust? The answer is you don't. The compiler does it for you, and basically what it's doing is putting some metadata magic in the ELF format, so the object file, and it's going to encode and serialize all of your exported macros, the generic functions, the generic types, the public macros, and so on. Again, more fun stuff that no one in GCC has done, except maybe gccgo, and that we have to figure out. Finally, the type system in Rust is extremely safe, complex, and powerful, as you know. There's lots of fun stuff like the never type, generic associated types, and so on. You've got sum types, and the fact is these constructs are not really present in any of the other languages within GCC. So that's stuff that we have to figure out: first of all how to implement them, and then how to compile them and translate them to the GCC internal representation. Finally, the last bit: you've got inline assembly in Rust. It's not the same format as GCC's inline assembly, so we have to do the translation. And if you look at rustc_codegen_gcc, because Antoyo is much further along than us in terms of the back end, it's a very fun, like, thousand lines of code to translate from Rust's inline assembly to GCC's. As I said, I'm going to talk a little bit about contributing, reviewing, and so on; our workflow, basically. The workflow for gccrs is inspired by Rust's workflow. All of our development happens on GitHub.
Our communication, messaging, and so on happens on Zulip, and we use the bors bot to merge our PRs. But at the same time, because we're a GCC project, we have an IRC channel, we have a mailing list, and we accept patches sent to the mailing list, and so on. The idea is that no matter your background, whether you're a very young Rust developer who has only used GitHub, or, sorry Thomas, a dinosaur who has used IRC and mailing lists, you can send patches, and we'll accept them, review them, and make sure that your contributions get in. So, GCC development is hard. I had that experience firsthand, because I'm not an IRC and mailing list kind of guy. I'm a GitHub kind of guy. And sending patches via email, getting reviews, submitting them, and so on is very, very hard. In GCC, you've got a fun thing where you have to add ChangeLogs to your commits. They have a specific format. They're very helpful, but they're annoying to write. To actually send patches to get reviewed by GCC, you have to use git send-email, so something that sends the email for you with the patches in it. Because I wanted to, you know, make sure I didn't break anything and wasn't going to, I don't know, blow up my computer, I decided to try git send-email to my own personal address the first time. The one thing I didn't realize is that git send-email automatically adds every contributor to the CC list. So the first time I sent patches, I actually pinged like 150 people three times and leaked my personal email address. That's fine. No one yelled at me. And so I removed the option to automatically CC people. And so when I actually sent the patches, no one was CC'd. When patches were getting reviewed, the authors weren't aware that their stuff was getting reviewed. Very fun. So, yeah, we do that. I got used to git send-email. I'll do that for you.
If you submit commits on GitHub, pull requests, and so on, we'll take care of handling that. We have lots of continuous integration to make sure that your commits pass the weird GNU coding style, to make sure that they respect the ChangeLog format, to make sure that they build and pass the tests, and so on. And we're actually working on a little bot to generate the ChangeLog skeleton for you. Furthermore, because of the way GCC works, development happens in stages. Right now, we're in stage four. So basically between January and May, you're not allowed to make changes to common GCC parts. And this is a very good idea. It's to avoid breakage of the common structure of GCC, which is going to affect the most languages. But that also means that we have some patches that we cannot merge until May. And so, again, gccrs takes care of that. We have a staging branch and so on. We keep track of the stages for you. You can merge your stuff; we'll do it for you and make sure you don't get annoyed by that. So is that working? Are people happy to contribute to gccrs? I think so. In 2022, we had over 50 contributors. That's mostly code contributors. We've also had people helping us with the git stuff, the email stuff, CI stuff, and so on. But I'm not counting the people reporting issues here, because there are a lot more than that. We have a lot of students working on gccrs, which I'm really proud of. I actually started as a Google Summer of Code student on gccrs, and now I'm a full-time engineer. And we've got multiple internships coming that way as well. For example, we'll have a full-time six-month internship to take care of libproc this year. As I said, we also have a lot of GCC developers helping us. People helping us with the git stuff, with the merging stuff, and so on. People providing very valuable input. And we have people from the Rust team helping us, which is really nice.
So people who are willing to work with us on getting the test suite to pass, people who are explaining to us how Rust works, because it's complex, and just helping us not stray far from the path. So what's coming? When is gccrs ready? To be at least sort of useful, gccrs has to be able to compile libcore. If you're not aware of this, the standard library in Rust is actually three kids in a trench coat. You've got the core stuff that's necessary for things like additions, creating lambdas, iterators, for loops, and so on. On top of that, you've got the alloc crate, which takes care of all of the structures that need dynamic allocations, so your vector, your box, and so on. And all of that forms libstd, which is used by most projects right now. There's a lot of unstable stuff in libcore. That means that even if we target Rust 1.49, we actually have to be able to compile a much more advanced version of the language to compile the core library. Finally, we also have to take care of libproc. If you've never written a proc macro in your life, well, you're missing out, but it's basically a very complex schmilblick that takes the AST, sends it over a remote process communication, gets an AST back, and pastes it in. And we have to implement all of that piping between the crate and the compiler: sending the AST tokens, sending the locations, all stuff like that. Finally, borrow checking. If you've ever written Rust in your life, which I'm going to assume you have, you've been sort of held at gunpoint by the borrow checker. And that's really a core part of the language experience. We can't really be a Rust compiler without a borrow checker. So our aim for that is to reuse the upcoming Polonius project, which is a formalization of the rules of borrow checking, and make sure that we can integrate it into gccrs.
So the way we're going to do that, again, is make sure we have sort of an internal representation that works for Polonius, create that tiny FFI layer that allows us to speak to Rust from our C++ compiler, and ask Polonius to do the thing. Finally, we're part of this year's GSoC. So if any of what I said interests you, there's probably a project you can work on. For example, last year we had a student that ported the const evaluator from C++ over to our front end, meaning that we can do, well, const evaluation now. So run const functions, do conditionals, for loops, and so on, in const context. This year's GSoC at least includes the following four projects. So adding a better debugging experience for our high-level intermediate representation, adding proper Unicode support, proper metadata exports, so that's stuff like the dylib, rlib, cdylib, and so on formats that you'll find when you're exporting Rust libraries. And finally, better error handling for the user of GCCRS and starting to integrate the rustc error codes to allow us to pass the rustc test suite. There's a lot of tooling around GCCRS. So there's a test suite that takes like four hours that we run each night. There's a test suite generator, because it's a thousand lines of code. Well, we don't pass any of the test suites for now, but we have it. So there's the BLAKE3 cryptography library, which is quite nice and doesn't rely on the standard library. And there's making sure we can compile libcore 1.49, making sure we can try and compile all of the rustc test suite, and we're running that every night. We have a generator for that, as I mentioned. We have a website, a dashboard for the test suite. We have a report generator because they're annoying to write as well. And we've got cargo gccrs, which will allow you to, well, instead of doing cargo build, use cargo gccrs build to build your Rust code with GCCRS. And all of that tooling is written in Rust for two reasons. 
The first one is it's much better than C++. The second one is, wouldn't it be so freaking cool to compile our own tools with our own compiler? And three reasons, actually. The most important one is to get people from the Rust community to contribute to those tools. Actually, if you're interested in helping GCCRS in one way or another, a good thing would be to, you know, start working on that tooling. And it's all just fun stuff. The web dashboard is Tokio and async and a Rocket database and so on, so not database, API. I'm not a web dev. So if you're interested in that, feel free to come and contribute. Finally, can we rewrite GCC in Rust? Maybe. For bootstrapping purposes, so to make sure that we have a full bootstrapping chain. You can read a lot of papers on that, Trusting Trust, and so on. We'll have to write that compiler in Rust 1.49, which is going to be annoying. It's still a ways off. And I'd like to really point out that the goal of GCCRS is not to break the ecosystem. So we want to make sure that whenever someone compiles one of your crates with GCCRS, they're not actually blaming you for the failure that's going to happen. And yeah, that they report the issue to us, because we're not a proper Rust compiler yet and you shouldn't have to suffer for our hubris. The community: we've got mugs. If you do pull requests, we'll send you a mug. People that have helped with the compiler got this one. People that have helped with the merge got the one on the right. Lots of links. We have, as I said, maybe I didn't say it, but we have monthly and weekly calls on Jitsi. You can attend them, even if you're just interested in listening in. We have an IRC channel, a website, and so on. The goal is to make compilers fun. The goal is to get contributions from everyone, from the GCC community as well as the Rust community. We have Google Summer of Code. There's lots of stuff for you to work on. We've got good-first-pr issues. 
If you're interested in compilers, come talk to us. We don't bite. We've got reports every week. We shout out contributors, so if you do pull requests, we'll tell you about it. We'll tell people about it. We've got monthly calls. Do you have any questions? Hi. Awesome project. Thank you. You mentioned one of your goals was to help develop a spec of Rust with the rustc team. Can you share more about that? There's nothing really started. It's just that you have the Rust reference at the moment, and it tells you how Rust works from a user point of view, but not specifically from a language point of view. At the same time, we don't want a Rust standard like you have with C or C++ where it gets really annoying to get features done. There are efforts from people like Mara Bos and Josh Triplett and so on to have a Rust specification. One of the goals of GCCRS is to say, well, we've had trouble with that because that's not how it is in the reference, or it's not explained well enough, and we had to look at the rustc source code or try it out to figure out how that works. Stuff like dereference chains, what type actually gets used for a method call, and so on, and so on. We can point that out and say, well, maybe that could take some tweaking because that's not, yeah. Do you have a list already of stuff like that? It's mostly type system stuff. I have some on macros. There's not really a formal list. I think we have an actual list somewhere, but yeah, I don't have it in my head right now, sorry. Thanks. Thanks so much. Two questions, perhaps related. First, on performance. I wondered if you had any numbers at all on the performance comparison or what your goals are for that. And secondly, I'm kind of surprised by how much you re-implemented in terms of the IRs. Was that an intentional decision or was that because it needed to be in C++, or why not effectively consume more of the Rust stack and then replace LLVM with GCC at the bottom? 
So regarding performance, we're much faster because we do much less. But we actually don't know about performance yet. We haven't measured it. No benchmarks. We have a ton of stuff missing. The code we emit, we're not trying to optimize it sort of for Rust yet, or at least not all the time. So yeah, we're just not there yet. It's going to happen eventually. Regarding the internal representation, consuming the rustc stuff is difficult. Even if rustc is a very well designed compiler, there is some stuff that makes sense only in a rustc context. And that's also one of the things with Polonius that we're trying to work on, is that it does depend on some rustc-specific stuff. So we do aim to contribute to Polonius and make it so that it's a little bit more compiler agnostic, I want to say, but not just to help us, just for it to make sort of more sense and maybe be used by even more languages, who knows? But yeah, sorry. We needed our own representations. I know it's still too far away, but is binary reproducibility a target of this? No, not really. Sorry. It would be difficult. The Rust ABI is not stable. rustc changes its sort of internal formats and representations. I don't want to say often, but it does happen. And it would be really difficult to keep up with that without sort of a stability guarantee or a specification of that. It's really not one of our aims. Thanks for the talk. I was wondering about your cargo re-implementation. Wouldn't it be easier to have command line compatibility with rustc and then plug that thing into cargo, to tell cargo don't use rustc, use GCC Rust? So it's not a cargo re-implementation. It's a cargo subcommand. So it's the same as cargo fmt, for example. How it actually works is that we intercept the rustc command line, as you mentioned, and instead of saying, well, fork and start rustc, we start GCCRS. And on top of that, we do argument translation. 
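A minimal sketch of that argument translation step, assuming a simple prefix rewrite. Only the mapping from --edition=2018 to -frust-edition=2018 comes from the talk; the function names and the pass-through behaviour for every other argument are hypothetical and do not describe the real cargo gccrs code.

```rust
/// Translate one rustc-style argument into a gccrs-style one.
/// Only the `--edition=` mapping is taken from the talk; passing every
/// other argument through unchanged is an assumption for illustration.
fn translate_arg(arg: &str) -> String {
    match arg.strip_prefix("--edition=") {
        Some(edition) => format!("-frust-edition={}", edition),
        None => arg.to_string(),
    }
}

/// Translate a whole command line (hypothetical helper).
fn translate_args<'a>(args: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    args.into_iter().map(translate_arg).collect()
}

fn main() {
    let translated = translate_args(["--edition=2018", "src/lib.rs"]);
    println!("{:?}", translated);
}
```

A real translation layer would need a table covering many more flags; the sketch only shows the shape of the idea.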
So stuff like --edition=2018 for rustc is going to become -frust-edition=2018 for GCCRS. So we have that list and we do the translation and then just launch GCCRS and pipe the result back to cargo. Thanks for the great talk. And one question or maybe a tip, I don't know if it's one, but is there a project or some possibility to transform the LLVM IR to the GCC IR? Because if there is, then you could maybe run some tests on it, like creating the IR via normal rustc and then your variant, and then you can compare the IRs. I think there is a project like that. I can't remember which way around it is, if it's an LLVM compiler that takes in GCC IR or a GCC sort of front end that takes in LLVM IR. I think something like that exists. I don't know much about it. I think it's not very famous or anything, but it could be interesting. Yeah. Hello. Do you have a link with the Rust for Linux project? Because if I remember, Linux is compiled with GCC, right? Yes. So one of the big, big, big, big targets of GCCRS is to be able to at least help or be usable in Rust for Linux. Linux is compiled with GCC a lot. You also have efforts to compile it with Clang. At the moment, what Rust for Linux does is use rustc, so an LLVM toolchain, but it is one of the sort of goals of the project to, yes, be able to have a fully GCC-compiled Linux, even when using both Rust and C in the kernel. But, yeah. Any other questions? Thank you. I would guess that while re-implementing such a complex project from basically scratch, you probably have a really good chance of finding some mistakes in the upstream, in the original implementation. So do you contribute back to the upstream in such cases? And maybe you remember some such examples. Thank you. So I don't have sort of specific examples in my head, sorry. 
But as I said, we did find some stuff that didn't make a lot of sense in the specification, sorry, the Rust reference, that might have been fixed since, and so on. But, yeah, whenever we see something that doesn't, to us, make a lot of sense or that deserves some explanation, we try and let people know about it. We try and contribute back to the rustc project. We're really not treating rustc as sort of a competitor or anything. And we do want to improve it. And GCCRS is built by people that love Rust and that want to push it forward in our own way. And regarding rustc bugs, GCCRS treats rustc as sort of the overlord. So whenever rustc does something, we do the same thing. We don't want to sort of argue about what is correct Rust and what is not correct Rust. rustc is the Rust compiler. It's the Rust implementation. When you ship a Rust version, you ship the compiler, the library, and the language; it's all of that, those three projects. So, yeah, we just try and stick with that as a reference. And we don't want to step on any toes. Yep. Unfortunately, that's all the time we have. I think we had a few more questions, but maybe we could do it in the hallway. Let's thank our speaker. |
Merging process of the rust compiler |
So we have Guillaume. He's going to talk about the merging process for the Rust compiler. Okay. Yeah, you can hear me. Perfect. So, hi, everyone. So, I will be talking, as he mentioned, about the merge process in the Rust compiler. So, who I am first: Rust language reviewer and contributor. I'm a member of a few teams. So, I'm in the rustdoc team. Not to be confused with the former documentation team. Also the docs.rs team and the dev-tools team. So, very documentation oriented. And I'm working at Huawei currently. So, we will start by taking a scenario. Hold on. So, when you have made a pull request, you open it. And the first thing that will happen on the pull request is that the bot will assign you a reviewer. So, in this case, myself. So, very likely a pull request on the rustdoc tool. And after that, you will have some tags. So, it's waiting on review and it's concerning the rustdoc team, which helps us to find the right people in case the reviewer assigned isn't available within a week, if I remember correctly. So, an explanation of how the bot is picking the people. So, we have a repository with the list of all teams and their members, former members, and everything. And the bot basically picks someone from this repository. And the governance page on the rust-lang.org website is generated from it. So, if you need to contact someone from one of the teams, for whatever reason, that's where you go. So, now the approval itself. So, let's say that the pull request is implemented with no request from the reviewer or anything. It should have no performance impact. To have this information, if we have a doubt, we have automated tools that allow us to check that it's actually the case. So, if needed, we just say, hey, Rust bot, can you run a perf check on this? We'll come back to this later, and we have a very nice page with some metrics and a lot of stats. So, another important thing is checking that there is no breaking change. 
So, of course, if you are changing something in the std, for example, or changing how projection works or anything, then it becomes a lot more complex and the process becomes a lot longer. Same, we will come back to this later. So, if it adds a new feature, it's very likely that we will need to be 100% sure that it's not something that we'll need to change or deprecate or literally just remove at some point, because it happened a few times and it's not great. And obviously, the CI must pass. So, that's a lot of small conditions. So, now about the CI. So, there are two levels of CI. The first is the one that you will see directly when you open the pull request. It's a lot of tests, almost all of them. But it's only on Linux x64 because, as you may know, we support quite a lot of targets, not as many as GCC yet, but at some point, maybe. And this checks, for example, if the code is well-formatted, if you have all the tests passing, and by all the tests, I mean literally all the tests, so you have all the rest of the test suite, the compiler error output, the checks that the compiled code is giving the right result, the assembly, pretty much everything, and it includes the tools. So, if you made a change in the compiler that breaks a tool, like rustdoc, Clippy, or rustfmt, then we need to be aware of it right away. Otherwise, we are going to have quite a bad time. And all that is done directly on the pull request. So, at the current time, it takes around one hour to run this small subset. And when the pull request has been approved, we make the full run of all these tests and for all platforms. And this time, if I remember correctly, it's run on something like 40 targets, and it takes roughly around three hours. We have our own infra for this. We have a dedicated team for that too, the infra team. And I think it's currently done on AWS, to be confirmed. But in short, nothing can be merged if the CI doesn't pass. 
We enforced this, I think it was three or four years ago. A few things that were merged and were expected to be fixed in fixes coming very soon were quite bad experiences, and we decided to have a zero-tolerance policy. It's working quite nicely, so currently we keep it. So now, the build queue. When we approve the pull request with the command @bors r+, which you might have seen or not, it goes to a build queue, and that's where you can see pretty much everything that is happening. So in the current case, you see the pull request, the first one, which is pending. So it allows you to see what is being tested and eventually how long remains. And you can see also everything that is approved and everything. And it's sorted by priority first, which you can't see because I had to make a small screenshot. And the second thing is how old the pull request is. We generally have around 20 pull requests at the same time in this build queue. So to make things faster, we have what we call a roll-up process. We group a few pull requests that we are sure have no performance impact or anything. And we say, okay, make a roll-up of five pull requests. You can see the button, create a roll-up. So we pick a few pull requests and we click on the button, and it generates a pull request for us with our account. And after that, we give it quite a high priority, and like that, we can have a big bunch of pull requests merged at once. Very useful. And that is it for the build queue. So, as I explained a bit before, what is tested. So we have the compile tests. So whether your code is supposed to compile or not, because, for example, we want to ensure that certain very weird cases don't compile and that other cases do compile. And that's how you can discover things like you can't implement directly on projections. And if that doesn't speak much to you, it's a good sign. We have all the unit tests. Unit tests are mostly for the tools. 
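A minimal sketch of the build queue ordering described above: priority first, then the age of the pull request. The struct, its field names, and the choice of "oldest first" within one priority level are assumptions for illustration; the talk only says that age is the second sort key.

```rust
/// One approved pull request waiting in the queue (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
struct QueuedPr {
    number: u32,   // made-up pull request number
    priority: u32, // higher means more urgent
    age_days: u32, // how long ago the pull request was opened
}

/// Sort by priority (highest first), then by age (oldest first, an assumption).
fn sort_queue(queue: &mut Vec<QueuedPr>) {
    queue.sort_by(|a, b| {
        b.priority
            .cmp(&a.priority)
            .then(b.age_days.cmp(&a.age_days))
    });
}

/// Return the pull request numbers in the order they would be tested.
fn queue_order(mut queue: Vec<QueuedPr>) -> Vec<u32> {
    sort_queue(&mut queue);
    queue.iter().map(|pr| pr.number).collect()
}

fn main() {
    let queue = vec![
        QueuedPr { number: 3, priority: 0, age_days: 1 },
        QueuedPr { number: 1, priority: 0, age_days: 9 },
        QueuedPr { number: 2, priority: 5, age_days: 2 }, // e.g. a roll-up
    ];
    println!("{:?}", queue_order(queue));
}
```

Giving a roll-up a high priority, as described in the talk, simply moves it ahead of the individual pull requests in this ordering.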
But we have a few tests with, like I mentioned, just below, the error output. It's quite important. So we ensure that the rustdoc and the rustc errors are looking exactly as you might expect. If you ever used Rust, and I think a lot of you already have, you might have appreciated the errors and the output. Yes, because they are very, very strongly tested. Currently, just for the UI tests, we have around 20,000 tests. So it's quite monstrous. And running it takes quite some time. I think it's, well, at least 10 minutes, something like that. It's quite heavy. Maybe you don't know it, but the documentation examples are tested, all of them. You can just test them yourself in your code by running cargo test. The cargo tool will take all the unit tests in your code and the tests folder, and will run everything. And it includes, of course, everything that is in the documentation. So that allows us to reduce the maintenance burden by being sure that we don't give examples that are not compiling anymore or are completely broken. Quite useful. Once again, it reduces the burden. And, of course, we have all the tools. So cargo, rustdoc, Clippy, rustfmt. So as I mentioned, when you change something in the compiler, since these tools are using the compiler directly, they are actually compiler extensions, except cargo. Cargo is just tested to ensure that a new option is not breaking something. So the others are extensions of the compiler, and we need to ensure that no change is breaking anything because that would be problematic. We generate a lot of documentation and we have to ensure that we have no dead links. And, in fact, we do have some of them and we ignore them on purpose. So sorry for that. We can't fix them because, funnily enough, in the std, we re-export stuff that is in core and they share the same documentation. So if you are looking at the documentation in the std pages, all the links are working. In the core crate? 
They're not. So try to use std as much as possible. And that's just the very basics, but we have quite a lot more. The previous talk mentioned inline assembly; it's part of the things that are tested. Something we realized when working on the GCC backend this time is that GCC doesn't allow you to specify a syntax, and we found that thanks to this test suite. So currently, we can't implement all features and it's going to take quite a long time, but hopefully at some point, someone motivated will do it. Don't know. So which OSes and architectures are tested? Everything. We have a target tier policy. You can go check it on the page linked just below. But basically tier one platforms are the platforms that are fully tested, implemented, and everything. So macOS, Linux, and Windows. And they must pass all the tests, and we build them, and we ensure that what we have built is able to be uncompressed and is working on the target and everything. So a very strict restriction. On the tier two platforms, it's a lot more relaxed. We just need it to build. If it works, well, it's good. If it doesn't, well, too bad. And for the tier three platforms, it exists. Yeah, that's good. So for example, if you want to build on the Nintendo 3DS, you can. We don't know if it would work, but you can. And you can see the list of the platforms in each tier on the page just below. Like I mentioned, we have quite a lot and we hope to be able to expand it a bit more by adding at least the GCC backend at some point. A lot of work remaining. So what about releases now? Because as you might know, we make a release every six weeks. So it's a very fast release cycle. So when this happens, the build queue is frozen. We don't allow anything below a priority of like 10,000 to be merged. It's a completely random number, but generally if you go higher than 10, it's quite important. So in this case, we freeze everything. 
And the only things allowed to be merged are the patches to actually update the stable and beta branches. An important thing that isn't noted here is that we don't need to freeze for the nightly. We just say at a given time every day, okay, this will be the nightly version for today. Yay, and that's it. So back to this, the third point. All relevant information is updated and re-read. And by that, I mean the websites, the documentation, the book, I think, too. Pretty much everything. We generate the binaries. So that's what I mentioned. That's the things that need to be working for at least the tier one platforms. And of course, we make a blog post. Generally, the blog post is not written for the current stable release; we write it at the beta version. And then it depends on whether we need backports. For example, we realize that in the current beta version, something is completely broken and we don't want that. If it's an easy enough fix, we backport a patch that was merged on the nightly directly onto the beta branch. Or we say, okay, too bad, we revert that and we'll do it the next time. It has happened quite a lot, and it's not uncommon. Let's just say it's better if it doesn't happen. It allows us to not have the .1 version coming up like three days later because we realize that we broke something. And the blog post is released. So now, performance. What I mentioned is that we sometimes need to check the performance. So we have to speed up now. So for the performance, we have a lot of benchmarks, which you can see on the left. It's generally for the number of instructions executed. It's what we consider the most important metric and the most, let's say, stable. So when you have all green numbers and quite high, oh, yeah, 8%, yeah, that's quite right. So when you have all green numbers, it's great and everyone is partying. 
And if you have all red numbers, either you have a very good reason or it's not going to be merged until you can make them at least black. And we have, like I said, a lot of metrics like cycles, memory usage, disk usage, because we started to worry about the binary size. We realized that all the doc attributes were generated in the binaries, which is not great. So we are going to fix that at some point. And you can see on the right that, yeah, maybe you can see. Anyway, just believe what I say: the results are shown in a nice comment directly on the pull request. So, other cases. When you add a new feature or introduce a breaking change, there are three possibilities. The most well-known one is the RFC, request for comments. It has its own repository. It takes a lot of time and effort and comments. It can go really fast, like two days, or it can take an indefinite amount of time. As an example, some RFCs were opened before 1.0 and are still being commented on, so that gives you an idea. We have the MCP, major change proposal. So not too big changes in the compiler. Something like: we find it's not too great how the query system is working, so let's try this solution. And they discuss mostly design and very technical points. Interesting, but if you don't know this area, well, it's not very understandable. And the last one is common to all teams. So the FCP, the final comment period: it's something that we want, and we just want to be sure that everyone is on board. So we ask for an FCP, and once more than half of the members of the team are okay with it, then we approve it, and here we go. So, of course, for every pull request that is merged, we check for potential examples. No, that's before, sorry. When we make a new feature that potentially changes current behavior, we look for potential regressions in the whole crates ecosystem. So we make what we call a crater run. And with this version of the compiler, we run on all crates, and we generate a nice report. You can see it on the left. 
And if you only have flaky stuff, we say, okay, no impact. So it's good. We don't care, and that's pretty much it. And same as for the performance, we have a nice comment explaining everything in short, which is much easier to read than the thing on the left, which is actually not good. And now a little part I like to do every time, tips for potential new contributors. We have a lot of issues tagged with E-easy or E-mentor or both. Take a look at them. We try to be as helpful as possible to newcomers. It's important for us to have new blood. We always have good surprises with newcomers. We wrote a rustc dev guide, which is not up-to-date at all. But at least you have a vague idea of what's going on, because I think not many people have an idea. And you can also try to write compiler plugins or eventually contribute to Clippy to see how the compiler's higher-level internals work. About Clippy, it's really simple to contribute to it; they have a full guide and everything. So if you want a big, nice first step, take a look at Clippy and how it works, and it gives a very, very nice introduction. And I am making publicity for myself. I wrote a small rustc-tools crate, which makes it a bit simpler to write plugins and extensions to the compiler. If you want to write one, go ahead. It's made to be as usable as possible. And thank you for listening. Thank you so much for watching. |
Let's write Snake game!
Using Bevy engine, we will code together a snake game from scratch |
So now we've got Tomasso, he's going to tell us how to build a snake game, and we're going to build it together. Hopefully. Hi all. Today we are here to talk about Snake, obviously, Rust and wasm. In particular, we will see how to build a snake game written in Rust and shipped as a wasm module. Before doing that, I would like to introduce myself. Hi, I'm Tomasso, I have two cats. And commonly, I used to be a software architect in a web application environment. So probably games are not the best stuff I can build, but I try. So let's start to talk about what wasm is. WebAssembly is a stack-based virtual machine that is designed to be portable. So we can build applications and bring them where we want, mostly. And the main target is the web, but it's not limited to the web only. We will see later. So we have four concepts: efficient and fast, safe (memory safety), open and debuggable, and part of the web platform. For this reason, we have four parts: the core one; the JavaScript API that allows us to interact with the JavaScript world, like the browser, Node, and so on; the Web API that allows us to interact with the DOM, events, and so on; and WASI, which stands for WebAssembly System Interface, if I remember correctly, that allows us to interact with the file system, networking, and so on. Obviously, WASI is not allowed in a browser context, for some reason. So how can we write a wasm module? Well, wasm actually supports two kinds of format, text and binary, but probably you don't want to write directly in wasm. It's like assembly: probably nowadays no one writes assembly directly. But if you want, you can do that. But probably you want to leverage different languages, for example C, C++, Rust, Go, and so on. But if you remember the previous slide, we talked about the memory safety of WebAssembly, and which is the other language that, remember, has a similar capability? Rust. Because Rust guarantees memory safety. 
And this is why we are here to talk about Rust plus WebAssembly. So which are the constraints we have for building wasm in Rust? Unfortunately, we are not so free to use what we want. We need to put the wasm_bindgen attribute on all exported types, so structs and enums and so on, but it's not limited to that. Also the implementation blocks need to be treated like that. So we put the same attribute on top of the implementation block. And unfortunately wasm doesn't understand all the types available in Rust. So bytes, integers (but not all the integers are supported), floating points, and vectors. We have some limitations about that. As a consequence, for example, enumerations need to be treated as u8. And all returned values from methods need to be cast to some wasm types or return a wasm_bindgen structure. So we are here to talk about Snake. Who has played Snake at least once? At least once, OK. For the others, Snake is a very simple game: there is a two-dimensional grid. Your aim is to win, and for winning, you avoid going through walls and eating yourself. You are able to eat food that gives you score and so on. Your aim is to drive the snake around the walls and try to eat the food, more or less. Anyway, we will see it later. So our code is here; it is a cargo workspace with three members. The first one is just a plain Rust implementation of the game logic, without wasm stuff, without any other part. The second one is the handmade snake, which is just a wrapper on the previous one in order to let the JavaScript world import it and use it. So we implement the web interface manually through JavaScript and the DOM. And the last one is a Bevy plugin that allows us to create a game more proficiently than the manual one. The last two members use the first one and we will see how. So conceptually we have a bunch of stuff. A direction, that allows us to describe which direction the snake has. The point, because we live inside a grid, so we have to somehow describe the points. 
The game itself: private stuff, skip it, but mainly we have two members, tick and get the last snapshot. Tick allows us to move the snake in the direction specified as an argument. And the last snapshot allows us to know what happened in the last tick. For example, I ate a food, I went through the wall, what my score is, what the position is, and last but not least, the period duration. Because in the game the interval between the ticks changes according to your score. So the further the game goes, the more the period decreases. So how can we use it? Hopefully you can read the code; anyway, I'll describe it. We have a level that is described in a simple way, through a string. We can parse it, creating a game. We invoke the tick method on the game, describing which direction we want to use. We get the last snapshot and check the status. For example, in this case I eat a food, because the head goes on the food. We are not yet on the wall, and the game over is none. Instead the code below goes through the wall, so on wall is true, and the game over is some, with the reason. And finally, we have two public levels, snake one and snake two. The difference probably you know, but to repeat, the difference is the frame. So in snake one we have a frame with all the walls; instead snake two is more like a torus, so you can go left and appear on the right, and the same for up and down. So how can we use these? We have the snake core, again without any dependency. We need to wrap it, because as we have already seen, we have some limitations about that. We have some custom JavaScript code that interacts with the DOM in order to update the UI. This is more or less what happens. And at compilation time, after the compilation actually, we have a process for compiling the Rust code into wasm, and this compilation generates two artifacts actually, the wasm itself and an auto-generated JavaScript module that allows us to simplify the interaction with the wasm module. 
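A minimal sketch of the core API described above (parse a level from a string, call tick with a direction, read the last snapshot), written as plain Rust without wasm. Every name, the level characters ('#' wall, 'F' food, 'H' head), and the period numbers are assumptions for illustration, not the code from the talk's repository.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction { Up, Down, Left, Right }

#[derive(Clone, Copy, Debug, PartialEq)]
struct Point { x: i32, y: i32 }

/// What happened during the last tick (hypothetical field names).
#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    ate_food: bool,
    on_wall: bool,
    score: u32,
    game_over: Option<String>, // Some(reason) once the game has ended
    period_ms: u64,            // interval between ticks, shrinks with the score
}

struct Game {
    walls: Vec<Point>,
    food: Vec<Point>,
    snake: VecDeque<Point>, // front is the head
    last: Snapshot,
}

impl Game {
    /// Parse a level string; returns None if there is no head.
    fn parse(level: &str) -> Option<Game> {
        let (mut walls, mut food, mut head) = (Vec::new(), Vec::new(), None);
        for (y, row) in level.lines().enumerate() {
            for (x, c) in row.chars().enumerate() {
                let p = Point { x: x as i32, y: y as i32 };
                match c {
                    '#' => walls.push(p),
                    'F' => food.push(p),
                    'H' => head = Some(p),
                    _ => {}
                }
            }
        }
        let mut snake = VecDeque::new();
        snake.push_back(head?);
        let last = Snapshot { ate_food: false, on_wall: false, score: 0, game_over: None, period_ms: 500 };
        Some(Game { walls, food, snake, last })
    }

    /// Move the snake one cell in `dir` and record what happened.
    fn tick(&mut self, dir: Direction) {
        if self.last.game_over.is_some() {
            return; // nothing moves after game over
        }
        let head = self.snake[0];
        let (dx, dy) = match dir {
            Direction::Up => (0, -1),
            Direction::Down => (0, 1),
            Direction::Left => (-1, 0),
            Direction::Right => (1, 0),
        };
        let next = Point { x: head.x + dx, y: head.y + dy };
        let on_wall = self.walls.contains(&next);
        let ate_food = !on_wall && self.food.contains(&next);
        if on_wall {
            self.last.game_over = Some("hit a wall".to_string());
        } else if self.snake.contains(&next) {
            self.last.game_over = Some("ate itself".to_string());
        } else {
            self.snake.push_front(next);
            if ate_food {
                self.food.retain(|f| *f != next);
                self.last.score += 1;
                // Made-up speed-up rule: 10% faster per food, floor at 100 ms.
                self.last.period_ms = (self.last.period_ms * 9 / 10).max(100);
            } else {
                self.snake.pop_back(); // no growth: the tail follows
            }
        }
        self.last.on_wall = on_wall;
        self.last.ate_food = ate_food;
    }

    fn last_snapshot(&self) -> &Snapshot { &self.last }
}

fn main() {
    let mut game = Game::parse("#####\n#HF.#\n#####").expect("level needs a head");
    game.tick(Direction::Right);
    println!("{:?}", game.last_snapshot());
}
```

Keeping this core free of any wasm dependency is what lets both the handmade wrapper and the engine plugin reuse it, as the talk describes.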
After that we have the same JavaScript code and the same DOM, so more or less what we have is the Wasm and the auto-generated JavaScript, which helps us a lot. So, in short, we need to wrap all the stuff: the direction, the point, the game, the snapshot, and so on. So definitely, we are not able to do that for a large project. Obviously, this works for little ones like Snake, but if you want to build a bigger one, it is probably not the best solution. If instead you want to create a cryptographic library, a hashing library, something like that, this is really amazing and it is sufficient. But for gaming, probably not so much. So before seeing what our alternative is, I have prepared a demo, which obviously you can find in the code. So let's see if it works properly. Ah, yes, here. OK. So because I don't have a framework that helps me build a better UI, I chose this one. Sorry, I'm not a UI expert, but for our purposes it's sufficient. As you can see here, there is some bootstrap, webpack and so on; we don't care about that at all. But at a certain point we load the Wasm, and the Wasm is a few kilobytes, so not so big. The user is able to choose which game he would like to play, click here, and move with the arrow keys. Not wow, but why not wow. Thanks. OK. Obviously, when I go through the wall, the game ends and shows on wall. Again, not the best user experience, sorry. We will see the code together in the final Q&A session, because time is constrained. So what is our alternative? There are many alternatives, obviously. I chose the Bevy engine because I like it. Again, I'm a web developer, so about gaming stuff I understand nothing, but the Bevy engine allowed me to put something together in a short time, so good stuff, team, good stuff. And it supports cross-platform as well: Windows, Mac, Linux, and obviously web.
The pattern used is ECS, entity component system. An entity is just an ID that you can put in the world. A component is a tag, something you can attach to an entity, like an image, a position, something like that. And a system is a function that works on those things: it can add entities, remove them, add components, remove components, change existing components, and so on. So this is more or less how Bevy works: each frame, Bevy runs our functions, called systems, and lets them change, add and remove components and entities, which allows us to change our world, and finally Bevy renders some of them, obviously, on the screen. So conceptually, this is what I understood in two years of working on it at night, because again it is not my job, and it is fairly simple to understand. To finish the introduction of Bevy, we have two other concepts. An event is a plain Rust object that can be fired and listened to, so we can inform other functions, other systems, that something happened. And a resource is just a global instance, because a system is allowed to access only the world, not our custom objects. So you need to put your resources in the world and fetch them inside the system. And a nice feature: Bevy tracks when a resource changes, as we will see later. So you can have, for example, system one firing an event that is listened to by system two, which on that event changes resource A accordingly, and system three, more or less, reacts to that change, for example by moving the snake. So let's have a look at a more detailed example in the code, skipping the arguments because they are not important for understanding what I would like to show. Firstly, because systems run every frame, we don't want to tick on every frame. We want to wait for a timer. This is also nice for testing purposes. So we need to wait for a tick event, and only when the tick event is fired do we call the tick method on the game. The game is in the third argument, the game resource, which obviously is the game we saw before.
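To make the entity, component, system and event vocabulary concrete, here is a minimal hand-rolled sketch in plain Rust. It is not Bevy's API: `World`, `movement_system`, `clear_events_system` and `run_frame` are hypothetical names, storage is a plain `HashMap` per component type, and systems run one after another rather than in parallel.

```rust
use std::collections::HashMap;

/// An entity is just an ID.
type Entity = u32;

/// Components are plain data attached to an entity.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Position { x: i32, y: i32 }

#[derive(Clone, Copy, Debug, PartialEq)]
struct Velocity { dx: i32, dy: i32 }

/// The world owns every component table and the pending events.
#[derive(Default)]
struct World {
    next_id: Entity,
    positions: HashMap<Entity, Position>,
    velocities: HashMap<Entity, Velocity>,
    tick_events: Vec<()>, // a tiny event queue: one entry per fired tick
}

impl World {
    fn spawn(&mut self) -> Entity {
        let id = self.next_id;
        self.next_id += 1;
        id
    }
}

/// A system is a function over the world. This one runs every frame but
/// only moves entities when a tick event has been fired.
fn movement_system(world: &mut World) {
    if world.tick_events.is_empty() {
        return;
    }
    for (entity, vel) in &world.velocities {
        if let Some(pos) = world.positions.get_mut(entity) {
            pos.x += vel.dx;
            pos.y += vel.dy;
        }
    }
}

/// Events live for one frame in this sketch.
fn clear_events_system(world: &mut World) {
    world.tick_events.clear();
}

/// One frame: run every system in order.
fn run_frame(world: &mut World, systems: &[fn(&mut World)]) {
    for system in systems {
        system(world);
    }
}
```

A frame without a pending tick event leaves every position untouched, which is the behaviour the talk wants: systems run every frame, but the game only advances when the timer fires.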
After that, we get the snapshot and check whether we are in game over. If yes, we fire a game over event. After that, we update the snake position, the score resource, the food, and the duration of the timer because, you know, the period can change. The nice thing to focus on is the if, because we don't want to write to a resource if the real value has not changed. Bevy leverages the DerefMut trait for change detection, so it is important not to mutably dereference before a real change. So let's have a look at a quick demo; after the demo, we can see the code, I promise. So again, the demo. I propose to show you the native part, so cargo run and so on. Okay, this is our window, created natively. Again, I can choose snake one or snake two, with buttons this time. Thanks. And again, it's not my job, but this is what I implemented. As you can see, under the hood there are some logs, and in front of you there should be at least the snake running across the grid. And this is the way I handle the game over. And obviously, quit closes the window. So, we have three different states in our game, and in our code I treat these in three different sub-packages: choose game, play game, and game over. And as you probably understand, we can leverage the event system to bring the user from one state to another. So let's focus on the play state, because it is probably the most important one. What do we need to do in the play state? Surely, when we enter that state, we create the dedicated resources and make the initial draw. After that, as we already saw, we need to wait for the tick event, call the tick method on the game, and update the snake position, the food position, and the score number. And we must not forget to handle the key presses and the game tick. So these are the last slides; after that we will see the output and the code.
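The reason for the guard described above can be shown with a small stand-in for a change-tracked resource. This is a sketch under stated assumptions, not Bevy's implementation: `Tracked` and `set_if_different` are hypothetical names, and the only behaviour modelled is that any mutable dereference marks the value as changed, whether or not the value actually differs.

```rust
use std::ops::{Deref, DerefMut};

/// A value plus a flag that is raised on every mutable access.
struct Tracked<T> {
    value: T,
    changed: bool,
}

impl<T> Tracked<T> {
    fn new(value: T) -> Self {
        Tracked { value, changed: false }
    }

    fn is_changed(&self) -> bool {
        self.changed
    }
}

impl<T> Deref for Tracked<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for Tracked<T> {
    fn deref_mut(&mut self) -> &mut T {
        // The wrapper cannot know whether the caller will really modify
        // the value, so it has to assume the worst.
        self.changed = true;
        &mut self.value
    }
}

/// Compare through the shared reference first and only take the
/// mutable path when the value really differs.
fn set_if_different<T: PartialEq>(slot: &mut Tracked<T>, new_value: T) {
    if **slot != new_value {
        **slot = new_value;
    }
}
```

Writing `*score = same_value` directly would raise the flag and wake every system that reacts to score changes, which is why the comparison comes first.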
So here is the graphical representation. In red we have the systems, so the functions. Handle keyboard input updates the direction resource. We saw it before: when I press a key, the direction resource changes. Send game tick is the function that waits X seconds and then sends the game tick event. That event is listened to by the tick system which, after calling the tick method on the game, updates all the resources. Because the resources change, I can update the score, the snake and the food accordingly. Why do I structure it like that? Because the last three systems I mentioned, the score, the snake and the food, can be parallelized by Bevy. Bevy has a scheduler that automatically parallelizes systems if it understands that they are parallelizable, for example when they do not mutably access the same data. So, show me the code, but probably show me the result also. Okay. So I built it in release mode, and this is important. Refresh the page, okay. As you can see here, we have 60 megabytes: not kilo, mega. And when it is not built in release mode it is heavier, up to 70 mega if I remember correctly, so crazy. Snake one, obviously, has the same user experience, and as you can see here, there are the logs also. A nice feature is that the logs also link to the particular source lines, and this is amazing, at least from my point of view. So let's dig into the code. Do we have time? Apparently, yes. So here we have the handmade snake. Remember that this is just a wrapper around our core implementation. As you can see here, there is js-sys, which is a JavaScript API crate from the WebAssembly tooling we described before. The other dependencies are just for things like tracing and, for example, a different allocator. The second dependency lets us print a message on panic, for example, and the first is wasm-bindgen. So, because I don't lie, not now at least, here we have all the wasm_bindgen attributes, on all the enumerations, the structures, and so on.
And here, under this folder, we have the classic webpack front-end stuff. I really don't know what all of this is. For building it, I use wasm-pack, which translates the Rust into Wasm, and it is used in the handmade package. The Bevy snake instead is built using Trunk, which somehow transforms the Rust plus an index.html directly into a web application. And if you are wondering how it works, why we made a handmade snake and a Bevy snake, and what the main difference under the hood is, the answer is this. This is the public repository on GitHub. And here, as you can see, there is web-sys, another API crate that lets us interact with the DOM world. So on the Rust side, we can change the canvas, because under the hood there is a canvas, and the game is displayed inside the canvas. So more or less, I am done. Thank you. If there are any questions, I will be happy to answer. Be kind. We have about five minutes for questions. We have a show of hands for questions. Have you ever played around with many more entities, like 100,000 or 1 million entities? Good question. No, I didn't. I know that the limitation here is the number of threads we have in JavaScript, in the browser. If you don't use Web Workers, for example, you are not able to scale on this part. Bevy is not using Web Workers, at least for the time being, so it is not able to parallelize, and for this reason you will probably hit a limitation. There is no internet here, but on the Bevy engine website there is a dedicated example. That example also ships as Wasm, so you can try it and give me the answer, please. Thanks. Thank you very much. Thank you very much. |
Glidesort
Efficient In-Memory Adaptive Stable Sorting on Modern Hardware |
So now we have Orson, he's going to be talking about Glidesort, very beautiful opening slides, so yeah, take it away. Thank you. Can everyone hear me? All right. Good. Thanks for coming. So my name is Orson, and I'm here to present Glidesort. I did this research at the CWI Database Architectures group. Glidesort, what is it? It's a general purpose, stable comparison sort. Does everyone here understand what that means? No. Oh, okay. Well, good luck. So stable means that it does not reorder equal elements. They stay in the original order, so essentially it makes sorting deterministic. Glidesort is a hybrid. It's a hybrid of merge sort, quicksort, and block insertion sort, which is a variant of insertion sort, and it is robustly adaptive to pre-sorted and low cardinality inputs. Don't worry, I'll talk about what that means. I made a reference implementation in partially unsafe Rust, and if you're programming in Rust you can think of it as a drop-in replacement for the slice stable sort algorithm. So you might wonder: stable quicksort? The answer is yes. A guy named Igor van den Hoven did very good work on fluxsort, where he showed that indeed you can do stable quicksort efficiently. Wikipedia will tell you that quicksort is in place, that it is done using element exchanges, and it will literally tell you that efficient implementations of quicksort are not a stable sort. Wikipedia tells you, no, you cannot do it. The standard stable sort uses extra memory to do its sorting, and if you tell people, hey, you can do the same with stable quicksort, they completely lose their minds. No, quicksort is in place, you cannot do that. That's not true. You can, and you probably should. So earlier I mentioned adaptive sorting. What do I mean by that? To adapt is to change your behavior to deal with new information or a new situation. And there are, in my opinion, two major ways that you can be adaptive in sorting.
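As a small illustration of what stable means in practice, the sketch below sorts records by one field with the standard library's stable `sort_by_key` and relies on ties keeping their input order. The `Customer` type and `sort_by_city` function are hypothetical and only exist for this example.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Customer {
    name: &'static str,
    city: &'static str,
}

/// Sort customers by city only. `sort_by_key` is documented as stable,
/// so customers in the same city keep their original relative order.
fn sort_by_city(customers: &mut [Customer]) {
    customers.sort_by_key(|c| c.city);
}
```

With an unstable sort such as `sort_unstable_by_key`, the order of customers within one city would be unspecified, which is the non-determinism the talk refers to.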
And they correspond to two schools of sorting. There is the bottom-up school of sorting. Those are your merge sorts, or your merge sort variants, like Timsort and Powersort. And they are bottom-up. They construct larger and larger sorted sequences from smaller sorted sequences. They are often presented in a schoolbook way as top-down, but really, fundamentally, they are bottom-up. And that way they can be adaptive to pre-sorted runs. If there's already a pre-sorted run in your input, you can just take that as is and continue merging up. There's also the partition school of sorts. Those are your quicksorts, your sample sorts, your radix sorts. They partition out or distribute data. They are fundamentally top-down. You start at the top, and you partition into smaller and smaller subpartitions, and that way they can be adaptive to low cardinality inputs. So what are low cardinality inputs? Essentially you have a lot of data, and you're sorting it by a subset of the data. So you're sorting your customers, but you're sorting them by which city they live in. Or you're sorting your cars, but you're sorting by the brand of the car. And even though you might have hundreds of thousands of cars, you might only have 100 brands. So essentially duplicates, at least from the perspective of the comparison operator. So how does adaptive quicksort deal with that? The idea is that during partitioning, we can detect buckets of elements that are all equal to each other. And there's a challenge with doing that. You don't want to do extra unnecessary comparisons. And we actually want to avoid three-way comparisons. That's a bit funny, because Rust's basic Ord trait uses three-way comparisons. But that's a lie. Under the hood, we turn that back into a two-way comparison, because computers aren't very good at ternary logic. They really love binary logic. They love ifs and elses. So we still turn that back into two-way comparisons.
And there's been a long history of adaptive quicksorts in this way, with Dijkstra and Hoare, I still don't know how to pronounce that, working on it. And already, time flies, eight years ago, I showed in pattern-defeating quicksort that you can detect this and handle this very efficiently. So how does that work? I have an entire earlier talk on pdqsort that you can watch if you're interested in this. But essentially, we have two different partition strategies: a partition left and a partition right. The partition left puts elements equal to the pivot on the left, and the partition right puts equal elements on the right. And what you do is, when you select a pivot, you check if that pivot is equal to a pivot we used previously. And you can do this efficiently using a single extra comparison during partitioning, or at least during pivot selection. The default is that you put the equal elements on the right. But if you detect, hey, this pivot is equal to a previous pivot, you put equal elements on the left. And this way, you implicitly do a three-way partition using two-way comparisons. And you can prove that, on average, this means that your sort is O(n log k), where k is the number of distinct values. If every value is distinct, that becomes O(n log n); we're used to that. But if you have a lot of duplicate values, O(n log k) is a lot faster than O(n log n). There's also adaptive merge sort. As I said earlier, these merge pre-existing runs in the input. The problem with solving this is that you want to minimize the number of unbalanced merges that you do. So you don't want to merge a very large array with a very small array, because that's quite inefficient. And you also want to somehow store, during your algorithm, where the runs are in memory. And if you do this in an illogical way, you potentially have to store a lot of data about where all the runs are.
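The partition-left and partition-right trick described above can be sketched as a small stable quicksort. This is a minimal illustration under stated assumptions, not pdqsort's or Glidesort's code: `stable_quicksort` and `sort_rec` are hypothetical names, the pivot is simply the middle element, and the partition allocates two fresh vectors per level for clarity. Only a two-way `is_less` comparison is used throughout.

```rust
/// Stable quicksort using only a two-way `is_less` comparison.
fn stable_quicksort<T: Clone, F: Fn(&T, &T) -> bool>(v: &mut [T], is_less: F) {
    sort_rec(v, None, &is_less);
}

/// Invariant: every element of `v` is >= `ancestor` (when present),
/// because `v` came from the right-hand side of a partition on it.
fn sort_rec<T: Clone, F: Fn(&T, &T) -> bool>(v: &mut [T], ancestor: Option<&T>, is_less: &F) {
    if v.len() <= 1 {
        return;
    }
    let pivot = v[v.len() / 2].clone();
    // One extra comparison: if the ancestor is not less than the pivot,
    // the invariant implies the pivot is equal to the ancestor.
    let pivot_repeats = match ancestor {
        Some(a) => !is_less(a, &pivot),
        None => false,
    };
    let (left, right): (Vec<T>, Vec<T>) = if pivot_repeats {
        // Partition left: elements equal to the pivot go left.
        v.iter().cloned().partition(|x| !is_less(&pivot, x))
    } else {
        // Partition right (default): elements equal to the pivot go right.
        v.iter().cloned().partition(|x| is_less(x, &pivot))
    };
    let mid = left.len();
    v[..mid].clone_from_slice(&left);
    v[mid..].clone_from_slice(&right);
    let (lo, hi) = v.split_at_mut(mid);
    if !pivot_repeats {
        sort_rec(lo, ancestor, is_less);
    }
    // When the pivot repeated, `lo` holds only elements equal to it, in
    // their original order, so it needs no further work at all.
    sort_rec(hi, Some(&pivot), is_less);
}
```

Each distinct value can be finished off as an equal bucket once it has been seen as a pivot twice, which is where the dependence on the number of distinct values k rather than on n comes from.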
And von Neumann invented merge sort very early, and Knuth described, also quite early, a natural merge sort that takes advantage of pre-existing runs in the input. And then, in particular, Tim Peters popularized Timsort, which became the default sorting algorithm in Python, and which really showed the first clever way to keep track of your run information and minimize unbalanced merges. The recent work is Powersort, which essentially extends Timsort, or has the same structure but more clever logic, and actually has mathematical proofs that it creates balanced merge sequences. And in fact, Python, I believe, now uses Powersort's logic as well. So I'm not going to go into detail on how Powersort works. I don't have enough time for that. But essentially, the core loop of it is that you create a run, and that can either be by finding a run in the input or by running a small sorting algorithm to create a small run. You compute the power of a run. That's the heuristic I'm not going to get into. And then you keep a stack of runs, and you use this power heuristic that we computed to decide when to merge two runs. And you can prove that the stack then becomes logarithmic in size, and that your merge sequences are going to be very good. But yeah, the idea is that create run can take advantage of existing runs in the input. So a problem emerges. We want to be adaptive to low cardinality inputs, and we want to be adaptive to pre-existing runs in the input. But one is fundamentally bottom-up, and the other one is fundamentally top-down. And that's why I call this Glidesort. We glide. What do I mean by that? Well, the idea is that a soaring bird only flaps its wings when necessary. Glidesort only sorts when necessary. So during this create run process, I'm sorry, before that: I changed the concept of a run to a logical run. And a logical run can be one of three things. It can be just as before.
It can just be a sorted range of elements in your array, but it can also be an unsorted range of elements in your array, or two sorted ranges that are right next to each other in your array. We change the create run function. We do, in fact, if there's a run in the input, detect that and return that as a sorted run. But if we don't detect a sorted run, we just return an unsorted run. We don't eagerly sort anything, and how you do that is very simple. You just scan through the array, and if you find a run that we consider big enough, we return it, and otherwise we just skip some elements and return an unsorted run. And then you add quite a bit of code for merging two runs, but it's actually relatively simple. As long as two unsorted runs concatenated fit in our scratch space, which is essentially this extra memory that blows people's minds, we just concatenate our unsorted runs. And otherwise, we actually physically sort the elements using quicksort, and then create one of these two-sorted-concatenated-runs cases. If we have two sorted runs, we concatenate them. If we have an unsorted run and something else, we actually sort this unsorted run and then recurse. And finally, we have our actual physical merges. So when we can no longer be lazy, when we can no longer glide, we have to actually merge elements. So that is essentially the main loop of Glidesort. It's an extension of Powersort, but you can apply the same logic to any natural stable merge sort. We don't eagerly sort small runs. We keep them as unsorted runs as long as possible. And this way, we transform the sorting problem into a sequence of quicksort calls and triple or quad merges. And doing this, we are adaptive to pre-sorted runs and low cardinality inputs at the same time. So why triple and quad merges? And there are three main reasons.
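The lazy run logic described above can be sketched with a small enum and two functions. This is a rough sketch under stated assumptions, not Glidesort's actual code: `LogicalRun`, `create_run`, `concat_lazily`, `MIN_RUN` and `SKIP` are hypothetical names and values, only ascending runs are detected, and the cases that require real work (a quicksort call or a physical merge) are represented by returning `None`.

```rust
use std::ops::Range;

/// A logical run is one of the three shapes named in the talk.
#[derive(Debug, PartialEq)]
enum LogicalRun {
    Sorted(Range<usize>),
    Unsorted(Range<usize>),
    DoubleSorted { left: Range<usize>, right: Range<usize> },
}

const MIN_RUN: usize = 4; // hypothetical threshold for a run that is big enough
const SKIP: usize = 4; // hypothetical number of elements skipped otherwise

/// Scan from `start` (which must be in bounds): return a sorted run if a
/// long enough one is found, otherwise an unsorted run with no sorting done.
fn create_run<T: Ord>(v: &[T], start: usize) -> LogicalRun {
    assert!(start < v.len());
    let mut end = start + 1;
    while end < v.len() && v[end - 1] <= v[end] {
        end += 1;
    }
    if end - start >= MIN_RUN || end == v.len() {
        LogicalRun::Sorted(start..end)
    } else {
        LogicalRun::Unsorted(start..(start + SKIP).min(v.len()))
    }
}

/// Combine two adjacent runs without touching any element when possible.
/// `None` means the caller can no longer glide and has to sort or merge.
fn concat_lazily(a: LogicalRun, b: LogicalRun, scratch_len: usize) -> Option<LogicalRun> {
    match (a, b) {
        (LogicalRun::Unsorted(x), LogicalRun::Unsorted(y))
            if x.end == y.start && y.end - x.start <= scratch_len =>
        {
            Some(LogicalRun::Unsorted(x.start..y.end))
        }
        (LogicalRun::Sorted(x), LogicalRun::Sorted(y)) if x.end == y.start => {
            Some(LogicalRun::DoubleSorted { left: x, right: y })
        }
        _ => None,
    }
}
```

The point of the design is that both successful arms only manipulate index ranges, so gliding costs no comparisons or moves beyond the initial scan.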
There's ping-pong merging, bidirectional merging, oh, sorry, before that I want to mention something quite clearly. Glidesort is not the first algorithm that is adaptive to both of these categories at the same time, but to my knowledge, at least, it is the first algorithm that is robustly adaptive. So it does not hard code anything, it does not use heuristics to decide when to switch to which algorithm; it detects this completely naturally based on the input. So why triple or quad merges? There are three main reasons. Ping-pong merging, bidirectional merging, and parallel merging. Ping-pong merging is not my idea. It's found in two earlier projects, once again by Igor van den Hoven, and in an earlier paper, Patience is a Virtue. And the idea is that in a traditional merge, you copy out part of the data someplace else and then merge back into the original array; that's an extra memcpy. With a triple or a quad merge, you can merge both into your scratch space and on the way back, because essentially when you do an out-of-place merge, you get a memcpy for free, since you're moving to some other place anyway. So I think that's best described visually. In this case, a quad merge, you have four sorted runs: you merge two into your scratch space, you merge two more into your scratch space, and you merge the two results back. And now we have eliminated three memcpys, so we don't have to do those; that's one advantage of being lazy with merging. We can also do bidirectional merging. This, to my knowledge, was first done, again, by Igor van den Hoven in quadsort, where he described a parity merge, a very clever technique to merge two equal length arrays without any bounds checks, by merging from both ends at the same time. But then I looked really into why that was fast, and how we can extend that and use it further.
So the idea behind a bidirectional merge is that if your destination and your source arrays are disjoint, you can merge from both ends at the same time. And then the pointer that's going from right to left does not interfere with the pointer going from left to right. These two pieces of logic are independent, essentially. And it essentially looks like that. Why do we want to do that? Well, modern processors are quite different from the traditional processor in your mental image. They are superscalar: that means they don't execute one instruction per cycle, no, they can execute many instructions per cycle. They are out of order: the processor will internally reorder the instructions from your assembly based on the data paths and when memory is available. And they are deeply pipelined. That means that they don't like it when the next instruction depends immediately on the result of the previous instruction, because it has to go through the entire pipeline of the processor. So to study that in a bit more detail, we look at a branchless merge, which was first described in Branch Mispredictions Don't Affect Mergesort. This is not the code that they used in that paper. This is roughly translated from Glidesort. You don't have to get into how it works. The main important part is that you analyze where each result is used, on the next slide. Well, you find that generally all the data that's computed is needed immediately. And the worst part of it all is that the next iteration cannot really start until the previous iteration is finished. If you're merging two arrays, you need to know: am I continuing with the left array or am I continuing with the right array? There are a lot of data dependencies. So my main takeaway from Glidesort, and my main low level design principle, is to interleave independent branchless loops. Branchless is important so the processor isn't jumping around and constantly canceling your pipeline.
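A bidirectional merge of two equal-length sorted slices can be sketched in safe Rust as follows. This is an illustration in the spirit of the parity merge, not Glidesort's implementation: `bidirectional_merge` is a hypothetical name, the element type is restricted to `Copy` for simplicity, and safe indexing still carries bounds checks that the real unsafe code avoids. Whether the `if` expressions compile to conditional moves is up to the compiler.

```rust
/// Merge two sorted slices of equal length `n` into `dst` (length 2n),
/// filling `dst` from the front and from the back in the same loop.
fn bidirectional_merge<T: Ord + Copy>(left: &[T], right: &[T], dst: &mut [T]) {
    let n = left.len();
    assert_eq!(right.len(), n);
    assert_eq!(dst.len(), 2 * n);

    let (mut lf, mut rf) = (0usize, 0usize); // front cursors
    let (mut lb, mut rb) = (n, n); // back cursors, one past the element

    // After i steps the front has consumed i elements in total, so neither
    // front cursor can reach n before the loop ends; the same argument
    // holds for the back cursors reaching 0.
    for i in 0..n {
        // Front: emit the smaller head; ties take `left` for stability.
        let take_right = right[rf] < left[lf];
        dst[i] = if take_right { right[rf] } else { left[lf] };
        rf += take_right as usize;
        lf += !take_right as usize;

        // Back: emit the larger tail; ties take `right` for stability.
        let take_left = right[rb - 1] < left[lb - 1];
        dst[2 * n - 1 - i] = if take_left { left[lb - 1] } else { right[rb - 1] };
        lb -= take_left as usize;
        rb -= !take_left as usize;
    }
}
```

The front half and the back half of the loop body share no data, so an out-of-order processor is free to overlap them; that independence is the property the talk builds on.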
And by interleaving, we can hide some of these data dependencies. The processor can execute multiple instructions at once, and it can essentially reduce the impact of having to constantly wait for the previous result. You can also consider parallel merging. So in this case, we had one merge that we did in parallel from the left and from the right. But we also noticed that the first step in our quad merge has two independent merges. These are essentially parallel; we're not using threads, but we can interleave their loops. So once I discovered that, I thought, let's create more parallelism. By doing a binary search, you can identify a split point where you can turn one merge into two smaller merges by swapping the right blocks in the middle. I won't go into the exact logic or the proof of that, but you can. And in fact, if you are doing an out-of-place merge, you can do this swap implicitly by just reassigning pointers, so there's no actual physical memcpy going on. However, if you're doing an in-place merge, you can actually do the physical swap. And now you have, for free, a fallback for low memory merging. So even if you don't have a large buffer available to merge with, you can use this algorithm to do it in a small amount of memory. Then I also optimized the quicksort portion of it with the same principle. I came up with what I call bidirectional stable partitioning. Again, I don't have time to get into it, but the idea is that we partition like in quicksort, so one set of elements that are less than the pivot goes here, and those that are greater or equal go somewhere else. But we do it both from the left-hand side to the right, and from the right-hand side to the left. And these two loops are independent from each other, so we can interleave them. Same principle. When you recurse, it gets a bit more involved, because now your data is in multiple different locations.
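Splitting one merge into two independent merges can be illustrated with a simple textbook variant. This is not necessarily the split rule Glidesort uses: here the left run is cut at its midpoint and a binary search (`partition_point`) finds the matching cut in the right run, and `split_point`, `merge_into` and `merge_in_two_halves` are hypothetical names. The two sub-merges are run one after the other here, whereas the talk interleaves their loop bodies.

```rust
/// Find (i, j) such that merging left[..i] with right[..j] and then
/// left[i..] with right[j..] yields the full stable merge.
fn split_point<T: Ord>(left: &[T], right: &[T]) -> (usize, usize) {
    if left.is_empty() {
        return (0, 0);
    }
    let i = left.len() / 2;
    // Everything in right[..j] is strictly less than left[i], so it must
    // precede left[i]; ties stay after it, which preserves stability.
    let j = right.partition_point(|x| *x < left[i]);
    (i, j)
}

/// Plain stable two-way merge appending to `out`.
fn merge_into<T: Ord + Copy>(left: &[T], right: &[T], out: &mut Vec<T>) {
    let (mut l, mut r) = (0, 0);
    while l < left.len() && r < right.len() {
        if right[r] < left[l] {
            out.push(right[r]);
            r += 1;
        } else {
            out.push(left[l]);
            l += 1;
        }
    }
    out.extend_from_slice(&left[l..]);
    out.extend_from_slice(&right[r..]);
}

/// Perform the merge as two sub-merges that share no input elements.
fn merge_in_two_halves<T: Ord + Copy>(left: &[T], right: &[T]) -> Vec<T> {
    let (i, j) = split_point(left, right);
    let mut out = Vec::with_capacity(left.len() + right.len());
    merge_into(&left[..i], &right[..j], &mut out);
    merge_into(&left[i..], &right[j..], &mut out);
    out
}
```

Because the output position of the second sub-merge is known in advance (it starts at index i + j), the two sub-merges could write to disjoint parts of the destination without waiting on each other.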
I can tell you this is not fun to program, but I did it, and here it is. So I do have some experiments to show you very briefly. First, the experimental setup. This is a lot of text that basically says a 2021 Apple MacBook, and these are the numbers you would get on a 2021 Apple MacBook. So at the top, I have two variants of Glidesort. One is the default variant that you would get if you were to download it. Glidesort 1024 is a variant that uses a fixed amount of memory, so 1024 elements of memory. Then we have the Rust stable sort, the C++ standard stable sort, an implementation of Timsort, pdqsort, an older algorithm of mine, which is also the unstable Rust sorting algorithm, and whatever is shipped as the standard sort in C++. You can read the slides yourself. Glidesort is quite a bit faster than the Rust stable sort right now. What isn't shown on this page are some more competitive algorithms, like fluxsort and quadsort. They trade blows on different data sets, but those are written in C, and they don't have to deal with some of the problems that we deal with, which I'll get to later in my talk when I discuss sorting in Rust. If you actually change your comparison operator, so that we're only sorting by the last byte of the integer, fun fact, if you do this, stability even becomes observable for integers. Glidesort, again, speeds up even more compared to the Rust stable sorting algorithm. So now we're over an order of magnitude faster for these data sets. If you want to use it, good news, it's released. It took me a while, but it's finally out. You can just cargo add glidesort, and you can replace your call to the slice sort with Glidesort. If there are any standard library people in the audience, come talk to me after the talk. I would love to see it integrated, so you don't have to call Glidesort, and it would just be done by default.
But this is a Rust dev room, so some of you, at least, are probably interested in Rust, so I will also talk about some Rust specifics, so what it takes to implement a sorting algorithm in Rust. And first I'm just going to rant. Unwinding panics are, I think, Rust's billion-dollar mistake. They are a complete nightmare. If you are writing unsafe code and you've ever had to deal with panics, some people in the audience are laughing, they're horrible, because since you can catch them, and we have to be sound and safe during a panic, they're essentially the same as C++ exceptions. In C++, all these functions say, if you throw an exception, tough shit, your vector is invalid now, too bad, it doesn't matter, you can't use it. You don't have that choice in Rust. You have to always be safe and sound in Rust, and ensuring that is a nightmare, especially when you're dealing with generic code in unsafe code. So if you're calling foreign code, any call can panic, which causes an unwind. So whenever you call a foreign function, you have to make sure that you are in a sound and safe state. The problem is that every single trait is foreign code. That clone call, that's foreign code. This comparison operator, that's foreign code. Writing Glidesort was a complete nightmare. Every time I compare two elements, that could cause a panic, that could cause an unwind, and you saw all this stuff that I'm doing with arrays all over the place. All of that has to be restored to the original location, because it's a mut slice, and you cannot leave a mut slice in an unsound state, you can't leave holes in it; everything has to be returned to the original array. Yeah, it's a nightmare. I really wish that instead of panicking, we would just write out a stack trace and abort and be done with it. I hate it, that's my rant. Oh yeah, and in fact, Glidesort has an actual, real performance penalty because panics are a thing.
If you're writing an insertion sort, for example, in C++ or in Python, you would just have a loop with a loop variable and you would put the items in the correct place. If you're implementing a thing like this in Rust and you're leaving gaps, so you're moving the element out during the insertion sort, you have to have a drop handler that puts this element back during a panic, because this Ord implementation, this foreign code, can cause a panic and cause an unwind. So even when I'm sorting something like integers, which cannot panic, if I don't want to duplicate my entire code base, I still have to pay this penalty for dealing with the potential for panics, by storing all my data in structs, along with all this algorithm state. So yeah, that's a problem. But I also want to praise Rust, where it is pleasurable. I love that moves are memcpys. There's no move constructor. If you want to move a type somewhere else, you essentially just copy it and you ignore wherever it came from. This also makes optimizations possible that aren't possible in C++ because of move constructors, at least not if you don't want to use template metaprogramming. For example, instead of copying an element, and this is an example actually from Glidesort, not written like this, you place an element in one of two places, and if it's going to the wrong place, it doesn't matter, because it will just be ignored or overwritten in the next iteration. If it's a small type, just place it in both; don't do a branch. So essentially, this is the opposite of unwinding panics. There are no surprises. A memcpy is always what you get. The next one is part praise, part complaining. split_at_mut, so splitting a slice into two or more. It's a one-way street. You cannot go back. Once it's split, it's split. Unless you go back to the original object, but that's not always an option.
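The drop handler described above can be sketched as a small guard type. This is a minimal sketch modelled on the well-known hole-guard pattern, not Glidesort's code: `Hole`, `insert_tail` and `insertion_sort` are hypothetical names. The element being inserted is read out of the slice, leaving a logical gap, and the guard's `Drop` writes it back into the current gap on normal exit and also if the comparison panics and unwinds.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Guard that fills the gap at `dest` with the value at `src` when dropped.
struct Hole<T> {
    src: *const T,
    dest: *mut T,
}

impl<T> Drop for Hole<T> {
    fn drop(&mut self) {
        // SAFETY: `src` points to a live value and `dest` to the current
        // gap inside the slice; they never overlap.
        unsafe { ptr::copy_nonoverlapping(self.src, self.dest, 1) }
    }
}

/// Insert the last element of `v` into the sorted prefix before it.
fn insert_tail<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {
    let len = v.len();
    if len < 2 {
        return;
    }
    // SAFETY: every index used below is within `0..len`, and at every call
    // to `is_less` each slot holds a valid value, apart from one bitwise
    // duplicate that the guard overwrites when it is dropped.
    unsafe {
        let p = v.as_mut_ptr();
        if !is_less(&*p.add(len - 1), &*p.add(len - 2)) {
            return;
        }
        // Move the tail out; `ManuallyDrop` prevents a double drop.
        let tmp = ManuallyDrop::new(ptr::read(p.add(len - 1)));
        let mut hole = Hole { src: &*tmp, dest: p.add(len - 2) };
        ptr::copy_nonoverlapping(p.add(len - 2), p.add(len - 1), 1);
        let mut i = len - 2;
        while i > 0 {
            // `is_less` is foreign code and may panic; `hole` then runs.
            if !is_less(&*tmp, &*p.add(i - 1)) {
                break;
            }
            ptr::copy_nonoverlapping(p.add(i - 1), p.add(i), 1);
            hole.dest = p.add(i - 1);
            i -= 1;
        }
        // `hole` is dropped here and writes `tmp` into its final position.
    }
}

/// Stable insertion sort built from repeated tail insertions.
fn insertion_sort<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], mut is_less: F) {
    for end in 2..=v.len() {
        insert_tail(&mut v[..end], &mut is_less);
    }
}
```

Even when `T` is an integer and `is_less` can never panic, the generic code still has to keep its state in the guard struct, which is the overhead the talk complains about.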
In Glidesort, when I concatenate these arrays, these slices, I need actual concatenation. My options were raw pointers, or storing the array with indices, but then you're storing an extra pointer everywhere and passing an extra pointer everywhere, and that's overhead that I didn't want to pay. So I came up with a thing I call branded slices. You could hold an entire talk on this, but it's essentially applying the idea of GhostCell, which some of you might have heard of, where you essentially brand a type with a unique lifetime that you cannot create yourself. This lifetime can only be created once, and it's not interchangeable with any other lifetime. And with that, you can make safe concatenation. So you could just check: is the end pointer equal to the begin pointer of the other array? If yes, we can concatenate. And that will work, except that it could also just be happening by chance because of the local array layout on the stack, and you could create unsound behavior. But if you know that they came from the same allocation, then it's safe to concatenate them after checking that end equals begin. So that's what I did with what I call MutSlice. In Glidesort, every single slice is a MutSlice type, which has a brand, so you can do the safe concatenation, and it has a state, which is one of five things: Weak, Uninit, MaybeInit, Init, and AlwaysInit. AlwaysInit is essentially a mutable slice, so you always have to return it to an initialized state. Uninit and Init are a bit more specialized than MaybeInit, where the type doesn't really encode what it actually contains, and Weak is essentially a pair of pointers. And then the code becomes a lot more readable and a lot more verifiable, by explicitly encoding your assumptions about your slice using the type, and then calling functions like upgrade to say, hey, this now becomes an exclusive mutable slice.
I'm only going to access this here, or hey, I'm now going to temporarily invalidate the initialization state of this slice. That was essentially my talk. I'm leaving academia, so if you have an interesting, potentially Rust job, my contact details are on the slides or come talk to me after the talk. I'm not interested in cryptocurrency, Web3 or similar endeavors. I love cryptography, but I don't know, some of this stuff gets rather sketchy, no offense. That was essentially my talk. I'm going to leave this on this slide. Are there any questions? I have a question. Did you test Glidesort on, let's say, less modern CPUs, like embedded CPUs that don't have out-of-order execution, et cetera? Yes. Can you repeat the question, please? The question was, did you test the algorithm on any older CPUs that might not have as much instruction-level parallelism and that kind of stuff? The answer is yes, and yes, it is slower than other state-of-the-art sorts that don't do these tricks. This is really aimed towards essentially the future of modern processors. On older CPUs, it is slower than, for example, fluxsort, which doesn't do this aggressive interleaving. But if you compare it to the current Rust stable sort that's in the standard library, it still completely stomps all over that. Can you hear me? Barely. Can you speak loudly? When you take two sorted sequences and you take the bottom half of one and the top half of another and create a third sorted sequence out of that, I thought that was an interesting observation, but what do you use it for? So it's not the top half and the bottom half. That's just the simplification. You're talking about the splitting up of merges into smaller merges, right? Yes. Yes. So it is not the top half and the bottom half. It involves a binary search to find the unique split point that allows you to do this swap. It could be bottom half, top half, but that's not necessarily the case. What do I use this for?
It creates two independent merges. After doing the swap, this merge no longer depends on this merge at all. And by doing that, I can have two independent loops that merge these and then interleave the bodies of these loops. So it executes one instruction from this merge, one instruction from this merge, one instruction from this merge, one instruction from this merge, et cetera. And that way these instructions don't depend on each other and you can hide these data dependencies and such. On top of that, I use it as a fallback for the low-memory case, so Glidesort can use less auxiliary memory. We have a last question. Thanks for the talk. I would like to know if you have a bench. I'm sorry, I cannot hear. Can you speak a bit louder, please? Did you bench when the array is already sorted? Did I bench when the array is already sorted? Yes. Yes, it's on the slides. Is it on the slides? Yes, it's the ascending column on the slides and on this slide as well. It's the one person. Sorry? It's the one person column. No, ascending. Ascent. Okay. Okay, that means sorted in this case and descending means reverse of sorted. Okay, okay. Thank you. All right. Thanks very much. Thank you very much. |
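To make the merge-splitting answer above concrete, here is a minimal Python sketch of the general idea: a binary search finds a split point so that one merge of two sorted runs becomes two merges that do not depend on each other. The function name `split_point` and the choice of splitting at half the total length are assumptions for illustration; this is not Glidesort's code, and Python cannot show the instruction-level interleaving the speaker describes.

```python
from heapq import merge


def split_point(left, right, k):
    """Return (i, j) with i + j == k such that every element of
    left[:i] + right[:j] is <= every element of left[i:] + right[j:]."""
    lo = max(0, k - len(right))
    hi = min(k, len(left))
    while lo < hi:
        i = (lo + hi) // 2
        j = k - i
        # If left[i] is not greater than right[j - 1], left[i] belongs in
        # the first part, so the split must take more elements from `left`.
        # Using <= keeps equal elements from `left` ahead of those from `right`.
        if left[i] <= right[j - 1]:
            lo = i + 1
        else:
            hi = i
    return lo, k - lo


left = [1, 3, 5, 7, 9, 11]
right = [2, 4, 6, 8, 10, 12]
k = (len(left) + len(right)) // 2
i, j = split_point(left, right, k)

# The two merges below share no data, so they could run independently.
low = list(merge(left[:i], right[:j]))
high = list(merge(left[i:], right[j:]))
result = low + high
```

Because everything in `low` is no larger than anything in `high`, concatenating the two results gives the same output as one big merge.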
How Pydantic V2 leverages Rust's Superpowers
Using Rust to build Python extensions |
So, we have Samuel here to talk about Pydantic V2 and how it leverages Rust's superpowers. Thank you very much. Can you hear me at the back? Great. A bit about me. I'm Samuel. I've been a software developer for 10 years, among other things. I've been doing open source quite a lot for the last five years, mostly Python projects, but moving a bit into Rust over the last few years. The most high-profile Python project that I maintain is Pydantic, which I started back in 2017 and has subsequently kind of taken over my life. I've been working on it full time for the last year. So, what I'm going to talk about today, I'm going to give you a bit of an introduction to Pydantic, some hype numbers for some vanity, but also for some context of why making Pydantic better is worthwhile. I'm going to explain why I decided to rebuild Pydantic completely. I'm going to talk a bit about how I've done that with Rust, and I guess most importantly why doing it in Rust is the right choice. I'm kind of preaching to the converted, but hey. What I'm not going to do is a, like, hello world, this is how you would build a Python extension in Rust. There are lots of other talks on that. They're great. And also the PyO3 documentation is amazing, so I think it's more interesting to go into a bit of depth on the challenges and the advantages than just to do the hello world example again. What is it? Well, Pydantic is a data validation library in Python. It's not the first. It's definitely not the last. It started off as a side project like so many open source projects. Nothing special. I maintained it in my spare time. People came along occasionally, said nice things, reported bugs. Occasionally said not very nice things, and then something weird happened, and its usage went crazy.
So the first thing that happened, which you can't really see on this graph, in 2018, my friend Sebastián Ramírez started the FastAPI project, which is a web framework in Python, which uses Pydantic and has now got, I don't know how many thousand stars, 60,000 stars or something. It's got a lot of attention. You can see FastAPI growth there. That got a lot of people, I think, to first find out about Pydantic, but something else happened at the beginning of 2021 to cause Pydantic's download numbers to go crazy. Now, I'm well aware that Armin's talk earlier kind of pre-trolled me before I'd even made my talk, saying that download numbers are a terrible metric, but they are the only metric, so that's what we have to use. It's also worth saying that I have actually looked at Pydantic's downloads in terms of as a dependency and as a direct download. It's not that easy to do with PyPI, but it looks like about 15 million downloads a month are as a dependency of another package, and the remaining 25 or so million are people installing Pydantic directly, so it seems like people are using it not just as a dependency of another library. I've included Django on there as the middle line, because it's the most high-profile, most well-known web framework in Python. Not to be critical of it, it's amazing, it's changed my life. I mean no disrespect by saying we've overtaken it, but just that Pydantic's usage has gone mad. In terms of how it's used, it's used by lots of organizations you would expect, all the FAANG companies, something like 19 out of the top 25 companies in the NASDAQ, but also by organizations which you wouldn't expect, like JPMorgan, who use it quite a lot, I don't know in what regard.
But it's quite interesting, if you have an open source project, if you look in analytics at the referrers, lots of those big, very security-centric companies forget to turn off the referrer header from their internal systems, and they name their internal systems things like github.jpmorgan.net, so you can see which companies are using your dependencies by looking at those referrers. So, for example, Apple have no public demonstration of using Pydantic at all, but six different enterprise instances of GitHub within Apple use Pydantic, and you can even see which ones they are. They're like maps.github.maps.apple.com, github.siri.apple.com, etc. It's also used by some cool organizations. It's used by NASA for processing imagery from James Webb, and it's used by the Intergovernmental Panel on Climate Change for processing the data that they give to the UN on climate change every month, which is the stuff that I'm most proud of and why I want to make Pydantic better. So, what's so great about Pydantic? Why are so many people using it? The short answer is I don't know, because you can't go and ask those people, and you can look at a graph, but it can't really tell you. But we can kind of look at Pydantic and what people say, and we can kind of guess at what's made it popular. So, I know we're in the Rust room and we've got some Python code. Don't worry, we'll get to Rust later. This is some Python code that demonstrates what Pydantic does. So, we have a model which kind of represents a talk, which has four fields in this case. Obviously, title is a string; attendance is an integer, the number of people who came; when, which is a datetime or None, and has a default value of None; and then the mistakes I make, which is a list of tuples with the time they were made at and a description.
So, and then lastly, last line, we instantiate an instance of Talk using that data, and if there was a mistake, we'd get an error, if there wasn't a mistake, we'd obviously get the instance. The first thing that makes Pydantic special, and the reason that people like it, is because we use Python type hints to define the types. That's become reasonably commonplace now, there are a whole suite of different libraries that do the same thing, either because it's obvious or because they're copying Pydantic, but Pydantic was the first to do that, because type hints were kind of new in 2017. And obviously, the main advantage is it's easy to learn, you don't need to learn a new kind of DSL to define stuff, but it's also compatible with static type checking, with all the rest of your code, with your IDE. Once you've defined your model, and if Pydantic's worked correctly, then you know you've got a proper instance of Talk. The frustration that caused me to create Pydantic was that type annotations existed, they sat there in the code, you could read them, but they did nothing at runtime, and so, effectively, could we try and make them work? The second and slightly more controversial thing that Pydantic does, which I think is one of the reasons that people find it easy to use, is that we default to coercion. So, you can see attendance there, although it needs to be an integer, it's passed as a string. Pydantic will automatically coerce from, for example, a valid string to an integer, but it'll also do other coercions that are a bit more commonplace, like coercing a string in ISO date format into a datetime object, and same for the durations. Some people hate that, some people complain about it a lot. I suspect that there are lots of people who don't even realize they're using it: they process environment variables, or JSON, or URL arguments, and they're always strings, and Pydantic just works and they don't even see it.
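The two ideas described above, type hints that do something at runtime and lax coercion by default, can be sketched with the standard library alone. This is only an illustration of the principle, not Pydantic's implementation or API: the `coerce` helper, the reduced three-field `Talk` class and the sample values are all assumptions made for the example.

```python
from datetime import datetime
from typing import get_type_hints


def coerce(value, target):
    """Lax conversion of `value` to `target`; raises ValueError if impossible."""
    if isinstance(value, target):
        return value
    if target is int and isinstance(value, str):
        return int(value)
    if target is datetime and isinstance(value, str):
        return datetime.fromisoformat(value)
    raise ValueError(f"cannot coerce {value!r} to {target.__name__}")


class Talk:
    # Plain annotations: on their own they do nothing at runtime.
    title: str
    attendance: int
    when: datetime

    def __init__(self, **data):
        # Read the annotations back and use them to validate the input.
        for name, target in get_type_hints(type(self)).items():
            setattr(self, name, coerce(data[name], target))


# Hypothetical input where everything arrives as strings, as it would
# from environment variables or URL arguments.
talk = Talk(title="Pydantic V2", attendance="21", when="2023-01-01T12:00:00")
```

After construction, `talk.attendance` is an `int` and `talk.when` is a `datetime`, even though both were supplied as strings.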
A few other reasons I think we're quite popular: we're fast-ish, we're friendly-ish on the bug tracker, I don't promise not to ever be cross with people, and we're reasonably feature-complete. So, that was Pydantic, it's great, lots of people are using it, what's the problem? Well, it started off as a side project for me, it wasn't designed to be successful, and the internals stink. I'm very proud of what Pydantic is doing in terms of how it's being used, I'm not proud of what's under the hood, and so I've been keen for a long time to fix the internals. Also, the second way in which Armin kind of trolled me before my talk was talking about API compatibility. We're going to have to break a lot of things in Pydantic V2 to get it right, but that's the right thing to do, I think, to get the future API to be correct and stable and not break again. And while we're building V2, why don't we do some other stuff? So make it even faster. It's already quite fast, but if you think about that number of downloads, you think about the number of CPU cycles globally every day devoted to doing validation with Pydantic, and that's currently all in Python, that's probably quite a lot of carbon dioxide that's being released, effectively unnecessarily, because we could make Pydantic significantly faster. Strict mode, I already talked about, because while often you don't need it, there are legitimate cases where you want strict mode. We have functional validators, which is effectively running some Python code to validate a field. They're useful, but they would be more useful if they could operate like an onion, so like middleware, where you take both a value and a handler, and call the handler if you want to, once you've done some processing of the value. That would be super valuable. Another thing we could add, composability.
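The onion-style functional validators mentioned above can be sketched in a few lines of plain Python. Each wrapper receives the value and a handler for the next layer in, and decides whether and when to call it. The names `wrap`, `strip_then_clamp` and `default_on_error` are hypothetical and chosen for this sketch; they are not Pydantic's API.

```python
def int_validator(value):
    """Innermost layer: lax conversion to int."""
    return int(value)


def wrap(wrapper, handler):
    """Build a validator that runs `wrapper(value, handler)`."""
    return lambda value: wrapper(value, handler)


def strip_then_clamp(value, handler):
    # Pre-process the raw value, call the inner layer, then post-process.
    if isinstance(value, str):
        value = value.strip()
    result = handler(value)
    return max(0, result)


def default_on_error(value, handler):
    # A wrapper can also catch the inner failure and substitute a value.
    try:
        return handler(value)
    except ValueError:
        return 0


# Layers are applied from the outside in, like middleware.
validate = wrap(default_on_error, wrap(strip_then_clamp, int_validator))
```

Here `validate(" 42 ")` passes through both wrappers and the inner `int` conversion, while an unparseable string is turned into the fallback value by the outer layer.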
So Pydantic, as I showed you earlier, is based on the Pydantic model. Often your root type doesn't need to be a Pydantic model or shouldn't be a Pydantic model. It might be a list, it might be a tuple, it might be a list of models, it might be a TypedDict, which is a common new type in Python. And then lastly, maintainability. Since I maintain Pydantic, I want maintaining it to be fun. So about a year ago, last March, I started as a kind of experiment: could I rebuild some of it in Rust? A year later, I'm still working on it full time, and we're nearly there. So what does it mean to validate Python data in Rust? What's the process? Well, phase one, we need to take a Pydantic model and convert it to a Rust structure. So unlike libraries like Serde, we're not compiling models in Rust. The compiled Rust code doesn't know anything about the models it's going to receive, because obviously Python developers don't want to be compiling Rust code to get their model to work. So we have to have a, in Rust terms, dynamic definition of our schema, which we can then use for validation. The way we build that is effectively these validators, which are structs that contain both characteristics of what they're going to validate, but also other validators recursively, such that you can define complex structures. So in this case, our outermost validator is a model validator, which effectively just instantiates an instance of Talk and sets its attributes from a dictionary. It contains another validator, which is a TypedDict validator, which contains the definition of all the fields, which have effectively the key that they're going to look for and then a validator that they're going to run. The first two are reasonably obvious. I've added a few constraints to show how you would manage those constraints. And then the third one, the when field, is obviously a union, which in turn contains a Vec of validators to run effectively in turn to try and find the value.
And then the last one, which is the kind of more complex type, contains this list validator, which contains a tuple validator, which contains two more validators. And we can build up effectively infinitely complex schemas from a relatively simple, I say relatively simple, principle at the outset, which is we have a validator. It's going to contain some other stuff. So what does that look like in code? I said I was going to show you some Rust code. I'm going to show you some Rust code because I think this is the clearest way of explaining what it is that we do. So the root of everything is this trait, Validator, which contains effectively three things. It contains a const, a static string, which is used for defining, as I'll show you later, which validator we're going to use for a given bit of data; build, which is a simple function to construct an instance of itself in the generic sense; and then the validate function that goes off and does the validation. We then implement that trait for all of the common types that we want. So I think we have 48 or so different validators, and then we bang all of them into one massive enum. Then the magic bit, which is provided by enum_dispatch, which is a Rust crate that effectively implements a trait on an enum if every member of that enum implements that trait. Effectively, it goes and does a big procedural macro to create an implementation of the function, which is just a big match, choosing which function to call. But it's significantly faster than dyn. And in fact, in some cases, it can abstract away everything and be as fast as just calling the implementation directly. So I said earlier that we needed to use this constant, the expected type. We use that in another effectively big enum to go through and we take the type attribute out of this schema, which is a Python dictionary. And we use that to effectively look up which validator we're going to go and build.
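The structure described above (validators that hold other validators, each with an expected-type constant, a build step and a validate step, plus a lookup on the schema's type attribute) can be sketched in Python for readability. This is an analogy under stated assumptions, not the Rust code from the slides: the class names, the `EXPECTED_TYPE` attribute and the `"items"` schema key are invented for this example, and a dictionary lookup stands in for the enum and the generated match.

```python
class IntValidator:
    EXPECTED_TYPE = "int"

    @classmethod
    def build(cls, schema):
        return cls()

    def validate(self, value):
        return int(value)


class StrValidator:
    EXPECTED_TYPE = "str"

    @classmethod
    def build(cls, schema):
        return cls()

    def validate(self, value):
        if not isinstance(value, str):
            raise ValueError("string required")
        return value


class ListValidator:
    EXPECTED_TYPE = "list"

    def __init__(self, items):
        self.items = items  # one inner validator applied to every element

    @classmethod
    def build(cls, schema):
        return cls(build_validator(schema["items"]))

    def validate(self, value):
        return [self.items.validate(v) for v in value]


class TupleValidator:
    EXPECTED_TYPE = "tuple"

    def __init__(self, items):
        self.items = items  # one inner validator per position

    @classmethod
    def build(cls, schema):
        return cls([build_validator(s) for s in schema["items"]])

    def validate(self, value):
        if len(value) != len(self.items):
            raise ValueError("wrong tuple length")
        return tuple(v.validate(x) for v, x in zip(self.items, value))


# Lookup from the schema's "type" string to the validator to build.
VALIDATORS = {
    c.EXPECTED_TYPE: c
    for c in (IntValidator, StrValidator, ListValidator, TupleValidator)
}


def build_validator(schema):
    return VALIDATORS[schema["type"]].build(schema)


# list -> tuple -> (int, str), mirroring the nested "mistakes" field.
mistakes = build_validator(
    {"type": "list", "items": {"type": "tuple", "items": [{"type": "int"}, {"type": "str"}]}}
)
```

Calling `mistakes.validate([("3", "typo"), (7, "wrong slide")])` walks the nested validators from the outside in and returns a list of `(int, str)` tuples.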
And again, I've shown a few here, but there's obviously a bunch more. This in real life is not implemented as a big match statement like this. It's a macro that builds this function, but it's clearer here so you get the idea. So I showed you earlier this validate function and I kind of skipped over the input argument. So the input argument is just an implementation of a trait. That trait, Input, the beginnings of which are defined here, effectively gives you all the things that the validation functions are going to need on a value. So is_none, strict_str, lax_str, int, float, et cetera, et cetera, but also more complex types like date, datetime, dictionary, et cetera, et cetera. And then we implement that trait on both a Python value and on a JSON value, which means that we can parse JSON in Rust directly without having to go via Python. That's super valuable for two reasons. One, for performance reasons. So if our input is a string and if we were to then parse it into Python objects and then take that into our validator and run all of that, that would be much slower than parsing in Rust and then running the validator in Rust straight away. The other big advantage is to do with strict mode. So I said earlier that people want strict mode, but they say they want strict mode, and often they don't. So what people will say is, I want totally strict Pydantic, why isn't it strict? And then you'll say, well, do you want to load data from JSON? And they say, yeah, of course I do. And you say, well, how are you going to define a date? And they're like, oh, well, obviously I'll use a standard date format. But that's not strict then. You're parsing a string. And they're like, oh, that's fine because it should know in that case it's coming from JSON. Well, obviously, how are we going to do that? By parsing JSON directly, and in future potentially other types, we can implement our strict date method on that JSON input. We can say, well, we're in JSON.
We don't have a date type. So we're going to have to do something. So we're going to parse a string. And effectively, the strict date implementation for JSON will parse a string. And therefore, we can have a strict mode that's actually useful, which we couldn't have had in Pydantic V1, where the validation logic doesn't know anything about where the data is coming from. Even if we have a parse JSON function, all it's doing is parsing JSON to Python and then doing validation. So that's all very well. That defines effectively how we do our validation. What's the interface to Python? So that's where we have this SchemaValidator Rust struct, which, using the pyclass attribute, is also available as a Python class. And all it really contains is a validator, which of course can in turn contain other validators, as I said earlier. And its implementation, the methods of which are all then exposed as Python methods, has new, which just constructs it. So we call build validator and get back an instance of our validator, which we then store, and return the type. Actually, this is much more complicated. One of the cleverest and most infuriating bits of pydantic-core is that, as you can imagine, this schema for defining validation becomes quite complex. It's very easy to make a mistake. So we validate it using pydantic-core itself, which when it works is magic, and when it doesn't work leads to impossible errors, because obviously all of the things that you're looking at as members of dictionaries are in turn the names of bits of validation. So it's complete hell, but it works, and it makes it very hard to build an invalid validator, or to not build the validator that you want. And then we have these two functions which do the validation: validate Python, for Python objects, as I said earlier, which calls validate, and the same with JSON, where we parse the JSON using Serde to a JSON value and then call validate again with that input.
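The point about strict mode depending on where the data came from can be illustrated with a small Python sketch. Two input wrappers offer the same `strict_date` method: the one for Python values insists on a real date object, while the one for JSON values accepts an ISO string, because JSON has no date type. The class and function names are assumptions for the example and only loosely mirror the trait described in the talk.

```python
import json
from datetime import date


class PythonInput:
    def __init__(self, value):
        self.value = value

    def strict_date(self):
        # A Python caller can pass a real date, so a string is rejected.
        if isinstance(self.value, date):
            return self.value
        raise ValueError("date instance required")


class JsonInput:
    def __init__(self, value):
        self.value = value

    def strict_date(self):
        # JSON has no date type, so "strict" here means a well-formed string.
        if isinstance(self.value, str):
            return date.fromisoformat(self.value)
        raise ValueError("ISO date string required")


def validate_date(input_value):
    # The validator itself does not care which kind of input it received.
    return input_value.strict_date()


from_json = validate_date(JsonInput(json.loads('"2023-01-01"')))
from_python = validate_date(PythonInput(date(2023, 1, 1)))
```

The same string that is accepted from JSON would be rejected if it arrived as a Python value, which is the distinction a single Python-only code path cannot make.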
This code is obviously heavily simplified so that it fits. It doesn't fit on the page, but nearly fits on the page. So not everything is exactly as it really is, but I think that kind of gives you an idea of how we build up these validators. The other thing missed here, we also do the whole thing again for serialization. So the serialization from both a Pydantic model to a Python dictionary and from a Pydantic model straight to JSON is all written in Rust, and it does useful things like filtering out elements as you go along, and it's effectively the same structure, uses the same schema, but it's just dedicated to serialization rather than validation. So what does the Python interface then look like? So what I didn't explain earlier is that Pydantic V2, which is going to be released, fingers crossed, in Q1 this year, is made up of two packages. We have Pydantic itself, which is a pure Python package, and then we have pydantic-core, which is almost all Rust code. We have a little bit of a shim of Python to explain what's going on, but it's really just the Rust code I've been showing you. So all that Pydantic now effectively takes care of is converting those type annotations I showed you earlier into a pydantic-core schema and then building a validator. So looking at an example here, we obviously import SchemaValidator, which I just showed you, that's come up at the wrong time, from pydantic-core, and then the schema for the base validator is model, and it contains, as I said earlier, a class, which is the Python class to instantiate, and another schema, which in turn defines the fields. And that inner validator is then defined by a TypedDict validator, as I said earlier. So this is completely valid Python code. This will run now. So yeah, we have a TypedDict validator which contains fields, which in turn are those fields which I showed you earlier. So title, attendance is of type int, and when, which I talked about earlier.
The most interesting thing here is, if you look at the when validator, it gets a bit confusing. Its schema is of type default, which in turn contains another schema, which is of type nullable, which is the simplest union, either a value or None. The default validator contains another member, which is the default value, in this case None, and the inner schema is then nullable, which in turn contains another inner schema, which is then the datetime. So that's how we define effectively default values and null or nullable. So one of the other mistakes in Pydantic in the past was that we kind of conflated these. Effectively, Python made a mistake about 10 years ago where they had an alias for a union of something and None that they called Optional, which then meant that you could have a thing called Optional that was not optional. And so we conflated nullable with optional in Pydantic and rightly it confused everyone. And so the solution from Python was to start using the pipe operator for unions and to just basically ignore Optional. They can't really get rid of it, but they just pretend it didn't really happen. My solution is to define default and nullable as completely separate things, and we're not going to use the Optional type anywhere in our docs. We're just going to use union of thing and None to avoid that confusion. And then I think mistakes, I hope, kind of makes sense to you. Again, it's this, like, schema within schema within schema, which becomes validator within validator. And we take our code, as I showed you earlier, and run validation. In this case, we call validate Python. We've got some Python data, but we could just as well have a JSON string and call validate JSON. And then we have a Talk instance, which lets us access the members of it as you normally would. So where does Rust excel in these applications? Why build this in Rust?
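The nesting described above for the when field (a default schema wrapping a nullable schema wrapping a datetime schema) can be written out as a plain dictionary with a tiny interpreter to show why default and nullable are separate concerns. The key names follow the spoken description, but the exact dictionary layout, the `MISSING` sentinel and the `validate` function are assumptions for this sketch; it does not import or reproduce pydantic-core.

```python
from datetime import datetime

MISSING = object()  # sentinel meaning "no value was supplied at all"

when_schema = {
    "type": "default",
    "default": None,
    "schema": {"type": "nullable", "schema": {"type": "datetime"}},
}


def validate(schema, value=MISSING):
    kind = schema["type"]
    if kind == "default":
        # "default" only answers: what if the field is absent?
        if value is MISSING:
            return schema["default"]
        return validate(schema["schema"], value)
    if value is MISSING:
        raise ValueError("field required")
    if kind == "nullable":
        # "nullable" only answers: is None an acceptable value?
        return None if value is None else validate(schema["schema"], value)
    if kind == "datetime":
        if isinstance(value, datetime):
            return value
        return datetime.fromisoformat(value)
    raise ValueError(f"unknown schema type {kind!r}")


absent = validate(when_schema)
explicit_none = validate(when_schema, None)
parsed = validate(when_schema, "2023-01-01T12:00:00")
```

A schema that is nullable but has no default still requires the field to be present, which is exactly the case the name Optional used to blur.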
There are a bunch of obvious reasons to use Rust, performance being the number one. Multi-threading and not having the global interpreter lock in Python is another one. The third is using high-quality existing Rust libraries to build libraries in Python instead of implementing it yourself. So I maintain two other Python libraries written in Rust: watchfiles, which uses the notify crate to do file-watching, and then rtoml, which, as you can guess, is a TOML parser using the TOML library from Rust. And the rtoml library is the fastest Python TOML parser out there. And actually watchfiles is becoming more and more popular. It's the default now with Uvicorn, which is one of the web servers. But perhaps less obviously in terms of where Rust fits in best: deeply recursive code, as I've just showed you, with these validators within validators. There's no stack. And so we don't have a penalty for recursion. We do have to be very, very careful, because, as I'm sure you all know, if you have recursion in Rust and you don't catch it, you just get a segfault. And that would be very, very upsetting to Python developers who've never seen one before. So there's a significant amount of code in pydantic-core dedicated to catching recursion. We have to have, is it two or three different sorts of guard to protect against recursion in all possible different situations, because it's effectively the worst thing that we can have, that there is some data structure that you can pass to Pydantic which causes your entire Python process to segfault, and you wouldn't know where to even start looking. So the lack of a stack is a blessing and a curse. And then the second big advantage, I think, of where Rust excels, is in the small modular components.
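One common way to protect recursive validation against self-referencing input, in the spirit of the guards mentioned above, is to track which containers are currently being validated and fail cleanly if one is entered twice. The sketch below shows that single technique in Python; the `RecursionGuard` class here is written for this example and is an assumption about the general approach, not the guards pydantic-core actually uses.

```python
class RecursionGuard:
    """Tracks the ids of containers that are currently being validated."""

    def __init__(self):
        self.active = set()

    def enter(self, obj):
        if id(obj) in self.active:
            raise ValueError("recursion detected in input data")
        self.active.add(id(obj))

    def leave(self, obj):
        self.active.discard(id(obj))


def validate_nested_ints(value, guard):
    """Validate an arbitrarily nested list of ints."""
    if isinstance(value, list):
        guard.enter(value)
        try:
            return [validate_nested_ints(v, guard) for v in value]
        finally:
            # Always leave, so the same list may appear again elsewhere.
            guard.leave(value)
    return int(value)


ok = validate_nested_ints([1, [2, "3"]], RecursionGuard())

# A list that contains itself would otherwise recurse without end.
cyclic = [1, 2]
cyclic.append(cyclic)
```

With the guard in place, validating `cyclic` raises an ordinary `ValueError` as soon as the list is entered a second time, instead of exhausting the stack.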
So where I was showing you before these validators, relatively small in terms of code footprint, which in turn hold other ones, there's obviously no performance penalty for having these functions in Rust. Well, almost none, because we actually have to use Box around validators because they hold themselves. So there is a bit of an overhead of going into the heap, but it's relatively small, particularly compared to Python. And then lastly, complex error handling. Obviously in Python, you don't know what's going to error and what exceptions you're going to get. In Rust, putting to one side the comment about panics earlier, you can in general know what errors you're going to get and catch them and construct validation errors in the case of Pydantic, which is a great deal easier than it would ever have been to write that code in Python. So the way I want to think about the future development of Python is not as Python versus Rust, but effectively as Python as the user interface for Rust, or the application developer interface for Rust. So I'd love to see more and more libraries do what we've done with pydantic-core and effectively implement their low-level components in Rust. So my dream is a world in which, thinking about the lifecycle of an HTTP request, but you could think the same about some ML pipeline or many other applications, effectively the vast majority of the execution is Rust or C, but then all of the application logic can be in Python. So effectively we get to a point where we have 100% of developer time spent in high-level languages, but only 1% of CPU dedicated to actually running Python code, which is slower and is always going to be slower. I don't think there's ever a world in which someone's going to come up with a language that is as fast and as safe as Rust, but also as quick to write as Python. So I don't think it should be one versus the other.
It should be building the low-level, building the rails, perhaps a bad term, but the rails in Rust, and building the train in Python. It doesn't work, but you get where I'm coming from. Anyway, on that note, thank you very much. A few links there, particularly thanks to the PyO3 team who built the bindings for Rust in Python, which is amazing. And if you want a laugh, there's a very, very funny issue on GitHub where a very angry man says why we should never use Rust. So if you want to read that, I then took some time to take them to pieces, which was quite satisfying, although a waste of time. So have a look at that. Questions? First, especially for the validation, are you thinking of publishing a Rust library? The job is already done, and you could have a public API in a library ready to validate Rust data. I don't understand quite what... So you wrote the library in Rust, so could you publish just an API to validate JSON, for example, from Rust, instead of through Python? Absolutely, you could, and it would be useful if you wanted to somehow construct the schema at runtime fast, but it's never going to be anywhere near as performant as Serde, because we're not compiling... We can't do anything at compile time. Secondly, it's currently all completely intertwined with the PyO3 library and the Python types. So there is a future nascent possible project, Tidantic, which is Pydantic for TypeScript, where we take the PyO3 types, we effectively replace them with a new library which has a compile-time switch between the Python bindings and the JavaScript bindings or the Wasm bindings, and then we can build Tidantic. That's a future plan, but a long way off. Right now, it wouldn't really be worth it, because you would get lots of slowdown from Python and from compile time. So we need a completely different library, just for Rust, like you're saying. Yeah, Serde is amazing. I don't think I'm going to go and try and compete with that.
At least, it's great for that application. Thanks for the talk. Recently, I think the Python library cryptography introduced Rust, and had some complaints from people using obscure build processes where Rust didn't work. Are you expecting anything from that? So I will actually bring up now. Now I'm going to get into how to... Effectively, go and read that issue, where, among other things, I... Oh, how do I get out of this mode? So, rant, rant, rant, rant from him. Effectively, I went through the... just over a quarter of a billion downloads over the last 12 months of Pydantic, and I worked out looking at the distribution of the different operating systems and, like, libc implementations, et cetera, that 99.9859% of people would have got a binary if they had installed Pydantic Core then. That number will be higher now, because there will be fewer esoteric operating systems. Most of the other ones, most of the failed ones, if you look, are actually installing, say, they're installing Python onto iOS. I don't know what that means, or whether it could ever work, but... Also, the other thing I would say is, Pydantic Core is already compiled to WebAssembly, so you can already run it in the browser. So I understand why people complained, but I think it's not a concern for... it's a straw man for most people. So that's why you slapped down. Yeah. And, again, if there's another... if there's a distribution that we don't... if we release 60 different binaries, if there's another one, we'll try and compile for it and release the binary. There's a question right at the back, I think, just to... I'll get back to the talk rather than... where are we? Is there a way to use the Django models as Pydantic models? Say again? To use the Django to have, like, a binding or to translate the Django model directly into a Pydantic model? There's no way at the moment. There's a number of different ORMs, I know of, built on top of Pydantic, which effectively allow that... 
if you were wanting specifically Django, there's a project called Django Ninja that makes extensive use of Pydantic. I don't know that much about it, but if you actually wanted Pydantic models, you'd probably want some kind of code reformat to convert them. So I'd look at Django Ninja, I'm sure what they're doing is the best of what's possible right now. Okay, thank you. If you had additional time, say, after finishing Pydantic, are there any other projects where you'd like to follow this vision of, like, a Rust core with Python user space or, like, API? Yeah, there are a number of ones. So there's already orjson, which is a very, very fast, if unsafe in the sense of littered with unsafe, JSON parser. The obvious one is a web framework where you do, like I kind of showed here, the HTTP parsing, the routing, all in Rust. That's not very easy using ASGI. There are already a few projects doing that. So that would be the obvious one, but there's no winner yet. Currently, the best low-level web framework is Starlette, which FastAPI is built on, but I think it does use a Rust library for HTTP parsing, or a C library. So some of it's already happening, but there's no obvious candidate right now. What I would say, though, is libraries like Rich, no criticism of Will, but, like, Rich is incredibly complicated. It's for terminal output. It's not so much performance critical, but it's really quite involved, complex logic. I would much prefer to write that logic in Rust than Python. Yeah, I think there are lots of candidates. Tonya online is asking, what do you mean by Python as the application layer? So I guess I could have added some example code here, but you can imagine a Python function, which is a view endpoint in a web framework, which takes in some validated arguments, validation done by Pydantic.
You then decide in Python to make a query to the database to get back the user's name from the ID, and then you return a JSON object containing data about the user. If you think about that, all of the code outside the Python function, excuse me, could be written in a faster language, whether it be the database query, accessing the database, TLS termination, HTTP parsing, routing, or validation, effectively using Python as a way to configure Rust code, or configure compiled code. Yes, hello. I have a question, just a Pydantic one. Is there any support, or are you planning any support, for alternative schema types like Protobuf, or gRPC, or Avro? Possibly in future. What I have a plan for is, I don't want to build them into Pydantic. Pydantic's already big, but obviously you can parse them to Python now: parse them and then validate them as a Python object. There is a plan effectively to take the, this, that's the one. Which you would then construct in Rust and pass as a Python value into Pydantic Core, which would then extract the raw underlying Rust instance and then validate that. And that would allow you to get basically a Python validation effectively, but without us having to either have compile-time dependencies or build it all into Pydantic Core. I think that's the last question that we have time for. One comment we did get from Matrix was that this code is a bit small on the display, so if you upload the, I will do, yeah. Perfect. So if you're watching the stream, the slides will be uploaded and you can read the code. Oh, I'll put them on Twitter as well, but yeah, definitely. I'll upload them as well. Awesome. Thank you very much. |
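The "Python as the application layer" answer above can be pictured with a small sketch. This is only an illustration under stated assumptions: `USERS`, `GetUserArgs`, `validate`, `get_user`, and `handle` are hypothetical names, not part of Pydantic or any framework, and the parts marked as stand-ins are the ones the speaker suggests could live in compiled code.

```python
from dataclasses import dataclass

# Hypothetical in-memory "database"; a real application would query a server.
USERS = {1: "alice", 2: "bob"}


@dataclass(frozen=True)
class GetUserArgs:
    """Arguments that have already been validated before the view runs."""
    user_id: int


def validate(raw: dict) -> GetUserArgs:
    """Stand-in for the validation layer that could be written in Rust."""
    user_id = int(raw["user_id"])  # raises on missing or non-numeric input
    if user_id <= 0:
        raise ValueError("user_id must be positive")
    return GetUserArgs(user_id=user_id)


def get_user(args: GetUserArgs) -> dict:
    """The view endpoint: the only part written as plain Python logic."""
    name = USERS.get(args.user_id)
    if name is None:
        return {"status": 404, "body": {"error": "user not found"}}
    return {"status": 200, "body": {"id": args.user_id, "name": name}}


def handle(raw_query: dict) -> dict:
    """Stand-in for routing plus validation, which then calls the view."""
    return get_user(validate(raw_query))
```

In this picture, everything except `get_user` is infrastructure, and the Python function mostly describes what the surrounding machinery should do with a request.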
Scalable graph algorithms in Rust (and Python) |
Paul, we're going to talk about scalable graph algorithms in Rust. Thank you. Hello. I am Martin. I'm here with Paul. And we talk about scalable graph algorithms in Rust and some Python. Who are we? We're both engineers at a company called Neo4j, and the J does not stand for R, like Rust, unfortunately. But we will talk a little bit about what we do in our day job, which is that we work on a product called Neo4j Graph Data Science. Neo4j is a graph database. Maybe you heard of it. It's written in Java. And the two of us, we work on a project called Graph Data Science, which is essentially a plug-in for the Neo4j graph database. And it provides a collection of graph and machine learning algorithms that we deploy to our customers, and they use it for many different things, but the top three applications of these things are fraud detection, recommendation, and identity resolution. And we have customers with graphs of up to 10 billion nodes and 65 billion relationships that they compute our algorithms on. And you can find out more about the product, and the source code is available online, and in the documentation, of course, if you follow those things. Last week, we released version 2.3 of the Graph Data Science product, and what you can see here is basically all the graph and machine learning algorithms that we provide to our customers or users, ordered by some category, and some of them you probably know from university, like Dijkstra path searching algorithms or connected components algorithms. Okay, but we are in the Rust devroom, so why do we talk about Java? So first of all, Paul and I, we discovered Rust like two or three years ago, and we started building smaller tools and libraries, fell in love with the language, and we were curious about how we can actually do what we do at work, like implementing those parallel algorithms in Java: how well would they perform if we did the same thing in Rust, and also made a bit more use of what Rust has to offer?
And we had a quick look around. There is only one graph library that is, I think, very popular, which is called petgraph, which is focusing on being a NetworkX replacement, and it focuses on single-threaded execution of graph algorithms, basically textbook implementations. So we thought we want to go a step further and do the parallel implementations of the graph algorithms. So that's how we started the project. We started in May 2021. First of all, it was an experiment, basically, or a hobby project by the two of us, where we tried to figure out what's the maximum performance we can get out of this implementation. And then over time, yeah, we got more interest in the project, and we added some more implementations of different algorithms. It's not a lot, but you will see that later. And we added some Python support that we will talk about in a demo. We added an Arrow server so that you can use this thing in a network, for example, and everything is available on GitHub and MIT licensed. The project itself consists of five crates. Today we will talk about three of those. The one at the bottom is the graph builder, which is essentially the abstraction or data structure that represents the in-memory graph that we use to compute on. It's a CSR, compressed sparse row, representation. And it also has a builder API, hence the name, to allow users of this library to construct graphs, either programmatically or from files. On top of that, we have the actual graph crate. And yes, the name was free on crates.io. It wasn't actually free, but the owner wanted to give it away anyway, so we took it, lucky us. So yeah, this contains some algorithms, and then we have GraphMate, which are our Python bindings on top of the graph and the graph builder crates. The server is the Arrow server, and the app is the CLI application, which we won't talk about today. So let's start with the graph builder.
The graph builder is basically an API for building directed and undirected so-called property graphs. What you can see on the right-hand side is an undirected graph consisting of five nodes, zero to four, which are connected by edges. And they have no direction, hence the graph is undirected. How do we construct such a graph? So what we show here is the Rust API. So the main entry point that you can see is the graph builder, which is this thing here. And the graph builder is just instantiated, and you can call the edges function, which takes, in this example, an array of tuples where the first value is the source node and the second value is the target node, to describe essentially the graph on the right-hand side. We call build, and what we want to construct here is basically controlled by the type that we assign to the g variable, which is an undirected CSR graph. So it's very similar to collect, where you can specify that you want to collect into a Vec or a String or something like that. Basically we use the type system here, which is very nice or expressive in comparison to Java, to define what the output of this function call is. And we have this undirected CSR graph, which is a concrete implementation of a graph. And the type parameter we use here, u64, is basically the type for the node IDs. And then once you have this constructed, you have several methods available, like the degree, which is the number of edges that are connected to a node. So the degree of node one is three because it's connected to zero, two, and three. And you can get the neighbors of a specific node as an iterator. And in this particular implementation, you can also turn this iterator into a slice, which means you basically have zero copy if you want to access the neighborhood of a node, which is very useful if you want to implement performant graph algorithms.
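To make the compressed sparse row idea above concrete, here is a minimal sketch in Python using only the standard library. It is not the API of the graph builder crate: the function names are hypothetical, and the edge list is an assumed example chosen so that node one is connected to zero, two, and three, as in the talk. A `memoryview` over an `array` plays the role of the zero-copy slice.

```python
from array import array


def build_undirected_csr(node_count, edges):
    """Return (offsets, targets) arrays describing an undirected graph.

    The neighbors of node n are targets[offsets[n]:offsets[n + 1]].
    """
    adjacency = [[] for _ in range(node_count)]
    for source, target in edges:
        # An undirected edge is stored once in each direction.
        adjacency[source].append(target)
        adjacency[target].append(source)

    offsets = array("Q", [0])
    targets = array("Q")
    for neighbor_list in adjacency:
        targets.extend(sorted(neighbor_list))
        offsets.append(len(targets))
    return offsets, targets


def degree(offsets, node):
    """Number of edges connected to a node."""
    return offsets[node + 1] - offsets[node]


def neighbors(offsets, targets, node):
    """Zero-copy view of a node's neighbors (no elements are copied)."""
    return memoryview(targets)[offsets[node]:offsets[node + 1]]


# Assumed example graph with five nodes, zero to four.
EXAMPLE_EDGES = [(0, 1), (1, 2), (1, 3), (3, 4)]
```

With these edges, `degree(offsets, 1)` is 3 and `neighbors(offsets, targets, 1).tolist()` is `[0, 2, 3]`. Because each node's neighbors sit contiguously in one flat array, reading a neighborhood is a single range lookup.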
Now we want to turn this into a directed graph, which means now our edges have these little arrows at the end, which means an edge always has a source node where it starts and an end node where it ends. The only thing that we need to change here is basically the return type, or the type of our g variable, which is now a directed graph. And we have additional methods. We have the out degree, because now we have to differentiate between the outgoing edges and the incoming edges, and the same for the out neighbors and the in neighbors. What we can also do is, since we are talking about property graphs, which means we have properties on nodes and also on edges, we can add node values as another builder method or builder function. Again an array, which is node count minus one length, where you basically provide the node values that you want to add to your nodes. It gives you an additional function called node value to access the value. Similar for edge values: now you provide basically a triple, where the third value is the relationship value. For example, here, the 0.1 for the edge between zero and one, which is this one. All right. And we get another method down here, the out neighbors with values, which in addition to the target ID of this edge also gives you the relationship weight, or the edge weight. For convenience, we have this so-called GDL string that you can provide. GDL stands for graph definition language, which is another crate that we wrote. It's basically a subset of the Cypher query language, which is the main query language of Neo4j, and it allows you to declaratively describe the graph on the right-hand side using ASCII syntax.
So basically this n zero in parentheses is node zero, and this kind of JSON-style map here is the property map, like P1 for example, to describe node zero. An edge is expressed by node zero, where you can refer to a variable that you declared earlier, and you connect it to another one called n1, which is this edge description, and you can also provide properties on this edge. It's basically used for testing or for building small POCs to play around; it's much simpler than using the edges method and so on. But it's basically the same graph that we constructed before. So as you can see, you can create those graphs programmatically if you use the edges method and so on. The construction is also parallelized under the hood using Rayon. But the main use case is usually reading graphs from files, and we have a graph input trait that you can implement. We provide three implementations. The most common one, especially if you want to start playing around, is using an edge list, which is basically a text file where in each row you have a source and a target and an optional value. And Graph 500 is a benchmark specification for HPC graph algorithm benchmarks, and they also provide a data generator, and we can basically read the binary file format that this generator produces. Like I said, everything as part of graph creation is parallelized using Rayon, and we will see this in action in the demo. The next crate I want to just mention briefly is the graph crate, which contains the parallel graph algorithm implementations. At the moment, it's these four, which we implemented in the first place to compare them to our Java implementations. So for example, PageRank is an algorithm to give a popularity value to a node. It basically tells you, if you traverse the graph randomly, how likely it is that you end up at that node, so it's kind of a popularity metric.
Connected components, Paul will talk a little bit about that in the demo. Again, the graph algorithms are also parallelized using Rayon, and if you want to see more, just open an issue or a PR. Just a quick look at the Rust API and how we call this algorithm: in the graph prelude, we also provide all the algorithm methods, for example, PageRank, as you can see in the middle. The first thing we do here, we have a GDL graph again. It's the graph on the right-hand side without any properties. We create a directed graph with a specific layout, which Paul will talk about, and then we call the PageRank method using that graph as a reference. You can specify some parameters in the PageRank config, which are not that important right now, and the result is the scores, which are the values assigned to each node after the computation is done. And that's basically it. Okay, over to Paul with GraphMate. Yeah. Hi, I'm Paul and I want to talk about GraphMate, which are our Python bindings over this set of crates that Martin just talked about. And we just had a wonderful talk about Python APIs on top of Rust, and this is in the same spirit. So we want to expose a Python API for Rust implementations, and we don't want to deal with all the shortcomings in Python in terms of proper parallelism and memory management. We also integrate with NumPy and pandas, which are de facto standard libraries in anything Python. It's very alpha, so it works and you can install it via pip. It's available on PyPI, and I just want to run through a demo, which is a notebook. And there we go. So first we, I think I need to clip this on once again. Okay, we configure some logging so we can see some outputs, and we import the typical Python prelude. We import our crates as well as NumPy and pandas.
And in this demo, we are loading a Graph 500 graph, in particular scale 24. Graph 500 describes it as follows: you have your scale number, two to the power of the scale number of nodes, and 16 times as many relationships. So we load that file. We also load a directed graph, and it takes a few seconds, and at the end we will get a graph that says we have about 16 million, almost 17 million nodes and about 260 million relationships. And we are loading a directed graph, which means we have two sets of logging outputs that look very similar, because we do it once for the outgoing and once for the incoming direction. And we also use the deduplicated layout, which will merge parallel edges between the same source and target node pair. It will de-duplicate them, and we only represent one of them in the graph. And with that we can run PageRank. So PageRank is a method on the graph object that we get, and PageRank is an iterative algorithm. It runs in a number of iterations, and when it finds that the result is good enough, it will stop, and we can now access some metadata about the run. So we see we ran eight iterations in about 1.3 seconds, but now we also want to access the actual scores, the PageRank scores. In the other slide that Martin showed, we store the scores in a Vec of f32, and we don't want to copy that Vec into Python land, into the Python heap, and convert the floating point numbers into Python numbers. So we are interfacing with the C API from NumPy and we return an array view that points to that Vec. So when you call scores, there's no copying involved. We return you a NumPy array that directly accesses the data, and there's nothing to be copied into the Python heap or anywhere else. And you can use that array, it's a proper NumPy array, and you can use that, for example, to put it into a pandas DataFrame and then do some calculations based on that. The numbers here don't really mean anything in particular, it's just for demonstration.
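The iterative nature of PageRank described above, repeating until the result is good enough, can be sketched in plain Python. This is a textbook, single-threaded version under assumptions: the function name, the parameter names, and the defaults (`damping`, `tolerance`, `max_iterations`) are illustrative and are not taken from the library's PageRank config, and the parallel Rust implementation from the talk will differ.

```python
def page_rank(node_count, edges, damping=0.85, tolerance=1e-4, max_iterations=100):
    """Return (scores, iterations_run, converged) for a directed graph."""
    if node_count == 0:
        return [], 0, True

    out_degree = [0] * node_count
    in_neighbors = [[] for _ in range(node_count)]
    for source, target in edges:
        out_degree[source] += 1
        in_neighbors[target].append(source)

    scores = [1.0 / node_count] * node_count
    for iteration in range(1, max_iterations + 1):
        # Nodes without outgoing edges spread their score over all nodes.
        dangling = sum(scores[n] for n in range(node_count) if out_degree[n] == 0)
        base = (1.0 - damping) / node_count + damping * dangling / node_count
        new_scores = [
            base + damping * sum(scores[s] / out_degree[s] for s in in_neighbors[t])
            for t in range(node_count)
        ]
        # Stop once the total change between two iterations is small enough.
        delta = sum(abs(new - old) for new, old in zip(new_scores, scores))
        scores = new_scores
        if delta < tolerance:
            return scores, iteration, True
    return scores, max_iterations, False
```

The returned iteration count and convergence flag correspond loosely to the run metadata mentioned in the demo; the scores always sum to one, so they can be read as the probability of a random walk ending at each node.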
And the next algorithm we want to run is WCC, which stands for weakly connected components. So it basically identifies components within a graph. Every set of nodes that is connected together is one component, and we run that. It takes about 200 milliseconds; we're still running on the same graph. And similar to PageRank, we can access the data here, and the data is an array where, for every position in that array, so for that node ID, the value is the component ID. So every node that is together in the same component will have the same component ID in that array. And we can use a pandas method here, drop duplicates, which will give us all the unique component IDs, so we can identify the number of components here. And so we see we are down from 16 million something nodes to almost eight million unique components. And yeah, that is WCC, and for the last thing we want to count the total number of triangles in the graph. A triangle is defined as a connection between three nodes, from A to B to C back to A. And for that, first we need to convert the graph into an undirected graph. There's a method there. And it'll take a little while, a few seconds, because it's creating a new graph. We have to basically merge those two out and in lists together, we produce a new graph, and since it's a new type in Rust, we also return it as a new graph. Which means if we're low on memory, we can delete references to the old graph; we don't need that anymore. There's a particular optimization for triangle counting that makes it not be super slow, which we call make degree ordered. I don't really want to go into details about what it's actually doing, but let me just run triangle counting here. It makes it so that triangle count finishes within a minute, not within five minutes. And that optimization only takes like one and a half seconds.
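Both algorithms from this part of the demo can be sketched in a few lines of Python. The sketch below uses union-find for weakly connected components and a degree-based orientation for triangle counting. These are common textbook techniques, not necessarily what the library does internally; in particular, the talk does not explain its degree ordering optimization, so the ordering used here is an assumption about the general idea.

```python
def weakly_connected_components(node_count, edges):
    """Return a list mapping each node ID to its component ID."""
    parent = list(range(node_count))

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        root_a, root_b = find(source), find(target)
        if root_a != root_b:
            # Edge direction is ignored; the smaller root becomes the ID.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return [find(node) for node in range(node_count)]


def count_triangles(node_count, edges):
    """Count triangles in the undirected version of the graph."""
    adjacency = [set() for _ in range(node_count)]
    for source, target in edges:
        if source != target:  # self loops cannot be part of a triangle
            adjacency[source].add(target)
            adjacency[target].add(source)

    def rank(node):
        # Order nodes by degree, breaking ties by ID.
        return (len(adjacency[node]), node)

    # Keep only edges pointing to higher-ranked nodes, so every triangle
    # is found exactly once, starting from its lowest-ranked node.
    forward = [
        {m for m in adjacency[n] if rank(m) > rank(n)} for n in range(node_count)
    ]
    return sum(len(forward[n] & forward[m]) for n in range(node_count) for m in forward[n])
```

Counting unique values in the component list, for example with `len(set(components))`, corresponds to the drop duplicates step in the demo.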
At the bottom, you can see the htop output, so you can see that it's actually using all the cores and proper parallelism, without any of the typical Python shenanigans that you need to do to avoid the GIL and so on. And we don't have to watch it finish. We can go back to the presentation. This is a summary of the demonstration that we just went through. We don't need to look at it anymore. In our repository, we have three variations of the demonstration. We have the first one in Python that we just showed. We have another one using the Rust API, and the third one using the Arrow server that Martin mentioned, where there's a Python client, but it's not using the library directly. It's using Arrow and Arrow Flight to communicate with the server and doing it remotely. I mean, if you're interested in those demos, you can follow those QR code links at the end. And I think by now, triangle counting should be done. So it took about a minute and it found 10 billion triangles. If I count it correctly, it seems, yeah, it seems that way. So that is it for the demonstration. Now we can look back a little bit and talk about the lessons learned, particularly for us coming from the JVM world. And so, using Rust as a Java developer: first of all, the Rust paradigms require us to think differently about the code and allow us to think differently about the code. Things like using the type system to define whether we have a directed or an undirected graph. And this is, of course, very nice and very refreshing coming from a Java world. But also, we have a better mechanical sympathy for what happens. We don't have to think about this JVM black box where things go through before they touch the hardware. The ecosystem, Cargo, rust-analyzer, is very, very nice. But also, of course, there are some downsides to it. We don't have that experience of just clicking a fancy button in the IDE to run a debugger or a profiler.
We have to actually learn different tools and do things the proper way, I guess. Yeah, but what about performance? We talked about what we want to do in terms of performance. And for every algorithm that we have implemented, we are faster and less memory-intensive than the Java implementations. It's not just about that. It's also predictable behavior. No latency spikes, no allocation rates, no JIT compiler that does things in the back. And just quickly showing what we want to do in the future: of course, expanding all the things. And if you want to play around with it, feel welcome to open issues. There's also a longer version of this talk, because I'm already out of time, so thank you. We don't have time for all of this. |
Using Rust for your network management tools!
Let the crabs control the packets! |
Yes, Fernando is fine. Fernando, he's going to talk about using Rust for your network management tools. Take it away. All right. Thank you. All right. So, welcome everyone. My name is Fernando. I'm a senior software engineer at Red Hat. I work for the Networking Services Team, mainly focused on network management tools, and today we are going to talk about our journey building a Rust tool for network management. So, okay. We did not start with Rust. We started with Python, but after some time we decided that we wanted to shift to Rust. So, this is two talks in one. One is how we built the project in Rust, and the other is what we learned when moving from Python to Rust. Okay. So, network management. What's network management? Basically, it's all the operations that you do to configure your networking. Routes, interfaces, DNS, firewalling, whatever you do, it's network management. So, it's a process that is quite complex because it requires a lot of coordination between user space and kernel space. We need to check when we get notifications from kernel space, because other tools could modify the network status. We also need to communicate with the kernel in order to configure stuff. So, it's quite a complex task. There is already a tool, which is NetworkManager. It's the default tool that is used in almost all situations for managing your Linux network configuration, and we were willing to use it and to build on top of NetworkManager, because implementing everything was really, really complex. So, we created Nmstate, and Nmstate is a tool that communicates with NetworkManager. It's a library with a command line tool, and it allows us to configure the network using declarative states. So, you can define what you want, and you don't need to care about what NetworkManager or the kernel is doing, what they are going to do, or what the dependencies are. You don't need to care about any of that.
Nmstate is going to manage it, so it makes everything easier. So, as I say, we started to build Nmstate in Python, and one day we noticed that a lot of our users wanted to ship a binary and not Python, and not use the Python environment, and, well, there were also some performance issues because we need to do a lot of operations. So, we decided to give Rust a try, and we had a problem, which is that we have a library and a binary, and we needed to move both of them to Rust, and also we already have a big base of users. So, we could not break them, and we needed to do it in a way that supports all the features that we already had in Python. So, well, we created our own Nmstate library in Rust, I will tell you how, and also the nmstatectl tool, which is the command line tool. All right. So, the first thing is that we are using YAML files and JSON files, and we are parsing them. In Python, this was quite trivial with a schema, and we needed to find a way to do it. In Python, we were using dictionaries, so the user could create a dictionary, and using a YAML library it was quite trivial to convert that YAML into a dictionary, and we needed something in Rust to do this. So, we ended up looking at Serde. Serde is a framework for serializing and deserializing Rust data structures efficiently and generically. We use it for YAML and JSON, but it supports other formats. This allows us to keep our declarative state and keep our API, so that was pretty good, and we noticed that Serde allows us to implement our own serializers and deserializers. So, that was also a big plus, because we could do validation steps and simplify the validation work when getting the configuration file from the user, and then there were a lot of decorators in Serde, so it was quite good for creating aliases, for creating multiple helper functions, and also for some conditional deserialization and serialization. So, here's an example.
For example, this is an interface state for a bond, and we basically define that it is up, that it has an IPv4 address with this address and this prefix length, and that it is enabled, and then we define the link aggregation options. So, we have the mode, the options, and the ports. One really good thing that we have is partial editing. So, you can define what you want to change, and we are going to merge it with what you already have configured on the system. About the decorators, as you can see there, we were able to use the decorators for, for example, accepting numbers as a string, accepting strings, accepting only the number, custom strings, creating aliases, renaming, yeah, all of that, and it was quite good. So, okay. We communicate with NetworkManager to configure the network state, and we had a problem, which is that before we were using the libnm bindings, the Python bindings, and they were not available in Rust. We tried to create Rust bindings, but it was quite complex because they use GObject and we did not have GObject, and it was a big mess, but we noticed that NetworkManager provides a D-Bus API. So, we said, okay, let's use D-Bus then, and we noticed that there is a crate, which is zbus, and with zbus, we were able to communicate with NetworkManager using the D-Bus API, and we were able to encode the data structures that we were using to communicate with NetworkManager and configure the settings that we wanted. So, using this, we solved one of the problems, which is telling NetworkManager what we want to do, and also fetching what NetworkManager already has, which is also important because, all right, there are some options that maybe we do not want to overwrite because the user configured them that way, and for partial editing, that is important. We need to know what the user configured and what the user wants to modify. So, okay, one problem solved.
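The partial editing idea above, where the user states only what should change and the rest is taken from the system, can be sketched as a recursive merge. This is a simplified illustration in Python, not Nmstate's actual merge logic: `merge_dict` and `merge_interfaces` are hypothetical helpers, the state layout is an assumed example, and lists such as addresses are simply replaced as a whole.

```python
import copy


def merge_dict(current, desired):
    """Overlay the desired settings on top of the current ones."""
    merged = copy.deepcopy(current)
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key.
            merged[key] = merge_dict(merged[key], value)
        else:
            # Scalars and lists from the desired state win outright.
            merged[key] = copy.deepcopy(value)
    return merged


def merge_interfaces(current, desired):
    """Merge two lists of interface states, matching interfaces by name."""
    merged = {iface["name"]: copy.deepcopy(iface) for iface in current}
    for iface in desired:
        name = iface["name"]
        merged[name] = merge_dict(merged.get(name, {}), iface)
    return list(merged.values())
```

For example, a desired state that only sets a new MTU on a bond keeps the bond's existing IPv4 address, and interfaces that are not mentioned at all are left untouched.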
Then, we had another problem. So, NetworkManager does not always provide real-time information from the kernel, and we needed that because we also do verification. So, when you configure something, Nmstate does a verification step, which compares what the user defined with what is configured on the system. We had a problem because NetworkManager was not providing real-time information, and sometimes it took quite some time to get the information that we wanted, and we were having some problems with the verification. So, we were looking for a library, and we did not find any library that satisfied our requirements, but we noticed that there is already a rust-netlink library. Netlink is a kernel API for communication between user space and the kernel, and also, I think, between kernel components, and it was perfect. We could use rust-netlink, which is an existing library, to build another tool, which is Nispor. So, Nispor only queries information from the kernel and shows it to you in a YAML file or, basically, in proper data structures. Well, it was quite good because we started to contribute to rust-netlink, because rust-netlink was an independent project, and it didn't support everything. So, we were able to help there, and currently a lot of people use rust-netlink. It's quite a big project, and probably the one that most people use when they need to work with Netlink and Rust. So, we had one more problem. Okay. Now, we have NetworkManager working, we have verification working, validation working, we can read the configuration, we can do a lot of stuff. But then, networking is complex, and there is one thing that is called OVS, OVSDB, and NetworkManager configures OVS, right? But it does not configure global OVSDB settings. And that was a problem because we wanted to do that. So, how did we do it?
We basically started to use sockets, and using the Rust std library for stream sockets, we were able to communicate with OVSDB, send requests, read what is already stored in OVSDB, and configure whatever OVSDB settings the user wants to configure. So, using the serde_json library, we created our own JSON-RPC to communicate with OVSDB. This is internal to Nmstate. We have considered putting it in a separate crate, but we have not done it yet. Then, we had another problem. Okay. I promise this will end. We are going to have a solution. It will stop at some point. So, we had users, the users were using our Python library, and some of them were willing to move to Rust, some of them were willing to move to Golang, but we were already developing a Rust solution. And some of them didn't want to move from the Python code to Rust. So, what we did is create bindings, and we created plenty of them. First of all, we created C bindings. So, C users could use the Rust library. Then, from the C bindings, we created the Python and Golang bindings. And finally, one of the other problems that we had is that we had a huge integration test base, and we wanted to reuse it. So, with the Python bindings, we were able to integrate the Python integration tests that we had into our Rust library. So, it was quite cool because we were able to start building the new Rust Nmstate but, at the same time, use the Python integration tests. And this way, we were sure that we were not breaking any existing use case that we already supported. So, that's it. It was a success. And we are quite proud because most of the people that were using it liked the idea, and even the ones that did not care whether you use Python or Rust were happy, because the change was completely transparent for the final user. If you were using Python, nothing will change for you. The code is the same. You don't need to do anything different. So, it will be a transparent update.
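The JSON-RPC over stream sockets approach mentioned at the start of this part can be sketched with Python's standard library. This is not Nmstate's internal client: `encode_request`, `decode_response`, `call`, and `fake_server` are hypothetical helpers, the single `recv` assumes the whole reply arrives in one read, and the fake server only stands in for a real OVSDB instance. The `list_dbs` method is part of the OVSDB management protocol; the reply shown is an assumed example.

```python
import itertools
import json
import socket
import threading

_request_ids = itertools.count(1)


def encode_request(method, params):
    """Build a JSON-RPC request; returns (request_id, bytes to send)."""
    request = {"method": method, "params": params, "id": next(_request_ids)}
    return request["id"], json.dumps(request).encode("utf-8")


def decode_response(payload, expected_id):
    """Parse a JSON-RPC response and return its result."""
    response = json.loads(payload.decode("utf-8"))
    if response.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if response.get("error") is not None:
        raise RuntimeError(f"server returned error: {response['error']}")
    return response.get("result")


def call(sock, method, params):
    """Send one request over a connected stream socket and wait for the reply."""
    request_id, payload = encode_request(method, params)
    sock.sendall(payload)
    # Simplification: assumes the complete reply fits in a single read.
    return decode_response(sock.recv(65536), request_id)


def fake_server(sock):
    """Stand-in for a database server: answers one request with a fixed list."""
    request = json.loads(sock.recv(65536).decode("utf-8"))
    reply = {"id": request["id"], "result": ["Open_vSwitch"], "error": None}
    sock.sendall(json.dumps(reply).encode("utf-8"))


def demo():
    """Run one request against the fake server over a local socket pair."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=fake_server, args=(server,))
    worker.start()
    try:
        return call(client, "list_dbs", [])
    finally:
        worker.join()
        client.close()
        server.close()
```

A real client would connect to the database's socket instead of a socket pair and would need to handle replies that span several reads, as well as notifications sent by the server.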
And if you are using Python and are willing to use Rust, okay, you need to change your code. But basically, the API is the same. So, well, you are able to use the same YAML files and the same JSON files, and everything will work. So, we got a lot of adoption, and right now the user base of Nmstate is still growing, and we are quite happy. Also, we created Golang bindings because OpenShift people and Kubernetes people were willing to use it, and those are written in Golang. So, we provided them with Golang bindings, and they really liked it. So, yeah, it was a success story. Yeah. So, basically, that was our journey. I think we have time for questions. So, please, ask whatever you want, I promise you there are no dumb questions. So, thank you very much. All right, any question? Okay. I wondered what your experience was in terms of time to implement in Rust versus Python. All right. From a developer point of view. I think it took us around two years, with two developers mainly working on it. It was full-time. It was a long journey, but it helped us a lot having the Python integration tests working with the new library, because we were sure that we were not breaking the existing cases, and that sped things up a little bit. Absolutely. Do you have a feeling for how long it would have taken you if you had re-implemented it in Python? I mean, I know that's not really a thing, but roughly how long if you had said right? Going back to Python. No, if you had said, right, we've got it in Python, but for no good reason, we're going to rewrite it from scratch in Python to make it cleaner, let's say. Just as like, how long does it take to write something in Python versus Rust, or maybe it's not possible to guess? Well, I think it depends. In my opinion, and this is a personal opinion, writing Python is much easier, but then you have more bugs. This was my experience.
When I implement something in Python, I do it in 30 minutes, one hour, two hours, but then I get bugs. When I do it in Rust, it takes me longer, with a lot of compiler errors, a lot of unsafe stuff everywhere, so we need to avoid that, but then it's quite stable. I can say that nowadays the Rust version, although it is younger than the Python one, is more stable. We get fewer bug reports and we have more users. Thank you. Yeah, thank you. Did you run into any problems in terms of compatibility when you created C bindings from the Rust code? No, not at all. To be honest, we did not have any problem. It was quite straightforward. I must say that the Nmstate library is, well, what we expose to the users is quite simple, so that makes it simple for us. We did not have any problem. That's it. Okay, thanks. You mentioned before that it was a long journey when you implemented this in Rust. Could you compare what you expected at the beginning of this journey with the reality? I must say that I'm not the only person working on this. There is my teammate, Chris, and Chris was the lead here. I must say that I did not trust very much that we were able to do it in two years. So it was like, yeah, in two years, we are going to have Rust, and I was like, I don't think so. But he was right. So I think my expectation was that it would take much longer, but it was much simpler than I thought. Also, I thought that we could have more problems with finding the libraries that we need to do all the operations that we needed. But I must say that Rust has a great ecosystem. So the libraries that we are using are really, really well maintained, and that's great. That works for us. We have a question from the Matrix. Sure. So that's a bit weird. Tanya is asking, how long did it take your team to learn Rust, or did they know Rust already? No. We did not know Rust. I mean, we knew what Rust was and we had done some work in Rust.
But we did one thing here. We started with Nispor instead of Nmstate. So when we noticed what the missing pieces were, we first started with Nispor, which is much simpler than Nmstate, and we learned on the way. I must say that I was most surprised by all the Rust resources; it was quite easy to learn. But we learned on the way. When we needed something, we started learning it. And then we revisited the code and we changed things. For example, initially, we did not understand correctly how to use traits, so we did not use them. And then we noticed, right, traits are really useful, and we are not using them. And then we started to implement traits everywhere and make it more flexible. Thank you. Great. There are no more questions. Thank you for your time. Thank you for listening. Thank you very much. |
Backward and forward compatibility for security features |
Hi, everyone. Yeah, my name is Mickaël Salaün. I work for Microsoft and I'm the maintainer of Landlock, which is a new Linux security feature. And yeah, it's about sandboxing. So this talk is about the Rust library we wrote for Landlock and, well, the challenges we had around compatibility. So yeah, just a quick introduction and some context to understand the problem here. So, why care about security? Well, it might be obvious for some, but every application can be compromised. Every application can be trusted at first, and during the lifetime of a process it can, well, become malicious. So as developers, there are multiple problems. We don't want to participate in malicious actions performed by attackers through our software. And we kind of have a responsibility for users, especially to protect their personal data. And there might also be some issues with third-party code. So sandboxing is a security approach to isolate software, mainly by dropping ambient access rights. When you launch an application in a shell on a common Linux distro, this application can access a lot of files, including some which are kind of private, like .ssh, for example. Sandboxing should not be confused with namespaces and containers, which are a way to create a kind of virtualized environment. And seccomp is also something which is really interesting for security purposes, but it's not about access control. It's about protecting the kernel. That was the initial goal of seccomp. So Landlock is really designed from the ground up to bring sandboxing features to Linux, so to bring some security features to the kernel. It is an access control system available to every process. You don't need to be root or whatever. And it is designed to be embedded in applications, so to create built-in sandboxing. It's a way to create one or even multiple new layers of security.
So it comes after all the system-wide access controls which are already in place. And it's available on most distros nowadays. And if that is not the case, well, I encourage you to open an issue in your favorite distro. So what's the interesting point about sandboxing and built-in application security? It's that we can create tailored security policies and embed them in the application. There are interesting things about that. It might help to make security kind of invisible, which is the main purpose here. We don't want to bother users, but we want to secure them anyway. Because these security policies can be embedded in the application, they can use the application's semantics. They can also use the application's configuration transparently. So you don't need to add another configuration format. It's not another layer of execution. It's embedded in the application. And of course, if the configuration depends on user interaction, the policy can adapt to this change of behavior. And one really interesting point is that, as a developer, you want to test what you do. You want to get guarantees that whatever you're developing is still working. Being able to embed security policies in your application makes it possible to test them the same way that you test every other feature. So that's really interesting. You don't rely on, let's say, SELinux being installed on your test machine and so on. And it adapts to the application over time. So if you have a CI which is well configured, you can test it, and you can bit by bit add new features, update the security policy, and make sure that everything works as expected. So, speaking about the Rust library, the idea was to create something which is Rusty, so idiomatic Rust. And for this, we wanted to leverage strong typing to get some development guarantees, and to follow some common patterns.
So mainly here, the builder pattern. It's still a work in progress. It's working, but we're working on improving the API to make it easier to use, for compatibility reasons. So this talk is about these kinds of compatibility requirements. Some examples of early adopters are listed here. But yeah, it's still kind of in beta. So let's start with a code example. Just as a warning, this is kind of simplified code. It's working, but for this example the idea is to keep it simple so that it's easier to understand. You can see on the left there's C code, and on the right the exact same semantics, but in Rust. I will mostly talk about the Rust code, but you can take a look at the C code to see the difference between them and how Rust can be useful there. As I said, it is based on the builder pattern. So you create a ruleset object here with Ruleset::new(). And from there, you call different methods to build the object, in this case a ruleset. A ruleset will contain a set of rules. At first, you define what you want to enforce, what you want to restrict, what you want to deny by default. In this case, these are two actions: the action to execute files and the action to write to files. Obviously, it's not enough, but it's easy to understand for a simple use case. Then, once you have defined the ruleset and what the ruleset can handle, you can create it. And the ruleset creation translates to what you can see on the left, landlock_create_ruleset. This function is in fact a syscall. So in the Rust part, when you call the create method, it creates a new ruleset, which is backed underneath by a new file descriptor dedicated to Landlock, and that is wrapped in the ruleset object.
Then you add rules, for example to allow some directory to be executable, which is the case here. So you open the /usr directory and you make it executable. So the allowed access is AccessFs::Execute. And then you can add any other rules you want, for all the exceptions that should be allowed for the legitimate use cases. And then you restrict the current process. Well, in fact, the current thread. From this point, the current thread can only execute files which are in /usr. And it cannot write anything at all, actually. So that was a quick introduction to the library. The thing is, Landlock is not a full-featured access control system yet because, well, it is complex. To reach this goal, we need to spend many more years incrementally adding new features to the Linux kernel. And the thing is, sometimes we might add new features that make it possible to restrict more, and sometimes we might add features to restrict less. So let's see what this means. The first version of Landlock, which was released with the 5.13 kernel, basically allowed you to restrict reading, writing, and a lot of other common file actions. But, like you can see here, there are three categories. The first one, always denied, was, for the first version of Landlock, the set of actions that were always denied whenever you sandboxed a thread. That was because of complexity in the development, but also for security reasons. For example, you are not able to execute set-UID binaries, because that would be a way to bypass the sandbox. And there were some restrictions on ptrace, so you're not allowed to debug a process which is outside the sandbox. Obviously, that would be a way to get out of the sandbox, and that's not what we want. The second version of Landlock added a new access right, which was a way to reparent files.
So, at first, it was denied to change the parent directory of a file, for security reasons, because Landlock is based on file hierarchy identification, and that was kind of complex. But in the second version we implemented that, and then it became configurable. So, one item less in the always denied box. All these versions come with new kernel releases, and in the third version of Landlock we added a new way to restrict file truncation. Truncation is changing the size of a file, and this was always allowed before because it wasn't handled. It was a bit complex to implement this in the kernel at the time, but now it is possible. So you can see that we can move items from the always denied box to the configurable list, and from the always allowed box to the configurable list. So, application compatibility. There are two main things in compatibility. There is forward compatibility, in the sense that when you update your kernel, you can still use the old kernel features. That's kind of common. And the backward compatibility in this case is, well, when you're using a kernel feature, you need a kernel that supports this feature. And if your application is launched on an old kernel, that feature might be missing. And the thing is, when you're developing an application, you don't know on which kernel your application will run, because it's a user choice and a distro choice. What comes with Landlock is the ability to get what we call the Landlock ABI version. It's really just a number. It starts at one, and then increments for each new set of features which is added to the kernel. To give you an idea, it's really simple to get this ABI version. It's with landlock_create_ruleset with a specific flag. So, yeah, it's C code, but it's really simple. So, what we wanted at first: there are four main properties.
The first one is to make it easy to use for developers, of course. So we want something which is generic, which follows the builder pattern because it's common and easy to use. We want developers to focus on what they want to restrict, not on the internal implementation in the kernel. And we want them to gradually go from a coarse-grained access restriction to a fine-grained one. We don't want them to need to implement a fine-grained one at first. It might be too difficult. So, in the same way that we can incrementally add new sets of features, developers can also incrementally restrict more and more over time. So, no need to be super strict at first. And it should be simple to write for the common cases. Okay. The first improvement was to create groups of access rights. So let's say you know which Landlock version is supported by the running kernel. Let's say it's the second version. Then you can create a new ruleset which will get all the access rights which are supported by this specific kernel. So you just call handle_access with AccessFs::from_all and then ABI::V2. And then you can do kind of the same when you're adding a new rule. This time, you want to add an exception on a directory to make it readable. In this case, there are two main groups, from_read and from_write. For example, from_read includes reading a file, but also reading a directory, so listing a directory. Okay. The second property that we would like to have is being able to enforce a strict restriction. Even if we don't know on which kernel the application will run, in some cases we might want to be sure that all features are enforced and restricted. There are two use cases here. The first one is testing. If you want to sandbox an application, you want to make sure that even if you're using all the sandboxing features, your application will work as expected.
So, that's really important. You don't want to run your application on an old kernel and be fooled by the fact that your application is running only because not all security features are enabled. So you want to catch these kinds of issues in your CI. And also, for security software, you want to have some security guarantees. So you want to have a way to enforce the whole sandboxing with all the security features that we embedded in our application. The third property is to be able to enforce best-effort security with some minimal requirements. So that's kind of the opposite. This use case is mainly for end users because, well, you don't know which kernel they will use. So you want to be able to enforce an opportunistic sandboxing. If they have a new kernel, they will be more protected. If they have an old kernel, they might not be protected at all, but that's not your choice, that's their choice. And in the end, they want to run your application anywhere. Another requirement is to be able to disable the whole sandboxing if some features which are required are not available. And this approach should be easier to write than the others because it is the most common thing to do. And the last property is being able to configure the sandboxing at runtime, but in a way that you're still running most of the code. So the idea is to have kind of the same code running everywhere, almost, even on systems that don't have a recent kernel. Why? Because you want to identify early any issues which might be linked to the sandboxing code. If you have, let's say, two users using a recent kernel and four users using an old kernel, you want to test as much as possible with all your users, even if they don't have a new kernel. So, the first approach we took, and we'll go quickly here, there are three approaches.
The first one was to add a new method to the ruleset builder pattern. It was a simple method to set or unset the best-effort approach. If it was false, it was required to have this feature. So, in the example, an application that needed to move files from one directory to another needed to have the AccessFs::Refer access right to allow this access. And if that wasn't available, the sandboxing should not be enforced, otherwise it would break the application. So that is a requirement. And in this case, that was a way to change the state of the builder over time. This is kind of flexible and easy to understand, but there are some corner cases. And it makes the code not really clean. Another approach was to do kind of the same, but this time, instead of two choices, enable or disable, there were three ways to change it: the best-effort way, the soft requirement, and the hard requirement. So, a way to make it best-effort, a way to make it error out if there's any unsupported feature, and a way to disable the sandbox without error if some features were not supported. That wasn't ideal either. And the last approach, which is currently working for us, is kind of a new one. The idea is to keep it configurable and to follow all these properties, but to make it a bit simpler and still flexible. So here, in a nutshell, you can make a new ruleset that will error out if there are any unsupported features, but at the same time you can specify which features are required to enable the sandbox or not. So that's kind of specific, but it should be better. So, going forward, there's a lot going on in this Rust library, and a lot to improve. Your help is welcome, and I encourage you to, well, sandbox your application or others. And there are some tips if you want to get some motivation here. It's a rewards program. So, thank you for your attention. There are some interesting links here.
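The three compatibility modes described above (best-effort, soft requirement, hard requirement) can be modeled in a few lines. What follows is a minimal Python sketch of the idea only, not the Rust crate's API: the `Compat` enum, the `negotiate` function, and the short access-right names are hypothetical stand-ins, and the per-version lists are abbreviated. Only the history itself comes from the talk (refer became configurable in the second ABI version, truncate in the third).

```python
from enum import Enum

# Access rights that became configurable with each Landlock ABI version,
# as described in the talk. The lists are abbreviated for illustration.
RIGHTS_BY_ABI = {
    1: {"execute", "write_file", "read_file", "read_dir"},
    2: {"refer"},     # reparenting a file into another directory
    3: {"truncate"},  # changing the size of a file
}


class Compat(Enum):
    """Hypothetical names for the three modes mentioned in the talk."""
    BEST_EFFORT = "best_effort"            # drop what the kernel cannot handle
    SOFT_REQUIREMENT = "soft_requirement"  # disable the sandbox, no error
    HARD_REQUIREMENT = "hard_requirement"  # error out


def supported_rights(kernel_abi):
    """Union of all rights configurable up to and including kernel_abi."""
    rights = set()
    for version, added in RIGHTS_BY_ABI.items():
        if version <= kernel_abi:
            rights |= added
    return rights


def negotiate(requested, kernel_abi, mode):
    """Return the rights to enforce, or None if the sandbox is disabled."""
    available = supported_rights(kernel_abi)
    missing = requested - available
    if not missing:
        return set(requested)
    if mode is Compat.BEST_EFFORT:
        return requested & available
    if mode is Compat.SOFT_REQUIREMENT:
        return None
    raise RuntimeError(f"kernel ABI {kernel_abi} lacks: {sorted(missing)}")


# A kernel with ABI 1 cannot handle "refer", so best-effort drops it.
print(negotiate({"execute", "refer"}, 1, Compat.BEST_EFFORT))
```

Note that dropping a right is not always harmless: on a kernel with only the first ABI version, reparenting is always denied inside a sandbox, which is why the talk's example application that needs to move files between directories treats refer as a requirement and disables the sandbox instead.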
This talk was kind of dense, but I hope you enjoyed it. Thank you. |
atuin: magical shell history with Rust
useful shell history on all of your machines |
I'm going to talk about a project I've been building on and off for the last two years or so. So to get started, who am I? My name is Ellie and I'm the lead infrastructure engineer at a company called PostHog. When I'm not writing software for work, I try and maintain a couple of side projects in my free time, and when I don't have the energy for that, I'm normally exploring outdoors, which as you can probably see is usually on a motorbike for me. So to dive into Atuin, first of all, I'm going to start with the name. Originally it was called Shync, for like shell and sync, but I couldn't really say that out loud without cringing. So I looked for something new. I've been a fan of Terry Pratchett's Discworld books for a really long time, and for those who are unfamiliar, the premise there is that the world is a disc and it rests on the shoulders of four giant elephants stood on the shell of a space turtle called the Great A'Tuin, which I'm probably mispronouncing. I thought it would be a bit pretentious to include the words "the Great" in my project name, and putting an apostrophe in a binary name is probably not a good idea. So I ended up with just the name Atuin. A little bit more specifically, Atuin was made to synchronize shell history between multiple computers. I had the problem that I would be switching between a whole bunch of laptops. I'd be remoting into various different boxes, and trying to find one command that I ran a few days previously, on whichever computer it was, was pretty difficult. So I wanted it all in the same place. The first thing I did was replace the normal zsh history, bash history, or whatever fish uses, I don't really remember, with a SQLite database. We could then have some functions to import your normal text history into the database, and because databases are a little bit more flexible than flat text files, we could also include some additional context.
So in the case of Atuin, this is context such as how long a command took to run, whether or not it was successful, which directory it was run in, as well as the shell session. The way we do this is we plug into your shell. If your shell supports it, it's via the normal shell hooks, like precmd or preexec and postcmd, I think they're called. But in the case of bash, which I do not have positive feelings towards, we do a really horrible hack with the prompt. So hopefully you can see the GIF on the right. On top of this database, we also built a search TUI. This is bound by default to Ctrl-R and the up arrow, which is a little bit contentious for some people, so you can remap that too. The search UI has three different search modes. By default, one of them is a fuzzy search, kind of inspired by fzf. The other is a prefix search, which is pretty self-explanatory, and a substring search, which, same thing. You should know what that means. We also have several different filter modes. So Atuin allows you to search your shell history for the current session, for the current directory, for the current machine, or just all of your shell history for every machine that you've ever connected, anyway. It would be cool if it could have otherwise. A little bit more on that extra context. Atuin has a stats command, which analyzes all of your history and will show you things like the most used command, which for me is ls. I didn't realize I ran that so much. It also shows how many commands you have run, as well as how many unique commands you've run. We're definitely not making the most of all the data available, and there's a lot more cool analysis we could do. You can also get the stats for a specific day or week or month or whatever. A little bit more on the search. You don't have to use the search UI. We also have a command line search interface. This is kind of useful if you have like a specific command in mind.
Maybe you know roughly when it was or roughly what it looks like. And it's also useful to integrate with other tools. Someone on the Discord told me that apparently they've used this to integrate directly with fzf as their search instead, which is pretty cool. You can see here that I'm searching for all successfully run commands after yesterday at 3 p.m. that start with git. Obviously, I did not make these slides today. The time specifier supports a human way of expressing time, and the command search supports regular expressions. A little bit more about the sync server. It's a pretty boring HTTP API that shares blobs. It has no idea what the blobs actually contain. It was originally written with warp, which I found to be very fun, kind of a nice mental exercise, I guess. We ended up rewriting it with Axum because, while warp was fun, it was difficult for contributors to figure out how to use, and it also contributed pretty massively to a high compile time. Axum just sorted out the problems there. The Atuin sync server is completely self-hostable. Anyone with it installed can just run atuin server and have a running server. We also have Docker images and Kubernetes manifests for anyone that wants to get a little bit more fancy. And a little bit more on the sync is that it's not quite real time yet. While I would love it if it was, it currently syncs at an interval of 15 minutes, and you can reduce this down to zero, which basically means it will sync after every single command. If you don't fancy running your own infrastructure, there's a public deployment of it that I run. Currently it's got about 11 million lines of shell history on it. There are about 300 active users. And it's all running on just one dedicated Hetzner box, and it handles way more requests than I thought it ever would.
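The history database and the filtered command-line search described above can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the table layout, column names, and the `search` and `most_used` helpers are assumptions made for the example, not Atuin's actual schema or interface, and Atuin itself is written in Rust.

```python
import sqlite3

# In-memory database with a hypothetical history table that stores the
# extra context mentioned in the talk alongside each command.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE history (
        id          INTEGER PRIMARY KEY,
        timestamp   INTEGER NOT NULL,
        duration_ms INTEGER NOT NULL,
        exit_code   INTEGER NOT NULL,
        command     TEXT NOT NULL,
        cwd         TEXT NOT NULL,
        session     TEXT NOT NULL,
        hostname    TEXT NOT NULL
    )
    """
)

# Made-up sample rows: (timestamp, duration, exit code, command, cwd, session, host).
rows = [
    (1000, 12, 0, "git status", "/src/app", "s1", "laptop"),
    (2000, 840, 1, "git push", "/src/app", "s1", "laptop"),
    (3000, 5, 0, "ls", "/src/app", "s1", "laptop"),
    (4000, 300, 0, "git pull", "/srv/app", "s2", "server"),
    (5000, 4, 0, "ls", "/srv/app", "s2", "server"),
]
conn.executemany(
    "INSERT INTO history "
    "(timestamp, duration_ms, exit_code, command, cwd, session, hostname) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def search(conn, prefix, after, only_successful=True):
    """Prefix search over commands run after a given timestamp.

    The prefix is passed to LIKE unescaped, so '%' and '_' in it act as
    wildcards; good enough for a sketch.
    """
    sql = "SELECT command FROM history WHERE command LIKE ? AND timestamp > ?"
    params = [prefix + "%", after]
    if only_successful:
        sql += " AND exit_code = 0"
    sql += " ORDER BY timestamp"
    return [row[0] for row in conn.execute(sql, params)]


def most_used(conn):
    """Most frequently run command and its count, like a stats summary."""
    return conn.execute(
        "SELECT command, COUNT(*) AS n FROM history "
        "GROUP BY command ORDER BY n DESC, command LIMIT 1"
    ).fetchone()


print(search(conn, "git", after=1500))
print(most_used(conn))
```

Keeping the context in columns is what makes the filter modes cheap: restricting to the current session, directory, or machine is just one more `WHERE` clause on `session`, `cwd`, or `hostname`.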
I'd also like to thank the GitHub sponsors I got. I didn't really expect anyone to contribute, but they cover the server bills entirely now, which is a really nice feeling. And a little bit more about privacy. I imagine people here probably feel more strongly about that than others. Everything's fully end-to-end encrypted in the sync, because I really don't want the responsibility of having people's API keys, accidentally pasted into a shell, on my machine. We use libsodium's secretbox because I'm not at all a cryptographer, and it's more difficult to mess up than most other things. Finding a reliably maintained library for that was a bit tricky. The original bindings we used were not maintained beyond security patches. We recently switched to, I think, RustCrypto, if I remember rightly. All of the encryption keys get automatically generated when you log in, and you have to keep track of them yourself. So if you lose your keys, there's nothing I can do. Your data's gone. So why Rust? This is the Rust dev room, after all. Atuin runs twice for every shell command you run. It runs just before and just afterwards. That lets us get the timing data and everything else. And if we had latency there for an interpreter to start up or a runtime to do whatever it does, the experience would not be great. If you added 50 to 100 milliseconds to every command you ran, people would pretty rightfully complain. So Rust fits the bill very nicely there. It also has to be reliable, because if we're dropping shell history randomly, then it's not at all serving the purpose it was written for. Having a static binary to deploy is also really nice. No one has to make sure they have Python 3.7, not pointing at any languages in particular, installed on the system with the right versions of various libraries installed or anything like that. And it's also safe, so you don't have to worry about any memory issues or anything like that.
The other factor, which I think for a side project is especially important, is that Rust is fun. When I started this project, I was also considering using Go, and I was also writing Go for my day job. And I didn't really fancy the idea of getting home after work, having written Go all day, and then writing some more Go. So Rust solved that very nicely, and I think the main reason I actually got around to finishing this is because I was enjoying writing it. Additionally, the Rust community is fantastic. Every time I've asked for help, people have been really helpful. Everything I wanted has been available, and they're just generally very welcoming and accepting, especially compared to some other tech communities. So I actually have one other service, and I'm glad most of the previous talks have discussed Python, because now I don't feel as weird for mentioning it in my presentation too. I have another service called Rincewind, a bit of a naming pattern there, if anyone's familiar with it. What this basically does is it peeks into the database and generates graphs like this, which are heavily inspired by the GitHub commit activity chart, but for your shell history. It's currently closed source for no real reason other than that it's a really horrible hack that I don't want to package nicely for anyone. It mostly uses NumPy and OpenCV and a few other things. It's also completely opt-in, so you don't get this by default. If you don't want any proprietary code touching your data, you don't have to. It's cool. Just with one curl command, you enable this. On the open source side of things, this is the first open source project I've released that people have actually been interested in. I made it just for myself and stuck it on my GitHub, and it ended up being quite well received by a whole bunch of people. We ended up in a lot of package repositories. I think, off the top of my head, it's the Arch Linux community repo, Homebrew, Alpine Linux, and some Nix.
I'm not entirely sure how Nix works, but one of the Nix repositories. And there are probably a whole bunch more that I'm not aware of. We've actually got 63 contributors as of today. Some of them are returning regular contributors, and it's very nice that people want to regularly give time to my project. Some of them are just drive-by. They found something that annoyed them, or a bug they wanted to fix, or something like that, so they contributed, which was lovely. I'd also like to especially thank Conrad. He's much more involved in the Rust community than I am and is also a very long-term friend of mine. He helps me maintain Atuin, and when I was first starting and not so good at Rust, he did a great job of tidying things up a bit. In terms of the future, right now Atuin has a bit of a flaw in that you can't actually delete history once it's been synced. This is mostly because the sync is pretty eventually consistent, and every machine you have is a potential writer, so ensuring that you delete something and it stays deleted is actually really difficult. I've currently got a solution to it, which works on my laptop. I just need to make sure it works on everyone else's too. I'd also like to sort out bash, because pretty much all the complaints we get about shell integrations are from people running bash, and it's very frustrating. I don't actually use bash, and I hate having a setup on my machine just for that. I'd also like to show some more information in the TUI. I don't know if you saw very much in the GIF earlier, but it basically just shows what's useful for search results. I would love it if there was another tab where you could also see statistics about a command that's run, maybe how often it succeeds versus fails. You could get some nice stats about make build that way, and that sort of thing.
I'd like to improve the search a little bit too, because right now it's good enough, but I think it could always be improved. I've been meaning to explore some of the full-text search modules that SQLite has, or maybe something like Tantivy or one of the other search libraries in Rust. Otherwise, I'd really like to improve the sorting. Right now we sort chronologically, which is a pretty safe default. I'm not going to turn this into a horrible Twitter timeline type thing, but it would be nice if we could sort based on the context we have. Maybe every day at 9 a.m. you cd into your repo and you run git pull. It would be nice if, by default, you pressed Ctrl-R and git pull was already there at the time that you frequently run it. We've got all the data for that, it just needs to be plugged together. In the even further future, given the number of people that have spoken to me about the fact that they have development API keys in their shell history, it would be nice if we could do something to get that out of the shell history and sync that alongside the data. Being able to bookmark commands is also something I would quite like to be able to do, because there are some longer commands I run frequently and search for frequently, and having some sort of hotkey or alias would be really nice. Otherwise, I realized that a subset of Atuin history could also be used as a runbook if you had a begin and an end marker for it, and you could just replay some commands from your past. That's actually it. I went a bit faster than I was expecting, but if there are any questions, I'd be very happy to answer them. Can you search for things which have come after your most recent command frequently? I'm not sure what you mean, sorry. So to take what you've just typed and see what you typically do next, so actually returning the command after the one you've searched for.
That's one of the things I'd love to be able to do with the smarter ordering: know a sequence of commands that's commonly run and predict the next one based on history, if that's, yeah. So I tried to install your tool, but I'm using Bash, and I was wondering how far along you are with fixing Bash? Bash generally works fine. It's usually the people that have a whole bunch of Bash plugins installed or have a weird Bash prompt that start to have some issues, but generally, it's okay for most people. Yeah. Sorry. Does it handle having different shells on different computers? For example, if I'm using fish on one computer and zsh on another, does the sync work between those two? Yes. We translate from whatever your shell uses natively into the format we use, so whichever shell you use on each machine doesn't matter. Okay. Thanks. I have a couple of questions. First, I didn't quite get how you authenticate with the server when the key is only held locally. So the user authentication is just a username and password, but then your actual data is encrypted by a key that's only held locally. All right. And second question. Do you have a zsh plugin, or have you considered one? So we have a zsh plugin. You can use normal zsh plugin managers to install and use it. All right. Thank you. Getting some exercise in. Is it possible to disable the history for a few commands and then re-enable it? Not currently. We have spoken about the idea of like an incognito mode. If you prefix a command with a space, it won't be saved, but it's kind of annoying if you've got to run a lot of them in a row. We have some questions from Matrix. So Olivier Robert says, how would it interact with something like Starship? I actually use Starship, and it doesn't interact with it at all, in that it works completely fine. Awesome. And yep, that was the other question. Cool. Thank you. There's one at the front, too. Two short questions.
The first one is, since I'm using Bash, what's your favorite shell? I like ZSH, I think purely because I started using it maybe 10 years ago and habits are hard to break. I think if I was going to start again, I'd probably try fish a bit more. And a question about the timestamps: are you using the client-side timestamps from the machines or server-side? So we actually store client-side, the timestamp will be whatever your client says, but we actually use two timestamps for sync to work. So we have the server-local timestamp, which is only really used for syncing, and then the actual data, which is all encrypted and hidden, so it's whatever your client stores. Because sometimes the local timestamp is important if you want to sync with a system or whatever, but sometimes also the real time, if the computers are out of sync, which shouldn't happen. I had a bunch of issues with timestamps when I was first writing it, but we got it all sorted out in the end. Is there a limit to the length of a command? For example, imagine a huge pipeline with SQL and JQL queries in there. Currently it's eight megabytes of whatever it is once it's been encrypted. It's only a server-side limit, and it's pretty arbitrary. So another question, any plans for special handling of similar commands? Say you fix the syntax and run similar commands in a row. I hadn't really thought of that before, but it might be worth considering. Sorry, I did have a few more questions from Matrix. I think my device is not synchronizing properly, but Andy sent me a screenshot. So does it integrate with the regular history mechanisms provided by the shell? For example, excluding certain commands automatically like cd and ls, skipping storing in history by prefixing with whitespace for sensitive commands, et cetera. So the prefixing with whitespace is included, the default ignoring is not, but it doesn't actually replace the text file history either. You will still write to that, in case you ever decide you want to stop using it.
And where would the context-aware recommendations come from? So if we have a history of your shell, we know the directories you're in, we know what commands you've been running at what times, so if we're predicting the next command that you want to run, we can use your own history. So the question follows up with: it's end-to-end encrypted? Oh, it would all be done on the client. So there's nothing on the server, the server's just a dumb blob store, it doesn't really know much of anything. Any more questions? I think that's it. Awesome. Thank you. That was really well. |
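The context-aware ordering discussed in this talk (directory and time of day, computed entirely on the client) can be sketched roughly as follows. This is only an illustration, not the tool's actual ranking: the function name `rank_commands`, the record layout, and the weights are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime


def rank_commands(history, now, cwd):
    """Rank past commands for the current context.

    history: iterable of (command, timestamp, directory) tuples (hypothetical layout).
    Every past run counts once; runs from the same directory or around the
    same hour of day count extra. The weights below are arbitrary assumptions.
    """
    scores = Counter()
    for command, timestamp, directory in history:
        score = 1.0
        if directory == cwd:
            score += 2.0  # the command was run in this directory before
        hour_gap = abs(timestamp.hour - now.hour)
        hour_gap = min(hour_gap, 24 - hour_gap)  # wrap around midnight
        if hour_gap <= 1:
            score += 2.0  # the command was run at about this time of day
        scores[command] += score
    # Highest score first; ties are broken alphabetically for determinism.
    return [cmd for cmd, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


history = [("git pull", datetime(2023, 1, day, 9, 0), "/repo") for day in (1, 2, 3)]
history.append(("ls", datetime(2023, 1, 1, 15, 0), "/home"))
history.append(("cargo build", datetime(2023, 1, 2, 14, 0), "/repo"))

ranking = rank_commands(history, now=datetime(2023, 1, 4, 9, 5), cwd="/repo")
print(ranking)
```

Because the scoring only needs the decrypted local history, a scheme like this would not require the server to learn anything about the commands.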
Enabling FIDO2/WebAuthn support for remotely managed users |
Okay. So, hello everyone. We'll start in a minute. The talk will have 20 minutes, and after that, there will be space for questions. If you have questions, please write them into the Matrix room, or we will try to run around with the mic. So, thank you all for coming. I will give the word to Alexander Bokovoy, who will have the first presentation, about enabling FIDO2 support for remotely managed users. Yes. Thank you, Jakob. Thank you everyone who came today on this drizzly day into a room that is not so easy to find. Let's fight security with obscurity. Okay. The talk I'm giving here was actually supposed to be done by Iker, who drives this effort at Red Hat. And I have another talk in the afternoon in the main track about passwordless Linux and where we are. This is part of it, but not the full stuff. So, it's kind of a preview of where we are, hopefully with demos and without demo effect. Let's go. I hope I will be able to get something working. Nice. My laptop actually locked up. Really? Yeah, I'm trying. It doesn't want to. Okay. We'll get there. No, no. So, the fun part is, this is an old laptop. This is a really old laptop. My actual laptop got the battery puffed up. So, it literally looks like this now. It's sort of a hat, right? Here we're working there. But it means that I cannot fly with that kind of laptop. And this one is maybe six or seven years old. So, it's a bit surprisingly slow for contemporary software. We are booting Fedora. Okay. So, this talk is about a very simple thing. Basically, this is the hardware incarnation of what people nowadays call a passkey or WebAuthn, which is known as the FIDO2 set of specifications. And actually, we start with a small demo, right? If I get the screen working. Okay. I'm not getting it. I don't know what you see there. Maybe you are not seeing anything. But I actually was able to log in with this token. And what I will be talking about is how I actually get there. Now, let's see. Yeah. Yeah.
This is because I logged in without a password. The password storage manager asks me to enter the password to unlock the storage manager. Not to log in with the password. That's another part of the stuff that we have to fix before we will be able to get into Linux without any passwords. Okay. Let me get to the top. It's really slow. I really hope to get this working. Yay. Finally. So, some time ago, we at Red Hat were working on identity management, on security, on stuff like FreeIPA and SSSD. We started looking into a couple of things. One of them is how we can get passwordless into our systems. Mostly, we talk about remote servers because that's what most of our customers are using. And many of those customers started asking about things that are effectively being enforced on them by governments and so on. So, one of the things being forced is FIDO2 or WebAuthn because, well, everybody else supports it. Where everybody else means all these web applications, all the other operating systems like Microsoft Windows and the others. And, of course, there are properties that are related to this. They are all nice. But the reality is that you get all of this nicely working mostly in browsers, at least on Linux. And not in all browsers. Some browsers do not support the WebAuthn workflow, some do. But we need to get these logins into the actual systems. Because if you are not able to log into these systems, you cannot use this passwordless. And, of course, from the technical point of view, it's a combination of some PAM modules and some changes to applications and so on. It shouldn't be that hard, right? There are already PAM modules that implement this. The problem is that if you look at this from a systematic point of view, if you manage thousands of hosts where you need to get access with the credentials stored somewhere, you need something better than just everything defined on the single machine.
So, we started looking into what we have. We have centralized identity management. We have FreeIPA. We have SSSD on the client side that allows querying these different identity management systems, including FreeIPA and so on. So, we started looking at what we can create out of all this. And we wanted to enable use of FIDO2 in the console for these remotely managed users. We start with local authentication. This is what is working now. It didn't work a couple of weeks ago. We are already having some progress. And the second part will be remote authentication, because things like remote authentication using the native SSH protocol, for example, are really not about this. That is not compatible with the local use of FIDO2. It encapsulates some of the principles, but it converts the use or assertion of FIDO2 into a different form specific to SSH that cannot be reused for anything else. So, the reality is that this is coming from where all these governmental admins or big organization admins are getting the pressure from. This pressure came actually a year ago in the form of what they call the zero-trust memorandum. The zero-trust memorandum is effectively an answer by the Executive Office of the President of the United States to the set of threats that they got over the last decade, or at least those that became visible in public. So, this memorandum basically states that by the end of fiscal year 2024 a number of things should happen in the governmental organizations. There are, I think, five big targets that they have to reach, and one of these targets is to switch to passwordless everywhere. And to tell the truth, governmental organizations in the U.S. already use a passwordless form with smart cards. But WebAuthn is called out as one of the recommended ways of doing it in the memorandum. So, the zero-trust memorandum says that basically you have to use personal identity verification, or smart cards. We already support that. And WebAuthn is another approach.
And go there. There you go. You see the customers or prospective customers getting pressure from those who drive them, who give the budgets and so on. And then these customers come to Red Hat and all the other companies and ask to get this working, because they have to comply. Not us complying with this, but the customers complying with this. The lucky part we have is that all of this is actually also in the interest of the community, because, well, it simply improves our state, not only at work, but also at home. If we switch to passwordless everywhere, we get a bit more secure environment, I hope, given the practices that we preach at home. But this is the part. So, if you have remotely managed users, basically defined somewhere centrally, with your accounts, your POSIX identity, your home directory, your shell, and so on, somewhere it should be defined which passwordless credentials you can use. These credentials should be delivered locally. And if you have a device like this one, or maybe the one on the phone, which we do not support yet, it needs to be verified, there needs to be a way of applying it, and so on. And in these centralized environments, we often have to deal with the fact that you are not only logging into a single machine, you are jumping somewhere. You are interoperating with other applications. Typically, this transfer of authentication state happens through a transition from your local authentication to something like Kerberos, which issues a ticket recording your authenticated state, and then uses this ticket to issue other tickets to other services in the environment that's being built. So on the high level, in principle, all the stuff that we deal with is already in FreeIPA and SSSD. FIDO2 is the new thing for us, but we use the libfido2 library that already exists and is shipped in many distributions for the implementation of the FIDO2 stuff. We store the data in the LDAP server and fetch it from there with SSSD.
And the other part is we integrate with FreeIPA's Kerberos implementation to provide the transition from FIDO2 to Kerberos. So for the local authentication, what happens is that you have SSSD running on the machine. It picks up the user information from the LDAP record for that user. Part of that record is the specification of the recorded passkey details, pretty much like in the traditional way you store them somewhere on the disk, but here you store them remotely. Then when the token is inserted and there's a need to log in over PAM, any PAM service, you get libfido2 communicating with the device and performing its magic, comparing with the record that you have. So in LDAP, this looks like this. I intentionally included a bunch of stuff here, but literally all we care about is that we have this passkey attribute. And obviously LDAP is a structured store, so you have to have an object class that defines the use of this attribute. And on the IPA level, this looks like this. There is the user information, which also has this passkey mapping. This is not in the released version of IPA yet. Hopefully by spring we will get this, but later I will show you where you can get the test version. So in the IPA case only, you get a Kerberos ticket after this login, which is currently not working. This is a high-level overview of how it goes. The presentation is available on the site, so I'm not focusing on describing these details, but effectively we extended the MIT Kerberos implementation to allow us to communicate to the KDC all these details related to the WebAuthn implementation. So behind the KDC on the FreeIPA server, we have a relying party implementation that performs part of this authentication and then uses the Kerberos protocol to transfer the bits between the two sides. So tested. So this is actually a demo of what I ran before I got my presentation working. So this is the lock screen this morning.
And to unlock, I have to insert this passkey and press enter. You don't see part of the full message because it's so large compared to the actual input field that GDM shows. Then I just activate the device and magic happens. I logged in. How can you play with this yourself if you want to set it up? Iker wrote some instructions in this blog post and he maintains a COPR repo for Fedora, Fedora 36 and 37 at this point. So you can get SSSD packages. There is one package that is not installed by default. This is exactly the support for FIDO2, sssd-passkey. Then you need to enable it in the SSSD configuration. But if you follow Iker's instructions, you should get it working. Right now it only works with FreeIPA, and we have FreeIPA in that COPR repo because it has the support for storing the passkey in the LDAP server. I will stop here because I have like three minutes and I would like to hear any questions and feedback. This is sort of early stage. I will show a bit more in the afternoon with the bigger presentation that I have in the main track. There will be a few more demos there. But we really would love to hear your feedback and what you want to see working there. I have a question. So what happens if the key is lost or stops working or something like that? So what happens if the key is lost or stops working, because it's hardware that might blow up, right? That totally depends on how the system is defined. If the system is defined to allow fallback to other passkeys or it's allowed to use a different authentication method, the fall-through to those methods will happen. If the key is lost, for example, the user or an admin can remove, let me get here, the passkey mapping from the user entry, and then this user wouldn't be able to use this passkey anymore. So in practice, this is a process thing. You have to define your organizational policies for how you handle any lost credentials.
There's no difference with this. Some systems, for example Apple in macOS, actually force you to define two separate passkeys, two separate tokens, if you enable one, because they think that you most likely will lose one. They probably figured out something about the users. Okay. Any other question? There's nothing to pass here. So I'm not going into the details of the WebAuthn implementation itself. It's fairly secure in this context. Yeah. You have to have an actual hardware or software implementation of the token. You have to have exactly the same key. The private part of the key typically does not leave the device. So it's fairly secure in that case. Hi. Stefan here. Is it possible to add yet another factor? Can you please speak up louder? Sorry. Can you add another factor to this key that you will bring? I'm sorry. I can't hear. Guys, could you please be quiet for a bit? Sorry. I will speak out loud. Can you add another authentication factor to the process, like an OTP token next to this physical key? So you're asking if there's a possibility to amend the use of passkeys with something else? Exactly. I believe it's possible, because all of this is available over the PAM interface and you can stack up several PAM modules in it. In SSSD, that wouldn't be possible at this moment. But maybe this would be a good idea going forward, to allow extending and enforcing the use of several methods. Yeah. I will write this down. Okay. I guess I'm out of time. I think we can still have one last question if there is one. Okay. Nice, nice. Technology. I have a question about usernameless login. Is it also supported? Because this is the point of FIDO2, that you can have discoverable credentials stored on the token. So the only thing you do is plug in the token and use a fingerprint. Does it also support usernameless and passwordless login? So the question is about login where the system, when you insert the token, identifies which user this token belongs to.
Does it work or not? The implementation right now does not support it. There is a plan to support discoverable credentials. There are a few things that I would like to address in a presentation in the afternoon related to UX. Basically, right now, we are very limited in how graphical environments allow doing this discoverability. So for example, for smart cards, a couple of years ago, we changed GNOME GDM to allow picking up different identities from the smart card. And if a user has, for example, several identities associated with the same smart card, then GDM allows picking the right one. The same problem comes with passkeys. Maybe amended with the idea of discoverable ones, but it's the same story. So it's more like not the back end, but rather the front end, the one that presents this to you. And the user experience is pretty bad right now on this. But the plan is to fix it eventually. Okay. Thank you, Alexander, for the talk. Thank you for all the questions. Thank you. |
FIDO beyond the browser |
Okay. Thank you. So, my name is Jos van Dijk. Disclaimer, I work for a commercial company called Yubico. So, maybe you've heard of YubiKeys. Yubico produces those YubiKeys. But I'm not going to talk about YubiKeys. I'm going to talk about a technology called FIDO. And many of you will have seen the previous presentation that was about an application of FIDO. So, the goal of this presentation is to move on to the next slide somehow. Yeah. To explain what FIDO is. So, I give a quick introduction to FIDO. And then what you can use it for. And many people will already have seen it. For example, authentication, on the web primarily. But I'm going to talk about things that FIDO can be used for that don't involve a web browser. And these things are less well known. So, I think it's interesting to have a look at the applications. And I'll give examples of open source software that you can use today that is actually using FIDO to do things that don't involve a browser. So, let me first explain about FIDO. So, FIDO is actually a set of specifications. One is by the World Wide Web Consortium. That's about using it in a web browser primarily. And the other one is about using security keys. So, tokens like this that are typically on your key ring. This is called a roaming authenticator in FIDO. And the idea is that you protect your private keys on a piece of hardware that has protection against extracting key material. So, this protocol is called CTAP. And that's by a different organization called the FIDO Alliance. And so, this is specifically for talking to authenticators like this one. So, how does that work? So, I'm simplifying things because there are a lot of details that I don't want to get into because that takes too much time. So, if we have some relying party. So, let's first look at the web authentication part. So, using a web browser typically. So, a relying party will typically be a web server.
And authentication works like many authentication protocols. You use a challenge-response mechanism where you use asymmetric cryptography to sign a challenge. And then the verifier, so the relying party, can check the signature. And if it works out, then you're authenticated. So, the idea is that of these two protocols, WebAuthn is basically used in a web setting. So, for example, the web server can send a challenge to a browser. And then the browser uses the WebAuthn API, which is simply a JavaScript API, to initiate the registration of a public key or authentication using that public key. So, that's WebAuthn, that's the web part. And then on the back end, your web browser will communicate with a security key. So, this roaming authenticator. Just relaying that challenge, asking the key to sign it. And then the response is passed on to the relying party, which will verify it. So, what's all the fuss with passkeys and FIDO and anti-phishing? Well, that's the merit of FIDO2. It has phishing protection. And that is because in this challenge, the web browser can help you secure things by injecting the origin of the site that you are authenticating with. So, this is included in the signature. So, if you end up at a phishing site, the signature will not match because it will have a different identifier for the web server. So, this is why FIDO is protecting you against phishing. But actually, I'm not going to talk about this use case. I'm going to talk about the right part of this image, where we use CTAP to communicate with an authenticator. So, these are all kinds of authenticators. So, yeah, I work for a company that produces these authenticators, but it's an open standard. So, anyone can build a security key. So, I'm using security key and roaming authenticator interchangeably, but these are all security keys by different vendors. So, of course, my employer is there, but there's also Feitian, for example, Google, Nitrokey, SoloKeys.
And that's interesting because that's actually also open source hardware. So, anyone can build a Solo key. The firmware is open source, everything. Nitrokey actually uses the same software base, the same firmware base. So, anything I talk about in this talk will work with any of these security keys. So, how does this protocol work? So, I'm focusing on CTAP, the back end. So, talking to an authenticator. Well, the idea is that first you have to register. So, registration is just to register your public key with this verifier, this relying party, whoever it is. And then later you can use that public key for authentication. So, there are two steps, registration and authentication. And so, in the registration step, I'm not going to talk about all the details, but you just register your public key with a relying party. And this includes something called the relying party ID. So, in WebAuthn, this is the identifier of the web server. But in other applications, it can be anything. But the idea is that it is included in any signature that you generate. So you fix this relying party ID when you register. And later with authentication, this relying party ID is included in the signature so you can, as a relying party, verify that it is used for your application. So, you cannot use the same public key for some other application with a different relying party ID. So, I'm not going into too much detail. Now, you might think, well, I can do this stuff with PGP. I can do it with smart cards. So, what is different about security keys if you're not using them in a web browser? Well, actually, many of the things that I'm talking about will also work with PGP or other technologies, although there are some specific features that are not always included. And one of them is attestation. So, attestation means that you can prove that some signature was generated with a security key.
So, of course, if you know that the public key is generated on a security key, then obviously that is the case. But if you're dealing with someone that claims to have a security key but you're not sure, you can actually verify it by this process called attestation. So, you can prove that someone uses, let's say, a Google Titan key to generate the signature. So, this is what is called attestation. And there's a service hosted by the FIDO Alliance, the organization that actually produces those specifications. And they host metadata. So, if you have a security key, it will have a unique identifier. So, not unique for that particular YubiKey but unique for the make and model. So, let's say, any Titan key or any Feitian key or any YubiKey that is of a particular make and model will have the same identifier. And in the specs, it says that at least 100,000 keys need to use that same identifier if they are the same make and model. So, we can be sure that, let's say, the signature is generated by a Titan key. And that is also interesting. So, attestation together with the metadata, they really add something to this process. So, here's an example of the metadata. So, someone built a nice web view of the metadata. So, you can look up things like, of course, who's the vendor of this YubiKey or this security key. But also, is it using protected hardware? And is it certified to a certain security level? So, all these things you can actually use. So, I'm not going to do any demo. So, I include all my demo slides for you to try yourself. So, we don't have time here. But, I'm just leaving them in the slides so you can actually try. So, this is a way to extract metadata. So, it's a bit technical. But, if you want to try it, please do. Then, about open source software. So, Yubico publishes a FIDO library. And it's actually used by a lot of open source projects. So, this is open source, although it's produced by Yubico.
And, yeah, if you look at, for example, GitHub and you look at all the projects that use this library, then there are lots of interesting projects that do. And that means that you can use a security key, any security key by any vendor, using that software. And, yeah, what I'll do in the rest of the talk is give you some examples. Because it's interesting that, although FIDO was primarily intended to do authentication, you can actually do other things. You can do encryption. You can do signing. And you can actually store things on the YubiKey. So, I'll give an example of all these features. So, let's start with a very simple example, like a pluggable authentication module. So, that's another open source library that is included in many Linux distributions. And the idea is that you can... It's open source software, you can try it out yourself, but let's give a practical example. Let's say that you want to encrypt your hard disk. You're on a Linux system, you're using LUKS, and typically this is done using a password. Instead of using a password, you can also use a security key, a FIDO key, so that instead of deriving some encryption key from your password, it will derive the encryption key from this extra key that is generated on your security key. So this means that if you want to decrypt your hard disk, you just need to insert your security key, so this is a what-you-have factor to get some extra confidence that only you can decrypt that hard disk. So worth looking at. Then there's another extension called large blobs, and this is for storing things on your security key. So it doesn't have a lot of space, but this is typically used for storing certificates. So let's say you're using SSH, and you use SSH with SSH certificates. These are small files, and it's feasible to store them on your security key. So this is an extension that is not very often implemented at the moment.
I think there are a couple of vendors that actually implement this, but it means that if you move to a different system and you want to log in there, you can actually take your security key and extract both the public key and the certificates. Of course, the private key stays on your security key, and then you log into a remote server from there, so everything is contained in the same security key. Here's an example of how you do this with the tools. Do it yourself if you have a key that supports it. Finally, a last example about this attestation. So if you generate an SSH key that is backed by a security key, you can actually ask the security key to provide attestation. So there's this extra parameter in ssh-keygen that will extract the attestation data, as it is called, that you can look up in the FIDO metadata service, and this way you can prove that the signature was generated by some security key, and you can look up exactly which one, and you're certain that this is done with secure hardware. Okay, you can try it out. So, getting to a conclusion. I'm not saying you have to stop using PGP or anything, but this is an alternative for doing things with secure hardware, and the idea is that now all the big vendors, like Apple, Google, and Microsoft, have jumped on the FIDO bandwagon, so we can expect a lot more support for FIDO in the future. That means that a lot more people will own a FIDO security key. For example, as already mentioned this morning, if you have an Apple iOS device, you can protect your Apple ID using a FIDO security key. So it's built into iOS 16.3, I think. So this means that it is a viable alternative that's actually also a lot cheaper than many of the other hardware keys that you can buy, like a smart card. So I think for 20 or 30 bucks you can buy a FIDO security key, whereas smart cards, especially if you consider all the middleware and everything else that you need to get them working, can be a lot more expensive.
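The disk-encryption example from this talk, where the key comes from the security key rather than from a password, can be illustrated with a small simulation. This is a sketch under stated assumptions, not how any particular tool does it: the class `SimulatedAuthenticator` and the function `derive_disk_key` are hypothetical names, no real hardware or FIDO library is involved, and the protected channel that the real protocol uses to transport the salt is left out. The core idea it shows is that the device computes an HMAC over a caller-supplied salt using a per-credential secret that never leaves the device.

```python
import hashlib
import hmac
import os


class SimulatedAuthenticator:
    """Stand-in for a security key (hypothetical; no real hardware involved)."""

    def __init__(self):
        # Per-credential secret generated on the device; it is never exported.
        self._credential_secret = os.urandom(32)

    def hmac_secret(self, salt: bytes) -> bytes:
        """Return HMAC-SHA-256(credential secret, salt)."""
        if len(salt) != 32:
            raise ValueError("salt must be 32 bytes")
        return hmac.new(self._credential_secret, salt, hashlib.sha256).digest()


def derive_disk_key(authenticator: SimulatedAuthenticator, salt: bytes) -> bytes:
    """Derive the volume key from the device output instead of a password."""
    return authenticator.hmac_secret(salt)


# The salt is not secret and could be stored next to the encrypted volume.
salt = os.urandom(32)
my_key = SimulatedAuthenticator()
other_key = SimulatedAuthenticator()

key_at_enrollment = derive_disk_key(my_key, salt)
key_at_unlock = derive_disk_key(my_key, salt)
key_from_other_device = derive_disk_key(other_key, salt)
print(key_at_enrollment == key_at_unlock, key_at_enrollment == key_from_other_device)
```

The same device and salt always reproduce the same key, while a different device does not, which is what makes the security key a what-you-have factor for unlocking the disk.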
So, just a list of resources. If you want to look up things, of course, you can also contact me. I'll be here all day and happy to take questions if there are any. Hi, I have a question about the attestation process and open implementations of FIDO2. If you have an open implementation, is it possible to get that certified with the FIDO Alliance? I'm asking in order to enable people to compile their own binaries to put on their own open keys. Is that possible or not? Yes, so as a relying party, there's a certification process that you can use to test if you are compatible with the FIDO standards. So there are different certification programs. The heaviest ones are for the actual security keys. So there you have to actually do a lot of work to get certified. But there's also a self-certification toolkit, I think, that you can use to see if you are compatible with the FIDO standards. So there are a lot of tests you can run against your own relying party software. Can you do something? Yeah, yeah, yeah. Any other questions? What is publicly known about hardware token failure rates, at least for popular models, and how many identical devices would you personally enroll for the most important things? The question is, I think, two-fold. First is about failure rates. Actually, I don't know about failure rates, but of course there are different ways to do user verification, and I skipped over it a bit, but to use a security key you need to touch it and sometimes also prove that you are the one using it. And this is done typically with a PIN or with a biometric. So I've never seen a security key fail, except when it's using biometrics, because of course every biometric has a false acceptance rate and a false rejection rate. And yeah, I don't know numbers, but it differs between vendors. So I think Apple doesn't say anything about their false acceptance rate, for example.
So I guess you just have to trust them, and that is the case with many vendors. Then... I missed the second part of your question. Oh yeah, so as already mentioned, if you lose your security key you have a problem, so you want to have multiple public keys registered, and how many, that depends on the relying party. So some relying parties initially only allowed you to register one, so that's pretty useless, because if you lose that one, then you're out of business. So usually there are several, and yeah, I think for example on an Apple iOS device you can now register six, but that really depends on the relying party. So some have no limit, so you can just register as many as you like, but that really depends on the relying party. Okay, we've got time for our next question. I see a hand here. So I guess we now have a single point of failure with the USB socket, especially if we are travelling. So are there any plans to have an implementation of the Bluetooth FIDO standard? So that we can still work if USB is broken, maybe with the phone or whatever. Could we have a little quiet? I have trouble hearing the question. Are there any plans for Bluetooth support in the libraries? Because, let's say if I'm travelling, I often have problems with USB sockets, so I would have a single point of failure there. So you're saying if you have problems with your USB sockets, then you cannot use a key, so is there an alternative? And I would prefer to have some Bluetooth fallback then. And the FIDO standard also specifies FIDO over Bluetooth, but as far as I can see it's only implemented on Windows and Android but not in the Linux libraries. Sorry, Bluetooth, yeah. So Bluetooth support is not in libfido2, it only has USB and NFC. No Bluetooth, but there may be an addition. In the next version of CTAP, there will be a way to do this. And maybe you've seen it in Google Chrome. You can generate a QR code so that you can use your phone.
So if you have enrolled your phone as an authenticator, you can do it over Bluetooth Low Energy, but the Bluetooth channel is only used for proximity. It's not used to actually transfer anything. For example, if you register a key, your public key is not submitted over Bluetooth. But that will be in the next version of CTAP. Okay, thank you for the talk. We are out of time, so applause. Thank you. |
AMENDMENT Hardening Linux System with File Access Policy Daemon |
We will start, so I guess we can start. Hello, my name is Radovan Sroka, and I will be presenting how to harden a system with the File Access Policy Daemon. So, I'll start with a brief introduction of the framework. Can you hear me? Is it better? Is it better? So, fapolicyd is an acronym for File Access Policy Daemon, and it's a lightweight implementation of application whitelisting. It has many features. I will just highlight that it has integration with RPM, so it can load some data from the RPM database, and it can log events to audit or syslog, and it heavily relies on the fanotify API from the kernel. So, when we look at this framework's architecture, we can see several components. The main component of the framework is the daemon. This daemon listens for fanotify events from the kernel. When it starts up, it reads all the data from the backends, and it creates its own database, which is called trustdb. The first backend is rpmdb, which handles data from the RPM database, or metadata, which is the better term, and the second one is the file backend, which loads all the trust information from user-defined trust lists. There is also a CLI component, which can manage trust and also the daemon's properties. So, how does it work? When we look closer, we can see it in this image. There are two processes on the system. The first, on the left side, is bash, and the second, on the right side, is fapolicyd. In this situation fapolicyd listens for these events, and it's just waiting. Bash is trying to execute the ps command, so it calls the execve system call, and the execution of this system call is put on hold, so it's like paused, and it's also waiting. 
Meanwhile, on the other side, the kernel sends an event to the daemon saying that something is happening on the system, and the daemon can read from this event that there is some process called bash, that it wants to execute this ps command, and that it has PID 500, so it does a rules evaluation, and when the decision is allow, it sends a fanotify response to the kernel, and the kernel will let this execve eventually finish with success. If the decision was deny, then this execve will return an error code. So the daemon is the main part of the project, and it works with the rules, and there are rules and trust, and both have power over what happens with the files and their execution on the system. So these rules are pretty similar to SELinux, they also have subject and object notation, and they start with the decision part. The decision part can be allow or deny, and it can be combined with the syslog or audit attribute. The second part of the rule is the permission, which can be open or execute, and that refers to the original system call from which the event comes, and any is like a placeholder and will match both of them. We support just these two because this is what fanotify provides. The subject is what is executing, and the object is what is being executed. So trust is a very important concept here in fapolicyd. It can be defined by the user, and that is usually done through the CLI, and we don't have to add system binaries or files to the trust, because they are trusted by default: they are loaded by the RPM backend, so this is done automatically, so when we run fapolicyd with the defaults, it just works. So these are some rule examples, I don't know if you can see them, but the first one allows loading of trusted shared libraries on the system, and the second one, as opposed to that, denies untrusted libraries. 
The third one allows execution of trusted files on the system, the fourth one allows the use of scripts which are trusted on the system, and the fifth denies untrusted scripts. And there are two catch-all rules: the sixth one is for execution of all files, and the seventh is for open of everything. So they are also described here. So when we want to install fapolicyd, it's very simple, we can just use the normal Fedora program for installation, which is called DNF. This is a one-liner. The installation puts three packages on the system. The first one is fapolicyd, which is the main package and contains the daemon and the CLI. The second one is the RPM plugin for fapolicyd, which, during RPM transactions, sends the new metadata that is needed for correct behavior to the daemon, so the daemon can work with it and it is up to date, and it also notifies the daemon when the transaction ends, so the daemon can behave accordingly. And the third one is the SELinux package for fapolicyd, with the SELinux policy. So when the installation is complete, we can start fapolicyd. There are several ways to do that. In our examples, we will use --debug-deny, because we are interested in deny events, or in debugging of the deny events, and we don't need anything more. If we would also like to see allowed events, or some other debug information, it can be done via the --debug option, and when we want to run it in something more like a production environment, we can use systemctl. So when we run it, fapolicyd will tell us that it's listening for events, so it's okay, and we can start playing with it. So I prepared a few scenarios to demonstrate how it works. The first scenario is called execution of untrusted software. So I downloaded this file, this is a Python script called exploit.py, and I want to try to run it. 
I make it executable, and when I run it, I can see that it's blocked, and we can see down there that there is a deny event from fapolicyd that says that this exploit.py is not trusted, so it cannot be run. So it blocked this unknown program. Thank you. |
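The rule evaluation described in this talk can be sketched in a few lines of Python. This is only an illustration of the idea, not fapolicyd's implementation or its rule syntax. The `Rule` and `Event` classes, the `evaluate` function, the file types and the paths are all hypothetical. The rule list mirrors the seven examples from the talk, and the sketch assumes that rules are checked top to bottom with the first match winning, that an event matching no rule is allowed, that the script rules apply to any permission, and that the two catch-all rules deny execution and allow open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    decision: str                   # "allow" or "deny"
    perm: str                       # "open", "execute" or "any"
    trusted: Optional[bool] = None  # required trust of the object; None matches both
    ftype: Optional[str] = None     # hypothetical object type; None matches all


@dataclass(frozen=True)
class Event:
    perm: str   # "open" or "execute", taken from the original system call
    path: str   # the object, i.e. the file being opened or executed
    ftype: str  # hypothetical classification of the object


# The seven example rules from the talk, in order (assumed details, see above).
RULES = [
    Rule("allow", "open", trusted=True, ftype="library"),
    Rule("deny", "open", trusted=False, ftype="library"),
    Rule("allow", "execute", trusted=True),
    Rule("allow", "any", trusted=True, ftype="script"),
    Rule("deny", "any", trusted=False, ftype="script"),
    Rule("deny", "execute"),   # catch-all for execution
    Rule("allow", "open"),     # catch-all for open
]


def evaluate(event, trust_db, rules=RULES):
    """Return the decision of the first rule that matches the event."""
    is_trusted = event.path in trust_db
    for rule in rules:
        if rule.perm != "any" and rule.perm != event.perm:
            continue
        if rule.trusted is not None and rule.trusted != is_trusted:
            continue
        if rule.ftype is not None and rule.ftype != event.ftype:
            continue
        return rule.decision
    return "allow"  # assumption: no matching rule means the access is allowed


if __name__ == "__main__":
    trust_db = {"/usr/bin/ps", "/usr/lib64/libc.so.6"}  # stand-in for trustdb
    print(evaluate(Event("execute", "/usr/bin/ps", "executable"), trust_db))
    print(evaluate(Event("execute", "/home/user/exploit.py", "script"), trust_db))
```

With these assumptions, the trusted `ps` binary is allowed by the third rule, while the downloaded `exploit.py` falls through to the fifth rule and is denied, which matches the outcome of the demo.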
Elliptic curves in FOSS
More curves to the set |
So that's all from my side in this little review of elliptic curves themselves and the thinking about the implementations in free software. Hi. You were talking about standardization of curves, but with a lot of the new cryptography techniques like zero knowledge, homomorphic encryption, MPC, VDFs. And then even when you start to want recursion and curve cycles and pairing-friendly curves, each time it seems we try to standardize curves, we then discover, oh, there's some new property like maybe unknown order groups or hyperelliptic. I think the movement of Ristretto and Decaf goes in the direction that, instead of standardizing an elliptic curve itself, it's standardizing a variety of elliptic curves. Then you have the operations and the math to work with any elliptic curve in this variety. So this will bring us the possibility to have, like, another service that is providing random elliptic curves on the net, and you can pick one from there and forget about the problem that we are all sharing the same elliptic curves. The current standards that exist are fixing one elliptic curve or another, but fixing it in a specific way. I haven't answered it fully because you asked about homomorphic, but homomorphic is huge. It's a very interesting field, but it's a huge field. Okay, I can see another question. In many protocols and also in IoT devices, you are bottlenecked by speed and also by signature size. You said that with Jacobi quartic curves, you can go faster and also smaller. How much faster and how much smaller? In the paper of the Jacobi quartic people, the speed is explained. I don't have the numbers in mind now. I have the numbers in mind about the size, because the signatures in elliptic curves are 64 bytes and the scheme they propose for the Jacobi quartic, it's 84 bytes. So it's like a third smaller. 
But it's still bigger than BLS signatures, which are about 32 to 48 bytes because you only need a single point on the curve. In the signature, you are not putting only the point. There are three, if I remember correctly, there are three. In BLS signatures, it's only one point. Which signature? BLS, from Dan Boneh. Mm-hmm, but it's longer than, it's shorter than 48 bytes. It depends on the base curve you use, but it's only, so if you use, well, 32-byte signatures are not deemed secure enough because you have pairing issues, but 48 provides 128 bits of security, so, yeah. What I know is that in the Edwards curves variety, the signature is 64 bytes. Okay, are there any questions? No, then. Okay. Have you looked at genus 2 curves? Because I saw that they actually outperformed genus 1. Yeah. They outperformed genus 1. I saw DJB was really into that. Hyperelliptic curves. Yeah, the Picard group. This is a very nice subject also. I think banks are the ones that are putting more money into hyperelliptic curves. They are not using points. They are using a matrix of the Jacobian of the point. By now, I have only heard about banks putting money there. I haven't seen an implementation in open source. Any other? No, thank you. |
OpenSSL in RHEL: FIPS-140-3 certification
From FIPS-140-2 upstream to FIPS-140-3 downstream |
Hello, welcome to the security devroom. The next speaker is Dmitry Belyavskiy. He is talking about OpenSSL in RHEL, FIPS 140-3 certification. Hello, do you hear me well? Yeah, looks like that. First, who am I? My name is Dmitry Belyavskiy. I work at Red Hat as a senior software engineer. I have also been an OpenSSL committer since 2019, and since 2021 I have been a member of the OpenSSL Technical Committee. It's a group of people that manages the OpenSSL project from the technical point of view. My beloved pet project is also related to OpenSSL. It's the Russian GOST crypto implementation. Okay, let's go to the topic. First, what we shall speak about in the first part is: what is FIPS? What is FIPS certification? What is certified for FIPS at Red Hat? And a brief introduction to the OpenSSL 3.0 architecture changes. So, FIPS is a set of standards, American national standards, specifying which cryptography should be used and with which limitations. It consists of a series of documents. Some documents are publicly available. Some have to be bought from NIST. Some have to be bought from ISO. These documents are permanently updated. So, you should consider the FIPS set of documents as, well, a code of laws. It is as consistent as a code of laws. It leaves some space for interpretation. Nobody knows it all, so there are always discussions about how to interpret this or that point of a document. So, the FIPS certification process is done by accredited labs that get the sources from a company that claims that the code is certifiable. They do quite comprehensive tests. Some of them are public, some of them are not. Then they come back to us with notes like, you should fix this and that. Then we go to the next iteration. And at the end of the process, if we are lucky, we get the FIPS certificate. A FIPS certificate is necessary for using a system in many American government institutions and for many big customers. And again, it provides a very good recommendation from the security point of view. 
If you use FIPS certified software, it means that you are on the safe side from the security point of view. The current version of the standard is FIPS 140-3. We at Red Hat have certified several versions of our distributions for FIPS. We usually certify our kernel and several crucial libraries such as NSS, libgcrypt, GnuTLS and, of course, OpenSSL. We have a nice blog post explaining what we certified for RHEL 8, and the approach for RHEL 9 will be basically the same. It's still an ongoing process. So sorry, it's always a problem how to properly add a link in the presentation. I believe that you can download the presentation from the site or just find the blog post by the title. The support of FIPS in OpenSSL has a long history. There was native support in the 1.0 series of OpenSSL, which has long been out of support. In the 1.0 series, there was a set of invasive runtime checks: if we are in FIPS mode, then please do that, this algorithm is allowed, this one is not, and so on and so forth. Upstream did not provide native support in the 1.1.1 series, but we at Red Hat expanded the original patches. So FIPS support in the OpenSSL 1.1.1 series in RHEL 8 was basically a set of patches for libcrypto and libssl with even more invasive runtime checks. In OpenSSL 3.0, upstream decided that the model with invasive checks is badly maintainable, and they redesigned the architecture from scratch and created the so-called provider model. All algorithms are implemented in separate plugins named providers, and one of these providers, the most important for our purpose, is the FIPS one, which contains only FIPS compatible algorithms and is built from the same sources with compile-time checks. It's an individual library. So we also certify only this library, not libcrypto and libssl as a whole, which also simplifies the process of applying some changes. And to be sure that you are FIPS compatible, you should use only the relevant API, and that caused a massive OpenSSL API deprecation in 3.0. 
So if you claim that your application is FIPS compatible, please don't use deprecated API. If you are a software developer, just add a warning for using deprecated API. So now let's talk about our patches. The upstream FIPS provider was designed to match the previous version of the standard, FIPS 140-2, and we wanted to adjust it so it would match FIPS 140-3. Let's begin with the initialization of the library. The upstream approach implies that the FIPS provider is loaded via a configuration file, and the checksum, which has to be verified to be sure that the provider is the one we want to use, is a part of this configuration file. This approach was found suboptimal for our purposes, because we can detect that the system as a whole is in FIPS mode. So when we see that a Red Hat-based system is in FIPS mode, we automatically load and activate the FIPS provider. We also got rid of the external checksum. We embed the checksum into the provider, and that significantly reduces the problems you have when the checksum is in a separate file or in a configuration file. It can be damaged, or someone can just forget to copy the file, and then, well, you get unpredictable diagnostics. So when you have the checksum embedded into the provider, everything is much simpler from a maintainability point of view. And also we removed several algorithms from the FIPS provider. Well, I will speak about that in some minutes. The next change we implemented in our FIPS provider is indicators. Indicators are a new requirement of the FIPS 140-3 standard. We have too many algorithms. They can be combined in too many ways, and not all of the combinations are approved. So we should use only approved combinations. Well, there are two possible approaches. We call them implicit indicators and explicit indicators. The implicit indicators approach implies that if you performed a crypto operation and it succeeded, then you are on the safe side. 
Unfortunately, as I mentioned before, there are some combinations of permitted crypto algorithms which are not permitted by FIPS as a whole, so we had to partially switch to explicit indicators. The approach with explicit indicators is: you try a crypto operation. If it failed, assuming it was properly set up, then it wasn't approved. If it succeeded, then you should, and it's well documented, check the indicator and see if it's permitted from the FIPS point of view. Well, this approach is better from several points of view. First, it covers the cases when you have an illegal combination. Second, well, in real-life software, you can use FIPS non-approved crypto algorithms for non-cryptographic purposes, such as MD5 as a fast enough hash sum of files. So sometimes the application knows better what the purpose of the algorithm usage is. So we implemented a combination of explicit and implicit indicators. And some more implementation details. Well, we removed the Edwards curves from our provider. It was done because when we were tuning our provider, Edwards curves were not permitted for usage in FIPS, but, well, the news of yesterday is that under the standard that allows using Edwards curves, the curves Ed25519 and Ed448 are permitted. So we will probably have to consider whether we switch these curves on, or whether we switch them on in some upcoming version, because there is much interest in these curves. We also removed support for RSA PKCS#1 encryption in the FIPS provider, which is a potentially breaking change, and we are aware of some of our applications that have had to change the RSA encryption mode from PKCS#1 to OAEP. And we removed the Triple DES support from the FIPS provider. One more change we had to implement, also a requirement of the new FIPS standard, is switching to known-answer tests for ECDSA signatures. ECDSA signatures require using a random number, so every new signature has a different value if you properly implement the algorithm. 
If you use the same random value for two signatures, it means the attacker will be able to find out your private key, so it's dangerous. So we had to add a patch that allows, in test mode, specifying a well-known hard-coded random value just for test purposes. We know that upstream, in its upcoming version of the provider, uses a different approach. They use a predictable random number generator, and probably it's a better approach. We will think about it in some respin of our certification. Also, the new standard implies some stricter requirements for key derivation functions. It limits seeds. You should not use seeds shorter than 112 bits for most of the key derivation functions. Also, it specifies some requirements for the random number generators which were not implemented in the upstream provider. So only SHA-2 functions are basically allowed for the hash-based random number generators. Our provider supports these requirements. One more change worth mentioning is that according to FIPS 140-3 requirements, it's necessary to clean up memory not only after using private keys, but also for public keys. There should also be consistency checks for public keys freshly generated from private keys. The last two points are about other patches in RHEL that are not related directly to FIPS. They are about overall hardening. We have implemented so-called crypto-policies in RHEL 8. It's a way to ensure that all the crypto libraries and all the applications using these crypto libraries are consistent from the point of view of algorithms that are permitted to use. Also, although the FIPS 140-3 standard formally allows using SHA-1 for signatures, we removed the support for it in our FIPS provider because, well, you can't really rely on the security properties of SHA-1. And it may cause problems for application developers, and some application developers have already complained about it. But please don't use SHA-1 for signatures. It's just insecure. Okay, thank you. 
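The danger of reusing the ECDSA random value, mentioned above in connection with the known-answer tests, can be illustrated with a little modular arithmetic. The following is a minimal sketch, not a real ECDSA implementation: the modulus `N` is just a convenient prime standing in for the group order, `toy_sign` replaces the curve computation of `r` with a hypothetical stand-in (a modular power), and all names and values are made up for illustration. What it keeps from ECDSA is the signing equation s = k^-1 (z + r d) mod N, which is all the attack needs.

```python
# Illustration only: the algebra of ECDSA nonce reuse without any curve arithmetic.

N = 2**127 - 1  # a prime, standing in for the order of the curve group


def toy_sign(d, k, z):
    """Sign message hash z with private key d and per-signature random value k."""
    r = pow(5, k, N)  # stand-in for "x-coordinate of k*G"; depends only on k
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s


def recover_private_key(r, s1, z1, s2, z2):
    """Recover d from two signatures that share the same k (and hence the same r)."""
    # s1 - s2 = k^-1 * (z1 - z2)  =>  k = (z1 - z2) / (s1 - s2)
    k = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
    # s1 = k^-1 * (z1 + r*d)      =>  d = (s1*k - z1) / r
    return (s1 * k - z1) * pow(r, -1, N) % N


if __name__ == "__main__":
    private_key = 123456789
    reused_k = 987654321                            # the mistake: same k used twice
    r, s1 = toy_sign(private_key, reused_k, z=1111)
    _, s2 = toy_sign(private_key, reused_k, z=2222)
    print(recover_private_key(r, s1, 1111, s2, 2222) == private_key)
```

This is why a hard-coded random value is acceptable only in test mode: anyone who sees two signatures made with the same value can solve for the key exactly as `recover_private_key` does.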
Thank you for your attention. Feel free to ask questions about the details. APPLAUSE Hey, so you said you're using the provider API for the FIPS stuff. I was wondering, not all of the OpenSSL API gets routed through the provider, right? What are you doing about packages like SSH that are using the old API? Those don't work with FIPS, right? Well, yes, not all the applications use the new API, if I correctly understand your question. My question is, to use that, I need to use the new EVP_ API, right? Well, yes, you need to use the EVP_ API, but, well, I don't think you should call it new because it has existed for more than 10 years. That's right, but OpenSSH does not use that? Yes, it's my pain as a maintainer of OpenSSH. We are rewriting OpenSSH downstream to use the new API to make it FIPS-compatible. Oh, yeah, that's what I wanted to know. Thanks. In which of your Red Hat distributions are you supporting this? And I guess you don't have backwards compatibility with older applications for OpenSSL. It only works with FIPS-compliant applications, right? Sorry? Again. Yeah. One, which of your Red Hat distributions supports the FIPS standard? And two, how do you deal with backwards compatibility, because I guess applications that are not FIPS-compliant won't run with your OpenSSL API? So, I'm talking now about the RHEL 9 series, but we have certificates for the RHEL 8 series for the previous version of the standard. About the applications that use the old API, well, as I mentioned before, it's a common approach to provide downstream patches and push these patches upstream to make the application use the new API. This is the only possible way. Well, I participated in patching libfido2, for example. We have downstream patches for OpenSSH, but it also should be LibreSSL compatible, and that adds problems. But the general approach is to implement a downstream patch and push it upstream. Any questions? Hands up. No one? Okay. Thank you very much. Feel free to contact me directly. Thank you. |
Kerberos PKINIT: what, why, and how (to break it) |
Okay, we can start. Hello, I'm Fraser, I come from Australia, I work at Red Hat on identity management and PKI solutions, and yeah, I'm talking about the Kerberos PKINIT protocol. So I'll give an overview of the Kerberos authentication protocol, then I will discuss PKINIT, what its advantages are, how it works, and a short demo. Then I will discuss the security considerations and give a demonstration of a recently discovered attack against the implementation of Kerberos PKINIT that we have in FreeIPA. So Kerberos is an authentication protocol based on symmetric cryptography. It's a single sign-on protocol, so you can authenticate once per day, for example, and using a token from that initial authentication, you can then authenticate to additional services, hosts or users in the organisation's infrastructure. It was started at MIT in the late 80s. The current major version of the protocol is version five and it came about in the early 90s. The most recent IETF document describing the base protocol is RFC 4120, and that was from 2005, so even that version is nearly 18 years old, but there have been many extensions and enhancements since then. The major implementations are MIT Kerberos, Microsoft Active Directory, Heimdal, and FreeIPA or Identity Management in RHEL, which uses MIT Kerberos under the hood with some additional extensions. The parties in the Kerberos authentication protocol are the client, the key distribution centre and services. So the key distribution centre or KDC consists of two services that are logically distinct within the protocol, but typically combined together into this one party, which is called the KDC. Users, services and the KDC itself are all represented as principals in a realm. So a principal is just a name for a user, host or service in the realm, and the realm is the namespace for those users, hosts or services. 
So often you'll see one company, one organisation might have one realm, but as companies grow or have mergers and acquisitions, then typically you'll end up with multiple realms in your organisation. Each principal has a long-term secret key, which is shared with the KDC. For users, it's typically derived from a password or passphrase, so using PBKDF2 or some other key derivation algorithm. And for hosts and services, the key is often stored just in a flat file, which we call a keytab. And the authentication tokens themselves, which are exchanged in this protocol, are called tickets. So let's do some diagrams. The parties: clients, servers or services, and the KDC. We can see the client has a key, the server has a key, and the KDC has all of the keys, including the key of the ticket granting service, or TGS, which is one of the KDC services. So the initial authentication exchange involves what we call an AS-REQ or authentication service request. The client says, hey, it's me, I want to authenticate. It doesn't necessarily carry any authentication information. The authentication happens when the KDC responds to the client and the response includes a session key randomly generated by the KDC, encrypted to the client's secret key. So the client does not authenticate to the KDC. There are ways that you can do that, but in the base protocol, the authentication happens because only the client can decrypt the response containing the session key. The response also contains a ticket called the ticket granting ticket, which is not encrypted to the client's key, but rather to the ticket granting service's key. It also contains a copy of the session key and some information about the client. So the client can decrypt the session key and store the ticket granting ticket. Afterwards, when the client wants to authenticate to the server, it sends a TGS request to the KDC saying, I would like to talk to such and such a principal, in this case, a server. 
It includes the ticket granting ticket or TGT, and it also includes an authenticator: a timestamp for replay attack prevention and some client information, encrypted to the session key. The KDC can use the TGS secret key to decrypt the ticket, pull out the session key, decrypt the client authenticator, make sure the client info matches up, make sure the timestamp is within an allowable skew, and if everything checks out, then in response to the TGS request, it will send a TGS reply. The KDC returns a ticket for the server, which contains a new session key, and it also returns the new session key encrypted under the existing session key for the TGS session. So the client can decrypt the second session key and store the ticket for the service, the service ticket, and finally, it can talk to the service. So it sends the application protocol request, which includes the service ticket and an authenticator encrypted using the second session key. The server can then use its long-term secret key to decrypt the ticket, pull out the session key, and then it can use that session key to decrypt the authenticator, make sure the client info all lines up, make sure the timestamp is within the allowable skew, and then there's a shared session key between the client and the server. They can talk whatever protocol they want to talk using that session key. Okay, so that's the base Kerberos protocol. Kerberos has a bunch of extensions and integrations. There's a pre-authentication framework that allows you to integrate additional authentication mechanisms, such as TOTP or HOTP. There are mechanisms for embedding Kerberos authentication in the GSS-API and in SASL, so that will allow you to use Kerberos authentication with other protocols that support those authentication frameworks, such as LDAP or SMTP, IMAP, et cetera. 
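The three exchanges of the base protocol walked through above (authentication service, ticket granting service, application request) can be sketched as a toy model. This is a hedged illustration only: the `Sealed` class is a stand-in for symmetric encryption and provides no real security, the message layouts are simplified, and every name here (`KDC`, `as_exchange`, `tgs_exchange`, `service_accept`, the `krbtgt` key label, the five-minute `MAX_SKEW`) is an assumption chosen for the example rather than anything taken from a Kerberos implementation.

```python
import secrets
import time

MAX_SKEW = 300  # seconds of clock skew tolerated; an illustrative value


class Sealed:
    """Toy stand-in for symmetric encryption. NOT real cryptography."""

    def __init__(self, key, payload):
        self._key = key
        self._payload = payload

    def open(self, key):
        if key != self._key:
            raise ValueError("cannot decrypt: wrong key")
        return self._payload


def make_authenticator(client, session_key):
    """Client info plus a timestamp, sealed under a session key."""
    return Sealed(session_key, {"client": client, "time": time.time()})


def check_authenticator(auth, session_key, expected_client):
    info = auth.open(session_key)
    if info["client"] != expected_client:
        raise ValueError("client info does not match the ticket")
    if abs(time.time() - info["time"]) > MAX_SKEW:
        raise ValueError("timestamp outside the allowable skew")


class KDC:
    def __init__(self, keys):
        self.keys = keys  # principal name -> long-term key, including the TGS key

    def as_exchange(self, client):
        """AS exchange: session key for the client, plus a TGT sealed to the TGS key."""
        session_key = secrets.token_bytes(16)
        tgt = Sealed(self.keys["krbtgt"], {"client": client, "key": session_key})
        return Sealed(self.keys[client], session_key), tgt

    def tgs_exchange(self, tgt, auth, service):
        """TGS exchange: check TGT and authenticator, then issue a service ticket."""
        ticket = tgt.open(self.keys["krbtgt"])
        check_authenticator(auth, ticket["key"], ticket["client"])
        new_key = secrets.token_bytes(16)
        service_ticket = Sealed(self.keys[service],
                                {"client": ticket["client"], "key": new_key})
        return Sealed(ticket["key"], new_key), service_ticket


def service_accept(service_key, service_ticket, auth):
    """AP request: the service opens the ticket with its own long-term key."""
    ticket = service_ticket.open(service_key)
    check_authenticator(auth, ticket["key"], ticket["client"])
    return ticket["client"], ticket["key"]


def login_and_connect(kdc, client, client_key, service, service_key):
    sealed_key, tgt = kdc.as_exchange(client)
    session_key = sealed_key.open(client_key)  # only the real client can do this
    sealed_key2, service_ticket = kdc.tgs_exchange(
        tgt, make_authenticator(client, session_key), service)
    key2 = sealed_key2.open(session_key)
    who, shared = service_accept(
        service_key, service_ticket, make_authenticator(client, key2))
    return who, shared == key2
```

Note that in `as_exchange` the KDC never checks who is asking: as in the talk, the authentication is implicit, because a caller without the client's long-term key cannot open the sealed session key and so cannot proceed to the next exchange.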
For HTTP, there's a protocol called SPNEGO, and we can also include authorization information in the tickets, which contains additional information about the client, for example how they authenticated to the KDC, so this is what we call the authentication indicator extension, and there are other kinds of authorization data. For example, Microsoft Active Directory includes what's called a PAC, I can't remember what that stands for, but Alexander will know. Privilege Attribute Certificate, there you go, so that's the MS-PAC extension, which you'll frequently see when you're working with Active Directory or cross-realm trusts. With Active Directory, that's that final point. I explained that there were situations where you're dealing with multiple realms. What if principals from one realm need to communicate with or authenticate to principals in a different realm? That is accomplished by trusts and the cross-realm authentication mechanisms. The advantages of Kerberos: well, it's single sign-on, which improves efficiency and reduces password fatigue for users. The client has to expose their long-term secret only once until the TGT expires, so maybe once per day at the start of the day logging into your workstation, a single authentication happens and from then on, you no longer need to explicitly authenticate. It is resistant to replay attacks, that's why all of the authenticators include timestamps, and it works well for HTTP as well as bare network protocols, where the predominantly HTTP-centric SSO protocols like SAML and OpenID Connect fall short. But the problems are, of course, that passwords are not great, and dealing with passwords or the secret keys in keytabs, making sure that they are rotated, making sure that they are secure in the first place, can be challenging and burdensome, with substantial administrative overhead. So this brings us to PKINIT, or Public Key Cryptography for Initial Authentication in Kerberos, as the RFC is called. 
In this protocol extension, the client can use asymmetric cryptography to authenticate to the KDC. The client presents an X.509 certificate in its initial authentication request, as well as a signature made with the private key corresponding to the public key contained in the certificate. The KDC verifies the certificate, the signature and the binding of the key in the certificate to the client principal, and if everything checks out, it can respond with a response encrypted either using Diffie-Hellman or some other analogous key agreement algorithm, or using a public key encryption algorithm such as direct RSA encryption. So visualizing this, the client in the authentication service request says, hey, it's me, client, but this time it includes some additional pre-authentication data. It includes a timestamp and, if it wants to use Diffie-Hellman, a client DH value, signed by its private key, and it includes the X.509 certificate containing the corresponding public key. And the KDC, once it has verified everything and is happy to proceed, sends a response that includes the TGT, and it includes the session key encrypted using the public key algorithm, in this case Diffie-Hellman, as well as the KDC Diffie-Hellman value that the client will need to compute the secret with which the session key is encrypted. Then the client can decrypt it, store the session key, store the TGT, and from this point forward, the rest of the protocol is exactly as before. In FreeIPA, by default, we perform the binding of the certificate and key to the principal object using an exact certificate match only. So in the principal's LDAP entry, we'll store a complete copy of the certificate. We optionally support certificate mapping rules that allow you to be a bit more versatile in how you establish the binding between the certificate and the principal. 
For example, if you're using certificates for hosts, you can pull out the DNS name from the subject alternative name field in the certificate and construct an LDAP query saying, well, we're looking for hosts whose FQDN matches that DNS name from the certificate. And the client certificates can be signed by FreeIPA's internal CA or by a third-party CA that the KDC trusts. The user experience for PKINIT: you can do it from a CLI, it's not very pleasant, but you can use SSSD integrated with your login manager to improve that experience, particularly if you're using smart cards or a TPM for storing the private keys, or doing additional pre-authentication mechanisms like two-factor authentication, and Windows offers a similar experience. It should be, in fact it must be, easy and friendly for users, otherwise people will not use it and you will not get the security benefits. So, quick demo. klist shows me what tickets I currently have; the answer is none. If I kinit as alice, I can type Alice's passphrase, and I now have a TGT for Alice, so that was a password-based authentication. And if I ping the IPA server, that's just talking to the FreeIPA HTTP API, now if I klist, I can see that behind the scenes, it has acquired a service ticket for one of the IPA HTTP servers. I'll just destroy those tickets now, and I'll do a PKINIT. So if I change directory here, here I have a certificate and a key, and I'll just pretty-print the cert for you. So what can we say about this cert? And actually, I'll tell you what, I'm doing things in the wrong order here, I'm going to do a host authentication first. So if I do kinit -X X509_user_identity=FILE: and the certificate, and the key, and a host name, not a host name rather, but the principal name, host/rhel78.ipa.test, and klist, here we have our TGT for the host principal. Okay, so PKINIT advantages: no more passwords or client shared secrets. The keys can reside on smart cards, so for example in a YubiKey, 
in a TPM, or in a hardware security module, and as I mentioned earlier, the rest of the protocol after the initial Authentication Service exchange is unchanged, which makes it easy for services. The complexities: well, you need an X.509 PKI, and this brings in renewal considerations and revocation considerations. If you want the benefit of hardware security, that brings an additional financial cost to buy the hardware. And binding the public key to the principal is an important consideration. The RFC, it's RFC 4556, says the KDC must also check that the client's public key used to verify the client's signature is bound to the client principal name specified in the Authentication Service request, and it goes on to suggest a couple of ways you can do that. You can encode the principal name directly in the certificate in a specialised subject alternative name. Or you can associate the certificate or the key directly with the principal in your database; that is what we have as the default behaviour in FreeIPA, but it introduces administrative overhead, because when the certificate is renewed or the client re-keys, you need to make sure that those entries are up to date. Or you can use other heuristics: for example, if the cert has a DNS name, pull that out and use it to look up a host; if the certificate has an RFC 822 name, which is an email address, pull that out and use it to look up a user principal. And you'd better not mess this up, which brings us to the CVE. So if we have a look at the certificate that I used to get this host principal, we'll see something interesting. It doesn't actually mention that principal name anywhere, the rhel78-0, but it does have a subject alternative name. It has two in fact, and one of them is a wildcard DNS name. 
So what's happening here? It's an LDAP filter injection vulnerability. FreeIPA is not vulnerable in the default config because, as I mentioned, only exact certificate matching is used by default. This bug is in the SSSD component. It was already resolved when I found it, so it was only older but still supported versions that were affected, and the fix has now been released and the details are public. So what's happening is that the cert map rule, sorry about that, the cert map rule is just pulling the DNS name out of the certificate and concatenating it directly into the LDAP filter without sanitization. In LDAP, the asterisk is a substring match character, so using that certificate would let you get a TGT for any host principal in your realm. And another interesting question is, what happens if this is your email address? Now this might seem like a stretch, but that is a valid email address, and I'm sure many of you work at companies, or have worked at companies, where you can request your own email alias. At Red Hat, we certainly do. So if you managed to request an email alias such as this and the system approved it, and you've got a certificate with that email address on it, and you have a cert map rule that looks something like this, where you're stuffing the subject RFC 822 name into the query and using it to look up a mail attribute, and that was somehow nested inside an or-list expression, then you've just got yourself a domain takeover. So let's see a demo of that. If there's time. There might not be time, I think. How much? Five minutes left. So I'm going to skip it. Sorry. I'll tell you what, after I finish the talk, if I can, I'll just do it. But let me discuss the mitigation now. So yes, if you're running a vulnerable version of SSSD, you should update it. And-list rules are harder to exploit than or-lists. 
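The filter injection described above can be sketched in a few lines. This is only an illustration, not the SSSD code: the function names and the `fqdn` attribute in the filter are assumptions made for the example, and the escaping follows the RFC 4515 convention of writing the special characters as a backslash plus two hex digits.

```python
# Hypothetical sketch of an LDAP filter built from a certificate DNS name.
# Not SSSD's implementation; names and the "fqdn" attribute are illustrative.

_SPECIAL = {"*", "(", ")", "\\", "\x00"}


def escape_filter_value(value: str) -> str:
    """Escape the RFC 4515 special characters as backslash + two hex digits."""
    return "".join(
        "\\{:02x}".format(ord(ch)) if ch in _SPECIAL else ch for ch in value
    )


def host_filter_unsafe(dns_name: str) -> str:
    # Direct concatenation: a "*" in the certificate becomes a substring match.
    return "(fqdn=" + dns_name + ")"


def host_filter_safe(dns_name: str) -> str:
    # Escaped value: "*" is matched literally and cannot widen the search.
    return "(fqdn=" + escape_filter_value(dns_name) + ")"


if __name__ == "__main__":
    print(host_filter_unsafe("*.ipa.test"))  # wildcard reaches the filter
    print(host_filter_safe("*.ipa.test"))    # wildcard is neutralised
```

With the unescaped version, a wildcard DNS name matches every host entry under that suffix, and a value containing parentheses can close the current clause and open a new one. The escaped version treats the whole value as literal data.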
Just to point out how the LDAP filter expressions work: with an or-list, if a single sub-expression matches, then the whole list expression will match, but with an and-list you have to match every sub-clause, so it's just easier to exploit an or-list. You should definitely audit what data can get included in certificates, where that data comes from, and how it's included or encoded in the cert. And you could use exact certificate matching to avoid this issue, but that does come with the administrative overhead of handling renewals or re-keys. General security considerations for PKINIT: well, this first point is for all software, always: properly escape and sanitize your inputs according to how you're using them. You should review your CA trust, so which CAs are you trusting? What profiles or templates are used for issuing the certificates? And how are the attributes that go into the certificates validated? Who can issue the certificates that you trust, both in terms of their software systems and the agents, human or otherwise, who act to issue certificates? And can any of the attributes be influenced by users or other parties, such as if you have an email address alias request system? Just because a value is valid in a particular context does not mean that it's benign in another context. And the key-to-principal binding is a critical aspect of PKINIT security and of PKI application security in general. It is as critically important as validating your certificate chain and validating signatures. The full write-up about this issue is at that link on my blog. And there's also a link to the entry about this issue in the Red Hat CVE database, which includes the list of which products were affected, where the fixes have happened and where they are not happening. Okay. And that's all. I'll ask for questions, and then maybe I'll do the domain takeover demo in a minute. Maybe we have time for one question. 
So is there anyone who has the best question that can be answered in one minute? They want the demo. Okay. So I just need to change the cert map rules, which ones are active. Oops. If I can authenticate. Okay. ipa certmaprule-find, this is just the setup. So I need to disable cert map two and enable cert map one. Okay. So this is the rule that I'm enabling. I didn't enable it. Oh, thank you for that. Okay. Now it's enabled. And I can do kinit. Let's see, it'll be there in the scrollback somewhere. That's server, that's okay. So the naughty certificate and the naughty key. And I want to be admin. And now I'm admin. I'm going to show you the certificate. Okay. So the certificate was issued to Alice, and the subject name includes Alice's malicious email alias. Okay. There you have it. Bye. Thank you. All right. |
Remote Attestation with Keylime |
Hello? Okay. Now it works. Kind of, right? Okay. So hello everyone. Welcome to the Security Dev Room, and we've got our next talk about Keylime and remote attestation, which will be given by Anderson and Thore. Okay. So welcome. Sorry about the trouble. So I'm Anderson Sasaki, I am a software engineer at Red Hat, and I'm here with Thore. Yeah, I'm Thore, and I'm a maintainer of a Linux distribution for schools and universities, and I'm also a maintainer of Keylime. Yeah, so we are here to talk about remote attestation with Keylime. So let's get started. Imagine you are a car vendor who maintains and updates the systems running in cars, but you want to make sure that the systems in the cars were not modified, so that you can check if the customer is still eligible to receive the latest updates or something like that. Or you are a software company building software in the cloud, but you want to make sure that the build tooling was not modified. Or you are a telecom company that wants to make sure that the systems you deployed to control antennas were not modified. So what all these cases have in common is, first, they are remote, and second, you don't really have full control of the systems in the wild. So the question is, how can you check that a system in the wild was not modified? One way would be if you could somehow get some information about the system and then check if it's what you expected. And of course, in case it's not, you would want to have a way to react to that. So if you can do that continuously, get the information and check it, then you have monitoring of the integrity of the system. So one of the things remote attestation can provide is checking the integrity of a remote machine. How does it work? 
So you have a trusted entity running in some controlled environment, and then you have a trusted agent on the other side, running on the monitored system. You ask that agent for information and get back some information called a quote, and then you can verify that the agent is running on a machine in a state that you trust. That comes with the problem of trust. How can you trust the machine, or the agent running on some machine that you don't control? You don't really trust the agent directly, but you trust a hardware root of trust, which is the Trusted Platform Module, or TPM. What are TPMs? They are pieces of hardware that can perform crypto operations such as generating keys and signing data, and each one has a special key and certificate, called the endorsement key, which are generated during manufacturing. So the manufacturer generates the key and publishes the CA certificate so that you can verify that it is legitimate. The EK, the endorsement key, can't sign data directly, but you can generate attestation keys that are associated with that endorsement key in a way that lets you verify the origin of some signed data, so that you can make sure that the data was signed by that specific TPM. Another important thing that the TPM has are the platform configuration registers, which are special registers designed to store measurements about the system in a way that lets you verify its integrity. So how are these measurements done? During boot, each step of the boot is measured by the UEFI into the TPM via the PCR extend operation. So for each step the boot process goes through, you get a hash of the binary or the software that is running and extend it into a PCR. I will explain that soon. 
And so during boot, the UEFI is responsible for measuring the boot steps into the TPM, and after boot, the kernel Integrity Measurement Architecture, or IMA, will measure any opened file that matches a policy. You can configure IMA, and it will measure the opened files into a PCR as well. So if you have the state of the PCRs and the event log, that is, all the extend operations that were performed, then you can verify the integrity of the machine. How this PCR extend algorithm works is kind of simple. You get the old value stored in the PCR and concatenate it with the measurement of the data. This measurement is basically a hash. So you concatenate the old value with the hash of the measured data, calculate the hash of all of this, and put it back into the PCR. That's done for each step. Of course these PCRs, if you know a bit about TPMs, don't match the actual numbers, this is just for illustration. After measuring all these steps, you have the final value in each PCR, and from those you can calculate a so-called golden value, which is something like the hash of all the PCR values. That gives you a representation of the state of the machine that can be verified. So, how Keylime works. On the left side, you have the trusted entity, probably a machine that you control, where you run the verifier side of Keylime. It's a server. On the right side, you have the monitored system. It is remote. You don't have complete control of it, but the agent has access to the TPM installed in that machine. The verifier can request the state from the agent. Then the agent accesses the TPM to get the quote, meaning the PCR values, together with the event logs, that is, all the PCR extend operations that were performed, and sends it back to the verifier. And then the verifier can first verify the origin of that piece of data, because it's signed by the AK. 
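The PCR extend operation described above, and the way a verifier can replay an event log to recompute a PCR, can be sketched as follows. This is a minimal illustration, assuming a SHA-256 PCR bank and a PCR that starts at all zeros; the function names and the sample "boot steps" are made up for the example and are not Keylime's API.

```python
import hashlib
from typing import Iterable

PCR_SIZE = 32  # bytes in a SHA-256 PCR (assumption: SHA-256 bank)


def pcr_extend(old_value: bytes, measurement: bytes) -> bytes:
    """new PCR value = H(old PCR value || measurement digest)."""
    return hashlib.sha256(old_value + measurement).digest()


def replay_event_log(measurements: Iterable[bytes]) -> bytes:
    """Recompute the final PCR value by applying each logged extend in order."""
    pcr = bytes(PCR_SIZE)  # assumed initial state: all zeros
    for digest in measurements:
        pcr = pcr_extend(pcr, digest)
    return pcr


if __name__ == "__main__":
    # Hypothetical boot components; each is hashed before being extended.
    steps = [b"firmware", b"bootloader", b"kernel"]
    log = [hashlib.sha256(step).digest() for step in steps]
    print(replay_event_log(log).hex())
```

Because each new value depends on the previous one, the final value depends on every measurement and on their order. A verifier that replays the log and arrives at the PCR value in the signed quote knows the log was not edited after the fact.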
So you can make sure that the data came from the machine that contains that TPM, and you can verify the identity of the TPM using the EK certificate. And with the values you obtained for the PCRs and the event log, you can replay all the extend operations so that in the end you get the values that the PCRs should have. With all this information, you can verify the integrity of the machine. You also get the information from IMA, that is, all the files that were opened and matched some policy. IMA calculates the hash of each opened file that matches the policy and extends it into the PCR. So you get this log containing the file names and the matching hashes, and with some policy engine you can also verify the integrity of individual files on the remote machine. So you can get a full integrity view of the remote machine. With that information, the verifier can check. If it's okay, it's okay, the attestation was successful. But if it doesn't match what you expected, then it's a failure. In case of failure, we have a revocation framework, which is a way to configure actions for the verifier, some script that it can run to perform some action. It can be webhooks, so if some attestation fails, it sends a request to some webhook, or you can notify the agents directly via the REST API and send some payload to trigger some operation there. The simplest scenario, for example: if you had a cluster with various machines and one of them failed attestation, you can notify the others to remove that node from the cluster by blocking the network connectivity. So that's how Keylime works in general. So now I'm passing the mic to Thore. He will continue with the real world stuff. Yeah. So now we heard how Keylime works, and we want to show that you can use that in production, and what challenges you will run into if you want to try that. We have three main scenarios there. 
We have first policy creation, then the monitoring, and then how to react to that. So the first part is, we want to create policies for our system. For that, we need to know what is actually on our systems, and what our systems are. From a software side perspective, it's normal that we have a CI/CD pipeline, we know what data goes into that, and we want to save the hashes there. But we also need a lot of other stuff. We want to know what packages are installed, where they end up on our system, whether they have signatures, and whether we can verify them. That is what we normally want to have, and either this information is already provided by the distribution, or we need to generate it on our own. Then on the hardware side, we need to know what kind of hardware we're running. As we said, we have the EK, the endorsement key, and we need to at least know that to trust the TPM in some regard. And then ideally, we also want to know what firmware is running on that device and which configuration we have. For example, do we allow Secure Boot to be disabled and enabled? Do we have our own keys on there? And stuff like that. If we have all that information, we can go to the other side, which is the monitoring. That part is implemented by Keylime. If you have all the necessary information, we provide documentation and tools to generate a policy from it, and you can feed that in, and it's all there. The challenge that you run into here is that for many of you, IMA, measured boot, and TPMs are probably new. And if you run into issues, then you also need to try to understand how they work in order to debug them. So that is a challenge you run into: you still need a good understanding of those technologies to make your life easier. But yeah, that is mostly solved by Keylime. And then we come to the non-technical side, which is that we need to react somehow when we have a revocation event. So is that alert actually relevant for us? 
Because if we have file changes in /tmp, we don't really care. But then who needs to be notified if you have that? Then how do we tie that into our current monitoring infrastructure? For example, with the webhooks. And lastly, if you are a company, it's a potential security breach if Keylime fails in the way that you configured it. So are there service agreements in place, who do you need to notify, and how do you respond to that? But going now from the general part to actual examples: I work on a Linux distribution that does exams for schools and universities, called Lernstick. And together with the University of Applied Sciences and Arts Northwestern Switzerland, also called FHNW, we developed a system called Campla, which does secure bring-your-own-device exams. So what is the problem here? The students want to bring their own device, their own notebook, into the lecture hall and want to write their exams on that. We don't want to touch their operating system. So we do something we call bring your own hardware. They bring their own hardware, we boot a live Linux system on it, and we remotely attest that the system is running correctly. So what do we have? We have our hardware, which has a hard drive and a TPM. Now we boot the distribution called Lernstick. On that we have the Keylime agent running, and also IMA and our measured boot stuff. And now the interesting part is that we just care about the TPM. We don't care about the hard drive and what is on that system otherwise. So now we have the actual server solution. We register with the exam system, and this also includes registering with Keylime. Then we check in return if the system is actually in a trustworthy state. And if that's the case, we release the exam files, which in our case is normally an RDP session that connects to the cloud where the people are actually writing their exams. So why are we doing it this way? 
The first reason is that we guarantee that the environment for every student is the same, because they only provide their hardware and it's basically a terminal to connect to the actual exam. So if there's computing-intensive stuff, it doesn't really matter. And also, because they only bring their own hardware and don't need to install monitoring software on their system to write the exam, we don't care what they do on that system. We don't want to know, first for privacy and also to make setup way easier. Now back to a more traditional scenario that more of you are probably familiar with, the cloud. There we have the example that IBM uses this for hypervisor attestation. They don't use runtime attestation, or not anymore. They use measured boot to see if the hypervisor booted up correctly. Their challenges were, first, implementing the actual response procedures, so the procedure from having an alert to how to deal with it. That is the difficult part, because one thing is the technical side, but how do we structure our teams in a way that we can guarantee that? The next one is eliminating false positives, and that ties into the first point, because if a human needs to react, then we want to have no false positives and ideally also no false negatives. And false negatives are very, very bad for security, so we don't want to have those. And lastly, keeping the policies up to date. Even if you roll your own distribution and are big enough, it's very difficult to be up to date on the policies and integrate them automatically. They have an escalation chain, shown here just for illustration purposes. They use Keylime to monitor, tie that into their JIRA system, and then have an actual person react on the other side. And then one point from a distribution, in this case from SUSE. I asked them, and they integrated Keylime into pretty much every product. So it's in openSUSE today. If you want to use it in MicroOS, there are instructions to do that. 
And then it's also in SUSE Linux Enterprise and in ALP. Their challenges were integrating it fully with SELinux and making IMA usable. So do we have signatures? How do we provide the hashes? And the general question for a distribution is, how do we provide robust policies in general? Because we want users to try out the technology and experiment with it, but how do we give them a starting point? That is still very difficult, because as we saw, there are many data points that need to be collected. And that is a challenge that they're actively trying to solve by making it easier to get the signatures and the hashes. Yeah. So to say it at the end: try remote attestation today. The technology that you need for it is in pretty much every device that you have, like a notebook that you can use. You can find Keylime at keylime.dev. And yeah, thank you. So do we have questions? Lots of questions. Thank you for a great presentation. One question. You talked a lot about the verification side, but to have the golden values for the PCRs in your verifier system, you need to provision them. So I was not sure about the distribution side of things: how do you manage that in Keylime? Could you shed some light on that? Yeah. So with the golden values, we have the values in the TPM, and they're also tied to an event log, for IMA and for measured boot. And we solve the issue by not actually needing golden values: we have a policy engine that verifies the logs themselves and checks that they match the values, so we check the logs and not the end value. And then the distribution can help, because they can already provide a lot of the signatures, and which files are in which packages and where they end up. That makes life easier. Yes, sir. What is the performance of such a check? 
How much time does it take, and how much data is required for such monitoring? From what I saw, and I don't have benchmarks for that, it's pretty quick, like 200 milliseconds, something like that. So the round trip from the request to the response is around 200 milliseconds in my test, but of course it's on my machine, right? We don't have benchmarks for the performance. Yeah, it also heavily depends on what you want to attest. If you just have measured boot, it's the quote time on the hardware TPM, at most a second, and then it's a couple of megabytes at most, single digits, of data that is transferred. You said that one of the challenges of implementation was dealing with false positives and maybe false negatives. Can you give some examples of when that would occur? Yeah, because we are still talking over the network, it's a false positive if the network connection goes down. And the other one, which is kind of a false positive and kind of not, is if your policy is not up to date. For the system, it's not a false positive in the traditional sense, but it is a false positive in that we don't actually want that alert to happen. For the university use case, how do you know that you're actually talking to the real TPM in the laptop? So we have two ways. First, we verify against the hardware manufacturer. They have a CA that we can verify against. And we can also enroll the notebooks directly. I forgot to say that the university part is still a proof of concept, so we are currently working on it, but it's not rolled out at large scale. How do you make sure that when an alert event, a new change, happens, it's not intercepted over the network? Sorry, once again. How do you make sure that when there's an event saying that there's a change on the machine, a new measurement that appears? 
How do you make sure that the event is not intercepted on the network between the monitored machine and the trusted system? So is the question, how do we deal with losing the connection between the agent, the monitored system, and the verifier? Losing the connection, or maybe having something in between that makes sure it does not reach the trusted system. There's something in between that makes sure that you're never going to be notified that a system is going to be compromised or just became compromised. Did you get that? Yes. So if the connection between the agent and the verifier side is blocked, then we get a timeout and the agent automatically becomes distrusted. And for the notification system itself, if we notify all the other agents, then of course there is an issue if you cannot reach them on a trusted channel; then it's basically game over in that direction. So if you want to guarantee your revocation alerts all the time, you need to get them through a trusted channel. So the trust boundary covers the attestation part, but for the revocation part, if you want to reach the other agents, it needs to go through a trusted channel. Yeah. So continuing on this question, actually: how do you make sure that your verifier connects to the right agent and you don't have a man-in-the-middle attack that's rerouting this to a fake agent and fake TPM? Yeah. So that's tied to the EK certificate. You trust the manufacturer, because when they manufacture the TPM, they create this key, which cannot be modified or removed in any way. It provides the identity of the TPM. So when you get the information from the TPM, or from some agent, you can verify that the data came from the TPM that has that EK, because it's signed, and you can verify the origin using the CA certificate provided by the manufacturer. So you can check that the TPM is exactly the one you expected using the EK certificate. Okay. 
Thank you for the talk. Thank you for all the questions. We are out of time. Thank you. |
AMENDMENT Hybrid Public Key Encryption in PQ world?
Converting HPKE to be PQ |
We are starting in a couple of seconds, so welcome, Norbert. Thank you. I hope you can hear me, right? So yeah, my name is Norbert Poč. I work at Red Hat, and today I will talk about hybrid public key encryption, and later we will spice it up with some post-quantum. Okay, so is there anybody here who already knows about HPKE? Please raise your hand. Oh, nice. There are quite a few people. Okay. So, for those who don't know, HPKE stands for hybrid public key encryption. It's symmetric and asymmetric, it combines both into one scheme. It uses a key encapsulation mechanism, which is used for key exchange. For key exchange you have, for example, Diffie-Hellman; this is a slightly different approach. You can find it in RFC 9180. So, yeah, we have the fundamental parts of the HPKE scheme. You have the key encapsulation method, the key derivation, or key schedule, and the AEAD, which provides the encryption of the messages themselves. Below, you can see the supported algorithms listed. For the key encapsulation method, we have prime curves and Edwards curves. For the key derivation, we use SHA versions, and the AEAD supports AES and ChaCha20-Poly1305. So, yeah, some familiar words you will find later. For the KEM operations, we have key generation, encapsulation and decapsulation. For the KDF, we have extract and expand. Extract generates a key from some input data, and expand expands this extracted key to whatever length we wish. For the AEAD, we have seal and open, which are encrypt and decrypt. It's just an alias for it. Okay, so let's talk about how it works. On the slide, you can see the KEM, the key schedule, the encryption context, and the AEAD. The encryption context divides the asymmetric part on the left from the symmetric part on the right. This is really important because, as we'll see later, this division enables us to change parts in it and leave the scheme still intact. 
So, let's say we want to use different algorithms for the KEM; we just change it and can proceed with the scheme itself. So, we use key derivation on the output of the key encapsulation to create an encryption context, which consists of the data we will need to encrypt some data or messages in the AEAD. Yeah, and the last part, the symmetric part, is the AEAD, where we grab some messages, encrypt them, and send them over. So, this is the communication part. Now, I want some feedback from the last row. Is it readable? Okay, thank you. So, this is a formal diagram of how it works. I will go through it. We have Bob on the right and Alice on the left. We will assume that Bob has a key pair whose public key is already shared with Alice. In HPKE we don't care how. So, let's pretend that Alice knows the public key of Bob. Then we start with the encapsulation. Alice generates a temporary key pair, called ephemeral. Okay, it's visible. So, here, it's the pkE and skE. We use Bob's public key and Alice's ephemeral private key to do Diffie-Hellman, which gives us a shared secret. Then we use the shared secret in the key derivation function to create a common key. Then we send our ephemeral public key over to Bob so he can do the same to get the shared secret. So we encapsulate it, send it over, and now we are at Bob. Bob does the decapsulation. He has his private key and Alice's ephemeral public key, does the Diffie-Hellman, and gets the same shared secret. The shared secret is, again, used in the KDF, and we have a common key. So, the common key is the same at Alice and Bob. This is the end of the key encapsulation part. Now we move to the key schedule. We take this common key, use extract to generate a secret, let's say, then expand it to get a key, and expand it one more time, but with different information, as you can see here, to get a nonce. 
And then the third one is the secret for exporting. So, now we have set up the communication, and we can actually exchange encrypted messages, which is the seal and the open. As you can see, we use the key, the nonce, some additional data and the plaintext. Thank you. So, here, you can see an XOR. The messages are counted. Every message has a counter and we XOR it with the nonce. Therefore, every encryption will be different, even if the message is the same. So, if the plaintext is the same, we will get a different ciphertext. That's the reason. Okay, so, we have this nice scheme. What can we use it for? Possible use cases are MLS, or Messaging Layer Security. It's quite new stuff, I think maybe one year old. MLS solves a problem where we have more than two communicating parties. So, let's say I want to communicate not just with you, but with you and you and you. It's harder to exchange keys this way; when we have a two-way communication, it's easier, right? So, this is the problem that MLS solves. Then we have the TLS Encrypted Client Hello and Oblivious DNS over HTTPS. The last one is, I think, new too. It hides the IP address of the requester. Okay, so, HPKE comes with several modes. The first one is the base mode, and then there are more, providing authentication. We have the authenticated mode with a private key, or a pre-shared key in the PSK mode, or we can combine both and have the AuthPSK mode. What about the security? The base mode is proven to be IND-CCA2 secure, and the authenticated modes are outsider-CCA and insider-CCA secure. Yeah, so, later on, you can find a paper about it in the references. So, let's move to the post-quantum stuff. As I said before, the key encapsulation and the AEAD are separated. 
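The counter-and-XOR step described above can be sketched in a few lines. This is a minimal illustration of the per-message nonce computation as RFC 9180 defines it (the sequence number is encoded big-endian to the nonce length and XORed with the base nonce); the function name and the sample base nonce are just for this example, and the actual AEAD call is left out.

```python
def compute_nonce(base_nonce: bytes, seq: int) -> bytes:
    """Per-message nonce: base nonce XOR big-endian encoding of the counter."""
    seq_bytes = seq.to_bytes(len(base_nonce), "big")
    return bytes(a ^ b for a, b in zip(base_nonce, seq_bytes))


if __name__ == "__main__":
    base = bytes.fromhex("000102030405060708090a0b")  # made-up 12-byte base nonce
    for seq in range(3):
        print(seq, compute_nonce(base, seq).hex())
```

Since each message uses a different counter value, the AEAD never sees the same key and nonce pair twice, which is why two identical plaintexts produce different ciphertexts.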
So, to make it post-quantum, we can just put post-quantum algorithms into the key encapsulation method, and we get post-quantum HPKE. The proposal was for Kyber and SIKE, but as most of you know, SIKE is already out of the game. Kyber is one of the NIST finalists for key exchange methods. It is used as a KEM in the same way, instead of a DH-style key exchange. It is a lattice-based scheme built on the (module) learning with errors problem, and Kyber is proven to be IND-CCA2 secure too. Yeah, so we have this nice diagram again. There we can see the hybrid version. We have a post-quantum-only version of HPKE and a hybrid one. Hybrid uses both the post-quantum and the old algorithms for the key encapsulation method, so if one breaks, you still have some security. You can see gray boxes here. These boxes are the old algorithms, which means that if we eliminate them, we see the post-quantum-only version. So first I will go through the post-quantum one. In the same way, Bob generates the key pair prior to HPKE, and let's say that Alice already knows the public key. The difference here is that we don't do classical Diffie-Hellman, but we encrypt some random data, and we get a shared secret here, and the ciphertext of that shared secret. Then we do the key derivation to get the common key, and send the ciphertext of the shared secret over to Bob, who can decrypt it and do the same here. And as you can see, everything else is the same as in basic HPKE. For the hybrid mode, the only difference is here. For the KDF, the post-quantum and the classical shared secrets are concatenated, and everything else should be the same. Okay, so what is the security of the post-quantum version? In the base mode, it's still IND-CCA2 secure, because the KEM algorithm is proven to be IND-CCA2 secure. The hybrid mode needs more proof because of the concatenation there, and the authenticated modes for both hybrid and PQ-only would need more work. So, let's see some benchmarks. 
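The hybrid combination described above (concatenate the classical and the post-quantum shared secrets, then run them through the KDF) can be sketched with plain HKDF-SHA256 from the Python standard library. This is an assumption-laden illustration: it uses unlabeled HKDF from RFC 5869 rather than the labeled extract and expand functions HPKE actually specifies, the two input secrets are placeholder bytes rather than real KEM outputs, and the function names and `info` string are invented for the example.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, input keying material)."""
    return hmac.new(salt or bytes(HASH_LEN), ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch PRK to `length` bytes, bound to `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def combine_shared_secrets(classical_ss: bytes, pq_ss: bytes, length: int = 32) -> bytes:
    """Hybrid sketch: concatenate both shared secrets, then extract and expand."""
    prk = hkdf_extract(b"", classical_ss + pq_ss)
    return hkdf_expand(prk, b"hybrid-kem-sketch", length)


if __name__ == "__main__":
    # Placeholder secrets standing in for the DH and Kyber outputs.
    print(combine_shared_secrets(b"\x01" * 32, b"\x02" * 32).hex())
```

The point of the construction is that the derived key depends on both inputs, so an attacker has to recover both the classical and the post-quantum shared secret to reproduce it.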
I got this benchmark from the paper, which you can see below. They were done on an Intel Core i7 with 4 cores, 8 megabytes of cache and 8 gigabytes of RAM, using AWS-LC, a cryptographic library. As for the setup, each algorithm was run 10,000 times, and the first and fourth quartiles of the measurements were discarded to make it more accurate. The measurements are in CPU clock cycles, and I think it was the median or something like that. Okay, so here you can see SIKE, which is now eliminated, but I think it's a nice reference that it was a lot slower. So, this blue one is the basic Edwards curve, like the basic HPKE, then the green one is the hybrid one, and yeah, here the yellow is Kyber only. And as you can see, it is faster than the Edwards curve, which is interesting. Then there is one more graph. Here you can see the green lines, which tell us that the cost of the key encapsulation method is constant, constant in the sense that it doesn't depend on the encryption itself, because the tests were done for data encryption of 1 kilobyte of data, 10 kilobytes, 100 kilobytes, and 1 megabyte of data. So, as you can see, the cost is more in the encryption of the data itself. So, there is a red line. You can see that's for reference. This is a version of RSA encrypting 240 bytes of data. Yeah, so here you can see the references if you want to read more about it, and thank you for your attention. So, any questions? We had some questions on Matrix, so I will try to read them. The question was, how do we make quantum-resistant crypto work on constrained devices? What? How do we make quantum-resistant crypto work on constrained devices? Well, that's a good question. I don't know the answer, so sorry for that. Okay, so any other question in here? Thank you. The last slide, I think page 15, you showed some benchmarks, but these are on the whole encryption, right? Like, not only the exchange of the keys and the setup of the symmetric key, but the whole exchange, right? They are both.
So, as you can see, the green line is like the key exchange itself, and then you have like the seal and the open operation here. So, if you take, for example, one message and one key exchange, that's what the benchmark says, like it's together. Yeah, okay, all right. Which was expected, because anyway, the post-quantum part is only in the beginning of the exchange. And then when the symmetric key is established, you continue with using your AEAD. Yes, that's correct. All right, thanks. So, the post-quantum part is only at the key exchange, and later on it is the same. Okay, any other question? Yeah, I just wanted to say that the post-quantum part is more about, like, asymmetric keys. It doesn't really affect the symmetric part, so it's okay. Any other question around here? If not, we can probably, thank you for the talk, thank you for the questions. Thank you for coming. |
Where does that code come from?
Git Checkout Authentication to the Rescue of Supply Chain Security |
Good afternoon. Can everyone hear me? It seems to be working. All right. So I'm going to talk about Git checkout authentication in the context of supply chain security. It's one of these buzzwords that we hear a lot today, and I guess that's because there's a lot to be done in this area. I have to tell you this is going to be a talk about pre-quantum issues. So it's going to be different. All right. So I'm going to talk about work that has been done in the context of GNU Guix, like Simon was saying. Who has never heard about GNU Guix in this room? A show of hands? Very few people actually. This is real. I'm surprised. Anyway. So this started as part of Guix. But as you will see, this is useful beyond Guix, I think. So just, yeah, I have to introduce Guix a little bit. This is an actual birthday cake that we ate a few months ago to celebrate 10 years of Guix. So it's an actual, yeah. And it's a real cake. That's the thing. Yeah. So it's a distribution, a GNU/Linux distribution that you can install standalone, like you would install Debian or something. You can also install it on top of your system. If you're already running Debian, for example, this is great. And you can also have Guix on top of Debian, and that gives you an additional package manager. But anyway, I'm not going to go into the details of what it's like as a user. I want to talk about, you know, what's behind the scenes, right? So what it looks like from a supply chain viewpoint. So this is a package definition for Guix. Maybe some of you are wondering about the parens, right? That's okay. It could be JSON, it could be XML. You have similar things with other tools. It's just basically metadata that describes how to build the hello package. It's telling you where to get the source code, that tar.gz file. It's telling you how to build it, with the GNU build system. So configure, make, make install, you know, that kind of thing.
And there are now like more than 20,000 packages in Guix, and they're all defined like this. So this is source code, right? And the thing is, Guix is able to build packages from source. So conceptually, you could think of Guix as some sort of Gentoo, right? In the sense that it's building packages from source, except that you can actually get pre-built binaries, and that's what people usually do because, you know, it's faster, especially if you want to use LibreOffice or, you know, whatnot. But Guix, basically, as a distro, it's all source code, right? Package definitions. And then when you go and build a package, that's also a salient feature. So if you've ever used or heard about Nix before, this is entirely inherited from Nix. So this is the functional model. Basically you say, all right, I want to build that hello package. And you run guix build hello, and it's going to talk to a daemon that makes an isolated build of the hello package. So it's fully hermetic, and that, you know, that removes a whole class of nonreproducibility issues that you would have without that isolated environment. Yeah. And so that means that if you look at all these things that we have in that /gnu/store directory, we have tons of packages and stuff in there. Well, they're all going to be bit-identical for everyone. Or nearly. There can be issues, you know, but usually it's going to be bit-identical. So typically if I look at that /gnu/store, blah, blah, blah, hello thing up there. Well, if I build it on my laptop, or if you build it on your laptop, we're both going to get the same hash. It's going to be identical. So it's all about reproducible builds, which you've probably heard of. So this is an effort where many distros are involved. Debian, of course, has been leading the effort, but there's also NixOS, Arch Linux, blah, blah, blah, many distros. And it's called reproducible builds, but we could very much call it verifiable builds.
The whole point here is that you don't have to trust binaries that you get from a server. You can always verify that the source that appears, you know, in the package definition that we saw before actually corresponds to the binary that you have. Because you can always rebuild locally. You can challenge the servers that provide pre-built binaries and see if it's the same. So from a supply chain viewpoint, that's a pretty big deal, I think. And in Guix, we're trying to go a little bit further. So reproducible builds are nice, but it's not sufficient. Like, if you're reproducing bit-for-bit malicious software, you still have a problem, right? So you've probably heard about that Trusting Trust attack, you know, illustrated by Ken Thompson in 1984. That's a long time ago. Well, this is that story. We want to be able to have fully auditable code that's entirely built from source. And actually someone over there in the back of the room, with other people, has been working on this, has been presenting this last year. We could talk about it for ages, but I have other things to tell you. But I encourage you to take a look at that talk by Janneke, last year, two years ago, actually. The thing is how to be able to build the whole Guix distribution starting from a binary that's just 357 bytes, I think, right? So pretty big deal. All right. Now, to be more on topic. So we have these fancy things, you know, reproducible builds, bootstrappable builds, building everything from source. That's nice from a supply chain security viewpoint. But, you know, for several years we've had that tiny issue specifically in Guix. If you want to update your packages, well, your package collection, the available packages and the tool set, you would run guix pull. So it's similar to apt update in Debian, for example. That's roughly the same kind of tool. But it's implemented by fetching code directly from the Git repo of the project.
And, you know, as you can imagine, you have to think about the implications of this, right? We're delivering code directly on users' computers, so we'd better be sure they're actually getting, you know, the real code coming from the Guix project and not something different. For example, if the server that holds the Git repo is attacked, well, we'd rather have some mechanism to detect that, you know, to make sure that users are not going to clone a Git repo that contains malicious code, right? So we need something here. And, you know, we thought about this for quite a long time, actually. And the typical answer to this question is The Update Framework, TUF. I don't know if you've heard about it. It's sort of the reference for all things update in general. It's a specification with implementations in different languages and in different frameworks, like for Python packaging, for Debian, I think, different things. But there's one thing. It's not quite a good fit for our case. Our case is, we're just pulling from a Git repo in the end. The Update Framework is more about distributions that look like Debian or Fedora, where you have binaries on the server. And, you know, people are actually downloading those binaries. And those binaries are built by machines or developers, blah, blah. It's a different setup. So to illustrate that, let me show a bit what the workflow looks like in Guix. So here we have what packagers, Guix packagers, do. So as a packager, you define packages. So, for example, Python. And that's the kind of definition that I showed you before, right? And then you can test it with guix build python, for example, like so. And eventually, when the packager is satisfied with the package, well, they eventually push it to the Git repo. And as a user, at some point, I'm going to run guix pull, which is very similar to git pull, except it's also going to compile a few things. But roughly, that's like git pull.
And so at that point, I'm getting the new package definition, and I can run guix install python, and I'm getting that package. That's the idea. And optionally, like I said, you can get pre-built binaries. I'm not going to go into details about this. This is optional, but this is something you usually want. But, you know, it's not baked into the model like you would say in Debian or Fedora. It's really something additional. And because we have reproducible builds, you know, pre-built binaries are substitutable, right? The key thing here is that people are going to pull from the Git repo, and we need to make sure that they are getting the right code, the real code. So we're really looking at these two things, where the users are running guix pull or the build farm that builds packages is running guix pull, and how can we actually make sure they get the right code? And this is all about authenticating Git checkouts. It's just Git after all. There's nothing special here. So with millions of people using Git, you would think that it's a solved problem, right? Oh, sorry, I thought. It is not, actually. So if you go, for example, to GitHub or GitLab, you can see these verified badges. This is a screenshot from GitHub. So you have verified badges. It's green. It's nice. You have partially verified, hmm, what does that mean? And you also have no badges. So what conclusion can you draw from that? Is it the real, the authentic repo, or is it not? You know, you can't really do anything with that. So at that point of the talk, we need to talk about authentication. Authentication is about, you know, making sure we're getting the real thing, you know, of undisputed credibility. So we would say we want to make sure that people are getting the Guix code as coming from the Guix project. That's what it means to me. So specifically, we want to protect against a number of things. So we want to assume that potentially an attacker can gain access to the server that holds the Git repo.
And from there, you know, the attacker can push more commits on that repo, or could, you know, introduce malicious changes in many ways, or even make a so-called downgrade attack, where the attacker would revert or actually remove the latest commits, for example, so that users would be tricked into pulling an older version of Guix with potentially, like, vulnerable packages and stuff like that. So this is what we want to protect against. There's a couple of additional goals. We want to make sure we can do offline authentication, like, we don't want to, you know, call out to a number of services out there and, you know, key servers, whatever. And of course, we want to support changing authorizations, in the sense that, you know, people contribute to Guix and they come and go, right? So we need to add new people, new contributors, you know, official contributors, packagers, and eventually we will remove them. You know, we need to be able to deal with that. So the solution, well, we're not yet at the solution, but the intuition, at least, is that, well, this is Git. So this is a graph of commits, right? We're just dealing with a graph of commits. So we have commits here, actually, A, B, C, D, E, F. And each commit is made by someone. And the intuition is that we would like to be able to annotate each commit saying, well, at this point, you know, there's a certain number of people who are authorized to make commits in the project. And maybe it's going to change, you know, at each node of the commit graph. And yeah, this is what we would like to do. The solution we came up with is to have, basically, inside the repo a file that's called .guix-authorizations that lists the OpenPGP keys of authorized committers. You know, pretty simple. And the thing is, the file lives inside the repo. And then we need to have a rule to determine whether a given commit is authentic. And so the rule is actually very simple as well.
So a commit is authentic if and only if it is signed by one of the authorized committers of the parent commit. Got it? This is the main part of the talk. I'm almost done, actually. I could stop here. So we call this the authorization invariant. So let's see in practice what this looks like. So if I go back to my commit graph here, so let's assume for commit A, this is the first commit, let's assume Alice is authorized there, all right. And then in commit B, Alice is adding Bob as an authorized committer. So we have this label here. So at that point, Bob will be authorized to make commits. And if we look at commits C and E, well, they are made and signed by Bob this time. And it's perfectly fine, because if we look at the parent commit of C, for example, so this is C, the parent commit is here, and we can see that Bob is authorized in the parent commit, right. And likewise with E, we can have a second branch, the purple branch, and Bob is also committing in that branch, and this is fine because the parent commit is the same one, and Bob is authorized here, all right. And we can keep going that way, you know, remove people and so on and so forth. So a second example, if we take almost the same one, except that on the purple branch here, Bob removes Alice from the set of authorized committers, all right. And then what happens if Alice tries to make a merge commit that has D and E prime as parents? Well, if we apply the authorization invariant that we showed before, this commit is not authorized. It's not genuine. It's going to be rejected. That's the idea. Yeah, there is a small problem that perhaps you've noticed. We kind of didn't discuss the first commit, right. There's something to be said about that one too. Well, we need to introduce the repo in a way. So we need a way to say, well, this B commit is the first commit where we will start applying the authorization invariant.
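To make the authorization invariant concrete, here is a small Python sketch of the second example from the talk, modeled as a toy commit graph. It is an illustration under assumptions, not the actual Guix implementation: signatures are reduced to a signer name, the authorizations file is reduced to a set of names per commit, the signer of commit D is assumed to be Alice (the talk does not say), and a merge commit is taken to require that its signer be authorized in every parent, which is how the rejected merge in the example is read here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    parents: tuple         # ids of the parent commits
    signer: str            # who signed this commit (stand-in for a signature check)
    authorized: frozenset  # committers listed in the authorizations file at this commit


def is_authentic(commit_id: str, commits: dict, intro: str) -> bool:
    """Apply the authorization invariant to a single commit."""
    commit = commits[commit_id]
    if commit_id == intro:
        # The introductory commit is the trusted starting point.
        return True
    # The signer must be authorized in every parent commit.
    return all(commit.signer in commits[p].authorized for p in commit.parents)


both = frozenset({"alice", "bob"})
commits = {
    "A": Commit((), "alice", frozenset({"alice"})),
    "B": Commit(("A",), "alice", both),                 # Alice adds Bob
    "C": Commit(("B",), "bob", both),
    "D": Commit(("C",), "alice", both),                 # signer assumed for the sketch
    "E'": Commit(("B",), "bob", frozenset({"bob"})),    # Bob removes Alice
    "F": Commit(("D", "E'"), "alice", both),            # Alice's merge of D and E'
}

for cid in ("C", "D", "E'", "F"):
    print(cid, is_authentic(cid, commits, intro="B"))
```

In this model, commits C, D and E' pass because their signers are authorized in the parent, while the merge F is rejected because Alice is no longer authorized in E'.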
So we call this the introductory commit, and it's needed because, you know, perhaps you have some history already in your Git repo at the time you start using this mechanism. And so we need to be able to say this is the one where it starts. We call that the introductory commit. And so users are expected to know, you know, what the introductory commit is. So for example, this is a specification of a channel in Guix, so a channel provides more packages. And as a user, you would provide not just the URL of the channel, of the repo, but also the introduction information that tells from which commit we're going to start authenticating. And that solves the bootstrap problem. So concretely, now that we have this, if we run guix pull, and it's been in production for a couple of years actually, if we run guix pull, well, we are going to have a message that says we're authenticating channel guix and a number of new commits, right, and it's cached, so it's pretty fast. If I tell guix pull to use a different URL with a mirror, I'm going to get a warning saying, all right, you chose to use a mirror, that's fine, but be aware that this is not the canonical URL, so perhaps this mirror is stale. But at least we can tell it's authentic, because we've verified the authorization invariant. But then if some evil attacker, you know, does something bad with the repo, then we're going to get an error message directly saying, no, this commit is not signed by an authorized key, you have a problem. And this is it. So this is all when using guix pull, but you can actually use the same thing even if you're not using Guix, or even without using a channel. You can use the guix git authenticate command, which works the same except it's lower level, so you have to specify the introductory commit and the key that signed the introductory commit. And the thing is, I think we should all be using that kind of stuff with our Git repos, because right now it's a wild west.
But yeah, the CLI is a bit, not super usable, so I understand we'll have to do some work on this. If you have ideas, I'm open to them. Yeah, and you can specify where the keyring, the OpenPGP keyring, is to be found, because this is not going to talk to key servers, which are very unreliable, as you probably know. All right, I didn't mention downgrade attacks. I have to be fast, right, I guess. Downgrade attacks. That's another kind of attack we want to protect against. And the good thing with Guix is that Guix keeps track of its own provenance. So for example, when you are running Guix, you can run guix describe and it's going to tell you, I was built from this commit. So it knows where it comes from, so to speak. And because we have that provenance information, then if you run guix pull and it detects that it's not going to a commit that's a descendant of the one we're currently running, you're going to have an error message, right? Commit coffee is not a descendant of cabbage, of course. And this is pretty cool. And likewise, even at the system level, when you deploy your system, the system itself, the distribution actually running on your machine, records which commit it was built from. So we have the information here if we run guix system describe, and so if I run guix system reconfigure to update my system, well, potentially I could get a message that says, no, you're trying to reconfigure to a commit that's older than the one you're currently running. That's a problem. I can override that if I know what I'm doing, but usually you'd better not. All right, it's time to wrap up, I guess. Yeah. So to summarize, we have two things here. We have authenticated checkouts, which is good for Guix because it gives us safe Guix updates.
And because we have safe Guix updates, we can have unattended upgrades, for example, and this is super cool, you know, you know that the unattended upgrades are either going to work and run the right code or they're not going to run at all. And this is important, I think. This is in-band and offline, which means all the data needed to perform this authorization, well, this authentication step, is all inside the Git repo. There's no need to talk to key servers and stuff. And you can and should use that kind of tool on your Git repo, I think. We really need to think collectively about this issue. And we have again protection against downgrade attacks, which is good for unattended upgrades and has been deployed in Guix for a while now. There's a paper if you want to see all the nitty-gritty details. There's the URL here. And yeah, to conclude, I'd like to think a little bit, to reflect a little bit about all these issues of supply chain security. I know I'm sharing this one with speakers about Sigstore, for example, and other projects. And we have a different approach to things. For example, with Guix, we have a unified deployment toolbox. So we are very much talking about end-to-end integration of the tool set, verifiability with reproducible builds, for example, auditability. We have the commit graph. We have all the details available at our disposal, when, you know, often popular approaches are more about assuming that you have a different set of tools. You can have a distro, you can have Docker, you can have Kubernetes, whatever. And you're just combining everything and thinking about artifact flow integrity, attestation, version strings and stuff like that. So I think the key is to really think about going from source code to deployed binaries. That's very much the free software ethos as well. And thinking about ensuring we have proper provenance tracking and the ability to verify things. This is it. Thank you. Thank you, Ludovic. We have three minutes for questions.
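The downgrade protection summarized in the talk boils down to one graph question: is the commit we are about to move to a descendant of the commit we are currently running? Here is a minimal Python sketch of that check on a toy history. The commit ids, the `parents` mapping and the function names are invented for the example; this is not how Guix stores or walks its history, only the shape of the test.

```python
def ancestors(commit_id: str, parents: dict) -> set:
    """Return the commit itself plus everything reachable through parent links."""
    seen, stack = set(), [commit_id]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        stack.extend(parents.get(cid, ()))
    return seen


def is_descendant(target: str, current: str, parents: dict) -> bool:
    """True if `target` is `current` or has `current` among its ancestors."""
    return current in ancestors(target, parents)


# Toy history: c1 <- c2 <- c3 on the main line, and x1 branching off c1.
parents = {"c1": (), "c2": ("c1",), "c3": ("c2",), "x1": ("c1",)}

current = "c2"
for target in ("c3", "c1", "x1"):
    verdict = "ok" if is_descendant(target, current, parents) else "refused"
    print(f"{current} -> {target}: {verdict}")
```

Moving from c2 to c3 is accepted, while moving back to c1 or sideways to the unrelated x1 is refused, which corresponds to the "is not a descendant of" error described in the talk.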
Hello. Thank you for the talk. A really common workflow is to use GitHub to merge pull requests. Yes. And whenever you merge pull requests, there usually is a merge commit signed by GitHub. How would you go about allowing merges by GitHub without allowing GitHub's keys to be used for arbitrary commits? That's a very good question. Actually, that's probably a limitation of this model. So we're not using GitHub or even GitLab for Guix. And actual developers are making merge commits, for example. But typically for automated merge commits, like you have on GitHub, it's not going to be practical. That's a limitation, yeah. Hi. Thank you. First of all, thank you for your brilliant presentation. I see that Guix, is it Guix? Guix, yeah. Thanks. It's a very promising package manager, or even Linux distribution. I probably have some off-topic question regarding your talk, but I still believe that you can answer it. A yes or no would be enough for me. Is some kind of cross-compilation supported by Guix? Sorry. Yeah, there is cross-compilation support, yes. You can even cross-compile systems. Quick question. Thank you so much for your talk. I have a quick question. I saw you're using PGP keys to verify the commits, but these days you can also use SSH keys to sign your Git commits. Is this also supported in Guix? No, it's all OpenPGP. That's a good question. We started before Git supported anything other than OpenPGP, actually. Yeah, so it's a trade-off, I guess. Have you considered upstreaming this into Git? Oh, sorry. Have there been any ideas about upstreaming this into Git itself? I did consider it. It's a bit of work, I guess. Also, we have very tight integration with a small-scale OpenPGP implementation that can only do signature verification, so that would mean also having that in Git itself, which is quite a bit of work. But I think it should be in Git proper eventually, yes. Okay, final question here. Thank you.
Have you considered a Sigstore integration with Guix? Oh, sorry. Can you repeat? Yeah, have you considered a Sigstore integration? Is it possible? Is some work in that direction happening? No, there's no work in that direction happening as far as I know. I guess I'm not sufficiently familiar with Sigstore to see how it could integrate with Guix, but I don't know. Maybe there's something we could do. Thank you. Thank you, Ludovic. Five-minute break. Thank you. |
Whom Do You Trust?
Privacy and Collaboration in CryptPad |
Next presenter. Who is Theo? Yes. So, talking about CryptPad. Yes, the floor is yours. Thank you. Yeah, whom do you trust? I think this question is really a serious question, especially for privacy and collaboration in today's Internet. Well, yeah, let's directly start with this question, or with what collaboration is. Collaborative editing is that multiple people work on the same document at the same time, and they want their changes to be transmitted in near real time. So here in this example you see that one person writes there, the update is propagated to the server, and the server forwards the message to all other users. Here you see that, in this generic example, the server can see all messages. The server has a local copy of the document and updates it as soon as it gets a message from a user. And already here we should be like, hmm, okay, so whom do we need to trust? We obviously need to trust the server in this example, because the server can see the documents. So this leads me to the second part, to the privacy that we want. And here I give you an informal definition, and we say that no untrusted entity can infer personal information, document content, or who the collaborators are. So for an untrusted party, the document should look like this, just like snippets. And this untrusted party should not infer any information. Here the key point is that it's an untrusted entity. Because this does not hold for everyone. For example, the collaborators, they should be able to read the document. So the question is whom we trust. And I'll start with the solution that's probably the most used today. And yeah, why not trust Google and co.? And there may be many reasons. I just want to give you one example. And it's the case of Disha Ravi. Here, Naomi Klein, a famous environmental activist, writes that India targets climate activists with the help of Big Tech.
Tech giants like Google and Facebook appear to be aiding and abetting a vicious government campaign against Indian climate activists. So what happened here, there was (I cannot go too far to the side) the climate activist Disha Ravi, who was co-founder of the Indian chapter of Fridays for Future. And they worked on a Google Doc where they discussed how to help the Indian farmer protests. And there was stuff like, use this tweet, or you can write a letter to your government. This document was leaked publicly, on Twitter, I think. And then the Indian government thought that this was a conspiracy and wanted to track down who actually wrote this document. So they asked Google, and Google helped them and said it's this and this and this person. And then Disha Ravi was arrested for a few days. Later on, she was free again, and there was no sentence against her in the end. But nevertheless, this shows that we cannot really trust Google to host sensitive documents. So what can we do against this? Or what is an alternative solution? And I think one of the most obvious answers, especially at a conference like this one, is to say that we need to control the software. We need to have the server and the client open source. Because if this is the case, then we can host the software on our own instance, on our own server. And we can decide to whom we want to give the data. And yeah. So this would be a first approach. So we could say, yeah, it's free software, we are safe. And this is exactly a quotation here from Jitsi Meet. And they say that the possibility to run your own instance completely removes the need to trust a third-party provider and therefore eliminates the need for end-to-end encryption. So they say exactly this, you can run it on your own. You don't need any other precautions. No, this is fine because it's open source. Jitsi Meet is a video conferencing platform you may be familiar with. So this is a bit different. I will come to it later.
And also, interestingly, they removed the statement from their website only a bit after I started to prepare my talk. So this is from December 2022. But are we really safe? And to answer this question, here are some more questions. Can really everybody run their own instance? I mean, yes, probably most of you have the technical capabilities. But do other people have this capability? Do they have the infrastructure? Do they have the money to run this? No, probably not. And the second question is, do you really want to trust a system administrator to see all your documents? So imagine you're in a company and you are working in a collaborative system and you have the salary sheets online. Do you want the system administrator to read that? No, probably not. Even if you trust them in the first place. And then, and this is where the difference to video conferencing is, documents are not ephemeral. So a video stream you can safely delete after the conference has ended. But a document must be stored on the server because you want to access it later. And this means you do not only need to protect your documents currently, but also in the long term. So that if the server is under attack or an attacker gets access to it, they should not have access to the documents. Okay, so if you see this, then you probably think we need end-to-end encryption. And end-to-end encryption is, in principle, that you have one party, let's say Alice, and Alice encrypts a document, sends it to Bob, and Bob decrypts it. And in the middle, the data is not readable. So this is the encrypted ciphertext. And you see here, it's exactly the snippets we want. So this technically looks good, and we could say, okay, we apply this, and we can say it's end-to-end encrypted, we are safe. And here's a statement from Google, and they say that with Google Workspace client-side encryption, content encryption is handled in the client's browser before any data is transmitted.
So here, first note that client-side encryption is not the same as end-to-end encryption. It's different, especially in the question of who holds the key. With client-side encryption, it's not you as a user who holds the key, but the keys are stored on a third-party server. So there comes again this question of trust, whether you trust this third-party server. Okay, so we could say it's end-to-end encrypted, we are safe. Well, really? First, there is the metadata. And metadata is all about who connects to the server, at which time, from which IP address, who collaborates, which people are accessing the document at the same time. And all this metadata is still there, even if the content is encrypted. So, yeah, still a problem. And second, we have Kerckhoffs's principle from cryptography, which says that a cryptosystem should be secure even if everything about the system except the key is public knowledge. So you should be able to release all the code and all information except the key, and it should still be secure. And for me, it's really an argument for open source. So we see that we need both of them. And here, I want to present you CryptPad. CryptPad is an online collaborative editing tool. There are multiple parts of it. There is a whiteboard. There is code and Markdown. There are slides, like these ones, and documents. It's open source software: the client code is open source, as well as the server code. So you can host your own instance. And there are about 200 maintained instances. We at the CryptPad team, we host a flagship instance, which has about 200,000 registered users. And how does CryptPad encrypt? So in CryptPad, we have this end-to-end encryption. We have that an update is propagated in encrypted form, and the server only has an encrypted state of the document. So the server cannot infer the actual content of the document. And how do we share the keys?
In the most basic way, we share the keys via the fragment identifier of the URL. That means we put the keys after the hash sign of the URL. Like this, you can easily share a document. What do we trust? As I said, you still have the metadata, so you still need some trust. In CryptPad, you have to trust that the server is not an active attacker. That means you trust that the server acts according to the protocol: it runs the correct code, it does not deliver anything malicious, it does not repeat messages, it does not reorder them, and so on. And why do we have this trust requirement? There are two reasons. The first one is practical. We have a web application where you get the client source code from the server. So if the server delivered bogus client code, well, then every security guarantee would be lost. And then there is the second one, which is more theoretical, namely that the server can always delete files. Even if they are encrypted, the server can delete them without a problem. Okay. So you see that you need some trust. But the cool thing about CryptPad is that there are other things which you don't have to trust. Namely, the server could be an honest-but-curious attacker. That means that even if the server watches you, you're still safe. You don't have to trust the server not to watch you. No, that's explicitly allowed. And why do we have that? Well, the server may become corrupt. Even if you trust it not to be actively malicious, at some point in time it may still become corrupt. And this was the case last summer with the CryptPad instance hosted by Germany's Pirate Party. On this instance, some sensitive documents about the G7 summit were leaked. And then the police asked the Pirate Party to hand over their data; otherwise, they would seize the server. So the police got access to this data. And now, the police could not read anything.
They could not read the documents, because everything was end-to-end encrypted. So this shows that even if you trust the server in the first place, you still cannot be sure that it's trustworthy forever. And as I said, this setting is exactly covered by the honest-but-curious attacker, which we allow. There's also another point of view on this. We could also say that we protect the server from its users. So for example, the server administrator of the CryptPad instance of Germany's Pirate Party was not consenting: how could they know which documents were published on their server, when everything is encrypted? So this shows that such encryption is also nice for you as a system administrator, because it allows you to offer the service without taking on too much risk yourself. So, yeah, as a take-home message, I really want to say that we need both. We need open source and end-to-end encryption for good trust assumptions. And with this, I'm at the end of my presentation. Shout out to my team. David is there somewhere. Wolfgang is here. And Ludo is also there. I am Theo. And CryptPad is developed at XWiki in France by this small team. We have a stand here in the K building. Yeah, come by. Drop by. We have stickers. Thank you. So are there any questions? Would a peer-to-peer version be possible to reduce risks with the server, and would it help? Yeah, it would be possible, theoretically. The main reason why we have a server is that we want the documents to be accessible all the time. In a peer-to-peer setting, you would firstly have the requirement that at least one party must always be online. And we don't want that. Yeah. So another question. Thank you very much. You say we need open source and E2E for good trust assumptions. I would suggest you might need a slightly stronger statement. The code on the server has to be open source as well. So potentially you need something like the Affero GPL. And in terms of the E2E, you need post-quantum resistance.
Yeah. If you have those two things, then maybe you have good trust assumptions. Yeah, good point. CryptPad is licensed under the AGPL, so this point is easily answerable. And the second point we are working on: we are looking at how to make CryptPad post-quantum resistant. Thank you very much. You mentioned that it's problematic to still have the metadata. So what is CryptPad doing against that, or how can we make sure that the server is not collecting metadata? Yeah, two answers to this. One is that there's always some metadata which will be there, for example the IP address or the browser's user agent. This we have to live with. And this is also why it's important that you can host your own instance. And then the second part is that CryptPad collects as little information about you as possible. So for example, we don't have a list of users, of user names. There's not even a list of hashed user names. We just hash the user name and the password locally on the client side and generate all the keys from this. So this is just an illustration of how we try to have as little metadata as possible. Good afternoon. In the first question, you answered that you don't use peer-to-peer because you want the server to be online all the time. Does the server have to be unique, or can you have multiple servers just in case one gets into the hands of the police or goes down for some reason? Are you speaking of federation? Yes. Okay. Currently there is no possibility for federation. No, sadly not. Yeah, you mentioned the case where a server was raided by police. So they have the server. Is it not enough then to have that server and also have somebody's browser history with the key in the URL, and then that conversation is open? So if I got your question correctly, it was: if an attacker has access to the server and to the URL, then they have full knowledge.
Yeah, that's the case, because the browser history leaks the full URL, including the part after the hash, which is not sent to the server. If the attacker has this, yeah, then they have access to the server and the key to the document, and they can decrypt it. So yes. How does adding and removing collaborators work? Do you re-encrypt the file with different keys every time, or how do you handle that? So we only send updates, so it's not the entire file every time, and it's symmetrically encrypted. There are two ways you can access a document. In read-only mode, you have the keys for decryption; and to prove that you're allowed to update the document, you need to sign the update with a signing key. But the keys are static for a document. But if a user gets removed from read access, they would still be able to read the file after it's been modified, wouldn't they? Yes, exactly. Yeah, they'll still be able to read it. There are access lists which can defend against this scenario. But yeah, that's also something we're working on. And maybe I can just mention something which goes more into these detailed questions. We just published a white paper. You can go to our website, cryptpad.org, and check it out. So if there are no other questions. |
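To make the key-sharing point from the talk concrete, here is a minimal Python sketch of putting a document key into the URL fragment. It only illustrates the general mechanism (a fragment is kept by the client and is not part of the HTTP request target); the host `pad.example.org`, the path layout, and the helper names are assumptions made for this example and are not CryptPad's actual URL format.

```python
import base64
import secrets
from urllib.parse import urlsplit, urlunsplit


def make_share_link(base_url: str, key: bytes) -> str:
    """Append the key as the URL fragment (the part after '#')."""
    fragment = base64.urlsafe_b64encode(key).decode("ascii").rstrip("=")
    scheme, netloc, path, query, _ = urlsplit(base_url)
    return urlunsplit((scheme, netloc, path, query, fragment))


def request_target(url: str) -> str:
    """What an HTTP client sends on the request line: path and query only."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def key_from_link(url: str) -> bytes:
    """Recover the key on the receiving client from the fragment."""
    fragment = urlsplit(url).fragment
    padding = "=" * (-len(fragment) % 4)
    return base64.urlsafe_b64decode(fragment + padding)


# Hypothetical document URL; the key itself never appears in request_target().
key = secrets.token_bytes(32)
link = make_share_link("https://pad.example.org/doc/abc123", key)
print(request_target(link))        # the part the server gets to see
print(key_from_link(link) == key)  # the recipient can rebuild the key
```

This also shows why the browser-history question above matters: whoever holds the full link, fragment included, holds the key.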
What Does Rugby Have To Do With Sigstore?
Learning Sigstore via Rugby |
Be quiet. This is James Strong and Lewis Denham-Parry: What Does Rugby Have To Do With Sigstore? Hello everyone. My name is James Strong, and we're going to talk about rugby and Sigstore today. But something really interesting is happening right now. The Six Nations started today, and Wales and Ireland are playing. What's the score right now, Lewis? I think we should concentrate on the talk. I think it would be most enjoyable. Last time we checked, it was 3-27 to Ireland. Where are you from? Wales. Anyway, awesome. So, my name is James Strong. I'm a solutions architect at Chainguard. I do a bunch of stuff with networking and Kubernetes. And if you're here to win the book, meet me outside afterwards. If you sign a container image during the talk and come hang out with me, I've got five copies of my book. So sign your container images, everyone. And if I broke your ingress-nginx release on Thursday, I apologize. Please don't come outside and hang out with me. Hi, everyone. I'm Lewis. I'm similar to James in some ways, not in others. As well as being at Chainguard, I'm the coach of the Rhiwbina Squirrels under-eights rugby team. That's why I'm giving a talk today about Sigstore and rugby. Yeah, if you wouldn't mind keeping the Wales versus Ireland score away from me right now, that would be beneficial for my own sanity. Yes, next slide, please. So yeah, at Chainguard, we support a lot of open source projects. We're talking today about Sigstore, but we're also part of SLSA, doing some assessments for it. We've got Tekton, Knative, OpenSSF, and Distroless. Does anybody use any or all of those? I do. There are a couple that aren't listed on there, like I said, ingress-nginx. Chainguard supports me to support ingress-nginx. We also have our own container image out there. It's called Wolfi. You can check that out at wolfi.dev. But like I said, we're going to talk about Sigstore and rugby today.
Okay, so who here had heard of Sigstore prior to walking into this room? Okay, we got a couple. Thank you for coming. Who here had heard of rugby prior to coming into this room? Yes, hello. Right, so when we submitted this talk, James found this very special diagram, and when we zoom in, we can see the similarities between Sigstore and rugby for this talk. Incidentally, this lasts about 22 minutes, hence why we've had such a long introduction. So what do people think rugby is? It's not really doing a haka. That looks cool and intimidates your opponents, but that's not really what it is. It's a very difficult game with highly specialized positions, and it requires everyone to work together to achieve a goal, and that's really what we think Sigstore is and what we want to talk about. So there are lots of components to it, and we think that Sigstore is tackling supply chain security, and we're going to talk about how and why, hopefully make it fun along the way, and learn a little bit about rugby and Sigstore, and probably a lot about neither. So why is Sigstore tackling supply chain security? It was started to improve the supply chain technology that we all use. It's made by open source maintainers, for open source maintainers. A lot of you may not be aware, but a lot of things are being signed right now, so thank you to Adolfo there from SIG Release for signing all of the Kubernetes releases with Sigstore. Thank you. It was a lot of hard work. I know PyPI and a lot of package maintainers are going to be supporting that. I know Maven, too. So there are a lot of people using Sigstore. You might not even be aware that you're using it, but it's a direct response to some of the challenges that are there right now. So who here has had someone else sign their GPG key? Been to a key signing party? Two, three people, four? Okay. I was really glad not everyone shot their hand up, because that would have been really fun.
Anyway, we're going to talk about how we're going to be doing that. So, again, by not knowing where our software comes from, and without identity checks and safety protocols, we're leaving it open to exploits and attacks. So, Sigstore attempts to make software signing and its infrastructure frictionless and invisible. And as James just mentioned, we're integrated with a wide range of systems. ... to an identity and know where it came from and who made that piece of software. And just like in Sigstore, rugby also has lots of positions that are highly specialized. I play hooker, and that's much different from what a fullback would do. We all have different responsibilities to be able to win the game. So we're going to tie those two together. I'm going to start off at the very top of the play. So we're going to talk about trust roots. With signing, we also require trust, so knowing who is making this piece of software available. Think of it from a PKI perspective. We have a root of trust in SSL certificates. What we're trying to accomplish with Sigstore is that same root of trust. So think of automated software for SSL certificates. I was thinking of Johnny Sexton. Oh, Johnny Sexton. That's not the answer. But think of it from that perspective. So Sigstore has a root of trust. It was initialized with TUF, so The Update Framework. Let's Encrypt. Let's Encrypt. Thank you. Thank you for that. Let's Encrypt. Think about Sigstore as Let's Encrypt for software signing artifacts, making it easy and transparent for everyone to use. So the fly-half is a very influential player on the field. And in order to start that root of trust, we have to have that trust there. So there's a bunch of links there. We actually did a live stream of the Sigstore key signing. If you want to figure out how they initialized that and did all that work, there's a great YouTube video of that.
So going from the fly-half to the loose-head: from a play perspective, they're going to be the certificate authority. So now we've got our root of trust. We have our certificate authority to produce the certificates that sign those artifacts. So a lot of responsibility is on the CA from that perspective. And of course, being a loose-head, you carry a lot of weight on the team. You're very important in the scrum. So again, a very important position on the team. So OIDC allows us to identify the end user. We obtain some basic information about the user. And Fulcio uses OIDC to authenticate requests. Subject-related claims can be extracted and included in the certificate. Sigstore runs a federated OIDC identity provider, Dex, which creates a Dex OIDC token from the original OIDC token. Fulcio supports OIDC from additional configured issuers, as we can see from the screenshot. And the player that we selected for this was Martin Castrogiovanni. Does anyone know of Martin Castrogiovanni? No. Okay, well, there's something learned today. He's a massive identity within the Italian game, even though he's originally from Argentina. And the reason I selected him is that you can identify him by his sleeve, with all the identities of his family members on his arm. Next one. And so now we've come to Fulcio. Fulcio is the API that drives all of this, that ties OIDC and the certificate authority together. So when you're making a request to get a signing certificate, you're doing it through Fulcio. And we put that together with the hooker. And yes, I was self-serving; that is me. And do feel sorry for that guy. So Fulcio is a free code signing certificate authority. Anyone can make a request to get a signing certificate, tie it to their identity, and make it available for everyone to verify. They are short-lived certificates, so that's going to come into play with another piece of technology.
We're going to talk a little bit more about that with Rekor. So Rekor provides immutability with ephemeral certificates. It's based on a Merkle tree, and it fulfills the role of the transparency log, which means that it's searchable by all. So you can use that URL there to search via a browser, or you can use rekor-cli to search. So for this one, I'm looking towards Martin Johnson. Again, as a Welsh person, putting all these names out is quite difficult for me. But Martin Johnson was a powerhouse for England. He was a captain who led them to numerous victories. But the reason why he is comparable to Rekor is that he went on to a successful coaching career as well, providing insights for the next generation that aspires to play the game. Yes, next slide, please. So we want to take some time and talk a little bit about how a developer interacts with all of that. When we think about it from a rugby perspective, the scrum-half is the connector between the forwards and the backs. And cosign is the glue that ties all of these pieces together. Think of it like kubectl, right? You don't actually interact directly with the API; you do it through kubectl. Cosign is how you do that with Fulcio and Rekor and the rest of the Sigstore environment. So you can sign and verify signatures. It also creates in-toto attestations. So if you want to generate and sign other metadata about your container images, maybe how many CVEs are in it, or an SBOM if you're generating one, things like that, you can make that metadata available, sign it, and store it with the container. All of that can be done through cosign. And we're not talking about just container images. That's where it's highly targeted right now, but you can also sign other pieces of information.
So, just for fun, when I send documents to clients nowadays, I also sign them, and it generates the hash of the document and the signature, and they know that the document came from me. So it's basically a free DocuSign. So this is Gareth Edwards. He was an influential player in the 70s, played 88 consecutive games for Wales, and was one of the key reasons for our success in the 70s, not so much in the 2020s. And yes, the scrum-half is instrumental in communicating between the backs and the forwards within a game, which I can also see with cosign. Not necessarily cosign and rugby, but yes, the next slide. |
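The talk mentions that the transparency log is based on a Merkle tree. As a rough illustration of why that makes a log verifiable, here is a small Python sketch of a Merkle tree with an inclusion proof. It is a simplified construction written for this example (the leaf and node prefixes, the handling of odd nodes, and all function names are assumptions), not Rekor's actual log format or API.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix bytes keep leaf hashes and inner-node hashes distinct.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(entries):
    """Return all tree levels, leaves first, root level last (needs >= 1 entry)."""
    levels = [[leaf_hash(e) for e in entries]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        nxt = [node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur) - 1, 2)]
        if len(cur) % 2 == 1:
            nxt.append(cur[-1])  # an unpaired node is promoted unchanged
        levels.append(nxt)
    return levels


def inclusion_proof(levels, index):
    """Collect the sibling hashes needed to recompute the root for one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(entry: bytes, proof, root: bytes) -> bool:
    h = leaf_hash(entry)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


# Hypothetical log entries standing in for signing events.
entries = [f"signature-{i}".encode() for i in range(5)]
levels = build_levels(entries)
root = levels[-1][0]
proof = inclusion_proof(levels, 3)
print(verify(entries[3], proof, root))     # entry 3 is in the log
print(verify(b"forged-entry", proof, root))  # a different entry does not verify
```

The point of the structure is that a client only needs the root hash and a short list of sibling hashes to check that an entry is in the log, without downloading the whole log.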
How to protect your Kubernetes cluster using Crowdsec |
Hello, everyone. My name is Hamza and this is my colleague Sebastian. We work at CrowdSec, and today we're going to introduce CrowdSec to you, explain how it works, and show you how you can protect your Kubernetes cluster using it. So, how was CrowdSec created? We started with the basic statement that cybersecurity is not a problem of means. As you can see, those big companies have huge amounts dedicated to cybersecurity. They have security teams, but they still get hacked. And we think the reason why is that they try to fight alone, using the superhero approach, fighting attackers alone while those attackers are sometimes collaborating to attack. So, as we are a lot of users that run legitimate businesses, why not collaborate all together to fight those attackers? So, how does this collaborative IPS work? Basically, CrowdSec needs to ingest logs. So, we handle a lot of data sources, like flat files, basically. It can also act as a syslog server to receive syslog input. You can also read the output logs of your Docker containers. On the cloud part, we have CloudTrail. On the stream processing part, we also have Kafka, Kinesis, and so on. Then, we have the IDS part. The IDS part is a component we call the agent. The agent is here to parse the logs and detect bad behaviors. And this is where the community aspect starts. We have a hub, like Docker Hub, where we wrote our scenarios to detect the common attacks, and we share them with the community. And the other users write their own parsers to handle new services, new bad behaviors, et cetera, and they share them with the community. You can also write your own scenarios to detect, for example, a specific behavior in your internal infrastructure, and not share it, of course. And once this detection is done, we have the IPS part.
So, the IPS part is another component of CrowdSec called the bouncers. The bouncers are here to allow you to do remediation in multiple parts of your infrastructure. So, for example, the basic one is the firewall bouncer. The firewall bouncer will communicate with CrowdSec, get the decisions, and block the IPs at the firewall level. But sometimes you want a smarter remediation. For example, you run an e-shop, and we know that sometimes a lot of clients are behind a single IP. So, when a bad behavior is detected from an IP, what we want to do is push a captcha to the user. That way we let humans still access the website, but block the bots. And, of course, we have a lot of other bouncers on the cloud side, and other bouncers that are written by us and by the community, and they are also available in the hub. And when this remediation is done, we have the reputation part, what we call the community block list. So, you send the signals to the central API; we receive all those signals, we curate them, and then we return the most aggressive IPs to the community. So, you are protecting yourself and protecting the community, and you also benefit from the community feeds. CrowdSec already deals with a lot of common attacks, more than 50: web scans, port scans, traditional scan stuff. But also, thanks to the inference engine of CrowdSec, it allows us to detect more complex attacks, like, for example, L7 DDoS or credit card stuffing, which is a very business-specific issue. Credit card stuffing is when an attacker buys some credit cards from a shady website and wants to test whether those credit cards are valid. So, they go to an e-shop and try to make small purchases. You can detect this kind of attack, and also bot scalping, et cetera.
Basically, as CrowdSec ingests logs, if it's in the logs, you can detect it. And thanks to that, we are building a real-time map of cyber criminals. It's kind of like Waze, but for cybersecurity. So, the more of us there are, the more efficient the radar is. And, of course, it will allow us to kill the most important resource for the attackers, which is the IP addresses. Now, I will let my colleague take the next slide. Hello, everyone. So, first, a word about privacy, because, as Hamza mentioned, you do send us data about the attacks you see. But we collect only the strict minimum to be able to build the community block list. This means that your logs will never, ever leave your own server. Logs are processed locally by CrowdSec, and you will only send some metadata about the behaviors that are detected by CrowdSec. That metadata is just the IP that you blocked, the reason why, for example SSH brute force, and the time at which you blocked the attack. And that's it. Nothing else. Also, this part is totally optional. If you do not want to send anything to us or the community, you don't have to. But if you don't share with the community, in return, you will not get the community block list. And also, we keep the data for the least amount of time possible. Basically, every IP in the community block list will be automatically deleted after one week if the IP does not perform any more attacks on the community. And the raw data we receive is degraded after three months, and after nine months it's totally deleted. Now, how do we build this community block list? We receive reports from all over the community, but we cannot trust those reports blindly, because it's just an API. If someone wants to send us the information that 1.1.1.1 is performing SSH brute force on their server, we basically have no way to know if it's true or not. So, we built something called the consensus.
So, we get the reports from all over the community. Then, each user in the community has a trust score. This score is really how much we trust you. It's built over time, so the longer you are part of the community, the more your score will increase. But it will also take into account how active you are in the community. So, between someone who reports just one IP per day and someone who reports 100 IPs per day, we will give a slightly better score to the user reporting 100 IPs per day. We will also correlate your report with those of other community members. If you are the only one sending us information about a particular IP, this IP will never go into the community block list, because we cannot confirm that it is indeed an IP trying to attack servers on the internet. But if multiple users report this IP, then we will be more confident that the report is accurate. We also run some honeypots of our own, and because those are fully controlled by us, they are fully trusted. So, if someone tries to attack them, we know that the report is genuine. We also have some whitelists, because if someone sends us IPs belonging, for example, to Cloudflare, obviously this is not something we want to redistribute to the community. And again, if you send us too many false positive reports like this, your score will be reduced, because we know that your reports are potentially false. And lastly, we also have some more advanced algorithms that look at the bigger picture. For example, if you take a /24 network and something like 50% of this network is already in the community block list, then if we see another IP belonging to this network, it's quite likely that this IP is also bad. Same thing if the IP reported to us exposes, I don't know, a Telnet server and 20 different services on the internet: again, it's more likely that the IP has been compromised.
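The consensus rules described above (trust scores, at least two independent reporters, a whitelist, and the /24 neighbourhood check) can be sketched roughly as follows. This is a toy model written for illustration only: the thresholds, the way scores are combined, and every name in it are assumptions, and it is not CrowdSec's actual algorithm.

```python
import ipaddress
from collections import defaultdict

# All values below are made up for the example.
ALLOWLIST = [ipaddress.ip_network("192.0.2.0/24")]  # stands in for a trusted provider
MIN_REPORTERS = 2       # a single reporter is never enough
MIN_TRUST = 1.0         # combined trust needed to accept an IP
NEIGHBOUR_RATIO = 0.5   # share of the /24 already listed that lowers the bar


def is_allowlisted(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWLIST)


def listed_ratio(ip: str, blocklist) -> float:
    """Fraction of the surrounding /24 that is already on the block list."""
    net = ipaddress.ip_network(f"{ip}/24", strict=False)
    listed = sum(1 for b in blocklist if ipaddress.ip_address(b) in net)
    return listed / net.num_addresses


def consensus(reports, trust, blocklist):
    """reports: iterable of (reporter, ip); trust: reporter -> score."""
    reporters = defaultdict(set)
    for reporter, ip in reports:
        reporters[ip].add(reporter)

    accepted = set()
    for ip, who in reporters.items():
        if is_allowlisted(ip) or len(who) < MIN_REPORTERS:
            continue
        required = MIN_TRUST
        if listed_ratio(ip, blocklist) >= NEIGHBOUR_RATIO:
            required /= 2  # a bad neighbourhood makes the report more plausible
        if sum(trust.get(r, 0.0) for r in who) >= required:
            accepted.add(ip)
    return accepted


trust = {"alice": 0.9, "bob": 0.4, "newcomer": 0.05}
reports = [
    ("alice", "203.0.113.7"), ("bob", "203.0.113.7"),  # two trusted reporters
    ("newcomer", "198.51.100.9"),                      # only one reporter
    ("alice", "192.0.2.10"), ("bob", "192.0.2.10"),    # whitelisted range
]
result = consensus(reports, trust, blocklist=set())
print(result)
```

In this sketch only the address reported by two sufficiently trusted users is accepted; the single-reporter address and the whitelisted one are dropped.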
And so, when a report goes through all of this, we will take a decision and it will go into what we call the fire database, so the community block list, and it will be pulled automatically by all the community members. Now, about the usefulness of the community block list: a while back, we ran a small experiment where we had two different servers on the internet, at the same hosting provider. Those two servers had the exact same set of services exposed on the internet, so just a web server and an SSH server. The only difference between them was that one was using the community block list, and the other one was not. On the one using the community block list, we saw around 92% fewer attacks detected by CrowdSec. That's because most of the IPs that are trying to scan your server are basically trying to do it for the whole internet. So this is just background noise; you don't really care about it, and you can just block it before it can try to attack you. Another thing that we get asked quite often is: okay, so I want to replace my firewall with CrowdSec, I want to replace my auditing system with CrowdSec, or whatever. Please don't do that. It's not designed to do that. We don't compete with this kind of solution; on the contrary, we can help them be more useful. So, for example, if you have an auditing system telling you that it saw a command execution on this server, and it was a curl piped to bash or whatever, you can configure CrowdSec to parse those logs, detect this behavior, and just send you a notification. It does not have to be something about an IP address; it can be a local behavior as well. Or take something with the firewall: you don't have to go into the firewall configuration to add the IP. Just add the IP in CrowdSec with one simple command, and then CrowdSec will take care of pushing this information to your firewall, and the IP will be blocked.
So, for the architecture of CrowdSec: as I mentioned before, CrowdSec will ingest some logs and parse them. This is the role of what we call the agent. The agent will read your logs, parse them, and compare them against scenarios. A scenario is just something that describes a malicious behavior, for example a brute force, someone trying to scan your website, and so on. When the agent detects a malicious behavior, it will create an alert and send this alert to another component of CrowdSec, the local API. This means that the agent can live on one server and the local API can live on another server. The local API will take this alert and transform it into a decision. For example, by default, it's a four-hour ban on the IP. But you can customize this behavior. For example, you can say: this IP is French and it performed something related to an HTTP attack, so instead of just banning it, I'm going to ask my bouncer to display a captcha for this IP for four hours. And when the local API receives a decision, it will send us information about the alert, and in exchange, you will receive the community block list. The local API is also used by the bouncers, the components that actually apply the decisions, because CrowdSec by itself only does the detection part. It will not block anything. For that, you need bouncers. We have quite a few of them. The most popular one is probably the firewall bouncer, which will interact with the local firewall of the server where the bouncer is running to block the IP. But we also support web servers, for example NGINX. And in this case, because we are at a higher level in the network, you can display a captcha to the user instead of just dropping the request. So, here is the design that will be used in the following demo.
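As a rough mental model of the flow just described (agent detects, local API decides, bouncer enforces), here is a small self-contained Python sketch. The class names, the "five 404s in ten seconds" scenario, and the in-memory decision store are all assumptions made for this illustration; the real components are separate services, and only the four-hour default ban comes from the talk.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Decision:
    ip: str
    kind: str          # "ban" or "captcha"
    expires_at: float  # seconds, same clock as the log timestamps


class Agent:
    """Reads parsed log events and raises an alert when a scenario matches."""

    def __init__(self, threshold=5, window=10.0):
        self.threshold = threshold  # hypothetical scenario: 5 hits ...
        self.window = window        # ... within 10 seconds
        self.hits = defaultdict(deque)

    def process(self, ts, ip, status):
        if status != 404:
            return None
        recent = self.hits[ip]
        recent.append(ts)
        while ts - recent[0] > self.window:
            recent.popleft()  # forget hits that fell out of the window
        if len(recent) >= self.threshold:
            recent.clear()
            return {"scenario": "http-probing", "ip": ip}
        return None


class LocalAPI:
    """Turns alerts into decisions and answers bouncer lookups."""

    def __init__(self, duration=4 * 3600, captcha_scenarios=()):
        self.duration = duration  # four hours, the default mentioned in the talk
        self.captcha_scenarios = set(captcha_scenarios)
        self.decisions = {}

    def handle_alert(self, alert, now):
        kind = "captcha" if alert["scenario"] in self.captcha_scenarios else "ban"
        decision = Decision(alert["ip"], kind, now + self.duration)
        self.decisions[decision.ip] = decision
        return decision

    def lookup(self, ip, now):
        decision = self.decisions.get(ip)
        if decision is not None and decision.expires_at > now:
            return decision
        return None


class Bouncer:
    """Applies decisions; detection alone blocks nothing."""

    def __init__(self, api):
        self.api = api

    def action_for(self, ip, now):
        decision = self.api.lookup(ip, now)
        return decision.kind if decision else "allow"


agent, api = Agent(), LocalAPI()
bouncer = Bouncer(api)
for second in range(5):  # five 404s in five seconds from one client
    alert = agent.process(float(second), "203.0.113.7", 404)
    if alert:
        api.handle_alert(alert, now=float(second))

print(bouncer.action_for("203.0.113.7", now=5.0))
print(bouncer.action_for("198.51.100.1", now=5.0))
```

Passing `captcha_scenarios={"http-probing"}` to `LocalAPI` in this sketch mirrors the customization mentioned in the talk, where an HTTP-related alert leads to a captcha instead of a ban.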
So, a very small Kubernetes cluster running locally with just two nodes, one agent per node deployed as a DaemonSet, and one local API for the agents to send their alerts to. The agents will be reading the logs of the ingress of the cluster, in this case NGINX ingress. And we will have a bouncer, the NGINX bouncer, running on the ingress to automatically block the IPs if CrowdSec wants to ban them. Okay, thank you. So, as we are crazy guys, we're going to do a live demo. And, yeah, just think I have some. So, as Sebastian said, we have a small cluster. Locally, of course, not online. Yes, of course, sorry. Thank you. Okay, so, sorry for that, but what I'm going to do is this: it will create a Kubernetes cluster with two nodes, with an NGINX ingress controller and a Hello World app, so a demo application. Nice. Okay. So, here we can see that. Okay. We have one ingress controller, and we have two nodes. Okay. So, I will not spend time explaining the CrowdSec values, because we released a Helm chart that will allow you to install CrowdSec in a Kubernetes cluster. As I said, it deploys a DaemonSet. So, on each node, it will deploy a CrowdSec agent, plus one local API, in a specific namespace. And, of course, your decisions will be centralized across all your nodes. So, here, we will install CrowdSec. Basically, we just install CrowdSec using these values, and this is the Helm chart. Of course, I already imported my repo; it installs in the namespace crowdsec and creates the namespace if it does not exist. So here, if we are. Okay. Yeah. As I popped a new cluster, I have to wait for the image download. But it will come, normally. Okay. What I can do in the meantime, of course, is show you that I'm able to access my Hello World app. So, it's basically a Hello World, and it returns a 200. And, okay. We have one pod that is running. Demo effect. As usual. Let's install. Wait. Why? Ah. Okay.
So, now, yeah, we're going to fetch this one. So, okay. I will follow the logs from this agent. So, logs -f, the agent, and -n. All right. So, here are my CrowdSec logs. Now, what I will do is run Nikto, which is a simple web scanner, to attack my Hello World application. And here, we can see that it automatically detects things, as it's automatically parsing the logs of the ingress controller. It detects some bad behaviors because I already installed collections, which are bundles of parsers and scenarios. And you can see that it detects bad user agents, some sensitive file access, path traversal, et cetera. So, if I do something like that and get a shell on the CrowdSec local API, sorry for the noise, and if I list the decisions that were taken, we can see that we have six other entries like this one, because it detected my IP several times. As I'm working locally, that's why we have those IPs. And this is the last behavior that was detected. So, now, what we will do is install the bouncer, because we are detecting, but we still get access to the application. So, now, we will patch our ingress controller to install the Lua plugin that communicates with the local API, and get banned, of course. So, here we have the command. So, we can go fast and save time. So, I'm upgrading my ingress controller, sorry. Okay. Okay. So, now, if we try to access the Hello World application, we will see that we receive a 403. So, if I do this... Sorry, I didn't see that. Yeah, it's the... Let's wait for this upgrade. Okay. Okay. Nice. So, it's popping the new ingress controller. Of course, it's local; that's why we have only one pod. I don't recommend that in production. So, it's deploying the new ingress controller and loading, of course, the CrowdSec bouncer. It's a Lua library that talks to the local API, and on each request, we have this check. Come on. Yes, of course. I think we have it. No.
We still have time. So, we can take some questions while I follow this. We've got time for like one question. So, there is one fast question for one minute. Feel free to answer it. Okay. Just to end the demonstration: it's running, and now we have a 403. So, that's it. Okay. So, thank you. Yeah. This is the challenge with demonstrations. If you have some questions, don't hesitate. And of course, we have some stickers. So, don't hesitate to come to us. |
Secure voice/video over IP communications today and tomorrow thanks to post-quantum encryption !
The Linphone softphone has integrated CRYSTALS-Kyber, the NIST finalist algorithm in the encryption key category |
My name is Johan Pascal. I've been contributing to the Linphone project for the past 10 years, more or less, and I'm going to talk about the introduction of post-quantum cryptography in our voice over IP softphone. So quickly, the agenda: some context, then we'll dive into the ZRTP protocol and how we had to modify it to use post-quantum cryptography, then a few words about hybrid post-quantum and classic key exchange, and some conclusions. So first some context, with some advertising for Linphone first. It's a project which has been around for more than 20 years now, and it's available on lots of platforms. The idea is that we have a common library, and on top of that different applications for different platforms. It tries to use SIP standards as much as possible, everything standardized, RFCs and so on, for audio, video and instant messaging. We also provide secure group messaging, based on a derivative of the Signal protocol that we presented years ago. We also provide a SIP proxy, which is called Flexisip, also open source, everything is open source, and I encourage you to use our free SIP service, which is sip.linphone.org. So I don't know if you are familiar with voice over IP, but basically you have two streams of data. The first stream is the signaling path, which connects the endpoints together, and then you have the media stream, which actually sends the data, video and audio, encrypted, and this is the one we are interested in. So how does it work? There is an RFC for that, and a protocol which is called SRTP. SRTP uses symmetric encryption, so there we are not very concerned by quantum computers so far. The main problem is that it requires external key management, so we have to exchange our symmetric keys.
So for that we have three choices. The historical one is called SDES. With this one the keys are transmitted in the signaling path, which is okay if the signaling path is protected, which is normally the case with TLS. The only weakness is that the SIP proxy gets access to the symmetric keys, so we are not actually end-to-end encrypted. So basically the people running the service could decrypt your media stream. Then there is another one, which also has an RFC, called DTLS-SRTP. With this one you perform a TLS handshake on the media stream, actually a DTLS handshake because it's over UDP. This one works well, but you have to deploy a PKI and manage certificates for all of your clients and everything, so it's a bit heavy, and also you still have to trust someone. You trust certificates, sure, but still. And then there is another one that we favor, well, all three are available in Linphone, but the last one is called ZRTP, which is the one we'll focus on today. With this one, on the media path, you perform the ZRTP protocol, which is based on Diffie-Hellman, either elliptic curve or plain Diffie-Hellman. This one requires no third party, which is good. The only small thing is that you have to do some kind of spy thing, telling a secret code over the phone. As you are talking to each other anyway it is possible, and for the end user it's a bit of an annoyance, but you only have to do it once in the whole call history with your other endpoint. So we think it's acceptable for users. Obviously the user has to get involved in security, but normally it works, and experience tells us that people focused on security tend not to be driven away by this small drawback of the protocol. So ZRTP is an RFC which is now more than 10 years old. It has been mainly written by Phil Zimmermann, the guy behind PGP, who has always focused on avoiding third parties.
And it provides different properties. I won't explain key continuity and so on because that part is unchanged, and I will focus on man-in-the-middle attack detection. So first a small reminder of what Diffie-Hellman is. Basically it's a protocol which is completely symmetric: both parties generate a key pair, then they exchange public keys, and with their own secret key and the other side's public key they get a shared secret. So far so good, it's kind of easy. The drawback is that it's obviously vulnerable, like many key exchange protocols, to a man-in-the-middle attack. So what is a man-in-the-middle attack? It's basically someone putting herself in the middle and exchanging keys with both sides, so that neither side can know. Alice cannot know that Eve is sending her key. She thinks that Bob is sending the key, and she performs the exchange, and at the end what you get is that Alice has a shared secret with Eve, and Eve has another shared secret with Bob, but Alice is convinced that she exchanged keys with Bob, and she has no way to actually detect this. Well, she actually has some ways, no? Yeah? Okay. No? Yeah? Sorry. So the ZRTP handshake: there is a first phase of discovery, where both endpoints exchange their capabilities, their choice of preferred algorithms, things like this, and then they start the actual ZRTP handshake. So first you have one commit packet, I will go into detail in a moment, and then you actually perform the DH, the Diffie-Hellman exchange. So Alice sends her public key, Bob sends his, and they both compute the shared secret from this. Adding all the transcripts of the communication, they generate S0, which is the base secret, the output of the ZRTP handshake.
From S0, they will derive the SRTP keys, which is what we are trying to do here, and they also derive something called the SAS, the short authentication string, which will be compared vocally over the phone, because Alice and Bob are actually talking to each other. The end of the protocol is just some updates and writing to the cache for the key continuity mechanisms and so on, so it's not really interesting now, and after that the SRTP streams actually start, and they can talk. Once they start to talk, once in the call history, they will do this vocal SAS comparison. What is this SAS comparison for? Basically, if they want to detect a man-in-the-middle attack, they have to ensure that Alice is using the keys that Bob has sent, and Bob also wants to know that the key that was sent by Alice is the one he actually got. So what they could do, as they are talking, is basically read their own keys to each other, but a key is a few hundred bytes, so it's a bit long to read a few hundred bytes of hexadecimal string over the phone. No one would do that. So what we do instead is derive this short authentication string, which is only four characters, actually derived from 20 bits, and this SAS is also derived from S0, which is the output of the protocol. The only problem is that you can actually produce a SAS collision because the SAS is very short. How would it work? At the beginning of the protocol, as soon as Alice has sent her public key to Bob, Bob is able to compute S0 because he has his own secret key, and he is then able to compute the SAS. So what Eve could do is first perform the ZRTP exchange with Alice, so she gets a first SAS, and then she receives Bob's public key. When she gets Bob's public key, she can generate a huge set of key pairs until she finds a SAS that collides.
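To make the size of a 20-bit SAS concrete, here is a minimal Python sketch. It is not the actual ZRTP key derivation: the HMAC label `b"SAS"`, the base32 alphabet and the function names are assumptions made for illustration. It derives a four-character string from the leftmost 20 bits of a keyed hash of S0, and shows the kind of brute-force search an attacker who is free to pick her own key pairs could run.

```python
import hashlib
import hmac
from typing import Optional, Tuple

# Assumed alphabet for the sketch: 32 symbols, so each character carries 5 bits.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def short_auth_string(s0: bytes, chars: int = 4) -> str:
    """Render the leftmost 5*chars bits of HMAC(s0, "SAS") as `chars` characters."""
    digest = hmac.new(s0, b"SAS", hashlib.sha256).digest()
    value = int.from_bytes(digest, "big") >> (256 - 5 * chars)
    return "".join(ALPHABET[(value >> (5 * i)) & 0x1F] for i in reversed(range(chars)))


def find_colliding_secret(target_sas: str, max_tries: int) -> Optional[Tuple[bytes, int]]:
    """Try candidate secrets until one yields the target SAS.

    Each candidate stands in for "generate a fresh key pair and recompute S0".
    Returns the colliding secret and the number of attempts, or None.
    """
    for attempt in range(max_tries):
        candidate = hashlib.sha256(b"attacker-key-%d" % attempt).digest()
        if short_auth_string(candidate, len(target_sas)) == target_sas:
            return candidate, attempt + 1
    return None
```

With four characters there are only 2^20 (about one million) possible SAS values, so a search like `find_colliding_secret` is expected to succeed after roughly a million attempts, which is cheap. This is why the SAS alone is not enough without something that stops the attacker from choosing keys after seeing the other side's key.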
Basically she will try to generate a lot of key pairs. The SAS is only 20 bits, so if you generate about one million keys and try them all, you will almost surely find a collision on the SAS. So to prevent this, Eve is forced to send a commit packet. In the commit packet we do not have the public key, but we have a hash of the public key. So Alice will receive, for example, the hash of Bob's public key, and she will store it. Then when Bob sends the public key, she will just hash Bob's public key and compare. That way she is sure that Bob did not wait to receive her public key, and so he cannot generate millions of key pairs to find a collision on the SAS. So this is quite effective, and so far so good. Now we want to switch to post-quantum cryptography. The problem with post-quantum is that in the NIST call for standardization, they required all the algorithms to use a key encapsulation mechanism, and not Diffie-Hellman. A key encapsulation mechanism is a bit different, because the two sides are not the same. In Diffie-Hellman, the two sides were doing exactly the same thing: they both generate keys, exchange public keys, and then compute the secret. Here we have one side generating keys, the other side encapsulating a secret, and the first side is then able to decapsulate the secret that was encapsulated by the other one. So it's not symmetric, and we cannot switch directly from Diffie-Hellman to the KEM form of key exchange. Obviously, a KEM is still vulnerable to a man-in-the-middle attack, because nothing has changed: you can still put someone in the middle and perform the exchange with each side without them knowing. So what we have to do is adapt ZRTP and change the actual handshake a little bit, the central part of the protocol. S0 is still derived from the exchanged secret and the transcript of the whole conversation. I've only shown the commit and two packets, but you also have the hello packets and so on.
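The commit step described above is a hash commitment. A minimal Python sketch follows, assuming SHA-256 as the hash and using made-up function names (the real ZRTP commit packet hashes a whole protocol message, not a bare key):

```python
import hashlib
import hmac
import os


def commit(public_key: bytes) -> bytes:
    """What goes into the commit packet: a hash of the key, not the key itself."""
    return hashlib.sha256(public_key).digest()


def verify_commitment(commitment: bytes, revealed_public_key: bytes) -> bool:
    """Check that the key revealed later is the one that was committed to."""
    return hmac.compare_digest(commitment, commit(revealed_public_key))


# Bob commits first, and reveals his key only after receiving Alice's key.
bob_public_key = os.urandom(32)      # stand-in for a real public key
commitment = commit(bob_public_key)  # sent before Alice's key is known

honest = verify_commitment(commitment, bob_public_key)
swapped = verify_commitment(commitment, os.urandom(32))  # key chosen afterwards
```

Here `honest` is true and `swapped` is false: once the commitment is sent, the sender can no longer try millions of key pairs after seeing the peer's key, because any key other than the committed one fails the check.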
So in the commit packet, the one which used to hold only the hash of the second packet from Bob, Bob will now insert his public key. Why would he do that? So that Alice can encapsulate the secret. At this point Alice receives the public key from Bob and encapsulates the secret, but she's not able to compute S0 because she's still missing the second packet from Bob. So she sends back the ciphertext, the output of the encapsulation, and at this point she has the shared secret from the key encapsulation, but she cannot compute S0. Bob now retrieves the shared secret, and he can compute S0, but he already committed to the DH part 2 packet that he has to send to Alice, so he still cannot manipulate the final secret, S0. And what's in this packet? It's just a nonce, a random number that is used once. Now another problem is that we don't want to rely only on post-quantum algorithms, because we know that sometimes they get broken, like for example SIKE, which was broken quite late in the standardization process. It might happen again in the future or not, so to protect against this possible weakness, we still want to use a mix of a post-quantum and a classic algorithm. So we use both at the same time, and in order not to complicate the protocol too much, the idea is to have one version of the protocol which does Diffie-Hellman, and another one which does a key encapsulation mechanism. And the protocol won't know exactly whether it's using a mix or not, because probably in the future, at some point, we'll be confident enough in some post-quantum algorithm, and then we'll stop using the classical one, or maybe not. But the protocol should not have to be modified at that point. So the protocol is designed to use a KEM interface without even knowing whether it is a mix of classical and post-quantum, just post-quantum, or several post-quantum algorithms.
So first we have to make a KEM interface from Diffie-Hellman. This is quite a standard construction. You can directly use Diffie-Hellman to generate a key pair, then you send your public key to the other side, and the other side encapsulates. How would the other side do that? It just generates its own Diffie-Hellman key pair, computes the Diffie-Hellman shared secret, hashes it with the transcript of the exchange, and sends back its public key to the other side. So the decapsulation is quite obvious, it's the same thing on the other side. And then we combine two or more KEMs together, so the one we just built from classical or elliptic curve Diffie-Hellman, with a post-quantum one. This way of doing it was published by Nina Bindel a few years ago. It's a bit convoluted, but if you want more details on why we are doing it this way, I encourage you to read the paper, it's quite interesting. So basically what you do is generate a key pair for each algorithm in a set. In my example there are only two, but you can have more than that. You concatenate both public keys, or all the public keys, and send them to the other side. The encapsulation just splits the public keys to retrieve the individual ones, and performs the encapsulation on all the components. Then you use HMAC to combine your results, chaining it, so first you combine key one and then key two, and you can add several layers there, and the final step is to mix in the transcript of all the public keys you received. The decapsulation is obviously completely symmetric. The paper from Nina Bindel is quite clear on why these steps are needed. I have no time to explain it here.
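The two constructions described above can be sketched in Python with only the standard library. This is a toy illustration under loud assumptions: the Diffie-Hellman group (integers modulo 2^255 - 19 with generator 2) is chosen only so the code runs without dependencies and is not a secure parameter choice, a second DH-based KEM stands in for the post-quantum KEM, and the HMAC chaining is a simplified combiner, not the exact construction from the paper or from the Linphone module. All function names are made up.

```python
import hashlib
import hmac
import secrets

# Toy group parameters (assumption for illustration only, NOT secure choices).
P = 2**255 - 19
G = 2


def _b(n: int) -> bytes:
    return n.to_bytes(32, "big")


def dh_keygen():
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)


def dhkem_encap(peer_pk: int):
    """KEM from DH: the 'ciphertext' is simply an ephemeral public key."""
    eph_sk, eph_pk = dh_keygen()
    shared = pow(peer_pk, eph_sk, P)
    secret = hashlib.sha256(_b(shared) + _b(eph_pk) + _b(peer_pk)).digest()
    return eph_pk, secret


def dhkem_decap(sk: int, pk: int, ciphertext: int) -> bytes:
    shared = pow(ciphertext, sk, P)
    return hashlib.sha256(_b(shared) + _b(ciphertext) + _b(pk)).digest()


def combine(shared_secrets, transcript: bytes) -> bytes:
    """Chain the component secrets with HMAC, then bind the transcript."""
    acc = b"\x00" * 32
    for s in shared_secrets:
        acc = hmac.new(acc, s, hashlib.sha256).digest()
    return hmac.new(acc, transcript, hashlib.sha256).digest()


def hybrid_keygen(count: int = 2):
    # One key pair per component KEM; the second one is a placeholder
    # for a post-quantum KEM in this sketch.
    return [dh_keygen() for _ in range(count)]


def hybrid_encap(public_keys):
    results = [dhkem_encap(pk) for pk in public_keys]
    ciphertexts = [ct for ct, _ in results]
    transcript = b"".join(_b(x) for x in list(public_keys) + ciphertexts)
    return ciphertexts, combine([s for _, s in results], transcript)


def hybrid_decap(keypairs, ciphertexts) -> bytes:
    parts = [dhkem_decap(sk, pk, ct) for (sk, pk), ct in zip(keypairs, ciphertexts)]
    transcript = b"".join(_b(pk) for _, pk in keypairs) + b"".join(_b(ct) for ct in ciphertexts)
    return combine(parts, transcript)
```

A caller would run `keypairs = hybrid_keygen()`, send the public keys, let the peer call `hybrid_encap`, and recover the same 32-byte secret with `hybrid_decap(keypairs, ciphertexts)`. The point of the design is that the caller sees a single KEM interface and does not need to know how many components sit behind it.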
Two more words. We also tweaked the protocol packets, because in the Diffie-Hellman form, the maximum size you can get is around a few hundred bytes, but if you start using Kyber, for example, or HQC, the other one we use, you'll reach several kilobytes, and several kilobytes you cannot send in one datagram over UDP. It probably won't arrive. So what we had to add is a way to fragment the ZRTP packets. It's done in a classical way, just as DTLS or other protocols using UDP are doing it. The only thing is that we made it so that when packets are fragmented the header is modified, but if fragmentation is not needed, the packet remains exactly the same as the old packet. The objective was to be able to start deploying the new version of ZRTP while still keeping compatibility with the old one, with old deployments. So how is it done in the end? For crypto libraries we use liboqs, which is from the Open Quantum Safe project, and which basically collects all the NIST candidates, and Kyber also, which is no longer just a candidate, in a convenient way, and we use libdecaf and Mbed TLS for the ECDH and HMAC functions that we need. We packed it all in an independent module, so our ZRTP library uses this module, but the module is actually completely independent from it. So if anyone wants to directly use this hybrid KEM, mixing various post-quantum and classic exchanges, it's fully available. You can combine more than two KEMs, as presented, and it's written in C++. And in our ZRTP implementation, we deployed it with some preset combinations, well, you can see them here, and we try to mix algorithms with more or less the same level of security, so for example mixing Kyber512 with X25519. And it is, as I said before, fully compatible with the older version, so the deployment is progressive.
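As an illustration of the fragmentation idea, here is a small Python sketch. The 6-byte header (message id, fragment index, fragment count) is a hypothetical layout invented for this example, not the actual ZRTP fragment header, and unlike the real implementation described above this sketch puts the header on every datagram, even when one fragment would be enough.

```python
import struct

# Hypothetical fragment header: message id, fragment index, fragment count.
HEADER = struct.Struct("!HHH")


def fragment(message_id: int, payload: bytes, max_payload: int):
    """Split a payload into datagrams that each carry at most max_payload bytes."""
    chunks = [payload[i:i + max_payload] for i in range(0, len(payload), max_payload)] or [b""]
    return [HEADER.pack(message_id, index, len(chunks)) + chunk
            for index, chunk in enumerate(chunks)]


def reassemble(datagrams):
    """Rebuild the payload once all fragments are present, in any arrival order.

    Returns None while fragments are still missing.
    """
    parts = {}
    count = None
    for datagram in datagrams:
        _message_id, index, total = HEADER.unpack_from(datagram)
        parts[index] = datagram[HEADER.size:]
        count = total
    if count is None or len(parts) != count:
        return None
    return b"".join(parts[i] for i in range(count))
```

For a payload of a few kilobytes, such as a post-quantum public key, `fragment(7, payload, 1000)` yields several datagrams that each fit comfortably in a UDP packet, and `reassemble` tolerates reordering because each fragment carries its own index.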
It's basically in the agreement phase at the beginning: if both parties support this version of ZRTP with these algorithms, they will use it, and if one is old and doesn't support it, they will just fall back on classical or elliptic curve Diffie-Hellman. So, just to show what it looks like. First, you have the ZRTP handshake going on, and the call starts. Once the call has started, if it's the first time the two endpoints are calling each other, you get a pop-up that asks you to confirm the security string. Both parties just confirm it: they say it on the phone, it's written what you have to say, the other one confirms that you said what they were expecting, and you confirm it. Then this is saved in the ZRTP cache, and you will never be asked to do that again. At any time during the call, you can check the call stats and see what kind of algorithm was used to perform the exchange. So on this screenshot, you see that it was using Kyber512 and X25519. Here are some links, in case some of you download the presentation: one to the Linphone website, one pointing directly to the GitLab where you can find the source code of both ZRTP and our post-quantum crypto module, and one to the publication from Nina Bindel explaining how to hybridize several KEMs. Here we are, thank you for your attention. So we've got time for questions, and I've got one question on Matrix: why is post-quantum encryption not enabled in the pre-compiled Linphone SDK? Sorry, I didn't. Why is post-quantum encryption not enabled in the pre-compiled Linphone SDK? It is now. It is now? It is now. Okay. It is now. Yeah. Yes, sorry.
Hi, given that we're dealing with threat actors that might be capable of, you know, cracking cryptography with quantum computers, okay, given that we're dealing with threat actors that might have a lot of resources, it seems like one particular attack vector might be to essentially use real-time deepfake technology to intercept the vocal SAS comparison. Do you see any particular mitigation for an attack like that? Well, some kind of attack like this has already been studied and published. Basically, what came out of what I found is that it's kind of easy to use a speech synthesizer to synthesize the voice of someone else. The main problem would be to insert the SAS at the right moment in the conversation without adding a huge delay, such that people would not be able to talk. Basically, you would add two to three seconds of delay because you have to analyze the signal and buffer it to be able to insert your SAS back. People won't talk with a three to four second delay, there is no way people will be able to keep talking. I agree. I think it's going to be very difficult to do something like that in real time, but your solution looks really, really solid otherwise, so it looks like that might be one of the weaker aspects of it. Yeah. But up to now I've been trying to monitor the publications on the subject and I never found anyone able to publish an actual, really working attack on ZRTP, though it might happen at some point. That's great. Thank you. Can we be quiet for a question, please? Thank you. I think I missed it, but in this particular method that you are using, is it actually trusting the middle server that you're using, or is it using keys from another device, like a phone or something? Is this running with the SIP protocol, you said? I'm sorry. I cannot. Hello. The sound is very low. Hello. Better? Yeah.
So I wanted to ask if this was being used with a mobile phone to connect to the SIP server and then use post-quantum cryptography as you demonstrated. Can you go back two slides, please? Yeah. Yeah. So the phone, is it actually trusting the server which is running, or is it end-to-end, with the actual key being checked with the other host? Yeah. This is the main point of ZRTP: basically the idea is to not trust anyone, not even your server. The server is just in charge of connecting the two phones, and then the media goes directly from one to the other. The media path goes straight from one phone to the other, and it doesn't go through the server. And that's why the ZRTP exchange is performed on the media path and not on the SIP signaling path. When you establish your connection, you actually go through the ICE protocol, I don't know if you're familiar with that, which basically finds a way to connect directly, because in the end you don't want the media to be relayed, as you lose too much time. You want to send media packets directly from one endpoint to the other endpoint. Hi. You said that you have to compare the SAS only once. Yeah. Is it once per phone or once per user? It's per endpoint. Basically, in each endpoint, you have a cache of previous exchanges: each time you finish a ZRTP exchange, you keep some shared secret that you'll use the next time. And so during the exchange, at some point, you compare these shared secrets, and if they're the same, you use them to compute the SAS, which is always available, and you can always ask to compare the SAS, but it won't pop up, because the protocol knows that you performed the exchange before. But it's just from one phone to another phone, this cache is not shared. Okay. So in practical terms, if I buy a new phone and then install the same app with the same account, I have to do it. You have to do it again. You have to do it again with all your correspondents. Okay. Thanks.
We've got time for one last question. Is there any other last question? If not, thank you for your talk. Thank you. Thank you. |
Mercator
Mapping the information system |
Hi there, I'm Didier. I'm a technology and information security enthusiast. I started my career as an information security ninja, defending information against cyber threats using my Jedi skills. However, I also have another side to me that comes out at night, that of a relevant hacker. I love using my skills to support the values of open source and I firmly believe in it. I believe that technology can be used to improve people's lives, but it can only be done if we work together and share our knowledge. That's why I'm also a strong advocate of collaboration and openness in the technology industry. So may the open source be with you. I will present a project we made during the COVID crisis at the hospital where I work. A hospital information system is really complex. It's more than 3,000 applications, 4,000 virtual machines, and 2,000 people working in a critical infrastructure, seven days a week, 24 hours a day, to save people's lives. To secure this environment, we need a global view of all the elements that compose the information system, to obtain better readability and better control. So we started to build a cartography based on the ANSSI guide on mapping the information system. But when we looked at our tools, we didn't find one that fit our needs. Then the COVID crisis came, all IT projects were stopped or at least slowed down, so we took this time to work on this tool, Mercator. Mercator helps organizations map their information system in order to meet the operational requirements of cyber security. It helps to build a map in five simple, practical steps. It can be used by any organization irrespective of its type, size, maturity in terms of cyber security, or the complexity of its information system. It's open source. It can be used by any organization in the public or private sector alike. So what is Mercator? Mercator is a web application that allows you to manage the mapping of the information system as described in the information system mapping guide from the ANSSI.
What is mapping? Mapping is a way to represent the information system of an organization as well as its connections with the outside world. The term mapping refers to a schematic representation of a set of information. Mapping is different from inventory. Typically, you manage your assets in different inventories, but you don't know the relations between them or the importance of the relations between all this information. In mapping, we try to have a complete view, from the outside world and the business requirements down to your applications, your servers, your physical inventories and your IT rooms. So Mercator was a cartographer. He's the author of the Mercator projection, which is a conformal projection: it preserves angles. It was very useful for sailing in the 16th century. Why mapping? Mapping is essential to control the information system. It gives you knowledge of all the components of your information system and a better understanding of it by presenting it under different views. It helps address the four challenges of digital security. The first is control of the information system: the cartography gives you a common and shared vision of the information system within the organization. Then protection of the information system: mapping makes it possible to identify the most critical and most exposed systems, to anticipate possible attack paths on these systems, and to implement adequate measures to ensure their protection. Then defense of the information system: mapping makes for a more effective response in the event of an incident or a digital attack, to qualify the impact and predict the consequences of the defensive actions taken. And then information system resilience: mapping makes it possible to identify the organization's key activities, to define a business continuity plan, and it is an essential tool for crisis management, whether digital or not. The map is composed of three main assets defined in different views.
First you have the ecosystem view, which represents the entities or systems with which the information system interacts to fulfill its functions. These are your providers, your partners, your customers. Then you have the business view of the information system, which represents the information system through its main processes and information: all your processes, your activities, your actors. Then you have the application view, which describes the software components of the information system, the services they provide and the flows of information between them. Then you have the administration view. This is the list of scopes and privileges of users and administrators. The logical infrastructure view illustrates the network partitioning, including the definitions of IP addresses, VLANs, filtering and routing functions. And then you have the physical infrastructure view, which describes the physical equipment used by the information system. Your mapping can be built in three steps, and at each of these steps there is a level of granularity at which you fill in some of this information or some of these objects. At the minimal granularity, level 1, you have the initial elements essential for digital operations, for security operations. At level 2, the second level of granularity, you have a mapping oriented towards digital security. A vital information system must have a mapping which is at least at this level. And at level 3, the fine granularity, you have a comprehensive and detailed mapping that incorporates all digital security requirements. This is the main screen of the application. At the top we have the three maturity levels. You have a breakdown of all your objects by domain, and then you have a global proportional view of all the assets of your cartography. On the left and at the top you have the menu panels. In the top panel you have the views for documentation, and in the left panel you have all the data entry.
Mercator computes the maturity level. An item, an asset in the cartography, is complete if we have all the related information and all the links with other assets. For example, an asset in the cartography is not conforming if there is no description, no responsible, no type, or no link with other assets: an entity without relations, a process without operations, an application that does not support any process, or a server without applications. Then it computes the maturity level, that is, the number of conforming assets divided by the total number of assets, and the percentage represents the effort to be compliant. So the more the better. So you have a lot of lists. There are about 20 different types of assets. For all types of assets you can filter, sort, export and copy them. You have a lot of forms to fill in, rich forms with rich text. You can define the links between objects. There is role management implemented within Mercator. Within your IT team, you can assign the obligation to fill in the cartography to different teams. For example, you have the network team that fills in all the information related to the servers. You have the operating system team that fills in the virtual servers. You have the application managers that describe the applications and where they are installed. So you can divide these different roles within the application. There is a history of changes: whenever something is changed, it's automatically traced in the application. So this is the data model. You have your entities and your relations. An entity supports different processes, which define business processes, activities and operations. Processes use operations. In the middle you have the applications, which are divided into groups. Applications are installed on virtual servers, and these virtual servers run on physical servers. Mercator draws the dependencies between the objects. You have the objects in a hierarchical view.
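The maturity computation described above can be sketched in a few lines of Python. This is an illustration of the ratio only, not Mercator's actual code: the `Asset` class and its field names are hypothetical, and the conformity rule is reduced to "every descriptive field is filled in and at least one link exists".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    # Hypothetical fields chosen for the example.
    name: str
    description: str = ""
    responsible: str = ""
    asset_type: str = ""
    links: List[str] = field(default_factory=list)  # names of related assets


def is_conforming(asset: Asset) -> bool:
    """An asset conforms when it is documented and linked to at least one other asset."""
    return bool(asset.description and asset.responsible and asset.asset_type and asset.links)


def maturity_level(assets: List[Asset]) -> float:
    """Percentage of conforming assets; 0.0 for an empty cartography."""
    if not assets:
        return 0.0
    conforming = sum(1 for asset in assets if is_conforming(asset))
    return 100.0 * conforming / len(assets)


inventory = [
    Asset("billing-app", "Invoicing", "app team", "web", links=["vm-12"]),
    Asset("vm-12", "Virtual server", "os team", "vm"),  # no link recorded yet
]
level = maturity_level(inventory)
```

In this example one asset out of two is conforming, so `level` is 50.0. Adding the missing link on `vm-12` would raise it to 100.0, which matches the idea that the percentage measures the remaining documentation effort.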
You can view your macro processes, your processes and your activities, and also your networks, your VLANs and which servers are in each VLAN. You can also view the physical infrastructure with the buildings, the rooms, the bays and the physical servers in each bay. Mercator also draws the physical network schema. You can define the physical links between the different elements and you can view where they are installed. You can also explore the cartography. You select an object, you double-click on it, and it pops up all the links from this object to other objects. And you can explore all your cartography and view the different links between all your assets. The main interest of the cartography is to generate reports. The first major report that the cartography can produce is the information system mapping report. It's a complete Word document where you have all the assets of your information system. For my hospital, it's 600 pages, imagine. And in this report, you can explore the cartography by clicking on the links. You have an application, and you can view who uses this application, on which virtual server it is installed, on which physical server, in which building, and you can follow the links within the Word document. You can generate a list of supported entities and their applications. You have all your applications, your entities and which applications they are using. Applications by group: you have all your applications by group. You can view whether it is a web application, who supports it, where it is installed, on which physical server. You have a list of all your applications. You have a list of all your physical servers. What is the size of the server? Where is it installed? How many disks? What is it using? And you can also make projections year by year and see how it is growing. You can analyze your security needs on different objects. It's a list.
So the application denormalizes the links between macro processes, processes, applications, databases and information. You have this in the list. By denormalizing, you can see whether you have correctly placed your security needs in terms of confidentiality, integrity, traceability and availability. You can view your logical server configurations: the list of logical servers and their configuration. What operating system? What is installed on this logical server? Who is responsible for it? And so on. And finally you have the inventory of all your physical infrastructure, the list of all your physical equipment. You can take this list, go into the IT room and check whether everything is installed in the correct place and correctly labeled, and whether you have equipment that is not in the list. So this is an example of an information system mapping report with a table of contents. You have your schema and you can start browsing through the information system. This is an example of the physical inventory with your sites, your buildings, your rooms and your equipment. This is an analysis of your security needs. So you denormalize the links between macro processes, processes, applications and databases, and you can analyze the differences in the requirements between each object. You can track the changes made to the cartography over the last three months. You can track the updates of the map, and you can demonstrate to an auditor that your mapping is kept up to date. For example, if some new applications arrived in December, you should see some changes in the cartography made by the different teams. Mercator helps you with ISO 27001 certification: for the inventory of assets, for the ownership of assets, for the labeling of information, for the location and protection of assets, and for change management. You can see the impact of a change to an asset: which other assets does it impact? Capacity management: every year you can take a view of what your capacity is.
You can do vulnerability management, because you know what types of operating systems and applications you are using, and you can search and see what types of vulnerabilities are present in your inventory. You can do segregation of networks, security in supplier agreements, and assessment of information security events. In case you have a security event, you can quickly search in your cartography. You heard about something, and you can directly search for it in the cartography. You type a word; if this word appears in a name, a description, or a type of equipment, you automatically get the information. Availability of information processing resources: you know how many servers are used by which application, and so on. So you can do it really easily. The application is available on GitHub. It's open source. It's used in three hospitals in Luxembourg, 10 hospitals in France, and three administrations from French municipalities. We have for the moment 10 contributors. We have a roadmap. We have tons of ideas for extending Mercator. Our main idea for the moment is a processing register. The register of processing activities is an obligation under the GDPR: all your processing activities must be in your register. A crisis directory: when you have an incident and Mercator is not available because of that incident, you would like to have, even on paper, what your essential assets are, what the phone numbers of your providers are, what their email addresses are. And we plan to make a link with MONARC. MONARC is a Luxembourgish risk analysis methodology. So we can start from your assets and extract a risk analysis model to analyze your risks correctly. Okay. So thank you. Okay. Do we have some questions in the room? Thank you. So I've got a question related to application-specific and operating-system-specific assets and files. You've mentioned vulnerability management, and I wanted to ask: are you consuming software bills of materials in this tool?
And the second question, in addition to that, would be: if so, how do you consume that data? Thank you. Okay. So for the moment, you have to enter all your assets by hand. There are no automatic tools that can explain who your providers are, what your main business processes are, or what your physical inventory is. There are no automatic tools or artificial intelligence that can do it for you. So for the moment, you have to enter it by hand. If you already have some Excel sheets with this information, there is an API, a REST API, which you can use to enter or extract the information. I don't remember the second question. Did I answer both? Any other questions somewhere? I saw a question on Matrix asking what the URI for Mercator was. It's there. The URL? They're asking for the URI, so a uniform resource identifier; I don't know in what context. Okay. So no other questions? There's one. Okay. Hi. First of all, thank you for the talk. It's not much of a question, more like a comment, following on from what the other person asked about, because basically the issue I see is how to populate the application. So basically you need to connect Mercator somehow with your CMDB, or have it explore your network. I don't know, for instance, it could be with BloodHound or whatever; osquery could help also. So how do you see that kind of connection? How could you bind Mercator to those tools that already exist? Is there a way, or are you already thinking of creating some API or, I don't know, some magical way to interconnect those things? Yes. So for the moment we have a REST API. You can fill any table in the Mercator database from any inventory you already have in place. So you can make the link and update these tables automatically. For the moment we use it, for example, for the configuration of virtual machines. Every few months we update Mercator with the configuration of the virtual machines.
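As a rough illustration of pushing an existing inventory into Mercator through its REST API, here is a minimal sketch that turns a spreadsheet export into POST requests. Everything specific in it is an assumption made for the example: the base URL, the `logical-servers` endpoint path, the field names, and the bearer-token header are placeholders, so the Mercator API documentation remains the reference for the real routes and fields. The requests are built but deliberately not sent.

```python
import csv
import io
import json
import urllib.request

# Placeholder values for illustration only; not taken from the Mercator docs.
BASE_URL = "https://mercator.example.org/api"
ENDPOINT = "logical-servers"


def rows_to_payloads(csv_text):
    """Turn an inventory export (CSV text) into one JSON-ready dict per server."""
    reader = csv.DictReader(io.StringIO(csv_text))
    payloads = []
    for row in reader:
        payloads.append({
            "name": row["name"].strip(),
            "operating_system": row["os"].strip(),
            "cpu": int(row["cpu"]),
            "memory": int(row["memory_gb"]),
        })
    return payloads


def build_request(payload, token):
    """Build (but do not send) an authenticated POST request for one asset."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{ENDPOINT}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    sample = "name,os,cpu,memory_gb\nsrv-app-01,Debian 12,4,16\n"
    for payload in rows_to_payloads(sample):
        request = build_request(payload, token="REPLACE_ME")
        print(request.get_method(), request.full_url, payload)
        # urllib.request.urlopen(request) would actually send it.
```

Run periodically, a script of this shape covers the mechanical part of the inventory, such as virtual machine configurations, and leaves the business-level information to be entered by hand.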
So we don't have to do it manually, because this is a boring task that does not have a lot of value. But most of the time you have to do it manually, because this information exists nowhere. For example, how many users does this application have? Is it a critical application? What kind of application is it? Who uses it? What are the RTO and RPO of this application? All these questions have to be answered manually, and it's really important information. You have to enter it because you then want to know: okay, what are my critical applications? What are my critical business processes? This is a critical process, but it uses a non-critical application. Is that normal? You have to think about it to build this complete view of your information system using a cartography. Okay, so one more question. One question which is related to the other one. There are open source inventory tools available that you can use to automatically populate a hardware and software inventory, either by installing an agent on the computers or by using one agent that does remote inventory, and these agents could push the information into your tool. Are you considering using this kind of software? Yes, there are so many tools that do network inventory that we do not plan, for the moment, to build connectors to these tools ourselves. For the moment we are trying to improve Mercator itself. But as I said, there is an API. If you have an inventory and you want to push it into Mercator, well, there is a REST API. So, just push. We have in the documentation a few examples of usage of the API in C, in Python, in Bash, and so on. So it's really simple to build a script that populates the Mercator database from the inventory you have. But as I said, there are so many tools that you don't want to be tied to one specific automated tool to fill Mercator. And also, these automatic tools cover less than 10% of the work you have to do to complete your cartography.
Because most of the work you have to do is probably about things you don't have already. And you cannot automate this process of completing the cartography. There is no artificial intelligence that can explain to you what your critical processes are, what your critical entities are, what relation you have with them, what your critical applications are, and what the RTO and RPO of these applications are. So this is something you have to do by hand. Okay, some more questions, or we can end earlier. So, thank you for your talk. Thank you. |
Hardware-backed attestation in TLS |
I trust that everyone here considers authentication a staple of internet security, and that you think that having more information about the security state of your peer when authenticating them is obviously a good thing if you want to make a good decision. So with that in mind, I want to talk to you about our work to integrate remote attestation as an authentication mechanism in TLS. So first off, who am I? I'm Ionut Mihalcea. I'm a senior software engineer at Arm. I do mostly software prototyping, so doing proofs of concept for various software stacks that we think might be useful for our software ecosystems. Looking at an overview of the presentation, we're going to start with some theory, looking at remote attestation, at TLS, and at how we plan to integrate the two. And then we're going to continue with the practice: the prototype that we're building to instantiate the theory and the draft that we're working on. So let's kick off with the theory. What exactly are we trying to improve here? The current internet security model is mostly based around the assumption that the attacker is somewhere on the communication path between the peers. So what you usually do is issue some sort of certificate to the workload, which holds the private key associated with that certificate, and the workload can then authenticate itself to its peers. But the problem is that in this trust model, you have to trust that the workload is indeed running the software that you assume it's running. Even if, for example, your peer presumably uses some open-source software, you still have to trust them that they've deployed it and that that's what they're running. And also that they're keeping their key secure, because if the software is changed or if the key is exfiltrated, then you're kind of hosed.
So if you want to have more guarantees, can we actually use more remotely verifiable information within our authentication methods, so that we have more information about the security state of that workload and its key? We were prompted to look at this by two use cases in particular. The first one involves IoT or edge deployments. In this diagram, you have an edge device with a private identity key that was provisioned at manufacturing time. With this identity key, you want to create some attestation credential that you can present to a service. Presumably you own both the device and the service, and you want to make sure that only your device is connected and can access whatever the service is doing. A sort of mirror use case is one that involves a workload running in the cloud. You have, again, a workload that has a private identity key provisioned, for example, in the server chip. You want your local device to connect to the workload, and you want to get more information about, for example, the software that booted on the server and how the key is managed. And this is where remote attestation comes in. Remote attestation is essentially a class of hardware-backed mechanisms that allow you to provide cryptographically verifiable metadata about the state of your device. So you can have more trust about, for example, what kind of firmware was running at boot time, what OS kernel you're running, and maybe even what the software in the workload is. You do this by using that private identity key that was provisioned within the device. The device essentially becomes a certificate authority for itself, and it can issue credentials for all the workloads running on top of it. If we look at the data flow for remote attestation, this is a bit complicated, and it's useful to think of the arrows not as physical communication paths but as logical data flows.
And the components that we care mostly about here are the attester and the relying party. Authentication happens between these two, and it's the attester that wants to authenticate itself using some sort of remote attestation. As you can see from the diagram, they're not actually connected in the data flow. There's another component there called a verifier, which takes the attestation evidence and produces attestation results that the relying party can then understand and trust. Above the verifier in the diagram there is a sort of supply chain, in particular the endorser and the reference value providers. They provision the attester with its software, the boot-time software for example, and with its identity key. Then, with this information about the attester, they can go ahead and talk to the verifier and make sure that the verifier trusts the device. So when the verifier tries to appraise the evidence, it understands it and trusts it, and can then produce valid attestation results. Switching to TLS, Transport Layer Security: it's a pretty ubiquitous security protocol. It's used everywhere from HTTPS to Lightweight M2M to provide secure channels of communication. These secure channels are established with a handshake protocol in which the peers authenticate each other. What usually happens with remote attestation is that you establish the secure channel and you do remote attestation on top of that, whereas we're trying to integrate remote attestation directly into TLS, to make it more efficient and also to limit the attack surface that an attacker might see. Let's look at TLS 1.3, the handshake in particular, and how we want to integrate with it. The handshake starts with the client sending over a ClientHello with a key share and, within the ClientHello, a bunch of extensions and other things for the server to act upon.
Then the server sends, for example, the chosen cipher suite and any other responses to the extensions that the client sent, along with its own key share; then it authenticates itself using a Certificate message and a CertificateVerify, and it ends its flight with the Finished message. The client can then go ahead and authenticate itself using a Certificate message and a CertificateVerify, and it finishes the handshake with its own Finished message. After that, you have a secure data channel between the two peers. It's important to note, mostly for privacy reasons, that from the second flight onwards most of those messages are actually encrypted. For example, the Certificate and CertificateVerify messages are encrypted using keys derived during the handshake. In terms of what we care about, it's mostly the extensions, because those are used to negotiate the type of credentials that, for example, the relying party might care about, and also to send across any freshness that is required to issue the attestation evidence. We also care about the Certificate message, because that's obviously where we're going to carry the attestation credentials. As for our high-level goals, obviously we want to enhance authentication in TLS to support remote attestation. We want to support as many platforms as possible, from very beefy cloud servers to small IoT devices. And we want to support the most common deployment patterns. So for example, we want to allow either the client or the server to attest, or potentially both. We want to allow existing deployments that use PKI to also use remote attestation within the same handshake, just to enhance the security. So there's a whole lot of variance there.
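To make the handshake walk-through above easier to follow, here is a small sketch that lists the TLS 1.3 handshake messages in order, marks which ones are encrypted, and flags where attestation would hook in according to the talk. The message names and encryption status follow TLS 1.3 as specified in RFC 8446 for a handshake with client authentication; the `note` annotations are an informal reading of the talk, not the wording of the draft extensions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str      # "client" or "server"
    name: str        # TLS 1.3 handshake message name
    encrypted: bool  # protected with handshake traffic keys?
    note: str = ""   # informal annotation, an assumption of this sketch


# Full TLS 1.3 handshake with client authentication (RFC 8446, section 2).
HANDSHAKE = [
    Message("client", "ClientHello", False, "extensions offer credential types"),
    Message("server", "ServerHello", False),
    Message("server", "EncryptedExtensions", True, "responses to client extensions"),
    Message("server", "CertificateRequest", True),
    Message("server", "Certificate", True, "may carry attestation credentials"),
    Message("server", "CertificateVerify", True),
    Message("server", "Finished", True),
    Message("client", "Certificate", True, "may carry attestation credentials"),
    Message("client", "CertificateVerify", True),
    Message("client", "Finished", True),
]


def plaintext_messages(handshake):
    """Names of the messages an on-path observer can read."""
    return [m.name for m in handshake if not m.encrypted]


def attestation_carriers(handshake):
    """Messages that would carry attestation credentials."""
    return [m for m in handshake if m.name == "Certificate"]


if __name__ == "__main__":
    for m in HANDSHAKE:
        lock = "encrypted" if m.encrypted else "plaintext"
        print(f"{m.sender:>6} -> {m.name:<20} {lock:<10} {m.note}")
```

Because both Certificate messages fall in the encrypted part of the handshake, the attestation metadata they carry is hidden from passive observers, which matters for the privacy concerns raised in the talk.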
In terms of security and privacy, we're planning to formally verify the extensions that we're creating, and we're working quite meticulously to try to prevent any potential attacks, for example relay attacks, where you take a credential from some victim platform and try to pawn it off as your own. Then in terms of privacy, unfortunately, attestation does reveal quite a lot of metadata, and this can be both privacy and security relevant. The best we can do is to mitigate some of this by allowing the relying party to choose what kind of attestation scheme or attestation results it gets. So you can get, for example, specially crafted attestation results that have blinded or redacted some of the metadata, or schemes like direct anonymous attestation that provide some sort of privacy. Moving on to the practice, so looking at our prototype. The big picture here is that we are trying to produce an end-to-end prototype of this system, so we're trying to implement everything from the root of trust all the way to the verifier. Because our drafts and our theoretical work are quite broad and allow a lot of deployment patterns, we're limiting the prototype to, for example, the background check model that I'll talk about in a bit, and to TPM 2.0 as the root of trust. And obviously we're open sourcing the entire stack, also because the components that we're using are already open source software, parts of, for example, the Cloud Native Computing Foundation or the Confidential Computing Consortium, and it's actually under the Confidential Computing Consortium's attestation special interest group that our work is harbored.
Moving back to the remote attestation architecture diagram, you can see here a simplified version of it. On the bottom you can see an attester with an existing root of trust. The attester wants to communicate with the relying party and authenticate to it, and the relying party will then send the attestation evidence over to the verifier for verification. This is what we call the background check model, because the relying party is doing a background check on the evidence provided by the attester. If we put a bit more flesh onto this diagram, you can see that in our case the attester will be the client in a TLS handshake, and the relying party will be the server. The TLS stack that we're using is Mbed TLS. The client will send attestation evidence produced by the client's root of trust, and Mbed TLS on the client side will communicate with the root of trust not directly but through Parsec, which is one of the projects that we've been developing. On the server side you have Mbed TLS again, communicating with the verifier, which in our case is built using Veraison. So now let's have a look at all of these components independently. So, Parsec. What is Parsec? Parsec is the Platform AbstRaction for SECurity. If you write an application in Java or Python or Go, you might want to use some sort of cryptographic hardware backing, for example a discrete TPM or some trusted services running in TrustZone, and you want to use these in a more generic way. This is what Parsec does: it presents a high-level interface that you can use. Parsec in particular has this sort of identity key as a core use case, so it tries to allow you to create an identity for your workload and to use it, for example, to sign TLS handshakes.
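The background check model described above can be sketched as a toy simulation: the relying party issues a fresh nonce, the attester returns evidence bound to that nonce, and the relying party hands the evidence to a verifier that checks it against endorsed key material and reference values. This is a simplified illustration, not the Parsec, Mbed TLS, or Veraison implementation. In particular, an HMAC with a shared key stands in for the asymmetric signature a real root of trust would produce, and all class and function names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


def make_evidence(device_key, claims, nonce):
    """Attester side: bind claims to the relying party's nonce and 'sign' them.
    HMAC is only a stand-in for a hardware-backed asymmetric signature."""
    body = json.dumps({"claims": claims, "nonce": nonce}, sort_keys=True)
    tag = hmac.new(device_key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


class Verifier:
    """Holds endorsements (key material) and reference values for one device."""

    def __init__(self, endorsed_key, reference_values):
        self.endorsed_key = endorsed_key
        self.reference_values = reference_values

    def appraise(self, evidence, expected_nonce):
        expected_tag = hmac.new(
            self.endorsed_key, evidence["body"].encode("utf-8"), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected_tag, evidence["tag"]):
            return False  # not produced by the endorsed device key
        parsed = json.loads(evidence["body"])
        fresh = parsed["nonce"] == expected_nonce
        return fresh and parsed["claims"] == self.reference_values


class RelyingParty:
    """Issues a challenge, then does a 'background check' via the verifier."""

    def __init__(self, verifier):
        self.verifier = verifier
        self.nonce = None

    def challenge(self):
        self.nonce = secrets.token_hex(16)
        return self.nonce

    def authenticate(self, evidence):
        if self.nonce is None:
            return False
        nonce, self.nonce = self.nonce, None  # each nonce is single use
        return self.verifier.appraise(evidence, nonce)


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    reference = {"firmware": "fw-1.0", "kernel": "k-6.1"}
    rp = RelyingParty(Verifier(key, reference))
    evidence = make_evidence(key, reference, rp.challenge())
    print("fresh evidence accepted:", rp.authenticate(evidence))
    rp.challenge()
    print("replayed evidence accepted:", rp.authenticate(evidence))
```

The single-use nonce is what provides freshness here: evidence captured from an earlier exchange fails the check, which is the simplest form of the replay protection the extensions need to provide.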
And Parsec is also quite modular, so it's really easy to implement backends for other types of hardware that you might want to support. Moving on to the other end, we have Veraison, which is a set of components that can be used to build an attestation verification service. Again, Veraison is pretty abstract. It has a bunch of components, for example for appraising different types of attestation schemes, as well as components for building APIs for evidence provisioning, for endorsement provisioning, or for verification. In this diagram here, some manufacturer is creating a device and then producing the endorsement data that it feeds to Veraison, and when the device tries to connect to an application service, that application service can go to Veraison to verify those credentials. In terms of the work that we had to do to make this prototype work across the stack: Parsec, as I've said, works mostly with cryptographic keys; however, we didn't have a very generic key attestation API, and this is something that we had to build to produce those attestation tokens, or attestation evidence. Parsec also needs configuration to allow it to provision its own identity, so an attesting key that it can use to sign attestation credentials, and also ways to select the TPM PCRs that you want to include in the attestation tokens, for example to select whether you want to send information about your firmware, about your bootloader, or about your operating system kernel. We also have a new API to produce the endorsements that are then fed to Veraison, so that Veraison can trust the key attestation tokens. On the Veraison side, again, we have to add support for the precise key attestation scheme that we're using, and essentially we have to build two new plugins: one to understand the evidence that we're producing from Parsec, and one to understand the endorsements.
So what we're essentially doing here is that we have two components, and we're trying to make them agnostic of whatever is transporting evidence and endorsements between them, which in our case is actually Mbed TLS. So the TLS implementation that we're using is Mbed TLS; it's an implementation of TLS and DTLS, and there are actually multiple reasons why we're using it. One of them is that it offers the PSA Crypto API, which Parsec hooks into by design, since we created Parsec in sync with the PSA Crypto API. It also has a small code footprint, so it's more suitable for IoT and edge use cases, like the one that I described earlier. And we already had expertise working with Mbed TLS, so it was easier for us to work with it. Now, the open source ecosystem around our projects. This is something that has been quite important for me in the past years while I've been working on Parsec: realizing that open source is more than just a checkbox that we want to tick, and that's it. It's more about continuous involvement in the community and trying to pool the expertise and the work required to create components that we can reuse across all of our stacks. And this is the reason why we've been seeding projects into CNCF and CCC: we're trying to create communities around our projects and ecosystems around our use cases. If we look at the Rust ecosystem in particular, because the Parsec service is written in Rust, we've released a number of crates relevant to handling roots of trust. For example, we've released the tss-esapi crate, which helps with interacting natively with TPMs.
We've released the cryptoki crate, which is essentially a successor to the pkcs11 crate that was abandoned some time ago, and we have the psa-crypto crate, which allows native interaction with the PSA Cryptography API. It's actually been quite a nice experience to see the communities around these projects grow and to have more developers from various projects, some of which have presented today, getting involved and helping us build this ecosystem. The more important goal, for us at least, is not just to make these particular backends easy to use in Rust, but perhaps even to make them easy to use in an abstract way. So instead of having to integrate with TPMs or PKCS#11 individually, what if we could integrate with all of them via Parsec directly? On the Go ecosystem side, we also have a bunch of packages that we've released. A notable one is go-cose, which I believe was initially developed by Mozilla and then abandoned, but then our Veraison team took it over, gave it a dusting, and released it, and now I think it's used quite widely, for example by Notary and Sigstore. We also have a bunch of other packages relevant to remote attestation verification, like SWID or CoRIM. And this brings me to my main selling point here, which is that we're trying to build an ecosystem where attestation can just be used as a plug-in for authentication. So whether you integrate it within the authentication step of a TLS stack, or you want to switch that to some sort of QUIC stack, or maybe you even have some sort of bespoke authentication server with a workload trying to authenticate to it, we're trying to make it easy to use remote attestation by making Parsec and Veraison interact easily, so you can just plug those components in and hopefully get attestation that just works.
So just to wrap up here, we think that remote attestation is indeed a viable authentication mechanism in TLS, and perhaps in other protocols as well in the future. Our design tries to be as flexible as possible, both in terms of the theoretical design, so the drafts and the TLS extensions, and in terms of the prototype that we're building. We want to refine all of our drafts and all of the things that we're trying to define together with other people across the industry, and to create an end-to-end prototype that represents all of this theoretical work. And we're hoping that the prototype will serve as a model for integrating remote attestation not just into specific protocols, but more widely. So yeah, questions? So any questions from the room? I see hands. Yeah, thank you. You mentioned you're working under CNCF and CCC; have you also considered the Open Source Firmware Foundation? Open source...? No, not really. I mean, neither Parsec nor Veraison is really a firmware-level component. Right. So yeah, we're essentially doing very similar stuff, but doing the full flow, starting from the very first code running on your platform, in the firmware. We should get in touch. Thank you. Perfect. Hey, thanks for the talk. I was kind of curious how big the impact is on round-trip times in TLS if you have a secure enclave or a TPM involved in the initial handshake. How does that work? Do you see any problems in practice putting that in at scale? We've not really gotten to the point where we can properly test end-to-end in terms of actually going to hardware and talking to hardware, so we're mostly working with software TPMs and things like that; we still have some integration work to do there.
But yeah, we're definitely going to benchmark that and see how it impacts things. It obviously depends on the hardware, because doing that on some cloud server is going to be quite different from doing it on an IoT device that has a TPM or something like that. Yeah. Okay, do we have some other questions, anyone? If not, thank you for your talk, thanks. Thank you. |
Demystifying StackRox
Unlock zero trust cloud-native security in Kubernetes |
from Rutvik about demystifying StackRox. Welcome, Rutvik. Thank you. Good evening, everyone. Thanks for coming back for the late evening talk. I appreciate your time. So my name is Rutvik Kshirsagar. I work at Red Hat as a senior technical support engineer. I mainly work on solving OpenShift as well as StackRox issues with customers and with engineering teams. In recent times, container and Kubernetes adoption has increased, and security threats and attacks have increased along with it. With that, security has become the biggest concern, right? So we'll see how StackRox is paving the way for Kubernetes-native security and helping us to resolve security issues with ease and automation. So this is the brief agenda for today's talk. In the first few slides, we'll discuss the current state of Kubernetes security, what the best practices are, and how a DevSecOps approach benefits the security posture, you know, by shifting security towards your developers as well as your security admins. Then we will see how the StackRox ecosystem is helping end users, developers, and security teams to overcome security issues with ease. And then we will have a demo at the end. Yeah. So first of all, let's understand what zero-trust security is and why we require it, right? Zero-trust security is basically a framework which requires all users to be authenticated and authorized continuously before they are granted access to your applications and data. If you manage to achieve a zero-trust security model, then I would say that we can resolve issues or minimize their impact at a very early stage of your application lifecycle. Then how exactly does zero-trust security fit into the software supply chain? And what exactly is the software supply chain? It includes everyone and everything that touches your application code in the entire software development lifecycle, right?
It could be your deployment, it could be your final artifact, it could be a CI/CD pipeline. So it's essential that we build the application in such a way that assurance is taken seriously at every stage. That way we can achieve a trusted software supply chain. So yeah, we can see that securing dependencies, securing code, and securing containers as well as the infrastructure are all part of the software supply chain. Let me ask you this question. If you're using Kubernetes, or any applications in general, have you ever delayed or slowed down an application deployment into production due to container security concerns? Any of you? All right, I assume so, because that's how we go through the application lifecycle: we deploy the application, then we analyze its behavior, and we detect the vulnerabilities. In recent trends, we have seen some common factors, or common anti-patterns, which were causing delays for applications getting deployed to production. Misconfiguration has topped the list, followed by vulnerabilities to remediate, right? So for example, we are able to detect the vulnerabilities, but somehow we tend to overlook them, or we could not assess them accurately. I mean, we get to know that, okay, a vulnerability exists, but there are no proper ways or tooling to fix that kind of vulnerability. Then we ultimately have security issues at runtime, you know, which could be costly or could affect your entire production. So how can we make sure that these kinds of issues are reduced? In today's world, we need a DevSecOps approach; just DevOps isn't enough, right? We need DevSecOps to shift security away from our traditional security practices. DevSecOps helps us define a microservices architecture. It provides us with declarative definitions to, you know, harden your security parameters, network policies, and deployment configs.
It also makes sure that the infrastructure stays immutable, so at runtime nobody else is allowed to, you know, touch the software or your application deployment. At the same time, it is important that we know Kubernetes-native security is increasingly critical, and securing the supply chain is equally essential. So what are the basic Kubernetes security challenges? We know that containers are numerous and everywhere. If I have to put an analogy, like we say that everything is a file in Linux, in a similar way everything runs in a container when we talk about Kubernetes, right? So they tend to pose compliance challenges. Every container image is tied to one or another container registry, right? Sometimes we even forget to add TLS with basic authentication to our image registry, and that may, you know, expose it to security threats over the internet if we expose it at all. We are also aware that containers by default talk to each other without any network policies. So it is important that we define network policies at an early stage. And this one, I think most of you can relate to: when we install Kubernetes, all of the configuration looks pretty easy, but the defaults are usually the least secure, right? So we, as admins or developers, have to proactively understand what configuration or what risk tolerance is required for our organization or developer environment. In Kubernetes, the application lifecycle spans three main phases: the build phase, the deployment phase, and the runtime phase. So how can we make sure that we secure each and every stage of the application, right? When we talk about the build phase, it's important that we isolate the vulnerability or security issue at the earliest. Otherwise, it would be very costly and risky to detect the vulnerabilities at runtime, right? So what can we do? We can use minimal base images, so that we avoid unnecessary package managers or, you know, executable programs in our container images.
Then we can always use an image scanner to identify known vulnerabilities. I think identifying vulnerabilities just once is not enough. You need to make sure that whatever scanner integration you're using continuously validates your container images and sends real-time alerts to your development team as well as your security admins. Then yes, at the build phase, we need to integrate a CI/CD pipeline. That way, most things become automated, and you don't have to look through each and every build config to understand where the security issue lies. With a CI/CD pipeline, if a stage fails, your production won't be affected and the build is stopped right there. Then at the deployment phase, as I mentioned, the default deployment config doesn't come with a network policy. We need to understand what services that deployment is trying to communicate with and what ports are defined in the deployment config, and accordingly we can define our own network policies. We also need to make sure that the deployment doesn't allow root-level privileges or any unknown user IDs to access your application. You should always be aware of which users are going to access your application. And then yes, we can extend image scanning to the deployment phase. It's important that we do not restrict our image scanning to the build phase, but continue doing it at the deployment phase as well. Then the runtime phase: as I mentioned, we need to extend our scanning to runtime as well, so we can easily and quickly understand what issues have appeared and what actions we need to take. It also helps to monitor network traffic to limit unnecessary or insecure communication. Then, if you find any suspicious activity and you have multiple replicas of your application, you can compare the processes across all the replicas to understand what anomalous activity is happening.
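Since pods in a namespace can talk to each other by default until a policy says otherwise, the advice above usually translates into a default-deny policy plus narrow allow rules. The sketch below builds two such manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts as well as YAML. The `networking.k8s.io/v1` NetworkPolicy fields are standard Kubernetes; the namespace, label values, port, and function names are made up for the example.

```python
import json


def default_deny(namespace):
    """Deny all ingress and egress for every pod in the namespace:
    an empty podSelector selects all pods, and no rules means nothing is allowed."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress(namespace, app_label, from_label, port):
    """Allow pods labelled app=<from_label> to reach app=<app_label> on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-{from_label}-to-{app_label}",
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": {"app": from_label}}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace, labels, and port.
    for policy in (default_deny("shop"), allow_ingress("shop", "api", "frontend", 8080)):
        print(json.dumps(policy, indent=2))
```

Note that network policies only take effect when the cluster's network plugin enforces them.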
So to overcome all these challenges, we see StackRox helping end users and the community as well. So why is StackRox open source, right? Red Hat believes in the open model when it comes to developing software and applications. And we believe that open source software can significantly help developers drive the project with innovation as well as foster collaboration within the community. So StackRox is working towards providing an open source solution that allows end users to decide how they want to protect their Kubernetes clusters. Let's understand what StackRox has to offer. It enables users to address all significant security use cases across the entire application lifecycle that we discussed, right from build through deployment and runtime. It also gives you greater visibility over vulnerability management, configuration management, network segmentation, compliance, threat detection, incident response, and risk profiling and tolerance. StackRox has a policy engine that allows users to run policies out of the box. Let's say I have a vulnerability with a CVSS score greater than or equal to seven: I can then have an alert for that score and understand which deployments are associated with it. The StackRox API allows users to integrate with the image scanning tools, CI/CD tools, container runtimes, secret management, and DevOps notification tools of their own choice, to ease the security workflow end to end. You can also run it on any cloud or hybrid cloud, or if you prefer on-prem, you can deploy it there. So this is the bird's-eye view of the architecture, where you see Central in the blue box as a central hub, which is exposed over a load balancer for clients to consume the StackRox API. It is a REST API. And then we have Sensor, Admission Controller, and Collector, which are logically grouped and called a secure cluster, right?
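The policy example from the talk (alert when a CVSS score is greater than or equal to seven, then see which deployments are affected) boils down to a filter and a grouping. The sketch below is an assumption-laden illustration, not the StackRox policy engine: the `Finding` type, the `alerts_by_deployment` function, the deployment names, and the placeholder CVE IDs are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    deployment: str
    cve: str
    cvss: float


def alerts_by_deployment(findings, threshold=7.0):
    """Group CVEs whose CVSS score is at or above the threshold by deployment."""
    alerts = {}
    for finding in findings:
        if finding.cvss >= threshold:
            alerts.setdefault(finding.deployment, []).append(finding.cve)
    return alerts


# Placeholder findings; the CVE IDs are not real identifiers.
findings = [
    Finding("payments", "CVE-0000-0001", 9.8),
    Finding("payments", "CVE-0000-0002", 4.3),
    Finding("frontend", "CVE-0000-0003", 7.0),
]
alerts = alerts_by_deployment(findings)
for deployment, cves in alerts.items():
    print(deployment, cves)
```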
So once you configure this set of components, you can call a Kubernetes cluster a secure cluster. And then you can keep adding as many Kubernetes clusters as you want to Central. Central also has Scanner, which aggregates the vulnerability feeds that are fetched by Central. Central basically collects vulnerability feeds from upstream sources as well as the NVD database. Then on each and every node we have the Collector agent, which collects host-level data about the container network and the runtime. So this is the UI. Let's say I have integrated 100 Kubernetes clusters: how can I manage them or understand how they are behaving? Which components are healthy and which are unhealthy? Here we can have a quick look to see how the systems are performing. So what problem segments is StackRox going to solve? These are four problem segments that I found very common between developers and security teams: does my container contain content that compromises the infrastructure, are there any known vulnerabilities, are the runtime and OS layers in the container up to date, and is my Kubernetes cluster compliant with industry-certified security benchmarks? So let's see how StackRox solves these problems. StackRox can identify vulnerabilities in the base image, in packages installed by package managers, in programming-language-specific dependencies, and in programming runtimes and frameworks. It supports the package formats that I have listed there, and I believe most of you work with the same package formats. The supported operating systems include Alpine, Amazon Linux, CentOS, and Red Hat Enterprise Linux. Managing compliance with security standards is equally important for our organizations. StackRox supports out-of-the-box compliance standards like the CIS benchmarks for Kubernetes and Docker, then HIPAA, NIST, and PCI. So you can run scans through these profiles.
So Central, or StackRox specifically, collects snapshots of your Kubernetes cluster, then aggregates the data and analyzes which checks are passing and which checks are failing. It helps you evaluate regulatory compliance, and it helps you harden Docker and the underlying container runtime. So this is the UI where you can see the passing percentage across your clusters, across namespaces, and across nodes, so you get a better idea of where the issue is or which compliance checks are failing, and you can navigate to it accordingly. In the right section you see which controls are failing and what needs to be set. For example, here I have taken the example of the CNI files, where the control says that the file permissions should be more restrictive. You can take action accordingly and fix that control. So what is Collector? Collector helps the whole StackRox ecosystem maintain and manage container runtime activity as well as host-level process information. It's an agent that runs on every node under strict performance limitations and gathers data via either a kernel module or eBPF probes. It collects, analyzes, and monitors container activity on cluster nodes. It collects information about runtime and network activity and sends the collected data to Sensor. Sensor then helps Central display all the data in the UI. Okay, we'll quickly look at this. This is the traditional way we used to look at the kernel when we deployed an application. We have user space, where the user application runs, and for every resource that the user application needs, we need system calls. When a user requests any data, the kernel copies that information from kernel space to user space. But due to some limitations, it is not possible for the user to access everything that is in kernel space, right?
And this was not a problem when we talk about a single Linux host, but with container adoption, we know that the number of processes or containers that may run on a Linux host has increased; the density of containers has increased, right? So resource overhead and managing container issues and container runtime issues have become a great challenge. All these activities require kernel support, as we know. So how do we overcome that? We can use eBPF probes. What is that? It is the extended Berkeley Packet Filter, right? It is not just a packet filter; it is more than that. It helps us with networking, tracing, profiling, observability, monitoring, and security. I will move on quickly because of time constraints. Then we have network policies. In Kubernetes, we know that by default there are no network policies; we need to define them on our own. But in a production-grade environment, it is really difficult to write each and every network policy YAML, because sometimes we do not know from what source the traffic is coming. At large scale, that can be difficult, right? So StackRox provides a network graph and network segmentation to understand or modify baselines, so that we can define: okay, if traffic is coming from this source, then it should be blocked, or a network policy should be created according to this baseline. StackRox also provides a network policy simulator through which you can understand what the active connections are, where each connection is coming from, and whether it is allowed by the deployment or whether it is anomalous. Accordingly, you can define your baseline and restrict the traffic. It helps us create the network policies at runtime, so we can just copy the generated network policy and configure it in our Kubernetes cluster. Then we have admission controllers.
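To make the idea of turning an observed baseline into a network policy concrete, here is a minimal sketch. It is not the StackRox generator: the function `ingress_policy`, the `app` label convention, and the example namespace and flows are assumptions for illustration. The manifest fields themselves (`networking.k8s.io/v1`, `NetworkPolicy`, `podSelector`, `policyTypes`, `ingress`) are the standard Kubernetes ones.

```python
import json


def ingress_policy(namespace, app, allowed):
    """Build a NetworkPolicy that only admits the baselined flows to `app`.

    `allowed` is a list of (source_app_label, tcp_port) pairs taken from the
    observed baseline. An empty list yields a policy that denies all ingress.
    """
    rules = [
        {
            "from": [{"podSelector": {"matchLabels": {"app": source}}}],
            "ports": [{"protocol": "TCP", "port": port}],
        }
        for source, port in allowed
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{app}-baseline", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app}},
            "policyTypes": ["Ingress"],
            "ingress": rules,
        },
    }


# Hypothetical baseline: only the "api" pods talk to the "db" pods, on port 5432.
policy = ingress_policy("shop", "db", [("api", 5432)])
print(json.dumps(policy, indent=2))
```

JSON is valid YAML, so the printed manifest can be reviewed and applied like a hand-written policy file.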
So it basically helps to enforce security policies before Kubernetes creates workloads, for example Deployments or DaemonSets. It intercepts the API request when any program or application is about to run in a pod. In StackRox, we use the admission controller with security policies so that if any policy is violated, it immediately prevents the deployment from getting into the running state. Okay, so I will quickly show a demo where I use the Log4Shell CVE as an example, to understand how it can prevent a deployment. Just let me show it quickly. I hope the screen is visible, yeah. So this is the cluster dashboard, where I can see the images most at risk and which policies are currently violated. StackRox provides some default policies that follow best practices for security posture. Considering the criticality of Log4Shell, we have included a policy for it as well. You can configure policies in two modes, inform and enforce. Currently, if I look at this policy, it is in inform mode only, so I edit it and make it enforced. Yeah, so it can execute at the build stage and the deployment stage. I mark inform and enforce and enable it for the deployment phase. Right, so once the policy is created, it shows whether any existing deployments are violating this policy. Then, for demonstration purposes, I run a vulnerable deployment which has this Log4Shell CVE; this container image has the vulnerable app. In a parallel terminal, I keep a watch to trace the events at runtime. As soon as I create this deployment, you will see that the pods are terminated because of the policy violation. It won't allow the pod to get into a running state. And in the events, you will see that a StackRox enforcement has been detected and the deployment has been scaled to zero. Okay, time is up. I have one more demo, a quick demo.
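The difference between the inform and enforce modes shown in the demo can be summarised as a small decision function. This is a simplified model, not StackRox code: the `Mode` enum and `review` function are invented for this sketch, and a real admission controller works on Kubernetes admission requests rather than a set of CVE IDs. CVE-2021-44228 is the identifier of Log4Shell.

```python
from enum import Enum

LOG4SHELL = "CVE-2021-44228"


class Mode(Enum):
    INFORM = "inform"
    ENFORCE = "enforce"


def review(image_cves, mode, blocked=LOG4SHELL):
    """Return (allowed, alerts) for one deployment request.

    In inform mode a violation only raises an alert; in enforce mode the
    deployment is also rejected.
    """
    violated = blocked in image_cves
    alerts = [f"policy violated: {blocked}"] if violated else []
    allowed = not (violated and mode is Mode.ENFORCE)
    return allowed, alerts


informed = review({LOG4SHELL}, Mode.INFORM)
enforced = review({LOG4SHELL}, Mode.ENFORCE)
clean = review(set(), Mode.ENFORCE)
print(informed, enforced, clean)
```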
If you would like to see it, let me know. Quick demo, yeah, that would be interesting. So in this demo, I explain how we can leverage the DevSecOps approach to shift security left. For that, I have used Tekton via the Pipelines operator, which is deployed on OpenShift. This operator is using nothing but the Tekton framework under the hood. Let's see it quickly. It provides a standard CI/CD pipeline definition in a declarative approach, so we can define tasks as well as pipelines, which are then portable across all your Kubernetes infrastructure. I have defined these tasks where the image will be checked and scanned. In the background, each task calls the StackRox API through roxctl. It's similar to kubectl: it talks to the StackRox API and performs the scanning for the image. So these two tasks I have referenced in the pipeline definition: image check and image scan. And there is also a secret where I have provided the StackRox API endpoint and the credentials. So we'll create a namespace called pipeline demo. Then I create the secret as well as the pipeline definition. Next, we will execute those tasks. We switch to the developer view and see that the pipeline has been defined. These two tasks are there. The pipeline run is not initiated yet, so we'll initiate it and pass the container image that we want to scan. For example, here I have provided MySQL 80. So the pipeline run has been created. You can check the real-time logs through tkn, a client for Tekton through which you can perform these operations. It also gives you better visibility in case your tasks are failing. For example, here my credentials had expired, so I had to refresh the credentials and then run the pipeline again. Now we see the pipeline get into the running state. The tasks have passed. Now we see all the CVEs that are associated with this particular container image.
You can get each and every CVE ID and its CVSS score, and you can share those with your security admins accordingly. You can also check policy violations through the image check task to understand which policies have been violated and how they are rated, whether low, moderate, or risky. That is it. I have put together some handy resources for you to go ahead and get started with the StackRox community project. You can also hop into our Slack channel, and that is it from my side. So do we have some questions here? Thanks for the excellent presentation. I have one question. You mentioned a lot about the agent, which is scanning and detecting the vulnerabilities. You briefly touched upon Central, and if I understand correctly you are pushing the detection of vulnerabilities into Central. Is that right? Yes. Central fetches the vulnerability feeds from the upstream sources, or let's say the NVD database. Every five minutes it keeps checking which vulnerabilities are present upstream. Once those are downloaded, Collector or Sensor fetches that data into the respective Kubernetes cluster. So what if the container is running, the pod is running, and suddenly the agent checks the vulnerability database and detects that the version running in the pod has some critical vulnerability? What action would it actually take? It actually depends on us, on what actions we want the admission controller to perform. We can have it in inform mode, so that we understand, okay, a policy is violated, and then check whether that is really affecting my workload or the runtime and take action accordingly. If you strictly do not want to allow any deployment to run as soon as the policy is violated, we can put it into enforce mode, and we can decide at what stage we want to terminate it, at the build stage or the deploy stage. It's basically based on your policy.
And is Central accessible only in a closed environment, or is it open, so that anyone can access it from anywhere, and any containers running in any cloud can access it? It can be configured in online mode as well as in an air-gapped environment. So again it depends on your use case or your organizational requirements how you want to install it. For offline mode, you can always download the vulnerability feeds or kernel probe modules onto your secure host and then inject them into Central in an offline way. Okay. That option is also there. Thank you. Any other questions? Yes. I just have a question. Can you use StackRox as a honeypot? I mean, can you let the intruder carry on, to actually get a description of all the things the attacker is doing? Right now you are basically applying a policy and cutting the thing off. But can you instead isolate the container and let it run, just to get forensics out of it and see how things are behaving? Yeah, other than policies we can always do risk analysis. Sometimes a vulnerability may be reported as critical, but my application might not have that vulnerable code at the runtime stage, right? So I can always mark that vulnerability as a false positive, or I can defer that vulnerability. Does that answer your question, or do you have something else? Yeah, I mean, sometimes the scenario is that you have the pod actually in production and something happens to it, and you want to isolate it but you still want to have forensics. You don't want to just cut it off. You want to understand the attack. So in terms of isolation, the UI gives us rich context about the layer at which the vulnerability is present. For example, we can inspect each and every Docker layer. It allows us to see in which component the vulnerability exists. So you can always modify the image.
You can build it again and patch the changes. Thank you for the question and thank you for the talk. I think we are out of time. Thank you. |
Welcome to the SBOM devroom!
Introduction to the devroom |
So we start with a pretty tight schedule, so thank you everyone for being here and being on time. To respect your time: we're not completely sure we've got the AV all set up for the remote stream, but for those here in the room we'll start it off and post slides. A bit of housekeeping, I guess. We've got, like I say, a very full schedule between today and tomorrow. Today, just today. Tomorrow's the flight, but anyhow. So Alexis, you want to walk us quickly through it and then tell us the rest of the housekeeping stuff, and then I'll give an overview of the SBOM stuff real quick. Right, okay. So first of all, for the strange people who do not know us, this is Kate the Magnificent, Adolfo the Great is somewhere trying to find AV solutions, and I'm Alexis. In the program that you have seen, we did not do as nice a job as other devrooms, and we did not leave any time between the end of one talk and the start of the next. Therefore, if you're speaking, imagine that you've got five minutes stripped off of the plan, because we're going to finish early, switch laptops, and bring the next speaker to the front of the room. So as you've seen we have lots of talks, all about SBOMs. We tried to group the presentations according to a couple of themes: we'll be starting with tools that are working on SBOMs, then we're going to be discussing what information goes into SBOMs, and then we're going to have more general discussions about SBOMs. We had interesting changes in the schedule according to travel plans for different people, so we have two discussions or panels or whatever you want to call them.
One is a discussion on SBOM contents, where we expect people to contribute, and the other is the larger panel discussion about the usefulness and caveats of SBOMs in general. Then we also have another time slot for everyone to ask questions about SBOMs, because this is something completely new. That's about it, and I'll give it to Kate to explain what SBOMs are. The other thing is, as we are moving through the day, if more people are coming in you may be asked to move that way so that people coming in can get seated. So when we start to see the pressure of too many people standing along the back walls, you'll see one of us go around quietly at some point during the talks. So I think that's it, but... Lastly, the room has a corridor at the end, so you don't need to cross over. Okay, so quick show of hands: how many people have started working with SBOMs already? Okay, pretty good, I see one or two not up, so I'm going to just state the common understanding that's emerged of what an SBOM is: it is the relationships between components used in building software. These are things like libraries and modules, open source or proprietary, widely available or restricted. All of these are valid use cases today, and because we have to work across the full ecosystem and improve transparency, we need to be inclusive of all of them. This is a definition that's been worked out in the industry through various meetings, with European participation as well as North American and Japanese, so we're trying to get this right.
There was a document published last year, actually the year before, saying what the minimum elements are. The minimum elements are supplier name, component name, version of the component, some other unique identifier, dependency relationships, author of the data, and timestamp. That is pretty much all that is asked for as the minimum. Now, anyone who's working with this stuff knows that is not sufficient. There are a couple of formats that are already recognized as supporting this minimum: SPDX and CycloneDX are on that list, as well as SWID. So we have a defined set of fields to record things, and we're trying to line up with that in the various formats and then possibly do a lot more. Most of us here are in the SPDX community. SPDX is able to extend beyond that minimum, and we have gone through the effort of becoming an international standard, so you'll probably be hearing a little bit more about SPDX than the others, but there are other people working on CycloneDX who will be here today too. The context, though, is that an SBOM by that minimum definition can apply pretty much anywhere in the software lifecycle, and we were finding a lot of people talking past each other. So one of the things a group has been working on for the last six months is coming up with a common set of definitions for the types of SBOMs. They relate a little bit to where things are in the lifecycle, starting with the design phase, but not completely. The common ones that we see out there in the industry right now are the source ones and the build ones, and build is where we're getting a lot of information for the security folks. Analyzed is when you give a tool a binary and it tries to figure out what's in it. Deployed is when you've got things that you're putting on a system with configuration information and you want to know that, and runtime is what might be running on your system.
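The minimum elements listed above (supplier name, component name, version, another unique identifier, dependency relationships, author of the data, timestamp) can be written down as a small data model with a completeness check. This is a sketch under stated assumptions: the class and function names are invented, the example components and package URLs are made up, and real SBOMs would be read from an SPDX or CycloneDX document rather than built by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    supplier: str
    name: str
    version: str
    unique_id: str                                    # for example a package URL
    depends_on: list = field(default_factory=list)    # unique_ids of dependencies


@dataclass
class Sbom:
    author: str
    timestamp: str                                    # ISO 8601 string
    components: list


def missing_minimum_elements(sbom):
    """Return a list of human-readable gaps against the minimum elements."""
    gaps = []
    if not sbom.author:
        gaps.append("SBOM author")
    if not sbom.timestamp:
        gaps.append("timestamp")
    known_ids = {c.unique_id for c in sbom.components}
    for c in sbom.components:
        for attr in ("supplier", "name", "version", "unique_id"):
            if not getattr(c, attr):
                gaps.append(f"{c.name or '?'}: {attr}")
        for dep in c.depends_on:
            if dep not in known_ids:
                gaps.append(f"{c.name or '?'}: unresolved dependency {dep}")
    return gaps


# Hypothetical two-component SBOM; the library is missing its supplier.
sbom = Sbom(
    author="build-bot",
    timestamp="2023-02-05T09:00:00Z",
    components=[
        Component("Example Corp", "app", "1.0.0", "pkg:generic/app@1.0.0",
                  depends_on=["pkg:generic/lib@2.1.0"]),
        Component("", "lib", "2.1.0", "pkg:generic/lib@2.1.0"),
    ],
)
gaps = missing_minimum_elements(sbom)
print(gaps)
```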
You can generate SBOMs for all of these types that fit the minimum definition. So one of the things I'm going to be asking the speakers to do, as they talk through their slides and everything else, is to say what type of SBOM they're talking about, so that people can get it clear in their heads how these tools work with different types of data. And with that I will turn it over to our first speaker. |
Generating SBOM made easy with ORT |
All right, so good morning, good morning, welcome, bonjour, good morning, yeah, I can speak too many languages. For the people that don't know who I am, my name is Thomas Steenbergen. I'm the head of the open source program office at EPAM. I'm involved in several, well, you can call them open source governance and SBOM-related projects. Among them, I run the security profile in the SPDX Defects team, so if people have questions about security information in SPDX 3, I'm happy to answer those as well. I decided not to make pretty slides but to just open a browser, so it is more like a demo and a little bit more interactive. Since I'm the first speaker, I have to keep you all awake, and I'm usually a very high-energy person. Apparently last night I was compared to Obelix: I fell into a vat of open source juice, and therefore I'm all hyped up on open source stuff. I normally talk very fast, so I will try to be a little bit slower. So I'm here to talk a little bit about ORT, the OSS Review Toolkit. The screen, better? Perfect. Like, first talk. So it should work, hopefully my internet is all working. If you are on GitHub you can find ORT here. The full name is OSS Review Toolkit, or open source software review toolkit. It's a very complex name, and it's actually a pun for those people that are German: Ort in German means place. I used to work for a location company, and all of our open source projects had a location pun in them, so we were really trying to figure out how we could make a location name, and this is where we landed. So I was actually going to do somewhat of a live demo, but my internet is not working 100%, so we're just going to do it like this, where I luckily have pages open, and I'll walk you through.
So if you want to get started with ORT, the easiest way is actually to use the GitHub Action. As the code shows here, you literally just add a few lines, and it's very standard. For people that are familiar with GitHub Actions: in the middle there's a checkout. The line on top is a little bit different, and then you run ORT itself. That line is different because of how ORT generates SBOMs. You can make SBOMs that basically only operate on the declared license and the basic package data, but we want a proper one. I originally worked in the automotive industry, where we want to know everything down to source level. And why is that? It's actually very simple: in the automotive business, the minimum lifespan of products on the market is 15 years, and it can go up to 25 years or longer. A vehicle is on the road for a minimum of 10 years, plus European legislation and contract law add another five years: 15 years. So with all the solutions that we build for SBOMs, my successor's successor's successor still needs to be able to use them. So this is why we started designing ORT the way it is. We were like, hmm, we need a format in which we can take the scan results that we have and store them long term in a public format, so that even if the tooling that we write doesn't exist anymore in 10 years, it's an international standard and they can just write a new parser and read this in. Then of course I stumbled upon SPDX, and I happened to meet Kate at FOSDEM in the bus. So we got talking, and I was like, hey, that's interesting, this SBOM thing. This is years ago, like 2015-ish, I think, 16-ish, a long, long time ago.
So then I was like, hang on, I can solve two of my problems: I get an output format that is recognized, so I have a long-term archive format, and I also have the format that I need for exchange. Why is that? In automotive, there is a long supply chain. What we commonly have is what we call hamburgers, where you get something from a supplier, you do your thing, then it goes to another supplier, and then my company will be in there again. So we were looking for a solution to exchange information in the supply chain. It is still automotive, unfortunately, so there are still a lot of paper-based processes, and we were like, no, no, we need to go digital. What you now see, luckily, is that some of the large German ones, and again, I'm based in Berlin, Germany, are already switching to SBOMs and saying: yeah, you can still do paper, but we actually prefer you to give us an SBOM, because then they can ingest it much more easily. So yeah, we basically do SPDX, and we also support CycloneDX, for exchange and also for archiving. Now, my talk was about how you generate it, and it's actually very simple. I took the mime-types project here, which is kind of our default project; it's a small Node project. If I want to make an SBOM for it, it's very simple for people that are familiar with GitHub Actions: you just run the action, and it's as simple as that. You can also do this in GitLab. I don't know how many people are familiar with GitLab, but it's basically all the same: you run a GitLab pipeline, it says ORT scan, you get a nice log, you click on the browse button there, and you get nice results. We generate all of this.
So we actually go further on SBOMs, because just generating an SBOM is not good enough. The file is generated, that's nice, but you actually want a quality SBOM, and it's quite challenging to produce a good SBOM. Show me your hands, or just speak up loud: what do you think is the problem with generating an SBOM? Input data. Input data. What input data? You don't know. We don't know the input data. So when we do an analysis of your software projects... I'm also originally an engineer, I still do coding, and we're happy when things build. All the build tools are basically optimized to build code. Keeping track of what actually went into your code, yeah, that is kind of an additional feature; it's not a requirement that the build tool does that. So, to figure out what actually goes into your software project: whatever tool is your poison, whether it's Maven or Gradle or npm, I'm pretty certain your tool can produce a list of the packages that are included. I hate to tell you, but most of the time that list is incomplete. For instance, npm has six methods to give you a list of what is included. All of them are incorrect. And that's not to blame npm. JavaScript is a very rapidly developing ecosystem, so they build functions really, really rapidly. It's amazing. I work with npm and know it well. They add a feature for a particular use case, and SBOM is simply not a use case that they support. So they have different views. There's one quick view that just shows you the dependencies. There's another view that shows you, hey, this came from somewhere. But if you look at a complete SBOM: if you're a Node developer, you have probably seen that there are Node packages with C++ below.
So Node is just used as a wrapper for some other C/C++ program, for instance to compile your glyphs, your fonts. When you generate an SBOM, it might see the wrapper, but it will not see the C++ thing below it or what's in the compiled code, because the npm tarball just contains the compiled C++ code for every platform. So you won't find this out. The way we went about this is: okay, we need to resolve everything back to source code. And that's really, really complicated, because if you look at Maven, that whole ecosystem is geared to give developers compiled code, not the source code. Yeah, you can find metadata, and you can kind of figure out what went into a Java project, but here's a fun fact: most of the time, the package metadata of a Maven project is actually incorrect. As an example, and this is not to bash Amazon: if you know the Amazon Java SDK and you look at the metadata, it points to the right code repository. Great, Amazon, great work. But Maven doesn't have a way to tell you which folder in that code repository actually contains the source code for the package. And if you know the AWS SDK for Java, it's one code repository but close to 500 packages. So I have here a JAR from the Java SDK. Where is the source code for that JAR? Oh yeah, it's in this code repository with another 400-plus of its friends. And for me to know the exact licensing or security vulnerabilities, I need to know exactly which source code. So then you need to do all kinds of tricks, and this is where we built our tooling to figure this out. I just want to show you one of the output formats. This is where we created a simple single file. We call this the web app; it's a single file, not a server.
So when we started building this, we were like, no, no, we want to be compatible with as many different systems as possible. We want to have a Docker container that you can just run in your CI/CD pipeline, and it produces plain files, because every developer knows how to handle single files. You can take this file and send it to your lawyer, and your lawyer can open the file and view it too. We optimized everything for the CI/CD pipeline. The second thing that we were looking at is that we want to see the violations. So here you have a package on top, and you see a package with a copyleft license in source. You can write very powerful policy rules, whatever you want to do. But when you raise a policy violation, and this was another complaint that we had about a lot of tools, when you write in your open source policy something like 'I don't allow this license', you should tell your users how they can fix it themselves. Why? Well, in my company, I have 55,000 software engineers. If all of them contact my team when they don't know something, yeah, I'll be very busy. So the way we set things up is: no, no, no, we want things to be open source. We want to be based on open standards, SPDX, CycloneDX, so all the SBOM standards. We want to be able to write a policy where we can put whatever is in our actual legal policy, translated into rules. So ORT has something called policy as code. You can really take whatever your policy is and encode it. And we want everything to be plain files, so it's easy to integrate with whatever CI system. Why is that? Well, I run an OSPO, an open source program office. Hands up, who's familiar with the term open source program office? Some people. So an open source program office is the industry term for the knowledge center within an organization regarding open source.
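The idea of policy as code with a how-to-fix text attached to each violation can be illustrated with a short sketch. This is not ORT's rule syntax or API; the `Violation` type, the `evaluate` function, the denied-license table, the guidance text, and the package names are all hypothetical and only show the shape of the idea: every violation carries guidance so that developers can resolve it without contacting the OSPO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    package: str
    license: str
    how_to_fix: str


# Hypothetical policy: SPDX license identifier -> guidance shown to the developer.
DENIED_LICENSES = {
    "GPL-3.0-only": "Replace the dependency, or propose an exception "
                    "through a pull request to the policy repository.",
}


def evaluate(packages, denied=DENIED_LICENSES):
    """Check (package, license) pairs and attach a how-to-fix text to each violation."""
    return [
        Violation(name, license_id, denied[license_id])
        for name, license_id in packages
        if license_id in denied
    ]


# Made-up package names for illustration.
violations = evaluate([("lib-a", "MIT"), ("lib-b", "GPL-3.0-only")])
for v in violations:
    print(f"{v.package}: {v.license} is not allowed. How to fix: {v.how_to_fix}")
```

Keeping the policy table in a plain file under version control means changes to it can go through the same pull request workflow as code.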
So I run an OSPO, and that basically means that all open source questions, everything, come to my team, and we try to help our engineers with contributing back, with compliance questions, or with community topics. We're really there to basically help our organization become better at open source. So that means I get lots and lots of questions regarding open source. I want something that really scales because I have thousands of engineers. With this how-to-fix text, and with a very powerful policy, I can decide exactly: OK, these are the things that we really want to fix, and these are the things we don't want to fix, and we want to provide exactly the right guidance. So the funny thing you might see here is that this is actually all YAML files and Git based. So we actually use the same developer workflow with pull requests to basically fix up our licensing. Why do we do this? Actually, because that's what the developers already know. The developers already know how to do all of this stuff. So we enable the developers to fix things themselves. The other thing is we can now use inner source to fix license compliance problems and security problems together. Why do we want to do this? Well, a lot of tools just notify you: they throw an issue at you, and then you have to fix it. It's better if we work together to actually fix all of these issues. When we did the inner source, and this was actually the first time we did this, people were very much: hang on. So other teams are going to fix my license compliance and security issues? Yes, because in our organization, guess what? A lot of the same open source is shared between a lot of different teams. So instead of having every team do all of this compliance and security work by themselves, for the things that are shared, we do it collectively.
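The "policy as code" idea with how-to-fix guidance can be sketched in a few lines of Python. This is only an illustration of the concept, not ORT's actual rule format or API; the `Finding` and `Violation` classes, the `COPYLEFT` set, and the rule function are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass

# Hypothetical set of licenses this example policy treats as copyleft.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}


@dataclass
class Finding:
    package: str
    license_id: str  # SPDX license identifier
    location: str    # "source" or "binary" in this sketch


@dataclass
class Violation:
    package: str
    message: str
    how_to_fix: str  # guidance so engineers can resolve the issue themselves


def copyleft_in_source(finding):
    """Rule: flag copyleft licenses found in source and attach guidance."""
    if finding.license_id in COPYLEFT and finding.location == "source":
        return Violation(
            package=finding.package,
            message=f"{finding.license_id} is not allowed by this policy.",
            how_to_fix=(
                "Replace the dependency, or open a pull request against the "
                "curation repository if the detected license is wrong."
            ),
        )
    return None


def evaluate(findings, rules):
    """Apply every rule to every finding and collect the violations."""
    violations = []
    for finding in findings:
        for rule in rules:
            violation = rule(finding)
            if violation is not None:
                violations.append(violation)
    return violations


findings = [
    Finding("example-lib", "GPL-3.0-only", "source"),
    Finding("other-lib", "MIT", "source"),
]
violations = evaluate(findings, [copyleft_in_source])
for v in violations:
    print(f"{v.package}: {v.message} How to fix: {v.how_to_fix}")
```

Keeping the guidance next to the rule means the policy file itself answers the "what do I do now?" question, which is the point made in the talk about not routing every violation through the OSPO.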
And the nice thing is, you can do this in your organization, but we not only open-sourced the tooling, we also open-sourced the data and the policies from the ORT side. So you can also do this with the whole community. So we're working on this, and we had a workshop on Friday to discuss how we can take this even further, how we can collectively work to create better SBOMs. That's pretty much it, in a little sneak peek. Lots of code, beautiful code, right, in the morning. This is the other thing that I'm already working on. It's the security profile. So the latest thing that we're now working on is like, okay, we want to combine everything and include vulnerabilities as well. And then you run into a lot of other challenges in creating an SBOM. So technically, to be clear, security info should ideally not be in your SBOM. It should be a standalone artifact, in SBOM format, but standalone. That's because your SBOM should be fixed for your software release, while your security information is probably going to be updated. But then, actually, we ran into a lot of challenges with including security information. Because guess what? A lot of security information is either locked up in a proprietary database, or it's a lot of the time not really accurate. And that's what we figured out when we started to move into security data. When you actually look at the data, and I've been working with Philippe from nexB and looking at his data, you'll figure out that the ranges of affected software versions for a lot of CVEs are actually incorrect. And we were like, no, no, we want to build open source tooling that reduces the burden on our software developers, so we need accurate data. So it's actually funny: we started with what we call curation mechanisms to fix up license data.
And now we have to build curation mechanisms for security data as well, so we can actually fix up the security data so that at the end we can produce high-quality SBOMs. So we know exactly: these packages were in there under these licenses, and at the time of release, we knew of these security vulnerabilities. And now the next thing is, how many people are familiar with VEX? Few. VEX is an upcoming standard. Basically, if you have a security vulnerability, you can say, oh yeah, I know this security vulnerability was found by a scanner, it's reported. But I compiled this package with these parameters, so this vulnerability is actually not applicable to my software. So it's a way to say, in a machine-readable way: yes, the scanner will pick up a security vulnerability for OpenSSL, but the way I use OpenSSL, I don't use the particular code where the security vulnerability was found. So that's the next challenge we're working on: how can we do that in an automated way? And that's it. I think, let's go to questions. Do people have any questions? You might want to speak into the microphone. So basically, most of the time when you compile stuff, you do, like, tree shaking, so you import the library, but you don't import all the functions inside the library. So if a vulnerability is found in the library, but not in a function you import, is there some kind of way to detect that? So the question is, is there a way to detect, if you compile software, whether something is included or not? No. Correct? Not really. You want to? So I think the question was: if there is a component, but you're not using the whole component, you're using just a part of it, and there is a vulnerability, but the vulnerability is in something that you do not use, for example, a library function that is never called in your code, is there a way to detect that? That's great. Honest answer? No. I don't know about that.
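As a rough illustration of the VEX idea described above, the sketch below filters scanner findings using machine-readable "not affected" statements. The dictionary layout is an assumption made for this example, loosely modeled on the `state` and `justification` values used in CycloneDX vulnerability analysis; the CVE identifiers are placeholders, not real advisories.

```python
# Scanner output: (component, vulnerability id) pairs. Identifiers are placeholders.
scanner_findings = [
    ("openssl", "CVE-0000-0001"),
    ("openssl", "CVE-0000-0002"),
]

# Hypothetical VEX-style statements written by the team that builds the software.
vex_statements = [
    {
        "component": "openssl",
        "vulnerability": "CVE-0000-0001",
        "state": "not_affected",
        "justification": "code_not_present",
        "detail": "Built with the affected feature disabled.",
    }
]


def applicable_findings(findings, statements):
    """Drop findings that a VEX statement marks as not affecting this build."""
    suppressed = {
        (s["component"], s["vulnerability"])
        for s in statements
        if s["state"] == "not_affected"
    }
    return [finding for finding in findings if finding not in suppressed]


remaining = applicable_findings(scanner_findings, vex_statements)
print(remaining)
```

The scanner still reports both entries; the VEX statement only records, with a justification, why one of them does not apply to this particular build.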
I disagree. And there's a difference, okay. So you can do it, but I look at things at scale. You can do it for individual cases, where you know the compiler and stuff. But what I'm looking at is, if you have a large organization like ours, we use tens of different compilers with tens of settings. It doesn't scale. So the way we worked around it: in ORT we have a mechanism where you can basically indicate, either via SPDX, what you're using, or you can add a curation afterwards where you say, yes, I'm using this package, but I'm only using this folder. And then ORT will automatically subtract the things that are not applicable. One more question? Quick one? Let's figure if you want to come up. Yeah. Do the setup already. And we'll... The SBOM types. We were talking about design, source, build; where does ORT apply? I think there's a lot of these there. It's basically, we're focusing on the source SBOM. So we really... You're on the build. Build, source? You're build. Build? You're build. Yeah? Yeah, you're build. Build? We actually have all the source things in there as well. But it's... You can have source and build. The SBOM type is one type. But mostly we do build. So basically, we look at your source code, we pull it out, and basically say, like, in this source code... The idea of why we started there is because everything starts at source code. The next speaker. Right. Background. |
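The "I'm only using this folder" curation mentioned in the answer above can be sketched as a simple path filter over scan findings. This is a minimal illustration of the idea, not ORT's curation format; the function names and the repository paths are hypothetical.

```python
from pathlib import PurePosixPath


def within(path, folder):
    """Return True if path is the folder itself or lies somewhere below it."""
    p, f = PurePosixPath(path), PurePosixPath(folder)
    return p == f or f in p.parents


def scope_findings(findings, included_folders):
    """Keep only findings whose file path lies inside a folder that is used."""
    return {
        path: license_id
        for path, license_id in findings.items()
        if any(within(path, folder) for folder in included_folders)
    }


# Hypothetical license findings for one large multi-package repository.
findings = {
    "services/s3/src/Client.java": "Apache-2.0",
    "services/ec2/src/Client.java": "Apache-2.0",
    "test-tools/fixture.c": "GPL-2.0-only",
}

# Curation: this package only corresponds to the services/s3 folder.
scoped = scope_findings(findings, ["services/s3"])
print(scoped)
```

The same scoping addresses the monorepo problem raised earlier in the talk: once a package is tied to a folder, findings from the other packages in the repository no longer leak into its result.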
Understanding and Managing the Dependency in SBOM with the New Feature of SW360 |
Well, everybody, please welcome Kouki, our next speaker. Yes, I'm very happy. Last year I also did a presentation, but it was online, so I'm very happy to see a real audience. Hi, real ones. Okay, so today I would like to talk about SW360 features. The title is Understanding and Managing the Dependency in SBOM with the New Feature of SW360. I'm from Japan, so I'm a little dizzy due to the jet lag. Okay, so I am Kouki Hama, from Toshiba Corporation. My main task is to research open source compliance, its tools, and the management process, and I am now one of the leaders of the SW360 project. Here are today's contents. First, I would like to explain what SW360 is. Next, I would like to explain software dependencies. Unfortunately, until now SW360 could not manage software dependency registration well, so I solved this issue with my colleagues. But we still have a problem: the common SBOM formats do not correspond to SW360's, so as future work SW360 needs to be improved to match the popular SBOM standards. Okay, so let's start. The first part is SW360. As far as I remember, seven or more years ago one man, Michael Jaeger, developed this, and since then a lot of people have continued to commit a lot of source code. Now it can handle a lot of information related to the software inside your company. For example, you can collect security vulnerabilities, maintain licenses and their obligations, and get some assistance in generating the legal documents.
Yeah, this is just an example. SW360 now supports a lot of software component management activities. This is an overview. Of course your company uses a lot of software components, but different products or projects have different software components. For example, component A may be used only by product A, and that's okay, but component C may be used by both product A and product B. If you centralize this kind of data, you can reuse the information about licenses or vulnerabilities, so centralizing the data is very important. Now, maybe your company has a lot of systems inside, for example, a license scanner or an artifact repository. They are all important, but sometimes there are a lot of problems. For example, if your company doesn't have unique naming rules, these systems cannot exchange data easily. To solve this, you centralize the mappings. If you do so, you can input all the data into SW360, and you can manage a lot of components in your company with SW360. So now this is a screenshot of SW360. At this moment you can register software components, languages, and a lot of other information. We are updating it, and maybe within a few months we can support the SPDX format, and we also plan to handle CycloneDX-related information on the roadmap. Some more information: this screenshot is written in English, but other languages can also be used. Right now English, Vietnamese, and Japanese exist, and maybe a pull request for Chinese exists, so you can use four languages. If you want to add a new language, we have already prepared a template, and everyone can send a pull request for the new language, so SW360 can handle a wider variety of languages. Okay, so that was what SW360 is. Today's topic is software dependencies, so I will summarize this chapter first. As you know, software dependency information is very huge and complex, but sometimes it
needs to be managed for license obligations or to manage vulnerabilities. This is an example of a software dependency map. The software dependencies of a project are the third-party open source software that the project depends on. Software dependencies can be direct or transitive. Even if you register and manage the software dependency graph, some component versions get updated, and whenever a component is updated, you also need to register that data again. This is one of the reasons why you cannot manage the software graph easily. This is a real example: if your project is this one, these are the other software components. In reality you may use a lot of software, and you need to manage a graph like this. So why do you need to register and manage software component dependencies? One reason is license management. Different components have different licenses, and of course you need to follow their obligations. Following the license obligations is the most important thing for open source users, so we need to track them. On the other hand, vulnerabilities are also important. If you use some software and a vulnerable component sits deep inside its component graph, you may not find this vulnerability easily, but it is sometimes a very important risk for the components you manage. That is why managing software component dependencies is important. Picking the proper dependencies, updating outdated dependencies, and resolving the risks of dependencies are all things you need to do for the components of your products. That's the background. But unfortunately the traditional software component catalog application SW360 cannot handle this situation, so I would like to explain this. At this moment SW360 can register only one set of software dependency information. This means different dependencies cannot be registered for the
different projects. This is the architecture of SW360. For example, if project X uses a component in SW360, then you link the release; release means the version. That release also uses other releases of other components. So now you can register the dependencies of a project. If you want to register, for example, project Example 1, you can link it like this: project Example 1 has minimatch, and this is linked to brace-expansion, and so on. This looks okay, but it's not a good way in the real SW360. This is a screenshot of the real interface of SW360. You can register this project, and you can also register dependencies. But unfortunately this is not perfect, because if another project wants to register other information with different dependencies, it is impossible to manage that information with the current SW360. Like in these graphs, of course different components link to different components, and the company should manage all the information for each project, but right now that is impossible. Ideally we need per-project information about what the dependencies are, like this. If you register minimatch with these dependencies, it's okay, and another project also needs to do the same thing but with different versions. At this moment, if someone has already registered the dependencies, other project members cannot register new dependencies. And if you force a change to the link information with admin rights, the other project's dependency information changes as well, which means the first project's information is no longer correct. To solve these problems, my colleagues and a lot of SW360 community people worked on this. To solve it we changed the data architecture and the GUI. Of course it is an important function, and I plan to open a pull request soon, but if you want to try this function or this update, you can go to this branch, and if you build
this source code you can try it. Our idea for solving the problem is that different projects have different dependency graphs. In the old specification, there is only one dependency graph, but in the new specification, different projects have different dependency graphs. Of course, to realize this specification we needed to change the GUI. In the old one, when you set the links you can find only one thing, but in the new specification's GUI you can set the links for releases. This is the real GUI. In the current new branch you can find this kind of GUI for registering the dependency graph in SW360. For example, you can select the versions or delete this kind of information. There is a lot of assistance like this for managing the graph. Basically the release page in SW360 is not changed from the old one, so the dependency information there will be seen as the starting default information; it will stay the same as the latest information in the ecosystem. This is the view page of the dependencies. Each dependency graph can be committed. It's difficult to see the change, but if you look at this screen in detail you can find the differences; in particular, brace-expansion has different versions. Of course the editing page is also changed, as this page is used when registering new information for your project in your company's SW360. Okay, so this is the solution for how to deal with dependency graphs in SW360. I also need to mention how the SBOM standards define dependencies, because SW360 is of course used for your company's source code management or component management, but it needs to correspond with the common SBOM standards. The common SBOM formats also define dependencies, and they describe how to register the dependencies in your project. For example, in SPDX I can find the relationship types, so they handle the element
and, through these kinds of relationships, we can express which software depends on which other software. I can also find dependencies in CycloneDX, based on the package URL, so they manage dependency graphs in their formats too. From now I would like to explain the future work regarding the SBOM standards. I have already explained how we updated SW360 for dependency graphs, but we still have a problem: SW360's definitions of dependencies are unique, and they are not the same as those in either SPDX or CycloneDX. This is a problem for corresponding with the common SBOM formats, so SW360 should be updated again. I think this is one of the most important future tasks for this function. Okay, I would like to summarize today's presentation. SW360 can manage internal software information, and the registration of dependencies is important for license and security management. Registration of software dependency information in SW360 was not flexible; however, we developed a new function to register different dependency information for each project. The definition of relationships between software is unique to SW360, so we need to follow SPDX or CycloneDX, and in the future it will be adapted to the common SBOM definitions. Okay, that's all, thank you. Yes, so you asked whether SW360 can detect dependencies or not? No. Can SW360 also read the dependency information? Ah, no, for reading or analyzing the dependencies you need to use other tools, like ORT maybe. Hi. So what type of SBOMs is your tool consuming? Source or build? Source or build? SW360 defines these kinds of types; every kind of build information can be registered. So if a user wants to register based on the source or build, they can register it, and if another SW360 user wants to register from their related system, they can do it. So build or source, yeah. The relationships that you said are not in both of them, which one is closest?
Like when you had that slide with the relationships not matching, some of those relationships I think are pretty clear in SPDX, but I'm trying to figure out which ones you think are the gaps. Ah, yeah, so I think the SW360 developers defined these. No one tried to make them correspond to the SPDX ones, but they are very similar to the SPDX relationships. So I'm just trying to tell you about an X there. Yeah, that's right, the SW360 side needs to be updated to SPDX. Okay, I'm just trying to say a lot of those are relationships already, but if there's something that you're missing, please open an issue in SPDX. Ah, okay, thank you. Is there also some information available about where the package is used, like a test library or a dev dependency? Because for certain things, for a large number of years, you might have to exclude dev dependencies from a product. Is there also somewhere a way to capture where a dependency is used in the build process? From now I would like to do so; I need to catch up. So what's the... So ORT has an integration with SW360. ORT will detect from the package managers what is in a test scope. Then we call the API of SW360, and the internal build tools get marked as internal use. So basically that's how ORT works together with SW360. In the integration from ORT to SW360 we translate everything to the internal use relationship that is in SW360. But you need to have another tool to get that information, and from the website we do this translation. And I can continue with the answer that, yes, we are in the middle of doing a refactoring to make the dev fully compliant. And this will be integrated in future versions, I don't know which one. But the question is what kind of information you really would like to have in SW360. SW360 should be able to support all the different points of view. We would say we don't want to have the dev dependencies. So as Thomas already explained, we would sort them out in the scanner that creates the original SBOM. And their work is to
import an existing SBOM and to map it to a specific system. Yeah, but in some cases you do want to have the dev dependencies, where you want to use an SBOM for operational risks or whatever. But we are very confused with which use SBOM and I don't want to store SBOM so we want to put it in 8th and 2nd. I totally agree, and as Thomas explained, you have different kinds of tags or meta information that you can apply to releases or to components unless they are with its view. So this would allow you that kind of mapping. I wonder whether that's identifying a new type of SBOM then, because it sits between the build and the analyzed. There must be, you might have new tools that you want to use purely internally to demonstrate the quality, but actually not deliver to your end user. So is there a gap there? It's not really so. We do it in ORT; we store it in one SBOM for the source code. It's the complete corresponding source code. And then I have another SBOM for the build, the SBOM that I build from the source code, and that's the subset that has been extracted to actually make the image. But yeah, I can think of when you test, you may have different compile options to include debug or not debug. This is a very simple example, and actually you need to know whether you've got that in your deployed. Exactly. And so that's where the build SBOM will potentially refer to the source SBOMs, or to other pieces, so you can get the details. We're going to have a slight schedule change. The Yocto one is going to be switched with Jan-Simon's as the next one. But in Yocto they're doing that today. And so I think the Yocto example will probably clarify how we're linking these SBOMs together in some places. Thank you. Any more questions? We have time still? Yeah, three minutes. Three minutes, okay. Well, we have to switch. Okay. Can it import SPDX already? I thought you could do SPDX now. Ah, so now... When? When do you actually merge the patch that was waiting for it?
Ah, yes, as he mentioned, I already prepared the pull request, but community members are reviewing it. Let's do a review of that. So after the reviews, we can manage SPDX import, export, and editing in SW360. So unfortunately at this moment we cannot. But yeah, of course we already prepared the pull request. If you go to my company's branch, you can try it. Okay. We missed the 17 release, but I hope we can do it. Yes, yeah, I hope within a few weeks everybody can do it. Thank you very much. Thank you. |
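The per-project dependency graphs described in the SW360 talk above can be sketched as a small data model. This is only an illustration of the idea, not SW360's actual schema; the project names, the `name@version` keys, and the version numbers are assumptions chosen for the example.

```python
# Each project carries its own dependency graph, so two projects can link the
# same release (minimatch) to different versions of brace-expansion.
# Release identifiers and versions below are illustrative only.
project_graphs = {
    "project-1": {
        "minimatch@3.0.4": ["brace-expansion@1.1.11"],
        "brace-expansion@1.1.11": ["balanced-match@1.0.0"],
        "balanced-match@1.0.0": [],
    },
    "project-2": {
        "minimatch@3.0.4": ["brace-expansion@1.1.8"],
        "brace-expansion@1.1.8": [],
    },
}


def transitive_dependencies(graph, root):
    """Collect every release reachable from root, direct or transitive."""
    seen = []
    stack = list(graph.get(root, []))
    while stack:
        release = stack.pop()
        if release not in seen:
            seen.append(release)
            stack.extend(graph.get(release, []))
    return seen


for project, graph in project_graphs.items():
    print(project, transitive_dependencies(graph, "minimatch@3.0.4"))
```

With a single global link table, the second project's registration would overwrite the first project's links; keeping one graph per project avoids that conflict.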
AMENDMENT: SBOM with the Yocto Project for Automotive Grade Linux
Intro and lessons learned |
Alright, good morning everyone. My name is Jan-Simon Möller. I work on the Automotive Grade Linux project, and today I want to talk about how we produce our SBOMs: what we evaluated, what we did, and some lessons learned. If you want to reach me, just find my email or find the AGL IRC channel or whatnot; there you can contact me. Okay, in a nutshell, AGL is an open source platform for different uses in the car. We started with infotainment. We also have an instrument cluster profile and a telematics profile, and we are also working on the software-defined vehicle. There is a virtualization expert group and all of that. Code first, so you can go to our website, you can download pre-built releases, you can clone the stuff, rebuild it, everything is there. And we build with the Yocto Project, so we are essentially a collection of layers: Yocto plus some automotive software and tooling. So for SBOMs, things started around three years ago when one of the member companies looked into how to generate SBOMs kind of early. They were looking for an in-house solution, and they were basically developing that within AGL, presenting and doing stuff. We encouraged them to do that upstream and have a repo within AGL, and out of that, we then told them, you know, that should actually go more upstream. So that ended up on git.yoctoproject.org, and that's meta-spdxscanner. Initially there was just one tool supported in there, and that was upload to FOSSology. They were looking into a combination of FOSSology and SW360. That's what they were evaluating, and that basically gives you: Yocto build, upload to FOSSology, FOSSology will do the scanning, and then later on you move the data into SW360. That was their plan. In principle, that's a post-build approach. You take the sources that were exported from the build, the patched sources, and then do analysis on them. All of that predates the now available create-spdx in Yocto.
So that was before that time. To make that work, you need to set up a FOSSology server. You need to upload the sources. It will then run, I think, five different scanners on them, and then essentially you get the results for manual review and correction and whatnot. So you really need to sit down, inspect the result, make decisions where the scanners are unsure, and make a final verdict, and then you can output the data and put it into other tooling. Meanwhile, there are at least three different tools supported in that layer. One is FOSSology, the other is ScanCode, and the third is an uploader for a commercial tool. Later, Joshua Watt, who speaks right after me, added support for exporting SPDX files right during the build from Yocto. So the difference is that this happens right at the build stage, with all the data and metadata we know there, and it does not require an external server. It uses the available metadata we have. So it's faster for us. It runs during the build, and basically close to no additional resources are consumed. We now have that enabled. So for our releases and the like, you'll find the SPDX files right next to the artifacts. Okay, great. What did we learn, essentially? It depends on whether you are an open source project or an in-house product and so on. For us, the FOSSology approach, or the approach with a scanner in general, I don't want to pick on one here, required way more CPU resources. You need to shuffle all the source tarballs up and down. It requires manual review, and for the open source project side, that was just too much, right? And actually we lost information once we went from the build to the external scanner. Basically, what does this belong to? Which build is this? Yes, you can partially solve that by folder naming and help out with that, but you'll lose a connection here. On the other side, it depends on your requirements. If your legal department in the end says we have to scan, right?
Because even if we get most of the artifacts from, let's say, a supplier, we still add something, right? Or we need to know for sure. Then you have to scan, period. And that's what actually happens for us. Our members, essentially, have to scan because they add stuff on their own. So in the end, they have to scan their final stuff anyway, right? So for us, we then took the faster route. We take the analysis during the build and use that. And there is basically one thing that we have to solve at a more global level. With the data we have and provide, we can basically say, okay, these are our sources that you consume. Does the legal department accept that and trust it? Or will they go ahead and say, we have to inspect everything again? So that's a crucial point. Yeah, so right now, we are basically at the stage of: all right, we can create the SPDX files. But how can I consume them? How can I present them? Basically, SBOMs are relatively new in the end, so the tooling is still evolving. And for us, we are looking at how we can present this in a way that makes it easily consumable. Essentially, let's say for our CI purposes, we would like to know: is there anything that was added or that changed? So the diff is interesting. Yeah, that's an essential next step for us. All right, questions? So, Josh will detail that. Yes, so what information goes into the SPDX files here? Yeah, that will come in the next talk. Yeah, so I don't want to steal Josh's thunder. He has it in his slides, I know. So I think your slide probably answers this. You're producing SBOMs, but you're not doing anything with them. So it's just basically, have I created a file? Yes. Yeah, right now, we are at the stage of: okay, check, SBOMs created. Actually, I failed now a decision to say the build's not long. I've got to go back and change the build. Yeah, no, no, okay, no, no.
So what are you doing differently from what you were doing three years ago? We were just generating release notes. Yeah, yeah, so you're not... so okay. Yeah, yeah. So, I mean, we are using the repo tool and the Yocto recipes. So we can easily say, that's the diff in the recipes. So we changed this and this and this and that. But back then, no. Yeah. No, not yet. So there's two questions. How much of this code was used from the original Double Open project that was in Yocto? This is the one that created the whole SPDX generation in Yocto. This is, I think this is the original code. Do you know? That's a question for Josh. Next talk. Why do you need the presentation, the visualization tool? Why are you reinventing the wheel if you already have a couple of tools already doing that for presentation? Why are you working on this now? It would be the next step for us. So I'm not saying we develop this. Basically, we are looking now to start using what exists. I'm not saying we are reinventing the wheel. I'm going step by step. So I'm an adopter. Yeah. So that's why I'll sit in and listen. As to the problem you mentioned before about the choice of using FOSSology review, we face the same problem in our project. And the way we solved it is to decouple the two processes. So you provide input for FOSSology with one more pipeline. Yeah. And then leave it, the only thing work. And then when the data is ready, you can import them in subsequently. Yeah. So the only way, because in this way you can provide input to the audit team so they can work timely before the release. Yes. And then you basically feed that back into, I mean, if you have a release build, right? For release builds, we have, we can. No, but we do that also from the time. Okay. Okay. Thank you very much. Thank you. |
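The CI question raised in the AGL talk above ("is there anything that was added or that changed?") can be sketched as a small SBOM diff. This is a minimal sketch, assuming each SBOM has already been loaded as a single SPDX 2.x JSON document with a top-level `packages` list whose entries carry `name` and `versionInfo`; the package names and versions are made up for illustration, and this is not an existing AGL or Yocto tool.

```python
def package_versions(spdx_document):
    """Map package name to version for one SPDX 2.x JSON document."""
    return {
        pkg["name"]: pkg.get("versionInfo", "")
        for pkg in spdx_document.get("packages", [])
    }


def diff_sboms(old, new):
    """Report packages that were added, removed, or changed version."""
    before, after = package_versions(old), package_versions(new)
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(name for name in common if before[name] != after[name]),
    }


# Illustrative documents; in CI these would come from json.load() on two builds.
previous_build = {
    "packages": [
        {"name": "busybox", "versionInfo": "1.35.0"},
        {"name": "zlib", "versionInfo": "1.2.11"},
    ]
}
current_build = {
    "packages": [
        {"name": "busybox", "versionInfo": "1.36.0"},
        {"name": "openssl", "versionInfo": "3.0.8"},
    ]
}

result = diff_sboms(previous_build, current_build)
print(result)
```

A CI job could fail or request review whenever `added` or `changed` is non-empty, which turns "check, SBOM created" into an actual gate.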
AMENDMENT: Automated SBoM generation with OpenEmbedded and the Yocto Project
A case study of automated SBoM generation in meta build systems |
Hi, my name is Joshua Watt. I'm here to talk to you today about automated SBOM generation, using a case study of the way that we generate SBOMs in OpenEmbedded. A little bit about me: I've been working at Garmin since 2009, and we've been using OpenEmbedded and the Yocto Project to do embedded system development since 2016. I'm a member of the OpenEmbedded Technical Steering Committee, and there are all the ways you can contact me if you're interested later; I'll post my slides after my talk. So we're all hopefully familiar with what an SBOM is. We use it to describe what software components we have in our system, what we know about them, what we don't know about them, and, importantly, what the relationships between them are. So why are SBOMs important? If we're using software ourselves or allowing other people to use it or shipping it to customers or whatever we're doing with it, we want to know what's in our software, and we want to know where it came from and what versions those things are, at a very minimum. If there are software licenses, we want to know if we need to do anything to comply with them or things like that, or make sure they're not being used improperly. We don't want to expose ourselves or people using our software or customers or whoever it is to risk by having software that's been tampered with, either maliciously or unintentionally, and we also want to know if any vulnerabilities come up after it's shipped, so that we can fix them if necessary, or know if it's vulnerable to exploit. And really the question that we want to answer is: can we trace the binary things that we have given to people back to the source code that produced them? Often when we talk about SBOMs, we talk about them as being nutrition information for software, and I really do like this analogy. I think it easily encapsulates something that everyone is familiar with, which is a standardized way of encoding something.
For SBOMs, we're trying to standardize the way that we encode information about software, just like nutrition labels try to standardize the way that we communicate what's in our food. Most people can look at a nutrition label and have an understanding of how it works, so we want SBOMs to be the same way. You can look at the SBOM and it's a way of encoding what we have. I think this is a great analogy, but it is missing a few key pieces, and the pieces that it's missing are really the supply chain part of the analysis. So it can tell us what's in our software, just like a nutrition label tells us what's in our food, but it doesn't tell us how it got there. A nutrition label isn't saying this grain came from here or whatever, and that's the part that we're sort of missing with SBOMs that we would like to know, and that's what this talk is about. So I don't have a nice analogy for how to communicate a supply chain that's like the nutrition label, but I do come from a consumer manufacturing background, so I do understand supply chains. So we can relate software supply chains to physical supply chains. When you have a physical supply chain, say you're making some consumer electronics, you've got all these steps along the path to the completed product, and you need to know where every component comes from, to make sure that all the right components are in the right place at the right time to be manufactured. You need to know what's being combined in every step, for the same reason, and you need to know where this combination takes place, because in modern supply chains these steps can be spread out geographically all over the world. They can also be spread out over time, so if you produce 10,000 of one thing and put them in storage, and then you pull those out, you might need to know: are these ten years old, or are these five years old? How old are these parts? When were they manufactured?
And when we talk about software supply chains, we have basically the same questions. We need to know where all the components that are in our supply chain came from; however, in this case we're usually talking about things like source code that we've compiled, and the tools that we use to compile it, instead of physical components. We need to know what has been combined at each stage. Did we take this library from this other project and put it into what we're currently working on? Does it pull in some dependencies from somewhere? Things like that. We need to know where this combination takes place, although we're probably less concerned with the physical location than with the build host that's doing the combination, and who did it. Potentially we would like to know who did this step of our supply chain. And then, when did it occur? Was the software compiled ten years ago? That's probably got vulnerabilities that we should take a closer look at. To help answer some of these questions, SPDX has a build working group that's been working on the build profile, and it will hopefully be released with SPDX 3 in a couple of months, or whenever that is, soon. It's designed to answer the questions of when a build was done, so it can record timestamps for when builds happen, and who wanted a build done. So this is going to be the person who initiated the build, or wanted the build done, or did the build themselves, depending on the circumstances. And that's distinct from who actually performed the build, which could be a person, if they're manually typing in the command to do the build, or it could be a service like GitHub Actions or something like that. That's why we have the two different "who" elements in there: they distinguish between the person who clicked the button and the service, GitHub Actions or whatever yours is, that actually did the build.
Then, how the build was done: this is going to be tool-specific information about how the build was performed, like the command line arguments or things like that. It's important to note that the build and run time dependencies are already captured by the SPDX core specification, so we don't include those explicitly here, but they're already included. Where the build was done: this is going to be the build host, the computer on which the build was performed. This might be as complicated as an entire other software bill of materials; if you have one that describes the system you're building on, you could link to that and know all the information about the build host also. But it would also capture the tools used, like if you have compilers or host tools or things like that. And then what you're building is already covered by the SPDX core profile, because it can describe packages and files and things like that. So one of the key points is that it's really important to try to generate build SBOMs at actual build time. To try to explain that a little further, I'm going to compare generating SBOMs at build time with two other ways that SBOMs are commonly generated, although these aren't the only other ways they're generated. There are source SBOMs, which are generally, this is like REUSE or something that's just included with the source code, which is really cool. And then you've got post-mortem SBOM analysis, so this would be the tools that run after you have the final artifact, to try to scan it and say you're vulnerable to these vulnerabilities and things like that. They try to determine information after the fact, from the final artifact. And obviously I'm trying to say that we should generate build SBOM information at build time. I'm not trying to say that the other two things are just terrible and you should never use them. They all have their strengths and weaknesses. I'm just trying to explain why I think we should generate them at build time.
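To make the build-profile questions described above (when, who wanted it, who performed it, how, and where) a little more concrete, here is a minimal sketch of a record that a build tool could write out for each build. The class and field names are assumptions made for illustration only; they are not the SPDX 3 build profile schema, and the values are invented.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class BuildRecord:
    """Hypothetical per-build record; NOT the SPDX 3 build profile schema."""
    started: str        # when: ISO 8601 timestamp for the start of the build
    finished: str       # when: ISO 8601 timestamp for the end of the build
    requested_by: str   # who wanted the build done (e.g. the person who clicked the button)
    performed_by: str   # who actually ran it (a person or a CI service)
    command: list[str]  # how: tool-specific invocation
    build_host: str     # where: identifier of the machine that ran the build
    host_tools: dict[str, str] = field(default_factory=dict)  # tools used, name -> version


# Invented example values, for illustration only.
record = BuildRecord(
    started="2023-02-05T09:00:00+00:00",
    finished="2023-02-05T09:42:10+00:00",
    requested_by="alice@example.com",
    performed_by="ci-runner-7",
    command=["bitbake", "core-image-minimal"],
    build_host="builder01.example.com",
    host_tools={"gcc": "12.2.0", "python3": "3.11.1"},
)

# Serialize next to the build output so it can later be linked from an SBOM.
print(json.dumps(asdict(record), indent=2))
```

Keeping `requested_by` and `performed_by` as separate fields mirrors the distinction made in the talk between the person who asked for a build and the person or service that actually ran it.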
So if we talk about when something was built: source SBOMs obviously can't know this, because they're not concerned with when something is actually built. Build SBOMs should be able to figure this out from when the thing is built. You can record timestamps pretty easily. And post-mortem analysis may or may not be able to figure this out. It just depends on whether that information happens to be encoded in whatever you've produced. Then we talk about how, so build time dependencies. Source SBOMs might be able to capture this if you're talking about something like Cargo or npm that explicitly encodes specific versions of dependencies in the source code. You could very easily figure out what the build time dependencies are. Otherwise, if you're talking about shared libraries or something, it might be able to know those, but it wouldn't know them concretely. So you'd know, like, I need OpenSSL, but you wouldn't necessarily know the specific version of OpenSSL that it built against. At build time, you should be able to know all of this. You basically have to, in order to correctly build the software. For post-mortem analysis, you might be able to figure it out, probably with some sort of heuristics, and static libraries are always very problematic with this. It can be very difficult to tell if a given executable has a static library in it or not, because it's not recorded anywhere in the executable. So those can always be very tricky to trace back to their origin. Run time dependencies are a somewhat similar story: with source SBOMs you could probably know what they are, but again probably not concretely. With build SBOMs, you could know this if you're doing complete packaging. So if you're generating final packages, like Debian packages, or Fedora packages, or opkg packages, or whatever, you could know what these run time dependencies are, even concretely.
And for post-mortem analysis, for shared libraries you can actually figure this out pretty easily, because it's in the ELF header, but for anything that's dynamically loaded at run time, like if you do dlopen or something like that, you probably can't figure that out very easily with post-mortem analysis. As for your build environment, I believe source SBOMs don't care about this. With build SBOMs, you should be able to know this information, and with post-mortem analysis, maybe you could figure that out. I don't know; if it was encoded in the executables, maybe some of that information could be known. So there are a couple of advantages to generating supply chain information from your build tools at build time. I like to say that they're authoritative, because they have first-hand knowledge: they're the ones actually doing the build, so they should know what's actually happening at each step. And likewise, they're very accurate. There shouldn't need to be a lot of guessing from your build tools about what's going on at each step, unlike post-mortem analysis, which tends to, I think, use a lot of heuristics or things like that. And they're comprehensive: they can analyze a lot of different steps in your build, especially if your software supply chain is very deep, which I think it is for a lot of things. So they can generate a lot of information about your builds, as we'll see later. And they're also able to analyze things that are difficult, if not impossible, to analyze at other steps; static libraries in particular can be very difficult to trace back, at least as far as I know. So what kind of things could generate this information? Starting from the top down, at sort of the highest level, you'd have things like container build systems. So this would be like docker build or Buildah or something like that. As you move down, you get into what I call the meta or distro build systems.
This would be like OpenEmbedded, which is what I'm going to give an example of in just a few minutes. Debian and Fedora could generate this every time they generate packages. It would be a good time to do that. And then if you go down a further step, you've got the package build systems. It's not a great name for them, but this would be things like Meson or CMake or Autotools. They could all generate this information also, with what they know about builds. And you could go down an even further step and say, well, maybe GCC should spit out this information, and maybe it should. That's also something that could happen, and then it could sort of flow up the build stack as you go. So I'm going to give an example of what we do in OpenEmbedded to generate SBOMs. If you are unfamiliar with OpenEmbedded and the Yocto Project: OpenEmbedded is a community-driven project that provides the OpenEmbedded-Core layer and the build system, which is called BitBake. And the Yocto Project is a Linux Foundation project that provides the Poky reference distribution and runs a whole bunch of QA tests to make sure everything stays high quality. They set release schedules. They provide funding for personnel to work on the project full time, and servers and things like that. And they provide excellent documentation. You should go check out our documentation. The purpose of these projects is to build primarily, but not exclusively, embedded systems. So we do have our traditional image that you could flash on a Raspberry Pi up there; colloquially, we call these target images. We actually can produce images for a whole bunch of different things that I'm not going to go into in great detail here. I've got a bunch of other presentations on this that I have links to later. So when people want to build stuff with OpenEmbedded, what they start with is some source code, some metadata, and some policy information.
And they chuck all of this into this magical tool called BitBake. It spits out this target image that we talked about, and then you flash that on your widget and profit, right? It's great. A little deeper under the hood, the way that this works is that we start off with some host tools. This is the minimal set of things that you need to build with BitBake: your host GCC, Python, and a couple of other fairly standard dependencies that run on your host. We're going to take those host tools and we're going to parse some recipe metadata that says how to build some source code. And that source code is going to be used to build what we call the native tools and the cross-compiler. The native tools are still tools that are designed to run on your host system. An example of this might be the protobuf compiler, right? We actually build that ourselves and don't require you to provide your own. We also build our own cross-compiler at the same time, so you don't even need a cross-compiler on your host system. We then use those native tools and the cross-compiler to process more recipe metadata that's going to take some more source code in. And this is actually going to cross-compile and build what we call your target packages, which are designed to run on your final system, be it x86 or ARM or MIPS or PowerPC or RISC-V or whatever it is. And then we process yet more metadata, and this one says how to combine all these target packages to make your root file system and your kernel and all of the other things that you need to actually have your target image.
The way that BitBake keeps all of this sane and tracks the dependencies is that it uses a sophisticated method of hashing, where each step along the way, called a task, has a hash that is the encapsulation of all of the dependencies of that task, all of the variables that affect that task's execution, and all of the code that it's actually executing. That gets combined into a single hash, and that hash is then the input, as a dependency, to every task that depends on that one. So you get this chain of hashes all the way down, from the recipes that you start with to your target image. So what happens is, if, for example, something about the protobuf recipe changes, that's going to change the task hash for that recipe, and that's going to cause that protobuf tool to be rebuilt. It's also going to change all of the downstream hashes that depend on it, all the way through any native tool that depends on it, any target packages that depend on it, and all the way to the root file system that indirectly depends on it. And so all of those things will be rebuilt by BitBake when you change that particular thing. Just because of this hashing mechanism, OpenEmbedded and BitBake start out with a very strong software supply chain, because we have these very strict rules about how these hashes change, and this causes everything to be rebuilt. So you can actually trace it back from your target image to the target source code that produced all your target packages, and you can even trace that back to the cross-compiler and the native tools that we built. And there are ways, which you can see in other presentations I've done, that you can even trace this back to your host tools if you really wanted to do that, and have that really deep supply chain.
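The chained hashing described above can be illustrated with a small sketch. This is only a toy model of the idea (a task's hash covers its code, its variables, and the hashes of the tasks it depends on); it is not BitBake's actual signature code, and the task names, variable names, and version strings are invented for the example.

```python
import hashlib


def task_hash(code: str, variables: dict[str, str], dep_hashes: list[str]) -> str:
    """Toy task hash: covers the task's code, its variables, and its dependencies' hashes."""
    h = hashlib.sha256()
    h.update(code.encode())
    h.update(b"\0")
    for key in sorted(variables):
        h.update(f"{key}={variables[key]}\0".encode())
    for dep in sorted(dep_hashes):
        h.update(dep.encode())
    return h.hexdigest()


def build_hashes(protobuf_version: str) -> dict[str, str]:
    """Hash a tiny, invented dependency graph from native tool down to root file system."""
    protobuf = task_hash("build protobuf-native", {"PV": protobuf_version}, [])
    busybox = task_hash("build busybox", {"PV": "1.36"}, [])
    app = task_hash("build app", {"PV": "1.0"}, [protobuf])       # needs the protobuf compiler
    rootfs = task_hash("assemble rootfs", {}, [app, busybox])     # combines target packages
    return {"protobuf-native": protobuf, "busybox": busybox, "app": app, "rootfs": rootfs}


before = build_hashes("3.21")
after = build_hashes("3.22")  # only the protobuf recipe changes

# Everything downstream of protobuf gets a new hash; busybox is untouched.
changed = sorted(name for name in before if before[name] != after[name])
print(changed)  # ['app', 'protobuf-native', 'rootfs']
```

Because each hash feeds into the hashes of its dependents, a change anywhere upstream is visible all the way down to the root file system, which is what makes it possible to trace an image back to the inputs that produced it.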
So basically what we do in OpenEmbedded is, at each step along the way here where we're building something, while we're building it we also spit out an SPDX document that says this is what we did at this step. Then at the very end we take all of the SPDX documents that went into our target image, or into the native tools that were used to build the target image, and we put them all into one big archive. And we have a rich set of dependencies that we actually report when we do this. I'm not going to get into too much detail here; again, there are other talks I've given that describe all of this in more detail if you're interested, and these are those talks if you want to see them. They talk a lot more specifically about OpenEmbedded and SBOMs. So when you do this, we can currently generate SPDX 2.2 in JSON format, and I did this for a minimal QEMU ARM64 system. The root file system was 14 megabytes uncompressed, the Linux kernel was 20 megabytes, and we had 158 megabytes of SPDX documents. So yeah, it's a lot. I was actually going to post the archive, but yeah, it's a lot of data, and we're not even reporting on everything yet. Some of that is the JSON encoding and things like that, but it's just a ton of data. So the question is, do we really need all of this? There's a lot of stuff to lug around. And I think you can hark back to that nutrition information: as a consumer of a given food product, wheat is wheat, right? I don't necessarily care how the wheat got into my crackers or whatever it is that I'm eating, but if I'm a manufacturer of that food and I need to track down something that went wrong somewhere, to do a recall or something, then I really care where that came from.
And so I think the same analogy could probably be made here: it's possible your end consumers don't really care about your software supply chain, but if you're manufacturing something or building something, you probably do, so you can trace down problems and things like that. And there's always the possibility that there could be regulatory requirements for this in the future; that seems to be a thing that's happening now. So if you're trying to track down a supply chain attack, or something's gone wrong somewhere, then you will definitely want this information. So if you work on a tool that does something that looks like building, please consider adding build profile support to your tool, because it's actually really not that hard. For OpenEmbedded, we already had all of this information; it was just a matter of encoding it, as hopefully was somewhat clear from what we were doing here. It was just a matter of writing out the document that had it and then combining it at the end. And with SPDX 3, the combination at the end is going to be a lot better than it was with SPDX 2. And that's all I had. Are there any questions? Yes, you have many megabytes of SPDX, so I suppose you don't have one big file, but you have multiple files related by SPDX relationships? Yeah, it uses... Can you repeat the question for me? Oh, sorry, yeah, so we have a whole bunch of documents. We're not generating one big SPDX document, we've got a whole bunch of small documents, yes. So we use a whole bunch of external document references. This will be better in SPDX 3, but in SPDX 2 there isn't a standardized way to combine documents together, so we just throw them all into one big tarball. It's not the greatest, but it does put them all in one file for consumption.
They do exist, at the point of build, on the file system as individual documents that you could package up however you want; it's just that, for ease of our end users, it's easiest if we put them all in one big tarball, and they can extract it and do whatever they want with it. But yeah, there's a whole bunch of external document references in our output, yes. A few questions. One: the slides aren't there on the presentation page, will you be able to post them? Yeah, I will post the slides, yes, I will do that. And as part of an official release, would you ship this tarball with the release of your project? Yeah, so there's an option you can turn on in your build. I don't remember if we turn it on by default, but it's very easy to turn on, so you turn it on and then you just get this tarball as part of your build. So what happens in OpenEmbedded is you generate a file system, call it myfilesystem, and that's your root file system, and then alongside that there'll be myfilesystem.spdx.tar.gz or whatever it is, I forget off the top of my head, but yeah, it's that simple. I think I know the answer given the size of it, but you don't deliver the SPDX as part of the image? No, yeah, we do not deliver the SPDX as part of the image. So is SPDX not providing some integrity, so that you could say to an end consumer: yes, I've got this SPDX and I've got the image, and the two things are aligned? Right, so how do you trace the SPDX back to the image? There's extensive checksumming in the SPDX itself, so every file in that root file system is going to be expressed in the SPDX, and the SPDX will have its checksum. So at the file level, you can say, you know, /usr/lib/foo.so, and then I go look at that in the SPDX: are the checksums the same? Then they're valid, right? So who's using the SPDX? Have you created this image? Uh-huh. Is anybody using the SPDX? The image? Are you using it? No, I'm not.
Yeah, sorry, so he's asking who's using it, and the short answer is I don't know. A lot of people ask questions about how to generate it, so I assume they're doing something with it. But I don't personally generate this yet, and that's just because of where I work, not because I'm... Can you say, in relation to that, that there is a list of general consumption tools that are available? Yeah, there is a list of consumption tools available, yeah, sorry. Go ahead, I think you're next. The build profile stuff, the supply chain part, to me looks like the SLSA provenance definition. Did you also look at that solution? Because what I like about that solution is that it's separate from the supplementary material, and it can be consumed in a different way. Did you ever look into that first, before doing a specification? Yeah, so the question is, did we look at SLSA? Yeah, we had people from SLSA on the build profile working group, so a lot of what SLSA did fed into what we're doing here. I think we wanted it to be more closely integrated with the SPDX core profile, so that you could say, this is all the licensing information and the build information and the supply chain information together, so that's why we're including a build profile. I think tooling could come along later on if you wanted to, say, strip out all the build profile stuff because you don't want to ship gigabytes of data to your customers. Sure, I think tooling can come along to do that, and that should be fairly trivial, but yeah, that's why we chose to do it that way. What's the main source for the generated SPDX: is it the recipe from BitBake itself, or is it the component that's generated at every step? Sorry, I didn't quite understand the question.
So basically, you have some initial source of information that goes into the SPDX. Will the initial source be just the BitBake recipe, or the component itself? Right, so the question is where the initial information comes from. The recipe describes how to build, and we report on both, actually: we report on the source code, and the recipe, and the thing it built, currently. So we can do all of those things. And I'm done, I'm sorry; I can answer more questions afterwards if you want. But I've got to wrap up. Thank you very much. For people leaving, please leave at this point; the rest of you know what to do. As a reminder, we have chocolate, snacks, and things here, if anybody wants some. |
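The file-level check described in the Q&A above (look up a file such as /usr/lib/foo.so in the SPDX and compare checksums) can be sketched as follows. The sketch relies on the SPDX 2.2 JSON layout, where a document has a `files` array whose entries carry a `fileName` and a `checksums` list of `algorithm` and `checksumValue` pairs. How OpenEmbedded names files inside its documents, and how its archive is laid out, are not covered in the talk, so the path normalization and the demo data here are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def normalize(name: str) -> str:
    """Turn './usr/lib/foo.so' or '/usr/lib/foo.so' into 'usr/lib/foo.so' (assumed convention)."""
    if name.startswith("./"):
        name = name[2:]
    return name.lstrip("/")


def spdx_sha256_index(doc: dict) -> dict[str, str]:
    """Map each file in an SPDX 2.2 JSON document to its recorded SHA256 checksum."""
    index = {}
    for entry in doc.get("files", []):
        for checksum in entry.get("checksums", []):
            if checksum.get("algorithm") == "SHA256":
                index[normalize(entry["fileName"])] = checksum["checksumValue"].lower()
    return index


def verify_rootfs(rootfs: Path, doc: dict) -> list[str]:
    """Return a list of problems; an empty list means every recorded file matches."""
    problems = []
    for rel, expected in sorted(spdx_sha256_index(doc).items()):
        path = rootfs / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            problems.append(f"mismatch: {rel}")
    return problems


# Demo with an invented one-file root file system and a hand-written SPDX fragment.
with tempfile.TemporaryDirectory() as tmp:
    rootfs = Path(tmp)
    lib = rootfs / "usr" / "lib" / "foo.so"
    lib.parent.mkdir(parents=True)
    lib.write_bytes(b"original contents")
    doc = {"files": [{
        "fileName": "./usr/lib/foo.so",
        "checksums": [{"algorithm": "SHA256",
                       "checksumValue": hashlib.sha256(b"original contents").hexdigest()}],
    }]}
    clean = verify_rootfs(rootfs, doc)      # file matches the recorded checksum
    lib.write_bytes(b"tampered contents")
    tampered = verify_rootfs(rootfs, doc)   # file no longer matches

print(clean, tampered)
```

A real check would load each document from the extracted tarball with `json.load` and would also need to decide what to do about files in the image that no document mentions; both are left out of this sketch.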
Hermine: converting SBOMS into legal obligations |
So, hi everyone. We're about to present the Hermine project. For those who were here on Friday: I can't talk close to the microphone. It's the same project that we mentioned on Friday, and hopefully this presentation will be easier to understand, also because I'm with Nicolas Toussaint from Orange. We'll introduce you to the general idea of the project. Thank you. Okay, speaking loud. So what's Hermine? An ermine, apart from being a really nice animal, is a word which does take an H in French. It's quite a young project; it has existed for a year now, initiated by Inno³ and joined by five other large French companies. We're an open project, and we're trying to do things properly. We haven't joined an organization, a foundation, yet, but that's planned. We have three committees, and that's important. There is a legal committee because, as you'll see later, it's all around legal aspects, made by jurists, mostly for jurists. And then, well, you know, we do use Affero GPL, ODbL, and so on and so forth. Can you move to the next? Right, so we're building something new, well, we think so, and we do rely on existing, well-established tooling. All of you have been hearing about these all day. Let's move to the next one; you probably know these ones. Right, so the nice part, what it's all about: the idea is to really implement your open source policy. On one hand, we take all the license texts from the open source licenses. We break them down into obligations: do you have to disclose the source code, mention the authors, and so on and so forth. On the other hand, we take SBOMs from the project, and we enrich them with legal and technical context. In the middle, that's where the magic happens. We apply your open source policy, and out of this, on one hand, we get a validation or not. Is it okay to use that license in that context?
And on the other hand, you get a nice list of the obligations you really have to follow, because you have plenty of obligations and plenty of contexts. Sometimes you have to follow them, sometimes not; it depends on the context. So that's the big picture, and that's over to Camille next. So before talking about SBOMs themselves, you have to have a kind of preparatory phase, and that's license analysis. The tool was mainly designed for IP lawyers who are not familiar with open source licenses. What we wanted was for them to be able to save time while still being confident in the decisions that were made. It was also important that it's a shared tool, but of course every company has its own open source policy, so it had to be flexible enough. So the first aspect is a kind of pedagogical framework that allows lawyers to systematically and consistently analyze open source licenses, and that also means that, if licenses are analyzed systematically, we can handle the decisions programmatically to a certain extent. And as I was saying, the idea is also to share interpretations, because if you have a broader consensus about interpretations, then that will increase legal predictability and reduce legal risks. So when we do license analysis, it basically breaks down into three parts. The first part is just the global characteristics of the license, like the copyleft level and the types of rights granted, like whether you have patent grants, stuff like that. I mean technical things like the choice of law or venue, et cetera. The more important part is your policy status: whether you always allow the license, never allow it, or allow it depending on context. If you allow it depending on context, well, you'll have to specify which contexts it's allowed in. And if your context is based on simple enough facts, you will be able to automate that. So, for instance, you can have two subcategories of context. You will have a technical context.
Typically, that's whether and how the component has been modified, and the type of linking it has with your code base. But you can also associate it with a category of your products, meaning a product for which the IP will be treated differently. Typically, in many cases, you won't accept having GPL v2. But if you do embedded Linux, of course you will accept it. So that does not depend on the technical nature of the third-party component or its linking to your code base. It really depends on the business expectations that you have for the final product. And so we can partly automate that. And the third part is the breaking down of each license into obligations. Each obligation will be triggered at two levels. The first one is whether the component has been modified or not, because in many cases you have many obligations that you don't care about, just because you're redistributing unmodified third-party components. And the second one is the type of exploitation, which means: do you distribute it as source or as binary, or do you provide network interaction, et cetera? And also, when you break down different licenses, you will realize that very often you have very similar obligations. They're not phrased in the same way, but you can decide, and that's the choice of the lawyer, that they are actually equivalent in the way they have to be respected. So you can relate them to a generic obligation, and then you will only care about the generic obligation. You will only care about specific obligations when they are not related to a generic obligation. And that's about it. So now that you have your policy ready, you can be ready to handle SBOMs. And I will pass you the microphone. Thank you. OK, so that was the right-hand part, the license obligations. Now we're looking at the left-hand part. How do you handle the SBOMs?
So today, we mostly work with the evaluated report from ORT. You've heard about it this morning and all the time. We're working on supporting SPDX as well. There's a lot of work to be done in this area. And we will have to support CycloneDX in the future as well. Right, so when you take this SBOM, you have plenty of information, but you don't have all the information we need on the context. There are two types of information we want to add. We want to add the technical details. Sometimes you can get them from the SPDX report, and I guess there's more automation we can do in this field. But we definitely need to manually add some extra information, like the technical linking: how your component is linked to your application. The type of exploitation is what you actually do with your components. Are you going to distribute them on the Internet? Are you going to distribute them to your end users, and so on and so forth? Is it SaaS? You also need to specify whether you have modified the source code. That makes a big difference to how you're going to trigger the license and what you're going to have to do. And we also recently added some funding information. Where does the money come from for a given component, a given project? The scope can also be really handy for automating all kinds of things. Is that a development dependency, a build dependency, and so on and so forth? There's plenty we can do to automate further. But yeah, next please. Right, so there's quite a lot of work to validate the SBOMs. And what you see here is really from a legal, jurist point of view, maybe more than a technical one. So when you get your SBOM in, well, obviously you need a valid SPDX expression. If you don't have the SPDX expression, you don't know what you're talking about. You need to know what license you have. Number two is a very specific case, where you want to give a choice to the user. You say it's either GPL or MIT.
Quite often the developers tend to write that it's this one and this one, but what they really mean is this one or this one. So we added a specific step to fix this particular problem. The type of exploitation we mentioned earlier. Also, when you have this choice, MIT or GPL, at some point you have to decide which one you're going to select. And all those licenses have to be reviewed by legal people; that's what Camille explained earlier. If that has not been done, then you have to go back to your jurist and say, please analyze this license. There are plenty of things that we did to automate this, because it can be a lot of work. We're not going into details now. But in particular, we're working on exporting the curations to ORT, so you don't have to redo this in the future. Okay, so actually the interest of the tool is to combine the two aspects. When you combine the two aspects, you end up with a list of the obligations that you actually have to follow for your release. So the idea is that it really gets rid of everything that is not actually needed. We have also introduced the idea of a core set of obligations, because it's not always efficient to stick to the bare minimum of obligations. Typically you have licenses like 0BSD or the Unlicense, and they don't ask you to put a license in the documentation. But it's more efficient to have the same process for every license, and you have many other reasons to do so. It also allows you to classify obligations as part of something normal. You know that you won't have any extra effort or any special attention to put into your release, because it's in your core set of generic obligations. That gives you free time to actually consider what is specific and more important. And another thing that you get is that it gives you global visibility.
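As a rough illustration of the combination step described above (license obligations on one side, usage context on the other, plus a core set that always applies), here is a minimal sketch. It is not Hermine's data model or API: the class names, the placeholder license identifiers, and the obligation texts are all invented for the example and carry no legal meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obligation:
    """Hypothetical obligation with the two triggers mentioned in the talk."""
    name: str
    only_if_modified: bool         # triggered only when the component was modified
    exploitations: frozenset[str]  # types of exploitation that trigger it


@dataclass(frozen=True)
class Usage:
    """Hypothetical SBOM entry enriched with context."""
    component: str
    license_id: str
    modified: bool
    exploitation: str              # e.g. "binary", "source", "network"


# Placeholder licenses and obligations, invented for illustration only.
LICENSE_OBLIGATIONS = {
    "LicenseRef-Example-Copyleft": [
        Obligation("Provide corresponding source", False, frozenset({"binary", "source"})),
        Obligation("Mark modified files", True, frozenset({"binary", "source"})),
    ],
    "LicenseRef-Example-Permissive": [
        Obligation("Keep copyright notices", False, frozenset({"binary", "source"})),
    ],
}

# Core set: followed for every release, whatever the license strictly requires.
CORE_OBLIGATIONS = frozenset({"Include license text in documentation"})


def obligations_for(usage: Usage) -> set[str]:
    """Return the obligations that actually apply to this usage, plus the core set."""
    result = set(CORE_OBLIGATIONS)
    for ob in LICENSE_OBLIGATIONS.get(usage.license_id, []):
        if ob.only_if_modified and not usage.modified:
            continue
        if usage.exploitation not in ob.exploitations:
            continue
        result.add(ob.name)
    return result


unmodified = Usage("libfoo", "LicenseRef-Example-Copyleft", modified=False, exploitation="binary")
patched = Usage("libfoo", "LicenseRef-Example-Copyleft", modified=True, exploitation="binary")
print(sorted(obligations_for(unmodified)))
print(sorted(obligations_for(patched)))
```

Filtering on both triggers is what removes obligations that do not apply to a given release, while the core set keeps the routine steps identical across licenses.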
You can see the different components you have across your projects. For instance, if you use Log4j, you can search for Log4j and it will give you the different products and releases which are involved. And also, more importantly for us, you can know which of the third-party components you are most relying on, and it gives you a direct link to funding them. It's quite important to actually care about funding the third-party components you rely on. And that's about it. Thank you. So where are we now? The project is ongoing, obviously. We don't really have a 1.0 version yet. It's on the way. It's not really used in production yet. Maybe you do actually in a cube? Yeah, but we are not the typical users. Yeah, yeah, yeah. But we shall have one soon. If you have any questions, you can find us on GitLab.com. And we have been talking about this breaking down of the licenses into obligations. That's a lot of work. And some of it is already available on GitLab as well. So you can have a look at this. And yeah, if you want to come, we speak bad English, but we're friendly. You want to add something? Because we have some time. Yeah, yeah, yeah. I think. Just to say, on this particular side of the legal obligations, the idea is to also share with the project a set of licenses, a set of interpretations that can be shared by everyone. How are you going to deal with the fact that CycloneDX only has one license field and does not differentiate between conclusions and what's detected? Yeah, so just to repeat, how are we going to... Yeah, yeah, I'm doing that. How do we handle the fact that CycloneDX only has one license field? I have no idea. Do you? We have just started implementing it. Historically, we've always been working with SPDX, and that's quite natural, but we realized that... Well, some tools... Just to assume it's concluded when you see something there? Yeah, I mean, we'll have to see how it goes.
We have first started to submit some PRs to their specification, because we had additional needs like support for scopes and stuff like that. We know that they come from security. They don't come from licenses. These are two different worlds. So I think that they'll have to evolve at some stage. But they seem very open to suggestions. But we're still implementing it. Yes, Thomas? How would you say Hermine is different from ORT? Because ORT also has a policy engine. As you know, it basically started as a front end for ORT, because we wanted to have a graphical interface. Technically it's a Django app, and we have a REST API based on Django REST framework. We want to stay as close to ORT as possible, including its concepts. But it works a bit differently. And that's why, when we make curations, we wanted to make them exportable in ORT formats. And we also plan to be able to export your license policy in the ORT format. The idea is to be as compatible as possible. But the main difference is that it has a central database, just like when you use SW360, and it brings the same functionality. That's about it. We really want to stay as close to ORT as we can, and that's why we contribute to ORT when we can. Did I answer your question? Can I just ask? Sorry. Thank you. Yeah, there really is a lot of work around the licenses and sharing interpretations. So this is part... If you download Hermine and you run it, it comes with a set of decisions and interpretations that you can use, that you should actually read and make your own. But I guess that's on top of ORT. Sorry. Yeah, so that's not actually a question but a quick comment on the CycloneDX situation. The licenses field for your component is actually an array. So it's possible for you to give multiple licenses there. What might get trickier is actually attributing which files in the source tree.
It's not expressing that there are multiple licenses. Yeah. And they don't connect to other relationships. Those things are missing right now. What I was just asking: is there anything missing in SPDX that we should be adding into 3.0 that you need? Actually, maybe. For instance, we mainly rely on the evaluated model of ORT, which is kind of semi-official. I'm not sure it's really publicly documented. The evaluated model. It's public. Okay. Yeah, I mean the code is public, of course. But the kind of information we like is, for instance, that we have scopes, and we also have them in sub-projects. Because when you scan, you can have your composer.json and your package.json in the same project. So it's easy to say, okay, these are my dependencies for my back end, these are the execution dependencies. And it comes naturally. I think it could be implemented in SPDX, but I don't think it's present in a, okay. Because initially we wanted to stick to the standard as much as possible. So we said, okay, we will interface with ORT through SPDX. But it was not really convenient. Yeah, sorry. May I ask one question there? Because this is really important. Okay. You're focusing on copyright licenses. Do you cover other legal obligations? Because besides copyright there are other legal obligations where we need to deliver information to the customer. Sorry, could you speak up, please? Is it only open source obligations? We have other legal obligations, like privacy or security, where we need to deliver information to the customer. So currently, we are focused on legal things, mostly related to IP. We tend to also include export control, because that's quite easy. Regarding security, it's a whole different subject. And actually, in the first prototype, we had a link with VulnerableCode.
So it can be done, but it's not that trivial. And currently, all of the partners are treating the security aspect with their own tools. So it's not a priority for us. But yeah, we would be happy to be able to handle it. So did I answer your question? Okay, thanks. Sorry. Well, the view we take, at Orange at least, is that Hermine is one part of the whole thing. And we're going to work on interfacing it with some other tools, like Dependency-Track and that kind of thing. The point is to have one source, one database of all the components that we use within the group, so you can use it from different perspectives. How do you handle incompatible open source licenses, like GPL on one side and ACTI on the other, if they are combined? Good question. Okay, so the question was, how do we handle incompatible open source licenses? For the moment, we don't, but that was discussed on Friday, and it's not that trivial, because you first have to understand the relationship between the two components. Currently what we handle is the relationship between a third-party component and your code base. But you can have an incompatibility between two different third-party components, so you have to qualify the technical relationship between those two. And that's a bit tricky, but I think on Friday we realized that many people want to work on the question, and we really want to be part of this discussion. Yes, so let's... You can use ORT, which has OSADL support for license compatibility, so you can actually do that already. Okay. You can export from Hermine into ORT. OSADL is a German open source foundation, and they published a license compatibility matrix, which has been included in ORT, and then it was activated. Is it perfect? No, but it is possible. Yeah. OSADL has lawyers in-house. They're working on the legal side, but you shouldn't be there. Well, we do have a few lawyers on board, so...
And by the way, there's no competition. We had a very interesting discussion with OSADL, and we really plan to cooperate with them. That has been on our to-do list from the beginning. It's just that we lack time, but... Yeah, we really will work with them in the future on that, and anyway, the next one is interesting. So you have another question. Yes, please. That's true. Sorry, you're talking about SW360, or...? Okay. Yeah, just for your information, we contributed documentation back in the day, and we wanted to go the SW360 way, but that was before the project was reborn. No, because there were dislikes, so we needed it. I can talk to you about that a lot, but we really wanted to go the SW360 way, and we couldn't. So once again, we are all for collaboration. Oh, yeah, Bradley. Do you have any plans to assist your users in meeting their legal obligations, or are you just trying to identify what the legal obligations are and not actually assist in meeting them? Oh, no, I mean, we do. Sorry. I'm sorry. The question was, do we assist the users in meeting the obligations, or just identify them? Well, ideally, both, because the idea is also that for each generic obligation that you have, they can identify a generic process to do it, and to do it the right way. That means a process that will really be compliant with each license that requires it. For instance, providing the corresponding source code and stuff like that. So you're working on a system to prepare the corresponding source for you? No, no, no. Sorry. We don't do the execution of the obligation. Sorry, I misunderstood your question. We'd be really happy to be able to branch out, but I'm not sure it can be done automatically. I know that in some build systems, it can be done. But the problem is that it will really depend on each build system.
And for instance, just for having a notice file: there was a very interesting talk about build as bottom, and I really believe in that. But the problem, for instance, if you take the Node.js ecosystem, is that you have many different build systems, and some of them have this kind of automatic function to generate notice attribution. The only problem is that they're wrong. And so we're uncomfortable relying on something that we know is wrong. We can't pretend, oh, it's wrong, we didn't know. You know, that's a problem that we have checked. And for corresponding source code, we had very interesting talks about embedded systems, and I think those tools are trustworthy. So we prefer to refer to that, because they do what they are doing, and the process will be just pointing users to that. I don't know if I understand. We do have a REST API, so the idea is to provide one piece of the whole chain, and probably... Okay. Do you have a question? Thank you. That's all right. Thank you. |
A standard BOM for Siemens |
Thank you very much. Welcome to our talk on Standard BOM. We are here to share with you some of the experiences that we've had introducing a common SBOM format at a large company. And we also hope to get into a discussion with you about your experiences, and maybe things that you noticed that we've missed or that we should or could do better. First, I must say, the thing is called Standard BOM. It's just our name for the CycloneDX format. So we are not reinventing the wheel. It's not like we've invented a format or something. And we're also not selling anything. It's just sharing experience and talking to you. All three of us are from Siemens, so I feel I need to say a few words about the company. Siemens is a technology company, so you can buy small things like a thermostat for your smart building, or, if you need one, a whole train or a power plant. I mean, a power plant is nice. And also things in between, like medical devices, magnetic resonance tomography systems, or if you're equipping your factory, then you can buy factory equipment. Siemens has also been around for some time. Just recently we celebrated the 175th birthday of the company, so it has changed a couple of times over the years. And traditionally, of course, there has always been a focus on hardware. But in recent, well, decades, I could say, software has become increasingly important. So now, of the 50k R&D employees, we have a sizable portion of software developers. I couldn't find out exactly what the portion is, but I'm quite certain it's in the five digits and growing, certainly. And in a company like that, there's no one technology stack, so we're basically using everything, I should say. And that growing importance of software, of course, leads us directly to software bills of materials. You're all aware of the upcoming legislation, mostly in the form of the executive order and the CRA and so on. So SBOMs are getting more and more important.
And I don't want to explain SBOMs, you all know that stuff on the slide. I just want to stress one thing: generating an SBOM for a software product is not something that can be done manually. It must be the result of an automated process, okay? There's just no way to reliably do that manually. And one of the things that we realized is that an SBOM is always created with a particular use case in mind. Even if you're not thinking of a use case while you're doing it, you're still implementing whatever's in your head at that point. So the concrete SBOM document is always intended for a particular use case. Just to give some examples that we are dealing with, one would be license compliance. We want to make sure that we follow all the obligations from open source software licenses. That's very important because open source software is used extensively at Siemens. We use many components, and we also publish them. If you go to github.com slash Siemens, you will find some of them. And if any of you does that right now, then be sure to also click on the badges at the top of the page. They link to other places on GitHub that have Siemens open source software. It's not all consolidated into one. Anyway, license compliance also requires us to have source code available, because that must be scanned. Individual source files might be licensed under a different license than the main project, and so on. And that's a particular requirement of that use case. So the SBOM will look different compared to, for instance, the security vulnerability monitoring use case, which is also very common. There, source code is not so important. It's important for finding the vulnerabilities, but not so much for monitoring them. But you need different metadata, such as CPE information. CPEs are used to look up the vulnerability in the corresponding databases, so that's critical.
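The idea that each use case needs different metadata can be sketched as a small check over a CycloneDX-style component, written here in Python. The use-case names and the rule that a source location is an external reference of type `vcs` or `distribution` are assumptions for illustration, not the checks Siemens actually runs; `cpe` and `externalReferences` are regular CycloneDX component fields.

```python
def has_source_reference(component: dict) -> bool:
    """True if some external reference plausibly points at source code."""
    refs = component.get("externalReferences", [])
    return any(ref.get("type") in {"vcs", "distribution"} for ref in refs)


def has_cpe(component: dict) -> bool:
    """True if the component carries a non-empty CPE string."""
    return bool(component.get("cpe"))


# Hypothetical mapping from use case to the checks a component must pass.
REQUIRED = {
    "license-compliance": {"source reference": has_source_reference},
    "vulnerability-monitoring": {"cpe": has_cpe},
}


def missing_fields(component: dict, use_case: str) -> list[str]:
    """List the requirements of a use case that a component does not meet."""
    checks = REQUIRED[use_case]
    return [name for name, check in checks.items() if not check(component)]


component = {
    "type": "library",
    "name": "log4j-core",
    "version": "2.17.1",
    "cpe": "cpe:2.3:a:apache:log4j:2.17.1:*:*:*:*:*:*:*",
}
print(missing_fields(component, "vulnerability-monitoring"))
print(missing_fields(component, "license-compliance"))
```

The same component passes one check and fails the other, which is the practical meaning of an SBOM being written for a particular use case.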
And also you might want to include build tools, test frameworks, and so on, since they might also be vulnerable. Both of those use cases are internal use cases. We generate the SBOM for ourselves and use it with our systems and processes, but we don't share it outside of the company. The third case, regulatory, would be yet another use case, where we are sometimes required by the new legislation to publish the SBOM. And then, of course, we must be sure to include the fields about every component that are required by that regulation. And we will not normally put much more into the SBOM than we are strictly required to, because that's for regulatory purposes, and we don't want to open up an attack surface for people who just want to bitch about some information being wrong or something. That's just the realistic thing that's going to happen. And you'll see later that this is relevant, those SBOMs being created for different use cases. Because when you're creating an SBOM for your concrete product, you're actually solving something of a puzzle. You have all kinds of pieces that must fit together to get the final SBOM. Imagine you're shipping something simple like a front-end container with an Angular application in it. So maybe you have npm to ask for the dependencies. That's the easy part, because it's under your control. But then you also have, let's say, an nginx in the container, which has an SBOM or consists of some components. And it sits on, let's say, a Debian Linux. And that has, I don't know, 100 or so open source components as well. And sometimes you're lucky, and you work with partners, or, in a company like Siemens, you have all kinds of different business units that produce components and give you SBOMs. Those SBOMs might have all the data that you need, or they might not. Imagine that people only gave you the SBOM because they're required to by the regulators, the third use case.
Then it would probably not be enough. For instance, license information is something that's not even required by the NTIA for a public SBOM. They just want to know what component it is. They don't want much metadata. So that's something you then need to enrich. You will probably need backend systems to enrich your SBOMs and arrive at the final SBOM while you're solving this puzzle. So now I've talked a lot about SBOMs in general, and let's look at some more detail with Alex. Yeah, thank you. So as we already mentioned, one goal that we have is to take you through the process of how we adopted a common SBOM format within the company, and what some of the challenges and major pain points were that we detected as part of that. Of course, at first, you look at the requirements that you actually have, and usually to do that, you look at the process and the people involved. That's a good idea, even when you're trying to solve a technical problem. So what we considered early on: you've seen our product portfolio. We do everything from hardware to software as a service, so every team at Siemens is different, which for us immediately meant that there is probably no silver bullet that works for all of them. There wasn't going to be a single automation approach that we could push onto people. Instead, we needed to provide an ecosystem. That was realization number two, right? We need a common set of tools, but not everybody is going to use every tool. The goal here was to simplify the actual SBOM generation and allow people to feed that data, because that's the background that we come from, into our OSS compliance and commercial license compliance tooling, and to enable developers to actually use that as part of their builds. And from the get-go, we were pretty clear on that either becoming inner source within the company or potentially also open source. We will comment on that a bit later.
And then of course, you can't always optimize for the edge case. There will be teams within Siemens that use tools that nobody else apart from them uses. But even then, we wanted to enable them to use the format, by at least having a set of libraries that they could include. Currently, we offer these for Java, Python, and .NET. That definitely covers a lot of the different teams that we have. And similarly, that is provided as inner source today. Yeah. So one valid question that you can, of course, ask is why we care so much about our SBOMs in the first place, and why we care about them being accurate. There are more reasons than the two I'm going to talk about, but generally, these are the main two for us. One is security, right? It's not that long ago, actually less than one and a half years, that Log4Shell hit. Or if you think back to SolarWinds: it's important to actually know the products that you consume, so the dependencies that your own products have. And for that purpose, an SBOM is exactly what you need, right? We want to be able to identify vulnerable components as quickly as possible. If a zero-day hits, it's not a good idea to only start investigating at that point which product uses a vulnerable component, because that delays the process. And obviously, you can only start with the mitigation once you have the full picture of what you actually need to mitigate. The other part is more of a legal topic: compliance, license compliance specifically, right? A failure to comply with the license terms of third-party components is something that can trigger litigation. Litigation is very time-consuming and expensive, and our lawyers would rather do other things. So it's important for us to also make sure that this part doesn't happen.
And one thing to also be aware of is, generally speaking, at least from our experience, the larger the company, the larger the compensation claims that people will sue you over. So if you have a GPL violation, then suddenly we're talking about millions of dollars. And then the worst case, which as far as I know is probably a bit specific to German copyright law: it can actually happen that if a GPL violation, for example, is detected, you get slapped with an injunction, and you are prohibited from shipping the affected product until that is resolved. If you imagine that something like that happened with a Linux kernel version, where we have a driver with a GPL violation or whatever, for us that would be a big deal. So it needs to be avoided just from a business perspective, for both scenarios. And then even beyond that, of course less tangible, but still: both of these things will land you in the news, and you will not get the good kind of publicity. So they are actually a PR nightmare. And that's why we want to get them right. We want to be good citizens, so our SBOMs need to be accurate. Yeah, another challenge that we detected early on: of course even our embedded hardware colleagues have by now figured out that containerization can help them with certain use cases. So we also need to make sure that our containers are OSS compliant, and there we have a special challenge in generating accurate SBOMs. SBOM creation, which is pretty much what this chart here shows, lies on a spectrum. What developers of course like to do is consume public images from Docker Hub or other public sources. That's very low effort for them. They can just pull them, they don't need to create them themselves, but they also don't know what's in them. So you have low effort on the developer side, but we also have very low certainty.
So creating an accurate SBOM is insanely difficult, and in some cases I guess we can actually conclude it's impossible. Yeah, and then the further you move to the left, the more effort is actually involved in building the container, but at the same time you have increasing certainty about its contents. The pathological case on the other side, of course, is that you build every image yourself. We use a lot of different images, so maybe you don't want every team to build their own. And so the next best thing that we've arrived at is having these known base images that get shared within the company, or consuming upstream base images that already have an SBOM that we trust, which is of course a major asterisk there. You also need to be able to trust the SBOM; it's not enough for it to be there. Creating those images is much higher effort, but you also have a much higher degree of certainty. So that's something that we realized, and that's something that we try to put into practice. Yeah, so these are the challenges, right? Obviously the conclusion then was that we need a common format to facilitate all of that, and we need to build an ecosystem around it. So that's what we did, and we called it Standard BOM. We have an internal landing page for the format, so if you try to navigate to it right now, it will not do anything for you. But the reason we are showing it is that the domain pretty much tells you this isn't just a side project that we started; it actually is one of the main sub-domains within the company. We already have a lot of teams using it, and yeah, it's growing. So it is picking up, and we are now actively looking into ways we can make some of this available upstream again, and in fact we already have. I contributed the CycloneDX support to ScanCode Toolkit a while ago. But yeah, we are still figuring some of that out. Yeah, so Thomas already preempted that a bit. What is Standard BOM?
At its core, it's CycloneDX 1.4. The special caveat is, or maybe I quickly need to explain what CycloneDX is. It's an OWASP format, and it prides itself on being lightweight and composable. You can add extensions to it, and for us that flexibility was really appealing. One limitation that we put on it for our internal use, which is probably a bit controversial, but we did it because we prefer it: we only use the JSON flavor. We don't care about the XML. Once you start dealing with large XML documents, you have to worry about things like vulnerabilities in your parser, and we don't want to deal with that. With JSON, they are generally much rarer. They're not impossible, but they're rarer. And of course, using JSON makes it pretty much programming-language agnostic, because every language I know of has a JSON parser, except maybe COBOL, and probably COBOL does too, I just don't know about it. And then the benefit that this flexible format had for us on top of that is that it's independent of the source ecosystem. We have all these different tech stacks within the company, and they are all supported. There are upstream tools to create BOMs, and in those cases where those aren't good enough and up to snuff for what we need, we wrote our own. And another benefit that it has: it's independent of the consumer. But there's a caveat here, right? It's important to keep in mind that even though the SBOMs are independent of the consumer, as Thomas already mentioned, you usually create one with a special use case in mind. So if you submit an SBOM for software clearing, maybe you also want to put a statement of intent alongside it to say, yeah, this is mainly for software clearing purposes, don't use it for vulnerability scanning. Because if it contains references to the source packages, there's actually a high possibility that your actual product, because the binary doesn't have the source, isn't affected.
So that's a statement of intent that we support through something that we call profiles. That's metadata in the BOM. And yeah, that was also a valuable addition from our perspective. So that's pretty much what I have to say about it. And now to get into the nitty-gritty details, I will hand over to Thomas. Thanks. So, well, we're using CycloneDX, so do we do something special? A little bit. But in the end, we still use CycloneDX. Every one of our Standard BOMs is a 100% CycloneDX BOM. That is something we really like to emphasize. We heard this morning that a lot of people create SBOMs. On one side we also create SBOMs, but on the other side we are also the consumers. So we need to ensure that we understand all the information, whoever created it. So we just needed an additional set of rules or guidelines. For example, we decided that we want to have the components as a flat list. We don't want the hierarchical structure that the CycloneDX SBOM allows, but we still have the dependency information, because it's just in another place. Another thing is that we found out we need some additional properties. And, well, if you tell your developers to just add something, they will add it anywhere, under any name. CycloneDX offers properties. These are just key-value pairs. So we talked to the CycloneDX guys and they said, okay, you could reserve a namespace. So this is what we did. We provided a taxonomy, and now we have the "siemens:" prefix to clearly mark something as one of our properties. So this is something that our developers should use. The next thing, because the three of us come from the license compliance side, is that we require the source code. We require the source code because this is what we scan for licenses, for IPR issues, export control, those things, whatever. So we need to find a way to express where we can find the source code.
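The conventions described here (a flat component list, dependency information kept separately, and properties in a reserved namespace) can be sketched as a small CycloneDX 1.4 JSON document built in Python. The helper name, the property name `siemens:direct`, the example packages and URLs, and the use of a `distribution` external reference for the source location are assumptions for illustration; the actual Standard BOM taxonomy may use different names.

```python
import json


def make_component(name: str, version: str, purl: str,
                   direct: bool, source_url: str) -> dict:
    """Build one flat CycloneDX-style component (illustrative fields only)."""
    return {
        "type": "library",
        "bom-ref": purl,
        "name": name,
        "version": version,
        "purl": purl,
        # Custom data goes into namespaced key-value properties.
        # "siemens:direct" is an assumed name, not taken from the talk.
        "properties": [{"name": "siemens:direct", "value": str(direct).lower()}],
        # Assumed convention for pointing at the source code.
        "externalReferences": [{"type": "distribution", "url": source_url}],
    }


app = make_component("my-app", "1.0.0", "pkg:npm/my-app@1.0.0",
                     True, "https://example.org/my-app-1.0.0.tgz")
lib = make_component("left-pad", "1.3.0", "pkg:npm/left-pad@1.3.0",
                     False, "https://example.org/left-pad-1.3.0.tgz")

bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "version": 1,
    # Flat list: no component is nested inside another one.
    "components": [app, lib],
    # The dependency graph lives in its own section instead.
    "dependencies": [
        {"ref": app["bom-ref"], "dependsOn": [lib["bom-ref"]]},
        {"ref": lib["bom-ref"], "dependsOn": []},
    ],
}

print(json.dumps(bom, indent=2))
```

Keeping the list flat means a consumer can iterate over `components` without recursion, while the graph is still recoverable from `dependencies`.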
So it could be a local file, it could be the upstream location, but we have a way to describe it in CycloneDX. And then the next thing is that the best case would be if the development teams pack all of this together: the source code, maybe also the binaries, and the SBOM. And this is then something that they ship to our backends. And then we have all the information that we need. So just to give you an overview, I know it's small on the screen. You see a lot of standard CycloneDX properties. You also see the license, for example. And what other information do we have? We have the source code, we have the information about the website, which is still standard CycloneDX. But we also want to know, OK, is it a direct dependency or not? Sometimes we need this, sometimes not. We would like to know what kind of programming language it is. If we find such information in the first scan, we also add something about third-party notices or copyright statements. Just a short example of how this would look. Now maybe for a better understanding, again, what do we do when we talk about this SBOM? We use it as an input. So we have the developers. The developers commit their code. Many of them do it to a central GitLab instance that we have, which is called code.siemens.com. And there they run their continuous integration, continuous development runs. This is where our tools kick in. As part of CI, they use scanners, either from us or from CycloneDX or created by themselves, to create at least the first version of an SBOM. And if the SBOM does not contain all the information, then we have additional tools, maybe to find source code or to guess where the source code might be, to download the source code, or, if we have different kinds of ecosystems, to merge SBOMs. Because let's say you have a container, you have maybe a scan for the front end, NPM components, for Java components, for .NET, and the underlying operating system.
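Merging the per-ecosystem SBOMs of one container could look roughly like the following Python sketch. It is not the Siemens merge tool: the function name and the choice to deduplicate by package URL (falling back to name and version) are assumptions, and a real merge would also have to reconcile metadata and conflicting fields.

```python
def merge_boms(boms: list[dict]) -> dict:
    """Merge CycloneDX-style BOMs into one flat BOM (illustrative sketch)."""
    components: dict = {}
    depends_on: dict[str, set[str]] = {}

    for bom in boms:
        for component in bom.get("components", []):
            # Assumed identity rule: purl if present, else name + version.
            key = component.get("purl") or (component["name"], component.get("version"))
            components.setdefault(key, component)  # first occurrence wins
        for dep in bom.get("dependencies", []):
            depends_on.setdefault(dep["ref"], set()).update(dep.get("dependsOn", []))

    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": list(components.values()),
        "dependencies": [
            {"ref": ref, "dependsOn": sorted(targets)}
            for ref, targets in sorted(depends_on.items())
        ],
    }


npm_bom = {"components": [{"name": "left-pad", "version": "1.3.0",
                           "purl": "pkg:npm/left-pad@1.3.0"}]}
os_bom = {"components": [
    {"name": "openssl", "version": "3.0.8", "purl": "pkg:deb/debian/openssl@3.0.8"},
    {"name": "left-pad", "version": "1.3.0", "purl": "pkg:npm/left-pad@1.3.0"},
]}

merged = merge_boms([npm_bom, os_bom])
print([c["name"] for c in merged["components"]])
```

Here the component reported by both scanners ends up in the merged list only once.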
So we want to combine it into one big SBOM. And yeah, we also have some kind of validator. And then this is something that can get forwarded to our backends. We have different kinds of backends, but one of them, which you have already heard about, is SW360 with FOSSology as a scanner. So again, we use this information, store it, let's say, in SW360, and then someone else pulls it out of SW360 and does a scan with FOSSology to determine what the licenses are, what the copyrights are. The detailed information about where this information was found, we don't need it here. We need it down here. But then it's created by FOSSology, and probably it's SPDX. It might not necessarily be CycloneDX. But again, our focus here is on the input. What Siemens then does: we have a look at the single component and determine the licenses, the copyrights, the obligations, as we also heard from the French colleagues. Our legal team has a look at it, and then as the last step, we do something that we call product clearing. That is, we take a look at all the components, all the licenses, all the obligations in the context of that very product. And then we do a final check whether everything fits. Because, as you may know, if you have an embedded product, the situation may be different than if you have a cloud back end or a cloud front end. Now, this is maybe the way that it takes to get to a good SBOM. We think it's not an easy way, and we are not yet done. We shared our experiences, our opinions, our approach on what we did or what we would like to do. And now really, we are also here to hear your comments on that. Parts of these things have been upstreamed and are available as open source. Is this a use case that you would also be interested in? Is it something where you would say we should do more open sourcing of our tools?
And then the interesting question is, well, if there is already a CycloneDX Gradle scanner or PyPI scanner, do we want to have another one, or should we find a way to combine them? It's up to what the open source community would like to have. So I guess we have five minutes left. On one side, you see the key takeaways from our presentation. I don't want to go through all of them again, because maybe you have questions. There's a question from the chat right now from Borger. Question: how do you generate and track SBOMs for multi-language projects? On your introductory slide, you mentioned lots of programming languages being used. Yes, so there are separate scanners to create them. The question is, how do we generate SBOMs for multiple languages or for multiple ecosystems? Yes, we have different scanners. So here are some of the scanners that we created on our own. If we don't have a matching scanner, we tell people to look for CycloneDX scanners. If there is no scanner like that, then, well, these are developers. They can do it by themselves. And then you merge the results? Yes. Yes. Yes. In the end, it depends on the use case whether we process them separately or not. But we have a way to merge them. More? Yes, just a quick comment on that. Can you? Yes. Thank you. So we merge because what we found is that at build time, these separate build tools, whether it's Gradle or the Go compiler or whatever, have a lot more information than you get by just doing static analysis with some other tool like ScanCode. So occasionally, for specific use cases, we prefer to go through build plugins that have all that build metadata to get the full picture. And so that's why we actually have that modular approach. Right. Siemens is a big organization. What have you done about your supply chain? Because I'm sure you've got lots of things coming into Siemens. What are you doing with the components that are coming in that have SBOMs or don't have SBOMs?
Have you changed the way you are interacting with your downstream supply chain? We hope. Ah, what do we do with all the suppliers? Let me just rephrase it: do we hope that they have SBOMs? And the answer is, yes, we hope, but we don't expect it. So are you generating SBOMs essentially for Siemens components? Yes. Yes. Okay. Yes. So that is something that is actively being worked on to comply with the executive order and so on. Yeah. The one in the back. What has made you choose CycloneDX over SPDX? Yeah. So the question is, and we anticipated it because we already had that conversation with Kate at the fringe event, why did we choose CycloneDX over SPDX, right? So it was partly because that's what all of us already knew. So that was the first point of contact. And the other reason is that we got going with it a lot more quickly. So it's lightweight. You can start at a very low level and then build on top from there. Whereas, and this is our subjective experience, I'd like to say, it might be different for somebody else, the SPDX spec is quite daunting in its depth. And we didn't need all of the features. So understanding the spec fully wasn't in scope for us. Yeah, sure. I would like to add one thing about the SPDX versus CycloneDX thing. I mean, Siemens is relatively large and there are lots of different parts in it, right? And that's probably as it is in most companies of that size. And we discovered that we had already started with CycloneDX individually before we came together to solve this centrally, right? And then once you discover that on an important question like that you're already almost aligned, right, then you don't open that can of worms again to choose the best format. That's kind of a, well, realistic approach. How do we scan containers? How do we scan containers? Yeah, so I'd like to say that's still very much an ongoing field of research internally.
But I can give you, so I do believe that to give you the full picture, we should maybe talk afterwards, as that won't fit into the Q&A session. But we have a combined approach there as well. So we use stuff like ScanCode.io, Tern, all these other static scanners, Syft actually, to get us started. But then once you start digging deeper, of course, that's not the scope of the tool. It needs to be fast. And then we need to aggregate that. And that's the biggest challenge, of course, reconciling all those different scan results. And so if somebody is doing active work on that, I'd be happy to talk to you. Thomas, maybe? One last question? Yeah, so the question was, how do we make sure that the dependency scan is complete? Well, I mean, it would be wrong to say that we can always be sure, because we're not going to be sure. But we have a best-effort approach that has been tested against lots of images. And occasionally, people will actually come in after the fact and verify the results. And based on those findings, we will improve. And there's one other aspect to that. And we kind of mentioned it on the containers slide when we were at that point. It depends a little bit on what you're scanning, right? So if it's in your own source ecosystem, then as a developer I can be reasonably sure that the SBOM is complete. If I take a random container from the internet, then that's very difficult. |
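The merging step described in the talk, where per-ecosystem scan results (npm, Java, .NET, operating system) are combined into one SBOM, can be sketched roughly as follows. This is only an illustration, not the Siemens tooling: it assumes CycloneDX-style JSON documents already loaded as Python dictionaries, uses the package URL (`purl`) as the identity of a component, and the function name `merge_sboms` and the `specVersion` value are hypothetical choices.

```python
def merge_sboms(sboms):
    """Merge the component lists of several CycloneDX-style documents.

    Assumption: two components are "the same" when their purl matches;
    components without a purl fall back to name@version.
    """
    merged = {}
    for sbom in sboms:
        for comp in sbom.get("components", []):
            key = comp.get("purl") or f"{comp.get('name')}@{comp.get('version')}"
            if key in merged:
                # Keep the first record, but fill in fields it is missing
                # (for example a license found only by the second scanner).
                for field, value in comp.items():
                    merged[key].setdefault(field, value)
            else:
                merged[key] = dict(comp)
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",  # assumed version, adjust to your tooling
        "version": 1,
        "components": sorted(
            merged.values(),
            key=lambda c: c.get("purl") or c.get("name", ""),
        ),
    }


npm_bom = {"components": [
    {"name": "lodash", "version": "4.17.21", "purl": "pkg:npm/lodash@4.17.21"},
]}
java_bom = {"components": [
    {"name": "guava", "version": "31.1-jre",
     "purl": "pkg:maven/com.google.guava/guava@31.1-jre"},
    {"name": "lodash", "version": "4.17.21", "purl": "pkg:npm/lodash@4.17.21",
     "licenses": [{"license": {"id": "MIT"}}]},
]}
combined = merge_sboms([npm_bom, java_bom])
```

A real merger would also have to reconcile conflicting values and carry over dependency relationships, which is the aggregation problem the speakers call their biggest challenge.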
FOSSology and SPDX
How FOSSology works with SPDX |
Yeah, hello everyone. My name is Shaheem and I have Gaurav with me. We both work for the FOSSology community and we are from Siemens. So maybe let me start. FOSSology is an open source license compliance project. It was initially published by HP in 2008, and in 2015 it became a Linux Foundation collaboration project. FOSSology is a Linux application; it works on Linux distributions. The different tasks FOSSology does for OSS license compliance are scanning for licenses, copyrights, authorships, emails and ECC statements. Apart from this, we have keyword and IPRA statements as well. We also generate documentation like the ReadMe OSS text documentation and the unified report. And then we have export and import of SPDX files. We will discuss SPDX files later in the slides. So FOSSology is about finding the licenses, as I said already. We scan the source code. The source code might have the license text, a reference to the license text, or written text explaining some licensing, and then we might also have license-relevant statements. All of this will be identified by FOSSology. Here we have uploaded a component called Thrift, which is Apache source code, and FOSSology has found the Apache license. Apart from the Apache license we have many more licenses. That is because it is very natural that an OSS project reuses the available OSS from other projects. So, for example, you can see that FOSSology has found 25 other relevant licenses in Apache Thrift, apart from Apache 2.0. So what is unique about FOSSology is that we have conclusions. Licensing can be simple, but it might be challenging as well, because you might see some unknown licenses or written statements about licenses, some licenses might be unclear, and there might be some incomplete license statements as well. So to do this, I think one requires good domain knowledge. So yeah, SBOM and FOSSology.
So we are producers of SBOMs as well as consumers. FOSSology produces SBOMs in SPDX version 2.3 format, which includes the file-level information, license findings and their conclusions, and copyrights, as well as the custom license text. As a consumer, FOSSology imports all this information and adds it to a component as well. More information about the SBOM will be discussed later in the slides. So yeah, maybe Gaurav will take over and explain the releases and more SPDX features of FOSSology. Thank you. Thanks. So yeah, with FOSSology we recently released a few versions, and in all of them we mainly tried to sync our license set with the SPDX license list so that we are up to date. At the same time we are continuously working to improve our REST API so we can provide more automation flexibility to anyone who is interested in using it. Recently we also had some GDPR updates, thanks to Orange, on how the user data is managed on the server and things like that. We recently updated the UI to Bootstrap. We are planning to move to React, but yeah, it's in the works. With 4.1 we integrated ScanCode, so you can upload your source code to FOSSology itself and let FOSSology scan using its own scanners or, if you prefer, you can also use ScanCode and import the license findings into FOSSology. We also worked a little on our copyright false positive reduction using spaCy. And with the latest release we would like to say we are compliant with the REUSE.software standard in our source code, and we also try to display the information that the REUSE linter provides. So if any project supports the REUSE standard, you can check how they do it, and you can very easily know how much scanning you have to do. So now we are coming to the recent updates on SPDX within FOSSology. Since it's a pretty old project, we had some difficulty complying with the SPDX standards. So we decided to take on the challenge in two steps.
So the first step is done, which is what we are presenting, and the second is actually a work in progress. What we initially started with, the pain point, was the license name which ends up in the report. FOSSology initially used short names, which are supposed to be unique and are actually used for identification internally in FOSSology. So here we added a new field called SPDX ID, so you can have different variants of the same license. For example, one license has a copyright by X, but a different one is the same license with a copyright by Y. So the license text changes, but you can still use the same SPDX ID for both of them, and that is what will end up in whatever report you generate. We also perform a check in case the SPDX ID is missing or is not in the SPDX license list. Then it will be converted to a LicenseRef, and we have introduced the LicenseRef-fossology prefix for that. In the upcoming updates we will further enhance this and provide users a way to write actual SPDX license expressions, including AND, OR and WITH exceptions. Then the reports: with the various DocFests with SPDX we came to understand that many of our reports were flawed. So we tried to fix them and at the same time update them to the latest spec, 2.3. So what was wrong? The extracted licensing info was missing. As I said, you can have your own custom license text in FOSSology, so if any file has a finding of that license text or a conclusion on it, the license text itself was missing from the report. We have fixed that, and also the package verification code used by SPDX. The algorithm internally was a little wrong, so yeah, a minor fix. At the same time we compared the spec and the fields which FOSSology can store. So we figured out that we can have the version information in the report as well, the release date, along with external refs like purls, Maven, NuGet and such stuff.
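For readers unfamiliar with the package verification code mentioned here, the SPDX 2.x specification defines it as a SHA-1 over the sorted, concatenated SHA-1 checksums of the files in the package. The sketch below follows that definition; it is not FOSSology's implementation, and the function name and the input shape (a mapping from file path to hex digest) are assumptions made for illustration.

```python
import hashlib


def package_verification_code(file_sha1s, excluded=()):
    """Compute an SPDX 2.x style package verification code.

    file_sha1s: mapping of file path -> lowercase hex SHA-1 of the file content.
    excluded:   paths to leave out (for example the SPDX document itself).
    """
    # 1. collect the per-file SHA-1 values, skipping excluded files
    digests = [sha1 for path, sha1 in file_sha1s.items() if path not in excluded]
    # 2. sort them in ascending order
    digests.sort()
    # 3. concatenate without separators and hash the result again
    return hashlib.sha1("".join(digests).encode("ascii")).hexdigest()


files = {
    "src/main.c": hashlib.sha1(b"int main(void) { return 0; }\n").hexdigest(),
    "LICENSE": hashlib.sha1(b"license text\n").hexdigest(),
}
code = package_verification_code(files)
```

Because the digests are sorted before hashing, the result does not depend on the order in which files were scanned.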
And at the same time FOSSology also allows you to manage your acknowledgments and obligations. So we use the attribution text field for providing acknowledgments for specific files, and the same attribution text field for the entire package if you have any obligation related to a license. Also, the calculation of conjunctive and disjunctive license sets was wrong. FOSSology has a special license called Dual-license. So we have fixed how we calculate that now, and not every license in the report is a disjunctive license set any more. And at the same time we also added the missing license name and text for the listed licenses. So yeah, with that I guess pretty much all our SPDX reports should now be valid. So yeah, thank you, and please consider starring us on GitHub if you use it, and ask if you have any questions. Yes, please. Okay, so the question is which formats we use for SPDX. FOSSology currently supports the RDF and the tag-value formats. Okay, so the question is, with the SPDX ID, if there is a custom license text, how does it end up in the report. So with the SPDX RDF format, it's a self-contained report, so it will contain your SPDX ID, the license name as well as your custom license text. All the other various formats which FOSSology supports, ReadMe OSS, your unified report, will work the same. It's just that instead of using the license name which was coming from the short name field, we are going to use the SPDX ID field now. Yeah, but how do you include the text of the license in the SPDX file? Okay. Because otherwise you get only. Yeah, so if you see here, SPDX has this field called license text. Okay. So you can include information like the name of the license, the license ID, its text, any external ref you have, and such stuff. Okay. So it's there for RDF. For tag-value I'm not very sure, we need to check, because for custom text they do support it in the tag-value format as well. But yeah, for standard licenses they do not. Yeah.
Okay. Thank you. Any more questions? Any support for the SPDX RDF? You are planning to do it. Yeah. Yeah, yes, you're planning. Okay. Any more colleagues? Yeah. Thank you. |
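The identifier rule described in the talk (use the SPDX ID when it is on the SPDX license list, otherwise fall back to a LicenseRef with a FOSSology prefix) could look roughly like this. It is a sketch under stated assumptions, not FOSSology's code: the exact prefix spelling `LicenseRef-fossology-`, the tiny `SPDX_LISTED` set, the pass-through of existing `LicenseRef-` values and the function name are all hypothetical.

```python
import re

# Tiny illustrative subset; a real check would use the full SPDX license list.
SPDX_LISTED = {"MIT", "Apache-2.0", "GPL-2.0-only"}


def report_license_id(shortname, spdx_id=None, prefix="LicenseRef-fossology-"):
    """Pick the identifier that ends up in the report for one license."""
    if spdx_id in SPDX_LISTED:
        # Several internal variants may share one listed SPDX ID.
        return spdx_id
    if spdx_id and spdx_id.startswith("LicenseRef-"):
        # Assumption: an ID that is already a LicenseRef is kept as it is.
        return spdx_id
    # SPDX idstrings allow only letters, digits, '.' and '-'.
    safe = re.sub(r"[^A-Za-z0-9.-]", "-", spdx_id or shortname)
    return prefix + safe


listed = report_license_id("MIT-variant-X", "MIT")
custom = report_license_id("My Company License")
```

Two license variants with different copyright lines can therefore share the same listed ID, while a custom license text gets a document-local reference.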
Build recorder: a system to capture detailed information |
So this is a presentation, and we're actually starting a new set of presentations in the devroom now, not entirely about SBOMs but about information that gets inside SBOMs, and you'll be hearing more about it. So this is a presentation about a system to capture detailed information about builds, and almost all of the work was done by Fotis, so I was just, yeah, you know, explaining what has to be done, defining the problem. So we'll be going through a very typical agenda, you know, what we're trying to solve, how we solved it, what was not solved and what we're going to do. This happened as a GSoC project, Google Summer of Code, last year, under the GFOSS organization, which is an umbrella organization for open source in Greece, and we have representatives there. Right, I'm not going to be reading the whole slide, you can read it and you can see it there. But they're an umbrella organization that looks over open source development in Greece, and they have been participating in GSoC for many years now, and one of the projects was the wonderful one that Fotis did, right. So let's start with the basics, what are we trying to solve, right? There are different names for what we're trying to solve. Some people call it origin, some call it provenance or pedigree or other things. Essentially, if we have a binary file, and when I say binary I mean an executable or a library, an object file, not an image or a PDF, how was this created? What are the sources that were used in order to create this binary? What was the process that was used in order to create it? Essentially these two things, right. The goal is to have extra information and meta-information about all this stuff, so in the end you know exactly what information is inside the binary, or was used to create it, right.
So think of it very simplistically, you know: if I have a command that builds a binary, like a compiler that gets a source file and generates an executable, I want to record, okay, there was this process called cc and it read the wonderful file in C and it produced an executable, right? That's what we want to solve. And then it gets tricky because, as you can imagine, we're obviously not looking only at things that are specified on the command line, right. So when you do cc helloworld.c, maybe it includes a file, right, the C language has include, and this might be significant, it will definitely be significant, right. So it might include /usr/include/stdio.h, so I need to record that this file was also used in order to produce this binary, right. And in the same way, cc is just a program. I cannot just record the name, because the name does not mean anything. This is definitely a file, and maybe it uses other linked libraries in order to do its work, and when you produce the final executable, it might well include system libraries, and you want to record all this, right. So we're not talking about, you know, parsing the command line and seeing the three files mentioned there. We want to record everything happening in there, and obviously the command may not be explicit, it might just be make, right, and you want to record everything happening with this build command, right. Is the problem clear? Does anyone else have this problem?
So the project, the whole project was, okay, let's build something like that. And how do you create a project? You say what you want to achieve, and this is a functional specification, right. So we want a minimal interface. I don't want to tell developers how they have to change the compiler in order to do that. I mean, the first idea would be to change the compiler, but yeah, let's try to make it minimal, so ideally no changes to the command at all, right. And I want to record the files being created or written, the files being read and the processes being run. For each of the files I want to know the name, I want to know the full path, because there might be a difference, and then, because I actually care about the content, I want the hash, because the same file can be in different places with the same hash, so I have to know what it is. And, you know, extra nice information, the size of the file, the mode of the file and stuff like that, I don't care. For every process that runs I want to know, okay, what was the process, what were the arguments, what was the environment, because as you probably know, every process may behave differently according to the environment that it finds. Yeah, okay, start and end times would be useful and stuff like that, right. So this is the functional specification, this is our wonderful problem, right. And then Fotis sat down and worked. Well, no, no, wait, you'll take the mic. Well, he's not an engineer yet, he will become one. Can you hear me now? Well, to tackle this problem we created a command line tool named build recorder, which you can see right here, build_recorder, which records information about the build on Linux. Using it is rather simple: all you have to do is prefix your build command, whatever that is, make, gcc, a Java compiler, with the name of the executable. build recorder runs transparently in the background while your build is running. You don't need to change anything in terms of your
configuration, your source files or even your build system. You can literally pick any project you like at this point in time, run it with build recorder and it should run. If it doesn't, please file a bug report. Now, as you can see, build recorder stores all of that information in an output file named build-recorder.out by default, but you can use the -o option to supply an alternative file name. But what does it actually record? Well, pretty much everything we talked about in the functional specification. For each file we have a list of attributes: its name, its size, the checksum of its contents, as well as its absolute path. Similarly, for each process we store its command line with all its arguments, start and end timestamps, as well as the environment. We also store a list of relationships, namely: a process creates a process, that is by forking or cloning or any of their variants; a process opens a file for reading; a process opens a file for writing; and some processes are associated with executables. For example, if we were to run make, the system would probably run the file at /usr/bin/make, and we want to record that as well. Now all of this information is stored in the output file in RDF Turtle format, with the classes being process and file, and all of the attributes as well as the relationships being the predicates. This is an example. We have a process, pid1, which is the initial compilation process, imagine gcc foo.c, and it starts at the current timestamp. We have another process, which is the preprocessor, the C preprocessor, and it starts at another timestamp. We specify that the initial compiler actually created the preprocessor, and then we have a file f1, foo.c, for which we specify that pid2 reads f1: the preprocessor opens the file foo.c, which has size 100 and a dummy hash of zeros. We have another file, f2, which is a temporary that the compiler might
use, which, as we can see, process number two actually writes to. We have another process, the classical cc, which merges those files together into the object file. Yeah, that's the general idea, I guess. Well, how is it implemented, how does it support all of those languages? The basic idea is that we want to record all of the files and processes that a process uses, and if we think about it, all processes and files are actually handled using system calls. So if we were to look at the system calls a process makes, we could see all of the files and processes that it uses. For example, if we look at the open family of system calls, open, creat and their variants, we can easily extract the name as well as the access mode. On the same note, for processes all we have to do is trace fork, clone and their variants, as well as the exec system calls, to actually see the process ID and the executable files. Now, from the information that we extract from these two we can actually get even more information, like the command line, the environment, and a link to a file, from which we can get the size, the hash and the absolute path. Yeah, all of this happens on Linux, so we basically just need a way to trace system calls. Under Linux this is a fairly straightforward problem. We use ptrace, and this is directly copied and pasted from the man page: it is primarily used to implement breakpoint debugging and system call tracing. Now that's it, it's a very simple program. No? For the duration of our project. Okay, so Fotis ran through the slides on what he implemented, so we have lots of time now. So it's not perfect, it has issues, right, so let's start discussing. You got the major idea of how it works, right? What are the issues? The two main buckets of issues are the real complexity and performance, right. So, real complexity: remember that we said, you know, it's a compiler that just reads a file and creates an executable, that was this wonderful diagram at first. Real life
is not that simple, right. So on the right hand side you see a more typical, but again ideal, compiler. When you have a compiler it actually calls three different processes, right. You call the first one, the first pass, that reads the C file and creates an assembly file, and then you call the assembler that reads the assembly and creates an object file, and then you have the linker/loader that reads the object file and creates an executable, right. So this is a very abstracted and ideal-world situation. The real world is nothing like that. The real world may look like that if you have, you know, done your compiler courses and you have seen the different passes of a compiler and stuff like that, and then you go to the real world, where hello world, which is just, you know, printing hello world in helloworld.c, on current Linux is this one, right. So I'm not going to be explaining every file in that one, but this is just, you know, the three lines of main, hello world, print hello world, compiled explicitly with gcc. So you see there are, what can I say, lots of include files being included. This is the first step of the compiler, right. So you think that it will only read helloworld.c, but you have a #include in order to include the printf definition, and this include unfortunately includes, you know, stddef.h and libc-header-start.h and lots of other different files, and all these are files that have been read by the first process. And then it creates something else, which I'm not sure where it is, somewhere, so it creates the temporary files, it creates the assembly there, if you can see, and then it creates the object files, and in order to create the final executable lots of libraries have to be included, right, and these are the libraries being included in the executable, right. And then we have the other set of things: in order to run cc1, which is an executable, so a file in the file system, right, you have to dynamically load a number of other
libraries that this executable depends on, so you have to record all these as well, because again you have to have an accurate record of everything being used, right. So yeah, even the hello world example is complex, right, and you create and record lots of files and processes for that one. Oh, and a lot of them are going to be reused again and again and again, right. So when you have, for example, to compile two different files, they will both include, for example, stdio.h, right, and ideally you don't want to have another record, another box here, for the second instance of the same file. On the other hand, it might not be the second instance of the same file, because something might have changed, right: while you're running, somebody else is installing a new compiler and it messes up all your binaries, right, and there are different binaries in the first run and in the second run and all this stuff. So this brings us to having to do it efficiently, so, performance. Well, performance isn't great at the current moment. I mean, we have to stop and restart the process multiple times using ptrace: you stop the process, you read an entire file from disk, you hash it, thank you, and then you restart the process. This is relatively expensive, as you can imagine. The current profiling shows that a build which normally took nine minutes on my system took 36 minutes with build recorder, so yeah, it depends on your hardware, your hard disk pretty much. But the good news is that there's a lot of room for improvement, because first of all we haven't optimized anything. We're using plain arrays; we will be replacing them with lookup tables, so we expect a massive performance gain from this. In fact, when I tried the hash map implementation we dropped this to 22 minutes, so that's optimistic. Another thing to mention is that ptrace actually makes a multi-threaded process run as a single thread, which is an issue, so if you run, for example, make -j8 you won't actually
get the performance benefits of multiple threads. We have plans to change that as well. Lots of changes need to be made; in fact, we proposed it as another project for GSoC for this summer. And yeah, that's it pretty much. We hope to see an improvement. We can't really tell what the overhead will be in the end, like how much we can improve build recorder, but we will get there. Now, regarding future work: we improve performance, and we plan to handle more build systems. I mean, you can use it with any programming language you want, but if I were to, for example, compile a project in Java using a build system like Maven, Maven has web dependencies, it downloads packages from repositories, so ideally we would like to also record those repositories, those URLs. This is another proposed project for the next GSoC. And next we have porting it to non-Linux systems, like other Unixes, NetBSD, FreeBSD, the list goes on. One thing that wasn't mentioned is that build recorder at the current time only works with Linux kernels of version 5.3 and later, the reason being that we are using some system calls that make it run on every architecture, so you can build it on a Raspberry Pi or any architecture you like, any architecture that Linux supports, at the expense of not having support for prior versions. That was us, thank you. This is the URL, yes, but we also have a QR code, if you trust QR codes; you don't know, you might get rickrolled, right? Yeah, this is not the same URL. So we have plenty of time for questions. Right, so the question was how do you know when to hash the file, because when you open it for writing you have to know when the modification ends, and then you should hash the result. That's correct, and the answer is, well, we hash the file when we find the close system call. You wait for all modifications to happen, and then upon the close system call the process is stopped, so all the stuff is in there, the virtual memory and the list goes on, so you have the file at that
time. We also hash the file when we open it, to see if it was seen before. So we can have a nice graph; in fact the graph is just a dependency graph, like in a semantic web sense. Makes sense? Nice. So the question was how do we consume this: we produce all this information, and how is it consumed? And the answer is, this is a build recorder, it just records. Here's the information we have; we do not consume it at all. It might be used to enter it into an SBOM or something like that. That's completely out of this; this is just recording all this information, it's out of the scope of this project. Yeah, let's see. Sorry? Have we tried converting the data to SPDX? Similar question to that one. No, we just record the build. This is a build recorder, right; another tool might read this wonderful output and create whatever they want. It's way in the back. The question is, have we explored eBPF? You know what I'm talking about. The answer is no, not yet, sorry. Oh yeah, sure. So the question is what's the problem with Maven that we said is going to be handled in the future. Yeah, on this level Maven works the same, right, so we can record the information, but we will just be recording, hey, a jar file has just been used. We would ideally like to record more, because Maven has already downloaded it, but we're just tracing the file system system calls, right. So ideally we would also like to record the information, hey, we're downloading this jar file from this location, we put it there, and then we use it, right. On the level that we're using it, we can record it right now. Sorry, go ahead. So, I missed you, sorry. The question is, do we distinguish between modifications to a file and a completely newly created and written file. And the answer is, well, if we did that the performance would be even worse. You can actually do it, you can add another handler that checks for write system calls and hashes the file every time, but imagine if you had to hash the file every time someone calls
write. I don't want to. Okay, the comment was that there was a talk yesterday in the golden room where they showed using eBPF instead of ptrace for similar work, and performance was great. Something to explore. Yes, go ahead. The question is how does this compare to ScanCode's TraceCode, which is supposed to be doing the same thing. We have not measured against that; I didn't even know it, you know, sorry. Oh, if it runs on strace, it's, yeah, it's the same, as strace uses ptrace. Any other questions? Wow. The question was what happens when you run it while building a container, you mean? So again, remember, it just records, it just records all the system calls that go to the disk, right. So when you run it while you do docker build, it will record all the files being used, so if you do COPY things, it will record everything being copied into the layer, right, stuff like that. Yeah, go ahead. To be honest, the command doesn't have to be a build command. You can literally plug anything in there, ls or whatever. I mean, we are not supporting it, we can't promise that we will be supporting it in the future, but you can do that, so anything you can imagine, it will run. With Docker, it will probably record that docker was called; Docker will still run all those dependencies, it will still try to compile, it will still link against all those libraries, so yeah. Any thoughts about going off Linux to, uh, Windows? We are at an open source conference. Repeat the question. Ah yes, they asked if we're planning to implement something like this on Windows. Well, no. First of all, it's hard. The idea would be great if I had developers who can do that and know the corresponding magic to do in Windows. PRs welcome. It's an entirely different process. You don't have all the system calls; on other Unixes we can work with that. There must be something on Windows, but I don't know. No? Okay, thank you. |
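To make the recorded data more concrete, here is a small sketch of the per-file record from the functional specification (name, absolute path, size, content hash) together with a Turtle-like rendering of a "process reads file" relationship. It does not reproduce build recorder's actual output: the `b:` prefix, the predicate names and the choice of SHA-256 are assumptions for illustration, and the real tool gathers this by tracing system calls rather than being handed a path.

```python
import hashlib
import json
import os


def file_record(path):
    """Collect the attributes the talk lists for every file."""
    abspath = os.path.abspath(path)
    digest = hashlib.sha256()  # assumed hash; the talk does not name one
    with open(abspath, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return {
        "name": os.path.basename(abspath),
        "abspath": abspath,
        "size": os.path.getsize(abspath),
        "hash": digest.hexdigest(),
    }


def to_turtle(file_id, rec, reader=None):
    """Render one file record as Turtle-like text (hypothetical vocabulary)."""
    lines = [
        f"b:{file_id} a b:file ;",
        f"    b:name {json.dumps(rec['name'])} ;",
        f"    b:abspath {json.dumps(rec['abspath'])} ;",
        f"    b:size {rec['size']} ;",
        f"    b:hash {json.dumps(rec['hash'])} .",
    ]
    if reader:
        # relationship: the given process opened this file for reading
        lines.append(f"b:{reader} b:reads b:{file_id} .")
    return "\n".join(lines)
```

Calling `to_turtle("f1", file_record("foo.c"), reader="pid2")` would mirror the slide example in which the preprocessor, pid2, reads foo.c. Hashing the whole file on every open and close is exactly the cost that the performance discussion in the talk is about.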
Discussion on SBOM contents |
Welcome. So to break the flow of presentations a bit, there's now going to be a discussion, not just a presentation. But I'm still going to give a short presentation first to create some context. First, who am I? Okay, I need to speak louder. Okay, no worries. So I'm an embedded software architect, and this discussion is also mostly focused on the embedded aspects. I'm working on Linux OS integration as a consultant for dozens of customers. And I'm also a maintainer of the Buildroot project, which has a team of four or five maintainers, depending on how you count. And that's actually the context from which I give this presentation. I don't actually care about SBOMs. It's just something that needs to be done. And so, yeah, there we go. Because maybe not everybody is familiar with it, I'll give a quick overview of what an embedded Linux build system is. So basically, it's taking a lot of sources, some open source sources coming from the internet, some in-house components that you can get at in various ways. Sometimes these in-house components are going to be binaries as well. And then the embedded build system takes all that together with the configuration and produces a number of artifacts. One thing to note is that the number of artifacts is really small. We are talking about maybe five files or something like that. It's not like when you create an operating system, where you have all these files that you need to keep track of. So from my point of view, as a maintainer of such an embedded Linux build system, the problem is actually quite simple. We know what the inputs are. We have just a few outputs. And we can just say, okay, these outputs are generated from these inputs. So that's actually what we do in Buildroot. We don't have SPDX at the moment. We don't have anything complicated. We just have a list of packages with the package name and version, and where it comes from, the source URL.
Also, the tarballs themselves and the hashes for checking the tarballs, the patches which are applied to them, the licenses, and the dependencies. So the build dependencies: what other packages were used to generate this particular package? And then the assumption is that all of this together goes into your target image. So there's no distinction of what is used for what particular output. There's also a list of files per package, which you can kind of use to get more fine-grained information. And then what I think is the actual thing you want to have is the CVE information. So the two things which I think are needed are the licenses, and that's part of this top part, and the vulnerabilities, the CVE information. And so there's a separate tool that extracts that. It uses CPE IDs to relate our package name and version to what is in the CVE database. Now when you do this, this is of course not reproducible, because it uses the CVE information as of a certain time, and new CVEs are created all the time. So it's something that you have to rerun all the time. So as I said, it's very simple. There are a lot of things which are missing, and my question is basically: is this something relevant to work on? So one thing that is missing is external files that you supply yourself. It's basically all the configuration which I mentioned here, this part. The assumption is that as a user, you know what these files are. You can inject them yourself. Same with the Buildroot source. We could make a tarball of the Buildroot source, but we didn't really see the point. Then we come to more important things: vendored dependencies. SBOMs are used for two purposes, basically. One purpose is license information, and the second purpose is vulnerability tracking. Now if you have a vendored dependency in some package, we just see the top package that vendors it in, and not the actual vendored dependency. So we don't have the package name and version of that.
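As an illustration of the flat, package-level list described above, here is a minimal Python sketch. It is not Buildroot's actual data model: the class name `PackageRecord`, its field names, the example URL, and the placeholder hash are all assumptions made for the example. The only external convention used is the CPE 2.3 formatted string, which has eleven colon-separated attributes after the `cpe:2.3` prefix.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PackageRecord:
    """One row of a flat package manifest (hypothetical field names)."""
    name: str
    version: str
    source_url: str                  # where the tarball comes from
    sha256: str                      # hash used to check the tarball
    licenses: List[str] = field(default_factory=list)
    patches: List[str] = field(default_factory=list)
    build_deps: List[str] = field(default_factory=list)
    cpe_vendor: Optional[str] = None  # vendor part of the CPE ID, if known

    def cpe_id(self) -> Optional[str]:
        """Build a CPE 2.3 string that relates name and version to CVE data."""
        if self.cpe_vendor is None:
            return None
        # part "a" means application; the seven trailing wildcards are the
        # update, edition, language, sw_edition, target_sw, target_hw, other
        # attributes, which this sketch leaves unspecified.
        return (f"cpe:2.3:a:{self.cpe_vendor}:{self.name}:{self.version}"
                ":*:*:*:*:*:*:*")


# Example values are placeholders, not real release data.
record = PackageRecord(
    name="openssl",
    version="1.1.1",
    source_url="https://example.org/openssl-1.1.1.tar.gz",
    sha256="0" * 64,
    licenses=["OpenSSL"],
    patches=["0001-example-fix.patch"],
    build_deps=["zlib"],
    cpe_vendor="openssl",
)
print(record.cpe_id())
```

Note that a package which vendors other code still yields a single record here, which is exactly the gap raised in the talk: the vendored component has no name, version, or CPE ID of its own in such a list.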
This used to be not much of a problem because not many people were vendoring, but now you have Go stuff, Rust stuff, NPM stuff, which brings in all these dependencies, and they're kind of invisible to us. We also have everything in one file, not spread out over dozens of SPDX files. Whether this is good or bad, I don't know. We don't have an SPDX format. We have it only at the package level, not at the individual source file level. So our inputs are basically tarballs, not C files. And as I mentioned before, we don't have a mapping of source to target files. So that brings us to the discussion points. For me, the most important thing to discuss is: why are we doing this? Who are the consumers? How is this information going to be used? Because that kind of determines what should be used as input as well. If you look at, for instance, the SPDX specification, it doesn't really say whether you have to look at a source file or you can treat the tarball as a source. It just says, okay, there is a relationship there. I'm going to give you the microphone. I'll come up here with you. So, sorry, back to the question. You got me confused now. SPDX: individual source files or tarballs? A tarball is just a file. Exactly. And you can use SPDX at any level. And so if you just want to look at the package level, and say, hey, it's this source file, this tarball file, that's fine. That works. You do not need to take it down to the source file. And there's a minimum set of fields. All the concepts you had up there, you should be able to express right now with what we've got in SPDX today without any trouble. And so you basically would put a package there with the metadata that you want to keep at the higher level, and you point maybe to a file that has a hash. Simple enough. Yep. So, I see. Another remark there.
I'm guessing that at some point you would want a kind of dependency tree, which can pop up, so you get a full list of more dependencies. And in conjunction with what was said about the tarball, which contains other files, could this be done in a way that you eventually have it in a flat format, or in a way that can be flattened, with all the dependencies at the top level, so you can parse them as much as you want? So the remark or question is, if I understood correctly: it is basically a hierarchy of dependencies, but you can flatten it to just have input and output. What I think is that for an embedded build system it's enough to only have this flat one without the hierarchy, because the hierarchy is difficult to determine for the embedded build system. And I don't think it has useful information, unless there is anybody that can say there is actually useful information in the hierarchy. Yeah, I'm going to try to speak about that a little bit later. But yeah, there's a ton of uses for having a structured SBOM. If you saw, for example, the Siemens use case, where they enrich SBOMs: you need to have that structure to let you know when the enrichment happens and where the extra information is going to go. Also, if you want to compose an SBOM by taking pieces from another one, like my friend Ivana here, who has been doing really great work on composing SBOMs: if you want to compose an SBOM by taking pieces from one and moving that data to another one, you need to have that structure. And there are several use cases where you need the structure. But then I wonder, I mean, if you compose SBOMs, there is supposedly also a corresponding composition of the binaries themselves that the SBOMs describe, right? Because in the end, an SBOM is a description of a binary. The binaries can be repackaged, for example.
You can have a binary product of a build, and then you zip it, and you ship it, and so on. So indeed, if you're going to repackage stuff, then this is relevant. I'm surprised that there are use cases for repackaging stuff. So, yeah, there's a question from the chat: what about handling patches? Good question. So what we currently have in Buildroot is that patches are just one of the sources, and they're described as one of the sources, or, well, they're included in the tree as one of the sources. Is there anything else to say about patches? There's also a specific relationship for patches. Yeah, in SPDX there's a specific relationship for patches, because it's a modification rather than an actual source. What about naming? Yeah, so, I think there was one more remark about patches. Yes, I was the one in the chat. The thing is, if you have a curl, and you have a curl that you have patches for, it's not the same curl. There could be other vulnerabilities in your distribution than in the original one from GitHub, or another one from VST. Yeah, so indeed, it's essential that you track patches. I mean, you definitely have to record that what you are using is not curl version X, but curl version X plus patches, and then also which patches those are. Which takes us to the naming problem. Yeah, so you've got a naming problem. You say OpenSSL: is it the OpenSSL, or is it an OpenSSL wrapper, or is it an OpenSSL someone's patched, or modified, or built in a particular way, because you've got so many options? So the remark was: if you say this is OpenSSL, or even OpenSSL version X, that doesn't necessarily uniquely identify it, because it can be patched, or it can be built with certain options, so that information about how it's patched and how it's built has to be recorded as well. So do you capture that as part of Buildroot now? Well, implicitly, yes, but not explicitly.
So it's captured because we have the baseline, which is basically identified by the CPE ID, which is the upstream version, let's say, or, well, the version as published by the maintainer of the package, and then the patches are recorded separately. The configuration, as I mentioned before, in Buildroot we don't really record. That's up to the users themselves to record, so, I mean, there's definitely room for improvement there. Yeah, there's a problem with anything that's building from source in this way, which is that the name and version are not unique. This is why we have to attach this. Right, yeah, exactly. This is why, recently, we have hashes. Yeah, you have to check the hash, and that's why you need the build information, because just because it says OpenSSL 1.1.2 doesn't mean that the place where your security vulnerability is was actually even compiled into the code, right? Yeah, so the remark was that, what was the remark? The package and version information that you're talking about. Yeah, package and version information is not enough, it's not unique, yeah. So it's not going to be the same, it might not even be the same between two people that built the same thing. Yeah, because of configuration. Yeah. And the solution for that is hashing. Yeah, you can hash the outputs. Hash the outputs, yeah, but then the thing is, simply hashing the outputs, okay, then you have an identifier of something, but you actually have the output. You could hash that, but that doesn't give you any information. Right, that's why you need the build information itself. Also, even there, the usefulness is a bit limited, because in the end it goes to the CVE database, and in the CVE database you don't have this information anyway.
You don't have it in a, well, you may sometimes have it in an informal way in the description, but you definitely don't have it in a formal way saying "if configured with X, then...". So unless there are also some changes there, I don't think there's much use. I mean, it's important to record it for manual analysis, but since on the other side there is no formal recording of it, I'm not sure it makes sense to record it formally. So what we do for the CVEs is, we only put the CVEs in, and then just which ones we've patched. So that way, if you go look it up, you know, I don't need to worry about this one, but if there are any new ones... Yeah, that's basically what is done in Buildroot as well, but not in an SPDX format, just... The CVE ID is an external reference field you can associate with the package. Yeah, so the remark is, in SPDX, yeah. In SPDX there are the external references, and you can associate a CVE with a specific package. You can also associate a purl with that same package, and if you wanted to put both of them there, you could. It's flexible there, and because of the time scales of vulnerabilities and so forth, what you want to record at a point in time, whether something's patched or not or other things like that, this is all hopefully able to be done. The question is, you know, are people having tools that are semantically accessing it right now? Yeah, well, so one of the things about the consumer tools is that we are actually seeing tools emerge. In fact, I know of two off the top of my head that are basically consuming SBOMs and matching them to vulnerabilities. So that takes care of the monitoring over time, because there are two different time cycles. There's what's known at build time, and then there's what's known in the field over time. And so the two projects I'm aware of are Daggerboard, and the other one is the one that's sitting in the SPDX repo that looks up vulnerabilities.
You basically feed it an SBOM in SPDX and it will go and query the databases for vulnerabilities. Yeah, SPDX to OSV. To OSV, yeah. And so there are tools out there that are emerging, and I think we'll be seeing more and more in the years to come. Yeah, maybe as a reaction to that. So my intuitive reaction is, yeah, but we also have a tool that generates this information already. I mean, as part of Buildroot, you can just run that tool again five years later and you get that information. But there's a caveat, where I think it's actually useful to have the build information formally recorded, and that's basically the same thing as what archaeologists do. You don't know what techniques are going to emerge later, and that can be useful then. So if you build something now, you should record all the information now that can be recorded. The other little addition to that is that this use case is also very important in the high assurance world, where you are being asked to attest exactly which vulnerabilities you know about. But that's an audit case or a high assurance case, and so in some places they will want to have that. Okay, I would like to move on to a different subject which I mentioned here. That is vendored dependencies, because I think, I mean, from the point of view of an embedded build system, that's an important thing to solve. We actually have multiple kinds of vendored dependencies. I'll first give an intro. So we have some vendored dependencies which are directly included in the source code. For instance, Tomlip is a good example. Tomlip is a library that is meant to be vendored in. And so, yeah, people just copy it into their source code. So that's really difficult to trace. Then there are Git submodules. You clone a Git tree. That is the information you have as a build system. That's the information you have in your build metadata.
But then if there are submodules referenced from there, that information is not part of the metadata of the build system. And then Cargo, Go, and NPM modules, obviously. And then there are some cases where the build system itself vendors things in. For instance, in OpenEmbedded, you have the kernel meta, which is kind of vendored in. And, I mean, I don't know if it's taken into account for the SPDX, but you need to take special action there. That's the important thing. It's not using the normal path of taking sources. So, yeah, my question to the audience is: how can we deal with these vendored dependencies? You had a remark. A question? An additional question. Okay, sorry, I'm going to keep you. This is actually a problem beyond just SBOM generation, because if you're trying to do air-gapped builds, these are huge problems in general, which we've encountered. So, you know, ideally we could download all of these sources and archive them, and then go tell the tool to pull them from the archive, which is often difficult if not impossible. So, yeah, it'd be nice if tools like Cargo and Go... I think Cargo is not too bad. But Go and NPM were pretty bad, last we checked, for being able to do that kind of thing. And that would make this a lot simpler too. Yeah, so the thing that I want to add to that is that getting the sources is not the difficult part. So if it's just for licensing, that's doable. But for supply chain, you want to know provenance. And that's the hard part. I think for all these things, unless the upstream itself gives some provenance information, preferably in a formal format, it's really hard for us as a build system to go and look for it. For Cargo and Go, it's actually doable to do it on our side, because you have a lock file which gives the exact information, not in an easily consumable format, but the information is there. For NPM, it's, yeah, of course it's NPM.
And for Git submodules also, the information is there, but you don't have version numbers. You have Git hashes, which are not something that you will be able to check against the CVE database. And to some extent, that's also the case for Cargo and Go versions, because often they're also specified by Git hash. And then the ones directly included in sources are, as usual, hopeless. Any other? Well, it's just some perspective. So one thing to keep in mind about those is, there are those dependencies that get vendored in, and you need to capture those in the SBOM. You can see them in two ways. One is from the dependency list angle, and in that case, just having the version and the name, and ideally the hashes of those dependencies, is enough. And the other angle is if you're trying to actually inventory all of the files that get vendored in. So it depends on the use case of the SBOM. You may want to capture one or the other or both. But just for dependencies' sake, depending on the build system, of course, it may be the case that you only need the list of dependencies without the actual file information. Yeah, the thing is, the file information is easy to get; it's the list of dependencies which is hard to record. I think, picking up on what Adolfo just said, a lot of build dependencies just say "latest", or are not explicit. The view is that people say, I want to keep up to date patch-wise, because we're told we've got to keep everything patched, but for making it dependable, it's the worst thing, because it's going to change. So, yeah, I was trying to just get a debate, but actually I think that's a lot of the thing, and people are generally lazy: I'll just pull in OpenSSL. I don't care what version, I'll just pick up the latest, because it'll be version X today, it'll be version Y next week, and I'm not bothered about that, I don't want to have the admin. So, how do we handle that?
Because with the growth of ecosystems, that basically is what people find very convenient. Yeah, so I don't think we have an answer to that. I think we can take one last question, the question you had, and then we have to stop, I think. And the question would be: how would you handle vendored dependencies that come as binaries? Yeah, the question is, how do you handle vendored dependencies that come as binaries? I guess you record what you know, that's the perfect answer. Actually, this is probably the answer in general to any question: you record what you know. Actually, you want to record it in as formal a way as possible, because, like for these Cargo and Go dependencies, our source actually has the lock file, which has the exact information about the dependencies; it's just not something that you want to go and crawl through afterwards to reconstruct it. Would it be doable to have two phases: first download the sources, then you are in a container, you cut the internet, you stop the interfaces, and you build? So there's one more remark about the offline build thing. To solve the offline build problem, what you can do is cut the build into two phases. The first phase, where you just do downloads, and then you have a download directory, which you expose to the second phase, where you do the actual processing. And there are two problems with that, which I don't think are solved in either Yocto or Buildroot. The first one is that to do the downloads, you actually need some tools which you have to build. To do the Cargo downloads, you have to build Cargo first. Yeah, but that means that you can't completely separate your build from your download step, because you need to build something to download something. And the second issue, I forgot what the second issue was. Yeah, hermetic builds should be available for everybody. But we need the tools to support it.
People don't realize that you need your tools to support that. Yeah, so you can maybe do something like a download step, a build step, a download step, a build step, but it's getting complicated. We're out of time. Thank you. |
Using SPDX for functional safety |
So, next up, Nicole. So, yeah. Okay. Hi, everyone. Hi, everyone at home. So, yeah, I'm talking today about how to maybe use SPDX to establish traceability for functional safety. So, about me: I've been a software developer for quite some time, always working in some kind of safety-critical project, and that somehow brought me, about 12 years ago, to really focusing on functional safety. And so, currently, I'm working as a tech consultant, still mainly in functional safety, a little bit security, a little bit license compliance, where that gets critical for software components. I'm involved in some of the Linux Foundation projects. And, yeah, just one important thing about me: I'm really not good with faces and names. So, if you walk past me and I don't say hello, it's just because I'm not really sure if it's you. So, just say hello to me and, yeah, then I know it's you. And you can contact me, usually with a handle at Nick Pappler, at the usual social media or GitHub places if you want to find me. So, what can we do for functional safety with SPDX? I'm not sure who here is aware of functional safety or what it is. Oh, a few, that's cool. So, yeah. It's more than I thought. So, I'm happy. So, yeah, safety, generally, is freedom from risk, or a minimization of the risk of getting hurt, of doing something bad. And the functional safety part of that is that whatever can break or go wrong in your system is handled, is detected. So, you have mainly two options here. Avoid something bad happening: avoid the microcontroller breaking, avoid your software doing something you haven't intended it to do. Or, at least, detect things going wrong and then define and initiate mitigation measures, define safe states, so what to do when things go wrong, and implement this, and hopefully have a safe system at the end. So, what do we need if we want to do this?
So, as I already said, you need a system that's robust and suitable for your safety application, so that it has in itself some features so that it doesn't kill people, that it doesn't hurt people, that it doesn't hurt the environment, and several levels of analyses, of tests, of verification measures to prove that your system is as you intended it to be. And to do so, to really make sure that what you have specified is what you have implemented in the end, you should have a process, you should have guidelines, methods, and you should plan how to do this. And, unfortunately, this results in a lot of documents. You usually start with something called a safety plan, so it's a kind of project management plan, or any kind of plan or strategy for how you want to do this. There's a plan for how you want to verify this, you have your requirements, you have an architecture, coding guidelines; you have loads of documents, 50 to 100 documents in the end, so it's really a pain. And I think most of us here are engineers and, yeah, engineers like to engineer. And so, we like creating a fantastic system, maintaining a fantastic system. Yeah, we do have to, at least in industry life, apply a process to do so, and then we have to ensure that all documentation is in place and all evidence is consistent. So, yeah, the first two things we really like to do: we like to create things, we like to maintain things, we like to have them fantastic. Applying a process, yeah, if we have to; some, like me, like processes, but I think 99% don't. And, yeah, ensuring all documentation and evidence is consistent, that's, at least for most engineers, no, no, you don't like this, this is pain. At least emotionally, that feels like something I have to do on top, because I know what I did, I know what I want to do, why do I need to record this? It's clear.
So, if we just assume the process to create and maintain all artifacts that you need for functional safety, the plans, the requirements, the verification evidence, is established and you will do this, you still have the biggest pain there: keeping it complete and keeping it consistent over all the time, for all your variants, through all your patches. So, the idea was then, hey, why can't we use an SPDX-style solution there, where you have the full traceability of a safety package or whatever, to track all the possible combinations of your system? And there is something in SPDX called relationships, and these relationships, when you look into them, actually describe exactly what we do in functional safety with the idea of traceability. So, really traceability: this belongs to that, in that version, and it goes into that, and it's created according to that. So, actually, when you look into the relationships list, nearly everything that you need is there. So, how could these relationships then really work for functional safety? So, FuSa is geek speak for functional safety. The documentation structure for functional safety usually consists of three types of documents. On the one hand, you have all your plans, your processes, your guidelines, your strategy, how to do things; then you have your requirements and specifications, saying, okay, what have we done or what do we plan to do from a technical point of view; and you have all your verification evidence, your safety analyses, your STPA, tests, test cases, reports, any kind of evidence in the end. And, yeah, we are that old, we like the V-Model from a functional safety point of view. At least we want to try to have something that is understandable by people who like the V-Model. So, any kind of documentation, of work products, of artifacts that we have usually fits into this V-Model.
So, we have the requirements, we have the full workflow, we have tests associated with requirement-style information, we have reports that document that we did this test and the outcome of the test, and we have all these documents that are associated with the processes, the plans, the guidelines, yeah, how to qualify things, how to assess stuff in the end. So, we have everything in plans, we have everything in the V-Model, and as you see, the documents are somehow interconnected. So, when we speak from the functional safety point of view: what are plans, processes and guidelines, what kind of document is that? Actually, it's just a kind of artifact that specifies something. It specifies how to do things, how you plan to do things, how another document or another artifact should be structured or created. So, it's always a specification. So, when we look at the standard safety document, the safety plan: the safety plan itself is just a specification of how to do safety in the project or for the product. The coding guidelines are a specification of how to create the code. So, a plan is always a specification of how to do things. Then, when we look into the product documentation: what kind of product documentation will we need to manage, or do we have, in our functional safety project? We have requirements, we have report-type documents, test results, some analysis results, and we have the code. And when we look into the product-style documentation, it's a little bit more complex than plans. So, yeah, a safety requirement specification, clearly, is a specification of the safety requirements, of what you want to achieve from a safety point of view. A unit test is a test case related to a part of the specification or to a part of the code. The unit test report itself, then, is a documentation of the test case again. So, yeah, as you see, everything is connected, everything can be expressed with a relationship.
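To make the idea concrete, here is a small Python sketch of how the documents just described could be linked by relationships and traced back from the code. The relationship type names (`SPECIFICATION_FOR`, `TEST_CASE_OF`, `DOCUMENTATION_OF`) exist in the SPDX 2.3 relationship list; the artifact names, the tuple layout, and the helper `trace_back` are assumptions made for this illustration and are not taken from the talk's slides.

```python
# Each entry reads: (source artifact, relationship type, target artifact).
# Artifact identifiers are hypothetical.
RELATIONSHIPS = [
    ("coding-guidelines", "SPECIFICATION_FOR", "source-code"),
    ("safety-requirements-spec", "SPECIFICATION_FOR", "source-code"),
    ("unit-test", "TEST_CASE_OF", "source-code"),
    ("unit-test-report", "DOCUMENTATION_OF", "unit-test"),
]


def trace_back(target, relationships):
    """Return every artifact that points at `target`, directly or indirectly."""
    seen = set()
    queue = [target]
    while queue:
        current = queue.pop()
        for source, _rel_type, dest in relationships:
            if dest == current and source not in seen:
                seen.add(source)      # remember it so cycles cannot loop forever
                queue.append(source)  # then follow what points at it in turn
    return sorted(seen)


# Everything that specifies, tests, or documents the code:
print(trace_back("source-code", RELATIONSHIPS))
# Only the report documents the unit test:
print(trace_back("unit-test", RELATIONSHIPS))
```

The report is reached from the code only through the unit test, which is the kind of chained traceability a V-Model documentation set relies on.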
And, yeah, I've been working in functional safety for quite some time and I always say it's about a safe product, it's not about assessments, it's not about standards. But yeah, the ever-present question is: what do I need for the assessment? How can I make my assessor happy in the end? So, I say you have to have a safe product, and they say, I want to make my assessor happy. So, what do you need to make your assessor happy? Yeah, to begin with, they will ask you for a lot of planning documents. The safety plan, a product architecture or your design, a strategy for how you want to do things, so that they really have an idea in the beginning. What's in the scope? What do you plan to do? Do you have any idea at all of what you want to do? Then, yeah, the concept you show should be sound, should be safe; at least you should have an idea at the beginning, and you should have an idea about the completeness and consistency of your plans, and in the end really present everything as a consistent, complete and, yeah, full product. So, when we look into this, again, it's always related documents, and from the assessment point of view it's a package, and the package is described by a list. So, you always have lists of documents: okay, at this point in time, dear assessor or dear internal QM department, we have this part of the information and we start from there; and then later you have more information and say, okay, I have a new package and you get this one; and in the end you have the full package and you have all your reports, all your configuration, all your calibrations, and you will deliver that one. And actually, when we look into this from the SPDX point of view, yeah, it's all information that you can use to generate SBOMs for safety, or we call them now safety SBOMs, to really support your safety compliance argumentation and to always deliver the information about what is in the product, what is in the scope. So, how can I use this, maybe, in detail?
So, as I said before, usually when you start an assessment, you start with a concept assessment: what's the plan, what's the scope of the product, what's the safety scope of the product, what's maybe the initial architecture, and how do you plan to do things. Then you have a final assessment saying, okay, we are ready now. We want to get a final safety approval. We are really sure that we have a great product and we want to ship this. And as real life is, you will have reassessments, because your product is evolving, you have CVEs and all that stuff. So, what do we really need? So, as I said, the concept assessment has the goal that you have a proof that your concept is sound, that you can generally start working like that. So, you have your safety plan, any other initial plans where you set up how to do things, you'll have a safety concept. So, yeah, you will have a package of information, you will have a list of things that you want to deliver. So, when we look into the SBOM types that we heard about this morning, it's actually nothing else than a design SBOM. It's how to do things. You can think about this like when you build a house: it's like the plan. The plan for how to build a house: what will be the methods you use, will it be in stone, who will come to do this, will you need certain machinery, something like that. Then, what? Okay, oh yeah, that was it, sorry. Then, the next stage really is going through your assessment stages. So, yeah, you have your concept ready, and you won't get all the things ready and then ship them without some milestones in between. So, for each milestone, you will have a package of things, you will have a list of things that are not really built into a final product, but that are, yeah, somehow sources of information for what you have done so far. So, yeah, you could, for example, use the source SBOM for that, really as a list of all documents that will be part of your product up to then.
Then, the final stage is, yeah, you finally have everything that you want to have in your product and you start testing. And so, you have really all your plans, you have all your specifications, you have all your test cases, you most probably will have all your test reports and you will deploy this and see if it runs the first time. So, yeah, it's the first build that you have of your product. So, you can ship a build SBOM with completely everything that you need for the product, for the testing of the product. And then, when we look into final things, this is when things become tricky because, yeah, I come from the automotive world, yeah, you have one software build and then you have a bunch of calibration and configuration data that goes with your software into the vehicle, that goes on the road or into several types of vehicles. So, you then have a different set of configuration and calibration data going with your release and that's really getting into the question of what is really deployed on your system. Because you have your runnable and you have your configuration data and you have a bunch of versions of that and it's really hard, honestly speaking, to keep track of this. So, here, this deployed SBOM would really come in handy, to say, okay, we have everything and we know with a generated list what's really deployed on this vehicle. So, the real beauty, what I really see in this approach, is that you can really close the loop between what you want to have and what you have in the end. So, standard in safety development is, yeah, you have the configuration management plan saying, okay, what types of documents do you have, what do you want to do then, how do you want to version them and what are baselines and branching and all that stuff, and you have a list of documents that you plan to have. So, that's really, yeah, it's called the configuration item list in most projects.
It's really the list starting from the safety plan down to each source file that you have. But in the end, again, you will compile this into what you really have in the end, and in the safety world that's usually the safety case. So, really, a compilation of all the safety evidence that you have. In most cases, it's a copy of the configuration item list with all information attached: what's the final and valid version of this configuration item list member that goes into the final release. And in, yeah, most cases that I see, this is a manual process. Partially, they might generate something, but there's a lot of manual stuff in there. And then, yeah, comes the assessment time. You need to really make sure that everything that's in the configuration item list really goes into the safety case and that it's consistent and all that stuff. And usually, this is really done in manual spot checks. So, it's a pain and it's error-prone because spot checks will just give you a picture, yeah, there's only so far you can go with spot checks. So, the idea here, oops, there's a typo. The idea here is really that you have your generated SBOM with everything that is deployed and you can then more or less automatically check back: is everything from the configuration item list in my generated SBOM, and do I have some open ends in my relationships? Because if I have open ends, there will be some gap in what I have, maybe not tested, not verified, or just not thought about, maybe in my analyses. So, you will see the open ends. You will see that there are things that might not match. And the next beauty is, yeah, you can do this process manually as often as you want to, but it is a pain and it takes time. And it is error-prone and it needs a human and it needs a lot of Excel lists, PDFs, whatever. It's a pain, it's hard to document, it's hard to keep track of it.
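The check described here, comparing the planned configuration item list against the generated SBOM and looking for open ends in the relationships, could be sketched roughly as follows. This is only an illustration under assumptions: the file names, the `Relationship` class, the `find_gaps` helper and the relationship kinds are all hypothetical and are not taken from the SPDX specification or from the group's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str  # element the relationship starts from
    kind: str    # illustrative label only, not an SPDX relationship type
    target: str  # element the relationship points to


def find_gaps(planned_items, sbom_items, relationships):
    """Compare the planned configuration items with the generated SBOM."""
    planned = set(planned_items)
    present = set(sbom_items)
    # Every element that takes part in at least one relationship.
    linked = {r.source for r in relationships} | {r.target for r in relationships}
    return {
        # Planned, but never made it into the SBOM.
        "missing_from_sbom": sorted(planned - present),
        # Deployed, but nobody planned for it.
        "unplanned_in_sbom": sorted(present - planned),
        # In the SBOM, but not connected to anything: an "open end".
        "open_ends": sorted(present - linked),
    }


# Hypothetical configuration item list and SBOM content.
planned = ["safety_plan.md", "requirements.md", "brake.c", "test_brake.c"]
in_sbom = ["safety_plan.md", "requirements.md", "brake.c", "calibration.bin"]
rels = [
    Relationship("safety_plan.md", "describes", "requirements.md"),
    Relationship("requirements.md", "specifies", "brake.c"),
]

report = find_gaps(planned, in_sbom, rels)
for name, items in report.items():
    print(name, items)
```

Because the comparison is just set arithmetic, it can be rerun on every build instead of relying on manual spot checks.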
And it's not like, yeah, what we had maybe 10 years ago, where you release once a year and maybe you have a patch every three months if you're really into patches. You have CVEs coming in, you have continuous integration, you have bugs, you have product variants, and doing this loop all the time manually, you are not able to keep track of it. And that's one of the biggest issues in the safety world at the moment. How can we have changes due to the CVEs, due to the security things that we need to implement, that we need to update? We can't say, oh, we have a safety certification, we cannot update, we cannot react to security issues. That's just not possible. And so the idea here is, as long as you have your BOM, your SBOM, your safety BOM, whatever you want to call it, of really all your documents in the configurations that you have in the end, then you can automatically scan again, do I have this, where do I need to change things, change things, have this run again, and see if you're still complete. So, yeah, that's the idea. So we started with this idea about, yeah, I think maybe half a year ago. So there are still a lot of open topics. There are details about how to set up the tooling, what tooling to use, what do we need, what is there, how can I do this. Then the full relationship model between the safety artifacts. It's also ongoing. A complete model of document and evidence types. So we do have document types, but is what we currently see sufficient? Then we want to have a pilot project as proof of concept. Yeah, there are candidates for that. And yeah, I'm from the safety world. I speak safety, compliance, standards. But there are security standards, for example, and other standards that ask for this compliance information. There's now, in automotive, ISO/SAE 21434, that's a new security standard. It asks for similar things to the safety standards. Yeah, I do have a great system. Yeah, that's possible.
I need a consistent set of documents, documenting that I have a great system. Eh, that's where we might have a problem. Those are the things we're still looking into. And yeah, we're happy if anybody wants to join us. We meet every Friday evening. We have a call. All the information you will find here on the page of the FuSa Special Interest Group at the SPDX project. Okay, so questions please. Feedback, comments. Do you think it's complete crap? Can this go somewhere? Yeah? So, you presented the process, like what it takes for a safety case to be made. Yeah. But I could not get, like, what your group is doing. So are you building some tooling that enables you? Ah, okay, what the group is actually doing. How does SPDX actually come into play here? Okay, so the question is, what is the group actually doing and how does SPDX come into play? So yeah, we do not define the process. So the process, the structures, just give me a sec. So the structure like this and how to do things is already there. How do I make these things? And there are a lot of ideas around how to connect these things to have the full traceability. It ranges from having handwritten tags in Word documents to using special life cycle tooling. These approaches are established, but they are not so well interconnected most of the time. And later when you come into frequent updates, product variations, then it's really getting hard to keep track of things. And the idea that we are currently following is that each piece of this information doesn't need to be in a tool. It doesn't need to be in a special kind of format. We can connect these with the relationships from SPDX that are already in the 2.3 specification. We're doing the exercise of the mapping to check, to make sure we're not missing anything for 3.0 as well. Yeah, so, Kate just added, we do the exercise of the mapping so that everything, sorry. Will fit for the spec. Yeah, that it will fit for the spec. And maybe things go into 3.0 or later.
Yeah, is this a follow up or a next one? I just wanted to ask, is there a publicly available example that we could check out? So the question is, is there a publicly available example? Not fully. So we're talking about how to create the use cases, how to create the overviews in the calls. So you're always welcome to call in and see what we have. Right, I really like this. As an engineer, a solution architect, having something like that and all the traceability, particularly out to the external world of the regulations. So when we did the safety thing, we had 800, quite 800 documents to try and trace. And they're all changing. And actually you want to look at the impact assessment of that change, of a change of regulation or a move to a different market. You want to be able to see that. So if you could capture that, and whether it's cyber or safety or anything, having that traceability must be a great improvement. So I'm sure people like Siemens would love it for this type of system thing, you know. Build the power station. Yeah. Yeah, this thing is, I don't know. I'm thinking of adding to shift. I'm basically already had huge amounts of things about safety and the main and international stuff like that. So capturing all this in a thing so you can see the relationships and changes, so you can see the local effect. If this went up from version one to version two, where's the impact? Which part of the design? Which part of the supply chain? Yeah, exactly. Yeah, I would also love to have this. At the moment, the idea is really, okay, I need the evidence for me because I need to be sure what I have and what I have deployed. But yeah, the next thing is I'm making a component part and I ship this down the supply chain. And the next one who wants to consume this wants the information.
And it doesn't make, maybe not want, I don't want to give out all the information that I have, but I can give the relationships and the completeness information and I can do this in an automated way that can go into an automated supply chain process. You've passed the supply chain, you've passed the relevant bits of the standard. Exactly. Everyone follows this thing because actually 95% of those standards. Oh, time's up, time's up. Sorry. So yeah, it would be great to follow up on this. Please meet us on Fridays in the evening if you want to. And yeah, I'm happy to talk about this later with whoever shows up. Thank you very much. |
REUSE
The gold standard of communicating licensing and copyright information |
Next up, REUSE. Hello everybody, can everybody hear me okay in the back? Then I'll speak up like this, okay? Okay, perfect. So I'm going to give a talk about REUSE. I'm Linus. I work for the Free Software Foundation Europe as a sysadmin and I'm also one of the maintainers of the reuse tool. And so this tool, so since this is the SBOM devroom, I need to relate what we do with REUSE to SBOMs somehow. And I think the catchphrase, or what I want you to take away from this presentation today, is that if you want to have nice SBOMs downstream, you should push everybody to use REUSE upstream because it makes everything else much, much easier. Or just how REUSE makes everybody's life a bit easier. So typically with free and open source software, you have compliance issues when it comes to license and copyright. There's missing information about the license and copyright holders of your own code and of third party code that you might use in your application. There's license ambiguity, so for example if it just says GPL but you don't know which version. And often when somebody takes the time to put all the information there, it's stored in a silo and it's not actually where the code is. And often another thing is that when you change something in the code or you have a new contributor coming on, then you need to change everything again. Developers also need a lot of training if there are custom solutions, and there might also be conflicting compliance practices. So we thought, why can't we solve these issues upstream, so that when the water flows down, everybody else who's consuming the license and copyright information of source code just has an easier time digesting it. So REUSE is based on a couple of principles. It should be easy for copyright and licensing information.
Everybody should be able to find this easily, so it should be in the file that it applies to if that's possible. Of course with a binary file that's not possible, but if it's plain text that's possible and then it should go in there. Silos should be avoided and all the licensing and copyright information should be stored in the repo. And that info should be readable by humans and machines alike. We also do not want to reinvent the wheel if we don't have to, so yeah, we try to be as compatible as possible. And also licensing should be easy and fun. And yeah, so we try to do that. You can decide whether we manage, but at least we try. So there are three simple steps to using REUSE. First you choose and provide the license. The reuse tool does it with a nice little dialogue for you when you start it up the first time, I can show you later. Then you add copyright and licensing information, preferably to every file. And then we have a range of tooling that allows you to confirm this REUSE compliance, either in a pre-commit hook, in CI, and of course locally on your machine. So I'll go through this quickly. Maybe just a quick show of hands. Who has heard of REUSE before? Okay, so I'm preaching to the choir. Who has used it before? There's plenty of people who have used it before. Okay, so I'll go through this rather quickly. One thing that we do is we save the license texts in a LICENSES directory. I'll make a quick shout out to GitHub later about this. REUSE names them after the SPDX license identifier, and then they're stored in the LICENSES directory. Then the copyright and license information is added to every file. Then it looks roughly like that. And then if you have uncommentable files, binary files, images, or executables or whatever, then you can add a separate .license file, which is a plain text file that refers back to that uncommentable file and contains this information. And also you can use a dep5 file in the .reuse directory.
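The slide with the file header is not part of the transcript. As a rough illustration, a REUSE-style header uses the `SPDX-FileCopyrightText` and `SPDX-License-Identifier` tags inside comments. The sketch below embeds such a header in a sample string and checks for the two tags with simple regular expressions. The sample author, year and the helper names `read_header` and `has_licensing_info` are made up for this example, and this is not how the reuse tool itself is implemented.

```python
import re

# Hypothetical file content with a REUSE-style comment header.
SAMPLE = """\
# SPDX-FileCopyrightText: 2023 Jane Doe <jane@example.org>
#
# SPDX-License-Identifier: GPL-3.0-or-later

print("hello")
"""

COPYRIGHT_TAG = re.compile(r"SPDX-FileCopyrightText:\s*(.+)")
LICENSE_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def read_header(text):
    """Return the copyright notices and license expressions found in text."""
    copyrights = [m.strip() for m in COPYRIGHT_TAG.findall(text)]
    licenses = [m.strip() for m in LICENSE_TAG.findall(text)]
    return copyrights, licenses


def has_licensing_info(text):
    """True if the text carries at least one copyright and one license tag."""
    copyrights, licenses = read_header(text)
    return bool(copyrights) and bool(licenses)


print(read_header(SAMPLE))
print(has_licensing_info('print("no header here")\n'))
```

For a real repository, `reuse lint` is the check to rely on. The sketch only shows that the information sits in the file itself and is readable by humans and machines alike.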
This is about to change soon, hopefully, because we are going to develop our own REUSE YAML file, which sits at the root of the repository, where you can define this type of information for uncommentable files or for whole directories, much more easily than with dep5. And then the third step is you confirm that you're actually compliant with REUSE. What are the components of REUSE? One thing is a spec where we try to state pretty clearly how licensing information and copyright information should be added to source code. Then there's a helper tool, the reuse tool, the reuse CLI tool. Then there's a very good tutorial and FAQ that you can look into to answer very basic questions about licensing, but also some advanced stuff. And then there's an API where you can sign up your project and then get a nice badge. Who doesn't like badges? So I've already said that we store the licenses in a LICENSES directory, so the UI of GitHub, for example, still doesn't pick that up properly. It would be very cool if that happened. It's not very hard. And the rest of that, in the interest of time, I'll skip over. So now I'll just show you really, really quickly, because I have five minutes left, I think, how this works, how this looks in practice. So here, is the text size OK for everybody in the back? So here I'm in a non-compliant repo, and I can run reuse lint to confirm that I'm not compliant. I have 6,000 files in this repo, and none of them have any copyright or licensing information. That's not cool. So I can just run reuse init. And now I'm asked to provide licenses, so usually I use something like CC0-1.0 for just configuration files and stuff like that, then we could set GPL. And then did you mean IRGP? And then just call that example. That doesn't matter. So now we see it downloaded the licenses and it created that dep5 file for us. Like I said, this will change. And now I can start adding license information to certain files.
So for example, now I can add the license CC0 to my .gitignore file. I can run reuse lint to see what's changed. And I see now I have one file with correct copyright information. And I could now continue doing that for the rest of the files. But I hope you get the idea that it's a tool that really simplifies this process. And then you can put the reuse lint checks in your pre-commit hooks. They're very terse. They basically look like this. That's all you have to do. And then run pre-commit install. And then before you commit, it does the reuse lint check. So it's very, very simple. It's very straightforward. I had to jump through this a little bit because I don't have that much time, I just realized. So the ongoing development is, of course, the tool. And it's all free and open source. And so you can contribute as much as you like. And we are very happy about everybody who contributes. And then there's an API, which is all fully open source and free software. We have a specification which will be extended with the REUSE YAML file really, really soon. We hope that we can do some better integration, especially with Git forges in the future, so that the UI shows you right away which licenses you have in the repository. And we want to spread. And you can, of course, help us with that. So who uses REUSE? At the moment, we have over 1,400 projects signed up that use our API, which cumulatively have more than 80,000 stars on GitHub. Then there's stuff that lives on other forges, like KDE and its frameworks, that uses REUSE. curl became REUSE compliant, as did Weblate, a really cool translation project that recently became compliant. The GNU Health project, which is an awesome project, the Corona-Warn-App in Germany. And the Linux kernel is trying to become REUSE compliant. That will take a while. And yeah, so feel free to check this. I will upload the slides, and then you have all the links.
If you want to participate, sign up to the mailing list, ask questions, create issues, make one of your own projects REUSE compliant. It's really easy. Integrate REUSE into your community and compliance policies. Help others adopt REUSE. Here, I linked the developer section of the website, which is really the best way to get started really, really quickly. And yeah, I don't know, do I have time for questions? Two minutes. Two minutes. Maybe I can take one. Also just a quick note. Carmen is here. She's one of the main creators of REUSE. So we'll also be happy to answer questions afterwards. I'll take one now. What does bad license mean in the lint output? A bad license? Yes. The linter shows in the header, the first one is bad licenses. I think. It's not an SPDX license. Ah, yeah. It's not an SPDX license. What does bad license mean? And it means that it's not an actual SPDX license. Yeah, but just on, yeah. How do you handle custom licenses? Yes. I don't think we handle them at all. Yeah. You can make a custom license, a LicenseRef. Ah, OK. Yeah. So you can make a custom LicenseRef within SPDX. So SPDX allows you to do that. And then the reuse tool follows that. Yeah. Sorry. Only 15 minutes. Sorry. |
A complete compliance toolchain for Yocto projects
(even very large ones, yes) |
I want to go very quickly, I'm rushing through because I don't want to keep you from the very juicy part. So we are presenting a complete, as complete as can be, compliance toolchain for Yocto projects. So this has been produced in the context of a project we are part of called Eclipse Oniro. It's an Eclipse Foundation project, it's Yocto-based, and it provides an all-scenario platform. It's very complex, the numbers are quite impressive, I cannot run through them, and everything is based on CI, so everything is CI. We also needed to have a completely CI-ed compliance process, so we needed to build with the existing parts and build the parts that did not exist, by providing a dedicated toolchain. This toolchain for compliance has been made not just by us, but by NOI Techpark. Their people are in the room as well, and they have done much of the heavy lifting of programming, and Alberto of course gets the credit for the software. It's also an Eclipse Foundation project, and there is a repo, and you can look at everything. It's based on already existing tools, the usual suspects of compliance, FOSSology and ScanCode, but we had to create a dedicated new custom set of tools, also being distributed as open source. Of course we use SPDX all across the toolchain for many things, and this is important, this is not just for this project, this is meant to be a toolchain for all Yocto-based projects, not just for this one. Through this combination of tools, we have been able to complete a very lengthy process of compliance, reaching 100% coverage of all the components. I don't want to skip the thanks to Debian, which provides a lot of the information that we have reused. Here you can see very quickly the dashboard, so you can track the evolution and the status of compliance. You can also go and have many more details on the single packages.
That is just for knowing what components and what licenses are there, but we cannot stop there. We want to go farther, and we have what we call the second phase of the compliance toolchain, and we decided to go for a graph database for many technical reasons, because there are a lot of interactions and we have to traverse the database very easily. We need to do the missing part, which is software composition analysis, dependency analysis, and incompatibility checks, and everything must be done automatically as much as possible, so this must be entirely in the CI pipeline, which it is. In order to do this, we need several things. This is really important: map all license data from the source to the binary, at file level. By file-level mapping, we mean every single file, including patches, everything that goes through needs to be checked, checksummed, and tracked throughout, in order to make sure that everything that goes inside is tracked. This information comes from many sources. We had an audit team who has done a lot of work, and finally, we need to find a way to see automatically, as much as possible, if the inbound and outbound licenses are compatible with each other through automated tools. This is, in the very short time, the general description. Here is an example of the relationships between components in the database, and there is the next bit, which is perhaps more interesting, and I give the floor to Alberto. So thank you, Carlo, and so the question is, why do we need this? So the Yocto workflow, for those who are already familiar with it, is quite simplified here. We have recipes, recipes that can map to a single upstream source component, but possibly also to different upstream components, maybe the main application, then some plug-ins. So you can have here many different upstream sources combined together, fetched, and combined into a single work directory here, in the unpack task of the build process.
Then there are other tasks, configure, patch, whatever, basically build. We get binaries, and binaries are combined together to form the final image. But the problem is that when it comes to upstream sources, we have upstream components that have multiple licenses inside them. Maybe we have different components with different sets of licenses, but the thing is, we have only a small subset of the binaries that we could generate from all this stuff, and for the binaries, you don't know what the actual license is, especially when here you have kind of a mess. And the thing is that you risk that some dirty stuff ends up in your image. I mean, I don't know, if you have a package with a blacklisted license or an incompatible combination of licenses, you may have in your final image something that is not compliant, and this is something we need to avoid. So the thing that we have to do is to follow a process. So to find out the relationship with third-party upstream code, the inbound code, then we need to find under which license the inbound upstream software is, therefore the inbound license. And whether there is a possible combination of those, because not all combinations are allowed, and depending on the context, they may be allowed or not, like other talks pointed out this morning. And we need to match this combination with the outbound license: is the wanted outbound license compatible with the inbound licenses? And this for each artifact. We cannot do that at package level, especially in the embedded field, because the package may contain a lot of stuff. And we need to know about the files, not generically about the package. Here we have an example, I hope it's readable. This is the GPGME component, GnuPG Made Easy. It's a very small component. In our project, we found out that we generate only three binaries out of it. If you look at the license in the recipe, you find the GPL, I guess, three or LGPL two or something like that. But this is nothing, sorry, not or, but and.
So the thing is, is this binary GPL, is it LGPL, is it something else? We don't know from what we have from the source license information. So we collect multiple sources of information in this graph database. So the yellow dots are the information coming from our auditing, working with FOSSology. So file-level source license information. The purple dots come from the Yocto metadata, and the information about which files have been used to generate this binary also comes from Yocto, from the create-spdx class presented before. Basically, we now have MIT, MIT, GPL two, GPL three, sorry, GPL two or later, or LGPL three or later, and then we have GPL three or later. What's the license of this file? So usually the audit team comes to us and we discuss to find out which is the license of the binary file. But this is not scalable. This is error-prone. We cannot do that for every single binary file, for every release, for every snapshot of the project. This is another example. This is another binary you can generate from the same component. Here you have MIT, MIT, MIT, LGPL 2.1 or later. This is another story. Maybe it's easier here. And we can have also more complicated ones, I won't go into the details here. So how can we handle that in an automated way? So the idea is to have kind of a battle between license cards. So every time you put together two cards, and you need a set of rules to decide which card wins. And then you iterate and you loop over all the possible combinations of licenses until only one will survive, hopefully. So to do that, we are trying to define a language to define those rules. We need to define the two license cards that are fighting against each other. We need to define the so-called battlefield, a kind of context. Is that static linking, dynamic linking, whatever? And the authority who said that, because the answer by the lawyers is always, it depends.
So we need to say who said that. In this case, the FSF, we kind of trust it, especially when it comes to the GPL compatibility matrix. And here we have the result of the battle. We have a winner or we have kind of an invalid. So this kind of combination is not possible, because GPL two only is not inbound compatible with Apache 2.0, while GPL two or later may be inbound compatible with Apache 2.0 if you make it become GPL three, because GPL three is allowed to be inbound compatible with Apache 2.0. So this is kind of an example. I don't have time to go into details. The rules in action are like here: when you have a disjunction, a disjunctive license expression, you need to calculate the Cartesian product. So you need to have, in this case, MIT, GPL two and GPL three fighting against each other. Then MIT, LGPL three and GPL three fight against each other. You find a list of the decisions and we know that GPL three or later is the license that prevails here at the end. So the time is up, sorry. So the thing is that, as I already said, we consume data from different sources, from FOSSology, from Yocto, and for now we have a proof of concept. The aim is to upstream everything into the create-spdx class in the Yocto project. I don't think we have time for questions, but here and in the slides you find all the information to contact us. So thank you. |
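The "battle between license cards" could be sketched as follows. Each source file contributes a list of alternative licenses (the disjunction), the Cartesian product enumerates every combination, and a pairwise rule table decides each battle within one context. This is only an illustration under assumptions: the rule table, the context name `linking` and the function names are hypothetical, the rules are not legal advice and not the project's actual rule language, and the authority field mentioned in the talk is left out for brevity.

```python
from functools import reduce
from itertools import product

INVALID = None  # marker for a combination that no rule allows

# Hypothetical rule table: (unordered pair of licenses, context) -> winner.
RULES = {
    (frozenset({"MIT", "GPL-2.0-or-later"}), "linking"): "GPL-2.0-or-later",
    (frozenset({"MIT", "LGPL-3.0-or-later"}), "linking"): "LGPL-3.0-or-later",
    (frozenset({"MIT", "GPL-3.0-or-later"}), "linking"): "GPL-3.0-or-later",
    (frozenset({"GPL-2.0-or-later", "GPL-3.0-or-later"}), "linking"): "GPL-3.0-or-later",
    (frozenset({"LGPL-3.0-or-later", "GPL-3.0-or-later"}), "linking"): "GPL-3.0-or-later",
}


def battle(card_a, card_b, context):
    """Let two license cards fight; return the winner or INVALID."""
    if card_a is INVALID or card_b is INVALID:
        return INVALID
    if card_a == card_b:
        return card_a
    return RULES.get((frozenset({card_a, card_b}), context), INVALID)


def resolve(file_licenses, context):
    """Return the set of surviving licenses over all combinations."""
    winners = set()
    # One combination per choice of alternative in every disjunction.
    for combination in product(*file_licenses):
        winners.add(reduce(lambda a, b: battle(a, b, context), combination))
    return winners


# The example from the talk: MIT, MIT,
# (GPL-2.0-or-later OR LGPL-3.0-or-later), GPL-3.0-or-later.
sources = [
    ["MIT"],
    ["MIT"],
    ["GPL-2.0-or-later", "LGPL-3.0-or-later"],
    ["GPL-3.0-or-later"],
]
print(resolve(sources, "linking"))
```

With this toy table both combinations end with the same survivor, which matches the outcome described in the talk. A pair without a rule yields `INVALID`, which would flag the binary for manual review.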
In SBOMs We Trust: How Accurate, Complete, and Actionable Are They? |
Good afternoon, everybody. Thanks for coming to our session. Yeah, I don't know. It's on. All right. Okay, so I'll try to speak loudly so that even the people in the rows in the back can understand me. My name is Henrik. That's my colleague Joseph. Both of us have been publishing and working on open source dependency management and security topics for a couple of years already. And so, yeah, recently we started looking more into SBOMs, maybe compliance, and topics that we are not so familiar with. So we are looking forward to your questions. All right, so the agenda of our session will be as follows. I think the first agenda item can be very quick, seeing that you already spent like four hours on that topic. We will then present a small case study where we ran different SBOM generators at different points in time on an open source solution in order to see whether those tools agree on the SBOMs generated, whether the results produced are comparable, whether the results change over time, and to also pinpoint a couple of what we believe are deficiencies. And then Joseph will explain why it is beneficial and helpful to go from the granularity of entire components to the code level, to look at functions and methods and call graphs. All right, so that is a software bill of materials in the CycloneDX format, which is one of the, let's say, prominent standards in this space. CycloneDX has a little bit of a security background. That's why we kind of chose this. It seemed more natural given our previous work. At the very top is kind of the SBOM format and version. Here in the middle you have the software product for which the SBOM is generated. Here you have the SBOM generator tool. We anonymized those solutions. We didn't think it was necessary to point out problems in individual open source solutions, but we wanted more to raise awareness of the general problem of comparability and so forth.
And then at the bottom here you have an array of components that the SBOM generator found in the software analyzed. And of course there will be many, and I highlighted a couple of fields that we will be looking at in much more detail. So there is the name, there is a CPE, there is a purl, a package URL, or a group and a version. So these are all different fields belonging to naming schemes in order to describe or identify the component that was found. And then they have properties that can be all kinds of proprietary properties the SBOM generator decided to include. Why do we need SBOMs? I think in the interest of time I'll just skip this altogether and go to the case study right away. The idea of this, the motivation for this, came basically from reading through a couple of documents. The first, and I have cited them here at the bottom, is a research paper that was done recently. There are, I think, a couple of interviews and surveys. And there were some statements from the survey participants who said that an SBOM is not something that is static, that you create at a given point in time and then assume to be stable, but something that is evolving throughout the software development lifecycle. And then corresponding information is also provided in this guidance from the NTIA, the minimum elements for a software bill of materials. On pages 6 to 7 they say, you can generate an SBOM at different points, on the sources, after the build, maybe on the Docker image, and the SBOM should actually contain the information about when it has been created. That would be important for the consumer. So we looked at different SBOM generators, and then of course, in order to do a proper comparison, we need a kind of software to be analyzed.
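The slide itself is not in the transcript. As a rough sketch, the fields described above (format and version at the top, the analyzed product and the generator tool in the metadata, then an array of components with name, group, version, CPE, purl and properties) look roughly like this in CycloneDX JSON. The concrete values, the tool name `generator-a` and the property name are made up for illustration, and the `identifiers` helper is hypothetical.

```python
import json

# Hypothetical, heavily shortened CycloneDX document.
SBOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "metadata": {
    "tools": [{"vendor": "example", "name": "generator-a", "version": "0.0.0"}],
    "component": {"type": "application", "name": "rest-backend", "version": "1.0.0"}
  },
  "components": [
    {
      "type": "library",
      "group": "org.dom4j",
      "name": "dom4j",
      "version": "2.1.3",
      "cpe": "cpe:2.3:a:dom4j_project:dom4j:2.1.3:*:*:*:*:*:*:*",
      "purl": "pkg:maven/org.dom4j/dom4j@2.1.3",
      "properties": [{"name": "example:found-by", "value": "jar-scanner"}]
    }
  ]
}
"""

ID_FIELDS = ("group", "name", "version", "cpe", "purl")


def identifiers(component):
    """Pick the naming fields out of one component entry (None if absent)."""
    return {field: component.get(field) for field in ID_FIELDS}


bom = json.loads(SBOM_JSON)
ids = [identifiers(component) for component in bom["components"]]
print(bom["bomFormat"], bom["specVersion"])
print(ids[0]["purl"])
```

A generator is free to leave any of these naming fields empty, which is why the helper returns `None` for missing ones instead of failing.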
And so as sample software, we have chosen Eclipse Steady, which is a security solution that I've been contributing to over many years, so one that I know very well, because I thought that would be helpful to understand the quality of what is generated. In particular, we looked at one of the modules of that solution. It's a Spring Boot REST service that is developed with Java and Maven. It is deployed using a Docker image that is downloadable from Docker Hub. And the ground truth, which is the information that we will use later on to say whether the SBOM generators perform well or badly, is what you see here. So there are 114 compile-time dependencies. They are required for compiling the Java sources. Two runtime dependencies. They should be present where the production software runs. There are 41 test dependencies, JUnit and other stuff. Good. Before walking through the results, a little bit of background, because that is very important. How do we name those components, right? And it is important to understand there are context-specific component identifiers. So, for example, Maven coordinates. This is what is used by Java developers. They consist of a group identifier, an artifact identifier and a version. This is typically the GAV, the coordinates of a Maven artifact you would download from Maven Central. There are some optional identifier elements. An example here is org.dom4j, dom4j, version 2.1.3. Another context-specific component identifier is the Common Platform Enumeration, which is comprised of a part, a vendor, a product, a version and a couple of other fields. And they are used by the NVD in order to say which components are affected by a given vulnerability. So, for example, CVE 2020 something is affecting a component CPE 2.3, which is the version of the CPE naming scheme. The vendor is dom4j_project. It's not exactly the same as before, a pity. dom4j is the product name.
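The mismatch between the two naming schemes can be made concrete with a small sketch. This is only an illustration under stated assumptions: the helper names `parse_maven` and `parse_cpe23` are hypothetical, the CPE string is written out by hand from the vendor and product named in the talk, and the parsing ignores optional Maven elements and escaped characters in CPE fields.

```python
from typing import Dict, NamedTuple


class MavenCoordinate(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def parse_maven(gav: str) -> MavenCoordinate:
    """Parse 'group:artifact:version'. Optional elements (classifier, packaging) are not handled."""
    group_id, artifact_id, version = gav.split(":")
    return MavenCoordinate(group_id, artifact_id, version)


def parse_cpe23(cpe: str) -> Dict[str, str]:
    """Extract part, vendor, product and version from a CPE 2.3 formatted string.

    Simplification: assumes no escaped ':' characters inside the fields.
    """
    fields = cpe.split(":")
    if len(fields) != 13 or fields[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    part, vendor, product, version = fields[2:6]
    return {"part": part, "vendor": vendor, "product": product, "version": version}


# Illustrative identifiers for the same library in the two schemes.
maven = parse_maven("org.dom4j:dom4j:2.1.3")
cpe = parse_cpe23("cpe:2.3:a:dom4j_project:dom4j:2.1.3:*:*:*:*:*:*:*")

# The product lines up with the artifact id, but the vendor does not
# line up with the group id, so a naive join on names fails.
vendor_matches = maven.group_id == cpe["vendor"]
product_matches = maven.artifact_id == cpe["product"]
print(vendor_matches, product_matches)
```

Any tool that joins vulnerability data to build metadata has to bridge this gap with some mapping or heuristic, which is where mismatches can creep in.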
And then, besides, there are universal component identifiers. One that got a lot of traction in recent years is the purl, the package URL. It has seven elements that I put here. And, you know, using SBOMs to understand whether there are known vulnerabilities, what the quality of the projects used is and so forth requires mapping all those names. Names that are given by people somehow. And that this can go wrong becomes evident from one example out of the SBOMs that I generated later on. So, this is a copy-paste from one of the CycloneDX SBOMs. And you see here that the purl, this universal component identifier that was put in by the SBOM generator, says it's dom4j. Too bad it doesn't map to org.dom4j, which is the identifier on Maven Central. So, if you want to look up a new version of that component, well, bad luck, you don't have the right identifier. If you want to compare this CPE to search for known vulnerabilities, well, it's not the same identifier. They found dom4j, but it's dom4j_project. Too bad. So, the approach we have taken is we selected three open source SBOM generators. A and B are generic solutions. You can basically throw anything at them. A directory, an image, a tarball, a single file, whatever. And then C is a Maven plug-in that hooks into Maven's build process. And we ran those three tools at three different points in time. After cloning the open source project with git clone, after running Maven package to create the self-contained Spring Boot application that you can run, and finally on the Docker image. On the Docker image, we could only run A and B, because C is dedicated to being integrated into the Maven build tool. And so we collected basically eight different SBOMs from those eight runs. And the color coding will stay the same for the Venn diagrams I will be showing on the next slides. And then we did three things.
We computed precision and recall of those tools. Which means we compared what they found with the ground truth. Precision basically says how many false positives there are. A false positive is when the tool tells me there is a component which I know is not there. Not so useful. Recall is for measuring false negatives, which is when the tool didn't report a component even though it is there. Also not so helpful, especially for vulnerable dependencies. So this is kind of the quality, the accuracy of the tools, judged independently against the ground truth. And then we created a couple of Venn diagrams to see how much they actually agree. So what is the overlap of SBOMs A, B and C at those different times? And then we looked at some additional properties. All right, so this is the first, let's say, result of running the three tools right after having cloned the open source project. And let me start from the bottom of the slide with tool C, which is easy, because that is actually perfect. That is the one integrated into the Maven dependency lifecycle, into this build tool. Perfect precision, perfect recall, no false positives, no false negatives, right? Very good. And it has a couple of additional properties such as SHA-1 and SHA-256 digests, license information, descriptions, a lot of useful stuff. Now then let's look at tool A. You see the blue bubble is much smaller, because it basically failed to identify many, many, many components. And the reason, I mean, I need to speculate a little bit about how the internals work. But the reason, I guess, is that it looked at the pom.xml, which is where the developers declare dependencies, but it didn't resolve any dependencies. Meaning it doesn't build a complete dependency tree. So it lacked a lot of transitive dependencies. On top of that, for the direct ones that are in the POM, it didn't have any version information, because that was specified elsewhere.
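The precision and recall comparison against a ground truth can be sketched in a few lines. This is a minimal illustration, not the evaluation code used for the case study; the function name `precision_recall` and the component names are made up for the example.

```python
from typing import Set, Tuple


def precision_recall(reported: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Compare the components a generator reported with the known ground truth.

    precision = share of reported components that are really there (penalises false positives)
    recall    = share of real components that were reported (penalises false negatives)
    """
    true_positives = reported & ground_truth
    precision = len(true_positives) / len(reported) if reported else 0.0
    recall = len(true_positives) / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical data: four real dependencies, a tool that finds two of them
# and additionally reports one component that is not there.
ground_truth = {"lib-a", "lib-b", "lib-c", "lib-d"}
reported = {"lib-a", "lib-b", "lib-x"}

precision, recall = precision_recall(reported, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

How two component entries are judged "the same" (full identifier, name and version, name only) is a separate decision that strongly affects both numbers.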
So we have components like the one with this purl, org.springframework.boot, spring-boot-starter, without a version. They included test dependencies, which is also interesting. The other tools didn't include test dependencies. But the funny thing is, they included them, but if you looked at the SBOM, you wouldn't know that it's a test dependency. You can't tell: is this really something only used for development, or is it really in my production system? Also not so helpful. And they had a couple of CPE combinations, supposedly for mapping known vulnerabilities. I think I need to speed up a little bit, right? Okay, now this is the Venn diagram I was mentioning. So here, in fact, this is the overlap of those purls. We looked at those purls and asked, do they match each other? And you see that tool C, even though it had identified everything perfectly, and tool B, which had a good number, still don't overlap. And this is because those purls contain additional elements, qualifiers, like the type, it's a Java archive, or for operating system components it could be the target platform, which means they don't overlap. Now, if we only look at one of the naming elements, then the overlap is much bigger, because the fact that A is lacking versions, B has wrong version identifiers, and C adds additional details all vanishes, and it looks like it's all converging. But it is, again, important to understand that the name alone is not so useful for looking at vulnerabilities or new versions. Good, so let me hurry up a little bit. This is the same thing, run after Maven package. Tool A improved. It was finding more, but still the precision and recall are not as good as for the other solutions. The other tools didn't change at all, so for them it didn't make a difference whether Maven package ran or not. Here, again, is the difference in terms of purls, which results in the lack of overlap.
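The effect of comparing full purls versus names only can be sketched as follows. This is a simplified illustration: `parse_purl` is a hypothetical helper that handles only the `pkg:type/namespace/name@version?qualifiers` shape (no subpath, no percent-decoding), and the two purls are illustrative strings built from the dom4j identifiers mentioned in the talk, not actual tool output.

```python
from typing import Dict, Optional, Set


def parse_purl(purl: str) -> Dict[str, Optional[str]]:
    """Split a purl into type, namespace, name, version and raw qualifiers.

    Simplification: no subpath ('#...') handling and no percent-decoding.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    rest, _, qualifiers = purl[len("pkg:"):].partition("?")
    rest, _, version = rest.partition("@")
    segments = rest.split("/")
    return {
        "type": segments[0],
        "namespace": "/".join(segments[1:-1]) or None,
        "name": segments[-1],
        "version": version or None,
        "qualifiers": qualifiers or None,
    }


def names(purls: Set[str]) -> Set[str]:
    """Reduce a set of purls to their bare package names."""
    return {parse_purl(p)["name"] for p in purls}


# Same library, described differently by two hypothetical generators.
tool_a = {"pkg:maven/dom4j/dom4j@2.1.3"}
tool_b = {"pkg:maven/org.dom4j/dom4j@2.1.3?type=jar"}

exact_overlap = tool_a & tool_b                # full identifiers never match
name_overlap = names(tool_a) & names(tool_b)   # names alone appear to agree
print(exact_overlap, name_overlap)
```

The name-only comparison looks reassuring, but it discards exactly the namespace and version information needed to look up vulnerabilities or newer releases.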
Here, this is the same component. Tool A has it as dom4j, dom4j, Tool B has it as org.dom4j, dom4j, and type equal to jar, so they added this additional information, which means they don't overlap. Good, and then last, we ran it also on the Docker image. That is the first two tools, and maybe one finding here is that what we observed in terms of lack of overlap on Maven components also happened for operating system components. So here we have Debian/Ubuntu, the package dash, but again one tool added a little bit more information, the target architecture. For the consumer of the BOM, who knows whether this is important information in terms of security? I don't know. And then again, if you only look at the name, the overlap is much bigger, but even though it looks like they only disagree on very few components, too bad, Tool B, I think, had a big miss: it was lacking the complete Java runtime, and those of you working in security know how many security issues there are in the Java runtime. Good, lessons learned. Among the reasons for getting different SBOMs, a big one is that tools integrated into the dependency manager seem to work much better, at least judging by that small case study, because generic tools that are supposed to judge the bill of materials from the outside need to apply some heuristics, and they don't have the same level of detailed knowledge about the dependency graph. Production versus test components are sometimes included, sometimes not, there are different defaults, and in the generated SBOM you don't see the difference any longer. And of course, there is also this difference depending on when you run it, there will be different components included. There is a standard format, but the tools include different fields, some include license and digest, others not.
And even if they all include a purl, purl itself is a complex naming scheme with seven elements, and the tools decide differently on what to include in a purl. And there are other reasons that we didn't discuss here. It also depends on the time of the dependency resolution, in case you use version ranges, and some tools also generate platform-specific SBOMs, so if you create an SBOM on a Mac and on a Windows machine, maybe with different hardware architectures beneath, you would get different SBOMs. Right, and I think I don't have so much time to look into this. What I wanted to say is that identifying vulnerabilities only by name is rather flawed, because names keep on changing, projects are renamed, there are rebundles, there are forks, which is why we advocate for enriching such information with call graph information and code-level information. And with that, I hand over to Joseph. Yeah, thank you, Henrik. So this will be a bit of a shorter version. We're running out of time here. I guess that's it. All right, so why do we want to go for more of a call graph view? With the current SBOM format, or in general with dependency trees, if you view it from that perspective, we typically have the application and the list of dependencies it depends on, right? And if you instead try to think from a call graph perspective, looking into the source code, you could have something like this instead. If we see, for example, those small, almost Lego-like pieces, and each of them is a function call from the application to the API, we can look, for example, at lib4, and if lib4 had a vulnerability or some other type of problem, we see that there are actually no function calls to it from the application down to lib4 via lib2. So the interesting part here is that if you start looking from a code perspective, we can quickly pinpoint how we're utilizing the source code.
And another interesting part is that, so I looked for example into the Rust ecosystem. So if we have a couple of dependencies, for example maplit here, and you run a grep over it, you can see that it is only, let's say, imported, but we don't see any usage of it in the package. So this is a case where there's no code reuse. And I was generally interested to know how we are doing in the Rust ecosystem, how many dependencies we're actually calling or not calling in general. And when I did the study and looked into how many dependencies are declared and resolved versus how many are actually being reused in the code, I found that using only package information, it would for example report around 17 dependencies. Whereas when you looked into the call graph information, we found that only six dependencies are used. And it was quite interesting why there was such a big difference. And the reason why there is such a big difference is that if you look into this example over here, we see that main calls foo, and then from app to lib1, foo calls bar. And then further down we see that bar calls intern in lib2. But then we see that there are actually no calls from lib2 to lib3. And this shows why it is quite important to think about context in general, because how the app is using its direct dependency also directly impacts which transitive dependencies are being called. And the assumption that we usually make when we are building an SBOM or looking into a dependency tree is that all APIs of direct dependencies are being utilized. And then we are also assuming that all APIs of transitive dependencies are being used as well. So we need to also start thinking a bit about what kind of context is being used in general.
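The main, foo, bar, intern example can be written down as a tiny reachability check. This is a sketch under assumptions: the placement of `foo` in the app, the function `lib3.helper`, and the `package.function` naming are invented for illustration, and real call graphs (with dynamic dispatch, reflection or macros) are far harder to build than this dictionary suggests.

```python
from collections import deque
from typing import Dict, List, Set

# Caller -> callees, named "<package>.<function>".
CALL_GRAPH: Dict[str, List[str]] = {
    "app.main": ["app.foo"],
    "app.foo": ["lib1.bar"],      # app calls into its direct dependency lib1
    "lib1.bar": ["lib2.intern"],  # lib1 calls into the transitive dependency lib2
    "lib2.intern": [],
    "lib3.helper": [],            # present in the dependency tree, never called
}

# What a package-level view (dependency tree) would report.
DECLARED_DEPENDENCIES: Set[str] = {"lib1", "lib2", "lib3"}


def reachable_functions(graph: Dict[str, List[str]], entry: str) -> Set[str]:
    """Breadth-first search over the call graph starting at the entry point."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        caller = queue.popleft()
        for callee in graph.get(caller, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def packages_of(functions: Set[str]) -> Set[str]:
    """Map qualified function names back to their package prefix."""
    return {name.split(".", 1)[0] for name in functions}


reached = reachable_functions(CALL_GRAPH, "app.main")
used = packages_of(reached) & DECLARED_DEPENDENCIES
unused = DECLARED_DEPENDENCIES - used
print(sorted(used), sorted(unused))
```

The package-level view lists three dependencies, while the call-level view shows that only two are exercised from the entry point, which is the kind of gap the 17 versus 6 figures describe.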
And so the lessons from trying to integrate something like call graphs, or other levels apart from just package information, are these. If you start having information about the source code, we can directly try to pinpoint and understand, for example, if there is a vulnerability in one function, whether that API is heavily utilized in my source code or not utilized at all. And another thing, and this is a problem that we also see, is that we might declare dependencies on 20 components, or you get an SBOM from a vendor or someone else. They have five different components, but which one is the most critically used one in that project? That's also not very clear. And if you know, for example, the usage of APIs, you can get an idea about that. And this was also the second point that I was highlighting. We need a few more layers of information that serve different users of SBOMs. As a developer, if I have access to an SBOM, I would rather not look into metadata information. I want to look into call traces and call information in general. Whereas security management people or other layers probably don't want to look into the source code. They would rather get an overview of which packages are being used. And so this sort of wraps up our talk. We have a couple of takeaways here. We see that going towards some type of standard around SBOM formats is necessary, but not fully sufficient. Based on the previous slides, we need a bit more context so that we can have better actionable insights. One way of doing that could be, for example, including call graph information.
As a consumer of SBOMs, it is very difficult to verify their correctness because, as Henrik was showing earlier, if you are using different tools and getting different results, which one is the correct one, and how can we even validate that they are correct in what they are doing? And the last point, and this is something that both Henrik and I think is extremely important, is that we need to create some form of independent SBOM benchmark where different SBOM generators or others could be evaluated on how accurate the generated SBOMs are against a set of manually validated projects. This concludes our talk. We are happy to take questions. Thank you. It shows Java, and that's got an ecosystem. So I presume you've probably found something similar with Python or Rust. What about languages that don't have an ecosystem? I'm thinking about C++. What would you say about that? Basically there are standard languages there, and I like the call graph there. How would you tackle that? You can probably take it. The question is how we would do this call graph analysis for C languages, and that of course is a very different game and I don't think there is an easy... It will just not be possible, to be honest. Because with all the call graph generators, you don't agree? We need to do this for safety. Building a call graph for C code is more difficult than it is for languages like Java and Python. So maybe the amount of information that is contained in such call graphs is less helpful for taking any actions, or it's less actionable. We were originally using those call graphs for reachability of vulnerable methods. The idea was: there is method ABC affected by Log4j, is this really callable from my application context? But this required that you could map the source code where the vulnerable function is identified to what is in the bytecode, where this identifier is basically the same.
So the call graph generated from bytecode could be used for this purpose. And I don't think this is possible, at least for this application, in C. Is this possible at some point? |
The 7 key ingredients of a great SBOM
Ensuring your SBOM includes enough data to be actionable |
All right, so first of all, thanks for staying with us on Sunday, and for attending the SBOM devroom. As part of the organizers, I'm really happy to see this room. And so we have been going through all of those cool use cases and really complex tools to generate them, analyzing how companies, big companies, have been doing it, and going deep into the research of how SBOMs are composed. So I was thinking that I wanted to do my talk as a kind of pivot as we are shifting towards the more discussion-oriented part of the devroom. And as you have been hearing from folks right now, a lot of folks working on SBOMs are starting to get concerned about what's actually in those documents. And I think when Thomas opened the devroom today, the first thing he said was, well, those dependencies that you're getting, they may not be correct, right? So I thought that, as we move to the last part of the conference, it would be cool if we could get a few talking points just to seed the conversation that's about to happen. So my name is Adolfo García, and I am part of the SPDX community. I am a contributor to SPDX and some of the tools. I maintain a bunch of open source tools that generate and consume SBOMs, and that help visualize them. I am also part of the Kubernetes project. I am part of Kubernetes SIG Release, and I work there mostly on the supply chain security of the project. And yeah, I like riding my bike. I'm based in Mexico City, and I am a staff engineer with Chainguard, which is a company devoted to supply chain security. So as you heard from probably every speaker today, the goal of having an SBOM is getting a document which you can actually use for something. And there are many concerns about SBOMs flying around in the world today, because there are particular use cases, and some people will argue that SBOMs may not necessarily be incomplete if they're not suitable for one case or the other.
And this is true, but instead of trying to picture ourselves generating an SBOM from the position of a large company or whatever, I felt that it was more appropriate to discuss today how... I mean, I am assuming a lot of people here are maintainers of open source projects, and sometimes very small open source projects, like one-maintainer small. And I think it's important to start considering that when those large companies are going to use your project, your library, import that module that you write, the SBOM that you give them can really make a difference in several areas. First, you can make their life easier because you're handing them more complete information which they can act on. And the other one is that we as the open source community become better citizens of the supply chain. Generating the information that pertains to us is a much more responsible thing to do. So what happens when you open an SBOM? Well, today you can get all sorts of surprises. Sometimes there's nothing in there. You open the SBOM and it's empty. Sometimes you don't have any information at all that lets you determine what that SBOM is describing. So it's simply pointing to the same black box that you can look at from the outside. Or the other one: are you sure that the SBOM is really describing what you're expecting it to, and you're not being fooled by someone? Well, that information needs to be in the SBOM in order to ensure that it's actually describing the piece of software that you're distributing. So I'm going to give you a few examples. I'm not trying to name names, and that's why I chose projects that I'm involved with. Both good and bad. So this is the first one. Our company has a Linux distribution which is already shipping with SBOMs built in. And we generate those SBOMs at build time for all of the packages.
And you can see the structure here of one of the SBOMs. This is a visualization of the SBOM using the Kubernetes bom tool, which lets you ingest SPDX documents and see how they're structured inside. And as you can see, in the Linux distribution we try to add as much detail to the SBOMs as we can, just to guide whoever is using those SBOMs to make smart decisions with the information they have in them. So this is a fragment of the SBOM. And I mean, some information is there. For some fields, for example the licenses, the license concluded fields, they are marked as NOASSERTION, but you can omit those, for example, if you want. But we have the license from the project, from the actual operating system package. We have some identifiers, things like that, so it's pretty complete. It's obviously not perfect, but we try, and we try to add as much information as we can. But then, let me show you another SBOM from another popular open source project. This is part of the Kubernetes SBOM. This is a little fragment of the structure of the SBOM that we generate when we put out a new Kubernetes release. And this is describing, for example, the tarballs which we put out with every release. One of the tarballs of the kube-apiserver, the list of files. So we also try to add information. There are two SBOMs with Kubernetes, one with the artifacts, one with the source code, which are linked to each other. And so we also think that those are fairly complete SBOMs. But now, I opened an SBOM in a popular open source project and tried to generate the structure like this. I'm not going to say which project it is. It's just one I'm involved with and we should be doing a better job there. And you can guess many reasons why this is showing zero things, but we can go over this.
So as you can see, you can really enrich an SBOM with a lot of information, and some of it can be more important than other things. But I've been thinking, well, what are the most important details that you can add to the SBOM? And by the way, most of this you already heard through the day if you've been sitting in most of the talks. So we're going to go one by one. The first one is syntactic correctness. You would expect that most tools generating SPDX or CycloneDX SBOMs do the basic job of just making a compliant document. Well, the reality is that they don't. I picture this guy from Apollo 13 who tries to fit the square peg in the round hole, or the other way around. Because if you cannot ingest an SBOM, what's the point, right? And even if you have tried to somehow hack the document or ingest it somehow, the reality is that most tools that consume SBOMs today do not have a clear strategy for handling invalid documents. And most importantly, not a clear and also not a predictable one. So if a tool tries to somehow ignore errors or whatever, the behavior may not be consistent. So ensure that any SBOM that you're producing or requesting at least complies with the syntactic rules of the standard you're using. The second one, dependency data. And this is a little bit related to the first one. Since I work with a lot of open source tools and my job also has to do with SBOMs, I've seen a lot of tools producing SBOMs. And so for example, one variant of the bad SBOM is, well, we'll just list a tarball. And that's your SBOM, nothing else. Or the obvious case of an SBOM that contains one thing, an RPM. No dependency on anything. So we often use the analogy of the SBOM being the nutritional label of software. But without the dependency list, well, it's really worthless. You can still use your SBOM as the old checksum.txt if you want. But an SBOM is going to provide a lot more value than that.
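The first two ingredients can be approximated with a very small sanity check on a CycloneDX JSON document that has already been loaded into a dictionary. This is a sketch, not schema validation: the function name `basic_sbom_problems` is hypothetical, the checks cover only a handful of top-level fields, and requiring a non-empty `dependencies` array reflects the recommendation in this talk rather than a rule of the CycloneDX schema, where that field is optional.

```python
from typing import Any, Dict, List


def basic_sbom_problems(doc: Dict[str, Any]) -> List[str]:
    """Return a list of obvious problems in a parsed CycloneDX JSON document.

    This only looks at a few top-level fields. A real check would validate
    the document against the official JSON schema for its specVersion.
    """
    problems: List[str] = []
    if doc.get("bomFormat") != "CycloneDX":
        problems.append("bomFormat is missing or not 'CycloneDX'")
    if not doc.get("specVersion"):
        problems.append("specVersion is missing")
    if not doc.get("components"):
        problems.append("no components listed")
    if not doc.get("dependencies"):
        problems.append("no dependency relationships recorded")
    return problems


# Hypothetical documents: one that is formally fine but empty, one with content.
empty_sbom = {"bomFormat": "CycloneDX", "specVersion": "1.4"}
fuller_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "components": [
        {"type": "library", "bom-ref": "lib", "name": "example-lib", "version": "1.0.0"}
    ],
    "dependencies": [{"ref": "app", "dependsOn": ["lib"]}],
}

empty_problems = basic_sbom_problems(empty_sbom)
fuller_problems = basic_sbom_problems(fuller_sbom)
print(empty_problems)
print(fuller_problems)
```

A check like this catches the "you open the SBOM and it's empty" case early, before the document is handed to a consumer whose error handling may not be predictable.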
Then the third one, licensing information. We've heard a ton of talks today about licensing and why it may be important. So the truth is, if you are publishing software, you're the most qualified person to do the assessment of what license your software should be using. And this applies to the dependencies that you're pulling in as well. And if you're redistributing any software, ensure that the information about the licensing is going down the stream. Because the tools that we've been seeing today try to do a good job of helping people understand the licensing situation. So I picture checking the passport as an example of the license. The next one, semantic structure in the SBOM. This one also came up during the discussion today. So there are folks that think that SBOMs can be just the list of dependencies. And it may be true, but then you start losing context on where those things fit. For example, if you have just a list of dependencies, and especially if they're not related to an artifact at the top of the SBOM. If you picture it, the SBOM can be this beautiful graph of one node that spreads out to lots of relationships and nodes. So sometimes you'll see SBOMs that only have the list of dependencies, and they don't say where those dependencies fit, whether they're describing a container image, a binary, nothing. So if you try to do something more sophisticated with that data, you simply can't. If you remember the SBOM that I showed in the beginning that we built with the Linux distribution, this is how we structure the container images built from our Linux distribution. So you have the container, the layers, and the packages, the OS packages, and then all of the files in their proper place. And this information is actually coming from smaller SBOMs that get composed when we build the Linux distribution. So each of the APKs of the distro has its own SBOM describing that package.
And then when we build an image, we take all of those SBOMs and give you one single SBOM with a lot of that information composed where it's supposed to be. And without structure, you simply cannot do this. And this is one image, but then if you go and make it more complex, you can start thinking about multi-arch images, right? And those need to have this information for each of the images, so the relationships start to become more complex. And the way I try to picture it is this, right? They give you a box of Legos without any instructions or anything. If you use your imagination, maybe you're going to build something really beautiful, but most likely not, and especially not the thing pictured on the box, right? And so these are some of the reasons that I was thinking of: if you have structure, then it's a guarantee at least that the tool is looking at how the thing is composed and where the information is flowing from, and it lets you do more complex use cases with the documents. Now, the next one. This also has come up two, three times today: software identifiers. SBOMs need to define and name the piece of software as closely as possible. And software identifiers are one of the schemes that you need in the document in order to ensure that the piece of software that the SBOM is describing is clearly identified. And all of them have their problems. CPE especially, for example, is really complex to get right. But the idea is there's going to be a tool down the stream that is going to benefit from that information. So if you can add it, you're making sure that the SBOM can work well with those tools. And this is kind of the idea of that. So how many packages in the world are named log, right? So okay, log, but what's log? There are thousands in every language, operating system packages, libraries named log.
So if you can have a properly specified purl, CPE, or both, that clearly defines the piece of software that the SBOM is talking about, then it can be better referenced and used by tools down the stream. Now the supplier data, this is a contentious one. The reason why I added the supplier data is because as software authors sometimes we don't think that it's an important field. We simply, I mean, in most large open source projects we just put copyright the project authors, right? Like the editorial. But the reality is that if you jump into any of the SBOM meetings that go on regularly, you're going to hear all of the compliance folks saying, like, I need a name to sue. It's, I don't know, a different mentality than ours, but people need it. And in fact, it's one of the requirements from the NTIA minimum elements for SBOMs. And this is a weird field, because if you deal more with the security of the documents that should be generated during the supply chain and the software lifecycle, this information is kind of, I don't know, not really very useful, because it can be forged and you cannot trust it. And so just having a name and an email, well, it serves compliance folks, but to us it's kind of, well, worthless really for security purposes, right? But then you start thinking about what a supplier is. Is it the author, the copyright holder? Is it the tool that compiles the thing, the people who are distributing it? And so, well, at least ensure that you're providing some kind of information. And the idea is: know who's selling you your things, right? Buy candy from that guy, probably not. Yeah, exactly. And get them from us. Supplier data. Oh, okay. I messed up this slide. So this one was supposed to be integrity data, integrity data to prevent this kind of thing.
So, as you heard today also, SBOMs should be properly hashed, hashing as much as you can inside of the document, when possible, when it makes sense, especially when it can be verified. So the idea is, is this piece of software that I'm naming in the SBOM the real deal? Has it been corrupted or not? But more importantly, having hashes lets you deal with the problem of "latest", right? So sometimes you will not have a version, but you can still reference that software artifact inside of the SBOM and in other documents, like VEX, for example, via the hashes. So you can think about the versioning system and the software identifiers as links to external systems outside of the SBOM, like vulnerability databases or, for example, package repositories. But internally, everything should be addressed via the hash if possible. So if I'm telling you this is the vulnerability document for a piece of software, it should match with the hashes somehow. And the idea is, well, once you start content-addressing the piece of software in the SBOM, you cannot go wrong. And, well, that's basically what I have. And so I just wanted to leave this open, you know, to kick off the conversations that are about to happen about this kind of thing inside of the documents. And if there are any questions or whatever, happy to take them. And if not, you can reach me as Puerco on most systems and Twitter, whatever. So thank you. Thank you. Questions? Questions. Oh. So I was going to ask about the supplier data. And how much is that seen as the individual who wrote something, or maybe an entity who is distributing it, such as an organization or a software foundation? Well, I would like to hear opinions on supplier data from others. Yeah. So basically, what's the role of the supplier data? So what's the use of that field? Yeah. What should be filled in? Is it an entity? Yeah. Should they fill in a person or an entity or a tool? Yep. Or? Yeah.
So I have a question about the last ingredient that you mentioned. So the integrity. In your definition, does that also include signing of the actual SBOM itself? Not really. Rip in? No, no. I was going to go to that question before. But if anybody has insights about how supplier data is used in their organizations, now's the time to discuss it. All right. Yeah. No. So the way I've seen it required is, no, no. This is the first one. So the first one is, how is the supplier used? And the way I've seen it is mostly from procurement people asking for that information, and lawyers. So those are the two that I've seen asking for the information most. I'm coming more from the security side of SBOMs, so compliance is not my strong side. But that's why I'm suggesting it. At one point, sorry. Yeah, as one data point, the way that we are using supplier data is actually recording who supplies the software. So not who wrote it, not who created it or something like that. If we got it from an upstream distribution repo, we put the upstream distribution repo for that. Again, record what we know, that's the only thing that I know. And so the other question is, does the integrity point also consider signing of the SBOM? And yes, but not in this case. So signing of the SBOM is mostly done outside of the SBOM. And that touches on trusting the SBOM, which is a whole other can of worms. But I mean, it is, but not in the contents of the documents. How can I make sure an SBOM adheres to these principles? Is there something like benchmarks? Or I give it a score of 8 out of 10, that's a good SBOM. Well, there are tools then. Yeah, I'll repeat the question. So how can I know? Sorry, I didn't get a lot of sleep. How can I make sure that the SBOM really complies with these things here? So there are a couple of tools that do a validation of the SBOM, that try to do the scoring. So eBay has a tool called SBOM Scorecard. Then there's the NTIA compliance checker from SPDX.
I'm not sure. I don't know. Are the ender folks here still? OK, so I seem to remember that they were handling some of that as well. But there are a couple of tools out there. It's more of a remark, but it's a bit surprising not to mention OpenChain that much. With OpenChain, the goal is trust in the suppliers, so you can trust the SBOMs from the suppliers. So yeah, what Nico said is that OpenChain touches on the idea of trusting the SBOM, the supplier, and those sorts of things. And an observation: this is from having looked at Python, and the metadata that goes with Python packaging is really inconsistent. So how do you spell Apache 2? How many different ways of writing the Apache 2 license there are is amazing. And actually, between releases, information disappears. So this is really a message for the ones who are in the ecosystem: put as much data into the ecosystem, and into the metadata, as you can, because this is going to support... Yeah, exactly. Yeah, the comment is... Because we were looking at a difference, and we've got a new release of a package, but there was one where the supplier is gone. Right, exactly. And actually, the question is, do we just use that on your matter? Because we don't know where it's come from. Yeah, the comment is that in Python, sometimes between releases, information changes, or disappears, or whatever. So this is actually one of the things that some of us would like to see happening: people working on packaging systems, on language ecosystems, if not adding SBOM generation straight away in their tooling, should at least expose the information so that we, SBOM toolmakers, can go in and extract it from more trustworthy sources. And... Okay. In regards to hashing, how are you dealing with custom patches when you apply them to your live software? Can you repeat it? In regards to hashing, how are you dealing with custom patches when you apply them? So if you need to patch software you're using, but you can't apply the patch upstream? In the case of the...
Yeah, just in the case of the distro, or...? It has a distro and a general response. Well, yeah, the question is, how do you deal with patched software, right, when you apply a patch? So... But, I mean, you still have that hash, right? Or is the question about naming... So what's the best practice, I suppose? Yeah, so if you're describing a patched artifact, I mean, simply hash the thing and you can use that hash downstream. The problem comes when you're trying to say, well, I'm using curl, but I applied a few custom patches myself. How do you name that? And that becomes a more complex question. So internally, as I was saying with the integrity thing, you can still reference everything with the hashes, right? Like, I'm talking about this binary, this hash, all the way down the stream. But when you want to express it externally, well, I guess that falls into the naming problem, and you have to think about where that thing is going to be used. So if it is going to be a package that is part of a distribution that you're doing, you may define your own set of package URLs, for example. Or if it's not going to be, you can make up the license, but it falls more into the use case of what you're trying to do with distributing the patched software. So that's it. All right. Thank you. Thank you very much. Thank you. |
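The content-addressing idea from the talk above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the speaker's tooling: the byte strings stand in for real artifacts, the record fields are not taken from SPDX or CycloneDX, and the `vendor=example-internal` qualifier on the package URL is a made-up way of naming a locally patched build.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-ins for real files: an upstream build and a locally patched build.
upstream_bytes = b"curl built from upstream sources"
patched_bytes = b"curl built from upstream sources plus local patches"

# Hypothetical SBOM-like component records (field names are illustrative).
# The purl is the external name; the hash is the internal identity.
components = [
    {
        "name": "curl",
        "purl": "pkg:generic/curl@7.87.0",
        "sha256": sha256_hex(upstream_bytes),
    },
    {
        "name": "curl",
        # Assumed naming scheme for a patched build, not a standard one.
        "purl": "pkg:generic/curl@7.87.0?vendor=example-internal",
        "sha256": sha256_hex(patched_bytes),
    },
]

# Index components by digest so any document can refer to them by content.
index = {c["sha256"]: c for c in components}


def lookup_by_content(index: dict, data: bytes):
    """Find the component record matching these exact bytes, or None."""
    return index.get(sha256_hex(data))


match = lookup_by_content(index, patched_bytes)
print(match["purl"] if match else "unknown artifact")
```

Both records share the name curl, but the digests differ, so a vulnerability or VEX statement keyed on the hash points at exactly one of the two builds, even when no meaningful version string is available.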
Panel discussion: SBOM content, usefulness, and caveats |
So, this session is a panel discussion with four, or three, people, and the basic idea that we're going to be discussing is the whole idea of SBOMs, right? The usefulness, the caveats of using them, and all these wonderful things. Through the rest of the day everybody was talking about SBOMs as something, you know, that has to be there, and as if we all know what information has to be in there. But maybe this is not so obvious to everyone, and the whole idea is to discuss the idea of SBOMs, right? So, we have for the moment three panelists, and the format will be that they will each give a very short introduction of, I don't know, the problem statement if you want, right? And then, after these three short introductions, we'll go into more discussion, and we obviously will value audience participation. So, first of all, thank you for your time. Okay, afternoon. My name's Anthony Harrison. I used to work for a very large system integrator. So, the work on functional safety was very close to my heart. Certainly, the things from the Siemens team as well were very similar to the sort of things I used to be involved in for many years. I retired a few weeks, a few months ago. But before I left, I was trying to introduce SBOMs into my organization, and it followed very much on from the Log4j need: where were the components, which systems were affected, which customers needed to be informed, and where was our liability, very much from a risk perspective. So, since I retired, I've been writing Python tools. So, if you want to talk to me about some of the Python tools I've been doing, because it's been quite interesting, have a chat with me afterwards. But to sort of set the scene, I think we've got four types of users. And it picks up on what Adolfo was saying, Adolfo, yes, about who's a supplier? Is a supplier a developer?
Is it, I've put the word supplier in there, but basically someone who packages things together? It might be an embedded solution, it might be a Red Hat, it might be someone else. And then you've got your integrators, which is like your Siemens, people who are selling solutions to an end user who's going to use it. And I believe that if you are a producer of SBOMs, you also should be a consumer of SBOMs, first and foremost, because there should be value to you in what you are producing. You need to know where your vulnerabilities are, whether you're compliant with your licenses, as we've had lots of discussion about, but also where things are changing. In particular, do you know about those changes, and are those changes happening under your control or not? And this comes into the ecosystems, things happening outside, your Maven repositories, et cetera. So the first thing is, I think everyone who's a generator should also be consuming. So one of the questions asked this morning, when the Yocto guys were saying they're generating SPDX SBOMs, was: are they actually using them? I think the answer is most definitely yes, you should be. And then what are the sort of things you should be using them for? We've heard a lot about vulnerability management as the basis, but that's heavily dependent on the quality of your SBOMs. Have you got your components uniquely identified in a way that they can be found consistently? Secondly, looking at license management, again, are we consistent? But thirdly, looking at change management: are versions of packages changing, are licenses changing? I was telling Kate that on an open source project that I'm involved in, we generate an SBOM every week, and we do a build, and we are seeing components changing versions, and the metadata is changing with those versions. Sometimes it's getting worse. So why have we lost the name of the supplier or the developer?
And fourthly, I think you should be using it for solution integrity. Have you built what you said you were going to design? So we're picking up on the design SBOMs that Nico was talking about from the safety side. I think it would be really good to actually use the SBOMs as part of the solution integrity. Have you built what you said you were going to build? We saw a lot of that sort of thing with the checksums from the Yocto guys. Have we got end-of-life components? Are we planning for removing those end-of-life components, obsolete components, and are we seeing how the impact of those changes on the solution comes back onto the engineering life cycle? So I think all of it is supporting risk. So ultimately this is all about risk management, and hopefully SBOMs and all these things are helping you make effective decisions. I'll shut up now. That's just to try and shape things for discussion. Okay, thanks. My name's Paul Novarese. I'm a solutions engineer for a company called Anchore. We produce a couple of open source projects that can produce SBOMs and, in some cases, do vulnerability checks against them, so you can look up Syft and Grype. I'm not going to talk specifically about any particular projects here. So just to hit on the topic of the panel, I'm going to just jump into a couple of things. Oh, well, what I do there: I'm out in the field. So I am a solutions architect. I'm working directly with the end users a lot of the time. So most of my contact is with people who are at the beginning of the learning curve. They may not even know what an SBOM is the first time I talk to them, but they've been given marching orders. So it spans a pretty wide range. So as far as content, the biggest thing for me is that the SBOM, and this seems to be a somewhat controversial statement sometimes, needs to be seen as an objective document that only has factual information in it and does not have judgments in it.
Those things are important, but should be separated out a lot of the time. A lot of people probably haven't been in here all day, but there have been off-and-on comments about whether we should include CVEs in an SBOM. My opinion is probably not, because those are going to change more than the software that the SBOM describes. Obviously, if you rebuild the piece of software, then you would want to build a new SBOM, but the SBOM should be static as long as the software it's describing is static. In the case of a Docker image, we know when that changes because the image digest would change. We could trigger a new build, but we shouldn't necessarily be rebuilding those SBOMs continually if there's nothing new there. Now, that ignores things like maybe our scanner has gotten better and can detect more things. Obviously, that might be an exception to that kind of rule. The other thing, just as far as content goes, and it was the first thing mentioned today: the specs, whether it's SPDX or CycloneDX, are extremely loose. You can have a valid SBOM that doesn't actually have enough information to be useful for your particular use case. What that minimum amount of useful information is might be different depending on the use case. That's just something to keep in mind. The other thing that's come up again and again today: the quality of the SBOM is going to be really dependent on the tooling. There's no way to do these things without automation at scale. It's just too difficult to keep up with. The tooling is improving, but there's still a long way to go, I think. The main thing here for me is that the SBOM is useful because it's going to make all the other aspects of supply chain security better. One, what's in the software? That's the SBOM. It's kind of foundational to all the other aspects. Two would be things like, is the software safe to use? Does it have vulnerabilities? What are those vulnerabilities? What is my license?
Am I at risk for some kind of compliance issue there? Those kinds of safety issues, although I don't want to use that term after seeing the operational safety discussion. The other thing is provenance: where is this coming from? If you know what's in that software, having that assurance of where it comes from and what's in there together is much more powerful. Then things like reproducibility. It's really esoteric, very difficult to actually achieve in practice, but if you have an SBOM, a high-quality SBOM, it can effectively be a recipe towards that. That can also help you prove some of the other things that you've seen or we've been talking about. A lot of this is difficult though. We've seen a lot of things about opaque artifacts, and the difficulty of actually scanning at build time, which may produce a higher-quality SBOM, but at a lot of cost in performance and other things. That kind of leads us into the caveats. The main thing here is to keep in mind that our imagined state of how we want things to be and the actual reality are basically not the same at all. We'll try to move towards that. We're on a journey up the mountain, whatever you want to say. If you're climbing Mount Everest, we probably haven't even gotten to base camp yet. There's a long way to go. The other thing, in that realm: just declaring that we have a policy to produce or store or evaluate SBOMs doesn't actually solve the problems you're trying to solve. There's a lot more work that has to be done. The last thing on caveats: things like Log4Shell, I think, pushed SBOMs into a lot of people's awareness, because SBOMs were a very effective way of finding where Log4Shell was affecting software. It may have given a false sense of security, just because Log4j in particular sticks out like a sore thumb. It's very easy to find. Compare that with the OpenSSL vulnerability just a couple of months ago. I think it may have been a reality check there.
There are a lot of cases where you might have something like OpenSSL in software that you're consuming without even knowing it. It's a lot easier to hide. Just kind of a reality check, to bring some people back to Earth a little bit. Then the last thing is, in addition to producing SBOMs, one caveat is that you have to think about managing them after you produce them: storing them, being able to search them, purging the ones you don't need anymore. That's been a topic a couple of times. We don't want to keep more information than we really need. How do we know when it's safe or reasonable to get rid of some of this data? Much less, how are we going to search through it when we need to find something? I think that covers it for now. I'll go ahead and pick them up over time anyway. Thank you. Like you, I've attended more panels than I've been on. I don't like a panel that is a series of talks. I did hear Alexios say that I needed to make a very brief introduction. Mine will be, I promise, I'm timing here, under two and a half minutes. My favorite childhood... I'm Bradley Kuhn. I work for an organization, a charity, called the Software Freedom Conservancy. My primary job has been related to copyleft license compliance since 1997. I've seen a lot with regard to that issue, which does interact quite frequently with the issue that you are here to discuss today, software bills of materials. My favorite childhood story, by the way, is the story of the emperor who has no clothes, and I've found throughout my career in open source and free software that I'm often the only one willing to say that our emperors have no clothes. I think SBOMs are a case where that needs to be said, at least to a certain extent. The most useful application of SBOMs is in cases where you are in an organization that produces proprietary and open source software together in products.
If you are an organization that is 100% open source and free software and you choose a copyleft license, the usefulness of SBOMs to you is extremely limited, almost to the point that you may not even want to invest in getting involved in this kind of thing. Now, I'm a realist and realize that almost all of you probably work for organizations, including the trade associations among you, that produce lots of proprietary software. As such, you're going to have to worry about all these issues we've been hearing about today, and many of the tools today look very interesting to me for solving those kinds of problems. But I want to leave you with one thought, if I can. Imagine if there was no proprietary software in your organization: you didn't sell it, you didn't use it, and you didn't want to make it. And instead, you chose to look at the requirements of the copyleft licenses like the GPL, which require you to produce the complete corresponding source code as a reproducible build every time you produce a binary. And you have to make that available to every customer you have. SBOMs are most often needed when you don't necessarily have all the source code to hand, or don't know if it's going to be to hand when you get something from another vendor. So my argument is that the level of effort that's being put into SBOMs is primarily to enable the continued production of proprietary software. Being at FOSDEM, and being a free software activist my whole career, that's generally not something that I'm that excited about, which is why I'm not excited about SBOMs. Okay, so we heard opening statements about uses, about general use and caveats of using them, and why they're not needed, or why they wouldn't be needed in an ideal world. Okay, let's go back to that. Bradley, even in an ideal world where an organization is producing free and open source software and following license obligations, they provide all the source for every release they're doing. Right.
There might still be the use case that I want to find out which of my releases, or products, or whatever you want to call them, contain a vulnerable version of a library. Right. So if you have a lot of source code, there is a great tool called grep. And what you can do is search the entire source code and look at whether a version of something is in there. If you have all the source code for everything, and you never separate the source from the binary, why does grep not work? You tell me. Because it's easier to search a table of contents than the complete book. Right. That's why we have tables of contents and indexes. Right. Yeah. It's funny, since things have become electronic, I generally just turn the PDF into text and grep anymore. I don't use tables of contents. Really? Yeah. Okay. Okay. Okay. So we have a source file. I used to write C. It used to be full of #ifdefs. So if I search the source code, I will find a line of code, but it's not telling me whether that line of code was actually compiled into the binary, and is now actually on my target architecture, my target hardware. So how do we get round that? So we've talked about having sort of a trail of evidence from the source code to a binary, so we can then sort of match the two together. If we have different compile options, I would expect we'll get different binaries, I hope. So therefore, how do we accommodate that? I'd love to. And so in the world I'm imagining, which does not exist, I agree with you. This is why you'll have to do this work, because the world that I've been working towards my whole career doesn't exist. But if it did, and all software was under the Affero GPL, you would be required every time you build a binary to make sure you had a reproducible build that can produce that binary. So when you found the vulnerability in that binary, you would have stored the complete corresponding source release right next to all the binaries.
You take the binary that's sitting out in the field, you compare its checksum to the binary in your repository of binaries and source code, and voila, there's your source code release that you made at the time you built it years ago. Simple enough. I'm going to just, I still, I mean, I get it, right? Yes, theoretically, we should be able to reproduce everything 100%, but in practice, like, we don't do things like inspect every jar of food that's coming off the assembly line. We only inspect some of them, just for scalability purposes. So the SBOM still has a value there in providing a shortcut, so that you don't have to do a bunch of work over and over and over again on demand every time. It does. I mean, I get it though. You're right. In an ideal universe, that would be true, when we have unlimited time and unlimited compute resources to do all these reproducible builds. In practice, though, even if there aren't constraints on time and compute resources, reproducible builds are extremely limited in, you know... No, no, no, but they can approximate a lot of what you would get out of it. Not everything. No, not everything, but you can approximate some things, right? So I mean, I agree with you in principle that reproducible builds would prove a lot of things, everything, right? That would absolutely solve a lot of problems, but I don't know if everybody is really willing to live in that world. Yeah. My view is, be the change you want to see in the world. That's why I support it. So what you're saying is, invest in that instead of investing in SBOMs? I think part of our disagreement here, of the different views, goes back to the slide that Kate showed in the beginning: we're not talking about a single SBOM, there are different types of SBOMs, right? There are SBOMs that apply to the source, and there are SBOMs showing what the build is, or what the deployed thing is, and stuff like that, right?
Now, to Bradley's point, once you have everything documented and provided, people can recreate this information, right? So the great example that Bradley gave is looking at the source, right? But the obligations are that we don't only give the source, but also the build instructions and all the scripts that the license obligations require, right? So people can go and recreate it and then try to find out this information, right? So the information will be there, right? The question is how easy we want to make it for you, right? Sorry, you want it? No? What did you say? Questions? Hmm? All right. Yeah, let's talk. Maybe one moment. So I think I completely agree with you that it is better to take such decisions on the sources. But what is missing to do this is that the current vulnerability databases that we have only reference vulnerabilities by giving names, by naming components. So what is missing is: a vulnerability is present in this file, in this function, with this method signature, and so forth. And so I do not know what to search for. So if you only search for CPE blah, blah, blah, you will not be able to catch all instances of the vulnerable code, because maybe a project has been renamed, the code has been copied, and so forth. So that is, I think, lacking. Okay, so the comment was that, for example, vulnerability information does not usually refer to specific source files and lines in the source file, but it refers to product names or library names or whatever. So in order to find something you have to look for these, right, and not for the fine-grained things that we are talking about in the grep approach, right. Yeah. Thomas. So I know we talk a lot about security and licensing, but funnily enough, how ORT started had nothing to do with licensing or security at all. Basically, the CTO wanted to figure out what we are doing, where we can be more efficient, and which languages we should be investing in and going towards.
And by doing the whole SBOMs thing and having all the stuff, we actually see what we are using, and then we are directing these in near organization like, okay, yeah, you guys are using Ruby, you guys are using Java. Actually, we have all standardized on this. The company actually saved a lot of money through this. So again, it is often forgotten that SBOMs can be a great way to basically make the way you build software more efficient. And for that alone, I will build SBOMs even if my code is completely open sourced. All right. So, repeating for the mic, sorry. The comment was that SBOMs, while we are talking about uses in compliance and security, can also be used in a lot of other ways and can be very useful in such ways. For example, having a software catalog of the components being used by different, you know, parts of the company. Sorry, Anthony. So, I totally agree, Thomas. I think certainly for the large organizations, and I think we probably got a bit of that discussion with Siemens this morning. No criticism of it. Big organizations find it very difficult to share things. And so, if you had SBOMs of your builds and then a way of identifying common building blocks, I would hesitate to say the word products, or different instances of the same product, then yes, there must be some business opportunities to, A, save money, or B, be more efficient. And I don't know whether anyone is starting to see that, or whether you started seeing that, Thomas, in your own industry. So, just look here. I now work on how to open Pinos to see how we can build an SBOM for them. All of their code is fully open. If you had an SBOM, they can also see, oh, where are people contributing? Because we have resolved everything back to sources. And we can actually figure out, okay, these are where people are contributing. These are people using these libraries.
So, we can actually do three Pinos projects in a group. We have here people, here people. We need more Java guys. Here and here are already people who are using this particular library. So, these people in this Pinos project can probably help those people in that Pinos project solve things. And so, it doesn't matter the size of the machine, it's really about building software efficiently. Yeah. Okay. So, I think what Thomas is saying is that even if things are fully open, it's going to help as well. I would also add that proprietary software still exists. I used to work in the defence sector. That's never going to go open source. It uses open source, but it's not going to be fully open source, for obvious reasons. But we've got to keep those separate. And actually, businesses need to see those as separate anyway. And SBOMs are potentially a good way of sharing them and also handling some of the things like export control. I think one of the projects said this morning that it was actually handling export control things, which is also an interesting thing, because obviously the open source licenses don't have that constraint, but businesses do. And we have to recognise that. No, I was just going to add to that. I think that's the other thing we saw earlier this morning with SW360. Identifying the components that are reused over and over again kind of goes right in with that. Saying, okay, this component, it might only have two people working on it, but it is used in 19 different products that our company produces. From a management perspective, when we go to devote resources, whether it's additional headcount, whatever it is, we know that's a project that is key to everything else we're doing, right? So, yeah, I think that does help.
Again, it's something you probably could reverse engineer by looking at, you know, who's pulling from a Git repository or whatever, but it would be very, no, you really, yeah, it would be very difficult to look at. You'd have to time things and, yeah, it would be hard. And to be clear, I agree completely. Many of the tools we saw today, and generally speaking, SBOMs are a wonderful tool to aid in the production of proprietary software and mixed proprietary and open source software. And I think all of you who are in that business should probably be working on and doing more with SBOMs, because you're going to need it. I agree. But I just am not in that business. I don't want a world with proprietary software. I want a world with free software. And in the free software world, you have to pick where resources go. The better place to focus resources is on reproducible builds, not SBOMs. If the amount of funding and effort going into SBOM technology right now was going into reproducible builds technology, I think we would get better gains. So it's a question of where limited resources are being deployed, more than it is whether or not SBOMs are useful. I agree completely. They are. Are they more useful than things we could be doing elsewhere with those resources? Yeah, come on. Again, many of us may work for commercial companies, but on the other side, we are maintainers of open source projects. So we do open source. And personally, I do not care about GPL licenses so much. I would like to have SBOMs in my open source projects to make it easier for other people to consume them. I do the software because I like it and because I like other people using it. This is what I do. And having a good overview of the components that you use is also a good way to help other people. What kind of open-source control design do you use under the framework of open-source licensing?
I'm sorry, but I just reject the idea that I need the source code for everything, and that I tell everybody, if you want to know something, then you need to go through the source code. I want to provide good information for other people who would like to use my open source. So the point here was that even when producing open source licensed software, you find SBOMs useful for telling people what your software uses, or for documenting your software, essentially. Okay? Please. I'm mostly almost no lawyer in my back, because they know exactly what is coming there. They know where it's coming from and we have the data, and usually the lawyer says, okay, it's passing by this one and this one, you know, we don't need to care about this thing. So we do have this information already here, completely set up in the system, and we can say, okay, this is, you know, we know. So, it's about getting the approval of the company. Without this information, it takes the same time every time. We need to go into the discussions, and the company is losing money and time. Okay. How do I summarize that? SBOMs are also useful for getting legal approval for using software, or something like that. Yeah. Thank you very much. Sorry. And this is probably going to be an open question out to the audience, to those who are generating SBOMs now. The open source project I'm involved in is Python-based and it supports 3.7 to 3.11. We generate an SBOM every week in both CycloneDX and SPDX for each version of Python, and each version of Python generates a different SBOM, because you've got different dependencies that are version-specific. So, depending on the version of Python, I think there are about 25 direct dependencies, but then when you get the implicit ones, it adds about another 30. Some of them have got 50-odd dependencies, some have got 60.
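The week-to-week and per-Python-version differences described here can be made concrete with a small comparison of two component lists. The following is a minimal sketch under stated assumptions, not the panelist's actual tooling: the records are plain dictionaries rather than parsed SPDX or CycloneDX documents, and the package names, versions, and supplier strings are invented sample data.

```python
def index_by_name(components):
    """Map component name to its record (assumes names are unique per SBOM)."""
    return {c["name"]: c for c in components}


def diff_components(old, new, tracked_fields=("supplier", "license")):
    """Report added, removed, re-versioned components and lost metadata."""
    old_ix, new_ix = index_by_name(old), index_by_name(new)
    report = {
        "added": sorted(new_ix.keys() - old_ix.keys()),
        "removed": sorted(old_ix.keys() - new_ix.keys()),
        "version_changed": [],
        "metadata_lost": [],
    }
    for name in sorted(old_ix.keys() & new_ix.keys()):
        before, after = old_ix[name], new_ix[name]
        if before.get("version") != after.get("version"):
            report["version_changed"].append(
                (name, before.get("version"), after.get("version"))
            )
        # A field that was filled in before and is empty now is a regression.
        for field in tracked_fields:
            if before.get(field) and not after.get(field):
                report["metadata_lost"].append((name, field))
    return report


# Invented sample data standing in for two weekly SBOMs of the same project.
last_week = [
    {"name": "requests", "version": "2.28.1",
     "supplier": "Example Maintainers", "license": "Apache-2.0"},
    {"name": "urllib3", "version": "1.26.13",
     "supplier": "Example Maintainers", "license": "MIT"},
    {"name": "importlib-metadata", "version": "6.0.0",
     "supplier": "Example Maintainers", "license": "Apache-2.0"},
]
this_week = [
    {"name": "requests", "version": "2.28.2",
     "supplier": "", "license": "Apache-2.0"},
    {"name": "urllib3", "version": "1.26.13",
     "supplier": "Example Maintainers", "license": "MIT"},
    {"name": "rich", "version": "13.3.1",
     "supplier": "Example Maintainers", "license": "MIT"},
]

report = diff_components(last_week, this_week)
for key, value in report.items():
    print(key, value)
```

The same function could compare the SBOM built for one Python version against the one built for another, since the inputs are just two lists of component records.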
So, I agree with your comment about publicizing it, but are people picking up the right version? So that they are aware of what your software is using, because what your software uses will change. Oh, yes. Yeah. Okay. So, right. Okay. Brilliant. Glad you picked that up. So, therefore, are we capturing that information in the standards? Yes. Consistently? Yes. I'm not. Being in the standard is one thing; actually collecting it and storing it in the actual SBOM that you're producing is another. Right. I'll just repeat that. Right. Yeah. The standard allows for that, but doesn't guarantee that when you produce the SBOM, that information will get, A, collected or, B, recorded. Which is, yeah, a huge problem, and it's usually a tooling problem. And we've seen tooling improve a lot, but again, it's a journey from where we are now to where we would ideally like to be, and it just takes time and effort, obviously. But the number one problem I see in the field is that the tooling is producing results that are either not complete or not consistent with other tools. Right. I mean, we saw a ton of examples of that, where tools are producing different results. Yeah. May I add to that? One reason why, especially with open source, is that the level of education given to new developers regarding this topic varies widely. So I've been producing SBOMs and other stuff for more than seven years. I've spoken to more lawyers and package manager developers than I care to count. Everybody thinks that getting the data from package managers is easy. It is not. Right. Yeah. So to summarize that point, he's been producing SBOMs for years, seven years. Right. And it's not just a matter of querying a package manager and being done, right? There's, I mean, we see that all the time. I work in containerized software mostly. So a lot of the SBOMs I see produced are actually produced after the build, because somebody pulls a base image.
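As a rough illustration of the point about version-specific dependencies and tools producing different results, the sketch below compares the component lists of two SBOM documents. It assumes CycloneDX-style JSON, where components sit in a top-level `components` list with `name` and `version` fields; the function names and the sample data are hypothetical and are not taken from any project mentioned in the discussion.

```python
from __future__ import annotations


def component_set(sbom: dict) -> set[tuple[str, str]]:
    """Collect (name, version) pairs from a CycloneDX-style JSON document."""
    return {
        (component.get("name", ""), component.get("version", ""))
        for component in sbom.get("components", [])
    }


def diff_sboms(first: dict, second: dict) -> dict[str, list[tuple[str, str]]]:
    """Report which components appear in only one of the two documents."""
    left, right = component_set(first), component_set(second)
    return {
        "only_in_first": sorted(left - right),
        "only_in_second": sorted(right - left),
    }


# Hypothetical SBOMs for the same project built against two Python versions.
sbom_py37 = {
    "components": [
        {"name": "requests", "version": "2.28.2"},
        {"name": "importlib-metadata", "version": "6.0.0"},
    ]
}
sbom_py311 = {
    "components": [
        {"name": "requests", "version": "2.28.2"},
    ]
}

result = diff_sboms(sbom_py37, sbom_py311)
print(result)
```

The same comparison could be pointed at the output of two different generators run on one build, which is one way to make the inconsistency between tools visible.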
There's no SBOM for that base image to consume today. Hopefully, in the future, there will be. But those things that are in those base images, a lot of times, are opaque. It may be open source software, but it's a Rust binary that doesn't have audit information compiled in. And there have been improvements in that. Go puts audit information into the binaries. Rust can do it, but doesn't do it by default yet. So there's still, yes, a ton of that. And so the rest of his point was about developer education. It's one thing to be aware of SBOMs as a concept. It's another thing to be aware of the limitations, what has to be accounted for when you're building them, when you're producing software in general. There's just a lot of plates spinning all at the same time. And maybe to add to that, we have a little talk on Friday about this. Please do not reinvent the wheel if you need SBOMs. Ask the community; there are plenty of options already out there, so you can avoid building it all yourself and wasting a lot of time and effort. And another thing that you will find in the workshop on Friday: teaching developers is OK, but it's not the right path. Because developers are creatures of habit. And they will not follow you or listen to you, even when the deadlines are very near. I mean, so the last comment was that it's difficult to get developers to change their behavior and they don't listen, which I find too in my work, which is primarily copyleft license compliance. To your point, I wanted to add, one of the reasons it's probably very difficult for you, it's not the only reason, but one of the reasons it's difficult to build SBOMs for containers is that nearly all containers in the world are violating the GPL. And so you don't have the source code to even go and start building your SBOMs off the source code. You're stuck with binaries that are GPL-violating.
But there is absolutely no funding available to handle that problem in the container world and the GPL compliant side. So I guess you'll head on the binary side, because you have all your S-bombs funding to fund it that way. OK, yeah, sorry. I'm trying to understand your point. I understand that it would imply that if all software was open, we wouldn't have any need to S-bombs, besides perhaps the things that they mentioned. Even though every talk today was stuff that wouldn't matter if all software was open. But I'm trying to understand what your point is. Are you implying that instead of actually being here at both them, pushing open source software so that all software, at some day, being open source, are you implying that we're wasting time producing stuff that actually supports the current standard? But it's completely yours. Yeah, that's a question for me, and I'll summarize it. So the question was, in this imaginary world that I proposed where all software is open source and free software, is what I'm saying that the effort being put into S-bombs is actually enabling the production of proprietary software. And I think the answer is yes. I think S-bombs are a system to make it easier to ingest open source software and bring it into proprietary software. And I came to as many of the talks as I could today when I didn't have other obligations. And many of the talks today were talking about ingestion of open source for that purpose. When you hear people talking about, oh, we can be able to blacklist GPL stuff. Well, the reason they want to blacklist copy left stuff is they want to make proprietary software. Now, it's a question of values. The commentator over there that I think never got summarized was pointing out that in his values, he feels he really wants to see his software put into proprietary software and to encourage it and make it easy. I don't, obviously, agree with those values. If you agree with those values, I agree completely. 
SBOMs are a great solution to be able to encourage the adoption of non-copylefted software. This is a very stark issue. I want to give an example. There was a talk earlier, you can go on the internet and figure out which one it was. But their system, when putting licenses in buckets, when it saw that it was a copyleft license, they had a Python function that I found in their source code that says, oh, if it's a copyleft license, this Python function should return the string is-danger. So the concept that even those writing SBOM tools believe that copyleft is a dangerous thing is kind of giving you a sense of the values that are circulating around the SBOM community, which is unfortunate, because I think the GPL is a wonderful license, not a dangerous one. But I realize others in the room disagree. My question was, can we not see the other side of the coin? So wouldn't that, I mean, I'm not trying to reconcile free software and commercial software by any means, but wouldn't that extra transparency brought by the SBOM community also help for your use case? Can you then identify where that piece of software is being used? The answer is no. And I don't want to get too much into it, but I'll talk about it later, because I don't want to take too much of the time. All right. Kate, sorry. I want open source to be used in safety-critical applications. Open source has bugs. How can I track the bugs and fix them so it doesn't kill people unless I know what's there? With just the source code, trawling all through it, I'm not going to be able to find and fix them at scale. We need to abstract to go to scale. How do you propose to solve that problem? OK. I think you just run the mic to Kate. Yeah. That's OK. So the question is that we want to do functional safety with open source, right?
And in order to do it at scale, we have to abstract things to a higher level than simple source code and talk, for example, in packages and have, again, the table of contents of things rather than the actual source files. So I want to respond. Off the mic, it was said there's a concern that software is going to kill people, with the implication that without SBOMs, we won't be able to prevent the killing of people with software. I agree, there is software that has killed people. I was very taken by the Therac-25 case, which, if you're in computer science, you probably studied, which was a proprietary software system that killed a number of people due to a software bug. So I agree completely that we have a long, long history of software bugs injuring people. My argument is that the best thing to have, when you have a binary that you worry has a bug and has a vulnerability, is a completely reproducible build for that binary. Such that you can go and make that binary again tens of years later, hundreds of years later, and, again, have all the scripts used to control compilation and installation at your fingertips for every binary produced in the world. I agree with you that it takes a lot of resources to do that. I would like to get to that world where that's the case, where every person who relies on a piece of binary software has immediate access to the complete corresponding source code. There are some tools in that case that I think should exist that don't. I don't see anybody in this community working on them. For example, our mind was working years ago on this very interesting tool that was doing orchestration through build processes, where it was tracking exact hashes of source code that was going into a binary. Those kinds of things are very excellent tools that we do need and should be created, and they would be a great enabler to the types of things that I'm talking about.
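To make the idea of tracking exact hashes of the source code that goes into a binary a little more concrete, here is a minimal sketch. It is not the tool mentioned above, whose design is not described in the discussion; it simply walks a source tree and records a SHA-256 digest per file. The function names and the temporary example tree are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash one file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


# Hypothetical one-file source tree, created only for this example.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "main.c").write_bytes(b"int main(void) { return 0; }\n")
    manifest = source_manifest(root)
    manifest_again = source_manifest(root)

print(manifest)
```

A manifest like this, stored next to the binary it describes, lets someone later check that the sources they hold are exactly the ones that were built.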
But I don't see SBOMs bringing us that, at least not at the moment. I just want to say that SPDX SBOMs have been bringing these things for a while. And we've had exactly this for a while today. What's with the things that are not prepared, the tool that does it? Yocto is doing it, and Zephyr is doing it. Hashes of the sources, and what's going to be intermediate, and how this is going in the follow-up. OK, so Kate just told us the problem's solved. So we don't have to do anything more. That's great. You use Yocto and the problem's solved, it sounds like. OK, so the comment was that build tools, no, build platforms like Yocto and Zephyr already record all hashes of sources going into binaries for their platforms. Great. Any other questions? Come on, people, don't be shy. Yeah? Yeah, I have a quick question. So for the SBOMs, is the idea, and this might be a very basic question, is the idea that we produce an SBOM that represents the task factor or, I don't know, or as large whatever combination of tasks? And that includes both. Build dependencies, transitive tasks, common dependencies, task dependencies. Or is the idea that you use multiple of these as a factor and for us, we'll have you. So that you can then operate the idea. I have a problem with vulnerability. I realize it's an issue in a particular thinking where it's, let's say, if it's an OpenSSL vulnerability, my build tool, I probably don't have as much. But if it's in the document, I'm going to have much more. You want that? Yeah. You summarize it. Yeah, I'll summarize it. Good. Yeah, so the question is whether or not the SBOM is intended to capture, in addition to the code and the dependencies, is it also capturing the build environment, et cetera, things like that, right? And the answer there is maybe. And that's kind of one of the reasons, like I don't know if you were here all day, but one of the first things that we covered was different types of SBOMs.
So there are SBOMs specifically for code repositories. There are SBOMs specifically that are generated at build time. So yes, maybe there can be an SBOM for the exact combination of conditions, or there can be a more generalized SBOM. And there's use cases for both of those. How much are you able to get all of these? Maybe. Again, it depends on what you're consuming from other people. If you're consuming a Docker image, it may be too late to, you may be able to reproduce it in a, well, let's not get into that. I'll say one thing more about reproducible builds. Even in the universe where we have all the data, actually reproducing a build is extremely difficult, and maybe even impossible in some cases, right? Because there's just too many variables. So I am not one to shy away from saying, we should have an ideal and work towards it, right? I mean, absolutely. But the level of effort to achieve one goal is not necessarily the same amount of effort to achieve another goal. So. Yeah, I guess one of my questions was, let's say I have something I think that generates as well. I'm supposed to run multiple times throughout the pipeline, as opposed to throughout the blocks. Yeah. Yeah. Yeah. It depends. Yeah. The answer is it depends. It depends. The answer is always it depends. But I'm just looking at the highest one is the design one. Now, one of the things I've heard a lot of people, and we've talked about lists of ingredients several times today, people nervous about putting what's in their product because potentially people are saying, well, if you tell me I'm using package x, y, and z, then a competitor can also put x, y, and z together. So are people concerned about that? Because that's one of the things that people are saying is delaying the adoption and the publication of that. 
Or is it saying that certain S-bombs can have that level information, but with a very restricted audience, as I think from maybe the later down ones, which are probably more public because they potentially have different business needs? That's out for that. I'm not saying I've got a view, but let's say what do people think about that? Because people, I'm seeing some of the organizations saying, I don't want to publish my architecture, since it's an architecture, because that potentially is making a community potentially vulnerable. My business model being vulnerable. And go back to the market I used to work in, my architecture was protected under certain business needs. And I couldn't share that. I still can't share it. I just want to add one thing there. And Siemens had their talk earlier today about having different S-bombs for different use cases. And one of them was specifically around that. We have a specific S-bomb for regulatory consumption that only includes the information that the regulator needs. And I don't know if that's a, I actually wanted to talk to you guys about that. Is that produced from the other S-bombs? We'll get into that later. But they have the other S-bombs they only use for internal purposes. So even in the case of software that is not going to be distributed at all, there's still a very strong use case for S-bombs. That software that stays inside of Siemens maybe doesn't even go into a product and is only used for, let's say, internal accounting, I don't know, whatever. Right, right. So there's still a use case there where there's not a concern about necessarily poisoning the software or whatever, right? So yeah, I just wanted to tie back to that, because I wanted to A, remind myself to talk to you about it. But I thought it tied into what he was asking too. There's a true reality of being in big organizations, is because at some point someone will ask it, where, or when, or why. This is always about the project that you're working. 
And sometimes there's someone that is going to a completely different area, a completely different country. But for some how, it's using the project to do their discussions. And then what happens is that there's a lot of people who have meant itself trying to find information. If that's a thing that already happens on S-bombs, since there, you have all the information of the project details. And then this question could be totally sorted out. You're saying for just internal management audits of resource planning, something like that, right? Who was working on this project? Yeah, yeah. So all of that, yeah, that information can be captured in an S-bomb as well, right? Yeah, I agree with that. Yeah, I wanted to go back to you brought up this list of ingredients question again, which I think is a really interesting analogy to this whole situation. Certainly I've eaten processed foods in my life, and I don't cook everything from scratch. So having a list of ingredients is certainly much better than not having one if those are your only two choices, if you're given that false dichotomy. However, I'm much more interested in getting recipes, full recipes, with all the instructions that I am getting lists of ingredients for something. And similarly true, I've used proprietary software in my life. I avoid it because I don't just want a list of ingredients. I want the whole recipe. OK, so staying with a list of ingredients and food labels, analogy, whenever we see a list of ingredients, it's usually a couple of things. Not a couple, OK, let's say a dozen things. And then may also contain other things. And it's also, yeah, we can look at the chocolates, right? And various other additives or whatever, stuff like that. So even in this case, we do not get a complete and exhaustive list of everything or not an accurate one. Are we trying too much, right? Trying to go to the S-bomb and find everything there. So just looking at this, these are two pieces of chocolate. 
Other brands are available. Other brands are, yeah. So OK, one says this 42, it's all in German, probably. So one says 42%, one says 44%. But the end product is chocolate. Are you able to tell that 2% difference? Because there's a set of, well, OK, probably by taste, maybe, if you're really good. But OK, is your software the same? Because that 2% difference might be a different compile option, to take the analogy further. OK, so the question is, if in our food we are so lax, why do we try to be so exact in our software, and why are we spending all this money that Bradley's talking about? Yeah. I have an opinion on this. So I'm in a big company that consumes a lot of software products. And we do care that bills of materials exist in many cases. But we do not care at all what the content is. So that 2% difference, I agree. Well, when people come to me and ask, well, can we get that product, from an open source point of view, I ask, well, do they have an SBOM? And if they have an SBOM, it's a sign that they care, that an SBOM is important to them, that they are capable of creating an SBOM, and that they produce the product. And in many cases, fine for me, go ahead. I don't care what's in there. OK, so you want to? Yeah, I'll summarize that. Yeah, so the comment was that he doesn't necessarily care what is in the software, but he does care that there is an SBOM, essentially. And I agree with that. As a consumer, a lot of times I don't. I'm not going to read the ingredients, right? But knowing that the ingredients are available, right? And I don't know if this is a perfect analogy. The ingredients list comes up all the time, right? There's more to it than that. But knowing that someone has the ability to audit that information means that I don't necessarily have to be the one that does it, right?
Just like when you go shopping, you can benefit from other people bargaining, even though you're not bargaining yourself, just because that does drive prices down in the general case, right? The ability, that activity on the margin is extremely important, right? So yes, I care a lot that even if I'm not doing the inspecting of the food to make sure it's not spoiled or whatever, that someone is, right? Just knowing that is good. Oh, I got a question. Well, do you know how food inspections have to be done? All right. OK, so I want to follow up to what you just said and ask you. If your choices work, you're saying you're not going to look at it, which means probably when the recipe is not available, you're not going to go try to cook the version of yourself to see if the recipe actually works. You're going to be relying on the fact that, hey, there's a recipe out there, and they say that's the recipe for this, and probably somebody looked at it. So here's a question for you. If you had a choice between just getting the ingredients list and actually getting the recipe, which would you rather have out there, assuming you're not going to really look at either of them? OK, well, it depends on what we mean by the recipe, right? I mean, is it just the list of what to combine in the proportion? That's every little tiny stuff. I mean, some of it, there are elements of that that could be related to food safety, right? I didn't refrigerate it or whatever. And yeah, I'd. If I give you the recipe for a salami sandwich, right? Do you want to have a salmon all the way with the back back to the actual pork? I'm a vegetarian, so I don't know what to do with it. Yeah. But I mean, it's an interesting thing is, you know. Where are the? Wait. Sorry. OK, it's just an interesting thing is people have allergies for food, and taking this scenario along is quite things. So, you know, product may contain nuts. People are allergic to certain types of nuts. 
So if you received an SBOM, do you, and, you know, would you validate that that SBOM is suitable for you to use and not having an adverse effect on you, which might be a GPL license or something, maybe an adverse reaction? Just to take it along. There seems to be quite a good analogy here in terms of just receiving it. You have to actually look at it; otherwise, the consequences of it, you know, basically, you suffer those consequences. OK. So, I understand the argument that an SBOM is basically a synthesis of information that, if you had all the source code of your dependencies, you could essentially extract from there. And I understand that in the security use case, it's much easier to just look for a version and a component name in a database of SBOMs than to grep through the source code in its different forms. But if the stakes are high enough, in some cases, you also want to do the grepping. So I don't know how big companies handled Log4j or the, or the... How do you see the reality? What I'm going to assume is that fairly small companies without a huge budget and without a lot of, you know, computing resources, if they had SBOMs at the time, and it's a big if, just looked them up. But I'm pretty sure that big companies with high budgets and computing resources also did the grep in the source code, because they couldn't afford to assume that all these SBOMs were right. So what I'm saying is that, with the conflicts we've been discussing, it's better if you also get all the source code of all your components, so that you have a plan B in case your SBOMs are wrong. I would also encourage you to pass the source code on to your users, but that's a different kind of problem.
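The lookup described here, searching a collection of SBOMs for a component name instead of grepping through source code, can be sketched as follows. This assumes CycloneDX-style JSON documents with a top-level `components` list; the product names, the inventory, and the `find_component` helper are hypothetical. As the speaker notes, a lookup like this is only as good as the SBOMs it searches.

```python
from __future__ import annotations


def find_component(sboms: dict[str, dict], name: str) -> list[tuple[str, str]]:
    """Return (product, version) for every SBOM that lists the named component."""
    hits = []
    for product, sbom in sboms.items():
        for component in sbom.get("components", []):
            if component.get("name") == name:
                hits.append((product, component.get("version", "unknown")))
    return sorted(hits)


# Hypothetical inventory: one SBOM per deployed product.
inventory = {
    "billing-service": {
        "components": [{"name": "log4j-core", "version": "2.14.1"}]
    },
    "report-tool": {
        "components": [{"name": "slf4j-api", "version": "1.7.36"}]
    },
}

hits = find_component(inventory, "log4j-core")
print(hits)
```

A product whose SBOM is missing or incomplete simply produces no hit, which is why searching the source code remains a useful plan B.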
Yeah, no, to summarize that point, he's saying that if you have an SBOM in a zero-day response situation, let's say, right, you can find Log4j very quickly, but you're still going to want to go back and comb through everything to be extra certain, right? Yeah, yeah, well, you obviously, yeah, yes, I agree completely, you want that option. I think part of the reason that that's necessary today is the disconnect between what the scanners are capable of detecting and recording, and how hard it is to have a truly bit-for-bit reproducible build, right? If we get to that universe where the builds are reproducible, you don't have to do it all the time, but if you know and you have a high confidence level that these builds are reproducible, you don't necessarily have to reproduce every one of them all the time, right? Actually, this whole thing ties in closely with the whole food part. So, because in Brazil, and I used to see, I saw a little documentary about a network of supermarkets in Orleans that actually produces orange juice, and, okay, blockchain is not the part that matters, but from the origin in Brazil, the way they pick the fruit, they do traceability using blockchain for each part, then the ship, everything, until it reaches the supermarket, and they can show the documents with all the traces of the steps of everything. Basically, they have found a way to find a vulnerability in the middle of the process if there's some food that's gone rotten or got lost. Yeah, that's a provenance issue, right? Yeah, I can show this bottle of orange juice came from this batch of oranges. Those oranges came on this ship from this orchard and were picked at this time. Yeah, that is, so that is a kind of an intersection of signing images or signing software with the SBOM showing what's in that software, right? Yeah, yeah, yeah, yeah, to really stretch the analogy. Yeah, but that's, I think that's, yeah, you need both of those, right? Ultimately, you probably want both of those.
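The check behind a bit-for-bit reproducible build, as discussed in this exchange, comes down to rebuilding the artifact and comparing it with the released one. The sketch below shows only that final comparison, using SHA-256 digests; the byte strings stand in for real build outputs and are invented for the example, and the hard part (making the build deterministic in the first place) is not shown.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a build artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_bit_for_bit_identical(released: bytes, rebuilt: bytes) -> bool:
    """A rebuild reproduces the release only if the digests match exactly."""
    return artifact_digest(released) == artifact_digest(rebuilt)


# Hypothetical artifacts: placeholder bytes, not real binaries.
released = b"placeholder bytes of the released binary"
rebuilt_same = b"placeholder bytes of the released binary"
rebuilt_with_timestamp = released + b" built 2023-02-05"

print(is_bit_for_bit_identical(released, rebuilt_same))
print(is_bit_for_bit_identical(released, rebuilt_with_timestamp))
```

The second comparison fails because of a single embedded timestamp, which is typical of the variables that make real builds hard to reproduce.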
So, there is, for those interested in reproducible builds or SBOMs, there is a working group at the OpenSSF. Probably faster. There is an open working group at the OpenSSF. A bunch of people are thinking about how you can get a reproducible SBOM, and some of it deals with, I mean, not necessarily capturing all of the ingredients in the SBOM, but, I'm not really that involved, but from what I read, they're trying to work out, if I take an SBOM and try to reproduce a build from only the ingredients listed in the SBOM, can I manage to do it? And they're tying it to other trust issues. But it's an interesting project, if anyone is interested. I think we probably end. I think we probably end. I think having all the ingredients and the provenance and things like that is going to produce a huge amount of information to be managed. And then how do you manage it? And how do you find it when you need to find it? Because it's bad enough finding source code sometimes, but then having to find other artifacts and relations, that becomes quite hard. So, I'll put it out there. For those of you who have got SBOMs and are consuming them, how are you managing them? Because I believe we talk a lot about producing them, but SBOMs must be related to other artifacts, I'm sure. Bills of materials for hardware, the documentation that we talked about earlier about safety. How are people managing those discrete artifacts together as a unified, consistent solution? Okay, a more general question. Is anyone consuming SBOMs, or are we all talking about producing them? Nobody cares. Sorry, yeah. Please. Yeah, we produce and consume as well, for us all. And we've produced a lot of things, and we've produced a lot of different things. But ultimately, we can't wait about the SBOMs that represent the software that ends up being a completely different one.
And so, the first thing we do is we help, we use some of the tools in the state that's actually in the X-ray, and that's what we do. We use them to find out if an artifact has that many materials with the first part of the artifact that we're representing. And then we have our mechanisms tracking, our releases, and then we use that same identifier. And when we want to look at these SBOMs, we're only really looking at, and it's one of those things that has actually been released, which starts with that. And from that, if you have very solid deployment mechanisms, you'll also have an understanding of where things are going. So, your analytic exposure to places, there's the estimation of the software, which consumes a certain number of clients. And that can help you filter, because risk is fairly contextual. And we're producing and consuming that's from a risk perspective, where it's legal, license compliant, and security. And security risks, for example, a lot of our use cases are turning completely to nine. And the place of the shell is in the East Gambas. We provide a lot of sources, and they can find hundreds of CEs. So, we're going to be finding a lot of things. But a lot of it's not very useful, not actionable. Or developers, especially when it's stuff we open, don't need to worry about fixing. So, it just gives us a really good data set to search, and use in an operational sense. But it requires humans. We just sell some opportunities at the end of the day about having sort of better data sources with more automated-scale response to threats and things like that. We have less clips and very nice proprietary on business-based faces that are moving and only use for things like the CD, CD hatching, and stuff like that. And we're hoping to find a way to do that. Can you summarize all that? I'll have to summarize all that. Yeah, well, okay. They produce, yeah. 
So, I actually want to go back to Zach's comment, because I have a question now, because Zach was pointing out, with some of us earlier, that having the SBOM and the source code is really the ideal situation. And everybody's been saying SBOMs will help us with bugs and identify vulnerabilities, which possibly is true. So, if you have a Log4j or any of these situations where you've identified it in your SBOM, you have a version, you have that version, and you have that package, but, of course, you don't have the source code. So, how does that help you solve the vulnerability if you have the SBOM? It seems to me the only thing you can do is take the binary out of deployment if you don't have the source code. So, I'm curious if I'm missing something that the SBOMs do that allows you to solve the vulnerability with no source code present. Yeah, I'll take that. That does happen sometimes, where there are binaries for an internal project, let's say, and nobody knows where the code is, right? That has happened. We've seen people build SBOMs, and yeah, that's essentially the case where you can get an assessment and say this is vulnerable to whatever. Removing the binary is an option. There are maybe other mitigations that could be put in place, but it is an awareness thing, right? I mean, that's kind of the main thing of SBOMs in general: a shortcut to omniscient awareness of, or mastery over, the contents of this thing, right? I think we're probably drifting a bit into slightly wider things. An SBOM is not your single point of decision making for things like vulnerabilities. You need to look at how your product is being used, has it been deployed, what's the environment, who's using it? If it's in a test environment, that's probably a slightly different environment to an operational deployment and an OT environment. So you need to look at that, you know, look at the context.
And maybe S-bombs need to capture more of a context. And obviously, if you're creating a product, defining what a product is, it needs to also have that context around it, which then gives you then maybe other protections, physical protections as well, that are not captured in the S-bomb, that may help you make those decisions. I have a totally different question about the usefulness or what, do we need to change S-bombs if we look into being a co-pilot and the one? Do we then have a tendency with a 13% rightly book or something and need to add that to the S-bomb? I don't know, well... I'm not going to... I don't have any comments on co-pilot. OK, so the question there is, with the likes of things like co-pilot and I presume our little friend chat GPT, what does that really make to the world of S-bombs? And I would probably say... let's widen that a little bit more to basically explainable software. Can you explain what your software is doing? And I'm sure our safety people would be, you know, one of the things when I've been involved in safety, you've got to explain your architecture and explain why your architecture is safe and is fit for purpose. So have you got explainable software? And maybe an S-bomb will help that argument. It won't be sufficient, but it might help explain why your software... why you think your software is suitable because of your selection of your components, you've got components of known pedigree, et cetera, may help your argument. But actually, things like, yes, automated code generation, which we all have if you have compilers, because that's automatic code generation, and we ultimately trust them. So how do we put those different measures together and collate that and recognise that? Because obviously, we've also got things like, you know, AI, machine learning things, as well, capturing the data, how good is the training data, how good is the algorithm within that, and can you understand why it made that pedigree a decision? 
And I don't know whether you're going anywhere near that with your safety work. But, yeah, so I think explainable software is possibly a completely new topic. Where's the vibe? You don't want to discuss that? Yeah, yeah. In another capacity, I've written multiple blog posts about the GitHub Copilot situation, including an essay that won a prize of some sort. So I encourage you to read those, because there are a lot of other issues there. I think one of the fundamental issues that you're pointing out is more general, and it's not specific to Copilot. It's that machine learning, generally speaking, as a discipline, if you can call it that, is one of the most unreproducible processes that we've ever invented in computer science. And so the reason you can't get an SBOM out the other side of a code generation system that uses machine learning is because you can't reproduce anything in the machine learning process. Once you train that model, you don't really know what it's doing, because at that point, it's just a table of floating point numbers. And so that's something that I think maybe we could all work together on: to be angry about machine learning systems and machine learning researchers not caring about reproducibility as an issue. But the fact that... Okay, yeah, good. One of the things we're doing in the next round of SPDX, the 3.0 release, is extending it to handle data sets, along with the compute parameters, as well as AI. And we've been working with AI researchers, AI groups, and so forth, in order to be able to capture the key information, so we can at least start to summarize what factors into the software. So the comment was that in SPDX version 3, we're going to have AI and data profiles, which try to capture this information. Why do you think that AI cannot be done in a reproducible manner?
That's not done now, right? But if you can record everything, have recipes for everything, do you think that AI fundamentally cannot be done in a... No, no, no. I think decisions have been made over the history of machine learning, going back basically 25 years, that have led us to this point. We have to reverse engineer the entirety of the machine learning research history to get to reproducibility for machine learning, unfortunately, because it wasn't taken seriously as an issue early on, in part because of the amount of processing power required to reproduce, right? Because if you want to retrain something, the amount of compute time you need to retrain is not available to most people. Okay. Question. Philippe, I'm going to give you a quick question. I wanted to get back to the point about who consumes SBOMs. I was surprised that only one person said they consume SBOMs. In the project I maintain, we consume SBOMs. The way we treat SBOMs is as a lossy proxy for software data. And therefore, they're treated the exact same way you would treat a package manifest in the process. And I think it's a very correct way to process that. They may be correct or incorrect, but they are no more correct or incorrect than, say, a package database or a setup file. The comment was, sorry, oh. Yeah, I wanted to just kind of add to what he's saying. Yeah. But I'll summarize it. Yes, so he's saying that he was surprised by the number of people, I think it was just one person, that said they're actually consuming SBOMs today. Yeah. And that he does, in the project he maintains. I think that you're just ahead of the curve, probably.
For most people, if you talk to people who are doing anything with SBOMs today, their adoption curve looks like: produce an SBOM, then maybe consume an SBOM, and then maybe in the future, I will incorporate the SBOMs that I'm consuming into the product that I'm building or whatever, right? Whereas in the future, let's say a year from now, that may be a little bit reversed, where people start consuming SBOMs before they create their own SBOMs. But today, there are not enough suppliers producing SBOMs to be consumed, right? That's a chicken and egg problem there. I mean, that's what I'm seeing, at least. So, Philippe, I should have raised my hand halfway to that question, because I've consumed SBOMs two to three times in my life, in cases where, under copyleft licenses like the GPL, we asked for the complete corresponding source code, and in response to that request, they instead produced an SBOM, which is in fact not the complete corresponding source code. So I consumed it long enough to tell them this is not source code, and they insisted, by the way, that a major compliance industrial complex vendor had told them that that's what they were supposed to give us if we ever asked for the source code. Yeah. Good. Yeah. I just want to respond to that. Just real quick. But that's OK. I agree that's unacceptable, but that's not a problem with SBOMs inherently. That's a problem with lawyers giving the wrong information or people misunderstanding what the lawyer told them. Or evading the licenses. Or maliciousness. Yeah, it absolutely could be maliciousness. Delay, stonewall, right. Yeah, yeah, yeah, yeah, yeah. I was mean when you were talking about that. No, no, no. I just wanted to make a, yeah, yeah, yeah. It just wasn't explicitly stated, so I just wanted to kind of identify the issue. That's OK. Yeah, yeah. For the record, I think no one here assumes that source code is replaced with SBOMs.
Yeah. We have to finish up pretty soon. Yeah. I just want to comment on the losing of points. I don't know if you need to go internally. I'm going to consume it. So there's a way of indicating the means of balancing between the developers of trying to control this kind of thing. It's great, so I'll repeat that. OK, so I think what you're saying is, and you're at Orange, I think. Yeah, so you are producing and consuming internally generated SBOMs. So our last question is, I wonder what you discovered. As you are consuming the SBOMs, what are you doing to help make decisions on your design? Are you saying, oh, there's a license missing, or, oh, I shouldn't be using that version? In terms of licenses missing, we find plenty of weird licenses and some surprises. All the time, there are some surprising licenses in the long run. No, there's a resource, a virtual resource, of both kinds of licenses. And has that resulted in a change to your design, or have you just? Well, now and then we have to change libraries. The users only have the need for commercial license. OK. Oh, yeah. OK. Well, that's good, I think. I have the published source code, but actually, it's not really talking. So licensing is probably quite an interesting use case for people to understand. And I think we've had quite a lot about licensing today, which is really good. I think there's plenty more to develop over the next months and years as we mature. And actually, I think if people start producing SBOMs, it's going to reveal a lot more. There are a lot of unknowns, I believe, in software. And are people starting to ask questions of their suppliers? You know, I'm probably looking at the large organizations that are in this room.
Have you put things in your contracts that are now asking your suppliers to provide SBOMs with your delivery, as well as a delivery note and a release note? Are people starting to do that and ask for them? And, probably to keep Bradley happy, are they also providing their compliance against the list of licenses? Because if people start providing that as a starting point, that's going to help your risk assessment when you take software into a solution and then start adding value to it. That should be a real game changer for businesses as well. Yeah, he's saying that he's unable to force any vendor or supplier in general to supply an SBOM, which I think is a big problem in general, right? I mean, A, your vendor may be 10,000 times bigger than you. Well, not in your case, maybe not. But a lot of smaller operators, right, they don't have the leverage over the vendor, right? Or in the case where you're consuming open source software, they have no obligation to you at all, right? I totally agree that open source maintainers are already put upon enough, right? There's nobody that should put a gun to their head and say, you have to produce an SBOM for me to consume. That's where the tooling has to get better, right? Companies like GitHub could even advance this a lot by making it much easier for project maintainers to have those produced on a regular basis, right? Yeah, in my work in license compliance and enforcement, the number of bad actors in your vendor space is much higher than you probably realize. And it's often not the vendor being bigger than you. It's actually these small vendors, where there's only a couple, there's kind of a cartel. Particularly, I'm thinking of chipset vendors for embedded devices. There is a major chipset where I'm sure there are probably at least 30 devices in this room that use that chipset.
The vendor requires you to agree, before they will ship you even a demo board: you promise, I will never ask for the complete corresponding source code under the GPL, and that's the only way I'm gonna send you a demo board. That if you become my customer, you're never allowed to ask me for the source code. Now this is a GPL violation, but it's very hard to catch, because everybody who signs this agreement says, I don't ever wanna tell anybody I signed that agreement, because they'll never sell me a board again. And so this optimism that we're gonna get vendors to give us SBOMs, when vendors are getting people to sign those kinds of agreements: the idea that you're gonna get them to agree to give you an SBOM, I think, is probably a pipe dream in the current industrial environment, around embedded chipset vendors at least. I don't know about other industries, but for the embedded chipset vendors, I think that's the case. It's not that. It's the wrong way, you don't have any embedded products. I mean, we know popular companies that use their chipsets if the driver's not upstream. Right. Yeah. And this makes it easier for us. It's not easy. Yeah, yeah, yeah. Well, thank you for doing that. Yeah. So we're out of time. Many thanks to the wonderful panel and all the audience. Thank you, all the audience. Thank you. Thank you, Alexios, for organizing. Yeah, yeah. Sure, don't wait. Come on. Come on. Come on. So we have one last session, which is an open Q&A on everything SBOM for whoever is interested and is still alive and awake. |
General Q&A on SBOMs |
So we're in the last stages of a Q&A. Are there questions that people have had during the day that they want to bring up and they didn't quite get to? Okay, we've got one question. So the question is, what's the difference between CycloneDX and SPDX? So I'd say that CycloneDX is focusing on the package level alone, and on the metadata at the package level, for the most part. Okay, SPDX can do the package level data, and it can also look at the source files or at parts of source files, snippets. So there's a different mental model underneath it. At the package level, they are pretty much functionally equivalent, and you should be able to interchange between the two. And there's work going on to help people interchange between the two. The challenge becomes your use cases. If you want something quick, you know, there's one solution. I think SPDX can do the same thing, but it comes down to the tooling and what you need. So it's a function of ecosystems and what your end goals are. I and some of the others that are working here, we care about going towards safety, not just security. And we need that level of information. For others, if they're just going for the packages, that's a good starting point. I think one of the people I was working with on the SBOM stuff a couple of years ago said SBOMs are like diamonds: there's a difference between industrial ones and engagement rings. All are good, though. All add value. So I will keep it at that. I know I can go into a lot more detail on things, but that's not for here. CycloneDX has different component types, so it has things like libraries and operating systems. That's actually in SPDX 2.3. We put it in there so we could round trip. Okay, go for it. Next question. We've got a lot of CycloneDX people on here, too. Don't worry. We tried. It failed. So to summarize for the people online: is there a chance we'll be getting the two standards to converge at some point in the future? We've been trying for two years. It's failed.
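[Editor's illustration, not part of the talk.] The claim that the two formats are roughly equivalent at the package level, and that SPDX 2.3 added fields to allow round tripping, can be made concrete with a small sketch. It assumes the JSON serializations of both formats and only a handful of fields; the function name, the type-to-purpose table, and the sample component are hypothetical, and this is not how the real conversion tools mentioned later in the session are implemented.

```python
# Minimal sketch: map one CycloneDX JSON component to an SPDX 2.3 JSON package.
# Only a small, illustrative subset of fields is handled.

# Assumed (partial) correspondence between CycloneDX component types and the
# primaryPackagePurpose values introduced in SPDX 2.3.
CDX_TYPE_TO_SPDX_PURPOSE = {
    "application": "APPLICATION",
    "framework": "FRAMEWORK",
    "library": "LIBRARY",
    "container": "CONTAINER",
    "operating-system": "OPERATING-SYSTEM",
    "firmware": "FIRMWARE",
}


def cdx_component_to_spdx_package(component, index):
    """Return an SPDX 2.3 style package dict for a CycloneDX component dict."""
    package = {
        "SPDXID": f"SPDXRef-Package-{index}",
        "name": component["name"],
        # The component carries no download URL here, so say so explicitly.
        "downloadLocation": "NOASSERTION",
    }
    if "version" in component:
        package["versionInfo"] = component["version"]
    purpose = CDX_TYPE_TO_SPDX_PURPOSE.get(component.get("type"))
    if purpose is not None:
        package["primaryPackagePurpose"] = purpose
    if "purl" in component:
        # A package URL travels as an external reference on the SPDX side.
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": component["purl"],
            }
        ]
    return package


# Hypothetical sample component, echoing the Log4j example from the panel.
example_component = {
    "type": "library",
    "name": "log4j-core",
    "version": "2.14.1",
    "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
}
example_package = cdx_component_to_spdx_package(example_component, 1)
```

Anything without an obvious counterpart (file-level and snippet-level data on the SPDX side, for instance) is exactly where a round trip can lose information, which is why a real converter needs far more care than this.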
Yeah, in the audience, Elio is pointing out KDE versus GNOME. We have a long history of having multiple solutions for things in there. Yeah. The market will decide. The market will decide. The market has got to decide. Yeah, it's still happening today. The market didn't say anything. We're still having it. Those of you who are old enough to remember videotapes and stuff like that: of the two standards, the better standard lost, because the other one was easier to adopt. So it'll be not good. So you're telling us we shouldn't be putting effort into becoming an international standard. Is that what you're telling me? And we should just go and hack and put something out there and go. It needs major adopters, somebody major to adopt it. Microsoft? Yeah, it's already majorly adopted. Thank you, Clay. Prevent the market driver to not adopt it. Right. We should not prevent the market driver from adopting it. Yeah. Okay, next question. Next question. How is that specification process working? Is it just a good number of questions? So, from the SPDX spec side, we have a variety of issues in GitHub. But realistically, the discussions happen on the mailing list and in the meetings. And at this point in time, the model is pretty close to done. I think the last issue was the entity issue, which we've talked about a little bit today. And so right now we are actively prototyping serializations of that model to make sure we haven't forgotten something. And so you will, I don't think he's here, but there are other people there. Go ahead. The answer was about SPDX version three, if that was the question. No, no, no. The question was general, about how do we do specifications? Right. So, yeah, as Kate mentioned, there is a mailing list. There are GitHub repos, and there are weekly phone calls. Yeah. Okay. Weekly online meetings where we discuss all these things.
The participation is open. We're welcome. Anyone is welcome? Yeah, please. We also have special interest working groups that are focusing on specific topics. We call them profiles. And so we have a build profile working group. We've got the defects working group that's working on the security profile. So Thomas is here. He leads that one, if you're interested in that topic. We've got an AI working group that's focusing on AI applications and models, and they're also doing the work on defining the data set profile. And we have the safety profile that Nicole is working on as well. So we have groups of people. And there's also licensing. Of course, there's also licensing. Yes. So what we're trying to do is make sure the spec becomes more modular. So if you don't care about licensing, or you just want to care about the components and relationships, that information is there. And you don't have to carry the other stuff with you. This was feedback we had gotten, and the community listened. And so we've gone through a very major effort of reshaping the spec to make this possible. Go. Interesting. What about IoT? Because that's where... IoT? Yes. Does IoT not need a thing, because it's got specific market... Small... Actually, no. IoT is already handled. It's been handled for a long time. This whole SPDX came out of the embedded space. And it's one of the most developed profiles. Yocto is basically building systems that work in that space. Zephyr is an IoT operating system, and it builds all these SBOMs today. Automatically. Which is how it should be. I'm going to make that bomb score a little chip set. Yep. Easy. There. Just use it. Okay. I've just not seen that. Well, no, like... I feel that coming all the way through. Josh has had this available now for a year and a half to two years. And it's a question of getting it out there and letting people know it's there and letting people turn the option on. This is where why...
For those on the call, my bad, it was a question of what about IoT? And realistically, I think IoT is actually one of our better spots. We also operate in the IoT space. Yes. In our area, our presentation today was about how to make a very reliable SBOM with all the metadata, information, provenance, and assurance that the source code matches the actual code that goes into the binary, and to have a provable and auditable process all the way through. So for us, it's just as in a piece of software. And get access to the software. Like I said, on some of the points from the last panel, we need reproducible builds. This is part of it. One of the things I really like about what's happening in the Yocto space is that all of their builds are reproducible already, and then they have the summary information from the SBOMs. And so, you know, if your kit has a Yocto build associated with it, you should be able to get everything you need. Well, to chime in, actually one of the worst headaches we have in this space is with the drivers and firmware and proprietary blobs that come from there. Therein lies the problem. So we are trying, and it's not only up to us, to start from those who are more effective in providing source code and all the information possible for having a full stack open source, not just until the solar pad stuff is there. Yeah, there's also work going on to reach out to groups like the CHIPS Alliance. And so when bandwidth starts to permit, once we get a little bit farther on getting 3.0 out the door, there are people there that are interested in actually looking at starting to summarize the silicon and, quite frankly, board information, because at the heart of it all, it's files. I think I can see lots of things going forward. Yeah. There's a lot of kit out there with unknown TCP stacks, et cetera. Okay, there's a lot of kit out there with unknown pedigree, to put it simply.
And everyone's worried about cyber vulnerabilities and hackers and stuff like that, so how are we as a community trying to help people address that weakness? Because I think we know how to go forward, but actually I've got a little sensor that may be in quite a difficult environment to change, but I don't know what it is and I don't know what my risk profile is and how my risk profile is changing. So how am I going to find that out when I've no idea what's in that chip? So no easy answer, but this is the reason we have a class called Analyzed, where you're basically trying to work your way through binaries and images to understand what might be there, and you're mining it. There are a couple of tools out there already. I think there are more that will probably show up. Binary Analysis Next Generation, and I think Livermore Labs has another tool that they're working on. But this is a general question about adoption, right? We're all talking about SBOMs. The people in this room have heard lots about SBOMs and they understand them. But again, in order to gain wide adoption, it's always the chicken and egg problem that was mentioned before, right? So we should start asking for SBOMs and producing SBOMs, and people will get used to seeing SBOMs. And, as was mentioned as an example, maybe GitHub can be persuaded to make SBOMs easy, or whatever. All these things obviously take time and we all have to work on that. Sorry, go ahead. Do any chip providers provide SBOMs for hardware blocks that have software counterparts in firmware yet? So the question, for those online, is, do any hardware providers provide SBOMs for their firmware blocks that... Sorry, say it again. For the hardware blocks that have software counterparts in firmware? For the hardware blocks that have software counterparts in firmware. I think you might see some of it coming out of some of the open stuff from ARM and some of the stuff from RISC-V when they're having add-on units.
Some of those may start to become visible. But I do not know specifics. Do you know specifics, Alexios? Okay, I've just seen discussions in those areas. One of the extensions that we're looking at for SPDX in the future is to extend towards hardware, so we can more easily, in the same format or whatever you want to call it, capture information about hardware. And the issue that you point out is in between: hardware blocks and software blocks, and hardware blocks containing software and firmware and all this stuff. I'll also say that my view is, firmware is just another type of software. The comment was made about medical devices. Just pointing out that, with New York Presbyterian and the SBOM stuff, the medical device area is making more progress. Daggerboard is a project. Go ahead, Thomas. Okay, I'll start and then let others chime in. Governments have an influence through the regulatory authorities; the regulatory authorities are part of government. With the plethora of information that's in software, I'm really happy that the FDA is expecting to have SBOMs. And it didn't get yanked out of the legislation. I think for anything that is critical infrastructure, we should be expecting that SBOMs can be produced for the people who care about them. And we'll consume them and look at them and do the analysis. Because if we want the world to be a safer place, we need to get rid of the opacity here. And we actually have to get the transparency in. That's my view. So I think some of the things I'm seeing in Europe, and some of the things I'm seeing in other countries like Japan and so forth, are starting to expect that level of transparency. And I think it's helpful. I think in that space there's a lot of good to be done and a lot of damage that can be done. Because speaking of the CRA, the request for having a full SBOM or full traceability of software, so that you know what component is there and if there are vulnerabilities and stuff, is good.
But at the same time they put burdens and simply disregard whatever the requirements of an open source project are, and they can destroy and poison the world that they try to... So it's a double-edged sword. And I think it's another sort of thing about fitness for purpose, for software. You know, I've heard things like the Sale of Goods Act: basically, does it do what you expect it to do? And certainly for the large organizations, where you're paid more than $10 to create a system, basically you expect the system to work most of the time and pretty reliably. And actually then to say, okay, but I've built it on Microsoft Windows, just to choose a third party proprietary software. That's got bugs, yes? But it's about getting an understanding, working together with government, because a lot of these big contracts that have the problems are government contracts, for defence or infrastructure. It's basically for people to work together, rather than have a big stick: how can we work together to make industry better, and can customers and end users understand what they have to do, as well as what we as providers have to do? And I think it's getting that balance right. I'd like to ask, do you know about an existing runtime SBOM generation working group? Because I couldn't find anything in that area. I only found two tools that are Kubernetes specific, but no other information. The question was, I've heard the question this time, excellent. So I think we're starting to see that. There's one tool that sort of advertises itself in that way that I'm aware of. I think this is an area where we'll see more things starting to emerge in this next year, on the runtime and monitoring of systems with SBOMs. What? There's jbom, isn't there? Yeah, but there are more ecosystems than Java. I think, on your comment about Kubernetes, people are going to focus on where they need it. There's more than one tool, yeah.
Okay. Any other? Do you want to start that generation? Yeah. Okay. Good job. And who are you? Nick. Go for it. Maybe a question back to the CycloneDX topic. Oh, dear. Something positive. Something positive about the CycloneDX and SPDX formats. I was wondering, rather than waiting for the market to destroy one or the other, which I don't like too much, wouldn't proper tooling for conversion and comparison help? I was wondering what the state of this is. Okay. So there are two tools out there today, and all help is welcome to harden them up. In the SPDX repo, there's a CDX to SPDX tool, and in the CycloneDX one, there's an SPDX to CycloneDX tool. Both of these tools are there. We've been working on it, and the reason we put out the 2.3 release of SPDX is so we could have some of the fields to help round trip between the two formats. So SPDX put out a whole release to try to be compatible. Okay. And we'll keep the fields, and, like I say, we want to make sure we're compatible, at least on the minimum subset. Hopefully more. And I think the other important piece to recognize about that is that the communities actually got together to inspect the specs and see what kind of data loss happened from one way to the other, and we had this big table in Austin, I think. Yeah. Yeah, we were... Waiting to do it. Yeah, the engineers talking to each other, completely civilized and nicely, and well, some progress is getting done there. Yeah. Yeah. No, it is, like I say, round tripping, and we're going to have an ecosystem right now where we have multiple formats, and so ingestion and transformation are going to be necessary. I think that's just going to be our reality. Any other questions? Well, oh, one more. Okay. We can't double open or Double Open. Yeah, a question about Double Open. We can say some words about that. Okay, Thomas. Thomas, why don't you just get up here? Yeah. That way I don't have to try to get that all the way to you.
The thing. Double Open, go for it. Double Open, for people who don't know Double Open. Double Open is managed by a Finnish law firm that basically came together. So basically in Germany, pre-COVID, a long time ago, a lot of the things that are now being discussed were discussed in a smaller group inside of Germany, way at the beginning, because we stumbled upon each other. And along the way, basically, we stumbled upon some Finns that happened to be, I don't know how, but lawyers. Lawyers. And so basically they had similar use cases, where they were like, oh, hang on, we want to do Yocto stuff, but the Yocto tooling is actually not there. And what they luckily did is they started collaborating with us over on the ORT side, and also, well, it's not only ORT, we had for use there and all that stuff. We were like, hey, guys, don't reinvent the wheel. That's what a lot of other companies do. So what they actually did is they said, okay, we do the Yocto stuff, which none of the people on the German side knew. Well, they knew Yocto, but we didn't have enough resources to actually build tooling for it. So they started working on basically saying, okay, we do the Yocto stuff for a pilot with one of their clients. And then basically they bolted it on to ORT for the rest of the license compliance, where they do the notice generation, the policies, and all the other stuff. So yeah, Double Open is... I haven't spoken to them, with COVID we kind of lost touch, but I know they're still active and they're still doing things. Does it talk about, if they started to merge these things into the main Yocto? Technically people say that yes, but what I saw is not exactly the result of the work, but a mixture of things that they cannot guarantee, and they're still using the separate scanners and other parts. Yeah. So they're basically, again, like ORT and the other stuff: it's users that have problems.
And instead of basically just inventing their own wheel, they all came together with other users. So, as I said, there's a massive group in Germany that has been working together, and then the Finns got involved, and now we have people from the Netherlands involved, and now we have people from France. So it's spreading out all over, because most of it is funded from, I mean, I'm mostly in open source program offices, and open source program offices are usually too small, and they're regularly running into, well, funnily enough, I have a talk on Wednesday at State of Open about this, which is basically, hey, how can you do things at speed, at scale, while still staying compliant and still keeping your developers somewhat happy? And this is really where we, again, people are really scared, I currently keep track of 142 compliance and SBOM solutions all around the world. Whether in the US, China, Japan, I keep track of all of them. It's such a big mess. And then we, as users, said, well, we'll build it ourselves, because there are still a lot of tools. So yeah, if you are building an SBOM tool, please reach out to Kate, Alex, and we will probably redirect you to where other people with similar problems are already building stuff. And if you have internal SBOM tooling and you need help open sourcing that, that's kind of what I do for work. So open sourcing internal projects, I'm very good at that. I have even lobbied: I went to visit several companies and did presentations to executives on why it was a good thing to open source their SBOM tooling. And this is why we actually now have this tooling, because basically we convinced people to open source it. The more you open source it, the more we learn. So actually, funnily enough, most people don't know it, but the ORT people, we read the CycloneDX code, and the CycloneDX folks, they read the ORT code.
And all of the SBOM people and compliance people, they actually meet. Don't they read the SPDX code? I don't need to read it because I know where you know it. Whenever a new tool comes out, we read the source code of the other tool to figure out if there's anything useful in there, or we point out, hey, you haven't thought about this. Because, as I said, package analysis is insanely complex, and the difference between SBOMs comes from experience. And I have been doing this for seven plus years with the club in Germany, and we have produced probably more than a million SBOMs with all of the companies. We're talking about a Bosch. That's like half a million employees and God knows how many projects. Okay, well, I think we're just about done. Okay. Oh, time's up? Okay. Well, I just want to say thank you, everyone, for being here today. And thank you for the last... First ever SBOM devroom, you've been part of it. And thank you for staying to the end and for the last questions. Much appreciated. APPLAUSE |
SBOM devroom closing |
Okay. Well, I think we're just about done. Okay. Oh, time's up? Okay. Well, I just want to say thank you, everyone, for being here today. And thank you for the last... First ever SBOM devroom. You've been part of it. And thank you for staying to the end for the last questions. Much appreciated. |
Lessons learnt managing and scaling 200TB glusterfs cluster @PhonePe |
Please welcome Sanju and Pranith, and enjoy. Thank you guys, thank you. Good morning, I am Sanju and he is Pranith, and we work at PhonePe. Today we are going to discuss the lessons that we learnt while managing a GlusterFS cluster at scale, some of the problems we faced, and the solutions that we came up with. PhonePe is the leading Indian digital payments and technology company, headquartered in Bangalore, India, and it uses the Unified Payments Interface, which was introduced by the Government of India. So in India, if you are thinking of any payment, you can do it using the PhonePe app; this is how our PhonePe app home screen looks. We see 800k RPS on our edge layer every day and we do 130 million daily transactions. This generates lots of records and documents that we have to store, and as per the regulations in India we have to store all of them in India only. So PhonePe has a private cloud where we store all these things, and we need a service to store and retrieve files from the cloud. We have developed a service called Docstore, which writes data to GlusterFS and fetches data from GlusterFS. Coming to the question of why we chose GlusterFS: we have lots of small files, and we did not want a metadata server storing all that metadata. GlusterFS has no metadata server, so we went ahead with it. Our team also had earlier success with the GlusterFS project, so they were confident that GlusterFS would work for our use case. So here we are. This is the data flow to and from GlusterFS. All the traffic is fronted by a CDN, the request is forwarded to nginx, and nginx sends the request to the API gateway. The API gateway can choose to store or retrieve a file, or it can choose to send the request to any back-end service. Now if the
back-end service wants to store or retrieve a file (it can be a POST or a GET request), it sends the request to Docstore. Docstore then stores the data on, or retrieves it from, the GlusterFS servers. Docstore also uses Elasticsearch to store some of the metadata, Aerospike to store the auth-related info and for some of the rate-limiting features, and RMQ for asynchronous jobs like deletions and batch operations. And this is our team. Today's agenda is an introduction to GlusterFS, then we will discuss the different problems that we have faced and the solutions that we are using, and we have some proposals as a roadmap. What is GlusterFS? GlusterFS is a distributed file system, which means that whenever you do a write, the data is distributed across multiple servers. These servers have directories that we call bricks, and this is where the data is actually stored. This is a typical GlusterFS server. Each server can have multiple bricks, each brick has an underlying file system where the data is stored, and in the root partition we store the GlusterFS configuration. Go ahead. This is how a 3 by 3 GlusterFS volume looks. First, about the mount point: we can mount a GlusterFS volume on any machine over the network, and you can read and write from that machine. Now when a write comes from the client where the volume is mounted, it is distributed across 3 sub volumes based on the hash range allocation; that is the first 3. We will talk more about the hash range in the coming slides. The other 3 means the data is replicated 3 times. So whenever a write comes, it chooses one of the sub volumes, and a sub volume is a replica set, here of size 3, so the data is replicated thrice. Over to Pranith. Hello. So let us look at some numbers that we see at PhonePe for the Docstore service and then for GlusterFS. In a day we see about 4.3 million uploads and 9 million downloads, with a peak upload RPS of 200 and a peak download RPS of 800. The aggregate upload size per day is just 150 GB, not a lot, but the download size is 2.5 TB, so it is a completely read-heavy workload. And this is after a CDN is fronting it: only when the file is not available in the CDN does the call come to GlusterFS, which downloads the file onto the CDN, and then it is served. This is how the RPS, that is requests per second, is distributed throughout the day. The uploads are reasonably uniform from 6 am to 5 in the evening and then taper off for the rest of the day, whereas the downloads have a bimodal distribution, with one peak at around 12 pm and another at around 7 pm. The latencies are a function of the size of the file. For POST (upload) requests the mean latency is about 50 ms and the p99 is around 250 ms; similarly, for GETs the mean is around 10 ms and the p99 is around 100 ms. Let us look at the configuration that we use at PhonePe for GlusterFS. We have 30 nodes in the cluster, each node contributes 2 bricks, and one brick corresponds to 10 TB and is a ZFS pool. So 30 times 20 TB is 600 TB of raw capacity, and since we use replica 3, the usable size is 200 TB, out of which 130 TB is in
use at the moment. Let us now go to the problems that we faced and how we solved them. I will start off with the capacity expansion problem, then Sanju will take over and talk about the data migration problem. After that I will talk about how performance issues are debugged and how we solved problems using that method, and then Sanju will finish off with the maintenance activities that we do to prevent problems. Before we talk about the capacity expansion problem, let us try to understand a bit about the distribution. The data is distributed across the servers based on hashes. In this diagram we have 3 distribute sub volumes, and each sub volume is a replica 3. When you create a directory, the directory in each of these 3 replica sets gets a hash range. Whenever you create a file or try to read a file, GlusterFS computes the hash of the file name, figures out which of the directories in these 3 sub volumes owns that hash range, and tries to store the file on, or get it from, that sub volume. For folks who are well versed in databases, this is more like sharding, but the entity that is getting sharded here is the directory, based on the file names. All right. Now, the files can have varying sizes. For example, in our setup the minimum size is less than a KB but the maximum size is about 26 GB. So you will run into the problem where some of the shards, or distribute sub volumes, fill up before the others, and you need to handle that as well. There is a feature in GlusterFS called min-free-disk: once a sub volume hits that level, when you create a directory again, no hash range is allocated to the sub volumes that met the threshold. For example here, even though there are 3 distribute sub volumes, data is going to only 2 of them, because the middle one has met the threshold. So the hash range is distributed only between those 2, 50% and 50%, instead of the one third that you would
expect normally. So let's talk about the actual process of increasing the capacity and why it didn't work for us. When you want to increase the capacity, that is, bring in more distribute sub volumes or shards, the way you do it is this: first you do something called gluster peer probe, which brings the new machines into the cluster; then you do another operation called add-brick, which adds the bricks to your volume; then you have to do something called gluster volume rebalance to redistribute the data equally among the nodes. So what are the problems that we faced? When we did the benchmark, the rebalance had an application latency impact of up to 25 seconds in some cases, and as I mentioned, most of the p99 latencies were just in milliseconds. So this would be like a partial outage for us because of timeouts, and that is not going to work for us. The other thing we noticed is that for large volumes the rebalance may take up to months, and at the moment GlusterFS rebalance does not have pause and resume, so we can't restrict the maintenance activity to off-peak hours. That is one more problem. The other one we have seen concerns how much data is migrated. When you go from one distribute sub volume or shard to two, you would expect 50 percent of the data to be transferred, and that's all right. But when you go from 9 distribute sub volumes to 10, you want to migrate only about 10 percent of the data, yet GlusterFS still transfers about 30 to 40 percent, irrespective of the number of sub volumes. So with our workload the rebalance itself may take so much time that by the time we want to do the next capacity expansion, the rebalance may not even have completed. So that is also not going to work for us. These are the three main problems that we have seen. So this is the solution that we are using now, and then there is a proposal as well. Since we know that the hash range allocation
is based on both the number of sub volumes and the number of sub volumes that have free space, what we do is this: in our Docstore application, every night we create new directories. The directory structure is something like the namespace that the clients are going to use, slash year, slash month, slash day. So each day you create new directories, and based on the space that is available, only the sub volumes that have space get a hash range allocation. So you never run into the problem where you have to rebalance that much. We have seen that with our workload reads are distributed uniformly, and as we have seen, it is a read-heavy workload and writes are just a few, so we were okay with this solution in the interim. Long term, the solution that we have proposed, and this is something that is yet to be accepted, although we did a POC for it, is to use jump consistent hash instead of the hashing that we have now. With it, when you go from 9 to 10 sub volumes, only about 10 percent of the data gets rebalanced. So that is what we want to get to, and it is something that we are focusing on this year. All right, over to you, Sanju. So let us look at the problems that we faced while migrating data. We had a use case where we wanted to move the complete data present on one server to another server. In GlusterFS the standard way of doing this is to use the replace-brick operation. When you do a replace-brick operation, there is a process called the self-heal daemon which copies all the data present on the old server to the new server. To copy 10 TB of data it takes around 2 to 3 weeks, which is a huge amount of time. We wanted to reduce this time, so we came up with a new approach. Let us understand a few aspects of GlusterFS before we jump to the solution, so that we understand our approach better. The write flow in GlusterFS is something like this: whenever a write
comes, based on the hash range allocation that Pranith just spoke about, it chooses one of the sub volumes, and the data goes to all the servers in that sub volume. Let us say we have chosen replica set 0; the write goes to all the machines in that sub volume. It is client-side replication, so the client sends the write to all the machines and waits for the success responses to come. The client assumes the write is successful only when a quorum number of success responses has come. Now let us say one of the nodes is down, in our case server 2; either the node is down or the brick process is unhealthy, since it can be unresponsive at times. So the write came to one of the sub volumes and went to all three replica servers, but server 2 did not respond with a success response. Server 1 and server 3 have responded with success, so the client assumes that the write is successful. Now when server 2 is back up, to keep the data consistent, server 2 should get the data that it missed while it was down. Who takes care of this job? It is the SHD. The SHD, the self-heal daemon, is a daemon process which reads the pending heal data; whatever data was missed, we call it a pending heal. It reads from one of the good copies, in our case server 1 and server 3 are the good copies and server 2 is the bad copy, and it writes to server 2. So server 2 will have all the data once the self-heal is completed. We use this as part of our approach as well. Our approach is: we kill the brick that we want to migrate. Say we want to migrate from server 3 to server 4, so we have to copy all the data, and self-heal takes 2 to 3 weeks for that. In our case we kill the brick, and since we are using the ZFS file system, we take a ZFS snapshot and transfer this snapshot from server 3 to
server 4, that is, from the old server to the new server, and then we perform the replace-brick operation. When we perform the replace-brick operation, server 4, the new server, already has all the data that server 3 had. Once the replace-brick operation is performed, server 4 is part of the sub volume and the heals take place from server 1 and server 2 to server 4. So we have reduced the amount of data that we are healing. Previously we were copying all the data, that is 10 TB, from server 3 to server 4, but here we heal only the data that came in after killing the brick and before doing the replace-brick operation. So the data we heal is reduced hugely. With this approach it now takes only about 50 hours to complete on spinning disks: 48 hours to transfer the 10 TB snapshot and 2 hours to heal the data. It is only 8 to 9 hours if we are using SSDs: about 8 hours to transfer the snapshot and around 40 minutes to complete the heals. So we came from 2 to 3 weeks down to 1 or 2 days, or 9 hours, we can say. We use the netcat utility for the transfer; it gave us very good performance, about a 60% improvement. We also compute an in-flight checksum at both ends, on the old server and on the new server, to check that we are transferring the snapshot correctly and are not losing any data. I have kept the exact commands that we used in this link. We also have a rollback plan. Let us say that we have started this activity but have not performed the replace-brick yet, because once the replace-brick is performed, the sub volume already has server 4 as a part of it. Before we perform the replace-brick, that is, when we are at this stage, if we do not want to do this anymore, all we need to
do is start the volume with force, so that the brick process that we killed comes up. Once it is up, the SHD copies the data from the good copies to the bad copy, that is, the old server, so that we have consistent data across all of our replica servers. It is that easy, and we want to popularize this method so that it helps the community. Over to Pranith. So we will now talk about the performance issues that we faced and how we solved them. This is the graph that we saw in our prod setup while doing this migration, when something happened that we did not account for. The latencies shot up to 1 minute here, and I have said that they are supposed to be only milliseconds, so this is horrible. There were about 2 hours of partial outage because of this. So let us see how these things can be debugged and how they can be fixed. We have a method called gluster volume profile in GlusterFS. What you do is start profiling on the volume, then run your benchmark or whatever your workload is, and then keep executing gluster volume profile info incremental, and it will keep giving you the stats of what is happening on the volume during that time. For each of the bricks in the volume you get an output like this, where for that interval, in this case interval 9, for each block size you see the number of reads and writes that came in, and for all of the internal file operations that you see on the volume you get the number of calls, the latency distribution (min, max, and average latency), and the percentage of latency that is taken by each file operation.
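The profiling loop described above can be sketched in Python as a thin wrapper around the CLI. This is only an illustration under assumptions: the volume name, the sample count, the interval, and the helper names `profile_command` and `sample_profile` are hypothetical, and the output is returned as raw text because its exact layout is not reproduced here.

```python
import subprocess
import time
from typing import Callable, List


def profile_command(volume: str, action: str) -> List[str]:
    """Build a `gluster volume profile` command line for one action."""
    actions = {
        "start": ["start"],                      # begin collecting stats
        "incremental": ["info", "incremental"],  # stats since the last call
        "stop": ["stop"],                        # stop collecting stats
    }
    return ["gluster", "volume", "profile", volume, *actions[action]]


def sample_profile(volume: str, samples: int, interval_s: float,
                   run: Callable = subprocess.run) -> List[str]:
    """Start profiling, collect `samples` incremental reports, then stop.

    `run` is injectable so the loop can be exercised without a cluster.
    """
    outputs: List[str] = []
    run(profile_command(volume, "start"),
        check=True, capture_output=True, text=True)
    try:
        for _ in range(samples):
            time.sleep(interval_s)  # let the workload run for one interval
            result = run(profile_command(volume, "incremental"),
                         check=True, capture_output=True, text=True)
            outputs.append(result.stdout)
    finally:
        # Always stop profiling, even if a sample fails.
        run(profile_command(volume, "stop"),
            check=True, capture_output=True, text=True)
    return outputs
```

On a node with the gluster CLI installed, something like `sample_profile("my-volume", samples=6, interval_s=10)` would return one text report per interval while the benchmark runs alongside it.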
So, what we saw when this ZFS issue happened is that the lookup call was taking more than a second, which is not what we generally see. So we knew something was happening during the lookup operation. We did an strace on the brick and found that for one internal directory, .glusterfs/indices/xattrop, listing three entries was taking 0.35 seconds. So imagine this: you do ls, it shows you just three entries, but it takes 0.35 seconds, and sometimes it even takes a second. After looking into this we found that ZFS has this behavior where if you create a lot of files in one directory, like millions, then delete most of them, and then do ls, it takes up to a second. This bug has been open for more than two years, I think, so we did not know whether ZFS would fix this issue anytime soon. So in GlusterFS we patched it by caching this information, so that we do not have to keep doing this operation. You will not see it if you are using any of the latest GlusterFS releases. So this is one issue that we found and fixed.
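A quick way to check a directory for the symptom described above, a tiny listing that still takes a long time, is to time the listing directly. The following is a minimal sketch and not the tooling the speakers used (they used strace); the helper name `time_listing` is hypothetical, and the temporary directory in the usage lines merely stands in for a brick's internal directory.

```python
import os
import tempfile
import time
from typing import Tuple


def time_listing(path: str) -> Tuple[int, float]:
    """Return (number of entries, seconds taken) for listing `path`."""
    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.perf_counter() - start


# Usage on a throwaway directory holding three files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a", "b", "c"):
        open(os.path.join(tmp, name), "w").close()
    count, seconds = time_listing(tmp)
    print(f"{count} entries listed in {seconds * 1000:.3f} ms")
```

A handful of entries taking hundreds of milliseconds to list would point at the file system rather than at the application.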
The second one is about increasing the RPS that we get on our volume. There was a new application getting launched at the time, and the RPS that they wanted was not what we were giving. They wanted something like 300, 360 RPS, but when we did the benchmark we were getting only about 250 RPS. We wanted to figure out what was happening, so we ran benchmarks on the prod Gluster itself and saw that one of the threads was getting saturated. There is a feature in GlusterFS called client-io-threads, where multiple threads take the responsibility of sending data over the network. We thought, let us just enable it and it will solve all our problems. We enabled it and it made things worse: from 250 the RPS went down. So we realized that there is a contention problem on the client side that we are yet to fix. For now, what we did is this: on the Docstore containers, where we were doing only one mount, we are now doing three mounts and distributing the uploads and downloads over them. Yes? Can you repeat the... oh yeah. It is a FUSE mount. So the question is which GlusterFS client we are using; the answer is the FUSE client, and the thread that is saturating is the FUSE thread. So we have created multiple mounts on the container and we distribute the load in the application itself: the uploads go to all three mounts and the downloads go to all three as well. That is one thing we did to solve the CPU saturation problem. The other thing we noticed comes from the part of the gluster volume profile output that tells you, for each block size, the number of reads and writes: we saw that most of the writes were coming in as 8 KB. Later, when we looked at the Java application, Docstore, we saw that the default I/O block size that Java uses is 8 KB, so we just increased it to 128 KB. These two changes combined have given us 2X to
3X the number, and we also increased the number of VMs that we use to mount the client. So all put together we got something like a 10X performance improvement compared to earlier, so we are set for some time. All right. So let us now go on to health checks. For any production cluster some health checks are needed, so I will talk about the minimal health checks that are needed for a GlusterFS cluster. GlusterFS already provides POSIX health checks: there is a health checker thread which does a write of 1 KB every 15 or 30 seconds. There is an option to set the time interval at which you want to do this; if you set it to 0, you are disabling the health check, or you can set it to something like 10 seconds. It sends a write and checks whether the disk is responsive enough and whether the brick is healthy. If it does not get a response within a particular time, it kills the brick process, so that we get to know that something is wrong with the brick process. For the rest of the checks we have a script and some config that we keep externally; the POSIX health checks are the ones that come with the GlusterFS project. For the cluster health checks, we have a config where we specify the expected number of nodes in the cluster. Using the gluster peer status or gluster pool list command we can get the number of nodes that are present in the cluster, and we check whether both are equal. If not, we raise an alert saying something unexpected is happening. We also check whether each node is in the connected state. In a GlusterFS cluster the nodes can be in different states, such as connected, rejected, or disconnected, based on how the GlusterFS management daemon is working. So the expected is all
the nodes should be in the connected state. We check whether the nodes are connected, and if any are not, we get an alert saying that one of your nodes is not in the connected state. We have some health checks for the bricks as well. We keep the number of bricks that are present in each volume in the config, and in the gluster volume info output you get the number of bricks that are present in that volume, and we check whether they are equal. Another check we have on the bricks: if a brick is not online, we get to know it by checking the gluster volume status command, and if it is not online you get an alert saying that one of your bricks is down. Whenever a server or a brick is down there will be some pending heals, and you can check the pending heals using the gluster volume heal info command. If there are any pending heals you will see entries, so if the entry count is non-zero you get an alert saying that you have pending heals in your cluster. That means something unexpected or unwanted is going on; it can be a brick down or a node down, anything. We also always log profile info incremental to our debug logs using the health check. Pranith just spoke about some of the issues that we could solve by looking at the profile info output, so in such cases this output is helpful, and we always log it to our log backup servers. The exact commands that we are using are listed in this link. Next, we have some maintenance activities. Things can go bad sometimes. We have a replica 3 setup in our production, so at any point in time a quorum number of brick processes should be up so that the reads and writes can go on smoothly. So whenever we are doing something which might cause some downtime of the brick process, or which can put load on a particular server, we do it on only one server from each replica
set, so that even if that server goes down, or the brick process running on that server goes down, we won't have an issue, because there are two other replica servers which can serve all the reads and writes. We do a few activities in this way. One is ZFS scrubbing; ZFS scrubbing is about verifying the checksums of the data, to see whether the data is in proper condition. We do migrations in this way too: we do them on one server from each replica set, so that even if it is down for some time or something doesn't work out, we are in a good place. Upgrades we also do in the same manner. We have made some contributions. The data migration approach that I spoke about is production ready; we have used it in our production. Pranith has given some developer sessions which cover many internals of GlusterFS; they are very useful for any GlusterFS developers who want to learn about the many translators that we have in GlusterFS. Recently we also fixed a single point of failure that was present in the geo-replication feature; it was merged upstream very recently, last week. And this year we are looking at another thing, the hashing strategy that Pranith has proposed; once it is accepted by the community we will take it up and develop it. That's all we had, folks. Thank you.
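To see why jump consistent hash is attractive for the capacity expansion case discussed earlier in the talk, here is a minimal sketch of the published algorithm (Lamping and Veach) in Python. It is not GlusterFS code and not the speakers' POC; the file names and the way a name is turned into a 64-bit key (`name_key`) are assumptions made for the illustration.

```python
import hashlib
from typing import Iterable


def jump_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step from the original paper.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def name_key(name: str) -> int:
    """Derive a 64-bit key from a file name (assumed scheme, for the demo)."""
    digest = hashlib.blake2b(name.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def moved_fraction(names: Iterable[str], old: int, new: int) -> float:
    """Fraction of names whose bucket changes when growing old -> new."""
    names = list(names)
    moved = sum(
        1 for n in names
        if jump_hash(name_key(n), old) != jump_hash(name_key(n), new)
    )
    return moved / len(names)


names = [f"doc-{i}.pdf" for i in range(20000)]
print(f"moved when going from 9 to 10 buckets: {moved_fraction(names, 9, 10):.1%}")
```

With this scheme a key either stays where it was or moves to the newly added bucket, so growing from 9 to 10 buckets relocates roughly one tenth of the keys, which is the behaviour the speakers say they want from rebalance.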
Just want to let you guys know about the production-ready part: we actually migrated 375 TB in total using the method that Sanju talked about, so it is ready and you guys can use it. I think it should work even with Btrfs; basically it should work with any file system that has a snapshot feature. Thank you guys. I think we have a few minutes for questions if you have any; otherwise you guys can catch us there. Yeah, so the question is how do you handle a disk failure. Basically, the problem that I showed you, the ZFS issue where we were seeing minutes of latency, that was the first time this happened on production for us. Initially we were waiting for the machine itself to be fixed so that it would come back again. That went on for a week or so, and the amount of data that needed to be healed became so much that the healing coincided with our peak hours. So the standard operating procedure that we have come up with after this issue is: if a machine or a disk goes down, we can get a replacement online in 9 hours, so why wait? We just consider that node dead, get a new machine, do whatever Sanju mentioned using ZFS snapshot migration, and bring it up. So the question is, do you have the ZFS backup somewhere? The answer is no; you have the ZFS data on the active bricks, so you take a snapshot on the active bricks and do the snapshot transfer. Yeah, from one of the good ones, yes. Any other questions? I think that's it. Thank you guys, thanks a lot. Thank you. Thank you. Thank you. |
vhost-user-blk: a fast userspace block I/O interface |
Hi, my name is Stefan Hajnoczi and I work on QEMU and Linux, and today I want to talk about vhost-user-blk, a fast user space block I/O interface. So what is vhost-user-blk? vhost-user-blk allows an application to connect to a software defined storage system that is running on the same node. In software defined storage, or in storage in general, there are three popular storage models: block storage, file storage, and object storage. vhost-user-blk is about block storage, so for the rest of this presentation we're going to be talking about block storage. Block storage interfaces have a common set of functionality. First of all, there's the core I/O: reads, writes, and flushes. These are the common commands that are used to store and retrieve data from the block device. Then there are data management commands, which are used for mapping and allocation of blocks; discard and write zeroes are examples of these kinds of commands. There are also auxiliary commands, like getting the capacity of the device. And finally, there can be extensions to the model, like zoned storage, that go beyond the traditional block device model. vhost-user-blk supports all of these things, and it's at a similar level of abstraction to NVMe or SCSI. So let's start by looking at how vhost-user-blk is a little bit different from things like NVMe or SCSI, which are network protocols or hardware storage interfaces. vhost-user-blk is a software user space interface. So let's begin by imagining we have a software defined storage system that is running in user space and wants to expose storage to applications. If we're using the kernel storage stack, we'll need some way to connect our software defined storage to the kernel and present a block device. Ways of doing that might be NVMe over TCP, an iSCSI LUN, an NBD server, and so on.
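To make the common block interface listed above more concrete (core I/O, data management commands, and auxiliary commands), here is a toy in-memory device in Python. All class and method names are hypothetical and chosen only for this illustration; this is not the vhost-user-blk API, and zoned storage extensions are left out.

```python
class MemoryBlockDevice:
    """Toy in-memory block device (hypothetical names, illustration only)."""

    def __init__(self, num_blocks: int, block_size: int = 512) -> None:
        self.block_size = block_size
        self._data = bytearray(num_blocks * block_size)

    def _check(self, offset: int, length: int) -> None:
        # Block devices work in whole blocks, so enforce alignment and range.
        if offset % self.block_size or length % self.block_size:
            raise ValueError("offset and length must be block-aligned")
        if offset < 0 or offset + length > len(self._data):
            raise ValueError("request is out of range")

    # Core I/O: read, write, flush.
    def read(self, offset: int, length: int) -> bytes:
        self._check(offset, length)
        return bytes(self._data[offset:offset + length])

    def write(self, offset: int, data: bytes) -> None:
        self._check(offset, len(data))
        self._data[offset:offset + len(data)] = data

    def flush(self) -> None:
        # Nothing to persist for memory; a real device would make prior
        # writes durable here.
        pass

    # Data management: write zeroes and discard.
    def write_zeroes(self, offset: int, length: int) -> None:
        self._check(offset, length)
        self._data[offset:offset + length] = bytes(length)

    def discard(self, offset: int, length: int) -> None:
        # A real device may simply deallocate the range; this toy zeroes it.
        self.write_zeroes(offset, length)

    # Auxiliary: report capacity in bytes.
    def capacity(self) -> int:
        return len(self._data)
```

For example, `MemoryBlockDevice(num_blocks=4).capacity()` is 2048 bytes with the default 512-byte block size, and unaligned requests are rejected with `ValueError`.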
And so that's how a software defined storage system might expose its storage to the kernel. When our application opens a block device, it gets a file descriptor, and then it can read or write on that file descriptor using system calls. What happens is that execution goes into the kernel's file system and block layers, and they then talk to the software defined storage system. Now that can be somewhat convoluted, because if we've attached using, say, NVMe over TCP, the network stack might be involved and so on. And at the end of the day, all we're trying to do is communicate between our application and the software defined storage process, which are both on the same node, both running on the same operating system. User space storage interfaces leave out this kernel storage stack and instead allow the application to talk directly to the software defined storage process. Now there are a number of pros and cons to using a user space interface, and I'll go through them here. I've already alluded to the fact that if you have a user space interface and you don't go through the kernel storage stack, you can bypass some of that long path that we discussed, for example going down into the kernel and coming back out using something like NBD or iSCSI in order to connect to another process on the same node. There must be a faster way of doing that, right? With vhost-user-blk, it turns out we can actually get rid of system calls entirely from the data path, so reads, writes, and so on to the device don't require any system calls at all. We'll have a look at how that's possible later on in this talk. But speed is one of the reasons why a pure user space interface for block I/O is an interesting thing. Another reason is security. Typically, in order to connect a block device to the kernel you need to have privileges, because it can be a security risk to connect untrusted storage to your kernel.
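The kernel path described above, where the application opens a device, gets a file descriptor, and issues system calls on it, looks roughly like the following Python sketch. A regular temporary file stands in for the block device so that the example runs without privileges; the helper name `write_then_read` is hypothetical, and `os.pread`/`os.pwrite` are available on Unix-like systems.

```python
import os
import tempfile


def write_then_read(path: str, offset: int, payload: bytes) -> bytes:
    """Write `payload` at `offset`, flush it, and read it back via an fd."""
    fd = os.open(path, os.O_RDWR)             # open(2): get a file descriptor
    try:
        os.pwrite(fd, payload, offset)        # pwrite(2): one system call
        os.fsync(fd)                          # fsync(2): the flush command
        return os.pread(fd, len(payload), offset)  # pread(2): another call
    finally:
        os.close(fd)


# Usage: a 4 KiB temporary file plays the role of the block device.
with tempfile.NamedTemporaryFile() as backing:
    backing.truncate(4096)
    data = write_then_read(backing.name, 512, b"hello block layer")
    print(data)
```

Every read, write, and flush here enters the kernel, which is the per-request cost that a pure user space data path avoids.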
And the reason for that is that there's a bunch of code in the storage stack that's going to run and that's going to process and be exposed to this untrusted data. If you think about a file system and all its metadata, that can be complex. So there's a security risk associated with that, and therefore privileges are required to create block devices. An ordinary unprivileged process cannot attach and mount a block device. So in a scenario where you do have an untrusted block device and you would like to remove the attack surface there, using a user space interface allows you to avoid that. Also, if you simply don't have permissions, then you won't be able to create a kernel block device, so a user space interface is beneficial there as well. Now those were the pros. Of course there are drawbacks to having a user space interface. First of all, it's complex. Compared to simply opening a file and reading and writing from the file descriptor, you're going to have to do a lot more, because all the logic for actually doing I/O and communicating is now the responsibility of the application and not the kernel. So there's that. In addition, if you think about existing programs that you might want to use to access your storage, they won't have support for any new interface that is user space only. They are probably using the POSIX system calls, read and write and so on, and that's what they expect. So you'll have to port those applications in order to access your software defined storage system if you rely on a user space interface. Another disadvantage is that if you have a user space interface, then the kernel storage stack isn't involved.
So if you decide you need a feature from the kernel storage stack, whatever that may be, or if you have a legacy application that you cannot port and that needs to talk to a kernel block device, then again you have a problem, because your software-defined storage system is isolated; its block devices aren't connected to the kernel. What we're going to do today is look at both these pros and cons, and we're also going to see how with vhost-user-blk we can actually overcome these cons. So let's start by looking at some of the performance aspects, how this can be fast. I said no system calls are required, so how does that even work if the software-defined storage system and the application need to communicate? How can they communicate without system calls? All right, so one of the important concepts in IO is how to wait for the completion of IO. When you submit an IO request, maybe you have no more work for your process to do. Maybe the CPU is essentially idle until that IO request completes, and at that point you'll be able to do more work. The normal thing to do in that case is to deschedule your application and let other threads, other tasks on the system, run. And if there are no other tasks, then the kernel will just put the CPU into power saving mode. It will put it into some kind of low power state, and it will wake up once the completion interrupt comes in. You can see that in the top diagram on this slide: there's a green part where we submit the IO, and at that point we run out of things to do because we're going to wait for completion. So then there's this gray part where other tasks are running or power saving is taking place, and the first portion of that time is spent with the IO actually in flight. That's where we're legitimately waiting for the IO request to complete so that we can proceed.
But then what happens is that the IO request completes and we need to somehow get back to our descheduled process. Now depending on what other tasks are running, their priorities, the scheduler and so on, our task might not get woken up immediately. Or maybe, if the CPU is in a low power state, it will just take some time to wake up, handle that interrupt, restore the user space process and resume execution. So this leads to a wake-up latency, an overhead that is added. And this is why notifications, also sometimes called interrupts, can be something that actually slows down your IO processing. An alternative is to use polling. Polling is an approach where, once you have no more work to do, instead of descheduling you repeatedly check whether the IO is complete yet. By doing that you're not giving up the CPU, so you keep running and you keep consuming the CPU. The advantage is that you don't have this wake-up latency; instead your process will respond immediately once the IO is complete. The drawback of course is that you're hogging the CPU and you're wasting power while there's nothing to do. So these are two techniques, and we're going to keep them in mind because we'll see how they come into play later. The next performance aspect I wanted to mention, which is kind of important to understanding how vhost-user-blk is different from maybe using a network protocol or an existing storage interface, is message passing versus zero copy. As programmers we learn that when we have a large object in our program we shouldn't pass it around by value, because it will be copied and that will be inefficient. Instead we use references or pointers, allowing the function that receives the object to just go and access it in place rather than taking copies. And in inter-process communication and in networking there are similar concepts. By default things are message passing.
We build a message, it gets copied through various buffers along the network path, eventually the receiver receives it into its buffer, and then it parses it. That model is the traditional networking model, and it's also the IPC model. It has strong isolation, so for security it's great, because it means that the sender and the receiver don't have access to each other's memory and therefore cannot interfere with or crash each other. But the downside is that we have these intermediate copies, and that consumes CPU cycles and is inefficient. The zero copy approach is one where the sender and receiver have somehow agreed on the memory buffer where the data to be transferred lives. That way the sender, for example, can simply place the data directly into the receiver's buffer, and all it then has to do is let the receiver know, hey, there's some data there for you; it doesn't actually have to copy the data. So this is another important concept that we're going to see with vhost-user-blk. So now that we've got those things out of the way, let's look at vhost-user-blk. What is it? One, it's a local block IO interface, so it only works on a single node, on a single machine. It is not a network protocol. Two, it's a user space interface; it's not a kernel solution in itself. It's a pure user space solution, which means it's unprivileged: it doesn't require any privileges for two processes to communicate in this way. It's also a zero copy solution, and the way it does that is by using shared memory. And finally, vhost-user-blk supports both notifications and polling. So depending on your performance requirements, you can choose whether you want to deschedule your process and receive a wake-up when it's time to process an IO completion, or you can just poll and consume CPU and have the lowest possible latency. And vhost-user-blk is available on Linux, BSD and macOS, and the implementations of this started around 2017.
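The two ways of waiting for a completion described above can be sketched in a few lines of Python. This is only an illustration of the trade-off, not vhost-user-blk code: the "device" is a hypothetical thread that finishes after a short delay, and the function names are made up for this sketch.

```python
import threading
import time


def fake_io(done: threading.Event, result: list, delay: float) -> None:
    """Hypothetical device: complete the request after `delay` seconds."""
    time.sleep(delay)
    result.append("data")
    done.set()  # signal completion


def wait_by_notification(delay: float = 0.01) -> str:
    done, result = threading.Event(), []
    worker = threading.Thread(target=fake_io, args=(done, result, delay))
    worker.start()
    done.wait()  # the waiter sleeps until it is woken up (wake-up latency applies)
    worker.join()
    return result[0]


def wait_by_polling(delay: float = 0.01) -> str:
    done, result = threading.Event(), []
    worker = threading.Thread(target=fake_io, args=(done, result, delay))
    worker.start()
    while not done.is_set():  # busy loop: keeps the CPU, reacts as soon as the flag flips
        pass
    worker.join()
    return result[0]
```

Both functions return the same result; the difference is only in what the waiting thread does in the meantime, sleeping versus spinning.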
Now, it came from SPDK working together with QEMU, so those communities implemented vhost-user-blk. But there are also implementations in other hypervisors like crosvm and Cloud Hypervisor. So primarily this came from virtualization, from this problem of how we do software-defined storage and let a virtual machine connect to it. But that's not all that vhost-user is good for; it's actually a general storage interface. It's generic, just like NVMe or SCSI is. So you could use vhost-user-blk if you had some kind of data intensive application that needs to do a lot of storage IO and needs high performance or needs to be unprivileged. And that's why I'm talking about vhost-user-blk today. So let's have a look at the protocol. The way that this is realized is that there is a Unix domain socket for our user space storage interface, and we speak the vhost-user protocol over the socket. What the socket and the vhost-user protocol allow us to do is set up access to a virtio-blk device, so a block device that lives in the software-defined storage process. So when we have two processes running on a system, a software-defined storage process and an application, the application is using vhost-user in order to communicate with the virtio-blk device, and that's how it does its IO. So what is virtio-blk? Virtio-blk is a standard. You can check out the virtio specification. Virtio has a number of other devices, but it includes virtio-blk. Some of the other devices are virtio-net or virtio-scsi and so on. But virtio-blk is the one we'll focus on here, and it consists of one or more request queues where you can place IO requests. And each one of these has a little structure. You can do all the requests I mentioned in the beginning of the talk: reads, writes, flushes, discard, write zeroes and so on. And you have multiple queues, so if you want to do multi-queue, say you're multi-threaded, you can do that as well.
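The "little structure" of a request can be sketched as follows. This assumes the request header layout and type constants from the virtio specification (a little-endian 32-bit type, a 32-bit reserved field and a 64-bit sector number in 512-byte units); the helper function names are made up for illustration, and a real request also carries data buffers and a trailing status byte that are not shown here.

```python
import struct

# Request types as defined for virtio-blk in the virtio specification.
VIRTIO_BLK_T_IN = 0             # read
VIRTIO_BLK_T_OUT = 1            # write
VIRTIO_BLK_T_FLUSH = 4
VIRTIO_BLK_T_DISCARD = 11
VIRTIO_BLK_T_WRITE_ZEROES = 13

SECTOR_SIZE = 512               # the sector field always counts 512-byte units
_HEADER = struct.Struct("<IIQ")  # le32 type, le32 reserved, le64 sector


def build_request_header(req_type: int, byte_offset: int) -> bytes:
    """Pack the 16-byte header that precedes the data buffers of a request."""
    if byte_offset % SECTOR_SIZE:
        raise ValueError("offset must be a multiple of 512 bytes")
    return _HEADER.pack(req_type, 0, byte_offset // SECTOR_SIZE)


def parse_request_header(raw: bytes) -> tuple[int, int]:
    """Return (request type, starting sector), as a device implementation would."""
    req_type, _reserved, sector = _HEADER.unpack(raw)
    return req_type, sector
```

For example, `build_request_header(VIRTIO_BLK_T_IN, 4096)` describes a read starting at sector 8.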
And it has a config space that describes the capabilities of the device, like disk size, the number of queues and so on. So that's what you can think of virtio-blk as; that's the model we have here, and that's the block device that our application can interact with. If you think of any other storage interfaces or network protocols that you're familiar with, this should be more or less familiar. Most of the existing protocols also work in this way. You can inquire about a device to find out its size and so on, and then you can set up queues and you can submit IO. So underneath virtio-blk, we have the vhost-user protocol. The vhost-user protocol is this Unix domain socket protocol that allows the two processes to communicate. But it's not the data path. So vhost-user is not how the application actually does IO; instead it's a control path that is used to set up access to these request queues that I've mentioned. The IO buffer memory and the queue memory actually belong to the application, and the application sends them over the Unix domain socket. It sends that shared memory over so that the software-defined storage process has access to the IO buffer memory and the queue memory. The application and the software-defined storage process share access to that memory. That way we can do zero copy. So this is going back to the message passing versus zero copy thing. We don't need to transfer entire IO buffers between the two processes. Instead, the software-defined storage process can just read the bytes out of the IO buffer that lives in the application process, and it can write the result into a buffer as well. So if you want to look at the specification and the details of how vhost-user works, I've put a link on this slide. But really, if you're writing an application, I think the way to do it is to use libblkio.
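The idea of handing shared memory to another process over a Unix domain socket can be sketched in Python. This is not the vhost-user protocol itself, only the underlying mechanism as I understand it: a file descriptor travels over the socket, both sides map it, and data written by one side is visible to the other without being copied through the socket. For brevity both "sides" live in one process here, and the function name and the `b"buf"` message are made up for this sketch (it needs Python 3.9+ on a Unix-like system).

```python
import mmap
import os
import socket
import tempfile

BUF_SIZE = 4096


def share_buffer_demo() -> tuple[bytes, bytes]:
    """Pass a buffer's file descriptor over a socket and access it from both ends."""
    app_sock, storage_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as backing:
        backing.truncate(BUF_SIZE)
        app_view = mmap.mmap(backing.fileno(), BUF_SIZE)  # "application" mapping

        # Control path: only a tiny message plus the descriptor cross the socket.
        socket.send_fds(app_sock, [b"buf"], [backing.fileno()])
        _msg, fds, _flags, _addr = socket.recv_fds(storage_sock, 16, 1)
        storage_view = mmap.mmap(fds[0], BUF_SIZE)  # "storage process" mapping
        os.close(fds[0])

        # Data path: plain memory accesses, nothing is sent over the socket.
        app_view[:5] = b"hello"
        seen_by_storage = bytes(storage_view[:5])
        storage_view[:5] = b"WORLD"
        seen_by_app = bytes(app_view[:5])

        storage_view.close()
        app_view.close()
    app_sock.close()
    storage_sock.close()
    return seen_by_storage, seen_by_app
```

The socket is used once for setup; after that, each side sees the other's writes directly in the mapped memory.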
libblkio is a library that has both C and Rust APIs and that allows you to connect to vhost-user-blk as well as other storage interfaces. So vhost-user-blk is not the only thing, but for the purposes of this talk, we'll just focus on that. libblkio is not a framework, it's a library. It allows you to integrate it into your application regardless of what your architecture is. That means it supports blocking IO, it supports event-driven IO, and it also supports polling. So no matter how you've decided you want to do your application, you can use libblkio. You won't have to change the architecture of your application just to integrate libblkio. I have given a full talk about libblkio. So if you want to understand the details and also some of the background and everything it can do, then please check out that talk; I put a YouTube link on this slide for you. I'll give you a short code example here. This shows how to connect to a vhost-user-blk socket using libblkio. And this is pretty straightforward: we essentially just need to give it the path of the Unix domain socket, and then we connect and start the blkio instance. And then in order to do IO, we can submit a read request. That's just a function call; that's straightforward as well. Notice here that we call the get queue function in order to grab a queue. That's because libblkio is a multi-queue library. If you have a multi-threaded application, you could create one dedicated queue for each thread and then avoid any kind of locking and synchronization. All the threads can do IO at the same time. For completion, what this example shows is blocking completion. So here the program is actually going to wait in the do IO function until the IO is complete. But as I mentioned, the library also supports event-driven IO and it also supports polling. So whatever you like, you'll be able to do that. If you develop your application, you'll need something to test against.
And I think the easiest way to test against a vhost-user-blk device is to use the QEMU storage daemon. It's packaged for all the main Linux distros as part of the QEMU packages. You can just run the storage daemon, give it a raw image file and tell it the name of the vhost-user-blk Unix domain socket that you want to have, and then you can connect your application to it. All right, so that's how you can do that. If you want to implement a server, and you're already in the SPDK ecosystem and you're using Intel's Storage Performance Development Kit in order to write your software-defined storage system, then it's very easy, because vhost-user-blk support is already built in. So I've put a link to the documentation. There are also RPCs if you want to invoke it from the command line. And just for testing, you can create a vhost-user-blk server using this. Now if you're not using SPDK, and instead you're writing your own C daemon, your own process, then one way of using vhost-user-blk is to use the libvhost-user library. This is a C library that implements the vhost-user protocol, the server side of it. So this will allow you to accept vhost-user connections. It doesn't actually implement virtio-blk. That's your job. That's the job of the software-defined storage system. But virtio-blk consists of basically just processing the IO requests, like reads and writes and so on, and also setting the configuration space so that the disk size is reported there. And you can find an example of a C program that implements vhost-user-blk using libvhost-user. I've put a link on the slide here for you. So that's how you can do it in C. In Rust, similarly, there is a library available for you. There's the vhost-user-backend Rust crate, and it plays a similar role to the libvhost-user library for C. So this allows you to easily implement whatever vhost-user device you want. And in this case, it's your job to implement virtio-blk, just as I mentioned.
Okay, now I still wanted to touch on one con that we hadn't covered yet. We've explained how, although a user space interface is complex and is more work than just using file descriptors and read and write, libraries like libblkio, libvhost-user and so on, which are ready for you to integrate into your applications or software-defined storage systems, take away that complexity and make the integration easier as well. You don't need to duplicate code or write a lot of stuff. But we're still left with one of the disadvantages: how do we connect this back to the kernel if it turns out we want to use some functionality from the kernel storage stack, or if we have a legacy application that we can't port to use the user space interface? For vhost-user-blk, there is a solution here. There's a Linux VDUSE feature, which is relatively new. What it does is allow a vhost-like device to be attached to the kernel. So even though your software-defined storage system is in user space, this gives you a way of attaching your block device to the kernel. And then in the kernel, the virtio-blk driver will be used to communicate with your device. What happens is that a /dev/vda or /dev/vdb block device node will appear, and your application can open that like any other block device, and it can read and write and do everything through there. One of the nice features of this is that, because it's quite similar to vhost-user-blk, the code can be largely shared. I think the only difference would be that instead of having the vhost-user code, you would have the VDUSE code, which opens the character device that the VDUSE driver in the kernel offers instead of a Unix domain socket. And the setup and the control path are a little bit different. But the actual data path and the virtio-blk part are still the same, so you can reuse that code. So that's an effective way of doing it.
There's another new Linux feature that I wanted to mention that is interesting here, and also a little bit more general, even outside of vhost-user-blk, and that's ublk. ublk is a new Linux interface for user space block IO, so that your software-defined storage system can present host kernel block devices. So you can have your block device and process it in user space. And it uses io_uring. It's an exciting feature, and it's pretty interesting, so I've left the link here. The only thing with this is that, compared to VDUSE, it does not reuse or share any of the vhost-user-blk stuff. So if you already have vhost-user-blk support in your software-defined storage system, or you just want to streamline things, then ublk is kind of a whole different interface that you have to integrate. So that's the only disadvantage, but I think it's pretty exciting too. Okay, so to summarize: if you need a user space block IO interface for the performance, or because you need to be able to do unprivileged IO, or for security, then implement vhost-user-blk. There are open specs, code, and community. Please let me know if you have any questions, and thank you. Have a great FOSDEM! |
Present and future of Ceph integration with OpenStack and k8s |
Thank you, let's welcome Francesco on the present and future of Ceph. Okay, thanks everyone. This session is about OpenStack and Ceph integration, with a quick glance at the Kubernetes world. Here is a quick agenda. I'm going to talk about what the integration with Ceph has been in the OpenStack system and in the OpenStack community in general, what's the state of the art, what has changed with cephadm in the bare metal world, and what it means to have Kubernetes in this picture. There is also a demo. I'm not going to show the demo, but it's linked in the session, so feel free to look at it later. So, for those not familiar with the OpenStack project, it's pretty old at this point. It's infrastructure as a service. As you can see on the right side, there are a lot of projects that are part of the OpenStack ecosystem. Each of them provides interfaces to the others, and they cooperate to provide processing, storage and network resources. You basically can build your cloud infrastructure using all these projects together. It's open source, it's now 10 years, 13 years old, so there are a lot of projects that are stable now. And Ceph is part of this picture in the sense that, you probably can't see this picture very well, but in both compute and the storage part we have Ceph, which basically can be used as a backend for Nova, which is the compute processing component, so we can provide ephemeral disks using Ceph as a backend. We have Manila providing file shares. Swift providing object, where there is a long story with the integration with the RADOS Gateway. Glance for images. So basically all these components you see there in the storage area can use Ceph as a backend, and this is the justification for having this integration between these two technologies. So, why the integration and why is it relevant? There is HCI, hyperconverged infrastructure.
Operators can save hardware resources by co-locating the compute part of the infrastructure and the OSD nodes. This means that you can save hardware, and you can have both technologies together serving as a fully operational infrastructure. This is funny: I just asked ChatGPT why Ceph is relevant and why the integration of these two projects is interesting. I just put it there; you probably cannot see this part on the right, but it's basically my marketing sentence on why this thing makes sense. And there is scalability, resiliency, and all these kinds of keywords that we like a lot. But this session is about orchestrating services, deploying OpenStack and Ceph together. There have been a lot of deployment tools and strategies in the past, and I just want to focus on ceph-ansible and cephadm, because ceph-ansible has been the official tool, the most useful tool in the past, and now it's cephadm. Things have changed a bit, especially in the OpenStack ecosystem. Previously the approach was: I need my infrastructure as a service, I need to deploy OpenStack and Ceph, I want to describe my entire cluster with a ton of variables, and I push that magic button that makes it so. So ceph-ansible was there in this picture during the deployment of OpenStack. There was this particular part where ceph-ansible was triggered to deploy Ceph behind the scenes. The drawback of this solution is that if you need to change anything in your Ceph cluster, you have to run the ceph-ansible playbook again, because there is this Ansible layer that manages everything for you and it needs to stay consistent with the state of the cluster. So basically the human operator is supposed to provide variables and maintain those variables, which is a bit difficult. It also affects scale-down and scale-up operations and day-2 operations, especially day-2 operations.
If I want to change something in my Ceph cluster, I need to run the deployment again, because I rely on Ansible idempotency, basically. But with cephadm things are a bit different: the state of the cluster is not made of tons of Ansible variables. There is cephadm, which is able to keep the state of the cluster, watch the cluster and take action. I want to deploy a new OSD whenever I attach a new disk. I can do that, and I don't have to run the deployment again or do any fancy day-2 operation with my tool that is supposed to manage my infrastructure in general, which is made of both OpenStack and Ceph. And this changes things a bit, because we had the ceph-ansible world where we described all our cloud infrastructure, we pushed that magic button and everything was deployed, and sometimes broken. But now operators are more aware of the steps. Things have changed because you have to provide networks. And networks means that you want to manage your hardware, you want to build your networks. This is specifically for the TripleO project, where we integrated ceph-ansible in the past and now we're moving to cephadm. And now people should provide networks, should provide bare metal, the description of the nodes, and these are separate steps. The Ceph storage is deployed first, so you can start deploying the OpenStack infrastructure with a minimal Ceph cluster already deployed. And when you deploy OpenStack, you can say: I have Cinder, I need volumes, I need the volumes pool, and I can finalize my cluster, creating and configuring the Ceph cluster accordingly. I have Manila, I need CephFS, I can deploy MDS, doing other stuff behind the scenes. But you're basically aware that, according to the service you're deploying in your OpenStack infrastructure, you can enable pieces in your Ceph cluster and you can just tune it accordingly.
At that point, we still have the Ansible layer managing OpenStack and all the infrastructure in general, but the Ceph cluster is seen as a separate entity. So it's like having an external Ceph cluster, even though you have the same tool deploying both technologies together. And cephadm and the manager, the orchestrator, is the piece in Ceph that is supposed to provide interfaces that operators can interact with. And that's basically this slide. We still have Ansible doing everything on top, but the orchestrator interface is what you can rely on to make changes in your cluster without having to redeploy everything again through Ansible, which can take a lot of time if your infrastructure is large. And the operator is not supposed to keep and maintain any variables in the Ansible system. You can just interact with the cephadm CLI and create a spec, which is a small definition of the Ceph service that will be translated by the orchestrator and deployed as a new daemon in the Ceph cluster. So this is about why it's so easy. Because you have Ansible, at some point you can just bootstrap a minimal Ceph cluster with this command, bootstrap, providing a monitor IP address, because networks are already there at that stage. And you can create a spec file that is basically the description of the nodes that should be enrolled in Ceph, and you can just apply it. It's even easier than running Ansible with a lot of roles and an execution time which is long enough. And day-2 operations are no longer as complicated, and not just because of what is on this slide. This is the interaction with the cephadm CLI: you can query the cluster and see the health status. You can see where daemons are placed, you can list the hosts, manage their labels and assign roles to these hosts. You can do a lot of things. But the real point here is that with cephadm we don't need to run the whole deployment of the OpenStack infrastructure again.
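The bootstrap-then-apply flow described above could look roughly like the following Python sketch, which only renders a host spec and the two commands as text and does not run anything. It assumes the usual cephadm host spec fields (`service_type`, `hostname`, `addr`, `labels`) and the `cephadm bootstrap --mon-ip` and `ceph orch apply -i` commands; the host names, addresses and helper functions are made up for illustration.

```python
def host_spec(hostname: str, addr: str, labels: list[str]) -> str:
    """Render one cephadm host spec document as YAML text."""
    lines = ["service_type: host", f"hostname: {hostname}", f"addr: {addr}"]
    if labels:
        lines.append("labels:")
        lines += [f"  - {label}" for label in labels]
    return "\n".join(lines)


def cluster_spec(hosts: list[tuple[str, str, list[str]]]) -> str:
    """Join several host specs into one multi-document YAML file."""
    return "\n---\n".join(host_spec(*host) for host in hosts) + "\n"


def deploy_commands(mon_ip: str, spec_path: str) -> list[list[str]]:
    """Commands an operator (or an Ansible task) would run, in order."""
    return [
        ["cephadm", "bootstrap", "--mon-ip", mon_ip],  # minimal one-node cluster
        ["ceph", "orch", "apply", "-i", spec_path],    # enroll the described hosts
    ]


if __name__ == "__main__":
    # Hypothetical nodes; real deployments would use their own names and networks.
    print(cluster_spec([("ceph-0", "192.168.24.10", ["mon", "mgr"]),
                        ("ceph-1", "192.168.24.11", ["osd"])]))
    for cmd in deploy_commands("192.168.24.10", "cluster_spec.yaml"):
        print(" ".join(cmd))
```

The point of the sketch is that the desired state lives in a small spec file handed to the orchestrator, rather than in a large set of deployment-tool variables.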
An example of how projects are impacted by this change is Manila. It's not just because we need a new version of librbd, we need to be compatible and we are changing from ceph-ansible to cephadm, but because we are making architectural changes to the use cases provided by OpenStack. Manila is the service that carves out CephFS volumes and provides them to the virtual machines within tenants, which means that you have a dedicated network that you can use to mount your shares. And behind the scenes we have CephFS or NFS with Ganesha. In the past, Manila was supposed to use D-Bus to interact with NFS Ganesha. And it was complicated because we had to run privileged containers. We had to use this interface to update and manage shares using Ganesha as a gateway. And from an architectural point of view, we had an active-passive model made of Pacemaker and systemd. So you basically had Pacemaker owning the virtual IP as an entry point and then one active Ganesha, even though you have more than one instance, and with some constraints from systemd. Now with cephadm there is an interface in the manager, the NFS cluster, and Manila can use a new driver to interact with this component. We don't have to talk D-Bus to the Ganesha container anymore. And that's the new model, where we rely on the ingress daemon provided by cephadm, and this ingress daemon is made of HAProxy and keepalived: keepalived owning the VIP as an entry point, and HAProxy for load balancing and distributing the load across the NFS Ganesha servers. It's a better approach. We still have some limitations in this area, because considering that Manila is an infrastructure service for OpenStack, but provides shares within the tenant virtual machines over a dedicated network, we need client restrictions to avoid other tenants mounting the same share.
And there is an effort on the PROXY protocol in Ganesha to make sure that we can use client restrictions with this new model, which is the current limitation basically. Or at least there is some effort to provide floating stable IP addresses to the ingress daemon and skip the HAProxy layer, which is always an additional hop. And in terms of performance, this can be something that has an impact, of course. Lastly, at this point we had ceph-ansible, we have cephadm; what does Kubernetes mean in this world? We have Rook as a way to deploy Ceph within Kubernetes. Regardless of how OpenStack is deployed, we have several combinations of these two things together. You can have a converged infrastructure where the OpenStack control plane is virtualized, or it can be containerized, so basically using the same model, the same approach to deploy both can be useful, because it's a unified approach to managing your infrastructure. It's easy, deployable and reproducible, because Kubernetes imposes a standard approach to deploying things, so we no longer have the flexibility that we have today with TripleO, but it's easier from that point of view. And the same Ceph cluster can be shared between OpenStack and Kubernetes. We have different workloads. For Kubernetes it's the PVC interfaces provided by Rook. For OpenStack it's mostly RBD, and your workload runs in virtual machines and is usually outside, so from the compute node you have to reach the Ceph cluster deployed by Rook within Kubernetes. This poses some networking challenges that can be managed using host networking, so you're using Kubernetes as a platform to deploy your Ceph cluster, but you're still relying on host networking to reach it and provide RBD to the outside workload, and that's the point of this slide. There are some things that are not mentioned here, like some tuning in Rook that is supposed to be done because there is Kubernetes in the middle, so it's not the native Ceph cluster we usually have.
So at this point, the thing is that we should do some tuning, especially in the HCI world, because hyperconverged is still one of the most popular use cases, and HCI is provided out of the box by Kubernetes. You can tag your infra nodes, you can deploy Rook, you can assign those nodes for OSDs. That's it. But at that point, you have to make sure that both your cluster and the connection are available to the outside workload. This can be done, and it's done in this demo. I'm not going to show it, but it's all there, it's all described; it's everything I was describing in this slide. So you can have your OpenStack infrastructure deployed with DevStack or TripleO, still on bare metal, and it can consume a Ceph cluster deployed by Rook using RBD. You can still use the CSI, actually, to provide the PVC interface. The two are not mutually exclusive, but it's just a good exercise to see how these two technologies can work together in the future, probably. And yeah, here are just some additional resources for those interested in looking at these slides offline, and some contacts for people in the OpenStack world if you want to dig more into these experiments. And that's it. Thank you very much. |
SQL on Ceph
An introduction to the new libcephsqlite library. |
Okay, welcome everyone, we're starting with the next session, so try to find a place; feel free to use the reserved spaces if nobody comes. Let's welcome Patrick with the talk on SQL on Ceph. All right, hey everybody, I'm Patrick Donnelly. I know I've got Red Hat slides up, but I'm actually part of the storage group at Red Hat that got moved over to IBM as of this year, so technically I'm at IBM now, if that matters for anybody who wants to ask me questions. Today I'm talking about SQL on Ceph. It's kind of a small project that started about two years ago; it was actually a COVID project for me while I was dealing with a newborn baby and had lots of, or some, time on my hands. But anyway, this is an overview of what we're going to talk about. Yeah, go ahead. Yeah, I do. We don't have a PA, unfortunately. Oh, yeah, I just have to speak up. I'm naturally soft spoken, so if you can't hear me in the back, just wave your hand and I'll try to speak up more, okay? I'm okay right now? All right. Where was I? So just quickly canvassing the audience, who's used Ceph before? Oh, wow, okay. Who's used SQLite before? Fewer people, that's interesting, okay, but not many fewer. All right, so I'm going to quickly talk about Ceph and what it is. I won't spend too much time on it due to the time I have available in my talk. I'll give you a brief introduction to RADOS for anyone who's not familiar with it. Then I'm going to talk a little bit about SQLite. Then some typical storage patterns we use for storing data on RADOS. I'll give you an introduction to this new library, libcephsqlite. And then I'm going to talk about how we use that today within Ceph, just to show that this library is actually being used by someone. Although I am interested if anyone's using it in the community. I'll go through a brief interactive tutorial using the library, and then I'll end the talk with some retrospective and talk about future work. So what's Ceph?
Ceph is an ecosystem for distributed object storage. It's composed of numerous projects centered around managing a large storage cluster. The underpinning of Ceph is RADOS, which I'll talk about on the next slide, but it's basically a distributed object store. Most people don't use RADOS directly. What they use instead is the storage abstractions we built on top of RADOS, which provide the more popular storage mechanisms that people are familiar with, including CephFS, which gives you your file storage as a distributed file system, RGW, providing the S3 object storage gateway, and RBD, which gives you your block device storage on top of RADOS. Ceph has evolved more and more recently to become more user-friendly. If you had maybe poor experiences in the past with Ceph, I encourage you to give it another shot. The dev team has dedicated a lot of time recently to improving the user experience and to taking the hassle out of managing your storage cluster. There are things like monitoring device health. We now have a very mature dashboard for interacting with the Ceph cluster, and the cluster management itself is now largely done through cephadm, with which, as you saw in the previous talk, you can start up a Ceph cluster with just a simple command and then start adding hosts to it. It's never been simpler. Oops, went backwards. So what's RADOS? RADOS is a number of object storage daemons that run on physical disks. They can be hard disks, SSDs, NVMe devices. On top of these object storage devices, we have this concept of pools, which allows you to have various administrative policies regarding what kind of hardware the pool should use and how the data should be replicated. Clients of RADOS talk directly to the primary object storage device for a given object, and you can look up which object storage device an object belongs to in constant time using a library called CRUSH. You don't need to use that directly. That's just under the covers.
And then, as the name suggests, it's a Reliable Autonomic Distributed Object Store. The cluster self-heals, it's autonomic, and the replication is done automatically. You don't have to worry about how any of that works. So what's an object? The object storage device is composed of a number of objects, and that is the logical unit you have when you're storing things in RADOS. An object is composed of three different parts. You can use one or all. There's the data blob, which is analogous to a regular file: you put data in the file, you get data out of the file. You have key-value xattrs. This is sort of an older technology that was used in the early days with CephFS for storing certain information about files, which is typically very small data. It's not usually used anymore, except in some parts of CephFS. Now, the key-value store that's used most often in CephFS, and also RGW, is OMAP, and that is much more of a general-purpose key-value store used today. So this is how you interact with RADOS, through these objects. Now, it's not that simple to take a number of objects distributed all over the cluster and try to build something with that, because you've got consistency issues that you have to deal with. You've got to manage how you're going to stripe the data across all these objects, which is why we have these more popular abstractions that I talked about, CephFS, RBD, RGW, which is how you typically interact with RADOS. So what I'm going to talk about today is a SQLite library that operates alongside these other three storage abstractions and gives you something on top of librados, so you can now also run SQL on Ceph. So how do you typically do application storage on RADOS? Well, we have various bindings you can use to talk to RADOS. We have the typical C/C++ bindings, which are part of the broader project and also used within Ceph. We also have a Python interface, which is used for manipulating the objects.
That's somewhat used in the broader community for various projects, but within Ceph we also use it for the new Ceph manager daemon, which I'll talk about more later. And again, it's not that simple to stripe data across the objects, which is why we have these other abstractions. One of the more notable exceptions is libradosstriper, which is one of the ways you can create a file concept on top of objects, where you open and close, read and write and sync to a number of objects, and it looks like a regular file. That was developed by some folks at CERN, and in terms of use, I think it's mostly stayed confined to that space. Even though we do have these other storage abstractions, it's still useful to talk to RADOS directly, because sometimes you want to do something that is not dependent on these other storage abstractions, which, in the case of Ceph internals, may not actually be available. That's why a number of Ceph manager plugins (the Ceph manager has a number of Python modules) talk directly to RADOS. So this was something I wanted to address, because it was a little bit awkward, and I'll talk about that more. So, a quick overview of SQLite. For those who've never used it before, it's a user application library that acts as a SQL engine and lets you store a SQL database as a regular file, usually two files: a journal, and then the database itself. And depending on how you use it, the journal is transient and may come and go. It's widely recognized as one of the most used SQL engines in the world. It's very popular. They estimate on their website that there are billions of SQLite databases in use. It's at least tens of billions at this point, because it's in every Android phone. So it was an easy choice to make. It's a very simple library, and bindings exist for numerous languages. In particular, of interest to me was Python. Actually, extending SQLite is fairly simple.
They have this VFS concept, the virtual file system concept. It lets you swap in different virtual file systems as needed. The basic one is the UNIX VFS; that's what comes with SQLite by default, and it's very intuitive. It just passes open, read, write and close off to the local file system for execution. So libcephsqlite is a SQLite VFS library that lets you put a SQLite database in RADOS. It's composed of two parts: libcephsqlite and SimpleRADOSStriper. I'll talk about SimpleRADOSStriper on the next slide. The use of this library does not require any application modification, and that's kind of the killer feature here, because you can just set some environment variables and modify the database URI, and you can automatically start storing your database in Ceph. And all of these, you know, the journal objects and the database objects, are striped across the OSDs. You don't need to do anything differently. SimpleRADOSStriper is based loosely on the libradosstriper developed by CERN. The main reason I didn't end up using CERN's library was because it had some locking behavior that was not really desirable for a highly asynchronous use case, and I didn't want to modify their library out from under their feet, so I just wrote a simple version. It provides the primitives that SQLite needs: open, read, write, close, sync. All the writes are done asynchronously, and then the sync call that comes from SQLite actually flushes them all out. So these are all stored across RADOS with these names, you know, foo.db, and it's got the block number associated with the database, and so on. Using libcephsqlite, again, is very easy. You just have to load a VFS library. This is done with the SQLite command .load libcephsqlite, and then you just provide a URI for the database.
This is the pool ID or the pool name, the namespace within that pool, which is optional, and then you give the database a name and specify that the VFS is ceph, and that's it. It just works. You may have to specify some environment variables if you're using the SQLite binary, to tell it which Ceph cluster to use or which Ceph configs to read, things like that, but that's all fairly unobtrusive. Now, within the Ceph manager. The Ceph manager is one of the newer daemons in Ceph that takes care of certain details of managing your Ceph cluster and tries to provide easier interfaces. Of particular interest to us is the module that handles health metrics that come from the OSDs, giving the Ceph manager information about the SMART data associated with the disks so it can anticipate disk failures; again, Ceph trying to reduce the management burden of storage clusters. It's also a portal to higher-level commands like managing volumes within CephFS, that is, the subvolume concept that's used by OpenStack or Kubernetes CSI. Within the Ceph manager daemon, what I observed was that there were several modules that were just storing data in the OMAP key-value store of a particular object, and it turned out this doesn't scale very well. We know it won't scale well because if you have more than 10,000 key-value pairs in a single object, the performance starts to degrade. In fact, you'll start getting cluster warnings that there are objects with too many key-value pairs. It was also pretty awkward in terms of how it was being used, and just by how we were managing the data, it was a perfect match for a SQL database, except it was not very easy to put a SQL database on Ceph at the time.
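As a rough illustration of the URI form described above (pool name or ID, optional namespace, database name, and the ceph VFS), here is a minimal Python sketch. The helper names `ceph_db_uri` and `open_ceph_db` are hypothetical, and the shared-library name `libcephsqlite.so` is an assumption that depends on the platform and build; actually opening the database needs a reachable Ceph cluster, so only the URI builder is exercised here.

```python
import sqlite3


def ceph_db_uri(pool, db_name, namespace=""):
    """Build a URI of the form file:///<pool>:<namespace>/<db>?vfs=ceph.

    The namespace is optional; when empty, the colon is still present.
    """
    return f"file:///{pool}:{namespace}/{db_name}?vfs=ceph"


def open_ceph_db(pool, db_name, namespace="", library="libcephsqlite.so"):
    """Load the VFS library, then open the database through the ceph VFS.

    Hypothetical helper: requires a running cluster, Ceph configuration in
    the environment, and a Python build that allows extension loading.
    """
    loader = sqlite3.connect(":memory:")
    loader.enable_load_extension(True)
    loader.load_extension(library)  # registers the VFS for this process
    return sqlite3.connect(ceph_db_uri(pool, db_name, namespace), uri=True)


# Pool "a", namespace "b", database "a.db", as in the demo later in the talk.
demo_uri = ceph_db_uri("a", "a.db", namespace="b")
print(demo_uri)
```

The application code after `open_ceph_db` would be ordinary `sqlite3` usage, which is the point of the "no application modification" claim.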
In fact, Jan here worked on snap_schedule, which is a module for creating and maintaining snapshots in CephFS and handling retention policies, and that actually used a SQL database that was flushed to RADOS objects and then loaded back, in anticipation of the project that I'm working on now. That's all been updated now to use this libcephsqlite library. All right. So in terms of how it actually looks within the Ceph manager, on the left we have a schema. It's fairly simple: creating a table with the device ID as the primary key, and then another table with device health metrics, with the time we got the metrics, the device ID associated with that metric, and then the raw SMART text. And then to actually put the device metrics in the database, it's as simple as this. I've taken out a few unnecessary lines and unnecessary keywords in the SQL, just for space. You create the device ID, which just calls another SQL statement to insert into this table, and then actually execute the SQL statement with the epoch, dev ID and data. It's that simple. And now that's stored, persisted, in RADOS. So here's a quick libcephsqlite-in-action series of GIFs I've created. Here we're running the ceph status command, just showing us the state of the cluster. We have two pools right now, a .mgr pool and an "a" pool that I'm creating for this demo. Here I'm purging "a" just to show that there's nothing in it. It removed one object. And here I'm just listing all the objects within this pool. There are none because I just purged it. So that's just a starter. And then here we're actually going to run libcephsqlite. To do that, again, I mentioned there were some environment variables. If I'm using the SQLite command directly, I have to specify some environment variables so the library knows what to do. Here, because this is a dev cluster, I have to tell it to use the library path associated with my build.
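The device-health schema and insert path described above can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the manager module's code: the table and column names (`Device`, `DeviceHealthMetrics`, `devid`, `raw_smart`) and the helper `put_device_metrics` are chosen for the example, and an in-memory database stands in for one stored in RADOS.

```python
import json
import sqlite3
import time

# Two tables: known devices, and one row of raw SMART text per (time, device).
SCHEMA = """
CREATE TABLE IF NOT EXISTS Device (
    devid TEXT PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS DeviceHealthMetrics (
    time INTEGER NOT NULL,
    devid TEXT NOT NULL REFERENCES Device (devid),
    raw_smart TEXT NOT NULL,
    PRIMARY KEY (time, devid)
);
"""


def put_device_metrics(db, devid, data, epoch=None):
    """Record one metrics sample, creating the device row if needed."""
    epoch = int(time.time()) if epoch is None else epoch
    with db:  # both inserts commit together or not at all
        db.execute("INSERT OR IGNORE INTO Device (devid) VALUES (?)", (devid,))
        db.execute(
            "INSERT OR IGNORE INTO DeviceHealthMetrics (time, devid, raw_smart) "
            "VALUES (?, ?, ?)",
            (epoch, devid, json.dumps(data)),
        )


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
put_device_metrics(db, "disk-0", {"temperature": 35}, epoch=1700000000)
rows = db.execute("SELECT time, devid, raw_smart FROM DeviceHealthMetrics").fetchall()
print(rows)
```

Compared with one OMAP key per sample, the composite primary key and the foreign key express the relationship directly, and queries by device or time range become ordinary SQL.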
I specify which Ceph config to load and which keyring to use, associated with the admin user that I'm going to specify here. I was actually going to also add some logging data, but I ended up not doing that, just to save space. And then here I'm actually running a SQLite command. I'm loading the libcephsqlite library; that's one of the first commands that SQLite is going to run. And here I'm opening a database in pool "a", namespace "b" within that pool, and then a database named a.db with vfs=ceph. All right, now I'm in SQLite. Here I create a simple table with an integer column. There's the schema, exactly what I wrote. And then we're going to insert one value into the table and then dump it. So it's now in RADOS. And now, just to confirm that, I'm going to run the rados command on pool "a" and list all the objects in the pool. You can see in namespace "b" I have this a.db. So now I'm going to use this striper command. If this database were composed of many objects, you could use the striper command to actually pull the database out. And you can see here I've done that. It's an 8k database. It's small because there's just one table with one value. And I loaded that locally. I pulled it out of RADOS. Sorry, the GIF loops. I pulled it out of RADOS, and I now have the database as a local file. I ran SQLite on that local file database and just dumped it to confirm that it actually wrote the data out to RADOS correctly, and that I can pull it out of RADOS and verify that it actually worked. So here's another demo, just rerunning the same SQLite command I had earlier. And sorry, this is going to be a big paste. I'm creating a table. This is just some magic in SQLite to basically create a loop. And I'm just going to insert a number of integers, I think it's 100,000, into the table, and then see how many objects are in the database now. There are four objects making up that database.
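The "magic" loop in the second demo is most likely a recursive common table expression, which is the usual way to generate a series of integers in SQLite. The exact statement from the slide is not in the transcript, so the following is a sketch under that assumption, run against a local in-memory database with an illustrative table name `t`.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (val INTEGER)")

with db:  # single transaction for the bulk insert
    db.execute(
        """
        WITH RECURSIVE c(x) AS (
            SELECT 1
            UNION ALL
            SELECT x + 1 FROM c
            LIMIT 100000          -- stops the otherwise endless recursion
        )
        INSERT INTO t (val) SELECT x FROM c
        """
    )

count, lo, hi = db.execute("SELECT COUNT(*), MIN(val), MAX(val) FROM t").fetchone()
print(count, lo, hi)
```

Against a database opened through the ceph VFS, the same statement would be what grows the database past a single stripe unit and produces several RADOS objects.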
So I think for time reasons, I'm not going to go through the performance notes, but they're on the slide if you want to look at them later. And just as a retrospective for Quincy, when the library went live: it's being used in two manager modules right now, the device health and the snap scheduling modules. It's been fairly successful. We had a few minor hiccups that weren't really related to the library. And for future work, I want to add support for concurrent readers. That's not yet possible right now; all readers and writers obtain exclusive locks when accessing the database. But there's no technical reason why we can't add concurrent reader support. And then I also want to look at improving read-ahead performance, because right now every read call in libcephsqlite is synchronous. So that's the end of my talk. Thank you. Do we have any time for questions? Sorry, no time for questions in this session. You'll have to find Patrick and ask him. |
Dynamic load change in SDS systems
How to make well behaved SDS systems in an ever changing cluster |
Okay, good morning, everyone. I want to talk about how we can dynamically change the load of a software-defined storage system to be a better resident in clouds. I did not move to IBM as part of the Red Hat storage move to IBM, so this is still a Red Hat presentation. I don't have the Ceph logo here because it's a generic presentation, but it is heavily based on work that I did with others on Ceph. So all the examples will be how this was implemented in Ceph and how we could use it, but the concepts are generic and not Ceph-specific. So it's a mix. Okay, so we will talk about what optimal cluster performance is and why we need it; that's just at the beginning. Then I'll go back to explain what we did in Ceph for the Reef version. We have a new read balancer, which I'll explain quite briefly, not in full detail, but it's an infrastructure that could be used to better control the load later. Then some future plans that we already have, which are good examples of what we could do with this infrastructure. And then we'll go to the real problem: how we could actually dynamically change the way the load is spread across a Ceph cluster when we need to because other things change in the cluster. How we could adjust the way the load is spread in Ceph because of some kind of external change in the conditions that we need to respond to. So, again, just an example. We have a cluster, we have nodes here and we have three workloads, and they are split not totally evenly over the nodes. And if we have bad luck, we could see that some nodes are more loaded than the other nodes. And the problem is, as everyone probably guesses, that when one node reaches 100% load, the entire system starts to become slow, because we have the weakest-link-in-the-chain effect. Assuming these workloads cannot respond fast enough, you get all kinds of queues created and the entire cluster loses its ability to perform well. It still performs, but not that well.
So, basically, when we build a cluster and we look from the outside, we want the load to be spread almost evenly. Then, when one node reaches 100%, we know the cluster is fully occupied and there is nothing we could do about it. This is another image which shows something much better. It's built from the same numbers; the area of each workload is the same. And it actually shows that the cluster itself is balanced. When it fills up, it fills up together, so we use the cluster as well as we can. So, if we try to look at what we want, we want something flexible with a fixed volume. A balloon is what I came up with. The performance that we want is the volume, but it should be flexible because we can't control all the workloads. For example, say we have a backup program on the nodes which runs at night, but not on all nodes at the same time. It gradually goes over all the nodes. So each one of the nodes gets some kind of peak, either in I/Os or in network traffic, and it's fuller than the others. And then the next one, and it goes over all the nodes. We can obviously mitigate this. We can say, OK, we allocate some capacity for this backup program so we know that other workloads can run on this node. But if this backup program works for an hour every day, then we allocate capacity for one hour, and for the other 23 hours it is probably not used. It's much better if we could accommodate these backup runs across the cluster and move the load from the node running the backup to other nodes for an hour. Then the backup finishes and goes to another node, and we move the load back to the original node. It's way more effective. So, in some sense, we do some kind of over-provisioning on the nodes, knowing that most of the time we are not actually over-provisioned, and when we are, we can mitigate it.
That's the idea. The other use case is that a node could come to full capacity when we didn't plan for it. It could be all kinds of failures: NIC problems, top-of-rack switch problems, other hardware, or, if we talk about Ceph, disk failure. All kinds of things could bring us to a situation that may be bad but not critical enough to take the node down and do a full rebuild and all this stuff. So the idea is to have something which is flexible that we can play with. The problem is that our balloon is built of LEGO bricks and it's not as flexible as we think. So I want to show the amount of flexibility that we have to play with when we talk about a software-defined storage system, which is more challenging than other workloads because it's stateful; stateless workloads are easier to manage flexibly. So we want to go from this, which is a copy of the first diagram that I showed, to this, which is where we want to be, and I changed only the orange workload. It's exactly the same numbers, exactly the same orange area, but split differently. That's the approach that I want to take. But, okay, it's a presentation. I can do miracles in a presentation; how do I do it in real life? Assuming Ceph, or software-defined storage, is the orange one, I want to show how we could play with this within limits. We can't do all the magic that we could do in a presentation, but we could do reasonable work: under some conditions very good work, under other conditions improving the situation without reaching a perfect solution. So it's all based on the idea of the Ceph read balancer. Today in Ceph, and not only in Ceph but in every software-defined storage system, the main balancing requirement is that all the disks are full to the same percentage. When the first disk is full, the system is full. So this is our basic assumption.
Then, what we try to do today is that if we have replica 3 and X PGs mapped to an OSD, X divided by 3 would be primaries. So the primaries are split across the devices, not strictly evenly but according to the number of PGs: if a device has more PGs, it has more primaries. But we don't have anything that enforces this. What actually happens today in Ceph is that we rely on the statistics of CRUSH. It works well for large clusters; it doesn't work well for small clusters. So the Ceph read balancer comes to fix this, and what we currently have, with more in the next bullet, is for the small clusters where the statistics don't play well. So what we actually did is add three different things that together create a read balancer, which can actually split the reads evenly across the OSDs. What it does is split the primaries as I explained: for replica X, one divided by X of the PGs on each OSD are primaries. So it just changes the primaries. This will be part of the Reef version. So, first of all, we created a score. The score represents how well the reads are balanced versus the optimum. The optimum is one. If we have a score of two, it means that under a full read load the system would perform at half of what a well-balanced system does. If, for example, we have replica three, three disks, and the score is three, it means all the reads go to a single disk. So it's obviously a third of the optimum, where the reads are split evenly among three disks. The score works really well when the read affinity of the devices is high. It's still monotonic, but it's more difficult to explain the numbers when you have a lot of OSDs with small read affinity numbers, and I could explain later why. That is not a good way to configure the system. It creates all kinds of illegal configurations.
You have to do what the user asks you not to do, but that's a side note. Then we have two new commands, pg-upmap-primary and rm-pg-upmap-primary, which say: okay, I have a PG, the PG has X OSDs in it, make one of them the primary, and I tell it which one. So I can actually change the primary within a PG. And it's metadata only; we don't move data. You must give an OSD which is part of the PG. If you give an OSD which is not part of the PG, you get an error. So we are not trying to do anything clever; it's very cheap and very fast. It just changes the order of the OSDs within the PG, that's all it does. And since it's the first version, we want to make it an opt-in option. So we have a new command in osdmaptool, which calculates everything and emits a script file that you can run in order to get to the perfect distribution. That's what we currently have. So this is an example of what you see when you run it. Small clusters tend to be less balanced because the statistics don't play well. We have the number of primaries for three OSDs: 11, 6, 15. Obviously, it's not balanced. And after you run it, you get 10, 11, 11, which is obviously the best that you could get. And the score is not one, because the average is 10 and two-thirds. If you could have 10 and two-thirds on each, you would get a score of one, but it's 1.03, really good. And before, it's 1.4, because it's 15 divided by 10 and two-thirds. This is the output file that you can run as a script in order to apply the changes. If you look really carefully, you see that we have six changes, but actually five changes, maybe even four, would be enough; we should just make this one 10, so we'd have 11, 10, 11. That's because it's a greedy algorithm. Since these commands have almost no overhead and run really fast, it's not a problem.
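The score arithmetic in the example above can be checked with a few lines of Python. This is a simplified sketch that assumes all OSDs have the same read affinity, in which case the score reduces to the busiest OSD's primary count divided by the average; the function name `read_balance_score` is made up for the illustration, and the score computed by Ceph itself takes more inputs into account.

```python
def read_balance_score(primaries_per_osd):
    """Ratio of the most loaded OSD's primaries to the ideal (average) count.

    1.0 means perfectly balanced reads; N means the busiest OSD carries
    N times its fair share, so read throughput under full load is 1/N.
    """
    average = sum(primaries_per_osd) / len(primaries_per_osd)
    return max(primaries_per_osd) / average


before = read_balance_score([11, 6, 15])   # 15 / (32 / 3)
after = read_balance_score([10, 11, 11])   # 11 / (32 / 3)
print(round(before, 2), round(after, 2))
```

With 32 primaries over three OSDs the average is 10 and two-thirds, so no integer assignment can reach exactly 1.0; 10, 11, 11 is the closest possible.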
If we needed to move data, this would be totally wrong, because we would have two additional data movements. We don't, it's really simple, so we didn't try to optimize the number of changes, just to get it working. So what is the implementation? We have two functions. One of them, as an example, is calc_desired_primary_distribution. That's where we go over the configuration and decide on the final distribution of primaries that we want: for each OSD, how many primaries do we want on it? It's a kind of policy function, and it's really simple. Today it does one divided by the replica count, as I said, and it's about 40 lines of code, something really simple. So if you want to add more policies, and I'll talk about more policies in a moment, we're talking about tens of lines of code, nothing more than this. Then we have the function that actually takes the configuration and goes over all the PGs. It's all run per pool. I didn't say this before: it's all run per pool, because a pool represents a workload and it's important to balance per workload. It goes over all the PGs in the pool and switches what it has to switch in order to get to what we calculated here. So basically this is an infrastructure function that could be used as is. We could add more policies, adding a policy is really simple, and the command is fast. I keep saying this because I want to talk about dynamic behavior, and that's what is behind it. It's a cheap operation that we could run periodically to change things when we want to. It doesn't involve any high overhead; it's just a metadata operation.
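The two-function structure described above (a small policy function that computes the desired primaries per OSD, and a pass over the PGs of a pool that switches primaries toward that target) can be sketched as follows. This is not Ceph's implementation: the function names, the data layout, and the single-pass greedy rule are assumptions made for illustration, and the policy here is simply "copies held divided by replica count".

```python
from collections import Counter


def calc_desired_primaries(pgs):
    """Policy: an OSD should be primary for 1/replicas of the PG copies it holds."""
    replicas = len(next(iter(pgs.values())))  # assumes a uniform replica count
    copies = Counter(osd for acting in pgs.values() for osd in acting)
    return {osd: n / replicas for osd, n in copies.items()}


def balance_primaries(pgs, primary):
    """Greedy pass over one pool.

    pgs:     pgid -> list of OSD ids holding the PG
    primary: pgid -> OSD id currently acting as primary
    Returns (changes, new_primary); changes is a list of (pgid, new_osd).
    Only OSDs already in the PG are candidates, so no data would move.
    """
    desired = calc_desired_primaries(pgs)
    current = Counter(primary.values())
    new_primary = dict(primary)
    changes = []

    def excess(osd):
        return current[osd] - desired[osd]

    for pgid in sorted(pgs):
        src = new_primary[pgid]
        dst = min(pgs[pgid], key=excess)  # least loaded member of this PG
        if excess(src) - excess(dst) > 1:  # switching strictly improves balance
            current[src] -= 1
            current[dst] += 1
            new_primary[pgid] = dst
            changes.append((pgid, dst))
    return changes, new_primary


# Six PGs replicated on OSDs 0, 1, 2; OSD 0 starts as primary for four of them.
pgs = {pgid: [0, 1, 2] for pgid in range(6)}
primary = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 2}
changes, new_primary = balance_primaries(pgs, primary)
print(changes)
```

Swapping in a different policy, for example one that weights devices by size or by an observed read/write ratio, would only mean replacing `calc_desired_primaries`; the switching pass stays the same.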
So we have the very simple read balancer. Its use case is small clusters, which is a very important use case for Red Hat, now IBM, for ODF, that is, Ceph in OpenShift. But there is much more that we could do. So now let's talk about what's next; we have two use cases. What more could we do? We have a mechanism to change things, it's a pretty strong mechanism, and we want to use it further. So one thing, and this is a use case for larger clusters, is to load-balance better on heterogeneous systems. Say you have a large Ceph cluster. You started five years ago with one-terabyte devices. Time passes, you need more capacity, it doesn't make sense to buy one-terabyte devices anymore, and you buy two-terabyte devices. Now, normally that means you get twice the workload on the two-terabyte devices, because you need the devices to be filled evenly. You could create policies that change this gradually, so that while the smaller devices are not full, you put the same amount on each. It doesn't work, because every change is a movement of data and it's very expensive. So in theory you could create really nice policies; in practice, the price you pay to maintain them is too high. Eventually you have a device which is twice as large, and either it gets twice the capacity, or you define it as a smaller device and lose capacity. And if it gets twice the capacity, it gets twice the load, and the smaller devices are working at half their potential load. So just for the numbers, I have a little exercise. We have devices of the same technology, same bandwidth, same IOPS. We have one OSD of two terabytes and four OSDs of one terabyte. So it's like three OSDs of two terabytes if you think about capacity: one of two terabytes, four of one terabyte. For each PG, we have one copy on the large device and two copies on the small devices.
So that's the assumption, just to play a bit with the numbers. And for convenience, let's assume that each device can do 100 IOPS; people like to think in round numbers. So under full load, what happens? This device goes to 100 IOPS because that's what it can provide. And these devices get half the load, so each of them provides 50 IOPS. So in total we have 300 IOPS. And there is nothing we can actually do about it, because the requirement to have twice the capacity here than here is so strong that we can't change it or play with it. Those are the rules of the game. Once we did this, we are bound to such a configuration. Actually, we had a working cluster, we added one larger disk, and the whole cluster performs way, way worse than it used to. So there isn't really anything we can do about it except replace all the disks with larger disks or something like this. But if you want to do it gradually, there is something that we could do. Let's assume for now that we have just a read-only load. The capacity is still the same: 30 PGs here and 15 on each of these. But the 10 primaries here, I moved primaries away from here and split them across these. So when you have only reads, all the devices are fully working under full load. So we moved from 300 IOPS to 500 IOPS with just a very minor change: changing primaries and splitting them differently across the cluster. Now, this is the best that we could get. If we have a write-only workload, obviously, we can't improve using this technique, because this technique changes only reads. And if we have some writes, we can't get as good as this, because this device will always get twice the writes of the others. So there is a limitation to what we could do. The best potential is for read-only, but we could also do something with mixed read-write. We could get a lot of improvement under full load, a lot.
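The 300 versus 500 IOPS figures follow from a short calculation, sketched below under the talk's assumptions: a read-only load, reads served only by primaries, and 100 IOPS per device. The per-OSD primary counts are inferred rather than quoted: 10 on the large device and 5 on each small one by default (one third of the copies each holds), and 6 on every device after rebalancing, which is the only split of 30 primaries that keeps all five devices equally busy. The function name `max_read_iops` is hypothetical.

```python
def max_read_iops(primaries, iops_per_osd=100):
    """Cluster-wide read IOPS at the point where the first OSD saturates.

    An OSD that is primary for n of the `total` PGs serves n/total of all
    reads, so it caps the cluster at iops_per_osd * total / n.
    """
    total = sum(primaries.values())
    return min(iops_per_osd * total / n for n in primaries.values() if n > 0)


# Default: primaries follow capacity, so the 2 TB device has twice as many.
default = {"big": 10, "small1": 5, "small2": 5, "small3": 5, "small4": 5}
# Rebalanced: same data placement, primaries spread evenly over all devices.
rebalanced = {"big": 6, "small1": 6, "small2": 6, "small3": 6, "small4": 6}

print(max_read_iops(default), max_read_iops(rebalanced))
```

In the default case the large device hits 100 IOPS while each small one is only at 50, which is where the 300 total comes from.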
So that's the idea, and this is already planned for the next version. We deliberately didn't put it in the first version because we want adoption of the feature first, but we plan to add two steps in the next version. This one is better support for devices of different sizes. So, can we improve on other loads which are not read-only? I already said this: we can, but in order to do a good job, we need to understand some characteristics of the workload per pool, basically the read-write ratio. If we get a read-write ratio which is reasonably close to what we actually have, then we could get a good improvement in performance when we have devices of different sizes and a mixed workload. Well, as I said, there are limitations to what we could do. I said one thing: we can't improve on write-only using this technique. Another one: if you use replica three and, instead of having one-terabyte and two-terabyte devices, you have one-terabyte and five-terabyte devices. I've seen this in real life, so it's a real thing. Well, we can improve, but we can't get to optimal performance when we have five-terabyte devices and one-terabyte devices in the same pool; the numbers are too far apart. We could reduce the load to the minimum by moving all reads off the five-terabyte devices, but eventually, when you have enough writes, the system would not perform optimally. So, we covered this. Another case is, well, a no-no, a big no-no: don't put SSDs and HDDs in the same pool, everyone knows this. Well, if you can make sure that you have enough SSDs so that every PG is mapped to at least one SSD and then to HDDs, preferably replica three with one SSD and two HDDs, you could actually get the effect of a flash read cache without cache misses. You always read from the SSDs, and only the writes have to go to the HDDs.
You could improve the performance of your system and get really fast read latencies. So I'm breaking this no-no, but it is important: don't mix the technologies if you can't make sure that all the reads are from the faster device. So if you take replica three, with one-third of the capacity on SSDs and two-thirds on HDDs, and you also make sure, with whatever technique you use, and it's really easy, you need to change the CRUSH rules a bit, that every PG has one copy on SSD and the other copies on HDDs, you could actually improve your performance dramatically, because the bottleneck of the HDDs would apply only to writes and all the reads would get SSD or NVMe or whatever performance. So I'm breaking one of the big no-nos here, but if you can't make sure that all the reads are from the fast devices, then it's a waste; it wouldn't work. Eventually, under full load, you'll get the well-known weakest link in the chain and it would be blocked. But if you can do it, this is a way to modernize the devices gradually, not moving all the HDDs to SSDs at once. One-third, in the case of replica three, could be the first step, you could do it gradually, and you don't need to do anything else. So that's another thing that could be done with what exists. By the way, for this, because it's a different technique, you don't need what we did in the read balancer. It's enough to have good CRUSH rules; this could be managed by CRUSH rules alone. Okay, now we come to the dynamic aspect of this. So the thing is that we build a cluster, we have the numbers, we know how the cluster performs, what the network bandwidths are, how the devices perform. It all works well until it doesn't. We may have hardware problems, and we may have noisy neighbors where we run. As I said, full isolation from neighbors has a cost.
If you prevent over-provisioning at all costs, well, it has a cost: depending on your workloads, you allocate a lot for temporary workloads. So in some cases, over-provisioning nodes makes sense if you know how to behave. And this is especially true in hyperconverged deployments. In hyperconverged deployments with noisy neighbors, for example in Kubernetes, we know how to limit, and tend to limit, CPU and memory, but not network. Obviously, for software-defined storage systems, the network is really important, so a noisy neighbor on the network could cause huge performance problems. So I want to explain the process, because it's more than just a technical thing. We want to monitor the I/O performance from the OSD level and up. We want to identify what happens. It could be at the OSD level, at the node level, or at the rack level. We need to understand what happens. Then we could tune the system and reduce the load. If the problem is temporary, we don't want to move data. Even if the problem is that we have a faulty NIC, and we know that it would take 24 hours to fix it, it may not be worth the effort to move all the data from this node to the rest of the cluster and then move it back. If we could make sure that, until the faulty NIC is fixed, we don't read from this node at all and just write to it, maybe we could live with it. So the obvious solution is to mark the OSDs out, move the data, and everything fixes itself, but that's costly, especially if you have a lot of data on the node. So this is a way to reduce the load temporarily until the problem is fixed. And then, once you've done all your magic and fixed everything, you could go back to normal. And by the way, this is not specific to software-defined storage, but all of this is much easier to do for a stateless application.
If we have a web server that gives us stock exchange quotes, and one of the servers doesn't function, we change the proxy and we send the requests to other servers. It's way more difficult to do with a stateful application, where you can't exchange every server with every other server and there are more limitations, and obviously we are talking about storage, which is the most stateful thing that you could think of. So, option one, it's something we thought about even for the read balancer we did. This is a very good solution for the stateless case; it's called the power of two. Before you send the request, you randomly select two candidates to get the request, you find out which one is more loaded and you send to the one that is less loaded. It does magic. That's a really, really good way to move the load from the loaded servers to the unloaded servers, and it works. Unfortunately, in order to do such a thing in Ceph, you need to go into the data path. Everything that I explained up until now is totally outside of the data path. You would have to add things to the data path and change the clients. We thought about this also for the read balancing, where it would be very simple. Since we can now read from non-primary OSDs in Ceph, we could say for every PG, don't send the request to the primary, send it randomly to any of the PGs, and automatically you'll get a balanced spread. Sorry, for each PG, to any of its OSDs: don't send to the primary OSD, just send to whichever OSD you have there. We decided not to do it; it's risky and we don't like to play with the data path. I don't like to play with the data path personally, but it was a mutual decision, not only mine. So option two is to monitor centrally and, obviously, create the policy. You need a policy function that knows what to do when you find these discrepancies. 
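To make the "power of two" idea mentioned above concrete, here is a minimal Python sketch. It is a toy simulation for the stateless case only, not anything from Ceph: the function names (`pick_two_choices`, `pick_random`, `simulate`) and the use of a simple request counter as the "load" are assumptions made for illustration.

```python
import random


def pick_two_choices(loads, rng):
    """Power of two choices: sample two distinct servers, return the less loaded one."""
    a, b = rng.sample(range(len(loads)), 2)
    return a if loads[a] <= loads[b] else b


def pick_random(loads, rng):
    """Baseline: send the request to one uniformly random server."""
    return rng.randrange(len(loads))


def simulate(num_servers, num_requests, chooser, seed=0):
    """Assign requests one by one and return the request count per server."""
    rng = random.Random(seed)
    loads = [0] * num_servers
    for _ in range(num_requests):
        loads[chooser(loads, rng)] += 1
    return loads


if __name__ == "__main__":
    for name, chooser in (("random", pick_random), ("two choices", pick_two_choices)):
        loads = simulate(16, 16000, chooser)
        print(f"{name}: busiest={max(loads)} idlest={min(loads)}")
```

Running the script prints the busiest and idlest server for both strategies, which shows how comparing just two candidates per request narrows the gap. As the talk notes, doing the same inside a storage system would mean touching the data path and the clients, which this sketch does not model.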
You need to understand what you're doing, what you want to achieve, how much time you want to play with this before you decide to move data, or maybe you need human involvement, someone who will tell you, okay, I'm going to fix this, don't do anything. It's a policy you need to define both in terms of workflow and then in what you need to program for it. The policy function is small, we talked about this, it's nothing. And when the performance changes, first you need to notify the operator, because we suspect that a lot of the problems could be hardware problems that should be fixed, and we need to tell someone that we see something bad. And then change the primary settings so we move the load from the less performant OSDs or components to other places. And keep monitoring the system, and when it's back to normal, we could remove everything and go back to the full settings. So here is the conclusion. Again, it's inside the data path versus outside of the data path: syncing the metadata, the performance, to the clients, which is something that we didn't want to do, versus the external metadata configuration that we do on the server side, because we trigger the policy from the server side with no change to the clients. So again, this is about how to implement this in the future, but the idea is that if we measure, if we know what's going on, we could improve, and decide at some point whether this improvement is good enough for us until we fix the problem, or in some cases decide to move data. I don't say don't move data, but don't move it as the default option for every problem. Acknowledgements: Orit, who worked with me a lot on the ideas that I put here, and she's now at IBM. And Laura, who did a lot of the coding with me on this, and actually, since my coding skills were a bit rusty, I couldn't have done a lot without her. So thanks to Orit and Laura, who helped in this project, and I'm done. Questions? Yes, please. 
Can you use the new osdmaptool that is released for Reef, or, as I said, on master? Can you use that to generate a distribution of the primaries that would be optimal on older clusters, and use the primary temp feature that already exists in all the releases to actually apply that optimal policy? Okay. Basically, I'd like to backport it, but without backporting. The question was whether you could use the osdmaptool on older clusters, and then, instead of using the new PG upmap, use primary temp in order to do it. I think you should have a new Ceph cluster with all the new tools, because there are some other changes, but the osdmaptool works on a configuration file that you take from the cluster. So you could create a configuration file out of an old cluster and run it with the new osdmaptool. And actually I think primary temp is how we tested this. What is missing is that the new osdmaptool relies on the score. Actually I'm not sure; it doesn't depend on the score, but it uses the score internally. So it should run in a new environment, but I think it should be able to work. I have my information here, or I'll give you my email; send me a message and I'll verify this. It should work. I'm not 100% sure, but it should work. If it doesn't work, it is a problem. You said that it's especially good for smaller clusters. How big would you define smaller clusters? So, the question was, what is a small cluster and what is a big cluster? The thing is, the way CRUSH works, it uses statistics to do the split into primaries. If you have a cluster in which primaries are not balanced, it probably falls on the smaller side. But even if you look at hundreds of OSDs, it's the number of PGs that matters, not the number of OSDs. In the past, I did an experiment, and you saw the score here, 1.4. 
Every time you double the number of PGs, the difference from 1 roughly goes down by half. So if we got 1.4 for 32 PGs, it's probably around 1.2 for 64. So large clusters, with pools with a large number of PGs, usually somehow balance themselves. But you need to look: if you have unbalanced pools, you have unbalanced pools. It doesn't matter which cluster you are in. That's why we do it. But the most benefit for the larger pools, the pools with the data, is when you have smaller clusters, where it's not worth putting 512 PGs per pool and you work with smaller numbers. If you have 510K or 2K PGs per cluster, probably your score would be pretty good. Would it also be useful for erasure coded pools? The question is whether it's good for erasure coded pools. Probably not. Well, it's way, way, way less efficient. I tried to work out the theory behind it. Maybe it helps really, really little, I don't know. So currently we check in the code, and it doesn't work for erasure coded pools at all, because we think it isn't worth the hassle. So it's tested and it works only on replicated pools. Okay, my time is up. Thank you very much. It was a pleasure being face to face here. Thank you. |
s3gw: easy to use S3-compatible gateway for small and edge deployments |
So, I'm João. I work at SUSE, on the storage team. I used to work on Ceph. Our storage team used to work on Ceph, but due to restructuring, that's no longer the case. I'm going to talk to you about one of our latest projects, the S3 Gateway, and mostly I'm going to tell you why we're doing this. What is the S3 Gateway? How are we doing it? Hopefully there will be a demo, and then the next steps and what's ahead of us. So, why are we doing this? Essentially, after our product was restructured, we needed to find something else to work on, and one of the ideas we had was to figure out a way to provide, I'm lacking the word, something that was lacking in the SUSE Rancher portfolio, which is basically an S3 service for local storage within a Kubernetes cluster. What we aimed at was something easy to deploy, easily forgettable, and that just works, ideally for local workloads, not something that is complex to manage. We wanted something as light as possible, and that's what eventually became the S3 Gateway. It's an open source project, as usual. It's driven by our team at SUSE, and ideally it will be an easy-to-use project that you just deploy on a Kubernetes cluster and it will just provide you an S3 service within your cluster. I say ideally because this has six months' worth of development, and there are a lot of things still lacking, far more than I would actually like, but that's just how life is. It complements the Rancher portfolio, as I mentioned, but this was not necessarily the main driver when developing the S3 Gateway. It just happens to fit nicely within the stack. 
It helps with backups of local Longhorn volumes and backups of other stuff within the stack. One of our main criteria initially was that we would serve our storage from any PVC that a Kubernetes cluster could provide, and this just happens to be nice because, given that Longhorn allows us to put stuff on a Longhorn persistent volume, Longhorn will deal with all the nasty things like replication and whatnot so that we don't have to. Of course, we wanted a pretty UI for all of the operations and management, which is still ongoing, but we'll get there. How we're doing this, though, is basically by using the RADOS Gateway from Ceph. We didn't want to start something from scratch because we thought it would be a waste of time and resources, so we decided to leverage the RADOS Gateway from Ceph, which is quite amazing because it can be run standalone. Given the already built-in Zipper layer, about which there is a talk next, I think, we basically just had to create a new backend that is file-based, on which our data is stored instead of, say, the RADOS store. Hence, we don't need a whole Ceph cluster; we just need the binary running standalone. Essentially, as I was saying, we have the RADOS Gateway consuming a file system, whatever that file system is on. We keep a SQLite database for the object metadata and a directory hierarchy for the data. We decided to do this so that all the things that can be indexed and kept abstract could be kept as metadata in SQLite, which allows us to search and index things more easily, instead of having to go through a directory hierarchy to find, for instance, buckets and whatnot. So buckets are essentially a mapping of a name to a UUID, and objects as well end up being entries in a database that associate the bucket name and the object name with a UUID. 
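The metadata mapping just described, together with the UUID-based directory layout the talk turns to next, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names, column names, the two-level fan-out and every function here are hypothetical and are not the S3 Gateway's actual schema or code.

```python
import sqlite3
import uuid
from pathlib import PurePosixPath

# Hypothetical schema: names map to UUIDs, nothing more.
SCHEMA = """
CREATE TABLE buckets (name TEXT PRIMARY KEY, bucket_id TEXT NOT NULL);
CREATE TABLE objects (
    bucket_id TEXT NOT NULL,
    name      TEXT NOT NULL,
    object_id TEXT NOT NULL,
    PRIMARY KEY (bucket_id, name)
);
"""


def open_store():
    """Create an in-memory metadata database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def create_bucket(db, name):
    """Map a bucket name to a freshly generated UUID."""
    bucket_id = str(uuid.uuid4())
    db.execute("INSERT INTO buckets VALUES (?, ?)", (name, bucket_id))
    return bucket_id


def put_object(db, bucket, key):
    """Record (bucket, object name) -> UUID; the data itself lives on disk."""
    row = db.execute(
        "SELECT bucket_id FROM buckets WHERE name = ?", (bucket,)
    ).fetchone()
    if row is None:
        raise KeyError(bucket)
    object_id = str(uuid.uuid4())
    db.execute(
        "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)", (row[0], key, object_id)
    )
    return object_id


def data_path(object_id, depth=2):
    """Derive a relative path from the first bytes of the UUID (assumed fan-out)."""
    hexid = uuid.UUID(object_id).hex
    parts = [hexid[2 * i:2 * i + 2] for i in range(depth)]
    return PurePosixPath(*parts, hexid)


def list_objects(db, bucket):
    """List a bucket from metadata alone, without walking any directory."""
    rows = db.execute(
        "SELECT o.name FROM objects o JOIN buckets b ON o.bucket_id = b.bucket_id "
        "WHERE b.name = ? ORDER BY o.name",
        (bucket,),
    )
    return [r[0] for r in rows]
```

The point of the design, as far as the talk explains it, is visible in `list_objects`: listing a bucket is a database query, so the on-disk directories never need to be scanned, and objects from one large bucket are spread across many directories by their UUID prefix.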
The data for the objects, though, will be based on the UUID: we grab the UUID and create a directory hierarchy based on the first bytes of the UUID. The reasoning behind this was mostly that, given that some buckets typically tend to grow larger than other buckets, if we were creating a directory hierarchy that was per bucket, per bucket name, we could end up with very large directories. This way, we kind of spread the objects around, and even if we end up with larger directories, we don't have to list the directories themselves to find where the objects are or which objects are within those directories. We just have that stuff in the metadata tables within SQLite. Now, this is not pretty, I admit it, but it's the best I could do. This is roughly what the S3 Gateway looks like when deployed on a Kubernetes cluster. We are essentially deploying two containers, one for the back end, which is the RADOS Gateway, and another one for the UI. We deploy our store on whatever is supplying us with storage. In this case, this slide has Longhorn on it, which will deal with all the replication and availability and whatnot for us so that we don't have to care about this, but this runs locally as well. I'm running this on my Synology NAS; I just have a volume that is exported to the back end. It really doesn't care about whatever is providing the file system to it. The UI speaks directly with the back end, and the user just consumes the S3 Gateway, if outside of the cluster, through an ingress, and if inside the cluster, through magic that I don't understand very well. I promised a demo, which might work or might not. It has been working 50% of the time on my computer, so let me see if I can get this to... not this. So, you're not seeing anything, of course not. How do I, Jan, how do I... So, ideally what we'll see is an already deployed K3s, which is running things, but it's not running the S3 Gateway yet. 
What I'm going to do is deploy with our chart, which usually works for people. My laptop is being difficult, so let me just see if I remember the Helm command. So, basically we install this using the default values, and supposedly we will have the UI available at this URL here, not this one. Now, there are still a lot of kinks to figure out with this stuff. We have the UI running here, but the UI right now is unable to talk with the back end, and this is because of certificates. Because we are using a self-signed certificate that the browser does not trust, and the browser is talking directly to the back end through the UI, there is no actual daemon in between the UI and the back end. What happens is that we have to go to the back end and accept the certificate, which is hilarious, and once this is done, we can log in to the UI. So, right now the UI is still under heavy development, and the UI cannot do a lot more than the back end does. We are still lacking a bunch of things in the back end driver, but I just want to show you that if we do things against the back end, they will actually show up in the front end. So, let me see. This is where this stuff is. It has three commands. If I put a one gigabyte object into a bucket, which I actually need to create first... if I create a bucket foo on the back end, it will actually show on the front end, which is expected, and it would suck if it didn't work. Putting an object there will also allow us to do some exploration. This is a big object, so you can see that multipart upload actually works. I'm very proud of this part. It should be done to some extent. So, if we explore bucket foo, we have a one gigabyte object here that we could also download, and hopefully that works, or maybe it doesn't because of my... but it should be downloadable. I think something is blocking my requests. Anyway, that was about it for the demo, but if we turn on the administration view, we could also technically manage the users, I think. 
It's just been difficult. Oh yeah, and I got the object downloading now. Amazing. But yeah, we could create new users. We still don't support user quotas, bucket quotas, stuff like that. All of this is still very much work in progress. Creating buckets can also be done via the UI. We could enable versioning. Versioning is already supported. I'm just not doing that because I don't remember how to demo that part. We have tests for that stuff. Okay, so this is as far as the demo goes. Let me just list the buckets here so that you see that a bucket bar has been created, and we have a bucket bar over there. That's thrilling. Okay, let me just go back to the presentation. Slideshow, from current slide. Okay, so next steps. For now our roadmap is to increase the number of operations that we actually support, because RGW basically supports everything that exists, but then the driver behind it needs to comply with the expected semantics: you request data from the backend, and the backend should return the appropriate data so that the client is able to perform the operations. And we've been doing this gradually. There have been some challenges there. We are in the process of implementing lifecycle management for buckets and retention policies. The performance currently is far from ideal, but we're working on that. And I really want statistics on the UI. I mean, exposing as much information that is useful to the user through the UI is something that is very much on my to-do list, not necessarily on the to-do list for the project, but that's another thing. In terms of the challenges that we currently face, one is semantic compliance with the S3 API. As I was saying, we've been having some challenges ensuring that our driver provides the right information when processing the operations that are requested by RGW. 
Fortunately, there is an amazing project called s3-tests within the Ceph repository, which covers pretty much all of the API as far as we understand. And it's very good for ensuring that we are in compliance with the API. Then we have performance. This has been a big learning curve, especially with SQLite. There have been decisions made during the implementation of the project, especially surrounding mutexes, that bit us eventually. But fortunately, we have Marcel looking into this stuff. And that's one of the things that we have here: a comparison of our previous performance with the performance that comes with some of the work Marcel has been doing. It may not look like a lot, especially when we compare with FIO. But I mean, he fiddled with a few mutexes and we got a significant speedup on, let's say, a machine that is far from current. We also believe that we are CPU bound to some extent and not taking full advantage of the IO path. But that's neither here nor there. What matters is that we are making gradual performance improvements that will eventually pay off. This is just the latency distribution. The YOLO branch is what Marcel called that branch, mostly because it removed a bunch of mutexes. What we see here is that to some extent the latencies dropped a bit for put, even though they increased significantly for get. But we also believe that, given that we have more operations in flight, we may actually be CPU bound, and the operations may not be finishing because of concurrency issues and whatnot. After that, eventually world domination, I think, but probably not. But that's the hope. And that's it. If you want to find us, we are at s3gw.io. And that's about it. Thank you for enduring my presentation. Thank you. Any questions? No? Awesome sauce. Great. Thank you. |
Ceph RGW and Zipper
Alternative Storage Backends for S3 and Swift Object Storage |
Okay, everybody, let's give it a hand for Calib and another of our moderators. Thank you. So, yeah, my name is Caleb, or Calib, as the case may be. My colleague, Dan Gryniewicz, could not be here. He was going to be a co-presenter, but with a big reorganization at work and the budget, we couldn't get the budget for everybody to come. João gave me a great introduction in the previous talk. Now I'm going to dive in a lot deeper on the rest of what's in the RADOS Gateway, RGW. So I'll start with a brief overview of what Ceph is and what the pieces are, and then I'll drill down deeper into Ceph. I know the acoustics in this room are pretty bad. Can you guys all the way in the back hear me? Good. I tend to be pretty loud anyway. So what is Ceph? Ceph is fundamentally an object storage system. The basis of Ceph is something called RADOS, which stands for Reliable Autonomic Distributed Object Store. I think that's a backronym. I don't think it's very great, but it rolls off the tongue, so we'll go with it. The RADOS layer talks to object storage daemons. I don't show these on this slide, but they would be down here, and the object storage daemons would have some kind of media, hard disk drives, SSDs, or NVMe storage, to persist the data. There's a bunch of layers on top, so applications can talk using the librados API. It's not a standard API, it's the Ceph object API. I'm not aware of how many people actually use librados. We'll come back to the RADOS Gateway in a minute. Consumers of block storage, like virtual machines and virtualization, would use RBD to talk to librados and then RADOS. CephFS is the native POSIX-like file storage API. There's native kernel support for CephFS, or you can use a FUSE mount. But the piece I'm really here to talk about is the RADOS Gateway, or RGW, as we like to call it. This diagram is not really accurate. We could split this in half. 
On one side we have the RGW daemon, the RADOS Gateway daemon, which listens on its own port and speaks the S3 and Swift protocols. And then the RADOS Gateway talks to librados and down to the storage. The other piece of the RADOS Gateway is something called librgw. This is actually an in-process server in a library that you can link into an application and then initiate, and it will run its own RADOS Gateway talking down through the stack and ultimately to your Ceph storage. If you have questions, just raise them immediately. We'll do a Q&A at the end. So the RADOS Gateway has right now two basic front ends, Swift and S3, S3 being the one that most people are using. This is some slideware that I stole off a Ceph slide deck. My boss said this is a terrible slide when he reviewed this. So for the last three years we've been engaged in a side project called Zipper. The idea was that we were going to unzip the implementation and break it out. So what does that really mean? We have the ability to provide the S3 protocol capabilities on Ceph storage, but Ceph is just one example of storage that people might be using, and we thought it would be interesting and valuable to be able to plug in different storage underneath besides just Ceph RADOS storage. So we have some proof-of-concept and preliminary work using Intel's DAOS and Seagate's Motr, which I think is the other name for CORTX. We have a proof of concept plugging in a MySQL database. In order to do this we wanted to provide a flexible API on the inside of RGW that anybody could use to write their own back-end storage. So as João was saying, Rancher at SUSE is using this for their product, and, like I said, we have Intel and Seagate doing some more things. Why do we want to do this? As I already alluded... I didn't already allude to this. 
So the Ceph RGW implementation, we think, is one of the most complete S3 implementations outside of Amazon AWS. I don't know if that's right or not, we can debate that, but we do have people telling us that it's one of the best S3 implementations, one of the most complete. We have other companies, other interested parties, telling us that they wanted to leverage our implementation somehow, and we wanted to provide the ability for developers to leverage that and plug in their own storage. What I did allude to is that I called it unzip, where we were going to basically unwrap or unzip our implementation and make it consumable for other consumers. We've developed a plug-in model; we call it the Zipper model. For anybody that's familiar with the VFS layer in the Linux kernel, this is a similar concept. Or if you've seen NFS-Ganesha, the FSAL, which stands for file system abstraction layer, has plug-ins that let you plug in different backend storage for the NFS server. So we're doing the same thing: we're basically putting a plug-in architecture at the bottom of the RADOS Gateway, and we're collaborating with other people. One of the things we're going to do with this is add the ability to stack these plug-ins. The bottom of the stack would be something like a RADOS or a DAOS backend, but above that you could stack filters that could do arbitrary things, like send your data off to be indexed or compressed, or put it into a faster look-aside cache, and possibly do some cache tiering sorts of things. So the API is intended to allow the stacking of filters above the backend storage. 
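The stacking idea described above, a store at the bottom with filters layered over it, can be illustrated with a small Python sketch. The real implementation is C++ inside RGW; everything here (the `Driver`, `MemoryStore`, `Filter`, `CompressFilter` and `CacheFilter` names, and the two-method `put`/`get` interface) is a hypothetical stand-in chosen for illustration, not the Zipper API.

```python
from abc import ABC, abstractmethod
import zlib


class Driver(ABC):
    """Hypothetical two-method interface shared by stores and filters."""

    @abstractmethod
    def put(self, bucket, key, data): ...

    @abstractmethod
    def get(self, bucket, key): ...


class MemoryStore(Driver):
    """Bottom of the stack: the only layer that actually persists data."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket, key, data):
        self._objects[(bucket, key)] = data

    def get(self, bucket, key):
        return self._objects[(bucket, key)]


class Filter(Driver):
    """Pass-through layer: calls go down to the next driver, results come back up."""

    def __init__(self, next_driver):
        self.next = next_driver

    def put(self, bucket, key, data):
        self.next.put(bucket, key, data)

    def get(self, bucket, key):
        return self.next.get(bucket, key)


class CompressFilter(Filter):
    """Compress on the way down, decompress on the way back up."""

    def put(self, bucket, key, data):
        self.next.put(bucket, key, zlib.compress(data))

    def get(self, bucket, key):
        return zlib.decompress(self.next.get(bucket, key))


class CacheFilter(Filter):
    """Write-through look-aside cache in front of the layers below."""

    def __init__(self, next_driver):
        super().__init__(next_driver)
        self._cache = {}
        self.hits = 0

    def put(self, bucket, key, data):
        self._cache[(bucket, key)] = data
        self.next.put(bucket, key, data)

    def get(self, bucket, key):
        if (bucket, key) in self._cache:
            self.hits += 1
            return self._cache[(bucket, key)]
        data = self.next.get(bucket, key)
        self._cache[(bucket, key)] = data
        return data
```

A stack is then built by wrapping, for example `CacheFilter(CompressFilter(MemoryStore()))`. Each filter only knows the layer directly beneath it, so calls move strictly down and results strictly back up.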
And what this looks like in pictures: in the existing RGW, up through what's currently shipping, which is Quincy, or Ceph 17, we have a monolithic, not modular, implementation of RGW that strictly speaks librados. What we're doing is breaking it up. We've spent the last three years refactoring it, cleaning up the internal APIs and adding the ability to write individual plug-ins that are loaded as shared libraries based on what a configuration file tells it. So, as I said, our first plug-in, the natural one of course, would be a RADOS driver. We also have a real proof-of-concept DB store that goes to a MySQL database. We are also working with the Mass Open... Mass... I'm looking at Niels like he knows; he doesn't know. Boston University and Northeastern University are collaborating with Red Hat on something called the Mass-something initiative. If you really want to know, you can ask me later and I'll figure out what it is. And they have been taking something called D3N and expanding it into a new feature called D4N, which is a look-aside cache that helps improve performance. So these are all really in there right now, although they're not in there as plug-ins; they're still in that monolithic implementation, but that's what it would look like. Another way to look at this: we have the protocol interface at the top, where we have S3 and Swift, and something else we have in the pipeline is Apache Arrow Flight as a front end. These front ends are not pluggable; I'm just mentioning them here for completeness. We have the operational core implementation, we have APIs for bucket, object, user, store, and lifecycle, these get routed through the plug-ins, and then at the bottom we have these plug-in layers that talk to the back-end storage. 
So I'm going to show you some code. I'm not going to dive too deep into it, but I did want to at least show you. The implementation is written in C++. We start with a pure virtual base class, which has the definition of the interface. We go from there to a small lightweight stub class with some basic implementation. And before I get ahead of myself, here's what the pure virtual base class starts to look like. We call this class the driver class. It has a bunch of virtual methods that get filled in by the stubs and the actual implementation. Drilling down deeper, our stub class is this store driver, based on driver; that is literally, I think, the whole class declaration. And then finally, the RADOS store class, where we plug in the actual implementations that are needed to talk to the RADOS store. The DAOS and the Motr implementations look very similar, but obviously they have different class names and they each have their own methods to fill in for the virtual methods. This is just a summary of all the different things that are in the API. There's basically a main entry point: the RADOS Gateway reads a config file to determine which class it's going to instantiate, it does a dlopen and a dlsym to get the entry point to the library, and this creates an instance of the RADOS driver class, or the RADOS store class. Once we have a pointer to this instance, we can start invoking the virtual methods in the APIs. So we have APIs for user, bucket, object, multipart upload, et cetera, et cetera. The store, as I already alluded, is a back-end storage implementation, so it sits at the bottom of the stack, and we have a base implementation defined. 
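The class layering and config-driven instantiation described above can be mimicked in Python as a rough sketch. The talk's code is C++ and uses dlopen and dlsym to load a shared library; here a plain dictionary registry stands in for that step. All names (`Driver`, `StoreDriver`, `DictStore`, `BACKENDS`, `load_driver`, the `"backend"` config key) and the three methods are assumptions for illustration, not the actual RGW classes or signatures.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Interface only, playing the role of the pure virtual base class."""

    @abstractmethod
    def get_name(self): ...

    @abstractmethod
    def create_bucket(self, name): ...

    @abstractmethod
    def list_buckets(self): ...


class StoreDriver(Driver):
    """Thin stub: supplies shared defaults, leaves storage methods abstract."""

    def get_name(self):
        return type(self).__name__


class DictStore(StoreDriver):
    """Concrete backend; a real one would talk to RADOS, a database, and so on."""

    def __init__(self):
        self._buckets = set()

    def create_bucket(self, name):
        self._buckets.add(name)

    def list_buckets(self):
        return sorted(self._buckets)


# Stand-in for dlopen/dlsym: map a config value to a factory (hypothetical).
BACKENDS = {"dict": DictStore}


def load_driver(config):
    """Pick and instantiate the backend named in the configuration."""
    backend = config["backend"]
    try:
        factory = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend}") from None
    return factory()
```

Once `load_driver` returns, the caller only ever holds a `Driver` and invokes its methods, which is the same shape as calling virtual methods through a base-class pointer. Trying to instantiate `StoreDriver` directly fails, because its storage methods are still abstract.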
Filter, as I was mentioning earlier, is an optional partial implementation where you can do some kind of filtering or caching or other modifications on the data as it goes by to the store. What Zipper is and isn't: Zipper is a way to provide S3 or Swift object storage on top of whatever your storage is. Your storage could be a database, it could be RADOS, it could be anything you can think of. It's a way to transform and modify stuff on the way by, so caching, namespace division, tracing. It's a strict layering system: calls move down the stack and returns move back up. It's not a... it's not an acyclic graph. What is Zipper not? It's not a communication bus; you can't communicate between these components. It's not a graph; there are no sideways calls across the bottom of these. And, I'm not sure what I meant by it, there's no front end or part of the front end. Zipper isn't part of the front end, so when we talk about S3 or Swift or Arrow Flight, those really aren't part of Zipper; those are going on in parallel. The Arrow Flight implementation should land in Ceph 18, Reef, but that's not really what Zipper is. There's also Lua scripting coming in Reef. So where are we? We've actually been working on this for the better part of three years. Mostly this has consisted of refactoring the current implementation so that we can actually accomplish this. The existing monolithic implementation was really, well, I won't say anything bad about it. It was not very good; it needed a lot of cleanup. As I said, there's a prototype proof-of-concept DB store that we actually have working in the monolithic implementation. 
We have a trace driver, we have ongoing work in this D4N project, which will ultimately be a filter plugin, object on file, and a Lua scripting layer, so we could have a Lua filter plugin that would allow you to write your own Lua scripts to do things to the data as it transitions up and/or down the graph. This is ongoing within Ceph; it's a part of upstream Ceph, as is the rest of RGW. RGW is in the GitHub repository. You can reach Daniel or myself at these addresses, and yeah, I think that's the gist of it. Any questions? I allowed 30 minutes and I just did it in 15. No questions? Yeah, yes. With the file backend or even the DB backend, is it possible to run it locally on a laptop too? You could run it locally on a laptop, yeah. The DB is a file. We've talked about doing a separate, pure file backend or Zipper plugin instead of a database, but the database actually gives us everything we want and has some good semantics. I guess I could ask you that as well. Oh, I'm supposed to repeat the question. So the first question was, could you do a file backend, and I just answered that, and sorry, the... The plugin is currently written with MySQL, but you could probably pretty easily refactor it to use any database you want. No other questions? Thanks so much. Thank you. |
Operating Ceph from Ceph Dashboard
Past, Present and Future |
Hello everyone, I am Nizamuddin and I am with Ankush, and we are here to cover a little about operating Ceph from the Ceph dashboard: past, present and future. Since we couldn't give the presentation live, we are recording this separately. This is a two-part presentation; the second part will be covered by Ankush. Part one will include the introduction, why and who needs a dashboard, the history, the architecture of the dashboard, the key features that we delivered in Quincy, then the roadmap to the Reef release, and I'll try to cover a short demo as well. So just a brief intro on myself: I started my career at Red Hat as an intern. At the time I was working with Rook and the OCS operator. Later on I became an associate software engineer and got assigned to the Ceph dashboard. I've been on the Ceph dashboard team ever since, and last year I became the component lead of the Ceph dashboard. Recently we got moved over to IBM as part of an internal transition, and I am now working as a software engineer at IBM. So we all know what a dashboard is, right? What I want to reiterate over here is why the Ceph dashboard is not just a dashboard for Ceph. The Ceph dashboard is not a tool that is limited to monitoring only; it is a fully-fledged management and monitoring web UI tool. It has all the management functionality: you can manage hosts and OSDs, and pool management, CephFS, RGW, NFS, all these kinds of management functionalities are available in the Ceph dashboard. Then why didn't we change the name from Ceph dashboard to something else? Because when it was first introduced, a long time ago, it was just a dashboard; we could only do monitoring activities in it. Later on, with the introduction of dashboard version 2, things started to change, but it was deemed too heavy to replace all the code and documentation 
references, so we just kept the Ceph Dashboard name all along. Distributed systems are always complex. It's not just a single piece of software; it's comprised of a big software ecosystem. If you take the example of Ceph, it's comprised of all these pieces of software, like RADOS, the kernel, RBD, RGW; it's a complex system of all these components. To install this you have to scale out, and this makes infrastructure and configuration management, and also monitoring, challenging. The configuration of Ceph involves around 2,000 settings, a number that grows with each release. Then there is operating each of these pieces: each of these components has its own CLIs, APIs, or workflows, different ways of doing things. Maintaining the whole cluster is again a challenge. If you want to do a maintenance activity on one host, it can affect the rest of the cluster, so proper maintenance should be holistic and not be restricted to a single component. Monitoring, troubleshooting, and analyzing issues requires going through or inspecting multiple architectural levels, like the hardware, networking, logs, etc. Also, the Ceph CLI doesn't provide a unified user experience. All these different components have their own CLIs: Ceph has the ceph CLI, RADOS has the rados CLI, RGW has radosgw-admin, then rbd, then cephadm. Don't get me wrong, the CLI is great. If you are experienced with the CLI, then most things will be easier, right? With a few scripts you can do wonders with it. But when you are new to these ecosystems, like me, it gets complicated: you have to consider many different things and you have to understand many different CLI ways. So for me, or people like me, it is easier to go through a UI where you get a unified experience. You can create an RGW and an RBD in a similar fashion through forms, so it gets
easier with the UI. And we have the whole SSH versus HTTPS comparison: HTTPS is easier to use, it's more easily configurable, and the access control is better with HTTPS. CLI versus REST: REST is more standardized and it follows an OpenAPI specification. Then there is the whole text versus graphics discussion; you can choose which is better. So where do we come from? Ceph was born in 2006, and after that it was mostly the CLI. In 2013 Calamari, from Inktank, and the openATTIC version of the dashboard were released, then the VSM, then ceph-dash. In 2016 the Ceph Manager got introduced, and along with that the first version of the Ceph Dashboard was also introduced to the community. Then in 2018, Ceph Dashboard version 2. In 2019, with the introduction of Ceph orchestrators like cephadm, the management activities in the UI got improved overall. In 2020 we introduced workflows in the UI, workflows like the expand cluster or the OSD creation wizard, which allow a simpler cluster expansion or OSD creation process. Since then, I think, we have mostly focused on serviceability, which includes the introduction of features like Ceph centralized logging. So these are all the usual suspects among, not the Ceph Dashboard, I mean the Ceph GUIs. As you can see, the Ceph Dashboard is mostly supported by the Ceph community itself. It's been here since 2018 and it has a very advanced set of features. It mostly utilizes the Ceph Manager APIs, which makes communication with Ceph easier, and along with that we use a technology stack of Python and, for the front end, Angular. We also make use of Grafana to display all the Prometheus metrics. These are some examples of the early Ceph GUIs: the Calamari one, the openATTIC one, the VSM by Intel, inkScope, and ceph-dash, which is a community project,
not Ceph, I mean; it's an open source project. Then we have Ceph Dashboard version one, and Ceph Dashboard version two, which is the one you can see in the current Ceph. This is Ceph Dashboard version three, which is a work-in-progress UI, and we are trying to push this into the Reef release, so it's more or less there. This, as you can see, is from the main branch. Then, how Ceph users monitor Ceph: this is a survey done by the community some time ago. As you can see here, Prometheus is predominantly used for monitoring Ceph, but it is closely followed by the Ceph Dashboard: around 49% or 50% of the users use the Ceph Dashboard to monitor Ceph. Since Luminous, Ceph offers what's needed to develop a fully featured and smoothly integrated dashboard. The Ceph Manager provides an in-memory cache for all core Ceph data. It also provides a highly efficient interface to Ceph based on Python calls. The Ceph MON provides highly available persistence for small datasets. This is the architecture of the Ceph Dashboard. We have the Ceph cluster at the beginning, and then on the client side we have the front end, which is an Angular application. When Angular requests something from the REST API, the REST API gets in touch with the Ceph Manager module APIs, which talk to the Ceph cluster, retrieve the information from the cached state, and give it back to the front end.
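The request flow just described (front end, REST API, manager module, cached cluster state) can be pictured with a small toy model. This is only a sketch under stated assumptions, not the dashboard's actual code: the class names `FakeCluster`, `ManagerCache`, and `RestApi`, and the route table, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeCluster:
    """Hypothetical stand-in for a Ceph cluster holding some state."""
    state: Dict[str, Any]
    queries: int = 0  # counts how often the cluster itself was asked

    def fetch(self, key: str) -> Any:
        self.queries += 1
        return self.state[key]


@dataclass
class ManagerCache:
    """Toy version of the manager's in-memory cache of cluster data."""
    cluster: FakeCluster
    _cache: Dict[str, Any] = field(default_factory=dict)

    def get(self, key: str) -> Any:
        # Only go to the cluster on a cache miss.
        if key not in self._cache:
            self._cache[key] = self.cluster.fetch(key)
        return self._cache[key]


class RestApi:
    """Toy REST layer: maps illustrative paths onto cached cluster data."""
    ROUTES = {"/api/health": "health", "/api/osd": "osds"}  # assumed routes

    def __init__(self, manager: ManagerCache) -> None:
        self.manager = manager

    def handle(self, path: str) -> Dict[str, Any]:
        key = self.ROUTES.get(path)
        if key is None:
            return {"status": 404, "body": None}
        return {"status": 200, "body": self.manager.get(key)}


# The "front end" asks the REST layer twice for the same resource.
cluster = FakeCluster({"health": "HEALTH_WARN", "osds": [0, 1, 2]})
api = RestApi(ManagerCache(cluster))
first = api.handle("/api/health")
second = api.handle("/api/health")
missing = api.handle("/api/nope")
```

The second request is answered from the cache, so the fake cluster is queried only once. A real cache would also need to be refreshed when cluster state changes, which this sketch deliberately leaves out.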
The Manager module APIs also export the metrics through a Prometheus exporter, which feeds Prometheus, and through Grafana we get nicely viewable graphs. These graphs are also shown in the front end, in the Ceph Dashboard, using an iframe component. These are all the features that are available in the Quincy release. As you can see from this chart, in some of the components, like the pools, the configuration, or the OSDs, we have almost achieved parity with the CLI. We support most of the features that are available from the CLI, and we also provide some extra features. Some of the important ones are highlighted at the bottom, like the cluster expansion wizard and internationalization; then we have the increased security, the Ceph logs, role-based access control, the built-in Grafana, alerting, SSO, and the REST APIs. So these are some of the features of the Ceph Dashboard, and this is what we have tried to achieve in the Reef release so far. In Reef we released some new features and again increased the parity with the CLI. We introduced a new OSD creation wizard in the dashboard, which increased usability and simplified OSD creation if you are using the Ceph Dashboard. On the RBD side, we started supporting RBD snapshot mirroring from the dashboard. With RGW, we introduced RGW server-side encryption in the dashboard, and with the Reef release we are also trying to introduce more RGW features, like multi-site and some user or role policies; that stuff is still a work in progress. We also have a cool new feature called centralized logging: with just one view you will be able to see all the Ceph logs or container logs, like the managers, the mons, or whatever component or container you have set up. We also tried
to achieve WCAG level AA standardization for accessibility. These are some of the goals beyond Reef. We are constantly trying to improve the usability experience; that's a constant effort from the dashboard side. We also have something called the low-code initiative, which we will talk about later, at the end of the session. We are trying to replace Grafana with some built-in charts: instead of populating the charts with Grafana using the iframe components, we are trying to populate the charts directly using some Chart.js-like frameworks. We also have a plan to improve the CephFS integration in the dashboard. Currently we have all the monitoring pieces settled in the dashboard, but we still lack some of the management activities in the CephFS area, so we will focus on that later on. Then also multi-cluster monitoring and management; that's also an important feature that we are planning to deliver from the dashboard, while also trying to achieve feature parity with the CLIs. I will try to show a quick Ceph Dashboard demo now. Here is the login page of the Ceph Dashboard, and if you log in you will be greeted with the landing page, or the dashboard page, where you can view all the status or the information regarding your cluster. You have the cluster status, which is showing a warning for me; you can click that and see why the cluster is in a warn state. All other information regarding the cluster can be seen here: I have two RGW daemons set up, and all the other information. Then if you go to the cluster hosts, you can do all the host management activities. Then we have the inventory, or the physical disks part; you can also see all the OSDs attached to each of these devices, and you can identify a device by blinking its LEDs. Then the monitors, and all the services that are available right now in the
cluster. You can create more services and edit or delete these services. You can also start or restart the daemons if you go into each service's details. There are also service events, and the daemon events are there too. In the OSDs section you can create OSDs; we also have this new OSD creation wizard, which has some pre-optimized deployment options, or you can go with the advanced mode and create a drive group specification of your own. There is also a different set of activities that can be performed on each of these OSDs, like setting flags, deep scrubbing, marking an OSD out, purging, or eventually destroying the OSDs. Then the cluster-wide configuration: you can set flags on all the OSDs at once if you select one of the options here, the recovery priority, PG scrub. If you go to the overall performance, you can see the overall performance of the OSDs, which is pulled from Grafana itself; this is a Grafana chart that is integrated into the dashboard using an iframe component. We have all the configuration listed here, the CRUSH map viewer for seeing the CRUSH map tree, then the manager modules; you can enable or disable a manager module from here. This is the cephx auth user management; right now we support listing the users and creating a Ceph user. In the logs we have the cluster logs and the audit logs, then we have the daemon logs, which is the centralized logging that I was talking about before. In the centralized logging view you can browse through the logs. If you click on one of the containers here, let's say I want to see the manager logs, and I click show logs, it will display the logs from the manager on ceph-node-00. You can also put this in live mode, and you can see the
logs populating live; right now it's paused. Then we have the monitoring; this is directly consumed from Prometheus and the Alertmanager, so if there are any alerts in the cluster they will show up here, and it will show an indicator in the navigation pane itself. Then we have the pool management, again all kinds of pool management, like pool creation, edit, and delete. You can see the performance of all the pools in the overall performance, and if you want to see individual performance as well, you can see the performance details of each pool individually. Then we have the block, the RBD, section. If you have no RBD pools available it will show this message, and if you want to create an RBD pool you can go and create one by selecting the RBD application from here. Then we have RBD mirroring. Right now it's not configured on my cluster, so if I want to configure it, I can click on this configure RBD mirroring and this will configure the RBD mirroring. There is the iSCSI section as well. Then the NFS management: you can go and create an NFS export. If you don't have an NFS cluster created, it will show this info to create it or add a new NFS service. We have the file systems; again, I don't have a file system set up right now. We have an RGW section as well. If you go to the RGW daemons you can see all the daemons that are available in the cluster and the overall performance of each daemon. You can create, delete, and edit users, and the bucket creation or bucket management section is over here. The dashboard also has the REST API, which you can see if you click the API; you will get redirected to the Ceph REST API. Then we have the whole notification system here; you can see all the notifications coming from different components of Ceph in one place. You can also report an issue from the Ceph Dashboard, so it will go directly to tracker.ceph.com. For that you have
to enable the feedback module; you can go and enable it from here. Once you do that, you provide your Ceph tracker API key, you give all this information, and then you submit; it will create an issue on tracker.ceph.com. Then you have the user management and the telemetry configuration, and yeah, that's more or less it for the Ceph Dashboard. That's all from my part; the second part will be covered by Ankush.
Hi and hello, everyone. Thanks for the wonderful demo and the part one presentation. Let me introduce myself. My name is Ankush, and I am working as an engineering manager on the IBM team. I have more than seven years of expertise in delivering management and monitoring solutions for software-defined storage, in containerized or non-containerized environments, in an open source system. Today I'll be taking you through part two of this discussion, where we'll be discussing two major aspects. The first will be how you can contribute to the Ceph Dashboard, and the second will be the Ceph Dashboard community and what it looks like. Moving on to how you can contribute to the Ceph Dashboard: we'll be talking about contributing as a user, as a translator or a documenter, and also as a developer. First, as a user. The Ceph Dashboard is enabled by default when you install Ceph with cephadm, and you can use it from the get-go. But if it is somehow not enabled, or if you are facing some issues while enabling the Ceph Dashboard, you can follow the steps that are mentioned here, or you can follow the documentation link. Once the dashboard is enabled, this is the first screen that you'll see. As a user you need to log in, and once you are logged in you will be able to do a couple of things from the Ceph Dashboard, from the management of the whole Ceph ecosystem to monitoring, alerting, logging, and all those things you
can do on the Ceph Dashboard. But if you see any issues, or if you have any suggestions about anything you tried out, or any feature requests that you want to put in, you can also go to report an issue in the dashboard itself and report that issue, and it will open a bug in the Ceph tracker, where we will be following up with you. You can also share your experience directly on the ceph-users mailing list that is mentioned there, or on the IRC channel, which is #ceph-dashboard; you can reach out to us and we'll be happy to help you out. The second part is how you can contribute to the Ceph Dashboard as a documenter. This is the documentation link that we have, and if you see any issues, or if there is anything that you want to suggest, you can go directly and report a documentation bug from the documentation link. Or if you want to do an edit yourself and submit a pull request, we'll be happy to help you out with that: you can click on this Edit on GitHub link directly and then submit a pull request, and the team will get back to you with reviews on that pull request. As a translator, how you can contribute to the Ceph Dashboard is that you can look into this link, where we already have a lot of translation done on the Ceph Dashboard side. I think more than 10 languages are already there, but we still have some gaps that can be filled, so you can look into that link and help us out with the localization and internationalization of the Ceph Dashboard. Moving on: as a developer, how can you contribute to the Ceph Dashboard? Firstly, you can subscribe to the dev@ceph.io mailing list. Second is the IRC channel that we already talked about, which is #ceph-dashboard. We also have some documentation links, very generic to Ceph but also specific to the dashboard, that you can follow to set up your dev environment and play around with the Ceph Dashboard. Moving on, let's discuss
what the code looks like. We'll talk about two major things here: one is the back end and one is the front end. On the back end we use Python 3.6, and on the front end we use Angular 12 with TypeScript plus Bootstrap 4, but we are planning to upgrade to Bootstrap 5, and we are also planning to upgrade the Angular version to adopt the new features. For developers there is another initiative taken by the Ceph Dashboard team. We understand that not everybody knows how to code in front-end languages like HTML, JavaScript, TypeScript, CSS, and all of that ecosystem. What we have done is that even if you don't know those, you can still contribute to the Ceph Dashboard using the low-code initiative, where you write your descriptors in a JSON kind of format, which will create a route for some page, and then in Python code you write how your page should look. Both together can then work and generate a page, something like this, for you, where you have a UI generated from the Python code itself. I think we have the first feature of this kind already written, which is the Ceph auth management, and in the next release we have a couple more features coming that follow the low-code initiative; it is being discussed in the community right now. The second part of this part two presentation was the dashboard and the community around it, so let's talk about the dashboard in numbers. We have 2,700 pull requests in the dashboard, we have 4,200 comments, and this many lines of code. Most of them are written in TypeScript, the second most predominant language is Python, and last but not least is HTML. Behind the Ceph Dashboard there are a lot of great minds from different continents and countries who contribute to it, and this shows, I think, all the people who are working from different countries and
continents and helping us grow. Moving on: as you have seen, we have a big community spread across the world's time zones and all that, so we have tried to come up with something that works for everybody, though it can be difficult for somebody at some time. We have a daily stand-up at 11 am CET, and an upstream sync for a wider audience that happens fortnightly on Tuesday at 2 pm CET. The face-to-face used to happen in pre-pandemic times, for almost three or four days, at different locations; it has not happened since the COVID pandemic. Now we'll see some glimpses of what our face-to-face meetings looked like. This is the first one, which happened in Nuremberg; most of the community members are present here, and these are some of the images from that. There was another, a Ceph Dashboard and orchestrator face-to-face, that happened in Berlin, and these are images from there. And the second Ceph Dashboard specific face-to-face happened in June 2019 in Fulda, and these are images from there: the upstream Ceph Dashboard community doing some team-building activities and enjoying working together. I think that's all for today from my side for the part two presentation. We'll be happy to take any questions if you have them; otherwise, thank you everyone for joining this meeting and this presentation. |
Intro to Ceph on Kubernetes using Rook
Rook Ceph in Kubernetes and the rook-ceph krew plugin |
Hi, everyone. I hope you're doing well and not feeling too sleepy after the lunch hours. We are here to talk about an introduction to Ceph on Kubernetes using Rook. Here's Alexander; Alexander will introduce himself. I'm Gaurav, Cloud Storage Engineer at Koor Technologies, and I'm also a community ambassador for the Ceph project from the India region. I've been working with the Ceph and Rook projects for a long time and am now a contributor to the Rook project.
I'm Alexander Trost, founding engineer at Koor Technologies, Inc. I'm a maintainer of the Rook project as well, and yeah, we wanted to talk about Rook for everyone who doesn't know it. I want to get you started with storage: who doesn't need fast, reliable storage nowadays, with cloud native applications? We're obviously talking about a bit more performant storage, I guess, depending on who you're asking. Well, the point of Rook in the end is this. With Kubernetes being kind of like the container ship here, you have your Kubernetes that abstracts everything and tries to provide you this one API for most to all things, depending on how far you want to go with it. I guess for most people running Kubernetes it looks like that: you have your big giant ship running your production applications, and you have your automation and CI/CD processes that just try to keep it running. And that's where the question of storage more and more comes into frame for people, especially with local storage, which already, I think, a year or two ago started being better supported in Kubernetes in a native way, rather than just having things around Kubernetes to try to make that an easier endeavor.
We have the question of how can I, for example, get my Ceph storage talking with Kubernetes so that I have storage for my applications. Well, that's the simple interface. It's great that nowadays there is mainly one interface for that, called CSI, the Container Storage Interface, which for storage vendors basically means they only need to implement one interface. And for Kubernetes, or you as a user, you have one interface, one way to get storage. For example, if you want storage on Kubernetes, you have the way of using persistent volume claims: we basically, from our application perspective, claim storage, and Kubernetes will take care of, for example, talking with Ceph storage and provisioning the volume. Subsequently the CSI driver from Ceph will take care of mapping the volume and mounting the volume, so that it is completely transparent to your application. The whole thing with the CSI interface is that it's this one way for any storage vendor to get their storage running. There's obviously more than Ceph, but with Rook Ceph we're going to focus on Ceph here, and that's exactly the kind of connector that sits in between. So if you want storage, it doesn't really matter if it's just Ceph. The point with Ceph is that we have Ceph CSI, that's this connecting bit between Kubernetes, the application container side, and your storage. And that's already a point where we're starting to talk about Rook: you can run your Ceph storage cluster on most to any hardware. I don't know, we could run it on a Raspberry Pi as well, right? Yes, easily. I think I've heard people even run it on some Android phones or something as well. But, you know, just because you can doesn't necessarily mean you should; that's a whole other discussion. The point being, you can technically have your Ceph
storage anywhere. So it doesn't really matter if it's on bare metal in your own data center or if it's just a few laptops thrown together. That's the thing with Ceph in general: you don't need the best hardware, you don't need to buy that big box from the one storage hardware vendor, to explicitly go in that direction, to have storage. And that's where the combination of using Kubernetes and Ceph might come into play, not just for having storage for your applications, but also, how should I put it, for running Ceph. That's what Rook is about: it's about running Ceph. Obviously it's also about the connecting part, setting that connection up between Ceph and Kubernetes, but the idea is that Rook runs Ceph in Kubernetes, in containers. I think I mainly saw it from cephadm the last time we deployed a cluster on bare metal directly: cephadm is one other way to install, deploy, configure, and easily manage it. It's easily managed, but it's one way to just install and run it. It's kind of the same point for Rook, where Rook is basically a Ceph operator for Kubernetes. I'm going to go a little bit more into what an operator does, because that's one of the vital points in general for running certain applications on Kubernetes. And again, running Ceph on Kubernetes, with the operator pattern that we have in Kubernetes we can easily handle most things that cause quite some pain depending on how big you scale your storage cluster: deployment, bootstrap and configuration, upgrades, and everything like that. Those are all processes where, I think, there are probably five million Ansible playbooks to install Ceph. There's obviously cephadm, and there's what was called ceph-deploy, which was there earlier as well, which is cephadm now, I think, right? ceph-deploy is earlier; cephadm is now the more advanced, I mean, latest tool that
everyone is using these days. And there's more; I can already think of five more tools for how to deploy Ceph. Ironically, for the people that have looked into Kubernetes a bit more already, it's kind of the same story for deploying Kubernetes. But because Kubernetes is this abstraction layer on top of hardware, to some degree abstracting everything away, and to quickly skip ahead, it allows the Rook operator, and that's exactly where this image comes in, to orchestrate a cluster. It's not just deployment here; it's about using the Kubernetes APIs to easily take care of everything. So say you want to add a new node to your storage cluster. What do you do? Technically speaking, you just add it to Kubernetes, and if everything goes well, ten seconds later the operator will be like: oh, new node, got to do my job, run the prepare job and everything, get the node ready. A few seconds after that, the new Ceph components, the OSDs on the disks, depending on what disks there are, obviously, are taken care of. To make this full circle with the Kubernetes side, that's what the operator pattern, as it's mostly called, is about. It's about observing. The operator is observing a status, or, in the Kubernetes case, custom resources. Just think about those as YAML; let's just call it what it is, an object of YAML in Kubernetes, which the operator can watch. I as a user make a change, or my automated CI/CD process makes a change to it, like a new node has been added, or I want to tweak something in the configuration of the cluster. The operator is observing that, and when there's a change, or even when there's a change in the Kubernetes cluster itself, like a node missing, it analyzes that change. For example, if a node is gone or is just not ready in Kubernetes terms anymore, let's say a network outage for like
two of your nodes, the operator would, well, observe it first of all, analyze it, and start acting upon it. For example, in Kubernetes terms, it would take care of setting certain so-called, just to have that term out there, pod disruption budgets, which try to prevent other nodes from potentially stopping the components of the Ceph storage cluster. The main point is really just that it's this observe, analyze, act kind of loop, because in the end it just repeats itself all over again. That's the whole deal with Kubernetes operators. Again, for the people who are already more into Ceph: if you want to scale up some more Ceph monitors, or Ceph mons, you just edit the object in Kubernetes, in the API, and crank the mon count from three to five or something. Again, this change is detected by the operator, which analyzes it and acts upon it, and that makes it quite convenient as well. Here, perfect, with the YAML. Sorry, just a little bit of clarification: I don't have it mirrored on my screen, so it's a bit hard. But that's exactly the YAML that we talked about. As an example, I have my cluster running and, let's say, there's a new Ceph release. What I would need to do to upgrade my cluster is basically go ahead and change the image to be, well, not 17.2.3 but, let's say as an example, 17.2.5. Again the operator would detect that, analyze whether every component is up to date or not, and then start a, well, I don't want to say complicated, upgrade process. Especially with something like Ceph there's more to it than just "let me just restart it". There are checks before every component is restarted, through Ceph-native ways; it's basically commands that are, well, ok-to-stop, they're basically called like that. And that's the whole idea, that the operator helps you with that and in the end fully takes care of it, so that for the main part of your work you can just
sit back, change it in the YAML, and in a few minutes, or it can even be hours depending on how big the cluster is, the operator will take care of that. As I mentioned before with the example of the monitor count: if we want to change that, we change it, and a few seconds later the operator will pick that up and start making the necessary changes. Or even if you want to scale it down, like from 5 to 3, or 3 to 1, which is not recommended because we need high availability there, the operator again takes care of doing it. Or even if you want to specifically say, on this one node please use this one device, or for this disk or NVMe, for example, use more than one storage daemon, OSD daemon: these things are possible, and quite easily, just by writing some lines of YAML in the end. You can easily customize your YAMLs according to your workload, and that will make your life easier. We've mainly talked about having the cluster running, or setting up the cluster with the YAML definition of a CephCluster object. But if you want, for example, to run some Prometheus in your Kubernetes cluster and it needs storage, then to be able to use storage in Ceph you need to have a storage pool, for example for RBD storage, block storage basically. Again we just go ahead and create a CephBlockPool object, which simply contains the information. If we go from here: the failure domain, where you basically tell Ceph to only store data on different hosts, to keep it simple for now; and the replicated size, so that there will be a total of three copies of your data in the cluster. That's like the Ceph replica size; let's just skip that for now. You could even technically set the compression mode for this pool. The point is, again, we can just write this in YAML, apply it against the Kubernetes API, and a few seconds later it's done. Also, for the other objects, say you need the Ceph
file system or Ceph object storage, the operator in the same way takes care of creating all the components, for example the MDS for a file system. We have the, well, standard components like the manager, the monitors, the operator, and the OSDs, and for the object store, for example, the RGW components, and the operator simply takes care of that. And again, if you change the Ceph version, a few seconds to maybe a minute or two later, depending on the state of your cluster, the operator will take care of doing the update. We've mainly talked about deploying a Rook Ceph cluster. Right now we also want to highlight the krew plugin that Rook Ceph is building and providing. It allows you to have certain processes automated; even certain disaster recovery cases are easier to handle with it, and Gaurav will talk a bit about that. So, what's krew, right? Krew is basically a package manager for kubectl plugins. It makes the management of Kubernetes easier, and that's how the core developers and maintainers came together and thought: we can definitely write a plugin to make the life of our developers and administrators easier. Krew was the way to go since it's the de facto package manager for kubectl plugins. So you can just do a kubectl krew install rook-ceph; that's how the plugin is installed. And as you can see, we just ran the help command, and it shows a bunch of things that you can do. You can run a whole bunch of Ceph commands and RBD commands, and also check the health of your cluster. You can do a bunch of things, even if you want to remove an OSD. The need actually arose because, for example, you may want to use underlying tools like ceph-objectstore-tool or something like that to debug and troubleshoot core issues at the OSD level. The krew plugin is definitely a great way to go, as it provides common management and troubleshooting tools for Ceph. It's
currently, I mean, a lot of things already work; we'll show you. Like I mentioned, you just need to run kubectl krew install rook-ceph; it goes ahead and quickly installs the plugin. It's way easier: earlier you had to go inside the toolbox pod to debug and troubleshoot every issue. Krew provides such ease of access that it makes lives easier, and troubleshooting is definitely easier. You can override the cluster configuration at runtime, and some of the disaster recovery scenarios are also addressed. One of the troubleshooting scenarios that is addressed is mon recovery. Let's say you have the default three mons in the cluster and the majority of them lose quorum. Recovering mons from monmaps means doing a bunch of tasks that, if not done carefully, could lead to more disasters. But with more automation in place, when things are working, this is also made easier with the krew plugin. And if you want to troubleshoot CSI issues, it makes that easier for sure. So, yeah, you can restore mons from OSDs, and even restore the Rook Ceph cluster after an accidental deletion of custom resources. One of the goals on the roadmap is also automating core dump collection. Let's say an issue happens with a Ceph daemon and we want to collect the core dump of the process for further investigation, to share it with the developers and with the community to understand what issues we are facing; the plugin could do that easily. And if you want to do performance profiling of a process with gdb, that could be made easier as well. So these are some of the goals. The current plugin is written in Bash, but there's work going on to rewrite the whole plugin in Golang so that it's more scalable and much easier to manage, and
even for contributors. So, yeah, just like that. So I guess the point we're more or less trying to make is this: if you have Kubernetes, or run a distribution like, well, Rancher, or OpenShift obviously as well, on your hardware, and, I would even put it to some degree as, you're confident enough with Kubernetes to run it, then you can quite easily have a Ceph cluster running on top of that as well. Obviously, to some degree you need some Ceph knowledge, but that's the case with everything if you want to run it in production. It's just that this abstraction layer with Kubernetes makes it easier for you. You kind of start to think more like: well, I have some nodes, and they're simply there to take care of the components that you need to run for the Ceph cluster. And especially with the Rook Ceph operator, it obviously makes the process easier through, well, a GitOps approach, for example, where you can just throw your YAMLs into Git, most of the time, and have an automatic mechanism take care of the deployment process, so that, again, the operator just takes the YAML, takes care of it, and makes the necessary changes. And the Rook Ceph krew plugin, just to summarize that real quick again, is a way for us to put certain automated processes in the hands of admins when they need them, and not just as a "hey, here's a 100-line Bash script, please run it one command at a time". It allows that because we have access to Kubernetes, where we can just ask Kubernetes "hey, where's the monitor running?" "Oh, it's on node A", and all that, because we have an API that can tell us most of this information. And for recovery scenarios, we can just ask Kubernetes to run a new pod, or to have a new monitor, for example, running with the old information from the other monitors, to get the quorum recovered that is required there. And regarding Rook Ceph, as a
general outlook for the future, some of the major points we're currently looking at: we want to improve cluster manageability even more than we already have, and make it easier. Even when using the Rook Ceph plugin, right now you still need to do quite a lot of manual YAML editing of the objects that we have in the API, but we would like to have some more krew plugin commands to extend that functionality and simply make it easier. Then there's improved security, by having the operator and the other components that are running in the cluster use separate access credentials to the cluster, just to have a bit more, I guess to some degree, transparency if you look at the audit logging of the Ceph cluster. And there's also encryption support for CephFS and for OSDs on partitions. As with everything, there's more: feel free to check out the ROADMAP.md file on GitHub, github.com/rook; the link will be shown as well. If you want to get involved, if you want to contribute, if you have questions or anything: we obviously have GitHub, and there are even GitHub Discussions open if you have any more questions that might not fit on Slack. We have a Twitter account, obviously, and we also have community meetings if you have any more pressing concerns to talk about. And just from our side as well: as Gaurav and I mentioned, we're from Koor Technologies, Inc. We're building a company that wants to create a product around Rook Ceph and just in general tries to help the community out there, so do talk with us or contact us as well. And for now, thank you for listening. We'll gladly take some questions; I think you showed 50 minutes for questions, or even just for talking a bit about certain scenarios here with everyone. I guess one more last thing before we go? Yeah, I would just like to add one last thing: if you want to check
a demo and more troubleshooting scenarios, we did a talk at Ceph Virtual Summit 2022. It's already on YouTube; there we demoed a couple of troubleshooting scenarios and the krew plugin. I'll definitely share and add a reference here, but that'll be good to check out as well if you want to see a live demo. Yeah, thanks. Thanks. Any questions? I was wondering a bit about the downsides of using Rook with Ceph, because Ceph is known to be really hard to configure and to get the right performance out of, and you need some kind of granularity there. So, if I've summarized the question correctly, it is: what are the downsides, or maybe I would put it as the advantages and disadvantages, of using Rook to run Ceph on Kubernetes, especially with Ceph being quite complex? If there's a loss of control on the Ceph side. Oh, I see, okay, and in addition, whether there's anything that you lose when you use Rook Ceph. I guess a major downside that most people see is that you have an additional layer, with Kubernetes being that. Maybe to address that a bit more, from what I know at least: cephadm, for example, uses Docker to run containers as well, right? So cephadm also kind of introduces a layer, so to say, with Docker or Podman or, well, "thing that runs containers, insert here". In regards to installing Ceph, for example, in my eyes (but I'm very biased towards containers, obviously), it has this aspect of "here's the Ceph image, and it should work", unless you, you know, have something weird going on with the host OS.
The downside is, again, that if Kubernetes just goes completely crazy, the Ceph cluster is probably also going to have a bad time. But that's then the weighing up: are you confident enough to run Kubernetes, and even to run a Kubernetes cluster long term? Especially with Kubernetes, there's this talk about, again, pets versus cattle: having a cluster for every application or something, and just, "oh, we're done", throwing it away, versus something as persistent and important as a Ceph cluster, which you can't just throw away. From experience so far, I can tell you that it is possible to run a Rook Ceph cluster over multiple years. When did I start mine? I think I had it running for two years, and the only reason I shut it down was because I had gotten new hardware in another location. I kind of asked myself, do I migrate it or do I not? In my case it was just: okay, let's start from scratch. But that's also because the cluster I'm talking about had, like, 50 other applications running, so it was an "okay, let's start from scratch anyway", so to say. In regards to losing control: you don't really have a manual "use this disk" way, besides putting it in the YAML and, fingers crossed, the operator takes care of preparing and then deploying an OSD to that disk or even partition. But again, I think that's the case with most tools that take away certain aspects, at least in regards to installation or configuration. In regards to configuring Ceph, or even certain other aspects, you can do everything as normal. And at least from experience with Ceph, it has, I guess to put it like that, gotten a lot better with the config store. As the name basically says, you have a config store in the monitors, where you can just easily set options for
certain components instead of always having to manually make changes to config files on the servers, on your storage nodes themselves, and it has gotten better. That's awesome. I would just like to add that in a lot of places it gives you control as well, right? Because the operator is responsible for reconciliation and for taking charge in automated scenarios where we want recovery to happen. With Ceph, the goal is to improve recovery, and in production you don't want any unexpected loss of control either, right? We want to give admins a certain level of control; we don't want them to go ahead and, I mean, play around with the OSDs. So in many production scenarios you need a certain level of control as well, which Rook actually provides. So at that point, I would certainly recommend it and consider it an advantage as well. The question is whether there's going to be a performance hit from running Ceph in Kubernetes. It depends on how you run it. I personally prefer always running my Rook Ceph clusters with host networking. Depending on how far along you are with containers and Kubernetes, you may already know that some like going over the host network and some don't. I personally do it more or less because I don't want the traffic to go over the overlay network, since you have some plugin, some CNI (Container Network Interface, for whoever wants to look into that), that takes care of the network between your nodes. So it more or less depends. There are a lot of people who just have Rook Ceph talk over the overlay network as well, and it works fine. I would really put it down to preference, and to what your network looks like. If you have 10G or something, and your overlay network in the end, in an iPerf test at least, brings it down to like nine point something, you know, is it worth exposing that
traffic to the host network then, versus just having it go over the overlay network? But again, think of it as just another layer to consider, whether you want that or not. And if you don't want that, there are also options like Multus to allow you some more fine-grained network connections or configuration in regards to the interfaces you want to pass in, like different VLANs or something. But again, it depends. Yeah? Can you still manage your Ceph cluster via the Ceph dashboard, or is it another dashboard, or do you need two dashboards? The question was whether you can still use the Ceph dashboard, or, maybe just to expand on that, the Ceph manager dashboard, to manage your Ceph cluster. To some degree. There is currently no functionality to add new OSDs, if I remember correctly. That's one thing that ties in with the future roadmap part on more manageability, where I also kind of looked at the dashboard and was like, wait, why don't we have that? But then it's the typical "oh, there are some roadblocks that we just need to get out of the way", and we need to make sure that we, especially the operator and even cephadm and others out there, are all aligned on the same way of doing it, or whether there's, like, a manager interface, because there even is one. And if I understood or heard correctly from the meetings, they're even looking into improving that interface further. It will hopefully be easier, and hopefully also faster, to have the dashboard as this point of contact as well. Yeah, I'll just say that there's a lot of work currently going on in the usability space. From the recent discussions that we have had upstream, there's work to improve the dashboard as well, from both the kubectl, the Kubernetes, perspective and the old standalone Ceph perspective, to make sure that you can easily manage and monitor Ceph. Even in the CNCF world there have been recent
discussions about improving it from the Rook side as well. So a lot of work is going on in the usability space, but if you have any ideas, they'll be most welcome. Usability is one thing that really matters a lot, right? User experience is one thing that we would certainly like to improve in Rook. I think we have time for one more question. So the question is, if I could maybe modify it a bit more in the direction of: how can you run Rook? I think this plays into that as well. You can run Rook Ceph in a way that connects it to an existing Ceph cluster; it doesn't even matter whether that's a Rook Ceph cluster, a plain Ceph cluster works as well. It then mainly just takes care of setting up the CSI driver. I know people use that to some degree as well, if they have an existing Ceph cluster, or even an existing Rook Ceph cluster, that they want to share with others. In this external mode there's also the possibility for the Rook Ceph operator to manage certain components, so that, for example, if you want a file system, you could run the MDS daemons that you need for the file system in the cluster that your Kubernetes is running on; that works as well. Those are kind of the two main external modes, plus obviously the case of running it in the same cluster. So either you just share what you have, or you share and allow things like a Ceph file system or Ceph object store, or you can just run the daemons in the same cluster; all of that works for the operator. Does that answer it? Any other questions? If there are no questions, there are a bunch of stickers here for everyone. Yeah, stickers. And if you've asked a question just now, just come see me, you get a t-shirt too, and maybe there are some left over after that. |
AMENDMENT Autoscaling with KEDA - Object Store Case Study |
So, hello, everyone. Before going to my presentation, I just want to thank Jan and Niels. I guess these folks have been running the show in this room for the last six or seven years. So, yeah, I hope I'm audible and, yeah, the network is, yeah, sure. So, today I'm going to talk about the autoscaling feature in Kubernetes and how it can be done using KEDA, for RGW specifically. Most of the presentations so far covered Ceph and Rook, and the last presentation, from Gaurav and Alexander, was about Rook, so this is a bit of an advanced topic on top of that. I'm Jiffin Tony Thottan; I work as a backend engineer at IBM Storage, and I work closely with the Rook and Ceph communities. As I mentioned, most of the talks already covered Ceph and Rook topics, so I'll mostly focus on KEDA. My session covers what KEDA is and the basic KEDA concepts, then a brief bit about the Rook operator, and finally a demo of how the autoscaling works. So, what is KEDA? In the last presentation, one of the things Alexander mentioned is that you can change the configuration, right, of a deployment. With the help of autoscaling, you don't need to change it yourself; the autoscaler will basically figure it out and scale things for you. Kubernetes has the HPA and VPA scalers built in, and KEDA is kind of a, what do you say, advanced version of the HPA. So, what does KEDA do? It makes Kubernetes event-driven autoscaling that simple. I don't know if you folks use HPA. HPA by default supports CPU- and memory-based scaling: if your pod is using a lot of CPU or something like that, it will scale, or if it uses a lot of memory, it will scale. But it also supports custom metrics. One of the issues with custom metrics is that there is no standard implementation for them. Prometheus has an implementation, I mean, Kubernetes has an implementation, but nothing is standardized. So, this is where KEDA comes in.
It started as a partnership between Microsoft and Red Hat, and it was donated to the CNCF. With the 2.0 version it went through a major design change and all that stuff, and a couple of years back it became an incubating project. Currently it's on the 2.9 release; that's the latest version. And I hope the next major change will happen with the third version in the coming years. Now, a bit about KEDA concepts. As I mentioned, it automatically scales Kubernetes resources like Deployments or StatefulSets, their replicas, or even custom resources. Then it has around 50 built-in scalers, which are like plugins that we can attach. For example, you have Prometheus, Kafka, RabbitMQ, AWS, and Azure; all those big players are there. The other thing is that it just scales the resources; it does not manipulate your data. The scaling is done on an event basis, so it doesn't have anything to do with your data. It won't manipulate your data or something like that; it just scales based on the events it sees. And you can install KEDA via OLM or Helm, whatever. Now, this is the basic architecture with KEDA. KEDA just enhances the HPA feature of Kubernetes, so you need to have a Kubernetes cluster. Then, at the bottom, you can see an external source, about which you have given information on which resources or which events should drive the scaling. And how the deployment will scale is based on a custom resource, known as a ScaledObject or a ScaledJob. Now, the ScaledObject will be handled by KEDA. KEDA has three components. One is the scaler; as I mentioned, there are a lot of scalers, and they are like plugins for KEDA. Then you have the KEDA controller, which basically manages the KEDA services and daemons. Then you have the metrics adapter.
In the case of HPA, custom metrics are served by a metrics adapter; you need to provide a metrics adapter for custom scaling. So KEDA itself brings up a metrics adapter for that. Now, you have your workload, and based on the events, KEDA may scale your deployment up or down. It does this with the help of HPA: even though you define a ScaledObject or a ScaledJob, internally it creates an HPA. One of the differences is that it can scale down to zero; normally, with HPA, the scaling starts at one and ends at the maximum that you have defined. Then, yeah, that's all; the metrics adapter is covered. So the custom metrics are provided to the autoscaler by this metrics adapter. Now, I'm just giving an example of a ScaledObject, because that's what I'm going to cover in this presentation. Basically, it has a name, the metadata information. Then you can see the scale target reference, which says what resource you want to scale. By default, it will be a Deployment, but you can also target ReplicaSets or StatefulSets, and if you have a custom resource with a scale subresource defined, you can target that as well. For those, you need to mention the type of resource; if you only give a name, it will be treated as a Deployment. Then you can set the replica counts. Then you can give triggers. A trigger is an event on which the scaling will be based, and you can define multiple triggers as well. So far, any questions? Okay, I'll move forward. Now I'm just mentioning a few KEDA features, most of which I've already covered. It can scale down to zero if you want. Another part is the fallback replica count: if something happens to the cluster, you can fall back to a given number. It's not the min or the max; you can define a separate fallback value.
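The ScaledObject structure described in the talk (a name, a scale target, replica counts, one or more triggers, and an optional fallback) can be sketched as a plain Python dictionary. This is a minimal sketch, not the speaker's manifest: the field names follow KEDA's `keda.sh/v1alpha1` ScaledObject API as documented, while `build_scaled_object`, the resource names, the Prometheus address, and the query are hypothetical placeholders chosen for illustration.

```python
import json


def build_scaled_object(name, target_name, min_replicas, max_replicas,
                        prometheus_url, query, threshold,
                        fallback_replicas=None, failure_threshold=3):
    """Return a KEDA ScaledObject manifest as a plain dict (illustrative helper)."""
    spec = {
        # Only a name is given, so the target kind defaults to Deployment.
        "scaleTargetRef": {"name": target_name},
        "minReplicaCount": min_replicas,
        "maxReplicaCount": max_replicas,
        # A single Prometheus trigger; more triggers could be appended.
        "triggers": [{
            "type": "prometheus",
            "metadata": {
                "serverAddress": prometheus_url,
                "query": query,
                "threshold": str(threshold),
            },
        }],
    }
    if fallback_replicas is not None:
        # Replica count to use when the metric source keeps failing.
        spec["fallback"] = {
            "failureThreshold": failure_threshold,
            "replicas": fallback_replicas,
        }
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical values, not taken from the talk.
manifest = build_scaled_object(
    name="rgw-scale",
    target_name="example-rgw-deployment",
    min_replicas=1,
    max_replicas=3,
    prometheus_url="http://prometheus.example:9090",
    query="sum(rate(example_rgw_requests_total[1m]))",
    threshold=10,
    fallback_replicas=2,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.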
Say your min is three and your max is five; you can set a fallback of one or, in some cases, if you want, fall back to two. It can also pause autoscaling; you can start and stop it. You don't need to delete the ScaledObject or anything like that; you can keep those resources and just pause. Then, KEDA by default can expose Prometheus metrics from the adapter, as well as Kafka events. And one more thing: you can also use secure connections, with credentials and things like that. This is defined in another subsection of the ScaledObject; in the scaler definition you can reference a trigger authentication, which describes how to authenticate with the server. In the latest versions you can even get the events or metrics over gRPC or from other APIs, but I have never tried or used those things. Now, coming to RGW. RGW is a very simple case for KEDA. First of all, I'll just go through the Rook and RGW stuff. In the last presentation, they mentioned that Rook is a storage orchestrator for Ceph that simplifies the deployment and management of services for the Ceph cluster. For RGW, access can be given via OBCs (object bucket claims), something similar to PV/PVC for file and block. The other resource is known as CephObjectStoreUser. With an OBC you get a bucket, but with a CephObjectStoreUser you get the entire user credentials, so you can create multiple buckets and use all those features. The other part is that Rook also has a service monitor. If you're familiar with Prometheus: if Prometheus wants to fetch the metrics from your daemon, you need to have a service monitor.
What the service monitor does: the Ceph manager exports the metrics, and these metrics are passed to the Prometheus service with the help of the service monitor. Now, for my test case, I am using hsbench. It's a performance evaluation tool for S3 workloads, so that will be the S3 client I use against RGW. In the demo, I'll be using the Prometheus scaler. The Ceph cluster is already deployed by Rook, and RGW is configured. The Prometheus server is also up, and I have defined the Prometheus requirements, the service monitors and all that stuff. Then I need to define a ScaledObject so that my RGW can scale. The scaling is based on the metrics provided by the manager. For RGW, most of the metrics are performance counters. The scaling is based on how many requests we are receiving on the RGW server; it doesn't depend on the backend or anything like that. That's one thing which we may need to change. But currently it's like a web server: as you get more requests, the scaling will happen. Now I will go to the demo. So, okay, I hope it's visible. I am running a minikube cluster for my demo, and everything is up and running. I have already installed the Rook cluster, so you can see the Rook operator is running. RGW is also running; currently I have only one RGW in my cluster. Then, this is the service monitor which Rook deploys. If you check the services, there are two RGW services. One is the internal service, which can be accessed by Kubernetes workloads. For hsbench, I need to expose the RGW service, hence I am using the external RGW service. It's a NodePort; it just exposes the RGW service on the machine. Then, yeah, there is also the Prometheus service.
I have created a CephObjectStoreUser, and I need to pass its credentials to my workload. This is the Prometheus operator running in the default namespace. Now I will install KEDA via Helm. It's deployed. I'm just checking that the pods are up and running; it's nothing fancy. So, everything is up and running. Now I need to define the ScaledObject resource for RGW. Sorry, I'm just opening the Prometheus web console UI. I am doing the scaling based on the Ceph RGW request metric, so I'm just showing the current value. That's it. Now I need to define the ScaledObject. This example is in the GitHub repo. I have given a name for my ScaledObject, and I am scaling the deployment of the Ceph object store, the RGW, with the minimum replica count set to 1 and the maximum to 3. Then, this is the Prometheus endpoint, I mean, the metrics endpoint from which I need to fetch the metrics, and this is the metric name which I am giving. This is the definition of the triggers, nothing else, and the threshold value is 10. So basically, it's 10 million requests, not a plain 10. Now the ScaledObject is created. If you check for the ScaledObject and the HPA, you can see that we have defined the ScaledObject and the HPA is created automatically. So this is the ScaledObject, and, yeah, it is not active; that's why the state is unknown, and I have not defined a fallback. But it's ready for scaling. For the ScaledObject, an HPA is triggered internally. This is the current load on the RGW deployment and the current status; you can see the min and max replica counts. Now I am doing a watch on the pods, the ScaledObject, and the HPA. So, yeah, now I am triggering the load. For that, I need to fetch the credentials from the Ceph user, so I'm just getting the access key and secret key.
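How a threshold and a min/max replica count turn into a pod count can be sketched with the replica formula from the Kubernetes Horizontal Pod Autoscaler documentation, `ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, clamped to the configured bounds. This is a simplified sketch under stated assumptions: it ignores the HPA tolerance band, stabilization windows, and KEDA's own handling of the zero-to-one transition, and the function name and the sample metric values are hypothetical, not numbers from the demo.

```python
import math


def desired_replicas(current_replicas, avg_metric_per_replica, threshold,
                     min_replicas, max_replicas):
    """Simplified HPA calculation: scale in proportion to metric/threshold."""
    raw = math.ceil(current_replicas * avg_metric_per_replica / threshold)
    # Clamp to the bounds configured on the ScaledObject.
    return max(min_replicas, min(max_replicas, raw))


# Threshold 10 with 1..3 replicas, as in the demo configuration;
# the metric values below are made up for illustration.
print(desired_replicas(1, 15, 10, 1, 3))  # load above threshold: scale up
print(desired_replicas(2, 40, 10, 1, 3))  # far above threshold: capped at max
print(desired_replicas(3, 2, 10, 1, 3))   # load well below threshold: back to min
```

Because the result is clamped, a sustained load above the threshold stops adding pods once the maximum replica count is reached, no matter how high the metric climbs.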
So, I am triggering hsbench. hsbench takes the access key, and you can see the endpoint details. Currently I am running a load where 1 MB is the size of the object, with one directory tree, I mean, one level of hierarchy; it will be running 10 clients and 1000 objects. This is triggered. Now, on the left-hand side, the watch part may take some time to reflect it. You can see the load is increasing. It is still not reaching that 10 million, as I mentioned before; if it crosses 10,000 or something like that, then it triggers the scaling. So I still have one pod. Yeah, now it crossed the limit. If you look closely, there is a kind of, what do you call it, cool-down period or something like that: it will wait for some time, 90 seconds or so, before scaling, and only then does the scaling happen. As you can see, after crossing the limit, it scaled up by one. Now it will scale again; the load is still increasing, and the two pods are not satisfying the requests. So it scaled again. Now the workload has become a bit less. Still, it is above 10 million requests. If I run the query here on the Prometheus server, you can see three instances serving the same requests. Another, fourth pod is up; that's four instances. If I go back to the terminal, yeah, you can see a bit of a decrease in the load, but the load never became less than 10; that is, it was still increasing. So, yeah, I guess that is all I have. I do not have the scaling-down part, because it was taking a lot of time. So, that is it. Any questions? What is the use case of scaling down to zero? So, the question is, what is the use case of scaling down to zero? Maybe you can save resources if you are not using that service.
It depends on you. If you define zero as the minimum, then when it is idle, it will scale down to zero. That is it. But will it then scale up fast enough? Yeah, once the threshold is crossed, it comes back up. I haven't played with zero, so I don't know whether RGW will misbehave, because if you scale down to zero, the service is not there for RGW. So I am not sure whether it will work for RGW. But, yeah, mainly it will save resources, if you have anything else to run. Yeah, sure. Would something like that work for scaling OSDs? OSDs, I am not quite sure, because an OSD has a dependency on a hard disk, so I do not know how it will play out in the case of OSDs. But it can obviously work for NFS, or it can work for MDSs. Scaling OSDs is very expensive; you need to move data. It does not fit this method of scaling up and down momentarily, I think. Once you have scaled out the OSDs, you need a big event to scale them back down. I just wonder whether that would be more of an upgrade for the server. I think the usual argument for OSDs is that you have to have hard drives in the storage servers to put OSDs on, and then you might as well deploy them right away, because then the cluster will operate better. But in the public cloud it is different, because there it is about cost. You do not want to scale up early, because it costs you more, and you want to do it at the last minute. So it makes sense, but not using this. And scaling down OSDs is also not a simple task. You need to do it in an orderly manner, otherwise you will be at risk of errors while trying to take them down. I once tried to write down the process of how to scale down and scale out OSDs in the public cloud. It was a document of something like 45 pages. If you want to do it in a safe manner, it is not simple. So, the question is whether we can use KEDA for cephadm, something like that. I guess cephadm mostly works with Podman. It does not need Kubernetes, and this is specific to Kubernetes. No, not with KEDA.
Maybe if we define something similar for cephadm, then yeah, we would need a scaler so that, if we hit this number of requests, cephadm can trigger the scaling, or the ingress, etc. That is possible, but not with KEDA. KEDA is for Kubernetes, yeah. Okay, yeah, sure. Is there anything similar for tuning, for suggesting tuning? Sorry. So, the question is anything? So, I think he's asking, if I understood correctly, whether you can suggest configuration tuning based on the workload. So, the tuning comes from a kind of YAML, like, which preferred YAML you need to use or something like that. It's kind of data based on the scaling, right? And the scaling is based on HPA only. So, HPA does not look at the configuration. It just looks at the deployment, it just sees the replica counts. About tuning, right, that is a rather long topic. But I recently read a research paper where Lustre was doing machine-learning-based performance analysis on ML and AI workloads and suggesting what configuration you should be using. I think I shared that research paper once in the performance weekly. But this is a really nice idea: you use AI models to learn the workload and everything, and suggest the configuration tuning that I want in this Ceph cluster. Okay. This is more of a discussion for the performance weekly. We can have more discussion in the upstream community meetings. We could work with them on that. Sounds fun initially, but I don't know how much. Okay. So, thank you, folks. Thank you. |
A Sovereign Cloud - Opening Remarks |
So, hello and welcome to FOSDEM 2023. I hope everyone is awake and looking forward to two days packed with lots of talks; at least I am. And we, that's Kristi, Thorsten and me, are the organizers, or devroom managers, for this room, so if you have any questions, you can approach us. We are here to guide you through the day, and we are also the ones who submitted the idea of the sovereign cloud devroom to FOSDEM. We did that because, well, in our day jobs we actually work in that space. Thorsten is from the Operate First project by Red Hat, and I work on the Sovereign Cloud Stack, which is also a project with a focus on sovereign cloud infrastructure and standards. Of course, as we work with that on a day-to-day basis, we think it's an important topic. The term sovereignty, digital sovereignty, is used a lot lately, and it actually needs a proper definition. There also needs to be a discussion about what it actually means and what is needed in order to work towards sovereign infrastructure. If you look at the talks that are happening today in this room, that will be discussed a lot. And, yeah, I don't really have much more to add to that. Thorsten, do you want to?
Again, yeah, so I'm working with Red Hat and I'm in the Operate First project, which is about the idea of open operations. You can see the stickers here, if you want to pick them up. The idea there is that we offer a more or less free service where you can open up your own Kubernetes clusters and deploy your project there, and you also have a kind of open SRE there, because you can see the different infrastructure data. So that's the basic idea of open operations. With open operations, you get the chance to be sovereign with your data, because it's easy to move and it's easy to see, and we are totally hybrid, so it doesn't matter if it's on your own premises or in whatever cloud, or whether you're switching in between. So Operate First is one of the proofs of concept that the idea of a sovereign cloud can work in practice. Hi, I'm Kristi Nikolla, I'm a software engineer with Boston University. I work on the MOC Alliance project out of Boston University, whose goal is to create an open cloud for academia, but not only for academia. I also work on OpenStack: I am on the OpenStack Technical Committee and I am a maintainer of the OpenStack Keystone project, which is for identity management. I also work, to varying degrees, on Operate First with Thorsten, on Open Infra Labs, and on various acronym-rich things like NERC, NESE, etc. Working on open source, and having worked on deploying and maintaining open clouds, the topic of sovereign clouds is very important to me. I see it as the next step: not just opening up the telemetry and how you're deploying it and how you're running it, but also opening up the governance and all of those other aspects, to actually create a place where everyone can contribute to all aspects of the cloud and not just take what you're doing and apply it to a different degree.
So with that, FOSDEM already had the first highlight for me, because I did not know that you are the PTL for Keystone. That's really useful for me to know, because we just submitted patches to Keystone for federated authentication, so we should really talk a lot more. So with that, the first one who will talk will be my colleague Max. He will prepare, and then we will give a proper introduction to Max and start in 10 minutes, right? Yeah, in 10 minutes, right. And feel free, here's some candy, some chocolate, if you want to, and there are stickers there. And yeah, enjoy the day, right? |
How we created a Documentation Framework that works across a group of vendors in the sovereign cloud stack community |
We are starting with the first talk in the Sovereign Cloud devroom, and it's my great pleasure to open the stage for Max, who's going to elaborate on how he created a documentation framework for the Sovereign Cloud Stack. Enjoy. Have fun. Thank you very much. Hi, everybody. Thank you. How did we create a documentation framework that works across a group of vendors in the Sovereign Cloud Stack community? This I will show you. My name is Max. I'm knowledge management engineer at the Sovereign Cloud Stack, and my background is broadly in web development. The talk, TL;DR, what is it? Basically, it's a set of GitHub Actions workflows to copy Markdown folders and files from many As to one single B. And once that's done, B is built into a single-page application, resulting in the Sovereign Cloud Stack documentation at docs.scs.community. Pull out your mobile phones and have a look at it. And I will now elaborate on what's behind this. So what is the Sovereign Cloud Stack? The Sovereign Cloud Stack combines the best of cloud computing in one unified standard. And SCS is built, backed, and operated by an active open source community worldwide. Yeah. This is the SCS stack, and it's made up of many modules. And yeah, it's quite complex. There is, of course, documentation, but it's heavily distributed across different places on the internet. And that's quite difficult to keep track of. Where do I have to go? Where do I have to be routed as an integrator or developer? That's quite a challenge. So as an integrator and operator, you have to manage different documentation in different Git repositories and different docs. And it would be nice to have them within one platform, to search and to have guides and tutorials and references. And a great DX or UX. But how could you do this? This was my challenge. And the aim and requirement for the documentation is that it has to have a minimal rule set, where approval is basically a no-brainer.
And all doc files will reside in one place. And there must be, and this is probably the biggest challenge, a low entry hurdle for companies with existing repositories. Because no one wants to change their whole documentation just because it's being consumed by another stack, basically. So no one has to make any major changes. And then we were thinking about whether it could be done with Git submodules. And no, this is too complex to manage. And then Git subtree: OK, this could be nice. But actually, it's not cool either. And then other vendors said, no, we don't want to change anything in our existing repositories. So then we had a great community hackathon in Cologne in November '22, hosted by plusserver, where we developed, in eight hours straight in a tunnel, a working prototype with a custom GitHub Actions workflow, or a combination of them. And this is one of the great things about the Sovereign Cloud Stack community: everyone is joining forces to actually work on the common challenges. So yeah, thanks also, and a shout-out to Ramona and Tim from OSISM, who worked on this prototype with me on that day. Yeah, so how does it work? Basically, it's a GitHub Actions workflow. It's actually three workflows, as you see here. And the whole concept is built on Docusaurus, a React-based static site generator you may have heard of. It's open-sourced by Meta. And the three workflows are divided into collecting all the different documentation files, then distilling them, so that you only have the Markdown files from the repository, and then building and deploying. Let's jump into how this works. We defined docs.package.json, where you define which doc files you need, from which organization's repositories. So basically, it's a package manager for docs. In the first line, you have the repo with the organization. So SovereignCloudStack/docs is the first one. And then you define the source directory, the folder where the docs files reside.
In the next line, you define the target, where it should be placed in the site. Currently, there's a community page and a docs page: docs for technical documentation, and the community page for how we organize in the community, and also a contributors' guide. And the label is how it should appear in the navigation. Then the first workflow reads the JSON and defines a matrix strategy within GitHub Actions. So for each element in the docs JSON, a workflow is run. This is the whole synchronizing and distilling workflow. We call it distilling because, like in a distilling process, it refines things down to what is needed. Most of the time, a repository contains the source code, and then in one doc folder, or docs folder, there is all the documentation, which we want; the other source files we don't want. So we're distilling it. We'll quickly go through the workflow. The first step is checking out the repository with a secrets action token. Then it does a clean install, so it removes all the previous files which resided in the directory, just removing everything that was there. Then it clones repository A, which is about to be synchronized, into the checked-out repository. In the next steps, it removes the Git folders, then it removes the README files, and then it creates the Docusaurus subdirectory, so either in the docs subfolder or in the community subfolder, with the target folder of the docs and the label. And then it copies all those files from A to B. Then it removes all the leftover stuff, and then it commits directly to main. So there are no pull requests. It's pretty nice. And it does this not in parallel, but one repository after another. For one repository, it currently takes about 10 seconds, and the whole build process takes two minutes.
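The synchronizing and distilling steps described above can be sketched in a few lines. This is a minimal illustration, not the project's actual workflow, which runs as GitHub Actions steps. The manifest entry, its field names (`repo`, `source`, `target`, `label`) and the function names are assumptions derived from the description in the talk; cloning repository A and committing to B are left out.

```python
import json
import shutil
from pathlib import Path

# Hypothetical manifest in the spirit of docs.package.json.
# Field names and values are assumptions, not the project's real file.
MANIFEST = json.loads("""
[
  {"repo": "SovereignCloudStack/docs",
   "source": "docs",
   "target": "community",
   "label": "contribute"}
]
""")


def destination(site_root: Path, entry: dict) -> Path:
    """Folder in the site repository (B) that receives this entry's files."""
    return site_root / entry["target"] / entry["label"]


def distill(checkout: Path, entry: dict, site_root: Path) -> list:
    """Copy only the Markdown files of a cloned repository (A) into B."""
    src = checkout / entry["source"]
    dest = destination(site_root, entry)
    shutil.rmtree(dest, ignore_errors=True)  # clean install: drop the old copy
    copied = []
    for path in sorted(src.rglob("*.md")):
        rel = path.relative_to(src)
        if ".git" in rel.parts:               # skip Git folders
            continue
        if path.name.lower() == "readme.md":  # README files are removed
            continue
        out = dest / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, out)
        copied.append(out)
    return copied
```

Running `distill` once per manifest entry corresponds to the one-after-another matrix run described in the talk. Because only `*.md` files are copied, source code in the vendor repository never reaches the documentation site, and the vendor repository itself is never modified.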
And then there's the build workflow, which is just a standard npm ci, npm build, and then it's deployed to a static server. And this is the result of it, our documentation page, where you see on the left side the docs that are currently distilled into the SCS documentation. This will, of course, grow, because we're currently in the process of defining the standards and putting it all together. So there will be a lot more in the future. Yeah, that's basically how we managed to do it. And if you have ideas, feedback, or criticism, then have a look at our repository. We meet in the SIG Documentation, the special interest group for documentation, every second Tuesday from 11 a.m. Central European Time; you can have a look at the docs.scs.community page, or the scs.community page. And what is still to come: we're currently in the process of creating the whole framework. Structure-wise, we're adapting towards the Diátaxis framework. You might know it. I wouldn't call it a framework; it's more like a taxonomy for writing excellent documentation. Currently, the workflows are triggered manually, and that is soon to be fully automated. There will also be an interactive overview of all the standards which are currently being worked out within the Sovereign Cloud Stack community. And there will be a fancy community space, which is currently also in development. Yeah, join our Matrix channel. And thank you very much. Thank you. Questions? Ah, here we go. Do you do releases of SCS? Yes, we do releases. OK, so how do you handle versions of the documentation? If there is a new version, we will use the versioning tool of Docusaurus, which is basically a command-line command. It packs all the files and folders that are currently there and puts them into a version directory, basically. And then, yeah. There are no separate versions? So there are no separate versions on the website for the documentation.
It's just the latest, always, on the website. No, in the future there will be different versions on the website. OK. Yeah, currently it's the latest. Yeah. Yeah. No, there was a person first here. Thank you for this talk. It's very informative. So my question is, what was your documentation pain point that inspired you to say, hey, let's have a hackathon and let's put our documentation in Docusaurus? And did you solve those pain points using your workflow? Yeah, thank you for the question. Actually, the hackathon was planned prior to this, so it was perfectly placed for us to solve this problem. And the biggest problem was that no vendor wanted to change anything in their existing documentation, so no metadata. And the thing is, with this workflow, you don't have to manipulate the external organization's documentation, files, or folders. You can just copy them, distill them, and integrate them, and the navigational part we do only within our repository. So it's minimally invasive, basically. Were folks updating your documentation before you changed platforms? I'm just not clear about what the problem was that was being solved. OK. Oh, thank you. Yeah, thank you for the talk as well. Just a question: what are your input files? I mean, documentation appears in several formats. For instance, it's part of C files, but you have to extract it into, I don't know, PDF or HTML or whatever. Yeah, we are only accepting Markdown files. Oh. Yeah, and the problem was that you have different vendors. For example, if you pull all those Markdown files and throw them into one folder, then it's going to be a mess. But we want to curate how the docs are built. So that's the nice thing with Docusaurus: you can define within the sidebars how your whole navigation is built. So you don't have metadata lying directly in the vendor's repository with the individual documentation. OK, so any more questions?
Hey, do you plan to support any other documentation formats, like AsciiDoc or reStructuredText or something like that? You know, Sphinx is using AsciiDoc, for example, or, more exactly, is it reStructuredText? Yeah, currently not. But it would be possible to convert it in a separate workflow. OK. But also, we use Markdown because there's a Markdown linting process that refines everything. Any more ideas? OK. Yeah, thank you very much. OK. Thank you. Thank you. |
Is Open Source Coming back to your Cloud? |
So, welcome to the second talk of today in the sovereign cloud devroom. I'm very happy that Peter, founder of Percona, is with us. And this is now your stage, Peter. Welcome. Okay. Thank you. Thank you. Oh, one. That's fantastic. Okay. Well. So, we're going to talk today about the wonderful topic of open source and its relationship with the cloud. Now, I think most of you don't know me, and I thought I would highlight my perspective first, right? So you can maybe better understand where I'm coming from. Myself, I have been involved in open source since the late nineties, which now seems like a long time ago. I was at that time building my first startup back in Russia. And I took the unconventional decision to use open source rather than stealing proprietary software, which was, you know, the most conventional thing to do, right, in Russia in the late nineties. Afterwards, I joined MySQL AB, before Oracle, before Sun. That was a tiny company of about 40 people, and that's where I got a lot of exposure to building and promoting open source. I was also the founder and, until recently, CEO of the company, which specializes in open source databases, right? And we release all our software as open source. And since that time, I have also been involved as an advisor, mentor, and in some cases investor in various companies around the open source space, right? If you want to summarize this whole presentation in a couple of words, it would probably be this, right? I believe that, everything being equal, open source is a better choice, right? Of course, you will tell me it's never everything equal, but then I think this second thing applies, right? I think as a member of a community, an employee, right, a manager of a company, it is always better for you long term to invest your time and your resources into open source rather than just send them to some, you know, proprietary vendor, right, which will provide more value to them than to you.
Now, one thing to understand about open source is that open source takes time. This is a slide I shamelessly stole from a talk by Bruce, with this last name nobody can pronounce, so I won't even try it, right? But this is the Bruce in the PostgreSQL community, right, who often talks about PostgreSQL and open source, and he uses this chart which says, hey, you know what, compared to proprietary software, open source tends to take a longer time, right? Because if you get a small team together, you know, fund them and say, hey, build this, we can do it faster compared to open source, which is a lot more organic and fluid, with a lot more discussions, you know, a lot more people operating maybe on a volunteer basis, each of which takes time. But then, over time, it overtakes the proprietary software according to many criteria in many areas. And the example which I would use is this: Linux and Solaris. When I was getting started with open source, I had a lot of friends who were running real Unix operating systems, you know, Solaris, HP-UX, AIX, anybody remember those, right? And they would think, well, Linux, that is a toy, right? That was a 32-bit operating system at the time, and it didn't handle, you know, multiple CPUs well. You could not even have a file more than two gigabytes in size at the time, right? I mean, you know, it sounds like a joke, right? And rightfully so, but where are we now, right? I mean, how many of you guys are still running HP-UX, anyone? Or AIX? Solaris, probably, maybe there should be one or two people, no? No Solaris folks? Well, we can see that over time, Linux, even if its development progress was kind of slow and messy at times, really overtook those other operating systems, right? Especially if you look at the server side, right, Linux is well and truly dominating.
And if you look at the cloud, we can see another comparison here, right? I remember also, back in those early nineties, folks would be using, you know, ASP.NET with Windows stacks, and wow, it was so fantastic, with graphical user interfaces, whatever, compared to us open source folks who would be coding, you know, PHP or Perl in vi, right? Or Emacs, right, if you were from the other camp at that time, right? Which was very different, right? But if you look at how things have developed, now we have fantastic development frameworks, right, very efficient, among many open source frameworks, right? Things evolve. Now, at the same time, if you look at it from the cloud standpoint, we are kind of in a similar situation. You can go to these very polished, integrated, yada yada, but very proprietary stacks, which Amazon or many other huge proprietary cloud vendors are ready to provide you, right? Or you can have open source solutions which are not yet as polished, powerful, integrated, and so on and so forth, right? But I believe, well, they are going to get there. Now, what I want to talk about a little bit more at this point is where we are now, the state of the ecosystem, what is happening and what the trends are. One: if you think about the relationship between clouds and open source, you can say, well, on one side, the cloud vendors have been making open source a lot easier for other people, right? I mean, if you look at deploying some open source technology, and I would say it especially applies to databases, it's much easier to get, let's say, Postgres running at a cloud vendor compared to getting your own, you know, highly available, monitored, backed-up PostgreSQL cluster on virtual machines, right? But on the other hand, we can also see all those cloud vendors use this kind of old strategy, which was actually used even before the cloud, which is called embrace, extend, extinguish, right?
And what that means is, hey, you know, why don't we take the open source and make it fancier? Fancier may mean more features; in the database space, where I come from, that would be something like Amazon Aurora. It's wonderful, there are those additional features, very performant, whatever. Or maybe more usable, maybe there's a fancy GUI, so you can deploy it with a couple of clicks, right, instead of having to put in some elbow grease, right, to do it the open source way. But then, yes, you go there and you kind of check in at the Hotel California, right? Which, if you know the song, you can't really ever leave again, right? Now, what I think is important to understand in this case, and for all of us as open source developers to recognize, is that usability and ease of use are increasingly becoming the new frontier of the competition, right? And I think that is especially hard for people of my generation, who got used to that very hairy kind of open source, right? We remember, you know, compiling open source from source, right? And we have all those skills, right? But if you think about what is happening right now in the software industry in general, as well as in open source, it is this. One is that software just, you know, absolutely explodes in amount, right? Everything is software these days, right? The amount of software which is being written, will need to be written, needs to be maintained, and so on and so forth is, you know, absolutely amazing and growing every year. The same happens with data, right? We have more and more data and more and more ways to process that data, for good and for evil, right? But at the same time, we have to do that with less, especially in terms of skills, right? You can see, as a software engineer, right, in technology industries overall, we have a lot of people getting into this field who are not really going to be able to debug Linux kernel code, right?
They have, you know, just the obvious basic skills, right? Well, for them, you need to build software which is easier to use. In fact, what I think is very interesting is that we increasingly see some experts, even in open source technologies, who actually only have cloud-specific skills, right? Well, especially in the US, right, where I think adoption of the cloud is substantially further along, right? You may have somebody saying, look, you know, I am a PostgreSQL expert, right? And you talk to them about what that means. He is very capable of provisioning, you know, PostgreSQL Aurora on AWS with two clicks, right? But you ask, well, you know what, what is Patroni? And it's like, what, what? Well, that's a high-availability solution for Postgres, right, if you guys don't know, right? Which is very basic. I think another interesting thing, especially from the standpoint of large corporations, is that a lot of them want to build their businesses by the book, right? So if management made the choice that we are adopting, let's say, Amazon cloud or Google cloud or Azure, it doesn't matter, they often then come to the vendor and say, OK, how do we do that? Give us the book, right? And what are those cloud vendors going to tell them? Oh, they're not going to tell you, well, you know, to use our basic services which you can also find in another cloud. They are going to tell you to use the most proprietary, advanced, you know, managed solution, right? If you read what Amazon will say, it's, hey, you know, go use DynamoDB, right? It's awesome, right? Or Amazon Aurora, Redshift, all that kind of stuff, which makes you the most sticky to the cloud and which also provides the most margin to those folks, right? So what is also interesting in this case, right?
If you look at Amazon Aurora, for example, as they see their customers getting more and more locked in, the price differential between the compute resources, right, you know, storage, right, and essentially the other software, is becoming larger and larger, right? If every new generation of CPU can actually, you know, I find that excuses, right? So, I find that changes. So that means it becomes more and more expensive over time, right? And you will also see this environment really embracing scaling by the credit card, right? Because scaling is easy, right? You don't have to figure out how to make do with the server you have because getting the next, bigger one would take three months, right, even if you got a budget approved, right, like we used to have 10 or 15 years ago. Now, you know, screw those optimization skills, right? You can just go to the larger instance size, right? And that's easy. And again, from what we see from a database standpoint, a lot of databases in the cloud are optimized very poorly exactly because of that trend. Now, what is also interesting for me is how that plays with venture-funded open source, which is a lot of open source these days, of course, right? And what we can see is that they are really, well, protesting against unfair competition from cloud vendors, yada, yada, right? And many of them abandon open source licenses, right, and go to something else, right? Which is, well, really shifting to proprietary software models. Now, this is, in my opinion, very different from community-driven open source, right? If you compare something like PostgreSQL to MongoDB, right, which I think in the database space are at very opposite ends of that open-source-embracement spectrum, right? PostgreSQL has really benefited a lot from the cloud, from an adoption standpoint, right?
Because the cloud made PostgreSQL much easier for a lot of people, right? But at the same time, of course, now there are different players which dominate the ecosystem. Yes, it used to be that EnterpriseDB was the biggest dog, right? Well, now it is probably Amazon, right? Amazon and other cloud vendors make more money, but you know what? They are hiring a lot of PostgreSQL engineers and still investing a lot to make PostgreSQL better. So, to a large extent, the PostgreSQL community still benefits in this regard. What has also been interesting for me in the last decade or so is the idea of growth at any cost, right? For a lot of companies, you can find, well, at least in the US stock market, you know, that the more money a company is losing, the better its valuation, right? As long as you can grow the user base as well, right? And in this case, you really gravitated towards asking, how can I grow faster? That means focusing my resources, and the cloud, which often makes things simple, though expensive, right, was very much adopted, right? And for those companies it often didn't make sense to invest in efficiency, which I think open source often brings, right? Even though it requires a little bit more investment and elbow grease and management attention to make those things happen. Even more, though, you can see the cloud has been quite good at giving you your first shot of heroin for free, right? You can find that with every cloud vendor: hey, oh, you're a startup, great, we'll give you some, you know, big chunk of money so you can spend it on us, right? Or giving you the free tier, right, of whatever solution. Well, I guess with this analogy, you understand, right? If somebody is giving you a first shot of heroin for free, they may not always have your best interest in mind, right? Now, I think what is interesting is that there are some revelations which are coming, right? I think in this case things are changing.
And I will provide a bunch of links to the detailed articles here, right, which you guys can check when you get the slides. This one is interesting, you know, the trillion-dollar paradox, right, which talks about, well, you know what, cloud is damn expensive. And for many cloud companies, a very large portion of their cost of revenue is actually cloud spend, right? So if you are choosing to go with the cloud, especially the most expensive versions of it, well, you know what, you have to be giving up a lot of potential profit margin. Recently, you may also have heard of 37signals, who talk about why they are leaving the cloud, and they detailed that. That is also a very influential company, right? It's not very big dollar amounts, right, compared to enterprises here. But I think it's very nice that they went public in this case and said, hey guys, even at our scale, even if you spend maybe, you know, less than 10 million dollars a year on the cloud, it makes sense, right, to actually do things differently, right? And 10 million dollars may sound like a lot of money to some of you, but if you look at even mid-size companies and, well, especially the larger corporations, they often spend hundreds of millions of dollars on cloud spend. Now, another trend which I think is interesting, which is coming out, is GitOps and generally the approaches where we have a declarative, versioned, infrastructure-as-code approach, right? Because when you do that, then often that fancy point-and-click GUI I can use in the cloud is not exactly how you do things anymore, right? You do it more, well, through, you know, an API, the command line, if you will, right? There, open source tooling is already better. Hybrid cloud is another interesting trend, right?
And what I want you to see from this graph is that the larger the corporation is, the more of them are embracing the hybrid cloud, right? That's from the CNCF poll, right? And if you are using hybrid clouds, multiple cloud vendors, you know, public and private cloud together, then you probably do not want to rely too much on those, you know, proprietary services, right? Because they tend not to be as portable. In the end, I think what we're seeing is that open source kind of catches up again, leaving us with two choices here. If you look at AWS, which is obviously the biggest dog, and where you can replace that with, you know, Azure, GCP, right, or some other major cloud, you can really lock in to the approach of a cloud vendor, right? Or you can use the Cloud Native Computing Foundation stack, right, among others, right, where you are really treating the cloud as a commodity and building the value through open source software. What I think is interesting about the cloud native environment is how big it is, right? You guys probably cannot even see all the logos, right, on the slide, right? There is a lot out there. There is typically more than one solution even for a single problem, right? And some of you say, hey, that is too much, but that is also the open source way, right? In many cases, in open source we need multiple solutions to the same problem, right, which will evolve, right? And maybe some of them will die off or join forces together, and that is how great stuff happens, right? But I wanted to show that just to illustrate how much momentum there is behind the open source approach. Now, I think what that allows us in this case is to really give the proprietary cloud its originally intended role of commodity infrastructure. Why am I saying originally intended role? Well, you can see this, this is actually a slide I took from some very old AWS presentation.
Then they were just bringing the cloud to the masses and saying, hey, guys, you know what? You don't usually run your own generator at home. It's not convenient. Well, yes, in some cases, it makes sense, but mostly you buy electricity. So you should do the same with a cloud when it comes to compute resources, storage, yada, yada. It will make sense. But here is a key difference, right? If you buy electricity, it is a commodity. You can buy electricity from a variety of vendors or make your own, and all your appliances are going to work. You don't have to change your TV or your fridge if you're using the other electricity, right? Well, and this is what a lot of cloud vendors want us to do: we'll provide you a commodity thing, electricity, but you know what, we also really want you to use those appliances which will only work with our kind of electricity. Well, that sounds like bullshit to me. So with this, I think we have really two choices for how we can approach the cloud, which I will call the cloud of serfdom and the cloud of freedom. One is to use a lot of proprietary features of your cloud vendor, sell your soul to the devil, and all the other stuff. So let's vote. How many of you are going to go that way? Any hands out there? No? I would expect there are some people working for Amazon here or something who would say, yeah, yes, yes, okay. And then we can use the cloud of freedom, which is saying, well, cloud vendors are fine. They are good commodity infrastructure providers, like the internet providers, right? Just give us those commodity resources and we will build the rest of the stuff using open source. How many of you think that is a good idea? Okay, okay, that's better. Okay, now in this regard, what do I think is a very good API? Well, it is Kubernetes, right? I would not say that Kubernetes is absolutely perfect.
It's not, but I think that every successful open source project started as a piece of crap, right? I mean, it's just how things go. Because you build successful open source projects not by having a solid technical foundation, despite what engineers want to believe, but by building the movement, building the ecosystem, right? And then that ecosystem is able to redo the code, fix mistakes, whatever, and build something. With the Kubernetes example, I remember there was Mesos. Anybody remember Mesos? Well, that was a very cool project, right? I remember talking to those guys and they had, let's say, five years ago a lot of very cool things which Kubernetes did not have, but you know what? They lost the adoption game. And now you can see things like, next, oh, no, sorry, next, the Container Storage Interface and some other stuff. Hey, you know what? Kubernetes is getting a lot better for data-intensive applications, not just for the narrow use case of stateless applications it was originally designed for. We also now have the Data on Kubernetes community, which is a special community focused on running data-intensive applications on Kubernetes. And what is interesting in this case, also from the CNCF survey, is that we can actually see that from 2021 to 2022 it was the databases, right? So something which you would not ever have thought to run on Kubernetes five years ago is one of the fastest growing application types being run on Kubernetes these days. Now, I would say even more: what we see right now is, if you look at a lot of modern independent database-as-a-service providers, all of those folks are actually using Kubernetes on the back end to build their database-as-a-service solution.
You may not know that, you may not care, but they are doing that, and that means that for them it is both an efficient way to build it and operate it, and it can be run stably enough, at least with good skill. Another interesting thing going on with Kubernetes is that Kubernetes is already quite ubiquitous, right? Hey, you can run it on pretty much any cloud. Every Linux distribution now has its own Kubernetes distro, at least the major ones. You see Kubernetes from VMware, and many other solutions right now. But we are also having Kubernetes come to other places. I like this K3s solution, which is lightweight Kubernetes, which you can deploy on edge IoT devices, or if you just have a couple of servers in your basement that you want to run Kubernetes on. We are also getting more visual solutions, right? Because if you look at raw Kubernetes, you often would be seeing the console and YAML files and everything. But now we also have a pretty good dashboard which comes with Kubernetes as a project, as well as integrations with applications. There are solutions like Kubeapps or the Rancher application catalog, where we can say, hey, if I'm running Kubernetes, I can deploy open source and proprietary applications on my Kubernetes solution in a couple of clicks. Kind of similar to what we see now in the major clouds, where we have their marketplaces. Okay, so with that, where do I think things need to be going and will be going next, as we are getting more and more open source software in the cloud? One, we still need to continue working on better integration. Because I think that is the name of the game right now, right?
We expect really all the software, and the software and hardware, to be very well integrated with each other, right? And that is something where I think a lot of proprietary vendors put a lot of focus. Because if it's not easy to make two things work together, then for the majority of people it would not be accessible. Because remember what we spoke about? A lot of people do not have very advanced skills these days. A related thing, of course, is better usability, right? I mean, we have expectations of usability very high, attention span very low. I mean, I'm not sure. Do any of you have teenage kids? Or am I just too old here? What do you think about the attention span these days? Okay, well, and remember, those teenagers are probably already doing some work with computers. Some programming skills are often taught in middle school these days. And they will be going into the industry in the next few years. And the last thing I would mention, and that again maybe comes from my database background, is making sure we also have a good experience with day two operations, not just day one. In very many cases, I see there are a lot of application catalogs where we say, hey, look at that, you can deploy an application very quickly. But then if you really want to maintain that, well, with a database, you can't just tear it apart and deploy a new one. That is something which can become quite complicated quickly if it's not handled well. Well, with that, I will end with my call to action, pretty much repeating what I started with. I would encourage you to embrace open source in the cloud, and also to invest, both in terms of your personal time and effort and in motivating your company to invest in making it better. Because open source very much depends on our collective effort for its success. And of course, spread the word, that is how the movement grows.
Well, with that, it's all I have and I would be happy to answer some questions. APPLAUSE Thank you very much for your talk. You said in your final slide that people should be investing in open source and invest in making it better. What are your thoughts on the new kind of trend you see of organisations like the Sovereign Tech Fund in Germany, allocating millions to open source projects and things like that? Where do you think the money should come from? Or is there a structured way it should actually be allocated? Well, so where should money come from for open source? Look, I think there is a variety of things, right? If you look at it from the government standpoint, governments have classically been supporting public good with investment in research, medicine, whatever. I think open source is one of the ways you can create a lot of public good, which is wonderful. I think though, in many cases, probably the majority can come from corporations, public corporations or, I don't know, private corporations, acting out of their own interests. I think in this case, we all also need to work on that kind of mentality change, because you see, for years, before Microsoft changed and became a we-love-open-source kind of company, it was all about, oh, open source is evil. It's, you know, communists run open source, that's how you lose all your property, open source is insecure, yada, yada, yada. And I think that is where spreading the word is important, making sure people see that it is actually none of those things, and that really embracing open source and focusing on real open source can be a very good long-term ROI for corporations, maybe not short term, but long term.
Well, in this case, I probably want to mention one other thing, which I think is still a challenge these days. I think the open source term is being bastardized. We can see many companies try to say, oh, our software is almost as good as open source, but then they give you some very restrictive licenses, or have an open core where you have a crippled open source version and the real proprietary software, and we need to make sure companies understand what open source really is, and focus on growing that rather than those others. Make sense? Thank you. Hello, thank you for the presentation. In the beginning, we started with a good willingness to make it very portable to other clouds and that kind of stuff, having a deployment that can be multi-cloud. And we missed something because at some point, the experts that were there to help us were pushing for some proprietary or non-compatible deployment, and we are totally stuck now with one cloud provider. How do you manage such a risk, and what would you put as a rule set at the beginning so that you would not fall into that pitfall? Yeah, well, I think that's a very good question. You can say, hey, we made a mistake and now we are stuck, right? Well, I think in this case, if you're looking at portability, it's important that the portability is being tested. So in your case, if you say, hey, we are mostly on that one cloud vendor and we want to be able to move to another cloud vendor if we need to, well, that means you need to regularly test that your software actually works on that other cloud vendor. So if an incompatibility or overreliance is introduced, you can instantly squash that.
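The advice above, to test portability continuously instead of assuming it, can be sketched in a few lines. This is only a toy illustration, not anyone's real tooling: the target names, the feature labels, and the `portability_report` helper are all hypothetical, and a real check would deploy to the second provider and run the actual test suite there.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Target:
    """A deployment target and the capabilities it offers (hypothetical labels)."""
    name: str
    features: FrozenSet[str]


def portability_report(required: FrozenSet[str], targets: List[Target]) -> Dict[str, List[str]]:
    """Map each target name to the required features it does NOT provide."""
    return {t.name: sorted(required - t.features) for t in targets}


# Hypothetical data: the workload quietly started depending on a vendor-only queue.
targets = [
    Target("primary-cloud", frozenset({"postgres", "object-storage", "vendor-queue"})),
    Target("secondary-cloud", frozenset({"postgres", "object-storage"})),
]
required = frozenset({"postgres", "object-storage", "vendor-queue"})

report = portability_report(required, targets)
for name, missing in report.items():
    status = "OK" if not missing else "NOT PORTABLE, missing: " + ", ".join(missing)
    print(f"{name}: {status}")
```

Run on every change, a check of this shape flags the vendor-only dependency the day it is introduced, not three years later.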
Because if you don't do it, then, you know, three years later, when you finally want to do that, you can say, oh, my gosh, it actually turns out there are so many things that we did not do, right? And I think in many cases, what is also important is that developers do not read docs, right? They tinker. And often you think, oh, I am using this kind of feature, I assume it works everywhere, but maybe you are actually using something which only works in this particular place. That is why testing is important. You cannot rely on the developers or infrastructure engineers just using the subset they are supposed to be using. Okay. Well, thank you. One more. Oh, one more. Okay. Hi. Thank you. When we have some companies using a proprietary cloud provider, it is difficult for us to say that they need to move all their infrastructure, as you just said. So is it a problem of offer? Is it a problem of the free software cloud provider offer, to provide some easy tools to use? Or is it really a problem of politics in the companies, to say that, yeah, we should go to the difficult but free software? How do we manage that? Well, I mean, you are asking, is that a technical question or is it politics, right? And the answer is both. I mean, if you look from a technical standpoint, I think the easier the open source software is to use, the easier it is going to be to convince people, to adopt it, and so on and so forth. But in many cases, it is also politics, right? Because there can be some entrenched interest for whatever reasons, or some belief. I have seen a number of companies on my watch which would say, oh, my gosh, we are running on, let's say, a Microsoft stack. There is no freaking way it is possible for us to move to open source, yada, yada, yada, right?
And then they have a new CTO, right? And boom. You can see in a few years more than half of the workloads are moved to different open source solutions. So I think in this case, if you look at what is the primary thing, it is politics and the desire of management. That is always primary. Okay. Thank you, Peter. Okay. Next on stage will be Fabio. |
On-premise data centers do not need to be legacy
We can and should learn from legacy on-premise data centers and the migration to the cloud to ensure the computing platform's future is bright |
So, Fabio, the stage is yours. I'm really looking forward to hearing why on-premise data centers do not need to be legacy. Thank you. So hello, everyone. And just to be clear, this is the agenda: we are going to cover a little bit of history, some lessons learned, and then some technology bets that I think make sense in such a conversation. So about me, I have been a Linux user for 20-ish years. I've been working with Linux for close to 20 years now, and I currently work at Red Hat and have basically this kind of conversation in my day-to-day job. So let's start with a little bit of history of the cloud. Let's call it this way. So Rackspace was founded in 1998 and I think was the first company that defined itself as cloud. In 2005, SoftLayer was founded. They defined themselves as bare-metal cloud. And then in 2006, we have S3 launched by AWS, which was the first AWS service. 2006 again, EC2. And then Google App Engine arrived, IBM bought SoftLayer, creating what is now called IBM Cloud, and by 2021, AWS had more than 200 different services. So what about the history of the non-cloud? Because what we have seen are all cloud environments, but those are nothing new if you think about it. So in 1964, which is probably older than anyone or most of the people in this room, IBM introduced the CP-40, and this machine had time-sharing technology, which was very different from what we call cloud today, but still it was probably the initial point of the history of the cloud, and in the late 60s, IBM released SIMMON, which is a hypervisor. By '74, the two kinds of hypervisor got defined as type 1, the bare-metal virtualization, and type 2, the hosted virtualization. And in 1998, VMware got founded and in the 2000s, the majority of companies moved from bare metal to VMware VMs. In 2001, ESX got released, which was a type 1 kind of virtualization. In 2003, we have the first type 1 open source virtualization, Xen.
And still in 2003, VMware introduces vMotion, which allows you to basically move a machine from one host to another without rebooting it. In 2008, Microsoft arrived with Hyper-V; it previously had some other kind of virtualization tool, but Hyper-V got launched in 2008. So what is the cloud? Why are we distinguishing the first group from the second one? Wikipedia says that cloud computing is the on-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user. So I think this is a good definition. I think that a better definition is a business model where one party rents to a second party computer system resources, especially data storage (cloud storage) and computing power, with the smallest granularity possible. And my point is, cloud is not technical, it is a business model. And if you think about it, we moved from renting machines like VPSes on a monthly basis, and then AWS introduced EC2, which was initially billed on an hourly basis and then by the minute and then by the second. And now you can buy Lambdas or similar kinds of things by the millisecond. And in a way, CPUs had the same shrinkage. So we moved from full CPUs or sockets to vCPUs, which basically are hyper-threaded threads, to fractional vCPUs with Lambdas or similar services. So my point being, the whole thing about cloud is not technical, it is only about the business side of it. So what can we learn from not only the last 20 years of what we can define as cloud, but also the previous 50 of what we can define as non-cloud? More specifically, because we have seen that the cloud model actually works. The non-cloud model was not very functional for the business, to the point that very often those data centers got outsourced or in some other way moved to the cloud, in the sense of moved to someone else, and the business started to expense those machines under basically an OPEX model instead of a CAPEX model.
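The point about shrinking billing granularity is easy to see with a little arithmetic. The sketch below is illustrative only: the `billed_cost` helper, the 90-second job, and the 3.60-per-hour price are made-up assumptions, and real providers add minimum charges and other rules that are not modelled here.

```python
import math


def billed_cost(runtime_seconds: float, price_per_hour: float, granularity_seconds: int) -> float:
    """Cost when usage is rounded UP to a whole number of billing units."""
    units = math.ceil(runtime_seconds / granularity_seconds)
    return units * granularity_seconds * price_per_hour / 3600


# Hypothetical job: 90 seconds of compute at a price of 3.60 per hour.
runtime, price = 90, 3.60
for label, granularity in [("hour", 3600), ("minute", 60), ("second", 1)]:
    print(f"billed per {label}: {billed_cost(runtime, price, granularity):.4f}")
```

With these assumed numbers the same job is billed 3.60 per hour, 0.12 per minute, and 0.09 per second, which is the business-model shift described above: the resource is identical, only the unit of rental changes.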
So there is one big aspect that we need to remember about this, which is the separation of concerns. First, standardize the interface between the infrastructure and the workload. If you go into legacy data centers, very often you have 1,000 different kinds of systems that the infra people have to provide to the workload people. And this is because, oh, my system is different, my software is different, whatever; at the end of the day, that is a huge load for the infrastructure part of the business. Second, the scalability needs to be at the workload level, so the infrastructure also needs to be somehow reliable and within some SLAs, but if the application has to stay up, the application will have to take care of this. And third, workloads have an abstract concept of whatever is underneath them, so the physical architecture. They don't need to know which data center they are in or in which rack, what the nearby server is, and so on. We also need a functional business model for a well-managed IT system. And the first part is, as before, standardize the interface between the workload and the infrastructure so that it's easily countable and priceable. Second, bill back the infrastructure cost to the workload owners. We have seen, at least in my definition of cloud, that we still have two parties, one that delivers a service and the other one that consumes it and pays for it. So it's very important to create this also internally in companies or organizations of any kind, because this allows the infrastructure side of the business to justify their expenses against some kind of, at least, recognition of revenue, cost recovery, whatever. And third, keep the cost down. This is a key point: AWS, Google, those companies will do everything they can to keep the cost down because they need to be cash flow positive.
Obviously, if you are a department in a company, it's slightly different, but it's very important to still be cash flow positive because this will guarantee that you will not have any issues over time with this part of the financial model. And then, maintain control. We have seen that the clouds are obsessed with maintaining control and obtaining even more control over their hardware, their systems, whatever, and this is very important for your own cloud if you want to be able to maintain it for 10, 20, 50 years. So the first one is, I would not say do not use, but be very cautious about using third-party proprietary software; those companies can go away, can change pricing model, can do whatever; be aware of this. Second, evaluate very carefully the buy versus build decision, because when you buy, obviously it's here now, but you don't have the know-how about it. So probably you will want to build a lot of your systems, not the core parts, but maybe the dashboard layer or that kind of thing, so that you can effectively manage it however you think best. And third, be very aware of lock-ins, because those will bite you over the course of the years. So how do I define lock-in? I define it as the product of the probability that a component will require substitution during the solution lifetime and the total cost of the substitution. So for instance, Linux: if you base all your architecture on Linux, it's going to be very expensive to move off Linux, but on the other hand, it's very improbable that you will need to do it, because very probably in 10, 20 years Linux will still be here. So a couple of points on technologies. The first one is keep the complexity of your system at the lowest level possible. Systems will get more complex and more absurd over time, so at least at the beginning start with the simplest thing possible. Second, prefer build-time complexity over run-time complexity. It's way easier to automate a build than to automate something at run time.
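The lock-in definition given above (probability of substitution multiplied by the total cost of substitution) can be written down directly. The sketch below is a minimal illustration of that formula only; the component names, probabilities, and costs are invented for the example and are not figures from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Component:
    name: str
    replacement_probability: float  # chance of needing substitution over the solution lifetime (0..1)
    replacement_cost: float         # total cost of substituting the component

    def lock_in(self) -> float:
        """Lock-in as defined in the talk: probability times cost."""
        return self.replacement_probability * self.replacement_cost


def rank_by_lock_in(components: List[Component]) -> List[Component]:
    """Highest lock-in first, so the riskiest dependencies are reviewed first."""
    return sorted(components, key=lambda c: c.lock_in(), reverse=True)


# Hypothetical numbers: Linux is very costly to replace but very unlikely to need it.
components = [
    Component("linux", replacement_probability=0.01, replacement_cost=10_000_000),
    Component("proprietary-queue", replacement_probability=0.5, replacement_cost=500_000),
]

for c in rank_by_lock_in(components):
    print(f"{c.name}: lock-in = {c.lock_in():,.0f}")
```

Under these assumed inputs the cheap-to-replace proprietary component ranks above Linux, which mirrors the Linux example: a high replacement cost alone does not make a dependency a large lock-in.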
And also when something breaks, it's better if it's simple because it's easier to fix. If you have to compile your stuff, compile it, but try to keep the complexity at run time to the minimum possible. Third, minimize the number of services that you deliver to your business or your workload owners, so that effectively you can guarantee that those services are exactly what they require and you are able to deliver them in a sensible way. So I think that one big point is containers. Delivering a container-based solution is probably the best option today, I think, and use a Kubernetes distribution, whichever you prefer and choose; as long as it makes sense, it's fine. As we'll see later, the Kubernetes APIs are now fairly well known, fairly abstract and fairly widely used, so they can be a good interface between the infra and the workload side. Also, you can do it yourself with a community distribution, call it whatever. Or you can buy a commercial distribution of Kubernetes. If you do, first, be sure that what you're buying is fully open source, so that you decrease your lock-in, because you are decreasing the cost that it will take you to move from this to any other solution. Second, buy from a trustworthy company; hopefully the company that you buy it from will not fail tomorrow, because if it does, you will have a bigger problem. And one with a long track record of not screwing their customers, because that's not good. And if they are heavily involved in the open source community, it's even better, because that means that they are driving the development and they have all the knowledge needed to eventually fix issues as soon as they arise. So around automation, use an immutable approach to your infrastructure. If you start to have different things and weird infrastructure going on, it will be a death sentence. Second, version your infrastructure; GitOps is an option. There are many others.
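One practical payoff of versioning infrastructure is being able to see exactly what changed between a known-good version and the current one. The sketch below shows the idea with Python's standard `difflib`; the two configuration snippets and the `changed_lines` helper are hypothetical stand-ins for whatever is actually stored in version control.

```python
import difflib
from typing import List


def changed_lines(known_good: str, current: str) -> List[str]:
    """Return only the removed ('- ') and added ('+ ') lines between two versions."""
    diff = difflib.ndiff(known_good.splitlines(), current.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical configuration snapshots, as they might be stored in version control.
known_good = "replicas: 3\nimage: app:1.4\n"
current = "replicas: 3\nimage: app:1.5\n"

for line in changed_lines(known_good, current):
    print(line)
```

In a GitOps setup the version control system provides this comparison already; the point of the sketch is only that the question "what changed since it last worked?" becomes mechanical once every version is recorded.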
No matter what you do, try to have versions so that effectively you can potentially roll back, or at least see what changed from a version that is known to be working to the current one, and automate the whole process. If you have humans involved, you will have issues. It will cost more and it will be effectively less resilient and reliable. So putting together all that we have seen, I would suggest to first create a multi-data-center architecture so that effectively you have all that redundancy and that kind of thing, but hide it from your developers. Maybe they know the region concept or the AZ concept, but don't show the physical layout to your users, otherwise they will start to do weird stuff. Second, use a tool to manage the clusters. Open Cluster Management is an open source project that does it. There are other projects that do similar things. It's very, very useful and it will help you over time because probably you will end up running many clusters. Third, I would personally suggest standardizing on the Kubernetes APIs as the only interface between the workload and the infrastructure, because those are, as we have seen, very well known. Use a bare-metal container platform, so don't put virtualization or other stuff under it. You will have, hopefully, enough workload to justify tons of physical servers; don't add complexity with virtualization in between. Automate all the infrastructure pieces and configurations, obviously, as we have seen. Start by providing only a few interfaces to your business and then eventually extend them when needed. So an example would be an OCI registry, object storage and pods, deployments, those kinds of basic things. And then if your business comes back saying, oh, we really need that, then eventually you expand. But the thing is, only provide new services when you are sure that there is a requirement for them. So, for instance, let's say that you want to do a database as a service.
You have already onboarded 100 applications, and 80 of those actually use MySQL. It would make sense to provide MySQL as a service, but it does not make sense to provide 50 different databases as a service, of which 48 will never be used. It's only complexity and cost for you. And then create a simple UX for your users that completely abstracts everything that is below. So, like, push your Kubernetes configuration here and we will manage it. And hopefully then you will be ensuring that all this stuff is versioned and so on, so that even when the workload fails for some reason, you can say, look, version N minus one was working. You did something, now it's broken. It's not the infra. So this was it. Thank you. I don't know if we have a couple of minutes for questions, no? If there are any. Thank you for your talk. Could you expand a bit on, I didn't get what the advantages were of building multiple data centers at first? So that is usually a business requirement, because they will say, oh, we want to have everything HA, or at least this service needs to be HA, and with one data center, it would be hard. Obviously it really depends: if you are a small organization, maybe two or three data centers could be okay. If you are a big organization, maybe spread throughout five, 10 legally different regions, then you will obviously need 30, 50 data centers. That's a completely different scale. Obviously all those are very generic suggestions and then you have to apply them to your specific situation. And just a quick follow-up on that, how do you hide that from the workload developer? So the line just after that one, where you say they must not know about the multiple clusters, how does that work? Yeah, so if you pick AWS for instance, they have the concept of region and AZ. So AZs are not data centers. Some are data centers.
Others are parts of a data center, with different availability zones within the same data center, and others are containers, in the sense of 40-foot containers full of servers. So the user does not know. They know that there is region X, AZ 1, 2, 3. What 1, 2, 3 means, no one knows. And no one cares. And that's the thing. Thank you for the talk. In your definition of lock-in, you spoke about the cost of portability multiplied by the probability of portability. But if you fail to assess the probability of portability, wouldn't you fall into a lock-in without being aware of it? Sorry, what do you mean? Okay, I will always run my cloud in Amazon Web Services. Why would I need portability? And then I start using locked-in products. So I will never be able to leave. Yes, well, you will be able to leave. It's always possible to leave. You will simply rewrite your whole application from scratch and you leave. So what is the cost of that? A billion? Okay. So now it becomes a billion of lock-in. That is my point. You can rewrite tomorrow from scratch, from the ground up. It's possible. How much will it cost you? A billion? Five billion? A trillion? Okay. That is your lock-in value. And that's the thing. Obviously, I would suggest you keep the lock-in as low as possible. So try not to be in a situation where you have to rewrite everything. Thank you. Hello, one quick question. So if your organization has a traditional manual approach to operations, which thing would you automate first? I would start from very simple processes, just to ensure that it works in the organization and the organization starts to understand it; processes like creating VMs or creating containers, whatever kind of thing you do, and then some things such as patching and so on.
But if you really want to go the automation way, it's way easier, after you have tested the thing a little bit, to start to say, okay, now we have version two of the environment that is fully automated from day zero. Otherwise, you will always be in a kind of automated but not completely automated situation. Thank you. |
Distributed Storage in the Cloud |
Okay, well, some of you may have been here for my first presentation; this one is going to be different and much more technology focused, if you will. And we will talk about distributed storage in the cloud, right? And my goal with this presentation is to provide you a very general overview of the options which exist. I am not an expert, right? And probably some of what I'm going to say is going to be wrong. So if it is, then say, like, this is fucking wrong, Peter, so I can fix my slides for when I talk next time. So don't be shy. Be engaged and that's going to be more fun for all of us. So the thing I would say to start with: as we discussed, I believe there are different ways you can approach your cloud, right? One is where you really lock in with the cloud provider, and another one is where you use one of the open source solutions out there. And as I said in my previous presentation, I would imagine that is how the cloud was originally intended. Well, I won't spend too much time on this because I already had a presentation on it, and also because we do not have too much time. Now one thing which people often ask me about is open source, which I think this conference is about. And if you are thinking about open source from the business standpoint, we often see a lot of different companies which promote themselves as open source or somewhere around open source. But how do you know if it is for real? And of course, one thing you can look at is the open source license and so on. And this is all good stuff. Another is to make sure you ask yourself, or maybe even the company representative, some questions about how things look, right?
One is, you always think about whether you can deploy that kind of solution or product on your own without incurring any additional cost, right? Because the software may be kind of open source, and the source is available, but, well, actually maybe the binaries are provided only to people who have a commercial subscription. Well, in this case, it's maybe technically open source, but on the practical side, there are some problems. And especially, I have seen some open source projects which would essentially withhold details about the build process, so it's not easy to do that. Then another question I always like to look at is the choice of vendors if you need any help, right? For many companies, just saying, hey, we're going to do it ourselves is not going to work; they want to hire somebody. And with some of the licenses around open source, there have been some restrictions. Well, you know what? You cannot provide consulting services around the software, or something like that, as per the license. And I think the third very valuable thing about open source is to see whether you can improve the software for your purpose. If something in an open source project doesn't really fit, can you contribute to it? And again, that I think is another very interesting property of open source software where there may be different shades. Sometimes open source vendors may be more or less open to that kind of thing. Well, now with that open source public service announcement done, I would touch briefly on Kubernetes. I think I spoke about it, the previous speaker spoke about it as well; it is a fantastic API, and that is something we are going to focus on here, right?
And why do I mention Kubernetes here? As we talk about open source storage in the cloud, I will focus a lot on what choices you have in the Kubernetes environment, because if you're speaking about the cloud and modern large-scale applications, a lot of that is now being built around Kubernetes. Now, storage in the cloud: what does that really correspond to? There are a lot of different storage types we can consider these days, and here is the list, which ranges from simple stuff like node-local storage all the way to databases. I define storage in a very general way here: if you need to store the data somewhere, that is storage. Some of those things, like node-local storage, are relatively simple; they are a direct replacement for the file systems we have had in our operating systems for a long time. Others, such as databases, can be very complicated. It is not just "a database": databases differ by data model, query language, various internal design decisions, and so on and so forth. Even if you look at the data model alone, these are some of the most common data models you would see. What is interesting is that over the last, I think, maybe 10 years we have seen an explosion of special-purpose databases, versus the situation before, where relational databases absolutely dominated the ecosystem.
What is also interesting in this regard is that databases right now are not just focused on a single data model: many databases are able to support multiple data models, which I think is a big trend, and potentially even speak multiple protocols. Here are some examples. ClickHouse, which is an analytical database, is able to talk its own ClickHouse protocol as well as the PostgreSQL and MySQL protocols. The idea is: whatever programming language and libraries you already use, you can just connect to us and run your queries. Fantastic idea. Or the time series database VictoriaMetrics implements things like the InfluxDB and Graphite APIs for data ingest. Again, I think, very smart. We also see some frameworks which do conversion and translation. For example, the FerretDB project allows you to use a PostgreSQL back end with a MongoDB front end, and Amazon recently released Babelfish, which turns your PostgreSQL into a Microsoft SQL Server compatible database. So a lot of interesting integration is going on these days. If you look at databases, we also see a lot of difference in purpose and design: operational versus analytical, how it is used, how it is internally structured, and so on. Why am I listing all that? Because in complicated environments, it's very unlikely you will be limited to only one database. Of course, as the previous speaker mentioned, you probably don't want to have 50, because that is way too much complexity, and you want to be very mindful about how you introduce them to your environment, but it's probably going to be more than one these days.
Now, besides storage, we are also speaking about distributed storage. Why is that important? If you think about it, it is all about redundancy, performance, and scale. If I just have storage which is not distributed, which is really one device only, I will be limited in all of those. I think this is even more important in the cloud. In the age before the cloud, we would often have one very powerful, very redundant server, maybe with hot-swap RAID and redundant power supplies, and we expected that beast was never going to go down. That is not how we operate in the cloud anymore. We assume any component in the cloud is going to die, and they actually do die more frequently: if you look at the stats for mean time between failures, let's say for VMs, compared to what you could get with some beast from the past, it's going to be different. That means we need things distributed, at least from a high availability standpoint. Okay, with that, let's look a little bit at the storage types, as promised.
First are our commodity storage types, and this connects to the previous talk I did. These commodity storage types are pretty much the same in every cloud. There are minor differences, but they are, I would say, commodity building blocks: they have a relatively simple interface and it is usually relatively easy to migrate, so the lock-in, the word we don't like on this track, is going to be relatively low with them. One is node-local storage, which I mentioned. Pretty much every major cloud, and even your second-tier clouds, typically offers some kind of local storage. It can vary in terms of the performance it offers and so on, but what it gives you is pretty much the same, and that is fantastic. If you are comparing offerings, I would focus on performance, because that is where surprises can await you. This cloud vendor and that one both have local storage, but one of them has it implemented as very fast NVMe flash storage and the other as something not so fast, and that may make a very big difference for your application. The second most common one would be network block storage. That's typically how we store data in the cloud so that it can survive the death of an instance. In Amazon that would be EBS, and all the other clouds have something similar.
We also have some additional solutions here coming from the proprietary vendors, which provide you some additional features, and there are actually quite a lot of different solutions if you want to roll out block storage with open source. I think this is very cool, and it shows how things are evolving in the open source space: we have had this block storage idea for a long time, so a lot of projects have evolved, and you have a lot of choices. The next type of storage in the cloud would be file storage, where you can say: I can mount something locally, not as a block device but as a file system. In many cases that would be an NFS or SMB compatible file system, or both. Again, all the clouds support something like that, a number of major proprietary vendors offer solutions here, and in open source there are solutions as well. You can see there are some connections: many open source projects which say "we are focusing on storage" may provide several different interfaces, and that kind of makes sense.
The next one would be the object store, and that is, I think, a very important component which appeared with the cloud. It is interesting as a new commodity storage type, because if you think about the age before the cloud, we always had local file systems, and we had network servers with remote file systems for a very long time, but we didn't really have anything like S3, at least not in common use. It has appeared and is used a lot these days as a building block for many applications, because it's actually very cool. It's kind of bottomless; you can access it over HTTP directly, so you don't have to pass the data through your application all the time; it's very scalable; and so on and so forth. Even many databases these days, both proprietary and open source, are starting to be built using an object store as a back end instead of a conventional file system.
What I think is interesting here is that there are also a lot of object store vendors, so it's not just Amazon, or even just the major clouds, anymore. Here you can see two types of commercial vendors. Our usual suspects, NetApp and Portworx, do have solutions for S3 compatibility. But we also have services like Wasabi or Backblaze, which offer S3-compatible services you can use as a less costly replacement for, or a supplement to, your main cloud. For example, you may say: I have my stuff in Amazon, but I want to make sure it is also backed up somewhere else. Well, there are a number of vendors out there. And if you want to run the storage locally, there are now a number of options as well. I specifically wanted to flag MinIO here, because I think they have been the most successful at providing an S3-compatible interface for the private cloud these days. Okay. Now let's look at databases and data stores. The interesting thing is that the previous storage types are relatively commoditized: they have relatively simple interfaces and are relatively simple to replace. If you store data in S3 and now want to store it in MinIO, well, guess what: you have a different endpoint and maybe some small configuration differences, but that is not a big deal. With databases it is very, very different. Even so-called similar offerings often end up being very distant from each other because of the complexity which exists in the database space. So that is, I think, where using an open source solution is especially important. So let's look at some of the databases.
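To make the earlier point about swapping S3 for MinIO concrete, here is a minimal sketch of what "a different endpoint" can look like in application configuration. The class, the function, the bucket name, and the MinIO host are all hypothetical and only for illustration; a real application would normally hand the same settings to an S3 client library instead of building URLs by hand.

```python
from dataclasses import dataclass, replace
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class ObjectStoreConfig:
    """Connection settings for an S3-compatible object store (illustrative)."""
    bucket: str
    region: str = "us-east-1"
    # None means "use Amazon S3"; otherwise the base URL of a compatible
    # service, such as a self-hosted MinIO server (hypothetical host below).
    endpoint_url: Optional[str] = None


def object_url(cfg: ObjectStoreConfig, key: str) -> str:
    """Build the HTTP URL of an object for the configured store."""
    safe_key = quote(key)  # keeps "/" but escapes spaces and similar
    if cfg.endpoint_url is None:
        # Amazon S3 virtual-hosted-style address.
        return f"https://{cfg.bucket}.s3.{cfg.region}.amazonaws.com/{safe_key}"
    # Path-style address, commonly used with self-hosted compatible servers.
    return f"{cfg.endpoint_url.rstrip('/')}/{cfg.bucket}/{safe_key}"


# The application code stays the same; only the configuration changes.
aws = ObjectStoreConfig(bucket="app-data", region="eu-west-1")
minio = replace(aws, endpoint_url="http://minio.internal:9000")

print(object_url(aws, "backups/db.dump"))
print(object_url(minio, "backups/db.dump"))
```

The design choice here is that the endpoint is the only field that differs between the two configurations, which is the sense in which these storage types are commoditized.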
First, what I would call queues, streams, data pipelines, whatever we want to call it. That is increasingly a very important component of modern data-driven architectures. You often want to say: data comes in, and it flows to a number of consumers, maybe being processed along the way. It's your data plumbing; it's not a conventional database, but it's very important. What I think is interesting here is that we actually have a lot of options. If you look at Amazon AWS, and they probably have more services than shown here, there is a huge number of solutions, partly because they started first, implementing something before an open source solution existed, and partly because Amazon in general has a huge number of different services these days; I think it's more than 200. If you look at the proprietary solutions, you can see Kafka being, I think, the most common solution these days for building your plumbing. Additionally we can see technology like Redpanda coming up, which says: we are providing something which is Kafka compatible (remember, I mentioned earlier that these days people often build compatibility with existing protocols), but faster, simpler, yada yada. I put Redpanda specifically on the proprietary side because they are one of those companies which started as open source and later changed the license to something not quite open source. We do have a lot of solutions in open source. It's good to point out that Kafka itself is an Apache open source project. Confluent has commercial offerings built on Kafka, but Kafka itself is open source, as are many other solutions in this space. What I think is interesting in terms of
queues is that there are also often solutions specific to a given programming language ecosystem, so you will find that Golang people often have their own choices compared to the Java people, and so on and so forth. If you look at relational databases, in the cloud we have a lot of choices, ranging from wrapped and extended open source databases to proprietary databases; if you want Oracle or Microsoft SQL Server, that is typically available on most clouds too. There are also a lot of proprietary solutions here. In many cases they come from your proprietary database vendor, or from the many companies these days which provide a proprietary managed service around open source databases. For example, you will find Aiven here, which on the one hand provides managed services for a lot of open source databases, but I still put them down as a proprietary vendor. If you ask, "is there an open source version of your fancy GUI, so that instead of paying you I can run it in my own data center?", the answer would be no. So fundamentally, their solution includes open source components as the core database, but as a whole it is not open source, and that applies to many vendors in this space. Now, if you look at open source, there are actually a lot of databases available, both from the old guard like MySQL, Postgres, and MariaDB, and the new kids on the block like YugabyteDB and TiDB. Percona also provides our own versions of MySQL and Postgres. But typically these require more manual work to deploy compared to the databases as a service which exist in the proprietary space.
Here are some choices in the analytical space. That is, I would think, one of the big decisions for relational databases, because building a database optimized for a transactional workload and one for an analytical workload is quite different; their designs are very, very different, so there are typically different choices out there. There is a little bit of overlap these days: some databases position themselves as HTAP, hybrid transactional/analytical databases. But typically a database is good for one thing or the other. Here are some proprietary relational analytical databases; you can see a number of very common solutions here, and then you also have a number of open source solutions as well. What is very interesting on the analytical side is that there is a very wide range of needs, and projects focus on different ones. For example, some databases mentioned here, like Presto and Trino, say: we want to provide federation, so you can take data from all the different data sources and join and query it directly wherever it lives. That's a very valid use case. Something like ClickHouse focuses on real-time analytics: if you want to insert the data and have it available for a query in the next second, that's what they focus on. Or TiDB, as I mentioned, goes the HTAP route. Okay, I have a sign to speed up. The other class of databases which is quite important is the document store. For many simple applications and new developers, you just say: you know what, SQL, relational databases, yada yada, too complicated. You just want to stuff your JavaScript objects directly into the database and work with them natively, not trying
to spread them across a normalized schema in a relational database. All of the major cloud vendors offer their proprietary solutions in this space, and we also have a number of proprietary solutions from other vendors; I would say MongoDB and Couchbase are probably the most popular, and they come in both cloud and enterprise flavors. Now, for open source, I have to say "open source and source available", because frankly the most popular document database is MongoDB, which a few years back ditched its open source license, so it is not an open source solution anymore. If you're looking for something open source and MongoDB compatible, there is an early-stage project, FerretDB, which provides a MongoDB-compatible interface on top of Postgres, as I mentioned. One thing I would point out is that relational databases have, thanks to a lot of recent work, become much better as document stores, specifically in JSON support. Whether you take MySQL, Postgres, or even SQLite, all of them are usable this way. So in some cases, when you say "I want a document store, but I don't completely hate relational databases", that can also be a choice.
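As a small illustration of using a relational database as a document store, here is a sketch with SQLite from the Python standard library. It assumes the SQLite build behind the `sqlite3` module includes the JSON functions (`json_extract`), which is the case for most current builds; the table name, helper functions, and sample documents are made up for the example.

```python
import json
import sqlite3
from typing import Any, Dict, List


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with one table holding JSON documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
    return conn


def insert_doc(conn: sqlite3.Connection, doc: Dict[str, Any]) -> int:
    """Store a document as JSON text and return its row id."""
    cur = conn.execute("INSERT INTO docs (body) VALUES (?)", (json.dumps(doc),))
    return cur.lastrowid


def find_by_field(conn: sqlite3.Connection, path: str, value: Any) -> List[Dict[str, Any]]:
    """Return documents whose value at a JSON path (e.g. '$.year') equals value."""
    rows = conn.execute(
        "SELECT body FROM docs WHERE json_extract(body, ?) = ? ORDER BY id",
        (path, value),
    )
    return [json.loads(body) for (body,) in rows]


conn = open_store()
insert_doc(conn, {"title": "Alien", "year": 1979, "director": {"name": "Ridley Scott"}})
insert_doc(conn, {"title": "Heat", "year": 1995, "director": {"name": "Michael Mann"}})

# Query on a nested field without defining any columns for it.
print(find_by_field(conn, "$.director.name", "Michael Mann"))
```

The documents keep their nested shape, and ordinary SQL still applies on top, which is the middle ground the speaker describes.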
Key-value stores are another important model. I think it's interesting that they really fall into two different buckets. One is: we are using it for caching; it's in memory and transient, and if we lose it we don't care, but we want it to be fast. There are a number of solutions here. Among the proprietary non-cloud solutions, I think Redis is the main leader, with both Redis Enterprise and Redis Cloud, and there are open source key-value solutions too. The other bucket is the key-value store, or I would say "key-value plus plus", because some of those solutions have a much more powerful language than plain key-value. In the cloud space that would be DynamoDB, Cosmos DB, and Bigtable; then there are Redis Cloud and the enterprise versions of open source solutions. And here are some examples of open source key-value stores, and again key-value plus plus: Aerospike, mentioned here, does much more than a plain key-value store, and Cassandra as well, but I would say they don't position themselves as being as powerful as document stores. We also have time series databases; that is another class I wanted to cover here. Again, you can see solutions from cloud vendors and proprietary vendors, and probably the most interesting here are the open source ones. It is also interesting that the time series database is a relatively new technology, which has a lot more choices these days.
Let me finish by also mentioning Percona's role in all of this and what we are trying to do. What we try to do is push the boundaries of what is possible specifically with open source databases, for when you want to have something which is totally open source. Our focus is on MySQL, MongoDB, and Postgres. I mentioned MongoDB is not open source anymore; that's not our choice, that's the upstream choice, and we keep as much of our tooling for MongoDB open source as possible. What we build around that is 100% open source software. Our distributions for MySQL, MongoDB, and Postgres generally include a lot of the features which enterprise companies need, like auditing, authentication, whatever, but completely open source. We focus both on conventional deployments on Linux and on Kubernetes, where we have operators, I think some of the more advanced database operators out there. Again, everything besides MongoDB itself is open source; we don't have any proprietary solution. Plus we have Percona Monitoring and Management, which we position as a single tool where you can monitor and manage databases, and you can get something similar to a database-as-a-service experience with a Kubernetes back end. And again, that is all 100% open source, which you can play with if you choose.
So, to finish up with storage in the cloud. As you have probably seen me going through all that, some of you, I see, are falling asleep, some of you are rolling your eyes, and that is a totally appropriate reaction, because there's a lot of shit out there; like, lots of options. So it is important to know there is no one size fits all: you can look at what fits your job and what your applications need. But the one last, most important takeaway I want you to leave with is that, however we slice and dice it, in all those areas there is a choice of more than one open source solution available in every single class of storage you may need in the cloud. So that's all I have. We have a little time for questions.
Hello, my question is about an interesting tool which used to exist, and Percona used to have it in its MySQL package: HandlerSocket. I think it was discontinued, and I don't think it supports MySQL starting from 5.7. So is there any movement in the direction of supporting this kind of tool, which enables you to access your relational database both ways, in the traditional SQL way and in a highly available...
Well, that's right. So the question is about the HandlerSocket interface for MySQL. Yes, there was this interface; gradually it went mostly out of use, and we stopped supporting it at Percona as well. There are a couple of replacements which I think generally cover most of the use cases of what HandlerSocket did. One is that MySQL supports the memcached protocol, so if you are looking for a key-value store, memcached compatibility is out there. Then there is also something called the document store, a MongoDB-like protocol which allows you to store documents, like JSON documents, in MySQL; that is another choice. So I think between those two, most of the HandlerSocket use cases are covered as well. Okay, thank you, Peter. |
From Zero to Hero with Solid
Lessons learned making apps using the Solid Protocol |
And let's welcome the next talk, From Zero to Hero with Solid. Thanks, Noel, for presenting. Okay. Thank you. So, my name is Noel, I'm from Barcelona, and I'm currently working at Moodle four days a week, but I'm also making Solid apps on the side, as side projects, and that's mostly what I will talk about today. I usually work in the open, which means that I journal about my development, so you can follow my work on my website, and here you have all the socials and everything. So if you are interested in more detail on something I say, you can probably find it on my website. Before we start, how many people know what Solid is or have heard about it? Maybe half the room, okay, that's nice. For those of you who don't know: before I say what Solid is, I have to go back to the creation of the web. When the web was created, technologies like the web browser, HTTP, and HTML were invented; I'm sure most of us are familiar with these. Later on there was something called the semantic web. How many people know what the semantic web is or are familiar with it? Okay, almost the same as Solid. Basically, the semantic web is the idea that, besides human-readable content and linked documents, websites can also have linked data, which is machine-readable data, not only human-readable. With this, some technologies were introduced, like the Resource Description Framework, JSON-LD, Turtle, etc. We will see more about this later. And finally, the Solid protocol is the next iteration of this idea, and it brings decentralized storage to the web. The way I see it, and the way they say it, it is Web 3.0 by the creators of the web. But it should not be confused with web3, which is blockchain. As you will see, Solid doesn't have anything to do with blockchain; it's something completely different. The basic idea of Solid is that users have a pod, and applications and services store their data in the user's pod.
So for example, here on the left we have the traditional architecture. Each application you use stores its data in its own back end, and that's how most applications work nowadays, right? The idea of Solid is that you can use different applications, but these applications can store data in the same pod. So it's the user who decides where the data goes. That's important for privacy, but also for interoperability and a better user experience, because this way data can be shared between different apps seamlessly. That's the basic idea. If you are more interested in the idea of Solid in general, more in the vision, you can watch a talk from 2019 that was given here at FOSDEM. This talk will be a bit more technical. When you saw the title of this talk, maybe this is what came to mind. But before we begin, I want to clarify that there are different types of heroes. So what type of hero do I mean? There are many ways to use Solid, just like there are many ways to make websites. In this talk we will focus on making Solid apps. I will not talk about hosting pods or making Solid services; mostly I will talk about making Solid apps, because this is what I have been doing for the last four years as side projects. I have not worked on any enterprise-level application or anything like that, but I don't want to discourage you from learning from this, because my goal with these apps is to make them usable for people. It's not just some random side project to try out technology; my end goal is to actually make useful apps. Also, this will be in broad strokes, because I cannot get into many of the details, but we will get into some of the weeds. So the first app I developed, in 2018, is called Solid Focus. Basically, it's a task manager. The first application you make is always a to-do app, so this is what I decided to do with Solid to learn about it.
So the only thing we care about is that you have tasks and you can mark them as done or not done. That's the basics. When you open the application for the first time, this is what you see. Normally, people are used to registering an account with an email and a password or something like that. In Solid it's different, because what you have to do when you log in is give your Solid pod, and you are telling the app where it should get the data. In this case I also added a "log in offline" button, because I know many people are not familiar with Solid, and if they only want to try the app, they can do it like this, without a pod. I also think it's very aligned with the vision of Solid that users decide where to store data, so they can decide to store it on their local device without hitting the network. So this is how it works. When you are making a Solid app, there are actually three actors you are interested in. One of them is your application. Another one is the pod, the user's pod, where the data will be stored. But there is also the identity provider. When users authenticate, you are actually communicating with the identity provider. It gives you a token, and it's this token that you use for the data exchange with the pod. This is the basic architecture of how it works. In general you don't have to care too much about this, because you just use a library: you give it the URL, it gives you the token, and you can make authenticated requests. But I think it is important to at least know this before you get into it. Once you have the token, you want to create data. Usually, when you think about a task manager, this is the type of data you will see: you have an ID, a description, and a "done" flag, which is a Boolean. But when you are working with Solid, this is how you would do it. Now, before people want to leave the room:
One of my goals with this talk is to convince you why this is better. And in the end, when you get used to it, it's actually not so different. So hopefully I will manage that. First we have a context. This indicates the vocabulary of this piece of data. That's important because if different applications use the same data, they need to share an ontology, something they have in common, so that the data can be shared and understood by different implementations. The ID is a URL. This is also important, because it's where the data is actually stored; we will see more about this later. Then there is the type. If you think about object-oriented programming, this would be like the class of the object. And then you have other properties, which can be just literals or can be links to other objects. The example before was JSON-LD, which is maybe more familiar to most people because it's JSON. But when you are working with Solid, most of the time you will be seeing RDF in a format called Turtle. Actually, it's not so different from the JSON I showed you first. If you are not familiar with it, it may be striking at first, but eventually it's quite similar to use. So this is why I think it's not so bad in the end. I have been talking about the Solid pod as if it were a black box, but inside the Solid pod you actually have containers. You can think of these as folders in a file system. Inside these containers you have documents, and you can also have binary files: for example a video, an image, a JSON file, whatever. But those will not be structured using this vocabulary data. So it's important that what is useful for your app, and what you want to be interoperable, is structured using RDF. And inside the documents you have resources.
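The slides are not part of this transcript, so here is a minimal sketch of what the two representations of a task could look like. The vocabulary URL, the property names `description` and `done`, and the pod address are hypothetical stand-ins, not the vocabulary the speaker's app actually uses; the Turtle is written by hand for this one simple shape instead of with an RDF library.

```python
import json

# Hypothetical vocabulary; a real app would pick an existing shared one.
VOCAB = "https://example.org/vocab/tasks#"


def task_jsonld(resource_url: str, description: str, done: bool) -> dict:
    """Describe a task as JSON-LD: context, id (a URL), type, and properties."""
    return {
        "@context": {"@vocab": VOCAB},
        "@id": resource_url,
        "@type": "Task",
        "description": description,
        "done": done,
    }


def task_turtle(resource_url: str, description: str, done: bool) -> str:
    """Describe the same task as Turtle."""
    # Escape characters that are not allowed raw in a short Turtle string.
    text = (
        description.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    flag = "true" if done else "false"
    return (
        f"@prefix tasks: <{VOCAB}> .\n\n"
        f"<{resource_url}>\n"
        f"    a tasks:Task ;\n"
        f"    tasks:description \"{text}\" ;\n"
        f"    tasks:done {flag} .\n"
    )


url = "https://alice.example/tasks/buy-milk#it"  # hypothetical pod and task
print(json.dumps(task_jsonld(url, "Buy milk", False), indent=2))
print(task_turtle(url, "Buy milk", False))
```

Both outputs carry the same three pieces the speaker lists: the vocabulary, the URL that identifies the task, and its type, plus the literal properties.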
So, about what I mentioned before, that the ID says where the data is stored: if we look at the ID of a task in this example, we can see that the root of the URL is the Solid pod. This tells you where the user is storing their data. The directory tells you the container where the task is stored. The URL without the hash is the document, and the complete URL, including the hash, is the RDF resource. This is nice, because whenever you see a piece of data in Solid, you immediately know its structure and where it is stored. This is useful as well if you export data and so on. So now we have the authentication token, we have the data, and we know where to put it. Now comes the easy part, because this is built using web technologies. You only have to do a POST request: you do a POST request to the URL of the document, and you create the document, and the body of this request is the data in Turtle format. It can also be in other formats if you want, but most of the time you will be using Turtle; it's the most common. If you want to get the data, you do a GET request to the document. If you want to delete the data, you do a DELETE request. If you want to get a list of all the tasks, you do a GET on the container and you get the list, and so on and so forth. The basic idea is that it works like the web, because it's built from those technologies. And the good news is that that's it. This is how you make Solid apps. In this talk I will go into more things, but these are the basics. If you understand this, you can already do a lot of things with Solid, and I think that's nice because it builds on existing technologies. Here you have some links if you want to get into the weeds on some things, and here you have my journal, where I explain how I learned Solid. That was the first time I saw Solid, so it may be more interesting to some of you. Some takeaways.
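The URL anatomy and the create, read, and delete requests described above can be sketched as follows. Nothing is sent over the network; the code only builds the requests. Several things are assumptions: the pod is taken to live at the root of the host, the create request is a POST to the container with a `Slug` header suggesting the document name (one common way to do it; the talk only says "a POST request"), and `auth_headers` stands for whatever headers an authentication library hands back. All URLs are hypothetical.

```python
from typing import Dict
from urllib.parse import urldefrag, urlsplit
from urllib.request import Request


def locate(resource_url: str) -> Dict[str, str]:
    """Split a resource URL into pod, container, document, and resource."""
    document, _fragment = urldefrag(resource_url)
    parts = urlsplit(document)
    return {
        "pod": f"{parts.scheme}://{parts.netloc}/",    # assumes pod at host root
        "container": document.rsplit("/", 1)[0] + "/",  # the "folder"
        "document": document,                           # URL without the hash
        "resource": resource_url,                       # URL including the hash
    }


def create_request(container_url: str, slug: str, turtle: str,
                   auth_headers: Dict[str, str]) -> Request:
    """POST a Turtle body to a container to create a new document."""
    headers = {"Content-Type": "text/turtle", "Slug": slug, **auth_headers}
    return Request(container_url, data=turtle.encode("utf-8"),
                   headers=headers, method="POST")


def read_request(url: str, auth_headers: Dict[str, str]) -> Request:
    """GET a document, or a container to list what it contains."""
    return Request(url, headers={"Accept": "text/turtle", **auth_headers},
                   method="GET")


def delete_request(url: str, auth_headers: Dict[str, str]) -> Request:
    """DELETE a document."""
    return Request(url, headers=dict(auth_headers), method="DELETE")


where = locate("https://alice.example/tasks/buy-milk#it")
auth = {"Authorization": "placeholder-from-auth-library"}
for req in (
    create_request(where["container"], "buy-milk", "<#it> a <#Task> .", auth),
    read_request(where["document"], auth),
    read_request(where["container"], auth),
    delete_request(where["document"], auth),
):
    print(req.get_method(), req.full_url)
```

Passing any of these `Request` objects to `urllib.request.urlopen` would actually send it; that step is left out on purpose, since it needs a real pod and real credentials.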
We learned the Solid basics, and something that maybe was not obvious is that there is no server. You make the application, and the application works in the frontend, because the data will be stored in the Solid pod, right? So this is nice because you don't have to manage any servers, and you are only building frontend applications, which are static assets. And these static assets can be downloaded in a zip file and put onto a computer or whatever. It's only JavaScript and HTML. But we also have some challenges. For example, the onboarding UX is not great. We have some issues with the page speed, and I also didn't get into some of the things that are important for interoperability. So the next application I worked on is called Media Kraken. In this case, it's a media tracker. The point is that you store all the movies that you have watched and all the movies you want to watch, and you can keep track of them. That's the use case, and it's quite simple, right? You can see how you can filter the movies, et cetera. So in this case, this is what the login looks like. In philosophy, it's very similar to the other one, but it improved a little bit, because at least I explain what Solid is, and I help users a little bit if they don't understand what to do. But also, once you are inside the application, you can export the data, and it will be exported as JSON-LD. Later on, when you log in with your Solid account, you can import it. So in this way, I added a path for people who started using it offline to upgrade to using Solid. And I think this is also important, because it makes things easier for them. The other issue is the page speed. Maybe this was not obvious either, but when you do a GET request to the movies container, the way I mentioned you can get all the tasks in the previous example, you get a list of movies, right? But then you only get the ID of each movie. You don't get all the metadata, like the name and the image and everything.
So if you want to get the movies, then you need to make another request for each of them. So in the end, this is not great. You are doing N plus 1 requests, right? In case you are wondering, why would you get all the movies? Why don't you get just the first 10 movies and do pagination, right? The point is that I want my application to be snappy, so that you can filter things quickly and it just works. If you have to do HTTP requests every time you do any interaction, it will be very slow, not just on the page reload. So that's why I wanted to have all the movies in the frontend. What happens when you do this is that the first time you open the app, you see this. And depending on how many movies you have, you will be seeing this for a long time, which is not great, right? So what I did is keep a cache of the movie data using IndexedDB. This way, the next time I open the application, I can look at the last modified date of each one of the documents and only request the new ones. This is a big improvement compared with Solid Focus. This makes it more usable. Still, if you have many movies, for example, I have more than 2,000, the first time you open it on a new device, it's not great. It can still be improved a little bit, right? And finally, the thing I mentioned about interoperability. I don't have time to get into this a lot, but I wrote a blog post called Interoperable Serendipity. And I think this is actually the most important point of Solid. In the end, if you are using Solid and you are making apps, but they are not interoperable, you will end up in the same scenario we are in now. The power of Solid is that you use different applications and they all use your data, right? So I think this is important to take into account. So for example, which vocabulary would you use? There are some websites where you can see the vocabularies that are already used, and you can see which ones are the most popular.
So they are the most likely to be useful, right? You can also create your own vocabulary. If you are doing a new use case, there's nothing wrong with creating your own vocabulary, and then other people will be able to use it to make other apps, right? And finally, at the beginning I wasn't sure about this, but now that I've learned more, I think mixing and matching vocabularies is actually okay. So you shouldn't be taken aback by doing this, right? So then we have solved which vocabulary to use. The other thing is, where do you store the data? For example, if I am storing movies in /movies, maybe another app does it in /films, so they will not work correctly together, right? The point is that when you log in and you get the identity of the user, you will get a document describing all their information, all the public information, and if you are making the request with an authentication token, you also get the private information. And part of this information is something called a type index, and this type index tells you that this user has tasks in this container, movies in this container, et cetera, right? So once you do that, you can use the proper container without hard-coding anything in your application. If you are the first one adding movies or whatever to a pod, then you can just create the type index if it doesn't exist. I have to mention that right now it's a draft, but you can still use it, and I have been using it for years, because it's a client-to-client standard. This means that the clients, so the applications, are the only ones that need to know about this type index. From the point of view of the pod, it's just a simple document, right? So this is something you can use today. There is also something else called the Solid Application Interoperability spec, but this one is a client-to-server standard. So until servers start implementing this, this will not work.
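The type index lookup described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the speaker's code: real type indexes are RDF documents that would first need to be fetched and parsed, whereas here the registrations are assumed to be already parsed into plain objects, and the names `TypeRegistration`, `find_container`, and `resolve_container` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TypeRegistration:
    """One already-parsed type index entry (simplified, hypothetical shape)."""
    for_class: str           # IRI of the class, e.g. a movie type
    instance_container: str  # URL of the container holding instances


def find_container(registrations: List[TypeRegistration],
                   class_iri: str) -> Optional[str]:
    """Return the registered container for a class, or None if unregistered."""
    for registration in registrations:
        if registration.for_class == class_iri:
            return registration.instance_container
    return None


def resolve_container(registrations: List[TypeRegistration],
                      class_iri: str,
                      default_container: str
                      ) -> Tuple[str, List[TypeRegistration]]:
    """Pick the container to use and return the (possibly extended) index.

    If another app already registered a container, reuse it. Otherwise fall
    back to this app's default and add a registration so later apps find it.
    """
    existing = find_container(registrations, class_iri)
    if existing is not None:
        return existing, list(registrations)
    new_registration = TypeRegistration(class_iri, default_container)
    return default_container, [*registrations, new_registration]


if __name__ == "__main__":
    movie = "https://schema.org/Movie"
    index = [TypeRegistration(movie, "https://alice.example/films/")]
    # Another app stored movies under /films/, so /movies/ is not used.
    print(resolve_container(index, movie, "https://alice.example/movies/")[0])
```

The design choice worth noting is that the app's own default path is only a fallback, which is what avoids the `/movies` versus `/films` mismatch.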
Just to clarify, in case it's not clear, the server is the pod, right? Okay, so the final thing about this app is that even if you have all your data in your pod, you will not have all the movies in existence in your pod. So how do you make this work with a good user experience, so that people can search and get the movie they want, right? In this case, something that I think is very important to do with Solid apps is to rely on public information. So in this case, I am using a website called The Movie Database, which gives you a database and a free API to query it. Depending on what type of data you are working with, you can search different APIs, but this is the basic idea. And if you want to learn more about how I built this app, you can check it out here. Some takeaways: type indexes are nice, I encourage you to use them, and also caching is nice. It improves the performance a lot, so this is something to keep in mind. Things we still have to improve: onboarding UX and page speed. They are still not great, right? The next application I built is called Umai. I don't want to scare you with the time frame on this. The thing is that, as I mentioned, this is a project and I don't only work on Solid. I also experiment with different technologies, and for this one in particular, I started doing some animation things and all that, so it wasn't entirely Solid that made this take so long. But in any case, Umai is a recipes manager, so the point is that you have a collection of recipes, and you can search them and browse them. So in spirit it's quite similar to Media Kraken, but with recipes instead of movies, right? This is the basic idea of the app. So the onboarding UX, at first sight, may seem very similar to the other ones, but I think it's a lot better. Why? Because this button here says, create your first recipe.
It doesn't say, use browser storage, and this is important because people who don't understand how things work may be scared, like, what is browser storage, you know? But if they just create a first recipe, they can go ahead and do it without worrying about Solid or any technical aspects, right? Then when they start using the application, they can see the status of their data in here, and what I'm trying to convey to users is the concept of a cloud. So I still mention Solid when they have to log in and create the account and everything, but I think this way they understand that they have a cloud where the data is stored, which is using Solid, but they also have the local data, and I think this type of concept is something they may already be familiar with. And this got me to the realization that the application is offline first, so that is quite nice. But there are some issues or some challenges when you are doing an offline-first Solid app. For example, when you authenticate with a Solid pod, you shouldn't store the authentication token in the frontend, and if you want to see the details why, you can read this forum thread, and so every time you open the app, you have to redirect to do the authentication. This is also something that could be improved in the future, but that's how it currently works. So the way I have solved this for now is that I have some settings, and depending on the device you are using, you can reconnect automatically or not. For example, on my desktop device, I have everything automatic because I have a stable internet connection, but on my mobile phone, I don't reconnect automatically. I only synchronize manually whenever I want to, right? This is the basic idea. And the second thing, this will get a bit into the weeds as well, but imagine you have this delicious ramen recipe in your pod, right? And you have it on two devices.
You have it safely stored in your Solid pod, and you have a copy on your mobile phone and another copy on your desktop, right? You change the title of the recipe on your mobile phone, but it's not connected to the internet, so the change stays local for now. Then you change the description on your desktop device. This one is connected to the internet. So you will synchronize with the pod, and you will push the changes. But now we have an issue. When the phone finally synchronizes, you will see that the recipe has been updated, because you look at the timestamp of the last update, but you don't know what has changed. So now you have to decide if you discard the local data or you discard the remote data. So this is not ideal. The way I solve this is using something called CRDTs. You can also read more about the details here. And in this case, I had to create a vocab for this. And that's fine, and I'm mixing both the recipes vocabulary and the CRDT vocabulary. So let's go through that again. You do the same thing. You modify the name on the mobile device, but this time you also store the operation in the data. So not only the change, but also what happened and at what time, right? Then you do the same on the desktop. You synchronize and you push the changes. And then when you synchronize with the mobile phone, you will see that what changed remotely was only the description. So in this way, you can pull the changes and then push your change without any conflicts. And this is the basic idea. And when you do this, then you have everything synchronized. If your head is spinning right now, don't worry, because this is a bit esoteric. Not all Solid applications have to do this. It's just the way I found to make my user experience how I wanted it to be. But you don't need to do this if you are getting started or want to see how Solid works, right?
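The ramen example can be replayed with a small Python sketch. It is only an illustration of the general idea of storing operations alongside the data and is not the speaker's vocabulary or implementation: it assumes each operation records a property, a new value, a timestamp, and a device name, and it resolves each property independently by "latest operation wins". All names here (`Operation`, `merge`, `materialize`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Operation:
    """A recorded change: what was set, to which value, when, and where."""
    prop: str
    value: str
    timestamp: int  # e.g. seconds since epoch; plain integers in this sketch
    device: str     # used only to break timestamp ties deterministically


def merge(local: Iterable[Operation], remote: Iterable[Operation]) -> Set[Operation]:
    """Merging two histories is a set union, so the order of syncing is irrelevant."""
    return set(local) | set(remote)


def materialize(base: Dict[str, str], operations: Iterable[Operation]) -> Dict[str, str]:
    """Replay operations oldest first; per property, the latest one wins."""
    state = dict(base)
    for op in sorted(operations, key=lambda o: (o.timestamp, o.device)):
        state[op.prop] = op.value
    return state


if __name__ == "__main__":
    recipe = {"name": "Ramen", "description": "A delicious ramen."}
    phone_ops = [Operation("name", "Spicy ramen", 100, "phone")]       # offline edit
    desktop_ops = [Operation("description", "Now with miso.", 200, "desktop")]
    # The phone finally syncs: both edits survive because they touch
    # different properties and every operation is kept.
    print(materialize(recipe, merge(phone_ops, desktop_ops)))
```

Because only a timestamp would tell the phone that something changed, while the operation list tells it what changed, neither the local nor the remote edit has to be discarded.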
So something else, and I think this is one of the most interesting things about this app, is that before I was saying that for movies, for example, you search an API, right? But an API with all the recipes in the world doesn't exist. How do people search for recipes? Or at least, how do I search for recipes? I go to a search engine and I search for the recipe. So the point is that the data itself is the web. It's websites that are already there, right? For recipes, when you make a search, most search engines show you the results and they know that they are recipes. This is not just a website for them. They know these are recipes. And if you look under the hood and inspect the source of these websites, you will see that in the header they have some data of type application/ld+json. And the nice thing is that this is actually semantic data. So this is what I mentioned at the beginning. And many websites are already using this, websites that don't even know that Solid exists or anything. So this is very nice, because this is one way to leverage the data that already exists in the wild. And I think this is a nice way to showcase what the future could look like if Solid was more used and data were more interconnected, right? So yeah, you just put in the URL of the website and you import the data. Unfortunately, because the application lives in the frontend, there are CORS issues, so you cannot make those HTTP requests. And the way I've solved it for now is to use a proxy. But depending on the application you make, this will not be an issue, because you can also use the server where you are hosting the app or something. So yeah, and finally, the last thing to mention is that in this app I also implemented the ability to share things. So by default, when you create data in a Solid pod, you should create it private, right? But you can also change the permissions and make it public.
And this way, you can share this link with other people. Then if you share this, you can give the link to someone who doesn't even have a solid account or doesn't even know what solid is. And they will be able to see the data using the URL of your document. And something interesting to realize here is that this is not only a URL of my application. Like with my app, you can share recipes. No, you are sharing the document URL in the solid pod. What this means is that visitors don't need a solid account because it's public, but also that any of you here in this room and anybody can make an application that already uses this URL because you only have to read the solid document that is following the solid protocol, right? So in this way, you can already see how doing data in this way is useful in the end. And you can learn more here if you want more of the details. You can also read more about how I implemented this application. And if you are curious why it took me three years, this is where you can see why. And some takeaways is that offline first is really nice. I think I'm probably going to do it for most of my apps, at least the ones that follow this schema. Sharing is caring. What I mean with this is that hopefully sharing these recipes or using this type of feature inside of a solid app will show people why solid is useful. And maybe without even knowing what solid is, they can realize the power of this. Like, look, someone shared with me this URL and then I use the same URL in another app and I can get the data. I think there are very different interactions that we can see here. And finally, keep it simple. This is something I always try to remind myself. I don't always make it, but I think it's important to keep it in mind, especially with solid because things can get out of hand. But as I said at the beginning, the basic things, get, post, RDF, those are the basics. And if you have that in mind, I think you can go a long way with solid. 
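As a recap of those basics, the four request shapes can be written down without touching the network. This Python sketch only builds `urllib.request.Request` objects and never sends them. It makes several assumptions that go beyond the talk: it creates a document by POSTing to the container with a `Slug` header suggesting the name (servers may choose a different name, and a PUT to the document URL is a common alternative), it leaves out authentication entirely, and the pod URL and helper names are invented for illustration.

```python
from urllib.request import Request

TURTLE = "text/turtle"


def create_document(container_url: str, slug: str, turtle_body: str) -> Request:
    """POST Turtle to a container; the Slug header only suggests a name."""
    return Request(
        container_url,
        data=turtle_body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": TURTLE, "Slug": slug},
    )


def read_document(document_url: str) -> Request:
    """GET a single document, asking for Turtle."""
    return Request(document_url, method="GET", headers={"Accept": TURTLE})


def delete_document(document_url: str) -> Request:
    """DELETE a single document."""
    return Request(document_url, method="DELETE")


def list_container(container_url: str) -> Request:
    """GET the container itself to list what it contains."""
    return Request(container_url, method="GET", headers={"Accept": TURTLE})


if __name__ == "__main__":
    body = (
        "@prefix schema: <https://schema.org/> .\n"
        '<#it> a schema:Action ; schema:name "Buy groceries" .\n'
    )
    requests = [
        create_document("https://alice.example/tasks/", "groceries", body),
        read_document("https://alice.example/tasks/groceries"),
        delete_document("https://alice.example/tasks/groceries"),
        list_container("https://alice.example/tasks/"),
    ]
    for request in requests:
        print(request.get_method(), request.full_url)
```

In a real app each request would also carry the authentication token obtained at login, and the responses would be parsed as RDF.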
And I still have some challenges. I cannot give you the solution because this is where I am right now. The onboarding, I think, has improved a lot, but still, when people want to connect to the cloud and they see Solid, they don't know what that is. So this is one of the challenges. And also the CORS issue: if I want to make frontend applications, this is very difficult to get around. So these are some of the challenges I haven't solved yet. And that's it. You can follow my work and use the apps, and if you have any questions, let me know. Thank you.
Thank you. It was a wonderful presentation. I'm wondering when you think it will be ready to be used in a more mature application. Just for example, one thing is that you are taking full control of the pod, meaning that I would never share my pod credentials with you, because you would have access to everything. Do you have any thoughts about a timeline for that?
Yeah. So I cannot speak about the timeline because I'm not part of the specification process. The answer to this would be: when the specification includes some mechanisms to do that. They already have it in mind and they are working on it. But I cannot give you any answers about the timeline because I'm not involved in that process. What I can tell you is that I know that people who use Solid and have this type of concern just have different pods. I have a pod for my private or sensitive information and I have another pod for my public information. And this is also nice because you can use the same app with different pods. If you are using something for your professional life and also for your personal life, you can have it in different pods and that's okay. So yeah, that's my opinion on the topic for now. It's something still being worked on.
Yeah. Any more questions? Okay. Thank you.
Yeah. Thank you. Well, it's a bit of a related question, but it's more specific to the type indexes. So I didn't know about that.
I've never used them. So you're saying that it's part of the spec, or soon to be part of it?
The point about type indexes is that they are a different spec, made by people who are working on Solid, but it's a client spec, which means that from the point of view of the pods it's just a document. So maybe you can call it more of a convention than anything else. And the point is that if people start using this convention, then apps will become more interoperable in the wild. Yeah. But it's a convention, you could say, because the pods don't need to implement anything for this to work. Yeah.
Thank you. And so my question was, and this is where it relates to the previous question: I find it very interesting, but isn't it a bit risky to basically allow the apps to go and mess with the global type indexes? It means you can maybe mess up the type index for someone else. I mean, shouldn't it be the responsibility of the pod provider to build up that type index?
Eventually, when it gets into the spec, yes. For example, the Solid Application Interoperability spec, which is also a draft, does this type of thing on the side of the pod. So yeah, the thing about Solid, as I said at the beginning, is that there are many ways to do things, because the building blocks are basic but you can combine them in many ways, you know. And the point is what you can do right now, and that's what I'm thinking about when I do this. If you have to build a Solid application today, the type index is your best choice, because Solid Application Interoperability requires implementation in the server and that's not happening yet. So you cannot rely on that. So type indexes are the best thing we have. You can also choose not to use type indexes, but then your app will be even less interoperable, and to me it's very important that an app is interoperable.
Yeah, you're welcome.
Yes, hello, yes, here. Okay.
Yeah, so with Media Kraken, we saw that it can be complicated to fetch a lot of movies, and it's the same if you want to fetch from different pods, it can be very complex. I wanted to know what your feeling is about speed and all that on Solid. Do you think Solid is doomed to be used only with really low-data apps, or do you think it's possible to have big data with Solid?
I think it depends a lot on the use case of the app. For example, for my apps, even if I had perfect querying APIs, I think I would still do this, because I found the offline-first approach and I like it a lot, you know. But eventually, there is something called SPARQL, which means you can run queries on linked data. This type of thing could get into the specification. But again, all of this depends on the specification, and I really don't know when it will happen, or if. I mean, I know that the people working on the spec are aware of these issues and they are working on this, but I don't know the timeline, really. I can just say that for my apps at least, for this use case, what I do is offline first, and even if I had good query endpoints, I probably wouldn't use them, but I don't know. I guess it depends on the implementation.
Thanks.
Yeah.
Hi. If you have that offline-first mode with a CRDT, is there a way to do it, or, there is, but have you looked into doing it in such a way that someone doesn't need to bring their own pod, that they can use their own devices and have their devices sync between each other directly?
Yeah. Technically, it would be very easy and very possible, because you just have the operations and you just have to apply them. Personally, I have not done it in my apps because I'm interested in having easy experiences for users, and what am I going to tell them if they have to synchronize two devices? Maybe I can use WebRTC or something like this, yes, but I think they are still devices.
I think what they like is the idea of having a cloud where everything is safe, and this is also what I personally would do. I don't know if I would want to have things only on devices. But to answer your question, yes, it would be very easy to do. I just haven't done it, but it would be very easy. Yeah. And by the way, all of this is open source, so you are welcome to fork it or ask me about the code or anything. I would be happy to help you if you want.
I'll get my exercise.
RDF and linked data have existed for so long, and we don't see a lot of open source projects unleashing the potential of such a system. Why do you think it's not more mainstream, and do you think there is enough learning material that is easy to understand for developers?
Yeah. So I cannot answer for everybody because I don't know, but my personal opinion is that in theory, RDF is awesome. I like it a lot and this is why I am working on this. But in practice, the developer experience is not great, and when people see Turtle and RDF and all these things, they don't like it. So my opinion is that it's because of the lack of learning materials and the developer experience when getting started. But I don't know, I think it works if you learn the building blocks. I learned about RDF four years ago when I learned about Solid. I didn't even know about the semantic web. But I learned the basic things, like the specs I linked at the beginning, RDF and all of this. And I think once you understand those basic things, it's very easy to work with. But it depends on that, and there are not a lot of learning materials that I know of. And there is not one framework you can use that is super easy and has a very nice developer experience. So I think it's still a matter of tooling and documentation at this point. But this is my personal opinion. I can't speak for other people; that's just my view.
Yeah, can you tell me about where the pod is stored? I just did a quick read of the FAQs on the Solid website, and it leaves a lot of the details of how pod providers store the data up to the provider, so that options such as encryption are left to them. So that seems to move a lot of the privacy concerns we have with existing cloud services over to the pod providers, unless you self-host on a server that you encrypt yourself and such. So do you have any thoughts on that, on how that works with the provider landscape at the moment?
Yeah, so the biggest issue today for someone who wants to start using Solid is which pod provider to use. Basically, if I had to recommend Solid to a friend of mine, I don't even know which provider I would recommend to them. But I like the point that users choose. If you are super worried about privacy and you want everything super encrypted, then you choose a pod provider that encrypts everything. But if you don't care so much, then you don't have to. You can just self-host something in your home. So I don't know. I think this is part of the flexibility of Solid. So it's nice that it's the choice of the user, but I don't have many thoughts about that. Personally, I self-host and I don't worry about encryption.
I'd like to ask you how to find vocabularies, because I know there are a lot of vocabularies, but it's quite chaotic. They're very difficult to find, and it's difficult to understand what the words are.
So at first I also worried a lot about this, but now it's the least of my problems, because really you just search to see if there is one that already exists. On this website I shared, there are a lot of vocabularies. Just search for one, and if you don't find one that works for you, you create your own, and it's not so difficult.
I understand how it's confusing at first, but once you decide that you can make your own vocabularies, it's not so difficult. Also, I strongly recommend this talk I linked, called A Bag of Chips. This is the one that changed my mind about this, that it's fine to mix vocabularies and make your own. So I recommend watching that. Yeah.
Hi there. Given that you've got different pod providers and you can never be sure what the infrastructure you're working with is going to be like, it could be a very slow pod provider or whatever, do you have to design your apps quite defensively, because you can never be sure that the pod provider is actually going to be able to service the HTTP requests that you want to make?
Yeah. My answer to that is no, because you just follow the Solid protocol and that's it. I mean, if the pod provider is slow, users will be unhappy and they will use another pod provider. So I don't worry about that. I just code to the Solid spec and I don't mind about that. The only issue is things that are not in the spec yet, that are drafts, and that different pod providers implement differently, but hopefully when the spec is more stable, this will not be an issue.
Hi, thanks for the great presentation. All your examples were single-user applications. How would you apply the philosophy of Solid to, let's say, forum software? Would you store all posts from all users in their own pods and somehow get access to all those pods? Or would you just not use it for that and only use it, for example, for public information on the user?
So I haven't done anything like that, so I'm only going to say what I think about it. But basically my intuition tells me that it would be something very similar to ActivityPub, which is the protocol that powers Mastodon and the Fediverse, and underneath it also uses linked data. So I think it would be very similar to the way that works, and I don't know exactly how it works.
I have not coded ActivityPub applications, but I think the information is duplicated in the servers or something. I guess it would be something like that. I don't know, but I think my answer when thinking about social applications with Solid is to look at ActivityPub, because it's the same idea, I think, or at least similar.
Just as an FYI, there's also an active discussion going on in the Matrix room associated with this devroom. So you might want to look there, there's also the data.
I will take a look and answer, but also you guys might want to check it out. Yeah. Well, so that's it, I guess. Thank you, everybody. Thank you. |
Operate First community cloud
A blueprint for a sovereign cloud? |
Okay, thank you for showing up in person, and so many people. It's actually my first time at FOSDEM, and I think it's super exciting to see so many people coming back to conferences, and also such a crowded conference. I mean, the talks previously were super full. Now it's almost half full, or half empty, depending on how you look at it, and that even for such a supposedly boring topic as a sovereign cloud. I mean, that immediately sparks associations with the state and GDPR, all the cookie banners that you have to click away. So it sounds boring at first, but I think there's also some value in it, and I think there's a journey where open source can help to make a sovereign cloud come to life, or at least look at some aspects of it. This is me. I go by the name durandom on GitHub and on social media. Three days ago I changed roles, so now I'm in sales again, yikes, as a Managed OpenShift Black Belt. So I'm still looking a little bit at the cloud topic. OpenShift is like a cloud on a cloud. Before that I worked for quite some time on AI, on AIOps, in the office of the CTO, and the last thing that I did, for two years now, is imagining or revisiting open source in the age of clouds and seeing how open source principles can also apply to operations. So we're going to look at the Operate First community cloud: why do we need a community of practice around operations, what does this community cloud look like, and also where is it. It's not really a hands-on talk, but you can take things away. If the Wi-Fi works for you, you can log into the cloud right now, and I hope to see you in some of the meetups or in the community after that, because it's really open to anybody who wants to learn something about operations or wants to teach something about operations. So when I first heard the term sovereign cloud, like I said, it sparked the image of the sovereign, the king, who has now also occupied the cloud. And I put it into my favorite search engine and it immediately came up
with a lot of definitions of sovereign cloud. There was one from VMware, some from telecom, and they all looked at different aspects. And these days, in case you're not living under a rock, everybody's talking about ChatGPT, so I thought, let's talk to this AI, which has already read all the definitions for me, and ask it about the sovereign cloud. This is just the end of my conversation. I wanted to highlight the differences between the noun and the adjective sovereign. The noun sovereign refers to a person or entity that holds supreme power and authority, while the adjective sovereign describes something that is supreme or superior in rank. That still doesn't sound really friendly to me. Do I really want something that holds supreme power over me, and why should I care about this then? But is there also a notion of independence in that adjective? Because that's what I always thought about when thinking about it. A little bit: ChatGPT came up with this, that there's a notion of independence in the adjective. To be described as sovereign means to be independent, not subject to control by any other person or entity, which, if you think about it, is also implied: if you have supreme power, then you can also move away. And having the highest degree of power and authority, the term emphasizes the idea of self-governance and supreme power within a given context, and the context here seems to be cloud. So when I look at sovereign cloud, at least in my small world view, it means I have the power to move away, I have the power to control stuff, and I have the largest amount of independence. And that seems at first contradictory to the business model that we saw in previous talks, right? A nice definition of cloud is: I'm running stuff on somebody else's computers. So that doesn't seem to be a lot of freedom, because I have some lock-in. But actually open source led a path away from lock-in, so I think it's important that we apply these open source principles
also to operations. And if you look at a cloud these days, is it really open? I mean, it's built on open source software. You get your RDS or some other product, and underneath, yes, it's running MySQL, it's running Elastic, it's running all that open source stuff, but you're still tied to the experience that the cloud provider imposes on you. If you want to rebuild that with open source solutions, you can do it, but it looks pretty complicated. You need to master a lot of these technologies and you need to stick them together. And there's a reason why people defer to the cloud: they are interested in the workload. They just want to swipe their credit card, consume, and build their applications. And I think the previous speaker put it really nicely, that lock-in can be defined as a product of cost and the likelihood that something is going away. So you have to deal with that stuff. But if you look at this slide here, open source actually showed a way out of this. The left side of this funnel is the traditional open source funnel, as in software contributions, which we have all known for decades and which we all love. So you find a project, you use it, there might be 100 users of it, and at some point something breaks, so you might file an issue. Great, you already contributed, because you filed that issue. And then maybe at last somebody even fixes it, or maybe you fix it in that project. So there's really a funnel of 100 users, then 10% reporting issues and making up that community, and 1% actively contributing to that project. If I'm using something as a service, I'm essentially cutting this funnel short. I'm stopping at the API layer. I might contribute to the underlying open source software that runs this service, but in terms of contribution I'm usually stuck with maybe filing a support case, and maybe the provider comes back to me, but I have no
possibility to actively contribute to that and maybe fix that API outage. Maybe I'm the only person having that problem, and so the cloud provider doesn't even care about it. This was the notion that our team thought about when thinking about open source in the age of cloud, where there's apparently more value in running and providing the software than in the software code itself, or at least the scale is unequal. As we see with many enterprise distributions or business models, you can get the source code of that database or that service, but sometimes you don't even get the build tools, and you don't get the tools that actually operate that service: the SLIs, the metrics that you need to run it, the runbooks, etc. So every deployment is either behind a paywall, because that's the differentiating factor for that company, or you have to learn it yourself. And it's actually quite hard to open something up, also because of legal constraints. You might have proprietary information or PII, personally identifiable information, in there. You have logs, so you need to make sure that you don't expose any of these secrets. There's a tight balance, and that's why most companies or most projects default to closed. Even for communities that run their own infrastructure, like the Fedora infrastructure, that's somewhat open, but you still need to go through a lot of hoops to contribute and to do something. So it's not really open by default, and it's also not meant to be a blueprint for something. Only 10 minutes left, but I think I can go to the next part of this presentation. So this is the concept, right? We need to shift left, we need to open up operations and the practice, so that we build up a community, so that we don't have to build our operational deployments from scratch. That is the concept of this Operate First idea. We also thought you need to have something physical, something
hands-on where people can actually contribute, because otherwise it would be just a talk show. Somebody needs to lead the way and implement that stuff, so we tried to build a hybrid cloud with full visibility into the operations center. Hybrid cloud these days is, for a lot of people, Kubernetes, and so we have two bare metal Kubernetes clusters running at Boston University with 34 nodes and 1200 cores, so it's not a small setup. Then there's one larger cluster running in AWS from the OS-Climate project, which is also managed with these Operate First community cloud ideas. And we also work with a German superscaler, which is what the layer below a hyperscaler is called, IONOS. They donated some hardware, and we deployed some clusters there as well. So my vision actually is to have a really resilient, distributed cloud setup, operated under these principles at as many hardware or cloud providers as possible. 626 individuals have logged into these clusters, and there are about 200 namespaces. Most of the stuff is happening on GitHub. We have 150 people in the Operate First community, and there are about 1000 issues filed. In terms of diversity, since it's a project bootstrapped by Red Hat, one third to one half of the people are Red Hat employees, but there are also a lot of contributions from American universities, and a lot of open source projects are already contributing too. This is just a highlight of some of the more noteworthy projects using this infrastructure, like OKD, the upstream of OpenShift, or OS-Climate, or Janus IDP, which is a project for some Backstage plugins. Backstage is currently one of the more hyped tools for a developer portal, by Spotify. These are some of the services that are running there, the usual stuff that you would expect from a cloud setup. We have Argo CD for doing GitOps, we have Grafana for monitoring, we have Tekton
pipelines for building things, there's a Prow instance running for doing CI/CD, and a lot of other things. That's all deployed by the community in a GitHub repository, where you can integrate with these other services, and I think that's where the actual value comes from. So let's get real. We love hands on keyboard, and as I said, it's all done via a GitOps, Git-first approach. The current entry point for you would be going to operate-first.cloud, and if you click through some hoops you end up at service-catalog.operate-first.cloud, which is a Backstage instance where we, for one, showcase the services. You go to the catalog, you see all these services with all their dependencies, and you see all the managed clusters there. You click on one of these clusters and you are presented with a single sign-on login screen, and if you choose the second option, operate-first, you can log in with your GitHub account. It's authenticated against GitHub, and without even signing up for an account you get a read-only view of the cluster, which is pretty awesome. You see how these services are being deployed and how other community services are being deployed, so you get a live hello world example of a full production cloud environment, which you would see at your site, either for your project or for your customer, whatever. And we documented the way and the why of how we came to certain decisions. In this case it's application monitoring, or there's also how to store credentials in a cloud. These are the questions that you will face if you're setting up your own local cloud, and we documented these to bootstrap other deployments or to contribute back, so that you don't have to read through so many blog posts and documentation and make your own choice. There are some dashboards here, and these are the dashboards that we use for
troubleshooting, or the community uses for troubleshooting. You would see Kafka or Open Data Hub and Prometheus live dashboards, and here's one dashboard for our clusters. And that's the GitHub org where you would start talking to us, or talking to the community. The main entry repository is the support repository. You can ask questions, or you start with one of these templates. One of the coolest templates or processes here is onboarding to a cluster. You get a form, it looks like a form, it's a GitHub template. You choose which cluster you want to be onboarded to and the team name, and then we have some automation in place that automatically creates a pull request to our GitHub repository. A person who is part of the operating team only has to say "looks good to me" to it, and they are onboarded. So that's also giving you some sense of how you could automate your local cloud deployment. You don't need to do that, but it's a way to bootstrap you. There are a lot of other issues going on, and as I said, it's a community, so things will eventually also break. Right now we have problems with our object storage, it's broken. If you are an expert in NooBaa or in object storage and you want to get your hands dirty rebuilding some of that stuff, this is the issue. [name unclear] here, who will give another awesome talk at the end of this track, left some comments on how to get started. Nobody has worked on it yet, so it's up for grabs. Thank you. Thanks. Questions? Oh, hey. You provided a good definition for sovereign cloud. Who are the customers for sovereign cloud? I don't know, to be honest. I'm looking at it really from a technical perspective, and I think my key takeaway is that everybody who wants to build their own cloud probably wants to be sovereign in running it, and then you have to focus on stuff like minimizing vendor lock-in and being able to move to another cloud provider or move your data across clouds. To jump just on
the question: I am one of the customers that would like a sovereign cloud, because of the resiliency that we need to have, because we have critical infrastructure. I think that was a statement, not a question, right? Okay, all right. Take out your smartphone and snap that QR code. There's a bi-weekly community meetup where you can meet all the wonderful people that are involved in this community. |
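The onboarding flow described in this talk (a GitHub form naming a cluster and a team, followed by automation that opens a pull request) can be illustrated with a minimal Python sketch. This is not the Operate First automation itself: the form labels, the `cluster-a` name, the `example.org/requested-cluster` label key and both function names are hypothetical, and the sketch assumes the form is rendered as `### Label` headings followed by the answer. It only covers the step of turning a submitted form into a Kubernetes Namespace manifest that a pull request could then add to a GitOps repository.

```python
import re


def parse_issue_form(body: str) -> dict:
    """Collect '### Label' headings and the text under them into a dict."""
    fields = {}
    current = None
    for raw_line in body.splitlines():
        line = raw_line.strip()
        if line.startswith("### "):
            # Normalise the label, e.g. "Team name" -> "team_name".
            current = line[4:].strip().lower().replace(" ", "_")
            fields[current] = ""
        elif line and current is not None:
            fields[current] = (fields[current] + " " + line).strip()
    return fields


def namespace_manifest(team_name: str, cluster: str) -> dict:
    """Build a Kubernetes Namespace manifest for the requesting team."""
    # Keep only lowercase letters, digits and '-' in the namespace name.
    name = re.sub(r"[^a-z0-9-]+", "-", team_name.lower()).strip("-")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Hypothetical label key, for illustration only.
            "labels": {"example.org/requested-cluster": cluster},
        },
    }


# Hypothetical form submission with two fields.
issue_body = "### Cluster\n\ncluster-a\n\n### Team name\n\nData Science\n"
form = parse_issue_form(issue_body)
manifest = namespace_manifest(form["team_name"], form["cluster"])
print(manifest)
```

In a setup like the one described, a reviewer from the operating team would still approve the resulting pull request before anything is applied to a cluster.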
Responsible Clouds and the Green Web Triangle
How to make the climate case for a diverse cloud ecosystem |
Who can hear me? If you can hear me at the back, coming through the sound system, can you... okay, I think it sounds like you can hear me at the back. Excellent, all right. Okay, well, it turns out I'm a little bit early, but I'm going to use those two minutes for questions at the end of the talk, if possible. All right, folks. Welcome to Responsible Clouds and the Green Web Triangle, how to make the climate case for a diverse cloud ecosystem. I work for the Green Web Foundation, and the URL for that non-profit is written across the bottom of this screen, as you can see. In case you can't see me, that's my face. This is a bit of a hangover from doing remote talks because, yeah, it's weird being in a room with other people again. I have a background working in wacky environmental climate tech startups, like, as you can see, Loco2, which is a triple-layer pun around locomotion, low carbon, and going loco on a crazy holiday. I also worked at a company called AMEE, which stands for Avoid Mass Extinction Engine. We burned through something in the region of 20 million US dollars of VC money trying to put a carbon footprint API on anything we could find. These days, I work at the Green Web Foundation, where we track the transition of the internet away from fossil fuels to something greener. I also work with the Green Software Foundation, which is a larger industry body, where I'm the policy chair. I also co-host a podcast called Environment Variables. Get it? Yeah, which is all about green software. I'm also a contributing editor to a magazine called Branch Magazine, which is all about this kind of intersection between climate and tech. This is what we're going to cover today. I'm going to frame a kind of problem that you might think about when it comes to sovereign tech, sovereign cloud and sustainability.
I'm going to talk a little bit about the drivers for sustainability at a kind of regulatory level that are happening in Europe specifically. Then I'm going to talk about the idea of competing on transparency as a basis for a more diverse cloud ecosystem. So off we go. Framing the problem with the Green Web Triangle. Most of you know that we basically build services by composing them these days, from lots of different services. So it's a bit like a kind of graph, where the thing people see at the top is actually comprised of all these things that we might build all the way through that. When you're trying to actually use a service, you kind of end up with this kind of problem. If you want to make a service available to other people, or if you want to use a service, there are three things you want, in my opinion. You want something to be convenient, ideally hosted by someone else a lot of the time, or at least so you don't have to be doing it all yourself if you're a medium to small size company. You probably want it to be fossil free, because there's a whole climate emergency thing going on. And ideally you'd like it to be diverse, so you're not dependent on only one provider who can jack up their prices, or all the other bad things that happen when you're relying on only one provider. And I think there are different failure modes when you think about this triangle, because most of the time you can only pick two. So for example, let's say you wanted to go for convenient and diverse. This is the common thing that a lot of us default to. It's basically the default, it's fossil fuel powered, and it's a bit like being in a spaceship that you know is on fire, and then you go to work and you type on a keyboard and you know things are on fire. Then you go home and you realize things are on fire. Then you feel a little bit anxious. And people call that climate anxiety.
That's one of the things that some of you might feel when you think about this. But on a global scale there's an issue there. So that's the thing that we probably are in right now, that we don't want to do all the time. The other one is diverse and fossil free. Maybe you do care about this and you're going to look for, say, green providers. But you also care about having diverse ecosystems and building everything yourself. This is cool. But it has another failure mode, in that you spend loads and loads of time on undifferentiated heavy lifting, which isn't necessarily creating the kind of value that you're employed for a lot of the time. Or alternatively, you can go for convenient and fossil free, because lots of large companies now, like, say, Microsoft and Amazon and Google, are telling a really, really loud story about how much they're running on green energy and doing a load for the climate. Now, there are all kinds of problems which, since I'm in a sovereign cloud dev room, I don't need to talk too much about. But there are also issues because these folks tend to be the people who are often the market leaders in helping people get oil and gas out of the ground and burn it. So even if you're running on, say, green software, they may be doing things which you might not be aligned with as an organization, or which you might feel somewhat uncomfortable about. So these are the three ways. This is the triangle, and these are the common failure modes that I think are useful to talk to other people about, and this is how I explain the problem with sustainable, diverse digital services to people who want to actually make a switch and start using something which is not necessarily one of a number of really large providers, but don't understand why you can't just move to, like, Nextcloud for everything. So that's the problem, that's the triangle.
Now let's talk about the drivers for sustainability in tech at a regulatory level that might help with some of this. This graph is from Dr. Robert Rohde from Berkeley Earth, which is an environmental data science nonprofit, and they're using data from the Global Carbon Project. What you can see is that historically we've been growing in emissions for however long, and you can see this trajectory, which people might refer to as business as usual. There are lots of different scenarios, and this one is actually titled SSP2. There's a lot of these. You can see the kind of changes we would need to actually make to get down to 1.5 degrees of global warming, which is considered the least worst option that you will typically see a lot of the time. You might have heard of 1.5 versus 2 degrees, for example. In 2015, pretty much every country on earth agreed that we should probably do something about climate and that we should go for not more than 2 degrees of warming. Then you had a bunch of people say, well, actually, 2 degrees is still pretty bad, and then we did a bunch of research and found out that yes, 1.5 is what we should be thinking about, actually. And following that, we've had people thinking, okay, well, what does that mean for us as a technology sector? In 2020 you had a bunch of digital and sustainability focused organizations trying to figure out what that would mean for us as a sector, like how would we need to drop our emissions, for example. As you can see from this quote from the press release that came out when it was issued, they basically said that to actually meet the Paris Agreement, which is less ambitious than that 1.5 thing, you need to reduce emissions by 45% by 2030, in seven years' time. So that's basically making a reduction of around 7% each year, year on year, and, well, we haven't really been reducing things by 7% each year, so it's
actually going to be larger. So this is basically what the science is spelling out for us. The thing that's quite interesting this year is that you've had people talking about net zero for a while, but now people are starting to actually say, well, what does net zero really mean in the context of this science? Has anyone here heard of the ISO, the standards folks? Okay, they came up with something really interesting that they published in November. They basically said if you want to call something net zero, you need to have some teeth behind it, and it needs to actually mean something now. So while there have been lots and lots of greenwashing net zero targets and things from organizations, they've come out and said, actually, this is the kind of stuff you need for net zero now. They've said that if you have net zero claims for 2050 without having anything related to halving emissions by 2030, they're not credible. If you don't include your supply chain, and only think about your own things rather than who made a computer or anything like that, not credible. And if you don't have any interim targets in the next few years, also not credible. This is quite important, because seven years is about the same as the average tenure of a CEO, which basically means you can say, oh, we're going to be net zero, leave, and then it's someone else's problem to solve. This is them basically saying you need to have something now, so you can't do that kind of stuff. What's also interesting in Europe specifically is the catchily titled European Corporate Sustainability Reporting Directive. Basically, if you are in a company which has more than 250 employees, you need to be recording and reporting some information about what your carbon emissions look like, and you need to start reporting in 2024, which means you needed to start recording two months ago.
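The figure quoted a little earlier, a 45% cut by 2030 in seven years, can be checked with a short Python sketch. It assumes the same percentage cut is applied every year and compounds; the five-year case is a hypothetical delay added only to illustrate the point that the required rate gets larger the longer reductions are postponed.

```python
def annual_reduction_rate(total_cut: float, years: int) -> float:
    """Constant yearly cut r such that (1 - r) ** years == 1 - total_cut."""
    return 1 - (1 - total_cut) ** (1 / years)


# Simple division of the target over seven years (no compounding).
linear_share = 0.45 / 7

# Compounding cut needed if reductions start now, with seven years left.
compound_rate = annual_reduction_rate(0.45, 7)

# Hypothetical: the same target if nothing happens for two more years.
delayed_rate = annual_reduction_rate(0.45, 5)

print(f"linear share per year:    {linear_share:.1%}")
print(f"compound rate, 7 years:   {compound_rate:.1%}")
print(f"compound rate, 5 years:   {delayed_rate:.1%}")
```

Under these assumptions the quoted "around 7%" sits between the simple division (a bit over 6%) and the compounding rate (a bit over 8%), and the delayed case comes out above 11% per year.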
So that is interesting, because lots and lots of organizations don't even have a sustainability team and don't even know how to start reporting this stuff. And then finally the IFRS, sorry about all the acronyms. You can think of these folks as like the pope of accounting, right? They're the people who basically set a decree which everyone then follows in accounting land. They've basically said, that supply chain thing that people talk about, yeah, actually we should be doing that, and you can't just report on your own organization and leave out all the things you buy from other people, for example. So that's what people have been saying, and how are we doing so far as a tech sector, right? There's good news and there's bad news. The good news is that we have seen lots and lots of use, and more people are connected to the internet, which I think is, broadly speaking, a good thing. I don't know about you folks, but I quite liked having access to the internet during the pandemic. I think that's been really useful, and it would have been a lot more lonely and sad and cold and everything if I wasn't connected to that. You generally see that we've been using much, much more in the way of digital services over the last, say, 10 years or so, but the growth in energy use hasn't been that large, really. It hasn't been proportional, and there's a number of reasons for this. Okay, can we just not talk about cryptocurrency? I'm sorry, it's not helping billions of people, right? The two billion people who joined the internet, pretty sure it didn't help two billion people, yet it's using about half the energy of all the data centers on earth. All right.
So that's the thing. We're basically going in a direction which is not ideal at the moment, and we need to be reducing this. There are now some companies which are trying to step up on this, and they're basically saying, well, we want to keep growing, which has issues of its own, but maybe one thing we can do is make the energy we use greener. So you now see very, very large companies like, say, Google say, well, 2030 is an important date, and we're going to make sure that the energy we use is coming from green sources 24/7. And 24/7 is important because, okay, you know how you get billed for cloud on an hourly basis, and there's about maybe 9000 hours in a year? Yeah? All right. Normally when you look at green energy and people say "I'm running on green energy", it's an average. Actually, the way I describe it to people is like this. We all know that we should probably get about eight hours of sleep, right, or around that, in a 24 hour period. With the way you normally account for green energy right now, it's a bit like me saying: I know I need to get eight hours of sleep every night, and I've got a year, so what I'm going to do is have 3000 hours of sleep at the beginning of the year, all right, and then I'm just going to run on Red Bull and chocolate bars for the rest of the year, and that's going to be sustainable. That's kind of like annual energy balancing. That's what the accounting is like. There's more to it than that, but that's what we currently have at the moment, which is why 24/7 is actually a laudable and quite an important goal. Microsoft is also doing this. They've said they're going to run on a hundred percent zero carbon purchases, a hundred percent of the time, which again sounds kind of cool. And these are companies with almost infinite money. I say almost
because they use that as a basis for firing lots and lots of people, despite still being really, really profitable. But it turns out that you don't need to be a ginormous company with infinite money to set this kind of target and to be moving this quickly. This is Peninsula Clean Energy. They're based in California, and they're a small community choice aggregator, which is basically a bit like a non-profit energy company. They've said, well, actually, we think we can do it by 2025, because we built the idea of avoiding an absolute climate crisis into our governance, basically, and if it makes us slightly less profitable, that's okay, because we're not answering to shareholders. So they show that they can do that, and they've also shared the actual underlying model, in Python, to show how they bought all their stuff as well. So you can actually do this. And there's something interesting that's happened in Europe in the last week, actually. Do you know how Elon Musk did that whole thing where he just cut off API access to Twitter, and then everyone's like, oh, this thing I was going to do one day, like Mastodon, I'd better start using it now, because I should be using it and I meant to do it anyway, right? There's this kind of idea that you would get around to it. Kind of the same thing happened with fossil fuels in Europe, right? Europe had been using loads of fossil fuels from Russia, and then Russia basically switched off the pipe, and now you see press releases like this saying, well, we've got off Russian gas, folks, and now we've got all this money that we would have otherwise spent on fossil fuels. So it's a bit like being handed lemons and making lemonade, but the upshot is that you've got the European Commission president, and the people who essentially run Europe, saying, well, we've got a quarter of a trillion
euros that we would have spent on gas, and now we're planning to spend it on cooler things like net zero technology and wind and solar and stuff, which I think is actually... well, I'll take it, I suppose. All right, so that's what's happening at a regulatory level, which I think is quite interesting. Anyway, let's talk about competing on transparency for a diverse cloud ecosystem. The thing is, we have the science telling us we need to reduce emissions, there are some organizations which have started saying, yeah, we're doing all this big splashy stuff, and there's more turning up in the way of resources to do this stuff. There is also this narrative in policy circles right now that all the innovation has to come from very, very large companies, and I think it's not really true, particularly in, well, basic cloud, for example. There's all kinds of cool stuff that you see from small providers in Europe. OVHcloud pioneered, say, liquid cooling in servers, as you can see here, and they've been doing it for ages. Scaleway, a French cloud provider, has been using Arm servers for ages as well. This is the kind of thing that people make a big thing of, but it's just been standard for them. As well as that, okay, adiabatic cooling is basically using some ideas first developed in Iran to reduce the use of water but still keep things cool. They basically use it inside their servers rather than drawing loads and loads of water out of the local aquifers. This is important because in some parts of the world there are Google data centers which use the same amount of water as the city they're in, basically. So you should be caring about this kind of stuff. Anyway, there's loads of other examples that I've listed here, and I'm sure you can read them as well. There's also an interest at a regulatory or public sector level now, where people are trying to
actually figure out what sustainable software really looks like, and how you standardize it, and how you procure for it. So Okular, which was celebrated by KDE recently, has the first Blauer Engel, Blue Angel, sustainable software certificate, which was issued to the Okular PDF reader around September last year. This is interesting because this is the kind of thing that people who spend money trying to procure green digital services are actually getting now. And if you work in a company and you're using some of the large providers, this is important, because it turns out to be quite difficult to get the metrics from the very, very big hyperscalers. This is actually a table from a person called Mark Butcher at Posetiv Cloud. He shared some of this with me because he was trying to figure out, okay, if you're using these big providers, what information can you get from them when they're in your supply chain and you need to report against it? I won't go into all the details here, but the red stuff is bad, right? So if you're using those two providers, it's not really a good look. Even on the left hand side you do see some information, so Microsoft is probably one of the better ones right now, but there are still some questions about what they report. With Amazon, the level of reporting they give you is so, so low resolution that you can't tell if the emissions are coming from one data center or another data center which is on the other side of a country, thousands of miles away. So you have some issues here, because you might have one data center running essentially on coal and another data center running on hydropower, which is definitely cleaner than coal, all right? And you have all this stuff here, and this is one thing that you don't really have access to. If you need to report, you're
going to have to be able to figure out how to actually get these numbers, so you can see if you're going up or going down, like we saw before. There's also loads of really cool open source tooling which makes it possible to get these numbers out. The most well-known one is Cloud Carbon Footprint. This is maintained by ThoughtWorks. It plugs into billing APIs from, say, Amazon, Microsoft and Google, and then does some guesstimates based on available data to tell you what your figures might be. The various cloud providers have their own native tools, which have very different assumptions to the open ones, which you can challenge and ask questions about. Green Coding Metrics, they did a talk today. If you have a digital service, they'll put every single service in a kind of Docker container and then track the energy used, and I've got a really nerdy part of a talk at five o'clock later today where I talk about some of that. Scaphandre, I'm not going to pronounce that properly, it's a French word for diving helmet or diving suit. It basically gives you process level energy usage metrics from any machine you run it on, if you have access to the bare metal, and it spits them out so that you can put them into dashboards, like in Grafana. Kepler does more or less the same thing, but for Kubernetes, using, you know, extended Berkeley Packet Filters, that's what they're called, right? Yeah. So that's another thing, and there's another talk about that kind of stuff. So there is all this stuff which you can use and plug into your tooling, or which you can ask people to use, so you can actually get the numbers to show that you're heading in the right direction, if you're interested. And I think that's my time, so I'm just going to wrap up. That was the Green Web Triangle. There are three things you can have: convenient, diverse, fossil free. Most of the time you can only really pick two. There's pressure right now to
really push towards fossil free being a key thing, which then really makes you decide: am I going to be diverse, or am I going to be convenient? You can compete on transparency, and I think this is an area that you could actually really lead on, and I think this is probably underexplored right now. And finally, diverse ecosystems are healthy ecosystems. There's policy support and there's funding to make this stuff possible, and I think you should do it. There's actually even a policy idea around this. The term you need to use is twin transitions. That's the kind of sudo password you need to use when you're speaking to a policy maker. And I think that's it. Thanks, folks, for your time. There's a talk about a fossil free internet. This talk is online with links to everything I showed you. I'm on mastodon.social, and okay, I'm mrchrisadams on Twitter, but I don't really use it very much, but you're welcome to tweet there, I suppose. And yeah, that's my email address. Thank you, everyone. Thank you, Chris. Open for questions. Yes, I'm also in the Element Matrix thing, so if you've got a question there, I'll try and answer it as well. Well, hello. I waited for some serious questions, but I have one, maybe a weird one. What do you think about low earth orbit data centers? You're talking about data centers in space that beam things down? Exactly. I think if there was a way that you could get them into space without the rockets, and there are actually some wacky ways to do that, I think it'd be cool. I think astronomers have really strong opinions about that, and also the whole idea of privatization of space makes me feel a little bit uncomfortable, because we're not great at dealing with our waste right now, and if you have all the waste that stops other people [inaudible], then I feel like that's maybe not the ideal
solution, basically. I also worry a little bit about having the guy in charge of Twitter in charge of the future of space. Maybe let's look at how he's handling Twitter and think whether we want to see where that goes. I'm not sure about that. That's my answer, I suppose, but it's probably not related to climate. Anyone else? Hi, I just wanted to ask: what can you do as a small company, 10 to 15 people, developing software, probably for bigger clients? I think what you can do is write it into your contracts. Actually, there's a website called The Chancery Lane Project, which has all these clauses for working with larger companies. You can also allocate some of your revenue, 1.5 percent of my revenue for 1.5 percent of us being alive, and then use that to fund things, either internally, making your own organization greener, or by funding things like non-profits or groups who are actually pushing for policy change, stuff like that. I think those are the things you could do which are easy to write into a contract, and which would actually show that you were demonstrating leadership compared to other people. This is also what we do in terms of training and consulting. I'm very happy to chat about that, because, oh yeah, I probably should have told everyone that this is what we do as well. But yeah, I'm happy to chat about that in more detail, or answer questions on Element and stuff. Okay, thank you. Hi, thank you. You showed a list of tools that help you get energy... Yeah, this one? Yes. How precise are they, and how many assumptions do they make to get the numbers? And what would be needed to not make assumptions at all? Okay. Someone mentioned ilia before, didn't they? So ilia was one of the German cloud providers, sorry, not cloud provider, energy provider. What we need to do is get carbon emission figures from the
energy providers, so they can tell you how dirty or clean the energy is at any moment in time. So you need that, basically. Scaphandre is very, very good, because it gives you very good numbers, but you need hardware support to actually share the information about this. So Intel have something called RAPL, which is Running Average Power Limit, which gives you an idea of the energy used by the CPU, but not necessarily the GPU or the disk. So you need all this stuff instrumented for you to get this. Kepler: because you're going into the kernel, you can get really, really detailed information, but I'm kind of out of my league here, because I can write docs but I can't write Rust, which is why I contribute docs to Scaphandre instead of Rust. Green Coding metrics: there's also a talk online where Arne Tarara talks about this in really good detail, and he's actually very knowledgeable. And Cloud Carbon Footprint: everything is open source and it's all in TypeScript, where they are very explicit about the assumptions they make all along, and they're also open to pull requests. I've committed code to that one, and, yeah, I haven't committed code to the other two, but they're cool people and I really support the projects. I hope that helps, and I'm again happy to chat in more detail. Thank you. Cool, folks, thank you. |
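As a rough illustration of the answer above, which combines an energy reading from the hardware with a carbon intensity figure from the energy provider, here is a minimal Go sketch. It assumes a RAPL-style microjoule counter that wraps around at some maximum value; the function names, the readings, and the intensity of 400 gCO2e/kWh are all hypothetical and are not taken from Scaphandre, Kepler, or any other tool mentioned in the talk.

```go
package main

import "fmt"

// counterDelta returns the energy consumed between two readings of a
// monotonically increasing microjoule counter that wraps at maxRange.
func counterDelta(before, after, maxRange uint64) uint64 {
	if after >= before {
		return after - before
	}
	// The counter wrapped around between the two readings.
	return maxRange - before + after
}

// emissionsGrams converts energy in microjoules to grams of CO2-equivalent,
// given a grid carbon intensity in gCO2e per kWh (an assumed input that
// would have to come from the energy provider).
func emissionsGrams(microjoules uint64, gramsPerKWh float64) float64 {
	const microjoulesPerKWh = 3.6e12 // 1 kWh = 3.6e6 J = 3.6e12 microjoules
	return float64(microjoules) / microjoulesPerKWh * gramsPerKWh
}

func main() {
	// Hypothetical readings, not real measurements.
	used := counterDelta(1_000_000, 37_000_000, 262_143_328_850)
	fmt.Printf("energy: %d microjoules\n", used)
	fmt.Printf("emissions: %.9f g CO2e\n", emissionsGrams(used, 400))
}
```

The sketch covers only the CPU-style counter the speaker describes; GPU and disk energy would need their own instrumentation, which is exactly the gap the answer points out.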
The Co-operative Cloud
Public interest infrastructure |
Great. Yeah. Sorry for the old, the usual show. I'm sure you're used to it now. Great to see so many people here. Amazing. Thanks for your interest in the Co-op Cloud. Yeah, I'm decentral1se. That's my internet name. And these are my enterprise slides for the Co-op Cloud. So I've been working on this project now for, yeah, maybe two to three years, I would say. So there's a lot of knowledge of the project, how it happened, and what came to pass, locked up in my head. But I am only one person involved in this project, which has now become quite a wonderful project, I'd say, which involves a lot of different collectives, a lot of different groups. So yeah, this is just kind of like a moment to offload what we've been up to for the last two years. And yeah, all the hot takes are mine alone. Some of the images are taken from, yeah, shout out to fellow worker Trav, the internet gardening collection on Are.na. I totally recommend checking that out. It's great. So yeah, Co-op Cloud. This is the official website-ready description of what the Co-op Cloud is: a software stack that aims to make hosting libre software applications simple for small service providers, such as tech co-ops who are looking to standardize around open, transparent, and scalable infrastructure. Which is a lot. So I was thinking, how am I going to explain this? Because we come straight in with a software stack, but actually, it's much more than that. I think you could argue our project is more a social endeavor, social organizing, more than a technical thing. But of course, they're overlapping. So I thought, well, okay, we'll just go back. We'll do a history lesson on how the project started. And I feel like that is an important angle for introducing the project. So I want you to understand where the project is coming from. And then maybe you feel welcome to join the project and shape the future of the project, right?
And I think this historical view on where things are coming from helps to ground the project in an actual need, a human need. There's a lot of software projects out there that are just like, why does this exist? But we're trying to show that this is socially useful, and why we came up with this project and initiated it and tried to make it work is based on what we needed at the time. But that may not necessarily be the case in the future. So yeah, put that to one side. And let me introduce you to the magic brain of Autonomic Co-op. So these are some people in the co-op. There's me on the banjo. That's not all of them. There's 13 of us, as far as I remember. We have a new website that was just put up last night. So check it out. It's hilarious. I don't know if I can get it up here, but I'll try maybe at some point. So yeah, we're a technology cooperative, a worker-owned cooperative. That means the people who work in the business own the business. It's run and managed by ourselves. And that means, you know, when we come into where we work, we have the chance to make decisions about every aspect of how the workplace runs. So, you know, what kind of work do we want to do? How do we want to make decisions? How do we deal with money? How do we find new work? Who do we want to work with? Another member wants to work with this group, but someone else disagrees with that. How do we work this out? How do we deal with conflict? It's kind of an end-to-end workplace situation where you have the chance to, you know, be involved in every step of the process. And of course, you don't have to do that alone. You can do that with your friends. So that's the kind of model of the cooperative. Yeah, I'll come back to the website maybe in a bit. But one of the ideas behind what a cooperative is, is this listing of cooperative principles. You can see them on the right-hand side there.
For example, autonomy, independence, you know, economic participation of the members, open voluntary membership. These are kind of like non-binding principles of what it means to be a co-op. Principle six, for example, that you will work with other co-ops. So, you know, we want to expand cooperation in the ecosystem of groups that are trying to do this as well. And a lot of these hinge on actually having money to survive in order to practice what it means to be a cooperative. So we're a technology cooperative. And what do we do exactly? You could kind of say we do basically everything and anything. Like, we would happily develop a new piece of software, but we'd also come to your house and fix your toaster. We, you know, we try to work with people that we want to work with, that we want to support, whose work we like, or who connect with our values. And one of the things we've done quite successfully, in my opinion, is to run a free software stack internally and for the people we work with. So, you know, all the things we use are free software. So we're pretty good at that, managing our own internal infrastructure. And so necessarily, we would then just offer that to groups, like, do you want a website? Do you want your own chat system, like a Matrix, or do you want a wiki, or do you want a Nextcloud, or whatever, this kind of stuff, this classic hosting situation. And one of the problems that has come up, Autonomic has been running like several years now, five years or so, is that it's difficult to make money out of. People are used to just getting stuff for free, for very understandable reasons. Big tech. And at some point, you know, we want to scale out the co-op, we want to involve more people, we want to be able to make a living from it, we want to be able to survive, pay the rent and so on. And then we started realizing that, okay, actually, we're making more money, or people are happy to pay us more, when we are doing support.
And this is talking to people, getting on a call and being like, what is software, and then talking about it, or what is a wiki, everyone's saying wiki, I don't know what a wiki is. And just having a chat about it. But then also, being around later. So once they ask for the thing, like a Nextcloud, you set it up, and then you don't disappear, because you're a cooperative, and you intend to continue. So sustainability is one of the core principles. So we stick with them, we chat, we're in contact by mail and so on. So this is kind of the dilemma: we definitely want to do more hosting, because people need digital infrastructure to do their work. This is just like a base layer of how to organize things. But it's difficult to survive as a technology cooperative only doing hosting. So you need to expand what you do. And one of the places we ended up was using Cloudron, which is a kind of one-click-install open source system, which I would still recommend. It's great. It got us really far. A lot of the next slides criticize it. So I'm just going to say it's great on this slide. So you can see, you have your classic selection of open source apps, if you're familiar with that stuff, like Rocket.Chat, for example. We use Rocket.Chat to communicate, you know, in the co-op. And what this enabled us to do was basically work with different groups that may not necessarily have money to support us, like to make it sustainable to work with them, but still support them to do their work. And Cloudron was like a real get-out-of-jail card in that sense. It was like, okay, I can just hit five times on this WordPress button. And we've got five WordPress sites, and people are going, they're posting, they're organizing, they're working, we're supporting them. It's brilliant. You know, and for those that don't know, Cloudron is just like, you fire up a server, you SSH into it, you run a command, it spins up this thing.
And you're going. You can install multiple apps on the same server. So this really reduces the costs of rolling out digital infrastructure. Let me check time. Sorry. Yes. Right. So yeah, remember I said Cloudron's cool, but at some point, the core of the product, the front end, the web front end, became proprietary. So they made a switch. In some sense, I can't blame them because, as I said, if you want to survive, you need to make a buck and you need to pay the rent and so on. But yeah, that made us nervous. You know, when we work with people as a cooperative, we want to say, we'll be around for you. We want to continue. We want to do this sustainably. And if you remember, one of the principles is our independence. Like, can we exist in this world? You know, there's, of course, interdependencies, but can we continue to do what we want to do without relying on one specific thing? So Cloudron made the decision to make the front end proprietary. You can still use it or whatever. It's a great system. But we thought, oh no, what's going to happen? And then we started looking a bit deeper. And we realized that with Cloudron, when you click and install an app, the app is, yeah, it's an image. So it uses a container system. It's kind of like a light virtualization layer. They're packaging all the apps themselves. So the people who work in Cloudron Inc. or whatever, when they want to provide a new app, they say, okay, let's make a new Git repository and let's package this thing. And that's great. And it works really well. But then we realized that they had packaged, I can't remember specific apps, but take, for example, Nextcloud. Some worker in Cloudron had gone and packaged Nextcloud. But then we went to the upstream Nextcloud repository and we saw they had provided an image. And then we thought, well, why aren't we just working with them?
Because that would, you know, connect our values of expanding cooperation in different layers of the software stack and so on. And then the more we looked, we realized that free software communities were really converging on the idea that you have a packaged image inside your repository. So you're hacking on your source code, and you just have an image that's building. And we can use that. That's what Cloudron needs. So we were like, okay, that's interesting. Maybe let's think about that. Yeah. And then the logical end of the paranoia, or whatever it was, was that, you know, we're relying on this one company, this one group. You know, Autonomic is a cooperative that supports maybe 20-plus groups, but a large selection of them have very little money to pay for the infrastructure. So if you can imagine that Cloudron would somehow make a bigger step into more proprietary solutions, or somehow increase the cost, or who knows what, we'd be in a position where we can't support those groups. And that's not something we wanted to deal with. So we started to try to think, you know, we're a co-op, how do we make this better or whatever. And that is the start of Co-op Cloud. So now we're getting back into another take on that original really long sentence that had things like "software stack" in it. Yeah, we were like, let's just eliminate this issue of proprietary, you know, angles or whatever. Oh yeah, here's where I just couldn't get the emoji in the slide. So I just gave up. Yeah, we were just like, we'll just copyleft the entire thing. Let's just do that. That seems sensible. Cool. And then as I mentioned, yeah, you have a little favorite emoji. We want to work with the upstream developers of the software, because there's a lot of precarity in the open source ecosystem, in that there's, you know, certain developers who are doing unpaid labor to develop the software that a lot of us rely on.
And they're providing these, you know, a lot of them are providing these packages, and we thought, well, okay, we can just meet them where they're at, engage with them on the issue tracker, you know, speak to them, make them aware of our hosting efforts. And, you know, we're in a sense closer to end users, in that, you know, it's like developers, hosters, users, you know, summarized, and often developers, well, they've got their hands full trying to make the software work, so we can help that connection. So we're trying to bridge that in that sense. And also, which we'll see later on, they're packaging up their apps in images, but they're also providing this kind of extra configuration around it, which kind of tells you how to deploy the thing, which is great for people who are doing hosting because, okay, we have an app, but does it need a database, for example? So what database? They're also doing that for us as well. So if you develop open source software, thank you. Yeah. And then the democratic governance. We were going to initiate this project, let's say. I wouldn't say we invented it, we just glued a bunch of stuff together. But we wanted to not be in control of that. So not become the new Cloudron, which is like, okay, let's set up some clear rules for how you interact with this project, and also on what basis. Are you also a technology cooperative? Are you an open source developer? Are you a user of this software? Do you want to support the host or do you want to support the... We can start to engage with it where we're at, but meet in this kind of common project. And obviously, the goal then is to sustain open source digital infrastructure and expand cooperation. Yeah. So moving out of the kind of history phase and now into what it actually is. Yes. So this phrase kind of pops up if you've checked the website or the docs: democratic tech collectives. So what is that actually? Autonomic is a...
We're properly registered in the UK. We're publicly regulated. We're a cooperative society. We have gone through the paperwork to make that happen because we wanted to do that. But we recognize that not everyone will want to do that or be able to do that. And this really depends on a per-country basis. If you're in the Netherlands, it's a different model, or if you're in the UK, the way you relate to the legal system and the state is like... In Germany, it's quite difficult. I've been told there's no way to kind of just slot yourself in. It's not very easy to find information. And we don't want that to be a limiting factor in working together. So we were trying to conceptualize this idea of what other groups we would be willing to work with, but not close that definition down from the outside. So we were thinking about other groups who want to work with their friends or together with other groups and have set up decision making, for example, collective decision making. So they're able to navigate what they want to do together. That was kind of a thing we wanted to do. We wanted to be able to interact with each other. We wanted to be able to disagree and receive and send constructive feedback, reach compromise, stuff like this. This is kind of easier to do when people are already organized in their own groups. But yeah, this didn't really rule out individuals, who are active in the project at the moment, and also other technology cooperatives. We're kind of just trying to muddle through it somehow and be like, just work with us. Let's figure it out together. And this is already in progress. This has been in progress for some time, the formalization of what this means. And that's an open process, which we would love to invite you to come check out. The configuration commons: so, as I said before, we have open source apps, and they're packaged in images.
And then we can specify a configuration around those images to describe how that app should look in a deployment on a server, live and well, people using it. And one of the things we noticed was that, as I said before, the configuration that was being provided by the upstream repositories was useful, but it didn't specify the full end-to-end production kind of scenario. And that's kind of a big word, but for example, how to back up the app data. Like, the thing is deployed, people are using it, and we need to make sure their data is safe. So we need to back it up. How do we back the thing up? How do we restore it? This is kind of like a step from "it works" to "it works and it's safe for these people to use it". And we wanted to encapsulate that into our configuration. So this is a big part of it. We'll go through each one of these deeper. I'm just going to overview them now. Yeah, Abra is a command line tool. So, our own digital tools: we wanted to be involved in how the tools are shaped and how we use them. You know, that's a great situation where you're dogfooding your own system with your peers and you're trying to figure out how does this, you know, best suit us, and according to our own constraints. You know, we can't spend all day learning this obscure system, or we can't invest too much time in some things. So we need to cut corners, and it's best to be involved directly in that process. So, yeah, and if we remember back to Cloudron again, sorry, I'm not going to be bashing them at all. But yeah, like, how do we interact with the system? In the case of Cloudron, it was a web front end. But if we're talking about interacting with technology cooperatives and other tech collectives, there's kind of a difference to begin with, because Cloudron is maybe trying to provide for the, let's say, non-technical user, or someone who can just kind of get in and click a few buttons or whatever.
But we're already dealing with groups that are actively, like, you know, deeply involved with Linux system administration and so on. So do we need a front end? So those are the kind of questions we can ask and answer ourselves in those moments. And the collective infrastructure. So there's a lot that goes on in between getting an app out and someone using it. You know, people have to meet each other and talk, and who are you and what's going on and what do you do and all this kind of stuff. And then money. Where does the money go? What do you charge? You know, it's just this end-to-end, all the social processes that go in and around that. We wanted to build those up too, because that is a huge problem when you're starting off as a collective and you're like, oh, no, where's my bank account? Like, how do I get a bank account? And you can see Open Collective, for example, doing this, just mutualizing this financial infrastructure and just getting people going. So yeah, docs, yeah, Git hosting, get off GitHub. That's cool. Stuff like that. So I'm going to go through these a bit deeper. Yeah, so our proposal at the moment as Autonomic, which is, yeah, now I can say the sentence and I hope it's a bit clearer: we initiated this project and we're deeply embedded inside it, but we're attempting to step out of the project and reenter on an equal basis with other collectives. So in order to do that, we're proposing a federation model, which is based on a great project called CoopCycle, if you've heard of it, but basically different groups can interact with each other. There will be democratic decision making and, yeah, some laws, and we'll come up with a constitution together and all this kind of stuff.
I couldn't fit it on this slide, but it's not just chaotic nerds, it's also open to, you know, we can imagine the people who are using the software grouping together and joining our federation as well to say, hey, this button should be over there. Could you just fix that? But in a more collective sense, like, we could be gathering money, we could be figuring out how to improve things. You know, you don't have to be able to write software to join this kind of stuff. We could be, you know, connecting the different struggles to build up a better open source ecosystem. So this is a process that's going on right now. There's invites going out. We have a new round of private funding. Thank you, private funder. And yeah, there'll be more news. But check the website if you're interested in this. If you're also part of a collective, we'd love to hear from you. This is a massive problem that we, of course, experience being sysadmins. And we've seen a lot of people join in on the old Matrix channels. And, you know, that'd be one person setting up a few apps for friends, family, maybe their local community, food co-op, what have you. And then a few months down the line, it's completely overwhelming. The email broke, you know, or it doesn't do this, or who knows what. If you do any sysadmin work, you'll know the story. But when you step into the Co-op Cloud project, you have this chance to meet your peers who are also doing this work. They may not be involved in your project whatsoever, but that's not important. We have this idea of the config commons. That means we can share work on, again, the Nextcloud recipe, that's what we call them, we'll come to that in a bit. So we can work together on the same configuration. And that means we're sharing tips, tricks, we're talking together, you know, it's like, this group of users wants this, do you want that, oh, I tried this, you know, this kind of stuff.
And this is really, yeah, this has been seen to be a great feature of trying to just open the door for people to come in and work together. Because there's a lot of tools that propose reuse and kind of mind share and a, you know, collective point of reference, but don't really live up to that, you know, the ideal of reuse, of just not doing it again, not repeating yourself. So we were conscious of that, and we're trying to work on that. Yeah, focus on training and collaboration. So we wanted it to be easy to onboard. We want more democratic tech collectives to pop up, we want more technology cooperatives to be able to start, we want people to be able to decide, I want to own and run my own workplace, and I want to, you know, bootstrap some infrastructure for groups I find cool in my city. And that's what Co-op Cloud is about. It's like a tool set for these people. It has to be easy to use, of course, and we're very focused on that public, so groups who have already decided that that's what they want to do. They're on the terminal. At the moment, we only have the terminal client, and they're, you know, they're getting into that. You know, we can't do everything for everyone, but with that public in mind, we can definitely be focused on how we attempt to make it accessible and usable. And yes, cooperating with other networks. So within our network, of course, we want to have groups join, but for the groups that don't want to join, we still want to work with them. If they have the same values, if they're doing similar things, we can also interface with them.
And we've already seen that happening, and I'll come back to that in a bit, but, you know, once things are clear, and they're getting clear, like, we are a group of cooperators, we have this configuration commons, we have these tools, we're just specifying what we are, then it's easier for other groups to say, okay, I understand what's happening here, and I would like to do this with you. So there's kind of, you know, a concrete example: find funding together. So, yeah. The config commons, yes, the open source apps you love. It's this catalog of software that is out there, and people are developing the Nextclouds, the MediaWikis, the Synapses, the Giteas, the whole thing. So we have quite an expanding config commons, so you can deploy a lot of apps. So that's the thing, like, people come to us, and they're like, oh, I need a calendar, or oh, I need, you know, a note-taking app, or whatever, and then we're out hunting in the open source ecosystem for apps to add to the catalog. So that's that. Then, yeah, to come back to, again, this idea of the image, which is the app, the packaged thing, and then the kind of wrapper around that, which specifies end-to-end how it should look in production. And we were conscious to not reinvent a packaging format, and we didn't really know what to do at that point, because we thought, okay, we need to be able to say, you know, in one file, let's say, in plain text, a config file, like, here's the app, and how do you back it up, how do you restore it, how do you take it down, how do you bring it up, how do you configure it, how do you plug it, you know, this kind of stuff. Yeah, and as it turns out, the Docker Compose ecosystem and Docker have kind of been moving towards the Compose standard, they're calling it, and it's basically, yeah, if you've ever seen a Docker Compose file, it's a YAML text file, and it kind of just specifies what we're looking for.
So we just figured, great, upstream developers are already using this, this is a developing open standard, great, let's just build on this. And I think as it turns out, that was a good choice and has had many benefits, which we'll come to on the next slide. Yeah, so we're working with upstream developers. And in my background, I've done a lot of configuration management tooling, for example, working with Ansible, where there was always the grand ideal of reusing roles. If anyone has used Ansible before, it's kind of like a packaging format for, you know, you have a server and you want to install Apache on it or something, and you tell Ansible, install Apache for me, and then you want to install the next thing, and then you kind of package this into a role. And at some point, I realized that everybody is writing the same roles. It's very difficult to share and to reuse other people's stuff. It's definitely happening, but not on the scale that I was looking for, and so I was looking for alternatives. Yeah, so in the project so far, we've seen people, like, we have, for example, Matrix servers, Synapse installs. We've seen multiple collectives look at the config and say, this works for me, I'm going to use this, and then changing the config in collaboration with other groups. So we're already seeing that people are able to make the changes that work for them without breaking other people's installs. And then when things need to be worked out, they speak to each other and things move along. And again, the Compose standard really helps here, in that it's quite flexible and people can move around each other inside the configuration, which will hopefully become clear on the next slide. Yes, great. So we're calling the app configuration a recipe, cooking inspired. And it's a Git repository. Here you have Gitea, Gitea, so I don't know. And it's a bunch of config files.
And as a collective, you come here and look and say, okay, this is the configuration which specifies, you know, how this thing should look in a deployment. And kind of one of the magic sauces of this format is that, if you've ever used, I think it was the Docker Compose command line client, you can kind of chain the compose files. So you can say, like, docker compose -f compose.yaml -f compose.some-other-thing.yaml up. And the system will internally merge all of these configurations together and then roll out the app, which I never had occasion to use until I realized that maybe somebody wants to use Postgres and maybe someone wants to use MariaDB. These are simple choices for deployments, and if it's not possible to make this easy to do between different groups, it would never work out. So very, very thankful to the people who wrote this standard. So you can see here, and it'll become clear when we talk about the command line client, you can kind of specify, okay, I want MariaDB and I want the SMTP config, bundle it all together, roll it out. And I don't want Postgres, or I don't want this, or I don't want that. And this allows people to just expand the config to suit their needs, document it, let people know this is a new feature. So it's quite nice. You can be just hosting some software and then somebody says, oh, great, OpenID Connect now works with this, just load this thing in. And you know, this is really helping us move forward and get things done. Yeah, at the moment, we have a command line client called Abra. And this is the kind of day-to-day interface for how you manage your Co-op Cloud install. So you have a server, you deploy stuff to a server, you're using Abra on the command line. And that was a large decision for us at the time, because we didn't have much money, and we didn't know if we would be able to pull off a web front end.
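As a rough illustration of the compose-file chaining described above, here is a minimal Go sketch of overlaying one configuration on another. It is a simplification under stated assumptions: configurations are modeled as nested maps, nested maps merge key by key, and any other value in the overlay replaces the base value. The real Compose merge rules are more detailed per field, and the function name merge, the image names, and the environment value are all hypothetical.

```go
package main

import "fmt"

// merge overlays override onto base and returns a new map. Nested maps are
// merged key by key; any other value in override replaces the one in base.
// This is a simplification: real Compose merging has extra per-field rules.
func merge(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseMap, baseIsMap := out[k].(map[string]any)
		overMap, overIsMap := v.(map[string]any)
		if baseIsMap && overIsMap {
			out[k] = merge(baseMap, overMap)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical stand-ins for a base recipe file and an optional
	// MariaDB overlay file; the contents are made up for illustration.
	base := map[string]any{
		"services": map[string]any{
			"app": map[string]any{"image": "example/app:1.0"},
		},
	}
	mariadb := map[string]any{
		"services": map[string]any{
			"app": map[string]any{"environment": "DB_TYPE=mysql"},
			"db":  map[string]any{"image": "mariadb:10"},
		},
	}
	fmt.Println(merge(base, mariadb))
}
```

The point of the design is that a group wanting Postgres instead would supply a different overlay without touching the base map, which mirrors how optional files can be added to or left out of a recipe.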
And again, as I mentioned before, we were trying to target specifically: who is the public of this project? Are they people who know and are comfortable on the command line? Yes, okay, let's just go ahead with this and try to make it work. So we wrote the first one in Bash and I still completely recommend it, just unleash your inner Unix. It was great, it got us from zero to hero in a relatively short amount of time with a system that was working. One of the core ideas behind developing our own tools: we were very conscious that we might fail and not be able to get it done. So we wanted the config commons to live separately. These would not be interdependent in a way that if one broke, the other would break. So even today, if you just ignore that Abra exists, you can still drop into the command line, run a bunch of commands and roll out the app, which is great. And that's how we originally conceived it with the Bash system, we're just running a bunch of commands. And we had laid it all out in a way that said, okay, you can just push that out. So yeah, thanks to Kalex, a fellow worker who's great at Bash programming. We wouldn't have done it without them. Yes, so then we rewrote in Go. We actually managed to get some funding, which we'll come to after. I guess they're not Bash programmers; if they had read the source, they might have felt otherwise. But at that point, we were running into issues with the Bash implementation, which we felt quite proud about, that we had gone ahead and found the limits of the simplest possible thing that we could do. And the issues were: it was difficult to install on multiple systems. It relied on a number of built-in commands that were not always available on, you know, a Fedora, Debian or whatever. So yeah, portability.
We were also struggling because we wanted to develop other aspects: we wanted the tools to be able to speak to the config commons but not directly parse each other. And that ended up being a kind of JSON catalog, but then parsing data formats in Bash is difficult. So we were pulling our hair out on that one. And then concurrency: we were struggling to manage horizontal scaling. So if you work with 18 groups, they want to have their own VPSes. If you have 10 servers, the tool has to fire a request to each of the 10 or whatever. And if it's going through them one at a time, you have to wait. And as a result of scaling up and using this absolutely pre-alpha software in Autonomic for production purposes, we reached the limits of the software. So we ended up rewriting it in Go. Somebody knew Go in Autonomic at the time. That's the reason. But also, we saw that we could get this concurrency issue sorted, and the portability. So Go gives you a language-level feature that makes it quite easy to say, fire across these 10 things immediately. And you can build a binary, so you can just ship it out. People just get a binary on the system and the stuff is all baked into it. So the new problem with this is that it actually works. It does what it says and people are starting to use it. So we're now into this kind of maintenance cycle. We've done the public beta. So people are using it and starting to rely on it. And we've seen people hacking on it and submitting pull requests and checking it out. And it's all good. Yeah, so in essence, Abra is a Docker Swarm client. So no, we don't run Kubernetes. We run Docker Swarm and no, it's not dead. Docker Swarm is a technology that is still alive.
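The language-level feature mentioned above for firing across ten servers at once is goroutines. As a rough illustration only, and not Abra's actual code, here is a minimal sketch of that fan-out pattern with a sync.WaitGroup. The checkServer function is a hypothetical stand-in that just sleeps to simulate a network call, and the host names are made up.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// checkServer stands in for whatever per-server request the tool makes.
// Hypothetical: it only sleeps to simulate network latency.
func checkServer(host string) string {
	time.Sleep(50 * time.Millisecond)
	return host + ": ok"
}

// fanOut runs check against every host concurrently and returns the
// results in the same order as hosts.
func fanOut(hosts []string, check func(string) string) []string {
	results := make([]string, len(hosts))
	var wg sync.WaitGroup
	for i, h := range hosts {
		wg.Add(1)
		go func(i int, h string) {
			defer wg.Done()
			results[i] = check(h) // each goroutine writes only its own slot
		}(i, h)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	hosts := []string{"one.example.org", "two.example.org", "three.example.org"}
	start := time.Now()
	for _, line := range fanOut(hosts, checkServer) {
		fmt.Println(line)
	}
	// The waits overlap, so the total is close to one call, not three.
	fmt.Println("elapsed:", time.Since(start).Round(10*time.Millisecond))
}
```

Compared with looping over the servers one at a time, the total wait is roughly that of the slowest server rather than the sum of all of them, which is the waiting problem described for the Bash version.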
We have a strange parasitical relationship with some installs: some banks have a swarm install of, you know, 10,000 nodes or something. And the current owner, I believe it's Mirantis or something, there's been a change of who owns it, I can't remember. But it's still being maintained and we're happy to see that, because we identified that swarm mode in Docker is the appropriate feature set that we need without having to learn how to roll out a large system which is built for large scale. And again, what Autonomic was doing at the time was rolling out single servers, deploying a few apps, for no greater than 10 to 30, 50, 100 users or whatever. And swarm mode provides you with the ability to roll out the app, and if it fails, it'll roll it back for you. And we can bake that into our config. So we were getting the stability guarantees that we needed. And not a lot of groups were demanding, you know, when they had a MediaWiki install, they weren't saying there must be 99% uptime in this contract or not. But we still wanted to push that to the limit, a high quality, stable service when we're rolling stuff out. And then, yeah, swarm mode just covers that for us, the runtime, the container runtime. So yeah, this is kind of the architecture, let's say. So on the left, you install Abra, for example, on your local workstation, the command line tool. And then via SSH you tell Abra to manage this server. So it'll read the SSH config and it'll connect to the server and say, okay, I recognize this server. I see there's a Docker daemon running on this and it's got swarm mode enabled. Cool. And then you can do this horizontal scaling where you can install servers. So you can load multiple servers into Abra.
And then you can be sharing the Abra state between multiple people. So in a co-op of 13 people, each one who runs abra app ls or whatever to list the apps and the servers will see the same state come out. And we can go into that a bit later. So yeah, it's built basically for collaboration in the organization that you're in. And then the other mode is you install Abra directly on the server and it stores the state on the server, which could be useful for specific scenarios; this was requested. Yeah, some people run this, I don't know. Yeah, and then the final points. Moving on, collective infrastructure: yeah, docs, git hosting, recipes.coopcloud.tech, where you can see the list of all the apps we have. We have an Open Collective, for which the fiscal host is Autonomic. Bank account in the UK. And that's been nice because once we got the funding, we were able to tell people, if you work on this, we can pay you just immediately. Anything you do, you'll get paid. And that's great because we know there's a lot of unpaid labor going on in the open source world. So we didn't want to be a project that said, you know, contribute to our commons and get nothing back. No, we could actually pay, so we set an hourly wage there and we said, this is how much money we have and away you go. That worked out great, I would say. And the server farm, servers.coop: I wanted to plug this other project, you can go to the website, servers.coop. Yeah, we've been working with a group who've been developing software called Capsul, Cyberia, great hackers, and they're trying to develop a system. Well, they have developed a system, which is basically a server provider infrastructure like Hetzner, for example. So VPSes, you know, I need a server so roll one out, I need another server so roll one out.
And then we, a couple of other people and myself included, were thinking, okay, this is great because a lot of collectives are centralizing on Hetzner VPSes, right? And if Hetzner puts up the price, and we've already seen this with the increase in cost of IPv4 addresses and the 10% increase this year, it's getting more costly to run on Hetzner. It's always been super cheap, and that's been super accessible. But it could change, so what do we do? We need to build up this aspect of our stack, so we expand the cooperative layer down to the servers, and that is the idea behind servers.coop. And Abra already supports an integration, so you can do, like, server new, and if you've got Capsul running on your server, it'll spin up a VM. So we were already trying to take a turn into cooperatively managed infrastructure and build those integrations from the stack. Whether or not they're working that great remains to be seen. Yes, so the European Cultural Foundation gave us upwards of 30 grand in funding last year and the year before, in the context of the Culture of Solidarity Fund. That was amazing, thanks to them. They were really great, and I would recommend applying to them. They supported us the whole way through. It was, yeah, just fantastic. We wrote our application and sent it in to them, so pretty happy with that one. Great. Nearly done. Yeah, I want to just jump, so this isn't vaporware. Again, I find it so important to explain that people are actually using this, and this isn't just some idea in our heads that we think is cool. People are actually relying on some of the things that have been deployed. So Lumbung.Space is a project which is about 13 apps, I think, on a server somewhere, which was initiated by an artist collective called ruangrupa, which is an Indonesian-based collective, in the context of documenta.
And they wanted to approach a group that connected with their values. So they were invited to documenta, to do the work, and they thought, okay, we're all the way over here in Indonesia, we need to work with people who are based in Europe, so we're going to need digital tools, but we don't want to be immediately going on the Google Drives of the world. So how do we extend our ways of working and our values into the digital realm? So then they came looking for collectives and co-ops, via a friend who was connected to them. Yeah, and here's just some photos of us engaging in this massively multiplayer shared infrastructure project. So it was really great. People, from my perspective, understood what Co-op Cloud was and the mission behind it, and felt invited to look at the technology and what we were doing and comment on that and give their critique. You know, we're often in a space where people want to talk about digital tools, but the first thing they say is, oh, I don't know anything about technology, which is kind of a hallmark. You have to excuse yourself or something. But we got past that in working with this group of people who may not consider themselves technical, and as we moved on, again, Co-op Cloud allowed us to just deploy the tech, forget about it and then go on with the support work. And the last great thing I saw from this project was that the people involved in using the tools were publishing videos about how to use the tools in the stack. So they were in the Matrix chat and on PeerTube publishing: here's how you use PeerTube. Amazing. Great educational practice. Totally check it out. TV.Lumbung.Space is the website. Yeah. This was another project we met in the context of documenta. So this is a comic group. Aircraft is a comic illustrators' union, which is bootstrapping at the moment.
And they saw Lumbung.Space and they were like, this looks cool. And I like the idea, I understand what is happening here. There are two groups cooperating here, this cultural initiative and this technology collective. And I can see the Co-op Cloud website and this looks like something we want to check out. So we got over that fear and anxiety and moved through the money exchange and, you know, let's work together. It was quite smooth. So I would say Co-op Cloud is helping us put these things to rest: this is the project we're based on, when we deploy your infrastructure, it will be contributing to the commons, copyleft, democratically managed, and you just get over the hump. And yeah, Kotec is a new co-op which is set in Poland and that was a major boost. So they've deployed a bunch of services. Some members are in the room. Super nice. Yeah, it's a software stack which is initiated and developed by cooperatives. So it should suit other cooperatives. Of course, if you want to start a new technology cooperative, how do you start? It's overwhelming. But now you've got this off-the-shelf project and it's like, get going. And once you enter it, you just see all the other collectives and you're like, great. We can learn. We can share. And again, it expands beyond the technical, so we have the infrastructure for payment and bank accounts and all this stuff. So people can really get moving fast, which I think is good. And we want people to start technology cooperatives. Yes, in terms of metrics: 21 collectives. Somebody counted them. Maybe it was me at some point. But there's a lot of groups involved in this. If you go on the website, there's a list of them in the blog post. We've got 160 plus recipes. This is a lot of open source apps.
You can probably find what you need in there. And yeah, we're running 146 apps. I think I ran abra app ls at some point. So, I don't know, we're heavily invested in this. And there are other collectives which are running the stack. Yeah, and maybe just coming to the end. It's been a lot of details. I'm even overwhelmed myself. The last few slides, a kind of philosophy take: we wanted to be another project in the ecosystem and not become the project in relation to decentralization, let's say. We really position ourselves against this big tech discourse and what's happening. And we thought we could contribute to internet and digital infrastructure decentralization by proposing Co-op Cloud in its form. And we're only one project. So I thought it would be good to plug some other great projects which I just find super inspiring. I would say YunoHost is maybe one of the gold standards of community organized infrastructure hosting. It has a different set of priorities and goals. Everyone can be a system administrator, that's kind of their goal. Like, let's make information available to people. Anyone can do this, get going. Brilliant project. Whereas we're going for specific groups, co-ops. You're already in the game. How can we make it easier to keep going? So it's a bit of a different layer. But absolutely recommend YunoHost. Nubo, based in Brussels. Check them out. CHATONS, great network. CoTech is a network of cooperatives based in the U.K. I think it's like 35 cooperatives. Check them out if you're looking for a job, get stuck in. Social.coop, trying to build cooperatively managed Mastodon instances. Local-IT, great collective, already involved in Co-op Cloud at the moment. Small Technology Foundation. Always find them inspiring. Check that out. Small tech. Just plugging them. Yeah. Oh, five minutes. Cool.
The roadmap is, yeah, as I said, building the federation. So now is a great time to join. Again, as an individual, a collective, a co-op, please get involved if you want to. We're trying to find more money, of course. One of the goals of the federation is to achieve financial sustainability. So the co-ops or groups that join the project are going to have to decide, all right, how do we fund development of the tool? And can we pay for hours around finance, admin, cash, all this end-to-end stuff? Yeah, Kadabra is a new effort to have a server-side component, and this is the thing that Cloudron did amazingly well, which is just auto-updating the apps. So we're trying to replicate that with a server-side component which understands, oh, someone who takes care of that recipe has uploaded a new version, I'm going to roll it out. And yeah, web interface, I forgot to add "maybe", because it's still under discussion if we need that or not. Yes, getting to the end, I could do a demo, a chaos demo. Yeah, okay, maybe I'll just do a chaos demo. So I wanted to just show you the command line client. So again, to contextualize, this is the tool that people who maintain the service will be running on a daily basis. So it's supposed to make the job easy. And I won't go through all of it at the moment. But we try to make an effort to explain the concepts that are involved in the project and what you can do. You can list all the recipes that are available from the command line, for example. And yeah, you can also do operations on recipes here. So you can attempt to upgrade. Oh, I probably don't have internet. Yeah, right. Okay, well, anyway, you can do the maintenance commands on the spot.
So again, you're on your local workstation, you realize that there's a new version of Nextcloud coming out. Okay, let's get that upgraded. And then you can run the commands here. And it'll basically operate on this directory of recipes, where you can see a selection of the apps. And then, you know, this is just what I showed earlier, the recipe repository. So this is the configuration that specifies how to deploy this app. Yeah, and there's some other details that I didn't really go into. But basically, one of the nice things that we've wired up is that if you deploy this app called Traefik, then when you roll out Nextcloud, it automatically configures the Let's Encrypt stuff. So you just don't have to deal with this HTTPS issue. It's already in the config, you just say, give me a thing. Yeah, I don't know, was there much more? Oh, yeah, I guess. Yeah, and then there's this command and control interface where you can see, okay, what apps have I got on what servers and what do I need to do, and you can filter it by server or whatever. I won't type it out now. But yeah, maybe I'll call it a day. Thank you for listening to me talk for so long. Thank you. |
The Importance of Collaborative Applications for European Digital Sovereignty
Progress and challenges of alternatives facing the BigTechs |
Okay, so the goal of this talk is to talk about the progress and challenges of alternatives facing the big techs. So I want to first explain a bit why apps are key, why end user apps are important, and also where we are, what we are doing with these apps, do they exist? So first, who am I? I'm Ludovic Dubost, I'm the creator of XWiki, so I created the XWiki software. Does anybody know XWiki here? Oh cool, that's already nice. And we also created a second software in my company, which is CryptPad. Who knows CryptPad? Cool, that's pretty nice. And so we've been doing collaboration software for 20 years now, like 19 years, and we've been competing with software from the big techs, and it's not an easy thing to do, you get lots of challenges in order to actually build that software and get to the level where you can compete. And I've done this at the same time as I created the company, so it was a business goal to be able to live from building these applications as open source. We have 50 employees, we're a remote-first company, and we have employees in France and Romania and now also in Germany, and actually we're hiring. We are currently having more success, and being able to hire doesn't happen every year, so if you're interested to see how it works inside a company that has a goal of doing open source, of building open source applications, not only using them or installing them, but really creating them, come see what we do.
So as a company, we're also a member of our ecosystem. We're a member of associations in France, for example Systematic, OW2, the CNLL, the Conseil National du Logiciel Libre. We're not a member of OSBA in Germany, but we actually want to be, it's planned. We're also a member of, yeah, APELL, so APELL is actually a European entity over CNLL and OSBA. We're trying to be active in order to promote the type of things that our companies do, and I'm a strong believer that European companies should work together more than they do, and I know how difficult it is because we're all very focused on what we do every day, but collaboration is actually super important. So first, one thing is, in 1995 technology was more like you buy machines, you buy or you build software, you install that software, and not all activities in our lives were actually dependent on software in 1995. It was not as important, but in 28 years, a lot of things have changed. Now you're renting cloud servers, you could even run your service just on the cloud without worrying about buying a machine, just take the service, but you could also rent cloud servers, you could build your own software and run it. But also, there is almost no business that's not dependent on software. It's starting to be really hard to find anything that won't use any software. Maybe some independent farmers might not need any software, though even they might still use some to be more efficient in their job, but you have a lot of activities that are highly dependent on software for everyday work, and it can be a major competitive differentiator to use software in your business. We also see that there is a business imbalance. These are numbers for which I didn't put the source, I should have: basically the GDP of the tech sector in the U.S.
is 10%, and there is data from Europe, from European websites, where you see GDP is 4%, so it's not easy to know if they're talking about exactly the same thing. The scoping of what is the tech sector is not always easy to match from one country to another, but we intuitively know that there is much more tech in the U.S., and that makes a huge difference. And if we look at the problem from the sovereignty point of view, what I try to do here is to look at the different levels of what we use in technology, what the situation is, and how Europe competes from that point of view. But there are some differences also in the way it works. For example, you might not know how to make a hardware component, you might not have the people that can do it, but once you've bought it, the hardware component is yours, and that is significantly different from the top, where you might not know how to do the cloud service, but you also don't own it, you don't have it, you use it remotely. Now, it's true, even in hardware, you're seeing evolutions where the way software works is starting to happen on the hardware level. You can turn on a feature in a Tesla remotely, and you can turn it off also remotely, so we might have situations where we don't own the software that is inside a hardware machine. You have a lot of hardware that contains software, even a chip can contain software. Whether or not they can turn off a feature remotely from the internet is another story, but software is something that you can turn on and off remotely, and that's something very important. Now, we're not actually looking that good all over the place. We know in Europe how to make some semiconductors, for example, ARM is a European company, it was almost acquired by NVIDIA, but it didn't happen. It's a UK company, so we still own a semiconductor company in Europe.
We have some actors for hardware, but most of it is manufactured in Asia, so that's something important, and we know that with COVID, when we wanted masks in Europe, we didn't get them as fast as we needed them, so being able to manufacture our technology can be important. So now, cloud services: we do have European actors, and the good thing is that we have standardized open source tools in cloud services, so Linux, things like that, Kubernetes, Docker, these are open source software. We don't always have the competences in Europe, we are not the ones that build most of them, but we do have access to the source, and we have actors, and it's possible to run VMs all over Europe, that's not a big problem. Now when you go to PaaS, platform as a service, you have a major dominance from the big tech actors. In France, there's been a complaint about the fact that the Health Data Hub, which is a national service doing statistics on health data, was run on Microsoft cloud, and basically a lot of people said, what, you're running health data of all the people in France on an American system, how is that protecting our data, it's not compliant with GDPR. And basically the answer of the people that had made these decisions was that there are not enough good APIs on the French cloud services to run it on a French cloud service. Obviously everybody said no, there is enough, maybe you're going to miss a little bit of stuff, maybe it's not going to be convenient for you, but it's possible. But basically the argumentation was that it was not good enough. The ministry in France basically bashed the French companies for not being good enough, and the answer of the ecosystem, of course, was: what are you waiting for to give money to the French companies to allow them to be good enough, what are you doing, instead of putting that type of service on foreign systems? So it is possible to develop the technology, a lot of it is open source,
but companies like Amazon Web Services, Google Cloud or Microsoft are very far ahead in terms of the number of services that they have. When I looked, as a company, at what technologies are available to deploy XWiki software and CryptPad software in an app store on cloud providers, I mean you have different technologies on all of them, but they all have a lot of options for how you would sell your software on their app store. And when you look at what European companies have, they don't have systems to automatically deploy applications and sell them and put a price on them so that people would start buying them, so the others are really ahead. And when you go to SaaS you have thousands of services all around the world, startups, and we do have national actors, and a lot of these actors are not open source. So you have a lot of SaaS services and a lot are not open source, and even the ones provided by Microsoft, Google and Amazon are not open source, or they can be based on open source but they are not themselves open source. That means you cannot decide to self-host them and take control of them, and that's something that we'll see is important. Now in the collaboration space, when you ask the big companies what they need, they don't need one collaboration software, they need a whole stack. It was interesting in the previous presentation about the co-op, it was about deploying multiple apps, and actually bigger companies want all these software, they need multiple software and they need them to work well together. So they need authentication, user management, they need of course email, chat, video conferencing, file storage, they want to edit documents, they want collaboration tools and then they even want to build custom apps. So they want a lot of different software and they want all these to be simple and easy, and when you start using apps, you become very quickly locked in, even when the app is open source by the way, when you
put your data in a system, the way you put that data in quickly locks you in, because it's complicated to get the data out and to adapt the way you work to other apps, even if the app is standard. In 20 years of XWiki, I've heard so many people say how used they are to the fact that the button in Microsoft Office version X something is here on the screen, and you give them another tool where the button is on the right and they say it's difficult to use. It's difficult to use because the button is on the right? No, they don't even know why it's difficult to use, but it feels difficult to use to them, because they're so used to the way they worked before that it's very difficult to change. And we have been trying to move people from Office to wikis, so imagine how they tell you that it's difficult to use. No, it's not that difficult to use, you need to adapt your way of working, you need to learn a new tool, it just takes time to adapt, you need to try to think how it works and adapt your way of working. So the lock-in is very high, and on SaaS software, the lock-in is actually immense. I worked at Netscape in 1995, and we were talking about the lock-in of Microsoft on software when it was on-premise software, and it was like, oh, Microsoft controls your software, you're blocked on their software. But at least the software was in your company, and you could put a firewall around it and potentially no data would go out. Now it's on the cloud. Back then they would tell you it's going to be double the price to upgrade, but you didn't have to upgrade, you could decide to stay on the older version for a bit longer, and you didn't have to pay every year, you paid for licenses that lasted a long time. Now you're paying every year, and if you don't pay on the first of January for your renewal, or every month, they turn off the software immediately. So you're actually in an immense lock-in, the lock-in is actually way higher
than it was in 1995 when a lot of the industry was complaining about that lock-in. And the data is controlled by the provider, and there's a huge lack of standards on the structure of the data. Your basic formats are standardized, but in collaborative apps, even in wikis, I have to recognize that the XWiki software formats are not that standardized. We tried at some point to work with other wiki providers on common syntaxes, but it never happened. Our syntax was based on something called WikiCreole, which was discussed 15 years ago, but now everybody is using Markdown. So there was something that a lot of wiki providers discussed and said should be the common standard, but it's a completely different one that a lot of people now say is great. Not many people discussed the Markdown standard, people just liked it, and it became very popular. So standards are very difficult to have, and even when they exist, they're not necessarily adopted, and if they're not adopted, they're not very useful for moving data from one system to another. The complexity of the software makes it very difficult to switch, and one key aspect of the software industry is the winner-takes-all approach. This is basically why VCs love the tech industry: they put in a lot of money, they try to win, and they like the fact that when you win, everybody else is dead. The winner makes 95% of the revenue, all the other people fight for the 5% that's left, and they love that because that's actually their goal, trying to be the winner. What's interesting is that they don't necessarily win that much, because they create 10 companies or 100 companies and only one makes it to being the winner, and so in the end they don't win that much money because they drop a lot of money on the way, but this is basically the goal of what they're trying to do. And one of the problems of the
locking also is unfair business practices, you have a lot of unfair business practices because the providers are so powerful that it's very tempting to use that power to make more money, so if you ask a business lead to make 10% more revenue next year, if there is no difficulty to increase the price because everybody's locked they will just increase the price 10%, that's much easier than work hard to try to get new customers, and so in order to keep that approach running, they also try to, another approach is to, okay let's make other people buy my second software, I have a first software, if I can force the other people, if I can encourage them to use it, I give for free the second software, then at some point they become used to it, it's free, and at some point I say now it's not free anymore, Microsoft Teams is typically that approach, and Slack has actually complained about it, they actually sued about it because Teams was free, Office 365 is very expensive, they just gave Teams to everybody that has Office 365, and at some point they say now you have to pay, and all the others are dead in the meantime, so you have a huge amount of risks, of lack of sovereignty associated to this, so if you can stop all the software remotely, if we disagree with the US government, knowing that all these companies are US based, if we disagree with the US government, they could turn off the software, and even worse than that by the way, is that they don't even need to turn the software, because they just need to remind our government that they can turn off the software, and then the other government will start being, want less to disagree with the American government, because they know it's possible, or in negotiations it could be vaguely reminded that it's possible, and that's actually a big problem, and our ministry in France of economy and finance said no political sovereignty without digital sovereignty, I'm not sure what they actually do to try to avoid that situation, but 
they actually acknowledge it that there was a problem, that at some point you cannot state your political opinions if you're dependent so much on other people that have a different opinion than you. And at this point, I'm not sure how many businesses would still function if you shut down their collaboration tools, or if you shut down Microsoft today, what's the impact? It's a bit difficult to analyze, but how long would it take for the companies to start being able to do work again? And so from my point of view, that makes really collaboration apps very important, because we don't have the control on these apps, and it's the primary set of tools that people use, actually I didn't ask that question, in your company, which ones are using Microsoft or Google? And so it's the primary tool that people are using, and they get really used to it. And it's also the primary authentication system that people use. Most companies will put Azure AD, LDAP, Active Directory, not LDAP, Active Directory, as well as the authentication system. And basically, they end up also being almost ready to deploy servers. So when you have Azure AD, you have an Azure account, you're one click away from deploying servers on the Microsoft system. So people buy Office 365, they tell people, you need to connect Office 365 to Azure AD to manage your users, and then you're one click away to deploy servers. And a lot of users like simplicity. If it's simple to click, they start using the other servers, and they do a lot of work to make all this work as easy as possible together. Do they do a good job is another question, but they have the possibility to make it easy. We might be happy that they're not always very good at it. And so that gives us some opportunities to provide other solutions. And so it's the entry point of all software. And the biggest problem from my point of view has been the bundle issue, is that a lot of software becomes bundled as a package. 
And so when you go to Microsoft's website to buy, and you say, I need Office, you will be offered the whole Microsoft suite, not only Office 365, and you will have this software plus this software plus this software. And you'll see that the difference in pricing between the packages is not always very high. I'm not even sure that Office alone, only Office 365, exists as an offer. You will get things for free, even what you don't need. This was actually the problem of computers bundled with Windows, which was bundled with Internet Explorer. I said I worked at Netscape, and I can remember how it killed Netscape's business around the browser when Microsoft bundled Internet Explorer in Windows. Who would buy a browser if you have one for free with Windows when you get your computer? And at the time, you were paying for Windows. Well, in the meantime, a lot of things have changed, because now browsers have also become free, because you can make money through advertisement. But that's actually potentially a problem, because thanks to that we also got tracking. So we get browsers for free, but we get tracking, and that's also a sovereignty problem. So now they bundle even more. They bundle apps, but they also bundle Office 365 with the cloud, with the PaaS, with the IaaS. Microsoft has a whole offer: servers, IaaS, PaaS, and apps. When you go see a client and say, ah, I have a wiki, would you like my wiki? Well, sorry, I already have it for free in the whole offer. And I think I even had a few cases like that: for a department that's part of a big company, if they buy another software, they pay for it. If they use Microsoft software, the central entity pays for it, and the cost is divided, and they pay the same share of it whether they use it or not. So basically, you come as a provider, and you have a cost, and the Microsoft software you're competing with has no cost. It's free.
But it has a cost in reality, because some people will start using some of the software, and when they renew the license with Microsoft three years later, Microsoft comes and counts everything that is being used, and they set the new bill. And so the company will end up paying for the fact that they started using that software. And so it's very difficult to compete. Actually, Nextcloud has sued on this, I think; it's antitrust.nextcloud.com, I think, I have the link a bit later. Because it's a real antitrust issue: in the end, you cannot compete with that type of situation. And so some people say, and this is especially true in France, in our government, we need to build better software companies. So the problem is not the way the competition works, or the way these companies operate and so on. That's not the problem. The problem is we don't have a French one, or we don't have a European one. And so they say, let's build unicorns and make companies. But in reality, when you look at unicorns, first, if they actually manage to get very big, they will behave exactly the same way. And it's not going to help the people and the small companies. It's going to be a bit less of a sovereignty issue, because maybe our government will have a little more control over them. But we are not going to get our own sovereignty, because we're going to be very dependent on this big company. The other problem is that most of these unicorns are made with funds from VCs. Most of these VCs are actually not even European, and VCs want to make money, so they want to sell the company at some point. And it's much easier to sell it to the ones that have the most money, which are the current American companies.
So, in the end, there is a very big chance that all these unicorns will either die, because they don't work well enough, or they're going to be bought by the same companies that are currently leading the market. And another thing, yeah, and that's actually, they usually use the American cloud providers for their services, and so these cloud providers see everything they do. So when something works well, they can replicate it. This is actually a complaint against Amazon, even in the US, that small businesses that sell on Amazon are showing to Amazon everything that works. And surprisingly, at some point, you get products that pop up made by Amazon instead, and they take the market. And so they're supposed to not look at the statistics, but we don't know what really happens. And now, how can open source help to solve these type of problems? So one thing I believe is very important is that if you try to build local companies, French ones, German ones, UK ones, even though UK is not in Europe anymore, but still in Europe, physically. But if you try to build your own local companies, and you all do it at the same time, at some point, you'll have a problem. And because suppose one of them buys the other one, you're going to have one very big French company or one very big German company, and maybe, I don't know, a Swedish one. And the problem is that this won't be sovereignty for the countries that don't own these companies. So, you might have a bit of sovereignty in France, but no sovereignty in Poland. And so if you try then to tell, so let's invest all, let's start using the software from this European company, like in Poland, they're not very interested to use the software from a French one, and that doesn't give them that much sovereignty. So an open source works very differently. If we invest in open source in France, it can be used in Germany, and it can give sovereignty to Germany because Germany can control the software. 
German companies can build competence on top of that software. And that's true in Poland, and that's true in Romania, et cetera. So if we all in different countries co-invest in open source, we can all get more sovereignty out of it. And we also can get individual sovereignty out of it because any individual can then make use of it. So I'm sure I don't have to convince that much about open source here, but it's important to remind these arguments which can be useful if you meet people in your own countries. So also open source is usually a bit more open to standards, and if the provider doesn't want to adapt this software to standards, then somebody else can do it. You have this possibility. So open source will allow to adapt software to standards when providers don't want to do it because it's not in their economical interest. It's very hard. Like if you look at software, you very often have a lot of import tools. You very rarely have export tools. But anybody can build the export tools on open source software. And one aspect that is also interesting is that we can reuse also US open source software. Like we need, but we need, however, to build local competence. If we use massively very large US based software, but we're not building local competence on it, we will risk having a lack of sovereignty. Because we don't have the people that can take over. So we can reuse open source software from anywhere in the world, but it's also important to invest in the competence. So if we start building massive solutions, and not only competence about how to use them, but competence how they're built, how they're made inside the software, it's very easy to use open source. It's much more difficult to improve an open source software. Like you need to build competence on the software itself. So now I want to look a bit at what actually exists. Because we're looking, and I've been maybe a bit looking pessimistic because it's hard, it's tough. 
But actually, what's interesting is that we have tons of great open source software in Europe. And I'm sure I missed a lot, because I spend most of my time on our own software, but I put together a list by the types of software that companies are looking for for collaboration. So for example, in video conferencing, we have software like Jitsi, which was created in France. Actually, the core developers are in the US now, but it was created in Strasbourg. BigBlueButton is Canadian; it's open source, not developed in Europe. Nextcloud Talk is developed in Europe. OnlyOffice is developed in Latvia, if I'm correct. There are also issues around OnlyOffice, because it's linked to Russia, so we also need to have more control over its development. On collaboration, Nextcloud, XWiki, CryptPad. On project management, projects like Tuleap, OpenProject. We have a lot of software on authentication, software like LemonLDAP, Univention. On email, we have Open-Xchange, we have SOGo, BlueMind. And on file management, software like Nextcloud, Pydio, CryptPad. And on chat, we have great solutions like Matrix. So we actually have all the bricks existing. Now, they're all provided by different companies, different providers, and they're not as integrated as the other solutions will be. I'll come back to that, the integration aspect. Another aspect is that building an open source business actually works. All the software I mentioned do have core teams. Most of the software have a core team that is working for at least one company that has regular revenue out of this software, and they're able to fund the roadmap and have regular development on top of this software.
So it's possible, and I can attest to it. With my company, we didn't raise any money, and for 19 years we've been able to pay the people in the team and grow, and we've even been able to create a second software out of the work that we were doing on the first one. So it's possible to fund the company. Now, at the beginning it's not easy. In the first years you need to build critical mass, and sometimes you have an issue of competition, so you're not necessarily finding enough money to build it quickly to the level where it can compete. But it's possible to find business, and it progresses: the better your software is, the more companies are interested in it. When it's good enough, the companies get interested in it. So you have moments where companies will just go to the other provider, the winner takes it all, so you get very little market share, but then at some point they get more interested and you get more prospects. So how does open source business work? Clients do buy support, but you also need to have the price right. If you make it too expensive then clients might just go away, but if you set the right price you get clients that buy support. You also have clients that sponsor development. But you have issues of dumping and bundling that slow it down: you cannot sell to everybody because you have competitors and their software is given for free or cheaper. And XWiki knows it very well, because with XWiki we've been competing with Confluence. Does anybody have Confluence in their company? So you can try to convince your companies to switch to XWiki; we have importers. And Atlassian has kept increasing the price over time, and they actually changed their strategy a lot. They had a pretty low price for small businesses, and now they say, small businesses, you have to go on the cloud. We stop doing the small-price software, even though it makes money.
If you look at the finances of Atlassian, it does make money, so it's a conscious choice to make more money. It's not something that doesn't work, it's something that works. But now that they've done that, we have a lot of calls and we get many more clients, even though we never got the money of these clients to build our own software. So we've been able to build the software with the money of other clients, and now we're having more clients coming. I think it's a clear proof that the dumping and bundling approaches are slowing down open source and are unfair practices. There is also some good news, for example European initiatives like NGI. At XWiki we've been able to benefit from that, on CryptPad in particular, because CryptPad doesn't have that many clients. It has almost no enterprise clients, it's just starting to have enterprise clients, and it's been mostly funded with research money. We do have some clients, we have donations and subscriptions on our instance, but it would not be enough to pay for that software, which requires a lot of work. So NGI is providing funding for companies, and that's really a great thing. The EU also started to pay for bug bounties on open source software; we've been able to benefit from it also, and that's been great. Gaia-X is interesting, but there are also issues with it from my point of view. It's not easy to see how small providers like us that are doing apps can work together with the big cloud providers and big companies that are currently running Gaia-X. We're also not seeing open source being put forward that much in the Gaia-X project. So there are initiatives in Europe to help, and we've seen that it's becoming more and more of a subject. That's interesting.
In France, the French government decided to put in some funding. On one side they're opening the door to American software. They also reacted to the fact that, from the GDPR perspective, there is a huge problem with the CLOUD Act; this is the Max Schrems judgment. Basically they're recognizing that it's illegal to host data on American companies' systems. But it seems to be illegal, and we don't really know if they consider it illegal for everybody. Is it illegal only for the public service? Is it illegal for all the companies? It's unclear. But they started to say, okay, for example, the national education should not deploy Microsoft software anymore. On the other side, they're proposing that French companies are going to build clouds to deploy Microsoft and Google software on them and host it instead of Microsoft and Google, and that will be okay. So they will control the hosting, but they won't control the software. On the other side they say, okay, let's fund alternatives a little bit, so they decided to put in 50 million, but they didn't put it on open source. We are partners in some of the projects, to try to get funding for XWiki and CryptPad and to try to also build additional solutions. So you have a bit of a mixed message: on one side they like open source, on the other side they don't promote only open source, so it's not always clear.
So you have mixed messages. There is better news in Germany, where you have specific funding for open source platforms with the Sovereign Workplace project. Actually at XWiki we have had the chance to be called in, and that is great news for us, and you have a bunch of great open source software that are part of that project. You also have additional initiatives like the Sovereign Tech Fund, and there is a specific entity called the Zentrum für Digitale Souveränität, if my German is good, and you have the openCode project. So a bit better news, but at the same time it's never completely won. Inside governments you have some people who are pro and some people that will continue to push for, it's not so important, let's use American software. So the challenges, I've talked about them: the antitrust one is clearly bundling and unfair practices and trying to kill competition with aggressive pricing. I've seen cloud develop, and the typical cloud business model is, let's give things for free and then increase the price. Let's get the users in, and when I'm sufficiently known, let's change the price. You could say open source is the same thing: let's make it open source, let's give it for free, and at some point let's change the price or let's stop being open source. Some companies do it. There is a big difference, however, which is that what you put out as open source you can never take away. So maybe you can change your price, stop doing open source, go your own way and turn your back on the people that helped you. Because you have a lot of people that help you when you do open source: you give them a lot, but they also give you a lot. I can say that I have a lot of people that gave us things, gave us contributions, made the software known.
XWiki or CryptPad wouldn't be what they are without all the people making them known, so we have a lot of people helping. So you can do that, but you can never take away the code, and that's a very, very important difference versus the strategy of the cloud providers that give things for free and increase the price later. This is also why, on my side, I'm a big proponent of AGPL with no CLA, so not retaining the copyright of everything that people give you, because this shows that you will not turn your back on open source, or at least you would have to give back all the code and redo what people gave you. For CryptPad it's AGPL. XWiki is LGPL, and we don't want to change it because this is what we did, but CryptPad is AGPL, and we have never taken any CLA on contributions, and we believe we cannot walk away from what we're doing. There is another challenge, which is open source financing. The initial financing of an open source project is very difficult, it's something that is tough. One thing that is very important is that a lot of people, when they build open source software and try to live from it, question the open source business model: yeah, but somebody is going to steal my software, or somebody is going to compete with me, et cetera. And you always have difficulties, but it's very important to understand that sometimes the main difficulty is not the business model; the main difficulty is making good enough software. In many cases when you tend to think that you should be selling it instead of giving it away as open source, well, sometimes that's not the problem. The problem is that the software is not good enough. You need to be better, the competition is too good, and you need to work more on your software. We had phases where it was a bit difficult at XWiki, and what we did was work on the software, and things improved. So you have to look at that.
We also decided at XWiki to do extensions that are open source but paying, and I usually have a bit of difficulty explaining that. I even had some people ask, but can I take it, can I reuse it for free? I said, yeah, take the source, rebuild it, remove the license check and you can use it. It's open source. They say, so if I cannot use it for free, that's not open source. But it's about the source, it's not about whether or not it's easy for you to use it. So we decided to have some extensions of XWiki which are open source, but either you go find the source and build it yourself and install it yourself, or you use our app store and in one click you can use it, pay for it and buy it. And it's been very useful for us, because it also allows us to explain that concept to companies: the important part is the fact that it's open source, not that it's free, and that's what is important in what we do. And it actually helped improve the relationship with some of our clients, which tend naturally to say, okay, if I don't have to pay, I don't pay, I don't help the project. So you also have to find that balance between the commercial offers and the freedom. Sometimes when you build open source you are too idealistic, in a way that, okay, everything should be free. But the problem is that when everything is free, you have a lot of people that don't understand that they actually should help the project, because this project will not survive without funding. You need money to survive, to make the software; it's necessary. Another aspect is that it's actually very difficult to make partnerships with service and cloud companies. Right now XWiki has not been able to do good partnerships with other service companies to distribute our software. It's not simple, and I'm not really sure how to do partnerships with cloud companies.
Cloud companies can be very tempted to just reuse all the open source software available and sell it and not fund in any way the software that are in these offers and we need a model for that so if we want to separate the work of cloud service and of software we need to find a model to how to sell open source software when it's sold to companies otherwise the software will not be able to fund themselves so Matrix for example has made a blog post about the fact that they have major success in terms of usage and deployments but many of these deployments are made by other companies that are making money off of it and are not giving back anything, not even developers to work on the project, neither funding, neither developers and in the end they had to fire people and so in order to, I mean if you want to do good software you need great people, you need great developers and so you need to find a funding model that works. One aspect for open source company is to work with less cash so I don't believe in VC investment because I think that doesn't allow to keep the independence of what we build but you have to work with less cash so even when you're growing it's difficult to hire because you need a lot of cash to run the company itself like to advance the money and you are taking risks and it's not a simple problem. Another problem is integration and fragmentation so clients want things that are integrated that's actually also one of the reasons you start to see packages that bundle the software so trying to make them all installable together easily that's actually something that's important and cloud providers providing multiple software together. This is very interesting but there is a lot of integration work that is necessary and sometimes that work is not brought back to the project and we need to make that happen. 
There's also a lot of fragmentation in the EU market. You need to go to multiple countries to sell, and it takes a lot of time, so we also need partnerships between the companies in the different countries to work together in order to make that work. I'm very happy to have a partnership with Univention, for example, because for us it's a major gain of time to get into the German market. If a German company says, look at this French company, it changes a lot the way the German market will see a French company, and that is going to be true the other way. So we need to work more together. So these are a bit the next steps I have. First, we need to believe in ourselves. We have great offers, and the companies that are building open source software are able to make it. The difficulty they sometimes have is about competition, about being considered good enough so that a lot of companies invest in them. There is a huge amount of money spent on collaboration. I think we count billions of euros that are given to Microsoft and Google, and if we sum all the open source companies we are talking about, we are talking about 50, 100 million in revenue, something like that. I don't know exactly, because it's not always easy to get the numbers. But we're spending billions on collaboration software from American companies, so there is a lot of money. It's just that we need to convince the clients that we're good enough as a group, not only as individual companies. And open source provides good openness and flexibility, and that's useful to clients, so there are also good technical arguments for open source when it comes to companies. We have had a lot of clients that were very interested in the flexibility of our software, but they also want an easy-to-go solution, one provider to buy from, and that's a real difficulty. We need to work more on European collaborations. I talked about it. At XWiki this is something we're trying to do more, so the Univention partnership we did last year, and we're also
looking at partnerships with Nextcloud and with OpenProject. There are multiple companies in Europe that think the same way; they just lack a lot of time to actually make it happen. So it's going to take time, but there are many companies that want to do that, and the Sovereign Workplace project is really great to also create this integrated offer and bring the providers together. Now, we will need to find ways to do partnerships with cloud providers; I've yet to see how this will happen. And we need more open standards, that's another aspect that's transversal. I've also been a believer that we need an open source marketplace, so a way for cloud providers to sell open source software in one easy way. I hoped that Gaia-X would be a place for that; I haven't seen that happening. And the most important thing we need is clients. Any clients that are brought to the companies that are providing this alternate platform are helping these companies make the software a little bit better, and it makes a difference. So everything that you can do as an individual to tell your boss, your employers, that maybe they should look at what these companies are doing, that other companies are actually deploying the software, is helping. So if you can help us do it and make it known. And that's it. If there are any questions, I think there's two minutes left. Did everybody fall asleep? One. Hi, you mentioned earlier in your talk that the French government deemed it illegal to host data on American companies because of the US CLOUD Act. Could you expand on the CLOUD Act a bit more? Yeah, so the CLOUD Act is the fact that the US government is able to access data from American companies and even from American people. First, I'm not a lawyer, okay, so don't take everything I say for granted. It allows access to data even if that data is in a foreign country. And I understood from a paper in Holland by lawyers
that they could even secretly ask an American person to give the data that this person can access, and they cannot refuse, basically; they risk a lot. So Microsoft, if asked, would secretly have to give data that they have on servers in Ireland, and that is not compliant with GDPR. That's what the Max Schrems judgment said. So it's been written, but then the problem is that it needs to be recognized, and decisions need to be made to stop doing it, and it seems that it takes time, and it seems like everybody's waiting for Europe to sign another agreement to say that it's okay, so that you don't have to make the decision to stop hosting on American systems. Please. Hi, thanks for the presentation. So you were saying that open source collaborative software would help solve the issue, or improve digital sovereignty. If we suppose that Microsoft tomorrow makes the Office source code available on GitHub, let's say, how would that change the actual state of digital sovereignty in the EU? Well, it will change it if people start to take that software and host it and verify that it solves the business needs. But yes, it would improve it. For example, Microsoft does not give the source code of Office, so if you want to do Office editing you have to work with software that has been built by other people, not by Microsoft, and Microsoft and Google have better software, more compatible with Office formats, and so on. So it would improve it. It would still need to be packaged, there would still be a lot of work, but it would improve the alternative offers. But about digital sovereignty, the thing is, if tomorrow we have a political problem, they can turn off the software, so all our companies will have problems. The question is, okay, do we have alternatives to replace that software? At this point they're not necessarily as good, and they are not as good because we're not
buying these alternatives, so we cannot fund them. So the distance between the American solutions and the European solutions is increasing instead of being reduced. And so it's all about whether or not we would be able to replace that software. If Microsoft made it open source, it would be easier to replace their own services, their online services. Sorry, I want a bit of a follow-up to the first question, basically. So, regarding the US CLOUD Act: as far as I recall, Privacy Shield was the first one that was taken down by the European court, and I'm not sure if the US CLOUD Act somehow replaces it or not. No, no. The CLOUD Act is an American law. The CLOUD Act is what makes it a problem, what makes Privacy Shield illegal. Basically, Privacy Shield said it's okay to put data in the US, and the judgment obtained by Max Schrems (go see his website, he explains the problem much more precisely) says that this is not true. Privacy Shield says it's okay, but it's not okay because of the CLOUD Act and the FISA warrants, the laws in the US that allow access to data even if it's hosted in Europe. Yeah, and the European court accepted that Privacy Shield is not valid, a couple of years ago, I believe. Right now the Commission is negotiating with Biden some new paper that says it would be okay, but it's already kind of established that this paper is not okay. But there will need to be another judgment to say that it's not okay. So it's kind of a cat-and-mouse game, because some people in Europe don't want it to be illegal to use American systems, and Biden doesn't want to change the CLOUD Act. Thank you. |
The role of Open Infrastructure in digital sovereignty |
Only one minute late. It's been a long day, so bear with me. So hello everyone, thanks for joining this talk so late in the day. My name is Thierry Carrez. I work for the Open Infrastructure Foundation, which is the home of the OpenStack project, Kata Containers, and a few others. From the accent, you can probably tell I'm based in France, so I'm very aware of the questions around digital sovereignty, and I wanted to use this talk to give you a sense of why, from our perspective, digital sovereignty really matters, and how open infrastructure can help in that area. But first, what do we mean by digital sovereignty? If you've been in this room for the whole day, I'm pretty sure you've already heard 10 definitions; I'll just add one. Obviously, digital sovereignty is about access to data. The 21st century is really global and driven by software, and so in a fast-changing world, whoever adapts the fastest really wins. It's really a question of disrupt or be disrupted. So the ability to adapt fast, ship new features fast, and deliver new applications fast is really critical. But the way you deliver those applications has really been evolving over the past 20 years. 20 years ago, if you started at the same time I started, you would procure some physical hardware and, as an application deployer, you would install your operating system, your dependencies, and your application on top of that. But that was a bit inconvenient, so we added more and more layers. The first layer we added was hardware virtualization, abstracting the server your application is running on from the physical hardware that runs it, and you gained a lot of efficiency doing that. Then we added another layer, which is cloud APIs, which allow you to programmatically access those virtualized resources. And so you have those two concepts. You have programmable infrastructure on one side, and you have cloud-aware applications being deployed on top of that.
And this programmable infrastructure is really key to reaching the next level of velocity, because machines need to be able to provision the resources that they need by themselves. So you really need that programmable infrastructure to be able to deliver applications fast. But building up this programmable infrastructure by yourself is really a challenge. It's complex to do, and it's difficult to find talent that knows how to do it, because there is a lot of demand for those skills. Luckily, you can pay others to do it for you by using available public clouds or managed private clouds. But the trick is, the cloud market is really cornered by a couple of Internet giants based in the US or China. So this really creates a challenge for European governments and companies. The challenge is that in order to stay competitive, European companies really need access to programmable infrastructure. But the most obvious way to get that programmable infrastructure is to use a hyperscaler cloud based in the US. But data is really the basic resource of the 21st century, and which legislation your data lives under, like Ludovic just said, ultimately defines who controls it. The US government can compel any US company to disclose their customer data. In case of a geopolitical conflict, you could see the US government shutting down access to vital data that is hosted by a US-based company. This really creates a significant geopolitical vulnerability, and if the last 10 years are any guide, this vulnerability, if we don't address it, will only grow. With the recent pandemic and with the war in Ukraine, we've seen a growing willingness by governments to weaponize their control of the international supply chain. And even assuming good intent from those governments and companies (we are all friends, right?), well, the legislation that data lives under actually affects which laws apply to it.
An obvious statement, but Europe really has very progressive privacy laws that protect individuals from the reach of greedy data aggregation companies. And so how do we enforce those laws in a world where all of that data actually lives in a place where those laws do not apply? So even if you assume good intent, there is a risk there. The solution is, of course, to build our own European-based public clouds. But it's easier said than done. Europe really has a vibrant ecosystem of companies, but it lacks the giants that can compete with a Google or a Microsoft or an Amazon. So how can we turn that vibrant ecosystem of smaller actors from a liability into an asset? Germany and France have acknowledged this critical geopolitical vulnerability for a while. But I would say that the previous attempts at solving it weren't super successful. For example, we had several attempts at building giant sovereign clouds in the past, but they were really not adapted to the nature of the European ecosystem. More recently, they moved towards mandating locally operated systems, which is a great step, especially as far as government data is concerned. And for others, they also encouraged cataloging and describing available services through initiatives like Gaia-X, which make it clear which laws and policies really apply to the data. But those efforts were really easily, trivially worked around by the hyperscaler companies. Some of them co-opted the requirements through local partnerships, so they would work with a local EU-based company to help them run the thing locally. The problem is that working with EU-based organizations to run the services locally maintains this critical technological dependency: Amazon could just shut down access, or weaponize access to information, really easily. In a sense, I picked on Amazon just now, but that's actually the wrong approach because they are actually not the ones doing that.
Google and Microsoft have been doing a lot more partnerships. Amazon just decided to lobby against the law, trying to convince legislators that depriving EU companies of the amazing Amazon Web Services would critically impact their ability to be innovative and competitive on the market. So they basically tried to convince legislators that if they don't let people freely access Amazon Web Services, we're doomed, because obviously we can't do that here. So what do we do now? In that context, I think open infrastructure can help, and I want to explain what we mean by open infrastructure first. What is it and why can it help? If we go back to our picture from earlier, with programmable infrastructure and cloud-aware applications being deployed on top of that, open infrastructure is really open source solutions that help you provide that programmable infrastructure. And the standard there, used on millions of CPU cores all around the world, is a stack composed of Linux at the virtualization layer, OpenStack at the cloud APIs layer, and Kubernetes at the application orchestration layer, what we call the LOKI stack. But why would you use open source for infrastructure? Why does it matter? First, it really gives everyone access to infrastructure-providing technology: all organizations, all countries. It really allows us to distribute the future more evenly. And by making those technologies accessible to all, you allow everyone to play and innovate without friction or having to ask for permission, so you maximize innovation as a result. But beyond those two key benefits, there are three properties of open infrastructure that make it really suitable for use in a digital sovereignty context. Independence is one of them. Open infrastructure is not just open source, it's also openly developed. So Linux, OpenStack, Kubernetes: those are all developed not by a single vendor, but by a massive global open collaboration.
And that means everyone can participate on a level playing field under neutral governance. Nobody owns the keys to the kingdom. Nobody will pull the rug from under you by selling to someone else. Another benefit of open development is transparency: all technical discussions happen in the open, and all governance decisions are publicly documented. Trust is really essential in building a digitally sovereign cloud system, and open infrastructure is naturally transparent. And finally, giving everyone access to that technology allows everyone to standardize on the same solutions, which enables interoperability. Interoperability is really the main challenge when federating a group of smaller actors to compete with giants, because it's really hard to eliminate the differences and present a coherent user experience. You can standardize on available features; that's a good first step. You can expose the same APIs, which is even better. Using the same technical stack is obviously one step above that. And so EU companies that have standardized on the LOKI stack, like Deutsche Telekom, Cleura (I've seen a hoodie there), OVHcloud in France, Orange Business Services, Binero, Exion, Infomaniak, CloudFerro in Poland, and Elastx in the Nordics, all give you the same APIs backed by the same software, and show good interoperability. And once combined, all of those public cloud providers give you enough points of presence and capacity to actually rival any of the hyperscalers. But in order to increase interoperability even further, you can build a common distribution and share operational practices. That will give you the next level, I mean perfect interoperability, because it will be basically the same software running in the same conditions in different data centers. And this is what the Sovereign Cloud Stack project aims to solve, and we'll have a presentation later, here you are, by Kurt.
So I suspect he will go into a lot more detail, but I'll just summarize for those who will not stay in the room. Sovereign Cloud Stack, as the name implies, is an initiative aiming to build a standard stack for providing sovereign infrastructure. It's composed of a standard LOKI stack, also making use of Ceph, another open infrastructure component. It's aiming at enabling a federation of highly interoperable infrastructure providers, and it goes beyond proposing the same features, exposing the same APIs, and running the same software, to sharing operational choices and best practices. It's also openly developed open source, so anyone can join and participate on a level playing field. And I'll conclude on that. In summary, digital sovereignty is a major challenge for Europe in the 21st century, especially around the infrastructure layers, because if we leave the hyperscalers in full control of that layer, we are going to be easily cut off from our sources of information in case of any tension. Open infrastructure is open source solutions for providing infrastructure for applications and data. It enables independence, transparency, and interoperability, which are necessary to really federate a bunch of local actors to compete with the US-based giants. And so if you care about digital sovereignty, as you should, have a look at the open-infrastructure-powered providers that I mentioned, but also at the Sovereign Cloud Stack, and stay in the room to see Kurt's presentation later today. Thanks for listening. And we have plenty of time for questions. Hi, my name is Michael. I tried to deploy OpenStack about a decade ago for our internal stuff. We found it very difficult. In fact, one of the problems we had was that we were an organization of about 12 people, and OpenStack was clearly appropriate for an organization of 100 people. And so we went for simpler solutions, you know, plain Xen, KVM, the hypervisor side of things.
And my impression is that it hasn't changed much. That OpenStack has a scaling issue, meaning it's great for large systems and large installations, but it's not good for small systems. And so what that means is that I don't develop for the stack that you want to deploy. I develop for something else, because I can't afford to maintain that piece. One of the annoyances, and then I'll just let you answer, one of the annoyances at the time was that the IPv6 support was abysmal, and it's better now. But my impression is still that Kubernetes, for instance, is like, "What's an IPv6?" They just don't care. And I wonder, in these common operational choices and practices that you're talking about, are you going to address this issue of, well, I can't really move a cluster from point A to point B if I have overlapping RFC 1918 address spaces? I need IPv6, and I need it to work natively and well, so that I don't have to think about this nonsense. So should I repeat the question? I hope the question was recorded; it was a long one. So first, on the concern around the size of deployment, or the inability to scale down to simpler deployments, I would say that it has improved a lot. Providing infrastructure is really a difficult job. It's not something you would deploy in your garage. If you're at the stage of, like you say, a company with 10 people, I don't think there is much sense in doing it. But the main concern was really keeping up to date with upgrades and the cycle of six-month releases that we had, and we've made a lot of progress there, in securing the updates and in limiting the amount of change that happens over a six-month cycle. It's pretty mature and stable now. And we are seeing relatively small teams running gigantic systems. Ubisoft, for example, is running a very large OpenStack private cloud for their game servers, and it's a team of 10 to 12, from what they said in the latest webcast.
So obviously, yeah, it's more for a 100-people company than for a 10-people company. I think distributions like Sovereign Cloud Stack and others, where there is more guidance on the type of options that you should be deploying, and more partners that you can really rely on and who share the same issues, will help further. But it's true that it's more targeted at people that have enough; I would say the minimum size of a deployment is more like dozens of servers than three or four servers, for sure. In terms of the IPv6 support, I'm actually surprised, because OpenStack had IPv6 support before Amazon did. Amazon is totally... sorry. Okay. Well, maybe that's placing the bar very low. And I don't necessarily have the full context, but I'm interested in hearing more about it if we can do that. But it feels like overall, in terms of updates, it has improved, and I'm actually very surprised when I talk to some of the big deployments that we have and see that they're actually running them with a team of three or four people. So I would say, I mean, I'm not an operator myself, I'm not running an OpenStack cloud myself, so it's difficult for me to see directly how easy or how difficult it is. But what we are seeing from practical data is that the further we go, the smaller the teams are. We clearly have a talent shortage, so it's difficult to find talent. I would say the main challenge for OpenStack right now is really the difficulty of finding people that actually have experience doing it. So at most of the companies that are deploying it today, especially in Western Europe, in France and Germany, there is a lot of training of new people. They will train their own teams, because finding talent on the market is very, very difficult. I would say that's the main blocker right now, if you had to cite one. Other questions? I'll be in the room for the rest of the day, so thank you so much. |
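To make concrete the concern raised in the question above about moving a cluster between sites with overlapping RFC 1918 address spaces, here is a minimal Python sketch. It is only an illustration, not something shown in the talk: the function name `overlapping_pairs` and the example prefixes are assumptions chosen for the example, and it relies only on the standard-library `ipaddress` module.

```python
import ipaddress


def overlapping_pairs(site_a, site_b):
    """Return every (a, b) pair of networks whose address ranges overlap.

    site_a and site_b are lists of CIDR strings (hypothetical inputs).
    Networks of different IP versions are never compared with each other.
    """
    nets_a = [ipaddress.ip_network(n) for n in site_a]
    nets_b = [ipaddress.ip_network(n) for n in site_b]
    return [
        (a, b)
        for a in nets_a
        for b in nets_b
        if a.version == b.version and a.overlaps(b)
    ]


if __name__ == "__main__":
    # Two sites that each picked private IPv4 ranges independently
    # (example values, not taken from the talk).
    old_site = ["10.0.0.0/16", "192.168.1.0/24"]
    new_site = ["10.0.128.0/17", "172.16.0.0/12"]
    for a, b in overlapping_pairs(old_site, new_site):
        print(f"conflict: {a} overlaps {b}")

    # Distinct IPv6 prefixes (documentation range 2001:db8::/32).
    print(overlapping_pairs(["2001:db8:a::/48"], ["2001:db8:b::/48"]))
```

In this sketch the two IPv4 sites collide because both carved subnets out of 10.0.0.0/8 independently, while the two IPv6 prefixes are distinct by construction, which is the property the questioner was asking for.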
The Role of Open Source at the EU Technology Roadmap for a European Sovereign Cloud |
Okay, so thank you very much for attending this talk, and obviously thank you to the organizers of the devroom for accepting the proposal. So that's my name. I'm coming from OpenNebula Systems. I'm coming here today not as a competitor of OpenStack, but to talk to you about the work we're doing at the European level, along with other European companies, to align the geostrategic interests of the European Union with what we are actually doing in the open source sector in Europe. So that's my email, if you want to follow up with any questions after the session today. As I mentioned, apart from being part of OpenNebula Systems, I'm also part of the European Alliance for Industrial Data, Edge and Cloud, which is an initiative run by the European Commission that I'll explain in a moment, and also part of another initiative we call Sovereign Edge, which I'll also explain in a few minutes' time. So just to bring you some ideas: I'm sure most of you are already familiar with some of the concepts and the strategy, let's say, behind what we're doing right now at the European level. It's been mentioned before, and you're perfectly aware of the fact, that the cloud market in Europe is, I wouldn't say broken, but it's not balanced at least, let's say. So these are just some figures on the growth of the cloud market in Europe in recent years and how the share of European cloud providers has evolved over time. What we basically see is that the market is growing enormously, but the share of the European cloud providers is actually declining. So all that business and all that value being generated by the move to cloud computing, partly through the pandemic, is mostly benefiting hyperscalers. You're perfectly aware of this as well. So we are now witnessing this kind of change of paradigm.
So we're moving away from the centralized cloud model to a more decentralized approach, this shift of paradigm from cloud computing to edge computing that I'm sure most of you are aware of. And we all know it comes with a number of benefits. We're talking about deployment of ultra-low-latency applications, improving user experience. Think about sectors like online gaming or healthcare. I mean, there are a number of functional advantages to this, but from a European perspective, there's also a key element in moving away from a centralized cloud computing model dominated by hyperscalers to a decentralized edge computing model in which other companies have a say. And that is to foster here in Europe a new ecosystem of infrastructure providers that can bring some balance to that market. So we are assuming that the cloud market is dominated by hyperscalers, non-EU companies, and honestly, that's unlikely to change. What we can do is leverage this shift, this new paradigm of edge computing, where we know there is going to be demand for new infrastructure and technology that doesn't exist yet, to bring some balance to that market and allow smaller companies in Europe, for instance, to build their own infrastructure offering and bring that to the public with these new platforms and new technologies, hopefully based on open source, as I will explain in a second, and so bring some balance to a market that's now heavily, heavily unbalanced. To the point that we all agree, the Commission included, on calling it a market failure. It is so complicated, so unbalanced, that now it's very difficult for a new company to get into this market and actually survive against these big players. Okay. This is especially relevant in Europe because we have a powerful tool here, which is the telecom sector.
And as I'll explain in a second, many of the initiatives that are taking place now in Europe rely on, or hope that, the telecom sector in Europe will come together and help to build this alternative edge infrastructure that will be able to offer an alternative to hyperscalers. We need to get the telecoms involved, and that also means we have to fight sometimes, from the open source perspective but also in general for this change of paradigm, against the temptation in some companies in Europe to align their priorities or strike agreements with hyperscalers to speed up the deployment of edge infrastructure in the market. I think we all agree that this might come with some risks in the mid-term. So I'm going to explain today what we are doing at the European Alliance for Industrial Data, Edge and Cloud and in other initiatives that we're promoting from the European open source sector. If you think about what kind of challenges we're going to have in the near future, I mean, obviously there are many technical challenges and political challenges as well. A basic one, which we still don't have an answer for, is this: if by the end of the decade we actually have tens of thousands of edge nodes across the continent, how are we going to manage all that infrastructure? How are we going to manage thousands of edge nodes with different architectures, from different brands, all of them managed by potentially hundreds of different infrastructure providers? That's the whole idea. So how do we bring all this new emerging infrastructure together, how do we build the tools for a user, for a customer, to actually leverage all this new infrastructure that's still to come, and how do we do that in a way that doesn't bring us again to a new scenario of vendor lock-in?
So how can we make sure that all these workloads can move freely from one place to another? How can we automate all these deployments? How do we make it really easy for people to actually use these thousands of edge nodes that will be available in the market here in Europe, hopefully in a few years' time? So that's a very specific technical challenge that we have to solve right now, and we have to decide how we fix that, and whether we want to fix that as Europeans or we want to again rely on non-European organizations and companies to do this for us, which is very comfortable, but then at some point we realize we don't have the skills in Europe to take over these projects if something happens. So I talked about the alignment. There is a gradual, and sometimes quite explicit, alignment between the signals we get from the Commission in this case and what the industry as a whole in Europe is doing, specifically the European open source technology providers. There are some very clear declarations from top EU figures about the need to achieve this kind of technological sovereignty in some way or another, trying to coexist with hyperscalers but hopefully creating some alternative scenario for Europe. Happily, there are also quite explicit declarations on the role that open source is expected to play in all this, and there is a commitment to do this. And I'm happy to say that the discussions we have within the Alliance to define these technology priorities and these co-investment priorities are quite well aligned with the idea that these technologies have to be open source, and that open source is the way to mobilize this collaboration between different companies in Europe. Obviously that doesn't mean proprietary software is not going to be there, because it has to be there and it's the basis of some business models out there, but open source is going to play a key role in all this.
Again, we see this aligned with what we've read from the Commission in a number of documents that I'm sure you are familiar with, and also with declarations from people like Commissioner Thierry Breton. As I mentioned before, one of the practical challenges is this: if we believe what the Digital Decade strategy says, we'll have thousands of nodes across Europe by the end of this decade. How are we going to manage this infrastructure? What's the technology that's going to allow new infrastructure providers to offer this to their customers? That technology doesn't necessarily exist right now. In any case, we think that technology should be European if we want to extend this concept of sovereignty down to the technology stack and not just leave it at the data level or at the cloud provider level. We also see in the revised industrial strategy from the Commission the role that edge computing is expected to play in all this, in this data processing that's moving away gradually, sometimes quite rapidly, from a centralized cloud to the edge. We're going to use that to actually bring some balance to the market: to make sure that European companies have something to say in that, and that the model doesn't replicate again the kind of oligopoly structure we have in the cloud market. So that's the role of edge computing, and that's also increasingly the role that open source is perceived or expected to play: bringing different companies together to build this in the long term and keep the whole thing running. Because obviously we can say all these things, but we need a long-term commitment from the European industry, system integrators as well, not just to consume technologies and sell them and resell them, but also to maintain them in the long term. That's one of the main challenges we'll have.
The European Alliance for Industrial Data, Edge and Cloud is one of the initiatives that the Commission has put in place, in which the European industry is playing a leading role. It goes back to a declaration by Member States, a commitment to support these new cloud and edge technologies. If you read this original declaration, they committed to spending around 10 billion euros on all this transformation. I don't think we'll get that much, but as I'll explain later, there is quite a decent amount of money involved in all this. And we see that even in this declaration there are already some references to at least something that sounds a bit like open source, "open source standards", whatever that is, but still it's good to see it's in there somehow. So the Alliance was born in 2021. There was an initial publication of an industrial technology roadmap from a number of European companies at the CEO level. It was then launched at the end of 2021 by Commissioner Thierry Breton, and I'm happy to say that by the end of February we'll probably be in good condition to publish the updated version of this technology roadmap, with all the technology priorities and deployment priorities that have been identified in Europe from the industry perspective, but also in alignment with the vision of the European Commission. The Alliance brings together private companies, only European companies, and Member States, and a number of experts are also involved in these discussions. We had our first physical general assembly here in Brussels in early December; it was quite successful. The idea, again, as I mentioned, is to align and identify these investment needs and priorities: how to bring together the expectations, the needs, and the potential of the industry in Europe with what the Commission is actually expecting, their vision of what Europe needs in the coming years, and these challenges.
Right now there are three working groups. One of them is for Member States. This is mainly people involved in discussions on procurement, so they're looking at how, from a procurement perspective, to help European companies and cloud providers in their daily business with the administrations of different countries. The main question we get from them is, "Should we restrict contracts to only European companies?", and it's very tempting to say yes, but personally I will say: you take the decision. I'm just a company; that's something for you to decide. Obviously that would benefit us, but the discussions are more complex. Still, that's one of the topics that keeps coming up. And then there are two working groups for industry. One of them is the cloud-edge working group that we chair along with two other companies, Capgemini and SAP. And then there's a specific working group for everything that has to do with cloud for the defence and aeronautics sector. Some of you were there earlier this morning at the debate I had with Gaël from the Eclipse Foundation. It was striking for me personally to see all these people coming from the defence industry saying that they want to use open source. So the internal debate is not about proprietary versus open source; they want to use open source, they just need help to make sure that they can use open source under the very specific restrictions they have in this sector. So I think that's good news for everyone. As for the Alliance, not all the logos are here because it's an old version, but as you see, some of the main actors in the European industry as a whole are contributing to this, not just technology providers but people from other sectors as well, and cloud providers too. There was, as I mentioned, the first version of this document released in 2021, and we'll have the new one released probably at the end of February.
So one of the ideas, for those of you who are familiar with all the different programmes that the Commission has in place, things like Horizon Europe or Digital Europe or CEF, is to make sure that from industry we feed the Commission with the actual challenges and needs we have in terms of advancing all these technologies for cloud and edge, and to make sure that the new calls and the new actions under these programmes are aligned with what we from the industry actually identify. So it's like a quick channel to give feedback to the Commission on what these topics could be, which also helps us advance. So that's the structure of the Alliance. I'm also here to invite all of you, if you come from a European company and your business is relevant to cloud, and I guess that's why you're here right now. There's an open process to join; it goes through the European Commission. You can join the Alliance, you can join the different working groups and participate. Maybe it's a bit too late for this new version, but there will be new versions and new working groups during the rest of the year, and it's a long-term commitment from these companies to participate in this. So that's the European Commission side. In the meantime, more specifically from the European open source industry, we've also been working on having more cooperation and collaboration among ourselves. In 2021 we launched an informal initiative or community called Sovereign Edge. It's about bringing together European open source technology providers that develop technologies in Europe that are relevant for edge computing and this change of paradigm. We think it makes sense to get closer to each other, get to know each other a bit better, work on integrations and all that, and go together into some of these projects that will be coming up in the near future.
So we've started with an informal community of European open source companies. We produce software and we already have some level of integration among ourselves. The objective is at some point to offer an alternative stack of European open source technologies for public administrations, companies and system integrators, also making sure that we have connections with all the relevant European cloud providers, to offer at least an alternative that goes beyond data and cloud. It also covers the technology that sustains all this. We launched this, as I mentioned, in 2021, informally. At the end of 2021, and I'm quite proud of this even though we didn't get the money, the European open source industry came together for the first time to apply and submit a Horizon Europe project to the Commission. It was a very nice experience to bring together people who had never considered going into a research or innovation project, a European project, to prepare this proposal. As you know, it's a lot of work. Finally we got it the second time. This is the COGNIT project. It started now in January. We coordinate this project, with a number of other organizations involved. It's going to run for three years. I'm happy to say this is the actual implementation of this initial collaboration between European open source providers and the research and innovation ecosystem in Europe, to produce some technology together and respond to some of these calls from the Horizon Europe program. Finally, just to finish, a very quick introduction to another initiative that's now ongoing, which is the Important Project of Common European Interest on Next Generation Cloud Infrastructure and Services, IPCEI-CIS for friends. We've been preparing this for, what, a couple of years or so. This goes through the NextGenerationEU funding. It's led by Germany and France, with heavy support from the Commission.
In total, 12 Member States are going to mobilize funding from the Commission, I mean European funding through their national budgets, state aid as it is called, to set up this huge project for the cloud-edge continuum at different levels. I cannot disclose the specific companies, but it's European industry, telecommunication companies, software developers and cloud providers coming together to build a whole stack of the technologies and the advanced services we need for infrastructure, applications and data. So it's really exciting. It will cover a number of challenges. We won't go into the details here, but they will be available on these slides. But yeah, big topics like interoperability, portability, meta-orchestration across different providers, avoiding vendor lock-in, and managing all this huge heterogeneous infrastructure that we expect to be in Europe in the coming years. We passed all the tests and all the questions from DG Competition and the rest of the Commission. It will start hopefully this year, in the second semester or so. It will run for five years. This is my personal opinion, but I think the components that we will be building in this project have the potential to become, you'll see when this is published, the equivalent of the semiconductor or hydrogen IPCEI projects that are already approved. And I think the open source that we'll be producing here will hopefully have the potential to become probably the largest open source project in Europe in the coming years. So stay in touch, because we'll need help. And also an invitation for you to follow this new project that we are starting now under the Horizon Europe program. Thank you. |
What is Digital Sovereignty and how can OSS help to achieve it?
Demystifying an important term that has become a buzzword |
Thanks for joining us tonight after a long day in the sovereign cloud room. We've heard a lot of great presentations before, and I hope I can shed some additional light on this definition of digital sovereignty and on how we believe that open source, and maybe extending the way we do open source, can help us to achieve more here. Just quickly introducing myself: I've been doing open source for all of my professional life. Actually I started before my professional life, at university in the 90s, when I was doing some work on the Linux kernel, contributing to the SCSI stack and SCSI drivers, which was really fun and which really gave me the fascination of working with people around the world on great technology. That was then what my professional life became. I was working at SUSE, where I built up SUSE Labs; that was a significant part of my career. I've also been building clouds for Deutsche Telekom T-Systems, so I was head architect in the Open Telekom Cloud project there. During that time I was also part of a number of communities and organizations: EFF,
Linux-Verband, which became OSBA, which funnily enough is my employer today. And that's the Sovereign Cloud Stack project, which we were able to start and which brought us to work extremely closely with the German Ministry for Economic Affairs and Climate Action, that's the one who's paying us, and the Open Source Business Alliance, which runs this project and which employs me. And yes, over time I've been able to contribute to a number of projects: the Linux kernel, the GNU compiler, OpenStack, and now Cluster API and Kubernetes as well. So we've heard quite some good reasoning why digital sovereignty is a thing. To make it very fundamental: if you look at what IT does today to our private lives and to all of the things we do, our industry, our society and our public administration all depend on IT in a way that they didn't 20 years ago, and this is only going to get larger and more important. So it's important for us to make sure we have control over the systems we are using and that we are in a position to determine how we want to use these systems. Right now this is under question, and we heard it put very well by Ludovico before: we are in trouble there, we face a challenge there. From an economic perspective, a lot of the value creation happens in IT platforms, in software, in infrastructure. And of course there's the question: are we able to set laws that we are able to follow, if a lot of the infrastructure that everything we do privately and in industry depends on doesn't comply? How can we resolve that? And of course there's the question of GDPR. GDPR can be annoying in its details, but overall it's a great thing, because the idea of protecting, or restricting, what companies and governments can do with our data is part of what gives us as individuals the freedom from being watched
and being analyzed all the time, and this is a very fundamental freedom that we have. The thing we need to ask ourselves is: we have these great laws, and the question is, can we actually implement them? Are we serious about our law, are we actually complying with it? Talking to companies, talking to people in the industry, I once wanted to poke people a bit and said, well, I guess at least half of the usage of public clouds in Europe is illegal if we take GDPR seriously. I was expecting to get a lot of fire back when I said that, and the only comment I really got back was: well, only 50%? I think it's a lot more. So this is the status quo. We don't have exact numbers, I have not done a study on that, but we know a lot of public cloud usage isn't legal if we take our laws seriously. The question is how can we improve on that, how can we make compliance a reality? We have regulation that is supposed to help, but then of course we need to take it seriously. There's the Digital Markets Act and the Digital Services Act, which are certainly good legislation that helps us, and we have this data protection thing, which we need to start getting more serious about. One thing we've seen come out of that debate is that in several countries there are now initiatives where US hyperscalers enter partnerships with local companies and try to structure the partnerships in such a way that the local company is fully in control of the infrastructure and operations, so the hyperscaler doesn't have access and the CLOUD Act wouldn't apply. We have to look into the details to see whether that actually fulfills the data protection requirement or not; it depends on a lot of details. One of the things I've learned in my career: I was once trying to operate software that was built to be run by the software company itself in a DevOps model, and we were trying to run it as a third party, and it was
awfully painful. That is of course what these local providers that do partnerships with Microsoft or Google now need to go through: learn a lot, invest a lot, and actually Microsoft and Google would have to invest a lot to make it runnable by others. So that's one thing they need to overcome, but maybe they do, and that might resolve the GDPR question. What it does not resolve is the question of whether we can actually understand the technology, whether we have the transparency we want, and whether European companies can provide more influence and value. There have been discussions that we need alternatives, not just regulation. One of the things I've heard discussed was: well, let's build a European Amazon, let's build a European hyperscaler. To be very honest, I don't believe in that. I think replacing the dependency on a monopolist or a few oligopolists with just another monopolist that everybody then depends on does not improve the situation much. It may improve it a tiny bit, because maybe there's a bit more legal control, but it really replaces one dependency with another, so it's not a large step forward. The other thing I've heard is: well, let's make sure we only use European software. Honestly, I don't think that is something we should seriously consider if we understand open source, because open source does not need us to restrict ourselves to a certain country. I think Marcel has made that point very nicely in his... no, it was not you, it was Ludovico, I guess: if you have open source where you have control, that restriction is not needed for digital sovereignty. So I want to take a step back and think: what is it we want to achieve with digital sovereignty? Talking to companies and understanding what they want to do, and also talking to public administration people, it is really about the ability to take decisions on your own. Obviously, if you take that to an extreme, it would mean you don't have
any dependencies. I don't think anybody in the modern world can create platforms where you don't depend in many, many ways on others. So in the end, what I think we really need to do is to manage those dependencies, and the risks we get from them, in a very conscious way: understand them, maybe avoid some of them, and where we cannot avoid them, make sure that the relationship we have with the people we depend on is well understood and that we have a certain amount of negotiation power that makes sure we can control things. We have talked to organizations to understand the various dimensions of risk they want to address and manage, and we found this list of four different directions that people were generally interested in. The first one is of course this legal compliance thing; this is something that any company needs to do. The second thing is: if we have a supplier on which we have a certain dependence, we want to make sure that whenever needed we can switch. Switching of course is something that can be very easy in theory and extremely hard in practice. So in order to make switching a viable option, we need to structure the way we use this platform, or maybe the way this platform is designed, such that the switching cost is low. The ultimate possibility to switch could also be an insourcing option: if things go really bad, I could take the platform and run it myself. That would be the ultimate proof that switching is possible. The third thing is that the technology platform we depend on opens up new possibilities and closes off other ones, so it limits the things we can do on top of that platform. If you as a company, for example, want to evolve in a certain direction, it may become limiting to you. So what you would like is to have a way
to influence how this platform is evolving in the future. The fourth thing we heard a lot is a discussion on transparency. We want to understand how our data is stored and how it is being secured from improper access. Do we have that level of transparency, and do we have the skills to actually understand how it works, so that if we want to make certain improvements, we understand them? We've heard this before, so I guess this is preaching to the choir a bit: open source of course helps a lot with achieving our goal here. The amount of control you get as a user of open source is so much larger than the amount of control you have as a user of proprietary software. It's a different league, it's a different world. So we don't need to have that discussion that our software needs to come from a specific country or from a specific continent. This can be a relevant discussion if you're using proprietary software, because then you don't have a lot of rights. If you're using open source, the idea is to do open source right, and if you do that right, you don't need to ask that question. And of course the ability to build on top of each other's work is inherent to the way we do open source. We all today are using systems that a lot of smart people have developed over well over 20 years; with Linux, 30-something years actually. There's one thing I think we need to be aware of. A decade ago or so we were discussing that open source is really needed, that we need open source in order to have control over our own fate. Today I hear a lot of companies saying, well, we do open source, and it has become an item on their marketing checklist: they think, well, the recipient asks for open source, so we will tick that check mark, because they ask for it and we do something somewhere in the open. And I think the one thing we need
to be really careful about is this: we have built this great open source movement where we have achieved a lot, and now we hear people saying open source without meaning it. We need to be careful to call them out and to not let them get away with that. They build projects which are partially open source, but if you want to use them you cannot, because there are these other components that you need in order to use the software. I don't want to say that the open source pieces are useless, but they're not very useful, and it's certainly not open source as a whole. Or people invent strange licenses which are not even OSI compliant and which restrict you in very surprising ways; be careful about that. Or people build these open core models where you have access to some useful core piece of technology, but if you want to do the really useful things with it you need the proprietary extensions. Not very useful, so be careful. Another thing we've seen, which I think we need to be careful about, is that sometimes projects are built inside a company and then that company says, well, okay, maybe we can reach a larger market by making it open source. This is a great thing, and I applaud that company for doing that, but we also need to be careful: the company needs to invest in building a community, so that the dependence of that open source code on that single company gets lower over time. Otherwise, if that company is the sole party taking decisions, contributing, and having knowledge of that software stack, it really doesn't make us a lot more sovereign. It does in theory, but the work needs to be invested to build the competence, the local competence, as we heard earlier today. And then, yes, the decision-making and development process can be closed, and that is not what we want. The OpenInfra Foundation has, I think, a good way of addressing all those concerns about not really open projects with the Four Opens, where it's not just
the license. The license is important: it needs to be an OSI license, and it needs to be fully open source, not an open core model. The development process needs to be open, the design discussions and the decision taking need to be transparent, and the community needs to be open and diverse, not just a single vendor. Looking at the list I made before of what companies expect from being sovereign, I made this little non-scientific taxonomy of how much sovereignty users can have if they're using these platforms. Obviously, if you're on a hyperscaler, even GDPR compliance is not something you can achieve. There's this trustee model I mentioned, where you have local providers of hyperscaler technology and it's fully operated by them, which hopefully solves that problem. But the ability to have choice, to switch provider, is not there. The ability to insource, to run this stack yourself, is not there. You cannot really shape the technology and adjust it to your own needs, and you don't build the skills of how this software works and operates. So you don't really score on those levels. If you built an EU hyperscaler, it wouldn't be that much better. Maybe you have a bit of a better handle on the GDPR things, but it doesn't solve the problem. If you deploy private clouds, you're actually in better shape, even if you build them on proprietary technology, because at least insourcing is then something very real, since that's the decision you have taken. You could of course have a third party doing some of that management for you. And you build some skills, you learn how to operate it. Now, does open source solve the problem for you? Well, it does partially, and I think there are some additional steps; I'll talk about them in a second. One of the things with using open source public clouds or building open source private clouds is that
switching may still not be easy, because if you have five companies that build open source clouds, you may end up in a situation where those five clouds use some of the same technology components but all of them are configured and run in very different ways, which means switching from one to the other is still extremely painful. So that's one thing that needs to be solved. The skills and transparency may also not come automatically. In theory they do, because the code is open, but to run a platform there's a lot more than code. You also need the operational practices, you need the operational tooling, you need to understand how to deal with incidents. All of that is knowledge that you need to build, and it is not automatically available just by having open source code. That's of course how we connect to Open Operations and the Operate First movement; I'll talk about that in a second. First of all, when I said switching needs to become easier: basically we need to find common ways of how we build these clouds, how we configure them, and how they expose interfaces to the users. Of course there are great existing standards out there. On the open infrastructure side there's the OpenStack Powered trademark certification, which standardizes how the APIs are exposed. In the CNCF world, of course, we have the CNCF compliance tests. But looking at real workloads, those are a great starting point but they're not enough. There's nothing standardized about what your storage in Kubernetes, for example, needs to look like. Or the way you do, whatever, ingress: you need to have custom annotations for your load balancer. So if you need to migrate applications, those are the things that you really stumble upon and that are painful. So we need to create standardization together to move forward and address that issue. And then of course if you
create common standards, having a reference implementation that implements all of these, and that providers can take, of course also helps providers to have a much easier job building platforms. And it's not a black-or-white decision: you can take certain modules from the reference implementation or you can take all of it; both are possible. Addressing the skills and transparency thing, that's the Open Operations and Operate First movements that we are working on. The idea there is that we've learned how to collaborate effectively and efficiently when building software, the dev piece of our DevOps world, but what we haven't done as much yet is learning how to collaborate, how to share information, how to work together on the ops piece of DevOps. Most commonly we don't learn a lot about how the application operation and the platform operation work. Sometimes we get some highlights or some spotlights, and that's quite informative, but we haven't really built a practice of sharing that yet. That's something we're trying to work on within the Open Operations movement, and we've also heard about the Operate First project before. It's about a number of things: knowledge sharing and providing transparency; for example, if there's an incident, getting a public root cause analysis is a great thing to learn from. It's a culture of sharing, and of course processes and tooling belong to that as well. Filling those gaps is exactly the mission that the Sovereign Cloud Stack project tries to pursue, and this is the official statement of what the Sovereign Cloud Stack project is about. We want to build one platform, but one platform that's not operated and built by one company that owns it, but really by a community of many, which standardizes it together, builds it together, and even operates it together. So we are really trying to build on the existing open source projects, and we have a lot
of really great open source technology that we can build upon: combine them in a standardized way that we discuss together with the community that participates, and then also build the operational practices, the operational knowledge, together. So those are the three outcomes that we produce: certifiable standards, the reference implementation, which is of course fully open source, and the operational knowledge that we build together. This is again about Open Operations; I'll skip over that so we have some time left for questions. I think now we can achieve what we wanted to achieve in the taxonomy of digital sovereignty. By making it easier to run those software stacks and to offer them, we have more choice in providers, so there will be local possibilities. Switching has become easier because we have taken a further step in standardizing what workloads that run on top of your infrastructure can expect. And of course, by opening up the operational piece with the knowledge sharing, we build a lot more skills and provide a lot more transparency on operations. The good thing is this project exists. We have been able to get a grant from the German ministry for the economy; actually, one and a half years ago is when we started. We've been able to build up a small project team in the Open Source Business Alliance. We are nine people, I believe, on staff right now, still looking for maybe two or three more to build this further. We also have some money that we can pay to partners that do development work, and real development work is happening: some of it contributing to the upstream projects that we work with, some of it on integration, configuration and operational topics, where we sometimes need to build our own software. We have a number of active members in the community. We're also working very closely with the Gaia-X community to make sure that the
standardization on a very different level that's happening there is part of what we deliver. Right now we've done four releases; we do two releases a year. The reality is that it's a bit of a continuous process, but the way our partners want to consume the software requires us to have releases, and we do two of them per year. Currently we have three public clouds that are using the reference implementation; actually they are using it completely. And we have a couple of projects underway where private clouds are being built, and also where existing cloud environments that existed before we even started this project are starting to adopt modules from the SCS work that we're doing and to adopt the standards. The certification program for the standardization is also something that's currently being extended and rolled out over the next few months. I have some references. We've talked about this quite a bit in public, published white papers and spoken at conferences, and these slides have been uploaded, so I guess you can best look those references up in the slides. And that's already my final slide. I would love it if some or many or all of you joined our community in any way. I guess what everybody in this room already does is, in some way or another, contribute to digital sovereignty by working on open source. This is helpful; thanks for doing that. Certainly some of that work ends up in projects that we also use, like, whatever, the Linux kernel or OpenStack or Kubernetes or projects like Cluster API in Kubernetes. All of that is helpful. And contribution is not just code. Just asking questions, opening issues if something doesn't work, contributing to these standards, documenting something, all of that is great contribution. And of course, if there are digital sovereignty discussions, it would be great if you could raise your voice and say, well,
digital sovereignty is not just complying with GDPR; it needs to go way further. We need to open up the way we develop software, we need to have the transparency, we need to open up the way we operate these environments, that's the Open Operations movement, to really make sure we create this awareness. And let's also make sure that we don't let people get away with saying, well, we do open source, and then if you look at it, it's not even an OSI license, or it's just a few open source components and the rest is proprietary. Let's make sure we don't let people get away with that, but fight for the real open source. So that would be great. And it would be great for us if some of you join our community. We're also still hiring for a few positions. We also still have a few more tenders where we want development work to be done, so that's also a way to contribute. You'll not get particularly rich, but of course you need to pay your developers. That's all. I hope you have some questions. If there are a lot of questions, there's also an open infrastructure meetup tonight at the roosters, I've been told. I will be there. I will not make 19:00, because I think the session in this room runs until 19:00, but maybe at 7:30 or 8 I'll be there, and you can also ask questions there. But I would love some questions right now; I think we have some minutes left. Hi, thanks a lot for the presentation, and congratulations on the work you've already done. I was wondering, because you talked a lot about switching and how switching is going to bring more sovereignty, whether we could even take a step further and talk about interoperability and portability in themselves. I think that, if we talk about sovereignty, it would bring even more opportunities for sovereignty and for the development of new European alternatives, for non-dependency on certain technology, because interoperability and portability will allow you to develop
niche services that would be interoperable with pretty much all other stacks, right? So I was wondering why stop at switching and not go all the way, as I mentioned before. Great, great question. Actually, on that slide I was really answering the pain that we've heard companies talking about. If you look at our vision, what we're trying to do is say: we're building one large virtual cloud, where the crux is not so much switching from one provider to another with your workload, but actually using several of them together in a federated way, and because of the way they are built, they are so compatible, so interoperable, that that actually works and it's seamless. That's the end vision. Probably it will not be achievable 100 percent, because sometimes there are things like network latencies that you will not be able to overcome if you federate clouds from very different locations. But getting as close as possible to that vision is something we try to achieve. I didn't mention federation specifically here, but it's part of our vision. Hello, thank you for the presentation. I'm just trying to understand the scope of SCS. Is it only that you want to standardize what is being exposed to users beyond the APIs? Thinking of OpenStack, all the public OpenStack clouds have the same API, but of course some details are different, so it does make sense to try to standardize that. But is it also, because you talk about operations and everything, to standardize how the ops teams for these public clouds deploy and manage their OpenStack cloud? And if that's the case, how do you deal with all the existing public clouds that already have a lot running in terms of operations? If those companies want to join the initiative, what are they supposed to do? Redo the deployment with the SCS stack, or
what's the idea there? Yeah, great question again. I think there are several options. First of all, when we say we standardize, we create certifiable standards, that's of course something at the interface towards the users, the folks that deploy workloads on top of the platform, and that makes the switching or also the federating possible. But then of course we have a reference implementation, which is completely open source and which is modular. If you are a new provider, our recommendation would be: try to see whether the whole stack serves your purpose. We're trying to develop it together in a way that it does. But if you have an existing platform, that's probably not something you will do; you will not switch out an existing platform if you have any reasonable number of users on it. So our suggestion is: look whether there are certain modules where we create something that you want, and see if you can fit it into your existing infrastructure. We're currently trying to start a project on metering and metrics collection, because there are a number of half-finished projects out there which need some love, and that is something that, as an existing cloud provider, you might very well be willing to adopt. Or maybe a status dashboard, something that people have built on their own; most of them haven't done it in a way that's great yet, so that is something you could adopt. And then, step by step, of course, the closer you are to the reference implementation, the more knowledge, information and operational practices you can share with the rest of the community, so the more useful I think it will be. But if you have existing infrastructure, you will probably change it very gradually, and that's possible. We have partners we work with that follow exactly that approach. Thanks for the presentation. A small question: are you starting
from scratch, or are you in contact with different companies to build something based on their experience? Behind my question: I know that in a lot of countries in Europe there are regulatory conditions for financial services or government, and I think that everyone is building their own certification standards and operational model, and they're building their internal cloud solutions. So behind the question is: is there any project to collaborate with all of those companies and maybe build something tomorrow?
I mean, the whole idea behind this is an invitation for all those companies that build similar things based on OpenStack components to say: can't we join forces and do it together and build some joint, common best practices that work for all of us? I should add that it's not a black-and-white thing; you can adopt some of the pieces if you want. But in the end, of course, we want to have one reference implementation that all the different people who are interested in it have contributed to. That's the main reason for us to exist: to collect all these little great teams that do great work, but completely disconnected and fragmented, and we try to bring that together. Of course we haven't yet talked to all of them, and some of them may be hearing about us for the first time today. But yes, that's what we're working on, and we're more than open to get additional people that have built similar things and see whether we can align them and find joint ways to do things.
Yes? Hi. Okay, great story. What I'm sort of missing is the next step, and that is with regards to politics. If we look back in history, there once was a thing called the document wars, you know, Microsoft documents, LibreOffice, et cetera. And there was a certain arrogance of a certain supplier that I won't mention, in that there were no price negotiations possible. And if you say that to a government official, he goes
back to his desk and he starts thinking: okay, but I'm being screwed, and how can I fix this? So they came up with a thing called open standards, and the importance of open standards in society. Now, what we see with the GDPR is also a very good standard, but on privacy. If you talk to people in Japan and ask, what do you do with privacy laws, the answer is: well, we look at Europe and copy that. So in that area we're in the front. In the hyperscalers we're losing big, but on standardization of privacy laws we're leading the world. What you're doing now is setting a very nice standard, and if you can promote that standard to governments, as in: look, the exit costs of your products and projects will go down if you standardize on these standards that we have developed, then they will tell the suppliers: it's nice that you want to sell us something, but it has to comply with this standard. That will give legitimacy to your standard. So, after my long story, the question would be: are there plans in that direction?
So, I think it's a great suggestion. We have some level of discussions with the German government, that's just because we are operating out of that country and also have funding from the German ministry, so some of that is happening there. Yes, I think it's something we will need to address. I would love not to say it needs to be France and Germany, because I think Europe is a lot more colorful than that, but I know those are two key countries. So yes, I'll just take that as input and think about how we can maybe start a discussion in a broader European sense. I mean, we have some great partnerships in Sweden, where great discussions are actually happening.
Talk to the Joinup task force of the EU in Brussels. Gijs Hillenius is walking around here as one of the people involved there. Make that connection, because from that point on
things should start rolling, because that's their job, you know, to disseminate these kinds of initiatives through the other countries, and the other countries are all looking for a way to also make life better for their constituents. So yeah, I think this is a very good initiative, but it needs the political dimension.
Yeah, great input, and I'll try to catch you to make sure we can establish those connections. Yeah, sure, sure, no worries. I'll walk over to you in a minute.
Please, last question. So, disclaimer: I work at Red Hat, and I have a question on how this compares to Red Hat's hybrid cloud strategy. What do you think? It has many of the common goals.
So, it's a great question. Red Hat does a lot of great work in a lot of the open source projects that we are also using when we're building the reference implementation, and we have lots of contacts at the working level. In the end, what Red Hat does is build a product, a distribution; a lot of the work at Red Hat is based on OpenShift, basically, which is a nice product. But what Red Hat doesn't really do is address this standardization thing. Red Hat doesn't try to tell its customers: you should run it in a certain way so it can be federated with all the others. That is not in the scope of Red Hat's work, and I think it's great, because it's not their role to do that. We're trying to establish that so we can build these federations. So it's really a different role that we have there. And then of course I like the fact that Red Hat has kind of invented this Operate First thing, when we came up with this open operations thing, which fits together very nicely. So I think there will be a lot of collaboration, and I hope Red Hat will also play a stronger role in our project in the future. Thanks. |
Effective management of Kubernetes resources for cluster admins |
So, last session for today, and we will make sure it's going to be a really, really long one, so that you have to starve and don't get to the drinks. And I'm happy to welcome Tom from Red Hat. Have fun, and enlighten us. Thank you. Should I have fun, or them? What? Should I have fun, or them? Both? Okay. Okay. Hello, everyone. My name is Tom Coufal. As you heard, I work at Red Hat, and I think the last talk was a great segue into my talk. So, if you were here for the previous presentation, who was here? Good. Good. So, the talk was about standardization, a call for a unified platform, a call for sharing and exchanging ideas, knowledge and findings, and how to get to some kind of an open, unified, sovereign cloud. Well, we've been working on something like that for the past two years or so, in an initiative called Operate First, building an open hybrid cloud platform ready for everybody to consume, to use, to look into operations, to look into metrics, to look into whatever telemetry you have, and to actually do the operations yourself if you want to. So, this talk is going to be focused precisely on that: on sharing a story, sharing a lesson that we learned during that time, and hopefully taking it as an opportunity not just to share that with you, but also to encourage you to learn those lessons for yourself and experience our pains and our challenges yourself. So, let's dig in. The talk is called Effective Management of Kubernetes Resources, the GitOps way: GitOps for cluster admins. First we're going to talk a bit about what a cluster lifecycle is and what the role of cluster operations in it is. Then we're going to experience the chaos that is out there in the world, and then we're going to talk YAML. If you've been to the YAML lightning talk, this is going to be a very slight variation of that, but more Kubernetes focused. And then we're going to bring some order to that chaos.
So, we have these three graces of cloud management, right? We usually provision some resources. We manage those clusters once they are deployed, once they are provisioned. We then deploy applications on top of them. If you are talking about Kubernetes-based cloud systems, these are the usual three pillars, or three graces, of what we are experiencing. We have tools available for both ends of the spectrum. For resource provisioning, we have great tools like Ansible, Terraform, Hive, or Cluster API in Kubernetes. This is an established pattern, an established workflow that is widely used across hyperscalers, across people who are deploying Kubernetes by themselves, and so on. Good. This is a solved issue. This is a non-issue. Then there's the application maintenance, application deployment, application lifecycle. Again, a very well thought through aspect, a very well studied place. We have tools like Kustomize and Helm. We have Argo CD or Flux CD to do continuous deployment of your workloads and to provide you with all the goodies like rollbacks to a previous non-broken deployment, and taking it even further with other projects like, and now I forgot the name. What were we talking about last? The SRE-driven deployment? No, you don't. Okay, let's move on. What about the middle part? The cluster management itself. If we are managing Kubernetes resources, what are we talking about? If we are managing nodes, if we are managing tenancy, if we are managing networks, what are we actually talking about and how can we manage that? We have these four main problems that we want to solve somehow. We found out that nowadays (it wasn't the case two years ago, but nowadays it is) we can solve all of them through Kubernetes native resources, through YAML: deploying YAML, applying YAML to our clusters. It's done by a few different means, so we have main areas within the Kubernetes API that we can explore to solve these needs.
We have multi-tenancy, which we can solve with just simple namespaces, cluster roles and whatnot. Cluster upgrades: again, we can install operators, talk to those operators and get those clusters upgraded. For storage management, we can use operators, we can use storage classes and storage providers, and custom resources if we want to deploy our own storage on, for example, bare metal clusters. For network management, we can do that also through operators, so things like cert-manager, things like NMState, all of that can now happen through the Kubernetes API natively. That's great. What does that tell us about cluster management? It can all be managed as a Kubernetes application. It's in YAML. Well, YAML is a mess. We know that, and we know that thanks to multiple aspects. YAML can be defined and stored in files no matter how you structure it. It can be a single file with many different resources. It can be many different files, each of them holding a separate resource, and only the asterisk in Bash is the limit for your kubectl apply. You can do whatever you like on the client side. On the other hand, on the server side, the manifest that you apply to the cluster is not the same as what you get back from the cluster. It's modified. It's mutated. You have things like status. Some operators, some controllers also modify the spec. Some also modify annotations, labels, and whatnot. You don't have full control over the definition. You need to know which subset of the keys and values you can actually define as a declarative manifest for your resource. It's not the same as the manifest applied on the cluster. So how do people store manifests online? If we pull a random project on GitHub that is deployed to Kubernetes, we will find many of these patterns. kubectl doesn't have ordering, so people solve it creatively by numbering their manifests. Some are aware that their application may run in different environments.
So they create different files with duplicate content, with the same deployment with just a few lines changed here and there. Some combine those approaches. In some projects, and we find that even in some controllers for their dev setup, they have a single file with all those resources inlined in there. This is not a standard. This is not a good practice. And if we want to manage an environment which is live, which should be approachable to people, this is not the way we should do it. So in application space, we have basically two choices for how to organize, how to structure our manifests. One is through Helm, which is great if you're deploying applications and you want some templating involved, if you want to quickly change many different places of the same manifest or of different manifests. So you can basically create this template, apply these values, and you get the full YAML that we saw earlier. Great. But is it readable? Is it understandable from the YAML itself without rendering? We don't think so. And we want our cloud manifests to be auditable, to be approachable, to be reviewable. So if we want to be able to explore what those changes do on a PR review without actually spinning up a cluster and applying that PR, and maybe do some static validation on it, you can't do that with this. You would need to render it. You would need to understand it. And if you change a template and you have different values for different environments, how would it affect the template itself? So you need to explore all the possibilities. And this is one of the biggest challenges in Helm space that we are currently facing in application development. Then we have the other way, the Kubernetes native configuration management, Kustomize, which is a bit nicer. All of those manifests are fairly easily organized and referenced through kustomizations, and all those kustomizations are organized into bases and overlays.
So it's a composition type of configuration: we have a base which defines the basics, and then we have the overlay which can patch and mix different resources. These resources in the base are already complete. This is a complete definition, a complete declaration of my resource. This is reviewable. So we kind of thought that this might be a way, but before that we defined a couple of rules, a couple of directives that we wanted to achieve with this. So if we want to organize our manifests, we don't want to build our own solution. We don't want to build our own CI/CD that would understand our manifest structure. We want to use something that is readily available with a great community. We want the configuration to be stable, so if I change one manifest, it doesn't break five different clusters. And those things that never happen usually happen. So if something like that happens, I can roll back the faulty cluster, just that individual cluster. I don't need to roll back all the clusters that are working with that particular configuration. And it's unit testable. So that's also an important thing. File mapping is also a very interesting topic, because YAML allows you to inline multiple resources into a single file. But we don't want that. We want the file and its name to fully represent the resource. And before I even open the YAML, I already know what to expect inside. I don't have to guess from a def.yaml or from a namespace.yaml which also contains, like, an OpenShift Project or whatnot. Each file is readable without processing. That's self-explanatory. I want to be able to open the code tab on my GitHub repository and understand the manifest. If I'm defining the same resource on multiple clusters, if I'm applying the same resource on multiple clusters, let's say I have the same user group on two different clusters, I want to apply similar or the same RBAC. I want to apply the same cluster roles, project namespace, permissions and whatnot.
I don't want this definition to be explicit, to be defined differently, maybe slightly differently, maybe the same, in two different places. I want to share the same definition. This is a practice that we have used in programming for ages, but it is not a well-established pattern in Kubernetes manifests. We want to reuse stuff. And as I said before, the file name already describes what's inside. So we came up with this pattern, and this pattern has been vetted through a couple of organizations that I'll show you later on. And this is the pattern that we came up with. We have a base for Kustomize, which references every single object that we deploy to any of our clusters that requires elevated permissions. If those resources are standard namespace-scoped things like a Deployment, ConfigMap, Secret, whatnot, this is the developer's responsibility. They live in their own self-contained namespace, and they can do whatever they want in there. But if we are talking about creating namespaces or creating cluster roles, we don't want developers to create namespaces on their own, or create limit ranges or resource quotas on their own. We want to set those things for them, because we don't want them to basically expand and take over the cluster if we don't want them to. So this pattern of API group, kind, name is actually kind of working, because already from the path (base, core, namespaces, sovereign-cloud, or base, fosdem.org, talks, my talk) I already know what the resource is about without actually looking into the file. Then I have overlays, where each overlay represents a single cluster. They have a kustomization, which basically mixes and matches whatever resources I want to pull from base. And if I want to change something from the base, I basically just patch it, because Kustomize allows us to patch resources and applies either a strategic merge patch or a JSON patch, so I can do various things with that.
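The "API group, kind, name" layout described here can be sketched in a few lines of Python. This is only an illustration: the exact directory names, the naive pluralisation and the `<kind>.yaml` file name are assumptions, not something the talk spells out.

```python
# Sketch of the "API group / kind / name" layout described in the talk.
# The exact directory and file names are assumptions for illustration.
from pathlib import PurePosixPath


def manifest_path(manifest: dict, root: str = "base") -> PurePosixPath:
    """Return the repository path a manifest would live at."""
    api_version = manifest["apiVersion"]
    # "v1" has no group prefix: it belongs to the legacy "core" group.
    group = api_version.split("/")[0] if "/" in api_version else "core"
    kind = manifest["kind"].lower()
    # Naive pluralisation, good enough for a sketch.
    plural = kind + ("es" if kind.endswith("s") else "s")
    name = manifest["metadata"]["name"]
    return PurePosixPath(root, group, plural, name, f"{kind}.yaml")


namespace = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {"name": "sovereign-cloud"},
}
group = {
    "apiVersion": "user.openshift.io/v1",
    "kind": "Group",
    "metadata": {"name": "cluster-admins"},
}

print(manifest_path(namespace))
print(manifest_path(group))
```

With a rule like this, a reviewer can tell from the path alone which resource a file declares, which is the property the talk is after.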
This is very helpful if I have, for example, a cluster admins group and I want different cluster admins on different clusters, but the group itself is already defined in base. Well, this is nice, but it doesn't work in all cases. It doesn't solve all the issues. So we had to introduce two additional concepts. One is components, which is an alpha extension to Kustomize that allows you to reuse the same manifest multiple times. This is important in cases like RBAC, if we have role bindings that we want to apply to multiple namespaces, like granting a user group admin access to a certain namespace, because Kustomize by itself wouldn't allow us to use that resource multiple times. So this is a limitation of Kustomize in this particular case that can be overcome through components. And then we came up with bundles, which is an addition that basically selects related resources from the base which are always applied together. So imagine you want to install cert-manager. It's always a namespace. It's always a service account with a cluster role. It's always a subscription or whatever, or a cluster issuer for certificates. All of these things come together, and they are referenced as bundles, so we don't clutter the overlays too much. And we also introduced common overlays, which are region specific, which are shared across regions, because for some regions we have a shared config. So what does such a single-cluster overlay kustomization look like? We reference the common. We take everything from common, which also references some things from the base and whatnot. Then we can, for example, deploy our custom resource definition for Prow this way. We can create a namespace for Prow, and we can apply some RBAC to node labeler. We can install a whole bundle for cert-manager as is, and this ensures cert-manager is deployed and configured properly for this cluster.
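To make the relationship between base, bundles and per-cluster overlays concrete, here is a toy Python model. It is not how Kustomize is implemented: the resource paths, the bundle contents and the simple recursive merge standing in for a patch are all assumptions chosen for illustration.

```python
import copy

# Hypothetical base: one entry per cluster-scoped resource, keyed by its path.
BASE = {
    "core/namespaces/cert-manager": {
        "kind": "Namespace",
        "metadata": {"name": "cert-manager"},
    },
    "operators.coreos.com/subscriptions/cert-manager": {
        "kind": "Subscription",
        "metadata": {"name": "cert-manager"},
        "spec": {"channel": "stable"},
    },
    "user.openshift.io/groups/cluster-admins": {
        "kind": "Group",
        "metadata": {"name": "cluster-admins"},
        "users": [],
    },
}

# A bundle names base resources that are always applied together.
BUNDLES = {
    "cert-manager": [
        "core/namespaces/cert-manager",
        "operators.coreos.com/subscriptions/cert-manager",
    ],
}


def merge(target: dict, patch: dict) -> dict:
    """Recursively merge patch into target (a simplified merge-style patch)."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge(target[key], value)
        else:
            target[key] = value
    return target


def render(overlay: dict) -> dict:
    """Resolve one cluster overlay into its full set of manifests."""
    paths = list(overlay.get("resources", []))
    for bundle in overlay.get("bundles", []):
        paths.extend(BUNDLES[bundle])
    # Copy so that patching one cluster never touches the shared base.
    rendered = {path: copy.deepcopy(BASE[path]) for path in paths}
    for path, patch in overlay.get("patches", {}).items():
        merge(rendered[path], patch)
    return rendered


GROUP = "user.openshift.io/groups/cluster-admins"
cluster_a = {
    "resources": [GROUP],
    "bundles": ["cert-manager"],
    "patches": {GROUP: {"users": ["alice"]}},
}
cluster_b = {"resources": [GROUP], "patches": {GROUP: {"users": ["bob"]}}}

for name, overlay in {"cluster-a": cluster_a, "cluster-b": cluster_b}.items():
    manifests = render(overlay)
    print(name, sorted(manifests), manifests[GROUP]["users"])
```

Each overlay stays short because it only lists what it pulls in and what it changes, while the shared definition of the group lives once in the base.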
We can also specify a specific version for that particular OpenShift cluster, to upgrade it, to do maintenance on the whole OCP version. And if we want to, we can patch certain resources, as I mentioned, the cluster admins. So it's a fairly simple pattern, but there's been a two-year journey to get to a state where it's actually working across regions, where it's actually working across multiple clusters, and where it's efficient in managing multiple clusters through PRs, through GitOps, through single-file YAML-based changes, so it doesn't break all the clusters. What I didn't mention on this slide: each of the clusters has its own separate Argo CD application. So they act independently in the CD process. They reference the same code base, but they are independent, so the rollback is possible. So in conclusion, to evaluate what we did here: we have no duplication. Manifests are readable. Manifests are not confusing. The set of rules is fairly simple. It's nothing very complex or bulky. The CI/CD is very easy, and we can do static validation, we can do unit tests, we can do integration tests. All of that can be done fairly nicely. What are the downsides? We have boilerplate in the form of kustomizations, in the form of components, in the form of very nested path structures, directory structures and whatnot. Kustomize is not always very straightforward, so you need to learn the tool before you can use it. What also limits our static schema validation is that manifests in base can be partial, because they are not always complete, because we expect to patch them in those overlays to, for example, set a specific channel for our operator subscription and whatnot. So that's that. We have four organizations currently adopting this scheme and running this scheme. We have Operate First Community Cloud, New England Research Cloud, Massachusetts Open Cloud and Open Source Climate Alliance, all running on this pattern.
So this is a lesson that we learned through collaboration in cloud operations, and I hope we may be able to learn more such lessons in the future by exploring cloud together. So if you want to know more, you can join us in the Operate First Community Cloud. You can see our ADRs and how we got to those outcomes, and on the last link over here, you can actually see the code base that we are running against all of those clusters. Thank you very much.
Thank you for the talk. We use the same pattern, and for the manifest incompleteness, this is how we fix it. We adopted an approach where we define those attributes that are required with "kustomization overlay", so like a dummy value, and then you have completeness, and then you know that that particular value is valid YAML because it matches the spec fully, but you also know visually that that particular field will be patched in an overlay. So we use that as a solution for the manifest incompleteness and the static validation. We always use "kustomization overlay", and then we know that we are going to do that. That's just a solution that we... I don't know if there is a better way or a better word to use for that, but that's our approach.
We use the same, but it doesn't work in every case. In some cases, the schema is very detailed. It requires this complex nested structure. For example, cert-manager requires solvers, and if you define a solver, you can't remove it in a patch because it's a mapping, so strategic merges don't work that way in Kustomize. You would need a JSON patch, and you would need a long JSON patch, and it's becoming less and less clear in this regard.
I think another thing that we do is we have, for example, a common base like you, and then we have a non-production base and a production base, for example for the admin groups. So we have a group of admins for non-production, but we don't have a full group of admins in the base. We have what?
And then we add it from the non-production to the production in case we need one group or another. That's another approach that we... Thank you. I think I forgot. And then the last one. In this case, when you have a couple of bundles, maybe it's easy, but when you have a cluster with 12 or 15 bundles, it can be a little bloated, having a single Argo CD app managing all the applications of a single cluster. We use the approach that we have one for the cluster deployment with Hive, and then for each operator we have its own tree, so we have independent applications, and when, for example, an operator breaks, it doesn't break the entire Argo CD app of the cluster. It only breaks that Argo CD app. Or when we need to upgrade, we think it's safer, because you are really, really scoped, and you cannot break the entire cluster, just a single application.
Yeah, we do the same for operators which have specific deployments and whatnot. If we can deploy operators through a subscription to the OpenShift operator catalog, OperatorHub, we can do that through a single resource, and then it's not bloated that way. So, yes, same lesson: we faced the same issue, and we were solving it very similarly.
We work at Red Hat, but we did the same approach independently. In the front. Good, great, sounds great. We should talk after.
Yeah, hi, really nice talk. Thank you. Thank you. We build a lot of internal developer platforms, and we face the same issue where we kind of lose track of the code bases. Do you implement any repo scanning or file structure scanning that makes sure that this is enforced among your Kustomize charts? And, kind of a two-parter: do you just block all use of Helm charts? Because everything has a Helm chart nowadays, and it would be kind of limiting to have to rewrite something in this format if there's an existing Helm chart or existing Kustomize, or is this only for, you know, internal YAML? Thank you.
So, we enforce this only for resources that require elevated permissions. If you have a Helm chart that is deploying a custom resource definition, then we tell you this is not a good thing, you shouldn't do that. The API wouldn't allow you to do that, through our RBAC settings. So, we basically tell those people: you need to get that CRD into our repository, check it into our base for resources which require cluster admin or elevated permissions, because if we referenced it from somebody else, from some other repository, they could change it in their repository, and we don't want that. And we don't want them to be applying CRDs, because those are shared on the cluster. If two people on the same cluster are deploying the same Helm chart in different versions with different CRD schemas, they can fight, and we don't want that. So, that's why we want a single source of truth for all the resources that are cluster scoped or require elevated permissions. So, Helm charts are allowed for developer and application workloads in their own namespace, self-contained, or across all of their namespaces if they have more, but not under our watch with the elevated permissions. Thank you.
You said that when you have several clusters, you can limit what the developers or the users of the cluster can deploy, but how do you manage that? For example, we use Prow, so we have a ChatOps interface and we have ownership, so each environment has a set of owners, but we cannot limit. So, a developer can create a kustomization that adds a new namespace, and statically, we cannot limit what kind of resources are going to be created by the developers inside their cluster tree. How do you handle this?
So, if it's deployed from our overlays, we would know that, and if it's deployed from their own kustomization repository or whatever, they wouldn't have the permissions. To create specific resources? Yes. How do you manage that limitation?
So, maybe I don't understand the question, but if I have a developer who has access to a set of namespaces, they can deploy only to that set of namespaces, and if they onboard to our Argo CD to manage their application through our Argo CD, they have their specific Argo CD project, which also restricts the RBAC, so they won't be able to deploy to any cluster, just to the cluster that they have access to, and just to those namespaces they have access to.
Okay, so the cluster resources are only managed by the operations team. In our case, we have a mix, so the developers can create patches and edit parts of the tree of the clusters, so we don't know how to handle it so that they can only create a specific set of resources. We do that through validation, so they can review, but we need to approve and manually review that they are not creating, like, namespaces or operators or cluster roles or something like that.
Yeah, we limit that through a single code base, basically, through a single repository, yeah. But we also do this Prow with ChatOps and whatnot, ownership, and that's a great addition. Any more questions? Okay, then we call it a day. Thank you so much. |
Z Sovereign Cloud - Closing Remarks |
Thanks for sticking so long with us. There is still chocolate over here. Please empty this bowl, otherwise I have to eat it. We don't want that. I don't know, maybe you want this. |
Welcome to Testing and Automation devroom |
Good morning everyone, welcome to the Testing and Automation devroom. I'm going to be very quick and then hand over to Rémi. So we have one small ask from you: you can find this poll on the chair next to the door and here also on the desk. We want a little bit of feedback because we're trying to get a bigger room with more people, and it's still early, but we expect more people to come in later today. So if you're kind enough just to fill it in, it's anonymous, on paper only, and that would be awesome. And that's it, you know, let's start. |
Linux Kernel Functional Testing
A look at the infrastructure |
Welcome to this session about LKFT, the Linux Kernel Functional Testing project. My name is Rémi Duraffort, I'm a Principal Tech Lead at Linaro. I've been working on open source projects since 2007, and I've been the LAVA architect and main developer for eight years now, so quite some time. I will speak today about LKFT because it's a project I'm working on. So what is LKFT? The goal of LKFT is to improve the Linux kernel quality on the ARM architecture by performing regression testing and reporting on selected Linux kernel branches and the Android common kernel in real time. That's what is written on the website. It's a project that is led by Linaro. The goal is to build and test a set of Linux kernel trees. We care mainly about LTS trees, mainline and next. For LTS in particular, we have a 48-hour SLA, which means that we have to provide a full report in less than 48 hours for any change on LTS. If you look at the numbers for 2023, we tested 465 RCs. As we test mainline and next, we also built and tested 2,628 different commit versions, which means that we built 1.6 million kernels and ran 200 million tests in a year. That's only for Linux. If you look at the Android common kernel, only for the tests, that's 58 million tests, 580 million tests, so VTS and CTS mainly. And this is all done by only three people. So the question is how do we build that many kernels and test that many kernels with only three people? Obviously, automation. So my goal today is to show you the architecture of LKFT and also to show you the different tools that we created and maintain to make that possible, because I'm sure that you can go back home with some of these tools and they might be useful for you. So let's look at the architecture now. This is a really simple view. We have a set of trees in GitLab that are just simple mirrors of the official trees. We just use GitLab as a scheduling mechanism.
So it will pull the new changes and it will run a GitLab CI pipeline. But we don't do anything specific in the GitLab CI pipeline. We don't build or test inside it; it's too slow and costly. We just use it for submitting a plan to our system, which will do the build, test and reporting. And at the end, we just get a report that three engineers will look at and decide if we have to report something to the main developers, or if we can just find the commit ourselves and send a patch. Let's dig in a bit now. As I said, we don't use GitLab CI for building. From GitLab CI we only submit a build request to our system. For building, we created a tool called TuxMake. I will explain the different tools later on; I'm just showing the architecture right now. So we use a tool called TuxMake that allows for building the system with different combinations of options. And we created a software as a service that allows using TuxMake at a large scale in the cloud. So we can build something like 5,000 kernels in parallel in the cloud in some minutes. When a TuxMake build finishes, the artifacts are sent to a storage; it's an S3-like bucket somewhere. And a result is sent to SQUAD, which is a second project that we also maintain. That would be the data lake where everything is stored. As we send results really early, if there is a build failure, a build regression, you will notice that in some minutes or hours, depending on how long the build takes. Because, for example, if you do an allmodconfig build with Clang, it will easily take up to one or two hours. But this way we can catch regressions early and send them immediately to the mailing list, saying that it's failing to build on this architecture for this toolchain. That's for building. I will explain TuxMake a bit later on.
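The "different combinations of options" that get submitted as a build plan can be pictured as a simple matrix. The sketch below is an assumption about the shape of such a plan, not LKFT's actual plan format; the architecture, toolchain and kconfig values are only examples.

```python
from itertools import product

# Example values only; the real plan covers many more combinations.
ARCHES = ["arm64", "arm", "x86_64"]
TOOLCHAINS = ["gcc-12", "clang-15"]
KCONFIGS = ["defconfig", "allmodconfig"]


def build_plan(arches, toolchains, kconfigs):
    """Expand the option lists into one build request per combination."""
    return [
        {"target_arch": arch, "toolchain": toolchain, "kconfig": kconfig}
        for arch, toolchain, kconfig in product(arches, toolchains, kconfigs)
    ]


plan = build_plan(ARCHES, TOOLCHAINS, KCONFIGS)
print(len(plan))  # 3 * 2 * 2 = 12 build requests
print(plan[0])
```

Because each request is independent, the builds can run in parallel and each result can be reported as soon as its own build finishes.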
As I said, when a TuxMake build finishes, we send the result to SQUAD, we store the artifacts in the storage, and we also submit multiple test runs. We test both in the cloud and on physical devices. For the cloud, we have a project called TuxRun that allows testing on virtual devices, so QEMU and FVP. And again, we have a system that allows us to scale the TuxRun processes in the cloud, so you can spawn thousands of TuxRun processes in parallel. They also send their results to SQUAD.
Testing in virtualization is nice. You find a lot of bugs because you can test a lot of different combinations. But that's not enough, so we have to test on real devices. That's where a second piece of software comes in, LAVA, which allows testing on real devices. In the same way, when TuxMake finishes a build, it submits a set of test requests to LAVA that will run on real hardware in this case. Obviously, we run fewer tests on real devices than on virtual devices because we don't have enough boards. That's always the one thing you're missing. The results are likewise sent to SQUAD, and when everything is finished, we have a full report that we can provide to the developers, saying that we ran thousands of tests and thousands of builds, and either everything is working or we found some regressions. That's the overall architecture.
I will now look at the different projects so you can see if something can be useful for you. Let's look at the build part. As I said before, we use TuxMake. It's a project that we created to make building easy and reproducible. It's an open source command line application. It allows for portable and repeatable Linux kernel builds. For that, we use containers. We provide a set of containers with all the tools you need inside, and everything is done inside a container, so it can be reproducible from one machine to another.
That's often a problem when you report a build failure: it's always a nightmare to know the exact toolchain that was used and everything else. As everything is inside a container, you can just reproduce it on another machine. We support multiple toolchains: GCC from 8 to 12, Clang from 10 to 15. In fact, 16 has been added this week. We also have a Clang Android version and Clang nightly. Clang nightly is specific because we rebuild the nightly Clang toolchain every night and push it to our system, so we can test with the latest Clang. We also support multiple target architectures: all the Arm versions, Intel and AMD, then MIPS, PowerPC, RISC-V, and some exotic ones like s390, SH4, things like that.
Building is really simple. You just specify the target architecture, x86_64 in this case. You specify the toolchain, so I want to use GCC 12. You just need to have TuxMake installed on your computer, because everything is then done inside a container where you have the GCC 12 toolchain for x86_64. If you want to build with GCC 13, just change the toolchain to GCC 13 and it will use another container to build it. As I said before, we have private software that allows running TuxMake at a large scale in the cloud, but I'm not presenting that; it's closed-source software.
Just to explain how it works: TuxMake will pull the right container for you. For this specific target architecture and toolchain couple, it will be the x86_64 GCC 12 container. We have hundreds of containers. It will create a unique build directory, so it's reproducible from one build to another. Then we just start a Podman container, jump into it, and build. We advise using Podman rather than Docker because it gives you a rootless container, so at least you don't run your build as root. And then it will invoke a set of different make commands depending on what you want to build.
And then it will move everything to a specific directory that will be kept on the machine. You will have all the artifacts: kernel, headers, et cetera. You also have a metadata.json file that includes a lot of metadata about your build, like the version of your toolchain and of different utilities on the machine, the time taken by the different steps, the size of everything, et cetera. It's also useful for debugging what's going on if something breaks. And yes, we provide multiple containers that you can reuse. It's an open source project, so you can contribute to it if you want, and you can just use it right now. Some kernel developers use it for reproducing builds and build failures. In fact, as I said, we have a Clang nightly toolchain that is rebuilt nightly. That's because the Clang project asked us to do it: they use TuxMake with Clang nightly for validating their Clang version against different kernel versions, to see if Clang is regressing. That's for building.
So now, how do we test? As I said, we test on virtual devices with TuxRun and on physical devices with LAVA. TuxRun is similar: it's an open source command line application. It's the same as TuxMake, but for running. It allows for portable and repeatable kernel tests. We support multiple devices: FVP AEMvA, which is an Armv9.3 emulator, a simulator. That's the latest version that you can try for Arm. And then multiple Arm versions with multiple QEMU devices: many Arm, Intel and MIPS devices in many different versions, PPC, et cetera. And multiple test suites like LTP, KUnit, kselftest, et cetera. Adding one is not quite easy to do. Again, the command line is quite simple. We also use Podman for containerizing everything. You specify the device that you want to use and the kernel that you want. It can be a URL, obviously, and a root file system also if you want. And again, we have a SaaS that allows running that at large scale in the cloud.
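To give an idea of what the two command lines described above look like, here is a small sketch that assembles them as argument lists. The option names follow the TuxMake and TuxRun command line options as far as I know them; check `tuxmake --help` and `tuxrun --help` for your installed versions. The helper functions and the kernel URL are made up for illustration.

```python
import shlex


def tuxmake_argv(target_arch, toolchain, runtime="podman"):
    """Argument list for a containerized TuxMake build (helper is hypothetical)."""
    return [
        "tuxmake",
        "--runtime", runtime,
        "--target-arch", target_arch,
        "--toolchain", toolchain,
    ]


def tuxrun_argv(device, kernel, rootfs=None, test=None, runtime="podman"):
    """Argument list for a TuxRun test run; rootfs and test are optional."""
    argv = ["tuxrun", "--runtime", runtime, "--device", device, "--kernel", kernel]
    if rootfs is not None:
        argv += ["--rootfs", rootfs]
    if test is not None:
        argv += ["--tests", test]
    return argv


build_cmd = tuxmake_argv("x86_64", "gcc-12")
print(shlex.join(build_cmd))
# tuxmake --runtime podman --target-arch x86_64 --toolchain gcc-12

# The kernel URL below is a placeholder, not a real artifact location.
run_cmd = tuxrun_argv(
    device="qemu-arm64",
    kernel="https://storage.example/Image.gz",
    test="ltp-smoke",
)
print(shlex.join(run_cmd))
```

Switching to another compiler is then just a matter of passing a different toolchain string, which makes the tool pull a different container.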
When you call that command line, TuxRun will download all the artifacts that you need: kernel, DTB, root file system, modules. It will inject the modules inside the root file system for you, so that they can be used at the right time. Then it starts the container, starts qemu-system, aarch64 in this case, looks at the output, et cetera, all the classical things, and stores the results. As I said, we provide a lot of root file systems because we know it's painful to build your root file system for multiple architectures. So we do that work. We use Buildroot and Debian. Buildroot allows us to cover the 19 supported architectures, with one root file system for each. And for the main ones, those supported by Debian, we also provide a Debian root file system that we build. Obviously, if you build your own, you can use it if you want. We do the job of rebuilding the Buildroot and Debian file systems regularly. And a fun thing: we actually find bugs in QEMU this way. Before pushing the new file systems, we test them in our system. The last time we did that, we found issues in QEMU 7.2 that are currently being fixed by the QEMU developers.
Something fun: because TuxMake and TuxRun are made by the same team, we did the work to combine the two tools. Bisecting a build failure is quite easy; you just need a lot of CPU time. The same goes for a runtime issue, where you find a regression in which a test fails on a specific architecture. For example, when you run an LTP test suite on QEMU arm64, it's failing, and you want to bisect that. You have a good commit and a bad commit, and you want to find the faulty commit. Git helps you with that. But thanks to TuxMake and TuxRun, we can automate the whole job of testing. With this command line, Git will call TuxMake on different commits to try to find the faulty one. And TuxMake will just build.
And at the end of the build, thanks to --results-hook, it will exec the command given after it, which runs TuxRun with the kernel that has just been built. So it builds with TuxMake and, at the end, runs with TuxRun the exact LTP test suite that fails. If it passes, it returns zero. If it fails, it returns one. Based on that, Git is able to find the faulty commit for you. We find a lot of build regressions and test regressions, and we find the faulty commit thanks to just that command line, which is really cool. Thanks to Anders for the idea.
So that was all virtual: building in containers and testing on virtual devices. But as I said before, we have to test on physical devices, because many bugs are only found on physical devices, since they are caused by failing drivers and things like that. For that, we use LAVA, like some people in this room. LAVA stands for Linaro Automated Validation Architecture. It's a test execution system. It allows testing software on real hardware automatically: it will automatically deploy, boot, and test your software on your hardware. It's used a lot by KernelCI, and by LKFT, obviously. You can do system-level testing and boot-level testing. You can also do bootloader testing: you can directly test your bootloader and the firmware. It currently supports 356 different device types, from IoT devices to phones, Raspberry Pi-like boards, and servers.
For example, if you want to test on a Raspberry Pi without LAVA, you have to power on the board, download the artifacts (kernel, rootfs, DTBs), place them in a specific directory like an NFS or TFTP directory, connect to the serial console, type a lot of commands, boot the board, watch the boot output, type at the login prompt, et cetera. It's really painful to do that manually.
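Stepping back to the automated bisection described a moment ago: what makes it work is the exit-code contract, where zero means the commit is good and non-zero means it is bad. The sketch below is a simplified model of that search, not Git's implementation. It assumes a linear history, a known good first commit and a known bad last commit; the `fake_build_and_test` function is a stand-in for the real build-then-test step.

```python
def first_bad_commit(commits, test_exit_code):
    """Binary search for the first commit whose test exits non-zero.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. `test_exit_code(commit)` returns 0 for a
    passing commit and non-zero for a failing one.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if test_exit_code(commits[mid]) == 0:
            good = mid   # regression was introduced after mid
        else:
            bad = mid    # regression was introduced at or before mid
    return commits[bad]


# Simulated history: the regression is introduced in commit "c6".
commits = [f"c{i}" for i in range(10)]


def fake_build_and_test(commit):
    """Stand-in for 'build the kernel, then run the failing test suite'."""
    return 0 if int(commit[1:]) < 6 else 1


print(first_bad_commit(commits, fake_build_and_test))  # c6
```

With ten commits, only three build-and-test cycles are needed here, which matters when each cycle means building and booting a kernel.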
LAVA will do exactly what I just listed, automatically for you. You just provide a job definition, which is a YAML file, with links to all the artifacts that you want to test. You specify the kind of board that you have, so it's a Raspberry Pi 4B, and LAVA will then know how to interact with that board. And you say that you have U-Boot installed on it and that you have a TFTP server: just use that and test what I want to test on it. LAVA will do that automatically for you. Obviously, you can have multiple boards attached to the same worker, and you can have multiple workers on a LAVA instance. So for a user, it's really an abstraction of the hardware: you just send a YAML file and you get results, and all the hardware part is done automatically by LAVA.
Maybe you remember the first LKFT diagram. I'm sure you don't. There was a small box called KissCache. When we submit jobs to LAVA, we submit multiple jobs for the same artifacts at the same time. We have multiple devices, so the scheduler will start the jobs for the same artifacts all at the same time. That means the same artifact is downloaded multiple times at the same time, so we should be able to cache it and decrease network usage. We tried Squid, and the short answer is that Squid does not work for that use case, for different reasons. The first one is that, as I said before, all the artifacts are stored in an S3-like bucket, so somewhere on the internet. Obviously we use SSL, HTTPS, to download them, and Squid and HTTPS do not really work well together. You have to fake SSL certificates; it's all creepy things to do. The second is that, as I said, LAVA will start all the jobs at the same time, so they will more or less download all the same artifacts at exactly the same time. And if you ask Squid for the same file n times and it's not already cached, Squid will download it n times.
Only when one download is finished will the next request use the cached version. So it's just pointless for us; it just doesn't work. So we created a tool called KissCache. KISS stands for keep it simple, stupid: it's a simple and stupid caching server. It's not a proxy, it's a service, which means that it can handle HTTPS, it will download only once when you have multiple clients, and it will stream to the clients while downloading. It's not transparent because it's not a proxy, and because it's not transparent it can do HTTPS: you have to prefix your URL with the KissCache instance that you have, and you talk to KissCache directly. It also automatically retries on failures, because we've seen many failures; the range of HTTP codes that you can get when you make requests to an S3-like bucket is just insane. Sometimes also the connection will finish as if everything was done correctly when in fact the file is not complete: it's a partial download, and you don't get any error. KissCache will detect that for you. It will detect that it's a partial download, and it will retry and download only the remaining part. This is fully transparent for the user: it does that in the background and still streams your data to you.
We've been using it for 2.5 years now. In the graph, green is what we served locally from KissCache, and red is what we downloaded from the internet. We downloaded 25 terabytes of data from the internet, and we served 1.3 petabytes of data on the local network, which is a 52 times expansion ratio. So it's quite useful, and it also improves stability. It's really cool. It's a good tool for your CI if you don't use it already.
And last but not least, we store all the job results in SQUAD, the Software Quality Dashboard. It's a data lake.
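Going back to the key behaviour of the cache described above, namely downloading an artifact only once even when many clients ask for it at the same moment, here is a minimal sketch of that idea. It is not KissCache's code: the `SingleFlightCache` class is hypothetical, it keeps results in memory, and it makes waiting clients block until the download completes instead of streaming to them while downloading.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class SingleFlightCache:
    """Share one upstream fetch among concurrent requests for the same URL."""

    def __init__(self, fetch):
        self._fetch = fetch        # function(url) -> bytes
        self._lock = threading.Lock()
        self._entries = {}         # url -> Future holding the content

    def get(self, url):
        with self._lock:
            future = self._entries.get(url)
            owner = future is None
            if owner:
                # First request for this URL: this caller does the download.
                future = Future()
                self._entries[url] = future
        if owner:
            try:
                future.set_result(self._fetch(url))
            except Exception as exc:
                with self._lock:
                    self._entries.pop(url, None)  # allow a later retry
                future.set_exception(exc)
        # Every caller, owner or not, reads the shared result.
        return future.result()


calls = []


def slow_fetch(url):
    """Stand-in for a slow download from remote storage."""
    calls.append(url)
    time.sleep(0.05)
    return b"kernel-image-bytes"


cache = SingleFlightCache(slow_fetch)
url = "https://storage.example/Image"  # placeholder URL
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(cache.get, [url] * 8))

print(len(calls), len(results))  # 1 8
```

Eight simultaneous requests result in a single upstream download. The 52 times ratio quoted above is the same effect at scale: 1.3 petabytes served locally divided by 25 terabytes downloaded is 1300 / 25 = 52.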
It will store all the results for you in different categories, and it allows you to create reports: failures, regressions, et cetera. Everything is stored in it, and then we extract data and make reports based on SQUAD. And that's all. That's what I just explained. If you have any questions, I have some time for questions. Five minutes. Perfect. Oh, yeah, that's good.
Testing methods? We use LTP, KUnit, kselftest, all the kernel test suites. We are not creating new test suites; we are using test suites that already exist. We build for the community, we test for the community, and then we provide reports. We obviously interact a lot with the test suite maintainers, because we find bugs in the test suites too. We have to report those to them, and we report a lot. One of our projects is to test kselftest in advance, testing kselftest master, to find bugs in kselftest before it actually runs in production.
If you find any problems and report them, are kernel developers actually looking at them, or do you have to ping them and make sure they take care of the problem?
Okay, so we have an SLA with Greg Kroah-Hartman, so he's waiting for our results. So they will look at it for LTS. And for mainline and next, they are used to our reports. We report a lot of issues, so they know us. If you look at the LWN articles that classify the different contributions to the kernel, Linaro is near the top in Tested-by tags, so they know us well and they know that we provide good results. When we send a mail, it contains everything they need to reproduce the problem. If they are reproducing a build, we provide all the binaries that they need for reproducing it. If it's a build failure, we provide a TuxMake command line that they can use, and they are now used to using TuxMake for rebuilding things.
And if it's a test failure, we provide the logs, obviously, the job definition, and all the binaries they need for reproducing it.
Do you actually check that every problem you found is actually fixed? And are all the bugs that you found fixed?
Not all of them. If you find some bugs on SH4, no one will care, for example. QEMU 7.2 has been released recently and it's just not working on SH4.
I couldn't answer that. We use AWS. No, it's not that bad. We built a dynamic system, which means that we do not rent 5,000 machines in parallel. Obviously not. It's just impossible for us. We are a small company. Everything is dynamic, so from one second to another, if you look at the graph of usage, when Anders submits a plan for testing, within one minute we'll book 5,000 machines for building it, or more likely 1,500 machines. They will build and then just stop at the end. So no, we don't have 5,000 machines.
How many devices do you have in your LAVA test rig?
For LKFT, we have multiple LAVA instances in Linaro. In LKFT, how many devices? About 20. 20, yeah. And about 5 different device types, like Raspberry Pis, DragonBoards, Junos, x86, X15. But yes, you can have really large labs in LAVA. We have another one just for Linaro usage, where we have something like 100 boards, I think, in the main one. Thanks. Thank you. |
Growing a lab for automated upstream testing: challenges and lessons learned |
Okay, hello everyone, thanks for being here so early on a Sunday morning. My name is Laura, I work at Collabora, and today I'd like to share with you a kind of war story about how we built and grew our laboratory for upstream testing. I'm going to tell you a little bit about our infrastructure as well as some of the challenges that we had to face while scaling up.
Our main goal was to build a big test bed for open source projects to use. So of course we need a diverse ecosystem of devices: many different devices, of different architectures and from different vendors. We need software to automate the tests on the actual devices. We need a monitoring system, a way to monitor and assess the health of the devices that we have in the lab. And we also need some recovery strategies: when devices start to misbehave or don't behave as expected, we need some way of recovering them automatically, or of putting them offline automatically if they're not reliable enough to run tests.
It all starts with a commit. The developer pushes changes to a development branch, the artifacts for the tests are all built automatically, a test job is submitted and run, the results are gathered and parsed, and finally a report is generated and sent back to the developer. From the lab perspective, we're interested in the part that runs the test jobs and makes the results available, and what we chose for our lab is LAVA, as we saw earlier: the Linaro Automated Validation Architecture. It automates the deploy and boot phases of the operating system on the device. It has a really scalable scheduler: it can run thousands of jobs on hundreds of devices on a single instance, which is really convenient for big labs. It handles the power on the devices, so it switches the power on and off when needed, and it also helps with monitoring the serial output.
And finally it also makes the results of the tests available in many different formats, which is again pretty convenient. LAVA just takes care of this part of the CI loop, while all the other phases need to be implemented with different tools.
In order to run devices in LAVA we need to fulfill a set of base requirements. We need to be able to turn the power on and off on the devices remotely, we need remote access to a reliable serial console, and finally we need some way of remotely booting an arbitrary combination of kernel, device tree and root filesystem. For all the devices that we have in the lab we rely on TFTP, so we need network connectivity at the bootloader level, and that means that we often have to build our own custom bootloaders and enable all the features that we need for debugging. So there are a few steps to prepare the devices before they enter the lab.
As for the configuration of the devices in LAVA, you only need to define some Jinja2 and YAML templates. The device type template defines the characteristics of the device type, for example which kind of bootloader runs on a certain device, or which command line options are needed for booting. The device dictionary defines device-specific characteristics, for example what command we need to run to turn the power on and off or to access the serial console. And finally we have the health check, which is a special kind of job associated with each device type. The aim of the health check is to assess the health status of each device, and it's supposed to be run on a fairly regular basis; we run a health check on every device that we have in the lab every day. Examples of tests that you can fit in a health check are a battery test, a check of the temperature of the device to make sure it's not overheating, or a check of the network connectivity. Basically, all the tests that you need to make sure that
the device is functional can go in the health check. Whenever a device fails its health check, LAVA automatically puts it offline, so it's really useful for taking out of service all the devices that are not reliable at the moment.
Collabora maintains a laboratory running LAVA, and as of a couple of days ago we have 217 devices of 38 different types, spread across 16 racks. Each rack is controlled by its own server, and that's also where the LAVA dispatcher runs. Besides all the devices, we also have a bunch of hardware equipment that we need to automate the boot and test phases on our devices. This is what the device distribution looked like in January. The vast majority of our devices are x86_64 and arm64 platforms, and we also have some QEMU instances that are mainly used by KernelCI. The vast majority of our devices are actually Chromebook laptops, but we also have some embedded SBC devices as well.
So what kind of hardware do we have in the lab? Different devices usually have different requirements. For embedded SBCs, what we use to control the power remotely are Ethernet-controlled relays and PDUs; I left there some examples of the actual models that we currently have in the lab. Chromebooks are kind of a different beast. They have their own hardware debug interfaces, which are the Servo v4 and the SuzyQ cables. Servo v4 allows you to control the power on the device and to access the serial consoles on the device, and it also provides network connectivity through an Ethernet port. So everything you need to automate the boot and test phases on a Chromebook fits inside just one hardware box. As an alternative you also have SuzyQ cables, which have pretty much the same functionality except for the network connectivity, which you usually have to provide through a USB-to-Ethernet adapter.
We have a couple of servers as well in the lab, and for those we use the standard IPMI protocol to control the power and access the serial consoles. For all the devices, of course, we need a bunch of USB cables, with their fragilities, and we also use USB hubs. We find switchable hubs such as the YKUSH especially useful, particularly for those devices that are controlled by just one USB connection, such as the Chromebooks. That's really convenient so that you don't have to manually intervene every time you need to re-plug the USB connection.
As for the software, I left a couple of links here if you want to check them out. We use PDUDaemon to execute commands on the PDUs, and we use conserver to access the serial consoles and monitor the output. The hdctools are just for the Chromebooks: these are the software tools that allow you to interact with the Servo v4 and with the SuzyQ cable, to control the power and serial on the Chromebooks. For the interaction with LAVA, we use lavacli, a command line interface that is useful to run tests on the devices and also to configure and push the templates. Finally we also have a LAVA GitLab runner that serves as a bridge between GitLab and LAVA. That's pretty much it for the software side.
In our lab we have two major users. One is KernelCI, which is focused on continuous testing of the Linux kernel. It's not only boot tests; there are a bunch of other test suites running as well. The type of testing that KernelCI does is post-merge, so changes are tested after they have landed on a set of monitored trees. After the tests have run, KernelCI generates build reports as well as regression reports for every regression that is found. The other major player in our lab is Mesa CI, that's the CI for Mesa 3D, and it does conformance testing and also performance tracking.
There are a bunch of test suites that are currently run by Mesa CI; I left the list here. A bunch of APIs and drivers are tested, and while KernelCI only does post-merge testing, Mesa CI also does pre-merge conformance tests, so that's a little bit of both. In this diagram you can see the usage of KernelCI and Mesa CI in our laboratory. As you can see, both projects keep our lab pretty busy. KernelCI uses almost all the architectures that we have in the lab, while Mesa CI is focused more on x86_64 and arm64.
With so many jobs running every day in our lab, the impact of any error or unreliability in the infrastructure can be quite big. For pre-merge tests, merge requests from users can get blocked indefinitely if a certain device type is not available, and there's also a risk of merge requests getting rejected if there are many errors in the lab. So what we need to make sure of from the lab perspective is that merge requests get rejected only because the changes introduced made the tests fail, and not because of an infrastructure error. For post-merge tests we have a risk of reporting false regressions. In this case we want to make sure, again, that infrastructure errors are reported as such by LAVA. LAVA defines different types of exceptions that can be raised based on the type of error that occurs; we just need to make sure that the devices and LAVA itself are configured properly to do so.
This is just a minimal list of the common issues that we have seen over the years. There can be hardware degradation: you can have faulty cables at any time, batteries failing, power chargers not working properly. All kinds of network issues can happen at any time, and they can have quite a big impact. We also saw some issues related to the rack setup; for example, we had some laptops where the lid was slightly too closed because of how it was set up in the rack, and it was causing the device to
enter suspend unexpectedly. So there are all kinds of different errors that can happen. We can have firmware bugs, either in the firmware running on the actual devices or in the firmware running on the hardware debug interface. That's a lot of errors that can happen.
I gathered a few of my favorite pitfalls. These are tricky issues that we have found recently, and we're still dealing with some of them. One of the things that we saw is that sometimes the serial console will just stop outputting anything. If this happens during the test phase, it's hard to tell in an automated way whether the kernel is hanging, whether your USB cable connection has dropped, or whether it's just an unreliable serial connection. That's usually a tricky one to deal with. Another serial-related issue is caused by interference. Not all devices have multiple UART connections for debug; most of the devices in our lab don't. So we have to share the same serial connection between the kernel and the test shell, and this can sometimes cause interference, which of course confuses LAVA about the outcome. We are thinking about several solutions for this kind of serial issue, and one approach that we are looking into is using a Docker container: running a Docker container that connects to the device over SSH and runs the tests in the SSH session. This way we can probably work around some of these serial issues.
As I said, there are also network connectivity issues from time to time. If the network drops during the bootloader phase, that's usually something we can easily catch, because LAVA monitors the serial output, and if our bootloader is nice enough to print error messages we can catch the right patterns at the right time and raise an infrastructure error, so that it won't affect the outcome of the test.
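The idea of catching error patterns in the bootloader output can be sketched in a few lines. This is only an illustration of the principle and does not use LAVA's own API or exception classes; the pattern list is an example, and the messages a real bootloader prints depend on the bootloader and its version.

```python
import re

# Example patterns only; adapt them to what your bootloader actually prints.
INFRA_ERROR_PATTERNS = [
    re.compile(r"TFTP error", re.IGNORECASE),
    re.compile(r"Retry count exceeded", re.IGNORECASE),
]


def classify_boot_log(lines):
    """Return ("infrastructure_error", line) for the first matching line.

    Returns ("ok", None) when no known infrastructure pattern is found,
    in which case a later failure should count against the change under test.
    """
    for line in lines:
        for pattern in INFRA_ERROR_PATTERNS:
            if pattern.search(line):
                return "infrastructure_error", line
    return "ok", None


# Made-up serial log where the network drops during the TFTP transfer.
boot_log = [
    "Using ethernet device",
    "Loading: T T T T T",
    "Retry count exceeded; starting again",
]
status, evidence = classify_boot_log(boot_log)
print(status)  # infrastructure_error
```

Keeping the two outcomes separate is what prevents a dropped network link from being reported as a regression in the code being tested.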
When this happens we can also configure LAVA to retry the job if needed, so it's useful to catch error patterns. If the network decides to drop during the test phase, that's usually worse, especially for devices that rely on a network file system, and it's usually pretty hard to recover from. We have also seen occasional USB disconnections for whatever reason, and again it's usually hard to recover from this kind of issue.
These are some of the best practices we came up with while working on these issues. The first one is about writing robust health checks. As I said, devices are put offline by LAVA if the health check fails, so we need to make sure that the health checks catch as many issues as possible automatically. We found it very useful to monitor the LAVA infrastructure error exceptions, mainly to catch issues with specific racks or specific device types. We usually also try to monitor the devices' health and the job queue, to make sure that we have enough devices of a certain device type to feed all the pipelines for the projects, so that if some devices of a certain type go offline and we have redundancy, we're able to recover from that. And last but not least, as I said, one best practice is to try to isolate the test shell output from the kernel messages whenever possible. Where that's not possible, we're trying to work around some of these issues.
The next steps for our lab are of course to increase the lab capacity and to try to cover even more platforms and different vendors as soon as they come out. While doing this we are continuing to improve our infrastructure and monitoring tools.
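As an illustration of the monitoring practice mentioned above, here is a minimal sketch that groups finished jobs by rack or device type and flags the groups with a high share of infrastructure errors. The job records, their field names and the threshold are all assumptions made for this example; they do not reflect LAVA's actual data model or the lab's real tooling.

```python
from collections import Counter


def infra_error_rates(jobs, key):
    """Fraction of jobs that ended in an infrastructure error, per group."""
    total, errors = Counter(), Counter()
    for job in jobs:
        group = job[key]
        total[group] += 1
        if job.get("error_type") == "Infrastructure":
            errors[group] += 1
    return {group: errors[group] / total[group] for group in total}


def flag_groups(rates, threshold=0.2):
    """Groups whose infrastructure error rate exceeds the threshold."""
    return sorted(group for group, rate in rates.items() if rate > threshold)


# Hypothetical job records.
jobs = [
    {"rack": "rack-03", "device_type": "chromebook-a", "error_type": "Infrastructure"},
    {"rack": "rack-03", "device_type": "chromebook-a", "error_type": "Infrastructure"},
    {"rack": "rack-03", "device_type": "sbc-b", "error_type": None},
    {"rack": "rack-07", "device_type": "chromebook-a", "error_type": None},
    {"rack": "rack-07", "device_type": "sbc-b", "error_type": None},
    {"rack": "rack-07", "device_type": "sbc-b", "error_type": "Test"},
]

by_rack = infra_error_rates(jobs, "rack")
print(flag_groups(by_rack))  # ['rack-03']
```

Running the same aggregation with `key="device_type"` helps separate a problem with one rack (a cable, a hub, a power unit) from a problem with one kind of device.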
I haven't included in this presentation how we actually monitor things, but LAVA has some APIs that you can use to monitor the status of each device and also of the server. And of course, while we keep adding new devices to the lab, we also want to increase the coverage of test suites, so we're working on adding even more test suites to KernelCI and Mesa CI as well. And that's it. If you have any questions, I think I have time, right?
Pretty often, I'd say. I don't have data at hand with the actual failures, but it happens pretty often. I mean, we have so many jobs running every day and we rely heavily on USB, which is not great; it breaks pretty often. I'd say the most common issues that we have are usually due to the serial consoles not being very reliable. The vast majority of devices that we have are Chromebooks, and we're using these hardware debug interfaces that were meant for debugging, so sometimes the serial connection is not great and that causes all kinds of issues. So we try at least to retry the jobs when possible and to catch the infrastructure errors as they come up.
I'd say we don't have to manually intervene every day. It's more like every couple of days that we need to re-plug some of the devices. As I said, we use switchable hubs to try to avoid having to reset the connection manually, but we don't have this setup on each and every device that we have. We're working on it. So I'd say at least a couple of times a week; I haven't really checked the frequency. But of course there are people in the lab actually taking care of all the devices.
From the lab perspective we don't really care about which test suites are running; that's more the responsibility of KernelCI and Mesa CI. You can check out the links that I left; everything is of course open source, so you can check out all the test suites and how they work. For some tests you just need a RAM disk; other tests, the heavier ones, rely on a network file system. I mean, it depends on the type of tests that you need to run.
We use a lot of Chromebooks because we need to test on Chromebooks, and you cannot really emulate one. Thank you. |
Introducing Vegvisir: An automation framework for testing QUIC application logic
Who said using QUIC was easy? |
Alright, so good morning everybody. My name is Joris and I'm a PhD student over at Hasselt University here in Belgium, and I'm doing a PhD on multimedia streaming and network transport layer protocols, or even better, the intersection of those two. Today I'm here to talk a little bit about a project we did called Vegvisir, which is an automated testing framework for orchestrating client and server setups using the QUIC transport layer protocol. But before we jump into that, let's talk a little bit about what QUIC actually is, because I assume not everybody has heard about it. QUIC is a general-purpose transport layer protocol that was standardized by the IETF in May 2021, and if you have any updated applications on your phone, or have been using the latest releases of browsers such as Firefox, Chrome or whatever you're using, you've probably already been using QUIC, as it has been deployed to many different applications and websites already. For example, Facebook and Instagram are using it, and if you're streaming videos over YouTube you have probably already been using QUIC. QUIC is a name, it's not an acronym. It used to stand for Quick UDP Internet Connections, but it has not been called that for quite some time now. As for its features, the biggest one is encryption: the protocol encrypts everything by default, which is great because that's the main defence against ossification. That is also the reason it was created, because TCP is an ossified protocol when we compare the two. It's also great for preventing third parties from interfering with the data you are transmitting over the network. It's less great for research and development, as you have to account for that in your test setups, which we are going to talk about a little bit further ahead. Currently most implementations run on top of UDP in user space.
Some implementations are looking at moving it more towards the kernel, but those steps have not been taken by many of the implementations. At present, at least as far as I know, 25 implementations exist, most of them open source and written in multiple programming languages. They also provide libraries which you can directly use to add QUIC to your applications. Another new thing with QUIC is HTTP/3, which you might have heard of. The reason for the introduction of H3 is that H1 and H2 only run over TCP; that's why they created a new version of HTTP called HTTP/3. There's not that big of a difference between H2 and H3 in practice, but just for the sake of naming it, it's HTTP/3. Right, so now that you know what QUIC is, that's at least a requirement for understanding this talk. Let's talk about how we can actually use this in experiments and stuff. Maybe let's try doing something very simple. I just told you that most browsers implement QUIC. Let's try connecting to a website that only runs an HTTP/3 server. Should be simple, right? But as you can see on the screenshot, it is not in practice. And the reason for that is really simple: browsers decided early on that an HTTP/3 server should be discoverable through the Alt-Svc header provided by an H1 or H2 deployment, which really sucks if you want to do some automated testing, because that means you also have to spin up an H1 or H2 server and account for this. Luckily we have some options, for example within Chrome and Firefox, to force QUIC on certain domains, which we can automate by means of parameters supplied on the command line or by configurations in the browser itself. Right, so we can connect to a web server at this point. How do we actually know what's happening under the hood? Should also be simple, right?
Remember, everything is encrypted, so actually seeing what's happening is not that trivial. Luckily most implementations nowadays use standard off-the-shelf TLS libraries, and these TLS back ends support an environment variable called SSLKEYLOGFILE. The idea behind SSLKEYLOGFILE is that you point it towards a file, which then gets used by these TLS back ends to output all the secrets used for encryption during whatever the application is doing. If you load those key log files into programs like Wireshark you can decrypt the traffic, which is nice if you want to see what happened. Unfortunately tools like Wireshark, at least as far as I know, don't have any visualizations of what's happening at the congestion or flow control layer. You have that for TCP, but for QUIC those things don't exist yet. But luckily we have other stuff for that. qlog is one of them. There has been a nice talk about this by its inventor a couple of years ago at FOSDEM. I really invite you to look at it. In a nutshell, qlog is a structured and unified way of logging that can be implemented by any endpoint implementation using QUIC. It is basically a file, for example a JSON file, that just logs everything that's happening, and if you have some scripts or tools that can parse this you can do a lot of fun stuff with it. For example the qvis visualization tool, which is by the same creator, allows you to load these files and visualize what's happening on the congestion layers, similar to what Wireshark can do for TCP, but then for QUIC. For example, on the left you see a flow control and congestion graph, and on the right you see a plot of the round-trip time that was experienced by the application.
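To make the key log idea a bit more concrete, here is a minimal sketch in Python of reading such a file. It assumes the common NSS-style layout of one "LABEL client_random secret" triple per line; the sample values are made up for illustration and are not real secrets, and the function name parse_keylog is hypothetical.

```python
from collections import defaultdict


def parse_keylog(text):
    """Group the secrets in an NSS-style key log by client random."""
    sessions = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "LABEL random secret"
        label, client_random, secret = parts
        sessions[client_random][label] = secret
    return dict(sessions)


# Illustrative content only; a real file holds long hex strings.
SAMPLE = """\
# made-up values for illustration
CLIENT_HANDSHAKE_TRAFFIC_SECRET aabb 0011
SERVER_HANDSHAKE_TRAFFIC_SECRET aabb 2233
CLIENT_TRAFFIC_SECRET_0 aabb 4455
"""

sessions = parse_keylog(SAMPLE)
for client_random, secrets in sessions.items():
    print(client_random, sorted(secrets))
```

A tool such as Wireshark does essentially this lookup itself when it is pointed at the key log file, so a script like this is mostly useful for quick sanity checks, for example confirming that a test run actually produced secrets.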
So we can look at what's happening under the hood. Let's try something more advanced: setting up your own QUIC client and QUIC server to do some local experiments, maybe changing something in the implementations, it doesn't really matter what you want to do. Even that is not that trivial, simply because there are many implementations written in many languages, meaning that they have their own requirements and their own installation procedures. Another distinction is that different implementations have different performance characteristics, meaning that some are more tuned towards certain scenarios and some only support a certain feature set, so you also have to account for that. An additional requirement is that you have to set up self-signed certificates, and for some reason some implementations accept all kinds of certificates and others fail. We have never really figured out why; we just use a common way that works for them all now. Another quirk that you can experience is within the code bases themselves. A fun one I always show to my students is this one. This is from the quic-go code base, which uses the CUBIC congestion control algorithm. If you know something about congestion control, it makes sense: the file is called cubic sender, the function is called NewCubicSender, and it even specifies in the documentation that it makes a new cubic sender. Unless you set the reno bool to true; then it behaves like a totally different beast, namely New Reno. So there are some weird quirks that you have to account for too. The point I'm trying to make is that there are a lot of different implementations, testing them all takes time, it's not that easy to set up, you experience a lot of these small issues, and it's cumbersome to test multiple implementations, which is the reason why I am presenting Vegvisir today.
The idea behind Vegvisir is to aid with this kind of development. If you're doing research or even development within QUIC, the idea is that Vegvisir can automatically set up these kinds of interactions between clients and servers, also using simulated networks, such that you have repeatable and shareable experiments. The way you do this is by defining experiments with configuration files. A single experiment can consist of multiple test cases, and a single test case looks something like this. You have the two entities, the server and the client, which assume their prototypical roles as known from the client-server model, and in between them sits a network component that we call the shaper. The idea of the shaper is that it applies some kind of scenario to the traffic passing between the server and client; for example, it can introduce some latency or it could limit the throughput. It doesn't really matter what you want to do, the idea is that you can do it in a repeatable way. You also see the Docker container stuff on top. The idea of deploying these test cases within Docker containers is that we can easily share them with other people, which is a really nice benefit within the academic community, but we can also freeze certain implementations: we can save a Docker container and reuse it at a later point. So say something changes and we want to try an older version, that's totally possible with this setup. Additionally, within the QUIC community there have been some other efforts. If you are part of the QUIC community you might recognize this setup; it's pretty much the same as the one used by an interoperability project called the QUIC Interop Runner.
They also provide containers for their setup that are more tuned towards testing the actual interoperability between QUIC implementations, but the benefit of using the same architecture is that we are completely compatible with their setups. That means that even though Vegvisir is relatively new at this point in time, we are already compatible with 15 out of the 25 QUIC implementations right out of the box. You also see on the right side that we have a client that can be defined as a CLI command. That's because early on we realized that if we want to test applications, not every kind of test is suitable to be placed in a Docker container, which is why we also allow you to define tests by just spinning up local programs, as you are used to from a terminal. A good example of this is a browser. If you're doing some kind of media streaming experiments you want hardware acceleration and such to be enabled, which I guess you can do in Docker containers, but it's really cumbersome to do it in a good way. Right, okay, so how are these experiments actually defined? We decided not to use one single configuration file, simply because that would mean we had to be very verbose. We split it up into two types of configurations. On the left you can see the implementation configuration, which defines what is available within an experiment. The idea is that an implementation configuration is similar to the list of installed software on your computer: you simply have a list of entities that you can pick from. We also introduced a parametric system to make it dynamically steerable from within an experiment configuration, and we will see some examples of that in a second. On the right you see the experiment configuration. That's the actual definition of what needs to happen within one experiment, so what defines the test cases.
The idea is that you define how the entities from the implementation configuration should behave and what values the parameters should contain, through arguments. You can also configure sensors; I'm going to talk about that in a second. But the biggest benefit of splitting these two up is that the experiment configuration automatically produces the full set of combinations of all these entities. So say you define two servers, two clients and two shapers. The total number of tests within this experiment will be eight, because it just compiles every combination of these configurations. Another benefit is loose coupling. You might wonder, I still don't see the reason why you split these two up. Well, a big thing within academic research is that we want to test different versions. So if we have an implementation configuration that defines a client called Chrome, which then refers to a Chrome browser, we can have one implementation configuration that refers to, for example, version 99, and another implementation configuration that refers to, for example, version 100. The benefit of that is that if we simply swap these implementation configurations we don't need to change the experiment configurations, meaning that we can really easily test multiple setups without having to verbosely rewrite all of this. Some examples: this is an example of an implementation configuration. I do invite you to go to the GitHub repository, where everything is really nicely explained and where we provide some more examples; here we are unfortunately limited by the screen size. You see that we always have to define three types of entities in the implementation configuration, like we talked about earlier: the clients, the servers and the shapers. And these three are examples using the Docker system. In the top two you see that we refer to Docker Hub images.
These are actual examples that come from the QUIC Interop Runner project, which we are compatible with. The bottom one is a locally built Docker image. The reason I highlight this difference is that the framework automatically pulls the latest Docker Hub images if these are available. But if you are using some kind of local implementation that you build as a Docker image, you have to build it locally and then refer to it locally. Another thing you can see here is the parametric system. For example, the top client defines a requests parameter that is then used within an experiment. The idea is that an experiment configuration contains a value for it, and that you can access this value within a Docker image simply by reading requests, in this case, as an environment variable. So all the parameters are passed as environment variables if you are using Docker images. In the case that you are using CLI commands, or in the more specific case of shapers, because shapers are a little bit more complicated, you can also use them directly in the commands you specify within the implementation and experiment configurations. These are directly substituted, and you can reference other parameters within arguments, which is nice. Here is a simple example of a CLI client, in other words one that is not using a Docker image. You can see that the command is rather long. That is because, compared to a Docker container, we have to specify everything that needs to happen in the CLI command. This example uses three, or rather four, system parameters, which are highlighted here. The reason I highlight them is that the framework automatically generates all these details for you, such that the experiments can be steered even more dynamically. This is especially handy for future use cases where we, for example, want to expand to multiple-client setups and stuff like that. On the bottom you see a construct key.
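The two ways of delivering parameters described here can be sketched in a few lines of Python. This is only an illustration of the idea, not Vegvisir's actual code or configuration syntax: the "$name" placeholder style, the upper-casing of environment variable names, the client binary name and all parameter values are assumptions made for the example.

```python
import shlex
from string import Template


def docker_env(params):
    """Container entities: each parameter becomes an environment variable.
    Upper-casing the name is an assumption of this sketch."""
    return {name.upper(): str(value) for name, value in params.items()}


def render_command(command, params):
    """CLI entities: parameters are substituted directly into the command.
    The $name placeholder syntax is hypothetical."""
    return shlex.split(Template(command).substitute(params))


# Made-up parameter values, as an experiment configuration might supply them.
params = {"requests": "https://server4:443/video.mp4", "cert_dir": "/tmp/certs"}

env = docker_env(params)
argv = render_command("my-quic-client --ca $cert_dir/ca.pem $requests", params)

print(env)
print(argv)
```

The container route keeps the command inside the image and only hands over values, which is why the CLI variant ends up with the much longer command line mentioned in the talk.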
We have two special mechanisms for CLI commands. The benefit of Docker images is that they can have scripts on board that prime the environment. That is the downside of using CLI commands, unless you want to put everything on one single line, which is also rather cumbersome. Instead we provide two mechanisms called construct and destruct, which are run before and after a command is executed. These can be used to prime an environment and clean it up afterwards. This example manipulates the Google Chrome preferences to set the standard download folder to one generated by the framework. Then we come to the experiment configuration examples. These are the actual configurations that define how a test should behave. You see here that once again we have clients, shapers and servers, which we picked from the implementation configuration that we just showed. We simply filled them in with the arguments required for the test to work. A special thing to notice here is the shaper scenarios. Clients and servers are really simple: you just mention which one you want to use. But for shapers we have a more complicated setup. The idea behind a shaper is that it entails one kind of shaping. For example, you can use a tc-netem shaper within one container. But this one container does not only provide one shaping scenario. The idea is that you can define multiple scenarios within this container, and by passing the scenario key you can pick which one is used during a test. In this use case we have one client, two shapers and three servers, which means that a total of six test cases will get generated and compiled by the framework and run sequentially, one after the other. I mentioned sensors earlier. That is also something you can configure within the experiment configuration files.
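The way the test cases multiply out is a plain Cartesian product. As a small Python sketch of the one client, two shapers, three servers case (all entity and scenario names below are invented for illustration):

```python
from itertools import product

# Invented names; only the counts (1 x 2 x 3) come from the talk.
clients = ["chrome"]
shapers = [("tc-netem", "high-latency"), ("tc-netem", "low-bandwidth")]
servers = ["server-a", "server-b", "server-c"]

# Every client is paired with every shaper scenario and every server.
test_cases = list(product(clients, shapers, servers))

for index, (client, (shaper, scenario), server) in enumerate(test_cases, start=1):
    print(f"{index}: {client} -> {shaper}[{scenario}] -> {server}")

print(len(test_cases), "test cases")
```

The same arithmetic gives the earlier figure of eight tests for two servers, two clients and two shapers (2 x 2 x 2), and it is also why long-running experiments grow quickly once a few more entities are added.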
The idea is normally that the framework just automates all these tests, and that when a client exits this signals the end of the test. However, in certain circumstances that is not possible. For example, if you use a browser, it is obvious that browsers do not have the ability to shut down from within a web page, as that would pose some security risks. That is why we built a sensor system which can govern what happens within a client. For example, we provide two simple sensors. One is the timeout sensor, which simply checks if a certain amount of time has passed and then closes the client and signals that the test case has ended. Another one that we built is the browser file watchdog sensor, which enables us to check if certain files were downloaded by a browser context. This enables us to pull metrics from the browser and also signify the end of a test. If you provide these two configuration files to the framework, the framework will spin up a nice terminal interface. On the bottom you can see which tests are happening and how much time has passed. You can see a little bit of packets passing between them, signaling that some kind of traffic is happening. You can increase the verbosity, but this is not necessarily needed from within the terminal, as the framework automatically saves everything that happens as output in a file within the test case folders that we will now discuss. The experiment output is always saved under the label that is provided in the experiment configuration. Because we can have multiple runs of an experiment, the first entries that you will find within such a folder are timestamped to signify multiple runs. If you enter one of those, you will find the different folders that contain the data of the multiple test cases that were compiled by the framework. If you take a dive into one of these folders, you can see what output we are collecting in these cases.
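The two sensors described above boil down to polling a condition until it holds. The following Python sketch shows that idea under stated assumptions: the class names, the triggered() method and the wait_for_any helper are hypothetical and do not reflect Vegvisir's real sensor interface.

```python
import tempfile
import time
from pathlib import Path


class TimeoutSensor:
    """Fires once a fixed amount of time has passed."""

    def __init__(self, seconds):
        self.deadline = time.monotonic() + seconds

    def triggered(self):
        return time.monotonic() >= self.deadline


class FileWatchdogSensor:
    """Fires once all expected files exist in the watched directory."""

    def __init__(self, directory, expected):
        self.directory = Path(directory)
        self.expected = set(expected)

    def triggered(self):
        if not self.directory.is_dir():
            return False
        present = {path.name for path in self.directory.iterdir()}
        return self.expected <= present


def wait_for_any(sensors, poll_interval=0.05):
    """Block until one sensor fires and return it; the caller would then
    stop the client and mark the test case as finished."""
    while True:
        for sensor in sensors:
            if sensor.triggered():
                return sensor
        time.sleep(poll_interval)


# Simulate a browser "downloading" a metrics file into the watched folder.
with tempfile.TemporaryDirectory() as downloads:
    watchdog = FileWatchdogSensor(downloads, ["metrics.json"])
    before = watchdog.triggered()
    (Path(downloads) / "metrics.json").write_text("{}")
    after = watchdog.triggered()
    fired = wait_for_any([TimeoutSensor(5), watchdog])

print(before, after, type(fired).__name__)
```

Combining a watchdog with a timeout, as in the last call, is a common safety net: the test ends as soon as the expected file appears, but it cannot hang forever if the file never shows up.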
By default, the framework will always create a server, client and shaper folder, which get automatically mounted as Docker volumes under the /logs directory. Anything the implementations want to save, they can just write to this directory and the framework will capture it and save it with the logs. Additionally, clients also have a downloads folder mounted, simply because we want to differentiate and not end up in a situation where downloads accidentally overwrite output logs generated by a client. You can also see, especially under the server and client entries, a keys.log file and a qlog folder. The framework is automatically primed to save these encryption details and what's happening at the QUIC and HTTP/3 layers by setting the SSLKEYLOGFILE variable, which we discussed in the beginning, but also by setting a qlog environment variable, which gets recognized by most QUIC implementations out there nowadays. Finally, we come to extensibility. At this point in time, we have a framework that is great at aggregating a lot of data. We did some tests that ran for two or three days straight, containing more than 8,000 test cases, which is great if you want to gather a lot of data. But what makes a testing framework a testing framework is the ability to infer something from the output generated by a test, which is why we provide two programmable interfaces called sensors and hooks. I explained sensors a little bit already. We provide some basic sensors, but you also have the ability to program custom sensors. This makes a lot of sense if you want to test for very specific behavior within your experiment. For example, if you are doing a video stream in the browser, you can send the decoding metrics of the video out of band to an HTTP endpoint that you set up in a custom sensor.
If the sensor, for example, detects that some frames are being dropped or decoded in a wrong way, it could prematurely halt a test, signaling that something went wrong. If you have lots of test cases like we do, we have test runs, like I just said, running 48 hours, this is really beneficial because it halts the test in an early phase, saving us a lot of time. On the other hand, we have the hook system. The framework currently is very broadly applicable. The downside of that is that we don't really know what's happening inside the test. But you can program some custom behavior through the pre-run hook and post-run hook system. As the name suggests, the pre-run hook runs before an actual test is run. So you can prime environments by, for example, generating some dynamic files that you will need during the experiment. It doesn't really matter what you want to do there. The post-run hook is really nice because you can use it to analyze whatever happened during a test. For example, if qlogs are being generated, you could look at the qlogs and maybe even generate some nice graphs that you can immediately check after a test case has ended. Right. Another thing I need to mention about the pre-run hook and the post-run hook: if you don't like programming in Python, it's not really a problem. Python has this really great module called subprocess. If you have some existing scripts that are made to work with the output produced by your experiment, you can simply call them from this hook, meaning that you get exactly the same results without having to translate your existing code into these provided hooks. Right. That's in a nutshell what Vegvisir does. Thank you for your attention. And I think we have a couple of minutes left for questions. Yeah. So, a test case can be anything you want. If you're programming right now, you're developing something locally.
The thing you need to do is wrap it within a Docker container. That's one way. Or run it as a CLI command. You simply need to provide it to the framework, and the framework will just spin it up. So, the framework doesn't actually check what your test case is doing. If you want to spin up a simple, let's say, CLI command, like echo, and you want to print something to the terminal, you simply put it in the JSON and it will run. So, more questions, please. We have a couple of minutes. Okay. I have a question. I see you're from a university. What does the university have to do with testing, like, what's the connection? Okay. So, good question, actually. I'm not sure if there is a direct relationship between testing and the university. It's just that during my PhD, and also the PhD of some of my colleagues here in front of me, we found that we had a need for such a framework, right? We had an actual need for spinning up multiple test cases and for help with setting up these experiments, which is why we designed this. Early on, we just had a very minimal thing that just worked for us. And then, as time progressed, it became more and more mature, and we decided, well, this is actually a very good idea. So, we created an open source project for it, and we also submitted it to the open source and dataset track of the MMSys conference, which is happening in June in Vancouver. So, okay. I think we have time for one more question. The last question. No takers. So, thank you very much. |
Observability-driven development with OpenTelemetry
Use traces to enrich your integration tests! |
Perfect. Hello everyone. I might need to take a selfie because they're not going to believe me when I get back. They go like, so people like testing, obviously, and they know, you know, sounds. Yeah, I should do a video because they're freaking not going to believe me. So apparently people know what OpenTelemetry is and what testing is. And yeah, I was not expecting this to happen. So you're going to go out on Twitter. That's for sure. But yeah, anyway, let's just take a second to welcome our new guests in. Perfect, perfect. Yeah, this went from fun to stressful really quickly. But yeah, so let me begin. For the next 20 or so minutes, I'll be talking about observability-driven development with OpenTelemetry. So a lot of complicated words, a lot of stuff that's going to be happening, and a lot of things I need to explain for you as testers and how you can get started with this new thing of doing ODD instead of TDD. So first, a quick rundown of who I am. I'm running DevRel at Tracetest, which is a new open source tool that we're building for trace-based testing. I'll obviously explain all of that later on. But you're wondering what a DevRel person is doing at an open source conference. It's kind of because I successfully failed a startup that was doing online education. So from there, I went into education. And because we're basically educators in DevRel, I was like, maybe, you know, I write shitty code, I can maybe be good at something like talking. So I figured that might be a good career shift. But I've also been helping build open source dev tools for five or so years. So it's pretty natural for me to be here. So enough about that, you probably think that I know what I'm talking about. Let's keep it rolling. There are four main topics. So remember these four topics that we will cover in the next 20 or so minutes. First, we'll talk about the pain of testing microservices. It's a horrible, horrible thing.
And we'll also talk about TDD and how integration testing is really hard. We're all doing it. It's terrible. It's hard. But we're still doing it. And then in the last two parts, we'll talk about observability-driven development and how it can help. And then we'll show a code example, a hands-on example of how you can do it as well. So I want you to take something home with you after this 20-minute talk and actually start doing it yourself. So from the beginning, from the top down, let's talk about the pain of testing microservices. First, the biggest issue is that you have no way of knowing where your HTTP transaction fails. You don't know. You can test an API endpoint. You get a response back. But it might be "task failed successfully". You never really know if you have a row of microservices behind that initial service. So that's something you can't track. You can't track and test how these microservice-to-microservice communications happen. And of course, the hardest thing, what we all really love to hate, is mocking. It's really hard. It's really, really hard. So the solution that we propose is doing something called observability-driven development, which means that you're using distributed traces as the test assertions. So you're using your existing underlying trace infrastructure to run your tests. And now, because this is a testing dev room and you might not know what tracing is: Lightstep has a very nice definition of it. They say that distributed tracing refers to methods of observing requests as they propagate through a distributed system. That means that if you have a distributed system, on the left, you have services that communicate with each other. And on the right, you can see that the entire distributed trace is split into different spans. A span is the smallest unit of a distributed trace. So a span can be a timestamp. It can be a database interaction or database statement. It can be HTTP codes.
It can be objects that you generate in your custom instrumentation itself. So spans are literally the smallest part of a distributed trace. The distributed system we'll be talking about today, the sample we will be using, is very simple. We have two services with a mock database connection. Just to simplify the whole architecture, we will be using this to explain how distributed tracing works and how you run observability-driven development on such a system. Now, just a code sample. This is JavaScript, the only language I really know, and not that well. This is what a trace would look like in code. You're starting the span, you're adding attributes, and then you're ending the span. So this is the code representation of what we have over here. Just remember that for now, and we'll get into more details as we progress. The visual cue, or the visual layout, of a distributed trace would look like this. This is taken from the Tracetest app, but any distributed trace looks like this, where you have your distributed trace and you can see all of the spans within it. And if you drill into one particular span, you can see, okay, here are all of the attributes that this span has. It can be available, book ID, the check, and the parent ID. There are a lot of different attributes you have in every span. So, on to the next topic I want to cover. We have all the basics down. We know why it's hard. We need to figure out why integration testing and TDD really need help. Everybody knows about the red-green feedback loop. It's awesome. It's great. We like it. We don't need to change it. But integration tests are hard. Integration tests are the kicker, because they need access to services and infrastructure. That's the hard part. You need to set up different triggers. You need to access databases. You need to set up environment variables. You need to set up authentication. All of those things that everybody hates doing.
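The start-span, add-attributes, end-span pattern can be illustrated without any tracing library. The sketch below is written in Python rather than the JavaScript shown on the slide, and it is a toy model, not the OpenTelemetry API: the Span class, start_span helper, span names and attribute keys are all made up to show the shape of the data.

```python
import itertools
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional

_ids = itertools.count(1)
finished_spans = []  # stands in for whatever would export spans to a backend


@dataclass
class Span:
    name: str
    span_id: int
    parent_id: Optional[int] = None
    attributes: dict = field(default_factory=dict)
    start: float = 0.0
    end: Optional[float] = None

    def set_attribute(self, key, value):
        self.attributes[key] = value


@contextmanager
def start_span(name, parent=None):
    """Start a span, let the caller add attributes, and end it on exit."""
    span = Span(name, next(_ids), parent.span_id if parent else None,
                start=time.monotonic())
    try:
        yield span
    finally:
        span.end = time.monotonic()
        finished_spans.append(span)


# One request handled by two steps: the outer HTTP span and a nested check.
with start_span("GET /books/123") as root:
    root.set_attribute("http.method", "GET")
    with start_span("availability check", parent=root) as child:
        child.set_attribute("book.id", 123)
        child.set_attribute("is_available", True)

for span in finished_spans:
    print(span.name, span.parent_id, span.attributes)
```

The parent ID is what lets a tracing backend stitch individual spans back into the tree view shown on the slide; inner spans finish first, so they are recorded before their parent.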
And of course, you can't track which part of the microservice chain failed. All of this means that you're writing 90% of your code just as plumbing, just to make sure that the test will run. 10% is actually writing the test, writing the test case, and actually getting value from your TDD process. So here's what a traditional integration test would look like. You have all of your setup. Again, this is JavaScript. It can be any language. You have your setup, a bunch of modules, a bunch of setup, a bunch of plumbing. And then you have more plumbing because you need to mock something. Then you have even more plumbing because you have to figure out how to run this custom freaking syntax that has nothing to do with any language, really. You just have to learn it. So it's a lot of stuff you have to know before you actually run tests. If you compare that to a trace-based test, you say, here's my URL. Here's my method. This is what I'm asserting against. That's it. No complications, no plumbing, no nothing. It just points to the trace span you want to target. You have your assertion and it's done. So this is why I think observability-driven development can help our testing process. Obviously we need to explain what ODD is. The main thing that I think is important to know is that you need to write your code and your observability instrumentation in parallel. So the same way you do the red-green process for TDD, in ODD you write your trace spans and you write your code and your features in parallel. Which is good, first because in production that helps your DevOps people when they have to troubleshoot, but it's also helping you write better code. And ODD is really powerful because, first and foremost, of course, you're not testing mocks. Nothing is artificial. You're not creating black boxes. You're literally testing data from the traces in the real environment. So you can spin up your system, get traces from the system and test on those traces.
Of course, it works with all of your existing OpenTelemetry-based distributed tracing. So if you have tracing enabled, or if you want to enable it, which is really simple nowadays, it'll just work. And then from the ODD definition, we need to figure out what trace-based testing is here. So you basically add assertions against span values. And that's what determines whether the test has failed or the test has passed. It's really straightforward. So you're not just testing against the API response, you're actually testing against the whole distributed trace your system generates. So unlike Postman, where you trigger a test, you get something back, and then you're asserting on that response, you're literally testing and running assertions against the entire distributed trace. Really, really cool. Now let's go into some practice. How do you do observability-driven development? Well, you use Tracetest, because that's the open source tool we're building, you know, shocker. But what's important about Tracetest is that it's fully open source, 100% open source, a CNCF project, and it uses OpenTelemetry trace spans as assertions. Very straightforward. Of course, it does work with any existing tracing solution you might have. You can use vendors, you can use open source tools, you can use whatever. If you have tracing in your system, it'll just work. Also, what's important is it doesn't matter if you're a QA engineer, if you're a backend developer, if you're a DevOps person, it'll just work. You have tools for everybody, web UI, CLI, whatever you want, whatever you need, it's there for you. And then why I think it's powerful: you're not running artificial tests, you're testing against real data, and obviously you have a tool belt that you're really used to.
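To make "assertions against span values" concrete, here is a minimal Python sketch. It is not Tracetest's API or test syntax; the `Span` class, the `assert_span` helper, and the span and attribute names are all hypothetical, chosen only to mirror the books example from the talk. The idea is simply that the test passes or fails on what the trace recorded, not only on the HTTP response.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A simplified stand-in for one span of a distributed trace (hypothetical)."""
    name: str
    duration_ms: float
    attributes: dict = field(default_factory=dict)


def assert_span(trace, span_name, attribute, expected):
    """Return True if every span called span_name carries attribute == expected."""
    targets = [s for s in trace if s.name == span_name]
    # No matching span usually means the instrumentation is missing: fail.
    if not targets:
        return False
    return all(s.attributes.get(attribute) == expected for s in targets)


# Hypothetical trace collected after a GET request to a books endpoint.
trace = [
    Span("GET /books", 42.0, {"http.status_code": 200}),
    Span("Books List", 12.5, {"books.list.count": 3}),
]

result_ok = assert_span(trace, "Books List", "books.list.count", 3)
result_wrong = assert_span(trace, "Books List", "books.list.count", 4)
result_uninstrumented = assert_span(trace, "Availability Check", "is_available", True)
print(result_ok, result_wrong, result_uninstrumented)
```

Treating a missing span as a failure is a design choice in this sketch: it reproduces the "red" step where the handler returns 200 but has no instrumentation yet.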
You can run test suites by chaining tests together and have transactions. The standard way you're running integration tests is you have a setup, you connect to a database, you run an insert, you check if the insert works, you delete that whole path, that environment, and that's what we provide as well. You can set that whole transaction up through the UI. So it's literally what you're used to, but better. You also have test environments, which is a very big thing because you can have one set environment for your dev, for your QA, for your prod, for your whatever. So it's very, very flexible in that way as well. Obviously, I'm going to stress this "no mocks" because I really like that. I hate mocking. So I'm going to just shove this down your throat. Every slide is going to be no mocking. But also, one thing that I think is massively important is serverless. I've been running serverless since it was a thing, like in 2018 when everybody wanted to run serverless, and it was horrible, it was a horrible experience. So I'd suggest nobody really does it. But if you have to because of PMs, testing events on message queues and testing events on distributed systems and services in AWS or whatever is like prayer-driven development. You never really know what's going to happen. So that's something that we provide. You can literally see the entire trace from that async message queue, from other systems, from other services, and you really know what's happening. Obviously, it's important that you get assertions based on timing. Maybe you want all of your database requests and your database queries to finish within 500 milliseconds. That just works. And you can also set wildcard assertions. So the same thing I was saying about the database queries, it works for wildcards as well. So a visual demo, like a representation of what that would mean, is literally like this.
So you have your test executor, which you can think of as a trigger. You're testing your system. That trace data is getting written to your trace data store. It can be pretty much anything you've all heard of. Jaeger, OpenSearch, Grafana Tempo, the OpenTelemetry Collector, all of those, even vendors like Datadog or whatever. And then what happens is that once the response gets back, we pick up that response, but we also pick up the trace. So you can run assertions based on both the trace and the response itself. And then, obviously, you get the result back and you can see if it's passed, if it's not passed, what you need to fix, et cetera. So yeah, a little over 10 minutes, perfect. After all of this theory and understanding what's happening, we want to jump into actual code. So let's go back to the sample of our trace-based test. We have a URL and we're making sure that we're sending a GET request to that URL. We're targeting a span. So we're targeting the books span in the Books API and we're making sure that we have a list of books equal to three. So this is our TDD red-green process. We have a test. We want to run the code and we see, okay, so we have a handler here. It's getting some books. We have some books, but you can see that there's no instrumentation. So if we do run the test, it's going to say, okay, the 200 is fine, but we're not getting any books here. Red. Let's go ahead and refactor. We're adding in our spans. So we say, okay, now I'll add an attribute and I want to pass the books length into this attribute right here. Perfect. Now it passes. So this is the most banal, simple use case that you can see, but you're already seeing value from it because you can pass in a custom value. That's real data. You don't have to mess about with any mocking or anything. And then obviously one thing that I'm stressing is very important is, what if you want to add a span duration?
So I want this API to finish within 500 milliseconds. Okay. Right now, if we have an issue, even though the code works, it might be performing badly. We can add in the span duration, check for the timing, and then obviously refactor if we need to refactor. And that's the thing in the UI as well. Once you do refactor it, this is what you would see. You go and say, okay, so finally now I have a passing test. This Books API is returning within 500 milliseconds. And then obviously the last and I think crucial thing with using trace-based testing is that you can literally test and assert on every part of an HTTP transaction. So if we go back to our books handler API, instead of calling books, we're now calling available books. So we are calling an external API to see if the books are available or not. So we're having this microservice-to-microservice communication. And if you check that get-available-books function, we have some promise thingamajig happening here. We're calling an availability API and we're just checking if it's available or not. So the kicker here is we're calling an external API. The external API is super simple. We're just running a check whether it's available or not and we're setting this attribute. So it's a very, very simple example. But the thing is, what if in the availability check, we have a problem? This is why I don't do live demos. Anyway, so in the availability check, if you're checking here, you can see, oh, we have a problem. There are books that are out of stock. So this is that down-the-chain action that would happen. You would never know what the hell the problem is. But now, because we have this set up, we can say, okay, I'm adding to my trace-based test. I want to make sure the availability API is up. So I'm actually targeting this host. And I also want to make sure that all of these is-available attributes are true.
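The duration and wildcard-style checks described here can be sketched in a few lines of Python. This is an illustration only, not Tracetest's selector or assertion language: the `failing_spans` helper, the span names, and the `is_available` attribute key are assumptions made up for the example. It selects every span whose name matches a glob pattern, applies one check to all of them, and reports which spans fail.

```python
from fnmatch import fnmatchcase


def failing_spans(spans, name_pattern, check):
    """Return the ids of spans matching name_pattern for which check(span) is false."""
    selected = [s for s in spans if fnmatchcase(s["name"], name_pattern)]
    return [s["id"] for s in selected if not check(s)]


# Hypothetical spans from one request that fans out to an availability service.
spans = [
    {"id": "a1", "name": "GET /books", "duration_ms": 180, "attributes": {}},
    {"id": "b1", "name": "availability check", "duration_ms": 20,
     "attributes": {"is_available": True}},
    {"id": "b2", "name": "availability check", "duration_ms": 25,
     "attributes": {"is_available": True}},
    {"id": "b3", "name": "availability check", "duration_ms": 22,
     "attributes": {"is_available": False}},
]

# Timing assertion: the API span must finish within 500 milliseconds.
too_slow = failing_spans(spans, "GET *", lambda s: s["duration_ms"] < 500)

# Wildcard assertion: every availability span must report the book as available.
out_of_stock = failing_spans(
    spans, "availability*", lambda s: s["attributes"].get("is_available") is True
)

print(too_slow)
print(out_of_stock)
```

Returning the failing span ids, rather than a single boolean, is what lets a tool point at the one span down the chain that broke.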
If I do run that test, I'll see that, whoopsie, they're all passing except for this one because, oh wait, there was actually one node, one part of the trace, one span that was returning false, because one book was out of stock. And if you jump in here, you can see that everything is passing except for that one span, which is something you would never figure out if you're running tests the traditional way. And the last thing I really want to stress before we wrap up is that this will work with any distributed system that has OpenTelemetry instrumentation. So any system that looks like this: you have an app with OpenTelemetry, you're sending to the OpenTelemetry Collector, and then you're sending that trace data to any trace data store. Jaeger, OpenSearch, it doesn't really matter. You hook in your Tracetest instance, you pick up data on every request, you pick up data from the trace data store, and you run these tests. This is the only setup you really need to do. Install the CLI, install the server, one command, one command, and you're ready. Set up your Docker Compose or Kubernetes, all of this works out of the box with the install. We have good engineers, these guys really tried to make the install really simple. You set up the trace data store, you can do that in the UI or in the CLI, it doesn't really matter. Connect the data store, and you're done. It just works. So last recap, two minutes left. What did we learn today? We learned that obviously ODD, or observability-driven development, is really awesome. You don't have to mock, again with the mocking. You're testing against real data, and you don't have any black boxes anymore. You know exactly what's happening in every single microservice. You can assert on every step of the transaction. And as the last recap, I mean, you wouldn't be here if you thought testing was fun or easy or something that you really enjoy doing.
It is hard, like we all know it is very hard. Testing distributed systems is even harder. Testing microservices is even harder. So I want to help you elevate that TDD process that you're already doing, and already doing well, by doing ODD as well. That's pretty much it. We're on point. If you have any questions, if you want to check out Tracetest, just go to github.com/kubeshop/tracetest. You can download it. You can read a blog post I wrote about this as well. So knock yourselves out, I guess. Just to make it easier, you can also jump into Discord. You can chat with me or the engineers face-to-face. If you have any questions, if you want to try it out, check out the GitHub. Also, give us a star, you know, because it's kind of why I'm here. I have to earn my salary in some way. So, questions? Yeah, sure. So the test runs against the trace from the system. Yes, the way it happens is that, imagine you're running a Postman request. Because this is Tracetest, that would be called a response test. You get a response, you're testing on it. For Tracetest, you get the response, but you're also tapping into the trace data store and getting the traces that that request generates. So from that distributed trace, you're then running assertions based on the spans within that trace, if that answers your question. Yeah, for sure. The only thing is that, obviously, if you're running locally, you have a setup where your application is sending to either an OpenTelemetry Collector or whatever. You can also tap into that, where you configure Tracetest to be the pipeline endpoint of your OpenTelemetry Collector. So you can just run it as a dev tool as well. Also, I'm not sure if I'm good saying this on camera, but we might be building a desktop app very soon, because we're like half a year into this, so we're still kind of figuring out what you guys need.
So that's why I'm here as well. But yeah, let's see what happens. It's a great question, by the way. It's good to finish early. We have time for questions. This is great. So yeah, the question was measuring SLOs for user journeys. That's actually something we're working on now. I'm not sure if you know about the Keptn project. We have an integration with the Keptn project as of last week, quite literally. So if you want to check that out, you just search for "Tracetest integrates with Keptn" and you'll get a lot of documentation and sample apps, examples and whatnot, to set that up as well. So that's an excellent use case and something that we have actively been working on. So 100%, the thing is that whatever you have implemented, if you have OTel traces coming in from that system, it works. So it's language agnostic, setup agnostic, it's literally just the traces that are important. So if you're running OTel, if you're running the Datadog agent, the Elastic agent, literally anything that generates traces, it'll work. Obviously it works best with OTel because open source, you know. But yeah, it just works. So if I understand the question correctly, it is, do I run synthetic tests with Tracetest? Yes. You can have a CI pipeline, or you can have a cron job somewhere running, it doesn't really matter. Every five minutes, I want to trigger this test and make sure that all of the assertions are true. That's perfectly fine. Oh, yeah, 100%. It works. You can think of it as testing in production and making sure that the production environment is healthy. That works as well. Hi. Yeah. So the Tracetest test depends on your instrumentation. Yes. Your instrumentation is in your production code. Yes. Do you have any advice on how you prevent your production code from bloating with the instrumentation needed to feed these tests? Hmm.
I'm going to say, that's a great question, but I think I'm not even close to being good enough of an engineer to answer that question, to be honest. 100%. 100%. Also, yeah, go ahead. Yeah. You said no black boxes, but on the other hand, writing tests like this might be unnecessarily tangled into the intricate details of the infrastructure. So is this necessarily good? The test actually knows what's in that black box. So when you refactor the infrastructure, you might have to throw out all your tests because you don't have that database anymore. That's a good question as well. I think the logical answer would be, Tracetest is just mapping out your infra. So if you're using it, you can also use it just to gain visibility. It doesn't have to be focused only on the testing. If you're using it to map out your infra, even if you have changes, if you're running the test again, you'll know exactly what changed. So if you're running assertions based on one database table, so to say, and then running an API on one endpoint that has one particular host name, if you change those up, you'll see what fails and you can figure out, oh, okay, so we changed that last week because of XYZ, and you can know exactly what changed. So I think it's the overview, the visibility into your system, because when you're running microservices, when you're running a bunch of stuff, distributed systems, whatever, it's just hard to have a mental model, a mind map, so to say, of everything that's happening. So I think that's a good part of the value there as well. Thank you, no more time. No more time, yeah. Thank you. |
Setting up OpenQA testing for GNOME |
All right. Well, welcome to the next talk. I'm going to be talking about OpenQA testing of a pretty complex graphical desktop environment. So, I'm an operating systems developer. I've been involved in GNOME for a long time, possibly too long. And I've also been involved in Codethink for maybe 10 years off and on. We're a consultancy firm based in Manchester. And we work a lot with the automotive industry, helping them with testing. So, that's how we got an interest in OpenQA. And that led on to researching how to set it up for GNOME OS as well. It should be, but maybe it's not. It is green, but yeah, can nobody hear me? Okay, so there are no room speakers. All right, I'll try and talk a bit louder. So GNOME is a desktop environment. How many GNOME users do we have in the room? Quite a few. KDE users. Nice. Other desktop environments. Tiling window managers, et cetera. Quite an even mix, actually. Everybody welcome. So GNOME is quite an old project, right? GNOME predates FOSDEM and Git. It's older than Greta Thunberg. It's older than some of its contributors. It's older than MySpace. And this leads to some technical challenges that have built up over the years. So, the GNOME designers design a cohesive experience of everything working together. But then we release more than 200 individual modules as tarballs. And distributions get to integrate those back together to produce something that hopefully works. So, it's difficult to test those 200 modules. It's difficult to test what we release. Maybe you've heard of Conway's law, the rule that a project's source code will mirror the structure of the organization that makes it. So, this is a rough diagram of how GNOME development works. Most of the work is done by module teams who focus on individual modules, or one or two modules. So, things are tested well in isolation. And then the release team tries to build everything and they get to the point of like, okay, everything builds.
So, we'll release it. And they give this to packagers, who give it to users. So, the question is, which of these groups is responsible for integration testing? The maintainers are working on isolated components. The release team are very busy. And distro developers are also very busy. So, certainly when the project started, the users were responsible for integration testing. You got to use Linux for free and you got to report bugs if it broke. And we would give the "works on my machine" certificate. And you get a lot of crazy bugs at integration time. Like, oh, this feature doesn't work because, it turns out, the code isn't broken, but you passed the wrong configure flags. So, you don't get the feature that you wanted. Time has passed since then. There has been lots of development. GitLab. GitLab was a huge help for GNOME. And it means that we can now do CI quite easily. So, the situation now looks kind of like this. Module maintainers generally have unit tests, and will check that the module works in isolation. The release team have an integration repo that says these are the versions of the components that make up this GNOME release. So, we know what we're releasing. And distributions have started doing some downstream testing as well. At least some distributions have. There's a lot of good work going on testing that the released software is good. But there's still a gap, because from landing a commit into the main branch of your module to actually having integration testing run by a distribution, there could be months. There could be months from you making that change to someone cutting a beta release and actually testing it. So, there's still a lot of time for problems to appear. So, the question we tried to answer over the last 10 years or so within the GNOME project is, what if we built our own distro just for testing? And so, we did. It was a long job, but GNOME OS exists. Lots of people worked on this over the 10 years. And it exists specifically for testing.
So, some people say, can I use it? And, well, you can, but it's designed to be broken, right? So, don't use it unless you want something that breaks every day, has no security updates and doesn't support most hardware. But what it is good for is testing the up-to-date, latest in-development version and for seeing how new designs might work as they're being developed. And a goal was always automated regression testing, but that's kind of been the last piece of the puzzle. And that's the thing I'm showing off today: we now have automated regression testing of GNOME OS. You can get it from here if you want. Like I say, only use it for testing. And it works for manual testing, but it's quite boring, right? People don't spend their weekends going, oh, I think I'll download this image and, you know, just test and report bugs. And it's not quite suitable for pre-merge gating yet because it takes hours to build the image. So, we can't gate every merge request and say, well, the OpenQA tests have to pass, because it can take hours before the new OS image is built, right? So, what we're doing at the moment is we've set up an OpenQA instance. So, OpenQA, I haven't introduced OpenQA yet. OpenQA is a test tool developed by SUSE. How many people are familiar with OpenQA, actually? We're in the testing room, so hopefully some people are. Maybe half the room. Okay. Well, I'm going to do kind of a deep dive into how it's set up for GNOME and how it works. There are three components. The web interface is the thing you look at, and this is called OpenQA. The thing that actually does the work is a test driver called os-autoinst. It's a less catchy name, and it supports multiple backends, right? So, in GNOME, we use the QEMU backend, but you can also use backends that run on real hardware. I think some distros are doing this. Some of the Codethink projects use this. In GNOME, we only use emulation at the moment because it's kind of the simplest option.
And then, we have a library of tests. So, actually, most of the fun stuff in OpenQA lives in openSUSE's test repo, and when we want to do more advanced stuff for the GNOME tests, we go in there and copy stuff out of it and use it sort of like a library, in the traditional sense of something that we copy from. There are some built-in utilities as well, but a lot of the good stuff is in the openSUSE tests. Lots of people are using it these days. SUSE, of course, having invented it. Fedora is using it. I found an article about EuroLinux, who are using it. Various car companies are using it. Maybe you are using it. Hopefully, you will after this talk. Codethink is also using it to test Linux kernel master branches on some ARM hardware, using LAVA as well. That's a whole separate talk, which I'm not going to go into, but if you're interested, find someone with this t-shirt on and they can talk about it. So, let's be adventurous, right? Here's a screenshot of OpenQA, but hopefully, the Wi-Fi is going to work and I can show you the real thing. Here's the front page of the GNOME OpenQA website. It doesn't tell you much, right? Actually, we don't use this front page. We go via GitLab. Here's the GNOME Build Meta repo. This defines what goes into GNOME. This has a CI pipeline set up. Ah, the internet's not working. Let's see. Aha, you got me, FOSDEM Wi-Fi. Let's go back to the screenshots. I did anticipate this happening. Here are some CI pipelines that I prepared earlier. These are the tests running on master. These do various things. You know, they build all the components using a build tool called BuildStream. The interesting one for our purposes is a job called S3Image. This builds an ISO installer and pushes it to Amazon S3, which is a good place to store these kind of five gigabyte ISO images. Then, we have another job called testS3Image. That's the fun one, right? That goes to S3, downloads the image and runs the OpenQA tests.
There's a long explanation of how it works, but actually, I'm going to see if I can show you the job log. I did load one earlier. No, I can't load the job log either, so I'm going to show you the long explanation of how it works. In brief, the design of OpenQA initially was that you'd have a separate machine or a farm of machines, and the tests would run on one of those machines. That's a perfectly fine model, but it involves maintaining quite a lot of infrastructure. We're trying to do this in the easiest possible way, because we don't have a big team working on this. We kind of inverted the design, and we use the GitLab runner as the worker. The GitLab runner uses the OpenQA worker container image. It calls the OpenQA web UI and says, hi, I'm a new machine, send me a job. And then it queues a job and adds a flag saying, oh, by the way, this can only run on the machine that I just created. The effect is that the GitLab runner becomes the OpenQA worker, runs os-autoinst, runs the tests, and communicates the results back to the OpenQA web UI. It's maybe a little unsupported, but it's actually working quite nicely, and there are just a couple of caveats in the web UI from doing things that way. It means we only have one big build server, which is configured as a GitLab runner, and we don't have any other infrastructure apart from the web UI, which is fairly simple to maintain. That's why we do it that way. Now I'm going to go through what you can see in the web UI. First, I'm going to drink some water, actually. I've got a lot of talking to do today. Each test run gets an ID. I can't see the ID, but it'll be some long number. We have one long test job, which tests everything we care about. Actually, I think this one I loaded. Here's the real thing. We test all the way from taking the OS image on a bare machine, running the installer. Can I open that? No, I can't open that. You'll have to look at the tiny screenshots.
Using the initial setup, this is the GNOME initial setup, we poke at the serial console a little bit once we've created a user. Once we've created a user account, we can log in over serial, and we just enable journal logging to the serial console to make things a bit easier to debug. Then the fun starts. We start poking around at the desktop, and we run each of the GNOME core apps. At the moment, we just check that it starts, and then it looks the same as it did the day before. The core of OpenQA is doing screenshot matching, and it has some tools for making that a little bit nicer than it would be if it was just pixel by pixel comparisons. The core of it is screenshot matching. We have a screenshot of each app, and we say this is how it should look, and as long as it looks the same, or within 95% the same, then the test passes. If it looks different, then the test fails. This one, I guess you can't see, but this one has failed because a pop-up has appeared over the top, which is pretty annoying. One of the things that we still need to sort out. Most of these have passed. This one has failed because the icons change size slightly. Again, the image matches maybe 95%, and the threshold is 96%, so it hasn't quite passed. In most cases, the solution to these failures is just update the screenshot, and that's quite an easy process. Let me show you how. This is a gallery of some tests viewed closer up. When you click on one of the screenshots, you get to see the before and after, or rather the golden screenshot and the real screenshot. You can drag this slider across and go, okay, this is, you know, here's the difference. These areas are the actual match zones which are defined in the screenshot. OpenQA calls these needles. A needle is like a screenshot plus some metadata, and we define zones that we want to match against, and it uses OpenCV to do the matching. That's what this percentage means. It's saying, you know, it's 99% the same. 
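The pass/fail rule described here (a similarity percentage inside a match zone, compared against a threshold) can be illustrated with a deliberately simplified Python sketch. This is not how OpenQA or OpenCV actually compute the score; it just counts identical pixels inside one rectangular area of two small grids. The function names and the pixel grids are hypothetical, and the 96% threshold is taken from the failing example mentioned in the talk.

```python
def area_similarity(reference, candidate, area):
    """Percentage of identical pixels inside area = (x, y, width, height)."""
    x, y, w, h = area
    same = sum(
        1
        for row in range(y, y + h)
        for col in range(x, x + w)
        if reference[row][col] == candidate[row][col]
    )
    return 100.0 * same / (w * h)


def needle_matches(reference, candidate, area, threshold=96.0):
    """A match zone passes when its similarity reaches the threshold."""
    return area_similarity(reference, candidate, area) >= threshold


# A 10x10 "golden screenshot" and a new screenshot that differs in 5 pixels.
reference = [[0] * 10 for _ in range(10)]
candidate = [row[:] for row in reference]
for col in range(5):
    candidate[0][col] = 1

whole_screen = (0, 0, 10, 10)
score = area_similarity(reference, candidate, whole_screen)
print(score)
print(needle_matches(reference, candidate, whole_screen))        # threshold 96
print(needle_matches(reference, candidate, whole_screen, 95.0))  # threshold 95
```

With these inputs the score is 95%, so the check fails at a 96% threshold and passes at 95%, which is the "hasn't quite passed" situation from the icon-size example.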
A cool thing about using OpenCV is that the match can move around the screen, right? This window might have popped up in a different place, but OpenQA would, you know, if the match was over here 20 pixels to the right, it would detect that, and the test would still pass. So that's pretty useful. And you can also lower the threshold. The manual says don't lower the threshold below 90%. I guess because maybe anything will pass at that point. I haven't played with it too much. I tend to go with between, you know, 95% and 100%. Your tests can input text. So here we're creating a user. All this is done via the QEMU APIs, so it's simulating a real mouse and a real keyboard to do this. It's really going through the whole software stack, from the kernel through the graphics drivers right into everything in user space, into GNOME. So the ultimate in integration testing. Here are some more screenshots of needles. This is an exclusion. So I excluded the version number so that when we bump the version number, the tests don't fail. This is the needle editor, right? So the web UI lets you edit these needles. They're stored in a Git repo, but not everyone wants to dig around in Git repos. So there's also a web UI to edit them. And you can drag and drop. This is a screenshot, but the green is like, let's match this area. And the brown is an exclusion, because this is a process list, right? So it's going to be different every time. So we exclude that from the match. When a test fails because the screenshot's changed and you want to update the screenshot, which is a really common case, you can go in here, change the screenshot, use the existing matches, and then you can commit your changes under a new name to the needles Git repo. So it's like a two-click process. It's pretty straightforward. And here's the actual needles repo. And it's nothing too complicated. Each needle is a PNG file and some JSON metadata.
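As a rough illustration of "a PNG file and some JSON metadata", the Python sketch below builds a metadata document with one match area and one tag and prints it as JSON. The field names (`area`, `xpos`, `ypos`, `width`, `height`, `type`, `properties`, `tags`) follow my understanding of the os-autoinst needle format and should be checked against the upstream documentation; the coordinates and the tag `app_example_home` are invented for this example.

```python
import json

# Metadata for a hypothetical needle; the matching PNG would sit next to it
# in the needles repository under the same base name.
needle = {
    "area": [
        # One rectangular zone that must match the screenshot.
        {"xpos": 100, "ypos": 50, "width": 300, "height": 40, "type": "match"},
    ],
    "properties": [],
    # Tests refer to needles by tag, not by file name.
    "tags": ["app_example_home"],
}

needle_json = json.dumps(needle, indent=2)
print(needle_json)


def tags_of(text):
    """Read the tags back out of a needle's JSON metadata."""
    return json.loads(text)["tags"]
```

Keeping the metadata as plain JSON beside the image is what makes the needles easy to review and version in a Git repo.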
And here's a really simple example of what the JSON metadata looks like. This has one match area. And it has a tag. So the important thing here is the tag. In your tests, you would say assert_screen with app_baobab_home. And it will validate any needle that has that tag. So you build up this collection. Like, maybe in version 40 it looks like this, and then maybe in version 42 the design changes. So you make a new needle with the same tag. And OpenQA will now accept either of those needles. So the old design would still pass. And if your application randomly regresses to the old design, actually it wouldn't catch that case unless you've deleted the old needle. Seems kind of limiting, but actually openSUSE have built an enormous library of tests using this method. So I trust that it works well. I think some people have actually improved on that on one of the Codethink projects, but I don't know the details. Ah, and the last thing I wanted to mention was the tests. So this is the fun part. You get to write your tests in Perl. It's like a trip back in time. But it's not super complicated. I don't know much about Perl, but I can figure out most of what's going on. This is the main entry point. So we import a couple of helpers. We set some constants. Please don't steal my password, but this is a constant that we reuse. And then each of these is a specific test that we load, in its own Perl module. So then if we go look at one of those, here's a test case, and the meat of the tests is calling these functions. So assert that this needle matched, with a timeout, and then we click on an area that's defined in the needle. So in this case, it's the next button, and that area is defined in the needle. We also eject the CD-ROM after the install, and then we reset the machine. So a lot of this is building up libraries of useful functions and calling them. So your tests end up being quite readable.
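The tag behaviour described above, where a check passes if any needle carrying the tag matches, can be sketched in Python as follows. This is a model of the idea only, not OpenQA's implementation or its Perl test API: `screen_matches_tag`, the needle names, and the similarity scores are all hypothetical.

```python
def screen_matches_tag(tag, needles, similarity_for):
    """Pass if ANY needle carrying the tag matches the current screen."""
    candidates = [n for n in needles if tag in n["tags"]]
    return any(similarity_for(n["name"]) >= n["threshold"] for n in candidates)


# Two needles share one tag: the old design and the redesigned one.
needles = [
    {"name": "app-home-v40", "tags": ["app_home"], "threshold": 95.0},
    {"name": "app-home-v42", "tags": ["app_home"], "threshold": 95.0},
    {"name": "installer-welcome", "tags": ["installer_welcome"], "threshold": 95.0},
]

# Made-up similarity scores of each needle against two different screens.
new_design_screen = {"app-home-v40": 61.0, "app-home-v42": 99.0, "installer-welcome": 3.0}
old_design_screen = {"app-home-v40": 99.0, "app-home-v42": 61.0, "installer-welcome": 3.0}

passes_new = screen_matches_tag("app_home", needles, new_design_screen.get)
passes_old = screen_matches_tag("app_home", needles, old_design_screen.get)
passes_wrong_tag = screen_matches_tag("installer_welcome", needles, new_design_screen.get)
print(passes_new, passes_old, passes_wrong_tag)
```

Both the new and the old design pass under the shared tag, which is exactly the limitation mentioned in the talk: a regression to the old look goes unnoticed until the old needle is removed.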
All right, I've got a couple more things, and then I'm going to open up to questions. Things that I've learned. OpenQA is very good. It's probably the best thing to do this that's open source and available. Use it, contribute to it. On the other hand, don't go crazy with it. You know, go slow, and try to test integration. Don't do unit tests in there. The main documentation doesn't list the actual API that the tests use. So look here, look for the test API docs. And also look at openSUSE's tests, which have loads of examples of good things that you can copy from. You don't have to run the tests in CI. When you're developing tests, you can run them locally. It's a little bit of a faff. You have to work out the right container command line. But you can run the container locally with os-autoinst and you can then see the logs locally and iterate much faster than having to push your changes to CI every time. That's a great help. I've messed up the numbering slightly. I guess you can follow it. There are a couple of errors that only appear in the logs. So the web UI is great, but occasionally you see like a 0% match and you're like, but these are the same. But it turns out there was something invalid in the needle and you get a 0% match, but it's only reported in the logs. So don't lose time to that. I've lost probably hours collectively to forgetting this and then remembering it again. Also, the upstream containers are useful and they're usable, but it's a very rolling release process. So you probably want to pin a specific version and update it when you're ready to deal with the new changes. Don't just pull the latest one every time. So within GNOME, this is kind of ready. It's working, but it has a bus factor of one at the moment. So before we can declare it stable, we need more people to get involved, both maintaining the infrastructure and maintaining the tests.
And these are some credits for people that have worked on this over the last 10 years. Apologies if I missed anyone, but I wanted to make it clear this is not something that I've done by myself in my free time. I'm really adding the finishing touches to a huge amount of work that's taken a decade to get to this point. On the topic of openQA, separately from GNOME, we're quite interested in it at Codethink, and we're continuing the openQA LAVA testing of the Linux kernel. We've written a small tool to control hardware from within openQA tests, and we've built this USB switcher, which is very useful if you have lots of test rigs and lots of USB hardware connected to them. If you want to see it, they have some real ones over there. Find these people and they can tell you how great it is. So, that's everything from me. Please, if you want to get involved with the GNOME initiative, it's a good time. I'll help you to learn everything about openQA you ever wanted to know and more. You can get involved on Discourse or on Matrix, and I'm going to leave it there. I think we have a few minutes for questions. Thank you. We've got one here already. Okay, the question is how many bugs has this caught and how many false positives? It's caught some real bugs. We're not testing a huge amount of stuff at the moment, really. We're just testing that every app starts. That can already catch some interesting breakages, because in GNOME we use 12 different programming languages, and if any of the bindings break, then one of the apps will stop working. But it's found at least two known real bugs where I reported it to the app maintainer and they said, oh, wow, yeah, that's broken. There have been probably 20 or 30 false positives, in the sense that the test failed, I had to go in and update the screenshot, and the test passed again. But that's quite an easy process. At the moment, I'd say it's worth the effort. We're still going to evaluate over time if it's really worth the effort.
But I think it's promising: as long as we keep the test suite small, and we don't have to keep updating a million screenshots every time they change the desktop background, then I think it's going to be useful. One at the back first. That's a really good idea. Yeah, so the question was about testing other locales, like Chinese or German. I hear Turkish is always a fun one. You put in the Turkish translation and things always break. It's not an immediate plan, but it definitely would be good to do that. Another thing is testing themes. We have a high contrast theme and a large text theme for accessibility. These often break as well, because they're not widely used, but they're very important. So yeah, in the distant future, we want to do that. One here. Have you thought about some process for this? Usually when maintainers of apps make visual changes, they know that they're making visual changes. Is there some process where they can add a comment in the CI, or some other way that information can be injected into openQA to reduce the false positives? Yeah, so the question is whether we can get app developers to notify the tests of changes somehow. I think the solution is to get app developers interested in actually using this. So my goal is to have the developers of these apps actually finding the tests useful and maintaining their own small set of tests, and then they will know at that point, okay, I just changed this, so obviously we're going to have to update the tests. It's certainly not going to scale if it's always just me doing it. So hopefully at the GNOME conference this year I can get everyone excited about this. One more question over here somewhere. If you find an actual bug in an app that's only happening in the openQA environment, how do you give them the reproducer? How easy is it for a developer to reproduce the environment? Okay, yeah, good question. So the question is how easy is it to reproduce this environment?
It's actually quite easy because it's GNOME OS. The developer can go to os.gnome.org, download a VM image and run it in GNOME Boxes, so they can boot the virtual machine and it's exactly the same code, right down to the kernel and systemd and everything. So it's an effort, they have to download this image, but they get exactly the same environment and they can reproduce exactly the same bug. Most of them don't; they just install into /usr and try to reproduce it there, but it's possible to do it this way. Yeah, so from what we saw, openQA is focused on visual testing and comparing screenshots. Is there a way to mix that with, or perhaps only do, headless testing with commands from the CLI or from the console? Yeah, that's a good question. So the question is whether it can do more than just screenshot testing. It can. I mean, we've got a serial console, so we can run arbitrary commands. I think if we were going to do that, we wouldn't write the tests in Perl in openQA. What we'd probably do is write the tests in Python or in C, inject them into the OS image, and then just run the test program over the serial console and check that it outputs pass. So it's definitely possible, and I think that's how we would do it. So how are we for time? No more time. No more time? OK, well thanks everyone for watching. Thank you very much. |
Console Automation with Termie
Practical and fun automation for all your terminal sessions |
Okay, hello everybody, welcome. Thank you for coming to my talk. My name is Brian Duggan. I'm going to be talking today about something called Termie, which is practical and fun automation for all your terminal sessions. I'd like to thank my employer Instacart and the Perl Foundation for helping me to be here. I'm on the logistics team at Instacart. Okay, so here's an outline of the talk. I'm going to give a quick overview of the concept of Termie, what it does, what it's all about, go through some of the features, explain the scripting capabilities, and then say a little bit about why it's written in Raku, which is usually the first question I get, but I'm saving it for last. Okay, so the basic concept of Termie works like this. You have your shell, you type termie to start the session, and it starts a tmux session. How many tmux users in here? Oh good. Okay, screen? Okay, you guys can fight it out later. So it starts a tmux session, it puts you in the bottom half, and basically anything that you type into the bottom half goes into both the bottom half and the top half. So you type, what is Termie? And it sends it to the bash shell session at the top, which doesn't know what "what" means. Depending on your environment, you'll get some strange error messages about what command you're trying to type. So I did this, and since this is the automation room, I thought maybe I could just automate this talk completely. I found a version of ChatGPT on the internet that had a command line wrapper, and I thought maybe I could just get ChatGPT to write my entire talk for me, automate it away, and then I would be done. Okay, so I did the pip install, which sends a lot of things to the terminal, as everybody probably knows, with lots of recommendations about what to upgrade. Finally, I had the ChatGPT executable, so I typed that and I said, what is Termie?
But I did not get very much information, since it didn't know about the talk that I hadn't given yet. So then I said, Termie is being presented at FOSDEM, which was a little bit better, but I still had to add a little more substance to the talk. And then the program hung, so I had to interrupt it with Control-C. The way you do that with Termie is you use a backslash, which starts the command. Anything that starts with a backslash, kind of like the Postgres command line interface, is a directive to Termie. So \stop says send a Control-C to the other pane. Okay, so I got a keyboard interrupt, but that wasn't enough to stop it, because that was trapped by the Python interpreter. So then I sent another one, and that finally gave me a stack trace, which everybody who uses Python sees a lot. And then finally I was done with this session. Okay, so the basic concept here is simple. You have something on the top, something on the bottom, it's the same, and the things on the bottom go to the top, so you have a kind of interactive session. And then you can also send these additional commands to the top. Okay, so now I'm going to go through some of the features. As you saw from the last one, you have everything sort of organized on the bottom. Even if you have stack traces and things on the top, you still have a nice little session that shows you what you're doing. You can set up macros, you can run scripts, you can wait for things, and I'm going to go through a few more of these in detail in the next few minutes. So it has GNU Readline built in, and it has a few ways of getting at history: there's Readline, there's a last command, and there's also fuzzy find, fzf. Anybody use fuzzy find for things? Yep, so you can search your history with that.
And it searches your local history, right, even if you have several different sessions on the top. Maybe you're on different machines, or maybe part of it is in some other application and some of it is in a shell. So for instance, let's say you're using psql, you're connected to a remote database, you have your local history, and maybe you run some SQL that's going to show you the long-running queries. You can write a macro to send all of this. The way you do that is you say \edit, then you have a text editor and you put your query in an SQL file. Then \alias will create a macro: you set the name of the macro, and then \run says run this little script, which will send it to the other console. After that, you can just type \find-queries and the top will get the SQL that you put into the file. Okay, so it can be convenient for things like that. Or you could use it with Redis, or for doing your kernel testing, as we saw in the first talk, or whatever. For any sort of session you can just make a macro and send it. So here's another example. In this case, instead of using a macro, we're going to send standard out from a command that we run locally to the other pane. On the bottom, I say \delay 3, which means wait three seconds between every line that you send to the top, and then \shell means just run this command and show me the output. So I say \shell cat eg/simple.bash, and here you can see my bash script, which does an echo, echo docker run, and then echo hostname. Then when I say \do, it runs that command, and the output from that command gets sent to the top, and after every line it waits three seconds.
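The throttled sending described here can be sketched in Python. This is only an illustration of the idea, not how the tool is implemented (it is written in Raku). The function names, the pane target "0.0" and the injectable `runner` and `sleep` parameters are assumptions made for the example; `tmux send-keys -t <target> <text> Enter` is the standard tmux way to type a line into a pane.

```python
import subprocess
import time


def build_send_keys(target, line):
    """Build the tmux command that types `line` into pane `target`."""
    return ["tmux", "send-keys", "-t", target, line, "Enter"]


def send_throttled(lines, target, delay=3.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Send each line to the pane, pausing `delay` seconds between lines.

    `runner` and `sleep` are injectable so the sketch can be exercised
    without a running tmux server.
    """
    sent = []
    for index, line in enumerate(lines):
        if index > 0:
            sleep(delay)                 # give the previous command time to finish
        command = build_send_keys(target, line)
        runner(command, check=True)
        sent.append(command)
    return sent


if __name__ == "__main__":
    # Dry run: print the commands instead of calling tmux, and do not sleep.
    send_throttled(
        ["docker run -it debian", "hostname"],
        target="0.0",
        runner=lambda cmd, check: print(" ".join(cmd)),
        sleep=lambda seconds: None,
    )
```

The delay matters for the same reason given in the talk: the second line should only arrive once the container from the first line has had time to start.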
So it sort of throttles the output, which we might need because it might take Docker a few seconds to start before you run the hostname command on the shell inside the container. So the bottom is what you're typing, the top is what you see. Here's another feature. In addition to standard out, you can take the output of the top and send it to standard in of anything that you write. In this example, I'm using the nl command, which basically takes standard in and outputs line numbers for the lines that are coming in. So I say sleep 3 and then head /usr/share/dict/words. The reason I do sleep 3, if you think about it for a second, is that when I do the exec command, I need time to type it, right? So I type the sleep 3 and the head, and that gets sent immediately to the shell, which waits a few seconds. Then I type \exec nl, and then standard in comes in, and it prints out what goes out. In the real world, you probably won't have to sleep, because there will constantly be stuff coming through the top terminal. A few other interesting commands. Await will just wait for either a string or a regular expression to appear in the top. Enqueue is a way to enqueue a command to run after you've finished awaiting. There's grab; repeat, to send the same thing over and over, maybe with an interval; send a file; and we already saw what delay does, setting the delay. And there are actually a lot of commands. There are 43. Whenever I needed to do something, I added a new one. So if you have any ideas, send me a PR or send a request. There are 43 different commands right now. Actually, 44. I think I added one this morning. Okay, so scripting. Anybody here use expect? A few people. Okay, so expect has been around for a long time, 1993, but it's still pretty useful if you have to interact with a program that requires a TTY.
So here is an example of an expect script on the left. In this case, what we're going to do is start a Docker container again, and then we're going to run useradd to add a user. And then we'd like to set a password for the user. So we're going to run the passwd command, and we're also going to look for the prompts that are coming back. Okay, so on the left we see the way expect works: you say spawn, and then expect takes a pattern. So root@ is what comes back in the prompt. And then you send useradd termie; we're going to add a user named termie. Then it has a regular expression, expect -re, that you can then capture with expect_out. And then finally at the bottom, we're going to print out what we caught, like we captured the fact that the host name was something that was in the prompt. So you can do the same thing in Termie. You can say #!/usr/bin/env termie and then set it to be an executable file. The default is to just send everything, so it's just kind of like you're interacting from the console. All the lines just get sent directly to the other pane. So you just say docker run, and then \expect is just kind of like the expect command. There's a little subtlety in there that you usually don't think about, because it's sort of intuitive as a human: you type it and you're expecting something. But really there's a race condition there, right? Because between the time that you send your command and the time you send the expect command, the output might have already happened. The way that expect deals with it is that it keeps track of the stream, so there's this running pointer into the output stream and it can go back. And the way we do that with Termie is we basically run the expect before we run the command. So it'll basically say, now I'm starting to watch the output, and then it'll send the command, and then it'll capture the output.
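The ordering trick described here (start watching before sending the command) can be sketched in Python. This is an illustration under stated assumptions, not the tool's implementation: `FakePane`, `watch`, `emit` and `send_and_expect` are hypothetical names, and the pane is simulated in memory instead of being a real tmux pane.

```python
import threading


class FakePane:
    """Hypothetical in-memory pane that notifies registered watchers."""

    def __init__(self):
        self._watchers = []
        self._lock = threading.Lock()

    def watch(self, needle):
        """Register interest in `needle`; return an Event set when it appears."""
        seen = threading.Event()
        with self._lock:
            self._watchers.append((needle, seen))
        return seen

    def emit(self, text):
        """Simulate output arriving in the pane."""
        with self._lock:
            for needle, seen in self._watchers:
                if needle in text:
                    seen.set()


def send_and_expect(pane, command, needle, run_command, timeout=1.0):
    """Start watching first, then send, so fast output cannot be missed."""
    seen = pane.watch(needle)      # 1. register the watcher
    run_command(command)           # 2. only now send the command
    return seen.wait(timeout)      # 3. wait for the expected text


pane = FakePane()
ok = send_and_expect(pane, "docker run -it debian", "root@",
                     run_command=lambda cmd: pane.emit("root@3f2a:/# "))
print("expected prompt seen:", ok)

# The racy order: the output arrives before anyone is watching.
pane.emit("root@3f2a:/# ")
late = pane.watch("root@")
print("late watcher saw it:", late.is_set())
```

A watcher that registers after the output has already arrived never fires, which is the race the speaker describes. Expect avoids it by keeping a position in the buffered stream, while this sketch follows the second approach of registering before sending.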
So when we run this, it runs interactively in tmux, so you can see on the top it sends docker run, pulls the image, and now it's waiting to see the prompt. It waits for the root prompt, after that it sends the useradd, waits for new password, and then finishes successfully. The output here is in the Test Anything Protocol. You may or may not be familiar with it; it's just ok and then the number of the test. And currently, if it doesn't get what it expects, then it aborts the tests. Okay, so quickly I'm just going to say why it's written in Raku. The main idea here is that Raku has a lot of very nice ways to do asynchronous programming and also to interact with other commands. So you can quickly open a pipe to tmux and interact with it using asynchronous processes. You can run things synchronously. It's got good communication not just between processes but between threads. It has built-in constructs like supplies and channels and promises, which you may be familiar with from other languages. So this is an example of how you could tail a file and create a supply, which is a built-in type in Raku. And finally, this is kind of like the implementation of expect. It's really pretty straightforward. You can basically set up an event loop in a separate thread using this construct: start starts a thread, and react and whenever say here's an event loop. And then if it's a string, we look to see if it contains the target; if it's a regex, then we send it to a channel so that we can have it available to use locally. So even if you're not interested in using Termie, you might find some value in using Raku for automation. That's the end. Thank you for listening. Questions? I think I have a few minutes. Yeah, that's a good question. So different shells do different things that you're not really aware of, even something as simple as printing a prompt. They don't always send a newline character.
Sometimes they'll send escape sequences that go to the beginning of the line and then go down a line, or sometimes they'll even redraw the line above it. So it works fine, but you just have to be aware of the idiosyncrasies of the various shells in terms of what they do to the terminal. It does interfere. One of the commands is to buffer the lines, and it does get tricky to split up the lines when there's a lot of cursor movement. Sorry, with serial consoles? So that aspect is basically taken care of by tmux. It doesn't do the direct communication with the serial console the way that expect would. Yep. We have one more minute. Last question. So we only have a handful of users, so now is your opportunity to request features. We don't have to worry too much about backwards compatibility. Yep. That's it. Thank you. |
Fear the mutants. Love the mutants. |
Hello. How's it going, everybody? A lot of people in this room; I didn't really expect so many. This is wonderful. Thank you for coming to see us. I just want to say that we want to talk today about mutation testing. That's what we're here for. If you like this penguin... does anyone not like this penguin? Just you. Okay. Personal vendetta noted. This is a penguin generated by DALL-E. Hopefully, it's friendly enough, because this is going to be part of our talk. We're going to see a lot of penguins in this talk. If anyone has a personal objection to penguins, please speak now. Otherwise, if you like penguins, can I get a hand up just to see if we're cool with that? Awesome. I've never seen so many people want to put their hands up but not really be sure. I absolutely love the energy in this room. My name's Max. I'm this guy. As you can tell, I'm also this guy. I'm here to talk to you about mutation testing. I work for a company called Vonage and I'm a Python developer advocate there. What that means is that I maintain our Python tooling. I'm here to talk about mutation testing because I've just kind of gone through this process myself of understanding all this stuff and applying it to my own work. I want to show you how that went. But with me, not only do I have the tallest person in the room. Stand up straight. Stand up straight. This person is 196 centimeters tall. I'm like 177. I'm not short, I promise. I'm average in Britain. In this place, right? This person knows a lot more about mutation testing than me. I'm really not the expert here, but I just want to say this is Paco. Yes. I'm Paco. I work for OpenValue, a small consultancy company in the Netherlands. I got into mutation testing via my thesis. When I wrote my thesis on test effectiveness, I wanted to learn more about mutation testing. After that, I also got into speaking at conferences and spreading the word about this, which is quite awesome, too.
I hope that at the end of the talk, you have another cool tool in your toolbox to write better code. Awesome. If we're cool with that, we do have to do the obligatory slide. These companies paid for us to come here and paid for our flights and stuff. I'll just quickly tell you what my company does. We do communications APIs as a service, basically. Things like SMS, voice calls, video chats and two-factor authentication, all via API. That's kind of what we do, and that's really all I want to say. It is relevant because I will show you what I actually applied this to, which was one of our SDKs. For me, we don't actually have a product to sell. Also, I definitely didn't fly here from the Netherlands, just to make that clear. It's just a two-hour car drive. No, so we're just a consultancy company, and we really like to share knowledge. That's mostly the reason why I'm here, to tell you more and teach you more. It's quite simple. Yeah. He doesn't have the funding crush that I do, unfortunately. Luckily, we're all good. There's two of us on this talk. There's two of us here, and actually, there is a third person in this talk. We've seen a hint about this person already, but this person's really the thing that's going to tie this whole talk together, and it's going to get us all feeling good about mutation testing. This person's very important, so say hello to Henry. This is Henry. Look at his little face. Thank you. Hands up if you think Henry's a cute AF penguin. Thank you very much. Yes, I'm glad we agree. I'm glad we're on the same page. Now, just some quick audience participation, because if you can't tell, we're quite big on audience participation. So, quick question here. Who has heard of this stock photo, but more importantly, who's heard of testing? This is just a check to see if we're in the right room. Thank you very much. Great stuff. Okay. Who's heard about code coverage? A lot of people, maybe not everybody, and that's okay if you haven't.
We're going to talk about code coverage, so please don't worry if you haven't. But yeah, it's awesome to know that some people have. That's a good starting point too. Okay, final one. Other than via knowing about this talk, who's heard of mutation testing? Oh, quite a few. Yeah. And now, a quick follow-up. Who is actually already using mutation testing? Ah, nice. There are enough quick wins here, and hopefully you have some good experiences. Yeah. So, it's really nice to see that people are familiar with the concept, but if you're not, that's also okay, because we're going to go through this like you don't know anything at all. When I started doing this a few months ago, I didn't know anything at all, and so I want to take you through that journey as well, and that's what we're going to do. But before that, what I want to do first is give us some background, and what I actually really want to do is pass to Paco, who knows a lot more about this than me, so I'm going to pass to you right now. Yes. This is going to be some improvising. Good work. Good luck. I'm going to drink water with this. I'll feed you. Yeah. Nice. Great. So, yeah, we're first going to talk a bit about testing in general, and then we're going to talk more specifically about unit testing. So, just a quick check. Does anybody know what a unit test is? That's great. I don't have to explain that part. For those who don't know, it's the smallest possible test you can write in your code base: you take just one method, and you write one test for it to check the outcome of that method. Now, there are many different reasons why we write unit tests, and I think my favorite, or the most common one, is maintenance. We write tests because we want to be confident in the changes we make to our code base. So, whenever we make a small change, say we add a new field to some endpoint, we know that we didn't completely break the database integration, because that can happen at times.
So, yeah, that's very important: maintenance, regression testing. But there are more reasons. One I also like a lot is that tests can actually serve as documentation. You can use tests to describe certain scenarios in your code base, and when you have a specific test for a scenario, it already makes clear that this is intended behavior. I have an example for this. I worked for a company where we had an endpoint that returned warehouses, and these warehouses, just a domain object, had a soft delete. So, there was a flag in there that indicated whether it was deleted or not. This endpoint returned both deleted and non-deleted warehouses, and at some point, as we were working on it, a new guy came in, looked at it and said, hmm, that's strange. Why are we returning deleted warehouses? Why would you want that? It was a fair question, because we had also forgotten, and there was only one test, which tested the success flow. You can already kind of guess a bit here. The success flow in this case meant the test only returned non-deleted warehouses. So he made the change, and we all thought, oh, this makes sense, it looks broken. Of course, nobody checked with product management. The team deployed it, and then, as you can guess, this was broken, so we had to revert it. And the whole lesson here was that just one test which also included a negative scenario, with warehouses that were deleted, could already have been a trigger to think, hey, this behavior is intended. That's where tests can serve a sort of documentation purpose. They're also very useful in getting to know a new code base. So, whenever you're on a new code base and you have this very complicated method, a test can help you step through the method and explain what's going on, for example while debugging it. Now, another one, and this one is here for the consultants. So, who here works as a consultant?
Oh, not that many. Wow. Because we're always sort of the root of all evil. We tend to run to the next project, and we often, not always, don't have to maintain our own code. So I have this nice quote that's mostly also for us. Keep in mind that you're not doing this only for yourself. I had a colleague who once told me this. You always have this point in your development process where you think, okay, should I write a unit test for this? It's going to be a painful unit test. I know that it works. Do I really have to document it? We all know how it works. Yeah, sure, we all know how it works, but we also leave the project and go on to another project, we as consultants. And I ask myself, what would I want if I were the next person? What would I want if I were the next John or Jane Doe working on this project? So tests are not there just for you, but also for the next person working on it. I would actually like to jump in here, because I've been that person. Thank you. I've been the person who works on a project after someone's left. And honestly, if you have good documentation, or if you don't have that, if you have good testing... thank you, you do your water break. So if you have good testing, it can really help you understand what a project does. When I came to a certain project recently, I didn't necessarily have the kind of testing that I would have liked to document the code that well. And honestly, if I'd had someone like Paco, who was a bit more conscientious with what they tested, that would have really helped me get on board with the project quickly. But as it was, this was a real problem for me. And it's something that we hopefully want to save other people from having to deal with as well. Quick question, actually. Has anybody ever taken over a code base, looked at it and gone, what the heck is this? Okay, so you know the point of this slide, right?
You know why we're saying this. We know this is important. Now, let's stop that from happening to the next generation of very pained developers, right? Let's stop that happening. Yes, so write tests. And if all these reasons haven't convinced you, there's often a team lead or a boss or somebody else who's telling you to write tests. In most cases; there are always exceptions, of course. Ah, okay. Wow. This is annoying. So at the end of the day, we're all writing tests, if not for ourselves then for someone else. And even though we're now all happily adding tests, we also have to sketch a problem scenario here. The problem is that as projects evolve and grow, our tests also evolve and grow. We reflect a lot on our production code and spend a lot of time keeping it clean and well monitored. We have lots of metrics for it. For tests, on the other hand, what you can see on long-living projects is that sometimes you just get tests with nothing more than a blank setup and teardown and some mocking going on, because the functionality moved away long ago. Which is to say that test code is often not monitored. Test code is sort of the kid that didn't get all the attention it needed. There is still one metric for testing, though. What do you think is the most used metric for test code? Yes, yeah, we sort of gave it away already in the intro, but yes, code coverage. Code coverage tells you how much of the code is executed when you run the test suite. And I personally really like code coverage because it already helps you write more and better tests. I want to go through a simple example here to show you how it can already help you. So here we have a submit method. So this is the Python guy. I'm the Java guy. Yeah, he said simple example, but I don't think so. Yeah.
So the context is that you are at a conference and you have a service where you can submit proposals. You can't have three or more open proposals, and you can't submit after the deadline. If you do either, you get a failure; otherwise you get success. So it's quite a simple method, with everything passed as a parameter just to make it easy to explain. If we take method coverage, which is the simplest coverage metric we can get and which checks whether this method is covered, yes or no, we can add one simple test, call it test X, which submits a proposal. There are no open proposals, which is good, and we have a deadline that's 999 seconds in the future. So great. Now we can go a step further, to statement coverage. With statement coverage, we check whether each statement was executed, and now we see, hey, we didn't cover our unhappy flow. So we need to add another test. In this case, we add another test which has five open proposals, which means this check evaluates to true and we have a negative scenario. Now we can go even one step further, to condition coverage, for example. With condition coverage, we check whether each Boolean sub-expression has been evaluated to both true and false, because what we don't know yet is whether our deadline check is actually working. We just know that it returns false, but we haven't seen it return true yet. So we add one more test, now with a deadline that is 999 seconds in the past. And now we have three tests. This is already why I like code coverage so much: it really helps you write proper tests. Let me get on to the good part here. As I said, it helps you write better and more tests. Code coverage is really easy and cheap to measure. In most languages, I think, it's just a matter of instrumenting the code.
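The submit example and its three tests can be written out as a short Python sketch. The slide itself is not part of the transcript, so the signature, the parameter names and the "success"/"failure" return values below are assumptions; only the rules (fewer than three open proposals, not after the deadline) and the three test scenarios come from the talk.

```python
def submit(proposal, open_proposals, deadline, now):
    """Accept a proposal unless the speaker already has three or more
    open proposals or the deadline has passed.

    Everything is passed in as a parameter, as in the talk, so the
    function needs no clock or database.
    """
    if len(open_proposals) >= 3 or now > deadline:
        return "failure"
    return "success"


NOW = 1_000_000.0  # a fixed "current time" in seconds, so the tests are repeatable


def test_happy_flow():
    # Method coverage: no open proposals, deadline 999 seconds in the future.
    assert submit("my talk", [], NOW + 999, NOW) == "success"


def test_too_many_open_proposals():
    # Statement coverage: five open proposals reach the failure branch.
    assert submit("my talk", ["a", "b", "c", "d", "e"], NOW + 999, NOW) == "failure"


def test_after_deadline():
    # Condition coverage: the deadline check itself evaluates to true.
    assert submit("my talk", [], NOW - 999, NOW) == "failure"
```

Each test corresponds to one step up in the coverage criterion: the first only calls the method, the second reaches the failure statement, and the third makes the deadline sub-expression evaluate to true on its own.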
You run the test suite and you get a nice report out of it that everybody can quickly read, and you can quickly see the pain points where you're lacking in testing. But to go a bit further: as I mentioned, it shows you what you didn't test. And now I'm getting to the bad parts. The only guarantee it gives you is that what you did test didn't crash. It doesn't actually guarantee anything about functionality, because code coverage can be quite misleading. It doesn't guarantee any test quality. So if I take this method, for example: this is a unit test, a valid unit test, and this test generates coverage. It calls a method, but there is no assertion on the result. So this test can generate, for example, 80% coverage, yet the only thing it actually guarantees is that the method doesn't crash. It doesn't tell us whether it returned true, false or anything else. And this is the pain point of code coverage, which brings us to something nice which Max told me about, which is called Goodhart's law. So can you maybe explain a bit about that? Can I grab your clicker? Can I explain about Goodhart's law? No, sorry. I can't. Just kidding. Okay, so when a metric becomes a target, it ceases to be a good metric. So quick question, has anyone ever written a unit test just to get coverage up rather than because the test was useful? Come on, let's be honest. This is the safe space. Okay. Microphone, okay. Hello, everybody. Welcome to the live stream. This is our radio announcer voice. Right. So this is something, I'll be honest, I've done. We now know a lot of people in the room have done this. Code coverage is supposed to tell us something about our code. But if instead we turn that into a target, that can really limit the kind of useful tests that we actually create. And that leads to a few quite big questions that we do genuinely care about.
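The assertion-free test described in this part of the talk can be illustrated with a small Python sketch. All names here are hypothetical; `always_success` is a deliberately broken implementation added only to show that a test without assertions cannot tell it apart from the correct one.

```python
def submit(open_count, seconds_to_deadline):
    """Correct behaviour: refuse with three or more open proposals or after the deadline."""
    if open_count >= 3 or seconds_to_deadline < 0:
        return "failure"
    return "success"


def always_success(open_count, seconds_to_deadline):
    """Deliberately broken stand-in: never refuses anything."""
    return "success"


def assertion_free_test(fn):
    # Executes the code, so it counts towards coverage, but checks nothing.
    fn(5, 999)


def asserting_test(fn):
    # Same call, but the result is actually verified.
    assert fn(5, 999) == "failure"


for implementation in (submit, always_success):
    assertion_free_test(implementation)      # passes for both implementations
    try:
        asserting_test(implementation)
        outcome = "passed"
    except AssertionError:
        outcome = "failed"
    print(implementation.__name__, "asserting test", outcome)
```

The first test passes for both implementations, so the coverage it generates says only that the call did not crash; the second test fails for the broken one.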
So I'll wait for that photo if you want. Cool. Sorry, I'm very into audience participation. I'm very sorry. So the next question that we ask is: how do we know if our tests are high quality? How do we know if these tests are actually good quality tests? We test them. We test them. Great, great answer. I've got a further follow-up question for you. How can we understand what our tests are really doing? Same answer, if anyone... I see a hand. I literally had a code base where I could delete half the tests and nothing changed. And they all, yeah. [inaudible audience comment] Hello. Yes. So just for the live stream, I'll repeat that, because that's a really good point. I won't repeat the swearing, but I do understand and appreciate the emotion behind it. If you end up shipping some code that does not do what it's supposed to do, you end up with users getting very angry at you. And yeah, that's a problem, right? That's going to be an issue. And that is a way of finding out, but I guess the real question we're asking here is: how do we know if we can trust our tests? That's really the crux of this problem, right? And as it turns out, the very famous Roman poet Juvenal, in about 100 AD, after he'd had a few drinks, was able to summarize this in such a beautiful way. And this was something that maybe wasn't appreciated at the time because, you know, obviously he was talking about mutation testing 2,000 years before it was relevant. But I will mention it here. It's "who watches the watchers?", right? And this is the question. Who's testing our tests? Who cares about that? How do we actually gain trust in our tests? And I see there are people here who've had bugs in production. There are people here who understand that this is a really big deal. Luckily, we have a two-word answer for you, which is the reason we're all in this room. Mutation testing. So, spot the odd one out. 
You might see here, that's Henry. He's having a great time, but maybe he shouldn't be in a row of pigeons. But more importantly right now, I'll just explain the basic premise, and then Paco here will explain in a little more detail how it's actually done. So first of all, mutation testing; this is a really quick summary. What you do is you introduce some faults in your code, just a few little things that you change. And each of those little changes gives you a mutant version of your code. Once you've got that, you run your test suite against those mutant versions of your code. And if the tests fail, awesome, because that means that your tests have actually picked up that change. And that's a good thing, right? We want those tests to fail if our code changes. But if they don't fail, that's a bad time, because that means those tests didn't test that change. And so that's something that could have made it to production. So what mutation testing gives you is a way to evaluate test quality. But this is very abstract. So let's look at penguins. I like penguins. So Henry here, he's a great example and he's going to bring all this home. I was kind of unfamiliar with the topic, so I created some analogies with penguins that really helped me, and I'll share those with you. We do lots of stuff with messaging, so I imagine software that works properly to be like a pigeon or a dove, a bird that can fly. I've used a dove here because Paco has a deadly fear of pigeons. He's terrified of them. Not fear. Vendetta. He has a personal vendetta against pigeons. Sorry. He doesn't like them. So I've used a dove here. But ideally we want something where I can tie a message to the bird's leg and it can go and deal with that message for me, right? 
So it can go and do something like that. Now, one of the key features of penguins is that they're not very good at flying, right? Can we all agree on that? If you want to tie a message to a bird's leg and get it to deliver it, a penguin might not be the bird you choose, unless maybe you're delivering something underwater. So this is the kind of example where we've got a bird, but it's not the kind of thing that performs the way we expect it to. And this would cause some serious problems if we tried to use this kind of thing in production. If we wanted to send a message via a penguin, we're going to have a tough time, right? So Paco, I'd like you, if possible, to explain this in a way that makes more sense than what I just did. Good luck. We have one mic. It's a bit, yeah. So let's get into the process of mutation testing. What Max just told you about is introducing faults. You can introduce faults manually, but that process is, well, manual, which means it's a lot of work and it's usually also not that reproducible. You don't want to do it manually. We want to do this in an automated manner. And this is where mutation testing comes in. In the first step of mutation testing, we're going to generate mutants. Each mutant is just a version of the production code with one very tiny change. Mutation testing works with the concept of mutators, and mutators are the ones that make these very small changes. So what we have in this case is a perfectly fine dove, which is the production code. And then we have a mutator, which makes a tiny change, which transforms this into Henry, our penguin, who can't fly, and we want our software to fly. So this would be a bad thing. So how does it look? Because this is still a bit abstract. I'm going to give you some examples. This would be an example here. 
So for the Dutch, and I think for other countries as well, you have to be 17 years or older to apply for a driving license. This could be code that's in your code base, which will fly, which is good. Now, for the mutant, the entire code base stays the same and just this one little piece changes. So here we inverted the logic. This is, of course, a bug. This is something we don't want to get into production. And actually, just from this single line, we can already generate quite a few mutants, because we can not only invert the conditional operator, we can also change the conditional boundary. So this means that we now have age larger than 17, which is a very nice bug that forces us to test the edge cases, the famous off-by-one errors, where we forgot the equals sign in our conditional check. This will help you find that one. But it can also just always return true or false. So we can generate quite a few mutants for this, and we can do the same for, for example, mathematical operations. We can turn each plus into a minus, each multiplication into a division, et cetera. Furthermore, we also have the ability to remove statements. So in this case, we have a method that adds a published date to some object, and we can just remove the whole setter. This means that we now have a bug in which we don't set this attribute anymore, which is something that, of course, we don't want to make it to production. What's important to note here is that with mutation testing the mutated code must still compile, because we're not testing the compiler. We're testing the code. The compiler is definitely out of scope here. Now at the end of step one, we have a lot of Henrys. We have a lot of mutants. And now Henry is going to try to fly. So he's already got his wings ready to try to fly. And now for each Henry, we're going to run the test suite. 
And if this test suite fails, as Max already mentioned, then it's good, because then we have exposed Henry for what he is, which is just a penguin, something that can't fly. So this is great. The not so happy scenario is where the tests pass, which means that Henry made it into production. Well, assuming that it also got through the PR, of course; we have more than just tests. That is a problem, because Henry is not supposed to fly, and then we have a bug in production. So this is something that you don't want. So this is the theory of mutation testing. And now, Max, you can tell a bit more about the frameworks. Sure. It works for me. Alrighty. So first of all, I just want to say I'm so proud of this prompt. I don't know why DALL-E chose this, but I'm really happy. Like, I think I typed in "penguin trying to be a pigeon", and it came up with this. And I'm very happy. Okay. So moving on, yeah, frameworks. So this is going to get a little bit more specific to actually implementing this stuff. So is anyone here a Python developer? Heck yeah. All right. Awesome. So I'm going to show you what I did in Python. As you can see, Paco's a Java developer; he'll explain Java in a sec. But I'll just show you the basic concepts, using my code and what I did. So there are two main supported packages that you can use in Python. It's not like in Java, where there's an enterprise thing you can get; in Python it's very community supported. So you're not going to get big products. But what we do have are these nice, supported repos for mutation testing, which provide these packages. So I am not a professional in this, I'm not a doctor, I'm not a lawyer, I'm not a professional financial advisor. I'm just a person who has a certain opinion. And so my opinion of those two frameworks I showed you: there's mutmut and Cosmic Ray. 
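The driving-license mutants and the "run the test suite against each mutant" step described above can be sketched by hand in Python. This is a toy illustration, not how mutmut or any other framework is implemented: the function `may_apply`, the hand-written mutants, and the two test suites are all hypothetical.

```python
from typing import Callable, Dict

AgeCheck = Callable[[int], bool]


def may_apply(age: int) -> bool:
    """Production code (the dove): 17 or older may apply for a driving license."""
    return age >= 17


# Hand-written mutants (the Henrys), one tiny change each.
MUTANTS: Dict[str, AgeCheck] = {
    "negated conditional": lambda age: age < 17,
    "changed boundary": lambda age: age > 17,
    "always true": lambda age: True,
    "always false": lambda age: False,
}


def weak_suite(check: AgeCheck) -> bool:
    """Passes if a clearly-old-enough applicant is accepted; no edge cases."""
    return check(30) is True


def strong_suite(check: AgeCheck) -> bool:
    """Also tests the boundary (17) and a rejection (16)."""
    return check(30) is True and check(17) is True and check(16) is False


def run_mutants(suite: Callable[[AgeCheck], bool]) -> Dict[str, str]:
    """Run the suite against every mutant; a failing suite means the mutant was caught."""
    return {
        name: "survived" if suite(mutant) else "caught"
        for name, mutant in MUTANTS.items()
    }
```

Both suites pass on `may_apply` itself. With `weak_suite`, the "changed boundary" and "always true" mutants survive; with `strong_suite`, all four are caught, because the boundary value and the rejection case are now tested.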
And personally, I prefer mutmut, because it's easy to get going. Oh, angry face, shaking heads. You don't like mutmut. We will talk later. So if we have time, we'll have a third presenter very shortly. But for now, while I've still got the mic, while I'm still here, we'll talk about mutmut. This framework is quite simple to use. The reason I like it is that it's very much: you install it and you run it. There's a bit of config you can do, but really, it's quite simple just to get an idea of your code base and what's going on. So I want to show you this slide. This is the SDK that I maintain, and I'm showing you this because it's what I've applied my mutation testing to, so it's where my examples come from. I ran this locally, first of all. So I installed mutmut with pip install. It's that simple. It's a Python package. It's what we do. If you went to my talk on malware earlier, you know why that's a bad idea, but I did it. After we do that, we've got mutmut run, which just runs those tests for you. So I'll show you what my output was. When I ran this myself, I actually got a whole lot of output. But what's important here is that, first of all, it ran my entire test suite. And the reason it ran my entire test suite is to check how long that's supposed to take and to make sure everything works as expected, because there are various types of mutants to do with timeouts that we might want to consider as well. After it's done that, it will generate mutants based on the lines of code in my code base. And once it's done that, it will run my tests against those. So there are a few different types of result, and it characterizes them like this. The first type is mutants that we've caught. Not killed. We never kill a penguin. We love penguins. 
We catch them. We've caught them and put them back into the zoo. In this case, we've managed to say, yep, our test failed. That's great. But it could be the case that the mutant timed out. So it's taken way too long for this code to run, or it's taken long enough that we're not feeling great about that code. Alternatively, we might end up in a situation where the mutant survived and made it through our test code. In that case, it corresponds to a bug that might make it to production. So when I ran this on my particular SDK, what I saw was this. It created 682 mutants, versions of my code with changes in them. And of those, it managed to catch 512, but it missed 170. Now, whether that's a good number or a bad number, we'll talk about later. But what's important now is to look at some of those mutants. So first of all, the ones that we actually did catch. Here are a couple of examples. Here's a line where basically we say, here are some valid message channels. So for our messages API, here are the valid ways you can send a message, right? What's important here is that this mutant basically removed the ability to send an SMS. And so when I tried to test that, it failed, which is what we want to see. Here's another one. Again, this is Python. So if you're a Java dev, don't worry, we'll look after you soon. We've got a decorator here, which basically runs this method, and we can see that when we remove it, that will never happen. This is actually through Pydantic, if anyone has used that before. But basically, it means that we're not going to round a number anymore. And so when we test for that, the number doesn't get rounded and we catch that. But that is not really very interesting. That tells us about this much, right? It doesn't tell us much at all. And the reason for that is that we kind of know that our tests work. 
Thank you very much. I'll do the M&M thing. So we kind of know that our tests work for that. What's more useful is that if we run mutmut show, we can see the mutants that we didn't catch. We can also run mutmut html, which gives us essentially an HTML coverage-style output as well. So we can see in a list all of the mutants that we didn't catch. With mutmut show, on the code base that I just showed you, we can see the 170 mutants that survived. It shows you the indices of these, and then we can manually specify the ones we want to look at. So here we can see, for example, that we changed the authentication method to fail. And we can see in this case we caught that, because we did a test for authentication and it failed, so that's great. But more importantly, you get this HTML output, which you can then explore. You can explore every module that you have. You can explore all the methods inside of there and which mutants were and weren't caught. And you do that with the html command. So I'll show you a mutant that we did not catch, and I want to show you why we didn't catch it and what it does. And I'll do that for a few, just so you get some context, if that's cool. So first of all, what this mutant did was rename the logger. Now, I think logging is out of scope for my test code, so personally I don't care too much about anything related to logging. So I don't mind that this one isn't caught. Here's another one. In this case, we've slightly changed the value of a constant. This is just part of a function signature. And again, we don't care about this that much. It isn't something that I really mind about. What's more important, though, is this mutant here, because this is from our client class, where we instantiate all of our different API classes. 
And you can see we actually set voice to None, so we completely remove that instantiation. And our tests are still passing. The reason the tests still pass, even though nothing is testing that case, is that our tests actually test the voice API separately. They call it manually. But if our clients are calling it like this, maybe we should have a test for this as well. So this tells me, hey, maybe my test suite does need to be expanded. Does that make sense? I'm seeing some very, like, yeah, yeah, that makes sense. I like it. Awesome. Okay, so if you are a Python dev, this isn't the end of the talk, by the way; we've got some more context and we'll show you about CI. But if you are interested, feel free to scan this. You've got like four seconds before I move slides. And as I move slides, in very slow motion, I'll be passing over this microphone. Because this was just Python, of course, and I think there are more non-Python devs here. Just not Python. Let's see. We, of course, have more frameworks. I think there are more, for more languages, out there, but these are the most important ones that I like personally. And pretty much the only really good one for Java is PIT. And we also have Stryker. Stryker is one that supports quite a few languages. It supports JavaScript, C#, Scala. Of course, it doesn't do this in one tool. Each one has its own dependencies, because you can't have one solution for all. And what I particularly like about it is that it supports JavaScript. This brings what is usually a back-end heavy tool to the front end, and I think the front end can often use some love when it comes to testing. So this also brings the testing frameworks and the testing quality more to the front end. That's what I really, really like. But we wanted to discuss a bit more. Max already sort of introduced it. So what is a good mutation score? 
We had Goodhart's law, where we saw that code coverage can lead to people implementing tests just to improve coverage, which sort of defeats the purpose. You're doing it just for the metric, not for the actual purpose. And so how does this work with the mutation score? Now, first, here's a picture of how a PIT report looks. So, not to bash on Python, but it's much prettier and much clearer. What is particularly interesting about this one is that it shows you both the line coverage and the mutation coverage. We can ignore the test strength. And this shows us the sweet spots in a report. Because in the end, we have generated a lot of mutants, we have a lot of classes, and we only have very little time. So where are we going to look in this report to see where the problems are? The one that's the least interesting here is the NotificationService. The NotificationService doesn't have any coverage. And if there is no coverage, then the mutants are also not interesting, because you have a bigger problem here, which is that you don't have tests at all for this. Then you have a choice. You have ProposalService and ProposalService2. Now, the fact that they are named similarly is because they're from another example. But ProposalService2 is the one that has 100% coverage and yet didn't kill a single mutant. And this is the sweet spot, because this means that we have code that looks well tested, or at least there are tests covering this piece of code, but not a single bug was caught. So this deserves some attention, because it means that we didn't fully test this. So these are the hotspots when you open a report. The ones with high line coverage and low mutation coverage, those are the ones you really want to go through. 
Those are the ones that give you the findings to go to your team and say, hey, see, we need mutation testing, because these two classes alone already showed me that we need to improve our quality. Now, back to the score. In the example we had, we managed to kill 512 out of 682 mutants, which is about a 75% score. Now, the question is, is this a good score? Yes, yes, the golden answer. It depends. I love that answer. We already saw that 100% doesn't make sense. There are things like logging, and there are more things like generated code, et cetera, that you don't necessarily want to test, even though mutants are generated for them. Now, there are a couple of things you can do, of course. Depending on the language and the framework you use, you can tweak the mutation testing framework quite a bit. For example, PIT actually, out of the box, already ignores and doesn't mutate any logging lines. All the big logging frameworks are known to the tool. So anything that goes to SLF4J, it doesn't mutate. So it also doesn't appear in your report, which is quite nice. And you can easily add things: if you have a custom metrics facade somewhere, which is also typically something you don't want to cover with unit tests, you can add that as well. So the thing here is that the mutation score is not really a score you want to achieve. It's more that the report can be interesting to look at and points you to the sweet spots. And once you've set it up nicely and you're familiar with the report, you can maybe start looking at the score, but it definitely shouldn't become an 80% goal or something, like it was with code coverage. Just go through the report instead. So now we've discussed all the tools you need. We have discussed the frameworks. We have discussed the technology. And now it's time, of course, for you to fly. So how would you get started on this? 
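The score arithmetic and the "high line coverage, low mutation coverage" heuristic for reading a report can be sketched as follows. This is only an illustration: the `ClassReport` structure, the thresholds, and the figures for `ProposalService` are hypothetical, while the 512 out of 682 figure and the descriptions of `NotificationService` and `ProposalService2` come from the talk.

```python
from dataclasses import dataclass
from typing import List


def mutation_score(caught: int, total: int) -> float:
    """Fraction of generated mutants that the test suite caught."""
    return caught / total if total else 0.0


@dataclass
class ClassReport:
    name: str
    line_coverage: float      # 0.0 .. 1.0
    mutation_coverage: float  # 0.0 .. 1.0


def hotspots(reports: List[ClassReport],
             min_line: float = 0.8,
             max_mutation: float = 0.5) -> List[str]:
    """Classes whose tests execute the code but catch few mutants.

    The thresholds are arbitrary assumptions for this sketch.
    """
    return [
        r.name for r in reports
        if r.line_coverage >= min_line and r.mutation_coverage <= max_mutation
    ]


REPORT = [
    ClassReport("NotificationService", 0.0, 0.0),  # no tests at all: a different problem
    ClassReport("ProposalService", 0.9, 0.8),      # hypothetical numbers
    ClassReport("ProposalService2", 1.0, 0.0),     # fully covered, nothing caught
]
```

Here `round(mutation_score(512, 682), 2)` gives `0.75`, and `hotspots(REPORT)` returns only `ProposalService2`, the class worth opening first.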
And the thing I think that's important here is, if you want to start, so you now think, oh, this is a great talk, I want to start with mutation testing: depending on the size of your project, it might be wise to start with just a single package. I've done this on projects that are a couple of thousand lines big. And even though in Max's example we had 682 mutants, this can also, depending on the kind of code you have, easily grow to tens of thousands of mutants, which can be quite slow. It can also be that there's something weird in your code base that doesn't really work well with mutation testing, or something that's just extremely slow. There's an example that I had that's good to keep in mind. Actually, just to take a sidestep: the mutation testing framework also measures, in the beginning, for each individual test, which code it covers. So there's a nice graph from the production code to the tests. This helps us optimize, because if we ran the entire test suite, all the tests, for every single mutant, it would take forever. Instead, because we know the coverage, if we mutate this one line, we know which tests cover it. So we only need to execute those few tests. But what if you have tests that actually cover half your code base? So, for example, one of the things you can do in Java, if you're doing things with Spring, is boot up the entire Spring application and start doing acceptance tests from your unit tests. That's not necessarily the worst thing to do, but you now have a very slow test that covers half your code base and that will be executed for every single mutant. So these are things you want to get rid of. You want to exclude this acceptance test, because otherwise you're going to be waiting endlessly. So my point about starting locally and starting small was: start with just one package. 
Start with the utility package to see if it works, see if the report works for you. And then from there, you can expand to more packages, and you can also see, oh, now it's taking 10 times as long. Why is this? And you can find the painful packages there. So as I mentioned, you can exclude some tests, and there are often also certain pieces of code you might want to exclude. For example, there's no use in testing generated code. It might also be that you have certain domain packages that contain just your domain objects, your POJOs, which are just setters and getters, something that you typically also want to exclude from your coverage report. You might want to exclude these from mutation testing too. And now that's done. So we talked about running it on your machine. We can also do this in the cloud, of course. Thank you. So as you can see, there's a pigeon on the slide, and Paco, as we've said, has a personal vendetta, so I've taken over this section. So here we can see that we're going to run off our machine. Why would you want to run off your machine rather than on your machine? Any questions? Any ideas? What happens in the background? Yes. So what happens in the background is what was said there. Any other reason you might want to run non-locally? No. I got a couple. Oh, oh, hand. CI. CI. Yeah, you might want to run in your CI system. In fact, that's what we'll be showing you. So, foreshadowing. I like it. So yeah, it takes some time. And if you're using a CI system, you get to use those cloud resources. And what's also important is that if you've got code which is maybe dependent on different OSes and might behave differently, you can specify different versions and platforms to run on as well. So, "stop talking", I hear you cry. Well, I'm afraid this is what we're here for. So unfortunately, I will keep talking. But what I will do is show you a bit of an example. 
So I applied this to my own code base, in my CI system. You can see here, this is GitHub Actions. And I've got a piece of YAML, essentially. I've got this mutation test YAML file, and what that does is set up an action for me to use. This is something that I run manually, and I can do that here. When I manually run that, it will do the mutation testing non-locally, and it will produce some HTML output for me to look at. I'll go a little bit into what the YAML does, but it seems like something everyone should be able to do themselves if they want to. So GitHub Actions: the reason I show that is partly because it's what we use, but also because it's free for open source projects. So, you know, it's been useful for me because I've not had to pay for it. Just a heads up. So I'll be showing you this with GitHub Actions really quickly. I'll show you the YAML, I'll show you what I did. Hopefully by the end of the next couple of slides, you will see how easy it actually is to do this, and maybe you'll want to try this yourself when you get home. So here's some YAML. First of all, this is our mutation test YAML. It's got one job. It's pretty simple. We're running on Ubuntu. We're running one specific Python version to do this, depending on what your test base is. Oh, they're having a great time in there. Oh, there is thunder. So basically, yeah, I'm testing on one version because my code doesn't vary enough between versions and OSes. So for me, it's not relevant to do more. But if we look at this next slide, I'll show you the workflow that it goes through when I actually run this action. So first of all, we check out the code. Then we set up a version of Python. Once we've done that, we install our dependencies, now including mutmut as well as our regular dependencies. 
So now we've got the mutation testing framework installed on this test runner as well. Then what we do is run the mutation tests. We do that with mutmut run. But because we're running in a CI system, we don't want insanely long logs, and due to how the output is produced, we want a no-progress flag there so that we're not seeing every line of output. We just see the important parts. We also have the CI flag, which is one of my only contributions to open source, but I added that and I'm kind of proud of myself. That basically means that you get a sensible return code when you run in a CI system. Because the default for mutmut is that, depending on the type of mutants that we caught, it will give you a different exit code that is non-zero. So you need to either account for that or suppress it with some scary, scary bash. That's what I did at first. That's why I wrote the flag. So once we've done that, we save the results as HTML and we upload it so that you can access that yourself as well. So that's it. That's the whole piece of YAML. It's 35 lines, and that set up the entire mutation test for my suite. So hopefully, does this seem kind of easy? I think it seems pretty gentle to do, at least in this sort of scope. If you're a Java dev with a 20,000 line project, you might want to be a bit more careful, but if you've got like a Python hobby thing, try it out, right? Try it out. What I would say is, there are some more concerns. So first of all, I chose to run this manually, when I want to run it. I chose not to run this on push or PR. And the reason for that is that I don't expect my code base to change sufficiently between small commits. And what I really don't want to do is use mutation testing as, you know, that kind of score, that 75%. I don't want that to be a metric for me that I've just turned into a target. 
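The workflow walked through above (manual trigger, one Ubuntu job, one Python version, install, run, export HTML, upload) might look roughly like the following. This is a reconstruction under assumptions, not the speaker's actual 35-line file: the workflow and job names, the Python version, the `requirements.txt` path, the artifact name, and the `html/` output directory are hypothetical, and the `--no-progress` and `--CI` flags are assumed to be those of the mutmut 2.x command line.

```yaml
# Hypothetical sketch of a manually triggered mutation-testing workflow.
name: Mutation test

on:
  workflow_dispatch:  # run manually, not on push or pull request

jobs:
  mutation-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"  # assumption: any single supported version

      - name: Install dependencies, including mutmut
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt  # assumption: project dependencies live here
          pip install "mutmut<3"           # assumption: a 2.x release with the flags below

      - name: Run mutation tests
        run: mutmut run --no-progress --CI

      - name: Export results as HTML
        run: mutmut html

      - name: Upload HTML report
        uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: html/  # assumption: where `mutmut html` writes its output
```

Triggering only on `workflow_dispatch` matches the choice described in the talk: the run is an occasional health check, not a gate on every commit.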
I want it to stay as just a good indicator of what my tests are doing and what I could be doing better. So for me, I don't want to run it every time, partly because it takes a blooming long time, especially if I'm using multiple versions, which we also have to factor in. You might want to do that. I didn't. I just ran on Ubuntu and that was fine for me. But yeah, depending on what your code is, you might want to run on different platforms, right? So do factor that in, and that will help you a lot if you're in a CI system. So the other question there is: should we run on push or PR? My opinion is no. I think there'll be people in this room who disagree with me, who maybe say on a PR you should run it, or maybe there's some kind of metric you want to associate with the score that you then want to look at in some way. For me, that's not how I use mutation testing. And what I want to get out of this is that we don't want a situation where mutation testing becomes a new target, where we've got to get a certain score, because then we're just moving the problem of code coverage targets up a level. We're just doing that all over again, right? So we're trying to avoid that. So the final question here is one I'll ask Paco to explain. Paco, do you think I should use mutation testing, in my role as an audience member right now? What do you reckon? Yes. Well, as stated already, it depends. There are some things you can ask yourself, because whether it's needed is the question. Mutation testing is of course definitely not a silver bullet. The reports take quite some time to go through, and of course the process is quite computationally expensive to run. So the couple of questions you can ask yourself that are quite obvious are for projects which have a really high quality goal: where people die, or where a lot of money is lost, or a combination of those two. 
So just to check, how many of you are working on a project that fits in these three? Okay, then you needed this yesterday. Yes. But for the rest of the room, including me, there are some other questions we can ask ourselves. And I think one of the important ones is: are you using code coverage? Because if you're not using code coverage, let's start with that; let's first get coverage and see how many tests you have. Then the next question is: how much value do you put into this? How much value do you get out of this code coverage? And what I mean by that is, do you make decisions based on it? Is it part of the definition of done in your sprint, or does the build fail if you don't reach 80% coverage? Or, in the case of due diligence, you're selling a company, not something we'd all do. But then you would also want to know how good the software is that I'm buying, or the software I'm working on. So here I would say, if you're using code coverage and you're making decisions based on that code coverage, then yes, you should at least have a look at mutation testing to see what the state is. You don't have to do this always. You don't have to put it in CI. Do it just once a year, or go home and run it on your computer once, just to see what the current state of your team is. Because it can very well be that you're on a high-performing team, which already has its PRs and everything set up so well that it's maybe not worth the time. The mutation testing report might even confirm that, by showing that you killed all the mutants. So that would be great. And there's another question that I like: what's the cost of fixing a bug? And I have two stories for this. My first example is the first company I worked for. This was an enterprise company that built software that was running on-premise at the customer, and the customer was government. 
And then you're in line with all these big integrators, which means you have feature freezes and fixed moments where you can actually go to the customer and deploy your software, which is quite expensive. It also means that if you get a bug after this feature freeze or after this upgrade window, you have a serious issue, because you need to go to the customer and you need to explain what went wrong. It's a very costly issue. So here, again, mutation testing can definitely be quite interesting, because a lot of money can be involved, along with your reputation. The other example that I had was more of a greenfield project, which had more of a startup vibe, where it was really a fail fast and fix fast mentality. So this was a project where, rather than focusing on getting our quality monitoring up to speed, we were mostly focusing on making sure that we could fix bugs very quickly. It was of course running in the cloud rather than on-premise at a customer, so we could control it. And the most important goal there was to just click a button and be in production again in 10 minutes, and to have active monitoring to see if anything goes wrong. Here the cost of fixing a bug is already a lot lower, which means that the reason to consider mutation testing might be a bit weaker, especially if you're again in, for example, a high-performing team, where you are all attuned to each other, you know what you're doing, and you know you can trust each other because you're all professionals. Then maybe it's not worth also spending half a day going through a mutation testing report if you already know what the outcome is probably going to be. Then again, still do it once. These are two things you could consider when deciding whether you want to use it. So what I want to leave you with is: don't go into it blindly, just ask yourself, should I really use it? And then, yeah, for the last part, I'd just like to sum up.
So I think hopefully, if we've gotten here, we've kind of shown you what mutation testing is, why you might want to consider using it, how you could possibly get going with running it, and also why you should. So if we're here, I just want to summarize. First of all, I'm sorry I used this penguin as an evil penguin earlier. It is adorable. I just like that DALL-E, when I asked it to give it some fake wings, gave it three. It gave it this extra flipper here. I'm not sure what that was for. But what I'd like to do is just quickly summarize what we talked about today. First of all, mutation testing is a way to test your tests. It helps you to beat the Goodhart's law problem with coverage, right? So it saves you from turning coverage, which is a metric, into a target, right? You don't want to have a rule that code coverage has got to be above this threshold or we don't merge. That's not where we want to be. What we want to do is write good tests. So if you are going to do this yourself, an important part is to start small. So start locally on your machine. If you've got a big code base, then what you need to do really is run on a subset of that code base. If you've got a smaller code base like me, you're probably okay. But either way, start locally on your machine. You may also want to run in CI if you can. If you want asynchronous reports, if you want to use the resources available on a CI system, you can run mutation testing there. So do consider that if your stuff is in CI. And finally, I just want to say that hopefully we've demonstrated that mutants are like adorable penguins, right? They're valuable and they are wonderful, right? They're really great to use. They can tell you so much about your code. They're extremely useful. So don't fear them, because you should love them. Thank you very much. If there are any questions, comments, objections, love mail, hate mail, anything, shout at me.
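The idea summarized above, that mutation testing tests your tests, can be illustrated with a minimal hand-rolled sketch. This is not how mutmut, Pitest or any other framework works internally; the function `is_adult`, its hand-written mutant and both tests are hypothetical names invented for this illustration. It only shows why a test suite with full line coverage can still let a mutant survive.

```python
def is_adult(age):
    """Original code: adults are 18 or older."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-written mutant: '>=' has been changed to '>'."""
    return age > 18


def weak_test(fn):
    # Executes every line of the function but never checks the boundary.
    return fn(30) is True and fn(5) is False


def strong_test(fn):
    # Adds the boundary case, the only input where the mutant behaves differently.
    return weak_test(fn) and fn(18) is True


def mutant_killed(test, mutant):
    """A mutant is 'killed' when a test that passes on the original fails on it."""
    return not test(mutant)


if __name__ == "__main__":
    for test in (weak_test, strong_test):
        assert test(is_adult)  # both tests pass on the original code
        print(test.__name__, "kills mutant:", mutant_killed(test, is_adult_mutant))
```

Both tests give the same line coverage, but only the one with the boundary check kills the mutant. A real framework generates such mutants automatically and reports the ones that survive.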
So the question there was just if we can give some more examples of the kind of range of things that are possible to mutate. So essentially, the short answer is anything that will still make the code run. So in the Java case, the code compiles; in my case, the code runs. So in this situation, I'll give you some Python examples. For example, changing a variable from a certain type to another, so you might typecast something. With a mathematical expression, you might add extra terms to that expression. You might change return types, error types. You might set things to None at any given time. You might call something and, yeah, remove parts of it, or set things to zero. There's other stuff. Paco, can you think of any mutation testing Java examples? Yeah. So I think the examples you gave sort of depend on the mutators you use. For each framework, you can also go through the list of mutators to see what kind of mutators are out there. What's good to keep in mind is that it does use some basic, fundamental strategies to determine if something can be mutated. Because for example, if you have a stream, and in this stream you do some operations which you could, in theory, cut out, you're still using the return value, which means that the mutation testing framework thinks, okay, let's keep that intact. The same goes for if you're using the Spring Reactor framework. You could do lots and lots of smart mutations in there, but it's not really there yet. It's really the rudimentary things: the conditional logic and the mathematical logic, I think, are the two main things you'll see. And those actually also account for the most typical programming errors, I would say. Awesome. I mean, is there anything you'd like to mutate, you know, because I guess a lot of these things are open source, anything that you think would be good if it did exist? Any other question, answer? So, two questions.
The first one: could you comment on some framework for C and C++? And the second: what do you think about the idea to force developers, or to require developers, to run it only on the code which they have actually changed, just to save computational power on the developer's machine and on the server side? Okay. So, the question there, just for the live stream, was two things. One is, are there any mutation testing frameworks for C or C++? I will say personally, I don't know. I haven't used C++ since my physics degree, so I couldn't tell you. I don't know if you know anything about that, Paco. I just did a quick Google search. That's all. So, I see there are some frameworks available for you. There is a project by the University of Luxembourg which is called FAQAS. FAQAS. And it's not there quite yet, but it's something for C and also a bit more for C++. Regarding your other question, by the way: should you do it as a Git hook? That was the question, right? Yeah, the idea was basically to require developers to run those mutation tests, but not the full set, only the mutation tests which touch the unit tests covering the code that was modified in this change. Yeah. Yeah. So, actually, depending on the framework, some have features for incremental reports, where they can just store the last state, then do a diff and use the results from your last execution to not generate and not execute all mutants. It knows, I only changed these production lines, so I only need to generate mutants for these, and I only changed these tests, so I only need to rerun the tests for these mutants, which can tremendously speed it up. But still, using it as a Git hook, I'm not sure. You can, by the way, use the same logic in CI as well, using the incremental reporting to save a bit of time. Yeah.
So, you also have caching, so you can cache those tests that you've done already, and if those cases aren't touched, then you're sort of good, provided the changes to your code don't affect them. So, that is an option. I would say, yeah, thank you. My opinion is, again, that maybe you don't want to explicitly mandate this on every run, and the reason for that is that it can then become a kind of metric that you try to optimize for, or something to look at, whereas really, I think the nice way to use it is every now and then. I think if you've got a super critical project where that's really important, you may want to run it like that. For me, I don't need to, but I think that's really up to you as an implementer, what you want to do, and I think there's definitely a use case for doing it that way if that is important to you. Hand over here, hello. Yes, yes. Short answer is yes. Long answer is, depending on the actual framework, it might be that you add a comment to ignore it. Alternatively, there is a config file setup as well in Python where you can say, only mutate these paths, only do these things. What language do you use? That would be Stryker. I would say yes. I haven't looked that much into Stryker, but I think they make quite some nice stuff. It's quite generic for all frameworks. Excluding code from mutation, definitely yes. Depending on the framework, some even have nice things like: do not mutate any calls to these classes, which is interesting for logging, for example. Do not mutate any calls to this logging class. You can do the same for packages, class paths, et cetera. I'd say with Stryker as well. One of my colleagues uses Stryker because he maintains our .NET SDK, and he's actually also got mutation testing there with Stryker. It does seem very performant and it seems like it does have a lot of those features as well.
Honestly, if you're interested in TypeScript, I think there is something for you there. Cool. I think it might be free on open source repos. Sorry, another question. Yeah. Are specific mutation runs reproducible for debugging purposes? If you have a run and you see something you don't expect, can you reproduce that specific run with a given seed or something? Well, so the question is, how reproducible are the mutants? If you find one, is it still there on the next run? As far as I know, there shouldn't be any randomness in the mutant generation. It just goes over the code. Any condition that it finds that it can mutate, it will mutate. So the next time you run it, the same mutant should be there at the same place. So you could also see whether you killed it the next time. So yes, it's reproducible. I think that's the person who was first, sorry. That's a good question. I'll repeat that one. That's a good one. So the question there was: mutation testing, we've talked a big game, we've come up here and been like, hey, look, this is important, right? That's what we've talked about. And the question, which is a very valid question, is: hey, if it's so important, why is no one supporting this in Python? Why is this all open source stuff, right? And you know what? I agree. That's a really good question. It's one I asked as well, to be honest. So no, I totally support the question. And the question, to put it properly, is: why aren't employers supporting this? The short answer, I think, is to do with ROI, unfortunately. And that sucks, honestly, because I would like us to invest more time in certain things. And I think it's just to do with company priorities, right? So I would like to spend more time on it. Honestly, I had quite a lot of fun adding the one feature I did get to add. I'd quite like to do some more. But again, I've got this API to implement, so do I have time? Well, no one's funding me to do it.
So unfortunately, it really is like, unless there's an obvious ROI, this just seems to be the way things go. Unfortunately, that's the way we've kind of structured our platforms and so on. So I gave a talk earlier on PyPI and malware. And the reason that malware is so prevalent and so possible on PyPI is that PyPI hasn't really implemented many ways to actually protect against malware being uploaded. So currently, I've uploaded some malware to PyPI that you can get yourself. It's not real malware, to be clear, it's a rickroll. But you saw that. Basically, what I'm trying to say here is that the project to protect users kind of didn't really get off the ground, just because I think originally Facebook were funding it, and they stopped funding it, and it just didn't continue. So unfortunately, yeah, this is just kind of the way that things are in open source right now. And yeah, I do feel your pain. I do understand. But that's all I can really say, I'm afraid. Yeah, to quickly add to this, by the way: Stryker, for example, is actually backed by a company which, for example, lets interns work on it as well. So some frameworks actually are backed, and there are people already investing in them. So it's not always bad. But sorry, Nick, let's go to that side. So you showed some HTML reports for the results of the mutation tests. Yes. We all know all managers and PDT teams love their KPIs. So I'm wondering, is there any integration or plugin to export the mutation test results to SonarCloud or other platforms? That's a really good question. So I'll answer quickly for Python and then I'll pass it over, because in Python, the answer is quite short. The answer is unfortunately no. The maintainer of mutmut is not really a big fan of the CI system stuff and the report stuff.
I think the premise there is, you know, I like running this locally, and, you know, that's fair. And that is really how you can get started and get an idea. So in Python, unfortunately, the answer is no. But I think that Paco might have a more positive answer for you. Yeah. So let's also ask, since you are the maintainer of the other framework: how does it go for the other framework? So, okay. So I talked about not having that facility, that feature, in Cosmic Ray. It is a bit unmaintained. I don't want to say names, but there is a very, very large, 450 maybe, vendor that uses it. And we asked them, can you fund development? They said, you know, no. And yeah. So they have shown this around at large events, like in front of thousands and thousands of people. But yeah. They're like, okay, we keep all the data stores there for whatever we find as it is. Yeah. So for the Python frameworks, there's not really CI plug-in support. I do know that, for example, for Pitest, there is support for Jenkins and Sonar. And I'm not sure about Stryker, but I think it's there. And usually these things are relatively easy to build yourself as well, because if there is a report in some JSON file, all you have to do is parse it, which is quite easy, and make a nice HTML page out of it. And again, they're all open for contributions. Do we have time for one last? I want to just add to that a little bit. Okay. Okay. Really quickly. First of all, with your question, yeah, when I originally implemented my mutmut thing, I did do it on PR. And in that case, I got, you know, an action that would comment my coverage in a nice metric-y way. So it's quite simple to do. And about Cosmic Ray, first of all, that sucks. And I'm sorry. That's blooming awful.
Like, yeah, sadly, it does seem that a lot of what we've been discussing on this side of the room is just like, man, we all agree this is important, right? And it's useful for a lot of things. It'd be great if someone funded it. So I think, unfortunately, with Python, that is the state of play. And it does suck. But yes, I get you. Any other questions? Finally, I think one, yes, hello. Can you write a custom mutation to mutate your code with custom logic? That's a really good question. So, sorry, I will now repeat your really good question. The question was: if I have a certain type of mutant that I want to make, can I do that? So I would say with the stuff that I used in Python, the answer is you'd need to actually, you know, take the version you've downloaded, edit it yourself and add that stuff. So sadly, there's not an easy customizable way. That would be an awesome enhancement, though, that I would like to see, you know, that would be cool. In other platforms, Paco, any other? I do know that Pitest did have some extension points, I think. So it really depends. I know that the company I currently work for, called Picnic, is also working on extending it, for example, for reactive code. So there are often some extension points. So in short, it depends on the framework and how easy it is. Are we, we're done. Okay. We're at time. Thank you so much. This has been a really nice discussion as well. So thank you for sharing this. Thank you. |
Welcome to the Translations DevRoom
Let's have a great afternoon talking about translating FOSS projects! |
So, let me take a few minutes here. My name is Paulo, I'm from Brazil, and in Brazil I help the Debian translation team with the translation into Portuguese. I'm a Debian developer too, so I work with packages and on the translation into Portuguese, and I'm coordinating the Translations devroom here this afternoon. We will have six talks today from different projects, and each of them has 30 minutes, so the speakers know they have 30 minutes to talk. I don't think so. I'll bring this here. Okay. Leave it here? Thank you. So, each speaker will have 30 minutes to talk, including the Q&A part. I'd like to say thank you very much to the FOSDEM organizers and to the speakers for being here. I'm sure that it will be a great experience for everybody. I believe this is the first time we have a translations devroom at FOSDEM, the first time we have a dedicated room for translation. So, this is good. I hope this is the first of many devrooms that we have in the future. And I hope everybody enjoys it. I will be sitting here. If someone needs something from me, talk with me, I will be there. Yeah, I'm wearing this blue shirt. So, that's it. Thank you very much. Thank you. Thank you. |
Translate All The Things!
An Introduction to LibreTranslate |
So let's dive right into it. LibreTranslate is software that's a bit like Google Translate, but open-source. It is licensed under AGPLv3, so it's strongly open-source. In fact, we're going to keep it that way forever. It lets you do natural language translation, and it runs on your computer. This is one of the goals of the project. There are several other projects in the open-source realm that have aimed to provide natural language translation, except that sometimes they require very large servers or a lot of memory, and our goal is to have this running on something as small as a Raspberry Pi. So that is very important to the project. The program has lots of clients and integrations. We'll cover some of those in the upcoming slides. Like many projects, it's available on GitHub, so you can go and check it out. But we're going to give you a brief overview of how to get started and start using it today. Let's talk briefly about why we decided to create it and why there was a need for the project to exist. We could not find a project that had all the qualities that LibreTranslate can offer. These are: a simple and open REST API that you can use to do translations programmatically, so it helps automate part of the translation work that we need to do. It offers pre-trained and openly licensed language models. There are other projects that do machine translation, but again, sometimes they do not make the AI models for the translation openly available, and we have that. It runs, again, on commodity hardware, so it does not require server-scale power to make the software work. And finally, it is very easy to get started, as you will see. So talking about getting started, there are primarily two ways that you can get LibreTranslate to work on your computer. The first one is, if you have Python, you can simply run a pip install libretranslate command, and afterwards you run the program, and that's it.
If you have Docker, which many developers like to use, we also have an option for that. We pre-build images for LibreTranslate that you can use, and we have a convenient script that will run it for you and take care of a few details, like persistent volumes for the downloaded language models and some other technical stuff. But to get started, all you need to do is go on GitHub, get a copy of our source code, and run it. We also have scripts for Windows, macOS and Linux, so we try to support all major platforms. We're hoping to get other platforms in there as well, so things like FreeBSD and others are on the to-do list. So we'll get there. So let's actually try to run it. I'm always a little scared of doing live demos, but bear with me, we're going to try it. What could go wrong? So here is a console. I'm going to quickly activate a Python environment where I have LibreTranslate already installed, and I'm going to try to run it. On macOS, I have to specify a different port than the default of 5000. I'm going to try to run it. Okay, it seems to be working. So I'm going to jump right back into Chrome, and if I refresh the page, you will be presented with a friendly user interface that you can use to test the system and even use it. LibreTranslate allows programmatic access to the software via an API, but you can also use it as an alternative to Google Translate if you want to. So we're going to try to say something. Okay, so obviously English to English is not going to be helpful. How about French? Okay. So we translated Hello World, Bonjour le monde, and it worked. But that's not too impressive, right? I'm like, okay, Hello World. Let's try to look at something a little more realistic. Before looking at something more realistic, you can also, of course, use it from an API. In this case, I can invoke a curl command and ask LibreTranslate to perform a translation.
I want it to automatically detect the language that the translation is coming from, and I want to translate into the target language. I get a JSON response. Everything in the API is JSON-based, so that will be familiar to many developers. But let's look at a more realistic example. In this case, we have a longer piece of text, and it also contains HTML. The software is capable of translating the parts that need translation while leaving the HTML part intact. So things like hyperlinks do not get mistakenly translated, which would be really bad. This code that we saw here roughly gets rendered as this piece of HTML in a browser, and the translation is pretty good. Kind of. This word should have been Felidae. It decided to keep that word in French. We will improve that with time. But otherwise, the context and the meaning of the sentence are pretty darn good. We will look at accuracy in the upcoming slides. So as an overview of the list of features: it can do text translation, and it can do markup translation, which includes HTML, XML, and other formats that use markup. It can do file translation for several formats, so you can upload things like OpenOffice, LibreOffice and Word documents and PowerPoint slides, and it is able to translate those as well. It can perform language detection: you give it a piece of text, and it will give you an estimate of which language the program thinks it is. It also has a built-in system for rate limiting. If you're planning to host this on a public server, you will find out that it's a very useful feature, because people really like free resources, and it's difficult to give everything away for free without some limits. So if your translation instance up in the cloud gets really popular, having a limit that says, do a maximum of 60 translations per minute, will come in really handy, and it's all built into the software. You can further issue API keys to give to people, which can change those limits.
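The curl call from the demo can be approximated in Python. The following is a minimal sketch using only the standard library, assuming the `/translate` endpoint accepts a JSON body with `q`, `source`, `target`, `format` and an optional `api_key`, and returns a `translatedText` field; the helper names `build_translate_request` and `translate` are made up for this example and are not part of any LibreTranslate client library.

```python
import json
import urllib.request


def build_translate_request(base_url, text, target, source="auto", fmt="text", api_key=None):
    """Build (but do not send) a POST request for the /translate endpoint."""
    # source="auto" asks the server to detect the source language;
    # fmt="html" asks it to translate the text while leaving markup intact.
    payload = {"q": text, "source": source, "target": target, "format": fmt}
    if api_key:
        payload["api_key"] = api_key  # only needed on instances that require keys
    return urllib.request.Request(
        base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url, text, target, **kwargs):
    """Send the request and return the translated text from the JSON response."""
    request = build_translate_request(base_url, text, target, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["translatedText"]
```

With a local instance running, a call such as `translate("http://localhost:5001", "Hello world", "fr")` would perform the same translation as the demo; the port 5001 is an assumption here, since the talk only mentions that a non-default port was used on macOS.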
So you can set up the system in a way where you allow anonymous users up to 20 translations per minute, and you allow a subset of people that you've issued API keys to have however many they want. You decide those limits. It also has a localized UI. We're using Weblate to do that, which is awesome, and it has currently been translated into four languages, and we're looking to verify and add more. One neat feature is that LibreTranslate has the ability to roughly translate itself, so we have done that, of course, but we haven't displayed all the languages that it has tried to translate itself into. We are waiting for native speakers to review the actual translations and correct them. But if you run in debug mode, you will see all the work that it has done, which is neat. So it translates itself, or at least it helps. Finally, it has the ability to monitor itself. It can generate usage metrics, so you can monitor the usage of the server using Prometheus and Grafana, which are very popular monitoring tools. Inside the software, there are really just a few packages, so it's very lightweight. Most of the translation work is done by another package called Argos Translate. This is really the core engine that performs the hard work in the translation. It is an awesome project, and we collaborate with them on LibreTranslate. Inside Argos Translate, there is also other software, so it is built on the shoulders of giants. There is CTranslate2, which is an inference engine that does neural translation using transformer models, which are state-of-the-art. It's the same type of architecture that ChatGPT uses. There is SentencePiece, which is a piece of code from Google that does the word tokenization, and Stanza, which comes out of Stanford and does sentence analysis. Argos Translate uses all three of these to perform the translation work. Now, that's not all it does.
Argos Translate also takes care of the very important Argos package manager index. This is where all the language models are handled, installed, and distributed. So the first time that you run LibreTranslate, Argos Translate will take care of querying the Argos package manager index and will download the languages that you need. This also allows us to create instances where, say, you only need to translate between French and English. You do not need to download the entire 26 gigabytes of models. You can simply say, I just need those two models, and the program will download only those two models. We also have a small module that does the file translation, which again connects to Argos Translate. That's the Argos Translate files package. And then there are some common Python packages that allow us to put up the web interface and coordinate the application as a whole. So it's really an ecosystem that's built with other open source software, and together it creates this complete translation solution. Talking about language models, we have 58 of them, which gives you translation support for about 30 languages. It does automatic pivoting via English. We are currently looking to transition to multi-language models. But for the moment, when you translate, say, from Italian to French, the program will automatically pivot via English. So it will translate Italian to English and English to French. If there is a language missing, there is a very cool repository called Argos Train under the Argos Open Tech organization, which builds Argos Translate. That repository has very good instructions on how you can train your own models. So if a language is missing, go check it out. It has very clear instructions, and you could contribute a language that is missing and that you want to see integrated into the software. Speaking of the models, when a model is downloaded, it has an .argosmodel extension, and these are simply zip files.
It's a zip file, and inside it has a little bit of metadata. It has a folder that contains the CTranslate2 model. It has the SentencePiece model and finally the Stanza model. So it has the information for all three packages that we discussed earlier to perform the translation. It's very interesting to check it out. Let's talk a little bit about accuracy, right? The question is, okay, it's translation software, but how good is it really? For that, there is a metric that can be used to roughly assess the accuracy of the translation. It's called the BLEU score, an acronym for bilingual evaluation understudy. It measures the similarity of text to a reference corpus. It has values that go from 0 to 1, or, if you express it as a percentage, from 0 to 100. The best translators in the world, human translators, do not get a score of 100, ever. So anything that is above 40 is considered understandable to good. Something that is above 50 tends to be very high quality. Sorry, up to 50 is high quality and above 60 is very high. We had a community contributor who, a few weeks ago, actually went and ran the evaluation on our different models, and we found that 83 percent of the models currently in LibreTranslate are scoring above 40 percent. So 83 percent of them are good. Now, to put it into perspective, when people ask me directly how good LibreTranslate is, I like to tell them that it's roughly as good as Google Translate was four years ago. So I want to make the expectations clear at this stage in the project: it is not as good as some of the proprietary alternatives. But we are improving and we will continue to improve. The way to improve it lies mostly in getting better training data. So as we find more and more sources of open data that can be used for translation, we include those in the training of the models, and that results in better models.
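To give a feel for what a BLEU score measures, here is a simplified, sentence-level sketch against a single reference: clipped n-gram precision for n-grams up to length 4, combined by a geometric mean and multiplied by a brevity penalty. It is an illustration only, with whitespace tokenization and no smoothing. It is not the evaluation script that was run on the LibreTranslate models, and real evaluations score whole corpora with established tools. The function names are made up for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against one reference, in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to the reference scores 1 (100 as a percentage), a candidate sharing no words scores 0, and anything in between reflects how many word sequences the two texts have in common.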
This is also an interesting point to note: because the project is open source and we have a way to train models, you can also train models that are specific to a certain domain. For example, in the context of software translation, you could imagine the case where, instead of training the model on a general corpus like Wikipedia or the EU Parliament translation documents, you train a model that is specific to software. For example, you could take a set of existing translations from existing software that has licensed the translation work under an open permissive license, and train a model on those existing translations. Because we know that a lot of software has terms in common. When you have a file menu, it's always called File and then Edit. So those menus are specific to a context. By training models that are specific to a context, you could get, for example, a software translation model that is more accurate in the context of software, rather than, say, poetry. So it's a very interesting thing to think about. One more thing about accuracy: we do have the occasional rare quirk. This is something that we are aware of and we are working to fix. We like to call it the salad issue. I will demonstrate it, because it always sparks a little bit of a giggle. It's a little bit rare, but it happens. So in Spanish, the word for salad is ensalada. Now, let's try to translate the word for salads, plural. So I'm going to type ensaladas. So in French, that's salades. Is that correct? Any French people in the room? Fantastic. Now let's try the singular form. I'm going to remove the S, and it crunches for a little bit. In a second... it really likes salad. Salad, salad, salad, salad, salad. This is a quirk. We are aware of it. It's very rare, but we've found a few reports here and there and we're working to fix it. Just something to be aware of. Yes, it really likes salad. Me too. Let's talk a little bit about integrations.
You can find client libraries for about 11 programming languages, which include the most common ones like Java and Python. Whatever your favorite language is, it's probably in the list of bindings. And if it's not there, adding new bindings for LibreTranslate is fairly easy. So we welcome contributions, of course. As far as software goes, LibreTranslate has found adoption in several existing open-source projects that you may recognize. Mastodon recently added support for translating posts using LibreTranslate. Weblate has the ability to use LibreTranslate to suggest translations and help translators, as an alternative to using proprietary software. The forum software Discourse has a plugin that lets you make your forum accessible from different locales and lets you translate the posts on the fly. LibreOffice, I found, has an extension. I didn't know this until a week ago, when I was looking at who has integrated stuff with LibreTranslate. But somebody wrote an extension for LibreOffice where you can translate documents on the fly using LibreTranslate. There is an add-on for the multimedia software Kodi. There is also an add-on for Firefox. And there are probably a lot of other things that I haven't found myself. But a lot of people seem to be finding the API useful and they're doing integration work, which is fantastic. And finally, there are client applications that let you use LibreTranslate without using the web UI. We have clients for Android, iOS and desktop, and there are more being built by the week. As far as comparison to proprietary alternatives goes, you can see that there is a clear monetary advantage, aside from the philosophical reasons why you might want to use open-source software, of course. But it could also be a really sustainable way to perform translations. People often ask me, why should I use LibreTranslate? I can use Google Translate for free. I just go on translate.google.com and it doesn't charge me anything.
So why should I care? Google Translate is free so long as you're using it by hand. If you want to do any automation work and you have to tap into their API, you're going to pay dearly. You can see here a list of the prices, and I can assure you that one million characters seems like a lot (that's six zeros), but they actually run out pretty fast, and so can the bill on your credit card. So if you have a lot of text to translate, LibreTranslate could really help in that regard. As far as funding goes, the project is on the path to becoming fully self-funded. And we really care about this because we want the project to continue living on. We, of course, accept sponsorships and donations, but honestly, we would prefer that you get something back if you decide to contribute financially to the project. This is why, if you are in the position where you say, I have some finances to spare to help support the project, you also get something back, and we do that in the form of offering you an API key to use a hosted instance at libretranslate.com. So you are free to run the infrastructure on your own server, on your Raspberry Pi, on any machine that you'd like. If you don't want to handle that, you can just get an API key and support the project at the same time. So it's really a good way to contribute back. And we found that that model has been helping us grow and sustain the project. So we hope to continue growing as much next year. Again, to get involved, I'll give you a few quick numbers. We've had about 70 people contribute to the code base over the last few years. The project is still very young, but it has really received a lot of attention, so we're very excited about that. You can help with code. If you're a Python programmer, if you know HTML, CSS, any of the technologies that we use, you're welcome to contribute. We are open to everybody and all ideas. You can also help us translate.
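As a rough illustration of what "tapping into the API" looks like against a self-hosted or hosted LibreTranslate instance, here is a minimal Python sketch. It assumes the documented `/translate` endpoint with the JSON fields `q`, `source`, `target`, `format` and an optional `api_key`, and a response containing `translatedText`; the base URL, the helper names and the key shown are placeholders for illustration, not something taken from the talk.

```python
import json
import urllib.request


def build_translate_request(base_url, text, source, target, api_key=None):
    """Build (but do not send) a POST request for a LibreTranslate instance.

    base_url is an assumption: e.g. a local instance or a hosted one.
    """
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        # Only needed when the instance requires an API key.
        payload["api_key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url, text, source, target, api_key=None):
    """Send the request and return the translated string."""
    request = build_translate_request(base_url, text, source, target, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))["translatedText"]
```

Calling `translate("http://localhost:5000", "ensaladas", "es", "fr")` would send one request to a locally running instance; splitting request building from sending keeps the payload easy to inspect before anything goes over the network.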
If you understand English and you don't see your language in the list of languages that we currently support for the user interface, you are welcome to contribute. It's on Weblate; you can simply translate and it will get included into the project every 24 hours. So that is really amazing. You can also help us train more language models. If your language is not available, or a language that you care about is not available, you can create a new model for that language yourself and add it to the list. So that is also another way that people can help. You can report bugs, of course. Just don't report salad; we are aware of it. Or just come say hi. We have a community forum that is quickly growing, and we love to hear what you're building with it, what you're using, or if you have any questions. So we're very, very open and we're excited to hear what you will do with it. That said, this was the last slide. I think we have some time left over, right? So thank you very much. I will open the floor for questions and discussion. So, yes. Hi, my best friend cannot miss the Vice President of the Austrian Society for Artificial Intelligence sitting with Foster to find volunteers to do exactly what you're doing. Thank you. We're glad to be able to help. You're welcome. How do we find... well, I just named the thing Open Language Model Training Army. How do we find more volunteers? Unemployed people, maybe; have the government fund people running training models; maybe suggest that to all politicians, everybody to their member of parliament. How many people do we have here? It should be all of Europe at least, maybe South America. I think if we multiply this, it can go viral. Thank you very much. This is awesome work. Thank you. I appreciate it. Yes, but you speak of languages that have the same link, the same structure of the language. We have French, English, Spanish, Portuguese, maybe Russian and Ukrainian.
Dutch does not have the same structure, but the language is from not far away from here. That's the difference. German also. Correct. And so, taking this into account, it's not easy for a translator to translate it. It is not. There's also maybe a problem with Chinese or Japanese. Correct. There was another thing I wanted to say: a dictionary, online or in the program, to get the right word, because it doesn't pick the right word every time. And I also thought that the most efficient pivot language is Esperanto, not English. Oh, that is very interesting. Yes. Okay. Yeah, that's a great insight. Yeah, thank you for sharing that. And you're completely right, some languages don't share the same semantic structure, and Dutch, for example, currently doesn't score super high. It's actually in the bottom 17% of the language models by BLEU score. Dutch scored around 38%. So it's almost good, but we've had some Dutch-speaking people come to us and say, you know, it could use improvement. So Dutch, yes, it is a language that needs improvement. And I talked to the maintainer of Argos Translate about the languages that need improvement, and he pretty much suggested that better training data will help greatly. So it is mainly a problem not of the architecture of the AI. It's a matter of us not having sufficient high-quality data between, say, English and Dutch to get above 38% currently. But again, nobody has really focused on Dutch as a language. If anybody has an interest in improving Dutch, we can do better. Surprisingly, fantastic. But as far as, for example, languages like German go, LibreTranslate currently does very well with German. It's above 50, if I remember correctly. Is it because German is a similar language to Dutch? It is. But that is because, I believe (and I think PJ, that's the name of the maintainer of Argos Translate, said so), the German model has had a larger amount of training data, and so it tends to perform better. Yes?
Yeah, just a quick question around the translation process; I suppose you touched on the structure. But how does it work with different dialects? So if you write in a dialect, or you write in slang? That's a very good question. Dialects would probably, and that's my guess, because I've never looked into this myself, but I believe that for a dialect to perform well as a target or source language for translation, it would also need its fair amount of training data. And that is the problem with dialects. I actually speak a local Italian dialect, that is my first language, and I wanted to make a model for my dialect. And I started looking online for sources of data that I could use to create a model for my dialect, because it would be cool. And it was really challenging. Not being an official language, it really lacks the status of official languages, and finding training data is extremely difficult. But it could be possible, right? If you gather enough people who can create a ground truth data set of examples in the dialect with sufficient samples, you could get good results, I believe. So it's a matter, again, of training data. Yes? How much does it cost to get a model to a good level? In terms of money or in terms of computing power? So if I remember correctly what PJ told me about the cost of training the models, it costs maybe between $12 and $30. You can rent instances on several cloud providers. You do need a GPU to train these models, and it might take a few days for it to crunch and get a sufficient number of iterations to train the model. But it's absolutely affordable. Anybody can do it, and if you are willing to wait and you just have a gaming laptop sitting at home, if you're OK waiting 20 days for it to finish, it will train the model for you. So I guess it could be free to you if you're willing to wait a sufficient amount of time, and if you have a gaming laptop lying around. Yes?
So you mentioned the need for data to be available for building the model, right? Well, do you need the data to be available under a certain license? What's your problem? The world's full of things, right? Yes. What's the requirement you have? You want it to be public domain? It has to be licensed under a permissive license, so a Creative Commons license that also allows commercial use. And we give references and we give attribution to all the sources that we use. If you go into the Argos package repository, where all the models are hosted, we do give the appropriate licensing credits to all of those. But yes, we cannot go on, say, the Internet and start scraping results, because you just have to assume that everything is covered by copyright until they tell you that you can use it freely. So it's only trained on openly available and freely licensed sources. Do you need it to be translated as well, or just a single language? It has to be translated. So very briefly, the format of the input that goes into the training is a file that has, say, the English sentences and a separate file that has the translation on the same line. So it's very basic. And somebody could do the work by hand, right? You start from the English sentences and you start doing the translation. So it will take a lot of work, but it's doable, especially in a crowd-formed... Are we out of time? Okay. I'll be around if you have other questions. Our time is up, unfortunately. They're kicking me out. But the next speaker will deliver something awesome as well in the next talk. Thank you again. |
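The training input described in the answer above (one file with the source sentences, a second file with the translation on the same line number) can be sketched in Python as follows. This is only an illustration of the line-aligned format; the function name and the choice to skip blank pairs are assumptions, not the actual Argos Translate training code.

```python
from itertools import zip_longest


def read_parallel_corpus(source_lines, target_lines):
    """Pair up two line-aligned sequences into (source, target) tuples.

    Both arguments can be open text files or plain lists of strings.
    Line N of the source must be translated by line N of the target.
    """
    pairs = []
    aligned = zip_longest(source_lines, target_lines)
    for number, (source, target) in enumerate(aligned, start=1):
        if source is None or target is None:
            # One file ended early, so the alignment cannot be trusted.
            raise ValueError(f"line {number}: the two files differ in length")
        source, target = source.strip(), target.strip()
        if source and target:  # assumption: silently skip empty pairs
            pairs.append((source, target))
    return pairs


english = ["File\n", "Edit\n", "Save as...\n"]
french = ["Fichier\n", "Édition\n", "Enregistrer sous...\n"]
for source, target in read_parallel_corpus(english, french):
    print(source, "->", target)
```

Failing loudly on a length mismatch matters here: a single missing line would shift every later pair and quietly corrupt the training data.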
Bringing your project closer to users – translating libre with Weblate
News, features and plans of the project |
Hopefully, yeah. I will wait; I just want to test the sound, that it works on the other side. One, two, three, Weblate, Weblate, Weblate, Pontoon, Weblate, Weblate, Weblate, Pontoon, Weblate, Weblate, Weblate, Weblate, Weblate, Weblate, Weblate, Weblate, Weblate. Yeah, it works like that. This is how you sell Weblate: you just say it a lot of times. Yeah, great. So, a long two days, right, and a lot still to come. I just want to ask you if you can all do a frowny face, like you just ate something weird. Yeah, test it, test it on yourself, because you will need it throughout the talk. You are from different backgrounds: somebody is a developer, somebody is not a developer, somebody knows Weblate very well and somebody came here to get to know Weblate. Somebody was at my talk yesterday; whoever was not can probably watch the recording tomorrow or the day after, but I didn't get the link yet, so it will probably be soon. So the frowny face will be in use if you don't like something, if you feel like you are bored here. Just do your frowny face so I know that I should go a little bit quicker through what I am talking about. Yeah, okay. So I'm Benjamin, I'm from Weblate. I do the talking and Michal there does the coding. So yeah, that's the Weblate team, and we also have a teammate named part and he does the coding part time. So what's Weblate? Okay, Weblate is a libre localization platform. It's a well-documented piece of software; I want to point that out because sometimes, if you want to get to know a piece of software, you have to ask other people a lot of questions. With Weblate you don't have to, but yeah, you can still ask me if you don't find your answer in the documentation. It's copylefted software, licensed under GPLv3, and all the development is public on GitHub. We started as an open source project and that's not going to change because, yeah, it's the best way to develop software.
What Weblate should do is make the tough process of localization, the coordination of a lot of people and having so many things in so many places, just easy. It's for developers, it's for translators, it's for managers, it's for everyone who is involved in a localization project. And wow, it's a weird color, but next to the logo you will read that it's a verb. And yeah, some of our users started using "to weblate", which means "I'm going to translate something on Weblate", so we are a verb too. It is aimed to work so that you just set it up once, or tell it that it needs to be set up in some way in the future, and it will run by itself. So you just set it and forget it and let it work for you, because you don't want to do the work if it can be done by a great piece of software. It's supposed to hold all your translations, so not just one platform, not just your website or your app: everything in one place. Many standard localization file formats are supported, so you don't have to use Weblate just for part of your workflow; put everything there and it will give you free time. Once you have your translations there, you probably want to give them context and communicate with other users. You can do that: if you find the source string weird, you can just add a comment and notify the maintainers, and they will know about it. Once you have it integrated, which in the best case means integrated with your version control system, because that's why Michal started Weblate 11 years ago, to have some kind of translation software tightly integrated with what you love to do: you love to develop software, not to send files back and forth or use API calls or write scripts. You just want to develop your software and let some other software do the parts you don't fancy that much.
Once you have your translations there, then you go to the editor if you are a translator, which is a lovely part of Weblate where you can see all the context, all the screenshots, all the communication with other translators, and where you can localize the string into your desired language from the source language or from multiple languages. Once that's done, Weblate does the magic, as Jean-Baptiste from Fedora said, because there are all the checks and fix-ups so you don't have to control the quality of your... I shouldn't be in front of that, so, right. You have the quality you want, not some messy translations that can be understood differently in the app in the end. Once this is done, or while this has been happening, you have it in a database. We prefer the elephant, but you can also use MariaDB; the elephant has more power, more strength for the best performance. And once you finish there, Weblate stores it in the underlying Git repository, and that's part of the magic, because that's a tool you already know how to work with. So if you are a little bit lost, let's say you got a merge conflict because you added the same thing in Weblate and in your repository because, yeah, you don't want to wait, you want to have everything continuous, you just solve a merge conflict, and if you don't know how to, it's described in the documentation of Weblate. So that's the easy part; that's the way to make it continuous with as little fuss as possible. Basically, you need to get your translations in, then do as little work as possible with the help of machine translation and such, and then get them out. And in and out is very easy, because it's just syncing one Git repository with the other. Or, if you prefer to work with the API, Weblate has one, so you can do that. Or, if you like to translate in some offline tool, in a desktop application, you just download the localization files in the file format you like, because Weblate can convert this for you, and then you can upload it back without being
afraid of making the file invalid. And once you have everything set up, you can reuse your translations, because there is a translation memory which you improve with your manual translations and with machine translation. So if you are, let's say, translating the same project over and over again for new versions, like LibreOffice, you can build your translation memory with the help of machine translation and with the help of your community, and in the end you don't need that much manual work and you don't need that many requests to the machine translation engine, because you already have your core, your basis, in the translation memory, which you can use as a source for the machine translation. So that's the reuse. And as a result, what you want to get out of Weblate is your software in your language, because, yeah, then you truly own it. You want to reduce as much work as possible, because everybody wants to work on what they desire to do, not on manual work that's the same again and again. So let translators do their work, let developers do their work, and be happily codependent. You want high-quality translations, and you want the engagement of the community, because if they have an easy way, if they notice in their UI that there is a mistake, then they probably know best how to correct that mistake. But if they can't find a way to correct it, they won't send you an email. If they have the tool, it's their software; they can make it better in a very easy way and do it themselves. It's their software, so they can work for the whole community and attract more translators and more users, because that's how you do it. You of course want to free up the time of developers; yeah, I think I already said a lot about that. And you want to be prepared for future versions and future projects, because usually you use the same translations, so you can be very well prepared for that too. Weblate as a project started 11 years ago as a part-time project, but
now, and I will probably talk more about it at the end if we have time, it's sustainable on its own without any venture funding or something like that. How do we get the funding? These are the four ways. We offer hosting services to commercial customers or to lovely projects like Fedora. Some other projects, like openSUSE, because Michal started it while working at openSUSE, have been using Weblate for many years, but they host it themselves and just pay for support, because sometimes they need a little bit of help with something. We also offer custom development, because as we are a small team, we are very flexible in delivering new features. And as we have the same code base for everyone, no matter if you are a commercial customer of Weblate services or a small or large libre project, you have the same features there. So if a very big corporation, like some Siemens company, pays for a nice feature, you as the Android developer of a small podcast app can use it too, and that's great. Or you can contribute to the code and they will get it too, because you want to support big companies too, give them something back maybe. And of course, like every libre project, we also accept donations. Then, if you know Weblate, you probably came here to listen to the plans, what we have planned for this year, and this is it. We want to become more enterprise ready. As for the current state, because some of you we met at GNU HealthCon, the good thing is that a large part of the work on the backend is done. What does enterprise ready mean? At this moment, as Weblate started as a small project and was not aimed to be something that influential and that important to large groups of people and large projects, there is currently just a project and a component and that's it. But you will be able to create multiple levels of grouping, let's say for different versions. Yay, Fedora is happy and we are happy too, but give us time, give us a little bit of time. And of course, when you have the large group,
large community, you need to manage them, you need to give them access. And at this moment it's possible in Weblate: you can connect your single sign-on method, like SAML or GitHub or GitLab, and manage them there, or you could do it in the Django admin, some strange thing, but this is history now. Now you can manage it fully inside Weblate, and there will be even more, so you can use Weblate similarly to what you are used to with GitLab, GitHub and other tools, where you can create teams and organizations and let them do their work without you having to pay attention to all that. All good? Yeah, okay. Then, as with Weblate again most of the things are something we didn't plan in the original design, Weblate is now a little bit rusty to play the big part for large projects. One of those things is universal search, because at this moment search works on some levels or on the whole level, but it's not that convenient; it doesn't search everything in between from one place, so you need to click through and then search for something. This will be changed, also to make it easier to navigate, because from our operating systems and from our collaboration tools we got used to a nice universal search on every platform, so you will find that in Weblate too. And yeah, that's it. Of course, UX can always be better. We really want to grow the team, so yeah, if you know a developer, tell them it's a great project; they can contribute and they can get paid for it, which is really nice in the open source world. And yeah, we want to continue the organic growth, and after yesterday's talk we got to talking: maybe we will do something with Pontoon, maybe not, because there won't be time; maybe it will make sense, maybe it will not, but we would like to investigate that way too. And yeah, as we still have some time, maybe we can talk about how Weblate began. This is where Weblate began. This is Michal's office, for those who were at yesterday's lightning talk. Okay, good. So this is a small room on top of the large concrete
building we built in the Czech Republic, and it's like three by four meters, something like that, and it used to hold all the machinery for the elevators. So it's not a garage project, it's basically the elevator project. And it's weird, because it was on top of the elevator, but there was no elevator at the time when Michal was coding there. So yeah, we need like a higher elevator, or just use the steps and build them first. So yeah, it takes time. And yeah, this is the website of Weblate, where you can see all the translations that are happening on Hosted Weblate. These are some of our lovely users from the open source community. What pushed us into a big feeling of responsibility was when, a few years ago, Fedora approached us and realized that they want to use Weblate because it makes sense. If they want to create a great Linux distribution and they have a lot of other work, they don't want to maintain Zanata, because they don't have time; that's not their main business. So they switched to Weblate. They did really, really well in the migration process. Of course they found a lot of bugs, and we were able to fix them thanks to them, because, yeah, once a developer finds your bug, they are usually technically skilled people who are great at describing what the problem is, and then it's very easy to fix it. And yeah, that's what got us to the current state. And if you want to hear more about why we feel that there should be something to connect Weblate with other tools and make it easier for all the communities, you can listen to yesterday's talk. When the recording is available, we will definitely let you know on our website; we will definitely let you know about everything on our Mastodon account. And I think we don't need to take the time for my story; just the funny part. My story with Weblate began when we first met with Michal at FOSDEM in 2017. I was here with a different project, and another Michal (yeah, a popular name in the Czech Republic) introduced me to this
Michal. That was my colleague at that time, and that project was Turris, the open source router. And then Michal and I didn't see each other for two more years, and then in 2019 I came here looking for a job. In 2018 I came here with a different project, and in 2019 I just wanted to return to FOSDEM and be in the atmosphere. At that time I was living in Prague, Michal was living in Prague, we knew each other, yeah, but we met here again and we started working together. So yeah, tell that to people: just come to FOSDEM. But yeah, you already tell them. So yeah, I think we have like 12 minutes now for questions, so let's talk. And you are all smiling; there were no frowny faces. Okay, who has a question? Hello, Jean-Baptiste. Do you have some kind of a plan so that in the future we can search for translations coming from whatever data sources? Because we have the Weblate hosted for the Fedora project, we have the Weblate for LibreOffice, the one that you are hosting, and there are many other places where you can find translations in use. But as a translator, I would like to search for every existing translation of some term, and I want to see, okay, how consistent is my translation at the moment, and should I align with the current way it is translated, or maybe it's wrongly translated in most places and I should go fix the translation. Okay. I would love to see something that helps us to at least iterate the knowledge; not having some kind of magic tool that we are also looking for as raising the world's translation, but just to know what the true state of translation is. And I think the Mozilla people have some tool that allows them to search for existing translations. I would love to see some kind of tool that allows me to search for a string across all the translations coming from the Weblate instances or even from others. Okay, I will try to sum up the question; now we will see if I was paying attention. So the question was if we plan to introduce some place... It's slightly related: yesterday, when I had my
lightning talk about a possible translation MD, something like that, I meant to show people, to show translators, one place where they can find the project they want to translate. And Jean-Baptiste would like to elevate it to a whole other level, to have some kind of shared translation memory or shared glossary where translators can go and check how best to translate the thing they are currently working on. Yesterday in my talk I was trying not to be very imaginative, because I usually am very excited about everything and then there is no time to do it. I can imagine something like you just proposed, but I can't imagine anybody doing it, because that would just mean a lot of people working on it, and then we would need help with some licensing and stuff. So yeah, it would be nice. I'm not opposed, but I can't imagine it at the moment. I would start with what I was talking about yesterday, and then maybe that might be something to build on top of. Currently you can search on Hosted Weblate, which is a pretty large base of open source projects. You can search through their translations because they are open, and yeah, you can search on Discover Weblate for the project, so you can go through their... oh, I want to see how SUSE or Fedora is translating that, and yeah, you can go from there. But every time you have to go to that particular place and search it there. So yeah, maybe once we have the translation MD, which we have been thinking about for two years without doing anything on it, and also universal search, then we will just maybe throw it to some clever people with AI, but I'm scared of what would happen from there. So yeah, there is another question. So, some kind of a... Okay, the question was if there is any plan to provide some kind of live preview of the strings that are translated. The answer is that it's another thing we have been talking about through recent
years, and that's quite large to develop. So again, we would like to do that, and you can support this effort if you help us find front-end developers. So it's kind of... you can do it, but it's not, and this was not the intention of the feature. Okay, I will play a little bit like a sports commentator: on the right we have, what's your name, twice, and on the left we have Michal from Weblate. And twice has an Android app and wants to see it before he commits to the version, and Michal is saying it's partially possible, and you can talk after the talk and go into the depths of this topic. And we still have more time. I will stop you here, stop you here. It's not a problem, it's not a problem; it's a thing that can be solved very easily. We will just tell you: go here, follow there. That's not a problem. You can actually do this. Weblate uses translation memory; once you have it, you can export it and import it in TMX, which is a standard format, and you can also use the Weblate translation memory as a source for... I will repeat the questions at the end, sorry. You can also use Weblate as a source for your machine translation, so it's just clicking in the UI and it should serve your needs. The question was if it's possible to use Weblate as a source for machine translation. Can I? Yeah. Okay, we still have time. Is there any other question? Okay. Oh yeah, LibreTranslate is supported as a machine translation source in Weblate, so you can use that, but we are not yet connected with the community, and we will be happy to connect with them and maybe figure out some deeper integration and more promotion in the community. And yeah, I will get back to you after the talk. And there was one more question. I'm sorry, I have to stop you. I will go closer to you. Yeah. Yes, it's in the context in the editor of Weblate: there, under the source and the field for the translation, there are a few tabs, like comments, like other translations in neighboring projects
and also in your projects and components. And also, if you set up your machine translation engines, they will see it there, so they don't have to look for it; your translators will see it there and they will just click to copy it into the translation. And what about, without offering translators the option to copy it, just sending everything to machine translation? You can, yes, yes. There is an add-on called automatic translation, and you just install this add-on and instruct it to use particular machine translation engines that you set up at your Weblate instance or for your project, to insert translations or suggestions for untranslated strings or for all strings. So it is possible. The add-on is called automatic translation, and you will find more info in the Weblate documentation. Okay, we will get there after the talk. Thank you very much for being here, thank you for your time, enjoy the rest of your FOSDEM experience, and see you next year. Okay, okay, okay, I can do that. It's not all, it's just some. And the next speaker is not here yet, so here we go. You can do it, you can do it. He will just take a picture and we are sad enough. I have no problem, I don't care, he's the one in charge. Okay. |
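For readers who prefer the API route mentioned in the talk for getting translation files out of Weblate, here is a hedged Python sketch that only builds the download request. It assumes Weblate's REST endpoint `/api/translations/<project>/<component>/<language>/file/` with an optional `format` query parameter for conversion and token authentication via the `Authorization: Token ...` header; the instance URL, project, component and token values are placeholders, and the helper name is made up for illustration.

```python
import urllib.parse
import urllib.request


def build_download_request(base_url, project, component, language,
                           token, file_format=None):
    """Build (but do not send) a request for one translation file.

    file_format is optional; when given, the server is asked to
    convert the file (for example to "po" or "xliff").
    """
    parts = [urllib.parse.quote(p, safe="") for p in (project, component, language)]
    url = "{}/api/translations/{}/{}/{}/file/".format(base_url.rstrip("/"), *parts)
    if file_format:
        url += "?" + urllib.parse.urlencode({"format": file_format})
    return urllib.request.Request(url, headers={"Authorization": "Token " + token})


# Hypothetical instance, project and token, for illustration only.
request = build_download_request(
    "https://weblate.example.org", "my-project", "my-component", "cs",
    token="hypothetical-token", file_format="po",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would return the file contents; the same workflow is also available through Git synchronization, which the talk presents as the preferred path.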
20 years with Gettext
Experiences from the PostgreSQL project |
Hello, hello, good day, good day. I think that's sort of the Brussels welcome. So I want to talk about what we're doing in the Postgres project with respect to translation. It seems like the last two talks already solved 50% of my problems, so there was already a good outcome of coming here. I want to talk a little bit more about the lower levels of what gettext specifically does, but also go through what other things we're doing. So in the Postgres project I'm just a C programmer; it's not really my job to do the translation, I do that sort of as a hobby on the side. I did initially most of the setup at, you know, the source code level, what do you call it, internationalization in that case, right, and then I'm also doing some of the translation. And yeah, it turns out also that the first FOSDEM I went to was 20 years ago. I like coming here because, you know, there's a Postgres dev room happening at the same time in a different building, but I like to come to all the other places and sort of intersect with other communities and learn about other kinds of stuff that's happening. So that's a good benefit of FOSDEM. So what's Postgres? I don't need to go into details, but it's a little bit different. It has different sorts of requirements, right? It's a database system. It's, you know, fairly big and fairly old, and it's different from a GUI program, let's say, right? Something like LibreOffice, or we'll hear from KDE later, and things like that. I think it lives much longer and has longer stability requirements and things like that. And that also kind of makes the maintenance of everything a little bit more complicated. We also maintain back branches, so we have a yearly release, but we still maintain the old releases for at least five years.
So at any given moment we have four, five, or six releases live, and that creates interesting challenges with backpatching. I was quite interested in the last question in the previous talk, about translation memory and that sort of thing: can we automatically apply the memory to the previous branches? That is a practical challenge, because you have to keep copying the same thing to all the different branches. In any case, what are we doing in Postgres? We use Gettext, as in the title. I put "standard GNU" with a question mark, which is a bit odd, but again, Postgres is old and runs on servers, which is a little different from what runs on a laptop or in an app. For example, we also support operating systems that may already be forgotten, like AIX and Solaris. Gettext originally came from the Solaris sphere, and a native Solaris Gettext still exists, or at least it did; I don't know its state now. It is distinct from the GNU one, but it's old, has bugs, and sometimes doesn't parse things correctly. So when you just use the GNU version, sometimes the files you distribute don't work on old Solaris for some reason. That is just a weird situation: you have to fix these things, or hope that Solaris dies at some point. So that's what it's like. We'll hear from KDE next; I already looked at their abstract. What we have is obviously not a lot by comparison in terms of languages, message catalogs, and strings, but it's more than you can deal with in an afternoon. It's a lot of stuff to move around. I mentioned the different branches; we also keep the translations in a separate Git repository.
I think that's fairly common, so that you can manage access for translators separately from the source repository, and then you just move things back and forth. Again, maybe Weblate will help with that. We also have other projects in the vicinity of the core server project, such as the JDBC driver, which is obviously in Java and has slightly different tooling. And pgAdmin is a GUI; they have their own workflows. So it's all a bit all over the place, and it's hard to keep everything moving in the same way. Some documentation is also being translated, but that's handled completely separately. We were actually just talking in the break: maybe we could use LibreTranslate for that at some point. So all kinds of interesting ideas are already coming up here. All right, so this is my "Weblate, but terrible" job: this is how we handle it. The web interface is at babel.postgresql.org. It gives you the status of the languages and the message catalogs, and it does the string extraction and merging in the background, as you would do in the makefile, but it runs it for you from a cron job. Again, it's really old, but it does the job. The workflow is: you pick your language, pick what you want to work on, click on it, download it, do the translation with the Gettext tools or whatever editor you want to use (different people use different things), and then you commit it back. All the branches are available here, so you can scroll down, and you just fill these up. So these are the languages across here.
What the colors mean: green is 100% translated. One thing we did, which we just made up, is that we decided that if a message catalog is not at least 80% translated, we're not going to ship it. You don't want to ship just one string. You could, but it would be weird for a user if all of a sudden one translated string pops up and nothing else is translated. That threshold is something we decided on somewhat arbitrarily, and it seems to work pretty well. So the yellow ones are the ones we would ship, and the white ones are the ones that are not complete enough. All right, the workflow is the usual Gettext workflow, at least for C programs; there's other stuff happening there now. The developer marks up the messages with this underscore macro that they recommend, and our developers are totally good about that; they're all aware that you need to do it. For the most part it's wrapped into things like this, so you don't have to mark up everything manually. If you use the standard internal API to print an error, log an error, whatever the case may be, it's already done for you. That works pretty well. Every once in a while something gets missed, but it's not a big problem; the whole developer group is aware of it. Then, as I mentioned, the website uses the standard tools to give you something you just download, translate, and upload back. And at release time someone, often me, runs a script to copy that over. That could be automated, but it's one of those things: we have releases four times a year, and you can either do it manually four times a year or spend X hours automating it, so usually it's just done manually.
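The 80% rule described above can be sketched in a few lines. This is only an illustration, not the script the Postgres project actually runs: the function names, the simplified PO parsing (plural entries and msgctxt are not handled precisely), and the sample catalog are all assumptions made for the example.

```python
import re

SHIP_THRESHOLD = 0.80  # the 80% cut-off mentioned in the talk


def _unquote(text):
    """Return the content of a double-quoted PO string, or "" if malformed."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return ""


def translated_ratio(po_text):
    """Fraction of entries that have a non-empty, non-fuzzy translation."""
    total = translated = 0
    # Entries in a PO file are separated by blank lines.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        lines = block.splitlines()
        fuzzy = any(l.startswith("#,") and "fuzzy" in l for l in lines)
        msgid = msgstr = ""
        current = None
        for line in lines:
            if line.startswith("#"):
                continue
            if line.startswith('"'):  # continuation of the previous keyword
                if current == "id":
                    msgid += _unquote(line)
                elif current == "str":
                    msgstr += _unquote(line)
                continue
            keyword, _, rest = line.partition(" ")
            if keyword == "msgid":
                current, msgid = "id", _unquote(rest)
            elif keyword.startswith("msgstr"):  # msgstr, msgstr[0], ...
                current = "str"
                msgstr += _unquote(rest)
            else:  # msgctxt, msgid_plural: ignored in this sketch
                current = None
        if not msgid:  # header entry or comment-only block
            continue
        total += 1
        if msgstr and not fuzzy:
            translated += 1
    return translated / total if total else 0.0


def should_ship(po_text, threshold=SHIP_THRESHOLD):
    """Apply the shipping rule: only catalogs at or above the threshold."""
    return translated_ratio(po_text) >= threshold


# Hypothetical catalog: 5 entries, 3 usable translations (one is fuzzy,
# one is empty), so it falls below the threshold.
SAMPLE = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "file not found"
msgstr "Datei nicht gefunden"

msgid "out of memory"
msgstr "Speicher aufgebraucht"

msgid "invalid option"
msgstr "unbekannte Option"

#, fuzzy
msgid "connection lost"
msgstr "Verbindung"

msgid "permission denied"
msgstr ""
'''

if __name__ == "__main__":
    print(translated_ratio(SAMPLE), should_ship(SAMPLE))
```

Fuzzy entries are counted as untranslated here, which is an assumption about how such a rule would be applied; in practice `msgfmt --statistics` from GNU Gettext reports the same three counts (translated, fuzzy, untranslated) without any custom parsing.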
All right, so this is our toolchain at the moment. Again, GNU (question mark) Gettext. We have a pretty standard configure, make, make install build system. We don't use any of the makefile templates and similar things that ship with Gettext, because we have our own very convoluted build system based on GNU make. We're also in the process of getting rid of that: we're moving to Meson now, which has some support for this built in, but it's incomplete, so we're stuck halfway here, halfway there. That's work we're doing right now; I have to figure that out. People use whatever editor they want. Poedit seems to be somewhat popular; I just use Emacs. Some teams, by which I mean language teams, have used Crowdin, which I suppose is similar to Weblate, but again, as we were just saying in the break, maybe we'll look at Weblate. And then there is a horrible bag of shell scripts, Perl scripts, and makefiles that holds it all together, which again could be replaced by something better; I've just never really figured out what that could be.
So, pros and cons of doing any of this. Obviously we want to translate because we want to translate; that's the ultimate requirement. But what I found interesting as a secondary benefit is that by putting all your program's messages through a translation process, you get an automatic review of every message string, because everything you put in the source code is looked at again later by at least one more person, or by several translators. You catch typos and things like that, but also cases where something doesn't make sense. Maybe some developer wrote it and it makes sense to them, but then someone who is not that developer looks at it and says: I don't really understand this, I can't translate it because I don't understand it, or it looks weird, could we look at it again? So you get this review process. We've gotten really good in Postgres at tuning error messages, because it's a complicated piece of software and you get all these odd scenarios with transaction processing, the write-ahead log, replication, and so on. You want to be precise and explain: this failed because of this, and you could try this, but don't try that. I think people appreciate that independent of translation, and this process actually helps, because you refine your program's messages through it as well. Secondly, it also turned out that sometimes people come in, do some translation, maybe find a bug or want to look something up, go into the source code, and then become a programmer. So you can also recruit people that way, which is interesting. But then there are many challenges. First of all, you want to get people in there to use the translation, and
it's just that Postgres, like similar systems, is not end-user facing. It's not used by random average people; it's used by technically minded people, experts, database administrators, and so on. So for a lot of those people there's not much pressure to actually have things translated. People say: okay, it's not translated, it's fine. That is different from LibreOffice or Firefox: if those are not translated and you install them in a school, it wouldn't work; you just can't do that. Here, if it doesn't get done, it's not a problem in a way. We want to do it because we like it, but if it doesn't get done, then we'll just move on. So it relies on a lot of individual enthusiasm. I also found, at least personally, doing some of the translation work myself, that the terminology is hard sometimes. It's not just "press this button to download a thing"; you can probably translate that into any language by now. But what if you get into terms like subtransaction rollback or incremental materialized view maintenance? Some languages might not even have terms for that. Sometimes when I do the work I take some academic textbooks, in German in my case, and go through them: does anybody in here talk about materialized views, and what terminology are they using? Then I have six books, three do it this way and three do it that way, and at some point I just pick something. So in some cases we even have to make up the terminology. And as I alluded to, the workflow is not as cool as what we saw in the previous talk, so maybe we can improve that.
So here are some source-code-level challenges. Some of those are solvable, some are not. People who work in translation know about plural issues; we do handle that, and it works fine. But I've never figured out how to handle the first one: if you have two or more numbers in a sentence, you would need some combinatorial list of translations. What if the first one is singular and the last one is five? What if the first one is two and the last one is 18? I don't think you can really solve that, and you just start rephrasing things in weird ways. The second one is something everybody knows you shouldn't do: pasting terms together doesn't work. Let me just make something up: "cannot apply a generation expression to a materialized view". That's a thing that could happen in Postgres, more or less. You shouldn't stick a term into the middle of the sentence like that, because then the grammar doesn't match in some languages, so you write the sentences out. But what if you have five options here and six options there? Are you going to put 30 strings in your source code? At some point, probably not. At some point the actual developers do get annoyed if you tell them: no, you can't do that, you actually have to write 35 error messages by hand. They say: I'm not going to do that.
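The two source-level problems above can be illustrated with Python's standard `gettext` module. This is a sketch of common workarounds, not what the Postgres code does: all message texts and function names are made up for the example, and `NullTranslations` stands in for a real catalog so that the code runs without any `.mo` files.

```python
import gettext

# NullTranslations returns the English source strings unchanged; a real
# program would load a catalog instead, e.g. with gettext.translation(...).
catalog = gettext.NullTranslations()
_ = catalog.gettext
ngettext = catalog.ngettext


def N_(message):
    """Mark a string for extraction without translating it yet."""
    return message


def rows_copied(n):
    # One count: ngettext selects the plural form for n.
    return ngettext("%d row copied", "%d rows copied", n) % n


def tables_and_rows(tables, rows):
    # Two counts: a single ngettext call can only pick a plural form for one
    # number, so each count gets its own complete clause.
    first = ngettext("%d table processed", "%d tables processed", tables) % tables
    second = ngettext("%d row copied", "%d rows copied", rows) % rows
    return _("%s; %s") % (first, second)


# Hypothetical full sentences, written out instead of pasting a feature name
# and an object type into one template.
FULL_SENTENCES = {
    ("generation expression", "materialized view"):
        N_("cannot apply a generation expression to a materialized view"),
    ("generation expression", "foreign table"):
        N_("cannot apply a generation expression to a foreign table"),
}


def cannot_apply(feature, target):
    # The lookup returns a whole sentence, which is then translated as a unit.
    return _(FULL_SENTENCES[(feature, target)])


if __name__ == "__main__":
    print(rows_copied(1))
    print(tables_and_rows(1, 18))
    print(cannot_apply("generation expression", "materialized view"))
```

Splitting a two-number sentence into two clauses is exactly the kind of rephrasing the speaker calls weird, and the table of full sentences grows as the product of the options, which is the cost the developers object to. The sketch shows the trade-off; it does not remove it.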
Yeah, then you start tweaking it: can you say "something something; something something"? Maybe at that point it's okay, I don't know. So you have to make judgment calls in these cases. One thing that sometimes happens: if developers add a new file to the source code, it also has to be added somewhere else to make sure the translation system catches it, and that sometimes gets forgotten. It's just one of those things; I don't know if there's a solution, you just have to do it. There's also an odd thing where we have files that get compiled into multiple components, and then you have to add them to all of those components and re-translate everything in each component. That could be handled with translation memory and the like, but the way we have it laid out makes it annoying. The next part is maybe specific to something like Postgres being, A, a client-server system, B, a database, and C, having its own ideas about what encoding and locale mean. A database stores data, which is often text, which has an encoding. Nowadays you think everything is UTF-8, but in a database you can also store things in other encodings, for historical reasons or in some cases because UTF-8 doesn't actually support what you want to store, which may sound bizarre but happens, especially with Japanese and the like. So we support automatic encoding conversion between client and server, and that all works. But then your translated strings are in a file, which also has an encoding; they get loaded into the server process; the server process prints things to its own log but also sends error messages to the client. All of those things could have different
ideas of what they want. You might want to log in English to your server log, but the client wants the error message in French, or maybe it's a legacy client that wants it transcoded to Latin-9. At the same time a different client is connected that is working in a different language, and you want its messages logged to the same server log in the same language and, hopefully, the same encoding as the other one. All of this works quite poorly with the way the Gettext and libintl APIs work. You can get some subsets of this working, but if you really push it, it's a total mess and basically doesn't work. That's a real problem, and we'd have to redesign some of this to support all of these combinations. So, the tools. Well, the tools are fine, they're actually quite cool, and Gettext has some internal optimizations that are quite interesting, such as internal parallelization, so work has been done. But I still find it quite slow, even at our scale. I'm interested to see what the KDE report is going to be later and how they handle that. The website I showed, if you do a full rebuild of it, takes something like 20 minutes, just to re-extract, re-merge, and recombine everything. Also, I find the PO format is pre-source-control, because it has all these dates and timestamps in it, which you don't need because you have them in your source control system, but these... Hello? Can you be more explicit: what do you do in these 20 minutes? Because it sounds very slow.
Well, it runs a loop. It runs xgettext over the source code to extract the strings, and then it runs msgmerge against all these catalogs, which is this many languages by that many catalogs times that many branches, and you run that on just one machine. You could optimize this with a beefier machine, and you can probably parallelize it a little, but still: the main message catalog for the actual server has about 5,000 strings, and merging that takes, I don't really know why, a couple of minutes. We're doing this build system work now, going from make to Meson and Ninja, because make is too slow even when there is nothing to do. So we're trying to go from "I can rebuild everything in five seconds" to two seconds, and then this thing takes 10 minutes, which is just annoying. And I mentioned the backpatching. What often happens is that there's a bug fix, and because of the bug fix a message changes or a new one is added, so that pops up on the website. But then the same bug fix gets backpatched, so the same message also has to be updated in the back branches. You have to download this, upload that, then it gets added to the translation memory, then you can do the next one. I have a bunch of shell scripts to make this work. It all could be better. A lot of people know this chart here. Some of the projects we know, maybe Postgres is somewhere in here, KDE, LibreOffice, they're all in pretty good shape, but then there are things like that one down here that everybody builds on but that are maintained by a few people on the side. This is a general problem. I gave a talk, it was
the online one two years ago, about the documentation toolchain in Postgres, and it's the same problem. Open source is very successful, but then there are these little tools you need just to make your build run, and they don't necessarily have the same amount of staffing and funding, yet you still rely on them and they just barely chug along. That's a general concern, but it applies here. So what are we planning to do? As I mentioned, right now we're redoing our build system, which is a good reason to clean up some of the old stuff we don't need anymore. We're also moving more to using ICU, which is an internationalization library that does lots of good things, but it adds another dimension to this issue of locale and encoding: now there's also what ICU thinks the current locale and encoding are, and it just gets ever more messy and complicated. One important issue in databases is the sort order. A lot of people care about what the sort order of their data is, and different collations have to be supported, which is another kind of localization work we do. But all of this is weirdly connected. If you configure one part of the system to be in this language, then all of a sudden Gettext thinks it's the same language too, but maybe you don't want that. You might want your error messages in French but want to sort something in Swedish. Why not? But because of the way these APIs were historically built, it just doesn't work smoothly.
But again, we want to modernize the workflows. Again, maybe Weblate; I also heard about OmegaT here this weekend, and there's Crowdin. I heard of Weblate some years ago too, but the issue is that we can't just adopt the hottest new thing. The way I always think about it is that whatever I do today in Postgres, whether I write some piece of code or make some infrastructure change, still has to work in 10 years, and that means it also has to build from source, because that's the way open source works. So I can't just use a tool that was invented yesterday when I don't know if it's still going to be here in two years. Now, they mentioned Weblate is 11 years old, so that's pretty good, and I think we can look into that. So here is a question, maybe somebody knows: is Gettext still the thing, or is there something totally different that everybody should be using now, as the low-level API for how this works? I don't know. I was half hoping that something would be emerging from the ICU ecosystem, but I haven't seen anything like that. So is there anything else, or is this still the thing to use? Maybe somebody has an answer. Yes, please. The upcoming ICU solution is MessageFormat 2. It's currently in ICU4J 72, which came out in October. It's a tech preview there, but it's going to progress from there. It's not yet in ICU4C. MessageFormat 2 is effectively a new message formatting syntax; the resource-level syntax for it is still more of a work in progress. But if you move into more of an ICU world, that's likely going to provide a decent future thing for you to migrate to from Gettext. It's not that yet, but it's becoming that. That is excellent news, thank you. I'll definitely look into that.
Is Weblate adopting that as well, or supporting it? It seems to support a bunch of formats. So yeah, this is wonderful, useful information we can use. This is good, thank you. This is also the end of my presentation. I just wanted to say what we're doing and what some of the unique challenges are. I got some good feedback here: we're going to look into Weblate, we're going to look into the emerging ICU things, and update some of our infrastructure. We have a few minutes for questions; otherwise, thank you very much for listening. If you're worried about Weblate, it doesn't really matter what you're doing. It doesn't go too deep into your automation system, because the way you communicate with Weblate, with the world of the translators, is files, files in a Git repository. Whatever happens in Weblate, the matching of translations, the search, the checks and so on, you as a developer still interact with the files. So you still have control over how you build, and you don't create an additional dependency for your source. Yeah, it sounds like it. And Gettext looks to be alive again; I think they did a release not so long ago. Yeah, it was kind of funny, because I had submitted various bugs to savannah.gnu.org over the years, also feature requests and the like, and just two or three weeks ago all of these bugs were updated and some of them closed, and I was like, whoa, does somebody know that I'm going to complain about them? I don't know. Is the person here, in any case? I don't know who it is. No. Well, I guess this is a problem with some of these specialty GNU tools and some of the older tools that are in maintenance mode: a few people maintain them. We don't need tons of new features, but you don't really know.
It could just be that the person changes jobs and then nothing happens again for five years. But, well, I got some good new information here, thank you. All right. All right, then we'll move on to the next one. |
Building an attractive way in an old infra for new translators |
Okay. Hi, folks. Thanks for coming to this talk. I also want to thank FOSDEM and the organizers for this dev room, because I'm very happy that translation is among the topics at such an important event. That is very good news, I think. I wanted to propose this talk because, personally, I started with free software in 2004 and have contributed to the Debian project for a long time, but I only started translating in this project in 2018. I asked myself: how could we help new contributors, new users, potential newcomers, to help the translation teams, given what Debian can provide for them and also what Debian can't provide? That is what we're going to see, and we're going to see why it's actually not so easy. I also told myself: I'm sure some people think, why doesn't Debian use modern tools such as Weblate, modern PO editors, or things like that? I had thought about that, so I said to myself, why not share it, so that people understand why Debian works like that, and how a new user could contribute with this way of working, without problems and with satisfaction. The example I take is, of course, the French translation team, because it's the team I know best. It's a very, very small team. As you see, there are few people. There have been people for a long time, but it has never been a team of 10 or 20 persons. It's also not a team with a lot of technical skills. There are some Debian developers who are technical, but only one or two, and most of the other Debian developers on this team are not uploaders. That is, they are not considered, or do not consider themselves, able to upload new packages or things like that. So it's not a technical team.
It's really a translation team and not a technical team. And amazing results are produced by this team: it's a very small team, but the results are remarkable. For example, across all the things we translate, we know that French is the third best translated language in the project. When we see the amount of work needed to get there, I think the team can be very, very happy. And I can say that because I'm not part of all of the team's work and I'm not part of all of these projects. So congratulations to the other members of the team. The website, the installer, all those things are translated up to nearly 100%. The Debian package descriptions are, of course, not translated to that extent, but the most important ones are, and it's very important for us to enable French users to get package descriptions at least for the packages they are most likely to use. So I think these are amazing results given the real size of the team. Then, we work for a project that has a very long history, as you know. I was very happy to discover this while preparing the talk: the first message of the Debian French translation team is from 1998. It's very old. The world has changed very much since then, of course, and the technical world in particular. So the process, the workflow, is mail-based today. We have web pages that let us track the situation and give some statistics and information, some bots that automate some things and also produce statistics, and some mail-based conventions for a workflow between the members of the team, to ensure that the translations are of good quality. The tools, as I said, are mail-based.
That is, on the Debian mailing list for the French team, there is a bot that reads the subjects of the messages sent by the members of the list, parses them, and produces statistics and the status of the packages or pages we are translating. That's very important, of course, because it lets all the members of the team know, for example, whether someone is working on something, or whether something is up to date or not, and other information like that. So the question is: what can we do with the mail-based process, given the number of things we have to do? Well, we track the changes, which is very important. We can know the status of the translations. And even if we stopped working for a long time, we can easily join the team again and find out what the situation is. There are also coordination pages, along with the tracking pages, which make that possible. So we have the announcements for the website. We have the packages translated via PO files; packages are translated that way because they are classical programs that use Gettext, as many programs do today. There are also the descriptions, and also the debconf screens. Debconf screens are the screens the user gets when installing a package, to set some configuration at installation time. And the website, of course, which is a huge amount of work. I think the problem of the Debian project, and the reason it's probably very difficult to use automated tools, is that a lot of formats are used as inputs and also as outputs. For inputs, we have at least the package description database, which has its own format. I'm not sure I know exactly what it is, but it's not a format such as PO files or HTML files; it's another kind of format, a database. Then HTML, of course, and the PO files.
And for the outputs, we need to produce, of course, the classical, typical translations we find in any other kind of free software program, but also the website, so HTML pages, and various other formats. So it's not so easy to put all that into an automated tool, because setting it up initially is really quite a big administration task. And remember that in the team we don't have a lot of technical skills and we are not very numerous. So you can understand that we prefer carrying on with mails, even if it's probably not perfect, because at least it enables people to work. Changing to another kind of tool, even if it would have benefits for the interface and the user experience, would be a very significant challenge. So changing is very difficult. Is it also a good idea? That's something the team wonders about sometimes. We wonder about it in particular because, it's true, there are not a lot of new contributors if we look at it year by year. However, when people come and contribute to the translation project in Debian, they stay with us. They don't leave. That probably means that once we manage to adopt the tools and the process, we like them and people are comfortable with them. So even if it's not modern, once you take the care to learn it, in the end I think you can be comfortable with it. In particular because we can work on things offline, and because you can take the PO file and do some stupid things while being sure you are not breaking the whole process or the translations of others. So I think it gives some benefits that people are happy to have. And that's the reason why moving is difficult: people are happy, and we are not absolutely sure that changing would be a good idea.
And when we did try changing, for example when we migrated the server for the package descriptions to another server in Debian, it took a year and it exhausted some people. So I'm not sure it's the best idea we could have, even if this migration was really important. Imagine a migration with more impact and more things to address. So instead of changing things, I suggest proposing to new contributors an approach that lets everyone discover the project, and discover the interactions you can have with the team, from the easiest to the most complex. And from this basis, follow your own way. At every step on the way, the idea is also to have two kinds of status. We are going to talk about that again in a few minutes, but it's more or less: either I am a reviewer, and I start to understand all the conventions of the team and how the team works, or I become a translator myself and start to contribute with the other translators. So it's a way for everybody to contribute at their own rhythm. I suggest that the first thing to do if you want to contribute, since, again, the tools are not all attractive, is to start with the easy thing. Translating the wiki pages of Debian, for example, would be an easy thing, because many people love to use a wiki and it's not difficult to use. Using a wiki today is: you take the page, you copy and paste the contents, you create a new page on the French wiki, and, well, it's done. So it's not complicated. And it's a first step that works as a kind of initiation, a training for translation, without so many consequences for the project, because it's not an official resource but a community resource.
It's not a resource maintained by Debian members, so it's not a big problem if you make some mistakes or things like that. The next step is probably the package descriptions. It's interesting because there is more responsibility: package descriptions are what users will see on their screen when they search for a package. However, the interface is simple. It's a web interface. There are some conventions and some rules to follow, but it is also a first contact with the translation team. And that is important, because we have people who review what you're doing, so you learn to do what Debian usually does. For example, if you translate something one way but Debian usually translates it differently, you will be told. If you're not compliant with some syntax or with the usual way of introducing things, a member of the team will say so. So it's a first responsibility and a first contact with the team, but through a simple interface that you can use without needing to understand anything technical. So I think it's a good starting point, and what I especially like in the package description tool is that the texts are very short. Unless you take packages such as LaTeX, of course: when you start translating the descriptions of LaTeX packages, it's less fun. There are sometimes 2,000 lines for one description. That's somewhat difficult. But if you translate more typical packages, it may be very short, and you can do one package a day, for example. So you can really work and learn at your own rhythm, and with a minimum of follow-up. So I think it's a very interesting first step. The next step on your way as a newcomer could be the debconf screens. Debconf screens are another thing, because now you have to post your translation proposals on the mailing list and enter the process.
So you have to understand the virtual address that we put in the subject of the message, to be sure that the robot is able to understand what you're doing and to report it in the tracking tables. So it's a bit more difficult, but you start having an experience of the Debian ecosystem, so we can consider that you're ready to understand that. Maybe you will meet someone physically who will explain to you how things work. And with debconf, okay, you are now going to use intent to translate, request for review, last chance for comment. So you really enter the Debian process and start contributing to really important things. It's the first responsibility in the Debian process, working with the mails, et cetera. And once you are completely comfortable with that: the process is also here for coordination on the packages, with the messages which mention their status, and once the mail is received, of course, you translate your PO files with your own editor. With your own editor, or why not with Weblate? Because I saw that Weblate can serve as a dictionary of translations while you use your own editor. You test the results, too. You report when you have finished, and when the team is okay with your translation, you report the bug to upstream, sorry, to the package maintainer, who puts the PO file in the package. And the benefit of that is that you use your preferred text editor, as I said, or a PO editor: Poedit, or Lokalize, an editor from KDE which works very well for this, in particular to identify the fuzzy strings or the untranslated strings. Yes, because I forgot to mention an important thing: if we used a Weblate interface or something like that, there is another problem for some contributors like me, which is accessibility.
So it would prevent some people from contributing, due to accessibility problems. With this method you can use the tools you like; personally I use Emacs with gettext. It's easy because it's plain text, and there are no accessibility problems. Well, once you are comfortable with that, maybe you'll want to contribute to the Debian community's effort on man pages. Man pages are a big project, about 2,000 pages and a lot of updates. But the good thing is that the method is the same. You clone the Git repository, you translate the PO files, you are on the mailing list with the virtual topics to follow the process, and when it's finished, you commit, or you ask someone in the team to commit. So you know the process, you know how it works, you know the conventions in Debian. So now you are able to contribute to any project based on PO files, not only the packages themselves: you can do man pages, package descriptions, the Debian packages themselves. It becomes easy for you to contribute because you understand the process, and also the human beings in the community. The benefit of the man pages is that you just translate and discuss. You don't have any maintenance to do, because the German team does that and does it very well. So it's a new area which is accessible to everybody and very nice to work in. And it's useful for the whole community, of course, because Debian translates the man pages, but we don't translate only for Debian. We translate for all the communities, because man pages are used in all the distributions. So this translation is also a way for you to contribute to the whole community. Then you have documentation, the Debian-specific documentation. Related to the Debian software, we have the release notes, of course, the Debian Developer's Reference, et cetera, and now that too is only PO files.
So it's not so difficult to understand how it works, and the process for getting a review and quality assurance from the community of translators. And packages, same thing: in interaction with the community, we can now translate all these things. According to your time, you will choose one of these projects, or translate for one, two, three, or all of them, according to the time and the energy you have. The security advisories, of course, start being a bit different, just like the website, because you start to play with HTML. It's a new area. I think starting with package descriptions avoids any difficulty from a technical point of view. Then PO files put you in front of the difficulty of the process, but, well, once that is done, the process is learned. And to finish, you come to HTML pages, which may seem somewhat difficult to some people because you have to deal with tags and the many things you have on an HTML page. However, it's important, because on the website we have the security advisories, the publicity, and the platforms for the Debian Project Leader. So it may become important. And for any of these steps you can, of course, take your time: for example, spend three years on package descriptions and then wait another three years before man pages, et cetera, according to your rhythm and your energy. And above all, you can do what you want, reviewing or translating. Generally, when you arrive, we suggest reviewing, to understand the conventions and how things are done without taking too much time out of your life. Then if you want to contribute by translating directly, it's a good idea. What I mean here is that you can really do this with minimal means, minimal installation, and a minimal understanding of things, and above all in a progressive way. Also, you can track translations.
So you subscribe to the mailing list, you have a look at the translations, you follow what happens in terms of review, and you translate something new if you consider that it's simple or short enough to spend the time on. So reviewing and translating are the two things you can do at any time in the process and on your way in Debian. So to conclude, I would say that it's not an automated process, because we don't have the capacity to do that. I think the project is too complex today to do that efficiently, and it would take some engineering to do things properly. But we have a way today, I think, and it's what I try to suggest on the wiki pages about translation: how to start contributing without being discouraged, as I was for some years. There is a path from the most obvious, technically speaking, to the less obvious, with contact with the community that is progressive. And you discover how things work, and the conventions, of course. The team also discovers whether you are ready to respect how the team works, because we have had some contributors with whom it was complicated because they decided to do things the way they wanted. Yes, but there is a team and a history, so you can't just do things as you want. Even if you're right, take the time and respect what other people do. And it's also important to say that if you start contributing via the mailing list, you have the chance to use your native language, because on the French mailing list, but also the Italian one, the Spanish one, or any language, the translation teams are of course people speaking your native language. So, for example, to find out how to translate this English term into your language, it may be useful.
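As a rough illustration of the mail convention described in this talk, the sketch below parses a subject line made of a status tag and a virtual address, the way the tracking robot is said to read it. The exact tag spellings (ITT, RFR, LCFC, BTS, DONE), the `kind://package/file` shape, and the names `parse_subject` and `is_forward_step` are assumptions made for illustration; this is not the robot's actual code.

```python
import re
from typing import NamedTuple, Optional

# Status tags in the order a translation is assumed to move through them:
# ITT = intent to translate, RFR = request for review,
# LCFC = last chance for comment, BTS = bug filed, DONE = finished.
STATUS_ORDER = ["ITT", "RFR", "LCFC", "BTS", "DONE"]


class Subject(NamedTuple):
    status: str    # one of STATUS_ORDER
    kind: str      # e.g. "po-debconf" or "po" (illustrative values)
    package: str
    filename: str


# Assumed shape: "[TAG] kind://package/file", optionally preceded by "Re: ".
SUBJECT_RE = re.compile(
    r"\[(?P<status>[A-Z]+)(?:#?\d+)?\]\s+"
    r"(?P<kind>[\w-]+)://(?P<package>[^/\s]+)/(?P<filename>\S+)"
)


def parse_subject(subject: str) -> Optional[Subject]:
    """Return the parsed subject, or None if it does not follow the convention."""
    match = SUBJECT_RE.search(subject)
    if match is None or match.group("status") not in STATUS_ORDER:
        return None
    return Subject(**match.groupdict())


def is_forward_step(previous: str, new: str) -> bool:
    """True if `new` does not move backwards in the assumed status order."""
    return STATUS_ORDER.index(new) >= STATUS_ORDER.index(previous)
```

For instance, `parse_subject("[RFR] po-debconf://foo/fr.po")` yields a `Subject` whose status is `RFR`, while a subject without a tag yields `None`, which is the case where a robot of this kind would have nothing to record.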
And, for example, for the package descriptions we also have a glossary to help people know: okay, this English word which has a strange translation, should I translate it or not? And if I should translate it, which word is usually used? And by doing that, you learn. And so, without any complicated tools and without difficulties, you can start translating, and you can become a Debian translator. Thanks. Oops. If you have some feedback or some questions.
For me, it is more of a general feedback, a kind of policy feedback. I have seen that there is maybe a fragmentation across the different Linux distributions; some have an advantage here, some not. I never found a distribution that convinced me perfectly. For instance, you have problems with vision, and I have small problems with vision and ergonomics. So I would appreciate a scroll bar that is large, not a thin line that you have to pick up to adjust, things like that. I have some problems like that. MX Linux has a good scroll bar, but also other problems. So I'd wish for more cooperation within Linux; there was the problem with systemd also, something like that. I think it's important for the Linux community in general to have better cooperation between the different projects.
I think that the important thing is free software. What you say is right, I'm okay with that. But according to my understanding of things, and it's true of translation as well, the user finds a way for himself. I mean, okay, Debian has its way of working, Ubuntu has its way of working, and any project has its way of working according to its history, according to what needs to be done, et cetera. The question is how to enable a user to find, among all the existing ways, the way that suits him.
And I think that's true for any new user, which is what you're saying, but it's also true for new contributors, because contributing works according to some rules, some community interactions, and some tools, of course. So it's not a matter of what is better and what is not; it's only a matter of whether it suits you. And the benefit in the community is that you have this diversity, which lets you decide what is good for you and for your way of doing things. So it's a good thing. But yes, there are a lot of ways, and finding the one that is good for you is not easy.
I think the Debian community has something very good from which we should all learn: you have a very good translation workflow. I think this is something valuable, and it is the reason why you are still able to do translation by e-mail even though the rest of the world uses very different workflows. I think that is your biggest value. And my question would be: what are the limitations that you see in the existing translation platforms, in terms of accessibility for you as a blind person, but also in terms of ways to collaborate as a team? Because I pushed for Weblate in many places; it helps a lot, but it doesn't work miracles. The community needs to find a way to define the workflow, the way to collaborate together, and this is something that the rest of the open source community can learn from the Debian community. Your tools are a little bit prehistoric, but your ways of working are very clear and easy to understand. I would be interested to have your feedback on translation platforms in terms of accessibility, and on what is lacking for proper team collaboration compared with what you have today.
Thanks, because you sum up what I want to show here.
What you say is exactly what I think and want to show. But I am aware, from some feedback I get and because I know there is only a low flow of new contributors, that maybe, and I say maybe because I'm not sure about it myself, there is an issue. For me, the tools we are using now are perfect: they are accessible, they work, and they let us keep a record and collaborate as a team. But for the new generation of contributors, I know that there are friendlier interfaces, with more design, a UI and UX which is more beautiful, more friendly, more attractive, and with a gentler learning curve, so you can come into the project, do something, and leave, no problem. You don't have to spend months learning things, et cetera. So the problem with new contributors today is also that people are less willing to get involved for the long term. They want to come, give it a try, and leave, because, well, we have our lives, we have our obligations, et cetera. And it's true that in Debian this is more difficult. In LibreOffice, for example, okay, you see an error or a mistake in a string, you open Weblate, you fix the string, and you leave, perfect. In Debian, that's not possible. Well, it's possible for the package descriptions, because that is a web interface with no learning curve, it's very simple. But for people who don't want to make the effort to learn, who don't want to make the effort of understanding things deeply, well, it may be dissuasive. Is that bad news?
I don't know, I'm not sure. But the fact is that the team is small and we have few new contributors, so I wonder if it's a limitation. But for contributors who try their best, I think these tools are really, really perfect.
Yeah, just one more question. So you've described the approach for creating new translations. Can you give some examples of what you've tried, and recipes, for updating existing translations? When the original English text changes, how do you know about it? Do you notify the same translator as before, or does everybody jump in and change it?
Well, to take debconf or the man pages, for example: when there is an update, the translator gets a warning. He knows that he has something to do, because the last translator is considered the one who can, or should, make the changes. Once he has this warning, either via mail or because he has looked at the statistics page, he says, okay, I will do that. So he sends a mail with the virtual address saying, okay, this page, this description, this PO file, I will translate it. So there is a mail, and the community is notified that the person is going to translate. After that, he translates the PO file and then says, okay, now I have finished, please review the translation. So there is some review; it's a process which can take five days or ten days depending on the number of mistakes which need to be fixed, et cetera. And at the end, he says, okay, last chance for comment: it's the last chance for you to review and to tell me if there is a problem. And once that's done, when the mail has arrived on the team's list and there is no reply, well, okay, the translator can either report a bug to tell the maintainer of the package: okay, there is a new translation, it's updated.
Please put it in your package. Or he can do it himself or herself, if the person has the permissions to access the Git repository and commit the translation. So it's manual, but of course there are some scripts to do that. You don't have to do it manually; it's not mandatory. But the process really is that you follow the situation via mail.
Okay, thank you.
Thanks. |
Managing KDE's translation project
Are we the biggest FLOSS translation project? |
Thank you everybody for joining. I'm going to talk about the KDE translations project. I kind of made a clickbait subtitle to try to get more people: are we the biggest FLOSS translation project or not? Maybe we're going to see some stats later. First of all, I'm Albert. I've been doing KDE since 2003, so a long, long time ago, and since it was a long time ago, I've done lots of things. I did translations: first I started doing translations for Catalan, and then I got bored and started coordinating the whole KDE thingy. I've worked on lots of apps: Okular, which is this tool I'm using, it's a PDF viewer. I manage the releases from time to time. I've worked with games, with the Edu apps. I've been part of the board of KDE e.V. and KDE Spain. Well, that's KDE international and KDE Spain, so yeah, I've been around everything. You have my e-mail there and my nickname, which I use almost everywhere. It's tsdgeos, or however you want to pronounce it. It's hard. I didn't think I would ever be pronouncing it when I chose it, when I was 10, because there are too many consonants together. But yeah, anyhow. What is KDE? This is the long buzzword description we came up with for KDE, which, if you remove the last two or three words, could be applied to every single project out there at FOSDEM. It's like, we are a team of people that do things under free software. Okay, so we're cool. But what is KDE really? We do software. We do KDE Plasma, which is what people usually know as KDE. So it's the desktop. It's this thing where I do Alt+Tab and I choose things. This is KDE Plasma. It's not called KDE anymore. Then we do Frameworks, which are libraries that our software uses, and there are like 80 of them. Then we have something called KDE Gear, which has like 200 applications that we release regularly: every month a stable release, every four months a new release. So we do releases too often.
Then we have more apps like Krita and GCompris and KDE Connect, and they get released when they feel like it. These have a less fixed schedule. One of the things I added just recently is that we are old. We started in 1996. So this will maybe explain later why some things are as messy as they are and why we should improve them. So what do we translate? It seems like a very obvious question. Obviously we translate the application. The application is what the user uses. If I want my father to use something I did, it needs to be in Catalan, because otherwise he will not understand it. So yeah, user interface. Good. Now you have the user interface, and you have a problem. Sure, my father can come and ask me and I will solve the problem, but I'm not everybody's child, so you will have to look at the documentation eventually. Documentation also has to be translated. It's very boring because the documentation is very long. Translating the application is usually easier because it's Help or Print or File or Copy. Documentation is: if this failed, do this, and blah, blah, blah. It's a super long string, but it's interesting to translate it. People will eventually need the documentation. One thing we didn't use to translate is web pages. You really need to translate web pages. If you have the best app ever, but the web page of the application is only in English, people are not going to find it. How do they know the application exists? If I only speak Hindi, how do I know this application is amazing if your web page is only in English? Step zero is that people need to install your app, so they need to find it. So your web pages need to be translated. Then there are other things that we translate which are less obvious, like data. We have a game which is the hangman game, where you have to guess the letters of a word. Those words need to be translated, but they don't actually need to be translated.
They need to be localized, because what's an easy word in English might, when you translate it to German, be a super hard word to guess. So if you are in the easy mode, things don't translate one to one. You have to adapt what's an easy word in one language to an easy word in another language, and the same for hard words. So data is not always a one-to-one translation; it needs to be localized. Then we also, well, you don't translate them, we localize things like icons. Icons are an image of a concept that you associate with an action. For example, the save icon. We all go with floppy disks, and we are all using that save icon. That's a bad icon nowadays. Nobody wants to use a floppy icon. But, for example, what you would use for, I don't know, spell check in one language, maybe in another culture that iconography means nothing, right? So we have a few localized icons. It's not very common, but it's something we can do. Okay. So how are we structured? I think this is very common to all projects, but I'm going to explain it anyway. We have a global coordination team, which is on this weirdly named mailing list, because it has doc in there and it doesn't have anything to do with documentation. This predates me, so I don't really know why it's there. The i18n is a short way of writing internationalization; it's just geeky. Then every main person from a language has to be there. So I will write emails there saying, we're going to do this change, or this has happened, or, I don't know, if something breaks: this broke, apologies, don't do anything today until we fix it. Then each language has its own coordination. The Catalan team, we have a mailing list, and the French team maybe has a forum or whatever. We don't really go into that level of detail.
We do offer mailing lists for everybody to use if they want to, but if they want to coordinate themselves in any other way, that's fine. That's on the people side. How do we structure things on the files side, the things they have to translate? What we have is three branches, which are development, stable and Plasma LTS. Plasma LTS is a bit hidden because it's only for Plasma; it doesn't affect our other products. But still, we have three branches, and each of the branches you can basically think of as a folder. So there's a folder which is development, and inside that folder we have a folder for each team, and inside that team we have a folder for every application, and inside of that application we have, sorry, no folders anymore, we have the files for that application. I'll show it later. I can show it now, maybe, in the browser. So that's not this one. I have too many tabs open. Sorry. Where are you? Here. So this is the development translation directory, and there's one directory for every single language that we have. Then if you open the Catalan language, it is separated into documentation and messages. If you open messages, you have all the applications, and there are lots of applications. If we look, for example, for Okular, which is the PDF app, you will see there are a few sub-files, because things are sometimes split, but that's how it works. We have that replicated for development, stable, and Plasma LTS. So one of the things that is annoying, and Peter from PostgreSQL also mentioned it, is that managing branches is not great. People forget: they go to one place and translate it, and they forget to go to the other place and translate it. And then when development eventually stops being development and becomes stable, you have to copy translations from one place to another, and if they only translated in stable, you might lose them.
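The branch problem just described can be made concrete with a small sketch. It models each branch's catalog as a plain dictionary from source string to translation, finds the strings translated only in stable, and carries them forward into development so they are not lost. The dictionary model and the names `only_in_stable` and `forward_port` are illustrative assumptions; this is not how KDE's scripts are written.

```python
from typing import Dict

# A catalog maps a source string (msgid) to its translation (msgstr).
# An empty msgstr means "not translated yet".
Catalog = Dict[str, str]


def only_in_stable(development: Catalog, stable: Catalog) -> Catalog:
    """Translations present in stable but still empty in development."""
    return {
        msgid: msgstr
        for msgid, msgstr in stable.items()
        if msgstr and msgid in development and not development[msgid]
    }


def forward_port(development: Catalog, stable: Catalog) -> Catalog:
    """Return a copy of development with stable-only translations filled in."""
    merged = dict(development)
    merged.update(only_in_stable(development, stable))
    return merged
```

With a development catalog of `{"Open": "", "Print": "Imprimeix"}` and a stable catalog of `{"Open": "Obre"}`, `forward_port` fills in "Obre" for "Open" and leaves "Print" untouched. Strings that exist only in stable are ignored here, on the assumption that development has already dropped them.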
So we have a semi-official workflow, which is to just put everything together and give it to the teams so they can translate that. We haven't made that official yet for two reasons. One is that it uses lots of CPU time. Peter also mentioned that it took like 20 minutes for him. Well, yeah, it takes more for us. The other one is that development sometimes moves very fast. There are messages changing every day, and if I change messages three times in a week, maybe not all the teams want to keep track of development. So they will only start translating when the release is close. If you merge everything together, they don't know about that message that's not translated: is it important? Is this actually being used today, or might it be used three months in the future? So for this reason we're still not sure it makes sense to do the merge workflow, but we might get there sometime. So now some statistics on why, when I checked a few projects, I couldn't find anything that was as big as us. For development, everything has stars; I will explain the stars later. Our development branch has almost 300,000 unique GUI strings. This is not adding up all the languages; this is per language. So we have almost 300,000 GUI strings and around 75,000 documentation strings. I checked GNOME and I think GNOME was around 100,000 for GUI. I checked LibreOffice and I think it was similar-ish. So yeah, we have lots of GUI strings. Now I'm going to explain the stars a bit. Not everything we do has development and stable branches. Some things are just developed and released, and they don't create a stable branch at all, and then they develop again. So that's why there are so many more development strings than stable ones. For example, all the libraries: they don't have stable branches. Every month we do a release, and that's the stable release. Then among the GUI strings, some of them might be documentation strings.
We have started changing workflows and we have not adapted yet, but yeah, not too bad. In terms of teams, we have 109 unique teams. At this moment, or let's say in the last year, only 57 of them were active. So people come and go. As I said, we've been around for 27 or 28 years. So yeah, some of the teams are not active, but we still keep them there in case someone wants to come later and revive the team. Now, one thing that we decided, and I think it's very important, is that everything goes to a PO file. Translators are not necessarily technical people, so they need an easy way to contribute. We translate lots of things, as I said before. We translate C++ code, desktop files, JSON files, AppStream files, DocBook files, web pages, and the KDE Connect app, which is an Android application, and all of those do translations differently. So let me show an example. C++ code: the typical marker that Peter showed, the gettext marker, uses only an underscore, but we decided to go with i18n; it's the same thing. You say, we will translate this. Now, for reasons, we also have to support the Qt translation system, which is very similar, fair enough. But then we have desktop files, which also have to be translated. You have the generic name and similar fields. These need to be translated, and, you see there's a translation here, we don't want the translator to have to do that. We don't want the translator to have to care about the specific syntax of this file. This is why all the translations here are auto-generated. They are injected later, because it's very easy, but you can mess it up relatively easily. This way the translator doesn't have to worry about the syntax. Same thing for the AppStream files. AppStream files are something that software stores use for describing your application. So here you have a description of the app, and you say description.
Then you don't want the translator to have to come here and create a merge request against this file, when they don't even know where the file is. And you have to know this xml:lang syntax. It's very complicated. That happens all the time. DocBook: we use DocBook for documentation. I mean, if you're a developer, this seems very easy, but show this to someone that's not a geek and they will shout at you. This is not good. So for each of these formats we have a way to extract it to a PO file, which is maybe not super user-friendly either, but at least it is a common format. More. This is KDE Connect for Android. For Android, you don't specify the translations in the code, you specify them in a file. So you would need to know that for these, the translation doesn't go here; you have to copy it to a different place. It's very annoying. So we decided to have scripts that go from the origin, the source, whether it's C++ or JSON or DocBook or whatever, and convert it to PO, and then we have the reverse one: once you have the PO, you have to bring it back to the source, because the Android application only understands the Android world and needs things to be the Android way. So one example of those scripts would be this one, which we use for Android. We have one Bash function that exports the template from the code, and another one where I give it the PO file and it does the reverse. Okay. I'm going to spend one minute here, maybe not that much, on the gettext-on-steroids thing. Peter also mentioned before that there is this problem when you want to substitute things, on the grammar side, and that's because of the person that made the decision to use English as the source language. That was the worst decision ever. English is a very simple language, and you can concatenate all the strings and it will work, but any other language will be more complicated.
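The extract-and-inject round trip described above can be sketched for the desktop file case. The first function collects the translatable values that would go into a PO template; the second writes translations back as localized keys such as `GenericName[ca]=`, so the translator never touches the file syntax. The function names, the set of keys handled, and the in-memory dictionaries standing in for PO files are assumptions for illustration; KDE's real scripts are not shown in the talk.

```python
from typing import Dict, List

# Keys whose values are treated as translatable in this sketch.
TRANSLATABLE_KEYS = ("Name", "GenericName", "Comment")


def extract_messages(desktop_text: str) -> List[str]:
    """Collect the source strings that would go into a PO template."""
    messages: List[str] = []
    for line in desktop_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in TRANSLATABLE_KEYS and value not in messages:
            messages.append(value)
    return messages


def inject_translations(desktop_text: str,
                        translations: Dict[str, Dict[str, str]]) -> str:
    """Rewrite the file with generated Key[lang]=... lines.

    `translations` maps a language code to {source string: translation},
    standing in for one PO file per language.
    """
    out: List[str] = []
    for line in desktop_text.splitlines():
        key, sep, value = line.partition("=")
        base = key.split("[")[0]
        if sep and "[" in key and base in TRANSLATABLE_KEYS:
            continue  # drop previously generated entries; they are regenerated
        out.append(line)
        if sep and key in TRANSLATABLE_KEYS:
            for lang in sorted(translations):
                translated = translations[lang].get(value)
                if translated:
                    out.append(f"{key}[{lang}]={translated}")
    return "\n".join(out) + "\n"
```

Because stale localized lines are dropped and regenerated on every run, the translated entries in the file are always derived from the PO data, which matches the point made in the talk that these lines are auto-generated rather than edited by hand.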
For example, I had the terrible idea of making a game about guessing countries a few years ago. Terrible reason number one: I have gotten infinite emails from people complaining that their country has too few pixels on the map. You don't know the number of Indians that complain that there's one pixel too many in Pakistan, and the other way around, that there's one pixel too many in India instead of Pakistan. Don't do things that are geography-based; bad idea. The second reason it was a problem is the typical question in my application, which is: what is the capital of some country? That works very well in English. You can substitute any country in there and it will work. But in French, the countries have gender, so you have to make the "of" agree with the name of the country. Actually, ki18n works very well for that. We have a way to script translations. You can say this country is feminine or masculine or whatever, and then you can give two translations for the original string. So you can say: if the word is form number one, use translation number one, and if it's form number two, use translation number two. So immediately you get all the combinations in a much easier way than having to translate 600 times what is the capital of Spain, what is the capital of France. Not cool. Anyhow, that's a bit like what I was describing before. We have the code, in C++ in this case. It needs to end up in a PO file. The translators will translate it, and eventually we will want to bring it back to the original place where the code is. So if I just download the code and run make install, everything works. I don't have to worry about where the translations are. Otherwise it's: I don't have any translations, but they told me there are 100 translations. No, not good. This process is done nightly, on a server. It takes a while. It doesn't take a lot of time. I checked. We have the logs here; we run it for the three branches.
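The idea of tagging each country with a grammatical form and picking the matching translation of the question can be sketched as follows. This is plain Python written for illustration, not ki18n's scripting syntax; the form labels, the table contents, and the function name `capital_question` are assumptions.

```python
from typing import Dict, Tuple

# One French template per grammatical form of the country name.
TEMPLATES: Dict[str, str] = {
    "feminine": "Quelle est la capitale de la {name} ?",
    "masculine": "Quelle est la capitale du {name} ?",
    "vowel": "Quelle est la capitale de l'{name} ?",
    "plural": "Quelle est la capitale des {name} ?",
}

# English source name -> (French name, grammatical form).
# The translator supplies the form once per country.
COUNTRIES: Dict[str, Tuple[str, str]] = {
    "France": ("France", "feminine"),
    "Japan": ("Japon", "masculine"),
    "Spain": ("Espagne", "vowel"),
    "United States": ("États-Unis", "plural"),
}


def capital_question(country: str) -> str:
    """Build the French question using the form recorded for the country."""
    name, form = COUNTRIES[country]
    return TEMPLATES[form].format(name=name)
```

The translator writes four templates and one form label per country instead of one full sentence per country, which is the saving described in the talk.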
It starts at 1 a.m. It takes one hour and a half for the development branch, then one hour for the stable branch, and the LTS branch is very small, so it only takes two minutes. But this is because we've spent a lot of time optimizing this thing. We made sure that we parallelize as much as we can. Still, we would like it not to take that long, because at the end of the day it's something like two hours of processing. This is the same for everything. Basically, that's what I showed you before: the desktop file will go to a PO, the PO will be translated, and we will put it back here. Now, what's the typical translator workflow? The typical translator workflow is not very modern, let's put it this way. We basically work with files. We use Subversion, which is very old. It's the thing I showed here. No, not here. Where is it? Here. This is a Subversion web browser, but you can do it from the command line. We use Lokalize or any PO or text editor. We don't care: PO is a standard format, so you can use whatever you want. So you basically use Subversion to get the file and to put it back, and Lokalize or any PO editor to translate the file. Then we have something called Pology, which, if you're a developer, is a kind of linter. It goes through the PO file and makes sure it's well-formed, and it knows some more things. If it realizes something looks like XML, it makes sure that the opening and closing tags in the translation make sense. So it not only makes sure that the PO file is syntactically correct, it also checks the translation itself. This works very well for our translators. Our current translators are very happy about it because they're used to it. But it is true that when someone new comes and says, "Hey, this translation you had is not translated anymore, I want to update it," they sometimes get a bit put off by it.
I have to tell them, yeah, you have to use this Subversion tool, which, if you're a developer and you know how to use Git and whatever, is very similar. But if you're a normal person, maybe you're not used to the command line. "You're telling me to use the command line now? I don't know. What's a server?" So having some kind of web interface would be nice. We explored using Weblate and Pontoon, I think a few years ago. And the problem, at least when we tried, is that if you use Weblate, you have to use Weblate. Weblate does all the managing itself, and things have to be done its way. Weblate owns the file. And we want to own the file, because we do all this magic where we translate things here and put them there. So we didn't find something that worked for us. But it's still on our minds. We want to fix this eventually. Now, future work. We want more teams to be active. That is the perennial problem of open source. We want more contributors. Everyone wants more contributors. Our web page needs a rewrite, and ideally we just throw it away and use something else. I'm going to show the web page to you now. That's our web page. It's HTML somebody wrote in 2000. It's using something like PHP 4 or 5, and it's a hassle to support, because you update the server and it doesn't work anymore. I mean, it works. It lists all the teams and the translations, but it's barely held together at this point. At some point, we tried to steal the GNOME thingy because it looked good, but we got stuck somewhere. We might still do that. As I said, there is the possibility of investigating a translation system, but without making the translators unhappy. Because one of the things everybody says is, "Yeah, do that. And if the existing translators get unhappy, they will get used to it and use the web and whatever."
Yeah, but we have, I don't know, hundreds of people using this workflow and they are happy with it, right? I don't want them to go away for a potential future new translator who might not appear. Why do that? And we have to rewrite the pile of random scripts that we use. I'm going to show them to you now. This thing that runs nightly is a set of scripts, and the problem with that set of scripts is that, oops, sorry, they were written over 26 years by different people, and we were not very well-mannered in creating scripts. So, if I can figure out where it is... here, right. This is scripty. Scripty is what we call it, and it has all our scripts to run that nightly thing. And you can see here, first thing, we have a C++ program. Don't write scripts in C++; that's not a good idea. Then we have some Bash, the next one is Perl, and around here you have some Python. I think we had some Ruby at some point, but I didn't like it and I managed to remove the Ruby one. So yeah, this needs to go away. We have to decide whether we use Perl or Bash or Python or whatever. Even if we want to do it all in C++, we should at least have a single language, because this is not manageable. There's too much crap in here. But the thing is, this has been working for such a long time. If you go to the history of this file, for example, it was created in 1999, and it has hardly changed since then. So yeah, it's a bit of work. That was almost it. Some interesting links: l10n.kde.org, which is our web page that we have to rewrite, but it's still there. The mailing list for coordinating the teams is the second one. We have a Matrix channel, if you want to go and chat; that would be good. If you want to see the pile of crappy scripts, it's the scripty one. If you're interested in the linter, that's Pology.
And if you just want to see the Subversion structure, it's in the l10n one. And with that, thank you a lot. I think I have a few minutes for questions. Yeah, cool. So, any questions? Thank you. Is there a question there? Yes. You are a large localization project. Do you have any advice for newcomers on which modules to start working on first? And do you have some levels? Because 300,000 strings is a lot. A single person will not finish that in less than three years, even working almost full time. So do you have some sort of levels? Let me repeat the question for the people watching online. He asked if we have any recommendation on which module to start with, and whether we have some threshold for shipping an application's translations or not. We do not officially have any recommendation on which modules to start with. It's not written down, but I will tell you when you start. What we recommend are some of the base libraries, because, for example, the File menu is the same File menu in all the applications. So start translating there, and at least the File menu is translated. That is one. And then we think the main parts of the desktop are the most important, like the menu and all that. About levels: I think it was mentioned earlier that some projects only ship things over 80 percent. We had that at some point, and then we decided we don't care. Anything is better than nothing. If you only know Hungarian and there are 10 words in Hungarian, you might manage to do things. If there are zero, you will not do anything. So nowadays we ship all the translations that exist. Hungarian is actually more translated than that; this is just an example. But if you choose a language that has only 10 things translated, it might actually encourage you to help. It's like, oh, there's something in it. So we don't have levels.
I don't know if it's a good idea or not, but that's our current status quo. Yes? Do you have plans to integrate things like LibreTranslate? Right. So the question was whether we have thought of integrating LibreTranslate. Honestly, I didn't know LibreTranslate existed until three hours ago. No, but we will probably think about it tomorrow. It's a good idea. For those who were not here at the first talk of the afternoon, LibreTranslate is like Google Translate but free. Basically, that's the definition, and it works pretty well. It's probably not good to ship it without any supervision, but at least as a way of bootstrapping languages, it might work. It's something worth exploring. Yeah. So is the library built on gettext, or do you use Qt's own system, or do you use both? We use both. We try to use our own, which is based on gettext. The problem is that we have tiering levels for our libraries, and we can't have one library depend on the other. So the base layer of libraries uses the Qt translation system, and everything above that uses the one based on gettext. So we mainly use the gettext one. Is that a library on top of gettext? It's a library that wraps gettext and adds the functionality I mentioned before of doing the combinatorial thingy. One more question over there. That's not a question, just a comment: you were asked about automated translation systems, and some teams in KDE are already using that. Okay. I'm not sure whether they use that one in particular, but, for example, the Bulgarian team has a person called Wincham who is doing exactly that, trying to automate the translation. Cool. I think we're out of time. Yeah. Okay. Thank you. Thank you. |
Translating documentation with cloud tools and scripts
Using cloud tools and scripts to translate, review and update documents |
Okay. So, good afternoon. I'm Nilo Menezes, and after all the presentations we had this afternoon, I would like to share some scripts, quick hacks that I needed to implement while translating a Nim document on the wiki about how Python developers can use Nim. I will explain it better later on. So: translating documentation with cloud tools and scripts. How it started: we had a document called Nim for Python Programmers that was written in English and Spanish by the same author. This document is simply a translation table from Python to Nim. If you already know how to write Python programs, you can read this document and it will explain, okay, this is a structure you have to write that way, and so on. But it was written by the same developer in English and Spanish in a wiki format, using Markdown, on the GitHub wiki. In the Brazilian Nim group, people started to say, hey, it would be useful for getting more users to the Nim language if we had this document, Nim for Python Programmers, also translated into Brazilian Portuguese. And I said, okay, I did some translations before, I think I can help with this. So I started checking how I could contribute. I went to the GitHub wiki and saw that the source was in Markdown format. I just checked it out with Git and started translating. Just after the table of contents, I said, this will not end very well. I will have lots of problems later on, because this document will be changed, it will be updated, and nobody will tell me that the original was updated. And in a document, you can also move sections around and do different things. So I said, this is not a good start. The idea became: how can I help translate, but also build an initial infrastructure to help translate this document into other languages? How were they doing it? They just cloned the English page and started overwriting the text in the native language.
If you are the same author, it's fine. He wrote the English version, and he also translated it to Spanish. With three or four languages, I think it's still workable, but if you start to have as many languages as we saw in the very big projects, you know that it's impossible to keep everything up to date. So I started to think about how I could keep this updated. I had previous experience working with translations in PO files, for LinCity, and also with translations of how-to documents in very old document formats, where you really have to copy and translate. The PO format started to look very interesting, because I said, if I manage to convert this Markdown to PO and create a process around it, then we'll be back to the standard translation process, where we can use Poedit, for example, to do the translation and move on. So I will report what I did and the tools I selected for doing this job. As I said, it looked very much like an old problem: how to translate software into another language? The combination of PO and gettext is very, very good. It's easy to use. Even if it's the first time you are translating software, tagging the text you want translated using gettext is relatively easy. But I wasn't working with source code, I was working with Markdown. And in Markdown you also have extra markers for text formatting and very specific items, and I didn't want to spend time creating a converter from Markdown to the PO file format. Of course, I started searching, and I found some tools. But not all tools support the same kind of Markdown as GitHub Markdown, and they also create PO files of different quality. So I spent some time tweaking. If you have never worked with a PO file, it looks like this. You have a msgid, which is usually the original string, and a msgstr, which is the translated one. And you create a new file for each language you are working on.
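For readers who have not seen the format, the short sketch below prints what two PO entries look like: one translated, one still empty. The helper name `po_entry` and the sample strings are invented for illustration, and a real PO file also carries a header entry, comments, and multi-line strings that this sketch skips.

```python
def po_entry(msgid: str, msgstr: str = "") -> str:
    """Render one minimal PO entry; an empty msgstr means 'not translated yet'."""

    def quote(text: str) -> str:
        # PO strings are C-like literals: escape backslashes and double quotes.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return f"msgid {quote(msgid)}\nmsgstr {quote(msgstr)}\n"


if __name__ == "__main__":
    # A translated entry followed by an untranslated one (sample strings only).
    print(po_entry("Table of contents", "Índice"))
    print(po_entry("Variables"))
```

The empty `msgstr` in the second entry is what later lets a script tell which strings still need work.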
So, usually English is the base language. And then you create a PO file for Portuguese, another for Spanish, and so on. It's pretty standard; I think we heard about PO most of the afternoon. So I think it was a good idea to keep this format. This was the process if we were translating standard source code. We have the source file, and we extract the PO file. Using the PO file, we use a translation tool or an editor to do the translation manually, string by string. After that, we compile the MO file, and the executables or the scripts can use the MO file and present the translated text to the end user. So I just needed to adapt this for Markdown. There is also a very special point regarding how the wiki is kept on GitHub that I will explain later. So, how to convert Markdown, specifically GitHub Markdown, to a PO file? I started to test multiple packages because, as I said, if you Google, you find many converters from Markdown to PO. I think I tested two or three. I didn't write down each one, but the final one is mdpo. It is a library, a series of Python scripts, and it is much easier for me to use packages in the same language, because I could just write a pyproject and put all these libraries in the same pyproject. The previous one was in JavaScript. And you know, if you want to run something in JavaScript, especially if you are not using Linux, you need to install a lot of other software. So this enables the conversion from Markdown to PO and vice versa, because this is the hardest part. Once you transform the Markdown file into a PO file, you do the translation, but the main objective is to get a translated Markdown file back, so you can use it on GitHub and present the page in your native language with all the formatting that the original author did before. Okay. And I also started to think, okay, maybe I can write another script that manipulates the PO file and helps me do the initial translation. Why?
We have a series of tools for automatic translation, but of course, this was something I had just started. I didn't have the integrations or anything like that, and I also didn't want to mess with the PO file format itself. So I found a Python library called polib that does exactly this. I can open a PO file. I can, for example, do some filtering, like, okay, give me just the strings that are not yet translated. So it's very, very easy to build and manipulate PO files with it. Okay. So this is the example program. You simply open the file and start the translation, and with the help of AWS Translate, which I was using because most of the time I work with AWS, I could easily send string by string to the cloud and get an initial translation that I would just review later on. Because you can never trust the automatic translation. Especially if you are working phrase by phrase, it's very easy to miss your target. But it's very good nowadays. I would say you can keep at least 80% of everything the automatic translation does as it is, but you still have to fix the other 20%, and most of the time that 20% is quite embarrassing. So you really need to review and double-check before you publish anything. And as it's a paid service, like many translation services (I didn't know about LibreTranslate, which our colleague presented this afternoon), you don't want to pay every time you do this. So I use the previous script to create a list of the strings that were never translated, the ones that are still empty. That way I know that if I run it multiple times, I will not be bugging AWS Translate and paying for the translation of strings that are already translated or reviewed. Okay. And as you can see, the script is very simple. You open the PO file and send the text to the cloud, in this case AWS Translate. You replace the string, you save it in the new PO file, and it's done. Okay. The script is little more than this.
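The core of that idea, translating only the strings that are still empty so that repeated runs cost nothing, can be sketched without any external dependency. This is not the speaker's script: the talk uses polib to read the PO file and AWS Translate to do the translation, while the sketch below swaps in a hypothetical `Entry` dataclass and a plain `translate` callable so that it runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Entry:
    """Stand-in for one PO entry (hypothetical; the talk uses polib for this)."""
    msgid: str
    msgstr: str = ""


def fill_untranslated(entries: Iterable[Entry],
                      translate: Callable[[str], str]) -> int:
    """Translate only entries whose msgstr is empty; return the number of calls made."""
    calls = 0
    for entry in entries:
        if entry.msgstr:
            # Already translated or reviewed: do not pay for it again.
            continue
        entry.msgstr = translate(entry.msgid)
        calls += 1
    return calls


if __name__ == "__main__":
    # A dictionary lookup stands in for the paid cloud translation call.
    fake_service = {"Variables": "Variáveis"}
    catalog = [Entry("Table of contents", "Índice"), Entry("Variables")]
    print(fill_untranslated(catalog, lambda s: fake_service.get(s, s)))
    print(fill_untranslated(catalog, lambda s: fake_service.get(s, s)))
```

Running the function a second time on the same catalog makes no calls, which is the property that keeps the bill down. The machine output still needs the human review described above.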
You have it complete on GitHub. And this is the main job. Another advantage of these tools is that you can create a list of words that should not be translated. This helps a lot, especially if you have a product or a document with a common word as its name that you don't want translated. So you can pass these special lists and create some exceptions. But, as was explained in the previous presentation, it's also a problem, because most of the time we select English as the source language, and you cannot create exclusion lists from programming-language keywords. In this document, Nim for Python Programmers, there is of course a lot of source code, and keywords like for and in were all translated to Portuguese. So when you use the automatic translation engine, you have to review and revert these translations, so that the translated program continues to be valid. And you have to pay very close attention, because of course it also translates variable names if you have source code in the document. So you need to check that the output is still coherent, okay? And since you need to do this manual work, you can use Poedit or any other tool that you are used to for working with PO files. Here we have the English version and there we have the Brazilian Portuguese version. I could just step item by item and review these translations until I was satisfied with the result, and then you can simply regenerate the Markdown file from the PO file, okay? So we start with the Markdown source and extract the initial PO file using mdpo. I run the script that sends the untranslated strings to AWS Translate, but you can use any provider you want. You review with Poedit, you do the opposite conversion from PO file to Markdown, and you publish the wiki, okay? So this is the workflow I tried to implement using my collection of scripts or hacks.
It is not really a tool, but it is meant to facilitate the translation of a single Markdown file, okay? So the document looks like this. This is the English version. No, this is the Portuguese one, okay? So in the end, I could publish this document on GitHub. It's not yet fully integrated with the GitHub wiki, because ideally I should add the GitHub wiki of this documentation as a submodule of my project. Then, when I update it, I also get the newest version of the Markdown, and if I do this kind of integration using GitHub, I'll also be able to publish the Markdown file using GitHub. For this initial version, I just went to the editor and pasted the Markdown file, but it's possible to do the integration. It's the next step that I still have to do. I published the scripts on this GitHub. It was made for Nim for Python Programmers, but it's useful for any Markdown file to which you may want to apply the same workflow. And the page is in beta, because I asked all the translators, all the people who can read Brazilian Portuguese, to check if everything is fine. The main reason to have a process is that you usually never do a translation a single time. A translation is something that you need to keep alive. As soon as the English version is extended or updated, you have to do the same thing in Portuguese. If you don't have an automatic process able to analyze the document and present just the subset of changes, you'll be obliged to review the full document, and this can be very, very cumbersome over time if the document has 15 or 20 pages. So it's not ideal. Another advantage is that the tool is smart enough to detect repetitions of the same string. So you don't have the boring work of re-translating the same text multiple times. This also saves a lot of time. Yes. So these are the main findings and the main problems I tried to solve. And that's it. If you have any questions, go ahead. Yes.
So, it's much easier, especially if your text has source code, because the challenge was the source code. Sometimes you have to keep the indentation and so on. You don't want to pass the indentation mess on to the translator. If you use a tool like this, it will extract just the parts with text, and it keeps the white space, which is very important in Python, so you can translate. But you still have to pay attention, because the automatic translation will translate everything to the target language. So you have to revert the keywords. But at least the generation is quite robust. When you have generated the output, the program is still a correct program in the end. So it's good. Okay. I think that was the last one. Okay. Thank you. |
Fuzzing Device Models in Rust: Common Pitfalls |
Hello everyone, my name is Andrea, and today I'm going to tell you about fuzzing in Rust. This presentation is not about fuzzing itself, but rather about how we've failed at it. So before I start with the pitfalls of fuzzing, I will tell you a bit about fuzzing itself. I hope some of you already know about it, because I don't have a lot of time. Fuzzing is basically an automated testing technique. The idea is to just send random input to a program to see how it behaves. How it works is that you typically use a tool, a fuzzer, that generates random input for you, then you call some functions with that random input, and the fuzzer reports some findings. If it finds any interesting input files, it writes them to a corpus. Findings in this case can be crashes, can be hangs, but can also be timeouts. When you first do fuzzing, you typically start from an empty corpus, but as you run fuzzing, you generate some interesting inputs, which is helpful because in the next runs you can just reuse those inputs and not start from scratch. This helps with finding interesting things faster. In rust-vmm, we implemented fuzzing for vm-virtio. We have three fuzz targets: one for the virtio queue, one for the serialization of the virtio queue, and one for virtio vsock. In the rust-vmm project we only have an implementation for the vsock packet, so that's what we fuzzed. During fuzzing, we discovered three crashes, and only one of them is triggerable by a potentially malicious driver. What we have now is that we are able to run fuzzing for every pull request that you submit to the rust-vmm vm-virtio repository. The fuzzing currently uses libFuzzer, and besides the fuzzing that is happening in rust-vmm itself, the folks from Cloud Hypervisor are also running fuzzing, and they also discovered a timeout problem. So this actually brings me to our first pitfall.
So, what is it... It should actually be a pitfall. The first pitfall is that you actually have to adjust the timeout to the code you are fuzzing. The default for the fuzzer that we were using is actually 20 minutes, and since we are just working with virtio queues, there's nothing that can possibly take 20 minutes to process. So we adjusted the timeout to 60 seconds in our case, and this is something that was recommended by the folks from Cloud Hypervisor. Now, how we're running fuzzing in rust-vmm is at the library level. The advantage of this is that it's easier to set up, and it's really important that it's easy to set up. Because you're running fuzzing at the library level, you don't need to have the kernel involved, so it's simple. It's also a good thing because you can run on almost any host: you just have to have the fuzzer installed and the repository, and then you just run fuzzing. And it also runs in user space. There are also disadvantages, of course. The first one is that you cannot cover the whole virtio setup, which means that some things are not going to be fuzzed. Then, because you are fuzzing in user space, you need to mock the driver, and this tends to be a bit complicated. And you can also find false positives. With false positives, the idea is that you will find crashes that otherwise could not be triggered by a driver, because maybe you have some other checks in place. I would say that it's still important to fix these ones as well, because you never know how you're going to change your code and how it might end up actually triggering those findings in the future.
And for the mocking of the driver, how it works (it is heavily simplified here) is that the driver writes something in memory, and then the device reads what the driver wrote in memory and does stuff with the data. What we want to fuzz in rust-vmm, and the piece we are implementing in rust-vmm, is the device side, so what we need to mock is the driver's side of the communication. In rust-vmm, we started this mocking of the driver from the beginning, because we needed it anyway to run some unit tests, and we needed it for other kinds of testing as well. So we had an initial mock interface from the beginning, and when we wanted to do fuzzing, we just evolved the mock driver in order to support that as well. Okay, so at a high level, how it happens right now in rust-vmm is that we parse the random bytes and we initialize the mock driver with the data that was generated by the fuzzer. We end up with some descriptors and some queue functions that have some random input that they need to process, and then we create the queue and call these functions with the random input. The second pitfall is that if you try to do fuzzing and you only start when the project is already mature, it's going to be a bit difficult; you might find it very complicated to retrofit it. I know that it's not necessarily viable to start fuzzing when you start the project, but what you can do instead is keep fuzzing in the back of your head, and then when you create some mock objects or some unit tests, you can think about how you can actually reuse them in fuzzing as well. Which is what we did, but not very well. One of the crashes that we found was actually the mock driver crashing on invalid input. So we had to adapt it to return errors. Even though it was just for tests, we couldn't just crash on invalid input anymore.
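The change described here, making the mock reject bad input with an error instead of crashing, can be illustrated with a small sketch. It is written in Python rather than Rust and is not the rust-vmm mock driver: `InvalidInput`, `parse_descriptors`, `fuzz_one`, and the queue size of 16 are hypothetical names and values. The 16-byte layout (address, length, flags, next) follows the split-virtqueue descriptor from the virtio specification.

```python
import struct


class InvalidInput(Exception):
    """Expected rejection of malformed fuzz input; this is not a finding."""


# Split-virtqueue descriptor: addr (u64), len (u32), flags (u16), next (u16).
DESC = struct.Struct("<QIHH")


def parse_descriptors(data: bytes, queue_size: int = 16) -> list:
    """Interpret raw fuzzer bytes as descriptors, rejecting anything malformed."""
    if len(data) % DESC.size != 0:
        raise InvalidInput("input is not a whole number of descriptors")
    descs = [DESC.unpack_from(data, off) for off in range(0, len(data), DESC.size)]
    if len(descs) > queue_size:
        raise InvalidInput("more descriptors than the queue can hold")
    for _addr, _length, _flags, nxt in descs:
        if nxt >= queue_size:
            raise InvalidInput("next index points outside the queue")
    return descs


def fuzz_one(data: bytes) -> None:
    """Hypothetical fuzz entry point: bad input is skipped, not treated as a crash."""
    try:
        descs = parse_descriptors(data)
    except InvalidInput:
        return
    # Here the descriptors would be handed to the code under test.
    _ = descs
```

In Rust the same idea would be expressed by returning a `Result` from the mock instead of panicking. Either way, only failures in the code under test should surface as findings.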
So the idea is to return errors at the level where you want to do fuzzing, so that they can be processed at higher levels and the fuzzer doesn't crash. And now for structure-aware fuzzing. How structure-aware fuzzing works is that the fuzzer generates some random bytes, and then you have to interpret these as the bytes that you need for your library. Structure-aware fuzzing is really nice because there are tools that basically interpret the random bytes as the structure that you actually need. It's super nice: it significantly reduces the code that you need to write, and in Rust the tool for this is arbitrary. Now, we had to change it, unfortunately. Before, we had only 270 lines of code, and now we have around 740 lines of code for the fuzzer. Unfortunately, it came with some problems, and that's why we had to change it. The most important one is that it's not really reproducible. You can't really use the corpus that you had in previous runs, which was a big problem for us, because basically what happens is that arbitrary introduces some randomness into the input. And that basically means that you cannot use the corpus from previous runs. The pitfall here is that we wanted to make incremental improvements to the fuzzer, and we didn't check that what we wanted to add incrementally could actually be implemented in the end. So instead, a better approach would be to make sure that we can actually reuse the corpus that we generated. Okay, and now about when fuzzing actually fails. We had a PR in rust-vmm. At this point we were already running fuzzing for pull requests, and there was a PR that was actually introducing an overflow. Here the overflow is that adding the packet header size to the packet length can overflow, because the packet length is set by the driver.
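To make the overflow concrete, here is a small sketch in Python that emulates fixed-width unsigned arithmetic. The 32-bit width, the 44-byte header size, and the function names are assumptions for illustration; the talk does not show the actual rust-vmm code.

```python
U32_MAX = 0xFFFF_FFFF
HEADER_SIZE = 44  # assumed header size, for illustration only


def wrapping_add_u32(a: int, b: int) -> int:
    """What an unchecked 32-bit addition does: silently wrap around."""
    return (a + b) & U32_MAX


def checked_add_u32(a: int, b: int):
    """Return the sum, or None if it does not fit in 32 bits."""
    total = a + b
    return total if total <= U32_MAX else None


def packet_fits(pkt_len: int, buf_len: int) -> bool:
    """Bounds check that treats an overflowing length as invalid."""
    total = checked_add_u32(HEADER_SIZE, pkt_len)
    return total is not None and total <= buf_len


if __name__ == "__main__":
    hostile_len = U32_MAX - 10  # a length chosen by a malicious driver
    # The wrapped sum is tiny, so a naive "sum <= buffer size" check would pass.
    print(wrapping_add_u32(HEADER_SIZE, hostile_len))
    # The checked version rejects the same packet.
    print(packet_fits(hostile_len, 4096))
```

Because the length comes from the driver, the sum has to be validated before it is compared against any buffer size. In Rust, the checked variant corresponds to `checked_add`.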
This bug, well, I actually found it during code review, which was a bit unexpected, because I was hoping that the fuzzer was going to find it, and that was not the case. After a deep dive, I realized that running fuzzing for just 15 minutes might not actually be enough, because this bug was only triggered after about 40 minutes of fuzzing. How we fixed that is that we added a fuzzing session that is optional and that fuzzes for 24 hours. This one is to be started manually by the rust-vmm maintainers, and it should only be started when there are pull requests that actually impact the fuzzed implementation. This is because we consume a lot of resources when doing fuzzing, and also you don't want to block all the pull requests for 24 hours. Typically, like the last one on the slide, a pull request is a change that needs to go in quickly, so blocking it for one day might not be reasonable for all pull requests. So the pitfall here was not fuzzing for long enough, and instead we had to find a way to not block pull requests but at the same time to provide a way to fuzz for longer. Fourth: coverage for Rust. In Rust you can actually get coverage information by using the LLVM tooling. In Rust you only get line coverage. Basically, this was the starting point of the presentation. I was thinking I would come here and show you how great it is to run fuzzing for 15 minutes, and then for more minutes, and then show the coverage and all these really extravagant things. And so we ended up, after fuzzing for 15 minutes, with a coverage of around 82%. So it's like, well, it's okay, that's good. So then, let's just run with a minimal corpus as well. This is a corpus that we generated from unit tests. Let's just feed it to the fuzzer and see how this changes. There was no change, actually. So I was like, okay, it's not bad, not bad. Let's just run for two weeks. So what do you think happens now?
So actually, it's working. At this point I was like, well, I have to change my presentation, because it's not what I expected. But instead I learned something, right? You can't actually use coverage to decide when to stop fuzzing. Instead, you can use coverage information to see which parts of your code are not actually covered. And yeah, well, that's about it, actually. Here is the summary of the pitfalls that we ran into. And I think now we have time for questions. Did you look at how the fuzzer works and then what areas were not covered and try to figure out why it wasn't found in those areas? Yeah, so the question was if we looked at how the fuzzer works and what areas were not covered. Yes, we did, and I have a slide for that. Thanks for the question. Okay, so actually, I have two slides for that. There were some functions that we were not calling on purpose. On the virtio queue, for example, we have some functions that are just iterating over the descriptor chain and then doing something with the data. And at the virtio queue level, you can't do something with the data. So it's like, okay, this needs to be fuzzed at a higher level, like at the device implementation level. So it's like, okay, we're not going to call these functions, which is a bit hilarious because that's where Cloud Hypervisor actually found the timeout problem, which we were not able to reproduce with the virtio queue, but still. And we actually did this one function that shouldn't be called during fuzzing. And then I reran the fuzzing and, yeah, it's a bit better, but it's still not great. And then I looked into what, well, actually, you can't see very well. That's unfortunate. Yeah, so I looked into what actually is not covered, and you're not seeing it there, so you have to trust me. These are actually errors. So the printing of errors to files.
So since in the fuzzing we're not actually initializing a logger, these things cannot be triggered by fuzzing. So there's a lot of error printing to a file that's not happening during fuzzing. Yeah. What's subsequently taken to actually make sure it covers everywhere which needs to be covered? And so discovering certain areas which clearly aren't covered. I didn't understand the question. Which areas? Well, what's subsequently taken to make sure the areas which weren't covered in the fuzzing are going to be covered in the future? Oh, okay, so the question was what measures we are taking in order to make sure that all that was covered before is going to be covered in the next iterations? Yeah. None? So right now we're not doing anything. This whole coverage thing is just something that I needed for the presentation and it's not automated in any way. But this is actually a good point for future investment, to make sure that we're covering code, because what would help as well is making sure that new functions that we are adding to the code are also covered. So it's a great point to make. You were talking about structure-aware fuzzing, and you mentioned that you cannot reuse the corpus. Can you explain a bit more about that? Okay, so the question was about structure-aware fuzzing and why we cannot reuse the corpus. Let me see if I actually have a slide here. No. Okay, so the idea is that what we were using, which is arbitrary, when it was taking the input from the fuzzer, was also adding some randomness to it. So because it was random, basically, every time it was writing the corpus to the file, it was introducing some randomness to it. So when the same input got read again, it would not have been the same. So where does the randomness come from? Where does the randomness come from? This is just how arbitrary decided to implement it.
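The reproducibility property at stake in this answer can be sketched as follows. This is a minimal illustration in Python rather than the fuzzer's actual code: the `Descriptor` class and `decode` function are hypothetical names, and the fixed 16-byte little-endian layout (address, length, flags, next) is chosen only for the example. The point is that the decoder is a pure function of the input bytes, so a corpus file saved in one run rebuilds exactly the same structure in the next.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptor:
    addr: int
    length: int
    flags: int
    next_index: int


# Fixed little-endian layout: u64 addr, u32 length, u16 flags, u16 next (16 bytes).
_DESC = struct.Struct("<QIHH")


def decode(data: bytes) -> List[Descriptor]:
    """Interpret fuzzer bytes as a list of descriptors, ignoring a trailing partial chunk.

    No randomness and no hidden state: the same bytes always give the same result.
    """
    out = []
    for offset in range(0, len(data) - _DESC.size + 1, _DESC.size):
        out.append(Descriptor(*_DESC.unpack_from(data, offset)))
    return out


corpus_entry = bytes(range(32)) + b"\xff"  # two full descriptors plus one stray byte
first_run = decode(corpus_entry)
second_run = decode(corpus_entry)
```

Because `first_run` and `second_run` are equal, a crash found with `corpus_entry` can be replayed later, which is the property the speaker says was lost.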
There's actually an issue open in arbitrary, so they are aware of the problem. It doesn't seem like they are fixing it, for some reason. So what we ended up doing is some custom serialization with bincode, which is also a very well-known Rust package. It's not much more difficult than it is with arbitrary. When you discover a bug with this fuzzer, does it get turned into a unit test? The question is, when we discover a bug, does it... Yeah, the way that we are fixing this kind of problem is that we are always adding a regression test for them, just to make sure that they don't... I was wondering about the computation requirements. So how many cores are you using? How many cores we are using? So when we ran for two weeks, we actually used 96 cores. In uniqueness, I do not... So when you are running on coding, because I don't know exactly how many... One? Maybe? I don't know. But I think we have been running on 96 cores as well. That was another one. Can I see that guy? One minute. We found a corner case that we are trying to shrink the crazy smaller color steps. Oh, this is... I don't know. Let's chat afterwards. Thanks. |
Is OpenStack still needed in 2023? |
Thank you for coming. First, a quick introduction: my name is Thierry Carrez. I've been involved in a software and software project for the last three years, and today I'm working on a software and software project. I'm the general manager for the Open Infrastructure Foundation, which is the foundation that hosts OpenStack and other projects. I'm also the vice-chair of the Open Source Initiative, and today I wanted to talk to you about OpenStack and its relevance today, because a lot of people are saying that OpenStack is dead. I mean, it's a fair question, right? It's been around for 13 years, nobody talks about it much anymore, and it's less and less mentioned, so I mean, is it dead? It could be, maybe, you know, nobody is using it anymore and nobody cares. So let's look at the data. OpenStack, according to our latest user survey that was run in 2022, runs on a footprint of more than 40 million CPU cores. So that's a massive footprint. There's a lot of usage of OpenStack and a lot more people getting comfortable mentioning it. It actually grew from 25 million to 40 million CPU cores in our user survey reports between 2021 and 2022. It's happening in all verticals, so you can see the traditional OpenStack strongholds. It was originally formed by a public cloud company and a research institution. So on the public cloud side, it's still very strong with Rackspace being involved, but also OVHcloud or Infomaniak or Deutsche Telekom or Leroy. All of those companies are running OpenStack-based public clouds. It's also strong in the research area, where it started at NASA, but now it's used at CERN, it's used at Harvard, it's used at MIT, it's used at the European Centre for Medium-Range Weather Forecasts, which right now has hold up, and basically that's one of the weather forecasts here.
So beyond all of those research institutions and public clouds, it's also strong in the telecom industry. It's well known that OpenStack is used there. Nine out of the top ten telecom companies are actually using OpenStack as their back-end for handling everyday calls. It's also strong in retail and e-commerce, with the largest company in the world, Walmart, still using it. Financial services, with banks like Société Générale or financial services like PayPal or China UnionPay. It's strong in energy, transportation, government, manufacturing, web, entertainment with one-on-one brands, Comcast, Salesforce, Adobe, Bloomberg, all the games, game servers: it's a lot of usage. It's also that those users are actually increasing their usage. We used to give stickers to companies that would reach 100,000 CPU cores, and now we have to create a new sticker because a bunch of them are actually at more than one million CPU cores, including China Mobile and China Unicom, which are the two mobile phone companies in China. So that's providing basically all Chinese citizens with their service. So it definitely has usage. So why are people saying it's dead? Maybe, you know, it's still working, but it's dead inside, and nobody's really developing it anymore, and it has no development activity, so it's more like a zombie or something. It's not dead, but it's almost dead. And if you look at development activity, you have to compare with very well-known, very active projects, like Kubernetes for example. Kubernetes is a massive, open collaboration, and I'm sure everyone in this room has heard of Kubernetes. It's one of the most active open source projects in the world. It has more than 33,000 pull requests, I mean, reviewed by a human and merged into the code base, over the last year. It's more like a hundred changes per day, which is really massive.
And if we try to see how OpenStack compares with that, OpenStack in 2022 actually had 29,000 changes merged, same thing, reviewed by a human and then merged into the code. And so it's definitely the same ballpark. It's definitely the same level of activity. Definitely one of the three most active open source projects out there in terms of development activity, along with the Linux kernel and Chromium. So it's not dead from the development perspective either. So why are people saying it's dead? I used to think it's because they just didn't like us. They just wanted us to feel bad when we get up in the morning. But then I started thinking, like, maybe there's something else. Maybe there is a deeper reason why people think OpenStack is dead. So why are people thinking OpenStack is dead? To answer that question, we need to take a step back and look at OpenStack's history. So let's rewind back to 2010. It might sound like yesterday, as for all those people, young people, it will be very, very near. So it might sound like yesterday, but from an open source perspective, it's actually a long time ago. Back in 2010, open source and the Open Source Initiative had been there for 12 years. And 12 years have passed since OpenStack was created. If you look at popular projects, Firefox was eight years old when OpenStack was created. Ubuntu was six years old. Git was five years old. Android was only two and a half years old. So it's really the middle ages of open source. And open source was very successful back then. But we didn't have that many open source options for infrastructure. The infrastructure market was cornered by proprietary software. It was VMware on the private side and Amazon Web Services on the public side, with no open source option there. And that's why OpenStack was created. We created an open source solution for providing infrastructure to power clouds of all sizes, public clouds and private clouds.
It was the first solution to try to be a bit universal. And so, the first six years of OpenStack. It was famously created from, like I said, Nova, which is the compute, the VM-as-a-service component that came out of NASA, so with research private cloud requirements baked into it, and Swift, which is an object storage component that came from Rackspace, with all the public cloud provider requirements. So we had the requirements from both areas from the start. And then people got very excited. It was called the Linux of the data center. It was called the technology to end all technologies. So you ended up with startups everywhere, trying to make money, going to the gold mine, and trying to say, oh, you can get involved with OpenStack. And that's when we had more than 100,000 changes per year. So like three times more active than Kubernetes is today. It's crazy to think about it. Everyone wanted to be a part of it. Everyone wanted their use case. Everyone wanted their product to be integrated with it. And so that created a lot of scope creep, because we basically extended to solve all of those demands. It's really hard to say no when you are an openly developed open source project. People want to give you code, and why would you reject it? So in the end, that resulted in a lot of scope creep, especially in areas where OpenStack was not as strong, like orchestration, for example. And that's when we had this question, like, who is the OpenStack user? Is it the people that actually choose to deploy, install, and operate it? Is it the people that consume the cloud APIs? Is it the end user? Is it the person that uses the consumer? And that tension between the two was really difficult to solve from an audience perspective. How do you present it? That's when Kubernetes appeared. Kubernetes appeared mid-2014 and moved to open development around 2015. And it really brought us a welcome clarification.
To understand how it helped us, we need to take a step back and look at how you provide applications. Like 20 years ago, you would just procure some physical hardware. And as an application developer, you would install the operating system, your dependencies, and your application on top of that. But that was a bit inconvenient. So we added layers. The first layer that was added is hardware virtualization: basically, abstracting the server your application is running on from the physical hardware underneath it. And then we added cloud APIs, basically allowing you to programmatically access those virtualized resources. And then another layer, which is application deployment APIs, basically what Kubernetes does, which is providing higher-level primitives to deploy complex applications on top of that programmable infrastructure. So you have this programmable infrastructure on one side, and cloud native, cloud-aware applications being deployed on top of that. It's actually two different types of populations: people that provide infrastructure on one end, and people that write applications and deploy them at the top. And Kubernetes really helped us by providing that layer between those two very different things. And so, yes, the advent of Kubernetes really provided this welcome clarification about this interface layer between infrastructure providers and infrastructure consumers. And if you are an infrastructure consumer, a developer, a deployer, infrastructure is a given. It's someone else's job. You don't have to care about it. And so, they no longer talk to OpenStack directly. OpenStack is irrelevant to them. OpenStack is invisible to them. So OpenStack is dead to them. It's not dead. It just grew up. It found its purpose in life. And it found its user. So, in the end, its users are the people that are providing infrastructure. They only had proprietary solutions before, and now they can rely on open source solutions to do that.
It's a role that is very separate from the traditional developers. It's a new class of actors. And it's about recognizing that it's a new group of people. That's why the OpenStack Foundation transitioned to becoming the Open Infrastructure Foundation. Because there is this group of people that want to provide infrastructure using open source solutions. They need more than just OpenStack. They need all the help they can get. So, that's why we formed the OpenInfra Foundation, and we have more than OpenStack now. We also have Kata Containers, or Zuul, or StarlingX, or other projects. All about providing open source solutions for infrastructure providers. Back to OpenStack. If OpenStack is not dead, what makes it relevant for the next years? Why would you care? Why shouldn't we all just use the hyperscalers' public clouds and not care about infrastructure? I mean, hyperscalers are not going away. Amazon will be there tomorrow. The initial goal of OpenStack, to end all the Amazons, is not going to happen, and it's fine. But we think it's important that there are open source solutions for providing infrastructure, that there is an option there. The reason for that is that there is this combination of solutions that you can use. Basically, Linux at the operating system layer, OpenStack at the cloud APIs layer, combined with Kubernetes at the application deployment layer. And that forms an end-to-end solution for providing infrastructure using purely open source software. That's what we call the LOKI stack. So, Linux, OpenStack, Kubernetes Infrastructure. An end-to-end solution to provide basic infrastructure to run cloud native orchestration on, for public and for private cloud. It's a very popular combination to use all of them together. But why would you use it? Why would you go through the hassle of running your own? There are three main reasons to do that: cost, compliance and capabilities.
It's a framework that's been there for a while. We first talked about it in 2017. But it's more relevant than ever today. So, for example, if you look at cost, there was this study by Andreessen Horowitz last year about the cost of cloud. And those are venture capitalists. They are not financial business. They looked at the cost of operating all those startups on top of Amazon Web Services and Microsoft. And they found that when those companies would repatriate their workloads locally, they would save half to two thirds of the cost. So, they would get from 50 to 66 percent cost reduction by repatriating their workloads to a private cloud. And the reason for that is that public infrastructure is great to handle the elasticity, the growth, all of those things. But once you have a stable workload, it's an exceptionally costly way of getting infrastructure. And so, the ideal model is a hybrid model where you use private infrastructure for the base and public infrastructure for the spikes. Compliance. I'll try to go fast. So, digital sovereignty is a big topic. There is a full Sovereign Cloud devroom this afternoon, so you should go there if you are interested. It drives a lot of OpenStack adoption, especially in South-East Asia. Not in the U.S.; they don't really care that much. But for OpenStack, it's clearly where the growth is. All the research institutions need to have their own thing. All the EU countries want to have solutions that are local. And that drives a lot of adoption today. Most of that 25 to 40 million jump is probably linked to digital sovereignty. And finally, there is the capabilities side. You would think that cloud is a pretty defined space by now. It's been around for a while. But there are new use cases. And those new use cases are enabled by the fact that you have a solution to experiment and play with.
Like, for example, we have this company called OneQode that is running game servers on a small island in the middle of the Pacific Ocean that sits on top of Internet cables. Because what they want to provide is equal latency to players in the U.S., in China, in Japan, and in Australia, which is quite a corner case, I guess. They, like, host e-sports tournaments. They could have waited for Amazon Web Services to set up a data center there, but it probably would never have happened. So clearly, that's one use case that could not be served without an open source option out there. Closer to here, Exaion is a French subsidiary of EDF. For French people in the room, that's where most of our electricity is coming from. And so they had those supercomputers that are used for nuclear plant simulation and that are regularly decommissioned, because clearly they need more power to simulate what's happening in there. And what they used to do is they would just scrap them. And what that guy at Exaion decided to do is to actually repurpose them into high-density, hyper-converged clouds that basically use the resources that are within those supercomputers. And they are running HPC-as-a-service-like workloads, with a low environmental impact, since they get the servers for free from a climate impact perspective. They're also tracking exactly the energy mix of your workloads based on when you run them, so that they track the environmental impact of your workloads as well. And finally, Leafcloud is a public cloud based in Amsterdam where they actually distribute the compute servers all over the city. And those are used to heat the water for those buildings. Swimming pools, apartment complexes, all of them being served by those servers. So clearly, again, a corner case, because Amsterdam has those dark fiber links between those buildings that actually make that possible.
But you end up with a data center that's really the most energy-efficient in the world just by doing that. And so those use cases, those specific things, this kind of research and innovation is enabled by open infrastructure. So I think it's simple. In conclusion, OpenStack is not dead. It has a massive user footprint. It's growing year over year. It's still one of the most active open source projects in the world. But it might be dead to you. It's someone else's job to provide infrastructure. It's someone else's job to care. And it's fine. It will not take over Amazon Web Services. Andrea still has a job. It will not replace every technology going forward. It will not be the technology that ends all technology. But it is a necessary component in the infrastructure to provide investment. It's a tool for enabling hybrid usage. It's a tool for use cases that are not served by hyperscalers. It's a tool for workloads which can't be served out of U.S.-based servers due to digital sovereignty. So for cost, for compliance, for capabilities reasons, OpenStack is here to stay for the next decade. Thank you. Thank you. Obviously, six years ago I went to a conference that was about using OpenStack in scientific and in large data analysis. The thing I came away with from that conference was that OpenStack was really complicated to set up, and your only hope of getting something working was to hire someone to do it for you. So is that, I mean, is that in part because of the scope? I think that now that you've focused, is it kind of easier to set up, or can you comment on that? So the question is, OpenStack has this reputation of being very hard to set up, and this message that you have to hire somebody who will do it for you: is it still the case, and what's the minimum cost? So I would say, I mean, it's still a complex job to run infrastructure, for various reasons, mostly because of the 24/7 service constraints, one of those things.
But I would say that running OpenStack has gotten a lot easier. One of the big concerns was not necessarily the scope, because you can just deploy a few services, so it wasn't really affecting that. And today, yes, there is less scope. We've focused on the main pieces. It was really more the upgrade cycle. If you wanted to keep up to date with the releases that were produced, it created a lot of tension, because every six months you would have to upgrade your infrastructure if you wanted to do that. And so the work was done to really make those upgrades a lot more streamlined. There are a lot fewer breaking changes happening, because, I would say, new feature development is still going on, but the core is more in maintenance mode. Enabling new hardware functions, all of those: it's more in the driver space than in the core space. So you see a lot less disruption when you do the upgrade. It's also that the upgrades are much more tested. We have solutions, distributions that can be used, that handle that, that are pretty nice, not necessarily sold by one of the players in the OpenStack space, that you can rely on to actually do the updates. So it's no longer that difficult, but it's still a job in itself. It's still like something that you have to talk about. It's an interesting thing to think about. I'm always amazed when, so the use of guys are running this gigantic OpenStack to power game servers and they are doing that to the best team of the team. It's not curable not to buy yourself a new crash, but it's still doable with a relatively small team. And I think the big guys, they have a very large team. Well, it's like, it's not completely out of reach. The smaller it is, the fewer constraints there are from the users.
Do you think the fact that you've got 29,000 pull requests over quite a number of projects, where, you know, you've got, for example, Nova, Keystone, but you've also got some smaller-scale projects, versus, sorry, Kubernetes, which has like one key component that everyone contributes to, contributes to this perception of, you know, things don't move as fast as Kubernetes? So the question is whether the fact that OpenStack is more of a collection of services, versus Kubernetes, which is much more of a monolith, could play into it being seen as less active. I think there are a lot of reasons. We are not using GitHub. We are using our own open source tools based on Gerrit and Zuul. And so, there is a bit of a visibility gap there, because we are not appearing in those, like, reports, general reports that people are using. There is this perception that all of this is happening in GitHub and everything else is not really existing. So all of those things contribute to it. The fact that it's a monolith versus a set of projects, I don't think that matters that much, because if you look at the amount of development that's happening on those fringe services, it's not that much. It's sort of the core of the developments we have in, you know, the I1A. And so, it's, I would say it's not, yeah. Thank you. I'll take questions outside. |
Using SPDK with the Xen hypervisor |
The next session will be by Damien about SPDK with Xen, so enjoy the session. My name is Damien. I'm looking at practice and doing a presentation as the French university comes to guard. So Vates, a software company based in Grenoble, is the main company behind this project, and we want to do better with what we have because of some problems that we encountered. My focus is on the storage virtualization part of the hypervisor, and we currently have a very pragmatic approach to the storage of the platform. You can see on the left that we have bare-metal NVMe storage, and when we add the virtualization stack on top of Xen, we have 300% of the power of the NVMe inside. So what we want to do is to keep the security of the storage given by the hypervisor, given the stack style. We want to have as little impact as possible on the user. So we want to minimize the impact on users: we don't want them to have to change things too much in the virtual machine or in their setup. So first I want to introduce the problem with Xen. Xen is a type 1 hypervisor, so it is the software that runs first on the hardware. And to be able to use the hardware for storage and network, we have to rely on something special, a virtual machine running Linux, that is started second. Xen initializes the CPU, then runs this virtual machine, which is then given control of the storage and the network hardware. So this VM is very important, because it has the responsibility of sharing network and storage with the other VMs. So what I'm proposing is... I'm focusing on NVMe. NVMe is a recent storage protocol. It's been around for a few years. It's everywhere now. And it is giving us a lot better performance. So I have to introduce some concepts.
So we have dom0, which is a virtual machine, but it can share the storage with other virtual machines. And to do that, we have what we call a split driver. So we have a specialized protocol, blkif, for this case. It is used to transmit block storage requests from the VM to our back end running in dom0. Most of the time the front end is, for example, a driver in Linux that presents itself as a block device in the virtual machine. It has been in place for a long time, so it's obscuring the lay-downs, so it's not... So it's been running also since 2015. And the back end, currently in XCP-ng, is tapdisk, that is, a user space program that takes the blkif requests and submits them as I/O to the dom0 block layer. In Xen, we have a special interface to share memory between VMs, which is used in the blkif protocol to pass block requests from the virtual machines to our dom0. This interface is mediated by the hypervisor, so Xen stays in place. The virtual machine, which we usually call the domU, basically tells the hypervisor which of its memory dom0 will be granted access to. And dom0 needs to ask the hypervisor to map this memory into its own memory to be able to access it. So I have replaced tapdisk in our implementation, because we want to use SPDK in place of tapdisk, and we want to directly take the blkif requests and transmit them to the NVMe. So to tell you more about SPDK: SPDK is the Storage Performance Development Kit. It was originally created by Intel, but it's used by your device maintenance. It is essentially a driver for NVMe devices, and it runs in user space, in our case in dom0, and it's used on Linux, and on bare metal also. So it's part of the same project as the 50k, but it's a great feature. Intercary on two? Yeah, that's a problem to provide.
Okay, awesome, thank you. So here we have the current state. We have two block layers that need to be traversed for one request. It's one of the costs that adds to the difference we have with bare metal. So our proposal is to use SPDK to directly transmit blkif requests from a virtual machine to a storage device, and to reduce the cost by bypassing the kernel in dom0. It's completely transparent for the virtual machine because we reuse a lot of the infrastructure already present. So a blkif request is a simple structure in shared memory, in a ring; it's not very different from virtio in this aspect. And we have multiple request types, so read, write, discard and so on. I'm going to focus on read and write because it's the minimum we need to support to be able to take requests. And we just have to transfer them to the SPDK interface. SPDK needs to use special memory to be able to transmit data to the device. So we need to use the SPDK memory allocator to have a buffer that will be used to go to and from the device. So we need to have the data copied from the virtual machine to our dom0. Then we can transfer it to the disk. So it's pretty simple. We allocate memory. We copy the data using the grant table interface into our memory. Then we just write it on the disk. SPDK will call a callback that we've given it when it is finished, and then we can do the same for the read requests. So it's working great for now. As you can see, the read requests, the first and the second columns, are not very great; that is because the implementation is not finished. In this case, I'm doing more grant calls to the hypervisor than for the write requests. So it's a big cost of our implementation. But for now, it's done and we'll look into improving it. But we are doing better than that. This is on the right. The blue column is tapdisk. And our implementation is the red on the left.
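The write path described above (allocate a backend buffer, copy the guest data into it, submit the write, respond from the completion callback) can be sketched as a small simulation. This is a hedged illustration in Python, not SPDK or Xen code: `FakeDisk`, `handle_write`, and the 512-byte sector size are hypothetical stand-ins, the bytearray stands in for memory from SPDK's allocator, and the plain copy stands in for the grant copy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

SECTOR = 512  # assumed sector size for the example


@dataclass
class FakeDisk:
    """Stand-in for the NVMe device; completes every write immediately."""
    sectors: Dict[int, bytes] = field(default_factory=dict)

    def write(self, sector: int, buf: bytearray, done: Callable[[bool], None]) -> None:
        for i in range(0, len(buf), SECTOR):
            self.sectors[sector + i // SECTOR] = bytes(buf[i:i + SECTOR])
        done(True)  # completion callback, as the device driver would invoke it


def handle_write(guest_pages: List[bytes], sector: int, disk: FakeDisk,
                 respond: Callable[[bool], None]) -> None:
    # 1. Allocate a backend-owned buffer (stands in for DMA-capable memory).
    buf = bytearray(sum(len(page) for page in guest_pages))
    # 2. Copy the guest data into it (stands in for the grant copy).
    offset = 0
    for page in guest_pages:
        buf[offset:offset + len(page)] = page
        offset += len(page)
    # 3. Submit the write; the response to the guest is sent from the callback.
    disk.write(sector, buf, respond)


disk = FakeDisk()
responses: List[bool] = []
handle_write([b"a" * SECTOR, b"b" * SECTOR], 8, disk, responses.append)
```

A read would follow the same steps in reverse order: read into the backend buffer first, then copy into the guest pages from the completion callback, which matches the speaker's remark that reads currently cost more grant calls.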
So same for block size and throughput. We are able to improve the performance of our storage stack in a manner that is transparent for the VM, because we can take normal VMs that boot today on tapdisk in the current infrastructure and still make them work. The drawback is that we have to dedicate the NVMe to the SPDK platform. But NVMe is pretty much everywhere nowadays, especially in data centers. We are still able to keep the security of the grant table, because the VM only shares the data that it wants written on the disk with the back end in SPDK, and Xen is still the mediator for this. What we want to do is, of course, to have the read requests do better than tapdisk. Since we are in some cases, for example here, at the same level without this optimization, I'm not very worried about that. But we want to be able to not have to copy data from the VM into dom0 before having it handled by the NVMe, and instead use the guest memory directly as source and destination for the DMA requests from the NVMe drive. And we want to take a look at the grant table interface to see if it can be improved for modern-day computing. So I'm finished. Thank you so much. Question? Yes. I know from operating an OpenStack cloud that DPDK is very hard to install and implement. How hard is it to implement SPDK? And a second question, very quickly: can your work somehow be applied to KVM virtualization too? So the question is if SPDK is hard to use like DPDK and if it can be used in the KVM infrastructure. So it has already been used in the KVM infrastructure as a storage back end for virtio guests. It's just that our case is special because of the different architecture between KVM and Xen. So it's already done by the SPDK community in this way. I would say that SPDK is not very hard to install. In our case, it would be shipped with the hypervisor and the install of XCP-ng, but it's not very hard to install.
We just have to have a special configuration for our Dom0 because SPDK relies on 2 MB huge pages to be able to do DMA requests. And so we have to have this support, and it's not available in the basic configuration of Dom0. Yes? Does your implementation survive a crash of the SPDK process? Well, yes. The question is whether the virtual machine survives a crash of the SPDK process. So the virtual machine would be able to survive SPDK not being available. We would lose the disk in the virtual machine. Well, disk access would hang in the virtual machine, but the virtual machine would still be able to run with that problem. Any other questions? Thank you. Thank you again. |
OKD Virtualization: what’s new, what’s next
New features on OKD Virtualization 4.11 and 4.12 and next challenges |
Welcome to the virtualization room. Simone will talk about OKD Virtualization. So, hi all. Nice to see you here. Today we are going to talk about OKD Virtualization. We are going to have just a quick intro for those who don't really know what OKD Virtualization is, just to get a bit of context. Then we are going to see how you can use CRC to play with OKD Virtualization at home with a really small footprint, if you want to try it or if you want to start developing on OKD Virtualization. We are going to see a couple of new features. I chose these because they are cloud native; I think they are a bit different from what you are used to seeing in related kinds of products. And then we are going to see the next challenges for us. So, let's start from OKD. What is OKD? OKD is a distribution of Kubernetes. It's a sibling distribution of OpenShift Container Platform, which is the Red Hat distribution of Kubernetes. OKD is the community, upstream release of it. It's based on machines that can be bare metal or virtual machines on a hyperscaler. In our case, it would be better to use bare metal nodes, just because we are talking about a virtualization solution. Then we have hosts there that nowadays are based on Fedora CoreOS, but we are going to see that this is going to change. Then we have all the Kubernetes stack, and you can use it to start your applications. Now, what is KubeVirt? KubeVirt is a set of virtualization APIs for Kubernetes. So you can extend Kubernetes in order to be able to run virtual machines that run in containers on your Kubernetes infrastructure. In the end, it's still using the KVM hypervisor. It's able to schedule and manage virtual machines as if they were native Kubernetes objects. What is its main advantage? It's that it's cloud native. It means that you can use all the Kubernetes stack.
So the Container Network Interface for the network, and the Container Storage Interface that you are already using for Kubernetes for the storage. It's based on custom resource definitions and custom resources, which are a way to extend Kubernetes with a new API. It can schedule virtual machines as native Kubernetes objects, and you can have them talk with what you already developed as microservices. So in an ideal world, you are going to rewrite your application from scratch, completely split into microservices. In the real world, you probably have a bit of legacy code or something that is already running in a virtual machine. Where are you supposed to schedule it: on an external hypervisor, or on the same infrastructure that you are using for your microservices, with the capability to have them talk natively to your virtual machines? KubeVirt is the answer to this challenge. Now, how can you test it at home? You can easily try it with CRC. CRC is a really quick way to start playing with, debugging, and hacking on OpenShift in general. CRC is a micro distribution of OpenShift that runs in a virtual machine that can be executed on your laptop. It's absolutely not intended for production usage. It's going to be executed in a virtual machine, so it's a single-node cluster. It's not going to scale out. It's not going to support upgrades. It's just a test platform. Here are just a few instructions if you want to try it at home. Since we are talking about a virtualization product and you are running it in CRC, which is a virtual machine as well, you need to enable nested virtualization on your laptop in order to be able to start virtual machines inside the CRC one. Then you can tune the configuration. Normally, CRC comes with a really small configuration. If I'm not wrong, it's about 9 gigs of RAM, which is not that small, but it's just enough to run OKD by itself.
If you are thinking about playing with a couple of virtual machines and so on, it's better to extend the memory up to at least 20 gigs in order to be able to do something realistic. It's also nice that CRC already comes with the KubeVirt hostpath provisioner, which is a way to dynamically provision PVs, persistent volumes, for your virtual machines. As you can imagine, a pod is something ephemeral, while your virtual machine needs a persistent volume. You need a way to provide persistent volumes for your virtual machines. CRC is just a virtual machine where you can run other virtual machines inside, but you still need a mechanism to provide persistent volumes for them. It's already integrated in CRC, but you have to extend its disk in order to have a bit of space to create disks. In the end, you just have to execute a couple of commands, crc setup and crc start. After a few minutes, you are going to gain access to your environment. Of course, you can also do everything from the command line. Probably many of you are going to prefer using the command line here. I attached screenshots to the presentation just because they are nicer. So you can connect to the user interface, to the admin user interface, where you have the OperatorHub page. In the OperatorHub page, you are going to find it already there, because it's distributed via the OperatorHub mechanism. You are going to find the KubeVirt HyperConverged Cluster Operator. As I mentioned, you don't need to configure the storage, so you are only supposed to install the operator and create a CR to trigger the operator; the storage is already pre-configured. After a while, you will be asked to create the first CR for the operator in order to configure it. Here you can fine-tune the configuration of OKD Virtualization for your specific cluster. In particular, we have a stanza called feature gates where you can enable optional features. Here we are going to talk about two features.
One of them is already enabled by default, which is enableCommonBootImageImport, and the other is deployTektonTaskResources. This one is not enabled by default, but if you want to do at home what we are going to see now, you have to enable it. You can also enable it on day two. After a few minutes, the operator is installed. It's also going to extend the UI with a new tab where you can see what you can do with your virtual machines. In the last year we introduced a lot of features, but today I want to talk about two of them. The first one is golden images. Why do I think that it's interesting? Nowadays, if you think of any public cloud environment on public hyperscalers, it's really easy to use them. Why? Because you can find images for your preferred operating systems already available. You just have to select one of them, click, and in a matter of minutes you are going to get a virtual machine that is running. You don't need to upload your disk, you don't need to upload an ISO, start defining your virtual machine, and so on. They are really convenient. We want to have the same experience in KubeVirt too. So we introduced this feature. The whole idea is that we are going to have a container registry which contains images with the disk image for your virtual machines, which are going to be periodically refreshed to include new features of the operating systems or security fixes. Then we have this new object called the DataImportCron, which says that you want to periodically pull an image from that container registry on a schedule and import it in your cluster. There is a mechanism to configure the garbage collection and the retention policy, but in the end the idea is that you are going to find images for popular operating systems already available in your cluster out of the box.
They are going to be refreshed over time, so each time you... Let's see it. This is the catalog to create virtual machines in the user interface of OKD Virtualization. We have a catalog with objects. The whole feature is here. As you can see, for popular operating systems we already have a nice label saying that the source is available. It means that this new feature automatically imported a golden image of that operating system for you, and it's going to continuously keep it up to date. The benefit is that when you want to start a virtual machine, you will be able to do it with a single click. You can customize the name, you can say in which namespace it's going to be executed, but everything is already ready. With one click you are going to start your virtual machine. What is going to happen on the storage side? We see that we have some existing persistent volume claims for the disks that got automatically imported. One of them is going to be cloned to be the disk of your virtual machine. Depending on the CSI implementation, this can even be completely offloaded to the CSI provider, and it can be really fast. After a few minutes, your virtual machine is there, and as you can see, through cloud-init or Sysprep or whatever, it can also be customized to use a custom name and so on. Our DataImportCron in particular looks like this: we are saying that we want to have a data source named Fedora, with a schedule in the usual cron syntax. We want to periodically consume images that are available on Quay, which is a container registry. Here you can see the status, and the image is up to date, meaning that the freshest version of Fedora got automatically imported in your cluster. The important thing is that if you look here, you see that the target tag for the Fedora image is latest. It means that when the next release of Fedora comes out, it's going to be automatically available in your cluster.
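As a rough illustration of the DataImportCron object described above, here is a minimal sketch that builds such a manifest as a Python dictionary and prints it as JSON. The field names follow the CDI DataImportCron API as best recalled and should be checked against the CDI documentation; the registry URL, the schedule, the storage size, and the helper name `data_import_cron` are assumptions made for illustration, not values taken from the slides.

```python
import json


def data_import_cron(name, registry_url, schedule, imports_to_keep=3):
    """Build a DataImportCron-like manifest (hypothetical helper, illustrative values)."""
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",
        "kind": "DataImportCron",
        "metadata": {"name": f"{name}-image-cron"},
        "spec": {
            "schedule": schedule,              # usual cron syntax
            "managedDataSource": name,         # data source kept up to date
            "garbageCollect": "Outdated",      # retention policy for old imports
            "importsToKeep": imports_to_keep,
            "template": {
                "spec": {
                    # Pull the disk image from a container registry.
                    "source": {"registry": {"url": registry_url}},
                    "storage": {"resources": {"requests": {"storage": "30Gi"}}},
                }
            },
        },
    }


cron = data_import_cron(
    name="fedora",
    registry_url="docker://quay.io/containerdisks/fedora:latest",  # assumed URL
    schedule="0 */12 * * *",                                       # assumed schedule
)
print(json.dumps(cron, indent=2))
```

Tracking the `latest` tag is what makes a new Fedora release show up in the cluster without any change to the object.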
Of course, we are providing images for Fedora and for CentOS, but you can use the same mechanism and the same infrastructure to import your own images in your cluster. You can create custom DataImportCrons. Now, I want to talk about an additional really nice feature, which is KubeVirt Tekton Tasks pipelines. In the previous section, we saw that we are able to import images for popular operating systems, but maybe there is some other operating system that requires you to create a virtual machine starting from an ISO and install it, so how can we automate that? We cannot expect that the providers of all the operating systems in the world are going to use this mechanism and publish images for us. We need a way to automate the creation of the images for other operating systems. In this case, we are going to use a Tekton pipeline to automate this. What is Tekton? Tekton, also known as OpenShift Pipelines, is a cloud-native continuous integration and continuous delivery solution. It's fully based on Kubernetes resources. It uses what are called Tekton building blocks to automate tasks, abstracting the underlying infrastructure. In the Tekton world, we have tasks. A task is something that defines a set of build steps, like compiling code, running tests, or building and deploying images. In our case, we are interested in deploying images, but as you can imagine, you can combine it with other tasks. Then you can define a pipeline. A pipeline is a set of orchestrated tasks, and then you can use a pipeline resource, which is a set of inputs that are going to be injected into the execution of your pipeline, which is a pipeline run. In the KubeVirt Tekton Tasks Operator, we introduced some specific tasks to create, update, and manage the specific KubeVirt resources: virtual machines, data volumes, data sources, templates, and so on. You are able to populate disk images, even with libguestfs to inject files and so on.
You are able to execute scripts, bash or PowerShell or whatever. We have a set of tasks; I don't want to go through them, but some are already available, and we are extending the set. We have an operator that is going to populate the tasks for you in your cluster. Now we want to see an example pipeline. We have two pipelines that are going to be injected by the Tekton Tasks Operator. The first one is called the Windows 10 installer. It's going to populate a golden image for Windows 10, according to some input that you provide. The idea is that it's going to copy a template, modify the template, start installing Windows from the ISO, and create a virtual machine for you. We can see a small demo. Here is the pipeline. We have to provide a few inputs. In particular, we have to provide the ISO of Windows that we want to install. It's the first... Yes, there, perfect. Here we see the pipeline. The pipeline is going to copy the template. It's going to modify it. It's going to create a first VM that is used to install Windows, and then create the Windows image from that VM. Now we are simply going to see what is happening, but everything is fully automated. You don't really have to watch it. But if you like, you can also see it live. Here is our virtual machine, and as you can see, it's starting, it's booting, and it's going to install Windows. We also have a second pipeline. I have a demo for that. It's called Windows customize. Probably we are a bit over time. The idea of the second pipeline is that you can customize this image by running additional steps, like installing the software that you need, modifying the image, so that it becomes one of the golden images that you provide in your cluster. Let's move back. This is the second one. I'm going to skip the demo, but if you have any questions, please reach me.
The idea is that we can use this pipeline, in this case, to install SQL Server and so on. What's next? OKD is going to change. In the beginning, I told you that OKD is now based on Fedora CoreOS. We are going to have a big change there, which is called OKD CentOS Streams, which means that the nodes of OKD are going to be based on CentOS Stream. So it's going to be a real upstream for OpenShift Container Platform, where the nodes are based on Red Hat CoreOS. CentOS Stream is the upstream of Red Hat CoreOS. Everything on CentOS Stream is going to be built using Tekton pipelines as well, just because we really believe in that project. On the OKD Virtualization side, we are going to introduce many features. We are going to try more pipelines, more automation. We are working on ARM support. We are working on better backup and restore APIs. We are working to reduce the privileges of the pods that actually execute the virtual machines. We are working on the real-time area. And here are a few links if you want to reach us. Thank you. APPLAUSE Questions? Sure. There's already OpenStack and Kata Containers. And I wonder how OKD compares, integrates, or overlaps with these two projects? So the question is: we already have other projects like OpenStack and Kata Containers, and how does KubeVirt compare to them? So the first idea is that in KubeVirt, we are managing virtual machines as first-class citizens on Kubernetes. You can define a virtual machine with a custom resource, because Kubernetes provides a mechanism called custom resource definitions to provide custom new APIs. You can use them to define a virtual machine as a really native object that is going to be scheduled by the Kubernetes orchestrator alongside pods and other resources. The main benefit is that you can expose... You can use the same storage that you are using for your pods.
You can have your pods talking at the network level with virtual machines, without the need to configure tunnels and so on, as you would if virtual machines were running in a lower layer of the stack, like if you have OpenStack under Kubernetes. The virtual machine is going to be a first-class citizen of this infrastructure. So it integrates with OpenStack? So not really integrated. If we go here... Okay. Here the idea is that we have something here, which in our case is probably going to be bare metal nodes, but it can also be a hyperscaler, but in that case you need nested virtualization, which is not always the best idea. Then you have Linux host nodes, now with Fedora CoreOS, but in the future with CentOS Stream, and here you have the Kubernetes stack. Kubernetes is going to schedule pods as containers on those nodes, and the virtual machines there. So you don't really care what you have on the last level. Yeah, you showed how to install Windows and prepare images with pipelines. Isn't it easier to use a simple Dockerfile, if it's possible, and do that by just building standard Docker images? Okay. So the question is: you showed how to use a pipeline in order to prepare an image for Windows. Isn't it simpler to directly use a Dockerfile to create a container? So in theory it is, but you have to start from an already running virtual machine and take the disk, because Microsoft is providing an ISO with a tool that you have to execute. You can use the same virt and libguestfs tools to install Windows inside a Docker build. But you have to execute it. You have to execute the installer binary. So in the end, you are manually running something that is going to install Windows. And in the end, you need to take a snapshot, which is going to be your image. You want to automate it. You want to continuously execute it in order to fetch updates. How did we solve this? We automated it with a pipeline, because you have a set of tasks.
And so the pipeline is the smartest way to execute and monitor them. Sure? Last one. Which format is used for the golden image disks? Which format is used for the operating system disks? Do we support raw format? No? No. So just a little. Thank you. Time is up. But if you want, please reach me outside. Thank you. |
Stateless decoder virtualization using VirtIO Video and Rust
How this will be used on ChromeOS and more. |
So, hey everyone, Daniel Almeida here from Collabora, and today we're going to be talking a little bit about stateless decoder virtualization using virtio-video and Rust, and mainly about what's the status of virtio-video in general. There's been this huge hiatus, and different companies now have different downstream patches going on, but recently there's been this new push to get everything upstream. There's been new conversation taking place on the mailing lists, so I think this is a good time to actually do a recap on the virtio-video device, and also to showcase how we're using Rust in our own ChromeOS implementation, as Collabora has been working closely with the ChromeOS engineers to make this happen. So without further ado, I think we can get started. And the first question I think everyone should ask themselves is, why are we doing this, and why is this important? The reason is basically two-fold. The first thing is that video data is this massive share of internet traffic. There was some data collected by Cisco that predicted that by 2022, 82% of all consumer internet traffic would be video-related data, up from 77% in 2018, so video data is this huge share of traffic. So this is one thing. The other thing is this new use case for Chromebooks, wherein you can purchase a Chromebook, so a laptop, an ultrabook of sorts, and then you can run Android applications on it, thanks to this ARCVM virtualization layer that will transparently virtualize the Android apps so that you can use them on your Chromebook in a somewhat transparent way. This means that a user can use things like Netflix, YouTube, and games from the apps in the Play Store, and this makes the device much more useful in general.
The idea is that these Chromebook devices are usually capable of hardware-accelerated video decoding, and if they can do that in a hardware-accelerated way, it's a good idea to expose this capability to Android apps as well, so that Android apps can benefit from the hardware in the machine. With that said, before we can explore this a tad more, we should talk a little bit about V4L2 memory-to-memory devices. I have this figure here, which I've taken from Hans Verkuil, the V4L2 maintainer, thanks Hans, and it just shows a codec device in the middle of two queues. On the left side, you can see what we call the output queue, and on the right side, you can see what we call the capture queue. The idea is that these two queues contain buffers, and a user space app will be continuously queuing and dequeuing buffers from these two queues. The actual type of codec device I want to talk about is a decoder; we could talk about encoders, but let's just focus on decoders for this presentation.
The idea is that for a decoder device, user space applications are going to be queuing bitstream buffers on the output queue, so buffers containing data compressed using some codec, so VP9, H.264, HEVC, and this data will eventually be processed by the device, and then the device will place the decoded data in the buffers in the capture queue, and then eventually the user space app will be able to dequeue these buffers containing the decoded data. So this is a loop that takes place while the device is decoding. There's also this finite state machine that drives the device. One of the questions one may ask is why this finite state machine is in place, and the reason is that this model, where we have two queues and the codec device in the middle, is not sufficient to express some scenarios. For instance, if you're playing video and you want to seek to another position, something that happens very often: you're watching a given portion of a video, then you want to watch a different portion for whatever reason. That's called a seek, and the previous model with only the two queues is not capable of expressing that. So the idea is that you'll have a number of different states, and you can transition between these states by issuing ioctls against the video device node that you've presumably previously opened, so think /dev/video0 or something along these lines. You will open this video node, then issue ioctls against it to transition between states, and eventually you're going to be in the decoding state, where a decoding loop is going to take place, and then you'll be queuing and dequeuing buffers for the codec to process.
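The two-queue model and the small state machine on top of it can be sketched as a toy simulation. This is only an illustration under stated assumptions: the state names, the `ToyDecoder` class, and its methods are hypothetical simplifications, no real V4L2 ioctl is issued, and "decoding" is just a string transformation.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    INITIALIZATION = auto()
    DECODING = auto()
    SEEK = auto()


class ToyDecoder:
    """Hypothetical model of a memory-to-memory decoder with two queues."""

    def __init__(self):
        self.state = State.INITIALIZATION
        self.output = deque()   # OUTPUT queue: compressed bitstream buffers
        self.capture = deque()  # CAPTURE queue: decoded frames

    def stream_on(self):
        # Stands in for the ioctl sequence that leads to the decoding state.
        self.state = State.DECODING

    def queue_bitstream(self, chunk):
        if self.state is not State.DECODING:
            raise RuntimeError("cannot queue buffers outside the decoding state")
        self.output.append(chunk)

    def run(self):
        # The device consumes bitstream buffers and produces decoded frames.
        while self.output:
            self.capture.append(f"frame({self.output.popleft()})")

    def dequeue_frame(self):
        return self.capture.popleft() if self.capture else None

    def seek(self):
        # The queues alone cannot express "forget what I queued so far";
        # a state transition does: pending bitstream is dropped.
        self.state = State.SEEK
        self.output.clear()
        self.state = State.DECODING


dec = ToyDecoder()
dec.stream_on()
dec.queue_bitstream("a")
dec.run()
dec.queue_bitstream("b")
dec.seek()                  # "b" was queued but not decoded yet, so it is dropped
dec.queue_bitstream("c")    # data from the new position
dec.run()
print(dec.dequeue_frame(), dec.dequeue_frame(), dec.dequeue_frame())
```

In this sketch, frames that were already decoded before the seek stay in the capture queue; a real application would decide what to do with them.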
With that said, I want to talk a little bit more about the differences between a stateful and a stateless architecture. The main difference basically boils down to who's keeping track of the decoding state. When you're decoding video, there's some state that somebody has to keep track of, usually the set of decoded picture buffers, among other things. In a stateful architecture, the driver or the firmware is the piece that keeps track of that, whereas in a stateless architecture, the guest user space is the actual program that keeps track of the decoding state. So in a stateless architecture, the device is just this clean slate that you program with some metadata that you extract from the bitstream, and it'll just process that one frame, whereas in a stateful architecture, the device acts as a black box, where you just send it data, and eventually it's going to give you back decoded frames. So it's a different approach to video decoding in general.
With that said, we can talk a little bit about virtio-video. Virtio-video was initially developed by Google and OpenSynergy, and this is the latest submission upstream. There's a kernel driver submission for that; it refers to virtio-video version 3, and Google has downstream patches to use that driver within ChromeOS, so using their own implementation basically. In virtio-video, we basically have two virtqueues. A virtqueue is a queue where you can communicate by exchanging memory; this is a virtio concept. The idea is that we have two queues in virtio-video. One is the command queue, where we'll be pushing data from the driver to the device, so from the guest to the host device for processing; we're going to be pushing data, pushing commands. And then we have the event queue, where we have the opposite communication taking place, so that the host can inform the guest of things like dynamic resolution changes, or errors, or something along these lines. So the reason we were speaking about stateful and stateless implementations in V4L2 memory-to-memory devices is that the virtio-video kernel driver exposes itself as a V4L2 stateful device. And why V4L2 stateful? Well, first of all, it's a mature interface. It covers cases where the underlying decoder IP is not within a GPU. There are approaches out there that are trying just to virtualize VA-API, or something along these lines, but we really wanted to cater as well for the case where the decoder IP is not within a GPU, because we have devices where this is precisely the case. Also, a black box approach is really useful, because we just want to send it data and have it do its decoding in the background, without the guest application being aware that there's this entire virtualization layer going on in the background. This driver is also heavily based on virtio-gpu, which is also upstream in the Linux kernel.
So the idea for the kernel driver is really simple: it translates from V4L2 ioctls to virtio-video commands. The guest user space app will be, as we said previously, issuing V4L2 ioctls against this video device node, so that it can change states to eventually end up in the decoding state, and also so that, while it's in the decoding state, it can be queuing and dequeuing buffers in that decoding loop. So whenever the application issues ioctls against the video node, the kernel driver will translate them into virtio-video commands, and then place these commands in the command queue for further processing by the host device. By doing this translation, it ends up implementing the V4L2 stateful finite state machine, so that a guest user space app doesn't really have to know that there's any virtualization taking place; it just submits ioctls, submits data in the buffers, and eventually it's able to dequeue buffers with the decoded data in them. So here's a small example just to drive home what I'm trying to say here. We have one ioctl being issued by the guest user space in this figure, and in this particular case it's VIDIOC_CREATE_BUFS or VIDIOC_REQBUFS, which is just a way for a user space app to say to V4L2 that it wants to allocate buffers. The virtio-video kernel driver, which is again a V4L2 stateful device, will intercept that call, translate it into a virtio-video command, and place that command in the command queue for processing by the host, and then the host will be talking to this question mark box somehow to process this virtio-video resource create command into something useful.
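The translation step described above can be sketched as a lookup table plus a command queue. This is a toy model, not the real kernel driver: the `ToyVirtioVideoDriver` class is hypothetical, the command names are modeled after the virtio-video draft and should be treated as assumptions, and only the buffer-allocation mapping comes from the talk; the other two entries are illustrative guesses.

```python
from collections import deque

# ioctl name -> virtio-video command name.
# Only the REQBUFS/CREATE_BUFS -> RESOURCE_CREATE pairing is from the talk;
# the remaining entries are illustrative assumptions.
IOCTL_TO_COMMAND = {
    "VIDIOC_REQBUFS": "VIRTIO_VIDEO_CMD_RESOURCE_CREATE",
    "VIDIOC_CREATE_BUFS": "VIRTIO_VIDEO_CMD_RESOURCE_CREATE",
    "VIDIOC_QBUF": "VIRTIO_VIDEO_CMD_RESOURCE_QUEUE",
    "VIDIOC_STREAMOFF": "VIRTIO_VIDEO_CMD_QUEUE_CLEAR",
}


class ToyVirtioVideoDriver:
    """Hypothetical guest driver: turns ioctls into commands for the host."""

    def __init__(self):
        self.command_queue = deque()  # stands in for the command virtqueue

    def ioctl(self, name, **args):
        try:
            command = IOCTL_TO_COMMAND[name]
        except KeyError:
            raise ValueError(f"unsupported ioctl: {name}") from None
        # Place the translated command where the host device will pick it up.
        self.command_queue.append({"cmd": command, "args": args})


driver = ToyVirtioVideoDriver()
driver.ioctl("VIDIOC_REQBUFS", count=4)
driver.ioctl("VIDIOC_QBUF", index=0)
for entry in driver.command_queue:
    print(entry["cmd"], entry["args"])
```

The guest application only ever sees the ioctl side of this table, which is why it does not need to know that virtualization is involved.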
So here's the architecture thus far. In the guest, we have a guest user space app issuing ioctls against the virtio-video kernel driver, and the virtio-video kernel driver translating these ioctls into virtio-video specific commands. For our use case here, we have crosvm, which is Google's virtual machine manager, taking these commands from the command queue, dequeuing them, and then processing them using this question mark shaded box in the host. Eventually this shaded box will somehow decode the video data and pipe the frames back to crosvm, and then crosvm will push the frames back through the virtqueues to the virtio-video kernel driver, and then the virtio-video kernel driver can make the frames available to the guest user space application, which can be GStreamer, or FFmpeg, or other apps. So now we have to talk a little bit about what that shaded question mark box is, and these are the crosvm decoder backends. So what is crosvm in the first place? Well, crosvm is this virtual machine manager that's shipped with ChromeOS, and it's the cornerstone of the ChromeOS virtualization layer. So, for instance, when you're running Android apps in the background, it's crosvm that's going to be providing the virtualization for it, and it has this huge focus on security. So it's written in Rust, and it's focused on virtio devices. The main idea here is that crosvm, as a virtual machine manager, really has no idea how to decode video. This is a very different thing from what it was built to do, so it has to interface with something to get that video data decoded. And that something, which I've denoted with that shaded question mark box, is what we call a backend for crosvm. So we have three different backends nowadays for crosvm, the first of which is libvda. The idea with libvda is pretty simple.
libvda is just this library that lets you interface with the Chromium GPU process to actually decode video data. Most of us here know that Chromium is this very mature project with a very mature video decoding stack. So the idea is simple: just use Chromium, ask Chromium to decode data, bam, there you go. But this has a major issue, which is that we have a virtual machine manager written in Rust, with a focus on security and memory safety and everything, linking against a web browser, which is a very different kind of software, and which is also, by the way, not written in Rust. So this is a problem; this is something that the ChromeOS engineers wanted to do away with, which is why we have cros-codecs, which is our own crate written in Rust. We also have an FFmpeg backend using the FFmpeg software decoders. It's used only for testing, so that you can test the virtio-video implementation in crosvm without necessarily owning a Chromebook device; you can test it with a regular laptop if you're using the FFmpeg backend, and also if you're using cros-codecs. So the idea of the FFmpeg backend is just to use it for testing, and we're not integrating it with the hardware acceleration in FFmpeg because, again, FFmpeg is this huge project written in C, which brings us to the topic of cros-codecs, which is our solution. cros-codecs is basically a crate, a library written in Rust to do video decoding in safe Rust, with all the guarantees that the Rust language provides to us, so memory safety and so on and so forth. It's not published on crates.io yet because it's heavily work in progress, and it contains all the pieces that are necessary in order to do video decoding.
So mainly parsers, which is where we extract the metadata to drive the decoder; the decoder logic, which is the piece that keeps track of the state we've talked about previously, so things like the set of reference frames and any other kind of information that you have to keep track of between frames; and it also itself contains backends, as we will see shortly. Currently we have a VA-API backend, so cros-codecs will itself use the VA-API driver in the system to get video decoded, and we're also working on another backend, which is the V4L2 stateless backend. So here is a more complete picture, I think. Everything in this picture is just that shaded question mark box from earlier. Here you can see we have crosvm. For now, crosvm will be using cros-codecs to decode video; cros-codecs will be using the cros-codecs VA-API backend; it'll be talking to the VA-API implementation in the system, which will be talking to the VA-API driver, so the Intel media driver or Mesa depending on what graphics card you're using; that's going to be talking to DRM in the host kernel. Up until now, nobody really knows how to decode video data, but once DRM starts talking with the GPU, the GPU knows how to decode video, because it has an IP in there, circuitry that is specialized in video decoding. So the GPU will do the video decoding and eventually produce the raw decoded data, and then the data will be pushed all the way back until it gets to crosvm. When it gets to crosvm, we'll push the data back to the virtio-video kernel driver in the guest, and then the virtio-video kernel driver in the guest can make the decoded data available to the guest user space application.
So here's some backlog. We still have to upstream the virtio-video protocol. As I said, there's been a new push to get everything upstream; Google is collaborating again with OpenSynergy so that we can get virtio-video upstream, because it's not upstream yet. We plan on adding more codec support, because thus far we only have VP8, VP9 and H.264 supported, and most people want to see HEVC and AV1, which are the state of the art for video codecs. Then there's encoder support for cros-codecs in particular: while virtio-video itself has encoder support, Google's implementation, which involves cros-codecs, does not; the cros-codecs crate does not have encoder support yet. You can encode using libvda, which again is the path that uses the Chromium GPU process to do video encoding, and this is already used in production, so it can be used, but there's no proper support in cros-codecs yet. And we're also working on a V4L2 stateless backend in cros-codecs so that we can support more devices. So just a quick summary: Google is already using virtio-video in production through libvda. We've been working with ChromiumOS engineers so that the libvda dependency can be removed, because Google really wants this to be using safe Rust code to do the video decoding. Working together with Google and with other industry players, we plan on upstreaming the virtio-video protocol. For Google in particular this improves the experience for Chromebook users, but this is only one application for virtio-video in general; other companies can benefit from the virtio-video work that's been done here and use virtio-video for their own projects and their own use cases. So that was it, that was basically what I had to say about virtio-video. I hope that was informative and yeah, thank you very much! |
blkhash - fast disk image checksums |
Thank you. So welcome everyone. We are going to talk about fast disk image checksums. So who am I? I'm a long-time contributor to the oVirt project, and I worked for Red Hat for more than nine years on oVirt storage. I focused on incremental backup, image transfer, and NBD tools. This project continues the work on fast disk checksums that is available in oVirt. So we're going to talk about why we need disk image checksums, and why we cannot use the standard tools for disk images. We'll introduce the blksum command, which is optimized for disk images, and the blkhash library, which is used by the blksum command and which you can also use in your own program. A good example of this usage is the new checksum command in qemu-img, which is a work in progress using the blkhash library. Then we'll see a live demo and play with the tools. And finally, the most important stuff: how you can contribute to this project. So what is the issue with disk images? Why are they different from regular files? Let's say we have this qcow2 image, usually compressed because that's the best format to publish images, and we want to upload it to oVirt, or maybe to OKD. Maybe Simon wants to upload it to OKD. What we have to do is get the data from the disk image and upload it to some server in the virtualization system, and this server will write it to some disk. Now the disk can be many things. It can be just an NFS server, and there we'll have a file similar to the file we uploaded, but not the same: it can be a raw sparse image, and it will not be compressed, while we have a compressed qcow2 image. Or, if it's oVirt, it can be a small block device, just large enough to fit the guest data that we uploaded, on some iSCSI server. Or it can be a Ceph image stored on many nodes in a cluster, on many disks.
So we have very different shapes on the two sides: a disk image on one side, something completely different on the other side, a different format, a different way of storing it. One thing must be the same. The bits must be the same. The guest data must be the same. So if we start the guest with the disk image, or with the virtual disk, we must get the same bits. We can verify this operation by creating a checksum of the guest data inside the disk image and of the guest data inside the virtual disk, whatever the format and shape, and the checksums must be the same. Downloading an image is mostly the same problem. We have some shape of disk on one side, some shape of disk on the other side, different formats, but the guest data must be the same, and the checksum must be the same. Another interesting problem is incremental backup. In this case, the backup system will copy only the changed blocks on each backup, if it wants to be efficient. So let's say two days ago we did a full backup and stored it in this full qcow2 file. This is just one way that we can store the backups; it can be many other things. Yesterday we did another backup, and we stored it in another file, which sits on top of the full qcow2 file; the full file is the backing file of this file. So we created a chain, and today we did another backup. We copied more data from the virtual disk and created another layer. Also in this case, the guest data inside the virtual disk must be the same as the guest data inside this entire chain. So if we know how to read the guest data from the entire chain, like a guest does, we can create a checksum, compare it to the checksum of the virtual disk at the time of the backup, and know if our backup is correct. Then, if we restore this chain to another virtual disk, we will get the same data. So what is the issue with the standard tools? Can we use sha256sum to create a checksum of this chain? First we have the issue of image format.
Standard tools do not understand image formats. If we have a raw image, everything is fine. But if we have an identical qcow2 image (here I compare the images with qemu-img compare, which accesses the guest data and compares it bit by bit, so the images are identical), we get a different checksum from sha256sum. Then image compression: even with the same format, where both images are qcow2 but one of them is compressed, we will get a different checksum, because the host clusters are compressed, and sha256sum is looking at the host data, not at the guest data. Even if we have the same image format without compression, everything is the same, right? I just converted one image to the other image, and the images are identical, but we get a different checksum. Why? I used the -W flag, which allows out-of-order writes. So the order of the clusters on the host can be different, while the guest data is the same. Finally, the issue of sparseness. Standard tools do not understand sparseness. Here we have a six-gigabyte image with only a little more than one gigabyte of data. But sha256sum must read the entire image, so it will read this one gigabyte of data, compute a hash from it, and then read almost five gigabytes of zeros, because anything which is not allocated reads as zeros. So it must do a lot of work, which is pretty slow. For example, take a bigger image: here I created a 100-gigabyte image with no data in it. It's completely empty, just a big hole, and computing the checksum took more than three minutes. So do we really want to use this for a one-terabyte image? It's not the best tool for this job. So let's introduce the blksum command, which is optimized for this case. First, it looks just like the standard tools. If you know how to use the standard tools, you know how to use it. Just run it and you get the checksum. It understands the image format, so if you use identical images, you will get the same checksum, although the image files are different.
The size is different. They do not look the same on the host. Of course, it supports compressed qcow2, because it reads the qcow2 image, decompresses the data, and gets the actual guest data, so we get the same checksum. It also supports snapshots. Here I created a snapshot, this snapshot qcow2 file, on top of the Fedora 35 image. The Fedora 35 image is the backing file of the snapshot. If I compute a checksum of the snapshot, I actually compute a checksum of the guest data inside the entire chain, not of the tiny snapshot file, which has no data yet. We also support NBD URLs. For example, qemu-nbd is an NBD server. Here I started it, exposing this qcow2 file using this NBD URL. If you compute a checksum with the URL, we access qemu-nbd, get the guest data, and compute a checksum. Actually, this is the way we always access images. Under the hood, we always run qemu-nbd and use an NBD URL internally to access images. This is the reason it works. We also support reading from a pipe, like the standard tools, but in this case we cannot support any format, just raw. And this is less efficient, because we must read the entire image; in the other cases, we can read only the data. But it works. Now, it's not enough to create a tool that gets correct checksums. We want it to be much faster than a standard tool, because we are dealing with huge images, which are usually mostly empty. Usually, when you start to use an image, it's completely empty. Then you install an operating system, add some files. Everything starts really empty, and then grows. So here we tested this 6-gigabyte image with about 1 gigabyte of data, and in this case blksum was about 16 times faster. Another example: can we compute a checksum for an 8-terabyte image? Is it practical? It is. It took only 2.5 or 2.6 seconds. With sha256sum, it's not practical to actually do it, so I tested a 100-gigabyte image. It took about 200 seconds.
So the estimated time is 4 hours and 29 minutes. It means that, in this case, we are about 6,000 times faster. And of course, we create a different checksum. It's probably obvious, but different tools produce different checksums because they use different algorithms. blksum is using, under the hood, a cryptographic hash function, but it's a different construction, so we don't get the same checksum. Now, it's not available everywhere yet, but I build it in Copr. So if you have a Fedora, CentOS or RHEL system, you can enable the Copr repo, install the package, and play with the tool. So, the blkhash library. Basically, blksum is just using the library to compute the hash, so you can also use the library to integrate this functionality in your program. The library is very simple. This is the entire API. It gives you the standard interface to create a new hash, to update it, to get the result with the final interface, and of course to free the resources used. So if you have used any cryptographic hash library, maybe hashlib or OpenSSL, you know these interfaces. Now the important difference is that when you update, when you give it a buffer with data, this API will detect zeros in the data. If we find a zero block, we hash it much, much faster. This increases the throughput by an order of magnitude or something like this. Even if you read from the file and give it a buffer with zeros, we can hash it much faster. But the most important API is the zero API, because it's a new API that no other library supports. If you know that a range in a file is not allocated, let's say in an empty 8 TB image, you check the file and you see that you have an 8 TB hole. So you can just pass this range to the library, and it will hash it really, really quickly, much faster than any other way. And you don't have to read anything for this. So how fast is it? For data, we can process about two gigabytes per second.
If you give it a buffer with zeros, we can process almost 50 gigabytes per second. And the blkhash zero API can process almost three terabytes per second. And this is on this laptop. If you try it on an Apple M1, and this is the first one, like from two years ago, it's almost three times faster for data, and almost five times faster for zero, up to 13 terabytes per second. I didn't try the newer M1s or M2s. So if you want to use this library, you install the devel package, which installs the headers, and the library package, the libs package, and your application will depend on the libs package. Everything should be automatic using RPM. Now, the most important step is integrating this into qemu-img, because this is the natural tool to consume this functionality. I have patches in review for adding this new command. It's pretty simple. You give it the file name. You have a progress option. You can control caching, and you can enforce the image format. With this, you can compute a checksum of anything that qemu-img can access. You get the same checksum: it uses the blkhash library with the same configuration, so both tools are compatible. You can check my QEMU fork if you want to see the details. So what is the difference compared to blksum? Usually, it runs almost the same, a little faster, maybe five percent faster. In some cases it can be 45 percent faster, like in this special case of an image full of zeros. I think that's because qemu-img is closer to the data; it does not have to copy the data to qemu-nbd and then read it over the socket. So this is really the best place to use this technology. So let's see a small live demo. We have several images. Let's look at the Fedora 35 images. We have a 6-gigabyte image with a little more than 1 gigabyte of data, and we have this qcow2 image, again of similar size, and we can compare them.
And of course they are identical, and we can create a checksum with blksum, and we'll get the same checksum pretty quickly. So let's try a bigger image, and this time we'll use the progress flag to make it more fun. This is a 60-gigabyte file with 24 gigabytes of data, so it will take some time to compute. You can see that the progress jumps really quickly when we find a big hole. So it took 12 seconds, and we will not try to use sha256sum now. Let's try the 8-terabyte image, which is really fast, and let's try the same with qemu-img with the new checksum command. Okay, it takes more time to type than to compute. Okay, so this is all for the demo, and the last part is how you can contribute to the project. The easiest thing is to just install it, play with it, test it, and if you find an issue, report it. If you have interesting machines, benchmark it and send the results to the project. We have some results in the README now; they probably should be in a better place. To run the benchmarks, you need to build it on your system. It should be easy enough. The most important stuff is packaging and porting. Packaging for Fedora, CentOS and RHEL is mostly done. I just need to get this into Fedora. There is a review process, and I probably need some help with it. For other Linux distros, if you want to package it for Debian or Arch or whatever, please do, and contribute it to the project. I like to keep all packaging upstream, so I will be very happy to add whatever you need to make it transparent, so it will not break when I change something upstream. macOS and FreeBSD: I tested it on these platforms without libnbd, because we don't have libnbd there, and we cannot port the tool before we port libnbd and package it. But we can package only the library part, which will be useful for QEMU. Missing features: there is a lot of stuff that can be added.
You can look at the site. We have an issue tracker with a lot of issues, a lot of stuff that we can add, like supporting any image format, because we currently support only raw and qcow2; checksumming multiple images; using a file to verify checksums against what we recorded before, and stuff like this; and improving the CI. Even more important, but of course much more work, is integrating this into your system. oVirt already supports a checksum API using an older implementation, and the ovirt-stress project uses this API to verify incremental backups. It can be upgraded to the new tool. KubeVirt can use it, for example when you import an image using the CDI importer, or mainly for storage operations; KubeVirt can verify the operation with this tool, for example by running a pod connected to the disk, reading the image, and verifying it. Maybe other systems too, I don't know. If you like other systems and think that it can be useful there, you can contact me to discuss it. If you want to see the project links, we have the project in GitLab, the issue tracker in the project, and the Copr repo that I showed before. So that's all. How much time do we have? Five minutes for questions. Okay. Yes. I have a question about the special case with blkhash, sorry, with zero blocks. How do you handle it under the hood? I see you specify SHA-256, but how's that done? Okay. So how do we handle zeros? How do we do it efficiently? I have a bonus section at the end of the slides, which you can get from the site, and basically the idea is that we split the image into blocks of a fixed size. The last block can be smaller, but it doesn't matter. Then we compute the hash for each block, but for zero blocks we don't need to compute anything, because when you start the operation, we compute the hash of the zero block once, based on the configuration. Then each time we see a zero block, we can use the pre-computed hash, so it costs nothing.
And then we compute a hash from the list of hashes. This is called a hash list. It's not really new. Now, this costs something. You need to pay something for computing the hash of the hashes, but they are much, much smaller, orders of magnitude smaller than the actual data. And to make it even faster, we also use multiple threads. What the previous slide showed is actually what's done in each worker. When you write something, we split it into blocks and map them to multiple workers at the same time, send them to the worker queues, and the workers go and compute this combined hash, this block hash. Finally, we create a hash from the worker hashes. Yes. How hard is it for you to add a new checksum algorithm instead of SHA? So, can we use another checksum algorithm? Yes, in blksum you have a parameter where you can specify another algorithm. You can use SHA-1 or MD5 or whatever; anything that OpenSSL supports, it also supports. I'm not sure if this is the best option; maybe we should limit it, because SHA-1 is considered broken. But currently, this is what we support: anything that OpenSSL provides. Yes. How do you identify zero blocks? What? How do you identify zero blocks? How do we identify zero blocks? We use a very popular method that somebody wrote a blog post about, someone from the kernel, a few years ago. You can use memcmp with an offset to compare the buffer against itself. So you check the first 16 bytes, and then you can just compare two pointers, the start of the buffer and the start of the buffer plus 16. It can process up to 50 gigabytes per second on this machine. Very efficient. Yes. Okay. I didn't get the question. Yes. Did we try a non-cryptographic algorithm? We didn't try, because we try to use something secure; we try to get something which is as secure as the original hash function. But we can try other algorithms. It's interesting stuff that we can try. Thank you.
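The scheme described in the answers above (fixed-size blocks, a pre-computed digest for zero blocks, a hash over the list of block digests, and the compare-against-itself zero check) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the block size, and the restriction to whole blocks are made up for the sketch, it is single-threaded, and it will not produce the same digest as blksum, whose real construction also combines per-worker hashes.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # illustrative block size, an assumption of this sketch


def is_zero(buf: bytes) -> bool:
    """Return True if buf contains only zero bytes.

    Same idea as the memcmp trick from the talk: check the first 16 bytes,
    then compare the buffer against itself shifted by 16 bytes.
    """
    head = buf[:16]
    if head != bytes(len(head)):
        return False
    if len(buf) <= 16:
        return True
    return buf[:-16] == buf[16:]


class HashListSketch:
    """Toy hash list: a hash over per-block digests with a zero-block shortcut."""

    def __init__(self, algorithm: str = "sha256", block_size: int = BLOCK_SIZE):
        self.algorithm = algorithm
        self.block_size = block_size
        # Computed once up front and reused for every zero block.
        self.zero_digest = hashlib.new(algorithm, bytes(block_size)).digest()
        self.outer = hashlib.new(algorithm)

    def update(self, data: bytes) -> None:
        """Hash data whose length is a multiple of block_size (a simplification)."""
        if len(data) % self.block_size:
            raise ValueError("this sketch only handles whole blocks")
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            if is_zero(block):
                self.outer.update(self.zero_digest)
            else:
                self.outer.update(hashlib.new(self.algorithm, block).digest())

    def zero(self, length: int) -> None:
        """Account for a hole of `length` bytes without reading any data."""
        if length % self.block_size:
            raise ValueError("this sketch only handles whole blocks")
        for _ in range(length // self.block_size):
            self.outer.update(self.zero_digest)

    def hexdigest(self) -> str:
        return self.outer.hexdigest()
```

Feeding a buffer of zeros through update() and declaring the same range with zero() give the same digest, which is the property that lets a tool skip reading unallocated ranges entirely.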
We're going to use the mic. |
Dear admin, where’s my network?
Overview of (un)reliable methods for vNIC to network mapping with KubeVirt |
What I wanted to talk with you about today is how to identify networks inside the guest itself, because in some cases there is a problem: we define a network, and inside the guest we don't know exactly what we are connected to. So this is the talk. My name is Eddie, I am working at Red Hat today on a project called KubeVirt; I think there was a talk here about it before. Let's see, has anyone worked with KubeVirt or knows it? Very quickly then: does anyone know about Kubernetes? Everyone else knows Kubernetes. So Kubernetes is the baseline, I would say. If you know it, it has something called pods, which are its smallest unit, and in pods we run containers. Now, if we want to work with virtual machines, we can install KubeVirt, which is an add-on to Kubernetes. This is more or less how it looks: Kubernetes is a management system, and KubeVirt is an add-on to this management system that manages VMs. It does this using libvirt and QEMU. QEMU runs inside the pod, the pod runs the guest, and they share the same network. One of the reasons we built KubeVirt on Kubernetes is to share the same network, so we can extend the pods, or we can work with VMs when we are not able to work with pods.
So in the regular or classic way to run a virtual machine, we usually have one network interface, and it's usually called eth0. If anyone has started a VM, this is what they will get. But if you want to have multiple networks, then we have a little problem, because we can have eth0, eth1, eth2, and we can continue on like that, and the main problem is that this is not guaranteed. If you reboot the machine, the next time it boots it does not have to be in the same order. So you could have eth0 connected to some red network, and after the next reboot it will be called eth1 and it will be connected to the blue network. The reason is that the ethN scheme is based on the order in which the devices are detected while the operating system boots. It's effectively random; it doesn't have to be the same order. So what can we do about it? I guess the most direct way to handle something like that is from the management system. In our case it's KubeVirt, but we can do that in any management system. From the management system we can pass information into the guest and tell it what it is connected to, what these networks are exactly. So we're going to go over each option and see what they are. The way to configure KubeVirt is very similar to how we configure Kubernetes: we have an object, and in this object we configure the parameters. I just took a snippet of the networking parts, because we don't want to get into too many details. In the definition of the virtual machine instance we define the interfaces that it should have. The masquerade and the bridge that you see are actually two interfaces; masquerade and bridge are the binding methods, which we don't need to talk about in this talk. What is important here is that this is the classic way, the default way to define the two interfaces, and it will use the default ethN scheme in this case. So when we define these two, if we go into the guest we'll see eth0 and eth1.
For the sake of the explanation, we can go inside the pod that was created for us, where the virtual machine actually runs, where QEMU and libvirt run, and we can run a command to see the configuration of the virtual machine from the inside. This is something that the user usually doesn't see; it's hidden from the user, but I do it here just to show you. When we create a virtual machine with libvirt, even if we don't pass the information, it will automatically create a MAC address for each NIC, and it will also create PCI addresses. It's something done by libvirt automatically if we don't specify anything. We can see that the MAC addresses correspond to the MAC addresses inside the guest. What happens with the PCI address depends on the operating system that you run inside. For example, this one, I think, is Fedora, so Fedora gives us this eth0 name, but it also gives us an alternative name which was calculated, in this case, based on the PCI address. The problem is that we cannot map this back to the original configuration, because if we look here, we never specified anything. So we would need to connect to the pod and check what it has, and that's not practical, so it doesn't help me much. The first thing that we could do is to specify the MAC address in the configuration. If we specify a specific MAC address, KubeVirt will place it in the libvirt configuration that we saw, and then we can see the MAC address inside the guest. The problem with this one is that we cannot use the same MAC address for more than one interface, even between virtual machines, because then they will collide and will have no network. So this is not such a good solution, and we need to find a better one. By the way, this also conflicts with another feature: in KubeVirt, for example, we have a MAC pool, which is optional, but it automatically allocates MAC addresses to interfaces so they will not collide, so we will not have a case where a MAC address was defined twice.
So the better solution for all of this is to use something that Linux has exposed for a lot of years now, maybe 10 years, I don't know exactly. It's called predictable network device names. Instead of using the ethN names, the name of the interface can be derived from something called an ACPI device index, which we'll talk about soon, or from the PCI address of the device. So if we specify the PCI address of the device when we define the interface, the operating system, if it uses systemd and things like that, can name the interface based on it. So I'm showing you here what we can do. The ACPI index is a parameter that we can set inside the definition of the virtual machine. KubeVirt will pass it to libvirt, libvirt will create the domain, the VM, and then we will see the interface with this name: it starts with eno, followed by the ID that we set here. But even if the operating system does not know how to do that, because if you don't use systemd I think it will not do that, and someone needs to know how to name the interface correctly, you can still read the value. I cannot show you now because it's not my laptop, but you can go to sysfs, under /sys/bus, to the PCI address of the device, and there will be an ACPI index value there, so you can read it from there. By the way, this is, I guess, the best option if you can use it.
The second-best option is to define a specific PCI address when you define the interface. It's a bit harder, just because you need to know which PCI address to use and you need to manage it. If you create 10 interfaces, you don't want them to collide with each other, and you don't want them to collide with other devices of your virtual machine, like, I don't know, a sound device or something else. So this is a bit harder, but you can do it, and if you set it, the name of the interface will be created using enp, then the bus number, which is this one here, and then the slot, which is this one. Yeah, but there is one problem with everything I just said. I don't know if you ever tried to use a cloud image that you downloaded. One of the problems is that all the images that are created for us, the Fedora ones, the CentOS ones, the Ubuntu ones, all of them expect to have only one interface, so they are configured in a way that does not use any of these schemes. If you want to have these schemes or these options, you just cannot have them; the images disable them on purpose. The reason is that they are saying: if I want to run an application inside the guest, I want it to always know that the interface is eth0; I don't want it to get a name that is random, that I cannot control. So if someone wants to use these naming schemes, they need to prepare their own image, or they need to tweak the existing image and enable this predictable naming.
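The two naming options described above (eno plus the ACPI index, enp plus the PCI bus and slot) can be sketched as a small function. This is a simplified illustration, not systemd's actual implementation: the function name is made up, the PCI address is assumed to be in the usual domain:bus:slot.function hexadecimal form, and the handling of the domain and function suffixes is an assumption that ignores cases such as multi-function devices.

```python
from typing import Optional


def predictable_name(acpi_index: Optional[int] = None,
                     pci_address: Optional[str] = None) -> str:
    """Return the interface name a guest would be expected to derive.

    Prefers the ACPI index (eno<index>), otherwise falls back to the
    PCI address (enp<bus>s<slot>), mirroring the order given in the talk.
    """
    if acpi_index is not None:
        return f"eno{acpi_index}"
    if pci_address is None:
        raise ValueError("need an ACPI index or a PCI address")

    # PCI addresses are written in hex; the name uses decimal numbers.
    domain, bus, slot_func = pci_address.split(":")
    slot, func = slot_func.split(".")

    name = "en"
    if int(domain, 16):
        name += f"P{int(domain, 16)}"  # only for a non-zero PCI domain
    name += f"p{int(bus, 16)}s{int(slot, 16)}"
    if int(func, 16):
        name += f"f{int(func, 16)}"    # simplified: only for non-zero functions
    return name


print(predictable_name(acpi_index=101))             # ACPI index wins
print(predictable_name(pci_address="0000:02:01.0")) # bus 2, slot 1
```

A management layer could use something like this to tell the user, ahead of time, which name each configured interface should get in a guest that has predictable naming enabled.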
The last option, which is a bit more complicated but gives us more possibilities, is that we can put a tag on the interface. We specify a tag, and it will use something that was actually invented in OpenStack, called device role tagging. When we set a tag on an interface, it doesn't touch the interface itself; it just mounts data inside the guest, like this one for example, and this data is what was read from the libvirt configuration. So if I put FOSDEM as a tag, then what KubeVirt does is read the libvirt configuration, read the PCI address and the MAC address, and put them together with the tag, and an application inside the guest can read this file and decide which network to connect to. We could do much more interesting stuff with this one, although it's less standardized. Using the name of the NIC sounds to me, at least, like the simpler and more standard solution; with this one you need to learn about this protocol, so you need to mount the data, read it and decode it. But you can do things like this: we could tag different devices, like a NIC and some other peripheral device on the same NUMA node, with the same tag, and then the application inside the guest can use this information to do smart things. For example, it's an easy way to run an application and use the devices that are on the same NUMA node. That's one example of what you could do. I cannot do the demo, but if someone wants to see what they need to configure in the virtual machine, in the guest, so that the naming will not be ethN, then please come to me. All the art is from my kids, so I must mention that. That's it I guess, any questions? No, yes. Yes, you'll need to use the PCI, I guess, or maybe there is a, oh sorry, I need to repeat the question.
So, yeah, we were asked what to do when the guest is ARM and the ACPI index is not available. So yes, in this case you must use the PCI address, but if we have other identifiers, maybe you can use them. There is one identifier that I didn't talk about here. It's very similar to the ACPI index, but it's a register on the PCI device, so we could use that if you want, but I think libvirt or QEMU does not support it yet. Yes. Any more questions? Last question? Yes. So, sorry, I'm not good at repeating: the question was, what about operating systems that are not Linux? The naming scheme doesn't exist there, but the real answer is that the ACPI index, I think, is supposed to be generic. It's part of the ACPI standard, so if, for example, Windows or some embedded system can read the device, then it can fetch the ACPI index information. They can always use the PCI address as well. And the approach with the tag, for example, is universal, if you can mount something inside the guest; this one works using cloud-init. So, we don't have time, so if someone has questions, please come to me. I'll stay here, or outside. Thank you. |
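The device role tagging option from this talk lends itself to a short guest-side sketch: read the metadata, find the MAC address that carries a tag, and look up which interface has that MAC. The metadata layout below is modeled on OpenStack-style device metadata and is only an assumption for illustration; the sample values, the function names, and where the file is mounted in the guest are all hypothetical. The only real system interface used is the per-interface address file under /sys/class/net.

```python
import json
import os
from typing import Dict, List, Optional

# Illustrative metadata only: the field names and values are assumptions,
# not taken from the talk or from a specific KubeVirt release.
SAMPLE_METADATA = """
{
  "devices": [
    {"type": "nic", "bus": "pci", "address": "0000:02:01.0",
     "mac": "02:00:00:00:00:01", "tags": ["fosdem"]},
    {"type": "nic", "bus": "pci", "address": "0000:03:01.0",
     "mac": "02:00:00:00:00:02", "tags": ["storage"]}
  ]
}
"""


def macs_by_tag(metadata: dict) -> Dict[str, List[str]]:
    """Map each tag to the MAC addresses of the NICs carrying it."""
    result: Dict[str, List[str]] = {}
    for device in metadata.get("devices", []):
        if device.get("type") != "nic":
            continue
        for tag in device.get("tags", []):
            result.setdefault(tag, []).append(device["mac"].lower())
    return result


def ifname_for_mac(mac: str, sys_net: str = "/sys/class/net") -> Optional[str]:
    """Find the guest interface whose sysfs 'address' file matches mac."""
    for name in sorted(os.listdir(sys_net)):
        path = os.path.join(sys_net, name, "address")
        try:
            with open(path) as f:
                if f.read().strip().lower() == mac.lower():
                    return name
        except OSError:
            continue  # entry without a readable address file
    return None


tags = macs_by_tag(json.loads(SAMPLE_METADATA))
```

With this, an application inside the guest can ask for the interface tagged fosdem without caring whether it ended up as eth0 or eth1 after the last reboot, because the lookup goes through the MAC address rather than the name.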
A journey through supporting VMs with dedicated CPUs on Kubernetes |
So, hello everyone, pleasure to be here. I'm Itamar Holder and I'm a senior software engineer working for Red Hat. This is a talk about the journey through supporting VMs with dedicated CPUs on Kubernetes. The reason that the word journey is in the title is that this was a true journey for me, and I'm going to guide you through the journey until we reach the actual problems and solutions. There are two reasons for that. The first reason is that to understand the problems and solutions, we need to understand the background for them. The second reason is that I've gained a lot of insights and takeaways during that journey, and I hope that you will find them valuable for your own journeys, and that you can take the same takeaways for whatever you're doing and whatever you're interested in. So we're going to talk about all sorts of stuff like dedicated CPUs and the CPU manager, cgroups, pod isolation and namespaces, and Kubernetes resource allocation. So let's begin. First of all, an introduction to KubeVirt. Kubernetes is designed to run containers, which are designed very differently than VMs. Running VMs on one platform and containers on another platform is not the best approach, and this is where KubeVirt comes into play. It is basically an add-on or extension to Kubernetes, which lets you run VMs on top of Kubernetes as first-class citizens, in a completely cloud-native way. I'm not going to dive into all the architectural details here, but the trick is basically to run a VM within a container, like this picture tries to illustrate. And that's basically what you need to know for this talk. So what's the deal with dedicated CPUs? Basically, the key idea here is avoiding preemption, or context switches, right? This is crucial for certain use cases like real-time VMs or VMs that depend on very low latency. So as a naive example, let's think about a VM that hot loops over some condition.
And when this condition becomes true, it has to react really, really fast. If we context switch this workload out, then it would take more time, because once the condition becomes true, it would take time to context switch back, and only then could the VM react. So this is very crucial for some use cases. It's also supported by most hypervisors, and it's a pretty standard feature, and we aim to bring it to Kubernetes as well. So, a question to the crowd. Who recognizes this section? Who knows what this is? Okay, so most of you. And another question: who can say that they're confident about how this is implemented behind the scenes, or how Kubernetes actually does that? A lot fewer of you, right? So that's good, it means that this is relevant, right? Obviously, this is taken from the pod's manifest. This is the place where we specify resources. We have, of course, requests and limits. We can specify CPU, memory, ephemeral storage, and a bunch of other stuff. So let's talk about containers for a second. Containers are actually an abstract concept that can be implemented in many ways. From the Linux kernel's perspective, there isn't really such a thing as a container. There are basically a couple of main kernel features that serve as the building blocks for containers. One of them is cgroups, which is very important and is one of the main building blocks for containers. So let's talk about cgroups a bit. Basically, the idea is that the architecture is a tree of resources, right? We have the root cgroup, which holds basically all of the resources on the node, so for example, 100 CPUs. And then we divide them into groups, for example, 70 CPUs, 20 CPUs, 10 CPUs, and so on. The idea is that every process on the system is attached to a cgroup, and the cgroup limits the resources for that group of processes. In Kubernetes, there is usually one cgroup per container. This actually depends on the CRI that you're using.
But the most common approach is to use one cgroup per container. In Kubernetes, all of the values are always absolute, right? When we specify CPU, for example, we can specify 100m, which stands for milli-CPUs and is the same as 0.1 CPUs, or 1.3, or whatever, but these are always absolute values. In cgroups, it's all relative, right? It's called CPU shares. The default is 1024, but it doesn't really matter. If we look at a very naive example again, let's say that we have a node with two pods running on it, pod A and pod B, and let's say that pod A has one CPU share and pod B has two CPU shares. What that means is that pod B would get twice as much CPU time as pod A. It doesn't really matter how many CPUs the node has, because this is all relative, right? So how does Kubernetes convert between the absolute values and the relative shares? We can think about one CPU as 1024 shares, just because that's the default in cgroups. Let's say that a pod asks for 200m CPUs. This is actually a fifth of a CPU, so what we can do is divide 1024 by 5, and we get approximately 205 shares. And this would work, but remember that shares are still relative. So what happens, for example, if the node has 100 CPUs and a single pod with a 200m CPU request runs on that node? Since it's relative, it would just use all of the node's resources, right? So this has a nice side effect. The spare resources on the node can be used by the pods in proportion to their requests. So basically, the request is the minimum amount that is actually allocated, and all of the spare resources are split between the pods in proportion to their requests. So let's talk about Kubernetes QoS for a second. There are three quality of service levels. The first one is best effort. That means that I don't specify anything: I don't have requests, I don't have limits, not for memory and not for CPU. The last one, guaranteed, is kind of the opposite of that.
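Going back to the shares conversion for a moment: the arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the reasoning in the talk, not Kubernetes' actual code. The function names are made up, and rounding down is an assumption (the talk's "approximately 205" is 1024 / 5 = 204.8).

```python
from fractions import Fraction

SHARES_PER_CPU = 1024  # cgroup default, treated here as "one CPU"
MILLI_PER_CPU = 1000   # 1000m is one CPU


def milli_cpu_to_shares(milli_cpu):
    """Map an absolute CPU request in milli-CPUs to relative CPU shares.

    Rounding down is an assumption made for this sketch.
    """
    return milli_cpu * SHARES_PER_CPU // MILLI_PER_CPU


def cpu_time_split(shares_by_pod):
    """Return each pod's fraction of CPU time when all pods compete for CPU."""
    total = sum(shares_by_pod.values())
    return {pod: Fraction(shares, total) for pod, shares in shares_by_pod.items()}


if __name__ == "__main__":
    # A 200m request, as in the talk's example.
    print(milli_cpu_to_shares(200))
    # Pod A with one share, pod B with two shares.
    print(cpu_time_split({"pod-a": 1, "pod-b": 2}))
```

Because the split depends only on the ratio of shares, the number of CPUs on the node never appears in `cpu_time_split`, which is the point the talk makes about relative values.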
I specify both requests and limits for both CPU and memory, and the requests and limits are equal. Now, if you're not best effort and you're not guaranteed, you're burstable. This is just an example, but the idea is that you can specify only requests, or only limits, or you can specify them both but they're not equal. So, anything other than best effort and guaranteed. Now, basically, the trade-off here is that you give predictability in order to get stability. Kubernetes tells you: if you want me to guarantee you stability, you have to be predictable, or, the more predictable you are, the more stability you gain. One example of that, if we're talking about memory, is node pressure. When the node is under high memory pressure, it would evict guaranteed QoS pods last. Burstable pods would be evicted before them, and best-effort pods first of all. This is true, by the way, as long as you keep your promises. If you say that you're limited to a certain amount of memory and then you exceed it, then most CRIs will just kill the pod. So can we use dedicated CPUs on Kubernetes? The answer is yes. This is possible with the CPU manager. In order to do that, we have two requirements. First of all, the pod needs to be of guaranteed QoS. Second, the CPU request, which equals the limit because the pod is guaranteed QoS, has to be an integer. It cannot be a fractional value. Another interesting fact is that only a single container, or some of the containers, in a pod can have dedicated CPUs, but the whole pod needs to be of guaranteed QoS. So let's talk about namespaces for a second. Remember this little diagram from before? Namespaces are another building block for containers, and they are basically responsible for the isolation of containers. When I'm picturing a pod, this is what I think about: it's a box with some containers in it. The containers are absolutely isolated from one another.
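Before going further into namespaces, the QoS rules and the two CPU manager requirements from above can be sketched in Python. This is a simplified illustration, not the kubelet's logic: the function names and the dict layout are made up, memory quantities are compared as plain strings, and the fallback of a missing request to the limit is included as an assumption about API defaulting that the talk does not cover.

```python
def parse_cpu_milli(quantity):
    """Convert a CPU quantity such as "2", "0.5" or "200m" to milli-CPUs."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return int(round(float(text) * 1000))


def effective(container, resource):
    """Return (request, limit); a missing request falls back to the limit."""
    limit = container.get("limits", {}).get(resource)
    request = container.get("requests", {}).get(resource, limit)
    return request, limit


def qos_class(containers):
    """Classify a pod, given its containers, as BestEffort, Burstable or Guaranteed."""
    specified = False
    guaranteed = True
    for container in containers:
        for resource in ("cpu", "memory"):
            request, limit = effective(container, resource)
            if request is not None or limit is not None:
                specified = True
            if request is None or limit is None:
                guaranteed = False
            elif resource == "cpu":
                if parse_cpu_milli(request) != parse_cpu_milli(limit):
                    guaranteed = False
            elif request != limit:  # simplification: plain string comparison
                guaranteed = False
    if not specified:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def gets_dedicated_cpus(containers, index):
    """Check both requirements: guaranteed pod, whole-number CPU request."""
    if qos_class(containers) != "Guaranteed":
        return False
    request, _ = effective(containers[index], "cpu")
    milli = parse_cpu_milli(request)
    return milli >= 1000 and milli % 1000 == 0
```

With a pod made of one container asking for "2" CPUs and another asking for "500m", both with requests equal to limits, only the first container would qualify for dedicated CPUs, while the pod as a whole is still guaranteed.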
And as we said, a container is a concept. So if we take some of the namespaces out and break some of the isolation between the containers, are they still containers? How many layers of isolation do we need to strip before it stops being a container? This is more of a philosophical question, but is it possible on Kubernetes? The answer is yes. For example, it's possible to share the PID namespace, the process namespace, between containers. What it means is that inside a container, if you run something like ps, you would see all of the processes from all of the containers. This isolation will not exist anymore. Another interesting fact is that, as a side effect, the file systems are also shared. They're not shared directly, but you can use a trick to reach them indirectly, one that gets you to the root file system of another process, which can now be in another container. To actually enable that, this is what you need to do: in the pod's spec, set shareProcessNamespace to true, and that's it. So now a word about KVM. Who knows KVM, by the way? Oh, a lot of you, okay. This is a kernel module which turns Linux into a hypervisor. Basically, we have two kinds of hypervisors, type one and type two. Type one is also called a bare-metal hypervisor, because it's installed on bare metal. There's no OS beneath it. What it means is that it's really fast, but the downside is that it has to implement stuff like a scheduler, virtual memory, and a lot of other stuff that already exists in every OS. Type two hypervisors are installed on top of an OS, so they don't have to re-implement all of that stuff, but they're usually a lot slower. So KVM is really incredible, because it turns Linux into a type one hypervisor. And this is what KubeVirt uses to get native performance.
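Returning briefly to process namespace sharing: a minimal sketch, in Python, of a pod manifest with `shareProcessNamespace` enabled. The pod name, container names and images are placeholders. The `root_fs_path` helper encodes an assumption the talk does not spell out, namely that with a shared PID namespace another process's root file system can be reached under `/proc/<pid>/root`.

```python
import json


def shared_pid_pod(name, containers):
    """Build a Pod manifest (as a dict) whose containers share one process namespace.

    `containers` is a list of (container_name, image) pairs; all values are placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The single field that removes process isolation between containers.
            "shareProcessNamespace": True,
            "containers": [{"name": n, "image": image} for n, image in containers],
        },
    }


def root_fs_path(pid):
    """Path through which the root file system of process `pid` is visible (assumption)."""
    return f"/proc/{pid}/root"


if __name__ == "__main__":
    pod = shared_pid_pod("demo", [("first", "example/first"), ("second", "example/second")])
    print(json.dumps(pod, indent=2))
```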
An interesting fact about KVM is that its main purpose is CPU virtualization, because this is the performance-critical part. It's backed by QEMU, which does things like IO and so on, which are usually less related to performance. So how does KVM actually work? From the guest's perspective, it will have, for example, four CPUs. But these aren't real CPUs, right? They are virtual CPUs, or vCPUs. And from the kernel's perspective, these are just threads, vCPU threads. So what the guest sees as a physical CPU is, from the host's perspective, just another thread on the system. Okay, so now back to KubeVirt after all of these introductions. In KubeVirt, we have the virt-launcher pod. It has some containers in it, and the compute container is the main one. Inside the compute container, we run the QEMU process, which actually runs the guest. So, the first attempt to support dedicated CPUs. The idea was: let's allocate the compute container that we talked about with dedicated CPUs. This is possible with the CPU manager, as we discussed. All we need to do is to have a pod with guaranteed QoS and an integer number of CPUs on the compute container. So, by the way, is it a good approach, do you think? This is a problem, and let me explain why. Let's zoom into the compute container for a second. The list here is all of the processes and threads that run inside the compute container. You don't need to understand everything that's running here, but let me show you the interesting part. You see the QEMU KVM process. All of the red ones are threads. Now, as you can see, we have two threads which are the actual vCPU threads, like I said earlier. So the problem is that we have a lot of threads with different priorities. And if we let the whole compute container run with dedicated CPUs, these aren't really dedicated CPUs, because we said that the key phrase here is avoiding preemption, right?
But with the previous setting, we will context switch the vCPUs out in favor of other threads inside the compute container. So the vCPUs aren't really running on dedicated CPUs. We actually lie to the guest. So that's a problem. Now, the second approach is called the housekeeping cgroup. The idea is that we make a child cgroup for all of the low-priority threads or processes. So how would it work? Let's say that the user asks for X CPUs. We would actually allocate X plus one dedicated CPUs, and one dedicated CPU will serve all of the housekeeping tasks. When I say housekeeping tasks, I basically mean everything but the vCPUs themselves. Then what we can do is move all of the threads that aren't vCPUs into the housekeeping cgroup, and then the vCPUs would have truly dedicated CPUs. So this is what it looks like. We have the virt-launcher pod, and inside of it we have the compute container with X plus one dedicated CPUs. One dedicated CPU is for everything but the vCPUs themselves, and the X dedicated CPUs are for the vCPUs. This approach is much better, because it basically gives the vCPUs truly dedicated CPUs. But it also has problems. The first problem is that we waste one dedicated CPU on stuff that is of low priority. This is a huge waste. Ideally, we would have wanted to say something like: give me four, or X, dedicated CPUs, and some other amount of shared CPUs for everything else. This is actually possible with cgroups, but it's not possible on Kubernetes because of what we said earlier. If we ask for 3.2 CPUs or something like that, they won't be dedicated. They would all be shared. So basically, Kubernetes goes for an all-or-nothing approach: either all of the CPUs are dedicated or all of the CPUs are shared. Another problem, which is more of a design problem, is that we're focused on the lowest-priority processes. And this should kind of be reversed, right?
I mean, we want to configure the vCPUs to have dedicated CPUs, yet we configure everything that is not the vCPUs. This is problematic, because ideally we would want to change only the vCPU threads and leave everything else as is, to keep it open for extensions in the future and stuff like that. There are more problems, related to cgroups v1 and v2. I won't dive into details here, but in two words: in v2, for example, we have the threaded cgroup concept, which doesn't exist in v1. And in a threaded cgroup we have a lot of limitations. Some of the controllers and subsystems of cgroups cannot work at all. So just know that there are more problems with this solution. So, a third attempt. I'm calling it the dedicated CPU cgroup approach, and here's the idea. The compute container stays completely as it is. We won't touch it at all. We would allocate it CPUs that are not dedicated, they are shared CPUs, but remember that the pod still needs to have guaranteed QoS, and I'll explain why. What we will do is add another container, which is basically a blank container with X dedicated CPUs, and as I said, every container creates a new cgroup, so it will create a new cgroup for us. So what do I mean by a blank container? I mean a container that doesn't really run anything. It could run, for example, a sleep-forever process just to keep the container alive, but that's it. It would be entirely blank. And then what we will do is move only the vCPUs into the other container, right, into another cgroup. All of the compute container stays as is, and only the vCPUs are configured. So this is what it looks like. As the VM starts, or before it starts, we have the virt-launcher pod. Now we have two different containers. One of them is the compute container with Y shared CPUs. These are not dedicated CPUs. The other one is a container with dedicated CPUs: X dedicated CPUs, exactly the amount we need, not X plus one as before.
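The two-container layout just described can be sketched as a pod manifest built in Python. This is a hypothetical illustration, not KubeVirt's actual virt-launcher manifest: the pod name, the `vcpu-holder` container name, the images and the memory sizes are all invented. It also assumes, based on the CPU manager rule mentioned earlier, that giving the compute container a fractional CPU value (for example "1500m") is what keeps it on shared CPUs inside a guaranteed pod, and it enables process namespace sharing as introduced earlier in the talk.

```python
def guaranteed(cpu, memory):
    """Resources with requests equal to limits, as guaranteed QoS requires."""
    values = {"cpu": cpu, "memory": memory}
    return {"requests": dict(values), "limits": dict(values)}


def launcher_pod_sketch(vcpus, compute_cpu, compute_memory, holder_memory="64Mi"):
    """Build a pod dict with a shared-CPU compute container and a blank
    container that holds exactly `vcpus` dedicated CPUs (all names hypothetical)."""
    if vcpus < 1:
        raise ValueError("at least one vCPU is required")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "virt-launcher-example"},
        "spec": {
            "shareProcessNamespace": True,
            "containers": [
                {
                    # Y shared CPUs: a fractional value, by assumption.
                    "name": "compute",
                    "image": "example/compute",
                    "resources": guaranteed(compute_cpu, compute_memory),
                },
                {
                    # Blank container: only sleeps, so that its cgroup exists.
                    "name": "vcpu-holder",
                    "image": "example/sleeper",
                    "command": ["sleep", "infinity"],
                    # X dedicated CPUs: a whole number, not X plus one.
                    "resources": guaranteed(str(vcpus), holder_memory),
                },
            ],
        },
    }
```

For a guest with four vCPUs, `launcher_pod_sketch(4, "1500m", "2Gi")` yields a pod in which only the blank container requests a whole number of CPUs, so no dedicated core is spent on housekeeping.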
So in the compute container, everything runs when the VM is started. But right after it's started, or right before it's started, what we do is move the vCPUs into the other container. And that basically solves our problem, because now all of the housekeeping tasks run on shared CPUs and the vCPUs run on dedicated CPUs. So can we actually do that? I mean, we actually moved some threads of a process to another container. This looks extremely weird, right? But it is possible because we share the PID namespace. You can think about it like this: the processes are not isolated anymore. They're not really being moved from one container to another, because the container does not isolate processes anymore. We only change cgroups. From the vCPUs' perspective, they just stay the same. They can communicate with all of the threads and processes in the system. So only the relevant threads are configured, and, as I said, we use shared CPUs for the housekeeping tasks, so we don't waste one dedicated core anymore. And we keep things open for extensions in the future. Maybe we would want to play more with cgroups in the future, so we want everything in the compute container to stay completely as is. OK, so, summary and takeaways. There were a lot of introductions here, and I've scratched the surface of a lot of cool facts and technologies that I've seen along the way. We've seen how CPU allocation is implemented in Kubernetes, how cgroups use relative shares and not absolute values, and how the CPUs and other resources are spread across the pods in proportion to their requests. We've seen how to enable dedicated CPUs on Kubernetes. We've seen namespaces and how to break the isolation within a pod. We've talked a bit about KVM and how it uses threads to run the vCPUs. And of course, we talked about KubeVirt. And again, I really hope that these takeaways will serve you in your journeys in the future.
And I hope that you found this interesting. So thanks a lot, and you're always welcome to send questions or feedback or anything else to my mail, and I'll also be outside for questions. OK. So thank you. And if you have any questions... OK, so the question is: do we need more permissions for the virt-launcher pod, or is it being done by virt-handler? And the answer is, it's being done by virt-handler. Just a bit of context: virt-handler is another pod that KubeVirt uses. It's a pod with high privileges, and therefore we don't need any extra privileges to configure that. Yeah. Does this allow for easier network communication, if you're talking about service communication in the Kubernetes sense? You could do it VM to VM, presumably with the exact same mechanism, so resolving service names from one VM to another VM using Kubernetes. Does that work at the moment? So that's a question about KubeVirt in general, right? In Kubernetes, I can do it, but presumably I can do the same thing from a VM running inside the pod. Right. Yeah, so the answer is yes. OK. Yeah. Any other questions? Yeah. I have a question. You are dedicating the CPUs separately, but what about other threads which, in some use cases, are highly CPU-consuming, like network threads or IO threads? Is there a network thread as well? Right. So the question is, what about IO threads or network-related threads? Actually, in the VM manifest, we have configurations for that. So you can ask, for example, for an IO thread to run on a dedicated CPU. That is also supported. Yeah. I focused solely on the vCPUs themselves, but this is entirely possible in KubeVirt today. Yes. Can I combine it with NUMA machines? So the question is, can we support NUMA with this? The answer is yes.
I'm not sure if it works right now out of the box, but I think it should be possible, especially with cgroups v2. But this is an interesting question. I will have to think about it a little more. I think it is possible. Anyone else? Time's up. Sorry, but I'll be out here if you want to ask further questions. Thank you. Thank you. |
vis users meeting
BoF for users of the vis editor |
|